
Item response theory and the measurement of psychiatric constructs: some empirical and conceptual issues and challenges

Published online by Cambridge University Press:  08 April 2016

S. P. Reise*
Affiliation: University of California, Los Angeles, USA
A. Rodriguez
Affiliation: University of California, Los Angeles, USA
*Address for correspondence: S. P. Reise, Ph.D., Department of Psychology, UCLA, Franz Hall, Los Angeles, CA 90095, USA. (Email: reise@psych.ucla.edu)

Abstract

Item response theory (IRT) measurement models are now commonly used in educational, psychological, and health-outcomes measurement, but their impact on the evaluation of measures of psychiatric constructs remains limited. Herein we present two somewhat contradictory theses. The first is that, when skillfully applied, IRT has much to offer psychiatric measurement in terms of scale development, psychometric analysis, and scoring. The second, however, is that psychiatric measurement presents some unique challenges to the application of IRT – challenges that may not be easily addressed by applying conventional IRT models and methods. These challenges include, but are not limited to, the modeling of conceptually narrow constructs and their associated limited item pools, and the modeling of unipolar constructs, where the expected latent trait distribution is highly skewed.

Type
Review Article
Copyright
Copyright © Cambridge University Press 2016 

