The Heteroscedastic Graded Response Model with a Skewed Latent Trait: Testing Statistical and Substantive Hypotheses Related to Skewed Item Category Functions

Psychometrika

Abstract

The Graded Response Model (GRM; Samejima, Estimation of ability using a response pattern of graded scores, Psychometric Monograph No. 17, Richmond, VA: The Psychometric Society, 1969) can be derived by assuming a linear regression of a continuous variable, Z, on the trait, θ, to underlie the ordinal item scores (Takane & de Leeuw in Psychometrika, 52:393–408, 1987). Traditionally, a normal distribution is specified for Z implying homoscedastic error variances and a normally distributed θ. In this paper, we present the Heteroscedastic GRM with Skewed Latent Trait, which extends the traditional GRM by incorporation of heteroscedastic error variances and a skew-normal latent trait. An appealing property of the extended GRM is that it includes the traditional GRM as a special case. This enables specific tests on the normality assumption of Z. We show how violations of normality in Z can lead to asymmetrical category response functions. The ability to test this normality assumption is beneficial from both a statistical and substantive perspective. In a simulation study, we show the viability of the model and investigate the specificity of the effects. We apply the model to a dataset on affect and a dataset on alexithymia.
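As an illustrative sketch of the underlying-variable formulation summarized above (not the authors' Mx implementation), the following Python code computes category response probabilities for a single item, assuming Z = aθ + ε with normal ε and X = c whenever Z falls between two adjacent thresholds. The residual variance function 2δ0/(1 + exp(−δ1θ)), the function and parameter names, and all numeric values are assumptions chosen for illustration. With δ1 = 0 the residual variance is constant (equal to δ0) and the probabilities take the normal-ogive form of the traditional GRM.

```python
import math


def normal_cdf(x):
    """Standard normal CDF, computed from the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def residual_sd(theta, delta0=1.0, delta1=0.0):
    """Residual SD of Z given theta (assumed form; delta1 = 0 gives variance delta0)."""
    return math.sqrt(2.0 * delta0 / (1.0 + math.exp(-delta1 * theta)))


def category_probs(theta, loading, thresholds, delta0=1.0, delta1=0.0):
    """Return P(X = c | theta) for c = 0, ..., len(thresholds).

    X = c when Z = loading * theta + e lies between the c-th and (c+1)-th
    threshold; thresholds must be given in increasing order.
    """
    sd = residual_sd(theta, delta0, delta1)
    # Cumulative probabilities P(Z <= tau | theta), padded with 0 and 1.
    cum = [0.0]
    cum += [normal_cdf((tau - loading * theta) / sd) for tau in thresholds]
    cum += [1.0]
    return [cum[c + 1] - cum[c] for c in range(len(cum) - 1)]


# Hypothetical three-category item with symmetric thresholds.
taus = [-1.0, 1.0]
for theta in (-1.5, 1.5):
    homo = category_probs(theta, 1.0, taus)               # constant variance
    hetero = category_probs(theta, 1.0, taus, delta1=1.0)  # variance depends on theta
    print(theta, [round(p, 3) for p in homo], [round(p, 3) for p in hetero])
```

With symmetric thresholds and δ1 = 0, the middle-category probability is the same at θ and −θ. Under this sketch, a nonzero δ1 breaks that symmetry, which is the kind of asymmetric category response function the abstract refers to.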

Notes

  1. Note that the error term may include a systematic component due to misfit.

  2. Cramér’s theorem states that if X₁ and X₂ are independent random variables and X₁ + X₂ is normally distributed, then both X₁ and X₂ are normally distributed.

  3. Specifically, Azevedo et al. (2011) used the centered skew-normal distribution, see below.

  4. We note however that for the Positive Affect scale, results are similar.

  5. For illustrative purposes, we take an RMSEA smaller than 0.08 to indicate acceptable model fit (see Schermelleh-Engel, Moosbrugger & Müller, 2003).

  6. This sample was selected from a much larger data set (N = 5780). The original data included various manipulations (15 in total). We selected subjects from 2 conditions (N₁ = 410 and N₂ = 406) that appeared to be homogeneous with respect to the manipulation (specifically, the selected subjects completed the questionnaire with a ‘scroll’ answer scale and a ‘static’ answer scale, respectively).

  7. For the Cognitive Analyzing and Affective Analyzing subscales, only one item is associated with heteroscedasticity in the opposing direction.

References

  • Agresti, A. (2002). Categorical data analysis (2nd ed.). New York: Wiley.

  • Allport, G.W. (1937). Personality. A psychological interpretation. New York: Henry Holt.

  • Arnold, B., & Beaver, R. (2002). Skewed multivariate models related to hidden truncation and/or selective reporting. Test, 11, 7–54.

  • Arnold, B.C., Beaver, R.J., Groeneveld, R.A., & Meeker, W.Q. (1993). The nontruncated marginal of a truncated bivariate normal distribution. Psychometrika, 58, 471–488.

  • Azzalini, A. (1985). A class of distributions which includes the normal ones. Scandinavian Journal of Statistics, 12, 171–178.

  • Azzalini, A. (1986). Further results on a class of distributions which includes the normal ones. Statistica, 46, 199–208.

  • Azzalini, A. (2005). The skew-normal distribution and related multivariate families. Scandinavian Journal of Statistics, 32, 159–188.

  • Azzalini, A., & Capitanio, A. (1999). Statistical applications of the multivariate skew normal distribution. Journal of the Royal Statistical Society. Series B, 61, 579–602.

  • Azzalini, A., & Dalla Valle, A. (1996). The multivariate skew-normal distribution. Biometrika, 83, 715–726.

  • Azevedo, C.L.N., Bolfarine, H., & Andrade, D.F. (2011). Bayesian inference for a skew-normal IRT model under the centred parameterization. Computational Statistics & Data Analysis, 55, 353–365.

  • Bauer, D.J., & Hussong, A.M. (2009). Psychometric approaches for developing commensurate measures across independent studies: traditional and new models. Psychological Methods, 14, 101–125.

  • Baumeister, R.F., & Tice, D.M. (1988). Metatraits. Journal of Personality, 56, 571–598.

  • Bazán, J.L., Bolfarine, H., & Branco, M.D. (2004). A new family of asymmetric models for item response theory: a skew-normal IRT family (Technical Report No. RT-MAE-2004-17). Department of Statistics, University of São Paulo.

  • Bazán, J.L., Branco, M.D., & Bolfarine, H. (2006). A skew item response model. Bayesian Analysis, 1, 861–892.

  • Birnbaum, A. (1968). Some latent trait models and their use in inferring an examinee’s ability. In F.M. Lord & M.R. Novick (Eds.), Statistical theories of mental test scores. Reading: Addison Wesley (Chapters 17–20).

  • Bock, R.D., & Aitkin, M. (1981). Marginal maximum likelihood estimation of item parameters: application of an EM algorithm. Psychometrika, 46, 443–459.

  • Bollen, K.A. (1996). A limited-information estimator for LISREL models with or without heteroscedastic errors. In G.A. Marcoulides & R.E. Schumacker (Eds.), Advanced structural equation modeling: issues and techniques (pp. 227–241). Mahwah: Erlbaum.

  • Chen, M.-H., Dey, D.K., & Shao, Q.M. (1999). A new skewed link model for dichotomous quantal response data. Journal of the American Statistical Association, 94, 1172–1186.

  • Chiogna, M. (2005). A note on the asymptotic distribution of the maximum likelihood estimator for the scalar skew-normal distribution. Statistical Methods & Applications, 14, 331–334.

  • Cramér, H. (1937). Random variables and probability distributions. Cambridge: Cambridge University Press.

  • Cramér, H. (1946). Mathematical methods of statistics. Princeton: Princeton University Press.

  • Czado, C., & Santner, T.J. (1992). The effect of link misspecification on binary regression inference. Journal of Statistical Planning and Inference, 33, 213–231.

  • Emons, W.H., Meijer, R.R., & Denollet, J. (2007). Negative affectivity and social inhibition in cardiovascular disease: evaluating type-D personality and its assessment using item response theory. Journal of Psychosomatic Research, 63, 27–39.

  • Fisher, R.A. (1928). The general sampling distribution of the multiple correlation coefficient. Proceedings of the Royal Society of London. Series A, 121, 654–673.

  • Fraley, R.C., Waller, N.G., & Brennan, K.A. (2000). An item response theory analysis of self-report measures of adult attachment. Journal of Personality and Social Psychology, 78, 350–365.

  • Guadagnoli, E., & Mor, V. (1989). Measuring cancer patients’ affect: revision and psychometric properties of the Profile of Mood States (POMS). Psychological Assessment, 1, 150–154.

  • Hessen, D.J., & Dolan, C.V. (2009). Heteroscedastic one-factor models and marginal maximum likelihood estimation. British Journal of Mathematical & Statistical Psychology, 62, 57–77.

  • Jinks, J.L., & Fulker, D.W. (1970). Comparison of the biometrical genetical, MAVA, and classical approaches to the analysis of human behavior. Psychological Bulletin, 73, 311–349.

  • Jöreskog, K.G. (2002). Structural equation modeling with ordinal variables using LISREL. Scientific Software International Inc. Retrieved November 3, 2010, from: http://www.ssicentral.com/lisrel/techdocs/ordinal.pdf.

  • Keselman, H.J., & Lix, L.M. (1997). Analyzing multivariate repeated measures designs when covariance matrices are heterogeneous. British Journal of Mathematical & Statistical Psychology, 50, 319–338.

  • Kirisci, L., Hsu, T., & Yu, L. (2001). Robustness of item parameter estimation programs to assumptions of unidimensionality and normality. Applied Psychological Measurement, 25, 146–162.

  • Konishi, S., & Kitagawa, G. (2008). Information criteria and statistical modeling. New York: Springer.

  • Long, J.S., & Ervin, L.H. (2000). Using heteroscedasticity consistent standard errors in the linear regression model. American Statistician, 54, 217–224.

  • Markus, H. (1977). Self-schemata and processing information about the self. Journal of Personality and Social Psychology, 35, 63–78.

  • McDonald, R.P. (1999). Test theory: a unified treatment. Mahwah: Lawrence Erlbaum.

  • Mehta, P.D., Neale, M.C., & Flay, B.R. (2004). Squeezing interval change from ordinal panel data: latent growth curves with ordinal outcomes. Psychological Methods, 9, 301–333.

  • Meijer, E., & Mooijaart, A. (1996). Factor analysis with heteroscedastic errors. British Journal of Mathematical & Statistical Psychology, 49, 189–202.

  • Mellenbergh, G.J. (1989). Item bias and item response theory. International Journal of Educational Research, 13, 127–143.

  • Molenaar, D., Dolan, C.V., & van der Maas, H.L.J. (2011). Modeling ability differentiation in the second-order factor model. Structural Equation Modeling, 18, 578–594.

  • Molenaar, D., Dolan, C.V., & Verhelst, N.D. (2010a). Testing and modeling non-normality within the one factor model. British Journal of Mathematical & Statistical Psychology, 63, 293–317.

  • Molenaar, D., Dolan, C.V., & Wicherts, J.M. (2009). The power to detect sex differences in IQ test scores using multi-group covariance and mean structure analysis. Intelligence, 37, 396–404.

  • Molenaar, D., Dolan, C.V., Wicherts, J.M., & van der Maas, H.L.J. (2010b). Modeling differentiation of cognitive abilities within the higher-order factor model using moderated factor analysis. Intelligence, 38, 611–624.

  • Molenaar, D., van der Sluis, S., Boomsma, D.I., & Dolan, C.V. (2012). Detecting specific genotype by environment interaction using marginal maximum likelihood estimation in the classical twin design. Behavior Genetics, 42, 483–499.

  • Monti, A.C. (2003). A note on the estimation of the skew normal and the skew exponential power distributions. Metron, LXI, 205–219.

  • Muthén, B., & Hofacker, C. (1988). Testing the assumptions underlying tetrachoric correlations. Psychometrika, 53, 563–578.

  • Muthén, L.K., & Muthén, B.O. (2007). Mplus user’s guide (5th ed.). Los Angeles: Muthén & Muthén.

  • Neale, M.C. (1998). Modeling interaction and nonlinear effects with Mx: a general approach. In G. Marcoulides & R. Schumacker (Eds.), Interaction and non-linear effects in structural equation modeling (pp. 43–61). New York: Lawrence Erlbaum Associates.

  • Neale, M.C., Aggen, S.H., Maes, H.H., Kubarych, T.S., & Schmitt, J.E. (2006). Methodological issues in the assessment of substance use phenotypes. Addictive Behaviors, 31, 1010–1034.

  • Neale, M.C., Boker, S.M., Xie, G., & Maes, H.H. (2002). Mx: statistical modeling (6th ed.). Richmond: VCU.

  • Ramsay, J.O., & Abrahamowicz, M. (1989). Binomial regression with monotone splines: a psychometric application. Journal of the American Statistical Association, 84, 906–915.

  • Ree, M.J. (1979). Estimating item characteristic curves. Applied Psychological Measurement, 3, 371–385.

  • Rochon, J. (1992). ARMA covariance structures with time heteroscedasticity for repeated measures experiments. Journal of the American Statistical Association, 87, 777–784.

  • Rogers, T.B., Kuiper, N.A., & Kirker, W.S. (1977). Self-reference and the encoding of personal information. Journal of Personality and Social Psychology, 35, 677–688.

  • Samejima, F. (1969). Psychometric monograph: Vol. 17. Estimation of ability using a response pattern of graded scores. Richmond: The Psychometric Society.

  • Samejima, F. (1997). Departure from normal assumptions: a promise for future psychometrics with substantive mathematical modeling. Psychometrika, 62, 471–493.

  • Samejima, F. (2000). Logistic positive exponent family of models: virtue of asymmetric item characteristic curves. Psychometrika, 65, 319–335.

  • Samejima, F. (2008). Graded response model based on the logistic positive exponent family of models for dichotomous responses. Psychometrika, 73, 561–578.

  • Satorra, A., & Saris, W.E. (1985). The power of the likelihood ratio test in covariance structure analysis. Psychometrika, 50, 83–90.

  • Schermelleh-Engel, K., Moosbrugger, H., & Müller, H. (2003). Evaluating the fit of structural equation models: tests of significance and descriptive goodness-of-fit measures. Methods of Psychological Research, 8, 23–74.

  • Schmitt, J.E., Mehta, P.D., Aggen, S.H., Kubarych, T.S., & Neale, M.C. (2006). Semi-nonparametric methods for detecting latent non-normality: a fusion of latent trait and ordered latent class modeling. Multivariate Behavioral Research, 41, 427–443.

  • Shmueli, G. (2010). To explain or to predict? Statistical Science, 25, 289–310.

  • Seong, T.J. (1990). Sensitivity of marginal maximum likelihood estimation of item and ability parameters to the characteristics of the prior ability distributions. Applied Psychological Measurement, 14, 299–311.

  • Spearman, C.E. (1927). The abilities of man: their nature and measurement. New York: Macmillan.

  • Stone, C.A. (1992). Recovery of marginal maximum likelihood estimates in the two-parameter logistic response model: an evaluation of MULTILOG. Applied Psychological Measurement, 16, 1–16.

  • Swaminathan, H., & Gifford, J. (1983). Estimation of parameters in the three-parameter latent trait model. In D.J. Weiss (Ed.), New horizons in testing: latent trait test theory and computerized adaptive testing (pp. 13–30). New York: Academic Press.

  • Takane, Y., & de Leeuw, J. (1987). On the relationship between item response theory and factor analysis of discretized variables. Psychometrika, 52, 393–408.

  • Tellegen, A. (1988). The analysis of consistency in personality assessment. Journal of Personality, 56, 621–663.

  • Tucker-Drob, E.M. (2009). Differentiation of cognitive abilities across the life span. Developmental Psychology, 45, 1097–1118.

  • van den Oord, E.J. (2005). Estimating Johnson curve population distributions in MULTILOG. Applied Psychological Measurement, 29, 45–64.

  • van der Sluis, S., Dolan, C.V., Neale, M.C., Boomsma, D.I., & Posthuma, D. (2006). Detecting genotype-environment interaction in monozygotic twin data: comparing the Jinks & Fulker test and a new test based on marginal maximum likelihood estimation. Twin Research and Human Genetics, 9, 377–392.

  • Verhelst, N.D. (2009). Latent variable analysis with skew distributions. Manuscript in preparation.

  • Vermunt, J.K. (2004). An EM algorithm for the estimation of parametric and nonparametric hierarchical nonlinear models. Statistica Neerlandica, 58, 220–233.

  • Vermunt, J.K., & Hagenaars, J.A. (2004). Ordinal longitudinal data analysis. In R.C. Hauspie, N. Cameron, & L. Molinari (Eds.), Methods in human growth research (pp. 374–393). Cambridge: Cambridge University Press.

  • Vorst, H.C.M., & Bermond, B. (2001). Validity and reliability of the Bermond–Vorst alexithymia questionnaire. Personality and Individual Differences, 30, 413–434.

  • Wirth, R.J., & Edwards, M.C. (2007). Item factor analysis: current approaches and future directions. Psychological Methods, 12, 58–79.

  • Woods, C.M. (2007). Ramsay-curve IRT for Likert type data. Applied Psychological Measurement, 31, 195–212.

  • Zwinderman, A.H., & van den Wollenberg, A.L. (1990). Robustness of marginal maximum likelihood estimation in the Rasch model. Applied Psychological Measurement, 14, 73–81.

Acknowledgements

The research by Dylan Molenaar was made possible by a grant from the Netherlands Organization for Scientific Research (NWO). We thank Harry Vorst, Martin Muller, Pieter Röhling, and Clasine van der Wal for providing the data used in the illustration section and Mariska Knol for the computational resources used in carrying out the simulation study. We also thank three reviewers, the associate editor, and the editor for their comments on previous versions of this paper. Mx input files are available from the site of the first author, www.dylanmolenaar.nl.

Author information

Correspondence to Dylan Molenaar.

Appendix

We conducted a small simulation study to examine how heteroscedasticity influences the parameter estimates of the traditional GRM. We simulated 50 data sets according to the extended GRM, with either homoscedastic or heteroscedastic residuals. For the heteroscedastic residuals, we used a small (δ₁ = 0.5), medium (δ₁ = 1), or large (δ₁ = 1.5) effect size. Note that the large effect size is still realistic, as we found effects of this size in Application 2 of the manuscript (for instance, for the Cognitive Analyzing subscale, 5 of the 7 items showed effects around δ₁ = −1.5; see Table 4). We carried out this procedure for N = 1000 and N = 3000. Next, we fitted the traditional GRM to see how the parameter estimates differed from the true values. We found effects on the discrimination parameters and the thresholds, but not on the trait estimates.
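The data-generating step of such a study can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' Mx setup: the residual variance is taken to be 2δ0/(1 + exp(−δ1θ)), the latent trait is drawn from a standard skew-normal through its usual stochastic representation (without centering or rescaling), and the number of items, loadings, thresholds, and function names are hypothetical because the appendix does not report them.

```python
import math
import random


def draw_skew_normal(alpha, rng):
    """One draw from a standard skew-normal with shape alpha (alpha = 0 gives N(0, 1))."""
    d = alpha / math.sqrt(1.0 + alpha * alpha)
    return d * abs(rng.gauss(0.0, 1.0)) + math.sqrt(1.0 - d * d) * rng.gauss(0.0, 1.0)


def simulate_item_scores(n_subjects, loadings, thresholds, delta0, delta1,
                         alpha=0.0, seed=1):
    """Simulate ordinal item scores from an underlying-variable model.

    For item i, Z_i = loadings[i] * theta + e_i, where e_i is normal with a
    variance that depends on theta (assumed form). The score is the number
    of thresholds of item i that lie below Z_i.
    """
    rng = random.Random(seed)
    data = []
    for _ in range(n_subjects):
        theta = draw_skew_normal(alpha, rng)
        row = []
        for a, taus, d1 in zip(loadings, thresholds, delta1):
            sd = math.sqrt(2.0 * delta0 / (1.0 + math.exp(-d1 * theta)))
            z = a * theta + rng.gauss(0.0, sd)
            row.append(sum(z > tau for tau in taus))
        data.append(row)
    return data


# Hypothetical design: 5 four-category items, normal trait, one replication
# per heteroscedasticity level (0 = homoscedastic, then small, medium, large).
n_items = 5
for effect in (0.0, 0.5, 1.0, 1.5):
    scores = simulate_item_scores(
        n_subjects=1000,
        loadings=[1.0] * n_items,
        thresholds=[(-1.0, 0.0, 1.0)] * n_items,
        delta0=1.0,
        delta1=[effect] * n_items,
        seed=2012,
    )
    print(effect, scores[0])
```

In a full replication of the design described above, this step would be repeated 50 times per condition for N = 1000 and N = 3000, and a traditional GRM would then be fitted to each data set with dedicated IRT software.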

Table A.1 shows that a small degree of heteroscedasticity has little effect, but as the heteroscedasticity increases, the root mean squared difference (RMSD) increases when the heteroscedasticity is not modeled. With a large degree of heteroscedasticity and N = 3000, the RMSD is more than twice as large (0.17 to 0.19) as when the residuals are homoscedastic (0.07 to 0.08). We found negligible effects on the standard errors (both the empirical and the theoretical standard errors). It is clear that the discrimination parameters of the traditional GRM are biased if the true model is heteroscedastic. For the thresholds, we found even larger effects, but this is to be expected because the non-normality in the data is partly captured by the thresholds. As already mentioned, we found no (clear) effects on the trait estimates.

Table A.1. Root mean squared difference between estimated discrimination parameters (using the traditional GRM) and true discrimination parameters (within the het-GRM-skew).
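For reference, a root mean squared difference between estimated and true parameter values can be computed as in the sketch below. How the values in Table A.1 were aggregated (per item across replications, or across items) is not stated in this appendix, so the function is deliberately generic; its name and the example numbers are hypothetical.

```python
import math


def rmsd(estimates, true_values):
    """Root mean squared difference between paired estimates and true values."""
    if len(estimates) != len(true_values) or not estimates:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(e - t) ** 2 for e, t in zip(estimates, true_values)]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical discrimination estimates for one item over three replications,
# compared with a true value of 1.0.
print(rmsd([1.1, 0.9, 1.2], [1.0, 1.0, 1.0]))
```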

About this article

Cite this article

Molenaar, D., Dolan, C.V. & de Boeck, P. The Heteroscedastic Graded Response Model with a Skewed Latent Trait: Testing Statistical and Substantive Hypotheses Related to Skewed Item Category Functions. Psychometrika 77, 455–478 (2012). https://doi.org/10.1007/s11336-012-9273-5

  • DOI: https://doi.org/10.1007/s11336-012-9273-5
