An Exploratory Diagnostic Model for Ordinal Responses with Binary Attributes: Identifiability and Estimation

Psychometrika

Abstract

Diagnostic models (DMs) provide researchers and practitioners with tools to classify respondents into substantively relevant classes. DMs are widely applied to binary response data; however, binary response models are not applicable to the wealth of ordinal data collected by educational, psychological, and behavioral researchers. Prior research developed confirmatory ordinal DMs that require expert knowledge to specify the underlying structure. This paper introduces an exploratory DM for ordinal data. In particular, we present an exploratory ordinal DM, which uses a cumulative probit link along with Bayesian variable selection techniques to uncover the latent structure. Furthermore, we discuss new identifiability conditions for structured multinomial mixture models with binary attributes. We provide evidence of accurate parameter recovery in a Monte Carlo simulation study across moderate to large sample sizes. We apply the model to twelve items from the public-use, Early Childhood Longitudinal Study, Kindergarten Class of 1998–1999 approaches to learning and self-description questionnaire and report evidence to support a three-attribute solution with eight classes to describe the latent structure underlying the teacher and parent ratings. In short, the developed methodology contributes to the development of ordinal DMs and broadens their applicability to address theoretical and substantive issues more generally across the social sciences.

References

  • Albert, J. H. (1992). Bayesian estimation of normal ogive item response curves using Gibbs sampling. Journal of Educational and Behavioral Statistics, 17(3), 251–269.

  • Albert, J. H., & Chib, S. (1993). Bayesian analysis of binary and polychotomous response data. Journal of the American Statistical Association, 88(422), 669–679.

  • Allman, E. S., Matias, C., & Rhodes, J. A. (2009). Identifiability of parameters in latent structure models with many observed variables. Annals of Statistics, 37, 3099–3132.

  • Bao, J., & Hanson, T. E. (2015). Bayesian nonparametric multivariate ordinal regression. Canadian Journal of Statistics, 43(3), 337–357.

  • Béguin, A. A., & Glas, C. A. (2001). MCMC estimation and some model-fit analysis of multidimensional IRT models. Psychometrika, 66(4), 541–561.

  • Chen, J., & de la Torre, J. (2013). A general cognitive diagnosis model for expert-defined polytomous attributes. Applied Psychological Measurement, 37(6), 419–437.

  • Chen, J., & de la Torre, J. (2018). Introducing the general polytomous diagnosis modeling framework. Frontiers in Psychology, 9, 1–9.

  • Chen, Y., & Culpepper, S. A. (2018). A multivariate probit model for learning trajectories with application to classroom assessment. Paper presented at the International Meeting of the Psychometric Society, New York.

  • Chen, Y., Culpepper, S. A., Chen, Y., & Douglas, J. (2018). Bayesian estimation of the DINA Q-matrix. Psychometrika, 83, 89–108.

  • Chen, Y., Culpepper, S. A., & Liang, F. (2018). Beyond the Q-matrix: A general approach to cognitive diagnostic models. Paper presented at the International Meeting of the Psychometric Society, New York.

  • Chen, Y., Culpepper, S. A., Wang, S., & Douglas, J. A. (2018). A hidden Markov model for learning trajectories in cognitive diagnosis with application to spatial rotation skills. Applied Psychological Measurement, 42, 5–23.

  • Chen, Y., Liu, J., Xu, G., & Ying, Z. (2015). Statistical analysis of Q-matrix based diagnostic classification models. Journal of the American Statistical Association, 110(510), 850–866.

  • Cowles, M. K. (1996). Accelerating Monte Carlo Markov chain convergence for cumulative-link generalized linear models. Statistics and Computing, 6(2), 101–111.

  • Culpepper, S. A. (2015). Bayesian estimation of the DINA model with Gibbs sampling. Journal of Educational and Behavioral Statistics, 40(5), 454–476.

  • Culpepper, S. A. (2016). Revisiting the 4-parameter item response model: Bayesian estimation and application. Psychometrika, 81(4), 1142–1163.

  • Culpepper, S. A. (2019). Estimating the cognitive diagnosis Q matrix with expert knowledge: Application to the fraction-subtraction dataset. Psychometrika, 84, 333–357. https://doi.org/10.1007/s11336-018-9643-8.

  • Culpepper, S. A., & Chen, Y. (2018). Development and application of an exploratory reduced reparameterized unified model. Journal of Educational and Behavioral Statistics, 44, 3–24.

  • DeCarlo, L. T. (2011). On the analysis of fraction subtraction data: The DINA model, classification, latent class sizes, and the Q-matrix. Applied Psychological Measurement, 35(1), 8–26.

  • de la Torre, J. (2011). The generalized DINA model framework. Psychometrika, 76(2), 179–199.

  • de la Torre, J., & Douglas, J. A. (2004). Higher-order latent trait models for cognitive diagnosis. Psychometrika, 69(3), 333–353.

  • de la Torre, J., & Douglas, J. A. (2008). Model evaluation and multiple strategies in cognitive diagnosis: An analysis of fraction subtraction data. Psychometrika, 73(4), 595–624.

  • DeYoreo, M., & Kottas, A. (2018). Bayesian nonparametric modeling for multivariate ordinal regression. Journal of Computational and Graphical Statistics, 27(1), 71–84.

  • DeYoreo, M., Reiter, J. P., & Hillygus, D. S. (2017). Bayesian mixture models with focused clustering for mixed ordinal and nominal data. Bayesian Analysis, 12(3), 679–703.

  • Fang, G., Liu, J., & Ying, Z. (2019). On the identifiability of diagnostic classification models. Psychometrika, 84, 19–40.

  • Green, B. F. (1951). A general solution for the latent class model of latent structure analysis. Psychometrika, 16(2), 151–166.

  • Haberman, S. J., von Davier, M., & Lee, Y.-H. (2008). Comparison of multidimensional item response models: Multivariate normal ability distributions versus multivariate polytomous ability distributions. ETS Research Report Series, 2008(2), 1–25.

  • Henson, R. A., & Templin, J. (2007). Importance of Q-matrix construction and its effects cognitive diagnosis model results. In Annual meeting of the national council on measurement in education, Chicago, IL.

  • Henson, R. A., Templin, J. L., & Willse, J. T. (2009). Defining a family of cognitive diagnosis models using log-linear models with latent variables. Psychometrika, 74(2), 191–210.

  • Hoijtink, H., & Molenaar, I. W. (1997). A multidimensional item response model: Constrained latent class analysis using the Gibbs sampler and posterior predictive checks. Psychometrika, 62(2), 171–189.

  • Jain, S., & Neal, R. M. (2004). A split-merge Markov chain Monte Carlo procedure for the Dirichlet process mixture model. Journal of Computational and Graphical Statistics, 13(1), 158–182.

  • Karelitz, T. M. (2004). Ordered category attribute coding framework for cognitive assessments. Unpublished doctoral dissertation, University of Illinois at Urbana-Champaign.

  • Kaya, Y., & Leite, W. L. (2017). Assessing change in latent skills across time with longitudinal cognitive diagnosis modeling: An evaluation of model performance. Educational and Psychological Measurement, 77(3), 369–388.

  • Kottas, A., Müller, P., & Quintana, F. (2005). Nonparametric Bayesian modeling for multivariate ordinal data. Journal of Computational and Graphical Statistics, 14(3), 610–625.

  • Kruskal, J. B. (1976). More factors than subjects, tests and treatments: An indeterminacy theorem for canonical decomposition and individual differences scaling. Psychometrika, 41(3), 281–293.

  • Kruskal, J. B. (1977). Three-way arrays: Rank and uniqueness of trilinear decompositions, with application to arithmetic complexity and statistics. Linear Algebra and Its Applications, 18(2), 95–138.

  • Li, F., Cohen, A., Bottge, B., & Templin, J. (2016). A latent transition analysis model for assessing change in cognitive skills. Educational and Psychological Measurement, 76(2), 181–204.

  • Liu, J., Xu, G., & Ying, Z. (2013). Theory of the self-learning Q-matrix. Bernoulli, 19(5A), 1790–1817.

  • Liu, R., & Jiang, Z. (2018). Diagnostic classification models for ordinal item responses. Frontiers in Psychology, 9, 1–12.

  • Ma, W., & de la Torre, J. (2016). A sequential cognitive diagnosis model for polytomous responses. British Journal of Mathematical and Statistical Psychology, 69(3), 253–275.

  • Ma, W., & de la Torre, J. (2019). An empirical Q-matrix validation method for the sequential generalized DINA model. British Journal of Mathematical and Statistical Psychology. https://doi.org/10.1111/bmsp.12156.

  • Madison, M. J., & Bradshaw, L. P. (2018). Assessing growth in a diagnostic classification model framework. Psychometrika, 83, 963–990.

  • McDonald, R. P. (1962). A note on the derivation of the general latent class model. Psychometrika, 27(2), 203–206.

  • Proctor, C. H. (1970). A probabilistic formulation and statistical analysis of Guttman scaling. Psychometrika, 35(1), 73–78.

  • Rost, J. (1988). Rating scale analysis with latent class models. Psychometrika, 53(3), 327–348.

  • Rupp, A. A., & Templin, J. L. (2008). The effects of Q-matrix misspecification on parameter estimates and classification accuracy in the DINA model. Educational and Psychological Measurement, 68(1), 78–96.

  • Shute, V. J., Hansen, E. G., & Almond, R. G. (2008). You can’t fatten a hog by weighing it – or can you? Evaluating an assessment for learning system called ACED. International Journal of Artificial Intelligence in Education, 18(4), 289–316.

  • Sinharay, S., Johnson, M. S., & Stern, H. S. (2006). Posterior predictive assessment of item response theory models. Applied Psychological Measurement, 30(4), 298–321.

  • Templin, J. L. (2004). Generalized linear mixed proficiency models. Unpublished doctoral dissertation, University of Illinois at Urbana-Champaign.

  • Templin, J. L., & Henson, R. A. (2006). Measurement of psychological disorders using cognitive diagnosis models. Psychological Methods, 11(3), 287.

  • Templin, J. L., Henson, R. A., Templin, S. E., & Roussos, L. (2008). Robustness of hierarchical modeling of skill association in cognitive diagnosis models. Applied Psychological Measurement, 32, 559–574.

  • Tourangeau, K., Nord, C., Lê, T., Sorongon, A., Hagedorn, M., Daly, P., & Najarian, M. (2015). Early childhood longitudinal study, kindergarten class of 2010–2011 (ECLS-K:2011), user’s manual for the ECLS-K:2011 kindergarten data file and electronic codebook, public version (NCES 2015-074). U.S. Department of Education. Washington, DC: National Center for Education Statistics. https://nces.ed.gov/pubsearch/pubsinfo.asp?pubid=2010070. Accessed 19 Apr 2018.

  • von Davier, M. (2008). A general diagnostic model applied to language testing data. British Journal of Mathematical and Statistical Psychology, 61(2), 287–307.

  • von Davier, M. (2009). Some notes on the reinvention of latent structure models as diagnostic classification models. Measurement: Interdisciplinary Research and Perspectives, 7, 67–74.

  • Wang, S., Yang, Y., Culpepper, S. A., & Douglas, J. (2017). Tracking skill acquisition with cognitive diagnosis models: A higher-order hidden Markov model with covariates. Journal of Educational and Behavioral Statistics, 43(1), 57–87.

  • Xu, G. (2017). Identifiability of restricted latent class models with binary responses. Annals of Statistics, 45(2), 675–707.

  • Xu, G., & Shang, Z. (2018). Identifying latent structures in restricted latent class models. Journal of the American Statistical Association, 113(523), 1284–1295.

  • Ye, S., Fellouris, G., Culpepper, S. A., & Douglas, J. (2016). Sequential detection of learning in cognitive diagnosis. British Journal of Mathematical and Statistical Psychology, 69(2), 139–158.

Acknowledgements

This research was partially supported by National Science Foundation Methodology, Measurement, and Statistics Program Grants 1632023 and 1758631 and Spencer Foundation Grant 201700062. The manuscript benefited from the comments of the Editor, the Associate Editor, three blind reviewers, and Jeff Douglas. Any remaining shortcomings belong to the author.

Author information

Corresponding author

Correspondence to Steven Andrew Culpepper.

Appendix: Gibbs Sampling Algorithm and Full Conditional Distributions

This appendix presents the full conditional distributions used to approximate the posterior distribution of the ordinal diagnostic model parameters with Gibbs sampling. At each iteration \(t=1,\ldots ,T\), we sample:

  1. For \(i=1,\ldots ,n\),

    (a) \({\varvec{\alpha }}_i^{(t)}\) from the multinomial full conditional distribution \({\varvec{\alpha }}_i^{(t)}|{\varvec{Y}}_i,\mathbf B ^{(t-1)},{\varvec{\alpha }}_1^{(t)},\ldots ,{\varvec{\alpha }}_{i-1}^{(t)},{\varvec{\alpha }}_{i+1}^{(t-1)},\ldots ,{\varvec{\alpha }}_{n}^{(t-1)}\), where the conditional probability that \({\varvec{\alpha }}_i^{(t)}\) is classified as profile \(c\) is

      $$\begin{aligned} &P\left( {\varvec{\alpha }}_i^{(t)\top }{\varvec{v}}=c \,\middle |\, {\varvec{Y}}_i,\mathbf B ^{(t-1)},{\varvec{\alpha }}_1^{(t)},\ldots ,{\varvec{\alpha }}_{i-1}^{(t)},{\varvec{\alpha }}_{i+1}^{(t-1)},\ldots ,{\varvec{\alpha }}_{n}^{(t-1)}\right) \\&\quad =\frac{(n_{ci}+n_{c0})\prod _{j=1}^J \theta _{jc,y_{ij}}^{(t-1)} }{\sum _{c'=0}^{2^K-1} (n_{c'i}+n_{c'0}) \prod _{j=1}^J \theta _{jc',y_{ij}}^{(t-1)} } \end{aligned} \tag{A1}$$

      where \(\theta _{jc,y_{ij}}^{(t-1)}=\Phi \left( \tau _{jc,y_{ij}+1}-{\varvec{a}}_c^\top {\varvec{\beta }}_j^{(t-1)}\right) -\Phi \left( \tau _{jc,y_{ij}}-{\varvec{a}}_c^\top {\varvec{\beta }}_j^{(t-1)}\right) \). Notice that we integrate \({\varvec{\pi }}\) out of the prior distribution \(p(\mathbf A ,{\varvec{\pi }})=p({\varvec{\alpha }}_1,\ldots ,{\varvec{\alpha }}_n|{\varvec{\pi }})p({\varvec{\pi }})\) and instead use the conditional prior distribution \(p({\varvec{\alpha }}_i|{\varvec{\alpha }}_1^{(t)},\ldots ,{\varvec{\alpha }}_{i-1}^{(t)},{\varvec{\alpha }}_{i+1}^{(t-1)},\ldots ,{\varvec{\alpha }}_{n}^{(t-1)})\). Consequently, the usual \(\pi _c\) (e.g., see Equation 7 of Culpepper, 2019) is replaced with \(n_{ci}+n_{c0}\), where \(n_{ci}\) is the number of respondents other than \(i\) who are classified in class \(c\) (e.g., see Jain & Neal, 2004) and \(n_{c0}\) is the prior Dirichlet parameter (note \(n_{c0}=1\) for a uniform prior).

    (b) For \(j=1,\ldots ,J\), update the latent augmented data from the full conditional distribution

      $$\begin{aligned} Y_{ij}^{*(t)}|Y_{ij},{\varvec{\alpha }}_i^{(t)},{\varvec{\beta }}_j^{(t-1)}\sim \mathcal N({\varvec{a}}_i^{(t)\top }{\varvec{\beta }}_j^{(t-1)},1)\, \mathcal I(\tau _{jc,y_{ij}}<Y_{ij}^{*(t)}<\tau _{jc,y_{ij}+1}) \end{aligned} \tag{A2}$$

      where \(\tau _{jc,y_{ij}}\) and \(\tau _{jc,y_{ij}+1}\) are the lower and upper thresholds for the observed value of \(Y_{ij}\) for class \(c\) and item \(j\). Recall that we follow the previously discussed strategies and fix the thresholds as \({\varvec{\tau }}_j=(0,2,\ldots , 2(M_j-2))^\top \).

  2. Update the latent class probabilities (i.e., the mixing weights) by sampling \({\varvec{\pi }}^{(t)}|\mathbf{A }^{(t)}\) from the Dirichlet full conditional distribution (e.g., see Culpepper, 2015).

  3. For \(j=1,\ldots ,J\),

    (a) For \(k=1,\ldots ,K\), sample \(q_{jk}^{(t)}\) from the Bernoulli full conditional distribution \(q_{jk}^{(t)}|{\varvec{\beta }}_j^{(t-1)}, q_{j1}^{(t)},\ldots ,q_{j,k-1}^{(t)},q_{j,k+1}^{(t-1)},\ldots ,q_{jK}^{(t-1)},\omega ^{(t-1)}\) (e.g., see Culpepper, 2019).

    (b) For \(p=1,\ldots ,P\), sample \(\beta _{jp}^{(t)}\) from the truncated normal full conditional distribution \(\beta _{jp}^{(t)}|Y_{1j}^{*(t)},\ldots ,Y_{nj}^{*(t)},\mathbf A ^{(t)},\beta _{j1}^{(t)},\ldots ,\beta _{j,p-1}^{(t)},\beta _{j,p+1}^{(t-1)},\ldots ,\beta _{j,P+1}^{(t-1)}, {\varvec{q}}_j^{(t)}\) (e.g., see Culpepper, 2019).

  4. Sample \(\omega ^{(t)}\) from the Beta full conditional distribution \(\omega ^{(t)}|\mathbf{Q }^{(t)}\) (e.g., see Culpepper, 2019).
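
Steps 1 and 2 of the algorithm above can be illustrated with a short Python sketch. It is a minimal illustration under stated assumptions and not the author's implementation: responses are assumed to be coded \(0,\ldots ,M_j-1\), the fixed thresholds are assumed to be padded with \(-\infty \) and \(+\infty \), and the linear predictors \({\varvec{a}}_c^\top {\varvec{\beta }}_j\) are assumed to be supplied precomputed in a hypothetical nested list `mu[j][c]`. The function names, the inverse-CDF draw for the truncated normal in (A2), and the Dirichlet parameters (class counts plus \(n_{c0}\)) are likewise choices made here for illustration; the inverse-CDF draw in particular may be numerically fragile far in the tails.

```python
import math
import random
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal, used only for the inverse CDF


def phi(x):
    """Standard normal CDF; math.erf maps -inf/+inf to -1/+1."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def thresholds(m):
    """Padded thresholds (-inf, 0, 2, ..., 2(m-2), +inf) for m categories.

    Padding with -inf and +inf is an assumption of this sketch.
    """
    return [-math.inf] + [2.0 * k for k in range(m - 1)] + [math.inf]


def category_prob(y, mu, tau):
    """theta = Phi(tau[y+1] - mu) - Phi(tau[y] - mu) for response y."""
    return phi(tau[y + 1] - mu) - phi(tau[y] - mu)


def class_probs(y_i, mu, taus, counts_without_i, prior=1.0):
    """Collapsed class probabilities in the form of Equation (A1)."""
    weights = []
    for c, n_ci in enumerate(counts_without_i):
        w = n_ci + prior  # replaces pi_c after integrating pi out
        for j, y in enumerate(y_i):
            w *= category_prob(y, mu[j][c], taus[j])
        weights.append(w)
    total = sum(weights)
    return [w / total for w in weights]


def sample_augmented(y, mu, tau, rng):
    """Inverse-CDF draw from N(mu, 1) truncated to (tau[y], tau[y+1])."""
    lo, hi = phi(tau[y] - mu), phi(tau[y + 1] - mu)
    u = lo + (hi - lo) * rng.random()
    u = min(max(u, 1e-12), 1.0 - 1e-12)  # keep u strictly inside (0, 1)
    return mu + STD_NORMAL.inv_cdf(u)


def sample_pi(counts, prior, rng):
    """Dirichlet(counts + prior) draw via normalized gamma variates."""
    g = [rng.gammavariate(n + prior, 1.0) for n in counts]
    total = sum(g)
    return [x / total for x in g]


# Toy example: J = 2 items with 3 categories each and 2 classes (K = 1).
rng = random.Random(0)
taus = [thresholds(3), thresholds(3)]
mu = [[0.0, 2.0], [0.0, 2.0]]  # mu[j][c]: hypothetical a_c' beta_j values
y_i = [2, 2]                   # respondent i chooses the top category twice
probs = class_probs(y_i, mu, taus, counts_without_i=[5, 5])
c_i = rng.choices(range(len(probs)), weights=probs)[0]
y_star = [sample_augmented(y, mu[j][c_i], taus[j], rng)
          for j, y in enumerate(y_i)]
pi = sample_pi([5, 6], prior=1.0, rng=rng)
```

In this toy example the class with the larger linear predictor receives nearly all of the posterior mass, because both responses fall in the highest category. Steps 3 and 4 (the \(q_{jk}\), \(\beta _{jp}\), and \(\omega \) updates) are omitted because their full conditionals are given here only by reference to Culpepper (2019).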

About this article

Cite this article

Culpepper, S.A. An Exploratory Diagnostic Model for Ordinal Responses with Binary Attributes: Identifiability and Estimation. Psychometrika 84, 921–940 (2019). https://doi.org/10.1007/s11336-019-09683-4

  • DOI: https://doi.org/10.1007/s11336-019-09683-4
