An Exploratory Diagnostic Model for Ordinal Responses with Binary Attributes: Identifiability and Estimation

Culpepper, Steven Andrew

doi:10.1007/s11336-019-09683-4

Published: 20 August 2019

Psychometrika, Volume 84, pages 921–940 (2019)

Steven Andrew Culpepper ORCID: orcid.org/0000-0003-4226-6176¹

Abstract

Diagnostic models (DMs) provide researchers and practitioners with tools to classify respondents into substantively relevant classes. DMs are widely applied to binary response data; however, binary response models are not applicable to the wealth of ordinal data collected by educational, psychological, and behavioral researchers. Prior research developed confirmatory ordinal DMs that require expert knowledge to specify the underlying structure. This paper introduces an exploratory DM for ordinal data that uses a cumulative probit link along with Bayesian variable selection techniques to uncover the latent structure. Furthermore, we discuss new identifiability conditions for structured multinomial mixture models with binary attributes. We provide evidence of accurate parameter recovery in a Monte Carlo simulation study across moderate to large sample sizes. We apply the model to twelve items from the public-use Early Childhood Longitudinal Study, Kindergarten Class of 1998–1999, approaches to learning and self-description questionnaire and report evidence to support a three-attribute solution with eight classes to describe the latent structure underlying the teacher and parent ratings. In short, the developed methodology contributes to the development of ordinal DMs and broadens their applicability to address theoretical and substantive issues more generally across the social sciences.

References

Albert, J. H. (1992). Bayesian estimation of normal ogive item response curves using Gibbs sampling. Journal of Educational and Behavioral Statistics, 17(3), 251–269.
Albert, J. H., & Chib, S. (1993). Bayesian analysis of binary and polychotomous response data. Journal of the American Statistical Association, 88(422), 669–679.
Allman, E. S., Matias, C., & Rhodes, J. A. (2009). Identifiability of parameters in latent structure models with many observed variables. Annals of Statistics, 37, 3099–3132.
Bao, J., & Hanson, T. E. (2015). Bayesian nonparametric multivariate ordinal regression. Canadian Journal of Statistics, 43(3), 337–357.
Béguin, A. A., & Glas, C. A. (2001). MCMC estimation and some model-fit analysis of multidimensional IRT models. Psychometrika, 66(4), 541–561.
Chen, J., & de la Torre, J. (2013). A general cognitive diagnosis model for expert-defined polytomous attributes. Applied Psychological Measurement, 37(6), 419–437.
Chen, J., & de la Torre, J. (2018). Introducing the general polytomous diagnosis modeling framework. Frontiers in Psychology, 9, 1–9.
Chen, Y., & Culpepper, S. A. (2018). A multivariate probit model for learning trajectories with application to classroom assessment. In Paper presentation at the international meeting of the psychometric society, New York.
Chen, Y., Culpepper, S. A., Chen, Y., & Douglas, J. (2018). Bayesian estimation of the DINA Q-matrix. Psychometrika, 83, 89–108.
Chen, Y., Culpepper, S. A., & Liang, F. (2018). Beyond the Q-matrix: A general approach to cognitive diagnostic models. In Paper presentation at the international meeting of the psychometric society, New York.
Chen, Y., Culpepper, S. A., Wang, S., & Douglas, J. A. (2018). A hidden Markov model for learning trajectories in cognitive diagnosis with application to spatial rotation skills. Applied Psychological Measurement, 42, 5–23.
Chen, Y., Liu, J., Xu, G., & Ying, Z. (2015). Statistical analysis of Q-matrix based diagnostic classification models. Journal of the American Statistical Association, 110(510), 850–866.
Cowles, M. K. (1996). Accelerating Monte Carlo Markov chain convergence for cumulative-link generalized linear models. Statistics and Computing, 6(2), 101–111.
Culpepper, S. A. (2015). Bayesian estimation of the DINA model with Gibbs sampling. Journal of Educational and Behavioral Statistics, 40(5), 454–476.
Culpepper, S. A. (2016). Revisiting the 4-parameter item response model: Bayesian estimation and application. Psychometrika, 81(4), 1142–1163.
Culpepper, S. A. (2019). Estimating the cognitive diagnosis Q matrix with expert knowledge: Application to the fraction-subtraction dataset. Psychometrika, 84, 333–357. 10.1007/s11336-018-9643-8.
Culpepper, S. A., & Chen, Y. (2018). Development and application of an exploratory reduced reparameterized unified model. Journal of Educational and Behavioral Statistics, 44, 3–24.
DeCarlo, L. T. (2011). On the analysis of fraction subtraction data: The DINA model, classification, latent class sizes, and the Q-matrix. Applied Psychological Measurement, 35(1), 8–26.
de la Torre, J. (2011). The generalized DINA model framework. Psychometrika, 76(2), 179–199.
de la Torre, J., & Douglas, J. A. (2004). Higher-order latent trait models for cognitive diagnosis. Psychometrika, 69(3), 333–353.
de la Torre, J., & Douglas, J. A. (2008). Model evaluation and multiple strategies in cognitive diagnosis: An analysis of fraction subtraction data. Psychometrika, 73(4), 595–624.
DeYoreo, M., & Kottas, A. (2018). Bayesian nonparametric modeling for multivariate ordinal regression. Journal of Computational and Graphical Statistics, 27(1), 71–84.
DeYoreo, M., Reiter, J. P., & Hillygus, D. S. (2017). Bayesian mixture models with focused clustering for mixed ordinal and nominal data. Bayesian Analysis, 12(3), 679–703.
Fang, G., Liu, J., & Ying, Z. (2019). On the identifiability of diagnostic classification models. Psychometrika, 84, 19–40.
Green, B. F. (1951). A general solution for the latent class model of latent structure analysis. Psychometrika, 16(2), 151–166.
Haberman, S. J., von Davier, M., & Lee, Y.-H. (2008). Comparison of multidimensional item response models: Multivariate normal ability distributions versus multivariate polytomous ability distributions. ETS Research Report Series, 2008(2), 1–25.
Henson, R. A., & Templin, J. (2007). Importance of Q-matrix construction and its effects cognitive diagnosis model results. In Annual meeting of the national council on measurement in education, Chicago, IL.
Henson, R. A., Templin, J. L., & Willse, J. T. (2009). Defining a family of cognitive diagnosis models using log-linear models with latent variables. Psychometrika, 74(2), 191–210.
Hoijtink, H., & Molenaar, I. W. (1997). A multidimensional item response model: Constrained latent class analysis using the Gibbs sampler and posterior predictive checks. Psychometrika, 62(2), 171–189.
Jain, S., & Neal, R. M. (2004). A split-merge Markov chain Monte Carlo procedure for the Dirichlet process mixture model. Journal of Computational and Graphical Statistics, 13(1), 158–182.
Karelitz, T. M. (2004). Ordered category attribute coding framework for cognitive assessments. Unpublished doctoral dissertation, University of Illinois at Urbana-Champaign.
Kaya, Y., & Leite, W. L. (2017). Assessing change in latent skills across time with longitudinal cognitive diagnosis modeling: An evaluation of model performance. Educational and Psychological Measurement, 77(3), 369–388.
Kottas, A., Müller, P., & Quintana, F. (2005). Nonparametric Bayesian modeling for multivariate ordinal data. Journal of Computational and Graphical Statistics, 14(3), 610–625.
Kruskal, J. B. (1976). More factors than subjects, tests and treatments: An indeterminacy theorem for canonical decomposition and individual differences scaling. Psychometrika, 41(3), 281–293.
Kruskal, J. B. (1977). Three-way arrays: Rank and uniqueness of trilinear decompositions, with application to arithmetic complexity and statistics. Linear Algebra and Its Applications, 18(2), 95–138.
Li, F., Cohen, A., Bottge, B., & Templin, J. (2016). A latent transition analysis model for assessing change in cognitive skills. Educational and Psychological Measurement, 76(2), 181–204.
Liu, J., Xu, G., & Ying, Z. (2013). Theory of the self-learning Q-matrix. Bernoulli, 19(5A), 1790–1817.
Liu, R., & Jiang, Z. (2018). Diagnostic classification models for ordinal item responses. Frontiers in Psychology, 9, 1–12.
Ma, W., & de la Torre, J. (2016). A sequential cognitive diagnosis model for polytomous responses. British Journal of Mathematical and Statistical Psychology, 69(3), 253–275.
Ma, W., & de la Torre, J. (2019). An empirical Q-matrix validation method for the sequential generalized DINA model. British Journal of Mathematical and Statistical Psychology. https://doi.org/10.1111/bmsp.12156.
Madison, M. J., & Bradshaw, L. P. (2018). Assessing growth in a diagnostic classification model framework. Psychometrika, 83, 963–990.
McDonald, R. P. (1962). A note on the derivation of the general latent class model. Psychometrika, 27(2), 203–206.
Proctor, C. H. (1970). A probabilistic formulation and statistical analysis of Guttman scaling. Psychometrika, 35(1), 73–78.
Rost, J. (1988). Rating scale analysis with latent class models. Psychometrika, 53(3), 327–348.
Rupp, A. A., & Templin, J. L. (2008). The effects of Q-matrix misspecification on parameter estimates and classification accuracy in the DINA model. Educational and Psychological Measurement, 68(1), 78–96.
Shute, V. J., Hansen, E. G., & Almond, R. G. (2008). You can’t fatten a hog by weighing it-or can you? Evaluating an assessment for learning system called ACED. International Journal of Artificial Intelligence in Education, 18(4), 289–316.
Sinharay, S., Johnson, M. S., & Stern, H. S. (2006). Posterior predictive assessment of item response theory models. Applied Psychological Measurement, 30(4), 298–321.
Templin, J. L. (2004). Generalized linear mixed proficiency models. Unpublished doctoral dissertation, University of Illinois at Urbana-Champaign.
Templin, J. L., & Henson, R. A. (2006). Measurement of psychological disorders using cognitive diagnosis models. Psychological Methods, 11(3), 287.
Templin, J. L., Henson, R. A., Templin, S. E., & Roussos, L. (2008). Robustness of hierarchical modeling of skill association in cognitive diagnosis models. Applied Psychological Measurement, 32, 559–574.
Tourangeau, K., Nord, C., Lê, T., Sorongon, A., Hagedorn, M., Daly, P., & Najarian, M. (2015). Early childhood longitudinal study, kindergarten class of 2010–2011 (ECLS-K:2011), user’s manual for the ECLS-K:2011 kindergarten data file and electronic codebook, public version (NCES 2015-074). U.S. Department of Education. Washington, DC: National Center for Education Statistics. https://nces.ed.gov/pubsearch/pubsinfo.asp?pubid=2010070. Accessed 19 Apr 2018.
von Davier, M. (2008). A general diagnostic model applied to language testing data. British Journal of Mathematical and Statistical Psychology, 61(2), 287–307.
von Davier, M. (2009). Some notes on the reinvention of latent structure models as diagnostic classification models. Measurement: Interdisciplinary Research and Perspectives, 7, 67–74.
Wang, S., Yang, Y., Culpepper, S. A., & Douglas, J. (2017). Tracking skill acquisition with cognitive diagnosis models: A higher-order hidden Markov model with covariates. Journal of Educational and Behavioral Statistics, 43(1), 57–87.
Xu, G. (2017). Identifiability of restricted latent class models with binary responses. Annals of Statistics, 45(2), 675–707.
Xu, G., & Shang, Z. (2018). Identifying latent structures in restricted latent class models. Journal of the American Statistical Association, 113(523), 1284–1295.
Ye, S., Fellouris, G., Culpepper, S. A., & Douglas, J. (2016). Sequential detection of learning in cognitive diagnosis. British Journal of Mathematical and Statistical Psychology, 69(2), 139–158.

Acknowledgements

This research was partially supported by National Science Foundation Methodology, Measurement, and Statistics Program Grants 1632023 and 1758631 and Spencer Foundation Grant 201700062. The manuscript benefited from the comments of the Editor, the Associate Editor, three blind reviewers, and Jeff Douglas. Any remaining shortcomings belong to the author.

Author information

Authors and Affiliations

Department of Statistics, University of Illinois at Urbana-Champaign, 725 South Wright Street, Champaign, IL, 61820, USA
Steven Andrew Culpepper

Corresponding author

Correspondence to Steven Andrew Culpepper.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix: Gibbs Sampling Algorithm and Full Conditional Distributions

This section discusses the full conditional distributions used to approximate the posterior distribution of the ordinal diagnostic model parameters with Gibbs sampling. For iteration $t=1,\ldots , T$ we sample:

1. For $i=1,\ldots ,n$,
  (a) Sample ${\varvec{\alpha }}_i^{(t)}$ from the multinomial full conditional distribution ${\varvec{\alpha }}_i^{(t)}|{\varvec{Y}}_i,\mathbf B ^{(t-1)},{\varvec{\alpha }}_1^{(t)},\ldots ,{\varvec{\alpha }}_{i-1}^{(t)},{\varvec{\alpha }}_{i+1}^{(t-1)},\ldots ,{\varvec{\alpha }}_{n}^{(t-1)}$, where the conditional probability that ${\varvec{\alpha }}_i^{(t)}$ is classified as profile $c$ is
  $$\begin{aligned} P({\varvec{\alpha }}_i^{(t)\top }{\varvec{v}}&=c|{\varvec{Y}}_i,\mathbf B ^{(t-1)},{\varvec{\alpha }}_1^{(t)},\ldots ,{\varvec{\alpha }}_{i-1}^{(t)},{\varvec{\alpha }}_{i+1}^{(t-1)},\ldots ,{\varvec{\alpha }}_{n}^{(t-1)})\nonumber \\&=\frac{(n_{ci}+n_{c0})\prod _{j=1}^J \theta _{jc,y_{ij}}^{(t-1)} }{\sum _{c=0}^{2^K-1} (n_{ci}+n_{c0}) \prod _{j=1}^J \theta _{jc,y_{ij}}^{(t-1)} } \end{aligned}$$
  (A1)
  where $\theta _{jc,y_{ij}}^{(t-1)}=\Phi \left( \tau _{jc,y_{ij}+1}-{\varvec{a}}_c^\top {\varvec{\beta }}_j^{(t-1)}\right) -\Phi \left( \tau _{jc,y_{ij}}-{\varvec{a}}_c^\top {\varvec{\beta }}_j^{(t-1)}\right) $. Notice that we integrate ${\varvec{\pi }}$ out of the prior distribution $p(\mathbf A ,{\varvec{\pi }})=p({\varvec{\alpha }}_1,\ldots ,{\varvec{\alpha }}_n|{\varvec{\pi }})p({\varvec{\pi }})$ and instead use the conditional prior distribution $p({\varvec{\alpha }}_i|{\varvec{\alpha }}_1^{(t)},\ldots ,{\varvec{\alpha }}_{i-1}^{(t)},{\varvec{\alpha }}_{i+1}^{(t-1)},\ldots ,{\varvec{\alpha }}_{n}^{(t-1)})$, which implies that the usual $\pi _c$ (e.g., see Equation 7 of Culpepper, 2019) is replaced with $n_{ci}+n_{c0}$, where $n_{ci}$ is the number of respondents other than $i$ that are classified in class $c$ (e.g., see Jain & Neal, 2004) and $n_{c0}$ is the prior Dirichlet parameter (note $n_{c0}=1$ for a uniform prior).
  (b) For $j=1,\ldots ,J$, update the latent augmented data from the full conditional distribution
  $$\begin{aligned} Y_{ij}^{*(t)}|Y_{ij},{\varvec{\alpha }}_i^{(t)},{\varvec{\beta }}_j^{(t-1)}\sim \mathcal N({\varvec{a}}_i^{(t)\top }{\varvec{\beta }}_j^{(t-1)},1) \mathcal I(\tau _{jc,y_{ij}}<Y_{ij}^{*(t)}<\tau _{jc,y_{ij}+1}) \end{aligned}$$
  (A2)
  where $\tau _{jc,y_{ij}}$ and $\tau _{jc,y_{ij}+1}$ are the lower and upper thresholds for the observed value of $Y_{ij}$ for class $c$ and item $j$. Recall that we follow previously discussed strategies and fix the thresholds as ${\varvec{\tau }}_j=(0,2,\ldots , 2(M_j-2))^\top $.
2. Update the latent class probabilities (i.e., the mixing weights) as ${\varvec{\pi }}^{(t)}|\mathbf{A }^{(t)}$ from the Dirichlet full conditional distribution (e.g., see Culpepper, 2015).
3. For $j=1,\ldots ,J$,
  (a) For $k=1,\ldots ,K$, sample $q_{jk}^{(t)}$ from the Bernoulli full conditional distribution $q_{jk}^{(t)}|{\varvec{\beta }}_j^{(t-1)}, q_{j1}^{(t)},\ldots ,q_{j,k-1}^{(t)},q_{j,k+1}^{(t-1)},\ldots ,q_{jK}^{(t-1)},\omega ^{(t-1)}$ (e.g., see Culpepper, 2019).
  (b) For $p=1,\ldots ,P$, sample $\beta _{jp}^{(t)}$ from the truncated normal full conditional distribution $\beta _{jp}^{(t)}|Y_{1j}^{*(t)},\ldots ,Y_{nj}^{*(t)},\mathbf A ^{(t)},\beta _{j1}^{(t)},\ldots ,\beta _{j,p-1}^{(t)},\beta _{j,p+1}^{(t-1)},\ldots ,\beta _{j,P+1}^{(t-1)}, {\varvec{q}}_j^{(t)}$ (e.g., see Culpepper, 2019).
4. Sample $\omega ^{(t)}$ from the Beta full conditional distribution $\omega ^{(t)}|\mathbf{Q }^{(t)}$ (e.g., see Culpepper, 2019).
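
Steps 1(a), 1(b), and 2 of the sampler can be illustrated with a minimal Python sketch that uses only the standard library. It is not the author's implementation, and several pieces are assumptions made for illustration: the function names (`cut_points`, `category_prob`, `sample_profile`, `sample_latent`, `sample_pi`) are hypothetical; the outer thresholds are taken to be $-\infty$ and $+\infty$ so that the fixed interior thresholds $(0,2,\ldots ,2(M_j-2))$ partition the real line; the truncated normal in (A2) is drawn by inverting the normal CDF, which is one of several possible samplers; and the Dirichlet update is assumed to use the conjugate parameters $n_c+n_{c0}$. The argument `mu_by_class[c][j]` stands in for ${\varvec{a}}_c^\top {\varvec{\beta }}_j$. Steps 3 and 4 are not shown.

```python
import math
import random
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal: cdf is Phi, inv_cdf is its inverse


def cut_points(num_categories):
    """Cut points for an item with M categories: interior values 0, 2, ..., 2(M-2).

    Assumption: the outer cut points are -inf and +inf.
    """
    interior = [2.0 * m for m in range(num_categories - 1)]
    return [-math.inf] + interior + [math.inf]


def category_prob(y, mu, tau):
    """theta = Phi(tau[y+1] - mu) - Phi(tau[y] - mu) for category y = 0, ..., M-1."""
    return STD_NORMAL.cdf(tau[y + 1] - mu) - STD_NORMAL.cdf(tau[y] - mu)


def sample_profile(y_i, mu_by_class, taus, counts_excl_i, n_c0=1.0, rng=random):
    """Step 1(a): draw a class label with weights (n_ci + n_c0) * prod_j theta_jc."""
    weights = []
    for c, mus in enumerate(mu_by_class):
        like = 1.0
        for j, y in enumerate(y_i):
            like *= category_prob(y, mus[j], taus[j])
        weights.append((counts_excl_i[c] + n_c0) * like)
    total = sum(weights)
    probs = [w / total for w in weights]  # normalized as in (A1)
    label = rng.choices(range(len(probs)), weights=probs)[0]
    return label, probs


def sample_latent(y, mu, tau, rng=random):
    """Step 1(b): inverse-CDF draw from N(mu, 1) truncated to (tau[y], tau[y+1])."""
    lo = STD_NORMAL.cdf(tau[y] - mu)
    hi = STD_NORMAL.cdf(tau[y + 1] - mu)
    u = lo + (hi - lo) * rng.random()
    u = min(max(u, 1e-12), 1.0 - 1e-12)  # inv_cdf requires 0 < u < 1
    return mu + STD_NORMAL.inv_cdf(u)


def sample_pi(class_counts, n_c0=1.0, rng=random):
    """Step 2: Dirichlet(n_c + n_c0) draw via normalized Gamma variates (assumed form)."""
    draws = [rng.gammavariate(n + n_c0, 1.0) for n in class_counts]
    total = sum(draws)
    return [d / total for d in draws]


# Toy usage with two classes and two items (4 and 3 categories); all numbers are made up.
demo_rng = random.Random(0)
demo_taus = [cut_points(4), cut_points(3)]
demo_mu = [[-1.0, 0.5], [3.0, 2.5]]  # hypothetical values of a_c' beta_j
demo_label, demo_probs = sample_profile([0, 1], demo_mu, demo_taus, [5, 7], rng=demo_rng)
demo_ystar = [sample_latent(y, demo_mu[demo_label][j], demo_taus[j], rng=demo_rng)
              for j, y in enumerate([0, 1])]
demo_pi = sample_pi([5, 7], rng=demo_rng)
```

The clamp on `u` keeps the inverse CDF defined when a response falls far in the tail of its class distribution; a production sampler would likely use a dedicated tail-safe truncated normal routine instead.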

About this article

Cite this article

Culpepper, S.A. An Exploratory Diagnostic Model for Ordinal Responses with Binary Attributes: Identifiability and Estimation. Psychometrika 84, 921–940 (2019). https://doi.org/10.1007/s11336-019-09683-4

Received: 11 February 2019
Revised: 29 July 2019
Published: 20 August 2019
Issue Date: December 2019
DOI: https://doi.org/10.1007/s11336-019-09683-4
