Abstract
We provide a comprehensive overview of latent Markov (LM) models for the analysis of longitudinal categorical data. We illustrate the general version of the LM model, which includes individual covariates, and several constrained versions. Constraints make the model more parsimonious and allow us to formulate and test hypotheses of interest. These constraints may be placed on the conditional distribution of the response variables given the latent process (measurement model) or on the distribution of the latent process (latent model). We also illustrate in detail maximum likelihood estimation through the Expectation–Maximization algorithm, which may be efficiently implemented by recursions taken from the hidden Markov literature. We outline methods for obtaining standard errors for the parameter estimates. We also illustrate methods for selecting the number of states and for path prediction. Finally, we mention issues related to Bayesian inference for LM models. Possibilities for further developments are given in the concluding remarks.
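As an illustration of the kind of recursion the abstract refers to, the following is a minimal sketch of the forward recursion for the likelihood of a single response sequence. It is not the authors' implementation. It assumes a basic LM model with one categorical response per occasion, time-homogeneous transition probabilities, and no covariates. The function names, the parameter names (`init`, `trans`, `emis`), and the numerical values are hypothetical. A brute-force sum over all latent paths is included only to check the recursion.

```python
from itertools import product


def forward_likelihood(init, trans, emis, responses):
    """Return p(y_1, ..., y_T) for a basic latent Markov model.

    init[u]     : probability that the latent state at occasion 1 is u
    trans[u][v] : probability of moving from state u to state v
    emis[u][y]  : probability of response category y given state u
    responses   : observed categories y_1, ..., y_T (integers)
    """
    k = len(init)
    # Initialisation: joint probability of the first response and each state.
    alpha = [init[u] * emis[u][responses[0]] for u in range(k)]
    # Recursion: propagate through the transition matrix, then weight by
    # the conditional response probability at the new occasion.
    for y in responses[1:]:
        alpha = [
            sum(alpha[u] * trans[u][v] for u in range(k)) * emis[v][y]
            for v in range(k)
        ]
    # Termination: marginalise over the last latent state.
    return sum(alpha)


def brute_force_likelihood(init, trans, emis, responses):
    """Same quantity, summing over every latent path (for checking only)."""
    k, total = len(init), 0.0
    for path in product(range(k), repeat=len(responses)):
        p = init[path[0]] * emis[path[0]][responses[0]]
        for t in range(1, len(responses)):
            p *= trans[path[t - 1]][path[t]] * emis[path[t]][responses[t]]
        total += p
    return total


# Hypothetical two-state model with a binary response (illustrative values).
init = [0.6, 0.4]
trans = [[0.9, 0.1], [0.2, 0.8]]
emis = [[0.8, 0.2], [0.3, 0.7]]
responses = [0, 0, 1, 1]

lik = forward_likelihood(init, trans, emis, responses)
check = brute_force_likelihood(init, trans, emis, responses)
```

The recursion costs on the order of T k² operations for T occasions and k states, whereas the brute-force sum grows as k^T. For long sequences the unscaled quantities in `alpha` can underflow, so a practical implementation would normalise them at each step or work on the log scale.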
Acknowledgments
F. Bartolucci and F. Pennoni acknowledge the financial support from the grant “Finite mixture and latent variable models for causal inference and analysis of socio-economic data” (FIRB-Futuro in ricerca) funded by the Italian Government (RBFR12SHVV).
Additional information
This invited paper is discussed in comments available at doi:10.1007/s11749-014-0387-1; doi:10.1007/s11749-014-0388-0; doi:10.1007/s11749-014-0389-z; doi:10.1007/s11749-014-0390-6.
Cite this article
Bartolucci, F., Farcomeni, A. & Pennoni, F. Latent Markov models: a review of a general framework for the analysis of longitudinal data with covariates. TEST 23, 433–465 (2014). https://doi.org/10.1007/s11749-014-0381-7
DOI: https://doi.org/10.1007/s11749-014-0381-7
Keywords
- EM algorithm
- Bayesian framework
- Forward–Backward recursions
- Hidden Markov models
- Measurement errors
- Panel data
- Unobserved heterogeneity