Latent Markov models: a review of a general framework for the analysis of longitudinal data with covariates

  • Invited Paper
  • TEST

Abstract

We provide a comprehensive overview of latent Markov (LM) models for the analysis of longitudinal categorical data. We illustrate the general version of the LM model which includes individual covariates, and several constrained versions. Constraints make the model more parsimonious and allow us to consider and test hypotheses of interest. These constraints may be put on the conditional distribution of the response variables given the latent process (measurement model) or on the distribution of the latent process (latent model). We also illustrate in detail maximum likelihood estimation through the Expectation–Maximization algorithm, which may be efficiently implemented by recursions taken from the hidden Markov literature. We outline methods for obtaining standard errors for the parameter estimates. We also illustrate methods for selecting the number of states and for path prediction. Finally, we mention issues related to Bayesian inference of LM models. Possibilities for further developments are given among the concluding remarks.
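
As a rough illustration of the kind of recursion the abstract refers to, the following minimal sketch computes the likelihood of a single response sequence under a basic LM model using the forward recursion from the hidden Markov literature. It is not the authors' implementation. It assumes one categorical response per occasion, no covariates, and time-homogeneous transition probabilities. The function name `lm_likelihood`, the argument names, and the parameter values are all hypothetical.

```python
def lm_likelihood(y, init, trans, emit):
    """Likelihood of one response sequence under a basic latent Markov model.

    y     : list of observed response categories (integers), one per occasion
    init  : init[u] is the initial probability of latent state u
    trans : trans[u][v] is the probability of moving from state u to state v
            (assumed constant over time in this sketch)
    emit  : emit[u][c] is the probability of response c given latent state u
    """
    k = len(init)
    # alpha[u] holds P(y_1, ..., y_t, U_t = u) for the current occasion t.
    alpha = [init[u] * emit[u][y[0]] for u in range(k)]
    for obs in y[1:]:
        # Propagate through the latent transition, then weight by the
        # conditional response probability at the new occasion.
        alpha = [
            sum(alpha[u] * trans[u][v] for u in range(k)) * emit[v][obs]
            for v in range(k)
        ]
    # Summing over the final latent state gives the manifest probability of y.
    return sum(alpha)


# Hypothetical two-state, binary-response parameter values (not from the paper).
init = [0.5, 0.5]
trans = [[0.9, 0.1], [0.2, 0.8]]
emit = [[0.8, 0.2], [0.3, 0.7]]

if __name__ == "__main__":
    print(lm_likelihood([0, 0, 1], init, trans, emit))
```

The cost of this recursion grows linearly with the number of occasions, whereas summing over every latent path directly grows exponentially. For long sequences the forward quantities are usually rescaled or kept on the log scale to avoid numerical underflow, which this sketch omits.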

Acknowledgments

F. Bartolucci and F. Pennoni acknowledge the financial support from the grant “Finite mixture and latent variable models for causal inference and analysis of socio-economic data” (FIRB-Futuro in ricerca) funded by the Italian Government (RBFR12SHVV).

Author information

Corresponding author

Correspondence to F. Pennoni.

Additional information

This invited paper is discussed in comments available at doi:10.1007/s11749-014-0387-1; doi:10.1007/s11749-014-0388-0; doi:10.1007/s11749-014-0389-z; doi:10.1007/s11749-014-0390-6.

About this article

Cite this article

Bartolucci, F., Farcomeni, A. & Pennoni, F. Latent Markov models: a review of a general framework for the analysis of longitudinal data with covariates. TEST 23, 433–465 (2014). https://doi.org/10.1007/s11749-014-0381-7

  • DOI: https://doi.org/10.1007/s11749-014-0381-7
