Explicating the Conditions Under Which Multilevel Multiple Imputation Mitigates Bias Resulting from Random Coefficient-Dependent Missing Longitudinal Data

Prevention Science

Abstract

Random coefficient-dependent (RCD) missingness is a non-ignorable mechanism through which missing data can arise in longitudinal designs. RCD missingness, which cannot be tested for, occurs when subject-specific random effects correlate with the propensity for missingness or dropout. Particularly when covariate missingness is a problem, investigators typically handle missing longitudinal data with single-level multiple imputation, implemented either with long-format data, which ignores within-person dependency entirely, or with wide-format (i.e., multivariate) data, which ignores some aspects of within-person dependency. When either of these standard approaches is used, RCD missingness leads to parameter bias and incorrect inference. We explain why multilevel multiple imputation (MMI) should alleviate bias induced by an RCD missing data mechanism under conditions that contribute to stronger determinacy of random coefficients. We evaluate this hypothesis with a simulation study that considers three design factors: intraclass correlation (ICC; ranging from .25 to .75), number of waves (ranging from 4 to 8), and percentage of missing data (ranging from 20% to 50%). We find that MMI greatly outperforms the single-level wide-format (multivariate) imputation method under an RCD mechanism. For the MMI analyses, bias was most alleviated when the ICC was high, when there were more waves of data, and when there was less missing data. We offer practical recommendations for handling longitudinal missing data.
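
To make the RCD mechanism concrete, the following is a minimal sketch in Python (standard library only) of how dropout that depends on a subject's random slope can distort observed-case summaries. It is not the authors' simulation design and it does not implement MMI. The function names, the linear growth model, the dropout model, and every default value (sample size, slope SD, `rcd_strength`, `base_logit`) are illustrative assumptions. The ICC is defined here for the random intercept only, as intercept variance over intercept-plus-residual variance.

```python
import math
import random
from statistics import mean


def simulate_rcd(n_subjects=2000, n_waves=6, icc=0.5, rcd_strength=1.0,
                 base_logit=-2.0, seed=1):
    """Simulate linear growth data whose dropout depends on the random slope.

    Returns a list of (subject, wave, y, observed) tuples. All defaults are
    illustrative assumptions, not values taken from the article.
    """
    rng = random.Random(seed)
    sigma2 = 1.0                          # level-1 residual variance (assumed)
    tau00 = icc / (1.0 - icc) * sigma2    # intercept variance implied by the ICC
    slope_sd = 0.5                        # SD of the random slope (assumed)
    rows = []
    for subject in range(n_subjects):
        b0 = rng.gauss(0.0, math.sqrt(tau00))   # random intercept
        b1 = rng.gauss(0.0, slope_sd)           # random slope
        dropped = False
        for wave in range(n_waves):
            y = b0 + (0.5 + b1) * wave + rng.gauss(0.0, math.sqrt(sigma2))
            if wave > 0 and not dropped:
                # RCD: the dropout hazard depends on the unobserved random
                # slope (in SD units), not on any observed outcome.
                logit = base_logit + rcd_strength * (b1 / slope_sd)
                dropped = rng.random() < 1.0 / (1.0 + math.exp(-logit))
            rows.append((subject, wave, y, not dropped))
    return rows


def final_wave_summary(rows, n_waves):
    """Return (complete-data mean, observed-case mean, missing rate) at the last wave."""
    last = [(y, obs) for _, wave, y, obs in rows if wave == n_waves - 1]
    full_mean = mean(y for y, _ in last)
    observed_mean = mean(y for y, obs in last if obs)
    missing_rate = sum(1 for _, obs in last if not obs) / len(last)
    return full_mean, observed_mean, missing_rate


rows = simulate_rcd()
full_mean, observed_mean, missing_rate = final_wave_summary(rows, n_waves=6)
print(f"complete-data mean at final wave: {full_mean:.2f}")
print(f"observed-case mean at final wave: {observed_mean:.2f}")
print(f"proportion missing at final wave: {missing_rate:.2f}")
```

Under these assumed settings, subjects with steeper slopes tend to leave earlier, so the observed-case mean at the final wave tends to fall below the complete-data mean. Varying `icc`, `n_waves`, and `base_logit` gives a rough feel for the three design factors named in the abstract.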

Acknowledgements

We would like to thank Dan Bauer and Kris Preacher for feedback on previous drafts of this manuscript. We are also grateful for the highly constructive and insightful feedback that we received from our anonymous reviewers.

Author information

Corresponding author

Correspondence to Nisha C. Gottfredson.

Ethics declarations

Funding

Research reported in this publication was supported by the National Institutes of Health through grant funding awarded to Dr. Gottfredson (K01 DA035153) and Dr. Jackson (K02 AA13938 and R01 AA016838). The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health.

Conflict of Interest

The authors declare that they have no conflict of interest.

Ethical Approval

Not applicable.

Informed Consent

Not applicable.

About this article

Cite this article

Gottfredson, N.C., Sterba, S.K. & Jackson, K.M. Explicating the Conditions Under Which Multilevel Multiple Imputation Mitigates Bias Resulting from Random Coefficient-Dependent Missing Longitudinal Data. Prev Sci 18, 12–19 (2017). https://doi.org/10.1007/s11121-016-0735-3

  • DOI: https://doi.org/10.1007/s11121-016-0735-3
