Explicating the Conditions Under Which Multilevel Multiple Imputation Mitigates Bias Resulting from Random Coefficient-Dependent Missing Longitudinal Data

Gottfredson, Nisha C.; Sterba, Sonya K.; Jackson, Kristina M.

doi:10.1007/s11121-016-0735-3

Published: 19 November 2016

Volume 18, pages 12–19, (2017)

Prevention Science

Abstract

Random coefficient-dependent (RCD) missingness is a non-ignorable mechanism through which missing data can arise in longitudinal designs. RCD missingness, which cannot be tested for, is a problematic form of missingness that occurs when subject-specific random effects correlate with the propensity for missingness or dropout. Particularly when covariate missingness is a problem, investigators typically handle missing longitudinal data with single-level multiple imputation procedures, implemented either with long-format data, which ignores within-person dependency entirely, or with wide-format (i.e., multivariate) data, which ignores some aspects of within-person dependency. When either of these standard approaches is used, RCD missingness leads to parameter bias and incorrect inference. We explain why multilevel multiple imputation (MMI) should alleviate bias induced by an RCD missing data mechanism under conditions that contribute to stronger determinacy of random coefficients. We evaluate our hypothesis with a simulation study that considers three design factors: intraclass correlation (ICC; ranging from .25 to .75), number of waves (ranging from 4 to 8), and percentage of missing data (ranging from 20 to 50%). We find that MMI greatly outperforms the single-level wide-format (multivariate) imputation method under an RCD mechanism. For the MMI analyses, bias was most alleviated when the ICC was high, when there were more waves of data, and when there was less missing data. We offer practical recommendations for handling longitudinal missing data.
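
To make the RCD mechanism described in the abstract concrete, the following is a minimal sketch in Python, not the authors' simulation design. It generates linear growth data with a random intercept and a random slope, lets the per-wave dropout probability depend only on each subject's unobserved random slope, and compares the final-wave mean among completers with the mean that would have been seen without dropout. The function name `simulate_rcd_dropout`, the logistic dropout model, and every numeric value (variances, fixed effects, the `-2.5` baseline logit, `dropout_strength`) are assumptions chosen for illustration. The sketch does not perform any imputation; it only shows why ignoring the mechanism can bias a summary of the observed data.

```python
import math
import random


def simulate_rcd_dropout(n_subjects=2000, n_waves=6, dropout_strength=1.5, seed=1):
    """Simulate linear growth with dropout that depends on each subject's random slope.

    Returns a tuple:
        (complete-data mean at the final wave,
         observed-data mean at the final wave,
         proportion of subjects missing at the final wave).
    All numeric settings are illustrative assumptions, not values from the article.
    """
    rng = random.Random(seed)
    final_wave = n_waves - 1
    all_final = []       # final-wave outcomes for everyone (unknowable in practice)
    observed_final = []  # final-wave outcomes for subjects who never dropped out

    for _ in range(n_subjects):
        b0 = rng.gauss(0.0, 1.0)  # subject-specific random intercept
        b1 = rng.gauss(0.0, 0.5)  # subject-specific random slope

        # The per-wave dropout probability depends only on the latent slope b1,
        # never on observed outcomes: this is what makes the mechanism RCD.
        p_drop = 1.0 / (1.0 + math.exp(-(-2.5 + dropout_strength * b1)))
        dropped = False
        for _wave in range(1, n_waves):  # wave 0 is always observed
            if rng.random() < p_drop:
                dropped = True
                break

        # Outcome at the final wave: fixed intercept 1.0, fixed slope 0.5, residual SD 1.0.
        y_final = (1.0 + b0) + (0.5 + b1) * final_wave + rng.gauss(0.0, 1.0)
        all_final.append(y_final)
        if not dropped:
            observed_final.append(y_final)

    full_mean = sum(all_final) / len(all_final)
    observed_mean = sum(observed_final) / len(observed_final)
    missing_rate = 1.0 - len(observed_final) / n_subjects
    return full_mean, observed_mean, missing_rate


if __name__ == "__main__":
    full_mean, observed_mean, missing_rate = simulate_rcd_dropout()
    print(f"complete-data mean at final wave: {full_mean:.2f}")
    print(f"completers-only mean at final wave: {observed_mean:.2f}")
    print(f"proportion missing at final wave: {missing_rate:.2f}")
```

Because subjects with steeper slopes are more likely to drop out under these assumed settings, the completers-only mean tends to fall below the complete-data mean. Setting `dropout_strength=0.0` makes dropout unrelated to the random slope, and the gap between the two means should shrink to sampling noise.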

Acknowledgements

We would like to thank Dan Bauer and Kris Preacher for feedback on previous drafts of this manuscript. We are also grateful for the highly constructive and insightful feedback that we received from our anonymous reviewers.

Author information

Authors and Affiliations

Department of Health Behavior, University of North Carolina at Chapel Hill, Campus Box 7440, 135 Dauer Drive, Chapel Hill, NC, 27599-7440, USA
Nisha C. Gottfredson
Vanderbilt University, Nashville, TN, USA
Sonya K. Sterba
Brown University, Providence, RI, USA
Kristina M. Jackson

Corresponding author

Correspondence to Nisha C. Gottfredson.

Ethics declarations

Funding

Research reported in this publication was supported by the National Institutes of Health through grant funding awarded to Dr. Gottfredson (K01 DA035153) and Dr. Jackson (K02 AA13938 and R01 AA016838). The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health.

Conflict of Interest

The authors declare that they have no conflict of interest.

Ethical Approval

Not applicable.

Informed Consent

Not applicable.

About this article

Cite this article

Gottfredson, N.C., Sterba, S.K. & Jackson, K.M. Explicating the Conditions Under Which Multilevel Multiple Imputation Mitigates Bias Resulting from Random Coefficient-Dependent Missing Longitudinal Data. Prev Sci 18, 12–19 (2017). https://doi.org/10.1007/s11121-016-0735-3

Issue Date: January 2017
DOI: https://doi.org/10.1007/s11121-016-0735-3
