Abstract
Random coefficient-dependent (RCD) missingness is a non-ignorable mechanism through which missing data can arise in longitudinal designs. RCD, whose presence cannot be tested, is a problematic form of missingness that occurs when subject-specific random effects correlate with the propensity for missingness or dropout. Particularly when covariate missingness is a problem, investigators typically handle missing longitudinal data with single-level multiple imputation procedures applied either to long-format data, which ignores within-person dependency entirely, or to wide-format (i.e., multivariate) data, which ignores some aspects of within-person dependency. When either of these standard approaches is used, RCD missingness leads to parameter bias and incorrect inference. We explain why multilevel multiple imputation (MMI) should alleviate bias induced by an RCD missing data mechanism under conditions that contribute to stronger determinacy of random coefficients. We evaluate this hypothesis with a simulation study that considers three design factors: intraclass correlation (ICC; ranging from .25 to .75), number of waves (ranging from 4 to 8), and percentage of missing data (ranging from 20% to 50%). We find that MMI greatly outperforms the single-level wide-format (multivariate) imputation method under an RCD mechanism. For the MMI analyses, bias was alleviated most when the ICC was high, when there were more waves of data, and when there was less missing data. We offer practical recommendations for handling longitudinal missing data.
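To make the RCD mechanism concrete, the following is a minimal sketch of how such data could be generated. It is not the simulation design used in the study. It assumes a linear growth model with independent random intercepts and slopes, defines the ICC at the first wave as the intercept variance over the intercept-plus-residual variance, and lets the per-wave dropout hazard depend on each subject's standardized random slope. The function name `simulate_rcd_dropout`, the parameters `alpha` and `gamma`, the fixed effects, and the slope variance are all hypothetical choices made for illustration.

```python
import numpy as np


def simulate_rcd_dropout(n_subjects=1000, n_waves=6, icc=0.5,
                         alpha=-2.0, gamma=1.0, seed=0):
    """Simulate growth data whose dropout depends on the random slope.

    All numeric settings are illustrative assumptions, not values
    taken from the article.
    """
    rng = np.random.default_rng(seed)
    time = np.arange(n_waves)

    # Residual variance fixed at 1; intercept variance chosen so that
    # ICC = tau00 / (tau00 + sigma2) at the first wave.
    sigma2 = 1.0
    tau00 = icc * sigma2 / (1.0 - icc)
    tau11 = 0.1 * tau00  # hypothetical random-slope variance

    b0 = rng.normal(0.0, np.sqrt(tau00), n_subjects)  # random intercepts
    b1 = rng.normal(0.0, np.sqrt(tau11), n_subjects)  # random slopes
    eps = rng.normal(0.0, np.sqrt(sigma2), (n_subjects, n_waves))

    # Complete data: hypothetical fixed intercept 1.0 and fixed slope 0.5.
    y = (1.0 + b0)[:, None] + (0.5 + b1)[:, None] * time[None, :] + eps

    # RCD mechanism: the per-wave dropout hazard is a logistic function
    # of the standardized random slope, which is never observed.
    z = b1 / np.sqrt(tau11)
    hazard = 1.0 / (1.0 + np.exp(-(alpha + gamma * z)))

    # Draw a dropout event for each wave after the first; a subject is
    # missing from the first event onward (monotone dropout).
    events = rng.random((n_subjects, n_waves - 1)) < hazard[:, None]
    first_missing = np.where(events.any(axis=1),
                             events.argmax(axis=1) + 1,
                             n_waves)
    observed = time[None, :] < first_missing[:, None]

    y_obs = np.where(observed, y, np.nan)
    return y_obs, b1, observed


if __name__ == "__main__":
    y_obs, b1, observed = simulate_rcd_dropout()
    print("proportion missing:", np.isnan(y_obs).mean())
    print("corr(random slope, waves observed):",
          np.corrcoef(b1, observed.sum(axis=1))[0, 1])
```

Because the hazard depends on a latent quantity rather than on observed scores, the realized missing-data rate is a by-product of `alpha` and `gamma`; matching a target percentage of missing data would require tuning `alpha`.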
Acknowledgements
We would like to thank Dan Bauer and Kris Preacher for feedback on previous drafts of this manuscript. We are also grateful for the highly constructive and insightful feedback that we received from our anonymous reviewers.
Ethics declarations
Funding
Research reported in this publication was supported by the National Institutes of Health through grant funding awarded to Dr. Gottfredson (K01 DA035153) and Dr. Jackson (K02 AA13938 and R01 AA016838). The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health.
Conflict of Interest
The authors declare that they have no conflict of interest.
Ethical Approval
Not applicable.
Informed Consent
Not applicable.
About this article
Cite this article
Gottfredson, N.C., Sterba, S.K. & Jackson, K.M. Explicating the Conditions Under Which Multilevel Multiple Imputation Mitigates Bias Resulting from Random Coefficient-Dependent Missing Longitudinal Data. Prev Sci 18, 12–19 (2017). https://doi.org/10.1007/s11121-016-0735-3
DOI: https://doi.org/10.1007/s11121-016-0735-3