Missing Data from a Causal Perspective

  • Conference paper
Advanced Methodologies for Bayesian Networks (AMBN 2015)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 9505)

Abstract

This paper applies graph-based causal inference procedures to the problem of recovering information from missing data. We establish conditions that permit and prohibit recoverability. Where theoretical impediments to recoverability exist, we develop graph-based procedures that use auxiliary variables and external data to overcome them. We demonstrate the perils of model-blind recovery procedures, both in determining whether a query is recoverable and in choosing an estimation procedure when recoverability holds.

Notes

  1. The presence of a non-recoverable factor in a summand does not always imply the non-recoverability of the summand. See Example 3 in [18].

References

  1. Allison, P.D.: Missing Data Series: Quantitative Applications in the Social Sciences (2002)

  2. Collins, L.M., Schafer, J.L., Kam, C.-M.: A comparison of inclusive and restrictive strategies in modern missing data procedures. Psychol. Methods 6(4), 330 (2001)

  3. Daniel, R.M., Kenward, M.G., Cousens, S.N., De Stavola, B.L.: Using causal diagrams to guide analysis in missing data problems. Stat. Methods Med. Res. 21(3), 243–256 (2012)

  4. Darwiche, A.: Modeling and Reasoning with Bayesian Networks. Cambridge University Press, New York (2009)

  5. Dempster, A.P., Laird, N.M., Rubin, D.B.: Maximum likelihood from incomplete data via the EM algorithm. J. Roy. Stat. Soc. B (Methodol.) 39(1), 1–38 (1977)

  6. Enders, C.K.: Applied Missing Data Analysis. Guilford Publications, New York (2010)

  7. Garcia, F.M.: Definition and diagnosis of problematic attrition in randomized controlled experiments. Working paper, April 2013. http://ssrn.com/abstract=2267120

  8. Graham, J.W.: Missing Data: Analysis and Design. Statistics for Social and Behavioral Sciences. Springer, New York (2012)

  9. Heitjan, D.F., Rubin, D.B.: Ignorability and coarse data. Ann. Stat. 19(4), 2244–2253 (1991)

  10. Koller, D., Friedman, N.: Probabilistic Graphical Models: Principles and Techniques. Cambridge University Press, New York (2009)

  11. Lauritzen, S.L.: The EM algorithm for graphical association models with missing data. Comput. Stat. Data Anal. 19(2), 191–201 (1995)

  12. Little, R.J.A., Rubin, D.B.: Statistical Analysis with Missing Data. Wiley, New York (2002)

  13. Marlin, B.M., Zemel, R.S.: Collaborative prediction and ranking with non-random missing data. In: Proceedings of the Third ACM Conference on Recommender Systems, pp. 5–12. ACM (2009)

  14. Marlin, B.M., Zemel, R.S., Roweis, S., Slaney, M.: Collaborative filtering and the missing at random assumption. In: UAI (2007)

  15. Marlin, B.M., Zemel, R.S., Roweis, S.T., Slaney, M.: Recommender systems: missing data and statistical model estimation. In: IJCAI (2011)

  16. Mohan, K., Pearl, J.: On the testability of models with missing data. In: Proceedings of AISTATS (2014)

  17. Mohan, K., Pearl, J., Tian, J.: Graphical models for inference with missing data. Adv. Neural Inf. Process. Syst. 26, 1277–1285 (2013)

  18. Mohan, K., Pearl, J.: Graphical models for recovering probabilistic and causal queries from missing data. In: Ghahramani, Z., Welling, M., Cortes, C., Lawrence, N.D., Weinberger, K.Q. (eds.) Advances in Neural Information Processing Systems 27, pp. 1520–1528 (2014)

  19. Pearl, J.: Causality: Models, Reasoning and Inference. Cambridge University Press, New York (2009)

  20. Pearl, J., Mohan, K.: Recoverability and testability of missing data: Introduction and summary of results. Technical report R-417, UCLA (2013). http://ftp.cs.ucla.edu/pub/stat_ser/r417.pdf

  21. Robins, J.M., Rotnitzky, A.: Recovery of information and adjustment for dependent censoring using surrogate markers. In: Jewell, N.P., Dietz, K., Farewell, V.T. (eds.) AIDS Epidemiology, pp. 297–331. Springer, New York (1992)

  22. Robins, J.M., Rotnitzky, A., Zhao, L.P.: Estimation of regression coefficients when some regressors are not always observed. J. Am. Stat. Assoc. 89(427), 846–866 (1994)

  23. Rothman, K.J., Greenland, S., Lash, T.L.: Modern Epidemiology. Lippincott Williams & Wilkins, Philadelphia (2008)

  24. Rubin, D.B.: Inference and missing data. Biometrika 63, 581–592 (1976)

  25. Shadish, W.R.: Revisiting field experimentation: field notes for the future. Psychol. Methods 7(1), 3 (2002)

  26. Shpitser, I., Mohan, K., Pearl, J.: Missing data as a causal and probabilistic problem. In: Proceedings of the Thirty-First Conference on Uncertainty in Artificial Intelligence (2015)

  27. Thoemmes, F., Mohan, K.: Graphical representation of missing data problems. Struct. Equ. Model. Multi. J. 37(1), 1–13 (2015)

  28. Thoemmes, F., Rose, N.: Selection of auxiliary variables in missing data problems: Not all auxiliary variables are created equal. Technical report R-002, Cornell University (2013)

  29. Thoemmes, F., Mohan, K.: Graphical representation of missing data problems. Struct. Equ. Model. Multi. J. 22(4), 1–13 (2015)

  30. Twisk, J., de Vente, W.: Attrition in longitudinal studies: how to deal with missing data. J. Clin. Epidemiol. 55(4), 329–337 (2002)

  31. Van der Laan, M.J., Robins, J.M.: Locally efficient estimation with current status data and time-dependent covariates. J. Am. Stat. Assoc. 93(442), 693–701 (1998)

  32. Van der Laan, M.J., Robins, J.M.: Unified Methods for Censored Longitudinal Data and Causality. Springer, New York (2003)

Author information

Corresponding author

Correspondence to Karthika Mohan.

Copyright information

© 2015 Springer International Publishing Switzerland

About this paper

Cite this paper

Mohan, K., Pearl, J. (2015). Missing Data from a Causal Perspective. In: Suzuki, J., Ueno, M. (eds) Advanced Methodologies for Bayesian Networks. AMBN 2015. Lecture Notes in Computer Science, vol 9505. Springer, Cham. https://doi.org/10.1007/978-3-319-28379-1_13

  • DOI: https://doi.org/10.1007/978-3-319-28379-1_13

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-28378-4

  • Online ISBN: 978-3-319-28379-1

  • eBook Packages: Computer Science (R0)
