Contrastive Learning-Based Imputation-Prediction Networks for In-hospital Mortality Risk Modeling Using EHRs

  • Conference paper
Machine Learning and Knowledge Discovery in Databases: Applied Data Science and Demo Track (ECML PKDD 2023)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 14174)

Abstract

Predicting the risk of in-hospital mortality from electronic health records (EHRs) has received considerable attention. Such predictions give healthcare professionals early warning of a patient’s health condition so that timely interventions can be made. This prediction task is challenging because EHR data are intrinsically irregular: they contain many missing values, and the time intervals between medical records vary. Existing approaches focus on exploiting correlations among variables in patient medical records to impute missing values and on establishing time-decay mechanisms to deal with this irregularity. This paper presents a novel contrastive learning-based imputation-prediction network for predicting in-hospital mortality risk from EHR data. Our approach introduces graph analysis-based patient stratification modeling into the imputation process to group similar patients. As a result, missing values are imputed using only information from similar patients, in addition to the patient’s own contextual information. Moreover, our approach integrates contrastive learning into the proposed network architecture to enhance patient representation learning and predictive performance on the classification task. Experiments on two real-world EHR datasets show that our approach outperforms state-of-the-art approaches on both imputation and prediction tasks.
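The idea of imputing a patient's missing values from similar patients only can be illustrated with a small NumPy sketch. This is not the network described in the paper: it swaps the learned, graph-based patient stratification for a plain nearest-neighbour rule on mean-filled features. The function name `impute_from_similar_patients`, the parameter `k`, and the toy matrix `X_demo` are all hypothetical and chosen for illustration.

```python
import numpy as np


def impute_from_similar_patients(X, k=2):
    """Fill NaNs in X (patients x features) from each patient's k nearest patients.

    Illustrative assumption: similarity is Euclidean distance on column-mean-filled
    data, a stand-in for a learned patient-similarity graph.
    """
    X = np.asarray(X, dtype=float)
    observed = ~np.isnan(X)

    # Column means over observed entries, used for the similarity computation
    # and as a fallback when no neighbour has observed the feature.
    col_means = np.nanmean(X, axis=0)
    X_filled = np.where(observed, X, col_means)

    # Pairwise distances between patients; a patient is never its own neighbour.
    dist = np.linalg.norm(X_filled[:, None, :] - X_filled[None, :, :], axis=2)
    np.fill_diagonal(dist, np.inf)

    imputed = X.copy()
    for i in range(X.shape[0]):
        neighbours = np.argsort(dist[i])[:k]  # edges of patient i in a k-NN graph
        for j in np.where(~observed[i])[0]:
            # Only neighbours that actually observed feature j can donate a value.
            donors = [n for n in neighbours if observed[n, j]]
            imputed[i, j] = X[donors, j].mean() if donors else col_means[j]
    return imputed


# Toy data: patient 0 is missing feature 2; patients 1 and 2 resemble patient 0,
# while patient 3 is very different and should not influence the imputed value.
X_demo = np.array([
    [1.0, 2.0, np.nan],
    [1.1, 2.1, 3.0],
    [0.9, 1.9, 3.2],
    [10.0, 20.0, 30.0],
])
filled = impute_from_similar_patients(X_demo, k=2)
```

In this toy case the missing entry is filled from the two similar patients rather than from the global column mean, which the dissimilar patient would otherwise dominate.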


Notes

  1. The implementation code is available at https://github.com/liulab1356/CL-ImpPreNet.

  2. https://mimic.physionet.org.

  3. https://eicu-crd.mit.edu/.

Acknowledgement

This research is partially funded by the ARC Centre of Excellence for Automated Decision-Making and Society (CE200100005) by the Australian Government through the Australian Research Council.

Author information

Corresponding author

Correspondence to Yuxi Liu.

Ethics declarations

Ethical Statement

The experimental datasets used for this work are obtained from the publicly available Medical Information Mart for Intensive Care (MIMIC-III) dataset and the eICU Collaborative Research dataset. These data were used under license. The authors declare that they have no conflicts of interest. This article does not contain any studies involving human participants performed by any of the authors.

Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Liu, Y., Zhang, Z., Qin, S., Salim, F.D., Yepes, A.J. (2023). Contrastive Learning-Based Imputation-Prediction Networks for In-hospital Mortality Risk Modeling Using EHRs. In: De Francisci Morales, G., Perlich, C., Ruchansky, N., Kourtellis, N., Baralis, E., Bonchi, F. (eds) Machine Learning and Knowledge Discovery in Databases: Applied Data Science and Demo Track. ECML PKDD 2023. Lecture Notes in Computer Science, vol 14174. Springer, Cham. https://doi.org/10.1007/978-3-031-43427-3_26

  • DOI: https://doi.org/10.1007/978-3-031-43427-3_26

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-43426-6

  • Online ISBN: 978-3-031-43427-3

  • eBook Packages: Computer Science, Computer Science (R0)
