Optimizing sepsis treatment strategies via a reinforcement learning model

Zhang, Tianyi; Qu, Yimeng; Wang, Deyong; Zhong, Ming; Cheng, Yunzhang; Zhang, Mingwei

doi:10.1007/s13534-023-00343-2

Original Article
Published: 04 January 2024

Volume 14, pages 279–289 (2024)
Biomedical Engineering Letters

Tianyi Zhang^1,2,
Yimeng Qu^4,
Deyong Wang^1,2,
Ming Zhong^3,
Yunzhang Cheng^1,2 &
Mingwei Zhang^1,2 (ORCID: orcid.org/0000-0002-4388-2838)

Abstract

Purpose

Existing sepsis treatment lacks an effective reference and relies too heavily on the experience of clinicians. We therefore used a reinforcement learning model to build a decision-support model for sepsis medication treatment.

Methods

Using the latest Sepsis-3 diagnostic criteria, 19,582 sepsis patients were screened from the Medical Information Mart for Intensive Care III (MIMIC-III) database for treatment strategy research, and forty-six features were used in modeling. The medication strategy under study is the dosage of vasopressor drugs and intravenous infusion. We propose a Dueling DDQN (dueling double deep Q-network) that predicts a patient's medication strategy (vasopressor and intravenous infusion dosage) from the relationship between the patient's state, the reward function, and the medication action. We also built safeguards against potentially high-risk actions of the Dueling DDQN, in particular sudden changes in vasopressor dose, which can have harmful clinical effects. To strengthen the guidance that clinically effective medication strategies give the model, we propose a hybrid model (Safe-Dueling DDQN + expert strategies) to optimize medication strategies.
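
As an illustration of the two ideas combined in a Dueling DDQN, the following minimal Python sketch shows the dueling aggregation Q(s, a) = V(s) + A(s, a) - mean over a' of A(s, a'), and the double Q-learning target, in which the online network selects the next action and the target network evaluates it. It is a sketch under stated assumptions, not the authors' implementation: the function names, the three-action toy example, and the discount factor of 0.99 are hypothetical, and plain lists stand in for network outputs.

```python
from typing import List


def dueling_q_values(state_value: float, advantages: List[float]) -> List[float]:
    """Combine V(s) and A(s, a) into Q(s, a), subtracting the mean advantage."""
    mean_advantage = sum(advantages) / len(advantages)
    return [state_value + adv - mean_advantage for adv in advantages]


def double_dqn_target(
    reward: float,
    gamma: float,
    done: bool,
    next_q_online: List[float],
    next_q_target: List[float],
) -> float:
    """Double DQN target: online net picks the action, target net scores it."""
    if done:
        # Terminal transition: no bootstrapped future value.
        return reward
    best_action = max(range(len(next_q_online)), key=lambda a: next_q_online[a])
    return reward + gamma * next_q_target[best_action]


# Toy example with three hypothetical actions (assumption, for illustration only).
q_online_next = dueling_q_values(1.0, [0.5, -0.5, 0.0])
q_target_next = dueling_q_values(0.8, [0.0, 0.3, -0.3])
target = double_dqn_target(
    reward=0.0,
    gamma=0.99,  # hypothetical discount factor
    done=False,
    next_q_online=q_online_next,
    next_q_target=q_target_next,
)
print(q_online_next, q_target_next, target)
```

Separating action selection from action evaluation is what distinguishes the double Q-learning target from the standard one, which takes the maximum of the target network's own estimates and tends to overestimate values.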

Results

The Dueling DDQN medication model for sepsis patients outperforms clinical strategies and other models in off-policy evaluation value and mortality, reducing mortality from 16.8% under clinical strategies to 13.8%. Compared with Dueling DDQN, the proposed Safe-Dueling DDQN takes fewer actions involving vasopressors overall and reduces large dose fluctuations. The proposed hybrid model can switch between expert strategies and Safe-Dueling DDQN strategies based on the patient's current state.
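
The abstract does not specify how the safety constraint or the switching rule is implemented. One possible reading, sketched below in Python purely for illustration, masks any action whose vasopressor dose level moves more than a fixed number of bins away from the previous step, and lets a caller-supplied flag decide whether the expert action or the constrained greedy action is used. The 5 x 5 action grid, the action encoding, the `max_step` threshold, and the `use_expert` flag are all assumptions introduced here, not details from the paper.

```python
from typing import Sequence

N_VASO_LEVELS = 5  # assumption: vasopressor dose discretized into 5 bins
N_IV_LEVELS = 5    # assumption: intravenous infusion dose discretized into 5 bins


def vaso_level(action: int) -> int:
    """Hypothetical encoding: action = iv_level * N_VASO_LEVELS + vaso_level."""
    return action % N_VASO_LEVELS


def safe_greedy_action(
    q_values: Sequence[float], prev_vaso_level: int, max_step: int = 1
) -> int:
    """Pick the highest-valued action whose vasopressor bin stays near the last one."""
    allowed = [
        a
        for a in range(len(q_values))
        if abs(vaso_level(a) - prev_vaso_level) <= max_step
    ]
    return max(allowed, key=lambda a: q_values[a])


def hybrid_action(
    q_values: Sequence[float],
    prev_vaso_level: int,
    expert_action: int,
    use_expert: bool,
) -> int:
    """Switch between an expert action and the constrained greedy action.

    `use_expert` stands in for whatever state-based rule decides the switch;
    that rule is not described in the abstract.
    """
    if use_expert:
        return expert_action
    return safe_greedy_action(q_values, prev_vaso_level)


# Toy Q-values over the hypothetical 25-action grid.
q = [0.0] * (N_VASO_LEVELS * N_IV_LEVELS)
q[4] = 5.0  # highest value, but jumps from vasopressor bin 0 to bin 4
q[6] = 3.0  # vasopressor bin 1, within one step of bin 0
print(safe_greedy_action(q, prev_vaso_level=0))
print(hybrid_action(q, prev_vaso_level=0, expert_action=10, use_expert=True))
```

In this toy case the unconstrained greedy choice would be the action with the largest dose jump, while the masked choice stays within one bin of the previous vasopressor level.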

Conclusions

The proposed reinforcement learning model for sepsis medication treatment has practical clinical value and can improve patient survival to a certain extent while keeping medication balanced and safe.

Acknowledgements

We thank all authors for their contributions to this work.

Funding

This work was supported by the Academic Leader Program of Shanghai Public Health System Construction 3-Year Action Plan (2020–2022) (Grant Number: GWV-10.2-XD32); Shanghai “Science and Technology Innovation Action Plan” Biomedical Science and Technology Support Special Project (Grant Number: 20S31905100); Shanghai Engineering Technology Research Center Support Project (Grant Number: 18DZ2250900).

Author information

Authors and Affiliations

School of Health Sciences and Engineering, University of Shanghai for Science and Technology, Shanghai, 200093, China
Tianyi Zhang, Deyong Wang, Yunzhang Cheng & Mingwei Zhang
Shanghai Interventional Medical Device Engineering Technology Research Center, Shanghai, 200093, China
Tianyi Zhang, Deyong Wang, Yunzhang Cheng & Mingwei Zhang
Department of Critical Care Medicine, Zhongshan Hospital Affiliated to Fudan University, Shanghai, 200032, China
Ming Zhong
Suzhou Medical College, Suzhou University, Suzhou, 215031, China
Yimeng Qu

Authors

Tianyi Zhang
Yimeng Qu
Deyong Wang
Ming Zhong
Yunzhang Cheng
Mingwei Zhang

Contributions

All authors contributed to the study conception and design. Material preparation, data collection and analysis were performed by TZ, DW and MZ. The first draft of the manuscript was written by TZ and all authors commented on previous versions of the manuscript. TZ, YQ and MZ completed the revisions of the manuscript. All authors read and approved the final manuscript.

Corresponding author

Correspondence to Mingwei Zhang.

Ethics declarations

Conflict of interest

The authors have no relevant financial or non-financial interests to disclose.

Ethics approval

This article does not contain any studies with human participants or animals performed by any of the authors.

Consent to participate

This article does not require informed consent.

Consent for publication

All authors consented to the publication of this paper.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Zhang, T., Qu, Y., Wang, D. et al. Optimizing sepsis treatment strategies via a reinforcement learning model. Biomed. Eng. Lett. 14, 279–289 (2024). https://doi.org/10.1007/s13534-023-00343-2

Received: 12 July 2023
Revised: 28 October 2023
Accepted: 13 November 2023
Published: 04 January 2024
Issue Date: March 2024
DOI: https://doi.org/10.1007/s13534-023-00343-2
