
Optimizing sepsis treatment strategies via a reinforcement learning model

  • Original Article
  • Biomedical Engineering Letters

Abstract

Purpose

Existing sepsis treatment lacks effective clinical references and relies heavily on the experience of individual clinicians. We therefore used a reinforcement learning model to build a decision-support model for sepsis medication treatment.

Methods

Using the latest Sepsis 3.0 diagnostic criteria, 19,582 sepsis patients were screened from the Medical Information Mart for Intensive Care III (MIMIC-III) database for treatment-strategy research, and forty-six features were used in modeling. The medication strategy under study is the dosing of vasopressors and intravenous fluids. A Dueling DDQN is proposed to predict the patient's medication strategy (vasopressor and intravenous fluid dosage) from the relationship between the patient's state, the reward function, and the medication action. We also built safeguards against possible high-risk behaviors of the Dueling DDQN, in particular sudden vasopressor dose changes, which can cause harmful clinical effects. To strengthen the guidance that clinically effective medication strategies provide to the model, we proposed a hybrid model (Safe-Dueling DDQN + expert strategies) to optimize medication strategies.
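To make the method concrete, below is a minimal PyTorch sketch of the ingredients this paragraph describes: a dueling Q-network, a double-DQN target, a safety clamp on vasopressor dose changes, and a hybrid fallback to the expert action. The 5 x 5 dose grid, network sizes, safety threshold, and Q-gap switching rule are illustrative assumptions, not the paper's published configuration.

```python
import torch
import torch.nn as nn

N_FEATURES = 46                # patient-state features, as in the paper
N_VASO, N_FLUID = 5, 5         # assumed 5 x 5 discretization of dose bins
N_ACTIONS = N_VASO * N_FLUID   # one action = a (vasopressor bin, fluid bin) pair


class DuelingDQN(nn.Module):
    """Dueling head: Q(s, a) = V(s) + A(s, a) - mean_a A(s, a)."""

    def __init__(self, n_features=N_FEATURES, n_actions=N_ACTIONS, hidden=128):
        super().__init__()
        self.trunk = nn.Sequential(
            nn.Linear(n_features, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
        )
        self.value = nn.Linear(hidden, 1)              # state-value stream V(s)
        self.advantage = nn.Linear(hidden, n_actions)  # advantage stream A(s, a)

    def forward(self, state):
        h = self.trunk(state)
        v, a = self.value(h), self.advantage(h)
        return v + a - a.mean(dim=1, keepdim=True)


def double_dqn_target(online, target, next_state, reward, done, gamma=0.99):
    """Double-DQN target: the online net picks a*, the target net scores it."""
    with torch.no_grad():
        a_star = online(next_state).argmax(dim=1, keepdim=True)
        q_next = target(next_state).gather(1, a_star).squeeze(1)
    return reward + gamma * (1.0 - done) * q_next


def safe_greedy_action(q_net, state, prev_vaso_bin, max_jump=1):
    """Greedy action restricted to vasopressor bins within `max_jump` of the
    previous bin: a stand-in for the paper's protection against sudden
    vasopressor dose changes."""
    q = q_net(state.unsqueeze(0)).squeeze(0)         # (N_ACTIONS,)
    vaso_bins = torch.arange(N_ACTIONS) // N_FLUID   # vasopressor bin per action
    unsafe = (vaso_bins - prev_vaso_bin).abs() > max_jump
    return int(q.masked_fill(unsafe, float("-inf")).argmax())


def hybrid_action(q_net, state, prev_vaso_bin, expert_action, q_gap=0.5):
    """Hybrid policy: defer to the expert (clinician) action when the RL policy
    is not clearly confident. The Q-value-gap criterion is hypothetical; the
    abstract does not state the actual switching rule."""
    q = q_net(state.unsqueeze(0)).squeeze(0)
    top2 = torch.topk(q, 2).values
    if top2[0] - top2[1] < q_gap:
        return expert_action
    return safe_greedy_action(q_net, state, prev_vaso_bin)
```

In a full pipeline, the 46-dimensional state would be assembled from the MIMIC-III features and both networks trained off-policy on logged clinician transitions, with the target network updated periodically from the online one.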

Results

The Dueling DDQN medication model for sepsis patients outperforms clinical strategies and other models in both off-policy evaluation values and mortality, reducing the mortality of clinical strategies from 16.8% to 13.8%. Compared with the Dueling DDQN, the proposed Safe-Dueling DDQN reduces the overall number of vasopressor actions and avoids large dose fluctuations. The proposed hybrid model can switch between expert strategies and Safe-Dueling DDQN strategies based on the patient's current state.
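The abstract reports off-policy evaluation (OPE) values without naming the estimator. Weighted importance sampling (WIS) is a common OPE choice for logged ICU data, so the following is a hedged sketch of that estimator, not necessarily the paper's: `pi_e` and `pi_b` are assumed probability functions for the learned and clinician (behavior) policies, with `pi_b` typically fitted to the data.

```python
import numpy as np

def wis_value(trajectories, pi_e, pi_b, gamma=0.99):
    """Weighted importance sampling (WIS) estimate of a policy's value from
    logged episodes. trajectories: list of [(state, action, reward), ...];
    pi_e(s, a) / pi_b(s, a): action probabilities under the evaluated and
    behavior (clinician) policies."""
    weights, returns = [], []
    for episode in trajectories:
        rho, g = 1.0, 0.0
        for t, (s, a, r) in enumerate(episode):
            rho *= pi_e(s, a) / max(pi_b(s, a), 1e-8)  # cumulative importance ratio
            g += (gamma ** t) * r                      # discounted return
        weights.append(rho)
        returns.append(g)
    w = np.asarray(weights)
    return float((w * np.asarray(returns)).sum() / max(w.sum(), 1e-8))
```

Comparing such an OPE value for the learned policy against the empirical clinician return is the usual basis for mortality comparisons like the reported 16.8% versus 13.8%, though the paper's exact estimator may differ.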

Conclusions

The reinforcement learning model we propose for sepsis medication treatment has practical clinical value: it can improve patient survival to a certain extent while keeping medication balanced and safe.



Acknowledgements

We thank all authors for their contributions to this work.

Funding

This work was supported by the Academic Leader Program of Shanghai Public Health System Construction 3-Year Action Plan (2020–2022) (Grant Number: GWV-10.2-XD32); Shanghai “Science and Technology Innovation Action Plan” Biomedical Science and Technology Support Special Project (Grant Number: 20S31905100); Shanghai Engineering Technology Research Center Support Project (Grant Number: 18DZ2250900).

Author information


Contributions

All authors contributed to the study conception and design. Material preparation, data collection and analysis were performed by TZ, DW and MZ. The first draft of the manuscript was written by TZ and all authors commented on previous versions of the manuscript. TZ, YQ and MZ completed the revisions of the manuscript. All authors read and approved the final manuscript.

Corresponding author

Correspondence to Mingwei Zhang.

Ethics declarations

Conflict of interest

The authors have no relevant financial or non-financial interests to disclose.

Ethics approval

This article does not contain any studies with human participants or animals performed by any of the authors.

Consent to participate

This article does not require informed consent.

Consent for publication

All authors consented to the publication of this paper.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Zhang, T., Qu, Y., Wang, D. et al. Optimizing sepsis treatment strategies via a reinforcement learning model. Biomed. Eng. Lett. 14, 279–289 (2024). https://doi.org/10.1007/s13534-023-00343-2

