A minimal set of physiomarkers in continuous high frequency data streams predict adult sepsis onset earlier

https://doi.org/10.1016/j.ijmedinf.2018.12.002

Abstract

Purpose

Sepsis is a life-threatening condition with high mortality rates and expensive treatment costs. To improve short- and long-term outcomes, it is critical to detect at-risk sepsis patients at an early stage.

Methods

A data set consisting of high-frequency physiological data from 1161 critically ill patients was analyzed. Of these, 377 patients developed sepsis and had data available for at least 3 h prior to sepsis onset. A random forest classifier was trained to discriminate between sepsis and non-sepsis patients in real time using a total of 132 features extracted from a moving time window. The model was trained on 80% of the patients and tested on the remaining 20%, for two observational periods of 3 and 6 h prior to onset.
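The 80%/20% split described above is made over patients rather than over individual time windows, so that no patient contributes data to both sets. A minimal sketch of such a patient-level split follows; the function name `split_patients`, the seed, and the integer patient IDs are assumptions for illustration, and the paper does not state how the split was implemented.

```python
import random

def split_patients(patient_ids, train_fraction=0.8, seed=0):
    """Shuffle unique patient IDs and split them into train and test groups."""
    ids = sorted(set(patient_ids))   # de-duplicate and fix the order before shuffling
    rng = random.Random(seed)        # local RNG so the split is reproducible
    rng.shuffle(ids)
    cut = int(round(train_fraction * len(ids)))
    return ids[:cut], ids[cut:]

# Hypothetical integer IDs standing in for the 1161 patients in the data set.
train_ids, test_ids = split_patients(range(1161))
print(len(train_ids), len(test_ids))
```

Splitting by patient ID before any windows are extracted keeps correlated windows from the same patient out of the test set.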

Results

The model that used continuous physiological data alone achieved a sensitivity of up to 80% and an F1 score of up to 67% one hour before sepsis onset. On average, these models were able to predict sepsis 294.19 ± 6.50 min (approximately 5 h) before onset.
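For readers less familiar with these metrics, the sketch below shows how sensitivity and F1 score are computed from confusion-matrix counts. The counts used in the example are hypothetical, chosen only so that the output lands near the reported 80% and 67%; they are not taken from the study.

```python
def sensitivity(tp, fn):
    """Fraction of true sepsis cases that the model flags (recall)."""
    return tp / (tp + fn)

def precision(tp, fp):
    """Fraction of flagged cases that are true sepsis cases."""
    return tp / (tp + fp)

def f1_score(tp, fp, fn):
    """Harmonic mean of precision and sensitivity."""
    p = precision(tp, fp)
    r = sensitivity(tp, fn)
    return 2 * p * r / (p + r)

# Hypothetical counts for illustration only (not data from the paper).
tp, fp, fn = 80, 59, 20
print(round(sensitivity(tp, fn), 2), round(f1_score(tp, fp, fn), 2))
```

Because F1 combines sensitivity with precision, an F1 score below the sensitivity indicates that precision is the lower of the two.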

Conclusions

The use of machine learning algorithms on continuous streams of physiological data can allow for early identification of at-risk patients in real-time with high accuracy.

Introduction

In the critical care environment, the availability of vast volumes of data presents a unique opportunity to generate novel insights for better care [1,2]. The analysis of substantial volumes of data is more tractable with the use of new, sophisticated, and efficient machine learning methods and strategies [1,3–5]. The management of sepsis can benefit from the use of such tools, specifically to identify at-risk patients earlier. Sepsis is a life-threatening condition that arises from a significantly dysregulated response to infection, resulting in acute single- or multi-organ failure and death [6,7]. If recognition is delayed, sepsis can rapidly progress to multiple organ dysfunction (MOD), resulting in high mortality rates [6,8]; an increase of approximately 8% in mortality rate is observed for each hour of delayed diagnosis of sepsis [9]. Predictive analytics applied to routinely collected continuous data, such as physiological data, can reduce recognition gaps, allow for targeted and early goal-directed therapy, and improve situational awareness in critical care.

Machine learning techniques have been extensively used in medical decision making and treatment planning. For instance, these algorithms have been used to identify at-risk patients, to predict patient outcomes, and to reduce alarm fatigue [10–12]. Similarly, machine learning algorithms have been successfully implemented in various medical image analyses to assist diagnosis and therapy in neurology, cardiology, and the detection of various cancers [13–20]. Although machine learning algorithms have shown promise in detecting and predicting sepsis [21,22], much of the recent work has centered on static and often manually entered electronic health record (EHR) data [23]. Recent work has shown that ‘physiomarkers,’ such as reduced heart rate variability, may precede the onset of sepsis [24–26], opening a window for early recognition and treatment. In this paper, we utilize machine learning to predict the onset of sepsis in patients who are admitted to the intensive care unit (ICU), using continuous minute-by-minute data captured at the bedside.

This paper introduces a novel method for applying a machine learning pipeline to high-frequency data streams for sepsis prediction. Specifically, we make the following key contributions to precision medicine as applied to critical care medicine:

  1. A predictive sepsis model built using a minimal set of six continuous, routinely collected bedside physiological data streams

  2. An analysis pipeline tailored for ‘online’ implementation

  3. Identification of salient physiomarkers that predict the onset of sepsis in critically ill adults using an integrated machine learning approach
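To illustrate what an ‘online’ pipeline over a minute-by-minute stream might look like, the sketch below keeps a fixed-length window of the most recent samples and recomputes summary features each time a new sample arrives. The class name `RollingWindow`, the window size, and the four features shown are assumptions for illustration; this excerpt does not list the 132 features used in the paper.

```python
from collections import deque
from statistics import mean, pstdev

class RollingWindow:
    """Keep the most recent `size` samples of one vital-sign stream."""

    def __init__(self, size):
        # A bounded deque drops the oldest sample automatically when full.
        self.samples = deque(maxlen=size)

    def update(self, value):
        """Add one new sample and return features for the current window."""
        self.samples.append(value)
        return self.features()

    def features(self):
        values = list(self.samples)
        return {
            "mean": mean(values),
            "std": pstdev(values),   # population standard deviation of the window
            "min": min(values),
            "max": max(values),
        }

# Hypothetical heart-rate samples, one per minute, with a 3-sample window.
window = RollingWindow(size=3)
for hr in [80, 82, 90, 100]:
    latest = window.update(hr)
print(latest)
```

In a deployment of this kind, one such window per data stream would feed its features to the trained classifier at every update.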

Section snippets

Data characterization

Continuous minute-by-minute physiological data was captured using the proprietary Cerner CareAware iBus® platform [27] at the Methodist LeBonheur Healthcare (MLH) System in Memphis, TN, between February and December 2017. The data was collected across Intensive Care Units (ICUs) in four adult hospitals within the MLH system. We captured heart rate (HR), diastolic blood pressure (DBP), systolic blood pressure (SBP) (via cuff if arterial not available), mean arterial pressure (MAP), temperature,

Results

Table 1 presents descriptive statistics for the sepsis and non-sepsis patient subgroups for the 3-hour observational period; all patients were between 18 and 99 years of age. A t-test between the sepsis and non-sepsis subgroups indicates statistically significant differences across all parameters, based on the p-values.
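As a reminder of what such a subgroup comparison computes, the sketch below evaluates a two-sample t statistic and its degrees of freedom. It assumes Welch's unequal-variance form, which is an assumption on our part because this excerpt does not say which t-test variant was used, and the sample values are hypothetical. Converting the statistic to a p-value requires the t-distribution, which the Python standard library does not provide.

```python
from math import sqrt
from statistics import mean, variance

def welch_t(a, b):
    """Return Welch's t statistic and approximate degrees of freedom."""
    va = variance(a) / len(a)   # squared standard error of group a
    vb = variance(b) / len(b)   # squared standard error of group b
    t = (mean(a) - mean(b)) / sqrt(va + vb)
    # Welch-Satterthwaite approximation for the degrees of freedom.
    dof = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, dof

# Hypothetical values for one parameter in two subgroups (not study data).
group_a = [1, 2, 3, 4, 5]
group_b = [2, 4, 6, 8, 10]
t_stat, dof = welch_t(group_a, group_b)
print(t_stat, dof)
```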

Balanced training and test sets, with an equal number of sepsis and non-sepsis patients, were generated to avoid favoring the more represented observations in the dataset.
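One common way to obtain such balanced sets is random undersampling of the larger class. The sketch below is an assumption about how balancing could be done, since this excerpt does not describe the procedure actually used; the function name and the patient IDs are hypothetical.

```python
import random

def balance_by_undersampling(sepsis_ids, non_sepsis_ids, seed=0):
    """Randomly draw equal numbers of patients from each class."""
    rng = random.Random(seed)                       # reproducible draw
    n = min(len(sepsis_ids), len(non_sepsis_ids))   # size of the smaller class
    return rng.sample(list(sepsis_ids), n), rng.sample(list(non_sepsis_ids), n)

# Hypothetical patient IDs: 5 sepsis patients and 12 non-sepsis patients.
sepsis = [f"S{i}" for i in range(5)]
non_sepsis = [f"N{i}" for i in range(12)]
balanced_sepsis, balanced_non_sepsis = balance_by_undersampling(sepsis, non_sepsis)
print(len(balanced_sepsis), len(balanced_non_sepsis))
```

Undersampling discards some majority-class patients, which is the price paid for keeping the classifier from favoring the more represented class.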

Discussion and conclusion

Continuous monitoring applied to the development of sepsis in hospitalized patients can reduce the gap between underlying pathological onset and clinical recognition. In this study, we follow the sepsis-2 definition, a widely used indicator of sepsis, to automatically identify the time of onset of sepsis. We further demonstrate that physiomarkers exist prior to the onset of sepsis, independent of WBC values. Furthermore, a unique advantage of our modelling approach is the minimal set of

Contributors’ statement page

Dr. Begoli provided informatics specific feedback and critically reviewed the manuscript for important intellectual content.

Dr. Davis conceptualized and designed the study and critically reviewed the manuscript for important intellectual content.

Dr. Kamaleswaran conceptualized and designed the study, developed software to collect and preprocess the data, performed data analysis, and drafted the initial manuscript.

Dr. Khojandi conceptualized and designed the study, supervised data analysis, and

Conflicts

Dr. Davis received funding from GlaxoSmithKline. The remaining authors have disclosed that they do not have any potential conflicts of interest.

Summary points

What is known:

  • Sepsis, characterized by the body’s over-reaction to infection, is among the most significant contributors to mortality in intensive care units.

  • Predictive algorithms have been shown to significantly advance the time to detection of sepsis, albeit using data that is captured aperiodically.

  • High-frequency data captured by

Acknowledgments

We would like to acknowledge the efforts of Michael Younker, Brian Williams, and Don MacMillan for their work in preparing and providing key data elements that were used in this paper. We would like to thank Dr. David Maslove, Queen’s University, Canada, for his invaluable feedback throughout the algorithm development.

References (41)

  • J.S. Upperman et al., Specific etiologies associated with the multiple organ dysfunction syndrome in children: part 1, Pediatr. Crit. Care Med. (2017)
  • I. Jawad et al., Assessing available information on the burden of sepsis: global estimates of incidence, prevalence and mortality, J. Global Health (2012)
  • S.P. Shashikumar et al., Early sepsis detection in critical care patients using multiscale blood pressure and heart rate dynamics, J. Electrocardiol. (2017)
  • L.M. Eerikäinen et al., Reduction of false arrhythmia alarms using signal selection and machine learning, Physiol. Meas. (2016)
  • V.J. Ribas et al., On the use of decision trees for ICU outcome prediction in sepsis patients treated with statins, Computational Intelligence and Data Mining (CIDM) (2011)
  • M. Khalilia et al., Predicting disease risks from highly imbalanced data using random forest, BMC Med. Inf. Decis. Making (2011)
  • D. Kollias et al., Deep neural architectures for prediction in healthcare, Complex Intell. Syst. (2017)
  • H. Wang et al., Mitosis detection in breast cancer pathology images by combining handcrafted and convolutional neural network features, J. Med. Imaging (2014)
  • A. Khemphila et al., Heart disease classification using neural network and feature selection, 21st International Conference (2011)
  • M.R. Kraft et al., Data mining in healthcare information systems: case study of a veterans’ administration spinal cord injury population, Proceedings of the 36th Annual Hawaii International Conference (2003)