Introduction

Respiration is unstable in the newborn, especially in the preterm infant. In these infants, apnea is a frequent phenomenon, which can be isolated, associated with hypoxemia or with a combination of hypoxemia and bradycardia. The current clinical definition of apnea or cardio-respiratory events in this context is not very precise and remains difficult to use in practice. An apneic spell is defined by the American academy of pediatrics as a cessation of breathing for 20 seconds or longer or a shorter pause accompanied by bradycardia (\(<100\) beats per minute), cyanosis, or pallor1. Those latter combined events may compromise brain oxygenation and tissue perfusion, especially when hypoxemia is followed by bradycardia2. The long-term impact of this intermittent hypoxemia during early postnatal life remains to be determined, but cardiorespiratory events have been evoked to potentially impact the risks of retinopathy, growth alteration, sleep-disordered breathing, and neurodevelopmental impairment3,4,5.

In Neonatal Intensive Care Units (NICU), preterm infants undergo continuous monitoring through polygraphic recordings, to detect cardiorespiratory events and to initiate quick nursing actions, such as manual stimulation, mask ventilation and sometimes intubation. In those clinical settings, nurse interventions and treatment decisions are mostly based on oxygen saturation (SaO2) and heart rate alarms6. The detection of apneas through analysis of chest impedance waveform remains unreliable with reported false alarm rates of more than 60% and lack of detection in more than 25% of apneas lasting more than 10 s7,8. The pulse oximeters also exhibit around 20% lack of detection and more than 50% false alarm rates9. Bradycardia has been associated with apneas and desaturations in 83% and 86% of the recorded events with a mean delay from the onset of apnea, which is around five seconds10. The detection of bradycardias based on simple thresholding techniques is more reliable than apnea or desaturation detection but suffers from low specificity and high detection delays from the onset of respiration cessation6,8,11,12. From these studies, it mainly appears that the exposure of nurses to non-actionable alarms evokes the phenomenon of alarm fatigue and may yield a longer treatment response time, that may detrimentally affect patient care and safety. Duration and amplitude of bradycardia depend at least in part on the time delay between the onset of the bradycardia and the intervention, which remains often long, around 30 to 60 s8,13. Recent works based on system identification approaches have been proposed for the early detection of bradycardia events from ECG signals14,15. In both of these works, data from 10 newborns have been processed off-line, in batch mode, following a phase of manual verification for the rejection of artifacts due to movement, disconnection, or erroneous peak detections. A series of papers from our team have been focused on the early detection of apnea bradycardia events, but based on original Markov model architectures16,17. These methods have been evaluated using data from 32 preterm newborns, again, using a batch, post-processing mode.

We hypothesize that a systemic view of the problem, integrating a better understanding of the nurse response to NICU alarms together with an improved, early detection of bradycardia episodes, may help decrease the intervention delay and, therefore, reduce the potential consequences of intermittent hypoxemia18. Therefore, in this paper, we describe the nurse responses to alarms with regard to the depth of the observed bradycardia and this characterization leads to a detailed annotated database acquired in real-life conditions. We then propose and analyze a new early bradycardia detector, functioning in real-time and on-line mode (as current NICU monitors do), adapted to preterm infants. The proposed detector is compared in terms of detection delay with respect to the cardiac alarms generated by a monitor commonly used in NICU. We design this new detector to be at least equivalent to the existing one in terms of sensitivity and false alarm rate, but minimizing the detection delay.

Results

The observational prospective multicenter clinical study was performed in three French NICU (Rennes, Nantes and Tours). During the study period, 8 h ECG recordings (two periods of 4 h) were obtained from 52 very preterm newborns with a mean gestational age of 27.7 (\(\pm 1.7\)) weeks and a birth-weight of 1078 (\(\pm 329\)) g. The characteristics of the population and the circumstances associated with cardio-respiratory events at inclusion are presented in Table 1. Cardio-respiratory events were considered as occurring without an associated explanatory factor for 15 preterm newborns (29%).

Table 1 Characteristics of the population studied.

Alarms in the neonatal unit

During the period of observation, 356 monitor alarms (mean = 1 alarm/70 min/patient) were generated by the monitors in the neonatal units, including 225 red alarms. Among these 356 observed monitor alarms, 198 (56%) were manually recorded on nursing sheets during the acquisition period (intra-recording phase) by the attendant nurse. Among these 198 alarms, 77 (39%), were followed by a manual stimulation on the newborn, performed by the nurse. The stimulations were preceded by a rapid hand-wash in 40 cases (53% of the manual interventions).

Characterization of Bradycardia events

As described in the “Methods” section, a post-recording phase of alarm validation and curation was performed, in order to exhaustively classify the bradycardia events into three definitions, in line with recent literature: HR lower than 100 bpm during at least 5 s1 (\(B^{100bpm}_{5s}\)), HR lower than 80 bpm during at least 10 s5 (\(B^{80bpm}_{10s}\)), and ”clinically-significant bradycardia” selected by the experts (\(B^{clin}\)).

During this curation phase, 316 \(B^{100bpm}_{5s}\) events were manually annotated. Among these events, 84 were further classified as \(B^{80bpm}_{10s}\), and 216 as \(B^{clin}\) (see first column of Table 2). The other 40 alarms corresponded to artifacts, decrease in HR during less than 5 seconds and respiratory or SpO2 alarms.

Table 2 Comparison of bradycardia detection performances. False alarm rate, sensitivity and detection delay are reported according to the gold-standard annotations. Red alarm and Yellow alarm are respectively denoted \(D^{mon}_{red}\), \(D^{mon}_{yellow}\) for the monitors currently used in neonatal units. The proposed fusion method is denoted \(D^{fusion}_{red}\), \(D^{fusion}_{yellow}\) for Red and Yellow alarms respectively.

Based on this event classification, we have studied the nurse responses during the intra-acquisition phase. Among the 198 alarms recorded on nursing sheets by the attendant nurse, we identified 70 \(B^{80bpm}_{10s}\) events. All 77 events that motivated manual stimulation were associated with a \(B^{clin}\) annotation. We observed 59 stimulations for \(B^{80bpm}_{10s}\) (71% of the total of \(B^{80bpm}_{10s}\) annotations) and 18 stimulations for less severe situations. It is to note that none of the preterm infants were orally fed during the recording periods. We did not observe any \(B^{80bpm}_{10s}\) induced by external interventions during the whole study period.

Figure 1 shows the main observed profiles, with monophasic bradycardia examples corresponding to \(B^{100bpm}_{5s}\) and \(B^{80bpm}_{10s}\) definitions in Fig. 1a, b respectively. Typical examples of triphasic (Fig. 1c) and biphasic (Fig. 1d) bradycardia profiles, observed in 34% of the events, are also depicted. The onset and offset instants of each bradycardia were manually annotated and then automatically corrected, as described in the “Methods” section, in order to minimize annotation jitter. The mean shift between manually-annotated onset of bradycardia annotations and the automatically-adjusted time-instant was − 1.94 s (i.e adjusted instants appear earlier than clinician annotations) with a standard deviation of 1.39 s. The mean shift between the offset bradycardia annotation time of the clinician and the adjusted time-instant was 0.11 s with a standard deviation of 1.32 s.

Figure 1
figure 1

Example profiles of the observed bradycardia events. On each panel, solid curves represent the observed heart rate. Horizontal dotted and dashed lines represent respectively the yellow alarm threshold (100 bpm) and the red alarm threshold (80 bpm). Panel (a) shows a bradycardia of less than 100 bpm during at least 5 s (\(B^{100bpm}_{5s}\)). Panel (b) represents a severe bradycardia of less than 80 bpm during at least 10 s (\(B^{80bpm}_{10s}\)). Panel (c,d) show triphasic and biphasic bradycardias annotated as single \(B^{80bpm}_{10s}\) events.

Automatic bradycardia detection performance

Table 2 shows the performance of the proposed bradycardia detection method. The sensitivity and false alarm rate of the proposed detector highly depend on the bradycardia definition used (i.e. \(B^{100bpm}_{5s}\), \(B^{80bpm}_{10s}\) or \(B^{clin}\)). For the primary endpoint (\(B^{80bpm}_{10s}\)), results from the application of the proposed detection fusion approach \(D^{fusion}_{red}\) showed a significant decrease of the mean detection delay of 2.9 seconds with respect to the detection performed by the monitor \(D^{mon}_{red}\). This early detection was obtained without altering sensitivity and without increasing the false alarm rate. When compared with the monitor detector, the proposed approach yielded also significantly decreased detection delays for \(B^{100bpm}_{5s}\) or \(B^{clin}\), while maintaining comparable performance levels (Table 2). Regarding the fusion step, for \(D^{fusion}_{yellow}\), 40% of the detections were generated by a simultaneaous activation of the 3 methods (fixed-threshold, relative adaptative threshold and abrupt change), 30% by the pair fixed/adaptative, 15% by the pair fixed/abrupt and 14% by the pair adaptative/abrupt. For \(D^{fusion}_{red}\), these values are 69%, 3%, 17% and 11% respectively.

Discussion

In this multicenter observational study performed in very preterm infants we described the nurse response to bedside monitor alarms and we proposed and evaluated a new real-time fusion-based method for the early detection of bradycardia events. Concerning the nurse response to bedside monitor alarms, in this study we observed that: i) A high percentage (29%) of severe bradycardia events with heart rate less than 80 bpm for more than 10 seconds did not result in nurse intervention, ii) the nurse intervention was not preceded by hand-wash before intervening on the newborn in around one half of the situations, iii) there is a high level of false alarm rate for the detection of severe cardio-respiratory events, which is known to favor alarm desensitization.

For all bradycardia definitions (\(B^{80bpm}_{10s}\), \(B^{100bpm}_{5s}\), \(B^{clin}\)) and settings (red, yellow), the proposed new fusion-based detector was able to provide an earlier event detection (between 2.7 s earlier than the monitor for \(B^{clin}\) on red alarms and 3.4 s earlier than the monitor for \(B^{100bpm}_{5s}\) on yellow alarms) without impairing the reliability of the detection, when compared to standard methods currently available on NICU monitors. Taken together, these results suggest that early bradycardia detection is possible without increasing the rate of false alarms, potentially allowing improvement in the management of cardio-respiratory events of the newborn.

Heart rate alarms are currently the most reliable way to detect the cardio-respiratory events associated with prematurity. The choice of the fusion method has the advantage to be easy to implement and embeddable into devices with low computational resources, when compared to other more sophisticated methods16,17,19,20,21. Portet et al also observed a small improvement in the time delay for detection of bradycardia using a decision tree approach20. Our group has proposed a set of hidden and coupled semi-Markovian methods that also provide earlier detection delays with increased performance levels16,17,21. However, none of these works have evaluated performance sensitivity with respect to the different definitions of bradycardia and under real-life conditions. We have shown here that this performance is particularly sensitive to the bradycardia definition used. This is an important aspect to consider, because the definition of a significant bradycardia remains conflicting with regards to its duration and deepness, due to the absence of established relationships between neonatal bradycardia deepness and duration and later neurodevelopment in very preterm infants5.

Monitoring of vital signs, and more specifically monitoring of heart rate in neonatal intensive care units, is intended to obtain actionable alarms and rapid responses to clinically-relevant changes in the newborn’s conditions. Despite this, as other authors6,8,11,12, we observed that many relevant alarms were not followed by rapid interventions by the attendant nurses. The probability to respond in time to an alarm not only depends on the duration of the bradycardia, but also on its interpretation by the nurse, which is known to be influenced by many factors. This could also be due to alarm fatigue6,7,22 or to the fact that alarms are not always directly exploited, but rather used as an additional information for decision support. It seems from the literature that the consequences of cardio-respiratory events in preterm infants could be mainly linked to incidence, duration and severity of hypoxemic episodes5. In this study we have chosen to focus on bradycardia rather than on hypoxemia because of the high false alarm rate associated with pluse oymetry9. In the current study we observed nurses intervention in only 36% of the bradycardias associated with a desaturation of at least 10% (\(B^{clin}\)) and in only 71% of the events with a bradycardia of less than 80 bpm with duration of more than 10 sec which were always associated with oxygen desaturation. It appears therefore that improving detection and prevention of bradycardia events in preterm infants will result in a decrease of hypoxemic episodes and potentially on the associated morbidity. Moreover, we also observed that hand wash is not always performed in such a context of an emergency intervention. As hand hygiene has a well-known effect on infection reduction, this could have increased the risk of transmission of infectious agents23. It seems therefore necessary to test new strategies to limit this risk.

In this study, we have chosen to focus on heart rate alarms as a first step to improve alarm setting for an early detection of cardio-respiratory events. With such an approach, we expect to early detect 85% of the cardio-respiratory events including all the severe episodes10,24 without increasing the false detection rate. We expect that such an early detection will improve the response of the newborn to the intervention. We also expect that this will allow for the development of new therapeutic strategies, such as automatic intervention systems which could also decrease the sepsis risk associated with manual intervention, as suggested in18.

One limitation of this work is the lack of combination of the different modes of monitoring (i.e. respiratory rate and SpO2), which would have theoretically been a better way to approach oxygen delivery and therefore to quantify the risk of intermittent hypoxia events. However, the benefit of such a combined approach has not been demonstrated yet in clinical practice for several reasons. Usually, in the presence of artifacts, all the signals are corrupted together25. Also, SpO2, as well as chest impedance, are known to be insufficiently reliable7,8,9. Another limitation of heart rate alarms is that it is known that bradycardia, in the absence of concurrent prolonged hypoxemia of more than one minute, may not be of prognostic importance5. However, it is known that it can be difficult to reverse the bradycardia when the event is prolonged. Therefore, we think that it is clinically relevant to early detect and intervene on most of the bradycardias, with the expectation to improve neonatal outcome through a decrease in intermittent hypoxemia events. The price to pay for this early detection, using our approach as well as monitor alarms, is that only around one third of the red alarms correspond to severe bradycardias. We think that the proposed approach could be incorporated into new strategies using multi-signal analyses and/or closed-loop, automated stimulation, in order to limit alarm fatigue for healthcare givers and minimize the consequences of repeated cardio-respiratory events for very preterm newborns.

Methods

Clinical study

This observational prospective multicenter study was designed to validate a new method for early detection of severe bradycardia and was performed in the French NICU from Tours, Nantes and Rennes university hospitals. The study was approved by the committee on protection of individuals (Comité de Protection des Personnes Ouest II, 03/05-445) and registered on clinical trials registry (2008-A00898-47, NCT00950287). Informed parental consent was obtained and all methods were carried out in accordance with relevant guidelines and regulations. The responses to the alarms in real-life conditions were observed during two 4-h periods (one following the inclusion and the second one two to seven days later). Eligible infants were preterm born before 33 weeks of gestational age with a post-menstrual age of less than 36 weeks, presenting with at least 2 significant apnea-bradycardia in a 3-hour period. The definition used for a significant apnea-bradycardia at inclusion was a decrease in heart rate of more than 33% with respect to HR baseline during at least 10 s, associated with and oxygen saturation SpO2 less than 80% for at least 4 s10. The exclusion criteria were: a postnatal age of less than four days, mechanical ventilation, intracerebral lesion (grade 3 or 4 intracranial hemorrhage, periventricular leukomalacia or ischemic lesion), and treatment with doxapram. Caffeine treatment or nasal continuous positive pressure ventilation were not exclusion criteria.

Data acquisition and intra-recording annotations

Each newborn was recorded twice over 4-hour periods with real-time annotation of the records by a specific research nurse who continuously observed the newborn without intervening in care. Annotations included: characteristics of the cardio-respiratory events, kind of intervention of the nurse in charge of the patient, hand washing before intervention and recording of the bradycardia event by the attendant nurse on the health record. The two recording periods were chosen to get data with different rates of alarms. The first period began as soon as possible after inclusion. The second period was acquired at least 24 h after the first one. The recordings were performed in neonatal units in a real-life situation, with the preterm infant remaining in usual condition with limitations of external stimulations during the study periods.

In order to obtain all raw ECG signals, as well as synchronized annotations, two parallel monitoring systems were applied to each newborn: the standard NICU monitor and a custom-made research ECG monitoring system (INTEM system). The INTEM system integrates (i) an original equipment manufacturer (OEM) CE-marked ECG acquisition board (MCC GmbH & Co. KG, Karlsruhe, Germany) able to acquire 3 ECG leads and to perform analog-to-digital conversion at a sampling rate of 300 Hz; (ii) a Bluetooth communication board for wireless transfer from the acquisition module to a standard computer; (iii) a custom electronic board to control these modules by a specific microcontroller and associated firmware and iv) a computer application with specific GUI for the acquisition, display and storage of the raw signals and to allow the research nurse to annotate the observed events during acquisition. Also, the NICU monitor alarms were recorded by the INTEM system by means of a custom-made automatic detection device connected to the monitors alarm light-emitting diodes (LED). Both systems were connected to the newborn using the same electrodes and a Y-connector.

Post-recording annotations and curation of the database

Even if the intra-recording annotations are very valuable, these annotations should be curated and further completed during a post-recording phase in order to cope with four main factors: (i) bradycardia events annotated by the research nurse or automatically recorded by the system can be associated with a false alarm, (ii) miss-detections can occur and the corresponding events are thus not annotated during the intra-recording phase, (iii) LED alarms can be of many types and should be manually curated and, especially, (iv) NICU monitors from each center may be of different models, have different firmware versions or may be configured with different parameters, leading to LED alarms that may not all correspond to the same detection criteria.

In order to cope with these limitations, we decided to develop a test-bench so as to obtain a set of homogeneous HR monitor alarm annotations (Fig. 2). The recorded INTEM ECG signals were converted from digital to analog, conditioned, scaled, and presented as input to a single Philips IntelliVue MP20 monitor (software version F.01.43) with the default settings for NICU monitoring (red alarm set at 80 bpm and yellow alarm at 100 bpm). Alarms generated by the monitor were acquired using a photoresistor sensor fixed on the alarm activity LED, as performed during the intra-recording phase (see Fig. 2). Two National Instruments modules (PCI-6220 for analog input and PCI-6733 for analog output) were linked to an I/O Connector Block to the Philips monitor (CB-68LP) and a computer was used to present all recordings at the appropriate rate.

Figure 2
figure 2

Proposed test-bench. Recorded ECG signals were converted from digital to analog, conditioned, scaled and presented as input to a Philips IntelliVue MP20 monitor. Alarms generated by the monitor were captured using a photoresistor sensor fixed on the alarm activity LED of the monitor.

Once all records have been ran through this test-bench and once all ECG signals were processed for the estimation of heart rate series, a curation and further annotation phase was initiated. Firstly, all bradycardia events were manually annotated by two expert clinicians, by analyzing the original intra-recording annotations, the acquired signals and the derived heart rate series. A given bradycardia was considered as a unique biphasic or multi-phasic event when the interval between two heart rate reductions below 100 bpm lasted less than 10 seconds (Fig. 3).

Figure 3
figure 3

Representation of a biphasic bradycardia event. The solid curve represents the heart rate time series. Horizontal dotted line represents the 100 bpm threshold (Yellow alarm). The vertical solid line represents the beginning of the bradycardia and the vertical dashed line represents the end of the event, as annotated by clinicians. During this bradycardia episode, the heart rate was higher than 100 bpm during a short time period (less than 10 s). We consider this event as one single biphasic bradycardia.

Three definitions of bradycardia events have been used:

  • HR lower than 100 bpm during at least 5 s (\(B^{100bpm}_{5s}\)): It was chosen to be in line with a recent definition of bradycardia1 and because it was close to the yellow HR alarms generated by the monitor.

  • HR lower than 80 bpm during at least 10 s (\(B^{80bpm}_{10s}\)): This globally accepted definition of severe bradycardia was chosen to be in line with a recent publication regarding the outcome after episodes of bradycardia with heart rate below 80 bpm for 10 seconds or longer5.

  • Clinically-significant bradycardia (\(B^{clin}\)): These bradycardia were chosen by the experts, taking into account the ECG waveform, SpO2 and real-time clinical annotations of apneas performed by the research nurse during the intra-recording phase. They were considered clinically significant in case of: \(B^{80bpm}_{10s}\) or \(B^{100bpm}_{5s}\), associated with apnea and a decrease in SpO2 of at least 10%. Isolated \(B^{100bpm}_{5s}\), which were not associated with a decrease in SpO2 were not considered as clinically significant. Similarly, isolated \(B^{100bpm}_{5s}\) observed in case of low basal HR or due to HR oscillations associated with periodic breathing were not considered as clinically significant.

Please note that events classified in both \(B^{80bpm}_{10s}\) and \(B^{clin}\) are subsets of the events classified as \(B^{100bpm}_{5s}\). For each bradycardia annotation, an automated optimization process initially proposed in26 was applied in order to obtain precise and reproducible measurements of the instant of bradycardia onset and duration. This optimization method is based on the adjustment of a sigmoid function: the adjusted event corresponds to the first point where the derivative of the interpolated sigmoid function is greater than 1 (see Fig. 4).

Figure 4
figure 4

Automatic annotation onset adjustment process. RR time series is represented by the black line. The sigmoid function interpolating the observed RR series is represented by the dotted blue line and its first derivative by the dashed blue line. The vertical red lines represent the onset of the bradycardia proposed by the clinician (dashed line) and the corrected onset (solid line), using the first point where the derivative of the sigmoid adjustment function is greater than 1. In this example, the corrected annotation is located 4.16 s before the one performed by the clinician.

In a second step, each annotated bradycardia was compared with the recorded LED alarms from the monitors. If the monitor alarm was caused by another pathophysiological phenomenon, such as tachycardia or by a false positive, the alarm was excluded from the analysis.

Signal processing

ECG processing

Prior to the detection of bradycardia events, the RR series, representing the time intervals between consecutive heart beats (QRS complex) were built using a multi-feature probabilistic real-time detector27. Briefly, this detector first derives relevant features (slope, amplitude and correlation on a window length of 50 ms) from the analysis of a single-lead ECG signal and from the estimates of the feature probability distributions. Bayesian probabilities were then computed for each feature before being merged through an adaptive centralized decision node using the Kullback-Leibler divergence measure. Probability distributions were then updated according to the final detection. Let t(k) be the time instant of the \(k-th\) detected beat, then the RR series was classically built by computing the difference between two successive QRS complexes \(RR(k) = t(k + 1) - t(k)\).

Bradycardia detection algorithm

The detection fusion approach used in this paper was initially proposed by our team, in order to perform an early detection of bradycardias without increasing the rate of false alarms28. It is based on an optimal decentralized fusion of three detectors: a standard fixed-threshold, an adaptive-threshold and an abrupt change detection method. These three methods are based on an analysis of beat-to-beat intervals (RR intervals) taking into account not only the instantaneous values of the RR series, but also their temporal evolution.

The fixed-threshold method compares each beat-to-beat RR interval with a fixed threshold \(U_0\) (set at 600 ms when \(B^{100bpm}_{5s}\) bradycardia are targeted and at 750 ms for \(B^{80bpm}_{10s}\)). If successive RR intervals were higher than \(U_0\) during more than 4 s, and if during this 4 s period, 2 RR intervals were higher than a second threshold \(U_1\) (set at 640 ms for \(B^{100bpm}_{5s}\) bradycardia and at 800 ms for \(B^{80bpm}_{10s}\)), a bradycardia was detected. Thresholds \(U_0\), \(U_1\) and period durations were set in order to minimize detection delay while maximizing detection performance. In summary, a bradycardia detection occurs when the following conditions are satisfied:

$$\begin{aligned} RR(k)>U_0\text { during }\Delta _{t}>4~s\text { with }2~RR(k)\text { occurrences }>U_1. \end{aligned}$$

The relative adaptive threshold method considers the fact that the basal heart rate of a patient can change through time. Thus, the threshold \(U_0\) becomes adaptive and is denoted \(U'_0(k) = 1.33 \cdot RR_{mean}(k)\), where \(RR_{mean}(k)\) is the mean RR value calculated over all RR values preceding beat k during the last 20 s. As for the precedent method, a second threshold \(U'_1\) (set at 640 ms when \(B^{100bpm}_{5s}\) bradycardia are targeted and at 800 ms for \(B^{80bpm}_{10s}\)) was used. Detection of a bradycardia event occurs when the following conditions are satisfied:

$$\begin{aligned} RR(k)>U_0^{'}(k)\text { during }\Delta _{t}>4~s\text { with 2 }RR(k)\text { occurrences }>U_1^{'}. \end{aligned}$$

The abrupt change detection method was based on the Cumulative Sum algorithm28,29. It models the bradycardia event as a statistically significant change in the mean of the RR series. In this approach, the RR interval is considered as a constant piecewise function perturbed by a noise of known variance. Two composite hypotheses are tested at each k:

$$\begin{aligned} H_0 :RR(k)&= \overline{RR_0}(k)+e(k) \text { for 0 }\le k \le n \end{aligned}$$
(1)
$$\begin{aligned} H_1 :RR(k)&= \overline{RR_0}(k)+e(k) \text { for 0 }\le k \le r-1 \\ RR(k)&=\overline{RR_0}(k)+\nu +e(k) \text { for r } \le k \le n \nonumber \end{aligned}$$
(2)

where \(\overline{RR_0}(k)\) is the mean value of RR(k) before the bradycardia event and \(\nu\) represents the value of the significant mean change i.e. the cardiac cycle length increase. In order to detect these events, the \(H_1\) hypothesis must be accepted in front of hypothesis \(H_0\). Classically, the decision between the two hypothesis is evaluated through the likelihood ratio between the composite hypotheses. As the time instant of the jump r and the magnitude of the jump \(\nu\) are unknown, a generalized likelihood test is performed which leads to the famous Page Hinkley algorithm29 which consists in:

  1. 1.

    computing the cumulative sum \(\Lambda _n(t)= \sum _{k=t}^{n}\left( RR(k) -\overline{RR_0}(k) - \frac{\nu }{2} \right)\);

  2. 2.

    recording the current minimum value \(m_n(t)=\underset{1 \le k \le n}{min} \Lambda _n(t)\);

  3. 3.

    setting an alarm at a time instant t when : \(\Lambda _n(t) - m_n(t) \ge \lambda\).

In this study, the window length of \(\overline{RR_0}\) was set to 290 s, the threshold \(\lambda =717\) and \(\nu =415\).

The final step consists in applying a fusion rule to the output of the above-mentioned detectors, in order to obtain the final decision. In this work, we adopted a voting scheme, similar to the ones used in recent ensemble machine learning methods30,31. Let n \(=3\), be the number of local detectors and \(d_i\) (\(i \in {1,..,n}\)) the individual decision obtained from each detector (\(d_i=1\) if a detection occurs, \(d_i=0\) otherwise), a bradycardia detection occurs when:

$$\begin{aligned} \sum _{i=1}^{n} d_i > \frac{n}{2} \end{aligned}$$
(3)

The bradycardia detector described above is characterized by a number of parameters that have to be correctly defined (window sizes, thresholds,...). In order to optimize these parameters, we used data from a previous different study by Beuchée et al, which studied the changes in HR variability associated with neonatal late onset sepsis (local ethics committee 03/05-445)32. Briefly, this database was composed of 148 1-hour ECG recordings acquired from 32 very preterm infants, included after parental consent. In this database, 116 bradycardia events with HR less than 100 bpm during more than 5 seconds were identified and validated by two expert neonatologists. The parameter optimization was performed using an evolutionary algorithm, as described in33. Once this optimization process was concluded, all the parameters were fixed, in order to apply the method prospectively on the data acquired for this study.

Performance evaluation and statistical analysis

Bradycardia detection performance was evaluated in terms of true positive (TP), false negative (FN), false positive (FP), sensitivity (TP/(TP+FN)), false alarm rate (FP/(FP+TP)) and detection delay. The following rules were applied to define the status of each detection:

  • Detections located 5 s before or within the 30 s following the annotation of the onset of a bradycardia event were considered as TP, all other cases were considered as FP;

  • If two detections were included in the window [-5:+30 s], corresponding to the same annotation, only the first one was considered as TP and used to calculate the delay (the second one was ignored);

  • If two annotations corresponded to the same detection, only one TP was considered, the second annotation was ignored.

The detection delay was computed as the time elapsed between the adjusted bradycardia onset annotation time and the detection instant. Delays induced by the dedicated electronics and real-time QRS detection (1512 ms) were taken into account for bradycardia detection algorithm.

Three different gold-standard annotations, corresponding to the three bradycardia definitions given above (\(B^{80bpm}_{10s}\), \(B^{100bpm}_{10s}\), \(B^{clin}\)), were used in this study and compared with the output of the proposed detector, as well as the output of the monitor alarms. In order to compare the performance of the proposed detector versus the monitor alarms, the following notation was used:

  • Annotations \(B^{80bpm}_{10s}\) were compared to the output of red alarms of the monitors, denoted \(D^{mon}_{red}\), and with the output of proposed detector, denoted here \(D^{fusion}_{red}\), with parameters optimized for \(B^{80bpm}_{10s}\) events.

  • Annotations \(B^{100bpm}_{5s}\) were compared to the yellow alarm output of the monitors denoted here \(D^{mon}_{yelllow}\) and with the proposed detector, denoted here \(D^{fusion}_{yellow}\), with parameters optimized for \(B^{100bpm}_{5s}\) events.

  • \(B^{clin}\) annotations are not related to a particular alarm category (red or yellow). Therefore, they were compared to both \(D^{type }_{red}\) and \(D^{type}_{yellow}\) where type stands for the monitor alarms (mon) or the proposed fusion method (fusion).

Global population results are reported as mean ± standard deviation. Statistical analysis was performed using the Wilcoxon signed-rank test due to the non-normality distribution of the data. For all tests, a significance level of 0.05 was used. No exclusion of outliers was performed prior statistical analysis. The sensitivity and false alarm rate standard deviation values were not reported because they were calulated on the whole database.