Introduction

The assessment of pain in ventilated infants is a difficult issue in paediatric intensive care because most patients are unable to communicate their pain.

On the other hand, intensive care therapy is associated with repeated painful procedures and numerous environmental stressors. Patients on a neonatal intensive care unit underwent 14 ± 4 painful procedures daily, mainly nasal, endotracheal and pharyngeal suctioning [19]. Pain and distress in the neonatal period are linked with acute and chronic physiologic alterations [9], an increased morbidity and mortality [2, 3], and an altered pain behaviour to immunizations or surgery in later childhood [18, 22].

Mechanical ventilation is a significant stressor and leads to hormonal catecholamine and cortisol stress responses, as well as ventilator asynchrony in neonates, both of which can be reduced by opioid infusion [13, 20]. In term newborns, infants and older children, sedation and, often, analgesia are necessary to achieve acceptance of mechanical ventilation and associated medical procedures [4].

An external assessment of the degree of pain by means of a reliable, valid and practicable pain scoring system is considered important to provide an adaptive and individualised pain therapy. In contrast to the numerous published scoring systems for acute procedural and postoperative pain [11, 14], only a few scoring systems exist that aim at assessing the level of pain in ventilated newborns and infants [1, 12, 16]. In 1991, a score to assess pain and sedation in mechanically ventilated newborns, infants and older children was developed (Hartwig score, Fig. 1, [15]), but validation of this score has remained incomplete up to the present. We saw the necessity for further validation of the Hartwig score because it has now become part of the routine care of ventilated infants in German NICUs and PICUs and, to our knowledge, is well accepted by the medical staff. The Hartwig score has the unique advantage that it can assess the infant's pain response to endotracheal aspiration and the patient's acceptance of the respirator. Acceptance of the respirator and of the painful procedures related to ventilation and intensive care directly correlates with the patient's need for analgesic and sedative medication. The objectives of this study were (1) to evaluate the Hartwig score with respect to inter-rater reliability, internal consistency and concurrent validity, the latter by correlating it with a previously validated scale and a visual analogue scale (VAS; a horizontal line from 0 to 100 mm with 0 mm indicating no pain and 100 mm the worst possible pain), and (2) to define a cut-off value indicating the need for analgesic therapy.

Fig. 1 Hartwig score for ventilated newborns and infants

Methods

The study was approved by the Ethical Committee of the Medical Faculty of the University of Cologne. Ventilated infants from term newborns to infants of 12 months of age were eligible for the study. Exclusion criteria were conditions impeding the assessment of pain (coma, hypoxic-ischemic encephalopathy, brain injury, neuromuscular diseases and muscle relaxants). After written informed consent had been obtained from the child's guardians, a video recording of the ventilated infant, showing the whole body and facial features, was made at baseline, 3 min before, during and after the routine procedure of endotracheal suctioning. Heart rate and blood pressure were documented every minute, and the values were displayed in the video. If a child was recorded twice, it was ensured that there was an interval of more than 24 h between the two recordings. The infants received analgesia and/or sedation according to the unit's pain treatment protocol (continuous infusion of fentanyl and midazolam, or single dose of fentanyl or piritramide).

The scoring instruments were as follows: (1) The VAS is a horizontal line from 0 to 100 mm with 0 mm indicating no pain and 100 mm the worst possible pain. Since there is no “gold standard” for objective pain assessment in ventilated newborns and infants, the VAS was chosen to obtain an expert assessment of pain. (2) The original Comfort scale comprises six behavioural items (alertness, calmness-agitation, respiratory response, physical movement, muscle tone and facial tension) and two physiologic items (mean arterial pressure and heart rate), with a scale range of 8–40 points. The Comfort scale was originally developed to assess distress in the paediatric intensive care environment in newborns and older children on mechanical ventilation or CPAP [1]. Follow-up studies demonstrated the Comfort-B scale, which excludes arterial blood pressure and heart rate, to be a reliable scoring system to assess over- and under-sedation [17], since these physiological items did not correlate well with the behavioural items of the scale. In assessing procedural pain, the Comfort scale was able to distinguish between the infant's state at rest and its state in pain [13]; further testing revealed initial reliability and validity of the Comfort scale in assessing postoperative pain in 0- to 3-year-old children [23]. (3) The Hartwig score comprises five behavioural items: motor response, grimacing, eyes, the patient's acceptance of the mechanical ventilation and the reaction to endotracheal suctioning. It has a score range of 7–25 points. Endotracheal suctioning is one of the most frequent painful measures performed at regular intervals in ventilated patients and, as such, can be used to test the quality of analgesia. Initial validation of the score was done in a study evaluating the efficacy of intravenously administered fentanyl and midazolam in 24 ventilated newborn and older infants for analgesia and sedation [15].

The anonymised video recordings were analysed by four independent observers (two neonatologists and two experienced nurses not involved in the therapy). The observers first had to give their opinion on whether the analgesia was sufficient or whether the patient was in need of analgesic therapy. They then rated the pain on the VAS and completed the Hartwig score and the Comfort scale. All observers were familiar with both the Hartwig score and the Comfort scale because both scales are part of the unit's pain and sedation protocol.

Statistical analysis of the data was performed with the Statistical Package for the Social Sciences (SPSS 12.0; SPSS Inc; Chicago, IL, USA). Descriptive statistics were used to analyse demographic data. Differences in the key vital parameters before and during endotracheal aspiration were tested for significance with Student's t test. We performed statistical analysis of all 54 observations, including repeated videotapes of the same patient, as well as an analysis of only the first videotape of each of the 28 patients, excluding repeated recordings, to prevent statistical dependence between assessments of the same patient. Internal consistency was assessed by calculating Cronbach's alpha coefficient, and inter-rater reliability was measured by the intraclass correlation coefficient (ICC). Concurrent validity was assessed by comparing the Hartwig score and the Comfort scale, as well as the Hartwig score and the VAS, according to the method of Bland and Altman [5]. Mean results of two observers (neonatologist and nurse) for the Comfort scale and VAS were compared with the mean results of the other two observers (neonatologist and nurse) for the Hartwig score.
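The Bland and Altman comparison named above can be illustrated with a short Python sketch. It is not the SPSS procedure used in the study: the function name `bland_altman`, the normal-approximation factor 1.96 and the example values are assumptions made for illustration, and the sketch presumes that both instruments have already been expressed on a common numerical scale.

```python
from math import sqrt
from statistics import mean, stdev


def bland_altman(scores_a, scores_b):
    """Return (bias, 95% CI of the bias, 95% limits of agreement).

    scores_a and scores_b are paired ratings of the same recordings,
    assumed here to be on a common scale (an assumption of this sketch).
    """
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    bias = mean(diffs)                  # mean of the differences
    sd = stdev(diffs)                   # sample SD of the differences
    se = sd / sqrt(len(diffs))          # standard error of the mean difference
    ci = (bias - 1.96 * se, bias + 1.96 * se)
    limits = (bias - 1.96 * sd, bias + 1.96 * sd)
    return bias, ci, limits


# Hypothetical paired ratings, not study data.
hartwig_like = [10, 12, 14, 16]
comparison = [9, 12, 13, 15]
bias, ci, limits = bland_altman(hartwig_like, comparison)
print(bias, ci, limits)
```

A narrow confidence interval around the bias describes how precisely the mean difference is estimated, whereas the limits of agreement describe how far individual paired ratings may be expected to differ.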

The cut-off value of the Hartwig score to discriminate the need for analgesic treatment from sufficient analgesia was determined by receiver operating characteristic (ROC) analysis and the area under the curve.
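As a rough illustration of how a cut-off can be derived from an ROC analysis, the following Python sketch scans every observed score as a candidate threshold and computes the area under the curve from pairwise comparisons. The selection rule (maximising Youden's index, sensitivity + specificity − 1) is an assumption of this sketch, since the criterion used in the study is not stated; the function name `roc_cutoff` and the example data are likewise hypothetical.

```python
def roc_cutoff(scores, needs_analgesia):
    """Return (threshold, sensitivity, specificity, auc).

    A recording is classified as 'needs analgesia' when score >= threshold.
    The threshold is chosen by maximising Youden's index (an assumption).
    """
    pos = [s for s, y in zip(scores, needs_analgesia) if y]
    neg = [s for s, y in zip(scores, needs_analgesia) if not y]

    best = None
    for t in sorted(set(scores)):
        sens = sum(s >= t for s in pos) / len(pos)
        spec = sum(s < t for s in neg) / len(neg)
        youden = sens + spec - 1
        if best is None or youden > best[0]:
            best = (youden, t, sens, spec)

    # AUC as the probability that a positive case scores higher than a
    # negative case, counting ties as one half.
    auc = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg) / (
        len(pos) * len(neg)
    )
    _, threshold, sens, spec = best
    return threshold, sens, spec, auc


# Hypothetical scores and observer judgements, not study data.
scores = [8, 9, 10, 11, 12, 13, 14, 15, 16, 11]
judged_insufficient = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]
print(roc_cutoff(scores, judged_insufficient))
```

Statistical packages often report candidate thresholds as midpoints between adjacent observed scores, so a non-integer cut-off on an integer-valued score is consistent with this kind of analysis.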

Results

Twenty-eight patients were included in this study (Table 1), and these were videotaped 54 times during different situations of endotracheal suctioning. Eighteen were neonates and ten were infants from 2 to 10 months of age. Fifty-seven percent were male and 43% were female. Sixty percent of the patients were ventilated after cardiac surgery, and the remainder were ventilated for respiratory failure (meconium aspiration syndrome, pneumonia). Forty-four percent of the patients had continuous analgesia and sedation with fentanyl and midazolam, 26% had a single dose of fentanyl without continuous infusion and 30% of the children had neither a continuous infusion nor a single dose of analgesic or sedative medication before endotracheal suctioning.

Table 1 Demographic patient data

Statistical analysis was done on all 54 video recordings and separately on the first video recording of each of the 28 patients. The results of the analysis of only the first videotape are given in round brackets.

Score

The mean Hartwig score across all 54 (28) situations was 11.8 ± 3.7 (10.9 ± 3.2), the mean Comfort scale score 16.7 ± 5.9 (15.7 ± 5.6) and the mean VAS score 35 ± 25 mm (30 ± 24 mm). Hartwig scores of 12 or less were assigned to 61% of the situations, and scores of 13 points or more to 39% of the situations. According to the published data, a mean Comfort scale score of 16.7 and a VAS score of 35 mm are indicative of the absence of pain requiring treatment.

Internal consistency

Internal consistency of the compound Hartwig score with all five items resulted in a Cronbach's alpha of 0.867 (0.853). The deletion of the weakest item “eyes” improved the internal consistency of the Hartwig score to Cronbach's alpha of 0.895 (0.884). In general, values exceeding 0.7 indicate a presumable reliability of a psychometric tool.
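For readers unfamiliar with the statistic, Cronbach's alpha and the "alpha if item deleted" check can be sketched in Python as follows. This is a minimal illustration using the usual variance-based formula, alpha = k/(k − 1) · (1 − sum of item variances / variance of the total score); the function names and the small data set are hypothetical and do not reproduce the study data.

```python
from statistics import variance


def cronbach_alpha(rows):
    """rows: one list of item scores per observation (same item order)."""
    k = len(rows[0])
    item_vars = [variance([row[i] for row in rows]) for i in range(k)]
    total_var = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


def alpha_if_item_deleted(rows, item_index):
    """Recompute alpha after dropping one item from every observation."""
    reduced = [[v for i, v in enumerate(row) if i != item_index] for row in rows]
    return cronbach_alpha(reduced)


# Hypothetical three-item example: the third item does not track the others.
rows = [[1, 1, 3], [2, 2, 1], [3, 3, 2]]
print(cronbach_alpha(rows))
print(alpha_if_item_deleted(rows, 2))
```

An item whose removal raises alpha contributes little to the common construct, which is the pattern reported here for the item "eyes".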

Inter-rater reliability

The inter-rater reliability was significant with an intraclass correlation coefficient of 0.934 (0.853).
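The intraclass correlation coefficient exists in several variants, and the variant used in the study is not specified. Purely as an illustration, the sketch below computes the two-way random-effects, absolute-agreement, single-rater form, often written ICC(2,1), from the usual ANOVA mean squares. The choice of this variant, the function name `icc_2_1` and the example ratings are assumptions.

```python
def icc_2_1(ratings):
    """ratings: one row per recording, one column per observer."""
    n = len(ratings)       # number of recordings
    k = len(ratings[0])    # number of observers
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    # Sums of squares of the two-way layout without replication.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))

    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )


# Hypothetical ratings of three recordings by two observers, not study data.
print(icc_2_1([[8, 9], [12, 12], [16, 15]]))
```

Because this variant penalises systematic offsets between observers as well as random disagreement, it is a conservative choice when absolute score values drive treatment decisions.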

Concurrent validity

Comparing the Hartwig score with the VAS, the Bland–Altman analysis (Fig. 2) gave a mean of the differences of 0.77 (0.77) with a 95% CI of 0.508 to 1.03 (0.411 to 1.129) and levels of agreement from −1.76 to 3.23 (−1.74 to 3.0). The difference in agreement is greater with increasing pain scores. Comparing the Hartwig score with the Comfort scale (Fig. 3) showed a mean of the differences of 0.15 (0.34) with a 95% CI of 0.013 to 0.287 (0.104 to 0.576) and levels of agreement from −1.49 to 1.77 (−1.13 to 1.82), with the difference in agreement again being greater with increasing pain scores.

Fig. 2 Bland–Altman plot of the agreement of the mean Hartwig score of observers 1 + 2 and mean VAS of observers 3 + 4. Mean difference of the scores is 0.77, levels of agreement 3.23 to −1.76

Fig. 3 Bland–Altman plot of the agreement of the mean Hartwig score of observers 1 + 2 and mean Comfort scale of observers 3 + 4. Mean difference of the scores is 0.15 and levels of agreement are 1.77 to −1.49

Cut-off value

The ROC analysis (Fig. 4) of the Hartwig scores against the raters' assignment of sufficient or insufficient analgesia resulted in a cut-off value of 12.1 points with a specificity of 100% and a sensitivity of 89% (AUC 0.995).

Fig. 4 Receiver operating characteristic. Hartwig score sum in respect of correctly identified sufficient analgesia versus falsely identified sufficient analgesia according to the observers' subjective judgement

Physiological parameters

In situations of endotracheal suctioning with a Hartwig score of 13 or more points, indicating pain, heart rate and systolic and diastolic blood pressure showed a statistically significant but clinically insignificant increase from baseline during the procedure (p < 0.05). Heart rate increased from 128.7 ± 16.6 to 136.2 ± 17.2 beats/min (+7.5 beats/min), systolic blood pressure from 76.3 ± 13.7 to 80.8 ± 17.5 mmHg (+4.5 mmHg) and diastolic blood pressure from 46.7 ± 8.9 to 48.4 ± 11 mmHg (+1.7 mmHg). In situations with a mean Hartwig score < 13, the increase was 2% for heart rate, 3% for systolic blood pressure and 2.2% for diastolic blood pressure.
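For orientation, the absolute and relative changes implied by the reported group means for situations with a Hartwig score of 13 or more can be recomputed with a few lines of Python. The helper name `change` is hypothetical, and the calculation uses only the rounded means quoted above, so it cannot reproduce any per-patient analysis.

```python
def change(before, after):
    """Return (absolute change, relative change in percent), rounded to 0.1."""
    absolute = after - before
    relative_pct = 100 * absolute / before
    return round(absolute, 1), round(relative_pct, 1)


# Group means as reported in the text (baseline, during suctioning).
reported_means = {
    "heart rate (beats/min)": (128.7, 136.2),
    "systolic blood pressure (mmHg)": (76.3, 80.8),
    "diastolic blood pressure (mmHg)": (46.7, 48.4),
}

for name, (before, after) in reported_means.items():
    absolute, relative_pct = change(before, after)
    print(f"{name}: +{absolute} absolute, +{relative_pct}% relative")
```

On these means, all three relative increases remain below 10%.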

Discussion

The results of this study support the validity and reliability of the Hartwig score in assessing pain in ventilated newborns and infants.

We were able to demonstrate an excellent agreement of the Hartwig score and the Comfort scale with a mean difference of 0.15 and a confidence interval of 0.013–0.287, thereby suggesting that these two observational scales reliably measure the same phenomenon. The narrow confidence interval is indicative of a small risk of measurement error. Since these assessments were done during endotracheal aspiration, which is a painful stimulus, it can be assumed that the behavioural reaction is related to pain. The excellent agreement found between the Hartwig score and the Comfort scale may be partly due to the fact that the observers of the videotapes were highly sensitive to the problem of pain and sedation in neonatology and intensive care and were accustomed to applying the Hartwig score and the Comfort scale. However, a trial by Carvalho et al. assessing sedation in 18 ventilated patients aged 16 days to 5 years [6] also showed a good correlation of the Hartwig score and the Comfort scale, with the Hartwig score being easier to use.

The mean difference in agreement of 0.77 between the Hartwig score and the VAS, as seen in the Bland–Altman plot (Fig. 2), slightly exceeded the difference in agreement between the Hartwig score and the Comfort scale, indicating that numerical pain assessment using the VAS resulted in slightly higher pain scores than the observational pain scales. However, it should be borne in mind that in this experimental setting, pain can easily be overestimated by the observers, which might explain the higher pain scores of the VAS.

Inter-rater reliability between the four observers with an intraclass correlation coefficient of 0.934 was excellent, suggestive of the clear definition and reproducibility of the single items and of the observers' adequate training.

The good internal consistency of the score was reflected by the Cronbach's alpha coefficient of 0.867. Deletion of the item “eyes”, which had the lowest internal consistency (Cronbach's alpha 0.37–0.5), led to a slight improvement of the score's internal consistency, indicating that this item only weakly reflects pain expression.

Because behavioural and physiological parameters both lack specificity concerning pain, these parameters are often combined in two-dimensional pain measures to increase validity and reliability [1, 7, 10, 21]. Observations in infants with postoperative pain showed that the correlation of physiological parameters and behavioural pain expressions was only weakly to moderately significant with large interindividual differences [23], that heart rate, blood pressure and oxygen saturation did not add to the information already provided by the assessment of the behavioural parameters [7] and that blood pressure and heart rate were unreliable and had no discriminant power to detect an analgesic demand during the postoperative period [8]. Physiological parameters, such as heart rate, blood pressure or oxygen saturation, especially in the setting of intensive medical care, are highly influenced by numerous intrinsic and extrinsic factors, e.g. volume depletion, shock, opioids, catecholamines, respiratory morbidities and others. These findings are in line with the results of this study, since we found a slight but not clinically significant increase in blood pressure and heart rate from before to during the procedure of endotracheal aspiration in all patients. The mean rise in heart rate and blood pressure did not exceed 10%, both in situations judged as painful and in those judged as not painful. Subtle changes of these parameters of less than 10% can be regarded as lying within the range of the normal baseline. In all, the results confirmed our concept of a unidimensional pain score.

Analysis of the ROC matching the Hartwig scores with the observers' individual evaluation of sufficient or insufficient analgesia revealed that analgesic therapy in the context of endotracheal suctioning was rated to be necessary at a score value of 13 or more.

Limitations of the study

One limitation is the small number of patients. The analysis of repeated videotapes of the same child at different points of time did not influence the results. There will be further testing under bedside conditions with a greater number of patients and participating nurses.

The majority of patients were infants following cardiac surgery. Patients and analgesic treatment were inhomogeneous, and the study conditions were not standardized but determined by the patients' individual needs. However, our results clearly show a correlation of the Hartwig score and the Comfort scale and a correlation of the score with the observers' individual assessment of the presence of pain requiring treatment.

Conclusion

We demonstrated good inter-rater reliability and internal consistency of the Hartwig score, as well as good to excellent agreement with the VAS and the Comfort scale, respectively. The Hartwig score can be reliably used to assess pain due to endotracheal suctioning and ventilator-induced pain in ventilated newborns and infants. It is easier to use than the original Comfort scale. Further testing will be done with a greater number of patients and nurses under clinical conditions.