Sepsis is associated with poor long-term survival and functional recovery [1, 2, 3], including cognitive impairment, reduced activities of daily living, worsening cardiovascular disease, more hospital readmissions, and excess mortality [4, 5]. However, few studies have compared outcomes of patients with sepsis with those of non-septic patients who are similar with regard to admission characteristics and severity of illness. In a recent article in this journal, Thompson and colleagues report a propensity score matched analysis of patients with and without sepsis enrolled in a large, multicentre randomised clinical trial (CHEST) [6]. The original trial compared crystalloid fluid resuscitation with hydroxyethyl starch in ICU patients. Using propensity score matching, the authors aimed to explore whether ICU and hospital length of stay, health-related quality of life (HRQoL), long-term survival, and ICU and hospital costs were different between adult critically ill patients with and without sepsis [7]. Matching yielded an impressive 1600 patients for analysis.

The authors found that ICU patients had similarly impaired HRQoL at 6 months regardless of whether they had sepsis, and similar numbers survived 2 years after their illness. Although patients with sepsis used more resources during the index hospital and ICU stay, which translated into higher total hospital costs over 2 years, hospital survivors in the two groups subsequently used similar hospital resources. This suggests that the excess cost in the sepsis cohort was driven by the index ICU and hospital admission rather than by resource use during post-illness survival.

The study used a high-quality database of data fields collected prospectively within a trial. The follow-up rate for HRQoL (≥ 95% of survivors at 6 months) is exceptional and, given the similar mortality rates in the two cohorts, represents an unbiased evaluation of HRQoL in the matched patients. The use of linkage to administrative data, whilst limited to a subgroup of patients within New South Wales, meant near-complete follow-up was achieved for other outcomes. The propensity matched analysis is reported to the highest standards, following best practice. This is an exciting step in the understanding of differences between patients with and without sepsis. However, propensity score matching has challenges, and some of these pose a potential threat to the external validity of the results. We highlight three: the patient information available at baseline to inform matching; identifying patients at baseline for inclusion into the study; and the ability to match all patients across the disease spectrum.

Propensity scoring aims to improve balance between exposed and unexposed groups in observational research. In this study, groups were balanced on their propensity to develop sepsis (the exposure), and the small standardised differences for each chosen variable show this was achieved. However, propensity scores can only use observed characteristics; unobserved characteristics may be important and still be imbalanced between groups, which could result in residual confounding [8]. The data from the CHEST trial used for propensity score matching included age, gender, weight, admission source, medical or surgical admission, trauma, creatinine concentration, heart rate, mean arterial pressure, mechanical ventilation status and APACHE II score. These are important, but other baseline factors have also been shown to influence sepsis outcome, including pre-sepsis health status, frailty, risk factors for infection, health care setting, treatments provided, limitations in activities of daily living, race, and individual response to treatment [5, 9, 10]. Many of these were not measured in the original CHEST trial and therefore could not contribute to the propensity score analysis. An unknown imbalance between the groups could have influenced the findings.
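To make the balance check concrete, the following is a minimal sketch of how a standardised difference for one continuous baseline variable can be computed. The function name and the age values are hypothetical illustrations, not data or code from the study; the formula is the usual difference in means divided by the pooled standard deviation.

```python
import math
import statistics


def standardised_difference(exposed, unexposed):
    """Standardised mean difference for one continuous covariate.

    Difference in group means divided by the pooled standard deviation,
    where the pooled SD is the square root of the average of the two
    sample variances.
    """
    mean_diff = statistics.mean(exposed) - statistics.mean(unexposed)
    pooled_sd = math.sqrt(
        (statistics.variance(exposed) + statistics.variance(unexposed)) / 2
    )
    return mean_diff / pooled_sd


# Hypothetical ages in a matched sample (illustrative values only).
sepsis_ages = [62, 70, 58, 66, 74]
non_sepsis_ages = [60, 71, 59, 65, 73]

d = standardised_difference(sepsis_ages, non_sepsis_ages)
print(f"standardised difference for age: {d:.3f}")
```

A commonly cited rule of thumb treats absolute values below about 0.1 as negligible imbalance. Such a check can only be run on variables that were recorded, which is why it cannot rule out imbalance in unmeasured characteristics.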

Identifying patients with and without sepsis at baseline was an important first step in this study. Thompson and colleagues used a sepsis diagnosis based on prospective clinical screening of patients by research coordinators, according to the 1992 sepsis definitions. Interestingly, 10.6% of the patients in the non-sepsis group had a non-operative APACHE III diagnosis of sepsis. The authors note that these patients may not have satisfied the clinical criteria for sepsis (as identified by research coordinators). To address this diagnostic uncertainty, the authors assessed the robustness of their findings to different definitions of sepsis, reassuringly demonstrating no difference. Other studies have used different methods of identifying sepsis, including claims-based definitions, ICD-9 codes and APACHE III codes [5, 10, 11]. These differences highlight a particular challenge in critical care studies in relation to dichotomising sepsis versus non-sepsis populations.

Using the defined baseline variables, Thompson and colleagues matched sepsis and non-sepsis patients using the 1:1 greedy matching method, which selects a sepsis patient at random and matches them to the non-sepsis patient whose propensity score is closest. The process was repeated until the list of sepsis patients able to be matched was exhausted, leaving 11% of the sepsis patients unmatched. While this is common in propensity-matched cohorts, a potential limitation is that the sepsis-attributable risk is prone to bias if the control population chosen is not similar to the entire sepsis cohort. The unmatched sepsis patients in this study had a higher APACHE II score, heart rate, lactate, creatinine and renal SOFA score, and a lower mean arterial pressure than the matched cases, indicating they had higher illness severity. Therefore, reported differences in mortality, costs and HRQoL may not represent the entire sepsis cohort, in particular the sickest patients. External generalisability to all ICU populations might also be limited by uncertainty about patients who were not enrolled in CHEST (and therefore not included in this study), who might also have been systematically different.
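The matching procedure described above can be sketched as follows. This is an illustrative simplification rather than the authors' implementation: the function name, the patient identifiers, the propensity scores and the use of a caliper (a maximum allowed distance between scores) to decide when a patient cannot be matched are all assumptions made for this example.

```python
import random


def greedy_match(treated, controls, caliper, seed=0):
    """1:1 greedy nearest-neighbour matching without replacement.

    treated, controls: dicts mapping patient id -> propensity score.
    caliper: maximum allowed score distance (an assumption of this sketch).
    Returns (pairs, unmatched_treated_ids).
    """
    rng = random.Random(seed)
    order = list(treated)
    rng.shuffle(order)  # treated patients are processed in random order
    available = dict(controls)  # controls not yet used in a pair
    pairs, unmatched = [], []
    for t_id in order:
        if not available:
            unmatched.append(t_id)
            continue
        # Closest remaining control by absolute score distance.
        c_id = min(available, key=lambda c: abs(available[c] - treated[t_id]))
        if abs(available[c_id] - treated[t_id]) <= caliper:
            pairs.append((t_id, c_id))
            del available[c_id]  # each control is used at most once
        else:
            unmatched.append(t_id)
    return pairs, unmatched


# Hypothetical propensity scores (illustrative values only).
sepsis = {"s1": 0.30, "s2": 0.55, "s3": 0.95}
non_sepsis = {"c1": 0.28, "c2": 0.57, "c3": 0.40}

pairs, unmatched = greedy_match(sepsis, non_sepsis, caliper=0.05)
print(sorted(pairs), unmatched)
```

In this toy example the patient with the highest propensity score has no sufficiently close control and is left unmatched, which mirrors the concern that unmatched patients tend to sit at the extreme of the disease spectrum.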

Long-term mortality and HRQoL following sepsis have been reported in several clinical trials and cohort studies. In a recent systematic review, mortality after hospital discharge was consistently increased 1 year after sepsis compared to non-sepsis controls, and HRQoL, measured with both the EQ5D and the SF-36, was reduced [2]. Baseline health status is an important determinant of post-ICU recovery, and the systematic review noted that HRQoL was often reduced in sepsis patients prior to ICU admission [2, 12]. In Thompson and colleagues’ study, the EQ5D at 6-month follow-up was not different between sepsis and non-sepsis groups. Although this may be a true measure of effect, it may have been influenced by pre-illness health status, which was not available for adjustment. The direction of the effect that this unmeasured confounder might have on the between-group differences is uncertain.

Despite the potential limitations highlighted, this study used high-quality, complete data from a clearly defined population, and applied detailed, comprehensive and transparent analyses. As such, it is important because it challenges the emerging view that long-term outcomes for sepsis patients are worse than for other critically ill patients without sepsis in terms of survival, HRQoL and healthcare utilisation. Whether the poor long-term health experienced by ICU patients is ‘generic’ to critical illness as opposed to sepsis patients being somehow ‘special’ is fundamental to directing our future research efforts and defining care pathways to support recovery.