Introduction

Novel coronavirus disease (COVID-19) is a global public health issue that has now affected more than 200 countries and caused a second wave of the pandemic1. Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) pneumonia is associated with a high risk of mortality. However, factors that predict poor clinical outcomes in individual patients with SARS-CoV-2 pneumonia remain under intensive investigation.

Current studies have shown that patients with SARS-CoV-2 pneumonia exhibit a wide range of symptoms, including fever, cough, myalgia, and fatigue2,3,4,5. Many patients experience a mild disease course, although approximately 15–25% develop more severe disease. Progression may result in acute respiratory distress syndrome (ARDS), multiple organ failure, and death6. Therefore, it is of utmost importance to identify high-risk patients so that prompt medical intervention can be implemented to improve clinical outcomes.

The aim of this study was to establish and validate a prognostic model for increased risk of mortality and survival time among individual patients with SARS-CoV-2 pneumonia. Our validated model stratifies patients into those with high versus low risk of death before life-threatening complications develop. This knowledge could be used to inform and justify critical patient management decisions and promote optimal use of often limited medical resources during the COVID-19 pandemic.

Methods

Study design and participants

This retrospective, multi-center cohort study involved adult patients who were diagnosed with COVID-19 pneumonia in four major government-designated hospitals in Wuhan: Tongji Hospital of Tongji Medical College of Huazhong University of Science and Technology (TJH), Renmin Hospital of Wuhan University (RHWU), Wuhan Pulmonary Hospital (WPH), and Wuhan No.1 Hospital of Tongji Medical College of Huazhong University of Science and Technology (WNH) (Supplementary Fig. 1). Patients were followed until 18 September 2020. Patients were divided into three cohorts: the training cohort (TC) was used to establish a prognostic model, and two validation cohorts (VC1 and VC2) were used for external validation and assessment of the robustness of the model. The TC included data collected from TJH between Jan 21st and Feb 16th, 2020. The VC1 consisted of patients from RHWU and WNH admitted between Jan 23rd and Feb 16th, 2020. The VC2 included patients from WPH admitted between Jan 10th and Feb 27th, 2020. The primary outcome was mortality at the end of the study period. To be included in the study, participants had to meet the following diagnostic criteria for COVID-19 pneumonia: (1) confirmed diagnosis of SARS-CoV-2 pneumonia using RT-PCR on nasopharyngeal/oropharyngeal swab samples, and (2) computerised tomography (CT) evidence of viral pneumonia, defined as COVID-19. In addition, patient clinical outcome data had to be available. Exclusion criteria were: (1) death within 24 h after hospital admission with related health records unavailable, (2) no available data on clinical outcomes, (3) suspected cases lacking a positive result for the 2019-nCoV test, and (4) refusal to participate in this study.

Following informed consent, the following data were collected on admission: age, sex, symptoms from onset to hospital admission (fever, cough, dyspnea, myalgia, rhinorrhea, arthralgia, chest pain, headache, and vomiting), comorbidities (cardiovascular disease, chronic pulmonary disease, cerebrovascular disease and chronic neurological disorders, diabetes, malignancy, and smoking), vital signs (heart rate, respiratory rate, and blood pressure), laboratory values on admission (serum hemoglobin concentration, lymphocyte counts, platelet counts, diverse protein markers), treatment regimen used for COVID-19 pneumonia (antiviral agents, antibacterial agents, corticosteroids, and interferon therapy), dates of symptom onset, admission, virus testing, and CT scan, as well as changes in patient condition and living status. All methods were carried out in accordance with the relevant guidelines and regulations of the Declaration of Helsinki.

Treatment protocol and criteria for hospital discharge for SARS-CoV-2 pneumonia

The treatment strategy for patients with COVID-19 pneumonia was based on the guidelines of the World Health Organization (WHO)7, and included symptom relief, treatment of underlying diseases, prevention of superimposed bacterial infections, active prevention of complications such as sepsis and ARDS, and timely support of vital organ function. Oxygen supplementation was provided for patients with reduced O2 saturation and was administered as high-flow oxygen via nasal prongs (< 300 mmHg), non-invasive or invasive mechanical ventilation (< 200 and < 150 mmHg, respectively), or extracorporeal membrane oxygenation (ECMO) if required.

The discharge criteria for patients with SARS-CoV-2 pneumonia included one of the following: (1) haemodynamically stable and afebrile for > 3 days, (2) radiological evidence of significant resolution of pneumonia on CT scan, (3) two sequential negative results for the 2019-nCoV test at least 1 day apart, and (4) no concurrent acute medical issues requiring transfer to another medical facility.

Statistical considerations

Survival time was calculated from the date of hospital admission until death due to SARS-CoV-2 pneumonia or until the date of the last follow-up. Death due to SARS-CoV-2 pneumonia was considered an event. Continuous variables were reported as means with standard deviations (SD) for normally distributed variables and as medians with interquartile ranges (IQR) for non-normally distributed variables. Categorical variables were reported as proportions.
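The definitions above can be illustrated with a minimal Python sketch. It assumes a hypothetical list of records (admission date, date of death or last follow-up, and whether the patient died of SARS-CoV-2 pneumonia); the dates are invented for illustration and are not study data. Patients alive at last follow-up are treated as censored (event indicator 0).

```python
from datetime import date
from statistics import median, quantiles

# Hypothetical records, NOT study data:
# (admission date, date of death or last follow-up, died of SARS-CoV-2 pneumonia?)
records = [
    (date(2020, 1, 25), date(2020, 2, 9), True),
    (date(2020, 2, 1), date(2020, 2, 20), False),
    (date(2020, 2, 3), date(2020, 2, 10), True),
    (date(2020, 2, 10), date(2020, 3, 5), False),
]

def survival_time(admission, end, died):
    """Return (days from admission to end date, event indicator).

    The event indicator is 1 for death and 0 for a censored observation.
    """
    return (end - admission).days, 1 if died else 0

observations = [survival_time(*r) for r in records]
days = [d for d, _ in observations]

# Median and interquartile range, as used for non-normally distributed variables.
# The "inclusive" quantile method is an illustrative choice; other conventions exist.
q1, q2, q3 = quantiles(days, n=4, method="inclusive")

print(observations)        # [(15, 1), (19, 0), (7, 1), (24, 0)]
print(median(days), q1, q3)
```

Note that a plain median of follow-up times, as computed here, ignores censoring; a Kaplan-Meier estimate would be needed for a median survival time that accounts for it.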

In accordance with the Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD) guidelines8, we developed and validated a model using the training and validation cohorts. During model development, all patients’ demographic characteristics, clinical information, vital signs and laboratory values were analyzed for a possible association with life status (deceased versus living) and survival time using the least absolute shrinkage and selection operator (LASSO) for variable selection9. An iterative process combining forward and backward selection was applied to remove non-significant covariates. During each step of the iteration, the Akaike information criterion (AIC) was used to evaluate model fit10. The final model was the one with the minimum AIC value. The area under the receiver operating characteristic curve (AUC) was used to evaluate the accuracy of the prediction of vital status. Model calibration was performed to ensure robustness. Cox proportional hazards regression analysis was used to assess the prognostic model for individual survival times. The proportional hazards assumption of the Cox regression model was assessed using the Schoenfeld residuals test.
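Two of the quantities named above have simple closed forms, sketched here in Python as an illustration rather than as the authors' implementation (the analysis itself was run in R). The AIC is 2k − 2 ln L for a model with k parameters and maximized log-likelihood ln L; the AUC equals the probability that a randomly chosen deceased patient receives a higher score than a randomly chosen survivor, with ties counted as one half. The scores and labels below are hypothetical.

```python
def aic(log_likelihood, n_params):
    """Akaike information criterion: lower values indicate a better trade-off
    between goodness of fit and model complexity."""
    return 2 * n_params - 2 * log_likelihood

def auc(scores, labels):
    """AUC via pairwise comparison of scores (label 1 = deceased, 0 = survived)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one patient in each outcome group")
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0      # deceased patient ranked above survivor
            elif p == q:
                wins += 0.5      # tie counts as half
    return wins / (len(pos) * len(neg))

# Hypothetical predicted risks and outcomes, NOT study data.
scores = [0.9, 0.8, 0.35, 0.6, 0.2, 0.1]
labels = [1, 1, 1, 0, 0, 0]

print(round(auc(scores, labels), 3))   # 0.889 (8 of 9 pairs correctly ordered)
print(aic(-100.0, 5))                  # 210.0
```

In a stepwise search, the candidate model with the smallest `aic` value at each step would be retained.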

To validate the prognostic model, two independent validation cohorts (VC1, VC2) were evaluated with the same discrimination method and survival function. The 95% confidence intervals (CIs) were estimated via 5000 bootstrap replicates. All statistical analyses were performed using R software version 3.6. A p value < 0.05 was considered statistically significant.

Role of funding

The funders were not involved in any activities of this study, aside from providing financing.

Ethics, consent and permission

This study was approved by the institutional ethics board of Tongji Hospital of Tongji Medical College, Huazhong University of Science and Technology (No. IRBID: TJ-C20200107). All participants agreed to take part in this study.

Consent to publish

Informed consent for publication was obtained from all participants.

Results

Demographic and clinical features in training and validation cohorts

Overall, 492 patients were recruited in this study. The demographic characteristics and clinical features of patients from the TC (n = 237; TJH), VC1 (n = 120; RHWU + WNH), and VC2 (n = 135; WPH) cohorts are listed in Table 1. The mortality rates in the three cohorts were 44.3% (TC), 25.8% (VC1) and 33.3% (VC2), respectively. A total of 105 events occurred in the TC, and 31 and 45 events occurred in VC1 and VC2, respectively. The median survival times were comparable among the three cohorts (15, 17 and 14 days for TC, VC1, and VC2, respectively). TC patients had a median age of 62 (IQR 50–70) and were older than those in VC1 (median age 46, IQR 37–66), but similar to those in VC2 (median age 63, IQR 52–70.5). There was no significant difference in sex distribution among the three cohorts. Most patients were non-smokers (92% [TC], 97.5% [VC1], and 83% [VC2]). The proportion of patients with associated comorbidities varied among the three cohorts (52.7% [TC] vs. 37.5% [VC1] vs. 73.3% [VC2]; Table 1). The proportion of severe cases requiring intensive care unit (ICU) admission also varied among the three cohorts (8.9% [TC], 1.7% [VC1], and 20% [VC2]). Lymphopenia occurred in the majority of patients (69.2% [TC], 60.8% [VC1], and 75.6% [VC2]; Table 1). Leucocytosis was observed in 24.5% of patients in the TC, 21.7% in the VC1, and 20.7% in the VC2. Neutrophilia was observed in 34.6% in the TC, 30.8% in the VC1, and 31.1% in the VC2.

Table 1 Clinical characteristics, treatments and laboratory findings of the Training (TJH), Validation 1 (RHWU + WNH), and Validation 2 (WPH) cohorts.

The median time from symptom onset to hospital admission varied between cohorts (TC, median 10.0 days, IQR 7–14 days; VC1, median 7.0 days, IQR 4–10 days; VC2, median 10.0 days, IQR 7–13 days), with fever being the most common symptom on admission. The median durations of hospitalization and treatment for TC, VC1, and VC2 were 15.0 days (IQR 7–24 days), 17.0 days (IQR 11.8–25.0 days), and 14.0 days (IQR 10.0–19.0 days), respectively. The majority of patients in TC, VC1, and VC2 were treated with antibiotics (86.5%, 92.5%, and 85.9%, respectively) and antivirals (lopinavir/ritonavir; 99.6%, 95.8% and 97.0%, respectively).

Potential risk factors associated with vital status for COVID-19

Univariate analysis revealed that advanced age, increased body temperature on admission, and the presence of underlying diseases were associated with a higher mortality rate in patients with COVID-19 (Table 2). Tachypnoea and hypertension, as well as treatment with antibiotics, corticosteroids or intravenous immunoglobulin, were also associated with increased mortality (Table 2). Several laboratory parameters, including serum bilirubin, D-dimer, potassium level, prothrombin time (s), lactate dehydrogenase, aspartate transaminase (AST), and urea, were also found to be associated with an increased risk of death. In addition, patients with lymphopenia, leukocytosis, or neutrophilia had an increased risk of death (Table 2). Of note, a decreased lymphocyte count, an increased neutrophil count, and an increased neutrophil-to-lymphocyte ratio (NLR) were also significant risk factors for mortality.

Table 2 Risk factors associated with mortality of COVID-19.

Construction of a prognostic model for vital status and survival in SARS-CoV-2

For the TC, a multivariate analysis was performed to analyze the association between vital status, survival time, and all the covariates listed in Table 1. Five covariates were statistically significant predictors of vital status and survival time: (1) age (adjusted odds ratio (AOR): 1.1 per year increase [95% CI 1.06–1.13]; Wald’s p < 0.001), (2) NLR (AOR: 1.14 per unit increase [95% CI 1.08–1.2]; p < 0.001), (3) body temperature at admission (AOR: 1.53 per °C increase [95% CI 1.0–5.26]; p = 0.005), (4) aspartate transaminase (AST) (AOR: 2.47 [95% CI 1.16–5.26] for increased vs. normal; p = 0.019), and (5) total protein (AOR: 1.69 [95% CI 0.78–3.64] for decreased vs. normal; p = 0.018; Table 2). Based on the weights (coefficients) of these five significant covariates (Table 2), a prognostic model was constructed and applied to predict the vital status of the training cohort. This analysis yielded an AUC of 0.912 (95% CI 0.878–0.947; Fig. 1A), indicating that the prognostic model was able to effectively differentiate between patients with SARS-CoV-2 pneumonia who survived and were subsequently discharged and those who died. In the prediction of overall survival, the model reached a Harrell’s c-index of 0.758 (95% CI 0.723–0.793; Fig. 1C). The model was also able to define a high-risk subgroup with a significantly increased likelihood of death due to SARS-CoV-2 pneumonia (hazard ratio [HR]: 24.22 [95% CI 10.57–55.5]) versus a low-risk subgroup. The predicted survival probabilities were compared with observed survival probabilities on the 7th, 14th, 21st, and 28th day after admission (Fig. 1B). A nomogram was constructed to assess the impact of these factors (Supplementary Fig. 2). The predicted 30-day survival rates of the high- and low-risk subgroups in the training cohort are visualized in Fig. 1C (≥ 799 and < 799). Here, 799 represents the model cutoff, based on the average of the minimum calculated scores among deceased patients.
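The relationship between an odds ratio, its confidence interval, and the Wald p value can be made concrete with a short Python sketch. It assumes the reported interval is a standard Wald interval, symmetric on the log-odds scale, which the source does not state explicitly; the function name is hypothetical. Applied to the reported AST estimate (AOR 2.47, 95% CI 1.16–5.26), it gives a coefficient of about 0.90 and a two-sided p value of about 0.019, in line with the reported p = 0.019.

```python
import math

def wald_from_ci(odds_ratio, ci_low, ci_high, z_crit=1.959964):
    """Recover coefficient, standard error, z statistic and two-sided p value
    from an odds ratio and its 95% CI, assuming a Wald interval on the log scale."""
    beta = math.log(odds_ratio)                                   # log-odds coefficient
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)    # CI width -> SE
    z = beta / se
    p = math.erfc(abs(z) / math.sqrt(2))                          # two-sided normal p value
    return beta, se, z, p

# AST, increased vs. normal, as reported in the text above.
beta, se, z, p = wald_from_ci(2.47, 1.16, 5.26)
print(f"beta={beta:.3f}, SE={se:.3f}, z={z:.2f}, p={p:.3f}")
```

The coefficient `beta` is the weight such a covariate would contribute to a logistic linear predictor; how the authors rescaled the weights to obtain scores around the cutoff of 799 is not described in the text, so no attempt is made to reproduce it here.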

Figure 1

(A) AUC (area under the curve) of the ROC (receiver operating characteristic) analysis for the training cohort and two validation cohorts (left), and for the three age-specific cohorts (< 50 years, 50–70 years, and > 70 years; right). (B) Calibration plot comparing predicted and observed survival rates of patients in the training cohort on the 7th, 14th, 21st, and 28th day after hospital admission. (C) Clinical stratification and prediction of survival rate on the basis of the developed prognostic model. Survival in the low- and high-risk subgroups of the training cohort (TJH), stratified by a cutoff of ≤ 799 and > 799, respectively (left), and predicted survival rates in this cohort (right). Smooth lines represent mean predicted survival probabilities for each risk group; dots represent the corresponding predicted rates with 95% CI (vertical lines) (figure produced using R version 3.6; ref. 37).

Validation of the model for vital status and survival

To validate the prognostic value of the established outcome prediction model for SARS-CoV-2 pneumonia, external validation was performed using two cohorts (VC1 and VC2). The model reached an AUC of 0.928 [95% CI 0.884–0.971; VC1] and 0.883 [95% CI 0.815–0.952; VC2] for the prediction of vital status (Fig. 1A). For the prediction of survival in the validation cohorts, the model yielded C indices of 0.762 [95% CI 0.723–0.801; validation cohort 1] and 0.711 [95% CI 0.672–0.75; validation cohort 2] (Fig. 2). Applying the same model score cutoff, high-risk subgroups with lower survival rates were clearly differentiated from the low-risk subgroups in both validation cohorts (HR: 11.53 [95% CI 4.01–33.15] for VC1 and HR: 9.3 [95% CI 3.32–26.03] for VC2) (Fig. 2). Of note, the predicted 30-day survival rates in the high- and low-risk subgroups of both validation cohorts were similar to the observed survival rates in the training cohort (Fig. 2), supporting the strength of the model for the prognosis of SARS-CoV-2 pneumonia.
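For readers unfamiliar with the concordance index, the following Python sketch shows how Harrell's c-index and a cutoff-based risk stratification can be computed. It is a simplified illustration, not the authors' R code: the data are hypothetical, pairs with tied survival times are ignored, and treating a score of exactly 799 as high risk follows the "≥ 799" wording in the text and is an assumption.

```python
def harrell_c(times, events, risk):
    """Harrell's c-index: among comparable pairs, the fraction in which the
    patient who died earlier had the higher risk score (score ties count half).

    A pair (i, j) is comparable when patient i died (event 1) strictly before
    patient j's observed time. Tied survival times are skipped for simplicity.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            if events[i] == 1 and times[i] < times[j]:
                comparable += 1
                if risk[i] > risk[j]:
                    concordant += 1.0
                elif risk[i] == risk[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable

CUTOFF = 799  # model score cutoff reported in the text

def risk_group(score, cutoff=CUTOFF):
    """Assign a risk subgroup; scores at or above the cutoff count as high risk (assumed)."""
    return "high" if score >= cutoff else "low"

# Hypothetical follow-up data and model scores, NOT study data.
times = [5, 10, 12, 20, 25]      # days from admission
events = [1, 1, 0, 1, 0]         # 1 = death, 0 = censored
risk = [900, 700, 600, 820, 400]

c = harrell_c(times, events, risk)
groups = [risk_group(s) for s in risk]
print(c)        # 0.875 (7 of 8 comparable pairs are concordant)
print(groups)   # ['high', 'low', 'low', 'high', 'low']
```

A c-index of 0.5 corresponds to random ordering and 1.0 to perfect ordering of survival times by risk score.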

Figure 2

The prognostic model achieves clinical stratification and predicts overall survival (OS) in the two validation cohorts. (A) Survival probabilities in the low- and high-risk subgroups defined by the same cutoff of 799 in validation cohort 1 (RHWU + WNH) (left), and the corresponding predicted survival probabilities in this cohort (right). (B) Survival probabilities in the low- and high-risk subgroups defined by the same cutoff in validation cohort 2 (WPH) (left), and the corresponding predicted survival probabilities in this cohort (right) (figure produced using R version 3.6; ref. 37).

To investigate the impact of age on the prognostic model, the two validation cohorts were merged and then divided by age into three subgroups: < 50 years (cohort_5), 50–70 years (cohort_6), and > 70 years (cohort_7). For the prediction of vital status, the model yielded an AUC of 0.911 [95% CI 0.853–0.97; cohort_5], 0.809 [95% CI 0.713–0.904; cohort_6], and 0.825 [95% CI 0.719–0.931; cohort_7; Fig. 1A]. For survival prediction, the model yielded C indices of 0.572 [95% CI 0.533–0.611; cohort_5], 0.721 [95% CI 0.682–0.76; cohort_6], and 0.706 [95% CI 0.667–0.745; cohort_7; Table 3]. Finally, to aid in the current clinical management of SARS-CoV-2, a web-based application (http://82.165.167.23:8734/SIMTaskMaster/SARS2_Tool) was developed to enable broad testing and utilization of the developed prognostic model (Supplementary Fig. 3).

Table 3 Vital status and overall survival prediction in age-specific cohorts.

Discussion

In this retrospective multicenter study of 492 hospitalised patients with SARS-CoV-2 pneumonia, we found that advanced age, high body temperature on admission, high NLR, elevated AST, and decreased total protein were associated with an increased risk of mortality. The prognostic model based on these five clinical parameters was validated using two separate validation cohorts. The aim of applying the model is the early identification and prioritization of individual patients who require early intensive treatment.

The rapid transmission of the disease and the current second wave of the COVID-19 pandemic have created a public crisis on a global scale. To avoid overwhelming public health systems and exacerbating economic burdens, strategies to overcome this pandemic are being vigorously explored, including studies aimed at identifying the populations at greatest risk. Prior studies have reported various potential risk factors associated with mortality in the setting of SARS-CoV-2 pneumonia4,5. For instance, Chen and colleagues found that age, obesity, and comorbidity were three identifiable risk factors for mortality5. Wang and colleagues observed neutrophilia, lymphopenia, and elevated D-dimer and creatinine levels in non-survivors, implying that a cellular immune deficiency plus coagulation activation could potentially mediate disease severity11. In our study, the association between increased NLR and mortality suggests that altered immune cell function plays a critical role in the pathogenesis of SARS-CoV-2 pneumonia. This result is consistent with several recent independent studies12,13,14.

Advanced age has also been identified as an independent risk factor for COVID-19 mortality5,15,16. The underlying mechanisms could include changes in the anatomical structure of the respiratory tract with aging17, immunosenescence18, and inflammaging19, which would, respectively, facilitate entry of SARS-CoV-2, weaken anti-viral immunity, and promote a cytokine storm, leading to multiple organ damage. Further, age-related alterations in metabolism are known to underlie changes in innate and adaptive immunity20, which also contribute to the weakening of immunity. Our finding that increased NLR in elderly patients is a major risk factor for mortality supports the role of inflammaging in COVID-19 pathogenesis.

Aspartate transaminase (AST) is an important clinical marker for the early diagnosis of various diseases, including the progression and/or metastatic potential of solid tumors21,22. Further, AST is an important enzyme involved in diverse metabolic pathways including purine metabolism23, steroid biosynthesis24, and synthesis of amino acids such as arginine25, phenylalanine26, tyrosine27, and others28. Thus, elevated serum AST levels are considered an indicator of metabolic dysfunction. Furthermore, hypoalbuminaemia, manifested in our study as reduced total protein, is often related to malnutrition, and recent studies have shown that diminished availability of metabolic nutrients directly leads to changes in immune responses29,30,31,32.

Because SARS-CoV-2 replication and pathogenesis are highly dependent on the host metabolism33, the decreased total protein suggests a heightened viral burden and predicts a severe disease course. Taken together, the poorer outcomes observed in our patients with elevated AST levels and decreased total protein could be related to age- and/or virus-induced metabolic dysfunction in these individuals.

Lastly, an increase in body temperature is one clinical manifestation of pro-inflammatory cytokine production (e.g., TNFα, TNFβ, IL-1β) by activated macrophages and T-lymphocytes. Dysregulated production of such cytokines can lead to a “cytokine storm” that ultimately damages vital organs, including the lungs, contributing to ARDS. Dysregulated and sustained TNFα production in response to other viral infections (HIV/AIDS) also mediates cachexia (muscle wasting) and is characterized by changes in total protein32. Thus, an elevated body temperature in patients at risk for mortality from COVID-19 pneumonia may reflect an aberrant cytokine response to SARS-CoV-2 infection.

In sum, the five clinical parameters that, when combined, predict mortality in patients with COVID-19 pneumonia in our model reflect the status of host immunity. Further, as with other coronaviruses, SARS-CoV-2 does not possess its own metabolism, so viral replication and pathogenesis are highly dependent on the host metabolism. Hijacking the host metabolism remains the only way for the virus to survive.

Our study has several strengths. First, while several studies have previously reported relevant risk factors associated with SARS-CoV-2 pneumonia34,35,36, our study combined such factors into a validated prognostic model for the outcome of COVID-19. Second, our model utilizes five commonly used clinical parameters that are routinely obtained on hospital admission and are not confounded by prior treatment, since treatment has not yet been initiated. Third, our study involved a large number of patients and the prognostic model was validated with two large, independent external cohorts. Fourth, the model was also validated for age-specific cohorts. Specifically, the high AUC and C-indices for the prediction of vital status and survival in patients aged 50–70 years and > 70 years indicate the suitability of the prognostic model for elderly patients. Overall, this prognostic model can assist clinicians in the identification and stratification of high-risk patients, thereby promoting the initiation of vital treatment strategies that can improve outcomes.

The major limitation of this study is that the model was developed and validated purely on a Chinese population. Therefore, its applicability to regions outside of China needs to be further determined. We speculate that the model could still reach a high prediction rate; however, the optimal model score cutoff of 799 might need to be adjusted to cover a broader spectrum of disease trajectories. Further, our prognostic model excluded gender and presence of comorbidity due to low statistical significance. One reason might be that this study was not able to include COVID-19 patients outside of China. Moreover, another study by our group has shown that presence of comorbidity was an age-dependent risk factor for COVID-19 (under review).

In addition, because this was an observational study, potential confounders may exist that could affect the results. Therefore, further prospective international multicenter studies are needed to test the robustness of this model.

Conclusion

In this retrospective multi-center cohort study, a prognostic model was developed and validated to predict the outcome of individual patients suffering from SARS-CoV-2 pneumonia. We identified five common clinical parameters that are relevant to outcome of COVID-19 infection. This model enables clinical patient stratification to efficiently prioritize medical resources in the treatment and management of patients with SARS-CoV-2 pneumonia. The model’s clinical application may also inform treatment recommendations to save more lives in a high-risk group of patients while avoiding overtreatment in those at lower risk.