Introduction

Extended hepatectomy (EH) is the only potentially curative treatment for bilobar or large liver lesions1,2. Recent developments in the field of hepatobiliary surgery have increased the indications for EH and it has become a standard surgical procedure in most high-volume centers. Despite improvements, the rates of morbidity and mortality after EH remain high3,4. A comprehensive preoperative assessment and proper patient selection criteria may predict and reduce the risk of postoperative morbidity and mortality. This would improve patient selection and patient management, and thereby improve the intraoperative findings and postoperative outcomes of EH.

Several risk assessment scores have been proposed to predict outcomes after liver resection3,5,6,7,8,9. However, most are based on postoperative parameters. Because the phase shortly after EH is critical, many patients may not benefit from a risk assessment that is based on postoperative data. Some preoperative risk scoring systems have been introduced10,11,12,13, but their discriminatory performance has not been evaluated and compared exclusively in patients undergoing EH, who have a higher risk of postoperative morbidity and mortality than those undergoing minor hepatectomy.

The aim of this study was to evaluate the ability of well-known risk assessment scores to predict mortality risk after EH. All included risk assessments have been developed or validated in large cohorts of liver resection patients and have been published in high-impact hepatobiliary surgery or hepatogastroenterology journals. A second aim was to propose a risk assessment strategy for patients undergoing EH based on these risk scores.

Results

Patient collective

Demographic and baseline data, as well as perioperative outcomes, were compared between patients with incomplete data for risk score calculation (n = 26, 9.4%) and patients with complete data. Baseline data and outcomes did not differ significantly between the two groups (data not shown). After excluding patients with incomplete data, 250 patients were included in the final analysis. The mean age of included patients was 60 ± 12 years (range: 18–86 years) and 134 patients (53.6%) were male. The most common indication for EH was primary liver malignancy in 136 patients (54.4%), followed by liver metastasis in 80 patients (32.0%), and benign indications in 34 patients (13.6%). One hundred and four patients (43.3%) received preoperative systemic chemotherapy. Baseline clinical and demographic characteristics, as well as preoperative laboratory data, are shown in Tables 1 and 2.

Table 1 Demographic and preoperative clinical data.
Table 2 Preoperative laboratory data.

Discriminatory value of predictive risk scores

Curve estimation was applied to show changes in the proportion of major morbidity and 90-day mortality with increasing risk. The low-, intermediate-, and high-risk groups are shown in green, yellow, and red, respectively, in Fig. 1 according to the predefined cut-off points of each risk assessment score (Table 3). As expected, the rates of major morbidity and 90-day mortality after EH increased with increasing values of all risk scores. As shown in Table 4, the rates of major morbidity and 90-day mortality differed significantly between low-, intermediate-, and high-risk patients based on the proposed cut-off values of all risk scores except ALBI. The FIB-4 index showed the largest increase in major morbidity and 90-day mortality (40% for both) from the low-risk group to the high-risk group.

Figure 1

The incidence of major morbidity (–) and 90-day mortality (–) in low- (green), intermediate- (yellow), and high-risk (red) patients based on different risk assessment scores. APRI, aminotransferase-to-platelet ratio index; ALBI, albumin to bilirubin grade; FIB-4, liver fibrosis index; MELD, model for end-stage liver disease.

Table 3 Details of the selected risk assessment scores.
Table 4 The rate of morbidity and mortality in the risk categories of risk assessment scores.

Prediction of 90-day mortality

Based on the receiver operating characteristic (ROC) curve analysis (Fig. 2), the Heidelberg score (area under the curve [AUC] = 79%), FIB-4 index (AUC = 77%), and APRI (AUC = 73%) had AUCs greater than 70%, whereas the model for end-stage liver disease (MELD) score (AUC = 69%), Breitenstein score (AUC = 69%), and ALBI score (AUC = 66%) had AUCs between 60% and 70% (all p < 0.01). The estimated cut-off values for ALBI, APRI, Breitenstein score, FIB-4 index, Heidelberg score, and MELD score were −3.74, 0.19, 5.50, 1.52, 5.50, and 7.38, respectively. Figure 3 shows the sensitivity and specificity of the risk scores for 90-day mortality based on the estimated cut-offs. APRI (86%), ALBI grade (86%), Heidelberg score (81%), and FIB-4 index (79%) had the highest sensitivity for 90-day mortality. However, only the FIB-4 index and Heidelberg score had an acceptable specificity, of 70% and 65%, respectively. The APRI showed a specificity of 53% and the ALBI grade a specificity of 48% for 90-day mortality.

Figure 2

Area under the curve (AUC) for each risk assessment score for discrimination of 90-day mortality. APRI, aminotransferase-to-platelet ratio index; ALBI, albumin to bilirubin grade; FIB-4, liver fibrosis index; MELD, model for end-stage liver disease.

Figure 3

Sensitivity (blue) and specificity (orange) of each risk assessment score for predicting 90-day mortality. APRI, aminotransferase-to-platelet ratio index; ALBI, albumin to bilirubin grade; FIB-4, liver fibrosis index; MELD, model for end-stage liver disease.

To compare the discriminatory ability of different risk scores for 90-day mortality, pairwise comparison of AUCs was performed. As shown in Fig. 2, the Heidelberg risk score performed better than the ALBI grade, Breitenstein score, and MELD score but there were no significant differences between the discriminatory abilities of the Heidelberg score compared with the APRI and FIB-4 index. There was also no significant difference in discriminatory ability between other risk scores. Accordingly, the Heidelberg score, APRI, and FIB-4 index were selected to propose a risk assessment strategy for patients undergoing EH.

Proposed risk assessment strategy for EH

Based on the determined sensitivity and specificity of the selected risk scores, a two-stage screening method was proposed. A high-sensitivity test (Heidelberg score or APRI) was selected for the first stage and a high-specificity test (FIB-4 index) was selected for the second stage. Two risk assessment strategies were assessed:

  1. A two-stage risk assessment strategy using the Heidelberg score as the first and the FIB-4 index as the second screening test.

  2. A two-stage risk assessment strategy using the APRI score as the first and the FIB-4 index as the second screening test.

Heidelberg−FIB-4 model

In this proposed model, all patients were first screened with the Heidelberg score, and those whose risk score was < 5.50 were considered low-risk. Patients classified as high-risk at this stage (risk score ≥ 5.50) underwent a second, more specific screening with the FIB-4 index, and those whose FIB-4 index was < 1.52 were considered to be at acceptable risk. Patients who were considered high-risk by both tests were assumed to be at high risk of mortality after surgery. The overall sensitivity and specificity of this two-stage risk assessment strategy were 70% and 86%, respectively.
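The decision rule described above can be illustrated with a minimal Python sketch. The cut-offs (5.50 and 1.52) are the ones estimated in this study; the function names and the six example patients are hypothetical and serve only to show how the rule and its overall sensitivity and specificity could be computed. This is not the code used for the analysis.

```python
# Cut-offs estimated in this study (Youden's J on the ROC curves).
HEIDELBERG_CUTOFF = 5.50
FIB4_CUTOFF = 1.52


def two_stage_high_risk(heidelberg_score, fib4_index):
    """Return True only if both screening stages classify the patient as high-risk."""
    # Stage 1 (sensitive test): below the cut-off means low risk, stop here.
    if heidelberg_score < HEIDELBERG_CUTOFF:
        return False
    # Stage 2 (specific test): applied only to stage-1 positives.
    return fib4_index >= FIB4_CUTOFF


def sensitivity_specificity(predicted_high_risk, died_within_90_days):
    """Compute sensitivity and specificity from paired boolean lists."""
    pairs = list(zip(predicted_high_risk, died_within_90_days))
    tp = sum(1 for p, d in pairs if p and d)
    fn = sum(1 for p, d in pairs if not p and d)
    tn = sum(1 for p, d in pairs if not p and not d)
    fp = sum(1 for p, d in pairs if p and not d)
    return tp / (tp + fn), tn / (tn + fp)


if __name__ == "__main__":
    # Hypothetical patients: (Heidelberg score, FIB-4 index, died within 90 days).
    patients = [
        (4.0, 2.0, False),
        (6.0, 1.0, False),
        (6.0, 2.0, True),
        (7.0, 1.6, False),
        (5.0, 3.0, True),
        (8.0, 2.5, True),
    ]
    predictions = [two_stage_high_risk(h, f) for h, f, _ in patients]
    outcomes = [d for _, _, d in patients]
    sens, spec = sensitivity_specificity(predictions, outcomes)
    print(f"sensitivity={sens:.2f}, specificity={spec:.2f}")
```

Because a patient must be positive on both tests to be labeled high-risk, a serial strategy of this kind tends to lower sensitivity and raise specificity relative to either test alone, which matches the pattern of the values reported above.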

APRI−FIB-4 model

We also tested the combination of APRI (first screening) and FIB-4 index (second screening) using the same method. This stepwise risk assessment strategy has an overall sensitivity of 77% and an overall specificity of 72%.

Discussion

Extended liver resection is the only curative treatment that can prolong survival in patients with multiple or large tumors14,15,16,17,18,19,20. However, an inadequate liver remnant after EH can cause serious complications such as post-hepatectomy liver failure and poses a significant risk of morbidity and mortality21,22,23. Therefore, proper patient selection is crucial to improve post-EH outcomes24. Considering the importance of patient selection, different preoperative risk assessment scores have been proposed to improve the efficacy of the operation and prognosis. However, these scores have not been comprehensively compared and their sensitivity and specificity have not been simultaneously evaluated in a homogeneous population of patients undergoing EH. Hence, in the current study, we compared well-known risk assessment scores and evaluated their ability to predict mortality in patients undergoing EH.

The results of the present study indicate that, in the absence of EH-specific risk assessment scores, some existing risk assessment methods can predict the outcomes after EH with acceptable discriminatory ability. However, the sensitivity and specificity of these scores were heterogeneous and the scores did not agree in their selection of high-risk patients. The APRI, ALBI grade, Heidelberg score, and FIB-4 index were the most sensitive, and the FIB-4 index was the most specific predictive score for mortality after EH.

The APRI was introduced by Wai et al. in 2003 as a predictive measure for fibrosis and cirrhosis in patients with chronic hepatitis25. Later studies concluded that preoperative APRI values significantly predict post-hepatectomy morbidity and mortality26,27,28. Mai et al.27 recently evaluated 1,044 hepatocellular carcinoma patients who underwent liver resection, and demonstrated that APRI could significantly predict post-hepatectomy outcomes (AUC = 0.743), in agreement with our results. This supports the ability of the APRI to predict mortality after EH.

Hoffman et al. evaluated patient- and procedure-related factors that affect postoperative morbidity and mortality after liver resection. They reviewed the records of 1,796 patients who underwent liver resection, and showed that age, extension of planned liver resection, preoperative platelet count, international normalized ratio (INR), gamma-glutamyl transferase (γ-GT), creatinine levels, histologic tumor diagnosis, and ASA classification were significantly associated with post-surgical morbidity and mortality. They introduced the Heidelberg score29 as a prognostic risk score, which was externally validated in 281 patients and had an AUC of 0.866. Results of the current study showed a similar ability of the Heidelberg score (AUC of 0.793) to predict mortality in patients undergoing EH.

The FIB-4 index is considered a valid measure for assessing liver fibrosis in different liver diseases12,30. Toyoda et al. indicated that this index could also be used to predict long-term post-curative hepatectomy outcomes; they showed that higher FIB-4 measures were associated with a significant increase in mortality12. In the current study, we show that this measure can significantly predict 90-day mortality after EH.

Wang et al. have demonstrated that the ALBI grade can assess the risk of post-hepatectomy mortality in hepatocellular carcinoma patients (AUC = 0.607)13. However, our results suggest that the ALBI is better able to predict mortality after EH (AUC = 0.664).

Breitenstein et al.31 studied 615 hepatectomy cases in a single center. They introduced a calibrated scoring index to predict poor post-hepatectomy outcomes in non-cirrhotic patients that was based on the odds ratios for preoperative AST levels, ASA grade, extension of resection, and extrahepatic procedures during surgery. Their scoring system, including three different risk levels, enhanced the accuracy of decision making by considering individual patient characteristics, and indicated the costs of health care. In the current study, we have shown that the Breitenstein score can predict 90-day mortality after EH, with a sensitivity and specificity above 65%.

The MELD score was originally introduced as a measure of the severity of chronic liver conditions, but can also be used to estimate the risks of hepatectomy in patients with liver metastasis32. In one study33, mortality risk was more than two times higher in patients with a MELD score > 7.24. This index has also been validated for post-hepatectomy complications in cirrhotic patients with hepatocellular carcinoma (AUC = 0.85)33,34. Data from the present study support the significant accuracy of the MELD score in predicting mortality after EH (AUC = 0.690).

The risk scores investigated in this study have been validated in large patient cohorts and were shown to significantly predict outcomes after liver resection. Here, we show that these scores can also partly predict post-EH outcomes. However, the sensitivity and specificity of the different risk scores were heterogeneous, and none could satisfactorily predict mortality in patients who underwent EH. Therefore, we proposed a stepwise (two-stage) risk assessment strategy for predicting mortality in patients undergoing EH. Based on our sensitivity and specificity results, we suggest a two-stage Heidelberg−FIB-4 model, as shown in Fig. 4.

Figure 4

A proposed risk assessment strategy for patients undergoing major liver resection.

There are some limitations to the present study. This is a retrospective study, and its findings should be validated in prospective studies that include only patients undergoing major hepatectomy. Furthermore, the Heidelberg score was developed in the same center, and 3/4 of the patients reported in the present study were part of the cohort used to develop it. However, the Heidelberg score was developed based on a cohort of 1,796 patients who underwent all types of liver resections (minor or major hepatectomy) and not just EH, and it was externally validated in a cohort of 281 patients from another center. Additionally, decisions about surgery based on the proposed scoring system should be made cautiously. Further trials are needed to evaluate the impact of one-stage surgery vs. two-stage surgery or associating liver partition and portal vein ligation for staged hepatectomy (ALPPS) in patients deemed “at-risk”. The present risk stratification can, however, help surgeons to identify at-risk patients before surgery. This will help surgeons prepare special intra- and postoperative care, such as expert intraoperative anesthetic care, individual evaluation of the surgical approach (one- vs. two-stage hepatectomy, transection method, Pringle maneuver, etc.), and postoperative ICU care.

In conclusion, we have shown that no single risk assessment score can satisfactorily predict mortality in patients undergoing EH. Although mortality was predicted with acceptable discrimination, the sensitivity and specificity of the different tests were highly heterogeneous. Therefore, we have proposed a two-stage screening strategy using the risk scores with the highest sensitivity and specificity (Heidelberg−FIB-4 model) for patients undergoing major liver resection. Patient selection strategies differ between hepatobiliary centers across the world, so a multicenter prospective evaluation with larger sample sizes and a simultaneous assessment of predictive risk assessment scores is needed to determine which score can best predict mortality following major liver resection.

Methods

Study design

Relevant data of all consecutive patients who underwent liver resection between January 2001 and January 2019 were retrieved from a prospectively collected database. Only adult patients who underwent EH were included in this study. EH was defined as resection of five or more hepatic segments, based on the Brisbane 2000 classification35. A total of 276 patients were entered in the analysis. Twenty-six patients were excluded because data on the parameters used for all risk scores were missing, leaving 250 patients for the final analyses. This study was approved by the independent ethics committee of the University of Heidelberg (approval number: S-754/2018), which waived the requirement for informed consent because of the retrospective nature of the study. All procedures were conducted in accordance with the most recent revision of the Declaration of Helsinki.

Study endpoints

The primary endpoint of this study was the ability of each test score to predict the risk of postoperative mortality after EH. To assess the discriminatory value of each test, the best cut-off point was evaluated, as well as the sensitivity and specificity of each test at predicting mortality after EH. The secondary endpoint was all-cause death occurring within 90 days after EH. The distribution of major morbidity in different risk groups (low-, intermediate-, and high-risk groups) was also assessed. Major morbidity was defined as any grade IIIb–IV complications (based on the Clavien–Dindo36 classification) that occurred within the first 90 days after surgery.

Determination of well-known risk scores

To find relevant risk scores for predicting mortality after EH, high-impact hepatobiliary surgery or hepatogastroenterology journals were systematically searched. Journals were Annals of Surgery, JAMA Surgery, British Journal of Surgery, Journal of the American College of Surgeons, Journal of Hepato-Biliary-Pancreatic Sciences, Annals of Surgical Oncology, Surgery, Journal of Hepatology, Hepatology, American Journal of Gastroenterology, Liver Cancer, Clinical Gastroenterology and Hepatology, The Lancet Gastroenterology and Hepatology, and Liver International. Risk scoring systems that were reported in one of the above-mentioned journals and were defined or validated in more than 500 patients were identified. The parameters needed to assess each score were evaluated and risk assessment scores that included parameters that were not available in the institutional database were excluded. The included risk assessment scores were the aspartate aminotransferase-to-platelet ratio index (APRI)25, albumin to bilirubin (ALBI) grade13,37, predictive score developed by Breitenstein et al.31, liver fibrosis (FIB-4) index12,30, Heidelberg score29, and MELD score33. Risk assessment scores are presented in Table 3.
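For orientation, four of the included scores can be computed directly from routine laboratory values. The sketch below uses the commonly published formulas, which are assumed here to correspond to the definitions summarized in Table 3 (not reproduced in this text). The Heidelberg and Breitenstein scores are omitted because their point tables are not given here. Function names, parameter names, and units are illustrative assumptions, and the lower bound of 1.0 and creatinine cap of 4.0 mg/dL applied in the MELD calculation follow the usual UNOS convention rather than anything stated in this study.

```python
import math


def apri(ast_u_l, ast_upper_limit_u_l, platelets_10e9_l):
    """AST-to-platelet ratio index: (AST / upper limit of normal) / platelets * 100."""
    return (ast_u_l / ast_upper_limit_u_l) / platelets_10e9_l * 100.0


def fib4(age_years, ast_u_l, alt_u_l, platelets_10e9_l):
    """FIB-4 index: (age * AST) / (platelets * sqrt(ALT))."""
    return (age_years * ast_u_l) / (platelets_10e9_l * math.sqrt(alt_u_l))


def albi(bilirubin_umol_l, albumin_g_l):
    """ALBI score: 0.66 * log10(bilirubin in umol/L) - 0.085 * albumin in g/L."""
    return 0.66 * math.log10(bilirubin_umol_l) - 0.085 * albumin_g_l


def meld(bilirubin_mg_dl, inr, creatinine_mg_dl):
    """Original MELD score from bilirubin, INR, and creatinine."""
    # Assumption: values below 1.0 are set to 1.0 and creatinine is capped at 4.0.
    bili = max(bilirubin_mg_dl, 1.0)
    inr_value = max(inr, 1.0)
    crea = min(max(creatinine_mg_dl, 1.0), 4.0)
    return (3.78 * math.log(bili)
            + 11.2 * math.log(inr_value)
            + 9.57 * math.log(crea)
            + 6.43)


if __name__ == "__main__":
    # Hypothetical patient values, for illustration only.
    print("APRI :", round(apri(40, 40, 200), 2))
    print("FIB-4:", round(fib4(60, 40, 25, 200), 2))
    print("ALBI :", round(albi(10, 40), 2))
    print("MELD :", round(meld(1.0, 1.0, 1.0), 2))
```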

Statistical analysis

Statistical analysis was performed using IBM SPSS Statistics for Windows, Version 24.0 (IBM Corp. Released 2013. Armonk, NY). Categorical data are presented as frequencies and proportions, and continuous data as means ± standard deviations. Categorical data were compared using the chi-square test of association or Fisher’s exact test. Continuous data were compared using Student’s t-test. The proportions of morbidity and mortality across the values of each risk assessment score were calculated and best curve estimations were made. The rates of major morbidity and 90-day mortality were compared between low-, intermediate-, and high-risk groups defined by each risk assessment score. ROC curve analysis with a diagonal reference line was used to compute the cut-off value that best discriminates the 90-day mortality risk between groups, as well as the sensitivity and specificity of each risk assessment score for 90-day mortality. Cut-off points were identified by Youden’s J statistic. Pairwise comparison of the AUCs of the risk assessment scores was performed using the method of DeLong et al.38 in MedCalc version 19.0.3 (MedCalc Software, Inc., Mariakerke, Belgium). A two-sided p value less than 0.05 was considered significant in all analyses.
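The cut-off selection described above can be sketched in a few lines of standard-library Python. The analyses in this study were run in SPSS and MedCalc, so the code below is only an illustration of the technique: Youden’s J is sensitivity + specificity − 1, and the cut-off with the largest J is chosen. The AUC is computed here as the proportion of (event, non-event) pairs in which the event case has the higher score, with ties counted as one half. Function names and the example data are hypothetical, and the sketch uses observed score values as candidate thresholds, which may differ from how statistical packages place thresholds between observations.

```python
def roc_points(scores, events):
    """Return (threshold, sensitivity, specificity) for each observed score value.

    A patient is classified as positive when score >= threshold.
    """
    n_pos = sum(1 for e in events if e)
    n_neg = len(events) - n_pos
    points = []
    for t in sorted(set(scores)):
        tp = sum(1 for s, e in zip(scores, events) if e and s >= t)
        tn = sum(1 for s, e in zip(scores, events) if not e and s < t)
        points.append((t, tp / n_pos, tn / n_neg))
    return points


def youden_cutoff(scores, events):
    """Return the (threshold, sensitivity, specificity) that maximizes Youden's J."""
    return max(roc_points(scores, events), key=lambda p: p[1] + p[2] - 1.0)


def auc(scores, events):
    """Area under the ROC curve via pairwise comparison of events and non-events."""
    pos = [s for s, e in zip(scores, events) if e]
    neg = [s for s, e in zip(scores, events) if not e]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Hypothetical score values and 90-day mortality flags.
    example_scores = [0.8, 1.0, 1.2, 1.5, 1.7, 2.0, 2.4, 3.0]
    example_events = [False, False, False, True, True, False, True, True]
    threshold, sens, spec = youden_cutoff(example_scores, example_events)
    print(f"cut-off={threshold}, sensitivity={sens:.2f}, specificity={spec:.2f}")
    print(f"AUC={auc(example_scores, example_events):.3f}")
```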