Machine Learning Prediction Models for Diagnosing Hepatocellular Carcinoma with HCV-related Chronic Liver Disease

https://doi.org/10.1016/j.cmpb.2020.105551

Highlights

  • Machine-learning techniques can estimate HCC development risk with very high accuracy.

  • Independent factors building the models: age, AFP, ALP, albumin, and total bilirubin.

  • Multi-linear regression, AD Tree, REP-Tree, and the CART algorithm were compared.

  • The alternating decision tree showed the best performance: accuracy 95.6% and AUC 99.0%.

Abstract

Background and Objective

Hepatocellular carcinoma (HCC), one of the most common types of liver malignancy, needs to be assessed in a non-invasive way. The objective of the current study is to develop prediction models for chronic hepatitis C (CHC)-related HCC using machine learning techniques.

Methods

A dataset of 4,423 CHC patients was investigated to identify the significant parameters for predicting HCC presence. Several machine learning techniques (classification and regression tree, alternating decision tree, reduced error pruning tree, and linear regression) were used to build classification models for predicting HCC presence.
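As an illustration of how a classification and regression tree chooses a split, the following minimal sketch searches one numeric attribute for the threshold that minimizes weighted Gini impurity. It is not the authors' implementation: the function names `gini` and `best_split`, the AFP values, and the labels are all hypothetical and were invented for this example.

```python
from typing import List, Tuple


def gini(labels: List[int]) -> float:
    """Gini impurity of binary labels (0 = no HCC, 1 = HCC)."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2.0 * p * (1.0 - p)


def best_split(values: List[float], labels: List[int]) -> Tuple[float, float]:
    """Return (threshold, weighted Gini) of the best 'value <= threshold' split."""
    pairs = sorted(zip(values, labels))
    n = len(pairs)
    best_thr, best_score = float("nan"), float("inf")
    for i in range(1, n):
        lo, hi = pairs[i - 1][0], pairs[i][0]
        if lo == hi:
            continue  # identical values cannot be separated by a threshold
        left = [y for _, y in pairs[:i]]
        right = [y for _, y in pairs[i:]]
        score = (len(left) * gini(left) + len(right) * gini(right)) / n
        if score < best_score:
            best_thr, best_score = (lo + hi) / 2.0, score
    return best_thr, best_score


# Hypothetical AFP values (ng/mL) and HCC labels, invented for illustration only.
afp = [3.0, 5.0, 8.0, 12.0, 40.0, 150.0, 400.0, 900.0]
hcc = [0, 0, 0, 0, 1, 1, 1, 1]

threshold, impurity = best_split(afp, hcc)
print(threshold, impurity)  # 26.0 0.0 on this toy data
```

A full tree repeats this search over every attribute and recurses into each resulting subset; the toy data here is perfectly separable, which real clinical data rarely is.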

Results

Age, alpha-fetoprotein (AFP), alkaline phosphatase (ALP), albumin, and total bilirubin were found to be statistically associated with HCC presence. Several HCC classification models were constructed using the machine learning algorithms. The proposed models provide a high area under the receiver operating characteristic curve (AUROC) and high accuracy of HCC diagnosis: AUROC ranges from 95.5% to 99%, and overall accuracy from 93.2% to 95.6%.
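The two reported metrics measure different things: accuracy depends on a decision cut-off, whereas AUROC is the probability that a randomly chosen HCC case receives a higher score than a randomly chosen non-HCC case. The sketch below computes both from scratch. The scores, labels, and the 0.5 cut-off are hypothetical and are not taken from the study.

```python
from typing import List


def accuracy(y_true: List[int], y_pred: List[int]) -> float:
    """Fraction of cases whose predicted label matches the true label."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def auroc(y_true: List[int], scores: List[float]) -> float:
    """Pairwise AUROC: share of (positive, negative) pairs ranked correctly.

    Tied scores count as half a correct ranking.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical model scores for eight patients (1 = HCC), for illustration only.
y_true = [0, 0, 0, 0, 1, 1, 1, 1]
scores = [0.1, 0.2, 0.3, 0.6, 0.4, 0.7, 0.8, 0.9]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 cut-off

acc = accuracy(y_true, y_pred)
auc = auroc(y_true, scores)
print(acc, auc)  # 0.75 0.9375 on this toy data
```

The pairwise loop is quadratic in the number of cases, which is fine for a sketch; rank-based formulas give the same value more efficiently on large cohorts.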

Conclusion

Models built on a few simple factors can predict the existence of HCC with outstanding performance.

Introduction

The primary cause of chronic hepatitis is infection with hepatitis C virus (HCV), which is also a common predisposing factor for the development of hepatocellular carcinoma (HCC) [1]. HCC is a malignant tumour of the liver [2]. It is the fifth-most common cancer in the world and the third-most common cause of death from cancer [3]. In Egypt, it is the most frequent malignant tumour in men and the second-most frequent in women [4]. HCC is the leading cause of mortality from malignant tumours in Egypt and represents 32.35% of total cancer deaths. HCC incidence has increased from 7.3% of total cases of malignant tumours in 2003 to 19.7% in 2018 [5,6]. This rising incidence may be due to the high prevalence of chronic hepatitis C infection and its associated complications [7,8].

HCC risk increases concurrently with progression of liver fibrosis. It is therefore important to monitor for HCC among patients with advanced fibrosis. Cancer in the liver is generally diagnosed using tri-phasic computed tomography (CT) and magnetic resonance imaging (MRI) [9]. Repeat exams are often necessary, but this can be costly for patients and difficult in countries that lack resources. Cross-sectional studies have identified potential factors that are correlated with an elevated risk of HCC, including demographic (e.g., gender, age), virus-related (e.g., serum HCV level), and disease-related (e.g., alpha-fetoprotein [AFP] level, presence of cirrhosis) factors. However, most of these studies involve a limited number of participants [10], [11], [12].

Machine learning approaches can enhance clinical decision support by offering less time-consuming but still accurate and effective early prediction of fibrosis and liver cancer [13,14]. Using artificial intelligence and statistical analysis to predict and recognize patterns in enormous datasets, machine learning algorithms can be used to predict hepatic diseases [10], [11], [12], [13], [14], [15]. For example, Wen et al. tabulated risk predictors of HCC using the Cox proportional hazards regression [10]. This method uses age, sex, health history, hepatitis B and C virus status, and serum levels of aspartate aminotransferase, alanine aminotransferase, and AFP as statistically significant independent predictors of HCC. In Chang et al. [11], a Cox regression indicated that old age, high AFP, low platelet counts, and advanced fibrosis are independent risk features of HCC.

This study aimed to determine the risk factors for HCC among patients with HCV with advanced fibrosis. The study used different decision-tree learning techniques with machine learning to develop an accurate estimation score for HCC development, as determined by the proposed independent risk factors. It also focused on patients from Egypt. Data were gathered by specialists from Kasr Al-Aini Hospital, Cairo University, Egypt.

Patients and Data

This retrospective study used a dataset of 4,423 patients (after filtering), all of whom were diagnosed with HCV of genotype 4 with advanced fibrosis. Data were collected from two institutes in Egypt: the Egyptian National Committee for the Control of Viral Hepatitis and the multidisciplinary HCC clinic at Cairo University's Kasr Al-Aini Hospital.

For HCV patients without HCC, a cohort of 3,099 (1,003 women and 2,096 men) with chronic hepatitis C infection was selected from patients enrolled in

Results

The dataset included 4,423 patients (3,104 male and 1,319 female) aged 16 to 80 years with HCV and advanced fibrosis. Of those, 1,324 had HCC. Patients with mild to moderate fibrosis are unlikely to have liver cancer and thus were excluded from this study. Table 1 summarizes the statistical analysis and reports the baseline characteristics of patients as the mean ± SD, unless otherwise stated.
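The cohort counts can be cross-checked with simple arithmetic, which also gives a reference point for reading the reported accuracies. The sketch below uses only the counts stated in the text (including the 1,003 women and 2,096 men reported for the non-HCC cohort). The per-sex HCC counts and the majority-class baseline are derived here, not reported by the authors, and assume the non-HCC cohort described under Patients and Data is the same one included in the final dataset.

```python
# Counts stated in the text.
total = 4423
male, female = 3104, 1319
hcc = 1324
non_hcc_men, non_hcc_women = 2096, 1003

# Derived quantities (not reported in the source).
non_hcc = total - hcc                # patients without HCC
hcc_men = male - non_hcc_men         # men with HCC
hcc_women = female - non_hcc_women   # women with HCC

# Consistency checks across the reported figures.
assert male + female == total
assert non_hcc_men + non_hcc_women == non_hcc
assert hcc_men + hcc_women == hcc

prevalence = hcc / total                 # share of the dataset with HCC
baseline = max(hcc, non_hcc) / total     # accuracy of always predicting the majority class

print(non_hcc, hcc_men, hcc_women)       # 3099 1008 316
print(f"{prevalence:.1%} {baseline:.1%}")  # 29.9% 70.1%
```

Under these assumptions, a classifier that always predicts "no HCC" would already be about 70% accurate on this dataset, so accuracy is best read against that baseline rather than against 50%.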

In this study, the filters method was used to pre-process the data, and then the learning

Discussion

Egypt has the highest prevalence of HCV worldwide (18%), with genotype-4a accounting for almost 90% of infections [29]. The major cause of HCC is HCV. Every 30 seconds, one person in the world dies from liver cancer [30], underscoring the need for HCC surveillance for patients with advanced fibrosis. In recent years, machine-learning techniques have been used to predict HCC risk. The machine-learning process extracts valuable information from a dataset and transforms it into logical structures

Conclusion

Machine-learning techniques can estimate the risk of HCC development with high accuracy. These methods offer efficient and non-invasive ways for physicians to diagnose and monitor patients with HCC. Age, AFP, ALP, albumin, and total bilirubin were found to be strongly correlated with the presence of HCC. This study assessed the performance of multi-linear regression, classification and regression tree, alternating decision tree, and reduced error pruning tree methods in identifying HCC

Declaration of competing interest

The authors declare that there is no conflict of interest regarding the publication of this paper.

Acknowledgement

Special appreciation goes to the multidisciplinary HCC clinic at Kasr Al-Aini Hospital, Cairo University, Egypt, for providing the patient dataset.

References (31)

  • Somaya Hashem et al.

    Accurate Prediction of Advanced Liver Fibrosis Using the Decision Tree Learning Algorithm in Chronic Hepatitis C Egyptian Patients

    Gastroenterology Research and Practice

    (2016)
  • Somaya Hashem et al.

    A Simple multi-linear regression model for predicting fibrosis scores in chronic Egyptian hepatitis C virus patients

    International Journal of Bio-Technology and Research (IJBTR)

    (2014 Jun)
  • Chi-Pang Wen et al.

    Hepatocellular Carcinoma Risk Prediction Model for the General Population: The Predictive Power of Transaminases

    Journal of the National Cancer Institute

    (2012)
  • Kuo-Chin Chang et al.

    A novel predictive score for hepatocellular carcinoma development in patients with chronic hepatitis C after sustained response to pegylated interferon and ribavirin combination therapy

    Journal of Antimicrobial Chemotherapy

    (August 16, 2012)
  • Dalia Abd El Hamid Omran et al.

    Application of Data Mining Techniques to Explore Predictors of HCC in Egyptian Patients with HCV-related Chronic Liver Disease

    Asian Pacific Journal of Cancer Prevention

    (2015)
Co-Last authors.