Combining neural network predictions for medical diagnosis

https://doi.org/10.1016/S0010-4825(02)00006-9

Abstract

We present our results from combining the predictions of an ensemble of neural networks for the diagnosis of hepatobiliary disorders. To improve the accuracy of the diagnosis, we train the second level networks using the outputs of the first level networks as input data. The second level networks achieve an accuracy that is higher than that of the individual networks in the first level. Compared to the simple method which averages the outputs of the first level networks, the second level networks are also more accurate. We discuss how the overall predictive accuracy can be improved by introducing bias during the training of the level one networks.

Introduction

Many authors have shown that combining the predictions of several models often results in a prediction accuracy that is higher than that of the individual models. The general framework for predicting using an ensemble of models consists of two levels and is often referred to as stacked generalization [1]. In the first level, various learning methods are used to learn different models from the original data set. The predictions of the models from the first level along with the corresponding target class of the original input data are then used as inputs to learn a second level model.
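The two-level structure described above can be sketched in a few lines of Python. This is only an illustration of how a level-two training set could be assembled, not the procedure used in this paper: the names `build_level_two_data`, `model_a`, and `model_b` are hypothetical, and the two toy functions stand in for trained level-one models.

```python
from typing import Callable, List, Sequence, Tuple

# A level-one model maps an input vector to a vector of class scores.
Model = Callable[[Sequence[float]], List[float]]


def build_level_two_data(
    models: Sequence[Model],
    inputs: Sequence[Sequence[float]],
    targets: Sequence[int],
) -> List[Tuple[List[float], int]]:
    """Pair the concatenated level-one predictions with the original target class."""
    rows = []
    for x, target in zip(inputs, targets):
        features: List[float] = []
        for model in models:
            # Each model contributes its full output vector to the level-two input.
            features.extend(model(x))
        rows.append((features, target))
    return rows


# Hypothetical stand-ins for two trained level-one models (two classes each).
def model_a(x: Sequence[float]) -> List[float]:
    return [1.0, 0.0] if x[0] > 0.5 else [0.0, 1.0]


def model_b(x: Sequence[float]) -> List[float]:
    return [1.0, 0.0] if x[1] > 0.5 else [0.0, 1.0]


level_two_data = build_level_two_data(
    [model_a, model_b],
    inputs=[[0.9, 0.1], [0.2, 0.3]],
    targets=[0, 1],
)
```

Each row of `level_two_data` holds the level-one outputs for one sample together with that sample's target class, which is the form of data a second level model would be trained on.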

As neural networks are among the most popular models for pattern classification, the literature contains numerous papers reporting theoretical and experimental results on combining neural network predictions. Among the second level models proposed for combining the network predictions are the simple averaging method and the generalized ensemble method [2], and the weighted least-squares method [3]. In these methods, the second level model is simply a weighted combination of the predictions of the component networks in the first level. The three methods differ in how they compute the ensemble weights given to the component networks. While the ensemble weights for the simple averaging method are equal for all component networks, the generalized ensemble weights depend on the correlation matrix of the errors of the component networks. In the weighted least-squares method, the weights are computed as the product of the component networks’ outputs and the target vector of the training samples.
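For concreteness, the weighted combination and two of the weighting schemes can be written out as follows. The notation is an assumption introduced here for illustration: N component networks with outputs f_i(x), target t(x), weights \alpha_i, and error correlation matrix C. The generalized ensemble weights are given in the form commonly stated in the literature, which may differ in detail from the formulation in [2].

```latex
% Weighted combination of the N component networks
\hat{y}(x) = \sum_{i=1}^{N} \alpha_i f_i(x), \qquad \sum_{i=1}^{N} \alpha_i = 1

% Simple averaging: equal weights
\alpha_i = \frac{1}{N}

% Generalized ensemble method: weights from the error correlation matrix C
\alpha_i = \frac{\sum_{j=1}^{N} \left(C^{-1}\right)_{ij}}
                {\sum_{k=1}^{N} \sum_{j=1}^{N} \left(C^{-1}\right)_{kj}},
\qquad
C_{ij} = \mathrm{E}\!\left[ e_i(x)\, e_j(x) \right],
\qquad
e_i(x) = t(x) - f_i(x)
```

When the errors of the component networks are uncorrelated and have equal variance, C is a multiple of the identity matrix and the generalized ensemble weights reduce to the simple averaging weights 1/N.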

The accuracy of the different methods for combining regularized neural networks has been compared on a breast cancer database [4]. The regularized neural networks investigated are networks that have been trained to minimize a cost function consisting of the sum of squared errors and a quadratic penalty term on the network weights. The first level models are neural networks that have been initialized with different random weights and neural networks that have been trained using different subsets of the data. The different data subsets are obtained by randomly drawing samples from the original data set with replacement. The second level models used include the simple averaging method and a variance-based weighting method of the first level neural networks.
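Drawing samples with replacement, as described above, can be sketched as follows. This is a minimal illustration rather than the procedure of [4]: the function names are hypothetical, and the choice to make each subset the same size as the original data set is an assumption, since the subset size is not stated here.

```python
import random
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def bootstrap_sample(data: Sequence[T], rng: random.Random) -> List[T]:
    """Draw len(data) items uniformly at random, with replacement."""
    # Assumption: the subset has the same size as the original data set.
    return rng.choices(data, k=len(data))


def bootstrap_subsets(data: Sequence[T], n_subsets: int, seed: int = 0) -> List[List[T]]:
    """Build n_subsets resampled data sets, one per network to be trained."""
    rng = random.Random(seed)  # fixed seed so the subsets are reproducible
    return [bootstrap_sample(data, rng) for _ in range(n_subsets)]
```

Because sampling is done with replacement, a subset typically contains some samples more than once and omits others, which is what makes the networks trained on different subsets differ.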

Another application of the network ensemble approach to the diagnosis of breast cancer has also been reported recently [5]. The network ensemble is adapted so that it is less likely to make a false positive diagnosis (a malignant diagnosis for benign data). The adaptation is achieved by training neural networks using different proportions of malignant to benign data. The first level models are two groups of neural networks. The networks in the first group have been trained with a greater proportion of benign samples, while those in the second group have been trained with a greater proportion of malignant samples. The second level model is a threshold decision mechanism which, based on an empirically determined threshold, decides whether the output of the first group or the second group is to be taken as the final output.

In this paper, we present our experimental results on combining neural network predictions for the diagnosis of hepatobiliary disorders. The data have been collected from a total of 536 patients who were admitted to a university-affiliated hospital in Japan. Nine real-valued measurements from biomedical tests were obtained from these patients. The hepatobiliary disorders alcoholic liver damage (ALD), primary hepatoma (PH), liver cirrhosis (LC), and cholelithiasis (C) constitute the four output classes. Because there are four possible outcomes of a diagnosis, for the first level models we have used four sets of neural networks. Networks in each set have been trained so that they are likely to be more accurate for one type of disorder than the other three disorders. The predictions of the networks in the first level are combined by a second level neural network. We have been able to achieve significant improvement in accuracy by applying neural networks as the second level model compared to the simple averaging method.

The outline of this paper is as follows. In Section 2, we describe in more detail the data that have been collected, along with the neural network topology used. In Section 3, we present the results of our experiments using neural networks to combine the predictions of the first level networks. Finally, in Section 4 we conclude the paper.

Section snippets

The data set

The hepatobiliary disorder data set contains 536 samples with nine input attributes. The attributes correspond to measurements from biomedical tests. They are glutamic oxaloacetic transaminase (GOT), glutamic pyruvic transaminase (GPT), lactate dehydrogenase (LDH), gamma glutamyl transpeptidase (GGT), blood urea nitrogen (BUN), mean corpuscular volume of red blood cells (MCV), mean corpuscular hemoglobin

Simple averaging

Simple averaging of the predictions has been known to improve on the performance of the individual predictions.

Table 4 shows the accuracy obtained by averaging the predictions from N networks, where N is 5, 10, or 15. The accuracy rates are averaged over five groups of randomly selected N networks from the 30 networks that we have trained. From the figures in this table, we see that there is a 1% increase in predictive accuracy over the average accuracy of the individual networks. When we also
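The averaging of N network outputs and the resulting accuracy can be sketched as below. This is an illustration only: the function names and the data layout (`per_network_outputs[k][s]` is the output vector of network `k` on sample `s`) are assumptions, and taking the class with the largest averaged output as the prediction is a common convention that is assumed here rather than taken from the paper.

```python
from typing import List, Sequence


def average_outputs(outputs: Sequence[Sequence[float]]) -> List[float]:
    """Average the output vectors of several networks, one value per class."""
    n = len(outputs)
    return [sum(column) / n for column in zip(*outputs)]


def predict_class(outputs: Sequence[Sequence[float]]) -> int:
    """Return the index of the class with the largest averaged output."""
    avg = average_outputs(outputs)
    return max(range(len(avg)), key=avg.__getitem__)


def ensemble_accuracy(
    per_network_outputs: Sequence[Sequence[Sequence[float]]],
    labels: Sequence[int],
) -> float:
    """Fraction of samples whose averaged prediction matches the label."""
    correct = 0
    for s, label in enumerate(labels):
        # Collect every network's output vector for sample s.
        sample_outputs = [network[s] for network in per_network_outputs]
        if predict_class(sample_outputs) == label:
            correct += 1
    return correct / len(labels)
```

To mirror the experiment described above, one could draw a random group of N networks from the trained pool (for example with `random.sample`), pass their outputs to `ensemble_accuracy`, and average the result over several such groups.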

Summary

We proposed the use of neural networks to combine the predictions of a neural network ensemble that has been trained for diagnosing hepatobiliary disorders. To generate networks with differing error patterns, we generated biased networks by training the networks in four separate groups. Networks in each group were trained with different targets. The learning targets were modified so that the trained networks would predict one particular disorder with higher accuracy than the other three




Yoichi Hayashi received the B.E. degree in Management Science, and the M.E. and Dr. Eng. degrees in Systems Engineering, all from the Science University of Tokyo, Japan, in 1979, 1981, and 1984, respectively. He joined Ibaraki University, Japan, as an Assistant Professor in 1986 and was a Visiting Professor at the University of Alabama at Birmingham and University of Canterbury for 10 months. Currently, he is a Professor of Computer Science at Meiji University, Japan. He has published 140 papers in academic journals and international conference proceedings in the fields of computer and information sciences. His current research interests include artificial neural networks, fuzzy logic, soft computing, expert systems, computational intelligence, data mining and medical informatics. Dr. Hayashi is an Associate Editor of IEEE Transactions on Fuzzy Systems. He is a member of the IEEE, ACM, AAAI, IFSA, INNS, NAFIPS, IPSJ and EICE.

Rudy Setiono received the B.S. degree in Computer Science from Eastern Michigan University in 1984, and the M.S. and Ph.D. degrees in Computer Science from the University of Wisconsin-Madison in 1986 and 1990, respectively. Since August 1990, he has been with the National University of Singapore, where he is currently an Associate Professor in the Information Systems Department, School of Computing. He is an IEEE Senior Member and an Associate Editor of IEEE Transactions on Neural Networks.
