Synergy effects between grafting and subdivision in Re-RX with J48graft for the diagnosis of thyroid disease
Introduction
Thyroid diseases affect about 200 million people worldwide, which accounts for nearly 15% of the adult population. In the United States, about 27 million people are affected by thyroid diseases, half of whom remain undiagnosed. The most common thyroid diseases lead to thyroid dysfunction. About 80% of cases are diagnosed as hypothyroidism, a condition in which the thyroid gland, a small, butterfly-shaped gland in the lower part of the neck, is underactive and incapable of producing adequate levels of thyroid hormone; the remaining 20% are diagnosed as hyperthyroidism, a condition in which the thyroid gland is overactive and produces an excess of thyroid hormone. The thyroid gland produces two active hormones, triiodothyronine (T3) and levothyroxine (LT4), that play a variety of important roles in the human body and are vital in the production of proteins and the regulation of body temperature. Because thyroid function affects every major organ in the body, thyroid disorders should be taken very seriously. The thyroid gland is susceptible to a number of very distinct problems, some of which are extremely common. Underproduction of thyroid hormone can lead to hypothyroidism, whereas overproduction can result in hyperthyroidism. Both of these disorders are relatively common among the general population [1]. To help cells convert oxygen and nutrients into energy, T3 and LT4 production must fall within normal ranges; these ranges are usually based on the concentration of thyroid hormone in the blood. In the present study, normal ranges were calculated for the following: T3, LT4, thyroid stimulating hormone (TSH), thyroxine (T4) utilization rate, and free thyroxine index (FTI) [2]. Symptoms related to thyroid diseases can be easily confused with those of other conditions, making thyroid diseases difficult to diagnose. Fortunately, a TSH test can identify thyroid disorders even before symptom onset [3].
Thyroid function diagnosis is a three-class classification problem. Numerous supervised methods for diagnosing thyroid disease have been successfully applied to the classification of different tissues. These methods include the following: extreme learning machines (ELMs) [1]; support vector machines (SVMs) [4], [5], [6]; neural networks (NNs) [7], [8], [9], [10], [11], [12]; decision trees (DTs) [7]; k-nearest neighbor (kNN) classifiers [13]; fuzzy classifiers (FCs) [14], [15]; hybrid case-based reasoning [16]; mixture-of-expert models [17]; immune algorithms [18]; immune recognition systems [19]; neuro-fuzzy expert systems [20]; differential evolution [21]; classification [22], [23]; and discriminant analysis [24].
However, the majority of methods currently used to diagnose thyroid disease are black-box models that are unable to satisfactorily reveal information hidden in the data. For example, even if a method assigns instances to groups accurately, it gives users no information about the reasoning underlying that assignment. Therefore, systems and/or algorithms that can provide insight into these underlying rationales are highly desirable. Among supervised methods, rule extraction studies are becoming increasingly popular due to their capability of providing such explanations. However, extracted rules must have a high level of accuracy, particularly in the medical setting, and yet still be simple and easy to understand.
Some researchers have experimented with extracting Boolean rules from NNs [25], [26], [27], which has led to encouraging results that exhibit good performance, a reduced number of rules, relevant input variables, and increased interpretability. However, because these methods use Boolean rules, they do not extract continuous rules.
The Recursive-Rule Extraction algorithm with J48graft (Re-RX with J48graft) was first proposed in 2015 [28], and its considerable effectiveness on financial [29] and medical [30], [31] datasets has since been confirmed. Re-RX with J48graft is based on the Re-RX algorithm [32]. In the Re-RX algorithm, a C4.5 DT [33] is employed in a recursive manner, while multi-layer perceptrons (MLPs) are trained using backpropagation NNs (BPNNs) and then pruned [34], which yields more efficient MLPs for highly accurate rule extraction. The Re-RX algorithm repeats a cascade of BPNN training, pruning, and C4.5.
Re-RX with J48graft, a white-box model, can provide highly accurate and concise classification, and it can be easily explained and interpreted through the concise rules it extracts in IF-THEN form. Because this type of model is easy to understand, it is often preferred by physicians and clinicians.
Recently, I proposed using both discrete and continuous attributes to generate a DT in the Re-RX algorithm framework (hereafter Continuous Re-RX [35]). Although this seems to run counter to the design concept of the Re-RX algorithm, in that it results in the generation of a more complex DT, both types of attributes are used in order to enhance accuracy [35].
Typically, the accuracy of each extracted classification rule is assessed by the number of correctly classified test samples, while the interpretability is assessed by the number of extracted rules and the average number of antecedents in the extracted rules; however, for extracted classification rules, both accuracy and interpretability should be considered.
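The three measures named above can be made concrete with a minimal Python sketch. Everything in it is an illustrative assumption rather than material from the study: the `Rule` class, the helper functions, the thresholds, and the four toy samples are all hypothetical, and the rules are not the ones extracted for the Thyroid dataset. The sketch treats a rule set as an ordered list of IF-THEN rules with a default class, then reports test accuracy, the number of rules, and the average number of antecedents.

```python
import operator
from dataclasses import dataclass

# Comparison operators allowed in an antecedent (illustrative subset).
OPS = {"<=": operator.le, ">": operator.gt}


@dataclass(frozen=True)
class Rule:
    """One IF-THEN rule: all antecedents must hold for the label to apply."""
    antecedents: tuple  # ((attribute, op, threshold), ...)
    label: int

    def matches(self, sample):
        return all(OPS[op](sample[attr], thr) for attr, op, thr in self.antecedents)


def predict(rules, sample, default):
    """Return the label of the first matching rule, else the default class."""
    for rule in rules:
        if rule.matches(sample):
            return rule.label
    return default


def accuracy(rules, samples, labels, default):
    """Fraction of samples whose predicted class equals the true class."""
    correct = sum(predict(rules, s, default) == y for s, y in zip(samples, labels))
    return correct / len(samples)


def average_antecedents(rules):
    """Mean number of conditions per rule (an interpretability measure)."""
    return sum(len(r.antecedents) for r in rules) / len(rules)


# Hypothetical rule set and toy test samples; the values are made up.
RULES = [
    Rule((("TSH", "<=", 0.5),), 3),
    Rule((("TSH", ">", 0.5), ("FTI", "<=", 0.4)), 1),
]
DEFAULT = 2
SAMPLES = [
    {"TSH": 0.2, "FTI": 0.9},
    {"TSH": 0.8, "FTI": 0.3},
    {"TSH": 0.9, "FTI": 0.7},
    {"TSH": 0.3, "FTI": 0.2},
]
LABELS = [3, 1, 2, 1]

if __name__ == "__main__":
    print("test accuracy:", accuracy(RULES, SAMPLES, LABELS, DEFAULT))
    print("# rules:", len(RULES))
    print("ave. # ante.:", average_antecedents(RULES))
```

Reporting the three numbers side by side shows the trade-off: adding rules or antecedents may raise accuracy on the toy samples, but it lowers interpretability under these measures.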
Results regarding the extraction of classification rules for the diagnosis of thyroid disease have been reported in a number of studies [5], [11], [14], [15]. Among these, Duch et al. [11] reported highly accurate classification for the Thyroid dataset and provided two types of relatively simple and concrete classification rule sets.
The present study aimed to elucidate the synergy effects of grafting in J48graft and subdivision in Re-RX with J48graft, which work effectively in combination to extract highly accurate and concise classification rules for the diagnosis of thyroid disease. This paper demonstrates why highly accurate and concise classification rules can be extracted by grafting and subdivision, respectively, and provides a theoretical explanation of the synergy effects. The accuracy and characteristics of these rules for the Thyroid dataset, expressed as IF-THEN rules, are also compared concretely with those of the rules extracted by Continuous Re-RX [35] and by both types of rule extraction algorithms proposed by Duch et al. [11].
The Thyroid dataset is a multi-class and highly imbalanced medical dataset that comprises both discrete and continuous, i.e., mixed, attributes. It was obtained from the University of California Irvine (UCI) Machine Learning Repository [36]. Two versions of the Thyroid dataset have been used for benchmarking in previous studies. One version comprises 7200 samples [11], [14], [15], [17], while the other comprises 215 samples [1], [3], [5], [6], [16], [17], [18], [19], [20], [21].
Extreme learning machines (ELMs)
ELMs [37], which are based on the structure of MLPs, represent a new rapid learning method. The most noteworthy aspect of ELM training is that it is carried out by setting the network weights randomly to obtain the inverse of the hidden-layer output matrix. The most advantageous characteristics of this technique are its simplicity, in that it makes the training algorithm extremely fast, and its outstanding performance, in that it usually performs better than other established methods, such as
Thyroid dataset and experimental setup
The Thyroid dataset is a large medical dataset composed of screening tests for thyroid symptoms. The training and test data in the set comprise 3772 and 3428 medical records, respectively, in which the thyroid is classified as normally functioning (Class 3), under-functioning (primary hypothyroidism, Class 1), or overactive (hyperthyroidism, Class 2). Hyper- and hypothyroidism represent 2.3% (166 cases) and 5.1% (367 cases) of the dataset, respectively; the remaining 92.6% (6667 cases) is
Performance
A BPNN was trained on the Thyroid dataset to obtain the maximum training accuracy (TR ACC), the maximum test accuracy (TS ACC), the number of extracted rules (# rules), the average number of antecedents (Ave. # ante.), and the area under the receiver operating characteristic curve [52] (Table 1). To the best of my knowledge, there have been no reports in the rule extraction and classification literature regarding thyroid TS ACC based on the k-fold cross-validation [53] method. Therefore,
Characteristics of rules extracted for the thyroid dataset
A comparison of the four rule sets revealed characteristic class differences between each rule. All classes appeared in the rules extracted in the present study at least once. It is presumed that this was due to the characteristics of the Thyroid dataset, which primarily consists of Class 3 samples. Given that assumption, correctly classifying Class 3 samples in detail is necessary.
The rule set shown in Fig. 5 (before subdivision) defines Class 3 using only one rule with one attribute (TSH),
Synergy effects between grafting and the subdivision in Re-RX with J48graft
This section investigates the synergy effects between grafting and subdivision in Re-RX with J48graft, which work effectively in combination to extract highly accurate and concise classification rules.
Medical significance of the present research
Historically, the degree and burden of suffering associated with hypo- and hyperthyroidism have remained unclear. Symptoms of these conditions have lacked intersubjective validity, and sufferers have faced stigmatization, making it difficult to construct a meaningful identity as a person with an illness [58].
The clinical symptoms and signs of these conditions often lack specificity, and TSH and thyroid hormone measurements are crucial for diagnosis and monitoring. Minor abnormalities in thyroid
Conclusion
The present study elucidates the synergy effects of grafting in J48graft and subdivision in Re-RX with J48graft, which work effectively in combination to extract highly accurate and concise classification rules for the diagnosis of thyroid disease. These rules are then compared with those extracted by other commonly used models. A theoretical explanation of the excellent synergy effects observed between grafting and the subdivision in Re-RX with J48graft is also provided.
The maximum
Conflict of interest
The author declares no conflicts of interest. For this type of study, formal consent is not required.
Funding
This research did not receive any specific grant from funding agencies in the public, commercial, or not-for-profit sectors.
Acknowledgments
The author wishes to sincerely thank the UCI Machine Learning Repository and all those who donated to the thyroid dataset.
References (65)
- et al. An expert system based on generalized discriminant analysis and wavelet support vector machine for diagnosis of thyroid diseases. Expert Syst. Appl. (2011)
- et al. Rule extraction from support vector machines based on consistent region covering reduction. Knowl. Based Syst. (2013)
- et al. An automatic diagnosis system based on thyroid gland: ADSTG. Expert Syst. Appl. (2010)
- A comparative study on thyroid disease diagnosis using neural networks. Expert Syst. Appl. (2009)
- A method for single and multiple knowledge based networks. Artif. Intell. Med. (2003)
- et al. Extension of mixture-of-experts networks for binary classification of hierarchical data. Artif. Intell. Med. (2007)
- et al. A hybrid immune-estimation distribution of algorithm for mining thyroid gland data. Expert Syst. Appl. (2010)
- et al. Medical application of information gain based artificial immune recognition system (AIRS): diagnosis of thyroid disease. Expert Syst. Appl. (2009)
- et al. ESTDD: expert system for thyroid diseases diagnosis. Expert Syst. Appl. (2008)
- et al. A novel information transferring approach for the classification of remote sensing images. EURASIP J. Adv. Signal Process. (2015)