Elsevier

Knowledge-Based Systems

Volume 131, 1 September 2017, Pages 170-182
Synergy effects between grafting and subdivision in Re-RX with J48graft for the diagnosis of thyroid disease

https://doi.org/10.1016/j.knosys.2017.06.011

Highlights

  • Elucidate synergy effects between grafting and subdivision in Re-RX with J48graft.

  • Provide the theoretical explanation underlying the excellent synergy effects.

  • Demonstrate how grafting and subdivision can extract highly accurate and concise rules.

  • Re-RX with J48graft extracts highly accurate and concise rules from the thyroid dataset.

Abstract

Numerous methods for diagnosing thyroid disease have been developed, but the majority of these are black-box models. By contrast, the Recursive-Rule Extraction algorithm with J48graft is a white-box model that can provide highly accurate and concise classification rules. However, the potential capabilities of Re-RX with J48graft in terms of rule extraction remain unknown. Therefore, the aim of the present study was to elucidate the synergy effects between grafting and subdivision in Re-RX with J48graft, which work effectively in combination to extract highly accurate and concise classification rules for the diagnosis of thyroid disease. In the present study, I demonstrate how grafting and subdivision can extract highly accurate and concise classification rules from the Thyroid dataset, which is a large and highly imbalanced dataset consisting of 7200 medical records classified as normally functioning thyroid, hypothyroidism, or hyperthyroidism. I also provide the theoretical explanation underlying the excellent synergy effects between the two processes. Re-RX with J48graft not only achieved the most accurate classification rules, but also extracted simple and concrete concise classification rules for majority class samples. In addition, compared with previous methods, Re-RX with J48graft extracted rules with fewer antecedents. The maximum accuracy of the extracted rules was very high, at 97.02%. These findings suggest that Re-RX with J48graft can extract highly accurate and concise rules, which could assist healthcare professionals in the diagnosis of thyroid disease and help improve the level of care.

Introduction

Thyroid diseases affect about 200 million people worldwide, which accounts for nearly 15% of the adult population. In the United States, about 27 million people are affected by thyroid diseases, half of whom remain undiagnosed. The most common thyroid diseases lead to thyroid dysfunction, with 80% of cases being diagnosed as hypothyroidism, a condition in which the thyroid gland, a small, butterfly-shaped gland in the lower part of the neck, is underactive and incapable of producing adequate levels of thyroid hormone, and 20% being diagnosed as hyperthyroidism, a condition in which the thyroid gland is overactive and produces an excess of thyroid hormone. The thyroid gland produces two active hormones, triiodothyronine (T3) and levothyroxine (LT4), that play a variety of important roles in the human body and are vital in the production of proteins and the regulation of body temperature. Because thyroid function affects every major organ in the body, thyroid disorders should be taken very seriously. The thyroid gland is susceptible to a number of very distinct problems, some of which are extremely common. Underproduction of thyroid hormone can lead to hypothyroidism, whereas overproduction can result in hyperthyroidism. Both of these disorders are relatively common among the general population [1]. To help cells convert oxygen and nutrients into energy, T3 and LT4 production must fall within normal ranges; these ranges are usually based on the concentration of thyroid hormone in the blood. In the present study, normal ranges were calculated for the following: T3, LT4, thyroid stimulating hormone (TSH), thyroxine (T4) utilization rate, and free thyroxine index (FTI) [2]. Symptoms related to thyroid diseases can be easily confused with those of other conditions, making thyroid diseases difficult to diagnose. Fortunately, a TSH test can identify thyroid disorders even before symptom onset [3].

Thyroid function diagnosis is a three-class classification problem. Numerous supervised methods for diagnosing thyroid disease have been successfully applied to the classification of different tissues. These methods include the following: extreme learning machines (ELMs) [1]; support vector machines (SVMs) [4], [5], [6]; neural networks (NNs) [7], [8], [9], [10], [11], [12]; decision trees (DTs) [7]; k-nearest neighbor (kNN) classifiers [13]; fuzzy classifiers (FCs) [14], [15]; hybrid case-based reasoning [16]; mixture-of-expert models [17]; immune algorithms [18]; immune recognition systems [19]; neuro-fuzzy expert systems [20]; differential evolution [21]; classification [22], [23]; and discriminant analysis [24].

However, the majority of methods currently used to diagnose thyroid disease are black-box models that are unable to satisfactorily reveal information hidden in the data. For example, even if a method allows instances to be accurately assigned to groups, no information is provided to users regarding the reasoning underlying that assignment. Therefore, systems and/or algorithms that can provide insight into these underlying rationales are highly desirable. Among supervised methods, rule extraction studies are becoming increasingly popular because they can provide such explanations. However, extracted rules must have a high level of accuracy, particularly in the medical setting, and yet still be simple and easy to understand.

Some researchers have experimented with extracting Boolean rules from NNs [25], [26], [27], which has led to encouraging results that exhibit good performance, a reduced number of rules, relevant input variables, and increased interpretability. However, because these methods use Boolean rules, they do not extract continuous rules.

The Recursive-Rule Extraction algorithm with J48graft (Re-RX with J48graft) was first proposed in 2015 [28], and its considerable effectiveness for use in financial [29] and medical datasets [30], [31] has since been confirmed. Re-RX with J48graft is based on the Re-RX algorithm [32]. In the Re-RX algorithm, a C4.5 DT [33] is frequently employed in a recursive manner, while multi-layer perceptrons (MLPs) are trained using backpropagation NNs (BPNNs); this allows pruning [34] and therefore generates more efficient MLPs for highly accurate rule extraction. The Re-RX algorithm cascade repeats the BPNN, the pruning, and C4.5.

Re-RX with J48graft, a white-box model, can provide highly accurate and concise classification, and its output can be easily explained and interpreted through the concise extracted rules, which are expressed in IF-THEN form. Because this type of model is easy to understand, it is often preferred by physicians and clinicians.

Recently, I proposed using both discrete and continuous attributes to generate a DT in the Re-RX algorithm framework (hereafter Continuous Re-RX [35]). Although this seems to run counter to the design concept of the Re-RX algorithm in that it results in the generation of a more complex DT, both types of attributes are used to enhance accuracy [35].

Typically, the accuracy of each extracted classification rule is assessed by the number of correctly classified test samples, while the interpretability is assessed by the number of extracted rules and the average number of antecedents in the extracted rules; however, for extracted classification rules, both accuracy and interpretability should be considered.
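The three measures described above can be made concrete with a small sketch. The rule representation, the attribute names `x1` and `x2`, the thresholds, and the four samples below are all hypothetical and chosen only for illustration; they are not rules or data from the Thyroid dataset. The sketch assumes that rules are tried in order, that a default class is used when no rule fires, and that the default is not counted as a rule.

```python
import operator
from dataclasses import dataclass
from typing import Dict, List, Tuple

# Comparison operators allowed in an antecedent (illustrative subset).
OPS = {"<=": operator.le, ">": operator.gt}

Sample = Dict[str, float]


@dataclass
class Rule:
    """IF all antecedents hold THEN predict `label`."""
    antecedents: List[Tuple[str, str, float]]  # (attribute, operator, threshold)
    label: int

    def matches(self, x: Sample) -> bool:
        return all(OPS[op](x[attr], thr) for attr, op, thr in self.antecedents)


def classify(rules: List[Rule], default: int, x: Sample) -> int:
    """Return the label of the first matching rule, else the default class."""
    for rule in rules:
        if rule.matches(x):
            return rule.label
    return default


def evaluate(rules: List[Rule], default: int,
             data: List[Tuple[Sample, int]]) -> Tuple[float, int, float]:
    """Return (test accuracy, number of rules, average number of antecedents)."""
    correct = sum(classify(rules, default, x) == y for x, y in data)
    accuracy = correct / len(data)
    n_rules = len(rules)
    avg_antecedents = sum(len(r.antecedents) for r in rules) / n_rules
    return accuracy, n_rules, avg_antecedents


# Hypothetical rule set: thresholds are made up for illustration.
RULES = [
    Rule([("x1", "<=", 0.5)], label=3),
    Rule([("x1", ">", 0.5), ("x2", "<=", 0.2)], label=1),
]

# Hypothetical test samples with their true classes.
DATA = [
    ({"x1": 0.1, "x2": 0.9}, 3),
    ({"x1": 0.7, "x2": 0.1}, 1),
    ({"x1": 0.9, "x2": 0.8}, 2),
    ({"x1": 0.3, "x2": 0.1}, 1),  # misclassified: the first rule fires
]

if __name__ == "__main__":
    print(evaluate(RULES, default=2, data=DATA))
```

Reporting the three numbers together keeps the trade-off visible: adding rules or antecedents may raise accuracy while making the rule set harder to read.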

Results regarding the extraction of classification rules for the diagnosis of thyroid disease have been reported in a number of studies [5], [11], [14], [15]. Among these, Duch et al. [11] reported highly accurate classification for the Thyroid dataset and provided two types of relatively simple and concrete classification rule sets.

The present study aimed to elucidate the synergy effects of grafting in J48graft and subdivision in Re-RX with J48graft, which work effectively in combination to extract highly accurate and concise classification rules for the diagnosis of thyroid disease. This paper demonstrates why highly accurate and concise classification rules can be extracted by grafting and subdivision, respectively, and provides a theoretical explanation of the synergy effects. The accuracy and characteristics of these rules for the Thyroid dataset, in the form of IF-THEN rules, are also compared concretely with those extracted by Continuous Re-RX [35] and by both types of rule extraction algorithms proposed by Duch et al. [11].

The Thyroid dataset is a multi-class and highly imbalanced medical dataset that comprises both discrete and continuous, i.e., mixed, attributes. It was obtained from the University of California Irvine (UCI) Machine Learning Repository [36]. Two versions of the Thyroid dataset have been used for benchmarking in previous studies. One version comprises 7200 samples [11], [14], [15], [17], while the other comprises 215 samples [1], [3], [5], [6], [16], [17], [18], [19], [20], [21].

Section snippets

Extreme learning machines (ELMs)

ELMs [37], which are based on the structure of MLPs, represent a new rapid learning method. The most noteworthy aspect of ELM training is that it is carried out by setting the network weights randomly to obtain the inverse of the hidden-layer output matrix. The most advantageous characteristics of this technique are its simplicity, in that it makes the training algorithm extremely fast, and its outstanding performance, in that it usually performs better than other established methods, such as

Thyroid dataset and experimental setup

The Thyroid dataset is a large medical dataset composed of screening tests for thyroid symptoms. The training and test data in the set comprise 3772 and 3428 medical records, respectively, in which the thyroid is classified as normally functioning (Class 3), under-functioning (primary hypothyroidism, Class 1), or overactive (hyperthyroidism, Class 2). Hyper- and hypothyroidism represent 2.3% (166 cases) and 5.1% (367 cases) of the dataset, respectively; the remaining 92.6% (6667 cases) is
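The class proportions and split sizes quoted in this snippet can be checked with a few lines of Python. The counts are taken from the text above; the function and variable names are illustrative. The sketch also shows why accuracy alone is a weak yardstick on such an imbalanced dataset: a classifier that always predicts the majority class would already be correct on about 92.6% of the records.

```python
from typing import Dict

# Case counts as quoted in the text (full dataset, training + test).
COUNTS: Dict[str, int] = {
    "hyperthyroidism": 166,
    "hypothyroidism": 367,
    "normal": 6667,
}

# Training and test set sizes as quoted in the text.
N_TRAIN, N_TEST = 3772, 3428


def class_shares(counts: Dict[str, int]) -> Dict[str, float]:
    """Return each class's share of the dataset in percent, to one decimal."""
    total = sum(counts.values())
    return {name: round(100 * n / total, 1) for name, n in counts.items()}


def majority_baseline(counts: Dict[str, int]) -> float:
    """Accuracy (in percent) of always predicting the most frequent class."""
    return round(100 * max(counts.values()) / sum(counts.values()), 1)


if __name__ == "__main__":
    print(sum(COUNTS.values()), N_TRAIN + N_TEST)
    print(class_shares(COUNTS))
    print(majority_baseline(COUNTS))
```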

Performance

The BPNN was trained on the Thyroid dataset to obtain the maximum training accuracy (TR ACC), the maximum test accuracy (TS ACC), the number of extracted rules (# rules), the average number of antecedents (Ave. # ante.), and the area under the receiver operating characteristic curve [52] (Table 1). To the best of my knowledge, there have been no reports in the rule extraction and classification literature regarding thyroid TS ACC based on the k-fold cross-validation [53] method. Therefore,

Characteristics of rules extracted for the thyroid dataset

A comparison of the four rule sets revealed characteristic class differences between each rule. All classes appeared in the rules extracted in the present study at least once. It is presumed that this was due to the characteristics of the Thyroid dataset, which primarily consists of Class 3 samples. Given that assumption, correctly classifying Class 3 samples in detail is necessary.

The rule set shown in Fig. 5 (before subdivision) defines Class 3 using only one rule with one attribute (TSH),

Synergy effects between grafting and the subdivision in Re-RX with J48graft

This section investigates the synergy effects between grafting and subdivision in Re-RX with J48graft, which work effectively in combination to extract highly accurate and concise classification rules.

Medical significance of the present research

Historically, the degree and burden of suffering associated with hypo- and hyperthyroidism have remained unclear. Symptoms of these conditions have lacked intersubjective validity, and sufferers have faced stigmatization, making it difficult to construct a meaningful identity as a person with an illness [58].

The clinical symptoms and signs of these conditions often lack specificity, and TSH and thyroid hormone measurements are crucial for diagnosis and monitoring. Minor abnormalities in thyroid

Conclusion

The present study elucidates the synergy effects of grafting in J48graft and subdivision in Re-RX with J48graft, which work effectively in combination to extract highly accurate and concise classification rules for the diagnosis of thyroid disease. These rules are then compared with those extracted by other commonly used models. A theoretical explanation of the excellent synergy effects observed between grafting and subdivision in Re-RX with J48graft is also provided.

The maximum

Conflict of interest

The author declares no conflicts of interest. For this type of study, formal consent is not required.

Funding

This research did not receive any specific grant from funding agencies in the public, commercial, or not-for-profit sectors.

Acknowledgments

The author wishes to sincerely thank the UCI Machine Learning Repository and all those who donated to the thyroid dataset.

References (65)

  • L Li et al.

    A new face recognition method via semi-discrete decomposition for one sample problem

    Optik

    (2016)
  • R Setiono

    Generating concise and accurate classification rules for breast cancer diagnosis

    Artif. Intell. Med.

    (2000)
  • R Setiono

    Extracting rules from pruned neural networks for breast cancer diagnosis

    Artif. Intell. Med.

    (1996)
  • Y Hayashi

    Application of rule extraction algorithm family based on the Re-RX algorithm to financial credit risk assessment from Pareto optimal perspective

    Oper. Res. Perspect.

    (2016)
  • Y Hayashi et al.

    Use of a recursive-rule extraction algorithm with J48graft to achieve highly accurate and concise rule extraction from a large breast cancer dataset

    Inform. Med. Unlocked

    (2015)
  • Y Hayashi et al.

    Rule extraction using recursive-rule extraction algorithm with J48graft with sampling selection techniques for the diagnosis of type 2 diabetes mellitus in the Pima Indian dataset

    Inform. Med. Unlocked

    (2016)
  • Y Hayashi et al.

    Use of the recursive-rule extraction algorithm with continuous attributes to improve diagnostic accuracy in thyroid disease

    Inform. Med. Unlocked

    (2015)
  • G-B Huang et al.

    Extreme learning machine: theory and applications

    Neurocomputing

    (2006)
  • G Huang et al.

    Optimization method based extreme learning machine for classification

    Neurocomputing

    (2010)
  • E Alexandre et al.

    Hybridizing extreme learning machines and genetic algorithms to select acoustic features in vehicle classification applications

    Neurocomputing

    (2015)
  • CJ Mantas et al.

    Analysis of Credal-C4.5 for classification in noisy domains

    Expert Syst. Appl.

    (2016)
  • J Maillo et al.

    kNN-IS: an iterative spark-based design of the k-nearest neighbors classifiers for big data

    Knowl.-Based Syst.

    (2017)
  • F Beloufa et al.

    Design of fuzzy classifier for diabetes disease using modified artificial bee colony algorithm

    Comput. Method Program Biomed.

    (2013)
  • K Hornik et al.

    Multilayer feedforward networks are universal approximators

    Neural Netw.

    (1989)
  • JR Quinlan

    Simplifying decision trees

    Int. J. Man Mach. Stud.

    (1987)
  • LN Li et al.

    A computer aided diagnosis system for thyroid disease using extreme learning machine

    J. Med. Syst.

    (2012)
  • JP Nolan et al.

    Case-finding for unsuspected thyroid disease: costs and health benefits

    Am. J. Clin. Pathol.

    (1985)
  • HL Chen et al.

    A three-stage expert system based on support vector machines for thyroid disease diagnosis

    J. Med. Syst.

    (2012)
  • L Pasi

    Similarity classifier applied to medical datasets

  • G Serpen et al.

    Performance analysis of probabilistic potential function neural network classifier

  • L Ozyilmaz et al.

    Diagnosis of thyroid disease using artificial neural network methods

  • W Duch et al.

    A new methodology of extraction, optimization and application of crisp and fuzzy logical rules

    IEEE Trans. Neural Netw.

    (2001)