Comparison between statistical and fuzzy approaches for improving diagnostic decision making in patients with chronic nasal symptoms

doi:10.1016/j.fss.2013.10.013

Fuzzy Sets and Systems

Volume 237, 16 February 2014, Pages 136-150

https://doi.org/10.1016/j.fss.2013.10.013

Abstract

This paper compares a fuzzy model, expressed in rule form, with a well-known statistical approach (i.e., a logistic regression model) for diagnostic decision making in patients with chronic nasal symptoms. The analyses were carried out using a database obtained from a questionnaire administered to 1359 patients with nasal symptoms, containing personal data, clinical data and skin prick test (SPT) results. Both the fuzzy model and the logistic regression model were validated using a data set obtained from another medical institution. The accuracy of the two models in identifying patients with positive or negative SPT was similar. This study is a preliminary step towards the creation of software that primary care doctors can use when deciding whether patients with nasal symptoms need allergy testing.

Introduction

Chronic rhinitis is typically classified as allergic rhinitis (AR) if the symptoms and triggers correlate with a specific IgE-mediated response, or as non-allergic rhinitis (NAR) if symptoms are induced by irritant triggers in the absence of specific IgE-mediated responses [4]. AR is a common condition affecting 5–40% of the general population and there is evidence that its prevalence is increasing [3]. Rhinitis is an inflammation of the nasal membrane that causes periods of nasal discharge, sneezing, and congestion that persist for at least two hours per day [23]. Rhinitis is considered allergic when allergen-specific IgE initiates the immunologic reaction that causes symptoms, while it is non-allergic if allergen-specific IgE is negative [10].

Diagnostic allergy tests attempt to detect specific IgE, which binds common allergens (such as house dust mites, pollens, animal proteins, and mold spores) and causes the nasal symptoms [12]. Primary care doctors are usually the first to encounter patients with chronic nasal symptoms, but they are often uncertain about how to differentiate between allergic and non-allergic forms of the disease. They normally request a consultation with an allergy specialist if nasal symptoms have been present for more than two years and occur cyclically [26].

A short questionnaire whose result correlates with a positive or negative allergy test could help to modify and rationalize the approach currently taken by primary care doctors when evaluating patients with chronic nasal symptoms. In other words, we try to answer the question: is it necessary for a given patient to undergo allergy testing? The decision is made on the basis of the patient's demographic and clinical characteristics.

Due to its simplicity, high sensitivity, rapid interpretation, and relatively low cost, SPT was, until recently, often recommended by primary care doctors. However, because of the current state of the economy and the problems of the health care system, health care expenditure has fallen. In the Italian health care system, the total cost of an allergy test and in vivo testing is €44, considering the cost of the allergy extract and the allergist's charge (the allergist's time for the medical history, clinical examination, and SPT) [2]. This cost is paid by the patient, who possibly does not need an allergy test.

An important contribution to the diagnostic decision in rhinitis can come from the examination of a database built on a large sample of patients with chronic nasal symptoms. The crucial point is how to examine the data in order to assemble a questionnaire that will facilitate the diagnostic decision of primary care doctors for new patients with chronic nasal symptoms.

This study analyses a database of 1359 patients with chronic nasal symptoms using a logistic model and a fuzzy model, in order to evaluate how accurately each predicts the result of the SPT. The performance of the two models was validated on a data set obtained from another medical institution.

A considerable body of scientific work exploits databases or questionnaires to build models and algorithms that support the assessment, diagnosis, and treatment of allergic rhinitis and respiratory diseases. In their seminal work, Pantin and Merrett (1982) [21] applied a computer system to predict IgE-mediated allergies by referring to a database compiled from previous patients' answers and their IgE antibody profiles. Chae et al. (1992) [6] improved the capability of a medical decision support system for diagnosing nasal allergy by combining statistical and rule-based approaches through a neural network. Shortly after, Chae et al. (1995) [5] used covariance structure modeling to determine a structural relationship among patient characteristics, treatment and results of allergic rhinitis. Park et al. (1996) [22] developed a knowledge-based system to automate the diagnosis of allergic rhinitis by combining case-based and rule-based reasoning with a neural network. More recently, in the same clinical field, Zarandi et al. (2010) [27] developed a fuzzy rule-based expert system to diagnose asthma at initial stages. Zolnoori et al. (2012) [29] provided an intelligent fuzzy system to address the underestimation of asthma control levels. In the same year, Zolnoori et al. (2012) [28] developed a fuzzy expert system for evaluating the level of asthma exacerbation, and the same authors (2012) [30] proposed an intelligent fuzzy system for prescribing asthma medication in the primary stages, based on asthma severity levels. Tomita et al. (2013) [25] developed a scoring algorithm using clinical parameters to predict the presence of asthma in adult patients with respiratory symptoms. Padilla et al. (2013) [20] presented a cross-sectional study assessing the association between allergic rhinitis and asthma control in Peruvian school children. De Weger et al. (2013) [8] used multiple regression analysis to develop a two-step pollen and hay fever symptom prediction using actual and forecast weather parameters, grass pollen data and patient symptom diaries. Chatzimichail et al. (2013) [7] predicted asthma outcome using partial least square regression and artificial neural networks. The results of these applications, however, have not been applied in clinical practice.

The remainder of the paper is organized as follows. Section 2 describes the Palermo database. Section 3 reports the methods of statistical analysis and fuzzy analysis used to examine the Palermo database. Section 4 reports the results of the statistical and fuzzy analyses, together with the resulting logistic regression model and fuzzy model. Section 5 compares the logistic regression model and the fuzzy model on the Palermo database. Section 6 presents the validation of the diagnostic decision, performed on a new database with both the logistic regression model and the fuzzy model. Finally, Section 7 describes conclusions and future perspectives.

Section snippets

Description of the database

The original database consists of 1511 patients, consecutively seen and evaluated in the outpatient allergy office of the Dipartimento BioMedico di Medicina Interna e Specialistica (Di.Bi.M.I.S.) (ex Dipartimento di Medicina Clinica e delle Patologie Emergenti) of the University of Palermo, Italy. The database was previously used to analyze the characteristics of allergic rhinitis disease (see Di Lorenzo et al. [10]). Of the 1511 patients with nasal symptoms reported in the previous study, we

Statistical tests

We compared each input variable between SPT-positive and SPT-negative patients (the output variable), using Student's t test [13] or the Mann–Whitney test [19] for continuous variables, depending on the distribution of the data, the $\chi^{2}$ test [24] for dichotomous variables, and the Mann–Whitney test for ordinal variables.
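As an illustration of this kind of univariate screening, the following is a minimal sketch using only the Python standard library. It computes the Pearson $\chi^{2}$ statistic for a dichotomous variable and the pooled-variance Student t statistic for a continuous one; the Mann–Whitney test is left out for brevity. The function names and all counts are invented for the example and are not taken from the paper, and the authors' actual software is not stated in this excerpt.

```python
import math
import statistics

# Critical value of the chi-square distribution with 1 degree of freedom
# at the 0.05 significance level.
CHI2_CRITICAL_DF1 = 3.841


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square statistic (no continuity correction) for the
    2x2 table [[a, b], [c, d]], e.g. rows = SPT result, columns = symptom yes/no."""
    n = a + b + c + d
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    return n * (a * d - b * c) ** 2 / denominator


def student_t(x, y):
    """Two-sample Student t statistic using the pooled variance."""
    n_x, n_y = len(x), len(y)
    pooled_var = ((n_x - 1) * statistics.variance(x)
                  + (n_y - 1) * statistics.variance(y)) / (n_x + n_y - 2)
    standard_error = math.sqrt(pooled_var * (1 / n_x + 1 / n_y))
    return (statistics.mean(x) - statistics.mean(y)) / standard_error


# Invented counts: a dichotomous symptom (present/absent) by SPT result.
chi2 = chi_square_2x2(60, 40, 30, 70)
# For a 2x2 table, exceeding the critical value corresponds to p < 0.05.
keep_variable = chi2 > CHI2_CRITICAL_DF1
```

In practice a statistical package would also return exact p-values; the comparison against the critical value above only reproduces the $p < 0.05$ decision for a table with one degree of freedom.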

Logistic regression model

All variables found to be significantly different between patients with positive SPT and negative SPT ($p < 0.05$) were selected for the logistic regression model and analyzed with
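For readers unfamiliar with the technique, a logistic regression model turns a weighted sum of the selected variables into a probability, which is then compared with a cutoff. The sketch below shows this general mechanism only. The variable names, intercept and coefficients are hypothetical and do not come from the paper; the cutoff of 0.70 is the value the paper reports for the logistic model in the validation section.

```python
import math


def logistic_probability(intercept, coefficients, values):
    """Predicted probability from the standard logistic function applied
    to intercept + sum(coefficient * value)."""
    linear = intercept + sum(coefficients[name] * values[name]
                             for name in coefficients)
    return 1.0 / (1.0 + math.exp(-linear))


def predicts_positive_spt(probability, cutoff):
    """Classify as SPT-positive when the probability exceeds the cutoff."""
    return probability > cutoff


# Hypothetical variables and coefficients, for illustration only.
coefficients = {"age_years": -0.02, "family_history": 1.1, "seasonal_symptoms": 1.6}
patient = {"age_years": 30, "family_history": 1, "seasonal_symptoms": 1}

probability = logistic_probability(-0.5, coefficients, patient)
needs_testing = predicts_positive_spt(probability, cutoff=0.70)
```

Because each coefficient multiplies exactly one variable, the direction and size of each variable's influence can be read directly from the model.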

Results

On the basis of the preliminary analysis reported in Sections 3.1 and 3.2 (statistical analysis) and Sections 3.3 and 3.4 (fuzzy analysis), we selected the variables used to build the logistic regression model and the fuzzy model, respectively. Seventy-one percent of the subjects in the Palermo database were positive on SPT ($n = 961$).

Comparison of the logistic regression model and the fuzzy model of the Palermo database

The area under the ROC curve of the fuzzy model is greater than that of the logistic regression model, as displayed in Fig. 5.
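The area under the ROC curve has a simple probabilistic reading: it equals the probability that a randomly chosen SPT-positive patient receives a higher model score than a randomly chosen SPT-negative patient, with ties counted as one half. A minimal sketch of that computation follows. The scores are invented and `roc_auc` is a hypothetical helper, not the procedure used by the authors.

```python
def roc_auc(scores_positive, scores_negative):
    """Area under the ROC curve, computed as the fraction of
    (positive, negative) pairs in which the positive case scores higher.
    Tied pairs contribute one half."""
    wins = 0.0
    for pos in scores_positive:
        for neg in scores_negative:
            if pos > neg:
                wins += 1.0
            elif pos == neg:
                wins += 0.5
    return wins / (len(scores_positive) * len(scores_negative))


# Invented model outputs for SPT-positive and SPT-negative patients.
auc = roc_auc([0.9, 0.4], [0.6, 0.1])
```

An area of 1.0 means the scores separate the two groups perfectly, while 0.5 means they carry no more information than chance.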

We report the metrics of the logistic regression model and the fuzzy model in Table 10, and the post-test probabilities of the two models in Table 11.

Our results confirmed that the models can both be considered as diagnostic decision making tools.
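To make the terms concrete, the sketch below derives common diagnostic accuracy metrics from a 2x2 confusion matrix and converts a pre-test probability into a post-test probability through the likelihood ratio. Which metrics Table 10 actually contains is not shown in this excerpt, so the selection here is an assumption, and the counts are invented. The pre-test probability of 0.71 is the proportion of SPT-positive subjects reported for the Palermo database.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Accuracy metrics from true/false positive and negative counts.
    Assumes no cell combination makes a denominator zero."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
        "lr_positive": sensitivity / (1 - specificity),
        "lr_negative": (1 - sensitivity) / specificity,
    }


def post_test_probability(pre_test_probability, likelihood_ratio):
    """Convert probability to odds, multiply by the likelihood ratio,
    and convert the resulting odds back to a probability."""
    pre_odds = pre_test_probability / (1 - pre_test_probability)
    post_odds = pre_odds * likelihood_ratio
    return post_odds / (1 + post_odds)


# Invented counts for illustration only.
metrics = diagnostic_metrics(tp=90, fp=10, fn=10, tn=90)
after_positive = post_test_probability(0.71, metrics["lr_positive"])
after_negative = post_test_probability(0.71, metrics["lr_negative"])
```

A positive model result raises the probability of a positive SPT above the pre-test value, and a negative result lowers it; the size of the shift depends on the likelihood ratios.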

Validation of the logistic regression model and of the fuzzy model

We analyzed a new database obtained from 88 adult patients with chronic nasal symptoms, consecutively seen and evaluated clinically and with SPT in the outpatient allergy office of the Dipartimento di Pediatria Ospedaliera (Pediatric Department) “G.B. Grassi” of Rome, Italy. We used the input variables of the logistic model and the fuzzy model and their best cutoffs, >0.70 and >0.58 respectively, obtained from the Palermo database. In this way we evaluated how many patients with chronic nasal symptoms

Conclusions and future work

The two models performed well at predicting the result of SPT in individuals with chronic nasal symptoms (ROC curve areas are 0.95 for the logistic model and 0.96 for the fuzzy model).

Both the logistic regression model and the fuzzy model had all metric values greater than 80% at their ideal thresholds.

We would prefer the logistic regression model: while performing equally well, the logistic model is simpler and easier to interpret. In fact, it uses a smaller number of variables, and the influence of each

Author contributions

GDL planned and organized the study. VL developed both the logistic model and fuzzy model. MSLB and SLP administered the questionnaires and performed the allergy testing, in Palermo. GLP entered the variables in the database. GP administered the questionnaires and performed the allergy testing in Rome. GDL and VL developed the hypothesis for this paper, conducted the analysis, and wrote the first and second draft of the paper. All authors commented on the analysis, and revised the paper.

Acknowledgements

The authors wish to thank the Area Editor and the anonymous Referees for their valuable comments and suggestions.

This study was supported by grants from MIUR (Italian University and Research Ministry) (former 60% funds) to Gabriele Di Lorenzo and Valerio Lacagnina. No support was received from the pharmaceutical and diagnostic industry. The authors declare that they have no competing interests.

We would like to thank Professor Peter Dawson for the accurate proof-reading.

References (31)

D. Brandt et al. Questionnaire evaluation and risk factor identification for nonallergic vasomotor rhinitis. Ann. Allergy, Asthma, & Immun. (2006)

E.H. Mamdani et al. An experiment in linguistic synthesis with a fuzzy logic controller. Int. J. Man-Mach. Stud. (1975)

L.M. Bachmann et al. Sample sizes of studies on diagnostic accuracy: literature survey. Br. Med. J. (2006)

F. Borghesan et al. In vivo and in vitro allergy diagnostics: it's time to re-appraise the costs. Clin. Chem. Lab. Med. (2007)

J. Bousquet et al. Allergic rhinitis and its impact on asthma (ARIA) 2008 update. Allergy (2008)

Y.M. Chae et al. Structural modeling of differential diagnosis, treatment, and results for allergic rhinitis. Yonsei Med. J. (1995)

Y.M. Chae et al. The development of a decision support system for diagnosing nasal allergy. Yonsei Med. J. (1992)

E. Chatzimichail et al. Predicting asthma outcome using partial least square regression and artificial neural networks. BioMed Res. Int. (2013)

L.A. de Weger et al. Development and validation of a 5-day-ahead hay fever forecast for patients with grass-pollen-induced allergic rhinitis. Int. J. Biometeorol. (2013)

E.R. DeLong et al. Comparing the areas under two or more correlated receiver operating characteristic curves: A nonparametric approach. Biometrics (1988)

G. Di Lorenzo et al. Differences and similarities between allergic and nonallergic rhinitis in a large sample of adult patients with rhinitis symptoms. Int. Arch. Allergy Immunol. (2011)

R.W. Fletcher et al. Clinical Epidemiology: The Essentials. (2005)

K. Gendo et al. Evidence-based diagnostic strategies for evaluating suspected allergic rhinitis. Ann. Intern. Med. (2004)

W. Sealey Gosset. The probable error of a mean. Biometrika (1908)

K.O. Hajian-Tilaki et al. A comparison of parametric and nonparametric approaches to ROC analysis of quantitative diagnostic tests. Med. Decis. Mak. (1997)

