
Pattern Recognition Letters

Volume 45, 1 August 2014, Pages 211-216

Cost-sensitive Bayesian network classifiers

https://doi.org/10.1016/j.patrec.2014.04.017

Highlights

  • Cost-sensitive learning has received increased attention in recent years.

  • Most of the existing work is devoted to making decision trees cost-sensitive.

  • We propose cost-sensitive Bayesian network classifiers.

  • The experimental results validate their effectiveness.

Abstract

Cost-sensitive learning has received increased attention in recent years. However, most existing work is devoted to making decision trees cost-sensitive, and very few studies discuss cost-sensitive Bayesian network classifiers. In this paper, an instance weighting method is incorporated into various Bayesian network classifiers. The instance weighting method modifies the probability estimation of Bayesian network classifiers, which makes them cost-sensitive. The experimental results on 36 UCI data sets show that when the cost ratio is large, the cost-sensitive Bayesian network classifiers perform well in terms of both the total misclassification costs and the number of high-cost errors. When the cost ratio is small, the advantage of the cost-sensitive Bayesian network classifiers over the original cost-insensitive ones is less obvious in terms of the total misclassification costs, but remains clear in terms of the number of high-cost errors.

Introduction

Classification is one of the most important tasks in data mining and machine learning. Traditional data mining and machine learning algorithms [19] are designed to yield classifiers that minimize the number of misclassification errors; in this case, the costs of different misclassification errors are considered equal. However, in many real-world domains, different types of error carry different costs. For example, in medical diagnosis, the cost of misclassifying a cancer patient as healthy is significantly greater than the cost of the opposite error. It is therefore important to build a classifier that minimizes the total misclassification cost rather than the number of misclassification errors. This kind of classification task is called cost-sensitive classification [8], [16], [31]. In recent years, cost-sensitive classification has received increased attention.
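
The decision rule underlying cost-sensitive classification can be illustrated with a small sketch: given class posterior probabilities and a misclassification cost matrix, a cost-sensitive classifier predicts the class with the lowest expected cost rather than the most probable class. The cost values and class coding below are hypothetical and only mirror the medical-diagnosis example above.

```python
import numpy as np

# Hypothetical cost matrix C[i][j]: cost of predicting class j when the true
# class is i (class 0 = cancer, class 1 = healthy).
C = np.array([[0.0, 50.0],   # missing a cancer patient is very expensive
              [1.0,  0.0]])  # a false alarm is comparatively cheap

def min_cost_class(posterior, cost):
    """Return the class whose expected misclassification cost is minimal.

    posterior[i] is P(true class = i | x); cost[i, j] is the cost of
    predicting class j when the true class is i.
    """
    expected_cost = posterior @ cost   # expected cost of predicting each class j
    return int(np.argmin(expected_cost))

# A classifier that is only 40% sure of cancer still predicts "cancer" (class 0),
# because the cost of the opposite error dominates:
# predicting 0 costs 0.4*0 + 0.6*1 = 0.6, predicting 1 costs 0.4*50 + 0.6*0 = 20.
print(min_cost_class(np.array([0.4, 0.6]), C))   # -> 0
```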

In existing studies, most of the work is devoted to making decision trees cost-sensitive; a detailed survey of cost-sensitive decision tree induction algorithms can be found in Lomax and Vadera’s paper [18]. By contrast, comprehensive studies of cost-sensitive Bayesian network classifiers are rare, and only a few works make naive Bayes cost-sensitive. For example, Gama [12] presents a cost-sensitive iterative Bayes; Chai et al. [1] consider test-cost-sensitive learning and propose a test-cost-sensitive naive Bayes; and Fang [9] develops a cost-sensitive naive Bayes method that learns an order relation from the training data and classifies instances based on the inferred order relation.

In this paper, we focus our attention on cost-sensitive Bayesian network classifiers. In existing cost-sensitive studies, several meta-learning methods, such as MetaCost [6], instance weighting [23], thresholding [21], and sampling [13], can be applied to make Bayesian network classifiers cost-sensitive. Among these, instance weighting is a simple, easy-to-understand, and efficient method. Inspired by the success of cost-sensitive C4.5 [23] and weighted random forest [3], we incorporate the instance weighting method into various Bayesian network classifiers, namely naive Bayes (NB), tree-augmented naive Bayes (TAN) [10], averaged one-dependence estimators (AODE) [25], and hidden naive Bayes (HNB) [14], to make them cost-sensitive. The resulting classifiers are called cost-sensitive Bayesian network classifiers in this paper. The experimental results on a large number of UCI data sets show that these cost-sensitive Bayesian network classifiers achieve a substantial reduction in both the total misclassification costs and the number of high-cost errors.
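
To make the weighted probability estimation concrete, the sketch below replaces the frequency counts of a discrete naive Bayes learner with sums of instance weights; in the cost-sensitive setting, the weight of each training instance would be set in proportion to the cost of misclassifying its class, as in cost-sensitive C4.5 [23]. This is a minimal illustration rather than the exact estimator used in the paper: the function name, the integer coding of attribute values, and the Laplace smoothing are assumptions made for the example.

```python
from collections import defaultdict

def weighted_naive_bayes(X, y, weights, n_values):
    """Estimate naive Bayes probabilities from weighted instances.

    X        : list of instances; attribute values are coded 0..n_values[a]-1
    y        : list of class labels
    weights  : per-instance weights (e.g. proportional to misclassification cost)
    n_values : n_values[a] = number of distinct values of attribute a
    Returns (priors, conditionals) with Laplace-smoothed, weight-based estimates.
    """
    classes = sorted(set(y))
    n_attrs = len(X[0])
    class_w = defaultdict(float)   # total weight per class (replaces class counts)
    cond_w = defaultdict(float)    # total weight per (class, attribute, value)
    for xi, yi, wi in zip(X, y, weights):
        class_w[yi] += wi
        for a, v in enumerate(xi):
            cond_w[(yi, a, v)] += wi
    total_w = sum(class_w.values())
    priors = {c: (class_w[c] + 1.0) / (total_w + len(classes)) for c in classes}
    conditionals = {
        (c, a, v): (cond_w[(c, a, v)] + 1.0) / (class_w[c] + n_values[a])
        for c in classes for a in range(n_attrs) for v in range(n_values[a])
    }
    return priors, conditionals

# Toy usage: two binary attributes and a binary class; class-1 instances get
# weight 5 because misclassifying class 1 is assumed to be five times as costly.
X = [[0, 1], [1, 1], [0, 0], [1, 0]]
y = [0, 0, 1, 1]
priors, conditionals = weighted_naive_bayes(X, y, weights=[1.0, 1.0, 5.0, 5.0], n_values=[2, 2])
```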

The rest of this paper is organized as follows. In Section 2, some works related to cost-sensitive learning are introduced. In Section 3, we revisit several state-of-the-art Bayesian network classifiers and then incorporate the instance weighting method into them. In Section 4, we conduct a series of experiments on 36 UCI benchmark data sets to validate our proposed cost-sensitive Bayesian network classifiers. In Section 5, we draw conclusions and outline the main directions for our future work.

Section snippets

Related work

Cost-sensitive learning approaches are generally divided into two categories [16]. The first is the direct approach, which introduces and utilizes misclassification costs directly in the learning algorithm, so that the resulting algorithm is cost-sensitive in itself. Direct cost-sensitive learning algorithms include ICET [24], cost-sensitive iterative naive Bayes [12], and cost-sensitive decision trees [7], [17].

The other category is cost-sensitive meta-learning.

The instance weighting method in cost-sensitive C4.5

The central choice in decision tree induction is selecting the attribute on which to split the training data at each non-terminal node of the tree. The information gain measure and its variants are generally used to select attributes, and all of them are based on a measure called entropy [19]. Given a node t of a decision tree, let N(t) be the number of instances in node t and N_j(t) the number of class-j instances in node t; the entropy of node t is
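
Entropy(t) = - \sum_j \frac{N_j(t)}{N(t)} \log_2 \frac{N_j(t)}{N(t)}

In the instance weighting method of cost-sensitive C4.5 [23], the counts in this formula are replaced by sums of instance weights, so that classes with higher misclassification costs contribute more to the impurity measure. As an illustration only, following the instance-weighting scheme of [23], a representative choice assigns every instance of class j the weight

w(j) = \frac{C(j) \, N}{\sum_i C(i) \, N_i}

where C(j) is the cost of misclassifying a class-j instance, N_i is the number of class-i training instances, and N = \sum_i N_i.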

Data sets

We ran our experiments on all 36 UCI data sets published on the main web site of the Weka platform [27], which represent a wide range of domains and data characteristics. The original description of these 36 data sets can be found in our previous papers [14], [15]. In our experiments, the following four preprocessing steps are adopted.

  • 1. Replacing missing attribute values: The unsupervised filter named ReplaceMissingValues in Weka is used to replace all missing attribute values in each data set.

  • 2.

Conclusions and future work

In real-world applications, cost-sensitive learning has received increased attention. Traditional Bayesian network classifiers are designed to minimize the number of misclassification errors, and when they are applied to cost-sensitive learning tasks their performance is generally poor. To improve their performance, we incorporate the instance weighting method into various Bayesian network classifiers and propose cost-sensitive Bayesian network classifiers in this paper. The instance weighting method

Acknowledgements

The authors would like to thank the anonymous reviewers for their valuable comments and suggestions. This work was partially supported by the National Natural Science Foundation of China (61203287), the Program for New Century Excellent Talents in University (NCET-12-0953), the Provincial Natural Science Foundation of Hubei (2011CDA103), and the Fundamental Research Funds for the Central Universities (CUG130504, CUG130414).

References (31)

  • M. Galar et al., EUSBoost: enhancing ensembles for highly imbalanced data-sets by evolutionary undersampling, Pattern Recogn. (2013).
  • Y. Sun et al., Cost-sensitive boosting for classification of imbalanced data, Pattern Recogn. (2007).
  • X. Chai, L. Deng, Q. Yang, C.X. Ling, Test-cost sensitive naive Bayes classification, in: Fourth IEEE International...
  • N.V. Chawla et al., SMOTE: synthetic minority over-sampling technique, J. Artif. Intell. Res. (2002).
  • C. Chen, A. Liaw, L. Breiman, Using random forest to learn imbalanced data, University of California Berkeley,...
  • D.M. Chickering, Learning Bayesian networks is NP-complete, in: Learning from Data, 1996, pp....
  • C.K. Chow et al., Approximating discrete probability distributions with dependence trees, IEEE Trans. Inf. Theory (1968).
  • P. Domingos, MetaCost: a general method for making classifiers cost-sensitive.
  • C. Drummond, R. Holte, Exploiting the cost (in)sensitivity of decision tree splitting criteria, in: Proceedings of the...
  • C. Elkan, The foundations of cost-sensitive learning.
  • X. Fang, Inference-based naive Bayes: turning naive Bayes cost-sensitive, IEEE Trans. Knowl. Data Eng. (2013).
  • N. Friedman et al., Bayesian network classifiers, Mach. Learn. (1997).
  • J. Gama, A cost-sensitive iterative Bayes, in: Workshop on Cost-Sensitive Learning at the Seventeenth International...
  • L. Jiang, C. Li, Z. Cai, H. Zhang, Sampled Bayesian network classifiers for class-imbalance and cost-sensitive...
  • L. Jiang et al., A novel Bayes model: hidden naive Bayes, IEEE Trans. Knowl. Data Eng. (2009).

This paper has been recommended for acceptance by G. Moser.
