Abstract
Ensembles are often capable of greater predictive performance than any of their individual classifiers. Although an ensemble benefits when its classifiers make different kinds of errors, the majority voting scheme typically used treats each classifier as though it contributed equally to the group's performance. This can be particularly limiting on unbalanced datasets, where one is more interested in complementary classifiers that improve the true positive rate without significantly increasing the false positive rate. We therefore implement a genetic algorithm based framework that weights the contribution of each classifier according to an appropriate fitness function, so that classifiers that complement each other on the unbalanced dataset are preferred, resulting in significantly improved performance. The proposed framework can be built on top of any collection of classifiers, with different fitness functions.
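The idea in the abstract can be illustrated with a minimal sketch: each classifier's vote is scaled by a weight, and a simple genetic algorithm evolves the weight vector to maximize a fitness function suited to unbalanced data (here the F-measure, one plausible choice; the paper leaves the fitness function pluggable). The population size, mutation scheme, and toy data below are illustrative assumptions, not the authors' exact setup.

```python
import random

def weighted_vote(weights, predictions):
    """Weighted-majority vote: predict 1 when the weighted mass of
    positive votes reaches half the total weight."""
    half = sum(weights) / 2.0
    n = len(predictions[0])
    return [1 if sum(w * p[j] for w, p in zip(weights, predictions)) >= half else 0
            for j in range(n)]

def f_measure(pred, truth):
    """F-measure: harmonic mean of precision and recall."""
    tp = sum(1 for p, t in zip(pred, truth) if p == 1 and t == 1)
    fp = sum(1 for p, t in zip(pred, truth) if p == 1 and t == 0)
    fn = sum(1 for p, t in zip(pred, truth) if p == 0 and t == 1)
    if tp == 0:
        return 0.0
    prec, rec = tp / (tp + fp), tp / (tp + fn)
    return 2 * prec * rec / (prec + rec)

def evolve_weights(predictions, truth, pop_size=20, generations=40, seed=0):
    """Toy GA over classifier weights (assumed hyperparameters, not the
    paper's): truncation selection with elitism, one-point crossover,
    Gaussian mutation; fitness = F-measure of the weighted vote."""
    rng = random.Random(seed)
    k = len(predictions)
    fitness = lambda w: f_measure(weighted_vote(w, predictions), truth)
    pop = [[rng.random() for _ in range(k)] for _ in range(pop_size)]
    pop[0] = [1.0] * k  # seed plain majority voting so elitism can only improve on it
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        elite = pop[: pop_size // 2]
        children = []
        while len(elite) + len(children) < pop_size:
            a, b = rng.sample(elite, 2)
            cut = rng.randrange(1, k)            # one-point crossover
            child = a[:cut] + b[cut:]
            if rng.random() < 0.3:               # mutate one weight
                i = rng.randrange(k)
                child[i] = max(0.0, child[i] + rng.gauss(0, 0.3))
            children.append(child)
        pop = elite + children
    return max(pop, key=fitness)
```

On an unbalanced validation set where one base classifier catches the minority class and the others mostly vote negative, the evolved weights favor the complementary classifier, whereas an unweighted majority vote suppresses its minority-class votes.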
Copyright information
© 2007 Springer Berlin Heidelberg
About this paper
Cite this paper
Chawla, N.V., Sylvester, J. (2007). Exploiting Diversity in Ensembles: Improving the Performance on Unbalanced Datasets. In: Haindl, M., Kittler, J., Roli, F. (eds) Multiple Classifier Systems. MCS 2007. Lecture Notes in Computer Science, vol 4472. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-72523-7_40
DOI: https://doi.org/10.1007/978-3-540-72523-7_40
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-72481-0
Online ISBN: 978-3-540-72523-7
eBook Packages: Computer Science, Computer Science (R0)