Abstract
Ensembles are often capable of greater predictive performance than any of their individual classifiers. Although an ensemble benefits when its classifiers make different kinds of errors, the majority voting scheme typically used treats each classifier as though it contributed equally to the group's performance. This can be particularly limiting on unbalanced datasets, where one is more interested in complementary classifiers that improve the true positive rate without significantly increasing the false positive rate. We therefore implement a genetic algorithm based framework that weights the contribution of each classifier according to an appropriate fitness function, so that classifiers that complement each other on the unbalanced dataset are preferred, resulting in significantly improved performance. The proposed framework can be built on top of any collection of classifiers, with different fitness functions.
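The idea in the abstract can be illustrated with a minimal sketch: each classifier's vote is scaled by a weight, and a simple genetic algorithm evolves the weight vector to maximize a fitness function suited to unbalanced data (here the F-measure, one plausible choice; the paper leaves the fitness function pluggable). The population size, mutation scheme, and toy data below are illustrative assumptions, not the authors' exact setup.

```python
import random

def weighted_vote(weights, predictions):
    """Weighted-majority vote: predict 1 when the weighted mass of
    positive votes reaches half the total weight."""
    half = sum(weights) / 2.0
    n = len(predictions[0])
    return [1 if sum(w * p[j] for w, p in zip(weights, predictions)) >= half else 0
            for j in range(n)]

def f_measure(pred, truth):
    """F-measure: harmonic mean of precision and recall."""
    tp = sum(1 for p, t in zip(pred, truth) if p == 1 and t == 1)
    fp = sum(1 for p, t in zip(pred, truth) if p == 1 and t == 0)
    fn = sum(1 for p, t in zip(pred, truth) if p == 0 and t == 1)
    if tp == 0:
        return 0.0
    prec, rec = tp / (tp + fp), tp / (tp + fn)
    return 2 * prec * rec / (prec + rec)

def evolve_weights(predictions, truth, pop_size=20, generations=40, seed=0):
    """Toy GA over classifier weights (assumed hyperparameters, not the
    paper's): truncation selection with elitism, one-point crossover,
    Gaussian mutation; fitness = F-measure of the weighted vote."""
    rng = random.Random(seed)
    k = len(predictions)
    fitness = lambda w: f_measure(weighted_vote(w, predictions), truth)
    pop = [[rng.random() for _ in range(k)] for _ in range(pop_size)]
    pop[0] = [1.0] * k  # seed plain majority voting so elitism can only improve on it
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        elite = pop[: pop_size // 2]
        children = []
        while len(elite) + len(children) < pop_size:
            a, b = rng.sample(elite, 2)
            cut = rng.randrange(1, k)            # one-point crossover
            child = a[:cut] + b[cut:]
            if rng.random() < 0.3:               # mutate one weight
                i = rng.randrange(k)
                child[i] = max(0.0, child[i] + rng.gauss(0, 0.3))
            children.append(child)
        pop = elite + children
    return max(pop, key=fitness)
```

On an unbalanced validation set where one base classifier catches the minority class and the others mostly vote negative, the evolved weights favor the complementary classifier, whereas an unweighted majority vote suppresses its minority-class votes.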
Copyright information
© 2007 Springer Berlin Heidelberg
About this paper
Cite this paper
Chawla, N.V., Sylvester, J. (2007). Exploiting Diversity in Ensembles: Improving the Performance on Unbalanced Datasets. In: Haindl, M., Kittler, J., Roli, F. (eds) Multiple Classifier Systems. MCS 2007. Lecture Notes in Computer Science, vol 4472. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-72523-7_40
DOI: https://doi.org/10.1007/978-3-540-72523-7_40
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-72481-0
Online ISBN: 978-3-540-72523-7
eBook Packages: Computer Science, Computer Science (R0)