Abstract
Evolutionary algorithms are adaptive methods, based on natural evolution, that can be used for search and optimization. Because instance selection can be viewed as a search problem, it can be solved with evolutionary algorithms.
In this chapter, we carry out an empirical study of the performance of CHC as a representative evolutionary algorithm model. The study compares this algorithm with non-evolutionary instance selection algorithms on data sets of different sizes in order to evaluate the scaling-up problem. The results show that the stratified evolutionary instance selection algorithms consistently outperform the non-evolutionary ones. The main advantages are better instance reduction rates, higher classification accuracy, and lower resource consumption.
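To make the idea of instance selection as a search problem concrete, the following is a minimal Python sketch under stated assumptions, not the chapter's implementation. Each candidate solution is a binary mask over the instances, and the fitness trades off 1-NN accuracy against the reduction rate. The weighting `alpha`, the plain random partition into strata, and all function names are assumptions made for illustration. A simple bit-flip hill climber stands in for CHC, which is not reproduced here.

```python
import random


def nn_accuracy(mask, X, y):
    """1-NN accuracy over all instances, using only the selected ones as
    references (an instance is never allowed to be its own neighbour)."""
    selected = [i for i, bit in enumerate(mask) if bit]
    if not selected:
        return 0.0
    correct = 0
    for i in range(len(X)):
        best, best_d = None, float("inf")
        for j in selected:
            if j == i:
                continue  # leave-one-out for selected instances
            d = sum((a - b) ** 2 for a, b in zip(X[i], X[j]))
            if d < best_d:
                best, best_d = j, d
        if best is not None and y[best] == y[i]:
            correct += 1
    return correct / len(X)


def fitness(mask, X, y, alpha=0.5):
    """Weighted sum of accuracy and reduction rate (alpha is an assumption)."""
    reduction = 1.0 - sum(mask) / len(mask)
    return alpha * nn_accuracy(mask, X, y) + (1 - alpha) * reduction


def select_in_stratum(X, y, rng, steps=200):
    """Bit-flip hill climbing over masks; a stand-in for CHC, not CHC itself."""
    mask = [rng.random() < 0.5 for _ in X]
    best = fitness(mask, X, y)
    for _ in range(steps):
        k = rng.randrange(len(mask))
        mask[k] = not mask[k]
        f = fitness(mask, X, y)
        if f >= best:
            best = f
        else:
            mask[k] = not mask[k]  # revert a flip that made things worse
    return mask


def stratified_selection(X, y, n_strata, seed=0):
    """Split the data into disjoint strata, search each one independently,
    and return the sorted indices of all instances that were kept."""
    rng = random.Random(seed)
    idx = list(range(len(X)))
    rng.shuffle(idx)
    strata = [idx[s::n_strata] for s in range(n_strata)]
    kept = []
    for stratum in strata:
        Xs = [X[i] for i in stratum]
        ys = [y[i] for i in stratum]
        mask = select_in_stratum(Xs, ys, rng)
        kept.extend(i for i, bit in zip(stratum, mask) if bit)
    return sorted(kept)


if __name__ == "__main__":
    # Two synthetic Gaussian clusters, purely illustrative data.
    rng = random.Random(1)
    X = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(30)]
    X += [(rng.gauss(5, 1), rng.gauss(5, 1)) for _ in range(30)]
    y = [0] * 30 + [1] * 30
    kept = stratified_selection(X, y, n_strata=3)
    print(f"kept {len(kept)} of {len(X)} instances")
```

Because each stratum is searched on its own, the cost of evaluating a mask grows with the stratum size rather than with the full data set, which is the intuition behind using stratification to scale up.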
Copyright information
© 2005 Springer-Verlag Berlin Heidelberg
About this chapter
Cite this chapter
Cano, J.R., Herrera, F., Lozano, M. (2005). Strategies for Scaling Up Evolutionary Instance Reduction Algorithms for Data Mining. In: Ghosh, A., Jain, L.C. (eds) Evolutionary Computation in Data Mining. Studies in Fuzziness and Soft Computing, vol 163. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-32358-9_2
DOI: https://doi.org/10.1007/3-540-32358-9_2
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-22370-2
Online ISBN: 978-3-540-32358-7
eBook Packages: Engineering, Engineering (R0)