Abstract
In most cases, evolutionary instance selection outperforms non-evolutionary methods, including on the function approximation tasks considered in this work. However, as the number of instances encoded in the chromosome grows, finding the optimal subset becomes more difficult, especially because running the optimization for too long leads to over-fitting. The solution we evaluate in this work is to reduce the search space by clustering the dataset, running the instance selection algorithm on each cluster, and combining the results. We also address how to properly process instances close to the cluster boundaries, as this is where a drop in accuracy can appear. The method is experimentally verified on several regression datasets with thousands of instances.
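The cluster, select, and combine pipeline described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the plain k-means partitioning, the leave-one-out k-NN regression error used as fitness, the size penalty `alpha`, the simple elitist genetic algorithm, and all function names. The special treatment of instances near cluster boundaries is not modeled, because the abstract does not say how it is done.

```python
import random

def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((p - q) ** 2 for p, q in zip(a, b))

def kmeans(X, k, rng, iters=20):
    """Plain Lloyd's k-means; returns one cluster label per row of X."""
    centers = rng.sample(X, k)
    labels = [0] * len(X)
    for _ in range(iters):
        labels = [min(range(k), key=lambda c: sq_dist(x, centers[c])) for x in X]
        for c in range(k):
            members = [x for x, l in zip(X, labels) if l == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels

def knn_error(mask, X, y, k=3):
    """Leave-one-out MSE of k-NN regression using only the selected instances."""
    kept = [i for i, bit in enumerate(mask) if bit]
    if len(kept) < 2:
        return float("inf")  # too few instances to predict anything
    err = 0.0
    for i, x in enumerate(X):
        near = sorted((j for j in kept if j != i), key=lambda j: sq_dist(x, X[j]))[:k]
        pred = sum(y[j] for j in near) / len(near)
        err += (pred - y[i]) ** 2
    return err / len(X)

def evolve_mask(X, y, rng, generations=15, pop_size=12, alpha=0.1):
    """Toy elitist GA over binary masks; one bit per instance in the cluster."""
    n = len(X)

    def cost(mask):
        # Lower is better: prediction error plus a penalty on retained instances.
        return knn_error(mask, X, y) + alpha * sum(mask) / n

    pop = [[rng.random() < 0.5 for _ in range(n)] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=cost)
        parents = pop[: pop_size // 2]
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            child = [rng.choice(pair) for pair in zip(a, b)]  # uniform crossover
            flip = rng.randrange(n)                           # single bit-flip mutation
            child[flip] = not child[flip]
            children.append(child)
        pop = parents + children
    return min(pop, key=cost)

def partitioned_selection(X, y, n_clusters=3, seed=0):
    """Cluster the data, run selection per cluster, and merge the kept indices."""
    rng = random.Random(seed)
    labels = kmeans(X, n_clusters, rng)
    selected = []
    for c in range(n_clusters):
        idx = [i for i, l in enumerate(labels) if l == c]
        if len(idx) < 4:  # tiny cluster: keep everything (an arbitrary choice)
            selected.extend(idx)
            continue
        mask = evolve_mask([X[i] for i in idx], [y[i] for i in idx], rng)
        selected.extend(i for i, bit in zip(idx, mask) if bit)
    return sorted(selected)
```

Each per-cluster chromosome has only as many bits as the cluster has instances, which is the search-space reduction the abstract refers to. Calling `partitioned_selection(X, y)` with `X` as a list of feature lists and `y` as a list of targets returns the sorted indices of the retained instances.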
Acknowledgements
This work was supported by the Polish National Science Center (NCN) grant “Evolutionary Methods in Data Selection”, No. 2017/01/X/ST6/00202.
Copyright information
© 2018 Springer Nature Switzerland AG
About this paper
Cite this paper
Kordos, M., Czepielik, Ł., Blachnik, M. (2018). Data Set Partitioning in Evolutionary Instance Selection. In: Yin, H., Camacho, D., Novais, P., Tallón-Ballesteros, A. (eds) Intelligent Data Engineering and Automated Learning – IDEAL 2018. Lecture Notes in Computer Science, vol 11314. Springer, Cham. https://doi.org/10.1007/978-3-030-03493-1_66
DOI: https://doi.org/10.1007/978-3-030-03493-1_66
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-03492-4
Online ISBN: 978-3-030-03493-1
eBook Packages: Computer Science, Computer Science (R0)