
Data Set Partitioning in Evolutionary Instance Selection

  • Conference paper
Intelligent Data Engineering and Automated Learning – IDEAL 2018 (IDEAL 2018)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 11314)

Abstract

Evolutionary instance selection outperforms non-evolutionary methods in most cases, including the function approximation tasks considered in this work. However, as the number of instances encoded in the chromosome grows, finding the optimal subset becomes more difficult, particularly because running the optimization for too long leads to over-fitting. The solution we evaluate in this work is to reduce the search space by clustering the dataset, running the instance selection algorithm within each cluster, and combining the results. We also address the proper processing of instances close to the cluster boundaries, as this is where a drop in accuracy can appear. The method is experimentally verified on several regression datasets with thousands of instances.
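
To make the partitioning idea concrete, the following is a minimal Python sketch of cluster-wise evolutionary instance selection for regression. It assumes k-means partitioning, a simple binary-mask genetic algorithm per cluster, and a 1-NN regressor as the fitness model; the function names, parameter values, and fitness weighting are illustrative only, and the paper's special handling of instances near cluster boundaries is not reproduced here.

import numpy as np
from sklearn.cluster import KMeans
from sklearn.neighbors import KNeighborsRegressor

def ga_select(X, y, pop_size=20, generations=30, rng=None):
    """Evolve a binary keep/drop mask over the instances of a single cluster."""
    rng = np.random.default_rng(0) if rng is None else rng
    n = len(X)
    pop = rng.random((pop_size, n)) < 0.5                # random initial masks

    def fitness(mask):
        # Reward low prediction error on the whole cluster and, slightly, smaller subsets.
        if mask.sum() < 2:
            return -np.inf
        model = KNeighborsRegressor(n_neighbors=1).fit(X[mask], y[mask])
        rmse = np.sqrt(np.mean((model.predict(X) - y) ** 2))
        return -(rmse + 0.01 * mask.mean())

    for _ in range(generations):
        scores = np.array([fitness(m) for m in pop])
        parents = pop[np.argsort(scores)[::-1][:pop_size // 2]]   # keep the better half
        children = []
        for _ in range(pop_size - len(parents)):
            a, b = parents[rng.integers(len(parents), size=2)]
            cut = rng.integers(1, n)                     # one-point crossover
            child = np.concatenate([a[:cut], b[cut:]])
            flip = rng.random(n) < 1.0 / n               # bit-flip mutation
            children.append(np.where(flip, ~child, child))
        pop = np.vstack([parents] + children)
    scores = np.array([fitness(m) for m in pop])
    return pop[np.argmax(scores)]

def partitioned_selection(X, y, n_clusters=4):
    """Cluster the data, run instance selection within each cluster, combine the results."""
    labels = KMeans(n_clusters=n_clusters, n_init=10, random_state=0).fit_predict(X)
    keep = np.zeros(len(X), dtype=bool)
    for c in range(n_clusters):
        idx = np.where(labels == c)[0]
        keep[idx] = ga_select(X[idx], y[idx])
    return keep

Because each chromosome now only covers the instances of one cluster, the binary search space per genetic algorithm run shrinks from 2^N to roughly 2^(N/K) for K clusters, which is the motivation for the partitioning. A call such as keep = partitioned_selection(X, y) returns a boolean mask over the full dataset, so a regressor trained on X[keep], y[keep] can be compared against one trained on all instances.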



Acknowledgements

This work was supported by the Polish National Science Center (NCN) grant “Evolutionary Methods in Data Selection”, No. 2017/01/X/ST6/00202.

Author information

Correspondence to Mirosław Kordos.



Copyright information

© 2018 Springer Nature Switzerland AG

About this paper


Cite this paper

Kordos, M., Czepielik, Ł., Blachnik, M. (2018). Data Set Partitioning in Evolutionary Instance Selection. In: Yin, H., Camacho, D., Novais, P., Tallón-Ballesteros, A. (eds.) Intelligent Data Engineering and Automated Learning – IDEAL 2018. Lecture Notes in Computer Science, vol. 11314. Springer, Cham. https://doi.org/10.1007/978-3-030-03493-1_66


  • DOI: https://doi.org/10.1007/978-3-030-03493-1_66

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-03492-4

  • Online ISBN: 978-3-030-03493-1

  • eBook Packages: Computer Science, Computer Science (R0)
