Abstract
In this paper, we investigate how to speed up the evolutionary induction of decision trees, an emerging alternative to greedy top-down solutions. In particular, we design and implement a graphics processing unit (GPU)-based parallelization for generating regression trees (decision trees employed to solve regression problems) on large-scale data. The most time-consuming part of the algorithm, the evaluation of individuals in the population, is parallelized. Other parts of the algorithm (such as selection and genetic operators) are performed sequentially on the CPU. A data-parallel approach is applied to split the dataset over the GPU cores. After each assigned chunk of data is processed, the results calculated on all GPU cores are merged and sent to the CPU. We use the Compute Unified Device Architecture (CUDA) programming model, which supports general-purpose computation on a GPU (GPGPU). Experimental validation of the proposed approach is performed on artificial and real-life datasets. A computational performance comparison with the traditional CPU version shows that GPU-accelerated evolutionary induction of regression trees is significantly faster (by up to 1000 times) and allows much larger datasets to be processed.
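The data-parallel evaluation described in the abstract (split the dataset into chunks, process each chunk independently, then merge the partial results) can be illustrated with a minimal CPU-side sketch in Python. This is not the authors' CUDA implementation: the `Node` class, the function names, and the use of the sum of squared errors as the error measure are all assumptions made for illustration, and the chunks are processed in a plain loop where a GPU would process them concurrently.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence


@dataclass
class Node:
    """Hypothetical univariate regression-tree node; a leaf has feature=None."""
    leaf_id: int = -1
    feature: Optional[int] = None
    threshold: float = 0.0
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def find_leaf(node: Node, x: Sequence[float]) -> int:
    """Route one sample from the root down to a leaf and return the leaf id."""
    while node.feature is not None:
        node = node.left if x[node.feature] <= node.threshold else node.right
    return node.leaf_id


def process_chunk(tree: Node, X, y, n_leaves: int) -> List[List[float]]:
    """Per-chunk work (one GPU block in the real setting, assumed here):
    accumulate [count, sum(y), sum(y^2)] for every leaf."""
    stats = [[0, 0.0, 0.0] for _ in range(n_leaves)]
    for x, target in zip(X, y):
        s = stats[find_leaf(tree, x)]
        s[0] += 1
        s[1] += target
        s[2] += target * target
    return stats


def merge(partials, n_leaves: int) -> List[List[float]]:
    """Reduction step: add the per-chunk statistics leaf by leaf."""
    merged = [[0, 0.0, 0.0] for _ in range(n_leaves)]
    for part in partials:
        for leaf in range(n_leaves):
            for k in range(3):
                merged[leaf][k] += part[leaf][k]
    return merged


def evaluate(tree: Node, X, y, n_leaves: int, n_chunks: int) -> float:
    """Split the data into chunks, process each, merge, and return the total
    sum of squared errors when every leaf predicts its mean target."""
    size = max(1, (len(X) + n_chunks - 1) // n_chunks)
    partials = [
        process_chunk(tree, X[i:i + size], y[i:i + size], n_leaves)
        for i in range(0, len(X), size)
    ]
    sse = 0.0
    for count, total, total_sq in merge(partials, n_leaves):
        if count:
            sse += total_sq - total * total / count
    return sse


# Toy individual: one split on feature 0 at 0.5, two leaves.
tree = Node(feature=0, threshold=0.5, left=Node(leaf_id=0), right=Node(leaf_id=1))
X = [[0.1], [0.2], [0.8], [0.9]]
y = [1.0, 3.0, 10.0, 14.0]
print(evaluate(tree, X, y, n_leaves=2, n_chunks=2))
```

Because the per-leaf statistics are simple sums, the merged result does not depend on how the data is partitioned, which is what makes this kind of evaluation straightforward to distribute over many cores.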
Acknowledgments
This work was supported by the grants S/WI/2/13 (first and third author) and W/WI/1/2017 (second author) from Bialystok University of Technology, funded by the Ministry of Science and Higher Education.
Copyright information
© 2017 Springer International Publishing AG
About this paper
Cite this paper
Jurczuk, K., Czajkowski, M., Kretowski, M. (2017). GPU-Accelerated Evolutionary Induction of Regression Trees. In: Martín-Vide, C., Neruda, R., Vega-Rodríguez, M. (eds) Theory and Practice of Natural Computing. TPNC 2017. Lecture Notes in Computer Science, vol 10687. Springer, Cham. https://doi.org/10.1007/978-3-319-71069-3_7
DOI: https://doi.org/10.1007/978-3-319-71069-3_7
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-71068-6
Online ISBN: 978-3-319-71069-3
eBook Packages: Computer Science, Computer Science (R0)