Abstract
In the globalized knowledge economy, the challenge of translating the best available evidence from customer profiling and experience into policy and practice is universal. Customers are diverse and require personalized services from financial institutions, while financial institutions need to predict customers' wants and needs in order to understand them on a deeper level. Customer segmentation is a crucial process that allows a financial institution to assign new customers to specific segments and to find patterns among existing customers. Usually, rule-based techniques that focus on specific customer characteristics, chosen according to expert knowledge, are applied to segment customers. However, these techniques highlight the fact that traditional classifications are becoming increasingly irrelevant in the big data era, and they support the claim that financial institutions do not know their customers well enough. The main objective of this work is to propose an evolutionary clustering approach as a rule-extraction mechanism that helps decision makers recognize the most significant customer characteristics and profile customers into segments. In particular, a population-based metaheuristic algorithm (a genetic algorithm) is combined in a hybrid scheme with unsupervised machine learning algorithms (K-means algorithms) to solve data clustering problems. Based on the clustering result, a label is added to every data point in the dataset. The labeled dataset is then used to train supervised ML algorithms, such as deep learning and random forests, to predict the cluster to which a new customer should be mapped. The cluster analysis was conducted on behalf of EXUS, a financial solutions company that provides financial institutions with software for delivering debt collection services effectively, and it meets both academic requirements and practical needs. Two real-world datasets collected from financial institutions in Greece were explored and analyzed for segmentation purposes. To demonstrate the effectiveness of the proposed method, well-known benchmark datasets from the UCI Machine Learning Repository were also used.
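To make the pipeline described in the abstract more concrete, the following is a minimal sketch of a genetic algorithm hybridized with K-means, followed by the labeling step. It is not the chapter's implementation: the chromosome encoding (a list of k centroids), the one-point crossover, the mutation rule, the sum-of-squared-errors fitness, and all function and parameter names (`ga_kmeans`, `kmeans_step`, `pop_size`, `mutation_rate`, and so on) are assumptions chosen for illustration. The nearest-centroid `assign_new` function is only a simple stand-in for the supervised models (deep learning, random forests) that the abstract mentions.

```python
import random

def sse_and_labels(points, centroids):
    """Assign each point to its nearest centroid; return (SSE, labels)."""
    labels, sse = [], 0.0
    for p in points:
        dists = [sum((a - b) ** 2 for a, b in zip(p, c)) for c in centroids]
        j = dists.index(min(dists))
        labels.append(j)
        sse += dists[j]
    return sse, labels

def kmeans_step(points, centroids):
    """One K-means update: move each centroid to the mean of its members."""
    _, labels = sse_and_labels(points, centroids)
    updated = []
    for j, c in enumerate(centroids):
        members = [p for p, lab in zip(points, labels) if lab == j]
        if members:
            updated.append(tuple(sum(x) / len(members) for x in zip(*members)))
        else:
            updated.append(c)  # keep an empty cluster's centroid unchanged
    return updated

def ga_kmeans(points, k, pop_size=20, generations=30, mutation_rate=0.2, seed=0):
    """Hypothetical GA + K-means hybrid: each individual is a list of k centroids."""
    rng = random.Random(seed)
    population = [[tuple(p) for p in rng.sample(points, k)] for _ in range(pop_size)]
    for _ in range(generations):
        # Local refinement: one K-means step per individual.
        population = [kmeans_step(points, ind) for ind in population]
        # Selection: keep the half with the lowest SSE (elitist truncation).
        population.sort(key=lambda ind: sse_and_labels(points, ind)[0])
        parents = population[: pop_size // 2]
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randint(1, k - 1) if k > 1 else 0
            child = a[:cut] + b[cut:]  # one-point crossover on centroid lists
            if rng.random() < mutation_rate:
                # Mutation: replace one centroid with a random data point.
                child[rng.randrange(k)] = tuple(rng.choice(points))
            children.append(child)
        population = parents + children
    best = min(population, key=lambda ind: sse_and_labels(points, ind)[0])
    sse, labels = sse_and_labels(points, best)
    return best, labels, sse

def assign_new(point, centroids):
    """Stand-in for the supervised step: map a new point to the nearest centroid."""
    return sse_and_labels([point], centroids)[1][0]

# Toy data (invented for illustration): two well-separated groups of customers.
points = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1), (5.0, 5.0), (5.1, 5.2), (5.2, 5.1)]
centroids, labels, sse = ga_kmeans(points, k=2)
labeled_dataset = list(zip(points, labels))  # cluster labels added to every data point
new_customer_cluster = assign_new((4.9, 5.0), centroids)
```

In a fuller version, `labeled_dataset` would be passed to a supervised learner of the reader's choice instead of `assign_new`; the sketch stops at the labeling step to stay dependency-free.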
Acknowledgements
The authors would like to thank EXUS financial solutions company for its support with respect to the work described here.
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
About this chapter
Cite this chapter
Lappas, P.Z., Xanthopoulos, S.Z., Yannacopoulos, A.N. (2023). Metaheuristic-Based Machine Learning Approach for Customer Segmentation. In: Eddaly, M., Jarboui, B., Siarry, P. (eds) Metaheuristics for Machine Learning. Computational Intelligence Methods and Applications. Springer, Singapore. https://doi.org/10.1007/978-981-19-3888-7_4
DOI: https://doi.org/10.1007/978-981-19-3888-7_4
Publisher Name: Springer, Singapore
Print ISBN: 978-981-19-3887-0
Online ISBN: 978-981-19-3888-7
eBook Packages: Computer Science (R0)