Abstract
Predicting academic performance is an important task for educational institutions because it helps identify students who may need additional support or intervention. This paper presents an approach to academic performance prediction that combines K-Means clustering with a comparison of machine learning algorithms. The study uses a student dataset sourced from Kaggle that covers attributes related to students’ backgrounds and academic history. K-Means clustering is first applied to the dataset to identify distinct student clusters. Four machine learning algorithms, K-Nearest Neighbors (KNN), Neural Network (NN), Random Forest (RF), and Support Vector Machine (SVM), are then evaluated for their ability to predict academic performance. The experimental results show that K-Means clustering is useful for segmenting students into meaningful clusters based on shared characteristics. The comparative analysis shows that accuracy, precision, recall, and F1-score vary across the algorithms and across the clusters. The results identify the algorithm that performs best in this context.
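The workflow summarized in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes scikit-learn, substitutes a synthetic dataset from `make_classification` for the Kaggle student data, and picks the number of clusters, the hyperparameters, and the evaluation scheme (fit each classifier on the full training set, then report metrics separately for each K-Means cluster of the test set) purely for illustration.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, precision_recall_fscore_support
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Assumption: synthetic stand-in for the student dataset (features + pass/fail label).
X, y = make_classification(n_samples=600, n_features=8, n_informative=5, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)

# Scale features; K-Means, KNN, SVM and the neural network are all scale-sensitive.
scaler = StandardScaler().fit(X_train)
X_train_s = scaler.transform(X_train)
X_test_s = scaler.transform(X_test)

# Assumption: three clusters; the paper's actual choice of k is not stated here.
N_CLUSTERS = 3
kmeans = KMeans(n_clusters=N_CLUSTERS, n_init=10, random_state=0).fit(X_train_s)
test_clusters = kmeans.predict(X_test_s)

# The four classifier families named in the abstract, with illustrative settings.
models = {
    "KNN": KNeighborsClassifier(n_neighbors=5),
    "NN": MLPClassifier(hidden_layer_sizes=(16,), max_iter=2000, random_state=0),
    "RF": RandomForestClassifier(n_estimators=100, random_state=0),
    "SVM": SVC(kernel="rbf"),
}

results = {}  # (model name, cluster id) -> (accuracy, precision, recall, F1)
for name, model in models.items():
    model.fit(X_train_s, y_train)
    y_pred = model.predict(X_test_s)
    for c in range(N_CLUSTERS):
        mask = test_clusters == c
        if not mask.any():
            continue  # skip clusters with no test samples
        acc = accuracy_score(y_test[mask], y_pred[mask])
        prec, rec, f1, _ = precision_recall_fscore_support(
            y_test[mask], y_pred[mask], average="macro", zero_division=0
        )
        results[(name, c)] = (acc, prec, rec, f1)
        print(f"{name:>3} cluster {c}: acc={acc:.2f} prec={prec:.2f} "
              f"rec={rec:.2f} f1={f1:.2f}")
```

Training a separate classifier inside each cluster, or adding the cluster label as an extra feature, are equally plausible readings of the abstract; the sketch uses the simplest variant that yields per-cluster metrics.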
Acknowledgment
The authors would like to thank Dr. Ranjit Patil, Principal, Dr. D. Y. Patil Arts, Science and Commerce College, Pimpri, Pune (MS), for help in writing this research paper and for useful discussions. We also express our sincere thanks to Dr. Bharat Shinde, Principal, and Mr. Gajanan Joshi, Head, Department of Computer Science, Vidya Pratishthan’s Arts, Science and Commerce College, M.I.D.C., Baramati, Pune (MS), for their valuable guidance and suggestions regarding this work. We also extend our sincere appreciation to Mr. Amol Mohite for invaluable assistance in formatting this research paper.
Copyright information
© 2025 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Kudale, G.A., Rajpoot, S.S. (2025). Enhancing Academic Performance Prediction Through K-Means Clustering and Comparative Evaluation of Machine Learning Algorithms: A Case Study on Student Dataset. In: Singh, R., Gehlot, A. (eds) Business Data Analytics. ICBDA 2023. Communications in Computer and Information Science, vol 2358. Springer, Cham. https://doi.org/10.1007/978-3-031-80778-7_20
DOI: https://doi.org/10.1007/978-3-031-80778-7_20
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-80777-0
Online ISBN: 978-3-031-80778-7
eBook Packages: Computer Science, Computer Science (R0)