Enhancing Academic Performance Prediction Through K-Means Clustering and Comparative Evaluation of Machine Learning Algorithms: A Case Study on Student Dataset

  • Conference paper
Business Data Analytics (ICBDA 2023)

Part of the book series: Communications in Computer and Information Science (CCIS, volume 2358)


Abstract

Predicting academic performance is a crucial task for educational institutions because it helps identify students who may need additional support or intervention. This paper presents an approach to academic performance prediction that combines K-Means clustering with a comparison of machine learning algorithms. The study uses a student dataset sourced from Kaggle that covers attributes related to students’ backgrounds and academic history. K-Means clustering is first applied to the dataset to identify distinct student clusters. Four machine learning algorithms, K-Nearest Neighbors (KNN), Neural Network (NN), Random Forest (RF), and Support Vector Machine (SVM), are then evaluated for their ability to predict academic performance. The experimental results show that K-Means clustering segments students into meaningful clusters based on shared characteristics. The comparative analysis shows that accuracy, precision, recall, and F1-score vary across the algorithms and across the clusters, and it identifies the algorithm that performs best in this context.
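The pipeline outlined in the abstract (cluster the students first, then score a classifier within each cluster on accuracy, precision, recall, and F1-score) can be illustrated with a minimal sketch. This is not the authors' code. It uses only the Python standard library, a small synthetic dataset in place of the Kaggle data, and KNN as a stand-in for the four algorithms compared in the paper. The feature names (`study_hours`, `prior_grade`), the pass/fail rule, the alternating hold-out split, and all function names are assumptions made for illustration.

```python
import math
import random


def kmeans(points, k, iters=100, seed=0):
    """Lloyd's algorithm on a list of numeric tuples; returns (centroids, labels)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # initial centroids drawn from the data
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: each point joins its nearest centroid.
        labels = [min(range(k), key=lambda c: math.dist(p, centroids[c]))
                  for p in points]
        # Update step: each centroid moves to the mean of its members.
        updated = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                updated.append(tuple(sum(col) / len(members) for col in zip(*members)))
            else:
                updated.append(centroids[c])  # keep an empty cluster's centroid
        if updated == centroids:
            break  # converged: no centroid moved
        centroids = updated
    return centroids, labels


def knn_predict(train_X, train_y, x, k=3):
    """Majority vote among the k training points nearest to x."""
    nearest = sorted(zip(train_X, train_y), key=lambda t: math.dist(t[0], x))[:k]
    votes = [label for _, label in nearest]
    return max(set(votes), key=votes.count)


def binary_metrics(y_true, y_pred, positive=1):
    """Accuracy, precision, recall and F1 for one positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    accuracy = sum(t == p for t, p in pairs) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


def evaluate_per_cluster(X, y, n_clusters=2):
    """Cluster X, then score a KNN classifier separately inside each cluster."""
    _, labels = kmeans(X, n_clusters)
    report = {}
    for c in range(n_clusters):
        idx = [i for i, lab in enumerate(labels) if lab == c]
        train, test = idx[0::2], idx[1::2]  # simple alternating hold-out split
        if not test:
            continue  # too few members to evaluate
        preds = [knn_predict([X[i] for i in train], [y[i] for i in train], X[j])
                 for j in test]
        report[c] = binary_metrics([y[j] for j in test], preds)
    return report


if __name__ == "__main__":
    # Synthetic stand-in data: (study_hours, prior_grade) with 1 = pass, 0 = fail.
    rng = random.Random(42)
    X = [(rng.uniform(0, 10), rng.uniform(40, 100)) for _ in range(60)]
    y = [1 if hours * 5 + grade > 95 else 0 for hours, grade in X]
    for cluster, scores in evaluate_per_cluster(X, y).items():
        print(cluster, {name: round(value, 2) for name, value in scores.items()})
```

In a real study the features would normally be scaled before distance-based steps such as K-Means and KNN, and a library implementation with cross-validation would likely replace these hand-written helpers. Reporting the four metrics per cluster, as here, is what allows algorithms to be compared across student segments.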

Acknowledgment

The authors thank Dr. Ranjit Patil, Principal, Dr. D. Y. Patil Arts, Science and Commerce College, Pimpri, Pune (MS), for help in writing this research paper and for useful discussions. We also express our sincere thanks to Dr. Bharat Shinde, Principal, and Mr. Gajanan Joshi, Head of the Department of Computer Science, Vidya Pratishthan’s Arts, Science and Commerce College, M.I.D.C., Baramati, Pune (MS), for their valuable guidance and suggestions regarding this work. We also extend our sincere appreciation to Mr. Amol Mohite for invaluable assistance in formatting this research paper.

Author information

Corresponding author

Correspondence to Gautam Appasaheb Kudale.

Copyright information

© 2025 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Kudale, G.A., Rajpoot, S.S. (2025). Enhancing Academic Performance Prediction Through K-Means Clustering and Comparative Evaluation of Machine Learning Algorithms: A Case Study on Student Dataset. In: Singh, R., Gehlot, A. (eds) Business Data Analytics. ICBDA 2023. Communications in Computer and Information Science, vol 2358. Springer, Cham. https://doi.org/10.1007/978-3-031-80778-7_20

  • DOI: https://doi.org/10.1007/978-3-031-80778-7_20

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-80777-0

  • Online ISBN: 978-3-031-80778-7

  • eBook Packages: Computer Science, Computer Science (R0)
