ABSTRACT
The effectiveness of learning in massive open online courses (MOOCs) can be significantly enhanced by introducing personalized intervention schemes, which rely on predictive models of student learning behaviors such as engagement or performance indicators. A major challenge in building such models is designing handcrafted features that are effective for the prediction task at hand. In this paper, we make the first attempt to solve the feature learning problem by taking an unsupervised learning approach to learn a compact representation of raw features that have a large degree of redundancy. Specifically, to capture both the underlying learning patterns in the content domain and the temporal nature of the clickstream data, we train a modified auto-encoder (AE) combined with a long short-term memory (LSTM) network to obtain a fixed-length embedding for each input sequence. Compared with the original features, the new features that correspond to the embedding obtained by the modified LSTM-AE are not only more parsimonious but also more discriminative for our prediction task. Using simple supervised learning models, the learned features can improve the prediction accuracy by up to 17% compared with supervised neural networks and reduce overfitting to the dominant low-performing group of students, specifically in the task of predicting students' performance. Our approach is generic in the sense that it is restricted neither to a specific supervised learning model nor to a specific prediction task for MOOC learning analytics.
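To make the idea of a fixed-length embedding concrete, the following is a minimal sketch of a plain LSTM auto-encoder forward pass in pure Python. It is not the modified LSTM-AE of the paper, whose modifications the abstract does not specify. All names (`LSTMCell`, `encode`, `decode`, `W_out`), the feature and embedding sizes, and the random input sequences are assumptions made for illustration. The weights are random and untrained; in practice they would be fitted by minimizing the reconstruction loss, which is omitted here.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def make_matrix(rows, cols, rng):
    # Small random weights; a trained model would learn these instead.
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


def matvec(m, v):
    return [sum(w * x for w, x in zip(row, v)) for row in m]


class LSTMCell:
    """Hypothetical single LSTM cell with untrained weights."""

    def __init__(self, input_size, hidden_size, rng):
        self.hidden_size = hidden_size
        # One weight matrix per gate (input, forget, output, candidate),
        # each acting on the concatenation [x_t ; h_{t-1}].
        self.W = {gate: make_matrix(hidden_size, input_size + hidden_size, rng)
                  for gate in "ifog"}

    def step(self, x, h, c):
        z = x + h  # list concatenation of input and previous hidden state
        pre = {gate: matvec(self.W[gate], z) for gate in "ifog"}
        i = [sigmoid(v) for v in pre["i"]]
        f = [sigmoid(v) for v in pre["f"]]
        o = [sigmoid(v) for v in pre["o"]]
        g = [math.tanh(v) for v in pre["g"]]
        c_new = [fv * cv + iv * gv for fv, cv, iv, gv in zip(f, c, i, g)]
        h_new = [ov * math.tanh(cv) for ov, cv in zip(o, c_new)]
        return h_new, c_new


def encode(cell, sequence):
    # The final hidden state serves as the fixed-length embedding.
    h = [0.0] * cell.hidden_size
    c = [0.0] * cell.hidden_size
    for x in sequence:
        h, c = cell.step(x, h, c)
    return h, c


def decode(cell, W_out, h, c, length, input_size):
    # Start from the encoder state and feed each output back as the next input.
    x = [0.0] * input_size
    outputs = []
    for _ in range(length):
        h, c = cell.step(x, h, c)
        x = matvec(W_out, h)
        outputs.append(x)
    return outputs


def mse(seq_a, seq_b):
    total = sum((a - b) ** 2
                for xa, xb in zip(seq_a, seq_b)
                for a, b in zip(xa, xb))
    return total / sum(len(x) for x in seq_a)


rng = random.Random(0)
n_features, embed_dim = 6, 3  # assumed sizes, chosen only for the demo
encoder = LSTMCell(n_features, embed_dim, rng)
decoder = LSTMCell(n_features, embed_dim, rng)
W_out = make_matrix(n_features, embed_dim, rng)

# Two made-up clickstream-like sequences of different lengths.
short_seq = [[rng.random() for _ in range(n_features)] for _ in range(4)]
long_seq = [[rng.random() for _ in range(n_features)] for _ in range(9)]

emb_short, state_short = encode(encoder, short_seq)
emb_long, _ = encode(encoder, long_seq)
recon = decode(decoder, W_out, emb_short, state_short, len(short_seq), n_features)
loss = mse(short_seq, recon)  # the quantity training would minimize
```

Both sequences map to embeddings of the same length (`embed_dim`) regardless of how many time steps they contain, which is what allows a simple downstream classifier to consume them as ordinary feature vectors.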