ABSTRACT
In this paper, we introduce ProgSnap2, a standardized format for logging programming process data. ProgSnap2 is a tool for computing education researchers, intended to enable collaboration by helping them collect and share data, analysis code, and data-driven tools to support students. We give an overview of the format, including how events, event attributes, metadata, code snapshots, and external resources are represented. We also present a case study to evaluate how ProgSnap2 can facilitate collaborative research. We investigated three metrics designed to quantify students' difficulty with compiler errors (the Error Quotient, Repeated Error Density, and Watwin score) and compared their distributions and their ability to predict students' performance. We analyzed five different ProgSnap2 datasets, spanning a variety of contexts and programming languages. We found that each error metric is mildly predictive of students' performance. We reflect on how the common data format allowed us to more easily investigate our research questions.
Index Terms
- ProgSnap2: A Flexible Format for Programming Process Data