DOI: 10.1145/3341525.3387373 (ITiCSE conference proceedings, research article)

ProgSnap2: A Flexible Format for Programming Process Data

Published: 15 June 2020

ABSTRACT

In this paper, we introduce ProgSnap2, a standardized format for logging programming process data. ProgSnap2 is a tool for computing education researchers, with the goal of enabling collaboration by helping them collect and share data, analysis code, and data-driven tools that support students. We give an overview of the format, including how events, event attributes, metadata, code snapshots and external resources are represented. We also present a case study evaluating how ProgSnap2 can facilitate collaborative research. We investigated three metrics designed to quantify students' difficulty with compiler errors (the Error Quotient, Repeated Error Density and the Watwin score) and compared their distributions and their ability to predict students' performance. We analyzed five ProgSnap2 datasets spanning a variety of contexts and programming languages, and found that each error metric is mildly predictive of students' performance. We reflect on how the common data format allowed us to investigate our research questions more easily.
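The abstract names three compiler-error metrics computed from a stream of compilation events. As a rough illustration of how such metrics work, the sketch below implements a simplified Error Quotient (after Jadud, 2006) and Repeated Error Density (after Becker, 2016). The pair scoring here (2 points when consecutive compiles both fail, +3 when the error type repeats, normalized by a maximum of 5 per pair) and the treatment of r as the repeat count within each run are simplifying assumptions, not the papers' exact definitions; likewise, representing events as a plain list of error types (with None for a successful compile) is illustrative and is not the ProgSnap2 schema itself.

```python
from itertools import groupby


def error_quotient(errors):
    """Simplified Error Quotient over a student's compile sequence.

    errors: list where each element is an error-type string for a failed
    compile, or None for a successful compile. Scores consecutive pairs
    and normalizes to [0, 1]; higher means more persistent errors.
    """
    pairs = list(zip(errors, errors[1:]))
    if not pairs:
        return 0.0
    total = 0
    for first, second in pairs:
        if first is not None and second is not None:
            score = 2            # two failed compiles in a row
            if first == second:
                score += 3       # same error type repeated
            total += score
    return total / (5 * len(pairs))  # 5 = max score per pair in this sketch


def repeated_error_density(errors):
    """Simplified Repeated Error Density: sum of r^2 / (r + 1) over runs
    of the same error, where r counts the repeats in a run (run length
    minus one). A run of one error contributes nothing."""
    red = 0.0
    for err, run in groupby(errors):
        repeats = len(list(run)) - 1
        if err is not None and repeats >= 1:
            red += (repeats * repeats) / (repeats + 1)
    return red
```

For the sequence `["A", "A", None, "B", "B", "B"]`, the Error Quotient scores three of five pairs at the maximum, and the RED term accumulates contributions from the A-run and the longer B-run, showing how both metrics reward breaking out of repeated-error loops.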


Published in

ITiCSE '20: Proceedings of the 2020 ACM Conference on Innovation and Technology in Computer Science Education
June 2020, 615 pages
ISBN: 978-1-4503-6874-2
DOI: 10.1145/3341525

Copyright © 2020 ACM

Publisher: Association for Computing Machinery, New York, NY, United States


Acceptance Rates

Overall acceptance rate: 552 of 1,613 submissions, 34%
