ABSTRACT
Reading a programming error message is the first step in understanding what it says about how to fix an error in one's code. However, error messages are often difficult to read, especially for novices. This is not surprising, given that the messages in many of the most popular languages in which novices learn to code were not written with readability in mind. As a result, novices frequently struggle to understand them. This is a long-standing problem: researchers have raised concerns about the readability of programming error messages over the last six decades. Very recent work has put forward evidence of the need to measure readability in error messages, along with a framework for doing so. This framework consists of four factors of readability for programming error messages: message length, vocabulary, jargon, and sentence construction. We use this framework to implement an approach that automatically assesses the readability of programming error messages. Using these established readability factors as predictors, we train several machine learning models on a dataset of C and Java error messages. We examine the performance of these models and apply the best-performing model to a previously published set of messages evaluated for readability by experts, non-experts, and students. Our results validate the previously proposed readability factors, and our model classifies messages similarly to human raters. Finally, we discuss future work needed to improve the accuracy of the model.
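To make the four factors concrete, the following is a minimal sketch of how one error message might be turned into numeric predictors for a classifier. It is not the authors' implementation: the function name `readability_features`, the two word lists, and the way each factor is measured (word count, share of words outside a common-word list, share of jargon terms, words per sentence) are all assumptions made for illustration, since the abstract does not specify them.

```python
import re

# Hypothetical word lists, chosen only for illustration; the abstract does not
# say which vocabulary or jargon resources the actual study used.
COMMON_WORDS = {"a", "an", "the", "is", "in", "of", "to", "not", "was",
                "be", "found", "expected", "before", "this", "cannot"}
JARGON_TERMS = {"identifier", "symbol", "token", "declaration", "lvalue",
                "operand", "dereference", "unreachable"}


def readability_features(message: str) -> dict:
    """Map one error message to four numeric features, one per factor."""
    # Keep alphabetic words only, so quoted punctuation such as ';' is ignored.
    words = re.findall(r"[A-Za-z]+", message.lower())
    n = len(words) or 1  # avoid division by zero on an empty message
    # Split on sentence-ending punctuation; many messages contain none,
    # in which case the whole message counts as one sentence.
    sentences = [s for s in re.split(r"[.!?]+", message) if s.strip()]
    return {
        # Message length: number of words.
        "length": len(words),
        # Vocabulary: share of words outside the common-word list.
        "uncommon_ratio": sum(w not in COMMON_WORDS for w in words) / n,
        # Jargon: share of words that are programming-specific terms.
        "jargon_ratio": sum(w in JARGON_TERMS for w in words) / n,
        # Sentence construction: a crude proxy, average words per sentence.
        "words_per_sentence": len(words) / max(len(sentences), 1),
    }


features = readability_features("error: expected ';' before '}' token")
print(features)
```

A feature dictionary like this, computed for every message in a labelled dataset, could then serve as the input row for whichever machine learning model is being trained.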
Index Terms
- First Steps Towards Predicting the Readability of Programming Error Messages