Research article
A Model of How Students Engineer Test Cases With Feedback

Published: 14 January 2024

Abstract

Background and Context. Students’ programming projects are often assessed on the basis of their tests as well as their implementations, most commonly using test adequacy criteria like branch coverage or, in some cases, mutation analysis. As a result, students are implicitly encouraged to use these tools during their development process (i.e., so they are aware of the strength of their own test suites).
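To ground the two criteria: branch coverage measures which decision outcomes a test suite executes, while mutation analysis checks whether the tests detect small seeded faults. A minimal, hypothetical illustration (not taken from the study) of how full branch coverage can still let a boundary mutant survive:

```python
# Hypothetical example: 100% branch coverage does not imply a strong test suite.

def grade(score):
    """Return 'pass' when score is at least 70, else 'fail'."""
    if score >= 70:
        return "pass"
    return "fail"

def test_grade():
    # These two cases execute both branches, so branch coverage is 100%.
    assert grade(90) == "pass"
    assert grade(10) == "fail"

# A mutant that changes `>=` to `>` passes the same two tests, because
# neither input exercises the boundary score of 70 -- the mutant survives.
def grade_mutant(score):
    if score > 70:  # mutated operator
        return "pass"
    return "fail"

test_grade()
# Only a boundary-value test distinguishes the original from the mutant.
assert grade(70) == "pass" and grade_mutant(70) == "fail"
```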

Objectives. Little is known about how students choose test cases for their software while being guided by these feedback mechanisms. We aim to explore the interaction between students and commonly used testing feedback mechanisms (in this case, branch coverage and mutation-based feedback).

Method. We use grounded theory to explore this interaction. We conducted 12 think-aloud interviews in which students completed a series of software testing tasks, each involving a different feedback mechanism. Interviews were recorded, transcripts were analyzed, and we present the overarching themes that emerged from our analysis.

Findings. Our findings are organized into a process model describing how students completed software testing tasks while guided by a test adequacy criterion. Students commonly employed program comprehension strategies to reason about feedback and devise test cases. Mutation-based feedback tended to be cognitively overwhelming, and students resorted to weaker heuristics to address it.
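As a rough picture of why mutation-based feedback carries a heavier cognitive load: it confronts the tester with many seeded variants of the program at once, each of which must be reasoned about individually. A minimal sketch of how such feedback is computed (hypothetical; not the study's tooling):

```python
# Minimal sketch of mutation analysis (hypothetical; not the study's tooling).
# Each "mutant" is the program with one small change; a suite's mutation
# score is the fraction of mutants it detects (kills).

def make_mutants():
    # Program under test: absolute value. Each mutant tweaks one operator.
    original = lambda x: x if x >= 0 else -x
    mutants = [
        lambda x: x if x > 0 else -x,    # >= changed to >
        lambda x: x if x >= 0 else x,    # negation dropped
        lambda x: -x if x >= 0 else -x,  # branch result swapped
    ]
    return original, mutants

def mutation_score(tests, mutants):
    # A mutant is "killed" if at least one test fails on it.
    killed = sum(1 for m in mutants
                 if any(m(x) != expected for x, expected in tests))
    return killed / len(mutants)

tests = [(5, 5), (-3, 3)]  # no test at the boundary x == 0
original, mutants = make_mutants()
assert all(original(x) == e for x, e in tests)
# The >=-to-> mutant survives (it only differs at x == 0): 2 of 3 killed.
assert mutation_score(tests, mutants) == 2 / 3
```

Each surviving mutant asks the student to work out which input would distinguish it from the original program, so the feedback scales in effort with the number of mutants rather than with the number of branches.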

Implications. In the presence of testing feedback, students appeared to treat program coverage, rather than problem coverage, as the goal of testing. While test adequacy criteria can be useful for assessing software tests, we must consider whether they represent good goals for testing, and whether our current methods of practice and assessment encourage poor testing habits.



Published in ACM Transactions on Computing Education, Volume 24, Issue 1 (March 2024), 412 pages. EISSN: 1946-6226. DOI: 10.1145/3613506. Editor: Amy J. Ko.


Publisher: Association for Computing Machinery, New York, NY, United States

Publication History

• Published: 14 January 2024
• Online AM: 20 October 2023
• Accepted: 12 October 2023
• Revised: 1 October 2023
• Received: 26 May 2023
