DOI: 10.1145/3351287.3351290
Research article

The False-Positive Rate of Automated Plagiarism Detection for SQL Assessments

Published: 5 September 2019

ABSTRACT

Automated assessment is becoming increasingly common in Computer Science, and with it, automated plagiarism detection. However, little attention has been paid to SQL assessment, where submissions are much shorter and necessarily less varied than in imperative languages. This raises the challenge of avoiding high false-positive rates, which require manual inspection and undermine the usefulness of automated detection.

In this paper we investigate the false-positive rate of various automated plagiarism detection algorithms. We find a significant false-positive rate of between 15% and 64%. These results call into question the usefulness of automated detection for SQL, since they imply that substantial manual inspection will still be needed.

However, our results suggest that the false-positive rate may be restricted to shorter queries (e.g. under 200 characters). Further research is needed because our datasets consist mostly of short queries and the results for longer queries are based on a small subset of the data.
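The core difficulty described above can be illustrated with a small sketch (not one of the paper's algorithms; the queries and the `SequenceMatcher` similarity metric are illustrative assumptions). For a simple SQL exercise, the space of correct answers is so small that two independently written submissions are nearly identical, so a textual-similarity detector flags them:

```python
from difflib import SequenceMatcher

# Two short SQL queries written independently by different students.
# For a simple exercise the space of correct answers is tiny, so
# near-identical submissions are expected even without collusion.
a = "SELECT name, salary FROM employees WHERE salary > 50000 ORDER BY name;"
b = "SELECT name, salary FROM employees WHERE salary > 40000 ORDER BY name;"

# Character-level similarity ratio in [0, 1].
similarity = SequenceMatcher(None, a.lower(), b.lower()).ratio()
print(f"{similarity:.2f}")  # close to 1.0 despite independent authorship
```

A detector thresholding on such a ratio would report these two submissions as a likely plagiarism pair, i.e. a false positive; longer, more varied queries leave more room for genuinely distinctive solutions.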


Published in:
UKICER '19: Proceedings of the 2019 Conference on United Kingdom & Ireland Computing Education Research
September 2019, 81 pages
ISBN: 9781450372572
DOI: 10.1145/3351287

        Copyright © 2019 ACM

        Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

        Publisher

        Association for Computing Machinery

        New York, NY, United States



Qualifiers: research article, refereed limited
