DOI: 10.1145/3395363.3397375
research-article

DeepSQLi: deep semantic learning for testing SQL injection

Published: 18 July 2020

ABSTRACT

Security is unarguably the most serious concern for Web applications, for which SQL injection (SQLi) is one of the most devastating attacks. Automatically testing for SQLi vulnerabilities is of the utmost importance, yet is unfortunately far from trivial to implement. This is because of the huge, or potentially infinite, number of variants and semantic possibilities of SQL that can lead to SQLi attacks on various Web applications. In this paper, we propose a deep natural language processing based tool, dubbed DeepSQLi, to generate test cases for detecting SQLi vulnerabilities. By adopting a deep learning based neural language model and sequence-of-words prediction, DeepSQLi is equipped with the ability to learn the semantic knowledge embedded in SQLi attacks, allowing it to translate user inputs (or a test case) into a new test case that is semantically related and potentially more sophisticated. Experiments are conducted to compare DeepSQLi with SQLmap, a state-of-the-art SQLi testing automation tool, on six real-world Web applications of different scales, characteristics and domains. Empirical results demonstrate the effectiveness and the remarkable superiority of DeepSQLi over SQLmap: more SQLi vulnerabilities can be identified using fewer test cases, whilst running much faster.
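To make the idea of sequence-of-words prediction over test inputs concrete, the following is a minimal sketch that swaps the neural language model for a simple bigram frequency model. It is not DeepSQLi's architecture or training data: the toy corpus, the tokenization, and the function names `train_bigrams`, `predict_next` and `complete` are all assumptions made purely for illustration.

```python
from collections import Counter, defaultdict

# Hypothetical toy corpus of already-tokenized test inputs (not from the paper).
CORPUS = [
    ["'", "OR", "1", "=", "1", "--"],
    ["'", "OR", "'a'", "=", "'a'", "--"],
    ["'", "OR", "1", "=", "1", "#"],
]


def train_bigrams(corpus):
    """Count, for each token, which tokens follow it in the corpus."""
    follows = defaultdict(Counter)
    for tokens in corpus:
        for current, nxt in zip(tokens, tokens[1:]):
            follows[current][nxt] += 1
    return follows


def predict_next(follows, token):
    """Return the most frequent successor of `token`, or None if it has none."""
    if token not in follows:
        return None
    # Order by descending count, then by token text, so ties are deterministic.
    return min(follows[token].items(), key=lambda kv: (-kv[1], kv[0]))[0]


def complete(follows, prefix, max_len=8):
    """Greedily extend `prefix` one predicted token at a time."""
    out = list(prefix)
    while len(out) < max_len:
        nxt = predict_next(follows, out[-1])
        if nxt is None:
            break
        out.append(nxt)
    return out


model = train_bigrams(CORPUS)
print(complete(model, ["'"], max_len=4))
```

Because this stand-in conditions on a single preceding token, longer greedy completions fall into a repeating cycle. That limitation belongs to the bigram simplification used here and says nothing about the model described in the abstract.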


Published in

ISSTA 2020: Proceedings of the 29th ACM SIGSOFT International Symposium on Software Testing and Analysis
July 2020
591 pages
ISBN: 9781450380089
DOI: 10.1145/3395363

Copyright © 2020 ACM


Publisher

Association for Computing Machinery, New York, NY, United States


Acceptance Rates

Overall acceptance rate: 58 of 213 submissions, 27%
