DOI: 10.1145/3545258.3545283
research-article

An adaptive search optimization algorithm for improving the detection capability of software vulnerability

Published: 15 September 2022

ABSTRACT

Deep learning-based vulnerability detection frees human experts from the tedious task of defining features and can achieve better detection capability. The common practice is to convert program code into vector representations for training a neural network model. Because the length of the vector representation varies from program to program, choosing the vector length well is critical to detection accuracy. This paper proposes an adaptive search optimization algorithm for finding the optimal vector length. It sorts all the vector lengths obtained by word2vec and outputs the vector length at the point where the trend changes from slow to fast. We evaluate our algorithm on three publicly available datasets against state-of-the-art algorithms. The results show that, without significantly increasing the time overhead, our algorithm chooses an appropriate vector length more accurately than setting a value empirically or arbitrarily. Furthermore, they show that while a larger vector length usually produces higher detection accuracy, the resulting accuracy improvement often does not suffice to compensate for the extra time overhead incurred.
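
The selection step described in the abstract can be illustrated with a minimal sketch. The abstract does not say how the "slow to fast" change point is located, so the rule used here (the sorted point that lies furthest below the straight line joining the smallest and largest lengths) is an assumption, and the function name `select_vector_length` is hypothetical. This is not the authors' implementation.

```python
from typing import Sequence


def select_vector_length(lengths: Sequence[int]) -> int:
    """Return the length at the 'elbow' of the sorted length curve.

    Assumption: the elbow is the sorted point furthest below the chord
    that joins the smallest and the largest length. The paper may use a
    different criterion.
    """
    if not lengths:
        raise ValueError("lengths must be non-empty")

    ordered = sorted(lengths)
    span = len(ordered) - 1
    if span < 2:
        # Too few points to have an elbow; fall back to the largest length.
        return ordered[-1]

    first, last = ordered[0], ordered[-1]
    best_index, best_gap = span, 0
    for i, value in enumerate(ordered):
        # Vertical gap between chord and curve, scaled by `span` so the
        # arithmetic stays in integers.
        gap = first * span + (last - first) * i - value * span
        if gap > best_gap:
            best_index, best_gap = i, gap

    # If no point lies below the chord, this is the largest length.
    return ordered[best_index]


if __name__ == "__main__":
    # Hypothetical per-sample token counts, not data from the paper.
    sample_lengths = [14, 10, 160, 12, 16, 11, 40, 13, 80, 15]
    print(select_vector_length(sample_lengths))
```

With the hypothetical lengths above, the sorted curve rises slowly from 10 to 16 and then jumps to 40, 80, and 160, so the sketch returns 16, the last value before the rapid growth begins.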

      • Published in

        Internetware '22: Proceedings of the 13th Asia-Pacific Symposium on Internetware
        June 2022
        291 pages
        ISBN: 9781450397803
        DOI: 10.1145/3545258

        Copyright © 2022 ACM

        Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

        Publisher

        Association for Computing Machinery

        New York, NY, United States

        Qualifiers

        • research-article
        • Research
        • Refereed limited

        Acceptance Rates

        Overall Acceptance Rate: 55 of 111 submissions, 50%