DOI: 10.1145/3133956.3134085
research-article

SemFuzz: Semantics-based Automatic Generation of Proof-of-Concept Exploits

Published: 30 October 2017

ABSTRACT

Patches and related information about software vulnerabilities are often made public to facilitate timely fixes. Unfortunately, the slow pace of system updates (30 days on average) often gives attackers enough time to recover hidden bugs and attack unpatched systems. Making things worse is the potential to automatically generate exploits for input-validation flaws by reverse-engineering patches, even though such vulnerabilities are relatively rare (e.g., 5% of all Linux kernel vulnerabilities in the last few years). Less understood, however, are the implications of other bug-related information (e.g., bug descriptions in CVE), particularly whether such information can facilitate exploit generation, even for vulnerability types that have never been automatically attacked.

In this paper, we seek to use such information to generate proof-of-concept (PoC) exploits for vulnerability types that have never been automatically attacked. Unlike an input-validation flaw, which is often patched by adding missing sanitization checks, other vulnerability types are more complicated to fix, and the fix usually replaces a whole chunk of code. Without an understanding of the changed code, automatic exploit generation becomes less likely. To address this challenge, we present SemFuzz, a novel technique that leverages vulnerability-related text (e.g., CVE reports and Linux git logs) to guide the automatic generation of PoC exploits. This end-to-end approach is made possible by natural-language processing (NLP) based information extraction and a semantics-based fuzzing process guided by the extracted information. Run over 112 Linux kernel flaws reported in the past five years, SemFuzz successfully triggered 18 of them and further discovered one zero-day and one undisclosed vulnerability. These flaws include use-after-free, memory corruption, and information leak, among others, indicating that more complicated flaws can also be automatically attacked. This finding calls into question the way vulnerability-related information is shared today.
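To make the idea of text-guided fuzzing concrete, the following is a minimal sketch and not the SemFuzz implementation. It assumes that simple regular expressions can stand in for the NLP-based extraction described above, and that the extracted names are used only to reorder a list of candidate calls. The description string, the keyword list, and the helper names `extract_hints` and `prioritize` are all hypothetical and written for illustration.

```python
import re

# Hypothetical vulnerability description written for illustration;
# it is not a quotation from a real CVE entry.
DESCRIPTION = (
    "Race condition in net/packet/af_packet.c allows local users to cause a "
    "use-after-free by calling setsockopt() while packet_set_ring() is running."
)

# Assumed, non-exhaustive keyword list for vulnerability types.
VULN_TYPES = ("use-after-free", "race condition", "information leak", "memory corruption")


def extract_hints(text):
    """Pull C source files, function names, and vulnerability types out of free text."""
    lowered = text.lower()
    return {
        # Paths ending in .c or .h, e.g. "net/packet/af_packet.c".
        "files": re.findall(r"[\w/]+\.[ch]\b", text),
        # Identifiers written with trailing parentheses, e.g. "setsockopt()".
        "functions": re.findall(r"\b([A-Za-z_]\w*)\(\)", text),
        # Known vulnerability-type keywords that appear in the text.
        "types": [t for t in VULN_TYPES if t in lowered],
    }


def prioritize(candidates, hints):
    """Order candidate calls so that those named in the text are tried first."""
    named = set(hints["functions"])
    # sorted() is stable, so the original order is kept within each group.
    return sorted(candidates, key=lambda name: name not in named)


if __name__ == "__main__":
    hints = extract_hints(DESCRIPTION)
    print(hints)
    print(prioritize(["open", "setsockopt", "read", "packet_set_ring"], hints))
```

A real system would need far more robust language processing and a kernel fuzzing backend; the sketch only shows how names mentioned in a report could bias which calls a fuzzer tries first.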



Published in

CCS '17: Proceedings of the 2017 ACM SIGSAC Conference on Computer and Communications Security
October 2017
2682 pages
ISBN: 9781450349468
DOI: 10.1145/3133956

Copyright © 2017 ACM

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

Publisher: Association for Computing Machinery, New York, NY, United States


Acceptance Rates

CCS '17 paper acceptance rate: 151 of 836 submissions, 18%. Overall acceptance rate: 1,261 of 6,999 submissions, 18%.
