ABSTRACT
Profiling, security hardening, and architectural adaptation require comprehensive analysis of all executable code and fast turnaround for transformations, which makes it essential to operate directly on binaries. Disassembling binaries is difficult, however, and prior work relies either on a process virtual machine that translates references on the fly or on inefficient binary code patching. Our Egalito recompiler leverages metadata present in current stripped x86_64 and ARM64 binaries to generate a complete disassembly, and allows arbitrary modifications that may affect program layout, free of any constraints from the original binary. We use our own layout-agnostic intermediate representation, which is low-level enough to make the regeneration of output code predictable, yet supports a dual high-level representation for sophisticated analysis. We demonstrate nine binary tools, including a novel continuous code randomization technique in which Egalito transforms itself, and a software emulation of the control-flow integrity in upcoming hardware. We evaluated Egalito on a large set of Debian packages, completely analyzing 99.9% of a selection of 867 executables and libraries; a majority of 149 applicable Debian packages pass all tests under Egalito. On SPEC CPU 2006, thanks to our binary optimizations, we observe a 1.7% performance speedup with Egalito.
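To make the idea of a layout-agnostic representation concrete, here is a minimal toy sketch in Python. It is not Egalito's actual intermediate representation or API; the `Instr` class, the helper functions, and the instruction sizes are all hypothetical and chosen only for illustration. The point it shows is that a branch stores a link to its target object rather than a numeric address, so addresses can be assigned late and reassigned after the code is modified.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(eq=False)
class Instr:
    """Toy instruction (hypothetical): a branch links to its target object."""
    mnemonic: str
    size: int                         # encoded size in bytes (made-up values)
    target: Optional["Instr"] = None  # link to another instruction, not an address
    address: int = -1                 # filled in only when a layout is chosen


def assign_layout(code: List[Instr], base: int) -> None:
    """Choose a concrete layout by giving every instruction an address."""
    addr = base
    for ins in code:
        ins.address = addr
        addr += ins.size


def render(code: List[Instr]) -> List[str]:
    """Resolve each link against the current layout."""
    lines = []
    for ins in code:
        text = f"{ins.address:#x}: {ins.mnemonic}"
        if ins.target is not None:
            text += f" {ins.target.address:#x}"
        lines.append(text)
    return lines


ret = Instr("ret", 1)
add = Instr("add", 3)
jmp = Instr("jmp", 5, target=ret)
code = [jmp, add, ret]

assign_layout(code, 0x1000)
before = render(code)

# A transformation inserts an instruction and relocates the code.
# The jmp still reaches ret because the link never held an address.
code.insert(1, Instr("nop", 1))
assign_layout(code, 0x4000)
after = render(code)

print("\n".join(before + ["--"] + after))
```

In this sketch the jump target is recomputed from the link each time a layout is assigned, which is why inserting the `nop` and moving the base address does not require patching any stored reference.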
Egalito: Layout-Agnostic Binary Recompilation
ASPLOS '20, March 16--20, 2020, Lausanne, Switzerland. D. Williams-King, H. Kobayashi, K. Williams-King, et al.