On the Representation of Results of Binary Code Reverse Engineering

Programming and Computer Software

Abstract

The representation of algorithms extracted from binary code by reverse engineering is discussed. Both intermediate representations designed for automatic analysis and final representations passed to the end user are considered. The two main tasks of reverse engineering are analyzed: automatic detection of exploitable vulnerabilities and discovery of undocumented features. For the first task, the basic scheme of a system that automatically detects exploitable vulnerabilities is presented, together with the key properties of an intermediate representation designed to support efficient generation of a system of equations for an SMT solver. For the second task, the workflow for discovering undocumented features is described. It consists of three steps: localization of the algorithm, its representation in a form convenient for analysis, and investigation of its properties. To automate the first step, a combined static and dynamic representation is constructed that includes OS-level events and calls to library functions; these serve as anchor points that the analyst uses to localize the algorithm. Localization is further supported by code slicing and navigation algorithms. Once the algorithm is localized, work proceeds in two directions: interactive construction of a compact annotated flowchart of the algorithm, and automated investigation of the algorithm's properties aimed at determining declared and undeclared data flows. The flowchart representation is based on constructing simplified models of functions that take input and output buffers into account, and on automatically detecting data dependences between the buffers of different function calls. The overall scenario of the analyst's work with such a flowchart in the context of discovering undocumented features is described; this scenario is based on annotating the declared data flows and on automatically detecting undeclared ones. In conclusion, an example of the resulting representation is presented and directions for further research are outlined.
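
The detection of data dependences between the buffers of function calls, and the separation of declared from undeclared data flows, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the names Buffer, Call, find_flows, and undeclared_flows, the byte-granular tracking of the last writer, and the example trace are all assumptions introduced here for illustration only. Each call is reduced to a simplified model consisting of its input and output buffers, and call names are assumed to be unique within the trace.

```python
from dataclasses import dataclass
from typing import Dict, List, Set, Tuple


@dataclass(frozen=True)
class Buffer:
    """A contiguous memory region (hypothetical simplified model)."""
    addr: int
    size: int


@dataclass(frozen=True)
class Call:
    """Simplified model of a function call: only its buffers are kept."""
    name: str
    inputs: Tuple[Buffer, ...]
    outputs: Tuple[Buffer, ...]


Flow = Tuple[str, str]  # (producer call, consumer call)


def find_flows(trace: List[Call]) -> Set[Flow]:
    """Return producer/consumer pairs linked through shared buffer bytes."""
    last_writer: Dict[int, str] = {}  # byte address -> call that last wrote it
    flows: Set[Flow] = set()
    for call in trace:
        # A call depends on whoever last wrote any byte it reads.
        for buf in call.inputs:
            for a in range(buf.addr, buf.addr + buf.size):
                writer = last_writer.get(a)
                if writer is not None:
                    flows.add((writer, call.name))
        # Record this call as the latest writer of its output bytes.
        for buf in call.outputs:
            for a in range(buf.addr, buf.addr + buf.size):
                last_writer[a] = call.name
    return flows


def undeclared_flows(trace: List[Call], declared: Set[Flow]) -> Set[Flow]:
    """Flows observed in the trace that the analyst has not annotated."""
    return find_flows(trace) - declared


# Hypothetical trace: data is received, decrypted, parsed, and also sent out.
trace = [
    Call("recv", (), (Buffer(0x1000, 64),)),
    Call("decrypt", (Buffer(0x1000, 64),), (Buffer(0x2000, 64),)),
    Call("parse", (Buffer(0x2000, 32),), ()),
    Call("send", (Buffer(0x2020, 16),), ()),
]
# Flows the analyst has annotated as declared behavior.
declared = {("recv", "decrypt"), ("decrypt", "parse")}

print(sorted(undeclared_flows(trace, declared)))  # [('decrypt', 'send')]
```

In this toy trace the flow from decrypt to send is reported because send reads bytes that decrypt wrote, yet no annotation covers it. A real tool would work on recorded execution traces and far richer function models; the sketch only shows the shape of the comparison between observed and declared flows.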

Author information

Correspondence to V. A. Padaryan.

Additional information

Original Russian Text © V.A. Padaryan, I.N. Ledovskikh, 2018, published in Trudy Instituta Sistemnogo Programmirovaniya, 2017, Vol. 29, No. 3.

This work was supported by RFBR grant no. 16-29-09632.

About this article

Cite this article

Padaryan, V.A., Ledovskikh, I.N. On the Representation of Results of Binary Code Reverse Engineering. Program Comput Soft 44, 200–206 (2018). https://doi.org/10.1134/S0361768818030064

  • DOI: https://doi.org/10.1134/S0361768818030064
