research-article

Formalizing the LLVM intermediate representation for verified program transformations

Authors:
Jianzhou Zhao, University of Pennsylvania, Philadelphia, PA, USA
Santosh Nagarakatte, University of Pennsylvania, Philadelphia, PA, USA
Milo M.K. Martin, University of Pennsylvania, Philadelphia, PA, USA
Steve Zdancewic, University of Pennsylvania, Philadelphia, PA, USA

POPL '12: Proceedings of the 39th annual ACM SIGPLAN-SIGACT symposium on Principles of programming languages, January 2012, Pages 427–440. https://doi.org/10.1145/2103656.2103709

Published: 25 January 2012

ABSTRACT

This paper presents Vellvm (verified LLVM), a framework for reasoning about programs expressed in LLVM's intermediate representation and transformations that operate on it. Vellvm provides a mechanized formal semantics of LLVM's intermediate representation, its type system, and properties of its SSA form. The framework is built using the Coq interactive theorem prover. It includes multiple operational semantics and proves relations among them to facilitate different reasoning styles and proof techniques.
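
To make the idea of an operational semantics for an SSA-form IR concrete, here is a minimal sketch in Python. It is not Vellvm's Coq development and not LLVM's real instruction set: the `Block` layout, the `OPS` table, the block-level `step` function, and the `PROGRAM` example are all hypothetical names chosen for illustration. The one SSA-specific point it shows is that phi nodes are evaluated together against the environment of the predecessor block.

```python
from dataclasses import dataclass

# Hypothetical toy IR for illustration only; not LLVM's instruction set
# and not the definitions used in Vellvm.
@dataclass
class Block:
    phis: list    # entries (dest, {predecessor_label: operand})
    insns: list   # entries (dest, op, lhs, rhs)
    term: tuple   # ("br", label) | ("cbr", cond, then_label, else_label) | ("ret", operand)

@dataclass
class Done:
    result: int

OPS = {
    "add": lambda a, b: a + b,
    "sub": lambda a, b: a - b,
    "lt": lambda a, b: int(a < b),
}

def value(env, operand):
    """An operand is either an int constant or the name of an SSA variable."""
    return operand if isinstance(operand, int) else env[operand]

def step(blocks, state):
    """Execute one basic block: (label, predecessor, env) -> next state or Done."""
    label, pred, env = state
    block = blocks[label]
    # All phi nodes read the incoming environment before any of them writes.
    phi_values = {dest: value(env, incoming[pred]) for dest, incoming in block.phis}
    env = {**env, **phi_values}
    for dest, op, lhs, rhs in block.insns:
        env = {**env, dest: OPS[op](value(env, lhs), value(env, rhs))}
    kind = block.term[0]
    if kind == "ret":
        return Done(value(env, block.term[1]))
    if kind == "br":
        return (block.term[1], label, env)
    _, cond, then_label, else_label = block.term
    target = then_label if value(env, cond) != 0 else else_label
    return (target, label, env)

def run(blocks, entry="entry"):
    """Iterate the step function until the program returns."""
    state = (entry, None, {})
    while not isinstance(state, Done):
        state = step(blocks, state)
    return state.result

# Sums the integers 0..4 with a loop whose header uses phi nodes.
PROGRAM = {
    "entry": Block([], [], ("br", "loop")),
    "loop": Block(
        [("i", {"entry": 0, "body": "i2"}), ("acc", {"entry": 0, "body": "acc2"})],
        [("c", "lt", "i", 5)],
        ("cbr", "c", "body", "exit"),
    ),
    "body": Block([], [("acc2", "add", "acc", "i"), ("i2", "add", "i", 1)], ("br", "loop")),
    "exit": Block([], [], ("ret", "acc")),
}
```

Calling `run(PROGRAM)` repeatedly applies `step` until a `ret` terminator is reached. A mechanized semantics would state the same transition as a relation or function inside the proof assistant so that properties of it can be proved.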

To validate Vellvm's design, we extract an interpreter from the Coq formal semantics that can execute programs from the LLVM test suite and thus be compared against the LLVM reference implementations. To demonstrate Vellvm's practicality, we formalize and verify a previously proposed transformation that hardens C programs against spatial memory safety violations. Vellvm's tools allow us to extract a new, verified implementation of the transformation pass that plugs into the real LLVM infrastructure; its performance is competitive with the non-verified, ad hoc original.
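
As a rough illustration of what guarding against spatial memory safety violations can mean, the sketch below models one common approach: each pointer carries base and bound metadata, and every load or store is checked against it. This is an assumption made for illustration, not a description of the transformation verified in the paper; `CheckedPtr`, `Memory`, and `SpatialSafetyError` are hypothetical names, and every access is assumed to touch a single cell.

```python
from dataclasses import dataclass

class SpatialSafetyError(Exception):
    """Raised when an access falls outside the pointer's own allocation."""

@dataclass(frozen=True)
class CheckedPtr:
    addr: int    # address the pointer currently designates
    base: int    # first valid address of the underlying allocation
    bound: int   # one past the last valid address

    def offset(self, n):
        # Pointer arithmetic moves addr; the metadata is inherited unchanged.
        return CheckedPtr(self.addr + n, self.base, self.bound)

class Memory:
    """Flat memory with a bump allocator (hypothetical model, one cell per access)."""

    def __init__(self, size):
        self.cells = [0] * size
        self.next_free = 0

    def alloc(self, n):
        if self.next_free + n > len(self.cells):
            raise MemoryError("out of memory")
        ptr = CheckedPtr(self.next_free, self.next_free, self.next_free + n)
        self.next_free += n
        return ptr

    def _check(self, ptr):
        # The check inserted before every dereference.
        if not (ptr.base <= ptr.addr < ptr.bound):
            raise SpatialSafetyError(f"address {ptr.addr} outside [{ptr.base}, {ptr.bound})")

    def load(self, ptr):
        self._check(ptr)
        return self.cells[ptr.addr]

    def store(self, ptr, v):
        self._check(ptr)
        self.cells[ptr.addr] = v
```

With two adjacent four-cell allocations `a` and `b`, a store through `a.offset(4)` lands on an address that is valid memory but belongs to `b`; the check rejects it because the address lies outside the bounds recorded for `a`.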

Index Terms

Formalizing the LLVM intermediate representation for verified program transformations
1. Theory of computation
  1. Semantics and reasoning
    1. Program reasoning
      1. Program verification
    2. Program semantics
      1. Operational semantics

Published in
POPL '12: Proceedings of the 39th annual ACM SIGPLAN-SIGACT symposium on Principles of programming languages
January 2012
602 pages
ISBN: 9781450310833
DOI: 10.1145/2103656
General Chair: John Field (Google, USA)
Program Chair: Michael Hicks (University of Maryland, College Park, USA)

ACM SIGPLAN Notices, Volume 47, Issue 1
POPL '12
January 2012
569 pages
ISSN: 0362-1340
EISSN: 1558-1160
DOI: 10.1145/2103621
Copyright © 2012 ACM
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
Coq
LLVM
memory safety