Robustness Testing of Intermediate Verifiers

  • Conference paper
Automated Technology for Verification and Analysis (ATVA 2018)

Part of the book series: Lecture Notes in Computer Science (LNPSE, volume 11138)

Abstract

Program verifiers are not exempt from the bugs that affect nearly every piece of software. In addition, they often exhibit brittle behavior: their performance changes considerably with details of how the input program is expressed, details that should be irrelevant, such as the order of independent declarations. Such a lack of robustness frustrates users, who have to spend considerable time figuring out a tool’s idiosyncrasies before they can use it effectively. This paper introduces a technique to detect lack of robustness of program verifiers; the technique is lightweight and fully automated, as it is based on testing methods (such as mutation testing and metamorphic testing). The key idea is to generate many simple variants of a program that initially passes verification. All variants are, by construction, equivalent to the original program; thus, any variant that fails verification indicates lack of robustness in the verifier. We implemented our technique in a tool called \(\mu \)gie, which operates on programs written in the popular Boogie language for verification, used as intermediate representation in numerous program verifiers. Experiments targeting 135 Boogie programs indicate that brittle behavior occurs fairly frequently (16 programs) and is not hard to trigger. Based on these results, the paper discusses the main sources of brittle behavior and suggests means of improving robustness.
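The key idea stated in the abstract can be illustrated with a minimal sketch. It is not the implementation of \(\mu \)gie: the only "mutation" used here is reordering independent top-level declarations (the example of an irrelevant detail given above), and every name in it (`reorderings`, `robustness_test`, `brittle_verifier`, `SEED`) is hypothetical. In particular, `brittle_verifier` is a toy stand-in that is brittle on purpose; in practice one would replace it with a function that runs an actual verifier on the variant and reports whether verification succeeded.

```python
from itertools import islice, permutations
from typing import Callable, List, Sequence

# Assumption: a program is a list of independent top-level declarations,
# so any reordering of the list is equivalent to the original program.
Program = List[str]


def reorderings(seed: Sequence[str], limit: int) -> List[Program]:
    """Return up to `limit` orderings of the seed's declarations.

    The first ordering produced is the seed itself.
    """
    return [list(p) for p in islice(permutations(seed), limit)]


def robustness_test(
    seed: Sequence[str],
    verifies: Callable[[Program], bool],
    limit: int = 100,
) -> List[Program]:
    """Return the equivalent variants of `seed` that fail verification."""
    if not verifies(list(seed)):
        raise ValueError("the seed must pass verification")
    return [v for v in reorderings(seed, limit) if not verifies(v)]


def brittle_verifier(program: Program) -> bool:
    """Toy stand-in, NOT a real verifier.

    It 'succeeds' only when the axiom is declared before the procedure,
    which mimics sensitivity to declaration order.
    """
    kinds = [decl.split()[0] for decl in program]
    return kinds.index("axiom") < kinds.index("procedure")


# Illustrative Boogie-like declarations; they are only treated as strings.
SEED: Program = [
    "function f(x: int): int;",
    "axiom (forall x: int :: f(x) > 0);",
    "procedure p() { assert f(1) > 0; }",
]

if __name__ == "__main__":
    for variant in robustness_test(SEED, brittle_verifier):
        # Every reported variant is a reordering of a verified seed,
        # so each failure points at the verifier, not at the program.
        print("brittle on ordering:", [d.split()[0] for d in variant])
```

Because every variant is equivalent to a seed that verifies, the seed acts as its own oracle: no expected output has to be written by hand, which is what makes the approach fully automated.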

Notes

  1. In this paper, the term “verification” also designates validation techniques such as testing.

  2. http://boogie-docs.readthedocs.io/en/latest/#front-ends-that-emit-boogie-ivl.

  3. By an anonymous reviewer of FM 2018.

  4. [6] describes some experiments with seeds that fail verification. Unsurprisingly, random mutations are unlikely to turn an unverified program into a verified one; therefore, the main paper focuses on using verified programs as seeds.

  5. See Sect. 5 for a discussion of how robustness testing differs from traditional mutation testing.

  6. For clarity, we initially focus on Boogie 4.5.0, and later discuss differences with other versions.

  7. Additionally, Why3 times out on 51 mutants of 2 seeds in group S; this seems to reflect an ineffective translation performed by b2w [1] rather than brittleness of Why3.

References

  1. Ameri, M., Furia, C.A.: Why just Boogie? In: Ábrahám, E., Huisman, M. (eds.) IFM 2016. LNCS, vol. 9681, pp. 79–95. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-33693-0_6

  2. AutoProof verified code repository. http://tiny.cc/autoproof-repo

  3. Barr, E.T., Harman, M., McMinn, P., Shahbaz, M., Yoo, S.: The oracle problem in software testing: a survey. IEEE Trans. Softw. Eng. 41(5), 507–525 (2015)

  4. Chen, T.Y., Cheung, S.C., Yiu, S.M.: Metamorphic testing: a new approach for generating next test cases. Technical Report HKUST-CS98-01, Department of Computer Science, Hong Kong University of Science and Technology (1998)

  5. Chen, Y.T., Furia, C.A.: Triggerless happy. In: Polikarpova, N., Schneider, S. (eds.) IFM 2017. LNCS, vol. 10510, pp. 295–311. Springer, Cham (2017). https://doi.org/10.1007/978-3-319-66845-1_19

  6. Chen, Y.T., Furia, C.A.: Robustness testing of intermediate verifiers. http://arxiv.org/abs/1805.03296 (2018)

  7. Claessen, K., Hughes, J.: QuickCheck: a lightweight tool for random testing of Haskell programs. In: ICFP, pp. 268–279. ACM (2000)

  8. Dafny examples and tests. https://github.com/Microsoft/dafny/tree/master/Test

  9. Filliâtre, J.-C., Paskevich, A.: Why3—where programs meet provers. In: Felleisen, M., Gardner, P. (eds.) ESOP 2013. LNCS, vol. 7792, pp. 125–128. Springer, Heidelberg (2013). https://doi.org/10.1007/978-3-642-37036-6_8

  10. Furia, C.A., Meyer, B., Velder, S.: Loop invariants: analysis, classification, and examples. ACM Comput. Surv. 46(3) (2014)

  11. Furia, C.A., Nordio, M., Polikarpova, N., Tschannen, J.: AutoProof: auto-active functional verification of object-oriented programs. STTT 19(6), 697–716 (2016)

  12. Godefroid, P., Levin, M.Y., Molnar, D.A.: SAGE: whitebox fuzzing for security testing. Commun. ACM 55(3), 40–44 (2012)

  13. Hawblitzel, C., Howell, J., Kapritsos, M., Lorch, J.R., Parno, B., Roberts, M.L., Setty, S.T.V., Zill, B.: IronFleet: proving practical distributed systems correct. In: SOSP, pp. 1–17. ACM (2015)

  14. Hawblitzel, C., Howell, J., Lorch, J.R., Narayan, A., Parno, B., Zhang, D., Zill, B.: Ironclad Apps: end-to-end security via automated full-system verification. In: USENIX OSDI, pp. 165–181. USENIX Association (2014)

  15. Hierons, R.M., et al.: Using formal specifications to support testing. ACM Comput. Surv. 41(2), 9:1–9:76 (2009)

  16. Jia, Y., Harman, M.: An analysis and survey of the development of mutation testing. IEEE Trans. Softw. Eng. 37(5), 649–678 (2011)

  17. Leino, K.R.M.: This is Boogie 2 (2008). http://goo.gl/QsH6g

  18. Leino, K.R.M.: Dafny: an automatic program verifier for functional correctness. In: Clarke, E.M., Voronkov, A. (eds.) LPAR 2010. LNCS (LNAI), vol. 6355, pp. 348–370. Springer, Heidelberg (2010). https://doi.org/10.1007/978-3-642-17511-4_20

  19. Leino, K.R.M., Pit-Claudel, C.: Trigger selection strategies to stabilize program verifiers. In: CAV, pp. 361–381. Springer, Berlin (2016)

  20. Leroy, X.: Formal verification of a realistic compiler. Commun. ACM 52(7), 107–115 (2009)

  21. Liew, D., Cadar, C., Donaldson, A.F.: Symbooglix: a symbolic execution engine for Boogie programs. In: ICST, pp. 45–56. IEEE Computer Society (2016)

  22. McKeeman, W.M.: Differential testing for software. Digit. Tech. J. 10(1), 100–107 (1998)

  23. \(\mu \)gie repository. https://emptylambda.github.io/mu-gie/

  24. Pacheco, C., Lahiri, S.K., Ernst, M.D., Ball, T.: Feedback-directed random test generation. In: ICSE, pp. 75–84. IEEE Computer Society (2007)

  25. Polikarpova, N., Furia, C.A., West, S.: To run what no one has run before: executing an intermediate verification language. In: Legay, A., Bensalem, S. (eds.) RV 2013. LNCS, vol. 8174, pp. 251–268. Springer, Heidelberg (2013). https://doi.org/10.1007/978-3-642-40787-1_15

  26. Segura, S., Fraser, G., Sanchez, A.B., Ruiz-Cortés, A.: A survey on metamorphic testing. IEEE Trans. Softw. Eng. 42(9), 805–824 (2016)

  27. Tange, O.: GNU parallel—the command-line power tool. Login: USENIX Mag. 36, 42–47 (2011)

  28. Yang, X., Chen, Y., Eide, E., Regehr, J.: Finding and understanding bugs in C compilers. ACM SIGPLAN Not. 46, 283–294 (2011)

  29. Zeller, A., Hildebrandt, R.: Simplifying and isolating failure-inducing input. IEEE Trans. Softw. Eng. 28(2), 183–200 (2002)

Author information

Corresponding author

Correspondence to YuTing Chen.

Copyright information

© 2018 Springer Nature Switzerland AG

About this paper

Cite this paper

Chen, Y., Furia, C.A. (2018). Robustness Testing of Intermediate Verifiers. In: Lahiri, S., Wang, C. (eds) Automated Technology for Verification and Analysis. ATVA 2018. Lecture Notes in Computer Science, vol 11138. Springer, Cham. https://doi.org/10.1007/978-3-030-01090-4_6

  • DOI: https://doi.org/10.1007/978-3-030-01090-4_6

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-01089-8

  • Online ISBN: 978-3-030-01090-4

  • eBook Packages: Computer Science, Computer Science (R0)
