Abstract
Providing correctness assurance for generated code is a poorly explored problem in generative programming. Such assurance is particularly desirable in applications such as cryptography, where the correctness of optimized code is far from obvious.
This work presents a unified approach to program generation and verification, and applies it to an implementation of the Number-Theoretic Transform (NTT), a key building block in lattice-based cryptography. Our verification strategy is based on problem decomposition: while we found that proving functional correctness of the whole program at once is intractable, the low-level components of the optimized program and its high-level algorithmic structure can be verified separately, each using a procedure at an appropriate level of abstraction.
We demonstrate that such a decomposition and subsequent verification of each component are naturally realized in a program-generation approach based on the tagless-final style, leading to an end-to-end functional correctness verification of a highly optimized program.
Notes
1. Our code is available at https://github.com/masahi/nttverify.
2. In the module system of ML-family languages, a signature is the interface of a module, and a structure is its implementation.
3. We use 12289, which fits in 14 bits, as the modulus parameter q (see Sect. 2.1).
4. This is for maximizing parallelism from vectorization.
5. In cryptographic implementations, being constant-time means having no data-dependent control flow, since such control flow can expose the code to timing attacks.
6. For simplicity, we do not consider the effect of vectorization for our verification purpose, although the generated program is fully vectorized with multiple SIMD instruction sets. All of the low-level issues that motivate our verification effort are manifested in the non-vectorized implementation.
7. mullo(mulhi(x,5),q) is not greater than \(\left\lfloor x \frac{1}{q} \right\rfloor q\), since \(5q < 65535\) for our choice of \(q\).
8. It took only a few seconds for the input of size 1024.
9. We have chosen options that maximize the precision of the analysis.
10. The symbol \(=\) represents exact equality on integers. The additional conditional subtraction is necessary since the outputs of Barrett reduction can be larger than q.
11. Refer to our source code for details on the translation from DSL to Z3 formulas.
12. The coefficients computed by the NTT program may contain negative values due to subtraction in the butterfly operation.
13. See Appendix A for our trusted base.
14. However, note that both interpretations are based on the tagless-final style and thus they operate on DSL constructs at the most primitive level (such as translating the DSL for loop to that of OCaml or C). Therefore, we believe that their correctness is a reasonable assumption.
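Notes 3, 5, 7, and 10 together describe a 16-bit Barrett-style reduction followed by a conditional subtraction. The following C sketch illustrates that combination under stated assumptions: `mulhi` and `mullo` are plain scalar stand-ins for the helpers of Sect. 3, and the names `barrett_reduce`, `csub`, and `count_mismatches` are hypothetical and not taken from the paper's code. It is an illustration of the idea, not the generated program.

```c
#include <assert.h>
#include <stdint.h>

#define Q 12289u /* modulus q from note 3 */

/* Scalar stand-ins for the helpers of Sect. 3:
 * high and low 16 bits of a 16x16-bit product. */
static uint16_t mulhi(uint16_t a, uint16_t b) {
    return (uint16_t)(((uint32_t)a * (uint32_t)b) >> 16);
}
static uint16_t mullo(uint16_t a, uint16_t b) {
    return (uint16_t)((uint32_t)a * (uint32_t)b);
}

/* Barrett-style reduction (hypothetical name). mulhi(x, 5) = floor(5x / 2^16)
 * never exceeds floor(x / q) (note 7), so the subtraction cannot wrap around,
 * but the result may still be >= q (note 10). */
static uint16_t barrett_reduce(uint16_t x) {
    uint16_t r = (uint16_t)(x - mullo(mulhi(x, 5), (uint16_t)Q));
    assert(r < 2u * Q); /* one conditional subtraction is enough */
    return r;
}

/* Conditional subtraction written without an if statement (note 5).
 * Whether the compiled code is branch-free depends on the compiler. */
static uint16_t csub(uint16_t x) {
    uint16_t mask = (uint16_t)(0u - (uint32_t)(x >= Q)); /* 0xFFFF if x >= q, else 0 */
    return (uint16_t)(x - (Q & mask));
}

/* Exhaustive comparison with x % q over all 16-bit inputs;
 * returns the number of inputs on which the two disagree. */
static unsigned count_mismatches(void) {
    unsigned bad = 0;
    for (uint32_t x = 0; x <= 0xFFFFu; x++) {
        uint16_t r = csub(barrett_reduce((uint16_t)x));
        if (r != x % Q) bad++;
    }
    return bad;
}
```

Calling `count_mismatches()` from a `main` function is one way to try the sketch. Because the input space has only 65536 values, exhaustive testing is feasible for this single operation; the paper instead discharges such properties for the whole program with interval analysis and an SMT solver.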
References
ANSI/ISO C specification language. https://frama-c.com/html/acsl.html
Akbarpour, B., Tahar, S.: A methodology for the formal verification of FFT algorithms in HOL. In: Hu, A.J., Martin, A.K. (eds.) FMCAD 2004. LNCS, vol. 3312, pp. 37–51. Springer, Heidelberg (2004). https://doi.org/10.1007/978-3-540-30494-4_4
Alkim, E., Ducas, L., Pöppelmann, T., Schwabe, P.: Post-quantum key exchange: a new hope. In: Proceedings of the 25th USENIX Conference on Security Symposium, SEC 2016, pp. 327–343. USENIX Association, USA (2016)
Almeida, J.B., et al.: Jasmin: high-assurance and high-speed cryptography. In: Proceedings of the 2017 ACM SIGSAC Conference on Computer and Communications Security, CCS 2017, pp. 1807–1823. Association for Computing Machinery, New York (2017). https://doi.org/10.1145/3133956.3134078
Amin, N., Rompf, T.: LMS-Verify: abstraction without regret for verified systems programming. In: Castagna, G., Gordon, A.D. (eds.) Proceedings of the 44th ACM SIGPLAN Symposium on Principles of Programming Languages, POPL 2017, Paris, France, 18–20 January 2017. pp. 859–873. ACM (2017). https://doi.org/10.1145/3009837.3009867
Barrett, P.: Implementing the rivest shamir and adleman public key encryption algorithm on a standard digital signal processor. In: Odlyzko, A.M. (ed.) CRYPTO 1986. LNCS, vol. 263, pp. 311–323. Springer, Heidelberg (1987). https://doi.org/10.1007/3-540-47721-7_24
Bühler, D.: Structuring an abstract interpreter through value and state abstractions: EVA, an evolved value analysis for Frama-C. (Structurer un interpréteur abstrait au moyen d’abstractions de valeurs et d’états: Eva, une analyse de valeur évoluée pour Frama-C). Ph.D. thesis, University of Rennes 1, France (2017). https://tel.archives-ouvertes.fr/tel-01664726
Capretta, V.: Certifying the fast fourier transform with Coq. In: Boulton, R.J., Jackson, P.B. (eds.) TPHOLs 2001. LNCS, vol. 2152, pp. 154–168. Springer, Heidelberg (2001). https://doi.org/10.1007/3-540-44755-5_12
Carette, J., Kiselyov, O., Shan, C.: Finally tagless, partially evaluated: tagless staged interpreters for simpler typed languages. J. Funct. Program. 19(5), 509–543 (2009). https://doi.org/10.1017/S0956796809007205
Cormen, T.H., Leiserson, C.E., Rivest, R.L., Stein, C.: Introduction to Algorithms, 3rd edn. The MIT Press, Cambridge (2009)
Erbsen, A., Philipoom, J., Gross, J., Sloan, R., Chlipala, A.: Simple high-level code for cryptographic arithmetic - with proofs, without compromises. In: 2019 IEEE Symposium on Security and Privacy, SP 2019, San Francisco, CA, USA, 19–23 May 2019, pp. 1202–1219. IEEE (2019). https://doi.org/10.1109/SP.2019.00005
Gamboa, R.A.: The correctness of the fast fourier transform: a structured proof in ACL2. Form. Methods Syst. Des. 20(1), 91–106 (2002). https://doi.org/10.1023/A:1012912614285
Güneysu, T., Oder, T., Pöppelmann, T., Schwabe, P.: Software speed records for lattice-based signatures. In: Gaborit, P. (ed.) PQCrypto 2013. LNCS, vol. 7932, pp. 67–82. Springer, Heidelberg (2013). https://doi.org/10.1007/978-3-642-38616-9_5
Kiselyov, O., Biboudis, A., Palladinos, N., Smaragdakis, Y.: Stream fusion, to completeness. In: Proceedings of the 44th ACM SIGPLAN Symposium on Principles of Programming Languages, POPL 2017, pp. 285–299. Association for Computing Machinery, New York (2017). https://doi.org/10.1145/3009837.3009880
Krishnaswami, N.R., Yallop, J.: A typed, algebraic approach to parsing. In: Proceedings of the 40th ACM SIGPLAN Conference on Programming Language Design and Implementation, PLDI 2019, pp. 379–393. Association for Computing Machinery, New York (2019). https://doi.org/10.1145/3314221.3314625
Kroening, D., Strichman, O.: Decision Procedures - An Algorithmic Point of View, Second Edition. Texts in Theoretical Computer Science. An EATCS Series. Springer, Heidelberg (2016). https://doi.org/10.1007/978-3-662-50497-0
Longa, P., Naehrig, M.: Speeding up the number theoretic transform for faster ideal lattice-based cryptography. In: Foresti, S., Persiano, G. (eds.) CANS 2016. LNCS, vol. 10052, pp. 124–139. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-48965-0_8
Lyubashevsky, V., Peikert, C., Regev, O.: On ideal lattices and learning with errors over rings. In: Gilbert, H. (ed.) EUROCRYPT 2010. LNCS, vol. 6110, pp. 1–23. Springer, Heidelberg (2010). https://doi.org/10.1007/978-3-642-13190-5_1
Masuda, M., Kameyama, Y.: FFT program generation for ring LWE-based cryptography. In: Nakanishi, T., Nojima, R. (eds.) IWSEC 2021. LNCS, vol. 12835, pp. 151–171. Springer, Cham (2021). https://doi.org/10.1007/978-3-030-85987-9_9
Montgomery, P.L.: Modular multiplication without trial division. Math. Comput. 44, 519–521 (1985)
Navas, J.A., Dutertre, B., Mason, I.A.: Verification of an optimized NTT algorithm. In: Christakis, M., Polikarpova, N., Duggirala, P.S., Schrammel, P. (eds.) NSV/VSTTE 2020. LNCS, vol. 12549, pp. 144–160. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-63618-0_9
Protzenko, J., et al.: EverCrypt: a fast, verified, cross-platform cryptographic provider. In: 2020 IEEE Symposium on Security and Privacy (SP), pp. 983–1002 (2020). https://doi.org/10.1109/SP40000.2020.00114
Seiler, G.: Faster AVX2 optimized NTT multiplication for Ring-LWE lattice cryptography. IACR Cryptol. ePrint Arch. 2018, 39 (2018)
Shaikhha, A., Klonatos, Y., Koch, C.: Building efficient query engines in a high-level language. ACM Trans. Database Syst. 43(1) (2018). https://doi.org/10.1145/3183653
Wei, G., Chen, Y., Rompf, T.: Staged abstract interpreters: fast and modular whole-program analysis via meta-programming. Proc. ACM Program. Lang. 3(OOPSLA), 126:1–126:32 (2019). https://doi.org/10.1145/3360552
Zinzindohoué, J.K., Bhargavan, K., Protzenko, J., Beurdouche, B.: HACL*: a verified modern cryptographic library. In: Proceedings of the 2017 ACM SIGSAC Conference on Computer and Communications Security, CCS 2017, pp. 1789–1806. Association for Computing Machinery, New York (2017). https://doi.org/10.1145/3133956.3134043
Acknowledgements
We thank Hiroshi Unno for the helpful discussion. Feedback from anonymous reviewers helped improve this paper and is greatly appreciated. The second author is supported in part by JSPS Grant-in-Aid for Scientific Research (B) 18H03218.
Appendix A Programs to be Verified and their Semantics
The verification procedure in Sect. 5 is a series of step-by-step simplifications of programs and their correctness proofs. The following table lists the programs and the domain interpretations in the procedure.
| | Program | Domain | Arithmetic operation |
|---|---|---|---|
| \(P_0\) | DFT formula (1) | \(Z_q\) | Arithmetic operations in \(Z_q\) |
| \(P_1\) | DSL program in Sect. 2.2 | \(Z_q\) | Arithmetic operations in \(Z_q\) |
| \(P_2\) | The same as \(P_1\) | Unsigned int | Arithmetic with modulo-q |
| \(P_3\) | The same as \(P_1\) | Unsigned int | Low-level operations |
| \(P_4\) | \(P_1\) + lazy reduction | Unsigned int | Low-level operations |
| \(P_5\) | Generated C code | Unsigned int in C | Arithmetic operations in C |
\(P_0\) is the DFT formula (1) in Sect. 2.1. \(P_1\), \(P_2\), and \(P_3\) are the same DSL program, whose innermost loop was given in Sect. 2.2, under different domain interpretations. For the interpretation of the DSL itself, we take the natural ‘interpreter’ semantics, which is essentially the same as the module R in Sect. 2.2.
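To make the relation between a specification like \(P_0\) and a fast program like \(P_1\) concrete, the C sketch below compares a direct evaluation of a DFT over \(Z_q\) with a recursive radix-2 FFT on one small input. Formula (1) is not reproduced in this appendix, so the sketch assumes the usual form \(\hat{a}_k = \sum_j a_j \omega^{jk} \bmod q\). The transform size 8, the root search, and all function names are assumptions made for illustration; the recursive FFT is a stand-in for the DSL program, not a copy of it.

```c
#include <assert.h>
#include <stdint.h>

#define Q 12289u /* modulus q from note 3 */
#define N 8u     /* small transform size chosen for illustration */

/* Modular exponentiation; products stay below q^2 < 2^32. */
static uint32_t pow_mod(uint32_t base, uint32_t e) {
    uint32_t r = 1;
    base %= Q;
    while (e > 0) {
        if (e & 1u) r = (r * base) % Q;
        base = (base * base) % Q;
        e >>= 1;
    }
    return r;
}

/* Search for a primitive N-th root of unity modulo q (N a power of two). */
static uint32_t find_root(void) {
    assert((Q - 1u) % N == 0u);
    for (uint32_t g = 2; g < Q; g++) {
        uint32_t w = pow_mod(g, (Q - 1u) / N);
        if (pow_mod(w, N / 2u) == Q - 1u) return w; /* w^(N/2) = -1, so w has order N */
    }
    return 0; /* not reached: q is prime and N divides q - 1 */
}

/* Direct evaluation of the assumed DFT formula. */
static void dft_naive(const uint32_t in[N], uint32_t out[N], uint32_t w) {
    for (uint32_t k = 0; k < N; k++) {
        uint32_t acc = 0;
        for (uint32_t j = 0; j < N; j++)
            acc = (acc + in[j] * pow_mod(w, j * k)) % Q;
        out[k] = acc;
    }
}

/* Recursive radix-2 decimation-in-time FFT; w must have order n. */
static void fft_rec(const uint32_t *in, uint32_t *out,
                    uint32_t n, uint32_t stride, uint32_t w) {
    if (n == 1) { out[0] = in[0] % Q; return; }
    uint32_t half = n / 2, w2 = (w * w) % Q, t = 1;
    fft_rec(in, out, half, 2 * stride, w2);                 /* even-indexed inputs */
    fft_rec(in + stride, out + half, half, 2 * stride, w2); /* odd-indexed inputs */
    for (uint32_t k = 0; k < half; k++) {
        uint32_t e = out[k];
        uint32_t o = (t * out[k + half]) % Q;
        out[k] = (e + o) % Q;            /* butterfly: sum */
        out[k + half] = (e + Q - o) % Q; /* butterfly: difference, kept non-negative */
        t = (t * w) % Q;
    }
}

/* Returns 1 if both transforms agree on one fixed input, 0 otherwise. */
static int dft_matches_fft(void) {
    const uint32_t in[N] = {1, 2, 3, 4, 5, 6, 7, 12288};
    uint32_t a[N], b[N];
    uint32_t w = find_root();
    dft_naive(in, a, w);
    fft_rec(in, b, N, 1, w);
    for (uint32_t k = 0; k < N; k++)
        if (a[k] != b[k]) return 0;
    return 1;
}
```

Agreement on a single input is only a test. The equality \(P_0 =_{ext} P_1\) claimed in this appendix concerns all inputs and is established symbolically in the paper.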
The programs \(P_1\), \(P_2\), and \(P_3\) differ only in their domain interpretations. For \(P_1\), the domain is \(Z_q\). For \(P_2\), the domain is the set of 16-bit unsigned integers, and each arithmetic operation is the corresponding unsigned-integer operation followed by reduction modulo q. To keep multiplication within 16 bits, we use mullo and mulhi from Sect. 3. For \(P_3\), the domain is the same as for \(P_2\), but the arithmetic operations are replaced by low-level operations such as Barrett reduction. The semantics of unsigned integers and their operations is specified by the bit-vector theory [16]. \(P_4\) is the same as \(P_3\) except that it employs the lazy reduction described in Sect. 3.
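The step from eager to lazy reduction can be illustrated on a running sum. The C sketch below is a simplified example under stated assumptions: the function names are hypothetical, the `%` operator stands in for the low-level reduction to keep the code short, and the explicit bound check plays the role that interval analysis plays in the paper.

```c
#include <assert.h>
#include <stdint.h>

#define Q 12289u /* modulus q from note 3 */

/* Eager variant: reduce modulo q after every addition. */
static uint16_t sum_eager(const uint16_t *xs, unsigned n) {
    uint16_t acc = 0;
    for (unsigned i = 0; i < n; i++)
        acc = (uint16_t)((acc + xs[i]) % Q);
    return acc;
}

/* Lazy variant: add without reducing, then reduce once at the end.
 * Assumes every xs[i] < q, so the running sum is at most n * (q - 1),
 * which must fit in 16 bits for the result to be correct. */
static uint16_t sum_lazy(const uint16_t *xs, unsigned n) {
    assert(n * (Q - 1u) <= 0xFFFFu); /* holds for n <= 5 when q = 12289 */
    uint16_t acc = 0;
    for (unsigned i = 0; i < n; i++)
        acc = (uint16_t)(acc + xs[i]); /* no wrap-around under the bound above */
    return (uint16_t)(acc % Q);
}
```

With five inputs all equal to 12288, both variants return 12284. With a sixth such input the unreduced sum would exceed 65535 and wrap around, which is why the bound has to be established before reductions are dropped.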
\(P_5\) is the C code generated by interpreting the DSL constructs as generators of strings that represent the corresponding C code. This process (called offshoring in the literature) is conceptually a trivial injection. However, formalizing it involves the semantics of the C language and is beyond the scope of this paper, so we put the equivalence of \(P_4\) and \(P_5\) into our trusted base.
In addition, our trusted base includes the correctness of our interval analysis, our symbolic execution, and the implementations of helper functions such as mullo and mulhi. With this trusted base, together with the language and domain interpretations explained above, this paper has verified that \(P_k\) is extensionally equal (modulo q) to \(P_{k+1}\) (written \(P_k =_{ext} P_{k+1}\)) for \(0 \le k \le 3\): \(P_0 =_{ext} P_1\) in Sect. 5.4, \(P_1 =_{ext} P_2\) and \(P_3 =_{ext} P_4\) in Sect. 4, and \(P_2 =_{ext} P_3\) in Sect. 5.3.
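Assuming that extensional equality modulo q is transitive, which holds for equality of input-output behavior, the four verified steps compose into a single statement relating the specification to the optimized DSL program:

\[
P_0 =_{ext} P_1 =_{ext} P_2 =_{ext} P_3 =_{ext} P_4
\quad\Longrightarrow\quad
P_0 =_{ext} P_4 .
\]

The remaining link between \(P_4\) and the generated C code \(P_5\) is not part of this chain; it is assumed as part of the trusted base.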
Copyright information
© 2022 Springer Nature Switzerland AG
About this paper
Cite this paper
Masuda, M., Kameyama, Y. (2022). Unified Program Generation and Verification: A Case Study on Number-Theoretic Transform. In: Hanus, M., Igarashi, A. (eds) Functional and Logic Programming. FLOPS 2022. Lecture Notes in Computer Science, vol 13215. Springer, Cham. https://doi.org/10.1007/978-3-030-99461-7_8
DOI: https://doi.org/10.1007/978-3-030-99461-7_8
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-99460-0
Online ISBN: 978-3-030-99461-7
eBook Packages: Computer Science, Computer Science (R0)