
1 Introduction

In recent years, the popularity of lattice-based cryptography has greatly increased. Lattices have been used to design traditional cryptographic primitives such as one-way functions, public-key encryption, key exchange and digital signatures, as well as more advanced constructions such as identity-based and attribute-based encryption, and fully homomorphic encryption.

One reason for this popularity is that lattice problems, e.g. the Shortest Vector Problem (SVP) and Bounded Distance Decoding (BDD), are believed to be hard also for quantum computers. Hence, schemes based on such problems are good candidates for providing quantum-safe public key cryptography. Indeed, 23 of the original 69 complete and proper schemes submitted to the National Institute of Standards and Technology (NIST) as part of the Post Quantum Standardisation Process [NIS16] are based on various lattice problems with varying amounts of structure. Given the long shelf life of cryptographic standards and the high stakes of standardising primitives, the security of these schemes, and thus the concrete hardness of lattice problems, should be understood in detail.

Two popular problems chosen to design lattice-based schemes are the Learning With Errors (LWE) problem (with its ring and module variants) and the NTRU problem. A variety of attack strategies against these problems exist. Asymptotically, the best option is the approach of Arora–Ge [AG11], while, again asymptotically, in the case of binary secrets, BKW variants [KF15, GJS15] perform well. In practice however, the best attacks seem to be the primal, dual and hybrid attacks. All three rely on lattice reduction algorithms, such as BKZ [SE91, SE94, CN11], Progressive BKZ [AWHT16], Self-Dual BKZ [MW16], G6K [ADH+19] and Slide Reduction [GN08a], to find either a unique (up to sign) embedded shortest vector, or more generally a good lattice basis. In particular, the primal attack is often estimated as the cheapest option [ACD+18].

The primal attack against LWE and NTRU consists of using lattice reduction to solve an instance of the unique Shortest Vector Problem (uSVP). The most popular lattice reduction algorithm is BKZ. Current complexity estimates for solving uSVP directly depend on estimating the smallest block size \(\beta \) such that BKZ-\(\beta \) successfully recovers the unique shortest vector. This \(\beta \) is commonly found by following the methodology introduced in [ADPS16, §6.3], and experimentally investigated in [AGVW17].

In their experiments, Albrecht et al. [AGVW17] and Bai et al. [BMW19], report that smaller than expected block sizes can result in a non-negligible probability of solving uSVP instances arising from the primal attack, when using BKZ. Some concerns were raised [BCLv19] that this could indicate an overestimate of the complexity of the primal attack for cryptographically sized instances. Furthermore, the experiments carried out in 2017 [AGVW17] only focused on recovering a unique shortest vector sampled coefficientwise from a discrete Gaussian distribution. While [AGVW17] claims that the [ADPS16] methodology would also hold for binary and ternary distributions, the authors do not provide experimental evidence. Recent work [CCLS20] revisited the binary and ternary case in the small block size regime \(\beta \le 45\) and concluded that discrete Gaussian errors are more secure. We disagree, and discuss [CCLS20] further in Sect. 5.2.

Dachman-Soled et al. [DSDGR20] recently proposed an approach for estimating the complexity of the primal attack that makes use of probability distributions for the norms of particular projections of the unique shortest vector, rather than only expected values. This results in a new approach that allows one to better predict the behaviour of the attack when considering block sizes smaller than those expected to be successful by the [ADPS16] methodology. The authors of [DSDGR20] use this approach to develop a simulator that predicts the expected block size by which Progressive BKZ will solve an isotropic uSVP instance. In this work, we call such a simulator a uSVP simulator. They use this uSVP simulator in the setting of solving LWE instances with extra hints about the secret, and verify the accuracy of their predictions as the number of hints varies.

Our Contributions. Our first contribution is the implementation of a variant of the uSVP simulator for Progressive BKZ, and the development of a new uSVP simulator for BKZ 2.0. Rather than only returning the expected successful block size, we extract full probability mass functions for successful block sizes, which allow for a more direct comparison to experimental results. Our simulators are also faster than those in [DSDGR20], simulating success probabilities for Kyber1024 in 31 s against the 2 h of [DSDGR20]. This allows for potentially easier inclusion in parameter selection scripts, such as the LWE estimator [APS15]. We note that since the time of writing, the latest version of the simulator proposed in [DSDGR20] adopted the same speedup techniques.

Our second contribution is extensive experiments on the success probability of different block sizes for BKZ 2.0 and Progressive BKZ, on uSVP lattices generated from LWE instances with discrete Gaussian, binary or ternary secret and error distributions. Our experiments show that the uSVP simulators accurately predict the block sizes needed to solve uSVP instances via lattice reduction, for all distributions tested.

As a final contribution, we reestimate the security of the three lattice KEM finalists of the NIST PQC using our uSVP simulators. We compare the expected block sizes they suggest to those predicted by the original methodology of [ADPS16]. We note that our uSVP simulators estimate that a slightly larger average block size than predicted is required, meaning that [ADPS16] likely resulted in an underestimate of their security.Footnote 1 We also observe that this phenomenon can, in large part, be attributed to the original [ADPS16] methodology using the Geometric Series Assumption. Replacing this assumption with the output of the [CN11] BKZ simulator reduces the predictive gap between the [ADPS16] methodology and our uSVP simulators.

All of our code and data can be found at github.com/fvirdia/usvp-simulation.

Related Work. The Geometric Series Assumption (GSA), used to predict the output quality of lattice reduction, was introduced in [Sch03]. A simulator, specifically for the output quality of BKZ, was introduced in [CN11]. This simulator more accurately predicts the final, or tail, region of the basis profile of a BKZ reduced lattice, improving over the GSA. A refined BKZ simulator was presented in [BSW18], which improves over the [CN11] simulator in the first region, or head, of the basis profile. Alkim et al. [ADPS16] introduced a BKZ specific method for estimating the block size required to solve uSVP instances arising from the primal attack; its accuracy was investigated in [AGVW17, BMW19]. This method, combined with basis profile simulation after BKZ reduction and arguments about distributions describing the lengths of projections of the unique short vector, is extended in [DSDGR20] to predict the expected block size by which Progressive BKZ will solve isotropic uSVP instances.

Paper Structure. In Sect. 2 we introduce the necessary preliminaries and notation regarding linear algebra, computational lattice problems, and lattice reduction. In Sect. 3 we review the original [ADPS16] methodology for predicting the expected required block sizes for solving uSVP instances. In Sect. 4 we review the approach of [DSDGR20] and use it to propose uSVP simulators for BKZ 2.0 and Progressive BKZ. In Sect. 5 we describe our experiments and results. In Sect. 6 we use our uSVP simulators to provide preliminary estimates of the block sizes required to successfully perform key recovery attacks on the three NIST PQC lattice KEM finalists, and compare this to predictions using the [ADPS16] methodology.

2 Preliminaries

Linear Algebra. The set \(\{1, \dots , n\}\) is denoted by \([n]\). We denote vectors by bold lowercase letters such as \(\varvec{v}\), and matrices by bold uppercase letters such as \(\varvec{M}\). We denote the \(n \times n\) identity matrix as \(\varvec{I}_n\). Throughout, we use row vectors and count indices from \(1\). We represent a basis \(\{\varvec{b}_1, \dots , \varvec{b}_d\}\) of \(\mathbb {R}^d\) as the matrix \(\varvec{B}\) having the basis vectors as rows. Given a basis \(\varvec{B}\), we can derive an orthogonal basis \(\varvec{B}^*\) via the Gram–Schmidt process. The rows of \(\varvec{B}^*\) are

$$ \varvec{b}_i^* = \varvec{b}_i - \sum _{j < i} \mu _{i, j} \varvec{b}_j^* \quad \text {for} \quad i \in [d], \quad \text {where} \quad \mu _{i, j} = {\langle \varvec{b}_i, \varvec{b}_j^* \rangle }/{\Vert \varvec{b}_j^*\Vert ^2} \quad \text {for} \quad i > j. $$

We denote by \(\mathrm {span}_\mathbb {R}\left( {\{\varvec{v}_i\}}_i\right) = \{\sum _i \lambda _i \varvec{v}_i :\lambda _i \in \mathbb {R}\}\) the real span of a set of real vectors \({\{\varvec{v}_i\}}_i\). Given a basis \(\varvec{B}\) of \(\mathbb {R}^d\) we denote by \(\pi _{\varvec{B}, k} :\mathbb {R}^d \rightarrow \mathbb {R}^d\) the linear operator projecting vectors orthogonally to the subspace \(\mathrm {span}_\mathbb {R}\left( \{\varvec{b}_1, \dots , \varvec{b}_{k-1}\}\right) \). Note \(\pi _{\varvec{B}, 1}\) is the identity on \(\mathbb {R}^d\). We write \(\pi _i\) when the basis is clear from context. Given a vector space \(V = \text {span}_\mathbb {R}(\varvec{B})\), its projective subspace \(\pi _k(V)\) of dimension \(d-k+1\) has a basis \(\{\pi _k(\varvec{b}_k), \dots , \pi _k(\varvec{b}_d)\}\), where

$$ \pi _k(\varvec{b}_i) = \varvec{b}_i - \sum _{j< k}\mu _{i, j} \varvec{b}_j^* = \varvec{b}_i^* + \sum _{k \le j < i} \mu _{i, j} \varvec{b}_j^* \quad \text {for} \quad i \ge k. $$

By definition, this implies that \(\pi _k(\varvec{b}_k) = \varvec{b}_k^*\), and that \(\pi _j(\pi _k(\varvec{v})) = \pi _k(\varvec{v})\) for any \(j \le k\). Given an orthogonal basis \(\varvec{B}^*\) and a vector \(\varvec{t} = t_1^* \varvec{b}_1^* + \cdots + t_d^* \varvec{b}_d^*\), its projections are given by \(\pi _k(\varvec{t}) = t_k^* \varvec{b}_k^* + \cdots + t_d^* \varvec{b}_d^*\). We abuse notation and write \(\pi _i(\varvec{B}[j:k])\) to mean the matrix with rows \(\pi _i(\varvec{b}_j),\dots ,\pi _i(\varvec{b}_k)\).

Probability. Given a probability distribution D with support \(S \subset \mathbb {R}\), we denote sampling an element \(s \in S\) according to D as \(s \leftarrow D\). For a finite support S, we denote the uniform distribution over S as \(\mathcal {U}(S)\). We denote the mean and variance of D as \(\mathbb {E}(s)\) or \(\mathbb {E}(D)\), and \(\mathbb {V}(s)\) or \(\mathbb {V}(D)\), respectively. We sometimes use \(\sqrt{\mathbb {V}}\) similarly to denote the standard deviation. Given a discrete (resp. continuous) probability distribution D, we denote its probability mass function (resp. probability density function) as \(f_D\) and its cumulative mass function (resp. cumulative density function) as \(F_D\). Given \(s \leftarrow D\), by definition \(P[s \le x] = F_D(x)\). We recall the conditional probability chain rule. If \(E_1\), ..., \(E_n\) are events, then \( P[E_1 \cap \cdots \cap E_n] = P[E_1 | E_2 \cap \cdots \cap E_n] P[E_2 \cap \cdots \cap E_n]. \) We denote by \(\varGamma \) the gamma function \(\varGamma (x) = \int _0^\infty t^{x-1} e^{-t} dt\) for \(x > 0\).

The Gaussian Distribution. We recall some properties of the continuous Gaussian distribution. We denote by \(N(\mu , \sigma ^2)\) the probability distribution over \(\mathbb {R}\) of mean \(\mu \), standard deviation \(\sigma \) and variance \(\sigma ^2\), with density function

$$ f_{N(\mu , \sigma ^2)}(x) = \frac{1}{\sigma \sqrt{2\pi }}e^{-\frac{1}{2}{\left( \frac{x-\mu }{\sigma }\right) }^2}. $$

Given a random variable \(X \sim N(\mu _X, \sigma _X^2)\) and a scalar \(\lambda > 0\), the random variable \( Y = \lambda \cdot X \) follows a distribution \(N(\lambda \mu _X, \lambda ^2 \sigma _X^2)\). Given n independent and identically distributed random variables \(X_i \sim N(0, 1)\), the random variable \(X_1^2 + \cdots + X_n^2\) follows a chi-squared distribution \(\chi ^2_n\) over \(\mathbb {R}_{\ge 0}\) of mean n and variance 2n, with probability density function

$$ f_{\chi ^2_n}(x) = \frac{1}{2^{n/2}\varGamma (n/2)}\; x^{n/2-1} e^{-x/2}. $$

Given n independent and identically distributed random variables \(Y_i \sim N(0, \sigma ^2)\), the random variable \(Y_1^2 + \cdots + Y_n^2\) follows a distribution \(\sigma ^2 \cdot \chi ^2_n\) of mean \(n \sigma ^2\) and variance \(2n \sigma ^4\), that is, a chi-squared distribution where every sample is scaled by a factor of \(\sigma ^2\). We call this a scaled chi-squared distribution.

Discrete Gaussians. We denote by \(D_{\mu , \sigma }\) the discrete Gaussian distribution over \(\mathbb {Z}\) with mean \(\mu \in \mathbb {R}\) and standard deviation \(\sigma \in \mathbb {R}^+\). It has probability mass function \(f_{D_{\mu , \sigma }} :\mathbb {Z}\rightarrow [0, 1], x \mapsto f_{N(\mu , \sigma ^2)}(x)/f_{N(\mu , \sigma ^2)}(\mathbb {Z})\), where \(f_{N(\mu , \sigma ^2)}(\mathbb {Z}) = \sum _{x \in \mathbb {Z}}{f_{N(\mu , \sigma ^2)}(x)}\). Discrete Gaussian distributions with \(\mu = 0\), or the distributions these imply over \(\mathbb {Z}_q\) for some modulus \(q\), are widely used in lattice cryptography to sample entries of error and secret vectors from. In our analyses below, we work with vectors \(\varvec{t}\) sampled coefficientwise from a discrete Gaussian, and with their projections \(\pi _i(\varvec{t})\). We model the squared norms \(\left\| \pi _i({\varvec{t}})\right\| ^2\) as random variables following a scaled chi-squared distribution with the appropriate degrees of freedom. For example, for some vector \(\varvec{v} = (v_1, \dots , v_d)\) with each \(v_i \leftarrow D_{0, \sigma }\) sampled independently, we model \(\left\| \pi _{\varvec{B}, i}(\varvec{v})\right\| ^2 \sim \sigma ^2 \cdot \chi ^2_{d - i + 1}\), where \(\varvec{B}\) is a lattice basis being reduced.
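As a quick numerical illustration of this model (our own sketch, not part of the original analysis), the following snippet uses a continuous Gaussian in place of the discrete one and an arbitrary full-rank basis, and checks that the sample moments of \(\left\| \pi _i(\varvec{v})\right\| ^2\) match those of \(\sigma ^2 \cdot \chi ^2_{d - i + 1}\); all parameter values are illustrative.

```python
import numpy as np

d, i, sigma, trials = 64, 20, 3.2, 5000
rng = np.random.default_rng(1)
B = rng.integers(-50, 51, size=(d, d)).astype(float)   # arbitrary full-rank basis, rows b_1, ..., b_d

# Orthonormal basis (columns of Q) for span(b_1, ..., b_{i-1})
Q, _ = np.linalg.qr(B[: i - 1].T)

sq_norms = []
for _ in range(trials):
    v = rng.normal(0.0, sigma, size=d)      # coefficientwise Gaussian target vector
    proj = v - Q @ (Q.T @ v)                # pi_i(v): component orthogonal to span(b_1, ..., b_{i-1})
    sq_norms.append(proj @ proj)

# Compare sample moments with those of sigma^2 * chi^2_{d-i+1}
print(np.mean(sq_norms), sigma**2 * (d - i + 1))
print(np.var(sq_norms), 2 * (d - i + 1) * sigma**4)
```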

Bounded Uniform Distributions. Given a finite subset \(S \subset \mathbb {Z}\), we call the uniform distribution \(\mathcal {U}(S)\) a bounded uniform distribution. Of particular interest in this work are the binary and ternary distributions, where \(S = \{0, 1\}\) and \(S =\{-1, 0, 1\}\). Similarly to the case of the discrete Gaussian, works using the [ADPS16] methodology for estimating the complexity of lattice reduction, such as the ‘LWE estimator’ [APS15], implicitly model \(\left\| \pi _{\varvec{B}, i}(\varvec{v})\right\| ^2 \sim \sigma ^2 \cdot \chi ^2_{d - i + 1}\) for vectors \(\varvec{v}\) sampled coefficientwise from a bounded uniform distribution having \(\mathbb {E}(\mathcal {U}(S)) = 0\) and \(\mathbb {V}(\mathcal {U}(S)) = \sigma ^2\), and \(\varvec{B}\) a lattice basis being reduced.

Lattices. A real lattice of rank \(n\) and dimension \(d\) is the integer span of \(n\) linearly independent vectors \(\varvec{b}_1, \dots , \varvec{b}_{n} \in \mathbb {R}^d\), which we collect into a basis \(\varvec{B}\). The lattice generated by \(\varvec{B}\) is

$$ \varLambda = \varLambda (\varvec{B}) = \left\{ x_1 \varvec{b}_1 + \cdots + x_n \varvec{b}_n :x_i \in \mathbb {Z}\right\} , $$

and is a discrete subgroup of \((\mathbb {R}^d, +)\). For \(n \ge 2\) and \(\varLambda = \varLambda (\varvec{B})\), we have also \(\varLambda = \varLambda (\varvec{U} \varvec{B})\) for any \(\varvec{U} \in \mathrm {GL}_n(\mathbb {Z})\). Hence \(\varLambda \) has infinitely many bases. An invariant of a lattice is its volume.

Definition 1

(Lattice volume). Given any basis \(\varvec{B}\) for a lattice \(\varLambda \),

$$ \mathrm {vol}(\varLambda ) = \sqrt{\det (\varvec{B} \varvec{B}^t)} = \displaystyle \prod \limits _{i = 1}^n{\Vert \varvec{b}_i^*\Vert }. $$

This quantity is exactly the volume of a fundamental parallelepiped of \(\varLambda \), that is, the volume of the set \(\{\varvec{x}\varvec{B} :\varvec{x} \in {[0, 1)}^n\}\). Other properties of interest in lattices are their minima.

Definition 2

(Lattice minima). Let \(B_d(r)\) be the closed ball of radius \(r\) in \(\mathbb {R}^d\) and \(i \in [n]\). Define \(\lambda _i(\varLambda )\), the \(i^{th}\) minimum of \(\varLambda \),

$$\begin{aligned} \lambda _i(\varLambda ) = \displaystyle \min {\left\{ r \in \mathbb {R}^+ :\varLambda \cap B_d(r) \ \text {contains { i} linearly independent vectors}\right\} }. \end{aligned}$$

The real span of a lattice can be tessellated by centring a copy of the fundamental domain on each lattice point. This fact can be used to approximate the number of lattice points in some ‘nice enough’ measurable set. The Gaussian heuristic says that the number of lattice points in a measurable set \(S\) is approximately \(\mathrm {vol}(S) / \mathrm {vol}(\varLambda )\). The Gaussian heuristic can be used to approximate the first minimum \(\lambda _1(\varLambda )\).

Definition 3

(Gaussian heuristic for the shortest vector). Given a rank \(n\) lattice \(\varLambda \), the Gaussian heuristic approximates the smallest radius containing a lattice point as

$$ \mathrm {gh}(\varLambda ) = \sqrt{\frac{n}{2 \pi e}}\, \mathrm {vol}{(\varLambda )}^{1/n}. $$

Various computational problems can be defined using lattices. We focus on the following.

Definition 4

(Shortest Vector Problem (SVP)). Given a lattice \(\varLambda \), find a vector \(\varvec{v} \in \varLambda \) of length \(\lambda _1(\varLambda )\).

Definition 5

(\(\gamma \)-unique Shortest Vector Problem (\(\mathbf {uSVP}_\gamma \))). Given a lattice \(\varLambda \) such that \(\lambda _2(\varLambda ) > \gamma \lambda _1(\varLambda )\), find the unique (up to sign) \(\varvec{v} \in \varLambda \) of length \(\lambda _1(\varLambda )\). Unless otherwise specified, \(\gamma = 1\).

Definition 6

(Learning With Errors (LWE) [Reg09]). Let n, q be positive integers, \(\chi \) be a probability distribution on \(\mathbb {Z}_q\) and \(\varvec{s}\) be a secret vector in \(\mathbb {Z}_{q}^{n}\). We denote by \(L_{\varvec{s}, \chi }\) the probability distribution on \(\mathbb {Z}_{q}^{n} \times \mathbb {Z}_{q}\) obtained by sampling \(\varvec{a} \leftarrow \mathcal {U}(\mathbb {Z}_q^n)\), \(e \leftarrow \chi \), and returning \((\varvec{a}, c) = (\mathbf {a},\langle \mathbf {a}, \mathbf {s}\rangle +e) \in \mathbb {Z}_{q}^{n} \times \mathbb {Z}_{q}\).

Decision LWE is the problem of deciding whether pairs \((\mathbf {a}, c) \in \mathbb {Z}_{q}^{n} \times \mathbb {Z}_{q}\) are sampled according to \(L_{\mathbf {s}, \chi }\) or  \(\mathcal {U}(\mathbb {Z}_{q}^{n} \times \mathbb {Z}_{q})\).

Search LWE is the problem of recovering \(\varvec{s}\) from pairs sampled according to \(L_{\mathbf {s}, \chi }\).

For a given distribution \(L_{\mathbf {s}, \chi }\) and prime power modulus \(q\), Decision LWE and Search LWE are polynomial time equivalent [Reg09].

We note that the distribution \(\chi \) from which the error is drawn tends to encode some notion of smallness, which is usually required for functionality. Throughout this work, we assume m LWE samples \({\{(\varvec{a}_i, c_i) \leftarrow L_{\mathbf {s}, \chi }\}}_{i=1}^m\) are available. These can be written in matrix form as \((\varvec{A}, \varvec{c}) = (\varvec{A},\varvec{s} \varvec{A} + \varvec{e}) \in \mathbb {Z}_q^{n \times m} \times \mathbb {Z}_q^{1 \times m}\). In the original formulation, the LWE secret vector is sampled uniformly from \(\mathbb {Z}_q^n\). A standard transformation [MR09, ACPS09] maps m samples from an LWE distribution \(L_{\mathbf {s}, \chi }\) with \(\varvec{s} \leftarrow \mathcal {U}(\mathbb {Z}_q^n)\) to \(m-n\) samples from an LWE distribution \(L_{\mathbf {s'}, \chi }\) where the secret vector \(\varvec{s'}\) is sampled coefficientwise from \(\chi \). Such a distribution is said to be in normal form. In general, more efficient key exchange can be built from LWE distributions where the secret is sampled from a narrow distribution such as \(\chi \) (small secret LWE) or from a distribution imposing or implying few non zero entries in \(\varvec{s}\) (sparse secret LWE). In this work \(\chi _s\) (resp. \(\chi _e\)) represents the distribution from which coefficients of \(\varvec{s}\) (resp. \(\varvec{e}\)) are sampled. Note that with high probability any n samples \((\varvec{A}, \varvec{c})\) from an LWE distribution with prime modulus q with \(\varvec{s} \leftarrow \chi _s^n\) and \(\varvec{e} \leftarrow \chi _e^n\) can be turned into n LWE samples \((\varvec{A}^{-1}, \varvec{c} \varvec{A}^{-1})\) where the roles of \(\chi _e\) and \(\chi _s\) are swapped. This can be useful for creating embedding lattices (see below) when choosing \(m \le n\).

Embedding Lattices. The primal attack transforms the Search LWE problem into a uSVP instance. This can always be achieved using Kannan’s embedding [Kan87]. In the case of small secret LWE, the Bai–Galbraith embedding variant [BG14] can also exploit differences in \(\chi _s\) and \(\chi _e\), whenever the former is small or sparse. In particular, given LWE samples \((\varvec{A}, \varvec{c})\) in such an instance, the primal attack starts by constructing the following embedding lattice basis

$$\begin{aligned} \varvec{B}=\left( \begin{array}{ccc} \mathbf {0} & q \mathbf {I}_{m} & \mathbf {0} \\ \nu \mathbf {I}_{n} & -\varvec{A} & \mathbf {0} \\ \mathbf {0} & \mathbf {c} & c \end{array}\right) \end{aligned}$$
(1)

and performs lattice reduction to recover the unique shortest vector \(\varvec{t} = ( *\mid \varvec{s} \mid 1) \cdot \varvec{B} = (\nu \, \varvec{s} \mid \varvec{e} \mid c)\) for suitable values of \(*\) and c, and a scalar \(\nu \) that balances the contributions of \(\varvec{s}\) and \(\varvec{e}\) to the norm of \(\varvec{t}\). An alternative approach is to first reduce the \((n+m) \times (n+m)\) top left minor of \(\varvec{B}\) as a form of preprocessing (e.g. if \(\varvec{A}\) is a common reference string for multiple LWE distributions), and later append the last row to finish the search for a specific target vector [LN13]. While lattice reduction software that takes \(\varvec{B}\) as input often requires that \(\nu \in \mathbb {Z}\), in the IACR ePrint version of this paper we discuss a standard way to construct variants of this embedding that allow us in practice to use any \(\nu \in \mathbb {R}\), as well as to centre the \(\chi _s\) and \(\chi _e\) distributions. For example, applying these techniques to an LWE instance with a binary secret distribution results in an embedding where the first n coordinates of \(\varvec{t}\) are distributed uniformly in \(\{-1, 1\}\).
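For concreteness, the following minimal sketch (our illustration, not code from the paper's repository) constructs the embedding basis (1) with NumPy, assuming an integer scaling factor \(\nu \) and the row-vector convention used throughout.

```python
import numpy as np

def bai_galbraith_basis(A, c_vec, q, nu=1, c=1):
    """Rows of the embedding basis (1): (0 | q I_m | 0), (nu I_n | -A | 0), (0 | c_vec | c).
    A is the n x m LWE matrix, c_vec = s A + e mod q (length-m row vector), nu an integer scaling."""
    n, m = A.shape
    d = n + m + 1
    B = np.zeros((d, d), dtype=np.int64)
    B[:m, n:n + m] = q * np.eye(m, dtype=np.int64)   # the q-vectors
    B[m:m + n, :n] = nu * np.eye(n, dtype=np.int64)  # nu * I_n
    B[m:m + n, n:n + m] = -A                         # -A
    B[-1, n:n + m] = c_vec                           # the LWE right-hand side
    B[-1, -1] = c                                    # the embedding constant
    return B
```

With this basis, \((\varvec{w} \mid \varvec{s} \mid 1) \cdot \varvec{B} = (\nu \varvec{s} \mid \varvec{e} \mid c)\), where the integer vector \(\varvec{w}\) accounts for the reduction modulo \(q\) in \(\varvec{c} = \varvec{s}\varvec{A} + \varvec{e} \bmod q\).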

Lattice Reduction. In general, lattice reduction is any algorithmic technique that takes as input a basis of a lattice and finds a basis of better quality. Many different notions of reduced basis exist, most of which can be intuitively captured by a basis being formed of short and close to orthogonal vectors. The celebrated LLL algorithm [LLL82] achieves the following.

Definition 7

(LLL reduced). For \(\delta \in (1/4, 1)\) a basis \(\varvec{B}\) is \(\delta \)-LLL reduced if \(|\mu _{i, j} |\le 1/2\) for all \(1 \le j < i \le d\) and \((\delta - \mu _{i, i - 1}^2) \left\| \varvec{b}_{i - 1}^*\right\| ^2 \le \left\| \varvec{b}_i^*\right\| ^2\) for \(i \in \{2, \dots , d\}\).

In this work we consider the performance of the BKZ algorithm [SE91, SE94], which achieves the following.

Definition 8

(BKZ- \(\beta \) reduced). A basis \(\varvec{B}\) is BKZ-\(\beta \) reduced if it is LLL reduced and for all \(i \in [d - 1], \Vert \varvec{b}_i^* \Vert = \lambda _1\left( \pi _i(\varvec{B}{[i :\min (i + \beta - 1, d)]})\right) \).

To achieve this, an oracle \(O_\text {SVP}\) is used that, given a lattice, finds a shortest vector in it. BKZ repeatedly calls \(O_\text {SVP}\) on the projected sublattices, or blocks, \(\pi _i(\varvec{B}{[i :\min (i + \beta - 1, d)]})\). If the output vector \(\varvec{v}\) is shorter than the current first vector in the block, it is inserted into the basis at the beginning of the block. Then LLL is run on the basis to remove linear dependencies introduced by this insertion. Throughout, we make use of the BKZ implementation in the FPLLL [dt16a] library, which sets \(\delta = 0.99\) in its underlying calls to LLL.

[Algorithm 1: The BKZ algorithm]

In Algorithm 1, we present a description of the BKZ algorithm. In its original description, BKZ terminates after a full tour is executed without inserting. We follow algorithmic improvements and do not necessarily run tours until this point. In particular, the notion of early abort (called auto-abort in some implementations [dt16a]) was introduced as part of the BKZ 2.0 algorithm [CN11]. The idea is that the majority of improvement occurs in a few early tours, whereas many tours are required before convergence. Following experimental analysis of BKZ [Che13, Figure 4.6], [Alb17, §2.5], Albrecht [Alb17] identifies \(\tau = 16\) as the number of tours after which little improvement is made to the basis quality. Furthermore, BKZ 2.0 integrates local block rerandomisation and preprocessing into the originally proposed \(O_\text {SVP}\) oracle, enumeration. We note that while recent advances in lattice sieving mean that enumeration \(O_\text {SVP}\) oracles are no longer the fastest in practice [ADH+19] for large SVP instances, our heuristic analysis is independent of the underlying \(O_\text {SVP}\) oracle, and for the block sizes we consider the enumeration of FPLLL is slightly faster than the sieves of [ADH+19].
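As an illustration of how such a reduction is typically invoked (a sketch based on the FPyLLL interface; parameters, flags and exact signatures are illustrative and may vary between versions), one can run BKZ 2.0 with auto-abort on a random q-ary basis and read off the resulting basis profile:

```python
from fpylll import IntegerMatrix, LLL, GSO, BKZ
from fpylll.algorithms.bkz2 import BKZReduction

A = IntegerMatrix.random(80, "qary", k=40, q=127)   # random q-ary lattice basis (rows)
LLL.reduction(A)

params = BKZ.Param(block_size=45, max_loops=16, strategies=BKZ.DEFAULT_STRATEGY,
                   flags=BKZ.AUTO_ABORT | BKZ.MAX_LOOPS)
BKZReduction(A)(params)                              # at most 16 tours of BKZ-45, with auto-abort

M = GSO.Mat(A)
M.update_gso()
profile = [M.get_r(i, i) for i in range(A.nrows)]    # basis profile: squared Gram-Schmidt norms
```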

In [AWHT16], Aono et al. introduce another variant of BKZ that they name Progressive BKZ. Here, the basis is reduced using increasingly larger block sizes \(\beta \), running tours of BKZ-\(\beta \) each time. For the purposes of this paper, we define Progressive BKZ as in Algorithm 2, allowing an arbitrary number \(\tau \) of tours to be run for each block size.

[Algorithm 2: Progressive BKZ]

One consequence of lattice reduction is that it controls how quickly the lengths of the Gram–Schmidt vectors \(\varvec{b}_i^*\) (for an output basis \(\varvec{B}\)) decay. In particular, the larger \(\beta \) is chosen in BKZ, the slower these lengths decay and the closer to orthogonal the basis vectors are. We call the lengths of the Gram–Schmidt vectors the basis profile.

Definition 9

(Basis profile). Given a basis \(\varvec{B}\) of a lattice of rank n, we define the profile of \(\varvec{B}\) as the set of squared norms of the orthogonal vectors \({\{\left\| \varvec{b}_i^*\right\| ^2\}}_{i=1}^n\).

Remark 1

In our algorithms, we refer to exact or estimated values \(\left\| \varvec{b}_i^*\right\| ^2\) for a basis as \(\mathtt {profile}[i]\).

Theoretical results exist about the output quality of BKZ-\(\beta \) [HPS11, ALNSD20], as well as heuristic assumptions, which better model average case performance when reducing random q-ary lattices.

Definition 10

(Geometric Series Assumption (GSA) [Sch03]). Given a basis \(\varvec{B}\), the norms of the Gram–Schmidt vectors \(\varvec{b}_i^*\) after lattice reduction satisfy

$$\begin{aligned} \Vert \varvec{b}_i^*\Vert = \alpha ^{i-1} \cdot \Vert \varvec{b}_1\Vert , \text { for some } 0< \alpha < 1. \end{aligned}$$

In the case of BKZ-\(\beta \), \(\alpha \) can be derived as a function of \(\beta \), by combining an estimate for \(\left\| \varvec{b}_1\right\| \) returned by BKZ [Che13] and the (constancy of the) lattice volume. The GSA can be seen as a global view of a lattice basis, using only the constant volume of the full lattice \(\varLambda \) and an estimate for the length of the first basis vector to calculate \(\alpha \). However, the volume of local blocks is not constant as LLL or BKZ is run on a basis. Chen and Nguyen propose a BKZ simulator [CN11] that takes this intuition into account to improve on the GSA in the case of BKZ. It takes as input a profile \({\{\left\| \varvec{b}_i^*\right\| ^2\}}_i\) and simulates a tour of BKZ-\(\beta \) by calculating, block by block, the Gaussian heuristic of the current \(\beta \) dimensional block, ‘inserting’ a vector of that length at the beginning of said block, and redistributing the necessary length to the subsequent Gram–Schmidt vectors to keep \(\mathrm {vol}(\varLambda )\) constant. Since projected sublattices of small rank, e.g. \(n \le 45\), do not behave as random,Footnote 2 to simulate the profile for the final indices of the basis the BKZ simulator stops using the Gaussian heuristic and instead uses experimental averages over unit volume lattices (scaled appropriately). This design also allows for one to simulate a fixed number of tours, rather than assuming convergence, as in the GSA. The process can be made probabilistic by ‘inserting’ a vector with length drawn from a probability distribution centred on the length suggested by the Gaussian heuristic [BSW18]. The latter approach better captures a phenomenon of lattice reduction called the head concavity.

Throughout our work we make use of the Chen–Nguyen simulator as implemented in FPyLLL [dt16b]. In Algorithm 3 we define a BKZSim subroutine that returns a [CN11] simulation for an input basis profile. Here LWE\(_{n, q, \chi , m}\) is a basis produced as in (1) with \(c = 1\), assuming normal form so that \(\nu = 1\) and \(\chi = \chi _s = \chi _e\). To produce the profile of an LLL reduced LWE basis, we considered three options. In the case of the instances used in our experiments, which are described in Sect. 5, such a profile can be easily obtained by performing LLL on any particular embedding basis. However, this is not the case for cryptographically sized embeddings, where FPLLL’s implementation of LLL can only run with high enough floating point precision by using MPFR [FHL+07], which becomes impractically slow. An alternative is to use a GSA slope corresponding to LLL reduction. This correctly predicts the slope of the main section of the profile, but does not account for the role played by the q-vectors in the embedding basis, which are short enough to not be affected by LLL [HG07]. The third option is to use a specific basis profile simulator for LLL that captures the effect of the q-vectors. We opt for the third option; a description of the Z-shape phenomenon and its simulation can be found in the IACR ePrint version of this paper.
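As a usage sketch (the exact FPyLLL interface may differ between versions, and the input profile below is a synthetic GSA-shaped profile rather than a simulated LLL-reduced LWE embedding basis), the [CN11] simulator can be called as follows:

```python
import math
from fpylll import BKZ
from fpylll.tools.bkz_simulator import simulate

# Synthetic input: squared Gram-Schmidt norms following an LLL-like GSA slope for a
# unit-volume lattice of rank d (illustrative stand-in for an LLL-reduced embedding basis).
d = 180
delta_lll = 1.0219                                    # commonly quoted LLL root-Hermite factor
log_delta = math.log(delta_lll)
log_alpha = -2.0 * d / (d - 1) * log_delta            # GSA slope for this delta
log_b1 = d * log_delta                                # log ||b_1|| for volume 1
profile = [math.exp(2 * (log_b1 + i * log_alpha)) for i in range(d)]

# Simulate 16 tours of BKZ-60 on this profile.
new_profile, tours_run = simulate(profile, BKZ.Param(block_size=60, max_loops=16))
```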

[Algorithm 3: The BKZSim subroutine]

3 Choosing BKZ Block Sizes and the ‘2016 Estimate’

In this section we motivate and explain the approach introduced in [ADPS16] to predict the block size required to solve uSVP using lattice reduction.

The runtime of BKZ-\(\beta \) is dominated by that of the \(O_\text {SVP}\) subroutine. The latter is often implemented using lattice point enumeration with preprocessing, which has time complexity \(\beta ^{\varTheta (\beta )}\), or lattice sieving, which has time and memory complexity \(2^{\varTheta (\beta )}\). Therefore, to estimate the complexity of solving uSVP using lattice reduction, it is crucial to estimate the smallest block size sufficient to recover the unique shortest vector \(\varvec{t} \in \varLambda \).

The most successful approach for making such estimates was introduced in [ADPS16, §6.3] and is sometimes referred to in the literature as the ‘2016 estimate’. The idea is to estimate a block size \(\beta \) such that at some point during lattice reduction, \(O_\text {SVP}\) will return a projection of the uSVP solution as the shortest vector in a local projected sublattice. If the rank of this projected sublattice is large enough, subsequent cheap lattice reduction operations (usually, a single call to LLL [AGVW17]) will recover the full uSVP solution. Concretely, this approach consists of finding the smallest \(\beta \) such that in the final full sized block starting at index \(d-\beta +1\),

$$\begin{aligned} \left\| \pi _{d - \beta + 1}(\varvec{t})\right\| \le \left\| \varvec{b}_{d - \beta + 1}^*\right\| , \end{aligned}$$
(2)

resulting in \(O_\text {SVP}\) recovering the projection of \(\varvec{t}\) at index \(d - \beta + 1\).

In [ADPS16], the authors consider normal form LWE, and assume the secret distribution \(\chi \) to be centred around 0. The uSVP solution will be an embedded vector for which each entry is drawn i.i.d. from a distribution of standard deviation \(\sigma \) and mean \(\mu = 0\), with the addition of one final, constant, entry \(c\).Footnote 3 Using the Bai–Galbraith embedding, our target vector is \(\varvec{t} = (\varvec{s} \mid \varvec{e} \mid c)\), of dimension \(d = n + m + 1\). The squared norm \(\left\| \varvec{t}\right\| ^2\) may be modelled as a random variable following a scaled chi-squared distribution \(\sigma ^2 \cdot \chi ^2_{d-1}\) with \(d - 1\) degrees of freedom, plus a fixed contribution from c, resulting in \(\mathbb {E}(\left\| \varvec{t}\right\| ^2) = {(d - 1)\sigma ^2 + c^2}\).

In [ADPS16], the authors approximate the left hand side of (2) as \(\left\| \pi _{d-\beta +1}(\varvec{t})\right\| \approx \mathbb {E}(\left\| \varvec{t}\right\| ) \sqrt{\beta /d} \approx \sigma \sqrt{\beta }\), where they approximate \(\mathbb {E}(\left\| \varvec{t}\right\| ) \approx \sigma \sqrt{d}\). The approximation \(\mathbb {E}(\left\| \varvec{t}\right\| ) \approx \sigma \sqrt{d}\) replaces \({(d - 1)\sigma ^2 + c^2}\) with \(d\sigma ^2\), which for large \(d\) or for \(c \approx \sigma \) introduces little error, and assumes that \(\mathbb {E}(\left\| \varvec{t}\right\| ) = {\mathbb {E}(\left\| \varvec{t}\right\| ^2)}^{1/2}\). The error in this assumption tends to 0 as \(d \rightarrow \infty \), so we ignore it. An exact derivation can be found in the IACR ePrint version of this paper. This assumption can also be avoided altogether by working with squared lengths, as we do in our analysis.

To approximate the right hand side of (2), in [ADPS16, §6.3] the authors make use of the GSA. Assuming that BKZ-\(\beta \) returns a first basis vector of length \(\ell _1(\beta )\) when called with the basis of a random q-ary lattice as input, this results in the following win condition that \(\beta \) must satisfy for solving uSVP using BKZ-\(\beta \),

$$\begin{aligned} \sqrt{\beta } \sigma \approx \left\| \pi _{d - \beta + 1}(\varvec{t})\right\| \le \left\| \varvec{b}_{d - \beta + 1}^*\right\| \approx \alpha {(\beta )}^{d - \beta } \cdot \ell _1(\beta ). \end{aligned}$$
(3)

At first glance the careful reader may notice an apparent contradiction in the methodology. Indeed, the GSA describes the basis profile produced by BKZ for a random lattice, and in [ADPS16] \(\ell _1\) is determined assuming this is the case. However, we are reducing a uSVP embedding lattice. While the embedding basis looks like that of a random q-ary lattice, the shortest vector will be shorter than \(\ell _1(\beta )\). Yet, this shortest vector is hard to find. What (3) aims to capture is exactly the moment where BKZ is able to find this shortest vector, and hence distinguish our uSVP embedding lattice from a random \(q\)-ary lattice. The GSA and \(\ell _1\) are used to describe the status of the basis up until this moment, while it still looks like the basis of a random q-ary lattice.
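To make the procedure concrete, here is a minimal sketch of the resulting block size search under the GSA, using the standard [Che13] estimate for the root-Hermite factor \(\delta _0(\beta )\) achieved by BKZ-\(\beta \) (our simplified illustration; the LWE estimator [APS15] implements a more careful version of the same computation):

```python
import math

def delta_0(beta):
    # Root-Hermite factor estimate for BKZ-beta [Che13]; the asymptotic formula is
    # only meaningful for beta larger than roughly 50.
    return ((beta / (2 * math.pi * math.e)) * (math.pi * beta) ** (1.0 / beta)) ** (1.0 / (2 * (beta - 1)))

def beta_2016(d, sigma, log_vol):
    """Smallest beta satisfying (3) under the GSA, with ell_1(beta) = delta_0^d * vol^(1/d)
    and alpha(beta) = delta_0^(-2d/(d-1)); d is the embedding dimension, log_vol = log vol(Lambda)."""
    for beta in range(50, d):
        log_delta = math.log(delta_0(beta))
        lhs = 0.5 * math.log(beta) + math.log(sigma)                      # log(sqrt(beta) * sigma)
        rhs = (d - beta) * (-2.0 * d / (d - 1)) * log_delta \
              + d * log_delta + log_vol / d                               # log(alpha^(d-beta) * ell_1)
        if lhs <= rhs:
            return beta
    return d
```

For the embedding (1), \(\log \mathrm {vol}(\varLambda ) = m \log q + n \log \nu \) (the final row contributes \(\log c = 0\) when \(c = 1\)).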

In this model, (3) provides a clear-cut answer to the question of the smallest viable block size for solving uSVP. In practice, BKZ 2.0 is a randomised algorithm, working on a random uSVP instance. In [AGVW17], the authors verify the validity of this win condition, resulting in a success probability of approximately \(90\%\) when using \(\beta \) chosen by following (3). However, they also measure relatively high success probabilities of solving uSVP for somewhat smaller block sizes.

4 Simulating Solving uSVP

In this section, we review and extend recent work on capturing the probabilistic nature of the described uSVP win condition. In [DSDGR20], Dachman-Soled et al. revisit the [ADPS16] heuristic methodology described in Sect. 3. The authors are concerned with accurately predicting the effects that introducing side channel information to their lattice embedding has on the success probability of solving uSVP using Progressive BKZ, while also maintaining accuracy in the small block size regime, \(\beta \le 45\). The authors describe a uSVP simulator (not to be confused with the BKZ simulator of [CN11]), designed to predict the success probability of Progressive BKZ solving an isotropic uSVP instance by a specific block size.Footnote 4 Using their uSVP simulator, they predict the expected successful block size for a series of experiments they run, and verify the accuracy of their predictions. We start by simplifying the [DSDGR20] uSVP simulator for Progressive BKZ, and then develop a similar uSVP simulator for BKZ 2.0. We focus on the simulator as described in [DSDGR20] at the time of release. Since the time of writing, the latest version of the simulator proposed in [DSDGR20] adopted some of the techniques described below, for allowing \(\tau > 1\) and faster simulations.

4.1 Progressive BKZ

The approach proposed in [DSDGR20] to estimate the required block size to solve a uSVP instance is to simulate the status of a lattice basis as it is being reduced, and with it the probability at each step of the lattice reduction algorithm that the target vector is recovered.

[Algorithm 4: The [DSDGR20] uSVP simulator for Progressive BKZ]

Let W be the event of solving uSVP during the run of Progressive BKZ, \(W_\beta \) the event of solving uSVP during the round with block size \(\beta \), and \(F_\beta = \lnot W_\beta \). Following the notation in Algorithm 2, we assume \(\tau = 1\), meaning that for each block size \(\beta \) exactly one tour of BKZ-\(\beta \) is run. They implicitly partition W as follows

$$ P[W] = P[W_3] + P[W_4 \wedge F_3] + P[W_5 \wedge F_4 \wedge F_3] + \cdots = \sum _{\beta =3}^d P\left[ W_\beta \wedge \bigwedge _{j = 3}^{\beta -1} F_j \right] . $$

Their computation of the expected winning block size \(\bar{\beta }\) amounts to implicitly defining a probability mass function for a random variable \(B\) representing the first viable block size to solve the uSVP instance, and computing its expected value. In the case of Progressive BKZ, a block size \(\beta \) being the first viable means that the round of BKZ run with block size \(\beta \) (i.e. the tour of Line 3 of Algorithm 2 with block size \(\beta \)), and not any earlier round using a smaller block size, solves the uSVP instance. The resulting probability mass function for the distribution of B can be modelled as

$$ P[B = \beta ] = P\left[ W_\beta \wedge \bigwedge _{j = 3}^{\beta -1} F_j\right] . $$

The probability \(P[W_\beta ]\) is itself modelled as the product of the probability of successfully recovering \(\pi _{d - \beta + 1}(\varvec{t})\) by calling \(O_\text {SVP}\) on the last full size block,

$$ P[\pi _{d - \beta + 1}(\varvec{t}) \text { recovered using block size }\beta ] \approx P[x \leftarrow \chi _\beta ^2 :x \le \mathtt {profile}[d-\beta +1]], $$

and the probability of successfully lifting the projection over subsequent rounds, \(p_\text {lift}\). In their implementation of Algorithm 4, Dachman-Soled et al. use a chain of conditional probabilities to compute \(p_\text {lift}\). Events \(W_i\) and \(F_j\) for \(i \ne j\) are considered to be independent, therefore \(P[B = \beta ]\) is computed as the relevant product.

We introduce two simplifications to the above uSVP simulator. Firstly, we noticed experimentally that running BKZ with block sizes smaller than \(40\) will not solve instances for which the [ADPS16] approach predicts a winning block size of \(\beta \gtrsim 60\), where most cryptographic applications (and our experiments) reside. Therefore, we skip probability computations for any block sizes smaller than \(40\). Furthermore, values of \(p_\text {lift}\) approach 1 quickly as \(\beta \) increases, such that one can simply assign \(p_\text {lift} = 1\) for \(\beta \ge 40\); a similar phenomenon is noted in [AGVW17]. Finally, by allowing multiple tours per block size, we define a uSVP simulator, Algorithm 5, for Progressive BKZ as described in Algorithm 2 where \(\tau \) may be greater than 1. A comparison between the output of Algorithms 4 and 5 can be found in Fig. 1 for four isotropic LWE instances, where \(\tau = 1\). To produce Fig. 1, we tweaked the original [DSDGR20] code in order to extract the implicit probability mass function \(P[B = \beta ]\). Our simplifications significantly speed up the simulation by avoiding the expensive computation of \(p_\text {lift}\). In particular, our simulations for Kyber 512 (resp. 1024) take 4 s (resp. 31 s) against the 20 min (resp. 2 h) of [DSDGR20]. We can see that the output probabilities \(P[B \le \beta ]\) and the expected successful block sizes differ only slightly, and optimistically for the attacker, on low dimensional instances, with this difference shrinking for cryptographically sized problems.
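To make the bookkeeping explicit, the following sketch shows the probability accumulation at the core of Algorithm 5 in simplified form (one success check per block size rather than per tour, \(p_\text {lift} = 1\) for \(\beta \ge 40\), and independence across rounds); here bkz_sim is a stand-in for a basis profile simulator such as [CN11], not a real library function.

```python
from scipy.stats import chi2

def usvp_success_pmf(profile, sigma, d, bkz_sim, beta_min=40, beta_max=None, tau=1):
    """profile: squared Gram-Schmidt norms of the (LLL-reduced) embedding basis, 0-indexed.
    Returns {beta: P[B = beta]}, the probability that beta is the first viable block size."""
    beta_max = beta_max or d
    pmf = {}
    p_not_yet = 1.0                              # P[no earlier round solved the instance]
    for beta in range(beta_min, beta_max + 1):
        profile = bkz_sim(profile, beta, tau)    # simulate tau tours of BKZ-beta
        # P[||pi_{d-beta+1}(t)||^2 <= ||b*_{d-beta+1}||^2] under the scaled chi-squared model
        p_round = chi2.cdf(profile[d - beta] / sigma**2, df=beta)
        pmf[beta] = p_not_yet * p_round
        p_not_yet *= 1.0 - p_round
    return pmf
```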

Fig. 1.

Comparison between the output of Algorithm 4 [DSDGR20] and Algorithm 5 (this work) for isotropic parameters (\(\sigma = 1\)) from Table 1, and on Kyber 512 and 1024 [SAB+19]. The difference in predicted mean first viable block size between the two simulators is reported as \(\varDelta \mathbb {E}(\beta )\).

[Algorithm 5: Our uSVP simulator for Progressive BKZ]

4.2 BKZ

Using the same approach as for Algorithm 4 and Algorithm 5, we implemented a uSVP simulator for BKZ, described in Algorithm 6. In this case, the basis profile after a number of tours of BKZ-\(\beta \) is simulated in one shot using the [CN11] simulator. Given that the block size is fixed, the probabilities are only accumulated over tours. It should be noted that the event of \(\beta \) being the first viable block size changes in the case of BKZ. In this case, no unsuccessful tours with a smaller block size are run by the algorithm. Instead, we consider \(\beta \) being first viable if running BKZ-\((\beta -1)\) would not result in a solution to the uSVP instance but running BKZ-\(\beta \) would.

Algorithm 6 returns the probability that \(\tau \) tours of BKZ-\(\beta \) will solve uSVP, but does not exclude the possibility of winning with a smaller block size. We assume in our model that if \(\tau \) tours of BKZ-\(\beta \) solve a given uSVP instance, then \(\tau \) tours of BKZ-\(\beta '\), for \(\beta ' > \beta \), also will. The values output by Algorithm 6 for a given instance can therefore be interpreted as a cumulative mass function for the first viable block size, i.e. \(P[B \le \beta ]\). By running the simulator for increasing block sizes until it outputs probability \(1\), one may recover the probability mass function \(P[B = \beta ]\) as

$$\begin{aligned} P[B = \beta ]&= P[B \le \beta ] - P[B \le \beta - 1]. \end{aligned}$$
[Algorithm 6: Our uSVP simulator for BKZ 2.0]
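As a small usage note (with purely illustrative numbers), converting the cumulative output of Algorithm 6 into the probability mass function is a matter of taking successive differences:

```python
# Cumulative probabilities P[B <= beta] returned by Algorithm 6 for consecutive beta
# (illustrative numbers only).
cmf = {58: 0.12, 59: 0.31, 60: 0.57, 61: 0.79, 62: 0.92, 63: 1.00}
pmf = {beta: p - cmf.get(beta - 1, 0.0) for beta, p in cmf.items()}
```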

5 Experiments

In this section, we describe the experiments we run to check the accuracy of Algorithms 5 and 6, and discuss the results. We start by describing our original batch of experiments in Sect. 5.1. In Sect. 5.2 we make some observations about our experimental results, and describe further tweaked experiments that we run to verify our understanding of the results.

5.1 Initial Experiments

Our aim in this section is threefold: first, we want to provide experimental evidence for the accuracy of our BKZ and Progressive BKZ uSVP simulators when predicting the success probability of the primal attack against LWE with discrete Gaussian secret and error for different block sizes; second, we want to compare previous experiments [AGVW17] to our uSVP simulations; and finally, we want to explore the effect that binary or ternary distributions have on the primal attack. Throughout our experiments, we use BKZ 2.0 as implemented in FPyLLL [dt16b] version 0.5.1dev, writing our own Progressive BKZ script by using FPyLLL’s BKZ 2.0 as a subroutine.

For our first goal, we choose three different parametrisations of the LWE problem, for which the [ADPS16] approach predicts an expected successful block size of either 60 or 61. We give the parameters in Table 1. All parameter sets in these batches use discrete Gaussian secret and error with \(\mathbb {V}(\chi _s) = \mathbb {V}(\chi _e) = \sigma ^2\). The number of LWE samples used, m, is determined by what the LWE estimator [APS15] predicts to be optimal, using (3). For each parameter set we generate 100 instances, and reduce them using either BKZ or Progressive BKZ. We then check whether lattice reduction positioned the embedded shortest target vector in the first index of the reduced basis.

In the case of BKZ, for each basis we run a number of tours of BKZ with block size \(\beta = 45, \dots , 65\). The number of tours, \(\tau \), takes the values 5, 10, 15, 20, 30. This results in a total of 100 bases, reduced independently \(21 \times 5\) times each, once for every combination of \(\beta \) and \(\tau \). For every set of 100 reductions, we record the success rate by counting the number of solved instances. We run a similar set of experiments using Progressive BKZ, allowing \(\tau \ge 1\) tours per block size, in order to see at what point running extra tours per block size becomes redundant. For this reason, we reduce each basis 5 times, once per value of \(\tau \) in 1, 5, 10, 15, 20. After every call to the BKZ subroutine, we check whether the instance is solved. If not, we increase the block size by 1 and run a further tour of BKZ.
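For illustration, the Progressive BKZ side of this procedure can be outlined as follows (a simplified sketch assuming the FPyLLL interface; progressive_bkz_until_solved and its arguments are our hypothetical names, with B the embedding basis and t the planted target vector):

```python
from fpylll import IntegerMatrix, BKZ
from fpylll.algorithms.bkz2 import BKZReduction

def progressive_bkz_until_solved(B, t, tau=1, beta_max=65):
    """B: IntegerMatrix embedding basis (modified in place), t: planted target vector (list of ints).
    Returns the block size at which +-t first appears as the first basis vector, or None."""
    target, neg_target = tuple(t), tuple(-x for x in t)
    for beta in range(3, beta_max + 1):
        params = BKZ.Param(block_size=beta, max_loops=tau, strategies=BKZ.DEFAULT_STRATEGY,
                           flags=BKZ.AUTO_ABORT | BKZ.MAX_LOOPS)
        BKZReduction(B)(params)                          # tau tours of BKZ-beta (may auto-abort)
        first = tuple(B[0, j] for j in range(B.ncols))
        if first in (target, neg_target):                # solved: the target is the first basis vector
            return beta
    return None
```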

The resulting success rates for BKZ and Progressive BKZ (with \(\tau =1\)) are plotted in Fig. 2, together with the output of our uSVP simulators, interpolated as curves. Figure 3 contains similar plots for Progressive BKZ with \(\tau \ge 1\). In Fig. 5 we plot the measured difference between the average mean and standard deviation for the simulated and experimental probability distributions, for both Progressive BKZ and BKZ.

Table 1. List of LWE parameters used for testing our uSVP simulators. The instances are in normal form. We use the Bai–Galbraith embedding and the number of samples used, \(m_{2016}\), is given by the LWE estimator (commit 428d6ea).

For our second goal, we take the success probabilities reported in [AGVW17] for their experiments. In Fig. 4 we report their measured success rates at optimal and smaller than optimal block sizes, and we superimpose our BKZ success probability simulations.

Finally, for our third goal, we run Progressive BKZ experiments for \(\tau \) in 1, 5, 10, 15, 20 on three parameter sets using bounded uniform secrets. In particular, we pick the \(n = 72\) and \(n = 93\) parameters from Table 1 but sample secret \(\varvec{s}\) and error \(\varvec{e}\) coefficients uniformly from the set \(\{-1, 1\}\), and the \(n=100\) parameters with secret and error coefficients sampled uniformly from \(\{-1, 0, 1\}\). This preserves the same standard deviations as in Table 1, while adding more structure to the target vector. In the first case, the \(\varvec{s}\) and \(\varvec{e}\) are equivalent to those of a scaled and centred LWE instance with binary secret and error (see the IACR ePrint version of this paper), while in the second case, the problem is LWE with ternary \(\varvec{s}\) and \(\varvec{e}\). The resulting success probability plots can be found in Fig. 6.

Fig. 2.

Comparison of simulated success probabilities with experimental results for BKZ and Progressive BKZ (with \(\tau = 1\)). Dashed lines are simulations, crosses are experiments. In the case of Progressive BKZ, 100 total instances are reduced. In the case of BKZ, each experimental result is averaged over 100 instances, with experiments using up to block size 65.

Fig. 3.

Comparison of simulated success probabilities with experimental results for Progressive BKZ with \(\tau \ge 1\). Dashed lines are simulations, crosses are experiments.

Fig. 4.

Comparison of simulated BKZ success probabilities with experimental results reported in Table 1 of [AGVW17].

Fig. 5.

The measured difference \(\varDelta \mathbb {E}[\beta ]\) (resp. \(\varDelta \sqrt{\mathbb {V}}[\beta ]\)) between the simulated and experimental successful block size mean (resp. standard deviation), as \(\tau \) grows. \(\varDelta \mathbb {E}[\beta ] \ge 0\) (resp. \(\varDelta \sqrt{\mathbb {V}}[\beta ] \ge 0\)) represents the simulated successful block size mean (resp. standard deviation) being greater than the experimentally measured value.

Fig. 6.

Comparison of simulated success probabilities with experimental results for Progressive BKZ on LWE instances with scaled and centred binary secret and error (Figs. 6a and 6b), and ternary secret and error (Fig. 6c). Dashed lines are simulations, crosses are experiments. Each experimental result is averaged over 100 instances. No changes were made to the uSVP simulators.

5.2 Observations

Experimental success rates for both BKZ and Progressive BKZ are in line with the output of the simulators described in Sect. 4. Below, we look at the results.

Progressive BKZ. In the case of Progressive BKZ, simulations seem to predict accurately the success probabilities for \(\tau \le 10\) and all secret and error distributions used. Throughout our experiments reported in Fig. 3, we observe two ways in which experiments slightly deviate from predictions.

Firstly, the success probability appears to stop significantly increasing for \(\tau > 10\), even when the simulation does predict some improvement. We expect this to be a consequence of the large amount of lattice reduction being performed. Indeed, whenever the BKZ-\(\beta \) subroutine is called, the basis has already been reduced with \(\tau \) tours of BKZ-\((\beta -j)\) for \(j = 1,\dots ,\beta -3\). This suggests that little progress on the basis profile can be made with each new tour of BKZ-\(\beta \). In our experiments, we use FPyLLL’s BKZ 2.0 implementation with auto-abort, which triggers by default after the slope of the basis profile does not improve for five tours, the slope being computed using a simple linear regression of the logarithm of the basis profile. This means that if it is the case that little progress can be made, fewer than \(\tau \) tours will be run. To verify this, we rerun experiments while measuring the number of tours run by the BKZ subroutine. The data for the \(n = 100\) experiments can be found in Fig. 7, and seems to confirm that auto-abort for \(\beta > 20\) is much more frequently triggered for \(\tau > 10\). This problem does not affect Progressive BKZ with \(\tau = 1\) since, even with auto-abort, one tour is always run, and only slightly affects \(\tau = 5\) and \(\tau = 10\).Footnote 5 Predictions match experiments well in the \(\tau \le 10\) cases. We note that, even if we were to force all \(\tau \) tours to be performed, once ‘would be auto-abort’ conditions are reached, very few (if any) alterations would likely be made to the basis by each new tour. This means that the last full block of the basis would not be being rerandomised enough for the event of recovering \(\pi _{d-\beta +1}(\varvec{t})\) at tour i to be independent from the event of recovering it at tour \(i{-}1\), as our model assumes. For example, if the basis was not modified by the latest i-th tour and \(\pi _{d-\beta +1}(\varvec{t})\) was not recovered by \(O_\text {SVP}\) after tour \(i{-}1\), it will also not be recovered after tour i.

Fig. 7.

Measured number of tours run by the BKZ 2.0 subroutine of Progressive BKZ with \(\tau \ge 5\) for each round of reduction with block size \(\beta \). Numbers are from experiments using the \(n = 100\) parameters from Table 1, with discrete Gaussian secret and error. Values are averaged over 100 instances. Fewer than \(\tau \) tours are run if either BKZ-\(\beta \) does not change the basis or auto-abort triggers.

The other phenomenon is the presence of a slight plateau in the probability plots as \(P[B \le \beta ] \ge 0.8\). In the case of \(n=72\) we also see that smaller than predicted block sizes accumulate a significant success probability. Interestingly, this effect does not appear to be present in the case of binary secret and error LWE, see Figs. 6a and 6b. We expect that this phenomenon is caused by the slight variation in sample variance throughout our experiments. Indeed, if we think of our target vector \(\varvec{t} = (t_1, \dots , t_d)\) as sampled coefficientwise from some distribution \(\chi \) with variance \(\sigma ^2\), in practice the resulting sample variance \(s^2\) of the coefficients of each particular LWE instance’s target vector, computed about the sample mean, will likely slightly deviate from \(\sigma ^2\). We would therefore expect \({\left\| \pi _i(\varvec{t})\right\| }^2\) to follow a distribution slightly different to \(\sigma ^2 \cdot \chi ^2_{d - i + 1}\). However, in the case of \(\chi = \mathcal {U}(\{-1, 1\})\), the distribution resulting from scaled and centred binary LWE embeddings, the sample variance \(s^2\) has a very small variance \(\mathbb {V}(s^2)\),Footnote 6 meaning that most sampled target vectors will have sample variance almost exactly \(\mathbb {V}(\chi ) = 1\). To verify this hypothesis, we run a set of \(n=72\) and \(n=100\) discrete Gaussian experiments from Table 1, where we resample each LWE instance until the target vector’s sample variance is within a 2% error of \(\sigma ^2\), and then run Progressive BKZ with \(\tau \) in 1, 5, 10. The resulting experimental probability distributions, shown in Fig. 8, do not present plateaus (and in the case of \(n=72\), they also do not present the high success probability for small block sizes), supporting our hypothesis. In practice, this effect should not significantly affect cryptographic parameters, as \(\mathbb {V}(s^2) \in O(\frac{1}{d})\) [KK51, Eq. 7.20], keeping the effect of fluctuations in \(\left\| \pi _{d-\beta +1}(\varvec{t})\right\| ^2\) small as the embedding dimension \(d\) increases.
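A quick numerical illustration of this point (our own sketch, with an arbitrary dimension): coefficientwise Gaussian targets show a visible spread in \(\left\| \varvec{t}\right\| ^2\), whereas \(\pm 1\) targets of the same variance have essentially none.

```python
import numpy as np

d, sigma, trials = 180, 1.0, 10000
rng = np.random.default_rng(0)

gaussian_targets = rng.normal(0.0, sigma, size=(trials, d))
sign_targets = rng.choice([-1.0, 1.0], size=(trials, d))     # scaled/centred binary coordinates

for name, T in (("gaussian", gaussian_targets), ("+-1", sign_targets)):
    sq_norms = (T ** 2).sum(axis=1)
    print(name, sq_norms.mean(), sq_norms.std())             # same mean d*sigma^2, very different spread
```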

Our uSVP simulators output similarly accurate simulations for scaled and centred binary, and ternary, secret and errors, as seen in Fig. 6, without making any alterations. This is in line with the notion that the hardness of solving uSVP via lattice reduction depends on the standard deviation of the target vector’s coefficients rather than their exact distribution. In recent work [CCLS20], Chen et al. run small block size (\(\beta \le 45\)) experiments and from their results conclude that the [ADPS16] methodology may be overestimating the security of binary and ternary secret LWE instances, and that discrete Gaussian secrets offer ‘greater security levels’. We believe their conclusions to be incorrect. First, their experiments are exclusively run in the small block size regime, where it is known that lattice heuristics often do not hold [GN08b, §4.2], [CN11, §6.1]. Second, their methodology does not take into account the norm of the embedded shortest vector. In their experiments they compare \(\text {LWE}_{n, q, \chi , m}\) instances where \(\chi \) is swapped between several distributions with different variances. They use the [BG14] embedding, which results in target vectors whose expected norms grow with the variance of \(\chi \). This means instances with narrower \(\chi \) will be easier to solve, something that can already be predicted by running the LWE estimator using the secret_distribution parameter. The estimator will also perform secret coefficient guessing, thus reducing the dimensionality of the problem. After this guessing has occurred, narrower \(\chi \) giving rise to easier instances does not mean that Gaussian secrets offer ‘greater security levels’ than binary or ternary secrets, but rather that when fixing n, q, m, the larger the secret variance, the harder the instance. Gaussian secrets with variance smaller than 1/4 would result in lower security than binary secrets in such a setting. We think the experiments to determine whether discrete Gaussian secrets are more secure than binary or ternary secrets should therefore be to compare LWE instances with different secret distributions, but equal variances, as done in this section, and that parameter selection for small secret LWE should keep the secret’s variance in consideration.

Fig. 8.

Progressive BKZ success probability against LWE instances with discrete Gaussian secret and error and \((n, \sigma ^2) \in \{(72, 1), (100, 2/3)\}\), such that their sample variance is within 2% of \(\sigma ^2\).

BKZ. In the case of BKZ, simulations seem to stay similarly accurate across all secret dimensions n, as reported in Fig. 2. It should be noted that, even though a larger gap than for Progressive BKZ can be seen between predictions and experiments in the case of \(\tau = 5\), this predictive gap in expected block size of less than 3 corresponds to about 1 bit in a core-sieve cost model [ADPS16]. Furthermore, this gap narrows as \(\tau \) increases. Following experimental results from [Che13, Figure 4.6] and [Alb17], designers often [ACD+18] consider it sufficient to reduce a basis using \(\tau =16\) tours of BKZ when specifying BKZ cost models, due to the basis quality not improving significantly after 16 tours. Our simulators seem accurate for values of \(\tau \) in such a regime. Another observation is that Progressive BKZ with \(\tau =1\) outperforms BKZ with \(\tau = 5\). Indeed, the former performs approximately \(\beta \) tours of increasing block size versus the latter’s five tours of block size \(\beta \). It seems therefore that for these lattice parameters Progressive BKZ applies ‘more’ lattice reduction. We do not attempt to give a closed formula for the minimum block size for which BKZ outperforms Progressive BKZ in output quality. We also see that the phenomenon of success probabilities not increasing when \(\tau \ge 10\), as in the Progressive BKZ case, does not occur here. This is compatible with our understanding of this phenomenon in the case of Progressive BKZ. Indeed, BKZ-\(\beta \) will not auto-abort as often due to the input basis not having already been reduced with, for example, \(\tau \) tours of BKZ-\((\beta -1)\).

However, a different interesting phenomenon can be observed. Sometimes, as the block size is increased, the experimental success probability of BKZ decreases; see the BKZ experiments in Fig. 2. For example, this happens between block sizes 60 and 61 in Fig. 2a when running \(\tau =5\) tours of BKZ. Originally we believed this to be caused by the preprocessing strategies used in FPyLLL. Indeed, at the time of writing, the preprocessing strategy for block size \(\beta \) (resp. \(\beta + 1\)) could include running BKZ-\(\beta '\) (resp. BKZ-\(\beta ''\)) with \(\beta ' > \beta ''\), resulting in lower quality preprocessing for BKZ-\((\beta +1)\) than for BKZ-\(\beta \). We replaced the default preprocessing strategies with a custom one in which preprocessing block sizes are non-decreasing as a function of \(\beta \); however, this did not remove the effect. A possible cause for this phenomenon is that basis profiles output by the [CN11] simulator do not capture the possibility that Gram–Schmidt vector norms may locally increase with their index, i.e. that one could have a BKZ-\(\beta \) reduced basis such that \(\left\| \varvec{b}^*_{d-\beta }\right\| < \left\| \varvec{b}^*_{d-\beta +1}\right\| \). Such events, occurring across instances or block sizes, could explain the phenomenon. The probabilistic BKZ simulator developed in [BSW18] seems to better capture this behaviour when run with a fixed PRNG seed. An example of the output of our uSVP simulator for BKZ, when replacing the [CN11] simulator with the [BSW18] simulator, can be found in Fig. 9. However, our experimental measurements are averaged over 100 runs, and running our uSVP simulator with the [BSW18] simulator and averaging its output results in a simulation with strictly increasing probabilities, unlike our measurements. In any case, the overall success probability predictions stay reasonably accurate.
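For reference, the sketch below shows how the two basis-profile simulators can be compared directly. The module and function names (fpylll.tools.bkz_simulator.simulate for [CN11], simulate_prob for [BSW18]) and their signatures reflect our reading of recent FPyLLL versions and should be treated as assumptions, as should the toy input profile.

```python
# Hedged sketch: comparing the [CN11] and [BSW18] BKZ simulators shipped with
# FPyLLL; names and signatures follow our reading of fpylll.tools.bkz_simulator.
from statistics import fmean
from fpylll import BKZ
from fpylll.tools.bkz_simulator import simulate, simulate_prob

# Toy input: squared Gram-Schmidt norms of a d-dimensional basis with a
# GSA-like shape (purely illustrative, not one of our experimental profiles).
d = 180
r = [2.0 ** (2 * 0.03 * (d - 2 * i)) for i in range(d)]

params = BKZ.Param(block_size=60, max_loops=5, flags=BKZ.MAX_LOOPS)

r_cn11, _ = simulate(list(r), params)                     # [CN11], deterministic
r_bsw18, _ = simulate_prob(list(r), params, prng_seed=1)  # [BSW18], one seed

# Averaging the probabilistic simulator over several seeds smooths out the
# local non-monotonicity that a single seed can exhibit.
runs = [simulate_prob(list(r), params, prng_seed=s)[0] for s in range(1, 11)]
r_avg = [fmean(col) for col in zip(*runs)]
```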

Finally, looking at Fig. 4, our simulations appear consistent with the measurements originally reported in [AGVW17, Table 1]. The simulators therefore seem to explain the success probabilities reported in that paper for smaller than expected block sizes.

Fig. 9. Both plots show BKZ experiments and uSVP simulations for \(n=100\) instances with Gaussian secret and error, where the calls to the [CN11] simulator made in Algorithm 6 are replaced. The left plot shows simulations where the [BSW18] simulator is used with a fixed PRNG seed. The right plot shows the same experimental data with simulations obtained by averaging the output of the [BSW18] simulator over 10 different seeds.

6 Simulations of Cryptographically Sized LWE Instances

In previous sections we developed simulators for the success probability of solving uSVP instances and tested them against uSVP embedding lattices generated from small LWE instances that could be solved in practice. An immediate application could be to use such simulators to estimate the behaviour of lattice reduction when used against cryptographically sized instances.

Here we use our simulators to compute the expected first viable block sizes required to solve LWE and NTRU instances proposed for the NIST PQC standardisation process. In particular, we look at the second round versions of the three lattice KEM finalists: Kyber [SAB+19], NTRU [ZCH+19], and Saber [DKRV19]. An interesting option would be to use the simulators to predict what block size is required to solve an instance with a low target success probability. However, as we discuss in Sect. 5.2, the simulations are not necessarily fully accurate for smaller or larger block sizes, due to the fluctuations in sample variance that an instance can have. While this effect should be minor for cryptographically sized instances, low probability attacks may also include combinatorial techniques not captured by our simulators. Therefore, extracting block sizes for low probability attacks from the simulated probabilities may not capture all of the necessary subtleties. Furthermore, we will see that the window of block sizes predicted to be first viable is relatively narrow, so that lower success probability attacks without combinatorial tricks should not be significantly cheaper than higher success probability attacks.

In Table 2, we look at parameter sets from the lattice KEM finalists in the third round of the NIST PQC standardisation process [NIS16], as specified during the second round. We provide expected first viable block sizes \(\mathbb {E}(\beta )\) (and their standard deviations \(\sqrt{\mathbb {V}(\beta )}\)) when using \(\tau = 15\) tours of BKZ, and Progressive BKZ with \(\tau = 1\) or 5 (see Algorithm 2). We choose \(\tau =15\) for BKZ because our experiments confirm the accuracy of our simulators for this value, and because of its closeness to 16, which is commonly found in BKZ cost models. We choose \(\tau =1\) and \(\tau =5\) in the case of Progressive BKZ since our experiments suggest both cases are accurately predicted by the uSVP simulator; this also allows us to see whether running more tours in the BKZ subroutine has any effect on the complexity for cryptographically sized parameters.

Two clear disclaimers should be made. First, in Table 2 we list the expected block sizes required to solve the uSVP instances arising from the primal attack. While in an aggressive cost model for these algorithms, such as core-SVP [ADPS16], one could be tempted to make direct cost comparisons between the two algorithms based only on \(\beta \), in the case of BKZ we assume that \(\tau \) tours of BKZ-\(\beta \) are run, while in the case of Progressive BKZ about \(\tau \beta \) tours of varying block size are run. Second, for both algorithms we fix the same number of samples m, chosen with the aid of the LWE estimator as the optimal number of samples under the ‘2016 estimate’ (except in the case of NTRU, where we assume \(m = n\) samples). This is not necessarily the optimal number of samples for each specific block size when computed using a uSVP simulator. We therefore avoid making claims and comparisons regarding the exact cost of solving uSVP using the two algorithms, and propose our results as an intermediate step between using the current LWE estimator and finding a theoretically cheapest attack using our simulators.
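For concreteness, given a simulated cumulative success probability curve \(\beta \mapsto \Pr [\text {first viable block size} \le \beta ]\), as output by a uSVP simulator, the quantities \(\mathbb {E}(\beta )\) and \(\sqrt{\mathbb {V}(\beta )}\) reported in Table 2 can be obtained as in the sketch below; the example curve is made up purely for illustration.

```python
# Sketch: expected first viable block size and its standard deviation, derived
# from a simulated cumulative success probability curve (made-up example data).
def mean_and_std(betas, cumulative_probs):
    """betas: increasing block sizes; cumulative_probs: P[first viable <= beta]."""
    pmf, prev = [], 0.0
    for p in cumulative_probs:
        pmf.append(p - prev)
        prev = p
    total = sum(pmf)                  # renormalise if the curve stops below 1
    pmf = [p / total for p in pmf]
    mean = sum(b * p for b, p in zip(betas, pmf))
    var = sum((b - mean) ** 2 * p for b, p in zip(betas, pmf))
    return mean, var ** 0.5

betas = list(range(380, 391))
cum = [0.01, 0.03, 0.08, 0.18, 0.35, 0.55, 0.74, 0.87, 0.95, 0.99, 1.0]
print(mean_and_std(betas, cum))
```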

6.1 Observations

In almost all cases the mean required block size \(\mathbb {E}(\beta )\) is predicted to be larger than the LWE estimator currently suggests. Our results for Progressive BKZ with \(\tau = 1\) against NTRU-HPS are in line with the predictions of Dachman-Soled et al. [DSDGR20, Table 5] (NTRU-HPS being the only examined scheme in common). The increase in \(\mathbb {E}(\beta )\) may seem counterintuitive: the Alkim et al. [ADPS16] methodology already aims to recover \(\mathbb {E}(\beta )\), and the simulators described in Sect. 4 additionally capture the success probability of smaller block sizes, which should, if anything, reduce \(\mathbb {E}(\beta )\). The increase seems to be mainly due to the use of the [CN11] simulator rather than the GSA for predicting the profile of a BKZ reduced basis (i.e. the right hand side of (3)). An illustrative example of this effect in the case of Kyber 512 can be seen in Fig. 10. Indeed, patching the LWE estimator to partially (Footnote 7) use the [CN11] simulator, we obtain \(\mathbb {E}(\beta ) = 390\) (resp. 636, 890) for Kyber 512 (resp. Kyber 768, Kyber 1024), narrowing the gap with the predictions obtained in Table 2 using our uSVP simulators. The small standard deviations reported in Table 2 suggest that the success probability of block sizes below \(\mathbb {E}(\beta )\) decreases quickly.
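For reference, we recall (in our own paraphrase and notation, with \(\varvec{v}\) the unique short target vector, d the embedding dimension, \(\delta _\beta \) the root Hermite factor of BKZ-\(\beta \), and \(\Lambda \) the embedding lattice) the GSA-based ‘2016 estimate’ of [ADPS16, §6.3]: the chosen block size is the smallest \(\beta \) such that
\(\sqrt{\beta /d}\cdot \left\| \varvec{v}\right\| \le \delta _\beta ^{\,2\beta -d-1}\cdot \mathrm {Vol}(\Lambda )^{1/d},\)
i.e. such that the expected norm of the projection of \(\varvec{v}\) onto the last \(\beta \) Gram–Schmidt directions falls below the GSA estimate of \(\left\| \varvec{b}^*_{d-\beta +1}\right\| \). Replacing the GSA with the [CN11] simulator amounts to replacing the right hand side with the simulated value of \(\left\| \varvec{b}^*_{d-\beta +1}\right\| \), which is the effect illustrated in Fig. 10.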

Table 2. Security estimates for some lattice schemes. The number of samples m used in the embedding for Kyber (LWE) and Saber (LWR) is chosen using the LWE estimator, to optimise the cost of the attack following the 2016 estimate for BKZ [ADPS16]. In the case of NTRU, the number of samples m is chosen equal to n. \(\beta _{2016}\) is the block size suggested by the LWE estimator. For BKZ and Progressive BKZ (PBKZ), \(\mathbb {E}(\beta )\) and \(\sqrt{\mathbb {V}(\beta )}\) are the mean and standard deviation of the distribution of first viable block sizes.
Fig. 10. Example plot showing the effect on the [ADPS16] methodology of using the [CN11] BKZ simulator rather than the GSA, in the case of Kyber 512. Due to the resulting higher basis profile, the GSA leads to picking a smaller block size. The required winning block size in the [ADPS16] methodology is the distance from the vertical line indicating the intersection to the final basis index \(d\). Note that this plot is zoomed in (\(d > 800\)).

Conclusion. Overall, the experiments in Sect. 5 suggest that the techniques in Sect. 4 help to more accurately predict the success probability of lattice reduction for solving uSVP. They also suggest that, in the case of short vectors sampled coefficientwise from bounded uniform distributions, it is the variance of the distribution, and not its exact probability mass function, that determines the hardness of the LWE instance. The uSVP simulators also seem to explain the success probabilities for smaller than expected block sizes reported in [AGVW17].

As part of our experiments, we also tested whether using Progressive BKZ with \(\tau > 1\) could be beneficial for an attacker. This seems to be useful to some small degree from the point of view of success probabilities, although BKZ seems to perform comparatively well. However, Progressive BKZ could be of interest to an attacker who wants to start performing lattice reduction as part of a long term attack, but initially has access to fewer resources (Footnote 8) than necessary to run BKZ with the expected first viable block size. Progressive BKZ would then allow them to increase their resources as the attack progresses, with \(\tau > 1\) allowing them to stop at a slightly smaller overall final block size.

We also note that our preliminary estimates of the block sizes required for lattice reduction to succeed against cryptographically sized instances are higher than those output by the LWE estimator [APS15]. This seems to be mostly due to our use of a BKZ simulator rather than the GSA. A patch to the LWE estimator substituting the GSA with a BKZ simulator would reduce this discrepancy.