1 Introduction

The Learning with Errors problem (LWE) has attained a central role in cryptography as a key hard problem for building cryptographic constructions, e.g. quantum-safe public-key encryption/key exchange and signature schemes [Reg09, LP11, ADPS16, BG14a], fully homomorphic encryption [BV11, GSW13] and obfuscation of some families of circuits [BVWW16].

Informally, LWE asks to recover a secret vector \(\mathbf {s} \in \mathbb {Z} _q^n\), given a matrix \(\mathbf {A} \in \mathbb {Z} _q^{m\times n}\) and a vector \(\mathbf {c} \in \mathbb {Z} _q^m\) such that \(\mathbf {A} \mathbf {s} + \mathbf {e} = \mathbf {c} \mod q\) for a short error vector \(\mathbf {e} \in \mathbb {Z} _q^m\) sampled coordinate-wise from an error distribution \(\chi \). The decision variant of LWE asks to distinguish between an LWE instance \((\mathbf {A}, \mathbf {c})\) and uniformly random \((\mathbf {A}, \mathbf {c})\in \mathbb {Z} _q^{m\times n} \times \mathbb {Z} _q^m\). To assess the security provided by a given set of parameters \(n,\chi , q\), two strategies are typically considered: the dual strategy finds short vectors in the lattice

$$\begin{aligned} q\varLambda ^* = \left\{ \mathbf {x} \in \mathbb {Z} _q^m \ | \ \mathbf {x} \cdot \mathbf {A} \equiv 0 \bmod {q}\right\} \!, \end{aligned}$$

i.e. it solves the Short Integer Solutions problem (SIS). Given such a short vector \(\mathbf {v} \), we can decide if an instance is LWE by computing \(\langle {\mathbf {v},\mathbf {c}}\rangle =\langle {\mathbf {v},\mathbf {e}}\rangle \bmod q\), which is short whenever \(\mathbf {v} \) and \(\mathbf {e} \) are sufficiently short [MR09]. This strategy was recently revisited for small, sparse secret instances of LWE [Alb17]. The primal strategy finds the closest vector to \(\mathbf {c} \) in the integral span of columns of \(\mathbf {A} \) mod \(q\) [LP11], i.e. it solves the corresponding Bounded Distance Decoding problem (BDD) directly. Writing \([\mathbf {I} _{n} | \mathbf {A} ']\) for the reduced row echelon form of \(\mathbf {A} ^{T} \in \mathbb {Z} _{q}^{n \times m}\) (with high probability and after appropriate permutation of columns), this task can be reformulated as solving the unique Shortest Vector Problem (uSVP) in the \(m+1\) dimensional \(q\)-ary lattice

$$\begin{aligned} \varLambda = \mathbb {Z} ^{m+1} \cdot \left( \begin{array}{ccc} \mathbf {I} _{n} & \mathbf {A} ' & 0 \\ \mathbf {0} & q\,\mathbf {I} _{m-n} & 0 \\ \multicolumn{2}{c}{\mathbf {c} ^T} & t \\ \end{array} \right) \end{aligned}$$
(1)

by Kannan’s embedding [Kan87] with embedding factor \(t\). Indeed, BDD and uSVP are polynomial-time equivalent for small approximation factors up to \(\sqrt{n{/}\log n}\) [LM09]. The lattice \(\varLambda \) has volume \(t\cdot q^{m-n}\) and contains a vector of norm \(\sqrt{\Vert \mathbf {e} \Vert ^{2} + t^{2}}\) which is unusually short, i.e. the gap between the first and second Minkowski minimum \(\lambda _{2}(\varLambda )/\lambda _{1}(\varLambda )\) is large.
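
To make the construction concrete, the following is a minimal sketch of building the basis (1) from an LWE instance \((\mathbf {A}, \mathbf {c})\). It assumes Python with numpy, a prime \(q\), and that no column permutation is needed for the reduced row echelon form of \(\mathbf {A} ^{T}\); the helper name is hypothetical and this is not the code used for our experiments.

```python
import numpy as np

def kannan_basis(A, c, q, t=1):
    """Sketch: rows of the returned matrix generate the lattice (1).

    Assumes q prime and that the reduced row echelon form of A^T mod q
    has the shape [I_n | A'] without a column permutation (which holds
    with high probability, cf. the discussion above).
    """
    m, n = A.shape
    M = (A.T % q).astype(object)                    # n x m, exact integers
    for i in range(n):                              # Gauss-Jordan mod q
        piv = next(r for r in range(i, n) if M[r, i] % q != 0)
        M[[i, piv]] = M[[piv, i]]
        M[i] = (M[i] * pow(int(M[i, i]), -1, q)) % q
        for r in range(n):
            if r != i:
                M[r] = (M[r] - M[r, i] * M[i]) % q
    d = m + 1
    B = np.zeros((d, d), dtype=object)
    B[:n, :n] = np.eye(n, dtype=object)             # I_n
    B[:n, n:m] = M[:, n:]                           # A'
    B[n:m, n:m] = q * np.eye(m - n, dtype=object)   # q I_{m-n}
    B[m, :m] = np.asarray(c) % q                    # c^T
    B[m, m] = t                                     # embedding factor
    return B
```

The unusually short vector \((\mathbf {e} \mid t)\) then lies in the integer row span of the returned matrix.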

Alternatively, if the secret vector \(\mathbf {s} \) is also short, there is a second established embedding reducing LWE to uSVP (cf. Eq. (4)). When the LWE instance under consideration is in normal form, i.e. the secret \(\mathbf {s} \) follows the noise distribution, the geometries of the lattices in (1) and (4) are the same, which is why, without loss of generality, we only consider (1) in this work, save for Sect. 5.

To find short vectors, lattice reduction [LLL82, Sch87, GN08a, HPS11, CN11, MW16] can be applied. Thus, to establish the cost of solving an LWE instance, we may consider the cost of lattice reduction for solving uSVP.

Two conflicting estimates for the success of lattice reduction in solving uSVP are available in the literature. The first goes back to [GN08b] and was developed in [AFG14, APS15, Gö16, HKM17] for LWE. This estimate is commonly relied upon by designers in the literature, e.g. [BG14a, CHK+17, CKLS16a, CLP17, ABB+17]. The second estimate was recently outlined in [ADPS16] and is relied upon in [BCD+16, BDK+17]. We will use the shorthand 2008 estimate for the former and 2016 estimate for the latter. As illustrated in Fig. 1, the predicted costs under these two estimates differ greatly. For example, considering \(n=1024\), \(q \approx 2^{15}\) and \(\chi \) a discrete Gaussian with standard deviation \(\sigma = 3.2\), the former predicts a cost of \({\approx }2^{355}\) operations, whereas the latter predicts a cost of \({\approx }2^{287}\) operations in the same cost model for lattice reduction.

Fig. 1. Required block size \(\beta \) according to the estimates given in [AFG14, ADPS16] for modulus \(q=2^{15}\), standard deviation \(\sigma =3.2\) and increasing \(n\); for [AFG14] we set \(\tau =0.3\) and \(t=1\). Lattice reduction runs in time \(2^{\varOmega (\beta )}\).

Our Contribution. Relying on recent progress in publicly available lattice-reduction libraries [FPL17, FPY17], we revisit the embedding approach for solving LWE resp. BDD under some reasonable assumptions about the LWE error distribution. After some preliminaries in Sect. 2, we recall the two competing estimates from the literature in Sect. 3. Then, in Sect. 4, we expand on the exposition from [ADPS16], followed by presenting the results of 23,000 core hours' worth of lattice-reduction experiments in medium to larger block sizes \(\beta \). Our results confirm that lattice reduction largely follows the behaviour expected from the 2016 estimate [ADPS16]. However, we also find that in our experiments the attack behaves somewhat better than expected. In Sect. 4.3, we then explain the observed behaviour of the BKZ algorithm under the Geometric Series Assumption (GSA, see below) and under the assumption that the unique shortest vector is distributed in a random direction relative to the rest of the basis. Finally, in Sect. 5, using the 2016 estimate, we show that some proposed parameters from the literature need to be updated to maintain the currently claimed level of security. In particular, we give reduced costs for solving the LWE instances underlying TESLA [ABB+17] and the somewhat homomorphic encryption scheme in [BCIV17]. We also show that under the revised estimate, the primal attack performs about as well on SEAL v2.1 parameter sets as the dual attack from [Alb17].

2 Preliminaries

We write vectors in lower-case bold, e.g. \(\mathbf {a} \), and matrices in upper-case bold, e.g. \(\mathbf {A} \). We write \(\langle {\cdot ,\cdot }\rangle \) for the inner product and \(\cdot \) for the matrix-vector product. By abuse of notation we consider vectors to be row resp. column vectors depending on context, such that \(\mathbf {v} \cdot \mathbf {A} \) and \(\mathbf {A} \cdot \mathbf {v} \) are meaningful. We write \(\mathbf {I} _{m}\) for the \(m \times m\) identity matrix over whichever base ring is implied from context. We write \(\mathbf {0} _{m \times n}\) for the \(m \times n\) all zero matrix. If the dimensions are clear from the context, we may omit the subscripts.

2.1 Learning with Errors

The Learning with Errors (LWE) problem is defined as follows.

Definition 1

(LWE [Reg09]). Let \(n,\,q\) be positive integers, \(\chi \) be a probability distribution on \( \mathbb {Z} \) and \(\mathbf {s} \) be a secret vector in \( \mathbb {Z} _q^n\). We denote by \(L_{\mathbf {s},\chi }\) the probability distribution on \( \mathbb {Z} _q^n \times \mathbb {Z} _q\) obtained by choosing \(\mathbf {a} \in \mathbb {Z} _q^n\) uniformly at random, choosing \(e \in \mathbb {Z} \) according to \(\chi \) and considering it in \( \mathbb {Z} _q\), and returning \((\mathbf {a},c) = (\mathbf {a},\langle {\mathbf {a},\mathbf {s}}\rangle + e) \in \mathbb {Z} _q^n \times \mathbb {Z} _q\).

Decision-LWE is the problem of deciding whether pairs \((\mathbf {a}, c) \in \mathbb {Z} _q^n \times \mathbb {Z} _q\) are sampled according to \(L_{\mathbf {s},\chi } \) or the uniform distribution on \( \mathbb {Z} _q^n \times \mathbb {Z} _q\).

Search-LWE is the problem of recovering \(\mathbf {s} \) from \((\mathbf {a}, c)=(\mathbf {a},\langle {\mathbf {a},\mathbf {s}}\rangle + e) \in \mathbb {Z} _q^n \times \mathbb {Z} _q\) sampled according to \(L_{\mathbf {s},\chi } \).

We may write LWE instances in matrix form \(\left( \mathbf {A},\mathbf {c} \right) \), where rows correspond to samples \(\left( \mathbf {a} _i,c_i\right) \). In many instantiations, \(\chi \) is a discrete Gaussian distribution with standard deviation \(\sigma \). Throughout, we denote the number of LWE samples considered as \(m\). Writing \(\mathbf {e} \) for the vector of error terms, we expect \(\Vert \mathbf {e} \Vert \approx \sqrt{m}\sigma \).
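
For concreteness, the following is a small sketch of sampling an instance in this matrix form. A rounded continuous Gaussian stands in for a proper discrete Gaussian sampler, which is immaterial for illustration purposes; the helper name is ours.

```python
import numpy as np

def lwe_instance(n, m, q, sigma, seed=None):
    """Sketch: sample an LWE instance (A, c) in matrix form, cf. Definition 1."""
    rng = np.random.default_rng(seed)
    A = rng.integers(0, q, size=(m, n))   # m uniform samples a_i
    s = rng.integers(0, q, size=n)        # uniform secret
    e = np.rint(rng.normal(0, sigma, size=m)).astype(int)  # rounded Gaussian
    c = (A @ s + e) % q
    return A, s, e, c

# parameters of the experiments in Sect. 4.2; we expect ||e|| ~ sqrt(m) * sigma
A, s, e, c = lwe_instance(n=65, m=182, q=521, sigma=8 / np.sqrt(2 * np.pi))
```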

2.2 Lattices

A lattice is a discrete subgroup of \( \mathbb {R} ^d\). Throughout, \(d\) denotes the dimension of the lattice under consideration and we only consider full rank lattices, i.e., lattices \(\varLambda \subset \mathbb {R} ^d\) such that \(\mathsf {span}_{ \mathbb {R}}(\varLambda )= \mathbb {R} ^d\). A lattice \(\varLambda \subset \mathbb {R} ^d\) can be represented by a basis \(\mathbf {B} =\{\mathbf {b} _1, \ldots , \mathbf {b} _k\}\), i.e., the vectors of \(\mathbf {B} \) are linearly independent and \(\varLambda = \mathbb {Z} \mathbf {b} _1 + \cdots + \mathbb {Z} \mathbf {b} _k\). We write \(\mathbf {b} _{i}\) for basis vectors and \(\mathbf {b} _{i}^{*}\) for the corresponding Gram-Schmidt vectors. We write \(\varLambda (\mathbf {B} )\) for the lattice generated by the rows of the matrix \(\mathbf {B} \), i.e. all integer-linear combinations of the rows of \(\mathbf {B} \). The volume of a lattice \({{\mathrm{Vol}}}(\varLambda )\) is the absolute value of the determinant of any basis and it holds that \({{\mathrm{Vol}}}(\varLambda ) = \prod _{i=1}^{d} \Vert \mathbf {b} _i^*\Vert \). We write \(\lambda _{i}(\varLambda )\) for Minkowski’s successive minima, i.e. the radius of the smallest ball centred around zero containing \(i\) linearly independent lattice vectors. The Gaussian Heuristic predicts

$$\begin{aligned} \lambda _1(\varLambda ) \approx \sqrt{\frac{d}{2 \pi e}} {{{\mathrm{Vol}}}(\varLambda )}^{1/d}. \end{aligned}$$

For a lattice basis \(\mathbf {B} = \{\mathbf {b} _1, \ldots , \mathbf {b} _d\}\) and for \(i \in \{1, \ldots , d\}\) let \(\pi _{\mathbf {B}, i}(\mathbf {v})\) denote the orthogonal projection of \(\mathbf {v} \) onto the orthogonal complement of the span of \(\{\mathbf {b} _1, \ldots , \mathbf {b} _{i-1}\}\), where \(\pi _{\mathbf {B}, 1}\) is the identity. We extend the notation to sets of vectors in the natural way. Since usually the basis \(\mathbf {B} \) is clear from the context, we omit it in the notation and simply write \(\pi _{i}\) instead of \(\pi _{\mathbf {B}, i}\). Since Sect. 4.3 relies heavily on size reduction, we recall its definition and reproduce the algorithm in Algorithm 1.

Definition 2

Let \(\mathbf {B} \) be a basis, \(\mathbf {b} _{i}^{*}\) its Gram-Schmidt vectors and

$$\begin{aligned} \mu _{i,j} = \langle {\mathbf {b} _{i},\mathbf {b} _{j}^{*}}\rangle /\langle {\mathbf {b} _{j}^{*},\mathbf {b} _{j}^{*}}\rangle , \end{aligned}$$

then the basis \(\mathbf {B} \) is size reduced if \(|\mu _{i,j}| \le 1/2\) for all \(1 \le j < i \le d\).

Algorithm 1. Size reduction.
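
The following Python sketch mirrors Definition 2 and Algorithm 1: for each \(i\), the coefficients \(\mu _{i,j}\) are brought down to at most 1/2 in absolute value by subtracting rounded multiples of earlier basis vectors. It is a plain floating-point illustration, not the fplll routine used later.

```python
import numpy as np

def size_reduce(B):
    """Sketch of size reduction on the rows of B: for each i, subtract
    round(mu_{i,j}) * b_j for j = i-1, ..., 1, so that |mu_{i,j}| <= 1/2
    afterwards.  The Gram-Schmidt vectors are unchanged by these steps.
    """
    B = np.array(B, dtype=float)
    d = len(B)
    Bstar = B.copy()                  # Gram-Schmidt vectors b_j*
    for j in range(d):
        for k in range(j):
            Bstar[j] -= (B[j] @ Bstar[k]) / (Bstar[k] @ Bstar[k]) * Bstar[k]
    for i in range(1, d):
        for j in range(i - 1, -1, -1):
            mu = (B[i] @ Bstar[j]) / (Bstar[j] @ Bstar[j])
            B[i] -= np.rint(mu) * B[j]
    return B
```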

2.3 Lattice Reduction

Informally, lattice reduction is the process of improving the quality of a lattice basis. To express the output quality of a lattice reduction, we may relate the shortest vector in the output basis to the volume of the lattice in the Hermite-factor regime or to the shortest vector in the lattice, in the approximation-factor regime. Note that any algorithm finding a vector with approximation factor \(\alpha \) can be used to solve Unique-SVP with a gap \(\lambda _{2}(\varLambda )/\lambda _{1}(\varLambda ) > \alpha \).

The best known theoretical bound for lattice reduction is attained by Slide reduction [GN08a]. In this work, however, we consider the BKZ algorithm (more precisely: BKZ 2.0 [Che13], cf. Sect. 4.2), which performs better in practice. The BKZ-\(\beta \) algorithm repeatedly calls an SVP oracle for finding (approximate) shortest vectors in dimension or block size \(\beta \). It has been shown that after polynomially many calls to the SVP oracle, the basis does not change much anymore [HPS11]. After BKZ-\(\beta \) reduction, we call the basis BKZ-\(\beta \) reduced and in the Hermite-factor regime assume [Che13] that this basis contains a vector of length \(\Vert \mathbf {b} _{1}\Vert = \delta _{0} ^{d} \cdot {{{\mathrm{Vol}}}(\varLambda )}^{1/d}\) where

$$\begin{aligned} \delta _{0} = {(({(\pi \beta )}^{1/\beta } \beta )/(2 \pi e))}^{1/(2 (\beta -1))}. \end{aligned}$$

Furthermore, we generally assume that for a BKZ-\(\beta \) reduced basis of \(\varLambda (\mathbf {B} )\) the Geometric Series Assumption holds.

Definition 3

(Geometric Series Assumption [Sch03]). The norms of the Gram-Schmidt vectors after lattice reduction satisfy

$$\begin{aligned} \Vert \mathbf {b} _i^*\Vert = \alpha ^{i-1} \cdot \Vert \mathbf {b} _1\Vert \,\, \mathrm{for \,\, some} \,\, 0< \alpha < 1. \end{aligned}$$

Combining the GSA with the root-Hermite factor \(\Vert \mathbf {b} _1\Vert = \delta _{0} ^d \cdot {{{\mathrm{Vol}}}(\varLambda )}^{1/d}\) and \({{\mathrm{Vol}}}(\varLambda ) = \prod _{i=1}^{d} \Vert \mathbf {b} _i^*\Vert \), we get \(\alpha = \delta _{0} ^{-2d/(d-1)} \approx \delta _{0} ^{-2}\) for the GSA.
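
These two formulas combine into a simple prediction of the basis profile. The sketch below computes \(\delta _{0} \) and the GSA log-profile \(\log \Vert \mathbf {b} _{i}^{*}\Vert = (d - 2(i-1))\log \delta _{0} + \frac{1}{d}\log {{\mathrm{Vol}}}(\varLambda )\), working in logs to avoid overflow at cryptographic sizes.

```python
from math import pi, e, log

def delta_0(beta):
    """Root-Hermite factor for BKZ-beta, as given above."""
    return (((pi * beta) ** (1 / beta) * beta) / (2 * pi * e)) ** (1 / (2 * (beta - 1)))

def gsa_log_profile(d, log_vol, beta):
    """Predicted log ||b_i*|| under the GSA with alpha ~ delta_0^{-2}."""
    ld = log(delta_0(beta))
    return [(d - 2 * i) * ld + log_vol / d for i in range(d)]  # i = 0 is b_1

# e.g. delta_0(56) ~ 1.0117
```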

3 Estimates

As highlighted above, two competing estimates exist in the literature for when block-wise lattice reduction will succeed in solving uSVP instances such as (1).

3.1 2008 Estimate

A first systematic experimental investigation into the behaviour of the lattice-reduction algorithms LLL, DEEP and BKZ was provided in [GN08b]. In particular, [GN08b] investigates the behaviour of these algorithms for solving Hermite-SVP, Approx-SVP and Unique-SVP for families of lattices used in cryptography.

For Unique-SVP, the authors performed experiments in small block sizes on two classes of semi-orthogonal lattices and on Lagarias-Odlyzko lattices [LO83], which permit estimating the gap \(\lambda _2(\varLambda )/\lambda _1(\varLambda )\) between the first and second minimum of the lattice. For all three families, [GN08b] observed that LLL and BKZ seem to recover a unique shortest vector with high probability whenever \(\lambda _2(\varLambda ) / \lambda _1(\varLambda ) \ge \tau \delta _{0} ^d\), where \(\tau < 1\) is an empirically determined constant that depends on the lattice family and algorithm used.

In [AFG14] an experimental analysis of solving LWE based on the same estimate was carried out for lattices of the form (1). As mentioned above, this lattice contains an unusually short vector \(\mathbf {v} = (\mathbf {e} \mid t)\) of squared norm \({\lambda _1(\varLambda )}^2 = \left||\mathbf {v} \right||^2 = \left||\mathbf {e} \right||^2 + t^2\). Thus, when \(t = \left||\mathbf {e} \right||\) resp. \(t=1\) this implies \(\lambda _1(\varLambda ) \approx \sqrt{2m}\sigma \) resp. \(\lambda _1(\varLambda ) \approx \sqrt{m}\sigma \), with \(\sigma \) the standard deviation of the LWE error distribution \(\chi \). The second minimum \(\lambda _2(\varLambda )\) is assumed to correspond to the Gaussian Heuristic for the lattice. Experiments in [AFG14] using LLL and BKZ (with block sizes 5 and 10) confirmed the 2008 estimate, providing constant values of \(\tau \) for lattices of the form (1), depending on the chosen algorithm, for a 10% success rate. Overall, \(\tau \) was found to lie between 0.3 and 0.4 when using BKZ.

Still focusing on LWE, in [APS15] a closed formula for \(\delta _{0} \) is given as a function of \(n, \sigma , q, \tau \), which implicitly assumes \(t=\left||\mathbf {e} \right||\). In [Gö16], a bound for \(\delta _{0} \) in the [GN08b] model is given for the case \(t=1\), which is always used in practice. In [HKM17], a related closed formula is given, directly expressing the asymptotic running time for solving LWE using this approach.
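
In the same spirit, the following sketch expresses the 2008 condition as a bound on \(\delta _{0} \) (here for \(t=1\), \(\lambda _1 \approx \sqrt{m}\sigma \) and \(\lambda _2\) from the Gaussian Heuristic) and inverts the block-size formula numerically. The helper names are ours and the search is purely illustrative.

```python
from math import pi, e, sqrt, log, exp

def delta_0(beta):  # root-Hermite factor, cf. Sect. 2.3
    return (((pi * beta) ** (1 / beta) * beta) / (2 * pi * e)) ** (1 / (2 * (beta - 1)))

def delta_required_2008(n, q, sigma, m, tau=0.3):
    """Sketch: largest delta_0 admissible under the 2008 estimate
    lambda_2 / lambda_1 >= tau * delta_0^d, with d = m + 1, t = 1,
    lambda_1 ~ sqrt(m) * sigma and lambda_2 from the Gaussian
    Heuristic for Vol(Lambda) = q^{m-n} (logs avoid overflow).
    """
    d = m + 1
    log_lam2 = log(d / (2 * pi * e)) / 2 + (m - n) / d * log(q)
    return exp((log_lam2 - log(sqrt(m) * sigma) - log(tau)) / d)

def beta_required(delta):
    """Smallest beta >= 40 with delta_0(beta) <= delta (illustrative)."""
    beta = 40
    while delta_0(beta) > delta:
        beta += 1
    return beta
```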

3.2 2016 Estimate

In [ADPS16] an alternative estimate is outlined. The estimate predicts that \(\mathbf {e} \) can be found if

$$\begin{aligned} \sqrt{\beta /d} \left||(\mathbf {e} \mid 1)\right|| \approx \sqrt{\beta } \sigma \le \delta _{0} ^{2\beta -d} \, {{{\mathrm{Vol}}}(\varLambda (\mathbf {B} ))}^{1/d}, \end{aligned}$$
(2)

under the assumption that the Geometric Series Assumption holds (until a projection of the unusually short vector is found). The brief justification for this estimate given in [ADPS16] notes that this condition ensures that the projection of \(\mathbf {e} \) orthogonally to the first \(d-\beta \) (Gram-Schmidt) vectors is shorter than the expectation for \(\mathbf {b} _{d-\beta +1}^*\) under the GSA and thus would be found by the SVP oracle when called on the last block of size \(\beta \). Hence, for any \(\beta \) satisfying (2), the actual behaviour would deviate from that predicted by the GSA. Finally, the argument can be completed by appealing to the intuition that a deviation from expected behaviour on random instances—such as the GSA—leads to a revelation of the underlying structural, secret information.
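
Concretely, one may search for the minimal pair \((\beta , m)\) satisfying (2) along the following lines. This is an exhaustive search in logs, for illustration only; the helper name is ours.

```python
from math import pi, e, sqrt, log

def delta_0(beta):  # root-Hermite factor, cf. Sect. 2.3
    return (((pi * beta) ** (1 / beta) * beta) / (2 * pi * e)) ** (1 / (2 * (beta - 1)))

def beta_2016(n, q, sigma, m_max=None):
    """Sketch: minimal pair (beta, m), in lexicographical order, with
    sqrt(beta) * sigma <= delta_0^{2 beta - d} * q^{(m-n)/d}, d = m + 1,
    i.e. condition (2) with Vol(Lambda) = q^{m-n} and t = 1.
    """
    m_max = m_max or 3 * n
    for beta in range(40, m_max):
        for m in range(max(beta, n + 1), m_max + 1):
            d = m + 1
            if log(sqrt(beta) * sigma) <= (2 * beta - d) * log(delta_0(beta)) \
                    + (m - n) / d * log(q):
                return beta, m

# for n = 65, q = 521, sigma = 8/sqrt(2*pi) this yields (beta, m) ~ (56, 182),
# matching the parameters used in Sect. 4.2
```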

4 Solving uSVP

Given the significant differences in expected solving time under the two estimates, cf. Fig. 1, and recent progress in publicly available lattice-reduction libraries enabling experiments in larger block sizes [FPL17, FPY17], we conduct a more detailed examination of BKZ’s behaviour on uSVP instances. For this, we first explicate the outline from [ADPS16] to establish the expected behaviour, which we then experimentally investigate in Sect. 4.2. Overall, our experiments confirm the expectation. However, the algorithm behaves somewhat better than expected, which we then explain in Sect. 4.3.

For the rest of this section, let \(\mathbf {v} \) be a unique shortest vector in some lattice \(\varLambda \subset \mathbb {R} ^d\), i.e. in case of (1) we have \(\mathbf {v} = (\mathbf {e} \mid t)\) where we pick \(t=1\).

4.1 Prediction

Projected norm. In what follows, we assume the unique shortest vector \(\mathbf {v} \) is drawn from a spherical distribution or is at least “not too skewed” with respect to the current basis. As a consequence, following [ADPS16], we assume that all orthogonal projections of \(\mathbf {v} \) onto a k-dimensional subspace of \( \mathbb {R} ^d\) have expected norm \((\sqrt{k}/ \sqrt{d}) \left||\mathbf {v} \right||\). Note that this assumption can be dropped by adapting (2) to \(\left||\mathbf {v} \right|| \le \delta _{0} ^{2\beta -d}\, {{{\mathrm{Vol}}}(\varLambda )}^{\frac{1}{d}}\) since \(\left||\pi _{d-\beta +1}(\mathbf {v})\right|| \le \left||\mathbf {v} \right||\).

Finding a projection of the short vector. Assume that \(\beta \) is chosen minimally such that (2) holds. When running BKZ, the lengths of the Gram-Schmidt vectors of the current basis converge to the lengths predicted by the GSA. Therefore, at some point BKZ will find a basis \(\mathbf {B} = \{\mathbf {b} _1, \ldots , \mathbf {b} _d\}\) of \(\varLambda \) for which we can assume that the GSA holds with root-Hermite factor \(\delta _{0} \). Now, consider the stage of BKZ where the SVP oracle is called on the last full projected block of size \(\beta \) with respect to \(\mathbf {B} \). Note that the projection \(\pi _{d-\beta +1}(\mathbf {v})\) of the shortest vector is contained in the lattice

$$\begin{aligned} \varLambda _{d-\beta +1} := \varLambda \left( \pi _{d-\beta +1}(\mathbf {b} _{d-\beta +1}), \ldots , \pi _{d-\beta +1}(\mathbf {b} _d)\right) \!, \end{aligned}$$

since

$$\begin{aligned} \pi _{d-\beta +1}(\mathbf {v}) = \sum _{i = d-\beta +1}^d{\nu _i \pi _{d-\beta +1}(\mathbf {b} _{i})} \in \varLambda _{d-\beta +1}, \text { where } \nu _i \in \mathbb {Z} \text { with } \mathbf {v} = \sum _{i = 1}^d{\nu _i \mathbf {b} _{i}}. \end{aligned}$$

By (2), the projection \(\pi _{d-\beta +1}(\mathbf {v})\) is in fact expected to be the shortest non-zero vector in \(\varLambda _{d-\beta +1}\), since it is shorter than the GSA’s estimate for \(\lambda _1(\varLambda _{d-\beta +1})\), i.e.

$$\begin{aligned} \left||\pi _{d-\beta +1}(\mathbf {v})\right|| \approx \frac{\sqrt{\beta }}{\sqrt{d}} \left||\mathbf {v} \right|| \le \delta _{0} ^{-2(d-\beta )+d}{{{\mathrm{Vol}}}(\varLambda )}^{\frac{1}{d}}. \end{aligned}$$

Hence the SVP oracle will find \(\pm \pi _{d-\beta +1}(\mathbf {v})\) and BKZ inserts

$$\begin{aligned} \mathbf {b} _{d-\beta +1}^\mathsf {new} = \pm \sum _{i = d-\beta +1}^d{\nu _i \mathbf {b} _{i}} \end{aligned}$$

into the basis \(\mathbf {B} \) at position \(d-\beta +1\), as already outlined in [ADPS16]. In other words, by finding \(\pm \pi _{d-\beta +1}(\mathbf {v})\), BKZ recovers the last \(\beta \) coefficients \(\nu _{d-\beta +1}, \ldots , \nu _d\) of \(\mathbf {v} \) with respect to the basis \(\mathbf {B} \).

Finding the short vector. The above argument can be extended to an argument for the full recovery of \(\mathbf {v} \). Consider the case that in some tour of BKZ-\(\beta \), a projection of \(\mathbf {v} \) was found at index \(d-\beta +1\). Then in the following tour, by arguments analogous to the ones above, a projection of \(\mathbf {v} \) will likely be found at index \(d-2\beta +2\), since now it holds that

$$\begin{aligned} \pi _{d-2\beta +2}(\mathbf {v})\in \varLambda _{d-2\beta +2} := \varLambda \left( \pi _{d-2\beta +2}(\mathbf {b} _{d-2\beta +2}), \ldots , \pi _{d-2\beta +2}(\mathbf {b} _{d-\beta +1}^\mathsf {new})\right) \!. \end{aligned}$$

Repeating this argument for smaller indices shows that after a few tours \(\mathbf {v} \) will be recovered. Furthermore, noting that BKZ calls LLL which in turn calls size reduction, i.e. Babai’s nearest plane [Bab86], at some index \(i>1\) size reduction will recover \(\mathbf {v} \) from \(\pi _{i}(\mathbf {v})\). In particular, it is well-known that size reduction (Algorithm 1) will succeed in recovering \(\mathbf {v} \) whenever

$$\begin{aligned} \mathbf {v} \in \mathbf {b} _{d-\beta +1}^{\mathsf {new}} + \left\{ \sum _{i=1}^{d-\beta } c_{i} \cdot \mathbf {b} _{i}^{*} : c_{i} \in \left[ -\frac{1}{2}, \frac{1}{2} \right] \right\} \!. \end{aligned}$$
(3)

4.2 Observation

The above discussion naturally suggests a strategy to verify the expected behaviour. We have to verify that the projected norms \(\Vert \pi _{i}(\mathbf {v})\Vert = \Vert \pi _{i}(\mathbf {e} \mid 1)\Vert \) do indeed behave as expected and that \(\pi _{d-\beta +1}(\mathbf {v})\) is recovered by BKZ-\(\beta \) for the minimal \(\beta \in \mathbb {N} \) satisfying (2). Finally, we have to measure when and how \(\mathbf {v} =(\mathbf {e} \mid 1)\) is eventually recovered.

Thus, we ran lattice reduction on many lattices constructed from LWE instances using Kannan’s embedding. In particular, we picked the entries of \(\mathbf {s} \) and \(\mathbf {A} \) uniformly at random from \(\mathbb {Z}_q\), the entries of \(\mathbf {e} \) from a discrete Gaussian distribution with standard deviation \(\sigma = 8/\sqrt{2\pi }\), and we constructed our basis as in (1) with embedding factor \(t = 1\). For parameters \((n, q, \sigma )\), we then computed the minimal pair (in lexicographical order) \((\beta ,m)\) satisfying (2).

Implementation. To perform our experiments, we used SageMath 7.5.1 [S+17] in combination with the fplll 5.1.0 [FPL17] and fpylll 0.2.4dev [FPY17] libraries. All experiments were run on machines with Intel(R) Xeon(R) CPU E5-2667 v2 @ 3.30GHz (“strombenzin”) resp. Intel(R) Xeon(R) CPU E5-2690 v4 @ 2.60GHz (“atomkohle”) processors. Each instance was reduced on a single core, with no parallelisation.

Our BKZ implementation inherits from the fplll and fpylll implementations of the BKZ 2.0 algorithm [Che13]. As in BKZ 2.0, we restricted the enumeration radius to be approximately the size of the Gaussian Heuristic for the projected sublattice, applied recursive BKZ-\(\beta '\) preprocessing with a block size \(\beta '<\beta \), made use of extreme pruning [GNR10] and terminated the algorithm when it stopped making significant progress. We give simplified pseudo-code of our implementation in Algorithm 2. We ran BKZ for at most 20 tours using fplll’s default pruning and preprocessing strategies and, using fplll’s default auto-abort strategy, terminated the algorithm whenever the slope of the Gram-Schmidt vectors did not improve for five consecutive tours. Additionally, we aborted if a vector of length \({\approx }\left||\mathbf {v} \right||\) was found in the basis (in line 15 of Algorithm 2).

Algorithm 2. Simplified pseudo-code of our BKZ implementation.
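
In lieu of the pseudo-code, the following structural sketch (in Python, with the subroutines passed in as stand-ins rather than the actual fplll/fpylll API) shows how the pieces of one tour fit together; the line references are to Algorithm 2 as discussed below.

```python
def bkz_tour(B, beta, *, size_reduce_upto, lll_on_block, svp_reduce_block, gs_norm):
    """One tour of Algorithm 2, structurally (stand-in callables):
      size_reduce_upto(B, kappa):      size reduction on rows 1..kappa
                                       (lines 3 and 15 of Algorithm 2);
      lll_on_block(B, kappa, end):     LLL restricted to the block, never
                                       touching rows of index < kappa
                                       (lines 7 and 12);
      svp_reduce_block(B, kappa, end): the innermost loop, i.e. pruned
                                       enumeration plus rerandomisation
                                       until the break condition (line 5);
      gs_norm(B, kappa):               current norm ||b_kappa*||.
    """
    d, changed = len(B), False
    for kappa in range(d - 1):
        end = min(kappa + beta, d)
        size_reduce_upto(B, kappa)       # line 3: make ||b_kappa*|| current
        before = gs_norm(B, kappa)
        svp_reduce_block(B, kappa, end)  # innermost loop; its LLL is line 7
        lll_on_block(B, kappa, end)      # line 12: postprocessing
        size_reduce_upto(B, kappa)       # line 15: v may surface here
        changed = changed or gs_norm(B, kappa) < before
    return changed  # the algorithm terminates once no index improves
```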

Implementations of block-wise lattice reduction algorithms such as BKZ make heavy use of LLL [LLL82] and size reduction. This is to remove linear dependencies introduced during the algorithm, to avoid numerical stability issues and to improve the performance of the algorithm by moving short vectors to the front earlier. The main modification in our implementation is that calls to LLL during preprocessing and postprocessing are restricted to the current block, not touching any other vector, to aid analysis. That is, in Algorithm 2, LLL is called in lines 7 and 12 and we modified these LLL calls not to touch any row with index smaller than \(\kappa \), not even to perform size reduction.

As a consequence, we only make use of vectors with index smaller than \(\kappa \) in lines 3 and 15. Following the implementations in [FPL17, FPY17], we call size reduction from index 1 to \(\kappa \) before (line 3) and after (line 15) the innermost loop with its calls to the SVP oracle. These calls do not appear in the original description of BKZ. However, since the innermost loop re-randomises the basis when using extreme pruning, the success condition of the original BKZ algorithm needs to be altered. That is, the algorithm cannot break the outer loop once it makes no more changes, as originally specified. Instead, the algorithm terminates if it does not find a shorter vector at any index \(\kappa \). Now, the calls to size reduction ensure that the comparison at the beginning and end of each step \(\kappa \) is meaningful even when the Gram-Schmidt vectors are only updated lazily in the underlying implementation. That is, the calls to size reduction trigger an internal update of the underlying Gram-Schmidt vectors and are hence implementation artefacts. The reader may think of these size-reduction calls as explicating calls otherwise hidden behind calls to LLL. We stress that our analysis applies to BKZ as commonly implemented; our changes merely enable us to more easily predict and experimentally verify the behaviour.

We note that the break condition for the innermost loop at line 5 depends on the pruning parameters chosen, which control the success probability of enumeration. Since it does not play a material role in our analysis, we simply state that some condition will lead to a termination of the innermost loop.

Finally, we recorded the following information. At the end of each step \(\kappa \) during lattice reduction, we recorded the minimal index i such that \(\pi _i(\mathbf {v})\) is in \(\text {span}(\mathbf {b} _{1},\ldots ,\mathbf {b} _{i})\) and whether \(\pm \mathbf {v} \) itself is in the basis. In particular, to find the index \(i\) in the basis \(\mathbf {B} \) of \(\pi _i(\mathbf {v})\) given \(\mathbf {v} \), we compute the coefficients of \(\mathbf {v} \) in basis \(\mathbf {B} \) (at the current step) and pick the first index i such that all coefficients with larger indices are zero. Then, we have \(\pi _i(\mathbf {b} _i) = c \cdot \pi _i(\mathbf {v})\) for some \(c\in \mathbb {R} \). From the algorithm, we expect to have found \(\pm \pi _i(\mathbf {b} _i) = \pi _i(\mathbf {v})\) and call \(i\) the index of the projection of \(\mathbf {v} \).
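
A sketch of this record keeping follows (row convention as in Sect. 2.2; floating point is used for brevity, exact arithmetic is advisable for the entry sizes arising in practice).

```python
import numpy as np

def projection_index(B, v):
    """Sketch: index i such that all coefficients of v with respect to
    the basis (rows of B) beyond i vanish, i.e. pi_i(b_i) = c * pi_i(v).
    """
    B = np.array(B, dtype=float)
    nu = np.rint(np.linalg.solve(B.T, np.array(v, dtype=float)))  # v = nu . B
    nonzero = np.flatnonzero(nu)
    return int(nonzero[-1]) + 1 if nonzero.size else 0  # 1-based index
```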

Results. In Fig. 2, we plot the average norms of \(\pi _{i}(\mathbf {v})\) against the expectation \(\sqrt{d-i+1}\, \sigma \approx \sqrt{\frac{d-i+1}{d}}\sqrt{m \cdot \sigma ^{2} + 1}\), indicating that \(\sqrt{d-i+1}\, \sigma \) is a close approximation of the expected lengths except perhaps for the last few indices.

Fig. 2. Expected and average observed norms \(\left||\pi _i(\mathbf {v})\right||\) for 16 bases (LLL-reduced) and vectors \(\mathbf {v} \) of dimension \(d=m+1\) and volume \(q^{m-n}\) with LWE parameters \(n=65, m=182, q=521\) and standard deviation \(\sigma = 8/\sqrt{2\pi }\).

Fig. 3. Expected and observed norms for lattices of dimension \(d=m+1=183\) and volume \(q^{m-n}\) after BKZ-\(\beta \) reduction for LWE parameters \(n=65, m=182, q=521\), standard deviation \(\sigma = 8/\sqrt{2\pi }\) and \(\beta =56\) (minimal \((\beta ,m)\) such that (2) holds). The average of the Gram-Schmidt lengths is taken over 16 BKZ-\(\beta \) reduced bases of random \(q\)-ary lattices, i.e. without an unusually short vector.

Recall that, as illustrated in Fig. 3, we expect to find the projection \(\pi _{d-\beta +1}(\mathbf {v})\) when \((\beta ,d)\) satisfy (2), eventually leading to a recovery of \(\mathbf {v} \), say, by an extension of the argument for the recovery of \(\pi _{d-\beta +1}(\mathbf {v})\). Our experiments, summarised in Table 1, show a related, albeit not identical, behaviour. Defining a cut-off index \(c = d-0.9\beta +1\) and considering \(\pi _\kappa (\mathbf {v})\) for \(\kappa < c\), we observe that the BKZ algorithm typically first recovers \(\pi _{\kappa }(\mathbf {v})\), which is immediately followed by the recovery of \(\mathbf {v} \) in the same step. In more detail, in Fig. 4 we show the measured probability distribution of the index \(\kappa \) such that \(\mathbf {v} \) is recovered from \(\pi _\kappa (\mathbf {v})\) in the same step. Note that the mean of this distribution is smaller than \(d - \beta + 1\). We explain this bias in Sect. 4.3.

Fig. 4. Probability mass function of the index \(\kappa \) from which size reduction recovers \(\mathbf {v} \), calculated over 10,000 lattice instances with LWE parameters \(n=65, m=182, q=521\) and standard deviation \(\sigma = 8/\sqrt{2\pi }\), reduced using \(\beta = 56\). The mean of the distribution is \({\approx }124.76\) while \(d-\beta +1 = 128\).

The recovery of \(\mathbf {v} \) from \(\pi _{\kappa }(\mathbf {v})\) can be effected by one of three subroutines: a call to LLL, a call to size reduction, or a call to enumeration that recovers \(\mathbf {v} \) directly. Since LLL itself contains many calls to size reduction, and it is rather unlikely that enumeration gets lucky, size reduction is a good place to start the investigation. Indeed, restricting the LLL calls in Algorithm 2 as outlined above identifies that size reduction suffices. That is, to measure the success rate of size reduction recovering \(\mathbf {v} \) from \(\pi _{\kappa }(\mathbf {v})\), we observe size reduction acting on \(\pi _\kappa (\mathbf {v})\). Here, we consider size reduction to fail in recovering \(\mathbf {v} \) if it does not recover \(\mathbf {v} \) given \(\pi _\kappa (\mathbf {v})\) for \(\kappa < c\) with \(c = d-0.9\beta +1\), regardless of whether \(\mathbf {v} \) is finally recovered at a later point, either by size reduction on a new projection or by some other call in the algorithm such as an SVP oracle call at a smaller index. As shown in Table 1, size reduction’s success rate is close to 1. Note that the cut-off index \(c\) serves to limit underestimating the success rate: intuitively, we do not expect size reduction to succeed when starting from a projection with larger index, such as \(\pi _{d-\gamma +1}(\mathbf {v})\) with \(\gamma < 10\). We discuss this in Sect. 4.3.

Table 1. Overall success rate (“\(\mathbf {v}\) ”) and success rate of size reduction (“same step”) for solving LWE instances characterised by \(n,\sigma ,q\) with \(m\) samples, standard deviation \(\sigma = 8/\sqrt{2\pi }\), and minimal \((\beta _{2016},m_{2016})\) such that \(\sqrt{\beta _{2016}}\, \sigma \le \delta _{0}^{2\beta _{2016}-(m_{2016}+1)} q^{(m_{2016}-n)/(m_{2016}+1)}\) with \(\delta _{0}\) as a function of \(\beta _{2016}\). The column “\(\beta \)” gives the actual block size used in experiments. The “same step” rate is calculated over all successful instances where \(\mathbf {v} \) is found before the cut-off point \(c\) and over the instances where exactly \(\pi _{d-\beta +1}(\mathbf {v})\) is found (if no such instance is found, we do not report a value). In the second case, the sample size is smaller, since not all instances recover \(\mathbf {v} \) from exactly \(\kappa = d-\beta +1\). The column “time” lists the average solving CPU time for one instance, in seconds. Note that our changes to the algorithm and our extensive record keeping lead to an increased running time of the BKZ algorithm compared to [FPL17, FPY17]. Furthermore, the occasional longer running time for smaller block sizes is explained by the absence of early termination when \(\mathbf {v} \) is found.

Overall, Table 1 confirms the prediction from [ADPS16]: picking \(\beta = \beta _{2016}\) to be the block size predicted by the 2016 estimate leads to a successful recovery of \(\mathbf {v} \) with high probability.

4.3 Explaining Observation

As noted above, our experiments indicate that the algorithm behaves better than expected by (2). Firstly, the BKZ algorithm does not necessarily recover a projection of \(\mathbf {v} \) at index \(d-\beta +1\). Instead, the index \(\kappa \) at which we recover a projection \(\pi _{\kappa }(\mathbf {v})\) follows a distribution with a centre below \(d-\beta +1\), cf. Fig. 4. Secondly, size reduction usually immediately recovers \(\mathbf {v} \) from \(\pi _{\kappa }(\mathbf {v})\). This is somewhat unexpected, since we do not have the guarantee that \(|c_i| \le 1/2\) as required in the success condition of size reduction given in (3).

Finding the projection. To explain the bias towards a recovery of \(\pi _{\kappa }(\mathbf {v})\) for some \(\kappa < d-\beta + 1\), note that if (2) holds then, for the parameter sets in our experiments, the lines for \(\Vert \pi _{i}(\mathbf {v})\Vert \) and \(\Vert \mathbf {b} _{i}^{*}\Vert \) intersect twice (cf. Fig. 3). Let \(d-\gamma +1\) be the index of the second intersection. Thus, there is a good chance that \(\pi _{d-\gamma +1}(\mathbf {v})\) is a shortest vector in the lattice spanned by the last projected block of some small rank \(\gamma \) and will be placed at index \(d-\gamma +1\). As a consequence, all projections \(\pi _{i}(\mathbf {v})\) with \(i>d-\gamma +1\) will be zero and \(\pi _{d-\beta -\gamma +1}(\mathbf {v})\) will be contained in the \(\beta \)-dimensional lattice

$$\begin{aligned} \varLambda _{d-\beta -\gamma +1} := \varLambda \left( \pi _{d-\beta -\gamma +1}(\mathbf {b} _{d-\beta -\gamma +1}), \ldots , \pi _{d-\beta -\gamma +1}(\mathbf {b} _{d-\gamma +1})\right) \!, \end{aligned}$$

enabling it to be recovered by BKZ-\(\beta \) at an index \(d-\beta -\gamma +1<d-\beta +1\). Thus, BKZ in our experiments behaves better than predicted by (2). We note that another effect of this second intersection is that, for very few instances, it directly leads to a recovery of \(\mathbf {v} \) from \(\pi _{d-\beta -\gamma +1}(\mathbf {v})\).
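
This effect can be checked against the predictions directly: the sketch below lists the indices where the predicted projected norm \(\sqrt{(d-i+1)/d}\,\Vert \mathbf {v} \Vert \) falls below the GSA line; a trailing run of such indices starting at \(d-\gamma +1\) signals the second intersection. The caveats of the next paragraph about both predictions failing at the tail of the basis apply.

```python
from math import pi, e, sqrt, log

def delta_0(beta):  # root-Hermite factor, cf. Sect. 2.3
    return (((pi * beta) ** (1 / beta) * beta) / (2 * pi * e)) ** (1 / (2 * (beta - 1)))

def indices_below_gsa(d, log_vol, beta, norm_v):
    """Sketch: 1-based indices i where the predicted projected norm
    sqrt((d - i + 1) / d) * ||v|| lies below the GSA prediction
    delta_0^{-2(i-1)+d} * Vol^{1/d}.
    """
    ld = log(delta_0(beta))
    return [i for i in range(1, d + 1)
            if log(sqrt((d - i + 1) / d) * norm_v)
            < (-2 * (i - 1) + d) * ld + log_vol / d]
```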

Giving a closed formula incorporating this effect akin to (2) would entail predicting the index \(\gamma \) and then replacing \(\beta \) with \(\beta +\gamma \) in (2). However, as illustrated in Fig. 3, neither the GSA [Che13] nor the prediction \(\sqrt{d-i+1}\,\sigma \) for \(\Vert \pi _{i}(\mathbf {v})\Vert \) holds for the last 50 or so indices of the basis.

We stress that while the second intersection often occurs for parameter sets within reach of practical experiments, it does not occur for all parameter sets. That is, for many large parameter sets \((n, \alpha , q)\), e.g. those in [ADPS16], a choice of \(\beta \) satisfying (2) does not lead to a predicted second intersection at some larger index. Thus, this effect may highlight the pitfalls of extrapolating experimental lattice-reduction data from small instances to large instances.

Finding the short vector. In what follows, we assume that the projected norm \(\Vert \pi _{d-k}(\mathbf {v})\Vert \) is indeed equal to its expected norm (cf. Fig. 2). We further assume that \(\pi _{i}(\mathbf {v})\) is distributed in a random direction with respect to the rest of the basis. This assumption holds for LWE, where the vector \(\mathbf {e} \) is sampled from a (near) spherical distribution. We also note that we can rerandomise the basis and thus the relative directions. Under these assumptions, we show that size reduction recovers the short vector \(\mathbf {v} \) with high probability. More precisely, we show:

Claim 1

Let \(\mathbf {v} \in \varLambda \subset \mathbb {R} ^{d}\) be a unique shortest vector and \(\beta \in \mathbb {N} \). Assume that (2) holds, the current basis is \(\mathbf {B} = \{\mathbf {b} _1,\ldots , \mathbf {b} _d\}\) such that \(\mathbf {b} _\kappa ^* = \pi _\kappa (\mathbf {v})\) for \(\kappa = d-\beta +1\) and

$$\begin{aligned} \mathbf {v} = \mathbf {b} _\kappa + \sum _{i = 1}^{\kappa -1} \nu _i \mathbf {b} _i \end{aligned}$$

for some \(\nu _i\in \mathbb {Z} \), and the GSA holds for \(\mathbf {B} \) until index \(\kappa \). If the size reduction step of BKZ-\(\beta \) is called on \(\mathbf {b} _{\kappa }\), it recovers \(\mathbf {v} \) with high probability over the randomness of the basis.

Note that if BKZ has just found a projection of \(\mathbf {v} \) at index \(\kappa \), the current basis is as required by Claim 1. Now, let \(\nu _i\in \mathbb {Z} \) denote the coefficients of \(\mathbf {v} \) with respect to the basis \(\mathbf {B} \), i.e.

$$\begin{aligned} \mathbf {v} = \mathbf {b} _{d-\beta +1} + \sum _{i = 1}^{d-\beta } \nu _i \mathbf {b} _i. \end{aligned}$$

Let \(\mathbf {b} _{d-\beta +1}^{(d-\beta +1)} = \mathbf {b} _{d-\beta +1}\), where the superscript denotes a step during size reduction. For \(i = d-\beta , d-\beta -1, \ldots , 1\) size-reduction successively finds \(\mu _i\in \mathbb {Z} \) such that

$$\begin{aligned} \mathbf {w} _{i} = \mu _{i} \pi _i (\mathbf {b} _i) + \pi _{i} (\mathbf {b} _{d-\beta +1}^{(i+1)}) = \mu _{i} \mathbf {b} _i^* + \pi _{i} (\mathbf {b} _{d-\beta +1}^{(i+1)}) \end{aligned}$$

is the shortest element in the coset

$$\begin{aligned} L_i := \{\mu \mathbf {b} _i^* + \pi _{i} (\mathbf {b} _{d-\beta +1}^{(i+1)}) | \mu \in \mathbb {Z} \} \end{aligned}$$

and sets

$$\begin{aligned} \mathbf {b} _{d-\beta +1}^{(i)} := \mu _{i} \mathbf {b} _i + \mathbf {b} _{d-\beta +1}^{(i+1)}. \end{aligned}$$

Note that if \( \mathbf {b} _{d-\beta +1}^{(i+1)} =\mathbf {b} _{d-\beta +1} + \sum _{j = i+1}^{d-\beta } {\nu _j \mathbf {b} _{j}} \), as in the first step \(i = d-\beta \), then we have that

$$\begin{aligned} \pi _i(\mathbf {v}) = \nu _{i} \mathbf {b} _i^* + \pi _{i} (\mathbf {b} _{d-\beta +1}^{(i+1)}) \in L_i \end{aligned}$$

is contained in \(L_i\) and hence

$$\begin{aligned} L_i = \pi _i(\mathbf {v}) + \mathbb {Z} \mathbf {b} _i^*. \end{aligned}$$

If the projection \( \pi _i(\mathbf {v})\) is in fact the shortest element in \(L_i\), for the newly defined vector \(\mathbf {b} _{d-\beta +1}^{(i)}\) it also holds that

$$\begin{aligned} \mathbf {b} _{d-\beta +1}^{(i)} = \nu _{i} \mathbf {b} _i + \mathbf {b} _{d-\beta +1}^{(i+1)} = \mathbf {b} _{d-\beta +1} + \sum _{j = i}^{d-\beta }{\nu _j \mathbf {b} _{j}}. \end{aligned}$$

Hence, if \( \pi _i(\mathbf {v})\) is the shortest element in \(L_i\) for all i, size reduction finds the shortest vector

$$\begin{aligned} \mathbf {v} = \mathbf {b} _{d-\beta +1}^{(1)} \end{aligned}$$

and inserts it into the basis at position \(d-\beta +1\), replacing \(\mathbf {b} _{d-\beta +1}\).

It remains to argue that, with high probability \(p\), the projection \(\pi _i(\mathbf {v})\) is the shortest element of \(L_i\) for every \(i\). The success probability \(p\) is given by

$$\begin{aligned} p = \prod _{i=1}^{d-\beta } p_i, \end{aligned}$$

where the probabilities \(p_i\) are defined as

$$\begin{aligned} p_i = \Pr \left[ \pi _i(\mathbf {v}) \text { is the shortest element in } \pi _i(\mathbf {v}) + \mathbb {Z} \mathbf {b} _i^* \right] \!. \end{aligned}$$

Fig. 5. Illustration of a case such that \(\pi _i(\mathbf {v})\) is the shortest element of \(L_i\).

For each i the probability \(p_i\) is equal to the probability that

$$\begin{aligned} \left||\pi _i(\mathbf {v})\right|| < \min \{\left||\pi _i(\mathbf {v}) +\mathbf {b} _i^*\right||, \left||\pi _i(\mathbf {v}) - \mathbf {b} _i^*\right||\} \end{aligned}$$

as illustrated in Fig. 5. To approximate the probabilities \(p_i\), we model them as follows. By assumption, we have

$$\begin{aligned} r_i:= \left||\pi _i(\mathbf {v})\right|| = (\sqrt{d-i+1}/ \sqrt{d}) \left||\mathbf {v} \right||\text { and } R_i:=\left||\mathbf {b} _i^*\right||=\delta _{0} ^{-2(i-1)+d}{{{\mathrm{Vol}}}(\varLambda )}^{\frac{1}{d}}\!, \end{aligned}$$

and that \(\pi _i(\mathbf {v})\) is uniformly distributed on the sphere of radius \(r_i\). We can therefore model \(p_i\) as described in the following and illustrated in Fig. 6.

Fig. 6. Illustration of the success probability \(p_i\) in \( \mathbb {R} ^2\). If \(\mathbf {w} \) is on the thick part of the circle, step \(i\) of size reduction is successful.

Pick a point \(\mathbf {w} \) with norm \(r_i\) uniformly at random. Then the probability \(p_i\) is approximately the probability that \(\mathbf {w} \) is closer to \(\mathbf {0} \) than it is to \(\mathbf {b} _i^*\) and to \(- \mathbf {b} _i^*\), i.e.

$$\begin{aligned} r_i < \min \{\left||\mathbf {w} -\mathbf {b} _i^*\right||, \left||\mathbf {w} + \mathbf {b} _i^*\right||\}. \end{aligned}$$

Calculating this probability leads to the following approximation of \(p_i\)

$$\begin{aligned} p_i \approx {\left\{ \begin{array}{ll} 1-\frac{2A_{d-i+1}(r_i, h_i)}{A_{d-i+1}(r_i)} &{}\text { if } R_i < 2r_i \\ 1 &{}\text { if } R_i \ge 2r_i\end{array}\right. }\!, \end{aligned}$$

where \(A_{d-i+1}(r_i)\) is the surface area of the sphere in \( \mathbb {R} ^{d-i+1}\) with radius \(r_i\) and \(A_{d-i+1}(r_i, h_i)\) is the surface area of the hyperspherical cap of this sphere of height \(h_i = r_i - R_i / 2\). Using the formulas provided in [Li11], an easy calculation leads to

$$\begin{aligned} p_i \approx {\left\{ \begin{array}{ll} 1-\frac{\int _0^{2\frac{h_i}{r_i}-\left( \frac{h_i}{r_i}\right) ^2} t^{((d-i)/2)-1}(1-t)^{-1/2} dt}{B(\frac{d-i}{2}, \frac{1}{2})} &{}\text { if } R_i < 2r_i \\ 1 &{}\text { if } R_i \ge 2r_i \end{array}\right. }\!, \end{aligned}$$

where \(B(\cdot , \cdot )\) denotes the Euler beta function. Note that \(R_i \ge 2r_i\) corresponds to (3).

Estimated success probabilities \(p\) for different block sizes \(\beta \) are plotted in Fig. 7. Note that if we assume equality holds in (2), the success probability \(p\) only depends on the block size \(\beta \) and not on the specific lattice dimension, the volume of the lattice, or the length of the unique short vector: the ratios between the predicted norms \(\left||\pi _{d-\beta +1-k}(\mathbf {v})\right||\) and \(\left||\mathbf {b} _{d-\beta +1-k}^*\right||\) only depend on \(\beta \) for all \(k=1,2,\ldots \), since

$$\begin{aligned} \frac{\left||\pi _{d-\beta +1-k}(\mathbf {v})\right||}{\left||\mathbf {b} _{d-\beta +1-k}^*\right||}=\frac{\frac{\sqrt{\beta }\sqrt{\beta +k}}{\sqrt{\beta }\sqrt{d}} \left||\mathbf {v} \right||}{\delta _{0} ^{2(\beta +k)-d}\,{{{\mathrm{Vol}}}(\varLambda )}^{\frac{1}{d}}} = \frac{\frac{\sqrt{\beta +k}}{\sqrt{\beta }} \delta _{0} ^{2\beta -d}\,{{{\mathrm{Vol}}}(\varLambda )}^{\frac{1}{d}}}{\delta _{0} ^{2(\beta +k)-d}\,{{{\mathrm{Vol}}}(\varLambda )}^{\frac{1}{d}}} = \frac{\sqrt{\beta +k}}{\sqrt{\beta }} \delta _{0} ^{-2k} \end{aligned}$$

and the estimated success probability only depends on these ratios.
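
Putting the pieces together, the estimate plotted in Fig. 7 can be reproduced along the following lines. This is a sketch assuming equality in (2), using scipy’s regularised incomplete beta function for the cap-area ratio above; the helper name is ours.

```python
from math import pi, e, sqrt
from scipy.special import betainc

def delta_0(beta):  # root-Hermite factor, cf. Sect. 2.3
    return (((pi * beta) ** (1 / beta) * beta) / (2 * pi * e)) ** (1 / (2 * (beta - 1)))

def success_probability(beta):
    """Sketch: p = prod_i p_i assuming equality in (2), so that at
    index i = d - beta + 1 - k the ratio r_i / R_i equals
    sqrt((beta + k) / beta) * delta_0^{-2k} and the relevant sphere
    lives in dimension d - i + 1 = beta + k.
    """
    d0, p, k = delta_0(beta), 1.0, 1
    while True:
        rho = sqrt((beta + k) / beta) * d0 ** (-2 * k)  # r_i / R_i
        if rho <= 0.5:       # R_i >= 2 r_i: p_i = 1 from here on
            return p
        h = 1 - 1 / (2 * rho)                           # h_i / r_i
        x = 2 * h - h ** 2
        p *= 1 - betainc((beta + k - 1) / 2, 0.5, x)    # 1 - I_x((d-i)/2, 1/2)
        k += 1
```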

Fig. 7. Estimated success probability \(p\) for varying block sizes \(\beta \), assuming \(\beta \) is chosen minimal such that (2) holds.

The prediction given in Fig. 7 is in line with the measured probability of finding \(\mathbf {v} \) in the same step when its projection \(\pi _{d-\beta +1}(\mathbf {v})\) is found as reported in Table 1 for \(\beta = \beta _{2016}\) and \(m = m_{2016}\). Finally, note that by the above analysis we do not expect to recover \(\mathbf {v} \) from a projection \(\pi _{d-\gamma +1}(\mathbf {v})\) for some small \(\gamma \ll \beta \) except with small probability.

5 Applications

Section 4 indicates that (2) is a reliable indicator for when lattice-reduction will succeed in recovering an unusually short vector. Furthermore, as illustrated in Fig. 1, applying (2) lowers the required block sizes compared to the 2008 model which is heavily relied upon in the literature. Thus, in this section we evaluate the impact of applying the revised estimates to various parameter sets from the literature. Indeed, for many schemes we find that their parameters need to be adapted to maintain the currently claimed level of security.

Many of the schemes considered below feature an unusually short secret \(\mathbf {s} \), whose entries are sampled from a small set \(\{-B, \ldots , B\}\) for some small \(B \in \mathbb {Z}_q\). Furthermore, some schemes pick the secret to also be sparse, such that most components of \(\mathbf {s} \) are zero. Thus, before we apply the revised 2016 estimate, we briefly recall the alternative embedding due to Bai and Galbraith [BG14b] which takes these small (and sparse) secrets into account.

5.1 Bai and Galbraith’s Embedding

Consider an LWE instance in matrix form \((\mathbf {A}, \mathbf {c}) \equiv (\mathbf {A},\mathbf {A} \cdot \mathbf {s} + \mathbf {e}) \in \mathbb {Z} _q^{m \times n} \times \mathbb {Z} _q^m\). By inspection, it can be seen that the vector \((\nu \, \mathbf {s} \mid \mathbf {e} \mid 1)\), for some \(\nu \ne 0\), is contained in the lattice \(\varLambda \)

$$\begin{aligned} \varLambda = \left\{ \mathbf {x} \in \left( \nu \mathbb {Z} \right) ^n \times \mathbb {Z} ^{m+1} \ \mid \mathbf {x} \cdot \left( \frac{1}{\nu } \mathbf {A} \mid \mathbf {I} _{m} \mid -\mathbf {c} \right) ^\top \equiv 0 \bmod {q}\right\} \!, \end{aligned}$$
(4)

where \(\nu \) allows to balance the size of the secret and the noise. An \((n+m+1)\times (n+m+1)\) basis \(\mathbf {M} \) for \(\varLambda \) can be constructed as

$$ \mathbf {M} = \left( \begin{array}{ccc} \nu \mathbf {I} _{n} & -\mathbf {A} ^\top & \mathbf {0} \\ \mathbf {0} & q\mathbf {I} _{m} & \mathbf {0} \\ \mathbf {0} & \mathbf {c} & 1 \\ \end{array}\right) \!. $$

Indeed, \(\mathbf {M} \) is full rank, \(\det (\mathbf {M}) = {{\mathrm{Vol}}}(\varLambda )\), and the integer span of \(\mathbf {M} \) is contained in \(\varLambda \), as we can see by noting that

$$ \left( \begin{array}{ccc} \nu \mathbf {I} _{n} & -\mathbf {A} ^\top & \mathbf {0} \\ \mathbf {0} & q\mathbf {I} _{m} & \mathbf {0} \\ \mathbf {0} & \mathbf {c} & 1 \\ \end{array}\right) \left( \frac{1}{\nu } \mathbf {A} \mid \mathbf {I} _{m} \mid -\mathbf {c} \right) ^\top = \left( \mathbf {A} - \mathbf {A} \mid q\mathbf {I} _{m} \mid \mathbf {c} -\mathbf {c} \right) ^\top \equiv \mathbf {0} \bmod {q}. $$

Finally, note that \( (\mathbf {s} \mid *\mid 1) \cdot \mathbf {M} = (\nu \, \mathbf {s} \mid \mathbf {e} \mid 1)\) for suitable values of \(*\). If \(\mathbf {s} \) is small and/or sparse, choosing \(\nu = 1\), the vector \((\mathbf {s} \mid \mathbf {e} \mid 1)\) is unbalanced, i.e. \(\frac{\left||\mathbf {s} \right||}{\sqrt{n}} \ll \frac{\left||\mathbf {e} \right||}{\sqrt{m}} \approx \sigma \), where \(\sigma \) is the standard deviation of the LWE error distribution. We may then want to rebalance it by choosing an appropriate value of \(\nu \) such that \(\left||(\nu \,\mathbf {s} \mid \mathbf {e} \mid 1)\right|| \approx \sigma \sqrt{n+m}\). Rebalancing preserves \((\nu \,\mathbf {s} \mid \mathbf {e} \mid 1)\) as the unique shortest vector in the lattice, while at the same time increasing the volume of the lattice being reduced, reducing the block size required by (2).

If \(\mathbf {s} \xleftarrow {\$} {\{-1, 0 , 1\}}^n\) we expect \(\left||\nu \,\mathbf {s} \right||^2 \approx \frac{2}{3} \nu ^2 n\). Therefore, we can choose \(\nu = \sqrt{\frac{3}{2}}\sigma \) to obtain \(\left||\nu \, \mathbf {s} \right|| \approx \sigma \sqrt{n}\), so that \(\left||(\nu \,\mathbf {s} \mid \mathbf {e} \mid 1)\right|| \approx \sigma \sqrt{n+m}\). Similarly, if \(w < n\) entries of \(\mathbf {s} \) are non-zero, sampled from \(\{-1, 1\}\), we have \(\left||\nu \,\mathbf {s} \right||^2 = w\, \nu ^2\). Choosing \(\nu = \sqrt{\frac{n}{w}}\sigma \), we obtain a vector \(\nu \,\mathbf {s} \) of length \(\sigma \sqrt{n}\).
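
As a small illustration of the two choices of \(\nu \) just derived (the helper name is hypothetical):

```python
from math import sqrt

def rebalance_nu(n, sigma, secret="ternary", w=None):
    """Sketch: choose nu such that ||nu * s|| ~ sigma * sqrt(n)."""
    if secret == "ternary":   # s uniform in {-1, 0, 1}^n: E||s||^2 = 2n/3
        return sqrt(3 / 2) * sigma
    if secret == "sparse":    # w entries from {-1, 1}: ||s||^2 = w
        return sqrt(n / w) * sigma
    raise ValueError(secret)
```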

In the case of sparse secrets, combinatorial techniques can also be applied [HG07, BGPW16, Alb17]. Given a secret \(\mathbf {s} \) with at most \(w < n\) non-zero entries, we guess \(k\) entries of \(\mathbf {s} \) to be 0, thereby decreasing the dimension of the lattice to consider. For each guess, we then apply lattice reduction to recover the remaining components of the vector \((\mathbf {s} \mid \mathbf {e} \mid 1)\). Therefore, when estimating the overall complexity for solving such instances, we find \(\min \limits _{k}\{1/p_{k} \cdot C(n-k)\}\), where \(C(n)\) is the cost of running BKZ on a lattice of dimension \(n\) and \(p_{k}\) is the probability of guessing correctly.
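
A sketch of this optimisation, with the cost \(C(\cdot )\) of the remaining attack passed in as a stand-in (e.g. derived from (2) together with a BKZ cost model such as the one in Sect. 5.2):

```python
from math import comb, log2

def sparse_guess_cost(n, w, cost):
    """Sketch: guessing k entries of s to be zero succeeds with
    probability p_k = C(n-w, k) / C(n, k); the expected cost is then
    (1 / p_k) * cost(n - k), minimised over k.
    """
    def log2_cost(k):
        log2_inv_pk = log2(comb(n, k)) - log2(comb(n - w, k))
        return log2_inv_pk + log2(cost(n - k))
    k = min(range(n - w + 1), key=log2_cost)
    return k, log2_cost(k)  # best k and log2 of the expected cost
```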

5.2 Estimates

In what follows, we assume that the geometry of (4) is sufficiently close to that of (1) so that we transfer the analysis as is. Furthermore, we will denote applying (2) from [ADPS16] for Kannan’s embedding as “Kannan” and applying (2) for Bai and Galbraith’s embedding [BG14b] as “Bai-Gal”. Unless stated otherwise, we will assume that calling BKZ with block size \(\beta \) in dimension \(d\) costs \(8\,d\,2^{0.292\,\beta + 16.4}\) operations [BDGL16, Alb17].

Lizard [CKLS16b, CKLS16a] is a PKE scheme based on the Learning With Rounding problem, using a small, sparse secret. The authors provide a reduction to LWE, and security parameters against classical and quantum adversaries, following their analysis. In particular, they cost BKZ by a single call to sieving on a block of size \(\beta \). They estimate this call to cost \(\beta \,2^{c\,\beta }\) operations where \(c = 0.292\) for classical adversaries, \(c = 0.265\) for quantum ones and \(c = 0.2075\) as a lower bound for sieving (“paranoid”). Applying the revised 2016 cost estimate for the primal attack to the parameters suggested in [CKLS16b] (using their sieving cost model as described above) reduces the expected costs, as shown in Table 2. We note that in the meantime the authors of Lizard have updated their parameters in [CKLS16a].

Table 2. Bit complexity estimates \(\lambda \) for solving Lizard PKE [CKLS16b] as given in [CKLS16b] and using Kannan’s resp. Bai and Galbraith’s embedding under the 2016 estimate. The dimension of the LWE secret is n. In all cases, BKZ-\(\beta \) is estimated to cost \(\beta \,2^{c\,\beta }\) operations.

HElib [GHS12a, GHS12b] is an FHE library implementing the BGV scheme [BGH13]. A recent work [Alb17] provides revised security estimates for HElib by employing a dual attack exploiting the small and sparse secret, using the same cost estimate for BKZ as given at the beginning of this section. In Table 3 we provide costs for a primal attack using Kannan’s and Bai and Galbraith’s embeddings. Primal attacks perform worse than the algorithm described in [Alb17] but, as expected, under the 2016 estimate the gap narrows.

Table 3. Solving costs for LWE instances underlying HElib as given in [Alb17] and using Kannan’s resp. Bai and Galbraith’s embedding under the 2016 estimate. The dimension of the LWE secret is n. In all cases, BKZ-\(\beta \) is estimated to cost \(8d\,2^{0.292\,\beta + 16.4}\) operations.

SEAL [CLP17] is an FHE library by Microsoft, based on the FV scheme [FV12]. Up to date parameters are given in [CLP17], using the same cost model for BKZ as mentioned at the beginning of this section. In Table 4, we provide complexity estimates for Kannan’s and Bai and Galbraith’s embeddings under the 2016 estimate. Note that the gap in solving time between the dual and primal attack reported in [Alb17] is closed for SEAL v2.1 parameters.

Table 4. Solving costs for parameter choices in SEAL v2.1 as given in [CLP17], using [Alb17] as implemented in the current [APS15] estimator commit 84014b6 (“[Alb17]+”), and using Kannan’s resp. Bai and Galbraith’s embedding under the 2016 estimate. In all cases, BKZ-\(\beta \) is estimated to cost \(8d\,2^{0.292\,\beta + 16.4}\) operations.

TESLA [BG14a, ABBD15] is a signature scheme based on LWE. Post-quantum secure parameters in the quantum random oracle model were recently proposed in [ABB+17]. In Table 5, we show that these parameters need to be increased to maintain the currently claimed level of security under the 2016 estimate. Note that [ABB+17] maintains a gap of \({\approx }\log _{2}n\) bits of security between the best known attack on LWE and claimed security to account for a loss of security in the reduction.

Table 5. Bit complexity estimates for solving TESLA parameter sets [ABB+17]. The entry “[ABB+17]+” refers to reproducing the estimates from [ABB+17] using a current copy of the estimator from [APS15], which uses \(t=1\) instead of \(t=\Vert \mathbf {e} \Vert \); as a consequence, the values in the respective rows are slightly lower than in [ABB+17]. We compare with Kannan’s embedding under the 2016 estimate. Classically, BKZ-\(\beta \) is estimated to cost \(8d\,2^{0.292\,\beta + 16.4}\) operations; quantumly, BKZ-\(\beta \) is estimated to cost \(8d\,\sqrt{\beta ^{0.0225\,\beta } \cdot 2^{0.4574\,\beta } / 2^{\beta /4}}\) operations in [ABB+17].

BCIV17 [BCIV17] is a somewhat homomorphic encryption scheme obtained as a simplification of the FV scheme [FV12] and proposed as a candidate for enabling privacy-friendly energy consumption forecasting in smart grid settings. The authors propose parameters for obtaining 80 bits of security, derived using the estimator from [APS15] available at the time of publication. As a consequence of applying (2), we observe a moderate loss of security, as reported in Table 6.

Table 6. Solving costs for proposed Ring-LWE parameters in [BCIV17] using Kannan’s resp. Bai and Galbraith’s embedding under the 2016 estimate. In both cases, BKZ-\(\beta \) is estimated to cost \(8d\,2^{0.292\,\beta + 16.4}\) operations.