Keywords

1 Introduction

Homomorphic encryption (HE) is a cryptographic scheme that allows us to evaluate an arbitrary arithmetic circuit on encrypted data without decryption. There have been a number of studies [5,6,7,8,9, 11, 16, 19, 22, 25, 30, 31] to improve the efficiency of HE cryptosystem after Gentry’s blueprint [26]. This cryptographic primitive has a number of prospective real-world applications based on the secure outsourcing of computation in public clouds. For example, HE can be a solution to performing the computation of various algorithms on financial, medical, or genomic data without any information leakage [15, 36, 38, 39, 41].

Unfortunately, most of existing HE schemes support the exact arithmetic operations over some discrete spaces (e.g. finite field), so that they are not suitable for many real-world applications which require a floating point operation or real number arithmetic. To be specific, bitwise encryption schemes [17, 24] can evaluate a boolean gate with bootstrapping in much shorter time, but it is necessary to evaluate a deep circuit with a number of gates to perform a single arithmetic operation (e.g. addition or multiplication) between high-precision numbers. Moreover, a huge expansion rate of ciphertexts is another issue that stands in the way of the practical use of bitwise encryptions. On the other hand, word encryption schemes [6, 7, 25, 30] can encrypt multiple high-precision numbers in a single ciphertext but the rounding operation is difficult to be evaluated since it is not expressed as a small-degree polynomial. Therefore, they require either a plaintext space with an exponentially large bit size on the depth of a circuit, or an expensive computation such as rounding operation and extraction of the most significant bits.

Recently, Cheon et al. [14] proposed a HE scheme for an Arithmetic of Approximate Numbers (called \({\texttt {HEAAN}}\) in what follows) based on the ring learning with errors (RLWE) problem. The main idea is to consider an encryption error as part of a computational error that occurs during approximate computations. For an encryption \(\mathsf {ct}\) of a message m with a secret key \(\mathsf {sk}\), the decryption algorithm \([\langle \mathsf {ct},\mathsf {sk}\rangle ]_q\) outputs an approximate value \(m+e\) of the original message with a small error e. The main advantage of \({\texttt {HEAAN}}\) comes from the rescaling procedure for managing the magnitude of plaintexts. It truncates a ciphertext into a smaller modulus, which leads to an approximate rounding of the encrypted plaintext. As a result, it achieved the first linear growth of the ciphertext modulus on the depth of the circuit being evaluated, against the exponential growth in previous word encryption schemes. In addition, the RLWE-based \({\texttt {HEAAN}}\) scheme has its own packing strategy to encrypt multiple complex numbers in a single ciphertext and perform a parallel computation. However, \({\texttt {HEAAN}}\) is a leveled HE scheme which can only evaluate a circuit of fixed depth. As homomorphic operations progresses, the ciphertext modulus decreases and finally becomes too small to carry out more computations.

In previous literature, Gentry’s bootstrapping is the only known method to construct a fully homomorphic encryption (FHE) scheme which allows us to evaluate an arbitrary circuit. Technically, the bootstrapping method can be understood as a homomorphic evaluation of the decryption circuit to refresh a ciphertext for more computations. The \({\texttt {HEAAN}}\) scheme does not support the modular arithmetic, however, its decryption circuit \([\langle \mathsf {ct},\mathsf {sk}\rangle ]_q\) requires the modular reduction operation, which makes its bootstrapping much harder. Therefore, the bootstrapping of \({\texttt {HEAAN}}\) can be reduced to a problem that represents the modular reduction function \(F(t)=[t]_q\) as a polynomial over the integers (or, complex numbers). One may use the polynomial interpolation of this function over the domain of \(t=\langle \mathsf {ct},\mathsf {sk}\rangle \), but it is a limiting factor for practical implementation due to a huge computational cost of an evaluation.

Our Contributions. We present a methodology to refresh ciphertexts of \({\texttt {HEAAN}}\) and make it bootstrappable for the evaluation of an arbitrary circuit. We take advantage of its intrinsic characteristic - approximate computation on encrypted data. Our bootstrapping procedure aims to evaluate the decryption formula approximately and obtain an encryption of the original message in a large ciphertext modulus. Hence, we find an approximation of the modular reduction function that can be evaluated efficiently using arithmetic operations of \({\texttt {HEAAN}}\). The approximation error should be small enough to maintain the precision of an input plaintext.

We first note that the modular reduction function \(F(t)=[t]_q\) is the identity nearby zero and periodic with period q. If \(t=\langle \mathsf {ct},\mathsf {sk}\rangle \) is close to a multiple of the ciphertext modulus q (or equivalently, if the encrypted plaintext \(m=[t]_q\) is small compared to q), then a trigonometric function can be a good approximation to the modular reduction. Namely, the decryption formula of \({\texttt {HEAAN}}\) can be represented using the following scaled sine function as

$$\begin{aligned}{}[\langle \mathsf {ct},\mathsf {sk}\rangle ]_q=\frac{q}{2\pi }\cdot \sin \left( \frac{2\pi }{q} \cdot \langle \mathsf {ct},\mathsf {sk}\rangle \right) +O(\epsilon ^3\cdot q), \end{aligned}$$

when \(|[\langle \mathsf {ct},\mathsf {sk}\rangle ]_q|\le \epsilon \cdot q\). Hence we may use this analytic function instead of the modular reduction in the decryption formula.

Now our goal is to homomorphically evaluate the trigonometric function \(\frac{q}{2\pi }\cdot \sin \left( \frac{2\pi }{q}\cdot t\right) \) with an input \(t=\langle \mathsf {ct},\mathsf {sk}\rangle \), which is bounded by Kq for some constant \(K=O(\lambda )\) with \(\lambda \) the security parameter. We can consider the Taylor polynomial as an approximation to the trigonometric function, but its degree should be at least O(Kq) to make an error term small enough on the interval \((-Kq,Kq)\). The evaluation of polynomial can be done in \(O(\sqrt{Kq})\) homomorphic multiplications with Paterson-Stockmeyer method [40], but this complexity of recryption grows exponentially with the depth \(L=O(\log q)\) of the decryption circuit - which is still quite substantial.

We suggest an evaluation strategy of the trigonometric function to reduce its computation cost by exploiting the following double-angle formulas:

$$ {\left\{ \begin{array}{ll} \cos ( 2\theta ) = \cos ^2\theta - \sin ^2\theta ,\\ \sin (2\theta ) = 2\cos \theta \cdot \sin \theta , \end{array}\right. } $$

which means that we can obtain some approximate values of \(\cos (2\theta )\) and \(\sin (2\theta )\) from approximate values of \(\cos \theta \) and \(\sin \theta \). In our bootstrapping process, we first compute the Taylor expansions of \(\cos \left( \frac{2\pi }{q}\cdot \frac{t}{2^r}\right) \) and \(\sin \left( \frac{2\pi }{q}\cdot \frac{t}{2^r}\right) \) of a small degree \(d_0=O(1)\) for some \(r=O(\log (Kq))\). Then we use the doubling-angle formulas r times recursively to get an approximate value of \(\sin \left( \frac{2\pi }{q}\cdot {t}\right) \). In the case of the RLWE-based construction, this evaluation can be even more simplified by encrypting the complex exponentiation \(\exp (i\theta )=\cos \theta +i\cdot \sin \theta \) and adapting the identity \(\exp (i\cdot 2\theta )=(\exp (i\cdot \theta ))^2\).

Fig. 1.
figure 1

Modular reduction and scaled sine functions

Results. Our bootstrapping technique for \({\texttt {HEAAN}}\) is a new cryptographic primitive for FHE mechanisms, which yields the first word encryption scheme for approximate arithmetic. For a ciphertext \(\mathsf {ct}\) with a modulus q, our bootstrapping procedure generates a ciphertext \(\mathsf {ct}'\) with a larger ciphertext modulus \(Q\gg q\), satisfying the condition \([\langle \mathsf {ct}',\mathsf {sk}\rangle ]_Q \approx [\langle \mathsf {ct},\mathsf {sk}\rangle ]_q\) while an error is kept small enough not to destroy the significant digits of a plaintext. The output ciphertext will have a sufficiently large modulus compared to a plaintext, thereby enabling further computation on the ciphertext. In addition, our approximation to a trigonometric function and efficient evaluation strategy reduce the complexity of the evaluation down to O(L) homomorphic multiplications for the depth \(L=O(\log q)\) of the decryption circuit.

We also give an open-source implementation [12] to demonstrate the performance of our bootstrapping method. It contains some optimization techniques including the linear transformation method of [33] for the recryption over the packed ciphertexts. When we want to preserve 12 bits of precision, our bootstrapping on a single-slot ciphertext takes about 26.6 s. We also optimize the linear transforms for sparsely packed ciphertexts and it takes about 139.8 s to recrypt a ciphertext that encrypts 128 complex numbers in plaintext slots, yielding an amortized rate of 1.1 seconds per slot.

Implications of Our Bootstrapping Method. The main feature of approximate arithmetic is that every number contains an error of which could increase during computation. The precision of a number is reduced by approximately one bit after multiplication and finally we may not extract any meaningful information from the computation result if the depth of a circuit is larger than the bit precision of the input data. On the other hand, our bootstrapping procedure is to refresh ciphertexts and then perform further computation on encrypted data. This concept of an unlimited computation may seem a contradiction to the property of finite precision in the approximate arithmetic.

However, it turns out to be better for real-world applications that have a property of negative feedback or stability. For example, a cyber-physical system (CPS) is a compromised mechanism of physical and computational components. A computational element commutes with the sensors and every signal contains a small error. One can guarantee the correctness of CPS only when it is stable because an error is reduced by negative feedback to the input. Another example is the gradient descent method, which is the most widely used algorithm to perform optimization. It has a number of applications in machine learning such as logistic regression and neural networks. It computes the gradient of a point and moves it closer to an optimal point, which reduces the effects of perturbations in the output.

As in the examples above, we do not have to worry about the precision of numbers when the overall system is stable. In fact, there are some proof-of-concept implementations about the secure control of CPS [35] and secure logistic regression using biomedical data [37]. We expect that our bootstrapping process can be applied to these real-world applications.

Related Works. There have been several attempts to carry out an approximate arithmetic using HE. Downlin et al. [23] (see also [4, 21]) described a method to transform a real number into a polynomial with small and sparse coefficients to reduce the required size of a plaintext modulus. Costache et al. [20] suggested a similar encoding method with [14] to evaluate the discrete Fourier transformation efficiently, but a ciphertext could encrypt only one value. Chen et al. [10] uses a technique of [34] to encrypt a single high-precision number. However, they still have some problems: (1) the coefficients and the degree of encoded polynomial grow exponentially with the depth of a circuit and (2) there is no known result to achieve an FHE scheme because a polynomial should be re-encoded to be represented with a smaller degree and coefficients for more computations and the bootstrapping method of [33] is not enough for this functionality.

The original Gentry’s bootstrapping technique was implemented by Gentry and Halevi [27], which took a half hour to recrypt a single bit ciphertext. Gentry et al. [28] represented the decryption circuit of RLWE-based HE with a lower depth circuit using a special modulus space. The Halevi-Shoup FHE implementation [33] reported a recryption time of approximately six minutes per slot. Meanwhile, Ducas and Micciancio [24] proposed the FHEW scheme that bootstraps a single-bit encryption in less than a second based on the framework of [2]. Chillotti et al. [17] obtained a speed up to less than 0.1 s. The following works [3, 18] improved the performance by using the evaluation of a look-up table before bootstrapping. However, the size of an input plaintext of bootstrapping is very limited since it is related to the ring dimension of an intermediate Ring GSW scheme. In addition, a huge expansion rate of ciphertexts is still an open problem in bitwise encryption schemes.

The previous FHE schemes evaluate the exact decryption circuit using the structure of a finite field or a polynomial ring in bootstrapping algorithm. The evaluation of an arbitrary polynomial of degree d requires \(O(\sqrt{d})\) homomorphic multiplications, but Halevi and Shoup [33] used a polynomial with the lifting property to reduce the computational cost of bootstrapping. They used a recursive algorithm to extract some digits in an encrypted state, so the number of homomorphic multiplications for bootstrapping was reduced down to \(O(\log ^2 d)\). Contrary to the work of Halevi and Shoup, we find an approximate decryption circuit using a trigonometric function and suggest an even simpler recursive algorithm. As a result, our algorithm only requires \(O(\log d)\) number of homomorphic multiplications, which results in an enhanced performance.

Road-map. Section 2 briefly introduces notations and some preliminaries about algebra. We also review the \({\texttt {HEAAN}}\) scheme of Cheon et al. [14]. Section 3 explains our simplified decryption formula by using a trigonometric function. In Sect. 4, we recall the ciphertext packing method of \({\texttt {HEAAN}}\) and describe a linear transformation on packed ciphertexts. In Sect. 5, we present our bootstrapping technique with a precise noise estimation. In Sect. 6, we implement the recryption procedure based on the proposed method and discuss the performance results.

2 Preliminaries

The binary logarithm will be simply denoted by \(\log (\cdot )\). We denote vectors in bold, e.g. \(\mathbf a\), and every vector in this paper is a column vector. For a \(n_1\times m\) matrix \(A_1\) and a \(n_2\times m\) matrix \(A_2\), \((A_1;A_2)\) denotes the \((n_1+n_2)\times m\) matrix obtained by concatenating matrices \(A_1\) and \(A_2\) in a vertical direction.

We denote by \(\langle \cdot ,\cdot \rangle \) the usual dot product of two vectors. For a real number r, \(\lfloor {r}\rceil \) denotes the nearest integer to r, rounding upwards in case of a tie. For an integer q, we identify \({\mathbb Z}\cap (-q/2,q/2]\) as a representative of \({\mathbb Z}_q\) and use \([z]_q\) to denote the reduction of the integer z modulo q into that interval. We use \(x\leftarrow D\) to denote the sampling x according to distribution D. The uniform distribution over a finite set S is denoted by U(S). We let \(\lambda \) denote the security parameter throughout the paper: all known valid attacks against the cryptographic scheme under scope should take \(\varOmega (2^\lambda )\) bit operations.

2.1 Cyclotomic Ring

For a positive integer M, let \(\varPhi _M(X)\) be the M-th cyclotomic polynomial of degree \(N=\phi (M)\). Let \({\mathcal R}={{\mathbb Z}[X]/(\varPhi _M(X))}\) be the ring of integers of a number field \({{\mathbb Q}[X]/(\varPhi _M(X))}\). We write \({\mathcal R}_q={\mathcal R}/q{\mathcal R}\) for the residue ring of \({\mathcal R}\) modulo an integer q. An arbitrary element of the set \({\mathcal P}= {\mathbb R}[X]/(\varPhi _M(X))\) will be represented as a polynomial \(a(X)=\sum _{j=0}^{N-1}a_j X^j\) of degree strictly less than N and identified with its coefficients vector \({{\varvec{a}}}=(a_0,\dots ,a_{N-1})\in \mathbb {\mathbb R}^N\). We define \(\Vert {a}\Vert _\infty \) and \(\Vert {a}\Vert _1\) by the relevant norms on the coefficients vector \({{\varvec{a}}}\).

Write \({\mathbb Z}_M^*= \{ x\in {\mathbb Z}_M: \text {gcd}(x,M)=1\}\) for the multiplicative group of units in \({\mathbb Z}_M\). Recall that the canonical embedding of \(a(X) \in {{\mathbb Q}[X]/(\varPhi _M(X))}\) into \({\mathbb {C}}^N\) is the vector of evaluations of a(X) at the M-th primitive roots of unity. We use its natural extension \(\sigma \) to \({\mathcal P}\), defined by \(\sigma (a)=(a(\zeta ^j))_{j\in {\mathbb Z}_M^{*}}\) for \(\zeta =\exp {(2\pi i/M)}\). Its \(\ell _{\infty }\)-norm is called the canonical embedding norm, denoted by \(\Vert {a}\Vert _\infty ^{\textsf {can}}=\Vert {\sigma (a)}\Vert _\infty \).

2.2 Homomorphic Encryption for Arithmetic of Approximate Numbers

HE is one of the prospective cryptographic primitives for secure outsourcing computation without information leakage. However, an inefficiency of real number computation is one of the main obstacles to apply HE schemes in real-world applications. Recently Cheon et al. [14] proposed a method to construct the HE scheme for approximate arithmetic, called \({\texttt {HEAAN}}\). Their scheme supports an efficient rounding operation of encrypted plaintext as well as basic arithmetic operations. This subsection gives a concrete description of the RLWE-based \({\texttt {HEAAN}}\) scheme.

For a real \(\sigma >0\), \(\mathcal {DG}(\sigma ^2)\) denotes a distribution over \({\mathbb Z}^N\) which samples its components independently from the discrete Gaussian distribution of variance \(\sigma ^2\). For an positive integer h, \(\mathcal {HWT}(h)\) denotes a uniform distribution over the set of signed binary vectors in \(\{\pm 1\}^N\) whose Hamming weight is exactly h. For a real \(0\le \rho \le 1\), the distribution \(\mathcal {ZO}(\rho )\) draws each entry in the vector from \(\{0,\pm 1\}\), with probability \(\rho /2\) for each of \(-1\) and \(+1\), and probability being zero \(1-\rho \).

  • \(\underline{\textsf {KeyGen}(1^\lambda )}.\)

    • For a base p and an integer L, let \(q_\ell =p^\ell \) for \(\ell =1,\dots , L\). Given the security parameter \(\lambda \), choose a power-of-two M, an integer h, an integer P, and a real number \(\sigma >0\) for an RLWE problem that achieves \(\lambda \)-bit of security level.

    • Sample \(s\leftarrow \mathcal {HWT}(h)\), \(a\leftarrow U({\mathcal R}_{q_L})\) and \(e\leftarrow \mathcal {DG}(\sigma ^2)\). Set the secret key as \(\mathsf {sk}\leftarrow (1,s)\) and the public key as \(\mathsf {pk}\leftarrow (b,a)\in {\mathcal R}_{q_L}^2\) where \(b\leftarrow -as+e \pmod {q_L}\).

  • \(\underline{\textsf {KSGen}_\mathsf {sk}(s')}\). For \(s'\in {\mathcal R}\), sample \(a'\leftarrow U({\mathcal R}_{P\cdot q_L})\) and \(e'\leftarrow \mathcal {DG}(\sigma ^2)\). Output the switching key as \(\mathsf {swk}\leftarrow (b',a')\in {\mathcal R}_{P\cdot q_L}^2\) where \(b'\leftarrow -a's+e'+Ps' \pmod {P\cdot q_L}\).

    • Set the evaluation key as \(\mathsf {evk}\leftarrow \textsf {KSGen}_\mathsf {sk}(s^2)\).

  • \(\underline{\textsf {Enc}_\mathsf {pk}(m)}\). For \(m\in {\mathcal R}\), sample \(v\leftarrow \mathcal {ZO}(0.5)\) and \(e_0,e_1\leftarrow \mathcal {DG}(\sigma ^2)\). Output \(v\cdot \mathsf {pk}+(m+e_0,e_1) \pmod {q_L}\).

  • \(\underline{\textsf {Dec}_\mathsf {sk}(\mathsf {ct})}\). For \(\mathsf {ct}= (c_0,c_1)\in {\mathcal R}_{q_\ell }^2\), output \(m=c_0 + c_1\cdot s \pmod {q_\ell }.\)

  • \(\underline{\textsf {Add}(\mathsf {ct}_1,\mathsf {ct}_2)}\). For \(\mathsf {ct}_1,\mathsf {ct}_2\in {\mathcal R}_{q_\ell }^2\), output \(\mathsf {ct}_\mathsf {add}\leftarrow \mathsf {ct}_1+\mathsf {ct}_2 \pmod {q_\ell }\).

  • \(\underline{\textsf {Mult}_\mathsf {evk}(\mathsf {ct}_1,\mathsf {ct}_2)}\). For \(\mathsf {ct}_1=(b_1,a_1),\mathsf {ct}_2=(b_2,a_2)\in {\mathcal R}_{q_\ell }^2\), let \((d_0, d_1,d_2)=(b_1b_2,\) \(a_1b_2+a_2b_1,a_1a_2)\pmod {q_\ell }\). Output \(\mathsf {ct}_\mathsf {mult}\leftarrow (d_0, d_1)+\lfloor {P^{-1}\cdot d_2\cdot \mathsf {evk}}\rceil \pmod {q_\ell }\).

  • \(\underline{\textsf {RS}_{\ell \rightarrow \ell '}(\mathsf {ct})}\). For a ciphertext \(\mathsf {ct}\in {\mathcal R}_{q_\ell }^{2}\) at level \(\ell \), output \(\mathsf {ct}'\leftarrow \lfloor {p^{\ell '-\ell }\cdot \mathsf {ct}}\rceil \pmod {q_{\ell '}}\). We will omit the subscript \((\ell \rightarrow \ell ')\) when \(\ell '=\ell -1\).

The native plaintext space of \({\texttt {HEAAN}}\) can be understood as the set of polynomials m(X) in \({{\mathbb Z}[X]/(\varPhi _M(X))}\) such that \(\Vert {m}\Vert _\infty ^{\textsf {can}}< q/2\). For convenience, we allow an arbitrary element of \({\mathcal P}={\mathbb R}[X]/(\varPhi _M(X))\) as a plaintext polynomial, so that a ciphertext \(\mathsf {ct}=(c_0, c_1)\in {\mathcal R}_{q_\ell }^2\) at level \(\ell \) will be called an encryption of \(m(X)\in {\mathcal P}\) with an error bound B if it satisfies \(\langle \mathsf {ct},\mathsf {sk}\rangle =m+e \pmod {q_\ell }\) for some polynomial \(e(X)\in {\mathcal P}\) satisfying \(\Vert {e}\Vert _\infty ^{\textsf {can}} \le B\). The set \({\mathcal P}={\mathbb R}[X]/(\varPhi _M(X))\) can be identified with the complex coordinate space \({\mathbb {C}}^{N/2}\) using a ring isomorphism. This decoding map allows us to encrypt at most (N / 2) numbers in a single ciphertext and carry out parallel operations in a Single Instruction Multiple Data (SIMD) manner. A simple description of the packing method will be described in Sect. 4.1.

We will make the use of the following lemmas from [14] for noise estimation. We adapt some notations from [14], defining the constants \(B_\mathsf {ks}\) and \(B_\mathsf {rs}\).

Lemma 1

([14, Lemma 1]). Let \(\mathsf {ct}\leftarrow {{{\textsf {\textit{Enc}}}}}_\mathsf {pk}(m)\) be an encryption of \(m\in {\mathcal R}\). Then \(\langle \mathsf {ct},\mathsf {sk}\rangle =m+e \pmod {q_L}\) for some \(e\in {\mathcal R}\) satisfying \(\Vert {e}\Vert _\infty ^{\textsf {can}} \le B_\mathsf {clean}\) for \(B_\mathsf {clean}= 8\sqrt{2} \sigma N + 6\sigma \sqrt{N}+16\sigma \sqrt{hN}\).

Lemma 2

([14, Lemma 2]). Let \(\mathsf {ct}'\leftarrow {{{\textsf {\textit{RS}}}}}_{\ell \rightarrow \ell '}(\mathsf {ct})\) for a ciphertext \(\mathsf {ct}\in {\mathcal R}_{q_\ell }^2\). Then \(\langle \mathsf {ct}',\mathsf {sk}\rangle =\frac{q_{\ell '}}{q_\ell }\langle \mathsf {ct},\mathsf {sk}\rangle +e \pmod {q_{\ell '}}\) for some \(e\in {\mathcal P}\) satisfying \(\Vert {e}\Vert _\infty ^{\textsf {can}} \le B_\mathsf {rs}\) for \(B_\mathsf {rs}=\sqrt{N/3}\cdot (3 +8 \sqrt{h})\).

Lemma 3

([14, Lemma 3]). Let \(\mathsf {ct}_\mathsf {mult}\leftarrow {{{\textsf {\textit{Mult}}}}}_\mathsf {evk}(\mathsf {ct}_1,\mathsf {ct}_2)\) for two ciphertexts \(\mathsf {ct}_1,\mathsf {ct}_2\in {\mathcal R}_{q_\ell }^2\). Then \(\langle \mathsf {ct}_\mathsf {mult},\mathsf {sk}\rangle =\langle \mathsf {ct}_1,\mathsf {sk}\rangle \cdot \langle \mathsf {ct}_2,\mathsf {sk}\rangle +e_\mathsf {mult}\pmod {q_\ell } \) for some \(e\in {\mathcal R}\) satisfying \(\Vert {e_\mathsf {mult}}\Vert _\infty ^{\textsf {can}} \le B_\mathsf {mult}(\ell )\) for \(B_\mathsf {ks}=8\sigma N/\sqrt{3}\) and \(B_\mathsf {mult}(\ell )=P^{-1}\cdot q_\ell \cdot B_\mathsf {ks}+B_\mathsf {rs}\).

A rescaling (rounding) error is the smallest error type of homomorphic operations. The least digits of a plaintext is destroyed by some error after multiplication or rescaling, so its significand should be placed in higher digits not to lose the precision of the resulting plaintext.

3 Decryption Formula over the Integers

The goal of bootstrapping is to refresh a ciphertext and keep computing on encrypted data. Recall that \({\texttt {HEAAN}}\) supports arithmetic operations on a characteristic zero plaintext space such as \({\mathbb {C}}\) However, its decryption formula consists of two steps: the inner product \(t=\langle \mathsf {ct},\mathsf {sk}\rangle \) over the integers and the modular reduction \(m=[t]_q\). We therefore have to express this decryption formula efficiently using homomorphic operations provided in the \({\texttt {HEAAN}}\) scheme.

The main difficulty comes from the fact that the reduction modular q function \(F(t)=[t]_q\) is not represented as a small-degree polynomial. A naive approach such as the polynomial interpolation causes a huge degree, resulting in a large parameter size and an expensive computational cost for bootstrapping process. Instead, we reduce the required circuit depth and the evaluation complexity by exploiting a polynomial approximation of the decryption formula and taking advantage of approximate arithmetic.

3.1 Approximation of the Modular Reduction Function

Let \(\mathsf {ct}\) be a ciphertext relative to a secret key \(\mathsf {sk}\) and a modulus q. Since \(\mathsf {sk}\) is sampled from a small distribution, the size of its decryption structure \(t=\langle \mathsf {ct},\mathsf {sk}\rangle \) is bounded by Kq for some fixed constant K. So we can say that the decryption formula of \({\texttt {HEAAN}}\) is defined on the set \({\mathbb Z}\cap (-Kq,Kq)\) and it maps an arbitrary integer \(t\in {\mathbb Z}\cap (-Kq,Kq)\) to the reduction modular q.

It is infeasible to find a good approximation of the modular reduction function since it is not continuous. We first assume that a message m of an input ciphertext is still much smaller than a ciphertext modulus q, so that \(t=\langle \mathsf {ct},\mathsf {sk}\rangle \) can be expressed as \(qI+m\) for some I and m such that \(|I|<K\) and \(|m|\ll q\). This assumption is reasonable because one can start the bootstrapping procedure on a ciphertext before its modulus becomes too small. Then the modular reduction \(F(t)=[t]_q\) on a restricted domain becomes a piecewise linear function (see Fig. 1). We point out that this function is the identity near zero and periodic, so it looks like a part of the scaled sine

$$S(t)=\frac{q}{2\pi }\sin \left( \frac{2\pi t}{q}\right) .$$

Note that it gives a good approximation to the piecewise linear function when an input value \(t=qI+m\) is close to a multiple of q. Specifically, an error between \(F(t)\) and \(S(t)\) is bounded by

$$ |F(t)-S(t)|=\frac{q}{2\pi }\left| \frac{2\pi m}{q} - \sin \left( \frac{2\pi m}{q}\right) \right| \le \frac{q}{2\pi }\cdot \frac{1}{3!}\left( \frac{2\pi |m|}{q}\right) ^3=O\left( q\cdot \frac{|m|^3}{q^3}\right) , $$

which is equivalently O(1) when \(m=O(q^{2/3})\).

3.2 Homomorphic Evaluation of the Complex Exponential Function

As discussed before, the scaled sine function \(S(t)\) is a good approximation of the reduction modulo q. However, this function cannot be evaluated directly using HE since it is not a polynomial function. The goal of this subsection is to explain how to approximately and efficiently evaluate this trigonometric function based on \({\texttt {HEAAN}}\).

We may consider the Taylor polynomial \(\frac{q}{2\pi }\sum _{j=0}^{d-1}\frac{(-1)^j}{(2j+1)!}\left( \frac{2\pi t}{q}\right) ^{2j+1}\) of \(S(t)\). The size of error converges to zero very rapidly as the degree grows, i.e., an error between \(S(t)\) and its Taylor polynomial of degree 2d is bounded by \(\frac{q}{2\pi }\cdot \frac{1}{(2d+1)!}\left( 2\pi K\right) ^{2d+1}\) when \(|t|<Kq\), and it becomes small enough when the degree of the Taylor polynomial is O(Kq). However, despite its high precision, this naive method has an ineffective problem in practice. The complexity grows exponentially with the depth of a circuit, e.g. \(O(\sqrt{d})\) using the Paterson-Stockmeyer algorithm [40] for an evaluation of a degree-d polynomial.

Instead, we can reduce the computational cost by exploiting the following double-angle formulas: \(\cos ( 2\theta ) = \cos ^2\theta - \sin ^2\theta \) and \(\sin (2\theta ) = 2\cos \theta \cdot \sin \theta .\) From approximate values of trigonometric functions in a small domain, we extend to find good approximations of the sign function on a wider (doubled) range. In particular, the RLWE-based \({\texttt {HEAAN}}\) scheme can encrypt the complex numbers, so that the evaluation algorithm can be more simplified using the complex exponential function. Specifically, we use the identities

$${\left\{ \begin{array}{ll} \exp ( i\theta ) = \cos \theta + i \cdot \sin \theta ,\\ \exp (2i\theta ) = (\exp (i\theta ))^2, \end{array}\right. }$$

and the error growth from squaring can be bounded by about one bit since \((\exp (i\theta )\pm \epsilon )^2\approx \exp (2i\theta )\pm 2\epsilon \).

We take the Taylor polynomial of a small degree \(d_0\ge 1\) as a high-precision approximation of the complex exponential function within a small range. Then we perform the squaring operation repeatedly to get an approximation of the complex exponential function over the desirable domain. Note that we multiply a scale factor of \(\varDelta \) to prevent the precision loss and divide the intermediate ciphertexts by a constant \(\varDelta \) using the rescaling procedure of \({\texttt {HEAAN}}\).

The use of the complex exponential function has another advantage in error analysis. When we consider the RLWE-based \({\texttt {HEAAN}}\) scheme, small complex errors are added to plaintext slots during encryption, evaluation, rescaling and slot permutation. Therefore, we have only one constraint such that a decryption formula should be tolerant of small complex errors. Another advantage of our method comes from the fact that the complex exponential function is analytic with a bounded derivative over the whole complex plane, and therefore an error does not blow up by the decryption formula. The whole procedure is explicitly described as follows.

figure a

The degree \(d_0\) of the initial Taylor polynomial, the scaling factor of \(\varDelta \), and the number r of the iterations (squaring) are determined by the following noise analysis. Since the size of the initial input \((2\pi t)/(2^r\cdot q)\) of the complex exponential function has a small upper bound \((2\pi K/2^r)\), even the Taylor polynomial of a small degree \(d_0\) can be a good approximation to the complex exponential function \(\exp \left( \frac{2\pi it}{2^r\cdot q}\right) \). From the above observation, the output \(P_r(t)\) is a polynomial of degree \(d_r=d_0 \cdot 2^r\) and it is an approximation of \(E(t):=\varDelta \cdot \exp \left( \frac{2\pi i t}{q}\right) \) on a wide interval \(t\in \left( -Kq, Kq\right) \). After the evaluation of the complex exponential function, we can extract the imaginary (sine) part by conjugation operation (i.e., \(2\sin \theta =\exp (i\theta )-\exp (-i\theta )\)), which will be described in the next section.

For the estimation of noise, we start from an initial error between \(P_0(t)\) and \(\varDelta \cdot \exp \left( \frac{2\pi it}{2^r\cdot q}\right) \), which is bounded by \(\frac{\varDelta }{(d_0+1)!}\left| \frac{2\pi K}{2^r}\right| ^{d_0+1}\) from the Taylor remainder theorem. As described above, the error bound is almost doubled after each squaring. Therefore, we get a bound from an approximation as follows:

$$\begin{aligned} \left| P_r(t)- E(t)\right|&\le \frac{\varDelta \cdot 2^r}{(d_0+1)!}\left( \frac{2\pi K}{2^{r}}\right) ^{d_0+1} \\&\le \frac{\varDelta \cdot 2^r}{\sqrt{2\pi (d_0+1)}}\left( \frac{e\pi K}{2^{r-1}(d_0+1)}\right) ^{d_0+1} \end{aligned}$$

from Stirling’s formula. Asymptotically the choice of parameters \(d_0=O(1)\) and \(r=O(\log (Kq))\) gives us a sufficiently small error bound. Note that the complexity of the algorithm is \(r = O(\log (Kq))\) homomorphic multiplications and it grows linearly with the depth of the decryption circuit.

4 Linear Transformation on Packed Ciphertexts

In this section, we explain how to homomorphically evaluate the linear transformations over the vector of plaintext slots. We first present a simple description of the packing method of \({\texttt {HEAAN}}\). We then explain how to compute the rotation and the complex conjugation over the plaintext slots using the key-switching technique. These functionalities can be applied to the evaluation of a linear transformation over plaintext slots.

4.1 Packing Method

The packing technique of HE schemes allows us to encrypt multiple messages in a single ciphertext and enables a parallel computation in a SIMD manner. Cheon et al. [14] proposed a method to identify a cyclotomic polynomial with real coefficients to a vector of complex numbers. We clarify this encoding method and give a simpler description using the structure of a cyclotomic ring with a power-of-two dimension.

Recall that for a power-of-two integer \(M>4\), we have \(N=M/2\) and \(\varPhi _M(X)=X^N+1\). The integer 5 has the order of (N / 2) modulo M and spans \({\mathbb Z}_M^*\) with the integer “\(-1\)”. Hence \(\{\zeta _j,\overline{\zeta _j}:0\le j<N/2\}\) forms the set of the primitive M-th roots of unity for \(\zeta _j:=\zeta ^{5^j}\) and \(0\le j<N/2\). We use the notation \(\tau : {\mathcal P}= {\mathbb R}[X]/(X^N+1) \rightarrow {\mathbb {C}}^{N/2}\) to denote a variant of the complex canonical embedding map defined by \(\tau :m(X)\mapsto {{\varvec{z}}}= (z_j)_{0\le j < N/2}\) such that \(z_j=m(\zeta _j)\). Note that \(\tau \) is an isometric ring homomorphism between metric spaces \(({\mathcal P}, \Vert {\cdot }\Vert _\infty ^{\textsf {can}})\) and \(({\mathbb {C}}^{N/2}, \Vert {\cdot }\Vert _\infty )\). We use this isomorphism \(\tau \) as the decoding function for packing of (N / 2) complex numbers in a single polynomial. By identifying a polynomial \(m(X)=\sum _{i=0}^{N-1}m_iX^i\in {\mathcal P}\) with the vector of its coefficients \({{\varvec{m}}}=(m_0,\dots ,m_{N-1})\), the decoding algorithm \(\tau \) can be understood as a linear transformation from \({\mathbb R}^N\) to \({\mathbb {C}}^{N/2}\). Its matrix representation is given by

$$U= \begin{bmatrix} 1&\zeta _0&\zeta _0^2&\dots&\zeta _0^{N-1} \\ 1&\zeta _1&\zeta _1^2&\dots&\zeta _1^{N-1} \\ \vdots&\vdots&\vdots&\ddots&\vdots \\ 1&\zeta _{\frac{N}{2}-1}&\zeta _{\frac{N}{2}-1}^2&\dots&\zeta _{\frac{N}{2}-1}^{N-1} \end{bmatrix}$$

which is the \((N/2)\times N\) Vandermonde matrix generated by \(\{\zeta _j:0\le j<N/2\}\).

In order to compute the encoding function, which is the inverse of \(\tau \), we first note that the relation \({\overline{{{\varvec{z}}}}}={\overline{U}}\cdot {{\varvec{m}}}\) is obtained from \({{\varvec{z}}}=U\cdot {{\varvec{m}}}\) by taking the conjugation. If we write \(\mathsf {CRT}=(U;{\overline{U}})\) as the CRT matrix generated by the set \(\{\zeta _j,{\overline{\zeta _j}}:0\le j<N/2\}\) of M-th primitive roots of unity, we have the identities \(({{\varvec{z}}};{\overline{{{\varvec{z}}}}})=\mathsf {CRT}\cdot {{\varvec{m}}}\) and \(\mathsf {CRT}^{-1}=\frac{1}{N}{\overline{\mathsf {CRT}}}^T\). This implies that the inverse of \(\tau \) can be computed by

$${{\varvec{m}}}=\frac{1}{N}({\overline{U}}^T\cdot {{\varvec{z}}}+ U^T\cdot {\overline{{{\varvec{z}}}}}).$$

Throughout this paper, we will identify two spaces \({\mathcal P}\) and \({\mathbb {C}}^{N/2}\) via the map \(\tau \), and hence a ciphertext will be called an encryption of \({{\varvec{z}}}\in {\mathbb {C}}^{N/2}\) if it encrypts the corresponding polynomial \(m(X)=\tau ^{-1}({{\varvec{z}}})\).

4.2 Rotation and Conjugation

The purpose of the key-switching operation is to convert a ciphertext under a secret \(s'\) into a ciphertext of the same message with respect to another secret key \(\mathsf {sk}=(1,s)\). The switching key \(\mathsf {swk}\) can be generated by the procedure of \(\textsf {KSGen}_\mathsf {sk}(s')\). Given a ciphertext \(\mathsf {ct}= (c_0, c_1)\) at level \(\ell \), the procedure \(\textsf {KS}_{\mathsf {swk}}(\mathsf {ct})\) proceeds as follows.

  • \(\underline{\textsf {KS}_{\mathsf {swk}}(\mathsf {ct})}\). Output the ciphertext \( \mathsf {ct}' \leftarrow (c_0,0)+ \lfloor {P^{-1}\cdot c_1\cdot \mathsf {swk}}\rceil \pmod {q_\ell }\).

The following lemma shows the correctness of key-switching procedure and estimates a noise bound. It has a similar noise bound \(P^{-1}\cdot q\cdot B_\mathsf {ks}+B_\mathsf {rs}\approx B_\mathsf {rs}\) with the rescaling process.

Lemma 4

(Key-switching). Let \(\mathsf {ct}=(c_0,c_1)\in {\mathcal R}_q^2\) be a ciphertext with respect to a secret key \(\mathsf {sk}'=(1,s')\) and let \(\mathsf {swk}\leftarrow {{{\textsf {\textit{KSGen}}}}}_\mathsf {sk}(s')\). Then \(\mathsf {ct}'\leftarrow {{{\textsf {\textit{KS}}}}}_\mathsf {swk}(\mathsf {ct})\) satisfies \(\langle \mathsf {ct}',\mathsf {sk}\rangle = \langle \mathsf {ct},\mathsf {sk}'\rangle + e_\mathsf {ks}\pmod {q}\) for some \(e_\mathsf {ks}\in {\mathcal R}\) with \(\Vert {e_\mathsf {ks}}\Vert _\infty ^{\textsf {can}} \le P^{-1}\cdot q\cdot B_\mathsf {ks}+B_\mathsf {rs}\).

Proof

Let \(e'=\langle \mathsf {swk},\mathsf {sk}\rangle -P\cdot s' \pmod {P\cdot q_L}\) be the inserted error of the switching key \(\mathsf {swk}\). It was shown in [14, Lemma 3] that the ciphertext \(\mathsf {ct}'\) contains an additional error \(e''=c_1\cdot e'\) in the modulus \(P\cdot q\) from the key-switching operation. The key-switching error is the sum of \(P^{-1}\cdot e''\) and a rounding error \(e_\mathsf {rs}\) of \(P^{-1}\cdot c_1\cdot \mathsf {swk}\), so its size is bounded by

$$\Vert {e_\mathsf {ks}}\Vert _\infty ^{\textsf {can}} \le P^{-1}\cdot \Vert {e''}\Vert _\infty ^{\textsf {can}}+\Vert {e_\mathsf {rs}}\Vert _\infty ^{\textsf {can}} \le P^{-1}\cdot q\cdot B_\mathsf {ks}+B_\mathsf {rs},$$

as desired.    \(\square \)

For an integer k co-prime with M, let \(\kappa _k: m(X) \mapsto m(X^k) \pmod {\varPhi _M(X)}\) be a mapping defined on the set \({\mathcal P}\). As noted in [29], this transformation can be used to provide more functionalities on plaintext slots. In some more details, given a ciphertext \(\mathsf {ct}\) of a message m with \(\mathsf {sk}=(1,s)\), we denote \(\kappa _k(\mathsf {ct})\) the ciphertext which is obtained by applying \(\kappa _k\) to each entry of \(\mathsf {ct}\). Then \(\kappa _k(\mathsf {ct})\) is a valid encryption of \(\kappa _k(m)\) with the secret key \(\kappa _k(s)\). The key-switching technique can be applied to the ciphertext \(\kappa _k(\mathsf {ct})\) to get an encryption of the same message \(\kappa _k(m)\) with respect to the original secret key \(\mathsf {sk}\).

Rotation. Suppose that \(\mathsf {ct}\) is an encryption of a message m(X) with the corresponding plaintext vector \({{\varvec{z}}}= (z_j)_{0\le j < \frac{N}{2}} \in {\mathbb {C}}^{N/2}\). For any \(0 \le i, j< N/2\), there is a mapping \(\kappa _k\) which sends an element in the slot of index i to an element in the slot of index j. Let us define \(k= 5^{i-j} \pmod M\) and \(\tilde{m}= \kappa _k(m)\). Denoting \(\tilde{{\varvec{z}}}= (\tilde{z}_j)_{0\le j < \frac{N}{2}}\) as the corresponding plaintext vector of \(\tilde{m}\), we have

$$\tilde{z}_j= \tilde{m}(\zeta _{j})= m(\zeta _j^{5^{i-j}})= m(\zeta _{i})= z_i,$$

so the j-th slot of \(\tau (\tilde{m})\) and the i-th entry of \(\tau (m)\) have the same value. In general, we may get a ciphertext \(\kappa _{5^r}{(\mathsf {ct})}\) encrypting \(\rho ({{\varvec{z}}};r):=(z_r,\dots ,z_{\frac{N}{2}-1},z_0,\dots ,z_{r-1})\), which is the vector obtained from \({{\varvec{z}}}\) by rotation. Below, we describe the rotation procedure including the key-switching operation.

  • Generate the rotation key \(\mathsf {rk}_r \leftarrow \textsf {KSGen}_{\mathsf {sk}}(\kappa _{5^r}(s))\).

  • \(\underline{\textsf {Rot}(\mathsf {ct};r)}\). Output the ciphertext \(\textsf {KS}_{\mathsf {rk}_r}(\kappa _{5^r}{(\mathsf {ct})})\).

Conjugation. We see that \(\kappa _{-1}(\mathsf {ct})\) is a valid encryption of the plaintext vector \({\overline{{\varvec{z}}}} = ({\overline{z_j}})_{0\le j<N/2}\) with the secret key \(\kappa _{-1}(s)\). It follows from that fact that

$${\overline{z_j}} = {\overline{m({\zeta _j})}}= m({\overline{\zeta _j}})= m({\zeta _j^{-1}}).$$

Then the homomorphic evaluation of the conjugation operation over plaintext slots consists of two procedures:

  • Generate the conjugation key \(\mathsf {ck}\leftarrow \textsf {KSGen}_{\mathsf {sk}}(\kappa _{-1}(s))\).

  • \(\underline{\textsf {Conj}(\mathsf {ct})}\). Output the ciphertext \(\textsf {KS}_{\mathsf {ck}}(\kappa _{-1}{(\mathsf {ct})})\).

4.3 Linear Transformations

In general, an arbitrary linear transformation over the vector of plaintext slots in \({\mathbb {C}}^{N/2}\) can be represented as \({{\varvec{z}}}\mapsto A\cdot {{\varvec{z}}}+B\cdot {\overline{{{\varvec{z}}}}}\) for some complex matrices \(A,B\in {\mathbb {C}}^{N/2\times N/2}\). As discussed in [32], it can be efficiently done by handling the matrix in a diagonal order and making use of SIMD computation. Specifically, let \({{\varvec{u}}}_j=(A_{0,j},A_{1,j+1},\dots ,A_{\frac{N}{2}-j-1,\frac{N}{2}-1},A_{\frac{N}{2}-j,0},\dots ,A_{\frac{N}{2}-1,j-1})\in {\mathbb {C}}^{N/2}\) denote the shifted diagonal vector of A for \(0 \le j < N/2\). Then we have

$$\begin{aligned} A\cdot {{\varvec{z}}}= \sum _{0\le j<N/2}({{\varvec{u}}}_j\odot \rho ({{\varvec{z}}};j)) \end{aligned}$$
(1)

where \(\odot \) denotes the Hadamard (component-wise) multiplication between vectors. Therefore, if the matrix A is given in plaintext and \(\mathsf {ct}\) is given as an encryption of the vector \({{\varvec{z}}}\), the matrix-vector multiplication \(A\cdot {{\varvec{z}}}\) is expressed as combination of rotations and scalar multiplications. The vector rotation \(\rho ({{\varvec{z}}};j)\) can be homomorphically computed by \(\textsf {Rot}(\mathsf {ct};j)\) and the Hadamard (component wise) scalar multiplication is done by multiplying the polynomial \(\tau ^{-1}({{\varvec{u}}}_j)\). See Algorithm 1 for an explicit description of the homomorphic matrix multiplication. Similarly, the second term \(B\cdot {\overline{{{\varvec{z}}}}}\) can be obtained by multiplying the matrix B after applying the slot-wise conjugation on \({{\varvec{z}}}\).

figure b

Algorithm 1 requires \((N/2-1)\) rotations and N multiplications with scalar polynomials but the complexity can be reduced down using the idea of Baby-Step Giant-Step algorithm. Let \(N_1 = O(\sqrt{N})\) be a divisor of N / 2 and denote \(N_2 = N/2N_1\). It follows from [33] that Eq. (1) can be expressed as

$$\begin{aligned} A\cdot {{\varvec{z}}}&= \sum _{0\le j<N_2}\sum _{0\le i<N_1}({{\varvec{u}}}_{N_1\cdot j+i}\odot \rho ({{\varvec{z}}};N_1\cdot j+i)) \\&= \sum _{0\le j<N_2} \rho \left( \sum _{0\le i<N_1}\rho ({{\varvec{u}}}_{N_1\cdot j+i};-N_1\cdot j)\odot \rho ({{\varvec{z}}};i); N_1\cdot j\right) . \end{aligned}$$

For the homomorphic evaluation of the arithmetic circuit, we first compute the ciphertexts of \(\rho ({{\varvec{z}}};k)\) for \(i=1,\ldots ,N_1-1\). For each index j, we perform \(N_1\) scalar multiplications and aggregate the resulting ciphertexts. In total, the matrix multiplication can be homomorphically evaluated with \((N_1-1)+(N_2-1)=O(\sqrt{N})\) rotations and \(N_1\cdot N_2=O(N)\) scalar multiplications.

We provide a trade-off between the precision of a plaintext and the size of a ciphertext modulus. The output plaintext contains errors from the rounding operation of scalar polynomial and the homomorphic rotation. We can reduce the relative size of the rounding error by multiplying a scaling factor of \(\varDelta \ge 1\) to the scalar polynomials, and the relative size of the rotation error can be controlled by placing the significand of an input ciphertext in higher digits. Therefore, the modulus of an output ciphertext is reduced after the evaluation of a linear transformation if we use a scaling factor to get a better precision of plaintexts.

4.4 Sparsely Packed Ciphertext

The packing method described in Sect. 4.1 allows us to encrypt (N / 2) complex numbers in a single ciphertext. However, it is sufficient to deal with a small number of slots in some applications of HE. In this section, we explain how to encode a sparse plaintext vector and describe the relation with the ordinary packing method. This idea will be applied to our bootstrapping method and provide a trade-off between the latency time and the amortized time of bootstrapping procedure.

Let \(n\ge 2\) be a divisor of N and let \(Y=X^{N/n}\). The native plaintext space of \({\texttt {HEAAN}}\) is the set of small polynomials in \({\mathbb Z}[X]/(X^N+1)\), and it has a subring \({\mathbb Z}[Y]/(Y^{n}+1)\). Note that \({\mathbb Z}[Y]/(Y^n+1)\) can be identified with the complex coordinate space \({\mathbb {C}}^{n/2}\) by adapting the idea of ordinary packing method in Sect. 4.1. Specifically, a polynomial m(Y) is mapped to the vector \({{\varvec{w}}}=(w_j)_{0\le j<n/2}\) where \(w_j=m(\xi _j)\), \(\xi =\exp (-2\pi i/n)=\zeta ^{N/n}\), and \(\xi _j=\xi ^{5^j}\) for \(0\le j<n/2\). If we consider a plaintext polynomial \(m(Y)\in {\mathbb Z}[Y]/(Y^n+1)\) as a polynomial \(\tilde{m}(X)=m(X^{N/n})\) in X, then the image of \(\tilde{m}\) through the ordinary decoding function \(\tau \) is obtained by \(\tau (\tilde{m}) =(z_j)_{0\le j<N/2}\) where

$$z_j=\tilde{m}(\zeta _j^{N/n})=m(\xi ^{5^j})=w_{j \pmod {n/2}}$$

for \(0\le j<N/2\). Hence \({{\varvec{z}}}=(z_j)_{0\le j<N/2}\) can be understood as the vector obtained from \({{\varvec{w}}}\) by concatenating itself (N / n) times.

An encryption of a plaintext polynomial \(m(Y)\in {\mathbb Z}[Y]/(Y^n+1)\) with respect to a secret key \(\mathsf {sk}=(1,s)\) will be pairs as \(\mathsf {ct}\in {\mathcal R}_q^2\) satisfying \(\langle \mathsf {ct},\mathsf {sk}\rangle =m(Y)+e(X) \pmod q\) for some small polynomial \(e(X)\in {\mathcal R}\). We may employ key-switching technique to get the functionalities of rotation, conjugation, and linear transformation on sparsely packed slots. The main advantage of this method is that it reduces the complexity of an arbitrary linear transformation: the total complexity on (n / 2)-sparsely packed ciphertexts is bounded by \(O(\sqrt{n})\) rotations and O(n) scalar multiplications.

5 Bootstrapping for \({\texttt {HEAAN}}\)

5.1 Overview of the Recryption Procedure

This section gives a high level structure of the bootstrapping process for the \({\texttt {HEAAN}}\) scheme. We employ the ciphertext packing method and combine it with our efficient evaluation strategy to achieve a better performance in terms of memory and computation cost. Below we describe all the parts of the recryption procedure in more details. We denote the following five steps by \(\textsc {ModRaise}\), \(\textsc {CoeffToSlot}\), \(\textsc {EvalExp}\), \(\textsc {ImgExt}\) and \(\textsc {SlotToCoeff}\), respectively. See Fig. 2 for an illustration.

Fig. 2.
figure 2

Pipeline of our bootstrapping process

Modulus Raising. Let \(\mathsf {ct}\) be an input ciphertext of the bootstrapping procedure with a ciphertext modulus q satisfying \(m(X)=[\langle \mathsf {ct},\mathsf {sk}\rangle ]_q\). We start with the point that its inner product \(t(X)=\langle \mathsf {ct},\mathsf {sk}\rangle \pmod {X^N+1}\) is of the form \(t=qI+m\) for some small \(I(X)\in {\mathcal R}\) with a bound \(\Vert {I}\Vert _\infty < K\). Thus \(\mathsf {ct}\) itself can be viewed as an encryption of \(t(X)=t_0+t_1X+\dots +t_{N-1}X^{N-1}\) in a large modulus \(Q_0\gg q\) due to \([\langle \mathsf {ct},\mathsf {sk}\rangle ]_{Q_0}=t(X)\). Our bootstrapping procedure aims to homomorphically and approximately evaluate the reduction mod q \(F(t)=[t]_q\) using arithmetic operations over the integers, and hence we can generate an encryption of the original message \(m=[t]_{q}\) with a ciphertext modulus larger than q.

Putting Polynomial Coefficients in Plaintext Slots. Given the input ciphertext \(\mathsf {ct}\in {\mathcal R}_{Q_0}^2\) with a decryption structure \(t(X)=\langle \mathsf {ct},\mathsf {sk}\rangle \), this step aims to put the coefficients \(t_0,\dots ,t_{N-1}\) in plaintext slots in order to evaluate the modular reduction function \(F(t)\) coefficient-wisely. Let \({{\varvec{z}}}'=\tau (t)\in {\mathbb {C}}^{N/2}\) be the corresponding vector of plaintext slots of the ciphertext \(\mathsf {ct}\). Since each ciphertext can store at most N / 2 plaintext values, we will generate two ciphertexts encrypting the vectors \({{\varvec{z}}}_0'=(t_0,\dots ,t_{\frac{N}{2}-1})\) and \({{\varvec{z}}}_1'=(t_{\frac{N}{2}},\dots ,t_{N-1})\), respectively.

As mentioned in Sect. 4.1, recall the linear relation between the coefficient vector of a polynomial and its corresponding vector of plaintext slots. If we divide the matrix \(U\) into two square matrices

$$U_0= \begin{bmatrix} 1&\zeta _0&\dots&\zeta _0^{\frac{N}{2}-1} \\ 1&\zeta _1&\dots&\zeta _1^{\frac{N}{2}-1} \\ \vdots&\vdots&\ddots&\vdots \\ 1&\zeta _{\frac{N}{2}-1}&\dots&\zeta _{\frac{N}{2}-1}^{\frac{N}{2}-1} \end{bmatrix}\quad \text {and}\quad U_1= \begin{bmatrix} \zeta _0^{\frac{N}{2}}&\zeta _0^{\frac{N}{2}+1}&\dots&\zeta _0^{N-1} \\ \zeta _1^{\frac{N}{2}}&\zeta _1^{\frac{N}{2}+1}&\dots&\zeta _1^{N-1} \\ \vdots&\vdots&\ddots&\vdots \\ \zeta _{\frac{N}{2}-1}^{\frac{N}{2}}&\zeta _{\frac{N}{2}-1}^{\frac{N}{2}+1}&\dots&\zeta _{\frac{N}{2}-1}^{N-1}\end{bmatrix},$$

then we get an identity \({{\varvec{z}}}_k'=\frac{1}{N}({\overline{U_k}}^T\cdot {{\varvec{z}}}'+U_k^T\cdot {\overline{{{\varvec{z}}}'}})\) for \(k=0, 1\). Therefore, we can generate encryptions of \({{\varvec{z}}}_0'\) and \({{\varvec{z}}}_1'\) using the linear transformations on the plaintext vector \({{\varvec{z}}}'\). One can apply our general method in Sect. 4.3 to this step.

Evaluation of the Complex Exponential Function. This step takes the results of the previous step and homomorphically computes the reduction mod q function \(F(t)=[t]_q\) homomorphically. We use the trigonometric function \(S(t):=\frac{q}{2\pi }\sin \left( \frac{2\pi t}{q}\right) \) as an approximation of \(F(t)\) and adapt the optimized evaluation strategy for the complex exponential function \(E(t):=\frac{q}{2\pi }\exp \left( \frac{2\pi it}{q}\right) \).

Since the plaintext slots of the input ciphertexts contain the coefficients \(t_j = qI_j+m_j\) for \(0\le j<N\), the output ciphertexts will encrypt \(\frac{q}{2\pi }\exp \left( \frac{2\pi i t_j}{q}\right) =\frac{q}{2\pi }\exp \left( \frac{2\pi i m_j}{q}\right) \) in the corresponding plaintext slots.

Extraction of the Imaginary Part. We take two input ciphertexts encrypting the values \(\frac{q}{2\pi }\exp \left( \frac{2\pi i m_j}{q}\right) \) in their plaintext slots for \(0\le j<N\). We extract their imaginary parts as \(\frac{q}{2\pi }\sin \left( \frac{2\pi m_j}{q}\right) \approx m_j\) by using the relation

$$\sin \left( \frac{2\pi m_j}{q}\right) =\frac{1}{2}\left( \exp \left( \frac{2\pi i m_j}{q}\right) -\exp \left( \frac{-2\pi im_j}{q}\right) \right) $$

and applying the evaluation method of slot-wise conjugation described in Sect. 4.2.

Switching Back to the Coefficient Representation. The final step is to pack all the coefficients \(m_j\) in the plaintext slots of two ciphertexts back in a single ciphertext. This procedure is exactly the inverse of the \(\textsc {CoeffToSlot}\) transformation. That is, when given two ciphertexts that encrypt the vectors \({{\varvec{z}}}_0 =(m_0,\dots ,m_{\frac{N}{2}-1})\) and \({{\varvec{z}}}_1=(m_{\frac{N}{2}},\dots ,m_{N-1})\), we aim to generate an encryption of m(X). Since the plaintext vector \({{\varvec{z}}}=\tau (m)\) of m(X) satisfies the identity \({{\varvec{z}}}=U\cdot {{\varvec{m}}}=U_0\cdot {{\varvec{z}}}_0+U_1\cdot {{\varvec{z}}}_1\), this transformation is also represented as a linear transformation over the plaintext vectors.

Our bootstrapping process returns an encryption of m(X) with a ciphertext modulus \(Q_1<Q_0\), which is much larger enough than the initial modulus q to allow us to perform further homomorphic operations on the ciphertext.

We can perform the final two steps together by pre-computing the composition of linear transformations. The (inverse) linear transformation step consumes only one level for scalar multiplications but requires a number of slot rotations. On the other hand, \(\textsc {EvalExp}\) performs homomorphic evaluation of the polynomial \(P_r(\cdot )\), which is the most levels consuming part of the recryption but requires relatively small computational cost from our recursive evaluation strategy: linear with the depth.

5.2 Recryption with Sparsely Packed Ciphertexts

As mentioned in Sect. 4.4, the use of sparsely packed ciphertexts has an advantage in some applications, in that it reduces the complexity of linear transformation steps. However, the recryption of sparsely packed ciphertexts requires an additional step before the \(\textsc {CoeffToSlot}\) step.

Let \(n\ge 2\) be a divisor of N and let \(Y=X^{N/n}\) as in Sect. 4.4. Assume that we are given an encryption \(\mathsf {ct}\) of \(m(Y)\in {\mathbb Z}[Y]/(Y^n+1)\) such that \(\langle \mathsf {ct},\mathsf {sk}\rangle \approx q\cdot I(X)+m(Y)\) for some \(I(X) = I_0+I_1\cdot X+\dots +I_{N-1}\cdot X^{N-1}\in {\mathcal R}\). Then the \(\textsc {ModRaise}\) step returns an encryption of \(q\cdot I(X)+m(Y)\) which is not a polynomial of Y. We aim to generate an encryption of \(q\cdot \tilde{I}(Y)+m(Y)\) for some \(\tilde{I}(Y)\in {\mathbb Z}[Y]/(Y^n+1)\) before the next \(\textsc {CoeffToSlot}\) step. It proceeds as described in Algorithm 2.

figure c

Note that the monomial \(X^k\) vanishes by the homomorphism \(X\mapsto X-X^{n+1}+X^{2n+1}-\cdots -X^{N-n+1}\) if k is not divisible by (N / n); otherwise, it is multiplied by the constant (N / n). Hence \(q\cdot I(X)+m(Y)\) is mapped to \((N/n)\cdot (q\cdot \tilde{I}(Y)+m(Y))\) by this homomorphism where \(\tilde{I}(Y)=I_0+I_{N/n}\cdot Y+\dots +I_{N-(N/n)}\cdot Y^{n-1}\). For an efficient evaluation of this homomorphism, Algorithm 2 uses the rotation operation repeatedly to fill the same value in the plaintext slots of index \(j \pmod {n/2}\) for each \(j=0,\dots ,n/2-1\).

As mentioned before, the main advantage of sparsely packed ciphertexts is that the \(\textsc {CoeffToSlot}\) step can be represented as relatively small matrices multiplications with only a single encryption of the coefficients vector of \(t(Y)=q\cdot \tilde{I}(Y)+m(Y)\) while fully-packed slots need two ciphertexts for the \(\textsc {CoeffToSlot}\) step. In some more details, for the plaintext vector \({{\varvec{w}}}\in {\mathbb {C}}^{n/2}\) of t(Y), the desired ciphertext can be computed by \(\frac{1}{n}({\overline{U'}}^T \cdot {{\varvec{w}}}+ U'^T \cdot {\overline{{{\varvec{w}}}}})\) where

$$U'= \begin{bmatrix} 1&\xi _0&\xi _0^2&\dots&\xi _0^{n-1} \\ 1&\xi _1&\xi _1^2&\dots&\xi _1^{n-1} \\ \vdots&\vdots&\vdots&\ddots&\vdots \\ 1&\xi _{\frac{n}{2}-1}&\xi _{\frac{n}{2}-1}^2&\dots&\xi _{\frac{n}{2}-1}^{n-1} \\ \end{bmatrix},$$

as in Sect. 4.1. We can either generate two ciphertexts encrypting two plaintext vectors \((w_0,\dots ,w_{n/2-1})\) and \((w_{n/2},\dots ,w_{n-1})\) by separating \(U'\) into two square matrices, or compute a single encryption of \((w_0,\dots ,w_{n-1})\) with n plaintext slots when \(n<N\). In the latter case, \(\textsc {EvalExp}\) and \(\textsc {ImgExt}\) perform the same operations as in the fulled-packed ciphertexts, but the memory and the computational cost are reduced by half since we can work on a single ciphertext. The final \(\textsc {SlotToCoeff}\) step is also expressed as a linear transformation over n-dimensional vector.

5.3 Noise Estimation of Recryption

In this section, we describe each step of recryption procedure with a noise analysis. We start with an upper bound K of \(\Vert {I}\Vert _\infty \). Since each coefficient of a ciphertext \(\mathsf {ct}=(c_0, c_1)\) is an element of \({\mathbb Z}_q\) and the signed binary secret key s has exactly h nonzero coefficients, each coefficient of \(\langle \mathsf {ct},\mathsf {sk}\rangle =c_0+c_1s\) can be considered to be the sum of \((h+1)\) elements in \({\mathbb Z}_q\), which is bounded by \(\frac{q}{2}(h+1)\). Hence all the coefficient of \(I(X)=\lfloor {\frac{1}{q}\langle \mathsf {ct},\mathsf {sk}\rangle }\rceil \) is bounded by \(\frac{1}{2}(h+1)\approx \frac{1}{2}\Vert {s}\Vert _1\). In practice, the coefficients of \(c_i\) look like a random variable over the interval \({\mathbb Z}_q\) and a coefficient of \(\frac{1}{q}\langle \mathsf {ct},\mathsf {sk}\rangle \) behaves as the sum of \((h+1)\)-numbers of i.i.d. uniform random variables over the interval \((-\frac{1}{2},\frac{1}{2})\). This heuristic assumption gives us a smaller bound as \(K=O(\sqrt{h})\) for \(\Vert {I}\Vert _\infty \).

We now consider the error growth during homomorphic evaluation of a linear transformation. As noted in Sect. 4.3, it induces two types of errors such as from the rounding of a scalar polynomial and the key-switching operation. We multiply a sufficiently large scaling factor of \(\varDelta =O(q)\) to scalar polynomials, so that the precision of rounded polynomials becomes larger than that of an input plaintext. Then we do not need to consider the rounding errors because they have no effect on the precision of the resulting plaintext. The second type of error is added to a plaintext when we apply the key-switching technique for some functionalities such as rotation or conjugation. From Lemma 4, the key-switching error is bounded by \(P^{-1}\cdot q\cdot B_\mathsf {ks}+B_\mathsf {rs}\approx B_\mathsf {rs}\) since we set a ciphertext modulus q much smaller than P. During matrix multiplication, key-switching errors are multiplied with diagonal vectors of matrix. Since infinite norms of diagonal vectors of \(\frac{1}{N}U_0\) and \(\frac{1}{N}U_1\) are exactly one, the total error of second type is bounded by \(O(B_\mathsf {rs})\).

In the \(\textsc {EvalExp}\) step, we take two ciphertexts to be homomorphically evaluated by the approximate polynomial \(P_r(\cdot )\) of the complex exponential polynomial. Each component of the corresponding plaintext slots contains \(t_j+e_j\) for some small error \(e_j\) such that \(|e_j|\le O(B_\mathsf {rs})\). Hence, the error between the desired value \(E(t_j)\) and the resulting plaintext of \(\textsc {EvalExp}\) can be measured by

$$\begin{aligned} |E(t_j)-P_r(t_j+e_j)|&\le |E(t_j)-E(t_j+e_j)|+|E(t_j+e_j)-P_r(t_j+e_j)| \\&\le \frac{\varDelta \cdot 2\pi }{q} |e_j|+ \frac{\varDelta \cdot 2^r}{\sqrt{2\pi (d_0+1)}}\left( \frac{e\pi K}{2^{r-1}(d_0+1)}\right) ^{d_0+1}, \end{aligned}$$

since \(E(\cdot )\) is analytic and \(|E'(\cdot )|\le \varDelta \cdot 2\pi /q\). The second term can be bounded by O(1) and so it is negligible when \(d_0=O(1)\) and \(r=O(\log (q K))\) as described in Sect. 3.2. By combining the error bound as \(|e_j|\le O(B_\mathsf {rs})\) of the previous step, we deduce that the output ciphertexts of \(\textsc {EvalExp}\) encrypt \(E(t_j)=\varDelta \cdot \exp \left( \frac{2\pi it_j}{q}\right) \) in their plaintext slots with errors bounded by \(O(B_\mathsf {rs})\).

The imaginary part of \(E(t)\) is \(\frac{\varDelta \cdot 2\pi }{q}S(t)=\varDelta \cdot \sin \left( \frac{2\pi t}{q}\right) \), which is an approximation of \(\frac{\varDelta \cdot 2\pi }{q}m\) with an error bounded by \(O\left( q\cdot \frac{|m|^3}{q^3}\right) \) for \(m=[t]_q\). We set a small bound on an input plaintext (e.g. \(|m|\le q^{2/3}\)) so that an approximation error does not destroy the significant digits of the plaintext. Hence the \(\textsc {ImgExt}\) step does not change the magnitude of bootstrapping error.

Finally, in the \(\textsc {SlotToCoeff}\) step, the plaintext vectors are multiplied with the matrices \(U_0\) and \(U_1\), and their diagonal vectors have the size of one. The size of error in the resulting plaintext is bounded by \(O(N\cdot B_\mathsf {rs})\) since it is the sum of N errors of size \(O(B_\mathsf {rs})\). In practice, under the heuristic assumption that these errors behaves as independent Gaussian distributions, we get a reduced error bound as \(O(\sqrt{N}\cdot B_\mathsf {rs})\).

In summary, for an input ciphertext \(\mathsf {ct}\in {\mathcal R}_q^2\) satisfying \(\langle \mathsf {ct},\mathsf {sk}\rangle =m \pmod q\), our bootstrapping process returns a ciphertext \(\mathsf {ct}'\) such that \(\langle \mathsf {ct}',\mathsf {sk}\rangle =m+e \pmod {Q_1}\) for a modulus \(Q_1\gg q\) and some error e with \(\Vert {e}\Vert _\infty ^{\textsf {can}}\le O(\sqrt{N}\cdot B_\mathsf {rs})\). It consists of the initial/final linear transformations and the evaluation of the complex exponential function, so the total depth (number of levels consumed) for bootstrapping is \(O(\log (Kq))=O(\log \lambda )\). The linear transformations require and \(O(\sqrt{N})\) rotations, while the evaluation of the exponential function needs \(r=O(\log (Kq))=O(\log \lambda )\) homomorphic multiplications, which is linear with the depth of the decryption formula. As described above, the total complexity can be significantly reduced when we handle a sparsely packed ciphertext with (n / 2) slots: the linear transformations require \(O(\sqrt{n})\) rotations and the evaluation complexity of the exponential function is reduced by half.

6 Implementation

In this section, we suggest some parameter sets for our bootstrapping procedure with experimental results. Our implementation is based on the \({\texttt {HEAAN}}\) library [13] implementing the HE scheme of Cheon et al. [14]. The source code is publicly available at github [12].

6.1 Parameter Selection

We adapt the estimator of Albrecht at el. [1] to guarantee the concrete security of the proposed parameters. All the parameter sets achieve at least 80-bit security level against the known attacks of the LWE problem.

The key-switching keys have the largest modulus in the \({\texttt {HEAAN}}\) scheme and the modulus \(Q_0\) (after \(\textsc {ModRaise}\)) has half the bit size of the modulus. We use the discrete Gaussian distribution of standard deviation \(\sigma = 3.2\) to sample error polynomials and set the Hamming weight \(h=64\) of the secret key s(X). The parameters \(d_0=O(1)\) and \(r=O(\log q)\) were chosen asymptotically in the above sections, but in practice, we set the parameter experimentally based on the bootstrapping error. We take the degree 7 Taylor expansion as an initial approximation of the exponential function and choose a sufficiently large number of iterations r to maintain the precision of output plaintexts.

The parameter \(\log p\) is the bit size of plaintexts and the plaintext precision denotes the number of significant bits of plaintexts after bootstrapping. The before and after levels \(L_0, L_1\) are obtained by dividing \(\log {Q_0}\) and \(\log {Q_1}\) by \(\log {p}\), respectively. The whole parameter sets are described in Table 1.

Table 1. Parameter sets

We present some specific examples of input and output plaintexts. For simplicity, we show the real parts of the plaintext values in four slots. All the values are divided by a factor of p for a clear interpretation.

$$\begin{aligned} \begin{aligned} \texttt {Before bootstrapping :}&~\texttt {[0.777898 0.541580 0.603675 0.822638]} \\ \texttt {After bootstrapping :}&~\texttt {[0.777435 0.541021 0.603023 0.822321]} \end{aligned} \end{aligned}$$

It shows that the error vectors is bounded by \(2^{-11}\cdot p\), i.e., the bootstrapping procedure with the parameter set I outputs a ciphertext of 10 bits of precision.

$$\begin{aligned} \begin{aligned} \texttt {Before bootstrapping :}&~\texttt {[0.516015 0.772621 0.939175 0.345987]} \\ \texttt {After bootstrapping :}&~\texttt {[0.516027 0.772614 0.939172 0.346001]} \end{aligned} \end{aligned}$$

Similarly, the second example shows that the bootstrapping error with the parameter set II is bounded by \(2^{-16}\cdot p\) and the output plaintext has 15 bits of precision.

Table 2. Bootstrapping timings with parameter sets I to IV
Fig. 3.
figure 3

Tendency of real (left) and amortized (right) bootstrapping timings

6.2 Experimental Results

We show the performance of our bootstrapping procedure based on the proposed parameter sets. All the experimentations were performed as a single hyperthread on a 2.10 GHz Intel Xeon E5-2620. The experimental results are summarized in Table 2. The linear transformation includes the three steps - \(\textsc {PartialSum}\), \(\textsc {CoeffToSlot}\), and \(\textsc {SlotToCoeff}\). The amortized time is obtained by dividing the total bootstrapping time by the number of plaintext slots.

The linear transformations take a longer time as the number of plaintext slots grows while the complexity of the \(\textsc {EvalExp}\) step remains stable. Therefore, we can make a trade-off between the latency time and the amortized time by changing the number of slots. Figure 3 illustrates the trend of evaluation timings for each of the bootstrapping phases (parameter set I).

7 Conclusion

In this paper, we suggested a method to recrypt a ciphertext of the \({\texttt {HEAAN}}\) scheme. The performance of our bootstrapping procedure was significantly improved by adapting a trigonometric approximation of the modular reduction function. The linear transformation turns out to be the most time-consuming part, but we used almost the same method as in [28, 33]. It would be an interesting open problem to find an efficient algorithm to evaluate the linear transformations approximately.