
1 Introduction

A classical problem in cryptography is that of stretching the key length of a block cipher. Namely, from a block cipher E with block length n and key length k, we want to obtain a new one with key length \(k' > k\) which is more secure than E. The problem was naturally motivated by legacy designs – in particular, DES – with inherently too-short keys (e.g., 56 bits), and the desire to stretch this key length generically without resorting to designing a new cipher.

The common wisdom is that double encryption is not useful for key-stretching purposes. Here, by double encryption, we mean the construction that, given an n-bit plaintext M and two k-bit keys \(K_1, K_2\), outputs \(E_{K_1}(E_{K_2}(M))\). Indeed, there is a well-known meet-in-the-middle attack recovering the key with only marginally more than \(2^k\) operations given (very few) valid plaintext-ciphertext pairs. This weakness has led to the widespread deployment (which continues to date in some niche areas) of Triple-DES [1], as well as a number of works on analyzing the theory of triple and multiple encryption [7, 13,14,15,16,17, 20], and alternative constructions with extra whitening steps (and key material) [15, 16, 18,19,20].
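To make the meet-in-the-middle attack concrete, here is a minimal sketch on a toy ideal cipher. The parameters (k = 8, n = 10), the table-based cipher, and all function names are assumptions chosen for illustration and do not come from any real library. The attack tabulates the inner encryptions of one plaintext under every candidate \(K_2\), then matches them against the outer decryptions of the corresponding ciphertext under every candidate \(K_1\), for roughly \(2 \cdot 2^k\) cipher evaluations in total.

```python
import random

N_BITS = 10  # toy block length n (assumption, far too small for real use)
K_BITS = 8   # toy key length k (assumption)

def make_ideal_cipher(seed=0):
    """Sample an independent random permutation of the n-bit blocks for every key."""
    rng = random.Random(seed)
    enc, dec = [], []
    for _ in range(2 ** K_BITS):
        perm = list(range(2 ** N_BITS))
        rng.shuffle(perm)
        inv = [0] * len(perm)
        for x, y in enumerate(perm):
            inv[y] = x
        enc.append(perm)
        dec.append(inv)
    return enc, dec

def double_encrypt(enc, k1, k2, m):
    """Compute E_{K1}(E_{K2}(M)): K2 is applied first, K1 second."""
    return enc[k1][enc[k2][m]]

def meet_in_the_middle(enc, dec, pairs):
    """Return all key pairs (k1, k2) consistent with the plaintext-ciphertext pairs."""
    m0, c0 = pairs[0]
    # Forward half: middle value E_{k2}(m0) for every inner key k2.
    forward = {}
    for k2 in range(2 ** K_BITS):
        forward.setdefault(enc[k2][m0], []).append(k2)
    # Backward half: middle value D_{k1}(c0) for every outer key k1.
    candidates = []
    for k1 in range(2 ** K_BITS):
        middle = dec[k1][c0]
        for k2 in forward.get(middle, []):
            # Filter false positives with the remaining pairs.
            if all(double_encrypt(enc, k1, k2, m) == c for m, c in pairs[1:]):
                candidates.append((k1, k2))
    return candidates

if __name__ == "__main__":
    enc, dec = make_ideal_cipher(seed=1)
    secret = (37, 201)
    pairs = [(m, double_encrypt(enc, *secret, m)) for m in (3, 500, 777)]
    print(secret in meet_in_the_middle(enc, dec, pairs))
```

The true key pair always survives the filter. With three pairs the filter carries about 30 bits of constraint against a 16-bit key, so false positives are unlikely in this toy setting.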

In this paper, we revisit double encryption in the context of multi-user security, where we give tight bounds, and show that it constitutes a sound and simple method to mitigate multi-user attacks on block ciphers. This problem also serves as an application of a generic framework for proving good multi-user security bounds, which we hope to be of wider applicability.

Double Encryption in the Single User Setting. As in previous works, we study the security of double encryption in the ideal-cipher model as a (strong) pseudorandom permutation (PRP). The attacker A is given access to an ideal cipher E to which it can issue p forward or backward queries for any chosen key (these are usually referred to as “offline queries”), and up to q queries (in either direction) to \(E_{K_1} \circ E_{K_2}\) (for random secret keys \(K_1, K_2\)) or a truly random permutation on the n-bit strings (this being usually called “online queries”). The attacker’s goal is to decide which of the two it is accessing. In this model, Aiello et al. [2] proved that A’s distinguishing advantage satisfies

$$\begin{aligned} {\mathsf {Adv}}^{\mathrm {prp}}_{\mathrm {DE}[E]}(A) \le \left( \frac{p}{2^k}\right) ^2 \;. \end{aligned}$$
(1)

where \(\mathrm {DE}[E]\) denotes double encryption. Note that for single encryption, the bound is easily shown to be \({\mathsf {Adv}}^{\mathrm {prp}}_{E}(A) \le \frac{p}{2^k}\). Both advantages become non-negligible for the same \(p \approx 2^k\), although (1) is smaller when \(p \ll 2^k\).

The Multi-user Setting. In the multi-user (mu) setting, originally proposed by Bellare, Boldyreva, and Micali [5] for public-key encryption, the attacker can distribute its online queries adaptively across multiple independent key pairs (in the real world) or independent permutations (in the ideal world). A few recent block-cipher analyses [19, 24, 29] have focused on mu security, and the notion has established itself as a more realistic security target.

One expects security to degrade as the number of users increases, and this loss can be linear in the worst case. For example, for single-encryption, we do have

$$\begin{aligned} {\mathsf {Adv}}^{\pm \mathrm {mu}\text {-}\mathrm {prp}}_{E}(A) \le \frac{u \left( p + u \right) }{2^k} \le \frac{q \left( p + q \right) }{2^k}\;, \end{aligned}$$
(2)

where u is a bound on the number of users A queries, and this bound is tight, i.e., there is a matching attack [10]. Also, we can only guarantee \(u \le q\), as the attacker can decide to only issue one query per user. However, for double encryption, we can use a simple hybrid argument to show that

$$\begin{aligned} {\mathsf {Adv}}^{\pm \mathrm {mu}\text {-}\mathrm {prp}}_{\mathrm {DE}[E]}(A) \le u \left( \frac{p+2q}{2^k}\right) ^2 \le q \left( \frac{p+2q}{2^k}\right) ^2\;. \end{aligned}$$
(3)

This bound is already better than the one from (2): for instance, for roughly \(p = q = 2^{k/2}\), the bound from (3) is still \(O(2^{-k/2})\), whereas (2) gives \(\varOmega (1)\). However, contrary to the single-encryption case, it is not clear that the bound is tight. We will indeed show a much better bound.
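The comparison between (2) and (3) at \(p = q = 2^{k/2}\) can be checked with exact arithmetic. The sketch below evaluates both right-hand sides with \(u = q\); the function names and the choice \(k = 128\) are illustrative assumptions.

```python
from fractions import Fraction

def single_enc_mu_bound(p, q, k):
    """Right-hand side of (2) with u = q: q (p + q) / 2^k."""
    return Fraction(q * (p + q), 2 ** k)

def double_enc_hybrid_bound(p, q, k):
    """Right-hand side of (3) with u = q: q ((p + 2q) / 2^k)^2."""
    return q * Fraction(p + 2 * q, 2 ** k) ** 2

if __name__ == "__main__":
    k = 128
    p = q = 2 ** (k // 2)
    # (2) evaluates to 2, so it is vacuous at this point.
    print(single_enc_mu_bound(p, q, k))
    # (3) evaluates to 9 * 2^(-k/2), which is still small.
    print(float(double_enc_hybrid_bound(p, q, k)))
```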

Our Bounds. Our main result shows that the security of double encryption does not degrade substantially in the multi-user setting, and that the bound from (3) is overly pessimistic. In particular, we prove that

$$ {\mathsf {Adv}}^{\pm \mathrm {mu}\text {-}\mathrm {prp}}_{\mathrm {DE}[E]}(A) \le \frac{1}{2^n} + \frac{5q}{2^{k + n/2}} + \frac{6qB^2 + 222 B Q^2}{2^{2k}} $$

where \(Q = \max \{p, q\}\) and \(B = 5\max \{n + k/2, 2q/ 2^n\}\). This bound is rather cumbersome, but the key observation is that third-degree monomials in p and q all appear with denominator \(2^{2k + n}\), whereas any term with denominator \(2^{2k}\) is at most quadratic in p, q – very similar to the single-user case.
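Since the bound is cumbersome to evaluate by hand, a small helper may be useful. The sketch below transcribes the formula above term by term using exact rationals; the function name is hypothetical, and it assumes that n is even so that \(2^{k + n/2}\) is an integer.

```python
from fractions import Fraction

def de_mu_bound(p, q, n, k):
    """Evaluate 1/2^n + 5q/2^(k+n/2) + (6 q B^2 + 222 B Q^2)/2^(2k).

    Assumes n is even. Q = max{p, q} and B = 5 max{n + k/2, 2q/2^n}.
    """
    Q = max(p, q)
    B = 5 * max(Fraction(n) + Fraction(k, 2), Fraction(2 * q, 2 ** n))
    term_n = Fraction(1, 2 ** n)
    term_mixed = Fraction(5 * q, 2 ** (k + n // 2))
    term_2k = (6 * q * B ** 2 + 222 * B * Q ** 2) / Fraction(2 ** (2 * k))
    return term_n + term_mixed + term_2k

if __name__ == "__main__":
    # AES-128-like parameters with p = q = 2^64 (illustrative choice).
    print(float(de_mu_bound(2 ** 64, 2 ** 64, n=128, k=128)))
```

For these parameters the term \(222 B Q^2 / 2^{2k}\) dominates, which matches the observation that the terms with denominator \(2^{2k}\) are at most quadratic in p and q.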

Recall that the meet-in-the-middle attack on the single user security of double encryption succeeds with advantage \(p^2 / 2^{2k}\), and Biham’s key-collision attack [10] achieves advantage \(q^2 / 2^{2k}\). Therefore for the setting that \(n \ge k\) (such as DES or AES), our bound is tight. For the setting \(n \ll k\) (which occurs in Format-Preserving Encryption [6], and several block-cipher designs), finding matching attacks is difficult, and we leave it as an open problem. However, as an intermediate step, we note that most proofs are in models where the keys are revealed to the distinguisher at the end of the execution. In this model, we can give a matching attack (based on the meet-in-the-middle paradigm) that achieves distinguishing advantage

$$\begin{aligned} \max \{ \lfloor n / 8 \lg (n) \rfloor , q / 2^n\} \cdot \frac{p^2}{3 \cdot 2^{2k}} \;. \end{aligned}$$

We discuss attacks below in Sect. 6.

A Disclaimer. We stress that the common wisdom that there is no security increase is obviously still in place. However, the envisioned application is to ciphers whose key length is not an issue in the single-user setting, but becomes too short in a multi-user regime. For instance, a multi-user attack reduces the security of (single) AES128 to 64 bits. Our result shows that iterating AES128 twice substantially mitigates the impact of a multi-user attack, and that in fact we obtain almost optimal multi-user security, namely around 115 bits for a total key length of 256 bits. (Also see Fig. 2.)

Techniques. Our result is obtained using new techniques we introduce and that we believe to be of broad applicability in lifting existing analyses from the single-user (su) to the mu setting.

Hoang and Tessaro (HT) [19] already proposed a generic approach for this purpose. It is illustrative to briefly review it, and see why it fails for double encryption. HT’s idea is to show that the construction (e.g., double encryption) satisfies, in the su case, a property called point-wise proximity, a stronger property than indistinguishability, already used in previous works (e.g., in [9]). Concretely, this means that there exists a function \(\epsilon = \epsilon (p, q)\) of the query parameters p and q, such that for all transcripts \(\tau \) containing p offline and q online queries, we have

$$\begin{aligned} \mathsf {p}_{\mathbf {S}_{\mathrm {ideal}}}(\tau ) - \mathsf {p}_{\mathbf {S}_{\mathrm {real}}}(\tau ) \le \epsilon (p, q) \cdot \mathsf {p}_{\mathbf {S}_{\mathrm {ideal}}}(\tau ) , \end{aligned}$$
(4)

where \(\mathsf {p}_{\mathbf {S}_{\mathrm {ideal}}}(\tau )\) and \(\mathsf {p}_{\mathbf {S}_{\mathrm {real}}}(\tau )\) are the so-called ideal and real interpolation probabilities. Namely, they describe the probabilities that the ideal (\(\mathsf {p}_{\mathbf {S}_{\mathrm {ideal}}}\)) and the real (\(\mathsf {p}_{\mathbf {S}_{\mathrm {real}}}\)) worlds, respectively, behave consistently with the transcript when the queries it contains are asked in that order.

HT show that point-wise proximity is then also achieved in the multi-user experiment, with \(\epsilon (p, q)\) replaced by \(\epsilon (p + qt, q)\), where t is the number of calls made by the construction to the underlying primitive (in the case of double encryption, \(t = 2\)). This implies that the distinguishing advantage is also at most \(\epsilon (p + qt, q)\). For this argument to hold, however, \(\epsilon \) needs to be super-additive, i.e., \(\epsilon (x, y) + \epsilon (x, z) \le \epsilon (x, y + z)\), and moreover, \(\epsilon (\cdot , y)\) and \(\epsilon (x, \cdot )\) need to be non-decreasing functions for all \(x, y \in \mathbb {N}\). For double encryption, no such \(\epsilon \) can be established. For instance, the natural candidate \(\epsilon (p ,q) = \left( \frac{p}{2^k}\right) ^2\) is not super-additive, as \(\epsilon (x, y) + \epsilon (x, z) = 2\epsilon (x, y + z)\).
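The failure of super-additivity is easy to check numerically. The sketch below tests the condition \(\epsilon (x, y) + \epsilon (x, z) \le \epsilon (x, y + z)\) at a single point for the candidate \((p/2^k)^2\) and, for contrast, for a hypothetical bound that is linear in q. All names and the toy value \(k = 16\) are assumptions made for illustration.

```python
from fractions import Fraction

def eps_candidate(p, q, k=16):
    """Candidate single-user bound (p / 2^k)^2; note that it ignores q."""
    return Fraction(p, 2 ** k) ** 2

def eps_linear_in_q(p, q, k=16):
    """Hypothetical bound p q / 2^(2k), linear in q, shown only for contrast."""
    return Fraction(p * q, 2 ** (2 * k))

def is_super_additive_at(eps, x, y, z):
    """Check eps(x, y) + eps(x, z) <= eps(x, y + z) at one point."""
    return eps(x, y) + eps(x, z) <= eps(x, y + z)

if __name__ == "__main__":
    # The left-hand side is twice the right-hand side, so the check fails.
    print(is_super_additive_at(eps_candidate, 100, 3, 5))
    # For a bound linear in q both sides coincide, so the check passes.
    print(is_super_additive_at(eps_linear_in_q, 100, 3, 5))
```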

We take a different approach, by introducing a relaxed notion of almost proximity which, akin to the H-coefficient method (cf., e.g., [12, 26]), introduces a partition of the set of single-user transcripts into good and bad transcripts, with proximity guarantees shown only on the former. Our main technical insight is the introduction of a precise framework to mitigate the effects of the growth of the probability of a bad transcript when increasing the number of users. We dispense with a formal statement here, as the conditions are not concise, and refer the reader to Sect. 3. We note that we also provide simplifications of the framework in Sect. 4, one of which is in particular sufficient for analyzing double encryption. We finally apply it in Sect. 5.

Further Related Work. Multiple encryption has been studied also in the standard computational model, with respect to the question of how it amplifies (weak) PRP security. Luby and Rackoff [21] initially studied double encryption, and bounds for multiple encryption were later provided by Myers [25] and Tessaro [28].

Also, while above we have focused on block-cipher analyses, recent works have studied mu security in different contexts, in particular for authenticated encryption [8] and message-authentication codes [3, 4].

2 Preliminaries

Notation. For a finite set S, we let \(x \,{\leftarrow \!\!{\scriptscriptstyle {\$}}}\,S\) denote sampling uniformly from S and assigning the value to x. Let |x| denote the length of the string x, and for \(1 \le i < j \le |x|\), let x[i : j] denote the substring from the ith bit to the jth bit (inclusive) of x. If A is an algorithm, we let \(y \leftarrow A(x_1,\ldots ;r)\) denote running A with randomness r on inputs \(x_1,\ldots \) and assigning the output to y. We let \(y \,{\leftarrow \!\!{\scriptscriptstyle {\$}}}\,A(x_1,\ldots )\) be the result of picking r at random and letting \(y \leftarrow A(x_1,\ldots ;r)\).

Multi-user PRP Security of Blockciphers. Let \(\varPi : \mathcal {K}\times \{0,1\}^n \rightarrow \{0,1\}^n\) be a blockcipher, which is built on another blockcipher \(E: \{0,1\}^k \times \mathcal {M}\rightarrow \mathcal {M}\). We associate with \(\varPi \) a key-sampling algorithm Sample. Let A be an adversary. Define

$$ {\mathsf {Adv}}^{\pm \mathrm {mu}\text {-}\mathrm {prp}}_{\varPi [E], \mathrm {Sample}}(A) = \Pr [ \mathrm {Real}^{A}_{\varPi [E], \mathrm {Sample}} \Rightarrow 1] - \Pr [\mathrm {Rand}^{A}_{\varPi [E], \mathrm {Sample}} \Rightarrow 1] $$

where games \(\mathrm {Real}\) and \(\mathrm {Rand}\) are defined in Fig. 1. If Sample is the uniform sampling of \(\mathcal {K}\) then we only write \({\mathsf {Adv}}^{\pm \mathrm {mu}\text {-}\mathrm {prp}}_{\varPi [E]}(A)\).

Fig. 1. Games defining the multi-user security of a blockcipher \(\varPi : \mathcal {K}\times \{0,1\}^n \rightarrow \{0,1\}^n\). This blockcipher is based on another blockcipher \(E: \{0,1\}^k \times \{0,1\}^n \rightarrow \{0,1\}^n\). The game is associated with a key-sampling algorithm \(\mathrm {Sample}\). Here \(\mathrm {Perm}(\{0,1\}^n)\) denotes the set of all permutations on \(\{0,1\}^n\).

In the games above, we first use \(\mathrm {Sample}\) to sample keys \(K_1, K_2, \ldots \in \mathcal {K}\) for \(\varPi \), and independent, random permutations \(f_1, f_2, \ldots \) on \(\mathcal{M}\). The adversary is given four oracles \(\textsc {Prim}, \textsc {PrimInv}\), \(\textsc {Enc}\), and \(\textsc {Dec}\). In both games, the oracles \(\textsc {Prim}\) and \(\textsc {PrimInv}\) always give access to the primitive E and its inverse respectively. The \(\textsc {Enc}\) and \(\textsc {Dec}\) oracles give access to \(f_1(\cdot ), f_2(\cdot ), \ldots \) and their inverses respectively in game \(\mathrm {Rand}\), and access to \(\varPi [E](K_1, \cdot ), \varPi [E](K_2, \cdot ), \ldots \) and their inverses in game \(\mathrm {Real}\). The adversary finally needs to output a bit to tell which game it is interacting with.

Single and Double Encryption. Let \(k, n \in \mathbb {N}\) and let \(E: \{0,1\}^k \times \{0,1\}^n \rightarrow \{0,1\}^n\) be a blockcipher. The Single Encryption of E is the blockcipher E itself. The Double Encryption \(\mathrm {DE}[E]\) of E is a blockcipher with keyspace \((\{0,1\}^k)^2\) and message space \(\{0,1\}^n\). On key \(K = (J_1, J_2)\) and message \(x \in \{0,1\}^n\), \(\mathrm {DE}_K[E](x)\) returns \(E_{J_2}( E_{J_{1}}(x))\).
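The games and the double-encryption construction can be sketched in code as follows. This is only an illustrative reading of the prose description above, not a transcription of Fig. 1: the class and method names are hypothetical, keys and permutations are sampled lazily rather than up front (which yields the same distribution), and Sample is taken to be uniform over pairs of k-bit keys.

```python
import random

class LazyPerm:
    """A uniformly random permutation on {0, ..., size - 1}, sampled lazily."""
    def __init__(self, size, rng):
        self.size, self.rng = size, rng
        self.fwd, self.bwd = {}, {}

    def forward(self, x):
        if x not in self.fwd:
            y = self.rng.randrange(self.size)
            while y in self.bwd:          # reject outputs that are already used
                y = self.rng.randrange(self.size)
            self.fwd[x], self.bwd[y] = y, x
        return self.fwd[x]

    def backward(self, y):
        if y not in self.bwd:
            x = self.rng.randrange(self.size)
            while x in self.fwd:          # reject inputs that are already used
                x = self.rng.randrange(self.size)
            self.fwd[x], self.bwd[y] = y, x
        return self.bwd[y]

class MuPrpGame:
    """Game Real (real=True) or Rand (real=False) for double encryption."""
    def __init__(self, n_bits, k_bits, real, seed=None):
        self.rng = random.Random(seed)
        self.size, self.num_keys = 2 ** n_bits, 2 ** k_bits
        self.real = real
        self.cipher = {}      # key J -> permutation E_J of the ideal cipher
        self.user_keys = {}   # user i -> key K_i = (J1, J2)
        self.user_perms = {}  # user i -> random permutation f_i

    def _E(self, key):
        return self.cipher.setdefault(key, LazyPerm(self.size, self.rng))

    def _key(self, i):
        if i not in self.user_keys:
            self.user_keys[i] = (self.rng.randrange(self.num_keys),
                                 self.rng.randrange(self.num_keys))
        return self.user_keys[i]

    def _f(self, i):
        return self.user_perms.setdefault(i, LazyPerm(self.size, self.rng))

    def prim(self, key, x):
        return self._E(key).forward(x)

    def prim_inv(self, key, y):
        return self._E(key).backward(y)

    def enc(self, i, x):
        if self.real:
            j1, j2 = self._key(i)
            return self.prim(j2, self.prim(j1, x))   # E_{J2}(E_{J1}(x))
        return self._f(i).forward(x)

    def dec(self, i, y):
        if self.real:
            j1, j2 = self._key(i)
            return self.prim_inv(j1, self.prim_inv(j2, y))
        return self._f(i).backward(y)
```

An adversary would interact with `MuPrpGame(n_bits, k_bits, real=b)` for a hidden bit b through the four oracle methods and then output its guess for b.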

Systems and Transcripts. Following the notation from [19] (which was in turn inspired by Maurer’s framework [22]), it is convenient to consider interactions of a distinguisher A with an abstract system \(\mathbf {S}\) which answers A’s queries. The resulting interaction then generates a transcript \(\tau = ((X_1, Y_1), \ldots , (X_q, Y_q))\) of query-answer pairs. It is well known that \(\mathbf {S}\) is entirely described by the probabilities \(\mathsf {p}_{\mathbf {S}}(\tau )\) that if we make queries in \(\tau \) to system \(\mathbf {S}\), we will receive the answers as indicated in \(\tau \). We say in particular that \(\mathbf {S}\) is stateless if \(\mathsf {p}_{\mathbf {S}}(\tau )\) is invariant under permuting the order of the input-output pairs it contains.

We will generally describe systems informally, or more formally in terms of a set of oracles they provide, and only use the fact that they define corresponding probabilities \(\mathsf {p}_{\mathbf {S}}(\tau )\) without explicitly giving these probabilities.

The Expectation Method. In this paper, we shall use the expectation method of Hoang and Tessaro [19]. For a pair of systems \(\mathbf {S}_{\mathrm {real}}\) and \(\mathbf {S}_{\mathrm {ideal}}\), this method aims to bound the gap \(\mathsf {p}_{\mathbf {S}_{\mathrm {ideal}}}(\tau ) - \mathsf {p}_{\mathbf {S}_{\mathrm {real}}}(\tau )\), for a fixed (su) transcript \(\tau \) such that \(\mathsf {p}_{\mathbf {S}_{\mathrm {ideal}}}(\tau ) > 0\). Under this method, one extends the transcript with a random variable \(S\). In \(\mathbf {S}_{\mathrm {real}}\), this \(S\) is often a part of the key; suppose that it has marginal distribution \(\mu \). In \(\mathbf {S}_{\mathrm {ideal}}\), we pick \(S\) with the same marginal distribution \(\mu \), but independently of \(\tau \). Let \(\mathsf {p}_{\mathbf {S}_{\mathrm {real}}}(\tau , s)\) denote the probability that \(\mathbf {S}_{\mathrm {real}}\) behaves according to \(\tau \), and \(S\) agrees with s. Let \(\mathsf {p}_{\mathbf {S}_{\mathrm {ideal}}}(\tau , s)\) denote the probability that \(\mathbf {S}_{\mathrm {ideal}}\) behaves according to \(\tau \), and \(S\,{\leftarrow \!\!{\scriptscriptstyle {\$}}}\,\mu \) agrees with s. Under the expectation method, one partitions the range of \(S\) into two sets, \(\varGamma _{\mathrm {good}}\) and \(\varGamma _{\mathrm {bad}}\). For s such that \(\mathsf {p}_{\mathbf {S}_{\mathrm {ideal}}}(\tau , s) > 0\), if \(s \in \varGamma _{\mathrm {bad}}\) then we say that s is bad; otherwise s is good. We write \(\Pr [S\in \varGamma _{\mathrm {bad}}]\) to denote the probability that \(S\,{\leftarrow \!\!{\scriptscriptstyle {\$}}}\,\mu \), sampled independently of \(\tau \), is bad. Hoang and Tessaro give the following result.

Lemma 1 (The expectation method) [19]. Fix a su transcript \(\tau \) such that \(\mathsf {p}_{\mathbf {S}_{\mathrm {ideal}}}(\tau )> 0\). Assume that there is a partition \(\varGamma _{\mathrm {good}}\) and \(\varGamma _{\mathrm {bad}}\) of the range \(\mathcal {U}\) of \(S\), as well as a function \(g: \mathcal {U}\rightarrow [0, \infty )\) such that \(\Pr [S\in \varGamma _{\mathrm {bad}}] \le \delta \) and for all \(s \in \varGamma _{\mathrm {good}}\),

$$ 1 - \frac{\mathsf {p}_{\mathbf {S}_{\mathrm {real}}}(\tau , s)}{\mathsf {p}_{\mathbf {S}_{\mathrm {ideal}}}(\tau , s)} \le g(s) . $$

Then

$$ \mathsf {p}_{\mathbf {S}_{\mathrm {ideal}}}(\tau ) - \mathsf {p}_{\mathbf {S}_{\mathrm {real}}}(\tau ) \le (\delta + \mathbf {E}[g(S)]) \cdot \mathsf {p}_{\mathbf {S}_{\mathrm {ideal}}}(\tau ) . \;\;\;\;\;\; $$

   \(\square \)

Note that in Lemma 1, the expectation is taken over all possible (good or bad) values of \(S\).
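The argument behind Lemma 1 is short. The following sketch is reconstructed from the definitions above rather than quoted from [19]. Because \(S\) is sampled from \(\mu \) independently of \(\tau \) in the ideal system, we have \(\mathsf {p}_{\mathbf {S}_{\mathrm {ideal}}}(\tau , s) = \Pr [S = s] \cdot \mathsf {p}_{\mathbf {S}_{\mathrm {ideal}}}(\tau )\). For good s the gap is at most \(g(s) \cdot \mathsf {p}_{\mathbf {S}_{\mathrm {ideal}}}(\tau , s)\) by assumption, and for bad s it is trivially at most \(\mathsf {p}_{\mathbf {S}_{\mathrm {ideal}}}(\tau , s)\). Since \(g \ge 0\), extending the sum of \(g(s) \Pr [S = s]\) from good s to all s can only increase it, which is why the expectation ranges over all values of \(S\):

$$\begin{aligned} \mathsf {p}_{\mathbf {S}_{\mathrm {ideal}}}(\tau ) - \mathsf {p}_{\mathbf {S}_{\mathrm {real}}}(\tau ) &= \sum _{s} \bigl ( \mathsf {p}_{\mathbf {S}_{\mathrm {ideal}}}(\tau , s) - \mathsf {p}_{\mathbf {S}_{\mathrm {real}}}(\tau , s) \bigr ) \\ &\le \sum _{s \in \varGamma _{\mathrm {bad}}} \mathsf {p}_{\mathbf {S}_{\mathrm {ideal}}}(\tau , s) + \sum _{s \in \varGamma _{\mathrm {good}}} g(s) \cdot \mathsf {p}_{\mathbf {S}_{\mathrm {ideal}}}(\tau , s) \\ &\le \bigl ( \Pr [S \in \varGamma _{\mathrm {bad}}] + \mathbf {E}[g(S)] \bigr ) \cdot \mathsf {p}_{\mathbf {S}_{\mathrm {ideal}}}(\tau ) \\ &\le (\delta + \mathbf {E}[g(S)]) \cdot \mathsf {p}_{\mathbf {S}_{\mathrm {ideal}}}(\tau ) . \end{aligned}$$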

3 A Generic Method to Bound Multi-user Security

In this section we present a generic method to prove information-theoretic mu security bounds, based (mostly) on upper bounding single-user quantities. The framework is very general, and in fact generalizes the approach by Hoang and Tessaro [19] based on pointwise proximity.

The Generic Setting. We consider two (stateless) systems \(\mathbf {S}_{\mathrm {real}}\) and \(\mathbf {S}_{\mathrm {ideal}}\), called the real and ideal systems, respectively. Each of these two systems can be invoked via two oracles \(\textsc {Cons}\) and \(\textsc {Prim}\), allowing for construction and primitive queries, respectively. First off, \(\textsc {Prim}\) gives access to an ideal primitive (for example, an ideal cipher, a random function or permutation), whereas \(\textsc {Cons}\)’s role depends on the context, but always answers queries of the form (i, X), where i is the index of a user and X is the query for that user. More specifically:

  1. In \(\mathbf {S}_{\mathrm {real}}\), the oracle \(\textsc {Cons}\) upon a query (i, X) invokes a construction \(\varPi \) which makes calls to \(\textsc {Prim}\), and additionally depends on some local, initially chosen randomness (or key) \(K_i\). That is, the output is \(\varPi ^{\textsc {Prim}}(K_i, X)\).

  2. In \(\mathbf {S}_{\mathrm {ideal}}\), the oracle \(\textsc {Cons}\) samples independent functions \(f_1, f_2, \ldots \) from some distribution, and answers a query (i, X) as \(f_i(X)\).

For example, the game from Fig. 1 can be described as suitable systems \(\mathbf {S}_{\mathrm {real}}\) and \(\mathbf {S}_{\mathrm {ideal}}\): We would simply handle inversion queries (to \(\textsc {Dec}\) and \(\textsc {PrimInv}\)) by specifying the direction of the query in the input given to \(\textsc {Cons}\) and \(\textsc {Prim}\), i.e., \(X = (+, x)\) or \(X = (-, y)\). Also, we can model more complex scenarios, like the security of authenticated encryption schemes, as long as we can map the security notion to suitable \(\mathbf {S}_{\mathrm {real}}\) and \(\mathbf {S}_{\mathrm {ideal}}\).

We generally assume that there exists a metric of data complexity \(\sigma \) associated with the queries made to \(\textsc {Cons}\). For instance, if \(\textsc {Cons}\) takes variable-length inputs, \(\sigma \) could be the number of bits queried to it, whereas if the input length is fixed, it could just be the number of queries. We assume that there exists a parameter t such that, when answering multiple queries with overall data complexity \(\sigma \), \(\varPi \) makes at most \(t \cdot \sigma \) queries to \(\textsc {Prim}\).

The Distinguishing Problem. For any adversary A and a system \(\mathbf {S}\), we let \(\mathrm {Script}(A, \mathbf {S})\) denote the random variable for the transcript of the interaction of A and \(\mathbf {S}\). Recall that the advantage of the adversary in distinguishing two systems \(\mathbf {S}_{\mathrm {real}}\) and \(\mathbf {S}_{\mathrm {ideal}}\) is at most the statistical distance between the distributions of the adversary’s transcript in the real and ideal games, which is

$$\begin{aligned} {\mathsf {Adv}}^{\mathrm {dist}}_{\mathbf {S}_{\mathrm {real}}, \mathbf {S}_{\mathrm {ideal}}}(A) \le \sum _{\tau } \max \{0, \mathsf {p}_{\mathbf {S}_{\mathrm {ideal}}}(\tau ) - \mathsf {p}_{\mathbf {S}_{\mathrm {real}}}(\tau )\}, \end{aligned}$$
(5)

where the sum is taken over all \(\tau \) such that \(\Pr [\mathrm {Script}(A, \mathbf {S}_{\mathrm {ideal}}) = \tau ] > 0\).

Note that there might be some context-dependent constraints on the adversary’s queries. For example, if part of the inputs to \(\textsc {Cons}\) include nonces to a nonce-based authenticated encryption, then one might require that the nonces will not repeat. This is easy to handle, since it will only restrict the set of valid transcripts to be considered. We will usually capture the complexity of A in terms of the number of \(\textsc {Prim}\) queries, p, the number of \(\textsc {Cons}\) queries q, and the overall data complexity \(\sigma \) for the queries made to \(\textsc {Cons}\). A security bound \(\epsilon \) is then viewed as a function \(\epsilon (p, q, \sigma )\). We say that a function \(\epsilon (\cdot , \cdot , \cdot ): \mathbb {N}^3 \rightarrow [0, 1]\) is monotonic if \(\epsilon (\cdot , y, z)\), \(\epsilon (x, \cdot ,z )\), and \(\epsilon (x, y, \cdot )\) are increasing functions, for any \(x, y, z \in \mathbb {N}\). Often security bounds are monotonic functions, since increasing the adversary’s resources can only help it.

Almost Proximity. We now establish a condition on \(\mathbf {S}_{\mathrm {real}}\) that we call almost proximity, which will allow us to establish mu security from a number of functions, \(\delta _0, \delta _1\) and \(\delta _2\), that we define next. In particular, some of these functions (\(\delta _1\) and \(\delta _2\)) are defined with respect to single-user (su) transcripts, i.e., transcripts where all queries to \(\textsc {Cons}\) are of the form (i, X) for one single i.

One begins by defining a context-dependent, undesirable property on su transcripts that we call bad, and if a su transcript is not bad then it is good. We partition in particular the set of bad transcripts into two sets, \(\mathcal {S}\) and \(\mathcal {S}'\). In many cases (such as our Double Encryption application below), one of the two sets \(\mathcal {S}\) and \(\mathcal {S}'\) is simply the empty set, but we envision more general application scenarios.

Further, we will assume that there exists a function \(\mathrm {Rate}\) such that for any good su transcript \(\tau \),

$$\mathsf {p}_{\mathbf {S}_{\mathrm {ideal}}}(\tau ) - \mathsf {p}_{\mathbf {S}_{\mathrm {real}}}(\tau ) \le \mathrm {Rate}(\tau ) \cdot \mathsf {p}_{\mathbf {S}_{\mathrm {ideal}}}(\tau )\;,$$

where \(\mathrm {Rate}\) is in particular an increasing function mapping a transcript to a number in [0, 1], meaning that for any transcripts \(\tau \) and \(\tau '\) such that \(\tau '\) contains all the query-answer pairs of \(\tau \) (possibly in a different order), we have \(\mathrm {Rate}(\tau ') \ge \mathrm {Rate}(\tau )\).

Then, we also assume that there is a monotonic function \(\delta _2\) such that for any adversary B attacking a single user via p \(\textsc {Prim}\) queries and q \(\textsc {Cons}\) queries with overall data complexity \(\sigma \), we have

$$ \Pr [\mathrm {Script}(B, \mathbf {S}_{\mathrm {ideal}}) \in \mathcal {S}] \le \delta _2(p, q,\sigma ) . $$

Note that the bound above is with respect to the ideal system, \(\mathbf {S}_{\mathrm {ideal}}\), and thus often easy to compute.

We also define another, context-dependent, desired property on mu transcripts that we call nice — we let \(\mathcal {N}\) be the set of all nice mu transcripts. (We stress that niceness is with respect to mu transcripts, whereas being good/bad is only with respect to su ones.) The notion of niceness involves only the \(\textsc {Cons}\) query-answer pairs: for any two transcripts \(\tau \) and \(\tau '\) that have the same \(\textsc {Cons}\) query-answer pairs (possibly in different orders), if \(\tau \in \mathcal {N}\) then so is \(\tau '\). Also, for a mu transcript \(\tau \) involving queries to exactly r users, and for each \(i \in \{1, \ldots , r\}\), let \(\mathrm {Map}(i, \tau )\) denote the su transcript obtained by deleting the \(\textsc {Cons}(j, \cdot )\) queries and answers for any \(j \ne i\). We require the following conditions:

  • For any transcript \(\tau \in \mathcal {N}\) and all i, \(\mathrm {Map}(i, \tau ) \not \in \mathcal {S}'\).

  • There is a monotonic function \(\delta _0\) such that for any mu adversary A making \(p\, \textsc {Prim}\) queries, \(q\, \textsc {Cons}\) queries, and data complexity \(\sigma \),

    $$ \Pr [\mathrm {Script}(A, \mathbf {S}_{\mathrm {ideal}}) \not \in \mathcal {N}] \le \delta _0(p, q, \sigma ) . $$
  • There is a monotonic function \(\delta _1\) such that for any \(\tau \in \mathcal {N}\) of r users that contains p \(\textsc {Prim}\), q \(\textsc {Cons}\) queries of total data complexity at most \(\sigma \),

    $$\begin{aligned} \sum _{i = 1}^r \mathrm {Rate}(\mathrm {Map}(i, \tau )) \le \delta _1(p, q, \sigma ) . \end{aligned}$$
    (6)

    We refer to this last property as mu-boundedness.

We refer to the existence of suitable functions \(\delta _0, \delta _1, \delta _2\) for corresponding \(\mathrm {Rate}\), \(\mathrm {Map}\), \(\mathcal {S}\), \(\mathcal {S}'\) and \(\mathcal {N}\) as meeting the almost-proximity conditions.

Mu Security via Almost Proximity. The following result bounds the mu advantage in distinguishing \(\mathbf {S}_{\mathrm {real}}\) and \(\mathbf {S}_{\mathrm {ideal}}\), granted the almost-proximity conditions defined above are met.

Lemma 2 (Mu-security via almost proximity). Assume that the almost-proximity conditions above are met, for some \(\delta _2\), \(\delta _0\) and \(\delta _1\). Then for any adversary A that makes at most q \(\textsc {Cons}\) queries of total data complexity \(\sigma \), and p \(\textsc {Prim}\) queries, we have

$$ {\mathsf {Adv}}^{\mathrm {dist}}_{\mathbf {S}_{\mathrm {real}}, \mathbf {S}_{\mathrm {ideal}}}(A) \le \delta _0(p, q, \sigma ) + 2\delta _1(p + t \sigma , q, \sigma ) + 2q \cdot \delta _2(p + t\sigma , q, \sigma ). $$
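To illustrate how the three functions combine, the following sketch evaluates the right-hand side of Lemma 2 for caller-supplied \(\delta _0, \delta _1, \delta _2\). The function name and the toy \(\delta \) functions in the usage example are hypothetical placeholders, not the functions derived later for double encryption. Note that \(\delta _1\) and \(\delta _2\) are evaluated at \(p + t\sigma \) primitive queries, whereas \(\delta _0\) is evaluated at p.

```python
from fractions import Fraction

def mu_advantage_bound(delta0, delta1, delta2, p, q, sigma, t):
    """Evaluate delta0(p, q, s) + 2 delta1(p + t s, q, s) + 2 q delta2(p + t s, q, s)."""
    p_eff = p + t * sigma  # primitive queries, including those made by the construction
    return (delta0(p, q, sigma)
            + 2 * delta1(p_eff, q, sigma)
            + 2 * q * delta2(p_eff, q, sigma))

if __name__ == "__main__":
    # Toy placeholder functions for a hypothetical 20-bit key (assumptions).
    d0 = lambda p, q, s: Fraction(q, 2 ** 20)
    d1 = lambda p, q, s: Fraction(p * p, 2 ** 40)
    d2 = lambda p, q, s: Fraction(0)
    print(mu_advantage_bound(d0, d1, d2, p=10, q=4, sigma=4, t=2))
```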

Discussion. A meaningful question is why we need to separate the set of bad su transcripts into \(\mathcal {S}\) and \(\mathcal {S}'\). The reason is that, when we move from the su to the mu setting, under our method, the term \(\delta _2\) will blow up to \(q \delta _2\), which is similar to the hybrid argument. To avoid an inferior mu bound, we would like to minimize the term \(\delta _2\) as much as possible, by carving out \(\mathcal {S}'\) from the set of bad su transcripts. Due to the requirement that \(\mathrm {Map}(i, \tau ) \not \in \mathcal {S}'\) for every nice mu transcript \(\tau \) and every i, the set \(\mathcal {S}'\) and the notion of niceness need to be chosen in tandem to minimize \(q \delta _2 + \delta _0(p, q, \sigma )\). Bounding \(\Pr [\mathrm {Script}(A, \mathbf {S}_{\mathrm {ideal}}) \not \in \mathcal {N}]\) requires working directly in the mu setting, but recall that we are in the ideal game, which is often simple to deal with.

Proof (of Lemma 2). Since we consider a computationally unbounded adversary, without loss of generality, assume that the adversary is deterministic. For simplicity, from this point, we will write \(\delta _2\) and \(\delta _1\) instead of \(\delta _2(p + t\sigma , q, \sigma )\) and \(\delta _1(p + t \sigma , q, \sigma )\). Without loss of generality, assume that \(\delta _1 < 1/2\); otherwise the claimed bound in the statement of this lemma is moot. We also assume that the adversary’s transcript involves at most r users.

Restricting to Nice Transcripts. Recall that in the ideal system, the probability that the adversary A can produce a mu transcript that is not nice is at most \(\delta _0(p, q, \sigma )\). From Eq. (5), what is left is to show that

$$\begin{aligned} \sum _{\tau } \mathsf {p}_{\mathbf {S}_{\mathrm {ideal}}}(\tau ) - \mathsf {p}_{\mathbf {S}_{\mathrm {real}}}(\tau ) \le 2\delta _1 + 2q \delta _2, \end{aligned}$$
(7)

where the sum in the left hand side is taken over all nice transcripts \(\tau \) in the support \(\mathrm {supp}(\mathrm {Script}(A, \mathbf {S}_{\mathrm {ideal}}))\) of \(\mathrm {Script}(A, \mathbf {S}_{\mathrm {ideal}})\) such that \(\mathsf {p}_{\mathbf {S}_{\mathrm {ideal}}}(\tau ) > \mathsf {p}_{\mathbf {S}_{\mathrm {real}}}(\tau )\). Below, when we talk about a valid transcript \(\tau \), this means that \(\tau \) meets the constraint above.

Building Hybrids. For each \(i \in \{0, \ldots , r\}\), consider the hybrid system \(\mathbf {S}_i\) that provides the interface compatible with the real and ideal systems, but queries for user \(u_j\) are answered via the actual construction \(\varPi ^{\textsc {Prim}}(K_j, \cdot )\) for \(j > i\), and via an independent, perfect simulation of the \(\textsc {Cons}(j, \cdot )\) oracle of the ideal game if \(j \le i\). Then \(\mathbf {S}_0 = \mathbf {S}_{\mathrm {real}}\) and \(\mathbf {S}_r = \mathbf {S}_{\mathrm {ideal}}\) and thus for any valid transcript \(\tau \),

$$\begin{aligned} \mathsf {p}_{\mathbf {S}_{\mathrm {ideal}}}(\tau ) - \mathsf {p}_{\mathbf {S}_{\mathrm {real}}}(\tau ) = \sum _{i = 1}^r \mathsf {p}_{\mathbf {S}_i}(\tau ) - \mathsf {p}_{\mathbf {S}_{i - 1}}(\tau ) . \end{aligned}$$
(8)
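The hybrid systems can be pictured with a small sketch, specialized to double encryption for concreteness. It makes several simplifying assumptions: only forward queries are modeled, the parameters are tiny, all randomness is precomputed from a fixed seed so that different hybrids share the same cipher, keys, and ideal permutations, and every name is hypothetical. In system \(\mathbf {S}_i\), users \(j \le i\) are answered ideally and users \(j > i\) via the construction.

```python
import random

def random_perm(size, rng):
    """Return a uniformly random permutation of range(size) as a lookup table."""
    perm = list(range(size))
    rng.shuffle(perm)
    return perm

class HybridSystem:
    """Hybrid S_i for r users: ideal answers for j <= i, construction for j > i."""
    def __init__(self, i, r, n_bits=6, k_bits=4, seed=0):
        rng = random.Random(seed)
        size, num_keys = 2 ** n_bits, 2 ** k_bits
        self.i = i
        # Ideal cipher: one random permutation per key.
        self.cipher = [random_perm(size, rng) for _ in range(num_keys)]
        # Construction keys K_j = (J1, J2) for every user j.
        self.keys = {j: (rng.randrange(num_keys), rng.randrange(num_keys))
                     for j in range(1, r + 1)}
        # Independent random permutations f_j for every user j.
        self.ideal = {j: random_perm(size, rng) for j in range(1, r + 1)}

    def prim(self, key, x):
        return self.cipher[key][x]

    def cons(self, j, x):
        if j > self.i:
            j1, j2 = self.keys[j]
            return self.prim(j2, self.prim(j1, x))  # real: E_{J2}(E_{J1}(x))
        return self.ideal[j][x]                     # ideal: f_j(x)
```

With this convention `HybridSystem(0, r)` behaves like the real system and `HybridSystem(r, r)` like the ideal one, and consecutive hybrids differ only in how a single user is answered, which is what makes the telescoping sum in Eq. (8) useful.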

Let \(B_i\) be the following hybrid su adversary. It samples key \(K_j\) for \(\varPi ^{\textsc {Prim}}\) for every \(i < j \le r\), and then runs A. Queries for user \(u_j\) are answered via \(\varPi ^{\textsc {Prim}}(K_j, \cdot )\) if \(j > i\), and via the \(\textsc {Cons}(1, \cdot )\) oracle of \(B_i\) if \(j = i\), and via an independent, perfect simulation of the \(\textsc {Cons}(j, \cdot )\) oracle of the ideal game if \(j < i\). In other words, adversary \(B_i\) simulates system \(\mathbf {S}_{i - 1}\) in its su real game, and simulates system \(\mathbf {S}_{i}\) in its su ideal game. It makes at most q \(\textsc {Cons}\) queries of total data complexity \(\sigma \) and at most \(p + t\sigma \) \(\textsc {Prim}\) queries.

Reducing to Transcript-Wise Gap. Fix a valid transcript \(\tau \). Let \(\mathcal {T}(i, \tau )\) denote the set of extended transcripts of \(B_i\) in its su ideal game that are enhanced with the simulated \(\textsc {Cons}\) queries and answers as well as the simulated keys \(K_j\), such that the corresponding simulated transcript for A is \(\tau \). For each \(\tau _i \in \mathcal {T}(i, \tau )\), let \(\mathrm {Tr}(\tau _i)\) be the transcript of \(B_i\) derived from \(\tau _i\). For \(\mathbf {S}\in \{\mathbf {S}_{\mathrm {real}}, \mathbf {S}_{\mathrm {ideal}}\}\), let \(\mathsf {p}_{\mathbf {S}}(\tau _i)\) denote the probability that, when \(B_i\) interacts with \(\mathbf {S}\), its enhanced transcript is \(\tau _i\). Note that compared to \(\mathrm {Tr}(\tau _i)\), the additional information \(\tau _i\) contains is the keys \(K_j\), and the queries/answers on the simulated oracle \(\textsc {Cons}(j, \cdot )\) of the ideal game for users \(j < i\). Since this information is independent of \(\mathbf {S}_{\mathrm {real}}\) and \(\mathbf {S}_{\mathrm {ideal}}\),

$$\begin{aligned} \frac{\mathsf {p}_{\mathbf {S}_{\mathrm {real}}}(\tau _i)}{\mathsf {p}_{\mathbf {S}_{\mathrm {ideal}}}(\tau _i)} = \frac{\mathsf {p}_{\mathbf {S}_{\mathrm {real}}}(\mathrm {Tr}(\tau _i))}{\mathsf {p}_{\mathbf {S}_{\mathrm {ideal}}}(\mathrm {Tr}(\tau _i))} . \end{aligned}$$
(9)

Let \(\mathcal {S}_i\) be the set of extended transcripts \(\tau _i\) of \(B_i\) such that \(\mathrm {Tr}(\tau _i) \in \mathcal {S}\). We claim that

$$\begin{aligned} \mathsf {p}_{\mathbf {S}_{\mathrm {ideal}}}(\tau ) - \mathsf {p}_{\mathbf {S}_{\mathrm {real}}}(\tau )\le & {} 2 \Bigl ( \sum _{i = 1}^r \sum _{\tau _i \in \mathcal {T}(i, \tau ) \cap \mathcal {S}_i} \mathsf {p}_{\mathbf {S}_{\mathrm {ideal}}}(\tau _i) \Bigr ) + 2 \delta _1 \sum _{\tau _1 \in \mathcal {T}(1, \tau )} \mathsf {p}_{\mathbf {S}_{\mathrm {real}}}(\tau _1) \nonumber \\= & {} 2 \Bigl ( \sum _{i = 1}^r \sum _{\tau _i \in \mathcal {T}(i, \tau ) \cap \mathcal {S}_i} \mathsf {p}_{\mathbf {S}_{\mathrm {ideal}}}(\tau _i) \Bigr ) + 2\delta _1 \cdot \mathsf {p}_{\mathbf {S}_{\mathrm {real}}}(\tau ) \nonumber \\\le & {} 2 \Bigl ( \sum _{i = 1}^r \sum _{\tau _i \in \mathcal {T}(i, \tau ) \cap \mathcal {S}_i} \mathsf {p}_{\mathbf {S}_{\mathrm {ideal}}}(\tau _i) \Bigr ) + 2\delta _1 \cdot \mathsf {p}_{\mathbf {S}_{\mathrm {ideal}}}(\tau ), \end{aligned}$$
(10)

where the last inequality is due to the assumption that \(\tau \) is valid. This claim will be justified later. By summing both sides of Eq. (10) over all valid \(\tau \), we can bound the left-hand side of Eq. (7) by

$$ 2 \Bigl ( \sum _{i = 1}^r \Pr [\mathrm {Script}(B_i, \mathbf {S}_{\mathrm {ideal}}) \in \mathcal {S}] \Bigr ) + 2\delta _1 \le 2 q \cdot \delta _2 + 2\delta _1 $$

which is the right-hand side of Eq. (7). To justify Eq. (10), note that

$$ \mathsf {p}_{\mathbf {S}_{\mathrm {ideal}}}(\tau ) - \mathsf {p}_{\mathbf {S}_{\mathrm {real}}}(\tau ) = \sum _{i = 1}^r \mathsf {p}_{\mathbf {S}_i}(\tau ) - \mathsf {p}_{\mathbf {S}_{i - 1}}(\tau ) . $$

Moreover, for each \(i \le r\),

$$ \mathsf {p}_{\mathbf {S}_i}(\tau ) = \sum _{\tau _i \in \mathcal {T}(i, \tau )} \mathsf {p}_{\mathbf {S}_{\mathrm {ideal}}}(\tau _i), $$

whereas

$$ \mathsf {p}_{\mathbf {S}_{i - 1}}(\tau ) \ge \sum _{\tau _i \in \mathcal {T}(i, \tau )} \mathsf {p}_{\mathbf {S}_{\mathrm {real}}}(\tau _i), $$

because (a) the left-hand side is the chance that adversary \(B_i\) in its real world (recall that the real world of \(B_i\) is the ideal world of \(B_{i - 1}\)) can generate \(\tau \), which is \(\sum _{\tau '} \mathsf {p}_{\mathbf {S}_{\mathrm {real}}}(\tau ')\) over all enhanced transcripts \(\tau '\) that \(B_i\) can witness such that the corresponding transcript for A is \(\tau \), and (b) the right-hand side is \(\sum _{\tau '} \mathsf {p}_{\mathbf {S}_{\mathrm {real}}}(\tau ')\) over some (but possibly not all) such \(\tau '\). Hence

$$\begin{aligned}&\mathsf {p}_{\mathbf {S}_{\mathrm {ideal}}}(\tau ) - \mathsf {p}_{\mathbf {S}_{\mathrm {real}}}(\tau ) \le \sum _{i = 1}^r \sum _{\tau _i \in \mathcal {T}(i, \tau )} \mathsf {p}_{\mathbf {S}_{\mathrm {ideal}}}(\tau _i) - \mathsf {p}_{\mathbf {S}_{\mathrm {real}}}(\tau _i) \\= & {} \Bigl (\sum _{i = 1}^r \sum _{\tau _i \in \mathcal {T}(i, \tau ) \cap \mathcal {S}_i} \mathsf {p}_{\mathbf {S}_{\mathrm {ideal}}}(\tau _i) \!- \!\mathsf {p}_{\mathbf {S}_{\mathrm {real}}}(\tau _i) \Bigr ) + \sum _{i = 1}^r \sum _{\tau _i \in \mathcal {T}(i, \tau ) \backslash \mathcal {S}_i} \mathsf {p}_{\mathbf {S}_{\mathrm {ideal}}}(\tau _i) - \mathsf {p}_{\mathbf {S}_{\mathrm {real}}}(\tau _i) \\\le & {} \Bigl (\sum _{i = 1}^r \sum _{\tau _i \in \mathcal {T}(i, \tau ) \cap \mathcal {S}_i} \mathsf {p}_{\mathbf {S}_{\mathrm {ideal}}}(\tau _i) \Bigr ) + \sum _{i = 1}^r \sum _{\tau _i \in \mathcal {T}(i, \tau ) \backslash \mathcal {S}_i} \mathsf {p}_{\mathbf {S}_{\mathrm {ideal}}}(\tau _i) - \mathsf {p}_{\mathbf {S}_{\mathrm {real}}}(\tau _i) . \end{aligned}$$

What is left is to prove that

$$\begin{aligned}&\sum _{i = 1}^r \sum _{\tau _i \in \mathcal {T}(i, \tau ) \backslash \mathcal {S}_i} \mathsf {p}_{\mathbf {S}_{\mathrm {ideal}}}(\tau _i) - \mathsf {p}_{\mathbf {S}_{\mathrm {real}}}(\tau _i) \nonumber \\\le & {} \Bigl ( \sum _{i = 1}^r \sum _{\tau _i \in \mathcal {T}(i, \tau ) \cap \mathcal {S}_i} \mathsf {p}_{\mathbf {S}_{\mathrm {ideal}}}(\tau _i) \Bigr ) + 2 \delta _1 \sum _{\tau _1 \in \mathcal {T}(1, \tau )} \mathsf {p}_{\mathbf {S}_{\mathrm {real}}}(\tau _1) . \end{aligned}$$
(11)

Now, recall that for each \(\tau _i \in \mathcal {T}(i, \tau ) \backslash \mathcal {S}_i\), the su transcript \(\mathrm {Tr}(\tau _i)\) is good. Since the two systems satisfy the almost proximity condition,

$$ \mathsf {p}_{\mathbf {S}_{\mathrm {ideal}}}(\mathrm {Tr}(\tau _i)) - \mathsf {p}_{\mathbf {S}_{\mathrm {real}}}(\mathrm {Tr}(\tau _i)) \le \mathrm {Rate}(\mathrm {Tr}(\tau _i)) \cdot \mathsf {p}_{\mathbf {S}_{\mathrm {ideal}}}(\mathrm {Tr}(\tau _i)) . $$

Recall that from Eq. (9), the ratio between \(\mathsf {p}_{\mathbf {S}_{\mathrm {ideal}}}(\mathrm {Tr}(\tau _i))\) and \(\mathsf {p}_{\mathbf {S}_{\mathrm {real}}}(\mathrm {Tr}(\tau _i))\) is exactly that between \(\mathsf {p}_{\mathbf {S}_{\mathrm {ideal}}}(\tau _i)\) and \(\mathsf {p}_{\mathbf {S}_{\mathrm {real}}}(\tau _i)\). Then

$$\begin{aligned} \mathsf {p}_{\mathbf {S}_{\mathrm {ideal}}}(\tau _i) - \mathsf {p}_{\mathbf {S}_{\mathrm {real}}}(\tau _i) \le \mathrm {Rate}(\mathrm {Tr}(\tau _i)) \cdot \mathsf {p}_{\mathbf {S}_{\mathrm {ideal}}}(\tau _i) . \end{aligned}$$
(12)

This in turn implies that

$$\begin{aligned} \mathsf {p}_{\mathbf {S}_{\mathrm {ideal}}}(\tau _i) \le \frac{\mathsf {p}_{\mathbf {S}_{\mathrm {real}}}(\tau _i)}{1 - \mathrm {Rate}(\mathrm {Tr}(\tau _i)) } . \end{aligned}$$
(13)

To justify that the denominator of the right-hand side is nonzero so that Eq. (13) above is well-defined, let \(\tau '\) be the mu transcript that has the same \(\textsc {Cons}\) queries/answers as \(\tau \), and the same \(\textsc {Prim}\) queries/answers as \(\tau _i\). Since \(\tau \) is nice, so is \(\tau '\). Thus, \(1 - \mathrm {Rate}(\mathrm {Tr}(\tau _i)) = 1 - \mathrm {Rate}(\mathrm {Map}(i, \tau ')) \ge 1 - \delta _1 > 0\). From Eq. (12), to justify Eq. (11), we need to bound each sum

$$ \sum _{\tau _i \in \mathcal {T}(i, \tau ) \backslash \mathcal {S}_i} \mathrm {Rate}(\mathrm {Tr}(\tau _i)) \cdot \mathsf {p}_{\mathbf {S}_{\mathrm {ideal}}}(\tau _i), $$

for every \(i \in \{1, \ldots , r\}\). For \(\ell \le i\), define \(\mathrm {Rate}(i, \tau _{\ell })\) as follows. Let \(\tau '\) be the su transcript induced by \(\tau _\ell \) in which we only keep \(\textsc {Cons}\) queries/answers for user \(u_i\), and all \(\textsc {Prim}\) queries/answers. Let \(\mathrm {Rate}(i, \tau _\ell ) = \mathrm {Rate}(\tau ')\). The special case \(\mathrm {Rate}(i, \tau _i)\) coincides with \(\mathrm {Rate}(\mathrm {Tr}(\tau _i))\). We claim that for each i, the sum above is at most

$$\begin{aligned} \sum _{\tau _1 \in \mathcal {T}(1, \tau )} 2\mathrm {Rate}(i, \tau _{1})\cdot \mathsf {p}_{\mathbf {S}_{\mathrm {real}}}(\tau _1) + \sum _{s = 1}^i \sum _{\tau _s \in \mathcal {T}(s, \tau ) \cap \mathcal {S}_s} 2\mathrm {Rate}(i, \tau _{s}) \cdot \mathsf {p}_{\mathbf {S}_{\mathrm {ideal}}}(\tau _s) . \end{aligned}$$
(14)

Note that for any \(s \ge 1\) and any \(\tau _s \in \mathcal {T}(s, \tau )\), if we let \(\tau '\) be the mu transcript that has the same \(\textsc {Cons}\) queries/answers as \(\tau \), and the same \(\textsc {Prim}\) queries/answers as \(\tau _s\), then \(\tau '\) is also nice, because \(\tau \) is nice. Then

$$\begin{aligned} \sum _{i = s}^r \mathrm {Rate}(i, \tau _{s}) = \sum _{i = s}^r \mathrm {Rate}(\mathrm {Map}(i, \tau ')) \le \delta _1. \end{aligned}$$
(15)

From Eq. (15),

$$\begin{aligned} \sum _{i = 1}^r \sum _{\tau _1 \in \mathcal {T}(1, \tau )} 2\mathrm {Rate}(i, \tau _{1})\cdot \mathsf {p}_{\mathbf {S}_{\mathrm {real}}}(\tau _1) \le \sum _{\tau _1 \in \mathcal {T}(1, \tau )} 2 \delta _1 \cdot \mathsf {p}_{\mathbf {S}_{\mathrm {real}}}(\tau _1), \end{aligned}$$
(16)

and

$$\begin{aligned}&\sum _{i = 1}^r \sum _{s = 1}^i \sum _{\tau _s \in \mathcal {T}(s, \tau ) \cap \mathcal {S}_s} 2\mathrm {Rate}(i, \tau _{s}) \cdot \mathsf {p}_{\mathbf {S}_{\mathrm {ideal}}}(\tau _s) \nonumber \\= & {} \sum _{s = 1}^r \sum _{\tau _s \in \mathcal {T}(s, \tau ) \cap \mathcal {S}_s} \sum _{i = s}^r 2\mathrm {Rate}(i, \tau _{s}) \cdot \mathsf {p}_{\mathbf {S}_{\mathrm {ideal}}}(\tau _s) \end{aligned}$$
(17)
$$\begin{aligned}\le & {} \sum _{s = 1}^r \sum _{\tau _s \in \mathcal {T}(s, \tau ) \cap \mathcal {S}_s} 2\delta _1 \cdot \mathsf {p}_{\mathbf {S}_{\mathrm {ideal}}}(\tau _s) \le \sum _{s = 1}^r \sum _{\tau _s \in \mathcal {T}(s, \tau ) \cap \mathcal {S}_s} \mathsf {p}_{\mathbf {S}_{\mathrm {ideal}}}(\tau _s) . \end{aligned}$$
(18)

Combining Eqs. (12), (14), (16), and (18) gives us Eq. (11).

To justify Eq. (14), fix \(i \in \{1, \ldots , r\}\). We create a binary tree whose weight at the root is exactly the sum above for i. In this tree, for any two children of a node, the left one must be a leaf node. Moreover, we will put weights on the nodes so that the weight of a parent node is bounded by the sum of the weights of its children. Hence the weight at the root is bounded by the total weight of the leaves.

Starting at the root, from Eq. (13), we can bound the weight at the root by a linear combination of \(\mathsf {p}_{\mathbf {S}_{\mathrm {real}}}(\tau _i)\), where \(\tau _i \in \mathcal {T}(i, \tau ) \backslash \mathcal {S}_i\). For each such \(\tau _i\), if we enhance it with the key of user \(u_i\) and the internal \(\textsc {Prim}\) queries/answers due to the \(\textsc {Cons}\) queries of user \(u_i\) then we will get an extended transcript \(\tau _{i - 1}\) for adversary \(B_{i - 1}\). (Recall that the real world of \(B_i\) is the ideal world of \(B_{i - 1}\).) Hence the linear combination of \(\mathsf {p}_{\mathbf {S}_{\mathrm {real}}}(\tau _i)\) becomes a linear combination of \(\mathsf {p}_{\mathbf {S}_{\mathrm {ideal}}}(\tau _{i - 1})\), for \(\tau _{i - 1} \in \mathcal {T}(i - 1, \tau )\). We divide this into two parts, one for \(\tau _{i - 1} \in \mathcal {S}_{i - 1}\), and another for \(\tau _{i - 1} \not \in \mathcal {S}_{i - 1}\). The first partial sum will be the weight of the left child of the root, and the second the weight of the right child. So far, we have placed the weights up to the second level of the tree. We will repeat the process above, starting at the right child of the root, until we reach the i-th level. At that point, the weight of the right-most leaf is a linear combination of \(\mathsf {p}_{\mathbf {S}_{\mathrm {ideal}}}(\tau _{1})\), for \(\tau _{1} \in \mathcal {T}(1, \tau )\).

Recall that the weight of each node of the binary tree above is a linear combination. We now specify the coefficients. At the root, each coefficient for \(\mathsf {p}_{\mathbf {S}_{\mathrm {ideal}}}(\tau _i)\) is \(\mathrm {Rate}(i, \tau _i)\). We will have to bound \(\mathsf {p}_{\mathbf {S}_{\mathrm {ideal}}}(\tau _i)\) via \(\mathsf {p}_{\mathbf {S}_{\mathrm {real}}}(\tau _i)\) by Eq. (13), so the coefficients for the left and right children of the root are at most

$$ \frac{\mathrm {Rate}(i, \tau _i)}{1 - \mathrm {Rate}(i, \tau _i)} \le \frac{\mathrm {Rate}(i, \tau _{i - 1})}{1 - \mathrm {Rate}(i, \tau _{i - 1})}, $$

where the inequality is due to the fact that \(x \mapsto x / (1 - x)\) is increasing on \([0, 1)\), that \(\mathrm {Rate}\) is increasing, and that \(\tau _{i - 1}\) contains all queries/answers of \(\tau _i\), and thus \(\mathrm {Rate}(i, \tau _{i - 1}) \ge \mathrm {Rate}(i, \tau _i)\). By repeating this process, for nodes at the \((i + 1 - s)\)-th level, the coefficients are at most

$$ \frac{\mathrm {Rate}(i, \tau _{s})}{\prod _{\ell = s + 1}^{i} \bigl (1 - \mathrm {Rate}(\ell , \tau _{s})\bigr )} . $$

Now, for the right-most leaf, its weight is currently a linear combination of \(\mathsf {p}_{\mathbf {S}_{\mathrm {ideal}}}(\tau _1)\), but we want to have its weight as a linear combination of \(\mathsf {p}_{\mathbf {S}_{\mathrm {real}}}(\tau _1)\) instead. To achieve this, we will again use Eq. (13) (with i replaced by 1), and the new coefficients for this leaf are at most

$$ \frac{\mathrm {Rate}(i, \tau _{1})}{\prod _{\ell = 1}^{i} \bigl (1 - \mathrm {Rate}(\ell , \tau _{1})\bigr )} . $$

Hence the coefficients for a leaf at the \((i + 1 - s)\)-th level of the tree are at most

$$ \frac{\mathrm {Rate}(i, \tau _{s})}{\prod _{\ell = s}^{i} \bigl (1 - \mathrm {Rate}(\ell , \tau _{s})\bigr )} \le \frac{\mathrm {Rate}(i, \tau _{s})}{1 - \sum _{\ell = s}^i \mathrm {Rate}(\ell , \tau _{s})} \le \frac{\mathrm {Rate}(i, \tau _{s})}{1 - \delta _1} \le 2\mathrm {Rate}(i, \tau _{s}), $$

where the first inequality is due to the fact that \((1 - x)(1 - y) \ge 1 - x - y\) for any \(0 \le x, y < 1\), and the second inequality is due to Eq. (15). The total weight of the leaves therefore is at most

$$\begin{aligned}&\sum _{\tau _1 \in \mathcal {T}(1, \tau )} 2\mathrm {Rate}(i, \tau _{1})\cdot \mathsf {p}_{\mathbf {S}_{\mathrm {real}}}(\tau _1) + \sum _{s = 1}^i \sum _{\tau _s \in \mathcal {T}(s, \tau ) \cap \mathcal {S}_s} 2\mathrm {Rate}(i, \tau _{s}) \cdot \mathsf {p}_{\mathbf {S}_{\mathrm {ideal}}}(\tau _s) . \end{aligned}$$

This concludes the proof.    \(\square \)
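The last chain of inequalities in the proof can be checked numerically. The following is a minimal sketch with exact rational arithmetic; the function name `check_leaf_coefficient_chain` and the sample rates are illustrative assumptions, and the final step assumes \(\delta _1 \le 1/2\), which the bound \(1/(1 - \delta _1) \le 2\) requires.

```python
from fractions import Fraction
from math import prod


def check_leaf_coefficient_chain(rates, rate_i, delta1):
    """Check rate_i / prod(1 - r) <= rate_i / (1 - sum r) <= rate_i / (1 - delta1) <= 2 * rate_i.

    `rates` plays the role of Rate(l, tau_s) for l = s..i, and `rate_i` of Rate(i, tau_s).
    Assumes sum(rates) <= delta1 <= 1/2 (the role of Eq. (15) plus the assumption on delta1).
    """
    assert all(0 <= r < 1 for r in rates)
    assert sum(rates) <= delta1 <= Fraction(1, 2)
    exact = rate_i / prod(1 - r for r in rates)   # coefficient before simplification
    via_sum = rate_i / (1 - sum(rates))           # uses (1 - x)(1 - y) >= 1 - x - y
    via_delta = rate_i / (1 - delta1)             # uses sum(rates) <= delta1
    return exact <= via_sum <= via_delta <= 2 * rate_i


# Illustrative values only.
sample_rates = [Fraction(1, 100), Fraction(3, 100), Fraction(1, 20)]
assert check_leaf_coefficient_chain(sample_rates, sample_rates[-1], Fraction(1, 10))
```

Using `Fraction` avoids floating-point rounding, so each comparison is an exact instance of the corresponding inequality.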

4 Simplification of the Framework for Specific Settings

Since the framework in Sect. 3 aims to provide an umbrella for all settings, it appears unnecessarily complex in many important settings. To improve the usability of our framework, in this section, we consider some simplified treatments of our general framework for specific settings. Each such specialized result is somewhat more limited in scope, but simpler to use.

4.1 A Simple Specialization of the Framework

We now describe a specialization of the framework that is very simple, but might be powerful enough for typical real-world cryptographic schemes, such as the authenticated encryption scheme GCM [23]. This simple treatment however is not enough for Double Encryption, and thus in the next subsection, we will consider another specialized result of the general framework to handle Double Encryption.

The Setting. Here we still use the generic setting as stated in Sect. 3, but make an assumption on the metric \(\sigma \). For a mu transcript \(\tau \) and each user \(u_i\) of \(\tau \), let \(\mathrm {Map}(i, \tau )\) be the induced su transcript for user \(u_i\) that consists of the \(\textsc {Cons}(i, \cdot )\) queries/answers and \(\textsc {Prim}(\cdot )\) queries/answers of \(\tau \). We require that for any mu transcript \(\tau \), if the \(\textsc {Cons}\) queries in \(\tau \) have data complexity \(\sigma \), and those in each \(\mathrm {Map}(i, \tau )\) have data complexity \(\sigma _i\), then

$$ \sum _{i} \sigma _i \le \sigma . $$

This requirement obviously holds if we let, for example, \(\sigma \) be the total length of the \(\textsc {Cons}\) queries.

Super-Additivity. For a function \(\delta : (\mathbb {N})^3 \rightarrow [0, 1]\), we say that it is super-additive if

$$ \delta (x, y_0, z_0) + \delta (x, y_1, z_1) \le \delta (x, y_0 + y_1, z_0 + z_1) $$

for every \(x, y_0, y_1, z_0, z_1 \in \mathbb {N}\). In many schemes, the desired bounds (such as \(\delta (p, q, \sigma ) = \sigma ^2 / 2^n\)) are often super-additive.
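As a quick sanity check of this definition, the sketch below tests super-additivity by brute force on a small grid. The helper `is_super_additive`, the toy block length `N_BITS`, and both sample functions are assumptions made for illustration; passing the check on a finite grid is evidence, not a proof.

```python
from fractions import Fraction
from itertools import product

N_BITS = 16  # toy block length n, chosen only to keep numbers small


def is_super_additive(delta, xs, ys, zs):
    """Check delta(x, y0, z0) + delta(x, y1, z1) <= delta(x, y0 + y1, z0 + z1) on a grid."""
    for x in xs:
        for y0, y1 in product(ys, repeat=2):
            for z0, z1 in product(zs, repeat=2):
                if delta(x, y0, z0) + delta(x, y1, z1) > delta(x, y0 + y1, z0 + z1):
                    return False
    return True


def birthday_bound(p, q, sigma):
    """The example from the text: sigma^2 / 2^n."""
    return Fraction(sigma * sigma, 2 ** N_BITS)


def constant_bound(p, q, sigma):
    """A positive constant is monotonic but not super-additive."""
    return Fraction(1, 2 ** N_BITS)


assert is_super_additive(birthday_bound, range(3), range(4), range(5))
assert not is_super_additive(constant_bound, range(3), range(4), range(5))
```

The birthday-type term passes because \(\sigma _0^2 + \sigma _1^2 \le (\sigma _0 + \sigma _1)^2\) for non-negative values, while a positive constant fails because it is counted twice on the left-hand side.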

The Technique. One begins by defining an undesirable property on su transcripts that involves only \(\textsc {Cons}\) queries/answers. If a su transcript has this property then we say that it is bad; otherwise it is good (see Footnote 1). A mu transcript \(\tau \) is nice if there is no user \(u_i\) such that its induced su transcript \(\mathrm {Map}(i, \tau )\) is bad. Let \(\mathcal {N}\) be the set of nice mu transcripts. We require that there be a monotonic function \(\delta \) such that for any adversary A making p \(\textsc {Prim}\) queries and q \(\textsc {Cons}\) queries of data complexity \(\sigma \),

$$\begin{aligned} \Pr [\mathrm {Script}(A, \mathbf {S}_{\mathrm {ideal}}) \not \in \mathcal {N}] \le \delta (p, q, \sigma ), \end{aligned}$$
(19)

where for any system \(\mathbf {S}\), \(\mathrm {Script}(A, \mathbf {S})\) denotes the random variable for the transcript of the interaction of A and \(\mathbf {S}\). Moreover, we require that there be a monotonic function \(\epsilon '\) and a super-additive, monotonic function \(\epsilon \) such that for any good su transcript \(\tau \) of p \(\textsc {Prim}\) queries and q \(\textsc {Cons}\) queries of data complexity \(\sigma \),

$$\begin{aligned} \mathsf {p}_{\mathbf {S}_{\mathrm {ideal}}}(\tau ) - \mathsf {p}_{\mathbf {S}_{\mathrm {real}}}(\tau ) \le (\epsilon (p, q, \sigma ) + \epsilon '(p, q, \sigma )) \cdot \mathsf {p}_{\mathbf {S}_{\mathrm {ideal}}}(\tau ) . \end{aligned}$$
(20)

Lemma 3

Assume that the systems \(\mathbf {S}_{\mathrm {real}}\) and \(\mathbf {S}_{\mathrm {ideal}}\) meet the conditions in Eqs. (19) and (20). Then

$$ {\mathsf {Adv}}^{\mathrm {dist}}_{\mathbf {S}_{\mathrm {real}}, \mathbf {S}_{\mathrm {ideal}}}(A) \le \delta (p, q, \sigma ) + 2\epsilon (p + t \sigma , q, \sigma ) + 2q \cdot \epsilon '(p + t\sigma , q, \sigma ) . $$

Proof

For a su transcript \(\tau \) of p \(\textsc {Prim}\) queries and q \(\textsc {Cons}\) queries of data complexity \(\sigma \), let

$$ \mathrm {Rate}(\tau ) = \epsilon (p, q, \sigma ) + \epsilon '(p, q, \sigma ) . $$

This function \(\mathrm {Rate}\) is increasing, in the sense that if \(\tau '\) contains all the query-answer pairs of \(\tau \), then \(\mathrm {Rate}(\tau ') \ge \mathrm {Rate}(\tau )\). To use Lemma 2, we need to establish the mu-boundedness of \(\mathrm {Rate}\). We claim that for any nice mu transcript \(\tau \) of r users, using p \(\textsc {Prim}\) queries and q \(\textsc {Cons}\) queries of data complexity \(\sigma \),

$$ \sum _{i = 1}^r \mathrm {Rate}(\mathrm {Map}(i, \tau )) \le \epsilon (p, q, \sigma ) + q \cdot \epsilon '(p, q, \sigma ) . $$

To justify this, suppose that \(\mathrm {Map}(i, \tau )\) contains \(q_i\) \(\textsc {Cons}\) queries of data complexity \(\sigma _i\). Then

$$\begin{aligned} \sum _{i = 1}^r \mathrm {Rate}(\mathrm {Map}(i, \tau ))= & {} \sum _{i = 1}^r \epsilon (p, q_i, \sigma _i) + \epsilon '(p, q_i, \sigma _i) \\\le & {} \sum _{i = 1}^r \epsilon (p, q_i, \sigma _i) + \epsilon '(p, q, \sigma ) \\\le & {} \epsilon (p, q, \sigma ) + r \cdot \epsilon '(p, q, \sigma ) \le \epsilon (p, q, \sigma ) + q \cdot \epsilon '(p, q, \sigma ) . \end{aligned}$$

Finally, applying Lemma 2 for \(\delta _0 = \delta \), \(\delta _1 = \epsilon + q\epsilon '\), and \(\delta _2 = 0\), leads to the claimed advantage.    \(\square \)
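The accounting in this proof can be illustrated on toy numbers. In the sketch below, `eps` and `eps_prime` are hypothetical stand-ins (a super-additive birthday-type term and a merely monotonic term), and each user is described by a pair \((q_i, \sigma _i)\); none of these concrete choices comes from the paper.

```python
from fractions import Fraction

N_BITS = 32  # toy block length, an assumption for illustration


def eps(p, q, sigma):
    """Hypothetical super-additive, monotonic term."""
    return Fraction(sigma * sigma, 2 ** N_BITS)


def eps_prime(p, q, sigma):
    """Hypothetical monotonic (but not super-additive) term."""
    return Fraction(p + q, 2 ** N_BITS)


def sum_of_rates(p, per_user):
    """Left-hand side: sum over users of Rate(Map(i, tau))."""
    return sum(eps(p, qi, si) + eps_prime(p, qi, si) for qi, si in per_user)


def mu_rate_bound(p, per_user):
    """Right-hand side: eps(p, q, sigma) + q * eps'(p, q, sigma), taking sigma = sum of sigma_i."""
    q = sum(qi for qi, _ in per_user)
    sigma = sum(si for _, si in per_user)
    return eps(p, q, sigma) + q * eps_prime(p, q, sigma)


users = [(2, 5), (1, 3), (4, 9)]  # (q_i, sigma_i) per user, each with at least one query
assert sum_of_rates(10, users) <= mu_rate_bound(10, users)
```

The super-additive part is paid for once, whereas the monotonic part is paid for once per user, which is where the factor \(r \le q\) comes from.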

4.2 The Specialized Framework for Double Encryption and Beyond

We now specialize the general framework into a more specific result that covers the cases of Single Encryption, Double Encryption, and the Key-Alternating Cipher (KAC) [11]. This result explains why these constructions, despite being somewhat similar in structure, incur different blowups when we move from the su setting to the mu setting.

The Setting. Let \(\varPi [E]: \mathcal {K}\times \{0,1\}^n \rightarrow \{0,1\}^n\) be a blockcipher construction built on top of an ideal blockcipher \(E: \{0,1\}^k \times \{0,1\}^n \rightarrow \{0,1\}^n\) such that a single call to \(\varPi /\varPi ^{-1}\) makes at most t calls to \(E/E^{-1}\). Let \(\mathbf {S}_{\mathrm {real}}\) and \(\mathbf {S}_{\mathrm {ideal}}\) be two stateless systems implementing games \(\mathrm {Real}_{\varPi [E], \mathrm {Sample}}\) and \(\mathrm {Rand}_{\varPi [E], \mathrm {Sample}}\) in Fig. 1, respectively. We will measure adversaries’ resources in terms of q (the number of \(\textsc {Enc}/\textsc {Dec}\) queries) and p (the number of \(\textsc {Prim}/\textsc {PrimInv}\) queries). A transcript recording the interaction between an adversary and a system \(\mathbf {S}\in \{\mathbf {S}_{\mathrm {ideal}}, \mathbf {S}_{\mathrm {real}}\}\) contains the following:

  • \(\textsc {Enc}/\textsc {Dec}\) queries: A query to \(\textsc {Enc}(i, x)\) returning y is associated with an entry \((\mathsf {enc}, +, i, x, y)\). Likewise, a query to \(\textsc {Dec}(i, y)\) returning x is associated with an entry \((\mathsf {enc}, -, i, x, y)\).

  • \(\textsc {Prim}/\textsc {PrimInv}\) queries: A query to \(\textsc {Prim}(J, u)\) returning v is recorded in the transcript as \((\mathtt {prim}, +, J, u, v)\). Likewise, a query to \(\textsc {PrimInv}(J, v)\) returning u is associated with an entry \((\mathtt {prim}, -, J, u, v)\).

Super-Additivity and Beyond. For a function \(\delta : (\mathbb {N})^2 \rightarrow [0, 1]\), we say that it is super-additive if \(\delta (x, y) + \delta (x, z) \le \delta (x, y + z)\), for every \(x, y, z \in \mathbb {N}\). For real numbers \(M > 0\) and \(z \ge 0\), let \(\mathrm {Cost}(M, z) = \max \{M, z\}\) if \(z > 1\), and \(\mathrm {Cost}(M, z) = M / \lg (M)\) if \(z \le 1\).
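The function \(\mathrm {Cost}\) translates directly into code. The sketch below assumes that \(\lg \) denotes the base-2 logarithm and that \(M > 1\) whenever the second branch is taken (otherwise \(\lg (M)\) is zero or negative); the name `cost` is ours.

```python
from math import log2


def cost(M, z):
    """Cost(M, z) = max{M, z} if z > 1, and M / lg(M) if z <= 1 (lg assumed to be log base 2)."""
    if M <= 0 or z < 0:
        raise ValueError("Cost is defined for M > 0 and z >= 0")
    if z > 1:
        return max(M, z)
    if M <= 1:
        raise ValueError("the branch z <= 1 needs M > 1 so that lg(M) > 0")
    return M / log2(M)


few_queries = cost(512, 0.5)      # z <= 1: logarithmic saving over M
many_queries = cost(512, 1024.0)  # z > 1: the larger of M and z
```

In the lemmas that follow, the second argument has the form \(8q / 2^n\), so the first branch is only taken once the number of queries is comparable to \(2^n\).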

The Technique. One begins by defining an undesirable property on su transcripts, which can involve both \(\textsc {Enc}/\textsc {Dec}\) and \(\textsc {Prim}/\textsc {PrimInv}\) queries/answers. If a su transcript has this property, we say that it is bad; otherwise it is good. Let \(\mathcal {S}\) be the set of all bad su transcripts (see Footnote 2). We demand that there be a monotonic function \(\epsilon ^*\) such that for any su adversary A that makes at most \(q\, \textsc {Enc}/\textsc {Dec}\) queries and \(p\, \textsc {Prim}/\textsc {PrimInv}\) queries,

$$\begin{aligned} \Pr [\mathrm {Script}(A, \mathbf {S}_{\mathrm {ideal}}) \in \mathcal {S}] \le \epsilon ^*(p, q) \end{aligned}$$
(21)

where for any system \(\mathbf {S}\), \(\mathrm {Script}(A, \mathbf {S})\) denotes the random variable for the transcript of the interaction of A and \(\mathbf {S}\).

For any transcript \(\tau \) in which the adversary attacks just a single user, let \(\mathrm {Ent}(\tau )\) be the number of entries \((\mathtt {prim}, \cdot , \cdot , u, v)\) such that \(\tau \) contains either an entry \((\mathsf {enc}, +, 1, \cdot , x)\) or an entry \((\mathsf {enc}, -, 1, x, \cdot )\), for some \(x \in \{u, v\}\). Suppose that there are monotonic functions \(\epsilon ', \epsilon ''\) and a monotonic, super-additive function \(\epsilon \) such that, for any good su transcript of q queries to \(\textsc {Enc}/\textsc {Dec}\), and p queries to \(\textsc {Prim}/\textsc {PrimInv}\),

$$\begin{aligned} \mathsf {p}_{\mathbf {S}_{\mathrm {ideal}}}(\tau ) - \mathsf {p}_{\mathbf {S}_{\mathrm {real}}}(\tau ) \le (\epsilon (p, q) + \epsilon '(p, q) \cdot \mathrm {Ent}(\tau ) + \epsilon ''(p, q)) \cdot \mathsf {p}_{\mathbf {S}_{\mathrm {ideal}}}(\tau ) . \end{aligned}$$
(22)

If Eqs. (21) and (22) are met, then we say that \(\varPi [E]\) has the \((\epsilon , \epsilon ', \epsilon '', \epsilon ^*)\)-proximity property.
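To make the quantity \(\mathrm {Ent}(\tau )\) concrete, the sketch below computes it for a toy su transcript. It assumes a hypothetical 5-tuple encoding `(kind, sign, index, left, right)` of entries and a transcript that contains \(\textsc {Enc}/\textsc {Dec}\) entries of a single user only; the name `ent` is ours.

```python
def ent(tau):
    """Number of prim entries (prim, ., ., u, v) with u or v equal to the answer-side value
    of some enc entry: the y of an (enc, +, 1, ., y) entry or the x of an (enc, -, 1, x, .) entry."""
    forward_outputs = {e[4] for e in tau if e[0] == "enc" and e[1] == "+"}
    backward_outputs = {e[3] for e in tau if e[0] == "enc" and e[1] == "-"}
    touched = forward_outputs | backward_outputs
    return sum(1 for e in tau if e[0] == "prim" and (e[3] in touched or e[4] in touched))


toy_transcript = [
    ("enc", "+", 1, 0, 10),   # Enc(1, 0) returned 10, so 10 is touched
    ("enc", "-", 1, 3, 11),   # Dec(1, 11) returned 3, so 3 is touched
    ("prim", "+", 7, 0, 5),   # neither 0 nor 5 is touched
    ("prim", "-", 7, 6, 10),  # v = 10 is touched
    ("prim", "+", 2, 3, 9),   # u = 3 is touched
    ("prim", "+", 4, 8, 8),   # not touched
]
assert ent(toy_transcript) == 2
```

Only values returned by the construction oracle matter here: a prim entry is counted when one of its endpoints coincides with such a returned value.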

Note that \(\mathrm {Ent}(\tau ) \le \min \{p, 2^{k + 2} q \}\), where k is the key length of the primitive E. Thus \((\epsilon , \epsilon ', \epsilon '', \epsilon ^*)\)-proximity immediately implies that for any adversary attacking a single user via \(q\, \textsc {Enc}/\textsc {Dec}\) queries and \(p\, \textsc {Prim}/\textsc {PrimInv}\) queries, its su advantage is at most \(\epsilon (p, q) + \epsilon '(p, q) \cdot \min \{p, q \cdot 2^{k + 2}\} + \epsilon ''(p, q) + \epsilon ^*(p, q)\). The following result bounds the mu security of \(\varPi [E]\).

Lemma 4

Assume that \(\varPi [E]\) has the \((\epsilon , \epsilon ', \epsilon '', \epsilon ^*)\)-proximity property. Then for any adversary A that makes at most \(q\, \textsc {Enc}/\textsc {Dec}\) queries, and \(p\, \textsc {Prim}/\textsc {PrimInv}\) queries,

$$ {\mathsf {Adv}}^{\pm \mathrm {mu}\text {-}\mathrm {prp}}_{\varPi [E], \mathrm {Sample}}(A) \le 2^{-n} + 2\epsilon + 2q ( \epsilon '' + \epsilon ^*) + \mathrm {Cost}(4n, 8q / 2^n) \cdot 10(p + qt) \epsilon ', $$

where t is the number of calls to \(E/E^{-1}\) that a single call to \(\varPi /\varPi ^{-1}\) makes, and functions \(\epsilon , \epsilon ', \epsilon '', \epsilon ^*\) all take arguments \(p + qt\) and q.    \(\square \)
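The bound of Lemma 4 can be evaluated mechanically once the four proximity functions are given. The sketch below is an illustrative evaluator in floating point: `lemma4_bound` and `cost` are our names, \(\lg \) is assumed to be base 2, and the caller supplies `eps`, `eps1`, `eps2`, `eps_star` as Python callables of two arguments.

```python
from math import log2


def cost(M, z):
    """Cost(M, z) from the text, assuming lg is log base 2 and M > 1."""
    return max(M, z) if z > 1 else M / log2(M)


def lemma4_bound(eps, eps1, eps2, eps_star, p, q, t, n):
    """Evaluate 2^-n + 2*eps + 2q(eps'' + eps*) + Cost(4n, 8q/2^n) * 10(p + qt) * eps',
    where all four functions are called with arguments (p + q*t, q)."""
    P = p + q * t
    return (
        2.0 ** -n
        + 2 * eps(P, q)
        + 2 * q * (eps2(P, q) + eps_star(P, q))
        + cost(4 * n, 8 * q / 2 ** n) * 10 * P * eps1(P, q)
    )


def zero(p, q):
    return 0.0


# With all proximity terms set to zero, only the 2^-n term from Lemma 5 remains.
assert lemma4_bound(zero, zero, zero, zero, p=10, q=4, t=2, n=8) == 2.0 ** -8
```

Floating point is adequate for plotting or comparing orders of magnitude; exact rationals would be preferable if the result feeds into a proof-style check.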

Discussion. Recall that our technique dissects a su bound into three components: \(\epsilon , \epsilon ' \cdot \min \{p, q \cdot 2^{k + 2}\}\), and \((\epsilon '' + \epsilon ^*)\). Lemma 4 above then lifts those to \(\epsilon \), \(\mathrm {Cost}(4n, 8q/2^n) \cdot (p + qt) \cdot \epsilon '\), and \(q \cdot (\epsilon '' + \epsilon ^*)\), respectively, for the corresponding mu bound. This trisection captures different possibilities of security loss when one moves from su to mu security: (i) Key-Alternating Cipher (where \(\epsilon \) is the dominant term in both the su and mu bounds) [19], (ii) Single Encryption (where \(\epsilon '' + \epsilon ^*\) and \(q \cdot (\epsilon '' + \epsilon ^*)\) are the dominant terms in the su and mu bounds respectively), and (iii) Double Encryption (where \(\epsilon ' \cdot \min \{p, q \cdot 2^{k + 2}\}\) and \(\mathrm {Cost}(4n, 8q / 2^n) \cdot (p + qt) \epsilon '\) are the dominant terms in the su and mu bounds respectively).

Given a su analysis, there might be multiple choices for \(\epsilon \) and \(\epsilon ''\). However, recall that when we move from su to mu security, the former term remains the same, whereas the latter blows up by a factor of q. Therefore, when we need to pinpoint \(\epsilon \) and \(\epsilon ''\), we will shift as much weight to \(\epsilon \) as possible, and the optimal choice of \(\epsilon \) will often be clear from the context and the best mu attacks. On the other hand, due to the q-blowup of \(\epsilon ''\), one may need a very fine-grained su analysis to obtain a good mu bound.

The Proof of Lemma 4. We want to show that Lemma 2 implies the claimed result. In order to do that, we need to define (i) function \(\mathrm {Rate}(\tau )\) for su transcripts \(\tau \), and (ii) a niceness property for mu transcripts. The former is obvious: for a su transcript \(\tau \) of p \(\textsc {Prim}/\textsc {PrimInv}\) queries and q \(\textsc {Enc}/\textsc {Dec}\) queries, let

$$ \mathrm {Rate}(\tau ) = \epsilon (p, q) + \epsilon '(p, q) \cdot \mathrm {Ent}(\tau ) + \epsilon ''(p, q) . $$

This function \(\mathrm {Rate}\) is increasing, in the sense that if \(\tau '\) contains all the query-answer pairs of \(\tau \) then \(\mathrm {Rate}(\tau ') \ge \mathrm {Rate}(\tau )\). Next, let \(d = \frac{5}{4} \mathrm {Cost}(4n, 8q / 2^n)\). We say that a mu transcript \(\tau \) in the support of \(\mathrm {Script}(A, \mathbf {S}_{\mathrm {ideal}})\) is nice if it satisfies the following constraints:

  • There are no d entries in \(\tau \) of the form \((\mathsf {enc}, +, \cdot , \cdot , y), \ldots , (\mathsf {enc}, +, \cdot , \cdot , y)\).

  • There are no d entries in \(\tau \) of the form \((\mathsf {enc}, -, \cdot , x, \cdot ), \ldots , (\mathsf {enc}, -, \cdot , x, \cdot )\).

Clearly, the definition of niceness involves only \(\textsc {Enc}/\textsc {Dec}\) query-answer pairs of \(\tau \). Let \(\mathcal {N}\) be the set of nice mu transcripts. The following bounds the chance that A’s transcript is not nice; the proof is in the full version of this paper.

Lemma 5

For any adversary A that makes at most q \(\textsc {Enc}/\textsc {Dec}\) queries, and p \(\textsc {Prim}/\textsc {PrimInv}\) queries,

$$ \Pr [\mathrm {Script}(A, \mathbf {S}_{\mathrm {ideal}}) \not \in \mathcal {N}] \le \frac{1}{2^n} . \;\;\;\; $$

   \(\square \)

To use Lemma 2, we need to establish the mu-boundedness of \(\mathrm {Rate}\). Specifically, we claim that, for any nice mu transcript \(\tau \) of r users, using p \(\textsc {Prim}/\textsc {PrimInv}\) queries and q \(\textsc {Enc}/\textsc {Dec}\) queries,

$$\begin{aligned} \sum _{i = 1}^r \mathrm {Rate}(\mathrm {Map}(i, \tau )) \le \epsilon (p, q) + q \epsilon ''(p, q) + 4d p \epsilon '(p, q) . \end{aligned}$$
(23)

Then using Lemma 2 for \(\delta _0 = 2^{-n}\) and \(\delta _1 = \epsilon + q \epsilon '' + 4dp \epsilon '\) and \(\delta _2 = \epsilon ^*\) leads to our claimed result.

We now justify Eq. (23). Suppose that in \(\tau \), the adversary uses \(q_i\) \(\textsc {Enc}/\textsc {Dec}\) queries for the i-th user. Then

$$\begin{aligned} \sum _{i = 1}^r \mathrm {Rate}(\mathrm {Map}(i, \tau ))= & {} \sum _{i = 1}^r \Bigl ( \epsilon (p, q_i) + \epsilon ''(p, q_i) + \mathrm {Ent}(\mathrm {Map}(i, \tau )) \cdot \epsilon '(p, q_i) \Bigr ) \\\le & {} \epsilon (p, q) + r \epsilon ''(p, q) + \sum _{i = 1}^r \mathrm {Ent}(\mathrm {Map}(i, \tau )) \cdot \epsilon '(p, q) \\\le & {} \epsilon (p, q) + q \epsilon ''(p, q) + \sum _{i = 1}^r \mathrm {Ent}(\mathrm {Map}(i, \tau )) \cdot \epsilon '(p, q), \end{aligned}$$

where the first inequality is due to the super-additivity of \(\epsilon \) and the monotonicity of \(\epsilon '\) and \(\epsilon ''\). Thus to justify Eq. (23), what is left is to prove that

$$ \sum _{i = 1}^r \mathrm {Ent}(\mathrm {Map}(i, \tau )) \le 4 dp . $$

Since \(\tau \) is nice, for each entry \((\mathtt {prim}, \cdot , \cdot , u, v)\), there are at most 4d entries of the form \((\mathsf {enc}, +, \cdot , \cdot , x)\) or \((\mathsf {enc}, -, \cdot , x, \cdot )\), for \(x \in \{u, v\}\). Since each \(\textsc {Enc}/\textsc {Dec}\) entry belongs to exactly one user, for each \(\textsc {Prim}/\textsc {PrimInv}\) entry of \(\tau \), there are at most 4d indices i such that \(\mathrm {Ent}(\mathrm {Map}(i, \tau ))\) counts this entry, and thus summing over \(p\, \textsc {Prim}/\textsc {PrimInv}\) entries of \(\tau \) gives us

$$ \sum _{i = 1}^r \mathrm {Ent}(\mathrm {Map}(i, \tau )) \le 4dp $$

as claimed.
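The counting argument above can be replayed on a toy mu transcript. The sketch below assumes a hypothetical 5-tuple encoding `(kind, sign, index, left, right)` of entries; `is_nice`, `ent_for_user`, and `total_ent` are our names, and the threshold `d` is passed in directly rather than derived from \(\mathrm {Cost}\).

```python
from collections import Counter


def is_nice(tau, d):
    """Nice: fewer than d entries (enc, +, ., ., y) share a y, and fewer than d entries (enc, -, ., x, .) share an x."""
    fwd = Counter(e[4] for e in tau if e[0] == "enc" and e[1] == "+")
    bwd = Counter(e[3] for e in tau if e[0] == "enc" and e[1] == "-")
    return all(c < d for c in fwd.values()) and all(c < d for c in bwd.values())


def ent_for_user(i, tau):
    """Ent(Map(i, tau)): prim entries touching a value returned to user i."""
    touched = {e[4] for e in tau if e[0] == "enc" and e[1] == "+" and e[2] == i}
    touched |= {e[3] for e in tau if e[0] == "enc" and e[1] == "-" and e[2] == i}
    return sum(1 for e in tau if e[0] == "prim" and (e[3] in touched or e[4] in touched))


def total_ent(tau):
    """Sum of Ent(Map(i, tau)) over all users appearing in tau."""
    users = {e[2] for e in tau if e[0] == "enc"}
    return sum(ent_for_user(i, tau) for i in users)


toy = [
    ("enc", "+", 1, 0, 10), ("enc", "+", 2, 1, 10), ("enc", "+", 3, 2, 10),
    ("prim", "+", 7, 6, 10), ("prim", "+", 7, 0, 5),
]
p = sum(1 for e in toy if e[0] == "prim")
assert is_nice(toy, d=4) and not is_nice(toy, d=3)
assert total_ent(toy) <= 4 * 4 * p  # the bound 4dp with d = 4
```

In this toy case the single prim entry ending in 10 is counted once per user, which is the multiplicity that niceness caps.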

5 Exact Multi-user Security of Double Encryption

5.1 Results and Discussion

Results. In this section, we give an exact mu security bound of Double Encryption via the specialized framework in Sect. 4.2; the key-sampling algorithm is uniform. While it is relatively easy to give an exact su security bound of Double Encryption [2, 14], giving a good \((\epsilon , \epsilon ', \epsilon '', \epsilon ^*)\)-proximity bound, as in Lemma 6 below, requires a much more fine-grained analysis. The proof, given in Sect. 5.2, is based on the expectation method of Hoang and Tessaro [19].

Lemma 6

Let \(n \ge 16\) and \(k \ge 1\) be integers, and let \(E: \{0,1\}^k \times \{0,1\}^n \rightarrow \{0,1\}^n\) be a blockcipher. Then \(\mathrm {DE}[E]\) satisfies the \((\epsilon , \epsilon ', \epsilon '', \epsilon ^*)\)-proximity property, with \(\epsilon (p, q) = \frac{2q}{2^{k + n/2}} + \frac{3q B^2 + 2Bpq}{2^{2k}}\), \(\epsilon '(p, q) = \frac{2p}{2^{2k}}\), \(\epsilon ''(p, q) = \frac{5Bp}{2^{2k}}\), and \(\epsilon ^*(p, q) = \frac{1}{2^{k + n}}\), where \(B = \frac{5}{4} \cdot \mathrm {Cost}(4n + 2k, 8q / 2^n)\).    \(\square \)

From Lemmas 4 and 6, we immediately obtain the following result.

Theorem 1

(Mu security of Double Encryption). Let \(n, k \in \mathbb {N}\), and let \(E: \{0,1\}^k \times \{0,1\}^n \rightarrow \{0,1\}^n\) be a blockcipher. Then for any adversary A making at most \(q\, \textsc {Enc}/\textsc {Dec}\) queries and \(p\, \textsc {Prim}/\textsc {PrimInv}\) queries,

$$ {\mathsf {Adv}}^{\pm \mathrm {mu}\text {-}\mathrm {prp}}_{\mathrm {DE}[E]}(A) \le \frac{1}{2^n} + \frac{5q}{2^{k + n/2}} + \frac{6qB^2 + 222B Q^2}{2^{2k}} $$

where \(B = \frac{5}{4} \cdot \mathrm {Cost}(4n + 2k, 8q / 2^n)\) and \(Q = \max \{p, q\}\).    \(\square \)

Discussion. Admittedly, the bound in Theorem 1 looks complicated. However, for the “usual” setting \(n \ge k \ge 16\) and \(q \le \frac{2^k}{8}\), the bound can be simplified to \({\mathsf {Adv}}^{\pm \mathrm {mu}\text {-}\mathrm {prp}}_{\mathrm {DE}[E]}(A) \le \frac{1}{2^n} + \frac{(n + 5)q}{2^{1.5k}} + \frac{1554 n Q^2 }{\lg (4n) \cdot 2^{2k}}\). On the other hand, recall that the classical su bound of \(\mathrm {DE}[E]\) by Aiello et al. [2] is \({\mathsf {Adv}}^{\pm \mathrm {prp}}_{\mathrm {DE}[E]}(A) \le \frac{p^2}{2^{2k}}\). If we apply the hybrid argument to this, we will get the following inferior bound \({\mathsf {Adv}}^{\pm \mathrm {mu}\text {-}\mathrm {prp}}_{\mathrm {DE}[E]}(A) \le \frac{q(p + 2q)^2}{2^{2k}}\). While this bound is enough to show that Double Encryption squarely beats Single Encryption in mu security (see Footnote 3), it is much worse than the bound in Theorem 1, as illustrated in Fig. 2.

Fig. 2. Mu and su security of Single and Double Encryption on AES. From left to right: the mu bound of Single Encryption, the naive mu bound of Double Encryption via the hybrid argument, the mu bound of Double Encryption via Theorem 1, the su bound of Single Encryption, and the classical su bound of Double Encryption by Aiello et al. [2]. We set \(p = q\) and \(n = k = 128\). The x-axis gives the log (base 2) of p, and the y-axis gives the security bounds.

5.2 Proof of Lemma 6

It is convenient to assume without loss of generality that the adversary does not make redundant queries. Our proof borrows the overall approach used by Hoang and Tessaro [19] for key-alternating ciphers. We begin with some high-level setup.

Assumptions on the Transcript. We consider an arbitrary fixed transcript \(\tau \) which contains \(q\,\textsc {Enc}/\textsc {Dec}\) queries and \(p\, \textsc {Prim}/\textsc {PrimInv}\) queries. Moreover, for a transcript \(\tau \), we also denote (following [14])

$$\begin{aligned} \begin{aligned} \mathrm {Fwd}(\tau )&= \max _{y \in \{0,1\}^n} \bigl | \{ (J, x) \;|\; (\mathtt {prim}, +, J, x, y) \in \tau \}\bigr | \;,\\ \mathrm {Bwd}(\tau )&= \max _{x \in \{0,1\}^n} \bigl | \{ (J, y) \;|\; (\mathtt {prim}, -, J, x, y) \in \tau \}\bigr | \;. \end{aligned} \end{aligned}$$

Recall that to establish \((\epsilon , \epsilon ', \epsilon '', \epsilon ^*)\)-proximity, we have to define bad transcripts. A transcript is bad if either \(\mathrm {Fwd}(\tau ) > B\) or \(\mathrm {Bwd}(\tau ) > B\), where

$$\begin{aligned} B := \frac{5}{4} \cdot \mathrm {Cost}(4n + 2k, 8p/ 2^n) \;. \end{aligned}$$

Let \(\mathcal {S}\) be the set of all bad transcripts. The following bounds the chance that the adversary produces a bad transcript; the proof is in the full version of this paper.

Lemma 7

For any adversary A that makes \(p\, \textsc {Prim}/\textsc {PrimInv}\) queries and \(q\, \textsc {Enc}/\textsc {Dec}\) queries,

$$ \Pr [\mathrm {Script}(A, \mathbf {S}_{\mathrm {ideal}}) \in \mathcal {S}] \le \frac{1}{2^{n + k}} . \;\;\; $$

   \(\square \)

From now on, we assume that additionally \(\tau \notin \mathcal {S}\). We shall use the expectation method to prove the claimed bound on the gap \(\mathsf {p}_{\mathbf {S}_{\mathrm {ideal}}}(\tau ) - \mathsf {p}_{\mathbf {S}_{\mathrm {real}}}(\tau )\). We begin with some combinatorial results on the transcript.

Type-1 Chains. Consider a pair of entries \((\mathtt {prim}, \cdot , \cdot , x_1, y_1), (\mathtt {prim}, \cdot , \cdot , x_2, y_2)\) in \(\tau \) such that \(y_1 = x_2\). We say that it is a positive type-1 chain if there’s an entry \((\mathsf {enc}, +, x_1, \cdot )\) in \(\tau \). We say that it is a negative type-1 chain if there’s an entry \((\mathsf {enc}, -, \cdot , y_2)\). The following lemma bounds the number of type-1 chains; the proof is in Appendix A.

Lemma 8

The number of type-1 chains is at most \(4Bp + 2B^2q + 2Bpq\).    \(\square \)

Type-2 Chains. Consider a pair of entries \((\mathtt {prim}, \cdot , \cdot , x_1, y_1), (\mathtt {prim}, \cdot , \cdot , x_2, y_2)\). We say that it is a positive type-2 chain if there’s an entry \((\mathsf {enc}, +, x_1, y_2)\) in \(\tau \). We say that it is a negative type-2 chain if there’s an entry \((\mathsf {enc}, -, x_1, y_2)\) in \(\tau \). The following lemma bounds the number of type-2 chains; the proof is in Appendix B.

Lemma 9

The number of type-2 chains is at most \(2p \cdot \mathrm {Ent}(\tau )\).    \(\square \)

Good and Bad Keys. We shall use the expectation method. Let \(S\) be the random variable for the key. The key-space \(\mathcal {K}\) is \((\{0,1\}^k)^2\) and \(S\) is uniformly distributed over \(\mathcal {K}\). For each key vector \(s = (K_1, K_2) \in \mathcal {K}\) and each \(i \in \{1, 2\}\), let \(p_i[s]\) be the number of queries \((\mathtt {prim}, \cdot , K_i, \cdot , \cdot )\) in \(\tau \).

Definition 1

(Good and bad keys). We say that a key \(s = (K_1, K_2)\) is bad if one of the following happens:

  1. (i)

    \(K_1 = K_2\) and \(p_1[s] \ge 1\), or

  2. (ii)

    \(K_1 \ne K_2\), \(p_1[s] \ge 1\) and \(p_2[s] \ge 2^n / 4\), or

  3. (iii)

    \(K_1 \ne K_2\), \(p_1[s] \ge 2^n / 4\) and \(p_2[s] \ge 1\), or

  4. (iv)

    \(K_1 \ne K_2\) and there’s a (type-1 or 2) chain \((\mathtt {prim}, \cdot , K_1, \cdot , \cdot ), (\mathtt {prim}, \cdot , K_2, \cdot , \cdot )\).

If a key is not bad then we say that it is good. Let \(\varGamma _{\mathrm {bad}}\) be the set of bad keys, and let \(\varGamma _{\mathrm {good}}= \mathcal {K}\backslash \varGamma _{\mathrm {bad}}\).
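Definition 1 can be read as a simple predicate on a key pair. The sketch below is illustrative only: `prim_count` (a map from subkey to its number of \(\mathtt {prim}\) entries) and `chained_pairs` (the set of key pairs connected by a type-1 or type-2 chain) are hypothetical inputs that we assume have been extracted from \(\tau \) beforehand.

```python
def is_bad_key(k1, k2, prim_count, n, chained_pairs=frozenset()):
    """Check conditions (i)-(iv) of Definition 1 for the key s = (k1, k2).

    ASSUMPTIONS: prim_count[J] is the number of entries (prim, ., J, ., .)
    in the transcript, and chained_pairs holds every (K1, K2) that is
    connected by a type-1 or type-2 chain.
    """
    p1 = prim_count.get(k1, 0)
    p2 = prim_count.get(k2, 0)
    heavy = 2 ** n / 4
    if k1 == k2:
        return p1 >= 1                    # condition (i)
    if p1 >= 1 and p2 >= heavy:
        return True                       # condition (ii)
    if p1 >= heavy and p2 >= 1:
        return True                       # condition (iii)
    return (k1, k2) in chained_pairs      # condition (iv)

counts = {0: 4, 1: 1}  # toy transcript with n = 4, so "heavy" means >= 4 entries
print(is_bad_key(1, 0, counts, n=4), is_bad_key(1, 2, counts, n=4))
```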

We first bound the probability that \(S\) is bad. First, the chance that \(S\) satisfies condition (i) above is at most \(\frac{p}{2^{2k}}\). Next, we say that a subkey \(J \in \{0,1\}^k\) is heavy if there are at least \(2^n / 4\) entries \((\mathtt {prim}, \cdot , J, \cdot , \cdot )\) in \(\tau \). Since there are at most \(4p / 2^n\) heavy subkeys, the chance that \(S\) satisfies condition (ii) above is at most \(\frac{4p/2^n}{2^k} \cdot \frac{p}{2^k} = \frac{4p^2}{2^{2k + n}}\). Likewise, the chance that \(S\) satisfies condition (iii) above is at most \(\frac{4p^2}{2^{2k + n}}\). From Lemmas 8 and 9, there are at most \(2p \cdot \mathrm {Ent}(\tau ) + 4Bp + 2qB^2 + 2Bpq\) chains, and thus the chance that \(S\) satisfies condition (iv) above is at most \(\frac{2p \cdot \mathrm {Ent}(\tau ) + 4Bp + 2qB^2 + 2Bpq}{2^{2k}}\). Summing up,

$$\begin{aligned} \Pr [S\text { is bad}]\le & {} \frac{p + 8p^2/2^n}{2^{2k}} + \frac{2p \cdot \mathrm {Ent}(\tau ) + 4Bp + 2qB^2 + 2Bpq}{2^{2k}} \\\le & {} \frac{2p \cdot \mathrm {Ent}(\tau ) + 5Bp + 2qB^2 + 2Bpq}{2^{2k}} . \end{aligned}$$

Next, recall that in the expectation method, one needs to find a non-negative function \(g: \mathcal {K}\rightarrow [0, \infty )\) such that g(s) bounds \(1 - \mathsf {p}_{\mathbf {S}_{0}}(\tau , s) / \mathsf {p}_{\mathbf {S}_{1}}(\tau , s)\) for all \(s \in \varGamma _{\mathrm {good}}\). Let U be the subset of \(\varGamma _{\mathrm {good}}\) such that for any \((K_1, K_2) \in U\), we have \(K_1 = K_2\). We will define g(s) such that \(g(s) = 2q / 2^{n/2}\) for every \(s \in U\), and \(g(s) = \frac{4q \cdot p_1[s] \cdot p_2[s]}{2^{2n}}\) for every \(s \in \mathcal {K}\backslash U\). We will show that g(s) bounds \(1 - \mathsf {p}_{\mathbf {S}_{0}}(\tau , s) / \mathsf {p}_{\mathbf {S}_{1}}(\tau , s)\) later. Then

$$\begin{aligned} \mathbf {E}[g(S)]= & {} \frac{1}{2^{2k}} \Bigl ( \sum _{s \in U} g(s) + \sum _{s \in \mathcal {K}\backslash U} g(s) \Bigr ) \\= & {} \frac{1}{2^{2k}} \Bigl ( \sum _{s \in U} \frac{2q}{2^{n/2}} + \sum _{s \in \mathcal {K}\backslash U} \frac{4q p_1[s] p_2[s]}{2^{2n}} \Bigr ) \\\le & {} \frac{1}{2^{2k}} \Bigl ( \frac{2q \cdot 2^k}{2^{n/2}} + \frac{4q p^2}{2^{2n}} \Bigr ) \le \frac{2q}{2^{k + n/2}} + \frac{qB^2}{2^{2k}} . \end{aligned}$$

Then from Lemma 1,

$$\begin{aligned} \mathsf {p}_{\mathbf {S}_{\mathrm {ideal}}}(\tau ) - \mathsf {p}_{\mathbf {S}_{\mathrm {real}}}(\tau )\le & {} \Bigl (\Pr [S\text { is bad}] + \mathbf {E}[g(S)]\Bigr ) \mathsf {p}_{\mathbf {S}_{\mathrm {ideal}}}(\tau ) \\\le & {} \Bigl (\frac{2q}{2^{k + n/2}} + \frac{2p \cdot \mathrm {Ent}(\tau ) + 5Bp + 3qB^2 + 2Bpq}{2^{2k}} \Bigr )\mathsf {p}_{\mathbf {S}_{\mathrm {ideal}}}(\tau ). \end{aligned}$$

We now show that g(s) indeed bounds \(1 - \mathsf {p}_{\mathbf {S}_{0}}(\tau , s) / \mathsf {p}_{\mathbf {S}_{1}}(\tau , s)\) for every \(s \in \varGamma _{\mathrm {good}}\). We consider the following cases, depending on whether \(s \in \varGamma _{\mathrm {good}}\backslash U\) or \(s \in U\).

Case 1: \(s \in \varGamma _{\mathrm {good}}\backslash U\). For this case, we have to consider two sub-cases, depending on whether \(q \le N/4\) or not.

Case 1.1: \(q \le N/4\). Let \(s = (K_1, K_2)\). Since \(s \in \varGamma _{\mathrm {good}}\backslash U\), we must have \(K_1 \ne K_2\). We now use the following result of Chen and Steinberger [12]. (Their proof considered key-alternating ciphers, but we note that we are restricting ourselves to the setting \(K_1 \ne K_2\), and their proof also applies to the special case that all sub-keys of the key-alternating cipher are \(0^n\), which is equivalent to our setting here.)

Lemma 10

[12] Assume that \(p_1[s], p_2[s], q < 2^n / 2\). Then

$$ 1 - \frac{\mathsf {p}_{\mathbf {S}_{0}}(\tau , s) }{ \mathsf {p}_{\mathbf {S}_{1}}(\tau , s)} \le \frac{q \cdot p_1[s] \cdot p_2[s]}{(2^n - q - p_1[s]) (2^n - q - p_2[s])} . \;\;\;\;\; $$

   \(\square \)

From Lemma 10, since \(p_1[s], p_2[s], q \le 2^n / 4\),

$$ 1 - \frac{\mathsf {p}_{\mathbf {S}_{0}}(\tau , s) }{ \mathsf {p}_{\mathbf {S}_{1}}(\tau , s)} \le \frac{4 q \cdot p_1[s] \cdot p_2[s]}{2^{2n}} = g(s) . $$
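The step from Lemma 10 to \(g(s)\) only uses that both factors in the denominator are at least \(2^n / 2\) when \(p_1[s], p_2[s], q \le 2^n / 4\). The sketch below checks this exhaustively with exact arithmetic on a toy domain; the choice \(n = 6\) and the function names are ours.

```python
from fractions import Fraction

def lemma10_bound(N, q, p1, p2):
    """Right-hand side of Lemma 10 with N = 2^n."""
    return Fraction(q * p1 * p2, (N - q - p1) * (N - q - p2))

def g(N, q, p1, p2):
    """The function g(s) = 4 q p1 p2 / 2^(2n) used for keys outside U."""
    return Fraction(4 * q * p1 * p2, N * N)

N = 2 ** 6          # toy domain size (assumption: n = 6 for a fast check)
limit = N // 4      # the regime p1, p2, q <= 2^n / 4 of Case 1.1
all_ok = all(
    lemma10_bound(N, q, p1, p2) <= g(N, q, p1, p2)
    for q in range(limit + 1)
    for p1 in range(limit + 1)
    for p2 in range(limit + 1)
)
print(all_ok)
```

At the corner \(p_1[s] = p_2[s] = q = 2^n / 4\) the two expressions coincide, so the constant 4 cannot be improved by this argument alone.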

Case 1.2: \(N / 4 < q \le N\). Let Z be the random variable for the additional \((N - q)\) \(\textsc {Enc}\) queries that \(\tau \) lacks. We write \(\mathsf {p}_{\mathbf {S}_{\mathrm {real}}}(\tau , s, z)\) for the probability that \(\mathbf {S}_{\mathrm {real}}\) answers queries according to \(\tau \), and that \(S= s\) and \(Z = z\). Likewise, \(\mathsf {p}_{\mathbf {S}_{\mathrm {ideal}}}(\tau , s, z)\) is the probability that \(\mathbf {S}_{\mathrm {ideal}}\) behaves according to the entries in \((\tau , z)\), and \(S\,{\leftarrow \!\!{\scriptscriptstyle {\$}}}\,\{0,1\}^{2k}\) agrees with s. We now show that \(\mathsf {p}_{\mathbf {S}_{\mathrm {ideal}}}(\tau , s, z) \le \mathsf {p}_{\mathbf {S}_{\mathrm {real}}}(\tau , s, z)\) for all choices of z such that \(\mathsf {p}_{\mathbf {S}_{\mathrm {ideal}}}(\tau , s, z) > 0\), and thus

$$ \mathsf {p}_{\mathbf {S}_{\mathrm {ideal}}}(\tau , s) - \mathsf {p}_{\mathbf {S}_{\mathrm {real}}}(\tau , s) \le \sum _z \Bigl ( \mathsf {p}_{\mathbf {S}_{\mathrm {ideal}}}(\tau , s, z) - \mathsf {p}_{\mathbf {S}_{\mathrm {real}}}(\tau , s, z) \Bigr ) \le 0 \le g(s) . $$

Let \(s = (K_1, K_2)\) and \(a = p_1[s]\) and \(b = p_1[s] + p_2[s] < 2^n\). As \(s \in \varGamma _{\mathrm {good}}\backslash U\), the entries in \((\tau , z)\) consist of the following categories:

  1. (1)

    \((\mathsf {enc}, \cdot , 1, x_1, y_1), \ldots , (\mathsf {enc}, \cdot , 1, x_{2^n}, y_{2^n})\),

  2. (2)

    \((\mathtt {prim}, \cdot , K_1, x_1, u_1), \ldots , (\mathtt {prim}, \cdot , K_1, x_a, u_a)\) and \((\mathtt {prim}, \cdot , K_2, u_{a + 1}, y_{a + 1}), \ldots , (\mathtt {prim}, \cdot , K_2, u_{b}, y_{b})\), and

  3. (3)

    \((\mathtt {prim}, \cdot , J, \cdot , \cdot )\), with \(J \not \in \{K_1, K_2\}\).

Hence \(\mathsf {p}_{\mathbf {S}_{\mathrm {real}}}(\tau , s, z)\) is the probability of the following events:

  1. (i)

    If we make queries in category (3) above, we get the answers provided by \(\tau \).

  2. (ii)

    \(S\,{\leftarrow \!\!{\scriptscriptstyle {\$}}}\,\{0,1\}^{2k}\) agrees with s.

  3. (iii)

    For any \(i \in \{1, \ldots , b\}\), querying \(\textsc {Prim}(K_1, x_i)\) in \(\mathbf {S}_{\mathrm {real}}\) yields \(u_i\), and querying \(\textsc {PrimInv}(K_2, y_i)\) in \(\mathbf {S}_{\mathrm {real}}\) yields \(u_i\). Moreover, for any \(j \in \{b + 1, \ldots , 2^n\}\), in \(\mathbf {S}_{\mathrm {real}}\), the output of \(\textsc {Prim}(K_1, x_j)\) is the same as the output of \(\textsc {PrimInv}(K_2, y_j)\).

Note that the three events above are independent, and the first two are independent of the system. On the other hand, \(\mathsf {p}_{\mathbf {S}_{\mathrm {ideal}}}(\tau , s, z)\) is likewise the probability of events (i), (ii), and the following

  1. (iv)

    For any \(i \in \{1, \ldots , a\}\), querying \(\textsc {Prim}(K_1, x_i)\) in \(\mathbf {S}_{\mathrm {ideal}}\) yields \(u_i\). For any \(i \in \{a + 1, \ldots , b\}\), querying \(\textsc {PrimInv}(K_2, y_i)\) in \(\mathbf {S}_{\mathrm {ideal}}\) yields \(u_i\). Moreover, for any \(j \in \{1, \ldots , 2^n\}\), querying \(\textsc {Enc}(1, x_j)\) yields \(y_j\).

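Again, note that events (i), (ii), and (iv) are independent. Hence we need only show that the probability that event (iii) happens is at least the probability that event (iv) happens. In the real system, the permutation under \(K_1\) is fixed on \(b\) points and the permutation under \(K_2\) is then fully determined by the \(2^n\) entries of category (1), so the chance that event (iii) happens is

$$ \frac{1}{(2^n)! \cdot 2^n (2^n - 1) \cdots (2^n - b + 1)} $$

whereas the chance that event (iv) happens is

$$ \frac{1}{(2^n)! \cdot 2^n (2^n - 1) \cdots (2^n - a + 1) \cdot 2^n(2^n - 1) \cdots (2^n - (b - a) + 1)} . $$

Since \(2^n (2^n - 1) \cdots (2^n - b + 1) = 2^n \cdots (2^n - a + 1) \cdot (2^n - a) \cdots (2^n - b + 1)\) and \((2^n - a) \cdots (2^n - b + 1) \le 2^n (2^n - 1) \cdots (2^n - (b - a) + 1)\), the probability that event (iii) happens is indeed at least the probability that event (iv) happens.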
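The comparison between events (iii) and (iv) can be sanity-checked by brute force on a toy domain. The sketch below is ours and makes several assumptions: \(n = 2\), one \(\mathtt {prim}\) entry under each of \(K_1\) and \(K_2\), no chains between them, and the composition order \(\textsc {Enc}(x) = E_{K_2}(E_{K_1}(x))\) suggested by category (2).

```python
from fractions import Fraction
from itertools import permutations

N = 4                      # toy domain size 2^n with n = 2 (assumption)
P = (1, 2, 3, 0)           # all Enc answers: x_j -> y_j = P[x_j]
k1_entries = [(0, 2)]      # (x, u): E_{K_1}(x) = u, so a = 1
k2_entries = [(0, P[1])]   # (u, y): E_{K_2}(u) = y, so b - a = 1

perms = list(permutations(range(N)))

def agrees(pi, entries):
    """True if the permutation pi maps src to dst for every listed entry."""
    return all(pi[src] == dst for src, dst in entries)

# Real system: Enc is the composition of the two keyed permutations.
real_hits = sum(
    1
    for pi1 in perms
    for pi2 in perms
    if agrees(pi1, k1_entries)
    and agrees(pi2, k2_entries)
    and all(pi2[pi1[x]] == P[x] for x in range(N))
)
pr_real = Fraction(real_hits, len(perms) ** 2)

# Ideal system: Enc is an independent random permutation.
n1 = sum(agrees(pi, k1_entries) for pi in perms)
n2 = sum(agrees(pi, k2_entries) for pi in perms)
pr_ideal = (Fraction(1, len(perms))
            * Fraction(n1, len(perms))
            * Fraction(n2, len(perms)))

print(pr_real, pr_ideal, pr_real >= pr_ideal)
```

This only exercises a single small instance, so it illustrates the inequality rather than proving it.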

Case 2: \(s \in U\). Then \(p_1[s] = 0\). Clearly if \(q \ge 2^{n/2 - 1}\) then the claim vacuously holds. Assume that \(q < 2^{n / 2 - 1}\). Let \(s = (K_1, K_1)\). Let the \(\textsc {Enc}/\textsc {Dec}\) entries in \(\tau \) be \((\mathsf {enc}, \cdot , 1, x_1, y_1), \ldots , (\mathsf {enc}, \cdot , 1, x_q, y_q)\). Note that \(\tau \) doesn’t contain any entry \((\mathtt {prim}, \cdot , K_1, \cdot , \cdot )\). Then \(\mathsf {p}_{\mathbf {S}_{\mathrm {ideal}}}(\tau , s)\) is the probability of the following events:

  1. (a)

    \(S\,{\leftarrow \!\!{\scriptscriptstyle {\$}}}\,\{0,1\}^{2k}\) agrees with s.

  2. (b)

    If we make \(\textsc {Prim}/\textsc {PrimInv}\) queries in \(\tau \), we get the answers provided by \(\tau \).

  3. (c)

    \(\mathbf {S}_{\mathrm {ideal}}\) behaves according to the \(\textsc {Enc}/\textsc {Dec}\) queries in \(\tau \).

Note that the three events above are independent, and the first two are independent of the system. On the other hand, \(\mathsf {p}_{\mathbf {S}_{\mathrm {real}}}(\tau , s)\) is at least the probability of events (a), (b), and the following:

  1. (d)

    For every \(i \in \{1, \ldots , q\}\), if we query \(\textsc {Prim}(K_1, x_i)\), we will get an answer \(z_i \not \in \{x_1, y_1, \ldots , x_q, y_q\}\), and the strings \(z_1, \ldots , z_q\) are distinct. Moreover, if we query \(\textsc {Prim}(K_1, z_i)\), we will get \(y_i\).

Again, events (a), (b), and (d) are independent. Hence we only need to show that, \(\Pr [\text {Event (d)}] \ge (1 - 2q / 2^{n/2}) \Pr [\text {Event (c)}]\). Note that event (c) happens with probability

$$ \frac{1}{2^n(2^n - 1) \cdots (2^n - q + 1)} \,, $$

whereas event (d) happens with probability

$$ \Bigl ( \prod _{i = 0}^{q - 1} \frac{2^n - 2q - i}{2^n - i}\Bigr ) \frac{1}{(2^n - q) \cdots (2^n - 2q + 1)} . $$

Hence

$$\begin{aligned} \frac{\Pr [\text {Event (d)}]}{\Pr [\text {Event (c)}]}= & {} \prod _{i = 0}^{q - 1} \frac{2^n - 2q - i}{2^n - q - i} = \prod _{i = 0}^{q - 1} \Bigl ( 1 - \frac{q}{2^n - q - i}\Bigr ) \\\ge & {} 1 - \sum _{i = 0}^{q - 1} \frac{q}{2^n - q - i} \ge 1 - \frac{q^2}{2^n - 2q} \ge 1 - \frac{2q}{2^{n/2}}, \end{aligned}$$

where the first inequality is due to the fact that \((1 - x)(1 - y) \ge 1 - x - y\) for any \(x, y \ge 0\), and the last inequality is due to the assumption that \(q < 2^{n/2 - 1}\). This concludes the proof.
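As a sanity check of the final chain of inequalities in Case 2, the sketch below evaluates the exact ratio \(\Pr [\text {Event (d)}] / \Pr [\text {Event (c)}]\) with rational arithmetic and compares it with \(1 - 2q / 2^{n/2}\). The parameter \(n = 16\) is our choice for a fast check; the function name is hypothetical.

```python
from fractions import Fraction

def ratio_d_over_c(n, q):
    """Exact value of prod_{i=0}^{q-1} (2^n - 2q - i) / (2^n - q - i)."""
    N = 2 ** n
    ratio = Fraction(1)
    for i in range(q):
        ratio *= Fraction(N - 2 * q - i, N - q - i)
    return ratio

n = 16  # toy block length (assumption); n is even so 2^(n/2) is an integer
case2_ok = all(
    ratio_d_over_c(n, q) >= 1 - Fraction(2 * q, 2 ** (n // 2))
    for q in range(1, 2 ** (n // 2 - 1))   # the regime q < 2^(n/2 - 1)
)
print(case2_ok)
```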

6 Matching Attacks

In this section, we give matching attacks for both Single Encryption and Double Encryption, in which the adversary uses \(\varTheta (q)\) \(\textsc {Enc}/\textsc {Dec}\) queries and \(\varTheta (p)\) \(\textsc {Prim}/\textsc {PrimInv}\) queries. Our attack on Single Encryption generalizes Biham’s work [10] to all choices of the parameters p and q. For Double Encryption, recall that one can launch a su attack with advantage \(\frac{p^2}{2^{2k}}\), and Biham’s key-collision attack [10] gives a mu attack with advantage \(\frac{q^2}{2^{2k}}\). Thus those attacks already give matching bounds in the usual case \(n \ge k\) (such as DES or AES). Hence for Double Encryption, the only interesting setting to find matching attacks is where \(n \ll k\) (such as Format-Preserving Encryption or MISTY-1). However, we only know how to give matching attacks for this setting if the adversary is given all the keys after it finishes querying, which is the model in our security proof and many prior works [14, 16]. Our attack achieves advantage around \(p^2s / 2^{2k}\), where \(s = \max \{\lfloor n / 8\lg (n) \rfloor , q / 2^n\}\), which is much better than the two known attacks above. We leave it as an open problem to find matching attacks for \(n \ll k\) without key revelation.

A Useful Inequality. In the attacks, we often need to make use of the following technical result.

Lemma 11

Let \(r \ge 1\) be an integer and \(0 < a \le 1/r\). Then \((1 - a)^r \le 1 - ar / 2\).

Proof

Clearly the claimed inequality holds for \(r = 1\), and thus we need only consider \(r \ge 2\). Let \(f(x) = xr / 2 + (1 - x)^r - 1\). Our goal is to show that \(f(a) \le 0\). The derivative and second derivative of the function f are \(f'(x) = \frac{r}{2} - r (1 - x)^{r - 1}\) and \(f''(x) = r(r - 1) (1 -x)^{r - 2}\) respectively. Since \(f''(x) > 0\) for all \(x \in [0, 1/r]\), the function \(f'(x)\) is strictly increasing. We claim that \(f(a) \le \max \{f(0), f(1/r)\}\). Since \(f(0) = 0\) and

$$ f(1/r) = \frac{1}{2} + (1 - 1/r)^r - 1 \le \frac{1}{e} - \frac{1}{2} < 0, $$

we have \(f(a) \le 0\). To justify the claim above, note that if \(f'(1/r) < 0\) then function f is decreasing, and thus \(f(a) \le f(0) = \max \{f(0), f(1/r)\}\). If \({f'(1/r) \ge 0}\), since function \(f'\) is strictly increasing and \(f'(0) = -r/2 < 0\), there exists \(b \in [0, 1/r]\) such that \(f'(x) < 0\) for every \(x \in [0, b)\) and \(f'(x) \ge 0\) for every \(x \in [b, 1/r]\). Hence function f is decreasing in [0, b) and increasing in [b, 1/r], and thus \(f(a) \le \max \{f(0), f(1/r)\}\).    \(\square \)
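Lemma 11 is easy to spot-check with exact arithmetic. The following sketch tests the inequality on a grid of rational values of \(a\) in \((0, 1/r]\); the grid and the function name are our own choices.

```python
from fractions import Fraction

def lemma11_holds(r, a):
    """Check (1 - a)^r <= 1 - a*r/2 exactly for an integer r and rational a."""
    return (1 - a) ** r <= 1 - a * r / 2

# Grid of rationals a = j / (20 r) for j = 1..20, so that 0 < a <= 1/r.
grid_ok = all(
    lemma11_holds(r, Fraction(j, 20 * r))
    for r in range(1, 41)
    for j in range(1, 21)
)
print(grid_ok)

# The hypothesis a <= 1/r matters: for r = 4 and a = 3/4 the inequality fails.
print(lemma11_holds(4, Fraction(3, 4)))
```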

6.1 Attacking Single Encryption

Let \(d = \lceil \frac{k + 2}{n - 1} \rceil \) and assume that \(d \le 2^{n - 1}\), which holds for all practical values of n and k. Then

$$\begin{aligned} 2^n (2^n - 1) \cdots (2^n - d + 1) \ge (2^{n - 1})^d \ge 2^{k + 2} . \end{aligned}$$

For all practical values of n and k, the value d will be very small. For example, if \(n = 64\) and \(k = 56\) (meaning DES parameters), we have \(d = 1\). Or if \(n = k = 128\) (meaning AES parameters), we have \(d = 2\). Let \(p, q \in \mathbb {N}\) such that \(pq \le 2^{k}\). Consider the following adversary A. It picks arbitrary distinct \(x_1, \ldots , x_d \in \{0,1\}^n\) and queries \(\textsc {Enc}(i, x_\ell )\) to get answer \(y_{i, \ell }\), for every \(i \in \{1, \ldots , q\}\) and \(\ell \in \{1, \ldots , d\}\). It then picks p arbitrary distinct keys \(K_1, \ldots , K_p \in \{0,1\}^k\) and queries \(\textsc {Prim}(K_j, x_\ell )\) to get answer \(z_{j, \ell }\), for every \(j \in \{1, \ldots , p\}\) and \(\ell \in \{1, \ldots , d\}\). If there are i and j such that \(y_{i, \ell } = z_{j, \ell }\) for every \(\ell \in \{1, \ldots , d\}\) then the adversary outputs 1, otherwise it outputs 0. In the real game, the adversary outputs 1 whenever some user key is among \(K_1, \ldots , K_p\), so from Lemma 11 (with \(a = p / 2^k\) and \(r = q\)), the chance that the adversary outputs 1 is at least

$$ 1 - \Bigl ( 1 - \frac{p}{2^k}\Bigr )^q \ge \frac{pq}{2^{k + 1}} . $$

In the ideal game, by the union bound over the pq pairs (i, j), the chance that it outputs 1 is at most

$$\begin{aligned} \frac{pq}{2^n(2^n - 1) \cdots (2^n - d + 1)} \le \frac{pq}{2^{k + 2}} . \end{aligned}$$

Hence \({\mathsf {Adv}}^{\pm \mathrm {mu}\text {-}\mathrm {prp}}_{E}(A) \ge \frac{pq}{2^{k + 2}}\), and the adversary uses dq \(\textsc {Enc}\) queries and dp \(\textsc {Prim}\) queries.
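To make the attack concrete, the sketch below simulates it against a lazily sampled ideal cipher with toy parameters. Everything here is an illustrative assumption rather than part of the analysis: the parameters \(n = k = 8\) and \(p = q = 8\), the function names, and the shortcut of sampling only the images of \(x_1, \ldots , x_d\) under each permutation.

```python
import random

def chain_length(n, k):
    """d = ceil((k + 2) / (n - 1)), computed with integer arithmetic."""
    return -(-(k + 2) // (n - 1))

def run_attack(n, k, p, q, real, rng):
    """One run of the adversary; returns its output bit."""
    N, d = 2 ** n, chain_length(n, k)
    cipher = {}  # key -> images of x_1..x_d under a lazily sampled permutation

    def prim(key):
        if key not in cipher:
            cipher[key] = tuple(rng.sample(range(N), d))
        return cipher[key]

    if real:   # each user holds an independent uniform key
        ys = {prim(rng.randrange(2 ** k)) for _ in range(q)}
    else:      # each user is answered by an independent random permutation
        ys = {tuple(rng.sample(range(N), d)) for _ in range(q)}
    zs = {prim(j) for j in range(p)}  # p arbitrary distinct keys: 0..p-1
    return int(bool(ys & zs))

rng = random.Random(2017)
n, k, p, q, trials = 8, 8, 8, 8, 2000
real_rate = sum(run_attack(n, k, p, q, True, rng) for _ in range(trials)) / trials
ideal_rate = sum(run_attack(n, k, p, q, False, rng) for _ in range(trials)) / trials
print(chain_length(64, 56), chain_length(128, 128))
print(real_rate, ideal_rate, p * q / 2 ** (k + 2))
```

With these toy parameters the empirical gap between the real and ideal rates should comfortably exceed the guaranteed advantage \(pq / 2^{k + 2}\), since the lower bound from Lemma 11 loses a factor of about two.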

6.2 Attacking Double Encryption

Here we assume that \(16 \le n < k\), and aim to achieve advantage \(p^2s / 2^{2k}\), where \(s = \max \{\lfloor n / 8\lg (n) \rfloor , q / 2^n\}\). We have the following restrictions on the parameters p and q:

  • Since there are attacks with advantage \(Q^2/ 2^{2k}\), where \(Q = \max \{p, q\}\), we need only consider \(2^n / n \le q \le 2^k\).

  • Since using \(p \approx 2^{k} / \sqrt{s}\) is already enough to get a constant advantage, without loss of generality, we can assume that \(p \le 2^{k} / \sqrt{s}\).

Moreover, recall that we are in the model where the keys are given to the adversary after it finishes querying.

The Attack. For every \(i \in \{1, \ldots , q\}\), query \((i, 0^n)\) to \(\textsc {Enc}\) to receive answer \(y_i\). View each string in \(\{0,1\}^n\) as a bin, and each query \(\textsc {Enc}(i, 0^n)\) as throwing a ball into one of those \(2^n\) bins at random. Let y be the bin with the most balls, and let S be the set of indices i such that the corresponding ball falls into bin y. The following lemma gives a strong concentration bound on |S| in both the real and ideal games; see Appendix C for the proof.

Lemma 12

Let \(n \ge 16\) and \(q \ge 2^n / n\) be integers. Consider throwing q balls to \(2^n\) bins at random. Let X denote the random variable for the number of balls in the bin of most balls. Then

$$ \Pr \Bigl [ X \ge \max \{ \lfloor n / 8 \lg (n) \rfloor , q / 2^n\} \Bigr ] \ge 1 - 2^{-n/3} . \;\;\;\;\; $$

   \(\square \)

Next, if \(|S| < s\) then output a random guess to get advantage 0. If \(|S| \ge s\), which happens with probability at least \(1 - 2^{-n/3}\) according to Lemma 12, then adapt the meet-in-the-middle attack as follows. Pick distinct keys \(J_1, \ldots , J_{2p} \in \{0,1\}^k\), and for every \(i \in \{1, \ldots , p\}\), query \(\textsc {Prim}(J_i, 0^n)\) and \(\textsc {PrimInv}(J_{i + p}, y)\) to get answers \(u_i\) and \(v_i\) respectively. When the keys are given, check if there are some \(i, j\in \{1, \ldots , p\}\) and \(\ell \in S\) such that \((J_{i}, J_{j + p})\) is the key of user \(\ell \). If such a triple \((i, j, \ell )\) exists then output 1 if and only if \(u_i = v_j\).

Analyses. Suppose that \(|S| \ge s\). Then the chance that there are \(i, j \in \{1, \ldots , p\}\) and \(\ell \in S\) such that \((J_{i}, J_{j + p})\) is the key of user \(\ell \) is

$$ 1- (1 - p^2 / 2^{2k})^{|S|} \ge 1 - ( 1 - p^2 / 2^{2k})^{s} \ge \frac{p^2s}{2^{2k + 1}}, $$

where the last inequality is due to Lemma 11. If such a triple exists then in the ideal game, the conditional probability that \(u_i = v_j\) is \(1 / 2^n\), whereas in the real game, \(u_i = v_j\) with conditional probability 1. Putting this all together, the adversary wins with advantage at least

$$ (1 - 2^{-n/3}) (1 - 2^{-n}) \cdot \frac{p^2s}{2^{2k + 1}} \ge \max \{ \lfloor n / 8 \lg (n) \rfloor , q / 2^n\} \cdot \frac{p^2}{3 \cdot 2^{2k}} . $$
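The last inequality only needs \((1 - 2^{-n/3})(1 - 2^{-n}) / 2 \ge 1/3\) for \(n \ge 16\). The sketch below checks this constant numerically and evaluates the resulting lower bound for one parameter choice. We read \(n / 8\lg (n)\) as \(n / (8 \lg n)\) with \(\lg \) the base-2 logarithm; the function names and the sample parameters are ours.

```python
from math import floor, log2

def s_value(n, q):
    """s = max{ floor(n / (8 lg n)), q / 2^n } (ASSUMPTION: lg is log base 2)."""
    return max(floor(n / (8 * log2(n))), q / 2 ** n)

def advantage_lower_bound(n, k, p, q):
    """The final lower bound s * p^2 / (3 * 2^(2k)) on the advantage."""
    return s_value(n, q) * p ** 2 / (3 * 2 ** (2 * k))

# Constant check behind the last inequality, for block lengths 16..256.
constant_ok = all(
    (1 - 2 ** (-n / 3)) * (1 - 2.0 ** -n) / 2 >= 1 / 3
    for n in range(16, 257)
)
print(constant_ok)

# Toy parameters with 16 <= n < k, 2^n / n <= q <= 2^k and p <= 2^k / sqrt(s).
print(s_value(16, 2 ** 20), advantage_lower_bound(16, 32, 2 ** 20, 2 ** 20))
```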

Discussion. What’s the problem if we are not given keys at the end of the querying process? Now we have many pairs \((i, j)\) such that \(u_i = v_j\). One such pair will yield the key \((J_i, J_{j + p})\) for some user, but we don’t know which user. Moreover, there are too many such pairs (one can show that in the ideal world, there are on average \(O(p^2 / 2^n)\) of them), and most of them are just false positives.