
1 Introduction

The design of symmetric cryptographic constructions exhibiting a clear and ideally low-degree algebraic structure is motivated by many recent use cases, for example the increasing popularity of new proof systems such as STARKs [8], SNARKs (e.g., Pinocchio [44]), Bulletproofs [19], and other concepts like secure multi-party computation (MPC). To provide good performance in these new applications, ciphers and hash functions are designed in order to minimize specific characteristics (e.g., the total number of multiplications, the depth, or other parameters related to the nonlinear operations). In contrast to traditional cipher design, the size of the field over which these constructions are defined has only a small impact on the final cost. In order to achieve this new performance goal, some crucial differences arise between these new designs and traditional ones. For example, we can consider the substitution (S-box) layer, that is, the operation providing nonlinearity in the permutation: In these new schemes, the S-boxes composing this layer are relatively large compared to the ones used in classical schemes (e.g., they operate over \(64\) or \(128\) bits instead of \(4\) or \(8\) bits) and/or they can usually be described by a simple low-degree nonlinear function (e.g., \(x \mapsto x^d\) for some d). Examples of these schemes include LowMC [4], MiMC [3], Jarvis/Friday [6], GMiMC [2], HadesMiMC [31], Vision/Rescue [5], and Starkad/Poseidon [30].

The structure of these schemes has a significant impact on the attacks that can be mounted. While statistical attacks (including linear [42] and differential [11] ones) are among the most powerful techniques against traditional schemes, algebraic attacks turned out to be especially effective against these new primitives. In other words, these constructions are naturally more vulnerable to algebraic attacks than those which do not exhibit a clear and simple algebraic structure. For example, this has been shown in [1], in which algebraic strategies covering the full-round versions of the attacked primitives are described. Although the approaches can be quite different, most of them exploit the low degree of the construction.

In this paper, we focus on MiMC [3]. The MiMC design constructs a cryptographic permutation by iterated cubing, interleaved with additions of random constants to break any symmetries. A secret key is added after every such round to obtain a block cipher. The design of MiMC is very flexible and can work with binary strings as well as integers modulo some prime number. Security analysis by the designers rules out various statistical attacks, and the final number of rounds is derived from an analysis of attack vectors that exploit the simple algebraic structure. We remark that the designers chose the number of rounds with a minimal security margin for efficiency. For a more detailed specification and a summary of previous analysis, we refer to Sect. 2.3.

Since its publication in 2016, MiMC has become the preferred choice for many use cases that benefit from a low multiplication count or algebraic simplicity [32, 45]. It also serves as a baseline for various follow-up designs evaluated in the context of the public “STARK-Friendly Hash Challenge” competition.

1.1 Our Contribution

As the main results in this paper, we present

  1. a new upper bound for the algebraic degree growth in key-alternating ciphers with low-degree round functions,

  2. a secret-key higher-order distinguisher on almost full MiMC over \(\mathbb {F}_{2^n}\),

  3. a known-key zero-sum distinguisher on almost double the rounds of MiMC,

  4. the first key-recovery attack on full-round MiMC over \(\mathbb {F}_{2^n}\).

We also show that the technique we use for MiMC is sufficiently generic to apply to any permutation fulfilling specific properties, which we will define in detail. Our attacks and distinguishers on MiMC, as well as other attacks in the literature, are listed in Table 1.

Table 1. Various attacks on MiMC. In this representation, \(n\) denotes the block size (and key size). The unit for the attack complexity is usually the cost of a single encryption (number of multiplications over \(\mathbb {F}_{2^n}\) necessary for a single encryption). The SK and KR attacks can be implemented using chosen plaintexts CP and/or chosen ciphertexts CC. The memory complexity is negligible for all approaches listed.

Secret-Key Higher-Order Distinguishers. After recalling some preliminary facts about higher-order differentials, in Sect. 3 we analyze the growth of the algebraic degree for key-alternating ciphers whose round function can be described as a low-degree polynomial over \(\mathbb {F}_{2^n}\).

For an SPN cipher over a field \(\mathbb {F}\) where each round has algebraic degree \(\delta \), the algebraic degree of the cipher is expected to grow essentially exponentially in \(\delta \). Several analyses made in the literature [17, 18, 20] confirm this growth for most ciphers, except when the algebraic degree of the function is close to its maximum. As a result, the number of rounds necessary for security against higher-order differential attacks generally grows logarithmically in the size of \(\mathbb {F}\). Different behaviour has been observed for certain non-SPN designs, such as some designs with partial nonlinear layers where the algebraic degree grows exponentially in some (not necessarily integer) value smaller than \(\delta \) [26].

In Sect. 3, we show that if the round function can be described as an invertible low-degree polynomial function in \(\mathbb {F}_{2^n}\), then the algebraic degree grows linearly with the number of rounds, and not exponentially as generally expected. More precisely, let d denote the exponent of the power function \(x\mapsto x^d\) used to define the S-boxes. Then, we show that in the case of key-alternating ciphers over \(\mathbb {F}_{2^n}\), the algebraic degree \(\delta (r)\) as a function in the number of rounds r is

$$ \delta (r) \in \mathcal {O}(\log _2(d^r)) = \mathcal {O}(r). $$

As an immediate consequence, our observation implies that roughly \(n \cdot \log _d(2)\) rounds are necessary to provide security against higher-order differential attacks, much more than the expected \(\approx \log _\delta (n -1)\) rounds.

Distinguishers on MiMC over \(\mathbb {F}_{2^n}\). Our new bounds on the number of rounds necessary to provide security against higher-order differential cryptanalysis have a major impact on key-alternating ciphers with large S-boxes. A concrete example for this class of ciphers is MiMC [3], a key-alternating cipher defined over \(\mathbb {F}_{2^n}\) (for odd \(n\in \mathbb {N}\)), where the round function is simply defined as the cube map \(x\mapsto x^3\). Since any cubic function over \(\mathbb {F}_{2^n}\) has algebraic degree 2, one may expect that approximately \(\log _2(n)\) rounds are necessary to prevent higher-order differential attacks. Our new bound implies that a much larger number of rounds is required to provide security, namely approximately \(n \cdot \log _3(2)\).

As a concrete example, in Sect. 4 we show that MiMC-\(n/n\) has a security margin of only 1 or 2 rounds against (secret-key) higher-order distinguishers (depending on \(n\)), which is much smaller than expected by the designers. Moreover, we can set up a known-key distinguisher for approximately double the number of rounds of MiMC, by showing that the same number of rounds is necessary to reach the maximum degree in the decryption direction. Our findings have been practically verified on toy versions.

We remark that the designers presented other non-random properties (including GCD and interpolation attacks) that can cover a similar number of rounds. The number of rounds proposed by the designers were chosen in order to provide security against key-recovery attacks based on these properties. As we are going to show, the number of rounds is not sufficient against our new attack based on a higher-order differential property.

Results using the Division Property. For completeness, in Sect. 4.5 we search for higher-order distinguishers for MiMC-\(n/n\) with the division property [46] proposed by Todo at Eurocrypt 2015, a powerful tool for finding the best integral distinguishers for block ciphers. By modeling the most recently proposed variant of the bit-based division property, which is called three-subset bit-based division property without unknown subset in [34], we are able to reproduce exactly the same higher-order distinguishers for cases with small \(n\)-bit S-boxes, where \(n \in \{5,7,9\}\). However, as far as we know, it is an open problem to model the three-subset bit-based division property for S-boxes larger than 9 bits in practical time. Therefore, we conclude that the division property is unlikely to help us for the ciphers we focus on.

Key-Recovery Attack on MiMC-\(n/n\) and on Generic Ciphers. A trivial way to extend an r-round distinguisher to an \((r+1)\)-round key-recovery attack is based on guessing the last round key, partially decrypting/encrypting, and finally exploiting the distinguisher to filter wrong key guesses. Unfortunately, this strategy does not work for MiMC, since guessing the full last round key required to invert the large S-box is equivalent to exhaustive key search. Another key-recovery approach that has been combined with integral distinguishers is based on interpolating the Boolean polynomials that define the final rounds. However, this strategy requires evaluating the distinguisher several times to collect enough equations, which is not feasible for our distinguisher due to its large data complexity.

In Sect. 5, we show how to solve this problem. Instead of guessing the last round key, we set up an equation over \(\mathbb {F}_{2^n}\) with the master key as a variable. To obtain this equation, we symbolically express the zero sum at the input to the last round as a polynomial function of the key, whose coefficients depend on the queried ciphertexts. We show how the resulting polynomial equation can be solved efficiently to recover the key. As a result, in the chosen-ciphertext case only, recovering the key from this data for the full n-bit version of MiMC takes the equivalent of less than \(2^{n-\log _2(n) +1}\) calls to MiMC, \(2^{n-1}\) chosen ciphertexts, and negligible amounts of memory. Moreover, we show that approximately \(\lceil \log _3(2\cdot R)\rceil \) more rounds (where \(R = \lceil n \cdot \log _3(2)\rceil \) is the current number of rounds of MiMC-n/n) can be necessary and sufficient to restore the security against the key-recovery attack presented here. This would, for example, imply that we need to add 5 more rounds for the most used version MiMC-129/129 (which currently has 82 rounds).

A Generic Strategy. Our strategy is an instance of a broader class of algebraic key-recovery approaches based on solving equations in the key variables. As such, it shares some ideas with other algebraic approaches like optimized interpolation attacks. However, while most algebraic key-recovery approaches of the last years construct and solve systems of many Boolean linear equations, we use a single univariate equation of higher degree that can be solved with polynomial factoring algorithms such as Berlekamp’s algorithm. In Sect. 6, we outline a more detailed and generic procedure for such an attack. It is interesting to note that a comparatively old technique which basically disappeared for the cryptanalysis of AES-like ciphers turns out to be very competitive for schemes with large S-boxes.

2 Preliminaries

In this section, we recall the most important results about polynomial representations of Boolean functions and summarize the currently best known results regarding the growth of the algebraic degree in the context of SP networks. We also provide the specification of MiMC and give an overview of previous cryptanalytic results.

We emphasize that in general it is only possible to give a lower bound regarding the number of rounds which we can attack using higher-order differential techniques, in the following denoted as “necessary number of rounds to provide security”. While upper-bounding the algebraic degree is more important from an adversary’s point of view, lower bounds on the degree are much more relevant when arguing about security against algebraic attacks (e.g., [24, 38, 40, 49]) from a designer’s viewpoint. However, at the current state of the art and to the best of our knowledge, it seems hard to find such a lower bound for a given cipher without investigating concrete instances experimentally – which, of course, limits the scope of any analysis.

2.1 Polynomial Representations over Binary Extension Fields

We denote addition (and subtraction) in binary extension fields by the symbol \(\oplus \). For \(n \in \mathbb {N}\), every function \(F : \mathbb {F}_{2^n} \rightarrow \mathbb {F}_{2^n}\) can be uniquely represented by an \(n\)-tuple \((F_1, F_2, \dots , F_n)\) of polynomials over \(\mathbb {F}_2\) in \(n\) variables with a maximum degree of \(1\) in each variable. In this representation, \(F_i\) is of the form

$$\begin{aligned} F_i(X_1,\dots ,X_{n}) = \bigoplus _{u=(u_1,\dots ,u_{n})\in \{0,1\}^{n}} \varphi _i(u) \cdot X_1^{u_1}\cdot \dots \cdot X_{n}^{u_{n}}, \end{aligned}$$
(1)

where the coefficients \(\varphi _i(u)\) can be computed by the Moebius transform.
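To make the representation in Eq. (1) concrete, the following Python sketch computes the ANF coefficients of a Boolean function from its truth table with the usual in-place butterfly form of the Moebius transform. The function names, the bit-indexing convention, and the example function are illustrative assumptions and are not taken from the paper.

```python
def moebius_transform(truth_table):
    """Return the ANF coefficients of a Boolean function.

    truth_table[x] is f(x) in {0, 1}; bit j of the index x holds the value
    of the variable X_{j+1} (an indexing convention assumed here).
    Entry u of the result is the coefficient of prod_j X_{j+1}^{u_j}.
    """
    coeffs = list(truth_table)
    size = len(coeffs)
    assert size > 0 and size & (size - 1) == 0, "length must be a power of two"
    step = 1
    while step < size:
        for block in range(0, size, 2 * step):
            for i in range(block, block + step):
                # Fold the lower half of each block into the upper half.
                coeffs[i + step] ^= coeffs[i]
        step *= 2
    return coeffs


def algebraic_degree(truth_table):
    """Largest Hamming weight of a monomial with nonzero ANF coefficient.

    The zero function is reported as degree 0 here for simplicity.
    """
    coeffs = moebius_transform(truth_table)
    return max((bin(u).count("1") for u, c in enumerate(coeffs) if c), default=0)


# Hypothetical example: f(X1, X2, X3) = X1*X2 xor X3.
table = [((x & 1) & ((x >> 1) & 1)) ^ ((x >> 2) & 1) for x in range(8)]
anf = moebius_transform(table)
degree = algebraic_degree(table)
```

Over \(\mathbb {F}_2\) the transform is an involution, so applying it to the coefficient vector returns the truth table again.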

As is common, we denote functions \(F : \mathbb {F}_{2}^n \rightarrow \mathbb {F}_{2}\) as Boolean functions and functions of the form \(F : \mathbb {F}_{2}^n \rightarrow \mathbb {F}_{2}^m\), for \(n,m\in \mathbb {N}\), as vectorial Boolean functions.

Definition 1

The algebraic normal form (ANF) of a Boolean function \(F:\mathbb {F}_{2}^n\rightarrow \mathbb {F}_2\), as given in Eq. (1), is the unique representation as a polynomial over \(\mathbb {F}_2\) in n variables and with a maximum univariate degree of 1. The algebraic degree \(\delta (F)\) of F – or \(\delta \) for simplicity – is the degree of the above representation of F as a multivariate polynomial over \(\mathbb {F}_2\). If \(G:\mathbb {F}_{2}^n\rightarrow \mathbb {F}_2^n\) is a vectorial Boolean function and \((G_1,\dots , G_n)\) is its representation as an n-tuple of multivariate polynomials over \(\mathbb {F}_2\), then its algebraic degree \(\delta (G)\) is defined as \( \delta (G):=\max _{1\le i\le n} \delta (G_i). \)

The link between the algebraic degree and the univariate degree of a vectorial Boolean function is well-known, and is for example established in [22]: the algebraic degree of \(F:\mathbb {F}_{2^n}\rightarrow \mathbb {F}_{2^n}\) can be computed from its univariate polynomial representation, and is equal to the maximum Hamming weight of the binary expansion of its exponents.

Lemma 1

Let \(F:\mathbb {F}_{2^n}\rightarrow \mathbb {F}_{2^n}\) be a function and let \(F(X)=\sum _{i=0}^{2^n-1} \varphi _i\cdot X^i\) denote the corresponding univariate polynomial description over \(\mathbb {F}_{2^n}\). The algebraic degree \(\delta (F)\) of F as a vectorial Boolean function is the maximum Hamming weight of its exponents, i.e., it is \( \delta (F) = \max _{0\le i\le 2^n -1} \left\{ \mathrm {hw}(i) \, | \, \varphi _i \ne 0\right\} . \)
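Lemma 1 reduces the computation of the algebraic degree to counting bits in exponents. A minimal Python sketch follows; the helper names are our own, and the input is assumed to be the list of exponents \(i\) with \(\varphi _i \ne 0\).

```python
def hamming_weight(i):
    """Number of ones in the binary expansion of the exponent i."""
    return bin(i).count("1")


def algebraic_degree_from_exponents(exponents):
    """Algebraic degree of F(X) = sum_i phi_i X^i per Lemma 1.

    `exponents` lists the i with phi_i != 0 (an empty list gives 0 here).
    """
    return max((hamming_weight(i) for i in exponents), default=0)


# The cube map X^3 has the single exponent 3 = 0b11.
cube_degree = algebraic_degree_from_exponents([3])

# For n = 7, the smallest exponent of Hamming weight n - 1 is 2^(n-1) - 1 = 63.
n = 7
smallest_heavy = min(i for i in range(2 ** n) if hamming_weight(i) == n - 1)
```

For instance, a polynomial of univariate degree 27 over \(\mathbb {F}_{2^7}\) can only contain exponents of Hamming weight at most 4, so its algebraic degree is at most 4.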

2.2 Higher-Order Differential Cryptanalysis

Higher-order differential attacks [38, 40] form a prominent class of attacks exploiting the low algebraic degree of a nonlinear transformation such as a classical block cipher. If this degree is sufficiently low, an attack using multiple input texts and their corresponding output texts can be mounted. In more detail, if the algebraic degree of a Boolean function \(f\) is \(\delta \), then, when applying \(f\) to all elements of an affine vector space \(\mathcal {V} \oplus c\) of dimension greater than \(\delta \) and taking the sum of these values, the result is \(0\), i.e., \(\bigoplus _{v \in \mathcal {V} \oplus c} f(v) = 0\).
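The zero-sum property can be checked exhaustively on a small example. The sketch below uses a hypothetical Boolean function of algebraic degree 2 on 4 bits and sums it over every coset of a 3-dimensional subspace (where the sum must vanish) and of a 2-dimensional subspace (where no such guarantee exists). All names and the choice of bases are illustrative assumptions.

```python
from itertools import product


def f(x):
    """Hypothetical degree-2 Boolean function: x0*x1 xor x2*x3 xor x1."""
    b = [(x >> j) & 1 for j in range(4)]
    return (b[0] & b[1]) ^ (b[2] & b[3]) ^ b[1]


def affine_space(basis, offset):
    """All elements of span(basis) xor offset."""
    elems = []
    for coeffs in product((0, 1), repeat=len(basis)):
        v = offset
        for c, vec in zip(coeffs, basis):
            if c:
                v ^= vec
        elems.append(v)
    return elems


def xor_sum(func, points):
    """XOR of func over all given points."""
    acc = 0
    for p in points:
        acc ^= func(p)
    return acc


# Dimension 3 > degree 2: the sum is zero for every offset c.
basis3 = [0b0011, 0b0110, 0b1000]
sums_dim3 = [xor_sum(f, affine_space(basis3, c)) for c in range(16)]

# Dimension 2 = degree 2: the sum is not guaranteed to vanish.
basis2 = [0b0001, 0b0010]
sums_dim2 = [xor_sum(f, affine_space(basis2, c)) for c in range(16)]
```

In this example the 2-dimensional sums equal the coefficient of the monomial \(x_0 x_1\), which is why they are nonzero for every offset.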

Security Against Higher-Order Differential Attacks – State of the Art. To prevent higher-order differential attacks against iterated block ciphers, one would usually want the maximum algebraic degree to be reached (well) within the suggested number of rounds. To achieve this goal, and to assess the security margins, it is crucial to estimate how the algebraic degree grows with the number of rounds.

The algebraic degree of composing two functions, \(F, G: \mathbb {F}_2^n \rightarrow \mathbb {F}_2^n\), can be generically bounded by

$$\begin{aligned} \deg (F\circ G) \le \deg (F) \cdot \deg (G), \end{aligned}$$
(2)

and hence an upper bound is found by iterative use of this on the round function. The resulting bound does, however, fail to reflect the real growth of the algebraic degree for many cryptosystems, and the problem of estimating the growth has been widely studied in the literature. After the initial work of Canteaut and Videau [20], a tighter upper bound was presented by Boura, Canteaut, and De Cannière [18] at FSE’11. There, the authors show how to deduce a new bound for the algebraic degree of iterated permutations for a special category of SP networks over \((\mathbb {F}_{2^n})^t\), which includes functions that have a number \(t \ge 1\) of balanced S-boxes as their nonlinear layer. Specifically, the authors show that the algebraic degree of the considered SP network grows almost exponentially, except when it is close to its maximum.

Proposition 1

([18]). Let F be a function from \(\mathbb {F}_{2}^N\) to \(\mathbb {F}_{2}^N\) corresponding to the concatenation of t smaller S-boxes \(S_1, \dots , S_t\) defined over \(\mathbb {F}_2^n\). Then, for any function G from \(\mathbb {F}_{2}^N\) to \(\mathbb {F}_{2}^N\), we have

$$\begin{aligned} \deg (G\circ F(\cdot ))\le \min \biggl \{\deg (F) \cdot \deg (G), N - \frac{N - \deg (G)}{\gamma } \biggl \},\text { where} \end{aligned}$$
(3)
$$\begin{aligned} \gamma = \max _{i=1, \dots , n-1} \frac{n-i}{n-\delta _i} \le n-1, \end{aligned}$$
(4)

and where \(\delta _i\) is the maximum degree of the product of any i coordinates of any of the smaller S-boxes.

Thus, the number of rounds necessary to prevent higher-order differential attacks is in general bigger than the one obtained using the trivial bound in Eq. (2).

2.3 Specification and Previous Analysis of MiMC

MiMC [3] is a key-alternating \(n\)-bit block cipher, where in each round the same \(n\)-bit key is added to the state. The nonlinear component of the construction is the evaluation of the cube function \(f(x) = x^3\) over \(\mathbb {F}_{2^n}\). Additionally, a different round constant is added in each round to break symmetries, where the first round constant is \(0\). The total number of rounds is then

$$\begin{aligned} r = \left\lceil n\cdot \log _3(2) \right\rceil , \end{aligned}$$

and we refer to Fig. 1 for a graphical representation of the encryption function.

Fig. 1. The MiMC encryption function with \(r\) rounds.

MiMC is defined to work over prime fields and binary fields. In this paper, we focus on the binary field versions of MiMC, for which the block size \(n\) has to be odd in order for the S-box to be a permutation.
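The description above can be turned into a toy implementation. The following Python sketch instantiates a scaled-down MiMC over \(\mathbb {F}_{2^7}\); the reduction polynomial \(x^7 + x + 1\), the seeded random round constants, the example key, and all function names are assumptions made for illustration only. The final key addition follows the MiMC specification in [3].

```python
import random

N = 7                  # toy block size (must be odd)
MODULUS = 0b10000011   # x^7 + x + 1, an assumed irreducible polynomial


def gf_mul(a, b):
    """Multiply two elements of F_2[x]/(MODULUS), given as integers < 2^N."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        b >>= 1
        a <<= 1
        if a >> N:          # reduce as soon as the degree reaches N
            a ^= MODULUS
    return result


def cube(x):
    """The MiMC S-box x -> x^3."""
    return gf_mul(gf_mul(x, x), x)


def num_rounds(n):
    """ceil(n * log_3(2)), computed as the smallest r with 3^r >= 2^n."""
    r = 0
    while 3 ** r < 2 ** n:
        r += 1
    return r


def mimc_encrypt(x, key, constants):
    """One key and one round constant are added before each cubing."""
    state = x
    for c in constants:
        state = cube(state ^ key ^ c)
    return state ^ key      # final key addition


rng = random.Random(1)  # fixed seed, only for reproducibility of the sketch
# The first round constant is 0; the others are random in this sketch.
CONSTANTS = [0] + [rng.randrange(1 << N) for _ in range(num_rounds(N) - 1)]
KEY = 0b1011001         # arbitrary example key
ciphertext = mimc_encrypt(0b0000001, KEY, CONSTANTS)
```

Since \(\gcd (3, 2^n-1) = 1\) for odd \(n\), the cube map is a permutation, and so is the whole encryption function for any fixed key.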

MiMC: Related Attacks in the Literature. The designers recommend MiMC with \(\lceil n \cdot \log _3(2) \rceil \) rounds [3]. They derive this number of rounds by considering a variety of different key-recovery attacks on MiMC. According to their analysis, the most powerful attacks are interpolation [36] and GCD attacks. About higher-order differential attacks, the authors claim that “the large number of rounds ensures that the algebraic degree of MiMC in its native field will be maximum or almost maximum. This naturally thwarts higher-order differential attacks [...]”.

The first attack on MiMC-n/n [41], presented at SAC 2019, targets a reduced-round version of MiMC proposed by the designers for a scenario in which the attacker has only limited memory, but it does not affect the security claims of full-round MiMC. The Feistel version of MiMC was attacked shortly after, by using generic properties of the used Feistel construction instead of exploiting properties of the primitive itself [16]. Finally, a specific attack on MiMC using Gröbner bases was considered in [1]. The authors state that by introducing a new intermediate variable in each round, the resulting multivariate system of equations is already a Gröbner basis and thus the first step of a Gröbner basis attack is for free. However, recovering univariate polynomials from this representation and then applying techniques like the GCD attack will result in a prohibitively large computational complexity, since the recovered polynomials will be of degree \(\approx 3^r\) after \(r\) rounds. Hence, the authors conclude that MiMC cannot be attacked directly by using known Gröbner basis techniques.

3 Higher-Order Differentials of Key-Alternating Ciphers

Our bound on the growth of the algebraic degree does not depend on the cubing of the round function in MiMC, so we introduce the following generalization of the result on MiMC from Sect. 2.3.

3.1 Setting

Let \(E^r_k: \mathbb {F}_{2^n} \rightarrow \mathbb {F}_{2^n}\) be a key-alternating cipher defined by

$$\begin{aligned} E^r_k(x) := k_r \oplus R( \cdots R(k_1 \oplus R(k_0 \oplus x)) \cdots ) \end{aligned}$$
(5)

over \(r\ge 1\) rounds, where \(k_0, k_1, \dots , k_r\in \mathbb {F}_{2^n}\) are derived from a master key \(k\in \mathbb {F}_{2^n}\) using a key schedule. Each round function \(R:\mathbb {F}_{2^n}\rightarrow \mathbb {F}_{2^n}\) is defined as some invertible univariate polynomial function

$$\begin{aligned} R(x) := \rho _0 \oplus \bigoplus _{i = 1}^d \rho _i \cdot x^i \end{aligned}$$
(6)

of univariate degree \(d \ge 3\), where \(\rho _i\in \mathbb {F}_{2^n}\) and \(\rho _d \ne 0\). We will, without loss of generality, assume \(d \le d_{\text {inv}}\), where \(d_{\text {inv}}\) denotes the degree of the compositional inverse of R (otherwise, an attacker would target the decryption function instead). Furthermore, we assume that the round function has low univariate degree, i.e., low compared to the size of \(\mathbb {F}_{2^n}\). In other words, we work with \(d\ll 2^n-1\).

3.2 Growth of the Degree

In this section, we show that the algebraic degree \(\delta \) of a key-alternating cipher \(E_k^r\) grows much more slowly than commonly presented in the literature. More precisely, in some cases it can grow linearly in the number of rounds and not exponentially.

Proposition 2

Let \(E^r_k\) be an \(r\)-round key-alternating block cipher with a round function \(R\) of degree \(d\), as defined in Eq. (5). If \(r \le \mathcal {R}_{\text {lin}}{} - 1\), where

$$\begin{aligned} \mathcal {R}_{\text {lin}}{} = \left\lceil \log _d\bigl (2^{n-1}-1\bigl ) \right\rceil \approx (n-1) \cdot \log _d(2), \end{aligned}$$
(7)

then the algebraic degree \(\delta \) of \(E^r_k\) is at most \(n-2\). In this case, a (secret-key) higher-order distinguisher using at most \(2^{n-1}\) data can be applied to \(E^r_k\).

Proof

Due to the relation between the word-level degree and the algebraic degree, \(E_k^r\) reaches its maximum algebraic degree of \(n-1\) if at least one monomial with the exponent \(2^n - 2^{j} - 1\) (for \(0\le j < n\)) appears in the polynomial representation. Indeed, note that all these monomials have an algebraic degree of \(n-1\). Since the smallest exponent of this form is \(2^n-2^{n-1}-1 = 2^{n-1}-1\), and since the degree of \(E_k^r\) after r rounds is at most \(d^r\), we require \(d^r\ge 2^{n-1}-1\) to make \(x^{2^{n-1}-1}\) appear, or equivalently,

$$\begin{aligned} r \ge \lceil \log _d(2^{n-1}-1) \rceil . \end{aligned}$$

Hence, the degree is not maximal for \(r < \lceil \log _d(2^{n-1}-1) \rceil \) and a higher-order distinguisher using at most \(2^{n-1}\) data can be applied.   \(\square \)

The Difficulty of Lower-Bounding the Growth of the Degree. We point out that it is always possible to set up a (secret-key) higher-order distinguisher if the number of rounds is smaller than \(\mathcal {R}_{\text {lin}}{}\). However, a number of rounds greater than or equal to \(\mathcal {R}_{\text {lin}}{}\) does not necessarily provide security.

One of the main obstacles to deriving a sufficient condition for the number of rounds that provides security is the difficulty of analyzing the non-vanishing coefficients in the polynomial representation of \(E_k^r\). Note that in general it is not easy to give a condition guaranteeing that a particular monomial appears, since many factors (including the secret key, the constant addition, and the details of the S-box) influence the result.

Without going into the details, we consider the influence of the S-box in some concrete examples. Working with \(R(x) = x^d\) for a certain \(3\le d\le 2^{n}-2\) (where \(d\ne 2^{d^\prime }\) for \(d^\prime \in \mathbb {N}\)), we focus for simplicity only on the two extreme cases \(d =2^{d^\prime } \pm 1\). Lucas’s Theorem yields the following:

  • If \(d = {2^{d^\prime }+1}\) for some \(d^\prime \in \mathbb {N}\), then the output of a single round is sparse:

    $$ (x\oplus y)^{2^{d^\prime } + 1} = x^{2^{d^\prime }+1} \oplus x^{2^{d^\prime }} \cdot y \oplus y^{2^{d^\prime }} \cdot x \oplus y^{2^{d^\prime }+1} $$

    (note that it contains only 4 terms instead of \(d +1 = {2^{d^\prime }+2}\)).

  • If \(d = {2^{d^\prime } - 1}\) for some \(d^\prime \in \mathbb {N}\), then the output of a single round is full, since

    $$ (x\oplus y)^{2^{d^\prime } - 1} = \bigoplus _{i=0}^{2^{d^\prime }-1} x^i \cdot y^{2^{d^\prime }-1-i}. $$
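The two cases above can be checked numerically. By Lucas’s Theorem modulo 2, the binomial coefficient \(\binom{d}{i}\) is odd exactly when every bit set in \(i\) is also set in \(d\). The Python sketch below (with a helper name of our own choosing) lists the surviving terms of \((x\oplus y)^d\) on that basis.

```python
from math import comb


def odd_binomial_indices(d):
    """Indices i in [0, d] with binomial(d, i) odd, via Lucas's theorem mod 2.

    The term x^i * y^(d-i) survives in (x xor y)^d exactly for these i.
    """
    return [i for i in range(d + 1) if (i & ~d) == 0]


# d = 2^d' + 1: only four terms survive, independently of d'.
sparse_cases = {d: odd_binomial_indices(d) for d in (3, 5, 9, 17)}

# d = 2^d' - 1: all d + 1 terms survive.
full_cases = {d: odd_binomial_indices(d) for d in (3, 7, 15)}
```

Note that \(d = 3\) belongs to both families, which is consistent: \((x\oplus y)^3\) has \(4 = d+1\) terms.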

Even if a single round is not sparse, the output of several combined rounds is not guaranteed to be full (even if it is in general dense). As a concrete example, while the output of \((x \oplus k_0)^3\oplus k_1\) is full, the same is not true for

$$\begin{aligned} \begin{aligned} ((x \oplus k_0)^3 \oplus&k_1)^3 \oplus k_2 = x^9 \oplus x^8 \cdot k_0 \oplus x^6 \cdot k_1 \oplus x^4 \cdot k_0^2 \cdot k_1 \oplus x^3 \cdot k_1^2 \\&\oplus x^2 \cdot (k_0 \cdot k_1^2 \oplus k_0^4 \cdot k_1) \oplus x \cdot (k_0^8 \oplus k_0^2 \cdot k_1^2) \oplus c(k_0, k_1, k_2), \end{aligned} \end{aligned}$$
(8)

where both \(x^5\) and \(x^7\) are missing, and where \(c(k_0, k_1, k_2)\) is a function that depends only on the keys. This simple example emphasizes the difficulty of analyzing the sparsity of the polynomial that defines \(E_k\).
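The two-round expansion can be reproduced mechanically. The Python sketch below treats \(x, k_0, k_1, k_2\) as formal variables with coefficients in \(\mathbb {F}_2\) and represents a polynomial as a set of exponent tuples, so that addition is a symmetric difference. This representation and all names are our own illustrative choices; the sketch lists the powers of \(x\) that occur together with their key-dependent coefficients.

```python
from itertools import product


def poly_add(p, q):
    """Addition over F_2: monomials occurring twice cancel."""
    return p ^ q


def poly_mul(p, q):
    """Multiplication over F_2 of polynomials given as sets of exponent tuples."""
    result = set()
    for m1, m2 in product(p, q):
        mono = tuple(a + b for a, b in zip(m1, m2))
        result ^= {mono}    # toggle: a second occurrence cancels the first
    return result


def cube(p):
    return poly_mul(poly_mul(p, p), p)


# Exponent tuples are ordered as (x, k0, k1, k2).
X, K0, K1, K2 = {(1, 0, 0, 0)}, {(0, 1, 0, 0)}, {(0, 0, 1, 0)}, {(0, 0, 0, 1)}

one_round = poly_add(cube(poly_add(X, K0)), K1)
two_rounds = poly_add(cube(one_round), K2)

# Group the key monomials (k0, k1, k2 exponents) by the power of x.
coeffs = {}
for ex, e0, e1, e2 in two_rounds:
    coeffs.setdefault(ex, set()).add((e0, e1, e2))

x_powers = sorted(coeffs)
for ex in reversed(x_powers):
    print(f"x^{ex}:", sorted(coeffs[ex]))
```

The same routine can be iterated for more rounds, although the number of monomials in the key variables grows quickly.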

3.3 Comparison with Other Bounds

We now compare the new number of rounds necessary to provide security against secret-key higher-order distinguishers with other possible bounds. An alternative strategy is to apply generic bounds focusing on the algebraic degree of the round function, as recalled in Proposition 1. Recall that \(\mathcal {R}_{\text {lin}}{}\) is the number of rounds from Proposition 2, and we will denote the number of rounds based on generic bounds by \(\mathcal {R}_{\text {gen}}{}\). The comparison will make use of \(\delta _{\text {lin}}{}(r)\), the upper bound on the algebraic degree after r rounds following Proposition 2. The upper bound from Eq. (3) will be denoted by \(\delta _{\text {gen}}{}(r)\). Note that \(\delta _{\text {gen}}{}(r)\) can, for example, take advantage of a slower growth in the algebraic degree, as in Eq. (8) by considering two rounds instead of one. Despite this, the overall trend of \(\delta _{\text {gen}}{}(r)\) will still be exponential. On the other hand, if the round function can be described by a polynomial of low univariate degree d over \(\mathbb {F}_{2^n}\), we expect a linear behaviour in \(\delta _{\text {lin}}{}(r)\):

$$ \delta _{\text {lin}}{}(r) \le \left\lfloor \log _2(d^r + 1) \right\rfloor \approx r \cdot \log _2(d). $$

As a result, the round numbers \(\mathcal {R}_{\text {lin}}{}\) and \(\mathcal {R}_{\text {gen}}{}\) necessary to provide security grow respectively linearly and logarithmically in the size n of the field, namely

$$ \mathcal {R}_{\text {lin}}{} \in \mathcal {O}(n) \qquad \text {and} \qquad \mathcal {R}_{\text {gen}}{} \in \mathcal {O}(\log _{\delta }(n)). $$

A concrete comparison of \(\delta _{\text {lin}}{}(r)\) and \(\delta _{\text {gen}}{}(r)\) for MiMC-\(129/129\) is given in Fig. 2. In this setting we have \(\delta _{\text {lin}}{}(r) = \lfloor \log _2(3^{r} +1)\rfloor \), and \(\delta _{\text {gen}}{}(r)\) has been derived using the observation that two rounds of MiMC have algebraic degree two (see [28, App. A] for more details). In particular, we find \(\mathcal {R}_{\text {gen}}{} = 13\) and \(\mathcal {R}_{\text {lin}}{} = 81\).
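The bound \(\delta _{\text {lin}}{}(r)\) is easy to evaluate exactly with integer arithmetic. The Python sketch below (function names are our own) tabulates it next to the trivial bound \(2^r\) for \(n = 129\) and determines the first round in which \(\delta _{\text {lin}}{}(r)\) reaches \(n-1\).

```python
def delta_lin(r, d=3):
    """floor(log2(d^r + 1)), computed exactly via the bit length."""
    return (d ** r + 1).bit_length() - 1


def first_round_reaching(n, d=3):
    """Smallest r for which delta_lin(r) reaches the maximum degree n - 1."""
    r = 1
    while delta_lin(r, d) < n - 1:
        r += 1
    return r


n = 129
for r in (1, 2, 5, 7, 10, 40, 80, 81):
    trivial = min(2 ** r, n - 1)          # trivial bound, capped at n - 1
    linear = min(delta_lin(r), n - 1)     # bound from Proposition 2, capped
    print(f"r = {r:2d}: trivial <= {trivial:3d}, delta_lin <= {linear:3d}")

r_lin = first_round_reaching(n)
```

The trivial bound already saturates after 7 rounds, whereas \(\delta _{\text {lin}}{}(r)\) stays below \(n-1\) until round 81.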

Remark. We emphasize that every (invertible) S-box/round function in \(\mathbb {F}_2^n\) can be rewritten as a polynomial over \(\mathbb {F}_{2^n}\). The crucial point here is that given a “random” S-box/round function over \(\mathbb {F}_2^n\), the corresponding polynomial over \(\mathbb {F}_{2^n}\) has in general a high univariate degree (e.g., \(d \approx 2^n - \varepsilon \) for some small \(\varepsilon \)). In such a case, even if our argument still holds, the final result becomes meaningless, since \(\log _d(2^n -1) \approx \log _{2^n - \varepsilon }(2^n-1) \approx 1\) is basically constant (i.e., it does not grow linearly with n). Hence, our results turn out to be relevant only for S-boxes/round functions for which the corresponding polynomial over \(\mathbb {F}_{2^n}\) has “small” degree (namely, small compared to the field size, i.e., \(d \ll 2^n\)).

Fig. 2. Different upper bounds of the growth of the algebraic degree for MiMC-\(129/129\). The trivial bound is \(2^r\). A tighter bound, \(\delta _{\text {gen}}{}(r)\), exploits the observation that 2 rounds only have degree 2 (see Eq. (8)). Our new bound, \(\delta _{\text {lin}}{}(r)\), is linear in the number of rounds.

4 Distinguishers for Reduced-Round and Full MiMC

Exploiting the previous result, we now discuss the possibility to set up higher-order differential distinguishers and attacks on MiMC [3]. We show that

  • (1) MiMC has a security margin of only 1 or 2 round(s) against (secret-key) higher-order distinguishers, depending on \(n\), and that

  • (2) a zero-sum known-key distinguisher can be set up for approximately double the number of rounds of MiMC.

4.1 Secret-Key Higher-Order Distinguisher for MiMC

The results just presented allow us to set up a nontrivial (secret-key) higher-order distinguisher on \(\lceil \log _3(2^{n-1}-1) \rceil -1\) rounds of MiMC, where \(\lceil \log _3(2^{n-1}-1) \rceil -1 < \lceil n\cdot \log _3(2) \rceil \) for all n. Consequently, the security margin is reduced to

$$ 1\le \lceil n\cdot \log _3(2) \rceil - \left( \lceil \log _3(2^{n-1}-1) \rceil -1 \right) \le 2 $$

rounds. To give some concrete examples, MiMC has 1 round of security margin for \(n \in \{33, 63, 255\}\), and 2 rounds of security margin for \(n \in \{31, 65, 127, 129\}\).
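The security margins quoted above can be recomputed with exact integer arithmetic, which avoids floating-point rounding near the ceilings. The following Python sketch uses helper names of our own choosing.

```python
def ceil_log(base, value):
    """Smallest r >= 0 with base^r >= value (exact integer arithmetic)."""
    r, power = 0, 1
    while power < value:
        power *= base
        r += 1
    return r


def security_margin(n):
    """Full rounds of MiMC-n/n minus the rounds covered by the distinguisher."""
    full_rounds = ceil_log(3, 2 ** n)                  # ceil(n * log_3(2))
    distinguisher = ceil_log(3, 2 ** (n - 1) - 1) - 1  # R_lin - 1
    return full_rounds - distinguisher


for n in (31, 33, 63, 65, 127, 129, 255):
    print(f"n = {n:3d}: {ceil_log(3, 2 ** n):3d} rounds, margin {security_margin(n)}")
```

Because \(3^r \ne 2^n\) for all positive integers, the smallest \(r\) with \(3^r \ge 2^n\) coincides with \(\lceil n\cdot \log _3(2) \rceil \).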

4.2 Practical Results

In this section we compare the results from Proposition 2 with practical results from scaled-down versions of MiMC. The tests have been performed in the following way: Instead of computing the ANF of a keyed permutation (which is expensive even for small field sizes), we evaluate the higher-order differential zero-sum property (as given in Sect. 2.2) for a specific input vector space. Namely, for random keys, random constants, and an input subspace of dimension \(n-1\), we look for the minimum number of rounds r for which the corresponding sum of the ciphertexts is different from zero. Such a number corresponds to the number of rounds necessary to prevent higher-order distinguishers. In order to avoid the influence of weak keys or round constants, we repeated the tests multiple times (with new random keys and round constants). The practical number of rounds we give in each row is the smallest number of rounds among all tested keys and round constants necessary to prevent higher-order distinguishers. This means that a potentially higher number of rounds can be attacked by choosing the keys and round constants in a particular way.

Table 2. Theoretical and practical round numbers necessary to prevent higher-order distinguishers for MiMC over \(\mathbb {F}_{2^n}\).

The results, denoted \(\mathcal {R}\), are given in Table 2. We also present \(\mathcal {R}_{\text {lin}}{}\) (from Proposition 2) and \(\mathcal {R}_{\text {gen}}{}\) (see [28, App. A]) for comparison. We emphasize that the theoretical values predicted by \(\mathcal {R}_{\text {lin}}{}\) match the practical results in about half of the cases, and are off by at most one.

4.3 Known-Key Zero-Sum Distinguisher for MiMC

A known-key distinguisher is a scenario introduced in [39] where the attacker knows the key, and it is important in all settings in which no secret material is present. To succeed, the attacker has to discover some property of the attacked cipher that holds with a probability higher than for an ideal cipher, or is believed to be hard to exhibit generically. The goal of a known-key zero-sum distinguisher is to find a set of plaintexts and ciphertexts whose sums are equal to zero. To do this, the idea is to exploit the inside-out approach. By choosing a subspace of texts \(\mathcal {V}\), one simply defines the plaintexts as the \(r_{\text {dec}}\)-round decryption of \(\mathcal {V}\) and the ciphertexts as the \(r_{\text {enc}} \)-round encryption of \(\mathcal {V}\). Such a distinguisher can then cover \(r_{\text {enc}} + r_{\text {dec}}\) rounds. Examples of this approach are given in the literature for Keccak [7, 10, 18], Luffa [7, 18], or PHOTON [50].

In the case of MiMC, the idea is to choose \(\mathcal {V}\) as a subspace of \(\mathbb {F}_{2^n}\) of dimension \(n-1\). The maximum number of encryption rounds \(r_{\text {enc}} \) for which it is possible to guarantee a zero sum has been given in the previous paragraph. Based on Sect. 4.2, we can set up a known-key distinguisher on (more than) full MiMC-n/n. For our distinguisher on MiMC, we first recall the following result from [17].

Proposition 3

(Corollary 3 of [17]). Let F be a permutation of \(\mathbb {F}^n_2\). Then, \(\deg (F^{-1}) = n -1\) if and only if \(\deg (F) = n -1\).

Corollary 1

Let \(r_{\text {enc}}\) be the number of rounds necessary for MiMC over \(\mathbb {F}_{2^n}\) to reach its maximum algebraic degree in the encryption direction. The same number of rounds is necessary for reaching the maximum algebraic degree in the decryption direction, i.e., \(r_{\text {dec}} = r_{\text {enc}} = \lceil \log _3(2^{n-1}-1) \rceil \).

It follows that, given a subspace \(\mathcal {V} \subseteq \mathbb {F}_{2^n}\) of dimension \(n-1\), the sums of the corresponding texts after \(r_{\text {dec}} - 1\) decryption rounds and \(r_{\text {enc}} - 1\) encryption rounds are always equal to zero, i.e.,

$$ \underbrace{\bigoplus _{w \in \mathcal {V} \oplus v} R^{-(r_{\text {dec}}-1)}(w) = 0}_{\text {Zero sum}} \xleftarrow {R^{-(r_{\text {dec}}-1)}} \mathcal {V} \oplus v \xrightarrow {R^{r_{\text {enc}}-1}} \underbrace{0 = \bigoplus _{w \in \mathcal {V} \oplus v} R^{r_{\text {enc}}-1}(w)}_{\text {Zero sum}} $$

for each \(v\in \mathbb {F}_{2^n}\). Hence, a known-key zero-sum distinguisher can be set up for

$$ (r_{\text {enc}} - 1) + (r_{\text {dec}} - 1) = 2\cdot \lceil \log _3(2^{n-1}-1) \rceil - 2 $$

rounds of MiMC-n/n, which is much more than full MiMC-n/n.
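The inside-out construction can be checked on a scaled-down instance. The sketch below assumes a toy MiMC over \(\mathbb {F}_{2^9}\) with the modulus \(x^9 + x^4 + 1\), random round constants, and the subspace spanned by the eight low-order basis elements; these choices and all function names are illustrative assumptions. For \(n = 9\) we have \(r_{\text {enc}} = r_{\text {dec}} = 6\), so five rounds are covered in each direction.

```python
import random

N = 9                              # toy field size (MiMC uses odd n)
MOD = (1 << 9) | (1 << 4) | 1      # x^9 + x^4 + 1, irreducible over GF(2)
INV3 = ((1 << (N + 1)) - 1) // 3   # exponent of the inverse of x -> x^3


def gf_mul(a, b):
    """Multiply two elements of GF(2^N) in polynomial-basis representation."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if a >> N:
            a ^= MOD
    return r


def gf_pow(a, e):
    r = 1
    while e:
        if e & 1:
            r = gf_mul(r, a)
        a = gf_mul(a, a)
        e >>= 1
    return r


def rounds_forward(x, k, consts):
    for c in consts:
        x = gf_pow(x ^ k ^ c, 3)
    return x


def rounds_backward(y, k, consts):
    """Inverse of rounds_forward for the same key and constants."""
    for c in reversed(consts):
        y = gf_pow(y, INV3) ^ k ^ c
    return y


def xor_sum(values):
    acc = 0
    for v in values:
        acc ^= v
    return acc


def known_key_zero_sums(seed=0, half=5):
    rng = random.Random(seed)
    k = rng.randrange(1 << N)
    consts_dec = [rng.randrange(1 << N) for _ in range(half)]
    consts_enc = [rng.randrange(1 << N) for _ in range(half)]
    v = rng.randrange(1 << N)
    coset = [w ^ v for w in range(1 << (N - 1))]   # V + v with dim(V) = N - 1
    plaintexts = [rounds_backward(w, k, consts_dec) for w in coset]
    ciphertexts = [rounds_forward(w, k, consts_enc) for w in coset]
    return xor_sum(plaintexts), xor_sum(ciphertexts)


if __name__ == "__main__":
    print(known_key_zero_sums(seed=0))
```

In this toy setting the two sums vanish over \(5 + 5 = 10\) rounds, whereas full MiMC for \(n = 9\) has only \(\lceil 9\cdot \log _3(2) \rceil = 6\) rounds.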

4.4 Impact of the Known-Key Distinguisher on Full MiMC

Sponge Function. In [3], the authors propose a hash function by instantiating a sponge construction with MiMC\(^\pi \), a fixed-key version of MiMC. The sponge hash function is indifferentiable from a random oracle up to \(2^{c/2}\) calls to the internal permutation P (where c is the capacity) if P is modeled as a randomly chosen permutation [9]. Thus, even if it is not strictly necessary, it is desirable that MiMC is resistant against known-key distinguishers.

For completeness, we mention that even if there is a way to distinguish a permutation from a random one, it seems difficult to exploit a zero-sum distinguisher of the internal permutation of a sponge construction in order to attack the hash function. To give a concrete example, consider the case of Keccak: As a consequence of the zero-sum distinguisher found on \(18\)-round Keccak-f[1600], the number of rounds has been increased from \(18\) to \(24\) in the second round of the SHA-3 competition in order to avoid “non-ideal” properties (see [10, 18] for more details). However, the best known attack on the Keccak hash function can only be set up when using \(6\)-/7-round Keccak-f [33].

In any case, we remark that such distinguishers based on zero sums cannot be set up for an arbitrary number of rounds, and they do indeed exploit the internal properties of a primitive using the inside-out approach found in this paper and in other literature. Hence, they cannot be considered meaningless.

Other Approaches. Even though the original MiMC paper only specifies a sponge-based hash function using MiMC, there are various applications and/or specific considerations that would make a block-cipher-based approach more advantageous (like, for example, being forced to use a block size which is too small for a sponge-based approach). Another way to turn a block cipher into a hash function is to use a compression function like the Davies–Meyer one together with something like the Merkle–Damgård construction. Similar to the case of sponge constructions, the security of such an algorithm is proven in the ideal cipher model [12]. This choice is, however, not supported by the MiMC designers, who use our results to support their advice against using a block-cipher-based approach (even though such implementations can still be found; see Footnote 7). It follows that, since the attacker has control of the key in such scenarios, it is desirable for MiMC to be resistant against known- and chosen-key distinguishers, even if it does not seem to be strictly necessary.

4.5 Results Using the Division Property

Finally, in [28, App. C] we present our practical results obtained using “Mixed Integer Linear Programming (MILP)”, which models the propagation of the (conventional) bit-based division property.

The (conventional) bit-based division property [48] was proposed to investigate integral characteristics of block ciphers at a bit level. With this approach, the integral property of each bit is studied independently. Naturally, this strategy captures more information about the propagation than the word-level version, and thus integral characteristics for more rounds can be found with this new technique. For example, the integral distinguishers of SIMON32 have been improved from 10 rounds [46] (the current best result at word level) to 14 rounds [52] (obtained by the experimental method cited before).

Instead of separating the parity into the two cases “0” and “unknown” as for the (conventional) bit-based division property, the three-subset bit-based division property [48] was introduced to enhance the accuracy of the conventional one, where the parity is separated into three sets, i.e., “0”, “1”, and “unknown”. It has been shown that the three-subset bit-based division property can indeed be more accurate than the two-subset bit-based division property for some ciphers [35, 53]. However, it becomes harder to efficiently model the three-subset division property propagation even for ciphers with simple structures. Recently, [34] pointed out that the three-subset division property has a couple of known problems when applied to cube attacks, and proposed a modified three-subset bit-based division property without the “unknown” set to overcome these problems. By modeling this modified version of the three-subset bit-based division property for our cases with small n-bit S-boxes, where \(n\in \{5, 7, 9\}\), we confirm the practical results given in Table 2.

However, as far as we know, it is still an open problem to model the (modified) three-subset bit-based division property for S-boxes of size larger than 9 bits. The S-boxes we focus on in this paper can be described as a (low-degree) polynomial function in \(\mathbb {F}_{2^n}\), where n is much larger than 9. Therefore, the division property, which is commonly believed to be the most efficient tool to find the best integral distinguishers, might not help us as much for the ciphers we focus on.

5 Key-Recovery Attack on MiMC

Since the security margin of MiMC with respect to a (secret-key) higher-order distinguisher is only 1 or 2 round(s) depending on n, it is potentially possible to extend a distinguisher to a key-recovery attack. Given a subspace \(\mathcal {V}\) of plaintexts whose sum is equal to zero after r rounds, we can consider \(r+1\) rounds, partially guess the last subkey and decrypt, and filter wrong key guesses that do not satisfy the zero sum:

$$ \mathcal {V} \oplus v \xrightarrow {R^{r}(\cdot )} \underbrace{\bigoplus _{w \in \mathcal {V} \oplus v} R^{r}(w) = 0}_{\text {Higher-order distinguisher}} \xleftarrow [\text {Key guessing}]{R^{-1}(\cdot )} \underbrace{\{R^{r+1}(w) \mid w \in \mathcal {V} \oplus v\}}_{\text {Ciphertexts}}. $$

However, since the subkeys of MiMC are equal to the master key plus constants, and due to the single full-state S-box, even a (partial) decryption of a single round requires guessing the full key. As a result, a key-recovery attack on full MiMC based on this strategy seems infeasible.

In this section, we present an alternative strategy that allows us to break full-round MiMC. Since a trivial key-guessing approach is inefficient, our idea is to construct a polynomial of low degree, which we can then try to solve.

5.1 Strategy of the Attack

From Proposition 2 and Proposition 3, a zero sum can be set up for at least \(\left\lceil (n - 1)\log _3(2) \right\rceil - 1=\left\lceil n\log _3(2) \right\rceil - \varepsilon \) rounds in the encryption and decryption direction with a vector space \(\mathcal {V} \oplus v\) of dimension \(n-1\), where \(\varepsilon \in \{1, 2\}\). Recalling that \(\left\lceil n\cdot \log _3(2) \right\rceil \) is the number of rounds of full MiMC, we define \(r_\text {ZS}\), \(r_\text {KR}\) as

$$\begin{aligned} r_\text {ZS} = \left\lceil (n - 1)\log _3(2) \right\rceil - 1 \qquad \text {and} \qquad r_\text {KR} = 1 + \left( \left\lceil n\log _3(2) \right\rceil - \left\lceil (n - 1)\log _3(2) \right\rceil \right) , \end{aligned}$$

where \(r_\text {ZS}\) is the number of rounds that we can cover with a zero sum, and \(r_\text {KR} = \left\lceil n\cdot \log _3(2) \right\rceil - r_\text {ZS} \in \{1, 2\}\) is the number of remaining rounds used for the key recovery.

Let \(f^r(x, K)\) be the function corresponding to r rounds of MiMC encryption (and let \(f^{-r}(x, K)\) correspond to r rounds of MiMC decryption), where x is the input text and K is a symbolic variable that represents the secret key k. We intend to use these functions to create a polynomial from which we can deduce k. More precisely, for a fixed vector space \(\mathcal {V} \oplus v\), we consider the equations

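$$\begin{aligned} F(K) := \bigoplus _{x \in \text {MiMC}{}_k^{-1}(\mathcal {V} \oplus v)} f^{r_\text {KR}}(x, K) = 0 \qquad \text {and} \qquad G(K) := \bigoplus _{x \in \text {MiMC}{}_k(\mathcal {V} \oplus v)} f^{-r_\text {KR}}(x, K) = 0. \end{aligned}$$
(9)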

After having received all \(x\) values from an oracle, the attacker can construct one of the polynomials \(F(K) = 0\) or \(G(K) = 0\). The secret key k can now be determined by finding the roots of either of these polynomials.

In the case of MiMC, the degree of a single encryption round is \(3\), while the degree of a single decryption round is \((2^{n+1}-1)/3\) (which is significantly larger than \(3\) for large n). Due to the slow degree growth in the encryption direction of MiMC, we will focus on finding the roots of F(K) given in Eq. (9).

Finding the Roots of Univariate Polynomials. Let \(F(X)\in \mathbb {F}_{2^{n}}[X]/\langle X^{2^{n}} + X \rangle \) be a univariate polynomial of degree \(D\). Furthermore, let \(M(D)\) denote a number such that multiplying two polynomials of degree \(\le D\) over \(\mathbb {F}_{2^{n}}\) requires \(\mathcal {O}(M(D))\) operations in \(\mathbb {F}_{2^{n}}\). For instance, a straightforward method would yield \(M(D) = D^2\), whereas \(M(D) = D \cdot \log (D)\cdot \log \log (D)\) holds for methods based on fast Fourier transforms [21]. The Berlekamp algorithm for determining the roots of F is then expected to require \(\mathcal {C} \in \mathcal {O}\left( M(D)\log (D)\log \left( 2^{n} D\right) \right) \) operations in \(\mathbb {F}_{2^{n}}\) (see [29, Chapter 14.5]).

5.2 Details of the Attack

Assume \(\mathcal {V} \oplus v\) is a coset of a subspace \(\mathcal {V}\) of dimension \(n-1\). We define

$$ \mathcal {W} = \text {MiMC}{}_k^{-1}(\mathcal {V} \oplus v) \equiv \{\text {MiMC}{}_k^{-1}(x) \in \mathbb {F}_{2^n} \,| \, x \in \mathcal {V} \oplus v\} $$

under a fixed secret key k. Here, we present the details of the attack for the cases \(r_\text {KR} = 1\) and \(r_\text {KR} = 2\), and we analyze the computational cost. We introduce the following notation:

$$\begin{aligned} \forall d \in \mathbb {N} : \qquad \mathscr {P}_d := \bigoplus _{x \in \mathcal {W}} x^d, \end{aligned}$$
(10)

and whenever possible we will make use of the fact that squaring is a linear operation over \(\mathbb {F}_{2^{n}}\). More specifically, computing \(\mathscr {P}_{2d}\) only requires a single squaring operation once \(\mathscr {P}_{d}\) is calculated:

$$\begin{aligned} \mathscr {P}_{2d} := \bigoplus _{x \in \mathcal {W}} x^{2d} = \left( \bigoplus _{x \in \mathcal {W}} x^{d}\right) ^2 = \mathscr {P}_{d}^2. \end{aligned}$$
(11)

This reduces the total number of XOR operations.

Case: \(r_\text {KR} = 1\). Since a single round of MiMC is described by \((x \oplus k)^3 = k^3 \oplus k^2 \cdot x \oplus k \cdot x^2 \oplus x^3\), and since the term \(K^3\) is added \(|\mathcal {W}| = 2^{n-1}\) times (an even number) and therefore cancels, the function F(K) is given by

$$ F(K) = K^2 \cdot \mathscr {P}_1 \oplus K \cdot \mathscr {P}_2 \oplus \mathscr {P}_3. $$

Complete pseudo code of the attack is given in Algorithm 1, from which it is easy to see that the cost of the attack is well approximated by

  • \(|\mathcal {V}| = 2^{n-1}\) multiplications,

  • \(|\mathcal {V}| + 1 = 2^{n-1} + 1\) squarings,

  • \(2 \cdot |\mathcal {V}| + 1 = 2^{n} +1\) n-bit XOR operations,

  • cost of finding the roots of a univariate polynomial of degree 2.

[Algorithm 1: pseudo code of the key-recovery attack on MiMC for \(r_\text {KR} = 1\)]
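As a rough illustration of the case \(r_\text {KR} = 1\), the following sketch runs the attack on a toy MiMC over \(\mathbb {F}_{2^9}\), for which \(\lceil 9\log _3(2) \rceil = \lceil 8\log _3(2) \rceil = 6\). It is not a transcription of Algorithm 1: the modulus \(x^9 + x^4 + 1\), the random round constants (with \(c_0 = 0\)), the choice of subspace, and all function names are assumptions, and the quadratic is solved by exhaustive root search over the 512 field elements instead of a dedicated root-finding algorithm.

```python
import random

N = 9                              # toy block size
MOD = (1 << 9) | (1 << 4) | 1      # x^9 + x^4 + 1, irreducible over GF(2)
ROUNDS = 6                         # ceil(9 * log_3(2))
INV3 = ((1 << (N + 1)) - 1) // 3   # exponent of the inverse of x -> x^3


def gf_mul(a, b):
    """Multiply two elements of GF(2^N) in polynomial-basis representation."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if a >> N:
            a ^= MOD
    return r


def gf_pow(a, e):
    r = 1
    while e:
        if e & 1:
            r = gf_mul(r, a)
        a = gf_mul(a, a)
        e >>= 1
    return r


def encrypt(x, k, consts):
    for c in consts:
        x = gf_pow(x ^ k ^ c, 3)
    return x ^ k


def decrypt(y, k, consts):
    y ^= k
    for c in reversed(consts):
        y = gf_pow(y, INV3) ^ k ^ c
    return y


def key_candidates(plaintexts):
    """Build F(K) = K^2*P1 + K*P2 + P3 and return all of its roots."""
    p1 = p3 = 0
    for x in plaintexts:
        p1 ^= x
        p3 ^= gf_mul(gf_mul(x, x), x)
    p2 = gf_mul(p1, p1)            # P2 = P1^2 because squaring is linear
    return [K for K in range(1 << N)
            if (gf_mul(gf_mul(K, K), p1) ^ gf_mul(K, p2) ^ p3) == 0]


def run_attack(seed=0):
    rng = random.Random(seed)
    key = rng.randrange(1 << N)
    consts = [0] + [rng.randrange(1 << N) for _ in range(ROUNDS - 1)]
    v = rng.randrange(1 << N)
    ciphertexts = [w ^ v for w in range(1 << (N - 1))]   # coset V + v
    # Chosen-ciphertext queries: the decryption oracle returns the set W.
    plaintexts = [decrypt(c, key, consts) for c in ciphertexts]
    return key, key_candidates(plaintexts)


if __name__ == "__main__":
    key, candidates = run_attack()
    print("secret key:", key, "candidates:", candidates)
```

Since F has degree 2, the candidate list normally contains at most two values; the correct one can be identified with a single known plaintext-ciphertext pair.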

Case: \(r_\text {KR} = 2\). The attack for the case \(r_\text {KR} = 2\) is similar. From Eq. (8) (using \(k_0 = k\), \(k_1 = k\oplus c_1\) and \(k_2 = 0\)), the function F(K) is described by

$$\begin{aligned}&F(K) = K^8 \cdot \mathscr {P}_1 \oplus K^5 \cdot \mathscr {P}_2 \oplus K^4\cdot (\mathscr {P}_2 \cdot c_1 \oplus \mathscr {P}_1) \oplus K^3 \cdot (\mathscr {P}_4 \oplus \mathscr {P}_2)\\ \oplus&K^2 \cdot (\mathscr {P}_4 \cdot c_1 \oplus \mathscr {P}_3 \oplus \mathscr {P}_1 \cdot c_1^2) \oplus K \cdot (\mathscr {P}_8 \oplus \mathscr {P}_6 \oplus \mathscr {P}_2 \cdot c_1^2) \oplus (\mathscr {P}_9 \oplus \mathscr {P}_6\cdot c_1 \oplus \mathscr {P}_3\cdot c_1^2), \end{aligned}$$

where \(c_1\) is the round constant of the first round. As also noted in Sect. 3.2, while \(\mathscr {P}_9\) is the largest \(\mathscr {P}_d\) in this expression, both \(\mathscr {P}_5\) and \(\mathscr {P}_7\) are missing, and hence do not need to be computed. A complete pseudo code of the attack can be found in Algorithm 2. Again, it is easy to see that the cost of the attack is well approximated by

  • \(2\cdot |\mathcal {V}| + 6= 2^{n} + 6\) multiplications,

  • \(2 \cdot |\mathcal {V}| + 4 = 2^{n} +4\) squarings,

  • \(3 \cdot |\mathcal {V}| + 8 = 3 \cdot 2^{n-1} + 8\) n-bit XOR operations,

  • cost of finding the roots of a univariate polynomial of degree 8.

[Algorithm 2: pseudo code of the key-recovery attack on MiMC for \(r_\text {KR} = 2\)]
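The case \(r_\text {KR} = 2\) can be sketched in the same way on a toy MiMC over \(\mathbb {F}_{2^7}\), where \(\lceil 7\log _3(2) \rceil = 5\) and \(r_\text {ZS} = 3\). Again this is an illustration rather than a transcription of Algorithm 2: the modulus \(x^7 + x + 1\), the random constants, the subspace, and the function names are assumptions, and roots are found by exhaustive search over the field. The coefficients follow the expression for F(K) given above.

```python
import random

N = 7                              # toy block size with r_KR = 2
MOD = (1 << 7) | (1 << 1) | 1      # x^7 + x + 1, irreducible over GF(2)
ROUNDS = 5                         # ceil(7 * log_3(2))
INV3 = ((1 << (N + 1)) - 1) // 3   # exponent of the inverse of x -> x^3


def gf_mul(a, b):
    r = 0
    while b:
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if a >> N:
            a ^= MOD
    return r


def gf_pow(a, e):
    r = 1
    while e:
        if e & 1:
            r = gf_mul(r, a)
        a = gf_mul(a, a)
        e >>= 1
    return r


def decrypt(y, k, consts):
    y ^= k
    for c in reversed(consts):
        y = gf_pow(y, INV3) ^ k ^ c
    return y


def attack_polynomial(plaintexts, c1):
    """Coefficients of F(K), indexed by the exponent of K."""
    sq = lambda a: gf_mul(a, a)
    p1 = p3 = p9 = 0
    for x in plaintexts:
        x3 = gf_mul(sq(x), x)          # 1 squaring, 1 multiplication
        x9 = gf_mul(sq(x3), x3)        # 1 squaring, 1 multiplication
        p1, p3, p9 = p1 ^ x, p3 ^ x3, p9 ^ x9
    p2, p6 = sq(p1), sq(p3)            # even powers via linearity of squaring
    p4 = sq(p2)
    p8 = sq(p4)
    c1sq = sq(c1)
    return {
        8: p1,
        5: p2,
        4: gf_mul(p2, c1) ^ p1,
        3: p4 ^ p2,
        2: gf_mul(p4, c1) ^ p3 ^ gf_mul(p1, c1sq),
        1: p8 ^ p6 ^ gf_mul(p2, c1sq),
        0: p9 ^ gf_mul(p6, c1) ^ gf_mul(p3, c1sq),
    }


def roots(coeffs):
    found = []
    for K in range(1 << N):
        acc = 0
        for e, a in coeffs.items():
            acc ^= gf_mul(a, gf_pow(K, e))
        if acc == 0:
            found.append(K)
    return found


def run_attack(seed=0):
    rng = random.Random(seed)
    key = rng.randrange(1 << N)
    consts = [0] + [rng.randrange(1 << N) for _ in range(ROUNDS - 1)]
    v = rng.randrange(1 << N)
    plaintexts = [decrypt(w ^ v, key, consts) for w in range(1 << (N - 1))]
    return key, roots(attack_polynomial(plaintexts, consts[1]))


if __name__ == "__main__":
    print(run_attack())
```

Only \(\mathscr {P}_1\), \(\mathscr {P}_3\), and \(\mathscr {P}_9\) are accumulated over the data; the remaining sums are obtained by squaring, in line with the observation that \(\mathscr {P}_5\) and \(\mathscr {P}_7\) are not needed.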

5.3 Complexity Estimation

As we have just seen, our attack requires half of the code book (namely, \(2^{n-1}\) chosen ciphertexts). Here we show that our attacks are better than exhaustive search (from the computational point of view). In order to do this, we measure the time complexities in equivalent encryption operations.

A single encryption round in MiMC requires one addition, one squaring operation, and one multiplication in the extension field. Since the cost of a single n-bit XOR operation is much smaller than the cost of a multiplication over \(\mathbb {F}_{2^n}\), and since the number of XOR operations is similar to the number of multiplications, in the following we do not consider XOR operations. After this simplification, we find that the time complexity of \(r_\text {KR} = 1\) is dominated by \(2^{n-1}\) squaring and multiplication operations or, equivalently, \(2^{n-1}\) encryption rounds. A similar line of reasoning reveals that \(r_\text {KR} = 2\) is comparable to \(2^{n}\) encryption rounds.

Since the cost of solving a single low-degree equation is negligible, and one unit of encryption contains \(\left\lceil n \cdot \log _3(2) \right\rceil \) rounds, it follows that the cost of our attacks is about

$$\begin{aligned} \frac{r_\text {KR} \cdot 2^{n-1}}{\left\lceil n \cdot \log _3(2) \right\rceil } \end{aligned}$$

encryptions for \(r_\text {KR}\in \{1,2\}\). That is, the computational cost of the key-recovery part of our attacks is upper-bounded by \(2^{n-\log _2(n) + 1}\), and hence the total cost is smaller than that of a brute-force attack (namely, \(2^n\) encryptions) for each \(n\ge 3\).
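To get a feeling for these numbers, the estimate can be evaluated for concrete block sizes. The sketch below uses illustrative helper names and reports the base-2 logarithm of the cost in equivalent encryptions.

```python
from math import log2


def ceil_log3(m: int) -> int:
    """Smallest r >= 0 with 3**r >= m."""
    r, p = 0, 1
    while p < m:
        p *= 3
        r += 1
    return r


def attack_cost_log2(n: int) -> float:
    """log2 of r_KR * 2^(n-1) / ceil(n * log_3(2))."""
    full = ceil_log3(2 ** n)                 # rounds of full MiMC
    r_zs = ceil_log3(2 ** (n - 1)) - 1       # ceil((n-1) * log_3(2)) - 1
    r_kr = full - r_zs
    return log2(r_kr) + (n - 1) - log2(full)


if __name__ == "__main__":
    for n in (31, 33, 63, 65, 127, 129, 255):
        print(f"n={n}: about 2^{attack_cost_log2(n):.2f} encryptions "
              f"(brute force: 2^{n})")
```

The logarithm is assembled from its parts instead of computing \(2^{n-1}\) as a float, which keeps the evaluation stable for large \(n\).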

5.4 Practical Verification

We implemented Algorithm 1 and Algorithm 2 in the computer algebra system Magma, and verified both algorithms for all odd integers \(n\in [5,35]\). We note that Algorithm 1 (\(r_\text {KR} = 1\)) yields the correct answer for all the tested \(5 \le n \le 35\), even if \(\left\lceil n\log _3(2) \right\rceil \ne \left\lceil (n - 1)\log _3(2) \right\rceil \). Namely, in practice it is possible to cover one more round with a zero sum than what we theoretically expect. In other words, \(\left\lceil (n - 1)\log _3(2) \right\rceil \) rounds of the decryption function of MiMC fail to obtain the maximum algebraic degree for these parameters, which is reached after \(\left\lceil (n - 1)\log _3(2) \right\rceil + 1\) rounds (see [28, App. B] for more details on the degree growth of \(\text {MiMC}{}^{-1}\)). Since we are not able to prove this behavior for larger values of n, we leave it as an open question whether Algorithm 1 can be applied to MiMC for odd integers \(n>35\).

Considerations on Data and Computational Costs of This Attack. A possible drawback of our attack is the cost. Since we are not able to provide an estimation of the growth of the degree in the decryption direction, we can only exploit the fact that a certain number of rounds are necessary in order to achieve maximum degree. It follows that the attacker is forced to use half of the code book in order to set up the attack, which also has an impact on the computational cost.

Even if our attack is not practical, we believe it provides valuable theoretical insight. It is also in line with several other attacks found in the literature, which are set up under a similar assumption on the data and/or computational cost. To give some concrete examples, consider the case of zero-correlation attacks [14], which exploit linear approximations that hold with probability \(\frac{1}{2}\). The crucial limitation for basic zero-correlation linear cryptanalysis is that it requires half of the code book. Only follow-up works have been able to reduce this data requirement, including the more powerful distinguisher called multiple zero-correlation (MPZC) linear distinguisher proposed in [15], which exploits the fact that there are numerous zero-correlation linear approximations in susceptible ciphers. While needing (close to) the full code book is an inherent property of zero-correlation attacks, the reason for the high data complexity in our case is purely due to the specification of MiMC and the attacked number of rounds, and not due to an inherent property of our attack.

Splice-and-cut meet-in-the-middle attacks and biclique attacks are other examples of attacks that often come with time complexities relatively close to exhaustive search. Indeed, an extension of the biclique approach first described in [13] has a brute-force phase for a number of rounds as part of the attack. It can in principle work for any number of rounds and is hence best described as a particular optimization of brute-force key guessing. However, later variants then showed examples where the gain over brute force was in the order of millions  [37]. Still, we note that the complexity of biclique attacks scales differently than our attack, whose runtime cost depends strongly on the details of the target cipher MiMC.

Finally, we point out that any attack that is better than brute force is relevant, even if it requires unrealistic amounts of data or storage. Indeed, the main goal of cryptanalysis is finding a “certificational weakness”, that is, evidence that the cipher does not perform as advertised. In other words, in academic cryptography, a weakness or a break in a scheme is usually defined quite conservatively: It may require impractical amounts of time, memory, or data.

The Number of Rounds Needed for Security. It may be of interest to estimate the number of rounds needed for MiMC to be resistant against this attack. To this end, we bound the operations needed to compute all monomials of odd degree, up to a maximum degree D.

Lemma 2

Let \(1 \le D \le 2^n - 1\) and \(x \in \mathbb {F}_{2^n}\). The overall number of operations needed to compute all odd powers \(x^i\) for \(i\in [3,D]\) is given by 1 squaring and \(\left\lfloor \frac{D-1}{2} \right\rfloor \) multiplications.

Proof

From x, calculate and store \(q := x^2\). The odd powers of x can now be successively computed as \(x^{i + 2} = x^{i}\cdot q\) for all odd integers i in the interval \([1,D-2]\). This yields a total of 1 squaring and \(\left\lfloor \frac{D-1}{2} \right\rfloor \) multiplications.    \(\square \)
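The procedure in the proof translates directly into code. In this sketch the multiplication is passed in as a callback `mul` (an illustrative design choice), so the same routine works for any ring; the usage example below uses plain integers only to make the result easy to inspect.

```python
def odd_powers(x, D, mul):
    """Return {i: x^i for odd i in [1, D]} together with operation counts."""
    ops = {"squarings": 1, "multiplications": 0}
    q = mul(x, x)                      # the single squaring: q = x^2
    powers = {1: x}
    i = 1
    while i + 2 <= D:
        powers[i + 2] = mul(powers[i], q)   # x^(i+2) = x^i * x^2
        ops["multiplications"] += 1
        i += 2
    return powers, ops


if __name__ == "__main__":
    powers, ops = odd_powers(2, 9, lambda a, b: a * b)
    print(powers, ops)
```

For \(D = 9\) this performs one squaring and \(\lfloor (9-1)/2 \rfloor = 4\) multiplications, as stated in Lemma 2.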

Assume for simplicity that \(\left\lceil n \cdot \log _3(2) \right\rceil - 1\) rounds can be covered by a zero sum, and that the cost of solving the final polynomial equation is negligible. As before, we expect the time complexity to be dominated by the number of operations needed to construct the polynomial F(K). Since the degree of this polynomial is upper-bounded by \(3^{r_\text {KR}}\), by Lemma 2 at most \([(3^{r_\text {KR}}-1)/2] \cdot 2^{n-1}\) multiplications are required to compute all monomials with odd exponents in F(K) (where all monomials with even exponents are computed via Eq. (11)).

Since one encryption of MiMC costs \(\left\lceil n \cdot \log _3(2) \right\rceil \) multiplications, the number of extra rounds \(\rho \) for MiMC must satisfy

$$\begin{aligned} (3^{\rho + 1} - 1)\cdot 2^{n-2} \ge 2^n\cdot (\left\lceil n \cdot \log _3(2) \right\rceil + \rho ) \end{aligned}$$

in order to provide security against the attack just presented. This would, for example, require at least \(\rho = 5\) extra rounds for \(n=129\) (more generally, if R is the number of rounds of MiMC-n/n, then \(\rho \approx \lceil \log _3(2\cdot R)\rceil \) more rounds are sufficient to restore the security; see Footnote 8). We remark that this rough estimation is not intended to replace the number of rounds proposed by the designers.
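The smallest \(\rho \) satisfying the inequality can be found by a direct search. The following sketch uses illustrative function names and exact integer arithmetic.

```python
def ceil_log3(m: int) -> int:
    """Smallest r >= 0 with 3**r >= m."""
    r, p = 0, 1
    while p < m:
        p *= 3
        r += 1
    return r


def min_extra_rounds(n: int) -> int:
    """Smallest rho with (3^(rho+1) - 1) * 2^(n-2) >= 2^n * (R + rho)."""
    R = ceil_log3(2 ** n)              # rounds of MiMC-n/n
    rho = 0
    while (3 ** (rho + 1) - 1) * 2 ** (n - 2) < 2 ** n * (R + rho):
        rho += 1
    return rho


if __name__ == "__main__":
    n = 129
    R = ceil_log3(2 ** n)
    print(n, R, min_extra_rounds(n), ceil_log3(2 * R))
```

For \(n = 129\) the search agrees with the approximation \(\lceil \log _3(2\cdot R)\rceil \) with \(R = 82\).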

6 An Algebraic Attack on Ciphers with Low-Degree Round Functions

Here we generalize the key-recovery attack on MiMC described in Sect. 5 and discuss a generic attack strategy for any block cipher working over \((\mathbb {F}_{2^n})^t\), where \(n,t\in \mathbb {N}\), \(t\ge 2\) and \(n\ge 3\).

6.1 Setting

We consider an \(r\)-round block cipher \(E_{k}^r: (\mathbb {F}_{2^n})^t \rightarrow (\mathbb {F}_{2^n})^t\) with

$$ E_{k}^{r}(x) = (R_{r} \circ R_{r-1} \circ \dots \circ R_1)(x\oplus k), $$

and where \(R, R_i:(\mathbb {F}_{2^n})^t\rightarrow (\mathbb {F}_{2^n})^t\) are defined by \(R_i(x) = R(x) \oplus k^{(i)}\). Here, R denotes the (nonlinear) round function. Since \(E_k^r\) consists of t components, we can write

$$ E_{k}^{r}(x)=(E_{k,1}^{r}(x),\dots ,E_{k,t}^{r}(x)), $$

where \(E_{k,i}^{r}:(\mathbb {F}_{2^n})^t\rightarrow \mathbb {F}_{2^n}\). We denote the compositional inverse of \(E_{k}^{r}\) by \(E_{k}^{-r}\). We assume that

  (1) the i-th round key \(k^{(i)}\in (\mathbb {F}_{2^n})^t\) is derived from the master key \(k=(k_1,\dots ,k_t)\in (\mathbb {F}_{2^n})^t\) by some low-degree (e.g., linear) key schedule,

  (2) the round function R can be described by a polynomial

    $$ R(x = (x_1,\dots ,x_t))= \bigoplus _{\begin{array}{c} j=(j_1,\dots ,j_t)\in \{0,1,\dots ,2^n-1\}^t\\ j_1+\dots +j_t\le d \end{array}} \alpha _j \cdot x_1^{j_1} \cdot \dots \cdot x_t^{j_t} $$

    of low-degree d with coefficients \(\alpha _j\in (\mathbb {F}_{2^n})^t\).

Our attack requires the symbolic evaluation of the encryption function \(E_{k}^{r^\prime }\) for a small number of rounds \(r^\prime \) to be relatively easy, which motivates the requirements of a low-degree round function R and a low-degree key schedule. This ensures that the polynomial representation of \(E_{k}^{r^\prime }\) can be computed efficiently. In both cases, low-degree means low compared to the size of the field \(\mathbb {F}_{2^n}\), i.e., \(d\ll 2^n-1\). A cipher in the literature that satisfies the above assumptions and does indeed use low-degree round functions is, e.g., HadesMiMC [31].

6.2 Strategy of the Attack

The idea of our generic attack is to recover the secret master key k of a cipher \(E_{k}^{r}\) by exploiting a given higher-order distinguisher over the subset \(\mathcal {X}\subseteq (\mathbb {F}_{2^n})^t\) covering \(1\le r_\text {ZS} < r\) rounds in the encryption or the decryption direction. For the sake of simplicity, we follow the approach of the attack on MiMC in Sect. 5 and assume that the higher-order distinguisher covers \(r_{\text {ZS}}\) rounds in the decryption direction.

In our attack, we symbolically evaluate \(E_{k}^{r_\text {KR}}(y)\) with respect to the remaining \(r_\text {KR} := r - r_\text {ZS}\) rounds in the encryption direction and obtain polynomials (\(1\le i\le t\))

$$E^{r_\text {KR}}_{(K_1,\ldots ,K_t),i}(Y)\in \mathbb {F}_{2^n}\!\left[ K_1,\ldots ,K_t,Y_1,\ldots ,Y_t\right] $$

over \(\mathbb {F}_{2^n}\) with the master key words \(K_j\) and plaintext variables \((Y_1,\ldots ,Y_t)=:Y\) as indeterminates – in short, one polynomial for each of the t components of \(E_{k}^{r_\text {KR}}(y)\). In general, we work with \(r_\text {KR}\ll r_\text {ZS}\), since the symbolic evaluation of \(E_{k}^{r_\text {KR}}(y)\) is expensive.

Having a zero sum after \(r_\text {ZS}\) rounds in the decryption direction with respect to the subset \(\mathcal {X}\subseteq (\mathbb {F}_{2^n})^{t}\) means that

$$ \bigoplus _{x\in \mathcal {X}}E_{k}^{-r_\text {ZS}}(x)=0. $$

The main observation behind our attack is the following: We exploit the relation (see Footnote 9)

$$\begin{aligned} 0 = \bigoplus _{x\in \mathcal {X}}E_{k}^{-r_\text {ZS}}(x) = \bigoplus _{x\in \mathcal {X}}\left( E_{k}^{r_\text {KR}}\circ E_{k}^{-r}\right) (x) = \bigoplus _{y\in E_{k}^{-r}(\mathcal {X})}E_{k}^{r_\text {KR}}(y) \end{aligned}$$
(12)

to set up the following equations (\(1\le i\le t\)) over \(\mathbb {F}_{2^n}\) in the variables \(k_1,\ldots ,k_t\):

$$\begin{aligned} F_i(k_1,\ldots ,k_t) := \bigoplus _{y\in E_{k}^{-r}(\mathcal {X})}E^{r_\text {KR}}_{(k_1,\ldots ,k_t),i}(y)=0. \end{aligned}$$
(13)

Again, \(E^{r_\text {KR}}_{(k_1,\ldots ,k_t),i}(y)\) denotes the symbolic evaluation of the \(i\)-th word after \(r_\text {KR}\) rounds in the encryption direction with the master key words as variables \(k_1,\ldots ,k_t\) and evaluated at \(y\in (\mathbb {F}_{2^n})^t\). Once we have set up the equation system arising from Eq. (13), we apply Gröbner basis techniques to solve this system over \(\mathbb {F}_{2^n}\) for the key variables \(k_1,\ldots ,k_t\).

In Algorithm 3 we summarize the approach of our generic attack and present a pseudo code of the attack procedure. For completeness, a rough complexity estimation of the attack is derived in [28, App. E].

[Algorithm 3: pseudo code of the generic attack procedure]

6.3 Comparison with Related Work

Interpolation Attacks. Originally introduced as a standalone attack, interpolation attacks [36] are algebraic attacks that express the (potentially round-reduced) cipher as a polynomial equation with unknown, key-dependent coefficients, and recover these coefficients from known inputs and outputs. More recently, this approach has been combined as a key-recovery approach together with integral distinguishers.

Attack on CAST. In an attack [43] on the CAST cipher, the authors use a higher-order differential distinguisher to set up an equation system and finally solve this system for the key variables. In contrast to our attack, the authors of [43] work with linear equation systems over \(\mathbb {F}_2\). While this is sufficient for CAST, working at bit level is in general more expensive than working at word level when analyzing ciphers that are natively defined at word level.

Optimized Interpolation Attacks. One type of optimized interpolation attacks was described in [23], where the authors find attacks on reduced-round versions of LowMC which are more efficient than previous attacks based on key guessing [25]. A similar attack was also used to break the full-round version of the Frit permutation in an Even–Mansour setting [26]. The overall strategy of this interpolation attack is to find a distinguisher (for example a constant sum in the encryption direction in the case of LowMC) with which one attacks the construction by finding the unknown monomials of the sums of the symbolic representations in the inverse direction. By determining these (key-dependent) monomials, the full key can eventually be found. Since the approach in [23] shares some similarities with our proposal, we describe the differences between these two strategies in detail.

The main difference regarding the two strategies concerns the way in which the system of equations \(F_i(K)=0\) is constructed and consequently solved:

  • In [23], the idea is to construct the function using a “standard” interpolation technique. Specifically, the attacker does not care about the specification of the monomials of F, which are simply considered as unknowns. Hence, the idea is to recover (interpolate) the unknown coefficients of \(F_K(C)\), and then use various ad-hoc techniques (which are not part of the framework described in this section) in order to recover the actual secret key.

  • In our case, we heavily exploit the simple algebraic structure of the round function in order to construct the system of equations \(F_i(K)=0\). In other words, the system of equations is constructed by using a symbolic evaluation and not by interpolation techniques.

We emphasize that the possibility to set up one of the two attacks does not imply the possibility to set up the other one. For example, it seems hard to use the attack presented in [23] against full-round MiMC, while we show that our strategy can break it. Indeed, since we already need \(2^{n-1}\) data for the distinguishing property (i.e., half of the code book), we do not see how to apply the approach from [23] to MiMC without further increasing the data complexity due to data needed for the interpolation step.

Attack on Pyjamask. Only recently, a similar attack on Pyjamask, competing in the ongoing NIST call for lightweight authenticated encryption, has been presented [27]. The authors propose an attack on the full block cipher Pyjamask-96 by combining higher-order differentials with an in-depth ad-hoc analysis of the system of equations obtained for 2.5 rounds of Pyjamask-96. As is the case for CAST, the attack is set up at bit level.

Cube Attacks. Although our attack and cube attacks [24] exploit low degrees in the polynomial description of a cipher, they are quite different from a conceptual point of view and can be regarded as two different cryptanalytic methods. To justify this conclusion, we briefly present the idea behind cube attacks and contrast them with our attack ideas.

Given a cipher with input variables \(x_0,\ldots ,x_{n-1}\) as the public variables (IV bits, plaintext bits, tweak bits, etc.), and \(x_{n},\ldots ,x_{n+m-1}\) as the secret variables (key bits), the output of the cipher can be regarded as a polynomial \(f=f(x)\) in \(x=(x_0,\ldots ,x_{n+m-1})\). For every set \(I\subset \{0,\ldots , n-1\}\), f can be uniquely decomposed into

$$ f = t_I \cdot f_{S(I)} + q, $$

where \(t_I:=\prod _{i\in I}x_i\) denotes the product of all variables indexed by elements in I, the polynomial \(f_{S(I)}\) does not contain any variables from \(t_I\), and every monomial of q misses at least one variable from \(t_I\). The polynomial \(f_{S(I)}\) is also called the superpoly with respect to I. For any subset \(I\subseteq \{0,\ldots , n-1\}\) of size \(|I |\), the authors of [24] call the set \(C_I\) of \(2^{|I |}\) vectors, where all the \(|I |\) variables indexed by I range over all possible combinations of elements in \(\mathbb {F}_2\) and the remaining \(n+m-|I |\) variables remain undetermined, a \(|I |\)-dimensional Boolean cube. Then the sum of f over all values in the cube \(C_I\) yields the equation of polynomials

$$ \bigoplus _{v\in C_I} f (v) = f_{S(I)}. $$
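To make the decomposition concrete, the following minimal Python sketch checks the identity on a toy polynomial over \(\mathbb {F}_2\). The polynomial `f`, its three public and two secret variables, and the cube \(I=\{0,1\}\) are hypothetical choices made purely for illustration and are not taken from [24].

```python
from itertools import product

def f(x0, x1, x2, k0, k1):
    """Toy polynomial over F_2 (hypothetical): x0..x2 public, k0..k1 secret.

    f = x0*x1*(k0 + k1 + x2) + q, with q = x0*k1 + x1*x2*k0 + k0 + 1.
    Every monomial of q misses x0 or x1.
    """
    return (x0 & x1 & (k0 ^ k1 ^ x2)) ^ (x0 & k1) ^ (x1 & x2 & k0) ^ k0 ^ 1

def superpoly(x2, k0, k1):
    """Superpoly of f with respect to I = {0, 1}, read off by hand."""
    return k0 ^ k1 ^ x2

def cube_sum(x2, k0, k1):
    """XOR f over the 2-dimensional cube C_I in which x0 and x1 take all values."""
    acc = 0
    for x0, x1 in product((0, 1), repeat=2):
        acc ^= f(x0, x1, x2, k0, k1)
    return acc

# The identity holds for every assignment of the variables outside the cube.
assert all(cube_sum(*v) == superpoly(*v) for v in product((0, 1), repeat=3))
```

All monomials of q cancel because each of them is counted an even number of times over the cube, so only the superpoly survives.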

Cube attacks consist of two steps. First, attackers recover the superpoly in the offline phase. In this phase, the attacker might need to try sufficiently many cubes and assignments for the remaining public variables such that the superpoly \(f_{S(I)}\) is a balanced function of the secret variables. Moreover, determining the actual coefficients of \(f_{S(I)}\) requires the additional assumption that the attacker is allowed to tweak both public and secret variables. Then, with this usable superpoly, during the online phase, the attacker leaves the secret variables undetermined and queries the encryption oracle with every value \(c\in C_I\) and gets \(f(c)\in \mathbb {F}\). Eventually, the attacker computes

$$ f_I := \bigoplus _{c\in C_I} f(c). $$

The secret key information can be recovered by solving the corresponding equation system \(f_I = f_{S(I)}\).
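The online phase can be sketched as follows, assuming a toy cipher with three public bits and a two-bit key. The oracle, the two cubes, and their linear superpolies (\(k_0 + k_1\) and \(k_1\)) are hypothetical and assumed to be known from the offline phase; a real attack would need many more cubes and a less trivial linear system.

```python
from itertools import product

def make_oracle(key):
    """Return an encryption oracle with the secret key bits fixed inside (toy cipher)."""
    k0, k1 = key
    def oracle(x0, x1, x2):
        return (x0 & x1 & (k0 ^ k1)) ^ (x1 & x2 & k1) ^ (x0 & k0 & k1) ^ (x2 & k0)
    return oracle

def online_cube_sum(oracle, free, fixed):
    """XOR the oracle output over the cube spanned by the public indices in `free`.

    `fixed` maps the remaining public indices to constants.
    """
    acc = 0
    for bits in product((0, 1), repeat=len(free)):
        x = dict(fixed)
        x.update(zip(free, bits))
        acc ^= oracle(x[0], x[1], x[2])
    return acc

def recover_key(oracle):
    """Solve f_I = f_S(I) for two cubes whose superpolies are assumed known:
    I = {0, 1} with x2 = 0 gives k0 + k1, and I = {1, 2} with x0 = 0 gives k1.
    """
    s01 = online_cube_sum(oracle, free=(0, 1), fixed={2: 0})
    s12 = online_cube_sum(oracle, free=(1, 2), fixed={0: 0})
    k1 = s12
    k0 = s01 ^ k1
    return (k0, k1)
```

For instance, `recover_key(make_oracle((1, 0)))` queries the oracle eight times in total and returns the key bits without ever evaluating the cipher symbolically.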

Compared with our attack, cube attacks involve an initial step of finding balanced superpolies that contain independent secret variables. Apart from that, cube attacks do not exploit the algebraic structure of a cipher, since they rely on the assumption of tweakable black box polynomials. In this sense, our attack is different, since it makes heavy use of the algebraic structure of a cipher when symbolically evaluating a certain number of rounds. Furthermore, cube attacks use the assumption that both key and plaintext variables are tweakable, while we rely on the assumption that some rounds of the cipher can be efficiently evaluated symbolically (which is why we work with low-degree round functions).

7 Concluding Remarks and Future Work

Reducing the Cost of the Attack. As shown in [28, App. E], two steps – namely, (1st) the construction of the system of equations \(F_i(k_1, \dots , k_t) = 0\) for \(1 \le i \le t\) and (2nd) solving such a system – mainly constitute the cost of the attack. In general, it could make sense to balance the costs of the two steps in order to either minimize the total cost of the attack or maximize the number of rounds that can be broken.

In more detail, consider the case in which the cost of the attack is well approximated by the cost of constructing the system of equations \(F_i(K)=0\). Since this cost grows with the size of the subspace \(\mathcal {V}\), one strategy could be to consider a smaller subset \(\mathcal {X}\). Obviously, this implies in general the possibility to cover fewer rounds \(r_\text {ZS}\) using a higher-order distinguisher, which means that more rounds \(r_\text {KR}\) must be covered in general. However, the overall cost of the attack may benefit from this strategy. On the other hand, the case in which the attack cost is well approximated by the cost of solving the system of equations \(F_i(K)=0\) requires the opposite strategy.

Moreover, we point out that the attacks can be improved by exploiting the details of the cipher. To give a concrete example, consider the case of MiMC given in Algorithm 1: The attack and its computational complexity benefit from the fact that F(K) does not depend on \(\mathscr {P}_5\) or \(\mathscr {P}_7\). As another example, consider the case of an SPN cipher where the round function is defined as

$$ R(x = (x_1,\ldots ,x_t)) = M \times (S(x_1), S(x_2), \dots , S(x_t)), $$

where \(M \in (\mathbb {F}_{2^n})^{t\times t}\) and \(S: \mathbb {F}_{2^n} \rightarrow \mathbb {F}_{2^n}\) (here, ‘\(\times \)’ denotes matrix-vector multiplication). The cost of the attack can potentially be reduced by taking into account the fact that all monomials in the polynomial representation R depend only on a single variable \(x_i\).
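The structural property mentioned above can be checked on a small instance. The sketch below assumes a toy field \(\mathbb {F}_{2^4}\) (modulus \(x^4+x+1\)), \(t=2\), the S-box \(S(x)=x^3\), and an arbitrary matrix `M`; all of these are illustrative choices, not parameters of any cipher discussed here. In characteristic 2, a function without mixed monomials satisfies \(R(x_1,x_2)+R(x_1,0)+R(0,x_2)+R(0,0)=0\), which is what `is_separable` tests.

```python
from itertools import product

N = 4              # toy field GF(2^4); realistic n is much larger
MODULUS = 0b10011  # x^4 + x + 1, irreducible over F_2

def gf_mul(a, b):
    """Multiply two elements of GF(2^N), reducing modulo MODULUS."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        b >>= 1
        a <<= 1
        if a >> N:
            a ^= MODULUS
    return result

def sbox(x):
    """Hypothetical S-box S(x) = x^3."""
    return gf_mul(gf_mul(x, x), x)

M = [[2, 3], [1, 2]]  # hypothetical 2x2 matrix over GF(2^4)

def round_function(x):
    """R(x) = M x (S(x_1), ..., S(x_t)), with 'x' a matrix-vector product."""
    s = [sbox(xi) for xi in x]
    out = []
    for row in M:
        acc = 0
        for m, si in zip(row, s):
            acc ^= gf_mul(m, si)
        out.append(acc)
    return tuple(out)

def is_separable():
    """Check that no output coordinate of R contains a monomial mixing x_1 and x_2."""
    zero = round_function((0, 0))
    for x1, x2 in product(range(1 << N), repeat=2):
        full = round_function((x1, x2))
        only1 = round_function((x1, 0))
        only2 = round_function((0, x2))
        if any(f ^ a ^ b ^ z for f, a, b, z in zip(full, only1, only2, zero)):
            return False
    return True

assert is_separable()
```

Since each output coordinate is a sum of univariate polynomials, a symbolic evaluation of one round only has to track \(t\) univariate polynomials per coordinate instead of a dense multivariate one.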

Further Generalization: Ciphers over \(\mathbb {F}_p\). Finally, the attack strategy can be generalized to include ciphers over \((\mathbb {F}_p)^t\) for a prime p. This is of particular importance since many of the new applications named in the introduction (e.g., STARKs and MPC) natively work over \(\mathbb {F}_p\), which means that many of the recently proposed primitives are natively constructed over \(\mathbb {F}_p\). We remark that the strategy of the attack does not depend on the details of the field \(\mathbb {F}\). Hence, the only obstacle seems to be a lack of knowledge regarding efficient distinguishers over \((\mathbb {F}_p)^t\). Indeed, while it is well-known how to find a higher-order distinguisher over Boolean fields (e.g., by exploiting division property tools present in the literature [47, 51, 53]), the same is not yet true for prime fields.
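As a reminder of what such a distinguisher looks like in the binary case, the following sketch checks the zero-sum property for a single keyed cube map \(x \mapsto (x+k)^3\) over a toy field \(\mathbb {F}_{2^4}\). The field size, the modulus, and the chosen subspaces are assumptions made for illustration only. Since the map has algebraic degree 2, its sum over any affine \(\mathbb {F}_2\)-subspace of dimension 3 vanishes for every key, whereas a subspace of dimension 2 is in general not sufficient.

```python
from itertools import product

N = 4              # toy field GF(2^4)
MODULUS = 0b10011  # x^4 + x + 1, irreducible over F_2

def gf_mul(a, b):
    """Multiply two elements of GF(2^N), reducing modulo MODULUS."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        b >>= 1
        a <<= 1
        if a >> N:
            a ^= MODULUS
    return result

def cube(x):
    """The map x -> x^3, of algebraic degree 2 over F_2."""
    return gf_mul(gf_mul(x, x), x)

def span(basis):
    """All F_2-linear combinations (XORs of subsets) of the basis vectors."""
    elems = []
    for coeffs in product((0, 1), repeat=len(basis)):
        v = 0
        for c, b in zip(coeffs, basis):
            if c:
                v ^= b
        elems.append(v)
    return elems

def subspace_sum(basis, offset, key):
    """XOR of (x + key)^3 over the affine subspace offset + span(basis)."""
    acc = 0
    for v in span(basis):
        acc ^= cube(v ^ offset ^ key)
    return acc

# Dimension 3 exceeds the degree 2: the sum is zero for every offset and key.
assert all(subspace_sum((1, 2, 4), o, k) == 0
           for o, k in product(range(1 << N), repeat=2))
# Dimension 2 is not enough in general.
assert subspace_sum((1, 2), 0, 0) != 0
```

The argument relies on the \(\mathbb {F}_2\)-vector-space structure of the input domain, which is exactly what is missing over \(\mathbb {F}_p\).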