1 Introduction

The Keccak [3, 5] family of hash functions has attracted intensive cryptanalysis since its submission to the SHA-3 competition in 2008 [1, 9,10,11, 13, 14, 16, 17, 19]. In 2012, the U.S. National Institute of Standards and Technology (NIST) selected Keccak as the winner of the SHA-3 competition. The SHA-3 family consists of four cryptographic hash functions with fixed digest sizes and two eXtendable-Output Functions (XOFs) named SHAKE128 and SHAKE256, each of which is based on an instance of the Keccak algorithm [18]. Keccak[r, c, d] applies the sponge construction with bitrate r and capacity c to generate d-bit digests from messages of arbitrary length, where \(d=224, 256, 384,512\) in the official SHA-3 versions and \(d=160, 80\) in the Keccak Crunchy Crypto Collision and Pre-image Contest [2]. Each of the challenge versions comes in 4 variants, one for each size \(r+c\) of the internal state in bits from the set \(\{200, 400, 800, 1600\}\). SHAKE128 and SHAKE256 generate digests that can be extended to any desired length. The suffixes “128” and “256” indicate the security strengths against generic attacks that these two functions support.

In this paper, we focus on collision attacks against the Keccak family, i.e., on finding two different messages whose hash digests are the same. The best previous practical collision attacks on the Keccak family are on \(\mathtt{{\textsc {Keccak}}}\)-224 and \(\mathtt{{\textsc {Keccak}}}\)-256 reduced to 4 rounds, found by Dinur et al. [10] in 2012 and later refined in the journal version [12]. After this, theoretical results improved to 5-round \(\mathtt{{\textsc {Keccak}}}\)-256 [11]. However, the number of practically attacked rounds remains at 4. To promote cryptanalysis of Keccak, the Keccak design team proposed smaller variants in the Keccak challenge [2], with a 160-bit digest for collision attacks and an 80-bit digest for preimage attacks, for each of the 4 internal state sizes and with the permutation reduced to between 1 and 12 rounds. The ideal security level of both is \(2^{80}\) unit computations, for collisions and preimages respectively. This level is much lower than that of the 4 main instances of SHA-3, but still beyond the reach of currently available computational resources. The current best solutions of the collision challenges are instances reduced to up to 4 rounds, by Dinur et al. [10] and Mendel et al. [17]. Theoretical results were found by Dinur et al. [11] against \(\mathtt{{\textsc {Keccak}}}\)-256 with complexity \(2^{115}\) using generalized internal differentials. To the best of our knowledge, this remains the only result to date on collision attacks against Keccak reduced to 5 or more rounds.

Our Contribution. We develop a hybrid algebraic and differential method to launch collision attacks on the Keccak family, and practically find collisions of 5-round SHAKE128 and of two 5-round instances of the Keccak collision challenges. Theoretical results, with complexities below the birthday bound, against 5-round \(\mathtt{{\textsc {Keccak}}}\)-224 and 6-round Keccak collision challenges are also achieved.

These results follow from a crucial observation: the Keccak S-box can be re-expressed as a linear transformation when the input is restricted to certain affine subspaces. It was already noted by Daemen et al. [3, 8] and Dinur et al. [10] that when the input and output differences are fixed, the solution set of the Keccak S-box contains affine subspaces of dimension up to 3. In this paper, we show that the largest affine subspaces allowing linearization of the S-box have dimension 2. Furthermore, every such solution set of dimension up to 2 allows S-box linearization, and each of those of dimension 3 contains six 2-dimensional affine subspaces that allow the linearization. With this property in mind, we enforce linearization of all S-boxes in the first round, under which the first round function of the Keccak permutation is transformed into a linear one. Combining this with an inversion method for the S-box layer of the second round, we convert the problem of finding two-round connectors into that of solving a system of linear equations. Solving the system once produces sufficiently many solutions that at least one of them will follow the differential trail in the following 3 or more rounds.

A side effect of linearizing all S-boxes is a quick reduction of the degrees of freedom, which in turn decides whether such two-round connectors exist. To address this problem, we aim to find differential trails that impose the fewest possible conditions on the two-round connectors. We design a dedicated search strategy to find suitable differential trails of up to 4 rounds. Our implementation confirmed the correctness of this idea, and found real examples of collisions for 5-round SHAKE128 and for two instances of the challenge versions.

We list our results together with the related previous work in Table 1. Note that the algorithm for building 2-round connectors is heuristic and there is no theoretical bound on the solving time. However, it applies to our attacked instances within practical time, so we indicate the real time cost instead of complexities here. Experiments were run on a server with 32 cores of AMD processors.

Table 1. Collision attack results and comparison

Organization. The rest of the paper is organized as follows. In Sects. 2 and 3, notations and a brief description of Keccak family are given. In Sect. 4, we give a detailed description of the algebraic methods to achieve 2-round connectors. In Sect. 5, we give the dedicated search strategies for differential trails. Then the experimental results are presented in Sect. 6. We conclude the paper in Sect. 7.

2 Notations

We summarize the majority of notations to be used in this paper here.

figure a

3 Description of Keccak

In this section, we give a brief description of the Keccak family of hash functions. The Keccak family applies the sponge construction, which processes messages in two phases, an absorbing phase and a squeezing phase, as shown in Fig. 1. The message is first padded by appending a bit string of the form \(10^*1\), where \(0^*\) represents the shortest string of 0’s such that the length of the padded message is a multiple of r. We denote the original message by M and the padded message by \(\overline{M}=M||10^*1\). The b-bit internal state is initialized to all 0’s. In the absorbing phase, the padded message is split into blocks of r bits and each message block is XORed into the first r bits of the current internal state, followed by the application of the fixed permutation to the entire b-bit state. This is repeated until all message blocks are processed. In the squeezing phase, the first r bits of the state are returned as output, then the permutation is applied and another r bits are output, until all d output bits are produced. When restricted to the case of \(b=1600\) and \(c=2d\), the four official instances of the Keccak family are denoted by \(\mathtt{{\textsc {Keccak}}}\)-224/256/384/512 for \(d = 224, 256, 384, 512\), respectively. SHAKE128 and SHAKE256 are defined from two instances of Keccak with the capacity c being 256 and 512, respectively, and with the additional appending of a four-bit suffix 1111 to the original message M before applying the Keccak padding. Unless otherwise specified, we presume the digest sizes are 256 and 512 for SHAKE128 and SHAKE256, respectively. We use \(\overline{M}\) to denote the padded message for SHAKE as well. Instances of the Keccak challenges will be denoted as Keccak \([r,c,n_r,d]\), where the parameters explicitly indicate the rate, the capacity, the number of rounds the permutation is reduced to, and the bit size of the digest, respectively.
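The padding rule described above can be illustrated with a minimal Python sketch that works on lists of bits. The function names `pad10star1`, `shake_pad` and `split_blocks` are illustrative assumptions and do not come from any reference implementation.

```python
def pad10star1(message_bits, r):
    """Append 1, the fewest possible 0s, and a final 1 (length multiple of r)."""
    zeros = (-(len(message_bits) + 2)) % r
    return list(message_bits) + [1] + [0] * zeros + [1]


def shake_pad(message_bits, r):
    """SHAKE: append the suffix 1111 first, then apply the Keccak padding."""
    return pad10star1(list(message_bits) + [1, 1, 1, 1], r)


def split_blocks(padded_bits, r):
    """Cut the padded message into r-bit blocks for the absorbing phase."""
    return [padded_bits[i:i + r] for i in range(0, len(padded_bits), r)]


r = 1344  # bitrate of SHAKE128 (b = 1600, c = 256)
one_block = shake_pad([0] * (r - 6), r)  # longest message fitting one block
print(len(split_blocks(one_block, r)), one_block[-6:])
```

For a one-block SHAKE message of \(r-6\) bits, the suffix and the padding together occupy the last 6 bits of the block.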

Fig. 1. Sponge construction [4].

The Keccak permutation in SHA-3 consists of 24 rounds of five layers operating on the 1600-bit state, which can be represented as \(5\times 5\) 64-bit lanes. In general, \(2^l\) is used to denote the bit length of the lanes. If A denotes a 5-by-5-by-\(2^l\) array of bits that represents the state, then its indices are the integer triples (i, j, k) for which \(0\le i<5, 0\le j<5,\) and \(0\le k<2^l\). The bit that corresponds to (i, j, k) is denoted by A[i][j][k]. Names for the one-dimensional and two-dimensional sub-arrays are defined by the Keccak designers: \(A[\cdot ][j][k]\) is called a row, \(A[i][\cdot ][k]\) is a column, and \(A[i][j][\cdot ]\) is a lane; \(A[i][\cdot ][\cdot ]\) is called a sheet, \(A[\cdot ][j][\cdot ]\) is a plane, and \(A[\cdot ][\cdot ][k]\) is a slice.

The five layers in each round of the permutation are given below:

$$\begin{aligned} \theta :&A[i][j][k]\leftarrow A[i][j][k]+\sum _{j'=0}^4 A[i-1][j'][k]+\sum _{j'=0}^4 A[i+1][j'][k-1]\\ \rho :&A[i][j][k]\leftarrow A[i][j][k+T(i,j)],\hbox {where }T(i,j) \hbox { is a predefined constant}\\ \pi :&A[i][j][k]\leftarrow A[i'][j'][k], \text {where } \begin{pmatrix} i\\ j \end{pmatrix} = \begin{pmatrix} 0&{}1\\ 2&{}3 \end{pmatrix} \cdot \begin{pmatrix} i'\\ j' \end{pmatrix} .\\ \chi :&A[i][j][k]\leftarrow A[i][j][k]+((\lnot A[i+1][j][k])\wedge A[i+2][j][k]),\\ \iota :&A\leftarrow A+RC[i_r],\hbox { where }RC[i_r]\hbox { is the round constants}. \end{aligned}$$

It is interesting to note that the size of the permutation can be reduced to one of \(\{25,50,100,200,400,800\}\) bits by choosing \(2^l = 1,2,4,8,16,32\), respectively, as the size of the lanes. In such cases, the round functions are defined in exactly the same way, except that the rotation constants of the \(\rho \) operation are taken modulo the respective \(2^l\) instead of 64 as in the original 1600-bit full permutation. These size-reduced permutations are not used in the SHA-3 instances, but they are used in the Keccak challenges.
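For concreteness, the five layers can be sketched in Python on a \(5\times 5\) array of w-bit lanes. This is only an illustrative sketch: the names `keccak_p` and `shake128_one_block` are assumptions, the rotation offsets and round constants are generated by the rules of the Keccak specification rather than taken from this paper, and applying the first `rounds` rounds for a reduced version is an assumption matching the notation \(\mathtt{R}^i\).

```python
import hashlib


def rol(a, n, w):
    """Rotate the w-bit word a left by n positions."""
    n %= w
    return ((a << n) | (a >> (w - n))) & ((1 << w) - 1)


def keccak_p(lanes, w=64, rounds=24):
    """Apply the first `rounds` rounds to lanes[i][j] (w-bit integers)."""
    lfsr = 1
    for _ in range(rounds):
        # theta: add the parities of two neighbouring columns to every bit
        C = [lanes[x][0] ^ lanes[x][1] ^ lanes[x][2] ^ lanes[x][3] ^ lanes[x][4]
             for x in range(5)]
        D = [C[(x + 4) % 5] ^ rol(C[(x + 1) % 5], 1, w) for x in range(5)]
        lanes = [[lanes[x][y] ^ D[x] for y in range(5)] for x in range(5)]
        # rho and pi: rotate each lane, then move it to its new position
        x, y = 1, 0
        cur = lanes[x][y]
        for t in range(24):
            x, y = y, (2 * x + 3 * y) % 5
            cur, lanes[x][y] = lanes[x][y], rol(cur, (t + 1) * (t + 2) // 2, w)
        # chi: the 5-bit S-box applied to every row
        for y in range(5):
            row = [lanes[x][y] for x in range(5)]
            for x in range(5):
                lanes[x][y] = row[x] ^ (~row[(x + 1) % 5] & row[(x + 2) % 5])
        # iota: round constant produced by an 8-bit LFSR, truncated to w bits
        for j in range(7):
            lfsr = ((lfsr << 1) ^ ((lfsr >> 7) * 0x71)) % 256
            if lfsr & 2 and (1 << j) - 1 < w:
                lanes[0][0] ^= 1 << ((1 << j) - 1)
    return lanes


def shake128_one_block(msg, outlen=32, rounds=24):
    """SHAKE128 of a byte string that fits in one block (r = 1344 bits)."""
    rate = 168
    assert len(msg) < rate and outlen <= rate
    state = bytearray(200)
    state[:len(msg)] = msg
    state[len(msg)] ^= 0x1F   # suffix 1111 followed by the first padding bit
    state[rate - 1] ^= 0x80   # last padding bit
    lanes = [[int.from_bytes(state[8 * (x + 5 * y):8 * (x + 5 * y) + 8], "little")
              for y in range(5)] for x in range(5)]
    lanes = keccak_p(lanes, 64, rounds)
    out = b"".join(lanes[x][y].to_bytes(8, "little")
                   for y in range(5) for x in range(5))
    return out[:outlen]


digest = shake128_one_block(b"abc")
print(digest == hashlib.shake_128(b"abc").digest(32))
```

With `rounds=24` the output can be compared against `hashlib.shake_128`; smaller values of `rounds` give a round-reduced variant under the assumption stated above.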

The first three layers are linear mappings and we denote their composition by \(L \triangleq \pi \circ \rho \circ \theta \). The only non-linear layer of the permutation is \(\chi \), which can be seen as an S-box layer that applies a 5-bit substitution to each of the 320 rows of the 1600-bit state. We use \(\mathtt{S}(x)\) to denote the substitution of a 5-bit input value x. The difference distribution table of the S-box is denoted by DDT, where \(\mathtt{DDT} (\delta _{in},\delta _{out})\) represents the size of the set \(\{x:\mathtt{S}(x)+\mathtt{S}(x+\delta _{in})=\delta _{out}\}\). We denote the Keccak permutation reduced to the first i rounds by \(\mathtt{R}^i\) (note that the round functions are identical up to the constant addition in \(\iota \), and we will omit \(\iota \) as it has little impact on our differential collision attack), i.e., \(\mathtt{R}^i(\overline{M})\) is the state after i rounds of processing of the padded message \(\overline{M}\).
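As an illustration of these definitions, the following Python sketch tabulates the S-box and its DDT. The convention that bit i of the integer x holds \(A[i][j][k]\), and the names `sbox`, `build_ddt` and `compatible_inputs`, are assumptions made for this example.

```python
def sbox(x):
    """Keccak chi on one 5-bit row; bit i of x is A[i][j][k]."""
    bits = [(x >> i) & 1 for i in range(5)]
    out = 0
    for i in range(5):
        out |= (bits[i] ^ ((bits[(i + 1) % 5] ^ 1) & bits[(i + 2) % 5])) << i
    return out


def build_ddt():
    """ddt[d_in][d_out] = #{x : S(x) + S(x + d_in) = d_out}."""
    ddt = [[0] * 32 for _ in range(32)]
    for d_in in range(32):
        for x in range(32):
            ddt[d_in][sbox(x) ^ sbox(x ^ d_in)] += 1
    return ddt


DDT = build_ddt()


def compatible_inputs(d_out):
    """Input differences that can lead to the output difference d_out."""
    return [d_in for d_in in range(32) if DDT[d_in][d_out] != 0]


print(sorted({value for row in DDT for value in row}))
```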

4 Overview of Our Collision Attack

In this section, we give an overview of our collision attacks, followed by the details of the algebraic methods to achieve two-round connectors. Unless otherwise specified, we assume in this paper that the messages used consist of one block after padding. To fulfil the Keccak padding rule, one needs to fix the last bit of the padded message to “1”; hence the first \(r-1\) bits of the state are under the full control of the attacker through the message bits, and the last c bits of the state are fixed to zeros as in the IV specified by Keccak. When applied to SHAKE, there are \(r-6\) free bits under control, obtained by setting the last 6 bits of the padded message to all 1’s so that it is compatible with the specific SHAKE padding rule.

Following the framework by Dinur et al. [10], as well as many other collision attacks utilizing differential trails, our collision attacks consist of two parts, i.e., a high-probability differential trail and a connector linking the differential trail with the initial value, as depicted in Fig. 2. Let \(\varDelta S_{I}\) and \(\varDelta S_{O}\) denote the input and output differences of the differential trail, respectively. Dinur et al. explored a method, which they call the “target difference algorithm”, to find message pairs \((M, M')\) such that the output difference after one round of the permutation is \(\varDelta S_{I}\), formally \(\mathtt{R}^1(\overline{M} || 0^c) +\mathtt{R}^1(\overline{M'}||0^c) = \varDelta S_{I}\). In what follows, we show an algebraic method to extend this connector to two rounds, i.e., a new target difference algorithm to find \((M, M')\) such that \(\mathtt{R}^2(\overline{M} || 0^c) +\mathtt{R}^2(\overline{M'}||0^c) = \varDelta S_{I}\). The differential trail is then fulfilled probabilistically by trying many such message pairs, and a collision is produced if the first d bits of \(\varDelta S_{O}\) are 0. As we are aiming at low-complexity attacks, finding solutions of connectors should be practical, so that this part does not dominate the overall complexity of collision finding. Details of the differential trail search will be discussed in Sect. 5.

Overall, the constraints on the two-round connectors are that the last \(c+1\) (or \(c+6\)) bits of the initial state are fixed, and that the output difference after two rounds is given and fixed (it is determined by the differential trail to be used). We utilize the degrees of freedom from the first \(r-1\) (or \(r-6\)) bits of the initial state to find solutions efficiently. We start with some observations on the Keccak S-box, then move to the details of the solution-finding algorithm.

Fig. 2. Overview of 5-round collision attack

4.1 S-Box Linearization and Affine Subspaces

The key observation is that the internal state is much larger than the digest size, which provides a large number of degrees of freedom to attackers. One can choose subsets of the available space with special properties to achieve fast enumeration. In the case of Keccak, we choose subsets on which the S-box is linear, i.e., the expression of the S-box can be re-written as a linear transformation when the input is restricted to such subsets. Obviously, the S-box is non-linear when the entire \(2^{5}\) input space is considered. However, as shown below, affine subspaces of size up to 4 can be found on which the S-box can be linearized. Note that the S-box is the only nonlinear part of the Keccak round function. Hence, the entire round function becomes linear when restricted to such subspaces. Formally, we define

Definition 1

(Linearizable affine subspace). Linearizable affine subspaces are affine input subspaces on which the S-box substitution is equivalent to a linear transformation. If V is a linearizable affine subspace of an S-box operation \(\mathtt{S}(\cdot )\), then for all \(x\in V\), \(\mathtt{S}(x)=A\cdot x+b\), where A is a matrix and b is a constant vector.

For example, when the input is restricted to the subset \(\{\mathtt{00000,00001,00100,}\mathtt{00101}\}\) (\(\{\mathtt{00,01,04,05}\}\) in hex), the corresponding output set of the Keccak S-box is \(\{\mathtt{00000,01001,00101,01100}\}\) (\(\{\mathtt{00,09,05,0C}\}\) in hex), and the expression of the S-box can be re-written as the linear transformation:

$$\begin{aligned} y= \begin{pmatrix} 1&{} 0&{} 1&{} 0&{} 0 \\ 0&{} 1&{} 0&{} 0&{} 0\\ 0&{} 0&{} 1&{} 0&{} 0\\ 1&{} 0&{} 0&{} 1&{} 0\\ 0&{} 0&{} 0&{} 0&{} 1 \end{pmatrix} \cdot x \end{aligned}$$
(1)

where x and y are the bit vector representations of the input and output values of the Keccak S-box, with the last (least significant) bit on top. By rotation symmetry, four more linearizable affine subspaces can be deduced from this one.
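Equation (1) can be checked directly with a few lines of Python. In this sketch the matrix rows are listed top to bottom as in (1), bit i of an integer corresponds to the i-th entry of the vector counted from the top, and the helper names are illustrative assumptions.

```python
def sbox(x):
    """Keccak chi on one 5-bit row; bit i of x is A[i][j][k]."""
    bits = [(x >> i) & 1 for i in range(5)]
    out = 0
    for i in range(5):
        out |= (bits[i] ^ ((bits[(i + 1) % 5] ^ 1) & bits[(i + 2) % 5])) << i
    return out


# Matrix of Eq. (1); row r produces output bit r, column c reads input bit c.
M = [
    [1, 0, 1, 0, 0],
    [0, 1, 0, 0, 0],
    [0, 0, 1, 0, 0],
    [1, 0, 0, 1, 0],
    [0, 0, 0, 0, 1],
]


def apply_matrix(matrix, x):
    """Multiply a 5x5 binary matrix by the bit vector of x over GF(2)."""
    bits = [(x >> i) & 1 for i in range(5)]
    y = 0
    for r in range(5):
        y |= (sum(matrix[r][c] & bits[c] for c in range(5)) % 2) << r
    return y


subspace = [0x00, 0x01, 0x04, 0x05]
print(all(apply_matrix(M, x) == sbox(x) for x in subspace))
```

Outside the subspace the matrix no longer agrees with the S-box, for instance at input 02.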

Exhaustive search for the linearizable affine subspaces of the Keccak S-box shows:

Observation 1

Out of the entire 5-dimensional input space,

  a. there are in total 80 2-dimensional linearizable affine subspaces, as listed in Table 5 in Appendix A.

  b. there does not exist any linearizable affine subspace of dimension 3 or more.

For completeness, any 1-dimensional affine subspace is automatically a linearizable affine subspace.
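The exhaustive search behind Observation 1 is small enough to reproduce. The sketch below is an independent illustration, not the authors' code: it enumerates every affine subspace of the 5-bit input space and tests whether the S-box is affine on it, using the criterion that \(\mathtt{S}(a+u+v)+\mathtt{S}(a+u)+\mathtt{S}(a+v)+\mathtt{S}(a)=0\) for all directions u, v of the subspace. All function names are assumptions.

```python
from itertools import combinations


def sbox(x):
    """Keccak chi on one 5-bit row; bit i of x is A[i][j][k]."""
    bits = [(x >> i) & 1 for i in range(5)]
    out = 0
    for i in range(5):
        out |= (bits[i] ^ ((bits[(i + 1) % 5] ^ 1) & bits[(i + 2) % 5])) << i
    return out


def span(basis):
    """All XOR combinations of the given vectors."""
    points = {0}
    for b in basis:
        points |= {p ^ b for p in points}
    return points


def flats(dim):
    """All affine subspaces of the 5-bit space of the given dimension."""
    found = set()
    for basis in combinations(range(1, 32), dim):
        linear = span(basis)
        if len(linear) != 1 << dim:   # basis vectors were dependent
            continue
        for offset in range(32):
            found.add(frozenset(offset ^ p for p in linear))
    return found


def is_linearizable(flat):
    """True if the S-box restricted to the affine subspace is affine."""
    points = sorted(flat)
    a = points[0]
    dirs = [p ^ a for p in points]
    return all(sbox(a ^ u ^ v) ^ sbox(a ^ u) ^ sbox(a ^ v) ^ sbox(a) == 0
               for u in dirs for v in dirs)


counts = {dim: sum(is_linearizable(f) for f in flats(dim)) for dim in (1, 2, 3)}
print(counts)
```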

Since the affine subspaces are to be used together with differential trails, we are interested in those linearizable affine subspaces with fixed input and output differences, which are closely related to the difference distribution table (DDT) of the S-box. Referring to the DDT of the Keccak S-box given in Appendix B, we observe:

Observation 2

Given a 5-bit input difference \(\delta _{in}\) and a 5-bit output difference \(\delta _{out}\) such that \(\mathtt{DDT} (\delta _{in},\delta _{out})\ne 0\), denote the value solution set \(V=\{x:\mathtt{S}(x)+\mathtt{S}(x+\delta _{in})=\delta _{out}\}\) and \(\mathtt{S}(V)=\{\mathtt{S}(x):x\in V\}\), we have

  a. if \(\mathtt{DDT} (\delta _{in},\delta _{out})=2\) or 4, then V is a linearizable affine subspace.

  b. if \(\mathtt{DDT} (\delta _{in},\delta _{out})=8\), then there are six 2-dimensional subsets \(W_i\subset V,i=0,1,\cdots ,5\) such that \(W_i(i=0,1,\cdots ,5)\) are linearizable affine subspaces.

It is interesting to note that the 2-dimensional linearizable affine subspaces obtained from the analysis of the DDT cover all the 80 cases in Observation 1. It was already noted in [15] that there is a one-to-one correspondence between linearizable affine subspaces and entries with value 2 or 4 in the DDT. As for the DDT entries of value 8, we leave the 6 choices of 2-dimensional linearizable affine subspaces for later use. As an example, the 3-dimensional affine subspace corresponding to \(\mathtt{DDT} (\mathtt{01,01})\), i.e., with both input and output differences being 01, is \(\{\mathtt{10, 11, 14, 15, 18, 19, 1C, 1D}\}\) and the six 2-dimensional linearizable affine subspaces contained in it are

$$\begin{aligned} \begin{aligned}&\{\mathtt{10, 11, 14, 15}\}, \\&\{\mathtt{10, 11, 18, 19}\}, \\&\{\mathtt{10, 11, 1C, 1D}\}, \\&\{\mathtt{14, 15, 18, 19}\}, \\&\{\mathtt{14, 15, 1C, 1D}\}, \\&\{\mathtt{18, 19, 1C, 1D}\}. \end{aligned} \end{aligned}$$
(2)
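Observation 2 and the example in (2) can be re-derived with the following Python sketch. It relies on the fact that four distinct points form a 2-dimensional affine subspace exactly when they XOR to zero; the function names are illustrative assumptions.

```python
from itertools import combinations


def sbox(x):
    """Keccak chi on one 5-bit row; bit i of x is A[i][j][k]."""
    bits = [(x >> i) & 1 for i in range(5)]
    out = 0
    for i in range(5):
        out |= (bits[i] ^ ((bits[(i + 1) % 5] ^ 1) & bits[(i + 2) % 5])) << i
    return out


def solution_set(d_in, d_out):
    """V = {x : S(x) + S(x + d_in) = d_out}."""
    return [x for x in range(32) if sbox(x) ^ sbox(x ^ d_in) == d_out]


def is_linearizable(points):
    """True if the S-box is affine on the affine subspace `points`."""
    a = points[0]
    dirs = [p ^ a for p in points]
    return all(sbox(a ^ u ^ v) ^ sbox(a ^ u) ^ sbox(a ^ v) ^ sbox(a) == 0
               for u in dirs for v in dirs)


def linearizable_planes(points):
    """2-dimensional linearizable affine subspaces contained in `points`."""
    return [q for q in combinations(sorted(points), 4)
            if q[0] ^ q[1] ^ q[2] ^ q[3] == 0 and is_linearizable(list(q))]


def check_observation_2():
    """Return (claim a holds, set of plane counts seen for DDT entries of 8)."""
    small_ok, plane_counts = True, set()
    for d_in in range(1, 32):
        for d_out in range(32):
            V = solution_set(d_in, d_out)
            if len(V) in (2, 4):
                small_ok = small_ok and is_linearizable(V)
            elif len(V) == 8:
                plane_counts.add(len(linearizable_planes(V)))
    return small_ok, plane_counts


for plane in linearizable_planes(solution_set(0x01, 0x01)):
    print([format(p, "02X") for p in plane])
print(check_observation_2())
```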

When lifted to the whole Keccak state, the direct product of the affine subspaces of the individual S-boxes forms an affine subspace of the entire state with larger dimension. In other words, when all the S-boxes in the round function are linearized, the entire round function becomes linear. This is how we handle the S-box layer of the first round of the 2-round connector.

4.2 A Connector Covering Two Rounds

The core idea of our two-round connector is to convert the problem into that of solving a system of linear equations. Two rounds of the Keccak permutation can be expressed as \(\chi _1\circ L_1 \circ \chi _0 \circ L_0\) (omitting \(\iota \)). With the \(\chi _0\) layer linearized by the techniques discussed above, i.e., given the input and output differences of \(\chi _0\), the first three operations \( L_1 \circ \chi _0 \circ L_0\) become linear. We will describe later how the input and output differences of \(\chi _0\) are selected. Now, we show how \(\chi _1\) can be inverted by adding more constraints in the form of linear equations. In our attack setting, the output difference of \(\chi _1\) is given as \(\varDelta S_I\), the input difference of the 3-round differential trail. Not all S-boxes of the \(\chi _1\) layer need to be active, i.e., have a non-zero difference. Only the active S-boxes are of concern here, and each of them is inverted by randomly choosing an input difference with a non-zero number of solutions, which we call a compatible input difference. Formally, given the output difference \(\delta _{out}\), the compatible input differences are \(\{\delta _{in}: \mathtt{DDT} (\delta _{in}, \delta _{out}) \ne 0\}\). As noted previously [3, 8, 10], for any pair \((\delta _{in}, \delta _{out})\), the solution set \(V = \{x : \mathtt{S}(x) +\mathtt{S}(x+\delta _{in}) = \delta _{out}\}\) forms an affine subspace. In other words, when the size of the solution set V is \(2^{5-i}\), V can be obtained from the set \(\{0,1\}^5\) by imposing i constraints that turn out to be binary linear equations. For example, \(\mathtt{DDT} (\mathtt{03,02})\) corresponds to the 2-dimensional affine subspace \(\{\mathtt{14, 17, 1C, 1F}\}\), which can be described by the following three linear equations:

$$\begin{aligned} \begin{pmatrix} 0 &{} 0 &{} 1 &{} 0 &{} 0\\ 1 &{} 1 &{} 0 &{} 0 &{} 0\\ 0 &{} 0 &{} 0 &{} 0 &{} 1 \end{pmatrix} \cdot x = \begin{pmatrix} 1\\ 0\\ 1 \end{pmatrix}. \end{aligned}$$
(3)
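The linear equations describing a solution set can be recovered mechanically. The sketch below, with hypothetical helper names, lists every equation \(\langle mask, x\rangle = const\) that holds on all of V; a mask is an integer whose bit i selects \(x_i\), so the three rows of (3) correspond to the masks 04, 03 and 10 with right-hand sides 1, 0 and 1.

```python
def sbox(x):
    """Keccak chi on one 5-bit row; bit i of x is A[i][j][k]."""
    bits = [(x >> i) & 1 for i in range(5)]
    out = 0
    for i in range(5):
        out |= (bits[i] ^ ((bits[(i + 1) % 5] ^ 1) & bits[(i + 2) % 5])) << i
    return out


def solution_set(d_in, d_out):
    """V = {x : S(x) + S(x + d_in) = d_out}."""
    return [x for x in range(32) if sbox(x) ^ sbox(x ^ d_in) == d_out]


def parity(x):
    """XOR of the bits of x."""
    return bin(x).count("1") & 1


def linear_constraints(points):
    """All (mask, const) with parity(mask & x) == const for every x in points."""
    constraints = []
    for mask in range(1, 32):
        values = {parity(mask & x) for x in points}
        if len(values) == 1:
            constraints.append((mask, values.pop()))
    return constraints


V = solution_set(0x03, 0x02)
print([format(x, "02X") for x in V])
print(linear_constraints(V))
```

The returned list contains all non-zero combinations of the independent equations, so a solution set of size \(2^{5-i}\) yields \(2^i-1\) entries, of which any i independent ones suffice.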

It is important to note that, under the i linear constraints defining the set V, there is a bijective relation between \(\delta _{in}\) and \(\delta _{out}\), i.e., given one, the other can be deduced deterministically. Hence, each active S-box in the \(\chi _1\) layer is inverted by a choice of compatible input difference together with the corresponding i linear constraints on the input values. Once the input differences and linear constraints for all active S-boxes of \(\chi _1\) are enforced and fulfilled, solutions of the 2-round connector are found. Note that a choice of compatible input differences of \(\chi _1\) is a choice of \(\beta _1\), and \(\alpha _1\) is then uniquely determined by the relation \(\alpha _1 = L^{-1}(\beta _1)\). In the remaining part of this subsection, we give more details on the implementation of this idea.

As depicted in Fig. 3, the variables of our equation system are the bit values before the first \(\chi \) layer, denoted by the vector x. The bit vectors y and z hold intermediate values, where y represents the output of the first \(\chi \) layer and z the bits before the second \(\chi \) layer. The main task is to translate all constraints on differences and affine subspaces into constraints on the variables x. Now, suppose \(\beta _1\) and \(\beta _0\) (details will be given in Sect. 4.4) are fixed and \(\varDelta S_I\) (i.e., \(\alpha _2\)) is given; we show how the system of equations can be set up. With the input difference \(\beta _1\) and the output difference \(\varDelta S_I\) of \(\chi _1\), all the linear constraints on the input affine subspaces of the active S-boxes can be derived and stored as

$$\begin{aligned} G\cdot z=m, \end{aligned}$$

where G is a block-diagonal matrix in which each diagonal block, together with the corresponding constants in m, formulates the constraints of one active S-box. A similar procedure is applied to the input affine subspaces of the first round, except that the input is restricted to linearizable affine subspaces for all S-boxes, regardless of whether or not the S-box is active, so that the \(\chi _0\) layer can be replaced by a linear transformation \(\chi _L\). We denote these constraints by

$$\begin{aligned} A\cdot x = t. \end{aligned}$$
(4)

Then x and y can be linked by

$$\begin{aligned} \chi _L \cdot x+\chi _C=y, \end{aligned}$$

where \(\chi _C\) denotes the constant offsets for the affine subspaces. Furthermore, the two equation systems can be linked by

$$\begin{aligned} L\cdot (y + RC[0]) = z, \end{aligned}$$

where RC[0] denotes the round constant of the first round. Note that only the active S-boxes of the second round are of concern, i.e., only part of the bits of z are constrained; hence the same applies to y, and we use \(y'\) to denote these known bits of y later on. Overall, the constraints on z can be translated into constraints on x as

$$\begin{aligned} G \cdot L \cdot (\chi _L \cdot x + \chi _C + RC[0]) = m. \end{aligned}$$
(5)

Note an additional constraint x needs to fulfil is that the last \(c+1\) (or \(c+6\)) bits of initial state are pre-fixed, which can be derived as

$$\begin{aligned} L^{-1}(x) = CA, \end{aligned}$$
(6)

where CA denotes the preset values for the bits of the inner state and the padding bits. We use \(E_M\) to denote the equation systems (4), (5) and (6); solutions fulfilling \(E_M\) are solutions of the 2-round connector.
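Maintaining \(E_M\) amounts to adding linear equations over GF(2) one at a time and detecting inconsistency early. A minimal sketch of such an incremental system is given below; it is only an illustration under the assumption that each equation is stored as an integer bitmask of variables plus a right-hand side bit, and the class name `GF2System` and its methods are hypothetical.

```python
class GF2System:
    """Incremental system of linear equations over GF(2)."""

    def __init__(self, nvars):
        self.nvars = nvars
        self.rows = {}  # pivot bit position -> (mask, rhs)

    def _reduce(self, mask, rhs):
        """Eliminate known pivots from the equation <mask, x> = rhs."""
        while mask:
            pivot = mask.bit_length() - 1
            if pivot not in self.rows:
                return mask, rhs
            pivot_mask, pivot_rhs = self.rows[pivot]
            mask ^= pivot_mask
            rhs ^= pivot_rhs
        return 0, rhs

    def is_consistent(self, mask, rhs):
        """Would adding <mask, x> = rhs keep the system solvable?"""
        reduced_mask, reduced_rhs = self._reduce(mask, rhs)
        return reduced_mask != 0 or reduced_rhs == 0

    def add(self, mask, rhs):
        """Add one equation; return False (and change nothing) on conflict."""
        reduced_mask, reduced_rhs = self._reduce(mask, rhs)
        if reduced_mask == 0:
            return reduced_rhs == 0
        self.rows[reduced_mask.bit_length() - 1] = (reduced_mask, reduced_rhs)
        return True

    def try_add_all(self, equations):
        """Add a group of equations atomically, rolling back on conflict."""
        backup = dict(self.rows)
        for mask, rhs in equations:
            if not self.add(mask, rhs):
                self.rows = backup
                return False
        return True

    def degrees_of_freedom(self):
        """Dimension of the current solution space."""
        return self.nvars - len(self.rows)


system = GF2System(5)
system.try_add_all([(0b00100, 1), (0b00011, 0), (0b10000, 1)])  # Eq. (3)
print(system.degrees_of_freedom(), system.is_consistent(0b00111, 0))
```

The atomic `try_add_all` mirrors the way a group of equations for one S-box is either accepted as a whole or rejected.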

Algorithm for Building Two-Round Connectors. We use the basic linearization procedure to generate the equations for confining x to a smaller subspace suitable for linearization of the first \(\chi \) layer and use the main linearization procedure to generate the final equations to bypass the second \(\chi \) layer. One of the inputs of the basic procedure is the equation system \(E_M\) on x values, other inputs include the input and output differences of the first S-box layer \(\beta _0, \alpha _1\) and \(y'\).

The Basic Linearization Procedure.

Inputs: \(E_M\), \(\beta _0,\alpha _1,y'\).

Outputs: updated \(E_M\), \(\chi _L,\chi _C\).

  1. Initialize a matrix \(\chi _L\) and a vector \(\chi _C\).

  2. Iterate over each bit of \(y'\) and compute the index of the bit at the S-box level, say the j-th bit of the i-th S-box in the first round. Then, for the i-th S-box in the first round:

    (a) If the i-th S-box has not been processed in this procedure before, then:

      (i) If it is non-active, randomly choose a linearizable 2-dimensional subspace and check whether the 3 equations specifying this 2-dimensional affine subspace are consistent with the current \(E_M\). If so, add them to \(E_M\) and update \(\chi _L\) and \(\chi _C\) with the j-th line of the matrix which specifies the affine linear transformation. Continue with the next bit of \(y'\) in step 2. Otherwise, try another linearizable 2-dimensional subspace. If all linearizable 2-dimensional subspaces have been tried and no consistent equations exist, output “No Solution in basic procedure”.

      (ii) Otherwise it is active: find its input and output differences from \(\beta _0\) and \(\alpha _1\), i.e., \(\delta _{in}, \delta _{out}\).

        Case 1. When \(\mathtt{DDT} (\delta _{in},\delta _{out})=8\), randomly choose one of the six linearizable 2-dimensional subspaces and the corresponding equation that specializes to this 2-dimensional subspace (the other two of the three equations formulating the 2-dimensional subspace have already been included in \(E_M\) by the procedure for choosing \(\beta _0\)). If the current \(E_M\) is consistent with this linear equation, add it to \(E_M\) and update \(\chi _L\) and \(\chi _C\) with the j-th line of the matrix which specifies the linear map from the 2-dimensional subspace to the output 2-dimensional subspace of the S-box. Continue with the next bit of \(y'\) in step 2. Otherwise, try another randomly chosen 2-dimensional linearizable subspace. If all six 2-dimensional linearizable subspaces have been tried and no consistent equation exists, output “No Solution in basic procedure”.

        Case 2. When \(\mathtt{DDT} (\delta _{in},\delta _{out})=2\text { or }4\), update \(\chi _L\) and \(\chi _C\) with the j-th line of the matrix which specifies the affine linear transformation of the input 1 or 2-dimensional subspace to the output 1 or 2-dimensional subspace of the S-box. Continue with the next bit of \(y'\) in step 2.

    (b) Otherwise, if the i-th S-box has already been processed in this procedure: update \(\chi _L\) and \(\chi _C\) with the j-th line of the matrix which specifies the affine linear transformation of the predefined linearizable subspace to the output subspace of the S-box.

  3. Output the current equation system \(E_M\) as well as \(\chi _L\) and \(\chi _C\) such that \(\chi _L\cdot x+\chi _C=y'\).
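The update of \(\chi _L\) and \(\chi _C\) for a single S-box in the steps above requires a matrix and a constant describing the S-box on the chosen subspace. One possible way to obtain them is sketched below. This is an assumption-laden illustration and not the authors' implementation: the matrix is not unique, and here directions outside the subspace are simply mapped to themselves, a choice that happens to reproduce the matrix of (1). The function names are hypothetical.

```python
def sbox(x):
    """Keccak chi on one 5-bit row; bit i of x is A[i][j][k]."""
    bits = [(x >> i) & 1 for i in range(5)]
    out = 0
    for i in range(5):
        out |= (bits[i] ^ ((bits[(i + 1) % 5] ^ 1) & bits[(i + 2) % 5])) << i
    return out


def apply_affine(cols, const, x):
    """Evaluate A*x + b, where cols[i] is the i-th column of A as an integer."""
    y = const
    for i in range(5):
        if (x >> i) & 1:
            y ^= cols[i]
    return y


def affine_map_on_flat(flat):
    """Return (cols, const) with S(x) = A*x + b on the flat, or None."""
    points = sorted(flat)
    a = points[0]
    basis, images, spanned = [], [], {0}
    for d in (p ^ a for p in points):        # basis of the direction space
        if d not in spanned:
            basis.append(d)
            images.append(sbox(a ^ d) ^ sbox(a))
            spanned |= {s ^ d for s in spanned}
    for e in (1 << i for i in range(5)):      # extend to a basis of the space
        if e not in spanned:
            basis.append(e)
            images.append(e)                  # arbitrary choice: identity
            spanned |= {s ^ e for s in spanned}
    table = {0: 0}                            # the linear map on all 32 inputs
    for b, image in zip(basis, images):
        table.update({x ^ b: y ^ image for x, y in list(table.items())})
    cols = [table[1 << i] for i in range(5)]
    const = sbox(a) ^ table[a]
    if any(apply_affine(cols, const, x) != sbox(x) for x in points):
        return None                           # the flat is not linearizable
    return cols, const


print(affine_map_on_flat([0x00, 0x01, 0x04, 0x05]))
print(affine_map_on_flat([0x10, 0x11, 0x14, 0x15]))
```

The j-th line mentioned in the procedure corresponds to bit j of every column in `cols` together with bit j of `const`.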

The inputs to the Main procedure are \(\beta _0,\alpha _1, \beta _1, \alpha _2(\varDelta S_I)\) and \(E_M\) we get after choosing \(\beta _0\).

The Main Linearization Procedure.

Input: \(E_M, \beta _0,\alpha _1,\beta _1,\alpha _2\).

Output: Updated \(E_M\).

  1. Using \(\beta _1\) and \(\alpha _2\), initialize a coefficient matrix G and a constant vector m that specify the linear equations constraining the input bits of the second S-box layer, to obtain the equation \(G\cdot z=m\).

  2. Write L in matrix form for \(L \cdot (y + RC[0]) = z\).

  3. Initialize a counter to 0.

  4. Execute the basic linearization procedure with the indices of the known bits \(y'\) in y and with \(E_M, \beta _0\) and \(\alpha _1\). If the procedure succeeds, it returns the matrix specifying the linearization of the first S-box layer such that \(\chi _L\cdot x+\chi _C=y'\); then continue to step 6. Otherwise, go to step 5.

  5. Increment the counter. If the counter’s value is equal to a preset threshold T1, output “Failed”. Otherwise, go to step 4.

  6. Test whether the equation system (5) is consistent with \(E_M\). If so, add the new system to \(E_M\) and output the final \(E_M\). Otherwise, go to step 5.

Note that the algorithms do not succeed all the time. To overcome this problem, starting from the input difference \(\varDelta S_I\) of a 3-round differential trail, we repeat random picks of compatible input differences \(\beta _1\) until the main procedure succeeds. As the number of active S-boxes in \(\alpha _2\) is large enough (ranging from tens to hundreds in our experiments), there are enough different choices of \(\beta _1\) to give a high final success probability. An interesting point is that the inversion from \(\alpha _2\) to \(\beta _1\) does not need to have high probability, because this transition is covered by our two-round connector. Besides, since the number of active S-boxes of the input difference is unconstrained, there is more freedom in searching for the most suitable 3-round differential trails. We describe the search strategies in Sect. 5. Finally, an exhaustive search for a solution following the subsequent 3-round differential trail can be performed over the solution space of \(E_M\).

4.3 Analysis of Degree of Freedom

The degree of freedom of the solution space of the final \(E_M\) is a key factor in the success of our method. A solution space with degree of freedom larger than the weight of the 3-round differential trail can be expected to yield a message pair with colliding digests. After the linearization of the first round, the degree of freedom is \(\sum _{i=0}^{\frac{b}{5}-1}{\mathtt{DF}} ^{(1)}_i\), in which \({\mathtt{DF}} ^{(1)}_i\) is the degree of freedom of the 5-bit input space of the i-th S-box in the first round. The value of \({\mathtt{DF}} ^{(1)}_i\) is assigned according to the rules in Table 2.

The constraints on the initial state reduce the degree of freedom by \((c+p)\), where c is the capacity and p is due to the padding rule. We have \(p=1\) for Keccak and \(p=6\) for SHAKE. Another decrease in the degree of freedom is due to the constraints on the input values of the S-box layer in the second round. The definition of \({\mathtt{DF}} ^{(2)}_i\), the degree of freedom of the 5-bit input values to S-boxes in the second round, is

$$\begin{aligned} {\mathtt{DF}} ^{(2)}_i= {\left\{ \begin{array}{ll} 1, &{} \mathtt{DDT} (\delta _{in},\delta _{out})=2,\\ 2, &{} \mathtt{DDT} (\delta _{in},\delta _{out})=4,\\ 3, &{} \mathtt{DDT} (\delta _{in},\delta _{out})=8,\\ 5, &{}\mathtt{DDT} (\delta _{in},\delta _{out})=32, \end{array}\right. } \end{aligned}$$
(7)

where \(\delta _{in}\) and \(\delta _{out}\) are the input and output differences of the i-th S-box in the second round; the last case is that of a non-active S-box, for which \(\delta _{in}=\delta _{out}=0\). For the i-th S-box in the second round, we add \((5-{\mathtt{DF}} ^{(2)}_i)\) equations to \(E_M\) and expect the degree of freedom to decrease by this amount.

Table 2. Rules for value assignment for \({\mathtt{DF}} ^{(1)}_i\).

The degree of freedom of the final \(E_M\) is estimated as

$$\begin{aligned} {\mathtt{DF}} =\sum _{i=0}^{\frac{b}{5}-1}{\mathtt{DF}} ^{(1)}_i-(c+p)-\sum _{i=0}^{\frac{b}{5}-1}(5-{\mathtt{DF}} ^{(2)}_i). \end{aligned}$$
(8)

A large \({\mathtt{DF}} \) benefits our search for collisions in the rounds beyond the second round.
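The estimate (8) is plain bookkeeping and can be written as a short Python sketch. The function name and the input format are assumptions: `df1` lists \({\mathtt{DF}} ^{(1)}_i\) for all first-round S-boxes, and `active_ddt_round2` lists the DDT values of the active second-round S-boxes only, since non-active S-boxes contribute no equations. The numbers in the usage example are made up for illustration and are not taken from the experiments.

```python
# DF^(2) for an active second-round S-box, indexed by its DDT entry (Eq. (7)).
DF2_ACTIVE = {2: 1, 4: 2, 8: 3}


def estimate_df(df1, c, p, active_ddt_round2):
    """Estimated degree of freedom of the final E_M, following Eq. (8)."""
    first_round = sum(df1)
    second_round_cost = sum(5 - DF2_ACTIVE[v] for v in active_ddt_round2)
    return first_round - (c + p) - second_round_cost


# Hypothetical example: 320 S-boxes, each left with 2 degrees of freedom in
# round 1, SHAKE128 parameters (c = 256, p = 6), and 20 active S-boxes with
# DDT entry 8 in round 2.
print(estimate_df([2] * 320, 256, 6, [8] * 20))
```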

4.4 How to Choose \(\beta _0\)

So far we have not given details on how \(\beta _0\) is selected. We follow the work of Dinur et al. [10] in a more general way to uniquely determine \(\beta _0\), the difference before the \(\chi \) layer in the first round. The algorithm is called the “target difference algorithm” and consists of a difference phase and a value phase.

Given \(\varDelta S_{I}\), we have randomly chosen a compatible input difference \(\beta _1\). We then build two equation systems \(E_{\varDelta }\) and \(E_{M}\) accordingly. \(E_{\varDelta }\) is on the differences of the message pair and \(E_{M}\) is on the values of one message. The initialization of \(E_{\varDelta }\) should abide by (1) the constraints implied by the padding rules that the last \(c+1\) difference bits of the initial state equal 0, and (2) the condition that the input difference bits of the non-active S-boxes in the first round equal 0. The initialization of \(E_M\) should abide by the padding rules that the last \(c+p\) value bits equal \(1^p||0^c\). We set \(p=1\) for Keccak and \(p=6\) for SHAKE. These rules are easy to implement, as the variable vector x is an invertible linear mapping of the initial vector. Therefore, in the initialization period, we equate the corresponding bits to their enforced values in \(E_{\varDelta }\) and \(E_M\).

For \(E_{\varDelta }\), we add additional equations to enforce that \(\alpha _1\) can be deduced from \(\beta _0\). Though the obvious way is to equate the 5 input difference bits to a specific value for each active S-box in \(\beta _0\), this would restrict the solution space significantly. As suggested in [10], we choose one of the 2-dimensional affine subsets of input differences instead of a specific value for each active S-box. This is based on the fact that given any nonzero 5-bit output difference to a Keccak S-box, the set of possible input differences contains at least five 2-dimensional affine subspaces. After a consistent \(E_{\varDelta }\) system has been constructed, the solution space is an affine subspace of candidates for \(\beta _0\). We then continue to maintain \(E_{\varDelta }\) by iteratively adding the additional 2 equations that uniquely specify each 5-bit input difference for the active S-boxes. For all active S-boxes, once the specific input differences are determined, we add equations to the \(E_M\) system to confine the 5 bits of x entering each active S-box to an affine subspace corresponding to the uniquely determined \(\delta _{in}\) and \(\delta _{out}\). In this way, we always find a \(\beta _0\) compatible with \(\alpha _1\) that fulfills the constraints from the \(c+p\) bits of padding and pre-set capacity bits.
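The affine-subset property used in the difference phase can be explored with a short brute-force sketch. The code below assumes the standard 5-bit \(\chi \) row map with bit i of an integer holding row bit i; the helper names are hypothetical. It enumerates, for a given output difference, the compatible input differences and every 2-dimensional affine subspace contained in that set. This is only an exhaustive illustration on one S-box, not the equation-system machinery of the target difference algorithm.

```python
def chi_row(x):
    """Keccak chi on one 5-bit row: y[i] = x[i] ^ (~x[i+1] & x[i+2])."""
    bits = [(x >> i) & 1 for i in range(5)]
    out = 0
    for i in range(5):
        out |= (bits[i] ^ ((bits[(i + 1) % 5] ^ 1) & bits[(i + 2) % 5])) << i
    return out


def compatible_inputs(delta_out):
    """Set of input differences that can lead to delta_out through the S-box."""
    return {d for d in range(32)
            if any(chi_row(x) ^ chi_row(x ^ d) == delta_out for x in range(32))}


def affine_planes(diffs):
    """All 2-dimensional affine subspaces {a, a^u, a^v, a^u^v} contained in diffs."""
    planes = set()
    for a in diffs:
        for u in range(1, 32):
            for v in range(u + 1, 32):
                plane = frozenset({a, a ^ u, a ^ v, a ^ u ^ v})
                if plane <= diffs:
                    planes.add(plane)
    return planes


# Example: candidate affine subsets of input differences for output difference 0x01.
planes_for_01 = affine_planes(compatible_inputs(0x01))
```

Choosing one such plane per active S-box costs 3 linear equations on the 5 difference bits instead of 5, which is why the solution space of \(E_{\varDelta }\) shrinks more slowly.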

5 Search for Differential Trails

In this section, we elaborate on our searching algorithms for finding differential trails of Keccak. Our ideas greatly benefit from previous works of searching differential trails for Keccak [9, 14, 19]. We start by recalling several properties of the operations in the round function, followed by our considerations in finding differential trails. Then, we describe our searching algorithms which provide differential trails for practical collision attacks against Keccak [1440, 160, 5, 160], 5-round SHAKE128 and Keccak [640, 160, 5, 160] respectively, and trails for theoretical collision attack against 5-round \(\mathtt{{\textsc {Keccak}}}\)-224 and Keccak [1440, 160, 6, 160].

5.1 Properties of \(\theta ,\rho ,\pi ,\iota \) and \(\chi \)

\(\theta ,\rho ,\pi ,\iota \) are linear operations while \(\chi \) acts as the parallel application of 5-bit nonlinear S-boxes on the rows of the state. Since \(\iota \) adds a round constant and has no essential effect on difference, we ignore it in this section. Additionally, \(\rho \) and \(\pi \) do not change the number of active bits in a differential trail, but only positions. Therefore, \(\theta \) and \(\chi \) are the crucial parts for differential analysis.

To describe the properties of \(\theta \), we take definitions from [3]. The column parity (or parity for short) P(A) of a value (or difference) A is defined as the parity of the columns of A, i.e. \(P(A)[i][k]=\Sigma _j A[i][j][k]\). A column is even, if its parity is 0, otherwise it is odd. A state is in CP-kernel if all its columns are even.

\(\theta \) adds a pattern to the state, and this pattern is called the \(\theta \)-effect. The \(\theta \)-effect of a state A is \(E(A)[i][k]=P(A)[i-1][k]+P(A)[i+1][k-1]\). So \(\theta \) depends only on column parities. The \(\theta \)-gap is defined as the Hamming weight of the \(\theta \)-effect divided by two. Note that if the \(\theta \)-gap is g, after applying \(\theta \) there are 10g bits flipped. Given a state A in CP-kernel, the \(\theta \)-gap is zero and hence the Hamming weight of A remains after \(\theta \). Another interesting property is that \(\theta ^{-1}\) diffuses much faster than \(\theta \). More exactly, a single bit difference can be propagated to about half state bits through \(\theta ^{-1}\).
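The definitions above can be checked on a small state with the following sketch. It assumes a state stored as a nested list `A[i][j][k]` (column index i, row index j, lane position k) with lane length `w`; all function names are hypothetical and the code covers only the difference behaviour of \(\theta \).

```python
def zero_state(w):
    """All-zero 5 x 5 x w state."""
    return [[[0] * w for _ in range(5)] for _ in range(5)]


def column_parity(A, w):
    """P(A)[i][k] = XOR over j of A[i][j][k]."""
    return [[A[i][0][k] ^ A[i][1][k] ^ A[i][2][k] ^ A[i][3][k] ^ A[i][4][k]
             for k in range(w)] for i in range(5)]


def theta_effect(A, w):
    """E(A)[i][k] = P(A)[i-1][k] + P(A)[i+1][k-1] (indices modulo 5 and w)."""
    P = column_parity(A, w)
    return [[P[(i - 1) % 5][k] ^ P[(i + 1) % 5][(k - 1) % w]
             for k in range(w)] for i in range(5)]


def theta(A, w):
    """Add the theta-effect to every bit of the affected columns."""
    E = theta_effect(A, w)
    return [[[A[i][j][k] ^ E[i][k] for k in range(w)]
             for j in range(5)] for i in range(5)]


def theta_gap(A, w):
    """Hamming weight of the theta-effect divided by two."""
    return sum(map(sum, theta_effect(A, w))) // 2


def in_cp_kernel(A, w):
    """True if every column of A has even parity."""
    return not any(any(col) for col in column_parity(A, w))


def hamming(A):
    return sum(bit for plane in A for lane in plane for bit in lane)
```

For a single active bit the \(\theta \)-gap is 1 and 10 bits are flipped, while two active bits in the same column form a state in the CP-kernel that \(\theta \) leaves unchanged.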

Given an input difference to \(\chi \), all possible output differences occur with the same probability. In contrast, for a given output difference to \(\chi \), the compatible input differences do not all occur with the same probability, but the highest probability among them is determined. Moreover, for one-bit differences, each S-box of \(\chi \) acts as identity with probability \(2^{-2}\).
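These properties of \(\chi \) can be verified exhaustively on a single 5-bit S-box. The sketch below assumes the standard row map of \(\chi \) with bit i of an integer holding row bit i; `chi_row` and `build_ddt` are hypothetical helper names.

```python
def chi_row(x):
    """Keccak chi on one 5-bit row: y[i] = x[i] ^ (~x[i+1] & x[i+2])."""
    bits = [(x >> i) & 1 for i in range(5)]
    out = 0
    for i in range(5):
        out |= (bits[i] ^ ((bits[(i + 1) % 5] ^ 1) & bits[(i + 2) % 5])) << i
    return out


def build_ddt():
    """ddt[d_in][d_out] = number of x with chi(x) ^ chi(x ^ d_in) == d_out."""
    ddt = [[0] * 32 for _ in range(32)]
    for d_in in range(32):
        for x in range(32):
            ddt[d_in][chi_row(x) ^ chi_row(x ^ d_in)] += 1
    return ddt


DDT = build_ddt()

# Fixed input difference: every reachable output difference has the same count.
rows_uniform = all(len({e for e in row if e}) == 1 for row in DDT)

# Fixed output difference: counts of compatible input differences may differ.
column_01 = {DDT[d_in][0x01] for d_in in range(32) if DDT[d_in][0x01]}

# One-bit difference mapped to itself: probability DDT[1][1] / 32.
identity_probability = DDT[0x01][0x01] / 32
```

Here `rows_uniform` captures the first property, `column_01` contains more than one distinct count, and `identity_probability` equals \(2^{-2}\).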

5.2 Representation of Trails and Their Weights

As in previous sections, we denote the differences before and after i-th round by \(\alpha _{i}\) and \(\alpha _{i+1}\), respectively. Let \(\beta _i = L(\alpha _i)\). Therefore an n-round differential trail starting from 0-th round is of the following form

$$\alpha _0\xrightarrow {L}\beta _0\xrightarrow {\chi }\alpha _1\xrightarrow {L}\cdots \alpha _{n-1}\xrightarrow {L}\beta _{n-1}\xrightarrow {\chi }\alpha _n.$$

For the sake of simplicity, a trail can also be represented with only \(\beta _i\)’s or \(\alpha _i\)’s.

The weight of a differential \(\beta \rightarrow \alpha \) over a function f with domain \(\{0,1\}^b\) is defined as

$$w(\beta \rightarrow \alpha )=b-\mathrm {log}_2|\{x:f(x)\oplus f(x\oplus \beta )=\alpha \}|.$$

In other words, the weight of a differential \(\beta \rightarrow \alpha \) is equal to \(-\mathrm {log}_2\mathrm {Pr}(\beta \rightarrow \alpha )\). If \(\mathrm {Pr}(\beta \rightarrow \alpha )>0\), we say \(\alpha \) and \(\beta \) are compatible, otherwise the weight of \(\beta \rightarrow \alpha \) is undefined.

We denote the weight of i-th round differential by \(w_{i}\) where i starts from 0, and thus the weight of a trail is the sum of the weights of round differentials that constitute the trail. In addition, we use \(\#\mathrm {AS}(\alpha )\) to represent the number of active S-boxes in a state difference \(\alpha \). According to the properties of \(\chi \), given \(\beta _{i}\) the weight of (\(\beta _{i}\rightarrow \alpha _{i+1}\)) is determined; also, given \(\beta _{i}\) the minimum reverse weight of (\(\beta _{i-1}\rightarrow L^{-1}(\beta _i)\)) is fixed.
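To make the notions of weight and minimum reverse weight concrete, the following sketch evaluates both directly from the definition on a single 5-bit \(\chi \) S-box. It is an exhaustive illustration only; for a full state the weights of the individual rows add up. The function names are hypothetical, and the bit ordering (bit i of an integer is row bit i) is an assumption.

```python
from math import log2


def chi_row(x):
    """Keccak chi on one 5-bit row: y[i] = x[i] ^ (~x[i+1] & x[i+2])."""
    bits = [(x >> i) & 1 for i in range(5)]
    out = 0
    for i in range(5):
        out |= (bits[i] ^ ((bits[(i + 1) % 5] ^ 1) & bits[(i + 2) % 5])) << i
    return out


def weight(f, b, beta, alpha):
    """w(beta -> alpha) = b - log2 |{x : f(x) ^ f(x ^ beta) = alpha}|.

    Returns None when beta and alpha are incompatible (weight undefined).
    """
    count = sum(1 for x in range(1 << b) if f(x) ^ f(x ^ beta) == alpha)
    return None if count == 0 else b - log2(count)


def min_reverse_weight(f, b, alpha):
    """Smallest weight over all input differences compatible with alpha."""
    candidates = []
    for beta in range(1 << b):
        w = weight(f, b, beta, alpha)
        if w is not None:
            candidates.append(w)
    return min(candidates)


w_forward = weight(chi_row, 5, 0x01, 0x01)        # one-bit difference kept by chi
w_reverse = min_reverse_weight(chi_row, 5, 0x01)  # best input for output 0x01
```

The forward weight depends only on the input difference \(\beta \), whereas the reverse direction requires a minimum over several compatible inputs with different weights.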

As in [3], \(n-1\) consecutive \(\beta _i\)’s, say \((\beta _1,\cdots ,\beta _{n-1})\), are called an n-round trail core, which defines a set of n-round trails \(\alpha _0\xrightarrow {L}\beta _0\xrightarrow {\chi }\alpha _1\xrightarrow {L}\beta _1\cdots \xrightarrow {L}\beta _{n-1}\xrightarrow {\chi }\alpha _n\) where the first round is of the minimal weight determined by \(\alpha _1=L^{-1}(\beta _1)\), and \(\alpha _n\) is compatible with \(\beta _{n-1}\). The first step in mounting collision attacks against n-round Keccak is to find good \((n-1)\)-round trail cores.

5.3 Requirements for Differential Trails

Good trail cores are those satisfying all the requirements which we will explain as follows. The first requirement is that the difference of the output is zero, i.e. \(\alpha _{n_r}^d=0\) (we denote output digest difference after \(n_r\) rounds with \(\alpha _{n_r}^d\)). The second requirement relates to the freedom degree budget.

With the definition of weight, Eq. (8) can be represented in an alternative way

$$\begin{aligned} {\mathtt{DF}} =\sum _{i=0}^{b/5-1}{\mathtt{DF}} ^{(1)}_i-(c+p)-w_1. \end{aligned}$$
(9)

The first term of the formula depends on the number of S-boxes that need to be linearized and their corresponding DDT entries as depicted in Table 2. Empirically, when all S-boxes are active and linearized in the first round, it is more likely that a consistent equation system is obtained. Therefore, we heuristically set \(\frac{b}{5}\times 2\) as a threshold for the first term in (9), and denote a threshold of the first two terms in (9) for further search conditions by

$${\mathtt{TDF}} = \frac{b}{5}\times 2-(c+p).$$

To mount collision attacks against Keccak \([r,c,n_r,d]\) with methods described in Sect. 4, it is necessary that

$$\begin{aligned} {\mathtt{TDF}} >w_1+\cdots +w_{n_r-2}+w_{n_r-1}^d \end{aligned}$$
(10)

where \(w_{n_r-1}^d\) is the part of \(w_{n_r-1}\) that relates to the digest (Footnote 1). The trail searching phase is performed to provide \(\varDelta S_I\) for the connector building algorithm. However, whether a trail core is actually good enough depends on the result of solving the connector, i.e. the number of freedom degrees of the solution space of \(E_M\). So we take (10) as a heuristic condition for searching good trail cores which are promising for collision attacks.

Thirdly, the collision attack should be practical. Note that after we obtain a subspace of message pairs that is guaranteed to bypass the first two rounds, the complexity of searching for a collision is \(2^{w_2+\cdots +w_{n_r-1}^d}\). To make our attacks practical, we restrict \(w_2+\cdots +w_{n_r-1}^d\) to be small enough, say at most 48.

We summarize the requirements for differential trails as follows and list \({\mathtt{TDF}} \)s for different versions of Keccak \([r,c,n_r,d]\) in Table 3.

  (1) \(\alpha _{n_r}^d=0\), i.e. the difference of the output must be zero;

  (2) \({\mathtt{TDF}} >w_1+\cdots +w_{n_r-1}^d\), i.e. the degree of freedom must be sufficient;

  (3) \(w_2+\cdots +w_{n_r-1}^d\le 48\), i.e. the complexity for finding a collision should be low.

Table 3. \({\mathtt{TDF}} \)s of different versions of Keccak \([r,c,n_r,d]\).
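The threshold \({\mathtt{TDF}} = \frac{b}{5}\times 2-(c+p)\) and the three requirements can be evaluated mechanically. The sketch below uses hypothetical function names, takes \(p=1\) for Keccak and \(p=6\) for SHAKE as stated in Sect. 4.4, and uses made-up trail weights purely for illustration.

```python
def tdf(b, c, p):
    """Threshold on the freedom degrees: 2 * (b / 5) - (c + p)."""
    return 2 * (b // 5) - (c + p)


def check_requirements(b, c, p, weights, digest_diff_zero, bound=48):
    """Evaluate Requirements (1)-(3) for a candidate trail.

    weights          -- [w_1, ..., w_{n_r-2}, w^d_{n_r-1}]
    digest_diff_zero -- True if the digest difference is zero
    bound            -- practicality bound on w_2 + ... + w^d_{n_r-1}
    """
    req1 = bool(digest_diff_zero)
    req2 = tdf(b, c, p) > sum(weights)
    req3 = sum(weights[1:]) <= bound
    return req1, req2, req3


# Thresholds that follow from the formula for some instances discussed in the text.
tdf_keccak_224 = tdf(1600, 448, 1)
tdf_keccak_1440_160 = tdf(1600, 160, 1)
tdf_shake128 = tdf(1600, 256, 6)
tdf_keccak_640_160 = tdf(800, 160, 1)

# Hypothetical weights (w_1, w_2, w_3, w_4^d), not taken from Table 4.
example = check_requirements(1600, 160, 1, [120, 20, 15, 5], True)
```

With these inputs `tdf_keccak_224` evaluates to 191, matching the value quoted for \(\mathtt{{\textsc {Keccak}}}\)-224 in the next subsection.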

5.4 Searching Strategies

Searching From Light \(\varvec{\beta _3}\) ’s. Our initial goal is to find collisions for 5-round Keccak. To facilitate a 5-round collision of Keccak, we need to find 4-round differential trails satisfying the three requirements mentioned previously. However, it is difficult to meet all of them simultaneously, even though each of them can be fulfilled easily on its own.

We explain as follows. Since we aim for practical attacks, \(w_2+w_3+w_{4}^d\) must be small enough, say at most 48. That is to say, the last three rounds of the trail must be light and sparse. When we restrict a 3-round trail to be lightweight and extend it backwards for one round, we almost always get a heavy state \(\alpha _2\) (usually \(\#AS(\alpha _2)>120\)) whose weight may exceed the \({\mathtt{TDF}} \). We take \(\mathtt{{\textsc {Keccak}}}\)-224 as an example. The \({\mathtt{TDF}} \) of \(\mathtt{{\textsc {Keccak}}}\)-224 is 191, which indicates \(\#\mathrm {AS}(\alpha _2)<96\) as the least weight for an S-box is 2. A lightweight 3-round trail satisfies Requirement (1) only occasionally. The greater d is, the fewer trails satisfy Requirement (1).

With these requirements in mind, we search for 4-round differential trail cores from light middle state differences \(\beta _3\)’s. From light \(\beta _3\)’s we search forwards and backwards, and check whether Requirements (1) and (2) are satisfied, respectively; once these two requirements are satisfied, we compute the weight \(w_2+w_3+w_{4}^d\) for the brute-force stage, hoping it is small enough for practical attacks.

\(\alpha _3, \alpha _4\) in CP-kernel. The designers of Keccak show in [3] that it is not possible to construct 3-round low weight differential trails which stay in CP-kernel. However, 2-round differential trails in CP-kernel are possible, as studied in [9, 14, 19].

We restrict \(\alpha _3\) to the CP-kernel. If \(\rho ^{-1}\circ \pi ^{-1}(\beta _3)\) is outside the CP-kernel and sparse, say with 8 active bits, the number of active bits of \(\alpha _3=L^{-1}(\beta _3)\) will increase due to the strong diffusion of \(\theta ^{-1}\) and the sparseness of \(\beta _3\). When \(\#AS(\alpha _3)>10\), the complexity of searching backwards from one \(\beta _3\) is greater than \(2^{31.7}\), which is too time-consuming. It is also preferable to confine \(\alpha _4\) to the CP-kernel. If not, the requirement \(\alpha _{n_r}^d=0\) may not be satisfied. As can be seen from the lightest 3-round trail for \(\mathtt{{\textsc {Keccak}}}\)-f[1600] [14], even though the \(\theta \)-gap is only one, after \(\theta \) the difference bits are diffused across the state, making a 224-bit collision impossible (a 160-bit collision is still possible). So our starting point is special \(\beta _3\)’s which ensure that \(\alpha _3 = L^{-1}(\beta _3)\) lies in the CP-kernel, and for which there exists a compatible \(\alpha _4\) in the CP-kernel. Fortunately, such \(\beta _3\)’s can be obtained with KeccakTools [6].

Steps for Searching 4-Round Differential Trails. We sketch below our steps for finding 4-round differential trail cores for Keccak and provide a description in more detail in Appendix C. To mount collision attacks on 6-round Keccak, 5-round differential trail cores are needed. In this case, we just extend our forward extension for one more round.

  1. Using KeccakTools, find special \(\beta _3\)’s with a low Hamming weight, say 8.

  2. For every \(\beta _3\) obtained, traverse all possible \(\alpha _4\) using a tree structure, compute \(\beta _4=L(\alpha _4)\) and test whether there exists a compatible \(\alpha _5\) where \(\alpha _5^d=0\). If so, keep this \(\beta _3\) and record its forward extension, otherwise discard it.

  3. For the remaining \(\beta _3\)’s, also using a tree structure, traverse all possible \(\beta _2\)’s which are compatible with \(L^{-1}(\beta _3)\), and compute \(\#AS(\alpha _2)\) from \(\beta _2\). If \(\#AS(\alpha _2)\) is small enough, say below 110, check whether the trail core \((\beta _2,\beta _3,\beta _4)\) under consideration is sufficient for collision attacks.

Fig. 3. Collision attacks on 5-round Keccak.

5.5 Searching Results

Some of the best differential trail cores we obtained are listed in Table 4. As can be seen, Trail cores No. 1–3 are all suitable for collision attacks against Keccak [1440,160,5,160], and Trail cores No. 1 and 2 for SHAKE128. Trail core No. 4 is sufficiently good for collision attacks against Keccak [640, 160, 5, 160]. However, none of the first three trail cores is good enough to mount collision attacks on \(\mathtt{{\textsc {Keccak}}}\)-224. Fortunately, a doubled version of Trail core No. 4 can make our two-round attack possible because \(85\times 2=170 < 191\). For Keccak [1440, 160, 6, 160], we also find a trail core suitable for collision attacks except that Requirement (3) is not satisfied. Details of these differential trail cores are provided in Appendix D.

Table 4. Differential trail cores for Keccak \([r,c,n_r,d]\).

6 Experiments and Results

In this section, we employ 4-round (5-round) trail cores to mount collision attacks against 5-round (6-round) Keccak \([r,c,n_r,d]\). Our attack consists of two main stages:

  • Connecting stage. Find a subspace of messages bypassing the first two rounds.

  • Brute-force searching stage. Find a colliding pair from this subspace by brute force.

In the first stage, with \(\alpha _2\) fixed by the trail core, we choose a compatible \(\beta _1\) where \(\alpha _1=L^{-1}(\beta _1)\) and all the S-boxes in \(\alpha _1\) are active. In order to save freedom degrees, we also require that \(\beta _1\rightarrow \alpha _2\) be of least weight. When \(\beta _1\) is chosen, we run the two-round connector. If a certain number of failures is reached, we select another \(\beta _1\) until a solution is found, i.e. a subspace of message pairs that is guaranteed to reach \(\alpha _2\) is obtained. If the number of freedom degrees of this subspace is large enough, the first stage succeeds. Once the first stage succeeds, we move on to the second stage for finding a colliding message pair.

6.1 Collision Attack of 5-Round Keccak [1440, 160, 5, 160]

We apply Trail core No. 2 to the collision attack of 5-round Keccak \([1440,160,5,160]\). In this case, we choose compatible \(\beta _1\)s randomly. After solving the two-round problem in 9.6 s, the degree of freedom is 162, which is enough for collision search of the remaining 3 rounds with probability \(2^{-40}\). The searching time for the collision is 2.48 h. We give one example of collisions in Table 10, with which we solve a challenge of Keccak Crunchy Crypto Collision and Pre-image Contest [2].

6.2 Collision Attack of 5-Round  SHAKE128

We apply Trail core No. 1 to the collision attack of 5-round SHAKE128 (Footnote 2). As the capacity of SHAKE128 is much larger than that of Keccak[1440, 160, 5, 160], which means about 100 more freedom degrees are needed, we only choose compatible \(\beta _1\)s where \(\beta _1\rightarrow \alpha _2\) is of least weight. We also follow this rule in later collision attacks. After solving the two-round problem in 25 min, the degree of freedom is 94, and the search for a 3-round collision with probability \(2^{-39}\) costs half an hour. We give an instance of a collision in Table 11.

6.3 Collision Attack of 5-Round Keccak [640, 160, 5, 160]

We apply Trail core No. 4 to the collision attack of Keccak [640, 160, 5, 160]. The methods used in this case are similar to those of 5-round SHAKE128. The first stage succeeds in 30 min. The second stage takes 2 h 40 min to find a collision, which happens with probability \(2^{-35}\). An example of a collision is provided in Table 12, with which we solve another challenge of the Keccak Crunchy Crypto Collision and Pre-image Contest [2].

6.4 Collision Attack of 6-Round Keccak [1440, 160, 6, 160]

We found four trail cores for which there exist zero 160-bit output differences. The one with the best probability is Trail core No. 5, which is displayed in Table 9. From \(\beta _4\) there are 24 trails to zero \(\alpha _6^d\). Taking all these trails into consideration, we get a complexity between \(2^{67.24}\) and \(2^{70.24}\) for the second stage. If we let \(\#AS(\alpha _2)\) (\(w_2\)) be the smallest, the complexity for the second stage is \(2^{70.24}\) (\(2^{67.24}\)). In the experiments, we let \(\#AS(\alpha _2)\) be the smallest. In one hour our two-round algorithm returns a subspace of messages with freedom degree 135, and in 20 min we get a message pair shown in Table 13 that follows the first four rounds of the differential trail, which demonstrates that a collision for 6-round Keccak [1440, 160, 6, 160] can be found with great confidence in time complexity \(2^{70.24}\).

6.5 Collision Attack of 5-Round \(\mathtt{{\textsc {Keccak}}}\)-224

For the collision attack of 5-round \(\mathtt{{\textsc {Keccak}}}\)-224, none of the 4-round trail cores we found for Keccak-f[1600] is good enough, i.e. the weight of the trail cores exceeds \({\mathtt{TDF}} \) by too much and even \(w_2>{\mathtt{TDF}} \). However, our two-round connector is still likely to work. On the one hand, from Trail core No. 4 for \(\mathtt{{\textsc {Keccak}}}\)-f[800] we can construct a 4-round trail core for \(\mathtt{{\textsc {Keccak}}}\)-f[1600] with weight pattern (170-40-32-0), which makes our two-round connector possible. On the other hand, as the capacity increases, it is probable that the equations added in the connecting phase are not always mutually independent, which means the freedom degrees actually consumed by our connector may be fewer than estimated. The applicability of our connector in this case is verified with experiments. With Trail core No. 4, the two-round connector returns a subspace of messages of freedom degree 11; with Trail core No. 3, the freedom degree is 2 or 3. Since the message subspaces derived are too small to mount collision attacks against 5-round \(\mathtt{{\textsc {Keccak}}}\)-224, we turn to two-block messages. Once we get c bits from the first block, we set the corresponding c bit constants in \(E_M\) to the value we obtained and then solve the system to find a subspace of messages for the second block. The attack now proceeds in the following way.

  • Connecting stage.

    • Use the two-round connector to find a message subspace with freedom degree s as large as possible, hoping that \(t=(c+p)+{\text {rank}}(E_M \setminus E_{(c,p)})-{\text {rank}}(E_M)\) is as small as possible.

  • Brute-force searching stage.

    • Choose the first message randomly and compute the c-bit value for the second block. Replace the corresponding c bit constants in \(E_M\) and check whether it is still consistent. If it is consistent, we obtain another subspace with size \(2^s\).

    • Search for collision with the subspace.

    • Repeat until we find a two-block collision.

In our experiment, using Trail core No. 3 the connector returns a message subspace with freedom degree \(s=2\), and \(t=55\). The complexity of finding a two-block collision is then \(2^{55+(48-2)}=2^{101}\).

6.6 Re-launch 4-Round Collision Attacks of \(\mathtt{{\textsc {Keccak}}}\)-224 and \(\mathtt{{\textsc {Keccak}}}\)-256

Though the 4-round collisions of \(\mathtt{{\textsc {Keccak}}}\)-224 and \(\mathtt{{\textsc {Keccak}}}\)-256 have already been found [10], we use our method to optimize the complexity. We start from the same 2-round differential trail as in Dinur et al.’s work [10] and build a two-round connector. The time spent on building and solving the two-round connectors is 2 min 15 s for \(\mathtt{{\textsc {Keccak}}}\)-224 and 7 min for \(\mathtt{{\textsc {Keccak}}}\)-256. The complexity of the brute-force search is then reduced to \(2^{12}\), costing 0.325 s and 0.28 s respectively, which improves on the \(2^{24}\) on-line complexity in [10]. Besides, it is pointed out in [10] that even though they got subsets with more than \(2^{30}\) message pairs from their target difference algorithm, they were not able to find collisions within some of these subsets. The reason was suspected to be the incomplete diffusion within the first two rounds and the closely related message pairs within a subset. With our algorithm, we did not encounter such a problem. In other words, we always find collisions in the subsets deduced from the two-round connector. Thus, once we succeed in the 2-round connector building phase with a large enough subset, we never need to repeat it.

7 Conclusion

In conclusion, we observed that the Keccak S-box can be re-expressed as linear transformations when its input is restricted to certain subspaces. With this property, we linearized all S-boxes of the first round, and extended the existing connector by one round. Implementations confirmed our idea and produced real collision examples for 5-round SHAKE128 and two instances of the Keccak challenges. Theoretical results on 5-round \(\mathtt{{\textsc {Keccak}}}\)-224 and a 6-round Keccak challenge version are also given.

Note that the algorithm for solving the two-round connectors is heuristic; further work includes finding theoretical bounds for this algorithm and identifying the factors that determine its complexity, for possible improvements. Moreover, any relaxation of the restrictions on \(\varDelta S_I\) might lead to better differential trails in the searching phase.