Abstract
The ultimate goal of designing a symmetric-key cryptographic primitive often can be formulated as an optimization problem. So far, these problems mainly have been solved with trivial algorithms such as brute force or random search. We show that a more advanced and equally versatile class of search algorithms, called metaheuristics, can help to tackle optimization problems related to design of symmetric-key primitives. We use two nature-inspired metaheuristics, simulated annealing and genetic algorithm, to optimize in terms of security the components of two recent cryptographic designs, SKINNY and AES-round based constructions. The positive outputs of the optimization suggest that metaheuristics are non-trivial tools, well suited for automatic design of primitives.
1 Introduction
In the past several years we have seen a major development of computer tools for automatic analysis of symmetric-key primitives. The tools cover a wide range of analysis techniques: differential [5,6,7, 16, 18, 19, 21, 30,31,32,33, 38,39,40], linear [17, 30], impossible differential [12, 15, 26, 29, 36, 43], meet-in-the-middle [8, 14, 15], etc. Among other applications, the tools can serve as a proof of security of new designs because they can provide resistance of new designs against most (sometimes all) of the known cryptographic attacks.
Advanced computer tools for design of symmetric-key primitives, however, have not been considered. Instead, most of the design problems have been solved either analytically or with trivial computer algorithms such as brute force and random search. Consider, for example, the problem of tweaking AES to make it more resistant against meet-in-the-middle attacks. With automatic tools for analysis against meet-in-the-middle attacks we can check the security margin of each tweaked version of AES. If we tweak only ShiftRows, then we can brute force the space of all tweaks, check each tweak with the above automatic tools, and find the one that provides the highest security. On the other hand, if we decide to tweak both ShiftRows and MixColumns, then the space of tweaks may be too large for a brute force, and thus we will use a random search. That is, we will check only a subset of randomly chosen tweaks and find the best among them. These two simple algorithms have been basically the only available computer tools to designers.
Assume the goal is to create a tool for automatic design of symmetric-key primitives that is: (1) based on more advanced search methods, and is (2) versatile, can tackle a variety of design problems. Note, brute force and random search do not satisfy the first point because they are trivial, but do satisfy the second point because they can be applied to many design problems. In a nutshell, these two algorithms are the simplest optimization methods. Therefore, to build a better design tool we need to focus on the next class of known optimization algorithms, that is also universal, but more sophisticated. That is the class of metaheuristics.
A metaheuristic is a search algorithm used to find a sufficiently good solution to an optimization problem. It makes almost no assumptions about the optimized (objective) function and it performs equally well when the function is not explicitly defined, but it can be queried. The search strategy implemented in a metaheuristic is often based on some nature-inspired method or technique – metaheuristics are named according to their nature equivalent, for instance, particle swarm optimization, simulated annealing, ant colony optimisation, evolutionary computation, etc. In cryptography, metaheuristics have been used mainly to design Sboxes that satisfy special criteria, such as resistance against cryptographic attacks [1, 11, 34, 42, 44].
Our Contributions. Arguably, the design decision behind any part of a symmetric-key cryptographic primitive is driven by the goal of optimization (in terms of security, size, throughput, etc.). Therefore, we regard the problem of design purely as an optimization problem. The computer algorithms that solve the optimization problem we call tools for automatic design.
Our tools are based on metaheuristics. These search algorithms are sufficiently universal to solve most of the design optimization problems. We use two nature-inspired metaheuristics: simulated annealing and genetic algorithm. We introduce the metaheuristics in Sect. 2; for each of them we point out the main idea, provide a description in pseudo-code, and give a list of the most important parameters. In Sect. 3, we apply the metaheuristics to solve two concrete design optimization problems. To do so, we first identify the optimization problem, then formally define it (describe the objective function and its input space), and finally use metaheuristics to find good solutions. Our two problems are related to finding new components in the recently proposed block cipher SKINNY [3] and the AES-round based constructions [23]. We choose these two primitives because they best demonstrate the effectiveness of metaheuristics. Both SKINNY and the AES-round constructions are designed with clear optimization goals and, considering their excellent performance, achieve these goals. Nonetheless, metaheuristics allow even further optimization. We show that simulated annealing and genetic algorithm can be used to find specific components in the two primitives that result in even higher performance according to criteria that were considered important by the designers. More precisely, we use the metaheuristics to find for SKINNY a permutation in the tweakey schedule that leads to a higher resistance against related-tweakey attacks, and for the AES-round constructions a round transformation that results in better security against internal collisions.
To summarize, our main objective and contribution is to provide an empirical proof that, due to their simplicity and versatility, metaheuristics are perhaps the most effective tools for automatic design of symmetric-key primitives.
2 Metaheuristics
Consider a simple optimization problem: find optimum (maximum or minimum) of an objective function \(f(x):D\rightarrow R\). If f(x) is given as a blackbox, i.e. it can be queried but is not explicitly defined, then mathematical and standard computer science methods for solving optimization problems cannot be applied because they require full definition of f(x). In addition, if the domain D is discrete and has a large size, then the optimization problem cannot be solved by a brute force in practical time.
To cope with this type of problem, we use metaheuristics. They are approximate algorithms – the solution they provide is not guaranteed to be optimal (although some have a theoretical proof of asymptotic convergence). However, metaheuristics output the solution by using only limited computational resources, i.e. they are practical algorithms. Hence, among other applications, metaheuristics are well suited for the search of near-optimal solutions to optimization problems where the (blackbox) objective function is expensive.
There are various classifications of metaheuristics. According to the search strategy, they are divided into local search (try to find only the local optimum) and global search (global optimum). For instance, one of the most popular metaheuristics is the hill climbing method, which tries to find only the local optimum. Another classification is single vs population-based search. A single metaheuristic works with only one candidate solution at a time, while a population-based one works simultaneously with multiple candidates. Hill climbing, simulated annealing and iterated local search are examples of single search, while genetic algorithm and ant colony optimization are examples of population-based search.
The efficiency of metaheuristics is tested experimentally by comparing the time complexities (measured in calls to f(x)) the metaheuristics require to solve some well-known problems. Depending on the problems, the comparative efficiency of two metaheuristics can vary, i.e. for some problems the first may be better, while for others the second. Therefore, the term “best metaheuristic” is meaningless. Testing the efficiency of a metaheuristic is not trivial because each is associated with a set of parameters. A metaheuristic needs fine tuning of its parameters for each problem – this can be a very long and tedious process, but it can have a major impact on its efficiency. For each metaheuristic, there is a recommended set of values for its parameters; however, they were deduced empirically from its previous applications and thus provide no guarantee of optimality.
Further we will use two metaheuristics: simulated annealing and genetic algorithm. The choice is not accidental – both of them have been reported as among the best performing on a wide variety of problems in the single-based and population-based categories, respectively. In the sequel we give a minimal description of the two metaheuristics, which we believe is sufficient to understand our ideas that follow. An interested reader can find more details about the metaheuristics, for instance, in [37, 41].
2.1 Simulated Annealing
Simulated annealing [9, 27] is a single-based, global search metaheuristic. It is a nature-inspired algorithm that mimics a physical process occurring in chemical substances: heating, followed by cooling and crystallizing.
Given an objective function f(x), simulated annealing tries to find its maximum (Footnote 1) by iteratively improving the potential solution. That is, starting from some random \(x_0\), it builds \(x_1,x_2,\ldots \). At iteration i, the value of \(x_i\) is produced from the previous value \(x_{i-1}\), with the goal of further maximizing the function f(x), i.e. \(f(x_i) \ge f(x_{i-1})\). The main idea of simulated annealing is to allow probabilistic degradation of solutions, i.e. sometimes it accepts \(x_i\) even if \(f(x_i) < f(x_{i-1})\). However, the probability of acceptance varies: in the early stages (when i is smaller) it accepts more degrading solutions, while later it accepts fewer. Such a strategy allows the algorithm to explore a greater variety of solutions at the beginning, including degrading ones, and to focus only on local optimization later. Note, the degrading solutions allow the algorithm to escape local optima.
A formal description of simulated annealing is given in Algorithm 1. It takes as inputs three parameters: initial temperature T, cooling schedule function \(\alpha (T)\), and neighbour function \(\epsilon (x)\). In the initialization, it assigns a random value as the best solution x to the maximization problem of f(x). Then it keeps trying to build a better solution by iterating the same procedure: from x generate a new candidate \(x'\), and if it meets a certain criterion, accept it as the new solution x. The function \(\epsilon (x)\) generates \(x'\) from x by slightly changing (Footnote 2) the value of x. If \(x'\) is a better solution than x, then x is updated to \(x'\). However, if \(x'\) is worse, then it is not immediately rejected. Rather, it can be accepted, but only with some probability. The probability of acceptance (expressed with \(r< e^{\frac{f(x')-f(x)}{T}}\), where r is a uniformly random number in [0, 1] and \(f(x') - f(x)\) is negative) is higher when the temperature T is higher and when the value of the objective function on the new candidate \(x'\) is closer to the value on the old candidate x. The iterations are stopped once the termination criterion is met. The criterion can be set differently: through the number of iterations, the value of the temperature, etc.
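The description above can be sketched in Python as follows. This is only a minimal sketch and not a reproduction of Algorithm 1: the function and parameter names are ours, remembering the best visited solution is an added assumption, and the toy objective is hypothetical.

```python
import math
import random


def simulated_annealing(f, x0, neighbour, cool, T0, max_calls, rng):
    """Maximize f starting from x0; return the best solution seen and its value."""
    x, fx = x0, f(x0)
    best, f_best = x, fx            # assumption: remember the best visited solution
    T = T0
    for _ in range(max_calls - 1):  # one call to f was already spent on x0
        x_new = neighbour(x, rng)
        f_new = f(x_new)
        # Improvements are always accepted; a worse candidate is accepted
        # with probability exp((f(x') - f(x)) / T), which shrinks as T cools.
        if f_new >= fx or rng.random() < math.exp((f_new - fx) / T):
            x, fx = x_new, f_new
            if fx > f_best:
                best, f_best = x, fx
        T = cool(T)
    return best, f_best


# Toy usage (hypothetical objective): maximize -(x - 7)^2 over the integers 0..100.
rng = random.Random(1)
toy_f = lambda x: -(x - 7) ** 2
toy_neighbour = lambda x, rng: min(100, max(0, x + rng.choice((-1, 1))))
toy_cool = lambda T: T / (1 + 0.003 * T)  # inverse cooling with beta = 0.003
best, f_best = simulated_annealing(toy_f, rng.randrange(101), toy_neighbour,
                                   toy_cool, T0=1.5, max_calls=5000, rng=rng)
```

The termination criterion here is a budget of calls to f, which is one of the options mentioned above; a time-based or temperature-based criterion would only change the loop condition.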
Parameters. As mentioned earlier, the main objective when choosing the values of the parameters is to optimize the efficiency of the metaheuristic so that it can produce a solution close to the global maximum in the shortest possible time. Simulated annealing requires the following parameters:
- Neighbour function: \(\epsilon (x)\) should return \(x'\) that is in the neighbourhood of x, i.e. \(\Vert x - \epsilon (x)\Vert \) should be small. For instance, if x is a vector then we can define \(\epsilon (x)\) as a vector that coincides with x on all coordinates except one. Note, if \(\Vert x - \epsilon (x)\Vert \) is large (or unlimited), then simulated annealing turns into a plain random search. Refer to Appendix B for more discussion on neighbour functions.
- Cooling schedule: \(\alpha (T)\) should be a monotonic (strictly decreasing) function. There are several choices for \(\alpha (T)\): linear, exponential, inverse, logarithmic and other cooling schedules. We will use inverse cooling, defined as \(\alpha (T) = \frac{T}{1 + \beta T}\), where \(\beta \) is a small constant, usually of order 0.001. We choose inverse cooling because it outperformed other cooling schedules in our preliminary experiments.
- Initial temperature: if \(T_0\) is high then simulated annealing will explore more possibilities; however, it will require more time to converge to a near-optimal solution. Conversely, a lower initial temperature finds some solution faster, but that solution may be further from optimal. The value of \(T_0\) should be chosen depending on the values of \(\epsilon (x)\) and \(\alpha (T)\) as well as the allowed time complexity, in order to balance the possibility of exploring more solutions with the maximal allowed time.
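To see how the inverse cooling schedule and the initial temperature interact, it may help to unroll the recurrence \(T_{k+1} = \alpha (T_k)\). The following short derivation is ours (the index k counts iterations, and the numeric values are only illustrative picks from the parameter ranges used later in the text):

\[
T_{k+1} = \frac{T_k}{1 + \beta T_k}
\;\Longrightarrow\;
\frac{1}{T_{k+1}} = \frac{1}{T_k} + \beta
\;\Longrightarrow\;
T_k = \frac{T_0}{1 + k \beta T_0}.
\]

For example, with \(T_0 = 1.5\) and \(\beta = 0.003\), after \(k = 1000\) iterations the temperature is \(T_{1000} = \frac{1.5}{1 + 4.5} \approx 0.27\). A candidate that is worse by one unit of f is then accepted with probability about \(e^{-1/0.27} \approx 0.03\), compared to \(e^{-1/1.5} \approx 0.51\) at the start.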
2.2 Genetic Algorithm
Genetic algorithm [22] is a population-based, global search metaheuristic. It belongs to the larger family of evolutionary algorithms which simulate natural selection to solve optimization problems.
To find the maximum of an objective function (called a fitness function), genetic algorithm works in iterations (called generations). At each iteration it tries to improve a set of solutions, rather than a single solution. This set is called a population of individuals. To produce a new population from an old population, i.e. to change the generation, genetic algorithm uses two operations: mutation and crossover. A mutation is applied to one individual and it consists of slightly changing it. On the other hand, crossover is a synonym for reproduction. It takes two individuals (called parents) and produces two new individuals (called children) (Footnote 3). The choice of parents is controlled by a so-called selection function. The selection function is biased towards individuals with better fitness (higher value of the fitness function). This is done to mimic the natural selection of parents – those with better qualities (genes) have a higher chance of reproduction. A formal description of genetic algorithm is given in Algorithm 2.
Parameters. Genetic algorithm uses a wide range of parameters:
- Population size N: the number of individuals. The recommended value of N is in the range \([\log |D|, 2\log |D|]\), where |D| is the size of the search space.
- Selection function: the most popular types of selections are roulette-wheel (individuals’ probabilities of being selected as parents are proportional to their fitness values), tournament (several individuals are first randomly selected and then, in tournament-like fashion, the winner is chosen according to its fitness value), rank (the individuals are sorted according to their fitness value, and their positions – called ranks – are used to determine their probability of selection), and stochastic (several individuals are simultaneously selected as parents according to their probability distributions). More detailed descriptions of the selection functions are given in Appendix B.
- Crossover function: it produces children that share similarities with the parents. For instance, if the two parents are given as vectors (the coordinates of these vectors are called genes), then the corresponding coordinates of the vectors of their children will have values either of the first or of the second parent (Footnote 4). The crossover function decides how the children inherit the parents’ genes. We will use a uniform crossover function, i.e. each gene of the children (each coordinate of the vector) has an equal probability to come from either of the two parents, and this probability is independent of the previous genes.
- Mutation probability and function: within one generation, the mutation is applied only to a small number of individuals defined by the mutation probability. A recommended value for this probability is in the range [0.001, 0.01], i.e. only around 0.1–1% of the individuals are mutated. The mutation function defines how an individual is changed – it slightly alters the genes of an individual.
- Elitism: usually the best individuals within each generation are kept, that is, at the end of a generation, a certain percentage of the best parents progress to the next generation (are copied to children). This is called elitism (from elite). A recommended elitism is in the range [0.05, 0.2], i.e. 5–20% of the parents with best fitness progress to the next generation.
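The parameters above can be put together in a short Python sketch. It is not a reproduction of Algorithm 2: the names are ours, tournament selection is picked only because it is the shortest to write, and the toy fitness (counting ones in a bit vector) is hypothetical. Uniform crossover is applied to plain vectors here; for permutations it would need the adjustment alluded to in the footnote on crossover.

```python
import random


def genetic_algorithm(fitness, random_individual, mutate, pop_size, generations,
                      mutation_prob=0.01, elitism=0.2, tournament_size=3, rng=None):
    """Maximize fitness; return the best individual of the last generation and its fitness."""
    rng = rng or random.Random()
    population = [random_individual(rng) for _ in range(pop_size)]
    n_elite = max(1, int(elitism * pop_size))

    def select(scored):
        # Tournament selection: the fittest of a few random individuals wins.
        contenders = rng.sample(scored, tournament_size)
        return max(contenders, key=lambda pair: pair[0])[1]

    for _ in range(generations):
        # No caching of fitness values, for brevity; an expensive objective would need it.
        scored = sorted(((fitness(ind), ind) for ind in population),
                        key=lambda pair: pair[0], reverse=True)
        children = [ind for _, ind in scored[:n_elite]]  # elitism: copy the best parents
        while len(children) < pop_size:
            p1, p2 = select(scored), select(scored)
            c1, c2 = [], []
            for g1, g2 in zip(p1, p2):                   # uniform crossover
                if rng.random() < 0.5:
                    g1, g2 = g2, g1
                c1.append(g1)
                c2.append(g2)
            for child in (c1, c2):
                if rng.random() < mutation_prob:
                    child = mutate(child, rng)
                if len(children) < pop_size:
                    children.append(child)
        population = children
    best_fit, best = max(((fitness(ind), ind) for ind in population),
                         key=lambda pair: pair[0])
    return best, best_fit


def flip_one_bit(individual, rng):
    """Mutation for bit vectors: flip one randomly chosen gene."""
    i = rng.randrange(len(individual))
    mutated = list(individual)
    mutated[i] ^= 1
    return mutated


# Toy usage (hypothetical fitness): maximize the number of ones in a 20-bit vector.
rng = random.Random(2)
best, best_fit = genetic_algorithm(
    fitness=sum,
    random_individual=lambda r: [r.randrange(2) for _ in range(20)],
    mutate=flip_one_bit, pop_size=30, generations=40, rng=rng)
```

Because the elite parents are copied unchanged, the best fitness in the population never decreases from one generation to the next.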
3 Applications
Usually the objective of a new cryptographic primitive is to provide at least one better functionality than all known designs. This functionality can vary and may include better throughput, smaller footprint, higher security, etc. Regardless of the chosen functionality, the goal of designers essentially can be seen as an optimization problem.
The optimization of cryptographic designs may or may not be solved with the use of metaheuristics. If the optimization problem is too general or the objective function is not clearly stated, then metaheuristics cannot solve the problem. For instance, trying to tweak somehow the round function of AES to maximize its resistance against impossible differential attacks does not formulate a good objective function. On the other hand, trying to tweak the MixColumns matrix by changing its coefficients provides a clear objective function: the input to the function is some MixColumns matrix, and the output is the security level against impossible differential attacks (Footnote 5). Some optimization problems can be solved better (faster or with higher precision) with methods other than metaheuristics, such as heuristics or even brute force. For instance, trying to tweak the ShiftRows constants in AES to maximize its resistance against impossible differential attacks can be solved simply by brute force, as the number of all possible variants is small.
From the above discussion it follows that we can use metaheuristics to design or improve symmetric-key primitives when:
1. The optimization goal can be quantified (the objective function is clearly stated and can be computed on arbitrary inputs),
2. The search space is relatively large and cannot be covered by a brute force,
3. The solution does not necessarily have to be globally/locally optimal (recall, metaheuristics may or may not return an optimal solution in feasible time).
Further we give two examples of good optimization goals that can be tackled with metaheuristics. They are related to improving the security margin (Footnote 6) of SKINNY [3] and of the AES-round based constructions from [23]. These two primitives are ideal candidates for testing the effectiveness of metaheuristics because they are recent designs, have strong emphasis on optimization of components, and have clear optimization goals. Note, we have also considered the use of metaheuristics on a few other recent designs; however, for various reasons we omit the details of their applications. For instance, the potential optimization of the functions Simpira v2 [20] and Haraka [28] can be solved with a brute force, therefore the optimization does not satisfy the above second requirement, and hence metaheuristics are not the first choice. On the other hand, optimizing a component in the authenticated encryption scheme Deoxys [25] can be done with metaheuristics; however, the problem is too similar to the problem of SKINNY analyzed below, and thus we omit it.
3.1 SKINNY
SKINNY [3] is a family of block ciphers proposed at CRYPTO’16. Its goal is to be on par with NSA cipher SIMON [2] in hardware and software, while providing higher security. The ciphers are tweakable, i.e. besides a key and a plaintext, they have a third input called a tweak. The tweaking is based on a framework [24] that treats the key and the tweak universally, as a single input called tweakey. SKINNY ciphers have state sizes \(n=64\) or \(n=128\) bits, regarded as 4\(\,\times \,\)4 matrices of nibbles. On the other hand, the tweakey sizes t are multiples of the state size n, and have three versions: \(t=n, t=2n, t=3n\).
SKINNY are iterative substitution-permutation ciphers. In Fig. 1 we give one round of the ciphers when \(t=3n\). A state round consists of five familiar transformations: SubCells is an Sbox layer, AddConstants xors constants, AddRoundTweakey xors the two top rows of each tweakey word to the two top rows of the state, ShiftRows rotates the nibbles of the state rows, and MixColumns multiplies the state columns by some matrix. In the tweakey schedule, the three tweakey words \(TK_1, TK_2\), and \(TK_3\) undergo two transformations: state-wise nibble permutation \(P_T\) which is the same for all the tweakeys, and nibble-wise linear transformations \(l_i\).
To be competitive in hardware and software, SKINNY ciphers have been highly optimized. Most of the transformations used in the ciphers have above average performance according to some design criteria and have been found as a result of some heuristic or a computer search. According to the extended version of the submission document [4], the nibble permutation \(P_T\) used in the tweakey schedule “has been chosen to maximize the bounds on the number of active Sboxes ... in the related-tweakey model”. The search method used to find \(P_T\) is not specified.
With the use of metaheuristics, we will further optimize \(P_T\). Note, the optimization problem has already been well formulated: find \(P_T\) to maximize the number of active Sboxes in the best related-tweakey characteristic. To find this number for a particular choice of \(P_T\), as suggested by the designers we use an automatic tool based on integer linear programming (ILP). Therefore, ILP can be seen as the objective function f, which takes as input a permutation \(P_T\) and returns the number of active Sboxes. Hence our problem becomes
\[\max _{P_T} f(P_T),\]
where \(P_T\) is a permutation of 16 elements with an additional constraint: the first eight elements can be sent only to the last eight positions, and vice versa. In fact, besides this constraint, the designers of SKINNY have imposed two additional ones: (1) \(P_T\) must consist of a single cycle, and (2) it sends the first 8 elements to the last 8 positions. In our search, we will relax these two constraints. This increases the search space from slightly under 8! possible choices of \(P_T\) to \((8!)^2\). Hence we will operate in a space that cannot be covered by a brute force and that has candidate permutations that may lead to ciphers with higher security margins. However, as we relax constraint (2), our permutations may require higher implementation cost in certain environments. Hence, our search for \(P_T\) should be seen in general as a tradeoff between possibly higher security and lower speed.
Before we apply the metaheuristics, let us clarify a few points. First, SKINNY has several versions and we will focus on SKINNY-64-192, which is the 64-bit version with three tweakeys (\(n=64, t=3n=192\)), i.e. on the lightweight version which gives the most freedom to the attacker. Other versions can be processed similarly: moving from 64-bit to 128-bit will require more computational power (Footnote 7), while reducing the number of tweakey words from three to two or one will require less power (Footnote 8). Second, the best characteristics do not necessarily have to be found for the full cipher. Rather, once the number of active Sboxes in a round-reduced characteristic reaches some threshold, the cipher is already considered secure. In SKINNY-64-192 this number (Footnote 9) is 33, and according to [4], for the original choice of \(P_T\) it is reached after 18 rounds. We will try to achieve 33 active Sboxes earlier, in 16 rounds (Footnote 10). Therefore, our objective function \(f(P_T)\) is defined as the number of active Sboxes in the best characteristics on 16 rounds.
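For a sense of scale, the two search-space sizes mentioned above can be evaluated explicitly (the arithmetic below is ours):

\[
8! = 40320 \approx 2^{15.3}, \qquad (8!)^2 = 1\,625\,702\,400 \approx 2^{30.6}.
\]

The relaxed space is thus larger by a factor of 8! = 40320, and every point in it costs one run of the ILP-based objective function.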
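Let us clarify the above points. First note that the original permutation \(P_T^o\) of SKINNY is defined as
\[P_T^o = \begin{pmatrix} 0 & 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 & 9 & 10 & 11 & 12 & 13 & 14 & 15 \\ 9 & 15 & 8 & 13 & 10 & 14 & 12 & 11 & 0 & 1 & 2 & 3 & 4 & 5 & 6 & 7 \end{pmatrix}.\]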
According to the designers, and confirmed with our own ILP tool, \(f(P_T^o) = 27\). We are looking for another permutation \(P_T\)
\[P_T = \begin{pmatrix} 0 & 1 & \ldots & 15 \\ a_0 & a_1 & \ldots & a_{15} \end{pmatrix}\]
such that \(f(P_T)\) is as large as possible. Note, there is an additional condition, which requires that \(a_i \ge 8\) for \(i=0,1,\ldots ,7\) and \(a_i < 8\) for \(i=8,9,\ldots ,15\), which assures that the first 8 elements are sent to the last 8 positions, and vice versa.
Let us focus on simulated annealing. To solve the optimization problem with this metaheuristic, we first need to specify the three parameters: \(\epsilon (P_T), \alpha (T), T_0\). As a neighbour function \(\epsilon (P_T)\) we use a random transposition. That is, we randomly choose two indices in \(P_T\) and switch the values of the elements at these indices. Note, however, the choice of indices cannot be completely random because \(\epsilon (P_T)\) must fulfil the additional condition. Hence, to properly implement \(\epsilon (P_T)\), we first choose the half of \(P_T\) where the transposition will occur, and only then the two random elements that belong to the same half. For the cooling schedule \(\alpha (T)\), as mentioned before, we use inverse cooling \(\alpha (T) = \frac{T}{1 + \beta T}\) and experiment with values of \(\beta \) in the range [0.001, 0.003]. Finally, as an initial temperature \(T_0\) we take values in the range [1, 2]. Our termination criterion is time-based, i.e. we stop the search and output the best found solution after running the metaheuristic for around a day on an 8-core processor.
Further, let us focus on genetic algorithm and the parameters used. In all of our implementations, the population size is 50. To test the effectiveness and impact of different selection functions, we used all four of them. In addition, we use a mutation rate of 0.01 and a mutation function that closely resembles \(\epsilon (P_T)\) from simulated annealing (i.e. mutation consists of one random transposition). Finally, we use elitism with a 20% rate. The termination criterion is similar to that of simulated annealing, but we allow more time.
The results of the optimization with the two metaheuristics are as follows. Both simulated annealing and genetic algorithm were able to find permutations \(P_T\) such that \(f(P_T)=33\). Simulated annealing performed similarly on different choices of parameters \(\beta \) and \(T_0\), i.e. we did not detect any significant difference. On average, it required around 1000 calls to the objective function f(x) to find a permutation \(P_T\) such that \(f(P_T)=33\). On the other hand, genetic algorithm performed better for some choices of selection functions. To find \(P_T\) such that \(f(P_T)=33\), on average over three trials, it required 950 calls with stochastic selection, 1380 calls (Footnote 11) with rank selection, 2250 calls with roulette-wheel, and 5900 calls with tournament selection. Therefore, we can conclude that simulated annealing and genetic algorithm with stochastic or rank selection performed similarly.
Table 1 shows examples of permutations \(P_T^{SA}, P_T^{GA}\) found with simulated annealing and genetic algorithm such that \(f(P_T^{SA}) = f(P_T^{GA}) = 33\). For performance measurements, the table also gives the number of active Sboxes of the best characteristics not only for 16 rounds, but for the whole range of 14 to 18 rounds. Evidently, the two new permutations result in higher numbers of active Sboxes in comparison to the original permutation of SKINNY.
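A neighbour function of this kind, together with inverse cooling, might look as follows in Python. This is a sketch under assumptions: the function names are ours, the permutation is stored as the list \(a_0,\ldots ,a_{15}\), and the constant `SKINNY_PT` is the tweakey permutation as we read it from the SKINNY specification, used here only as a starting point. The ILP-based objective function is not included.

```python
import random

# Assumption: tweakey permutation of SKINNY, written as the list a_0, ..., a_15.
SKINNY_PT = [9, 15, 8, 13, 10, 14, 12, 11, 0, 1, 2, 3, 4, 5, 6, 7]


def is_valid(pt):
    """Check that pt is a permutation of 0..15 with a_i >= 8 for i < 8 and a_i < 8 for i >= 8."""
    return (sorted(pt) == list(range(16))
            and all(a >= 8 for a in pt[:8])
            and all(a < 8 for a in pt[8:]))


def neighbour(pt, rng=random):
    """Random transposition inside one half, so the half-to-half condition is preserved."""
    half = rng.choice((0, 8))                    # first choose the half ...
    i, j = rng.sample(range(half, half + 8), 2)  # ... then two distinct indices in it
    new_pt = list(pt)
    new_pt[i], new_pt[j] = new_pt[j], new_pt[i]
    return new_pt


def inverse_cooling(T, beta=0.001):
    """One step of the inverse cooling schedule alpha(T) = T / (1 + beta * T)."""
    return T / (1 + beta * T)
```

Swapping only within a half keeps every intermediate candidate inside the \((8!)^2\) search space, so no objective-function call is wasted on an invalid permutation.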
We conclude this subsection with a discussion on further use of metaheuristics in SKINNY. One potential direction would be to optimize the resistance against related-tweakey attacks with respect to both \(P_T\) and AddRoundTweakey, i.e. by changing the permutation \(P_T\) and by identifying which 8 nibbles of the tweakey words should be xored to the state (instead of the 8 nibbles of the first two rows).
3.2 AES-round Based Constructions [23]
Software optimized designs based on the AES round function are presented in [23]. The main objective of the authors of this paper is to provide symmetric-key constructions (as building blocks of MACs and authenticated encryption schemes) that are efficient on the latest Intel processors (Footnote 12). The authors show seven constructions that run at only a few tenths of a cycle per byte on Intel processors Ivy Bridge, Haswell and Skylake.
The proposed constructions have a state composed of s words of 128 bits. The state is transformed by a round function given in Fig. 2, where A stands for one AES round. Besides A, the only remaining operation is the xor (of message words \(M_{i_j}\) and state words \(X_{i_j}\)). Each construction is characterized by a parameter called a rate \(\rho \) which is defined as the number of AES rounds required to process a 128-bit message word. That is, \(\rho \) is the ratio of the number of calls to A to the number of different message words (in one round). The lower the rate, the faster the design, hence the goal of the authors has been to reduce the rate as much as possible.
A construction is considered secure if it is free of so-called internal collisions, which are a special type of differential characteristics: they start and end in zero differences in the state (Footnote 13). A construction should provide 128-bit security, that is, differential characteristics that lead to internal collisions must not have a probability higher than \(2^{-128}\). To find the best characteristic and its probability, the authors reduce the problem to counting active Sboxes and use the aforementioned integer linear programming tool to get the lower bound on this number. The security level of 128 bits corresponds to at least 22 active Sboxes (Footnote 14) in the best characteristics.
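The threshold of 22 active Sboxes can be checked with a short calculation. We assume here the well-known maximal differential probability of \(2^{-6}\) for the AES Sbox, so that a characteristic with k active Sboxes has probability at most \(2^{-6k}\):

\[
2^{-6 \cdot 21} = 2^{-126} > 2^{-128}, \qquad 2^{-6 \cdot 22} = 2^{-132} \le 2^{-128}.
\]

Hence 21 active Sboxes are not enough to rule out a characteristic of probability above \(2^{-128}\), while 22 are.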
The seven proposed constructions have different number of state words (7 to 12) and different rates (in the range [2, 3]). For a particular choice of state size and rate, the authors use some heuristic (which has not been explained in the paper) to search the space of all constructions as defined in Fig. 2 and consider only those which are resistant against internal collisions, i.e. their best characteristics have at least 22 active Sboxes. Constructions that have the lowest probability of internal collisions (i.e. the highest number of active Sboxes) are considered the best.
Further we use metaheuristics to optimize the constructions according to the design criteria of [23]. The optimization problem is clear: for a particular choice of state size s and rate \(\rho \), find a round function as in Fig. 2 that defines a construction whose best differential characteristic that leads to internal collisions has maximal number of active Sboxes. Once again, the role of objective function f is played by the integer linear program that returns a lower bound on the number of active Sboxes. To understand what the input to the objective function is, let us focus on Fig. 2. Note, there are three types of red (optional) transformations in the round function. First, each of the s calls to A are optional. Therefore, we can use s-bit vector \(aes\_masks\) to describe a particular configuration of the calls to A, where i-th bit of \(aes\_masks\) is set iff in the round function, A is applied to the \(X_i\). Second, all s feedforwards of words (the red vertical lines) can also be described with an s-bit vector \(feed\_masks\). Finally, the xors of message words \(M_{i_j}\) can be described with vector messages of s coordinates, each an integer value in the range [0, w], where w is the total number of message words in a round. A value of 0 denotes that no message words is xored, while any positive integer value corresponds to the index of the message word being xored. As a result, each potential construction can be described with the three vectors: \(aes\_masks, feed\_masks,\) and messages. However, note that not all combinations are possible because the values of \(aes\_masks\) and messages cannot be arbitrary. Rather, they must agree with the rate \(\rho \). For instance, if \(\rho =2\) and the Hamming weight of \(aes\_masks\) is 6, then the vector messages can contain the values 1,2 and 3, and it must have each of these values at least once. This assures that the rate of the constructions is indeed 2. 
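The encoding and the rate condition described above can be illustrated with a small Python check. The function name and the list-based representation are our assumptions, and the example vectors are made up for illustration; they are not constructions from [23].

```python
from fractions import Fraction


def agrees_with_rate(aes_masks, feed_masks, messages, rho):
    """Return True if the three vectors describe a construction of rate rho.

    aes_masks, feed_masks: lists of s bits; messages: list of s integers in [0, w],
    where 0 means that no message word is xored into that state word.
    rho should be an int or a Fraction, so that the comparison is exact.
    """
    s = len(aes_masks)
    if not (len(feed_masks) == len(messages) == s):
        return False
    w = max(messages, default=0)       # number of message words per round
    if w == 0:
        return False                   # no message word is absorbed at all
    # Each message word 1..w must be used at least once.
    if set(messages) - {0} != set(range(1, w + 1)):
        return False
    # Rate = (number of calls to A) / (number of different message words).
    return Fraction(sum(aes_masks), w) == Fraction(rho)


# Example from the text: rho = 2 and Hamming weight 6 require the values 1, 2 and 3.
aes = [1, 1, 1, 1, 1, 1, 0]
feed = [0, 1, 0, 0, 1, 0, 0]
ok = agrees_with_rate(aes, feed, [1, 0, 2, 0, 3, 0, 1], 2)
too_few_words = agrees_with_rate(aes, feed, [1, 0, 2, 0, 0, 0, 1], 2)
```

Such a check is what a neighbour or mutation function would call after flipping bits, rejecting or repairing candidates that no longer agree with the rate.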
Let us assume further that the tuples \((aes\_masks, feed\_masks,messages)\) agree with the predefined rate \(\rho \). Then our optimization problem for fixed state size s and rate \(\rho \) can be defined as:
\[ \max _{(aes\_masks,\, feed\_masks,\, messages)} f(aes\_masks, feed\_masks, messages). \]
We optimize six of the seven constructions proposed in [23]. We omit one, with rate \(\rho =2\) and size \(s=12\), because its objective function is too expensive to evaluate: it took us half a day to compute f on a single input.
To solve the optimization problems, we run simulated annealing and the genetic algorithm with the following parameters. In simulated annealing, the neighbour function \(\epsilon (x)\) consists of flipping 1–2 bits in some (or all) of the three vectors \(aes\_masks, feed\_masks,messages\) (with an additional check on the rate \(\rho \)). Furthermore, we use inverse cooling \(\alpha (T)\) with \(\beta =0.003\) and initial temperature \(T_0 = 1.5\). In the genetic algorithm, the population size is 30, combined with a stochastic selection function, uniform crossover, a mutation rate of 0.01, a mutation function based on random bit flips, and 20% elitism. The termination criterion for both metaheuristics is based on the number of calls to the objective function: either 500 calls (for smaller search spaces) or 700 calls (for larger ones).
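To make the role of these parameters concrete, the following Python sketch shows a simulated annealing loop with a budget on the number of calls to the objective function. It is a sketch under stated assumptions rather than the implementation used for the experiments: the inverse cooling schedule is assumed to have the common form \(\alpha (T) = T/(1+\beta T)\), the acceptance rule is the standard Metropolis criterion for maximization, and the toy objective (the number of ones in a bit vector) merely stands in for the ILP-based f.

```python
import math
import random


def inverse_cooling(T, beta=0.003):
    # Assumed form of the inverse cooling schedule alpha(T).
    return T / (1.0 + beta * T)


def simulated_annealing(f, neighbour, x0, T0=1.5, beta=0.003,
                        max_calls=500, rng=None):
    """Maximize f starting from x0, stopping after max_calls evaluations."""
    rng = rng or random.Random()
    x, fx = x0, f(x0)
    best, fbest = x, fx
    calls, T = 1, T0
    while calls < max_calls:
        y = neighbour(x, rng)
        fy = f(y)
        calls += 1
        # Always accept improvements; accept a worse candidate with
        # probability exp((fy - fx) / T), which shrinks as T cools down.
        if fy >= fx or rng.random() < math.exp((fy - fx) / T):
            x, fx = y, fy
            if fx > fbest:
                best, fbest = x, fx
        T = inverse_cooling(T, beta)
    return best, fbest, calls


def flip_bits(x, rng):
    # Toy neighbour function: flip 1 or 2 randomly chosen bits.
    y = list(x)
    for i in rng.sample(range(len(y)), rng.choice((1, 2))):
        y[i] ^= 1
    return y


# Toy run: the objective is the Hamming weight of a 12-bit vector.
best, score, calls = simulated_annealing(sum, flip_bits, [0] * 12,
                                         rng=random.Random(1))
```

Counting calls to f, rather than iterations or wall-clock time, matches the termination criterion described above, since evaluating f dominates the running time.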
The outputs of the metaheuristics are given in Table 4. For all six constructions, both metaheuristics were able to find better candidates, with an increase of 13%–44% in the number of active Sboxes in comparison to the original proposals from [23]. Simulated annealing performed slightly better than the genetic algorithm: within a limited number of calls to the objective function, it managed to find constructions with a higher security margin. We suspect this is due to the termination criterion, as the genetic algorithm requires more generations to find better solutions (Table 2).
Finally, we note that we have also run the metaheuristics to find constructions that compete with the published ones [23] not only in terms of security, but in terms of efficiency too. We refer the reader to Appendix C for more details.
4 Conclusion
Metaheuristics are widely used algorithms for finding solutions to optimization problems. The design of symmetric-key primitives can be seen as one such problem, thus metaheuristics can be used to find better designs. Therefore, metaheuristics can serve as tools for automatic design of symmetric-key primitives. Unlike brute force and random search, metaheuristics are non-trivial tools which should be scrutinized in the absence of better heuristics or of other more advanced search methods.
We used two metaheuristics, simulated annealing and genetic algorithm, to optimize designs with respect to security. Our choice of metaheuristics was guided by their popularity and reported success: both of them are considered among the best performing on well-known problems. As an optimization parameter we chose security, because that led to well-defined and computable objective functions (see Note 15). We wrote the implementations of the two metaheuristics in C; they were straightforward to code. It took us several thousand CPU hours to test for a good set of parameters and to find approximate solutions for the design optimization problems in SKINNY and the AES-round constructions. The outputs were positive: the metaheuristics were able to find better components for both of the primitives, sometimes improving the optimized component by more than 40%. Thus we can conclude that metaheuristics can serve as effective tools for automatic design of symmetric-key primitives.
Future research may focus on expanding the area of application and the variety of metaheuristics. This includes formulating other design problems as optimization problems and subsequently using the proposed metaheuristics to solve them. We stress that the optimization problems do not necessarily have to be related to an increase in security, but may target better throughput, smaller size, etc. Furthermore, using metaheuristics other than simulated annealing and genetic algorithms may also improve design methods for crypto primitives. Some more advanced metaheuristics, such as the multi-objective genetic algorithm NSGA-II [13], may well excel in solving design problems related to multidimensional optimization, i.e. optimization by several criteria.
Notes
1. Finding the minimum can be achieved similarly, with minor changes.
2. For instance, when x is a vector, \(\epsilon (x)\) returns another vector in some predefined \(\epsilon \) environment of x.
3. Variations of crossover operators exist where two parents can produce only a single child or more than two children.
4. In some cases this is not possible, so some of the genes will be random.
5. Assuming that one can compute the security level against impossible differential attacks with a tweaked MixColumns matrix.
6. However, we remind the reader that this is not necessarily the only use of metaheuristics: they can be applied to optimize designs with respect to throughput, size, etc.
7. The 128-bit version of SKINNY uses 8-bit Sboxes that have the same maximal differential propagation probability of \(2^{-2}\) as the 4-bit Sboxes used in the 64-bit version. Therefore, to achieve 128-bit security (rather than 64-bit security) the number of active Sboxes in the best characteristic has to be much larger, which in turn results in a higher-complexity search.
8. For results on these versions refer to Appendix A.
9. The number is defined by the state size and the probability of the best differential transition of the Sbox. The state of SKINNY is 64 bits, and the highest probability of a differential transition in the 4-bit Sbox is \(2^{-2}\), thus if the number of active Sboxes is \(1 + \lfloor \frac{64}{2} \rfloor = 33\), the cipher is resistant against related-tweakey differential attacks.
10. The number of rounds cannot be predicted a priori. We focus on 16 rounds, but if we do not succeed we can always compare either how many active Sboxes we have reached on 16 rounds, or whether we have reached 33 active Sboxes on 17 rounds.
11. This number (1380) does not necessarily have to be divisible by the population size (50). The reason is twofold: (1) we halt the search once a sufficiently good construction is found, without updating the whole population, and (2) we use elitism, which dictates that at each generation only \(50\cdot (1-elitism)\) individuals are updated.
12. These processors have a special instruction set called AES-NI, which can execute the AES round function as a single instruction.
13. The difference is introduced and later cancelled through the message words.
14. The differential propagation probability of an Sbox in AES is \(2^{-6}\), thus 128-bit security means \(\lfloor \frac{128}{6} \rfloor + 1 = 22\) active Sboxes.
15. The objective function is well defined because the security criterion is characterized by a single parameter. On the other hand, it is computable, because there are various tools, such as those based on ILP, that can produce the output for an arbitrary input.
References
Ahmad, M., Bhatia, D., Hassan, Y.: A novel ant colony optimization based scheme for substitution box design. Procedia Comput. Sci. 57, 572–580 (2015)
Beaulieu, R., Shors, D., Smith, J., Treatman-Clark, S., Weeks, B., Wingers, L.: The SIMON and SPECK families of lightweight block ciphers. Cryptology ePrint Archive, Report 2013/404 (2013). http://eprint.iacr.org/2013/404
Beierle, C., Jean, J., Kölbl, S., Leander, G., Moradi, A., Peyrin, T., Sasaki, Y., Sasdrich, P., Sim, S.M.: The SKINNY family of block ciphers and its low-latency variant MANTIS. In: Robshaw, M., Katz, J. (eds.) CRYPTO 2016. LNCS, vol. 9815, pp. 123–153. Springer, Heidelberg (2016). https://doi.org/10.1007/978-3-662-53008-5_5
Beierle, C., Jean, J., Kölbl, S., Leander, G., Moradi, A., Peyrin, T., Sasaki, Y., Sasdrich, P., Sim, S.M.: The SKINNY family of block ciphers and its low-latency variant MANTIS. Cryptology ePrint Archive, Report 2016/660 (2016). http://eprint.iacr.org/2016/660
Biryukov, A., Nikolić, I.: Automatic search for related-key differential characteristics in byte-oriented block ciphers: application to AES, Camellia, Khazad and others. In: Gilbert, H. (ed.) EUROCRYPT 2010. LNCS, vol. 6110, pp. 322–344. Springer, Heidelberg (2010). https://doi.org/10.1007/978-3-642-13190-5_17
Biryukov, A., Nikolić, I.: Search for related-key differential characteristics in DES-Like ciphers. In: Joux, A. (ed.) FSE 2011. LNCS, vol. 6733, pp. 18–34. Springer, Heidelberg (2011). https://doi.org/10.1007/978-3-642-21702-9_2
Biryukov, A., Velichkov, V.: Automatic search for differential trails in ARX ciphers. In: Benaloh, J. (ed.) CT-RSA 2014. LNCS, vol. 8366, pp. 227–250. Springer, Cham (2014). https://doi.org/10.1007/978-3-319-04852-9_12
Bouillaguet, C., Derbez, P., Fouque, P.: Automatic search of attacks on round-reduced AES and applications. IACR Cryptol. ePrint Arch. 2012, 69 (2012)
Černý, V.: Thermodynamical approach to the traveling salesman problem: an efficient simulation algorithm. J. Optim. Theory Appl. 45(1), 41–51 (1985)
Cheon, J.H., Takagi, T. (eds.): ASIACRYPT 2016. LNCS, vol. 10031. Springer, Heidelberg (2016). https://doi.org/10.1007/978-3-662-53887-6
Clark, J.A., Jacob, J.L., Stepney, S.: The design of S-boxes by simulated annealing. In: Congress on Evolutionary Computation, CEC2004, vol. 2, pp. 1533–1537. IEEE (2004)
Cui, T., Jia, K., Fu, K., Chen, S., Wang, M.: New automatic search tool for impossible differentials and zero-correlation linear approximations. IACR Cryptol. ePrint Arch. 2016, 689 (2016)
Deb, K., Agrawal, S., Pratap, A., Meyarivan, T.: A fast and elitist multiobjective genetic algorithm: NSGA-II. IEEE Trans. Evol. Comput. 6(2), 182–197 (2002)
Derbez, P., Fouque, P.-A.: Exhausting Demirci-Selçuk meet-in-the-middle attacks against reduced-round AES. In: Moriai, S. (ed.) FSE 2013. LNCS, vol. 8424, pp. 541–560. Springer, Heidelberg (2014). https://doi.org/10.1007/978-3-662-43933-3_28
Derbez, P., Fouque, P.-A.: Automatic search of meet-in-the-middle and impossible differential attacks. In: Robshaw, M., Katz, J. (eds.) CRYPTO 2016. LNCS, vol. 9815, pp. 157–184. Springer, Heidelberg (2016). https://doi.org/10.1007/978-3-662-53008-5_6
Dinu, D., Perrin, L., Udovenko, A., Velichkov, V., Großschädl, J., Biryukov, A.: Design strategies for ARX with provable bounds: Sparx and LAX. In: Cheon, J.H., Takagi, T. (eds.) ASIACRYPT 2016. LNCS, vol. 10031, pp. 484–513. Springer, Heidelberg (2016). https://doi.org/10.1007/978-3-662-53887-6_18
Dobraunig, C., Eichlseder, M., Mendel, F.: Heuristic tool for linear cryptanalysis with applications to CAESAR candidates. In: Iwata, T., Cheon, J.H. (eds.) ASIACRYPT 2015. LNCS, vol. 9453, pp. 490–509. Springer, Heidelberg (2015). https://doi.org/10.1007/978-3-662-48800-3_20
Emami, S., Ling, S., Nikolić, I., Pieprzyk, J., Wang, H.: The resistance of PRESENT-80 against related-key differential attacks. Crypt. Commun. 6(3), 171–187 (2014)
Fouque, P.-A., Jean, J., Peyrin, T.: Structural evaluation of AES and chosen-key distinguisher of 9-round AES-128. In: Canetti, R., Garay, J.A. (eds.) CRYPTO 2013. LNCS, vol. 8042, pp. 183–203. Springer, Heidelberg (2013). https://doi.org/10.1007/978-3-642-40041-4_11
Gueron, S., Mouha, N.: Simpira v2: a family of efficient permutations using the AES round function. In: Cheon, J.H., Takagi, T. (eds.) ASIACRYPT 2016. LNCS, vol. 10031, pp. 95–125. Springer, Heidelberg (2016). https://doi.org/10.1007/978-3-662-53887-6_4
Gerault, D., Lafourcade, P., Minier, M., Solnon, C.: Revisiting AES related-key differential attacks with constraint programming. Cryptology ePrint Archive, Report 2017/139 (2017). http://eprint.iacr.org/2017/139
Holland, J.H.: Adaptation in Natural and Artificial Systems: An Introductory Analysis with Applications to Biology, Control, and Artificial Intelligence. MIT Press, Cambridge (1992)
Jean, J., Nikolić, I.: Efficient design strategies based on the AES round function. In: Peyrin, T. (ed.) FSE 2016. LNCS, vol. 9783, pp. 334–353. Springer, Heidelberg (2016). https://doi.org/10.1007/978-3-662-52993-5_17
Jean, J., Nikolić, I., Peyrin, T.: Tweaks and keys for block ciphers: the TWEAKEY framework. In: Sarkar, P., Iwata, T. (eds.) ASIACRYPT 2014. LNCS, vol. 8874, pp. 274–288. Springer, Heidelberg (2014). https://doi.org/10.1007/978-3-662-45608-8_15
Jean, J., Nikolić, I., Peyrin, T., Seurin, Y.: Deoxys v1.4. Submitted to CAESAR (2016)
Kim, J., Hong, S., Sung, J., Lee, S., Lim, J., Sung, S.: Impossible differential cryptanalysis for block cipher structures. In: Johansson, T., Maitra, S. (eds.) INDOCRYPT 2003. LNCS, vol. 2904, pp. 82–96. Springer, Heidelberg (2003). https://doi.org/10.1007/978-3-540-24582-7_6
Kirkpatrick, S., Gelatt, C.D., Vecchi, M.P.: Optimization by simulated annealing. Science 220(4598), 671–680 (1983)
Kölbl, S., Lauridsen, M.M., Mendel, F., Rechberger, C.: Haraka v2 - efficient short-input hashing for post-quantum applications. IACR Trans. Symmetric Cryptol. 2016(2), 1–29 (2016)
Luo, Y., Lai, X., Wu, Z., Gong, G.: A unified method for finding impossible differentials of block cipher structures. Inf. Sci. 263, 211–220 (2014)
Matsui, M.: On correlation between the order of S-boxes and the strength of DES. In: De Santis, A. (ed.) EUROCRYPT 1994. LNCS, vol. 950, pp. 366–375. Springer, Heidelberg (1995). https://doi.org/10.1007/BFb0053451
Moriai, S., Sugita, M., Aoki, K., Kanda, M.: Security of E2 against truncated differential cryptanalysis. In: Heys, H., Adams, C. (eds.) SAC 1999. LNCS, vol. 1758, pp. 106–117. Springer, Heidelberg (2000). https://doi.org/10.1007/3-540-46513-8_8
Mouha, N., Wang, Q., Gu, D., Preneel, B.: Differential and linear cryptanalysis using mixed-integer linear programming. In: Wu, C.-K., Yung, M., Lin, D. (eds.) Inscrypt 2011. LNCS, vol. 7537, pp. 57–76. Springer, Heidelberg (2012). https://doi.org/10.1007/978-3-642-34704-7_5
Nikolić, I.: Tweaking AES. In: Biryukov, A., Gong, G., Stinson, D.R. (eds.) SAC 2010. LNCS, vol. 6544, pp. 198–210. Springer, Heidelberg (2011). https://doi.org/10.1007/978-3-642-19574-7_14
Picek, S., Yang, B., Rozic, V., Mentens, N.: On the construction of hardware-friendly 4x4 and 5x5 S-boxes. Lecture Notes in Computer Science (2016)
Robshaw, M., Katz, J. (eds.): CRYPTO 2016. LNCS, vol. 9815. Springer, Heidelberg (2016). https://doi.org/10.1007/978-3-662-53008-5
Sasaki, Y., Todo, Y.: New impossible differential search tool from design and cryptanalysis aspects. In: Coron, J.-S., Nielsen, J.B. (eds.) EUROCRYPT 2017. LNCS, vol. 10212, pp. 185–215. Springer, Cham (2017). https://doi.org/10.1007/978-3-319-56617-7_7
Simon, D.: Evolutionary Optimization Algorithms. Wiley, Hoboken (2013)
Sun, S., Gerault, D., Lafourcade, P., Yang, Q., Todo, Y., Qiao, K., Hu, L.: Analysis of AES, SKINNY, and others with constraint programming. IACR Trans. Symmetric Cryptol. 2017(1), 281–306 (2017)
Sun, S., Hu, L., Qiao, K., Ma, X., Shan, J., Song, L.: Improvement on the method for automatic differential analysis and its application to two lightweight block ciphers DESL and LBlock-s. In: Tanaka, K., Suga, Y. (eds.) IWSEC 2015. LNCS, vol. 9241, pp. 97–111. Springer, Cham (2015). https://doi.org/10.1007/978-3-319-22425-1_7
Sun, S., Hu, L., Wang, P., Qiao, K., Ma, X., Song, L.: Automatic security evaluation and (related-key) differential characteristic search: application to SIMON, PRESENT, LBlock, DES(L) and other bit-oriented block ciphers. In: Sarkar, P., Iwata, T. (eds.) ASIACRYPT 2014. LNCS, vol. 8873, pp. 158–178. Springer, Heidelberg (2014). https://doi.org/10.1007/978-3-662-45611-8_9
Talbi, E.-G.: Metaheuristics: from design to implementation, vol. 74. Wiley, Hoboken (2009)
Tesar, P.: A new method for generating high non-linearity S-boxes. Radioengineering (2010)
Wu, S., Wang, M.: Automatic search of truncated impossible differentials for word-oriented block ciphers. In: Galbraith, S., Nandi, M. (eds.) INDOCRYPT 2012. LNCS, vol. 7668, pp. 283–302. Springer, Heidelberg (2012). https://doi.org/10.1007/978-3-642-34931-7_17
Yang, M., Wang, Z., Meng, Q., Han, L.: Evolutionary design of S-box with cryptographic properties. In: 2011 Ninth IEEE International Symposium on Parallel and Distributed Processing with Applications Workshops (ISPAW), pp. 12–15. IEEE (2011)
Acknowledgments
The author would like to thank the anonymous reviewers of ASIACRYPT’17 for their constructive comments and Yu Sasaki for helping to finalize the paper. This work is supported by the Ministry of Education, Singapore under Grant No. R-252-000-560-112.
Appendices
A Applications to SKINNY-64-64 and SKINNY-64-128
In addition to the full search given in Sect. 3.1 on SKINNY-64-192 (i.e. in \(TK_3\)), we have also run a search for \(P_T\) in SKINNY-64-64 and SKINNY-64-128, i.e. in \(TK_1\) and in \(TK_2\). The search criterion for \(P_T\) was identical to that in Sect. 3.1. Using simulated annealing only, we have looked for \(P_T\) in three different related-tweakey differential models:
1. Find \(P_T\) for SKINNY-64-64 secure in \(TK_1\). The search returned several permutations, each resulting in a cipher that has at least 33 active Sboxes in any 11-round related-tweakey differential characteristic.
2. Find \(P_T\) for SKINNY-64-128 secure in \(TK_2\). Similarly, we found several permutations with at least 34 active Sboxes in any 14-round characteristic.
3. Find \(P_T\) simultaneously for SKINNY-64-64 secure in \(TK_1\) and for SKINNY-64-128 secure in \(TK_2\). We found a few permutations \(P_T\) that simultaneously provide security in both \(TK_1\) and \(TK_2\). Interestingly, the corresponding characteristics have 33 active Sboxes on 11 rounds in \(TK_1\) and 34 active Sboxes on 14 rounds in \(TK_2\). In other words, any of these permutations can be used as an optimal candidate in scenarios (1) and (2).
Examples of permutations found with the search are given in Table 3.
B Specification and Implementation Details of the Metaheuristics
Selection Functions. We work with four types of selection functions:
- Roulette-wheel selection is also called fitness-proportionate selection. The selection probability of each individual is proportional to its fitness value. In Fig. 3, we assume the population is composed of four individuals with fitness measures of 40, 30, 20, 10. Then, the roulette-wheel selection dictates that the first individual has \(\frac{40}{100}\) probability of being selected as a parent, the second \(\frac{30}{100}\), etc. To select a single parent, we “run the roulette”, i.e. choose a number uniformly at random in the range [0, 100), and select the individual on whose slice the “ball” has landed, e.g. if the number is anywhere in the range [0, 40) then it is individual 1, if in the range [40, 70) then it is individual 2, etc.
- Stochastic selection is similar to roulette-wheel, but it increases the chance of high-fitness individuals becoming parents. In stochastic selection the parents are selected in bulk. For example, in Fig. 4 we show how to use the wheel to select four parents at once. We run the roulette once, i.e. select a random number in the range [0, 100), and, as in roulette-wheel, choose the corresponding individual as the first parent. Then, the remaining three parents are the ones that correspond to the other three uniformly spaced numbers (taken modulo 100). That is, if the ball has landed on 23, then we assume that it has also landed on \(23 + \frac{100}{4} = 48\), \(23 + 2\cdot \frac{100}{4}=73\), and \(23 + 3\cdot \frac{100}{4} = 98\).
- Rank selection is also similar to roulette-wheel. However, instead of using the fitness to determine the portion of the wheel, the individuals’ rank is used. That is, all individuals within a population are sorted according to their fitness in ascending order, and their position is taken as a fitness measure in roulette-wheel fashion. For instance, individuals with fitnesses of 1, 5, 20, 8 will, after sorting, be at positions 1, 2, 4, 3, and thus have probabilities \(\frac{1}{10},\frac{2}{10},\frac{4}{10},\frac{3}{10}\) of being selected as parents.
- Tournament selection depends on the tournament size, and we use the most common size of 2. That is, to select a parent, we choose two individuals uniformly at random, compare their fitness, and choose the one with the higher fitness.
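The stochastic, rank and tournament variants described in the list above can be sketched in a few lines of Python. This is an illustrative sketch only: all function names are hypothetical, individuals are identified by their 0-based index, and the starting point of the stochastic selection is passed explicitly so that the example with the ball landing on 23 can be reproduced.

```python
import random
from bisect import bisect_right
from itertools import accumulate


def wheel_index(cum, r):
    """Index of the wheel slice [cum[i-1], cum[i]) that contains r."""
    return min(bisect_right(cum, r), len(cum) - 1)


def stochastic_selection(fitness, k, start):
    """Select k parents at once from k uniformly spaced pointers."""
    cum = list(accumulate(fitness))
    total = cum[-1]
    step = total / k
    # Pointers wrap around the wheel (modulo the total fitness).
    return [wheel_index(cum, (start + j * step) % total) for j in range(k)]


def rank_weights(fitness):
    """Replace each fitness by its 1-based position in ascending order."""
    order = sorted(range(len(fitness)), key=lambda i: fitness[i])
    ranks = [0] * len(fitness)
    for position, i in enumerate(order, start=1):
        ranks[i] = position
    return ranks


def tournament_selection(fitness, rng):
    """Tournament of size 2: draw two individuals, keep the fitter one."""
    i = rng.randrange(len(fitness))
    j = rng.randrange(len(fitness))
    return i if fitness[i] >= fitness[j] else j


# Ball lands on 23: pointers 23, 48, 73, 98 on the wheel 40/30/20/10.
parents = stochastic_selection([40, 30, 20, 10], 4, 23)
ranks = rank_weights([1, 5, 20, 8])
winner = tournament_selection([1, 5, 20, 8], random.Random(0))
```

The weights returned by `rank_weights` can be fed to any roulette-wheel routine in place of the raw fitness values, which is how rank selection is described above.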
Pseudo code of a full selection procedure (based on roulette-wheel) used by the genetic algorithm for search of \(P_T\) in SKINNY is given below.
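As an illustration of such a procedure, a minimal roulette-wheel selection can be sketched in Python as follows. The sketch is an assumption about the general shape of the routine, not a reproduction of the procedure used in the experiments: the function name and its parameters are hypothetical, and it only covers drawing a given number of parents from a population with known fitness values.

```python
import random
from bisect import bisect_right
from itertools import accumulate


def roulette_wheel_select(population, fitness, n_parents, rng):
    """Draw n_parents individuals, each with probability proportional
    to its fitness (fitness values must be non-negative, not all zero)."""
    cum = list(accumulate(fitness))      # right ends of the wheel slices
    total = cum[-1]
    parents = []
    for _ in range(n_parents):
        r = rng.random() * total         # "run the roulette" in [0, total)
        idx = min(bisect_right(cum, r), len(population) - 1)
        parents.append(population[idx])
    return parents


# With a single non-zero fitness the outcome is forced.
forced = roulette_wheel_select(["a", "b", "c", "d"], [0, 0, 5, 0], 6,
                               random.Random(7))
```

The cumulative sums play the role of the slice boundaries on the wheel, so a binary search replaces the linear scan over slices.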
Neighbour Functions \(\epsilon (x)\). Intuitively, the task of a neighbour function is to produce a value in the neighbourhood of x. Thus neighbour functions output values (or vectors) that are very similar to the input.
In SKINNY we define \(\epsilon (P_T)\) as a swap of two elements. That is, we randomly choose two positions, and then exchange the elements in these two positions. Since there is an additional requirement on the form of \(P_T\), the swap can occur only between two elements that both belong to the same half. Further we give a simple pseudo-code that accomplishes this.
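A minimal Python sketch of such a swap is given here. It assumes that \(P_T\) is stored as a list of even length, that "half" refers to the first and the second half of the positions, and that the function name `neighbour_pt` is free to choose; the identity permutation on 16 cells is used purely as a placeholder input.

```python
import random


def neighbour_pt(pt, rng):
    """Return a copy of pt with two elements of the same half swapped."""
    half = len(pt) // 2
    base = rng.choice((0, half))           # pick the first or second half
    i, j = rng.sample(range(base, base + half), 2)  # two distinct positions
    q = list(pt)
    q[i], q[j] = q[j], q[i]
    return q


start = list(range(16))                    # placeholder permutation
candidate = neighbour_pt(start, random.Random(5))
```

Because both positions lie in the same half, the set of values held by each half is unchanged, so a permutation that satisfies the structural requirement keeps satisfying it after the swap.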
In the AES-round based construction search, the solution is composed of three vectors, thus \(\epsilon (x)\) can be defined as a composition of three separate neighbour functions, one for each of the vectors. Pseudo-code of such a function for the vector \(aes\_masks\) is given below.
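One way to realize such a neighbour function for \(aes\_masks\) is sketched in Python here. The sketch assumes that 1 or 2 randomly chosen bits are flipped and that candidates violating the rate are discarded and redrawn; the validity check is supplied by the caller as a hypothetical callback `is_valid`, and the retry limit `max_tries` is likewise an illustrative parameter.

```python
import random


def neighbour_aes_masks(aes_masks, rng, is_valid=lambda m: True,
                        max_tries=100):
    """Flip 1 or 2 random bits of aes_masks; retry until is_valid accepts.

    Falls back to an unchanged copy if no valid neighbour is found
    within max_tries attempts.
    """
    s = len(aes_masks)
    for _ in range(max_tries):
        cand = list(aes_masks)
        for i in rng.sample(range(s), rng.choice((1, 2))):
            cand[i] ^= 1
        if is_valid(cand):
            return cand
    return list(aes_masks)


# Example: keep the Hamming weight at 6 so that the rate stays unchanged
# while the vector messages is held fixed.
mask = [1, 1, 1, 1, 1, 1, 0]
new_mask = neighbour_aes_masks(mask, random.Random(2),
                               is_valid=lambda m: sum(m) == 6)
```

When the vector messages is kept fixed, only flips that preserve the Hamming weight of \(aes\_masks\) can pass the rate check, which is why the example callback compares the weight with 6.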
C Efficient AES-Based Constructions
The only goal in the search for AES-based constructions presented in Sect. 3.2 was to improve the security in comparison to the already published constructions in [23], without affecting their efficiency. Here we focus on efficiency instead, i.e. our aim below is to improve the efficiency of the constructions, while still maintaining a sufficient security level of at least 22 active Sboxes.
A construction has better efficiency if it has smaller state size, better rate, or both. Constructions that have smaller state size but worse rate, or vice versa, are not considered to be more efficient. To search for efficient constructions, once again we use the two metaheuristics. The formulations of the optimization problems are identical to the formulations given in Sect. 3.2, i.e. we still optimize with respect to the security and with fixed state size and rate. However, once a metaheuristic identifies a construction as optimal, we compare its efficiency to all of the constructions given in [23]. We report in Table 4 the constructions we found to be more efficient than some of the previously known constructions.
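The efficiency comparison described above amounts to a dominance check on the pair (state size, rate), which the following Python sketch makes explicit. It assumes that a smaller rate is the better one and that each construction is represented as a `(state_size, rate)` tuple; the function names and the numeric pairs in the example are hypothetical and are not taken from Table 4 or from [23].

```python
def more_efficient(a, b):
    """True if construction a is more efficient than construction b.

    a and b are (state_size, rate) pairs; a must be no worse in both
    coordinates and strictly better in at least one of them.
    """
    (size_a, rate_a), (size_b, rate_b) = a, b
    return (size_a <= size_b and rate_a <= rate_b
            and (size_a < size_b or rate_a < rate_b))


def beats_some(candidate, published):
    """Published constructions that the candidate improves upon."""
    return [p for p in published if more_efficient(candidate, p)]


# Hypothetical (state_size, rate) pairs, for illustration only.
published = [(8, 2), (7, 3), (12, 2)]
improved = beats_some((7, 2), published)
```

A candidate with a smaller state but a worse rate than some published construction is not reported against it, which matches the rule that such trade-offs do not count as higher efficiency.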
© 2017 International Association for Cryptologic Research
Nikolić, I. (2017). How to Use Metaheuristics for Design of Symmetric-Key Primitives. In: Takagi, T., Peyrin, T. (eds) Advances in Cryptology – ASIACRYPT 2017. ASIACRYPT 2017. Lecture Notes in Computer Science(), vol 10626. Springer, Cham. https://doi.org/10.1007/978-3-319-70700-6_13
DOI: https://doi.org/10.1007/978-3-319-70700-6_13
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-70699-3
Online ISBN: 978-3-319-70700-6