
Differential Cryptanalysis

In document A Salad of Block Ciphers (Page 51-95)

$\cdots\; (x_i \;\; x_{i+1}) \pmod{256}\,.$

Hence, the T layer is represented by the $8 \times 8$ block diagonal matrix $A$ with the matrix $\begin{pmatrix} 2 & 1 \\ 1 & 1 \end{pmatrix}$ on the diagonal four times. The permutation of the bundles is called a shuffle. It is the permutation $(0\;2\;4\;6\;1\;3\;5\;7)$ and corresponds to the matrix

$$B = \begin{pmatrix}
1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 1 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 1 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 1 & 0 \\
0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 1 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 1
\end{pmatrix} \pmod{256}\,.$$

Hence, the $R$-matrix representing the entire diffusion layer of SAFER is

$$M = A \cdot B \cdot A \cdot B \cdot A = \begin{pmatrix}
8 & 4 & 4 & 2 & 4 & 2 & 2 & 1 \\
4 & 2 & 2 & 1 & 4 & 2 & 2 & 1 \\
4 & 4 & 2 & 2 & 2 & 2 & 1 & 1 \\
2 & 2 & 1 & 1 & 2 & 2 & 1 & 1 \\
4 & 2 & 4 & 2 & 2 & 1 & 2 & 1 \\
2 & 1 & 2 & 1 & 2 & 1 & 2 & 1 \\
2 & 2 & 2 & 2 & 1 & 1 & 1 & 1 \\
1 & 1 & 1 & 1 & 1 & 1 & 1 & 1
\end{pmatrix} \pmod{256}\,.$$

Let us now consider the effect of this matrix on the state, interpreted as a vector of length eight over the ring $R$. Since the operator defined by $M$ is linear, in order to determine how differences in the input propagate it suffices to consider the differences as relative to the zero vector, i.e. to study the images of individual vectors. Most vectors $v$ of weight one are mapped to vectors $v \cdot M$ of weight eight, but not all: for instance, $v = (32\;0\;0\;0\;0\;0\;0\;0)^t$ is mapped to a vector $v \cdot M = (0\;128\;128\;64\;128\;64\;64\;32)^t$ of weight seven, and the image of $v' = (128\;0\;0\;0\;0\;0\;0\;0)^t$ has weight just one: $v' \cdot M = (0\;0\;0\;0\;0\;0\;0\;128)^t$. Intuitively, this means that some inputs or some differences do not diffuse well through the layer. Catastrophic consequences are in fact avoided only by the fact that SAFER uses good S-boxes and the fact that the single bundle difference on the eighth vector element will completely diffuse in the following round.
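The product $A \cdot B \cdot A \cdot B \cdot A$ and the two images quoted above can be recomputed with a few lines of Python. This is only a sketch: the helper names (`mat_mul`, `weight`, `image`) are ours, and vectors are treated as rows multiplied from the left, as in the text.

```python
# Sketch over Z/256 with row vectors (v -> v*M); all helper names are illustrative.
PHT = [[2, 1], [1, 1]]                # the 2x2 block placed on the diagonal of A
SHUFFLE = [0, 2, 4, 6, 1, 3, 5, 7]    # row r of B has its single 1 in column SHUFFLE[r]

def mat_mul(x, y, mod=256):
    """Product of two matrices (lists of rows), entries reduced modulo `mod`."""
    return [[sum(a * b for a, b in zip(row, col)) % mod for col in zip(*y)]
            for row in x]

def weight(v):
    """Number of nonzero bundles of v."""
    return sum(1 for e in v if e != 0)

# A: block diagonal matrix with four copies of PHT.
A = [[0] * 8 for _ in range(8)]
for blk in range(4):
    for i in range(2):
        for j in range(2):
            A[2 * blk + i][2 * blk + j] = PHT[i][j]

# B: permutation matrix of the shuffle.
B = [[1 if SHUFFLE[r] == c else 0 for c in range(8)] for r in range(8)]

# M = A * B * A * B * A, multiplied left to right.
M = A
for factor in (B, A, B, A):
    M = mat_mul(M, factor)

def image(v):
    """Image v*M of a row vector v."""
    return mat_mul([v], M)[0]

for v in ([32, 0, 0, 0, 0, 0, 0, 0], [128, 0, 0, 0, 0, 0, 0, 0]):
    print(v, "->", image(v), "weight", weight(image(v)))
```

The second input shows the weak diffusion discussed above: a single active bundle on input leaves a single active bundle on output.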

1.8.3.1 Multipermutations and MDS Matrices

The obvious question is: what are the matrices that guarantee the most complete diffusion? The question is somewhat ill posed because a desirable property of any component of a block cipher is its fast evaluation. Hence, a good diffusion matrix must strike the right balance between good diffusion and fast evaluation: a less perfect but much faster diffusion layer could still lead to a cipher that is faster and not less secure than another cipher making use of an ideal, but slower, diffusion layer. Also the question of performance is per se difficult to formalise: for instance, a sparse matrix is not necessarily good if some entries represent elements which are expensive to multiply with.

This said, the first problem remains that of measuring the quality of diffusion and determining optimal matrices; performance considerations, including compromises, come later.

To address this first problem, Serge Vaudenay suggested [Vau94] (generalising previous work by himself and Claus-Peter Schnorr [SV94]) to use multipermutations: given an alphabet $\mathcal{M}$ and integers $n$, $s$, an $(s, n)$-multipermutation over the alphabet $\mathcal{M}$ is a function $f$ from $\mathcal{M}^s$ to $\mathcal{M}^n$ such that two different $(s + n)$-tuples of the form $(x, f(x))$ cannot collide in any $s$ positions. Serge Vaudenay in particular first observed that the PHT in SAFER (and hence the whole diffusion layer) is not a multipermutation.
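The definition can be tested by brute force on small alphabets. The following sketch uses a hypothetical helper `is_multipermutation` and two toy maps of our own choosing: the 2-point PHT $(x, y) \mapsto (2x + y,\, x + y) \bmod 256$, and a linear map modulo the prime 17 picked only for contrast. For each choice of $s$ positions it checks that the projection of the tuples $(x, f(x))$ is injective, which is equivalent to the "no collision in any $s$ positions" condition.

```python
from itertools import combinations, product

def is_multipermutation(f, alphabet, s, n):
    """Brute-force test of the (s, n)-multipermutation property of f."""
    tuples = [x + tuple(f(*x)) for x in product(alphabet, repeat=s)]
    for positions in combinations(range(s + n), s):
        seen = set()
        for t in tuples:
            key = tuple(t[i] for i in positions)
            if key in seen:          # two distinct tuples agree on these s positions
                return False
            seen.add(key)
    return True

def pht(x, y):
    """2-point PHT over Z/256."""
    return ((2 * x + y) % 256, (x + y) % 256)

def toy(x, y, p=17):
    """Illustrative linear map over the prime field Z/17 (our assumption)."""
    return ((x + y) % p, (x + 2 * y) % p)

print(is_multipermutation(pht, range(256), 2, 2))
print(is_multipermutation(toy, range(17), 2, 2))
```

The PHT fails because, for example, the inputs $(0, 0)$ and $(128, 0)$ agree on $y$ and on the first output $2x + y$.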

To construct multipermutations, if the alphabet is representable as a finite field, he suggested to use (the redundancy part of) MDS matrices, i.e. matrices of MDS (maximum distance separable) codes, which are the codes which reach the Singleton bound. In other words, an $n \times s$ matrix $M$ over a finite field $\mathbb{F}$ is an MDS matrix if it is the transformation matrix of a linear transformation $f \colon \mathbb{F}^s \to \mathbb{F}^n$, $x \mapsto Mx$, with the following property: if $x$ and $x^*$ differ in exactly $t$ components, then $f(x)$ and $f(x^*)$ must differ in at least $n - t + 1$ components. The latter property is called perfect diffusion. Vaudenay also showed how to exploit imperfect diffusion for cryptanalysis (as in the case of reduced rounds of SAFER with suboptimal S-boxes, cf. Section 3.8 on page 150).

Now, to see why this is optimal and indeed a desirable cryptographic property, let us assume $s = n$ and consider first the case of a single changed input word. Then the change should spread to all outputs, a property that, as we have seen at the beginning of this section, is not satisfied by the SAFER diffusion layer. If we now change two words, we may always choose them so that one of the outputs of the linear transformation is left unchanged (this is a simple linear algebra exercise), so we cannot do better than requiring that at least $n - 1$ outputs are changed.

Note, however, that for $s = n$ the MDS condition is stronger than being invertible (i.e. non-singular): the identity matrix is invertible but clearly not MDS. Moreover, non-singularity by itself cannot characterise MDS matrices in general, since it applies only to square matrices. Non-singularity, however, gives a way to characterise an MDS matrix: Theorem 8 (page 321) of [MS77] states that a matrix is an MDS matrix if and only if every square sub-matrix is non-singular. In particular, an MDS matrix cannot have zero entries.
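The sub-matrix criterion translates directly into an exhaustive test for small matrices. The sketch below is our own illustration, not code from any of the cited works: it assumes the field $\mathbb{F}_{2^8}$ with the Rijndael reduction polynomial, uses the Rijndael MixColumns matrix as a positive example and the identity matrix as a negative one, and all function names are hypothetical.

```python
from itertools import combinations

def gf_mul(a, b, poly=0x11B):
    """Multiply in GF(2^8) = F_2[x]/(x^8 + x^4 + x^3 + x + 1)."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        if a & 0x100:
            a ^= poly
        b >>= 1
    return r

def gf_inv(a):
    """Inverse of a nonzero element, computed as a^254."""
    r = 1
    for _ in range(254):
        r = gf_mul(r, a)
    return r

def gf_det(m):
    """Determinant by Gaussian elimination (row swaps need no sign in char 2)."""
    m = [row[:] for row in m]
    n = len(m)
    det = 1
    for col in range(n):
        pivot = next((r for r in range(col, n) if m[r][col]), None)
        if pivot is None:
            return 0
        m[col], m[pivot] = m[pivot], m[col]
        det = gf_mul(det, m[col][col])
        inv = gf_inv(m[col][col])
        for r in range(col + 1, n):
            f = gf_mul(m[r][col], inv)
            if f:
                for c in range(col, n):
                    m[r][c] ^= gf_mul(f, m[col][c])
    return det

def is_mds(m):
    """True iff every square sub-matrix of the square matrix m is non-singular."""
    n = len(m)
    for k in range(1, n + 1):
        for rows in combinations(range(n), k):
            for cols in combinations(range(n), k):
                sub = [[m[r][c] for c in cols] for r in rows]
                if gf_det(sub) == 0:
                    return False
    return True

AES_MIX = [[2, 3, 1, 1], [1, 2, 3, 1], [1, 1, 2, 3], [3, 1, 1, 2]]
IDENTITY = [[int(i == j) for j in range(4)] for i in range(4)]
print(is_mds(AES_MIX), is_mds(IDENTITY))
```

A $4 \times 4$ matrix has only 69 square sub-matrices, so the exhaustive test is instantaneous; for larger dimensions the number of sub-matrices grows quickly.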

The first notable cipher to use MDS matrices for diffusion is Shark [RDP+96], designed by Vincent Rijmen, Joan Daemen, Bart Preneel, Antoon Bosselaers and Erik De Win (cf. Subsection 3.21.2 on page 190). For the design of the diffusion layer, the $m$-bit outputs of the S-boxes are considered as elements of $\mathbb{F}_{2^m}$. The diffusion layer takes $n$ $m$-bit values as input, and gives $n$ $m$-bit outputs. Such a vector of $n$ $m$-bit values represents the state of the cipher. Joan Daemen defines optimal diffusion using the branch number [Dae95]: the branch number $\mathcal{B}_\theta$ of an invertible linear mapping $\theta$ is

$$\mathcal{B}_\theta = \min_{a \neq 0} \bigl( \mathrm{wt}(a) + \mathrm{wt}(\theta(a)) \bigr),$$

where $\mathrm{wt}(a)$ is the Hamming weight of $a$ (here $a$ is considered as a tuple of elements over some algebraic structure, the bundles, so by Hamming weight we mean the number of nonzero elements of the tuple). $\mathcal{B}_\theta$ gives a measure for the worst case diffusion: it is a lower bound for the number of active S-boxes in two consecutive rounds of a linear trail or a differential characteristic.

Note that $\mathrm{wt}(\theta(a)) \leqslant n$ for every choice of $\theta$; taking $\mathrm{wt}(a) = 1$, this implies that $\mathcal{B}_\theta \leqslant n + 1$. An invertible linear mapping $\theta$ for which $\mathcal{B}_\theta = n + 1$ is called optimal. If the vector $a$ represents, for instance, an input differential, we see how this definition of optimality corresponds to the multipermutation property. In fact, it follows directly from the definitions of branch number and of MDS codes that the generator matrix of an MDS code defines an optimal linear transformation $\theta$. Furthermore, this $\theta$ must be invertible.
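For very small parameters the branch number can be computed by exhaustive search over all nonzero inputs. The sketch below makes several assumptions of our own: a toy field $\mathbb{F}_{2^4}$ with reduction polynomial $x^4 + x + 1$, column vectors, and two hand-picked $2 \times 2$ matrices. The cost grows as $q^n$, so this only illustrates the definition.

```python
from itertools import product

def gf16_mul(a, b):
    """Multiply in GF(2^4) = F_2[x]/(x^4 + x + 1) (toy field, our assumption)."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        if a & 0x10:
            a ^= 0x13
        b >>= 1
    return r

def wt(v):
    """Hamming weight: number of nonzero bundles."""
    return sum(1 for e in v if e != 0)

def apply(m, a):
    """Image m*a of the column vector a."""
    out = []
    for row in m:
        acc = 0
        for coeff, x in zip(row, a):
            acc ^= gf16_mul(coeff, x)
        out.append(acc)
    return out

def branch_number(m, q=16):
    """min over nonzero a of wt(a) + wt(m*a), by exhaustive search."""
    n = len(m)
    best = 2 * n
    for a in product(range(q), repeat=n):
        if any(a):
            best = min(best, wt(a) + wt(apply(m, a)))
    return best

GOOD = [[1, 2], [2, 1]]       # all entries and the determinant (1 + 4 = 5) are nonzero
IDENTITY = [[1, 0], [0, 1]]
print(branch_number(GOOD), branch_number(IDENTITY))
```

The first matrix reaches the optimal value $n + 1 = 3$, while the identity only reaches 2.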

Other examples of block ciphers that use MDS matrices for diffusion are SQUARE [DKR97] (see Section 3.11 on page 160), Twofish (Section 3.13 on page 164), the AES contest winner Rijndael (Section 3.20 on page 182), Hierocrypt [OMSK00], IDEA NXT (Section 3.23 on page 195), Clefia (Section 3.28 on page 203), Piccolo (Subsection 3.37.2 on page 228), and LED (Subsection 3.37.4 on page 229). MDS matrices are also used in the stream cipher MUGI [WFY+04] and in the cryptographic hash function WHIRLPOOL [BR11a].

It is worth noting that the entries in the MDS matrices are usually chosen to be elements of low Hamming weight, in order to make multiplication by them as inexpensive as possible. This is often done by exhaustive search within certain classes of MDS matrices, such as generator matrices of Reed-Solomon codes. Also, since an MDS matrix cannot have zero entries, the desirable type of sparseness is a small number of entries not equal to one.

Even multiplication by low Hamming weight elements of a finite field can be too expensive for some applications. Therefore some ciphers, such as mCrypton (Section 3.25 on page 198), define their diffusion matrix in a different, ad hoc, way. The corresponding study of the diffusion properties is also ad hoc and the S-boxes have to be chosen carefully.

We shall return to the problem of constructing efficient MDS diffusion layers in Subsubsection 1.8.3.3 on page 51.

Another problem arises with states that consist of many words, for instance 16, as in 128-bit SPNs with 8-bit S-boxes (or with 64-bit SPNs that use 4-bit S-boxes), namely that the diffusion matrix becomes too large: in the examples we just made it would be a $16 \times 16$ matrix. The solution adopted in ciphers such as SQUARE and Rijndael, with 16 words (of one byte each), is to apply diffusion only to each of four blocks of four words independently during a round, and then to simply permute the words in such a way that full diffusion will be completed in the following round. Therefore, instead of the multiplication of a diffusion matrix times a column vector, in such ciphers the diffusion operation is implemented as a multiplication of two matrices: the diffusion matrix and a matrix whose columns are segments of the state. In mathematical notation this is described in Section 3.20 on page 182.
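As a small illustration of "diffusion matrix times state matrix", the sketch below multiplies the Rijndael MixColumns matrix by a $4 \times 4$ state whose columns are four-byte segments. The function names are ours, the field is $\mathbb{F}_{2^8}$ with the Rijndael polynomial, and the sample columns are commonly quoted MixColumns test values; the word permutation between rounds is deliberately left out.

```python
def gf_mul(a, b, poly=0x11B):
    """Multiply in GF(2^8) = F_2[x]/(x^8 + x^4 + x^3 + x + 1)."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        if a & 0x100:
            a ^= poly
        b >>= 1
    return r

MIX = [[2, 3, 1, 1], [1, 2, 3, 1], [1, 1, 2, 3], [3, 1, 1, 2]]

def mat_mul_gf(m, state):
    """Matrix product m * state over GF(2^8); each state column is diffused independently."""
    rows, inner, cols = len(m), len(state), len(state[0])
    out = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            acc = 0
            for k in range(inner):
                acc ^= gf_mul(m[i][k], state[k][j])
            out[i][j] = acc
    return out

def column(matrix, j):
    return [row[j] for row in matrix]

# State written column by column: (db 13 53 45), (f2 0a 22 5c), (01 ...), (c6 ...).
state = [
    [0xDB, 0xF2, 0x01, 0xC6],
    [0x13, 0x0A, 0x01, 0xC6],
    [0x53, 0x22, 0x01, 0xC6],
    [0x45, 0x5C, 0x01, 0xC6],
]
mixed = mat_mul_gf(MIX, state)
for j in range(4):
    print([format(b, "02x") for b in column(mixed, j)])
```

A single matrix product thus applies the same $4 \times 4$ MDS matrix to all four columns at once, instead of one $16 \times 16$ matrix to the whole state.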

1.8.3.2 Types of MDS Matrices

We recall that what we called an MDS matrix $M$ is, formally, the non-systematic (or redundancy) part of the generator matrix of an MDS code. This means that a basis for the corresponding codeword space, a $k$-dimensional subspace of $K^{n+k}$ over a finite field $K$, is given by the rows of the generator matrix $G = (M \mid I_k)$, where $M$ is a $k \times n$ matrix and $I_k$ is the $k \times k$ identity matrix.

Here we are chiefly interested in square MDS matrices, i.e. with $k = n$; however, we must observe that there are further uses in cryptography: the F-function of the block cipher PICARO (Subsection 3.38.1 on page 230), a Feistel network, uses a full generator matrix $G$ of an MDS code to embed an eight-dimensional vector space over $\mathbb{F}_{2^8}$ into a 14-dimensional one, and then the transpose of $G$ to compress back 14 dimensions to eight. In this case the generator matrix has a $6 \times 8$ redundancy part.

There are two ways of constructing MDS matrices: one can start with a known MDS code (for instance, the code used in Shark is a Reed-Solomon code), or search for matrices that satisfy the non-singular sub-matrix condition.

Cauchy matrices are a classic example of MDS matrices. They are of the form $\left( \frac{1}{x_i - y_j} \right)_{0 \leqslant i,j < n}$ with all $x_i - y_j \neq 0$ over a field $K$ (and with pairwise distinct $x_i$ and pairwise distinct $y_j$). In general they do not lend themselves readily to optimisation. Amr Youssef, Serge Mister and Stafford Tavares define in [YMT97] a special class of Cauchy matrices for the design of diffusion layers in block ciphers: they construct their matrices $A$ over a binary field $K$ by first choosing the $x_i$'s such that the least significant $r$ bits of $x_i$ are the binary representation of the number $i$, and then putting $y_i = x_i \oplus v$ where $v$ is a nonzero field element such that its least significant $r$ bits are all zero. This matrix satisfies $A^2 = c I_n$ where

$$c = \bigoplus_{i=0}^{n-1} \left( \frac{1}{x_1 \oplus y_i} \right)^2$$

over $K$. The matrix $A$ is then normalised by dividing all its entries by $\sqrt{c}$, so that it becomes involutory. Such an $n \times n$ matrix also has only $n$ different entries, which are used for both encryption and decryption, reducing the number of circuits or short programs to implement for the multiplication by constants.
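The construction just described can be checked numerically on a small instance. The parameters below are our own assumptions, chosen only for illustration: the field $\mathbb{F}_{2^8}$ with the Rijndael polynomial, $n = 2^r = 4$, $x_i = i$ (upper bits zero) and $v = 2^r$. The sketch verifies $A^2 = c I_n$, normalises by $\sqrt{c}$, and counts the distinct entries.

```python
def gf_mul(a, b, poly=0x11B):
    """Multiply in GF(2^8) = F_2[x]/(x^8 + x^4 + x^3 + x + 1)."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        if a & 0x100:
            a ^= poly
        b >>= 1
    return r

def gf_pow(a, e):
    """Square-and-multiply exponentiation in GF(2^8)."""
    r = 1
    while e:
        if e & 1:
            r = gf_mul(r, a)
        a = gf_mul(a, a)
        e >>= 1
    return r

def gf_inv(a):
    return gf_pow(a, 254)        # a^254 = a^(-1) for nonzero a

def gf_sqrt(a):
    return gf_pow(a, 128)        # (a^128)^2 = a^256 = a

def mat_mul(x, y):
    n = len(x)
    out = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            for k in range(n):
                out[i][j] ^= gf_mul(x[i][k], y[k][j])
    return out

r = 2
n = 1 << r                       # assumption: n = 2^r
v = 1 << r                       # nonzero, least significant r bits all zero
x = list(range(n))               # least significant r bits of x_i encode i
y = [xi ^ v for xi in x]

A = [[gf_inv(x[i] ^ y[j]) for j in range(n)] for i in range(n)]

c = 0
for i in range(n):
    t = gf_inv(x[1] ^ y[i])
    c ^= gf_mul(t, t)

identity = [[int(i == j) for j in range(n)] for i in range(n)]
scale = gf_inv(gf_sqrt(c))
A_norm = [[gf_mul(e, scale) for e in row] for row in A]

print(mat_mul(A, A) == [[c * e for e in row] for row in identity])
print(mat_mul(A_norm, A_norm) == identity)
print(len({e for row in A_norm for e in row}))
```

With these choices the entry in position $(i, j)$ depends only on $i \oplus j$, which is why only $n$ distinct constants appear.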

Vandermonde matrices (see [Yca13] for their history and naming) are matrices where each row is of the form $1, \alpha_i, \alpha_i^2, \dots, \alpha_i^{n-1}$ for pairwise distinct $\alpha_i$'s. They are closely tied to MDS codes (Reed-Solomon codes are defined through them), although over a finite field a square Vandermonde matrix need not itself satisfy the non-singular sub-matrix condition, and there are very efficient algorithms for multiplication of vectors by them, as this operation amounts to multi-evaluation of a polynomial of degree $n - 1$ at $n$ points. These algorithms are DFT based (cf. Chapter 3 of [Pan01]) and therefore suitable only for large matrices. We are not sure which is the first mention of Vandermonde matrices for the construction of diffusion layers in SPNs: often, in the literature, a 2004 paper by Jérôme Lacan and Jérôme Fimes [LF04] is cited, which however deals with a clever use of Vandermonde matrices to build erasure codes, not with cryptographic applications.

Hadamard matrices $H$ have the property that $H \cdot H^t = n I_n$ (here $H^t$ denotes the transpose of $H$). The first such matrices were originally constructed by James Joseph Sylvester [Syl67] and Jacques Hadamard [Had93] as real matrices with entries equal to $\pm 1$, but over finite fields the latter condition is relaxed. The property $H \cdot H^t = n I_n$ makes them suitable to construct involutory diffusion layers, upon scaling, and they are used for this purpose in Anubis (Subsection 3.21.3 on page 192) and Khazad (Subsection 3.21.2 on page 190).

Mahdi Sajadieh et al., in [SDMO12], construct involutory MDS matrices using Vandermonde matrices over fields $\mathbb{F}_{2^r}$. Their idea is to take $n$ pairwise distinct and non-vanishing values $\alpha_i$ for $0 \leqslant i < n$, a nonzero $\delta$ in $\mathbb{F}_{2^r}$, and to put $\beta_i = \alpha_i \oplus \delta$. If $A$ and $B$ are the Vandermonde matrices associated to the $n$-tuples $(\alpha_0, \alpha_1, \dots, \alpha_{n-1})$ and $(\beta_0, \beta_1, \dots, \beta_{n-1})$, then $B \cdot A^{-1}$ is an involutory MDS matrix (provided that no $\beta_i$ coincides with any $\alpha_j$). They then go on to construct $2^d \times 2^d$ Hadamard involutory matrices recursively, starting from slightly modified $4 \times 4$ Vandermonde matrices. Kishan Chand Gupta and Indranil Ghosh Ray [GR13a] show that these matrices can also be constructed starting from Cauchy matrices.
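The involution property of $B \cdot A^{-1}$ can be checked on a concrete instance. The sketch below rests on assumptions of ours: the field $\mathbb{F}_{2^8}$ with the Rijndael polynomial, $n = 4$, $\alpha = (1, 2, 3, 4)$ and $\delta = 8$, so that the $\beta_i$ are disjoint from the $\alpha_j$. Rows of the Vandermonde matrices are $(1, \alpha_i, \alpha_i^2, \alpha_i^3)$ as in the text, and all helper names are hypothetical.

```python
def gf_mul(a, b, poly=0x11B):
    """Multiply in GF(2^8) = F_2[x]/(x^8 + x^4 + x^3 + x + 1)."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        if a & 0x100:
            a ^= poly
        b >>= 1
    return r

def gf_pow(a, e):
    r = 1
    while e:
        if e & 1:
            r = gf_mul(r, a)
        a = gf_mul(a, a)
        e >>= 1
    return r

def mat_mul(x, y):
    n = len(x)
    out = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            for k in range(n):
                out[i][j] ^= gf_mul(x[i][k], y[k][j])
    return out

def mat_inv(m):
    """Gauss-Jordan inversion over GF(2^8); assumes m is non-singular."""
    n = len(m)
    aug = [row[:] + [int(i == j) for j in range(n)] for i, row in enumerate(m)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if aug[r][col])
        aug[col], aug[pivot] = aug[pivot], aug[col]
        inv = gf_pow(aug[col][col], 254)
        aug[col] = [gf_mul(inv, e) for e in aug[col]]
        for r in range(n):
            if r != col and aug[r][col]:
                f = aug[r][col]
                aug[r] = [e ^ gf_mul(f, p) for e, p in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def vandermonde(values):
    """Row i is (1, a_i, a_i^2, ..., a_i^(n-1))."""
    n = len(values)
    return [[gf_pow(a, j) for j in range(n)] for a in values]

alphas = [1, 2, 3, 4]
delta = 8
betas = [a ^ delta for a in alphas]

A = vandermonde(alphas)
B = vandermonde(betas)
P = mat_mul(B, mat_inv(A))

identity = [[int(i == j) for j in range(4)] for i in range(4)]
print(mat_mul(P, P) == identity)
print(all(e != 0 for row in P for e in row))
```

The order of the factors matters in this convention: $A^{-1} \cdot B$ is a triangular change-of-variable matrix (for $x \mapsto x \oplus \delta$), which has zero entries and so cannot be MDS.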

1.8.3.3 Constructing Efficient MDS Matrices

We now focus on the problem of constructing efficient MDS matrices. We have already mentioned that choosing matrices with entries of low Hamming weight is often desirable, but the actual problem is the minimisation of the cost of the multiplication by the whole MDS matrix. Multiplication of a variable vector or matrix, over a finite field, by a fixed matrix with hard-wired nonzero constant entries has a code or area complexity strongly correlated with the number of entries different from one (the actual values of such entries of course also play a role).

Therefore, Pascal Junod and Serge Vaudenay introduce in [JV04b] the following criterion: if $v_1(M)$ is the number of entries equal to one in the matrix $M$ and $c_1(M)$ is the cardinality of the set $C(M)$ of distinct entries in $M$ which are different from one, then the goal is to maximise $v_1(M)$ and to minimise $c_1(M)$. Of course this does not take into account special situations which may make some matrices more efficient than others in some cases: for instance, multiplication by the generator $x$ of the polynomial basis of the field is inexpensive, as is also multiplication by $x^{-1}$; and the structure of the set $C(M)$ is not taken into account, as when some elements are the sum or product of other elements in the set. Junod and Vaudenay then start constructing candidates for MDS matrices from the concept of bi-regularity: a $2 \times 2$ array with nonzero entries in a field $K$ is bi-regular if at least one row and one column have two different entries. It is clear that bi-regularity is a prerequisite for non-singularity. MDS matrices are constructed iteratively by extending bi-regular arrays, and bounds for $v_1(M)$ and $c_1(M)$ are given as a function of the dimensions of the matrices.

To support their point through examples, Junod and Vaudenay consider the $4 \times 4$ MDS matrix $M$ over $\mathbb{F}_{2^8} = \mathbb{F}_2[x]/(x^8 + x^4 + x^3 + x + 1)$ used in Rijndael, i.e. the matrix used in the MixColumns step described in Section 3.20 on page 182. It has $c_1(M) = 2$, which according to [JV04b] is optimal, but $v_1(M) = 8$, whereas $v_1(M) = 9$ (i.e. only seven entries different from one) is attainable. Multiplication by the Rijndael matrix can be implemented using 15 XORs, four table lookups in one table (to implement multiplication) and using three temporary variables. Junod and Vaudenay show that the family of matrices of the form

$$\begin{pmatrix}
a & 1 & 1 & 1 \\
1 & a & 1 & c \\
1 & c & a & 1 \\
1 & 1 & c & a
\end{pmatrix}$$

can be implemented using 10 XORs and seven table lookups in two tables, using two temporary variables. Using the sub-matrix non-singularity criterion, it is easily seen that such a matrix is an MDS matrix over a field extension of $\mathbb{F}_2$ if and only if $1$, $a$, $c$, and $a + c$ are pairwise distinct, $a \neq c^2$, and $a^2 \neq c$. This matrix is at the basis of the diffusion layer in IDEA NXT-64 (Section 3.23 on page 195). Junod and Vaudenay also construct an $8 \times 8$ matrix over $\mathbb{F}_{2^8}$, which is used in IDEA NXT-128. Being MDS, these matrices all have optimal branch numbers, i.e. 5 and 9 respectively.
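The two counting criteria are easy to evaluate mechanically. In the sketch below the function names `v1` and `c1` mirror the notation of the text, while the string placeholders `"a"` and `"c"` are our own device for the symbolic entries of the family (assumed different from one and from each other).

```python
# Rijndael MixColumns matrix and the symbolic 4x4 family discussed above.
AES_MIX = [[2, 3, 1, 1], [1, 2, 3, 1], [1, 1, 2, 3], [3, 1, 1, 2]]
FAMILY = [
    ["a", 1, 1, 1],
    [1, "a", 1, "c"],
    [1, "c", "a", 1],
    [1, 1, "c", "a"],
]

def v1(m):
    """Number of entries equal to one."""
    return sum(1 for row in m for e in row if e == 1)

def c1(m):
    """Number of distinct entries different from one."""
    return len({e for row in m for e in row if e != 1})

for name, m in (("Rijndael", AES_MIX), ("family", FAMILY)):
    print(name, "v1 =", v1(m), "c1 =", c1(m))
```

Both matrices use two distinct constants other than one, but the family has nine entries equal to one against eight for the Rijndael matrix.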

A different line of research, followed by several authors during the last few years, and that is particularly advantageous for ciphers whose design criteria are compactness of code and data or of area, is to construct MDS matrices iteratively. The idea is simple: if a matrix $N$ exists with a very compact and sparse representation such that the $k$-th power of $N$ is an MDS matrix $M$, then one can just apply $k$ times the matrix $N$ in place of $M$. (For some reason, such constructions are often called recursive in the literature.)

Jian Guo, Thomas Peyrin, Axel Poschmann, and Matthew Robshaw design the diffusion layers of the hash function PHOTON [GPP11] and of the block cipher LED [GPPR12, GPPR11] by constructing their MDS matrix as the power of the companion matrix of an LFSR. Recall that if

$$y_{n+k} = c_{n-1} y_{n+k-1} + c_{n-2} y_{n+k-2} + \dots + c_1 y_{k+1} + c_0 y_k \tag{1.3}$$

is a recursive relation with $c_0, c_{n-1} \neq 0$, then its characteristic polynomial is

$$g(X) = X^n - \left( c_{n-1} X^{n-1} + c_{n-2} X^{n-2} + \dots + c_1 X + c_0 \right) \tag{1.4}$$

and its companion matrix is the matrix $C$ such that

$$\underbrace{\begin{pmatrix} 0 & & I_{n-1} & \\ c_0 & c_1 & \cdots & c_{n-1} \end{pmatrix}}_{C} \cdot \begin{pmatrix} y_k \\ y_{k+1} \\ \vdots \\ y_{n+k-1} \end{pmatrix} = \begin{pmatrix} y_{k+1} \\ \vdots \\ y_{n+k-1} \\ y_{n+k} \end{pmatrix}, \tag{1.5}$$

which is denoted by $\mathrm{Serial}(c_0, c_1, \dots, c_{n-1})$ in [GPP11]. The inverse of $C$ has a simple form as well
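To make the iterative idea concrete, the sketch below builds a companion matrix as in (1.5) and raises it to the fourth power. The parameters are assumptions of ours: the field $\mathbb{F}_{2^4}$ with reduction polynomial $x^4 + x + 1$ and the coefficients $(4, 1, 2, 2)$, which to our understanding are those of LED. The helper names `serial`, `mat_pow` and `apply` are illustrative only.

```python
def gf16_mul(a, b):
    """Multiply in GF(2^4) = F_2[x]/(x^4 + x + 1)."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        if a & 0x10:
            a ^= 0x13
        b >>= 1
    return r

def serial(coeffs):
    """Companion matrix: shifted identity on top, coefficients in the last row."""
    n = len(coeffs)
    rows = [[int(j == i + 1) for j in range(n)] for i in range(n - 1)]
    rows.append(list(coeffs))
    return rows

def mat_mul(x, y):
    n = len(x)
    out = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            for k in range(n):
                out[i][j] ^= gf16_mul(x[i][k], y[k][j])
    return out

def mat_pow(m, k):
    """k-th power of m for k >= 1, by repeated multiplication."""
    result = m
    for _ in range(k - 1):
        result = mat_mul(result, m)
    return result

def apply(m, y):
    """Image m*y of the column vector y."""
    out = []
    for row in m:
        acc = 0
        for coeff, e in zip(row, y):
            acc ^= gf16_mul(coeff, e)
        out.append(acc)
    return out

C = serial([4, 1, 2, 2])
M = mat_pow(C, 4)
for row in M:
    print([format(e, "X") for e in row])

# Applying the sparse matrix C four times equals one application of M.
y = [1, 2, 3, 4]
z = y
for _ in range(4):
    z = apply(C, z)
print(z == apply(M, y))
```

Each application of $C$ only shifts the state and computes one new element, which is what makes this approach attractive for compact implementations.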
