Cube Attacks and Cube Testers

A cube attack [49] is an algebraic method of cryptanalysis targeting block ciphers. When applicable, it can lead to key recovery. The main idea of cube attacks is to find linear terms in the algebraic normal form of the output. If these terms are bits of the secret key, then the attacker can easily solve the linear system and thus recover the key bits.

The method works in two phases. In the first, so-called preprocessing phase, given the description of the cipher or black-box access to it, the attacker tries to build the algebraic normal form (ANF) for the output bits (the input variables $x_1, \ldots, x_n$ are the bits of the key and the plaintext). However, it is reasonable to assume that the size of the explicit formula of the ANF, i.e. the number of monomials in the ANF, is exponential in the number of input bits, hence the attacker cannot fully construct the ANF. Given a monomial $t_I$ that is a product of variables with indices from $I$, where $I \subset \{1, \ldots, n\}$, the ANF of some output bit can be represented as:

$$p(x_1, \ldots, x_n) = t_I \cdot p_{S(I)} + q(x_1, \ldots, x_n), \qquad (5.1)$$

where $p_{S(I)}$ is called the superpoly of $I$ in $p$ and has no common variables with $t_I$, and every monomial of $q$ misses at least one variable from $t_I$. The attacker, using various heuristics, tries to find different $t_I$'s that have linear non-constant superpolies; terms of this type are called maxterms of $p$. Note that if in (5.1) the attacker sums (over GF(2)) over all possible values of the variables of $t_I$, it can be shown that the right part of (5.1) becomes equal to $p_{S(I)}$, while the value of the left part is known to the attacker. Hence, when enough maxterms are available, the problem of recovering the key bits (the ones in $p_{S(I)}$) is reduced to the problem of solving a system of linear equations. This is done in the online phase, when the key is fixed and unknown to the attacker. He queries the cipher with chosen plaintexts in order to obtain the sums over all possible values of the variables of the different maxterms (found previously in the preprocessing phase). Once he gets the bits of the ciphertexts, he only has to solve the linear system of equations to recover the key bits included in the superpolies of the maxterms.
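The summation step in (5.1) can be illustrated on a toy example. The sketch below is only an illustration under stated assumptions: `toy_output_bit` is a hypothetical 5-variable polynomial (three public bits, two key bits) invented for this example, not a real cipher, and `cube_sum` is a helper name chosen here. For the cube $I = \{v_1, v_2\}$ the superpoly of the toy polynomial is $k_1 + k_2 + 1$, so XORing the four outputs of the cube yields one linear equation in the key bits.

```python
from itertools import product


def toy_output_bit(x):
    """Hypothetical output bit over GF(2); x = (v1, v2, v3, k1, k2).

    ANF: v1*v2*(k1 + k2 + 1) + v1*v3*k1 + v2*k2 + v3 + 1.
    Every monomial outside the first term misses v1 or v2.
    """
    v1, v2, v3, k1, k2 = x
    return (v1 & v2 & (k1 ^ k2 ^ 1)) ^ (v1 & v3 & k1) ^ (v2 & k2) ^ v3 ^ 1


def cube_sum(f, cube, fixed):
    """XOR f over all 2^|cube| assignments of the cube variables.

    `cube` holds the indices in I. Entries of `fixed` at the other
    positions stay constant; entries at cube positions are overwritten.
    """
    total = 0
    for bits in product((0, 1), repeat=len(cube)):
        x = list(fixed)
        for index, bit in zip(cube, bits):
            x[index] = bit
        total ^= f(x)
    return total


# Online phase for one maxterm: the key is fixed and unknown to the attacker,
# who queries the four inputs of the cube and XORs the answers.
secret_key = (1, 1)
observed = cube_sum(toy_output_bit, cube=(0, 1), fixed=[0, 0, 0, *secret_key])
print("value of k1 + k2 + 1:", observed)
```

With enough such equations from different maxterms, the key bits follow from ordinary Gaussian elimination over GF(2).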

Cube testers [6] are distinguishing attacks that further exploit some properties of superpolies. As in cube attacks, the adversary chooses various $t_I$'s and gets their superpolies $p_{S(I)}$. Then he tries to show some distinguishing property of these superpolies, i.e. a property that is not (easily) found in the superpolies of a random function/permutation. Typical properties of this kind are balance, constantness, low degree, and others. Cube testers are a purely practical type of distinguisher, i.e. the attacker can demonstrate the distinguishing property in feasible time, and finding testers with higher complexity is still an open problem.
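As a small illustration of the constantness property, the sketch below evaluates the superpoly of a cube for every setting of the remaining variables and checks whether it ever changes. The two test functions and all helper names are hypothetical and chosen only for this example; on a real primitive the non-cube variables would be sampled rather than enumerated.

```python
from itertools import product


def superpoly_values(f, cube, n):
    """Return the set of cube sums seen over all settings of the non-cube variables."""
    free = [i for i in range(n) if i not in cube]
    seen = set()
    for free_bits in product((0, 1), repeat=len(free)):
        total = 0
        for cube_bits in product((0, 1), repeat=len(cube)):
            x = [0] * n
            for i, b in zip(free, free_bits):
                x[i] = b
            for i, b in zip(cube, cube_bits):
                x[i] = b
            total ^= f(x)  # summation over the cube is XOR over GF(2)
        seen.add(total)
    return seen


def is_constant_superpoly(f, cube, n):
    """Distinguishing property: the superpoly takes a single value."""
    return len(superpoly_values(f, cube, n)) == 1


def low_degree(x):
    # Degree 2, so any cube of three variables sums to the constant 0.
    return (x[0] & x[1]) ^ (x[2] & x[3]) ^ x[4]


def high_degree(x):
    # Contains x0*x1*x2*x3, so the cube {x0, x1, x2} has superpoly x3.
    return (x[0] & x[1] & x[2] & x[3]) ^ (x[1] & x[4]) ^ x[0]


print(is_constant_superpoly(low_degree, (0, 1, 2), 5))
print(is_constant_superpoly(high_degree, (0, 1, 2), 5))
```

A constant superpoly over many cubes is unlikely for a random function whose degree exceeds the cube size, which is what makes the test usable as a distinguisher.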

Part III

Differential Attacks on Hash Functions

Differential attacks play an important role in the analysis of cryptographic hash functions. The starting point of these attacks is finding a high-probability differential trail for the hash function or the underlying compression function. The search for these trails is usually ad hoc, i.e. there is no universal method applicable to any function. Depending on the properties of the trail found, the attacker can launch several distinct differential attacks:

1. When the differential trail for an n-bit function ends with a zero difference, and the probability of the trail is higher than $2^{-n/2}$, then the attacker can launch a simple collision attack. Given a trail $\Delta \xrightarrow{2^{-t}} 0$ for an n-bit function $F(x)$, the attacker composes $2^{t}$ pairs $(F(X_i), F(X_i \oplus \Delta))$. Then with a high probability he can expect that one of the pairs will be a collision. Since $t < n/2$, he finds this collision with an effort less than $2^{n/2}$, i.e. he has launched a collision attack. Further, we present trails and collision attacks for the compression functions of SHA-256 and LAKE. The attacks were published in:

• [109] Collisions for Step-Reduced SHA-256, FSE 2008

• [27] Cryptanalysis of the LAKE Hash Family, FSE 2009

2. When the trail has a probability higher than $2^{-n}$ (but not necessarily higher than $2^{-n/2}$), then it can be used as a distinguisher for the function. Again the attacker creates $2^{t}$ pairs (for a trail of probability $2^{-t}$) and finds one that follows the trail. On the other hand, in a random function, he needs around $2^{n}$ pairs to find one pair that, for the fixed input difference, produces the fixed output difference. It is important to notice that the estimate $2^{n}$ for a random function holds only when the input difference is fixed as well; otherwise the complexity drops to $2^{n/2}$.

3. Two high-probability trails on different halves of the function can be used in a boomerang-type attack. When the combined probability of these two trails is higher than $2^{-n/2}$, then the attacker can create a boomerang distinguisher (a boomerang quartet) with a complexity lower than $2^{n/2}$, which is the complexity in the generic case. Further, we present the details of the boomerang distinguisher on the SHA-3 proposal BLAKE. The attack is based on the work:

Chapter 6

Collisions for SHA-2

The SHA-2 family of hash functions was introduced to the cryptographic community as a new, more complex, and hopefully more secure variant of the MD4-family of hash functions. The recent results on the widely used MD4-family hash functions SHA-1 and MD5 [140], [141] show flaws in the security of these functions with respect to collision attacks. The question arises whether the most complex member of the MD4-family, the SHA-2 family, is also vulnerable to collision attacks.

Research has been done on finding local collisions for the SHA-2 family. Gilbert and Handschuh [56] reported a 9-step local collision with probability of the differential path of $2^{-66}$. Later, Mendel et al. [97] estimated the probability of this local collision to be $2^{-39}$. Somitra and Palash obtained a local collision with probability $2^{-42}$. Using modular differences, Hawkes, Paddon and Rose [64] were able to find a local collision with probability $2^{-39}$. Finding a real collision for SHA-2 was due to Mendel et al. [97]. They studied the message expansion of SHA-256 and reported a 19-step near collision.

We find a 9-step differential trail (we use modular differences) that holds with probability … if we fix some of the intermediate values and solve the equations that arise, i.e. if we use a message modification. We show that it is not necessary to introduce differences in message words on each step of the trail. This helps us to overcome the message expansion. Using only one instance of this differential trail we find 20 and 21-step collisions (collisions for the original initial value) with complexities of 3 and $2^{\ldots}$ compression function calls, respectively. Also, using slightly different differential paths we are able to find a 23-step semi-free start collision (collisions for a specific initial value) with a complexity of $2^{\ldots}$ calls. Our final result is a 25-step semi-free start near collision with Hamming distance of 15 bits and $2^{\ldots}$ calls.

Our results were further improved by Indesteege et al. [66]. They reported 24-step collisions for SHA-256 and SHA-512 with complexity of $2^{28.5}$ and $2^{\ldots}$ compression function calls, respectively. They were also able to find free-start near-collisions for 31-step reduced SHA-256.

6.1 Description of SHA-2

The SHA-2 family consists of the iterative hash functions SHA-224, SHA-256, SHA-384, and SHA-512. For our purposes, we will describe only SHA-256. The definitions of the rest of the functions can be found in [136]. SHA-256 takes a message of length less than $2^{64}$ bits and produces a 256-bit hash value. First, the input message is padded so that its length becomes a multiple of 512, and afterwards each 512-bit message block is processed as an input in the Damgård-Merkle iterative structure. Each iteration calls a compression function which takes as input a 256-bit chaining value and a 512-bit message block and produces a 256-bit output chaining value. The output chaining value of the previous iteration is the input chaining value for the following iteration. The initial chaining value, i.e. the value for the first iteration, is fixed, and the chaining value produced after the last message block is processed is the hash value of the whole message. The internal state of the SHA-256 compression function consists of eight 32-bit variables A, B, C, D, E, F, G, and H, each of which is updated in every one of the 64 steps. These variables are updated according to the following equations, where + denotes addition modulo $2^{32}$:

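$$
\begin{aligned}
A_{i+1} &= \Sigma_0(A_i) + Maj(A_i, B_i, C_i) + \Sigma_1(E_i) + Ch(E_i, F_i, G_i) + H_i + K_i + W_i \\
B_{i+1} &= A_i \\
C_{i+1} &= B_i \\
D_{i+1} &= C_i \\
E_{i+1} &= \Sigma_1(E_i) + Ch(E_i, F_i, G_i) + H_i + K_i + W_i + D_i \\
F_{i+1} &= E_i \\
G_{i+1} &= F_i \\
H_{i+1} &= G_i.
\end{aligned}
$$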

$Maj(X, Y, Z)$ and $Ch(X, Y, Z)$ are bitwise boolean functions defined as:

$$Ch(X, Y, Z) = (X \wedge Y) \vee (\neg X \wedge Z)$$

$$Maj(X, Y, Z) = (X \wedge Y) \vee (X \wedge Z) \vee (Y \wedge Z).$$

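For SHA-256, $\Sigma_0(X)$ and $\Sigma_1(X)$ are defined as:

$$\Sigma_0(X) = ROTR^{2}(X) \oplus ROTR^{13}(X) \oplus ROTR^{22}(X)$$

$$\Sigma_1(X) = ROTR^{6}(X) \oplus ROTR^{11}(X) \oplus ROTR^{25}(X),$$

where $ROTR^{k}$ denotes rotation of a 32-bit word to the right by $k$ bits.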
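One step of the state update can be sketched in a few lines. This is a minimal sketch, not a full SHA-256 implementation: the function names are chosen here for illustration, the rotation amounts are those of the SHA-256 standard (FIPS 180-4), and the step constant `k` and expanded message word `w` are passed in by the caller rather than taken from the standard's tables.

```python
MASK = 0xFFFFFFFF  # all arithmetic is on 32-bit words


def rotr(x, r):
    """Rotate the 32-bit word x right by r bits."""
    return ((x >> r) | (x << (32 - r))) & MASK


def big_sigma0(x):
    return rotr(x, 2) ^ rotr(x, 13) ^ rotr(x, 22)


def big_sigma1(x):
    return rotr(x, 6) ^ rotr(x, 11) ^ rotr(x, 25)


def ch(x, y, z):
    """Bitwise choice: take y where x has a 1 bit, z where x has a 0 bit."""
    return ((x & y) | (~x & z)) & MASK


def maj(x, y, z):
    """Bitwise majority of the three words."""
    return (x & y) | (x & z) | (y & z)


def step(state, k, w):
    """Apply one step to state = (A, B, C, D, E, F, G, H)."""
    a, b, c, d, e, f, g, h = state
    # t1 is the part shared by the updates of A and E.
    t1 = (big_sigma1(e) + ch(e, f, g) + h + k + w) & MASK
    t2 = (big_sigma0(a) + maj(a, b, c)) & MASK
    return ((t1 + t2) & MASK, a, b, c, (d + t1) & MASK, e, f, g)
```

Only A and E receive genuinely new values in a step; the other six registers are shifted copies, which is why a difference introduced in one step needs several further steps to leave the state.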

The state update function uses constants $K_i$, which are different for every step. The 512-bit message block itself is divided into sixteen 32-bit words: $m_0, m_1, \ldots, m_{15}$. Afterwards, the message block is expanded to sixty-four 32-bit words according to the following rule:

$$
W_i =
\begin{cases}
m_i, & 0 \le i \le 15 \\
\sigma_1(W_{i-2}) + W_{i-7} + \sigma_0(W_{i-15}) + W_{i-16}, & i > 15
\end{cases}
$$

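For SHA-256, $\sigma_0(X)$ and $\sigma_1(X)$ are defined as:

$$\sigma_0(X) = ROTR^{7}(X) \oplus ROTR^{18}(X) \oplus SHR^{3}(X)$$

$$\sigma_1(X) = ROTR^{17}(X) \oplus ROTR^{19}(X) \oplus SHR^{10}(X),$$

where $SHR^{k}$ denotes a shift of a 32-bit word to the right by $k$ bits.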
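The message expansion can be sketched as follows. This is an illustrative sketch with function names chosen here; the rotation and shift amounts are those of the SHA-256 standard (FIPS 180-4). The small example at the end shows how a difference in a single message word reappears in later expanded words, which is the obstacle the differential trails have to work around.

```python
MASK = 0xFFFFFFFF  # all arithmetic is on 32-bit words


def rotr(x, r):
    """Rotate the 32-bit word x right by r bits."""
    return ((x >> r) | (x << (32 - r))) & MASK


def small_sigma0(x):
    return rotr(x, 7) ^ rotr(x, 18) ^ (x >> 3)


def small_sigma1(x):
    return rotr(x, 17) ^ rotr(x, 19) ^ (x >> 10)


def expand(m):
    """Expand sixteen 32-bit message words into the 64 words W_0..W_63."""
    if len(m) != 16:
        raise ValueError("a message block consists of exactly 16 words")
    w = list(m)
    for i in range(16, 64):
        w.append(
            (small_sigma1(w[i - 2]) + w[i - 7] + small_sigma0(w[i - 15]) + w[i - 16])
            & MASK
        )
    return w


# A block whose only non-zero word is m_0 = 1: the word re-enters at W_16
# and then spreads through sigma_1 two steps later.
w = expand([1] + [0] * 15)
print(hex(w[16]), hex(w[17]), hex(w[18]))
```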

Table 6.1: A 9-step differential trail for the SHA-2 family. Notice that only 5 differences are introduced, i.e. in the steps i, i+1, i+2, i+3, and i+8.

step   ∆A   ∆B   ∆C   ∆D   ∆E   ∆F   ∆G   ∆H   ∆W
i       0    0    0    0    0    0    0    0    1
i+1     1    0    0    0    1    0    0    0   δ1
i+2     0    1    0    0   -1    1    0    0   δ2
i+3     0    0    1    0    0   -1    1    0   δ3
i+4     0    0    0    1    0    0   -1    1    0
i+5     0    0    0    0    1    0    0   -1    0
i+6     0    0    0    0    0    1    0    0    0
i+7     0    0    0    0    0    0    1    0    0
i+8     0    0    0    0    0    0    0    1   δ4
i+9     0    0    0    0    0    0    0    0    0

After the 64th step, the compression function adds the initial values to the chaining variables, i.e. the output of the compression function is:

$$h(M) = (A_{64} + A_0,\ B_{64} + B_0,\ C_{64} + C_0,\ D_{64} + D_0,\ E_{64} + E_0,\ F_{64} + F_0,\ G_{64} + G_0,\ H_{64} + H_0).$$

These values become the initial chaining value for the next compression function.
