Algebraic attacks - The MD6 hash function A proposal to NIST for SHA-3

There are a number of algebraic attacks one might consider trying against the MD6 compression function.

CHAPTER 6. COMPRESSION FUNCTION SECURITY 125

6.11.1 Degree Estimates

Many algebraic attacks, such as Shamir’s “cube” attack, require that the algebraic normal form describing the output bits have relatively small degree in terms of the input bits being considered. There are 89 · 64 = 5696 input variables, one for each of the 64 bits in each of the 89 input words to the compression function f. A particular algebraic attack may consider only a subset of these, and leave the others fixed.

Each bit of each word A[i] is a function of the nw = 5696 input variables that can be described in a unique way by a polynomial in algebraic normal form (ANF). A particular MD6 output bit can be represented in a unique way as an ANF polynomial with input variables $x_1, x_2, \ldots, x_{nw}$ and coefficients and outputs in GF(2), e.g.

$$a_0 \oplus a_1 x_1 \oplus \cdots \oplus a_{nw} x_{nw} \oplus a_{nw+1} x_1 x_2 \oplus \cdots \oplus a_{2^{nw}-1} x_1 x_2 \cdots x_{nw}. \tag{6.18}$$

There are $2^{nw}$ terms in this sum, one for each subset of the variables. The coefficient for a term determines whether that term is “present” (coefficient = 1) or “absent” (coefficient = 0).
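To make the ANF representation concrete, the following Python sketch converts the truth table of a small Boolean function into its ANF coefficients using the standard binary Möbius transform. The function names and the 3-variable majority example are illustrative assumptions and do not come from the MD6 report; for the real compression function, with nw = 5696 variables, such an exhaustive computation is of course infeasible.

```python
def anf_coefficients(truth_table):
    """Binary Moebius transform: truth table (length 2**n) -> ANF coefficients.

    Index m is read as a bitmask: bit j of m set means variable x_{j+1} is 1
    (in the truth table) or appears in the monomial (in the coefficient list).
    """
    coeffs = list(truth_table)
    n = len(coeffs).bit_length() - 1
    for i in range(n):
        step = 1 << i
        for mask in range(len(coeffs)):
            if mask & step:
                # Fold in the entry that differs only in variable i.
                coeffs[mask] ^= coeffs[mask ^ step]
    return coeffs


def anf_degree(coeffs):
    """Largest number of variables in any monomial that is present."""
    return max((bin(m).count("1") for m, a in enumerate(coeffs) if a), default=0)


# Majority of three bits: maj = x1x2 XOR x1x3 XOR x2x3.
majority_table = [1 if bin(m).count("1") >= 2 else 0 for m in range(8)]
majority_anf = anf_coefficients(majority_table)
```

Here `majority_anf` has ones exactly at the masks 3, 5 and 6, i.e. the three quadratic monomials, so the polynomial has degree 2.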

6.11.1.1 Maximum degree estimates

We can estimate the degree of the polynomial for each bit in A[i] as follows. We estimate a common degree for the polynomials of the 64 bits occurring in word A[i]. Let $\delta_i$ denote the estimated common degree for bits in A[i]:

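$$\delta_i = 1 \quad \text{for } i = 0, 1, \ldots, n-1 \tag{6.19}$$

and

$$\delta_i = \max\bigl(\delta_{i-t_5},\ \delta_{i-t_0},\ \chi(\delta_{i-t_1}, \delta_{i-t_2}),\ \chi(\delta_{i-t_3}, \delta_{i-t_4})\bigr) \quad \text{for } i > 88,$$

where

$$\chi(d_1, d_2) = d_1 + d_2 - d_1 d_2 / nw.$$

The maximum is taken because the XOR of several polynomials generically has the degree of its highest-degree operand. (The term $d_1 d_2 / nw$ accounts for the fact that random terms of degree $d_1$ and $d_2$ should have about $d_1 d_2 / nw$ variables in common, so the degree of their product will be smaller by this amount than the sum $d_1 + d_2$ of their degrees.) We obtain the following estimate for the minimum degree of any bit computed in each round.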

According to these estimates (see Table 6.4), after 12 rounds the minimum degree should easily exceed 512, and after 20 rounds the minimum degree should almost certainly equal the maximum possible degree nw = 5696.

Similar computations can be performed when one cares about only some of the input variables (i.e., when some of the variables are fixed, and we only care about the degree in terms of the remaining variables).

rounds   minimum degree
   1          2
   2          2
   3          4
   4          8
   5         16
   6         24
   7         42
   8         66
   9        128
  10        252
  11        443
  12        665
  13       1164
  14       1819
  15       2910
  16       4268
  17       5156
  18       5565
  19       5692
  20       5696
  21       5696
  22       5696
  23       5696
  24       5696

Table 6.4: Table of estimated degrees of polynomials after a given number of rounds. MD6 is estimated to have polynomials of maximum possible degree after 20 rounds.
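The degree recurrence is easy to iterate numerically. The following Python sketch does so under assumptions that are not stated in this section: the tap positions t0, ..., t4 = 17, 18, 21, 31, 67 and c = 16 steps per round are taken to be the MD6 default values for n = 89, and the estimates entering an XOR are combined with max, since the degree of an XOR is that of its highest-degree operand. The names `chi`, `degree_estimates` and `min_degree_per_round` are illustrative. The estimates are real-valued, and how the entries of Table 6.4 were rounded is not stated, so the sketch should only be expected to track the table approximately.

```python
NW = 89 * 64          # number of input variables, nw = 5696
N, C = 89, 16         # input words n; steps per round c (assumed MD6 default)
# Tap positions: assumed MD6 defaults for n = 89, with t5 = n.
T0, T1, T2, T3, T4, T5 = 17, 18, 21, 31, 67, 89


def chi(d1, d2, nw=NW):
    """Estimated degree of the AND of two random terms of degrees d1 and d2."""
    return d1 + d2 - d1 * d2 / nw


def degree_estimates(rounds):
    """Return delta[0 .. N + rounds*C - 1], the estimated common degrees."""
    delta = [1.0] * N                      # each input bit has degree 1
    for i in range(N, N + rounds * C):
        delta.append(max(delta[i - T5],
                         delta[i - T0],
                         chi(delta[i - T1], delta[i - T2]),
                         chi(delta[i - T3], delta[i - T4])))
    return delta


def min_degree_per_round(rounds):
    """Minimum estimated degree over the C words computed in each round."""
    delta = degree_estimates(rounds)
    return [min(delta[N + r * C: N + (r + 1) * C]) for r in range(rounds)]


mins = min_degree_per_round(24)
```

Because χ(d1, d2) never exceeds nw when both arguments are at most nw, no explicit cap at 5696 is needed. The same loop can be adapted to a subset of the input variables by starting the fixed inputs at degree 0 and replacing nw by the number of free variables, which is one reading of the "similar computations" mentioned above.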


6.11.1.2 Density estimates

An algebraic attack may depend not only on the degree of the polynomials, but also on how dense they are. For example, if a polynomial is not dense, then fixing some of the variables may dramatically reduce the degree in the remaining variables. This section provides some crude estimates of the density of the MD6 polynomials.

We give an informal definition. We say that a polynomial in nw variables has “dense degree $d_1$” if the number of terms of any degree $d_2 \le d_1$ is near its expected value $\frac{1}{2}\binom{nw}{d_2}$. This definition is informal because we don’t carefully define what we mean by “near”.

We base our estimates on the following informal proposition.

Proposition 1 The product of two ANF polynomials $P_1$ and $P_2$ of dense degrees $d_1$ and $d_2$ respectively is an ANF polynomial $P_3$ of dense degree $d_3$, where

$$d_3 = \min(nw, d_1 + d_2).$$

(Here $P_1$ and $P_2$ are defined in terms of the same set of nw variables.)

The “proof” of this proposition notes that any given possible term t of degree $d_3$ can be formed in many possible ways as the product of a term $t_1$ of degree $d_1$ from $P_1$ and a term $t_2$ of degree $d_2$ from $P_2$. If we view $P_1$ and $P_2$ as randomly chosen polynomials of dense degrees $d_1$ and $d_2$ respectively, then the various such products $t_1 t_2$ will be present or absent independently, so that t will be present with probability 1/2. Thus, $P_3$ is dense with degree $d_3$.

Of course, in an actual computation the relevant polynomials $P_1$ and $P_2$ may not be “chosen randomly”, and may not be independent of each other, so this proposition is not rigorous. Nonetheless, it forms the basis for an interesting heuristic estimation of the density of the MD6 polynomials.
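The heuristic behind Proposition 1 can be probed empirically on a small number of variables. The Python sketch below draws two random polynomials of dense degree 2 in 12 variables, multiplies them over GF(2), and measures what fraction of the possible terms of each degree is present in the product. All names, the parameter choices (12 variables, degrees 2 and 2) and the fixed seed are assumptions made for illustration; this is not the computation used in the MD6 report.

```python
import random
from itertools import combinations
from math import comb


def random_dense_poly(n, d, rng):
    """Random ANF polynomial: each monomial of degree <= d is present w.p. 1/2.

    A monomial is stored as a bitmask of its variables; a polynomial is a set.
    """
    poly = set()
    for k in range(d + 1):
        for variables in combinations(range(n), k):
            if rng.random() < 0.5:
                poly.add(sum(1 << v for v in variables))
    return poly


def anf_product(p1, p2):
    """Product over GF(2): x*x = x, and equal monomials cancel in pairs."""
    result = set()
    for a in p1:
        for b in p2:
            result ^= {a | b}      # toggle the monomial a*b
    return result


def density_by_degree(poly, n, max_degree):
    """Fraction of the comb(n, k) possible degree-k terms that are present."""
    counts = [0] * (max_degree + 1)
    for mono in poly:
        counts[bin(mono).count("1")] += 1
    return [counts[k] / comb(n, k) for k in range(max_degree + 1)]


rng = random.Random(0)
n_vars, d1, d2 = 12, 2, 2
p3 = anf_product(random_dense_poly(n_vars, d1, rng),
                 random_dense_poly(n_vars, d2, rng))
densities = density_by_degree(p3, n_vars, d1 + d2)
```

If the proposition is a reasonable guide, the entries of `densities` for the higher degrees should be close to 1/2. In such a small instance the lowest degrees can deviate noticeably; the constant term, for example, is present only when both factors have a constant term.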

In this estimation, we let $\delta_i$ be our estimate of the dense degree: for each bit of A[i], we estimate that the corresponding ANF polynomial has dense degree $\delta_i$. We let $\delta_i = 0$ for all $i < n + 11c$, and set $\delta_i = 1$ for all i such that $n + 11c \le i < n + 12c$. That is, we assume that the polynomials are not of dense degree 1 until the 12-th round. This corresponds to the output seen in our program shiftopt.c, which optimized the shift amounts for MD6 by considering the density of the linear portions of the polynomials.

Table 6.5 then lists lower bounds on estimated dense degrees for words in each round 12, . . . , 31, based on the above proposition and the structure of MD6. We see that after 28 rounds we estimate that the MD6 ANF polynomials are of full dense degree (i.e., of dense degree 5696).

rounds   dense degree
 1..11        0
  12          1
  13          1
  14          2
  15          4
  16          8
  17         12
  18         21
  19         33
  20         64
  21        127
  22        227
  23        349
  24        642
  25       1079
  26       2004
  27       3877
  28       5696
  29       5696
  30       5696
  31       5696

Table 6.5: Table of estimated dense degrees of polynomials after a given number of rounds. MD6 is estimated to have polynomials of dense degree 5696 after 28 rounds.

6.11.2 Monomial Tests

Jean-Philippe Aumasson has kindly allowed us to report on some of his initial experiments using a test related to those in [27, 29, 53].

The test attempts to determine whether reduced-round versions of the MD6 compression function have a low degree over GF(2), with respect to some small subset of input variables. The experiments consider families of functions parametrized by some bits of A[54], with the other bits of A[54] treated as input variables.

The test was able to detect nonrandomness in the MD6 compression function after 18 rounds in about $2^{17}$ computations of the function.
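The general idea of such a test can be sketched on a toy function. The Python code below is only an illustration of a maximum-degree monomial test under stated assumptions: the 8-bit quadratic round function, the choice of 6 input variables and 2 parameter bits, and all names are hypothetical, and nothing here reproduces the actual experiments or their parameters. For each setting of the parameter bits, the output is XORed over all assignments of the input variables, which yields the coefficient of the monomial containing all of them. For a random function this coefficient would be 1 about half the time; if it is always 0, the function has low degree in those variables.

```python
from itertools import product


def toy_round(state):
    """One quadratic round on a bit list: s[i] ^= s[i+1] & s[i+2] (cyclic)."""
    n = len(state)
    return [state[i] ^ (state[(i + 1) % n] & state[(i + 2) % n])
            for i in range(n)]


def toy_function(bits, rounds):
    """Hypothetical reduced-round function; returns one output bit."""
    state = list(bits)
    for _ in range(rounds):
        state = toy_round(state)
    return state[0]


def top_coefficient(f, cube_size, params):
    """XOR of f over all 2**cube_size inputs, with the parameter bits fixed.

    This equals the coefficient of the monomial x_1 x_2 ... x_cube_size.
    """
    acc = 0
    for cube in product((0, 1), repeat=cube_size):
        acc ^= f(list(cube) + list(params))
    return acc


def count_ones(rounds, cube_size=6, n_bits=8):
    """Number of parameter settings whose top coefficient equals 1."""
    return sum(
        top_coefficient(lambda b: toy_function(b, rounds), cube_size, p)
        for p in product((0, 1), repeat=n_bits - cube_size)
    )
```

Each toy round at most doubles the degree, so after two rounds the degree is at most 4 and `count_ones(2)` is necessarily 0 for a 6-variable cube: the test flags the two-round function as nonrandom. For more rounds the outcome has to be observed experimentally.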

Extensions of these studies are still underway.

6.11.3 The Dinur/Shamir “Cube” Attack

At his CRYPTO 2008 keynote talk, Adi Shamir presented an interesting algebraic attack on keyed cryptosystems that is capable of key recovery, if the cryptosystem has a sufficiently simple algebraic representation. (This attack is joint work with Itai Dinur.) The cube attack is related to, but more sophisticated than, the Englund et al. maximum degree monomial test. It searches for sums over subcubes of the full Boolean input that give values for linear equations over the unknown key bits. With enough such values, the equations can be solved to yield the key.
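The core step, summing over a subcube to obtain a linear equation in the key bits, can be illustrated on a toy polynomial. The Python sketch below is a hypothetical example, not an attack on MD6: the keyed function, with three public bits and three key bits, and all names are made up for illustration. Summing over the cube spanned by public bits v0 and v1 cancels every term that does not contain both of them and leaves the linear expression k0 ⊕ k1.

```python
from itertools import product


def toy_keyed_function(v, k):
    """Hypothetical keyed polynomial over GF(2) in public bits v, key bits k."""
    v0, v1, v2 = v
    k0, k1, k2 = k
    return (v0 & v1 & (k0 ^ k1)) ^ (v0 & k2) ^ (v1 & v2) ^ (k0 & k1) ^ v2


def cube_sum(f, cube_indices, n_public, key):
    """XOR f over all assignments of the cube variables; others are set to 0."""
    acc = 0
    for assignment in product((0, 1), repeat=len(cube_indices)):
        v = [0] * n_public
        for idx, bit in zip(cube_indices, assignment):
            v[idx] = bit
        acc ^= f(v, key)
    return acc


# Each cube sum reveals one linear equation over the secret key bits.
secret_key = (1, 0, 1)
eq_k0_xor_k1 = cube_sum(toy_keyed_function, [0, 1], 3, secret_key)
eq_k2 = cube_sum(toy_keyed_function, [0], 3, secret_key)
```

For this toy function the cube {v0, v1} yields k0 ⊕ k1 and the cube {v0} yields k2, for every key. In an actual attack, the attacker does not know the polynomial and must first search for cubes whose sums behave linearly in the key, then query the keyed function to collect enough equations to solve for it.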

We have begun some initial collaborations with Itai Dinur (and Adi Shamir) to evaluate the effectiveness of the “cube attack” on MD6.

Our initial results are very preliminary and tentative. It appears that the cube attack can distinguish keyed MD6 from a random function for up to 15 rounds, and extensions of the cube attack will probably be able to extract the MD6 key for the same number of rounds. It is plausible that these results can be improved by a few rounds; experiments are ongoing.
