5.4 The Generic Case
5.4.1 Generic Cryptanalysis
This section elaborates on the impact of Assumption 2 on each phase of the cryptanalysis presented in Sect. 5.3, resulting in a generic cryptanalysis.
Setup Phase and Phase 1
The setup phase is independent of the secret randomization and hence remains the same, i.e., the eight $dTMC_i^{(1,j)}$ tables ($i = 0, 1$ and $j = 0, 1, 2, 3$) can still be made key-independent based on Lemma 4. With regard to Phase 1, the attacker is still able to construct four sets $S_l^{(i,j)}$ ($l = 0, 1, 2, 3$) as defined by (5.5), comprising leaked information for each linear input encoding $\big(L_i^{(1,j)}\big)^{-1}$ for $i = 0, 1$ and $j = 0, 1, 2, 3$; however, the function $f_l^{(i,j)}$ associated with each set $S_l^{(i,j)}$ is no longer known due to the secret randomization. Instead, the associated function can be any element of the known set

$$S_f = \left\{ f = S^{-1} \circ \otimes_{mc_1}^{-1} \circ \otimes_{mc_0} \circ S \;\middle|\; (mc_0, mc_1) \in S_{MC}^{*} \right\}$$

with

$$S_{MC}^{*} = \{(01, 02), (02, 03), (03, 01), (01, 01), (01, 03), (03, 02), (02, 01)\}.$$

The set $S_{MC}^{*}$ comprises all possible pairs formed out of the four MixColumns coefficients appearing on each row of the $4 \times 4$ matrix $MC$, i.e., out of the set
$\{01, 01, 02, 03\}$. Since two MixColumns coefficients are equal to 01 for AES encryption, we have that $|S_{MC}^{*}| = 7$ and as a result also $|S_f| = 7$.
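The counting argument above can be checked with a short script. The following is a minimal sketch, assuming the standard AES S-box and the AES field polynomial; the helper names (`gf_mul`, `gf_inv`, `make_f`, `S_MC`, `S_F`) are illustrative and do not come from the text. It enumerates the ordered coefficient pairs taken from two distinct positions of a MixColumns row and builds each candidate function as a lookup table.

```python
from itertools import permutations

def gf_mul(a: int, b: int) -> int:
    """Multiply in GF(2^8) modulo the AES polynomial x^8 + x^4 + x^3 + x + 1."""
    r = 0
    for _ in range(8):
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if a & 0x100:
            a ^= 0x11B
    return r

def gf_inv(a: int) -> int:
    """Field inverse by exhaustive search; 0 is mapped to 0 as in AES."""
    if a == 0:
        return 0
    return next(b for b in range(1, 256) if gf_mul(a, b) == 1)

def rotl8(x: int, k: int) -> int:
    """Rotate an 8-bit value left by k positions."""
    return ((x << k) | (x >> (8 - k))) & 0xFF

def sbox_entry(x: int) -> int:
    """AES S-box: field inversion followed by the affine map."""
    s = gf_inv(x)
    return s ^ rotl8(s, 1) ^ rotl8(s, 2) ^ rotl8(s, 3) ^ rotl8(s, 4) ^ 0x63

SBOX = [sbox_entry(x) for x in range(256)]
INV_SBOX = [0] * 256
for x, s in enumerate(SBOX):
    INV_SBOX[s] = x

# One row of the MixColumns matrix; ordered pairs are taken from two
# distinct positions, so (01, 01) appears but (02, 02) does not.
ROW = (0x01, 0x01, 0x02, 0x03)
S_MC = sorted(set(permutations(ROW, 2)))

def make_f(mc0: int, mc1: int) -> tuple:
    """Table of f = S^-1 o (mult by mc1)^-1 o (mult by mc0) o S."""
    c = gf_mul(gf_inv(mc1), mc0)
    return tuple(INV_SBOX[gf_mul(c, SBOX[x])] for x in range(256))

S_F = {pair: make_f(*pair) for pair in S_MC}

print(len(S_MC), len(set(S_F.values())))
```

In this sketch the pair $(01, 01)$ yields the identity function, and swapping the two coefficients of a pair yields the inverse function, which is the symmetry used later in the discussion of Table 5.2.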
Phase 2
The second phase retrieves the secret linear input encodings $\big(L_i^{(1,j)}\big)^{-1}$ for $i = 0, 1$ and $j = 0, 1, 2, 3$. Originally, this was achieved by using Algorithm 1 of Phase 2 (i.e., the algorithm for finding the desired linear equivalence $(A, B)_d$), which required as inputs one of the four sets $S_l^{(i,j)}$ ($l = 0, 1, 2, 3$) and its associated function $f_l^{(i,j)}$. However, as mentioned above, $f_l^{(i,j)}$ is unknown in the generic case and thus we need to guess $\tilde f_l^{(i,j)} \in S_f$. Now, the question remains: “can we filter out the incorrect guesses $\tilde f_l^{(i,j)} \neq f_l^{(i,j)}$ and obtain $(A, B)_d$?” This is discussed in the following, where $S^{(i,j)} = \{S_0^{(i,j)}, S_1^{(i,j)}, S_2^{(i,j)}, S_3^{(i,j)}\}$.

First, randomly select a set $S \in S^{(i,j)}$ without knowing the associated function $f$. Given that the two chosen distinct points $x_n \in S$ ($n = 1, 2$) are defined by $\big(L_i^{(1,j)}\big)^{-1}(x_n) = u_n \,\|\, f(u_n)$, Algorithm 1 finds a linear equivalence if there exist two distinct values $a_n \in \mathbb{F}_2^8 \setminus \{0\}$ for $n = 1, 2$ such that the initial guesses for $A$ become

$$A(x_n) = a_n \,\|\, \tilde f(a_n) = A_s \cdot \big(u_n \,\|\, f(u_n)\big) \quad \text{for } n = 1, 2,$$

for some guess $\tilde f \in S_f$ and for some $A_s$ (see Property 5). This problem can be reduced to the following problem statement:
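Problem Statement 2. Given $x \in \mathbb{F}_2^8 \setminus \{0\}$, does there exist a $y \in \mathbb{F}_2^8 \setminus \{0\}$ such that

$$x \,\|\, \tilde f(x) = A_s \cdot \big(y \,\|\, f(y)\big), \tag{5.8}$$

for any $A_s$ (see Property 5) and for any pair of functions $(f, \tilde f) \in S_f \times S_f$?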
Table 5.2 (left entries if applicable) lists the maximum number of $x$-values for which there exists a $y$ satisfying (5.8), for each possible $A_s$ and for any pair $(f, \tilde f)$. As a result of a certain symmetry within the set $S_f$ and $A_s$, the entries of 255 in Table 5.2 on both ‘diagonals’ can be explained as follows:

1. If $\tilde f = f$ and $A_s = I_{16}$, where $I_{16}$ denotes the 16-bit identity matrix over $\mathbb{F}_2$, then (5.8) becomes $x \,\|\, f(x) = y \,\|\, f(y)$ such that for each $x \in \mathbb{F}_2^8 \setminus \{0\}$ there exists a $y$ satisfying the equation, i.e., $y = x$. This is considered to be the trivial case; if we guess $f$ correctly, then at least the desired linear equivalence $(A, B)_d$ with $A = \big(L_i^{(1,j)}\big)^{-1}$ is given as output.

2. If $\tilde f = f^{-1}$ and $A_s = \begin{pmatrix} 0_{8 \times 8} & I_8 \\ I_8 & 0_{8 \times 8} \end{pmatrix}$, where $0_{8 \times 8}$ denotes the $8 \times 8$ zero matrix and $I_8$ denotes the 8-bit identity matrix over $\mathbb{F}_2$, then (5.8) becomes $x \,\|\, f^{-1}(x) = f(y) \,\|\, y$ such that for each $x \in \mathbb{F}_2^8 \setminus \{0\}$ there exists a $y$ satisfying the equation, i.e., $y = f^{-1}(x)$. Hence, if we guess the inverse of $f$, then at least the linear equivalence $(A, B)$ with $A = A_s \cdot \big(L_i^{(1,j)}\big)^{-1}$ is given as output, where $A_s$ is as specified above. Let us denote this specific linear equivalence by $(A, B)'_d$ in the following.

Table 5.2: For any pair of functions $(f, \tilde f) \in S_f \times S_f$, listing the maximum number of $x \in \mathbb{F}_2^8 \setminus \{0\}$ for which there exists a $y \in \mathbb{F}_2^8 \setminus \{0\}$ satisfying (5.8), taken over all possible $A_s$.

| $f$ (rows), $\tilde f$ (columns) | (01, 02) | (02, 03) | (03, 01) | (01, 03) | (03, 02) | (02, 01) |
|---|---|---|---|---|---|---|
| (01, 02) | 255 / 3 | 6 | 4 | 4 | 6 | 255 / 3 |
| (02, 03) | 4 | 255 / 3 | 4 | 4 | 255 / 3 | 4 |
| (03, 01) | 4 | 5 | 255 / 6 | 255 / 6 | 5 | 4 |
| (01, 01) | 3 | 4 | 3 | 3 | 4 | 3 |
| (01, 03) | 4 | 5 | 255 / 6 | 255 / 6 | 5 | 4 |
| (03, 02) | 4 | 255 / 3 | 4 | 4 | 255 / 3 | 4 |
| (02, 01) | 255 / 3 | 6 | 4 | 4 | 6 | 255 / 3 |
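The second case relies on the fact that the block matrix with $I_8$ on the anti-diagonal swaps the two 8-bit halves of a 16-bit vector. The sketch below illustrates this under stated assumptions: a seeded random permutation of $\{0, \dots, 255\}$ stands in for $f$ (it is not an element of $S_f$), bits are ordered most significant first, and the names `block_swap_matrix`, `concat_bits`, `mat_vec` and `swap_case_holds` are hypothetical. The check runs over all 256 byte values and only illustrates the swap identity; it does not reproduce the counts of Table 5.2.

```python
import random

def block_swap_matrix():
    """16x16 matrix over F_2 with block form [[0, I8], [I8, 0]]."""
    return [[1 if c == (r + 8) % 16 else 0 for c in range(16)]
            for r in range(16)]

def concat_bits(u: int, v: int):
    """Bit vector of u || v, most significant bit of each byte first."""
    return ([(u >> (7 - k)) & 1 for k in range(8)]
            + [(v >> (7 - k)) & 1 for k in range(8)])

def mat_vec(m, w):
    """Matrix-vector product over F_2."""
    return [sum(a & b for a, b in zip(row, w)) % 2 for row in m]

C = block_swap_matrix()

# Stand-in bijection on bytes (assumption: any bijection shows the identity).
rng = random.Random(0)
f = list(range(256))
rng.shuffle(f)
f_inv = [0] * 256
for y, fy in enumerate(f):
    f_inv[fy] = y

def swap_case_holds(x: int) -> bool:
    """Check x || f^-1(x) == C * (y || f(y)) for y = f^-1(x)."""
    y = f_inv[x]
    return concat_bits(x, f_inv[x]) == mat_vec(C, concat_bits(y, f[y]))

print(all(swap_case_holds(x) for x in range(256)))
```

Because the identity holds for any bijection, guessing $f^{-1}$ instead of $f$ always admits the choice $y = f^{-1}(x)$, which is why this case has to be handled explicitly rather than filtered out.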
Excluding the above two cases results in the right entries (if applicable) of Table 5.2. This shows that, once these two cases are excluded, there are at most six distinct $x$-values for each possible $A_s$ and for any pair $(f, \tilde f)$ for which there exists a $y$ satisfying (5.8). Observe that the grey-colored entries correspond to the cases discussed in Sect. 5.3.3 to determine the best choice of the set $S$ selected out of $S^{(i,j)}$ in order to execute Algorithm 1.

Note that in Table 5.2 the identity function $I_8$ (i.e., the function with $(mc_0, mc_1) = (01, 01)$) is left out as a possible guess for $\tilde f$. The reason for this omission is that the identity function requires additional guesses during the execution of LE, which is undesirable since it increases the work factor. This was already discussed in Sect. 5.3.3.
Generic algorithm for finding $(A, B)_d$ and $(A, B)'_d$. Here, we present a generic algorithm for finding the linear equivalences $(A, B)_d$ and $(A, B)'_d$ that eventually yield the secret linear input encoding $\big(L_i^{(1,j)}\big)^{-1}$. From Table 5.2 and the above observations it follows that if LE is repeated four times for a certain chosen set $S \in S^{(i,j)}$ and for all six guesses of $\tilde f \in S_f \setminus \{I_8\}$, the result is either:
1. no solutions, which shows that the chosen set is $S \leftrightarrow (01, 01)$. In this case we need to choose a different set $S^{*} \in S^{(i,j)}$ and repeat the whole procedure for this new set;
2. exactly two solutions, i.e., $(A, B)_d$ and $(A, B)'_d$, out of which we can easily filter out the linear input encoding $\big(L_i^{(1,j)}\big)^{-1}$ as explained below.

The reason for repeating LE four times is to exclude any additional linear equivalences other than $(A, B)_d$ and $(A, B)'_d$. From Table 5.2 it follows that such additional linear equivalences can only occur during at most three executions of LE. Note that it is only required to repeat LE four times if at least one linear equivalence is found during the first execution of LE.
Algorithm 2 gives a detailed description of the whole procedure for finding both linear equivalences $(A, B)_d$ and $(A, B)'_d$ contained within the returned set $S_{(A,B)}$. Although the attacker cannot distinguish both elements in $S_{(A,B)}$, he knows that both $A$'s of the found pairs of linear equivalences have the form

$$A_1 = \big(L_i^{(1,j)}\big)^{-1} \quad \text{and} \quad A_2 = C \cdot \big(L_i^{(1,j)}\big)^{-1} \quad \text{with} \quad C = \begin{pmatrix} 0_{8 \times 8} & I_8 \\ I_8 & 0_{8 \times 8} \end{pmatrix},$$

or vice versa. Hence, by verifying whether $A_1 \cdot A_2^{-1}$ or $A_2 \cdot A_1^{-1}$ equals $C$, the attacker is able to retrieve the secret linear input encoding $\big(L_i^{(1,j)}\big)^{-1}$.
Phase 3
After the setup phase and Phases 1-2, the attacker has retrieved all encodings $\big(L_i^{(1,j)}\big)^{-1}$ ($i = 0, 1$ and $j = 0, 1, 2, 3$) of the first round. This enables him
to extract the round key bytes of the first round as described in Sect. 5.3.4. However, due to the secret randomization, there exists an ambiguity about the order of the round key bytes. Therefore, as was done in the BGE attack, the attacker needs to extract the round key bytes of the second round as well. This can be achieved by repeating the setup phase and Phases 1-2 for the second round. Observe that the generic cryptanalysis presented above can be applied to any two consecutive rounds r and r + 1 for some value of r with 1 ≤ r ≤ 8 and is not restricted to the first two rounds.
After that, the values of the round key bytes of two consecutive rounds are known, though with an unknown order of the round key bytes associated with each subround and an unknown order of the four subrounds. Phase 4 of the improved BGE attack (Sect. 4.1.3) provides an efficient method to determine the correct order of the round key bytes and to extract the secret AES key.
Algorithm 2 Finding the linear equivalences $(A, B)_d$ and $(A, B)'_d$
Input: $S_1 = (S, S)$, $S_2 = dTMC_i^{(1,j)}$, $S^{(i,j)}$, $S_f \setminus \{I_8\}$
Output: $(A, B)_d$ and $(A, B)'_d$
 1: choose $S \in S^{(i,j)}$
 2: $S_{(A,B)} \leftarrow \emptyset$
 3: for all $\tilde f \in S_f \setminus \{I_8\}$ do
 4:     select 8 distinct points $x_n^{(i)} \in S$ with $x_n^{(i)} \neq 0$ for $n = 1, 2$ and $0 \le i \le 3$
 5:     call search-LE$\big(x_1^{(0)}, x_2^{(0)}, \tilde f, S_1, S_2\big) \to S_{LE}$
 6:     if $|S_{LE}| > 0$ then
 7:         for $i = 1$ to $3$ do
 8:             call search-LE$\big(x_1^{(i)}, x_2^{(i)}, \tilde f, S_1, S_2\big) \to S_{LE}^{*}$
 9:             $S_{LE} \leftarrow S_{LE} \cap S_{LE}^{*}$
10:         end for
11:     end if
12:     $S_{(A,B)} \leftarrow S_{(A,B)} \cup S_{LE}$
13: end for
14: if $|S_{(A,B)}| = 0$ then
15:     choose $S^{*} \in S^{(i,j)}$ with $S \neq S^{*}$
16:     repeat steps 3–13 with the set $S^{*}$
17: end if
18: return $S_{(A,B)}$
where Procedure search-LE is as specified in Algorithm 1.
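The control flow of Algorithm 2 can be sketched in Python as follows. This is only an illustration under several assumptions: `search_le` is a hypothetical callback standing in for Procedure search-LE of Algorithm 1 (its inputs $S_1$ and $S_2$ are assumed to be captured by the callback), the elements of each set are assumed to be integers, linear equivalences are assumed to be hashable values, and the sketch keeps trying further sets until one yields a result rather than trying exactly one alternative set.

```python
import random

def find_linear_equivalences(candidate_sets, guesses, search_le, rng=None):
    """Sketch of Algorithm 2.

    candidate_sets: the sets of S^(i,j), each a list of integers.
    guesses:        the six candidate functions (identity excluded).
    search_le:      hypothetical callback (x1, x2, guess, S) -> iterable
                    of linear equivalences found from that starting pair.
    """
    rng = rng or random.Random()
    for S in candidate_sets:                      # steps 1 and 14-16
        found = set()                             # step 2
        nonzero = [x for x in S if x != 0]
        for guess in guesses:                     # step 3
            pts = rng.sample(nonzero, 8)          # step 4: 8 distinct points
            pairs = [(pts[2 * k], pts[2 * k + 1]) for k in range(4)]
            s_le = set(search_le(*pairs[0], guess, S))          # step 5
            if s_le:                              # step 6
                for x1, x2 in pairs[1:]:          # steps 7-10
                    s_le &= set(search_le(x1, x2, guess, S))
            found |= s_le                         # step 12
        if found:                                 # a non-empty result ends the search
            return found
    return set()

# Toy usage with a fake callback: the first set behaves like the one
# associated with (01, 01) and never yields an equivalence.
identity_like = list(range(1, 20))
other = list(range(100, 120))

def fake_search_le(x1, x2, guess, S):
    if S is identity_like:
        return set()
    return {"A_d"} if guess == "f" else {"A_d_prime"} if guess == "f_inv" else set()

result = find_linear_equivalences(
    [identity_like, other], ["f", "f_inv", "g"], fake_search_le, random.Random(1)
)
print(sorted(result))
```

Intersecting the results of the four runs is what removes equivalences that appear for only some starting pairs, while the two expected equivalences survive every run.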
With regard to the external encodings $IN^{-1}$ and $OUT$, both bijective linear mappings on $\mathbb{F}_2^{128}$, it suffices to say that the attacker is in possession of the AES key (such that he can construct a standard AES encryption/decryption routine instantiated with the extracted key) and furthermore can observe a plain intermediate AES result that gives him access to the raw plaintext and ciphertext. This enables the attacker to determine the image of the external encodings for each $i$-th unit vector $e_i$ in $\mathbb{F}_2^{128}$.
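Since a linear mapping is fully determined by the images of the unit vectors, knowing these 128 images is enough to evaluate the encoding on any input. The sketch below illustrates only this linear-algebra step; the `oracle` argument is a hypothetical stand-in for whatever procedure gives the attacker the image of a chosen vector, and the toy oracle is a random linear map that is not guaranteed to be bijective.

```python
import random

N = 128  # dimension of the vector space F_2^128

def recover_columns(oracle, n=N):
    """Query the images of the unit vectors e_0, ..., e_{n-1}.

    Vectors are encoded as Python integers; bit i corresponds to e_i.
    """
    return [oracle(1 << i) for i in range(n)]

def apply_columns(cols, x):
    """Evaluate the linear map given by its columns on the vector x."""
    y = 0
    for i, col in enumerate(cols):
        if (x >> i) & 1:
            y ^= col
    return y

def make_toy_linear_oracle(seed=0, n=N):
    """Hypothetical stand-in for an external encoding (random linear map)."""
    rng = random.Random(seed)
    secret = [rng.getrandbits(n) for _ in range(n)]
    return lambda x: apply_columns(secret, x)

oracle = make_toy_linear_oracle()
cols = recover_columns(oracle)

rng = random.Random(42)
samples = [rng.getrandbits(N) for _ in range(100)]
print(all(apply_columns(cols, x) == oracle(x) for x in samples))
```

The recovered columns form the matrix of the mapping, so 128 queries suffice in this model.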