Optimized Linear Attack against Stream Ciphers of Pseudo Random Number Generators Using Non-linear Combiner


Regular Paper

Hidema Tanaka† and Toshinobu Kaneko††

The generator targeted by this attack consists of LFSRs (Linear Feedback Shift Registers) and a non-linear function f(·). The attack equation (AEQ) is derived from a linear approximate function F(·) of f(·) and the generator polynomials of the LFSRs. The AEQ focuses on the output sequence of one LFSR by eliminating the initial values of the other LFSRs that appear in F(·). The performance of the AEQ depends on the number of terms and the degree of the elimination polynomial. We derive an efficient algorithm for determining an optimal elimination polynomial. With this attack, the initial value of the LFSR can be determined from a number of tapped bits that is much smaller than the period of the random generator.

1. Introduction

Many random generators consisting of LFSRs (Linear Feedback Shift Registers) have been proposed, such as the Geffe generator, the Pless generator, and the multiplexer generator¹³⁾. These are used as key stream generators in stream ciphers and are evaluated by their linear complexity, mutual information, statistical distribution, etc. Many methods of attacking these generators have been proposed. The correlation attack is one of the traditional attack algorithms against stream ciphers. It derives an attack equation by focusing on the probability that the generator's output coincides with the output of one LFSR in the generator. Newer variants of the correlation attack exist, for example fast correlation attacks¹⁾,⁴⁾. While most correlation attack algorithms determine the initial value of the generator by brute-force search, fast correlation attacks and related methods use various more efficient techniques; for example, Ref. 1) describes a method using parity checks. The BAA attack, proposed by Rueppel¹⁰⁾, calculates the Walsh transform of the non-linear function of the generator to derive the best linear approximation.

The method of Golić³⁾ is an attack against combiners with memory. It transforms the generator into one that can be attacked by general methods (e.g., fast correlation attacks). The differential attack²⁾ transforms the generator into the natural sequence generator. The linear syndrome attack¹⁸⁾ regards an output sequence of the generator as the EX-OR sum of the output sequence of the LFSR and a noise sequence generated by the non-linear function of the generator. By using the majority logic decoding algorithm, the attack eliminates the noise sequence to recover the true output sequence of the LFSR. The linear consistency test¹⁹⁾ uses a deterministic equation that holds for the true initial state of the LFSR.

† Emergency Communications Group, Communications Research Laboratory

†† Faculty of Science and Technology, Science University of Tokyo

We propose an optimized linear attack against random generators that have a non-linear combiner consisting of LFSRs and a non-linear function f(·). The linear attack derives the attack equation (AEQ) from linear approximate functions F(·) of f(·). That is, it derives an attack equation by focusing on the probability that the generator's output coincides with the EX-OR sum of the outputs of some LFSRs in the generator. A correlation attack can be regarded as a linear attack that uses only a monomial linear approximate function F(·), so the linear attack is an extension of the correlation attack. The AEQ focuses on the output sequence of one LFSR by eliminating the initial values of the other LFSRs in each F(·). We use an elimination polynomial that is divisible by the product of the generator polynomials of the LFSRs to be eliminated. The performance of the AEQ depends on the number of terms and the degree of the elimination polynomial.

Any multiple of the generator polynomial can be used as an elimination polynomial. In this paper, we derive an eﬃcient algorithm for determining the optimal elimination polynomial.

In many cases, the costs of an attack (CPU time and the number of tapped bits) are estimated by statistical methods. In this paper, we present an information-theoretic method. By regarding the AEQ as a binary symmetric channel, we estimate from its channel capacity the number of tapped bits needed for a successful attack. The results of computer simulations show that the estimated number is accurate enough for a successful attack and that the number of tapped bits is much smaller than the period of the random generator.

2. Pseudo Random Number Generators Using Non-linear Combiner

A pseudo random number generator with a non-linear combiner is shown in Fig. 1. It is composed of n LFSRs, LFSR#1, ..., LFSR#n, and a non-linear function f(·). If the random generator is used as the key stream generator in a stream cipher, it must satisfy the following conditions¹³⁾.

• The periodicity must be long.

• The linear complexity must be high.

• The statistical property must be good.

• The non-linearity must be high.

• The correlation immunity must be good.

Let X(t) be the n outputs of the LFSRs,

$$X(t) = (x_1(t), x_2(t), \ldots, x_n(t)) \tag{1}$$

where x_i(t) is the output sequence of LFSR#i. These outputs X(t) are combined by the non-linear function f(·) to generate a random sequence R(t).

$$R(t) = f(X(t)) \tag{2}$$

Each LFSR#i is an L_i-stage M-sequence generator whose generator polynomial is

$$G_i(x) = 1 \oplus g_{i1}x \oplus g_{i2}x^2 \oplus \cdots \oplus x^{L_i} \tag{3}$$

where g_ij ∈ GF(2) and ⊕ denotes EX-OR. The sequence x_i(t) is generated by the following recurrence relation.

$$x_i(t) = g_{i1}x_i(t-1) \oplus g_{i2}x_i(t-2) \oplus \cdots \oplus x_i(t-L_i) \tag{4}$$

In terms of the delay operator D,

$$G_i(D)x_i(D) = 0. \tag{5}$$

Each sequence x_i(t) is determined by its initial state INI_i.

$$\mathrm{INI}_i = (x_i(L_i-1), x_i(L_i-2), \ldots, x_i(0)) \tag{6}$$

To maximize the period, the numbers of stages L_i are selected to be pairwise relatively prime. The period N of the random generator is

$$N = \prod_{i=1}^{n}\left(2^{L_i}-1\right). \tag{7}$$
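The generator of Eqs. (1) to (7) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the tap-list representation, and the use of the Geffe combining function (mentioned in Section 1) as f(·) are choices made here, not part of the paper.

```python
from functools import reduce
from operator import xor
from typing import Callable, List, Sequence, Tuple


def lfsr_sequence(taps: Sequence[int], init: Sequence[int], length: int) -> List[int]:
    """Return x(0), ..., x(length-1) for the recurrence of Eq. (4).

    `taps` lists the exponents k >= 1 whose coefficient in G(x) is 1
    (including L itself); `init` holds x(0), ..., x(L-1).
    """
    L = max(taps)
    if len(init) != L:
        raise ValueError("init must contain exactly L bits")
    bits = list(init)
    for t in range(L, length):
        # x(t) = XOR of x(t - k) over all taps k
        bits.append(reduce(xor, (bits[t - k] for k in taps)))
    return bits[:length]


def combiner_output(
    lfsrs: Sequence[Tuple[Sequence[int], Sequence[int]]],
    f: Callable[..., int],
    length: int,
) -> List[int]:
    """R(t) = f(x_1(t), ..., x_n(t)) as in Eq. (2)."""
    streams = [lfsr_sequence(taps, init, length) for taps, init in lfsrs]
    return [f(*x) for x in zip(*streams)]


def geffe(x1: int, x2: int, x3: int) -> int:
    """Geffe combining function, used here only as an example of f."""
    return (x1 & x2) ^ ((x1 ^ 1) & x3)


if __name__ == "__main__":
    # Hypothetical toy parameters: stage counts 3, 4, 5 are pairwise coprime.
    toy = [([1, 3], [1, 0, 0]), ([1, 4], [1, 0, 0, 0]), ([2, 5], [1, 0, 0, 0, 0])]
    print(combiner_output(toy, geffe, 20))
```

With these toy stage counts, Eq. (7) gives N = 7 × 15 × 31 = 3255.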

Fig. 1 A pseudo random number generator with a non-linear combiner. [Block diagram: LFSR#1, LFSR#2, ..., LFSR#n output x₁(t), x₂(t), ..., x_n(t) to f(X(t)), which outputs R(t).]

3. Outline of the Linear Attack

3.1 Conditions and Procedure

This attack assumes that the attacker knows the non-linear function f(X(t)) of the random generator and the generator polynomials of the LFSRs. The procedure is as follows.

Phase-1 Calculate the linear approximate probability P_L of each linear approximate function F (X(t)) of the non-linear function f (X(t)).

Phase-2 Derive an attack equation (AEQ) for each F(X(t)). Calculate its probability S by Matsui's piling-up lemma.

Phase-3 For each AEQ, estimate the number of tapped bits needed for a successful attack.

Phase-4 Select the AEQ whose number of tapped bits is the smallest.

Phase-5 Estimate the initial state of the LFSR by using the AEQ from the tapped sequence.

3.2 Linear Approximate Probability

To simplify the expressions, we omit the variable t in this section; for example, f(X) represents f(X(t)). Let the linear approximate function of f(X) be F(X),

$$F(X) = c_0 \oplus c_1x_1 \oplus c_2x_2 \oplus \cdots \oplus c_nx_n \tag{8}$$

where X = (x_1, x_2, ..., x_n) and c_i, x_i ∈ GF(2). The constant c_0 is set to 0 or 1 so that the linear approximate probability P_L is greater than 0.5. The linear approximate probability P_L is calculated as follows.

$$P_L = \mathrm{Prob}\{f(X) = F(X)\} \tag{9}$$

3.3 Probability S

Let $\overset{P}{=}$ denote an equality that holds with probability P. For example, a linear approximate function F(X(t)) that holds with probability P_L is written as

$$R(t) = f(X(t)) \overset{P_L}{=} F(X(t)). \tag{10}$$


Let LFSR#j be the target LFSR, which has a non-zero coefficient c_j in a linear approximate function F(X(t)). We analyze the initial state variables of the target LFSR#j from the tapped sequence. Let J be the set of indices i (i ≠ j) of the LFSR#i that have non-zero coefficients in F(X(t)). Let G_J(x) be the product (least common multiple) of their generator polynomials,

$$G_J(x) = \underset{i \in J}{\mathrm{LCM}}\{G_i(x)\}. \tag{11}$$

Let E(x) be a polynomial divisible by G_J(x),

$$G_J(x) \mid E(x). \tag{12}$$

The terms other than x_j(t) can be eliminated by applying the recurrence relation E(D), where D is the delay operator. We denote by E(D)R(t) the coefficient of D^t in E(D)R(D). Then the AEQ is derived as follows.

$$E(D)R(t) \overset{S}{=} E(D)F(X(t)) = E(D)x_j(t) \tag{13}$$

We calculate the probability S of each AEQ by using Matsui's piling-up lemma⁶⁾, which is used in the linear cryptanalysis of DES. Since the AEQ is derived from a system of M linear approximate functions, where M is the number of terms in E(x) (i.e., the weight of E(x)), the probability S is as follows.

$$S = 2^{M-1}\left(P_L - \frac{1}{2}\right)^{M} + \frac{1}{2} \tag{14}$$
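The elimination step behind Eqs. (11) to (13) can be checked numerically. The sketch below is an illustration only: polynomials over GF(2) are encoded as Python integers (bit k is the coefficient of x^k), and the toy LFSRs and helper names are assumptions made here, not the generators of the paper. It verifies that a multiple E(x) of G_J(x) annihilates the sequences of the LFSRs in J while leaving another LFSR's sequence non-zero.

```python
from functools import reduce
from operator import xor
from typing import List, Sequence


def poly_mod(a: int, g: int) -> int:
    """Remainder of a(x) divided by g(x) over GF(2)."""
    dg = g.bit_length() - 1
    while a.bit_length() - 1 >= dg:
        a ^= g << (a.bit_length() - 1 - dg)
    return a


def poly_mul(a: int, b: int) -> int:
    """Product a(x) * b(x) over GF(2)."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        b >>= 1
    return result


def lfsr_sequence(taps: Sequence[int], init: Sequence[int], length: int) -> List[int]:
    """x(t) = XOR of x(t - k) for k in taps; init holds x(0), ..., x(L-1)."""
    bits = list(init)
    for t in range(max(taps), length):
        bits.append(reduce(xor, (bits[t - k] for k in taps)))
    return bits[:length]


def apply_poly(e: int, seq: Sequence[int]) -> List[int]:
    """E(D) applied to seq: XOR of seq[t - k] over the exponents k of E, for t >= deg E."""
    exps = [k for k in range(e.bit_length()) if (e >> k) & 1]
    return [reduce(xor, (seq[t - k] for k in exps)) for t in range(exps[-1], len(seq))]


# Toy example (assumed): G_1 = x^3 + x + 1 and G_2 = x^4 + x + 1 are eliminated.
G1, G2 = 0b1011, 0b10011
G_J = poly_mul(G1, G2)
E = (1 << 105) | 1  # x^105 + 1, since lcm(7, 15) = 105

x1 = lfsr_sequence([1, 3], [1, 0, 0], 300)
x2 = lfsr_sequence([1, 4], [1, 0, 0, 0], 300)
x3 = lfsr_sequence([2, 5], [1, 0, 0, 0, 0], 300)  # plays the role of the target

divisible = poly_mod(E, G_J) == 0
eliminated = apply_poly(E, [a ^ b for a, b in zip(x1, x2)])
kept = apply_poly(E, x3)
```

Here `eliminated` consists only of zeros, while `kept` still depends on the initial state of the third LFSR, which is what allows the AEQ to isolate x_j(t).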

3.4 Number of Tapped Bits for Mounting a Successful Attack

We conjecture that the level of immunity against a linear attack can be estimated from the channel capacity of the AEQ. If we regard the right-hand side of Eq. (13) as the input and the left-hand side as the output of a communication channel, the AEQ is a binary symmetric channel with the following channel matrix.

$$T = \begin{pmatrix} S & 1-S \\ 1-S & S \end{pmatrix} \tag{15}$$

The channel capacity C of the channel T is calculated as follows.

$$C = 1 - H(S) \tag{16}$$

where H(·) is the entropy function,

$$H(S) = S\log_2\frac{1}{S} + (1-S)\log_2\frac{1}{1-S}.$$

The number of unknown variables in the AEQ is the number of stages L_j of LFSR#j. To determine the L_j unknown variables, we need L_j/C tapped bits according to the channel capacity analysis. The AEQ at time t is the EX-OR sum of tapped bits reaching back deg{E(x)} time steps. Thus the number of tapped bits T for mounting a successful attack can be estimated by information-theoretic analysis as follows.

$$T = \frac{L_j}{C} + \deg\{E(x)\} \tag{17}$$
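Eqs. (14), (16) and (17) chain together into a short calculation. The following Python sketch evaluates them; the function names are chosen here for illustration, and the handling of S = 0.5 (returning infinity) is an assumption that matches the remark in Section 4 that T becomes infinite in that case.

```python
import math


def piling_up(p_l: float, weight: int) -> float:
    """Probability S of the AEQ, Eq. (14), for an elimination polynomial of weight M."""
    return 2 ** (weight - 1) * (p_l - 0.5) ** weight + 0.5


def binary_entropy(s: float) -> float:
    """Entropy function H(S) in bits."""
    if s <= 0.0 or s >= 1.0:
        return 0.0
    return -s * math.log2(s) - (1.0 - s) * math.log2(1.0 - s)


def capacity(s: float) -> float:
    """Capacity C = 1 - H(S) of the binary symmetric channel, Eq. (16)."""
    return 1.0 - binary_entropy(s)


def tapped_bits(l_j: int, p_l: float, weight: int, degree: int) -> float:
    """Estimated number of tapped bits T = L_j / C + deg E(x), Eq. (17)."""
    c = capacity(piling_up(p_l, weight))
    return math.inf if c <= 0.0 else l_j / c + degree


if __name__ == "__main__":
    # P_L = 0.625 with a weight-2 polynomial of degree 2047 and L_j = 17,
    # i.e. the parameters listed for 3SUM in Tables 1 and 2.
    s = piling_up(0.625, 2)
    print(s, capacity(s), tapped_bits(17, 0.625, 2, 2047))
```

For P_L = 0.625, the weight-2 and weight-3 cases give S = 0.53125 and S = 0.5078125, which agree with the rounded S values reported in Table 3.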

3.5 Attack Algorithm

The attacker analyzes the AEQ by using the tapped sequence. The attack algorithm is as follows.

Stage-1 Calculate the left hand side of the AEQ using the tapped bits.

Stage-2 Calculate the right hand side of the AEQ using the assumed initial state of LFSR#j.

Stage-3 Calculate the probability with which the AEQ holds.

For the true initial state, the AEQ holds with probability S. For any false initial state, it holds with probability 0.5. Thus we can determine the true initial state INI_j from a sufficiently long tapped sequence.
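Stages 1 to 3 can be illustrated on a small generator. Everything in this sketch is a toy assumption made for illustration: the four short LFSRs, the combining function f = x₁ ⊕ x₂ ⊕ x₃x₄, the approximation F = x₁ ⊕ x₂ (P_L = 0.75), and the elimination polynomial E(x) = x⁷ + 1, which removes LFSR#1 so that LFSR#2 becomes the target. It is not the simulation reported in Section 5.

```python
from functools import reduce
from itertools import product
from operator import xor
from typing import Dict, List, Sequence, Tuple


def lfsr_sequence(taps: Sequence[int], init: Sequence[int], length: int) -> List[int]:
    """x(t) = XOR of x(t - k) for k in taps; init holds x(0), ..., x(L-1)."""
    bits = list(init)
    for t in range(max(taps), length):
        bits.append(reduce(xor, (bits[t - k] for k in taps)))
    return bits[:length]


def eliminate(seq: Sequence[int], exps: Sequence[int]) -> List[int]:
    """Apply E(D) with term exponents `exps` to seq, for t >= deg E."""
    d = max(exps)
    return [reduce(xor, (seq[t - k] for k in exps)) for t in range(d, len(seq))]


def toy_generator(length: int, target_init: Sequence[int]) -> List[int]:
    """Toy combiner R = x1 XOR x2 XOR (x3 AND x4); LFSR#2 holds the unknown state."""
    x1 = lfsr_sequence([1, 3], [1, 0, 0], length)
    x2 = lfsr_sequence([1, 4], target_init, length)
    x3 = lfsr_sequence([2, 5], [1, 0, 1, 1, 0], length)
    x4 = lfsr_sequence([1, 7], [1, 1, 0, 0, 1, 0, 1], length)
    return [a ^ b ^ (c & d) for a, b, c, d in zip(x1, x2, x3, x4)]


def recover_target_state(
    tapped: Sequence[int], target_taps: Sequence[int], exps: Sequence[int]
) -> Tuple[Tuple[int, ...], Dict[Tuple[int, ...], float]]:
    """Brute-force search over the non-zero initial states of the target LFSR."""
    lhs = eliminate(tapped, exps)  # Stage-1: left-hand side from tapped bits
    scores: Dict[Tuple[int, ...], float] = {}
    for init in product((0, 1), repeat=max(target_taps)):
        if not any(init):
            continue
        # Stage-2: right-hand side from the assumed initial state
        rhs = eliminate(lfsr_sequence(target_taps, init, len(tapped)), exps)
        # Stage-3: ratio with which the AEQ holds
        scores[init] = sum(a == b for a, b in zip(lhs, rhs)) / len(lhs)
    return max(scores, key=scores.get), scores


if __name__ == "__main__":
    tapped = toy_generator(8000, (0, 1, 1, 0))
    best, scores = recover_target_state(tapped, [1, 4], [0, 7])
    print(best, round(scores[best], 4))
```

The sketch deliberately uses more tapped bits than Eq. (17) would suggest for this toy case, so that the gap between the true state and the false states is easy to see.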

4. Optimizing the Eﬀectiveness of the Attack

S converges exponentially to 1/2 as the number of terms M in E(x) increases. For S = 0.5, the number of tapped bits T becomes infinite and the attack becomes infeasible. To avoid this difficulty, we employ an optimal elimination polynomial.

4.1 Optimal Elimination Polynomial

Let E_[j,m](x) be the minimum-degree polynomial that is divisible by G_J(x) and whose weight equals m. Let N_J be the period of G_J(x); then the following elimination polynomial of weight 2 exists.

$$E_{[j,2]}(x) = x^{N_J} + 1 \tag{18}$$

Let T_m be the number of tapped bits for mounting a successful attack with an AEQ derived from E_[j,m](x). (If no such polynomial E_[j,m](x) exists, we set T_m = ∞.)

$$T_m = \frac{L_j}{C_m} + \deg\{E_{[j,m]}(x)\} \tag{19}$$

where C_m is the channel capacity of the AEQ derived from the elimination polynomial whose weight equals m.

Let T_min be the minimum value in {T₂, T₃, ..., T_∞}. Then the optimal elimination polynomial E_j(x) for LFSR#j is as follows.

$$E_j(x) = E_{[j,\min]}(x) \tag{20}$$

T_min is calculated as follows. Its value is the number of tapped bits for the AEQ derived from the optimal elimination polynomial.

$$T_{\min} = \min_{m}\left[\frac{L_j}{C_m} + \deg\{E_{[j,m]}(x)\}\right] \quad (m = 2, \ldots, \infty) \tag{21}$$

The following discussion shows that we need not search all of T₂, T₃, ..., T_∞. Let T̃_min(m) be the local minimum of the set {T₂, T₃, ..., T_m}. The following theorem holds.

Theorem: When T̃_min(m) is given, the local minimum T̃_min(m + 1) is

$$\tilde{T}_{\min}(m+1) = \begin{cases} \tilde{T}_{\min}(m), & B_{m+1} < \deg\{E_{[j,m+1]}(x)\} \\ T_{m+1}, & B_{m+1} \ge \deg\{E_{[j,m+1]}(x)\} \end{cases} \tag{22}$$

where B_{m+1} is as follows.

$$B_{m+1} = \tilde{T}_{\min}(m) - \frac{L_j}{C_{m+1}} \tag{23}$$

[Proof]

$$\begin{aligned} \tilde{T}_{\min}(m) - T_{m+1} &= \tilde{T}_{\min}(m) - \frac{L_j}{C_{m+1}} - \deg\{E_{[j,m+1]}(x)\} \\ &= B_{m+1} - \deg\{E_{[j,m+1]}(x)\} \end{aligned} \tag{24}$$

If B_{m+1} ≥ deg{E_[j,m+1](x)}, then T̃_min(m) ≥ T_{m+1}. If B_{m+1} < deg{E_[j,m+1](x)}, then T̃_min(m) < T_{m+1}. Therefore, Eq. (22) holds.

Q.E.D.

In Eq. (23), B_{m+1} gives the limit of the search space in terms of the degree of the elimination polynomial. Since the weight of E_[j,m+1](x) is m + 1, its degree is greater than or equal to m. Thus we search for the optimal elimination polynomial under the following condition.

$$B_{m+1} \ge \deg\{E_{[j,m+1]}(x)\} \ge m \tag{25}$$

The search algorithm for the optimal elimination polynomial is as follows.

Step-1 E_[j,min](x) = E_[j,2](x). T̃_min(2) = T₂. m = 2.

Step-2 Calculate B_{m+1} from Eq. (23). If B_{m+1} < m, then go to Step-5.

Step-3 Search for the elimination polynomial E_[j,m+1](x) with deg{E_[j,m+1](x)} ≤ B_{m+1}.

Step-4 If E_[j,m+1](x) exists, then E_[j,min](x) = E_[j,m+1](x). Calculate T̃_min(m + 1). Go to Step-2 with m = m + 1.

Step-5 T_min = T̃_min(m). E_j(x) = E_[j,min](x).
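Steps 1 to 5 can be sketched as follows. This is a minimal illustration under several assumptions that are not taken from the paper: polynomials are encoded as integers (bit k is the coefficient of x^k), Step-3 is done by naive enumeration (practical only for toy sizes), and when no polynomial of weight m + 1 is found the sketch keeps the current minimum and continues with m + 1.

```python
import math
from itertools import combinations
from typing import Optional, Tuple


def poly_mod(a: int, g: int) -> int:
    """Remainder of a(x) divided by g(x) over GF(2)."""
    dg = g.bit_length() - 1
    while a.bit_length() - 1 >= dg:
        a ^= g << (a.bit_length() - 1 - dg)
    return a


def bits_for_unknowns(l_j: int, p_l: float, weight: int) -> float:
    """L_j / C_m, with S from Eq. (14) and C_m = 1 - H(S) from Eq. (16)."""
    s = 2 ** (weight - 1) * (p_l - 0.5) ** weight + 0.5
    h = 0.0 if s >= 1.0 else -s * math.log2(s) - (1 - s) * math.log2(1 - s)
    c = 1.0 - h
    return math.inf if c <= 0.0 else l_j / c


def min_degree_multiple(g: int, weight: int, max_degree: int) -> Optional[int]:
    """Lowest-degree multiple of g(x) with the given weight and degree <= max_degree.

    Only polynomials with constant term 1 are tried; a minimum-degree multiple
    always has one when g(x) itself has constant term 1.
    """
    start = max(weight - 1, g.bit_length() - 1)
    for d in range(start, max_degree + 1):
        for middle in combinations(range(1, d), weight - 2):
            e = (1 << d) | 1
            for k in middle:
                e |= 1 << k
            if poly_mod(e, g) == 0:
                return e
    return None


def search_optimal(g_j: int, period: int, l_j: int, p_l: float) -> Tuple[int, float]:
    """Return (E_j(x), T_min) following Step-1 to Step-5."""
    m = 2
    best = (1 << period) | 1                      # Step-1: E_[j,2](x) = x^{N_J} + 1
    t_best = bits_for_unknowns(l_j, p_l, 2) + period
    while True:
        cost_next = bits_for_unknowns(l_j, p_l, m + 1)
        b = t_best - cost_next                    # Step-2: B_{m+1}, Eq. (23)
        if b < m:
            break                                 # go to Step-5
        cand = min_degree_multiple(g_j, m + 1, math.floor(b))  # Step-3
        if cand is not None:                      # Step-4
            best = cand
            t_best = cost_next + (cand.bit_length() - 1)
        m += 1
    return best, t_best                           # Step-5


if __name__ == "__main__":
    # Toy case: G_J(x) = x^5 + x^2 + 1 (period 31), L_j = 4.
    print(search_optimal(0b100101, 31, 4, 0.9))
    print(search_optimal(0b100101, 31, 4, 0.75))
```

In this toy case the outcome depends on P_L: with P_L = 0.9 the weight-3 polynomial G_J(x) itself is selected, whereas with P_L = 0.75 the bound B₃ is already negative and the weight-2 polynomial x³¹ + 1 is kept.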

4.2 Optimal Attack Equation

Let I = {i₁, i₂, ..., i_K} be the set of indices that have non-zero coefficients c_{i_j} (j = 1, ..., K) in a linear approximate function F(X). For each element i_j ∈ I, there is an optimal elimination polynomial E_{i_j}(x) and a number of tapped bits T_min^{i_j}.

Let T_min^{F(X)} be the minimum number of tapped bits among the T_min^{i_j}, and let E_min^{F(X)}(x) be the corresponding optimal elimination polynomial E_{i_j}(x):

$$T_{\min}^{F(X)} = \min_{i_j \in I} T_{\min}^{i_j} \tag{26}$$

$$E_{\min}^{F(X)}(x) = E_{i_j}(x) \tag{27}$$

where i_j is the index that gives T_min^{F(X)}.

Let $\mathcal{F}$ be the set of linear approximate functions F(X) of the non-linear function f(X). By the above procedure, E_min^{F(X)}(x) is determined for each linear approximate function. Let T_opt be the minimum number of tapped bits among the T_min^{F(X)}, and let E_opt(x) be the corresponding optimal elimination polynomial E_min^{F(X)}(x).

$$T_{\mathrm{opt}} = \min_{F(X) \in \mathcal{F}} T_{\min}^{F(X)} \tag{28}$$

Let F_opt(X) be the optimal linear approximate function from which E_opt(x) originates. Thus the optimal AEQ is as follows.

$$E_{\mathrm{opt}}(D)R(t) \overset{S}{=} E_{\mathrm{opt}}(D)F_{\mathrm{opt}}(X(t)) = E_{\mathrm{opt}}(D)x_{j_{\mathrm{op}}}(t) \tag{29}$$

where j_op is the index number of the target LFSR to be attacked.

The attacker conducts the analysis using the optimal AEQ in the algorithm of Section 3.5.

5. Example Attack

We attacked a summation generator with three LFSRs⁵⁾,¹³⁾ (3SUM), one with four LFSRs (4SUM), and the revised version of the dynamic random generator¹¹⁾ (DRG-R). We assumed that these summation generators have a one-bit memory. DRG⁷⁾ has much better linear complexity and mutual information properties than other random generators of the same size. DRG-R is a revision of DRG and is secure against the correlation attack. The constituent LFSRs and periods of these random generators are shown in Table 1. We abbreviate the polynomials as follows.

$$x^m + x^n + 1 \rightarrow [m, n, 0] \tag{30}$$

Table 1 Constituent LFSRs and periods of each random generator.

| | 3SUM | 4SUM | DRG-R |
|---|---|---|---|
| LFSR#1 | [0,9,10,12,13] | [0,1,3] | [0,2,3,5,6,8,9,10,11] |
| LFSR#2 | [0,9,11] | [0,2,5] | [0,1,2,3,6,7,8,11,12] |
| LFSR#3 | [0,14,17] | [0,1,7] | [0,1,2,7,8,10,12,13,14,15,17,22,23,25,27,28,29] |
| LFSR#4 | | [0,2,11] | [0,1,2,3,4,5,6,7,8,9,11,12,13] |
| LFSR#5 | | | [0,1,2,5,6,8,9,11,12,14,15,16,17] |
| Period N | more than 2.2 × 10¹² [bits] | more than 5.6 × 10⁷ [bits] | 2.56 × 10²³ [bits] |

Table 2 Optimal linear approximate function F_opt(X(t)), linear approximate probability P_L, index number of the target LFSR, optimal elimination polynomial E_opt(x), and CPU time for optimal AEQ derivation for each random generator.

| | 3SUM | 4SUM | DRG-R |
|---|---|---|---|
| F_opt(X(t)) | x₂(t) ⊕ x₃(t) | x₁(t) ⊕ x₂(t) ⊕ x₃(t) ⊕ x₄(t) | x₄(t) ⊕ x₅(t) ⊕ 1 |
| P_L | 0.625 | 0.625 | 0.625 |
| Target LFSR | #3 | #4 | #5 |
| E_opt(x) | [2047,0] | [332,291,0] | [8191,0] |
| CPU time for optimal AEQ derivation | less than 1 [s] | 42 [s] | less than 1 [s] |

Table 3 Probability S, channel capacity C, period N, and number of tapped bits T of the AEQs of each random generator.

| | 3SUM | 4SUM | DRG-R |
|---|---|---|---|
| Probability S | 0.5313 | 0.5078 | 0.5313 |
| Channel capacity C | 2.9 × 10⁻³ | 1.8 × 10⁻⁴ | 2.9 × 10⁻³ |
| Period N | more than 2.2 × 10¹² [bits] | more than 5.6 × 10⁷ [bits] | 2.56 × 10²³ [bits] |
| Number of tapped bits T | 8,100 [bits] | 6.6 × 10⁴ [bits] | 14,024 [bits] |

The non-linear function of nSUM (n = 3, 4) is

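$$f_{n\mathrm{SUM}}(X(t)) = \left(\sum_{i=1}^{n} x_i(t) + \mathrm{Carry}(t)\right) \bmod 2$$

$$\mathrm{Carry}(t) = \left\lfloor \frac{\sum_{i=1}^{n} x_i(t-1) + \mathrm{Carry}(t-1)}{2} \right\rfloor \bmod 2. \tag{31}$$

In the summation generator, Carry(t) is the non-linear part of f_nSUM(X(t)). The non-linear function of DRG-R is

$$\begin{aligned} f_{\mathrm{DRG\text{-}R}}(X(t)) ={}& x_1(t)x_2(t)x_3(t) \oplus x_1(t)x_2(t) \oplus x_1(t)x_3(t) \oplus x_2(t)x_3(t) \\ & \oplus x_2(t) \oplus x_4(t) \oplus x_5(t) \oplus 1. \end{aligned} \tag{32}$$

We attacked these random generators by using the procedure described in Sections 3 and 4.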

A PC (Pentium3 500 MHz + 256 MB memory) was used for the computer simulation.
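For reference, the summation generator with a one-bit memory described in this section can be written as a small state machine. The sketch below is an illustration only; the function name and the initial value Carry(0) = 0 are assumptions, since the paper does not state the initial carry.

```python
from typing import List, Sequence


def summation_generator(streams: Sequence[Sequence[int]], carry0: int = 0) -> List[int]:
    """Combine n LFSR output streams with a one-bit carry memory.

    At each time step the output is (sum of inputs + carry) mod 2 and the
    next carry is floor((sum of inputs + carry) / 2) mod 2.
    """
    carry = carry0  # assumed initial memory value
    out = []
    for bits in zip(*streams):
        total = sum(bits) + carry
        out.append(total % 2)
        carry = (total // 2) % 2
    return out


if __name__ == "__main__":
    # Three constant all-ones inputs, used only to show the carry dynamics.
    print(summation_generator([[1, 1, 1, 1]] * 3))
```

With three all-ones inputs the sum alternates between 3 and 4, so the output alternates between 1 and 0.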

Table 2 shows the optimal linear approximate function F_opt(X(t)), the linear approximate probability P_L, the index number of the target LFSR, the optimal elimination polynomial E_opt(x), and the CPU time needed to derive the optimal AEQ for each random generator.
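The linear approximate probability of Eq. (9) can be obtained by exhaustive enumeration when f(·) is memoryless. As a hedged cross-check, the sketch below does this for the DRG-R function of Eq. (32) and the approximation x₄ ⊕ x₅ ⊕ 1 listed in Table 2. The function names are chosen here for illustration, and uniformly distributed inputs are assumed.

```python
from itertools import product
from typing import Callable, Sequence


def f_drg_r(x1: int, x2: int, x3: int, x4: int, x5: int) -> int:
    """Non-linear function of DRG-R, Eq. (32)."""
    return (x1 & x2 & x3) ^ (x1 & x2) ^ (x1 & x3) ^ (x2 & x3) ^ x2 ^ x4 ^ x5 ^ 1


def linear_approx_probability(
    f: Callable[..., int], n: int, coeffs: Sequence[int], c0: int
) -> float:
    """P_L = Prob{f(X) = F(X)} with F(X) = c0 XOR c1*x1 XOR ... XOR cn*xn."""
    agree = 0
    for x in product((0, 1), repeat=n):
        lin = c0
        for c, xi in zip(coeffs, x):
            lin ^= c & xi
        agree += int(f(*x) == lin)
    return agree / 2 ** n


if __name__ == "__main__":
    # F(X) = x4 XOR x5 XOR 1
    print(linear_approx_probability(f_drg_r, 5, (0, 0, 0, 1, 1), 1))  # 0.625
```

The enumeration gives 20 agreements out of 32 inputs, i.e. P_L = 0.625, matching the DRG-R entry of Table 2. The same routine does not apply directly to the summation generators, whose output also depends on the carry memory.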

Table 3 shows the probability S of the AEQ, the channel capacity C, and the number of tapped bits T needed for a successful attack. These results show that each random generator can be attacked by using a tapped sequence much shorter than the generator's period.

Table 4 shows the probabilities calculated from the results of the computer simulations. Owing to limited space, only part of the simulation results is shown. The "True" row shows the result for the true initial state, and the "False-i" rows show some of the results for false states. Each probability is the ratio with which the AEQ holds. The "Cpu time for attack" row shows the CPU time taken to determine the initial state.

These results show that the estimation using the channel capacity is accurate enough for the attack to be successful.

Table 4 Results of the computer simulation. Entries in the True and False-i rows are the probabilities with which the AEQ holds.

| Initial value | 3SUM | 4SUM | DRG-R |
|---|---|---|---|
| True | 0.5093 | 0.5052 | 0.5419 |
| False-1 | 0.4974 | 0.5010 | 0.4981 |
| False-2 | 0.4956 | 0.5007 | 0.5098 |
| False-3 | 0.5061 | 0.4987 | 0.4969 |
| Number of used tapped bits (= T) | 8,100 [bits] | 6.6 × 10⁴ [bits] | 14,024 [bits] |
| Cpu time for attack | 28 [min] | 7 [min] | 28 [min] |

6. Conclusion

We proposed an optimized linear attack against stream ciphers employing pseudo random number generators with a non-linear combiner. The simulated attacks of Section 5 show that each random generator can be attacked using a number of tapped bits that is much smaller than the generator's period.

This linear attack might be combined with Golić's method³⁾ to make a widely applicable attack algorithm.

References

1) Chose, P., Joux, A. and Mitton, M.: Fast Correlation Attacks: An Algorithmic Point of View, Advances in Cryptology - Eurocrypt 2002, LNCS, Vol.2332, pp.209–221, Springer-Verlag, Berlin (2002).

2) Ding, C.: The differential cryptanalysis and design of natural stream ciphers, LNCS, Vol.809, pp.101–115, Springer-Verlag, Berlin (1991).

3) Golić, J.: Linear cryptanalysis of stream ciphers, LNCS, Vol.1008, pp.154–169, Springer-Verlag, Berlin (1995).

4) Johansson, T. and Jönsson, F.: Fast Correlation Attacks through reconstruction of Linear Polynomials, Advances in Cryptology - Crypto 2000, LNCS, Vol.1880, pp.300–315, Springer-Verlag, Berlin (2000).

5) Matsuzaki, N., Ohmori, M. and Tatebayashi, M.: A study on stream ciphers suitable for conditional access to digital broadcasting system, ISEC95-6 (1995).

6) Matsui, M.: Linear cryptanalysis of DES cipher (I), SCIS93-3C (1993).

7) Moriyasu, T., Morii, M. and Kasahara, M.: Nonlinear pseudo-random sequence generator with dynamic structure and its properties, ISEC93-7 (1993).

8) Ohisi, T., Tanaka, H. and Kaneko, T.: A linear attack to the summation generator, SITA96, pp.397–400 (1996).

9) Rueppel, R.: Correlation immunity and the summation generator, LNCS, Vol.218, Springer-Verlag.

10) Rueppel, R.: Design and analysis of stream ciphers, Springer-Verlag (1986).

11) Shiraishi, Y. and Morii, M.: Some notes on the non-linear combiner generator and that against a linear attack, ISEC96-3 (1996).

12) Siegenthaler, T.: Decrypting a class of ciphers using ciphertext only, IEEE C-34, pp.81–85 (Jan. 1985).

13) Schneier, B.: Applied Cryptography, Wiley (1994).

14) Tanaka, H. and Kaneko, T.: A linear attack to the random generator by non linear combiner, Transaction of the Institute of Electronics, Information and Communication Engineers, Vol.J79, A, No.8, pp.1360–1368 (1996).

15) Tanaka, H. and Kaneko, T.: A linear attack to the random generator by non linear combiner, ISITA96 (1996).

16) Tanaka, H. and Kaneko, T.: A study on a quadratic approximation attack to the reformed dynamic random generator, ISEC96-44 (1990).

17) Tanaka, H., Ohishi, T. and Kaneko, T.: A linear attack to the random generator by non linear combiner (2), SCIS97-32A (1997).

18) Zeng, K. and Huang, M.: On the linear syndrome method in cryptanalysis, LNCS, Vol.403, pp.469–478, Springer-Verlag, Berlin (1988).

19) Zeng, K., Yang, C. and Rao, T.: On the linear consistency test in cryptanalysis and its applications, LNCS, Vol.435, pp.164–174, Springer-Verlag, Berlin (1989).

(Received December 3, 2002)
(Accepted June 3, 2003)

Hidema Tanaka received the B.E., M.E., and Ph.D. degrees, all in Electrical Engineering, from Science University of Tokyo in 1995, 1997, and 2000, respectively. In 2000, he joined the Faculty of Science and Technology, Science University of Tokyo. In 2002, he joined the Emergency Communications Group, Communications Research Laboratory, as a researcher. He was engaged in research in the fields of cryptology and information security.

Toshinobu Kaneko received the B.E., M.E., and Ph.D. degrees, all in Electrical Engineering, from The University of Tokyo in 1971, 1973, and 1976, respectively. In 1976, he joined the Faculty of Science and Technology, Tokyo University of Science, and since then, as a faculty member, he was engaged in education and research in the fields of coding theory and information security. Currently, he is a Professor of the Department of Electronics Engineering of the university.