Mixing properties of triangular feedback shift registers


BERND SCHOMBURG*

Abstract. The purpose of this note is to show that Markov chains induced by non-singular triangular feedback shift registers and non-degenerate sources are rapidly mixing. The results may directly be applied to the post-processing of random generators and to stream ciphers in CFB mode.

Key words. Feedback shift registers, Markov chains, stochastic matrices, rapid mixing

AMS subject classifications. 94A55, 60J10, 15B51.

1. Introduction. Let $\mathcal{S} = (K_t)_{t \ge 0}$ be a (memoryless) source, i.e. a sequence of independent, identically distributed random variables. Informally, a source is an object that emits symbols (in general from a finite alphabet) according to some random mechanism. This could be, for example, a physical random generator or, in first-order approximation, a natural alphabet-based (plain) text. The objective of this paper is to investigate the Markov behaviour of a class of feedback shift registers when the input symbols are modified by the random source. To model the influence of the random source we will assume that the underlying alphabet carries a group structure. We will show that for non-degenerate sources and non-singular triangular feedback shift registers, the associated Markov chains are rapidly mixing. We will also give estimates for the mixing (convergence) rate using the Dobrushin coefficients of the associated stochastic matrices.

2. Feedback shift registers and associated Markov chains. In this section we introduce the class of triangular feedback shift registers and derive some of their basic algebraic properties. Furthermore we will define Markov chains associated with feedback shift registers.

Let $\Sigma$ be a finite alphabet. A feedback shift register (FSR) of length $N$ over $\Sigma$ is a finite automaton $(X, \varphi, \psi)$ with state space $X = \Sigma^N$, a transition function $\varphi : X \to X$ which is of the form

$$\varphi = s \circ \varphi_0, \qquad \varphi_0 : X \to X$$

(with the cyclic shift $s : X \to X$, $(x_1, \dots, x_N) = (x', x_N) \mapsto (x_N, x_1, \dots, x_{N-1}) = (x_N, x')$) and the output function

$$\psi = \mathrm{pr}_1 \circ \varphi = \mathrm{pr}_N \circ \varphi_0 : X \to \Sigma.$$

*P.O. Box 1644, 53734 Sankt Augustin, Germany ([email protected]).

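Definition 2.1. Let $\mathcal{R} = (X, \varphi, \psi)$ be an FSR.

(i) $\mathcal{R}$ is called non-singular iff $\varphi$ (or equivalently $\varphi_0$) is bijective.

(ii) An FSR $(X, \varphi, \psi)$ is called triangular iff $\varphi_0$ has the form

$$\varphi_0(x_1, \dots, x_N) = (f_1(x_1), f_2(x_1, x_2), \dots, f_N(x_1, \dots, x_N)) \tag{2.1}$$

with mappings $f_j : \Sigma^j \to \Sigma$, $j = 1, \dots, N$.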

Remark 2.2. For a family $(f_j : \Sigma^j \to \Sigma)_{1 \le j \le N}$ define $F_j : \Sigma^j \to \Sigma^j$, $j \in \{1, \dots, N\}$, by

$$F_j(x_1, \dots, x_j) = (f_1(x_1), f_2(x_1, x_2), \dots, f_j(x_1, \dots, x_j)).$$

If $F_k$ is bijective, then all $F_j$, $j < k$, are bijective, too. To simplify notation, we will write $f$ for $f_N$ and $F$ for $F_{N-1}$.

Remark 2.3. (i) Let $(\Sigma, *)$ be a group, $h_j : \Sigma^{j-1} \to \Sigma$, $j \ge 2$, mappings and $h_1 \in \Sigma$. The family $(f_j)_{1 \le j \le N}$, defined by

$$f_j(x_1, \dots, x_j) = x_j * h_j(x_1, \dots, x_{j-1}) \quad \forall (x_1, \dots, x_j) \in \Sigma^j,\ 1 \le j \le N,$$

induces a non-singular triangular FSR via (2.1).

(ii) For $\Sigma = \mathbb{F}_2$ the converse is also true: For a non-singular triangular shift register the functions $f_j$ are necessarily of the form

$$f_j(x_1, \dots, x_j) = x_j + h_j(x_1, \dots, x_{j-1}),$$

where $+$ denotes the addition (modulo 2) in $\mathbb{F}_2$.
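The construction of Remark 2.3 (i) can be tried out on a small example. The following is a minimal sketch, assuming the alphabet is the additive group of integers modulo `M`; the values of `M` and `N` and the feedback terms in `h` are hypothetical choices made only for illustration. It builds the triangular map from (2.1), composes it with the cyclic shift, and checks non-singularity by brute force.

```python
from itertools import product

M = 3   # alphabet Z_3 with addition mod 3 (illustrative choice)
N = 3   # register length (illustrative choice)

def h(prefix):
    """Hypothetical feedback term h_j(x_1, ..., x_{j-1}); h_1 is the constant 1."""
    return (sum(prefix) + len(prefix) * prefix[-1] ** 2) % M if prefix else 1

def phi0(x):
    """Triangular map as in (2.1) with f_j(x_1..x_j) = x_j + h_j(x_1..x_{j-1}) mod M."""
    return tuple((x[j] + h(x[:j])) % M for j in range(len(x)))

def shift(x):
    """Cyclic shift s: (x_1, ..., x_N) -> (x_N, x_1, ..., x_{N-1})."""
    return (x[-1],) + x[:-1]

def phi(x):
    """Transition function: the shift s applied after phi0."""
    return shift(phi0(x))

states = list(product(range(M), repeat=N))
# phi is bijective iff its image covers all |Sigma|^N states.
is_nonsingular = len({phi(x) for x in states}) == len(states)
print(is_nonsingular)
```

Replacing `h` by any other function of the preceding components should leave the check unchanged, since each `f_j` remains a translation in its last argument.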

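Let $(X, \varphi, \psi)$ be an FSR. We will investigate the state sequence $\xi = (\xi_j)_{j \ge 0} \in X^{\mathbb{N}_0}$,

$$\xi_j = \varphi^j x, \quad j \ge 0, \tag{2.2}$$

(belonging to the initial state $x \in X$) and the corresponding output sequence $\alpha = (\alpha_j)_{j \ge 0} \in \Sigma^{\mathbb{N}_0}$,

$$\alpha_j = \psi(\xi_j) = (\psi \circ \varphi^j) x, \quad j \ge 0. \tag{2.3}$$

To this end we introduce the mappings

$$s_a : X \to X, \quad (x_1, \dots, x_N) \mapsto (a, x_1, \dots, x_{N-1}), \quad a \in \Sigma. \tag{2.4}$$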

The following lemma will be our basic tool.

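Lemma 2.4. Let $(X, \varphi, \psi)$ be triangular. For $x \in X$ let $\xi$ and $\alpha$ be defined as in (2.2) and (2.3), respectively. If $y \in X$ is an arbitrary state and if we define inductively

$$\eta_0 = y, \quad \eta_{j+1} = s_{\alpha_j} \varphi_0(\eta_j), \quad j \ge 0, \tag{2.5}$$

then for each $j \in \{1, \dots, N\}$ the first $j$ components of $\xi_j$ and $\eta_j$ coincide:

$$\mathrm{pr}_k(\xi_j) = \mathrm{pr}_k(\eta_j) \quad \forall k \in \{1, \dots, j\}.$$

Especially we have

$$\varphi^N x = \xi_N = \eta_N = ((s_{\alpha_{N-1}} \varphi_0) \circ \dots \circ (s_{\alpha_0} \varphi_0))\, y. \tag{2.6}$$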

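Proof (by induction w.r.t. $j$).

(a) $j = 1$. By (2.2), (2.3) and (2.4) we have $\mathrm{pr}_1(\xi_1) = \alpha_0 = \mathrm{pr}_1(\eta_1)$.

(b) $j \to j + 1$. Let $j \in \{1, \dots, N-1\}$ and

$$\mathrm{pr}_k(\xi_j) = \mathrm{pr}_k(\eta_j) \quad \forall k \in \{1, \dots, j\}.$$

By (2.1) we have

$$\mathrm{pr}_k(\varphi_0 \xi_j) = \mathrm{pr}_k(\varphi_0 \eta_j) \quad \forall k \in \{1, \dots, j\}$$

and thus

$$\mathrm{pr}_k(\xi_{j+1}) = \mathrm{pr}_k(\eta_{j+1}) \quad \forall k \in \{2, \dots, j+1\}.$$

Finally, Definition (2.3) of $\alpha_j$ implies

$$\mathrm{pr}_1(\xi_{j+1}) = \alpha_j = \mathrm{pr}_1(\eta_{j+1})$$

and the assertion follows.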
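The statement of Lemma 2.4 lends itself to an exhaustive check on a small register. The sketch below assumes the alphabet is the integers modulo `M`; the triangular map in `phi0` is a hypothetical example and is not taken from the text. For every initial state `x` and every auxiliary state `y` it runs both recursions side by side and verifies that the first `j` components agree after `j` steps, so that the two states coincide after `N` steps.

```python
from itertools import product

M, N = 3, 3  # illustrative alphabet size (Z_3) and register length

def phi0(x):
    # Hypothetical triangular map: f_j depends only on x_1, ..., x_j.
    return tuple((x[j] + sum(x[:j]) ** 2 + 1) % M for j in range(N))

def phi(x):
    y = phi0(x)
    return (y[-1],) + y[:-1]           # cyclic shift s applied to phi0(x)

def psi(x):
    return phi0(x)[-1]                 # output pr_N(phi0(x)) = pr_1(phi(x))

def s_a(a, x):
    return (a,) + x[:-1]               # the map s_a introduced before Lemma 2.4

def check_lemma(x, y):
    xi, eta = x, y
    for j in range(N):
        alpha_j = psi(xi)              # alpha_j = psi(xi_j)
        xi = phi(xi)                   # xi_{j+1}
        eta = s_a(alpha_j, phi0(eta))  # eta_{j+1} = s_{alpha_j} phi0(eta_j)
        if xi[: j + 1] != eta[: j + 1]:
            return False               # first j+1 components must coincide
    return xi == eta                   # phi^N x = eta_N

states = list(product(range(M), repeat=N))
lemma_holds = all(check_lemma(x, y) for x in states for y in states)
print(lemma_holds)
```

Note that `phi0` need not be bijective here; the lemma only uses the triangular shape.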

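Remark 2.5. As a first application of Lemma 2.4 we show that for a non-singular triangular FSR $\mathcal{R} = (X, \varphi, \psi)$ and an initial state $x \in X$ the sequences $\xi$ and $\alpha$ have the same period. Recall that for a sequence $g = (g_i)_{i \in \mathbb{N}_0}$ in a set $D$ its period $\mathrm{per}(g) \in \mathbb{N} \cup \{\infty\}$ is defined by

$$\mathrm{per}(g) = \inf\{\, l \in \mathbb{N} : g_i = g_{i+l}\ \forall i \in \mathbb{N}_0 \,\}, \qquad \inf \emptyset = \infty.$$

If the period $\mathrm{per}(g)$ is finite, it divides each $l$ with $g_i = g_{i+l}\ \forall i \in \mathbb{N}_0$.

Since $\mathcal{R}$ is non-singular, $\mathrm{per}(\xi)$ is equal to the (finite) length of the orbit of $x$ under $\langle \varphi \rangle$ (thus dividing the order of $\varphi$). Furthermore, trivially

$$l := \mathrm{per}(\alpha) \mid \mathrm{per}(\xi).$$

(2.6), applied to the initial states $x$ and $\varphi^l x$ (whose output sequences $(\alpha_j)$ and $(\alpha_{j+l})$ coincide) with the same $y$, now implies

$$\varphi^{N+l} x = \varphi^N x$$

and therefore (using the bijectivity of $\varphi$) $\varphi^l x = x$ and $\xi_{j+l} = \xi_j\ \forall j \ge 0$, i.e. $\mathrm{per}(\xi) \mid l = \mathrm{per}(\alpha)$, which finally shows that $\mathrm{per}(\xi) = \mathrm{per}(\alpha)$.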

For the remaining part of the paper we assume that $(\Sigma, *)$ is a group.

Our plan is to define stochastic processes associated with FSRs. First fix some notation. For a (non-empty) finite set $D$ let $\mathcal{M}_1(D)$ denote the set of all measures on $D$ with mass 1 (also referred to as (probability) distributions on $D$). The uniform distribution on $D$ will be denoted by $\lambda_D$, i.e. $\lambda_D(\{\omega\}) = \frac{1}{|D|}$ for all $\omega \in D$.

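Definition and Remark 2.6. We define stochastic processes associated with an FSR $\mathcal{R} = (X, \varphi, \psi)$ of length $N$ over $\Sigma$. Let $\mathcal{S} = (K_t)_{t \ge 0}$ be a memoryless source, i.e. a sequence of independent, identically distributed (i.i.d.) random variables over $\Sigma$ with (common) distribution $\kappa \in \mathcal{M}_1(\Sigma)$. Fix an arbitrary random variable $Z_0$ on $X$ and let the stochastic process $Z = (Z_t)_{t \ge 0}$ be recursively defined by

$$Z_t = (S_{K_{t-1}} \circ \varphi) Z_{t-1}, \quad t \ge 1, \tag{2.7}$$

where $S_a : X \to X$, $a \in \Sigma$, is given by

$$S_a(x_1, \dots, x_N) = (a * x_1, x_2, \dots, x_N) \quad \forall (x_1, \dots, x_N) \in X. \tag{2.8}$$

By construction $Z$ is a homogeneous Markov chain over $X$ with transition matrix $P = P(\varphi, \kappa) = (P_{xy})_{x, y \in X}$:

$$P_{xy} = \Pr(Z_t = y \mid Z_{t-1} = x) = \begin{cases} \kappa(a), & \text{if } (S_a \varphi)(x) = y, \\ 0, & \text{else.} \end{cases} \tag{2.9}$$

We will call $Z$ the Markov chain associated with $\mathcal{R}$, $\mathcal{S}$ and $Z_0$. Note that $P \in \mathbb{R}^{X \times X}$ is stochastic (cf. Definition 3.1) and depends on $\mathcal{R}$ and $\kappa$ only. Furthermore, if $\mathcal{R}$ is non-singular, $P$ is doubly-stochastic (cf. Definition 3.2 (i)); indeed, since $\varphi$ is bijective, we have

$$\sum_{x \in X} P_{xy} = \sum_{\bar{x} \in X} \begin{cases} \kappa(a), & \text{if } S_a \bar{x} = y, \\ 0, & \text{else} \end{cases} = \sum_{a \in \Sigma} \kappa(a) = 1$$

for all $y \in X$.

If $\mu$ is the distribution of $Z_0$, $\mu P^j$ is the distribution of $Z_j$.

In view of (2.9), mappings of the form

$$\Phi_a = (S_{a_{j-1}} \varphi) \circ \dots \circ (S_{a_0} \varphi), \quad a = (a_0, \dots, a_{j-1}) \in \Sigma^j,\ j \in \mathbb{N},$$

will play an important role in the investigation of the powers of $P$.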
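To make the transition matrix of the associated Markov chain concrete, here is a minimal sketch for a binary register of length 3. The feedback functions in `phi` and the source distribution `kappa` are hypothetical choices, and exact rational arithmetic is used so that the row and column sums can be compared with 1 without rounding. Each state is mapped by the register and then the source symbol is added to the first component, as in the recursion defining the chain.

```python
from fractions import Fraction
from itertools import product

M, N = 2, 3                                  # illustrative: alphabet Z_2, length 3
kappa = [Fraction(3, 4), Fraction(1, 4)]     # hypothetical source distribution

def phi(x):
    # Hypothetical non-singular triangular FSR: f_j = x_j + x_{j-1} * x_{j-2}.
    y = tuple((x[j] + (x[j - 1] * x[j - 2] if j >= 2 else 0)) % M for j in range(N))
    return (y[-1],) + y[:-1]                 # cyclic shift

states = list(product(range(M), repeat=N))
index = {x: i for i, x in enumerate(states)}

def transition_matrix():
    P = [[Fraction(0)] * len(states) for _ in states]
    for x in states:
        z = phi(x)
        for a in range(M):
            y = ((a + z[0]) % M,) + z[1:]    # S_a adds a to the first component
            P[index[x]][index[y]] += kappa[a]
    return P

P = transition_matrix()
row_sums = [sum(row) for row in P]
col_sums = [sum(P[i][k] for i in range(len(states))) for k in range(len(states))]
print(row_sums, col_sums)
```

Because the example register is non-singular, the column sums should equal 1 as well, in line with the doubly-stochastic property noted above.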

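Theorem 2.7. Let $(X, \varphi, \psi)$ be a triangular FSR and $F$ be defined as in Remark 2.2.

(i) For bijective $F$, the semi-group $\langle S_a \varphi \mid a \in \Sigma \rangle$, generated by all compositions $S_a \varphi$, acts transitively on $X$; more precisely, for each pair $x, y \in X$ there exists a uniquely determined $a = (a_0, \dots, a_{N-1}) \in \Sigma^N = X$ with

$$x = \Phi_a y = ((S_{a_{N-1}} \varphi) \circ \dots \circ (S_{a_0} \varphi))\, y. \tag{2.10}$$

(ii) If $y \in X$ and $1 \le j \le N$, then

$$\Sigma^j \ni a \mapsto \Phi_a y \in X$$

is one-to-one.

(iii) If $F$ is not bijective, then for all $x \in \Sigma \times (\Sigma^{N-1} \setminus F(\Sigma^{N-1}))$, $y \in X$ and $a \in \Sigma^j$, $j \in \mathbb{N}$:

$$x \ne \Phi_a y.$$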

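Proof. (i) Let $\tilde\varphi, \tilde\varphi_0 : X \to X$ denote the bijections which are defined by $(x', x_N) \mapsto (x_N, F x')$ and $(x', x_N) \mapsto (F x', x_N)$, respectively, and consider the corresponding register $(X, \tilde\varphi, \mathrm{pr}_N)$. Let $x, y \in X$ and $(\alpha_j)_{j \ge 0}$ be the output sequence which corresponds to the initial state $\tilde\varphi^{-N} x$:

$$\alpha_j = \mathrm{pr}_N(\tilde\varphi^{\,j-N} x). \tag{2.11}$$

Define $(\eta_j)$ recursively by $\eta_0 = y$ and

$$\eta_{j+1} = (S_{a_j} \varphi)(\eta_j), \quad a_j = \alpha_j * f(\eta_j)^{-1}, \quad j \ge 0. \tag{2.12}$$

Then

$$\eta_{j+1} = (a_j * f(\eta_j), F \eta_j') = (\alpha_j, F \eta_j') = (s_{\alpha_j} \tilde\varphi_0)\, \eta_j.$$

Lemma 2.4, applied to $(X, \tilde\varphi, \mathrm{pr}_N)$, shows

$$x = \tilde\varphi^N(\tilde\varphi^{-N} x) = \eta_N = \Phi_a y, \quad a = (a_0, \dots, a_{N-1}).$$

In particular, we have shown that the mapping

$$X = \Sigma^N \ni a \mapsto \Phi_a y \in X$$

is onto. Since $X$ is finite, the mapping is also one-to-one, which proves that the representation in (2.10) is unique.

(ii) Set $A_j = \{\Phi_a y \mid a \in \Sigma^j\}$ and note that the assertion is equivalent to

$$|A_j| = |\Sigma|^j. \tag{2.13}$$

For the proof of (2.13) we use induction w.r.t. $j$: For $j = N$, (2.13) is a consequence of (2.10). Now assume that (2.13) is true for a $j > 1$. Then $|A_{j-1}| \le |\Sigma|^{j-1}$. Furthermore,

$$\Sigma \times A_{j-1} \ni (a, x) \mapsto (a * f(x), F x') \in A_j$$

is onto, thus

$$|\Sigma| \cdot |A_{j-1}| \ge |A_j| = |\Sigma|^j, \quad \text{i.e. } |A_{j-1}| \ge |\Sigma|^{j-1}.$$

(iii) Clear from the definition of $\Phi_a$.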
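Part (i) of Theorem 2.7 can be illustrated by brute force. The sketch below is an assumption-laden toy case: the alphabet is the integers modulo 3, the first `N - 1` feedback functions are translations (so the map `F` is bijective), and the last feedback function is deliberately chosen not to depend on the last component, so the register itself is singular. For every start state `y` the code enumerates all input words of length `N` and checks that the composite maps reach every state exactly once.

```python
from itertools import product

M, N = 3, 3  # illustrative alphabet size (Z_3) and register length

def phi(x):
    # Hypothetical triangular FSR: f_1, ..., f_{N-1} are translations (F bijective),
    # while f_N = x_1 * x_2 ignores x_N, so the register is singular.
    head = tuple((x[j] + 2 * sum(x[:j])) % M for j in range(N - 1))
    y = head + ((x[0] * x[1]) % M,)
    return (y[-1],) + y[:-1]               # cyclic shift

def big_phi(a, y):
    # Composite map: apply (S_{a_i} phi) for i = 0, 1, ..., len(a) - 1 in turn.
    for a_i in a:
        z = phi(y)
        y = ((a_i + z[0]) % M,) + z[1:]    # S_{a_i} adds a_i to the first component
    return y

states = list(product(range(M), repeat=N))  # lexicographically sorted
# For each y the map a -> big_phi(a, y) should be a bijection of Sigma^N onto X.
transitive = all(
    sorted(big_phi(a, y) for a in states) == states for y in states
)
print(transitive)
```

The example suggests why only the bijectivity of `F`, and not of the whole register, matters for this statement: the first component is always overwritten by a freely chosen input symbol.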

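Remark 2.8. (i) The group structure $*$ on $\Sigma$ induces a natural group structure $\hat{*}$ on $X = \Sigma^N$. If $\varphi : X \to X$ is a group homomorphism w.r.t. $\hat{*}$, then there is a group homomorphism $\chi : X \to X$ s.t.

$$\Phi_a x = \chi(a)\, \hat{*}\, \varphi^N x \quad \forall a, x \in X.$$

For bijective $F$ the homomorphism $\chi$ is bijective by Theorem 2.7 (i).

(ii) Let $F$ be bijective. Theorem 2.7 (ii) shows that for all $j \le N$

$$(P^j)_{xy} = \begin{cases} \prod_{i<j} \kappa(a_i), & \text{if } \Phi_a(x) = y,\ a = (a_0, \dots, a_{j-1}) \in \Sigma^j, \\ 0, & \text{else.} \end{cases}$$

Especially, if we define the product distribution $\hat\kappa \in \mathcal{M}_1(X)$ by

$$\hat\kappa(a) = \prod_{i=0}^{N-1} \kappa(a_i) \quad \forall a = (a_0, \dots, a_{N-1}) \in X, \tag{2.14}$$

then

$$(P^N)_{xy} = \hat\kappa(a) \quad \text{if } \Phi_a(x) = y,\ a \in X. \tag{2.15}$$

(iii) For traditional FSRs we have $f_i = \mathrm{pr}_i$ for $1 \le i < N$, i.e. $F = \mathrm{id}_{\Sigma^{N-1}}$, so the prerequisites of Theorem 2.7 (i) are automatically fulfilled. As an application of the proof of 2.7 consider the case that

$$\varphi(x', x_N) = (c, x') \quad \forall (x', x_N) \in X$$

for a constant $c \in \Sigma$. The construction (2.11)-(2.12) shows that for given $x, y$ the uniquely determined $a$ with $\Phi_a(x) = y$ is given by

$$a_j = y_{N-j} * c^{-1}.$$

Thus $P$ has the limiting distribution $\pi$,

$$\pi = \hat\kappa(\,\cdot\ \hat{*}\ (c, \dots, c)^{-1}) \tag{2.16}$$

with $\mu P^j = \pi$ for all $\mu \in \mathcal{M}_1(X)$, $j \ge N$. Furthermore $\pi \ne \lambda_X$ if $\kappa \ne \lambda_\Sigma$.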

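Proposition 2.9. Let $F_k$ be bijective for a $k < N$ and let $y, \bar{y} \in X$ s.t. $y_1 \ne \bar{y}_1$. Then $\Phi_a y \ne \Phi_{\bar{a}} \bar{y}$ for all $j \in \{1, \dots, k\}$ and $a, \bar{a} \in \Sigma^j$.

This follows immediately from Remark 2.2 and the following

Lemma 2.10. Let $k \in \{2, \dots, N\}$ s.t. $F_{k-1}$ is bijective. If $y, \bar{y} \in X$, $j \in \mathbb{N}$, and $(a_0, \dots, a_{j-1}), (\bar{a}_0, \dots, \bar{a}_{j-1}) \in \Sigma^j$ with

$$\mathrm{pr}_i(\Phi_{(a_0, \dots, a_{j-1})} y) = \mathrm{pr}_i(\Phi_{(\bar{a}_0, \dots, \bar{a}_{j-1})} \bar{y}) \quad \forall i \in \{1, \dots, k\},$$

then

$$\mathrm{pr}_i(\Phi_{(a_0, \dots, a_{j-2})} y) = \mathrm{pr}_i(\Phi_{(\bar{a}_0, \dots, \bar{a}_{j-2})} \bar{y}) \quad \forall i \in \{1, \dots, k-1\}.$$

Proof. Let $x = \Phi_{(a_0, \dots, a_{j-2})} y$ and $\bar{x} = \Phi_{(\bar{a}_0, \dots, \bar{a}_{j-2})} \bar{y}$. Then

$$\Phi_{(a_0, \dots, a_{j-1})} y = (S_{a_{j-1}} \varphi) x = (a_{j-1} * f_N(x), f_1(x_1), \dots, f_{N-1}(x_1, \dots, x_{N-1})),$$

$$\Phi_{(\bar{a}_0, \dots, \bar{a}_{j-1})} \bar{y} = (S_{\bar{a}_{j-1}} \varphi) \bar{x} = (\bar{a}_{j-1} * f_N(\bar{x}), f_1(\bar{x}_1), \dots, f_{N-1}(\bar{x}_1, \dots, \bar{x}_{N-1})).$$

By assumption,

$$f_i(x_1, \dots, x_i) = f_i(\bar{x}_1, \dots, \bar{x}_i) \quad \forall i \in \{1, \dots, k-1\},$$

and $F_{k-1}$ is one-to-one, so the assertion follows.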

[...] include a proof of Theorem 3.5, which is the main result of this section and which will be used in Section 4. For further details and proofs we refer to [1] and [2].

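Let $n \in \mathbb{N}$. In the real linear space $\mathbb{R}^n$, $(e_i)_{1 \le i \le n}$ denotes the canonical basis and $(\epsilon_i)_{1 \le i \le n}$ the corresponding dual basis. We set $e = \sum_{i=1}^n e_i$. We identify $\mathcal{M}_1^n = \mathcal{M}_1(\{1, \dots, n\})$ with the set of all $\ell \in (\mathbb{R}^n)^*$ with $\ell \ge 0$¹ and $\ell(e) = 1$ by $\mu \leftrightarrow \sum_{i=1}^n \mu(\{i\}) \epsilon_i$. Then $\lambda := \frac{1}{n} \sum_{i=1}^n \epsilon_i$ is the uniform distribution on $\{1, \dots, n\}$. We endow $\mathcal{M}_1^n$ with the metric

$$d_1(\mu, \nu) = \|\mu - \nu\|_1 = \sum_i |\mu_i - \nu_i|.$$

Since $\mathcal{M}_1^n$ is closed in $(\mathbb{R}^n)^*$, $(\mathcal{M}_1^n, d_1)$ is a complete metric space.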

Definition and Remark 3.1. A matrix $P \in \mathbb{R}^{n \times n}$ is stochastic iff $P \ge 0$ and $Pe = e$, or equivalently, iff $\mu P \in \mathcal{M}_1^n$ for all $\mu \in \mathcal{M}_1^n$. Thus, for a stochastic $P \in \mathbb{R}^{n \times n}$, the operator

$$T_P : \mathcal{M}_1^n \ni \mu \mapsto \mu P \in \mathcal{M}_1^n$$

is well-defined and continuous. It is convenient to define certain properties of stochastic matrices using the language of dynamical systems in metric spaces. Let $(X, d)$ be a metric space and $T : X \to X$ a continuous operator. A point $x \in X$ is called global attractor of $T$ iff $x = \lim_{j \to \infty} T^j y$ for each $y \in X$; in this case $x$ is the only fixed-point of $T$.

Definition 3.2. Let $P \in \mathbb{R}^{n \times n}$ be stochastic.

(i) $P$ is called doubly-stochastic iff $\lambda$ is a fixed-point of $T_P$, i.e. iff its transposed matrix ${}^tP$ is stochastic, too.

(ii) A distribution $\pi \in \mathcal{M}_1^n$ is called stationary w.r.t. $P$ iff $\pi$ is a fixed-point of $T_P$.

(iii) A distribution $\pi \in \mathcal{M}_1^n$ is called limiting distribution of $P$ iff $\pi$ is the global attractor of $T_P$.

(iv) $P$ is called ergodic iff it has a limiting distribution $> 0$.

Remark 3.3. If a stochastic $P$ has a limiting distribution $\pi$, then by 3.1 $\pi$ is the only stationary distribution of $P$. If, in addition, $P$ is doubly-stochastic, then $\pi = \lambda$ and $P$ is ergodic.

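Definition 3.4. Let $P$ be a stochastic $n \times n$-matrix. The Dobrushin (or ergodicity) coefficient is defined to be

$$\delta(P) = \frac{1}{2} \max_{i<j} \sum_{k=1}^n |P_{ik} - P_{jk}| = \max_{i<j} \frac{\|\epsilon_i P - \epsilon_j P\|_1}{\|\epsilon_i - \epsilon_j\|_1}.$$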

¹ For a (non-empty) finite set $I$ and $x, y \in \mathbb{R}^I$ we write $x \le y$ iff $x_i \le y_i$ for all $i \in I$.

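Obviously, $0 \le \delta(P) \le 1$, and

$$\delta(P) = 0 \iff \exists \pi \in \mathcal{M}_1^n : P = e \otimes \pi, \tag{3.1}$$

$$\delta(P) = 1 \iff \exists i, j : i < j \wedge \min(\epsilon_i P, \epsilon_j P) = 0. \tag{3.2}$$

Furthermore, it can be shown that

$$\|\mu P - \nu P\|_1 \le \delta(P) \|\mu - \nu\|_1 \quad \forall \mu, \nu \in \mathcal{M}_1^n. \tag{3.3}$$

For permutation matrices $A, B \in \mathbb{R}^{n \times n}$ the product $BPA$ is stochastic with

$$\delta(BPA) = \delta(P). \tag{3.4}$$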
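The Dobrushin coefficient is straightforward to compute directly from its definition as half the largest L1 distance between two rows. The following is a minimal sketch on a hypothetical 3-state stochastic matrix; the matrix entries and the helper names `dobrushin`, `apply` and `l1` are illustrative and not taken from the text. It also checks the contraction estimate (3.3) for one pair of point distributions.

```python
def dobrushin(P):
    """delta(P) = 1/2 * max_{i<j} sum_k |P_ik - P_jk| for a row-stochastic P."""
    n = len(P)
    return max(
        (sum(abs(P[i][k] - P[j][k]) for k in range(n)) / 2
         for i in range(n) for j in range(i + 1, n)),
        default=0.0,
    )

def apply(mu, P):
    """Row vector times matrix: (mu P)_k = sum_i mu_i P_ik."""
    return [sum(mu[i] * P[i][k] for i in range(len(mu))) for k in range(len(P))]

def l1(u, v):
    """L1 distance between two vectors of equal length."""
    return sum(abs(a - b) for a, b in zip(u, v))

# Hypothetical stochastic matrix (every row sums to 1).
P = [[0.5, 0.5, 0.0],
     [0.25, 0.5, 0.25],
     [0.0, 0.5, 0.5]]
mu, nu = [1.0, 0.0, 0.0], [0.0, 0.0, 1.0]

delta = dobrushin(P)
# Contraction estimate: ||mu P - nu P||_1 <= delta(P) * ||mu - nu||_1.
contracts = l1(apply(mu, P), apply(nu, P)) <= delta * l1(mu, nu) + 1e-12
print(delta, contracts)
```

For point distributions the estimate holds with equality whenever the two chosen rows realise the maximum in the definition, which is the case for the first and third rows of this example.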

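Theorem 3.5. Let $P \in \mathbb{R}^{n \times n}$ be a stochastic matrix.

(i) $P$ has a limiting distribution iff $\delta(P^N) < 1$ for an $N \in \mathbb{N}$.

(ii) If $\delta(P^N) < 1$ for an $N \in \mathbb{N}$ and if $\pi \in \mathcal{M}_1^n$ is the limiting distribution of $P$ (which exists by (i) and is unique by definition), then

$$\|\mu P^j - \pi\|_1 \le \delta(P^N)^{\lfloor j/N \rfloor} \|\mu - \pi\|_1 \quad \forall \mu \in \mathcal{M}_1^n$$

for all $j \ge 0$.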

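Proof. (i) "$\Rightarrow$": Assume that $\delta(P^N) = 1$ for all $N \in \mathbb{N}$. Then, by (3.2), for each $N$ there exist indices $i_N < j_N$ with $\min(\epsilon_{i_N} P^N, \epsilon_{j_N} P^N) = 0$. By the pigeonhole principle there are indices $i < j$ and a sequence $(N_k)_{k \ge 1} \subset \mathbb{N}$ s.t. $\lim_{k \to \infty} N_k = \infty$ and

$$\min(\epsilon_i P^{N_k}, \epsilon_j P^{N_k}) = 0. \tag{3.5}$$

Now assume that $P$ has a limiting distribution $\pi \in \mathcal{M}_1^n$. Then $\pi = \lim_{k \to \infty} \epsilon_i P^{N_k} = \lim_{k \to \infty} \epsilon_j P^{N_k}$, so by (3.5) $\pi = 0$, which is a contradiction.

(i) "$\Leftarrow$" and (ii): Set $T = T_P$ and let $N \in \mathbb{N}$ s.t. $L = \delta(P^N) < 1$. Then $T^N$ is contractive with Lipschitz constant $L$. By Banach's fixed-point theorem, $T^N$ has a global attractor $\pi \in \mathcal{M}_1^n$. Now $\pi$ is the only fixed-point of $T^N$ and $T\pi = T(T^N \pi) = T^N(T\pi)$, so $T\pi = \pi$. Finally, let $\mu \in \mathcal{M}_1^n$, $j \ge 0$ and write $j$ in the form $j = kN + r$ with $k = \lfloor j/N \rfloor$, $0 \le r < N$. Then, using (3.3) and $\delta(P) \le 1$,

$$d_1(T^j \mu, \pi) = d_1(T^j \mu, T^j \pi) = d_1(T^{Nk} T^r \mu, T^{Nk} T^r \pi) \le L^k d_1(T^r \mu, T^r \pi) \le L^k d_1(\mu, \pi).$$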

Definition and Remark 3.6. A stochastic matrix $P$ is called primitive iff [...] primitive. Conversely, if $P$ is primitive with $P^N > 0$, then $\delta(P^N) < 1$, so $P$ has a limiting distribution $\pi$, i.e. $\lim_{j \to \infty} P^j e_k = \pi(e_k)\, e$ for all $k$. It is now easy to see that

$$\min_i (P^j x)_i \le \min_i (P^{j+1} x)_i \quad \forall x \in \mathbb{R}^n,\ j \ge 0,$$

so $0 < \min_i (P^N e_k)_i \le \pi(e_k)\ \forall k \in \{1, \dots, n\}$, i.e. $\pi > 0$ and $P$ is ergodic.

Note that a stochastic matrix is primitive iff a corresponding Markov chain is irreducible and aperiodic.

4. Mixing properties of FSRs. We are now ready to combine the results of Sections 2 and 3. As in Section 2, let $\mathcal{R} = (X, \varphi, \psi)$ be an FSR of length $N$ over a finite group $(\Sigma, *)$, $\mathcal{S} = (K_t)_{t \ge 0}$ a sequence of i.i.d. random variables over $\Sigma$ with (common) distribution $\kappa \in \mathcal{M}_1(\Sigma)$ and $Z_0$ be an arbitrary random variable on $X$ with distribution $\mu \in \mathcal{M}_1(X)$. Consider the Markov chain $Z = (Z_t)_{t \ge 0}$ associated with $\mathcal{R}$, $\mathcal{S}$ and $Z_0$ and the corresponding stochastic matrix $P = P(\varphi, \kappa)$. We assume that $\mathcal{R}$ is triangular and let $F_k$, $F$ be defined as in 2.2.

Theorem 4.1. (i) If $F_k$ ($k < N$) is bijective, then

$$\delta(P^j) = 1 \quad \forall j \in \{0, \dots, k\}. \tag{4.1}$$

(ii) For bijective $F$ every row of $P^N$ is a permutation of the tuple

$$(\hat\kappa(a) \mid a \in \Sigma^N = X).$$

(iii) For bijective $F$ and non-degenerate $\kappa$ (i.e. $\min_a \kappa(a) > 0$) the matrix $P$ is primitive. In fact $P^N > 0$ (and thus $\delta(P^N) < 1$).

(iv) If $F$ is not bijective, $P$ is not primitive.

Proof. (i) Proposition 2.9 shows that for $j \le k$, $P^j$ has (at least) two rows with disjoint supports. (4.1) now follows from (3.2). (ii) and (iii) follow from (2.15). (iv) is a consequence of Theorem 2.7 (iii).
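Parts (i) and (iii) of Theorem 4.1 can be observed numerically on a small register. The sketch below is a toy case under stated assumptions: a binary alphabet, length 3, a hypothetical non-singular triangular register in `phi`, and a hypothetical non-degenerate source `kappa`. It builds the transition matrix with exact fractions, computes the powers up to the register length, and evaluates the Dobrushin coefficient of each power.

```python
from fractions import Fraction
from itertools import product
from math import prod

M, N = 2, 3
kappa = [Fraction(2, 3), Fraction(1, 3)]     # hypothetical non-degenerate source

def phi(x):
    # Hypothetical non-singular triangular FSR: f_j = x_j + x_1 * ... * x_{j-1}
    # (the empty product is 1, so f_1 = x_1 + 1).
    y = tuple((x[j] + prod(x[:j])) % M for j in range(N))
    return (y[-1],) + y[:-1]                 # cyclic shift

states = list(product(range(M), repeat=N))
index = {x: i for i, x in enumerate(states)}
n = len(states)

P = [[Fraction(0)] * n for _ in range(n)]
for x in states:
    z = phi(x)
    for a in range(M):                       # S_a adds a to the first component
        P[index[x]][index[((a + z[0]) % M,) + z[1:]]] += kappa[a]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def dobrushin(A):
    return max(sum(abs(A[i][k] - A[j][k]) for k in range(n)) / 2
               for i in range(n) for j in range(i + 1, n))

powers = [[[Fraction(int(i == j)) for j in range(n)] for i in range(n)]]  # P^0
for _ in range(N):
    powers.append(matmul(powers[-1], P))
deltas = [dobrushin(A) for A in powers]      # delta(P^0), ..., delta(P^N)
print(deltas)
```

In this example the coefficient should stay at 1 for all powers below the register length and drop strictly below 1 at the power `N`, where all entries become positive.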

We now come to the announced theorem concerning the rapid mixing of non-singular triangular FSRs under the influence of non-degenerate sources. The theorem actually holds under the slightly milder hypothesis that $F$ is bijective and is a direct consequence of Theorems 3.5 and 4.1 (iii) and Remark 3.6.

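Theorem 4.2. Let $F$ be bijective and $\kappa$ non-degenerate. Then $P$ is ergodic with a limiting distribution $\pi > 0$. If $\mathcal{R}$ is non-singular, $\pi = \lambda_X$. For the distribution $\mu P^j$ of $Z_j$ we have the estimate

$$\|\mu P^j - \pi\|_1 \le \|\mu - \pi\|_1\, \delta(P^N)^{\lfloor j/N \rfloor} \le C\, \delta(P^N)^{\lfloor j/N \rfloor} \quad \forall j \ge 0,$$

where $C$ [...] and $\delta(P^N) < 1$.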

Obviously, the determination of $\delta(P^N)$ plays a vital role in the analysis of concrete FSRs. In order to shed some light on this practical problem, we conclude this note with some remarks on "linear" and binary FSRs.

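Theorem 4.3. Let $\mathcal{R}$ be non-singular and linear in the sense that $\varphi$ is a group isomorphism. Then $P = P(\varphi, \kappa)$ fulfills

$$\delta(P^N) = \frac{1}{2} \max_{x \in X} \sum_y |\hat\kappa(y) - \hat\kappa(y\, \hat{*}\, x^{-1})| =: h_N(\Sigma, \kappa). \tag{4.2}$$

Proof. By Remark 2.8 (i) there is a bijection $\chi$ s.t.

$$\Phi_a x = \chi(a)\, \hat{*}\, \varphi^N x \quad \forall a, x \in X.$$

Thus

$$y = \Phi_a x \iff a = \chi^{-1}(y\, \hat{*}\, (\varphi^N x)^{-1})$$

and (cf. (2.15))

$$(P^N)_{xy} = \hat\kappa(\chi^{-1}(y\, \hat{*}\, (\varphi^N x)^{-1})).$$

By (3.4)

$$\delta(P^N) = \delta(Q)$$

where $Q_{xy} = \hat\kappa(y\, \hat{*}\, x^{-1})$. The Dobrushin coefficient of $Q$ is now easily calculated as

$$\delta(Q) = \frac{1}{2} \max_{x \in X} \sum_y |\hat\kappa(y) - \hat\kappa(y\, \hat{*}\, x^{-1})|.$$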

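Remark 4.4. An important special case is that of binary FSRs. Let $(\Sigma, *) = (\mathbb{F}_2, +)$ and let $\kappa$ be a non-degenerate distribution on $\mathbb{F}_2$, $p = \max(\kappa(0), \kappa(1)) < 1$ and $q = \min(\kappa(0), \kappa(1)) > 0$. Recall that $\varepsilon(\kappa) := \kappa(0) - \kappa(1)$ is called the bias of $\kappa$. Then the maximum in (4.2) is attained for $x = (1, 1, \dots, 1)$ with

$$h_N(\mathbb{F}_2, \kappa) = \sum_{k=0}^{\lfloor N/2 \rfloor} \binom{N}{k} \left( p^{N-k} q^k - p^k q^{N-k} \right).$$

Note that

$$h_{N+1}(\mathbb{F}_2, \kappa) - h_N(\mathbb{F}_2, \kappa) = \begin{cases} \binom{N}{N/2} (pq)^{N/2} (p - q), & \text{if } N \text{ is even,} \\ 0, & \text{if } N \text{ is odd.} \end{cases}$$

Therefore $\lim_{N \to \infty} h_N(\mathbb{F}_2, \kappa) = 1$ (use the Taylor expansion of $(1 - 4x)^{-1/2}$ around $x = 0$). By Theorem 4.1 (ii)

$$\delta(P(\varphi, \kappa)^N) \le h_N(\mathbb{F}_2, \kappa)$$

for all $\varphi$ with bijective $F$. Computer experiments suggest the following

Conjecture. For all non-constant Boolean functions $f : \mathbb{F}_2^N \to \mathbb{F}_2$ we have

$$\delta(P(\varphi_f, \kappa)^N) \ge |\varepsilon(\kappa)|$$

where

$$\varphi_f(x) = (f(x), x'), \quad x = (x', x_N) \in \mathbb{F}_2^{N-1} \times \mathbb{F}_2,$$

the lower bound being attained e.g. for $f = \chi_{\{0\}}$, the characteristic function of the singleton $\{0\}$.

Note that we have $\delta(P(\varphi_f, \kappa)^N) = 0$ for constant $f$ (cf. (2.16) and (3.1)).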
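The quantities in Remark 4.4 can be cross-checked numerically for small lengths. The following sketch evaluates the maximum in (4.2) over the binary alphabet by brute force and compares successive differences with the stated increment formula. The source value `k0 = 0.7` and the helper names `h` and `expected_increment` are hypothetical; floating-point arithmetic is used, so agreement is only checked up to a small tolerance.

```python
from itertools import product
from math import comb

def h(N, k0):
    """Brute-force value of (4.2) over F_2 with kappa(0) = k0, kappa(1) = 1 - k0."""
    kappa = (k0, 1.0 - k0)
    states = list(product((0, 1), repeat=N))

    def khat(y):
        # Product distribution: kappa(y_1) * ... * kappa(y_N).
        r = 1.0
        for b in y:
            r *= kappa[b]
        return r

    # In F_2^N every element is its own inverse, so y * x^{-1} is the bitwise XOR.
    return max(
        sum(abs(khat(y) - khat(tuple(u ^ v for u, v in zip(y, x)))) for y in states) / 2
        for x in states
    )

def expected_increment(N, k0):
    """Stated difference h_{N+1} - h_N: nonzero only for even N."""
    p, q = max(k0, 1 - k0), min(k0, 1 - k0)
    return comb(N, N // 2) * (p * q) ** (N // 2) * (p - q) if N % 2 == 0 else 0.0

k0 = 0.7  # hypothetical source with bias 0.4
gaps = [abs(h(N + 1, k0) - h(N, k0) - expected_increment(N, k0)) for N in range(1, 6)]
print(max(gaps))
```

The brute-force search grows like `4**N`, so this kind of check is only practical for very short registers.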

Remark 4.5. The source $\mathcal{S}$ can be interpreted as a random source or (a first-order approximation of) a plain text source. The process $Z$ is then a model of the sequence of states of the FSR $(X, \varphi, \psi)$ when operated in cipher-feedback (CFB) mode. The results of this section may therefore be applied to the post-processing of random generators and to stream ciphers in CFB mode.

REFERENCES

[1] Behrends, E.: Introduction to Markov Chains. Vieweg, Braunschweig/Wiesbaden, 2000.

[2] Brémaud, P.: Markov Chains: Gibbs Fields, Monte Carlo Simulation, and Queues.