An Introduction to Tries

(1)

21.09.2015

(2)

Introduction CS Background

Given: Words, e.g. in binary code

Ξ

₁

= 11010 . . . , Ξ

₂

= 00011 . . . , Ξ

₃

= 01101 . . . , Ξ

4

= 00000 . . . , Ξ

5

= 11111 . . . , Ξ

6

= 11100 . . .

Kevin Leckey (Monash University) An Introduction to Tries 21.09.2015 2 / 19

(3)

Given: Words, e.g. in binary code

Ξ

₁

= 11010 . . . , Ξ

₂

= 00011 . . . , Ξ

₃

= 01101 . . . , Ξ

4

= 00000 . . . , Ξ

5

= 11111 . . . , Ξ

6

= 11100 . . .

Task: Storage that allows fast search and insert/delete operations

(4)

Given: Words, e.g. in binary code

Ξ

₁

= 11010 . . . , Ξ

₂

= 00011 . . . , Ξ

₃

= 01101 . . . , Ξ

4

= 00000 . . . , Ξ

5

= 11111 . . . , Ξ

6

= 11100 . . .

Task: Storage that allows fast search and insert/delete operations

→ Use tree-like data structures such as a Trie

(5)

Ξ

₁

= 11010 . . . , Ξ

₂

= 00011 . . . , Ξ

₃

= 01101 . . . , Ξ

4

= 00000 . . . , Ξ

5

= 11111 . . . , Ξ

6

= 11100 . . .

Task: Storage that allows fast search and insert/delete operations

→ Use tree-like data structures such as a Trie (Information retrieval)

(6)

Constructing a Trie

Ξ

1

= 11010 . . . , Ξ

2

= 00011 . . . , Ξ

3

= 01101 . . . , Ξ

4

= 00000 . . . , Ξ

5

= 11111 . . . , Ξ

6

= 11100 . . .

(7)

Constructing a Trie

Ξ

1

= 11010 . . . , Ξ

2

= 00011 . . . , Ξ

3

= 01101 . . . , Ξ

4

= 00000 . . . , Ξ

5

= 11111 . . . , Ξ

6

= 11100 . . .

(8)

Constructing a Trie

Ξ

1

= 11010 . . . , Ξ

2

= 00011 . . . , Ξ

3

= 01101 . . . , Ξ

4

= 00000 . . . , Ξ

5

= 11111 . . . , Ξ

6

= 11100 . . .

Ξ1

(9)

Ξ4 Ξ2 Ξ3 Ξ1 Ξ6 Ξ5

(10)

Constructing a Trie

Ξ

1

= 11010 . . . , Ξ

2

= 00011 . . . , Ξ

3

= 01101 . . . , Ξ

4

= 00000 . . . , Ξ

5

= 11111 . . . , Ξ

6

= 11100 . . .

Ξ4 Ξ2

Ξ3

Ξ1

Ξ6 Ξ5

(11)

Searching

Search for Ξ

1

= 11010 . . .

Ξ4 Ξ2

Ξ3

Ξ1

Ξ6 Ξ5

Searching cost = Depth of Ξ

1

Worst case = Height of the Trie

(12)

Searching

Search for Ξ

1

= 11010 . . .

Ξ4 Ξ2

Ξ3

Ξ1

Ξ6 Ξ5

Searching cost = Depth of Ξ

1

Worst case = Height of the Trie

(13)

Searching

Search for Ξ

1

=

11010

. . .

Ξ4 Ξ2

Ξ3

Ξ1

Ξ6 Ξ5

Searching cost = Depth of Ξ

1

Worst case = Height of the Trie

(14)

Searching

Search for Ξ

1

=

11010

. . .

Ξ4 Ξ2

Ξ3

Ξ1

Ξ6 Ξ5

Searching cost = Depth of Ξ

1

Worst case = Height of the Trie

(15)

Searching

Search for Ξ

1

=

11010

. . .

Ξ4 Ξ2

Ξ3

Ξ1

Ξ6 Ξ5

Searching cost = Depth of Ξ

1

Worst case = Height of the Trie

(16)

Searching

Search for Ξ

1

=

11010

. . .

Ξ4 Ξ2

Ξ3

Ξ1

Ξ6 Ξ5

Searching cost = Depth of Ξ

1

= 3

= Shortest prefix of Ξ

₁

not shared by Ξ

₂

, . . . , Ξ

₆

Worst case = Height of the Trie

(17)

Searching

Search for Ξ

1

=

11010

. . .

Ξ4 Ξ2

Ξ3

Ξ1

Ξ6 Ξ5

Searching cost = Depth of Ξ

1

= 3

= Shortest prefix of Ξ

₁

not shared by Ξ

₂

, . . . , Ξ

₆

Worst case = Height of the Trie

(18)

Searching

Search for Ξ

1

=

11010

. . .

Ξ4 Ξ2

Ξ3

Ξ1

Ξ6 Ξ5

Searching cost = Depth of Ξ

1

= Shortest prefix of Ξ

₁

not shared by Ξ

₂

, . . . , Ξ

₆

Worst case = Height of the Trie

(19)

Searching

Search for Ξ

1

=

11010

. . .

Ξ4 Ξ2

Ξ3

Ξ1

Ξ6 Ξ5

Searching cost = Depth of Ξ

1

= 3

= Shortest prefix of Ξ

₁

not shared by Ξ

₂

, . . . , Ξ

₆

Worst case = Height of the Trie

(20)

Ξ4 Ξ2

Ξ3

Ξ1

Ξ6 Ξ5

Searching cost = Depth of Ξ

1

= 3

= Shortest prefix of Ξ

₁

not shared by Ξ

₂

, . . . , Ξ

₆

Worst case = Height of the Trie = 4

(21)

Introduction The Probabilistic Model

Input Model

Generate the words Ξ

1

, Ξ

2

, . . . to be stored → Probabilistic Model

The words Ξ

1

, Ξ

2

, . . . are independent and identically distributed

Each word Ξ

_i

= ξ

1

ξ

2

ξ

3

ξ

4

. . . consists of letters ξ

1

, ξ

2

, . . . that are:

More general models allow ξ

₁

, ξ

₂

, . . . to be dependent (e.g. evolving as a Markov chain)

(22)

Input Model

Generate the words Ξ

1

, Ξ

2

, . . . to be stored → Probabilistic Model The words Ξ

1

, Ξ

2

, . . . are independent and identically distributed

Each word Ξ

_i

= ξ

₁

ξ

₂

ξ

₃

ξ

₄

. . . consists of letters ξ

₁

, ξ

₂

, . . . that are:

More general models allow ξ

₁

, ξ

₂

, . . . to be dependent (e.g. evolving as a Markov chain)

(23)

Input Model

Generate the words Ξ

1

, Ξ

2

, . . . to be stored → Probabilistic Model The words Ξ

1

, Ξ

2

, . . . are independent and identically distributed Each word Ξ

_i

= ξ

₁

ξ

₂

ξ

₃

ξ

₄

. . . consists of letters ξ

₁

, ξ

₂

, . . . that are:

(24)

Input Model

Generate the words Ξ

1

, Ξ

2

, . . . to be stored → Probabilistic Model The words Ξ

1

, Ξ

2

, . . . are independent and identically distributed Each word Ξ

_i

= ξ

₁

ξ

₂

ξ

₃

ξ

₄

. . . consists of letters ξ

₁

, ξ

₂

, . . . that are:

• independent,

(25)

Input Model

Generate the words Ξ

1

, Ξ

2

, . . . to be stored → Probabilistic Model The words Ξ

1

, Ξ

2

, . . . are independent and identically distributed Each word Ξ

_i

= ξ

₁

ξ

₂

ξ

₃

ξ

₄

. . . consists of letters ξ

₁

, ξ

₂

, . . . that are:

• independent, • P(ξ

j

= 0) = 1 /2 = P(ξ

j

= 1)

(26)

Generate the words Ξ

1

, Ξ

2

, . . . to be stored → Probabilistic Model The words Ξ

1

, Ξ

2

, . . . are independent and identically distributed Each word Ξ

_i

= ξ

₁

ξ

₂

ξ

₃

ξ

₄

. . . consists of letters ξ

₁

, ξ

₂

, . . . that are:

• independent, • P(ξ

j

= 0) = 1 /2 = P(ξ

j

= 1)

More general models allow ξ

₁

, ξ

₂

, . . . to be dependent (e.g. evolving as a Markov chain)

(27)

Ξ1

(28)

Ξ1, Ξ2

(29)

Ξ2

Ξ1

(30)

Ξ1, Ξ2

(31)

Ξ2

Ξ1

(32)

Ξ2 Ξ1

(33)

Ξ3

Ξ2 Ξ1

(34)

Ξ2 Ξ1

Ξ3

(35)

Ξ4

Ξ2 Ξ1

Ξ3

(36)

Ξ4

Ξ2 Ξ1

Ξ3

(37)

Ξ2 Ξ1, Ξ4

Ξ3

(38)

Ξ2 Ξ4

Ξ1

Ξ3

(39)

Ξ2

Ξ1 Ξ4

Ξ3

(40)

Ξ5

Ξ2

Ξ1 Ξ4

Ξ3

(41)

Ξ2

Ξ1 Ξ4

Ξ3, Ξ5

(42)

Ξ2

Ξ1 Ξ4

Ξ5

Ξ3

(43)

Ξ2

Ξ1 Ξ4

Ξ3, Ξ5

(44)

Ξ2

Ξ1 Ξ4

Ξ5

Ξ3

(45)

Ξ2

Ξ1 Ξ4 Ξ3 Ξ5

(46)

Analysis The Depth

Consider n words Ξ

1

, . . . , Ξ

n

. What is the depth of the vertex Ξ

1

?

Recall:

Depth D

n

= Length of the shortest unique prefix of Ξ

1

= ξ

1

ξ

2

ξ

3

. . . P(D

n

≤ k) = P(Ξ

2

, . . . , Ξ

n

do not start with ξ

1

. . . ξ

k

)

Consequence:

P(D

n

≤ α log

₂

(n)) = 1 − n

^−α

n−1

n→∞

−→

( 1, if α > 1, 0, if α < 1.

(47)

Analysis The Depth

Consider n words Ξ

1

, . . . , Ξ

n

. What is the depth of the vertex Ξ

1

? Recall:

Depth D

n

= Length of the shortest unique prefix of Ξ

1

= ξ

1

ξ

2

ξ

3

. . .

P(D

n

≤ k) = P(Ξ

2

, . . . , Ξ

n

do not start with ξ

1

. . . ξ

k

)

Consequence:

P(D

n

≤ α log

₂

(n)) = 1 − n

^−α

n−1

n→∞

−→

( 1, if α > 1, 0, if α < 1.

(48)

Analysis The Depth

Consider n words Ξ

1

, . . . , Ξ

n

. What is the depth of the vertex Ξ

1

? Recall:

Depth D

n

= Length of the shortest unique prefix of Ξ

1

= ξ

1

ξ

2

ξ

3

. . . P(D

n

≤ k) = P(Ξ

2

, . . . , Ξ

n

do not start with ξ

1

. . . ξ

k

)

= 1 − 1 2

k

!

n−1

Consequence:

P(D

n

≤ α log

₂

(n)) = 1 − n

^−α

n−1

(49)

Analysis The Depth

Consider n words Ξ

1

, . . . , Ξ

n

. What is the depth of the vertex Ξ

1

? Recall:

Depth D

n

= Length of the shortest unique prefix of Ξ

1

= ξ

1

ξ

2

ξ

3

. . . P(D

n

≤ k) = P(Ξ

2

, . . . , Ξ

n

do not start with ξ

1

. . . ξ

k

)

= 1 − 1 2

k

!

n−1

Consequence:

P(D

n

≤ α log

₂

(n)) = 1 − n

^−α

n−1

(50)

Analysis The Depth

Consider n words Ξ

1

, . . . , Ξ

n

. What is the depth of the vertex Ξ

1

? Recall:

Depth D

n

= Length of the shortest unique prefix of Ξ

1

= ξ

1

ξ

2

ξ

3

. . . P(D

n

≤ k) = P(Ξ

2

, . . . , Ξ

n

do not start with ξ

1

. . . ξ

k

)

= 1 − 1 2

k

!

n−1

Consequence:

P(D

n

≤ α log

₂

(n)) = 1 − n

^−α

n−1

(51)

Depth D

n

= Length of the shortest unique prefix of Ξ

1

= ξ

1

ξ

2

ξ

3

. . . P(D

n

≤ k) = P(Ξ

2

, . . . , Ξ

n

do not start with ξ

1

. . . ξ

k

)

= 1 − 1 2

k

!

n−1

Consequence:

P(D

n

≤ α log

₂

(n)) = 1 − n

^−α

n−1 n→∞

−→

( 1, if α > 1, 0, if α < 1.

(52)

Analysis The Depth

Results on D

n

Shown on the previous slide:

D

n

log

₂

(n)

−→ 1

P

(n → ∞)

Thm (Knuth ’72): E[D

n

] = log

₂

(n) + Ψ(log

₂

(n)) + o(1) with periodic function Ψ

Thm (Szpankowski ’86): Var(D

n

) ∼ Φ(log

₂

(n)) with periodic function Φ

(53)

Analysis The Depth

Results on D

n

Shown on the previous slide:

D

n

log

₂

(n)

−→ 1

P

(n → ∞) Considering the previous slide more carefully:

P (D

n

− log

₂

(n) < x) ≈

1 − 2

^−x

n

n−1

n→∞

−→ e

⁻²^−x

(Limit is a Gumbel distribution known from extreme value theory)

(54)

Analysis The Depth

Results on D

n

Shown on the previous slide:

D

n

log

₂

(n)

−→ 1

P

(n → ∞) Considering the previous slide more carefully:

P (D

n

− log

₂

(n) < x) ≈

1 − 2

^−x

n

n−1

n→∞

−→ e

⁻²^−x

(Limit is a Gumbel distribution known from extreme value theory) Thm (Knuth ’72): E[D

n

] = log

₂

(n) + Ψ(log

₂

(n)) + o(1)

with periodic function Ψ

(55)

D

n

log

₂

(n)

−→ 1

P

(n → ∞) Considering the previous slide more carefully:

P (D

n

− log

₂

(n) < x) ≈

1 − 2

^−x

n

n−1

n→∞

−→ e

⁻²^−x

(Limit is a Gumbel distribution known from extreme value theory) Thm (Knuth ’72): E[D

n

] = log

₂

(n) + Ψ(log

₂

(n)) + o(1)

with periodic function Ψ

Thm (Szpankowski ’86): Var(D

n

) ∼ Φ(log

₂

(n)) with periodic function Φ

(56)

Analysis The Height

Consider n words Ξ

1

, . . . , Ξ

n

. What is the height of the resulting Trie?

Def: Height H

_n

= max{D

_n

(Ξ

_i

) : i = 1, . . . , n}. The result P(D

n

≤ k) = (1 − 2

^−k

)

ⁿ⁻¹

implies:

P (H

n

> α log

₂

(n)) =

Consequence: P (H

n

> α log

₂

(n)) → 0 for α > 2

(57)

Analysis The Height

Consider n words Ξ

1

, . . . , Ξ

n

. What is the height of the resulting Trie?

Def: Height H

_n

= max{D

_n

(Ξ

_i

) : i = 1, . . . , n}.

The result P(D

n

≤ k) = (1 − 2

^−k

)

ⁿ⁻¹

implies: P (H

n

> α log

₂

(n)) =

Consequence: P (H

n

> α log

₂

(n)) → 0 for α > 2

(58)

Analysis The Height

Consider n words Ξ

1

, . . . , Ξ

n

. What is the height of the resulting Trie?

Def: Height H

_n

= max{D

_n

(Ξ

_i

) : i = 1, . . . , n}.

The result P(D

n

≤ k) = (1 − 2

^−k

)

ⁿ⁻¹

implies:

P (H

n

> α log

₂

(n)) =

Consequence: P (H

n

> α log

₂

(n)) → 0 for α > 2

(59)

Analysis The Height

Consider n words Ξ

1

, . . . , Ξ

n

. What is the height of the resulting Trie?

Def: Height H

_n

= max{D

_n

(Ξ

_i

) : i = 1, . . . , n}.

The result P(D

n

≤ k) = (1 − 2

^−k

)

ⁿ⁻¹

implies:

P (H

n

> α log

₂

(n)) = P (D

n

(Ξ

i

) > α log

₂

(n) for some i ∈ {1, . . . , n})

Consequence: P (H

n

> α log

₂

(n)) → 0 for α > 2

(60)

Analysis The Height

Consider n words Ξ

1

, . . . , Ξ

n

. What is the height of the resulting Trie?

Def: Height H

_n

= max{D

_n

(Ξ

_i

) : i = 1, . . . , n}.

The result P(D

n

≤ k) = (1 − 2

^−k

)

ⁿ⁻¹

implies:

P (H

n

> α log

₂

(n)) = P (D

n

(Ξ

i

) > α log

₂

(n) for some i ∈ {1, . . . , n})

≤ n · P (D

n

> α log

₂

(n))

(61)

Analysis The Height

Consider n words Ξ

1

, . . . , Ξ

n

. What is the height of the resulting Trie?

Def: Height H

_n

= max{D

_n

(Ξ

_i

) : i = 1, . . . , n}.

The result P(D

n

≤ k) = (1 − 2

^−k

)

ⁿ⁻¹

implies:

P (H

n

> α log

₂

(n)) = P (D

n

(Ξ

i

) > α log

₂

(n) for some i ∈ {1, . . . , n})

≤ n · P (D

n

> α log

₂

(n))

≤ n · 1 − 1 − n

^−α

n

(62)

Analysis The Height

Consider n words Ξ

1

, . . . , Ξ

n

. What is the height of the resulting Trie?

Def: Height H

_n

= max{D

_n

(Ξ

_i

) : i = 1, . . . , n}.

The result P(D

n

≤ k) = (1 − 2

^−k

)

ⁿ⁻¹

implies:

P (H

n

> α log

₂

(n)) = P (D

n

(Ξ

i

) > α log

₂

(n) for some i ∈ {1, . . . , n})

≤ n · P (D

n

> α log

₂

(n))

≤ n · 1 − 1 − n

^−α

n

≤ n

^2−α

(63)

Def: Height H

_n

= max{D

_n

(Ξ

_i

) : i = 1, . . . , n}.

The result P(D

n

≤ k) = (1 − 2

^−k

)

ⁿ⁻¹

implies:

P (H

n

> α log

₂

(n)) = P (D

n

(Ξ

i

) > α log

₂

(n) for some i ∈ {1, . . . , n})

≤ n · P (D

n

> α log

₂

(n))

≤ n · 1 − 1 − n

^−α

n

≤ n

^2−α

Consequence: P (H

n

> α log

₂

(n)) → 0 for α > 2

(64)

Analysis The Height

Results on H

n

Partly proven on the previous slide:

H

n

2 log

₂

(n)

−→ 1

P

E[H

n

] ∼ 2 log

₂

(n) (n → ∞) (Flajolet, Steyaert ’82 → periodic second order term)

(65)

Analysis The Height

Results on H

n

Partly proven on the previous slide:

H

n

2 log

₂

(n)

−→ 1

P

Thm (Devroye ’84):

n→∞

lim P(H

n

− 2 log

₂

(n) − 1 ≤ x ) = exp(−2

^−x

), x ∈ R

(66)

H

n

2 log

₂

(n)

−→ 1

P

Thm (Devroye ’84):

n→∞

lim P(H

n

− 2 log

₂

(n) − 1 ≤ x ) = exp(−2

^−x

), x ∈ R Thm (Regnier ’82):

E[H

n

] ∼ 2 log

₂

(n) (n → ∞) (Flajolet, Steyaert ’82 → periodic second order term)

(67)

Analysis The Height

Summary: Typical depth: log

₂

(n), height: 2 log

₂

(n).

log₂n + O(1)

2 log₂n + O(1) 2 log₂n + O(1)

(External nodes/Leaves) (Internal nodes)

(68)

log₂_{log n}ⁿ + O(1)

log₂n + O(1)

2 log₂n + O(1)

log₂_{log n}ⁿ + O(1) log₂n + O(1)

2 log₂n + O(1)

(External nodes/Leaves) (Internal nodes)

(69)

Analysis The External Path Length

Consider n words Ξ

1

, . . . , Ξ

n

. External Path Length:

L

_n

:=

n

X

i =1

D

_n,i

, D

_n,i

= D

_n

(Ξ

_i

).

Example: L

6

= 2 + 3 + 4 · 4 = 21

(70)

Consider n words Ξ

1

, . . . , Ξ

n

. External Path Length:

L

_n

:=

n

X

i =1

D

_n,i

, D

_n,i

= D

_n

(Ξ

_i

).

Ξ4 Ξ2

Ξ3

Ξ1

Ξ6 Ξ5

(71)

i =1

Ξ4 Ξ2

Ξ3

Ξ1

Ξ6 Ξ5

Example: L

6

= 2 + 3 + 4 · 4 = 21

(72)

A Recursion for L

n

Ξ4 Ξ2

Ξ3

Ξ1

Ξ6 Ξ5

K

_n

= # words starting with 0

L

_n

=

^d

(73)

A Recursion for L

n

Ξ4 Ξ2

Ξ3

Ξ1

Ξ6 Ξ5

K

_n

= # words starting with 0

L

_n

=

^d

(74)

A Recursion for L

n

Ξ4 Ξ2

Ξ3

Ξ1

Ξ6 Ξ5

K

_n

= # words starting with 0

L

_n

=

^d

(75)

A Recursion for L

n

Ξ4 Ξ2

Ξ3

Ξ1

Ξ6 Ξ5

K

_n

= # words starting with 0

L

_n

=

^d

(76)

A Recursion for L

n

Ξ4 Ξ2

Ξ3

Ξ1

Ξ6 Ξ5

K

_n

= # words starting with 0

L

_n

=

^d L_K_n

(77)

A Recursion for L

n

Ξ4 Ξ2

Ξ3

Ξ1

Ξ6 Ξ5

K

_n

= # words starting with 0

L

_n

=

^d L_K_n

+

L˜_n−K_n

(78)

Ξ4 Ξ2

Ξ3

Ξ1

Ξ6 Ξ5

K

_n

= # words starting with 0

L

_n

=

^d L_K_n

+

L˜_n−K_n

+ n

(79)

The Contraction Method in a Nutshell

Aim: Find a limit law for L

n

(after rescaling properly) L

n

= L

d Kn

+ ˜ L

n−Kn

+ n

X =

^d

1 √

2 X + 1

√

2 X e (1)

3. Solution to (1): Existence of a solution to (1). Here: Normal distribution with mean 0.

4. Contraction: Find a metric such that (1) corresponds to the fixed point of a contracting map.

5. Convergence: Prove convergence with respect to that metric.

(80)

The Contraction Method in a Nutshell

Aim: Find a limit law for L

n

(after rescaling properly) L

n

= L

d Kn

+ ˜ L

n−Kn

+ n 1. Rescaling: X

n

= (L

n

− E[L

n

])/pVar(L

n

)

3. Solution to (1): Existence of a solution to (1). Here: Normal distribution with mean 0.

4. Contraction: Find a metric such that (1) corresponds to the fixed point of a contracting map.

5. Convergence: Prove convergence with respect to that metric.

(81)

The Contraction Method in a Nutshell

Aim: Find a limit law for L

n

(after rescaling properly) L

n

= L

d Kn

+ ˜ L

n−Kn

+ n 1. Rescaling: X

n

= (L

n

− E[L

n

])/pVar(L

n

)

X

_n

= A

^d _n,1

X

_K_n

+ A

_n,2

X e

_n−K_n

+ b

_n

distribution with mean 0.

4. Contraction: Find a metric such that (1) corresponds to the fixed point of a contracting map.

5. Convergence: Prove convergence with respect to that metric.

(82)

The Contraction Method in a Nutshell

Aim: Find a limit law for L

n

(after rescaling properly) L

_n

= L

^d _K_n

+ ˜ L

_n−K_n

+ n 1. Rescaling: X

n

= (L

n

− E[L

n

])/pVar(L

n

)

X

_n

= A

^d _n,1

X

_K_n

+ A

_n,2

X e

_n−K_n

+ b

_n

2. Find the Limits: (A

_n,1

, A

_n,2

, b

_n

) −→ ???

point of a contracting map.

5. Convergence: Prove convergence with respect to that metric.

(83)

The Contraction Method in a Nutshell

Aim: Find a limit law for L

n

(after rescaling properly) L

_n

= L

^d _K_n

+ ˜ L

_n−K_n

+ n 1. Rescaling: X

n

= (L

n

− E[L

n

])/pVar(L

n

)

X

_n

= A

^d _n,1

X

_K_n

+ A

_n,2

X e

_n−K_n

+ b

_n

2. Find the Limits: (A

_n,1

, A

_n,2

, b

_n

) −→ (( √

2)

⁻¹

, ( √

2)

⁻¹

, 0)

point of a contracting map.

5. Convergence: Prove convergence with respect to that metric.

(84)

The Contraction Method in a Nutshell

Aim: Find a limit law for L

n

(after rescaling properly) L

_n

= L

^d _K_n

+ ˜ L

_n−K_n

+ n 1. Rescaling: X

n

= (L

n

− E[L

n

])/pVar(L

n

)

X

_n

= A

^d _n,1

X

_K_n

+ A

_n,2

X e

_n−K_n

+ b

_n

2. Find the Limits: (A

_n,1

, A

_n,2

, b

_n

) −→ (( √

2)

⁻¹

, ( √

2)

⁻¹

, 0) X =

^d

1 √

2 X + 1

√

2 X e (1)

(85)

The Contraction Method in a Nutshell

Aim: Find a limit law for L

n

(after rescaling properly) L

_n

= L

^d _K_n

+ ˜ L

_n−K_n

+ n 1. Rescaling: X

n

= (L

n

− E[L

n

])/pVar(L

n

)

X

_n

= A

^d _n,1

X

_K_n

+ A

_n,2

X e

_n−K_n

+ b

_n

2. Find the Limits: (A

_n,1

, A

_n,2

, b

_n

) −→ (( √

2)

⁻¹

, ( √

2)

⁻¹

, 0) X =

^d

1 √

2 X + 1

√

2 X e (1)

3. Solution to (1): Existence of a solution to (1). Here: Normal distribution with mean 0.

(86)

The Contraction Method in a Nutshell

Aim: Find a limit law for L

n

(after rescaling properly) L

_n

= L

^d _K_n

+ ˜ L

_n−K_n

+ n 1. Rescaling: X

n

= (L

n

− E[L

n

])/pVar(L

n

)

X

_n

= A

^d _n,1

X

_K_n

+ A

_n,2

X e

_n−K_n

+ b

_n

2. Find the Limits: (A

_n,1

, A

_n,2

, b

_n

) −→ (( √

2)

⁻¹

, ( √

2)

⁻¹

, 0) X =

^d

1 √

2 X + 1

√

2 X e (1)

3. Solution to (1): Existence of a solution to (1). Here: Normal distribution with mean 0.

4. Contraction: Find a metric such that (1) corresponds to the fixed point of a contracting map.

(87)

L

_n

= L

_K_n

+ ˜ L

_n−K_n

+ n 1. Rescaling: X

n

= (L

n

− E[L

n

])/pVar(L

n

)

X

_n

= A

^d _n,1

X

_K_n

+ A

_n,2

X e

_n−K_n

+ b

_n

2. Find the Limits: (A

_n,1

, A

_n,2

, b

_n

) −→ (( √

2)

⁻¹

, ( √

2)

⁻¹

, 0) X =

^d

1 √

2 X + 1

√

2 X e (1)

3. Solution to (1): Existence of a solution to (1). Here: Normal distribution with mean 0.

4. Contraction: Find a metric such that (1) corresponds to the fixed point of a contracting map.

5. Convergence: Prove convergence with respect to that metric.

(88)

Results on L

n

Thm (Jacquet, Regnier ’88; Neininger, R¨ uschendorf 2004):

L

n

− E[L

n

] pVar(L

n

)

−→ N (0, 1)

d

From the analysis of D

_n

:

E[L

n

] = E

"

_n

X

i =1

D

_n

(Ξ

_i

)

#

Thm (Kirschenhofer, Prodinger ’86):

Var(L

n

) = n e Ψ(log

₂

(n)) + O(log

²

(n))

(89)

Results on L

n

Thm (Jacquet, Regnier ’88; Neininger, R¨ uschendorf 2004):

L

n

− E[L

n

] pVar(L

n

)

−→ N (0, 1)

d

From the analysis of D

_n

:

E[L

n

] = E

"

_n

X

i =1

D

_n

(Ξ

_i

)

#

(90)

Results on L

n

Thm (Jacquet, Regnier ’88; Neininger, R¨ uschendorf 2004):

L

n

− E[L

n

] pVar(L

n

)

−→ N (0, 1)

d

From the analysis of D

_n

:

E[L

n

] = E

"

_n

X

i =1

D

_n

(Ξ

_i

)

#

= nE[D

n

]

(91)

Results on L

n

Thm (Jacquet, Regnier ’88; Neininger, R¨ uschendorf 2004):

L

n

− E[L

n

] pVar(L

n

)

−→ N (0, 1)

d

From the analysis of D

_n

:

E[L

n

] = E

"

_n

X

i =1

D

_n

(Ξ

_i

)

#

= nE[D

n

] = n log

₂

(n) + nΨ(log

₂

(n)) + o(n)

(92)

L

n

− E[L

n

] pVar(L

n

)

−→ N (0, 1)

d

From the analysis of D

_n

:

E[L

n

] = E

"

_n

X

i =1

D

_n

(Ξ

_i

)

#

= nE[D

n

] = n log

₂

(n) + nΨ(log

₂

(n)) + o(n)

Thm (Kirschenhofer, Prodinger ’86):

Var(L

n

) = n e Ψ(log

₂

(n)) + O(log

²

(n))

(93)

Summary

Trie:

tree-like data structure to store words

position of a word in the tree ↔ path given by shortest unique prefix Performance:

Consider input: n independent words, each word is a sequence of

’coin tosses’