
arXiv:1512.09161v4 [math.DS] 11 Jan 2016

REAL LINE

MRINAL KANTI ROYCHOWDHURY

Abstract. Let P be a Borel probability measure on R generated by an infinite system of self-similar mappings associated with a probability vector. For such a probability measure P , in this paper, the optimal sets of n-means and the nth quantization error are calculated for every natural number n. Moreover, the connection between the rate of convergence of the logarithms of the quantization errors for n going to infinity and the Hausdorff dimension of the limit set of the infinite iterated function system is indicated.

1. Introduction

The history of the theory and practice of quantization dates back to 1948, although similar ideas had appeared in the literature as early as 1897 (see [S]). Quantization is used in many applications such as signal processing and telecommunications, data compression, pattern recognition and cluster analysis (for details see [GG, GN]). It is also closely connected with centroidal Voronoi tessellations. Let $\mathbb{R}^d$ denote the $d$-dimensional Euclidean space, $\|\cdot\|$ denote the Euclidean norm on $\mathbb{R}^d$ for any $d \ge 1$, and $P$ be a Borel probability measure on $\mathbb{R}^d$. Given a finite subset $\alpha \subset \mathbb{R}^d$, the Voronoi region generated by $a \in \alpha$ is defined by

\[ M(a|\alpha) = \{x \in \mathbb{R}^d : \|x - a\| \le \|x - b\| \text{ for all } b \in \alpha\}, \]

i.e., the Voronoi region generated by $a \in \alpha$ is the set of all points in $\mathbb{R}^d$ which are at least as close to $a$ as to every other point of $\alpha$, and the set $\{M(a|\alpha) : a \in \alpha\}$ is called the Voronoi diagram or Voronoi tessellation of $\mathbb{R}^d$ with respect to $\alpha$. A Borel measurable partition $\{A_a : a \in \alpha\}$ of $\mathbb{R}^d$ is called a Voronoi partition of $\mathbb{R}^d$ with respect to $\alpha$ (and $P$) if $A_a \subset M(a|\alpha)$ ($P$-almost surely) for every $a \in \alpha$. The Voronoi tessellation $\{M(a|\alpha) : a \in \alpha\}$ generated by the set of points $\alpha$ is called the centroidal Voronoi tessellation (CVT) if the points $a \in \alpha$ are also the centroids of their corresponding Voronoi regions, i.e., for each $a \in \alpha$,

\[ a = \frac{1}{P(M(a|\alpha))} \int_{M(a|\alpha)} x \, dP = \frac{\int_{M(a|\alpha)} x \, dP}{\int_{M(a|\alpha)} dP}. \]

For details about CVT and its applications one can see [DFG]. If $\alpha$ is a finite set, the error $\int \min_{a \in \alpha} \|x - a\|^2 \, dP(x)$ is often referred to as the variance, cost, or distortion error for $\alpha$ with respect to the probability measure $P$, and is denoted by $V(\alpha) := V(P; \alpha)$. On the other hand, $\inf\{V(P; \alpha) : \alpha \subset \mathbb{R}^d, \ \mathrm{card}(\alpha) \le n\}$ is called the $n$th quantization error for the probability measure $P$, and is denoted by $V_n := V_n(P)$. If $\int \|x\|^2 \, dP(x) < \infty$, then there is some set $\alpha$ for which the infimum is achieved (see [GKL, GL1, GL2]). Such a set $\alpha$, for which the infimum is attained and which contains no more than $n$ points, is called an optimal set of $n$-means. It is known that for a continuous probability measure an optimal set of $n$-means always has exactly $n$ elements (see [GL2]). For a Borel probability measure $P$ on $\mathbb{R}^d$, an optimal set of $n$-means forms a CVT with $n$-means ($n$-generators) of $\mathbb{R}^d$; however, the converse is not true in general (see [DFG, R2]).

2010 Mathematics Subject Classification. 60Exx, 28A80, 94A34.

Key words and phrases. Probability measure, infinite similitudes, optimal quantizers, quantization error, quantization dimension, Hausdorff dimension.

The research of the author was supported by U.S. National Security Agency (NSA) Grant H98230-14-1-0320.

A CVT with $n$-means is called an optimal CVT with $n$-means if the generators of the CVT form an optimal set of $n$-means with respect to the probability distribution $P$. Let us now state the following proposition (see [GG, GL2]).

Proposition 1.1. Let α be an optimal set of n-means, a ∈ α, and M(a) be the Voronoi region generated by a ∈ α. Then for every a ∈ α, (i) P (M(a)) > 0, (ii) P (∂M(a)) = 0, (iii) a = E(X : X ∈ M(a)), and (iv) P -almost surely the set {M(a) : a ∈ α} forms a Voronoi partition of R^d.

For $\kappa > 0$, we define the $\kappa$-dimensional lower and upper quantization coefficients for $P$ by

\[ \underline{Q}^\kappa(P) := \liminf_{n} n^{2/\kappa} V_n(P) \quad \text{and} \quad \overline{Q}^\kappa(P) := \limsup_{n} n^{2/\kappa} V_n(P). \]

If $\underline{Q}^\kappa(P)$ and $\overline{Q}^\kappa(P)$ coincide, the common value is called the $\kappa$-dimensional quantization coefficient for $P$. The lower and the upper quantization dimensions of $P$ are defined to be

\[ \underline{D}(P) := \liminf_{n \to \infty} \frac{2 \log n}{-\log V_n(P)} \quad \text{and} \quad \overline{D}(P) := \limsup_{n \to \infty} \frac{2 \log n}{-\log V_n(P)}. \]

If $\underline{D}(P)$ and $\overline{D}(P)$ coincide, we call the common value the quantization dimension of the probability measure $P$. Quantization dimension measures the speed at which the specified measure of the error goes to zero as $n$ tends to infinity. For details about quantization coefficients and quantization dimensions one is referred to [GL2, P1, P2].

A transformation f : X → X on a metric space (X, d) is called contractive or a contraction mapping if there is a constant 0 < c < 1 such that d(f (x), f (y)) ≤ cd(x, y) for all x, y ∈ X.

On the other hand, $f$ is called a similitude or a similarity mapping if there exists a constant $s > 0$ such that $d(f(x), f(y)) = s \, d(x, y)$ for all $x, y \in X$. Here $s$ is called the similarity ratio of the similarity mapping $f$. It is known that the classical Cantor set $C$ is generated by the two contractive similarity mappings $S_1$ and $S_2$ given by $S_1(x) = \frac{1}{3}x$ and $S_2(x) = \frac{1}{3}x + \frac{2}{3}$ for all $x \in \mathbb{R}$. Let $P$ be a Borel probability measure on $\mathbb{R}$ such that $P = \frac{1}{2} P \circ S_1^{-1} + \frac{1}{2} P \circ S_2^{-1}$, where $P \circ S_i^{-1}$ denotes the image measure of $P$ with respect to $S_i$ for $i = 1, 2$ (see [H], Theorem 4.4(1) for a generalization of self-similar measure). Then $P$ has support the Cantor set $C$. For this probability measure Graf and Luschgy gave a closed formula to determine the optimal sets of $n$-means and the $n$th quantization error for $n \ge 2$. They also showed that the quantization dimension of this probability measure equals the Hausdorff dimension of the Cantor set $C$, but the quantization coefficient for $P$ does not exist (see [GL3]). Later, for $n \ge 2$, L. Roychowdhury gave an induction formula to determine the optimal sets of $n$-means and the $n$th quantization error for a probability distribution $P$ on $\mathbb{R}$ given by $P = \frac{1}{2} P \circ S_1^{-1} + \frac{1}{2} P \circ S_2^{-1}$, which has support the Cantor set generated by $S_1$ and $S_2$, where $S_1(x) = \frac{1}{4}x$ and $S_2(x) = \frac{1}{2}x + \frac{1}{2}$ for all $x \in \mathbb{R}$ (see [R1]). In [R2], the author investigated the optimal sets of $n$-means and the centroidal Voronoi tessellations with $n$ generators, $n \in \mathbb{N}$, for a Borel probability measure $P = \frac{1}{2} P \circ S_1^{-1} + \frac{1}{2} P \circ S_2^{-1}$ on $\mathbb{R}$ supported by a dyadic Cantor set generated by two contractive similarity mappings $S_1$ and $S_2$ such that $S_1(x) = rx$ and $S_2(x) = rx + (1 - r)$ for all $x \in \mathbb{R}$, with similarity ratio $r$ where $0.4364590141 \le r \le 0.4512271429$.

In this paper, we consider a probability measure $P$ on $\mathbb{R}$ which is generated by an infinite collection of similitudes $\{S_j\}_{j=1}^\infty$ on $\mathbb{R}$, where $S_j(x) = \frac{1}{3^j}x + 1 - \frac{1}{3^{j-1}}$ for all $x \in \mathbb{R}$, and $P$ is given by $P = \sum_{j=1}^\infty \frac{1}{2^j} P \circ S_j^{-1}$. For this probability measure we determine the optimal sets of $n$-means and the $n$th quantization error. Besides, we show that, like the classical Cantor distribution considered by Graf-Luschgy in [GL3], the quantization coefficient for the probability measure $P$ considered in our paper does not exist, but the quantization dimension of $P$ exists and equals the Hausdorff dimension of the limit set generated by the infinite similitudes.

2. Basic definitions, lemmas and proposition

Let $\mathbb{N}$ denote the set of all natural numbers, i.e., $\mathbb{N} = \{1, 2, \cdots\}$. By a string or a word $\omega$ over the alphabet $\mathbb{N}$, we mean a finite sequence $\omega := \omega_1 \omega_2 \cdots \omega_k$ of symbols from the alphabet, where $k \ge 1$, and $k$ is called the length of the word $\omega$. The length of a word $\omega$ is denoted by $|\omega|$. A word of length zero is called the empty word, and is denoted by $\emptyset$. We denote the set of all words of length $k$ by $\mathbb{N}^k$. By $\mathbb{N}^*$ we denote the set of all words over the alphabet $\mathbb{N}$ of some finite length $k$, including the empty word $\emptyset$. For any two words $\omega := \omega_1 \omega_2 \cdots \omega_k$ and $\tau := \tau_1 \tau_2 \cdots \tau_\ell$ in $\mathbb{N}^*$, by $\omega\tau := \omega_1 \cdots \omega_k \tau_1 \cdots \tau_\ell$ we mean the word obtained from the concatenation of the two words $\omega$ and $\tau$. For $n \ge 1$ and $\omega = \omega_1 \omega_2 \cdots \omega_n \in \mathbb{N}^*$ we define $\omega^- := \omega_1 \omega_2 \cdots \omega_{n-1}$, i.e., $\omega^-$ is the word obtained from the word $\omega$ by deleting the last letter of $\omega$. Note that $\omega^-$ is the empty word if the length of $\omega$ is one. For $\omega \in \mathbb{N}^*$, by $(\omega, \infty)$ we mean the set of all words $\omega^-(\omega_{|\omega|} + j)$, obtained by concatenating the word $\omega^-$ with the letter $\omega_{|\omega|} + j$ for $j \in \mathbb{N}$, i.e.,

\[ (\omega, \infty) = \{\omega^-(\omega_{|\omega|} + j) : j \in \mathbb{N}\}. \]

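Let $\{S_i\}_{i=1}^\infty$ be an infinite collection of contractive similitudes on $\mathbb{R}$ such that

\[ S_i(x) = \frac{1}{3^i} x + 1 - \frac{1}{3^{i-1}} \]

for each $i \in \mathbb{N}$ and $x \in \mathbb{R}$; the similarity ratio of $S_i$ is $s_i = \frac{1}{3^i}$. With the similitudes let us now associate a probability vector $(p_1, p_2, \cdots)$, where $p_i = \frac{1}{2^i}$ for all $i \in \mathbb{N}$. Then, there exists a unique Borel probability measure $P$ on $\mathbb{R}$ such that

\[ P = \sum_{i=1}^\infty p_i \, P \circ S_i^{-1}, \]

which has support lying in the closed interval $[0, 1]$. This paper deals with this probability measure $P$. For $\omega = \omega_1 \omega_2 \cdots \omega_n \in \mathbb{N}^n$, write

\[ S_\omega := S_{\omega_1} \circ \cdots \circ S_{\omega_n}, \quad J_\omega := S_\omega(J), \quad s_\omega := s_{\omega_1} \cdots s_{\omega_n}, \quad p_\omega := p_{\omega_1} \cdots p_{\omega_n}, \]

where $J := J_\emptyset = [0, 1]$. Then, for any $\omega \in \mathbb{N}^*$, we write

\[ J_{(\omega, \infty)} := \bigcup_{j=1}^\infty J_{\omega^-(\omega_{|\omega|} + j)} \quad \text{and} \quad p_{(\omega, \infty)} := P(J_{(\omega, \infty)}) = \sum_{j=1}^\infty P(J_{\omega^-(\omega_{|\omega|} + j)}) = \sum_{j=1}^\infty p_{\omega^-(\omega_{|\omega|} + j)}. \]

Note that for any $\omega \in \mathbb{N}^*$,

\[ p_{(\omega, \infty)} = \sum_{j=1}^\infty p_{\omega^-(\omega_{|\omega|} + j)} = p_{\omega^-} \sum_{j=1}^\infty \frac{1}{2^{\omega_{|\omega|} + j}} = p_{\omega^-} p_{\omega_{|\omega|}} \sum_{j=1}^\infty \frac{1}{2^j} = p_{\omega^-} p_{\omega_{|\omega|}} = p_\omega. \]

Let us now give the following lemmas.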

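Lemma 2.1. Let $f : \mathbb{R} \to \mathbb{R}$ be Borel measurable and $k \in \mathbb{N}$. Then

\[ \int f \, dP = \sum_{\omega \in \mathbb{N}^k} p_\omega \int f \circ S_\omega \, dP. \]

Proof. We know $P = \sum_{j=1}^\infty p_j \, P \circ S_j^{-1}$, and so by induction $P = \sum_{\omega \in \mathbb{N}^k} p_\omega \, P \circ S_\omega^{-1}$, from which the lemma follows.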

Lemma 2.2. Let $X$ be a random variable with probability distribution $P$. Then, the expectation $E(X)$ and the variance $V := V(X)$ of the random variable $X$ are given by

\[ E(X) = \frac{1}{2} \quad \text{and} \quad V = \frac{1}{8}. \]

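Proof. Using Lemma 2.1, we have

\[ E(X) = \int x \, dP(x) = \sum_{j=1}^\infty \frac{1}{2^j} \int S_j(x) \, dP = \sum_{j=1}^\infty \frac{1}{2^j} \int \Big( \frac{1}{3^j} x + 1 - \frac{1}{3^{j-1}} \Big) dP = \sum_{j=1}^\infty \Big( \frac{1}{6^j} E(X) + \frac{1}{2^j} - \frac{3}{6^j} \Big) = \frac{1}{5} E(X) + 1 - \frac{3}{5}, \]

which implies $E(X) = \frac{1}{2}$. Now,

\[ E(X^2) = \int x^2 \, dP(x) = \sum_{j=1}^\infty \frac{1}{2^j} \int \Big( \frac{1}{3^j} x + 1 - \frac{1}{3^{j-1}} \Big)^2 dP = \sum_{j=1}^\infty \frac{1}{2^j} \int \Big( \frac{1}{9^j} x^2 + \frac{2}{3^j} \Big( 1 - \frac{1}{3^{j-1}} \Big) x + \Big( 1 - \frac{1}{3^{j-1}} \Big)^2 \Big) dP. \]

Since

\[ \sum_{j=1}^\infty \frac{1}{18^j} = \frac{1}{17}, \]

\[ \sum_{j=1}^\infty \frac{1}{2^j} \int \frac{2}{3^j} \Big( 1 - \frac{1}{3^{j-1}} \Big) x \, dP = 2 \sum_{j=1}^\infty \Big( \frac{1}{6^j} - \frac{3}{18^j} \Big) E(X) = \frac{1}{5} - \frac{3}{17} = \frac{2}{85}, \]

and

\[ \sum_{j=1}^\infty \frac{1}{2^j} \Big( 1 - \frac{1}{3^{j-1}} \Big)^2 = \sum_{j=1}^\infty \frac{1}{2^j} \Big( 1 - \frac{2}{3^{j-1}} + \frac{1}{9^{j-1}} \Big) = 1 - \frac{6}{5} + \frac{9}{17} = \frac{28}{85}, \]

we have $E(X^2) = \frac{1}{17} E(X^2) + \frac{2}{85} + \frac{28}{85}$, which yields $E(X^2) = \frac{3}{8}$. Thus,

\[ V(X) = E(X^2) - (E(X))^2 = \frac{3}{8} - \frac{1}{4} = \frac{1}{8}, \]

which is the lemma.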
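The two moment computations in Lemma 2.2 can be re-derived mechanically. The sketch below is only an illustrative check, not part of the paper: it writes $S_j(x) = s_j x + t_j$ with $s_j = 3^{-j}$ and $t_j = 1 - 3 \cdot 3^{-j}$, sums the resulting geometric series exactly, and solves the two linear self-similarity equations for $E(X)$ and $E(X^2)$. The helper name `geom` and the variable names are assumptions introduced here.

```python
from fractions import Fraction

def geom(r):
    """Exact value of r + r^2 + r^3 + ... for a rational r with |r| < 1."""
    r = Fraction(r)
    return r / (1 - r)

# With p_j = 2^-j, s_j = 3^-j and t_j = 1 - 3 * 3^-j, every series below is geometric.
sum_p_s = geom(Fraction(1, 6))                                  # sum p_j s_j
sum_p_t = geom(Fraction(1, 2)) - 3 * geom(Fraction(1, 6))       # sum p_j t_j
sum_p_s2 = geom(Fraction(1, 18))                                # sum p_j s_j^2
sum_p_st = geom(Fraction(1, 6)) - 3 * geom(Fraction(1, 18))     # sum p_j s_j t_j
sum_p_t2 = (geom(Fraction(1, 2)) - 6 * geom(Fraction(1, 6))
            + 9 * geom(Fraction(1, 18)))                        # sum p_j t_j^2

# E(X) = sum p_j (s_j E(X) + t_j), solved for E(X).
mean = sum_p_t / (1 - sum_p_s)

# E(X^2) = sum p_j (s_j^2 E(X^2) + 2 s_j t_j E(X) + t_j^2), solved for E(X^2).
second_moment = (2 * mean * sum_p_st + sum_p_t2) / (1 - sum_p_s2)

variance = second_moment - mean ** 2
print(mean, second_moment, variance)
```

Exact rational arithmetic is used so that the output can be compared directly with the fractions stated in the lemma.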

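Lemma 2.3. For any $k \ge 1$, we have

\[ E(X \mid X \in J_k \cup J_{k+1} \cup \cdots) = 1 - \frac{1}{2} \cdot \frac{1}{3^{k-1}}. \]

Proof.

\[ E(X \mid X \in J_k \cup J_{k+1} \cup \cdots) = \frac{1}{\sum_{j=k}^\infty \frac{1}{2^j}} \sum_{j=k}^\infty \frac{1}{2^j} S_j\Big(\frac{1}{2}\Big) = 2^{k-1} \sum_{j=k}^\infty \frac{1}{2^j} \Big( 1 - \frac{5}{2} \cdot \frac{1}{3^j} \Big) = 2^{k-1} \Big( \frac{1}{2^{k-1}} - \frac{1}{2} \cdot \frac{1}{6^{k-1}} \Big) = 1 - \frac{1}{2} \cdot \frac{1}{3^{k-1}}, \]

which is the lemma.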

Now, the following notes are in order.

Note 2.4. For $k \in \mathbb{N}$, we have $S_k(\frac{1}{2}) = \frac{1}{3^k} \cdot \frac{1}{2} + 1 - \frac{1}{3^{k-1}} = 1 - \frac{5}{2} \cdot \frac{1}{3^k}$. Thus, by Lemma 2.3, for $k \in \mathbb{N}$,

\[ E(X \mid X \in J_k \cup J_{k+1} \cup \cdots) = S_k\Big(\frac{1}{2}\Big) + \frac{1}{3^k} = \frac{1}{2} \big( S_k(1) + S_{k+1}(0) \big). \]

Following the standard theory of probability, for any $x_0 \in \mathbb{R}$, we have $\int (x - x_0)^2 \, dP(x) = V(X) + (x_0 - E(X))^2$. Thus, one can deduce that the optimal set of one-mean consists of the expected value, and the corresponding quantization error is the variance $V$ of the random variable $X$.

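For $\omega \in \mathbb{N}^k$, $k \ge 1$, using Lemma 2.1, we have

\[ E(X : X \in J_\omega) = \frac{1}{P(J_\omega)} \int_{J_\omega} x \, dP(x) = \int_{J_\omega} x \, dP \circ S_\omega^{-1}(x) = \int S_\omega(x) \, dP(x) = E(S_\omega(X)). \]

Since the $S_j$ are similitudes, it is easy to see that $E(S_j(X)) = S_j(E(X))$ for $j \in \mathbb{N}$, and so by induction, $E(S_\omega(X)) = S_\omega(E(X))$ for $\omega \in \mathbb{N}^k$, $k \ge 1$.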

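Note 2.5. For words $\beta, \gamma, \cdots, \delta$ in $\mathbb{N}^*$, by $a(\beta, \gamma, \cdots, \delta)$ we denote the conditional expectation of the random variable $X$ given $J_\beta \cup J_\gamma \cup \cdots \cup J_\delta$, i.e.,

\[ a(\beta, \gamma, \cdots, \delta) = E(X \mid X \in J_\beta \cup J_\gamma \cup \cdots \cup J_\delta) = \frac{1}{P(J_\beta \cup \cdots \cup J_\delta)} \int_{J_\beta \cup \cdots \cup J_\delta} x \, dP. \]

Thus by Note 2.4, for $\omega \in \mathbb{N}^*$, we have

\[ a(\omega) = S_\omega(E(X)) = S_\omega\Big(\frac{1}{2}\Big), \quad \text{and} \quad a(\omega, \infty) = E(X \mid X \in J_{\omega^-(\omega_{|\omega|}+1)} \cup J_{\omega^-(\omega_{|\omega|}+2)} \cup \cdots) = S_{\omega^-(\omega_{|\omega|}+1)}\Big(\frac{1}{2}\Big) + s_{\omega^-(\omega_{|\omega|}+1)}. \tag{1} \]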

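Moreover, for any $\omega \in \mathbb{N}^*$ and $j \ge 1$, since $p_{\omega^-(\omega_{|\omega|}+j)} = p_{\omega^-} p_{\omega_{|\omega|}+j} = p_{\omega^-} p_{\omega_{|\omega|}} p_j = p_\omega p_j$, and similarly $s_{\omega^-(\omega_{|\omega|}+j)} = s_\omega s_j$, for any $x_0 \in \mathbb{R}$, it is easy to see that

\[
\begin{cases}
\displaystyle \int_{J_\omega} (x - x_0)^2 \, dP = p_\omega \int (x - x_0)^2 \, dP \circ S_\omega^{-1} = p_\omega \Big( s_\omega^2 V + \big( S_\omega(\tfrac{1}{2}) - x_0 \big)^2 \Big), \text{ and} \\[2ex]
\displaystyle \int_{J_{(\omega, \infty)}} (x - x_0)^2 \, dP = \sum_{j=1}^\infty p_\omega p_j \Big( s_\omega^2 s_j^2 V + \big( S_{\omega^-(\omega_{|\omega|}+j)}(\tfrac{1}{2}) - x_0 \big)^2 \Big).
\end{cases}
\tag{2}
\]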

The expressions (1) and (2) are useful to obtain the optimal sets and the corresponding quantization errors with respect to the probability distribution P .
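To make the notation in (1) concrete, the following sketch evaluates $a(\omega)$ and $a(\omega, \infty)$ with exact rational arithmetic. It is an illustration under stated assumptions rather than the paper's own procedure: words are represented as Python tuples of positive integers, and the helper names `S`, `ratio`, `a` and `a_inf` are introduced here.

```python
from fractions import Fraction

HALF = Fraction(1, 2)  # E(X), by Lemma 2.2

def S(word, x):
    """Apply S_word = S_{w1} o ... o S_{wn}, where S_j(x) = x/3^j + 1 - 1/3^(j-1)."""
    x = Fraction(x)
    for j in reversed(word):  # the last letter is the innermost map
        x = x / 3**j + 1 - Fraction(1, 3**(j - 1))
    return x

def ratio(word):
    """Similarity ratio s_word = 3^-(w1 + ... + wn)."""
    return Fraction(1, 3 ** sum(word))

def a(word):
    """a(w) = S_w(1/2), the conditional expectation of X on J_w."""
    return S(word, HALF)

def a_inf(word):
    """a(w, oo) from formula (1): S_{w'}(1/2) + s_{w'}, with w' = w^-(w_last + 1)."""
    shifted = word[:-1] + (word[-1] + 1,)
    return S(shifted, HALF) + ratio(shifted)

print(a((1,)), a_inf((1,)))      # the two points of alpha(1)
print(a((1, 1)), a_inf((1, 1)))  # their refinements inside J_1
```

For a one-letter word $k - 1$ with $k \ge 2$, `a_inf((k - 1,))` is the conditional expectation over $J_k \cup J_{k+1} \cup \cdots$, so it can be compared with the closed form of Lemma 2.3.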

The following lemma is easy to prove.

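Lemma 2.6. Let $P$ be the probability measure as defined before and let $\omega \in \mathbb{N}^*$. Then,

\[ \int_{J_{(\omega, \infty)}} (x - a(\omega, \infty))^2 \, dP = \int_{J_\omega} (x - a(\omega))^2 \, dP. \]

Remark 2.7. By (2) and Lemma 2.6, we see that for any $\omega \in \mathbb{N}^*$,

\[ \int_{J_{(\omega, \infty)}} (x - a(\omega, \infty))^2 \, dP = \int_{J_\omega} (x - a(\omega))^2 \, dP = p_\omega s_\omega^2 V. \]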

The following lemma is useful.

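Lemma 2.8. For any two words $\omega, \tau \in \mathbb{N}^*$, if $p_\omega = p_\tau$, then

\[ \int_{J_\omega} (x - a(\omega))^2 \, dP = \int_{J_\tau} (x - a(\tau))^2 \, dP. \]

Proof. Let $\omega, \tau \in \mathbb{N}^*$. Let $\omega = \omega_1 \omega_2 \cdots \omega_k$ and $\tau = \tau_1 \tau_2 \cdots \tau_m$ for some $k, m \in \mathbb{N}$. Then, $p_\omega = p_\tau$ implies $\omega_1 + \omega_2 + \cdots + \omega_k = \tau_1 + \tau_2 + \cdots + \tau_m$, and so $s_\omega = s_\tau$. Thus,

\[ \int_{J_\omega} (x - a(\omega))^2 \, dP = p_\omega s_\omega^2 V = p_\tau s_\tau^2 V = \int_{J_\tau} (x - a(\tau))^2 \, dP, \]

which is the lemma.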

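Definition 2.9. For $n \in \mathbb{N}$ with $n \ge 2$ let $\ell(n)$ be the unique natural number with $2^{\ell(n)} \le n < 2^{\ell(n)+1}$. Write

\[ \alpha(\ell(n)) := \Big\{ a(\omega) : \omega \in \mathbb{N}^* \text{ and } p_\omega = \frac{1}{2^{\ell(n)}} \Big\} \cup \Big\{ a(\omega, \infty) : \omega \in \mathbb{N}^* \text{ and } p_\omega = \frac{1}{2^{\ell(n)}} \Big\}. \]

For $I \subset \alpha(\ell(n))$ with $\mathrm{card}(I) = n - 2^{\ell(n)}$, write

\[
\begin{aligned}
\alpha_n(I) := {} & (\alpha(\ell(n)) \setminus I) \cup \{a(\omega 1) : a(\omega) \in I\} \cup \{a(\omega 1, \infty) : a(\omega) \in I\} \\
& \cup \{a(\omega^-(\omega_{|\omega|} + 1)) : a(\omega, \infty) \in I\} \cup \{a(\omega^-(\omega_{|\omega|} + 1), \infty) : a(\omega, \infty) \in I\}.
\end{aligned}
\]

Remark 2.10. In Definition 2.9, if $n = 2^{\ell(n)}$, then $I = \emptyset$, and so $\alpha_n(I) = \alpha(\ell(n))$.

Using Definition 2.9, we now give a few examples.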

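Example 2.11. Let $n = 3$. Then, $\ell(n) = 1$, $\alpha(1) = \{a(1), a(1, \infty)\} = \{\frac{1}{6}, \frac{5}{6}\}$, $\mathrm{card}(I) = 1$. If $I = \{a(1)\}$, then

\[ \alpha_3(I) = \{a(11), a(11, \infty), a(1, \infty)\} = \Big\{ \frac{1}{18}, \frac{5}{18}, \frac{5}{6} \Big\}. \]

If $I = \{a(1, \infty)\}$, then

\[ \alpha_3(I) = \{a(1), a(2), a(2, \infty)\} = \Big\{ \frac{1}{6}, \frac{13}{18}, \frac{17}{18} \Big\}. \]

Example 2.12. Let $n = 4$. Then, $\ell(n) = 2$, $I = \emptyset$, and so

\[ \alpha_4(I) = \alpha(2) = \{a(11), a(11, \infty), a(2), a(2, \infty)\} = \Big\{ \frac{1}{18}, \frac{5}{18}, \frac{13}{18}, \frac{17}{18} \Big\}. \]

Example 2.13. Let $n = 5$. Then, $\ell(n) = 2$, $\alpha(2) = \{a(11), a(11, \infty), a(2), a(2, \infty)\}$, $I \subset \alpha(2)$ with $\mathrm{card}(I) = 1$. If $I = \{a(11)\}$, then

\[ \alpha_5(I) = \{a(111), a(111, \infty), a(11, \infty), a(2), a(2, \infty)\} = \Big\{ \frac{1}{54}, \frac{5}{54}, \frac{5}{18}, \frac{13}{18}, \frac{17}{18} \Big\}. \]

If $I = \{a(2)\}$, then

\[ \alpha_5(I) = \{a(11), a(11, \infty), a(21), a(21, \infty), a(2, \infty)\} = \Big\{ \frac{1}{18}, \frac{5}{18}, \frac{37}{54}, \frac{41}{54}, \frac{17}{18} \Big\}. \]

If $I = \{a(11, \infty)\}$, then

\[ \alpha_5(I) = \{a(11), a(12), a(12, \infty), a(2), a(2, \infty)\} = \Big\{ \frac{1}{18}, \frac{13}{54}, \frac{17}{54}, \frac{13}{18}, \frac{17}{18} \Big\}. \]

If $I = \{a(2, \infty)\}$, then

\[ \alpha_5(I) = \{a(11), a(11, \infty), a(2), a(3), a(3, \infty)\} = \Big\{ \frac{1}{18}, \frac{5}{18}, \frac{13}{18}, \frac{49}{54}, \frac{53}{54} \Big\}. \]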
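The sets in Examples 2.11 to 2.13 can be generated mechanically from Definition 2.9. The sketch below is one possible way to do this and is not taken from the paper: it uses the fact that $p_\omega = 2^{-\ell}$ exactly when the letters of $\omega$ sum to $\ell$, represents words as tuples, and labels each point by a pair `("pt", word)` for $a(\omega)$ or `("inf", word)` for $a(\omega, \infty)$. All function names and the labelling scheme are assumptions made for illustration.

```python
from fractions import Fraction

def S(word, x):
    """S_word applied to x, with S_j(x) = x/3^j + 1 - 1/3^(j-1)."""
    x = Fraction(x)
    for j in reversed(word):
        x = x / 3**j + 1 - Fraction(1, 3**(j - 1))
    return x

def a(word):
    return S(word, Fraction(1, 2))

def a_inf(word):
    shifted = word[:-1] + (word[-1] + 1,)
    return S(shifted, Fraction(1, 2)) + Fraction(1, 3 ** sum(shifted))

def compositions(total):
    """All words (tuples of positive integers) whose letters sum to `total`."""
    if total == 0:
        return [()]
    return [(first,) + rest
            for first in range(1, total + 1)
            for rest in compositions(total - first)]

def alpha(level):
    """alpha(l): maps each label ('pt' | 'inf', word) with p_word = 2^-l to its point."""
    points = {}
    for w in compositions(level):
        points[("pt", w)] = a(w)
        points[("inf", w)] = a_inf(w)
    return points

def alpha_n(level, chosen):
    """alpha_n(I): every label in `chosen` (the set I) is replaced by its two children."""
    points = {}
    for (kind, w), value in alpha(level).items():
        if (kind, w) not in chosen:
            points[(kind, w)] = value
            continue
        child = w + (1,) if kind == "pt" else w[:-1] + (w[-1] + 1,)
        points[("pt", child)] = a(child)
        points[("inf", child)] = a_inf(child)
    return points

# Example 2.13 with I = {a(2)}
print(sorted(alpha_n(2, {("pt", (2,))}).values()))
```

Here `len(alpha(level))` is $2^{\ell}$, since there are $2^{\ell - 1}$ words whose letters sum to $\ell$ and each contributes two points.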

Let us now prove the following proposition.

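Proposition 2.14. Let $\alpha_n(I)$ be the set as defined in Definition 2.9. Then

\[ \int \min_{a \in \alpha_n(I)} \|x - a\|^2 \, dP = \frac{1}{18^{\ell(n)}} \cdot \frac{1}{8} \Big( 2^{\ell(n)+1} - n + \frac{1}{9} \big( n - 2^{\ell(n)} \big) \Big). \]

Proof. Using the definition of $\alpha_n(I)$, we have

\[
\begin{aligned}
\int \min_{a \in \alpha_n(I)} \|x - a\|^2 \, dP
= {} & \sum_{a(\omega) \in \alpha(\ell(n)) \setminus I} \int_{J_\omega} (x - a(\omega))^2 \, dP + \sum_{a(\omega, \infty) \in \alpha(\ell(n)) \setminus I} \int_{J_{(\omega, \infty)}} (x - a(\omega, \infty))^2 \, dP \\
& + \sum_{a(\omega) \in I} \Big( \int_{J_{\omega 1}} (x - a(\omega 1))^2 \, dP + \int_{J_{(\omega 1, \infty)}} (x - a(\omega 1, \infty))^2 \, dP \Big) \\
& + \sum_{a(\omega, \infty) \in I} \Big( \int_{J_{\omega^-(\omega_{|\omega|}+1)}} (x - a(\omega^-(\omega_{|\omega|}+1)))^2 \, dP + \int_{J_{(\omega^-(\omega_{|\omega|}+1), \infty)}} (x - a(\omega^-(\omega_{|\omega|}+1), \infty))^2 \, dP \Big).
\end{aligned}
\]

Now, using Remark 2.7, we have

\[
\begin{aligned}
& \sum_{a(\omega) \in \alpha(\ell(n)) \setminus I} \int_{J_\omega} (x - a(\omega))^2 \, dP + \sum_{a(\omega, \infty) \in \alpha(\ell(n)) \setminus I} \int_{J_{(\omega, \infty)}} (x - a(\omega, \infty))^2 \, dP \\
& \quad = \sum_{a(\omega) \in \alpha(\ell(n)) \setminus I} p_\omega s_\omega^2 V + \sum_{a(\omega, \infty) \in \alpha(\ell(n)) \setminus I} p_\omega s_\omega^2 V \\
& \quad = \frac{1}{18^{\ell(n)}} \cdot \frac{1}{8} \, \mathrm{card}(\alpha(\ell(n)) \setminus I) = \frac{1}{18^{\ell(n)}} \cdot \frac{1}{8} \big( 2^{\ell(n)+1} - n \big).
\end{aligned}
\]

Again, by Remark 2.7, we have

\[ \sum_{a(\omega) \in I} \Big( \int_{J_{\omega 1}} (x - a(\omega 1))^2 \, dP + \int_{J_{(\omega 1, \infty)}} (x - a(\omega 1, \infty))^2 \, dP \Big) = 2 p_1 s_1^2 V \sum_{a(\omega) \in I} p_\omega s_\omega^2, \]

and

\[ \sum_{a(\omega, \infty) \in I} \Big( \int_{J_{\omega^-(\omega_{|\omega|}+1)}} (x - a(\omega^-(\omega_{|\omega|}+1)))^2 \, dP + \int_{J_{(\omega^-(\omega_{|\omega|}+1), \infty)}} (x - a(\omega^-(\omega_{|\omega|}+1), \infty))^2 \, dP \Big) = 2 p_1 s_1^2 V \sum_{a(\omega, \infty) \in I} p_\omega s_\omega^2. \]

Combining all these,

\[
\begin{aligned}
\int \min_{a \in \alpha_n(I)} \|x - a\|^2 \, dP
& = \frac{1}{18^{\ell(n)}} \cdot \frac{1}{8} \big( 2^{\ell(n)+1} - n \big) + 2 p_1 s_1^2 V \Big( \sum_{a(\omega) \in I} p_\omega s_\omega^2 + \sum_{a(\omega, \infty) \in I} p_\omega s_\omega^2 \Big) \\
& = \frac{1}{18^{\ell(n)}} \cdot \frac{1}{8} \big( 2^{\ell(n)+1} - n \big) + \frac{1}{9} \cdot \frac{1}{8} \cdot \frac{1}{18^{\ell(n)}} \, \mathrm{card}(I) \\
& = \frac{1}{18^{\ell(n)}} \cdot \frac{1}{8} \Big( 2^{\ell(n)+1} - n + \frac{1}{9} \big( n - 2^{\ell(n)} \big) \Big),
\end{aligned}
\]

which is the proposition.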

Corollary 2.15. Let $V_n$ be the $n$th quantization error for every $n \ge 1$. Then,

\[ V_n \le \frac{1}{18^{\ell(n)}} \cdot \frac{1}{8} \Big( 2^{\ell(n)+1} - n + \frac{1}{9} \big( n - 2^{\ell(n)} \big) \Big). \]
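For reference, the bound of Corollary 2.15 is straightforward to tabulate. The sketch below is an illustrative helper, not part of the paper: the function names `level` and `error_bound` are assumptions, and $\ell(1) = 0$ is used for $n = 1$ so that the expression reduces to the variance $V = \frac{1}{8}$.

```python
from fractions import Fraction

def level(n):
    """l(n): the integer with 2^l <= n < 2^(l+1)."""
    return n.bit_length() - 1

def error_bound(n):
    """Right-hand side of Corollary 2.15, as an exact fraction."""
    l = level(n)
    return Fraction(1, 8 * 18**l) * (2 ** (l + 1) - n + Fraction(n - 2**l, 9))

for n in range(1, 9):
    print(n, error_bound(n), float(error_bound(n)))
```

Within one level $\ell$ the expression is affine in $n$ with negative slope, and the values at $n = 2^{\ell+1}$ computed from level $\ell$ and from level $\ell + 1$ agree, so the tabulated bound decreases strictly in $n$.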

In the next sections first we determine the optimal sets of two- and three-means, and then, we will show that the set αn(I) is an optimal set of n-means for P and Vn is the corresponding quantization error.

3. Optimal sets of 2- and 3-means

In this section we determine the optimal sets of n-means for n = 2 and n = 3. The results and the proofs for these two cases are the key to understand the general case.

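Lemma 3.1. Let $\alpha = \{a_1, a_2\}$ be an optimal set of two-means, $a_1 < a_2$. Then, $a_1 = a(1) = \frac{1}{6}$, $a_2 = a(1, \infty) = \frac{5}{6}$, and the corresponding quantization error is $V_2 = \frac{1}{72} = 0.0138889$.

Proof. By Corollary 2.15, $V_2 \le \frac{1}{72}$. Let $\alpha = \{a_1, a_2\}$ be an optimal set of two-means, $a_1 < a_2$. Since $a_1$ and $a_2$ are the centroids of their own Voronoi regions, we have $0 \le a_1 < a_2 \le 1$. If possible, let $a_1 \ge \frac{1}{3}$. Then, using (2), we have

\[ \frac{1}{72} \ge V_2 \ge \int_{J_1} (x - a_1)^2 \, dP = \frac{1}{2} \Big( \frac{1}{9} V + \big( S_1(\tfrac{1}{2}) - a_1 \big)^2 \Big) > \frac{1}{72}, \]

which is a contradiction, and so $a_1 \le \frac{1}{3}$. If $a_2 < \frac{2}{3}$, then using (2), we have

\[
\begin{aligned}
\frac{1}{72} \ge V_2 & \ge \int_{J_{(1, \infty)}} (x - a_2)^2 \, dP > \int_{J_2 \cup J_3 \cup J_4} (x - a_2)^2 \, dP \\
& = \frac{1}{2^2} \Big( \frac{1}{9^2} V + \big( S_2(\tfrac{1}{2}) - a_2 \big)^2 \Big) + \frac{1}{2^3} \Big( \frac{1}{27^2} V + \big( S_3(\tfrac{1}{2}) - a_2 \big)^2 \Big) + \frac{1}{2^4} \Big( \frac{1}{81^2} V + \big( S_4(\tfrac{1}{2}) - a_2 \big)^2 \Big) \\
& > \frac{1}{2^2} \Big( \frac{1}{9^2} V + \big( S_2(\tfrac{1}{2}) - \tfrac{2}{3} \big)^2 \Big) + \frac{1}{2^3} \Big( \frac{1}{27^2} V + \big( S_3(\tfrac{1}{2}) - \tfrac{2}{3} \big)^2 \Big) + \frac{1}{2^4} \Big( \frac{1}{81^2} V + \big( S_4(\tfrac{1}{2}) - \tfrac{2}{3} \big)^2 \Big),
\end{aligned}
\]

which implies $\frac{1}{72} \ge V_2 > 0.0141425 > \frac{1}{72}$, a contradiction. Thus, $a_2 \ge \frac{2}{3}$. Since $0 \le a_1 \le \frac{1}{3} < \frac{2}{3} \le a_2 \le 1$, we have $\frac{1}{3} \le \frac{a_1 + a_2}{2} \le \frac{2}{3}$, and so $J_1 \subseteq M(a_1|\alpha)$ and $J_{(1, \infty)} \subseteq M(a_2|\alpha)$. Thus,

\[ \int \min_{a \in \alpha} \|x - a\|^2 \, dP = \int_{J_1} (x - a_1)^2 \, dP + \int_{J_{(1, \infty)}} (x - a_2)^2 \, dP, \]

which is minimum when $a_1 = a(1) = S_1(\frac{1}{2}) = \frac{1}{6}$ and $a_2 = a(1, \infty) = S_2(\frac{1}{2}) + \frac{1}{3^2} = \frac{5}{6}$, and the corresponding quantization error is $V_2 = \frac{1}{72}$. Hence the lemma.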
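The numerical constants used in the proof of Lemma 3.1 can be reproduced from the first line of (2). The following sketch is an illustrative check only; the helper names `S_half` and `piece_cost` are assumptions introduced here, and the values $V = \frac{1}{8}$, $p_j = 2^{-j}$, $s_j = 3^{-j}$ are those of Section 2.

```python
from fractions import Fraction

V = Fraction(1, 8)  # variance of P, by Lemma 2.2

def S_half(j):
    """S_j(1/2) = 1 - (5/2) / 3^j, as in Note 2.4."""
    return 1 - Fraction(5, 2 * 3**j)

def piece_cost(j, x0):
    """Integral of (x - x0)^2 over J_j, by (2): p_j (s_j^2 V + (S_j(1/2) - x0)^2)."""
    x0 = Fraction(x0)
    return Fraction(1, 2**j) * (V / 9**j + (S_half(j) - x0) ** 2)

# Cost over J_1 when a_1 sits at the excluded boundary value 1/3.
cost_a1_at_third = piece_cost(1, Fraction(1, 3))

# Lower bound over J_2, J_3 and J_4 when a_2 is replaced by 2/3.
lower_bound = sum(piece_cost(j, Fraction(2, 3)) for j in (2, 3, 4))

print(cost_a1_at_third, float(lower_bound), float(Fraction(1, 72)))
```

Both quantities exceed $\frac{1}{72}$, which is what the two contradiction arguments in the proof rely on.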

Using the technique of Lemma 3.1, the following corollary can be proved.

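Corollary 3.2. For any $\omega \in \mathbb{N}^*$, with respect to the probability distribution $P$, the set $\{a(\omega 1), a(\omega 1, \infty)\}$ forms a unique optimal CVT of $J_\omega$, and the set $\{a(\omega^-(\omega_{|\omega|} + 1)), a(\omega^-(\omega_{|\omega|} + 1), \infty)\}$ forms a unique optimal CVT of $J_{(\omega, \infty)}$.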

We now give the following lemma.

Lemma 3.3. Let $\alpha$ be an optimal set of three-means. Then, $\alpha = \{a(11), a(11, \infty), a(1, \infty)\} = \{\frac{1}{18}, \frac{5}{18}, \frac{5}{6}\}$, or $\alpha = \{a(1), a(2), a(2, \infty)\} = \{\frac{1}{6}, \frac{13}{18}, \frac{17}{18}\}$, with quantization error $V_3 = \frac{5}{648} = 0.00771605$.

Proof. Let $\alpha$ be an optimal set of three-means with $\alpha = \{a_1, a_2, a_3\}$, where $a_1 < a_2 < a_3$. Proceeding in the same way as in Lemma 3.1, it can be shown that $0 \le a_1 \le \frac{1}{3}$ and $\frac{2}{3} \le a_3 \le 1$.

Let us now show that $a_2 \notin (\frac{1}{3}, \frac{2}{3})$. Consider the following cases:

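Case 1: If possible let $a_2 \in [\frac{1}{2}, \frac{2}{3})$.

Then, $\frac{1}{2}(a_1 + a_2) < \frac{1}{3}$; otherwise, the quantization error could be strictly reduced by moving $a_2$ to $\frac{2}{3}$. Thus, we have

\[ a_1 < \frac{2}{3} - a_2 \le \frac{2}{3} - \frac{1}{2} = \frac{1}{6} < \frac{2}{9} = S_{12}(0). \]

Since $a_1$ is the centroid of its own Voronoi region, we have

\[ a_1 = \frac{1}{P([0, \frac{1}{6}])} \int_{[0, \frac{1}{6}]} x \, dP = \frac{1}{P([0, \frac{1}{9}])} \int_{[0, \frac{1}{9}]} x \, dP = S_{11}\Big(\frac{1}{2}\Big) = \frac{1}{18}, \]

and so $a_2 < \frac{2}{3} - a_1 = \frac{2}{3} - \frac{1}{18} = \frac{11}{18}$. Again, $\frac{1}{2}(a_2 + a_3) > \frac{2}{3}$; otherwise, the quantization error could be strictly reduced by moving $a_2$ to $\frac{1}{3}$, and thus $a_3 > \frac{4}{3} - a_2 > \frac{4}{3} - \frac{11}{18} = \frac{13}{18}$. Now, for $x \in J_{12} = [\frac{2}{9}, \frac{7}{27}]$, we get

\[ x - a_1 \ge x - \Big( \frac{2}{9} - \frac{1}{18} \Big) = x - \frac{1}{6}, \quad \text{and} \quad a_2 - x \ge \frac{1}{2} - \frac{7}{27} - x = \frac{13}{54} - x, \]

which implies $\min_{a \in \alpha} (x - a)^2 \ge (x - \frac{1}{6})^2$. Similarly, for $x \in J_{21} = [\frac{2}{3}, \frac{19}{27}]$,

\[ x - a_2 \ge x - \Big( \frac{2}{3} - \frac{11}{18} \Big) = x - \frac{1}{18}, \quad \text{and} \quad a_3 - x \ge \frac{13}{18} - \frac{19}{27} - x = \frac{1}{54} - x, \]

which yields $\min_{a \in \alpha} (x - a)^2 \ge (x - \frac{1}{54})^2$. Thus, using (2),

\[ V_3 = \int \min_{a \in \alpha} \|x - a\|^2 \, dP \ge \int_{J_{11}} \Big( x - \frac{1}{18} \Big)^2 dP + \int_{J_{12}} \Big( x - \frac{1}{6} \Big)^2 dP + \int_{J_{21}} \Big( x - \frac{1}{54} \Big)^2 dP = \frac{661}{11664} = 0.0566701. \]

But, by Corollary 2.15, $V_3 \le \frac{1}{18} \cdot \frac{1}{8} \big( 4 - 3 + \frac{1}{9}(3 - 2) \big) = \frac{5}{648} = 0.00771605$. Thus, a contradiction arises in this case, and so $a_2 \notin [\frac{1}{2}, \frac{2}{3})$.

Case 2: If possible let $a_2 \in (\frac{1}{3}, \frac{1}{2}]$.

This leads to a contradiction in a similar way as Case 1.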

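Thus, by Case 1 and Case 2, we have $a_2 \notin (\frac{1}{3}, \frac{2}{3})$, i.e., either $a_2 \in [0, \frac{1}{3}]$ or $a_2 \in [\frac{2}{3}, 1]$. Let us first assume $a_2 \in [0, \frac{1}{3}] = J_1$. Let $\alpha_1 = \{a_1, a_2\}$ and $\alpha_2 = \{a_3\}$. Since $\alpha = \alpha_1 \cup \alpha_2$, by Lemma 2.1, we deduce

\[ V_3 = \int_{J_1} \min_{a \in \alpha_1} (x - a)^2 \, dP + \int_{J_{(1, \infty)}} (x - a_3)^2 \, dP = \frac{1}{18} \int \min_{a \in 3\alpha_1} (x - a)^2 \, dP + \int_{J_{(1, \infty)}} (x - a_3)^2 \, dP. \]

We now show that $S_1^{-1}(\alpha_1)$ is an optimal set of two-means. If $S_1^{-1}(\alpha_1) := 3\alpha_1$ is not an optimal set of two-means, then we could find a set $\beta \subset \mathbb{R}$ with $\mathrm{card}(\beta) = 2$ such that $\int \min_{b \in \beta} (x - b)^2 \, dP < \int \min_{a \in \alpha_1} (x - 3a)^2 \, dP$. But then $(\frac{1}{3}\beta) \cup \alpha_2$ is a set of cardinality three with $\int \min_{a \in \frac{1}{3}\beta \cup \alpha_2} (x - a)^2 \, dP < \int \min_{a \in \alpha} (x - a)^2 \, dP$, which contradicts the optimality of $\alpha$. Thus, $S_1^{-1}(\alpha_1)$ is an optimal set of two-means, i.e., $S_1^{-1}(\alpha_1) = \{a(1), a(1, \infty)\}$, which gives $\alpha_1 = \{a(11), a(11, \infty)\}$.

Again, $V_3$ being the optimal error, we must have $a_3 = a(1, \infty)$. Thus, under the assumption $a_2 \in [0, \frac{1}{3}] = J_1$, we have $\alpha = \{a(11), a(11, \infty), a(1, \infty)\}$, and then using (2), we have $V_3 = \frac{5}{648}$. Let us now assume $\frac{2}{3} \le a_2$. Then $J_1 \subseteq M(a_1|\alpha)$, and so $a_1 = a(1) = \frac{1}{6}$. Let $\beta = \{a_2, a_3\}$. Then,

\[ V_3 = \int_{J_1} \Big( x - \frac{1}{6} \Big)^2 dP + \int_{J_{(1, \infty)}} \min_{b \in \beta} (x - b)^2 \, dP = \frac{1}{144} + \int_{J_{(1, \infty)}} \min_{b \in \beta} (x - b)^2 \, dP. \]

We show that $a_2 \le \frac{7}{9}$. If $a_2 > \frac{7}{9}$, then $a_2 - S_2(\frac{1}{2}) > \frac{7}{9} - \frac{13}{18} = \frac{1}{18}$, which implies

\[ \int_{J_{(1, \infty)}} \min_{b \in \beta} (x - b)^2 \, dP > \int_{J_2} (x - a_2)^2 \, dP = \frac{1}{4} \Big( \frac{1}{81} V + \big( S_2(\tfrac{1}{2}) - a_2 \big)^2 \Big) > \frac{1}{864}, \]

which is not true since $V_3 = \frac{5}{648}$ and $\frac{1}{144} + \frac{1}{864} > \frac{5}{648}$. Thus, $\frac{2}{3} \le a_2 \le \frac{7}{9}$. Similarly, one can show that $\frac{8}{9} \le a_3 \le 1$.

Thus, under the assumption $\frac{2}{3} \le a_2$, we have

\[ \int_{J_{(1, \infty)}} \min_{b \in \beta} \|x - b\|^2 \, dP = \int_{J_2} (x - a_2)^2 \, dP + \int_{J_{(2, \infty)}} (x - a_3)^2 \, dP, \]

which is minimum when $a_2 = a(2)$ and $a_3 = a(2, \infty)$. Hence, in this case we obtain $\alpha = \{a(1), a(2), a(2, \infty)\}$. Thus, the proof of the lemma is complete.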

4. Quantization error and the optimal sets of n-means in the general case

In this section, we determine the optimal sets of $n$-means and the $n$th quantization error for the probability distribution $P$ for all $n \ge 2$.

Let us first state and prove the following proposition. The technique of the proof is adapted from [R1].

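Proposition 4.1. For any $n \ge 2$, let $\alpha_n$ be an optimal set of $n$-means with respect to the probability distribution $P$. Write

\[ W(\alpha_n) := \{\omega \in \mathbb{N}^* : a(\omega) \text{ or } a(\omega, \infty) \in \alpha_n\}, \quad \text{and} \]

\[ \tilde{W}(\alpha_n) := \{\tau \in W(\alpha_n) : p_\tau s_\tau^2 \ge p_\omega s_\omega^2 \text{ for all } \omega \in W(\alpha_n)\}. \]

Then, for any $\tau \in \tilde{W}(\alpha_n)$ the set $\alpha_{n+1} := \alpha_{n+1}(\tau)$, where

\[
\alpha_{n+1}(\tau) =
\begin{cases}
(\alpha_n \setminus \{a(\tau)\}) \cup \{a(\tau 1), a(\tau 1, \infty)\} & \text{if } a(\tau) \in \alpha_n, \\
(\alpha_n \setminus \{a(\tau, \infty)\}) \cup \{a(\tau^-(\tau_{|\tau|} + 1)), a(\tau^-(\tau_{|\tau|} + 1), \infty)\} & \text{if } a(\tau, \infty) \in \alpha_n,
\end{cases}
\]

is an optimal set of $(n + 1)$-means.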

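Proof. Let us first claim that for any $\omega, \tau \in \mathbb{N}^*$, $p_\tau s_\tau^2 \ge p_\omega s_\omega^2$ if and only if

\[
\begin{aligned}
& \int_{J_{\tau 1}} (x - a(\tau 1))^2 \, dP + \int_{J_{(\tau 1, \infty)}} (x - a(\tau 1, \infty))^2 \, dP + \int_{J_\omega} (x - a(\omega))^2 \, dP \\
& \quad \le \int_{J_\tau} (x - a(\tau))^2 \, dP + \int_{J_{\omega 1}} (x - a(\omega 1))^2 \, dP + \int_{J_{(\omega 1, \infty)}} (x - a(\omega 1, \infty))^2 \, dP.
\end{aligned}
\]

Using Remark 2.7, we simplify the above inequality and obtain

\[ \text{LHS} = 2 p_{\tau 1} s_{\tau 1}^2 V + p_\omega s_\omega^2 V = \frac{1}{9} p_\tau s_\tau^2 V + p_\omega s_\omega^2 V, \qquad \text{RHS} = p_\tau s_\tau^2 V + 2 p_{\omega 1} s_{\omega 1}^2 V = p_\tau s_\tau^2 V + \frac{1}{9} p_\omega s_\omega^2 V. \]

Thus, LHS $\le$ RHS if and only if $p_\tau s_\tau^2 \ge p_\omega s_\omega^2$, which is the claim.

We now prove the proposition by induction. By Lemma 3.1, we know that the optimal set of two-means is $\alpha_2 = \{a(1), a(1, \infty)\}$. Here $\tilde{W}(\alpha_2) = W(\alpha_2) = \{1\}$. Since $a(1) \in \alpha_2$, we have $\alpha_3 = \{a(11), a(11, \infty), a(1, \infty)\}$. Again, as $a(1, \infty) \in \alpha_2$, we have $\alpha_3 = \{a(1), a(2), a(2, \infty)\}$.

Clearly by Lemma 3.3, the sets $\alpha_3$ are optimal sets of three-means. Thus, the proposition is true for $n = 2$. Let us now assume that $\alpha_m$ is an optimal set of $m$-means for some $m \ge 2$.

Write

\[ W(\alpha_m) := \{\omega \in \mathbb{N}^* : a(\omega) \text{ or } a(\omega, \infty) \in \alpha_m\}, \quad \text{and} \]

\[ \tilde{W}(\alpha_m) := \{\tau \in W(\alpha_m) : p_\tau s_\tau^2 \ge p_\omega s_\omega^2 \text{ for all } \omega \in W(\alpha_m)\}. \]

If $\tau \notin \tilde{W}(\alpha_m)$, i.e., if $\tau \in W(\alpha_m) \setminus \tilde{W}(\alpha_m)$, then by the claim, if $a(\tau) \in \alpha_m$ the error

\[ \int \min\{(x - a)^2 : a \in (\alpha_m \setminus \{a(\tau)\}) \cup \{a(\tau 1), a(\tau 1, \infty)\}\} \, dP, \]

or if $a(\tau, \infty) \in \alpha_m$ the error

\[ \int \min\{(x - a)^2 : a \in (\alpha_m \setminus \{a(\tau, \infty)\}) \cup \{a(\tau^-(\tau_{|\tau|} + 1)), a(\tau^-(\tau_{|\tau|} + 1), \infty)\}\} \, dP \]

is either equal or larger, in fact strictly larger if $m$ is not of the form $2^k$ for any positive integer $k$, than the corresponding error obtained in the case where $\tau \in \tilde{W}(\alpha_m)$. Hence, for any $\tau \in \tilde{W}(\alpha_m)$ the set $\alpha_{m+1} := \alpha_{m+1}(\tau)$, where

\[
\alpha_{m+1}(\tau) =
\begin{cases}
(\alpha_m \setminus \{a(\tau)\}) \cup \{a(\tau 1), a(\tau 1, \infty)\} & \text{if } a(\tau) \in \alpha_m, \\
(\alpha_m \setminus \{a(\tau, \infty)\}) \cup \{a(\tau^-(\tau_{|\tau|} + 1)), a(\tau^-(\tau_{|\tau|} + 1), \infty)\} & \text{if } a(\tau, \infty) \in \alpha_m,
\end{cases}
\]

is an optimal set of $(m + 1)$-means. Thus, by the principle of mathematical induction, the proposition is true for all positive integers $n \ge 2$, and the proof of the proposition is complete.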
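The inductive step of Proposition 4.1 can be read as a greedy procedure: starting from $\alpha_2$, repeatedly split an element whose word has the largest weight $p_\tau s_\tau^2$. The sketch below simulates this under assumptions made for illustration only. Words are tuples, each element is labelled `("pt", word)` for $a(\omega)$ or `("inf", word)` for $a(\omega, \infty)$, ties are broken by list order, and the distortion of a labelled set is computed as the sum of the piece errors $p_\omega s_\omega^2 V$ from Remark 2.7, taking for granted the paper's identification of Voronoi regions with the pieces $J_\omega$ and $J_{(\omega, \infty)}$.

```python
from fractions import Fraction

V = Fraction(1, 8)  # variance of P, by Lemma 2.2

def weight(word):
    """p_w * s_w^2 = 18^-(w1 + ... + wn)."""
    return Fraction(1, 18 ** sum(word))

def split(label):
    """Children of a label: a(w) -> a(w1), a(w1, oo); a(w, oo) -> a(w'), a(w', oo)."""
    kind, w = label
    child = w + (1,) if kind == "pt" else w[:-1] + (w[-1] + 1,)
    return [("pt", child), ("inf", child)]

def grow(labels):
    """One step of Proposition 4.1: split a label of maximal weight."""
    best = max(labels, key=lambda lab: weight(lab[1]))
    return [lab for lab in labels if lab != best] + split(best)

def error(labels):
    """Sum of the piece errors p_w s_w^2 V over all labels (Remark 2.7)."""
    return sum(weight(w) for _, w in labels) * V

def optimal_labels(n):
    """Labels of one optimal set of n-means (n >= 2), grown from alpha_2."""
    labels = [("pt", (1,)), ("inf", (1,))]  # alpha_2, by Lemma 3.1
    while len(labels) < n:
        labels = grow(labels)
    return labels

for n in (2, 3, 4, 5, 16):
    print(n, error(optimal_labels(n)))
```

A different tie-breaking rule would produce a different optimal set with the same error, which matches the non-uniqueness seen in Lemma 3.3.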


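Lemma 4.2. Let $n \in \mathbb{N}$ be such that $n = 2^k$ for some $k \ge 1$. Then,

\[ \alpha(k) := \Big\{ a(\omega) : p_\omega = \frac{1}{2^k} \Big\} \cup \Big\{ a(\omega, \infty) : p_\omega = \frac{1}{2^k} \Big\} \]

is an optimal set of $n$-means. Set $\alpha_j(k) = \alpha(k) \cap J_j$ for $1 \le j \le k$. Then, $S_j^{-1}(\alpha_j(k))$ is an optimal set of $2^{k-j}$-means for $1 \le j \le k$. Moreover, $n = \sum_{j=1}^k 2^{k-j} + 1$ and

\[ V_n = \sum_{j=1}^k \frac{1}{18^j} V_{2^{k-j}} + \frac{1}{18^k} V_1. \]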

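Proof. Let us prove the lemma by induction. If $n = 2$, by Lemma 3.1, we have that $\alpha(1) = \{a(1), a(1, \infty)\} = \{a(\omega) : p_\omega = \frac{1}{2}\} \cup \{a(\omega, \infty) : p_\omega = \frac{1}{2}\}$ is an optimal set of two-means. Here $\alpha_1(1) = \alpha(1) \cap J_1 = \{a(1)\}$. Note that $\mathrm{card}(\alpha_1(1)) = 1$, and $S_1^{-1}(\alpha_1(1)) = \{\frac{1}{2}\}$ is an optimal set of one-mean. Moreover, $V_2 = \frac{1}{18} V_1 + \frac{1}{18} V_1$. Thus, the lemma is true for $n = 2$. Let the lemma be true if $n = 2^k$ for some $k = m$, where $m \in \mathbb{N}$ and $m \ge 1$. We will show that it is also true for $k = m + 1$. We have

\[ \alpha(m) = \Big\{ a(\omega) : p_\omega = \frac{1}{2^m} \Big\} \cup \Big\{ a(\omega, \infty) : p_\omega = \frac{1}{2^m} \Big\}. \]

List the elements of $\alpha(m)$ as $a_1, a_2, \cdots, a_{2^m}$, i.e., $\alpha(m) = \{a_j : 1 \le j \le 2^m\}$. Construct the sets $A_j$ for $1 \le j \le 2^m$ as follows:

\[
A_j :=
\begin{cases}
\{a(\omega 1), a(\omega 1, \infty)\} & \text{if } a_j = a(\omega) \text{ for some } \omega \in \mathbb{N}^*, \\
\{a(\omega^-(\omega_{|\omega|} + 1)), a(\omega^-(\omega_{|\omega|} + 1), \infty)\} & \text{if } a_j = a(\omega, \infty) \text{ for some } \omega \in \mathbb{N}^*.
\end{cases}
\]

For $1 \le j \le 2^m$, set $\alpha_{2^m + j} = (\alpha(m) \setminus \bigcup_{k=1}^j \{a_k\}) \cup A_1 \cup A_2 \cup \cdots \cup A_j$. Since $\alpha_{2^m}$ is an optimal set of $2^m$-means, by Proposition 4.1, $\alpha_{2^m + 1}$ is an optimal set of $(2^m + 1)$-means, which implies $\alpha_{2^m + 2}$ is an optimal set of $(2^m + 2)$-means, and thus proceeding inductively, one can say that the set

\[ \alpha_{2^{m+1}} := \alpha_{2^m + 2^m} = \Big( \alpha(m) \setminus \bigcup_{k=1}^{2^m} \{a_k\} \Big) \cup A_1 \cup A_2 \cup \cdots \cup A_{2^m} = A_1 \cup A_2 \cup \cdots \cup A_{2^m} \]

is an optimal set of $2^{m+1}$-means. Note that for any $\omega \in \mathbb{N}^*$, if $a(\omega)$ or $a(\omega, \infty) \in A_j$, then $p_\omega = \frac{1}{2^{m+1}}$, and so

\[ \alpha_{2^{m+1}} = \alpha(m + 1) = \Big\{ a(\omega) : p_\omega = \frac{1}{2^{m+1}} \Big\} \cup \Big\{ a(\omega, \infty) : p_\omega = \frac{1}{2^{m+1}} \Big\}. \]

Therefore, by the principle of mathematical induction, the set $\alpha(k)$ is an optimal set of $n$-means if $n \in \mathbb{N}$ and $n = 2^k$ for some $k \ge 1$. To complete the rest of the proof, we proceed as follows. For any $\omega = \omega_1 \omega_2 \cdots \omega_{|\omega|} \in \mathbb{N}^*$, we have $a(\omega) := S_\omega(\frac{1}{2}) \in J_{\omega_1}$. Again, from the definitions of $a(\omega)$ and $a(\omega, \infty)$, if $a(\omega) \in J_{\omega_1}$ and $|\omega| > 1$, then $a(\omega, \infty) \in J_{\omega_1}$. Keeping $\omega_1$ fixed, if $\omega_1 < k$, it is easy to see that there are $2^{k - \omega_1 - 1}$ different $\tau \in \mathbb{N}^*$ such that $p_{\omega_1 \tau} = \frac{1}{2^k}$. Thus, we see that for any $\omega = \omega_1 \omega_2 \cdots \omega_{|\omega|} \in \mathbb{N}^*$ with $|\omega| > 1$ and $p_\omega = \frac{1}{2^k}$, the optimal set $\alpha(k)$ contains $2^{k - \omega_1}$ elements from $J_{\omega_1}$; in other words, $\mathrm{card}(\alpha(k) \cap J_{\omega_1}) = 2^{k - \omega_1}$. If $|\omega| = 1$ and $p_\omega = \frac{1}{2^k}$, i.e., when $\omega = k$, then $a(k) \in J_k$, i.e., $\alpha(k)$ contains only one element from $J_k$. Besides, $\alpha(k)$ contains the element $a(k, \infty)$. Write $\alpha_j(k) = \alpha(k) \cap J_j$. Then, $\mathrm{card}(\alpha_j(k)) = 2^{k-j}$ for $1 \le j \le k$. For any $1 \le j \le k - 1$, by the definition of the mappings, we have

\[ S_j^{-1}(\alpha_j(k)) = \Big\{ a(\omega_{j+1} \cdots \omega_{|\omega|}) : p_{\omega_{j+1} \cdots \omega_{|\omega|}} = \frac{1}{2^{k-j}} \Big\} \cup \Big\{ a(\omega_{j+1} \cdots \omega_{|\omega|}, \infty) : p_{\omega_{j+1} \cdots \omega_{|\omega|}} = \frac{1}{2^{k-j}} \Big\}, \]

and $S_k^{-1}(\alpha_k(k)) = \{\frac{1}{2}\}$. Thus, for all $1 \le j \le k$, one can see that $S_j^{-1}(\alpha_j(k)) = \alpha(k - j)$.

Hence, by the first part of the lemma, for each $1 \le j \le k$, the set $S_j^{-1}(\alpha_j(k))$ is an optimal set of $2^{k-j}$-means. Now,

\[
\begin{aligned}
V_n = \int \min_{a \in \alpha(k)} \|x - a\|^2 \, dP
& = \sum_{j=1}^k \int_{J_j} \min_{a \in \alpha_j(k)} (x - a)^2 \, dP + \int_{J_{(k, \infty)}} (x - a(k, \infty))^2 \, dP \\
& = \sum_{j=1}^k p_j \int \min_{a \in \alpha_j(k)} (x - a)^2 \, dP \circ S_j^{-1} + \int_{J_k} (x - a(k))^2 \, dP,
\end{aligned}
\]

which yields

\[ V_n = \sum_{j=1}^k \frac{1}{18^j} \int \min_{a \in S_j^{-1}(\alpha_j(k))} (x - a)^2 \, dP + \frac{1}{18^k} V_1 = \sum_{j=1}^k \frac{1}{18^j} V_{2^{k-j}} + \frac{1}{18^k} V_1. \]

Thus, the proof of the lemma is complete.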

Remark 4.3. The set $\alpha(k)$ given by Lemma 4.2 is the unique optimal set of $n$-means where $n = 2^k$ for some $k \in \mathbb{N}$.

In regard to Lemma 4.2 let us give the following example.

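Example 4.4. Take $n = 16 = 2^4$. Then,

\[
\begin{aligned}
\alpha(4) = \{ & a(1111), a(1111, \infty), a(112), a(112, \infty), a(121), a(121, \infty), a(13), a(13, \infty), \\
& a(211), a(211, \infty), a(22), a(22, \infty), a(31), a(31, \infty), a(4), a(4, \infty) \}.
\end{aligned}
\]

Since $\alpha_j(4) = \alpha(4) \cap J_j$ for $1 \le j \le 4$, we have

\[
\begin{aligned}
\alpha_1(4) & = \{a(1111), a(1111, \infty), a(112), a(112, \infty), a(121), a(121, \infty), a(13), a(13, \infty)\}, \\
\alpha_2(4) & = \{a(211), a(211, \infty), a(22), a(22, \infty)\}, \\
\alpha_3(4) & = \{a(31), a(31, \infty)\}, \\
\alpha_4(4) & = \{a(4)\}.
\end{aligned}
\]

Here, $S_1^{-1}(\alpha_1(4)) = \{a(111), a(111, \infty), a(12), a(12, \infty), a(21), a(21, \infty), a(3), a(3, \infty)\}$ is an optimal set of $2^3$-means, $S_2^{-1}(\alpha_2(4)) = \{a(11), a(11, \infty), a(2), a(2, \infty)\}$ is an optimal set of $2^2$-means, $S_3^{-1}(\alpha_3(4)) = \{a(1), a(1, \infty)\}$ is an optimal set of 2-means, and $S_4^{-1}(\alpha_4(4)) = \{\frac{1}{2}\}$ is an optimal set of one-mean. Moreover, one can see that

\[ V_{16} = \frac{1}{18} V_8 + \frac{1}{18^2} V_4 + \frac{1}{18^3} V_2 + \frac{1}{18^4} V_1 + \frac{1}{18^4} V_1. \]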
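The recursion of Lemma 4.2 can be checked against the closed form obtained by putting $n = 2^k$ in Proposition 2.14, namely $V_{2^k} = \frac{1}{18^k} \cdot \frac{1}{8} \cdot 2^k = \frac{1}{8 \cdot 9^k}$. The sketch below is a small consistency check written for illustration; the function names `V_pow2` and `recursion_rhs` are assumptions.

```python
from fractions import Fraction

def V_pow2(k):
    """V_n for n = 2^k, i.e. 1 / (8 * 9^k); k = 0 gives V_1 = 1/8."""
    return Fraction(1, 8 * 9**k)

def recursion_rhs(k):
    """Right-hand side of Lemma 4.2: sum_{j=1..k} V_{2^(k-j)} / 18^j + V_1 / 18^k."""
    return sum(V_pow2(k - j) / 18**j for j in range(1, k + 1)) + V_pow2(0) / 18**k

def cardinality_rhs(k):
    """Right-hand side of the counting identity n = sum_{j=1..k} 2^(k-j) + 1."""
    return sum(2 ** (k - j) for j in range(1, k + 1)) + 1

print(V_pow2(4), recursion_rhs(4), cardinality_rhs(4))
```

For $k = 4$ the two sides of the recursion are the five terms displayed in Example 4.4.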

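Theorem 4.5. For $n \in \mathbb{N}$ with $n \ge 2$ let $\ell(n) \in \mathbb{N}$ satisfy $2^{\ell(n)} \le n < 2^{\ell(n)+1}$. Let $\alpha(\ell(n))$ and $\alpha_n(I)$ be the sets as defined in Definition 2.9. Then, $\alpha_n(I)$ is an optimal set of $n$-means with quantization error

\[ V_n = \frac{1}{18^{\ell(n)}} \cdot \frac{1}{8} \Big( 2^{\ell(n)+1} - n + \frac{1}{9} \big( n - 2^{\ell(n)} \big) \Big). \]

The number of such sets is ${}^{2^{\ell(n)}}C_{n - 2^{\ell(n)}}$, where ${}^{u}C_{v} = \binom{u}{v}$ is a binomial coefficient.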

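Proof. By Lemma 4.2, $\alpha(\ell(n))$ is an optimal set of $2^{\ell(n)}$-means. Choose $I \subset \alpha(\ell(n))$ such that $\mathrm{card}(I) = n - 2^{\ell(n)}$. List the elements of $I$ as $a_1, a_2, \cdots, a_{n - 2^{\ell(n)}}$, i.e., $I = \{a_j : 1 \le j \le n - 2^{\ell(n)}\}$.

Construct the sets $A_j$ for $1 \le j \le n - 2^{\ell(n)}$ as follows:

\[
A_j :=
\begin{cases}
\{a(\omega 1), a(\omega 1, \infty)\} & \text{if } a_j = a(\omega) \text{ for some } \omega \in \mathbb{N}^*, \\
\{a(\omega^-(\omega_{|\omega|} + 1)), a(\omega^-(\omega_{|\omega|} + 1), \infty)\} & \text{if } a_j = a(\omega, \infty) \text{ for some } \omega \in \mathbb{N}^*.
\end{cases}
\]

For $1 \le j \le n - 2^{\ell(n)}$, set

\[ \alpha_{2^{\ell(n)} + j} = \Big( \alpha(\ell(n)) \setminus \bigcup_{k=1}^j \{a_k\} \Big) \cup A_1 \cup A_2 \cup \cdots \cup A_j. \]

As shown in Lemma 4.2, proceeding inductively, the set $\alpha_n(I) := \alpha_{2^{\ell(n)} + (n - 2^{\ell(n)})} = (\alpha(\ell(n)) \setminus I) \cup A_1 \cup A_2 \cup \cdots \cup A_{n - 2^{\ell(n)}}$ is an optimal set of $n$-means. Then, using Proposition 2.14, we obtain the quantization error as

\[ V_n = \int \min_{a \in \alpha_n(I)} \|x - a\|^2 \, dP = \frac{1}{18^{\ell(n)}} \cdot \frac{1}{8} \Big( 2^{\ell(n)+1} - n + \frac{1}{9} \big( n - 2^{\ell(n)} \big) \Big). \]

Since the subset $I$ of the set $\alpha(\ell(n))$ can be chosen in ${}^{2^{\ell(n)}}C_{n - 2^{\ell(n)}}$ different ways, the number of sets $\alpha_n(I)$ is ${}^{2^{\ell(n)}}C_{n - 2^{\ell(n)}}$. Thus, the proof of the theorem is complete.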

Remark 4.6. Let $\beta$ be the Hausdorff dimension of the limit set generated by the infinite similitudes $\{S_j\}_{j=1}^\infty$. Then, we know (see [M]):

\[ \sum_{j=1}^\infty \Big( \frac{1}{3^j} \Big)^\beta = 1, \]

which gives $\beta = \frac{\log 2}{\log 3}$, and it is the same as the Hausdorff dimension of the classical Cantor set $C$ generated by the similitudes $S_1$ and $S_2$, where $S_1(x) = \frac{1}{3}x$ and $S_2(x) = \frac{1}{3}x + \frac{2}{3}$ for all $x \in \mathbb{R}$.

Recall that the quantization dimension of a probability measure $P$ is defined to be the number

\[ \lim_{n \to \infty} \frac{2 \log n}{-\log V_n}, \]

if the limit exists in $\mathbb{R}$, and for any $\kappa > 0$ the number $\lim_{n \to \infty} n^{2/\kappa} V_n$, if it exists, is called the $\kappa$-dimensional quantization coefficient for $P$. Since the $n$th quantization error $V_n$ for the probability measure $P$ generated by the infinite similitudes considered in this paper is the same as the $n$th quantization error for the Cantor distribution considered by Graf-Luschgy in [GL3], the following two theorems are also true for the probability measure $P$ in this paper.

Theorem 4.7. (see [GL3, Theorem 6.3]) The set of accumulation points of the sequence $(n^{2/\beta} V_n)_{n \in \mathbb{N}}$ equals

\[ \Big[ \frac{1}{8}, \ f\Big( \frac{17}{8 + 4\beta} \Big) \Big], \]

i.e., the $\beta$-dimensional quantization coefficient for the probability measure $P$ does not exist, where $f : [1, 2] \to \mathbb{R}$ is defined by $f(x) = \frac{1}{72} x^{2/\beta} (17 - 8x)$.

Theorem 4.8. (see [GL3, Theorem 6.6]) The quantization dimension of P equals the Hausdorff dimension β of the limit set generated by the infinite similitudes.
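The asymptotic statements in Remark 4.6 and Theorems 4.7 and 4.8 can be explored numerically from the closed form of Theorem 4.5. The sketch below is an illustration in floating point, not a proof; the function names are assumptions, and $\log V_n$ is evaluated in logarithmic form to avoid underflow for large $n$.

```python
import math

BETA = math.log(2) / math.log(3)  # Hausdorff dimension from Remark 4.6

def log_V(n):
    """log V_n, with V_n taken from Theorem 4.5."""
    l = n.bit_length() - 1  # l(n): 2^l <= n < 2^(l+1)
    bracket = 2 ** (l + 1) - n + (n - 2**l) / 9
    return -l * math.log(18) - math.log(8) + math.log(bracket)

def dimension_estimate(n):
    """2 log n / (-log V_n), whose limit is the quantization dimension."""
    return 2 * math.log(n) / (-log_V(n))

def scaled_error(n):
    """n^(2/beta) * V_n, the sequence considered in Theorem 4.7."""
    return math.exp((2 / BETA) * math.log(n) + log_V(n))

def f(x):
    """f(x) = x^(2/beta) (17 - 8x) / 72 on [1, 2], as in Theorem 4.7."""
    return x ** (2 / BETA) * (17 - 8 * x) / 72

x_star = 17 / (8 + 4 * BETA)  # argument at which Theorem 4.7 evaluates f
print(dimension_estimate(2**200), BETA)
print(scaled_error(2**20), scaled_error(3 * 2**19), f(1.5), f(x_star))
```

For $n = x \cdot 2^{\ell}$ with $x \in [1, 2]$, substituting Theorem 4.5 gives $n^{2/\beta} V_n = f(x)$, so along $n = 2^k$ the scaled error stays at $f(1) = \frac{1}{8}$, while other values of $x$ give larger values up to $f(x^*)$. This is the oscillation that prevents the quantization coefficient from existing.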

References

[DFG] Q. Du, V. Faber and M. Gunzburger, Centroidal Voronoi Tessellations: Applications and Algorithms, SIAM Review, Vol. 41, No. 4 (1999), 637-676.

[GG] A. Gersho and R.M. Gray, Vector quantization and signal compression, Kluwer Academic Publishers, Boston, 1992.

[GKL] R.M. Gray, J.C. Kieffer and Y. Linde, Locally optimal block quantizer design, Information and Control, 45 (1980), 178-198.

[GL1] A. György and T. Linder, On the structure of optimal entropy-constrained scalar quantizers, IEEE Transactions on Information Theory, vol. 48, no. 2, February 2002.

[GL2] S. Graf and H. Luschgy, Foundations of quantization for probability distributions, Lecture Notes in Mathematics 1730, Springer, Berlin, 2000.

[GL3] S. Graf and H. Luschgy, The Quantization of the Cantor Distribution, Math. Nachr., 183 (1997), 113-133.

[GN] R. Gray and D. Neuhoff, Quantization, IEEE Trans. Inform. Theory, 44 (1998), 2325-2383.

[H] J. Hutchinson, Fractals and self-similarity, Indiana Univ. J., 30 (1981), 713-747.

[M] M. Moran, Hausdorff measure of infinitely generated self-similar sets, Monatsh. Math. 122 (1996), 387-399.

[P1] P.L. Zador, Asymptotic quantization error of continuous signals and the quantization dimension, IEEE Transactions on Information Theory, 1982, Vol. 28 Issue 2, 139-149.

[P2] K. Pötzelberger, The quantization dimension of distributions, Math. Proc. Camb. Phil. Soc. (2001), 131, 507-519.

[R1] L. Roychowdhury, Optimal quantizers for probability distributions on nonhomogeneous Cantor sets, arXiv:1512.00379 [stat.CO].

[R2] M.K. Roychowdhury, Quantization and centroidal Voronoi tessellations for probability measures on dyadic Cantor sets, arXiv:1509.06037 [math.DS].

[S] W.F. Sheppard, On the Calculation of the most Probable Values of Frequency-Constants, for Data arranged according to Equidistant Division of a Scale, Proc. London Math. Soc. (1897) s1-29 (1): 353-380.

School of Mathematical and Statistical Sciences, University of Texas Rio Grande Valley, 1201 West University Drive, Edinburg, TX 78539-2999, USA.

E-mail address: [email protected]