
doi:10.4236/am.2011.210175 Published Online October 2011 (http://www.SciRP.org/journal/am)

Convergence Rates of Density Estimation in Besov Spaces

Huiying Wang

Department of Applied Mathematics, Beijing University of Technology, Beijing, China E-mail: b200806005@emails.bjut.edu.cn

Received July 29, 2011; revised August 23, 2011; accepted August 30, 2011

Abstract

The optimality of a density estimation on Besov spaces $B^s_{r,q}(\mathbb{R})$ for the $L^p$ risk was established by Donoho, Johnstone, Kerkyacharian and Picard (“Density estimation by wavelet thresholding,” The Annals of Statistics, Vol. 24, No. 2, 1996, pp. 508-539). To show the lower bound of optimal rates of convergence $R_n(B^s_{r,q}, p)$, they use Korostelev and Assouad lemmas. However, the conditions of those two lemmas are difficult to verify. This paper gives another proof of that bound using Fano’s Lemma, which looks a little simpler. In addition, our method can be used in many other statistical models for lower bounds of estimations.

Keywords: Optimal Rate of Convergence, Density Estimation, Besov Spaces, Wavelets

1. Introduction

Wavelet analysis has many applications, one of which is to estimate an unknown density function based on independent and identically distributed (i.i.d.) random samples. Let $(\Omega, \mathcal{F}, P)$ be a probability space and $X_1, \ldots, X_n$ be i.i.d. random variables with an unknown density function $f$. We use $E(X)$ to denote the expectation of a random variable $X$. The sequence

$$R_n(V, p) := \inf_{f_n} \sup_{f \in V} E\|f_n - f\|_p$$

is called the optimal rate of convergence on the functional class $V$ for the $L^p$ risk. Here, $f_n$ is an arbitrary estimator of $f$ with $n$ i.i.d. random samples. Kerkyacharian and Picard [1] study $R_n(V, p)$ when $V$ is a Besov space with matched case.

Donoho, Johnstone, Kerkyacharian and Picard [2] consider unmatched cases. In fact, they show the optimal convergence rates for the Besov class $B^s_{r,q}(\mathbb{R})$ and $L^p$ risk

$$R_n(B^s_{r,q}, p) \sim \begin{cases} \left(\dfrac{\ln n}{n}\right)^{\frac{s - 1/r + 1/p}{2(s - 1/r) + 1}}, & r \le \dfrac{p}{2s+1}; \\[2ex] n^{-\frac{s}{2s+1}}, & r > \dfrac{p}{2s+1}. \end{cases} \tag{1.1}$$
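As a purely numerical illustration of the two regimes in (1.1), the following minimal Python sketch evaluates the rate for given $n$, $s$, $r$, $p$. The function name and the regime labels "sparse" and "dense" are our own shorthand, not notation from the paper, and constants hidden in the rate are ignored.

```python
import math


def rate_in_1_1(n, s, r, p):
    """Evaluate the right-hand side of (1.1), ignoring constants.

    Returns (regime, rate). The labels "sparse" (r <= p / (2s + 1)) and
    "dense" (r > p / (2s + 1)) are illustrative names, not the paper's.
    """
    if n < 2 or r < 1 or p < 1 or s <= 1.0 / r:
        raise ValueError("need n >= 2, r >= 1, p >= 1 and s > 1/r")
    if r <= p / (2 * s + 1):
        alpha = (s - 1 / r + 1 / p) / (2 * (s - 1 / r) + 1)
        return "sparse", (math.log(n) / n) ** alpha
    alpha = s / (2 * s + 1)
    return "dense", n ** (-alpha)


print(rate_in_1_1(1000, 1.0, 2.0, 2.0))  # dense regime, exponent 1/3
print(rate_in_1_1(1000, 2.0, 1.0, 6.0))  # sparse regime, exponent 7/18
```

The sketch only makes the case distinction concrete; it says nothing about the constants in the upper or lower bounds.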

To show the lower bound of (1.1), the authors of [2,3] use Korostelev and Assouad lemmas. However, the conditions of those two lemmas are difficult to verify. In this short paper, we give another proof of the lower bound of (1.1) by using Fano’s lemma [4]. It should be pointed out that Fano’s lemma can be applied to a variety of statistical models; see [5-7].

As usual, $L^p(\mathbb{R})$ ($p \ge 1$) denotes the classical Lebesgue space on the real line $\mathbb{R}$. In particular, $L^2(\mathbb{R})$ stands for the Hilbert space, which consists of all square integrable functions. As a subspace of $L^p(\mathbb{R})$, the Sobolev space with an integer exponent $k$ means

$$W^k_p(\mathbb{R}) := \{ f : f^{(m)} \in L^p(\mathbb{R}),\ m = 0, 1, \ldots, k \}.$$

The corresponding norm is

$$\|f\|_{W^k_p} := \|f\|_p + \|f^{(k)}\|_p.$$

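Moreover, the Besov space $B^s_{p,q}(\mathbb{R})$ [3] ($1 \le p, q \le \infty$, $s = n + \alpha$ and $\alpha \in (0,1]$) can be defined by

$$B^s_{p,q}(\mathbb{R}) := \left\{ f \in W^n_p(\mathbb{R}) : \left\{ 2^{j\alpha} \omega_p^2(f^{(n)}, 2^{-j}) \right\}_{j \in \mathbb{Z}} \in l_q \right\}$$

with the associated norm

$$\|f\|_{B^s_{p,q}} := \|f\|_{W^n_p} + \left\| \left\{ 2^{j\alpha} \omega_p^2(f^{(n)}, 2^{-j}) \right\}_{j \in \mathbb{Z}} \right\|_{l_q},$$

where

$$\omega_p^2(f, t) := \sup_{|h| \le t} \| f(x + 2h) - 2f(x + h) + f(x) \|_p.$$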

In general, it can be shown that compactly supported and $n$ times differentiable functions belong to $B^s_{p,q}(\mathbb{R})$ for $0 < s < n$ and $1 \le p, q \le \infty$.

The Besov space can be discretized by the sequence norm of wavelet coefficients. Many useful wavelets are generated by scaling functions. More precisely, if $\varphi$ is a scaling function with

$$\varphi(x) = \sqrt{2} \sum_{k} h_k \varphi(2x - k),$$

then

$$\psi(x) := \sqrt{2} \sum_{k} (-1)^k h_{1-k} \varphi(2x - k)$$

defines a wavelet [3]. Clearly, when $\varphi$ is compactly supported and continuous, the corresponding wavelet $\psi$ has the same properties. An orthonormal wavelet basis of $L^2(\mathbb{R})$ is generated from dilation and translation of a scaling function and its corresponding wavelet, i.e.

$$\{ \varphi_{j_0 k}, \psi_{jk} \}_{j \ge j_0,\, k \in \mathbb{Z}}, \qquad \varphi_{j_0 k}(x) := 2^{j_0/2} \varphi(2^{j_0} x - k), \qquad \psi_{jk}(x) := 2^{j/2} \psi(2^j x - k).$$

Although wavelet bases are constructed for $L^2(\mathbb{R})$, most of them constitute unconditional bases for $L^p(\mathbb{R})$. A scaling function $\varphi$ is called $t$ regular if $\varphi$ has continuous derivatives of order $t$ and its corresponding wavelet $\psi$ has vanishing moments of order $t$, i.e.

$$\int x^k \psi(x)\,dx = 0, \qquad k = 0, 1, \ldots, t - 1.$$

The following lemma [3] plays an important role in this paper.

Lemma 1.1. Let $\varphi$ be a compactly supported, $t$ regular orthonormal scaling function with the corresponding wavelet $\psi$ and $0 < s < t$. If $f \in L^p(\mathbb{R})$, $s_{0k} := \langle f, \varphi_{0k} \rangle$, $d_{jk} := \langle f, \psi_{jk} \rangle$ and $1 \le p, q \le \infty$, then the following two conditions are equivalent:

1) $f \in B^s_{p,q}(\mathbb{R})$;

2) $\|s_{0\cdot}\|_p + \left\| \left\{ 2^{j(s + 1/2 - 1/p)} \|d_{j\cdot}\|_p \right\}_{j \ge 0} \right\|_q < \infty$.

Furthermore,

$$\|f\|_{B^s_{p,q}} \sim \|s_{0\cdot}\|_p + \left\| \left\{ 2^{j(s + 1/2 - 1/p)} \|d_{j\cdot}\|_p \right\}_{j \ge 0} \right\|_q.$$

Before introducing Fano’s Lemma, we need the notion of Kullback-Leibler distance [4]. Let $P$ and $Q$ be probability measures with $P$ being absolutely continuous with respect to $Q$ (denoted by $P \ll Q$). Then the Kullback-Leibler distance is defined by

$$K(P, Q) := \int_{p \cdot q > 0} p(x) \ln \frac{p(x)}{q(x)}\,dx,$$

where $p$ and $q$ are density functions of $P$ and $Q$ respectively.

Lemma 1.2. (Fano’s Lemma, [4]) Let $(\Omega, \mathcal{F}, P_k)$ be probability measurable spaces and $A_k \in \mathcal{F}$, $k = 0, 1, \ldots, m$. If $A_k \cap A_v = \emptyset$ for $k \ne v$, then with $A^c$ standing for the complement of $A$ and

$$\mathcal{K}_m := \inf_{0 \le v \le m} \frac{1}{m} \sum_{k \ne v} K(P_k, P_v),$$

one has

$$\sup_{0 \le k \le m} P_k(A_k^c) \ge \min\left\{ \frac{1}{2},\ \sqrt{m}\, \exp\left( -3e^{-1} - \mathcal{K}_m \right) \right\}.$$

By Lemma 1.1 and 1.2, we can show the following result:

Theorem 1.1. Let $f \in B^s_{r,q}(\mathbb{R}, L)$ with $1 \le r, q \le \infty$, $1 \le p < \infty$ and $s > 1/r$. If $f_n$ is an estimator of $f$ with $n$ i.i.d. random samples, then

$$\sup_{f \in B^s_{r,q}(\mathbb{R}, L)} E\|f_n - f\|_p \gtrsim \max\left\{ \left(\frac{\ln n}{n}\right)^{\frac{s - 1/r + 1/p}{2(s - 1/r) + 1}},\ n^{-\frac{s}{2s+1}} \right\},$$

where $B^s_{r,q}(\mathbb{R}, L) := \{ f \in B^s_{r,q}(\mathbb{R}) : \|f\|_{B^s_{r,q}} \le L \text{ and } f \text{ has compact support} \}$; the notation $x \gtrsim y$ means $x \ge Cy$ with a constant $C > 0$.

Remark 1.1. Note that

$$\max\left\{ \left(\frac{\ln n}{n}\right)^{\frac{s - 1/r + 1/p}{2(s - 1/r) + 1}},\ n^{-\frac{s}{2s+1}} \right\} = \left(\frac{\ln n}{n}\right)^{\frac{s - 1/r + 1/p}{2(s - 1/r) + 1}}$$

for $r \le \frac{p}{2s+1}$, and, for $r > \frac{p}{2s+1}$ and sufficiently large $n$,

$$\max\left\{ \left(\frac{\ln n}{n}\right)^{\frac{s - 1/r + 1/p}{2(s - 1/r) + 1}},\ n^{-\frac{s}{2s+1}} \right\} = n^{-\frac{s}{2s+1}}.$$

Then Theorem 1.1 is a reformulation of the lower bound in (1.1). By using the idea of reference [5], we prove this theorem in the next section.

2. Proof of Theorem 1.1

Firstly, we prove

$$\sup_{f \in B^s_{r,q}(\mathbb{R}, L)} E\|f_n - f\|_p \gtrsim \left(\frac{\ln n}{n}\right)^{\frac{s - 1/r + 1/p}{2(s - 1/r) + 1}}.$$

One needs to construct $g_k$ such that $g_k \in B^s_{r,q}(\mathbb{R}, L)$ and

$$\sup_k E\|f_n - g_k\|_p \gtrsim \left(\frac{\ln n}{n}\right)^{\frac{s - 1/r + 1/p}{2(s - 1/r) + 1}}.$$

Let $\varphi$ be a compactly supported, $t$ ($t > s$) regular and orthonormal scaling function, and $\psi$ be the corresponding wavelet with $\operatorname{supp} \psi \subset [0, l]$, $l \in \mathbb{N}$, where $\mathbb{N}$ denotes the set of positive integers. Then there exists a compactly supported density function $g$ (i.e. $g(x) \ge 0$ and $\int g(x)\,dx = 1$) satisfying $g \in B^s_{r,q}(\mathbb{R})$ and $g(x)|_{[0,l]} = c_0 > 0$.

Let $\Delta_j := \{0, l, 2l, \ldots, (2^j - 1)l, 2^j l\}$. Then the number of elements in $\Delta_j$ is $2^j + 1$, denoted by $\#\Delta_j = 2^j + 1$. Motivated by [5], one defines $a_j := 2^{-j(s + 1/2 - 1/r)}$ and

$$g_k(x) := g(x) + a_j \psi_{jk}(x) I_{k \ne 2^j l}, \qquad k \in \Delta_j,$$

with $I_{k \ne 2^j l} := 1$ if $k \ne 2^j l$, else $I_{k \ne 2^j l} := 0$. Obviously, $g_{2^j l} = g$, $\int g_k(x)\,dx = 1$ and

$$g_k(x) \ge c_0 - 2^{-j(s - 1/r)} \|\psi\|_\infty \ge 0$$

for large $j$, which implies that $g_k$ is a density function for each $k$.

By the assumptions on $\varphi$, the wavelet $\psi$ is compactly supported and $t$ times differentiable. Therefore, $\psi \in B^s_{r,q}(\mathbb{R})$ (since $t > s$) and $g_k \in B^s_{r,q}(\mathbb{R})$. Because $2^{j(s + 1/2 - 1/r)} a_j = 1$, $\|a_j \psi_{jk}\|_{B^s_{r,q}} \le C$ due to Lemma 1.1, and so is $\|g_k\|_{B^s_{r,q}}$. Hence, $g_k \in B^s_{r,q}(\mathbb{R}, L)$. Clearly,

$$\|g_k - g_{k'}\|_p \ge \|g_k - g_{2^j l}\|_p = a_j \|\psi_{jk}\|_p = 2^{-j(s + 1/p - 1/r)} \|\psi\|_p =: \delta_j \tag{2.1}$$

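for $k \ne k' \in \Delta_j$ with $k \ne 2^j l$, due to $a_j := 2^{-j(s + 1/2 - 1/r)}$. Furthermore, $A_k := \{ \|f_n - g_k\|_p < \delta_j / 2 \}$ satisfies $A_k \cap A_{k'} = \emptyset$ for $k \ne k'$: otherwise the triangle inequality would give $\|g_k - g_{k'}\|_p < \delta_j$, contradicting (2.1). Recall that $\#\Delta_j = 2^j + 1$. By Lemma 1.2,

$$\sup_{k \in \Delta_j} P^n_{g_k}(A_k^c) \ge \min\left\{ \frac{1}{2},\ \sqrt{2^j}\, \exp\left( -3e^{-1} - \mathcal{K}_{2^j} \right) \right\}.$$

Here and after, $P^n_f$ stands for the probability measure corresponding to the density function $f^n(x) := f(x_1) f(x_2) \cdots f(x_n)$. It is easy to see that $P_{g_k} \ll P_{g_0}$ from the constructions of $g_k$. Since $f_n$ is an estimator of the density with $n$ i.i.d. random samples,

$$E\|f_n - g_k\|_p \ge \frac{\delta_j}{2} P^n_{g_k}\left( \|f_n - g_k\|_p \ge \frac{\delta_j}{2} \right) = \frac{\delta_j}{2} P^n_{g_k}(A_k^c).$$

Then,

$$\sup_{k \in \Delta_j} E\|f_n - g_k\|_p \ge \frac{\delta_j}{2} \sup_{k \in \Delta_j} P^n_{g_k}(A_k^c) \ge \frac{\delta_j}{2} \min\left\{ \frac{1}{2},\ 2^{j/2} \exp\left( -3e^{-1} - \mathcal{K}_{2^j} \right) \right\}. \tag{2.2}$$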
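To see how the bound in (2.2) behaves, here is a minimal Python sketch that evaluates the right-hand side of Lemma 1.2 and the resulting risk bound. The function names and the argument `k_m` (standing for the averaged Kullback-Leibler term of Lemma 1.2) are our own illustrative choices; the sketch assumes that term has already been bounded separately.

```python
import math


def fano_bound(m, k_m):
    """Right-hand side of Lemma 1.2: min{1/2, sqrt(m) * exp(-3/e - k_m)}.

    m   : number of alternatives minus one (m >= 1)
    k_m : value (or upper bound) of the averaged Kullback-Leibler term
    """
    if m < 1 or k_m < 0:
        raise ValueError("need m >= 1 and k_m >= 0")
    return min(0.5, math.sqrt(m) * math.exp(-3.0 / math.e - k_m))


def risk_lower_bound(delta, m, k_m):
    """Right-hand side of (2.2): (delta / 2) times the Fano bound."""
    return 0.5 * delta * fano_bound(m, k_m)


# With many alternatives and a small divergence term the minimum is 1/2.
print(fano_bound(16, 0.0))
# With only two hypotheses (m = 1) the exponential term is the smaller one.
print(fano_bound(1, 0.0))
```

The point of the proof that follows is exactly to keep `k_m` small enough, relative to $\sqrt{m}$, that the bound stays away from zero.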

Next, one shows $\mathcal{K}_{2^j} \le c_0^{-1} n a_j^2$. Recall that

$$K(P^n_{f_1}, P^n_{f_2}) := \int f_1^n(x) \ln \frac{f_1^n(x)}{f_2^n(x)}\,dx,$$

$f_1^n(x) = \prod_{i=1}^n f_1(x_i)$ and $f_2^n(x) = \prod_{i=1}^n f_2(x_i)$. Then

$$K(P^n_{f_1}, P^n_{f_2}) = \sum_{i=1}^n \int f_1(x_i) \ln \frac{f_1(x_i)}{f_2(x_i)}\,dx_i = n K(P^1_{f_1}, P^1_{f_2}).$$

Note that $K(P^1_{f_1}, P^1_{f_2}) := \int f_1(x) \ln \frac{f_1(x)}{f_2(x)}\,dx$ and $\ln u \le u - 1$ for $u > 0$. Then

$$K(P^n_{f_1}, P^n_{f_2}) = n \int f_1(x) \ln \frac{f_1(x)}{f_2(x)}\,dx \le n \int f_1(x) \left( \frac{f_1(x)}{f_2(x)} - 1 \right) dx = n \int f_2^{-1}(x) |f_1(x) - f_2(x)|^2\,dx.$$

Hence,

$$\mathcal{K}_{2^j} := \inf_{v \in \Delta_j} 2^{-j} \sum_{k \ne v} K(P^n_{g_k}, P^n_{g_v}) \le 2^{-j} \sum_{k \in \Delta_j,\, k \ne 2^j l} K(P^n_{g_k}, P^n_{g_{2^j l}}).$$

Moreover,

$$\mathcal{K}_{2^j} \le 2^{-j} n \sum_{k \in \Delta_j,\, k \ne 2^j l} \int g^{-1}(x) |g_k(x) - g(x)|^2\,dx. \tag{2.3}$$

According to the definition of $g_k$, $\operatorname{supp}(g_k - g) \subset [0, l]$ and $g(x) = c_0$ on $[0, l]$. Thus,

$$\int g^{-1}(x) |g_k(x) - g(x)|^2\,dx = c_0^{-1} a_j^2 \int |\psi_{jk}(x)|^2\,dx = c_0^{-1} a_j^2$$

by the orthonormality of $\psi_{jk}$. Then (2.3) reduces to

$$\mathcal{K}_{2^j} \le c_0^{-1} n a_j^2. \tag{2.4}$$
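The two facts used on the way to (2.4), namely that the Kullback-Leibler distance of $n$-fold product measures is $n$ times the one-sample distance and that it is dominated by a chi-square type integral, can be checked numerically. The sketch below does this for finite discrete distributions, which is our simplifying assumption (the paper works with densities on the real line); the probability vectors are arbitrary illustrative values and the helper names are hypothetical.

```python
import math


def kl(p, q):
    """Discrete Kullback-Leibler distance; assumes q_i > 0 wherever p_i > 0."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def chi_square(p, q):
    """Discrete analogue of the integral of (f1 - f2)^2 / f2; assumes q_i > 0."""
    return sum((pi - qi) ** 2 / qi for pi, qi in zip(p, q))


def product_law(p, n):
    """Law of n i.i.d. draws from p, flattened into one probability vector."""
    law = [1.0]
    for _ in range(n):
        law = [a * b for a in law for b in p]
    return law


p = [0.5, 0.3, 0.2]  # illustrative values only
q = [0.4, 0.4, 0.2]
n = 3

lhs = kl(product_law(p, n), product_law(q, n))
print(lhs, n * kl(p, q), n * chi_square(p, q))
```

The first two printed numbers should agree up to rounding (tensorization), and both should not exceed the third (the consequence of $\ln u \le u - 1$).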

Take $2^j \sim \left( \frac{n}{\ln n} \right)^{\frac{1}{2(s - 1/r) + 1}}$. Then $n a_j^2 = n 2^{-j(2(s - 1/r) + 1)} \sim \ln n$. Now, one can choose $C > 0$ such that $n a_j^2 \le C \ln n$ and $C[4(s - 1/r) + 2] < c_0$. Therefore,

$$2^{j/2} e^{-c_0^{-1} n a_j^2} \ge 2^{j/2} e^{-c_0^{-1} C \ln n} \gtrsim \left( \frac{n}{\ln n} \right)^{\frac{1}{4(s - 1/r) + 2}} n^{-c_0^{-1} C} \gtrsim 1$$

and (2.2) reduces to $\sup_{k \in \Delta_j} E\|f_n - g_k\|_p \gtrsim \delta_j$. Then the desired conclusion follows from $\delta_j = \|\psi\|_p 2^{-j(s + 1/p - 1/r)}$ by (2.1) and $2^j \sim \left( \frac{n}{\ln n} \right)^{\frac{1}{2(s - 1/r) + 1}}$.
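The choice of the resolution level $j$ can be made concrete with a small Python sketch. It picks the smallest integer $j$ with $2^j \ge (n / \ln n)^{1/(2(s - 1/r) + 1)}$ and evaluates the separation from (2.1). The rounding rule and the default value `psi_norm_p=1.0` are assumptions made for illustration; the paper only fixes $2^j$ up to constants.

```python
import math


def sparse_case_level(n, s, r):
    """Smallest integer j with 2^j >= (n / ln n)^(1 / (2(s - 1/r) + 1))."""
    target = (n / math.log(n)) ** (1.0 / (2 * (s - 1 / r) + 1))
    return max(0, math.ceil(math.log2(target)))


def separation(j, s, r, p, psi_norm_p=1.0):
    """delta_j = ||psi||_p * 2^(-j (s - 1/r + 1/p)); psi_norm_p is a placeholder."""
    return psi_norm_p * 2.0 ** (-j * (s - 1 / r + 1 / p))


def n_times_a_j_squared(n, j, s, r):
    """n * a_j^2 with a_j = 2^(-j (s + 1/2 - 1/r))."""
    return n * 2.0 ** (-j * (2 * (s - 1 / r) + 1))


n, s, r, p = 10**6, 2.0, 2.0, 2.0
j = sparse_case_level(n, s, r)
print(j, separation(j, s, r, p), n_times_a_j_squared(n, j, s, r), math.log(n))
```

For these illustrative parameters $n a_j^2$ stays below $\ln n$, which is the order of magnitude the argument needs.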

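Now, we prove

$$\sup_{f \in B^s_{r,q}(\mathbb{R}, L)} E\|f_n - f\|_p \gtrsim n^{-\frac{s}{2s+1}}.$$

Our proof relies on the following lemma.

Lemma 2.1. (Varshamov-Gilbert) Let $\Theta := \{ \varepsilon = (\varepsilon_1, \ldots, \varepsilon_m),\ \varepsilon_i \in \{0, 1\} \}$. Then there exists a subset $\{\varepsilon^0, \ldots, \varepsilon^M\}$ of $\Theta$ with $\varepsilon^0 = (0, \ldots, 0)$ such that $M \ge 2^{m/8}$ and

$$\sum_{k=1}^m |\varepsilon^i_k - \varepsilon^j_k| \ge \frac{m}{8}, \qquad 0 \le i \ne j \le M.$$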

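It is sufficient to construct $g_{\varepsilon^i}$ ($i = 0, 1, \ldots, M$) such that $g_{\varepsilon^i} \in B^s_{r,q}(\mathbb{R}, L)$ and

$$\sup_i E\|f_n - g_{\varepsilon^i}\|_p \gtrsim n^{-\frac{s}{2s+1}}. \tag{2.5}$$

As above, let $\varphi$ be a compactly supported, $t$ ($t > s$) regular and orthonormal scaling function, and $\psi$ be the corresponding wavelet with $\operatorname{supp} \psi \subset [0, l]$, $l \in \mathbb{N}$. Assume $g \in B^s_{r,q}(\mathbb{R}, L)$, $g|_{[0,l]} = c_0 > 0$ and $\Delta_j := \{0, l, \ldots, (2^j - 1)l\}$. Define $a_j := 2^{-j(s + 1/2)}$ and

$$g_\varepsilon(x) := g(x) + a_j \sum_{k \in \Delta_j} \varepsilon_k \psi_{jk}(x)$$

with $\varepsilon = (\varepsilon_k)_{k \in \Delta_j} \in \{0, 1\}^{2^j}$ (note that $g_0 = g$). Since $\varepsilon_k \in \{0, 1\}$, one knows that $\sum_{k \in \Delta_j} |\varepsilon_k|^r \le 2^j$ and

$$a_j 2^{j(s + 1/2 - 1/r)} \Big( \sum_{k \in \Delta_j} |\varepsilon_k|^r \Big)^{1/r} \le 1.$$

By Lemma 1.1, $\big\| a_j \sum_{k \in \Delta_j} \varepsilon_k \psi_{jk} \big\|_{B^s_{r,q}} \le C$, and so is $\|g_\varepsilon\|_{B^s_{r,q}}$. Hence $g_\varepsilon \in B^s_{r,q}(\mathbb{R}, L)$. Note that the supports of $\psi_{jk}$ for $k \in \Delta_j$ are mutually disjoint. Then

$$g_\varepsilon(x) \ge c_0 - a_j \|\psi_{jk}\|_\infty = c_0 - 2^{-js} \|\psi\|_\infty \ge 0$$

for big $j$. This with $\int g_\varepsilon(x)\,dx = \int g(x)\,dx = 1$ implies that $g_\varepsilon$ is a density function for each $\varepsilon \in \{0, 1\}^{2^j}$. According to Lemma 2.1, there exists $\{\varepsilon^0, \varepsilon^1, \ldots, \varepsilon^M\}$ such that $M \ge 2^{2^{j-3}}$ and

$$\sum_{k \in \Delta_j} |\varepsilon^l_k - \varepsilon^i_k| \ge 2^{j-3}, \qquad i \ne l. \tag{2.6}$$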

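Because $\operatorname{supp} \psi_{jk} \cap \operatorname{supp} \psi_{jk'} = \emptyset$ for $k \ne k' \in \Delta_j$, one knows that

$$\|g_{\varepsilon^l} - g_{\varepsilon^i}\|_p^p = a_j^p \Big\| \sum_{k \in \Delta_j} (\varepsilon^l_k - \varepsilon^i_k) \psi_{jk} \Big\|_p^p = a_j^p \|\psi_{jk}\|_p^p \sum_{k \in \Delta_j} |\varepsilon^l_k - \varepsilon^i_k|^p = 2^{-j(sp + 1)} \|\psi\|_p^p \sum_{k \in \Delta_j} |\varepsilon^l_k - \varepsilon^i_k|^p.$$

This with (2.6) and $\varepsilon^l_k, \varepsilon^i_k \in \{0, 1\}$ leads to $\|g_{\varepsilon^l} - g_{\varepsilon^i}\|_p^p \ge 2^{-3} 2^{-jsp} \|\psi\|_p^p$ and

$$\|g_{\varepsilon^l} - g_{\varepsilon^i}\|_p \ge 8^{-1/p} 2^{-sj} \|\psi\|_p =: \delta_j. \tag{2.7}$$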
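The separated family of binary vectors behind (2.6) and (2.7) can be produced explicitly for small $m$. The sketch below uses a simple greedy search, which is our own illustrative substitute and not the argument by which Lemma 2.1 is proved; it is only feasible for small $m$ because it enumerates all $2^m$ words. Words are encoded as integers whose bits are the $\varepsilon_k$.

```python
import math


def hamming(a, b):
    """Hamming distance between two binary words encoded as integers."""
    return bin(a ^ b).count("1")


def greedy_separated_subset(m, min_dist):
    """Scan all words of length m in increasing order, starting with the zero
    word, and keep a word if it is at distance >= min_dist from all kept words."""
    chosen = []
    for word in range(2 ** m):
        if all(hamming(word, kept) >= min_dist for kept in chosen):
            chosen.append(word)
    return chosen


m = 10
min_dist = math.ceil(m / 8)       # distances are integers, so >= m/8 means >= 2 here
subset = greedy_separated_subset(m, min_dist)
M = len(subset) - 1               # the lemma indexes the family from 0 to M
print(M, 2 ** (m / 8))            # M should be at least 2^(m/8)
```

For $m = 10$ the greedy family is far larger than the $2^{m/8}$ guaranteed by the lemma; the lemma matters because its guarantee holds uniformly in $m$, where exhaustive search is impossible.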

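Clearly, the sets $A_i := \{ \|f_n - g_{\varepsilon^i}\|_p < \delta_j / 2 \}$ ($i = 0, 1, \ldots, M$) satisfy $A_l \cap A_i = \emptyset$ for $i \ne l$. Then Fano’s Lemma yields

$$\sup_{0 \le i \le M} P^n_{g_{\varepsilon^i}}(A_i^c) \ge \min\left\{ \frac{1}{2},\ \sqrt{M}\, \exp\left( -3e^{-1} - \mathcal{K}_M \right) \right\}. \tag{2.8}$$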

On the other hand, it follows that $\mathcal{K}_M \le 2^j c_0^{-1} n a_j^2$ from arguments similar to the proof of (2.4). Take $2^j \sim n^{\frac{1}{2s+1}}$. Then $n a_j^2 = n 2^{-(2s+1)j} \sim 1$. Hence, one can choose a constant $C > 0$ such that $n a_j^2 \le C$ and

$$\sqrt{M}\, e^{-2^j c_0^{-1} n a_j^2} \ge 2^{2^{j-4}} e^{-2^j c_0^{-1} C} \ge 1.$$

Therefore, (2.8) reduces to $\sup_{0 \le i \le M} P^n_{g_{\varepsilon^i}}(A_i^c) \gtrsim 1$ and

$$\sup_{0 \le i \le M} E\|f_n - g_{\varepsilon^i}\|_p \ge \sup_{0 \le i \le M} \frac{\delta_j}{2} P^n_{g_{\varepsilon^i}}\left( \|f_n - g_{\varepsilon^i}\|_p \ge \frac{\delta_j}{2} \right) \gtrsim \delta_j.$$

This with (2.7) and $2^j \sim n^{\frac{1}{2s+1}}$ yields (2.5).

3. Acknowledgements

The author Huiying Wang is grateful to the referees for their valuable comments and thanks her advisor, Professor Youming Liu, for his helpful guidance. This work is supported by the National Natural Science Foundation of China (No. 10871012) and Natural Science Foundation of Beijing (No. 1082003).

4. References

[1] G. Kerkyacharian and D. Picard, “Density Estimation in Besov Spaces,” Statistics & Probability Letters, Vol. 13, No. 1, 1992, pp. 15-24. doi:10.1016/0167-7152(92)90231-S

[2] D. L. Donoho, I. M. Johnstone, G. Kerkyacharian and D. Picard, “Density Estimation by Wavelet Thresholding,” The Annals of Statistics, Vol. 24, No. 2, 1996, pp. 508-539. doi:10.1214/aos/1032894451

[3] W. Härdle, G. Kerkyacharian, D. Picard and A. B. Tsybakov, “Wavelets, Approximation and Statistical Applications,” Springer-Verlag, New York, 1997.

[4] A. B. Tsybakov, “Introduction to Nonparametric Estimation,” (English) Revised and Extended from the 2004 French Original, Translated by Vladimir Zaiats, Springer Series in Statistics, Springer, New York, 2009.

[5] ... 2009, pp. 3362-3395. doi:10.1214/09-AOS682

[6] C. Christophe, “Regression with Random Design: A Minimax Study,” Statistics & Probability Letters, Vol. 77, No. 1, 2007, pp. 40-53. doi:10.1016/j.spl.2006.05.010
