• No results found

Nonlinear wavelet density estimation for biased data in Sobolev spaces

N/A
N/A
Protected

Academic year: 2020

Share "Nonlinear wavelet density estimation for biased data in Sobolev spaces"

Copied!
15
0
0

Loading.... (view fulltext now)

Full text

(1)

R E S E A R C H

Open Access

Nonlinear wavelet density estimation for

biased data in Sobolev spaces

Jinru Wang

*

, Meng Wang and Yuan Zhou

*Correspondence:

[email protected] Department of Applied Mathematics, Beijing University of Technology, Beijing, 100124, China

Abstract

In this paper, we consider the density estimation problem from independent and identically distributed (i.i.d.) biased observations. We develop an adaptive wavelet hard thresholding rule and evaluate its performance by consideringLprisk over

Sobolev balls. We prove that our estimation attains a sharp rate of convergence and show the optimality.

MSC: 49K40; 90C29; 90C31

Keywords: Sobolev spaces; wavelet density estimation; hard thresholding; convergence rate

1 Introduction

In practice, it usually happens that drawing a direct sample from a random variableX is impossible. In this paper, we consider the problem of estimating the density functions fX(x) without observing directly the i.i.d. sampleX,X, . . . ,X

n. We observe the samples

Y,Y, . . . ,Ynfrom biased data with the following density function:

fY(x) =g(x)f

X(x)

μ , (.)

whereg(x) is the so-called weight or bias function,μ=E(g(X)). The purpose of this paper is to estimate the density functionfX(x) from the samplesY,Y, . . . ,Y

n.

Several examples of this biased data can be found in the literature. For instance, in paper [], it is shown that the distribution of the concentration of alcohol in the blood of intoxi-cated drivers is of interest, since the drunken driver has a larger chance of being arrested, the collected data are size-biased.

The density estimation problem for biased data (.) has been discussed in several pa-pers. In , Vardi [] considered the nonparametric maximum likelihood estimation for fX(x). In , Jones [] discussed the mean squared error properties of the kernel density

estimation. In , Efromovich [] developed the Efromovich-Pinsker adaptive Fourier estimator. It was based on a blockwise shrinkage algorithm and achieved the minimax rate of convergence under theLrisk over a Besov classBs,.

In , Ramírez and Vidakovic [] proposed a linear wavelet estimator and discussed the consistency of a function inL[, ] under the mean integrated squared error (MISE) sense. But the wavelet estimator in paper [] contained the unknown parameterμ. In the same year, Christophe [] constructed a nonlinear wavelet estimator and evaluated theLp

(2)

risk in the Besov spaceBs

r,q. However, Sobolev spacesWrN (N∈N+) exceptr=  is not a

special case in the Besov spaceBs r,q.

In this paper, we consider the nonlinear hard thresholding wavelet density estimation for biased data in Sobolev spacesWN

r (N∈N+). We mainly give the upper bound of minimax

rate of convergence under theLp risk without particular restriction on the parametersr

andp, and the convergence rate is optimal.

2 Preliminaries

In this section, we shall recall some well-known concepts and lemmas.

2.1 Wavelets

In this paper, we always assume that the scaling waveletϕis orthonormal, compactly sup-ported andN+  regular.

Definition . The scaling functionϕ(x) is calledmregular ifϕ(x) has continuous deriva-tives of ordermand its corresponding waveletψ(x) has vanishing moments of orderm, i.e.,xkψ(x)dx= ,k= , , . . . ,m– .

The following conditions about the scaling functionϕand the kernel functionK(x,y) will be very useful in the third section.

Condition(θ) The functionθϕ(x) =

k∈Z|ϕ(x–k)|is such thatess supx∈Rθϕ(x) <∞.

Condition H(N) There exists an integrable function F(x) such that for any x,y∈R,

|K(x,y)| ≤F(xy), where|x|NF(x)dx<.

ConditionM(N) ConditionH(N) is satisfied andK(x,y)(yx)kdy=δ

k,k= , . . . ,N,

x∈R.

For any x∈R,j,k∈Z, denoted byϕjk(x) :=  j

ϕ(jxk),ψjk(x) := 

j

ψ(jxk), then for anyf(x)∈Lr(R) :={f(x)|

R|f(x)|rdx<∞}, where ≤r<∞, we have the following

equation []:

f(x) =

k∈Z

αJ,kϕJ,k(x) +

jJ

k∈Z

βj,kψj,k(x), a.e., (.)

where

αJ,k=

Rf(x)

ϕJ,k(x)dx, βj,k=

Rf(x)

ψj,k(x)dx.

2.2 Sobolev space The Sobolev spaceWN

r (R) (N∈N+) is defined byWrN(R) :={f :fLr(R),f(N)∈Lr(R)},

which is equipped with the normfWN

r :=fr+f

(N)

r. The Sobolev ballsW˜rN(A,L) are

defined as follows:

˜

WrN(A,L) :=fWrN(R) :f is a probability density function,suppfA,

(3)

Between a Sobolev space and a Besov space, the following embedding conclusions are established.

Lemma .[] Let s> , ≤p,q,r≤ ∞,then (i) WrNBNrBN∞∞–/r,∀N> /r; (ii) BsrqBspq,∀r<p,s =s– /r+ /p,

where AB denotes that the Banach space A is continuously embedding in the Banach space B,i.e.,there exists a constant c≥such that for any uA,we haveuBcuA.

2.3 Auxiliary lemmas

The following lemmas given by [] will be used in the next section.

Lemma . If the scaling function ϕ satisfies Condition (θ), then for any sequence

{λk}k∈Z satisfying λlp := (

k|λk|p)

p <, we have Cλ lp(

j

–

j

p) ≤

kλkϕj,kp

Cλlp( j

–

j

p),where C= (θ ϕ

p

ϕ

q

)–,C= (θϕ

q

ϕ

p

)–, ≤p≤ ∞, p+

q= .

Lemma . For some integer N≥,if the kernel function K(x,y)satisfies Conditions M(N) and H(N+ ),fBs

pq(R),where≤p,q≤ ∞,  <s<N+ ,then we haveKjffp= –jsεj,

whereεjlq.

Lemma .(Rosenthal inequality) Let X, . . . ,Xnbe independent random variables such

that E(Xi) = and E(|Xi|p) <∞,then there exists a constant C(p) > such that

E

n

i= Xi

p

C(p)

n

i= E|Xi|p

+

n

i= EXi

p/

, p> ,

E

n

i= Xi

p

n

i= EXi

p/

,  <p≤.

Lemma . (Bernstein inequality) Let X,X, . . . ,Xn be independent random variables

such that E(Xi) = ,E(Xi)≤σ,|Xi| ≤M<∞.Then

P

n

n

i= Xi

>λ

≤exp

(σ+Mλ/)

, ∀λ> .

Remark In this paper, we often use the notationABto indicate thatAcBwith a

positive constantc, which is independent ofAandB. IfABandBA, we writeAB. 3 Main results

In this paper, our hard thresholding wavelet density estimator is defined as follows:

ˆ

fXnon

n (x) =

k

ˆ

αjkϕjk(x) +

j

j=j

k

ˆ

βjkψjk(x), (.)

where

ˆ

αjk:=

ˆ

μ

n

n

i=

ϕjk(Yi)

g(Yi)

, βˆjk:= ˆ μ

n

n

i=

ψjk(Yi)

g(Yi)

, μˆ:=nn

i=g(Yi)

(4)

The hard thresholding wavelet coefficients areβˆjk∗ :=βˆjkI{| ˆβjk| ≥λ}, where

I| ˆβjk| ≥λ

:=

⎧ ⎨ ⎩

, | ˆβjk| ≥λ,

, | ˆβjk|<λ.

Suppose that the parametersj,j,λof the wavelet thresholding estimator (.) satisfy the assumptions:

j∼

⎧ ⎨ ⎩

((lnn)prrn)N+, r> pN+,

n(N–/–/pr)+, rpN+,

(.)

j∼

⎧ ⎨ ⎩

n

N

N(N+), r> pN+, ( n

lnn)

(N–/r)+, rpN+,

(.)

λ=c

j

n, (.)

wherecis a suitably chosen positive constant.

Lemma . Suppose that there exist two constants gand gsuch that <gg(x)g<

for x∈R.Letαjk,βjkbe the coefficients in the expansion(.)and letαˆjk,βˆjk be defined

by estimator in(.).Ifjn,then for any≤p<∞,we have (i) E|αjkαˆjk|pn

p

;

(ii) E|βjkβjkˆ |pnp.

Proof (i) From the definition ofαˆjkand the triangular inequality, we have

| ˆαj,kαj,k|=

ˆ

μ

n

n

i=

ϕj,k(Yi)

g(Yi)

αj,k

=

ˆ

μ μ

μ

n

n

i=

ϕj,k(Yi)

g(Yi)

αj,k

+μαˆ j,k

μ

ˆ

μ

μˆ

μ μnn

i=

ϕj,k(Yi)

g(Yi)

αj,k

+| ˆμαj,k| μˆ – 

μ .

Sincegg(y)g, we have

ˆ

μ=nn

i=g(Yi)

g, μ=Eg(X)g,

and

|αj,k| ≤ fX(y)ϕj,k(y)dyfX(y)dy

ϕj,k(y)dy

 

A/f∞.

Furthermore, a Sobolev space and a Besov space have the following embedding the-orem,WN

(5)

fWN

r =c. Therefore, by the convexity inequality, we get

E| ˆαj,kαj,k|p

E g g

μ

n

n

i=

ϕj,k(Yi)

g(Yi)

αj,k

+gA/c μˆ – 

μ

p

≤p–max

g g

,gA/c

p

E

μ

n

n

i=

ϕj,k(Yi)

g(Yi)

αj,k p

+

ˆ

μ

μ p

E

μ

n

n

i=

ϕj,k(Yi)

g(Yi)

αj,k p

+E

ˆ

μ

μ p

=:T+T,

whereT:=E|μnni=ϕgj,(kYi(Yi))–αj,k|p,T:=E|μˆ

μ| p.

The termTiis estimated as follows. Firstly, letξi:=μ ϕj,k(Yi)

g(Yi) –αj,k, we can see that they are i.i.d., andE(ξi) = . Moreover, for anym≥,

E|ξi|m =E

μϕj,k(Yi)

g(Yi)

αj,k m

≤m–

Eμϕj,k(Yi)

g(Yi)

m+|αj,k|m

,

where

Eμϕj,k(Yi)

g(Yi)

m =μm–ϕ m–

j,k (Yi)

g(Yi)m–

Eμϕ

j,k(Yi)

g(Yi)

gm–gm+j(m–)ϕm– ∞ E

μϕ

j,k(Yi)

g(Yi) ,

and

Eμϕ

j,k(Yi)

g(Yi)

=

μϕ

j,k(y)

g(y) f

Y(y)dy=

μϕ

j,k(y)

g(y)

g(y)fX(y)

μ dy

f∞≤ fWrN =c.

So, we have

Eμϕj,k(Yi)

g(Yi)

mgm–gm+j(m–)ϕm– ∞ c.

Since jn, we obtain

E|ξi|mCj(m–)/n(m–)/.

By Rosenthal’s inequality, we have

T=E

n

n

i=

ξi p

npnE|ξi|p+np/

E|ξi| p/

(6)

To estimate the termT, letηi=g(Yi)μ. We can computeE(ηi) =  easily, and for any

m≥,E|ηi|mC.

Ifp≥,i.e.,  –p< –p/, using Rosenthal’s inequality, we have

T=E

n n i= ηi p

npnE|ηi|p+np/

E|ηi| p/

np++np/np/. (.)

If ≤p< , we get

T=E

n n i= ηi p

npnp/E|ηi| p/

np/. (.)

By (.), (.) and (.), we obtain

E| ˆαj,kαj,k|pT+Tnp/.

(ii) It is similar to (i), we omit it.

Lemma . If jjn,then for anyω> ,there exists a constant c> such that

P

| ˆβjkβjk|>λ=c

j n

–ωj. (.)

Proof We can easily get

ˆ

μg, μg,

μg

–  ,

|βj,k| ≤A/f∞≤A/fWrN.

Therefore,

| ˆβj,kβj,k|= ˆ μ μ μ n n i=

ψj,k(Yi)

g(Yi)

βj,k

+μβˆ j,kμ–  ˆ μg gn n i=

μψj,k(Yi)

g(Yi)

βj,k

+gA/fXWrNn n i=  g(Yi)

–  μ =: g gn n i= ξi

+gA/fXWN rn n i= ηi ,

whereξi=μ ψj,k(Yi)

g(Yi) –βj,k,ηi= 

g(Yi)– 

μ. So, we get

P| ˆβj,kβj,k|>λ

P ggn n i= ξi

+gA/fXWN rn n i= ηi >λ

P g g  n n i= ξi >λ/

+P gA/fXWN rn n i= ηi >λ/

(7)

=P n

n

i=

ξi >

λg g

+P n

n

i=

ηi >

λ

gA/fX WN

r

=:P+P,

whereP:=P(|n

n

i=ξi|>λgg),P:=P(|n

n

i=ηi|>g λ

A/fXW N

r

).

Now, we estimateP. Clearly,Eξi= , and

i =E

μψj,k(Yi)

g(Yi)

βj,k

≤

Eμψj,k(Yi)

g(Yi) +βj,k

= 

Eμψ

j,k(Yi)

g(Yi)

g(Yμi)+βj,k

≤

g g

fXWN r +Af

X

WrN

:=σ.

Furthermore, we have

|ξi|=

μψj,k(Yi)

g(Yi)

E

μψj,k(Yi)

g(Yi)

≤·j/gg– ψ∞.

By Bernstein’s inequality, we obtain

P

n

n

i=

ξi >

λg g

≤exp

g /g (σ+λg

g·

j/gg–

ψ∞/)

= exp

n·c

j n·g

 /g (σ+gcj/nj/gg–

ψg/)

= exp

c

jg /g (σ+jj/nψc/)

.

Sincejjn, then

P n

n

i=

ξi >

λg g

≤exp

c

jg /g (σ+ψc/)

= exp

c

g /g (σ+ψc/)j

.

Takingc>  such that

cg/g

(σ+ψc/)≥ω, then

P=P

n

n

i=

ξi >

λg g

(8)

Next, we estimateP. We compute thatEηi= ,i.e.,

Eηi=E

g(Yi)

E

μ

=

g(y)f

Y(y)dyμ

=

g(y)

g(y)fX(y) μ dy

μ

= 

μ

fX(y)dyμ= ,

and

i =E

g(Yi)

– 

μ

≤

Eg(Yi)

+ 

μ

≤ 

g,

|ηi|=

g(Yi)– 

μ

= 

g(Yi)

Eg(Yi)

≤g–.

By Bernstein’s inequality, we obtain

P

n

n

i=

ηi >

λ

gA/fXWrN

≤exp

n(λ/(gA /fX

WrN))

(

g 

+λg–

 /(gA/fXWrN))

= exp

ncj/(ngAfX

WrN)

(

g+c

j/n/(ggA/fX WrN))

.

Sincejn, then

P

n

n

i=

ηi >

λ

gA/fXWrN

≤exp

c/(gAfX

WrN)

(

g +c/(ggA

/fX WrN))

j

.

Takingc>  such that

c/(gAfX

W Nr

)

(

g+c/(ggA

/fX W Nr ))

ω, we have

P=P n

n

i=

ηi >

λ

gA/fX WrN

≤e–ωjωj. (.)

Takingc=max{c,c}, by (.) and (.), we have

P| ˆβj,kβj,k|>λ

(9)

Lemma . Suppose that there exist two constants gand gsuch that <gg(x)g<

∞,for x∈R,andβˆjk,βˆjkare given by(.).Then

E j

j=j

k ˆ

βjk∗ –βjk ψjk p ⎧ ⎪ ⎪ ⎪ ⎨ ⎪ ⎪ ⎪ ⎩

(lnn)cn–NN+, r> pN+,

(lnn)c(lnn

n ) N

(N–/r)+, r= pN+,

(lnnn)(N–/Nr)+, r< pN+,

where c,care constants.

Proof By Lemma ., we obtain

E j

j=j

k ˆ

βjk∗ –βjk ψjk pE j

j=j

k ˆ

βjk∗ –βjk ψjk p E j

j=j

j(–p) k

βˆjk∗ –βjk p

p

.

Furthermore, sinceβˆjk∗ =βˆjkI{| ˆβjk|>λ}, we have

E j

j=j

k ˆ

βjk∗ –βjk ψjk p E j

j=j

j(–p) k

| ˆβjkβjk|pp × I

| ˆβjk|>λ,|βjk| ≥ λ

+I

| ˆβjk|>λ,|βjk|< λ  + j

j=j

j(–p) k

|βjk|p

p

I| ˆβjk| ≤λ,|βjk| ≤λ

+I| ˆβjk| ≤λ,|βjk|> λ

.

Note that

I

| ˆβjk|>λ,|βjk|< λ

I

| ˆβjkβjk|> λ

,

I| ˆβjk| ≤λ,|βjk|> λ

I

| ˆβjkβjk|> λ

,

and if| ˆβjk| ≤λ, |βjk|> λ, we get| ˆβjkβjk| ≥ |βjk|–| ˆβjk|>

|βjk|

 ,i.e.,|βjk|< | ˆβjkβjk|; therefore, we have

E j

j=j

k ˆ

βjk∗ –βjk ψjk p E j

j=j

j(–p) k

| ˆβjkβjk|pI

|βjk| ≥ λ   p +E j

j=j

j(–p) k

| ˆβjkβjk|pI

| ˆβjkβjk|> λ

(10)

+

j

j=j

j(–p) k

|βjk|pI

|βjk| ≤λ

p

=:W+W+W,

where

W:=E

j

j=j

j(–p) k

| ˆβjkβjk|pI

|βjk| ≥λ

p

,

W:=E

j

j=j

j(–p) k

| ˆβjkβjk|pI

| ˆβjkβjk|> λ

p

,

W:=

j

j=j

j(–p) k

|βjk|pI

|βjk| ≤λ

p

.

(i) Firstly, we estimate

W:=E

j

j=j

j(–p) k

| ˆβjkβjk|pI

|βjk| ≥ λ

p

.

By Lemma ., we have

E| ˆβjkβjk|pnp

.

UsingI{|βjk| ≥λ} ≤(

|βjk|

λ

)rand Jensen’s inequality, we obtain

W =E

j

j=j

j(–p) k

| ˆβjkβjk|pI

|βjk| ≥ λ

p

j

j=j

j(–p) k

E| ˆβjkβjk|pI

|βjk|> λ

p

j

j=j

j(–p) k

np

| βjk| λ/

r

p

=

j

j=j

j(–p)n–λ

r pβ

j·

r p r.

Byβj·r–j(N+

–r)andλc

lnn

n , we have

W

j

j=jn–j(

–p)j(N+–r)rp

n

lnn

r

p

=n–

n

lnn

r

pj

j=j –

n

rp

(11)

Using Lemma . and (.), we obtain

W

⎧ ⎪ ⎪ ⎪ ⎨ ⎪ ⎪ ⎪ ⎩

(lnn)cn–NN+, r> pN+,

(lnn)c(lnn

n ) N

(N–/r)+, r= pN+,

(lnnn)(N–/Nr)+, r< pN+,

(.)

wherec,care constants. (ii) For

W:=

j

j=j

j(–p) k

|βjk|pI

|βjk| ≤λ

p

,

letξ:=(rp(N+ ) – ). ByI{|βjk| ≤λ} ≤(|βλ jk|)

pr(r<p), we have

W

j

j=j

j(–p) k

|βjk|p

λ

|βjk| p–rp

=

j

j=j

j(–p)(λ) pr

p

k

|βjk|r

p

=

j

j=j

j(–p)(λ) pr

p β j·

r p r.

SincefXWN

r (R), thenβj·r–j(N+

–r). Takingλ=c

j nc

lnn

n ,jjC(lnn),

we have

W

j

j=j

j(–p)λ pr

p j(N+–r)rp

lnn n

pr

p j

j=j

–j[rp(N+)–]

=

lnn n

pr

p j

j=j –

lnn n

pr

p

–jξI{ξ> }+ jξI{ξ< }+ (jj+ )I{ξ= }.

Note thatξ >  if and only ifr> Np+. Whenξ = ,i.e.,p=r(N+ ), we can compute

N

(N–r)+=

pr

p. Using (.), (.), we obtain

W

⎧ ⎪ ⎪ ⎪ ⎨ ⎪ ⎪ ⎪ ⎩

(lnn n )

pr

pjξ = (lnn)

pr

r(N+)n–NN+, r> pN+, (lnnn)

pr

p(jj+ )(lnn n )

pr

p, r= p

N+,

(lnnn)p–prjξ= (lnn

n ) N

(N–/r)+, r< pN+.

(12)

(iii) Finally, we estimate

W:=E

j

j=j

j(–p) k

| ˆβjkβjk|pI

| ˆβjkβjk|> λ

p

.

Let  <q,q<∞, and q +q = . Using Jensen’s inequality and Hölder’s inequality, we have

W =E

j

j=j

j(–p) k

| ˆβjkβjk|pI

| ˆβjkβjk|> λ

p

j

j=j

j(–p) k

E

| ˆβjkβjk|pI

| ˆβjkβjk|> λ

p

j

j=j

j(–p) k

E| ˆβjkβjk|qp

q

EIq

| ˆβjkβjk|>λ

q p

j

j=j

j(–p) k

E| ˆβjkβjk|qp

q

P

| ˆβjkβjk|>λ

q p

.

By Lemma . and Lemma ., we obtain

W≤

j

j=j

j(–p)jnp

ωj

q

p =n–

j

j=j j(

 –pqω).

Taking large enoughωsuch that <pqω, we get

Wn–  j(

–pqω)≤√n–j.

Taking jas in (.), we have

W

⎧ ⎨ ⎩

(lnn)r(pNr+)n–NN+, r> pN+,

n–(N–/Nr)+, rpN+.

(.)

Putting (.), (.) and (.) together, we can obtain

E

j

j=j

k ˆ

βjk∗ –βjk

ψjk p

⎧ ⎪ ⎪ ⎪ ⎨ ⎪ ⎪ ⎪ ⎩

(lnn)cn–NN+, r> pN+,

(lnn)c(lnn

n ) N

(N–/r)+, r= pN+,

(lnnn)(N–/Nr)+, r< pN+,

wherec,care constants.

Theorem . Let the scaling functionϕ(x)be orthonormal,compactly supported and N+

(13)

ˆ

fXnon

n is the nonlinear wavelet estimator in(.),and assumptions(.), (.)and(.)are

satisfied,then for any fX∈ ˜WN

r (A,L),where≤r<p<∞,N>r,we have

sup

fX∈ ˜WN r (A,L)

EfˆnXnon–fXp

⎧ ⎪ ⎪ ⎪ ⎨ ⎪ ⎪ ⎪ ⎩

(lnn)cn–NN+, r> pN+,

(lnn)c(lnn

n ) N

(N–/r)+, r= pN+,

(lnnn)(N–/Nr)+, r< pN+,

where N =N– /r+ /p,c,care constants.

Proof By the definition offˆXnon

n in (.) and the expansion offXin (.), one has

ˆ

fnXnon–fX=

k

(αˆjkαjk)ϕjk+

j

j=j

k ˆ

βjk∗ –βjk

ψjk+Pj+fXfX.

Then

EfˆnXnon–fXp

E

k

(αˆjkαjk)ϕjk

p

+E

j

j=j

k ˆ

βjk∗–βjk

ψjk p

+Pj+fXfXp

=:I+I+I,

where

I:=E

k

(αˆjkαjk)ϕjk

p

,

I:=E

j

j=j

k ˆ

βjk∗ –βjkψjk p

,

I:=Pj+fXfXp.

Firstly, we estimate

I:=E

k

(αjˆkαjk)ϕjk

p

.

By Lemma . and Jensen’s inequality,

Ij(–p)E k

| ˆαjkαjk|

p

p

≤j(–p) k

E| ˆαjkαjk|

p

p

.

SincefX(x) andϕ(x) are compactly supported, then the number of elements in{k:α jk= }isO(j). By Lemma ., we haveE| ˆα

jkαjk|pnp.

Therefore

Ij(–p)jnp 

(14)

Using (.), we have

I

⎧ ⎨ ⎩

(lnn) pr

r(N+)n–NN+, r> pN+,

n–(N–/Nr)+, rpN+,

(.)

whereN =N–r+p. Next, we estimate

I:=Pj+fXfXp.

In reference [], it turns out that if the scaling functionϕ(x) is orthonormal, compactly supported andN+ regular, then the associated kernel functionK(x,y) :=kϕ(x–k)ϕ(y– k) satisfies ConditionsH(N+ ) andM(N), andKjf(x) =Pjf(x).

Since a Sobolev space and a Besov space have the following embedding theorem:W˜N r

˜

BN

r→ ˜BNp∞, whereN =N–r+

p, thenfX∈ ˜BNp∞. By Lemma ., we have Pj+fXfXp–jN.

Taking jas in (.), we have

I

⎧ ⎨ ⎩

n–NN+, r> pN+,

(lnn n )

N

(N–/r)+, rpN+.

(.)

Finally, we estimate

I:=E

j

j=j

k ˆ

βjk∗ –βjk

ψjk p

.

Using Lemma ., we obtain

E

j

j=j

k ˆ

βjk∗ –βjk

ψjk p

⎧ ⎪ ⎪ ⎪ ⎨ ⎪ ⎪ ⎪ ⎩

(lnn)cn–NN+, r> pN+,

(lnn)c(lnn

n ) N

(N–/r)+, r= pN+,

(lnnn)(N–/Nr)+, r< pN+.

(.)

By (.), (.) and (.), we obtain

sup

fX∈ ˜WN r (A,L)

EfˆnXnon–fXp

⎧ ⎪ ⎪ ⎪ ⎨ ⎪ ⎪ ⎪ ⎩

(lnn)cn–NN+, r> pN+,

(lnn)c(lnn

n ) N

(N–/r)+, r= pN+,

(lnnn)(N–/Nr)+, r< p

N+.

4 Optimality

(15)

Theorem . Let the scaling functionϕ(x)be orthonormal,compactly supported and N+  regular,fX∈ ˜WN

r (A,L).If there exist two positive constants gand gsuch that gg(x)

g,x∈R,then for any estimatorfˆnX,we have

inf

ˆ

fnX

sup

fX∈ ˜WN r (A,L)

EfˆnXfXp

⎧ ⎨ ⎩

n–NN+, r> pN+,

(lnnn)(N–/Nr)+, rpN+,

where≤r,p<∞,N>r.

Remark The proof is very similar to that in reference [], in which the author studied the lower bound of the convergence rates in Besov spaces for the samples without bias data.

According to Theorem ., we can see that:

(i) Whenr<Np+, our nonlinear estimator can attain the optimal rate.

(ii) Whenr=Np+, our convergence rate and the optimal rate of convergence differ in a logarithmic. So, it is sub-optimal.

(iii) Whenr>Np+, the logarithmic factor is an extra penalty for the chosen wavelet thresholding, our convergence rate is sub-optimal.

Competing interests

The authors declare that they have no competing interests.

Authors’ contributions

WJR participated in the sequence alignment and drafted the manuscript. WM participated in the design of the study and performed the statistical analysis. ZY conceived of the study and participated in its design and coordination. All authors read and approved the final manuscript.

Acknowledgements

This paper is supported by the National Natural Science Foundation of China (No. 11271038) and Foundation of BJUT (No. 006000542213501).

Received: 12 January 2013 Accepted: 14 June 2013 Published: 3 July 2013

References

1. Efromovich, S: Nonparametric Curve Estimation. Methods, Theory, and Applications. Springer Series in Statistics. Springer, New York (1999)

2. Vardi, Y: Nonparametric estimation in the presence of length bias. Ann. Stat.10(2), 616-620 (1982) 3. Jones, MC: Kernel density estimation for length-biased data. Biometrika78(3), 511-519 (1991) 4. Efromovich, S: Density estimation for biased data. Ann. Stat.32, 1137-1161 (2004)

5. Ramirez, P, Vidakovic, B: Wavelet density estimation for stratified size-biased sample. J. Stat. Plan. Inference140(2), 419-432 (2010)

6. Christophe, C: Wavelet block thresholding for density estimation in the presence of bias. J. Korean Stat. Soc.39, 43-53 (2010)

7. Kelly, C, Kon, MA, Rapheal, LA: Local convergence for wavelet expansion. J. Funct. Anal.126, 102-138 (1994) 8. Triebel, H: Theory of Function Spaces. Birkhäuser, Basel (1983)

9. Härdle, W, Kerkyacharian, G, Picard, D, Tsybakov, A: Wavelets, Approximation and Statistical Applications. Springer, Berlin (1997)

10. Wang, HY: Convergence rates of density estimation in Besov spaces. Appl. Math.2(10), 1258-1262 (2011) doi:10.1186/1029-242X-2013-308

References

Related documents

El supuesto sobre el que se asienta esta política de promoción de los permisos parentales para hombres es que durante el permiso, el padre no se involucra solamente de

Surprisingly, we found that in wild-type cells both Ste7 and Kss1 are phosphorylated in response to osmostress, but signaling specificity must be main- tained since transcription of

In the Essentials of Baccalaureate Education for Professional Nursing Practice (AACN, 2008), “Essential VI: Interprofessional Communication and Collaboration for Improving

The primary objective of the Governance and Nominating Committee is to assist the Board of Directors in fulfilling its responsibilities by (i) ensuring that corporate

How long does data need to be preserved. Will there be funding to assist in

This work brings a literature review regarding the physical activities recommended to older adults, discussing questions related to lifestyle, aging and health, public

The primary outcomes were Psycho-motor Skill performance- Shooting accuracy which is measured by shooting score and the MIQ-R was used to measure imagery ability among

The granules ready for compression has been evaluated The prepared mucoadhesive tablets were evaluated thickness, weight variation, friability, hardness,