Nonlinear wavelet density estimation for biased data in Sobolev spaces

(1)

R E S E A R C H

Open Access

Nonlinear wavelet density estimation for

biased data in Sobolev spaces

Jinru Wang

*

_{, Meng Wang and Yuan Zhou}

*_{Correspondence:}

[email protected] Department of Applied Mathematics, Beijing University of Technology, Beijing, 100124, China

Abstract

In this paper, we consider the density estimation problem from independent and identically distributed (i.i.d.) biased observations. We develop an adaptive wavelet hard thresholding rule and evaluate its performance by consideringLprisk over

Sobolev balls. We prove that our estimation attains a sharp rate of convergence and show the optimality.

MSC: 49K40; 90C29; 90C31

Keywords: Sobolev spaces; wavelet density estimation; hard thresholding; convergence rate

1 Introduction

In practice, it usually happens that drawing a direct sample from a random variableX is impossible. In this paper, we consider the problem of estimating the density functions fX_{(x) without observing directly the i.i.d. sample}_X,_{X, . . . ,}_X

n. We observe the samples

Y,Y, . . . ,Ynfrom biased data with the following density function:

fY(x) =g(x)f

X_(x)

μ , (.)

whereg(x) is the so-called weight or bias function,μ=E(g(X)). The purpose of this paper is to estimate the density functionfX_{(x) from the samples}_Y,_{Y, . . . ,}_Y

n.

Several examples of this biased data can be found in the literature. For instance, in paper [], it is shown that the distribution of the concentration of alcohol in the blood of intoxi-cated drivers is of interest, since the drunken driver has a larger chance of being arrested, the collected data are size-biased.

The density estimation problem for biased data (.) has been discussed in several pa-pers. In , Vardi [] considered the nonparametric maximum likelihood estimation for fX_{(x). In , Jones [] discussed the mean squared error properties of the kernel density}

estimation. In , Efromovich [] developed the Efromovich-Pinsker adaptive Fourier estimator. It was based on a blockwise shrinkage algorithm and achieved the minimax rate of convergence under theLrisk over a Besov classBs_,.

In , Ramírez and Vidakovic [] proposed a linear wavelet estimator and discussed the consistency of a function inL[, ] under the mean integrated squared error (MISE) sense. But the wavelet estimator in paper [] contained the unknown parameterμ. In the same year, Christophe [] constructed a nonlinear wavelet estimator and evaluated theLp

(2)

risk in the Besov spaceBs

r,q. However, Sobolev spacesWrN (N∈N+) exceptr=  is not a

special case in the Besov spaceBs r,q.

In this paper, we consider the nonlinear hard thresholding wavelet density estimation for biased data in Sobolev spacesWN

r (N∈N+). We mainly give the upper bound of minimax

rate of convergence under theLp risk without particular restriction on the parametersr

andp, and the convergence rate is optimal.

2 Preliminaries

In this section, we shall recall some well-known concepts and lemmas.

2.1 Wavelets

In this paper, we always assume that the scaling waveletϕis orthonormal, compactly sup-ported andN+  regular.

Deﬁnition . The scaling functionϕ(x) is calledmregular ifϕ(x) has continuous deriva-tives of ordermand its corresponding waveletψ(x) has vanishing moments of orderm, i.e.,xk_ψ_(x)_dx_{= ,}_k_{= , , . . . ,}_m_{– .}

The following conditions about the scaling functionϕand the kernel functionK(x,y) will be very useful in the third section.

Condition(θ) The functionθϕ(x) =

k∈Z|ϕ(x–k)|is such thatess supx∈Rθϕ(x) <∞.

Condition H(N) There exists an integrable function F(x) such that for any x,y∈R,

|K(x,y)| ≤F(x–y), where|x|N_F(x)_dx_<_∞_.

ConditionM(N) ConditionH(N) is satisﬁed andK(x,y)(y–x)k_dy₌_δ

k,k= , . . . ,N,

x∈R.

For any x∈R,j,k∈Z, denoted byϕjk(x) :=  j

_ϕ(jx–k),_ψ_jk(x) := 

j

_ψ(jx–k), then for anyf(x)∈Lr(R) :={f(x)|

R|f(x)|rdx<∞}, where ≤r<∞, we have the following

equation []:

f(x) =

k∈Z

αJ,kϕJ,k(x) +

j≥J

k∈Z

βj,kψj,k(x), a.e., (.)

where

αJ,k=

Rf(x)

ϕJ,k(x)dx, βj,k=

Rf(x)

ψj,k(x)dx.

2.2 Sobolev space The Sobolev spaceWN

r (R) (N∈N+) is deﬁned byWrN(R) :={f :f ∈Lr(R),f(N)∈Lr(R)},

which is equipped with the normf_WN

r :=fr+f

(N)

r. The Sobolev ballsW˜rN(A,L) are

deﬁned as follows:

˜

W_rN(A,L) :=f ∈W_rN(R) :f is a probability density function,suppf ≤A,

(3)

Between a Sobolev space and a Besov space, the following embedding conclusions are established.

Lemma .[] Let s> , ≤p,q,r≤ ∞,then (i) W_rN→BN_r_∞→BN_∞∞–/r,∀N> /r; (ii) Bs_rq→Bs_pq,∀r<p,s =s– /r+ /p,

where A→B denotes that the Banach space A is continuously embedding in the Banach space B,i.e.,there exists a constant c≥such that for any u∈A,we haveuB≤cuA.

2.3 Auxiliary lemmas

The following lemmas given by [] will be used in the next section.

Lemma . If the scaling function ϕ satisﬁes Condition (θ), then for any sequence

{λk}k∈Z satisfying λlp := (

k|λk|p)



p _<∞_, _{we have C}_λ lp(

j

–

j

p) ≤

kλkϕj,kp ≤

Cλlp( j

–

j

p)_,_{where C}_{= (}_θ ϕ



p

∞ϕ 

q

)–,C= (θϕ



q

∞ϕ 

p

)–, ≤p≤ ∞, p+



q= .

Lemma . For some integer N≥,if the kernel function K(x,y)satisﬁes Conditions M(N) and H(N+ ),f ∈Bs

pq(R),where≤p,q≤ ∞,  <s<N+ ,then we haveKjf–fp= –jsεj,

whereεj∈lq.

Lemma .(Rosenthal inequality) Let X, . . . ,Xnbe independent random variables such

that E(Xi) = and E(|Xi|p) <∞,then there exists a constant C(p) > such that

E

n

i= Xi

p

≤C(p)

n

i= E|Xi|p

+

n

i= EX_i

p/

, p> ,

E

n

i= Xi

p

≤

n

i= EX_i

p/

,  <p≤.

Lemma . (Bernstein inequality) Let X,X, . . . ,Xn be independent random variables

such that E(Xi) = ,E(Xi)≤σ,|Xi| ≤M<∞.Then

P

 n

n

i= Xi

>λ

≤exp

– nλ 

(σ₊_M_λ_/)

, ∀λ> .

Remark In this paper, we often use the notationABto indicate thatA≤cBwith a

positive constantc, which is independent ofAandB. IfABandBA, we writeA∼B. 3 Main results

In this paper, our hard thresholding wavelet density estimator is deﬁned as follows:

ˆ

fXnon

n (x) =

k

ˆ

αjkϕjk(x) +

j

j=j

k

ˆ

β_jk∗ψjk(x), (.)

where

ˆ

αjk:=

ˆ

μ

n

i=

ϕjk(Yi)

g(Yi)

, βˆjk:= ˆ μ

n

i=

ψjk(Yi)

g(Yi)

, μˆ:=nn

i=g(Yi)

(4)

The hard thresholding wavelet coeﬃcients areβˆ_jk∗ :=βˆjkI{| ˆβjk| ≥λ}, where

I| ˆβjk| ≥λ

:=

⎧ ⎨ ⎩

, | ˆβjk| ≥λ,

, | ˆβjk|<λ.

Suppose that the parametersj,j,λof the wavelet thresholding estimator (.) satisfy the assumptions:

j∼

⎧ ⎨ ⎩

((lnn)p–rr_n)N+_, _r_> p N+,

n(N–/–/pr)+_, _r≤ p N+,

(.)

j∼

⎧ ⎨ ⎩

n

N

N(N+)_, _r_> p N+, ( n

lnn)



(N–/r)+_, _r≤ p N+,

(.)

λ=c

j

n, (.)

wherecis a suitably chosen positive constant.

Lemma . Suppose that there exist two constants gand gsuch that <g≤g(x)≤g<

∞for x∈R.Letαjk,βjkbe the coeﬃcients in the expansion(.)and letαˆjk,βˆjk be deﬁned

by estimator in(.).Ifj≤n,then for any≤p<∞,we have (i) E|αjk–αˆjk|pn–

p

_;

(ii) E|βjk–βjkˆ |pn–p__.

Proof (i) From the deﬁnition ofαˆjkand the triangular inequality, we have

| ˆαj,k–αj,k|=

ˆ

μ

n

i=

ϕj,k(Yi)

g(Yi)

–αj,k

=

ˆ

μ μ

μ

n

i=

ϕj,k(Yi)

g(Yi)

–αj,k

+μαˆ j,k



μ–



ˆ

μ

≤μˆ

μ μ_nn

i=

ϕj,k(Yi)

g(Yi)

–αj,k

+| ˆμαj,k| _μ_ˆ – 

μ .

Sinceg≤g(y)≤g, we have

ˆ

μ=_nn

i=g(Yi)

≤g, μ=Eg(X)≥g,

and

|αj,k| ≤ fX(y)ϕj,k(y)dy≤ fX(y)dy





ϕ_j_,_k(y)dy

 

≤A/f∞.

Furthermore, a Sobolev space and a Besov space have the following embedding the-orem,WN

(5)

f_WN

r =c. Therefore, by the convexity inequality, we get

E| ˆαj,k–αj,k|p

≤E g g

μ

n

i=

ϕj,k(Yi)

g(Yi)

–αj,k

+gA/c _μ_ˆ – 

μ

p

≤p–max

g g

,gA/c

p

E

μ

n

i=

ϕj,k(Yi)

g(Yi)

–αj,k p

+

ˆ

μ–



μ p

E

μ

n

i=

ϕj,k(Yi)

g(Yi)

–αj,k p

+E

ˆ

μ–



μ p

=:T+T,

whereT:=E|μ_nn_i_=ϕ_gj,₍k_Yi(Yi₎)–αj,k|p,T:=E|μ_ˆ –



μ| p_.

The termTiis estimated as follows. Firstly, letξi:=μ ϕj,k(Yi)

g(Yi) –αj,k, we can see that they are i.i.d., andE(ξi) = . Moreover, for anym≥,

E|ξi|m =E

μϕj,k(Yi)

g(Yi)

–αj,k m

≤m–

Eμϕj,k(Yi)

g(Yi)

m+|αj,k|m

,

where

Eμϕj,k(Yi)

g(Yi)

m =μm–ϕ m–

j,k (Yi)

g(Yi)m–

Eμϕ



j,k(Yi)

g(Yi)

≤g_m–g_–m+j(m–)_ϕm– ∞ E

μϕ



j,k(Yi)

g(Yi) ,

and

Eμϕ



j,k(Yi)

g(Yi)

=

μϕ



j,k(y)

g(y) f

Y_(y)_dy₌

μϕ



j,k(y)

g(y)

g(y)fX_(y)

μ dy

≤ f∞≤ fWrN =c.

So, we have

Eμϕj,k(Yi)

g(Yi)

m≤g_m–g_–m+j(m–)_ϕm– ∞ c.

Since j≤n, we obtain

E|ξi|m≤Cj(m–)/n(m–)/.

By Rosenthal’s inequality, we have

T=E

 n

n

i=

ξi p

n–pnE|ξi|p+np/

E|ξi| p/

(6)

To estimate the termT, letηi=_g₍_Yi₎–μ. We can computeE(ηi) =  easily, and for any

m≥,E|ηi|m≤C.

Ifp≥,i.e.,  –p< –p/, using Rosenthal’s inequality, we have

T=E

 n n i= ηi p

n–pnE|ηi|p+np/

E|ηi| p/

n–p++n–p/n–p/. (.)

If ≤p< , we get

T=E

 n n i= ηi p

≤n–pnp/E|ηi| p/

≤n–p/. (.)

By (.), (.) and (.), we obtain

E| ˆαj,k–αj,k|pT+Tn–p/.

(ii) It is similar to (i), we omit it.

Lemma . If jj_≤_n,_{then for any}_ω_{> ,}_{there exists a constant c}_{> }_{such that}

P

| ˆβjk–βjk|>λ=c

j n

–ωj. (.)

Proof We can easily get

ˆ

μ≤g, μ≥g, 

μ≤g

–  ,

|βj,k| ≤A/f∞≤A/fWrN.

Therefore,

| ˆβj,k–βj,k|= ˆ μ μ μ n n i=

ψj,k(Yi)

g(Yi)

–βj,k

+μβˆ j,k  μ–  ˆ μ ≤ g g  n n i=

μψj,k(Yi)

g(Yi)

–βj,k

+gA/fXWrN  n n i=  g(Yi)

–  μ =: g g  n n i= ξi

+gA/fX_WN r  n n i= ηi ,

whereξi=μ ψj,k(Yi)

g(Yi) –βj,k,ηi= 

g(Yi)– 

μ. So, we get

P| ˆβj,k–βj,k|>λ

≤P g g  n n i= ξi

+gA/fXWN r  n n i= ηi >λ

≤P g g  n n i= ξi >λ/

+P gA/fX_WN r  n n i= ηi >λ/

(7)

=P  n

n

i=

ξi >

λg g

+P  n

n

i=

ηi >

λ

gA/_fX WN

r

=:P+P,

whereP:=P(|_n

n

i=ξi|>_λg_g_),P:=P(|_n

n

i=ηi|>__g λ

A/fX_{W N}

r

).

Now, we estimateP. Clearly,Eξi= , and

Eξ_i =E

μψj,k(Yi)

g(Yi)

–βj,k 

≤

Eμψj,k(Yi)

g(Yi) +β_j_,_k

= 

Eμψ



j,k(Yi)

g(Yi)

_g(Yμ_i₎+β_j_,_k

≤

g g

fX_WN r +Af

X

WrN

:=σ.

Furthermore, we have

|ξi|=

μψj,k(Yi)

g(Yi)

–E

μψj,k(Yi)

g(Yi)

≤·j/gg– ψ∞.

By Bernstein’s inequality, we obtain

P

 n

n

i=

ξi >

λg g

≤exp

– nλ

_g /g (σ₊λg

g·

j/_gg–

 ψ∞/)

= exp

– n·c

j n·g

 /g (σ₊_gc_j/nj/_gg–

 ψ∞g/)

= exp

– c

_jg /g (σ₊_jj_/n_ψ_∞_c/)

.

Sincejj_≤_{n, then}

P  n

n

i=

ξi >

λg g

≤exp

– c

_jg /g (σ₊_ψ_∞_c/)

= exp

– c

_g /g (σ₊_ψ_∞_c/)j

.

Takingc>  such that

c_g_/g_

(σ+ψ∞c/)≥ω, then

P=P

 n

n

i=

ξi >

λg g

(8)

Next, we estimateP. We compute thatEηi= ,i.e.,

Eηi=E

 g(Yi)

–E



μ

=

 g(y)f

Y_(y)_dy_–  μ

=

 g(y)

g(y)fX_(y) μ dy–



μ

= 

μ

fX_(y)_dy_–  μ= ,

and

Eη_i =E

 g(Yi)

– 

μ 

≤

E  g(Yi)

+ 

μ

≤ 

g_,

|ηi|=

_g(Y_i₎– 

μ

= 

g(Yi)

–E  g(Yi)

≤g_–.

By Bernstein’s inequality, we obtain

P

 n

n

i=

ηi >

λ

gA/fXWrN

≤exp

– n(λ/(gA /_fX

WrN))



(

g 

+λg–

 /(gA/fXWrN))

= exp

–

ncj/(ng_AfX

WrN)

(

g+c

j/n/(ggA/_fX WrN))

.

Sincej≤n, then

P

 n

n

i=

ηi >

λ

gA/fXWrN

≤exp

–

c/(g_AfX

WrN)

(

g_ +c/(ggA

/_fX WrN))

j

.

Takingc>  such that

c_/(g_AfX

W Nr

)

(

g_+c/(ggA

/_fX W N_r ))

≥ω, we have

P=P  n

n

i=

ηi >

λ

gA/_fX WrN

≤e–ωj_–ωj_. _(.)

Takingc=max{c,c}, by (.) and (.), we have

P| ˆβj,k–βj,k|>λ

(9)

Lemma . Suppose that there exist two constants gand gsuch that <g≤g(x)≤g<

∞,for x∈R,andβˆjk,βˆjk∗ are given by(.).Then

E j

j=j

k _ˆ

β_jk∗ –βjk ψjk p ⎧ ⎪ ⎪ ⎪ ⎨ ⎪ ⎪ ⎪ ⎩

(lnn)c_n–NN+_, _r_> p N+,

(lnn)c₍lnn

n ) N

(N–/r)+_, _r₌ p N+,

(ln_nn)(N–/Nr)+_, _r_< p N+,

where c,care constants.

Proof By Lemma ., we obtain

E j

j=j

k _ˆ

β_jk∗ –βjk ψjk p ≤E j

j=j

k _ˆ

β_jk∗ –βjk ψjk p E j

j=j

j(–p) k

βˆ_jk∗ –βjk p



p

.

Furthermore, sinceβˆ_jk∗ =βˆjkI{| ˆβjk|>λ}, we have

E j

j=j

k _ˆ

j=j

j(–p) k

| ˆβjk–βjk|p  p × I

| ˆβjk|>λ,|βjk| ≥ λ



+I

| ˆβjk|>λ,|βjk|< λ  + j

j=j

j(–p) k

|βjk|p 

p

I| ˆβjk| ≤λ,|βjk| ≤λ

+I| ˆβjk| ≤λ,|βjk|> λ

.

Note that

I

| ˆβjk|>λ,|βjk|< λ



≤I

| ˆβjk–βjk|> λ



,

I| ˆβjk| ≤λ,|βjk|> λ

≤I



,

and if| ˆβjk| ≤λ, |βjk|> λ, we get| ˆβjk –βjk| ≥ |βjk|–| ˆβjk|>

|βjk|

 ,i.e.,|βjk|< | ˆβjk –βjk|; therefore, we have

E j

j=j

k _ˆ

j=j

j(–p) k

| ˆβjk–βjk|pI

|βjk| ≥ λ   p +E j

j=j

j(–p) k





(10)

+

j

j=j

j(–p) k

|βjk|pI

|βjk| ≤λ



p

=:W+W+W,

where

W:=E

j

j=j

j(–p) k

|βjk| ≥λ





p

,

W:=E

j

j=j

j(–p) k





p

,

W:=

j

j=j

j(–p) k

|βjk|pI

|βjk| ≤λ



p

.

(i) Firstly, we estimate

W:=E

j

j=j

j(–p) k

|βjk| ≥ λ





p

.

By Lemma ., we have

E| ˆβjk–βjk|pn– p

_.

UsingI{|βjk| ≥λ_} ≤(

|βjk|

λ



)rand Jensen’s inequality, we obtain

W =E

j

j=j

j(–p) k

|βjk| ≥ λ





p

≤

j

j=j

j(–p) k

E| ˆβjk–βjk|pI

|βjk|> λ





p

j

j=j

j(–p) k

n–p

_| βjk| λ/

r

p

=

j

j=j

j(–p)_n–_λ–

r p_β

j·

r p r.

Byβj·r–j(N+



–r)_and_λ∼_c

lnn

n , we have

W

j

j=j n–_j(



–p)_–j(N+–r)rp

n

lnn

r

p

=n–

n

lnn

r

pj

j=j –jξ

≤n

r–p

(11)

Using Lemma . and (.), we obtain

W

⎧ ⎪ ⎪ ⎪ ⎨ ⎪ ⎪ ⎪ ⎩

(lnn)c_n–NN+_, _r_> p N+,

(lnn)c₍lnn

n ) N

(N–/r)+_, _r₌ p N+,

(ln_nn)(N–/Nr)+_, _r_< p N+,

(.)

wherec,care constants. (ii) For

W:=

j

j=j

j(–p) k

|βjk|pI

|βjk| ≤λ



p

,

letξ:=_(r_p(N+ ) – ). ByI{|βjk| ≤λ} ≤(_|_βλ jk|)

p–r_(r_<_{p), we have}

W

j

j=j

j(–p) k

|βjk|p

λ

|βjk| p–r_p

=

j

j=j

j(–p)_(_λ₎ p–r

p

k

|βjk|r 

p

=

j

j=j

j(–p)_(_λ₎ p–r

p _β j·

r p r.

SincefX_∈_WN

r (R), thenβj·r–j(N+



–r)_{. Taking}_λ₌_c

j n∼c

lnn

n ,j–j∼C(lnn),

we have

W

j

j=j

j(–p)_λ p–r

p _–j(N+–r)rp

lnn n

p–r

p j

j=j

–j[rp(N+)–]

=

lnn n

p–r

p j

j=j –jξ

lnn n

p–r

p

–jξ_I{_ξ_{> }}_{+ }–jξ_I{_ξ_{< }}_{+ (j}_–_j_{+ )I}{_ξ_{= }}_.

Note thatξ >  if and only ifr> __Np_+. Whenξ = ,i.e.,p=r(N+ ), we can compute

N

(N–_r)+=

p–r

p. Using (.), (.), we obtain

W

⎧ ⎪ ⎪ ⎪ ⎨ ⎪ ⎪ ⎪ ⎩

(lnn n )

p–r

p_–jξ _{= (}_ln_n)

p–r

r(N+)_n–NN+, r> p N+, (ln_nn)

p–r

p_(j_–_j_{+ )}₍lnn n )

p–r

p_, _r₌ p

N+,

(ln_nn)p–pr_–jξ_{= (}lnn

n ) N

(N–/r)+_, _r_< p N+.

(12)

(iii) Finally, we estimate

W:=E

j

j=j

j(–p) k





p

.

Let  <q,q<∞, and _q +_q = . Using Jensen’s inequality and Hölder’s inequality, we have

W =E

j

j=j

j(–p) k





p

≤

j

j=j

j(–p) k

E





p

≤

j

j=j

j(–p) k

E| ˆβjk–βjk|qp



q

EIq

| ˆβjk–βjk|>λ 



q p

≤

j

j=j

j(–p) k

E| ˆβjk–βjk|qp



q

P

| ˆβjk–βjk|>λ 



q p

.

By Lemma . and Lemma ., we obtain

W≤

j

j=j

j(–p)_j_n–p_–

ωj

q



p ₌_n–

j

j=j j(

 –pqω)_.

Taking large enoughωsuch that _<_pqω, we get

Wn–  _j(



–pqω)≤√_n–_j_.

Taking j_{as in (.), we have}

W

⎧ ⎨ ⎩

(lnn)r(p–Nr+)_n–NN+, _r> p N+,

n–(N–/Nr)+_, _r≤ p N+.

(.)

Putting (.), (.) and (.) together, we can obtain

E

j

j=j

k _ˆ

β_jk∗ –βjk

ψjk p

⎧ ⎪ ⎪ ⎪ ⎨ ⎪ ⎪ ⎪ ⎩

(lnn)c_n–NN+_, _r_> p N+,

(lnn)c₍lnn

n ) N

(N–/r)+_, _r₌ p N+,

(ln_nn)(N–/Nr)+_, _r_< p N+,

wherec,care constants.

Theorem . Let the scaling functionϕ(x)be orthonormal,compactly supported and N+

(13)

ˆ

fXnon

n is the nonlinear wavelet estimator in(.),and assumptions(.), (.)and(.)are

satisﬁed,then for any fX_{∈ ˜}_WN

r (A,L),where≤r<p<∞,N>r,we have

sup

fX_{∈ ˜}_WN r (A,L)

Efˆ_nXnon–fX_p

⎧ ⎪ ⎪ ⎪ ⎨ ⎪ ⎪ ⎪ ⎩

(lnn)c_n–NN+, r> p N+,

(lnn)c₍lnn

n ) N

(N–/r)+_, _r₌ p N+,

(ln_nn)(N–/Nr)+_, _r_< p N+,

where N =N– /r+ /p,c,care constants.

Proof By the deﬁnition offˆXnon

n in (.) and the expansion offXin (.), one has

ˆ

f_nXnon–fX=

k

(αˆjk–αjk)ϕjk+

j

j=j

k _ˆ

ψjk+Pj+fX–fX.

Then

≤E

k

(αˆjk–αjk)ϕjk

p

+E

j

j=j

k _ˆ

β_jk∗–βjk

ψjk p

+Pj+fX–fX_p

=:I+I+I,

where

I:=E

k

(αˆjk–αjk)ϕjk

p

,

I:=E

j

j=j

k _ˆ

β_jk∗ –βjkψjk p

,

I:=Pj+fX–fX_p.

Firstly, we estimate

I:=E

k

(αjˆk–αjk)ϕjk

p

.

By Lemma . and Jensen’s inequality,

Ij(–p)_E k

| ˆαjk–αjk|

p 

p

≤j(–p) k

E| ˆαjk–αjk|

p 

p

.

SincefX_{(x) and}_ϕ_{(x) are compactly supported, then the number of elements in}_{_k_:_α jk= }isO(j_{). By Lemma ., we have}_E| ˆ_α

jk–αjk|pn –p__.

Therefore

Ij(–p)_j_n–p 

(14)

Using (.), we have

I

⎧ ⎨ ⎩

(lnn) p –r

r(N+)_n–NN+, r> p N+,

n–(N–/Nr)+_, _r≤ p N+,

(.)

whereN =N–_r+_p. Next, we estimate

I:=Pj+fX–fX_p.

In reference [], it turns out that if the scaling functionϕ(x) is orthonormal, compactly supported andN+ regular, then the associated kernel functionK(x,y) :=_kϕ(x–k)ϕ(y– k) satisﬁes ConditionsH(N+ ) andM(N), andKjf(x) =Pjf(x).

Since a Sobolev space and a Besov space have the following embedding theorem:W˜N r →

˜

BN

r∞→ ˜BNp∞, whereN =N–r+



p, thenfX∈ ˜BNp∞. By Lemma ., we have Pj+fX–fX_p–jN.

Taking j_{as in (.), we have}

I

⎧ ⎨ ⎩

n–NN+_, _r_> p N+,

(lnn n )

N

(N–/r)+_, _r≤ p N+.

(.)

Finally, we estimate

I:=E

j

j=j

k _ˆ

ψjk p

.

Using Lemma ., we obtain

E

j

j=j

k _ˆ

ψjk p

⎧ ⎪ ⎪ ⎪ ⎨ ⎪ ⎪ ⎪ ⎩

(lnn)c_n–NN+, r> p N+,

(lnn)c₍lnn

n ) N

(N–/r)+_, _r₌ p N+,

(ln_nn)(N–/Nr)+_, _r_< p N+.

(.)

By (.), (.) and (.), we obtain

sup

fX_{∈ ˜}_WN r (A,L)

⎧ ⎪ ⎪ ⎪ ⎨ ⎪ ⎪ ⎪ ⎩

(lnn)c_n–NN+, _r> p N+,

(lnn)c₍lnn

n ) N

(N–/r)+_, _r₌ p N+,

(ln_nn)(N–/Nr)+_, _r_< p

N+.

4 Optimality

(15)

Theorem . Let the scaling functionϕ(x)be orthonormal,compactly supported and N+  regular,fX_{∈ ˜}_WN

r (A,L).If there exist two positive constants gand gsuch that g≤g(x)≤

g,x∈R,then for any estimatorfˆnX,we have

inf

ˆ

fnX

sup

fX_{∈ ˜}_WN r (A,L)

Efˆ_nX–fX_p

⎧ ⎨ ⎩

n–NN+_, _r_> p N+,

(ln_nn)(N–/Nr)+_, _r≤ p N+,

where≤r,p<∞,N>_r.

Remark The proof is very similar to that in reference [], in which the author studied the lower bound of the convergence rates in Besov spaces for the samples without bias data.

According to Theorem ., we can see that:

(i) Whenr<__Np_+, our nonlinear estimator can attain the optimal rate.

(ii) Whenr=__Np_+, our convergence rate and the optimal rate of convergence diﬀer in a logarithmic. So, it is sub-optimal.

(iii) Whenr>__Np_+, the logarithmic factor is an extra penalty for the chosen wavelet thresholding, our convergence rate is sub-optimal.

Competing interests

The authors declare that they have no competing interests.

Authors’ contributions

WJR participated in the sequence alignment and drafted the manuscript. WM participated in the design of the study and performed the statistical analysis. ZY conceived of the study and participated in its design and coordination. All authors read and approved the ﬁnal manuscript.

Acknowledgements

This paper is supported by the National Natural Science Foundation of China (No. 11271038) and Foundation of BJUT (No. 006000542213501).

Received: 12 January 2013 Accepted: 14 June 2013 Published: 3 July 2013

References

1. Efromovich, S: Nonparametric Curve Estimation. Methods, Theory, and Applications. Springer Series in Statistics. Springer, New York (1999)

2. Vardi, Y: Nonparametric estimation in the presence of length bias. Ann. Stat.10(2), 616-620 (1982) 3. Jones, MC: Kernel density estimation for length-biased data. Biometrika78(3), 511-519 (1991) 4. Efromovich, S: Density estimation for biased data. Ann. Stat.32, 1137-1161 (2004)

5. Ramirez, P, Vidakovic, B: Wavelet density estimation for stratiﬁed size-biased sample. J. Stat. Plan. Inference140(2), 419-432 (2010)

6. Christophe, C: Wavelet block thresholding for density estimation in the presence of bias. J. Korean Stat. Soc.39, 43-53 (2010)

7. Kelly, C, Kon, MA, Rapheal, LA: Local convergence for wavelet expansion. J. Funct. Anal.126, 102-138 (1994) 8. Triebel, H: Theory of Function Spaces. Birkhäuser, Basel (1983)

9. Härdle, W, Kerkyacharian, G, Picard, D, Tsybakov, A: Wavelets, Approximation and Statistical Applications. Springer, Berlin (1997)

10. Wang, HY: Convergence rates of density estimation in Besov spaces. Appl. Math.2(10), 1258-1262 (2011) doi:10.1186/1029-242X-2013-308