RESEARCH
Open Access
Nonlinear wavelet density estimation for
biased data in Sobolev spaces
Jinru Wang*, Meng Wang and Yuan Zhou
*Correspondence: [email protected]
Department of Applied Mathematics, Beijing University of Technology, Beijing, 100124, China
Abstract
In this paper, we consider the density estimation problem from independent and identically distributed (i.i.d.) biased observations. We develop an adaptive wavelet hard thresholding rule and evaluate its performance by considering the $L^p$ risk over Sobolev balls. We prove that our estimator attains a sharp rate of convergence and show its optimality.
MSC: 49K40; 90C29; 90C31
Keywords: Sobolev spaces; wavelet density estimation; hard thresholding; convergence rate
1 Introduction
In practice, it often happens that drawing a direct sample from a random variable $X$ is impossible. In this paper, we consider the problem of estimating the density function $f_X(x)$ without directly observing the i.i.d. sample $X_1, X_2, \ldots, X_n$. Instead, we observe the samples $Y_1, Y_2, \ldots, Y_n$ from biased data with the following density function:
$$ f_Y(x) = \frac{g(x) f_X(x)}{\mu}, \tag{1.1} $$
where $g(x)$ is the so-called weight or bias function and $\mu = E(g(X))$. The purpose of this paper is to estimate the density function $f_X(x)$ from the samples $Y_1, Y_2, \ldots, Y_n$.
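The biased sampling model (1.1) can be illustrated with a short simulation. The following is a minimal sketch under stated assumptions, not part of the paper: the weight `g(x) = 1 + x`, the uniform law for $X$ and the helper names `sample_biased` and `estimate_mu` are hypothetical choices made only for illustration. It draws from $f_Y$ by rejection sampling and recovers $\mu$ from the biased sample through the identity $E[1/g(Y)] = 1/\mu$, which is the estimator $\hat\mu$ used in Section 3.

```python
import random

def sample_biased(n, draw_x, g, g_max, rng):
    """Draw n observations from f_Y = g * f_X / mu by rejection sampling."""
    out = []
    while len(out) < n:
        x = draw_x(rng)
        # Accept x with probability g(x) / g_max, which tilts f_X by g.
        if rng.random() * g_max <= g(x):
            out.append(x)
    return out

def estimate_mu(ys, g):
    """Harmonic-mean estimate of mu = E g(X), based on E[1 / g(Y)] = 1 / mu."""
    return len(ys) / sum(1.0 / g(y) for y in ys)

rng = random.Random(0)
g = lambda x: 1.0 + x      # hypothetical weight, bounded between 1 and 2 on [0, 1)
ys = sample_biased(20000, lambda r: r.random(), g, 2.0, rng)
mu_hat = estimate_mu(ys, g)  # the true value is E(1 + X) = 1.5 for X ~ Uniform[0, 1)
```

Note that the sample mean of `ys` overestimates $E(X) = 0.5$ because large values of $X$ are over-represented, while `mu_hat` only needs the known weight $g$.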
Several examples of such biased data can be found in the literature. For instance, [1] considers the distribution of the concentration of alcohol in the blood of intoxicated drivers: since a more heavily intoxicated driver has a larger chance of being arrested, the collected data are size-biased.
The density estimation problem for biased data (1.1) has been discussed in several papers. In 1982, Vardi [2] considered the nonparametric maximum likelihood estimation for $f_X(x)$. In 1991, Jones [3] discussed the mean squared error properties of the kernel density estimator. In 2004, Efromovich [4] developed the Efromovich-Pinsker adaptive Fourier estimator. It was based on a blockwise shrinkage algorithm and achieved the minimax rate of convergence under the $L^2$ risk over a Besov class $B^s_{2,2}$.
In 2010, Ramírez and Vidakovic [5] proposed a linear wavelet estimator and discussed its consistency for functions in $L^2[0,1]$ in the mean integrated squared error (MISE) sense. However, the wavelet estimator in [5] contained the unknown parameter $\mu$. In the same year, Christophe [6] constructed a nonlinear wavelet estimator and evaluated the $L^p$ risk in the Besov space $B^s_{r,q}$. However, the Sobolev space $W_r^N$ ($N \in \mathbb{N}^+$) is not a special case of the Besov space $B^s_{r,q}$ unless $r = 2$.
In this paper, we consider nonlinear hard thresholding wavelet density estimation for biased data in the Sobolev spaces $W_r^N$ ($N \in \mathbb{N}^+$). We mainly give an upper bound for the minimax rate of convergence under the $L^p$ risk without particular restrictions on the parameters $r$ and $p$, and we show that this rate is optimal up to a logarithmic factor.
2 Preliminaries
In this section, we shall recall some well-known concepts and lemmas.
2.1 Wavelets
In this paper, we always assume that the scaling function $\varphi$ is orthonormal, compactly supported and $N+1$ regular.
Definition 2.1 The scaling function $\varphi(x)$ is called $m$ regular if $\varphi(x)$ has continuous derivatives of order $m$ and its corresponding wavelet $\psi(x)$ has vanishing moments of order $m$, i.e., $\int x^k \psi(x)\,dx = 0$, $k = 0, 1, \ldots, m-1$.
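The following conditions on the scaling function $\varphi$ and the kernel function $K(x,y)$ will be very useful in the third section.
Condition ($\theta$) The function $\theta_\varphi(x) = \sum_{k\in\mathbb{Z}} |\varphi(x-k)|$ is such that $\operatorname{ess\,sup}_{x\in\mathbb{R}} \theta_\varphi(x) < \infty$.
Condition $H(N)$ There exists an integrable function $F(x)$ such that $|K(x,y)| \le F(x-y)$ for any $x, y \in \mathbb{R}$, where $\int |x|^N F(x)\,dx < \infty$.
Condition $M(N)$ Condition $H(N)$ is satisfied and $\int K(x,y)(y-x)^k\,dy = \delta_{0k}$, $k = 0, \ldots, N$, $x \in \mathbb{R}$.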
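For any $x\in\mathbb{R}$ and $j,k\in\mathbb{Z}$, write $\varphi_{jk}(x) := 2^{j/2}\varphi(2^jx-k)$ and $\psi_{jk}(x) := 2^{j/2}\psi(2^jx-k)$. Then for any $f \in L^r(\mathbb{R}) := \{f : \int_{\mathbb{R}} |f(x)|^r\,dx < \infty\}$, where $1 \le r < \infty$, we have the following expansion [7]:
$$ f(x) = \sum_{k\in\mathbb{Z}} \alpha_{J,k}\varphi_{J,k}(x) + \sum_{j\ge J}\sum_{k\in\mathbb{Z}} \beta_{j,k}\psi_{j,k}(x), \quad \text{a.e.}, \tag{2.1} $$
where
$$ \alpha_{J,k} = \int_{\mathbb{R}} f(x)\varphi_{J,k}(x)\,dx, \qquad \beta_{j,k} = \int_{\mathbb{R}} f(x)\psi_{j,k}(x)\,dx. $$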
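2.2 Sobolev space
The Sobolev space $W_r^N(\mathbb{R})$ ($N\in\mathbb{N}^+$) is defined by
$$ W_r^N(\mathbb{R}) := \{ f : f\in L^r(\mathbb{R}),\ f^{(N)} \in L^r(\mathbb{R}) \}, $$
which is equipped with the norm $\|f\|_{W_r^N} := \|f\|_r + \|f^{(N)}\|_r$. The Sobolev balls $\tilde W_r^N(A,L)$ are defined as follows:
$$ \tilde W_r^N(A,L) := \{ f\in W_r^N(\mathbb{R}) : f \text{ is a probability density function},\ \operatorname{supp} f \le A,\ \ldots \}. $$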
Between Sobolev spaces and Besov spaces, the following embeddings hold.
Lemma 2.1 [8] Let $s > 0$ and $1 \le p, q, r \le \infty$. Then
(i) $W_r^N \hookrightarrow B^N_{r\infty} \hookrightarrow B^{N-1/r}_{\infty\infty}$, $\forall N > 1/r$;
(ii) $B^s_{rq} \hookrightarrow B^{s'}_{pq}$, $\forall r < p$, $s' = s - 1/r + 1/p$,
where $A \hookrightarrow B$ denotes that the Banach space $A$ is continuously embedded in the Banach space $B$, i.e., there exists a constant $c \ge 0$ such that $\|u\|_B \le c\|u\|_A$ for any $u\in A$.
2.3 Auxiliary lemmas
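The following lemmas from [9] will be used in the next section.
Lemma 2.2 If the scaling function $\varphi$ satisfies Condition ($\theta$), then for any sequence $\{\lambda_k\}_{k\in\mathbb{Z}}$ satisfying $\|\lambda\|_{l_p} := (\sum_k |\lambda_k|^p)^{1/p} < \infty$, we have
$$ C_1 \|\lambda\|_{l_p}\, 2^{(j/2 - j/p)} \le \Big\| \sum_k \lambda_k\varphi_{j,k} \Big\|_p \le C_2\|\lambda\|_{l_p}\, 2^{(j/2-j/p)}, $$
where $C_1 = (\|\theta_\varphi\|_\infty^{1/p}\|\varphi\|_1^{1/q})^{-1}$, $C_2 = \|\theta_\varphi\|_\infty^{1/q}\|\varphi\|_1^{1/p}$, $1\le p\le\infty$, $1/p + 1/q = 1$.
Lemma 2.3 For some integer $N \ge 0$, if the kernel function $K(x,y)$ satisfies Conditions $M(N)$ and $H(N+1)$ and $f\in B^s_{pq}(\mathbb{R})$, where $1\le p,q\le\infty$ and $0<s<N+1$, then we have $\|K_jf - f\|_p = 2^{-js}\varepsilon_j$, where $\{\varepsilon_j\}\in l_q$.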
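Lemma 2.4 (Rosenthal inequality) Let $X_1, \ldots, X_n$ be independent random variables such that $E(X_i) = 0$ and $E(|X_i|^p) < \infty$. Then there exists a constant $C(p) > 0$ such that
$$ E\Big|\sum_{i=1}^n X_i\Big|^p \le C(p)\Big( \sum_{i=1}^n E|X_i|^p + \Big(\sum_{i=1}^n EX_i^2\Big)^{p/2} \Big), \quad p > 2, $$
$$ E\Big|\sum_{i=1}^n X_i\Big|^p \le \Big(\sum_{i=1}^n EX_i^2\Big)^{p/2}, \quad 0 < p \le 2. $$
Lemma 2.5 (Bernstein inequality) Let $X_1, X_2, \ldots, X_n$ be independent random variables such that $E(X_i) = 0$, $E(X_i^2)\le\sigma^2$, $|X_i| \le M<\infty$. Then
$$ P\Big(\Big|\frac1n\sum_{i=1}^n X_i\Big| > \lambda\Big) \le 2\exp\Big(-\frac{n\lambda^2}{2(\sigma^2 + M\lambda/3)}\Big), \quad \forall\lambda > 0. $$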
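The Bernstein inequality can be checked numerically. The following is a minimal sketch, assuming the standard two-sided form with bound $2\exp(-n\lambda^2/(2(\sigma^2+M\lambda/3)))$; the Uniform$(-1,1)$ variables (for which $\sigma^2 = 1/3$ and $M = 1$) and the function names are hypothetical choices for illustration. It compares a Monte Carlo estimate of the tail probability with the bound.

```python
import math
import random

def bernstein_bound(n, lam, sigma2, M):
    """Right-hand side 2 * exp(-n lam^2 / (2 (sigma^2 + M lam / 3)))."""
    return 2.0 * math.exp(-n * lam ** 2 / (2.0 * (sigma2 + M * lam / 3.0)))

def empirical_tail(n, lam, trials, rng):
    """Monte Carlo estimate of P(|mean of n Uniform(-1, 1) draws| > lam)."""
    hits = 0
    for _ in range(trials):
        mean = sum(rng.uniform(-1.0, 1.0) for _ in range(n)) / n
        hits += abs(mean) > lam
    return hits / trials

rng = random.Random(1)
n, lam = 200, 0.1
bound = bernstein_bound(n, lam, sigma2=1.0 / 3.0, M=1.0)
freq = empirical_tail(n, lam, trials=2000, rng=rng)
```

In this setting the empirical frequency should fall well below the bound, which is typical: the inequality is used in Section 3 for its exponential decay in $n\lambda^2$, not for sharp constants.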
Remark In this paper, we often use the notation $A \lesssim B$ to indicate that $A \le cB$ with a positive constant $c$ which is independent of $A$ and $B$. If $A\lesssim B$ and $B\lesssim A$, we write $A\sim B$.
3 Main results
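In this paper, our hard thresholding wavelet density estimator is defined as follows:
$$ \hat f_n^{X,\mathrm{non}}(x) = \sum_k \hat\alpha_{j_0k}\varphi_{j_0k}(x) + \sum_{j=j_0}^{j_1}\sum_k \hat\beta^*_{jk}\psi_{jk}(x), \tag{3.1} $$
where
$$ \hat\alpha_{jk} := \frac{\hat\mu}{n}\sum_{i=1}^n \frac{\varphi_{jk}(Y_i)}{g(Y_i)}, \qquad \hat\beta_{jk} := \frac{\hat\mu}{n}\sum_{i=1}^n \frac{\psi_{jk}(Y_i)}{g(Y_i)}, \qquad \hat\mu := \Big[\frac1n\sum_{i=1}^n\frac{1}{g(Y_i)}\Big]^{-1}. $$
The hard thresholding wavelet coefficients are $\hat\beta^*_{jk} := \hat\beta_{jk}\, I\{|\hat\beta_{jk}|\ge\lambda\}$, where
$$ I\{|\hat\beta_{jk}|\ge\lambda\} := \begin{cases} 1, & |\hat\beta_{jk}|\ge\lambda,\\ 0, & |\hat\beta_{jk}|<\lambda. \end{cases} $$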
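The estimator works because $E[\mu\varphi_{jk}(Y)/g(Y)] = \int \varphi_{jk}(y) f_X(y)\,dy = \alpha_{jk}$, so dividing by the known weight $g$ undoes the bias. The following is a minimal sketch of the hard thresholding estimator under loudly stated assumptions: it uses the Haar basis on $[0,1)$ purely for simplicity (Haar is not $N+1$ regular for $N \ge 1$, so it does not satisfy the paper's assumptions), and the function names, the weight `g(x) = 1 + x`, the levels and the constant `c = 3.0` are hypothetical.

```python
import math
import random

def haar_phi(j, k, x):
    """Haar scaling function phi_jk(x) = 2^(j/2) * 1{0 <= 2^j x - k < 1}."""
    t = (2 ** j) * x - k
    return 2 ** (j / 2) if 0.0 <= t < 1.0 else 0.0

def haar_psi(j, k, x):
    """Haar wavelet psi_jk(x): +2^(j/2) on the left half, -2^(j/2) on the right."""
    t = (2 ** j) * x - k
    if 0.0 <= t < 0.5:
        return 2 ** (j / 2)
    if 0.5 <= t < 1.0:
        return -(2 ** (j / 2))
    return 0.0

def fit(ys, g, j0, j1, c):
    """Return (alpha_hat, beta_star) for data on [0, 1), in the spirit of (3.1)."""
    n = len(ys)
    mu_hat = n / sum(1.0 / g(y) for y in ys)

    def coef(basis, j, k):
        return mu_hat / n * sum(basis(j, k, y) / g(y) for y in ys)

    alpha = {k: coef(haar_phi, j0, k) for k in range(2 ** j0)}
    beta = {}
    for j in range(j0, j1 + 1):
        lam = c * math.sqrt(j / n)            # level-dependent threshold
        for k in range(2 ** j):
            b = coef(haar_psi, j, k)
            beta[(j, k)] = b if abs(b) >= lam else 0.0   # hard thresholding
    return alpha, beta

def evaluate(x, alpha, beta, j0):
    """Evaluate the fitted density estimate at the point x."""
    total = sum(a * haar_phi(j0, k, x) for k, a in alpha.items())
    total += sum(b * haar_psi(j, k, x) for (j, k), b in beta.items() if b != 0.0)
    return total

# Hypothetical example: X ~ Uniform[0, 1), g(x) = 1 + x, so f_X = 1 on [0, 1).
rng = random.Random(0)
g = lambda x: 1.0 + x
ys = []
while len(ys) < 2000:
    x = rng.random()
    if rng.random() * 2.0 <= g(x):    # rejection step: keep x with probability g(x) / 2
        ys.append(x)
alpha, beta = fit(ys, g, j0=1, j1=4, c=3.0)
value = evaluate(0.3, alpha, beta, j0=1)   # should be close to f_X(0.3) = 1
```

With this choice of $\hat\mu$, the coarse-scale part of the estimate integrates to exactly one on $[0,1)$, which is one reason the harmonic-mean form is convenient.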
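Suppose that the parameters $j_0$, $j_1$, $\lambda$ of the wavelet thresholding estimator (3.1) satisfy the assumptions:
$$ 2^{j_0} \sim \begin{cases} \big((\ln n)^{\frac{p-r}{r}}\, n\big)^{\frac{1}{2N+1}}, & r > \frac{p}{2N+1},\\ n^{\frac{1-2/p}{2(N-1/r)+1}}, & r\le\frac{p}{2N+1}, \end{cases} \tag{3.2} $$
$$ 2^{j_1} \sim \begin{cases} n^{\frac{N}{N'(2N+1)}}, & r > \frac{p}{2N+1},\\ \big(\frac{n}{\ln n}\big)^{\frac{1}{2(N-1/r)+1}}, & r\le\frac{p}{2N+1}, \end{cases} \tag{3.3} $$
$$ \lambda = c\sqrt{\frac{j}{n}}, \tag{3.4} $$
where $N' = N - 1/r + 1/p$ and $c$ is a suitably chosen positive constant.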
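The tuning rules for the levels and the threshold can be turned into a small helper. The following is a minimal sketch, assuming $2^{j_0} \sim ((\ln n)^{(p-r)/r} n)^{1/(2N+1)}$ or $n^{(1-2/p)/(2(N-1/r)+1)}$, $2^{j_1} \sim n^{N/(N'(2N+1))}$ or $(n/\ln n)^{1/(2(N-1/r)+1)}$ with $N' = N - 1/r + 1/p$, and $\lambda = c\sqrt{j/n}$. Since "$\sim$" fixes these quantities only up to constants, every such constant is set to 1 and the levels are rounded down; both choices, and the function name `tuning`, are assumptions of this sketch.

```python
import math

def tuning(n, N, r, p, c=1.0):
    """Levels j0, j1 and a threshold function lam(j) for sample size n.

    All proportionality constants hidden in "~" are taken to be 1 and the
    levels are rounded down to integers (assumptions of this sketch).
    """
    log_n = math.log(n)
    if r > p / (2 * N + 1):
        two_j0 = (log_n ** ((p - r) / r) * n) ** (1.0 / (2 * N + 1))
        n_prime = N - 1.0 / r + 1.0 / p
        two_j1 = n ** (N / (n_prime * (2 * N + 1)))
    else:
        d = 2 * (N - 1.0 / r) + 1
        two_j0 = n ** ((1 - 2.0 / p) / d)
        two_j1 = (n / log_n) ** (1.0 / d)
    j0 = max(0, math.floor(math.log2(two_j0)))
    j1 = max(j0, math.floor(math.log2(two_j1)))   # keep the level range non-empty
    return j0, j1, (lambda j: c * math.sqrt(j / n))

j0, j1, lam = tuning(n=10000, N=2, r=2, p=4)
```

In practice the constant `c` matters a great deal for finite samples; the theory only requires it to be large enough for the exponential bound of Lemma 3.2 to hold.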
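Lemma 3.1 Suppose that there exist two constants $g_1$ and $g_2$ such that $0<g_1\le g(x)\le g_2<\infty$ for $x\in\mathbb{R}$. Let $\alpha_{jk}$, $\beta_{jk}$ be the coefficients in the expansion (2.1) and let $\hat\alpha_{jk}$, $\hat\beta_{jk}$ be defined by the estimator in (3.1). If $2^j\le n$, then for any $1\le p<\infty$, we have
(i) $E|\alpha_{jk}-\hat\alpha_{jk}|^p\lesssim n^{-p/2}$;
(ii) $E|\beta_{jk}-\hat\beta_{jk}|^p\lesssim n^{-p/2}$.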
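Proof (i) From the definition of $\hat\alpha_{jk}$ and the triangle inequality, we have
$$ \begin{aligned} |\hat\alpha_{j,k}-\alpha_{j,k}| &= \Big|\frac{\hat\mu}{n}\sum_{i=1}^n\frac{\varphi_{j,k}(Y_i)}{g(Y_i)} - \alpha_{j,k}\Big| \\ &= \Big|\frac{\hat\mu}{\mu}\Big(\frac{\mu}{n}\sum_{i=1}^n \frac{\varphi_{j,k}(Y_i)}{g(Y_i)} - \alpha_{j,k}\Big) + \hat\mu\alpha_{j,k}\Big(\frac1\mu - \frac1{\hat\mu}\Big)\Big| \\ &\le \Big|\frac{\hat\mu}{\mu}\Big|\,\Big|\frac{\mu}{n}\sum_{i=1}^n \frac{\varphi_{j,k}(Y_i)}{g(Y_i)} - \alpha_{j,k}\Big| + |\hat\mu\alpha_{j,k}|\,\Big|\frac1{\hat\mu}-\frac1\mu\Big|. \end{aligned} $$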
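Since $g_1\le g(y)\le g_2$, we have
$$ \hat\mu = \Big[\frac1n\sum_{i=1}^n\frac1{g(Y_i)}\Big]^{-1}\le g_2, \qquad \mu = Eg(X)\ge g_1, $$
and
$$ |\alpha_{j,k}| \le \int |f_X(y)\varphi_{j,k}(y)|\,dy \le \Big(\int f_X^2(y)\,dy\Big)^{1/2}\Big(\int\varphi_{j,k}^2(y)\,dy\Big)^{1/2} \le A^{1/2}\|f\|_\infty. $$
Furthermore, by the embedding $W_r^N\hookrightarrow B^N_{r\infty}\hookrightarrow B^{N-1/r}_{\infty\infty}$ of Lemma 2.1(i), we have $\|f\|_\infty\lesssim\|f\|_{W_r^N}<\infty$. Therefore, by the convexity inequality, we get
$$ \begin{aligned} E|\hat\alpha_{j,k}-\alpha_{j,k}|^p &\le E\Big[\frac{g_2}{g_1}\Big|\frac{\mu}{n}\sum_{i=1}^n \frac{\varphi_{j,k}(Y_i)}{g(Y_i)} - \alpha_{j,k}\Big| + g_2A^{1/2}\|f\|_\infty\Big|\frac1{\hat\mu}-\frac1\mu\Big|\Big]^p \\ &\le 2^{p-1}\Big(\max\Big\{\frac{g_2}{g_1},\, g_2A^{1/2}\|f\|_\infty\Big\}\Big)^p \Big[E\Big|\frac{\mu}{n}\sum_{i=1}^n \frac{\varphi_{j,k}(Y_i)}{g(Y_i)} - \alpha_{j,k}\Big|^p + E\Big|\frac1{\hat\mu}-\frac1\mu\Big|^p\Big] \\ &\lesssim E\Big|\frac{\mu}{n}\sum_{i=1}^n \frac{\varphi_{j,k}(Y_i)}{g(Y_i)} - \alpha_{j,k}\Big|^p + E\Big|\frac1{\hat\mu}-\frac1\mu\Big|^p =: T_1+T_2, \end{aligned} $$
where $T_1 := E\big|\frac{\mu}{n}\sum_{i=1}^n \frac{\varphi_{j,k}(Y_i)}{g(Y_i)} - \alpha_{j,k}\big|^p$ and $T_2 := E\big|\frac1{\hat\mu}-\frac1\mu\big|^p$.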
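The term $T_1$ is estimated as follows. Firstly, let $\xi_i := \frac{\mu\varphi_{j,k}(Y_i)}{g(Y_i)} - \alpha_{j,k}$. We can see that the $\xi_i$ are i.i.d. and $E(\xi_i) = 0$. Moreover, for any $m\ge2$,
$$ E|\xi_i|^m = E\Big|\frac{\mu\varphi_{j,k}(Y_i)}{g(Y_i)} - \alpha_{j,k}\Big|^m \le 2^{m-1}\Big(E\Big|\frac{\mu\varphi_{j,k}(Y_i)}{g(Y_i)}\Big|^m + |\alpha_{j,k}|^m\Big), $$
where, using $\mu\le g_2$, $g\ge g_1$ and $|\varphi_{j,k}|\le 2^{j/2}\|\varphi\|_\infty$,
$$ E\Big|\frac{\mu\varphi_{j,k}(Y_i)}{g(Y_i)}\Big|^m = E\Big[\Big|\frac{\mu\varphi_{j,k}(Y_i)}{g(Y_i)}\Big|^{m-2}\Big(\frac{\mu\varphi_{j,k}(Y_i)}{g(Y_i)}\Big)^2\Big] \le g_2^{m-2}g_1^{-(m-2)}2^{j(m-2)/2}\|\varphi\|_\infty^{m-2}\, E\Big(\frac{\mu\varphi_{j,k}(Y_i)}{g(Y_i)}\Big)^2, $$
and
$$ E\Big(\frac{\mu\varphi_{j,k}(Y_i)}{g(Y_i)}\Big)^2 = \int\Big(\frac{\mu\varphi_{j,k}(y)}{g(y)}\Big)^2 f_Y(y)\,dy = \int\Big(\frac{\mu\varphi_{j,k}(y)}{g(y)}\Big)^2\frac{g(y)f_X(y)}{\mu}\,dy \lesssim \|f\|_\infty \lesssim \|f\|_{W_r^N}. $$
So, we have
$$ E\Big|\frac{\mu\varphi_{j,k}(Y_i)}{g(Y_i)}\Big|^m \lesssim g_2^{m-2}g_1^{-(m-2)}2^{j(m-2)/2}\|\varphi\|_\infty^{m-2}\|f\|_{W_r^N}. $$
Since $2^j\le n$, we obtain
$$ E|\xi_i|^m \le C\,2^{j(m-2)/2} \lesssim n^{(m-2)/2}. $$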
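By Rosenthal's inequality, we have
$$ T_1 = E\Big|\frac1n\sum_{i=1}^n\xi_i\Big|^p \lesssim n^{-p}\Big(nE|\xi_i|^p + n^{p/2}\big(E|\xi_i|^2\big)^{p/2}\Big) \lesssim n^{-p/2}, \tag{3.5} $$
where for $1\le p<2$ only the second term is needed.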
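To estimate the term $T_2$, let $\eta_i = \frac1{g(Y_i)} - \frac1\mu$, so that $\frac1{\hat\mu}-\frac1\mu = \frac1n\sum_{i=1}^n\eta_i$. We can compute $E(\eta_i) = 0$ easily, and for any $m\ge2$, $E|\eta_i|^m\le C$.
If $p\ge2$, i.e., $1-p\le -p/2$, using Rosenthal's inequality, we have
$$ T_2 = E\Big|\frac1n\sum_{i=1}^n\eta_i\Big|^p \lesssim n^{-p}\Big(nE|\eta_i|^p + n^{p/2}\big(E|\eta_i|^2\big)^{p/2}\Big) \lesssim n^{-p+1}+n^{-p/2} \lesssim n^{-p/2}. \tag{3.6} $$
If $1\le p<2$, we get
$$ T_2 = E\Big|\frac1n\sum_{i=1}^n\eta_i\Big|^p \le n^{-p}n^{p/2}\big(E|\eta_i|^2\big)^{p/2} \lesssim n^{-p/2}. \tag{3.7} $$
By (3.5), (3.6) and (3.7), we obtain
$$ E|\hat\alpha_{j,k}-\alpha_{j,k}|^p \lesssim T_1+T_2 \lesssim n^{-p/2}. $$
(ii) The proof is similar to that of (i), so we omit it.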
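The rate in Lemma 3.1 can be observed in simulation. The following is a minimal sketch for $p = 2$, under hypothetical choices made only for illustration: the Haar scaling coefficient $\alpha_{1,0}$, $X$ uniform on $[0,1)$ (so $\alpha_{1,0} = 2^{-1/2}$), the weight $g(y) = 1 + y$, and the helper names. If $E|\hat\alpha_{jk}-\alpha_{jk}|^2 \lesssim n^{-1}$ is sharp, increasing $n$ by a factor of 16 should reduce the mean squared error by roughly the same factor.

```python
import random

def draw_biased(n, rng):
    """n draws from f_Y(y) = (1 + y) / 1.5 on [0, 1) by rejection sampling."""
    ys = []
    while len(ys) < n:
        x = rng.random()
        if rng.random() * 2.0 <= 1.0 + x:
            ys.append(x)
    return ys

def alpha_hat(ys):
    """alpha_hat_{1,0} for the Haar scaling function, with g(y) = 1 + y."""
    n = len(ys)
    mu_hat = n / sum(1.0 / (1.0 + y) for y in ys)
    return mu_hat / n * sum(2 ** 0.5 / (1.0 + y) for y in ys if y < 0.5)

def mse(n, trials, rng):
    """Monte Carlo estimate of E|alpha_hat - alpha|^2, where alpha = 2^(-1/2)."""
    alpha = 2 ** -0.5
    return sum((alpha_hat(draw_biased(n, rng)) - alpha) ** 2
               for _ in range(trials)) / trials

rng = random.Random(2)
mse_small, mse_large = mse(250, 200, rng), mse(4000, 200, rng)
ratio = mse_small / mse_large   # Lemma 3.1 with p = 2 suggests roughly 4000 / 250
```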
Lemma 3.2 If $j2^j\le n$, then for any $\omega>0$, there exists a constant $c>0$ such that
$$ P\Big(|\hat\beta_{jk}-\beta_{jk}| > \lambda = c\sqrt{\frac jn}\Big) \lesssim 2^{-\omega j}. \tag{3.8} $$
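Proof We can easily get
$$ \hat\mu\le g_2, \qquad \mu\ge g_1, \qquad \frac1\mu\le g_1^{-1}, \qquad |\beta_{j,k}|\le A^{1/2}\|f\|_\infty\le A^{1/2}\|f\|_{W_r^N}. $$
Therefore,
$$ \begin{aligned} |\hat\beta_{j,k}-\beta_{j,k}| &= \Big|\frac{\hat\mu}{\mu}\Big(\frac{\mu}{n}\sum_{i=1}^n\frac{\psi_{j,k}(Y_i)}{g(Y_i)} - \beta_{j,k}\Big) + \hat\mu\beta_{j,k}\Big(\frac1\mu-\frac1{\hat\mu}\Big)\Big| \\ &\le \frac{g_2}{g_1}\Big|\frac1n\sum_{i=1}^n\Big(\frac{\mu\psi_{j,k}(Y_i)}{g(Y_i)}-\beta_{j,k}\Big)\Big| + g_2A^{1/2}\|f_X\|_{W_r^N}\Big|\frac1n\sum_{i=1}^n\Big(\frac1{g(Y_i)}-\frac1\mu\Big)\Big| \\ &=: \frac{g_2}{g_1}\Big|\frac1n\sum_{i=1}^n\xi_i\Big| + g_2A^{1/2}\|f_X\|_{W_r^N}\Big|\frac1n\sum_{i=1}^n\eta_i\Big|, \end{aligned} $$
where $\xi_i = \frac{\mu\psi_{j,k}(Y_i)}{g(Y_i)}-\beta_{j,k}$ and $\eta_i = \frac1{g(Y_i)}-\frac1\mu$. So, we get
$$ \begin{aligned} P\big(|\hat\beta_{j,k}-\beta_{j,k}|>\lambda\big) &\le P\Big(\frac{g_2}{g_1}\Big|\frac1n\sum_{i=1}^n\xi_i\Big| + g_2A^{1/2}\|f_X\|_{W_r^N}\Big|\frac1n\sum_{i=1}^n\eta_i\Big| > \lambda\Big) \\ &\le P\Big(\frac{g_2}{g_1}\Big|\frac1n\sum_{i=1}^n\xi_i\Big| > \frac\lambda2\Big) + P\Big(g_2A^{1/2}\|f_X\|_{W_r^N}\Big|\frac1n\sum_{i=1}^n\eta_i\Big| > \frac\lambda2\Big) \\ &= P\Big(\Big|\frac1n\sum_{i=1}^n\xi_i\Big| > \frac{\lambda g_1}{2g_2}\Big) + P\Big(\Big|\frac1n\sum_{i=1}^n\eta_i\Big| > \frac{\lambda}{2g_2A^{1/2}\|f_X\|_{W_r^N}}\Big) =: P_1+P_2, \end{aligned} $$
where $P_1 := P\big(\big|\frac1n\sum_{i=1}^n\xi_i\big| > \frac{\lambda g_1}{2g_2}\big)$ and $P_2 := P\big(\big|\frac1n\sum_{i=1}^n\eta_i\big| > \frac{\lambda}{2g_2A^{1/2}\|f_X\|_{W_r^N}}\big)$.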
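Now, we estimate $P_1$. Clearly, $E\xi_i = 0$, and
$$ E\xi_i^2 = E\Big(\frac{\mu\psi_{j,k}(Y_i)}{g(Y_i)}-\beta_{j,k}\Big)^2 \le 2\Big(E\Big(\frac{\mu\psi_{j,k}(Y_i)}{g(Y_i)}\Big)^2+\beta_{j,k}^2\Big) \le 2\Big(\frac{g_2}{g_1}\|f_X\|_{W_r^N} + A\|f_X\|^2_{W_r^N}\Big) =: \sigma^2. $$
Furthermore, we have
$$ |\xi_i| = \Big|\frac{\mu\psi_{j,k}(Y_i)}{g(Y_i)} - E\frac{\mu\psi_{j,k}(Y_i)}{g(Y_i)}\Big| \le 2\cdot2^{j/2}g_2g_1^{-1}\|\psi\|_\infty. $$
By Bernstein's inequality and $\lambda = c\sqrt{j/n}$, we obtain
$$ \begin{aligned} P\Big(\Big|\frac1n\sum_{i=1}^n\xi_i\Big| > \frac{\lambda g_1}{2g_2}\Big) &\le 2\exp\Big(-\frac{n\lambda^2g_1^2/(4g_2^2)}{2\big(\sigma^2 + \frac{\lambda g_1}{2g_2}\cdot 2\cdot 2^{j/2}g_2g_1^{-1}\|\psi\|_\infty/3\big)}\Big) \\ &= 2\exp\Big(-\frac{c^2jg_1^2/(4g_2^2)}{2\big(\sigma^2 + \sqrt{j2^j/n}\,\|\psi\|_\infty c/3\big)}\Big). \end{aligned} $$
Since $j2^j\le n$, then
$$ P\Big(\Big|\frac1n\sum_{i=1}^n\xi_i\Big| > \frac{\lambda g_1}{2g_2}\Big) \le 2\exp\Big(-\frac{c^2g_1^2/(4g_2^2)}{2(\sigma^2+\|\psi\|_\infty c/3)}\,j\Big). $$
Taking $c_1>0$ such that $\frac{c_1^2g_1^2/(4g_2^2)}{2(\sigma^2+\|\psi\|_\infty c_1/3)}\ge\omega$, then for $c\ge c_1$,
$$ P_1 = P\Big(\Big|\frac1n\sum_{i=1}^n\xi_i\Big| > \frac{\lambda g_1}{2g_2}\Big) \le 2e^{-\omega j} \lesssim 2^{-\omega j}. \tag{3.9} $$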
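Next, we estimate $P_2$. We compute that $E\eta_i = 0$, i.e.,
$$ E\eta_i = E\frac1{g(Y_i)} - \frac1\mu = \int\frac1{g(y)}f_Y(y)\,dy - \frac1\mu = \int\frac1{g(y)}\frac{g(y)f_X(y)}{\mu}\,dy - \frac1\mu = \frac1\mu\int f_X(y)\,dy - \frac1\mu = 0, $$
and
$$ E\eta_i^2 = E\Big(\frac1{g(Y_i)}-\frac1\mu\Big)^2 \le 2\Big(E\frac1{g(Y_i)^2}+\frac1{\mu^2}\Big) \le \frac4{g_1^2}, \qquad |\eta_i| = \Big|\frac1{g(Y_i)} - E\frac1{g(Y_i)}\Big| \le 2g_1^{-1}. $$
By Bernstein's inequality and $\lambda = c\sqrt{j/n}$, we obtain
$$ \begin{aligned} P\Big(\Big|\frac1n\sum_{i=1}^n\eta_i\Big| > \frac{\lambda}{2g_2A^{1/2}\|f_X\|_{W_r^N}}\Big) &\le 2\exp\Big(-\frac{n\big(\lambda/(2g_2A^{1/2}\|f_X\|_{W_r^N})\big)^2}{2\big(\frac4{g_1^2} + \lambda g_1^{-1}/(3g_2A^{1/2}\|f_X\|_{W_r^N})\big)}\Big) \\ &= 2\exp\Big(-\frac{c^2j/(4g_2^2A\|f_X\|^2_{W_r^N})}{2\big(\frac4{g_1^2} + c\sqrt{j/n}/(3g_1g_2A^{1/2}\|f_X\|_{W_r^N})\big)}\Big). \end{aligned} $$
Since $j\le n$, then
$$ P\Big(\Big|\frac1n\sum_{i=1}^n\eta_i\Big| > \frac{\lambda}{2g_2A^{1/2}\|f_X\|_{W_r^N}}\Big) \le 2\exp\Big(-\frac{c^2/(4g_2^2A\|f_X\|^2_{W_r^N})}{2\big(\frac4{g_1^2} + c/(3g_1g_2A^{1/2}\|f_X\|_{W_r^N})\big)}\,j\Big). $$
Taking $c_2>0$ such that $\frac{c_2^2/(4g_2^2A\|f_X\|^2_{W_r^N})}{2(\frac4{g_1^2} + c_2/(3g_1g_2A^{1/2}\|f_X\|_{W_r^N}))}\ge\omega$, we have for $c\ge c_2$,
$$ P_2 = P\Big(\Big|\frac1n\sum_{i=1}^n\eta_i\Big| > \frac{\lambda}{2g_2A^{1/2}\|f_X\|_{W_r^N}}\Big) \le 2e^{-\omega j} \lesssim 2^{-\omega j}. \tag{3.10} $$
Taking $c = \max\{c_1,c_2\}$, by (3.9) and (3.10), we have
$$ P\big(|\hat\beta_{j,k}-\beta_{j,k}|>\lambda\big) \lesssim 2^{-\omega j}. $$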
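Lemma 3.3 Suppose that there exist two constants $g_1$ and $g_2$ such that $0<g_1\le g(x)\le g_2<\infty$ for $x\in\mathbb{R}$, and $\hat\beta_{jk}$, $\hat\beta^*_{jk}$ are given by (3.1). Then
$$ E\Big\|\sum_{j=j_0}^{j_1}\sum_k(\hat\beta^*_{jk}-\beta_{jk})\psi_{jk}\Big\|_p \lesssim \begin{cases} (\ln n)^{c_1}n^{-\frac{N}{2N+1}}, & r>\frac{p}{2N+1},\\ (\ln n)^{c_2}\big(\frac{\ln n}{n}\big)^{\frac{N'}{2(N-1/r)+1}}, & r=\frac{p}{2N+1},\\ \big(\frac{\ln n}{n}\big)^{\frac{N'}{2(N-1/r)+1}}, & r<\frac{p}{2N+1}, \end{cases} $$
where $c_1$, $c_2$ are constants.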
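Proof By Lemma 2.2, we obtain
$$ E\Big\|\sum_{j=j_0}^{j_1}\sum_k(\hat\beta^*_{jk}-\beta_{jk})\psi_{jk}\Big\|_p \le E\sum_{j=j_0}^{j_1}\Big\|\sum_k(\hat\beta^*_{jk}-\beta_{jk})\psi_{jk}\Big\|_p \lesssim E\sum_{j=j_0}^{j_1}2^{j(\frac12-\frac1p)}\Big(\sum_k|\hat\beta^*_{jk}-\beta_{jk}|^p\Big)^{\frac1p}. $$
Furthermore, since $\hat\beta^*_{jk} = \hat\beta_{jk}I\{|\hat\beta_{jk}|\ge\lambda\}$, we have
$$ \begin{aligned} \sum_k|\hat\beta^*_{jk}-\beta_{jk}|^p &= \sum_k|\hat\beta_{jk}-\beta_{jk}|^p\Big[I\big\{|\hat\beta_{jk}|\ge\lambda, |\beta_{jk}|\ge\tfrac\lambda2\big\} + I\big\{|\hat\beta_{jk}|\ge\lambda, |\beta_{jk}|<\tfrac\lambda2\big\}\Big] \\ &\quad + \sum_k|\beta_{jk}|^p\Big[I\big\{|\hat\beta_{jk}|<\lambda, |\beta_{jk}|\le2\lambda\big\} + I\big\{|\hat\beta_{jk}|<\lambda, |\beta_{jk}|>2\lambda\big\}\Big]. \end{aligned} $$
Note that
$$ I\big\{|\hat\beta_{jk}|\ge\lambda, |\beta_{jk}|<\tfrac\lambda2\big\} \le I\big\{|\hat\beta_{jk}-\beta_{jk}|>\tfrac\lambda2\big\}, \qquad I\big\{|\hat\beta_{jk}|<\lambda, |\beta_{jk}|>2\lambda\big\} \le I\big\{|\hat\beta_{jk}-\beta_{jk}|>\tfrac\lambda2\big\}, $$
and if $|\hat\beta_{jk}|<\lambda$, $|\beta_{jk}|>2\lambda$, we get $|\hat\beta_{jk}-\beta_{jk}|\ge|\beta_{jk}|-|\hat\beta_{jk}|>\frac{|\beta_{jk}|}2$, i.e., $|\beta_{jk}|<2|\hat\beta_{jk}-\beta_{jk}|$; therefore, we have
$$ E\Big\|\sum_{j=j_0}^{j_1}\sum_k(\hat\beta^*_{jk}-\beta_{jk})\psi_{jk}\Big\|_p \lesssim W_1+W_2+W_3, $$
where
$$ W_1 := E\sum_{j=j_0}^{j_1}2^{j(\frac12-\frac1p)}\Big(\sum_k|\hat\beta_{jk}-\beta_{jk}|^pI\big\{|\beta_{jk}|\ge\tfrac\lambda2\big\}\Big)^{\frac1p}, $$
$$ W_2 := E\sum_{j=j_0}^{j_1}2^{j(\frac12-\frac1p)}\Big(\sum_k|\hat\beta_{jk}-\beta_{jk}|^pI\big\{|\hat\beta_{jk}-\beta_{jk}|>\tfrac\lambda2\big\}\Big)^{\frac1p}, $$
$$ W_3 := \sum_{j=j_0}^{j_1}2^{j(\frac12-\frac1p)}\Big(\sum_k|\beta_{jk}|^pI\big\{|\beta_{jk}|\le2\lambda\big\}\Big)^{\frac1p}. $$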
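(i) Firstly, we estimate
$$ W_1 := E\sum_{j=j_0}^{j_1}2^{j(\frac12-\frac1p)}\Big(\sum_k|\hat\beta_{jk}-\beta_{jk}|^pI\big\{|\beta_{jk}|\ge\tfrac\lambda2\big\}\Big)^{\frac1p}. $$
By Lemma 3.1, we have $E|\hat\beta_{jk}-\beta_{jk}|^p\lesssim n^{-p/2}$. Using $I\{|\beta_{jk}|\ge\frac\lambda2\}\le\big(\frac{|\beta_{jk}|}{\lambda/2}\big)^r$ and Jensen's inequality, we obtain
$$ \begin{aligned} W_1 &\le \sum_{j=j_0}^{j_1}2^{j(\frac12-\frac1p)}\Big(\sum_kE|\hat\beta_{jk}-\beta_{jk}|^pI\big\{|\beta_{jk}|\ge\tfrac\lambda2\big\}\Big)^{\frac1p} \\ &\lesssim \sum_{j=j_0}^{j_1}2^{j(\frac12-\frac1p)}\Big(\sum_kn^{-\frac p2}\Big(\frac{|\beta_{jk}|}{\lambda/2}\Big)^r\Big)^{\frac1p} \lesssim \sum_{j=j_0}^{j_1}2^{j(\frac12-\frac1p)}n^{-\frac12}\lambda^{-\frac rp}\|\beta_{j\cdot}\|_r^{\frac rp}. \end{aligned} $$
By $\|\beta_{j\cdot}\|_r\lesssim2^{-j(N+\frac12-\frac1r)}$ and $\lambda\sim c\sqrt{\frac{\ln n}{n}}$, we have, with $\xi := \frac12\big(\frac rp(2N+1)-1\big)$,
$$ W_1 \lesssim \sum_{j=j_0}^{j_1}n^{-\frac12}2^{j(\frac12-\frac1p)}2^{-j(N+\frac12-\frac1r)\frac rp}\Big(\frac{n}{\ln n}\Big)^{\frac r{2p}} = n^{-\frac12}\Big(\frac{n}{\ln n}\Big)^{\frac r{2p}}\sum_{j=j_0}^{j_1}2^{-j\xi} \le n^{\frac{r-p}{2p}}\sum_{j=j_0}^{j_1}2^{-j\xi}. $$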
Using Lemma . and (.), we obtain
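$$ W_1 \lesssim \begin{cases} (\ln n)^{c_1}n^{-\frac{N}{2N+1}}, & r>\frac{p}{2N+1},\\ (\ln n)^{c_2}\big(\frac{\ln n}{n}\big)^{\frac{N'}{2(N-1/r)+1}}, & r=\frac{p}{2N+1},\\ \big(\frac{\ln n}{n}\big)^{\frac{N'}{2(N-1/r)+1}}, & r<\frac{p}{2N+1}, \end{cases} \tag{3.11} $$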
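where $c_1$, $c_2$ are constants.
(ii) For
$$ W_3 := \sum_{j=j_0}^{j_1}2^{j(\frac12-\frac1p)}\Big(\sum_k|\beta_{jk}|^pI\big\{|\beta_{jk}|\le2\lambda\big\}\Big)^{\frac1p}, $$
let $\xi := \frac12\big(\frac rp(2N+1)-1\big)$. By $I\{|\beta_{jk}|\le2\lambda\}\le\big(\frac{2\lambda}{|\beta_{jk}|}\big)^{p-r}$ ($r<p$), we have
$$ W_3 \le \sum_{j=j_0}^{j_1}2^{j(\frac12-\frac1p)}\Big(\sum_k|\beta_{jk}|^p\Big(\frac{2\lambda}{|\beta_{jk}|}\Big)^{p-r}\Big)^{\frac1p} = \sum_{j=j_0}^{j_1}2^{j(\frac12-\frac1p)}(2\lambda)^{\frac{p-r}p}\Big(\sum_k|\beta_{jk}|^r\Big)^{\frac1p} = \sum_{j=j_0}^{j_1}2^{j(\frac12-\frac1p)}(2\lambda)^{\frac{p-r}p}\|\beta_{j\cdot}\|_r^{\frac rp}. $$
Since $f_X\in W_r^N(\mathbb{R})$, then $\|\beta_{j\cdot}\|_r\lesssim2^{-j(N+\frac12-\frac1r)}$. Taking $\lambda = c\sqrt{\frac jn}\sim c\sqrt{\frac{\ln n}n}$ and $j_1-j_0\sim C(\ln n)$, we have
$$ \begin{aligned} W_3 &\lesssim \sum_{j=j_0}^{j_1}2^{j(\frac12-\frac1p)}\lambda^{\frac{p-r}p}2^{-j(N+\frac12-\frac1r)\frac rp} \lesssim \Big(\frac{\ln n}n\Big)^{\frac{p-r}{2p}}\sum_{j=j_0}^{j_1}2^{-j[\frac r{2p}(2N+1)-\frac12]} = \Big(\frac{\ln n}n\Big)^{\frac{p-r}{2p}}\sum_{j=j_0}^{j_1}2^{-j\xi} \\ &\lesssim \Big(\frac{\ln n}n\Big)^{\frac{p-r}{2p}}\Big[2^{-j_0\xi}I\{\xi>0\} + 2^{-j_1\xi}I\{\xi<0\} + (j_1-j_0+1)I\{\xi=0\}\Big]. \end{aligned} $$
Note that $\xi>0$ if and only if $r>\frac{p}{2N+1}$. When $\xi=0$, i.e., $p=r(2N+1)$, we can compute $\frac{N'}{2(N-1/r)+1} = \frac{p-r}{2p}$. Using (3.2), (3.3), we obtain
$$ W_3 \lesssim \begin{cases} \big(\frac{\ln n}n\big)^{\frac{p-r}{2p}}2^{-j_0\xi} = (\ln n)^{\frac{p-r}{2r(2N+1)}}n^{-\frac{N}{2N+1}}, & r>\frac{p}{2N+1},\\ \big(\frac{\ln n}n\big)^{\frac{p-r}{2p}}(j_1-j_0+1) \lesssim (\ln n)\big(\frac{\ln n}n\big)^{\frac{p-r}{2p}}, & r=\frac{p}{2N+1},\\ \big(\frac{\ln n}n\big)^{\frac{p-r}{2p}}2^{-j_1\xi} = \big(\frac{\ln n}n\big)^{\frac{N'}{2(N-1/r)+1}}, & r<\frac{p}{2N+1}. \end{cases} \tag{3.12} $$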
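(iii) Finally, we estimate
$$ W_2 := E\sum_{j=j_0}^{j_1}2^{j(\frac12-\frac1p)}\Big(\sum_k|\hat\beta_{jk}-\beta_{jk}|^pI\big\{|\hat\beta_{jk}-\beta_{jk}|>\tfrac\lambda2\big\}\Big)^{\frac1p}. $$
Let $1<q_1,q_2<\infty$ and $\frac1{q_1}+\frac1{q_2}=1$. Using Jensen's inequality and Hölder's inequality, we have
$$ \begin{aligned} W_2 &\le \sum_{j=j_0}^{j_1}2^{j(\frac12-\frac1p)}\Big(\sum_kE\Big[|\hat\beta_{jk}-\beta_{jk}|^pI\big\{|\hat\beta_{jk}-\beta_{jk}|>\tfrac\lambda2\big\}\Big]\Big)^{\frac1p} \\ &\le \sum_{j=j_0}^{j_1}2^{j(\frac12-\frac1p)}\Big(\sum_k\big(E|\hat\beta_{jk}-\beta_{jk}|^{q_1p}\big)^{\frac1{q_1}}\Big(EI^{q_2}\big\{|\hat\beta_{jk}-\beta_{jk}|>\tfrac\lambda2\big\}\Big)^{\frac1{q_2}}\Big)^{\frac1p} \\ &\le \sum_{j=j_0}^{j_1}2^{j(\frac12-\frac1p)}\Big(\sum_k\big(E|\hat\beta_{jk}-\beta_{jk}|^{q_1p}\big)^{\frac1{q_1}}P\big(|\hat\beta_{jk}-\beta_{jk}|>\tfrac\lambda2\big)^{\frac1{q_2}}\Big)^{\frac1p}. \end{aligned} $$
By Lemma 3.1 and Lemma 3.2, we obtain
$$ W_2 \lesssim \sum_{j=j_0}^{j_1}2^{j(\frac12-\frac1p)}\Big(2^jn^{-\frac p2}2^{-\frac{\omega j}{q_2}}\Big)^{\frac1p} = n^{-\frac12}\sum_{j=j_0}^{j_1}2^{j(\frac12-\frac{\omega}{pq_2})}. $$
Taking $\omega$ large enough such that $\frac12<\frac{\omega}{pq_2}$, we get
$$ W_2 \lesssim n^{-\frac12}2^{j_0(\frac12-\frac{\omega}{pq_2})} \le \sqrt{\frac{2^{j_0}}{n}}. $$
Taking $j_0$ as in (3.2), we have
$$ W_2 \lesssim \begin{cases} (\ln n)^{\frac{p-r}{2r(2N+1)}}n^{-\frac{N}{2N+1}}, & r>\frac{p}{2N+1},\\ n^{-\frac{N'}{2(N-1/r)+1}}, & r\le\frac{p}{2N+1}. \end{cases} \tag{3.13} $$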
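Putting (3.11), (3.12) and (3.13) together, we can obtain
$$ E\Big\|\sum_{j=j_0}^{j_1}\sum_k(\hat\beta^*_{jk}-\beta_{jk})\psi_{jk}\Big\|_p \lesssim \begin{cases} (\ln n)^{c_1}n^{-\frac{N}{2N+1}}, & r>\frac{p}{2N+1},\\ (\ln n)^{c_2}\big(\frac{\ln n}{n}\big)^{\frac{N'}{2(N-1/r)+1}}, & r=\frac{p}{2N+1},\\ \big(\frac{\ln n}{n}\big)^{\frac{N'}{2(N-1/r)+1}}, & r<\frac{p}{2N+1}, \end{cases} $$
where $c_1$, $c_2$ are constants.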
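Theorem 3.1 Let the scaling function $\varphi(x)$ be orthonormal, compactly supported and $N+1$ regular. If $\hat f_n^{X,\mathrm{non}}$ is the nonlinear wavelet estimator in (3.1), and assumptions (3.2), (3.3) and (3.4) are satisfied, then for any $f_X\in\tilde W_r^N(A,L)$, where $1\le r<p<\infty$, $N>1/r$, we have
$$ \sup_{f_X\in\tilde W_r^N(A,L)}E\big\|\hat f_n^{X,\mathrm{non}}-f_X\big\|_p \lesssim \begin{cases} (\ln n)^{c_1}n^{-\frac{N}{2N+1}}, & r>\frac{p}{2N+1},\\ (\ln n)^{c_2}\big(\frac{\ln n}{n}\big)^{\frac{N'}{2(N-1/r)+1}}, & r=\frac{p}{2N+1},\\ \big(\frac{\ln n}{n}\big)^{\frac{N'}{2(N-1/r)+1}}, & r<\frac{p}{2N+1}, \end{cases} $$
where $N' = N-1/r+1/p$ and $c_1$, $c_2$ are constants.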
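Proof By the definition of $\hat f_n^{X,\mathrm{non}}$ in (3.1) and the expansion of $f_X$ in (2.1), one has
$$ \hat f_n^{X,\mathrm{non}}-f_X = \sum_k(\hat\alpha_{j_0k}-\alpha_{j_0k})\varphi_{j_0k} + \sum_{j=j_0}^{j_1}\sum_k(\hat\beta^*_{jk}-\beta_{jk})\psi_{jk} + P_{j_1+1}f_X-f_X. $$
Then
$$ E\big\|\hat f_n^{X,\mathrm{non}}-f_X\big\|_p \le E\Big\|\sum_k(\hat\alpha_{j_0k}-\alpha_{j_0k})\varphi_{j_0k}\Big\|_p + E\Big\|\sum_{j=j_0}^{j_1}\sum_k(\hat\beta^*_{jk}-\beta_{jk})\psi_{jk}\Big\|_p + \|P_{j_1+1}f_X-f_X\|_p =: I_1+I_2+I_3, $$
where
$$ I_1 := E\Big\|\sum_k(\hat\alpha_{j_0k}-\alpha_{j_0k})\varphi_{j_0k}\Big\|_p, \qquad I_2 := E\Big\|\sum_{j=j_0}^{j_1}\sum_k(\hat\beta^*_{jk}-\beta_{jk})\psi_{jk}\Big\|_p, \qquad I_3 := \|P_{j_1+1}f_X-f_X\|_p. $$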
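Firstly, we estimate
$$ I_1 := E\Big\|\sum_k(\hat\alpha_{j_0k}-\alpha_{j_0k})\varphi_{j_0k}\Big\|_p. $$
By Lemma 2.2 and Jensen's inequality,
$$ I_1 \lesssim 2^{j_0(\frac12-\frac1p)}E\Big(\sum_k|\hat\alpha_{j_0k}-\alpha_{j_0k}|^p\Big)^{\frac1p} \le 2^{j_0(\frac12-\frac1p)}\Big(\sum_kE|\hat\alpha_{j_0k}-\alpha_{j_0k}|^p\Big)^{\frac1p}. $$
Since $f_X(x)$ and $\varphi(x)$ are compactly supported, the number of elements in $\{k:\alpha_{j_0k}\ne0\}$ is $O(2^{j_0})$. By Lemma 3.1, we have $E|\hat\alpha_{j_0k}-\alpha_{j_0k}|^p\lesssim n^{-p/2}$.
Therefore
$$ I_1 \lesssim 2^{j_0(\frac12-\frac1p)}\big(2^{j_0}n^{-\frac p2}\big)^{\frac1p} = 2^{\frac{j_0}2}n^{-\frac12}. $$
Using (3.2), we have
$$ I_1 \lesssim \begin{cases} (\ln n)^{\frac{p-r}{2r(2N+1)}}n^{-\frac{N}{2N+1}}, & r>\frac{p}{2N+1},\\ n^{-\frac{N'}{2(N-1/r)+1}}, & r\le\frac{p}{2N+1}, \end{cases} \tag{3.14} $$
where $N' = N-\frac1r+\frac1p$. Next, we estimate
$$ I_3 := \|P_{j_1+1}f_X-f_X\|_p. $$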
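In reference [9], it is shown that if the scaling function $\varphi(x)$ is orthonormal, compactly supported and $N+1$ regular, then the associated kernel function $K(x,y) := \sum_k\varphi(x-k)\varphi(y-k)$ satisfies Conditions $H(N+1)$ and $M(N)$, and $K_jf(x) = P_jf(x)$.
Since Sobolev and Besov spaces satisfy the embedding $\tilde W_r^N\hookrightarrow\tilde B^N_{r\infty}\hookrightarrow\tilde B^{N'}_{p\infty}$, where $N' = N-\frac1r+\frac1p$, we have $f_X\in\tilde B^{N'}_{p\infty}$. By Lemma 2.3, we have $\|P_{j_1+1}f_X-f_X\|_p\lesssim2^{-j_1N'}$.
Taking $j_1$ as in (3.3), we have
$$ I_3 \lesssim \begin{cases} n^{-\frac{N}{2N+1}}, & r>\frac{p}{2N+1},\\ \big(\frac{\ln n}{n}\big)^{\frac{N'}{2(N-1/r)+1}}, & r\le\frac{p}{2N+1}. \end{cases} \tag{3.15} $$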
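Finally, we estimate
$$ I_2 := E\Big\|\sum_{j=j_0}^{j_1}\sum_k(\hat\beta^*_{jk}-\beta_{jk})\psi_{jk}\Big\|_p. $$
Using Lemma 3.3, we obtain
$$ I_2 \lesssim \begin{cases} (\ln n)^{c_1}n^{-\frac{N}{2N+1}}, & r>\frac{p}{2N+1},\\ (\ln n)^{c_2}\big(\frac{\ln n}{n}\big)^{\frac{N'}{2(N-1/r)+1}}, & r=\frac{p}{2N+1},\\ \big(\frac{\ln n}{n}\big)^{\frac{N'}{2(N-1/r)+1}}, & r<\frac{p}{2N+1}. \end{cases} \tag{3.16} $$
By (3.14), (3.15) and (3.16), we obtain
$$ \sup_{f_X\in\tilde W_r^N(A,L)}E\big\|\hat f_n^{X,\mathrm{non}}-f_X\big\|_p \lesssim \begin{cases} (\ln n)^{c_1}n^{-\frac{N}{2N+1}}, & r>\frac{p}{2N+1},\\ (\ln n)^{c_2}\big(\frac{\ln n}{n}\big)^{\frac{N'}{2(N-1/r)+1}}, & r=\frac{p}{2N+1},\\ \big(\frac{\ln n}{n}\big)^{\frac{N'}{2(N-1/r)+1}}, & r<\frac{p}{2N+1}. \end{cases} $$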
4 Optimality
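Theorem 4.1 Let the scaling function $\varphi(x)$ be orthonormal, compactly supported and $N+1$ regular, and let $f_X\in\tilde W_r^N(A,L)$. If there exist two positive constants $g_1$ and $g_2$ such that $g_1\le g(x)\le g_2$, $x\in\mathbb{R}$, then for any estimator $\hat f_n^X$, we have
$$ \inf_{\hat f_n^X}\sup_{f_X\in\tilde W_r^N(A,L)}E\big\|\hat f_n^X-f_X\big\|_p \gtrsim \begin{cases} n^{-\frac{N}{2N+1}}, & r>\frac{p}{2N+1},\\ \big(\frac{\ln n}{n}\big)^{\frac{N'}{2(N-1/r)+1}}, & r\le\frac{p}{2N+1}, \end{cases} $$
where $1\le r,p<\infty$, $N>1/r$.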
Remark The proof is very similar to that in reference [10], in which the author studied the lower bound of the convergence rates in Besov spaces for samples without bias.
According to Theorem 4.1, we can see that:
(i) When $r<\frac{p}{2N+1}$, our nonlinear estimator attains the optimal rate.
(ii) When $r=\frac{p}{2N+1}$, our convergence rate and the optimal rate of convergence differ by a logarithmic factor, so our rate is sub-optimal.
(iii) When $r>\frac{p}{2N+1}$, the logarithmic factor is an extra penalty for the chosen wavelet thresholding, and our convergence rate is sub-optimal.
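The comparison between the upper and lower bounds can be summarised in a few lines of code. The following is a minimal sketch, assuming the rates $n^{-N/(2N+1)}$ for $r > p/(2N+1)$ and $(\ln n/n)^{N'/(2(N-1/r)+1)}$ otherwise, with $N' = N - 1/r + 1/p$; the function name `rate_exponent` and its return format are hypothetical. It reports the regime, the polynomial exponent shared by both bounds, and whether the upper bound of Theorem 3.1 carries an extra logarithmic factor.

```python
def rate_exponent(N, r, p):
    """Return (regime, polynomial exponent in n, extra_log_in_upper_bound)."""
    critical = p / (2 * N + 1)
    if r > critical:
        return "r > p/(2N+1)", N / (2 * N + 1), True
    n_prime = N - 1.0 / r + 1.0 / p
    exponent = n_prime / (2 * (N - 1.0 / r) + 1)
    if r == critical:
        return "r = p/(2N+1)", exponent, True
    return "r < p/(2N+1)", exponent, False

dense = rate_exponent(N=2, r=2, p=4)      # regime r > p/(2N+1)
boundary = rate_exponent(N=2, r=1, p=5)   # regime r = p/(2N+1)
sparse = rate_exponent(N=2, r=1, p=8)     # regime r < p/(2N+1)
```

At the boundary $r = p/(2N+1)$ the two exponent formulas coincide, so the rate is continuous across the regimes up to logarithmic factors.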
Competing interests
The authors declare that they have no competing interests.
Authors’ contributions
WJR participated in the sequence alignment and drafted the manuscript. WM participated in the design of the study and performed the statistical analysis. ZY conceived of the study and participated in its design and coordination. All authors read and approved the final manuscript.
Acknowledgements
This paper is supported by the National Natural Science Foundation of China (No. 11271038) and Foundation of BJUT (No. 006000542213501).
Received: 12 January 2013 Accepted: 14 June 2013 Published: 3 July 2013
References
1. Efromovich, S: Nonparametric Curve Estimation. Methods, Theory, and Applications. Springer Series in Statistics. Springer, New York (1999)
2. Vardi, Y: Nonparametric estimation in the presence of length bias. Ann. Stat. 10(2), 616-620 (1982)
3. Jones, MC: Kernel density estimation for length-biased data. Biometrika 78(3), 511-519 (1991)
4. Efromovich, S: Density estimation for biased data. Ann. Stat. 32, 1137-1161 (2004)
5. Ramirez, P, Vidakovic, B: Wavelet density estimation for stratified size-biased sample. J. Stat. Plan. Inference 140(2), 419-432 (2010)
6. Christophe, C: Wavelet block thresholding for density estimation in the presence of bias. J. Korean Stat. Soc. 39, 43-53 (2010)
7. Kelly, C, Kon, MA, Raphael, LA: Local convergence for wavelet expansion. J. Funct. Anal. 126, 102-138 (1994)
8. Triebel, H: Theory of Function Spaces. Birkhäuser, Basel (1983)
9. Härdle, W, Kerkyacharian, G, Picard, D, Tsybakov, A: Wavelets, Approximation and Statistical Applications. Springer, Berlin (1997)
10. Wang, HY: Convergence rates of density estimation in Besov spaces. Appl. Math. 2(10), 1258-1262 (2011)
doi:10.1186/1029-242X-2013-308