Uniform convergence of estimator for nonparametric regression with dependent data

(1)

R E S E A R C H

Open Access

Uniform convergence of estimator for

nonparametric regression with dependent

data

Xiaoqin Li, Wenzhi Yang

*

_{and Shuhe Hu}

*_{Correspondence:}

[email protected] School of Mathematical Science, Anhui University, Hefei, 230601, China

Abstract

In this paper, the authors investigate the internal estimator of nonparametric regression with dependent data such as

α

-mixing. Under suitable conditions such as the arithmetically

α

-mixing andE|Y1|s<∞(s> 2), the convergence rate

|mn(x) –m(x)|=OP(an) +O(h2) and uniform convergence rate

sup_x_∈_S

f|mn(x) –m(x)|=Op(an) +O(h

2_{) are presented, if}_a

n=

lnn

nhd →0. We generalize

some results in Shen and Xie (Stat. Probab. Lett. 83:1915-1925, 2013).

MSC: 62G05; 62G08

Keywords: nonparametric regression; internal estimator; arithmetically

α

-mixing; uniform convergence rate

1 Introduction

Kernel-type estimators of the regression function are widely various situations because of their ﬂexibility and eﬃciency, in the dependent cases as well as in the independent data case. This paper is concerned with the nonparametric regression model

Yi=m(Xi) +Ui, ≤i≤n,n≥, (.)

where (Xi,Yi)∈Rd×R, ≤i≤n,Uiis random variable such thatE(Ui|Xi) = , ≤i≤n.

Then one has

E(Yi|Xi=x) =m(x), ≤i≤n,n≥.

The most popular nonparametric estimators of the unknown function m(x) is the Nadaraya-Watson estimatormNW(x) given below and the local polynomials ﬁtting. Let

K(x) be a kernel function. DeﬁneKh(x) =h–dK(x/h), whereh=hnis a sequence of positive

bandwidths tending to zero asn→ ∞. Kernel-type estimators of the regression function are widely various situations because of their ﬂexibility and eﬃciency, in the dependent data case as well as the independent data case. For the independent data, Nadaraya [] and Watson [] gave the most popular nonparametric estimators of the unknown

(2)

tionm(x), named the Nadaraya-Watson estimatormNW(x),i.e.

mNW(x) =

n

i=YiKh(x–Xi)

n

i=Kh(x–Xi)

. (.)

Joneset al.[] considered various versions of kernel-type regression estimators such as the Nadaraya-Watson estimator (.) and the local linear estimator. They also investigated the following internal estimator:

mn(x) =

 n

n

i=

YiKh(x–Xi)

f(Xi)

(.)

for known densityf(·). The term ‘internal’ stands for the fact that the factor 

f(Xi) is in-ternal to the summation, while the estimatormNW(x) has the factorf(x) =



n–n i=Kh(x–Xi) externally to the summation.

The internal estimator was ﬁrst proposed by Mack and Müller []. Joneset al.[] stud-ied various kernel-type regression estimators, including introduced the internal estimator (.). Linton and Nielsen [] introduced ‘integration method’, based on direct integration of initial pilot estimator (.). Linton and Jacho-Chávez [] studied the two internal non-parametric estimators with the estimator similar to estimator (.) but in place of an un-known densityf(·), a classical kernel estimatorf(x) =n–n_i_=Kh(x–Xi) is used. Much

work has been done for the kernel density estimation. For example, Masry [] gave the recursive probability density estimation for a mixing dependent sample, Roussaset al.[] and Tranet al.[] investigated the ﬁxed design regression for dependent data, Liebscher [] studied the strong convergence of sums ofα-mixing random variables and gave its application to density estimation, Hansen [] obtained the uniform convergence rates for kernel estimation with dependent data, and so on. For more work as regards kernel estimation, we can also refer to [–] and the references therein.

Let (,F,P) be a ﬁxed probability space. DenoteN={, , . . . ,n, . . .}. LetFn

m=σ(Xi,m≤

i≤n,i∈N) be theσ-ﬁeld generated by random variablesXm,Xm+, . . . ,Xn, ≤m≤n. For

n≥, we deﬁne

α(n) =sup m∈N

sup A∈F_m,B∈Fm+n∞

P(AB) –P(A)P(B).

Deﬁnition . Ifα(n)↓ asn→ ∞, then{Xn,n≥}is called a strong mixing orα-mixing

sequence.

Recently, Shen and Xie [] obtained the strong consistency of the internal estimator (.) underα-mixing data. In their paper, the process is assumed to be geometricallyα-mixing sequence,i.e.the mixing coeﬃcientsα(n) satisfyα(n)≤βe–βn, whereβ>  andβ> .

They also supposed that the sequence{Yn,n≥}is bounded, as well as the densityf(x)

ofX. Inspired by Hansen [], Shen and Xie [] and other papers above, we also

investi-gate the convergence of the internal estimator (.) underα-mixing data. The process is supposed to be an arithmeticallyα-mixing sequence,i.e.the mixing coeﬃcientsα(n) sat-isfyα(n)≤Cn–β_,_C_{> , and}_β_{> . Without the bounded conditions of}_{_Y

n,n≥}and the

(3)

internal estimator (.). For the details, please see our results in Section . The conclusion and the lemmas and proofs of the main results are presented in Section  and Section , respectively.

Regarding notation, forx= (x, . . . ,xd)∈Rd, set x =max(|x|, . . . ,|xd|). Throughout the

paper,C,C,C,C, . . . , denote some positive constants not depending onn, which may be

diﬀerent in various places.xdenotes the largest integer not exceedingx.→means to take the limit asn→ ∞,−→P means to convergence in probability.X=dY means that the random variablesXandYhave the same distribution. A sequence{Xn,n≥}is said to be

of second-order stationarity if (X,X+k) d

= (Xi,Xi+k),i≥,k≥.

2 Results and discussion

2.1 Some assumptions

Assumption . We assume the data observed{(Xn,Yn),n≥}valued inRd×Rcomes

from a second-order stationary stochastic sequence. The sequence{(Xn,Yn),n≥}is also

assumed to be arithmeticallyα-mixing with mixing coeﬃcientsα(n) such that

α(n)≤An–β_, _(.)

whereA<∞and for somes> 

E|Y|s<∞ (.)

and

β≥s– 

s–  . (.)

The known densityf(·) ofX is upon its compact supportSf and it is also assumed that inf_x_∈_S_f f(x) > . LetBbe a positive constant such as

sup x∈Sf

E|Y|s|X=x

f(x)≤B. (.)

Also, there is aj∗<∞such that for allj≥j∗

sup x∈Sf,xj+∈Sf

E|YYj+||X=x,Xj+=xj+

fj(x,xj+)≤B, (.)

whereBis a positive constant andfj(x,xj+) denotes the joint density of (X,Xj+).

Assumption . There exist two positive constantsK¯ >  andμ>  such that

sup u∈Rd

K(u)≤ ¯K and

Rd

K(u)du=μ. (.)

Assumption . Denote byS_f the interior ofSf. Forx∈Sf, the functionm(x) is twice

diﬀerentiable and there exists a positive constantbsuch that

(4)

The kernel density functionK(·) is symmetrical and satisﬁes

Rd|vi||vj|K

(v)dv<∞, ∀i,j= , , . . . ,d.

Assumption . The kernel function satisﬁes the Lipschitz condition,i.e.

∃L> , K(u) –Ku≤Lu–u, u,u∈Rd.

Remark . Similar to Assumption  of Hansen [], Assumption . speciﬁes that the serial dependence in the data is of strong mixing type, and equations (.)-(.) specify a required decay rate. Condition (.) controls the tail behavior of the conditional ex-pectationE(|Y|s|X=x), condition (.) places a similar bound on the joint density and

conditional expectation. Assumptions .-. are the conditions of kernel functionK(u), i.e., Assumption . is a general condition, Assumption . is used to estimate the con-vergence rate of|Emn(x) –m(x)|, and Assumption . is used to investigate the uniform

convergence rate of the internal estimatormn(x).

2.2 Main results

First, we investigate the variance bound of estimatormn(x). For ≤r≤sands> , denote ¯

μ(r,s) := (B)r/sK¯r–μ

(infx∈_Sff(x))r–+r/s, whereB,infx∈Sf f(x),K¯, andμare deﬁned in Assumptions . and ..

Theorem . Let Assumption. and Assumption. be fulﬁlled. Then there exists a

<∞such that for n suﬃciently large and x∈Sf

Varmn(x)

≤

nhd, (.)

where:=μ¯(,s) + j∗μ¯(,s) + (μ¯(,s) + Bμ

(infx∈Sff)) +

A–/sμ¯s(s,s) (s–)/s .

As an application to Theorem ., we obtain the weak consistency of estimatormn(x).

Corollary . Let Assumption.and Assumption.be fulﬁlled and K(·)be a density function.For x∈Sf,m(x)is supposed to be continuous at x.If nhd→ ∞as n→ ∞,then

mn(x)

P −

→m(x). (.)

Next, the convergence rate of estimatormn(x) is presented.

Theorem . For <θ< and s> ,let Assumptions.-.hold,where the mixing ex-ponentβsatisﬁes

β>max

 – θ+ θs ( –θ)(s– ),

s–  s– 

. (.)

Denote an=

lnn

nhd and take h=n–θ/d.Then for x∈Sf,one has

mn(x) –m(x)=OP(an) +O

(5)

Third, we now investigate the uniform convergence rate of estimatormn(x) and its

con-vergence over a compact set. LetS_fbe any compact set contained inS

f.

Theorem . For <θ< and s> ,let Assumptions.-.be fulﬁlled,where the mixing exponentβsatisﬁes

β>max

sθd+ θs+sd+s– θd– θ

( –θ)(s– ) , s– 

s– 

. (.)

Suppose that Assumption.is also fulﬁlled.Denote an=

lnn

nhd and take h=n–θ/d.Then

sup x∈S_f

mn(x) –m(x)=Op(an) +O

h. (.)

2.3 Discussion

The parametric θ in Theorem . and Theorem . plays the role of a bridge between the process (i.e.mixing exponent) and choice of positive bandwidthh. For example, if d= ,θ = _, andβ>max{_ss_–+,_ss_––}, then we takeh=n–/ in Theorem . and obtain the convergence rate|mn(x) –m(x)|=OP((lnn)/n–/). Similarly, ifd= ,θ=_, andβ>

s–

s–, then we chooseh=n–/in Theorem . and establish the uniform convergence rate

sup_x_∈_S

f |mn(x) –m(x)|=Op((lnn)

/_n–/_).

3 Conclusion

On the one hand, similar to Theorem ., Hansen [] investigated the kernel average estimator

(x) =  n

n

i=

YiKh(x–Xi),

and obtained the variance boundVar((x))≤_nhd, whereis a positive constant. For the details, see Theorem  of Hansen []. Under some other conditions, Hansen [] also gave the uniform convergence rates such assup_x_≤_c

n|(x) –E(x)|=OP(an), wherean=

lnn

nhd → and{cn}is a sequence of positive constant (see Theorems - of Hansen []). On the other hand, under the conditions such as the geometrically α-mixing and{Yn,

n≥}is bounded as well as the density functionf(x) ofX, Shen and Xie [] obtained the

complete convergence such as|mn(x) –m(x)| a.c.

−→, ifln_nhdn→ (see Theorem . of Shen and Xie []), the uniform complete convergence such assup_x_∈_S

f |mn(x) –m(x)|

a.c.

−→, if

lnn

nhd → (see Theorem . of Shen and Xie []). In this paper, we do not need the bounded conditions of{Yn,n≥} andf(x) ofX, and we also investigate the convergence of the

internal estimatormˆn(x). Under some weak conditions such as the arithmeticallyα-mixing

andE|Y|s<∞,s> , we establish the convergence rate in Theorem . such as|mn(x) –

m(x)|=OP(an) +O(h) ifan=

lnn

nhd →, and uniform convergence rate in Theorem .

such assup_x_∈_S

f|mn(x) –m(x)|=Op(an) +O(h

_{) if}_a

n=

lnn

nhd →. In Theorem . and Theorem ., we have|mn(x) –Emn(x)|=OP(an) andsupx∈S_f|mn(x) –Emn(x)|=OP(an),

(6)

4 Some lemmas and the proofs of the main results

Lemma .(Hall and Heyde, [], Corollary A.,i.e.Davydov’s lemma) Suppose that X and Y are random variables which areG-measurable andH-measurable,respectively, and E|X|p_<_∞_,_E_|_Y_|q_<_∞_,_{where p}_,_q_{> ,}_p–₊_q–_{< .}_Then

E(XY) –EXEY≤E|X|p/pE|Y|q/qα(G,H)–p––q–.

Lemma .(Liebscher [], Proposition .) Let{Xn,n≥} be a stationary α-mixing

sequence with mixing coeﬃcient α(k). Assume that EXi=  and|Xi| ≤S<∞,a.s., i=

, , . . . ,n.Then,for n,m∈N,  <m≤n/,and allε> ,

P

n

i=

Xi

>ε

≤exp

– ε



(_mnDm+_εSm)

+ S

εnα(m),

where Dm=max≤j≤mVar(

j i=Xi).

Lemma .(Shen and Xie [], Lemma .) Under Assumption.,for x∈S_f,one has

Emn(x) –m(x)=O

h.

Proof of Theorem. Forx∈Sf, letZi:=YiK_fh₍(_Xx_i–₎Xi), ≤i≤n. Consider now

mn(x) =

 n

n

i=

YiKh(x–Xi)

f(Xi)

= n

n

i=

Zi, n≥.

For any ≤r≤sands> , it follows from (.) and (.) that

hd(r–)E|Z|r=hd(r–)E

Kh(x–X)Y

f(X)

r

=hd(r–)E

|Kh(x–X)|r

fr₍_X

)

E|Y|r|X

=

Sf K

x–u

h

rE|Y|r|X=u

 hd

f(u) fr₍_u₎du

≤ Sf

K

x–u

h

rE|Y|s|X=u

f(u)r/s  hd

 fr–+r/s₍_u₎du

≤ (B)r/sK¯r–μ

(inf_x_∈_S_f f)r–+r/s:=μ¯(r,s) <∞. (.)

Forj≥j∗, by (.), one has

E|ZZj+|=E

Kh(x–X)Kh(x–Xj+)YYj+

f(X)f(Xj+)

=E _|

Kh(x–X)Kh(x–Xj+)|

f(X)f(Xj+)

E|YYj+||X,Xj+

=

Sf Sf K

x–u

h

K

x–uj+

h

E|YYj+||X=u,Xj=uj+

(7)

× 

hd

 f(u)f(uj+)

fj(u,uj+)duduj+

≤ B

(infx∈Sf f) Rd Rd

_K₍_u_₎_K₍_u_j_+₎_du__du_j_+_≤ Bμ

(infx∈Sf f)

<∞. (.)

Deﬁne the covariancesγj=Cov(Z,Zj+),j> . Assume thatnis suﬃciently large so that

h–d_≥_j∗_{. We now bound the}_γ

jseparately forj≤j∗,j∗<j≤h–d, andh–d<j<∞. First, for

≤j≤j∗, by the Cauchy-Schwarz inequality and (.) withr= ,

|γj| ≤

Var(Z)·Var(Zj+) =Var(Z)≤EZ≤ ¯μ(,s)h–d. (.)

Second, forj∗<j≤h–d, in view of (.) (r= ) and (.), we establish that

|γj| ≤E|ZZj+|+

E|Z|



≤ Bμ

(infx∈Sf f)

+μ¯(,s). (.)

Third, forj>h–d_{, we apply Lemma ., (.) and (.) with}_r₌_s₍_s_{> ) and we thus obtain}

|γj| ≤

α(j)–/sE|Z|s

/s

≤A––/sj–β(–/s)_μ_¯₍_s_,_s₎_h–d(s–)/s

≤A––/sμ¯s₍_s_,_s₎_j–(–/s)_h–d(s–)/s_. _(.)

Consequently, in view of the property of second-order stationarity and (.)-(.), forn suﬃciently large, we establish

Varmn(x)

= 

nVar

_n

i=

Zi

= 

n

nγ+ n

n–

j=

 – j

n

γj

≤ 

n

nh–dμ¯(,s) + n

≤j≤j∗

|γj|+ n

j∗<j≤h–d

|γj|+ n

h–d_<_j

|γj|

≤ 

nμ¯(,s)h

–d₊

nj

∗_μ_¯_(,_s₎_h–d₊

n

h–d–j∗μ¯(,s) + Bμ



(infx∈Sf f)

+ n

h–d_<_j_<_∞

A–/sμ¯s₍_s_,_s₎_j–(–/s)_h–d(s–)/s

≤

¯

μ(,s) + j∗μ¯(,s) + 

¯

μ(,s) + Bμ



(inf_x_∈_S_f f)

+A

–/s_μ_¯_s₍_s_,_s₎

(s– )/s

 nhd

:=

nhd, (.)

where the ﬁnal inequality uses the fact that ∞_j₌_k_+j–δ_≤∞ k x–

δ_dx₌k–δ

δ– forδ >  and

k≥.

(8)

Proof of Corollary. It is easy to see that

mn(x) –m(x)≤mn(x) –Em(x)+Emn(x) –m(x),

which can be treated as ‘variance’ part and ‘bias’ part, respectively.

On the one hand, by the proof of Theorem . of Shen and Xie [], one has|Emn(x) –

m(x)| →. On the other hand, we apply Theorem . and obtain that|mn(x) –Emn(x)| P − →

. So, (.) is proved ﬁnally.

Proof of Theorem. Letτn=a–/(n s–)and deﬁne

Rn=mn(x) –

 n

n

i=

YiKh(x–Xi)

f(Xi)

I|Yi| ≤τn

=

n

i=

YiKh(x–Xi)

f(Xi)

I|Yi|>τn

.

Obviously, we have

E|Rn| ≤E

|Kh(x–X)|

f(X)

E|Y|I

|Y|>τn

|X

≤ 

(infSff) Sf K

x–u

h

E|Y|I

|Y|>τn

|X=u

f(u) hd du

= 

(infSff) Sf

K(u)E|Y|I

|Y|>τn

|X=x–hu

f(x–hu)du

≤ 

(inf_S_ff) 

τs–

n Sf

K(u)E|Y|s|X=x–hu

f(x–hu)du

≤ 

(inf_S_ff)

μB

τs–

n

. (.)

Combining Markov’s inequality with (.), one has

|Rn–ERn|=OP

τ_n–(s–)=OP(an). (.)

Denote

mn(x) =

 n

n

i=

YiKh(x–Xi)

f(Xi)

I|Yi| ≤τn

:= 

n

i=

˜

Zn, n≥. (.)

It can be seen that

mn(x) –m(x)≤mn(x) –Emn(x)+Emn(x) –m(x)

≤mn(x) –Emn(x)+|Rn–ERn|+Emn(x) –m(x). (.)

Similar to the proof of (.), it can be argued that

Var

_j

i=

˜

Zi

(9)

which implies

Dm= max

≤j≤mVar

_j

i=

˜

Zi

≤Cmh–d.

Meanwhile, one has| ˜Zi–EZ˜i| ≤ C_hdτn, ≤i≤n. Settingm=a–n τn–and using (.),h=

n–θ/d_{, and Lemma . with}_ε₌_a

nn, we obtain fornsuﬃciently large

Pmn(x) –Emn(x)>an

=P

n

i=

(Z˜i–EZ˜i)

>nan

≤exp

– na



n

(Ch–d+_Ch–d)

+  Cτn annhd

nA(anτn)β

≤exp

– lnn (C+_C)

+Ch–da

β(s–)–s s–

n

≤o() +Cnθn(

θ–)β(s–)–s_(s–)

(lnn)β(s–)–s(s–)

=o() +Cn

β(θ–)(s–)+θs+s–θ

(s–) ₍_ln_n₎

β(s–)–s

(s–) ₌_o_(), _(.)

in view ofs> ,  <θ< ,β>max{_(–θs+_θ₎₍s–_s_–)θ ,_ss_––}, and β(θ–)(s_(–)+_s_–)θs+s–θ< .

Consequently, by (.), (.), (.), and Lemma ., we establish the result of (.).

Proof of Theorem. We use some similar notation in the proof of Theorem .. Obvi-ously, one has

sup x∈S_f

mn(x) –m(x)≤sup x∈S_f

mn(x) –Emn(x)+sup x∈S_f

Emn(x) –m(x). (.)

By the proof of (.) of Shen and Xie [], we establish that

Emn(x) –m(x)≤h

b 

≤i,j≤d Rd

K(v)|vivj|dv≤Ch,

which implies

sup x∈S_f

Emn(x) –m(x)=O

h. (.)

Sincemˆn(x) =Rn(x) +m˜n(x),

sup x∈S_f

mn(x) –Emn(x)≤sup x∈S_f

mn(x) –Emn(x)+sup x∈S_f

|Rn–ERn|. (.)

It follows from the proof of (.) that

sup x∈S_f

(10)

SinceS_f is a compact set, there exists aξ>  such thatS_f⊂B:={x: x ≤ξ}. Letvnbe

a positive integer. Take an open coveringvdn

j=Bjn ofB, whereBjn⊂ {x: x–xjn ≤ _vξ_n},

j= , , . . . ,vd_n, and their interiors are disjoint. So it follows that

sup x∈S_f

mn(x) –Emn(x)

≤ max

≤j≤vdn

sup x∈Bjn∩Sf

mn(x) –Emn(x)

≤ max

≤j≤vdn

sup x∈Bjn∩Sf

mn(x) –mn(xjn)+ max

≤j≤vdn

mn(xjn) –Emn(xjn)

+ max

≤j≤vdn

sup x∈Bjn∩Sf

Emn(xjn) –Emn(x)

:=I+I+I. (.)

By the deﬁnition ofmn(x) in (.) and the Lipschitz condition ofK,

mn(x) –mn(xjn)≤

inf S_f f

– _τ_n

nhd n

i=

K

x–Xi

h

–K

xjn–Xi

h

≤ Lτn

hd+_inf

S_ff

x–xjn , x∈Sf.

Takingvn=_hd+τn_a

n+ , we obtain

sup x∈Bjn∩Sf

mn(x) –mn(xjn)≤

Lξ inf_S

ff

an, ≤j≤vdn, (.)

and

I= max ≤j≤vdn

sup x∈Bjn∩Sf

mn(x) –mn(xjn)=O(an). (.)

In view of|Emn(xjn) –Emn(x)| ≤E|mn(xjn) –mn(x)|, we have by (.)

I= max ≤j≤vdn

sup x∈Bjn∩Sf

Emn(xjn) –Emn(x)≤

Lξ inf_S

ff

an=O(an). (.)

For ≤i≤nand ≤j≤vd_n, denoteZ˜i(j) =

YiKh(xjn–Xi)

f(Xi) I(|Yi| ≤τn). Then similar to the proof of (.), we obtain by Lemma . withm=a–

nτn–andε=Mnanfornsuﬃciently

large

P|I|>Man

=P

max

≤j≤vdn

mn(xjn) –Emn(xjn)>Man

≤ vdn

j=

P

n

i=

_˜

Zi(j) –EZ˜i(j)

>Mnan

(11)

≤vd_nexp

– M

_na

n

(Ch–d+_CMh–d)

+ vd_n Cτn Manhd

A(anτn)β

=I+I, (.)

where the value ofMwill be given in (.).

In view of  <θ< ,s> ,h=n–θ/d_{, and}_a

n= (_nhlnnd)/, one hash–d(d+)=n

θ(d+)_and_a–s–sd

n =

(lnn)–(s–)sd _n sd(–θ)

(s–)_{. Therefore, by}_v

n=_hd+τn_a_n+  andτn=a

–_s–

n , we obtain fornsuﬃciently

large

I= vdnexp

– M

_na

n

(Ch–d+_MCh–d)

≤Ch–d(d+)a –_s–sd

n exp

– M

_ln_n

(C+_MC)

≤C(lnn)–

sd (s–)_n

θ(d+)+sd(–_(s–)θ)– M

(C+MC) ₌_o_(), _(.)

whereMis suﬃciently large such that

M

(C+_MC)

≥θ(d+ ) +sd( –θ)

(s– ) . (.)

Meanwhile, by (.) andh=n–θ/d_{, one has for}_n_{suﬃciently large}

I= vdn

Cτn

Manhd

A(anτn)β≤

C

M

τn

hd+_a

n

d

a

β(s–)–s s–

n h–d=

C

Ma

β(s–)–s(d+) s–

n h–d(d+)

=C M(lnn)

β(s–)–s(d+) (s–) _n

β(θ–)(s–)+sθd+θs+sd+s–θd–θ

(s–) ₌_o_(), _(.)

in which is used the fact thats> ,  <θ< ,

β>max

sθd+ θs+sd+s– θd– θ

( –θ)(s– ) , s– 

s– 

,

and

β(θ– )(s– ) +sθd+ θs+sd+s– θd– θ

(s– ) < .

Thus, by (.)-(.), we establish that

|I|=Op(an). (.)

Finally, the result of (.) follows from (.)-(.), (.), (.), and (.)

immedi-ately.

Competing interests

The authors declare that they have no competing interests.

Authors’ contributions

(12)

Acknowledgements

The authors are deeply grateful to the editors and the anonymous referees for their careful reading and insightful comments, which helped in improving the earlier version of this paper. This work is supported by the National Natural Science Foundation of China (11171001, 11426032, 11501005), Natural Science Foundation of Anhui Province (1408085QA02, 1508085QA01, 1508085J06, 1608085QA02), the Provincial Natural Science Research Project of Anhui Colleges (KJ2014A010, KJ2014A020, KJ2015A065), Quality Engineering Project of Anhui Province (2015jyxm054), Applied Teaching Model Curriculum of Anhui University (XJYYKC1401, ZLTS2015053) and Doctoral Research Start-up Funds Projects of Anhui University.

Received: 28 January 2016 Accepted: 13 May 2016 References

1. Shen, J, Xie, Y: Strong consistency of the internal estimator of nonparametric regression with dependent data. Stat. Probab. Lett.83, 1915-1925 (2013)

2. Nadaraya, EA: On estimating regression. Theory Probab. Appl.9, 141-142 (1964) 3. Watson, GS: Smooth regression analysis. Sankhya, Ser. A26, 359-372 (1964)

4. Jones, MC, Davies, SJ, Park, BU: Versions of kernel-type regression estimators. J. Am. Stat. Assoc.89, 825-832 (1994) 5. Mack, YP, Müller, HG: Derivative estimation in nonparametric regression with random predictor variable. Sankhya,

Ser. A51, 59-72 (1989)

6. Linton, O, Nielsen, J: A kernel method of estimating structured nonparametric regression based on marginal integration. Biometrika82, 93-100 (1995)

7. Linton, O, Jacho-Chávez, D: On internally corrected and symmetrized kernel estimators for nonparametric regression. Test19, 166-186 (2010)

8. Masry, E: Recursive probability density estimation for weakly dependent stationary processes. IEEE Trans. Inf. Theory

32, 254-267 (1986)

9. Roussas, GG, Tran, LT, Ioannides, DA: Fixed design regression for time series: asymptotic normality. J. Multivar. Anal.40, 262-291 (1992)

10. Tran, LT, Roussas, GG, Yakowitz, S, Van, BT: Fixed-design regression for linear time series. Ann. Stat.24, 975-991 (1996) 11. Liebscher, E: Strong convergence of sums ofα-mixing random variables with applications to density estimation.

Stoch. Process. Appl.65, 69-80 (1996)

12. Hansen, BE: Uniform convergence rates for kernel estimation with dependent data. Econom. Theory24, 726-748 (2008)

13. Andrews, DWK: Nonparametric kernel estimation for semiparametric models. Econom. Theory11, 560-596 (1995) 14. Bakouch, HS, Risti´c, MM, Sandhya, E, Satheesh, S: Random products and product auto-regression. Filomat27,

1197-1203 (2013)

15. Bosq, D: Nonparametric Statistics for Stochastic Processes: Estimation and Prediction, 2nd edn. Lecture Notes in Statistics, vol. 110. Springer, Berlin (1998)

16. Bosq, D, Blanke, D: Inference and Prediction in Large Dimensions. Wiley, Chichester (2007) 17. Fan, JQ: Design-adaptive nonparametric regression. J. Am. Stat. Assoc.87, 998-1004 (1992) 18. Fan, JQ: Local linear regression smoothers and their minimax eﬃciency. Ann. Stat.21, 196-216 (1993) 19. Fan, JQ, Yao, QW: Nonlinear Time Series: Nonparametric and Parametric Methods. Springer, New York (2003) 20. Gao, JT, Lu, ZD, Tjostheim, D: Estimation in semiparametric spatial regression. Ann. Stat.34, 1395-1435 (2006) 21. Gao, JT, Tjostheim, D, Yin, JY: Estimation in threshold autoregressive models with a stationary and a unit root regime.

J. Econom.172, 1-13 (2013)

22. Györﬁ, L, Kohler, M, Krzy˙zak, A, Walk, H: A Distribution-Free Theory of Nonparametric Regression. Springer, New York (2002)

23. Ling, NX, Wu, YH: Consistency of modiﬁed regression estimation for functional data. Statistics46, 149-158 (2012) 24. Ling, NX, Xu, Q: Asymptotic normality of conditional density estimation in the single index model for functional time

series data. Stat. Probab. Lett.82, 2235-2243 (2012)

25. Ling, NX, Li, ZH, Yang, WZ: Conditional density estimation in the single functional index model forα-mixing functional data. Commun. Stat., Theory Methods43, 441-454 (2014)

26. Masry, E: Multivariate local polynomial regression for time series: uniform strong consistency and rates. J. Time Ser. Anal.17, 571-599 (1996)

27. Newey, WK: Kernel estimation of partial means and a generalized variance estimator. Econom. Theory10, 233-253 (1994)

28. Peligrad, M: Properties of uniform consistency of the kernel estimators of density and of regression functions under dependence conditions. Stoch. Stoch. Rep.40, 147-168 (1992)

29. Rosenblatt, M: Remarks on some non-parametric estimates of a density function. Ann. Math. Stat.27, 832-837 (1956) 30. Stone, CJ: Consistent nonparametric regression. Ann. Stat.5, 595-645 (1977)

31. Hall, P, Heyde, CC: Martingale Limit Theory and Its Application. Academic Press, New York (1980)