• No results found

On bandwidth parameter choices for discrete nonparametric kernel estimator

N/A
N/A
Protected

Academic year: 2021

Share "On bandwidth parameter choices for discrete nonparametric kernel estimator"

Copied!
6
0
0

Loading.... (view fulltext now)

Full text

(1)

Contents lists available atScienceDirect

C. R.

Acad.

Sci.

Paris,

Ser. I

www.sciencedirect.com

Statistics

On

bandwidth

parameter

choices

for

discrete

nonparametric

kernel

estimator

Choix

de

fenêtres

de

lissage

pour

un

estimateur

non

paramétrique

à

noyau

discret

Tristan Senga

Kiessé

INRA,UMR1069,SolAgroetHydrosystèmeSpatialisation,35000Rennes,France

a

r

t

i

c

l

e

i

n

f

o

a

b

s

t

r

a

c

t

Articlehistory: Received20May2012

Acceptedafterrevision10February2016 Availableonline13May2016

PresentedbytheEditorialBoard

This noteconcentrates on the nonparametric estimation ofa probability mass function (p.m.f.) using discrete associated kernels. An expression of the optimal bandwidth minimizing the asymptoticpart of the global squared error is given. Some asymptotic expressions ofbias and varianceof thecross-validation criterionare alsopresented.At last,thetwobandwidthselectionproceduresareillustratedthroughsomesimulationsand anapplicationonarealcountdataset.

©2016Académiedessciences.PublishedbyElsevierMassonSAS.Thisisanopenaccess articleundertheCCBY-NC-NDlicense (http://creativecommons.org/licenses/by-nc-nd/4.0/).

r

é

s

u

m

é

Cette note se focalise sur l’estimation non paramétrique à noyau associé discret d’une fonction de masse de probabilité. Une expression de la fenêtre optimale minimisant la partie asymptotique de l’erreur quadratique globale est donnée. Des expressions asymptotiquespour lebiaisetlavarianced’uncritèredesélectionparvalidation croisée sont égalementprésentées. Enfin,lesdeux méthodesde choixdefenêtre sontillustrées pardessimulationsetuneapplicationsurdesdonnéesréelles.

©2016Académiedessciences.PublishedbyElsevierMassonSAS.Thisisanopenaccess articleundertheCCBY-NC-NDlicense (http://creativecommons.org/licenses/by-nc-nd/4.0/).

1. Introduction

Let

(

Xi

)

i=1,2,··· ,n beasampleofi.i.d.discreterandomvariables(r.v.)havingap.m.f. f

(

x

)

=

Pr

(

Xi

=

x

)

>

0 onsupport

S

(e.g.,non-negativeintegersset

N

).Thediscretekernelestimator



fn of f canbeexpressedas



fn

(

x

)

=

1 n n



i=1 Kx,h

(

Xi

),

x

S,

(1)

E-mailaddress:[email protected]. http://dx.doi.org/10.1016/j.crma.2016.02.012

1631-073X/©2016Académiedessciences.PublishedbyElsevierMassonSAS.ThisisanopenaccessarticleundertheCCBY-NC-NDlicense (http://creativecommons.org/licenses/by-nc-nd/4.0/).

(2)

whereh

=

h

(

n

)

>

0 isan arbitrarysequenceofsmoothingparametersthat fulfillslimn→∞h

(

n

)

=

0 and Kx,h

(

·)

isap.m.f., calleddiscreteassociatedkernel,withr.v.

K

x,honsupport

S

x (containingtarget x andnotdependingonh)satisfying

H1

:

lim

h→0E

(K

x,h

)

=

x

,

and H2

:

h→lim0Var

(K

x,h

)

=

0

;

withoutlossofgenerality,weassume

S

x

S

inallthiswork.Thenonparametrickernelestimationhasbeenalreadylargely discussed in the literature for the probability density function (p.d.f.), in particular for choosing smoothing parameters; seeBowman[2]orMarron [6].Incontrast,thediscretekernelestimationforp.m.f.hasreceivedmorelessattentionthan that for p.d.f.In fact, until now, it commonly results from the discretization ofthe continuous case by considering the ordinalvariablesasbeingcontinuous;seeAitchisonandAitken[1],Titterington[8]orSimonoffandTutz[7].Foraspecific investigation ofcount data,Kokonendjietal.[4]andKokonendjiandSengaKiessé[3]havedevelopedthe nonparametric kernelestimationusingsomediscreteassociatedkernels.Thisopenswaytoaspecifictreatmentofthebandwidthselections ofdiscretekernelestimatorsbyadaptingclassicalmethods,whicharewellknownforthecontinuouscase.Thisnotepursues the first works realizedon thissubject, witha first attemptto provide an expression of theoptimal bandwidth andby investigatingtheasymptoticpropertiesofthecross-validationcriterionfor



fn in(1).

2. Discreteassociatedkernels

LetusfirstintroducethefollowinghypothesesonmodalprobabilityofKx,h:

H1

:

Kx,h

(

x

)

=

1

hA

(K

x,h

)

+

O

(

h2

),

with



ySx\{x}Kx,h

(

y

)

=

hA

(

K

x,h

)

+

O

(

h2

)

0 as h

0,where A

(

K

x,h

)

isboundedaway from0 forh

0.Itresultsin thefollowingproposition.

Proposition2.1.Considerafixedpointx

S

andthebandwidthh

>

0.UnderassumptionH1,theexpectationandvarianceofthe discreteassociatedkernelKx,hfulfillH1andH2.

Proof. FortheassumptionH1,theexpectationof

K

x,hcomesdirectlyfrom

E(

K

x,h

)

=

xKx,h

(

x

)

+



y∈Sx\{x} y Kx,h

(

y

)

=

x

+

B

(

x

;

h

),

withB

(

x

;

h

)

= −

xhA

(K

x,h

)

+



ySx\{x}y Kx,h

(

y

)

+

O

(

h

2

)

0 forh

0.FortheassumptionH2,thevarianceof

K

x,hcanbe successivelyexpressedas Var

(K

x,h

)

=



y∈Sx y2Kx,h

(

y

)

 

y∈Sx y Kx,h

(

y

)



2

=

x2

{

Kx,h

(

x

)

1

} +

D

(

x

;

h

),

withD

(

x

;

h

)

=



ySx\{x}y 2K x,h

(

y

)

+

x2



x

+



yS x

(

y

x

)

Kx,h

(

y

)



2

0 whenh

0 underthehypothesesH1.Indeed, whenh

0,wehaveboth Kx,h

(

y

)

0 for y

=

x and Kx,h

(

x

)

1 for y

=

x.

2

Then,thefollowingassumptioncanbeformulatedonthevarianceof

K

x,h:

H2

:

Var

(K

x,h

)

=

hV

(K

x,h

)

+

O

(

h2

),

leading tosome hypothesesH1 andH2lessgeneralthanH1andH2;A

(K

x,h

)

andV

(K

x,h

)

donotobligatorydependonx andh,asinthenextexample.

Example. (SeeKokonendjiandZocchi[5].) Leta1

,

a2 befixed integersandh1

,

h2 besmoothingparameters. Foranyfixed x

S = Z

,considerthe r.v.

K

a1,a2;x,h1,h2 ofdiscrete generalizedtriangularassociatedkernels definedonsupports

S

a1,x

=

{

x

1

,

x

2

,

. . . ,

x

a1

}

and

S

x,a2

= {

x

,

x

+

1

,

. . . ,

x

+

a2

}

andwhosep.m.f.is

Ka1,a2;x,h1,h2

(

y

)

=

1 P



1

x

y a1

+

1

h1

1Sa1,x

(

y

)

+

1

y

x a2

+

1

h2

1Sx,a2

(

y

)



,

where P

≡ (a

1

+

a2

+

1

)

− (a

1

+

1

)

h1



ak=11kh1

− (a

2

+

1

)

h2



ak=21kh2

=

P

(

a1

,

a2

,

h1

,

h2

)

isthenormalizingconstant.Forh sufficientlysmall,onehas

Kai;x,hi

(

x

)

=

1

2



i=1



hiA

(

ai

)

+

O

(

h2i

)



and Var

(K

ai;x,hi

)

=

2



i=1



hiV

(

ai

)

+

O

(

h2i

)



,

with A

(

ai

)

=

ailog

(

ai

+

1

)



ai

k=1log

(

k

)

andV

(

ai

)

=

ai

(

2a2i

+

3ai

+

1

)

log

(

ai

+

1

)/

6



ai

k=1k2log

(

k

)

.Hence,theassumptions H1andH2 arefulfilled.

(3)

3. Nonparametrickernelestimator

Thissectionpresentsanoptimalh-valueminimizinganapproximateofthemeanintegratedsquarederror(MISE)of



fn in(1),incomparisonwiththeh-valuemimimizingcross-validationfunction(KokonendjiandSengaKiessé[3]).

3.1. MinimizationofMISE

The



fn’svarianceandbiashavebeenestablishedby KokonendjiandSengaKiessé[3]underH1andH2;bytakinginto accountH1andH2,theglobalerroris

MISE

(

h

)

=



x∈S Var

{

fn

(

x

)

} +



x∈S Bias2

{

fn

(

x

)

} =

AMISE

(

h

)

+

o

1 n

+

h 2

+

O

(

h2

),

where AMISE

(

h

)

=

1 n



x∈S f

(

x

)



1

hA

(K

x,h

)



2

f

(

x

)



+

1 4h 2



x∈S



V

(K

x,h

)

f(2)

(

x

)



2

istheapproximatedMISEforh sufficientlysmall,with f(2)afinitedifferenceofsecondorder.Hence,anapproximatevalue



hopt

=

arg minh>0AMISE

(

h

)

ofthetrueoptimalbandwidthhopt

=

arg minh>0MISE

(

h

)

canbegivenby



hopt

(

n

,

f

)

=



x∈S f

(

x

)

A

(K

x,h

)



x∈S f

(

x

)

{

A

(K

x,h

)

}

2

+ (

n

/

4

)



x∈S



V

(K

x,h

)

f(2)

(

x

)



2

,

whichtendsto0 asn

→ ∞

with0

<



x∈S



V

(K

x,h

)

f(2)

(

x

)



2

<

.Itresultsinanasymptoticrelationshipsuchthat



hopt

k0n−1 withk0

=

4



xS f

(

x

)

A

(K

x,h

)/

[



xS



V

(K

x,h

)

f(2)

(

x

)



2

]

.Thus,the discretegeneralizedtriangularassociated kernel presentedasanexamplehasoptimalbandwidth



hopt

(

n

,

ai

,

f

)

=

A

(

ai

)

2

{

A

(

ai

)

}

2

+ (

n

/

2

)

{

V

(

ai

)

}

2



x∈Z



f(2)

(

x

)



2

,

i

=

1

,

2

.

(2)

3.2. Minimizationofcross-validationfunction

We are interested in a bandwidth hcv minimizing a cross-validation score function CV with respect to h such that hcv

=

arg minh>0CV

(

h

)

with

CV

(

h

)

=



x∈S



1 n n



i=1 Kx,h

(

Xi

)



2

2 n

(

n

1

)

n



i=1



j=i KXi,h



Xj



.

(3)

TheCV’smeanandvarianceforfixedh areprovidedinthefollowingtheorem.

Theorem3.1.Considerafixedpointx

S

andthebandwidthh

=

h

(

n

)

>

0 suchthat lim

n→∞h

=

0.AssumethatCVisthe cross-validationfunctionforthenonparametricestimatorin(1)ofp.m.f.f .Then,wehave

E{

CV

(

h

)

} =

AMISE

(

h

)



x∈S f2

(

x

)

+

O

1 n

+

h 2

and Var

{

C V

(

h

)

} =

1 n



x∈S





x∈S Kx2,h

(

x

)

2Kx,h

(

x

)



2 f3

(

x

)

1 n





x∈S f2

(

x

)



2

+

O

h2 n

+

1 n2

,

whereKx,hisadiscretekernelsatisfyingtheassumptionH1. Proof. Letuspresentthecross-validationscorefunctionin(3)as

C V

(

h

)

=

1 n2 n



i=1



x∈S K2x,h

(

Xi

)

+

2 n2

 

j<i Hi j

,

with Hi j

=



x∈SKx,h

(

Xi

)

Kx,h

(

Xj

)

2KXi,h



Xj



. One has

E{

C V

(

h

)

}

= (

1

/

n

)



x∈S

E{

K2 x,h

(

X1

)

}

+ (

1

1

/

n

)

E(

Hi j

)

. Forthe firsttermof

E{

C V

(

h

)

}

,weget:

(4)



x∈S

E{

K2x,h

(

X1

)

} =



x∈S Kx2,h

(

x

)

f

(

x

)

+



x∈S



y∈S\{x} Kx2,h

(

y

)

f

(

y

)

;

wherethesecondterminthepreviousequationisfinite.Nowletusexpress

E{

KX1,h

(

X2

)

} =



x∈S

E{

Kx,h

(

X2

)

}

f

(

x

),

E{

Kx,h

(

X1

)

} =



y∈S Kx,h

(

y

)

f

(

y

)

= E{

f

(K

x,h

)

},

and

E{

f

(

K

x,h

)

}

=

f

(

x

)

+(

1

/

2

)

f(2)

(

x

)

Var

(

K

x,h

)

+

o

(

h

)

obtainedbyusingadiscreteTaylorexpansionaroundx.Forthesecond termof

E{

C V

(

h

)

}

,itensues

E(

Hi j

)

=

h2 4



x∈S



V

(K

x,h

)

f(2)

(

x

)



2



x∈S f2

(

x

)

+

o

(

h2

)

+

O

(

h2

),

whereo

(

h2

)

isasymptoticallydominatedby O

(

h2

)

.Hence,itresults

E{

C V

(

h

)

}

. ForCV’svariance,webeginbycalculatingthevarianceofthefirsttermas

1 n3Var



x∈S Kx2,h

(

X1

)



=

1 n3



x∈S Kx2,h

(

x

)



2 f

(

x

)



x∈S K2x,h

(

x

)

f

(

x

)



2



+

Qn

(

x

;

h

),

with n3Qn

(

x

;

h

)

=



y∈S\{x}



x∈S Kx2,h

(

y

)



2 f

(

y

)

+



x∈S K2x,h

(

x

)

f

(

x

)



2



y∈S



x∈S Kx2,h

(

y

)

f

(

y

)



2

,

where Qn iso

(

1

/

n3

)

;infact,we canassumeVar



xS



n−2



ni=1Kx2,h

(

Xi

)



=

o

(

n−3

)

+

O

(

n−3

)

.Consideringthevariance ofthesecondtermofCV,wehave

1 n4Var



j<i Hi j

=

1 n2Var

(

Hi j

)

+

1 n

1

3 n

Cov

(

Hi j

,

Hik

)

+

O

1 n3

.

Without giving all details, we get

E(

Hi j2

)

=



xS



xSKx2,h

(

x

)

2Kx,h

(

x

)



2

f2

(

x

)

+

o

(

h2

)

. Then, by usingexpression of

E(

Hi j

)

calculatedpreviously,wehave

Var

(

Hi j

)

=



x∈S





x∈S K2x,h

(

x

)

2Kx,h

(

x

)



2 f2

(

x

)





x∈S f2

(

x

)



2

+

h2 2



x∈S



V

(K

x,h

)

f(2)

(

x

)



2



x∈S f2

(

x

)

+

o

(

h2

).

Inaddition,itcanbeshownthat

E(

Hi jHik

)

=



x∈S





x∈S K2x,h

(

x

)

2Kx,h

(

x

)



2 f3

(

x

)

+

o

(

h3

)

;

then,byconsideringCov

(

Hi j

,

Hik

)

= E(

Hi jHik

)

− E(

Hi j

)

E(

Hik

)

,wehave

Cov

(

Hi j

,

Hik

)

=



x∈S





x∈S Kx2,h

(

x

)

2Kx,h

(

x

)



2 f3

(

x

)



x∈S f2

(

x

)



2

+

h2 2



x∈S



V

(K

x,h

)

f(2)

(

x

)



2



x∈S f2

(

x

)

+

o

(

h3

)

+

O

(

h2

),

withCov

(

Hi j

,

Hkl

)

=

0.HencethedesiredresultonVar

{

C V

(

h

)

}

.

2

4. Illustrations

This section illustrates the bandwidth choices; discrete symmetric and generalized triangular kernels are applied for simulatedandrealdata,respectively.

(5)

Table 1

AveragesI S E,hoptandhcvfornonparametricestimatorusingtriangularkernels.

Sample size n hopt I S E(hopt) hopt I S E(hopt) hcv I S E(hcv)

a=2 25 0.49 0.0028 0.23 0.0116 0.73 0.0130 100 0.16 0.0011 0.12 0.0043 0.23 0.0053 a=3 25 0.22 0.0034 0.10 0.0140 0.43 0.0149 100 0.06 0.0009 0.04 0.0057 0.07 0.0062 a=4 25 0.12 0.0033 0.05 0.0196 0.23 0.0195 100 0.03 0.0006 0.02 0.0065 0.04 0.0068

Fig. 1. Estimations of the number of goals per match using a discrete generalized triangular kernel. 4.1. Simulations

Consider simulated data of sample size n from a Poisson distribution f with mean

μ

=

2. The estimator



fn using symmetrictriangularkernelsisappliedwitha1

=

a2

=

a

∈ {

2

,

3

,

4

}

,asanexample.Theperformanceof



fn isevaluatedvia theintegratedsquarederrorI S E

(

h

)

=



x∈N

{

fn

(

x

)

f

(

x

)

}

2 determinedusing



hopt,hcv andtruehoptcomputednumerically since f is known. In Table 1, the averages I S E,



hopt andhcv are calculated based on 100 replications of the simulated data.The value



hopt iscompetitive since globally we have I S E

(

hopt

)

I S E

(

hcv

)

for chosen valuesofa; note that hopt is underestimatedby



hoptandoverestimatedby hcv.Atlast,byestimating f usingempiricalfrequencyofdata, I S E isequal to0

.

0308 forn

=

25 and0

.

0088 forn

=

100.

4.2. Application

Theestimator



fnisappliedusingageneralizedtriangularkernelwith

(

a1

,

a2

)

= (

1

,

2

)

oncountdata,describinganumber ofgoalspermatchfromtheFrenchfootballchampionship(Kokonendjietal.[4]).Byacross-validationprocedure,wehave h1cv

=

0

.

170 andh2cv

=

0

.

085,whiletheexpression(2)resultsin



hopt,01

=

0

.

156 and



hopt,02

=

0

.

031,obtainedbyreplacing theunknownp.m.f. f in(2)withitsempiricalfrequencyestimate.Thequalityoftheestimateprovidedusing

(

h1cv

,

h2cv

)

is closetothatoftheonecalculatedusing

(

hopt,01

,



hopt,02

)

,sinceISEis,respectively,equalto3

.

0081

×

10−4and3

.

0082

×

10−4 (Fig. 1).

References

[1]J.Aitchison,C.G.G.Aitken,Multivariatebinarydiscriminationbythekernelmethod,Biometrika63(1976)413–420. [2]A.Bowman,Analternativemethodofcross-validationforthesmoothingofdensityestimates,Biometrika71(1984)352–360. [3]C.C.Kokonendji,T.SengaKiessé,Discreteassociatedkernelmethodandextensions,Stat.Methodol.8(2011)497–516.

(6)

[4]C.C.Kokonendji,T.SengaKiessé,S.S.Zocchi,Discretetriangulardistributionsandnon-parametricestimationforprobability massfunction,J. Non-parametr.Stat.19(2007)241–254.

[5]C.C.Kokonendji,S.S.Zocchi,Extensionsofdiscretetriangulardistributionsandboundarybiasinkernelestimationfordiscretefunctions,Stat.Probab. Lett.80(2010)1655–1662.

[6]J.S.Marron,Acomparisonofcross-validationtechniquesindensityestimation,Ann.Stat.15(1987)152–162.

[7]J.S.Simonoff,G.Tutz,Smoothingmethodsfordiscretedata,in:M.G.Schimek(Ed.),SmoothingandRegression:Approaches,Computation,and Applica-tion,Wiley,NewYork,2000,pp. 193–228.

References

Related documents

The level of un- employment is explained by financial variables (stock market capitalization, intermediated credit and banking concentration), labour market factors (labour

• Class II workers who remove intact flooring using compliant work practices: 8-hours if initial specialized training covering the topics listed in the flooring

&#34;It is advisable to do a l l i n your power to sustain such prominent daily and weekly newspapers, especially the agricultural and religious press, as will oppose the

In the educational process the Faculty of Mechanical Engineering offers two Bachelor’s degree programs: Bachelor of Science in MECHANICAL ENGINEERING (4 years

Four standard multiple linear regressions (using the Enter method) were calculated to determine how much of participants’ angry ratings (in response to the scenarios) was accounted

From there Villa Piaggio is at walking distance (one minute) keeping your left when getting off the bus. At the park entrance follow the CiViTAS CARAVEL signs until the

- How to efficiently and accurately match data service needs and availability. - How to coordinate service advertisement in order to achieve near-real time service search and

This ANFIS model thus provide us with an effective partition of the state space and linear rules that are dominant in those par- titions• A number of stability methods can be applied