
Minimax optimal procedures for testing the structure of multidimensional functions


Contents lists available at ScienceDirect

Applied and Computational Harmonic Analysis

www.elsevier.com/locate/acha


John Aston a, Florent Autin b, Gerda Claeskens c, Jean-Marc Freyermuth a,*, Christophe Pouet b

a Statistical Laboratory, University of Cambridge, Wilberforce Road, Cambridge CB3 0WB, England, United Kingdom

b Aix Marseille Univ, CNRS, Centrale Marseille, I2M, Marseille, France

c ORSTAT and Leuven Statistics Research Center, KU Leuven, Naamsestraat 69, 3000 Leuven, Belgium

Article info

Article history:
Received 1 June 2016
Received in revised form 22 March 2017
Accepted 1 May 2017
Available online xxxx
Communicated by Christopher Heil

Keywords: Adaptation; Anisotropy; Atomic dimension; Besov spaces; Gaussian noise model; Hyperbolic wavelets; Hypothesis testing; Minimax rate; Sobol decomposition; Structural modeling

Abstract

We present a novel method for detecting some structural characteristics of multidimensional functions. We consider the multidimensional Gaussian white noise model with an anisotropic estimand. Using the relation between the Sobol decomposition and the geometry of multidimensional wavelet bases, we can build test statistics for any of the Sobol functional components. We assess the asymptotic minimax optimality of these test statistics and show that they are optimal in the presence of anisotropy with respect to the newly determined minimax rates of separation. An appropriate combination of these test statistics makes it possible to test some general structural characteristics such as the atomic dimension or the presence of some variables. Numerical experiments show the potential of our method for studying spatio-temporal processes.

© 2017 The Author(s). Published by Elsevier Inc. This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).

1. Introduction

Multidimensional data often exhibit a simpler underlying structure, meaning that their effective dimension is smaller than the observed dimension. Detecting the presence of such structure can enhance the understanding of the data and allows for more effective modeling and inferential strategies. There is a flourishing literature dealing with nonparametric methods for structure detection. These contributions are concerned with different types of structures (such as additivity, small atomic dimension and variable selection) and with different modeling approaches according to the nature of the noise or the smoothness assumptions, for instance. A brief overview of the most directly related contributions is given hereafter. The main characteristic of this paper is to provide a rigorous theoretical study of structure detection in the presence of an anisotropically smooth estimand. We consider a multidimensional Gaussian white noise model and derive test statistics for the atomic dimension of multidimensional objects (i.e., the maximal degree of interaction between the variables) and for variable selection. Then, following the sieve estimation of Birgé and Massart [5], we build a data-driven procedure. It is first tested in ‘idealistic’ numerical experiments before being applied to a more sophisticated context of time series data analysis.

* Corresponding author.

E-mail address: [email protected] (J.-M. Freyermuth). http://dx.doi.org/10.1016/j.acha.2017.05.003

1063-5203/© 2017 The Author(s). Published by Elsevier Inc. This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).

The main ingredient of our methodology is to project the data onto a hyperbolic (wavelet) basis and to build test statistics based on the projection coefficients. Its motivation relies on the relation that emerges between the geometry of that basis and the ‘functional components’ of the Sobol decomposition of the estimand (see Sobol [34]). The use of an appropriate basis allows us to benefit from a sparse representation and an optimal adaptation to the anisotropic smoothness of the estimand. In this paper we construct optimal testing procedures for the functional components according to the minimax hypothesis testing framework of Ingster [11–13]. Appropriately combined, these functional components can be rephrased in terms of more general structures such as the atomic dimension.

This project is motivated by the recent results of Autin et al. [3,4], who studied the ability of a tensor-product wavelet basis, the so-called hyperbolic wavelet basis, to estimate multidimensional functions having anisotropic smoothness. Tensor-product bases (Fourier or wavelets) have already been widely used in signal detection, notably to test for additivity (i.e. atomic dimension equal to 1) as in De Canditiis and Sapatinas [8]. Nevertheless, the latter do not provide deep theoretical results on the performance of their method. Abramovich et al. [1] describe another procedure for testing additivity. They propose an adaptive procedure based on thresholding the wavelet coefficients and derive interesting theoretical results. Nevertheless, they exploit a standard (isotropic) wavelet basis that cannot optimally deal with the more realistic situation of anisotropic estimands. Moreover, their method is limited to testing for additivity and does not exploit the full directional representation of a multidimensional wavelet basis. The functional framework (where the dimension $d \to \infty$) has also been investigated by Gayraud and Ingster [10], who studied the problem of signal detection in the case of sparse additive functions. Other developments include, for example, multichannel signal detection as in Ingster and Suslina [18], and detection in inverse models as in Ingster et al. [14].

Comminges and Dalalyan [6] made profound theoretical contributions on hypothesis testing procedures based on quadratic functionals. They study various testing problems involving multidimensional anisotropic functions. They exploit a tensor-product Fourier basis to build nonadaptive procedures, i.e. procedures that are built on the knowledge of the regularity parameters of the anisotropic function spaces in the alternative. In this paper, we focus on the consequences of dealing with anisotropic estimands. Therefore, we define two classes of testing procedures, referred to as minimax optimal and adaptive minimax optimal methods for the Besov balls, see Section 4. The former methodology is built using the full knowledge of the regularity parameters while the latter uses only a part of it.

These methods differ by the nature of the truncation applied to the hyperbolic wavelet coefficient sequence. An intuitive way to think about this is to consider the better known multidimensional function estimation context. The truncation rule characterizing the minimax optimal method can be viewed as a linear but directionally dependent smoothing, while the truncation rule of the adaptive minimax optimal method finds its roots in the approximation of the hyperbolic cross that is naturally associated with tensor-product spaces (see for instance Schmeisser [31], Schmeisser and Sickel [32] and Sickel and Ullrich [33]). Tensor-product spaces are also often considered in various statistical contexts, such as in estimation with, for example, the tensor-product space ANOVA model as in Lin [24], or in signal detection as in Ingster and Stepanova [15]. They are of great interest since one can hope to reach performances close to the unidimensional case. This is sometimes promoted as a way to ‘tackle’ the curse of dimensionality; nevertheless, it is a restrictive assumption. In the context of functional ANOVA, for example, assuming a tensor-product model means that the interaction term’s complexity has to be inversely proportional to its degree. In this paper we do not restrict ourselves to the structure detection of functions belonging to tensor products of Besov spaces but consider larger classes such as the anisotropic Nikolskii class. We show that our methods attain the optimal separation rates between the null and the alternative hypotheses and we discuss how to combine sets of testing procedures in a way that is adapted to the anisotropic case.

Such testing procedures find applications in many fields. We illustrate their usage in time series data analysis. Our aim is to test the structure of the spatio-spectrum associated with a spatio-temporal process that is time stationary. This also allows us to illustrate how to adapt such a testing procedure to multidimensional data where the number of observations is very large, making p-values useless.

This paper is structured as follows. First, in Section 2, we give the fundamental knowledge on hyperbolic wavelet bases and their relation to the Sobol decomposition of a multidimensional function. In Section 3 we introduce the multidimensional Gaussian white noise model and the minimax hypothesis testing framework. Then we describe our test statistics for Sobol functional components in Section 4 before introducing tests for global structural characteristics in Section 5. Section 6 describes our data-driven adaptation and validation in a practical setting of the adaptive minimax optimal procedure and a final discussion is provided in Section 7. The proofs of the theoretical results are postponed to the appendix.

2. Hyperbolic wavelet bases

We start from a one-dimensional periodized wavelet basis $B_1$ of $L^2([0,1))$, which is built from the dilations and translations of a scaling function $\phi$ and a wavelet $\psi$ with $V$ (for some $V \ge 1$) vanishing moments,

\[ B_1 = \left\{ \phi(\cdot),\ \psi_{j,k}(\cdot) = 2^{j/2}\psi(2^{j}\cdot - k) : j \in \mathbb{N},\ k \in \{0,\dots,2^{j}-1\} \right\}. \]

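A $d$-dimensional hyperbolic wavelet basis results from forming $d$-variate functions by taking appropriate products of the univariate functions $\phi$ and $\psi$ as follows,

\[ B_d = \left\{ \phi_{0,0}(\cdot),\ \psi^{i}_{j,k}(\cdot) : i \in \{0,1\}^d \setminus \{0\},\ j \in J_i,\ k \in K_j \right\} \]

where $0 = (0,\dots,0)$, $\phi_{0,0}(\cdot) = \phi(\cdot) \times \dots \times \phi(\cdot)$, $\psi^{i}_{j,k}(\cdot) = \psi^{i_1}_{j_1,k_1}(\cdot) \times \dots \times \psi^{i_d}_{j_d,k_d}(\cdot)$, $i = (i_1,\dots,i_d)$, with the following notations:

\[ \psi^{i_u}_{j_u,k_u}(\cdot) = \begin{cases} 2^{j_u/2}\,\phi(2^{j_u}\cdot - k_u) & \text{if } i_u = 0, \\ 2^{j_u/2}\,\psi(2^{j_u}\cdot - k_u) & \text{if } i_u = 1, \end{cases} \]

and

\[ J_i = \left\{ j = (j_1,\dots,j_d) : \forall u \in \{1,\dots,d\},\ j_u = j_u i_u,\ j_u \in \mathbb{N} \right\}, \qquad K_j = \left\{ k = (k_1,\dots,k_d) : \forall u \in \{1,\dots,d\},\ k_u \in \{0,\dots,2^{j_u}-1\} \right\}. \]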

The basis $B_d$ is generated by a scaling function $\phi_{0,0}$ and translated and dilated wavelet functions $\psi^{i}_{j,k}$. If all the components of the vector of directional dilations $j$ are equal, this results in the standard (isotropic) wavelet basis. Hereafter we consider that the components of $j$ can take different values. The wavelet functions are then supported on hyper-rectangles and the resulting wavelet basis, the so-called hyperbolic or tensor-product wavelet basis, is proven to be able to optimally deal with anisotropic functions [3,4]. We say that each of the $2^d - 1$ elements of $i$ defines an orientation. In such a basis, any $f \in L^2([0,1)^d)$ can be written as

\[ f = \langle f, \phi_{0,0}\rangle_2\, \phi_{0,0} + \sum_{i \neq 0} \sum_{j \in J_i} \sum_{k \in K_j} \langle f, \psi^{i}_{j,k}\rangle_2\, \psi^{i}_{j,k} \equiv f_0 + \sum_{i \neq 0} f_i \tag{1} \]

where $\langle \cdot,\cdot \rangle_2$ is the $L^2$-scalar product. This representation facilitates a characterization of a specific structure, such as additivity. Indeed, for an additive function $f_{add}$, that is a function with Sobol decomposition $f_{add}(x_1,\dots,x_d) = f_0 + \sum_{u=1}^{d} f_u(x_u)$, the coefficients $\theta^{i}_{j,k} = \langle f, \psi^{i}_{j,k}\rangle_2$ in each orientation $i$ with $|i| = i_1 + i_2 + \dots + i_d > 1$ are exactly equal to 0. Information about the component functions $f_i$ in equation (1) can be retrieved via the wavelet coefficient sequence through the geometry of the hyperbolic wavelet basis. This enables the design of specific testing procedures for structural information for multivariate functions, more general than merely testing for additivity.
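The vanishing of the interaction coefficients of an additive function can be checked numerically. The following is a minimal sketch in two dimensions, assuming the Haar basis and a discrete transform of grid samples as a stand-in for the continuous coefficients; the function names (`haar_1d`, `hyperbolic_haar_2d`, `energy_by_orientation`, `sample`) are illustrative and not from the paper.

```python
import math


def haar_1d(values):
    """Orthonormal discrete Haar transform of a list of length 2^J.

    Output layout: [scaling coefficient, wavelet coefficients coarse to fine].
    """
    coeffs = list(values)
    n = len(coeffs)
    while n > 1:
        half = n // 2
        averages = [(coeffs[2 * t] + coeffs[2 * t + 1]) / math.sqrt(2) for t in range(half)]
        details = [(coeffs[2 * t] - coeffs[2 * t + 1]) / math.sqrt(2) for t in range(half)]
        coeffs[:n] = averages + details
        n = half
    return coeffs


def hyperbolic_haar_2d(grid):
    """Tensor-product (hyperbolic) transform: a full 1D transform along each axis."""
    by_rows = [haar_1d(row) for row in grid]                  # transform along x2
    by_cols = [haar_1d(list(col)) for col in zip(*by_rows)]   # then along x1
    return [list(row) for row in zip(*by_cols)]               # back to [x1][x2] layout


def energy_by_orientation(coeffs):
    """Sum of squared coefficients for each orientation i = (i1, i2)."""
    energy = {(0, 0): 0.0, (1, 0): 0.0, (0, 1): 0.0, (1, 1): 0.0}
    for a, row in enumerate(coeffs):
        for b, value in enumerate(row):
            # index 0 is the scaling function, any other index is a wavelet
            energy[(int(a > 0), int(b > 0))] += value * value
    return energy


def sample(f, n):
    """Values of f at the midpoints of an n x n grid on [0,1)^2."""
    return [[f((a + 0.5) / n, (b + 0.5) / n) for b in range(n)] for a in range(n)]


additive = energy_by_orientation(hyperbolic_haar_2d(sample(
    lambda x1, x2: math.sin(2 * math.pi * x1) + x2 ** 2, 8)))
interacting = energy_by_orientation(hyperbolic_haar_2d(sample(
    lambda x1, x2: math.sin(2 * math.pi * x1) + x2 ** 2 + x1 * x2, 8)))
```

For the additive function the energy in orientation (1, 1) is zero up to rounding, while adding the product term `x1 * x2` moves energy into that orientation.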

Dalalyan et al. [7] introduced the atomic dimension in the physical domain, which we denote as $\delta(f)$. It reflects the maximal degree of interaction between the $d$ variables in the Sobol decomposition. For instance, the additive model $f_{add}$ has $\delta(f_{add}) = 1$, while a function $f(x_1,\dots,x_d) = \sum_{u=1}^{d} f_u(x_u) + \sum_{u=1}^{d} \sum_{v=1, v \neq u}^{d} x_u x_v$ has $\delta(f) = 2$. In the sequel, we use the definition from Autin et al. [3] of the atomic dimension in the wavelet coefficient domain, relating the orientations of the nonzero wavelet coefficients with the corresponding Sobol functional components.

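Definition 2.1. Let $f \in L^2([0,1)^d)$ be decomposed in the hyperbolic wavelet basis as

\[ f = f_0 + \sum_{i \neq 0} f_i = \theta^{0}_{0,0}\,\phi_{0,0} + \sum_{i \neq 0} \left( \sum_{j \in J_i} \sum_{k \in K_j} \theta^{i}_{j,k}\, \psi^{i}_{j,k} \right). \]

Define $A_f = \{ i \in \{0,1\}^d \setminus \{0\} ;\ \theta^{i}_{j,k} \neq 0 \text{ for some } (j,k) \}$. The atomic dimension of $f$ with respect to $B_d$ is the integer $\delta_{B_d} = \delta_{B_d}(f) \in \{0,\dots,d\}$ such that:

\[ \delta_{B_d} = \begin{cases} \max\{ |i| ;\ i \in A_f \} & \text{if } A_f \neq \emptyset, \\ 0 & \text{if } A_f = \emptyset. \end{cases} \]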
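Definition 2.1 translates directly into a small computation on a table of coefficients. The sketch below assumes the coefficients are stored in a dictionary keyed by orientation; this layout, the optional tolerance `tol` and the function name are illustrative choices, not part of the paper.

```python
def atomic_dimension(coeffs, tol=0.0):
    """Atomic dimension in the coefficient domain.

    coeffs maps an orientation i (a tuple of 0/1) to the list of wavelet
    coefficients theta^i_{j,k} in that orientation. An orientation is active
    when at least one of its coefficients exceeds tol in absolute value.
    """
    active = {
        i for i, values in coeffs.items()
        if any(i) and any(abs(v) > tol for v in values)  # skip the zero orientation
    }
    # max of |i| over the active set, 0 when the set is empty
    return max((sum(i) for i in active), default=0)


example = {
    (1, 0, 0): [0.5, 0.0],
    (0, 1, 1): [0.0, 0.2],
    (1, 1, 1): [0.0, 0.0],
}
print(atomic_dimension(example))
```

Here orientation (1, 1, 1) carries no nonzero coefficient, so the largest active orientation is (0, 1, 1) and the atomic dimension is 2.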

Definition 2.1 gives us more flexibility than the definition in the physical domain. Indeed, the atomic dimension $\delta_{B_d}$ depends on the number of vanishing moments (even directional ones) of the considered basis $B_d$. More precisely, the atomic dimension $\delta(f)$ of a $d$-dimensional function $f$ in the physical domain and the atomic dimension $\delta_{B_d}(f)$ of the same function $f$ in the coefficient domain are tied by the following inequality: $\delta_{B_d}(f) \le \delta(f)$. We mention here two situations to emphasize the practical importance of being able to characterize the atomic dimension in the coefficient domain. The first example concerns multidimensional function estimation. Autin et al. [3] introduced a novel estimation procedure based on the thresholding of the hyperbolic wavelet coefficients. They show that it outperforms standard hard thresholding whenever some structural conditions on the estimand are satisfied. These conditions also include a constraint on the value of the atomic dimension $\delta_{B_d}$. Here we provide a test statistic to test the structure of the estimand before applying the aforementioned estimation method. A second example concerns the exploitation of sparsity in statistical modeling, which can be useful for building a statistical model using the hyperbolic wavelet coefficients or in generalized linear modeling. For example, Zhou et al. [38] describe such a generalized linear model with multidimensional covariates; their first step consists in dimension reduction by tensor decomposition. One could instead exploit the sparsity of the hyperbolic wavelet sequence. In such a case, one may seek the hyperbolic wavelet basis that provides the sparsest representation.

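3. Model and minimax approach for hypothesis testing

We consider the multidimensional Gaussian white noise model

\[ dY_\varepsilon(x) = f(x)\,dx + \varepsilon\, dW(x) \tag{2} \]

where $x = (x_1,\dots,x_d) \in [0,1)^d$, $f \in L^2([0,1)^d)$, $W(x)$ is the Brownian sheet and $\varepsilon$ is the noise level, considered here to be close to zero. For technical reasons (see the proofs of Proposition A.3 and Proposition A.4), without loss of generality, we assume that it is smaller than $\varepsilon_o = e^{-e}$, where $e$ denotes Euler's number.

We consider the sequential version of the Gaussian white noise model, which consists of the observations of the following random variables

\[ \hat\theta^{0}_{0,0} = \theta^{0}_{0,0} + \varepsilon\xi = \langle f, \phi_{0,0}\rangle_{L^2} + \varepsilon\xi \quad \text{and} \quad \hat\theta^{i}_{j,k} = \theta^{i}_{j,k} + \varepsilon\xi^{i}_{j,k} = \langle f, \psi^{i}_{j,k}\rangle_{L^2} + \varepsilon\xi^{i}_{j,k} \]

where $\xi$, $\xi^{i}_{j,k}$ are i.i.d. $N(0,1)$ and $(j,k) \in \mathbb{N}^d \times K_j$.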

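For any chosen orientation $i \neq 0$, we may test the following null hypothesis $H_{i,0}$ versus one of the two stated alternative hypotheses $H_{i,a}$ and $H'_{i,a}$:

\[ H_{i,0}: f \in \mathcal{N}_i(R) = \left\{ f : \|f\|_2 \le R,\ f_i(x) = \sum_{j \in J_i} \sum_{k \in K_j} \theta^{i}_{j,k}\psi^{i}_{j,k}(x) = 0,\ \forall x \in [0,1)^d \right\}, \tag{3} \]

\[ H_{i,a}: f \in \mathcal{A}_i(R, C, s, r_{\varepsilon,i}) = \left\{ f : f \in \mathcal{F}^s(R),\ \|f_i\|_2 = \left( \sum_{j \in J_i} \sum_{k \in K_j} (\theta^{i}_{j,k})^2 \right)^{1/2} \ge C r_{\varepsilon,i} \right\}, \tag{4} \]

\[ H'_{i,a}: f \in \bigcup_{s:\ \sum_u i_u s_u^{-1} = \gamma_i^{-1}} \mathcal{A}_i(R, C, s, r'_{\varepsilon,i}). \tag{5} \]

In the above expressions, $C$ is a positive constant which does not depend on $\varepsilon$, and $r_{\varepsilon,i}$, $r'_{\varepsilon,i}$ are continuous sequences of positive real numbers tending to 0 as $\varepsilon$ goes to 0. $\mathcal{F}^s(R)$ denotes a ball of radius $R$ in a function space characterized by a vector of directional regularities $s$ whose harmonic sum in the orientation $i$ is denoted as $\gamma_i^{-1}$.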

Given a simultaneous control on the probabilities of type I and II errors, we consider the asymptotic minimax setup: we aim at providing, for any orientation $i$, the order of the optimal separation rates in the sense of Section 2.6 in Ingster and Suslina [17] between the null hypothesis $H_{i,0}$ and each one of the alternative hypotheses $H_{i,a}$ and $H'_{i,a}$ associated with a particular choice of $\mathcal{F}^s$ when:

• $s$ is known;

• only the harmonic sum $\gamma_i^{-1} = \sum_u i_u s_u^{-1}$ of $s$ with respect to orientation $i$ is known.

Related to this, we search for both a sequence of $\mathcal{F}^s$-minimax optimal testing procedures and a sequence of $\mathcal{F}^s$-adaptive minimax optimal testing procedures.

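Definition 3.1. Let $\alpha \in (0, 1/2)$, $s \in (0,+\infty)^d$ and $i \in \{0,1\}^d \setminus \{0\}$. We say that $r_{\varepsilon,i}$ corresponds to the $\mathcal{F}^s$-minimax rate separating the hypotheses $H_{i,0}$ and $H_{i,a}$, up to a constant, if the two following statements are satisfied:

1. (Upper bound) for any $C > C_{i,\alpha}$, with $C_{i,\alpha}$ an absolute positive constant, there exists a sequence of testing procedures $(\Delta^*_{\varepsilon,\alpha})_\varepsilon$ such that

\[ \sup_{0<\varepsilon<\varepsilon_o} \left[ \sup_{f \in \mathcal{N}_i(R)} P_f(\Delta^*_{\varepsilon,\alpha} = 1) + \sup_{f \in \mathcal{A}_i(R,C,s,r_{\varepsilon,i})} P_f(\Delta^*_{\varepsilon,\alpha} = 0) \right] \le \alpha, \]

2. (Lower bound) for any $C < c_{i,\alpha}$, with $c_{i,\alpha}$ an absolute positive constant, the following inequality holds

\[ \inf_{0<\varepsilon<\varepsilon_o} \inf_{\Delta} \left[ \sup_{f \in \mathcal{N}_i(R)} P_f(\Delta = 1) + \sup_{f \in \mathcal{A}_i(R,C,s,r_{\varepsilon,i})} P_f(\Delta = 0) \right] > \alpha. \]

The sequence of testing procedures $(\Delta^*_{\varepsilon,\alpha})_\varepsilon$ is said to be $\mathcal{F}^s$-minimax optimal.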

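Definition 3.2. Let $\alpha \in (0, 1/2)$. Consider $i \in \{0,1\}^d$ such that $|i| > 1$ and let $\gamma_i > 0$. We say that $r'_{\varepsilon,i}$ corresponds to the $\mathcal{F}^s$-adaptive minimax rate separating the hypotheses $H_{i,0}$ and $H'_{i,a}$, up to a constant, if the two following statements are satisfied:

1. (Upper bound) for any $C > C'_{i,\alpha}$, with $C'_{i,\alpha}$ an absolute positive constant, there exists a sequence of testing procedures $(\Delta'_{\varepsilon,\alpha})_\varepsilon$ such that

\[ \sup_{0<\varepsilon<\varepsilon_o} \left( \sup_{f \in \mathcal{N}_i(R)} P_f(\Delta'_{\varepsilon,\alpha} = 1) + \sup_{s:\ \sum_u i_u s_u^{-1} = \gamma_i^{-1}}\ \sup_{f \in \mathcal{A}_i(R,C,s,r'_{\varepsilon,i})} P_f(\Delta'_{\varepsilon,\alpha} = 0) \right) \le \alpha, \]

2. (Lower bound) for any $C < c'_{i,\alpha}$, with $c'_{i,\alpha}$ an absolute positive constant, the following inequality holds

\[ \inf_{0<\varepsilon<\varepsilon_o} \inf_{\Delta} \left( \sup_{f \in \mathcal{N}_i(R)} P_f(\Delta = 1) + \sup_{s:\ \sum_u i_u s_u^{-1} = \gamma_i^{-1}}\ \sup_{f \in \mathcal{A}_i(R,C,s,r'_{\varepsilon,i})} P_f(\Delta = 0) \right) > \alpha. \]

The sequence of testing procedures $(\Delta'_{\varepsilon,\alpha})_\varepsilon$ is said to be $\mathcal{F}^s$-adaptive minimax optimal.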

In the latter situation, we only have information about the harmonic sum of the directional smoothness parameters of the function class in the alternative. To determine the minimax and adaptive minimax separation rates, we naturally consider the combination of directional smoothness with a given harmonic sum that gives the worst type II error rate. The fully adaptive case corresponds to an unknown value $\gamma_i$; it is a direct extension of the case studied here, which could be called partially adaptive, that is to say when $\gamma_i$ is known. In this work, for the sake of simplicity, we use ‘adaptive’ instead of ‘partially adaptive’.

Remark 3.1. The definitions of minimax and adaptive minimax rates are equivalent to the ones given in Section 1 of Lepski and Tsybakov [22] and slightly more precise than the ones given in Section 2.6 of Ingster and Suslina [17]. We will show that both the minimax separation rate and the adaptive minimax separation rate of testing depend on the smoothness of the underlying functional space and on the noise level $\varepsilon$. At the end of the appendix, we highlight the relationships between the absolute constants $C_{i,\alpha}$, $c_{i,\alpha}$, $C'_{i,\alpha}$, $c'_{i,\alpha}$ and the parameters appearing in the problem such as $R$ and the smoothness.

4. Nonparametric tests for a given orientation

In this section we present test statistics and testing procedures that are used in the sequel. These test statistics provide sequences of testing procedures that are $\mathcal{F}^s$-minimax and $\mathcal{F}^s$-adaptive minimax optimal in the particular case where $\mathcal{F}^s$ corresponds to a Besov space with $s$ as smoothness parameter.

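Definition 4.1 (Test statistic). Let $i \neq 0$ and $j = (j_1,\dots,j_d) \in J_i$. The nonnegative random variable

\[ T_{i,j} = \sum_{j' \in J_i:\ j'_u < \max(j_u,1),\ \forall u}\ \sum_{k \in K_{j'}} (\hat\theta^{i}_{j',k})^2 \]

is called the $i$-oriented test statistic with level parameter $j$.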

Definition 4.2 (Testing procedure). Let $i \neq 0$, $j = (j_1,\dots,j_d) \in J_i$ and $t > 0$. The random variable

\[ \Delta_{i,j}(t) = \mathbf{1}\{T_{i,j} > t\} \]

is called the $(i,j,t)$-testing procedure.

4.1. $B^s_{2,\infty}$-minimax optimal sequence of testing procedures

We define appropriate test statistics based on the estimated empirical energy, that is, the sum of the squared values of the empirical hyperbolic wavelet coefficients within the specified orientation over certain scales. We show that this sequence of tests constructed using hyperbolic wavelet bases has optimal minimax properties when $f$ belongs to a Besov ball with smoothness parameter $s$.

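Definition 4.3 (Besov ball). Let $R > 0$ and $s = (s_1,\dots,s_d) \in (0,+\infty)^d$. We say that $f \in L^2([0,1)^d)$ belongs to the Besov ball $B^s_{2,\infty}(R)$ if and only if

\[ \sup_{i \neq 0}\ \sup_{j \in J_i}\ \max_{1 \le u \le d} \{2^{2 j_u s_u}\} \sum_{k \in K_j} |\theta^{i}_{j,k}|^2 \le R^2. \]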

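In this section we assume that we have full knowledge of the smoothness parameter $s$ of the functions in the alternative. In other words, this corresponds to the non-adaptive setup. We define a vector $j^* = (j_1^*,\dots,j_d^*) \in J_i$ of truncation scales such that

\[ 2^{-j_u^*} \le \varepsilon^{\frac{4 i_u \gamma_i}{(1+4\gamma_i) s_u}} < 2^{1-j_u^*} \tag{6} \]

with $\gamma_i = \left( \sum_{u=1}^{d} i_u s_u^{-1} \right)^{-1}$.

Theorem 4.1. Let $\alpha \in (0, 1/2)$, $R > 0$, $s \in (0,+\infty)^d$ and $i \in \{0,1\}^d \setminus \{0\}$. Consider, as in (3) and (4), testing the hypotheses $H_{i,0}$ versus $H_{i,a}$ where $\mathcal{F}^s(R) = B^s_{2,\infty}(R)$. Then the sequence of the $(i,j^*,t_{\varepsilon,i,\alpha})$-testing procedures $(\Delta_{i,j^*}(t_{\varepsilon,i,\alpha}))_\varepsilon$ is $B^s_{2,\infty}$-minimax optimal with $j^*$ as in (6), and with $\varepsilon^{-2} t_{\varepsilon,i,\alpha} = \chi_{1-\alpha/2}(2^{|j^*|})$, the $1-\alpha/2$ quantile of the chi-square distribution with $2^{|j^*|} = 2^{j_1^* + \dots + j_d^*}$ degrees of freedom. Moreover, the $B^s_{2,\infty}$-minimax rate separating $H_{i,0}$ and $H_{i,a}$ is of order

\[ r_{\varepsilon,i} = \varepsilon^{\frac{4\gamma_i}{1+4\gamma_i}}. \]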
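The truncation scales in (6) and the separation rate of Theorem 4.1 are explicit functions of $\varepsilon$, $s$ and the orientation. The sketch below computes them for given inputs; the function name is illustrative, and the chi-square quantile defining the threshold is deliberately left out because it would require a statistics library.

```python
import math


def truncation_scales(eps, s, orientation):
    """Scales j*_u solving 2^{-j} <= eps^{4 i_u gamma / ((1 + 4 gamma) s_u)} < 2^{1-j}.

    Returns (scales, gamma, rate), where gamma is the inverse of the harmonic
    sum of the smoothness over the active directions and
    rate = eps^{4 gamma / (1 + 4 gamma)}.
    The orientation must contain at least one 1.
    """
    gamma = 1.0 / sum(i / su for i, su in zip(orientation, s))
    scales = []
    for i, su in zip(orientation, s):
        exponent = 4.0 * i * gamma / ((1.0 + 4.0 * gamma) * su)
        # smallest nonnegative integer j with 2^{-j} <= eps^exponent
        scales.append(max(0, math.ceil(-exponent * math.log2(eps))))
    rate = eps ** (4.0 * gamma / (1.0 + 4.0 * gamma))
    return scales, gamma, rate


scales, gamma, rate = truncation_scales(0.01, (1.0, 2.0), (1, 1))
print(scales, gamma, rate)
```

In this example the smoother direction ($s_2 = 2$) is truncated at a coarser scale than the rougher one, which is the directionally dependent smoothing described in the introduction. Inactive directions ($i_u = 0$) always receive scale 0.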

The proof of Theorem 4.1 is stated in the appendix. The absolute constants $C_{i,\alpha}$ and $c_{i,\alpha}$ from Definition 3.1 will be given there.

This theorem can be compared to the results given in the book of Ingster and Suslina [17] for Besov bodies in the one-dimensional case. In this book, Theorem 3.9 states distinguishability conditions and sharp asymptotics are given in Theorem 4.4. The result stated above in Theorem 4.1 is similar but holds in any dimension; therefore the smoothness parameter is replaced by the average smoothness value obtained from the harmonic sum.

The reader may note that the worst rate is associated with $\gamma_{\mathbf{1}}$ with $\mathbf{1} = (1,\dots,1)$, i.e. with the interaction term of highest degree. This case is similar to the one considered in Ingster and Stepanova [16] (Remark 3.3). We obtain the same minimax rate of testing hypotheses involving the harmonic sum of smoothness as the one they found for a signal function $f$ in a Sobolev space. Nevertheless, we point out that their null hypothesis $H_0: f = 0$ is different from the one considered here, briefly reformulated as $H_{\mathbf{1},0}: f_{\mathbf{1}} = 0$, which consists of functions with an atomic dimension strictly smaller than the dimension $d$. The separation rate can be different for every orientation. When it comes to testing for the structure of a $d$-dimensional function, a sequential testing procedure based on these different test statistics seems to be ideal because one can benefit from better separation rates.

Remark 4.1. In the context of multidimensional function estimation, the maxiset approach introduced by Kerkyacharian and Picard [20] has proved useful to study the performance of wavelet-based estimators [3]. It consists in determining the largest function space such that the risk of an estimator is controlled at a chosen rate. While the choice of the rate is often an issue, it is common to use the minimax or near-minimax rates over some ‘standard’ functional spaces $\mathcal{F}$. Then the maxiset approach reveals a set of functions that contains $\mathcal{F}$, strictly or not. This is an optimistic approach in the sense that, finally, we can estimate at the minimax or near-minimax rate more functions than previously thought. In minimax hypothesis testing problems, we can take inspiration from the maxiset approach to remark that, under the alternative, we naturally model the ‘estimand’ as a function in a $d$-variate smoothness space, here $B^s_{2,\infty}(R)$. Nevertheless, since we are only concerned with the presence of information along given directions, the behavior along the directions that are not of interest can be left ‘uncontrolled’. Hence, the sequence space in the alternative can be enlarged to the set of all Besov spaces $B^{s'}_{2,\infty}(R)$ with $s'_u \le s_u$ for $i_u = 0$ and $s'_u = s_u$ otherwise.

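4.2. $B^s_{2,\infty}$-adaptive minimax procedure

We now suppose that the regularity parameter $s$ appearing in the alternative hypothesis is unknown but its harmonic sum $\gamma_i^{-1} = \sum_{u=1}^{d} i_u s_u^{-1}$ is known for a chosen $i$ such that $|i| > 1$. In some sense, we now consider the adaptive setup.

To define the test statistic using the knowledge about $\gamma_i$, we denote by $J_i$ the integer such that

\[ 2^{-J_i} \le \left( \varepsilon^4 \log\log \varepsilon^{-1} \right)^{\frac{1}{1+4\gamma_i}} < 2^{1-J_i}. \tag{7} \]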

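Theorem 4.2. Let $\alpha \in (0, 1/2)$, $R > 0$, $i \in \{0,1\}^d$ satisfying $|i| > 1$ and let $\gamma_i > 0$. Consider, as in (3) and (5), testing the hypotheses $H_{i,0}$ versus $H'_{i,a}$ where $\mathcal{F}^s(R) = B^s_{2,\infty}(R)$, where $\gamma_i^{-1} = \sum_{u=1}^{d} i_u s_u^{-1}$. Then the sequence of testing procedures $(\Delta_{i,\max}(t_{\varepsilon,i,\gamma_i,\alpha}))_\varepsilon$ built from the maximum of the $(i,j,t_{\varepsilon,i,\gamma_i,\alpha})$-testing procedures $\Delta_{i,j}(t_{\varepsilon,i,\gamma_i,\alpha})$ for which $j$ satisfies $|j| = J_i$ is $B^s_{2,\infty}$-adaptive minimax optimal provided that, for any $\varepsilon \in (0,\varepsilon_o)$, $\varepsilon^{-2} t_{\varepsilon,i,\gamma_i,\alpha} = \chi_{1-\alpha/(2K_{\gamma_i})}(2^{J_i})$, with $K_{\gamma_i} = \#\{ j \in J_i : |j| = J_i \}$. Moreover, the $B^s_{2,\infty}$-adaptive minimax rate separating $H_{i,0}$ and $H'_{i,a}$ is of order

\[ r'_{\varepsilon,i} = \left( \varepsilon^4 \log\log \varepsilon^{-1} \right)^{\frac{\gamma_i}{1+4\gamma_i}}. \]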
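The quantities entering Theorem 4.2 can be computed in a few lines. The sketch below evaluates the level $J_i$ from (7), the number $K_{\gamma_i}$ of scale vectors with $|j| = J_i$, and the adaptive rate. It assumes that $\mathbb{N}$ includes 0, so that counting the scale vectors is a stars-and-bars problem over the $|i|$ active directions; the function names are illustrative.

```python
import math


def adaptive_level(eps, gamma):
    """Integer J with 2^{-J} <= (eps^4 log log(1/eps))^{1/(1+4 gamma)} < 2^{1-J}.

    Requires eps < exp(-e) so that log log(1/eps) > 1.
    """
    x = (eps ** 4 * math.log(math.log(1.0 / eps))) ** (1.0 / (1.0 + 4.0 * gamma))
    return math.ceil(-math.log2(x))


def count_scale_vectors(level, orientation):
    """Number of j in J_i with |j| = level: nonnegative integer solutions of
    j_1 + ... + j_m = level over the m = |i| active directions."""
    m = sum(orientation)
    return math.comb(level + m - 1, m - 1)


def adaptive_rate(eps, gamma):
    """Adaptive separation rate (eps^4 log log(1/eps))^{gamma/(1+4 gamma)}."""
    return (eps ** 4 * math.log(math.log(1.0 / eps))) ** (gamma / (1.0 + 4.0 * gamma))


J = adaptive_level(0.01, 2.0 / 3.0)
K = count_scale_vectors(J, (1, 1))
print(J, K, adaptive_rate(0.01, 2.0 / 3.0))
```

The count $K_{\gamma_i}$ is what enters the quantile level $1 - \alpha/(2K_{\gamma_i})$: the more scale vectors are scanned, the more conservative each individual threshold becomes.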

The proof of Theorem 4.2 is stated in the appendix. The absolute constants $C'_{i,\alpha}$ and $c'_{i,\alpha}$ from Definition 3.2 will be given there.

The results given for adaptation in Theorem 4.2 can be compared to usual results obtained for adaptation in hypothesis testing problems. The unavoidable loss due to adaptation is the usual one that can also be found, for example, in Theorem 7.2 in Ingster and Suslina [17]. The results stated above in Theorem 4.2 can be compared to the loss obtained by Spokoiny [35] in the Gaussian white noise model in one dimension for the signal detection problem.

Remark 4.2. Note that, compared to the $B^s_{2,\infty}$-minimax case, the optimal separation rate is slower because $\log\log \varepsilon^{-1} \to \infty$ as $\varepsilon \to 0$. This reflects the loss of information about the function space in the alternative. Nevertheless, in the alternative hypothesis, the union of sequence spaces $B^s_{2,\infty}(R)$ can be replaced by another, larger sequence space, denoted $A_{\gamma_i,2}(R)$ and defined hereafter.

Definition 4.4 (Truncation ball). Let $R > 0$, $i \in \{0,1\}^d$ satisfying $|i| > 1$ and $\gamma_i > 0$. We say that $f \in L^2([0,1)^d)$ belongs to the truncation ball $A_{\gamma_i,2}(R)$ if and only if

\[ \sup_{j \in J_i}\ 2^{2|j|\gamma_i} \sum_{k \in K_j} (\theta^{i}_{j,k})^2 \le R^2. \]

The truncation ball $A_{\gamma_i,2}(R)$ is similar to a Besov ball in the sense that it controls the decay of the energy of the hyperbolic wavelet coefficients over the scales, and hence it is used to control the approximation error. Following Lemma 2.2 of Neumann [26], the following inclusion property holds.

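Proposition 4.1. Fix $i \in \{0,1\}^d \setminus \{0\}$. Let $\gamma_i > 0$. For any $R > 0$, there exists $R' > 0$ such that

\[ \bigcup_{s:\ i_1 s_1^{-1} + \dots + i_d s_d^{-1} = \gamma_i^{-1}} B^s_{2,\infty}(R) \subset A_{\gamma_i,2}(R'). \]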

Considering the functional component $f_i$ of $f$ not only enables us to explore fine structures but is also a way to improve the performance of the testing procedures for more general structures, see Section 6.

5. Tests that combine different orientations

Many interesting hypotheses on the structure of a function can be tested by ‘aggregating’ the previously described optimal testing procedures for the Sobol functional components. In this section we describe such results in the context of the $B^s_{2,\infty}$-minimax framework, but they can easily be given for the $B^s_{2,\infty}$-adaptive minimax setting as well. We first state the following proposition:

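Proposition 5.1. [Max-testing] Fix $\alpha \in (0, 1/2)$, $s \in (0,+\infty)^d$ and consider $\beta = 1 - \left(1 - \frac{\alpha}{2}\right)^{|I|}$, where $|I|$ denotes the cardinality of a nonempty set of orientations $I$ that characterizes the alternative hypothesis. Consider the following testing hypotheses

\[ \begin{aligned} H_0 &: f \in \mathcal{S}(R, I) = \{ f : \|f\|_2 \le R,\ f_i(x) = 0,\ \forall i \in I,\ \forall x \in [0,1)^d \}, \\ H_a &: f \in \mathcal{T}(R, I) = \{ f : f \in B^s_{2,\infty}(R),\ \exists\, i \in I,\ \|f_i\|_2 \ge C r_{\varepsilon,i} \}. \end{aligned} \tag{8} \]

Then, for any $\varepsilon \in (0,\varepsilon_o)$, the testing procedure $\Lambda_{I,\varepsilon,\alpha} = \max_{i \in I} \Delta_{i,j^*}(t_{\varepsilon,i,\alpha})$ is $\beta$-level, i.e. its probability of type I error is equal to $\beta$.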

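The proof of Proposition 5.1 is an immediate consequence of Theorem 4.1. Since the testing procedures involved are independent for different orientations, the limiting distributions can be exactly computed and we do not need the probably conservative FWER. For any $\varepsilon \in (0,\varepsilon_o)$,

\[ \sup_{f \in \mathcal{S}(R,I)} P_f(\Lambda_{I,\varepsilon,\alpha} = 1) = \sup_{f \in \mathcal{S}(R,I)} \left[ 1 - P_f\left( \max_{i \in I} \Delta_{i,j^*}(t_{\varepsilon,i,\alpha}) = 0 \right) \right] = 1 - \left( 1 - \frac{\alpha}{2} \right)^{|I|} = \beta. \]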
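The level of the max-test is a closed-form function of $\alpha$ and $|I|$, and it can be inverted to calibrate $\alpha$ for a target level. A minimal sketch (the function names are illustrative; the inversion is simple algebra on the formula above, not a statement from the paper):

```python
def max_test_level(alpha, n_orientations):
    """Level beta = 1 - (1 - alpha/2)^{|I|} of the max-test over |I| orientations."""
    return 1.0 - (1.0 - alpha / 2.0) ** n_orientations


def alpha_for_level(beta, n_orientations):
    """Value of alpha such that the max-test over |I| orientations has level beta."""
    return 2.0 * (1.0 - (1.0 - beta) ** (1.0 / n_orientations))


# In d = 3 there are 4 orientations with |i| > 1.
alpha = alpha_for_level(0.05, 4)
print(alpha, max_test_level(alpha, 4))
```

Because the level is computed exactly from independence across orientations, the calibrated `alpha` is slightly larger than the value `2 * 0.05 / 4` that a union bound would give.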

Remark 5.1. Under the alternative, we naturally model the estimand as a function in a $d$-variate smoothness space, here once again a Besov ball $B^s_{2,\infty}(R)$. Nevertheless, inspired by Remark 4.1, we can state another analogy with the maxiset approach. Let us consider the testing problem with the same hypotheses as in (8) but with a rate that is chosen to be the worst optimal one with respect to $s$, that is $r_{\varepsilon,\mathbf{1}} = \varepsilon^{4\gamma/(1+4\gamma)}$. Since we are only concerned with the presence of information along at least one given direction that belongs to $I$, no assumption on the smoothness is needed in the other orientations. Hence, the sequence space in the alternative associated with the chosen rate $r_{\varepsilon,\mathbf{1}}$ can be considered larger than the initial one, $B^s_{2,\infty}(R)$. More precisely, it could be replaced by the sequence space defined by the union of all Besov balls $B^s_{2,\infty}(R)$ with $s \in (0,+\infty)^d$ such that $\sum_u i_u s_u^{-1} = \gamma^{-1}$ for at least one $i \in I$.

In what follows, we present some examples where the optimal testing procedures that combine different orientations could be useful.

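5.1. Testing the structure of the goal function

Take the set $I(\delta_0) = \{ i \in \{0,1\}^d : |i| > \delta_0 \}$ that contains all orientations for which the number of contributing components of the $d$-vector is strictly larger than a specified value $\delta_0$. The corresponding hypotheses for testing for the atomic dimension are:

\[ \begin{aligned} H_0 &: f \in \mathcal{S}(R, I(\delta_0)) = \{ f : \|f\|_2 \le R,\ \delta_{B_d}(f) \le \delta_0 \}, \\ H_a &: f \in \mathcal{T}(R, I(\delta_0)) = \{ f : f \in B^s_{2,\infty}(R),\ \exists\, i \in I(\delta_0),\ \|f_i\|_2 \ge C r_{\varepsilon,i} \}. \end{aligned} \tag{9} \]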

Tests for the atomic dimension are more versatile than merely testing for additivity, the latter being a special case with $\delta_0 = 1$. Concluding that a structure is simpler than a full $d$-dimensional object leads to an efficiency gain in estimation by using appropriate methods that escape (at least partly) the curse of dimensionality when $\delta(f)$ is smaller than $d$.

Combining tests over multiple orientations $i \in I$ can be done by computing the maximum of the testing procedures in the separate orientations.

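From our testing procedures $\Delta_{i,j^*}(t_{\varepsilon,i,\alpha})$, $i \neq 0$, there is a quite natural way to estimate the structure of the goal function $f$ by considering

\[ \hat A_{f,t_{\varepsilon,i,\alpha}} = \left\{ i \in \{0,1\}^d \setminus \{0\} : \Delta_{i,j^*}(t_{\varepsilon,i,\alpha}) = 1 \right\} \]

as an estimator of $A_f$. Nevertheless, this estimator could lead to many false discoveries, that is, orientations $i \in \hat A_{f,t_{\varepsilon,i,\alpha}}$ such that $f_i$ is the null function. Obviously, the larger $\alpha$, the larger the expected number of false discoveries.

There is also a natural way to get an estimator $\hat\delta$ of the atomic dimension $\delta(f)$ of a $d$-dimensional signal $f$ by putting, for a chosen $\alpha \in (0, 1/2)$,

\[ \hat\delta_{B_d} = \begin{cases} \max\{ |i| ;\ i \in \hat A_{f,t_{\varepsilon,i,\alpha}} \} & \text{if } \hat A_{f,t_{\varepsilon,i,\alpha}} \neq \emptyset, \\ 0 & \text{if } \hat A_{f,t_{\varepsilon,i,\alpha}} = \emptyset. \end{cases} \]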

Rejecting a combined-orientations null hypothesis as in (9) when $\hat\delta_{B_d} > \delta_0$ is equivalent to rejecting $H_0$ when $\Lambda_{I(\delta_0),\varepsilon,\alpha} = 1$.
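The estimators of this subsection reduce to simple set operations on the per-orientation decisions. A minimal sketch, assuming the decisions $\Delta_{i,j^*}(t_{\varepsilon,i,\alpha})$ are already available as a dictionary keyed by orientation (the layout and function names are illustrative and not from the paper):

```python
from itertools import product


def orientations_above(d, delta0):
    """The set I(delta0): orientations in {0,1}^d with |i| > delta0."""
    return [i for i in product((0, 1), repeat=d) if sum(i) > delta0]


def estimated_atomic_dimension(decisions):
    """Largest |i| among rejected nonzero orientations, 0 if none is rejected."""
    active = [i for i, rejected in decisions.items() if rejected == 1 and any(i)]
    return max((sum(i) for i in active), default=0)


def max_test(decisions, d, delta0):
    """Max-test over I(delta0); orientations missing from `decisions` count as 0."""
    return max((decisions.get(i, 0) for i in orientations_above(d, delta0)), default=0)


decisions = {(1, 0, 0): 1, (0, 1, 0): 0, (1, 1, 0): 1, (1, 1, 1): 0}
print(estimated_atomic_dimension(decisions))  # 2
print(max_test(decisions, 3, 1), max_test(decisions, 3, 2))  # 1 0
```

Since $I(\delta_0)$ collects the orientations with $|i| > \delta_0$, the max-test fires exactly when some rejected orientation has more than $\delta_0$ active directions, which is visible in the example: the test rejects for $\delta_0 = 1$ and not for $\delta_0 = 2$.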

5.2. Reduction of model: selection of variables

Another application is to test whether some variables among $x_1,\dots,x_d$ may be omitted from the model or not. Consider the case of leaving out variable $x_m$ for some $m \in \{1,\dots,d\}$. In this case $I_m = \{ i = (i_1,\dots,i_m,\dots,i_d) \in \{0,1\}^d : i_m = 1 \}$ and the relevant hypotheses are

\[ \begin{aligned} H_0 &: f \in \mathcal{U}(R, m) = \{ f : \|f\|_2 \le R,\ \forall i \in I_m,\ f_i(x) = 0,\ \forall x \in [0,1)^d \}, \\ H_a &: f \in \mathcal{V}(R, m) = \{ f : f \in B^s_{2,\infty}(R),\ \exists\, i \in I_m,\ \|f_i\|_2 \ge C r_{\varepsilon,i} \}. \end{aligned} \]

In addition to possibly reducing the model, considering such a testing problem could find applications in modeling time series (see Section 6 for details).

Following the two previous subsections, an interesting point must be underlined here. Estimating the atomic dimension of a $d$-dimensional object or reducing the number of its relevant variables can be considered as a first step to guarantee that the object is a function of a number of variables smaller than $d$ (usually called a ‘sparse variable function’). This assumption has often been made in recent works on estimation problems, as in Autin et al. [3], and on testing problems, as in Ingster and Suslina [19]. Our methodology also makes it possible to test from the data whether this assumption is reasonable or not.

6. Numerical experiments

In this section we first describe a practical implementation of the adaptive method. We present its performance in terms of empirical power curves with respect to the signal-to-noise ratio for testing the atomic dimension of some multidimensional functions corrupted by an additive Gaussian white noise. Then, we propose an application in time series analysis that consists in exploring the structure of the spatio-spectrum of spatio-temporal processes.

6.1. Data-driven structure detection

To link our theory to the practical setting we refer the reader to Reiss [30] for the equivalence in Le Cam's sense between the multidimensional nonparametric regression problem (11) and the Gaussian white noise model (2). This equivalence is assumed to hold and typically relies on some minimal regularity conditions and an appropriate calibration of the noise level $\varepsilon = \sigma N^{-d/2}$. Let $\zeta_{l_1,\dots,l_d}$ be i.i.d. $N(0,1)$,

\[ Y_{l_1,\dots,l_d} = f\left( \frac{l_1}{N},\dots,\frac{l_d}{N} \right) + \sigma \zeta_{l_1,\dots,l_d}, \qquad 1 \le l_u \le N,\ 1 \le u \le d. \tag{11} \]
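The regression model (11) and the noise calibration $\varepsilon = \sigma N^{-d/2}$ can be simulated with the standard library alone. The following is a minimal sketch; the function names, the dictionary layout of the data and the fixed seed are illustrative choices.

```python
import random
from itertools import product


def noise_level(sigma, n, d):
    """Calibration eps = sigma * N^{-d/2} linking regression and white noise models."""
    return sigma * n ** (-d / 2)


def simulate_regression(f, n, d, sigma, seed=0):
    """Draws Y_l = f(l_1/N, ..., l_d/N) + sigma * zeta_l on the grid 1 <= l_u <= N.

    Returns a dict mapping the index tuple l to the observation Y_l.
    """
    rng = random.Random(seed)
    data = {}
    for idx in product(range(1, n + 1), repeat=d):
        x = tuple(l / n for l in idx)
        data[idx] = f(*x) + sigma * rng.gauss(0.0, 1.0)
    return data


data = simulate_regression(lambda x1, x2: x1 + x2 ** 2, n=16, d=2, sigma=0.5)
print(len(data), noise_level(0.5, 16, 2))
```

With $N = 16$ and $d = 2$ there are $N^d = 256$ observations, and the equivalent white noise level is $0.5 / 16$.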

Both the $B^s_{2,\infty}$-minimax and $B^s_{2,\infty}$-adaptive minimax optimal procedures studied previously truncate the wavelet coefficient sequence at scales that are calibrated using the smoothness of the estimand, which is unknown in practice (either all the directional regularities or only their harmonic sum). Following the idea of sieve estimation described in Birgé and Massart [5], we build a data-driven version of the adaptive testing procedure. In this numerical part, we consider only the $B^s_{2,\infty}$-adaptive minimax optimal method since its algorithmic complexity is about $O(\log_2(N))$ compared to $O(\log_2(N)^{|i|})$ for the $B^s_{2,\infty}$-minimax optimal method.

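Let us consider the set of all possible adaptive models $\mathcal{F}_{\mathcal{M}^A_N}$ based on $\varepsilon^{-2}$ observations,

\[ \mathcal{F}_{\mathcal{M}^A_N} = \left\{ \hat f_{m^A} = \hat\theta^{0}_{0,0}\phi_{0,0} + \sum_{i \neq 0} \sum_{(j,k) \in m^A_i} \hat\theta^{i}_{j,k}\psi^{i}_{j,k} ;\ m^A := \{ m^A_i \in \mathcal{M}^A_{N,i} \} \right\}, \]

with

\[ \mathcal{M}^A_{N,i} := \left\{ m_{i,J} := \left\{ (j,k) \in J_i \times K_j ;\ |j| \le J ;\ j_u \le \log_2(N),\ \forall u \right\} ;\ J \le |i| \log_2(N) \right\}. \]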

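The oracle approach leads us to consider the estimator $\hat f_{m^A_o}$ of $F^{M^A}_N$ such that $E\|\hat f_{m^A_o} - f\|_2^2$ is minimal. This corresponds to finding $m^A$, denoted as $m^A_o$, such that the following quantity is minimal:

$$\sum_{(i,j,k)\in m^A} \left( -(\theta^{i}_{j,k})^2 + \frac{\sigma^2}{N^d} \right).$$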

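The optimizer is found by solving the following problem for every $i \in \{0,1\}^d \setminus \{0\}$:

$$m^A_{i,o} = \arg\min_{m^A_i \in M^A_{N,i}} \left\{ \sum_{(j,k)\in m^A_i} \left( -(\theta^{i}_{j,k})^2 + \frac{\sigma^2}{N^d} \right) \right\}.$$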

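In practice, we plug in empirical quantities and adjust for the variability in the data, proposing the following optimization, with $\hat\lambda = \hat\sigma \sqrt{2 d N^{-d} \log N}$ the universal threshold:

$$\hat m^A_{i,o} = \arg\min_{m^A_i \in M^A_{N,i}} \left\{ \sum_{(j,k)\in m^A_i} \left( -(\hat\theta^{i}_{j,k})^2 + \hat\lambda^2 \right) \right\}. \tag{12}$$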
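The data-driven choice of the truncation level can be sketched as follows. This is only an illustration under stated assumptions: the criterion is taken to be the sum of $\hat\lambda^2 - (\hat\theta^i_{j,k})^2$ over the retained coefficients, the coefficients of a single orientation $i$ are assumed to be available in a dictionary indexed by the scale multi-index $j$, and the names `universal_threshold`, `select_truncation_level` and `toy_coeffs` are hypothetical.

```python
import math


def universal_threshold(sigma_hat, N, d):
    """lambda_hat = sigma_hat * sqrt(2 * d * N^(-d) * log N)."""
    return sigma_hat * math.sqrt(2.0 * d * N ** (-d) * math.log(N))


def select_truncation_level(coeffs, lam, candidate_levels):
    """Return the level J minimising sum_{|j| <= J} (lam^2 - theta_hat^2).

    coeffs maps a scale multi-index j (a tuple) to the list of empirical
    wavelet coefficients at that scale, for one fixed orientation i.
    """
    best_level, best_value = None, math.inf
    for J in candidate_levels:
        value = sum(
            lam ** 2 - theta ** 2
            for j, thetas in coeffs.items()
            if sum(j) <= J
            for theta in thetas
        )
        if value < best_value:  # strict comparison: ties keep the first level
            best_level, best_value = J, value
    return best_level, best_value


# Hypothetical coefficients: strong signal at |j| = 1, near-zero values at |j| = 2.
toy_coeffs = {(1, 0): [0.9, -0.8], (1, 1): [0.01, -0.02, 0.015, 0.0]}
lam = universal_threshold(sigma_hat=1.0, N=32, d=2)
level, value = select_truncation_level(toy_coeffs, lam, candidate_levels=[1, 2])
```

In this toy setting the finer scale only adds coefficients whose squares fall below $\hat\lambda^2$, so including it increases the criterion and the coarser level is retained.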

6.2. Empirical power functions

In this section we consider 3-dimensional functions $f$ and we test for their atomic dimension, $H_0: \delta(f) = 1$ versus $H_1: \delta(f) > 1$. Therefore we consider the hyperbolic Haar wavelet basis, so that $\delta_{B_d}(f) = \delta(f)$. For the sake of simplicity, we generate them using the Sobol decomposition of a $d$-variate function $f \in L^2([0,1)^d)$ into orthogonal summands of growing dimensions,

$$f(x_1,\ldots,x_d) = \sum_{u=1}^{d}\ \sum_{i_1<\cdots<i_u} f_{i_1 \ldots i_u}(x_{i_1},\ldots,x_{i_u}). \tag{13}$$

The marginal functional components are chosen to be univariate functions that are frequently considered in the wavelet literature [2]. The interaction terms are obtained as weighted products of the marginal functions. In this experiment we choose the test functions described below.

A: $f_1(x_1)$: 'blip', $f_2(x_2)$: 'blip', $f_3(x_3)$: 'blip',

B: $f_1(x_1)$: 'blip', $f_2(x_2)$: 'wave', $f_3(x_3)$: 'bumps'.

We consider the null hypothesis with $\delta(f) = 1$ and two functions in the alternative, with atomic dimension $\delta(f) = 2$ or $\delta(f) = 3$.

• under $H_0$: $\delta(f) = 1$, $f(x_1,x_2,x_3) = f_1(x_1) + f_2(x_2) + f_3(x_3)$

• under $H_1$: $\delta(f) = 2$, $f(x_1,x_2,x_3) = f_1(x_1) + f_2(x_2) + f_3(x_3) + D f_1(x_1) f_2(x_2)$

• under $H_1$: $\delta(f) = 3$, $f(x_1,x_2,x_3) = f_1(x_1) + f_2(x_2) + f_3(x_3) + D f_1(x_1) f_2(x_2) + D f_1(x_1) f_2(x_2) f_3(x_3)$.
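The three constructions listed above can be sketched in code as follows. The marginal functions `g1`, `g2` and `g3` below are simple placeholders and are not the 'blip', 'wave' or 'bumps' functions, whose definitions are not reproduced here; the weight `D` and the helper name `make_test_function` are likewise hypothetical.

```python
import math


def make_test_function(f1, f2, f3, D, atomic_dimension):
    """Build f(x1, x2, x3) with additive marginals and weighted interactions.

    atomic_dimension = 1: additive model (null hypothesis);
    atomic_dimension = 2: adds the pairwise term D * f1 * f2;
    atomic_dimension = 3: also adds the triple term D * f1 * f2 * f3.
    """
    if atomic_dimension not in (1, 2, 3):
        raise ValueError("atomic_dimension must be 1, 2 or 3")

    def f(x1, x2, x3):
        value = f1(x1) + f2(x2) + f3(x3)
        if atomic_dimension >= 2:
            value += D * f1(x1) * f2(x2)
        if atomic_dimension >= 3:
            value += D * f1(x1) * f2(x2) * f3(x3)
        return value

    return f


# Placeholder marginals, for illustration only.
def g1(x):
    return math.sin(2.0 * math.pi * x)


def g2(x):
    return x - 0.5


def g3(x):
    return math.cos(2.0 * math.pi * x)


f_null = make_test_function(g1, g2, g3, D=2.0, atomic_dimension=1)
f_alt2 = make_test_function(g1, g2, g3, D=2.0, atomic_dimension=2)
f_alt3 = make_test_function(g1, g2, g3, D=2.0, atomic_dimension=3)
```

The same weight `D` is used for both interaction terms, as in the formulas above; changing `D` moves energy between the additive part and the interactions.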

Fig. 1 gives an illustration of some of these test functions.

Fig. 1. Three dimensional test functions that are used in the numerical experiments.

The total energy of these functions $f(x_1,x_2,x_3)$ is normalized. Then the observed data are generated by adding Gaussian white noise to the test functions. The empirical power is computed as a function of the SNR as follows:

$$P(j) = \frac{1}{M} \sum_{m=1}^{M} \mathbb{1}\left\{ \Lambda(\delta_0)_{m,\mathrm{SNR}_j} = 1 \right\},$$

where the signal to noise ratio (SNR) is defined as the ratio of the standard deviation of the function values to the standard deviation of the noise. $\Lambda(\delta_0)_{m,\mathrm{SNR}_j}$ is the result of the procedure $\Lambda_{I(\delta_0),\varepsilon,\alpha}$ (with $N \in \{32, 64\}$), $1 \le j \le 50$, with $M = 100$ Monte Carlo replications at every value of $\mathrm{SNR}_j$. The results are summarized in Fig. 2.

Fig. 2. Simulated rejection probabilities under $H_0$ and $H_1$ for both sets of test functions A and B, as a function of the signal to noise ratio.
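The empirical power formula and the SNR definition used in this experiment can be sketched as follows. The function `toy_test` is a hypothetical stand-in for the testing procedure (it is not the test of the paper), and the helper names are assumptions made for this example.

```python
import random
import statistics


def noise_std_for_snr(function_values, snr):
    """SNR = std(function values) / std(noise), solved for the noise std."""
    return statistics.pstdev(function_values) / snr


def empirical_power(run_test, M, seed=0):
    """Fraction of M Monte Carlo replications in which run_test(rng) returns 1."""
    rng = random.Random(seed)
    rejections = sum(1 for _ in range(M) if run_test(rng) == 1)
    return rejections / M


# Hypothetical stand-in for a test: reject when the mean of n noisy
# observations of a constant signal exceeds a fixed cut-off.
def toy_test(rng, signal=1.0, sigma=1.0, n=25, cutoff=0.33):
    mean = sum(signal + sigma * rng.gauss(0.0, 1.0) for _ in range(n)) / n
    return 1 if mean > cutoff else 0


sigma_noise = noise_std_for_snr([0.0, 2.0], snr=2.0)
power = empirical_power(toy_test, M=100, seed=0)
```

Repeating the call to `empirical_power` over a grid of SNR values, with the noise standard deviation set by `noise_std_for_snr`, gives a power curve of the kind reported in Fig. 2.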

Fig. 2 shows that, for both sets of test functions, the nominal level of the test, $\alpha = 5\%$, is maintained under the null hypothesis $H_0$ over the different values of the SNR. Under the alternative, we observe firstly that the detection appears at very low SNR values; secondly, that an increase in the sample size improves the detection power of the method; and thirdly, that under the two different alternatives, when the energy of the function is more spread over orientations ($\delta(f) = 3$), the detection power decreases. These tests confirm the proper calibration of the data-driven adaptive method. We can now apply it to a more realistic data example.

6.3. Application in time series analysis

Neumann and von Sachs [28] consider testing the stationarity of a time series by studying the structure of its time-varying spectrum. They form local contrast test statistics in the time direction, also based on hyperbolic wavelet coefficients of an estimator of the time-varying spectral density, namely the preperiodogram. A crucial problem for the practical application of this method in testing stationarity is due to the inherent nature of the preperiodogram data, which suffer from interferences, very low SNR and a non-Gaussian noise distribution. This may strongly affect the performance of the data-driven choice of the truncation scales. Adapting our testing methodology to this context is an interesting research problem in itself. In the sequel, we test the structure of the spatio-spectrum of spatio-temporal processes that are time stationary. A specific concern for this application is the high-dimensional setting. We face the problem of having small p-values due to a large number of observations [23]. We illustrate how to properly use our results to assess the structure of such data. We first define a class of spatio-temporal processes and its spatio-spectrum, explain how to estimate the latter, and finally apply our data-driven method to simulated data.

6.3.1. Spatio-temporal model

Following Ombao et al. [29], we define the Cramér representation of a spatio-temporal process $\{X_t(u);\ u = (u_1,u_2) \in [0,1)^2,\ t \in \mathbb{Z}\}$, where $u$ is the spatial index, as follows:

$$X_t(u) = \int_{-\pi}^{\pi} A(u,\omega) \exp(i\omega t)\, dZ(\omega), \tag{14}$$

where $Z(\omega)$ is a complex-valued stochastic process with zero mean and orthogonal increments, which satisfies

$$E[dZ(\omega)] = 0, \qquad \mathrm{Cov}(dZ(\omega_1), dZ(\omega_2)) = \mathrm{Dirac}(\omega_1 - \omega_2)\, d\omega_1,$$

where $\mathrm{Dirac}(\cdot)$ denotes the Dirac function with mass point at 0 and $A(u,\omega)$ is the spatial complex-valued transfer function. It is Hermitian, $A(\cdot,-\omega) = A^*(\cdot,\omega)$, where $A^*$ is the adjoint operator. The location-dependent spectrum is given by

$$f(u,\omega) := |A(u,\omega)|^2.$$

6.3.2. Estimation

In a practical setting we observe the spatio-temporal process over a discrete grid of dimensions $(N_1 \times N_2 \times T) \in \mathbb{N}^3$. Since we assume it to be time stationary, we can compute the bias-corrected log-periodogram at every spatial location,

$$\log\left( \frac{1}{2\pi T} \left| \sum_{t=1}^{T} X_t(u) \exp(i\omega t) \right|^2 \right) + \kappa,$$

where $\kappa = 0.57721$ is the Euler–Mascheroni constant, as in Wahba [37].
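A minimal sketch of the bias-corrected log-periodogram at a single spatial location and a single frequency is given below, following the formula above. The function name and the toy input series are hypothetical; the code assumes the periodogram is nonzero at the chosen frequency, since the logarithm is otherwise undefined.

```python
import cmath
import math

EULER_MASCHERONI = 0.5772156649015329


def bias_corrected_log_periodogram(x, omega):
    """log( |sum_t x_t exp(i*omega*t)|^2 / (2*pi*T) ) + Euler-Mascheroni constant.

    x is the time series observed at one spatial location (times t = 1, ..., T).
    """
    T = len(x)
    dft = sum(x_t * cmath.exp(1j * omega * t) for t, x_t in enumerate(x, start=1))
    periodogram = abs(dft) ** 2 / (2.0 * math.pi * T)
    return math.log(periodogram) + EULER_MASCHERONI


# Toy series with a single unit impulse at t = 1: the squared modulus of the sum is 1.
series = [1.0, 0.0, 0.0, 0.0]
value = bias_corrected_log_periodogram(series, omega=0.7)
```

For the impulse series the periodogram equals $1/(2\pi T)$ at every frequency, which gives a simple sanity check of the normalization.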


Fig. 3. Spatial AR(1) parameter for the time series application. The value of the AR(1) parameter at each spatial location ranges from 0.2 (black) to 0.99 (white).

From these spatial log-periodograms we test the structure of the spatio-spectrum $f(u,\omega)$. Hereafter we directly apply our methodology without adaptation for the non-Gaussian nature of the log-periodogram ordinates. The central limit theorem in the coefficient domain permits us to consider that the wavelet coefficients are approximately normally distributed up to comparatively fine scales. The actual procedure could be improved in that specific context, for example by adapting the methodology of Gao [9] in the penalization of the data-driven selection of the truncation scales. Nevertheless, omitting this adaptation helps us to illustrate the practical application of our method by considering a slightly suboptimal procedure.

In the previous section we checked that our method is well calibrated. In this time series example, we are far from this idealized framework. We are now dealing with periodogram data that are non-Gaussian, and the spatio-spectrum may have a complicated structure (in terms of energy along the different orientations). In this high-dimensional and more realistic context, the p-values no longer have any practical importance. For (very) large sample sizes, p-values are almost always extremely small, suggesting the rejection of the null hypothesis since, with real-life data, the situation of total absence of information in some orientation is rarely exactly true. A solution in this context is to use our test statistic values to describe the proportions of the total energy of the function that lie in the different orientations, similarly to principal component analysis. This is the approach adopted hereafter for the time series data examples.

6.3.3. Results

We generate the following time series models using a discretized version of their Cramér representation given by equation (14). We consider two spatial AR(1) processes, i.e., their spatio-spectrum is given by

$$f(u,\omega) = \sigma^2 \left( 1 - 2\phi(u)\cos\omega + \phi(u)^2 \right)^{-1}.$$

Let 'Model 1' have a constant parameter over space, i.e., $\phi(u) = \phi = 0.9$, $\forall u \in [0,1]^2$, and let 'Model 2' have a complex spatial AR(1) structure over space, as illustrated in Fig. 3.

Table 1 gives the results of the Monte Carlo experiments on generated time series data for $N_1 = N_2 = 128$ and $T = 128$. Looking at the energy repartition for the logarithm of the true spatio-spectrum under Model 1, it is clear that all the information is contained in the marginal function of frequency. Under Model 2, the difference in energy between the orientations involving the space variables is fairly small. As for the estimation of the repartition over the different orientations, there is no problem in identifying the leading orientation; nevertheless, under Model 2, we do not get a clear message about the second dominant orientation.
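To make the notion of energy repartition over orientations concrete, the sketch below computes, on a small grid, the share of energy of each Sobol (ANOVA) component of the log-spectrum of a spatial AR(1) process with constant parameter. It is an illustration under assumptions, not the wavelet-based estimator of the paper: the decomposition is a discrete ANOVA decomposition obtained by inclusion-exclusion on marginal means, and the grid sizes and function names are hypothetical.

```python
import itertools
import math


def sobol_energy_shares(values):
    """Percentage of energy carried by each Sobol (ANOVA) component on a full grid.

    values maps index tuples (one entry per variable) to function values.
    The result maps each orientation i in {0,1}^d, i != 0, to its share in %.
    """
    d = len(next(iter(values)))
    subsets = [S for r in range(d + 1) for S in itertools.combinations(range(d), r)]

    # Marginal means m_S(x_S): average of the function over the variables outside S.
    means = {}
    for S in subsets:
        sums, counts = {}, {}
        for index, value in values.items():
            key = tuple(index[u] for u in S)
            sums[key] = sums.get(key, 0.0) + value
            counts[key] = counts.get(key, 0) + 1
        means[S] = {key: sums[key] / counts[key] for key in sums}

    # Inclusion-exclusion: f_S = sum over T subset of S of (-1)^(|S|-|T|) * m_T.
    energies = {}
    for S in subsets[1:]:
        total = 0.0
        for index in values:
            component = 0.0
            for r in range(len(S) + 1):
                for T in itertools.combinations(S, r):
                    key = tuple(index[u] for u in T)
                    component += (-1) ** (len(S) - len(T)) * means[T][key]
            total += component ** 2
        orientation = tuple(1 if u in S else 0 for u in range(d))
        energies[orientation] = total / len(values)

    norm = sum(energies.values())  # assumes the function is not constant
    return {i: 100.0 * e / norm for i, e in energies.items()}


def ar1_log_spectrum(phi, omega, sigma2=1.0):
    """Logarithm of sigma^2 / (1 - 2*phi*cos(omega) + phi^2)."""
    return math.log(sigma2 / (1.0 - 2.0 * phi * math.cos(omega) + phi ** 2))


# Constant AR(1) parameter over space: the log-spectrum depends on frequency only.
n_space, n_freq = 4, 8
grid = {
    (a, b, c): ar1_log_spectrum(0.9, math.pi * (c + 0.5) / n_freq)
    for a in range(n_space)
    for b in range(n_space)
    for c in range(n_freq)
}
shares = sobol_energy_shares(grid)
```

With a constant parameter, essentially all the energy falls in the orientation $(0,0,1)$, which corresponds to the frequency variable alone.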


Table 1
True vs. estimated repartition of the total energy (in %).

Orientation i   Model 1                           Model 2
                True repartition   Estimated      True repartition   Estimated
(1, 0, 0)       0                  0 (0.013)      0.1                0.1 (0.09)
(0, 1, 0)       0                  0 (0.01)       0                  0.1 (0.078)
(1, 1, 0)       0                  0 (0.01)       0                  0.1 (0.069)
(0, 0, 1)       100                99.9 (0.014)   91.8               90.5 (0.150)
(1, 0, 1)       0                  0 (0.001)      4.9                3.3 (0.051)
(0, 1, 1)       0                  0 (0.001)      3.1                5.9 (0.074)
(1, 1, 1)       0                  0 (0.001)      0                  0 (0.008)

7. Discussion

In this paper we describe how the Sobol decomposition, in relation to the geometry of the multidimensional hyperbolic wavelet basis, allows us to build simple test statistics with impressive theoretical performance in the presence of an anisotropic estimand. Appropriately combined, these test statistics allow us to test for general structural properties such as the atomic dimension. We determine the minimax rates for separating the null and alternative hypotheses for detecting the Sobol functional component in cases of full or partial knowledge about the smoothness parameter. We propose two sequences of testing procedures that are based on different knowledge of the smoothness parameter (fully or partially known) and we show that they are asymptotically optimal in the minimax sense. Interestingly, we observe on the one hand that a loss of information about the smoothness parameter is accompanied by a deterioration of the minimax rate of separation; on the other hand, the set of functions in the alternative hypothesis can be larger. Finally, we describe a methodology for a data-driven version of the testing procedure and its application to time series data.

In the context of time series analysis, we consider extending our application example to deal with spatially and temporally non-stationary processes. We are therefore interested in the spatial time-varying spectrum of these processes. We can compute at every spatial location an estimator of the time-varying spectrum such as the preperiodogram (see Neumann and von Sachs [28]). Neumann and von Sachs [27] provide arguments about the asymptotic equivalence of the preperiodogram estimation problem, the Gaussian white noise model and the multidimensional nonparametric regression model, so that our theory can be extended to this context. Nevertheless, in a practical setting, the nature of the preperiodogram data requires the development of an adapted methodology for choosing the truncation scales. In combination with the fully adaptive testing procedure, such a method should prove useful in practical applications. Even in the context of estimation of the time-varying spectral density, the adaptive smoothing of the preperiodogram is still an active research topic (see for instance van Delft and Eichler [36]).

Acknowledgments

The authors would like to thank the two reviewers for useful comments and suggestions on a preliminary version of the manuscript. G. Claeskens and J.-M. Freyermuth acknowledge the support of the Fund for Scientific Research Flanders, KU Leuven grant GOA/12/14 and of the IAP Research Network P7/06 of the Belgian Science Policy. Jean-Marc Freyermuth's and John Aston's research was supported by the Engineering and Physical Sciences Research Council [EP/K021672/2].

Appendix A

A.1. Proof of Theorem 4.1

The proof of Theorem 4.1 is a direct consequence of Propositions A.1 and A.2, which respectively deal with the upper bound and the lower bound of the minimax result.

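Proposition A.1. Under the same assumptions as Theorem 4.1 and for any $C > C_{i,\alpha}$, the following inequalities hold:

$$\sup_{0<\varepsilon<\varepsilon_o}\ \sup_{f\in N_i(R)} P_f\left( \Delta_{i,j^*}(t_{\varepsilon,i,\alpha}) = 1 \right) = \frac{\alpha}{2} \quad\text{and}\quad \sup_{0<\varepsilon<\varepsilon_o}\ \sup_{f\in A_i(R,C,s,r_{\varepsilon,i})} P_f\left( \Delta_{i,j^*}(t_{\varepsilon,i,\alpha}) = 0 \right) \le \frac{\alpha}{2}.$$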

Remark A.1. The constant $C_{i,\alpha}$ corresponds to the absolute positive constant $C_{i,\alpha}$ appearing in Definition 3.1 and will be given at the end of the proof below.

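Proof. Consider $\varepsilon \in (0,\varepsilon_o)$. Suppose that $f \in N_i(R)$. Then $\varepsilon^{-2} T_{i,j^*}$ has a chi-square distribution with $2^{|j^*|}$ degrees of freedom. Hence

$$P_f\left( \Delta_{i,j^*}(t_{\varepsilon,i,\alpha}) = 1 \right) = P_f\left( T_{i,j^*} > t_{\varepsilon,i,\alpha} \right) = P_f\left( \varepsilon^{-2} T_{i,j^*} > \chi_{1-\frac{\alpha}{2}}\left(2^{|j^*|}\right) \right) = \frac{\alpha}{2}.$$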

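Suppose now that $f \in A_i(R,C,s,r_{\varepsilon,i})$. Then $\varepsilon^{-2} T_{i,j^*}$ has a noncentral chi-square distribution with $2^{|j^*|}$ degrees of freedom. Moreover, if $V^i_{j^*}$ denotes the linear span of

$$\left\{ \psi^i_{j,k} :\ j \in \mathcal{J}_i,\ j_u < \max(j^*_u, 1)\ \forall u,\ \text{and}\ k \in \mathcal{K}_j \right\},$$

then

$$E_f(T_{i,j^*}) = \varepsilon^2 2^{|j^*|} + \sum_{j\in \mathcal{J}_i:\ j_u<\max(j^*_u,1),\ \forall u}\ \sum_{k\in \mathcal{K}_j} \left(\theta^i_{j,k}\right)^2 = \varepsilon^2 2^{|j^*|} + \left\| \mathrm{Proj}_{V^i_{j^*}}(f_i) \right\|_2^2,$$

$$\mathrm{Var}_f(T_{i,j^*}) = 2\varepsilon^4 2^{|j^*|} + 4\varepsilon^2 \sum_{j\in \mathcal{J}_i:\ j_u<\max(j^*_u,1),\ \forall u}\ \sum_{k\in \mathcal{K}_j} \left(\theta^i_{j,k}\right)^2 = 2\varepsilon^4 2^{|j^*|} + 4\varepsilon^2 \left\| \mathrm{Proj}_{V^i_{j^*}}(f_i) \right\|_2^2.$$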

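By considering the Cramér–Chernoff method (see Chapter 2 of Massart [25]), we easily obtain the following concentration inequality for the noncentral chi-square distribution with $2^{|j^*|}$ degrees of freedom:

$$P_f\left( E_f(T_{i,j^*}) - T_{i,j^*} \ge u \right) \le \exp\left( -\frac{u^2}{2\,\mathrm{Var}_f(T_{i,j^*})} \right), \qquad \forall u > 0.$$

So,

$$\begin{aligned}
P_f\left( \Delta_{i,j^*}(t_{\varepsilon,i,\alpha}) = 0 \right) &= P_f\left( T_{i,j^*} \le t_{\varepsilon,i,\alpha} \right) = P_f\left( E_f(T_{i,j^*}) - T_{i,j^*} \ge E_f(T_{i,j^*}) - t_{\varepsilon,i,\alpha} \right) \\
&\le \exp\left( -\frac{\left( E_f(T_{i,j^*}) - t_{\varepsilon,i,\alpha} \right)^2}{2\,\mathrm{Var}_f(T_{i,j^*})} \right) \\
&= \exp\left( -\frac{\left( \varepsilon^2 2^{|j^*|} + \left\| \mathrm{Proj}_{V^i_{j^*}}(f_i) \right\|_2^2 - t_{\varepsilon,i,\alpha} \right)^2}{4\varepsilon^4 2^{|j^*|} + 8\varepsilon^2 \left\| \mathrm{Proj}_{V^i_{j^*}}(f_i) \right\|_2^2} \right) \\
&\le \exp\left( -\frac{\left( \varepsilon^2 2^{|j^*|} + \|f_i\|_2^2 - \left\| f_i - \mathrm{Proj}_{V^i_{j^*}}(f_i) \right\|_2^2 - t_{\varepsilon,i,\alpha} \right)^2}{4\varepsilon^4 2^{|j^*|} + 8\varepsilon^2 \|f_i\|_2^2} \right).
\end{aligned}$$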
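The expressions for the mean and the variance of the test statistic used in this bound can be recovered coefficient by coefficient. The short derivation below is a sketch under the assumption that $T_{i,j^*}$ is a sum of squares of $2^{|j^*|}$ independent empirical coefficients of the form $y_k = \theta_k + \varepsilon\,\xi_k$ with $\xi_k$ i.i.d. standard Gaussian; the single index $k$ is a simplified notation introduced only for this illustration.

```latex
\begin{aligned}
y_k^2 &= \theta_k^2 + 2\varepsilon\,\theta_k\,\xi_k + \varepsilon^2 \xi_k^2, \\
E\left(y_k^2\right) &= \theta_k^2 + \varepsilon^2, \\
\mathrm{Var}\left(y_k^2\right) &= 4\varepsilon^2\theta_k^2\,\mathrm{Var}(\xi_k)
  + \varepsilon^4\,\mathrm{Var}\left(\xi_k^2\right)
  + 4\varepsilon^3\theta_k\,\mathrm{Cov}\left(\xi_k,\xi_k^2\right)
  = 4\varepsilon^2\theta_k^2 + 2\varepsilon^4,
\end{aligned}
\qquad\text{since } \mathrm{Var}(\xi_k)=1,\ \mathrm{Var}(\xi_k^2)=2,\ E(\xi_k^3)=0.
```

Summing over the $2^{|j^*|}$ independent coefficients gives a mean of $\varepsilon^2 2^{|j^*|} + \sum_k \theta_k^2$ and a variance of $2\varepsilon^4 2^{|j^*|} + 4\varepsilon^2 \sum_k \theta_k^2$, which matches the two displayed formulas once $\sum_k \theta_k^2$ is identified with the squared norm of the projection of $f_i$.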

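Remark that $f_i$ belongs to $B^s_{2,\infty}(R)$ because $f$ does.

Since $\|f_i\|_2^2 \ge C^2 r_{\varepsilon,i}^2$ and $\chi_{1-\frac{\alpha}{2}}\left(2^{|j^*|}\right) \le 2^{|j^*|} + 2\left( 2^{|j^*|} \log(2\alpha^{-1}) \right)^{\frac12}$ (see the exponential inequality for the chi-square distribution given in Lemma 1 of Laurent and Massart [21]), one gets

$$\begin{aligned}
\left( -\log P_f\left( \Delta_{i,j^*}(t_{\varepsilon,i,\alpha}) = 0 \right) \right)^{-1}
&\le \frac{4\varepsilon^4 2^{|j^*|} + 8\varepsilon^2 \|f_i\|_2^2}{\left( \varepsilon^2 2^{|j^*|} + \|f_i\|_2^2 - \left\| f_i - \mathrm{Proj}_{V^i_{j^*}}(f_i) \right\|_2^2 - t_{\varepsilon,i,\alpha} \right)^2} \\
&\le \frac{4\varepsilon^4 2^{|j^*|} + 8\varepsilon^2 \|f_i\|_2^2}{\left( \varepsilon^2 2^{|j^*|} + \|f_i\|_2^2 - R^2 2^{-2|j^*|\gamma_i} - t_{\varepsilon,i,\alpha} \right)^2} \\
&\le \frac{4\varepsilon^4 2^{|j^*|}}{\left( (C^2 - R^2)\, 2^{-2|j^*|\gamma_i} - 2\varepsilon^2 \left( 2^{|j^*|}\log(2\alpha^{-1}) \right)^{\frac12} \right)^2}
 + \frac{8\varepsilon^2 \|f_i\|_2^2}{\left( \|f_i\|_2^2 - R^2 2^{-2|j^*|\gamma_i} - 2\varepsilon^2 \left( 2^{|j^*|}\log(2\alpha^{-1}) \right)^{\frac12} \right)^2} \\
&= \frac{4\varepsilon^4 2^{(1+4\gamma_i)|j^*|}}{\left( C^2 - R^2 - 2^{1+\frac{|i|}{2}} \left(\log(2\alpha^{-1})\right)^{\frac12} \right)^2}
 + \frac{8 C^4 \varepsilon^2 \|f_i\|_2^{-2}}{\left( C^2 - R^2 - 2^{1+\frac{|i|}{2}} \left(\log(2\alpha^{-1})\right)^{\frac12} \right)^2} \\
&\le \frac{2^{2+(1+4\gamma_i)|i|} + 8 C^2 e^{-\frac{2e}{1+4\gamma_i}}}{\left( C^2 - R^2 - 2^{1+\frac{|i|}{2}} \left(\log(2\alpha^{-1})\right)^{\frac12} \right)^2} \qquad (15) \\
&\le \left( \log(2\alpha^{-1}) \right)^{-1},
\end{aligned}$$

provided that $C \ge C_{i,\alpha}$. The constant $C_{i,\alpha}$ is the largest value of $C$ such that the last inequality can be replaced by an equality. That leads to $P_f\left( \Delta_{i,j^*}(t_{\varepsilon,i,\alpha}) = 0 \right) \le \frac{\alpha}{2}$ for any $C$ large enough and uniformly in $f$.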

Note that the smaller $\alpha$, the larger $C_{i,\alpha}$. This remark is obviously not a surprise, because when $\alpha$ is chosen to be small, the detection problem becomes harder. Nevertheless, the order of the minimax rate does not depend on $\alpha$. We also note that (15) is obtained because we assumed that $\varepsilon < e^{-e}$. □

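Proposition A.2. Under the same assumptions as Theorem 4.1 and for any $C < c_{i,\alpha}$, the following inequality holds:

$$\inf_{0<\varepsilon<\varepsilon_o}\ \inf_{\Delta} \left( \sup_{f\in N_i(R)} P_f(\Delta = 1) + \sup_{f\in A_i(R,C,s,r_{\varepsilon,i})} P_f(\Delta = 0) \right) > \alpha.$$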

Remark A.2. The constant $c_{i,\alpha}$ corresponds to the absolute positive constant $c_{i,\alpha}$ appearing in Definition 3.1 and will be given at the end of the proof below.

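Proof. For any $\varepsilon \in (0,\varepsilon_o)$ let us consider

$$f_\zeta = \sum_{k\in \mathcal{K}_{j^*}} \theta^i_{j^*,k}\, \psi^i_{j^*,k} = \sum_{k\in \mathcal{K}_{j^*}} \left( L \prod_{u=1}^{d} 2^{-j^*_u(\gamma_i+\frac12)}\, \zeta_k \right) \psi^i_{j^*,k},$$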

where $L > 0$ and $\zeta_k \in \{-1, 1\}$ for any $k$. The values $j^*_u$ are chosen as in the upper bound. We recall that

2−ju∗≤ ε

4iuγi


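First we check that $f_\zeta$ is in the alternative, that is to say it belongs to the Besov ball of radius $R$ with the expected regularity parameter and it is sufficiently separated from 0. According to the definition of the Besov ball with radius $R$, we have to check that

$$\max_{1\le u\le d} \left\{ 2^{2 j^*_u s_u} \right\} \sum_{k\in \mathcal{K}_{j^*}} \left( \theta^i_{j^*,k} \right)^2 < R^2.$$

Because of our choice of $\theta^i_{j^*,k}$, we have

$$\max_{1\le u\le d} \left\{ 2^{2 j^*_u s_u} \right\} 2^{\sum_{u=1}^{d} j^*_u}\, L^2 \prod_{u=1}^{d} 2^{-2 j^*_u(\gamma_i+\frac12)} = L^2 \max_{1\le u\le d} \left\{ 2^{2 j^*_u s_u} \right\} \prod_{u=1}^{d} 2^{-2 j^*_u \gamma_i} \le 2^{\min_{1\le u\le d} s_u} L^2.$$

A first condition on the constant $L$ is that it has to be chosen smaller than $R / \sqrt{2^{\min_u s_u}}$. This ensures that $f_\zeta$ belongs to the Besov ball under consideration.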

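Next we compute the square of the $L^2$-distance between $f_\zeta$ and 0. We have

$$\sum_{k\in \mathcal{K}_{j^*}} \left( \theta^i_{j^*,k} \right)^2 = L^2 \prod_{u=1}^{d} 2^{-2 j^*_u \gamma_i} \ge L^2\, 2^{-2|i|\gamma_i}\, \varepsilon^{\frac{8\gamma_i}{4\gamma_i+1}}.$$

Therefore $L$ has to be larger than $2^{|i|\gamma_i} C$. This choice guarantees that $\|f_\zeta\|_2 \ge C r_{\varepsilon,i}$.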

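Following Propositions 2.11 and 2.12 of Ingster and Suslina [17], we have that

$$\inf_{\Delta} \left( \sup_{f\in N_i(R)} P_f(\Delta = 1) + \sup_{f\in A_i(R,C,s,r_{\varepsilon,i})} P_f(\Delta = 0) \right) \ge 1 - \frac12 \sqrt{ E\left[ \left( E_\pi L_{f_\zeta} \right)^2 \right] - 1 }, \tag{16}$$

where $\pi$ is any prior probability measure concentrated on the set of alternatives $A_i(R,C,s,r_{\varepsilon,i})$ and

$$L_{f_\zeta} = \exp\left( \varepsilon^{-2} \int f_\zeta(x)\, dY(x) - \frac12 \varepsilon^{-2} \int f_\zeta(x)^2\, dx \right).$$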

Therefore, according to (16), it suffices to prove that for judicious choices of $\pi$ we can get

$$E\left[ \left( E_\pi L_{f_\zeta} \right)^2 \right] < 4(1-\alpha)^2 + 1. \tag{17}$$

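We now consider the probability measure $\pi$ such that the $\zeta_k$'s are independent Rademacher random variables with parameter $\frac12$. Note that

$$\begin{aligned}
L_{f_\zeta} &= \exp\left( \varepsilon^{-2} \int f_\zeta(x)\, dY(x) - \frac12 \varepsilon^{-2} \int f_\zeta(x)^2\, dx \right) \\
&= \exp\left( -\frac12 \varepsilon^{-2} L^2\, 2^{|j^*|} \prod_{u=1}^{d} 2^{-2 j^*_u(\gamma_i+\frac12)} \right) \prod_{k\in \mathcal{K}_{j^*}} \exp\left( \varepsilon^{-1} L\, \zeta_k\, \xi^i_{j^*,k} \prod_{u=1}^{d} 2^{-j^*_u(\gamma_i+\frac12)} \right).
\end{aligned}$$

Taking first the expectation with respect to $\pi$, we obtain

$$E_\pi\left( L_{f_\zeta} \right) = \exp\left( -\frac12 \varepsilon^{-2} L^2\, 2^{|j^*|} \prod_{u=1}^{d} 2^{-2 j^*_u(\gamma_i+\frac12)} \right) \times \prod_{k\in \mathcal{K}_{j^*}} \frac12 \left[ \exp\left( \varepsilon^{-1} L\, \xi^i_{j^*,k} \prod_{u=1}^{d} 2^{-j^*_u(\gamma_i+\frac12)} \right) + \exp\left( -\varepsilon^{-1} L\, \xi^i_{j^*,k} \prod_{u=1}^{d} 2^{-j^*_u(\gamma_i+\frac12)} \right) \right].$$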

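As a second step we take the expectation with respect to the independent standard Gaussian random variables $\xi^i_{j^*,k}$:

$$E\left[ \left( E_\pi L_{f_\zeta} \right)^2 \right] = \prod_{k\in \mathcal{K}_{j^*}} \cosh\left( \varepsilon^{-2} L^2 \prod_{u=1}^{d} 2^{-j^*_u(2\gamma_i+1)} \right) \tag{18}$$

$$\le \exp\left( \frac12\, 2^{|j^*|}\, \varepsilon^{-4} L^4 \prod_{u=1}^{d} 2^{-2 j^*_u(2\gamma_i+1)} \right) \le \exp\left( \frac{L^4}{2} \right) < 4(1-\alpha)^2 + 1.$$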
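The passage from the product of hyperbolic cosines in (18) to the first exponential bound rests on the elementary inequality $\cosh(x) \le \exp(x^2/2)$. A short worked version is given below; it writes $a$ for the common argument of the cosh terms and assumes, as the exponent of the bound suggests, that $\mathcal{K}_{j^*}$ contains $2^{|j^*|}$ indices. The symbol $a$ is introduced only for this illustration.

```latex
\cosh(x) = \sum_{n\ge 0} \frac{x^{2n}}{(2n)!}
  \le \sum_{n\ge 0} \frac{x^{2n}}{2^n\, n!}
  = \exp\left(\frac{x^2}{2}\right),
  \qquad\text{since } (2n)! \ge 2^n\, n! ,
\\[6pt]
\prod_{k\in\mathcal{K}_{j^*}} \cosh(a) = \cosh(a)^{2^{|j^*|}}
  \le \exp\left( \frac12\, 2^{|j^*|}\, a^2 \right),
  \qquad a = \varepsilon^{-2} L^2 \prod_{u=1}^{d} 2^{-j^*_u(2\gamma_i+1)},
  \quad a^2 = \varepsilon^{-4} L^4 \prod_{u=1}^{d} 2^{-2 j^*_u(2\gamma_i+1)} .
```

The subsequent bound by $\exp(L^4/2)$ then only requires $2^{|j^*|}\,\varepsilon^{-4} \prod_{u} 2^{-2 j^*_u(2\gamma_i+1)} \le 1$, which depends on the calibration of $j^*$ in terms of $\varepsilon$.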

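The final bound by $4(1-\alpha)^2 + 1$ is obtained by choosing $L < \min\left\{ \left( 2\log\left(1 + 4(1-\alpha)^2\right) \right)^{\frac14},\ R / \sqrt{2^{\min_u s_u}} \right\}$. So (17) is proved when considering $C$ small enough, that is $C \le c_{i,\alpha} = 2^{-|i|\gamma_i} L$.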

A.2. Proof of Theorem 4.2

The proof of Theorem 4.2 follows the same path as the proof of Theorem 4.1. It is a direct consequence of Proposition A.3 and Proposition A.4, which respectively deal with the upper bound and the lower bound of the minimax result.

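Proposition A.3. Under the same assumptions as Theorem 4.2 and for any $C > C_{i,\alpha}$, the following inequalities hold:

$$\sup_{0<\varepsilon<\varepsilon_o}\ \sup_{f\in N_i(R)} P_f\left( \Delta_{i,\max}\left( t_{\varepsilon,i,\gamma_i,\alpha} \right) = 1 \right) \le \frac{\alpha}{2}$$

and

$$\sup_{0<\varepsilon<\varepsilon_o}\ \sup_{s:\ \sum_u i_u s_u^{-1} = \gamma_i^{-1}}\ \sup_{f\in A_i(R,C,s,r_{\varepsilon,i})} P_f\left( \Delta_{i,\max}\left( t_{\varepsilon,i,\gamma_i,\alpha} \right) = 0 \right) \le \frac{\alpha}{2}.$$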

Remark A.3. The constant $C_{i,\alpha}$ corresponds to the absolute positive constant $C_{i,\alpha}$ appearing in Definition 3.2 and will be given at the end of the proof below.

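Proof. Consider $\varepsilon \in (0,\varepsilon_o)$. Suppose that $f \in N_i(R)$. Then, for any $j \in \mathcal{J}_i$, $\varepsilon^{-2} T_{i,j}$ has a chi-square distribution with $2^{|j|}$ degrees of freedom. Hence

$$\begin{aligned}
P_f\left( \Delta_{i,\max}\left( t_{\varepsilon,i,\gamma_i,\alpha} \right) = 1 \right)
&= P_f\left( \max_{j\in \mathcal{J}_i:\ |j| = J_i} \Delta_{i,j}\left( t_{\varepsilon,i,\gamma_i,\alpha} \right) = 1 \right)
\le \sum_{j\in \mathcal{J}_i:\ |j| = J_i} P_f\left( \Delta_{i,j}\left( t_{\varepsilon,i,\gamma_i,\alpha} \right) = 1 \right) \\
&= \sum_{j\in \mathcal{J}_i:\ |j| = J_i} P_f\left( T_{i,j} > t_{\varepsilon,i,\gamma_i,\alpha} \right)
= \sum_{j\in \mathcal{J}_i:\ |j| = J_i} \frac{\alpha}{2 K_{\gamma_i}} = \frac{\alpha}{2}.
\end{aligned}$$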
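The union bound used here (each of the $K_{\gamma_i}$ individual tests run at level $\alpha/(2K_{\gamma_i})$, so that the maximum test has level at most $\alpha/2$) can be checked by simulation. The sketch below is an illustration under assumptions: the exact chi-square quantile is replaced by the conservative upper bound $D + 2\sqrt{Dx} + 2x$, with $x = \log(1/\text{level})$, taken from the standard form of Lemma 1 in Laurent and Massart, the individual statistics are drawn independently, and all names and numerical values are hypothetical.

```python
import math
import random


def laurent_massart_threshold(dof, level):
    """Upper bound on the (1 - level)-quantile of a chi-square with `dof` d.o.f."""
    x = math.log(1.0 / level)
    return dof + 2.0 * math.sqrt(dof * x) + 2.0 * x


def max_test_rejection_rate(K, dof, alpha, M, seed=0):
    """Monte Carlo rejection rate under the null of a maximum of K chi-square tests.

    Each individual test is run at level alpha / (2 * K) (Bonferroni), using the
    conservative threshold above instead of the exact quantile.
    """
    rng = random.Random(seed)
    threshold = laurent_massart_threshold(dof, alpha / (2.0 * K))
    rejections = 0
    for _ in range(M):
        statistics_ = [
            sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(dof)) for _ in range(K)
        ]
        if any(s > threshold for s in statistics_):
            rejections += 1
    return rejections / M


rate = max_test_rejection_rate(K=5, dof=16, alpha=0.05, M=2000, seed=0)
```

Because the threshold is an upper bound on the quantile, the simulated rejection rate is expected to fall well below $\alpha/2$; the union bound itself does not require the individual statistics to be independent.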

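Suppose now that $f \in A_i(R,C,s,r_{\varepsilon,i})$, where $s$ satisfies $\sum_u i_u s_u^{-1} = \gamma_i^{-1}$. Then, for any fixed $j \in \mathcal{J}_i$ such that $|j| = J_i$, $\varepsilon^{-2} T_{i,j}$ has a noncentral chi-square distribution with $2^{J_i}$ degrees of freedom. Hence, if $V^i_j$ denotes the linear span of

$$\left\{ \psi^i_{j',k} :\ j' \in \mathcal{J}_i,\ j'_u < \max(j_u, 1)\ \forall u,\ \text{and}\ k \in \mathcal{K}_{j'} \right\},$$

then

$$\begin{aligned}
P_f\left( \Delta_{i,\max}\left( t_{\varepsilon,i,\gamma_i,\alpha} \right) = 0 \right)
&\le P_f\left( T_{i,j} \le t_{\varepsilon,i,\gamma_i,\alpha} \right)
= P_f\left( E_f(T_{i,j}) - T_{i,j} \ge E_f(T_{i,j}) - t_{\varepsilon,i,\gamma_i,\alpha} \right) \\
&\le \exp\left( -\frac{\left( E_f(T_{i,j}) - t_{\varepsilon,i,\gamma_i,\alpha} \right)^2}{2\,\mathrm{Var}_f(T_{i,j})} \right)
= \exp\left( -\frac{\left( \varepsilon^2 2^{J_i} + \left\| \mathrm{Proj}_{V^i_j}(f_i) \right\|_2^2 - t_{\varepsilon,i,\gamma_i,\alpha} \right)^2}{4\varepsilon^4 2^{J_i} + 8\varepsilon^2 \left\| \mathrm{Proj}_{V^i_j}(f_i) \right\|_2^2} \right).
\end{aligned}$$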

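Since $\|f_i\|_2^2 \ge C^2 (r_{\varepsilon,i})^2$ and $\chi_{1-\alpha/(2K_{\gamma_i})}\left(2^{J_i}\right) \le 2^{J_i} + 2\left( 2^{J_i} \log(2\alpha^{-1} K_{\gamma_i}) \right)^{\frac12}$ (see [21]), with $(K_0 \log \varepsilon^{-1})^{|i|-1} \le K_{\gamma_i} \le (K_1 \log \varepsilon^{-1})^{|i|-1}$ for some $K_0, K_1 > 0$, calculations similar to those of the non-adaptive framework give

$$\begin{aligned}
\left( -\log P_f\left( \Delta_{i,\max}\left( t_{\varepsilon,i,\gamma_i,\alpha} \right) = 0 \right) \right)^{-1}
&\le \frac{4\varepsilon^4 2^{J_i} + 8\varepsilon^2 \|f_i\|_2^2}{\left( \varepsilon^2 2^{J_i} + \|f_i\|_2^2 - \left\| f_i - \mathrm{Proj}_{V^i_j}(f_i) \right\|_2^2 - t_{\varepsilon,i,\gamma_i,\alpha} \right)^2} \\
&\le \frac{4\varepsilon^4 2^{J_i} + 8\varepsilon^2 \|f_i\|_2^2}{\left( \varepsilon^2 2^{J_i} + \|f_i\|_2^2 - R^2 2^{-2 J_i \gamma_i} - t_{\varepsilon,i,\gamma_i,\alpha} \right)^2} \\
&\le \frac{4\varepsilon^4 2^{J_i} + 8\varepsilon^2 \|f_i\|_2^2}{\left( \|f_i\|_2^2 - R^2 2^{-2 J_i \gamma_i} - 2\varepsilon^2 \left( 2^{J_i} \log(2\alpha^{-1} K_{\gamma_i}) \right)^{\frac12} \right)^2} \\
&= \frac{4\varepsilon^4 2^{(1+4\gamma_i) J_i}}{\left( C^2 - R^2 - 2^{\frac32} \left( \log(2\alpha^{-1}) + d(\log K_1 + 1) \right)^{\frac12} \right)^2}
 + \frac{8 C^4 \varepsilon^2 \|f_i\|_2^{-2}}{\left( C^2 - R^2 - 2^{\frac32} \left( \log(2\alpha^{-1}) + d(\log K_1 + 1) \right)^{\frac12} \right)^2} \\
&\le \frac{2^{2+(1+4\gamma_i)} \left( \log\log \varepsilon^{-1} \right)^{-1}}{\left( C^2 - R^2 - 2^{\frac32} \left( \log(2\alpha^{-1}) + d(\log K_1 + 1) \right)^{\frac12} \right)^2}
 + \frac{8 C^4 \varepsilon^2 \|f_i\|_2^{-2}}{\left( C^2 - R^2 - 2^{\frac32} \left( \log(2\alpha^{-1}) + d(\log K_1 + 1) \right)^{\frac12} \right)^2} \\
&\le \frac{2^{2+(1+4\gamma_i)} + 8 C^2 e^{-\frac{2e}{1+4\gamma_i}}}{\left( C^2 - R^2 - 2^{\frac32} \left( \log(2\alpha^{-1}) + d(\log K_1 + 1) \right)^{\frac12} \right)^2} \qquad (19) \\
&\le \left( \log(2\alpha^{-1}) \right)^{-1},
\end{aligned}$$

provided that $C \ge C_{i,\alpha}$. The constant $C_{i,\alpha}$ is the value of $C$ for which the last inequality can be replaced by an equality. That leads to $P_f\left( \Delta_{i,\max}\left( t_{\varepsilon,i,\gamma_i,\alpha} \right) = 0 \right) \le \frac{\alpha}{2}$ for any $C$ large enough and uniformly in $f$.

We also note that (19) is obtained because we assumed that $\varepsilon < e^{-e}$. Note, once again, that the smaller $\alpha$, the larger $C_{i,\alpha}$.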

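Proposition A.4. Under the same assumptions as Theorem 4.2 and for any $C < c_{i,\alpha}$, the following inequality holds:

$$\inf_{0<\varepsilon<\varepsilon_o}\ \inf_{\Delta} \left( \sup_{f\in N_i(R)} P_f(\Delta = 1) + \sup_{s:\ \sum_u i_u s_u^{-1} = \gamma_i^{-1}}\ \sup_{f\in A_i(R,C,s,r_{\varepsilon,i})} P_f(\Delta = 0) \right) > \alpha.$$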

Remark A.4. The constant $c_{i,\alpha}$ corresponds to the absolute positive constant $c_{i,\alpha}$ appearing in Definition 3.2 and will be given at the end of the proof below.

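Proof. For any $\varepsilon \in (0,\varepsilon_o)$ let us consider the family of $d$-tuples $\mathcal{J} = \left\{ j \in \mathcal{J}_i :\ |j| = J_i \right\}$. Remember that:

• $J_i$ is such that $2^{-J_i} \le \left( \varepsilon^4 \log\log \varepsilon^{-1} \right)^{\frac{1}{1+4\gamma_i}} < 2^{1-J_i}$,

• $\#\mathcal{J} = K_{\gamma_i} \ge (K_0 \log \varepsilon^{-1})^{|i|-1}$ for some $K_0$.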

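For $j \in \mathcal{J}$, define

$$f_{\zeta,j} = \sum_{k\in \mathcal{K}_j} \theta^i_{j,k}\, \psi^i_{j,k} = \sum_{k\in \mathcal{K}_j} \left( L \prod_{u=1}^{d} 2^{-j_u(\gamma_i+\frac12)}\, \zeta_k \right) \psi^i_{j,k},$$

where $L > 0$ and $\zeta_k \in \{-1, 1\}$ for any $k$.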

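As in the proof of Proposition A.2, we can easily check that $f_{\zeta,j}$ is in the alternative, provided that $2^{\gamma_i} C \le L \le R / \sqrt{2^{\gamma_i}}$. We have

$$\inf_{\Delta} \left( \sup_{f\in N_i(R)} P_f(\Delta = 1) + \sup_{s:\ \sum_u i_u s_u^{-1} = \gamma_i^{-1}}\ \sup_{f\in A_i(R,C,s,r_{\varepsilon,i})} P_f(\Delta = 0) \right) \ge 1 - \frac12 \sqrt{ E\left[ \left( \frac{1}{K_{\gamma_i}} \sum_{j\in\mathcal{J}} E_\pi L_{f_{\zeta,j}} \right)^2 \right] - 1 }, \tag{20}$$

where $\pi$ is a prior probability measure concentrated on the set of alternatives $A_i(R,C,s,r_{\varepsilon,i})$ and

$$L_{f_{\zeta,j}} = \exp\left( \varepsilon^{-2} \int f_{\zeta,j}(x)\, dY(x) - \frac12 \varepsilon^{-2} \int f_{\zeta,j}(x)^2\, dx \right).$$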

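Therefore, according to (20), it suffices to prove once again that for judicious choices of $\pi$ we can get

$$E\left[ \left( \frac{1}{K_{\gamma_i}} \sum_{j\in\mathcal{J}} E_\pi L_{f_{\zeta,j}} \right)^2 \right] = \frac{1}{K_{\gamma_i}^2}\, E \sum_{j\in\mathcal{J}} \left( E_\pi L_{f_{\zeta,j}} \right)^2 < 4(1-\alpha)^2 + 1. \tag{21}$$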

References
