Minimax optimal procedures for testing the structure of multidimensional functions

(1)

Contents lists available atScienceDirect

Applied

and

Computational

Harmonic

Analysis

www.elsevier.com/locate/acha

Minimax

optimal

procedures

for

testing

the

structure

of

multidimensional

functions

John Astona, Florent Autinb,Gerda Claeskensc, Jean-Marc Freyermutha,∗, Christophe Pouetb

a _{Statistical Laboratory, University of Cambridge, Wilberforce Road Cambridge CB3 0WB, England,}

United Kingdom

b_{Aix Marseille Univ, CNRS, Centrale Marseille, I2M, Marseille, France}

c_{ORSTAT and Leuven Statistics Research Center, KU Leuven, Naamsestraat 69, 3000 Leuven, Belgium}

a r t i cl e i n f o a b s t r a c t

Article history:

Received1June2016

Receivedinrevisedform22March 2017

Accepted1May2017 Availableonlinexxxx

CommunicatedbyChristopherHeil

Keywords:

Adaptation Anisotropy Atomicdimension Besovspaces Gaussiannoisemodel Hyperbolicwavelets Hypothesistesting Minimaxrate Soboldecomposition Structuralmodeling

We present a novel method for detecting some structural characteristics of multidimensional functions. We consider the multidimensional Gaussian white noisemodel withan anisotropic estimand.Usingthe relation betweenthe Sobol decompositionandthegeometryofmultidimensionalwaveletbasiswecanbuildtest statisticsforanyoftheSobolfunctionalcomponents. Weassesstheasymptotical minimax optimality of these test statistics and show that they are optimal in presence of anisotropy with respect to the newly determined minimax rates of separation.Anappropriatecombinationoftheseteststatisticsallowstotestsome general structural characteristics such as the atomic dimension or the presence of some variables. Numerical experiments show the potentialof our method for studyingspatio-temporalprocesses.

1. Introduction

Multidimensional data often exhibit a simpler underlying structure, meaning that their effective di-mension is smaller than the observed dimension. Detecting the presence of such structure can enhance the understanding of the data and allows for more effective modeling and inferential strategies. There is a flourishing literature dealing with nonparametric methods for structure detection. These contributions are concerned with different typesof structures (such as additivity, small atomic dimension and variable selection) and with different modeling approaches according to the nature of the noise, the smoothness

* Correspondingauthor.

E-mail address:[email protected](J.-M. Freyermuth). http://dx.doi.org/10.1016/j.acha.2017.05.003

(2)

assumptions for instance.A briefoverview of the most directlyrelated contributions are given hereafter. Themaincharacteristicofthispaperistoprovidearigoroustheoreticalstudyofstructuredetectioninthe presenceofananisotropicallysmoothestimand.WeconsideramultidimensionalGaussianwhitenoisemodel and deriveteststatisticsfortheatomicdimensionofmultidimensionalobjects(i.e.,themaximaldegreeof interactionbetweenthevariables) andforvariable selection.Then,following theSieveestimation ofBirgé and Massart [5], we build adata-driven procedure. It is ﬁrst tested in ‘idealistic’ numerical experiments before beingappliedto amoresophisticatedcontextof timeseriesdataanalysis.

Themainingredientofourmethodologyistoprojectthedataontoanhyperbolic(wavelet)basisandto build test statisticsbased ontheprojection coeﬃcients.Its motivation relies onthe relationthatemerges between the geometry of that basis and the ‘functional components’ of the Sobol decomposition of the estimand (see Sobol [34]). The use ofan appropriate basis allowsto beneﬁtfrom a sparse representation andanoptimaladaptationtotheanisotropicsmoothnessoftheestimand.Inthispaperweconstructoptimal testingproceduresforthefunctionalcomponentsaccordinglytotheminimaxhypothesistestingframework ofIngster[11–13].Appropriatelycombinedthesefunctionalcomponentscanberephrasedintermsofmore generalstructures suchas theatomicdimension.

This project ismotivated by therecentresultsof Autinet al.[3,4] whostudied theabilityof a tensor-productwaveletbasis,theso-calledhyperbolicwaveletbasistoestimatemultidimensionalfunctionshaving anisotropicsmoothness.Tensor-productbases(Fourierorwavelets)havealreadybeenwidelyusedinsignal detectionnotablytotestforadditivity(i.e.atomicdimensionequalsto1)asinDe CanditiisandSapatinas [8]. Nevertheless, in the latter they do not provide deep theoretical results on the performance of their method.Abramovichet al. [1]describe anotherprocedurefortestingadditivity.Theyproposeanadaptive procedure based onthe thresholding wavelet coeﬃcients and derive interesting theoretical results. Never-theless,theyexploitastandard(isotropic)waveletbasisthatcannotoptimallydealwiththemorerealistic situationofhavinganisotropicestimands. Moreover,theirmethodislimitedtotestforadditivityanddoes not exploit the full directional representation of a multidimensional wavelet basis. The functional frame-work (where the dimensiond → ∞) hasalso been investigated by Gayraud and Ingster[10] whostudied the problem of signal detection in the caseof sparse additive functions. Other developments include, for example,multichannelsignaldetectionas inIngsterandSuslina[18],anddetectionininversemodelsasin Ingsteret al.[14].

Comminges andDalalyan [6]madeprofoundtheoretical contributions onhypothesistesting procedures basedonquadraticfunctionals.Theystudyvarioustestingproblemsinvolvingmultidimensionalanisotropic functions.Theyexploitatensor-productFourierbasistobuildnonadaptiveprocedures,i.e.proceduresthat arebuiltontheknowledgeoftheregularityparametersoftheanisotropicfunctionspacesinthealternative. Inthispaper,wefocusontheconsequencesofdealingwithanisotropicestimands.Therefore,wedeﬁnetwo classes of testing procedures,referred to as minimax optimal and adaptiveminimax optimal methodsfor the Besov balls,see Section4.The formermethodology isbuilt usingthe full knowledgeofthe regularity parameterswhilethelatterusesonlyapartofit.

These methodsdifferby the natureof thetruncation appliedto the hyperbolic wavelet coefficients se-quence. An intuitive way to think about this is to consider the better known multidimensional function estimation context. The truncation rule characterizing the minimax optimal method can be viewed as a linear butdirectionallydependentsmoothing. Whilethetruncation rule oftheadaptive minimaxoptimal methodfinds itsroots inthe approximationof thehyperbolic crossthatisnaturallyassociated to tensor-product spaces (see for instance Schmeisser [31], Schmeisserand Sickel [32] and Sickel and Ullrich [33]). Tensor-product spaces are also oftenconsidered invarious statistical contexts such as in estimation with for examplethetensor-product spaceANOVAmodelas inLin[24]or insignaldetectionas inIngsterand Stepanova [15]. They are of great interest since one can hope to reach performances close to the unidi-mensional case. It is sometimespromoted as away to ‘tackle’the curse of dimensionality, neverthelessit is arestrictive assumption.In thecontext of functional ANOVA, forexample, assuming atensor-product

(3)

modelmeansthatthe interactionterm’s complexityhas tobe inverselyproportional to itsdegree.In this paperwedonotrestricttothestructuredetectionoffunctionsbelongingtotensor-productofBesovspaces buttolargerclassessuchas theanisotropicNikolskiiclass. Weshowthatourmethodsattainthe optimal separation rates between the null and the alternative hypothesis and we discuss how to combine sets of testingproceduresinawaythatisadaptedto theanisotropiccase.

Such testing procedures ﬁnd applications in many ﬁelds. We illustrate its usage in time series data analysis. Our aim is to test the structure of the spatio-spectrum associated to aspatio-temporal process thatistimestationary.This allowsusalsotoillustratehow toadaptsuchtestingprocedureinthecontext ofmultidimensionaldatawhere thenumberofobservationsisverylarge,makingp-valuesuseless.

Thispaperisstructuredasfollows.First,inSection2,wegivethefundamentalknowledgeonhyperbolic waveletbasesandtheirrelationtotheSoboldecompositionofamultidimensionalfunction.InSection3we introducethemultidimensionalGaussianwhitenoisemodelandtheminimaxhypothesistestingframework. ThenwedescribeourteststatisticsforSobolfunctionalcomponentsinSection4beforeintroducingtestsfor globalstructuralcharacteristicsinSection5.Section6describesourdata-drivenadaptationandvalidation in a practical setting of the adaptive minimax optimal procedure and a ﬁnal discussion is provided in Section7.Theproofsofthetheoreticalresultsarepostponedtotheappendix.

2. Hyperbolicwaveletbases

Westartfromaone-dimensionalperiodizedwaveletbasisB1ofL2([0,1)) whichisbuiltfromthedilations

andtranslationsofascalingfunctionφ andawavelet ψ withV (forsomeV ≥ 1)vanishingmoments,

B1=

φ(.), ψj,k(.) = 2j/2ψ(2j.− k) : j ∈ N, k ∈ {0, . . . , 2j− 1}

.

Ad-dimensionalhyperbolicwaveletbasisresultsbyformingd-variatefunctionstakingappropriateproducts oftheunivariatefunctionsφ andψ asfollows,

Bd =

φ0,0(.), ψj, ki (.) : i∈ {0, 1}d\ 0, j ∈ Ji, k∈ Kj

where0 = (0,. . . ,0),φ0,0(.)= φ(.)× · · · × φ(.),ψi_j,k(.)= ψ_ji1₁_,k₁(.)× · · · × ψ_jid_d_,k_d(.),i = (i1,. . . , id) withthe

followingnotations: ψiu ju,ku(·) = 2ju/2_φ(2ju· −k u) if iu= 0 2ju/2_ψ(2ju· −k u) if iu= 1 and Ji₌_{j = (j} 1, . . . , jd) : ∀u ∈ {1, . . . , d}, ju= juiu, ju∈ N , Kj= k = (k1, . . . , kd) : ∀u ∈ {1, . . . , d}, ku∈ {0, . . . , 2ju− 1} .

Thebasis Bd is generated byaset of ascaling functionφ0,0 and translated anddilated wavelet functions

ψi_j,k. If all the components of the vector of directional dilations j are equal, this results in the standard (isotropic) wavelet basis. Hereafter we consider that the components of j can take diﬀerent values. The wavelet functions are then supported on hyper-rectangles and the resulting wavelet basis, the so-called hyperbolicortensor-productwaveletbasis,isproventobeabletooptimallydealwithanisotropicfunctions [3,4].Wesaythateachofthe2d_{− 1 elements}_of_{i deﬁnes}_an_orientation._In_such_a_basis,_any_f _{∈ L}

2([0,1)d)

(4)

f =f, φ0,02φ0,0+ i=0 j∈Ji k∈Kj f, ψi j,k2ψ i j,k ≡ f0+ i=0 fi (1)

where .,.2 isthe L2-scalarproduct.This representation facilitates acharacterization ofa speciﬁc

struc-ture, such as additivity. Indeed, for an additive function fadd, that is a function with Sobol

decompo-sition fadd(x1,. . . ,xd) = f0+du=1fu(xu), the coeﬃcients θj,ki = f,ψ i

j,k2 in each orientation i with

|i| = i1+ i2+· · · + id > 1 areexactlyequalto 0.Informationaboutthe componentfunctionsfi in

equa-tion(1)canberetrievedviathewaveletcoeﬃcientsequencethroughthegeometryofthehyperbolicwavelet basis.Thisenablesthedesignofspeciﬁctestingproceduresforstructuralinformationformultivariate func-tions,moregeneralthanmerelytestingforadditivity.

Dalalyanet al.[7]introducedtheatomicdimensioninthephysicaldomainthatwedenoteasδ(f ).It re-ﬂectsthemaximaldegreeofinteractionbetweenthed variablesintheSoboldecomposition.Forinstance,the additivemodelfadd hasδ(fadd)= 1,whileafunctionf (x1,. . . ,xd)=du=1fu(xu)+du=1

d

v=1,v=uxuxv

has δ(f ) = 2. In the sequel, we use the definition from Autin et al. [3] of the atomic dimension in the wavelet coefficient domain relatingthe orientationsof thenonzero wavelet coefficientswith corresponding Sobol functionalcomponents.

Deﬁnition 2.1.Letf ∈ L2([0,1)d) bedecomposedinthehyperbolic waveletbasisas

f = f0+ i=0 fi = θ00,0φ0,0+ i /∈0 ⎛ ⎝ j∈Ji k∈Kj θi_j,kψ_j,ki ⎞ ⎠ .

Deﬁne Af ={i ∈ {0,1}d\ 0; θij,k = 0 for some (j,k)}.Theatomicdimensionoff withrespecttoBdisthe

integerδ_Bd= δBd(f )∈ {0,. . . ,d} such that:

δ_Bd =

max{|i|; i ∈ Af} if Af = ∅

0 ifAf =∅.

Definition 2.1 gives us more flexibility than its definition in the physical domain. Indeed, the atomic dimensionδ_B_ddependsonthenumberofvanishingmoment(evendirectionalones)oftheconsideredbasisBd.

Moreprecisely,theatomicdimensionδ(f ) ofad-dimensionalfunctionf inthephysicdomainandtheatomic dimensionδ_Bd(f ) ofthesamefunctionf inthecoeﬃcientdomainareclearlytiedbythefollowinginequality: δ_Bd(f ) ≤ δ(f). We mention here two situations to emphasize the practical importance of being able to

characterize theatomic dimension inthecoefficientdomain.The firstexample concernsmultidimensional function estimation.Autinet al.[3] introducedanovelestimation procedurebasedonthe thresholdingof thehyperbolicwaveletcoefficients.Theyshowthatitoutperformsthestandardhardthresholdingwhenever somestructuralconditionsontheestimandaresatisfied.Theseconditionsalsoincludesomeconstraintonthe valueoftheatomicdimensionδBd.Hereweprovideateststatistictotestthestructureoftheestimandbefore

applying the aforementioned estimation method. A second example concerns the exploitation of sparsity in statistical modeling that can be useful for building a statistical model using the hyperbolic wavelet coeﬃcients or in generalized linear modeling. For example, Zhou et al. [38] describe such a generalized linear model with multidimensional covariates, their ﬁrst step consists in dimension reduction by tensor decomposition. Onecould instead exploitthesparsity of thehyperbolic wavelet sequence. In suchacase, onemayseekforthehyperbolic waveletbasisthatprovidesthesparsestsituation.

3. Modelandminimaxapproachforhypothesis testing

(5)

dYε(x) = f (x)dx + εdW (x) (2)

wherex = (x1,. . . ,xd)∈ [0,1)d,f ∈ L2([0,1)d),W (x) istheBrowniansheetandε isthenoiselevel,

consid-eredheretobeclosetozero.Fortechnicalreasons(seetheproofsofPropositionA.3andPropositionA.4), withoutlossofgenerality,weassumethatitissmallerthanεo= e−e,wheree denotestheEuler’snumber.

WeconsiderthesequentialversionoftheGaussianwhitenoisemodelwhichconsistsoftheobservationsof thefollowingrandomvariables

ˆ θ0_0,0= θ0_0,0+ εξ =f, φ0,0L2+ εξ and θˆ i j,k = θ i j,k+ εξ i j,k =f, ψ i j,kL2+ εξ i j,k

where,ξ,ξ_j,ki arei.i.d.N (0, 1) and(j,k)∈ Nd× Kj.

For any chosenorientation i= 0,we maytest the following nullhypothesisHi,0 versus oneof thetwo

statedalternativehypotheses Hi,a andHi,a:

Hi,0: f ∈ Ni(R) = ⎧ ⎨ ⎩f : f 2≤ R, fi(x) = j∈Ji k∈Kj θi_j,kψi_j,k(x) = 0, ∀x ∈ [0, 1)d ⎫ ⎬ ⎭, (3) Hi,a: f ∈ Ai(R, C, s, rε,i) = ⎧ ⎪ ⎨ ⎪ ⎩f : f ∈ F s_(R), _f i 2= ⎛ ⎝ j∈Ji k∈Kj θ_j,ki 2 ⎞ ⎠ 1 2 ≥ Crε,i ⎫ ⎪ ⎬ ⎪ ⎭, (4) H i,a: f ∈ ∪s:_uius−1u =γi−1Ai(R, C, s, r ε,i). (5)

Intheaboveexpression,C isapositiveconstantwhichdoesnotdependonε andrε,i,rε,i arecontinuous

sequencesofpositiverealnumberstendingto0 asε goesto0.Fs_{(R) denotes}_a_ball_of_radius_{R in}_a_function

spacecharacterizedbyavectorofdirectionalregularitiess withharmonicsumintheorientationi denoted

asγ_i−1.

Given a simultaneous controlon the probabilities of type I and II errors, we consider the asymptotic minimax setup: we aim at providing, for any orientation i, the order of the optimal separation rates in the sense of Section2.6 in Ingsterand Suslina [17] between thenull hypothesis Hi,0 and eachone of the

alternativehypotheses Hi,a andHi,a associated toaparticular choiceofFswhen:

• s is known;

• onlytheharmonicsumγ−1_i =_uius−1u ofs withrespectto orientationi isknown.

Relatedtothis,wesearchforbothasequenceofFs_-minimax_optimal_testing_procedures_and_a_sequence_of

Fs_-adaptive_minimax_optimal_testing_procedures.

Deﬁnition 3.1. Let α ∈ 0,1₂, s ∈ (0,+∞)d _and _i _{∈ {0,}₁_}d_{\ 0.} _We _say _that _r

ε,i corresponds to the

Fs_-minimax_rate_separating_the_hypotheses_H

i,0andHi,a,uptoaconstant,ifthetwofollowingstatements

aresatisﬁed:

1. (Upper bound) for any C > Ci,α, with Ci,α an absolute positiveconstant, there exists asequence of

testing proceduresΔ∗_ε,α ε suchthat sup 0<ε<εo sup f∈Ni(R) Pf(Δ∗ε,α= 1) + sup f∈Ai(R,C,s,rε,i) Pf(Δ∗ε,α= 0) ≤ α,

(6)

2. (Lowerbound)forany C < ci,α,withci,αanabsolutepositiveconstant,thefollowing inequalityholds inf 0<ε<εo inf Δ sup f∈Ni(R) Pf(Δ = 1) + sup f∈Ai(R,C,s,rε,i) Pf(Δ = 0) > α.

Thesequence oftestingproceduresΔ∗_ε,α

ε issaidtobe F

s_-minimax_optimal.

Deﬁnition 3.2. Let α ∈ 0,1₂. Consider i ∈ {0,1}d _such _that _|i| _{> 1 and} _let _γ

i > 0. We say that rε,i

corresponds totheFs_-adaptive_minimax_rate_separating_the_hypotheses_H

i,0andHi,a,uptoaconstant,if

thetwo followingstatementsaresatisﬁed:

1. (Upper bound) for any C > C_i,α , with C_i,α an absolute positive constant,there exists a sequence of testingprocedureΔ ε,α εsuchthat sup 0<ε<o ⎛ ⎝ sup f∈Ni(R) Pf(Δε,α= 1) + sup s:uius−1u =γ−1i sup f∈Ai R,C,s,rε,i Pf(Δε,α= 0) ⎞ ⎠ ≤ α,

2. (Lowerbound)forany C < c_i,α,withc_i,αanabsolutepositiveconstant,thefollowing inequalityholds

inf 0<ε<εo inf Δ ⎛ ⎝ sup f∈Ni(R) Pf(Δ = 1) + sup s:uius−1u =γ−1i sup f∈Ai R,C,s,rε,i Pf(Δ = 0) ⎞ ⎠ > α. Thesequence oftestingproceduresΔ

ε,α

ε issaidtobe F

s_-adaptive_minimax_optimal.

Inthelattersituation,wehaveonlytheinformationabouttheharmonicsumofthedirectional smooth-ness parametersofthefunctionclassinthealternative. Todeterminetheminimax andadaptiveminimax separation rates, we naturallyconsider the combination of directional smoothness with agiven harmonic sum thatis givingtheworsttype IIerror rate.The fullyadaptive casecorresponds to an unknownvalue

γi. It isadirectextension studiedherewhich couldbe called partially adaptive,thatis to saywhen γi is

known.Inthisworkforthesakeofsimplicity weuseadaptiveinsteadof partiallyadaptive.

Remark 3.1. The deﬁnition of minimax and adaptive minimax rates are equivalent to the ones given in Section1 ofLepskiand Tsybakov[22](Section1) andslightlymoreprecisethantheones giveninSection 2.6 of Ingster and Suslina [17]. We will show that both the minimax separation rate and the adaptive minimax separationrate of testing depend on the smoothness of the underlyingfunctional space and on thenoiselevelε.Attheendoftheappendix,wehighlighttherelationshipsbetweentheabsoluteconstants

Ci,α,ci,α,Ci,α ,ci,α and theparametersappearingintheproblemsuchas R andthesmoothness.

4. Nonparametrictestsfora givenorientation

In this paragraph we present test statistics and testing proceduresthat are used in the sequel. These test statistics aregoing to provide sequences oftesting procedures thatareFs_-minimax _and_Fs_-adaptive

minimax optimal in the particular case where Fs _corresponds _to _a _Besov _space _with _{s as} _smoothness

parameter.

Deﬁnition 4.1(Teststatistic).Leti= 0 andj = (j1,. . . ,jd)∈ Ji.Thenonnegative randomvariable

Ti,j= j∈Ji_{: j}_u<max(ju,1),∀u k∈Kj (ˆθ_ji_,k)2

(7)

iscalled thei-oriented teststatisticwithlevelparameterj.

Deﬁnition4.2(Testingprocedure). Leti= 0,j = (j1,. . . ,jd)∈ Ji and t> 0. Therandom variable

Δi,j(t) = 1{Ti,j>t}

iscalled the(i,j,t)-testingprocedure.

4.1. Bs_2,_∞-minimaxoptimalsequenceof testingprocedures

We define appropriate test statistics based on the estimated empirical energy that is the sum of the squaredvaluesoftheempiricalhyperbolic waveletcoefficientswithinthespecified orientationover certain scales.Weshowthatthissequenceoftestsconstructedusinghyperbolicwaveletbasesgetsoptimalminimax propertieswhenf belongstoaBesov ballwithsmoothnessparameters.

Deﬁnition4.3(Besovball). LetR > 0 ands = (s1,. . . ,sd)∈ (0,+∞)d.Wesaythatf ∈ L2([0,1)d) belongs

totheBesovball B_2,s_∞(R) ifandonlyif

sup i=0 supj∈Ji max 1≤u≤d{2 2jusu} k∈Kj |θi j,k|2≤ R2.

In this section we assume that we have the full knowledge about the smoothness parameter s of the functionsinthe alternative.Inother words itcorresponds to thenon adaptiveset up.We deﬁne avector

j∗= (j₁∗,. . . ,j_d∗)∈ Ji _of_truncation_scales_such_that

2−ju∗≤ ε4iuγi/(1+4γi)su _{< 2}1−j∗u ₍₆₎ withγi= ( d u=1ius−1u )−1. Theorem4.1. Letα∈0,1 2

,R > 0,s in (0,+∞)d _and_i_{∈ {0,}₁_}d_{\ 0.}_Consider, _as_in₍₃₎ _and₍₄₎_,_testing

the hypotheses Hi,0 versus Hi,a where Fs(R) = Bs2,∞(R). Then the sequence of the (i,j∗,tε,i,α)-testing

procedures Δi,j∗(tε,i,α)

ε is B s

2,∞-minimax optimal with j∗ asin (6),and with ε−2tε,i,α = χ1−α/2(2|j∗|),

the1− α/2 quantileof theChi-Squaredistributionwith 2|j∗|= 2j1∗+···+jd∗ degreesof freedom.Moreoverthe B_2,s_∞-minimaxrate separatingHi,0 andHi,a isoforder of

rε,i= ε

4γi 1+4γi_.

Theproof of Theorem 4.1is stated intheappendix. Theabsolute constants Ci,αand ci,αfrom

Deﬁni-tion 3.1willbe giventhere.

This theorem can be compared to the results given inthe book of Ingster and Suslina [17] for Besov bodiesintheone-dimension case.Inthisbook,Theorem 3.9 statesdistinguishabilityconditionsand sharp asymptotics are given in Theorem 4.4. The results stated above in Theorem 4.1 is similar but in any dimension;thereforethesmoothnessparameterisreplacedbytheaveragesmoothnessvalueobtainedfrom theharmonicsum.

Thereadercouldnotethattheworstrateisassociatedwithγ1with1 = (1,. . . ,1),i.e.totheinteraction

termofhighestdegree.ThiscaseissimilartotheoneconsideredinIngsterandStepanova[16](Remark 3.3). Weobtainthesameminimaxrateoftestinghypothesesinvolvingtheharmonicsumofsmoothnessastheone theyfoundforasignalfunctionf inaSobolevspace.Nevertheless wecanpoint thattheirnullhypothesis

(8)

H0 : f = 0 is diﬀerent from the one considered here and brieﬂy reformulated as H1,0 : f1 = 0 which

consists of functionswithan atomicdimensionstrictly smallerthan thedimensiond. Theseparationrate canbedifferentforeveryorientation.Whenitcomestotestforthestructureofad-dimensionalfunction,a sequentialtestingprocedurebasedonthesedifferentteststatisticsseemstobeidealbecauseonecanbenefit from betterseparationrates.

Remark 4.1. Inthe context ofmultidimensional function estimation,themaxiset approach introducedby Kerkyacharian and Picard [20]has been proveduseful to study theperformance of wavelet-based estima-tors[3].Itconsistsindeterminingthelargestfunctionspacesuchthattheriskofanestimatoriscontrolled atachosenrate.Ifthechoiceoftherateisoftenanissue,itiscommontousetheminimaxornear-minimax rates oversome‘standard’functional spacesF. Thenthe maxisetapproachreveals asetof functionsthat contains F strictly or not. This is an optimistic approach in the sense that ﬁnally, we can estimate at theminimax ornear-minimax ratemorefunctionsthatpreviouslythought. Inminimaxhypothesistesting problems,wecangetinspiredfromthemaxisetapproachtoremarkthatunderthealternative,wenaturally model the‘estimand’as afunctioninad-variate smoothnessspace, hereB_2,s_∞(R).Nevertheless, sincewe areonlyconcernedbythepresenceofinformationalonggivendirections,thebehavioralongthedirections thatarenotofinterestcanbelet‘uncontrolled’.Hence,thesequencespaceinthealternativecanbeenlarged to thesetofallBesovspacesBs_2,_∞(R) withs_u≤ su foriu= 0 andsu= su otherwise.

4.2. B_2,s_∞-adaptive minimaxprocedure

Wenowsupposethattheregularityparameters appearinginthealternativehypothesisisunknownbut its harmonicsumγ−1_i =d_u=1iusu−1 is knownfor achosen i suchthat|i|> 1.Insomesense,weconsider

now theadaptivesetup.

To deﬁnetheteststatisticusingtheknowledgeaboutγ,wedenote byJi theintegersuchthat

2−Ji≤_ε4_{log log ε}−1

1

1+4γi _{< 2}1−Ji_. ₍₇₎

Theorem 4.2. Let α ∈ 0,1₂, R > 0, i ∈ {0,1}d satisfying |i| > 1 and let γi > 0. Consider, as in (3)

and (5),testingthehypotheses Hi,0 versus Hi,a where Fs(R)= B s

2,∞(R), whereγi−1=

d

u=1ius−1u .Then

the sequenceof testingprocedures Δi,max(tε,i,γi,α)

ε built from themaximum of the (i,j,t

ε,i,γi,α)-testing proceduresΔi,j(tε,i,γ,α) forwhichj issatisfying|j|= JiisBs2,∞-adaptiveminimaxoptimalprovidedthat,for

any ε∈ (0,εo), ε−2tε,i,γi,α= χ1−α/2Kγi(2

Ji_), _with _K

γi = #{j ∈ J

i_:_|j|_{= J}

i}. MoreovertheB2,s∞-adaptive

minimax rateseparatingHi,0 andHi,a isof orderof

r_ε,i =ε4log log ε−1

γi

1+4γi _.

The proof ofTheorem 4.2 isstated intheappendix. The absoluteconstants C_i,α and c_i,α from Deﬁni-tion 3.2will begiventhere.

Results given foradaptation inTheorem 4.2canbe compared to usual resultsobtained foradaptation in hypothesistesting problems. Theunavoidable loss due to adaptation is theusual onethat canalso be foundforexampleinTheorem7.2 inIngsterandSuslina[17].TheresultsstatedaboveinTheorem4.2can be comparedtothelossobtainedbySpokoiny[35]intheGaussian whitenoisemodelinone-dimensionfor thesignaldetectionproblem.

Remark4.2.Note thatcompared totheB_2,s_∞-minimaxcase,theoptimal separationrateisslower because log log ε−1 → ∞ asε→ 0.This attestsof thedeteriorationoftheinformation aboutthefunctionspace in

(9)

the alternative. Nevertheless,inthe alternativehypothesis, the unionof sequence spacesB_2,s_∞(R) can be replacedbyanotherlargersequencespace,denotedAγi,2(R) and deﬁnedhereafter.

Deﬁnition 4.4 (Truncation ball). Let R > 0, i ∈ {0,1}d satisfying |i| > 1 and γi > 0. We say that

f ∈ L2([0,1)d) belongstothetruncationballAγi,2(R) ifandonlyif

sup

j∈Ji

22|j|γi k∈Kj

(θ_j,ki )2≤ R2.

ThetruncationballAγi,2(R) issimilartoaBesovballinthesensethatitcontrolsthedecayoftheenergy

ofthehyperbolicwaveletcoeﬃcientsoverthescalesandhenceitisusedtocontroltheapproximationerror. FollowingLemma2.2ofNeumann[26], thefollowing inclusionproperty holds.

Proposition4.1. Fix i∈ {0,1}d_{\ 0.} _Let_γ

i> 0. ForanyR > 0,there existsR > 0 such that

s: i1s−11 +...ids−1d =γi−1

B_2,s_∞(R)⊂ Aγi,2(R _).

Consideringthefunctionalcomponentfi off doesnotonlyenableustoexploreﬁnestructuresbutitis

alsoawaytoimprove theperformanceofthetesting proceduresformoregeneralstructure,seeSection6.

5. Teststhatcombine diﬀerentorientations

Manyinterestinghypothesesonthestructure ofafunctioncanbetestedby‘aggregating’thepreviously described optimal testing procedures for theSobol functional components.In this paragraph we describe suchresultsinthecontextoftheBs_2,_∞-minimaxframework,butitcanbeeasilygivenfortheB_2,s_∞-adaptive minimaxsettingas well.Weﬁrststatethefollowingproposition:

Proposition 5.1. [Max-testing]Fix α∈0,1₂,s∈ (0,+∞)d _and _consider_{β = 1}₋₁₋α

2

|I|

such that |I| denotesthecardinality of anonemptysetof orientationsI that characterizesthealternative hypothesis.

Consider thefollowingtestinghypotheses

H0: f ∈ S(R, I)= {f : f 2≤ R, fi(x) = 0, ∀i ∈ I, ∀x ∈ [0, 1)d},

Ha: f ∈ T (R, I)= {f : f ∈ B2,s∞(R),∃ i ∈ I, fi 2≥ Crε,i}.

(8)

Then, forany ε∈ (0,εo),thetestingprocedureΛI,ε,α= maxi∈IΔi,j∗(tε,i,α) isβ-level,i.e.itsprobabilityof

type I errorisequaltoβ.

Theproofof Proposition 5.1isanimmediateconsequenceofTheorem 4.1.Since thetesting procedures involvedareindependentfordiﬀerentorientations, thelimitingdistributionscanbe exactlycomputedand wedonotneedtheprobablyconservativeFWER.Foranyε∈ (0,εo),

sup

f∈S(R,I)Pf(ΛI,ε,α= 1) =f∈S(R,I)sup

1− Pf

max

i∈I Δi,j∗(tε,i,α) = 0

= 1− 1−α 2 |I| = β.

Remark5.1.Underthealternative,wenaturallymodeltheestimandasafunctioninad-variatesmoothness space,hereandonceagainaB_2,s_∞(R).Nevertheless,inspiredfromRemark 4.1,wecanstateanotheranalogy with the maxiset approach. Let us consider the testing problem with the same hypotheses as in (8) but with a ratethat is chosen to be the worst optimalone with respect to s, thatis rε,1 = ε4γ/(1+4γ). Since

(10)

we areonlyconcernedbythepresence ofinformation alongatleast onegiven directionthatbelongstoI, no assumption on the smoothness is needed in the other orientations. Hence, the sequence space in the alternativeassociatedwiththechosenraterε,1canbeconsideredlargerthantheinitialone,Bs2,∞(R).More

precisely it couldbe replaced bythesequence spacedeﬁnedbythe unionofall Besovballs B_2,s_∞(R) with

s∈ (0,+∞)d _such_that

uiusu= γ−1 foratleastonei∈ I.

Inwhatfollows,wepresent someexampleswhere theoptimaltesting proceduresthatcombinediﬀerent orientations couldbeuseful.

5.1. Testingthestructureofthegoal function

Take the set I(δ0) = {i ∈ {0,1}d : |i| > δ0} that contains all orientations for which the number of

contributing components of the d-vector is strictly larger than a speciﬁed value δ0. The corresponding

hypotheses fortestingfortheatomicdimensionare:

H0: f ∈ S(R, I(δ0)) ={f : f 2≤ R, δBd(f )≤ δ0},

Ha: f ∈ T (R, I(δ0))={f : f ∈ Bs2,∞(R),∃ i ∈ I(δ0), fi 2≥ Crε,i}.

(9)

Tests for theatomic dimensionare moreversatile than merelytesting foradditivity, thelatter whichis a special casewith δ0 = 1.Concluding thata structure is simplerthan afull d-dimensional object leadsto

an eﬃciency gain inestimation by using the appropriate methods escaping (at least partly) the curse of dimensionality whenδ(f ) issmallerthand.

Combining tests over multiple orientations i ∈ I can be done by computing the maximum of testing proceduresintheseparate orientations.

From our testingproceduresΔi,j∗(tε,i,α) ,i = 0,there is aquitenaturalway to estimatethe structure

of thegoalfunctionf byconsidering

ˆ

Af,tε,i,α =

i∈ {0, 1}d\ 0 : Δi,j∗(tε,i,α) = 1

asanestimatorofAf.Neverthelessthisestimatorcould leadtomanyfalsediscoveriesthatareany

orienta-tioni∈ Af,tε,i,αsuchthatfiisthenullfunction.Obviouslythelargerα,thebiggerthenumberofexpected

falsediscoveries.

Thereisalsoawaytonaturallygetanestimatorδ ofˆ theatomicdimensionδ(f ) ofad-dimensionalsignal

f byputting,forachosenα∈0,1₂, ˆ

δ_Bd=

max{|i|; i ∈ ˆAf,tε,i,α} if ˆAf,tε,i,α = ∅

0 if ˆAf,tε,i,α =∅.

Rejecting acombined orientations nullhypothesis as in(9) when δˆ_Bd ≥ δ0 is equivalent with rejecting H0 whenΛI(δ0),ε,α= 1.

5.2. Reductionofmodel: selectionofvariables

Anotherapplicationisto testwhethersomevariablesamongx1,. . . ,xd maybe omittedfrom themodel

or not. Consider the case of leaving out variable xm for some m ∈ {1,. . . ,d}. In this case Im = {i =

(i1,. . . ,im,. . . ,id)∈ {0,1}d : im= 1} andtherelevanthypotheses are

H0: f ∈ U(R, m)={f : f 2≤ R, ∀i ∈ Im, fi(x) = 0, ∀x ∈ [0, 1)d},

Ha: f ∈ V(R, m)={f : f ∈ B2,s∞(R),∃ i ∈ Im, fi 2≥ Crε,i}.

(11)

Inadditiontopossiblyreducethemodel,whenconsideringsuchatestingproblemcouldﬁndapplications inmodelingtimeseries(see Section6fordetails).

Following the two previous paragraphs, an interesting point must be underlined here. Estimating the atomicdimensionofad-dimensionalobjectorreducingthenumberofitsrelevantvariablescanbeconsidered as aﬁrst stepto guaranteethatthe object isa functionof anumberof variables smaller thand (usually

called‘sparsevariablefunction’).Thisassumptionwasoftenassumedinrecentworksinestimationproblem asinAutinet al.[3]andintestingproblemasinIngsterandSuslina[19].Fromthedata,ourmethodology alsopermitstotestwhetherthisassumption isreasonableor not.

6. Numericalexperiments

In this section we ﬁrst describe a practical implementation of the adaptive method. We present its performanceintermsofempiricalpowercurvesw.r.t. signaltonoiseratiofortestingtheatomicdimension of some multidimensional functions corrupted by an additive Gaussian white noise. Then, we propose an application in time series analysis that consists in exploring the structure of the spatio-spectrum of spatio-temporalprocesses.

6.1. Data-drivenstructure detection

To link our theory to the practical setting we referthe reader to Reiss [30] for the equivalence in the Le Cam’s sense between the multidimensional nonparametric regression problem (11) and the Gaussian whitenoisemodel(2).Thisequivalenceisassumedto holdandtypicallyrelies onsomeminimalregularity conditionsandanappropriatecalibrationofthenoiselevel ε= σN−d2. Letζ_l

1,...,ld be i.i.d.N (0,1), Yl1,...,ld= f l1 N, . . . , ld N + σζl1,...,ld, 1≤ lu≤ N, 1 ≤ u ≤ d. (11)

Both,theB_2,s_∞-minimaxandB_2,s_∞-adaptiveminimaxoptimalproceduresstudiedpreviouslytruncatethe waveletcoeﬃcientsequenceatscalesthatarecalibratedbasedontheknowledgeoftheunknownsmoothness oftheestimand (basedeitherontheknowledgeofallthedirectionalregularitiesor onlyoftheirharmonic sum). Following the idea of Sieve estimation described in Birgé and Massart [5], we build a data-driven version of the adaptive testing procedure. In this numerical part, we consider only the B_2,s_∞-adaptive minimaxoptimalmethodsinceitsalgorithmiccomplexityisofaboutO (log₂(N )) comparedtoOlog₂(N )|i| fortheB_2,s_∞-minimax optimalmethod.

Letusconsider thesets ofallpossibleadaptivemodelsF_MA

N basedonε −2 _{observations,} FMA N = _ˆ fmA = ˆθ_0,00 φ_0,0+ i=0 (j,k)∈mA i ˆ θi_j,kψi_j,k; mA:={mA_i ∈ MA_N,i}, with MA_N,i:=

m_i,J :=(j, k)∈ Ji× Kj;|j| ≤ J; ju≤ log2(N ), ∀u

; J ≤ |i| log₂(N )

.

Theoracleapproachleadstoconsider theestimatorfˆmA

o ofFMAN suchthatE ˆfmAo − f

2

2 isminimal.This

correspondsto ﬁndmA_, _denoted_as _mA

o, suchthatthefollowingquantityisminimal:

− (i,j,k)∈mA (θi_j,k)2− σ 2 Nd .

(12)

Theoptimizerisfoundbysolvingthefollowing problemforeveryi∈ {0,1}d_\{0},

mA_i,o= arg min

mA i∈MAN,i ⎧ ⎨ ⎩− (j,k)∈mA i (θ_j,ki )2− σ 2 Nd ⎫_⎬ ⎭.

Inpractice,wepluginempiricalquantitiesandadjustforthevariabilityinthedataproposingthefollowing optimization,withλ = ˆˆ σ2dN−dlog N theuniversalthreshold,

ˆ

mA_i,o= arg min

mA i∈MAN,i ⎧ ⎨ ⎩− (j,k)∈mA i (ˆθi_j,k)2− ˆλ2 ⎫ ⎬ ⎭. (12)

6.2. Empiricalpowerfunctions

Inthissectionweconsider3-dimensionalfunctionsf andwetestfortheiratomicdimensionH0: δ (f ) = 1

versus H1: δ (f ) > 1. Thereforeweconsider thehyperbolicHaar waveletbasisso thatδBd(f ) = δ (f ).For

sake ofsimplicity, wegenerate them usingthe Soboldecomposition ofad-variate functionf ∈ L2([0,1)d)

into orthogonalsummandsofgrowing dimensions,

f (x1, . . . , xd) = d u=1 i1<···<iu fi1...iu(xi1, . . . , xiu). (13)

The marginal functional componentsare chosen to be univariate functions thatare frequently consid-eredinthe waveletliterature [2].The interactionterms areobtainedas weightedproduct ofthe marginal functions.Inthisexperimentwechoosethefollowingtestfunctionswedescribebelow.

A: f1(x1):‘blip’,f2(x2):‘blip’,f3(x3):‘blip’,

B: f1(x1):‘blip’,f2(x2):‘wave’,f3(x3):‘bumps’.

Weconsiderthenullhypothesiswithδ (f ) = 1 andtwofunctionsinthealternativewithatomicdimension

δ (f ) = 2 or δ (f ) = 3.

• underH0: δ (f ) = 1,f (x1, x2, x3) = f1(x1) + f2(x2) + f3(x3)

• underH1: δ (f ) = 2,f (x1, x2, x3) = f1(x1) + f2(x2) + f3(x3) + Df1(x1) f2(x2)

• underH1: δ (f ) = 3,f (x1, x2, x3) = f1(x1)+f2(x2)+f3(x3)+Df1(x1) f2(x2)+Df1(x1) f2(x2) f3(x3).

Here,Fig. 1givesanillustrationofsomeofthese testfunctions.

Thetotalenergyofthesefunctionsf (x1,x2,x3) isnormalized.Thentheobserveddataaregeneratedby

adding Gaussian whitenoise to the test functions.The empirical poweris computed as afunctionof the SNRas follows: P (j) = 1 M M m=1 1 Λ(δ0)_{m,SN Rj}=1 _,

where the signal to noise ratio (SNR) is deﬁned as the ratio of the standard deviation of the function values to the standarddeviation of thenoise. Λ(δ0)m,SN Rj isthe result ofthe procedureΛI(δ0),ε,α (with

(13)

Fig. 1. Three dimensional test functions that are used in the numerical experiments.

Fig. 2. Simulatedrejectionprobabilitiesunder H0 and H1 forbothsetsoftestfunctionsAandB,asafunctionofthesignalto

noiseratio.

N ∈ {32, 64},1≤ j ≤ 50 andwithM = 100 MonteCarloreplicationsateveryvaluesofSN Rj.Theresults

aresummarized inFig. 2.

Fig. 2 shows that for both test functions, the nominal level of the test at α = 5% under the null hypothesisH0 ismaintainedoverthediﬀerentvaluesoftheSNR.Underthealternative,weobserve ﬁrstly

(14)

that thedetection appearsat very lowSNR values,secondly, an increaseinthe samplesize improvesthe detection power of the method, and thirdly, underthe two diﬀerent alternatives, when the energyof the function is more spreadover orientations (δ (f ) = 3),the detection power decreases. These tests conﬁrm the proper calibration of thedata-driven adaptive method. Wecan now apply itto amorerealistic data example.

6.3. Applicationintimeseries analysis

NeumannandvonSachs[28]considertestingthestationarityofatimeseriesbystudyingthestructureof itstime-varyingspectrum.Theyformlocalcontrastteststatisticsinthetimedirectionalsobasedon hyper-bolic waveletcoefficientsof anestimatorofthetime-varyingspectraldensity,namelythepreperiodogram. A crucial problem for practical application of this method in testing stationarity is due to the inherent natureofthepreperiodogramdata whichsuffers frominterferences,verylowSNRandnonGaussian noise distribution. This maystrongly affect theperformancesof thedata-drivenchoice ofthetruncation scales. Adaptingourtestingmethodologytothiscontextisaninterestingresearchprobleminitself.Inthesequel, wetestthestructureofthespatio-spectrumofspatio-temporalprocessesthataretimestationary.Aspecific concern forthis applicationisthe highdimensionalsetting. Weface the problemof having smallp-values due to a large number of observations [23]. We illustrate how to properly use our results to assess the structureofsuchdata.Wefirstdefineaclassofspatio-temporalprocesses,itsspatio-spectrum,explainhow to estimatethisandfinally weapplyourdata-drivenmethodtosimulateddata.

6.3.1. Spatio-temporal model

Following Ombaoet al. [29], wedeﬁne theCramérrepresentation ofaspatio-temporalprocess {Xt(u) ;

u = (u1, u2)∈ [0,1)2,t∈ Z},whereu isthespatialindex,asfollows:

Xt(u) = π

−π

A (u, ω) exp (iωt) dZ (ω) (14)

whereZ (ω) isacomplex-valuedstochasticprocesswithzeromeanandorthogonalincrements,whichsatisﬁes

E [dZ (ω)] = 0, Cov (dZ (ω1) , dZ (ω2)) = Dirac (ω1− ω2) dω1

whereDirac(.) meanstheDiracfunctionatmasspoint0 andA(u, λ) isthespatialcomplex-valuedtransfer function. It is Hermitian A(.,−ω) = A∗(., ω),where A∗ is the adjoint operator. Thelocation-dependent spectrumisgivenby

f (u, ω) :=|A (u, ω)|2. 6.3.2. Estimation

In a practical setting we observe the spatio-temporal process over a discrete grid of dimensions (N1× N2× T ) ∈ N3. Since we assume it to be time stationary, we can compute the bias-corrected

log-periodogramateveryspatiallocation

log ⎛ ⎝ 1 2πT T t=1

Xt(u) exp (iωt)

2⎞

⎠ + κ where κ= 0.57721 istheEuler–Mascheroniconstant,as inWahba [37].

(15)

Fig. 3. SpatialAR(1)parameterforthetimeseriesapplication.ThevalueoftheAR(1)parameterateachspatiallocationranges from0.2 (black)to0.99 (whitecolor).

From thesespatial log-periodogramswetest thestructureof thespatio-spectrumf (u, ω).Hereafterwe directly apply our methodology without adaptation for the non Gaussian nature of the log-periodogram ordinates. Thecentrallimittheorem inthe coefficientdomain permitsto consider thatthewavelet coeffi-cientsare approximatelynormally distributedupto comparativelyfine scales.Theactualprocedurecould be improvedinthatspecificcontext,for examplebyadaptingthemethodologyof Gao[9]inthe penaliza-tionofthedata-drivenselectionofthetruncation scales.Nevertheless,omittingthisadaptationhelpsusto illustratethepracticalapplicationofourmethodconsideringaslightlysuboptimalprocedure.

Intheprevioussectionwecheckedthatourmethodiswell-calibrated.Inthistimeseriesexample,weare awayfromthisidealisticframework.WearenowdealingwithperiodogramdatathatarenonGaussianand thespatio-spectrummayhaveacomplicatedstructure(intermsofenergyalongthediﬀerentorientations). Inthishigh-dimensionalandmorerealisticcontext,thep-valueshavenotlongeranypracticalimportance. For (very)large sample sizes, p-values are almost always extremely small, suggesting the rejection of the nullhypothesissince,withreallifedata,thesituationoftotalabsenceofinformationinsomeorientationis rarelyexactlytrue.Asolutioninthiscontextisto useourteststatisticvaluestodescribetheproportions tothetotalenergyofthefunctionthatareinthediﬀerentorientations,similarlyasinprincipalcomponents analysis.Thisistheapproachadoptedhereafterfortimeseriesdata examples.

6.3.3. Results

Wegenerate thefollowingtime seriesmodels using adiscretizedversion oftheirCramér representation givenbyequation(14).Weconsidertwo spatialAR(1) processes,i.e.,theirspatio-spectrumisgivenby

f (u, ω) = σ

2

2π

1− 2φ (u) cos ω + φ (u)2 −1

.

Let‘Model1’be aconstantparameteroverspace,i.e.,φ(u) = φ= 0.9, ∀u ∈ [0, 1]2,and‘Model2’tohave acomplexspatial AR(1)structure overspaceasillustratedinFig. 3.

Table 1givestheresultoftheMonteCarloexperimentsongeneratedtimeseriesdataforN1= N2= 128

andT = 128.Lookingattheenergyrepartitionforthelogarithm ofthetruespatio-spectrumunderModel 1, it is clear that all the information is just contained in the marginal function of frequency. Under the Model 2, one can realize that the difference in the energy inorientations involving the space variable is prettylow.Forwhatconcernstherepartitionestimationinthedifferentorientation,thereisnoproblemfor identifyingthefirst leadingorientation,nevertheless,undermodel2,wedonotget aclearmessageabout theseconddominantorientation.

(16)

Table 1

TruevsEstimatedrepartitionofthetotalenergy(in%). orientation

i

Model 1 Model 2

True repartition Estimated True repartition Estimated

(1, 0, 0) 0 0 (0.013) 0.1 0.1 (0.09) (0, 1, 0) 0 0 (0.01) 0 0.1 (0.078) (1, 1, 0) 0 0 (0.01) 0 0.1 (0.069) (0, 0, 1) 100 99.9 (0.014) 91.8 90.5 (0.150) (1, 0, 1) 0 0 (0.001) 4.9 3.3 (0.051) (0, 1, 1) 0 0 (0.001) 3.1 5.9 (0.074) (1, 1, 1) 0 0 (0.001) 0 0 (0.008) 7. Discussion

InthispaperwedescribehowtheSoboldecompositioninrelationtothegeometryofthemultidimensional hyperbolic wavelet basis allows to build simple test statistics with impressive theoretical performance in thepresenceofanisotropicestimand.Appropriatelycombined,these teststatisticsallowtotestforgeneral structuralpropertiessuchastheatomicdimension.Wedeterminetheminimaxratesforseparatingthenull andalternativehypothesisfordetectingtheSobolfunctionalcomponentincasesoffullorpartialknowledge aboutthesmoothnessparameter.Weproposetwosequencesoftestingproceduresthatarebasedondiﬀerent knowledgesofthesmoothnessparameter(fullorpartiallyknown)andweshowthattheyareasymptotically optimalintheminimaxsense.Interestinglyweobserveontheonehandthatalossofinformationaboutthe smoothnessparameterisaccompaniedwithadeteriorationoftheminimaxrateofseparation;ontheother hand the set of functionsin thealternative hypothesis canbe larger.Finally, we describe amethodology foradatadrivenversionofthetestingprocedureanditsapplicationtotimeseriesdata.

Inthecontextoftimeseriesanalysis,weconsiderextendingourapplicationexampletodealwithspatial and time non-stationary processes. We are therefore interested in spatial time-varying spectrum of these processes.Wecancomputeateveryspatiallocationsanestimatorofthetime-varyingspectrumsuchasthe preperiodogram(seeNeumannandvonSachs[28]).NeumannandvonSachs[27]provideargumentsabout theasymptoticequivalenceofthepreperiodogramestimationproblem,theGaussianwhitenoisemodeland themultidimensionalnonparametricregressionmodelsothatourtheoryactuallycaneasilybeextendedto thiscontext.Nevertheless,inapracticalsetting,thenatureofthepreperiodogramdatarequirestodevelop anadaptedmethodologyforchoosingthetruncationscales.Incombinationwiththefullyadaptivetesting procedure, this method certainly encounters success for practical applications. Already inthe context of estimation of the time-varyingspectral density, the adaptive smoothing of thepreperiodogram is still an actual researchtopic(seeforinstance vanDelftandEichler[36]).

Acknowledgments

The authors wouldlike to thank the two reviewers for useful commentsand suggestionson a prelimi-nary versionofthe manuscript.G.Claeskens and J.-M.Freyermuthacknowledge thesupportof theFund for Scientiﬁc Research Flanders, KU Leuvengrant GOA/12/14 and of the IAP Research Network P7/06 of the Belgian Science Policy. Jean-Marc Freyermuth’s and John Aston’s research was supported by the EngineeringandPhysical SciencesResearch Council[EP/K021672/2].

Appendix A

A.1. Proofof Theorem 4.1

TheproofofTheorem 4.1isadirectconsequenceofPropositions A.1 andA.2thatrespectivelydealwith theupperbound andthelowerboundoftheminimaxresult.

(17)

PropositionA.1.Under thesameassumptionsof Theorem 4.1andforany C > Ci,α,thefollowing inequal-ities hold sup 0<ε<o sup f∈Ni(R) Pf Δi,j∗(tε,i,α) = 1 = α 2 and0<ε<supo sup f∈Ai(R,C,s,rε,i) Pf Δi,j∗(tε,i,α) = 0 ≤ α 2.

RemarkA.1.TheconstantCi,αcorrespondstotheabsolutepositiveconstantCi,αappearinginDeﬁnition 3.1

andwill begivenattheend oftheproofbelow.

Proof. Consider ε ∈ (0,εo). Suppose thatf ∈ Ni(R). Then ε−2Ti,j∗ has a Chi-Square distribution with

2|j∗|degreesoffreedom.Hence

Pf Δi,j∗(tε,i,α) = 1 =Pf Ti,j∗> tε,i,α =Pf ε−2Ti,j∗> χ1−α 2 2|j∗|=α 2.

Supposenow thatf ∈ Ai(R,C,s,rε,i).Then ε−2Ti,j∗ is anoncentral Chi-Squaredistributionwith2|j ∗_|

degreesoffreedom.Moreover,ifV_ji∗ characterizesthefollowinglinearspanof

ψ_j,ki : j ∈ Ji: ju< max(ju∗, 1),∀u and k ∈ Kj

, then, Ef(Ti,j∗) = ε22|j ∗_| + j∈Ji_{: j}_u_<max(j∗ u,1),∀u k∈Kj θi_j,k2= ε22|j∗|+ P roj_Vi j∗(fi) 2 2, Varf(Ti,j∗) = 2ε42|j ∗_| + 4ε2 j∈Ji_{: j}_u_<max(j∗ u,1),∀u k∈Kj θi_j,k 2 = 2ε42|j∗|+ 4ε2 P roj_Vi j∗(fi) 2 2.

ByconsideringtheCramér–Chernoﬀmethod(seeChapter2ofMassart[25]),weeasilyobtainthefollowing concentrationinequalityforthenoncentralChi-Squaredistributionwith2|j∗|degreesoffreedom,

Pf Ef(Ti,j∗)− Ti,j∗≥ u ≤ exp − u2 2Varf(Ti,j∗) , ∀u > 0. So, Pf Δi,j∗(tε,i,α) = 0 =Pf Ti,j∗≤ tε,i,α =Pf

Ef(Ti,j∗)− Ti,j∗≥ Ef(Ti,j∗)− tε,i,α

≤ exp ⎛ ⎜ ⎝− Ef(Ti,j∗)− tε,i,α 2 2Varf(Ti,j∗) ⎞ ⎟ ⎠ = exp ⎛ ⎜ ⎜ ⎜ ⎝− ε2₂|j∗|₊_{P roj} V_j∗i (fi) 22− tε,i,α 2 4ε4₂|j∗|_{+ 8ε}2 P roj V_j∗i (fi) 22 ⎞ ⎟ ⎟ ⎟ ⎠ = exp ⎛ ⎜ ⎜ ⎜ ⎝− ε22|j∗|+ fi 22− fi− P roj_Vi j∗(fi) 2 2− tε,i,α 2 4ε4₂|j∗|_{+ 8ε}2 f i 22 ⎞ ⎟ ⎟ ⎟ ⎠.

(18)

Remarkthatfi belongstoB2,s∞(R) becausef does.

Since fi 22 ≥ C2rε,i2 andχ1−α

2

2|j∗|≤ 2|j∗|+ 22|j∗|log(2α−1)

1 2

(seetheexponential inequalityfor Chi-SquaredistributiongiveninLemma1 ofLaurentandMassart[21]),onegets,

− logPf Δi,j∗(tε,i,α) = 0 −1 ≤ 4ε42|j ∗_| + 8ε2_f i 22 ε2₂|j∗|₊ f i 22− fi− P roj_Vi j∗(fi) 2 2− tε,i,α 2 ≤ 4ε42|j ∗_| + 8ε2_f i 22 ε2₂|j∗|₊ f i 22− R22−2|j ∗_|γ_i − tε,i,α 2 ≤ 4ε42|j ∗_| (C2_{− R}2₎₂−2|j∗|γi− 2ε2₂|j∗|_log(2α−1₎12 2 + 8ε2_f i 22 fi 22− R22−2|j ∗_|γ_i − 2ε2₂|j∗|_log(2α−1₎12 2 = 4ε 4₂(1+4γi)|j∗| C2_{− R}2_{− 2}1+|i|2 (log(2α−1)) 1 2 2 + 8C4ε2 fi −22 C2_{− R}2_{− 2}1+|i|2 (log(2α−1)) 1 2 2 ≤ 22+(1+4γi)|i|+ 8C2e − 2e 1+4γi C2− R2− 21+|i|₂ _(log(2α−1₎₎1 2 2 (15) ≤log2α−1−1

provided that C ≥ Ci,α. The constantCi,α is the largest value of C such that thelast inequalitycan be

replaced by anequality. That leadsto Pf

Δi,j∗(tε,i,α) = 0

≤ α

2 for any C large enoughand uniformly

in f .

Notethatthesmallerα thelargerCi,α.Thisremarkisnotasurpriseobviouslybecausewhenα ischosen

to besmall,theproblemofdetectionbecomesharder.Neverthelesstheorderoftheminimaxratedoesnot depend onα.Wealsonote that(15)isobtainedbecauseweassumedthatε< e−e. 2

PropositionA.2.UnderthesameassumptionsofTheorem 4.1andforanyC < ci,α,thefollowinginequality

holds inf 0<ε<o inf Δ sup f∈Ni(R) Pf(Δ = 1) + sup f∈Ai(R,C,s,rε,i) Pf(Δ = 0) > α.

RemarkA.2.Theconstantci,αcorrespondstotheabsolutepositiveconstantci,αappearinginDeﬁnition 3.1

and willbegiven attheendoftheproofbelow.

Proof. For anyε∈ (0,εo) let usconsider

fζ = k∈Kj∗ θi_j∗_,kψi_j∗_,k = k∈Kj∗ L d " u=1 2−j∗u(γi+1₂)_ζ k ψ_ji∗_,k

where L> 0 andζk ∈ {−1,1} forany k.Thevaluesju∗ arechosenasintheupperbound. Werecallthat

2−ju∗≤ ε

4iuγi

(19)

Firstwecheck thatfζ isinthealternative, thatisto sayitbelongs tothe Besovball ofradius R with

theexpectedparameter ofregularity andit isseparated from 0 suﬃciently.Accordingto thedeﬁnitionof theBesovballwithradius R,wehavetocheck that

max 1≤u≤d{2 2ju∗su} k∈Kj∗ θi j∗,k 2 < R2.

Becauseof ourchoiceofθi_j∗_,k,wehave

max 1≤u≤d{2 2j∗usu}2du=1j∗u_L2 d " u=1 2−2ju∗(γi+12)= L2 max 1≤u≤d{2 2ju∗su} d " u=1 2−2ju∗γi≤ 2min1≤u≤dsu_L2_.

A ﬁrst condition on theconstant L is thatit hasto be chosen smaller than √ R

2minu su. It ensures thatfζ

belongstotheBesovballunderconsideration.

Nextwecomputethesquare oftheL2_-distance _between_f

ζ and0.Wehave k∈Kj∗ θi_j∗,k 2 = L2 d " u=1 2−2j∗uγi ≥ L2₂−2|i|γi_ε 8γi 4γi+1_.

ThereforeL hastobe largerthan2|i|γi_C._This_kind_of_choice_guarantees_that f

ζ 2≥ Crε,i.

FollowingPropositions2.11 and2.12 ofIngsterandSuslina[17], wehavethat:

inf Δ sup f∈Ni(R) Pf(Δ = 1) + sup f∈Ai(R,C,s,rε,i) Pf(Δ = 0) ≥ 1 −1 2 # EEπ Lfζ 2 − 1 (16) whereπ is anypriorprobabilitymeasure concentratedontheset ofalternativesAi(R, C, s, rε,i) and

Lfζ = exp ε−2 fζ(x) dY (x)− 1 2ε −2 _f ζ(x)2dx .

Therefore,accordingto(16),itsuﬃcestoprovethatforjudiciouschoicesofπ wecanget

EEπ

Lfζ

2

< 4(1− α)2+ 1. (17)

Weconsidernowtheprobabilitymeasureπ suchthattheζk’sareindependentRademacherrandomvariables

withparameter 1₂.Notethat

Lfζ = exp ε−2 fζ(x) dY (x)− 1 2ε −2 _f ζ(x)2dx = exp −1 2ε −2_L2₂|j∗| d " u=1 2−2ju∗(γi+12) " k∈Kj∗ exp ε−1L ζk ξ_ji∗_,k d " u=1 2−ju∗(γi+12) .

(20)

Eπ Lfζ = exp −1 2ε −2_L2₂|j∗| d " u=1 2−2j∗u(γi+12) × " k∈Kj∗ 1 2 $ exp ε−1L ξ_ji∗_,k d " u=1 2−ju∗(γi+12) + exp −ε−1_{L ξ}i j∗,k d " u=1 2−ju∗(γi+12) % .

AsasecondstepwetaketheexpectationaccordingtotheindependentstandardGaussianrandomvariables

ξ_ji∗_,k: EEπ Lfζ 2 = " k∈Kj∗ cosh ε−2L2 d " u=1 2−ju∗(2γi+1) (18) ≤ exp 1 22 |j∗_| ε−4L4 d " u=1 2−2j∗u(2γi+1) ≤ exp L4 2 < 4(1− α)2+ 1.

The last inequality is obtained by choosing L < min2 log(1 + 4(1− α)2₎14 _,_√ R

2minusu

. So (17) is proved,when consideringC smallenough,thatis C≤ ci,α= 2−|i|γiL.

A.2. Proofof Theorem 4.2

Theproof ofTheorem 4.2follows thesamepathas theproofofTheorem 4.1. Itisadirectconsequence of Proposition A.3and Proposition A.4thatrespectivelydealwiththe upperbound andthelower bound of theminimaxresult.

PropositionA.3. UnderthesameassumptionsofTheorem 4.2andforanyC > C_i,α ,thefollowing inequal-ities hold sup 0<ε<o sup f∈Ni(R) Pf Δi,max tε,i,γi,α = 1 ≤ α 2 and sup 0<ε<o sup s: _uius−1u =γ−1i sup f∈Ai R,C,s,rε,i Pf Δi,max tε,i,γi,α = 0 ≤ α 2.

RemarkA.3.TheconstantC_i,α correspondstotheabsolutepositiveconstantC_i,α appearinginDeﬁnition 3.2 and willbegiven attheendoftheproofbelow.

Proof. Consider ε ∈ (0,εo). Suppose that f ∈ Ni(R). Then, for any j ∈ Ji, ε−2Ti,j has a Chi-Square

distributionwith2|j|degreesoffreedom.Hence

Pf Δi,max t_ε,i,γ_i_,α= 1=Pf max j∈Ji_:_|j|=J i Δi,j t_ε,i,γ_i_,α= 1 ≤ j∈Ji_:_|j|=J_i Pf Δi,j tε,i,γi,α = 1 = j∈Ji_:_|j|=J i Pf Ti,j> tε,i,γi,α

(21)

= j∈Ji_:_|j|=J i α 2Kγi =α 2.

Suppose now that f ∈ Ai(R,C,s,rε,i) where s :

uius−1u = γi−1. Then, for any ﬁxed j ∈ Ji such

that|j|= Ji, ε−2Ti,j is anoncentralChi-Square distributionwith2Ji degrees offreedom. Hence,ifV_ji

characterizesthefollowinglinearspanof

ψ_j,ki : j ∈ Ji: ju< max(ju, 1),∀u and k ∈ Kj

, then, Pf Δi,max tε,i,γi,α = 0 ≤ Pf Ti,j ≤ tε,i,γi,α =Pf

Ef(Ti,j)− Ti,j ≥ Ef(Ti,j)− tε,i,γi,α

≤ exp ⎛ ⎜ ⎝− Ef(Ti,j)− tε,i,γi,α 2 2Varf(Ti,j) ⎞ ⎟ ⎠ = exp ⎛ ⎜ ⎜ ⎜ ⎝− ε22Ji₊ P roj V_ji(fi) 2 2− tε,i,γi,α 2 4ε4₂Ji_{+ 8ε}2 P roj V_ji(fi) 22 ⎞ ⎟ ⎟ ⎟ ⎠.

Since fi 22≥ C2(rε,i )2andχ1−α/2Kγ

2Ji≤ 2Ji₊₂₂Ji_log(2α−1_K γi)

1 2

(see[21])with(K0log ε−1)|i|−1≤

Kγi ≤ (K1log ε−1)|i|−1 for some K0,K1 > 0, when computing similar calculus as in the non adaptive

framework,weget − logPf Δi,max tε,i,γi,α = 0 −1 ≤ 4ε42Ji+ 8ε2 fi 22 ε2₂Ji₊ f i 22− fi− P rojV_ji(fi) 22− tε,i,γi,α 2 ≤ 4ε42Ji+ 8ε2 fi 22 ε2₂Jγ₊ f i 22− R22−2Jiγi− tε,i,γi,α 2 ≤ 4ε42Ji+ 8ε2 fi 22 fi 22− R22−2Jiγi− 2ε2 2Ji_log(2α−1_K γi 1 2 2 = 4ε 4₂(1+4γi)Ji C2_{− R}2_{− 2}3 2(log(2α−1) + d(log K₁+ 1)) 1 2 2 + 8C4ε2 fi −22 C2_{− R}2_{− 2}3 2(log(2α−1) + d(log K₁+ 1)) 1 2 2 ≤ 22+(1+4γi)(log log ε−1)−1 C2_{− R}2_{− 2}32(log(2α−1) + d(log K₁+ 1)) 1 2 2 + 8C4_ε2_f i −22 C2_{− R}2_{− 2}32(log(2α−1) + d(log K₁+ 1)) 1 2 2 ≤ 22+(1+4γi)+ 8C2e − 2e 1+4γi C2− R2− 232(log(2α−1) + d(log K₁+ 1)) 1 2 2 (19) ≤log2α−1−1

(22)

provided thatC≥ C_i,α .TheconstantC_i,α isthevalue ofC for whichthelast inequalitycanbe replaced by anequality.Thatleadsto Pf

Δi,max

t_ε,i,γ_i_,α= 0≤ α₂ forany C largeenough anduniformly inf .

Wealsonote that(19)isobtainedbecauseweassumedthatε< e−e. Note, onceagain,thatthesmallerα the largerC_i,α .

PropositionA.4. UnderthesameassumptionsofTheorem 4.2andforanyC < c_i,α,thefollowinginequality holds inf 0<ε<o inf Δ ⎛ ⎝ sup f∈Ni(R) Pf(Δ = 1) + sup s:_uius−1u =γi−1 sup f∈Ai R,C,s,r_ε,i Pf(Δ = 0) ⎞ ⎠ > α.

RemarkA.4.Theconstantc_i,αcorrespondstotheabsolutepositiveconstantc_i,αappearinginDeﬁnition 3.2 and willbegiven attheendoftheproofbelow.

Proof. For anyε∈ (0,εo) let usconsiderthefamilyofd-upleJ =

j∈ Ji_:_{|j| = J} i . Remember that: • Ji issuchthat2−Ji ≤

ε4_{log log ε}−11+4γi1 _{< 2}1−Ji_,

• #J = Kγi ≥ (K0log ε−1)|i|−1 forsomeK0.

Forj ∈ J ,deﬁne fζ,j= k∈Kj θi_j,kψ_j,ki = k∈Kj L d " u=1 2−ju(γi+12)_ζ k ψ_j,ki

where L> 0 andζk ∈ {−1,1} forany k.

As in the proof of Proposition A.2, we can easily check that fζ,j is in the alternative, provided that

2γC≤ L≤√R 2γi. Since inf Δ ⎛ ⎝ sup f∈Ni(R) Pf(Δ = 1) + sup s:_uius−1u =γi−1 sup f∈Ai R,C,s,rε,i Pf(Δ = 0) ⎞ ⎠ ≥ 1 −1 2 & ' ' ' ' (E ⎛ ⎜ ⎝ ⎛ ⎝ 1 Kγ j∈J Eπ Lfζ,j ⎞ ⎠ 2⎞ ⎟ ⎠ − 1 (20)

where π isapriorprobabilitymeasure concentratedontheset ofalternativesAi

R, C, s, r_ε,i and Lfζ,j = exp ε−2 fζ,j(x) dY (x)− 1 2ε −2 _f ζ,j(x)2 dx .

Therefore, accordingto (20),itsuﬃcesto proveonceagainthatforjudiciouschoicesof π wecanget

E ⎛ ⎜ ⎝ ⎛ ⎝ 1 Kγ j∈J Eπ Lfζ,j ⎞ ⎠ 2⎞ ⎟ ⎠ = _K12 γ E j∈J Eπ Lfζ,j 2 < 4(1− α)2+ 1. (21)