An Alternative Bootstrap to Moving Blocks for Time Series Regression Models

(1)

AN ALTERNATIVE BOOTSTRAP TO MOVING BLOCKS

FOR TIME SERIES REGRESSION MODELS

*

by Javier Hidalgo

London School of Economics and Political Science

Contents: Abstract 1. Introduction

2. Conditions and Main Results 3. Application of the Bootstrap 4. Monte Carlo Simulation 5. Proofs

6. A Technical Lemma 7. Conclusions

References Tables 4.1 – 4.5

The Suntory Centre

Suntory and Toyota International Centres for

Economics and Related Disciplines

London School of Economics and Political Science Discussion Paper Houghton Street

No.EM/03/452 London WC2A 2AE

May 2003 Tel.: 020 7955 6698

* _{This article is based on research funded by the Economic and Social Research Council}

(ESRC) reference number: R000238212. I am grateful to an associate editor and two referees for their very helpful comments which have led to a substantially improved version.

(2)

Abstract

The purpose of this paper is to introduce and examine two alternative, although similar, approaches to the Moving Blocks and subsampling Bootstraps to bootstrapping the estimator of the parameters for time series regression models. More specifically, the first bootstrap is based on resampling from the normalised discrete Fourier transform of the residuals of the model, whereas the second is from the residuals of the model itself. It is shown that the bootstraps are asymptotically valid under quite mild conditions. As a consequence of the result we are able to eleminate the apparent drawback of choosing the block length in empirical examples. A small Monte Carlo study of finite sample performance is included.

Keywords: Least squares estimation; long-range estimation; bootstrap methods. JEL Nos.: C15, C22.

©

by Javier Hidalgo. All rights reserved. Short sections of text, not to exceed two paragraphs, may be quoted without special permission provided that full credit, including © notice, is given to the source.

Contact address: Professor Javier Hidalgo, Department of Economics, London School of Economics and Political Science, Houghton Street, London WC2A 2AE. Email: [email protected]

(3)

1. INTRODUCTION

Since Efron’s(1979)seminar paper, Bootstrap algorithms have attracted considerable eﬀort to its development, being perhaps two the main motivations. First, bootstrap methods are capable of approximating thefinite sample distribution of statistics more eﬀectively than those based on their asymptotic counterparts. The second being that they allow computing valid asymptotic quantiles of the limiting distribution in situa-tions where a)the limiting distribution is unknown or b)if known, the practitioner is unable to compute its quantiles. The basic idea of the bootstrap is, given a stretch of data_ZT ={zt, t= 1, ..., T}say, to treat the data as if it were the true population, and

to carry out Monte-Carlo experiments in which pseudo-data is drawn from_ZT. Based

on the underlying distributional properties of_{zt, t= 1, ..., T_}, diﬀerent schemes have been adopted and proposed.

Tofix ideas, let us suppose thatzt= (yt, x0

t)0 follows the (linear) regression model yt=µ₀+β0₀xt+ut, t= 1, ..., T (1.1) where β₀ is a p₋dimensional vector, and we are interested in making inferences on

β₀. Whenut is a sequence of independent identically distributed data, abbreviated

henceforth as iid, the bootstrap entails to obtain a random sample _{bu∗

t, t= 1, ..., T}

from the empirical distribution of the (centered) OLS residualsu_bt=yet−bβ 0 e xt, t = 1, ..., T, where b β= Ã _T X t=1 e xt_ex0_t !−1 _T X t=1 e xt_eyt, (1.2)

andw_etstands forwt−wwithw=T−1PTt=1wt. Next, generate the bootstrap version

of(1.1)as

e

y_t∗=bβ0xt_e +u_b∗_t, t= 1, ..., T, (1.3) and perform theOLS estimator in(1.3). This bootstrap is known as Residual Boot-strap, in contrast to the Paired BootBoot-strap, which entails to draw a random sample from the empirical distribution of zt_e = (_eyt,x_e0_t)0, that isz_e_t∗ = (_ey_t∗,x_e∗_t0)0, and perform theOLS ofy_e_t∗ on_ex∗_t.

However, if the errorsut were notiid the above scheme would not be valid, as it wasfirst noted by Singh(1981)in the context of bootstrapping the sample mean of

m-dependent data. To circumvent this problem, when the errors are heteroscedastic, Wu (1986) proposed the wild or external bootstrap. This bootstrap amounts to replaceu_b∗t in(1.3)by|but|ξtwhereξtis a sequence ofiid(0,1)random variables and

independent ofzt. In a time series setup, following ideas in Carlstein(1986), Künsch

(1989)proposed to resample, not from ztbut from (overlapping) blocks of data, say Z =

¡

z0, ..., z0₊_b₋₁¢0 where = 1, ..., T ₋b+ 1. This bootstrap is known as the Moving Blocks Bootstrap (M BB). It is worth noting that Efron’s(1979)bootstrap is identical to the M BBwhenb= 1.

Another approach is subsampling, see Politis and Romano(1994). The subsam-pling in the context of model(1.1)is as follows. Consider a subsample (block) of the data of size b, Yb, = (y, ...,_e y_e+b₋1)0 and Xb, = (x, ...,e xe+b₋1)0, and compute the

OLS estimator based on the subsampleYb, andXb,, that is

b βb,= ¡ Xb,0 Xb, ¢−1 Xb,0 Yb,.

(4)

Then, the distribution of T1/2³_β_b −β₀´, say Fb_b_β(z) = PrnT1/2³_β_b −β₀´_≤zo, is estimated by 1 T₋b+ 1 T_X₋b+1 =1 I³b1/2³βb_b,₋βb´_≤z´

where_I(A)denotes the indicator function of the setA. Both methods, subsampling and moving blocks, are similar in that they utilize blocks of data of size b. The im-portant diﬀerence is that subsampling looks upon these blocks as ”subseries” whereas moving blocks use the blocks as ”building stones” to construct new pseudo-time series. The last two methods have been advocated by Politis et al. (1997)or Fitzenberger

(1998)in a model as(1.1)with time series data, motivated by the poorfinite sample performance of inferences using the so-called HAC estimator. The latter estimator entails the estimation of the spectral matrix at frequency zero of xtut, say Ω. In particular, following Parzen(1957),Ωis often estimated by

ˇ Ω= 1 T T_X₋1 r=1−T wrT X t(r) e xtbutex0t+rbut+r (1.4)

wherePt(r)denotes the sum over1≤t, t+r≤T andwrT is a weight function which

normally takes the form w(r/m)and wheremis a bandwidth parameter increasing slowly to infinity with T, that is m−1₊_mT−1 _→ ₀_{. This approach, or versions}

thereof, has been extensively employed, see Andrews (1991) for a latter reference. All these methods date back to ideas in Jowett(1955)and Hannan(1957)for scalar series to ”studentize” its sample mean and further developed by Brillinger(1979)in a multivariate setting. Recent surveys on the literature are den Haan and Levin(1997)

and Robinson and Velasco (1997).

One potential drawback, however, of the M BB or the subsampling bootstrap is their implementation in empirical examples and in particular, the choice of the block-length b. This apparent drawback is motivated by the observation that, more than anything else, their performance depends rather critically on b, especially for moderate sample sizes. Although some automatic or semiautomatic procedures have recently appeared, see Hall et al. (1995)or Loh’s(1987)calibration, the methods can be extremely expensive in computing time.

The purpose of this paper is, under some regularity conditions, to describe and analyse two diﬀerent approaches which eliminate the problem of the choice ofb and the choice of the bandwidth parameter m, as in Robinson (1998), but at the same time able to achieve betterfinite sample properties. The proposed procedures are easy to implement and computationally no more expensive than other bootstrap methods valid in the context of regression models where the errors are iidor heteroscedastic.

We now describe the main ideas of the bootstraps. In the frequency domain the

OLS estimator given in(1.2)can be written as

b β=   T_X₋1 j=1 Ixx(λj)   −1 T_X₋1 j=1 Ixy(λj), (1.5) where

(5)

are the periodogram ofxtand cross-periodogram ofxtandytrespectively, and wa(λ) = 1 (2πT)1/2 T X t=1 ateitλ (1.6)

denotes the discrete Fourier transform of a generic sequence of random variablesat. It should be noted that because wa(λ) is invariant to additive constants when evaluated at the Fourier frequenciesλj= (2πj)/T for integerj, we have that omission

of the frequencyj= 0(andj=T) in(1.5) entails sample-mean correction.

A closer inspection of(1.5)suggests thatβbcan be regarded as theOLS estimator in the ”regression” model

wy(λj) =β00wx(λj) +wu(λj) j= 1, ..., T−1, (1.7)

where wx(λj) and wu(λj) play the role of being the regressors and error term re-spectively. One interesting property of wu(λj) is that they are asymptotically un-correlated although, possibly, heteroscedastic, see Hannan(1970)or Brillinger(1981)

among others. It is precisely this observation that motivated Hannan’s(1963) (semi-parametric) generalized least squares estimator of β₀ for the model(1.1) and conse-quently extended to other useful models in econometrics, see Robinson (1991) and references therein. So, looking at(1.7), vu(λj) =wu(λj)/_|wu(λj)_| can be regarded as a sequence of zero mean and asymptotically independent homoscedastic random variables. This observation and writing(1.7)as

wy(λj) =β0₀wx(λj) +_|wu(λj)_|vu(λj) j= 1, ..., T₋1,

motivates the bootstrap schemes, diﬀering inSTEP 2 below, which we now describe in the following four steps.

STEP 1 Obtain theOLS estimator via(1.2)or(1.5), and the ordinary least squares residuals

b

ut=yt_e ₋βb0_ext, t= 1, ..., T.

STEP 2 (a) Compute the discrete Fourier transform of the residuals _but, denoted wu_b(λj), and letvub(λj) =wbu(λj)/|wub(λj)|,j = 1, ...,[T /2]. Draw independent bootstrap residualsη∗

j,1,j= 1, ...,[T /2], from the empirical distribution function

of e vu_b(λj) =σb−v1(vbu(λj)−vbu), j= 1, ...,[T /2] where v_bu = [T /2]−1P[ T /2] j=1 vbu(λj) and bσ2v = [T /2]− 1P[T /2] j=1 |vbu(λj)−vu_b|2.

That is, for allj= 1, ...,[T /2],

Pr©η∗_j,₁=_ev_bu(λk) ª = [T /2]−1, k= 1, ...,[T /2]. (b)Letu_e∗ ∼= (ue ∗

1,ue∗2, ...,eu∗T)0 be a random sample with replacement from the

stan-dardized residuals e ut=_eσ−_b_u1ut_b; _eσ2_b_u= 1 T T X t=1 b u2_t, and obtain the ”discrete Fourier transform” of _eu

∼ ∗ _as η∗_j,₂= 1 T1/2 T X t=1 e u∗_te−itλj_, _j_{= 1, ...,}_{[T /2]}_.

(6)

Remark 1. Recall that because (1.1) contains an intercept, the sample mean ofut_b

is zero.

STEP 3 Fori= 1,2, obtain the bootstrap regression

wy∗,i(λj) =bβ 0

wx(λj) +_|w_bu(λj)|η∗j,i j= 1, ...,[T /2]. (1.8)

STEP 4 Compute the bootstrap estimator ofbβas

b β∗i =   [_XT /2] j=1 Ixx(λj)   −1 [T /_X2] j=1 Re (Ixy∗,i(λj)), i= 1,2, (1.9)

where Ixy∗,i(λj) =wx(λj)wy0∗,i(−λj) andRe (a) denotes the real part of the

complex numbera.

Remark 2. Observe that by symmetry of Ixx(λj) and Ixy∗,1(λj) = Ixy∗,1(−λj),

where a indicates transposition combined with complex conjugation, the estimator

b β∗₁ in(1.9) becomes b β∗₁=   T_X₋1 j=1 Ixx(λj)   −1 T_X₋1 j=1 Ixy∗,1(λj) where η∗

j,1=η∗(T₋j),1, forj = 1, ...,[T /2]. In fact for the validity of the bootstrap, if

the latter estimatorβb∗₁were to be employed, it would be crucial to haveη∗_j,₁=η∗₍_T

−j),1.

The reason is that ifη∗_j,₁andη∗₍_T

−j),1were generated as inSTEP 2, e.g. independently,

then by symmetry, the right side of the last displayed equation would be

b β₁+  2 [T /_X2]−1 j=1 Ixx(λj)   −1 [T /_X2]−1 j=1 ³ wx(λj)_|wu_b(λj)| ³ η∗_j,₁+η∗₍_T₋_j₎_,₁´´,

whose bootstrap variance would converge toΦ/2sinceE ¯ ¯ ¯η∗_j,₁+η∗₍_T₋_j₎_,₁ ¯ ¯ ¯2= 2instead of4as would be needed for the validity of the bootstrap.

Remark 3. It is worth mentioning that STEP 2 only requires η∗

j,i, conditional on

the data, to be a sequence of iid(0,1)random variables in the unit sphere. So, Wu’s

(1986)wild/external bootstrap could be implemented. However, it appears that the proposed bootstrap performs better than the wild bootstrap. One possible reason is thatη∗

j,iis a random sample from the empirical distribution ofvub(λj), so that it may

capture better thefinite sample distributional features of the model/data than if the sample were drawn from a population with the ”same” distributional properties as those from the asymptotic distribution function of vu_b(λj)or a distribution as that

used by Mammen(1993).

The proposed bootstraps described inSTEPS 1-4eliminate the need to choose the block-lengthbof Künsch’s(1989)M BBor Politis and Romano’s(1994)subsampling approach, see also Politis et al. (1997,1999). It should be mentioned that resampling

(7)

methods in the frequency domain are not new, see for instance Franke and Härdle

(1992), Politis and Romano(1992), Dahlhaus and Janas(1996)or Hidalgo and Kreiss

(1999) among others. However, in all the latter aforementioned papers, the resam-pling techniques are based on the empirical distribution function of the periodogram ordinates and their motivation and emphasis are on the estimation of the spectral density function or functionals of the periodogram, such as the covariance.

The remainder of the paper is organized as follows. In the next section the con-ditions and main results of the paper are introduced. Section 3 illustrates how the bootstrap can be employed for statistical inferences on the parametersβ₀. Section 4 presents a small Monte-Carlo simulation to illustrate the small sample performance of the bootstraps. Section 5 gives the proofs of our results in Sections 2 and 3, which make use of a Lemma shown in Section 6. Finally Section 7 concludes and discusses extensions to more general models of interest in econometrics.

2. CONDITIONS AND MAIN RESULTS

Denote byfxx(λ)andfuu(λ)the spectral density matrix and function of xtandut

respectively, defined from the relationships

γ_x,j = Cov(xt, xt+j) = Z π −π fxx(λ)e−ijλdλ, j= 0,±1, ... γ_u,j = Cov(ut, ut+j) = Z π −π fuu(λ)e−ijλdλ, j= 0,±1, ....

Let us introduce the following regularity conditions:

C1 _{xt}and{ut}are two covariance stationary linear processes defined as xt= ∞ X j=0 ζ_jξ_t₋_j, ∞ X j=0 ° °ζ_j°°2 <_∞andut= ∞ X j=0 ϑjεt₋j; ∞ X j=0 ϑ2_j <_∞, respectively, whereζ₀is the identity matrix,ϑ₀= 1, and_kH_kdenotes the norm of the matrix H. Moreover, the processes εt and ξs are uncorrelated for all t, s= 0,_±1,_±2, ....

Let_Ft and Gt be theσ-algebras of events generated by εs, s≤ t and ξs, s ≤t,

respectively.

C2 _{εt}is an ergodic process that satisfies(a)E(εt|Ft−1∪Gt) = 0, (b)E¡ε2 t|Ft−1∪Gt ¢ =E¡ε2 t ¢ =σ2

ε a.s. and (c) the joint fourth cumulant of εti,i= 1, ...,4satisfies cum(εt1, εt2, εt3, εt4) = ½ κ, t1=t2=t3=t4, 0, otherwise, with_|κ_|<_∞.

C3 _{ξ_t}is a stochastic process that satisfies(a)E(ξt|Gt₋1∪Ft) = 0, (b)E¡ξ_tξ0_t_|Gt₋1∪Ft

¢

=E¡ξ_tξ0_t¢=Ξa.s. and(c)the joint fourth cumulant of

ξ_t ij_i, ji= 1, ..., p andi= 1, ...,4satisfies cum¡ξt1j1, ξt2j2, ξt3j3, ξt4j4 ¢ = ½ κξ,j1,j2,j3,j4, t1=t2=t3=t4, 0, otherwise,

(8)

withκξ= maxj_i=1,...,p,i=1,...,4|κξ,j1,j2,j3,j4|<∞. C4 _|(∂/∂λ)A(λ)_|=O(_|A(λ)_|/λ)and_k(∂/∂λ)Ψ(λ)_k=O(_kΨ(λ)_k/λ)asλ_→0+, where A(λ) = ∞ X j=0 ϑjeijλ and Ψ(λ) = ∞ X j=0 ζ_jeijλ,

and such that_|A(λ)_|>0 and _kΨ(λ)_k>0for all λ _∈[0, π]and continuously diﬀerentiable in any open set outside the origin. In addition, for allg= 1, ..., p+ 1, fgg−1/2(λ)

¯

¯η_g(λ)¯¯ is a non-zero finite vector, where η_g(λ) denotes the gth

row of diag(Ψ(λ), A(λ)) and fgg(λ) is the gth diagonal element of f(λ) = diag(diag(fxx(λ)), fuu(λ)).

C5 Rπ

−πkfxx(λ)fuu(λ)kdλ <∞and Extx0t=Σ>0.

We now comment on the conditions, which for the most part are similar to those used in Robinson (1995a, b,1998). The first three conditions are restricted in the linearity they impose but not otherwise. Part(a)of ConditionsC2andC3, together with thefirst part ofC1, indicate that the best linear predictor is the best predictor, in the least squares sense. The last part of Condition C1and part(b)of Condition

C2are presumably stronger than needed and some heterogeneity could be allowed, although it rules out heteroscedasticity. However, as the main motivation of the paper is to illustrate the possibility of avoiding the drawbacks of theM BBor subsampling bootstrap in a model like(1.1), we have preferred keeping them as they stand. Another motivation is due to the fact that these conditions allow us to consistently estimate the asymptotic covariance matrix of theOLS without resorting to any bandwidth or tuning parameter as Robinson(1988)proved. In particular that

e Ω= 4π 2 T [T /_X2]−1 j=1−[T /2] Ixx(λj)Iu_bub(λj) (2.1) is a consistent estimator of2πRπ −πfxx(λ)fuu(λ)dλ.

Moreover, ConditionsC1₋C3also imply thatE(xtutx0u0) =E(xtx0)E(utu0)

for allt, so that the spectral density ofxtutat frequency zero,fxu(0), is2πRπ

−πfxx(λ)fuu(λ)dλ.

It seems that for (2.1)to be a consistent estimator of fxu(0), the previous condition

is required. Another set of conditions for which E(xtutx0u0) = E(xtx0)E(utu0)

holds is ifxj and u0 are independent for all negative and nonnegative values of j or

that xt and ut are Gaussian andCov(xj, u0) = 0 for all j ≤0. In the terminology

of Engle, Hendry and Richard (1983), they imply strong exogeneity of xt or that xt

is predetermined respectively. So, we are implicitly assuming that the practitioner is willing to take seriously this assumption. However, as discussed by Robinson(1988), if this were not the case, then(2.1)would only converge to one of the components of the long run variance ofxtut, that isΩ, being the other component a function of the fourth cumulants. This component can be estimated using results in Taniguchi(1982)

or Keenan (1987), although under long-memory no results are available. It should be mentioned that in this case, the proposed bootstrap is not valid. But, following ideas and arguments similar to those of Hall and Horowitz (1996), when analysing thet-statistic or theJ-statistic for overidentifying restrictions in aGM M framework, similar corrections should apply in our context. However, we do not proceed as that will involve technicalities and it may obscure the main message of the paper. That

(9)

is, that even with time series data, there are setups encountered in applied work for whichM BB/subsampling is not required. On the other hand, in some respects our conditions are quite general. For instance, it allows for long memory dependence for which the M BB has yet to be justified in a general setting as ours, although the latter algorithm is valid if xt and ut are strong mixing without assuming that xt is

strong exogenous oruthomoscedastic.

Condition C4 eﬀectively allows for a possible singularity of the spectral density matrix and function of xt and ut respectively to be at frequency zero, but smooth elsewhere. This is done merely for notational convenience. The results do not depend on this assumption and they follow similarly if the singularity(ies) were located at some other(s) frequency(ies). So, we allow xt and/or ut to be, possibly, a long-memory process which has attracted immense attention in recent years in econometric literature. An example of a (scalar) process whose spectral density function satisfies

C4is the Fractional Integrated Autoregressive Moving Average(F ARIM A)model, see Granger and Joyeux (1980) or Hosking (1981). The F ARIM A model has a spectral density function defined as

f(λ) = σ 2 ε 2π ¯ ¯1₋eiλ¯¯−d ¯ ¯Θ¡eiλ;θ¢¯¯2 |∆(eiλ_;_θ)_|2,0≤λ≤π, (2.2)

where 0 < C < _∞ and 0 _≤ d < 1/2, and where ∆ and Θ are the autoregressive and moving average polynomials respectively and such that they have no common roots and are outside the unit circle. An earlier example of a process exhibiting long-memory is the fractional Gaussian noise (f gn) process, introduced by Mandelbrot and Van Ness (1968), and whose autocorrelation structure is given by

γ_j= 1 2

³

|j₋1_|1+2d₋2_|j_|1+2d+_|j+ 1_|1+2d´. (2.3) It is known that both models(2.2)and (2.3)satisfy

f(λ)_∼Cλ−2d, asλ_→0 +. Thefirst part of ConditionC5, that isRπ

−πkfxx(λ)fuu(λ)kdλ <∞, seems to be

very mild and due to results in Robinson(1994)and Robinson and Hidalgo(1997), it also appears to be necessary and minimal for the central limit theorem of bβto hold true. This condition is satisfied if, for instance, ut follows a F ARIM A or an f gn

model with d=du and xt follows a model whose th element is aF ARIM Aor an f gn model with d = d satisfying the condition 2 (d+du) < 1for all = 1, ..., p. Alternatively,C5could have been written in terms of the autocovariance function of

xt and ut. C5could be replaced by assuming thatγ_u,jγ_x,j is summable under our framework. It entails that the spectral density function ofxtutto befinite and positive at frequency zero. It is important to note that this is implied by summability of

¯¯γ_u,jγ_x,j¯¯but not the other way round. A standard example is whenxtexhibits strong ”cyclical” behaviour as is the case with Gengebauer models. For the latter models, the autocovariance function behaves asCj2d−1cos (jλ)withλ >0and0_≤d < 1₂. It should be mentioned that if the possible singularities offxandfudo not coincide, then

d and du can take any value smaller than 1

2. Finally, the second part of Condition

C5is a standard condition on the regressors xt and needed for the identification of the parametersβ₀.

(10)

Theorem 2.1. Assuming C1-C5, asT _{→ ∞},

T1/2³βb₋β₀´_→d _N¡0,Σ−1ΩΣ−1¢.

We now focus on the properties of the bootstrap estimators bβ∗_i, i = 1,2, given in (1.9). For the latter to be valid, the resampling scheme must be such that the conditional distribution of T1/2³bβ∗_i ₋bβ´, for i = 1,2, given ³x

∼ 0_{, u} ∼ 0´0 _where _x ∼= (x1, ..., xT)0 and u ∼= (u1, ..., uT)

0_{, consistently estimates the limiting distribution of} T1/2³_βb₋_β 0 ´ , that is Pr ½ T1/2³βb∗_i ₋βb´_≤z¯¯_¯³x ∼ 0_{, u} ∼ 0´0 ¾ P →G(z), f or i= 1,2

where _G(z) is the probability distribution function of a _N¡0,Σ−1_ΩΣ−1¢ _random

variable, see Giné and Zinn(1989). Such a convergence will be denoted as “_→d∗”. To prove the validity of the bootstraps described in STEPS 1-4, we need to strengthen slightly ConditionC5.

C5’ R₋π_π_kfxx(λ)fuu(λ)_{k |}log (_|λ_|)_|dλ <_∞andExtx0_t=Σ>0. Theorem 2.2. Assuming C1-C4 and C5’, fori=1,2 asT _{→ ∞}

T1/2³bβ∗_i₋bβ´_→d∗_N¡0,Σ−1ΩΣ−1¢.

So Theorem 2 indicates that the bootstraps are consistent, and more importantly, that we can avoid the drawback of the M BB and the subsampling bootstrap, e.g. the choice of the block lengthb. Moreover, the results of Theorem 2 will allow us the implementation of tests for the parametersβ₀, say, such as that given in(3.1)below. This will be addressed in the next section.

3. APPLICATIONS OF THE BOOTSTRAP

Suppose that we are interested in the following hypothesis testing

H₀: Cβ₀=c against H₁: Cβ₀₆=c, (3.1) where C is a full rank(r_×p)matrix andc is an(r_×1) vector. A common method to testH0 is based on theF−statistic

F=T³Cβb₋c´0³CΦbC0´−1³Cbβ₋c´, (3.2) whereΦb =Σb−1_Ωe_Σb−1_{is a consistent estimator of}_Φ₌_{AsyV ar}³_T1/2³_βb₋_β

0 ´´ , with e Ωgiven in(2.1)andΣb= (2π)T−1PT−1 j=1 Ixx(λj).

Suppose that we want to bootstrap the testF in(3.2), being denoted byF∗_{. For}

F∗ _{to be asymptotically valid, it should satisfy two basic requirements. First, under}

(11)

norm, the conditional distribution ofF∗_{, given the data}³_x ∼

0_{, u}

∼

0´0_{, should converge in}

probability to the limiting distribution of F given in(3.2). The second requirement onF∗ _{is that, under}_H₁_,_F∗ _{should be bounded in probability. The latter is achieved}

once the bootstrap sample/model is obtained underH0.

More specifically, fori= 1,2, letbβ∗_i denote the bootstrap estimator ofβgiven in

(3.6)below. Let us introduceIu_b∗u_b∗,i(λj) =|wub∗,i(λj)|2, where w_bu∗,i(λj) =wy∗,i(λj)−bβ

∗0

i wx(λj), j= 1, ...,[T /2],

withwy∗,i(λj)as defined in(3.4)below. Then the bootstrap testF∗ is defined asF

in(3.2)but withβbandΦb =Σb−1_Ω_e_Σ_b−1_{being replaced by their bootstrap counterparts}

b

β∗_i andΦb∗i =Σb−1Ωb∗iΣb−1respectively, and where b Ω∗i = 4π2 T [T /_X2]−1 j=1−[T /2] Ixx(λj)Iu_b∗_bu∗,i(λj), i= 1,2. (3.3)

We need to show that

PrnT1/2C³βb_i∗₋eβ´_≤z¯¯¯³x ∼, u∼ ´o _P → lim T_→∞Pr n T1/2C³βb₋β₀´_≤zo

whenCβ₀=c+D/T1/2 _{for any vector}_D_{with bounded norm, where}_βe _{is de}_fi_{ned in}

(3.5)below, and

PrnT1/2C³βb_i∗₋eβ´_≤z¯¯_¯³x

∼, u∼

´o

=Op∗(1)

whenCβ₀₆=c+D/T1/2 for all_|D_|_≤T1/2, that is underH1.

To achieve this goal, we proceed as in Mammen(1993)in the context ofiiddata. That is,

STEPS 1 AND 2 The same as those given in the introduction. STEP 3 Fori= 1,2, obtain the bootstrap regression

wy∗,i(λj) =eβ 0

wx(λj) +_|w_bu(λj)|η∗j,i j= 1, ...,[T /2], (3.4)

whereeβis the restricted least squares, that is

e

β=βb+ (X0X)−1C0³C(X0X)−1C0´−1³c₋Cβb´. (3.5) STEP 4 Compute the bootstrap counterpart of bβ as the OLS estimator from the

regression modelwy∗,i(λj)onwx(λj), that is

b β∗_i =   [T /2] X j=1 Ixx(λj)   −1 [T /2] X j=1 Re (Ixy∗,i(λj)), (3.6)

(12)

STEP 5 Fori= 1,2, the bootstrap test is given by Fi∗ =ϕ∗ ³ T1/2C³βb∗i −βe ´´ , whereΦb∗

i =Σb−1Ωb∗iΣb−1 withΩb∗i given in(3.3) andϕ∗(a) =a0 ³

CΦb∗

iC0 ´−1

a. STEP 6 For i = 1,2, reject H0 if F ≥ c∗₍₁₋α),i, where c∗(1−α),i is the (1−α)th

quantile of the bootstrap distribution ofF∗

i, that is Prnϕ∗³T1/2C³βb∗_i ₋eβ´´_≤z¯¯_¯³x

∼, u∼

´o

.

Theorem 3.1. AssumingC1₋C4 andC50, underH0∪H1 in(3.1),fori=1,2 as T _{→ ∞}

F_i∗d_→∗χ2_r,

where r =rank(C).

The result of Theorem 3 allows us to implement the (bootstrap) test as follows. Denote ϕ(a) =a0³_C_Φb_C0´−1_a_{and let}_cf

T,(1−α)and c a (1−α)be such that Prn¯¯_¯ϕ³T1/2³Cbβ₋c´´¯¯_¯> cf_T,₍₁ −α) o =α and lim T_→∞Pr n¯¯ ¯ϕ³T1/2³Cbβ₋c´´¯¯_¯> ca₍₁₋_α₎o=α, respectively. Then, Theorem1indicates that cf_T,₍₁

−α)→c a (1−α), whereas Theorem 3 indicates thatc∗₍₁ −α),i satisfiesc∗(1−α),i P →ca₍₁ −α) where Prnϕ³T1/2C³bβ∗_i ₋eβ´´> c∗₍₁₋_α₎_,i¯¯_¯³x ∼, u∼ ´o =α, fori= 1,2. (Observe thatCeβ=c.)

However, since the finite sample distribution of T1/2_C³_β_b∗

i −βe ´

, and thus that of ϕ∗³_T1/2_C³_βb∗

i −eβ ´´

, is not available, c∗

(1−α),i is approximated as accurately as

desired by a standard Monte-Carlo simulation algorithm. To that end, consider the bootstrap samples η∗()₌³_η∗()

1 , ..., η∗ () [T /2]

´0

for= 1, ..., B, and compute bβ∗₁()as in

(3.6)for each. Then,c∗

(1−α),1 is approximated by the valuec∗

B (1−α),1 that satisfies 1 B B X =1 I µ ϕ µ T1/2C µ b β∗₁()₋βe ¶¶ > c∗₍₁B₋_α₎_,₁ ¶ =α,

whereas for the second bootstrap, consider the bootstrap samplesu_e∗()=³u_e₁∗(), ...,u_e∗_T()´0

for = 1, ..., B, and compute βb∗₂() as in (3.6) for each . Then, c∗₍₁₋_α₎_,₂ is approxi-mated by the valuec∗B

(1−α),2 that satisfies 1 B B X =1 I µ ϕ µ T1/2C µ b β∗₂()₋βe ¶¶ > c∗₍₁B₋_α₎_,₂ ¶ =α.

(13)

4. MONTE CARLO SIMULATION

In order to investigate how well the bootstrap estimators βb∗_i, fori = 1,2, perform infinite samples, a small Monte Carlo experiment was conducted. In(1.1) we took

µ₀=β₀= 1. We have considered two sets of experiments. In thefirst set,xtand/or ut are long-memory whereas in the second set both xt and ut are strong-mixing

sequences. The main motivation to consider the second set of experiments is to assess the relative performance of bβ∗i to the M BB in a framework for which theoretical

results are available for the latter bootstrap. So that a more ”fair” comparison is given between the two methods. All throughout the experiments, we have calculated the size of the tests for the null hypothesisH0: β0= 1. To that end, we have employed

5000 replications with samples sizes T = 32,64 and 128. In order to calculate the bootstrap for all the models and sample sizes considered,2000bootstrap samples were employed, that is we have chosenB= 2000.

In thefirst set of simulations, two groups of six models were generated. Specifically, for thefirst group,xtwas generated as a fractional Gaussian noise process(2.3)with parameterdx= 0.3and0.45respectively. On the other hand,utwas generated as an

AR(1) process

ut=ρut₋1+εt (4.1)

withρ= 0.0,0.5and0.9respectively, withεtbeing a sequence ofiid_N(0,1)random variables. For the second group, xt was generated as a fractional Gaussian noise process(2.3)with parameterdx= 0.3and0.4respectively, whereasutwas generated

by(4.1)withεtas a fractional Gaussian noise process(2.3)with parameterdu= 0.15

and 0.05respectively, so thatdu+dx= 0.45 in the latter case. Bothxtandεt were

generated using an algorithm due to Davies and Harte (1987).

In addition, to assess the relativefinite sample performance of the bootstrap test, we have compared the results with those obtained using(a)theχ2

1 limiting

distribu-tion of

F =T1/2³Cbβ₋c´ ³CΣb−1ΩeΣb−1C0´−

1

T1/2³Cbβ₋c´ (4.2) where Ωe was defined in (2.1), and (b) the M BB with four diﬀerent block-lengths

b, namely b = 2,4,8 and 16. It is worth noting that Ωe is the frequency domain counterpart of the more natural estimator ofΩ, Ωb =PT_r₌₁−1₋_T_bγ_x,rbγ_u,r, where_bγ_x,r = T−1P

t(r)xetxet0+r and bγu,r = T−1 P

t(r)ubtubt+r. Following Robinson (1998), under C1₋C5, bothΩb andΩe are consistent estimators ofΩ.

For the first set of experiments, the results for the first and second group of models are displayed in TABLES 4.1 and 4.2 respectively, where ASYdenotes the results of the test using (4.2), whereas BOOT1 and BOOT2 are those using the bootstrap approaches fori= 1,2respectively. Finally, the results for theM BBare only given for T = 128and they are displayed inTABLE 4.3. The M BBalgorithm was implemented as follows. Given theOLSresidualsu_b1, ...,uTb , we considered blocks

of size b, Ub = (_bu, ...,_bu+b₋1)0, = 1, ...,[T /b]. Then, a random sample from Ub ,

= 1, ...,[T /b], was obtained and thus a bootstrap sampleUb∗ = (_bu∗₁, ...,u_b∗_T)0. Given

b

U∗, we compute theDF T, denotedw_bu∗(λj), and the bootstrap regression wy∗(λj) =βe

0

wx(λj) +wub∗(λj) j= 1, ...,[T /2]

(14)

regression model as

y_t∗=βe0_ext+_bu∗_t, t= 1, ..., T, and compute theOLS in the latter.

TABLES 4.1 TO 4.3 ABOUT HERE

The overall picture ofTABLES 4.1 to 4.3 indicates that inferences based on the proposed bootstraps work substantially better than inferences based on the asymp-totic χ2

1 distribution of theOLS estimator βb and with Ω being estimated by (2.1)

or those inferences based on the M BB. It seems that the performance of BOOT1

tends to be better than BOOT2 for larger sample sizes, whereas BOOT2 tends to perform better than BOOT1 for smaller sample sizes, although the diﬀerence is not very substantial. At the1%level, the empirical sizes of the bootstrap tests are some-what disappointing compared to that obtained using the limiting distribution of F. However, as the sample size increases, the performance of theBOOT1 and BOOT2

tend to be similar to that of theASY at the1%significance level. On the other hand, at the5%and10%levels, theBOOT1 andBOOT2 clearly dominate the performance of theASY and even for sample sizes as small as32, the performance of theBOOT1

and BOOT2 tests are outstanding. Finally, when comparing BOOT1and BOOT2

with the M BB, the proposed bootstraps clearly outperform the M BB, indicating that the former ones possess betterfinite sample properties than the latter.

For our second set of experiments, we have considered six models. Specifically,

xtandutwere generated as AR(1)with Gaussian innovations and parametersρx= 0.5,0.9andρu= 0.0,0.5,0.9respectively. The results forBOOT1,BOOT2 andASY

are displayed inTABLE 4.4, whereas those using theM BBare inTABLE 4.5 below. TABLES 4.4 AND 4.5 ABOUT HERE

The first conclusion we can draw from TABLES 4.4 and 4.5 is that they are qualitatively similar to those obtained for the first set of experiments. When we compare the performance of the bootstrapsBOOT1andBOOT2 toASY we observe that there is substantial gain by using the bootstrap approach, specially for very small sample sizes, being the gain more noticeable the higher the autocorrelation of

xtbecomes. Moreover, it seems that the autocorrelation of the regressors have a more negative impact than that from the errors. In fact, it appears that the performance of the bootstrap is very uniform across the diﬀerent correlation levels ofut. Finally, as it

happened with thefirst set of experiments, the results for theM BBare far worst than those obtained forBOOT1andBOOT2. Also, the results for theM BBappear to be in agreement with those obtained elsewhere, see for instance Fitzenberger(1988).

5. PROOFS

5.1. Proof of Theorem 1

The proof of the theorem will be split into two steps. Assuming that the integrable functiong(λ)satisfies the same conditions offxx(λ)or fuu(λ)in C4, in Proposition

1a Central Limit Theorem is achieved for the weighted cross-periodogram ofεtandξt

with weights g1/2(λj). On the other hand, in Proposition 2 we show that once g(λ)

is identified asfxx(λ)fuu(λ), the average cross-periodogram ofxt andutconverges in distribution to a normal random variable. For notational convenience, in the proof

(15)

of the next two results we will assume, without loss of generality, thatxt is a scalar sequence of random variables.

Proposition 1. Letg(λ)be a symmetric integrable function in[₋π, π]which satisfies the same conditions offxx(λ)andfuu(λ)inC4. Assuming C1-C3, asT → ∞,

2π T1/2 T_X−1 j=1 g1/2(λj)wξ(−λj)wε(λj) d →N µ 0,(2π)−1σ2ξσ2ε Z π −π g(λ)dλ ¶ , (5.1) where σ2

ξ andσ2ε denote the variances ofξt andεt respectively.

Proof. The left side of(5.1)can be written as

1 T1/2 T X t=1 εt Ã_XT s=1 ξ_sht₋s ! where ht= 1 T T_X₋1 j=1 g1/2(λj)eitλj _(5.2)

suppressing any reference to T in the definition of ht. (5.1) is implied if the con-vergence holds true conditional on _{ξ_t} and we establish the latter. Denote dt = T−1/2PT

s=1ξsht₋s. Then following Scott (1973), see also Robinson and Hidalgo (1997), the proposition is proven if, asT _{→ ∞},

(a) T X t=1 d2_tE¡ε2_t_|Ft₋1∪Gt¢ P →(2π)−1σ2_ξσ2_ε Z π −π g(λ)dλ (5.3)

and for any ψ >0,

(b) E Ã T X t=1 d2_tE³ε2_t_I³_|dtεt_|2> ψ´_|{ξ_t}´ ! →0, (5.4)

where_I(A)denotes the indicator function of the setA. In fact, we can see that this is aCLT applied todtεt, a martingale diﬀerence sequence with respect to theσ-algebra

generated by_{ξ_i, i= 1, ..., T;εj, j= 1, ..., t−1}.

We begin with(a). ByC2, the left side of(5.3)is

σ2ε T X t=1 d2t = σ2ε T T X t=1 Ã_XT s=1 ξsht−s !2 , (5.5)

whosefirst moment is, byC3,

σ2_εσ2_ξ T T X t=1 T X s=1 h2_t₋_s=σ 2 εσ2ξ T3 T X t,s=1 T_X₋1 j,k=1 g1/2(λj)g1/2(λk)ei(t−s)(λj−λ_k) = σ 2 εσ2ξ T T_X₋1 j=1 g(λj)

(16)

sincePTt=1e

it(λ_j₋λ_k)_{= 0}_unless_j ₌_k _{in which case it is}_T_{. But, Lemma 6.}₁_implies

that the right side of the last displayed equation converges to(2π)−1σ2_ξσ2_εRπ

−πg(λ)dλ.

Next, the second moment of the right side of(5.5) is

σ4ε T2 T X t,r=1 E   Ã_XT s=1 ξ_sht₋s !2Ã _T X s=1 ξ_shr₋s !2  = σ 4 εσ4ξ T2 T X t,r=1 Ã_XT s=1 h2_t₋_s ! Ã_XT s=1 h2_r₋_s ! +2σ 4 εσ4ξ T2 T X t,r=1 Ã_XT s=1 ht₋shr₋s !2 (5.6) +σ 4 εcum(ξs, ξs, ξs, ξs) T2 T X t,r,s=1 h2t₋sh2r₋s,

byC3. Thefirst term on the right of(5.6) is

Ã σ2 εσ2ξ T T X t=1 Ã T X s=1 h2_t₋_s !!2

which converges to the square of(2π)−1σ2

ξσ2ε Rπ

−πg(λ)dλ, proceeding as we did with

thefirst term of(5.5). So, to complete the proof of part (a), it suﬃces to show that the last two terms on the right of(5.6)converge to0, since the latter will imply that the second moment of(5.5)converges to the square of thefirst moment and so(5.5)

converges in probability to the right of(5.3)by Markov inequality.

The third term on the right of(5.6)is, using the definition ofhtin(5.2), propor-tional to 1 T6 T X t,r,s=1 T_X₋1 j,k,,q=1 g1/2(λj)g1/2(λk)g1/2(λ)g1/2(λq)ei((t−s)(λj−λ_k)−(r₋s)(λ₋λ_q)) = 1 T2 T X t,r,s=1 t=r=s  1 T T_X₋1 j=1 g1/2(λj)   4 + 3 T3 T X t,r,s=1 t₆=r=s    1 T T_X₋1 j=1 g(λj)     1 T2 T_X₋1 ,q=1 g1/2(λ)g1/2(λq)     + 1 T4 T X t,r,s=1 t₆=r₆=s   1 T2 T_X₋1 ,q=1 g(λ)g(λq)   which is bounded by D T     1 T T_X₋1 j=1 g1/2(λj)   4 +  1 T T_X₋1 j=1 g1/2(λj)   2 1 T T_X₋1 j=1 g(λj)  +  1 T T_X₋1 j=1 g(λj)   2  . But the last displayed expression isO¡T−1¢since Lemma 6.1implies thatT−1PT_j₌₁−1g(λj) = O(1) and Schwarz inequality that ¯¯_¯T−1PT−1

j=1 g1

/2_(λ_j)¯¯

¯2 ≤ T−1PT−1

(17)

second term on the right of(5.6), proceeding as with the third, is easily shown to be

O¡T−1¢also and so it is omitted. This concludes the proof of part (a). Next, we show part(b). For anyδ >0, the left side of(5.4)is bounded by

E "_XT t=1 d2_tE¡ε2_t_I¡ε2_t > ψ/δ¢_|{ξ_t}¢ # + Prnmax t d 2 t > δ o . (5.7)

The first term of (5.7) can be made arbitrarily small by choosing δ small enough, from the uniform integrability ofε2

t. (Recall that a suﬃcient condition for the latter

property is thatE_|εt_|2+κ< Dfor some κ >0.) Next,

max t d 2 t = max_t T−1 ¯ ¯ ¯ ¯ ¯ T X s=1 ξ_sht₋s ¯ ¯ ¯ ¯ ¯ 2 ≤  D T2 T X t=1 ¯ ¯ ¯ ¯ ¯ T X s=1 ξ_sht₋s ¯ ¯ ¯ ¯ ¯ 4  1/2

because for any sequence φ_t,(maxt_|φ_t|)2= maxtφ2_t _≤PT_t₌₁φ2_t. But

1 T2 T X t=1 E ¯ ¯ ¯ ¯ ¯ T X s=1 ξ_sht₋s ¯ ¯ ¯ ¯ ¯ 4 = 1 T2 T X t=1 T X s,r,q,v=1 E¡ξ_sξ_rξ_qξ_vht₋sht₋rht₋qht₋v ¢ = 3σ 4 ξ T2 T X t=1 Ã T X s=1 h2_t₋_s !2 +cum(ξs, ξs, ξs, ξs) T2 T X t=1 Ã T X s=1 h4_t₋_s ! = O¡T−1¢

proceeding as with the proof of part(a), (cf. (5.6)). So,

E³max t d 2 t ´2 =O¡T−1¢.

Then by Markov inequality, the second term of (5.7) is o(1), which completes the

proof of part(b)and the proposition. ¤

Proposition 2. Assuming C1-C5, asT _{→ ∞}, 1 T1/2 T_X−1 j=1 Ixu(λj)→d N ³ 0,(2π)−2Ω´.

Proof. Since by C4,fxx(λ)fuu(λ) has the same properties ofg(λ) in Proposition

1, then by symmetry it suﬃces to show that

1 T1/2 [_XT /2] j=1 f_xx,j1/2f_uu,j1/2 Ã wx,jwu,₋j f_xx,j1/2f_uu,j1/2 −(2π) wξ,j σξ wε,₋j σε ! P →0,

which is the case, as we now show and where henceforth for a generic function h(λ),

hj denotes h(λj). The left side of the last displayed expression is, after standard calculations, 1 T1/2 [_XT /2] j=1 f_xx,j1/2f_uu,j1/2 Ã wx,j f_xx,j1/2 − (2π)1/2wξ,j σξ ! Ã wu,₋j f_uu,j1/2 − (2π)1/2wε,−j σε !

(18)

+(2π) 1/2 T1/2 [T /2] X j=1 f_xx,j1/2f_uu,j1/2 Ã wx,j f_xx,j1/2 −(2π) 1/2wξ,j σξ ! wε,₋j σε (5.8) +(2π) 1/2 T1/2 [T /2] X j=1 f_xx,j1/2f_uu,j1/2 wξ,j σξ Ã wu,₋j f_uu,j1/2 − (2π)1/2wε,−j σε ! .

First, observe that C4and integrability offuu(λ)fxx(λ)imply that

f_uu1/2(λ)f_xx1/2(λ)_≤Dλ−1/2_|log_|λ_||−(1+δ)/2 (5.9) for any arbitrarily smallδ >0, and by Robinson’s(1995a)Theorem 2 and its obvious extension to[0, π], see Lemma 4.4 of Giraitis et al. (2001),

E ¯ ¯ ¯ ¯ ¯ wx,j f_xx,j1/2 −(2π) 1/2wξ,j σξ ¯ ¯ ¯ ¯ ¯ 2 ≤Dlog_j j.

Then, by the Schwarz inequality, we obtain that the expectation of the first term of

(5.8)is bounded by

D

[_XT /2]

j=1

logj

j3/2_(log_T₋_log_j_{+ log 2π)}(1+δ)/2 =O

³

log−(1+δ)/2T´.

Next we examine the second term of (5.8). Because by C2, E(wε,jwε,₋k) = σ2_εI_(j₌_k)_{, we have that the second moment of that term is bounded by}

D1 T [T /2] X j=1 fxx,jfuu,j logj j ≤D [T /2] X j=1 logj

j2_(log_T₋_log_j_{+ log (2π))}1+δ

= O³log−(1+δ)T´,

by(5.9), so it is the third term of(5.8)proceeding as with the second term but using

C3instead ofC2. This concludes the proof of the proposition. ¤ Now we turn to the proof of Theorem 1, which follows by Proposition 2 and Cràmer’s Theorem because

1 T T_X₋1 j=1 Ixx,j = 1 2πT T_X₋1 t=1 (xt₋Ext) (xt₋Ext)0_→p (2π)−1Σ>0 (5.10) byC1andC3. ¤ 5.2. Proof of Theorem 2

The proof of this theorem is split into two propositions. Proposition 3 shows that the bootstrap second moment converges in probability to Φ, whereas Proposition 4 proves the Lindeberg’s condition. These two propositions will imply the statement of the theorem. Henceforth, we shall denote E∗(_·)as the bootstrap expectation, that is, for any random variable Y,E∗_(Y_{) =}_E³_Y¯¯_¯n_x

∼, u∼ o´ , where x ∼= (x1, ..., xT) 0 _and u ∼= (u1, ..., uT) 0_.

(19)

Proposition 3. Assuming C1₋C4 andC50, asT _{→ ∞}, fori=1,2

E∗³T1/2³bβ_i∗₋bβ´´2 _→P Φ. (5.11) Proof. By the definition ofbβ∗_i, we have that

T1/2³βb∗_i ₋βb´=  1 T T_X₋1 j=1 Ixx,j   −1 1 T1/2 T_X₋1 j=1 wx,j_|wu,j_b |η∗j,i.

Since fori= 1,η∗_j,₁is a sequence of uncorrelated distributed (0,1)random variables, it follows by standard manipulations that the left side of(5.11)is

 1 T T_X₋1 j=1 Ixx,j   −1 1 T T_X₋1 j=1 Ixx,jIubu,jb    1 T T_X₋1 j=1 Ixx,j   −1 . (5.12)

On the other hand, fori= 2, because conditional on the sample,u_e∗

t is a sequence

of independent identically distributed(0,1)random variables, it implies that

E∗¡η∗_j,₂η∗₋_k,₂¢ = 1 TE ∗ Ã _T X t=1 e u∗_te−itλj T X s=1 e u∗_seisλk ! = 1 T T X s=1 eit(λk−λ_j) = _I(j=k). So fori= 2, the left side of(5.11)is also(5.12).

To conclude the proof we need to show that(5.12)converges in probability toΦ. First, by Sluztky’s Theorem and(5.10), thefirst and third factors of(5.12)converge in probability to(2π)Σ−1. So, the proof is completed if the second factor of(5.12)

converges in probability to(2π)−1R₋π_πfxx(λ)fuu(λ)dλ. But that factor is

1 T T_X₋1 j=1 Ixx,jIuu,j+ 1 T T_X₋1 j=1

Ixx,j(Iu_bu,jb −Iuu,j), (5.13)

whose first term, underC1₋C4andC50_{, converges in}_L

1−norm, and thus in

prob-ability, to(2π)−1Rπ

−πfxx(λ)fuu(λ)dλ, see Robinson(1998). So, it remains to show

that the second term of(5.13)converges to zero in probability. Using that

b

ut=eut− ³

b

β₋β₀´0_ext,

the second term of(5.13)is

1 T T_X₋1 j=1 Ixx,j µ³ b β₋β₀´0Ixx,j ³ b β₋β₀´₋2³bβ₋β₀´0Re (Ixu,j) ¶ =op(1) (5.14)

(20)

Now, tr  1 T T_X₋1 j=1 I_xx,j2   = p X r=1    1 T T_X₋1 j=1 f_x2_r_x_r_,j Ã Ix_rx_r,j fx_rx_r,j − (2π)Iξrξr,j σ2 ξ_r !2 + 4π σ2_ξ rT T_X−1 j=1 f_x2 rx_r,j Ã Ix_rx_r,j fx_rx_r,j − (2π)Iξrξr,j σ2_ξ r ! Iξ rξr,j + 4π 2 σ4 ξ_rT T_X₋1 j=1 f_x2 rx_r,jIξ2_rξ_r,j   ,

whereIξ_rξ_r,j=wξ_r,jwξ0_r,₋j, andzrdenotes therthelement of the vectorz. But

pro-ceeding as in the proof of (4.8) in Robinson(1995b, pp. 1648-1651), the expectation of the right side of the last displayed equation is dominated by the expectation of the third term. ButE³I2

ξ_rξ_r,j ´

≤Dby Brockwell and Davis’s(1991)Proposition10.3.2. So, the expectation of the left side of the last displayed equation is bounded by

D p X r=1 1 T T_X₋1 j=1 f_x2 rx_r,j≤D 1 T T_X₋1 j=1 λ−_j2_|log_|λj||−2−2 δ =o(T) (5.15)

since byC4and integrability offxx(λ), for allr= 1, ..., p,fx_rx_r,j≤Dλ−j1|log|λ||−

1−δ

, for some δ > 0. The right side of (5.15) together with Theorem 1, that is that

³ b

β₋β₀´=Op ¡

T−1/2¢_{, implies that the}_fi_{rst term of}_(5.14)_is_o

p(1). Proceeding

simi-larly with the second term of(5.14)and observing thatRe (Ixu,j) = 2−1_(Ixu,j₊_Ixu,

−j),

we conclude that the second term of (5.14)is alsoop(1), which completes the proof

of the proposition. ¤

Proposition 4. Assuming C1-C4 and C5’, for allψ >0 asT _{→ ∞}, , fori =1,2 E∗ Ã 1 T T_X₋1 j=1 ° °¡wx,j|wu,jb |η∗j¢°° 2 I³T−1°°¡wx,j|wbu,j|η∗j,i¢°° 2 > ψ´ ! P →0. (5.16)

Proof. Wefirst examine supjT−1kIxx,jIbuu,jb kwhich, after standard inequalities, is bounded by p X r=1 sup j=1,...,[T /2] T−1fx_rx_r,jfuu,j ¯ ¯ ¯f_x−1/2 rx_r,jwxr,jf− 1/2 uu,j wu,−j ¯ ¯ ¯2 (5.17) + p X r=1 sup j=1,...,[T /2] T−1fxrxr,j ¯ ¯¯f_x−1/2 rx_r,jwxr,j ¯ ¯¯2 |Iu_bu,jb −Iuu,j|.

Next, by Robinson’s(1995a)Theorem 2 and an obvious extension to all

1_≤j_≤[T /2], see Lemma 4.4 of Giraitis et al. (2001),

E ¯ ¯ ¯ ¯fuu,j−1/2wu,j−(2π) 1/2wε,j σε ¯ ¯ ¯ ¯ 2 = O µ_log_j j ¶ and E ¯ ¯¯ ¯fx−_r1x/_r2,jwxr,j−(2π)1 /2wξr,j σξ_r ¯ ¯¯ ¯ 2 = O µ logj j ¶ forr= 1, ..., p.

(21)

So, using the inequalitysupt_|φt|

2

≤Pt|φt|

2

, we have for instance that, for all

r= 1, ..., p, E   sup j=1,...,_[T1/2_] 1 j ¯ ¯ ¯ ¯fx−_r1x/_r2,jwxr,j−(2π)1 /2wξr,j σξ_r ¯ ¯ ¯ ¯ 2  ≤ [T1/2] X j=1 1 jE ¯ ¯ ¯ ¯fx−_r1x/_r2,jwxr,j−(2π)1 /2wξr,j σξ_r ¯ ¯ ¯ ¯ 2 =O(1), (5.18) and E   sup j=[T1/2_]₊₁_,...,_[_{T /}_2] 1 j ¯ ¯¯ ¯fx−r1x/r2,jwxr,j−(2π)1 /2wξr,j σξ_r ¯ ¯¯ ¯ 2  ≤ [_XT /2] j=[T1/2_]₊₁ 1 jE ¯ ¯ ¯ ¯fx−_r1x/_r2,jwxr,j−(2π)1 /2wξr,j σξ_r ¯ ¯ ¯ ¯ 2 =O³T−1/2logT´. (5.19) On the other hand, by An et al. (1983),

sup j=1,...,[T /2] ³ (2π)σ−_ε2log−1T_|wε,j|2 ´ ≤1 a.s.. (5.20) Then, combining(5.18), (5.19)and(5.20), thefirst term of(5.17)is bounded by

D p X r=1 sup j=1,...,_[T1/2_] T−1jfx_rx_r,jfuu,j sup j=1,...,_[T1/2_] 1 j ¯¯ ¯f_x−1/2 rx_r,jwxr,jf−1 /2 uu,j wu,−j¯¯¯ 2 +D p X r=1 sup j=1+[T1/2_]_,...,_[_{T /}_2] T−1jfx_rx_r,jfuu,j sup j=1+[T1/2_]_,...,_[_{T /}_2] 1 j ¯ ¯ ¯f_x−1/2 rx_r,jwxr,jf− 1/2 uu,j wu,−j ¯ ¯ ¯2 ≤ D p X r=1 sup j=1,...,[T1/2_] T−1jfx_rx_r,jfuu,jlog2T +DT−1/2logT p X r=1 sup j=1+[T1/2_]_,...,_[_{T /}_2] T−1jfx_rx_r,jfuu,jlog2T ≤ Dlog−δT

in probability since integrability of _kfxx(λ)fuu(λ)_{k |}log (_|λ_|)_| implies that for some

δ >0,_kfxx,jfuu,j_{k |}log_|λj_||_≤Dλ−_j1_|log_|λj_||−1−δ, and hence_kfxx,jfuu,j_k_≤Dλ_j−1_|log_|λj_||−2−δ. On the other hand, because

Iu_bu,j_b −Iuu,j= ³

β₀₋βb´0Ixx,j³β₀₋βb´+ 2³β₀₋βb´0Re (Ixu,j), the second term of(5.17)is bounded by

sup j=1,...,[T /2] T−1fxx,j¯¯_¯f_xx,j−1/2wx,j¯¯_¯ 2µ°_° °β₀₋bβ°°_° 2 fxx,j¯¯_¯f_xx,j−1/2wx,j¯¯_¯ 2 +°°_°β₀₋βb_°°°f_xx,j1/2f_uu,j1/2 ¯¯_¯f_xx,j−1/2wx,jf− 1/2 uu,j wu,j¯¯¯ ´

(22)

which, by(5.18)and(5.20), is upper bounded by sup j=1,...,[T /2] T−1fxx,j µ°_° °β₀₋βb°°_° 2 fxx,j+°°_°β₀₋βb_°°°f_xx,j1/2f_uu,j1/2 ¶ log2T _≤Dlog−δT because °°°β₀₋βb°°° = Op ¡ T−1/2¢_, _k_f xx,jk≤ Dλ−j1|log|λj||−1− δ and °°°f_xx,j1/2f_uu,j1/2°°° _≤ Dλ−_j1/2_|log_|λj||−1− δ/2

by integrability of fxx(λ) and kfxx(λ)fuu(λ)k |log (|λ|)|

re-spectively. Thus, for i = 1,2 we conclude that the left side of (5.16) is bounded by

D T

T_X₋1

j=1

Ixx,jIuu,jE∗³¯¯η∗_j,i¯¯2_I³¯¯η_j,i∗ ¯¯2> ψlogδT´´. (5.21) However, conditional on the data, η∗_j,₁ is a sequence of iidrandom variables, so for

i= 1 (5.21)is E∗³¯¯η∗_j,₁¯¯2_I³¯¯η∗_j,₁¯¯2> ψlogδT´´D T T_X₋1 j=1 Ixx,jIuu,j,

which converge to zero in probability since δ >0andη∗

j,1 hasfinite second moments

and by Proposition 3 the second factor of the last displayed expression converges in probability to(2π)−1Ω<_∞.

On the other hand, fori= 2,(5.21)is bounded by

sup j E∗³¯¯η∗_j,₂¯¯2_I³¯¯η∗_j,₂¯¯2> ψlogδT´´  D T T_X₋1 j=1 Ixx,jIuu,j  ,

which converges to zero in probability for allδ >0as we now show. First, by Robinson

(1988), the second factor converges in probability to (2π)−1Ω< _∞. On the other hand, standard algebra implies that

E∗³¯¯η∗j,2 ¯ ¯2 I³¯¯η∗j,2 ¯ ¯2 > ψlogδT´´ _≤ D ψ2log2δTE ∗¯_¯η∗ j,2 ¯ ¯4 ≤ D ψ2log2δT    1 T T X t=1 u4_t+ Ã 1 T T X t=1 u2_t !2  .

But, by a well-know argument, see Stout’s (1974) Theorem 3.5.8,ut is also ergodic which implies that the right side of the last displayed inequality converges to zero because δ >0. So, we conclude that (5.21)_→P 0and the proof of the proposition. ¤ 5.3. Proof of Theorem 3

We only sketch the proof since it follows immediately from Theorem 2 and the con-tinuous mapping theorem. First, sinceT1/2³Cbβ∗i −c

´ =T1/2C³βb∗i −eβ ´ , Theorem 2 implies that T1/2C³bβ∗_i ₋eβ´_→d∗ _N(0, CΦC0)

(23)

fori= 1,2. On the other hand, proceeding as in the proof of(5.10),Σb_→P (2π)−1Σ>0

and ifΩe∗

i P∗

→Ωfori= 1,2, we obtain that

CΣb−1Ωe∗iΣb−1C0 P∗ →CΦC0. Then, the continuous mapping theorem implies that F∗

i d∗

→χ2

r, for i = 1,2. So, to

conclude the proof of the theorem, we need to show that Ωe∗

i P∗ → Ω. By definition, e Ω∗i = ¡

4π2¢T−1PT_j₌₁−1Ixx,jIu_b∗_bu∗,j. Next, because w_bu∗(λj) = wy∗(λj)−βb∗_iwx(λj) = ³eβ₋βb∗_i´wx(λj) +_|wu_b(λj)|η∗j,i, we obtain that e Ω∗i −Ωe = 4π2 T T_X₋1 j=1 Ixx,j µ³ b β∗_i ₋βe´0Ixx,j ³ b β∗_i ₋eβ´₋2³bβ∗_i ₋βe´0Re³_|wu_b(λj)|η∗j,iwx(λj) ´¶ +4π 2 T T_X₋1 j=1 Ixx,jI_bubu,j ³¯ ¯η∗ j,i ¯ ¯2 −1´.

Now proceeding as in the proof of (5.14), after we observe that by Theorem 2,

³ b

β∗_i ₋eβ´ = Op∗¡T−1/2¢, in particular that in Proposition 3 we have shown that E∗³T1/2³bβ∗i −eβ

´´2

=Op(1), thefirst term on the right of the last displayed equa-tion isop∗(1). The second term on the right is alsoop∗(1), by similar arguments and

that η∗_j,i, conditional on the data, is a zero mean sequence of uncorrelated random variables. Finally, the third term on the right of the last displayed equation. Because

b

ut=uet− ³

b

β₋β₀´_ext, this term can be written as ³ b β₋β₀´04π 2 T T_X₋1 j=1 I_xx,j2 ³¯¯η∗_j,i¯¯2₋1´ ³βb₋β₀´ −2³bβ₋β₀´0 4π 2 T T_X₋1 j=1 Ixx,jRe (Ixu,j) ³¯_¯η_∗ j,i ¯ ¯2 −1´+4π 2 T T_X₋1 j=1 Ixx,jIuu,j ³¯_¯η_∗ j,i ¯ ¯2 −1´. Clearly the first two terms are op∗(1), whereas the last term of the last displayed

expression is alsoop∗(1)using the same arguments as those employed to examine the

behaviour of (5.21), since conditional on the data, ¯¯η∗

j,i ¯ ¯2

−1 is a sequence of zero mean uncorrelated random variables. Now, conclude the proof since we have already

shown thatΩe _→P Ω, e.g. (5.12). ¤

6. A TECHNICAL LEMMA

Lemma 1. Let ϕ(λ) be an integrable function which satisfies (a) ϕ(λ)>0 for all

(24)

diﬀerentiable in any open set outside the origin. Then, asT _{→ ∞}, 1 T [T/2] X j=1 ϕ(λj)−(2π)−1 Z π 0 ϕ(λ)dλ=o(1).

Proof. The left side of the last displayed expression is

1 Tϕ(λπ)− 1 2π Z λ1 0 ϕ(λ)dλ+ 1 2π [T /2]−1 X j=1 Z λj<λ_≤λj+1 (ϕ(λj)₋ϕ(λ))dλ. (6.1)

Thefirst term of(6.1)is obviouslyo(1), whereas the second term of(6.1)is alsoo(1)

because by integrability ofϕ(λ), for any arbitrarily small δ >0,

Z λ1 0 ϕ(λ)dλ _≤ D Z λ1 0 λ−1_|logλ_|−1−δdλ ≤ D_|logλ1|− δ =o(1).

Next, the absolute value of the third term of(6.1)is bounded by

D [T /_X2]−1 j=1 Z λ_j<λ_≤λ_j+1 λ−j1|λj−λ|ϕ ¡ λ¢dλ

by the properties of ϕ(λ)and the mean value theorem where λ is an intermediate point in the interval (λj, λ). But the last displayed expression is bounded by

D [T /_X2]−1 j=1 1 jϕ(λj) Z λj<λ≤λj+1 dλ (6.2) sincesupλ_j<λ_≤λ_j+1|λj−λ|≤DT

−1_{and by}_(b)_and_(c)_,_sup

λ_j<λ_≤λ_j+1ϕ ¡

λ¢_≤Dϕ(λj). On the other hand, integrability ofϕ(λ)implies that

ϕ(λ)_≤Dλ−1_|log (λ)_|−1−δ

for any arbitrarily smallδ >0. So, we conclude that(6.2)is bounded by

D

[T /_X2]−1

j=1

1

j2_(log_T₋_log_j_{+ log (2π))}1+δ =o(1)

by standard arguments, which completes the proof. ¤

7. CONCLUSIONS

In a time series regression model, we have examined and discussed two similar boot-straps, which contrary to the moving blocks and subsampling bootboot-straps, do not require the choice of a tuning parameter, that is the block lengthb. The bootstraps are based on resampling from the discrete Fourier transform of the least squares resid-uals as in the original Efron’s bootstrap or from the residresid-uals themselves. So, we were

(25)

able to avoid the potential problem that the choice of the block length may have on

finite samples. Although we have examined the algorithms in the context of linear regression models, the approach can be extended to more general models of interest in econometrics. More specifically in the context of instrumental variable/GM M es-timators, which includes most of the available eses-timators, such as theOLS, GLS or

M LE.

To that end, let (yt, x0t)0be a sequence of random variables observed at times t= 1, ..., T, such that they follow the nonlinear modelut=ut(yt, xt;θ0), whereθ0is

p_×1parameter vector. Suppose, for illustration purposes, thatut(_·)can be written as

h(yt, γ) =f(xt, β) +ut. (7.1) A classical example of(7.1), and widely used, is the Box-Cox transformation, where

h(yt, γ) = yγ_t ₋1

γ ,

andθ=¡γ, β0¢0

In this context, under the same conditions specified in Section 2, we propose the following bootstrap procedure.

STEP 1 Let P(zt) be a set of valid instruments, and obtain the IV E or GM M

estimator ofθ=¡γ, β0¢0 defined as ³ b γ,bβ0´0 = bθ= arg min θ Ã T X t=1 ut(θ)P0(zt) ! Ã T X t=1 P(zt)P0(zt) !−1Ã _T X t=1 ut(θ)P(zt) ! = arg min θ   T_X₋1 j=1 wu(θ)(λj)wP(z)(λj)     T_X₋1 j=1 wP(z)(λj)wP(z)(λj)   −1 ×   T_X₋1 j=1 wu(θ)(λj)wP(z)(λj)  

wherewu(θ)(λj)and wP(z)(λj)are the discrete Fourier transforms of ut(θ) =

h(yt, γ)₋f(xt, β)andP(zt)respectively, andastands for transpose together with conjugation of a complex vectora. Next, obtain the residuals

b

ut=h(yt,_bγ)₋f³xt,bβ´, t= 1, ..., T. STEP 2 The same asSTEP 2 given in the introduction.

STEP 3 Fori= 1,2, using the discrete Riemann approximation of the identity

zt= µ_T 2π ¶1/2Z π −π wz(µ)eiµtdµ, t= 1, ..., T, compute the bootstrap residuals

b u∗_t,i= µ 2π T ¶1/2T_X₋1 j=1 eitλj_w b u(λj)η∗j,i, t= 1, ..., T

(26)

and the bootstrap observationsy_t,i∗ as

y∗_t,i=g_bγ ³

f³xt,bβ´+u_b∗_t,i´, t= 1, ..., T

wherega(z)satisfies thath(ga(z), a) =z, e.g. ga(z)is the inverse function of

h(z, a).

STEP 4 Fori= 1,2, compute the bootstrap estimator as

³ b γ∗i,bβ ∗0 i ´0 = bθ∗= arg min θ   T_X₋1 j=1 wu_b∗ i(θ)(λj)wP(z)(λj)     T_X₋1 j=1 wP(z)(λj)wP(z)(λj)   −1 ×   T_X₋1 j=1 wu_b∗ i(θ)(λj)wP(z)(λj)  , where wu_b∗

i(θ)(λj) denotes the discrete Fourier transform of bu∗t,i =h

¡ y∗ t,i, γ ¢ − f(xt, β).

We conjecture that the preceding bootstrap should be valid. The intuition comes from the observation that because w_b_u∗

i(bθ) = wbu(λj)η

∗

j,i, it implies that the

covari-ance structure of the data/model is preserved, so the asymptotic behaviour of the bootstrap estimates should be that ofbθ. The technical details of the bootstrap will be more involved than those in the context examined in the paper. However, in view of the available results for extreme estimators, it is expected that the results given in Section 2 should hold true in the previous context/model. This topic and how we can implement the bootstrap in the situation of heteroscedastic models will be examined in a future paper.

References

[1] An, H.-Z., Chen, Z.-G. and Hannan, E.J., 1983. The Maximum of the Peri-odogram,Journal of Multivariate Analysis 13, 383-400.

[2] Andrews, D.W.K.,1991. Heteroscedasticity and Autocorrelation Consistent Co-variance Matrix Estimation,Econometrica 59, 817-858.

[3] Brillinger, D.R., 1979. Confidence Intervals for the Crosscovariance Function,

Selecta Statistica Canadiana 5, 3-16.

[4] Brillinger, D.R.,1981. Time Series, Data Analysis and Theory (Holden-Day, San Francisco).

[5] Brockwell, P.J. and Davis, R.A., 1991. Time Series: Theory and Methods, (Springer-Verlag,New York).

[6] Carlstein, E.,1986. The use of Subseries Values for Estimating the Variance of a General Statistic from a Stationary Sequence,Annals of Statistics 14,1171-1179. [7] Dahlhaus, R. and Janas, D., 1996. A Frequency Domain Bootstrap for Ratio

(27)

[8] Davies, R.A. and Harte, D.S.,1987. Tests for Hurst Eﬀect,Biometrika74, 95-102. [9] Den Haan, W. and Levin A.,1997. A Practitioners Guide to Robust Covariance Matrix Estimation, in: Maddala, G.S. and Rao, C.R. eds.,Handbook of Statistics, Vol. 15 (North-Holland, Amsterdam) 299-342.

[10] Efron, B.,1979. Bootstrap Methods: Another Look at the Jackknife,Annals of Statistics 7,1-26.

[11] Engle, R.F., Hendry, D.F. and Richard, J-F., 1983. Exogeneity, Econometrica

51, 277-304.

[12] Fitzenberger, B.,1998. The Moving Blocks Bootstrap and Robust Inference for Linear Least Squares and Quantiles Regressions, Journal of Econometrics 82, 235-287.

[13] Franke, J. and Härdle, W.,1992. On Bootstrapping Kernel Spectral Estimates,

Annals of Statistics 20, 121-145.

[14] Giné, E. and Zinn, J.,1989. Necessary Conditions for the Bootstrap of the Mean,

Annals of Statistics 17, 684-691.

[15] Giraitis, L., Hidalgo, J. and Robinson, P.M., 2001. Gaussian Estimation of Para-metric Spectral Density with Unknown Pole,Annals of Statistics 29, 987-1023. [16] Granger, C.W.J. and Joyeux, R.,1980. An Introduction to Long Memory Time

Series and Fractional Diﬀerencing,Journal of Time Series Analysis 1, 424-438. [17] Hall, P. and Horowitz, J.L.,1996. Bootstrap Critical Values for Tests Based on

Generalized-Method-of-Moments Estimators,Econometrica 64, 891-916. [18] Hall, P., Horowitz, J.L. and Jing, B.-Y.,1995. On Blocking Rules for the

Boot-strap with Dependent Data,Biometrika 82, 561-574.

[19] Hannan, E.J.,1957. The Variance of the Mean of a Stationary Process, Journal of the Royal Statistical Society Series B 19, 282-285.

[20] Hannan, E.J., 1963. Regression for Time Series, in: M. Rosenblatt ed., Time Series Analysis, (Wiley, New York)17-37..

[21] Hannan, E.J.,1970. Multiple Time Series, (Wiley and Sons, New York).

[22] Hidalgo, J. and Kreiss, J.-P.,1999. Bootstrap Specification Tests for Covariance Stationary Processes, Preprint.

[23] Hosking, J.,1981. Fractional Diﬀerencing,Biometrika 68,165-176.

[24] Jowett, G.H., 1955. The Comparison of Means of Sets of Observations from Sections of Independent Stochastic Series,Journal of the Royal Statistical Society Series B 17, 208-227.

[25] Keenan, D.M., 1987. Limiting Behavior of Functionals of Higher-order Sample Cumulant Spectra,Annals of Statistics 15,134-151.

[26] Künsch, H.-R., 1989. The Jackknife and the Bootstrap for General Stationary Observations,Annals of Statistics 17,1217-1241.