Detecting for Smooth Structural Changes in GARCH Models

(1)

Detecting for Smooth Structural Changes in

GARCH Models

Bin Chen Department of Economics University of Rochester and Yongmiao Hong

Department of Economics and Statistical Science Cornell University,

Wang Yanan Institute for Studies in Economics (WISE) Xiamen University

June 2009

We would like to thank Zongwu Cai, Kajal Lahiri, Oliver Linton, Suhasini Subba Rao and seminar participants at 2009 New York Camp Econometrics in Lake Placid for their comments. All remaining errors are solely ours. Hong acknowledges support from the National Science Foundation of China and the Cheung Kong Visiting Scholarship by the Chinese Ministry of Education and Xiamen University. Correspondences: Bin Chen, Department of Economics, University of Rochester, Rochester, NY 14627; email: [email protected]; Yongmiao Hong, Department of Economics & Department of Statistical Science, Cornell University, Ithaca, NY 14850, USA, and Wang Yanan Institute for Studies in Economics (WISE), Xiamen University, Xiamen 361005, China; email: [email protected].

(2)

Detecting for Smooth Structural Changes in GARCH Models

Abstract

Detecting and modelling structural changes in GARCH processes have attracted increasing attention in time series econometrics. In this paper, we propose a new approach to testing struc-tural changes in GARCH models. The idea is to compare the log likelihoods of a time-varying parameter GARCH model and a constant parameter GARCH model, where the time-varying GARCH parameters are estimated by a local quasi-maximum likelihood estimator (QMLE) and the constant GARCH parameters are estimated by a standard QMLE. The test does not require any prior information about the alternatives of structural changes. It has an asymptotic N(0,1) distribution under the null hypothesis of parameter constancy and is consistent against a vast class of smooth structural changes as well as abrupt structural breaks with possibly unknown break points. A consistent parametric bootstrap is employed to provide a reliable inference in …nite samples and the simulation study highlights the merits of our approach.

JEL Classi…cations: C1, C4, E0.

(3)

1. INTRODUCTION

Since Engle’s (1982) seminal work, various ARCH and GARCH models have been commonly used to capture volatility dynamics of macroeconomic and …nancial time series. Underlying all these models is the key assumption of stationarity. Given the changing pace of the underlying economic mechanism and technological progress, modeling economic processes over a long time horizon under the stationarity assumption may not be suitable. It is plausible that structural changes may occur, causing the time series to deviate from stationarity. Indeed, various economic factors may lead to structural changes in economic time series. For example, one driving force for structural changes are “shocks” induced by institutional changes, such as changes of exchange rate systems from the …xed exchange rate mechanism to the ‡oating exchange rate mechanism, or the introduction of Euro. The prevalence of structural instability in macroeconomic and …nancial time series has been documented by numerous empirical studies. For example, Andreou and Ghysels (2002) examine the change-point hypothesis in volatility dynamics of international stock market indices and foreign exchange returns and …nd multiple breaks associated with the Asian and Russian …nancial crises; Mikosch and St¼aric¼a (2004) apply their goodness-of-…t test to the S&P500 returns and detect structural changes related to shifts of the unconditional variance. Model stability is crucial for statistical inference, forecasts, and sensible policy implications drawn from the model. In particular, ignoring structural changes in macroeconomic and …nancial time series may easily lead to spurious persistence in the conditional volatility dynamics. Diebold (1986) and Lamoureux and Lastrapes (1990) are among the …rst to suggest that structural changes unaccounted for can yield Integrated GARCH or long memory e¤ects. More recently, Mikosch and St¼aric¼a (2004) and Hillebrand (2005) provide some theoretical explanation for this phenomenon. The spurious IGARCH e¤ects imply that shocks have a permanent impact on volatility and so current information remains relevant when forecasting the conditional variance for all horizons. In contrast, for the short memory volatility process, shocks to variance decay quickly over time. Moreover, model instability may a¤ect asset allocation or lead to large errors in pricing, hedging and managing risk. Pettenuzzo and Timmerman (2005) show that the possibility of future breaks has its largest e¤ect at long investment horizons, but historical breaks can signi…cantly change investment decisions even at short horizons through its e¤ect on current parameter estimates.

Some tests have been proposed to detect structural breaks in GARCH models in the literature. For example, Chu (1995) considers a supremum Lagrange multiplier (LM) test for GARCH models. Berkes, Gombay, Horvath and Kokoszka (2004) develop a sequential likelihood-ratio (LR) test for monitoring parameter constancy of the GARCH model. The test is more informative than any sequential cumulative sum (CUSUM) test performed on the observed returns process or residual transformations. It is, however, computationally intensive as it involves the calculation

(4)

of quasi-likelihood scores. Kulperger and Yu (2005) derive the properties of structural break tests based on the partial sums of squared estimated standardized residuals of GARCH models. These tests all consider one-time shift as the alternative so they may not have good power against multiple breaks.

Almost all existing change-point tests for GARCH models are constructed for abrupt changes. To our knowledge, the only exception is Amado and Terasvirta (2008), who consider testing for:: a smooth transition type time-varying structure of GARCH models. Smooth changes may be more realistic because volatility usually evolves over time in a continuous manner and volatility jumps are rare. Empirical evidence shows that various economic events, such as liberalization of emerging markets, integration of world equity markets, changes in exchange rate or interest rate regimes, may lead to structural changes in volatility models. The changes induced by policy switch, preference changes and technology progress usually exhibit evolutionary changes in the long term. In general, as Hansen (2001) points out, “it may seem unlikely that a structural break could be immediate and might seem more reasonable to allow a structural change to take a period of time to take e¤ect”. In particular, volatility is a measure of risk and it may take time for the market to achieve some consensus.

Recently, time-varying parameter ARCH and GARCH models have appeared as a novel tool to capture the evolutionary behavior of economic time series. For example, Amado and Terasvirta:: (2008) propose both additive and multiplicative time-varying GARCH models. They introduce a smooth transition function that allows all parameters to change smoothly over time. Parametric speci…cations for time-varying parameters lead to more e¢ cient estimation if the underlying coe¢ cient functions are correctly speci…ed. However, economic theories usually do not suggest any concrete functional form for time-varying parameters; the choice of a functional form is somewhat arbitrary. Engle and Rangel (2006) assume that the variance of the process of interest can be decomposed into stationary and nonstationary components, where the nonstationary component is modeled using spline functions of time and the stationary component follows a GARCH process. Dahlhaus and Subba Rao (2006) and Fryzlewicz, Sapatinas and Subba Rao (2008) introduce a time-varying parameter ARCH process for modeling the evolutionary behavior of volatility. The model is asymptotically locally stationary in the neighborhood of each point of time but is globally nonstationary. One advantage of this evolutionary time-varying parameter ARCH model is that little restriction is imposed on the functional forms of ARCH coe¢ cients, except for the regularity condition that they evolve over time smoothly.

Motivated by the ‡exibility of the smooth time-varying parameter ARCH model and the pop-ularity of GARCH models in practice, we will …rst generalize the smooth time-varying parameter ARCH models to a class of smooth time-varying GARCH models and derive the consistency and asymptotic normality of a local QMLE for time-varying GARCH parameters in both interior and

(5)

boundary regions. We then use the time-varying GARCH(p; q) model as the alternative to test smooth structural changes and sudden structural breaks for a GARCH model. We emphasize that unlike the case of stationary GARCH(p; q) models, a time-varying GARCH(p; q) model is not included as a special case in the time-varying ARCH(1) class and therefore the asymptotic analysis is much more involved. Thus, while the main focus of this paper is on testing structural changes of GARCH parameters, our results on local QMLE of time-varying GARCH parameters may have its own independent interest. Moreover, we study the asymptotic properties of the local QMLE of time-varying GARCH parameters in both interior and boundary regions. We …nd that the asymptotic biases of the local QMLE in the interior and boundary regions have di¤erent convergence rates, and a simple boundary-correction will make the bias in the boundary region vanishes at the same rate as in the interior region. All existing works on time-varying ARCH processes (e.g., Dahlhaus and Subba Rao 2006, Fryzlewicz et al. 2008) focus on estimation in the interior region.

This paper proposes a class of consistent tests for smooth structural changes as well as abrupt structural breaks in GARCH parameters with known or unknown change points. The idea is to estimate the smooth time-varying GARCH parameters by a local QMLE and compare them with the standard QMLE for constant GARCH parameters. Compared with the existing tests for structural breaks in GARCH models in the literature, the proposed tests have a number of appealing features.

First, the proposed tests are consistent against a large class of smooth time-varying parameter alternatives. They are also consistent against multiple sudden structural breaks in GARCH models with known or unknown break points.

Second, no prior information on a structural change GARCH alternative is needed. In par-ticular, we do not need to know whether the structural changes are smooth or abrupt, and in the cases of abrupt structural breaks, we do not need to know the dates or the number of breaks.

Third, unlike most tests for structural breaks in GARCH models in the literature, which often have nonstandard asymptotic distributions, the proposed tests have a null asymptotic N(0,1) distribution. The only inputs required are the log likelihoods of QMLE and local QMLE. Any standard econometric software can carry out computational implementation easily.

Fourth, the local QMLE can capture the local behavior of time-varying parameters. Because only local information is employed in estimating parameters at each time point, the proposed tests have symmetric power against structural breaks that occur either in the …rst or second half of the sample period. This is di¤erent from some existing tests (e.g., CUSUM tests) that have di¤erent powers against structural breaks that have same sizes but occur at di¤erent time points. Fifth, unlike some existing tests for structural breaks in GARCH models, no trimming proce-dure is needed for the proposed tests. Thus, the proposed tests are expected to have non-trival

(6)

powers for structural changes near the boundary regions of time, provided that the sample size is large enough. Moreover, the local QMLE for the conditional variance parameters can provide insight into the volatility dynamics.

In Section 2, we introduce the time-varying GARCH framework and hypotheses of interest. Section 3 proposes a local QMLE for the time-varying parameters in GARCH models, and establishes its consistency and asymptotic normality for both interior and boundary regions. Section 4 develops a likelihood ratio testing approach and construct the test statistic. Section 5 derives its asymptotic null distribution and investigates its asymptotic power property. In Section 6, a simulation study is conducted to examine the …nite sample performance of the test via a parametric bootstrap procedure, which is shown to be consistent. Section 7 provides concluding remarks. All mathematical proofs are collected in the appendix. A GAUSS code to implement the proposed tests is available from the authors upon request. Throughout the paper, C denotes a generic bounded constant.

2. TIMY-VARYING GARCH MODEL AND HYPOTHESES OF INTEREST Consider the following data generating process (DGP)

8 > > < > > : Xt= p h0 t"t; h0 t = 00t+ Pp i=1 0 ith0t i+ Pq j=1 0 jtXt j2 ; f"tg i.i.d.(0,1), (2.1)

where Xtis a stochastic time series process, the 0jtand 0

jt are possibly time-varying parameters,

t_{is the index of time, p and q are the orders of the GARCH process, and f"}tg is an i:i:d: sequence

of standardized innovations with mean 0 and variance 1. Let 0_t be the collection of parameters; namely, 0t = 00t; 01t; :::; 0pt; 0 1t; :::; 0 qt 0 ;a (p + q + 1)-dimensional vector.

The above setup nests both constant parameter GARCH and time-varying GARCH processes. For example, if 0_t is not changing over time, we have a constant parameter GARCH(p; q) process, whose asymptotic properties have been studied by Berkes, Horvath and Kokoszka (2003). Lee and Hansen (1994) and Lumsdaine (1996) also establish the asymptotic theory of the QMLE of the GARCH(1,1) model when 0_t is a constant.

For time-varying parameter GARCH processes, one example is the single break in the GARCH

model ₈ > > > > < > > > > : Xt= p h0 t"t; h0 t = ( 0 0+ Pp i=1 0 ih0t i+ Pq j=1 0 jXt j2 ; if t ; 00 0 + Pp i=1 00i h0t i+ Pq j=1 00j Xt j2 ; otherwise. f"tg i.i.d.(0,1), (2.2)

(7)

where is called the break point. Chu (1998) and Kulperger and Yu (2005) have used this model as an alternative to study the parameter constancy of GARCH models when is unknown.

Another example of varying GARCH processes is the smooth transition type time-varying GARCH models proposed by Amado and Terasvirta (2008). They consider both additive:: and multiplicative models, where the time-varying components are included in the conventional GARCH models in di¤erent forms; namely

8 > > < > > : Xt= p h0 t"t; h0 t = 00+ Pp i=1 0 ih0t i+ Pq j=1 0 jXt j2 + 001+ Pp i=1 0 i1h0t i+ Pq j=1 0 j1Xt j2 G (t) ; f"tg i.i.d.(0,1), (2.3) and ₈ > > < > > : Xt= p h0 t"t; h0_t = 0₀+Pp_i=1 0_ih0_{t i}+Pq_j=1 0_jX_{t j}2 [1 + G (t)] ; f"tg i.i.d.(0,1), (2.4)

where G(t) is a smooth transition function of time. A choice of G(t) is a logistic function, namely G (t) = _{f1 + exp [} (t c)]_g 1;

where c and are scalar parameters governing the threshold and speed of transition.

To cover a wide range of possibilities, we do not assume any parametric functional form for

t: Instead, we assume that t is an unknown smooth function of time in form of 0

t = 0

t

T ;

where 0 : [0; 1]! R(p+q+1) is a vector-valued smooth function: The parameter 0t changes over

time but in an evolutionary manner. The DGP in (2.1) becomes the following time-varying GARCH process: 8 > > < > > : Xt= p h0 t"t; h0 t = 00 t T + Pp i=1 0 i t T ht i+ Pq j=1 0 j t T X 2 t j; f"tg i.i.d.(0,1): (2.5)

The DGP in (2.5) includes time-varying ARCH(q) processes (Dahlhaus and Subba Rao 2006, Fryzlewicz et al. 2008) as a special case when 0

i( t

T) = 0 for all t; i = 1; :::; p: In this paper,

we consider GARCH models in (2.5) because GARCH models are more ‡exible than ARCH models in capturing the volatility dynamics and are more parsimonious than ARCH models.

(8)

Parsimonious GARCH models are attractive in estimating and forecasting volatilities.

The speci…cation that parameter 0( )is a function of ratio t=T rather than time t only is a

common scaling scheme in the time series literature (see, e.g., Robinson 1989, Phillips and Hansen 1990, Dahlhaus and Subba Rao 2006 and Cai 2007). It might …rst appear a bit strange because the time-varying parameter t depends on the sample size T: The reason for this requirement is

that a nonparametric estimator for twill not be consistent unless the amount of data on which it

depends increases, and merely increasing the sample size will not necessarily improve estimation of t at some …xed point t; even if some smoothness condition is imposed on t: The amount of

local information must increase suitably if the variance and bias of a nonparametric estimator of t are to decrease suitably. A convenient way to achieve this is to regard t as ordinates of

smooth function 0( ) on an equally spaced grid over [0; 1]; which becomes …ner as T ! 1; and

then consider estimation of 0( ) at …xed points : See Robinson (1989) for more discussion in

a linear regression context.

A keen interest here is whether the parameter 0t is changing over time. The null hypothesis

is

H0 : 0t = for some constant vector 2 and for all t;

where is a (p + q + 1) dimensional parameter space of t:

Under H0;the DGP in (2.1) is a standard GARCH process with constant parameter . The

unknown constant parameter vector could be consistently estimated by the following QMLE ^ = arg max 2 1 T T X s=1 ls( ) ; (2.6)

where ls( ) is the likelihood function; namely

ls( ) = 1 2 log hs( ) + X2 s hs( ) ; hs( ) = 0( ) + 1 X j=1 j( ) X 2 s j:

(9)

the functions _j(u) ; 0 j <_{1; can be de…ned by recursion. If q} p; then 8 > > > > > > > > > > > > > > > < > > > > > > > > > > > > > > > : 0( ) = 0= 1 1+ ::: + q ; 1( ) = 1; 2( ) = 2+ 1 1( ) ; ::: p( ) = p+ 1 p 1( ) + ::: + p 1 1( ) ; p+1( ) = 1 p( ) + ::: + p 1( ) ; ::: q( ) = 1 q 1( ) + ::: q 1 1( ) ; (2.7)

and if q < p; the preceding equations are replaced with 8 > > > > > > > > > > > > < > > > > > > > > > > > > : 0( ) = 0= 1 1 + ::: + q ; 1( ) = 1; 2( ) = 2+ 1 1( ) ; ::: q+1( ) = q+1+ 1 q( ) + ::: + q 1( ) ; ::: p( ) = ap+ 1 p 1( ) + ::: q p q( ) : (2.8)

In general, if j > max (p; q) ; then

j( ) = 1 j 1( ) + 2 j 2( ) + ::: + q j q( ) : (2.9)

In practice, we observe only X1; :::; XT and the logarithm of the likelihood function in (2.6)

cannot be computed from the observed data, and so ^ is infeasible. Hence, we replace ls( )with

ls( ) = 1 2 log hs( ) + X2 s hs( ) ; (2.10) where hs( ) = 0( ) + s 1 X j=1 j( ) X 2 s j;

and the feasible QMLE is given by

= arg max 2 1 T T X t=1 ls( ) : (2.11)

(10)

Among many others, Berkes et al. (2003) establish the consistency and asymptotic normality of both ^ and _{under H}0;and Lee and Hansen (1994) and Lumsdaine (1996) derive the asymptotic

properties of QMLE for the GARCH(1,1) model under H0: Heuristically, j( ) decays

exponen-tially so that replacing ls( ) with ls( ) has asymptotically negligible impact and ^ and have

the same asymptotic distribution.

The alternative hypothesis HA is that H0 is false. Under HA; t is time varying with an

unknown form. Examples include the GARCH model with a single break in (2.2) or multi-ple breaks with known or unknown break points, Amado and Terasvirta’s (2008) time-varying:: smooth transition GARCH models in (2.3) and (2.4), Dahlhaus and Subba Rao’s (2006) time-varying ARCH(q) models, and the more general time-time-varying GARCH(p; q) model in (2.5). We allow for a …nite number of abrupt changes, smooth changes and mixtures of them under our alternative hypothesis, which covers a rather wide range of alternatives.

All existing tests for structural changes in GARCH models in the literature consider a para-metric alternative of structural changes. For example, Chu (1995) considers a supremum LM test to check parameter constancy against a single break in the GARCH model in (2.2). Amado and Terasvirta (2008) use a LM test against smooth transition GARCH alternatives in (2.3):: and (2.4). Both tests specify certain parametric alternatives, and they have best power against the assumed alternatives. However, no prior information about the true alternative is usually available for practitioners. Our main objective in this paper is to develop a consistent test for H0 against HA;using a new approach.

In a linear regression framework, Chen and Hong (2008) propose generalized Chow and gen-eralized Hausman tests for smooth structural changes as well as abrupt structural breaks with known or unknown change points in regression models. The idea is to estimate the smooth time-varying parameters by local linear smoothing and compare them with the OLS constant parameter estimator via sums of squared residuals and …tted values respectively. These tests are not applicable to test structural changes in GARCH models as GARCH models require di¤er-ent estimation methods and use di¤erdi¤er-ent criterion functions. In this paper, we shall compare a constant parameter GARCH model with a time-varying parameter GARCH model via the quasi-likelihood criterion. Naturally, our test can be applied to check structural changes in an ARCH model, as it is just a special case. We note that no such a test is available in the literature, even for the time-varying parameter ARCH model, although Dahlhaus and Subba Rao (2006) point out "from a practical point of view, one could evaluate the sum of squared deviations between the kernel-QML estimator at each time point and the global QML estimator. We conjecture that the asymptotic distribution under the null hypothesis of stationarity is a chi-square". Our quasi-likelihood ratio test is rather natural and computationally simple, because log-likelihood values are the outputs of estimations.

(11)

To introduce our test, below we …rst extend Dahlhaus and Subba Rao’s (2006) results on time-varying ARCH models and discuss how to estimate time-varying GARCH models by a local QMLE. Asymptotic properties of the local QMLE of time-varying GARCH(p; q) parameters for both interior and boundary regions may have independent interests since no asymptotic results are available in the literature.

3. ESTIMATION OF TIME-VARYING GARCH COEFFICIENTS

Unrestricted nonstationarity may entail so much arbitrariness in the time dependent behavior of a process that it may be impossible to develop a meaningful asymptotic theory. When a process is changing over time smoothly, increasing the number of observations over time does not necessarily imply an increase in information. For example, one cannot expect an ensemble average to be consistently estimated by the corresponding temporal average. To avoid pathological cases arising from extreme nonstationarity, we impose some restrictions on the process to control the extent of the deviations from stationarity. A natural way of doing so is to embed a stationary structure on the process in the some neighborhood of each time point. This idea is similar to the idea that underlies the nonparametric technique of …tting a line locally to a nonlinear curve. In this case a smoothness condition on the curve is required to validate the approach. Likewise in the present case, the imposition of local stationarity involves the use of a smoothness constraint on the evolution of the nonstationary processes. A rigorous de…nition of local stationarity is introduced by Dahlhaus (1996a, 1996b, 1997) who imposes a smoothness condition in terms of the components in the spectral representation of the process. Heuristically, one can say that a process is locally stationary if the law of motion is smoothly time-varying. Thus a locally stationary process behaves like a stationary process in the neighborhood of each instant in time but has a global nonstationary behavior.

Here, the smoothness of the parameter function 0( )guarantees that the time-varying GARCH

process displays a locally stationary behavior. In order to study the asymptotic properties of the time-varying GARCH process Xtin (2.5), we introduce the stationary GARCH process

n ~ Xt(u)

o associated with the time-varying GARCH process at the …xed point u 2 [0; 1] as follows:

8 > > < > > : ~ Xt(u) = q ~ h0 t(u)"t; ~ h0 t(u) = 00(u) + Pp i=1 0 i (u) ~h0t i(u) + Pq j=1 0 j(u) ~Xt j2 (u) ; f"tg i.i.d.(0,1), t = 1; :::; T; (3.1)

where all coe¢ cients depend on the …xed point u but do not depend on time t: It has been shown in the literature that X2

t admits a time-varying state space representation

(12)

degree of the approximation depends on the rescaling factor T and the deviation _Tt u : This is formally stated in Lemma A.1 in the appendix.

The hypothetical process in (3.1) is a stationary GARCH process at a given point u 2 [0; 1] and thus has a unique representation (Berkes et al. 2003)

~ h0_t(u) = ₀(u) + 1 X j=1 j(u) ~X 2 t j(u) (3.2)

for all t with probability one under certain regularity conditions. The functions j are given in

(2.7)-(2.9), but here they are associated with a given point u 2 [0; 1]: Let be the compact set

= ( = 0; 1; :::; p; 1; :::; q 0 2 Rp+q+1: p X i=1 i+ q X j=1 j < 1; 0 < min 0; 1; :::; p; 1; :::; q max 0; 1; :::; p; 1; :::; q < 1 :

For each u 2 [0; 1]; we assume that uis an interior point in ;where u = ( 00(u) ; 01(u) ; :::; 0p(u) ; 0

1(u) ; :::; 0

q(u))0:Under HA;the local QMLE to estimate t is given by

^ t= arg max 2 Lt( ) = arg max2 1 T T X s=1 kstls( ) ; (3.3) where ls( ) = 1 2 log hs( ) + X2 s hs( ) ; hs( ) = 0( ) + 1 X j=1 j( ) X 2 s j; kst = 1 bk s t T b ;

the kernel k : [ 1; 1] ! R+is a prespeci…ed symmetric bounded probability density, and b b(T ) is a bandwidth such that b ! 0 and T b ! 1 as T ! 1. For notational simplicity, we have suppressed the dependence of kst on the sample size T and the bandwidth b: Examples of k( )

include the uniform kernel

k(u) = 1

21(juj 1); (3.4)

the Epanechniov kernel

k(u) = 3 4(1 u

2₎₁₍

(13)

and the quartic kernel

k(u) = 15 16(1 u

2

)21(_juj 1); (3.6)

where 1( ) is the indicator function.

The functions _j( ) ; 0 j < _{1; are de…ned in (2.7)-(2.9). The estimator ^}t in (3.3)

is regarded as an estimator of t = 00 Tt ; 0 1 Tt ; :::; 0 p Tt ; 0 1 Tt ; :::; 0 q Tt 0 or of u = ( 0

0(u) ; 01(u) ; :::; 0p(u) ; 0 1(u) ; :::; 0 q(u))0 where u Tt < 1 T:

In the derivation of the asymptotic properties of ^t; we rely on the local approximation of

X2

t by the stationary process ~Xt2(u) de…ned in (3.1) for t=T close to u. We de…ne the locally

weighted likelihood of ~X2 t (u) as ~ L (u; ) = 1 T T X s=1 kst~ls(u; ) ; (3.7) where u t T < 1 T and ~ ls(u; ) = 1 2 " log ~hs(u; ) + ~ Xs(u) 2 ~ hs(u; ) # ; ~ hs(u; ) = 0( ) + 1 X j=1 j( ) ~Xs j2 (u) :

It is shown in the appendix that Lt( ) and ~L (u; ) become arbitrarily close to each other, and

both converge in probability to

L (u; ) = DEh~l0(u; )

i

(3.8) as T ! 1; b ! 0; T b ! 1; u Tt <

1

T; where D = 1; when T bT bc t bT bc; where

bT bc denotes the integer part of T b; D = k1c

R1

ck(u)du; when t = [cbT ] or T [cbT ]; where

1 > c > 0: It is easy to see that L (u; ) is maximized by u: By applying the extreme estimator

lemma (e.g., Amemyia 1985, Theorem 4.1.1), we can establish the consistency of the estimator ^

t: Speci…cally, if b ! 0 and T b ! 1 as T ! 1; we have

^ t !P u for u 2 [0; 1] and t T u < 1 T:

Similar to (2.6), Lt( )cannot be computed with the observed sample fXtg T t=1;so we have to replace Lt( ) with Lt( ) = 1 T T X s=1 kstls( ) ; (3.9)

(14)

where ls( ) = 1 2 log hs( ) + X_s2 hs( ) ; hs( ) = 0( ) + s 1 X j=1 j( ) X 2 s j:

Then we de…ne the feasible local QMLE

t= arg max

2 Lt( ) : (3.10)

In order to derive the asymptotic properties of t; we impose the following regularity

condi-tions.

Assumption A.1: The 0

j( ) and 0

j( ) are continuous on [0; 1] except for a …xed number

of points in [0; 1] : For each continuity point u 2 [0; 1] ; there exists a 2 (0; 1] and a constant K <_{1; such that j} 0 j(u) 0j(v)j Kju vj and j 0 j(u) 0 j(v)j Kju vj ; where v 2 N"(u) ;

a small neighborhood containing u:

Assumption A.2: The (p + q + 1)-dimensional parameter space

= ( = 0; 1; :::; p; 1; :::; q 0 2 Rp+q+1: p X i=1 i + q X j=1 j < 1 ; for some > 0 0 < min 0; 1; :::; p; 1; :::; q max 0; 1; :::; p; 1; :::; q < 1

is a compact set. For each u 2 [0; 1]; u 2 Int( ) ; where u = ( 00(u) ; 01(u) ; :::; 0p(u) ; 0

1(u) ; :::; 0 q(u))0:

Assumption A.3: The polynomials 01(u) x + 02(u) x2 + ::: + 0p(u) xp and 1 0 1(u) x 0

2(u) x2 ::: 0

q(u) xq are coprimes on the set of polynomials with real coe¢ cients for some

given 0 u 1.

Assumption A.4: _{The standardized innovation f"}tg is an i.i.d.(0; 1) sequence satisfying limr!0r P ("2t r) =

0 for some > 0; and with (i) E "t4(1+ ) <1 for some > 0 or (ii) E ("12t ) <1:

Assumption A.5: _{The kernel function k : [ 1; 1] ! R}+ _{is a symmetric bounded probability}

density function.

Assumption A.6: The bandwidth b = cT (0 < c < _{1) with either (i) 0 <} < 1 or (ii) 1=13 < < 1:

Assumption A.7: Except for a …xed number of points on [0; 1], the 0

j(u) and 0

j(u) are three

(15)

C for j = 1; :::; q; and l = 1; 2; 3; where C is some bounded constant independent of j and l: Assumption A.1 imposes the Lipschitz continuity of 0

j( ) and 0

j( ) ;but we allow for a

set of a …xed number of points where 0j ( ) and 0

j( ) are discontinuous: Assumptions A.1-A.2

provide su¢ cient conditions that the stochastic process fXtg admits a time-varying state space

representation as Xt = At t T Xt 1+ Bt t T ; where Xt = ht; :::; ht q+1; Xt 12 ; :::; Xt p+12 0 ; Bt( ) = [ 00( ) ; 0; :::; 0] 0 2 Rp+q 1 _{and A} t( ) is a

(p + q 1) (p + q 1) matrix. Assumptions A.2-A.4(i) guarantee that the parameter t can

be uniquely identi…ed. These are standard assumptions imposed by Berkes et al. (2003, 2004) and Kulperger and Yu (2005), among many others, for the case of constant parameters.

Assumption A.5 implies R1₁k(u)du = 1; R1₁uk(u)du = 0 and R1₁u2_{k(u)du <}

1: All ex-amples in Section 2 satisfy this assumption. It is possible to use kernel functions with in…nite support, such as the Gaussian kernel k(u) = p1

2 exp( 1 2u

2₎_for

1 < u < 1. However, we only use kernel functions with bounded support to simplify asymptotic analysis. Assumptions A.4(ii), A.6(ii) and A.7 are only used in Theorems 2 and 3 below. Assumptions A.4(ii) and A.6(ii) are used to derive the close form of the asymptotic bias and Assumption A.7 is to guarantee that the asymptotic bias and variance of the local QMLE are well-de…ned. Similar assumptions have been imposed in Dahlhaus and Subba Rao (2006).

We …rst state the consistency of t in (3.10):

Theorem 1: Suppose Assumptions A.1-A.3, A.4(i), A.5 and A.6(i) hold. If u t T <

1

T where

u_{2 [0; 1]; we have} t ^t!P 0 and t !P u as T ! 1:

Theorem 1 holds even if "t is not i:i:d:N (0; 1): This generalizes Dahlhaus and Subba Rao

(2006), who consider time-varying ARCH processes. We note that unlike the case of stationary GARCH(p; q) models, a time-varying GARCH(p; q) model is not included in the time-varying ARCH(1) class. Furthermore, di¤erent from the time-varying ARCH process, the Volterra expansions of a time-varying GARCH process are rather tedious. Here, we rely on a stochastic recurrent relation (see, e.g., Bougerol and Picard 1992 and Subba Rao 2006) to show that the time-varying parameter GARCH process can be locally approximated by a stationary GARCH process indexed by u 2 [0; 1].

Next, we derive the asymptotic normality of t:

Theorem 2: Suppose Assumptions A.1-A.3, A.4(ii), A.5, A.6(ii), A.7 hold and u _Tt < _T1; where u2 [0; 1] is a continuity point for the coe¢ cient functions 0

j( ) and 0 j( ):

(16)

(i) If t is in the interior region in the sense that bT bc < t < T bT bc; we have p T b t u !dN p T bBu; k2 2 + 1 I (u) 1 ; as T ! 1; where k2 = R1 1k 2_(x)dx; _{= E("}4 t) 3; I (u) = E h @2_L( u;u) @ @ 0 i ; and Bu = 1 2b 2_{I (u)} 1 @3L ( u; u) @ @u2 Z 1 1 x2k(x)dx:

(ii) If t is in the left boundary in the sense that t = bcTbc where 0 c 1; we have p T b t u !dN p T bBul; k2ck1c2 2 + 1 I (u) 1 ; as T ! 1; where k2c = R1 ck 2_{(x)dx; k} 1c = R1 ck (x) dx; Bul = bk1c1I (u) 1 @2L ( u; u) @ @u Z 1 c xk (x) dx + 1 2b 2_k 1 1cI (u) 1 @3L ( u; u) @ @u2 Z 1 c x2k(x)dx;

and and I(u) are de…ned in (i).

(iii) If t is in the right boundary in the sense that t = T _{bcTbc where 0} c 1; we have p T b t u !d N p T bBur; k2ck1c2 ₂ + 1 I (u) 1 ; as T ! 1; where Bur = bk1c1I (u) 1 @2L ( u; u) @ @u Z c 1 xk (x) dx + 1 2b 2_k 1 1c I (u) 1@3L ( u; u) @ @u2 Z c 1 x2k(x)dx

and k2c; k1c; and I(u) are de…ned in (i) and (ii).

We can view as the excess kurtosis of "t; which measures the departure from the assumed

normality in the higher moment. If E"4

t = 3 as in the case of a normally distributed "t; then

the asymptotic variance of interior points can be simpli…ed to k2I (u) 1: The quantity I(u) can

be viewed as a local Hessian matrix, and Bu is the asymptotic bias, which is caused by the

time-varying property of GARCH parameters. From Theorem 2, we conjecture that for interior points, the asymptotic mean square error (AMSE) is given by

AMSE t = b4 4 Z 1 1 x2k(x)dx 2 I (u) 1 @ 3_{L (} u; u) @ @u2 2 2 + k2 2 + 1 T b trace I (u) 1 :

By minimizing the AMSE( t), we could obtain the optimal bandwidth, which is of the order T

4 5

(17)

for interior points:

We observe that lim_c!1R1_ck2(x)dx = k2; limc!1

R1

ck(x)dx = 1; limc!1

R1

cxk (x) dx = 0and

lim_c!1R1_cx2k (x) dx =R1₁x2k (x) dx;and these limits are exactly the constant factors appearing in the asymptotic bias and variance for an interior point. Theorem 2 shows that although the local QMLE is consistent for both interior and boundary points, the asymptotic bias of the local QMLE has a slower convergence rate in the boundary regions than in the interior region. The asymptotic biases for time-varying parameters at interior points and boundary points are O (b2₎

and O (b) respectively. Therefore, the local QMLE su¤ers from boundary e¤ects. The main reason is that we do not have symmetric data available in the boundary regions. On the other hand, the asymptotic variances for interior points and boundary points have the same order of magnitude and the di¤erence is only a scale. The boundary problem of time-varying ARCH or GARCH models has never been studied as previous works (e.g., Dahlhaus and Subba Rao 2006) only focus on time-varying ARCH parameters at interior points. On the other hand, the boundary problem of time-varying linear regression models has been studied by Cai (2007) and Chen and Hong (2008).

To overcome the boundary problem of the local QMLE, we consider the re‡ection method, following Hall and Wehrly (1991). We re‡ect the data in the boundary regions, obtaining pseudo

data Xt = X t for bT bc t 2 and Xt = X2T t for T + 1 t T + bT bc: We use

the overall data (i.e., the union of the original data and the pseudo data) to estimate t in the

boundary regions: By construction, symmetric data are available in the original boundary regions [1; T b]_{[ [T} T b; T ]:This method has also been described as “re‡ection about the boundaries” or “boundary folding” by Schuster (1985), Silverman (1986) and Cline and Hart (1991) in a nonparametric density estimation framework and by Chen and Hong (2008) in a nonparametric regression estimation. Our re‡ection method has advantages over some alternative solutions to the boundary problem. One popular solution is to simply ignore the data in the boundary regions and use only the data in the interior region. Such trimming is simple, but it may lead to the loss of a signi…cant amount of information, even for fairly large sample sizes. For example, if b = (1=p12)T 15; where 1=p12 is the standard deviation of U (0; 1); which could be viewed as

the limiting distribution of the grid points _Tt; t = 1; :::; T; _{as T ! 1; then about 23%, 19%} and 17% of grid points of time will fall into the boundary regions when T = 100; 250 and 500 respectively.

(18)

We de…ne the boundary-corrected local QMLE c t = 8 > > > > > > > > < > > > > > > > > : arg max 2 1 T t+bT bc_X s= bT bc kstls( ) if t = bcT bc; 0 c 1; t if bT bc < t < T bT bc; arg max 2 1 T T +bT bc_X s=t bT bc kstls( ) if t = T bcT bc; 0 c 1; (3.11)

where ls( )and tare de…ned in (3.9) and (3.10) respectively. Note that the pseudo data are only

used to estimate t in the boundary regions. Now we derive the as the asymptotic distribution

of c_t in the boundary regions:

Theorem 3: Suppose Assumptions A.1-A.3, A.4(ii), A.5, A.6(ii), A.7 hold and u _Tt < _T1; where u2 [0; 1] is a continuity point for the 0

j( ) and 0

j( ): If t is in the left boundary or the

right boundary in the sense that t = bcT bc or T bcT bc; where 0 c 1; we have p T b c_t u !d N p T bBu; kb 2 + 1 I (u) 1 ; as T ! 1; where kb = k2+ R1

1k(x)k (x + 2c) dx; k2; ; I (u) and Bu are de…ned in Theorem 2

(i).

Theorem 3 shows that the re‡ection method reduces the asymptotic bias of the local QMLE at the boundary regions. Now, the asymptotic biases for interior points and boundary points are the same; namely, O (b2_{) :}_{That is because symmetric data are available in the original boundary}

regions [1; T b] [ [T T b; T ]by construction. On the other hand, the asymptotic variance of c_t at boundary points is larger than that at interior points since by construction, the pseudo data and the original data are correlated with each other. Note that the re‡ection method has no impact on the original interior region (T b; T T b): Heuristically, the re‡ection method can be viewed as using the boundary kernel _2b1 k s t_{T b} + k s+t_{T b} in the boundary regions and the usual kernel

1 bk

s t

T b in the interior region.

4. NONPARAMETRIC TESTING

We now propose a consistent test for smooth structural changes in GARCH models, which will complement the existing tests for sudden structural breaks and avoid the di¢ culty associated with the possibility of multiple breaks and/or unknown break dates. We note that the facts that the convergence rate of the asymptotic bias of the local QMLE in the boundary regions is slower than that of an interior point and the asymptotic variance of the local QMLE at a boundary point tends to be larger would complicate the form of test statistics to be constructed. The

(19)

…nite sample performance of the test may be a¤ected as well. Hence, we apply the re‡ection method in our testing procedure. We use the overall data (i.e., the union of the original data and the pseudo data) to construct the estimator ct as in (3.11): We note that under H0; 0t is

constant, so no bias exists even in the boundary regions. However, under HA; 0t is time-varying

and correcting the boundary problem is expected to help improve power. With the QMLE and the boundary-corrected local QMLE c_t at hand, we shall compare a constant parameter GARCH model with a time-varying parameter GARCH model via the quasi-likelihood criterion. We consider two cases for our test statistics depending on whether the standardized innovation "t is i:i:d:N (0; 1).

Case 1: f"tg i:i:d:N (0; 1) is correctly speci…ed.

Recall that under H0; we have a standard GARCH(p; q) model with constant parameter

which can be consistently estimated by the QMLE _{in (2.11). Under the alternative H}A; t= 0(t=T )is changing over time and can be consistently estimated by the boundary-corrected

local QMLE c_t: With c_t and ; we can construct a test using likelihood ratio. The idea is to compare the log likelihood of the unrestricted time-varying parameter GARCH model with that of the restricted constant parameter GARCH model. Intuitively, under H0;two likelihoods are close

to each other. Under HA; the nonparametric likelihood is larger than the parametric likelihood

when the sample size T is su¢ ciently large, giving the test its power against a wide range of alternatives. Let lU denote the log likelihood of the (unrestricted) time-varying parameter

GARCH model, that is,

lU = 1 T T X t=1 lt c t ; (4.1)

where ct is the boundary-corrected local QMLE in (3.11). Let lRdenote the log likelihood of the

(restricted) constant parameter GARCH model, that is,

lR = 1 T T X t=1 lt ; (4.2)

where is the QMLE in (2.11). It is important to note that lU and lR are averages of log

likelihoods of the original sample, namely, fXtgTt=1:The pseudo data augmented by the re‡ection

method are only used for estimating tvia c

t in the boundary regions. Hence it will not a¤ect the

asymptotic distribution of test statistics. Intuitively, the use of the pseudo data only has impact on the boundary regions [1; T b] [ [T T b; T ]and its cumulative e¤ect over the boundary regions is asymptotically negligible as T ! 1: However, it improves the …nite sample performance of

(20)

the proposed tests. Meanwhile, we de…ne that the score function St( ) @lt @ = 1 2 " 2 t 1 @ ln ht @ :

We note that under H0, St( ) is a martingale di¤erence sequence (MDS) no matter whether the

distribution of "t is correctly speci…ed or not. However, only under the correct distributional

speci…cation of "t;the information matrix equality holds, namely

E @St( ) @ = E h St( ) St( ) 0i I ( ) ; where Eh@St( ) @ i = Eh @2lt @ @ 0 i

is the Hessian matrix and I( ) is the information matrix. Our LR test statistic for H0 versus HA is based on the comparison of lU and lR :

LR1 = 2Tpb(lU lR) A^ p ^ B ; (4.3)

where the centering factor

^ A = b 1=2(p + q + 1) 8 < :2k(0) 1 T b bT bc X j= bT bc 1 jjj T k 2 j T b +b 2 41 1 T b bT bc X j= bT bc 1 jjj T k j T b Z 1 1 k j T b + 2u du 3 5 9 = ; = b 1=2(p + q + 1) 2k(0) Z 1 1 k2(u)du [1 + o(1)] ;

and the scaling factor ^ B = 4 (p + q + 1) 1 T b T 1 X j=1 1 j T 2k j T b Z 1 1 k (u) k u + j T b du = 4 (p + q + 1) Z 1 0 2k(v) Z 1 1 k(u)k(u + v)du 2 dv + o (1) :

Both ^Aand ^B do not depend on DGP and are nonstochastic, so they are convenient to compute. The log-likelihoods lU and lRare outputs of estimation. Many statistical programs provide values

of lU and lRautomatically. Hence, it is straightforward to compute the LR test statistic. In fact,

a consistent parametric bootstrap is even simpler: one only needs to compare the value of log-likelihood ratio lU lR based on the observed data with that based on bootstrap samples. There

(21)

Case 2: f"tg is nonnormally distributed.

There is a growing consensus that the standardized innovation f"tg may not be normal for

…nancial data. Some fat tailed distribution may …t data better. With possible distributional misspeci…cation of f"tg, the local QMLE remains consistent but not e¢ cient (see Theorems 1–

3). The score function St( ) is still a MDS under H0, but the information matrix equality does

not hold generally. In particular, E [St( ) St0( )] =

var ("2 t)

2 I ( ) 2 + 1 I ( ) ;

where = E("4_t) 3 measures the excess kurtosis of "t: It measures the departure from the

normality in the higher moment. We can construct a robust LR test

LR2 = 2Tpb(lU lR)= ^₂ + 1 A^ p ^ B ; (4.4)

where ^A and ^B are the same as in (4.3) and ^ = _T1 PT_t=1 ^"4_t 3 is a consistent estimator for excess kurtosis : The test statistic LR2 does not assume normality of f"tg:

We note that although our focus is to test whether 0_t = for some constant vector ₂ and for all t; our approach could be extended to test whether a subset of the parameters is constant; namely 01t = 1 for some constant vector 1 2 1 and for all t; where 1 is a subset of

and 0_t = ( 0_1t0; 0_2t0)0_: _{An example is that we are only interested in whether ARCH coe¢ cients are}

constant. There are two possibilities. We discuss the …rst case. If prior information restricts 0_2t to be some constant, it would be analogous to the "partial structural change" problem for the linear regression models where part of the regression coe¢ cients may not be subject to structural changes and the interest is to test the constancy of the other part of the regression coe¢ cients (see, e.g., Andrews, Lee and Ploberger 1996, Bai and Perron 1998). For this case, the null hypothesis is

H0 : 0t = for some constant vector 2 and for all t;

where is a parameter space of t; and the alternative hypothesis is

HA : 01t is time varying and 0

2t = 2 for some constant vector 2 2 2 for some t;

where 2 .

The second case is that no restriction is imposed on 0_2t:Hence the null hypothesis is

(22)

where 1 ; and the alternative hypothesis is

HA: 0t is time varying for some t:

An example of this case has been studied by Ang and Kristensen (2009) in a linear regression framework. They test whether the conditional alphas of CAPM models are constant over time while allowing the conditional betas to be time-varying.

As a subset of the parameters may be time-varying, we can adopt the local pro…le QMLE method. For the …rst case, under HA;we …rst …x 2 and estimate 1t by

1t = 1t( 2) = arg max 12 1 1 T T X s=1 kstls( 1j 2) ; (4.5) where ls( 1j 2) = 1 2 log hs( 1j 2) + X2 s hs( 1j 2) :

Next, we obtain an estimator ^2 of the constant component 2 by substituting the local QMLE 1t into the likelihood function; namely

^ 2 = arg max 22 2 1 T T X t=1 lt 2j 1t ; (4.6) where lt 2j 1t = 1 2 " log ht 2j 1t + X2 t ht 2j 1t # :

Iterations between these two steps have to be employed until a certain convergence criterion is met (see, e.g., Speckman 1988, Carroll, Fan, Gijbels and Wand 1997, and Fan and Huang 2005 for details on the pro…le likelihood or pro…le least-squares method). With proper estimators and [ 1t; ^

0

2]0 at hand, we can compare two models via the likelihood criterion.

The second case can be handled in a similar way, except that the local pro…le QMLE and local QMLE have to be applied under H0 and HA respectively. In this paper, we shall focus on

hypotheses of interest introduced in section 2.

5. ASYMPTOTIC PROPERTIES We now state the asymptotic distribution of LR1 and LR2 under H0:

Theorem 4: _{Suppose Assumptions A.2, A.3, A.4(i), A.5, and A.6(i) hold. (i) Under H}0;

LR2 d

! N(0; 1) as T ! 1: (ii) If in addition "t N (0; 1); then LR1 d

! N(0; 1) under H0 as

(23)

Both LR1 and LR2 have a convenient null asymptotic N(0,1) distribution. This is quite

appealing in light of the facts that most existing tests for structural breaks in GARCH models have nonstandard distributions which may depend on the DGP. The proposed tests do not require formulation of an alternative and are applicable when one has no prior information of the alternative. Moreover, the new tests do not require trimming data (i.e., we test all u 2 [0; 1] rather than restrict u to be a strict subset of [0; 1] ; as usually done in existing tests).

We require that b ! 0 and T b ! 1 as implied by Assumption A.6(i): This is the standard condition for bandwidth b and it covers the conjectured optimal rate b / T 15:As an important

feature of LR1 and LR2; the use of the QMLE in place of the true parameter under H0 has

no impact on the limit distribution of LR1 and LR2:Intuitively, the parametric estimator

con-verges to at apT-rate, which is faster than the nonparametric estimator c_t:Consequently, the asymptotic distributions of LR1 and LR2 are solely determined by the nonparametric estimator

c

t and are nuisance parameter free.

In small samples, the distribution of LR1 and LR2 may not be well approximated by the

asymptotic N(0,1) distribution. Accurate …nite sample critical values can be obtained by using a parametric bootstrap procedure, which we shall discuss and justify in Section 6.

To investigate the asymptotic power properties of LR1 and LR2 under HA; we state the

following theorem.

Theorem 5: Suppose Assumptions A.1-A.3, A.4(i), A.5, and A.6(i) hold. Then for any non-stochastic sequence fMT = o(T

p

b)_{g; we have Pr(LR}1 > MT) ! 1 and Pr(LR2 > MT) ! 1

under HA as T ! 1:

Assumption A.1 allows for both smooth structural changes and abrupt structural breaks with known or unknown break points. We permit 0( )to have a …xed number of discontinuities.

Hence, single structural break and multiple breaks with known or unknown break points, which are often considered in this literature, are included as special cases of (2.2). For example, suppose

0( )is a jump function, namely,

0( ) = ( ₁ 0; 11; :::; 1p; 1 1; :::; 1 q 0 ; if 0; 2 0; 21; :::; 2p; 2 1; :::; 2 q 0 ; otherwise:

Then we obtain the single break GARCH alternative considered in Chu (1995) when 0 is

un-known.

Theorem 5 suggests that the LR tests are consistent against all alternatives to H0, subject

to a set of regularity conditions (i:e:; Assumption A.1). Thus, the proposed test will be able to detect any structural changes in GARCH models as long as the sample size T is su¢ ciently large. This is appealing in light of the fact that no prior information about the alternative of structural

(24)

changes is available in practice. It avoids the blindness of searching for possible alternatives of structural changes in practice. Moreover, as we do not use any trimming procedure, that is, we test all u in the interval [0; 1] rather than a strict subset of it; our tests can detect structural changes that occur near the boundary of [0; 1]; provided that the sample size T is su¢ ciently large and the bandwidth b is su¢ ciently small. And, unlike some existing tests in the literature (e.g., CUSUM test), our tests have symmetric power against structural breaks that occur either in the …rst or second half of the sample period.

6. FINITE SAMPLE PERFORMANCE 6.1 Parametric bootstrap

Theorem 4 provides the null asymptotic N (0; 1) distribution of the LR tests: Thus, one can implement our tests for H0 by comparing LR1 or LR2 with a N (0; 1) critical value, which is

rather convenient in practice. However, like many other nonparametric tests in the literature, the sizes of LR1 and LR2 in …nite samples may di¤er signi…cantly from the prespeci…ed asymptotic

signi…cance level. Therefore, we shall consider a parametric bootstrap procedure in this section. To unify Cases 1 and 2 in Section 4, we consider the general robust statistic

f LR = ^ Q= ^₂ + 1 A^ p ^ B ; (6.1)

where ^Q = 2Tpb(lU lR)and ^Aand ^B are de…ned in (4.3), and propose the following parametric

bootstrap procedure:

Step (i): Obtain a bootstrap sample X fXtgTt=1from the estimated null GARCH model;

Step (ii): Estimate the null model using the bootstrap sample X , and compute a bootstrap statistic fLR in the same way as fLR; _{with X replacing the original sample X =fX}tgTt=1;

Step (iii): Repeat steps (i) and (ii) B times to obtain B bootstrap test statistics f fLR_l_gB l=1;

where B is su¢ ciently large;

Step (iv): Compute the bootstrap p-value p B 1PB_l=11( fLR_l > fLR):

The excess kurtosis ^ estimated from the original sample X will be very close to the one estimated from the bootstrap sample X under H0; so the fLR statistic applies to both normally

and nonnormally distributed cases. The parametric bootstrap has been widely used to improve the …nite sample performance of nonparametric tests. For example, Fan, Li and Min (2006) and Li and Tkacz (2006) apply it to test the correct speci…cation of parametric conditional distributions

(25)

and conditional density in di¤erent contexts respectively. We …rst show the consistency of the parametric bootstrap in the following theorem.

Theorem 6: Suppose Assumptions A.1-A.3, A.4(i), A.5, and A.6(i) hold. Then sup

z2R

P LRf z_jX (z) = oP (1)

as T ! 1 where (z) is the cumulative distribution function of N (0; 1):

Theorem 6 shows that conditional on X ; fLR _!d N (0; 1) _{in probability as T ! 1: The} proof is similar to that of Theorem 4 and we need to use the fact that the parametric bootstrap ensures that in the bootstrap world, H0 always holds. When the null hypothesis H0 is true, the

bootstrap procedure will lead to asymptotically correct size of the test, because fLR converges in distribution to the N (0; 1) limiting distribution; when the null hypothesis is false, because the test statistic fLR will converge to in…nity in probability, whereas asymptotically the bootstrap critical value is still the same as that of N (0; 1); the bootstrap procedure has power.

In fact, when the same kernel function k( ) and the same bandwidth b are used for both fLR and fLR ; the parametric bootstrap described above can be greatly simpli…ed by replacing fLR and fLR with lU lRand lU lRrespectively, where lU and lR are the log-likelihood values of the

local QMLE and standard QMLE based on the bootstrap sample X : This procedure is rather convenient because there is no need to compute ^A; ^B and to estimate : It is also applicable no matter whether "t is normal or nonnormal.

The consistency of the parametric bootstrap does not indicate the degree of improvement of the parametric bootstrap upon the asymptotic distribution. Since fLR is asymptotically pivotal, it is possible that fLR can achieve reasonable accuracy in …nite samples. We shall examine the performance of the parametric bootstrap in our simulation study.

6.2 Simulation study

To examine the size of our test under H0, we consider the following DGP:

DGP S1 [Standard GARCH(1,1)]: 8 > > < > > : Xt= h 1 2 t"t ht= 0:1 + 0:2Xt 12 + 0:7ht 1 "t i:i:d:N (0; 1) : (6.2)

The standard GARCH(1,1) model is the most popular GARCH model and has been widely used in modelling volatilities in …nancial econometrics. We generate 500 data sets of a random sample fXtgTt=1for T = 250 and 500 respectively, using the Matlab Windows Version 7 Random Number

(26)

the …rst 5000 realizations to eliminate the impact of the initial value.

To investigate the power of our test in detecting structural changes in GARCH models, we consider four alternatives:

DGP P1 [Single break in a GARCH ]: 8 > > < > > : Xt= h 1 2 t"t ht= ( 0:1 + 0:2X2 t 1+ 0:4ht 1; if t 0:5T; 0:3 + 0:4X2 t 1+ 0:55ht 1; otherwise; (6.3)

DGP P2 [Multiple breaks in a GARCH ]: 8 > > > > > < > > > > > : Xt= h 1 2 t"t ht = 8 > > < > > : 0:1 + 0:2X_{t 1}2 + 0:3ht 1; if t 0:3T; 0:3 + 0:3X2 t 1+ 0:4ht 1; if 0:3T < t 0:6T; 0:5 + 0:4X2 t 1+ 0:55ht 1 otherwise; (6.4)

DGP P3 [Smooth transition GARCH ]: 8 > > < > > : Xt= h 1 2 t"t ht = 0:1 + 0:2Xt 12 + 0:4ht 1 [1 + 0:5G (t)] G (t) = _{f1 + exp [ 5 (t=T} 0:5)]_g 1; (6.5) where "t i:i:d:N (0; 1) :

The single break has been a structural change with classical importance. Under DGP P1, an abrupt break occurs in the GARCH model at some unknown time t: This alternative has been considered by Chu (1995) and Berkes et al. (2004). DGP P2 admits monotonic multiple breaks. DGP P3 is the smooth transition multiplicative GARCH(1,1) model proposed by Amado and Terasvirta (2008).::

For the proposed LR test fLR, we use the quartic kernel in (3.4). In fact, our simulation experience (not reported here) suggests that the choice of k( ) has little impact on the performance of tests. For simplicity, we choose the bandwidth b = (1=p12)T 15;where 1=p12is the standard

deviation of U (0; 1); which could be viewed as the limiting distribution of the grid points _Tt; t = 1; :::; T; _{as T ! 1: We use the parametric bootstrap procedure described above with the} number of bootstrap iterations B = 100. Both 10% and 5% signi…cance levels are considered.

Under DGP S1, the LR test has good size: For example, the rejection rate is 5:8% at the 5% level when T = 250 and decreases to 5:4% when T = 500: DGP P1 has a single sudden break with unknown break date. The LR test has reasonable power. The rejection rate is 43:6% at the

(27)

5% level even when the sample size T is as small as 250; and increases to 68:6% when T = 500: DGP P2 has multiple breaks. As expected, the rejection rate is higher than that under DGP P1. Under DGP P3, the coe¢ cients of the GARCH model are changing over time smoothly. The rejection rate is a bit low when T = 250, but increases with the sample size T .

To sum up, we observe that the LR test has good sizes in …nite samples when the parametric bootstrap is used. It also has reasonable powers against both sudden structural breaks and smooth structural changes.

7. CONCLUSION

Modelling and detecting structural changes in GARCH processes have attracted increasing attention in time series econometrics. We have contributed to this literature by establishing the asymptotic properties of a local QMLE for the time-varying GARCH models for both interior and boundary regions, and more importantly, proposing a new test for smooth structural changes as well as abrupt structural breaks in GARCH models. All existing works focus on the estimation of time-varying ARCH models, which are special cases of our time-varying GARCH models, and asymptotic estimation results of local QMLE are only available for interior points, even for time-varying ARCH models. Moreover, no model speci…cation test was available for time-varying ARCH models. On the other hand, our LR tests are intuitively appealing and straightforward to compute. They have a convenient null asymptotic N(0,1) distribution, do not require trimming data, do not require prior information on the possible alternative, and are consistent against all smooth structural changes as well as multiple abrupt structural breaks in GARCH or ARCH models. To overcome the adverse impact of the …rst stage nonparametric estimation of the time-varying parameters, we use the parametric bootstrap procedure, which provides reasonable sizes and powers for the proposed tests in …nite samples.

(28)

Table 1 Size and Power of LR Tests T=250 T=500 10% 5% 10% 5% Size DGP S0 .128 .058 .112 .054 Power DGP P1 .620 .436 .814 .686 DGP P2 .844 .722 .956 .912 DGP P3 .242 .154 .480 .346

Notes: (1) DGP S0 is a classical GARCH(1,1) process, given in (6.2); DGP P1 is a GARCH process with a single break, given in (6.3); DGP P2 is a GARCH process with multiple breaks, given in (6.4); DGP P3 is a smooth transition GARCH process, given in (6.5); (2) Parametric bootstrap procedure is used and the bootstrap iteration number B=100; (3) The p values are based on the results of 500 iterations.

(29)

REFERENCES

Amado, C. and T. Teräsvirta (2008): "Modelling Conditional and Unconditional Heteroskedas-ticity with Smoothly Time-Varying Structure," Working paper, Stockholm School of Economics. Amemiya, T. (1985): Advanced Econometrics, Harvard University Press, Cambridge, Massa-chusetts.

Andrews, D.W.K., I. Lee and W. Ploberger (1996): "Optimal Changepoint Tests for Normal Linear Regression," Journal of Econometrics, 70, 9-38.

Andreou, E. and E. Ghysels, (2002): "Detecting Multiple Breaks in …nancial Market Volatility Dynamics," Journal of Applied Econometrics, 17, 579-600.

Ang, A. and D. Kristensenz (2009): "Testing Conditional Factor Models," Working paper, Columbia University.

Bai, J. and P. Perron (1998): "Estimating and Testing Linear Models with Multiple Structural Changes," Econometrica, 66, 47-78.

Berkes, I., E. Gombay , L. Horvath and P. Kokoszka (2003): "GARCH Processes: Structure and Estimation," Bernoulli, 9, 201-227.

— — (2004): "Sequential Change-Point Detection in GARCH(p,q) Models," Econometric Theory, 20, 1140-1167.

Bollerslev, T. (1986): "Generalized autoregressive conditional heteroskedasticity," Journal of Econometrics, 31, 3, 307-327.

Bougerol, P. and Picard, N. (1992): "Strict Stationarity of Generalized Autoregressive Processes," The Annals of Probability, 20, 1714-1730.

Brown, B.M. (1971): "Martingale Limit Theorems," Annals of Mathematical Statistics, 42, 59-66. Cai, Z. (2007): "Trending Time-Varying Coe¢ cient Time Series Models with Serially Correlated Errors," Journal of Econometrics, 136, 163-188.

Carrasco, M. and X. Chen (2002): "Mixing and Moment Properties of Various GARCH and Stochastic Volatility Models," Econometric Theory, 18, 17-39.

Carroll, R., J. Fan, I. Gijbels, and M.P. Wand (1997): "Generalized Partially Linear Single-Index Models," Journal of the American Statistical Association, 92, 477–489.

Chen, B. and Y. Hong (2008): "Testing for Smooth Structural Changes in Time Series Models via Nonparametric Regression," Working paper, Cornell University.

Chu, C.-S. (1995): "Detecting Parameter Shift in GARCH Models," Econometric Reviews, 14, 241-266.

Cline, D.B.H. and J.D. Hart (1991): "Kernel Estimation of Densities with Discontinuities or Discontinuous Derivative," Statistics, 22, 69-84.

Copas (1995): "Local Likelihood Based on Kernel Censoring," Journal of the Royal Statistical Society Series B, 57, 221-235.

(30)

Cox, D.R. (1972): "Regression Models and Life-Tables," Journal of the Royal Statistical Society Series B, 4, 187-220.

Dahlhaus, R. (1996a): "On the Kullback Leibler Information Divergence of Locally Stationary Processes," Stochastic Processes and Their Applications, 62, 139-168.

— — (1996b): "Maximum Likelihood Estimation and Model Selection for Locally Stationary Processes," Journal of Nonparametric Statistics, 6, 171-191.

— — (1997): "Fitting Time Series Models to Nonstationary Processes," Annals of Statistics, 25, 1-37.

Dahlhaus, R. and S. Subba Rao (2006): "Statistical Inference for Time-Varying ARCH Process," The Annals of Statistics, 34, 1075-1114.

Diebold, F.X. (1986): "Modeling the Persistence of Conditional Variances: a Comment," Econo-metric Reviews, 5, 51-56.

Engle, R.F. (1982): "Autoregressive Conditional Heteroskedasticity with Estimates of the Vari-ance of U.K. In‡ation," Econometrica, 50, 987-1008.

Engle, R. and J. Gonzalo Rangel (2005): "The Spline GARCH Model for Unconditional Volatility and its Global Macroeconomics Causes," Working paper, NYU.

Fan, J., M. Farmen. and Gijbels, I. (1998): "Local Maximum Likelihood Estimation and Infer-ence," Journal of the Royal Statistical Society Series B, 60, 591-608.

Fan, J., I. Gijbels and M. King (1997): "Local Likelihood and Local Partial Likelihood in Hazard Regression," The Annals of Statistics, 25, 1661-1690.

Fan, J. and T. Huang (2005): "Pro…le Likelihood Inferences on Semiparametric Varying-Coe¢ cient Partially Linear Models," Bernoulli, 11, 1031-1057.

Fan, Y., Q. Li and I. Min (2006): "A Nonparametric Bootstrap Test of Conditional Distribu-tions," Econometric Theory, 22, 587-612.

Fryzlewicz, P., T. Sapatinas and S. Subba Rao (2008): "Normalized Least-Squares Estimation in Time-Varying ARCH Models," The Annals of Statistics, 36, 742-786.

Hall, P. and T.E. Wehrly (1991): "A Geometrical Method for Removing Edge E¤ects from Kernel-type Nonparametric Regression Estimators," Journal of the American Statistical Associ-ation, 86, 665-672.

Hansen, B (2001): "The New Econometrics of Structural Change: Dating Breaks in U.S. Labor Productivity," Journal of Economic Perspectives, 15, 117-128.

Hillebrand, E. (2005): "Neglecting Parameter Changes in GARCH Models," Journal of Econo-metrics, 129, 121-138.

Hjort, N.L. (1991): "Semiparametric Estimation of Parametric Hazard Rates," Survival Analysis: State of the Art, Kluwer, Dordercht, 211-236.

(31)

Models and Their Applications," Annals of Statistics, 33, 2395-2422.

Lamoureux, C.G. and W.D. Lastrapes (1990): "Persistence in Variance Structural Change and the GARCH Model," Journal of Business and Economic Statistics, 8, 225–234.

Lee, S.W. and B. Hansen (1994): "Asymptotic Theory for the GARCH(1,1) Quasi-Maximum Likelihood Estimator," Econometric Theory, 10, 29–52.

Li, F. and G. Tkacz, (2006): "A Consistent Test for Conditional Density Functions with Time Dependent Data," Journal of Econometrics, 133, 863-886.

Loader, C. (1999): Local Regression and Likelihood, Springer-Verlag New York.

Lumsdaine, R.L. (1996): "Consistency and Asymptotic Normality of the Quasi-Maximum Like-lihood Estimator in IGARCH(1, 1) and Covariance Stationary GARCH(1, 1) Models," Econo-metrica, 64, 575–596.

Mikosch, T. and C. St¼aric¼a (2004): "Nonstationarities in Financial Time Series, the Long-Range Dependence, and the IGARCH E¤ects," The Review of Economics and Statistics, 86, 378-390. Pettenuzzo D. and A. Timmerman (2005): "Predictability of Stock Returns and Asset Allocation under Structural Breaks," Working paper, UCSD.

Phillips, P.C.B. and Hansen B.E. (1990): "Statistical Inference in Instrumental Variables Re-gression with I(1) Processes," Review of Economic Studies, 57, 99–125.

Robinson, P.M. (1989): "Nonparametric Estimation of Time-Varying Parameters," in Hackl, P. eds., Statistical Analysis and Forecasting of Economic Structural Change, Springer, Berlin, 253-264.

Schuster, E.F. (1985): "Incorporating Support Constraints into Nonparametric Estimators of Densities," Communications in Statistics: Theory and Methods, 14, 1123-1136

Silverman, B.W. (1986): Density Estimation for Statistics and Data Analysis, London: Chapman & Hall.

Speckman, P. (1988): "Kernel Smoothing in Partial Linear Models," Journal of the Royal Sta-tistical Society Series B, 50, 413–436.

Subba Rao, S. (2006): "On Some Nonstationary, Nonlinear Random Processes and Their Sta-tionary Approximations," Advances in Applied Probability, 38, 1155-1172.

Tibshirani, R. and T. Hastie (1987): "Local Likelihood Estimation," Journal of the American Statistical Association, 82, 559-567.