**Long Memory in Time Series:**

**Semiparametric Estimation**

**and Conditional Heteroscedasticity**

by

**Marc Henry**

*London School of Economics and Political Science*

Thesis submitted in partial fulfilment of the requirements of

**UMI Number: U 142981**


**Published by ProQuest LLC 2014. Copyright in the Dissertation held by the Author.**
**Microform Edition © ProQuest LLC.**


in his brains, Which is as dry as the remainder biscuit

After a voyage, he hath strange places cramm'd With observation, the which he vents

In mangled forms.

**Abstract**

**Acknowledgements**

The story of my PhD is no Greek tragedy. The plot lacks purity, I never managed to stay in one place and it took me considerably more than one day to write it. However, I was always confident my supervisor, Peter Robinson, would save it from becoming a farce. My gratitude goes to Peter for accepting yet another Ph.D. student in October 1994, notwithstanding the crowds already flocking at his doorstep. My gratitude goes to Peter for spending a considerable amount of time in collaborative work even after he had become aware of my shortcomings. My gratitude goes to Peter for always treating me as an equal, for never putting pressure on a much valued freedom and for providing me with incentives and encouragements when I most needed them.

Alice Mesnard, Claudio Michelacci, Rohini Pande, Barbara Petrongolo, Helene Rey, Cecilia Testa and Etienne Wasmer, who helped restore my courage and will to get back to work. When confidence really withered, Javier Hidalgo, smoking away many a nostalgic cheap Spanish cigarette on the steps of St. Clements, would usher me into the entrails of the building to feed on Jorn Rothe's and Rob Northcott's faith in pure theory (and chess).

However, courage, will, faith and fine words did little to help me afford life in London and LSE fees. None of this (not even life in Ian Tonks' little Brixton flat I shared with Alice, Rohini and all the others who took an active part in the Tales of Brixton Motel) would have been possible without the help of a grant from the Ecole polytechnique, administered faithfully by Martine Guibert and awarded under the auspices of the Ecole des Hautes Etudes en Sciences Sociales and the shadow supervision of Gabrielle Demange. When this ran out, I turned to Christian Gourieroux and was awarded a grant from the Centre de Recherche en Economie et Statistique (CREST). Je désire à ce titre manifester, en français, et on s'apercevra que même l'unité de langue est violée, toute ma reconnaissance à Christian qui m'a accueilli au CREST en enfant de la maison (bien qu'un peu prodigue) et qui m'a tiré d'une mauvaise passe financière en me permettant d'obtenir un financement CREST. J'ai passé d'excellents moments en compagnie de Julien Bechtel, Monica Billio, Cecile Boyer, Eric Burgayran, Laurence Carassus, Serge Darolles, Gaelle le Fol, Nadine Guedj, Jean-Paul Laurent, Dietmar Leisen, Clotilde Napp, Huyen Pham, Olivier Scaillet, Andre Tiomo, Nizar Touzi, Fanda Traore et Jean-Michel Zakoïan au café-comique (souvent mué en café-concert sous l'impulsion de Clotilde et Gaelle) du cinquième étage de MK2. Ces moments furent très épisodiques, mais sont appelés, je l'espère, à se renouveler dans l'avenir.

**Contents**

**1 Long memory and conditional heteroscedasticity**

1.1 Long memory

1.2 Sample mean of long memory processes

1.3 Conditional heteroscedasticity

1.4 Estimating dependence

1.4.1 Parametric estimation of long memory

1.4.2 Smoothed periodogram spectral estimation

1.4.3 Semiparametric estimation of long memory

1.5 Choice of bandwidth

1.6 Long memory in speculative returns

1.7 Synopsis

**2 Averaged periodogram statistic**

2.1 Introduction

2.2 Averaged periodogram statistic

2.3 Asymptotic normality of the averaged periodogram

2.5 Estimation of long memory

2.6 Finite sample investigation of the averaged periodogram long memory estimate

2.7 Estimation of stationary cointegration

2.8 Conclusion

**3 Local Whittle estimation of long memory with conditional heteroscedasticity**

3.1 Introduction

3.2 Local Whittle estimate

3.3 Consistency of the local Whittle estimate

3.4 Asymptotic normality of the local Whittle estimate

3.5 Finite sample comparison

3.6 Conclusion

**4 Optimal bandwidth choice**

4.1 Introduction

4.2 Bandwidth selection for the averaged periodogram

4.3 Bandwidth choice for the local Whittle estimate

4.4 Approximations to the optimal bandwidths

4.5 Conclusion

**5 Analysis of dependence in intra-day foreign exchange returns**

5.1 Introduction

5.3 Methodology

5.3.1 Testing for persistence, long range dependence and stationarity

5.3.2 Estimation

5.3.3 Stationary cointegration

5.4 Results

5.4.1 Testing for Long Range Dependence

5.4.2 Semiparametric Estimations

5.4.3 Specification Tests on the Fully Parametric Model

5.4.4 Fractional cointegration

**List of Tables**

2.1 Moderate long memory averaged periodogram biases

2.2 Moderate long memory averaged periodogram RMSEs

2.3 Moderate long memory relative efficiencies

2.4 Very long memory averaged periodogram biases

2.5 Very long memory averaged periodogram RMSEs

2.6 Very long memory relative efficiencies

2.7 Averaged periodogram relative efficiencies for larger sample sizes

3.1 Local Whittle biases with antipersistence

3.2 Local Whittle RMSEs with antipersistence

3.3 Local Whittle 95% coverage probabilities with antipersistence

3.4 Log periodogram relative efficiencies with antipersistence

3.5 Local Whittle biases with short memory

3.6 Local Whittle RMSEs with short memory

3.7 Local Whittle 95% coverage probabilities with short memory

3.8 Log periodogram relative efficiencies with short memory

3.9 Local Whittle biases with moderate long memory

3.11 Local Whittle 95% coverage probabilities with moderate long memory

3.12 Log periodogram relative efficiencies with moderate long memory

3.13 Local Whittle biases with very long memory

3.14 Local Whittle RMSEs with very long memory

3.15 Local Whittle 95% coverage probabilities with very long memory

3.16 Log periodogram relative efficiencies with very long memory

3.17 Local Whittle biases with $t_2$ errors

3.18 Local Whittle RMSEs with $t_2$ errors

3.19 Local Whittle 95% coverage probabilities with $t_2$ errors

3.20 Log periodogram relative efficiencies with $t_2$ errors

3.21 Local Whittle biases with $t_4$ errors

3.22 Local Whittle RMSEs with $t_4$ errors

3.23 Local Whittle 95% coverage probabilities with $t_4$ errors

3.24 Log periodogram relative efficiencies with $t_4$ errors

4.1 Automatic estimates of long memory in introductory examples

4.2 Infeasible and feasible automatic local Whittle estimation

4.3 Sensitivity of automatic procedures to conditional heteroscedasticity

4.4 Automatic local Whittle estimation of long memory in fractional Gaussian noise series

5.1 Summary statistics for exchange rate returns

5.2 Summary Statistics for the Logarithm of Squared Returns

5.3 Test for long memory on returns

5.5 Test of long memory on deseasonalized volatility

5.6 Long memory in returns

5.7 Long memory in volatility

5.8 Long memory in deseasonalized volatility

5.9 Parametric testing for long memory in volatility

**List of Figures**

2.1 Averaged periodogram empirical distribution for $n = 500$

2.2 Averaged periodogram empirical distribution for $n = 1000$

2.3 Averaged periodogram empirical distribution for $n = 2000$

3.1 Empirical distributions of the local Whittle estimate with GARCH errors $n = 64$, $m = 4$

3.2 Empirical distributions of the local Whittle estimate with GARCH errors $n = 128$, $m = 16$

3.3 Empirical distributions of the local Whittle estimate with GARCH errors $n = 256$, $m = 64$

4.1 Long memory function of bandwidth in the Nile river data

4.2 Long memory estimation in an ARFIMA(1,-.25,0) series

4.3 Long memory estimation in an ARMA(1,0) series

4.4 Long memory estimation in an ARFIMA(1,.25,0) series

4.5 Long memory estimation in an ARFIMA(1,.45,0) series

4.6 Optimal bandwidth for the local Whittle estimate of long memory

4.7 Local Whittle biases against bandwidth

4.9 Automatic and optimal bandwidths for the local Whittle estimate

4.10 RMSEs with optimal, automatic and ad hoc bandwidth choice

4.11 Averaged periodogram RMSEs against bandwidth

5.1 Periodogram for JPY/USD Log Squared Returns

5.2 Log Periodogram for JPY/USD Log Squared Returns

5.3 JPY/USD Log Squared Returns: Sample Autocorrelations 1 to 1000

5.4 Periodogram for Deseasonalised JPY/USD Log Squared Returns

**Chapter 1**

**Long memory and conditional heteroscedasticity**

**1.1 Long memory**

The theory of econometric time series is the branch of econometrics concerned with the modelling of dependence across different realisations of an economic process $x$. Because of the description of the scale of realisations of the process as a time scale, this dependence is usually called temporal dependence and the process is indexed by $t$. Allowing for temporal dependence in the process implies relaxing the independence part of the traditional assumption of independence and identity of distribution (i.i.d.) for the stochastic process under focus. The identity of distribution part of the i.i.d. assumption is either defined by strict stationarity, meaning that for all positive integers $n$, $t_1, \ldots, t_n$ and $h$, the distributions of $(x_{t_1}, \ldots, x_{t_n})$ and $(x_{t_1+h}, \ldots, x_{t_n+h})$


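of an infinite weighted sum of uncorrelated errors of common variance, with square summable filtering weights:

$$x_t = E(x_t) + \sum_{j=0}^{\infty} \alpha_j \varepsilon_{t-j}, \qquad \alpha_0 = 1, \qquad \sum_{j=0}^{\infty} \alpha_j^2 < \infty, \tag{1.1}$$

with

$$E(\varepsilon_j) = 0 \ \text{a.s.} \quad \text{and} \quad E(\varepsilon_j \varepsilon_k) = \delta_{jk}\, \sigma_\varepsilon^2 \quad \text{for all } j \text{ and } k, \tag{1.2}$$

where $\delta$ stands for the Kronecker symbol.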

A way of relaxing the independence within the i.i.d. assumption while retaining weak dependence between distant $x$'s, or asymptotic independence, was introduced by Rosenblatt (1956) and Ibragimov (1959), (1962) with the notions of strong and uniform mixing. Let $(\Omega, \mathcal{A}, P)$ be the probability space in which the process $x_t$ is defined, where $\mathcal{A}$ is the smallest Borel field including all the Borel sets of the form $\{\omega \mid (x_{j(k)}(n_k, \omega),\ k = 1, \ldots, m) \in B\}$ with $B$ a Borel set in $\mathbb{R}^m$. Let $\mathcal{F}_{-\infty}^{p}$ be the $\sigma$-field of events determined by $x_t$, $t \le p$, and $\mathcal{F}_{q}^{\infty}$ be the $\sigma$-field of events determined by $x_t$, $t \ge q$. Define the sequences

$$\rho_x(k) := \sup\left\{ \left| P(A \cap B) - P(A)P(B) \right|,\ A \in \mathcal{F}_{-\infty}^{p},\ B \in \mathcal{F}_{p+k}^{\infty} \right\}, \qquad k \ge 1, \tag{1.3}$$

and

$$\zeta_x(k) := \sup\left\{ \left| \frac{P(A \cap B)}{P(A)} - P(B) \right|,\ A \in \mathcal{F}_{-\infty}^{p},\ B \in \mathcal{F}_{p+k}^{\infty} \right\}, \qquad k \ge 1. \tag{1.4}$$

$\rho_x(k)$ was introduced by Rosenblatt (1956) and called the $\alpha$-mixing or strong-mixing sequence, and $\zeta_x(k)$ was introduced by Ibragimov (1962) and called the $\phi$-mixing or uniform-mixing sequence (the usual $\alpha$ and $\phi$ notations are replaced by $\rho$ and $\zeta$ respectively to avoid a clash of notation with what follows). The strictly stationary process $x_t$ is called strong (resp. uniform) mixing if $\rho_x(k) \to 0$ (resp. $\zeta_x(k) \to 0$) when $k \to \infty$. It is easily seen that $2\rho_x(k) \le \zeta_x(k)$ and therefore that uniform-mixing


summability conditions on cumulant moments of all orders, which are assumed to exist. Defining the $k$th order cumulant of a strictly stationary process by

$$\operatorname{cum}(x_1, \ldots, x_k) = \sum (-1)^{p-1} (p-1)! \left( E \prod_{j \in \nu_1} x_j \right) \cdots \left( E \prod_{j \in \nu_p} x_j \right) \tag{1.5}$$

where the summation extends over all partitions $(\nu_1, \ldots, \nu_p)$, $p = 1, \ldots, k$, of $(1, \ldots, k)$, and defining

$$\kappa(t_1, \ldots, t_{k-1}) = \operatorname{cum}(x_t, x_{t+t_1}, \ldots, x_{t+t_{k-1}}), \quad \text{for } t_1, \ldots, t_{k-1} = 0, \pm 1, \ldots, \tag{1.6}$$

Brillinger (1975) introduces a mixing condition of the form

$$\sum_{t_1, \ldots, t_{k-1} = -\infty}^{+\infty} \left| \kappa(t_1, \ldots, t_{k-1}) \right| < \infty \quad \text{for all } k \ge 2. \tag{1.7}$$

It is easily seen that this condition includes absolute summability of autocovariances of the process, a condition which restricts the choice of filtering weights on the innovations $\varepsilon_t$ in 1.1, because it is equivalent to

$$\sum_{l=-\infty}^{+\infty} \left| \sum_{j=0}^{\infty} \alpha_j \alpha_{j+l} \right| < \infty, \tag{1.8}$$

with the convention $\alpha_j = 0$, $j < 0$.


flooding of alluvial plains by the Nile river bringing prosperity to the region. A more definite account is provided by the particularly reliable measurements of the annual low levels of the Nile at the Roda Gauge collected between A.D. 622 and A.D. 1284 and appearing in Toussoun (1925) (the first missing observation is for year A.D. 1285, outside the sample chosen). Two characteristics of this series are consistent with non mixing behaviour: slow decay of sample autocorrelations and a sample mean with variance which decays at a markedly slower rate than $n^{-1}$ (for graphical assessments, see, e.g. Beran (1994) p. 22). Hurst (1951) gives a quantitative account of a phenomenon later named after him, the “Hurst effect”, together with a heuristic approach to the measurement of the degree of temporal dependence associated with this effect. He defined the rescaled adjusted range or $R/S$ statistic, which is the standardised ideal capacity of a reservoir between a time origin and time $T$, and he observed a pattern consistent with the relation

$$E[R/S] \sim c\, T^{H} \quad \text{as } T \to \infty \quad \text{with } H > \tfrac{1}{2}, \tag{1.9}$$

where $\sim$ indicates that the ratio of the left hand side and the right hand side tends to one when $T$ tends to infinity, whereas $H$ (often called the self-similarity parameter) should be equal to $\tfrac{1}{2}$ if the river flow behaved like a process under any type of mixing assumption. In his pioneering work on stock prices, Mandelbrot (1973) identified the same type of phenomenon and related it to the self-similarity distributional property introduced by Kolmogorov (1940), by which the joint distribution of $x_{t_1}, \ldots, x_{t_n}$ is identical to $a^{-H}$ times the joint distribution of $x_{at_1}, \ldots, x_{at_n}$ for any $a > 0$, with the introduction of fractional Gaussian noise (in Mandelbrot and Van Ness (1968)), a Gaussian process with zero mean and autocovariances following

$$\operatorname{cov}(x_1, x_{1+j}) = \tfrac{1}{2} \operatorname{var}(x_1) \left\{ |j+1|^{2H} - 2|j|^{2H} + |j-1|^{2H} \right\}, \quad \text{for } j = 0, \pm 1, \ldots \tag{1.10}$$
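As an illustration of 1.10, the following minimal sketch evaluates the fractional Gaussian noise autocorrelation (1.10 divided by $\operatorname{var}(x_1)$) and compares it with the hyperbolic approximation $H(2H-1)j^{2H-2}$ for large lags. The function name `fgn_autocorrelation` and the value $H = 0.8$ are illustrative assumptions, not taken from the text; the large-lag approximation is a standard second-difference expansion rather than a formula stated above.

```python
import numpy as np


def fgn_autocorrelation(lags, hurst):
    """Autocorrelation of fractional Gaussian noise, i.e. (1.10) / var(x_1).

    `lags` may be a scalar or array of integer lags; `hurst` is the
    self-similarity parameter H (illustrative name, not from the text).
    """
    j = np.abs(np.asarray(lags, dtype=float))
    two_h = 2.0 * hurst
    return 0.5 * ((j + 1.0) ** two_h - 2.0 * j ** two_h + np.abs(j - 1.0) ** two_h)


# Hypothetical example value H = 0.8 (H > 1/2: slowly decaying correlations).
hurst = 0.8
lags = np.array([1.0, 10.0, 100.0, 1000.0])
rho = fgn_autocorrelation(lags, hurst)
# Large-lag approximation H(2H - 1) j^(2H - 2).
approx = hurst * (2.0 * hurst - 1.0) * lags ** (2.0 * hurst - 2.0)
for j, r, a in zip(lags, rho, approx):
    print(f"lag {int(j):5d}: exact {r:.6f}, hyperbolic approximation {a:.6f}")
```

With $H = \tfrac{1}{2}$ the same function returns zero at every nonzero lag, which is the uncorrelated benchmark against which the slow decay for $H > \tfrac{1}{2}$ can be compared.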


*be consistent with the presence of one or more unit roots, but the same series also displayed a tendency for the spectrum of first differences to exhibit a trough at zero frequency...* The large proportion of variance concentrated around zero frequency is indeed a translation in the frequency domain of non summability of autocovariances, but it need not indicate non stationarity, let alone the presence of a unit root.

The most popular model, besides fractional Gaussian noise in 1.10, which encompasses this distinction, is the autoregressive fractionally integrated moving average model, where

$$(1 - L)^{d_x}\, b(L)\, x_t = a(L)\, \varepsilon_t, \quad \text{with } -\tfrac{1}{2} < d_x < \tfrac{1}{2}, \tag{1.11}$$

where $a(z)$ and $b(z)$ are both finite order polynomials with zeros outside the unit circle in the complex plane. This model was proposed by Adenstedt (1974), Hosking (1981) and Granger and Joyeux (1980). $(1 - L)^{d}$ has a binomial expansion which is conveniently expressed in terms of the hypergeometric function

$$(1 - L)^{d} = F(-d, 1, 1; L) = \sum_{k=0}^{\infty} \Gamma(k - d)\, \Gamma(k+1)^{-1}\, \Gamma(-d)^{-1} L^{k}, \tag{1.12}$$

where $\Gamma(\cdot)$ denotes the Gamma function. Writing

$$\alpha(z) = \sum_{j=0}^{\infty} \alpha_j z^{j}, \tag{1.13}$$

1.11 corresponds to a parametric representation nested in specification 1.1-1.2 with

$$\alpha(z) = (1 - z)^{-d_x}\, \frac{a(z)}{b(z)}, \tag{1.14}$$

and when $\tfrac{1}{2} > d_x > 0$, this implies a slow decay of filtering weights

$$\alpha_j = O(j^{d_x - 1}) \quad \text{as } j \to \infty \tag{1.15}$$

and of autocorrelations

$$\frac{\sum_{i=0}^{\infty} \alpha_i \alpha_{i+j}}{\sum_{i=0}^{\infty} \alpha_i^2} \sim c\, j^{2d_x - 1} \quad \text{as } j \to \infty. \tag{1.16}$$
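The hyperbolic decay in 1.15 can be checked numerically in the simplest case. The sketch below assumes $a(z) = b(z) = 1$ in 1.14 (a pure fractional filter, chosen here only for illustration), computes the coefficients of $(1 - z)^{-d}$ by the one-step recursion implied by the Gamma-function form in 1.12, and compares them with $j^{d-1}/\Gamma(d)$. The function name and the value $d = 0.3$ are hypothetical.

```python
import math

import numpy as np


def fractional_filter_weights(d, n_weights):
    """Coefficients alpha_0, ..., alpha_{n-1} of (1 - z)^(-d).

    Uses alpha_j = Gamma(j + d) / (Gamma(j + 1) Gamma(d)), computed through
    the recursion alpha_j = alpha_{j-1} * (j - 1 + d) / j with alpha_0 = 1.
    Assumes a(z) = b(z) = 1 in (1.14); this is an illustrative special case.
    """
    weights = np.empty(n_weights)
    weights[0] = 1.0
    for j in range(1, n_weights):
        weights[j] = weights[j - 1] * (j - 1.0 + d) / j
    return weights


d = 0.3  # hypothetical long memory parameter, 0 < d < 1/2
alpha = fractional_filter_weights(d, 5001)
j = np.arange(1.0, 5001.0)
hyperbolic = j ** (d - 1.0) / math.gamma(d)
ratio = alpha[1:] / hyperbolic  # tends to one as j grows

print("alpha_1, alpha_2:", alpha[1], alpha[2])
print("alpha_j / (j^(d-1) / Gamma(d)) at j = 10, 100, 5000:",
      ratio[9], ratio[99], ratio[4999])
print("sum of weights up to j = 5000 (still growing):", alpha.sum())
print("sum of squared weights up to j = 5000:", np.sum(alpha ** 2))
```

The running sum of the weights keeps growing with the truncation point while the sum of squares settles down, which matches the statement that the weights are square summable but not summable when $0 < d_x < \tfrac{1}{2}$.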


Hyperbolic decay of the filtering weights as in 1.15 implies that the latter decay so slowly as to be non summable even though they remain square summable for $d_x < \tfrac{1}{2}$. In terms of impulse response, this implies a very long lived but non permanent response to shocks at any time in the series, consistent with very slow mean reversion in the process (a notion widely used in the economic literature, and which can be formally defined within this framework by a $d_x$ value strictly below one, distinguishing it thereby from a stationarity concept). Hyperbolic decay of autocorrelations translates into the existence of a singularity of hyperbolic nature in the spectral density of the process in the neighbourhood of frequency zero. Conditions for equivalence between time domain and frequency domain representations of long memory are discussed in Yong (1974). Therefore, throughout this work, long memory in a weakly stationary time series $x_t$, $t = 0, \pm 1, \ldots$, with autocovariances satisfying

$$\operatorname{cov}(x_t, x_{t+j}) = \int_{-\pi}^{\pi} f(\lambda) \cos(j\lambda)\, d\lambda, \quad j = 0, \pm 1, \ldots, \tag{1.17}$$

will be modelled semiparametrically by

$$f(\lambda) \sim L(\lambda)\, \lambda^{-2d_x} \quad \text{as } \lambda \to 0^{+}, \quad \text{with } -\tfrac{1}{2} < d_x < \tfrac{1}{2}, \tag{1.18}$$

where $L(\lambda) > 0$ and is continuous at $\lambda = 0$ when $d_x = 0$, and is otherwise a slowly varying function at zero defined by

$$\frac{L(t\lambda)}{L(\lambda)} \to 1 \quad \text{as } \lambda \to 0 \quad \text{for all } t > 0. \tag{1.19}$$

Under 1.18, $f(\lambda)$ has a pole at $\lambda = 0$ for $0 < d_x < \tfrac{1}{2}$ (when there is long memory in $x_t$), $f(\lambda)$ is positive and finite for $d_x = 0$ (which is identified with short memory in $x_t$) and $f(0) = 0$ for $-\tfrac{1}{2} < d_x < 0$ (which can be described as negative dependence or


Beran (1993) and Janacek (1993) with an appended quasi maximum likelihood estimation method for the degree of dependence. The spectral density in the most common version of this model is given by

$$f(\lambda) = \left| 1 - e^{i\lambda} \right|^{-2d_x} \exp\left\{ \sum_{j=0} \nu_j \cos(j\lambda) \right\} \tag{1.20}$$

where the short memory part of the representation provides an arbitrarily accurate approximation of any positive function with a Fourier decomposition.
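To connect 1.20 with the semiparametric condition 1.18, the sketch below evaluates a spectral density of the form 1.20 and checks that $\lambda^{2d_x} f(\lambda)$ approaches the constant $\exp\{\sum_j \nu_j\}$ as $\lambda \to 0^{+}$. It assumes a finite number of cosine terms; the function name, the coefficients `nu` and the value $d = 0.3$ are hypothetical choices made only for the illustration.

```python
import numpy as np


def fexp_spectral_density(lam, d, nu):
    """Spectral density of the form (1.20) with finitely many cosine terms.

    `nu[j]` is the coefficient of cos(j * lam), j = 0, ..., len(nu) - 1.
    The finite truncation is an assumption made for this illustration.
    """
    lam = np.atleast_1d(np.asarray(lam, dtype=float))
    nu = np.asarray(nu, dtype=float)
    orders = np.arange(nu.size)
    # Short memory part: exp of a finite cosine series.
    short_memory = np.exp(np.cos(np.outer(lam, orders)) @ nu)
    # Long memory part: |1 - exp(i lam)|^(-2d), singular at lam = 0 when d > 0.
    long_memory = np.abs(1.0 - np.exp(1j * lam)) ** (-2.0 * d)
    return long_memory * short_memory


d = 0.3                 # hypothetical memory parameter
nu = [0.1, 0.5, -0.2]   # hypothetical short memory coefficients
lam = np.array([1e-1, 1e-2, 1e-3, 1e-4])
f = fexp_spectral_density(lam, d, nu)
scaled = f * lam ** (2.0 * d)   # should level off near exp(sum(nu))

print("f(lambda):", f)
print("lambda^(2d) f(lambda):", scaled)
print("exp(sum(nu)):", np.exp(np.sum(nu)))
```

In the notation of 1.18 this corresponds to $L(\lambda)$ tending to the positive constant $\exp\{\sum_j \nu_j\}$, so the pole at zero frequency is governed by $d_x$ alone.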

**1.2 Sample mean of long memory processes**

Long range dependence may also be detected through the behaviour of the sample mean of the process. Consider the partial sums

$$S_n = \sum_{t=1}^{n} x_t, \tag{1.21}$$

with variance $\sigma_n^2 = E|S_n|^2$, and suppose $E x_t = 0$ without loss of generality. As noted by Robinson (1994d), $\sigma_n^2$ always exists and is equal to $2\pi n$ times the Cesàro sum, to $n - 1$ terms, of the Fourier series of $f(\lambda)$, the spectral density of the process $x_t$. Therefore, if $f$ is continuous at the origin,

$$\frac{\sigma_n^2}{n} \to 2\pi f(0) \quad \text{as } n \to \infty.$$

Thus, if $f(0) \neq 0$ is estimated consistently by $\hat{f}(0)$, the Central Limit Theorem and Slutzky's Theorem yield

$$S_n \left( 2\pi n \hat{f}(0) \right)^{-\frac{1}{2}} \to_d N(0, 1) \quad \text{as } n \to \infty. \tag{1.22}$$

See Hannan (1979) for a proof of 1.22 for a stationary $x_t$ following 1.1 with i.i.d. innovations $\varepsilon_t$ (this can be extended to conditionally homoscedastic and uniformly integrable martingale differences) and

$$\sum_{j=0}^{\infty} |\alpha_j| < \infty. \tag{1.23}$$


Eicker (1956) gives consistent estimates of the limiting variance of the LSE and other simple estimates of parametric models in the presence of parametric and nonparametric disturbance autocorrelation. 1.22 implies a convergence of the sample mean at the rate $n^{\frac{1}{2}}$, a feature which generally fails when 1.23 does not hold. When the filtering weights $\alpha_j$ are not absolutely summable, but follow 1.15, with long memory parameter $d_x > 0$, the $\sigma_n^2$ need to be scaled with a factor $n^{-(2d_x + 1)}$ instead of $n^{-1}$ to converge. Moreover, the sample mean is no longer BLUE (Adenstedt (1974)). Samarov and Taqqu (1988) found that efficiency can be poor for $d < 0$ but is at least 0.98 for $d > 0$. Providing $\sigma_n^2$ does not have a finite limit, a central limit theorem continues to hold when $x_t$ follows 1.1 with i.i.d. innovations (this can be relaxed to martingale difference innovations) in the form:

$$\sigma_n^{-1} S_n \to_d N(0, 1) \quad \text{with } n^{\frac{1}{2} + d_x} \sigma_n^{-1} \to c > 0, \tag{1.24}$$

in Ibragimov and Linnik (1971). Giraitis and Surgailis (1986) use the Appell generalisation of Hermite polynomials to extend the result 1.24 to nonlinear functions of processes satisfying 1.1 with i.i.d. innovations.
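The $n^{2d_x + 1}$ scaling of the partial sum variance can be verified exactly for fractional Gaussian noise. The sketch below sums the autocovariances 1.10 over all pairs of observations and compares the result with $n^{2H}\operatorname{var}(x_1)$. It relies on the standard correspondence $d_x = H - \tfrac{1}{2}$, which is an assumption brought in for the illustration rather than a statement made above; the function names are hypothetical.

```python
import numpy as np


def fgn_autocovariance(lags, hurst, variance=1.0):
    """Autocovariance (1.10) of fractional Gaussian noise at integer lags."""
    j = np.abs(np.asarray(lags, dtype=float))
    two_h = 2.0 * hurst
    return 0.5 * variance * (
        (j + 1.0) ** two_h - 2.0 * j ** two_h + np.abs(j - 1.0) ** two_h
    )


def partial_sum_variance(n, hurst, variance=1.0):
    """Var(S_n) = sum over |j| < n of (n - |j|) * cov(x_1, x_{1+j})."""
    j = np.arange(-(n - 1), n)
    return float(np.sum((n - np.abs(j)) * fgn_autocovariance(j, hurst, variance)))


hurst = 0.8  # hypothetical value; corresponds to d_x = H - 1/2 = 0.3
for n in (10, 100, 1000):
    exact = partial_sum_variance(n, hurst)
    print(f"n = {n:5d}: Var(S_n) = {exact:.4f}, n^(2H) = {n ** (2 * hurst):.4f}, "
          f"Var(S_n) / n = {exact / n:.4f}")
```

The last column grows with $n$ when $H > \tfrac{1}{2}$, which is the failure of the $n^{-1}$ scaling described above; with $H = \tfrac{1}{2}$ the same code returns $\operatorname{Var}(S_n) = n$.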

**1.3 Conditional heteroscedasticity**

Long memory therefore provides a framework for a very parsimonious representation of temporal dependence, in that the long range dependence is embodied in the one parameter $d_x$. To derive asymptotic distributional results for processes with strong temporal dependence typically outside the scope of any mixing assumption, the approach chosen here relies on a martingale difference or non predictability assumption for the Wold innovations $\varepsilon_t$ of the process. In other words, letting $\mathcal{F}_t$ denote the filtration associated with the $\sigma$-field of events generated by $(\varepsilon_s, s \le t)$, one needs to assume the innovations are *martingale differences:*

$$E(\varepsilon_t \mid \mathcal{F}_{t-1}) = 0 \quad \text{almost surely (a.s.).} \tag{1.25}$$


(1979)'s, require the assumption of constant conditional variance

$$E(\varepsilon_t^2 \mid \mathcal{F}_{t-1}) = \sigma^2 \quad \text{a.s.} \tag{1.26}$$

that many time series, in particular long financial time series where a large degree of temporal dependence is (and can be) observed, are generally believed to violate. Financial returns, constructed from first differenced logged asset prices or foreign exchange bank quote midpoints sampled at weekly, daily or intra-daily frequencies, typically exhibit thick-tailed distributions and volatility clustering, i.e. conditional variances changing over time in such a way that periods of high movement are followed by periods displaying the same characteristic, and periods of low movement also. One therefore needs to allow for time varying volatilities for the innovations, and 1.26 needs to be replaced by

$$E(\varepsilon_t^2 \mid \mathcal{F}_{t-1}) = \sigma_t^2 \quad \text{a.s.} \tag{1.27}$$

where $\sigma_t^2$ is a stochastic process whose temporal dependence properties can in turn be considered. The conditional variance $\sigma_t^2$ can be allowed to depend on some latent structure, as in the model due to Taylor (1980):

$$\varepsilon_t = \eta_t \sigma_t,$$

$$\log \sigma_t = \gamma_0 + \gamma_1 \log \sigma_{t-1} + u_t,$$

$$\eta_t,\ u_t \ \text{independent i.i.d.}$$
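A short simulation can make the volatility clustering implied by this latent structure visible. The sketch below generates data from the three equations above under the additional assumption, not made in the text, that $\eta_t$ and $u_t$ are Gaussian; the parameter values, the seed, the burn-in length and all function names are hypothetical.

```python
import numpy as np


def simulate_log_ar1_volatility(n, gamma0, gamma1, sigma_u, seed=0, burn_in=500):
    """Simulate eps_t = eta_t * sigma_t with log sigma_t following an AR(1).

    eta_t ~ N(0, 1) and u_t ~ N(0, sigma_u^2) are drawn independently; the
    Gaussian choice is an assumption made for this illustration only.
    """
    rng = np.random.default_rng(seed)
    total = n + burn_in
    u = rng.normal(0.0, sigma_u, size=total)
    eta = rng.standard_normal(total)
    log_sigma = np.empty(total)
    log_sigma[0] = gamma0 / (1.0 - gamma1)  # start at the stationary mean
    for t in range(1, total):
        log_sigma[t] = gamma0 + gamma1 * log_sigma[t - 1] + u[t]
    sigma = np.exp(log_sigma)
    eps = eta * sigma
    return eps[burn_in:], sigma[burn_in:]


def sample_autocorrelation(x, lag):
    """Sample autocorrelation of x at a positive integer lag."""
    x = np.asarray(x, dtype=float) - np.mean(x)
    return float(np.dot(x[:-lag], x[lag:]) / np.dot(x, x))


eps, sigma = simulate_log_ar1_volatility(20000, gamma0=0.0, gamma1=0.95, sigma_u=0.1, seed=1)
print("lag-1 autocorrelation of returns:        ", sample_autocorrelation(eps, 1))
print("lag-1 autocorrelation of squared returns:", sample_autocorrelation(eps ** 2, 1))
```

Under these assumptions the levels are close to uncorrelated while the squares are positively autocorrelated, which is the pattern described in the paragraph above.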


(1991), and some nonlinearities were introduced by Sentana (1995) with an extensive study of quadratic ARCH models and by Zakoian (1995) with the threshold ARCH class of models. An extensive review of the literature in this field of econometric research is given by Bollerslev, Engle, and Nelson (1994). All of the above are based on a parameterisation of the one step-ahead forecast density, a particularly appealing feature (as pointed out by Shephard (1996)), as much of finance theory is concerned with one step-ahead moments or distributions defined with respect to the economic agent's information. Asymptotic theory for parametric ARCH modelling was proposed by Weiss (1986), Lee and Hansen (1994), Lumsdaine (1996) and Newey and Steigerwald (1994). Bollerslev, Chou, and Kroner (1992) give reviews of the GARCH modelling approach. A nonparametric specification encompassing both ARCH and GARCH as special cases was proposed by Robinson (1991b) where $\sigma_t^2$ is an infinite sum of lagged values of $\varepsilon_t^2$:

$$\sigma_t^2 = \sigma^2 + \sum_{j=1}^{\infty} \psi_j (\varepsilon_{t-j}^2 - \sigma^2) \quad \text{a.s.} \quad \text{with } \sum_{j=1}^{\infty} \ldots < \infty. \tag{1.28}$$

This can be reparameterised as

$$\sigma_t^2 = \rho + \sum_{j=1}^{\infty} \psi_j \varepsilon_{t-j}^2 \tag{1.29}$$

and includes both standard ARCH (when $\psi_j = 0$, $j > p$, for finite $p$) and GARCH (for which the $\psi_j$ decay exponentially) models. However, as Robinson (1991b) indicated, long memory behaviour is also covered. This, and the semi-strong ARMA representation for the squared innovations implied by the above specification, is made apparent in the following reparameterisation. If, for complex valued $z$,

$$\psi(z) = 1 - \sum_{j=1}^{\infty} \psi_j z^{j} \tag{1.30}$$

satisfies

$$|\psi(z)| \neq 0, \quad |z| \le 1, \tag{1.31}$$

define

$$\phi(z) = \sum_{j=0}^{\infty} \phi_j z^{j} = \psi(z)^{-1}, \quad \phi_0 = 1. \tag{1.32}$$
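The passage from the autoregressive weights $\psi_j$ in 1.30 to the moving average weights $\phi_j$ in 1.32 is a power series inversion, which can be sketched as follows. Matching coefficients in $\phi(z)\psi(z) = 1$ gives $\phi_0 = 1$ and $\phi_k = \sum_{j=1}^{k} \psi_j \phi_{k-j}$. The example weights $\psi_j = a\, b^{j-1}$ are those of a GARCH(1,1) recursion $\sigma_t^2 = \omega + a\,\varepsilon_{t-1}^2 + b\,\sigma_{t-1}^2$, used here purely as a hypothetical test case with made-up values of $a$ and $b$; the function name is also hypothetical.

```python
import numpy as np


def invert_arch_weights(psi, n_weights):
    """Coefficients phi_0, ..., phi_{n-1} of phi(z) = 1 / psi(z).

    psi(z) = 1 - sum_{j >= 1} psi_j z^j as in (1.30); `psi[j - 1]` holds psi_j.
    Uses phi_0 = 1 and phi_k = sum_{j=1}^{k} psi_j * phi_{k-j}.
    """
    psi = np.asarray(psi, dtype=float)
    phi = np.zeros(n_weights)
    phi[0] = 1.0
    for k in range(1, n_weights):
        m = min(k, psi.size)
        # phi[k-1::-1][:m] is (phi_{k-1}, phi_{k-2}, ..., phi_{k-m}).
        phi[k] = np.dot(psi[:m], phi[k - 1::-1][:m])
    return phi


# Hypothetical GARCH(1,1) example: psi_j = a * b^(j-1), j >= 1.
a, b = 0.1, 0.8
n_weights = 50
psi = a * b ** np.arange(n_weights - 1)
phi = invert_arch_weights(psi, n_weights)

# Closed form in this special case: phi(z) = (1 - b z) / (1 - (a + b) z),
# so phi_0 = 1 and phi_j = a * (a + b)^(j-1) for j >= 1.
closed_form = np.concatenate(([1.0], a * (a + b) ** np.arange(n_weights - 1)))
print("max abs difference from closed form:", np.max(np.abs(phi - closed_form)))
```

In this exponentially decaying case both $\psi_j$ and $\phi_j$ are summable; the same recursion applies unchanged to hyperbolically decaying weights.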


Then, Robinson (1991b) rewrote 1.28 as

$$\varepsilon_t^2 - \sigma^2 = \sum_{j=0}^{\infty} \phi_j \nu_{t-j} \tag{1.33}$$

where

$$\nu_t = \varepsilon_t^2 - \sigma_t^2 \tag{1.34}$$

satisfies

$$E(\nu_t \mid \mathcal{F}_{t-1}) = 0 \quad \text{a.s.,} \tag{1.35}$$

by construction. As a result, the chosen specification does not include all weak GARCH processes as defined by Drost and Nijman (1993) as processes with the same linear projections as ordinary GARCH. However, as for weak ARMA processes, limiting distribution theory for weak GARCH processes, provided, for instance, by Francq and Zakoian (1997), relies on mixing assumptions which may preclude the high levels of temporal dependence in the squares which are allowed by 1.33 with a suitable choice of filtering weights. To allow for specific types of nonlinearities in the squares, Robinson (1991b) also proposed a quadratic version of 1.28:

$$\sigma_t^2 = \left( \sigma + \sum_{j=1}^{\infty} \psi_j \varepsilon_{t-j} \right)^{2} \quad \text{a.s.} \tag{1.36}$$


(Nelson (1990b)), it corresponds to persistence of shocks on both forecast moments of $\sigma_t^2$ and on forecast distributions of $\sigma_t^2$. A long memory representation of volatility, replacing for instance the unit root by a fractional filter in the equation for the squares, reconciles a high degree of temporal dependence in volatilities with lack of persistence and, possibly, with covariance stationarity. Denoting $s_t = \sigma_t^2 - \sigma^2$ and $\chi_t = \varepsilon_t^2 - \sigma^2$, for $l > 0$, we have

$$s_{t+l} = \psi_l \chi_t + \sum_{j \neq l} \psi_j \chi_{t-j+l} \quad \text{a.s.} \tag{1.37}$$

Now as under 1.28, $\psi_l \chi_t \to 0$ almost surely when $l \to \infty$, $\chi_t$ is persistent in the volatility according to none of the definitions adopted by Nelson (1990b), i.e. persistence in probability, in $L_p$-norm or almost surely.

Besides, the analogy is apparent between the clustering of volatilities of financial returns and what Mandelbrot (1973) described as the “Joseph effect”. And, indeed, Whistler (1990), Lo (1991), Ding, Granger, and Engle (1993) and Lee and Robinson (1996) are among the first to show how well the long memory representation performs empirically. A general fractionally integrated GARCH model is obtained as a special case of specification 1.28 with the $\phi(z)$ polynomial defined as

$$\phi(z) = (1 - z)^{-d_\varepsilon}\, \frac{b(z)}{a(z)} \tag{1.38}$$

for $0 < d_\varepsilon < \tfrac{1}{2}$ and finite order polynomials $a(z)$ and $b(z)$ whose zeros lie outside the unit circle in the complex plane. Note that the degree of fractional integration is called $d_\varepsilon$ in this case to distinguish long memory in the squared innovations from long memory in the levels. Baillie, Bollerslev, and Mikkelsen (1996) apply 1.38 to asset prices with the addition of a drift parameter

$$(1 - L)^{d_\varepsilon} a(L)\, \varepsilon_t^2 = \mu + b(L)\, \nu_t. \tag{1.39}$$

Nelson (1990b) proves almost sure convergence of the conditional variance $\sigma_t^2$ in the short memory case $d_\varepsilon = 0$ in 1.38 with $a(z)$ and $b(z)$ of degree one. Apart from 1.38, the requirement

$$0 < \sum_{j=0}^{\infty} \phi_j^2 < \infty \tag{1.40}$$


includes the other traditional long memory specification of moving average coefficients, the fractional noise case with autocorrelations satisfying

$$\operatorname{corr}(\varepsilon_t^2, \varepsilon_{t+j}^2) = \frac{\sum_{i=0}^{\infty} \phi_i \phi_{i+j}}{\sum_{i=0}^{\infty} \phi_i^2} = \tfrac{1}{2} \left\{ |j-1|^{2d_\varepsilon + 1} - 2|j|^{2d_\varepsilon + 1} + |j+1|^{2d_\varepsilon + 1} \right\}. \tag{1.41}$$

Robinson (1991b) developed Lagrange multiplier tests for no-ARCH against alternatives consisting of general finite parameterisation of 1.28, specialising to 1.38 and 1.41. In both these cases, the autoregressive weights $\psi_j$ satisfy

$$\sum_{j=1}^{\infty} |\psi_j| < \infty. \tag{1.42}$$

Under 1.42 and

$$\max_t E(\varepsilon_t^4) < \infty, \tag{1.43}$$

it follows that

$$E(\nu_t^2) \le \ldots \le K \tag{1.44}$$

where $K$ is a generic constant, so the innovations in 1.33 are square integrable martingale differences, $\varepsilon_t^2$ is well defined as a covariance stationary process and its autocorrelations can exhibit the usual long memory structure implied by 1.38 or 1.41. Even if 1.43 does not hold, the “autocorrelations” $\sum_{i=0}^{\infty} \phi_i \phi_{i+j} / \sum_{i=0}^{\infty} \phi_i^2$ are well defined under 1.40. Both parametric representations 1.38 and 1.41 have the implication that autocorrelations follow

$$\frac{\sum_{i=0}^{\infty} \phi_i \phi_{i+j}}{\sum_{i=0}^{\infty} \phi_i^2} \sim c\, j^{2d_\varepsilon - 1} \quad \text{as } j \to \infty \tag{1.45}$$

which in turn implies a rate of decay for the innovations filtering weights of

$$\phi_j = O(j^{d_\varepsilon - 1}) \quad \text{as } j \to \infty. \tag{1.46}$$

This is taken as a characterisation of long memory in the process when $d_\varepsilon > 0$ and it implies nonsummability of weights $\phi_j$ and autocovariances


The rate of convergence of the sample mean is also characteristic of long memory processes when $\phi_j$ satisfies 1.46. Indeed, 1.35 and 1.44 imply that the partial sums of the squared innovations have variance

$$\operatorname{Var}\left[ \sum_{t=1}^{n} (\varepsilon_t^2 - \sigma^2) \right] = \sum_{s,t=1}^{n} \sum_{j=0}^{\infty} \phi_j \phi_{j+s-t}\, E(\nu_{t-j}^2) \tag{1.48}$$

$$= O\left( n \sum_{t=1}^{n} \Theta_t \right) \tag{1.49}$$

with

$$\Theta_t := \sum_{j=0}^{\infty} |\phi_j \phi_{j+t}| = O(t^{2d_\varepsilon - 1}) \tag{1.50}$$

under 1.46. Therefore, we have the same rate upper bound as in 1.24, i.e.

$$\sum_{t=1}^{n} (\varepsilon_t^2 - \sigma^2) = O_p\left( n^{d_\varepsilon + \frac{1}{2}} \right). \tag{1.51}$$

This result, and nonsummability of the $\phi_j$'s, is to be contrasted with standard latent ARMA representations for the squares, where weights decay exponentially and are, therefore, absolutely summable. In view of the empirical evidence and the focus on possible long memory in financial returns $x_t$, it seems appropriate to allow for possible long memory in the $\varepsilon_t^2$ also. This thesis is concerned with the estimation of temporal dependence structures in a covariance stationary time series $x_t$ via the analysis of its spectral density in a neighbourhood of zero frequency. This concerns the case where $x_t$ displays short memory as well as the cases where $x_t$ displays long memory; and a large part of the results provide asymptotic theory in case the squared innovations possibly exhibit long memory themselves.

**1.4 Estimating dependence**

Estimating the degree of dependence and carrying out inference on the process $x_t$

**1.4.1 Parametric estimation of long memory**

Take $x_t$ to be a covariance stationary series with mean $\mu_0$ and spectral density $f(\lambda;\theta_0)$, $-\pi < \lambda \le \pi$, where $f$ is a given function of $\lambda$ and $\theta$. For a realisation of size $n$, we consider the discrete Fourier transform

$$w_x(\lambda) = (2\pi n)^{-\frac{1}{2}}\sum_{t=1}^{n}x_t e^{it\lambda} \tag{1.52}$$

and the periodogram

$$I_x(\lambda) = |w_x(\lambda)|^2. \tag{1.53}$$
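As a minimal sketch of definitions 1.52 and 1.53, the periodogram can be evaluated at the harmonic frequencies $\lambda_j = 2\pi j/n$ with a fast Fourier transform. The function name `periodogram` and the restriction to $j = 1, \ldots, [(n-1)/2]$ are choices made for this illustration only, not part of the text.

```python
import numpy as np

def periodogram(x):
    """Periodogram I_x(lambda_j) = |w_x(lambda_j)|^2 at lambda_j = 2*pi*j/n.

    Returns (lam, I) for j = 1, ..., floor((n - 1) / 2).
    Illustrative helper; the name and the frequency range are assumptions.
    """
    x = np.asarray(x, dtype=float)
    n = x.size
    j = np.arange(1, (n - 1) // 2 + 1)
    lam = 2.0 * np.pi * j / n
    # np.fft.fft uses exp(-i*t*lambda) with t = 0..n-1 rather than
    # exp(+i*t*lambda) with t = 1..n; both changes leave the modulus unchanged.
    w = np.fft.fft(x)[j] / np.sqrt(2.0 * np.pi * n)
    return lam, np.abs(w) ** 2
```

For a pure cosine at the third harmonic, $x_t = \cos(2\pi 3t/n)$, the only non-negligible ordinate is $I_x(\lambda_3) = n/(8\pi)$, which gives a quick sanity check of the normalisation.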

This statistic was first proposed by Schuster (1898) to investigate hidden periodicities in time series. A useful general result, under various regularity conditions, is the following¹:

$$\sqrt{n}\int_{-\pi}^{\pi}\zeta(\lambda)\left[I_x(\lambda)-f(\lambda;\theta_0)\right]d\lambda \to_d N\left(0, A(\zeta)+B(\zeta)\right) \quad \text{as } n\to\infty, \tag{1.54}$$

where

*M O* ** = ** *~* **(1.55)**

*B(0 =*

### (1.56)

and where

$$f_4(\lambda,\mu,\omega) = (2\pi)^{-3}\sum_{u,v,w=-\infty}^{\infty}\kappa_{u,v,w}\,e^{-i(u\lambda+v\mu+w\omega)} \tag{1.57}$$

is the fourth order cumulant spectral density, and

$$\kappa_{u,v,w} = \mathrm{cum}(x_t, x_{t+u}, x_{t+v}, x_{t+w}) \tag{1.58}$$

is the fourth order cumulant moment of the process $x_t$.

¹In case $x_t$ are residuals from a fitted parametric model, taking $\mu_0 = 0$ does not, under regularity conditions (including conditions on the function $\zeta(\lambda)$), affect the asymptotic properties of the

When $x_t$ is Gaussian, $B(\zeta)$ vanishes, and under suitable regularity conditions on $\zeta(\lambda)$ and $f(\lambda;\theta)$, Fox and Taqqu (1986) show that 1.54 holds even when $x_t$ is strongly autocorrelated, providing a pole of $f(\lambda;\theta)$ is matched by a zero of $\zeta(\lambda)$ of suitable order. Fox and Taqqu (1986) showed as a result that Whittle's estimate of $\theta_0$ (Whittle (1962), Hannan (1973)), i.e. an estimate resulting from the minimization of

is asymptotically normal with rate $n^{-1/2}$. Beran (1986) and Dahlhaus (1989) extend this result to prove that the Gaussian maximum likelihood estimate of $\theta_0$ remains efficient in the Cramér-Rao sense when $x_t$ is strongly autocorrelated. Robinson (1994d) shows that root-$n$ asymptotic normality results also apply to the estimate resulting from the maximization of a discretised version of the Whittle likelihood

where $\lambda_j = 2\pi j/n$ are the harmonic frequencies.

When $x_t$ is possibly non Gaussian, the three estimates above (pseudo maximum likelihood, Whittle and discretized Whittle) continue to be root-$n$ consistent and asymptotically normal under conditions involving weak autocorrelation (see Mann and Wald (1943), Whittle (1962), Hannan (1973) and Robinson (1978a)). Solo (1989) shows 1.54 for $x_t$ satisfying 1.1 with 1.25 and restrictions on the Wold coefficients which include many strongly autocorrelated non Gaussian processes. In case $x_t$ is non Gaussian, $B(\zeta)$ does not vanish in general. An important case where this occurs is when $x_t$ follows 1.1 and 1.25 with dynamic conditional heteroscedasticity in the innovations as discussed in the previous section. In that case, $f_4(\lambda,-\mu,\mu)$ contains contributions from fourth cumulant moments of the innovations $\varepsilon_t$ other than $\kappa = \mathrm{cum}(\varepsilon_t,\varepsilon_t,\varepsilon_t,\varepsilon_t)$. Under 1.26, which imposes constant second and fourth conditional moments, we have

and zero otherwise. Under 1.27, with $\sigma_t^2$ defined by 1.28, however,

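$$\begin{aligned}
\mathrm{cum}(\varepsilon_r,\varepsilon_s,\varepsilon_t,\varepsilon_u) &= \kappa && \text{if } r=s=t=u, && (1.61)\\
&= \gamma_{r-s} && \text{if } r=t\neq s=u, && (1.62)\\
&= \gamma_{r-t} && \text{if } r=s\neq t=u, && (1.63)\\
&= \gamma_{r-s} && \text{if } r=u\neq t=s, && (1.64)
\end{aligned}$$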

and zero otherw ise. Therefore, th e fourth cum ulant 1.58 is equal to

**oo**

*Cu,V,XJJ* ^ ^ y *&kd-k-\-uQk-\-v&k + w* (1.65)

*k=0*

d- ^ y *Kk—i * *&k+v&k+w* d* *&j+v&k+u&k+w*
*k^j*

d- (1.66)

and zero otherwise. The ARCH special case of 1.27 was considered by Weiss (1986), and the GARCH(1,1) by Lee and Hansen (1994). Both show asymptotic normality of the quasi maximum likelihood estimate. Lumsdaine (1996) allows for nonstationarity of the integrated form in the conditional variance equation, but long term dependence is not covered for $x_t$.

**1.4.2 Smoothed periodogram spectral estimation**

Semiparametric alternatives in the estimation of the slope of the logged spectrum at the origin rely on specification 1.18. Local specification around the frequency of interest avoids the pitfall of parametric estimation of the long memory parameter $d_x$: a misspecified spectrum at non zero frequencies may cause inconsistency in estimation of the long memory parameter (characterizing the low frequency dynamics of the system). This type of estimation is based on low frequency harmonics of the periodogram 1.53 whose properties are briefly discussed in this paragraph.

is not a consistent estimate for the spectral density. More precisely, consider a stationary process $x_t$ following 1.1 with i.i.d. innovations $\varepsilon_t$ and filtering weights $\alpha_j$ satisfying

$$\sum_{j=0}^{\infty}j^{\frac{1}{2}}|\alpha_j| < \infty, \tag{1.67}$$

and with spectral density defined as in 1.17. The periodogram of $x_t$ has the following asymptotic sampling properties:

$$\mathrm{cov}\left(I_x(\lambda), I_x(\mu)\right) = (1+\delta_{0\lambda}+\delta_{\pi\lambda})\,\delta_{\lambda\mu}\left(f(\lambda)^2+O(n^{-\frac{1}{2}})\right)+O(n^{-1}), \tag{1.68}$$

where $\delta$ stands for the Kronecker symbol. A proof of this result is given in Brockwell and Davis (1991). The same result is shown to hold by Brillinger (1975) for a strictly stationary process satisfying the mixing condition 1.7. 1.68 shows not only that periodogram ordinates are not mean-square consistent, but also that at distinct frequencies they are asymptotically uncorrelated under these conditions, which permits the construction of consistent estimates of the spectral density such as

$$\hat f(\lambda) = \sum_{|j|\le m}W_n(j)\,I_x\Big(\lambda+\frac{2\pi j}{n}\Big)$$

where $m$ is a bandwidth sequence satisfying at least

$$\frac{1}{m}+\frac{m}{n}\to 0 \quad \text{as } n\to\infty, \tag{1.69}$$

and $W_n(j)$ is a sequence of symmetric weight functions satisfying

$$\sum_{|j|\le m}W_n(j) = 1 \quad \text{and} \quad \sum_{|j|\le m}W_n^2(j)\to 0 \quad \text{as } n\to\infty. \tag{1.70}$$

Under 1.67 and 1.43, we have (see Brockwell and Davis (1991) for a proof)

$$\lim_{n\to\infty}\Big(\sum_{|j|\le m}W_n^2(j)\Big)^{-1}\mathrm{cov}\big(\hat f(\lambda),\hat f(\mu)\big) = (1+\delta_{0\lambda}+\delta_{\pi\lambda})\,\delta_{\lambda\mu}\,f(\lambda)^2,$$
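The smoothed periodogram estimate above can be sketched numerically as follows. The uniform weights $W_n(j) = 1/(2m+1)$ are one simple choice satisfying 1.70 and are an assumption of this illustration, as are the function name `smoothed_periodogram`, the sample mean correction and the circular treatment of frequencies.

```python
import numpy as np

def smoothed_periodogram(x, m):
    """Smoothed periodogram with uniform weights W_n(j) = 1 / (2m + 1).

    Returns (lam, f_hat) at the n Fourier frequencies 2*pi*k/n, k = 0..n-1.
    Illustrative sketch: the weights and the demeaning are assumptions.
    """
    x = np.asarray(x, dtype=float)
    n = x.size
    # Periodogram at every Fourier frequency. Demeaning sets the ordinate at
    # frequency zero to zero, which slightly biases f_hat downwards there.
    I = np.abs(np.fft.fft(x - x.mean())) ** 2 / (2.0 * np.pi * n)
    weights = np.full(2 * m + 1, 1.0 / (2 * m + 1))  # symmetric, sum to one
    k = np.arange(n)
    offsets = np.arange(-m, m + 1)
    # The periodogram is 2*pi-periodic, so neighbouring indices wrap around.
    neighbours = (k[:, None] + offsets[None, :]) % n
    f_hat = (I[neighbours] * weights).sum(axis=1)
    return 2.0 * np.pi * k / n, f_hat
```

With these weights $\sum_{|j|\le m}W_n^2(j) = 1/(2m+1)$, so the variance of the estimate shrinks at rate $1/m$ under 1.69, at the cost of averaging over a band of width $2\pi(2m+1)/n$.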

**1.4.3 Semiparametric estimation of long memory**

The estimation strategy based on low frequency periodogram ordinates which is considered in this work is related to a strategy first proposed by Hill (1975) in tail estimation for distributions with a high degree of leptokurtosis. Research in that field was fuelled in the last couple of years by institutional regulations allowing banks to derive their own method of estimation for the probability of extreme losses. Hill's semiparametric approach to the estimation of the tail of distributions relies on a parametric specification of the tail of the distribution and a nonparametric treatment of the rest of the distribution. The probability distribution is said to feature a heavy tail if it behaves asymptotically like the Pareto distribution

$$P(Y > y) = y^{-\gamma}L(y), \quad \gamma > 0, \quad y > 1, \tag{1.71}$$

where $L(y)$ is a slowly varying function at infinity. The Hill estimate for the "tail index" $\gamma$ (which, similarly to the long memory parameter in the time domain or the frequency domain representations, appears as an exponent) maximises a conditional Pareto likelihood

where $Y_{(1)} \ge \ldots \ge Y_{(n)}$ are the order statistics of a sample of observations $Y_1, \ldots, Y_n$ and $m$ is the number of statistics used in the estimation, satisfying 1.69.
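For illustration, the Hill estimate has a well known closed form, $\hat\gamma = \big[m^{-1}\sum_{i=1}^{m}\log(Y_{(i)}/Y_{(m+1)})\big]^{-1}$. The sketch below uses that standard expression, which is assumed here rather than taken from the displayed likelihood; the function name `hill_tail_index` is hypothetical.

```python
import numpy as np

def hill_tail_index(y, m):
    """Hill estimate of the tail index gamma from the m largest observations.

    Uses the standard closed form 1 / mean(log(Y_(i) / Y_(m+1))), i = 1..m.
    Illustrative sketch; assumes the m + 1 largest observations are positive.
    """
    y = np.sort(np.asarray(y, dtype=float))[::-1]  # Y_(1) >= ... >= Y_(n)
    if not 0 < m < y.size:
        raise ValueError("m must satisfy 0 < m < n")
    if y[m] <= 0:
        raise ValueError("the m + 1 largest observations must be positive")
    log_excess = np.log(y[:m]) - np.log(y[m])  # log(Y_(i) / Y_(m+1))
    return 1.0 / log_excess.mean()
```

As with the bandwidth in spectral estimation, $m$ trades variance (few order statistics) against bias (order statistics taken from outside the region where the Pareto approximation 1.71 is accurate).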

Now suppose $x_t$ is weakly stationary and follows 1.18. One wishes to estimate the degree of long memory $d_x$ in a way that is robust to possible misspecification of the short range dynamics. A semiparametric estimate of $d_x$ was proposed by Künsch (1987) and will be dwelt upon in chapter 3 of this thesis. It is based on the Whittle likelihood discretisation 1.59, but the optimisation is realised over the first $m$ frequencies only, with 1.69, in accordance with the local specification 1.18. The function to minimise is

noting that $f(\lambda)$ is replaced by its local parametric form in the neighbourhood of zero frequency 1.18 with $0 < L(\lambda) = G < \infty$. The estimate is not defined in closed form, so that a prior consistency result (as in Robinson (1995a) under very weak local smoothness conditions for the spectral density, in addition to assumption 1.26) is necessary. Robinson (1995a) also proves asymptotic normality under

$$f(\lambda) = G\lambda^{-2d_x}\left(1+O(\lambda^{\beta})\right) \quad \text{as } \lambda\to 0+, \tag{1.74}$$

for some $\beta\in(0,2]$, under slightly stronger local smoothness conditions and finite fourth moments for $x_t$ still satisfying 1.26. The proof of the result

$$\sqrt{m}\,(\hat d_x-d_x)\to_d N\left(0,\tfrac{1}{4}\right) \tag{1.75}$$

assumes the following restrictions on the choice of "bandwidth" $m$,

$$\frac{1}{m}+\frac{m^{1+2\beta}(\log m)^2}{n^{2\beta}}\to 0 \quad \text{as } n\to\infty \tag{1.76}$$

which perforce restricts the rate of convergence of the estimate. The latter will therefore be inefficient with respect to correctly specified parametric Whittle estimation, when $m = [(n-1)/2]$, which has $n^{-1/2}$ rate of convergence. Robinson also conjectures that the theorem still holds under the milder and more natural condition

$$\frac{1}{m}+\frac{m^{2\beta+1}}{n^{2\beta}}\to 0 \quad \text{as } n\to\infty. \tag{1.77}$$

An asymptotic normality result for the estimate may still hold when $m$ is of exact order of magnitude $n^{2\beta/(2\beta+1)}$, corresponding to optimal smoothing. In that case, asymptotic bias will not be zero, as it is in the cases of "oversmoothing" (cases where $m$ is small in order to avoid asymptotic bias): 1.77 and 1.76.
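The local Whittle estimate discussed above can be sketched as follows, assuming the usual concentrated objective of Robinson (1995a), $R(d) = \log\big(m^{-1}\sum_{j=1}^{m}\lambda_j^{2d}I_x(\lambda_j)\big) - 2d\,m^{-1}\sum_{j=1}^{m}\log\lambda_j$, in which the scale $G$ has been profiled out. The grid search, the admissible interval $[-0.49, 0.49]$ and the function name `local_whittle` are assumptions made for this illustration.

```python
import numpy as np

def local_whittle(x, m, grid=None):
    """Local Whittle estimate of d_x from the first m periodogram ordinates.

    Minimises R(d) = log(mean(lam**(2d) * I)) - 2 * d * mean(log(lam))
    over a grid of candidate values. Illustrative sketch only.
    """
    x = np.asarray(x, dtype=float)
    n = x.size
    j = np.arange(1, m + 1)                # requires m < n / 2
    lam = 2.0 * np.pi * j / n
    I = np.abs(np.fft.fft(x)[j]) ** 2 / (2.0 * np.pi * n)
    if grid is None:
        grid = np.linspace(-0.49, 0.49, 981)   # assumed admissible range
    mean_log_lam = np.log(lam).mean()
    R = np.array([
        np.log(np.mean(lam ** (2.0 * d) * I)) - 2.0 * d * mean_log_lam
        for d in grid
    ])
    return float(grid[np.argmin(R)])
```

Because only the ordinates $j = 1, \ldots, m$ enter, no mean correction is needed, and 1.75 suggests an approximate standard error of $1/(2\sqrt{m})$ whatever the values of $G$ and $d_x$.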

available for stationary processes with spectral densities satisfying 1.74. Giraitis, Robinson, and Samarov (1997) give a rate optimality theory but no lower bound for the asymptotic variance of estimates achieving that rate. Chapter 4 will be concerned with optimal smoothing and optimal bandwidth selection.

Other estimates of $d_x$ following the same estimation principle are the log periodogram estimate proposed by Geweke and Porter-Hudak (1983), the averaged periodogram estimate proposed by Robinson (1994c) and the exponential estimate proposed by Janacek (1993). The log periodogram is based on a regression of the first $m$ harmonics of the log periodogram on a simple function of frequency. An efficiency improving version of this estimate was proved in Robinson (1995b) to produce a consistent and asymptotically normal estimate of $d_x$, applying least squares to the regression

$$\log I_x(\lambda_j) = C - d_x(2\log\lambda_j) + U_j, \quad j = l+1, \ldots, m \tag{1.78}$$

where $l$ is a trimming parameter which diverges at a slower rate than the bandwidth $m$. Hurvich, Deo, and Brodsky (1998) further show that for a slightly more specific local parameterisation, the original Geweke-Porter-Hudak estimate is also asymptotically normal and that no trimming of very low frequency harmonics is necessary. Janacek's estimate (Janacek (1993)) is a counterpart of the log periodogram estimate based on the fractional exponential model 1.20, but, to date, there seems to be no asymptotic theory for it. The averaged periodogram estimate proposed by Robinson (1994c) will be considered in Chapter 2 of this thesis in more depth. It is based on an analogy with the weak dependence case where averaging over approximately independent periodogram harmonics in a neighbourhood of zero frequency produces a consistent estimate of the spectral density at zero frequency. However, the asymptotic properties of low periodogram ordinates are considerably affected by long range dependence, and new results had to be derived. The idea of the log periodogram estimate is drawn from the identity 1.78 under 1.18, where $C = \log L(0) - E$, $E$ is Euler's constant $E = 0.5772\ldots$, and

assumptions (all of which include a form of weak dependence), that for $\lambda \neq 0$ modulo $\pi$ and $J$ a finite positive integer, $I_x(\lambda_j)$, $j = 1, \ldots, J$ are asymptotically independent $f(\lambda_j)\chi_2^2/2$ variates (see for instance Theorem 5.2.6 page 126 in Brillinger (1975) and Theorem 12 page 223 of Hannan (1970)). Asymptotic properties of the averaged periodogram estimate of the spectral density at zero frequency

$$\hat f(0) = \frac{1}{m}\sum_{j=1}^{m}I_x(\lambda_j), \tag{1.79}$$

$m$ being the bandwidth following at least 1.69, and, more generally, of weighted periodogram spectral estimation (in Brillinger (1975)), follow from this asymptotic distributional result for ordinates of the periodogram under short memory.
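The log periodogram regression 1.78 described earlier in this section can be sketched as an ordinary least squares fit of $\log I_x(\lambda_j)$ on $-2\log\lambda_j$ with an intercept. This is a plain trimmed regression; it does not implement the pooling of adjacent ordinates that underlies the efficiency improvement in Robinson (1995b). The function name `log_periodogram_estimate` and the argument name `trim` (standing for $l$) are hypothetical.

```python
import numpy as np

def log_periodogram_estimate(x, m, trim=0):
    """Least squares slope of log I_x(lambda_j) on -2 log(lambda_j).

    Uses harmonics j = trim + 1, ..., m. Illustrative sketch of the
    log periodogram estimate of d_x; 'trim' plays the role of l in 1.78.
    """
    x = np.asarray(x, dtype=float)
    n = x.size
    j = np.arange(trim + 1, m + 1)
    lam = 2.0 * np.pi * j / n
    I = np.abs(np.fft.fft(x)[j]) ** 2 / (2.0 * np.pi * n)
    regressor = -2.0 * np.log(lam)
    z = regressor - regressor.mean()   # centring absorbs the intercept C
    return float(z @ np.log(I) / (z @ z))
```

Unlike the local Whittle estimate, this estimate is available in closed form, which is why no preliminary consistency proof is needed for it.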

For a process $x_t$ where the conditional homogeneity condition 1.26 fails (and therefore the conditions applied by Hannan (1970) who assumed the $\varepsilon_t$ to be i.i.d.), and is replaced by 1.27 with $\sigma_t^2$ defined by 1.28, the asymptotic distributional result for periodogram ordinates may not continue to hold, possibly because of non summable fourth cumulant contributions to asymptotic variances. In Chapter 2, it is proved that in spite of this, 1.79 remains an asymptotically normal estimate of $f(0)$ with a suitable choice of bandwidth $m$.
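A minimal sketch of the averaged periodogram estimate 1.79 follows; the function name `averaged_periodogram_f0` is hypothetical. Since only the harmonics $j = 1, \ldots, m$ are used, the estimate is unaffected by a shift in the mean of the series.

```python
import numpy as np

def averaged_periodogram_f0(x, m):
    """Averaged periodogram estimate of f(0): mean of I_x(lambda_j), j = 1..m.

    Illustrative sketch of 1.79; requires m < n / 2.
    """
    x = np.asarray(x, dtype=float)
    n = x.size
    j = np.arange(1, m + 1)
    I = np.abs(np.fft.fft(x)[j]) ** 2 / (2.0 * np.pi * n)
    return float(I.mean())
```

For a unit variance white noise series the target is $f(0) = 1/(2\pi) \approx 0.159$, which gives a simple check of the implementation.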

When $x_t$ displays long memory (it follows 1.18), the asymptotic distributional result continues to hold for fixed positive frequencies (see Rosenblatt (1981) and Yajima (1989)) but not for periodogram ordinates in a neighbourhood of zero, as documented by Künsch (1996), Hurvich and Beltrao (1993), Comte and Hardouin (1995), and Robinson (1995b). The periodogram ordinates $I_x(\lambda_j)$ are no longer independent or identically distributed when the sample size $n$ tends to infinity. In this setting, Theorem 2 of Robinson (1995b) gives a major result on asymptotic variance and correlations of low frequency periodogram ordinates which applies to the dependence structure considered in this thesis under 1.74: putting $v(\lambda) = w_x(\lambda)/(G^{1/2}\lambda^{-d_x})$, where $w_x(\lambda)$ is the discrete Fourier transform defined in 1.52, and $\sigma^2$ is the unconditional variance of the innovations to the process, we have

$$E\left[v(\lambda_j)\,v(\lambda_k)\right] = O\Big(\frac{\log j}{k}\Big). \tag{1.81}$$

This result is instrumental to the proofs of the asymptotic properties of the log periodogram, the local Whittle and the averaged periodogram estimates of long memory, and it remains valid when the conditional homoscedasticity condition 1.26 is relaxed to 1.27 with $\sigma_t^2$ following 1.28. In this setting, Chapter 3 proves that the asymptotic normality result 1.75 continues to hold for the local Whittle estimate of long memory, and that it continues to hold with identical asymptotic variance so that no features of the ARCH structure defined by 1.28 or 1.36 enter. This result is due to additional smoothing of the periodogram via the slightly more stringent condition on the choice of bandwidth

$$m\log m = o\left(n^{\frac{1}{2}-d_\varepsilon}\right) \quad \text{as } n\to\infty \tag{1.82}$$

which ensures that the contribution to the variance of the periodogram of the errors $\varepsilon_t$ from fourth cumulants 1.62-1.64 induced by long memory conditional heteroscedasticity is of small order of magnitude with regards to the suitable approximating martingale. This implicit effect of ARCH (restricting attainable rates of convergence for the estimates) is directly in contrast with parametric or adaptive estimation (see, e.g. Weiss (1986) and Kuersteiner (1997)) where ARCH-type behaviour directly affects limiting distributional properties.

This outcome (i.e. no explicit effect of ARCH) is especially desirable in the case of the local Whittle estimate. This is in the first place due to the simplicity of the limiting variance in 1.75, which is independent of $G$ and $d_x$. Moreover, although maximum likelihood estimation of parametric versions of 1.33 such as 1.38 or 1.41 is

not permit long memory, whereas long memory literature features either Gaussian processes (e.g. Fox and Taqqu (1986), Robinson (1995b)), non linear functions of Gaussian processes (e.g. Taqqu (1975)), linear functions of independently and identically distributed sequences (e.g. Giraitis and Surgailis (1990)), nonlinear functions of such linear filters ("Appell polynomials", see Giraitis and Surgailis (1986)), as well as the model defined by 1.1, 1.2, 1.25 and 1.26. None of these approaches represents conditional heteroscedasticity in a martingale difference sequence.

**1.5 Choice of bandwidth**

It is apparent from the discussion above that the choice of bandwidth $m$, the number of periodogram ordinates used in the estimation procedure, is crucial in semiparametric estimation of long memory. It is crucial to both asymptotic distributional results and mean square optimality. Moreover, insofar as it determines from which point the practitioner starts to describe the behaviour of the series as asymptotic, bandwidth is central to the concept of long memory itself. In that regard, specifying the series only in the "asymptotic region" with a structure that does not impose itself on short run cycles seems an intrinsically better approach, notwithstanding considerations of efficiency and robustness.

periodogram estimate in Lobato and Robinson (1996), Delgado and Robinson (1996), Delgado and Robinson (1994). The need for an optimality theory for the determination of bandwidth is therefore evident. Giraitis, Robinson, and Samarov (1997) show that for long memory estimates, in a similar way as for smoothed periodogram estimates, one cannot improve on a rate of convergence which depends on the local smoothness properties of the spectral density following specification 1.74. They further show that the log periodogram estimate of long memory in the form proposed by Robinson (1995b) attains this optimal rate of convergence. Under the more restrictive specification

$$f(\lambda) = \left|2\sin\left(\tfrac{\lambda}{2}\right)\right|^{-2d_x}f^*(\lambda) \tag{1.83}$$

where $f^*(\lambda)$ is twice continuously differentiable and positive at $\lambda = 0$, Hurvich, Deo, and Brodsky (1998) give a precise expression for the mean squared error of the estimate and derive an optimal bandwidth formula. For spectral densities satisfying

$$f(\lambda) = L(\lambda)\lambda^{-2d_x}\left(1+E_\beta\lambda^{\beta}+o(\lambda^{\beta})\right), \quad 0 < |E_\beta| < \infty, \quad \beta\in(0,2], \tag{1.84}$$

sub-sample bootstrap technique employed relies on the i.i.d. assumption for the observations, and does not seem to be readily extendible to strong dependence. One therefore needs to rely on Monte Carlo experiments to assess the quality of optimal bandwidth selection formulae, and it remains advisable to report a wide range of bandwidth choices in empirical applications.

**1.6 Long memory in speculative returns**

Fielitz (1977). This finding raises a number of questions on the effects long memory in returns may have on portfolio decision and on derivative pricing using martingale methods. However, the finding of Greene and Fielitz (1977) is challenged by Lo (1991) with a slightly more powerful analysis based on a modified form of the $R/S$ statistic. Lee and Robinson (1996) are the first to apply semiparametric methods to the measure of memory in stock price returns, and Lobato and Savin (1998) apply the Pitman-efficient test statistic developed in Lobato and Robinson (1998) to conclude with Lo (1991) that evidence of long memory in returns is spurious. They do, however, find strong evidence of long range dependence in the squared and absolute returns, as do Ding and Granger (1996). This refines the widely recognised stylised facts on conditionally heteroscedastic behaviour of financial returns (see Mandelbrot (1963) and Fama (1965) for a first description of the phenomenon) and reinforces the value of long memory estimation procedures robust to (possibly long memory) conditional heteroscedasticity when examining the long run predictability of returns.

("event studies") and, in particular, the long run effect of transactions on the price process (see for instance Lyons (1985), and Hasbrouck (1991)).

**1.7 Synopsis**

The following two chapters are concerned with the effect of possibly long memory conditional heteroscedasticity on semiparametric estimation of long memory.

Chapter 2 considers the averaged periodogram statistic for a linear process with possibly long memory in the innovations conditional variance. An asymptotic normality result is given for averaged periodogram estimation of finite and positive spectral densities at zero frequencies. The proof is adapted from Robinson and Henry (1997). The robustness of the results in Robinson (1994c) regarding consistency of the averaged periodogram statistic in the presence of long memory is then shown and a Monte Carlo experiment assesses the effect of conditional heteroscedasticity in small sample averaged periodogram long memory estimation. The estimation of stationary cointegration is then discussed in this framework.

Chapter 3 presents the proofs of robustness to (possibly long memory) conditional heteroscedasticity of the consistency and asymptotic normality results for the local Whittle estimate of long memory in Robinson (1995a). A Monte Carlo study investigates the effect of conditional heteroscedasticity on local Whittle estimation of long memory in small samples. This chapter is based on joint research with Peter Robinson, appearing in Robinson and Henry (1997).

Henry and Robinson (1996).

**Chapter 2**

**Averaged periodogram statistic**

**2.1 Introduction**

This second chapter is concerned with the use of an averaged periodogram statistic proposed by Grenander and Rosenblatt (1966) to investigate temporal dependence in weakly dependent time series. The process $x_t$ considered is stationary and satisfies 1.1 and 1.2 with the martingale dependence assumption 1.25 on innovations $\varepsilon_t$. The approach is semiparametric in the sense that $x_t$ is supposed to have spectral density $f(\lambda)$ satisfying the local specification 1.18 with $d_x \ge 0$; and the averaged periodogram statistic is used to investigate the behaviour of $f(\lambda)$ in a neighbourhood of zero frequency, estimating $f(0) = L(0)$ when $d_x = 0$ and estimating $d_x$ when the latter is strictly positive. Section 2 of this chapter presents issues and past results.

In the use of a semiparametric approach, one may have in mind estimating dependence in long financial data series. To that end, asymptotic properties of the averaged periodogram statistic need to be justified when there is a possibly high degree of temporal dependence in conditional variances.

to the case 1.27 with $\sigma_t^2$ defined by 1.28 corresponding to (possibly long memory) conditional heteroscedasticity in the innovations of a generalised linear process. Consistency of the averaged periodogram based estimate of $d_x > 0$ is proved with a specific rate of convergence by Robinson (1994c). Section 4 of this chapter extends the validity of the latter result to processes satisfying 1.27 with $\sigma_t^2$ following 1.28.

A simple corollary is the extended validity of a consistent estimate of stationary cointegration proposed by Robinson (1994c). This is presented in Section 5 of this chapter while Section 6 proposes an investigation of the effect of conditional heteroscedasticity in small samples. Section 7 concludes this chapter.

**2.2 Averaged periodogram statistic**

Let the discrete Fourier transform of a covariance stationary process $x_t$ be defined as in 1.52 and the periodogram $I_x(\lambda)$ as in 1.53. Define the averaged periodogram by

$$\hat F(\lambda) = \frac{2\pi}{n}\sum_{j=1}^{[\lambda n/2\pi]}I_x(\lambda_j) \tag{2.1}$$

where $\lambda_j = 2\pi j/n$, $n$ is the sample size and $[x]$ denotes the largest integer smaller or equal to $x$. Because $I_x(\lambda_j)$ is invariant to location shift, no mean correction is necessary for 2.1. $\hat F(\lambda)$ is a discrete analogue of the more widely documented continuously averaged periodogram (see Ibragimov (1963)) where 1.53 is replaced by its demeaned version. The estimate $\hat f(0) = \hat F(\lambda_m)/\lambda_m$ given in 1.79 was proposed for $f(0)$ by Grenander and Rosenblatt (1966) and is readily generalisable to a wide class of weighted periodogram spectral estimates defined below. Let $K(\lambda)$ be a bounded function satisfying

$$\int_{-\infty}^{\infty}K(\lambda)\,d\lambda = 1, \quad K(-\lambda) = K(\lambda). \tag{2.2}$$

Defining

$$K_m(\lambda) = m\sum_{j=-\infty}^{\infty}K\left(m(\lambda+2\pi j)\right) \tag{2.3}$$

where $m$ is a positive integer called the bandwidth, weighted periodogram estimation of $f(0)$ is given by

$$\hat f_w(0) = \frac{2\pi}{n}\sum_{j=1}^{n-1}K_m(\lambda_j)\,I_x(\lambda_j). \tag{2.4}$$

The class of kernel functions such that

$$K_m(\lambda) = 0 \quad \text{for } \lambda > \lambda_m \tag{2.5}$$

provides a basis for estimation of $f(0)$ under specification 1.18 with $d_x = 0$. Supposing 1.69 is satisfied, a set of sufficient conditions for

$$\hat f_w(0)\to_p f(0) \quad \text{as } n\to\infty \tag{2.6}$$

includes absolute summability of fourth cumulants

$$\sum_{h,i,j=-\infty}^{+\infty}\left|\mathrm{cum}(x_1, x_{1+h}, x_{1+i}, x_{1+j})\right| < \infty. \tag{2.7}$$

Suppose that a local Lipschitz condition is imposed on the spectral density in the form,

$$f(\lambda) = f(0)\left(1+E_\beta\lambda^{\beta}\right)+o(\lambda^{\beta}) \quad \text{as } \lambda\to 0+, \tag{2.8}$$

with

$$\beta\in(0,2], \quad 0 < f(0) < \infty, \quad 0 < E_\beta < \infty,$$

and suppose the bandwidth $m$ satisfies 1.77. Under the conditions above, asymptotic normality of $\hat f(0)$ given by 1.79

$$m^{\frac{1}{2}}\left(\hat f(0)-f(0)\right)\to_d N\left(0, f(0)^2\right) \quad \text{as } n\to\infty \tag{2.9}$$

occurs under the two following sets of sufficient conditions: Brillinger (1975), Theorem 5.4.3, page 136 assumes 1.7 and existence of all moments of $x_t$; Hannan (1970), Theorem 13, page 224, assumes that $x_t$ follows 1.1 with i.i.d. innovations. Hannan (1970), Theorem 13', page 227 also proves 2.9 under the uniform mixing condition