Long Memory in Time Series:
Semiparametric Estimation
and Conditional Heteroscedasticity
by
Marc Henry
London School of Economics and Political Science
Thesis submitted in partial fulfilment of the requirements of
UMI Number: U 142981
Published by ProQuest LLC 2014. Copyright in the Dissertation held by the Author. Microform Edition © ProQuest LLC.
in his brains,
Which is as dry as the remainder biscuit
After a voyage, he hath strange places cramm'd
With observation, the which he vents
In mangled forms.
Abstract
Acknowledgements
The story of my PhD is no Greek tragedy. The plot lacks purity, I never managed to stay in one place and it took me considerably more than one day to write it. However, I was always confident my supervisor, Peter Robinson, would save it from becoming a farce. My gratitude goes to Peter for accepting yet another Ph.D. student in October 1994, notwithstanding the crowds already flocking at his doorstep. My gratitude goes to Peter for spending a considerable amount of time in collaborative work even after he had become aware of my shortcomings. My gratitude goes to Peter for always treating me as an equal, for never putting pressure on a much valued freedom and for providing me with incentives and encouragements when I most needed them.
Alice Mesnard, Claudio Michelacci, Rohini Pande, Barbara Petrongolo, Helene Rey, Cecilia Testa and Etienne Wasmer, who helped restore my courage and will to get back to work. When confidence really withered, Javier Hidalgo, smoking away many a nostalgic cheap Spanish cigarette on the steps of St. Clements, would usher me into the entrails of the building to feed on Jorn Rothe's and Rob Northcott's faith in pure theory (and chess).
However, courage, will, faith and fine words did little to help me afford life in London and LSE fees. None of this (not even life in Ian Tonks' little Brixton flat I shared with Alice, Rohini and all the others who took an active part in the Tales of Brixton Motel) would have been possible without the help of a grant from the Ecole polytechnique, administered faithfully by Martine Guibert and awarded under the auspices of the Ecole des Hautes Etudes en Sciences Sociales and the shadow supervision of Gabrielle Demange. When this ran out, I turned to Christian Gourieroux and was awarded a grant from the Centre de Recherche en Economie et Statistique (CREST). On this account I wish to express, in French, and it will be seen that even the unity of language is violated, all my gratitude to Christian, who welcomed me at CREST as a child of the house (albeit a somewhat prodigal one) and who pulled me out of a financial tight spot by enabling me to obtain CREST funding. I spent excellent moments in the company of Julien Bechtel, Monica Billio, Cecile Boyer, Eric Burgayran, Laurence Carassus, Serge Darolles, Gaelle le Fol, Nadine Guedj, Jean-Paul Laurent, Dietmar Leisen, Clotilde Napp, Huyen Pham, Olivier Scaillet, Andre Tiomo, Nizar Touzi, Fanda Traore and Jean-Michel Zakoïan at the café-comique (often turned into a café-concert at the instigation of Clotilde and Gaelle) on the fifth floor of MK2. These moments were very sporadic, but will, I hope, recur in the future.
Contents
1 Long memory and conditional heteroscedasticity ... 21
1.1 Long memory ... 21
1.2 Sample mean of long memory processes ... 27
1.3 Conditional heteroscedasticity ... 28
1.4 Estimating dependence ... 34
1.4.1 Parametric estimation of long memory ... 35
1.4.2 Smoothed periodogram spectral estimation ... 37
1.4.3 Semiparametric estimation of long memory ... 39
1.5 Choice of bandwidth ... 44
1.6 Long memory in speculative returns ... 46
1.7 Synopsis ... 48
2 Averaged periodogram statistic ... 51
2.1 Introduction ... 51
2.2 Averaged periodogram statistic ... 52
2.3 Asymptotic normality of the averaged periodogram ... 54
2.5 Estimation of long memory ... 71
2.6 Finite sample investigation of the averaged periodogram long memory estimate ... 72
2.7 Estimation of stationary cointegration ... 80
2.8 Conclusion ... 82
3 Local Whittle estimation of long memory with conditional heteroscedasticity ... 83
3.1 Introduction ... 83
3.2 Local Whittle estimate ... 84
3.3 Consistency of the local Whittle estimate ... 86
3.4 Asymptotic normality of the local Whittle estimate ... 90
3.5 Finite sample comparison ... 100
3.6 Conclusion ... 115
4 Optimal bandwidth choice ... 117
4.1 Introduction ... 117
4.2 Bandwidth selection for the averaged periodogram ... 123
4.3 Bandwidth choice for the local Whittle estimate ... 127
4.4 Approximations to the optimal bandwidths ... 129
4.5 Conclusion ... 141
5 Analysis of dependence in intra-day foreign exchange returns ... 143
5.1 Introduction ... 143
5.3 Methodology ... 153
5.3.1 Testing for persistence, long range dependence and stationarity ... 154
5.3.2 Estimation ... 155
5.3.3 Stationary cointegration ... 158
5.4 Results ... 163
5.4.1 Testing for Long Range Dependence ... 163
5.4.2 Semiparametric Estimations ... 166
5.4.3 Specification Tests on the Fully Parametric Model ... 168
5.4.4 Fractional cointegration ... 171
List of Tables
2.1 Moderate long memory averaged periodogram biases ... 74
2.2 Moderate long memory averaged periodogram RMSEs ... 74
2.3 Moderate long memory relative efficiencies ... 74
2.4 Very long memory averaged periodogram biases ... 75
2.5 Very long memory averaged periodogram RMSEs ... 75
2.6 Very long memory relative efficiencies ... 75
2.7 Averaged periodogram relative efficiencies for larger sample sizes ... 79
3.1 Local Whittle biases with antipersistence ... 102
3.2 Local Whittle RMSEs with antipersistence ... 102
3.3 Local Whittle 95% coverage probabilities with antipersistence ... 103
3.4 Log periodogram relative efficiencies with antipersistence ... 103
3.5 Local Whittle biases with short memory ... 104
3.6 Local Whittle RMSEs with short memory ... 104
3.7 Local Whittle 95% coverage probabilities with short memory ... 105
3.8 Log periodogram relative efficiencies with short memory ... 105
3.9 Local Whittle biases with moderate long memory ... 105
3.11 Local Whittle 95% coverage probabilities with moderate long memory ... 106
3.12 Log periodogram relative efficiencies with moderate long memory ... 106
3.13 Local Whittle biases with very long memory ... 107
3.14 Local Whittle RMSEs with very long memory ... 107
3.15 Local Whittle 95% coverage probabilities with very long memory ... 107
3.16 Log periodogram relative efficiencies with very long memory ... 108
3.17 Local Whittle biases with t_2 errors ... 110
3.18 Local Whittle RMSEs with t_2 errors ... 111
3.19 Local Whittle 95% coverage probabilities with t_2 errors ... 111
3.20 Log periodogram relative efficiencies with t_2 errors ... 111
3.21 Local Whittle biases with t_4 errors ... 112
3.22 Local Whittle RMSEs with t_4 errors ... 112
3.23 Local Whittle 95% coverage probabilities with t_4 errors ... 112
3.24 Log periodogram relative efficiencies with t_4 errors ... 113
4.1 Automatic estimates of long memory in introductory examples ... 133
4.2 Infeasible and feasible automatic local Whittle estimation ... 136
4.3 Sensitivity of automatic procedures to conditional heteroscedasticity ... 139
4.4 Automatic local Whittle estimation of long memory in fractional Gaussian noise series ... 141
5.1 Summary statistics for exchange rate returns ... 148
5.2 Summary Statistics for the Logarithm of Squared Returns ... 149
5.3 Test for long memory on returns ... 163
5.5 Test of long memory on deseasonalized volatility ... 166
5.6 Long memory in returns ... 167
5.7 Long memory in volatility ... 168
5.8 Long memory in deseasonalized volatility ... 169
5.9 Parametric testing for long memory in volatility ... 170
List of Figures
2.1 Averaged periodogram empirical distribution for n = 500 ... 78
2.2 Averaged periodogram empirical distribution for n = 1000 ... 78
2.3 Averaged periodogram empirical distribution for n = 2000 ... 79
3.1 Empirical distributions of the local Whittle estimate with GARCH errors n = 64, m = 4 ... 113
3.2 Empirical distributions of the local Whittle estimate with GARCH errors n = 128, m = 16 ... 114
3.3 Empirical distributions of the local Whittle estimate with GARCH errors n = 256, m = 64 ... 114
4.1 Long memory function of bandwidth in the Nile river data ... 119
4.2 Long memory estimation in an ARFIMA(1,-.25,0) series ... 120
4.3 Long memory estimation in an ARMA(1,0) series ... 120
4.4 Long memory estimation in an ARFIMA(1,.25,0) series ... 121
4.5 Long memory estimation in an ARFIMA(1,.45,0) series ... 122
4.6 Optimal bandwidth for the local Whittle estimate of long memory ... 130
4.7 Local Whittle biases against bandwidth ... 134
4.9 Automatic and optimal bandwidths for the local Whittle estimate ... 137
4.10 RMSEs with optimal, automatic and ad hoc bandwidth choice ... 138
4.11 Averaged periodogram RMSEs against bandwidth ... 140
5.1 Periodogram for JPY/USD Log Squared Returns ... 150
5.2 Log Periodogram for JPY/USD Log Squared Returns ... 151
5.3 JPY/USD Log Squared Returns: Sample Autocorrelations 1 to 1000 ... 151
5.4 Periodogram for Deseasonalised JPY/USD Log Squared Returns ... 165
Chapter 1
Long memory and conditional heteroscedasticity
1.1 Long memory
The theory of econometric time series is the branch of econometrics concerned with the modelling of dependence across different realisations of an economic process $x$.
Because of the description of the scale of realisations of the process as a time scale, this dependence is usually called temporal dependence and the process is indexed by $t$. Allowing for temporal dependence in the process implies relaxing the independence part of the traditional assumption of independence and identity of distribution (i.i.d.) for the stochastic process under focus. The identity of distribution part of the i.i.d. assumption is either defined by strict stationarity, meaning that for all positive integers $n, t_1, \ldots, t_n$ and $h$, the distributions of $(x_{t_1}, \ldots, x_{t_n})$ and $(x_{t_1+h}, \ldots, x_{t_n+h})$
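of an infinite weighted sum of uncorrelated errors of common variance, with square summable filtering weights.
$$x_t = E(x_t) + \sum_{j=0}^{\infty} \alpha_j \varepsilon_{t-j}, \quad \alpha_0 = 1, \quad \sum_{j=0}^{\infty} \alpha_j^2 < \infty, \tag{1.1}$$
with
$$E(\varepsilon_j) = 0 \quad \text{and} \quad E(\varepsilon_j \varepsilon_k) = \delta_{jk} \sigma_\varepsilon^2 \quad \text{for all } j, k, \tag{1.2}$$
where $\delta$ stands for the Kronecker symbol.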
A way of relaxing the independence within the i.i.d. assumption while retaining weak dependence between distant $x$'s, or asymptotic independence, was introduced by Rosenblatt (1956) and Ibragimov (1959), (1962) with the notions of strong and uniform mixing. Let $(\Omega, \mathcal{A}, P)$ be the probability space in which the process $x_t$ is defined, where $\mathcal{A}$ is the smallest Borel field including all the Borel sets of the form $\{\omega \mid (x_{j(k)}(n_k, \omega), k = 1, \ldots, m) \in B\}$ with $B$ a Borel set in $\mathbb{R}^m$. Let $\mathcal{F}_{-\infty}^{p}$ be the $\sigma$-field of events determined by $x_t$, $t \le p$ and $\mathcal{F}_{p}^{\infty}$ be the $\sigma$-field of events determined by $x_t$, $t \ge p$. Define the sequences
$$\rho_x(k) := \sup\{|P(A \cap B) - P(A)P(B)|, \ A \in \mathcal{F}_{-\infty}^{p}, \ B \in \mathcal{F}_{p+k}^{\infty}, \ k \ge 1\} \tag{1.3}$$
and
$$\zeta_x(k) := \sup\left\{\left|\frac{P(A \cap B)}{P(A)} - P(B)\right|, \ A \in \mathcal{F}_{-\infty}^{p}, \ B \in \mathcal{F}_{p+k}^{\infty}, \ k \ge 1\right\}. \tag{1.4}$$
$\rho_x(k)$ was introduced by Rosenblatt (1956) and called $\alpha$-mixing or strong-mixing sequence and $\zeta_x(k)$ was introduced by Ibragimov (1962) and called $\phi$-mixing or uniform-mixing sequence (the usual $\alpha$ and $\phi$ notations are replaced by $\rho$ and $\zeta$ respectively to avoid a clash of notation with what follows). The strictly stationary process $x_t$ is called strong (resp. uniform) mixing if $\rho_x(k) \to 0$ (resp. $\zeta_x(k) \to 0$) when $k \to \infty$. It is easily seen that $2\rho_x(k) \le \zeta_x(k)$ and therefore that uniform-mixing
summability conditions on cumulant moments of all orders, which are assumed to exist. Defining the $k$th order cumulant of a strictly stationary process by
$$\mathrm{cum}(x_1, \ldots, x_k) = \sum (-1)^{p-1} (p-1)! \Big(E \prod_{j \in \nu_1} x_j\Big) \cdots \Big(E \prod_{j \in \nu_p} x_j\Big) \tag{1.5}$$
where the summation extends over all partitions $(\nu_1, \ldots, \nu_p)$, $p = 1, \ldots, k$, and defining
$$\kappa_{t_1, \ldots, t_{k-1}} = \mathrm{cum}(x_t, x_{t+t_1}, \ldots, x_{t+t_{k-1}}), \quad \text{for } t_1, \ldots, t_{k-1} = 0, \pm 1, \ldots, \tag{1.6}$$
Brillinger (1975) introduces a mixing condition of the form
$$\sum_{t_1, \ldots, t_{k-1} = -\infty}^{+\infty} |\kappa_{t_1, \ldots, t_{k-1}}| < \infty \quad \text{for all } k \ge 2. \tag{1.7}$$
It is easily seen that this condition includes absolute summability of autocovariances of the process, a condition which restricts the choice of filtering weights on the innovations $\varepsilon_t$ in 1.1, because it is equivalent to
$$\sum_{l=-\infty}^{+\infty} \Big| \sum_{j=0}^{\infty} \alpha_j \alpha_{j+l} \Big| < \infty, \tag{1.8}$$
with the convention $\alpha_j = 0$, $j < 0$.
flooding of alluvial plains by the Nile river bringing prosperity to the region. A more definite account is provided by the particularly reliable measurements of the annual low levels of the Nile at the Roda Gauge collected between A.D. 622 and A.D. 1284 and appearing in Toussoun (1925) (the first missing observation is for year A.D. 1285, outside the sample chosen). Two characteristics of this series are consistent with non mixing behaviour: slow decay of sample autocorrelations and a sample mean with variance which decays at a markedly slower rate than $n^{-1}$ (for graphical assessments, see, e.g. Beran (1994) p. 22). Hurst (1951) gives a quantitative account of a phenomenon later named after him “Hurst effect” together with a heuristic approach to the measurement of the degree of temporal dependence associated with this effect. He defined the rescaled adjusted range or $R/S$ statistic, which is the standardised ideal capacity of a reservoir between a time origin and time $T$, and he observed a pattern consistent with the relation
$$E[R/S] \sim c T^{H} \quad \text{as } T \to \infty \text{ with } H > \tfrac{1}{2}, \tag{1.9}$$
where $\sim$ indicates that the ratio of the left hand side and the right hand side tends to one when $T$ tends to infinity, whereas $H$ (often called the self-similarity parameter) should be equal to $\tfrac{1}{2}$ if the river flow behaved like a process under any type of mixing assumption. In his pioneering work on stock prices, Mandelbrot (1973) identified the same type of phenomenon and related it to the self-similarity distributional property introduced by Kolmogorov (1940), by which the joint distribution of $x_{t_1}, \ldots, x_{t_n}$ is identical to $a^{-H}$ times the joint distribution of $x_{at_1}, \ldots, x_{at_n}$ for any $a > 0$, with the introduction of fractional Gaussian noise (in Mandelbrot and Van Ness (1968)), a Gaussian process with zero mean and autocovariances following
$$\mathrm{cov}(x_1, x_{1+j}) = \tfrac{1}{2} \mathrm{var}(x_1) \left\{ |j+1|^{2H} - 2|j|^{2H} + |j-1|^{2H} \right\}, \quad \text{for } j = 0, \pm 1, \ldots \tag{1.10}$$
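As a numerical illustration of 1.10, the following minimal Python sketch (the function names are hypothetical and the code is not part of the original analysis) evaluates the fractional Gaussian noise autocovariances and their partial sums. For $H = 1/2$ the autocovariances vanish at all nonzero lags, whereas for $H > 1/2$ the partial sums keep growing with the number of lags, which is the non summability associated with the Hurst effect.

```python
def fgn_autocov(j, hurst, variance=1.0):
    """Autocovariance at lag j of fractional Gaussian noise, following (1.10)."""
    j = abs(j)
    h2 = 2.0 * hurst
    return 0.5 * variance * (abs(j + 1) ** h2 - 2.0 * abs(j) ** h2 + abs(j - 1) ** h2)


def partial_sum_of_autocov(hurst, max_lag):
    """Sum of the autocovariances over lags 1, ..., max_lag (unit variance)."""
    return sum(fgn_autocov(j, hurst) for j in range(1, max_lag + 1))


if __name__ == "__main__":
    for h in (0.5, 0.7, 0.9):
        sums = [round(partial_sum_of_autocov(h, m), 3) for m in (10, 100, 1000)]
        print("H =", h, "partial sums over 10, 100, 1000 lags:", sums)
```

The sum over lags $1, \ldots, m$ telescopes to $\tfrac{1}{2}\{(m+1)^{2H} - m^{2H} - 1\}$, which diverges at rate $m^{2H-1}$ when $H > 1/2$.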
be consistent with the presence of one or more unit roots, but the same series also displayed a tendency for the spectrum of first differences to exhibit a trough at zero frequency... The large proportion of variance concentrated around zero frequency is indeed a translation in the frequency domain of non summability of autocovariances, but it need not indicate non stationarity, let alone the presence of a unit root.
The most popular model, besides fractional Gaussian noise in 1.10, which encompasses this distinction, is the autoregressive fractionally integrated moving average model, where
$$(1 - L)^{d_x} b(L) x_t = a(L) \varepsilon_t, \quad \text{with } -\tfrac{1}{2} < d_x < \tfrac{1}{2}, \tag{1.11}$$
where $a(z)$ and $b(z)$ are both finite order polynomials with zeros outside the unit circle in the complex plane. This model was proposed by Adenstedt (1974), Hosking (1981) and Granger and Joyeux (1980). $(1 - L)^d$ has a binomial expansion which is conveniently expressed in terms of the hypergeometric function
$$(1 - L)^d = F(-d, 1, 1; L) = \sum_{k=0}^{\infty} \Gamma(k - d) \Gamma(k + 1)^{-1} \Gamma(-d)^{-1} L^k \tag{1.12}$$
where $\Gamma(.)$ denotes the Gamma function. Writing
$$\alpha(z) = \sum_{j=0}^{\infty} \alpha_j z^j, \tag{1.13}$$
1.11 corresponds to a parametric representation nested in specification 1.1-1.2 with
$$\alpha(z) = (1 - z)^{-d_x} \frac{a(z)}{b(z)} \tag{1.14}$$
and when $\tfrac{1}{2} > d_x > 0$, this implies a slow decay of filtering weights
$$\alpha_j = O(j^{d_x - 1}) \quad \text{as } j \to \infty \tag{1.15}$$
and of autocorrelations
$$\frac{\sum_{i=0}^{\infty} \alpha_i \alpha_{i+j}}{\sum_{i=0}^{\infty} \alpha_i^2} \sim c j^{2d_x - 1} \quad \text{as } j \to \infty. \tag{1.16}$$
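The decay rate in 1.15 can be checked numerically in the simplest case $a(z) = b(z) = 1$. The sketch below (Python, with a hypothetical helper name) generates the coefficients of $(1 - z)^{-d}$ from the recursion $\alpha_0 = 1$, $\alpha_j = \alpha_{j-1}(j - 1 + d)/j$, which follows from the Gamma function ratio in 1.12 with $d$ replaced by $-d$, and compares $\alpha_j$ with $j^{d-1}/\Gamma(d)$.

```python
import math


def frac_weights(d, n):
    """First n coefficients of (1 - z)^(-d), via alpha_j = alpha_{j-1} (j - 1 + d) / j."""
    weights = [1.0]
    for j in range(1, n):
        weights.append(weights[-1] * (j - 1 + d) / j)
    return weights


if __name__ == "__main__":
    d = 0.3
    alpha = frac_weights(d, 20001)
    j = 20000
    # Ratio of alpha_j to j^(d-1) / Gamma(d); it should be close to one for large j.
    print("ratio to hyperbolic rate:", alpha[j] * j ** (1 - d) * math.gamma(d))
    # The sum of the weights keeps growing, the sum of squares settles down.
    print("sum of weights:", sum(alpha[:2000]), sum(alpha))
    print("sum of squares:", sum(a * a for a in alpha[:2000]), sum(a * a for a in alpha))
```

For $0 < d < 1/2$ the weights are therefore not summable, although they remain square summable.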
Hyperbolic decay of the filtering weights as in 1.15 implies that the latter decay so slowly as to be non summable even though they remain square summable for $d_x < \tfrac{1}{2}$.
In terms of impulse response, this implies a very long lived but non permanent response to shocks at any time in the series, consistent with very slow mean reversion in the process (a notion widely used in the economic literature, and which can be formally defined within this framework by a $d_x$ value strictly below one, distinguishing it thereby from a stationarity concept). Hyperbolic decay of autocorrelations translates into the existence of a singularity of hyperbolic nature in the spectral density of the process in the neighbourhood of frequency zero. Conditions for equivalence between time domain and frequency domain representations of long memory are discussed in Yong (1974). Therefore, throughout this work, long memory in a weakly stationary time series $x_t$, $t = 0, \pm 1, \ldots$, with autocovariances satisfying
$$\mathrm{cov}(x_t, x_{t+j}) = \int_{-\pi}^{\pi} f(\lambda) \cos(j\lambda) \, d\lambda, \quad j = 0, \pm 1, \ldots, \tag{1.17}$$
will be modelled semiparametrically by
$$f(\lambda) \sim L(\lambda) \lambda^{-2d_x} \quad \text{as } \lambda \to 0^+, \quad \text{with } -\tfrac{1}{2} < d_x < \tfrac{1}{2}, \tag{1.18}$$
where $L(\lambda) > 0$ and is continuous at $\lambda = 0$ when $d_x = 0$, and is otherwise a slowly varying function at zero defined by
$$\frac{L(t\lambda)}{L(\lambda)} \to 1 \quad \text{as } \lambda \to 0 \text{ for all } t > 0. \tag{1.19}$$
Under 1.18, $f(\lambda)$ has a pole at $\lambda = 0$ for $0 < d_x < \tfrac{1}{2}$ (when there is long memory in $x_t$), $f(\lambda)$ is positive and finite for $d_x = 0$ (which is identified with short memory in $x_t$) and $f(0) = 0$ for $-\tfrac{1}{2} < d_x < 0$ (which can be described as negative dependence or
Beran (1993) and Janacek (1993) with an appended quasi maximum likelihood estimation method for the degree of dependence. The spectral density in the most common version of this model is given by
$$f(\lambda) = |1 - e^{i\lambda}|^{-2d_x} \exp\Big\{ \sum_{j=0}^{p} \nu_j \cos(j\lambda) \Big\} \tag{1.20}$$
where the short memory part of the representation provides an arbitrarily accurate approximation of any positive function with a Fourier decomposition.
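A minimal sketch of how a density of the form 1.20 may be evaluated is given below (Python; the function name and the use of a finite list `nu` for the coefficients $\nu_0, \nu_1, \ldots$ are assumptions made for illustration). It also shows that $f(\lambda)\lambda^{2d_x}$ approaches $\exp(\sum_j \nu_j)$ near zero frequency, in line with 1.18.

```python
import cmath
import math


def fexp_spectral_density(lam, d, nu):
    """|1 - e^{i lam}|^(-2d) * exp(sum_j nu[j] cos(j lam)), with nu a finite list (assumed)."""
    long_memory = abs(1.0 - cmath.exp(1j * lam)) ** (-2.0 * d)
    short_memory = math.exp(sum(v * math.cos(j * lam) for j, v in enumerate(nu)))
    return long_memory * short_memory


if __name__ == "__main__":
    nu = [0.2, -0.1, 0.05]  # illustrative coefficients only
    for lam in (1e-1, 1e-2, 1e-3):
        scaled = fexp_spectral_density(lam, 0.3, nu) * lam ** 0.6
        print(lam, scaled, math.exp(sum(nu)))
```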
1.2 Sample mean of long memory processes
Long range dependence may also be detected through the behaviour of the sample mean of the process. Consider the partial sums
$$S_n = \sum_{t=1}^{n} x_t, \tag{1.21}$$
with variance $\sigma_n^2 = E|S_n|^2$, and suppose $E x_t = 0$ without loss of generality. As noted by Robinson (1994d), $\sigma_n^2$ always exists and is equal to $2\pi n$ times the Cesaro sum, to $n - 1$ terms, of the Fourier series of $f(\lambda)$, spectral density of the process $x_t$. Therefore, if $f$ is continuous at the origin,
$$\frac{\sigma_n^2}{n} \to 2\pi f(0) \quad \text{as } n \to \infty.$$
Thus, if $f(0) \ne 0$ is estimated consistently by $\hat{f}(0)$, the Central Limit Theorem and Slutzky's Theorem yield
$$S_n \big(2\pi n \hat{f}(0)\big)^{-1/2} \to_d N(0, 1) \quad \text{as } n \to \infty. \tag{1.22}$$
See Hannan (1979) for a proof of 1.22 for a stationary $x_t$ following 1.1 with i.i.d. innovations $\varepsilon_t$ (this can be extended to conditionally homoscedastic and uniformly integrable martingale differences) and
$$\sum_{j=0}^{\infty} |\alpha_j| < \infty. \tag{1.23}$$
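The limit $\sigma_n^2 / n \to 2\pi f(0)$ can be illustrated on a short memory example. The sketch below (Python, hypothetical function names) uses the identity $\sigma_n^2 = \sum_{|j| < n} (n - |j|) \gamma(j)$, valid for any covariance stationary process with autocovariances $\gamma(j)$, and an AR(1) process with coefficient $a$ and unit innovation variance, chosen here purely as an example satisfying 1.23, for which $2\pi f(0) = (1 - a)^{-2}$.

```python
def partial_sum_variance(autocov, n):
    """Var(S_n) = sum over |j| < n of (n - |j|) * gamma(j), with gamma given as a function."""
    return n * autocov(0) + 2.0 * sum((n - j) * autocov(j) for j in range(1, n))


def ar1_autocov(a, innovation_var=1.0):
    """Autocovariance function of a stationary AR(1) with coefficient |a| < 1."""
    return lambda j: innovation_var * a ** abs(j) / (1.0 - a * a)


if __name__ == "__main__":
    a = 0.5
    gamma = ar1_autocov(a)
    for n in (10, 100, 1000):
        print(n, partial_sum_variance(gamma, n) / n, 1.0 / (1.0 - a) ** 2)
```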
Eicker (1956) gives consistent estimates of the limiting variance of the LSE and other simple estimates of parametric models in the presence of parametric and nonparametric disturbance autocorrelation. 1.22 implies a convergence of the sample mean at the rate $n^{-1/2}$, a feature which generally fails when 1.23 does not hold. When the filtering weights $\alpha_j$ are not absolutely summable, but follow 1.15, with long memory parameter $d_x > 0$, the $\sigma_n^2$ need to be scaled with a factor $n^{-(2d_x + 1)}$ instead of $n^{-1}$ to converge. Moreover, the sample mean is no longer BLUE (Adenstedt (1974)). Samarov and Taqqu (1988) found that efficiency can be poor for $d < 0$ but is at least 0.98 for $d > 0$. Providing $\sigma_n^2$ does not have a finite limit, a central limit theorem continues to hold when $x_t$ follows 1.1 with i.i.d. innovations (this can be relaxed to martingale difference innovations) in the form:
$$\sigma_n^{-1} S_n \to_d N(0, 1) \quad \text{with } n^{\frac{1}{2} + d_x} \sigma_n^{-1} \to c > 0, \tag{1.24}$$
in Ibragimov and Linnik (1971). Giraitis and Surgailis (1986) use the Appell generalisation of Hermite polynomials to extend the result 1.24 to nonlinear functions of processes satisfying 1.1 with i.i.d. innovations.
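The change of scaling from $n^{-1}$ to $n^{-(2d_x + 1)}$ can be seen in a sketch for the simplest long memory case, $x_t = (1 - L)^{-d} \varepsilon_t$ with unit innovation variance (Python, hypothetical function names). The autocovariances used are the standard ones for this process, $\gamma(0) = \Gamma(1 - 2d)/\Gamma(1 - d)^2$ and $\gamma(j) = \gamma(j-1)(j - 1 + d)/(j - d)$; they are quoted here as an assumption rather than derived in the text.

```python
import math


def fractional_noise_autocov(d, max_lag):
    """Autocovariances gamma(0..max_lag) of (1 - L)^(-d) applied to unit variance white noise."""
    gamma = [math.gamma(1.0 - 2.0 * d) / math.gamma(1.0 - d) ** 2]
    for j in range(1, max_lag + 1):
        gamma.append(gamma[-1] * (j - 1 + d) / (j - d))
    return gamma


def scaled_partial_sum_variance(gamma, n, d):
    """Var(S_n) / n^(2d + 1), with Var(S_n) = sum over |j| < n of (n - |j|) gamma(j)."""
    var_sn = n * gamma[0] + 2.0 * sum((n - j) * gamma[j] for j in range(1, n))
    return var_sn / n ** (2.0 * d + 1.0)


if __name__ == "__main__":
    d = 0.3
    gamma = fractional_noise_autocov(d, 4000)
    for n in (500, 1000, 2000, 4000):
        scaled = scaled_partial_sum_variance(gamma, n, d)
        # Second column stabilises; third column (Var(S_n) / n) keeps growing.
        print(n, scaled, scaled * n ** (2.0 * d))
```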
1.3 Conditional heteroscedasticity
Long memory therefore provides a framework for a very parsimonious representation of temporal dependence, in that the long range dependence is embodied in the one parameter $d_x$. To derive asymptotic distributional results for processes with strong temporal dependence typically outside the scope of any mixing assumption, the approach chosen here relies on a martingale difference or non predictability assumption for the Wold innovations $\varepsilon_t$ of the process. In other words, letting $\mathcal{F}_t$ denote the filtration associated with the $\sigma$-field of events generated by $(\varepsilon_s, s \le t)$, one needs to assume the innovations are martingale differences:
$$E(\varepsilon_t \mid \mathcal{F}_{t-1}) = 0 \quad \text{almost surely (a.s.).} \tag{1.25}$$
(1979)'s, require the assumption of constant conditional variance
$$E(\varepsilon_t^2 \mid \mathcal{F}_{t-1}) = \sigma^2 \quad \text{a.s.} \tag{1.26}$$
that many time series, in particular long financial time series where a large degree of temporal dependence is (and can be) observed, are generally believed to violate. Financial returns, constructed from first differenced logged asset prices or foreign exchange bank quote midpoints sampled at weekly, daily or intra-daily frequencies, typically exhibit thick-tailed distributions and volatility clustering, i.e. conditional variances changing over time in such a way that periods of high movement are followed by periods displaying the same characteristic, and periods of low movement also. One therefore needs to allow for time varying volatilities for the innovations, and 1.26 needs to be replaced by
$$E(\varepsilon_t^2 \mid \mathcal{F}_{t-1}) = \sigma_t^2 \quad \text{a.s.} \tag{1.27}$$
where $\sigma_t^2$ is a stochastic process whose temporal dependence properties can in turn be considered. The conditional variance $\sigma_t^2$ can be allowed to depend on some latent structure, as in the model due to Taylor (1980):
$$\varepsilon_t = \eta_t \sigma_t,$$
$$\log \sigma_t = \gamma_0 + \gamma_1 \log \sigma_{t-1} + u_t,$$
$$\eta_t, u_t \text{ independent i.i.d.}$$
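Volatility clustering in a model of this latent type can be visualised by simulation. The following is a minimal sketch (Python) under assumptions that go beyond the text: $\eta_t$ and $u_t$ are taken to be Gaussian, $|\gamma_1| < 1$, and the function names and parameter values are hypothetical. The returns themselves are serially uncorrelated, while their squares are positively autocorrelated.

```python
import math
import random


def simulate_taylor_sv(n, gamma0, gamma1, sigma_u, seed=0):
    """Simulate eps_t = eta_t * sigma_t with log sigma_t = gamma0 + gamma1 log sigma_{t-1} + u_t.

    Assumes standard Gaussian eta_t, Gaussian u_t with standard deviation sigma_u, |gamma1| < 1.
    """
    rng = random.Random(seed)
    log_sigma = gamma0 / (1.0 - gamma1)  # start at the stationary mean of log sigma_t
    returns = []
    for _ in range(n):
        log_sigma = gamma0 + gamma1 * log_sigma + sigma_u * rng.gauss(0.0, 1.0)
        returns.append(rng.gauss(0.0, 1.0) * math.exp(log_sigma))
    return returns


def autocorr(x, lag):
    """Sample autocorrelation of the sequence x at the given lag."""
    n = len(x)
    mean = sum(x) / n
    var = sum((v - mean) ** 2 for v in x) / n
    cov = sum((x[t] - mean) * (x[t + lag] - mean) for t in range(n - lag)) / n
    return cov / var


if __name__ == "__main__":
    eps = simulate_taylor_sv(20000, 0.0, 0.95, 0.1)
    squares = [e * e for e in eps]
    print("lag 1 autocorrelation of returns:", autocorr(eps, 1))
    print("lag 1 autocorrelation of squared returns:", autocorr(squares, 1))
```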
(1991), and some nonlinearities were introduced by Sentana (1995) with an extensive study of quadratic ARCH models and by Zakoian (1995) with the threshold ARCH class of models. An extensive review of the literature in this field of econometric research is given by Bollerslev, Engle, and Nelson (1994). All of the above are based on a parameterisation of the one step-ahead forecast density, a particularly appealing feature, as pointed out by Shephard (1996), as much of finance theory is concerned with one step-ahead moments or distributions defined with respect to the economic agent's information. Asymptotic theory for parametric ARCH modelling was proposed by Weiss (1986), Lee and Hansen (1994), Lumsdaine (1996) and Newey and Steigerwald (1994). Bollerslev, Chou, and Kroner (1992) give reviews of the GARCH modelling approach. A nonparametric specification encompassing both ARCH and GARCH as special cases was proposed by Robinson (1991b) where $\sigma_t^2$ is an infinite sum of lagged values of $\varepsilon_t^2$:
$$\sigma_t^2 = \sigma^2 + \sum_{j=1}^{\infty} \psi_j (\varepsilon_{t-j}^2 - \sigma^2) \quad \text{a.s. with } \sum_{j=1}^{\infty} \psi_j^2 < \infty. \tag{1.28}$$
This can be reparameterised as
$$\sigma_t^2 = \rho + \sum_{j=1}^{\infty} \psi_j \varepsilon_{t-j}^2 \tag{1.29}$$
and includes both standard ARCH (when $\psi_j = 0$, $j > p$, for finite $p$) and GARCH (for which the $\psi_j$ decay exponentially) models. However, as Robinson (1991b) indicated, long memory behaviour is also covered. This, and the semi-strong ARMA representation for the squared innovations implied by the above specification, is made apparent in the following reparameterisation. If, for complex valued $z$,
$$\psi(z) = 1 - \sum_{j=1}^{\infty} \psi_j z^j \tag{1.30}$$
satisfies
$$|\psi(z)| \ne 0, \quad |z| \le 1, \tag{1.31}$$
define
$$\phi(z) = \sum_{j=0}^{\infty} \phi_j z^j = \psi(z)^{-1}, \quad \phi_0 = 1. \tag{1.32}$$
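The passage from the autoregressive weights $\psi_j$ in 1.30 to the moving average weights $\phi_j$ in 1.32 amounts to a power series inversion: matching coefficients in $\phi(z)\psi(z) = 1$ gives $\phi_0 = 1$ and $\phi_k = \sum_{j=1}^{k} \psi_j \phi_{k-j}$. A minimal sketch follows (Python, hypothetical function name), illustrated with the exponentially decaying weights $\psi_j = a b^{j-1}$ of a GARCH(1,1) recursion $\sigma_t^2 = \omega + a \varepsilon_{t-1}^2 + b \sigma_{t-1}^2$, for which the inversion yields $\phi_k = a(a + b)^{k-1}$, $k \ge 1$.

```python
def invert_psi(psi, n):
    """Coefficients phi_0..phi_{n-1} of 1 / psi(z), where psi(z) = 1 - sum_{j>=1} psi[j-1] z^j."""
    phi = [1.0]
    for k in range(1, n):
        # Coefficient of z^k in phi(z) * psi(z) = 1 gives phi_k = sum_j psi_j phi_{k-j}.
        phi.append(sum(psi[j - 1] * phi[k - j] for j in range(1, min(k, len(psi)) + 1)))
    return phi


if __name__ == "__main__":
    a, b = 0.1, 0.8  # illustrative GARCH(1,1) parameters
    psi = [a * b ** (j - 1) for j in range(1, 51)]
    phi = invert_psi(psi, 10)
    for k in range(1, 10):
        print(k, phi[k], a * (a + b) ** (k - 1))
```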
Then, Robinson (1991b) rewrote 1.28 as
$$\varepsilon_t^2 - \sigma^2 = \sum_{j=0}^{\infty} \phi_j \nu_{t-j} \tag{1.33}$$
where
$$\nu_t = \varepsilon_t^2 - \sigma_t^2 \tag{1.34}$$
satisfies
$$E(\nu_t \mid \mathcal{F}_{t-1}) = 0 \quad \text{a.s.,} \tag{1.35}$$
by construction. As a result, the chosen specification does not include all weak GARCH processes as defined by Drost and Nijman (1993) as processes with the same linear projections as ordinary GARCH. However, as for weak ARMA processes, limiting distribution theory for weak GARCH processes, provided, for instance, by Francq and Zakoian (1997), relies on mixing assumptions which may preclude the high levels of temporal dependence in the squares which are allowed by 1.33 with a suitable choice of filtering weights. To allow for specific types of nonlinearities in the squares, Robinson (1991b) also proposed a quadratic version of 1.28:
$$\sigma_t^2 = \Big( \sigma + \sum_{j=1}^{\infty} \psi_j \varepsilon_{t-j} \Big)^2 \quad \text{a.s.} \tag{1.36}$$
(Nelson (1990b)), it corresponds to persistence of shocks on both forecast moments of $\sigma_t^2$ and on forecast distributions of $\sigma_t^2$. A long memory representation of volatility, replacing for instance the unit root by a fractional filter in the equation for the squares, reconciles a high degree of temporal dependence in volatilities with lack of persistence and, possibly, with covariance stationarity. Denoting $s_t = \sigma_t^2 - \sigma^2$ and $\chi_t = \varepsilon_t^2 - \sigma^2$, for $l > 0$, we have
$$s_{t+l} = \psi_l \chi_t + \sum_{j \ne l} \psi_j \chi_{t-j+l} \quad \text{a.s.} \tag{1.37}$$
Now as under 1.28, $\psi_l \chi_t \to 0$ almost surely when $l \to \infty$, $\chi$ is persistent in the volatility according to none of the definitions adopted by Nelson (1990b), i.e. persistence in probability, in $L^p$-norm or almost surely.
Besides, the analogy is apparent between the clustering of volatilities of financial returns and what Mandelbrot (1973) described as “Joseph effect”. And, indeed, Whistler (1990), Lo (1991), Ding, Granger, and Engle (1993) and Lee and Robinson (1996) are among the first to show how well the long memory representation performs empirically. A general fractionally integrated GARCH model is obtained as a special case of specification 1.28 with the $\phi(z)$ polynomial defined as
$$\phi(z) = (1 - z)^{-d_\varepsilon} \frac{b(z)}{a(z)} \tag{1.38}$$
for $0 < d_\varepsilon < \tfrac{1}{2}$ and finite order polynomials $a(z)$ and $b(z)$ whose zeros lie outside the unit circle in the complex plane. Note that the degree of fractional integration is called $d_\varepsilon$ in this case to distinguish long memory in the squared innovations from long memory in the levels. Baillie, Bollerslev, and Mikkelsen (1996) apply 1.38 to asset prices with the addition of a drift parameter
$$(1 - L)^{d_\varepsilon} a(L) \varepsilon_t^2 = \mu + b(L) \nu_t. \tag{1.39}$$
Nelson (1990b) proves almost sure convergence of the conditional variance $\sigma_t^2$ in the short memory case $d_\varepsilon = 0$ in 1.38 with $a(z)$ and $b(z)$ of degree one. Apart from 1.38, the requirement
$$0 < \sum_{j=0}^{\infty} \phi_j^2 < \infty \tag{1.40}$$
includes the other traditional long memory specification of moving average coefficients, the fractional noise case with autocorrelations satisfying
$$\mathrm{corr}(\varepsilon_t^2, \varepsilon_{t+j}^2) = \frac{\sum_{i=0}^{\infty} \phi_i \phi_{i+j}}{\sum_{i=0}^{\infty} \phi_i^2} = \tfrac{1}{2} \left\{ |j-1|^{2d_\varepsilon + 1} - 2|j|^{2d_\varepsilon + 1} + |j+1|^{2d_\varepsilon + 1} \right\}. \tag{1.41}$$
Robinson (1991b) developed Lagrange multiplier tests for no-ARCH against alternatives consisting of general finite parameterisation of 1.28, specialising to 1.38 and 1.41. In both these cases, the autoregressive weights $\psi_j$ satisfy
$$\sum_{j=1}^{\infty} |\psi_j| < \infty. \tag{1.42}$$
Under 1.42 and
$$\max_t E(\varepsilon_t^4) < \infty, \tag{1.43}$$
it follows that
$$E(\nu_t^2) \le \cdots \le K, \tag{1.44}$$
where $K$ is a generic constant, so the innovations in 1.33 are square integrable martingale differences, $\varepsilon_t^2$ is well defined as a covariance stationary process and its autocorrelations can exhibit the usual long memory structure implied by 1.38 or 1.41. Even if 1.43 does not hold, the “autocorrelations” $\sum_{i=0}^{\infty} \phi_i \phi_{i+j} / \sum_{i=0}^{\infty} \phi_i^2$ are well defined under 1.40. Both parametric representations 1.38 and 1.41 have the implication that autocorrelations follow
$$\frac{\sum_{i=0}^{\infty} \phi_i \phi_{i+j}}{\sum_{i=0}^{\infty} \phi_i^2} \sim c j^{2d_\varepsilon - 1} \quad \text{as } j \to \infty, \tag{1.45}$$
which in turn implies a rate of decay for the innovations filtering weights of
$$\phi_j = O(j^{d_\varepsilon - 1}) \quad \text{as } j \to \infty. \tag{1.46}$$
This is taken as a characterisation of long memory in the process when $d_\varepsilon > 0$ and it implies nonsummability of weights $\phi_j$ and autocovariances
The rate of convergence of the sample mean is also characteristic of long memory processes when $\phi_j$ satisfies 1.46. Indeed, 1.35 and 1.44 imply that the partial sums of the squared innovations have variance
$$\mathrm{Var}\Big[ \sum_{t=1}^{n} (\varepsilon_t^2 - \sigma^2) \Big] = \sum_{s,t=1}^{n} \sum_{j=0}^{\infty} \phi_j \phi_{j+s-t} E(\nu_{t-j}^2) \tag{1.48}$$
$$= O\Big( \sum_{t=1}^{n} \Theta_t^2 \Big) \tag{1.49}$$
with
$$\Theta_t := \sum_{j=0}^{t} |\phi_j| = O(t^{d_\varepsilon}) \tag{1.50}$$
under 1.46. Therefore, we have the same rate upper bound as in 1.24, i.e.
$$\sum_{t=1}^{n} (\varepsilon_t^2 - \sigma^2) = O_p(n^{d_\varepsilon + \frac{1}{2}}). \tag{1.51}$$
This result, and nonsummability of the $\phi_j$'s, is to be contrasted with standard latent ARMA representations for the squares, where weights decay exponentially and are, therefore, absolutely summable. In view of the empirical evidence and the focus on possible long memory in financial returns $\varepsilon_t$, it seems appropriate to allow for possible long memory in the $\varepsilon_t^2$ also. This thesis is concerned with the estimation of temporal dependence structures in a covariance stationary time series $x_t$ via the analysis of its spectral density in a neighbourhood of zero frequency. This concerns the case where $x_t$ displays short memory as well as the cases where $x_t$ displays long memory; and a large part of the results provide asymptotic theory in case the squared innovations possibly exhibit long memory themselves.
1.4 Estimating dependence
Estimating the degree of dependence and carrying out inference on the process $x_t$
1.4.1 Parametric estimation of long memory
Take $x_t$ to be a covariance stationary series with mean $\mu_0$ and spectral density $f(\lambda; \theta_0)$, $-\pi < \lambda \le \pi$, where $f$ is a given function of $\lambda$ and $\theta$. For a realisation of size $n$, we consider the discrete Fourier transform
$$w_x(\lambda) = (2\pi n)^{-1/2} \sum_{t=1}^{n} x_t e^{it\lambda} \tag{1.52}$$
and the periodogram
$$I_x(\lambda) = |w_x(\lambda)|^2. \tag{1.53}$$
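Definitions 1.52 and 1.53 translate directly into code. The following minimal sketch (Python, standard library only, hypothetical function name) evaluates the periodogram by direct summation, which costs $O(n)$ operations per frequency; a fast Fourier transform would normally be preferred for long series. At the frequencies $2\pi j / n$, $j = 0, \ldots, n - 1$, the periodogram ordinates add up to $\sum_t x_t^2 / (2\pi)$, which gives a convenient check.

```python
import cmath
import math


def periodogram(x, lam):
    """I_x(lam) = |w_x(lam)|^2 with w_x(lam) = (2 pi n)^(-1/2) sum_{t=1}^n x_t exp(i t lam)."""
    n = len(x)
    dft = sum(x_t * cmath.exp(1j * t * lam) for t, x_t in enumerate(x, start=1))
    return abs(dft) ** 2 / (2.0 * math.pi * n)


if __name__ == "__main__":
    x = [1.0, -2.0, 3.0, 0.5, -1.5, 2.0, 0.0, 1.0]  # illustrative data only
    n = len(x)
    ordinates = [periodogram(x, 2.0 * math.pi * j / n) for j in range(n)]
    print(sum(ordinates), sum(v * v for v in x) / (2.0 * math.pi))
```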
This statistic was first proposed by Schuster (1898) to investigate hidden periodic ities in tim e series. A useful general result, under various regularity conditions, is th e following1:
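$$\sqrt{n}\int_{-\pi}^{\pi}\zeta(\lambda)\,[I_x(\lambda)-f(\lambda;\theta_0)]\,d\lambda \to_d N(0, A(\zeta)+B(\zeta)) \quad\text{as } n\to\infty, \qquad (1.54)$$
where
$$A(\zeta) = [\ldots] \qquad (1.55)$$
$$B(\zeta) = [\ldots] \qquad (1.56)$$
and where
$$f_4(\lambda,\mu,\nu) = (2\pi)^{-3}\sum_{u,v,w}[\ldots] \qquad (1.57)$$
is the fourth order cumulant spectral density, and
$$c_{u,v,w} = \mathrm{cum}(x_t, x_{t+u}, x_{t+v}, x_{t+w}) \qquad (1.58)$$
is the fourth order cumulant moment of the process $x_t$.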
¹In case $x_t$ are residuals from a fitted parametric model, taking $\mu_0 = 0$ does not, under regularity conditions (including conditions on the function $\zeta(\lambda)$), affect the asymptotic properties of the
When $x_t$ is Gaussian, $B(\zeta)$ vanishes, and under suitable regularity conditions on $\zeta(\lambda)$ and $f(\lambda;\theta)$, Fox and Taqqu (1986) show that 1.54 holds even when $x_t$ is strongly autocorrelated, provided a pole of $f(\lambda;\theta)$ is matched by a zero of $\zeta(\lambda)$ of suitable order. Fox and Taqqu (1986) showed as a result that Whittle's estimate of $\theta_0$ (Whittle (1962), Hannan (1973)), i.e. an estimate resulting from the minimization of
is asymptotically normal with rate $n^{-1/2}$. Beran (1986) and Dahlhaus (1989) extend this result to prove that the Gaussian maximum likelihood estimate of $\theta_0$ remains efficient in the Cramér-Rao sense when $x_t$ is strongly autocorrelated. Robinson (1994d) shows that root-n asymptotic normality results also apply to the estimate resulting from the maximization of a discretised version of the Whittle likelihood
where $\lambda_j = 2\pi j/n$ are the harmonic frequencies.
When $x_t$ is possibly non Gaussian, the three estimates above (pseudo maximum likelihood, Whittle and discretized Whittle) continue to be root-n consistent and asymptotically normal under conditions involving weak autocorrelation (see Mann and Wald (1943), Whittle (1962), Hannan (1973) and Robinson (1978a)). Solo (1989) shows 1.54 for $x_t$ satisfying 1.1 with 1.25 and restrictions on the Wold coefficients which include many strongly autocorrelated non Gaussian processes. In case $x_t$ is non Gaussian, $B(\zeta)$ does not vanish in general. An important case where this occurs is when $x_t$ follows 1.1 and 1.25 with dynamic conditional heteroscedasticity in the innovations as discussed in the previous section. In that case, $f_4(\lambda,-\mu,\mu)$ contains contributions from fourth cumulant moments of the innovations $\varepsilon_t$ other than $\kappa = \mathrm{cum}(\varepsilon_t,\varepsilon_t,\varepsilon_t,\varepsilon_t)$. Under 1.26, which imposes constant second and fourth conditional moments, we have
and zero otherwise. Under 1.27, with $\sigma_t^2$ defined by 1.28, however,
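$$\mathrm{cum}(\varepsilon_r,\varepsilon_s,\varepsilon_t,\varepsilon_u) = \kappa \quad\text{if } r = s = t = u, \qquad (1.61)$$
$$= \gamma_{r-s} \quad\text{if } r = t \neq s = u, \qquad (1.62)$$
$$= \gamma_{r-t} \quad\text{if } r = s \neq t = u, \qquad (1.63)$$
$$= \gamma_{r-s} \quad\text{if } r = u \neq t = s, \qquad (1.64)$$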
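and zero otherwise. Therefore, the fourth cumulant 1.58 is equal to
$$c_{u,v,w} = \kappa\sum_{k=0}^{\infty}\alpha_k\alpha_{k+u}\alpha_{k+v}\alpha_{k+w} \qquad (1.65)$$
$$+ \sum_{k\neq j}\gamma_{k-j}\big\{\alpha_j\alpha_{j+u}\alpha_{k+v}\alpha_{k+w} + \alpha_j\alpha_{j+v}\alpha_{k+u}\alpha_{k+w} + \alpha_j\alpha_{j+w}\alpha_{k+u}\alpha_{k+v}\big\} \qquad (1.66)$$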
and zero otherwise. The ARCH special case of 1.27 was considered by Weiss (1986), and the GARCH(1,1) case by Lee and Hansen (1994). Both show asymptotic normality of the quasi maximum likelihood estimate. Lumsdaine (1996) allows for nonstationarity of the integrated form in the conditional variance equation, but long term dependence is not covered for $x_t$.
1.4.2 Smoothed periodogram spectral estimation
Semiparametric alternatives in the estimation of the slope of the logged spectrum at the origin rely on specification 1.18. Local specification around the frequency of interest avoids the pitfall of parametric estimation of the long memory parameter $d_x$: a misspecified spectrum at non zero frequencies may cause inconsistency in estimation of the long memory parameter (characterizing the low frequency dynamics of the system). This type of estimation is based on low frequency harmonics of the periodogram 1.53 whose properties are briefly discussed in this paragraph.
is not a consistent estimate for the spectral density. More precisely, consider a stationary process $x_t$ following 1.1 with i.i.d. innovations $\varepsilon_t$ and filtering weights $\alpha_j$ satisfying
$$\sum_{j=0}^{\infty} j^{1/2}|\alpha_j| < \infty, \qquad (1.67)$$
and with spectral density defined as in 1.17. The periodogram of $x_t$ has the following asymptotic sampling properties:
$$\mathrm{cov}(I_x(\lambda), I_x(\mu)) = (1+\delta_{0\lambda}+\delta_{\pi\lambda})\,\delta_{\lambda\mu}\,\big(f(\lambda)^2 + O(n^{-1/2})\big) + O(n^{-1}), \qquad (1.68)$$
where $\delta$ stands for the Kronecker symbol. A proof of this result is given in Brockwell and Davis (1991). The same result is shown to hold by Brillinger (1975) for a strictly stationary process satisfying the mixing condition 1.7. 1.68 shows not only that periodogram ordinates are not mean-square consistent, but also that at distinct frequencies they are asymptotically uncorrelated under these conditions, which permits the construction of consistent estimates of the spectral density such as
$$\hat f(\lambda) = \sum_{|j|<m} W_n(j)\, I_x\Big(\lambda + \frac{2\pi j}{n}\Big)$$
where $m$ is a bandwidth sequence satisfying at least
$$\frac{1}{m} + \frac{m}{n} \to 0 \quad\text{as } n\to\infty, \qquad (1.69)$$
and $W_n(j)$ is a sequence of symmetric weight functions satisfying
$$\sum_{|j|<m} W_n(j) = 1 \quad\text{and}\quad \sum_{|j|<m} W_n(j)^2 \to 0 \quad\text{as } n\to\infty. \qquad (1.70)$$
Under 1.67 and 1.43, we have (see Brockwell and Davis (1991) for a proof)
$$\lim_{n\to\infty}\Big(\sum_{|j|<m} W_n(j)^2\Big)^{-1}\mathrm{cov}(\hat f(\lambda),\hat f(\mu)) = (1+\delta_{0\lambda}+\delta_{\pi\lambda})\,\delta_{\lambda\mu}\, f(\lambda)^2,$$
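The smoothed periodogram estimate described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the equal weights $W_n(j) = 1/(2m-1)$ for $|j| < m$ are a hypothetical choice (they sum to one and their sum of squares, $1/(2m-1)$, vanishes as $m$ grows), and the function names are not taken from the thesis.

```python
import cmath
import math


def periodogram_ordinate(x, lam):
    """I_x(lam) = |(2*pi*n)^(-1/2) * sum_{t=1}^{n} x_t * exp(i*t*lam)|^2."""
    n = len(x)
    w = sum(xt * cmath.exp(1j * lam * t) for t, xt in enumerate(x, start=1))
    return abs(w) ** 2 / (2.0 * math.pi * n)


def smoothed_spectral_estimate(x, k, m):
    """Equal-weight smoothed periodogram at lambda_k = 2*pi*k/n.

    Averages I_x(lambda_k + 2*pi*j/n) over |j| < m with weights 1/(2m - 1).
    Requiring k >= m keeps every ordinate away from zero frequency,
    where the periodogram depends on the mean of the series.
    """
    if m < 1 or k < m:
        raise ValueError("need 1 <= m <= k")
    n = len(x)
    weight = 1.0 / (2 * m - 1)
    return sum(
        weight * periodogram_ordinate(x, 2.0 * math.pi * (k + j) / n)
        for j in range(-(m - 1), m)
    )
```

Averaging over $2m-1$ approximately uncorrelated ordinates is what reduces the variance; the bandwidth condition 1.69 keeps the averaging window shrinking in frequency while the number of ordinates grows.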
1.4.3 Semiparametric estimation of long memory
The estimation strategy based on low frequency periodogram ordinates which is considered in this work is related to a strategy first proposed by Hill (1975) in tail estimation for distributions with a high degree of leptokurtosis. Research in that field was fuelled in the last couple of years by institutional regulations allowing banks to derive their own method of estimation for the probability of extreme losses. Hill's semiparametric approach to the estimation of the tail of distributions relies on a parametric specification of the tail of the distribution and a nonparametric treatment of the rest of the distribution. The probability distribution is said to feature a heavy tail if it behaves asymptotically like the Pareto distribution
$$P(Y > y) = y^{-\gamma}L(y), \quad \gamma > 0,\; y > 1, \qquad (1.71)$$
where $L(y)$ is a slowly varying function at infinity. The Hill estimate for the “tail index” $\gamma$ (which, similarly to the long memory parameter in the time domain or the frequency domain representations, appears as an exponent) maximises a conditional Pareto likelihood
where $Y_{(1)} > \ldots > Y_{(n)}$ are the order statistics of a sample of observations $Y_1, \ldots, Y_n$ and $m$ is the number of statistics used in the estimation, satisfying 1.69.
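For concreteness, the Hill estimate can be computed as below. The sketch uses the usual closed form of the estimate, $\hat\gamma^{-1} = m^{-1}\sum_{i=1}^{m}\log(Y_{(i)}/Y_{(m+1)})$, which is assumed here rather than copied from the text; the function name `hill_tail_index` is hypothetical, and observations are taken to be strictly positive.

```python
import math


def hill_tail_index(sample, m):
    """Hill estimate of the tail index gamma in P(Y > y) = y^(-gamma) L(y).

    Uses the m largest observations, measured relative to the (m+1)-th largest.
    All observations used must be strictly positive.
    """
    ordered = sorted(sample, reverse=True)  # Y_(1) >= ... >= Y_(n)
    if not 0 < m < len(ordered):
        raise ValueError("m must satisfy 0 < m < n")
    threshold = ordered[m]  # Y_(m+1)
    mean_log_excess = sum(math.log(y / threshold) for y in ordered[:m]) / m
    return 1.0 / mean_log_excess
```

As with the bandwidth in spectral estimation, the number $m$ of order statistics trades bias (large $m$ reaches into the body of the distribution) against variance (small $m$ uses few observations).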
Now suppose $x_t$ is weakly stationary and follows 1.18. One wishes to estimate the degree of long memory $d_x$ in a way that is robust to possible misspecification of the short range dynamics. A semiparametric estimate of $d_x$ was proposed by Künsch (1987) and will be dwelt upon in chapter 3 of this thesis. It is based on the Whittle likelihood discretisation 1.59, but the optimisation is realised over the first $m$ frequencies only, with $m$ satisfying 1.69, in accordance with the local specification 1.18. The function to minimise is
noting that $f(\lambda)$ is replaced by its local parametric form in the neighbourhood of zero frequency 1.18 with $0 < L(\lambda) = G < \infty$. The estimate is not defined in closed form, so that a prior consistency result (as in Robinson (1995a) under very weak local smoothness conditions for the spectral density, in addition to assumption 1.26) is necessary. Robinson (1995a) also proves asymptotic normality under
$$f(\lambda) = G\lambda^{-2d_x}(1 + O(\lambda^{\beta})) \quad\text{as } \lambda\to 0^+, \qquad (1.74)$$
for some $\beta \in (0,2]$, under slightly stronger local smoothness conditions and finite fourth moments for $x_t$ still satisfying 1.26. The proof of the result
$$\sqrt{m}\,(\hat d_x - d_x) \to_d N(0, \tfrac{1}{4}) \qquad (1.75)$$
assumes the following restrictions on the choice of “bandwidth” $m$,
$$\frac{1}{m} + \frac{m^{1+2\beta}(\log m)^2}{n^{2\beta}} \to 0 \quad\text{as } n\to\infty \qquad (1.76)$$
which perforce restricts the rate of convergence of the estimate. The latter will therefore be inefficient with respect to correctly specified parametric Whittle estimation, when $m = [(n-1)/2]$, which has $n^{-1/2}$ rate of convergence. Robinson also conjectures that the theorem still holds under the milder and more natural condition
$$\frac{1}{m} + \frac{m^{1+2\beta}}{n^{2\beta}} \to 0 \quad\text{as } n\to\infty. \qquad (1.77)$$
An asymptotic normality result for the estimate may still hold when $m$ is of exact order of magnitude $n^{2\beta/(2\beta+1)}$, corresponding to optimal smoothing. In that case, asymptotic bias will not be zero, in contrast to the cases of “oversmoothing” (cases where $m$ is small in order to avoid asymptotic bias): 1.77 and 1.76.
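A minimal Python sketch of the local Whittle estimate follows. The objective is not displayed in the text above, so the form used here is an assumption: it is the concentrated objective usually associated with Robinson (1995a), $R(d) = \log\big(m^{-1}\sum_{j=1}^{m}\lambda_j^{2d}I_x(\lambda_j)\big) - 2d\,m^{-1}\sum_{j=1}^{m}\log\lambda_j$. The function names, the search interval and the golden-section search are illustrative choices, not the thesis's implementation.

```python
import math


def local_whittle_objective(d, lams, ords):
    """R(d) = log(mean(lam_j^(2d) * I_j)) - 2d * mean(log lam_j) over the first m frequencies."""
    m = len(lams)
    g_hat = sum(lam ** (2.0 * d) * i for lam, i in zip(lams, ords)) / m
    return math.log(g_hat) - 2.0 * d * sum(math.log(lam) for lam in lams) / m


def local_whittle_estimate(lams, ords, lower=-0.49, upper=0.49, tol=1e-7):
    """Minimise R(d) over [lower, upper] by golden-section search.

    lams and ords hold the first m harmonic frequencies and periodogram ordinates.
    R(d) is convex in d (a log-sum-exp term plus a linear term), so the search
    converges to the minimiser on the interval.
    """
    ratio = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lower, upper
    c = b - ratio * (b - a)
    e = a + ratio * (b - a)
    fc = local_whittle_objective(c, lams, ords)
    fe = local_whittle_objective(e, lams, ords)
    while b - a > tol:
        if fc < fe:
            # minimiser lies in [a, e]
            b, e, fe = e, c, fc
            c = b - ratio * (b - a)
            fc = local_whittle_objective(c, lams, ords)
        else:
            # minimiser lies in [c, b]
            a, c, fc = c, e, fe
            e = a + ratio * (b - a)
            fe = local_whittle_objective(e, lams, ords)
    return 0.5 * (a + b)
```

Under 1.75 the limiting variance is $1/4$ whatever the values of $G$ and $d_x$, so an approximate standard error for the estimate is simply $1/(2\sqrt{m})$.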
available for stationary processes with spectral densities satisfying 1.74. Giraitis, Robinson, and Samarov (1997) give a rate optimality theory but no lower bound for the asymptotic variance of estimates achieving that rate. Chapter 4 will be concerned with optimal smoothing and optimal bandwidth selection.
Other estimates of $d_x$ following the same estimation principle are the log periodogram estimate proposed by Geweke and Porter-Hudak (1983), the averaged periodogram estimate proposed by Robinson (1994c) and the exponential estimate proposed by Janacek (1993). The log periodogram estimate is based on a regression of the first $m$ harmonics of the log periodogram on a simple function of frequency. An efficiency improving version of this estimate was proved in Robinson (1995b) to produce a consistent and asymptotically normal estimate of $d_x$, applying least squares to the regression
$$\log I_x(\lambda_j) = C - d_x(2\log\lambda_j) + U_j, \quad j = l+1,\ldots,m \qquad (1.78)$$
where $l$ is a trimming parameter which diverges at a slower rate than the bandwidth $m$. Hurvich, Deo, and Brodsky (1998) further show that for a slightly more specific local parameterisation, the original Geweke-Porter-Hudak estimate is also asymptotically normal and that no trimming of very low frequency harmonics is necessary. Janacek's estimate (Janacek (1993)) is a counterpart of the log periodogram estimate based on the fractional exponential model 1.20, but, to date, there seems to be no asymptotic theory for it. The averaged periodogram estimate proposed by Robinson (1994c) will be considered in Chapter 2 of this thesis in more depth. It is based on an analogy with the weak dependence case where averaging over approximately independent periodogram harmonics in a neighbourhood of zero frequency produces a consistent estimate of the spectral density at zero frequency. However, the asymptotic properties of low periodogram ordinates are considerably affected by long range dependence, and new results had to be derived. The idea of the log periodogram estimate is drawn from the identity 1.78 under 1.18, where $C = \log L(0) - E$, $E$ is Euler's constant $E = 0.5772\ldots$, and
assumptions (all of which include a form of weak dependence), that for $\lambda \neq 0$ modulo $\pi$ and $J$ a finite positive integer, $I_x(\lambda_j)$, $j = 1, \ldots, J$ are asymptotically independent $f(\lambda_j)\chi_2^2/2$ variates (see for instance Theorem 5.2.6 page 126 in Brillinger (1975) and Theorem 12 page 223 of Hannan (1970)). Asymptotic properties of the averaged periodogram estimate of the spectral density at zero frequency
$$\hat f(0) = \frac{1}{m}\sum_{j=1}^{m} I_x(\lambda_j), \qquad (1.79)$$
$m$ being the bandwidth following at least 1.69, and, more generally, of weighted periodogram spectral estimation (in Brillinger (1975)), follow from this asymptotic distributional result for ordinates of the periodogram under short memory.
For a process $x_t$ where the conditional homogeneity condition 1.26 fails (and therefore the conditions applied by Hannan (1970), who assumed the $\varepsilon_t$ to be i.i.d.), and is replaced by 1.27 with $\sigma_t^2$ defined by 1.28, the asymptotic distributional result for periodogram ordinates may not continue to hold, possibly because of non summable fourth cumulant contributions to asymptotic variances. In Chapter 2, it is proved that in spite of this, 1.79 remains an asymptotically normal estimate of $f(0)$ with a suitable choice of bandwidth $m$.
When $x_t$ displays long memory (it follows 1.18), the asymptotic distributional result continues to hold for fixed positive frequencies (see Rosenblatt (1981) and Yajima (1989)) but not for periodogram ordinates in a neighbourhood of zero, as documented by Künsch (1996), Hurvich and Beltrao (1993), Comte and Hardouin (1995), and Robinson (1995b). The periodogram ordinates $I_x(\lambda_j)$ are no longer independent or identically distributed when the sample size $n$ tends to infinity. In this setting, Theorem 2 of Robinson (1995b) gives a major result on asymptotic variance and correlations of low frequency periodogram ordinates which applies to the dependence structure considered in this thesis under 1.74: putting $v(\lambda) = w_x(\lambda)/(G^{1/2}\lambda^{-d_x})$, where $w_x(\lambda)$ is the discrete Fourier transform defined in 1.52, and $\sigma^2$ is the unconditional variance of the innovations to the process, we have
$$E[v(\lambda_j)v(\lambda_k)] = O([\ldots]). \qquad (1.81)$$
This result is instrumental to the proofs of the asymptotic properties of the log periodogram, the local Whittle and the averaged periodogram estimates of long memory, and it remains valid when the conditional homoscedasticity condition 1.26 is relaxed to 1.27 with $\sigma_t^2$ following 1.28. In this setting, Chapter 3 proves that the asymptotic normality result 1.75 continues to hold for the local Whittle estimate of long memory, and that it continues to hold with identical asymptotic variance so that no features of the ARCH structure defined by 1.28 or 1.36 enter. This result is due to additional smoothing of the periodogram via the slightly more stringent condition on the choice of bandwidth
$$m\log m = o(n^{1/2-d_\varepsilon}) \quad\text{as } n\to\infty \qquad (1.82)$$
which ensures that the contribution to the variance of the periodogram of the errors $\varepsilon_t$ from fourth cumulants 1.62-1.64 induced by long memory conditional heteroscedasticity is of small order of magnitude with regards to the suitable approximating martingale. This implicit effect of ARCH (restricting attainable rates of convergence for the estimates) is directly in contrast with parametric or adaptive estimation (see, e.g. Weiss (1986) and Kuersteiner (1997)) where ARCH-type behaviour directly affects limiting distributional properties.
This outcome (i.e. no explicit effect of ARCH) is especially desirable in the case of the local Whittle estimate. This is in the first place due to the simplicity of the limiting variance in 1.75, which is independent of $G$ and $d_x$. Moreover, although maximum likelihood estimation of parametric versions of 1.33 such as 1.38 or 1.41 is
not permit long memory, whereas long memory literature features either Gaussian processes (e.g. Fox and Taqqu (1986), Robinson (1995b)), non linear functions of Gaussian processes (e.g. Taqqu (1975)), linear functions of independently and identically distributed sequences (e.g. Giraitis and Surgailis (1990)), nonlinear functions of such linear filters (“Appell polynomials”, see Giraitis and Surgailis (1986)), as well as the model defined by 1.1, 1.2, 1.25 and 1.26. None of these approaches represents conditional heteroscedasticity in a martingale difference sequence.
1.5
Choice of bandwidth
It is apparent from the discussion above that the choice of bandwidth $m$, the number of periodogram ordinates used in the estimation procedure, is crucial in semiparametric estimation of long memory. It is crucial to both asymptotic distributional results and mean square optimality. Moreover, insofar as it determines from which point the practitioner starts to describe the behaviour of the series as asymptotic, bandwidth is central to the concept of long memory itself. In that regard, specifying the series only in the “asymptotic region” with a structure that does not impose itself on short run cycles, seems an intrinsically better approach, notwithstanding considerations of efficiency and robustness.
periodogram estimate in Lobato and Robinson (1996), Delgado and Robinson (1996), Delgado and Robinson (1994). The need for an optimality theory for the determination of bandwidth is therefore evident. Giraitis, Robinson, and Samarov (1997) show that for long memory estimates, in a similar way as for smoothed periodogram estimates, one cannot improve on a rate of convergence which depends on the local smoothness properties of the spectral density following specification 1.74. They further show that the log periodogram estimate of long memory in the form proposed by Robinson (1995b) attains this optimal rate of convergence. Under the more restrictive specification
$$f(\lambda) = |2\sin(\lambda/2)|^{-2d_x} f^*(\lambda) \qquad (1.83)$$
where $f^*(\lambda)$ is twice continuously differentiable and positive at $\lambda = 0$, Hurvich, Deo, and Brodsky (1998) give a precise expression for the mean squared error of the estimate and derive an optimal bandwidth formula. For spectral densities satisfying
$$f(\lambda) = L(\lambda)\lambda^{-2d_x}\big(1 + E_\beta\lambda^{\beta} + o(\lambda^{\beta})\big), \quad 0 < |E_\beta| < \infty,\; \beta \in (0,2], \qquad (1.84)$$
sub-sample bootstrap technique employed relies on the i.i.d. assumption for the observations, and does not seem to be readily extendible to strong dependence. One therefore needs to rely on Monte Carlo experiments to assess the quality of optimal bandwidth selection formulae, and it remains advisable to report a wide range of bandwidth choices in empirical applications.
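Reporting estimates over a range of bandwidths is straightforward to automate. The Python sketch below does this for the log periodogram regression discussed earlier: it regresses $\log I_x(\lambda_j)$ on $-2\log\lambda_j$ for $j = l+1, \ldots, m$, so that the slope estimates $d_x$ (the sign implied by a spectral density behaving like $G\lambda^{-2d_x}$ near zero). The function names and the `trim` argument, standing for $l$, are hypothetical.

```python
import math


def log_periodogram_estimate(lams, ords, m, trim=0):
    """Least squares slope of log I(lam_j) on -2*log(lam_j), j = trim+1..m.

    lams[0] and ords[0] correspond to j = 1.
    """
    if m > len(lams) or m - trim < 2:
        raise ValueError("need trim + 2 <= m <= number of available frequencies")
    regressor = [-2.0 * math.log(lam) for lam in lams[trim:m]]
    response = [math.log(i) for i in ords[trim:m]]
    r_bar = sum(regressor) / len(regressor)
    sxx = sum((r - r_bar) ** 2 for r in regressor)
    sxy = sum((r - r_bar) * y for r, y in zip(regressor, response))
    return sxy / sxx


def estimates_over_bandwidths(lams, ords, bandwidths, trim=0):
    """Return {m: estimate of d_x} for each bandwidth m in bandwidths."""
    return {m: log_periodogram_estimate(lams, ords, m, trim) for m in bandwidths}
```

Estimates that drift steadily as $m$ grows suggest that the larger bandwidths are reaching frequencies where the local specification no longer describes the spectrum well.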
1.6
Long memory in speculative returns
Fielitz (1977). This finding raises a number of questions on the effects long memory in returns may have on portfolio decision and on derivative pricing using martingale methods. However, the finding of Greene and Fielitz (1977) is challenged by Lo (1991) with a slightly more powerful analysis based on a modified form of the $R/S$ statistic. Lee and Robinson (1996) are the first to apply semiparametric methods to the measure of memory in stock price returns, and Lobato and Savin (1998) apply the Pitman-efficient test statistic developed in Lobato and Robinson (1998) to conclude with Lo (1991) that evidence of long memory in returns is spurious. They do, however, find strong evidence of long range dependence in the squared and absolute returns, as do Ding and Granger (1996). This refines the widely recognised stylised facts on conditionally heteroscedastic behaviour of financial returns (see Mandelbrot (1963) and Fama (1965) for a first description of the phenomenon) and reinforces the value of long memory estimation procedures robust to (possibly long memory) conditional heteroscedasticity when examining the long run predictability of returns.
(“event studies”) and, in particular, the long run effect of transactions on the price process (see for instance Lyons (1985), and Hasbrouck (1991)).
1.7
Synopsis
The following two chapters are concerned with the effect of possibly long memory conditional heteroscedasticity on semiparametric estimation of long memory.
Chapter 2 considers the averaged periodogram statistic for a linear process with possibly long memory in the innovations conditional variance. An asymptotic normality result is given for averaged periodogram estimation of finite and positive spectral densities at zero frequency. The proof is adapted from Robinson and Henry (1997). The robustness of the results in Robinson (1994c) regarding consistency of the averaged periodogram statistic in the presence of long memory is then shown and a Monte Carlo experiment assesses the effect of conditional heteroscedasticity in small sample averaged periodogram long memory estimation. The estimation of stationary cointegration is then discussed in this framework.
Chapter 3 presents the proofs of robustness to (possibly long memory) conditional heteroscedasticity of the consistency and asymptotic normality results for the local Whittle estimate of long memory in Robinson (1995a). A Monte Carlo study investigates the effect of conditional heteroscedasticity on local Whittle estimation of long memory in small samples. This chapter is based on joint research with Peter Robinson, appearing in Robinson and Henry (1997).
Henry and Robinson (1996).
Chapter 2
Averaged periodogram statistic
2.1
Introduction
This second chapter is concerned with the use of an averaged periodogram statistic proposed by Grenander and Rosenblatt (1966) to investigate temporal dependence in weakly dependent time series. The process $x_t$ considered is stationary and satisfies 1.1 and 1.2 with the martingale dependence assumption 1.25 on innovations $\varepsilon_t$.
The approach is semiparametric in the sense that $x_t$ is supposed to have spectral density $f(\lambda)$ satisfying the local specification 1.18 with $d_x \geq 0$; and the averaged periodogram statistic is used to investigate the behaviour of $f(\lambda)$ in a neighbourhood of zero frequency, estimating $f(0) = L(0)$ when $d_x = 0$ and estimating $d_x$ when the latter is strictly positive. Section 2 of this chapter presents issues and past results.
In the use of a semiparametric approach, one may have in mind estimating dependence in long financial data series. To that end, asymptotic properties of the averaged periodogram statistic need to be justified when there is a possibly high degree of temporal dependence in conditional variances.
to the case 1.27 with $\sigma_t^2$ defined by 1.28 corresponding to (possibly long memory) conditional heteroscedasticity in the innovations of a generalised linear process. Consistency of the averaged periodogram based estimate of $d_x > 0$ is proved with a specific rate of convergence by Robinson (1994c). Section 4 of this chapter extends the validity of the latter result to processes satisfying 1.27 with $\sigma_t^2$ following 1.28.
A simple corollary is the extended validity of a consistent estimate of stationary cointegration proposed by Robinson (1994c). This is presented in Section 5 of this chapter while Section 6 proposes an investigation of the effect of conditional heteroscedasticity in small samples. Section 7 concludes this chapter.
2.2
Averaged periodogram statistic
Let the discrete Fourier transform of a covariance stationary process $x_t$ be defined as in 1.52 and the periodogram $I_x(\lambda)$ as in 1.53. Define the averaged periodogram by
$$\hat F(\lambda) = \frac{2\pi}{n}\sum_{j=1}^{[n\lambda/2\pi]} I_x(\lambda_j) \qquad (2.1)$$
where $\lambda_j = 2\pi j/n$, $n$ is the sample size and $[x]$ denotes the largest integer smaller than or equal to $x$. Because $I_x(\lambda_j)$ is invariant to location shift, no mean correction is necessary for 2.1. $\hat F(\lambda)$ is a discrete analogue of the more widely documented continuously averaged periodogram (see Ibragimov (1963)) where 1.53 is replaced by its demeaned version. The estimate $\hat f(0) = \hat F(\lambda_m)/\lambda_m$ given in 1.79 was proposed for $f(0)$ by Grenander and Rosenblatt (1966) and is readily generalisable to a wide class of weighted periodogram spectral estimates defined below. Let $K(\lambda)$ be a bounded function satisfying
$$\int_{-\infty}^{\infty} K(\lambda)\,d\lambda = 1, \quad K(-\lambda) = K(\lambda). \qquad (2.2)$$
Defining
$$K_m(\lambda) = m\sum_{j=-\infty}^{\infty} K(m(\lambda + 2\pi j)) \qquad (2.3)$$
where $m$ is a positive integer called the bandwidth, weighted periodogram estimation of $f(0)$ is given by
$$\hat f_w(0) = \frac{2\pi}{n}\sum_{j=1}^{n-1} K_m(\lambda_j) I_x(\lambda_j). \qquad (2.4)$$
The class of kernel functions such that
$$K_m(\lambda) = 0 \quad\text{for } \lambda > \lambda_m \qquad (2.5)$$
provides a basis for estimation of $f(0)$ under specification 1.18 with $d_x = 0$. Supposing 1.69 is satisfied, a set of sufficient conditions for
$$\hat f_w(0) \to_p f(0) \quad\text{as } n\to\infty \qquad (2.6)$$
includes absolute summability of fourth cumulants
$$\sum_{h,i,j=-\infty}^{+\infty} |\mathrm{cum}(x_1, x_{1+h}, x_{1+i}, x_{1+j})| < \infty. \qquad (2.7)$$
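The averaged periodogram 2.1 and the estimate $\hat f(0) = \hat F(\lambda_m)/\lambda_m$ can be checked numerically. The Python sketch below is illustrative only: the function names are hypothetical, and the small constant added before taking the integer part is a guard against floating point rounding when $\lambda$ is itself a harmonic frequency, not part of the definition.

```python
import cmath
import math


def periodogram_ordinate(x, lam):
    """I_x(lam) = |(2*pi*n)^(-1/2) * sum_{t=1}^{n} x_t * exp(i*t*lam)|^2."""
    n = len(x)
    w = sum(xt * cmath.exp(1j * lam * t) for t, xt in enumerate(x, start=1))
    return abs(w) ** 2 / (2.0 * math.pi * n)


def averaged_periodogram(x, lam):
    """F_hat(lam) = (2*pi/n) * sum of I_x(lambda_j) over j = 1..[n*lam/(2*pi)]."""
    n = len(x)
    # integer part of n*lam/(2*pi); 1e-9 guards against values like 5.999999999
    upper = math.floor(n * lam / (2.0 * math.pi) + 1e-9)
    total = sum(periodogram_ordinate(x, 2.0 * math.pi * j / n) for j in range(1, upper + 1))
    return 2.0 * math.pi * total / n


def f0_estimate(x, m):
    """f_hat(0) = F_hat(lambda_m) / lambda_m, the mean of the first m ordinates."""
    lam_m = 2.0 * math.pi * m / len(x)
    return averaged_periodogram(x, lam_m) / lam_m
```

Dividing $\hat F(\lambda_m)$ by $\lambda_m = 2\pi m/n$ cancels the factor $2\pi/n$ and leaves the simple average of the first $m$ periodogram ordinates, which is the form given in 1.79.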
Suppose that a local Lipschitz condition is imposed on the spectral density in the form,
$$f(\lambda) = f(0)(1 + E_\beta\lambda^{\beta}) + o(\lambda^{\beta}) \quad\text{as } \lambda\to 0^+, \qquad (2.8)$$
with
$$\beta \in (0,2], \quad 0 < f(0) < \infty, \quad 0 < E_\beta < \infty,$$
and suppose the bandwidth $m$ satisfies 1.77. Under the conditions above, asymptotic normality of $\hat f(0)$ given by 1.79
$$m^{1/2}(\hat f(0) - f(0)) \to_d N(0, f(0)^2) \quad\text{as } n\to\infty \qquad (2.9)$$
occurs under the two following sets of sufficient conditions: Brillinger (1975), Theorem 5.4.3, page 136 assumes 1.7 and existence of all moments of $x_t$; Hannan (1970), Theorem 13, page 224, assumes that $x_t$ follows 1.1 with i.i.d. innovations. Hannan (1970), Theorem 13', page 227 also proves 2.9 under the uniform mixing condition