Texts in Statistical Science
Time Series Analysis
Henrik Madsen
Technical University of Denmark
Chapman & Hall/CRC
Taylor & Francis Group
6000 Broken Sound Parkway NW, Suite 300
Boca Raton, FL 33487-2742

© 2008 by Taylor & Francis Group, LLC
Chapman & Hall/CRC is an imprint of Taylor & Francis Group, an Informa business

No claim to original U.S. Government works
Printed in the United States of America on acid-free paper
10 9 8 7 6 5 4 3 2 1

International Standard Book Number-13: 978-1-4200-5967-0 (Hardcover)

This book contains information obtained from authentic and highly regarded sources. Reprinted material is quoted with permission, and sources are indicated. A wide variety of references are listed. Reasonable efforts have been made to publish reliable data and information, but the author and the publisher cannot assume responsibility for the validity of all materials or for the consequences of their use.

Except as permitted under U.S. Copyright Law, no part of this book may be reprinted, reproduced, transmitted, or utilized in any form by any electronic, mechanical, or other means, now known or hereafter invented, including photocopying, microfilming, and recording, or in any information storage or retrieval system, without written permission from the publishers.

For permission to photocopy or use material electronically from this work, please access www.copyright.com (http://www.copyright.com/) or contact the Copyright Clearance Center, Inc. (CCC), 222 Rosewood Drive, Danvers, MA 01923, 978-750-8400. CCC is a not-for-profit organization that provides licenses and registration for a variety of users. For organizations that have been granted a photocopy license by the CCC, a separate system of payment has been arranged.

Trademark Notice: Product or corporate names may be trademarks or registered trademarks, and are used only for identification and explanation without intent to infringe.

Library of Congress Cataloging-in-Publication Data

Madsen, Henrik, 1955-
Time series analysis / Henrik Madsen.
p. cm. -- (Chapman & Hall/CRC texts in statistical science series ; v. 72)
Includes bibliographical references and index.
ISBN 978-1-4200-5967-0 (hardback : alk. paper)
1. Time-series analysis. I. Title. II. Series.

QA280.M32 2007
519.5'5--dc22    2007036211

Visit the Taylor & Francis Web site at http://www.taylorandfrancis.com
and the CRC Press Web site at http://www.crcpress.com
Contents

Preface
Notation
1 Introduction
  1.1 Examples of time series
    1.1.1 Dollar to Euro exchange rate
    1.1.2 Number of monthly airline passengers
    1.1.3 Heat dynamics of a building
    1.1.4 Predator-prey relationship
  1.2 A first crash course
  1.3 Contents and scope of the book
2 Multivariate random variables
  2.1 Joint and marginal densities
  2.2 Conditional distributions
  2.3 Expectations and moments
  2.4 Moments of multivariate random variables
  2.5 Conditional expectation
  2.6 The multivariate normal distribution
  2.7 Distributions derived from the normal distribution
  2.8 Linear projections
  2.9 Problems
3 Regression-based methods
  3.1 The regression model
  3.2 The general linear model (GLM)
    3.2.1 Least squares (LS) estimates
    3.2.2 Maximum likelihood (ML) estimates
  3.3 Prediction
    3.3.1 Prediction in the general linear model
  3.4 Regression and exponential smoothing
    3.4.1 Predictions in the constant mean model
    3.4.2 Locally constant mean model and simple exponential smoothing
    3.4.3 Prediction in trend models
    3.4.4 Local trend and exponential smoothing
  3.5 Time series with seasonal variations
    3.5.1 The classical decomposition
    3.5.2 Holt-Winters procedure
  3.6 Global and local trend model - an example
  3.7 Problems
4 Linear dynamic systems
  4.1 Linear systems in the time domain
  4.2 Linear systems in the frequency domain
  4.3 Sampling
  4.4 The z-transform
  4.5 Frequently used operators
  4.6 The Laplace transform
  4.7 A comparison between transformations
  4.8 Problems
5 Stochastic processes
  5.1 Introduction
  5.2 Stochastic processes and their moments
    5.2.1 Characteristics for stochastic processes
    5.2.2 Covariance and correlation functions
  5.3 Linear processes
    5.3.1 Processes in discrete time
    5.3.2 Processes in continuous time
  5.4 Stationary processes in the frequency domain
  5.5 Commonly used linear processes
    5.5.1 The MA process
    5.5.2 The AR process
    5.5.3 The ARMA process
  5.6 Non-stationary models
    5.6.1 The ARIMA process
    5.6.2 Seasonal models
    5.6.3 Models with covariates
    5.6.4 Models with time-varying mean values
    5.6.5 Models with time-varying coefficients
  5.7 Optimal prediction of stochastic processes
    5.7.1 Prediction in the ARIMA process
  5.8 Problems
6 Identification, estimation, and model checking
  6.1 Introduction
  6.2 Estimation of covariance and correlation functions
    6.2.1 Autocovariance and autocorrelation function
    6.2.2 Cross-covariance and cross-correlation functions
  6.3 Identification
    6.3.1 Identification of the degree of differencing
    6.3.2 Identification of the ARMA part
    6.3.3 Cointegration
  6.4 Estimation of parameters in standard models
    6.4.1 Moment estimates
    6.4.2 The LS estimator for linear dynamic models
    6.4.3 The prediction error method
    6.4.4 The ML method for dynamic models
  6.5 Selection of the model order
    6.5.1 The autocorrelation functions
    6.5.2 Testing the model
    6.5.3 Information criteria
  6.6 Model checking
    6.6.1 Cross-validation
    6.6.2 Residual analysis
  6.7 Case study: Electricity consumption
  6.8 Problems
7 Spectral analysis
  7.1 The periodogram
    7.1.1 Harmonic analysis
    7.1.2 Properties of the periodogram
  7.2 Consistent estimates of the spectrum
    7.2.1 The truncated periodogram
    7.2.2 Lag- and spectral windows
    7.2.3 Approximative distributions for spectral estimates
  7.3 The cross-spectrum
    7.3.1 The co-spectrum and the quadrature spectrum
    7.3.2 Cross-amplitude spectrum, phase spectrum, coherence spectrum, gain spectrum
  7.4 Estimation of the cross-spectrum
  7.5 Problems
8 Linear systems and stochastic processes
  8.1 Relationship between input and output processes
    8.1.1 Moment relations
    8.1.2 Spectral relations
  8.2 Systems with measurement noise
  8.3 Input-output models
    8.3.1 Transfer function models
    8.3.2 Difference equation models
    8.3.3 Output error models
  8.4 Identification of transfer function models
  8.5 Multiple-input models
    8.5.1 Moment relations
    8.5.2 Spectral relations
    8.5.3 Identification of multiple-input models
  8.6 Estimation
    8.6.1 Moment estimates
    8.6.2 LS estimates
    8.6.3 Prediction error method
    8.6.4 ML estimates
    8.6.5 Output error method
  8.7 Model checking
  8.8 Prediction in transfer function models
    8.8.1 Minimum variance controller
  8.9 Intervention models
  8.10 Problems
9 Multivariate time series
  9.1 Stationary stochastic processes and their moments
  9.2 Linear processes
  9.3 The multivariate ARMA process
    9.3.1 Theoretical covariance matrix functions
    9.3.2 Partial correlation matrix
    9.3.3 q-conditioned partial correlation matrix
    9.3.4 VAR representation
  9.4 Non-stationary models
    9.4.1 The multivariate ARIMA process
    9.4.2 The multivariate seasonal model
    9.4.3 Time-varying models
  9.5 Prediction
    9.5.1 Missing values for some signals
  9.6 Identification of multivariate models
    9.6.1 Identification using pre-whitening
  9.7 Estimation of parameters
    9.7.1 Least squares estimation
    9.7.2 An extended LS method for multivariate ARMAX models (the Spliid method)
    9.7.3 ML estimates
  9.8 Model checking
  9.9 Problems
10 State space models of dynamic systems
  10.1 The linear stochastic state space model
  10.2 Transfer function and state space formulations
  10.3 Interpolation, reconstruction, and prediction
    10.3.1 The Kalman filter
    10.3.2 k-step predictions in state space models
    10.3.3 Empirical Bayesian description of the Kalman filter
  10.4 Some common models in state space form
    10.4.1 Signal extraction
  10.5 Time series with missing observations
    10.5.1 Estimation of autocorrelation functions
  10.6 ML estimates of state space models
  10.7 Problems
11 Recursive estimation
  11.1 Recursive LS
    11.1.1 Recursive LS with forgetting
  11.2 Recursive pseudo-linear regression (RPLR)
  11.3 Recursive prediction error methods (RPEM)
  11.4 Model-based adaptive estimation
  11.5 Models with time-varying parameters
    11.5.1 The regression model with time-varying parameters
    11.5.2 Dynamic models with time-varying parameters
12 Real life inspired problems
  12.1 Prediction of wind power production
  12.2 Prediction of the consumption of medicine
  12.3 Effect of chewing gum
  12.4 Prediction of stock prices
  12.5 Wastewater treatment: Using root zone plants
  12.6 Scheduling system for oil delivery
  12.7 Warning system for slippery roads
  12.8 Statistical quality control
  12.9 Wastewater treatment: Modeling and control
  12.10 Sales numbers
  12.11 Modeling and prediction of stock prices
  12.12 Adaptive modeling of interest rates
Appendix A The solution to difference equations
Appendix B Partial autocorrelations
Preface

The aim of this book is to give an introduction to time series analysis. The emphasis is on methods for modeling of linear stochastic systems. Both time domain and frequency domain descriptions will be given; however, emphasis is on the time domain description. Due to the highly different mathematical approaches needed for linear and non-linear systems, it is instructive to deal with them in separate textbooks, which is why non-linear time series analysis is not a topic in this book; instead the reader is referred to Madsen, Holst, and Lindström (2007).
Theorems are used to emphasize the most important results. Proofs are given only when they clarify the results. Small problems are included at the end of most chapters, and a separate chapter with real-life problems is included as the final chapter of the book. This also serves as a demonstration of the many possible applications of time series analysis in areas such as physics, engineering, and econometrics.
During the sequence of chapters, more advanced stochastic models are gradually introduced; with this approach, the family of linear time series models and methods is put into a clear relationship. Following an initial chapter covering static models and methods such as the use of the general linear model for time series data, the rest of the book is devoted to stochastic dynamic models which are mostly formulated as difference equations, as in the famous ARMA or vector ARMA processes. It will be obvious to the reader of this book that even knowing how to solve difference equations becomes important for understanding the behavior of important aspects such as the autocovariance functions and the nature of the optimal predictions.
The important concept of time-varying systems is dealt with using a state space approach and the Kalman filter. However, the strength of also using adaptive estimation methods for on-line forecasting and control is often not adequately recognized. For instance, in finance the classical methods for forecasting are often not very useful, but, by using adaptive techniques, interesting results are often obtained.
The last chapter of this book is devoted to problems inspired by real life. Solutions to the problems are found at http://www.imm.dtu.dk/~hm/time.series.analysis. This home page also contains additional exercises, called assignments, intended for being solved using a computer with dedicated software for time series analysis.
I am grateful to all who have contributed with useful comments and suggestions for improvement. Especially, I would like to thank my colleagues Jan Holst, Henrik Spliid, Leif Mejlbro, Niels Kjølstad Poulsen, and Henrik Aalborg Nielsen for their valuable comments and suggestions. Furthermore, I would like to thank former students Morten Høier Olsen, Rasmus Tamstorf, and Jan Nygaard Nielsen for their great effort in proofreading and improving the first manuscript in Danish. For this 2007 edition in English, I would like to thank Devon Yates, Stig Mortensen, and Fannar Örn Thordarson for proofreading and their very useful suggestions. In particular, I am grateful to Anna Helga Jónsdóttir for her assistance with figures and examples. Finally, I would like to thank Morten Høgholm for both proofreading and for proposing and creating a new layout in LaTeX.

Lyngby, Denmark
Henrik Madsen
Notation
All vectors are column vectors. Vectors and matrices are emphasized using a bold font. Lowercase letters are used for vectors and uppercase letters are used for matrices. Transposing is denoted with the upper index $T$.

Random variables are always written using uppercase letters. Thus, it is not possible to distinguish between a multivariate random variable (random vector) and a matrix. However, random variables are assigned to letters from the last part of the alphabet ($X, Y, Z, U, V, \ldots$), while deterministic terms are assigned to letters from the first part of the alphabet ($a, b, c, d, \ldots$). Thus, it should be possible to distinguish between a matrix and a random vector.
CHAPTER 1
Introduction
Time series analysis deals with statistical methods for analyzing and modeling an ordered sequence of observations. This modeling results in a stochastic process model for the system which generated the data. The ordering of observations is most often, but not always, through time, particularly in terms of equally spaced time intervals. In some applied literature, time series are often called signals. In more theoretical literature a time series is just an observed or measured realization of a stochastic process.
This book on time series analysis focuses on modeling using linear models. During the sequence of chapters more and more advanced models for dynamic systems are introduced; by this approach the family of linear time series models and methods is placed in a structured relationship. In a subsequent book, non-linear time series models will be considered.
At the same time the book intends to provide the reader with an understanding of the mathematical and statistical background for time series analysis and modeling. In general the theory in this book is kept in a second order theory framework, focusing on the second order characteristics of the persistence in time as measured by the autocovariance and autocorrelation functions.
The separation of linear and non-linear time series analysis into two books facilitates a clear demonstration of the highly different mathematical approaches that are needed in each of these two cases. In linear time series analysis some of the most important approaches are linked to the fact that superposition is valid, and that classical frequency domain approaches are directly usable. For non-linear time series superposition is not valid and frequency domain approaches are in general not very useful.
The book can be seen as a text for graduates in engineering or science departments, but also for statisticians who want to understand the link between models and methods for linear dynamical systems and linear stochastic processes. The intention of the approach taken in this book is to bridge the gap between scientists or engineers, who often have a good understanding of methods for describing dynamical systems, and statisticians, who have a good understanding of statistical theory such as likelihood-based approaches.
In classical statistical analysis the correlation of data in time is often disregarded. For instance, in regression analysis the assumption about serially uncorrelated residuals is often violated in practice. In this book it will be demonstrated that it is crucial to take this autocorrelation into account in the modeling procedure. Also for applications such as simulations and forecasting, we will most often be able to provide much more reasonable and realistic results by taking the autocorrelation into account.
On the other hand, adequate methods and models for time series analysis can often be seen as a simple extension of linear regression analysis where previous observations of the dependent variable are included as explanatory variables in a simple linear regression type of model. This facilitates a rather easy approach for understanding many methods for time series analysis, as demonstrated in various chapters of this book.
There are a number of reasons for studying time series. These include a characterization of time series (or signals), understanding and modeling the data generating system, forecasting of future values, and optimal control of a system.
In the rest of this chapter we will first consider some typical time series and briefly mention the reasons for studying them and the methods to use in each case. Then some of the important methodologies and models are introduced with the help of an example where we wish to predict the monthly wheat prices. Finally the contents of the book are outlined while focusing on the model structures and their basic relations.
1.1 Examples of time series

In this section we will show examples of time series and at the same time indicate possible applications of time series analysis. The examples contain both typical examples from economic studies and more technical applications.
1.1.1 Dollar to Euro exchange rate

The first example is the daily US dollar to Euro interbank exchange rate shown in Figure 1.1. This is a typical economic time series where time series analysis could be used to formulate a model for forecasting future values of the exchange rate. The analysis of such a problem relates to the models and methods described in Chapters 3, 5, and 6.
1.1.2 Number of monthly airline passengers

Next we consider the number of monthly airline passengers in the US shown in Figure 1.2. For this series a clear annual variation is seen. Again it might be useful to construct a model for making forecasts of the future number of airline passengers. Models and methods for analyzing time series with seasonal variation are described in Chapters 3, 5, and 6.
Figure 1.1: Daily US dollar to Euro interbank exchange rate.

Figure 1.2: Number of monthly airline passengers in the US. A clear annual variation can be seen in the series.
1.1.3 Heat dynamics of a building

Now let us consider a more technical example. Figure 1.3 shows measurements from an unoccupied test building. The data on the lower plot show the indoor air temperature, while on the upper plot the ambient air temperature, the heat supply, and the solar radiation are shown. For this example it might be interesting to characterize the thermal behavior of the building. As a part of that, the so-called resistance against heat flux from inside to outside can be estimated. The resistance characterizes the insulation of the building. It might also be useful to establish a dynamic model for the building and to estimate the time constants. Knowledge of the time constants can be used for designing optimal controllers for the heat supply.
Figure 1.3: Measurements from an unoccupied test building. The input variables are (1) solar radiation, (2) ambient air temperature, and (3) heat input. The output variable is the indoor air temperature.

For this case methods for transfer function modeling as described in Chapter 8 can be used, where the input (explanatory) variables are the solar radiation, heat input, and outdoor air temperature, while the output (dependent) variable is the indoor air temperature. For the methods in Chapter 8 it is crucial that all the signals can be classified as either input or output series related to the system considered.
1.1.4 Predator-prey relationship

This example illustrates a typical multivariate time series, since it is not possible to classify one of the series as input and the other series as output. Figure 1.4 shows a widely studied predator-prey case, namely the series of annually traded skins of muskrat and mink by the Hudson's Bay Company during the 62 year period 1850-1911. In fact the population of muskrats depends on the population of mink, and the population of mink depends on the number of muskrats. In such cases both series must be included in a multivariate time series. This series has been considered in many texts on time series analysis, and the purpose is to describe in general the relation between populations of muskrat and mink. Methods for analyzing such multivariate series are considered in Chapter 9.

Figure 1.4: Annually traded skins of muskrat and mink by the Hudson's Bay Company after logarithmic transformation. It is not possible to classify one of the series as input and the other series as output.
1.2 A first crash course

Let us introduce some of the most important concepts of time series analysis by considering an example where we look for simple models for predicting the monthly prices of wheat.
In the following, let $P_t$ denote the price of wheat at time (month) $t$. The first naive guess would be to say that the price next month is the same as in this month. Hence, the predictor is

$$\hat{P}_{t+1|t} = P_t. \tag{1.1}$$

This predictor is called the naive predictor or the persistent predictor. The syntax used is short for a prediction (or estimate) of $P_{t+1}$ given the observations $P_t, P_{t-1}, \ldots$.

Next month, i.e., at time $t+1$, the actual price is $P_{t+1}$. This means that the prediction error or innovation may be computed as

$$\varepsilon_{t+1} = P_{t+1} - \hat{P}_{t+1|t}. \tag{1.2}$$

By combining Equations (1.1) and (1.2) we obtain the stochastic model for the wheat price

$$P_t = P_{t-1} + \varepsilon_t. \tag{1.3}$$
If $\{\varepsilon_t\}$ is a sequence of uncorrelated zero mean random variables (white noise), the process (1.3) is called a random walk. The random walk model is very often seen in finance and econometrics. For this model the optimal predictor is the naive predictor (1.1).

The random walk can be rewritten as

$$P_t = \varepsilon_t + \varepsilon_{t-1} + \varepsilon_{t-2} + \cdots \tag{1.4}$$

which shows that the random walk is an integration of the noise, and that the variance of $P_t$ is unbounded; therefore, no stationary distribution exists. This is an example of a non-stationary process.
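As a rough numerical illustration of the random walk (1.3) and the naive predictor (1.1), the following Python sketch simulates a price series from Gaussian white noise. The function names, the Gaussian noise, the noise standard deviation, and the seeds are illustrative assumptions, and the data are synthetic rather than real wheat prices.

```python
import numpy as np


def simulate_random_walk(n, sigma=1.0, p0=0.0, seed=0):
    """Simulate P_t = P_{t-1} + eps_t with Gaussian white noise (an assumption)."""
    rng = np.random.default_rng(seed)
    eps = rng.normal(0.0, sigma, size=n)
    # Cumulatively summing (integrating) the noise gives the random walk, cf. (1.4).
    prices = p0 + np.cumsum(eps)
    return prices, eps


def naive_prediction_errors(prices):
    """Errors of the naive predictor: P_{t+1} minus the prediction P_t."""
    return prices[1:] - prices[:-1]


def variance_over_time(n_paths, n_steps, sigma=1.0, seed=1):
    """Variance across many simulated random walks at each time step."""
    rng = np.random.default_rng(seed)
    paths = np.cumsum(rng.normal(0.0, sigma, size=(n_paths, n_steps)), axis=1)
    return paths.var(axis=0)


prices, eps = simulate_random_walk(n=500)
errors = naive_prediction_errors(prices)
var_t = variance_over_time(n_paths=2000, n_steps=200)
```

Under these assumptions the naive prediction errors coincide with the simulated noise, and `var_t` grows roughly linearly with time, which is the unbounded variance mentioned above.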
However, it is obvious to try to consider the more general model

$$P_t = \varphi P_{t-1} + \varepsilon_t \tag{1.5}$$

called the AR(1) model (the autoregressive first order model). For this process a stationary distribution exists for $|\varphi| < 1$. Notice that the random walk is obtained for $\varphi = 1$.

Another candidate for a model for wheat prices is

$$P_t = \psi P_{t-12} + \varepsilon_t \tag{1.6}$$

which assumes that the price this month is explained by the price in the same month last year. This seems to be a reasonable guess for a simple model, since it is well known that wheat prices exhibit a seasonal variation. (The noise processes in (1.5) and (1.6) are, despite the notation used, of course, not the same.)
For wheat prices it is obvious that both the actual price and the price in the same month in the previous year might be used in a description of the expected price next month. Such a model is obtained if we assume that the innovation $\varepsilon_t$ in model (1.5) shows an annual variation, i.e., the combined model is

$$(P_t - \varphi P_{t-1}) - \psi (P_{t-12} - \varphi P_{t-13}) = \varepsilon_t. \tag{1.7}$$

Models such as (1.6) and (1.7) are called seasonal models, and they are used very often in econometrics.

Notice that for $\psi = 0$ we obtain the AR(1) model (1.5), while for $\varphi = 0$ the most simple seasonal model in (1.6) is obtained.
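The role of the AR(1) coefficient in model (1.5) can be checked numerically. The sketch below simulates an AR(1) series and compares its sample variance with the standard stationary variance $\sigma^2 / (1 - \varphi^2)$, which is finite only when the coefficient is smaller than one in absolute value. The function names, Gaussian noise, coefficient value, and seed are illustrative assumptions.

```python
import numpy as np


def simulate_ar1(phi, n, sigma=1.0, seed=0):
    """Simulate P_t = phi * P_{t-1} + eps_t, started at P_0 = 0."""
    rng = np.random.default_rng(seed)
    eps = rng.normal(0.0, sigma, size=n)
    p = np.zeros(n)
    for t in range(1, n):
        p[t] = phi * p[t - 1] + eps[t]
    return p


def stationary_variance(phi, sigma=1.0):
    """Variance of the stationary AR(1) distribution, valid only for |phi| < 1."""
    if abs(phi) >= 1.0:
        raise ValueError("no stationary distribution exists for |phi| >= 1")
    return sigma**2 / (1.0 - phi**2)


series = simulate_ar1(phi=0.8, n=50_000)
sample_var = series.var()
theory_var = stationary_variance(0.8)
```

For a coefficient equal to one the helper raises an error, mirroring the random walk case where no stationary distribution exists.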
By introducing the backward shift operator $B$ by

$$B^k P_t = P_{t-k} \tag{1.8}$$

the models can be written in a more compact form. The AR(1) model can be written as $(1 - \varphi B) P_t = \varepsilon_t$, and the seasonal model in (1.7) as

$$(1 - \varphi B)(1 - \psi B^{12}) P_t = \varepsilon_t. \tag{1.9}$$

If we furthermore introduce the difference operator

$$\nabla = (1 - B) \tag{1.10}$$

then the random walk can be written $\nabla P_t = \varepsilon_t$ using a very compact notation. In this book these kinds of notations will be widely used in order to obtain compact equations.
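The backward shift and difference operators can be mimicked on a finite observed series. The following sketch is one possible implementation, assuming that values before the start of the sample are unknown and marked as NaN; the function names are hypothetical.

```python
import numpy as np


def backshift(x, k=1):
    """Apply B^k to a series: element t of the result is x[t - k].

    The first k elements have no predecessor in the sample and are set to NaN.
    """
    x = np.asarray(x, dtype=float)
    if k == 0:
        return x.copy()
    out = np.full(x.shape, np.nan)
    out[k:] = x[:-k]
    return out


def difference(x):
    """Apply the difference operator (1 - B): x_t - x_{t-1}."""
    x = np.asarray(x, dtype=float)
    return x - backshift(x, 1)


rng = np.random.default_rng(0)
eps = rng.normal(size=100)
walk = np.cumsum(eps)          # random walk built from the noise
recovered = difference(walk)   # (1 - B) P_t, NaN at t = 0
shifted = backshift([1.0, 2.0, 3.0, 4.0], k=2)
```

Differencing the simulated random walk returns the noise that generated it (apart from the first, undefined value), which is the compact statement of the random walk in operator notation.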
Given a time series of observed monthly wheat prices, $P_1, P_2, \ldots, P_N$, the model structure can be identified, and, for a given model, the time series can be used for parameter estimation.
The model identification is most often based on the estimated autocorrelation function, since, as it will be shown in Chapter 6, the autocorrelation function fulfils the same difference equation as the model. The autocorrelation function shows how the price is correlated to previous prices; more specifically the autocorrelation in lag $k$, called $\rho(k)$, is simply the correlation between $P_t$ and $P_{t-k}$ for stationary processes. For the monthly values of the wheat price we might expect a dominant annual variation and, hence, that the autocorrelation in lag 12, i.e., $\rho(12)$, is high.
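A minimal sketch of how the autocorrelation in lag $k$ might be estimated from one observed series is given below. The estimator shown (sums of lagged products divided by the series length) is one common choice and is not claimed to be the one used in Chapter 6; the series is synthetic, with an assumed annual cycle, and does not represent real wheat prices.

```python
import numpy as np


def sample_autocorrelation(x, max_lag):
    """Estimate rho(k) for k = 0, ..., max_lag from one observed series."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    d = x - x.mean()
    c0 = np.dot(d, d) / n  # estimated variance (lag 0 autocovariance)
    rho = [np.dot(d[k:], d[:n - k]) / n / c0 for k in range(max_lag + 1)]
    return np.array(rho)


# Synthetic "monthly" series: an annual cycle plus white noise (illustration only).
rng = np.random.default_rng(0)
months = np.arange(240)
series = np.sin(2 * np.pi * months / 12) + rng.normal(0.0, 0.3, size=240)
rho = sample_autocorrelation(series, max_lag=24)
```

For this synthetic series the estimate at lag 12 is large and positive, while the estimate at lag 6 is negative, reflecting the assumed annual variation.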
The models above will, of course, be generalized in the book. It is important to notice that these processes all belong to the more general class of linear processes, which again is strongly related to the theory of linear systems as demonstrated in the book.
1.3 Contents and scope of the book

As mentioned previously, this book will concentrate on analyzing and modeling dynamical systems using statistical methods. The approach taken will focus on the formulation of appropriate models, their theoretical characteristics, and on links between the members of the class of stochastic dynamic models considered. In general, the models considered are all linear and formulated in discrete time. However, some results related to continuous time models are provided.
This section describes the contents of the subsequent chapters. In order to illustrate the relation between various models, some fundamental examples of the considered models are outlined in the following section. However, for more rigorous descriptions of the details related to the models we refer to the following chapters.
In Chapter 2 the concept of multivariate random variables is introduced. This chapter also introduces necessary fundamental concepts such as the conditional mean and the linear projection. In general, the chapter provides the formulas and methods for adapting a second order approach for characterizing random variables. The second order approach limits the attention to first and second order central moments of the density related to the random variable. This approach links closely to the very important second order characterization of stochastic processes by the autocovariance function in subsequent chapters.
Although time series are realizations of dynamical phenomena, non-dynamical methods are often used. Chapter 3 is devoted to describing static models applied for time series analysis. However, in the rest of the book dynamical models will be considered. The methods introduced in Chapter 3 are all linked to the class of regression models, of which the general linear model is the most important member. A brief description of the general linear model follows here.
In the following, let $Y_t$ denote the dependent variable and $\boldsymbol{x}_t = (x_{1t}, x_{2t}, \ldots, x_{pt})^T$ a known vector of $p$ explanatory (or independent) variables indexed by the time $t$.

The general linear model (GLM) is a linear relation between the variables which can be written

$$Y_t = \sum_{k=1}^{p} x_{kt} \theta_k + \varepsilon_t \tag{1.11}$$

where $\varepsilon_t$ is a zero mean random variable, and $\boldsymbol{\theta} = (\theta_1, \theta_2, \ldots, \theta_p)^T$ is a vector of the $p$ parameters of the model. Notice that the model (1.11) is a static model since all the variables refer to the same point in time.

On-line and recursive methods are very important for time series analysis. These methods provide us with the possibility of always using the most recent data, e.g., for on-line predictions. Furthermore, changes in time of the considered phenomena call for adaptive models, where the parameters typically are allowed to vary slowly in time. For on-line predictions and control, adaptive estimation of parameters in relatively simple models is often to be preferred, since the alternative is a rather complicated model with explicit time-varying parameters. Adaptive methods for estimating parameters in the general linear model are considered in Chapter 3. This approach introduces exponential smoothing, the Holt-Winters procedure, and trend models as important special cases.
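To make the general linear model (1.11) concrete, the following sketch fits the parameters by ordinary least squares, one common estimation method, on synthetic data. The explanatory variables (an intercept and a linear trend), the parameter values, the noise level, and the function name are assumptions chosen only for illustration.

```python
import numpy as np


def fit_general_linear_model(X, y):
    """Least squares estimate of theta in y_t = sum_k x_kt * theta_k + eps_t."""
    theta_hat, _, _, _ = np.linalg.lstsq(X, y, rcond=None)
    residuals = y - X @ theta_hat
    return theta_hat, residuals


# Synthetic data: intercept plus linear trend, i.e. x_t = (1, t)^T (illustration only).
rng = np.random.default_rng(0)
t = np.arange(200, dtype=float)
X = np.column_stack([np.ones_like(t), t])  # one row x_t^T per time point
theta_true = np.array([2.0, 0.1])
y = X @ theta_true + rng.normal(0.0, 0.5, size=t.size)

theta_hat, residuals = fit_general_linear_model(X, y)
```

Each row of `X` holds the explanatory variables for one time point, so the model is static in the sense described above: no lagged values of the dependent variable enter the fit.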
The remaining chapters of the book consider linear systems and appropriate related dynamical models. A linear system converts an input series to an output series as illustrated in Figure 1.5.
In Chapter 4 we introduce linear dynamic deterministic systems. In this chapter, one should note that for random variables capital letters are used whereas for deterministic variables we use lower case letters.
Figure 1.5: Schematic representation of a linear system.

As a background for Chapter 5, one should be aware that for linear and time-invariant systems the fundamental relation between the deterministic input $x_t$ and the corresponding output $y_t$ is the convolution

$$y_t = \sum_{k=-\infty}^{\infty} h_k x_{t-k}. \tag{1.12}$$
The sequence $\{h_k\}$ is called the impulse response function for the linear dynamic system. For physical systems where the output does not depend on future values of the input, the sum in (1.12) is from $k = 0$. Based on the impulse response function we will obtain the frequency response function by a Fourier transformation, and the transfer function by using the z-transformation.
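The convolution (1.12) for a causal system can be sketched numerically as follows. The geometric impulse response $h_k = a^k$, the value of $a$, and the function names are assumptions made for illustration; the same system is also written as a first order recursion so that the two descriptions can be compared.

```python
import numpy as np


def causal_response(h, x):
    """Output y_t = sum_{k>=0} h_k x_{t-k}, taking x_t = 0 before t = 0."""
    x = np.asarray(x, dtype=float)
    h = np.asarray(h, dtype=float)
    return np.convolve(x, h)[: len(x)]


def first_order_recursion(a, x):
    """Same system written as the recursion y_t = a * y_{t-1} + x_t."""
    y = np.zeros(len(x))
    for t in range(len(x)):
        y[t] = (a * y[t - 1] if t > 0 else 0.0) + x[t]
    return y


a = 0.6
n = 50
h = a ** np.arange(n)              # impulse response h_k = a^k, k = 0, ..., n-1
rng = np.random.default_rng(0)
x = rng.normal(size=n)             # arbitrary input series

y_conv = causal_response(h, x)
y_rec = first_order_recursion(a, x)

impulse = np.zeros(n)
impulse[0] = 1.0
h_back = causal_response(h, impulse)
```

Feeding a unit impulse through the system returns the impulse response itself, and the convolution and the recursion give the same output for any input of this length.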
A very important model belonging to the model class described by (1.12) is the linear difference equation (1.13).

Chapter 5 considers stochastic processes, and the focus is on the linear stochastic process $\{Y_t\}$, which is defined by the convolution

$$Y_t = \sum_{k=0}^{\infty} \psi_k \varepsilon_{t-k}, \tag{1.14}$$

where $\{\varepsilon_t\}$ is the so-called white noise process, i.e., a sequence of mutually uncorrelated identically distributed zero mean random variables. Equation (1.14) defines a zero mean process; however, if the mean is not zero, the mean $\mu_Y$ is just added on the right hand side of (1.14).

Notice the similarity between (1.14) and (1.12). This implies that a transfer function can be defined for the linear process as for the deterministic linear systems. Stochastic processes with a rational transfer function are the ARMA(p, q) process, the ARIMA(p, d, q) process, and the multiplicative seasonal processes. These important processes are considered in detail. As an example, the process $\{Y_t\}$ given by (1.15), where $\{\varepsilon_t\}$ is white noise, is known as the ARMA(p, q) process. Notice the similarity between (1.14) and (1.15). Such a process is useful for describing the data related to the dollar to Euro exchange rate in Section 1.1.1, and the seasonal process models are useful for modeling the monthly number of airline passengers in Section 1.1.2.
Given a time series of observations $Y_1, Y_2, \ldots, Y_N$, Chapter 6 deals with identification, estimation, and model checking for finding an appropriate model for the underlying stochastic process. This chapter focuses on time domain methods where the autocorrelation function is the key to an identification. Frequency domain methods are typically linked to the spectral analysis which is the subject of Chapter 7.
The so-called transfer function models are considered in Chapter 8. This class of models describes the relation between a stochastic input process $\{X_t\}$ and the output process $\{Y_t\}$. Basically the models can be written

$$Y_t = \sum_{k=0}^{\infty} h_k X_{t-k} + N_t, \tag{1.16}$$

where $\{N_t\}$ is a correlated noise process, e.g., an ARMA(p, q) process. This gives rise to the so-called Box-Jenkins transfer function model, which can be seen as a combination of (1.12) and (1.14). It is relatively straightforward to include a number of input processes by adding the corresponding number of extra convolutions on the right hand side of (1.16). An important assumption related to the Box-Jenkins transfer function models is that the output process does not influence the input process. Hence, for the heat dynamics of a building example in Section 1.1.3, a transfer function model for the relation between the outdoor air temperature and the indoor air temperature can be formulated. This model can be extended to also include the solar radiation and the heat supply (provided that no feedback exists from the indoor air temperature to the heat supply).
In the case of multiple processes with no obvious split in input and output processes, the multivariate approach must be considered. In Chapter 9 the multivariate linear process is introduced as an m-dimensional stochastic process $\{Y_t\}$ defined by the multivariate convolution

$$Y_t = \sum_{k=0}^{\infty} \psi_k \varepsilon_{t-k}, \tag{1.17}$$

where $\psi_k$ is a coefficient matrix, and $\{\varepsilon_t\}$ the multivariate white noise process. This formulation is used in Chapter 9 as the background for formulating the multivariate ARMA(p, q) process (also called the Vector-ARMA(p, q) process) and other related models.

As mentioned previously, the muskrat-mink case represents a problem which must be formulated as a multivariate process, simply because the population of minks influences the population of muskrats and vice versa.
Until now all the models can be considered as input-output models. The purpose of the modeling procedure is simply to find an appropriate model which relates the output to the input process, which in many cases is simply the white noise process. An important class of models which not only focuses on the input-output relations, but also on the internal state of the system, is the class of state space models introduced in Chapter 10.
A state space model in discrete time is formulated using a first order (multivariate) difference equation describing the dynamics of the state vector, which we shall denote $X_t$, and a static relation between the state vector and the (multivariate) observation $Y_t$. More specifically, the linear state space model consists of the system equation (1.18) and the observation equation (1.19), where $X_t$ is the m-dimensional, latent (not directly observable), random state vector. Furthermore, $u_t$ is a deterministic input vector, $Y_t$ is a vector of observable (measurable) stochastic output, and $A$, $B$, and $C$ are known matrices of suitable dimensions. Finally, $\{e_{1,t}\}$ and $\{e_{2,t}\}$ are vector white noise processes.

For linear state space models the Kalman filter is used to estimate the latent state vector and for providing predictions. The Kalman smoother can be used to estimate the values of the latent state vector, given all $N$ values of the time series $\{Y_t\}$.
To illustrate an application of the state space model, consider again the heat dynamics of the test building in Section 1.1.3. Madsen and Holst (1995) show that a second order system is needed to describe the dynamics. Furthermore, it is suggested to define the two elements of the state vector as the indoor air temperature and the temperature of the heat accumulating concrete floor. The input vector $u_t$ consists of the ambient air temperature, the solar radiation, and the heat input. Only the indoor air temperature is observed, and hence, $Y_t$ is the measured indoor air temperature. Using the state space approach gives us a possibility of estimating the temperature of the heat accumulating concrete floor using the so-called Kalman filter technique.
In general, the parameters of the models are assumed constant in time. However, in practice, it is often observed that the dynamical characteristics change with time. In Chapter 11 recursive and adaptive methods are introduced. Basically, the adaptive schemes introduce a time window related to the data, such that newer data obtain more influence than older data. This leads to methods for adaptive forecasting and control.
CHAPTER 2

Multivariate random variables

An n-dimensional random variable (random vector) is a vector of n scalar random variables. The random vector is written

$$X = \begin{pmatrix} X_1 \\ X_2 \\ \vdots \\ X_n \end{pmatrix}. \tag{2.1}$$

Random vectors will also be denoted multivariate random variables.
2.1 Joint and marginal densities

Every random variable has a distribution function. The n-dimensional random variable $X$ has the joint distribution function

$$F(x_1, \ldots, x_n) = P\{X_1 \le x_1, \ldots, X_n \le x_n\}. \tag{2.2}$$

If $X$ is defined on a continuous sample space, the joint (probability) density function is defined as

$$f(x_1, \ldots, x_n) = \frac{\partial^n F(x_1, \ldots, x_n)}{\partial x_1 \cdots \partial x_n}, \tag{2.3}$$

and the relation between the distribution and density function is

$$F(x_1, \ldots, x_n) = \int_{-\infty}^{x_1} \cdots \int_{-\infty}^{x_n} f(t_1, \ldots, t_n)\, dt_1 \cdots dt_n. \tag{2.4}$$

A random variable, $X$, is called discrete if it takes values on a discrete (countable) sample space. In this case the joint density function (or mass function) is defined as

$$f(x_1, \ldots, x_n) = P\{X_1 = x_1, \ldots, X_n = x_n\}. \tag{2.5}$$

The joint distribution and mass functions are related by

$$F(x_1, \ldots, x_n) = \sum_{t_1 \le x_1} \cdots \sum_{t_n \le x_n} f(t_1, \ldots, t_n). \tag{2.6}$$
For a sub-vector $(X_1, \ldots, X_k)^T$ ($k < n$) of the random vector $X$ the marginal density function is

$$f_S(x_1, \ldots, x_k) = \int_{-\infty}^{\infty} \cdots \int_{-\infty}^{\infty} f(x_1, \ldots, x_n)\, dx_{k+1} \cdots dx_n \tag{2.7}$$

in the continuous case, and

$$f_S(x_1, \ldots, x_k) = \sum_{x_{k+1}} \cdots \sum_{x_n} f(x_1, \ldots, x_n) \tag{2.8}$$

if $X$ is discrete. In both cases the marginal distribution function is

$$F_S(x_1, \ldots, x_k) = F(x_1, \ldots, x_k, \infty, \ldots, \infty). \tag{2.9}$$

Please note that we will write $f_X(x)$ and $F_X(x)$ instead of $f(x)$ and $F(x)$, respectively, whenever it is necessary to emphasize to which random variable the function belongs.
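The discrete marginalization in (2.8) and the marginal distribution function in (2.9) can be sketched in a few lines of Python. The joint mass function below is a made-up example on $\{0, 1\} \times \{0, 1, 2\}$, and the helper names `marginal_x1` and `joint_cdf` are hypothetical; exact fractions are used so that no rounding enters.

```python
from fractions import Fraction as Fr

# Hypothetical joint mass function f(x1, x2); the probabilities sum to one.
joint = {
    (0, 0): Fr(1, 8), (0, 1): Fr(1, 4), (0, 2): Fr(1, 8),
    (1, 0): Fr(1, 4), (1, 1): Fr(1, 8), (1, 2): Fr(1, 8),
}

def marginal_x1(joint):
    """Marginal mass function of X1: sum the joint mass over x2, as in (2.8)."""
    m = {}
    for (x1, _x2), p in joint.items():
        m[x1] = m.get(x1, Fr(0)) + p
    return m

def joint_cdf(joint, a, b):
    """Joint distribution function F(a, b) = P{X1 <= a, X2 <= b}, as in (2.6)."""
    return sum((p for (x1, x2), p in joint.items() if x1 <= a and x2 <= b), Fr(0))

f_S = marginal_x1(joint)
# Marginal distribution function of X1 at 0, obtained as F(0, infinity), cf. (2.9)
F_S_at_0 = joint_cdf(joint, 0, float("inf"))
```

Evaluating the joint distribution function with the second argument at infinity gives the same number as summing the marginal mass function up to the first argument.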
2.2 Conditional distributions

In time series analysis conditional distributions play an important role, especially in relation to prediction and filtering. For instance, in prediction, where we want to give a statement about future values of the time series given past observations, the conditional distribution contains all available information about the future value.

Let $A$ and $B$ denote some events. If $P(B) > 0$ then the conditional probability of $A$ occurring given that $B$ occurs is

$$P(A \mid B) = \frac{P(A \cap B)}{P(B)}. \tag{2.10}$$

We interpret $P(A \mid B)$ as "the probability of $A$ given $B$."

Suppose that the continuous random variables $X$ and $Y$ have joint density $f_{X,Y}$. If we wish to use (2.10) to determine the conditional distribution of $Y$ given that $X$ takes the value $x$, we have the problem that the probability $P(Y \le y \mid X = x)$ is undefined, as we may only condition on events which have a strictly positive probability, and $P(X = x) = 0$. However, $f_X(x)$ is positive for some $x$ values. Therefore, for both discrete and continuous random variables we use the following definition.

DEFINITION 2.1 (CONDITIONAL DENSITY)
The conditional density (function) of $Y$ given $X = x$ is

$$f_{Y|X=x}(y) = \frac{f_{X,Y}(x, y)}{f_X(x)}, \quad (f_X(x) > 0) \tag{2.11}$$

where $f_{X,Y}$ is the joint density function of $X$ and $Y$. Both $X$ and $Y$ may be multivariate random variables.
The conditional distribution function is then found by integration or summation as previously described in (2.4) and (2.6) for the continuous and discrete cases, respectively.

It follows from (2.11) that

$$f_{X,Y}(x, y) = f_{Y|X=x}(y)\, f_X(x), \tag{2.12}$$

and by interchanging $X$ and $Y$ on the right hand side of (2.12) we get Bayes' rule:

$$f_{Y|X=x}(y) = \frac{f_{X|Y=y}(x)\, f_Y(y)}{f_X(x)}. \tag{2.13}$$
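A small numerical check of the conditional density (2.11) and Bayes' rule (2.13) can be carried out in the discrete case. The joint mass function and the function names below are hypothetical and chosen only for illustration.

```python
from fractions import Fraction as Fr

# Hypothetical joint mass function f_{X,Y}(x, y) on {0, 1} x {0, 1}
joint = {
    (0, 0): Fr(3, 10), (0, 1): Fr(1, 10),
    (1, 0): Fr(2, 10), (1, 1): Fr(4, 10),
}

def marginal(joint, axis):
    """Marginal mass function of the component selected by axis (0 for X, 1 for Y)."""
    m = {}
    for key, p in joint.items():
        m[key[axis]] = m.get(key[axis], Fr(0)) + p
    return m

f_X = marginal(joint, 0)
f_Y = marginal(joint, 1)

def cond_y_given_x(y, x):
    """Conditional mass of Y given X = x, definition (2.11)."""
    return joint[(x, y)] / f_X[x]

def cond_x_given_y(x, y):
    """Conditional mass of X given Y = y, definition (2.11) with roles swapped."""
    return joint[(x, y)] / f_Y[y]

def bayes_y_given_x(y, x):
    """Conditional mass of Y given X = x computed through Bayes' rule (2.13)."""
    return cond_x_given_y(x, y) * f_Y[y] / f_X[x]
```

Both routes give the same conditional mass function, and for each fixed $x$ the conditional masses sum to one.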
We now define independence:

DEFINITION 2.2 (INDEPENDENCE)
$X$ and $Y$ are independent if

$$f_{X,Y}(x, y) = f_X(x)\, f_Y(y), \tag{2.14}$$

which corresponds to

$$F_{X,Y}(x, y) = F_X(x)\, F_Y(y). \tag{2.15}$$

If $X$ and $Y$ are independent, it is clearly seen that

$$f_{Y|X=x}(y) = f_Y(y). \tag{2.16}$$

Bear in mind that if two random variables are independent, then they are also uncorrelated, while uncorrelated variables are not necessarily independent.

2.3 Expectations and moments

For a discrete variable $X$, the expectation is

$$E[X] = \sum_x x\, P\{X = x\},$$

i.e., an average of the possible values of $X$, each value being weighted by its probability.
For continuous variables, expectations are defined as integrals.

DEFINITION 2.3 (EXPECTATION)
The expectation (or mean value) of a continuous variable $X$ with density function $f_X$ is

$$E[X] = \int_{-\infty}^{\infty} x f_X(x)\, dx \tag{2.17}$$

whenever this integral exists, i.e., if

$$E[|X|] < \infty. \tag{2.18}$$

Note that usually we allow the existence of $\int g(x)\, dx$ only if $\int |g(x)|\, dx < \infty$.

Remark 2.1
The expectation operator $E$ can more generally be defined by a Stieltjes integral, i.e.,

$$E[X] = \int_{-\infty}^{\infty} x\, dF_X(x), \tag{2.19}$$

where $F_X$ is the distribution function for $X$. This definition covers both discrete and continuous variables.

The expectation is called the first moment, because it is the first moment of $f_X$ with respect to the line $x = 0$.
If $X$ is a random variable and $g$ is a function, then $Y = g(X)$ is also a random variable. To calculate the expectation of $Y$, we could find $f_Y$ and use (2.17). However, the process of finding $f_Y$ can be complicated; instead, we can exploit that

$$E[Y] = E[g(X)] = \int_{-\infty}^{\infty} g(x) f_X(x)\, dx. \tag{2.20}$$

This provides a method for calculating the "moments" of a distribution.

DEFINITION 2.4 (MOMENTS)
The n'th moment of $X$ is

$$E[X^n] = \int_{-\infty}^{\infty} x^n f_X(x)\, dx, \tag{2.21}$$

and the n'th central moment is

$$E[(X - E[X])^n] = \int_{-\infty}^{\infty} (x - E[X])^n f_X(x)\, dx. \tag{2.22}$$
The second central moment is also called the variance, and the variance of $X$ is given by

$$\mathrm{Var}[X] = E[(X - E[X])^2] = E[X^2] - (E[X])^2. \tag{2.23}$$
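The two expressions for the variance in (2.23) can be compared on a small example. The sketch below uses the discrete analogue of (2.21) and (2.22), with sums in place of integrals; the mass function and the helper names `moment` and `central_moment` are assumptions made for the illustration.

```python
from fractions import Fraction as Fr

# Hypothetical mass function of a discrete variable X
pmf = {0: Fr(1, 4), 1: Fr(1, 2), 2: Fr(1, 4)}

def moment(pmf, n):
    """n'th moment E[X^n], discrete analogue of (2.21)."""
    return sum(p * x ** n for x, p in pmf.items())

def central_moment(pmf, n):
    """n'th central moment E[(X - E[X])^n], discrete analogue of (2.22)."""
    mean = moment(pmf, 1)
    return sum(p * (x - mean) ** n for x, p in pmf.items())

mean = moment(pmf, 1)
var_central = central_moment(pmf, 2)           # E[(X - E[X])^2]
var_shortcut = moment(pmf, 2) - mean ** 2      # E[X^2] - (E[X])^2
```

For this mass function both routes give the same variance, as (2.23) states.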
Now let $X$ be an n-dimensional random variable and put $Y = g(X)$, where $g$ is a function. As in (2.20) we have

$$E[Y] = E[g(X)] = \int_{-\infty}^{\infty} \cdots \int_{-\infty}^{\infty} g(x_1, \ldots, x_n) f_X(x_1, \ldots, x_n)\, dx_1 \cdots dx_n. \tag{2.24}$$

In particular, setting $g(X_1, X_2) = aX_1 + bX_2$ implies that

$$E[aX_1 + bX_2] = aE[X_1] + bE[X_2]. \tag{2.25}$$

Thus, the expectation operator is a linear operator.

The elements $X_i$ and $X_j$ in the random vector $X$ (2.24) can be used for defining mixed moments

$$E[X_i^{\alpha} X_j^{\beta}], \tag{2.26}$$

where $\alpha$ and $\beta$ are integers, and similarly for the corresponding central moments. Of special interest is the covariance of $X_i$ and $X_j$:

$$\mathrm{Cov}[X_i, X_j] = E[(X_i - E[X_i])(X_j - E[X_j])] = E[X_i X_j] - E[X_i]E[X_j]. \tag{2.27}$$

The covariance gives information about the simultaneous variation of two variables and is useful for finding interdependencies.

By applying (2.27) and the linearity of the expectation operator, we obtain the following important calculation rule for the covariance:

$$\mathrm{Cov}[aX_1 + bX_2, cX_3 + dX_4] = ac\,\mathrm{Cov}[X_1, X_3] + ad\,\mathrm{Cov}[X_1, X_4] + bc\,\mathrm{Cov}[X_2, X_3] + bd\,\mathrm{Cov}[X_2, X_4], \tag{2.28}$$

where $X_1, \ldots, X_4$ are random variables and $a, \ldots, d$ are constants.

2.4 Moments of multivariate random variables
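We now consider multivariate random variables. In time series analysis we often use only the first moment (the mean value) and the second central moment (the variance). Therefore, it is helpful to have the following definitions:

DEFINITION 2.5 (EXPECTATION OF THE RANDOM VECTOR)
The expectation (or the mean value) of the random vector $X$ is

$$\mu = E[X] = (E[X_1], E[X_2], \ldots, E[X_n])^T. \tag{2.29}$$

DEFINITION 2.6 (COVARIANCE)
The covariance (matrix) of $X$ is

$$\Sigma_X = \mathrm{Var}[X] = E[(X - \mu)(X - \mu)^T] = \begin{pmatrix} \mathrm{Var}[X_1] & \mathrm{Cov}[X_1, X_2] & \cdots & \mathrm{Cov}[X_1, X_n] \\ \mathrm{Cov}[X_2, X_1] & \mathrm{Var}[X_2] & \cdots & \mathrm{Cov}[X_2, X_n] \\ \vdots & \vdots & & \vdots \\ \mathrm{Cov}[X_n, X_1] & \mathrm{Cov}[X_n, X_2] & \cdots & \mathrm{Var}[X_n] \end{pmatrix}. \tag{2.30}$$

$\Sigma_X$ is called the covariance matrix of $X$. Sometimes we shall use the notation

$$\Sigma_X = \begin{pmatrix} \sigma_1^2 & \sigma_{12} & \cdots & \sigma_{1n} \\ \sigma_{21} & \sigma_2^2 & \cdots & \sigma_{2n} \\ \vdots & \vdots & & \vdots \\ \sigma_{n1} & \sigma_{n2} & \cdots & \sigma_n^2 \end{pmatrix}. \tag{2.31}$$

For the variance, $\sigma_i^2$, we sometimes use the notation $\sigma_{ii}$.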
The correlation of two random variables, $X_i$ and $X_j$, is a normalization of the covariance and can be written

$$\rho_{ij} = \frac{\mathrm{Cov}[X_i, X_j]}{\sqrt{\mathrm{Var}[X_i]\,\mathrm{Var}[X_j]}} = \frac{\sigma_{ij}}{\sigma_i \sigma_j}. \tag{2.32}$$

Equations (2.30) and (2.32) lead to the definition

DEFINITION 2.7 (CORRELATION MATRIX)
The correlation matrix for $X$ is

$$R = \begin{pmatrix} 1 & \rho_{12} & \cdots & \rho_{1n} \\ \rho_{21} & 1 & \cdots & \rho_{2n} \\ \vdots & \vdots & & \vdots \\ \rho_{n1} & \rho_{n2} & \cdots & 1 \end{pmatrix}. \tag{2.33}$$

THEOREM 2.1
The covariance matrix $\Sigma$ and the correlation matrix $R$ are (a) symmetric and (b) positive semi-definite.

Proof (a) The symmetry is obvious. (b) Use $\mathrm{Var}[z^T X] \ge 0$ for all values of $z$.

Remark 2.2
If, for instance, $\Sigma$ is positive definite, we often write $\Sigma > 0$.
DEFINITION 2.8
The covariance matrix of a random vector $X$ with dimension $p$ and mean $\mu$, and a random vector $Y$ with dimension $q$ and mean $\nu$ is

$$\Sigma_{XY} = C[X, Y] = E[(X - \mu)(Y - \nu)^T]. \tag{2.34}$$

It can be clearly seen that $C[X, X] = \mathrm{Var}[X]$.

THEOREM 2.2 (CALCULATION RULES FOR THE COVARIANCE)
Let $X$ and $Y$ be defined as in Definition 2.8, and let $A$ and $B$ be $n \times p$ and $m \times q$ real matrices. Let $U$ and $V$ be p- and q-dimensional random vectors. Then

$$C[A(X + U), B(Y + V)] = A\,C[X, Y]\,B^T + A\,C[X, V]\,B^T + A\,C[U, Y]\,B^T + A\,C[U, V]\,B^T. \tag{2.35}$$

Important special cases are

$$\mathrm{Var}[AX] = A\,\mathrm{Var}[X]\,A^T, \tag{2.36}$$

$$C[X + U, Y] = C[X, Y] + C[U, Y]. \tag{2.37}$$

Proof Follows directly from the definition of the covariance and the linearity of the expectation operator.

Compare the rules above with the rules for scalar random variables given in (2.28).
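The special case $\mathrm{Var}[AX] = A\,\mathrm{Var}[X]\,A^T$ can be verified exactly for a random vector with finitely many outcomes. In the sketch below the outcome list, the matrix `A`, and all helper names are hypothetical; the covariance is computed directly from the definition $E[(X - \mu)(X - \mu)^T]$ and compared with the calculation rule.

```python
from fractions import Fraction as Fr

# Hypothetical 2-dimensional random vector: list of (outcome, probability)
outcomes = [((0, 0), Fr(1, 4)), ((1, 0), Fr(1, 4)),
            ((1, 2), Fr(1, 4)), ((3, 1), Fr(1, 4))]

def mean(dist):
    n = len(dist[0][0])
    return [sum(p * x[i] for x, p in dist) for i in range(n)]

def cov(dist):
    """Covariance matrix E[(X - mu)(X - mu)^T] computed from the definition."""
    mu = mean(dist)
    n = len(mu)
    return [[sum(p * (x[i] - mu[i]) * (x[j] - mu[j]) for x, p in dist)
             for j in range(n)] for i in range(n)]

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

def transform(A, x):
    """Return the vector A x as a tuple."""
    return tuple(sum(a * xi for a, xi in zip(row, x)) for row in A)

A = [[1, 2], [0, -1], [3, 1]]                          # hypothetical 3 x 2 matrix
dist_AX = [(transform(A, x), p) for x, p in outcomes]  # distribution of AX

lhs = cov(dist_AX)                                     # Var[AX] from the definition
rhs = matmul(matmul(A, cov(outcomes)), transpose(A))   # A Var[X] A^T
```

The resulting matrix is also symmetric, in line with Theorem 2.1.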
Example 2.1 (Linear transformations of random variables)
Let $X$ be an n-dimensional random variable with mean $\mu_X$ and covariance $\Sigma_X$. Now we introduce a new random variable $Y = (Y_1, \ldots, Y_k)^T$ by the linear transformation

$$Y = a + BX, \tag{2.38}$$

where $a$ is a $(k \times 1)$ vector and $B$ is a $(k \times n)$ matrix. By using the fact that the expectation operator is linear we find

$$\mu_Y = E[Y] = E[a + BX] = a + B\,E[X] = a + B\mu_X, \tag{2.39}$$

and by using the stated rules (2.35) and (2.37) for calculating the covariance

$$\Sigma_Y = \mathrm{Var}[a + BX] = \mathrm{Var}[BX] = B\,\mathrm{Var}[X]\,B^T = B\Sigma_X B^T. \tag{2.40}$$

Please note that sometimes $\mathrm{Cov}[\cdot, \cdot]$ is used also in the multivariate case.

2.5 Conditional expectation
The conditional expectation is the expectation of a random variable, given values of another random variable. Later it will be shown that the optimal predictor (in terms of minimum variance) is the conditional mean. Conditional means are also used in filtering.

DEFINITION 2.9 (CONDITIONAL EXPECTATION)
The conditional expectation (or conditional mean) of the random variable $Y$ given $X = x$ is

$$E[Y \mid X = x] = \int_{-\infty}^{\infty} y f_{Y|X=x}(y)\, dy. \tag{2.41}$$

Remark 2.3
If we know the value $X = x_1$ and the conditional density function, then we are able to calculate the conditional mean. For another value $X = x_2$ we will get a different value for the conditional mean. Hence, the conditional expectation of $Y$ given $X = x$ is written $\psi(x) = E[Y \mid X = x]$. Because the conditional expectation depends on the value $x$ taken by $X$, we can also think of the conditional expectation as a function $\psi(X)$ of $X$ itself.

THEOREM 2.3 (PROPERTIES OF CONDITIONAL MEANS)
Let $X$, $Y$ and $Z$ be random variables (with joint density $f_{X,Y,Z}$), let $a$, $c$ and $d$ be real numbers, and let $g$ be a real function. Then

$$E[Y \mid X] = E[Y], \quad \text{if } X \text{ and } Y \text{ are independent} \tag{2.42}$$

$$E[Y] = E[E[Y \mid X]] \tag{2.43}$$

$$E[g(X)\,Y \mid X] = g(X)\,E[Y \mid X] \tag{2.44}$$

$$E[g(X)\,Y] = E[g(X)\,E[Y \mid X]] \tag{2.45}$$

$$E[a \mid X] = a \tag{2.46}$$

$$E[g(X) \mid X] = g(X) \tag{2.47}$$

$$E[cX + dZ \mid Y] = c\,E[X \mid Y] + d\,E[Z \mid Y] \tag{2.48}$$

Proof Omitted, but follows from (2.11) and (2.41).
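Properties (2.43) and (2.45) of the conditional mean can be checked exactly in the discrete case, where the integral in (2.41) becomes a sum. The joint mass function, the function `g`, and the variable names in this sketch are hypothetical.

```python
from fractions import Fraction as Fr

# Hypothetical joint mass function f_{X,Y}(x, y), keyed by (x, y)
joint = {(0, 1): Fr(1, 6), (0, 3): Fr(1, 3), (1, 1): Fr(1, 4), (1, 5): Fr(1, 4)}

# Marginal mass function of X
f_X = {}
for (x, _y), p in joint.items():
    f_X[x] = f_X.get(x, Fr(0)) + p

def cond_mean_y(x):
    """psi(x) = E[Y | X = x], the discrete analogue of (2.41)."""
    return sum(y * p for (xx, y), p in joint.items() if xx == x) / f_X[x]

def g(x):
    """A hypothetical real function of X."""
    return 2 * x + 1

E_Y = sum(y * p for (_x, y), p in joint.items())
E_cond = sum(cond_mean_y(x) * p for x, p in f_X.items())            # E[E[Y|X]]
E_gY = sum(g(x) * y * p for (x, y), p in joint.items())             # E[g(X) Y]
E_g_cond = sum(g(x) * cond_mean_y(x) * p for x, p in f_X.items())   # E[g(X) E[Y|X]]
```

Averaging the conditional mean $\psi(X)$ over the distribution of $X$ recovers the unconditional mean, and the same holds after weighting by $g(X)$.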
Equation (2.48) shows that the conditional expectation operator is linear.

DEFINITION 2.10 (CONDITIONAL VARIANCE)
The conditional variance of $Y$ given $X$ is

$$\mathrm{Var}[Y \mid X] = E[(Y - E[Y \mid X])(Y - E[Y \mid X])^T \mid X], \tag{2.49}$$

and the conditional covariance between $Y$ and $Z$ given $X$ is

$$C[Y, Z \mid X] = E[(Y - E[Y \mid X])(Z - E[Z \mid X])^T \mid X]. \tag{2.50}$$

From Remark 2.3 it is clearly seen that $\mathrm{Var}[Y \mid X]$ and $C[Y, Z \mid X]$ are matrices of random variables.
THEOREM 2.4 (THE VARIANCE SEPARATION THEOREM)
Let $X$, $Y$ and $Z$ be random variables. Then

$$\mathrm{Var}[Y] = E[\mathrm{Var}[Y \mid X]] + \mathrm{Var}[E[Y \mid X]], \tag{2.51}$$

$$C[Y, Z] = E[C[Y, Z \mid X]] + C[E[Y \mid X], E[Z \mid X]]. \tag{2.52}$$

Proof Omitted. See, for instance, Jazwinski (1970).

Example 2.2 (Linear model)
Assume $Y$ is defined by the linear model

$$Y = X\theta + \varepsilon,$$

where $X$ and $\varepsilon$ are mutually independent random variables with mean $\mu_X$ and $\mu_\varepsilon = 0$, and variance $\sigma_X^2$ and $\sigma_\varepsilon^2$, respectively. We assume that $\theta$ is known. By using the theorems above, it follows that

$$E[Y \mid X] = E[X\theta + \varepsilon \mid X] = X\theta,$$

$$\mathrm{Var}[Y \mid X] = \mathrm{Var}[X\theta + \varepsilon \mid X] = \sigma_\varepsilon^2.$$

Hence, we see that for a given $X = x$, we have the "prediction" $E[Y \mid X = x] = x\theta$, and the corresponding uncertainty is given by the variance $\mathrm{Var}[Y \mid X = x] = \sigma_\varepsilon^2$.

The marginal mean of $Y$ is

$$E[Y] = E[E[Y \mid X]] = E[X\theta] = \theta\mu_X,$$

and by using (2.51) we get the marginal variance

$$\mathrm{Var}[Y] = E[\mathrm{Var}[Y \mid X]] + \mathrm{Var}[E[Y \mid X]] = \sigma_\varepsilon^2 + \theta^2\sigma_X^2.$$

Hence, the variance separation theorem yields that the marginal variance of $Y$ is composed of the variance of $\varepsilon$ plus a scaled contribution from the variance of $X$.
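The decomposition of the marginal variance in the linear model $Y = X\theta + \varepsilon$ can be confirmed by enumerating a small discrete example. The two-point distributions for $X$ and $\varepsilon$ and the value $\theta = 2$ are assumptions chosen so that the arithmetic is exact; they do not come from the text.

```python
from fractions import Fraction as Fr
from itertools import product

theta = Fr(2)                                   # hypothetical known parameter
X_pmf = {Fr(0): Fr(1, 2), Fr(2): Fr(1, 2)}      # mean 1, variance 1
eps_pmf = {Fr(-1): Fr(1, 2), Fr(1): Fr(1, 2)}   # mean 0, variance 1

def mean(pmf):
    return sum(v * p for v, p in pmf.items())

def var(pmf):
    m = mean(pmf)
    return sum((v - m) ** 2 * p for v, p in pmf.items())

# Distribution of Y = X * theta + eps when X and eps are independent
Y_pmf = {}
for (x, px), (e, pe) in product(X_pmf.items(), eps_pmf.items()):
    y = x * theta + e
    Y_pmf[y] = Y_pmf.get(y, Fr(0)) + px * pe

var_Y_direct = var(Y_pmf)
# E[Var[Y|X]] = Var[eps] and Var[E[Y|X]] = Var[X * theta] = theta^2 Var[X]
var_Y_separated = var(eps_pmf) + theta ** 2 * var(X_pmf)
```

The directly computed variance of $Y$ agrees with the sum of the noise variance and the scaled variance of $X$, and the mean of $Y$ equals $\theta\mu_X$.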
2.6 The multivariate normal distribution

In time series analysis the normal distribution and the distributions derived from the normal distribution are of major interest. For example, the multivariate normal distribution is the fundamental tool used for formulating the likelihood function in later chapters.

We assume that $X_1, X_2, \ldots, X_n$ are independent random variables with means $\mu_1, \mu_2, \ldots, \mu_n$, and variances $\sigma_1^2, \sigma_2^2, \ldots, \sigma_n^2$. We write $X_i \in N(\mu_i, \sigma_i^2)$. Now, define the random vector $X = (X_1, X_2, \ldots, X_n)^T$. Because the random variables are independent, it follows from (2.14) that

$$f_X(x_1, \ldots, x_n) = f_{X_1}(x_1) \cdots f_{X_n}(x_n) = \prod_{i=1}^{n} \frac{1}{\sigma_i\sqrt{2\pi}} \exp\left[-\frac{(x_i - \mu_i)^2}{2\sigma_i^2}\right] = \frac{1}{\left(\prod_{i=1}^{n}\sigma_i\right)(2\pi)^{n/2}} \exp\left[-\frac{1}{2}\sum_{i=1}^{n}\left(\frac{x_i - \mu_i}{\sigma_i}\right)^2\right].$$

By introducing the mean $\mu = (\mu_1, \ldots, \mu_n)^T$ and the covariance $\Sigma_X = \mathrm{Var}[X] = \mathrm{diag}(\sigma_1^2, \ldots, \sigma_n^2)$, this is written

$$f_X(x) = \frac{1}{(2\pi)^{n/2}\sqrt{\det\Sigma_X}} \exp\left[-\frac{1}{2}(x - \mu)^T\Sigma_X^{-1}(x - \mu)\right]. \tag{2.53}$$

A generalization to the case where the covariance matrix is a full matrix leads to the following:
DEFINITION 2.11 (THE MULTIVARIATE NORMAL DISTRIBUTION)
The joint density function for the n-dimensional random variable $X$ with mean $\mu$ and covariance $\Sigma$ is

$$f_X(x) = \frac{1}{(2\pi)^{n/2}\sqrt{\det\Sigma}} \exp\left[-\frac{1}{2}(x - \mu)^T\Sigma^{-1}(x - \mu)\right], \tag{2.54}$$

where $\Sigma > 0$. We write $X \in N(\mu, \Sigma)$. If $X \in N(0, I)$ we say that $X$ is standardized normally distributed.

THEOREM 2.5
Any n-dimensional normally distributed random variable with mean $\mu$ and covariance $\Sigma$ can be written as

$$X = \mu + T\varepsilon, \tag{2.55}$$

where $\varepsilon \in N(0, I)$.

Proof Due to the symmetry of $\Sigma$, there always exists a real matrix $T$ so that $\Sigma = TT^T$. Then the result follows from (2.40) and (2.39).
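One possible choice of the matrix $T$ in Theorem 2.5 is the lower-triangular Cholesky factor, which exists when $\Sigma$ is positive definite. The theorem itself only requires some real $T$ with $\Sigma = TT^T$, so the Cholesky factor is an assumption of this sketch, as are the numerical values of `Sigma` and `mu` and the function names.

```python
import math

def cholesky(S):
    """Lower-triangular T with T T^T = S, for a symmetric positive definite S."""
    n = len(S)
    T = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(T[i][k] * T[j][k] for k in range(j))
            if i == j:
                T[i][j] = math.sqrt(S[i][i] - s)
            else:
                T[i][j] = (S[i][j] - s) / T[j][j]
    return T

def affine(mu, T, eps):
    """Return mu + T eps for one realization eps of the standardized vector."""
    return [m + sum(t * e for t, e in zip(row, eps)) for m, row in zip(mu, T)]

Sigma = [[4.0, 2.0], [2.0, 5.0]]    # hypothetical covariance matrix
mu = [1.0, -1.0]                    # hypothetical mean vector

T = cholesky(Sigma)
# Reconstruct T T^T to confirm that it equals Sigma
TTt = [[sum(a * b for a, b in zip(r1, r2)) for r2 in T] for r1 in T]
# Map one fixed realization eps = (1, 1) to the corresponding x
x = affine(mu, T, [1.0, 1.0])
```

To simulate from $N(\mu, \Sigma)$ under these assumptions, the fixed vector passed to `affine` would be replaced by independent standard normal draws, for instance from `random.gauss(0.0, 1.0)`.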
2.7 Distributions derived from the normal distribution

Most of the test quantities used in time series analysis are based on the normal distribution or on one of the distributions derived from the normal distribution.

Any linear combination of normally distributed random variables is normal. If, for instance, $X \in N(\mu, \Sigma)$, then the linear transformation $Y = a + BX$ defines a normally distributed random variable as

$$Y \in N(a + B\mu, B\Sigma B^T). \tag{2.56}$$

Compare with Example 2.1.

Let $Z = (Z_1, \ldots, Z_n)^T$ be a vector of independent $N(0, 1)$ random variables. The (central) $\chi^2$ distribution with $n$ degrees of freedom is obtained as the sum of squares of $n$ independent $N(0, 1)$ random variables, i.e.,

$$X^2 = \sum_{i=1}^{n} Z_i^2 = Z^T Z \in \chi^2(n). \tag{2.57}$$

From this it is clear that if $Y_1, \ldots, Y_n$ are independent $N(\mu_i, \sigma_i^2)$ random variables, then

$$X^2 = \sum_{i=1}^{n} \left(\frac{Y_i - \mu_i}{\sigma_i}\right)^2 \in \chi^2(n), \tag{2.58}$$

since $Z_i = (Y_i - \mu_i)/\sigma_i$ is $N(0, 1)$ distributed.

For $Y \in N_n(\mu, \Sigma)$ ($\Sigma > 0$), we have

$$(Y - \mu)^T \Sigma^{-1} (Y - \mu) \in \chi^2(n). \tag{2.59}$$

This follows by using Theorem 2.5 and (2.57).

The non-central $\chi^2$ distribution with $n$ degrees of freedom and non-centrality parameter $\lambda$ appears when considering the sum of squared normally distributed variables when the means are not necessarily zero. Hence,

$$Y^T \Sigma^{-1} Y \in \chi^2(n, \lambda), \tag{2.60}$$

where $\lambda = \frac{1}{2}\mu^T\Sigma^{-1}\mu$. Compare (2.59) and (2.60).

Let $X_1^2, \ldots, X_k^2$ denote independent $\chi^2(n_i, \lambda_i)$ distributed random variables. Then the reproduction property of the $\chi^2$ distribution is

$$\sum_{i=1}^{k} X_i^2 \in \chi^2\left[\sum_{i=1}^{k} n_i, \sum_{i=1}^{k} \lambda_i\right]. \tag{2.61}$$

If $\Sigma$ is singular with rank $k < n$, then $Y^T\Sigma^{-}Y$ is $\chi^2$ distributed with $k$ degrees of freedom and non-centrality parameter $\lambda = \frac{1}{2}\mu^T\Sigma^{-}\mu$, where $\Sigma^{-}$ denotes a generalized inverse (called g-inverse) for $\Sigma$ (see, for instance, Rao (1973)).
The (Student) t distribution with $n$ degrees of freedom is obtained as

$$T = \frac{Z}{(X^2/n)^{1/2}} \in t(n), \tag{2.62}$$

where $Z \in N(0, 1)$, $X^2 \in \chi^2(n)$, and $Z$ and $X^2$ are independent. The non-central t distribution is obtained from (2.62) if $Z \in N(\mu, 1)$, and we write $T \in t(n, \mu)$.

The F distribution with $(n, m)$ degrees of freedom appears as the following ratio

$$F = \frac{X_1^2/n}{X_2^2/m} \in F(n, m), \tag{2.63}$$

where $X_1^2 \in \chi^2(n)$, $X_2^2 \in \chi^2(m)$, and $X_1^2$ and $X_2^2$ are independent. It is clearly seen from (2.62) that $T^2 \in F(1, n)$.

The non-central F distribution with $(n, m)$ degrees of freedom and non-centrality parameter $\lambda$ is obtained from (2.63) if $X_1^2 \in \chi^2(n, \lambda)$, $X_2^2 \in \chi^2(m)$, and $X_1^2$ and $X_2^2$ are independent. The non-central F distribution is written $F \in F(n, m; \lambda)$.
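The construction of the central $\chi^2$ distribution in (2.57) lends itself to a quick simulation. The sketch below draws sums of $n$ squared standard normal variables and compares the sample mean and variance with the known values $n$ and $2n$ of the $\chi^2(n)$ distribution; these two moments are standard results not stated in the text, and the seed, the sample size, and the name `chi2_draw` are arbitrary choices.

```python
import random

def chi2_draw(n, rng):
    """One draw of X^2 = sum of n squared independent N(0, 1) variables, cf. (2.57)."""
    return sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(n))

rng = random.Random(12345)   # fixed seed so that the run is repeatable
n, N = 4, 20000              # degrees of freedom and number of draws

draws = [chi2_draw(n, rng) for _ in range(N)]
sample_mean = sum(draws) / N
sample_var = sum((d - sample_mean) ** 2 for d in draws) / (N - 1)
```

With this many draws the sample mean should lie close to $n = 4$ and the sample variance close to $2n = 8$; the agreement is only approximate because of sampling variation.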
2.8 Linear projections

This section contains the fundamental theorems used in, e.g., linear regression, where the independent variables are stochastic, as well as in linear predictions and the Kalman filter.

THEOREM 2.6 (LINEAR PROJECTION)
Let $Y = (Y_1, \ldots, Y_m)^T$ and $X = (X_1, \ldots, X_n)^T$ be random vectors, and let the $(m + n)$-dimensional vector $(Y, X)^T$ have the mean

$$\begin{pmatrix} \mu_Y \\ \mu_X \end{pmatrix}$$

and covariance

$$\begin{pmatrix} \Sigma_{YY} & \Sigma_{YX} \\ \Sigma_{XY} & \Sigma_{XX} \end{pmatrix}.$$

Define the linear projection of $Y$ on $X$

$$E[Y \mid X] = a + BX. \tag{2.64}$$

Then the projection and the variance of the projection error are given by

$$E[Y \mid X] = \mu_Y + \Sigma_{YX}\Sigma_{XX}^{-1}(X - \mu_X), \tag{2.65}$$

$$E[\mathrm{Var}[Y \mid X]] = \Sigma_{YY} - \Sigma_{YX}\Sigma_{XX}^{-1}\Sigma_{YX}^T. \tag{2.66}$$

Finally, the projection error, $Y - E[Y \mid X]$, and $X$ are uncorrelated, i.e.,

$$C[Y - E[Y \mid X], X] = 0. \tag{2.67}$$

Figure 2.1: The projection $E[Y \mid X]$ of $Y$ on $X$.
Proof From Theorem 2.4:

$$C[Y, X] = E[C[Y, X \mid X]] + C[E[Y \mid X], E[X \mid X]] = E[0] + C[a + BX, X] = B\,\mathrm{Var}[X], \tag{2.68}$$

so that $B = \Sigma_{YX}\Sigma_{XX}^{-1}$. Furthermore,

$$E[Y] = E[E[Y \mid X]] = E[a + BX] = a + B\,E[X], \tag{2.69}$$

which leads to $a = E[Y] - B\,E[X]$. Equation (2.65) is now obtained by using the values for $a$ and $B$ in (2.64). Now

$$E[\mathrm{Var}[Y \mid X]] = \mathrm{Var}[Y - E[Y \mid X]] = \mathrm{Var}[Y - a - BX] = \Sigma_{YY} + B\Sigma_{XX}B^T - B\Sigma_{XY} - \Sigma_{YX}B^T = \Sigma_{YY} - \Sigma_{YX}\Sigma_{XX}^{-1}\Sigma_{XY}, \tag{2.70}$$

$$C[Y - E[Y \mid X], X] = C[Y - a - BX, X] = \Sigma_{YX} - B\Sigma_{XX} = 0. \tag{2.71}$$
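The coefficients $a$ and $B$ derived in the proof, and the two properties (2.66) and (2.67), can be checked on a scalar example using only first and second moments. The joint mass function below is hypothetical. Note that for such a discrete distribution the true conditional mean need not be linear, so the sketch only verifies the properties of the linear function $a + BX$ built from the moments, which is the setting assumed in (2.64).

```python
from fractions import Fraction as Fr

# Hypothetical joint mass function of scalar X and Y, keyed by (x, y)
joint = {(0, 0): Fr(1, 4), (1, 1): Fr(1, 4), (2, 1): Fr(1, 4), (3, 4): Fr(1, 4)}

def E(fn):
    """Expectation of fn(X, Y) under the joint mass function."""
    return sum(fn(x, y) * p for (x, y), p in joint.items())

mu_X, mu_Y = E(lambda x, y: x), E(lambda x, y: y)
S_XX = E(lambda x, y: (x - mu_X) ** 2)
S_YY = E(lambda x, y: (y - mu_Y) ** 2)
S_YX = E(lambda x, y: (y - mu_Y) * (x - mu_X))

B = S_YX / S_XX          # B = Sigma_YX Sigma_XX^{-1}, from (2.68)
a = mu_Y - B * mu_X      # a = E[Y] - B E[X], from (2.69)

def err(x, y):
    """Projection error Y - (a + B X)."""
    return y - (a + B * x)

err_mean = E(err)
err_cov_X = E(lambda x, y: (err(x, y) - err_mean) * (x - mu_X))   # cf. (2.67)
err_var = E(lambda x, y: (err(x, y) - err_mean) ** 2)             # cf. (2.66)
```

The error has mean zero, is uncorrelated with $X$, and its variance equals $\Sigma_{YY} - \Sigma_{YX}\Sigma_{XX}^{-1}\Sigma_{XY}$ in scalar form.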
Referring to (2.67), we say that the error, $(Y - E[Y \mid X])$, and $X$ are