11th March 2003
NOTES ON PROGRAMS
TRAMO AND SEATS
PART I
Introduction and Brief Review of Applied Time Series Analysis
Agustín Maravall
Bank of Spain
In our application,
we focus on series observed 1, 2, 3, 4, 6, or 12 times a year. The most relevant ones:
MONTHLY and QUARTERLY time series
TIME SERIES ≡ $[x_1, x_2, \ldots, x_T]$, where the number of observations $T$ ranges from a minimum of 12 to 36 up to 600
(minimum depends on the frequency of observations and on the type of analysis performed).
Our interest: SHORT-TERM ANALYSIS
SOME EXAMPLES OF PROBLEMS THAT WE SHALL ADDRESS:
[Figure: Monthly Time Series]
[Figure: Forecast (last 2 years)]
[Figure: Missing Observations (marked *)]
[Figure: Interpolated Series]
[Figure: Outlier Contamination]
[Figure: Outliers (LS, TC, AO)]
[Figure: Series with Intervention Variable]
[Figure: Intervention Variable Effect]
[Figure: Seasonal Factors]
[Figure: Regression variable and special effects. Top: Holiday effect; Middle: Easter effect; Bottom: TD effect]
[Figure: Seasonally Adjusted Series]
[Figure: Trend-Cycle]
[Figure: Forecast: Seasonal Factors]
[Figure: Forecast: Trend & Original Series]
[Figure: Business Cycle]
[Figure: ST and LT Trends]
Standard ("routine") treatment at present typically solves the previous problems using different procedures that often have little to do with each other. For example:
* Forecasting: ARIMA, EWMA
* Interpolation: Chow-Lin, Denton
* Seasonal adjustment: X11, X11A
* Preadjustment for trading day, Easter effect, holidays...: Regression / Prior correction (for example, divide by # of working days)…
* Trend extraction: Henderson Moving Averages, HP filter…
* Outliers: Some weighted trimming?
Robust procedures instead?
* Forecast of a trend: ?
Often: Fit ARIMA to some trend, and obtain ARIMA forecasts (not recommended)
* SE of $\hat{x}^a_t$ ? ($x^a_t$: SA series)
Important issue (Bach Committee, Moore Committee,…) and so on...
We shall present a methodology that
* makes it possible to deal with all those issues jointly, within a unified framework.
* This framework provides OPTIMAL ESTIMATORS (OR FORECASTS) with respect to
- well-defined STATISTICAL MODELS,
- well-defined ESTIMATION CRITERION,
in an EFFICIENT way.
The Model-based approach will facilitate
* interpretation
For example, the model may specify that the sum of the seasonal component over a period of 12 consecutive months is a zero-mean, small-variance, stationary process.
* diagnostics
The joint distribution of the estimators can be derived, and hence standard tests can be performed.
* inference
For example, we can obtain optimal forecasts of the rate of growth of the SA series, with the associated SE.
The methodology is based on:
1) Identifying REG-ARIMA models for the observed series.
2) Decomposing the previous model for the series into unobserved components.
3) Obtaining the MMSE estimator of the components (or signals). These estimators will be:
E (signal | observations)
Before explaining the methodology, it will be helpful to start with a brief (and informal) review of some applied time series analysis concepts and tools.
General Framework:
Stochastic process: $z_t \sim f_t(z_t)$
Time Series: $[z_1, z_2, \ldots, z_T]$
We consider it as a particular (partial) realization of a stochastic process.
Hence, a sample of size 1 for each $f_t$.
STATIONARITY AND DIFFERENCING
Strong condition. Although few economic series will satisfy it, simple transformations will render them stationary.
Basic condition: $f(z_1, \ldots, z_T) = f(z_{1+k}, \ldots, z_{T+k})$
In particular, for the marginal distribution (T=1),
$f_t(z_t) = f(z_t)$ for every t, hence
$E z_t = \mu$ and $V z_t = V_z$ both:
- are finite
- do not depend on t
In practice, constant variance is achieved through:
- log-level transformation
+
- outlier correction
Alternatively, one may use
NONLINEAR FILTERS
(ex. : GARCH, Bilinear, Stochastic Volatility models, … )
- Not yet fit for large-scale use.
- For many of these models, point forecasts or point estimators of the series and components obtained with a linear model remain approximately optimal.
- The decomposition of NL models still poses some problems.
- Monthly and lower-frequency data seldom display markedly nonlinear structures.
Roughly:
LOGS are appropriate when the amplitude of the series oscillations is approximately proportional to the level.
Note: LOG transformation has some nice features.
- "Scale" free
- Natural interpretation: ( $d \log x = dx / x \cong \nabla x / x$ ) variations are expressed as fractions ("per one") of the level of the series.
Thus, for example,
ARIMA fit to the logs → $\sigma_a$ (residuals) = .004
We can say: The series is forecast (1 period ahead) with an error equivalent to roughly 4‰ of the level of the series.
On the negative side, it may induce BIASES (due to the fact that geometric means underestimate arithmetic means).
⇒ Annual mean of original series > mean of SA series (and of trend)
(If biases are large, there are 2 options: - ad-hoc corrections;
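The two properties of the log transformation mentioned above can be checked numerically. The following is only a sketch with made-up numbers (the series `levels` and the list `values` are hypothetical): it compares differences of logs with exact relative changes, and a geometric mean with an arithmetic mean.

```python
import math

# Hypothetical series growing 0.4% per period (illustrative numbers only).
levels = [100.0 * 1.004 ** t for t in range(13)]

# Differences of the logs versus exact relative changes of the level.
log_diffs = [math.log(levels[t]) - math.log(levels[t - 1]) for t in range(1, 13)]
rel_changes = [(levels[t] - levels[t - 1]) / levels[t - 1] for t in range(1, 13)]

# The geometric mean never exceeds the arithmetic mean (source of the bias).
values = [80.0, 100.0, 125.0]
arith_mean = sum(values) / len(values)
geo_mean = math.exp(sum(math.log(v) for v in values) / len(values))
```

For small changes the two measures of variation nearly coincide, while the mean computed through the logs falls below the arithmetic mean.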
Concerning STATIONARITY IN MEAN, most economic time series display a mean (i.e., a “local level”) that cannot be assumed constant. The two most important reasons:
a) The presence of a trend (or a trend-cycle)
Obviously, the mean of the series in the first years is not the same as the one for the last years.
[Figure: Trend]
b) The presence of seasonality:
Obviously, the level of the series depends on the period within the year.
[Figure: Seasonality]
To achieve Constant Mean:
Let:
B: "Backward" operator; $B^j z_t = z_{t-j}$
s = # obs./year (12, if monthly; 4 if quarterly;… )
We use operators:
$\nabla = 1 - B$ (regular difference)
$\nabla_s = 1 - B^s$ (seasonal difference)
$S = 1 + B + \ldots + B^{s-1}$ (sum over a year)
* Assume $x_t$ is a linear trend
$x_t = a + bt$; then $\nabla x_t = b$, or $\nabla^2 x_t = 0$. In general:
$\nabla^d$ reduces a polynomial of degree d to a constant (cancels a polynomial of degree d-1)
Also: $\nabla_{12} x_t = 12b$
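As a minimal sketch of how the difference operators act on a linear trend, the code below applies them to a list of numbers. The helper names `diff` and `annual_sum` and the values of `a` and `b` are assumptions made for illustration.

```python
def diff(x, lag=1):
    """Apply (1 - B^lag) to the list x; the first `lag` values are lost."""
    return [x[t] - x[t - lag] for t in range(lag, len(x))]


def annual_sum(x, s=12):
    """Apply S = 1 + B + ... + B^(s-1); the first s-1 values are lost."""
    return [sum(x[t - j] for j in range(s)) for t in range(s - 1, len(x))]


a, b = 5.0, 0.5  # hypothetical intercept and slope
trend = [a + b * t for t in range(36)]

first = diff(trend)             # regular difference: constant b
second = diff(first)            # second difference: zero
seasonal = diff(trend, lag=12)  # seasonal difference: constant 12 b
via_sum = annual_sum(first)     # S applied to the regular difference
```

The last line illustrates that the seasonal difference equals the annual sum of the regular differences.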
Note: Cosine function
Basic element in cyclical and seasonal movements: the (deterministic) function
$x_t = m^t \cos(\omega t + \omega_0)$, t = 0, 1, … (A)
m = modulus (max. value) ;
ω = Frequency (# of cycles / unit of time), measured in radians = $2\pi / \tau$ ;
τ = Period (# of units of time needed to complete a full cycle) ;
$\omega_0$ = Phase (angle at t = 0)
[Figure: cosine wave $x_t$ against t for m = 1, showing the period τ, the phase $\omega_0$ and the modulus m]
* Using the expression for cos (a + b), (A) above can also be rewritten as $x_t = m^t (C \cos \omega t + D \sin \omega t)$.
* Recall:
(A) is the solution of a 2nd order difference equation (with real coefficients)
$x_t + \phi_1 x_{t-1} + \phi_2 x_{t-2} = 0$, when the roots are complex
⇒ always, pairs of complex conjugates. (see below)
Consider a monthly series
* Assume $x_t$ is a sine wave, for example
$x_t = \cos(\pi t / 6)$, then $\nabla_{12} x_t = 0$
given that
$\cos\left(\frac{\pi}{6}(t - 12)\right) = \cos\left(\frac{\pi}{6} t - 2\pi\right) = \cos\left(\frac{\pi}{6} t\right)$
The same holds when
$x_t = \cos(\pi t / 2)$
[Figure: cosine waves for ω = π/6 and ω = π/2 over 12 months]
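A quick numerical check of this cancellation, as a sketch: the helper `seasonal_diff` is a hypothetical name, and the four-year span is an arbitrary choice.

```python
import math


def seasonal_diff(x, s=12):
    """Apply (1 - B^s) to the list x."""
    return [x[t] - x[t - s] for t in range(s, len(x))]


months = range(48)
wave_year = [math.cos(math.pi * t / 6) for t in months]     # one cycle per year
wave_quarter = [math.cos(math.pi * t / 2) for t in months]  # three cycles per year

# Largest absolute value left after seasonal differencing (rounding error only).
residual_year = max(abs(v) for v in seasonal_diff(wave_year))
residual_quarter = max(abs(v) for v in seasonal_diff(wave_quarter))
```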
So: what is the complete solution of
$\nabla_{12} x_t = 0$ ?
i.e., the most general F(t) that is cancelled by $\nabla_{12}$
$\nabla_{12} x_t = x_t - x_{t-12} = 0$: Linear difference equation
(homogeneous).
Note: Homogeneous Linear Difference Equations
(with real coefficients).
Let the equation be $x_t + \phi_1 x_{t-1} + \ldots + \phi_p x_{t-p} = 0$, t = 1, 2, …
Replacing $x_t$ by $r^t$, the characteristic equation is obtained, equal to $r^p + \phi_1 r^{p-1} + \ldots + \phi_{p-1} r + \phi_p = 0$.
This equation has p roots:
some real,
some complex
(always in pairs of complex conjugates: $r_1 = a + bi$, $r_2 = a - bi$).
Solution to the difference equation:
$x_t = \sum_{j=1}^{p} c_j r_j^t$, where
$r_j$: a root of the characteristic equation,
$c_j$: arbitrary constants.
The final way in which the roots are expressed is the following (the c’s are always constants, to be determined from the starting conditions).
1) Single real root
$x_t = c_j r_j^t$.
Notice: if $|r_j| > 1$ ⇒ root is explosive. We shall restrict attention to roots with
$|r_j| \le 1$.
2) Multiple real roots
order of multiplicity: k + 1; $x_t = (c_0 + c_1 t + \ldots + c_k t^k)\, r^t$, where $r_1 = \ldots = r_{k+1} = r$.
3) Single complex root
Let $r_1 = a + bi$, $r_2 = a - bi$
be the pair, and let
$r = (a^2 + b^2)^{1/2}$, $\cos \omega = a / r$, then $x_t = c_0 r^t \cos(\omega t + c_1)$. ($c_0$, $c_1$: constants)
[Figure: complex plane (real and imaginary axes) showing the roots $r_1$ and $r_2$, with modulus r, angle ω, real part a and imaginary part b]
4) Multiple complex roots: very rarely encountered.
$x_t$ = A mixture of previous solutions. Notice that, in all cases, if
$|r| > 1$,
the solution has a systematic explosive behavior, which is not found in actual economic series.
Thus, we shall always assume
$|r| \le 1$
(This assumption is in reality an identification condition.)
In summary:
SOLUTION OF DIFFERENCE EQUATIONS:
$x_t$ = Sum of
* damped exponentials in time ( → 0 )
* polynomials in time (deterministic trends)
* cosine functions (seasonal and other cycles)
Remark
Using the backward operator B, the difference equation can be written as $\phi(B) x_t = 0$, where $\phi(B) = 1 + \phi_1 B + \ldots + \phi_p B^p$.
Comparing this last expression with the characteristic equation:
* roots of $[\phi(B) = 0]$ = $1 / r_i$.
Thus, in terms of the roots of φ(B) = 0, the condition
$|r| \le 1$
becomes
$|B| \ge 1$.
Back to the ∇12 example.
Thus, for $x_t - x_{t-12} = 0$, the characteristic equation is $r^{12} - 1 = 0$; or $r = (1)^{1/12}$,
or the twelve roots of the unit circle.
All 12 roots have unit modulus. Roots are:
* 2 real roots: $r_1 = 1$, $r_2 = -1$
[Figure: unit circle in the complex plane (real and imaginary axes) with the roots; ω = π/6 marked]
* 10 complex roots,
in pairs of complex conjugates
Each complex conjugate pair is associated with a frequency: $\omega = \pi/6$, $\omega = 2\pi/6$, $\omega = 3\pi/6$, $\omega = 4\pi/6$, $\omega = 5\pi/6$
Each pair will generate a solution of the type A cos (ωt + B)
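The twelve roots and their frequencies can be listed directly. This is a small sketch using only Python's standard `cmath` module; the variable names are illustrative.

```python
import cmath
import math

# The twelve solutions of r**12 = 1, written as exp(i * 2*pi*j / 12).
roots = [cmath.exp(2j * math.pi * j / 12) for j in range(12)]

moduli = [abs(r) for r in roots]

# Frequency (angle) of the roots j = 0..6, expressed in multiples of pi/6.
freq_in_sixths = [round(cmath.phase(roots[j]) / (math.pi / 6)) for j in range(7)]

# Period implied by each nonzero frequency: tau = 2*pi / omega = 12 / k months.
periods = [12 / k for k in freq_in_sixths if k > 0]
```

Roots j = 7, …, 11 are the complex conjugates of roots j = 5, …, 1, so they share the same frequencies.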
Consider the frequency
$\omega = \pi / 6$.
How many periods are needed to complete a full circle? ⇒ 12 periods.
Hence this frequency implies 1 circle (or cycle) per year.
For frequency
$\omega = 2\pi / 6$, 6 periods are needed to complete the circle, hence
$\omega = \pi / 3$ implies that 2 circles per year are completed.
For frequency $\omega = 3\pi / 6$, 4 periods are needed to complete the circle, hence
$\omega = \pi / 2$ ⇒ 3 circles completed per year, and so on.
…..
Finally, for $\omega = 6\pi / 6 = \pi$, the root is real and equal to r = -1. For this frequency, a full circle is completed in two periods. Hence, for monthly data,
r = -1 (ω = π) ⇒ 6 circles per year
Notice that the root r = -1 implies that the factor (1 + B) appears in the factorization of the AR polynomial. (Such is the case when this polynomial is S, or $\nabla_s$.)
In short: $X_t = C + \sum_{j=1}^{6} A_j \cos\left(\frac{j\pi}{6} t + B_j\right)$
C: constant, associated with the zero frequency root B = 1 (i.e., with the factor (1-B));
$j\pi / 6$: seasonal frequencies.
j = 1 once a year: "Fundamental" frequency
j = 2 twice a year
... ... "harmonics"
j = 6 six times a year
[Figure: cosine waves over 12 months for ω = π/6, π/3, π/2, 2π/3, 5π/6 and π]
Another way to look at it:
$\nabla_{12} = 1 - B^{12}$ can be factorized as the product of

AR factor                        Frequency
$(1 - B)$                        associated with trend
$(1 - \sqrt{3} B + B^2)$         Once a year
$(1 - B + B^2)$                  Twice a year
$(1 + B^2)$                      3 times a year
$(1 + B + B^2)$                  4 times a year
$(1 + \sqrt{3} B + B^2)$         5 times a year
$(1 + B)$                        6 times a year

so that $\nabla_{12} = (1 - B) S$:
1 – B contains a trend root,
S contains the seasonal roots
(one real and 5 pairs of complex conjugates). All of them are "unit roots" (unit in modulus).
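The factorization can be verified by multiplying the factors out. The sketch below represents each polynomial in B as a list of coefficients; the helper `poly_mul` is an illustrative name, not part of any program discussed in these notes.

```python
import math


def poly_mul(p, q):
    """Multiply two polynomials in B given as coefficient lists [c0, c1, ...]."""
    out = [0.0] * (len(p) + len(q) - 1)
    for i, pi in enumerate(p):
        for j, qj in enumerate(q):
            out[i + j] += pi * qj
    return out


root3 = math.sqrt(3.0)
factors = [
    [1.0, -1.0],         # (1 - B): trend
    [1.0, -root3, 1.0],  # once a year
    [1.0, -1.0, 1.0],    # twice a year
    [1.0, 0.0, 1.0],     # 3 times a year
    [1.0, 1.0, 1.0],     # 4 times a year
    [1.0, root3, 1.0],   # 5 times a year
    [1.0, 1.0],          # (1 + B): 6 times a year
]

# Product of all seven factors: should be 1 - B^12.
product = [1.0]
for f in factors:
    product = poly_mul(product, f)

# Product of the six seasonal factors: should be S = 1 + B + ... + B^11.
S = [1.0]
for f in factors[1:]:
    S = poly_mul(S, f)
```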
Hence, for example,
$\nabla \nabla_{12} x_t = 0$, (1)
since $\nabla \nabla_{12} = \nabla^2 S$, will cancel
(A) the polynomial $p_t = a + bt$ ($\nabla^2 p_t = 0$)
and
(B) the seasonal cycles for the 1, 2, ..., 6 times-a-year frequencies ($S s_t = 0$):
$s_t = \sum_j A_j \cos(\omega_j t + B_j)$, $\omega_j = \frac{2\pi}{12} j$, j = 1, 2, …, 6.
Notice that, when specifying stochastic ARIMA models, we will not say that $\nabla \nabla_{12} x_t$ is exactly = 0, but instead that
$\nabla \nabla_{12} x_t = z_t$
where $z_t$ is a zero mean, finite variance stationary stochastic process.
That is, $\nabla \nabla_{12} x_t$ will on average be zero and will not depart too much from it.
Thus, every period, the functions (A) and (B) will be "perturbed" by a stochastic input,
so that $a, b \to a^{(t)}, b^{(t)}$; $C \to C^{(t)}$; $A_j \to A_j^{(t)}$; $B_j \to B_j^{(t)}$; and so on.
DISTRIBUTION OF THE STATIONARY SERIES
So, let in general
$\delta(B) = \nabla^d \nabla_s^D$
represent all the differences applied for reaching stationarity, i.e.,
$z_t = \delta(B) x_t$ is stationary.
Implication: $[z_1, \ldots, z_T]$ will have a well-defined proper joint distribution.
Further, we assume: JOINT NORMALITY
Hence, we consider: LINEAR STATIONARY STOCHASTIC PROCESSES
The time series generated by them will be jointly Normally distributed.
In the multivariate Normal distribution, conditional expectations are linear functions of the observed series. For ex.:
Expectation of a future value: $E(z_{t+j} \mid z_1, \ldots, z_t)$: forecast ( j > 0 )
Expectation of a missing value: $E(z_t \mid z_1 \ldots z_{t-1}, z_{t+1} \ldots z_T)$: interpolator ($z_t$ missing)
Expectation of a signal st buried in zt (zt = st + noise)
E (st | z1 ,…, zT): signal extraction
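Stationarity ⇒
$E z_t = \mu$
$V z_t = V_z$
$\mathrm{Cov}(z_t, z_{t-k}) = \gamma_k$:
$\gamma_k$ only depends on |k|, the relative distance between observations.
(it does not depend on t)
Thus, $(z_1, \ldots, z_T) \sim N_T(\mu, \Sigma)$, with
$\Sigma = \begin{pmatrix} V_z & \gamma_1 & \gamma_2 & \cdots & \gamma_{T-1} \\ & V_z & \gamma_1 & \cdots & \gamma_{T-2} \\ & & V_z & \cdots & \gamma_{T-3} \\ & & & \ddots & \vdots \\ (\text{sym.}) & & & & V_z \end{pmatrix}$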
More parsimonious representation:
AUTOCOVARIANCE FUNCTION: $\gamma_k$ = ACovF as a function of k
Let $F = B^{-1}$ (i.e., $F z_t = z_{t+1}$); F ≡ "Forward" Operator
AUTOCOVARIANCE GENERATING FUNCTION:
ACovGF = $\gamma(B, F) = \gamma_0 + \gamma_1 (B + F) + \gamma_2 (B^2 + F^2) + \ldots = \gamma_0 + \sum_{j=1}^{\infty} \gamma_j (B^j + F^j)$
Better (scale free) measure: autocorrelation
Lag-k autocorrelation: $\rho_k = \gamma_k / \gamma_0$
AUTOCORRELATION FUNCTION: ACF ≡ $\rho_k$ as a function of k
( since symmetric, only needed for k > 0 )
AUTOCORRELATION GENERATING FUNCTION: ACGF ≡ $\rho(B, F) = 1 + \sum_{j=1}^{\infty} \rho_j (B^j + F^j)$
[Figure: ACF]
Results: If $z_t$ is stationary, then
1) $\rho_0 = 1$,
2) $\rho_j = \rho_{-j}$ ( symmetric ),
3) $|\rho_k| < 1$, k ≠ 0,
4) $\rho_k \to 0$ as k → ∞,
5) $\sum_{k=0}^{\infty} |\rho_k| < \infty$ (Convergence condition)
Note: JOINT NORMALITY implies that
$\mu$, $\gamma_0$, $\rho(B, F)$
fully characterize the joint distribution function of $[z_1, \ldots, z_T]$;
they contain all the "sample" information.
$\rho_k$ = Lag-k autocorrelation is a measure of the (linear) dependence between observations k periods apart
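In practice the autocorrelations are estimated from the data. A minimal sketch of one common estimator follows (sample autocovariances divided by the sample variance, all with divisor n); the function name `sample_acf` and the artificial series are assumptions for illustration.

```python
import math


def sample_acf(z, max_lag):
    """Sample autocorrelations rho_0, ..., rho_max_lag of the list z."""
    n = len(z)
    mean = sum(z) / n
    dev = [v - mean for v in z]
    gamma0 = sum(d * d for d in dev) / n
    acf = []
    for k in range(max_lag + 1):
        gamma_k = sum(dev[t] * dev[t - k] for t in range(k, n)) / n
        acf.append(gamma_k / gamma0)
    return acf


# Hypothetical monthly series: a pure once-a-year cosine, 10 years long.
z = [math.cos(math.pi * t / 6) for t in range(120)]
rho = sample_acf(z, 12)
```

For this artificial series the estimated autocorrelation is strongly negative at lag 6 and strongly positive at lag 12, reflecting the yearly cycle.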
Wold Representation
A linear (≡jointly Normal) stationary purely stochastic process can be expressed as:
$z_t = a_t + \psi_1 a_{t-1} + \psi_2 a_{t-2} + \ldots = \sum_{j=0}^{\infty} \psi_j a_{t-j}$, ($\psi_0 = 1$),
where
$a_t \sim \text{niid}(0, V_a)$: white noise variable ≡ "innovations" ≡ "residuals" ≡ 1-period-ahead forecast error (known parameters):
$a_t = z_t - \hat{z}_{t|t-1}$,
with $\hat{z}_{t|t-j}$ denoting the forecast of $z_t$ made in period t-j.
Hence
$z_t$ = Linear filter applied to innovations. Also called the MA representation of $z_t$. The filter is one-sided (only past and present innovations) and convergent:
$\psi_j \to 0$ as $j \to \infty$, $\sum_{j=0}^{\infty} |\psi_j| < \infty$.
In short: $z_t = \psi(B) a_t$, $\psi(B) = \sum_{j=0}^{\infty} \psi_j B^j$
Useful result:
Let $\gamma_z(B, F)$ = ACovGF ($z_t$). Then, $\gamma_z(B, F) = \psi(B) \psi(F) V_a$
In particular, for the variance: $\gamma_0 = (1 + \psi_1^2 + \psi_2^2 + \ldots) V_a$
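The useful result above can be turned into a short computation: the coefficient of $B^k$ in $\psi(B)\psi(F)V_a$ is $V_a \sum_j \psi_j \psi_{j+k}$. The sketch below assumes a finite list of ψ-weights; the function name and the MA(2) coefficients are hypothetical.

```python
def autocov_from_psi(psi, v_a, max_lag):
    """gamma_k = V_a * sum_j psi_j * psi_{j+k}, for a finite list of psi-weights.

    This is the coefficient of B^k (and of F^k) in psi(B) psi(F) V_a.
    """
    gammas = []
    for k in range(max_lag + 1):
        total = sum(psi[j] * psi[j + k] for j in range(len(psi) - k))
        gammas.append(v_a * total)
    return gammas


# Hypothetical MA(2): psi(B) = 1 + 0.5 B + 0.3 B^2, with V_a = 1.
theta1, theta2 = 0.5, 0.3
gammas = autocov_from_psi([1.0, theta1, theta2], v_a=1.0, max_lag=3)
```

With these values the autocovariances vanish beyond lag 2, as expected for an MA(2).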
ACF: Basic tool in "Time Domain analysis" of a series. Another important tool:
Spectrum
Basic tool in "Frequency Domain analysis" of a series.
Consider the time series:
$x_t = [x_1, x_2, \ldots, x_T]$
We can "exactly explain" the series with the polynomial of degree (T-1): $x_t = a_0 + a_1 t + \ldots + a_{T-1} t^{T-1}$
(Set, successively, t = 1, 2, …, T, and a linear system of T equations in the T unknowns $a_0, a_1, \ldots, a_{T-1}$ is obtained).
In a similar manner, we can represent (exactly) the T observations
To simplify, assume T is even, so that T = 2q
Define the Fundamental Frequency
$\omega_1 = 2\pi / T$
(i.e., the frequency of one full circle completed in T periods), and its Harmonics:
$\omega_j = \frac{2\pi}{T} j$, j = 1, 2, …, q. Then, express $x_t$ as
$x_t = \sum_{j=1}^{q} (a_j \cos \omega_j t + b_j \sin \omega_j t)$
Letting t = 1, 2, …, T, a linear system of T equations in the T unknowns
$(a_j, b_j)$, j = 1, …, q, is obtained.
For a particular periodic component (with frequency $\omega_j$), the "amplitude" (i.e., the height of the peak) is equal to $A_j$, where
$A_j^2 = a_j^2 + b_j^2$
Notice: The bigger this amplitude, the larger will be the contribution of the component in explaining xt .
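The harmonic representation above can be sketched numerically. Two assumptions are made here that are not in the notes: a constant term `a0` (the sample mean) is added, and the coefficient $b_q$ plays no role because $\sin(\pi t) = 0$ at every integer t. The coefficients are obtained with the usual closed-form sums rather than by solving the linear system; the data are made up.

```python
import math


def harmonic_coefficients(x):
    """Coefficients of the exact harmonic representation of x_1..x_T (T even).

    Assumption (not in the notes): a constant a0 is included, and b_q is
    irrelevant since sin(pi * t) = 0 at every integer t.
    """
    T = len(x)
    q = T // 2
    a0 = sum(x) / T
    a, b = [], []
    for j in range(1, q + 1):
        w = 2 * math.pi * j / T
        scale = 1.0 / T if j == q else 2.0 / T
        a.append(scale * sum(x[t - 1] * math.cos(w * t) for t in range(1, T + 1)))
        b.append(scale * sum(x[t - 1] * math.sin(w * t) for t in range(1, T + 1)))
    return a0, a, b


def rebuild(a0, a, b, T):
    """Evaluate a0 + sum_j (a_j cos w_j t + b_j sin w_j t) for t = 1..T."""
    out = []
    for t in range(1, T + 1):
        value = a0
        for j in range(1, len(a) + 1):
            w = 2 * math.pi * j / T
            value += a[j - 1] * math.cos(w * t) + b[j - 1] * math.sin(w * t)
        out.append(value)
    return out


x = [3.0, 1.5, -2.0, 0.5, 4.0, -1.0, 2.5, 0.0]  # hypothetical data, T = 8
a0, a, b = harmonic_coefficients(x)
x_rebuilt = rebuild(a0, a, b, len(x))
amplitudes = [math.hypot(aj, bj) for aj, bj in zip(a, b)]  # A_j for each harmonic
```

The rebuilt values match the original observations up to rounding error, and `amplitudes` holds one $A_j$ per harmonic.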
Example:
[Figure: Example]
In general, group the cosine functions by intervals of frequency
(summing the amplitudes)
HISTOGRAM OF THE DISTRIBUTION BY FREQUENCY
[Figure: histogram over frequency ω, from 0 to π]
In the same way that:
density function ≡ population counterpart of standard histogram,
Given that $\cos(\alpha) = \cos(\alpha + 2\pi)$ and $\cos(\alpha) = \cos(-\alpha)$,
the spectrum is periodic and symmetric ⇒ enough to consider: 0 ≤ ω ≤ π
$2 \int_0^{\pi} g(\omega)\, d\omega = \gamma_0$ ( variance of series ) (or $\int_{-\pi}^{\pi}$)
∴ Spectrum can be interpreted as a decomposition of variance by intervals of frequency.
(Standardized) it displays properties similar to those of a density function.
[Figure: Spectrum; Series]
ω = frequency in radians
Recall: $\omega = 2\pi / \tau$, Period = τ
Spectrum g(ω): decomposes $V_x$ by frequency.
[Figure: Series (cosine of period τ against t) and Spectrum g(ω) for ω from 0 to π, showing the variance associated with an interval dω]
Consider the once-a-year frequency in monthly data
$\tau = 12 \Rightarrow \omega = 2\pi / 12 = \pi / 6$
Hence: If the series has an important seasonal component with that frequency (τ = 12 months), in $g_x(\omega)$: peak for ω = π/6
[Figure: Spectrum of an AR(2) for ω from 0 to π, with a peak for ω = π/6]
For trend:
A way to think about trends:
Cycles with period close to ∞ (ex.: cycles with periods 1000 years, 10,000 years,…): $\tau \to \infty \Rightarrow \omega \to 0$
[Figure: spectrum g(ω) for ω from 0 to π]
If the spectrum of a quarterly series is, for example:
[Figure: Spectrum $g_x(\omega)$ and Series, ω from 0 to π]
Peak for:
ω = 0 → Trend
ω = π/2 → Once a year (seasonal frequency)
ω = π → Twice a year (seasonal frequency)
To extract some signal from a series, for example, to S.A. a series:
- remove variation around seasonal frequencies - leave the rest unchanged.
If $n_t$ = SA series is estimated through
$\hat{n}_t = c(B, F) x_t = \ldots + c_2 x_{t-2} + c_1 x_{t-1} + c_0 x_t + c_1 x_{t+1} + c_2 x_{t+2} + \ldots$
where
$c(B, F) = c_0 + \sum_j c_j (B^j + F^j)$
is a symmetric filter, the F.T. of the filter (recall: $B^j + F^j \to 2 \cos j\omega$) is
$\tilde{c}(\omega) = c_0 + 2 \sum_j c_j \cos(j\omega)$
From $\hat{n}_t = c(B, F) x_t$,
$g_{\hat{n}}(\omega) = [\tilde{c}(\omega)]^2 g_x(\omega)$
Gain of filter = $\tilde{c}(\omega)$
Hence:
Spectrum of nˆt = ( Squared gain of filter ) x ( Spectrum of series )
Squared gain: Determines, for each frequency, which proportion of the series variance is passed on to the signal estimator.
= 1: all the variation is passed
= 0: the frequency is ignored
[Figure: squared gain of a filter, between 0 and 1, for ω from 0 to π]
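To make the squared gain concrete, the sketch below evaluates $\tilde{c}(\omega)$ for one symmetric filter. The filter chosen, a centred 2x12 moving average for monthly data, is an assumption made for illustration and is not the filter used by the methodology presented in these notes.

```python
import math


def gain(c, w):
    """Gain c~(w) = c_0 + 2 * sum_j c_j cos(j w) of a symmetric filter.

    `c` holds the one-sided weights [c_0, c_1, ..., c_m].
    """
    return c[0] + 2.0 * sum(c[j] * math.cos(j * w) for j in range(1, len(c)))


# Assumed example filter: centred 2x12 moving average for monthly data
# (weights 1/12 for lags 0..5 and 1/24 for lag 6, on each side).
weights = [1.0 / 12] * 6 + [1.0 / 24]

squared_gain_trend = gain(weights, 0.0) ** 2
squared_gain_seasonal = [gain(weights, j * math.pi / 6) ** 2 for j in range(1, 7)]
```

For this filter the squared gain is one at the zero frequency and zero at the six seasonal frequencies, so the seasonal variation is removed and the variation at ω = 0 is passed in full.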
SPECTRUM OF A LINEAR PROCESS (AND OF AN ARIMA MODEL)
In general, from the ACovGF, the spectrum is easily obtained as its Fourier Transform:
$\gamma(B, F) = \gamma_0 + \sum_j \gamma_j (B^j + F^j)$
F.T.: $B = e^{-i\omega}$ ⇒ $B^j + F^j \to 2 \cos j\omega$, $0 \le \omega \le 2\pi$
$\mathbf{g}(\omega) = \frac{1}{2\pi} \left[ \gamma_0 + 2 \sum_j \gamma_j \cos j\omega \right]$
For notational simplicity, we shall work with
$g(\omega) = 2\pi\, \mathbf{g}(\omega)$,
Ex. 1:
ACF and spectrum of MA (2)
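$x_t = (1 + \theta_1 B + \theta_2 B^2) a_t$, [$V_a = 1$]
ACovGF = $\theta(B)\, \theta(F) = (1 + \theta_1 B + \theta_2 B^2)(1 + \theta_1 F + \theta_2 F^2)$
$= (1 + \theta_1^2 + \theta_2^2)$ [coeff. of $B^0$: $\gamma_0$]
$+ \theta_1 (1 + \theta_2)(B + F)$ [coeff. of B and F: $\gamma_1$]
$+ \theta_2 (B^2 + F^2)$ [coeff. of $B^2$ and $F^2$: $\gamma_2$]
In short, with $B^k + F^k \to 2 \cos k\omega$:
$g(\omega) = [\gamma_0 + 2\gamma_1 \cos \omega + 2\gamma_2 \cos 2\omega]\, V_a$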
Ex. 2:
$z_t$ ~ white noise ⇒ $g_z(\omega)$ = constant
[Figure: ACF of white noise; spectrum $g_z(\omega)$ of white noise, ω from 0 to π]
Ex. 3: AR(1)
$z_t + \phi z_{t-1} = a_t$, $|\phi| < 1$,
$(1 + \phi B) z_t = a_t$,
$z_t = \frac{1}{1 + \phi B} a_t$
$z_t = \psi(B) a_t \to \psi(B) = \frac{1}{1 + \phi B}$ (Wold representation)
Therefore, the ACovGF is: $\psi(B)\psi(F) V_a$
$\gamma_z(B, F) = \frac{1}{(1 + \phi B)} \frac{1}{(1 + \phi F)} V_a = \frac{1}{1 + \phi^2 + \phi (B + F)} V_a$
Hence the spectrum is: (B + F → 2 cos ω)
$g_z(\omega) = \frac{1}{1 + \phi^2 + 2\phi \cos \omega} V_a$
[Figure: spectra $g_z(\omega)$ for ω from 0 to π, for φ < 0 and for φ > 0, and ACFs of $(1 - .8B) x_t = a_t$ and $(1 + .8B) x_t = a_t$]
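The AR(1) spectrum can be evaluated directly. The sketch below uses the two models of the figure, φ = −.8 and φ = .8, and also recovers the variance by numerical integration; the $1/2\pi$ factor of the original definition of the spectrum is reintroduced at that step. Function names are illustrative.

```python
import math


def ar1_spectrum(phi, w, v_a=1.0):
    """g_z(w) = V_a / (1 + phi^2 + 2 phi cos w) for (1 + phi B) z_t = a_t."""
    return v_a / (1.0 + phi * phi + 2.0 * phi * math.cos(w))


def variance_from_spectrum(phi, n=20000):
    """Approximate (1 / 2 pi) * integral of g_z over [0, 2 pi] (midpoint rule)."""
    step = 2.0 * math.pi / n
    total = sum(ar1_spectrum(phi, (i + 0.5) * step) for i in range(n))
    return total * step / (2.0 * math.pi)


low = [ar1_spectrum(-0.8, w) for w in (0.0, math.pi)]  # (1 - .8B) z_t = a_t
high = [ar1_spectrum(0.8, w) for w in (0.0, math.pi)]  # (1 + .8B) z_t = a_t
variance = variance_from_spectrum(-0.8)
```

With φ < 0 the spectrum peaks at ω = 0, with φ > 0 it peaks at ω = π, and the integral returns the variance $V_a / (1 - \phi^2)$.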
Ex. 4 : PSEUDO-SPECTRUM
As φ → −1 the model approaches NONSTATIONARITY. In the limit:
$(1 - B) z_t = a_t$ ≡ RANDOM WALK
$\nabla z_t = a_t$ ($V_a = 1$)
$z_t = a_t + a_{t-1} + a_{t-2} + a_{t-3} + \ldots$
As t → ∞
* Mean is not defined ("0⋅∞").
* Variance goes to ∞.
As with stationary models, write
$\psi(B) = 1 / \nabla$
"Pseudo" ACovGF = $V_a \frac{1}{(1 - B)} \frac{1}{(1 - F)}$ (does not converge)
F.T. ≡ $g_z(\omega) = \frac{1}{2 (1 - \cos \omega)}$ ⇒ $g \to \infty$ for ω = 0
$\int g$ does not converge (variance goes to ∞)
[Figure: (pseudo) spectrum of a random walk, ω from 0 to π; Series]
p-spectrum is:
- informative
- well-behaved in a very basic way.
For example, consider the two trend spectra and the two associated realizations:
[Figure: Trend Spectra, ω from 0 to π]
[Figure: Trend (two realizations)]
The trend that corresponds to the wider spectral peak contains more stochastic variability (i.e., is of a more "moving" nature). The narrow peak generates a more stable trend.
Similarly, from the two spectra:
[Figure: Seasonal component Spectra]
[Figure: Seasonal Component (two realizations)]
As was the case with the trend, the narrow spectral peaks produce stable seasonal components.
The wider peaks produce components that change faster (more moving components).
In what follows we shall also refer to the "p-spectrum" simply as the "spectrum".
Ex. 5 : AR (2)
$(1 + \phi_1 B + \phi_2 B^2) z_t = a_t$
$1 + \phi_1 B + \phi_2 B^2 = 0$
Either:
- 2 real roots ( each ~ AR(1) )
- Complex conjugate roots, → $z_t = r \cos(\omega t) + \ldots$
with
* modulus $r = \sqrt{\phi_2}$
* frequency $\omega = \arccos\left(\frac{-\phi_1}{2r}\right)$ (in rads.)
* period $\tau = 2\pi / \omega$
In this case, the spectrum shows a peak for ω
[Figure: Spectrum; Series]
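The modulus, frequency and period of the cycle implied by an AR(2) can be computed from its coefficients. The sketch below works with the characteristic equation $r^2 + \phi_1 r + \phi_2 = 0$, whose roots have real part $-\phi_1 / 2$; the function name and the coefficient values are hypothetical.

```python
import cmath
import math


def ar2_cycle(phi1, phi2):
    """Modulus, frequency and period for (1 + phi1 B + phi2 B^2) z_t = a_t.

    Works with the characteristic equation r^2 + phi1 r + phi2 = 0 and
    returns None when the two roots are real.
    """
    disc = phi1 * phi1 - 4.0 * phi2
    if disc >= 0:
        return None
    root = (-phi1 + cmath.sqrt(disc)) / 2.0
    modulus = abs(root)                      # equals sqrt(phi2)
    omega = math.acos(-phi1 / (2.0 * modulus))
    return modulus, omega, 2.0 * math.pi / omega


# Hypothetical coefficients: modulus 0.9 and a once-a-year cycle in monthly data.
phi1 = -2.0 * 0.9 * math.cos(math.pi / 6)
phi2 = 0.81
modulus, omega, period = ar2_cycle(phi1, phi2)
```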
General Hint: Useful way to look at AR(p) : Factorize it as:
Real roots × Complex roots
↓ ↓
each one: AR(1) each pair: AR(2)
[Figure: Spectrum and Series for real roots, AR(1)]
[Figure: Spectrum and Series for complex roots, AR(2), with a peak at ω between 0 and π]
Ex. 6 : SEASONAL SERIES
If there is seasonal nonstationarity,
unit roots show up as ∞ for seasonal frequencies.
For example, consider a quarterly time series xt such that
$\nabla_4 x_t = z_t$, $z_t$: stationary process.
Then,
$x_t = \frac{1}{\nabla_4} z_t = \frac{1}{\nabla S} z_t = \frac{1}{\nabla (1 + B + B^2 + B^3)} z_t = \frac{1}{\nabla (1 + B)(1 + B^2)} z_t$
$\nabla$: root associated with ω = 0
$1 + B$: root associated with ω = π
$1 + B^2$: root associated with ω = π/2
p-ACGF ($x_t$) = $\frac{1}{(1 - B)(1 - F)} \cdot \frac{1}{(1 + B)(1 + F)} \cdot \frac{1}{(1 + B^2)(1 + F^2)} \cdot \gamma_z(B, F)$
Operating and using $B^j + F^j = 2 \cos j\omega$, the p-spectrum is
$g_x(\omega) = \frac{1}{2 (1 - \cos \omega)} \cdot \frac{1}{2 (1 + \cos \omega)} \cdot \frac{1}{2 (1 + \cos 2\omega)} \cdot \gamma_z(\omega)$
where the first factor is ∞ for ω = 0, the second is ∞ for ω = π (seasonal freq.), the third is ∞ for ω = π/2 (seasonal freq.), and $\gamma_z(\omega)$ is bounded.
Unit AR roots dominate the spectrum of the series.
Thus, a "standard" series with trend and seasonality (both NS) will display a spectrum of the type:
[Figure: spectrum of a quarterly series, ω from 0 to π]
[Figure: spectrum of a monthly series]
ARIMA models
Back to the Wold general representation of a purely stationary series:
$z_t = \psi(B) a_t$
Problem: In general, ψ(B) is of degree ∞. Thus we use a rational approximation:
$\psi(B) = \theta(B) / \phi(B)$
θ(B) = finite degree q
φ(B) = finite degree p
Therefore, $z_t = \frac{\theta(B)}{\phi(B)} a_t$, or: $\phi(B) z_t = \theta(B) a_t$
Autoregressive Moving-Average Models: ARMA models
AR (p) polynomial: $1 + \phi_1 B + \ldots + \phi_p B^p$
MA (q) polynomial: $1 + \theta_1 B + \ldots + \theta_q B^q$
$(1 + \phi_1 B + \ldots + \phi_p B^p) z_t = (1 + \theta_1 B + \ldots + \theta_q B^q) a_t$: ARMA (p,q) model.
Let $z_t = \delta(B) x_t$ [δ(B) ≡ stationary transformation]
If $\delta(B) = \nabla^d$:
$x_t$ ~ ARIMA ( p, d, q ) model
I ≡ integrated (of order d), $x_t$ ~ I(d)
Model: $\phi(B)\, \delta(B)\, x_t = \theta(B)\, a_t$
φ(B): stationary
θ(B): invertible
δ(B):
Stationarity of ARMA Models:
Roots of φ( B )=0 lie outside the Unit Circle
UNIT CIRCLE = circle in the complex plane with radius = 1
Let B1 ,…, Bp ≡ roots of φ (B) = 0.
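Stationarity implies that $|B_i| > 1$, i = 1, …, p;
that is: the moduli of the roots $B_1, \ldots, B_p$ of φ(B) = 0 are > 1
[Figure: complex plane (real and imaginary axes) with the unit circle, showing the modulus and frequency ω of a root; four roots of unit circle]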
Ex:
a) AR(1): $x_t + \phi x_{t-1} = a_t$
$(1 + \phi B) x_t = a_t$
* root of φ(B): $1 + \phi B = 0 \Rightarrow B = -1 / \phi$
* root outside U.C. when
$\text{mod}(B) = |-1 / \phi| > 1 \Rightarrow |\phi| < 1$
b) Stationarity region for AR(2):
$x_t + \phi_1 x_{t-1} + \phi_2 x_{t-2} = a_t$
Conditions for roots of $1 + \phi_1 B + \phi_2 B^2 = 0$ to be > 1 in moduli.
Useful diagram:
The stationary region is the region inside the triangle.
The shaded area is the region of complex roots (periodic behavior).
[Figure: stationarity triangle in the ($\phi_1$, $\phi_2$) plane, with the region of complex roots shaded]
If $z_t$ is stationary, $\phi(B)^{-1}$ converges, and hence we can write
$z_t = [\phi(B)^{-1} \theta(B)]\, a_t$.
- The series accepts a convergent MA representation,
- Its ACF converges to zero.
Invertibility
Roots of θ(B) = 0 lie outside the Unit Circle. Thus $\theta(B)^{-1}$ converges and we can write
$[\theta(B)^{-1} \phi(B)]\, z_t = a_t$
If $z_t$ is invertible,
- it accepts a convergent AR representation
Remark:
- A unit AR root induces nonstationarity, $g_x(\omega \text{ assoc. with U.R.}) \to \infty$
- A unit MA root induces noninvertibility, $g_x(\omega \text{ assoc. with U.R.}) = 0$
Model: $(1 - B) x_t = (1 + B) a_t$. AR root (1-B); MA root (1+B)
[Figure: spectrum, ω from 0 to π]
Model: $(1 + B) x_t = (1 - B) a_t$. AR root (1+B); MA root (1-B)
[Figure: spectrum, ω from 0 to π]
TWO USEFUL ALTERNATIVE REPRESENTATIONS:
ARMA model: $\phi(B) x_t = \theta(B) a_t$
(a) $x_t = \frac{\theta(B)}{\phi(B)} a_t = (1 + \psi_1 B + \psi_2 B^2 + \ldots)\, a_t$
or $x_t = \psi(B) a_t$
↑ "Psi" - weights (ψ- weights )
* If $x_t$ is stationary: ψ- weights → 0
(b) Alternatively, $\frac{\phi(B)}{\theta(B)} x_t = a_t$, or $(1 + \pi_1 B + \pi_2 B^2 + \ldots)\, x_t = a_t$,
and in compact form,
$\pi(B) x_t = a_t$
↑ "Pi" - weights (π- weights )
* If $x_t$ is invertible, π- weights → 0
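Both sets of weights can be obtained by expanding a ratio of polynomials term by term. The sketch below does this with a simple recursion; the function name `expand_ratio` and the ARMA(1,1) coefficients are assumptions for illustration.

```python
def expand_ratio(num, den, n_terms):
    """First n_terms coefficients of num(B) / den(B), with den[0] == 1.

    Uses the recursion w_k = num_k - sum_{j>=1} den_j * w_{k-j}.
    """
    weights = []
    for k in range(n_terms):
        value = num[k] if k < len(num) else 0.0
        for j in range(1, min(k, len(den) - 1) + 1):
            value -= den[j] * weights[k - j]
        weights.append(value)
    return weights


# Hypothetical ARMA(1,1): (1 - 0.7 B) x_t = (1 + 0.4 B) a_t.
phi = [1.0, -0.7]
theta = [1.0, 0.4]
psi = expand_ratio(theta, phi, 30)   # psi(B) = theta(B) / phi(B)
pi_w = expand_ratio(phi, theta, 30)  # pi(B)  = phi(B) / theta(B)
```

Since this model is stationary and invertible, both sequences of weights decay toward zero.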
For seasonal data, often the general multiplicative ARIMA model is used:
$\phi(B)\, \Phi(B^s)\, \nabla^d \nabla_s^D x_t = \theta(B)\, \Theta(B^s)\, a_t$
φ(B) ≡ regular AR pol.
Φ($B^s$) ≡ seasonal AR pol. ( in $B^s$ )
θ(B) ≡ regular MA pol.
Θ($B^s$) ≡ seasonal MA pol. ( in $B^s$ )
where: φ(B) and Φ($B^s$) are stationary,
θ(B) and Θ($B^s$) are invertible.
Two important points:
- PARSIMONY (few parameters);
- short-term use (max. horizon: 1 – 2 years)
(it may affect the order of differencing: roughly, * short-term use favors differencing
"Box-Jenkins"-type approach:
IDENTIFICATION of the model → ESTIMATION → DIAGNOSIS → O.K.? (No: back to an earlier stage; Yes: INFERENCE)
IDENTIFICATION OF THE ARIMA MODEL
One has to determine:
a) Degrees of differencing
b) Orders p and q of ARMA
a) Traditional criterion: "fast enough convergence of ACF". As we shall see, unit roots are easily detected through estimation (Tiao, Tsay)
b) Main idea: to "match" the ACF of some known ARMA. Basic traits of the ACF of an ARMA (p, q):
$(1 + \phi_1 B + \ldots + \phi_p B^p)\, x_t = (1 + \theta_1 B + \ldots + \theta_q B^q)\, a_t$
ACF: $\rho_k$ ≡ Lag-k autocorrelation of $x_t$, as a function of k:
It will display
* q starting conditions,
after which the AR difference equation
* $\rho_k + \phi_1 \rho_{k-1} + \ldots + \phi_p \rho_{k-p} = 0$
holds. Hence, for k > q, $\rho_k$ is the solution of
$\phi(B) \rho_k = 0$ where B operates on k.
Thus: "Eventual ACF"
$\rho_k = \sum [ (\text{pol. in } t) + \text{cosine func.} ]$
[Figure: ACF of $(1 - .8B) x_t = a_t$; ACF of $(1 + .8B) x_t = a_t$]
[Figure: ACF of an AR(2) with complex roots]
At present, "identification" uses more efficient procedures (as will be seen in TRAMO)
Notes:
a) Often, more than one model may seem reasonable,
hence there is always some room for the analyst's experience or purpose.
b) In practice we do not know the ACF, and autocorrelations have to be estimated.
Estimation can induce large (spurious) covariances that have a distorting effect on the sample ACF, which may fail to damp out according to expectations.
[Figure: SAMPLE ACF and THEORETICAL ACF]
PARAMETER ESTIMATION (more later)
Rough intuition: $[x_1 \ldots x_T]$
Assume: $x_t = a_t + \theta a_{t-1}$, MA (1), |θ| < 1
or $a_t = x_t - \theta a_{t-1}$
Conditional on $a_0$: Set $\theta = \theta^0$
Compute sequentially:
$a_1^0 = x_1 - \theta^0 a_0$
$a_2^0 = x_2 - \theta^0 a_1^0$
………..
$a_T^0 = x_T - \theta^0 a_{T-1}^0$
$SS^0 = \sum_{j=1}^{T} (a_j^0)^2$
Varying θ: $\hat{\theta}$ = min. SS
Notice:
$a_t = x_t - \theta a_{t-1}$
$a_{t-1} = x_{t-1} - \theta a_{t-2}$
$a_{t-2} = x_{t-2} - \theta a_{t-3}$
…..
yields $a_T = x_T - \theta x_{T-1} + \theta^2 x_{T-2} - \ldots + (-\theta)^{T-1} x_1 + (-\theta)^T a_0$
1) HIGHLY NONLINEAR FUNCTION OF PARAMETERS (∴ NL optimization)
2) Effect of starting conditions
IMPORTANCE OF INVERTIBILITY ( | θ | < 1)
[Figure: SS as a function of θ between −1 and 1, with $\theta^0$ and the minimum at $\hat{\theta}$]
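The rough intuition above can be sketched in a few lines: compute the residuals recursively for a given θ, sum their squares, and pick the θ with the smallest sum. This is only an illustration of the idea, not the estimation procedure used by any particular program; the starting value $a_0 = 0$, the grid search and the simulated data are all assumptions.

```python
import random


def css(x, theta, a0=0.0):
    """Conditional sum of squares for x_t = a_t + theta * a_{t-1}, given a_0."""
    prev, total = a0, 0.0
    for value in x:
        resid = value - theta * prev  # a_t = x_t - theta * a_{t-1}
        total += resid * resid
        prev = resid
    return total


def simulate_ma1(theta, n, seed=0):
    """Simulate an MA(1) with standard Normal innovations (illustration only)."""
    rng = random.Random(seed)
    a = [rng.gauss(0.0, 1.0) for _ in range(n + 1)]
    return [a[t] + theta * a[t - 1] for t in range(1, n + 1)]


x = simulate_ma1(theta=0.5, n=2000)
grid = [i / 100.0 for i in range(-95, 96)]  # stay inside the invertible region
theta_hat = min(grid, key=lambda th: css(x, th))
```

Restricting the grid to |θ| < 1 keeps the effect of the starting condition $a_0$ on the residuals damped, which is why invertibility matters here.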
DIAGNOSTICS (more later)
* In-sample
* Out-of-sample
Mostly: residual-based diagnostics.
$\hat{a}_t$ ~ Niid (0, $V_a$)
INFERENCE (more later)
Example: Forecasting (parameters known)
Notation: $\hat{x}_{t+k|t}$ ≡ Forecast of $x_{t+k}$ made in period t.
$x_t = \hat{x}_{t|t-1} + a_t$
$a_t$ ≡ 1-period-ahead forecast error
= $x_t - \hat{x}_{t|t-1}$ (≡ innovation)
These $a_t$'s are the ones of the Wold representation, and of the ARIMA model.
Forecast:
$\hat{x}_{t+k|t} = \text{MMSE}_t (x_{t+k})$ = (under our assumptions) =
$= E(x_{t+k} \mid x_1 \ldots x_t)$
Computation with Kalman filter (later).
Forecast function:
$\hat{x}_{t+k|t}$ as a function of k.
For ARMA (p, q):
Forecast function:
* q starting conditions,
after which the AR difference equation
$\hat{x}_{t+k|t} + \phi_1 \hat{x}_{t+k-1|t} + \ldots + \phi_p \hat{x}_{t+k-p|t} = 0$
holds. Hence $\hat{x}_{t+k|t}$ is the solution of
$\phi(B)\, \hat{x}_{t+k|t} = 0$ where B operates on k.
Note:
Eventual Forecast Function and ACF are solution of the same AR finite difference equation
(by looking at the correlation between present and past, we know the correlation between present and future...)
USEFUL WAY TO LOOK AT THE FORECAST: Use ψ- weights:
$x_{t+k} = a_{t+k} + \psi_1 a_{t+k-1} + \ldots + \psi_{k-1} a_{t+1} + \psi_k a_t + \psi_{k+1} a_{t-1} + \ldots$;
since
$E_t a_{t+k} = 0$ (k > 0): future forecast errors are unknown,
$E_t a_{t+k} = a_{t+k}$ (k ≤ 0): past forecast errors are known,
$\hat{x}_{t+k|t} = E_t x_{t+k} = \psi_k a_t + \psi_{k+1} a_{t-1} + \ldots$
Thus, FORECAST ERROR = $e_{t+k|t} = x_{t+k} - \hat{x}_{t+k|t} = a_{t+k} + \psi_1 a_{t+k-1} + \ldots + \psi_{k-1} a_{t+1}$: MA (k-1) of "future" innovations
From this, distributions are easily derived.
Example: Simplest one, k = 1: $e_{t+1|t} \sim N(0, V_a)$,
or, for a vector of 1-period-ahead forecast errors,
$e_{t+1|t} \sim N(0, V_a I)$ …
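Because the forecast error is a moving average of independent innovations, its variance at horizon k is $V_a (1 + \psi_1^2 + \ldots + \psi_{k-1}^2)$. A minimal sketch follows; the function name and the random walk example are assumptions for illustration.

```python
def forecast_error_variance(psi, k, v_a=1.0):
    """Variance of e_{t+k|t} = a_{t+k} + psi_1 a_{t+k-1} + ... + psi_{k-1} a_{t+1}.

    `psi` lists the psi-weights starting with psi_0 = 1.
    """
    return v_a * sum(psi[j] ** 2 for j in range(k))


# Hypothetical random walk: psi_j = 1 for every j, with V_a = 1.
psi_rw = [1.0] * 24
variances = [forecast_error_variance(psi_rw, k) for k in range(1, 13)]
```

For the random walk the variance grows linearly with the horizon, from $V_a$ one period ahead to $12 V_a$ twelve periods ahead.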