Chapter 6:
Multivariate Cointegration Analysis
Contents:
VI. Multivariate Cointegration Analysis - Johansen Test
VI.1 The Simplest Case: p = 1, VAR(1)
VI.2 VAR(p)-Model
VI.3 Model Specification
VI.4 Testing the Rank of Cointegration - An Example
VI. Multivariate Cointegration Analysis - Johansen Test
VI.1 The Simplest Case: p = 1, VAR(1)
For example, let Y be a three-dimensional vector consisting of the three-month interest rates for the US dollar, the euro and the yen. Among these three I(1) variables we can find up to two cointegrating relations, due to interest rate parity and stationary expected changes in the exchange rates.
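$$\begin{pmatrix} Z_1 \\ Z_2 \end{pmatrix} = \begin{pmatrix} 1 & 0 & -1 \\ 0 & 1 & -1 \end{pmatrix} \begin{pmatrix} Y_1 \\ Y_2 \\ Y_3 \end{pmatrix}$$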
As we have seen before, we have a VAR(1) model for the M I(1) variables in levels. In this simple case, we can write:
Yt = µ + ΓYt-1 + εt
where: Y, µ and ε are (Mx1) vectors and Γ is an (MxM) matrix.
Subtracting the lagged vector Yt-1 from both sides of the equation, we obtain the following relation:
Yt - Yt-1 = µ + ΓYt-1 - Yt-1 + εt or
∆Yt = µ + (Γ - I)Yt-1 + εt
In this equation we have an I(0) vector on the left-hand side. On the right-hand side there is a vector of constants as well as another I(0) vector, ε. Thus, the term (Γ - I)Yt-1 must also be I(0). If the variables are not cointegrated, then the matrix Γ must be the identity matrix I. On the other hand, if there exist r cointegrating relations (Z is an (rx1) vector), this term can be written as an I(0) variable:
(Γ - I)Yt-1 = λγ’Yt-1 = λZt-1
where γ’ is the (rxM) matrix of the cointegration coefficients and λ is an (Mxr) matrix.
Multiplying λ by the cointegration matrix γ’ yields the (MxM) matrix (Γ - I). The term λZt-1 is I(0), and λ can be interpreted as the (Mxr) matrix of error correction coefficients:
∆Yt = µ + λZt-1 + εt
This model is a generalization of the ECM in the previous section. In the case of a VAR(1) model, no lagged differences appear in the error correction model. If the initial model is a VAR(p) model, then the error correction representation additionally contains (p-1) lagged difference terms.
Since the matrix (Γ - I) can be represented as the product of an (Mxr) and an (rxM) matrix, each of full rank r, it has rank r. This means that the number of cointegrating relations is determined by the rank of the matrix.
In the limiting case r = 0, i.e. Γ = I, the model reduces to a VAR model in differences (M random walks that are not cointegrated). If r equals M, all M variables are stationary in levels, I(0).
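The link between the rank of (Γ - I) and the number of cointegrating relations can be checked numerically. The following is a minimal sketch, assuming NumPy and purely hypothetical values for λ and γ’ (M = 3, r = 2, with the two cointegrating relations taken to be spreads against the third variable); none of the numbers come from the text.

```python
import numpy as np

# Hypothetical error correction coefficients lambda (M x r); illustrative only.
lam = np.array([[-0.2, 0.0],
                [0.0, -0.3],
                [0.1, 0.1]])

# Hypothetical cointegration matrix gamma' (r x M): two spreads against Y3.
gamma_t = np.array([[1.0, 0.0, -1.0],
                    [0.0, 1.0, -1.0]])

M = lam.shape[0]

# Build Gamma so that Gamma - I = lambda * gamma'.
Gamma = np.eye(M) + lam @ gamma_t

# The rank of (Gamma - I) equals the number of cointegrating relations r.
rank = np.linalg.matrix_rank(Gamma - np.eye(M))

# Gamma keeps M - r eigenvalues equal to one (the remaining stochastic trend).
eig_mod = np.sort(np.abs(np.linalg.eigvals(Gamma)))

print("rank of (Gamma - I):", rank)
print("moduli of the eigenvalues of Gamma:", eig_mod)
```

With these assumed values the rank is 2 and exactly one eigenvalue of Γ has modulus one, so the three series share a single stochastic trend.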
The approach of Johansen is based on maximum likelihood estimation of the matrix (Γ - I) under the assumption of normally distributed error terms. Following the estimation, the hypotheses r = 0, r = 1, …, r = M-1 are tested using likelihood ratio (LR) tests.
In the formulation of a VAR(p) model we obtain the equation:
$$\Delta y_t = A_0 + \Pi y_{t-1} + \sum_{i=1}^{p-1} \Gamma_i \Delta y_{t-i} + B x_t + \varepsilon_t$$
Since y is I(1), all terms in this equation other than Πyt-1 are stationary, so Πyt-1 must be stationary as well. Furthermore, every cointegrating relation has to appear in Π, and the number of cointegrating relations is given by the rank of Π.
Π can be decomposed as Π = αβ’, where the elements of the α matrix are the adjustment coefficients and the β matrix contains the cointegrating vectors. As the interest lies in α and β, the system should be reduced to one containing only them.
To do that, one regresses ∆yt on ∆yt-1, …, ∆yt-(p-1) and then yt-1 on the same variables. The residuals of the two regressions are denoted R0t and R1t, respectively. The regression equation is thereby reduced to
R0t = αβ’R1t + et
This is a multivariate regression problem, with
$$S = \begin{pmatrix} S_{00} & S_{01} \\ S_{10} & S_{11} \end{pmatrix}$$
S is the matrix of sums of squares and sums of products of R0t and R1t.
Johansen (1991) shows that the asymptotic variance of β’R1t is β’Σ11β, the asymptotic variance of R0t is Σ00 and the asymptotic covariance matrix of β’R1t and R0t is β’Σ10, where Σ00, Σ10 and Σ11 are the population counterparts of S00, S10 and S11.
The procedure is to maximize the likelihood function first with respect to α holding β constant and then maximize with respect to β. For α the result is:
$$\alpha' = (\beta' S_{11} \beta)^{-1} \beta' S_{10}$$
The likelihood function maximized over α for a given β satisfies
$$(L(\beta))^{-2/T} = \left| S_{00} - S_{01} \beta (\beta' S_{11} \beta)^{-1} \beta' S_{10} \right|$$
So maximization of the likelihood function with respect to β means minimization of this determinant.
After further manipulation, this is equivalent to finding the characteristic roots of the equation:
$$\left| S_{11}^{-1} S_{10} S_{00}^{-1} S_{01} - \lambda I \right| = 0$$
The roots of this equation are the squared canonical correlations between R0t and R1t. This means that those linear combinations of Yt-1 are selected that are most highly correlated with linear combinations of ∆Yt, after conditioning on the lagged differences ∆Yt-1, …, ∆Yt-(p-1).
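The steps described so far (auxiliary regressions, product moment matrices, eigenvalue problem) can be sketched in a few lines. This is an illustrative sketch under stated assumptions, not a full implementation of the Johansen procedure: it uses NumPy, a simulated three-variable system with one common random walk (so two cointegrating relations), a VAR with p = 2 and an unrestricted constant, and the product moments are divided by the number of observations, which does not change the eigenvalues. All variable names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
T, M = 500, 3

# Simulated data (assumption): one common random walk plus stationary noise,
# so any spread between two series is stationary and r = 2.
trend = np.cumsum(rng.normal(size=T))
Y = trend[:, None] + rng.normal(size=(T, M))

dY = np.diff(Y, axis=0)                 # dY[k] = Y[k+1] - Y[k]
Z0 = dY[1:]                             # Delta Y_t
Z1 = Y[1:-1]                            # Y_{t-1}
Z2 = np.column_stack([np.ones(len(Z0)), dY[:-1]])  # constant and Delta Y_{t-1}

def residuals(target, regressors):
    """Residuals of an OLS regression of each target column on the regressors."""
    coef, *_ = np.linalg.lstsq(regressors, target, rcond=None)
    return target - regressors @ coef

R0 = residuals(Z0, Z2)                  # R0t
R1 = residuals(Z1, Z2)                  # R1t

n_obs = len(R0)
S00 = R0.T @ R0 / n_obs
S01 = R0.T @ R1 / n_obs
S10 = S01.T
S11 = R1.T @ R1 / n_obs

# Characteristic roots of S11^{-1} S10 S00^{-1} S01.
A = np.linalg.solve(S11, S10 @ np.linalg.solve(S00, S01))
eigvals, eigvecs = np.linalg.eig(A)
order = np.argsort(eigvals.real)[::-1]
lam_hat = eigvals.real[order]           # eigenvalues, largest first
beta_hat = eigvecs.real[:, order]       # columns: candidate cointegrating vectors

# Adjustment coefficients for an assumed rank r = 2:
# alpha = S01 beta (beta' S11 beta)^{-1}
r = 2
beta_r = beta_hat[:, :r]
alpha_hat = S01 @ beta_r @ np.linalg.inv(beta_r.T @ S11 @ beta_r)

print("eigenvalues:", lam_hat)
```

In this simulated setting the first two eigenvalues should be clearly away from zero while the third is close to zero, which is the pattern the rank tests formalize.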
Denoting the estimated characteristic roots by λˆi, the maximized likelihood function (under the assumption of normally distributed error terms) satisfies:
$$L_{\max}^{-2/T} = |S_{00}| \prod_{i=1}^{n} (1 - \hat{\lambda}_i)$$
Therefore, the estimation problem is a canonical correlation analysis of the current differences ∆Yt and the lagged levels Yt-1, after conditioning on the lagged differences.
The trace statistic is
$$\lambda_{trace} = -T \sum_{i=r+1}^{n} \ln(1 - \hat{\lambda}_i)$$
where λˆr+1, …, λˆn are the n - r smallest characteristic roots. If the statistic is larger than the critical value, the null hypothesis of at most r cointegrating vectors is rejected.
The maximum eigenvalue statistic is
λmax = -T ln(1 - λˆr+1)
If the statistic is larger than the critical value, the null hypothesis of exactly r cointegrating vectors is rejected in favor of r + 1. The critical values for both tests are derived from the trace and maximum eigenvalue of the stochastic matrix and depend on whether the VAR model includes a constant or a trend (linear or quadratic). Because the variables are I(1) rather than stationary, the test statistics do not follow a χ2 distribution; they follow a nonstandard distribution that is tabulated by Johansen and Juselius.
VI.2 VAR(p)-Model
Consider a VAR of order p with M I(1) variables in levels:
yt = A0 + A1yt-1 + A2yt-2 + … + Apyt-p + Bxt + εt
∆yt = A0 + (A1 - I)yt-1 + A2yt-2 + A3yt-3 + … + Apyt-p + Bxt + εt
∆yt = A0 + (A1 - I)yt-1 - (A1 - I)yt-2 + (A1 - I)yt-2 + A2yt-2 + A3yt-3 + … + Apyt-p + Bxt + εt
∆yt = A0 + (A1 - I)∆yt-1 + (A2 + A1 - I)yt-2 + A3yt-3 + … + Apyt-p + Bxt + εt
∆yt = A0 + (A1 - I)∆yt-1 + (A2 + A1 - I)yt-2 - (A2 + A1 - I)yt-3 + (A2 + A1 - I)yt-3 + A3yt-3 + … + Apyt-p + Bxt + εt
∆yt = A0 + (A1 - I)∆yt-1 + (A2 + A1 - I)∆yt-2 + (A3 + A2 + A1 - I)yt-3 + … + Apyt-p + Bxt + εt
∆yt = A0 + Γ1∆yt-1 + Γ2∆yt-2 + … + Γp-1∆yt-(p-1) + Γpyt-p + Bxt + εt
with: Γi = Ai + Ai-1 + … + A1 - I, I = identity matrix, where: yt-p is I(1) and Γpyt-p is I(0)
Γp calculates stationary linear combinations of the non-stationary y and the rows of Γp are the cointegrating vectors for the elements of y.
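zp := Γpyt-p is I(0). Alternatively, we may rewrite the VAR as
$$\Delta y_t = A_0 + \Pi y_{t-1} + \sum_{i=1}^{p-1} \Gamma_i \Delta y_{t-i} + B x_t + \varepsilon_t$$
with
$$\Pi = \sum_{i=1}^{p} A_i - I \quad \text{and} \quad \Gamma_i = -\sum_{j=i+1}^{p} A_j$$
where yt is a k-vector of non-stationary I(1) variables, xt is a d-vector of deterministic variables, and εt is a vector of innovations. Note that the Γi in this representation, in which the level term is dated t-1, differ from the Γi in the derivation above, in which the level term is dated t-p.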
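The rewriting of the VAR(p) in levels into the error correction form with Π = A1 + … + Ap - I and Γi = -(Ai+1 + … + Ap) is an algebraic identity, so it can be verified on arbitrary numbers. The sketch below assumes NumPy, hypothetical random coefficient matrices with k = 2 and p = 3, and omits the deterministic term Bxt for brevity.

```python
import numpy as np

rng = np.random.default_rng(1)
k, p = 2, 3

# Hypothetical VAR coefficients: A[0] is A_1, ..., A[p-1] is A_p.
A = [rng.normal(scale=0.3, size=(k, k)) for _ in range(p)]
A0 = rng.normal(size=k)

# Arbitrary lagged values: y_lags[0] is y_{t-1}, ..., y_lags[p-1] is y_{t-p}.
y_lags = [rng.normal(size=k) for _ in range(p)]
eps = rng.normal(size=k)

# VAR(p) in levels (the term B x_t is left out in this sketch).
y_t = A0 + sum(A[i] @ y_lags[i] for i in range(p)) + eps

# Error correction form: Pi = sum_i A_i - I, Gamma_i = -sum_{j=i+1}^{p} A_j.
Pi = sum(A) - np.eye(k)
Gam = [-sum(A[j] for j in range(i, p)) for i in range(1, p)]  # Gamma_1..Gamma_{p-1}

# Delta y_{t-i} = y_{t-i} - y_{t-i-1} for i = 1, ..., p-1.
d_lags = [y_lags[i - 1] - y_lags[i] for i in range(1, p)]

lhs = y_t - y_lags[0]                                  # Delta y_t
rhs = A0 + Pi @ y_lags[0] + sum(G @ d for G, d in zip(Gam, d_lags)) + eps

print(np.allclose(lhs, rhs))
```

Both sides agree up to floating-point error for any choice of coefficients, which confirms that the two representations describe the same model.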
VI.3 Model Specification
EViews implements the following five cases considered by Johansen (1995):
1. The level data yt have no deterministic trends and the cointegrating equations do not have intercepts:
H(r): Πyt-1 + Bxt = αβ’yt-1
2. The level data yt have no deterministic trends and the cointegrating equations have intercepts:
H(r): Πyt-1 + Bxt = α(β’yt-1 + ρ0)
3. The level data yt have linear trends but the cointegrating equations have only intercepts:
H(r): Πyt-1 + Bxt = α(β’yt-1 + ρ0) + α┴γ0
4. The level data yt and the cointegrating equations have linear trends:
H(r): Πyt-1 + Bxt = α(β’yt-1 + ρ0 + ρ1t) + α┴γ0
5. The level data yt have quadratic trends and the cointegrating equations have linear trends:
H(r): Πyt-1 + Bxt = α(β’yt-1 + ρ0 + ρ1t) + α┴(γ0 + γ1t)
The terms associated with α┴ are the deterministic terms “outside” the cointegrating relations. When a deterministic term appears both inside and outside the cointegrating relation, the decomposition is not uniquely identified. Johansen (1995) identifies the part that belongs inside the error correction term by orthogonally projecting the exogenous terms onto the α space, so that α┴ is the null space of α, i.e. α’α┴ = 0. EViews uses a different identification method, chosen so that the error correction term has a sample mean of zero. More specifically, EViews identifies the part inside the error correction term by regressing the cointegrating relations β’yt on a constant (and a linear trend).
VI.4 Testing the Rank of Cointegration - An Example
a) The Choice of the optimal Lag Length
Lag   LogL       LR          FPE         AIC          SC           HQ
0     354.2837   NA          6.74e-06    -3.394046    -3.345746    -3.374514
1     2472.603   4154.772    9.50e-15    -23.77395    -23.58075    -23.69582
2     2659.508   361.1675    1.70e-15    -25.49283    -25.15473*   -25.35610
3     2678.005   35.20814    1.55e-15    -25.58459    -25.10159    -25.38927
4     2701.939   44.86089    1.35e-15    -25.72888    -25.10097    -25.47496
5     2717.762   29.20072    1.26e-15    -25.79480    -25.02200    -25.48229*
6     2727.733   18.11203*   1.25e-15*   -25.80419*   -24.88648    -25.43308
7     2734.648   12.35907    1.28e-15    -25.78404    -24.72143    -25.35433
8     2740.000   9.411987    1.32e-15    -25.74880    -24.54129    -25.26049
9     2746.710   11.60371    1.35e-15    -25.72666    -24.37426    -25.17976
10    2753.414   11.39994    1.39e-15    -25.70448    -24.20717    -25.09898
* indicates lag order selected by the criterion
LR: sequential modified LR test statistic (each test at 5% level)
FPE: Final prediction error
AIC: Akaike information criterion
SC: Schwarz information criterion
HQ: Hannan-Quinn information criterion
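For the information criteria, the asterisk marks the lag with the smallest criterion value. As a small sketch (assuming NumPy; the values are copied from the AIC, SC and HQ columns of the output above), the selected lag orders can be reproduced as follows:

```python
import numpy as np

lags = np.arange(11)

# Criterion values copied from the lag length selection output (lags 0..10).
criteria = {
    "AIC": np.array([-3.394046, -23.77395, -25.49283, -25.58459, -25.72888,
                     -25.79480, -25.80419, -25.78404, -25.74880, -25.72666,
                     -25.70448]),
    "SC": np.array([-3.345746, -23.58075, -25.15473, -25.10159, -25.10097,
                    -25.02200, -24.88648, -24.72143, -24.54129, -24.37426,
                    -24.20717]),
    "HQ": np.array([-3.374514, -23.69582, -25.35610, -25.38927, -25.47496,
                    -25.48229, -25.43308, -25.35433, -25.26049, -25.17976,
                    -25.09898]),
}

# Each criterion selects the lag order at which it is minimized.
selected = {name: int(lags[np.argmin(values)]) for name, values in criteria.items()}
print(selected)
```

The criteria disagree (AIC picks 6 lags, SC picks 2, HQ picks 5), which is common: SC penalizes additional parameters more heavily than AIC and therefore tends to choose shorter lag lengths.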
b) Trace statistics
Unrestricted Cointegration Rank Test (Trace)
Hypothesized                  Trace        0.05
No. of CE(s)    Eigenvalue    Statistic    Critical Value    Prob.**
None *          0.142281      48.75529     29.79707          0.0001
At most 1 *     0.071604      15.91097     15.49471          0.0433
At most 2       5.30E-05      0.011335     3.841466          0.9150
Trace test indicates 2 cointegrating eqn(s) at the 0.05 level
* denotes rejection of the hypothesis at the 0.05 level
**MacKinnon-Haug-Michelis (1999) p-values
This portion of the output tells you whether there is cointegration and how many cointegrating vectors there are. Here the trace test rejects r = 0 and r ≤ 1 but cannot reject the null of at most two cointegrating vectors. We saw in class the differences between the trace and maximum eigenvalue tests. The latter can be computed from the column of eigenvalues provided.
The trace statistic reported in the first block tests the null hypothesis of at most r cointegrating relations against the alternative of k cointegrating relations, where k is the number of endogenous variables. We can see from the second column that the first two eigenvalues are much larger than the last eigenvalue, which lies near zero. This suggests that there exist two cointegrating relations. The null hypotheses r = 0 and r ≤ 1 can both be rejected at the 5% level: the test value of 48.76 exceeds the critical value of 29.80, and the second test value of 15.91 exceeds the critical value of 15.49, although only narrowly (p = 0.0433).
c) Maximum eigenvalues statistics
Unrestricted Cointegration Rank Test (Maximum Eigenvalue)
Hypothesized                  Max-Eigen    0.05
No. of CE(s)    Eigenvalue    Statistic    Critical Value    Prob.**
None *          0.142281      32.84433     21.13162          0.0007
At most 1 *     0.071604      15.89963     14.26460          0.0273
At most 2       5.30E-05      0.011335     3.841466          0.9150
Max-eigenvalue test indicates 2 cointegrating eqn(s) at the 0.05 level
* denotes rejection of the hypothesis at the 0.05 level
**MacKinnon-Haug-Michelis (1999) p-values
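Both test statistics can be recomputed from the reported eigenvalues using λmax = -T ln(1 - λˆr+1) and the trace statistic as the sum of the λmax values from r+1 onwards. The sketch below assumes NumPy and an effective sample size of T = 214; this value is not stated in the output and is inferred here only because it reproduces the reported statistics. The sequential decision rule in `selected_rank` is a hypothetical helper written for this illustration.

```python
import numpy as np

# Eigenvalues and 5% critical values as reported in the output above.
eigenvalues = np.array([0.142281, 0.071604, 5.30e-05])
trace_crit = np.array([29.79707, 15.49471, 3.841466])
max_eig_crit = np.array([21.13162, 14.26460, 3.841466])

T = 214  # assumption: inferred effective sample size, not given in the output

# Maximum eigenvalue statistics for r = 0, 1, 2.
max_eig_stat = -T * np.log(1.0 - eigenvalues)

# Trace statistics: sum of the max-eigenvalue statistics from position r onwards.
trace_stat = np.cumsum(max_eig_stat[::-1])[::-1]

def selected_rank(stats, crits):
    """Test r = 0, 1, ... in sequence; stop at the first null that is not rejected."""
    rank = 0
    for stat, crit in zip(stats, crits):
        if stat <= crit:
            break
        rank += 1
    return rank

print("max-eigenvalue statistics:", max_eig_stat)
print("trace statistics:", trace_stat)
print("rank (trace):", selected_rank(trace_stat, trace_crit))
print("rank (max-eigenvalue):", selected_rank(max_eig_stat, max_eig_crit))
```

Under the assumed T, the recomputed values match the reported statistics up to rounding of the eigenvalues, and both tests select two cointegrating relations.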