This section presents the general state space model and how we apply it to the Vector Error Correction Model (VECM).
4.2.1 The general state space model
The general state space model used in this chapter is composed of two main equations, the measurement equation and the transition equation (4.5). We use the same notation as for the state space model introduced by Primiceri (2005).
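$$
\begin{aligned}
y_t &= H_t \beta_t + \varepsilon_t, && \text{Measurement equation} \\
\beta_t &= F \beta_{t-1} + u_t, && \text{Transition equation}
\end{aligned}
\tag{4.5}
$$
where
$$
\begin{pmatrix} \varepsilon_t \\ u_t \end{pmatrix} \sim N\left( \begin{pmatrix} 0 \\ 0 \end{pmatrix}, \begin{pmatrix} R_t & 0 \\ 0 & G \end{pmatrix} \right). \tag{4.6}
$$
Now let:
$$
\beta_{t|s} = E(\beta_t \mid Y^s, H^s, R^s, G), \qquad V_{t|s} = \mathrm{Var}(\beta_t \mid Y^s, H^s, R^s, G).
$$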
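Then, given $\beta_{0|0}$ and $V_{0|0}$, we use the following standard Kalman filter:
$$
\begin{aligned}
\beta_{t|t-1} &= F \beta_{t-1|t-1} \\
V_{t|t-1} &= F V_{t-1|t-1} F' + G \\
K_t &= V_{t|t-1} H_t' \left( H_t V_{t|t-1} H_t' + R_t \right)^{-1} \\
\beta_{t|t} &= \beta_{t|t-1} + K_t \left( y_t - H_t \beta_{t|t-1} \right) \\
V_{t|t} &= V_{t|t-1} - K_t H_t V_{t|t-1}
\end{aligned}
\tag{4.7}
$$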
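The backward recursion is defined as follows. We first simulate $\beta_T$ from its mean and variance given the information at time $T$: $\beta_{T|T}$ and $V_{T|T}$. Then recursively, for each $t$ from $T - 1$ to 1, we draw $\beta_t$ with mean $\beta_{t|t+1}$ and variance $V_{t|t+1}$, where:
$$
\begin{aligned}
\beta_{t|t+1} &= \beta_{t|t} + V_{t|t} F' V_{t+1|t}^{-1} \left( \beta_{t+1} - F \beta_{t|t} \right) \\
V_{t|t+1} &= V_{t|t} - V_{t|t} F' V_{t+1|t}^{-1} F V_{t|t}
\end{aligned}
\tag{4.8}
$$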
4.2.2 State space model of the Vector Error Correction Model
We can rewrite the VECM (4.2) as:
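$$
y_t = \Theta_t Z_t + u_t, \qquad u_t \sim N(0, \Sigma) \tag{4.9}
$$
where:
$$
\begin{aligned}
y_t &= \Delta x_t, && (\text{size } p \times 1) \\
\Theta_t &= (\Pi_t, \Psi_{1,t}, \dots, \Psi_{k-1,t}), && (\text{size } p \times pk) \\
Z_t &= (x_{t-1}', \Delta x_{t-1}', \dots, \Delta x_{t-k+1}')', && (\text{size } pk \times 1)
\end{aligned}
$$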
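To make the definition of $Z_t$ concrete, the following minimal Python/NumPy sketch builds $y_t$ and $Z_t$ from an array of levels $x_t$. The function name `build_vecm_regressors` and the array layout (one row per time point) are illustrative assumptions and do not come from the original text.

```python
import numpy as np

def build_vecm_regressors(x, k):
    """Build y_t = Δx_t and Z_t = (x_{t-1}', Δx_{t-1}', ..., Δx_{t-k+1}')'.

    x : array of shape (n_obs, p), row i holds x_i (assumed layout).
    k : lag order of the VECM, so Z_t has p*k entries.
    Returns y of shape (n, p) and Z of shape (n, p*k), one row per usable t.
    """
    x = np.asarray(x, dtype=float)
    n_obs, p = x.shape
    dx = np.diff(x, axis=0)  # dx[i] = x[i+1] - x[i], i.e. Δx_{i+1}
    y_rows, z_rows = [], []
    # Δx_{t-k+1} needs x_{t-k}, so the first usable time index is t = k.
    for t in range(k, n_obs):
        parts = [x[t - 1]]                              # lagged level x_{t-1}
        parts += [dx[t - 1 - j] for j in range(1, k)]   # Δx_{t-j}, j = 1..k-1
        y_rows.append(dx[t - 1])                        # Δx_t
        z_rows.append(np.concatenate(parts))
    return np.array(y_rows), np.array(z_rows)
```

With $p = 2$ and $k = 2$, each row of `Z` has $pk = 4$ entries: the two lagged levels followed by the two lagged differences.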
The VECM (4.9) is in fact the measurement equation of our state space model. Let us now define $\theta_t = \mathrm{Vec}(\Theta_t)$, a vector of length $b = p \times pk = p^2 k$.
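Then, from the measurement equation (4.9), and because $\mathrm{Vec}(AXB) = (B' \otimes A)\,\mathrm{Vec}(X)$, we have:
$$
\begin{aligned}
\mathrm{Vec}(y_t) = y_t &= \mathrm{Vec}(\Theta_t Z_t) + \mathrm{Vec}(u_t) \\
&= \mathrm{Vec}(\Theta_t Z_t) + u_t \\
&= (Z_t' \otimes I_p)\,\mathrm{Vec}(\Theta_t) + u_t \\
&= (Z_t' \otimes I_p)\,\theta_t + u_t
\end{aligned}
$$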
Hence, the measurement equation can be written as:
$$
y_t = (Z_t' \otimes I_p)\,\theta_t + u_t, \qquad u_t \sim N(0, \Sigma) \tag{4.11}
$$
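The vectorisation step can be checked numerically. The short NumPy sketch below verifies that $\Theta_t Z_t = (Z_t' \otimes I_p)\,\mathrm{Vec}(\Theta_t)$ on random inputs, assuming that $\mathrm{Vec}$ stacks the columns of its argument. The dimensions and the helper name `vec` are arbitrary choices made for illustration.

```python
import numpy as np

def vec(M):
    """Stack the columns of M into a single vector (column-major order)."""
    return M.reshape(-1, order="F")

rng = np.random.default_rng(0)
p, k = 3, 2
Theta = rng.normal(size=(p, p * k))   # plays the role of Θ_t, size p x pk
Z = rng.normal(size=(p * k, 1))       # plays the role of Z_t, size pk x 1

H = np.kron(Z.T, np.eye(p))           # (Z_t' ⊗ I_p), size p x p²k
lhs = (Theta @ Z).ravel()             # Θ_t Z_t
rhs = H @ vec(Theta)                  # (Z_t' ⊗ I_p) Vec(Θ_t)
identity_holds = np.allclose(lhs, rhs)
```

The matrix `H` plays the role of $H_t$ in the general model (4.5), which is why the general Kalman recursions carry over directly.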
We now need to specify the dynamics of the state equation, which in general requires a $b \times b$ parameter matrix $F$, as in the transition equation of the general state space model (4.5). In our model, we simplify the matrix $F$ to a scalar $\rho$, i.e. $F = \rho I_b$, on which we carry out Bayesian inference later in this chapter (see Section 4.3.2). This choice avoids lengthy computations in the Forward Filtering Backward Recursion algorithm.
The time-varying models in the literature assume a constant error variance in the transition equation, meaning that the expected evolution of the parameters $\theta_t$ is the same in all time periods. A constant error variance allows a gradual and smooth evolution of the parameters, which is generally the case in practice. We denote by $Q$ the covariance matrix of the errors in the transition equation. The $b \times 1$ vector $\theta_t$ therefore has the following dynamics:
$$
\theta_t = \rho\,\theta_{t-1} + \nu_t, \qquad \nu_t \sim N(0, Q) \tag{4.12}
$$
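As an illustration of these dynamics, the following sketch simulates a path of $\theta_t$ from (4.12) for given $\rho$ and $Q$. The function name `simulate_theta` and its arguments are hypothetical and are introduced only for this example.

```python
import numpy as np

def simulate_theta(theta0, rho, Q, T, rng):
    """Simulate θ_0, ..., θ_T from θ_t = ρ θ_{t-1} + ν_t with ν_t ~ N(0, Q)."""
    b = theta0.shape[0]
    path = np.empty((T + 1, b))
    path[0] = theta0
    for t in range(1, T + 1):
        nu = rng.multivariate_normal(np.zeros(b), Q)  # transition shock ν_t
        path[t] = rho * path[t - 1] + nu
    return path
```

A small $Q$ relative to the scale of $\theta_0$ yields the gradual, smooth parameter paths described in the text, whereas $Q = 0$ reduces the path to the deterministic sequence $\rho^t \theta_0$.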
Finally, our state space model is summarised as follows:
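$$
\begin{aligned}
y_t &= (Z_t' \otimes I_p)\,\theta_t + u_t, & u_t &\sim N(0, \Sigma) && \text{Measurement equation} \\
\theta_t &= \rho\,\theta_{t-1} + \nu_t, & \nu_t &\sim N(0, Q) && \text{Transition equation}
\end{aligned}
\tag{4.13}
$$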
In the following sections, $D_t$ denotes the information brought by the data at time $t$, i.e. the information brought by $y_t$ and $y_{t-1}$. Furthermore, $D_{1:t}$ denotes all the information brought by the data from time 1 to time $t$, that is, the information brought by $y_0, y_1, \dots, y_t$.
4.2.3 Forward Filtering and Backward Recursion of the Vector Error Correction Model
This section describes the Kalman filter applied to the Vector Error Correction Model. In this chapter, we use a two-filter smoothing algorithm to calculate the posterior distribution of the parameter vector $\theta_t$ (conditional on the observations). The first algorithm goes forward in time from $t = 1$ to $t = T$, whereas the second goes backward in time from $t = T$ to $t = 1$, hence the name Forward Filtering and Backward Recursion algorithm (FFBS). This algorithm is also referred to in the literature as the forward-backward smoothing algorithm, and Gibbs sampling in state space models relies on it. In this methodology, we must use the posterior distributions of the backward recursion because the parameters $\theta_t$ are simulated given all the data: $\theta_t \mid D_{0:T}$. The posterior distributions of the forward filtering part would depend only on the data up to time $t$: $\theta_t \mid D_{0:t}$; see Carter and Kohn (1994) and Frühwirth-Schnatter (1994).
The first part consists in computing the expectation and the variance of our parameter $\theta_t$ given the information at time $t$: $\theta_{t|t}$ and $P_{t|t}$. This first part is called the Forward Filtering algorithm (4.14). Applying the Kalman filter (4.7) of Section 4.2.1 to our state space model (4.13), and given $\theta_{0|0}$ and $P_{0|0}$, we have:
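$$
\begin{aligned}
\theta_{t|t-1} &= \rho\,\theta_{t-1|t-1} \\
P_{t|t-1} &= \rho^2 P_{t-1|t-1} + Q \\
K_t &= P_{t|t-1} (Z_t \otimes I_p) \left( (Z_t' \otimes I_p) P_{t|t-1} (Z_t \otimes I_p) + \Sigma \right)^{-1} \\
\theta_{t|t} &= \theta_{t|t-1} + K_t \left( y_t - (Z_t' \otimes I_p)\,\theta_{t|t-1} \right) \\
P_{t|t} &= P_{t|t-1} - K_t (Z_t' \otimes I_p) P_{t|t-1}
\end{aligned}
\tag{4.14}
$$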
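The forward pass can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions rather than the implementation used in this chapter: the function name `forward_filter`, the array layout (row `t` of `y` and `Z` holds $y_t'$ and $Z_t'$) and the explicit state update $\theta_{t|t} = \theta_{t|t-1} + K_t (y_t - (Z_t' \otimes I_p)\theta_{t|t-1})$, taken from the general filter (4.7), are choices made here.

```python
import numpy as np

def forward_filter(y, Z, rho, Q, Sigma, theta0, P0):
    """Kalman forward filter for y_t = (Z_t' ⊗ I_p) θ_t + u_t, θ_t = ρ θ_{t-1} + ν_t.

    y: (T, p), Z: (T, p*k), Q and P0: (b, b), Sigma: (p, p), theta0: (b,).
    Returns the filtered means θ_{t|t} (T, b) and variances P_{t|t} (T, b, b).
    """
    T, p = y.shape
    b = theta0.shape[0]
    I_p = np.eye(p)
    theta_filt = np.empty((T, b))
    P_filt = np.empty((T, b, b))
    theta_prev, P_prev = theta0, P0
    for t in range(T):
        H = np.kron(Z[t][None, :], I_p)           # (Z_t' ⊗ I_p), shape (p, b)
        theta_pred = rho * theta_prev             # θ_{t|t-1}
        P_pred = rho**2 * P_prev + Q              # P_{t|t-1}
        S = H @ P_pred @ H.T + Sigma              # innovation variance
        # Gain K_t = P_{t|t-1} H' S^{-1}, obtained by solving S' K' = (P H')'.
        K = np.linalg.solve(S.T, (P_pred @ H.T).T).T
        theta_prev = theta_pred + K @ (y[t] - H @ theta_pred)  # θ_{t|t}
        P_prev = P_pred - K @ H @ P_pred                       # P_{t|t}
        theta_filt[t], P_filt[t] = theta_prev, P_prev
    return theta_filt, P_filt
```

Because $F = \rho I_b$, the prediction step only rescales the previous moments, so no $b \times b$ matrix product with $F$ is needed.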
The next step in the estimation of the time-varying parameters is the backward recursion. This step is the smoothing part, in which we compute the expectation and the variance of our parameter $\theta_t$ given the information at time $t + 1$, for all $t \in \{1, \dots, T - 1\}$. We use the algorithm described by Primiceri (2005) for the VAR process, applied here to the Vector Error Correction Model. After the forward filtering steps, we first simulate $\theta_T$ from its mean and variance given the information at time $T$: $\theta_{T|T}$ and $P_{T|T}$. Then recursively, for each time $t$ from $T - 1$ to 1 (backward recursion), we draw $\theta_t$ from a multivariate normal distribution with mean $\theta_{t|t+1}$ and variance $P_{t|t+1}$, where:
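$$
\begin{aligned}
\theta_{t|t+1} &= \theta_{t|t} + \rho P_{t|t} P_{t+1|t}^{-1} \left( \theta_{t+1} - \rho\,\theta_{t|t} \right) \\
P_{t|t+1} &= P_{t|t} - \rho^2 P_{t|t} P_{t+1|t}^{-1} P_{t|t}
\end{aligned}
\tag{4.15}
$$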
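The backward recursion (4.15) can be sketched as follows, taking the filtered moments $\theta_{t|t}$ and $P_{t|t}$ as inputs. The function name `backward_sample` and the array layout are assumptions made for this example; $P_{t+1|t}$ is recomputed as $\rho^2 P_{t|t} + Q$, as in the forward filter (4.14).

```python
import numpy as np

def backward_sample(theta_filt, P_filt, rho, Q, rng):
    """Draw θ_1, ..., θ_T jointly, given filtered means (T, b) and variances (T, b, b)."""
    T, b = theta_filt.shape
    draws = np.empty((T, b))
    # Start from the last filtered distribution: θ_T ~ N(θ_{T|T}, P_{T|T}).
    draws[-1] = rng.multivariate_normal(theta_filt[-1], P_filt[-1])
    for t in range(T - 2, -1, -1):
        P_pred_next = rho**2 * P_filt[t] + Q                  # P_{t+1|t}
        J = rho * P_filt[t] @ np.linalg.inv(P_pred_next)      # ρ P_{t|t} P_{t+1|t}^{-1}
        mean = theta_filt[t] + J @ (draws[t + 1] - rho * theta_filt[t])
        cov = P_filt[t] - rho * J @ P_filt[t]                 # P_{t|t+1}
        cov = (cov + cov.T) / 2                               # remove rounding asymmetry
        draws[t] = rng.multivariate_normal(mean, cov)
    return draws
```

Each call returns one draw of the whole path, which is the quantity needed within one sweep of a Gibbs sampler.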
4.2.4 Bayesian inference on the covariance matrix Σ
For any time $t$, we assume that $\Sigma$ has an Inverse-Wishart prior distribution with two hyperparameters $A$ and $q$, as in the previous chapter (see Section 3.3.2):
$$
\Sigma \sim IW(A, q) \tag{4.16}
$$
Let us now write the likelihood of the measurement equation given the data. This likelihood takes into account all the information from time 1 to time $T$. It is therefore proportional to the product, from time 1 to time $T$, of the probability density functions of the errors $u_t$:
$$
L(\Sigma; D_{1:T}, \theta_{0:T}) \propto \prod_{t=1}^{T} f(u_t \mid D_{1:t}, \theta_t) \propto \prod_{t=1}^{T} |\Sigma|^{-1/2} \exp\left( -\frac{1}{2} \mathrm{Tr}\left( \Sigma^{-1} u_t u_t' \right) \right)
$$
giving:
$$
L(\Sigma; D_{1:T}, \theta_{0:T}) \propto |\Sigma|^{-T/2} \exp\left( -\frac{1}{2} \mathrm{Tr}\left( \Sigma^{-1} \sum_{t=1}^{T} u_t u_t' \right) \right) \tag{4.17}
$$
where $D_{1:T}$ represents all the information given by the data $y_t$ from time 1 to $T$, and $\theta_{0:T}$ represents all the information given by $\theta_t = \mathrm{Vec}(\Pi_t, \Psi_t)$ from time 0 to $T$.
After that, it is straightforward to derive the expression of the posterior Inverse-Wishart distribution of Σ, conditional on all the parameters θ0:T and all the data D1:T:
$$
\Sigma \mid D_{1:T}, \theta_{0:T} \sim IW\left( A + \sum_{t=1}^{T} u_t u_t', \; T + q \right) \tag{4.18}
$$
where $u_t = y_t - (Z_t' \otimes I_p)\,\theta_t$.
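The posterior update (4.18) amounts to adding the residual cross-products to the prior scale matrix and $T$ to the prior degrees of freedom. The sketch below computes these two quantities and draws $\Sigma$ from the result. It rests on two assumptions that are not stated in the text: $IW(A, q)$ is taken in the usual parametrisation, in which $\Sigma^{-1}$ follows a Wishart distribution with scale $A^{-1}$ and $q$ degrees of freedom, and $T + q$ is an integer no smaller than $p$, so that a Wishart draw can be built from normal vectors. The function names are hypothetical.

```python
import numpy as np

def sigma_posterior_params(y, Z, theta, A, q):
    """Return the posterior scale A + Σ_t u_t u_t' and degrees of freedom T + q."""
    T, p = y.shape
    I_p = np.eye(p)
    S = np.zeros((p, p))
    for t in range(T):
        H = np.kron(Z[t][None, :], I_p)   # (Z_t' ⊗ I_p)
        u = y[t] - H @ theta[t]           # residual u_t
        S += np.outer(u, u)
    return A + S, T + q

def draw_inverse_wishart(scale, df, rng):
    """Draw Σ ~ IW(scale, df) for integer df >= p (assumed parametrisation).

    If W ~ Wishart(scale^{-1}, df), then W^{-1} ~ IW(scale, df); W is built
    as the sum of df outer products of N(0, scale^{-1}) vectors.
    """
    p = scale.shape[0]
    X = rng.multivariate_normal(np.zeros(p), np.linalg.inv(scale), size=df)
    W = X.T @ X
    return np.linalg.inv(W)
```

Within a Gibbs sampler, `theta` would be the path drawn by the backward recursion in the current sweep.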