Multivariate GARCH models

CIDE-Bertinoro-Courses for PhD students - June 2015

Malvina Marchese

Università degli Studi di Genova

Bauwens, L., Laurent, S., Rombouts, J.V.K. (2006), Multivariate GARCH models, Journal of Applied Econometrics, 21, 79-109.

Silvennoinen, A., Teräsvirta, T. (2008), Multivariate GARCH models, in T. G. Andersen, R. A. Davis, J.-P. Kreiss and T. Mikosch, eds., Handbook of Financial Time Series, New York, Springer.

Multivariate volatility modelling

Portfolio returns

Assume we hold a portfolio of $n$ assets. The portfolio return is given by

$$r_t^{(p)} = \sum_{i=1}^{n} w_{t,i}\, r_{t,i}$$

where $r_{t,i}$ is the (close-to-close) return on the $i$-th portfolio asset and $w_{t,i}$ is the associated portfolio weight. The $w_{t,i}$ are such that

1. $w_{t,i} \geq 0$

2. $\sum_{i=1}^{n} w_{t,i} = 1$

The first assumption can be relaxed to allow for negative weights (related to short-selling operations).

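Portfolio volatility (1)

From standard properties of the variance, it follows that portfolio volatility is given by

$$\sigma^2_{p,t} = \sum_{i=1}^{n} w_{t,i}^2\, \sigma_{t,ii} + \sum_{i \neq j} w_{t,i} w_{t,j}\, \sigma_{t,ij} = \sum_{i=1}^{n} w_{t,i}^2\, \sigma_{t,ii} + \sum_{i \neq j} w_{t,i} w_{t,j} \sqrt{\sigma_{t,ii}\sigma_{t,jj}}\, \rho_{ij,t}$$

where

$$\sigma_{t,ij} = \mathrm{cov}(r_{t,i}, r_{t,j} \mid I_{t-1}) \quad \text{for } i, j = 1, \dots, n$$

and

$$\rho_{ij,t} = \frac{\sigma_{t,ij}}{\sqrt{\sigma_{t,ii}\sigma_{t,jj}}} = \mathrm{corr}(r_{t,i}, r_{t,j} \mid I_{t-1})$$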

Portfolio volatility (2)

In matrix terms, portfolio volatility can be written as

$$\sigma^2_{p,t} = w_t' \Sigma_t w_t$$

where $w_t = (w_{t,1}, \dots, w_{t,n})'$ and $\Sigma_t = \mathrm{var}(r_t \mid I_{t-1})$ is the conditional variance-covariance matrix of returns $r_t = (r_{t,1}, \dots, r_{t,n})'$.

In terms of correlations, $\sigma^2_{p,t}$ can be equivalently expressed as

$$\sigma^2_{p,t} = w_t' D_t D_t^{-1} \Sigma_t D_t^{-1} D_t w_t = w_t' D_t R_t D_t w_t$$

where $D_t$ is an $(n \times n)$ diagonal matrix whose $i$-th diagonal element is given by $\sqrt{\sigma_{t,ii}}$ and $R_t$ is the conditional correlation matrix of $r_t$.
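As a quick numerical illustration, the sketch below checks that the scalar, covariance-matrix and correlation-matrix expressions for the portfolio variance agree. The weights, standard deviations and correlations are hypothetical values chosen only for the example.

```python
import numpy as np

# Hypothetical inputs: three assets, long-only weights summing to one.
w = np.array([0.5, 0.3, 0.2])
sd = np.array([0.01, 0.02, 0.015])       # conditional standard deviations sqrt(sigma_ii)
R = np.array([[1.0, 0.3, 0.1],
              [0.3, 1.0, 0.5],
              [0.1, 0.5, 1.0]])          # conditional correlation matrix

D = np.diag(sd)
Sigma = D @ R @ D                        # Sigma_t = D_t R_t D_t

var_matrix = w @ Sigma @ w               # w' Sigma w
var_corr = (w * sd) @ R @ (w * sd)       # w' D R D w

# Scalar form: own-variance terms (i == j) plus all cross terms (i != j).
n = len(w)
var_sum = sum(w[i] * w[j] * Sigma[i, j] for i in range(n) for j in range(n))
```

All three quantities coincide up to floating-point error.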

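A general class of multivariate CH models

The main aim of multivariate conditional heteroskedasticity (CH) models is to predict future values of $\Sigma_t$.

A wide class of multivariate CH models can be obtained as a special case of the following general model scheme

$$r_t = \mu_t + \Sigma_t^{1/2} z_t = \mu_t + u_t \qquad (1)$$

$$\Sigma_t = \Sigma(I_{t-1}; \theta_\sigma) \qquad (2)$$

where

1. $\mu_t = E(r_t \mid I_{t-1})$

2. $z_t \sim iid(0, I_n)$

3. $\Sigma_t^{1/2}$ is any p.d. $(n \times n)$ matrix such that $\Sigma_t^{1/2} (\Sigma_t^{1/2})' = \Sigma_t$

The model also implies that

$$\mathrm{var}(r_t \mid I_{t-1}) = \mathrm{var}(u_t \mid I_{t-1}) = \Sigma_t^{1/2}\, \mathrm{var}(z_t)\, (\Sigma_t^{1/2})' = \Sigma_t$$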
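The following sketch simulates from the general scheme for a fixed, hypothetical covariance matrix, using the Cholesky factor as one possible choice of the square-root matrix. Gaussian innovations are an assumption made only for the illustration; the scheme itself requires just zero mean and identity covariance.

```python
import numpy as np

rng = np.random.default_rng(0)
Sigma = np.array([[0.04, 0.01],
                  [0.01, 0.09]])         # hypothetical conditional covariance
mu = np.zeros(2)

L = np.linalg.cholesky(Sigma)            # lower triangular, L @ L.T equals Sigma
z = rng.standard_normal((100_000, 2))    # iid innovations with mean 0 and covariance I
u = z @ L.T                              # u_t = Sigma^{1/2} z_t, one row per draw
r = mu + u

sample_cov = np.cov(r, rowvar=False)     # should be close to Sigma
```

With a large number of draws the sample covariance of the simulated returns is close to the chosen covariance matrix, as implied by the last equation above.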

Multivariate GARCH (MGARCH) models

For ease of reference we will denote as Multivariate GARCH (MGARCH) the models belonging to the class defined by equations (1)-(2). Different MGARCH models will be characterized by different specifications of the dynamic equation (2).

Bauwens, Laurent and Rombouts (2006, JAE) classify MGARCH models into three different categories

1. direct generalizations of the univariate GARCH model: VECH (Bollerslev, Engle and Wooldridge, 1988, JPE), BEKK (Engle and Kroner, 1995, ET), RiskMetrics and factor models.

2. linear combinations of univariate GARCH models: generalized orthogonal models and latent factor models.

3. nonlinear combinations of univariate GARCH models: Dynamic Conditional Correlation (DCC) models (Engle, 2002, JBES).

In this course we will focus on categories (1) and (3).

Main issues in MGARCH modelling

Identify appropriate conditions to be imposed on $\theta_\sigma$ to guarantee the positive definiteness (PDness) of $\Sigma_t$.

In order to make estimation feasible, we need to find a parsimonious parameterization without paying too high a price in terms of flexibility of the dynamics of $\Sigma_t$.

Find $\Sigma = E(\Sigma_t) = \mathrm{Var}(u_t)$ and identify appropriate conditions to be imposed on $\theta_\sigma$ to guarantee weak stationarity of the model and existence of $\Sigma$.

For ease of exposition we will focus on 1-lag dynamic models (the most widely used choice in practical applications).

Financial applications of MGARCH models

MGARCH models have several applications in different fields of finance

Prediction of VaR and ES (e.g. Christoffersen, 2008, HoFTS)

Hedging (e.g. Storti, 2008, SMA)

Portfolio optimization (e.g. Engle and Colacito, 2006, JBES)

Option pricing (e.g. Rombouts, Stentoft and Violante, 2012, WP)

Analysis of contagion (e.g. Billio and Caporin, 2005, SMA) and volatility spillovers (e.g. Chang and McAleer, 2011, WP)

Focus: VaR prediction via MGARCH models (1)

VaR prediction is one of the main applications of multivariate GARCH models.

Assume that $w_t = w(I_{t-1})$, which is a natural assumption: on a daily scale, a hypothetical investor decides the allocation of the portfolio using the information available at market close on the previous day.

Recall that portfolio returns are given by

$$r_t^{(p)} = w_t' \mu_t + w_t' \Sigma_t^{1/2} z_t$$

The availability of an analytical expression for VaR depends on the shape of the distribution of $z_t$.

Focus: VaR prediction via MGARCH models (2)

Normal errors: $z_t \sim MVN(0, I_n)$. Since linear transformations of MVN distributions are still normal, we have

$$(r_t^{(p)} \mid I_{t-1}) \sim N(w_t' \mu_t,\; w_t' \Sigma_t w_t)$$

The one-step-ahead VaR at level $(1-p)$ is then given by:

$$VaR_{t,p,1} = w_t' \mu_t + \sqrt{w_t' \Sigma_t w_t}\; N_p$$

where $N_p$ is the order-$p$ quantile of a standard normal distribution.
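A minimal sketch of the normal VaR formula follows. The function name `normal_var` and the numerical inputs are hypothetical; the normal quantile comes from the Python standard library.

```python
from statistics import NormalDist

import numpy as np

def normal_var(w, mu, Sigma, p):
    """One-step-ahead VaR at level (1 - p) under conditionally normal returns."""
    mean = w @ mu                          # w' mu
    sd = np.sqrt(w @ Sigma @ w)            # sqrt(w' Sigma w)
    return mean + sd * NormalDist().inv_cdf(p)

# Hypothetical two-asset example with zero conditional mean.
w = np.array([0.5, 0.5])
mu = np.zeros(2)
Sigma = np.array([[0.0004, 0.0001],
                  [0.0001, 0.0009]])
var_99 = normal_var(w, mu, Sigma, p=0.01)  # roughly -0.045, i.e. a 4.5% loss
```

The VaR is expressed here as a (negative) return quantile, matching the sign convention of the formula above.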

Focus: VaR prediction via MGARCH models (3)

Multivariate Student's t errors: $z_t \sim t_n(0, I_n, \nu)$. As for the Normal distribution, linear transformations of multivariate t distributions are still t with the same number of degrees of freedom. The one-step-ahead VaR at level $(1-p)$ is then given by:

$$VaR_{t,p,1} = w_t' \mu_t + \sqrt{w_t' \Sigma_t w_t}\; t^*_{p,\nu}$$

where $t^*_{p,\nu} = \sqrt{(\nu-2)/\nu}\; t_{p,\nu}$ is the order-$p$ quantile of a univariate standardized Student's t distribution with $\nu$ degrees of freedom.

In general, it is not always possible to derive the exact form of VaR. In this case, and for horizons $k > 1$, simulation techniques should be used, as in the univariate case.

VECH models

The general VECH model (Bollerslev, Engle and Wooldridge, 1988, JPE) is defined as

$$r_t = \mu_t + \Sigma_t^{1/2} z_t = \mu_t + u_t \qquad (3)$$

$$h_t = c + A \eta_{t-1} + G h_{t-1} \qquad (4)$$

where

$$h_t = \mathrm{vech}(\Sigma_t), \qquad \eta_t = \mathrm{vech}(u_t u_t')$$

$A$ and $G$ are $(n(n+1)/2 \times n(n+1)/2)$ parameter matrices and $c$ is a $(n(n+1)/2 \times 1)$ parameter vector.

Heavily parameterized: the total number of parameters is $n(n+1)(n(n+1)+1)/2$, which gives 21 parameters for $n = 2$, 78 for $n = 3$ and 210 for $n = 4$!

Conditions on $c$, $A$ and $G$ for positive definiteness of $\Sigma_t$ are difficult to derive.
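The parameter counts quoted above can be reproduced by counting the entries of the intercept vector and of the two parameter matrices. The helper name `vech_param_count` is hypothetical.

```python
def vech_param_count(n: int) -> int:
    """Number of free parameters in the unrestricted VECH(1,1) model."""
    m = n * (n + 1) // 2          # length of vech(Sigma_t)
    return m + 2 * m * m          # c has m entries, A and G have m * m each

counts = {n: vech_param_count(n) for n in (2, 3, 4)}
```

Since m + 2m² = m(2m + 1) with m = n(n+1)/2, this agrees with the closed-form expression n(n+1)(n(n+1)+1)/2.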

Diagonal VECH models (1)

In order to reduce the number of parameters to a tractable number, some simplifying assumptions must be imposed. One solution is to assume that $A$ and $G$ are diagonal matrices, reducing the number of parameters to $n(n+5)/2$ (e.g. 7, 12, 18 for $n = 2, 3, 4$).

The diagonal VECH model can also be represented in terms of Hadamard products as

$$\Sigma_t = C^\circ + A^\circ \odot (u_{t-1} u_{t-1}') + G^\circ \odot \Sigma_{t-1}$$

where $\odot$ denotes the Hadamard (element-wise) product, $c = \mathrm{vech}(C^\circ)$, $A = \mathrm{diag}(\mathrm{vech}(A^\circ))$ and $G = \mathrm{diag}(\mathrm{vech}(G^\circ))$.

It is easy to show that $\Sigma_t$ will be PD for all $t$ if $C^\circ$ is PD while $\Sigma_0$, $A^\circ$ and $G^\circ$ are PSD. The PDness of $C^\circ$ can be easily imposed by reparameterizing the model in terms of the Cholesky decomposition.
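The sketch below performs one step of the diagonal VECH recursion in Hadamard form and checks that it matches the VECH form with diagonal parameter matrices, taking the intercept vector to be the vech of the intercept matrix. All parameter values are hypothetical, and `vech` is a small helper written for the example.

```python
import numpy as np

def vech(M):
    """Stack the lower triangle (including the diagonal) of a symmetric matrix."""
    return M[np.tril_indices(M.shape[0])]

# Hypothetical symmetric parameter matrices for n = 2.
C0 = np.array([[0.02, 0.005], [0.005, 0.03]])   # PD intercept
A0 = np.array([[0.05, 0.04], [0.04, 0.05]])     # PSD
G0 = np.array([[0.90, 0.88], [0.88, 0.90]])     # PSD

Sigma_prev = np.array([[0.5, 0.1], [0.1, 0.8]])
u_prev = np.array([0.3, -0.6])
uu = np.outer(u_prev, u_prev)

# Hadamard form: '*' is element-wise multiplication in NumPy.
Sigma_t = C0 + A0 * uu + G0 * Sigma_prev

# Equivalent VECH form with diagonal A and G.
h_t = vech(C0) + np.diag(vech(A0)) @ vech(uu) + np.diag(vech(G0)) @ vech(Sigma_prev)

min_eig = np.linalg.eigvalsh(Sigma_t).min()     # positive when Sigma_t is PD
```

With a PD intercept and PSD coefficient matrices, the updated covariance matrix stays positive definite in this example.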

Diagonal VECH models (2)

In order to obtain a more parsimonious model, an alternative strategy is to constrain the parameter matrices to be of rank one

$$A^\circ = aa' \qquad G^\circ = gg' \qquad C^\circ = cc'$$

with $a$, $g$, $c$ being $(n \times 1)$ vectors. In this case, $\Sigma_t$ will be only PSD unless we impose PDness of $C^\circ$.

For vast dimensional systems, $A^\circ$ and $G^\circ$ are usually constrained to be matrices of ones multiplied by a positive scalar (scalar VECH model)

$$A^\circ = a\, \iota\iota' \qquad G^\circ = g\, \iota\iota'$$

where $a$ and $g$ are now scalars and $\iota$ is an $(n \times 1)$ vector of ones, $\iota_i = 1$ for $i = 1, \dots, n$.

Statistical properties of VECH models

The VECH model in equations (3)-(4) is covariance stationary if and only if the eigenvalues of $(A + G)$ are less than one in modulus

$$\max |\mathrm{eig}(A + G)| < 1$$

The second unconditional moment of a stationary VECH process is

$$\mathrm{vech}(\Sigma) = (I_{n^*} - A - G)^{-1} c$$

where $n^* = n(n+1)/2$.
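Both results are straightforward to evaluate numerically. The sketch below uses hypothetical diagonal parameter matrices for a bivariate model, so that the vech of the covariance matrix has three elements.

```python
import numpy as np

# Hypothetical VECH(1,1) parameters for n = 2.
c = np.array([0.01, 0.002, 0.02])
A = np.diag([0.05, 0.04, 0.06])
G = np.diag([0.90, 0.92, 0.88])

# Covariance stationarity: largest eigenvalue modulus of A + G below one.
spectral_radius = np.max(np.abs(np.linalg.eigvals(A + G)))
is_stationary = spectral_radius < 1

# Unconditional second moment, meaningful only when the process is stationary.
vech_Sigma = np.linalg.solve(np.eye(3) - A - G, c)
```

Solving the linear system avoids forming the matrix inverse explicitly, which is usually preferable numerically.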

Covariance targeting

Assume that $r_t$ is a scalar VECH process

$$\Sigma_t = C^\circ + a\,(u_{t-1} u_{t-1}') + g\, \Sigma_{t-1} \qquad (5)$$

The above expression for the unconditional covariance matrix can be reformulated as

$$\Sigma = (1 - a - g)^{-1} C^\circ$$

which can be inverted to give

$$C^\circ = (1 - a - g)\, \Sigma \qquad (6)$$

Substituting (6) into (5) reduces to 2 the number of parameters to be simultaneously estimated in the scalar VECH model ($\Sigma$ can be estimated as the sample covariance matrix of filtered returns).

This technique is known as covariance targeting and generalizes to the multivariate case the variance targeting of Engle and Mezrich (1996, Risk).
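A minimal sketch of a scalar VECH filter with covariance targeting is given below. The function name, the start value for the recursion and the simulated placeholder data are assumptions made for the illustration; in practice the two scalar parameters would be estimated rather than fixed.

```python
import numpy as np

def scalar_vech_targeted(u, a, g):
    """Filter Sigma_t from a scalar VECH model with covariance targeting.

    u: (T, n) array of demeaned returns; a, g: scalars with a + g < 1.
    """
    T, n = u.shape
    Sigma_bar = u.T @ u / T                  # sample covariance used as the target
    C0 = (1.0 - a - g) * Sigma_bar           # equation (6)
    Sigma = np.empty((T, n, n))
    Sigma[0] = Sigma_bar                     # assumed start value
    for t in range(1, T):
        Sigma[t] = C0 + a * np.outer(u[t - 1], u[t - 1]) + g * Sigma[t - 1]
    return Sigma, Sigma_bar

rng = np.random.default_rng(1)
u = rng.standard_normal((500, 3)) * 0.01     # placeholder data, not real returns
Sigma, Sigma_bar = scalar_vech_targeted(u, a=0.05, g=0.90)
```

Only the two scalars remain as free parameters, since the intercept matrix is pinned down by the sample covariance.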

The multivariate RiskMetrics (EWMA) predictor

J.P. Morgan (1996) proposed a multivariate extension of the univariate RiskMetrics volatility predictor.

This can be represented as a special case of the scalar VECH model

$$h_t = (1 - \lambda)\, \eta_{t-1} + \lambda\, h_{t-1}$$

where $0 \leq \lambda \leq 1$.

The decay factor $\lambda$ proposed by RiskMetrics is 0.94 for daily data and 0.97 for monthly data.
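The EWMA predictor is simple to implement as a recursion on the covariance matrix, with weight (1 − λ) on the latest outer product of returns and weight λ on the previous covariance matrix, so that the two weights sum to one. The function name and the choice of start value below are assumptions made for the sketch.

```python
import numpy as np

def ewma_covariance(u, lam=0.94, Sigma0=None):
    """EWMA covariance filter for a (T, n) array of demeaned returns."""
    T, n = u.shape
    Sigma = np.empty((T, n, n))
    # Assumed start value: the full-sample covariance unless one is supplied.
    Sigma[0] = np.cov(u, rowvar=False) if Sigma0 is None else Sigma0
    for t in range(1, T):
        Sigma[t] = (1.0 - lam) * np.outer(u[t - 1], u[t - 1]) + lam * Sigma[t - 1]
    return Sigma
```

No parameter has to be estimated when the decay factor is fixed in advance, which explains the popularity of this predictor in large systems.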

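Focus: the bivariate VECH(1,1) model

$$\begin{aligned}
h_{t,11} &= c_1 + a_{11} u_{t-1,1}^2 + a_{12} u_{t-1,1} u_{t-1,2} + a_{13} u_{t-1,2}^2 + g_{11} h_{t-1,11} + g_{12} h_{t-1,12} + g_{13} h_{t-1,22} \\
h_{t,12} &= c_2 + a_{21} u_{t-1,1}^2 + a_{22} u_{t-1,1} u_{t-1,2} + a_{23} u_{t-1,2}^2 + g_{21} h_{t-1,11} + g_{22} h_{t-1,12} + g_{23} h_{t-1,22} \\
h_{t,22} &= c_3 + a_{31} u_{t-1,1}^2 + a_{32} u_{t-1,1} u_{t-1,2} + a_{33} u_{t-1,2}^2 + g_{31} h_{t-1,11} + g_{32} h_{t-1,12} + g_{33} h_{t-1,22}
\end{aligned}$$

where $h_{t,ij} = \mathrm{cov}(r_{t,i}, r_{t,j} \mid I_{t-1})$.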

BEKK models (1)

The class of BEKK models was proposed by Engle and Kroner (1995, ET). Unlike VECH models, BEKK models guarantee PDness of $\Sigma_t$ without imposing constraints on the model parameters.

In a BEKK(1,1,K) model the dynamic updating equation for $\Sigma_t$ is given by

$$\Sigma_t = C'C + \sum_{k=1}^{K} A_k' (u_{t-1} u_{t-1}') A_k + \sum_{k=1}^{K} G_k' \Sigma_{t-1} G_k$$

where $A_k$, $G_k$ are $(n \times n)$ matrices (for $k = 1, \dots, K$) and $C$ is upper triangular.

The BEKK model can be shown to be a special case of the general VECH model.
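To illustrate the positive-definiteness property, the sketch below performs one BEKK update with K = 1 using unrestricted, hypothetical parameter matrices. The function name `bekk_step` is an assumption for the example.

```python
import numpy as np

def bekk_step(C, A, G, u_prev, Sigma_prev):
    """One BEKK(1,1,1) update: C'C + A' u u' A + G' Sigma G."""
    return C.T @ C + A.T @ np.outer(u_prev, u_prev) @ A + G.T @ Sigma_prev @ G

C = np.array([[0.10, 0.02], [0.00, 0.08]])     # upper triangular, nonzero diagonal
A = np.array([[0.30, -0.20], [0.10, 0.25]])    # no sign or symmetry restrictions
G = np.array([[0.90, 0.05], [-0.03, 0.92]])    # no sign or symmetry restrictions
Sigma_t = bekk_step(C, A, G, np.array([0.5, -1.0]), np.eye(2))

min_eig = np.linalg.eigvalsh(Sigma_t).min()    # positive: Sigma_t is PD
```

Each term is a quadratic form, so the sum is PSD by construction, and it is PD here because the intercept term is PD when the triangular matrix has a nonzero diagonal.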

BEKK models

BEKK models (2)

In practical applications BEKK models with $K = 1$ are usually considered.

For a BEKK(1,1,1) model the number of parameters is $n(5n+1)/2$ (e.g. 11, 24, 42 for $n = 2, 3, 4$). In order to reduce this number, the $A_1$ and $G_1$ matrices can be constrained to be diagonal or scalar. It is immediate to see that the scalar BEKK coincides with the scalar VECH model.

The stationarity conditions can be obtained by deriving the equivalent VECH formulation of the model (see Engle and Kroner, 1995).

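Focus: the bivariate BEKK(1,1,1) model

The implied models for the conditional variances and covariances are constrained versions of those implied by the VECH(1,1) model

$$\begin{aligned}
h_{t,11} &= \omega_{11} + (a^*_{11})^2 u_{t-1,1}^2 + 2 a^*_{11} a^*_{21} u_{t-1,1} u_{t-1,2} + (a^*_{21})^2 u_{t-1,2}^2 \\
&\quad + (g^*_{11})^2 h_{t-1,11} + 2 g^*_{11} g^*_{21} h_{t-1,12} + (g^*_{21})^2 h_{t-1,22} \\
h_{t,12} &= \omega_{21} + (a^*_{11} a^*_{12}) u_{t-1,1}^2 + (a^*_{11} a^*_{22} + a^*_{21} a^*_{12}) u_{t-1,1} u_{t-1,2} + (a^*_{22} a^*_{21}) u_{t-1,2}^2 \\
&\quad + (g^*_{11} g^*_{12}) h_{t-1,11} + (g^*_{11} g^*_{22} + g^*_{21} g^*_{12}) h_{t-1,12} + (g^*_{22} g^*_{21}) h_{t-1,22}
\end{aligned}$$

where $h_{t,ij} = \mathrm{cov}(r_{t,i}, r_{t,j} \mid I_{t-1})$; $a^*_{ij}$ and $g^*_{ij}$ are the elements of the $A_1$ and $G_1$ matrices in the BEKK model formulation, and $\omega_{ij}$ are the elements of $C'C$.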

Estimation of BEKK and VECH models

The estimation of BEKK and VECH model parameters can be performed by maximizing the Gaussian QML function

$$\ell_T(\theta_\mu, \theta_\sigma) = -\frac{1}{2} \sum_{t=1}^{T} \log|\Sigma_t| - \frac{1}{2} \sum_{t=1}^{T} (r_t - \mu_t)' \Sigma_t^{-1} (r_t - \mu_t) \qquad (7)$$

Alternatively, the model can be estimated by ML under the assumption that the standardized errors $z_t$ follow a multivariate t distribution with density

$$f(z_t \mid \theta_\mu, \theta_\sigma, \nu) = \frac{\Gamma((\nu + n)/2)}{\Gamma(\nu/2)\, [\pi(\nu - 2)]^{n/2}} \left[ 1 + \frac{z_t' z_t}{\nu - 2} \right]^{-(n+\nu)/2}$$

Other distributions have been considered: e.g. Bauwens and Laurent (2005, JBES) assume a multivariate skewed t distribution for $z_t$.
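A minimal sketch of how the Gaussian quasi log-likelihood (7) can be evaluated for a given path of conditional covariance matrices is shown below. The function name and argument layout are assumptions; an optimizer would call such a function repeatedly, rebuilding the covariance path from the candidate parameters each time.

```python
import numpy as np

def gaussian_qll(resid, Sigma):
    """Gaussian quasi log-likelihood (7), omitting the additive constant.

    resid: (T, n) array of r_t - mu_t; Sigma: (T, n, n) conditional covariances.
    """
    total = 0.0
    for e, S in zip(resid, Sigma):
        sign, logdet = np.linalg.slogdet(S)
        if sign <= 0:
            return -np.inf                     # reject paths with non-positive determinant
        total += -0.5 * (logdet + e @ np.linalg.solve(S, e))
    return total
```

Using the log-determinant and a linear solve, rather than the determinant and an explicit inverse, is a common choice for numerical stability.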

Conditional Correlation models

Conditional correlation models separate the modelling of conditional variances from that of conditional correlations in two different steps

1. we separately define and estimate $n$ univariate models for the conditional variances

2. we estimate the conditional correlation matrix.

This approach has two important advantages

1. computational simplicity: the number of parameters to be simultaneously estimated is reduced since a complex optimization problem is disaggregated into simpler ones

2. flexibility: it allows for more flexible model structure since

Constant Conditional Correlation (CCC) models

In the CCC model (Bollerslev, 1990, RES) the conditional correlation matrix is assumed to be constant. This is equivalent to imposing that the conditional covariances are proportional to the product of the conditional standard deviations.

The conditional covariance matrix is modelled as

$$\Sigma_t = D_t R D_t$$

with $D_t = \mathrm{diag}(\sqrt{\sigma_{t,11}}, \dots, \sqrt{\sigma_{t,nn}})$, where the conditional variances $\sigma_{t,jj}$ can be generated by any GARCH-type model and $R$ is the conditional correlation matrix of returns

$$R = \mathrm{corr}(r_t \mid I_{t-1}), \qquad \rho_{i,i} = 1 \;\; \forall i$$

It is easy to show that the $(i, j)$ element of $\Sigma_t$ is given by

$$\sigma_{t,ij} = \rho_{i,j} \sqrt{\sigma_{t,ii}\, \sigma_{t,jj}}$$
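Given first-stage conditional variances and a constant correlation matrix, the CCC covariance matrices follow by simple rescaling. The sketch below uses NumPy broadcasting; the function name and the numerical inputs are hypothetical.

```python
import numpy as np

def ccc_covariances(sigma2, R):
    """Sigma_t = D_t R D_t for each t, given a (T, n) array of conditional variances."""
    sd = np.sqrt(sigma2)
    # Element [t, i, j] equals sd[t, i] * R[i, j] * sd[t, j].
    return sd[:, :, None] * R[None, :, :] * sd[:, None, :]

sigma2 = np.array([[0.04, 0.09]])              # one period, two assets
R = np.array([[1.0, 0.5], [0.5, 1.0]])
Sigma = ccc_covariances(sigma2, R)             # covariance 0.5 * 0.2 * 0.3 = 0.03
```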

Dynamic Conditional Correlation (DCC) models

The constant conditional correlation assumption is often inadequate. Empirical evidence suggests that the level of conditional correlations is time varying (e.g. a higher correlation is usually detected in high volatility periods).

DCC models are also based on a two-step model building strategy. Unlike in CCC models, the conditional correlation matrix is time varying ($R_t$) as a function of a vector of unknown parameters

$$R_t = R(I_{t-1}; \theta_c)$$

Several different versions of the DCC model have been proposed. In this course we focus on

1. the DCC model proposed by Engle (2002, JBES), DCC-E, and its variants

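The DCC-E model: general formulation

The DCC-E(1,1) model is defined by the following set of equations (for simplicity assume $\mu_t = 0$)

$$\begin{aligned}
\Sigma_t &= D_t R_t D_t \\
\sigma_{t,ii} &= \omega_i + \alpha_i r_{t-1,i}^2 + \beta_i \sigma_{t-1,ii}, \qquad i = 1, \dots, n \\
D_t &= \mathrm{diag}(\sqrt{\sigma_{t,11}}, \dots, \sqrt{\sigma_{t,nn}}) \\
\epsilon_t &= D_t^{-1} r_t \\
Q_t &= C'C + A \odot (\epsilon_{t-1} \epsilon_{t-1}') + B \odot Q_{t-1} \\
R_t &= (\mathrm{diag}(Q_t))^{-1/2}\, Q_t\, (\mathrm{diag}(Q_t))^{-1/2}
\end{aligned}$$

where $\odot$ denotes the Hadamard (element-wise) product, $C$ is upper triangular, and $A$ and $B$ are $(n \times n)$ PSD parameter matrices.

The last equation is needed in order to guarantee that $R_t$ is a well-defined correlation matrix.

The specification of the $\sigma_{t,ii}$ can be easily changed to allow for other univariate volatility models.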

The DCC-E model: scalar formulation

For vast dimensional systems the general DCC-E model is not feasible due to the high number of parameters, so it is replaced by a restricted version with scalar parameters

$$Q_t = S(1 - a - b) + a\, \epsilon_{t-1} \epsilon_{t-1}' + b\, Q_{t-1} \qquad (8)$$

where $a, b \geq 0$, $a + b < 1$ and $S$ is PD. These restrictions imply that $Q_t$ is PD.

In order to reduce the number of parameters to be simultaneously estimated, Engle (2002) concentrates out the matrix $S$ by setting $S = E(\epsilon_t \epsilon_t') = E(R_t) = \bar{R}$. In practical applications $\bar{R}$ is replaced by the sample covariance matrix of the standardized returns $\hat{\epsilon}_t$:

$$\hat{\bar{R}} = (1/T) \sum_{t=1}^{T} \hat{\epsilon}_t \hat{\epsilon}_t'$$
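The scalar recursion with correlation targeting can be sketched as follows. The function name, the start value for the recursion and the simulated placeholder data are assumptions for the illustration; the standardized returns would normally come from the first-stage univariate volatility models.

```python
import numpy as np

def dcc_correlations(eps, a, b):
    """Scalar DCC-E(1,1) filter for a (T, n) array of standardized returns."""
    T, n = eps.shape
    S = eps.T @ eps / T                        # targeting estimate of S
    Q = S.copy()                               # assumed start value
    R = np.empty((T, n, n))
    for t in range(T):
        if t > 0:
            Q = (1 - a - b) * S + a * np.outer(eps[t - 1], eps[t - 1]) + b * Q
        d = 1.0 / np.sqrt(np.diag(Q))
        R[t] = Q * np.outer(d, d)              # diag(Q)^{-1/2} Q diag(Q)^{-1/2}
    return R

rng = np.random.default_rng(3)
eps = rng.standard_normal((300, 3))            # placeholder standardized returns
R = dcc_correlations(eps, a=0.05, b=0.90)
```

The rescaling step guarantees a unit diagonal, so each filtered matrix is a proper correlation matrix.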

The DCC-E model: Aielli's critique (1)

Aielli (2011) shows that $S = E(\epsilon_t \epsilon_t')$ if and only if

$$E(\epsilon_t \epsilon_t') = E(Q_t) = \bar{Q}$$

This equality does not hold in general (except for the constant conditional correlation case). By the law of iterated expectations:

$$E(\epsilon_t \epsilon_t') = E[E(\epsilon_t \epsilon_t' \mid I_{t-1})] = E(R_t) = E\left( (\mathrm{diag}(Q_t))^{-1/2}\, Q_t\, (\mathrm{diag}(Q_t))^{-1/2} \right) \neq \bar{Q}$$

This motivates a new variant of the DCC-E model, called the corrected DCC (cDCC), that is not affected by this bias (although we must remark that the empirical performances of the cDCC and DCC-E models are very close).

The DCC-E model: Aielli's critique (2)

The cDCC model replaces equation (8) by the following recursion

$$Q_t = \bar{Q}(1 - a - b) + a\, e_{t-1} e_{t-1}' + b\, Q_{t-1}$$

where $e_t = (\mathrm{diag}(Q_t))^{1/2} \epsilon_t$. It is easy to show that

$$E(e_t e_t') = E(Q_t) = \bar{Q}$$

If the $e_t$ were observable, $\bar{Q}$ could have been estimated as

$$\hat{\bar{Q}} = (1/T) \sum_{t=1}^{T} e_t e_t'$$

but this is not feasible since $e_t$ is dependent on the unknown

The DCC-T model

The DCC-T model was proposed by Tse and Tsui (2002). The main difference with respect to the DCC-E model is that the conditional correlation matrix $R_t$ is directly generated by the linear dynamic equation

$$R_t = (1 - a - b) S + a\, \Psi_{t-1} + b\, R_{t-1}$$

where $S$ is a PD matrix with diagonal entries equal to 1 and $\Psi_{t-1}$ is the sample correlation matrix of $\epsilon_t$ computed over a moving window of length $M$

$$\psi_{ij,t-1} = \frac{\sum_{m=1}^{M} \epsilon_{t-m,i}\, \epsilon_{t-m,j}}{\sqrt{\sum_{m=1}^{M} \epsilon_{t-m,i}^2 \; \sum_{m=1}^{M} \epsilon_{t-m,j}^2}}$$

where $\epsilon_{t,i} = r_{t,i} / \sqrt{\sigma_{t,ii}}$. A necessary condition for PDness of $\Psi_t$, and hence of $R_t$, is that $M \geq n$.

Aielli (2011) shows that, even for this model, the targeting matrix $R$ is not easy to estimate.
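The moving-window correlation matrix used in the DCC-T recursion can be computed with a couple of matrix operations. In the sketch below the function name and the placeholder data are assumptions, and the time index is required to be at least as large as the window length.

```python
import numpy as np

def rolling_psi(eps, t, M):
    """Psi_{t-1}: correlation of eps over observations t-M, ..., t-1 (no demeaning)."""
    window = eps[t - M:t]                      # rows t-M, ..., t-1
    cross = window.T @ window                  # sums of eps_i * eps_j over the window
    scale = np.sqrt(np.diag(cross))
    return cross / np.outer(scale, scale)

rng = np.random.default_rng(4)
eps = rng.standard_normal((50, 3))             # placeholder standardized returns
Psi = rolling_psi(eps, t=20, M=5)              # window length M >= n = 3
```

With a window shorter than the number of assets the cross-product matrix would be rank deficient, which is why the window length must be at least as large as the dimension.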

Estimation of conditional correlation models (1)

The estimation of conditional correlation models is based on a two-step procedure

1. Estimation of the univariate conditional variance parameters ($\theta_\sigma$)

2. Estimation of the conditional correlation parameters ($\theta_c$), given first-stage estimates of $\theta_\sigma$

Assuming (for simplicity) $\mu_t = 0$, the log-likelihood function can be decomposed as the sum of a volatility and a correlation component.

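The volatility part can be written as

$$\ell_v(r; \theta_\sigma) = -\frac{1}{2} \sum_{t=1}^{T} \left[ \log(|D_t|^2) + r_t' D_t^{-2} r_t \right] = -\frac{1}{2} \sum_{i=1}^{n} \sum_{t=1}^{T} \left[ \log(\sigma_{t,ii}) + r_{t,i}^2\, \sigma_{t,ii}^{-1} \right] = \sum_{i=1}^{n} \ell_{v,i}(r_i; \theta_{\sigma,i})$$

which is the sum of the univariate likelihoods of the 1st stage volatility models.

The correlation part is then given by

$$\ell_c(r; \theta_\sigma, \theta_c) = -\frac{1}{2} \sum_{t=1}^{T} \left( \log|R_t| + \epsilon_t' R_t^{-1} \epsilon_t - \epsilon_t' \epsilon_t \right)$$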
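The decomposition can be verified numerically for a single observation. The sketch below uses hypothetical standard deviations and correlations, omits the additive constant of the Gaussian log-likelihood, and checks that the volatility and correlation parts add up to the full contribution.

```python
import numpy as np

rng = np.random.default_rng(2)
r = rng.standard_normal(3) * 0.02                 # placeholder return vector
sd = np.array([0.01, 0.02, 0.015])                # sqrt(sigma_ii), hypothetical
R = np.array([[1.0, 0.3, 0.1],
              [0.3, 1.0, 0.5],
              [0.1, 0.5, 1.0]])
D = np.diag(sd)
Sigma = D @ R @ D
eps = r / sd                                      # eps_t = D_t^{-1} r_t

# Full Gaussian contribution of one observation.
full = -0.5 * (np.linalg.slogdet(Sigma)[1] + r @ np.linalg.solve(Sigma, r))
# Volatility part: log|D|^2 + r' D^{-2} r, where r' D^{-2} r = eps' eps.
vol = -0.5 * (2 * np.log(sd).sum() + eps @ eps)
# Correlation part: log|R| + eps' R^{-1} eps - eps' eps.
corr = -0.5 * (np.linalg.slogdet(R)[1] + eps @ np.linalg.solve(R, eps) - eps @ eps)
```

The sum of the two parts equals the full contribution up to floating-point error, which is what makes the two-step estimation strategy possible.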

Estimation of conditional correlation models (2)

The consistency of the 1st ($\hat{\theta}_\sigma$) and 2nd ($\hat{\theta}_c$) stage estimators follows from standard likelihood theory and asymptotic results on two-stage estimation (White, 1994, Theorem 3.10). Asymptotic normality of $\hat{\theta}_\sigma$ also follows from standard likelihood theory results.

The whole estimation problem can be represented as a two-stage GMM estimation problem. Theorem 6.1 in Newey and McFadden (1994, HoE vol. IV, chap. 36) can be applied in order to prove the asymptotic normality of $\hat{\theta}_c$

$$\sqrt{T}\, (\hat{\theta}_{c,T} - \theta_{c,0}) \xrightarrow{d} N(0, V_c)$$

Challenges in multivariate volatility modelling

Main challenges in multivariate volatility modelling

Curse of dimensionality: in practical applications the dimension of the portfolio ($n$) is usually very high, and this leads to a very large number of parameters to be estimated (unless severe constraints are imposed on the dynamics of $\Sigma_t$).

Model uncertainty: several alternative models and approaches are available. Is it possible to improve their predictive performance by considering combinations of different models (forecast combinations, model averaging)?

Inference in very large dimensional models, even if the number of parameters is manageable, presents some relevant statistical and computational problems.