2 3
t
s then volatility in state 2 is three times higher than in state 1. In SWARCH models, the states refer to volatility regimes; in a two-state example, these are a “high” and a “low” volatility state. Since the state variable $s_t$ is unobservable, estimation is usually done with a variant of the Kalman filter. Estimating the model yields the volatility parameters and the transition probabilities. As a by-product, it also yields an estimate of the latent variable, that is, the “states”.
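The two-state idea can be illustrated with a small simulation. This is only a sketch under stated assumptions: the function name, the transition probabilities, and the choice to apply the factor 3 to the variance (rather than to the standard deviation) are all hypothetical and not taken from the text.

```python
import numpy as np


def simulate_two_state_volatility(n_obs, p_stay_low=0.95, p_stay_high=0.90,
                                  base_variance=1.0, high_scale=3.0, seed=0):
    """Simulate returns whose variance switches between two latent states.

    State 0 is the "low" state, state 1 the "high" state. All parameter
    values are illustrative assumptions.
    """
    rng = np.random.default_rng(seed)
    stay_prob = np.array([p_stay_low, p_stay_high])
    states = np.zeros(n_obs, dtype=int)
    for t in range(1, n_obs):
        previous = states[t - 1]
        # Stay in the previous state with its own persistence probability.
        if rng.random() < stay_prob[previous]:
            states[t] = previous
        else:
            states[t] = 1 - previous
    # Variance in the high state is high_scale times the low-state variance.
    variance = base_variance * np.where(states == 1, high_scale, 1.0)
    returns = rng.standard_normal(n_obs) * np.sqrt(variance)
    return states, returns


if __name__ == "__main__":
    states, returns = simulate_two_state_volatility(20000)
    ratio = returns[states == 1].var() / returns[states == 0].var()
    print("estimated variance ratio (high / low):", round(ratio, 2))
```

In an actual SWARCH application the states are not observed, so the ratio would be estimated jointly with the transition probabilities rather than read off as above.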
2.0.10 ARCH (1) Regression Model
The ARCH process defined in the previous sections is used as a tool to capture the behaviour of volatility when it is time-varying in high-frequency time series data. Onyeka-Ubaka, Abass and Okafor (2010) characterized the ARCH model as “a regression model with the conditional volatility as the response and the past lags of the squared return as the covariates.” However, very few financial time series have a constant conditional mean of zero; hence, an ARCH model can be presented in regression form by letting $\varepsilon_t$ be the innovation process in a linear regression:
$$y_t = x_t' b + \varepsilon_t .$$
These ARCH processes, described by Engle (1982), are a powerful class of time series models for a wide variety of financial processes. The general ARCH process in terms of $\psi_t$ (the information set available at time $t$) can be written as
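$$
\begin{aligned}
y_t &= x_t' b + \varepsilon_t \\
y_t \mid \psi_{t-1} &\sim N(x_t' b,\ \sigma_t^2) \\
\sigma_t^2 &= g(\varepsilon_{t-1}, \varepsilon_{t-2}, \ldots, \varepsilon_{t-p},\ x_{t-1}, x_{t-2}, \ldots, x_{t-p},\ a) \\
\varepsilon_t &= y_t - x_t' b
\end{aligned}
\tag{2.19}
$$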
where $\sigma_t^2$ is the variance function and $g$ is a convenient functional specification, usually linear in the parameters; $x_t$ is a vector of lagged endogenous and current exogenous variables, and $\varepsilon_t$ is an innovation. The $x$'s and $\varepsilon_t$ also enter the information set. The $\sigma_t^2$ in (2.19) then takes the simple form $\sigma_t^2 = h(\varepsilon_{t-1}, a)$, where $a$ is a vector of unknown parameters of the variance function and $b$ is a vector of unknown parameters of the regression model. The subscript $t$ indicates that $x_t$ and $y_t$ are series of equally spaced observations through time, and the superscript $'$ denotes the matrix or vector transpose.
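To make the ARCH (1) regression concrete, the following sketch simulates it. It assumes the linear variance function $h(\varepsilon_{t-1}, a) = a_0 + a_1 \varepsilon_{t-1}^2$ with Gaussian innovations; the text only says the specification is usually linear in the parameters, so this particular form, the function name and the parameter values are illustrative assumptions.

```python
import numpy as np


def simulate_arch1_regression(x, b, a0, a1, seed=0):
    """Simulate y_t = x_t' b + eps_t with sigma_t^2 = a0 + a1 * eps_{t-1}^2.

    The linear ARCH(1) variance function is an assumed special case of
    h(eps_{t-1}, a); a0 > 0 and 0 <= a1 < 1 keep the variance positive and
    the unconditional variance finite.
    """
    if a0 <= 0 or not (0 <= a1 < 1):
        raise ValueError("require a0 > 0 and 0 <= a1 < 1")
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    b = np.asarray(b, dtype=float)
    n_obs = x.shape[0]
    eps = np.zeros(n_obs)
    sigma2 = np.zeros(n_obs)
    # Start the recursion at the unconditional variance a0 / (1 - a1).
    sigma2[0] = a0 / (1.0 - a1)
    eps[0] = np.sqrt(sigma2[0]) * rng.standard_normal()
    for t in range(1, n_obs):
        sigma2[t] = a0 + a1 * eps[t - 1] ** 2
        eps[t] = np.sqrt(sigma2[t]) * rng.standard_normal()
    y = x @ b + eps
    return y, eps, sigma2


if __name__ == "__main__":
    design = np.column_stack([np.ones(500), np.linspace(0.0, 1.0, 500)])
    y, eps, sigma2 = simulate_arch1_regression(design, [1.0, 2.0], a0=0.2, a1=0.5)
    print("largest conditional variance:", sigma2.max())
```

The conditional variance rises after a large innovation and falls back toward $a_0$ after a small one, which is the time-varying behaviour the ARCH regression is meant to capture.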
The parameters $a$ and $b$ can be estimated by the exact maximum likelihood method.
Including the regression $y_t = x_t' b + \varepsilon_t$ and the variance function $\sigma_t^2 = h(\varepsilon_{t-1}, a)$ leads to the smallest number of parameters, in the spirit of parsimony (that is, the simplest form). Estimation of the model parameters requires that successive observations be represented by a linear combination of present and past values of independent errors (a white noise process). Tests of hypotheses and interval estimation assume that the errors are normally distributed, so that their characterization by second-order statistics is sufficient. The study also considers the validity of the following assumptions on the regression errors $e_t$ (Belsley, Kuh and Welsch (1980), Berry and Feldman (1985)):
(i) Zero mean: $E[e_t] = 0$
(ii) Constant variance: $E[e_t^2] = \sigma^2$
(iii) Non-autoregression: $E[e_t e_{t+h}] = 0$ $(h \neq 0)$
It is possible to estimate optimally the regression parameters and their variances with the following formulae:
$$\hat{b} = \frac{\sum_{t=1}^{N} (X_t - \bar{X})(Y_t - \bar{Y})}{\sum_{t=1}^{N} (X_t - \bar{X})^2} \tag{2.20}$$
$$\hat{a} = \bar{Y} - \hat{b}\,\bar{X} \tag{2.21}$$
$$Var(\hat{b}) = \frac{s_e^2}{\sum_{t=1}^{N} (X_t - \bar{X})^2} \tag{2.22}$$
$$Var(\hat{a}) = s_e^2 \left( \frac{1}{N} + \frac{\bar{X}^2}{\sum_{t=1}^{N} (X_t - \bar{X})^2} \right) \tag{2.23}$$
$$s_e^2 = \frac{\sum_{t=1}^{N} (Y_t - \hat{Y}_t)^2}{N - 2}, \quad \text{where } \hat{Y}_t = \hat{a} + \hat{b} X_t \tag{2.24}$$
The estimators are optimal in the sense that they are unbiased, efficient and consistent. An estimator, say $\hat{b}$, is unbiased if its expected value is equal to the true value, $b$. An estimator is relatively efficient if it has a smaller variance than any other unbiased estimator of $b$, and it is consistent if both its bias and variance approach zero as the sample size approaches infinity.
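The formulae (2.20) to (2.24) can be checked numerically. The sketch below implements them directly for a single regressor; the function name and the returned dictionary keys are hypothetical choices for illustration.

```python
import numpy as np


def simple_ols(x, y):
    """Least-squares estimates for Y_t = a + b X_t + e_t, following (2.20)-(2.24)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n_obs = x.size
    if n_obs < 3:
        raise ValueError("need at least three observations for N - 2 > 0")
    x_bar, y_bar = x.mean(), y.mean()
    sxx = np.sum((x - x_bar) ** 2)
    b_hat = np.sum((x - x_bar) * (y - y_bar)) / sxx          # (2.20)
    a_hat = y_bar - b_hat * x_bar                            # (2.21)
    residuals = y - (a_hat + b_hat * x)
    s2_e = np.sum(residuals ** 2) / (n_obs - 2)              # (2.24)
    var_b = s2_e / sxx                                       # (2.22)
    var_a = s2_e * (1.0 / n_obs + x_bar ** 2 / sxx)          # (2.23)
    return {"a_hat": a_hat, "b_hat": b_hat, "s2_e": s2_e,
            "var_a": var_a, "var_b": var_b, "residuals": residuals}


if __name__ == "__main__":
    fit = simple_ols([1.0, 2.0, 3.0, 4.0], [3.1, 4.9, 7.2, 8.8])
    print(fit["a_hat"], fit["b_hat"], fit["s2_e"])
```

The residuals returned here are the quantities whose mean, variance and autocorrelation are examined when assumptions (i) to (iii) are checked.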
Together, these properties mean that an estimator will be centered around the true value as the sample size increases. Of particular interest are assumptions (i) and (iii), which together imply that the covariance of any two disturbance terms, that is, $\mathrm{cov}[e_t, e_{t+h}]$ with $h \neq 0$, is equal to zero. This means that disturbances at one point in time are assumed to be uncorrelated with disturbances at any other point in time.
The basic indicator of whether the non-autoregression assumption is violated is whether there is a sample correlation between the various random disturbance terms. To obtain a visual indication of the nature of the correlation, it is helpful to construct correlograms, which plot the estimated autocorrelation and partial autocorrelation coefficients against the time lags (see Figures 11a, 11b, 12a, 12b, 13a, 13b, 14a and 14b). The problem arises in time series analysis because the disturbances, which summarize a large number of theoretically irrelevant (and supposedly random) factors that enter into the relationship under study, are likely to be carried over into subsequent time periods (Samprit and Bertram (1977), Cook (1979), John, William and Michael (1983), Esan and Okafor (1995), Charles and Shayle (2001), Dallah, Okafor and Abass (2004)). In this regard, Kmenta (1986) likened the autoregressive disturbance to the sound effect of tapping a musical instrument:
“while the sound is loudest at the time of impact, it does not stop immediately but lingers on for a time until it finally dies out. This may also be characteristic of the disturbance, since its effects may linger for some time after its occurrence. But while the effect of one disturbance lingers on, other disturbances take place, as if the musical strings were tapped over and over, sometimes harder than at other times (p. 299).”
This analogy is consistent with the presence of a first-order autoregressive process discussed in the BL-GARCH model. The ARCH regression model has a variety of characteristics which make it attractive for econometric applications. Econometric forecasters have recognized that their ability to predict the future varies from one period to another. McNees (17: 52), as cited in Engle (1982), suggests that “the inherent uncertainty or randomness associated with different forecast periods seems to vary widely over time”. He also documented that “large and small errors tend to cluster together (in contiguous time periods)”. This analysis immediately suggests the usefulness of the ARCH model, in which the underlying forecast variance may change over time and is predicted from past forecast errors. The results presented by McNees also show some autocorrelation during the episodes of large variance.
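The correlogram and the clustering of large and small errors can both be examined through sample autocorrelations: those of the residuals for the non-autoregression assumption, and those of the squared residuals for ARCH-type clustering. The sketch below computes the sample autocorrelation function; the function name is hypothetical, and applying it to squared residuals is a common diagnostic assumed here rather than a procedure stated in the text.

```python
import numpy as np


def sample_acf(series, max_lag):
    """Sample autocorrelations r_1, ..., r_max_lag of a one-dimensional series."""
    z = np.asarray(series, dtype=float)
    if not (1 <= max_lag < z.size):
        raise ValueError("max_lag must lie between 1 and len(series) - 1")
    z = z - z.mean()
    denominator = np.sum(z ** 2)
    # r_h = sum_t z_t z_{t-h} / sum_t z_t^2, using the overlapping part only.
    return np.array([np.sum(z[h:] * z[:z.size - h]) / denominator
                     for h in range(1, max_lag + 1)])


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    residuals = rng.standard_normal(1000)
    print("ACF of residuals:        ", np.round(sample_acf(residuals, 5), 3))
    print("ACF of squared residuals:", np.round(sample_acf(residuals ** 2, 5), 3))
```

For a white noise series of length $N$, sample autocorrelations outside roughly $\pm 1.96/\sqrt{N}$ are usually read as evidence against the assumption; persistent positive values in the squared series point to volatility clustering.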
2.0.11 Multivariate ARCH Models
All the ARCH models discussed so far are univariate. However, assets and markets affect one another not only in terms of expected returns but also in terms of volatility. Thus, the accurate estimation of the time-varying covariance between asset returns is crucial for asset pricing and risk management. The generalization of univariate models to a multivariate context leads to a straightforward application of ARCH models to portfolio selection and asset pricing theory. Let the $(n \times 1)$ vector $\{y_t\}$ refer to the multivariate discrete-time real-valued stochastic process to be forecasted, where $E_{t-1}(y_t) \equiv \mu_t$ denotes the conditional mean. The innovation process for the conditional mean, $\varepsilon_t \equiv y_t - \mu_t$, has an $(n \times n)$ conditional covariance matrix $V_{t-1}(y_t) \equiv \sigma_t^2$. Let $\{\varepsilon_t\}$ be a sequence of $(n \times 1)$ random vectors generated as:
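$$\varepsilon_t = \sigma_t z_t \tag{2.25}$$
where
$$E_{t-1}(\varepsilon_t) = 0, \qquad E_{t-1}(\varepsilon_t \varepsilon_t') = \sigma_t^2 .$$
$\sigma_t$ is an $(n \times n)$ positive definite matrix, measurable with respect to the information set $\psi_{t-1}$, that is, the $\sigma$-field generated by the past observations $\{\varepsilon_{t-1}, \varepsilon_{t-2}, \ldots\}$.
$\{z_t\}$ is a sequence of $(n \times 1)$ i.i.d. random vectors with the following characteristics:
$$E[z_t] = 0, \qquad E[z_t z_t'] = I_n, \qquad z_t \sim G(0, I_n),$$
where $G$ is a continuous density function.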
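The construction in (2.25) can be sketched numerically for a fixed covariance matrix. The example assumes that the factor multiplying $z_t$ is the symmetric positive definite square root of the conditional covariance matrix and that $G$ is the standard normal density; both are assumptions made for illustration, as are the function names.

```python
import numpy as np


def symmetric_sqrt(covariance):
    """Symmetric positive definite square root of a covariance matrix."""
    covariance = np.asarray(covariance, dtype=float)
    eigenvalues, eigenvectors = np.linalg.eigh(covariance)
    if np.any(eigenvalues <= 0):
        raise ValueError("covariance matrix must be positive definite")
    # V diag(sqrt(lambda)) V' is symmetric and squares to the input matrix.
    return (eigenvectors * np.sqrt(eigenvalues)) @ eigenvectors.T


def draw_innovations(covariance, n_draws, seed=0):
    """Draw eps = root @ z with z ~ N(0, I), so that E[eps eps'] = covariance."""
    rng = np.random.default_rng(seed)
    root = symmetric_sqrt(covariance)
    z = rng.standard_normal((n_draws, root.shape[0]))
    return z @ root.T  # each row is one (n x 1) innovation vector


if __name__ == "__main__":
    target = np.array([[2.0, 0.5], [0.5, 1.0]])
    eps = draw_innovations(target, 50000)
    print("sample covariance:\n", np.round(eps.T @ eps / eps.shape[0], 2))
```

In a multivariate GARCH model the covariance matrix changes with $t$, so the square root would be recomputed at every step from the fitted conditional covariance.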
The parameterization of $\sigma_t^2$ as a multivariate GARCH, that is, as a function of the information set $\psi_{t-1}$, allows each element of $\sigma_t^2$ to depend on $q$ lagged values of the squares and cross-products of $\varepsilon_t$, as well as on $p$ lagged values of the elements of $\sigma_t^2$ and a $(K \times 1)$ vector of dummies. The elements of the covariance matrix therefore follow a vector ARMA process in the squares and cross-products of the disturbances.
For a system of n regression equations, the natural extension of (2.19) to a multivariate framework could be represented as:
$$
\begin{aligned}
y_t &= B' x_t + \varepsilon_t \\
\varepsilon_t \mid \psi_{t-1} &\sim f(0, H_t) \\
H_t &= g(H_{t-1}, H_{t-2}, \ldots, \varepsilon_{t-1}, \varepsilon_{t-2}, \ldots)
\end{aligned}
$$
where $H_t$ is the conditional covariance matrix, $B$ is a $(k \times n)$ matrix of unknown parameters, $x_t$ is a $(k \times 1)$ vector of endogenous and exogenous explanatory variables included in the available information set $\psi_{t-1}$, $f(.)$ is the conditional multivariate density function of the innovation process, and $g(.)$ is a function of the lagged conditional covariance matrices and innovations.
The natural multivariate extension of the GARCH (p, q) model is:
$$H_t = A_0 A_0' + \sum_{i=1}^{q} A_i (\varepsilon_{t-i} \varepsilon_{t-i}') A_i' + \sum_{j=1}^{p} B_j (H_{t-j}) B_j' \tag{2.26}$$
where $H_t$ is the conditional covariance matrix, $A_0$ is a lower triangular matrix with $n(n+1)/2$ parameters, and $A_i$ and $B_j$ denote $(n \times n)$ matrices with $n^2$ parameters each. Engle and Kroner (1995), building on earlier work by Baba, Engle, Kraft and Kroner (1990), proposed model (2.26), which they referred to as the BEKK model. This parameterization guarantees that $H_t$ is positive definite and requires the estimation of $[n(n+1)/2] + n^2(q+p)$ parameters. For example, for n = 3, the multivariate GARCH (1, 1) model contains 24 parameters for estimation. Lee (1999) investigated the output-inflation variability tradeoff using the bivariate BEKK model. Recently, Moschini and Myers (2002), in order to estimate time-varying optimal hedge ratios in commodity markets, modified the BEKK model of (2.26) in the form:
[Equation (2.27), the modified BEKK specification of Moschini and Myers (2002), is not recoverable from the source text.]
As Moschini and Myers noted, the covariance matrix is positive definite as long as the time-varying matrix introduced in (2.27) is positive definite.
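To see why the BEKK form (2.26) keeps the covariance matrix positive definite, one step of a BEKK (1, 1) recursion can be sketched as follows. The matrix names `a0`, `a1`, `b1`, the function name and the numerical values are illustrative assumptions; the quadratic forms are written as $A A'$, which is one common convention.

```python
import numpy as np


def bekk11_step(a0, a1, b1, eps_prev, h_prev):
    """One BEKK(1,1) update: H_t = A0 A0' + A1 (eps eps') A1' + B1 H_{t-1} B1'."""
    a0 = np.asarray(a0, dtype=float)
    a1 = np.asarray(a1, dtype=float)
    b1 = np.asarray(b1, dtype=float)
    h_prev = np.asarray(h_prev, dtype=float)
    eps_prev = np.asarray(eps_prev, dtype=float).reshape(-1, 1)
    # Each term is a quadratic form, hence positive semidefinite; A0 A0' is
    # positive definite when the lower triangular A0 has a nonzero diagonal.
    return (a0 @ a0.T
            + a1 @ (eps_prev @ eps_prev.T) @ a1.T
            + b1 @ h_prev @ b1.T)


if __name__ == "__main__":
    a0 = np.array([[1.0, 0.0], [0.5, 1.0]])   # lower triangular
    a1 = 0.3 * np.eye(2)
    b1 = 0.9 * np.eye(2)
    h_next = bekk11_step(a0, a1, b1, eps_prev=[1.0, -2.0], h_prev=np.eye(2))
    print("eigenvalues of H_t:", np.linalg.eigvalsh(h_next))
```

Because positivity comes from the structure of the recursion, no inequality constraints on the individual parameters are needed during estimation.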
A simpler expression of $H_t$, the conditional covariance matrix, can be obtained through the use of the vector-half (vech(.)) operator, which stacks the lower triangular elements of an $(n \times n)$ matrix as an $[n(n+1)/2] \times 1$ vector. Since the conditional covariance matrix $H_t$ is symmetric, vech($H_t$) contains all the unique elements in $H_t$. Following Engle (1987), Bollerslev, Chou and Kroner (1992), and Ng, Engle and Kane (1994), a natural multivariate extension of the univariate GARCH (p, q) model is
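$$
\begin{aligned}
\mathrm{vech}(H_t) &= W + \sum_{i=1}^{q} A_i^{*}\, \mathrm{vech}(\varepsilon_{t-i} \varepsilon_{t-i}') + \sum_{j=1}^{p} B_j^{*}\, \mathrm{vech}(H_{t-j}) \\
&= W + A^{*}(L)\, \mathrm{vech}(\varepsilon_t \varepsilon_t') + B^{*}(L)\, \mathrm{vech}(H_t)
\end{aligned}
\tag{2.28}
$$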
$W$ is an $[n(n+1)/2] \times 1$ vector, and the $A_i^{*}$ and $B_j^{*}$ are $[n(n+1)/2] \times [n(n+1)/2]$ matrices. This general formulation is termed the vech representation by Engle and Kroner (1995). The number of parameters is
$$[n(n+1)/2] + (p+q)\,[n(n+1)/2]^2 .$$
Even for low dimensions of n and small values of p and q the number of parameters is very large; for n = 5 and p = q = 1 the unrestricted version of (2.28) contains 465 parameters. For any parameterization to be sensible, we require that $H_t$ be positive definite for all values of $\varepsilon_t$ in the sample space. In the vech representation this restriction can be difficult to check, let alone impose during estimation (Rossi, 2004).
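The vech operator itself is easy to state in code. The sketch below stacks the lower triangle column by column and also provides the inverse mapping; the function names `vech` and `unvech` are hypothetical helpers, not library calls.

```python
import numpy as np


def vech(matrix):
    """Stack the lower triangular elements of a square matrix column by column."""
    m = np.asarray(matrix, dtype=float)
    if m.ndim != 2 or m.shape[0] != m.shape[1]:
        raise ValueError("vech expects a square matrix")
    rows, cols = np.triu_indices(m.shape[0])
    # Indexing m[cols, rows] walks the lower triangle in column-major order.
    return m[cols, rows]


def unvech(vector, n):
    """Rebuild the symmetric (n x n) matrix whose vech is the given vector."""
    vector = np.asarray(vector, dtype=float)
    if vector.size != n * (n + 1) // 2:
        raise ValueError("vector length must equal n(n+1)/2")
    m = np.zeros((n, n))
    rows, cols = np.triu_indices(n)
    m[cols, rows] = vector
    m[rows, cols] = vector
    return m


if __name__ == "__main__":
    h = np.array([[1.0, 2.0, 3.0],
                  [2.0, 4.0, 5.0],
                  [3.0, 5.0, 6.0]])
    print(vech(h))  # the n(n+1)/2 = 6 unique elements of the symmetric matrix
```

Note that a vector produced by a vech-type recursion maps back to a symmetric matrix, but nothing in `unvech` makes that matrix positive definite, which is the difficulty the text describes.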
Engle, Granger and Kraft (1986) proposed a natural restriction in the ARCH context. In the multivariate expression of the GARCH (p, q) model, however, serious problems arise:
(i) The model might not yield a positive definite covariance matrix unless nonlinear inequality restrictions are imposed,
(ii) The number of parameters that has to be estimated is $(n(n+1)/2)\,(1 + (n(n+1)/2)(q+p))$, a very large number even for low dimensions of n. For example, for n = 3, the multivariate GARCH (1, 1) model contains 78 parameters for estimation.
A number of models, considered in the financial literature, have dealt with imposing constraints in multivariate GARCH models in order to reduce the number of parameters that should be estimated.
These constraints have to be compatible with a positive definite conditional covariance matrix and must lead to a tractable GARCH (p, q) model, such as the diagonal form in which the $\tilde{A}_i$ and $\tilde{B}_j$ matrices are assumed to be diagonal. The number of parameters is thus reduced to $(n(n+1)/2)(1 + q + p)$. For example, for n = 3, the diagonal GARCH (1, 1) model requires the estimation of 18 parameters. Bollerslev, Engle and Wooldridge (1988) used this model for analyzing returns on bills, bonds and stocks, while Baillie and Myers (1991), Bera, Garcia and Roh (1991) and Myers (1991) estimated hedge ratios in commodity markets. Ding and Engle (2001) gave sufficient conditions for the diagonal multivariate GARCH (1, 1) model to be positive definite and proposed four models that are nested in the diagonal multivariate GARCH (1, 1) model.
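The parameter counts quoted for the three parameterizations can be reproduced with a few lines of arithmetic. The function names below are hypothetical; the formulas are the ones stated in the text.

```python
def half_size(n):
    """Number of unique elements in a symmetric (n x n) matrix: n(n+1)/2."""
    return n * (n + 1) // 2


def count_bekk(n, p, q):
    """BEKK form: n(n+1)/2 + n^2 (q + p) parameters."""
    return half_size(n) + n * n * (p + q)


def count_vech(n, p, q):
    """Unrestricted vech form: (n(n+1)/2) (1 + (n(n+1)/2)(q + p)) parameters."""
    return half_size(n) * (1 + half_size(n) * (p + q))


def count_diagonal(n, p, q):
    """Diagonal form: (n(n+1)/2) (1 + q + p) parameters."""
    return half_size(n) * (1 + p + q)


if __name__ == "__main__":
    print(count_bekk(3, 1, 1))      # 24
    print(count_vech(3, 1, 1))      # 78
    print(count_vech(5, 1, 1))      # 465
    print(count_diagonal(3, 1, 1))  # 18
```

The comparison makes the trade-off visible: the diagonal restriction cuts the n = 3 case from 78 to 18 parameters, at the cost of ruling out cross-effects between the elements of the covariance matrix.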
Storti and Vitale (2003) proposed the BL-GARCH model in a Gaussian framework. Diongue, Guegan and Wolff (2010) extended their work using elliptically distributed noise to capture the leverage effect, that is, the negative correlation between asset returns and volatility.