Volatility estimation with GARCH models - NLP models: volatility estimation

NLP models: volatility estimation

6.1 Volatility estimation with GARCH models

Empirical studies analyzing time series data for returns of securities, interest rates, and exchange rates often reveal a clustering behavior for the volatility of the process under consideration. Namely, these time series exhibit high volatility periods alternating with low volatility periods. These observations suggest that future volatility can be estimated with some degree of confidence by relying on historical data.

Currently, describing the evolution of such processes by imposing a stationary model on the conditional distribution of returns is one of the most popular approaches in the econometric modeling of financial time series. This approach expresses the conventional wisdom that models for financial returns should adequately represent the nonlinear dynamics that are demonstrated by the sample autocorrelation and cross-correlation functions of these time series. ARCH (autoregressive conditional heteroscedasticity) and GARCH (generalized ARCH) models of Engle [27] and Bollerslev [17] have been popular and successful tools for future volatility estimation. For the multivariate case, rich classes of stationary models that generalize the univariate GARCH models have also been developed; see, for example, the comprehensive survey by Bollerslev et al. [18].

The main mathematical problem to be solved in fitting ARCH and GARCH models to observed data is the determination of the best model parameters that maximize a likelihood function, i.e., an optimization problem. See Nocedal and Wright [61], page 255, for a short discussion of maximum likelihood estimation.

Typically, these models are presented as unconstrained optimization problems with recursive terms. In a recent study, Altay-Salih et al. [2] argue that because of the recursion equations and the stationarity constraints, these models actually fall into the domain of nonconvex, nonlinearly constrained nonlinear programming.

Their study shows that by using a sophisticated nonlinear optimization package (the sequential quadratic programming based FILTER method of Fletcher and Leyffer [29] in their case) they are able to significantly improve the log-likelihood functions for multivariate volatility (and correlation) estimation. While their study does not compare the forecasting effectiveness of the standard approaches to that of the constrained optimization approach, the numerical results suggest that the constrained optimization approach provides a better prediction of the extremal behavior of the time series data; see [2]. Here, we briefly review this constrained optimization approach for expository purposes.

We consider a stochastic process Y indexed by natural numbers. Yt, its value at time t, is an n-dimensional vector of random variables. Autoregressive behavior of these random variables is modeled as

$$Y_t = \sum_{i=1}^{m} \phi_i Y_{t-i} + \varepsilon_t, \qquad (6.1)$$

where m is a positive integer representing the number of periods we look back in our model and εt satisfies

$$E[\varepsilon_t \mid \varepsilon_1, \ldots, \varepsilon_{t-1}] = 0.$$

While these models are of limited value, if at all, in the estimation of the actual time series (Yt), they have been shown to provide useful information for volatility estimation. For this purpose, GARCH models define

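$$h_t := E[\varepsilon_t^2 \mid \varepsilon_1, \ldots, \varepsilon_{t-1}]$$

in the univariate case and

$$H_t := E[\varepsilon_t \varepsilon_t^T \mid \varepsilon_1, \ldots, \varepsilon_{t-1}]$$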

in the multivariate case. Then one models the conditional time dependence of these squared residuals in the univariate case as follows:

$$h_t = c + \sum_{i=1}^{q} \alpha_i \varepsilon_{t-i}^2 + \sum_{j=1}^{p} \beta_j h_{t-j}. \qquad (6.2)$$

This model is called GARCH(p, q). Note that ARCH models correspond to choosing p = 0.
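To make the recursion concrete, the following minimal Python sketch evaluates h_t = c + Σ_i α_i ε²_{t−i} + Σ_j β_j h_{t−j} for a given residual series. The function name `garch_variances` and the start-up value `h0`, used wherever the recursion reaches back before the first observation, are illustrative assumptions and not part of the model as stated.

```python
def garch_variances(eps, c, alpha, beta, h0):
    """Conditional variances h_t from the GARCH(p, q) recursion.

    eps   : list of residuals, in time order
    c     : constant term
    alpha : [alpha_1, ..., alpha_q], weights on lagged squared residuals
    beta  : [beta_1, ..., beta_p], weights on lagged variances (empty list => ARCH)
    h0    : assumed start-up value for terms that fall before the sample
    """
    q, p = len(alpha), len(beta)
    h = []
    for t in range(len(eps)):
        value = c
        for i in range(1, q + 1):
            # squared residual at lag i; fall back to h0 before the sample starts
            value += alpha[i - 1] * (eps[t - i] ** 2 if t - i >= 0 else h0)
        for j in range(1, p + 1):
            # conditional variance at lag j; fall back to h0 before the sample starts
            value += beta[j - 1] * (h[t - j] if t - j >= 0 else h0)
        h.append(value)
    return h


eps = [0.1, -0.2, 0.05, 0.3]
h = garch_variances(eps, c=0.01, alpha=[0.1], beta=[0.8], h0=0.04)   # GARCH(1, 1)
h_arch = garch_variances(eps, c=0.01, alpha=[0.1], beta=[], h0=0.04)  # ARCH(1), p = 0
```

Passing an empty `beta` list reproduces the ARCH special case p = 0 mentioned above.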

The generalization of the model (6.2) to the multivariate case can be done in a number of ways. One approach is to use the operator vech to turn the matrices Ht and εtεt^T into vectors. The operator vech takes an n × n symmetric matrix as an input and produces an n(n + 1)/2-dimensional vector as output by stacking the elements of the matrix on and below the diagonal on top of each other. Using this operator, one can write a multivariate generalization of (6.2) as follows:

$$\mathrm{vech}(H_t) = \mathrm{vech}(C) + \sum_{i=1}^{q} A_i \, \mathrm{vech}(\varepsilon_{t-i} \varepsilon_{t-i}^T) + \sum_{j=1}^{p} B_j \, \mathrm{vech}(H_{t-j}), \qquad (6.3)$$

where the Ai and Bj are n(n + 1)/2 × n(n + 1)/2 matrices for all i and j, and C is an n × n symmetric matrix.
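The vech operator is easy to illustrate. The sketch below stacks the entries on and below the diagonal column by column, which is the usual convention but is an assumption here since the text does not fix the ordering. The helper `unvech` is a hypothetical inverse added only to show that no information is lost for symmetric matrices.

```python
def vech(M):
    """Stack the entries on and below the diagonal of a square matrix, column by column."""
    n = len(M)
    return [M[i][j] for j in range(n) for i in range(j, n)]


def unvech(v, n):
    """Rebuild the n x n symmetric matrix from its vech vector (hypothetical helper)."""
    M = [[0.0] * n for _ in range(n)]
    k = 0
    for j in range(n):
        for i in range(j, n):
            M[i][j] = M[j][i] = v[k]
            k += 1
    return M


H = [[4.0, 1.0, 0.5],
     [1.0, 3.0, 0.2],
     [0.5, 0.2, 2.0]]
v = vech(H)  # n = 3, so the vector has n(n + 1)/2 = 6 entries
```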

After choosing a superstructure for the GARCH model, that is, after choosing p and q, the objective is to determine the optimal parameters φi, αi, and βj. Most often, this is achieved via maximum likelihood estimation. If one assumes a normal distribution for Yt conditional on the historical observations, the log-likelihood function can be written as follows [2]:

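$$-\frac{T}{2} \log(2\pi) - \frac{1}{2} \sum_{t=1}^{T} \log h_t - \frac{1}{2} \sum_{t=1}^{T} \frac{\varepsilon_t^2}{h_t} \qquad (6.4)$$

in the univariate case and

$$-\frac{Tn}{2} \log(2\pi) - \frac{1}{2} \sum_{t=1}^{T} \log \det H_t - \frac{1}{2} \sum_{t=1}^{T} \varepsilon_t^T H_t^{-1} \varepsilon_t \qquad (6.5)$$

in the multivariate case.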
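As a minimal sketch of the univariate case, the function below evaluates the Gaussian log-likelihood −(T/2) log(2π) − (1/2) Σ_t log h_t − (1/2) Σ_t ε_t²/h_t for given residuals and conditional variances. The name `gaussian_loglik` and the sample numbers are illustrative assumptions.

```python
import math


def gaussian_loglik(eps, h):
    """Gaussian log-likelihood of residuals eps given conditional variances h (all h_t > 0)."""
    T = len(eps)
    const = -0.5 * T * math.log(2.0 * math.pi)
    log_term = -0.5 * sum(math.log(ht) for ht in h)
    quad_term = -0.5 * sum(e * e / ht for e, ht in zip(eps, h))
    return const + log_term + quad_term


eps = [1.0, -1.0]
ll_matched = gaussian_loglik(eps, [1.0, 1.0])   # variances match the squared residuals
ll_inflated = gaussian_loglik(eps, [4.0, 4.0])  # variances too large for these residuals
```

For these two residuals the matched variances give the larger value, which is the trade-off between the log h_t term and the ε_t²/h_t term that the maximization exploits.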

Exercise 6.1 Show that the function in (6.4) is a difference of convex functions by showing that log ht is concave and εt²/ht is convex in εt and ht. Does the same conclusion hold for the function in (6.5)?

Now, the optimization problem to solve in the univariate case is to maximize the log-likelihood function (6.4) subject to the model constraints (6.1) and (6.2) as well as the condition that ht is nonnegative for all t since ht = E[εt²|ε1, . . . , εt−1].

In the multivariate case we maximize (6.5) subject to the model constraints (6.1) and (6.3) as well as the condition that Ht is a positive semidefinite matrix for all t since Ht defined as E[εtε^T_t|ε1, . . . , εt−1] must necessarily satisfy this condition.

The positive semidefiniteness of the matrices Ht can either be enforced using the techniques discussed in Chapter 9 or using a reparametrization of the variables via a Cholesky-type LDL^T decomposition as discussed in [2].
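The idea behind such a reparametrization can be sketched as follows: instead of optimizing over the entries of H directly, one optimizes over a unit lower triangular matrix L and a nonnegative diagonal D and sets H = L D L^T, so that x^T H x = Σ_k d_k (L^T x)_k² ≥ 0 for every x. The code below is only an illustration of this construction under that assumption. The exact parametrization used in [2] is not reproduced here, and the names `ldl_to_matrix` and `quad_form` are hypothetical.

```python
def ldl_to_matrix(l_below, d):
    """Build H = L D L^T from the strictly lower entries of a unit lower
    triangular L (listed row by row) and the diagonal entries d of D."""
    n = len(d)
    L = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    pos = 0
    for i in range(1, n):
        for j in range(i):
            L[i][j] = l_below[pos]
            pos += 1
    # (L D L^T)[i][j] = sum_m L[i][m] * d[m] * L[j][m]
    return [[sum(L[i][m] * d[m] * L[j][m] for m in range(n)) for j in range(n)]
            for i in range(n)]


def quad_form(H, x):
    """Evaluate the quadratic form x^T H x."""
    n = len(x)
    return sum(x[i] * H[i][j] * x[j] for i in range(n) for j in range(n))


# Bivariate example: one free entry in L, two nonnegative entries in D.
H = ldl_to_matrix([0.5], [2.0, 1.0])
values = [quad_form(H, x) for x in ([1.0, 0.0], [0.0, 1.0], [1.0, -1.0], [-2.0, 3.0])]
```

With this parametrization the only constraints left on the variables are the simple bounds d_k ≥ 0.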

An important issue in GARCH parameter estimation is the stationarity properties of the resulting model. There is a continuing debate about whether it is reasonable to assume that the model parameters for financial time series are stationary over time. It is clear, however, that estimation and forecasting are easier for stationary models. A sufficient condition for the stationarity of the univariate GARCH model above is that the αi’s and βj’s as well as the scalar c are strictly positive and that

$$\sum_{i=1}^{q} \alpha_i + \sum_{j=1}^{p} \beta_j < 1, \qquad (6.6)$$

see, for example, [35]. The sufficient condition for the multivariate case is more involved and we refer the reader to [2] for these details.
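The univariate estimation problem with the stationarity condition can be illustrated on a toy example. In the sketch below, a brute-force grid search stands in for the nonlinear programming solver used in [2]: it maximizes the Gaussian log-likelihood of a GARCH(1, 1) model over grid points that satisfy c > 0, α > 0, β > 0, and α + β < 1. The simulated data, the grid values, the choice of the sample second moment as the first variance, and all function names are assumptions made for illustration.

```python
import math
import random


def garch11_loglik(eps, c, a, b):
    """Gaussian log-likelihood of a GARCH(1, 1) model.
    The first variance is set to the sample second moment (an assumption)."""
    h = sum(e * e for e in eps) / len(eps)
    total = 0.0
    for e in eps:
        total += -0.5 * (math.log(2.0 * math.pi) + math.log(h) + e * e / h)
        h = c + a * e * e + b * h  # variance recursion for the next period
    return total


def is_stationary(c, alphas, betas):
    """Sufficient condition: strictly positive parameters with sum of alphas and betas below 1."""
    return c > 0 and all(x > 0 for x in alphas + betas) and sum(alphas) + sum(betas) < 1


def grid_search(eps, grid_c, grid_a, grid_b):
    """Return (loglik, c, a, b) for the best feasible grid point, or None if none is feasible."""
    best = None
    for c in grid_c:
        for a in grid_a:
            for b in grid_b:
                if not is_stationary(c, [a], [b]):
                    continue  # skip points that violate the constraints
                ll = garch11_loglik(eps, c, a, b)
                if best is None or ll > best[0]:
                    best = (ll, c, a, b)
    return best


# Simulate residuals from a GARCH(1, 1) process with c = 0.05, alpha = 0.1, beta = 0.8.
random.seed(0)
eps, h = [], 0.5
for _ in range(300):
    e = math.sqrt(h) * random.gauss(0.0, 1.0)
    eps.append(e)
    h = 0.05 + 0.1 * e * e + 0.8 * h

best = grid_search(eps, [0.01, 0.05, 0.1], [0.05, 0.1, 0.2, 0.4], [0.5, 0.7, 0.8, 0.9])
```

A grid search scales poorly with the number of parameters, which is one reason a constrained nonlinear programming solver is the natural tool once the model has more than a few parameters.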

Especially in the multivariate case, the problem of maximizing the log-likelihood function with respect to the model constraints is a difficult nonlinear, nonconvex optimization problem. To find a quick solution, more tractable versions of the model (6.3) have been developed where the model is simplified by imposing additional structure on the matrices Ai and Bj such as diagonality. While the resulting problems are easier to solve, the loss of generality from their simplifying assumptions can be costly. As Altay-Salih et al. [2] demonstrate, using the full power of state-of-the-art constrained optimization software, one can solve the more general model in reasonable computational time (at least for bivariate and trivariate estimation problems) with much improved log-likelihood values. While the forecasting efficiency of this approach is still to be tested, it is clear that sophisticated nonlinear optimization is emerging as a valuable tool in volatility estimation problems that use historical data.

Exercise 6.2 Consider the model in (6.3) for the bivariate case when q = 1 and p = 0 (i.e., an ARCH(1) model). Explicitly construct the nonlinear programming problem to be solved in this case. The comparable simplification of the BEKK representation [4] gives

$$H_t = C^T C + A^T \varepsilon_{t-1} \varepsilon_{t-1}^T A.$$

Compare these two models and comment on the additional degrees of freedom in the NLP model. Note that the BEKK representation ensures the positive semidefiniteness of Ht by construction at the expense of lost degrees of freedom.
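As a possible starting point for the comparison, the sketch below evaluates the BEKK form H_t = C^T C + A^T ε_{t−1} ε_{t−1}^T A for the bivariate case. Both terms are of the form M^T M or v v^T, so the result is symmetric and positive semidefinite for any choice of C and A. The function name `bekk_arch1` and the numerical values are illustrative assumptions.

```python
def bekk_arch1(C, A, eps_prev):
    """Evaluate H_t = C^T C + A^T eps eps^T A for square C, A and the previous residual vector."""
    n = len(eps_prev)
    # (C^T C)[i][j] = sum_k C[k][i] * C[k][j]
    CtC = [[sum(C[k][i] * C[k][j] for k in range(n)) for j in range(n)] for i in range(n)]
    # v = A^T eps, so that A^T eps eps^T A = v v^T
    v = [sum(A[k][i] * eps_prev[k] for k in range(n)) for i in range(n)]
    return [[CtC[i][j] + v[i] * v[j] for j in range(n)] for i in range(n)]


C = [[0.1, 0.02],
     [0.0, 0.1]]
A = [[0.3, 0.1],
     [0.05, 0.2]]
H = bekk_arch1(C, A, [0.5, -1.0])

# For a symmetric 2 x 2 matrix, nonnegative diagonal entries and determinant
# together imply positive semidefiniteness.
det_H = H[0][0] * H[1][1] - H[0][1] * H[1][0]
```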

Exercise 6.3 Test the NLP model against the model resulting from the BEKK representation in the previous exercise using daily return data for two market indices, e.g., S&P 500 and FTSE 100, and an NLP solver. Compare the optimal log-likelihood values achieved by both models and comment.

In document Optimization Methods in Finance (Page 126-130)