
4.3 Methods in volatility modeling and forecasting

4.3.1 HAR model

This section discusses in detail the HAR model introduced in Section 2. Long-memory dependence in financial market volatility is a long-established fact. Different models have been proposed to capture this behavior (see e.g. Section IV of Andersen, Bollerslev and Diebold, 2007, for a list of references). The HAR model is an outcome of this literature, in particular of the HARCH-class models (U. A. Müller, Dacorogna, Davé, Olsen et al., 1997), heuristically motivated by the heterogeneous market hypothesis (U. A. Müller, Dacorogna, Davé, Pictet et al., 1993): heterogeneous market participants trade over different investment horizons, coexisting and interacting within the same market. For example, high-frequency traders may be thought of as participants trading at intra-day horizons, whereas large institutional traders hold their positions over longer time horizons. The typical slow decay observed in the volatility autocorrelation, together with stylized facts about the distributions of returns and volatility, can be reproduced by mixing in a simple linear model only three volatility components, intuitively corresponding to the contributions to total daily volatility from trading on daily, weekly, and monthly horizons. Such a model, known as the Heterogeneous Autoregressive (HAR) model of Corsi (2009), is very attractive due to its simplicity in estimation and interpretation and its forecasting ability.

The daily latent volatility process $\tilde{\sigma}^{(d)}_t$ is modelled under a three-factor stochastic volatility specification. The factors are the past (realized) volatilities at different frequencies⁹. The three volatility terms identified in the HAR model are a daily component $d$, a weekly component $w$ and a monthly component $m$; these are referred to as partial-volatility terms, since each is specific to a given time horizon. The latent volatility $\tilde{\sigma}^{(\cdot)}$ at any time scale is assumed to be a (linear) function of the past observed realized variance at the same time scale¹⁰ and of the expectation of the next-period's longer-term partial volatility component¹¹. The hierarchical cascade assumption reads¹²:

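$$
\begin{aligned}
\tilde{\sigma}^{(d)}_{t+1d} &= c^{(d)} + \phi^{(d)} RV^{(d)}_{t} + \gamma^{(d)}\, \mathrm{E}\big[\tilde{\sigma}^{(w)}_{t+1w}\big] + \tilde{\omega}^{(d)}_{t+1d}, \\
\tilde{\sigma}^{(w)}_{t+1w} &= c^{(w)} + \phi^{(w)} RV^{(w)}_{t} + \gamma^{(w)}\, \mathrm{E}\big[\tilde{\sigma}^{(m)}_{t+1m}\big] + \tilde{\omega}^{(w)}_{t+1w}, \\
\tilde{\sigma}^{(m)}_{t+1m} &= c^{(m)} + \phi^{(m)} RV^{(m)}_{t} + \tilde{\omega}^{(m)}_{t+1m},
\end{aligned}
\tag{4.15}
$$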

where $RV^{(p)}_t$ is obtained by averaging the $p$ most recent daily realized variance estimates. Specifically, $RV^{(w)}_t = \frac{1}{5}\sum_{i=0}^{4} RV^{(d)}_{t-i}$ and $RV^{(m)}_t = \frac{1}{22}\sum_{i=0}^{21} RV^{(d)}_{t-i}$ are the weekly and monthly volatility components. Importantly, the error terms $\tilde{\omega}^{(d)}_{t+1d}$, $\tilde{\omega}^{(w)}_{t+1w}$ and $\tilde{\omega}^{(m)}_{t+1m}$ are serially independent and zero-mean, with a left tail truncated so as to guarantee positivity of the estimates.
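The aggregation just described can be sketched in a few lines. This is only an illustration under stated assumptions: a 5-day week and a 22-day month as in the definitions above, and a daily RV series stored oldest-first in a NumPy array. The function name `har_components` and its arguments are hypothetical and do not come from the source.

```python
import numpy as np

def har_components(rv_daily, weekly=5, monthly=22):
    """Return (RV_d, RV_w, RV_m), aligned on the same days.

    rv_daily: 1-D array of daily realized variances, oldest first.
    Row j of each output refers to day t = monthly - 1 + j, the first day
    for which a full monthly window is available.
    """
    rv = np.asarray(rv_daily, dtype=float)
    if rv.ndim != 1 or rv.size < monthly:
        raise ValueError("need a 1-D series with at least `monthly` observations")

    def trailing_mean(p):
        # Mean of RV_t, RV_{t-1}, ..., RV_{t-p+1} for every t >= p - 1.
        csum = np.cumsum(np.insert(rv, 0, 0.0))
        return (csum[p:] - csum[:-p]) / p

    start = monthly - 1
    rv_d = rv[start:]
    # Drop the early weekly means so that all three series start at day `start`.
    rv_w = trailing_mean(weekly)[start - (weekly - 1):]
    rv_m = trailing_mean(monthly)
    return rv_d, rv_w, rv_m
```

Aligning the three series on the first day with a complete monthly window means the daily and weekly components discard their earliest values; this is a design choice of the sketch that keeps the regressors of a later regression on a common sample.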

⁹ $RV^{(p)}_t$ denotes the realized variance estimated at time $t$ for the time horizon $p$.

¹⁰ This corresponds to an AR-like structure: eq. (4.15) does not involve lagged values of $\tilde{\sigma}^{(\cdot)}_t$, but rather their respective proxies, i.e. $RV^{(\cdot)}_t$.

¹¹ For $\tilde{\sigma}^{(m)}_t$ only a linear function of the monthly RV remains, i.e. the AR-like term.

¹² $t+1d$ reads as "(end of) day $t$ plus one day"; similarly, $+1w$ and $+1m$ respectively stand for a week and a month ahead w.r.t. day $t$. The RV are the actually observed ex-post values.

¹³ As pointed out in Section 2, the integrated volatility is the usual quantity of interest, as a synthesis

By setting $\tilde{\sigma}^{(d)}_t = \sigma^{(d)}_t$, with $\sigma^{(d)}_t$ being the square root of the integrated volatility¹³, $\big(\int_{t-1d}^{t} \sigma_s^2 \, ds\big)^{1/2}$, eq. (4.15) turns, by substitution, into:

$$
\sigma^{(d)}_{t+1d} = c + \beta^{(d)} RV^{(d)}_t + \beta^{(w)} RV^{(w)}_t + \beta^{(m)} RV^{(m)}_t + \tilde{\omega}^{(d)}_{t+1d}, \tag{4.16}
$$

which corresponds to the three-factor representation mentioned earlier. By introducing an error term $\omega^{(d)}_{t+1d}$ that accounts for both the measurement and the estimation errors associated with using RV as a proxy for $\tilde{\sigma}^{(d)}_{t+1d}$ (or, analogously, recalling that $RV^{(d)}_{t+1d}$ is not an error-free measure of $\sigma^{(d)}_{t+1d}$; Barndorff-Nielsen and Shephard, 2002), $\sigma^{(d)}_{t+1d}$ can be rewritten as

$$
\sigma^{(d)}_{t+1d} = RV^{(d)}_{t+1d} + \omega^{(d)}_{t+1d}. \tag{4.17}
$$

By substituting eq. (4.17) into eq. (4.16) and collapsing the respective error terms, the HAR-RV model reads:

$$
RV^{(d)}_{t+1d} = c + \beta^{(d)} RV^{(d)}_t + \beta^{(w)} RV^{(w)}_t + \beta^{(m)} RV^{(m)}_t + \omega_{t+1d}, \tag{4.18}
$$

where $\omega_{t+1d} = \tilde{\omega}^{(d)}_{t+1d} - \omega^{(d)}_{t+1d}$. This corresponds to an autoregressive model with autoregressive weights taking a step-function form, restricted in a parsimonious way such that the three emerging components are economically meaningful and interpretable (Corsi, 2009).
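To make the step-function statement concrete, the following sketch expands the three HAR coefficients of eq. (4.18) into the implied weights on the 22 daily lags, assuming the weekly and monthly components are 5-day and 22-day averages. The function name `har_to_ar_weights` and the coefficient values used in the example are purely illustrative and are not estimates from the source.

```python
import numpy as np

def har_to_ar_weights(beta_d, beta_w, beta_m, weekly=5, monthly=22):
    """Implied AR(monthly) weights of a HAR model on daily lags.

    Element i is the weight on RV_{t-i}, i = 0, ..., monthly - 1.
    """
    # Every lag inside the monthly window receives beta_m / monthly.
    weights = np.full(monthly, beta_m / monthly)
    # Lags inside the weekly window additionally receive beta_w / weekly.
    weights[:weekly] += beta_w / weekly
    # The most recent day additionally receives the full daily coefficient.
    weights[0] += beta_d
    return weights

# Illustrative coefficients only (hypothetical values).
w = har_to_ar_weights(0.4, 0.3, 0.2)
print(w[0], w[1], w[5])  # three plateaus: daily, weekly and monthly steps
```

The weights take only three distinct values and sum to $\beta^{(d)} + \beta^{(w)} + \beta^{(m)}$, which is why 22 autoregressive lags are spanned with just three free slope parameters.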

The standard estimation of the HAR-RV model is performed via OLS. To guarantee non-negativity of the volatility estimates, eq. (4.18) can be written and estimated in logs. To account for serial correlation, a common practice is to use the Newey-West covariance correction in the estimation. Note that the HAR model can be implemented over the preferred volatility measure, e.g. over the realized kernel as in Publication IV.
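A minimal sketch of this estimation step is given below, using NumPy only. It is not the implementation used in Publication IV. The function names (`ols_newey_west`, `fit_har`), the Bartlett-kernel form of the Newey-West estimator without small-sample correction, the default of 5 lags, and the choice of taking logs of the aggregated components when `use_logs=True` are all assumptions made for illustration.

```python
import numpy as np

def ols_newey_west(X, y, lags):
    """OLS coefficients with Newey-West (Bartlett kernel) standard errors."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    scores = X * resid[:, None]           # row t holds x_t * u_t
    S = scores.T @ scores                 # lag-0 (White) term
    for l in range(1, lags + 1):
        weight = 1.0 - l / (lags + 1.0)   # Bartlett weight
        G = scores[l:].T @ scores[:-l]    # lag-l cross products
        S += weight * (G + G.T)
    XtX_inv = np.linalg.inv(X.T @ X)
    cov = XtX_inv @ S @ XtX_inv           # sandwich covariance
    return beta, np.sqrt(np.diag(cov))

def fit_har(rv_daily, lags=5, use_logs=False):
    """Fit eq. (4.18) by OLS; returns (c, beta_d, beta_w, beta_m) and HAC s.e."""
    rv = np.asarray(rv_daily, dtype=float)

    def mean_last(p, t):
        return rv[t - p + 1 : t + 1].mean()

    # Days t with a full 22-day window and an observed next-day target.
    days = range(21, rv.size - 1)
    X = np.array([[1.0, rv[t], mean_last(5, t), mean_last(22, t)] for t in days])
    y = rv[22:].copy()
    if use_logs:
        # Assumption: logs are applied to the aggregated components.
        X[:, 1:] = np.log(X[:, 1:])
        y = np.log(y)
    return ols_newey_west(X, y, lags)
```

The point estimates are identical to plain OLS; only the standard errors change, which is the sense in which the Newey-West correction accounts for serial correlation in the residuals.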

As pointed out in Section 1.2, the discussion in Publication IV is based on some critical aspects of the HAR model. Here, I summarize them by referring to the above construction.
(i) Linearity of the equations in (4.15), and thus of the linkage between the components involved in each equation.
(ii) The error terms $\tilde{\omega}^{(d)}_{t+1d}$ are (a) mutually independent, (b) zero-mean, (c) left-tail truncated to guarantee positivity of the estimates.
(iii) Independence of the $\beta^{(\cdot)}$ coefficients in eq. (4.16) over time.
(iv) Positivity of the estimates in eq. (4.15), as Corsi (2009) suggests, can be achieved with an alternative specification of the model in eq. (4.18) in terms of log-RV (which is common practice). By doing this, (a) eq. (4.17) is assumed to hold in logs as well, (b) non-log estimates are retrieved by bootstrapping, i.e. simulation.
(v) The HAR-RV model corresponds to a reparametrization of an AR model, with autoregressive weights taking a step-function form¹⁴: (a) this is a step-like approximation of the typical power-law decay in volatility, (b) a limited number of volatility terms can only resemble a portion of the overall long-range correlation, which involves a continuum of time scales.
(vi) (a) The presence of autocorrelation, heteroscedasticity and general non-stationarity in $\omega_{t+1d}$ requires attention under OLS estimation, e.g. by using HAC standard errors; (b) although normality of the residuals is, on a general level, not critical for OLS applicability, confidence intervals for the predictions (either for the mean response or for observations) are symmetric, while e.g. volatility shows skewness.
These points are further discussed in Section 3.1 of Publication IV.
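Point (iv)(b) can be illustrated with a simple residual-resampling scheme. The sketch below is one possible way of retrieving a non-log forecast from a log-HAR fit and is not claimed to be the procedure of Corsi (2009) or of Publication IV. The function name `back_transform_log_forecast`, the number of draws and the 95% interval level are assumptions chosen for illustration.

```python
import numpy as np

def back_transform_log_forecast(log_forecast, log_residuals, n_draws=10_000, seed=0):
    """Simulate RV forecasts from a log-scale point forecast.

    log_forecast : fitted value of log-RV for the next day.
    log_residuals: in-sample residuals of the log regression.
    Returns the simulated mean forecast of RV and a 95% percentile interval.
    """
    rng = np.random.default_rng(seed)
    resid = np.asarray(log_residuals, dtype=float)
    # Resample residuals with replacement and add them to the log forecast.
    draws = rng.choice(resid, size=n_draws, replace=True)
    simulated_rv = np.exp(log_forecast + draws)
    interval = np.percentile(simulated_rv, [2.5, 97.5])
    return simulated_rv.mean(), interval
```

Because the exponential is convex, the simulated mean is typically larger than the naive value `exp(log_forecast)`, and the percentile interval is not forced to be symmetric around the point forecast, which relates to the remark on skewness in point (vi)(b).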

Publication IV relies on some critical aspects of some of the above key points in the HAR model construction. These are not to be seen as a criticism but rather as starting points for reasoning over possible limitations of the HAR model and for developing possible extensions. Indeed, HAR's simplicity, its ability to reproduce several stylized features empirically observed in the markets and its good predictive ability broadly motivate its widespread use.