Models of Financial Return With Time-Varying Zero Probability

(1)

MPRA

Munich Personal RePEc Archive

Models of Financial Return With

Time-Varying Zero Probability

Genaro Sucarrat and Steffen Grønneberg

BI Norwegian Business School, BI Norwegian Business School

17 January 2016

Online at

https://mpra.ub.uni-muenchen.de/68931/

(2)

Models of Financial Return With Time-Varying Zero Probability*

Genaro Sucarrat_{and Steen Grnneberg}

17 January 2016

Abstract

The probability of an observed nancial return being equal to zero is not nec-essarily zero. This can be due to price discreteness or rounding error, liquidity issues (e:g: low trading volume), market closures, data issues (e:g: data impu-tation due to missing values), characteristics specic to the market, and so on. Moreover, the zero probability may change and depend on market conditions. In standard models of return volatility, however, e:g: ARCH, SV and continu-ous time models, the zero probability is zero, constant or both. We propose a new class of models that allows for a time-varying zero probability, and which can be combined with standard models of return volatility: They are nested and obtained as special cases when the zero probability is constant and equal to zero. Another attraction is that the return properties of the new class (e:g: volatility, skewness, kurtosis, Value-at-Risk, Expected Shortfall) are obtained as functions of the underlying volatility model. The new class allows for au-toregressive conditional dynamics in both the zero probability and volatility specications, and for additional covariates. Simulations show parameter and risk estimates are biased if zeros are not appropriately handled, and an appli-cation illustrates that risk-estimates can be substantially biased in practice if the time-varying zero probability is not accommodated.

JEL Classication: C01, C22, C32, C51, C52, C58

Keywords: Financial return, volatility, zero-inated return, GARCH, log-GARCH, ACL

*_{We are grateful to participants at the SNDE Annual Symposium 2015 (Oslo) and IAAE}

Con-ference (Thessaloniki) for useful comments, suggestions and questions.

_{Corresponding author. Department of Economics, BI Norwegian Business School. Email:}

[email protected]. Webpage: http://www.sucarrat.net/

(3)

Contents:

1 Introduction 2

2 Financial returns with time-varying zero probability 4

2.1 The ordinary model of return . . . 4

2.2 The model of return with time-varying zero probability . . . 4

2.3 Some general properties . . . 5

3 Models of 1t and t 8

3.1 Models of 1t . . . 8

3.2 Models of t . . . 9

3.3 A joint model of 1t and twith two-way feedback . . . 10

4 Simulations 11

4.1 The eect on parameter and risk estimates . . . 11

4.2 A missing values algorithm . . . 12

5 Empirical application 13 6 Conclusions 15 References 17 A Derivation of properties 7 to 10 18 A.1 Pdfs and cdfs . . . 18 A.2 Quantiles . . . 18

A.3 Tail expectations . . . 18

B Missing values estimation algorithm 19

C Estimation of the joint model of 1t and t with two-way feedback 20

1 Introduction

It is well-known that the probability of an observed nancial return being equal to zero is not necessarily zero. This can be due to price discreteness and/or rounding error, liquidity issues (e:g: low trading volume), market closures, data issues (e:g: data imputation due to missing values), characteristics specic to the market, and so on. Moreover, the zero probability may change and depend on market condi-tions. In standard models of nancial return volatility, however, the probability of a zero return is either zero, or non-zero but constant. Examples include the

Autore-gressive Conditional Heteroscedasticity (ARCH) class of models ofEngle (1982), the

Stochastic Volatility (SV) class of models (seeShephard (2005)) and continuous time

models (e:g: Brownian motion).1 _{Hausman et al.} ₍₁₉₉₂_{) relaxed the constancy}

as-sumption by allowing the zero probability to depend on other conditioning variables (e:g: volume, duration and past returns) in a probit framework. This was then

ex-tended in two dierent directions byEngle and Russell(1998), andRussell and Engle

(2005), respectively. The latter in particular provides a comprehensive framework,

since there price-changes are modelled by an Autoregressive Conditional Multinomial

(4)

(ACM) model coupled with a continuous time model of the durations between trades.

However, as pointed out byLiesenfeld et al. (2006), there are several limitations and

drawbacks with this approach. Instead, they propose a dynamic integer count model,

which is extended to the multivariate case inBien et al.(2011). Finally,Rydberg and

Shephard (2003) propose a framework in which the price increment is decomposed multiplicatively into three components: Activity, direction and integer magnitude.

Even though discrete models in many cases may provide a more accurate charac-terisation of observed returns, the most common models in empirical practice { e:g: ARCH, SV and continuous time models { are continuous. Arguably, the discreteness-point that causes the biggest problem for continuous models is located at zero. This is because zero often is the most frequently observed single value { particularly in intraday data, and because its probability is often time-varying and dependent on random or non-random events (e:g: periodicity), or both. A time-varying zero prob-ability invalidates the parameter and risk estimates of continuous models, since the underlying estimation theory relies on the assumption that the conditional density is identical over time. We propose a new class of nancial return models that allows for a time-varying conditional probability of a zero return. The new class decom-poses return multiplicatively into a continuous part, which can be specied in terms of common volatility models, and a discrete part at zero. Standard volatility models (e:g: ARCH, SV and continuous time models) are therefore nested and obtained as

special cases when the zero probability is constant and equal to zero. Hautsch et al.

(2013) proposed a model for positively valued variables (e:g: volume) that uses a

similar decomposition to ours. However, their dynamics is governed by a (restricted) log-GARCH specication, and by a specic conditional density. Our model is much more general. The volatility dynamics need not be specied as a log-GARCH, and the continuous density (in squared return) need not be a Generalised F . In fact, their model is nested and obtained as a very specic case in our model class, when the log-volatility dynamics is interpreted as a Multiplicative Error Model (MEM) (seeBrownlees et al. (2012) for a recent survey of MEM models). Another attraction of our model class is that many return properties (e:g: conditional volatility, return skewness, Value-at-Risk and Expected Shortfall) are readily obtained as functions of the underlying volatility model. Moreover, our model allows for autoregressive conditional dynamics in both the zero probability and volatility specications, and for a two-way feedback between the two. In the absence of a feedback eect from volatility to the zero probability specication, then estimation becomes particularly simple, since the model of zero probability and the model of volatility can then be estimated separately. The model is readily extended to include additional condition-ing variables (e:g: leverage, volume, duration, spreads, volatility proxies, periodic-ity/seasonality terms, etc.) in the zero probability or volatility specications, or in both, and by introducing new endogenous variables (e:g: volume, durations, spreads, volatility proxies, etc.) to form a complete dynamic system. Simulations show that common volatility models are inconsistently estimated by common methods if the zero probability is time-varying, and that estimates of risk (i:e: conditional volatil-ity, Value-at-Risk and Expected Shortfall) are biased upwards. Finally, an empirical illustration shows that risk estimates can be substantially biased in practice if the time-varying zero probability is not accommodated appropriately.

(5)

The rest of the paper is organised as follows. Section2presents the new class and

derives some general properties. Section3 proposes specic models of the zero

prob-ability and of the volatility, and a joint model that allows for a two-way feedback.

Section 4 contains a Monte Carlo study of the parameter and risk estimation bias

induced in some common models of volatility, when the time-varying zero probability

is not appropriately accommodated. Section 5 contains our empirical application,

whereas Section 6 concludes. The Appendix contains auxiliary derivations, and

ad-ditional material and simulations. Tables and gures are located at the end.

2 Financial returns with time-varying zero

proba-bility

2.1 The ordinary model of return

The ordinary model of a nancial return rt (possibly mean-corrected) is given by

rt = twt; wt IID(0; w2); Pt 1(wt= 0) = 0; t 2 Z; (1)

where t> 0 is a time-varying scale or volatility (that needs not equal the conditional

standard deviation), wt2 R is an Independentically and Identically Distributed (IID)

innovation conditional on the past It 1 and Pt 1(wt = 0) is the zero probability of

wt conditional on the past. The subscript t 1 is thus notational shorthand for

conditioning on past information It 1. We refer to (1) as an \ordinary" model of

return, since the zero probability of return rt is 0 and constant. An example of an

ordinary model is the GARCH(1,1) ofBollerslev (1986), where

2

t = 0+ 1rt 12 + 1t 12 ; wt IID(0; w2 = 1): (2)

Another example is the Stochastic Volatility (SV) model, where

ln 2

t = 0 + 1ln 2t 1+ vt 1; vt IID(0; v2); (3)

with vi being independent of wj for all pairs i; j. Other examples include quadratic

variation (e:g: Brownian motion) and other continuous time notions of volatility,

the log-GARCH class proposed independently by Geweke (1986), Pantula (1986)

and Milhj (1987), the EGARCH model of Nelson (1991), the mixed data sampling

(MIDAS) regression of Ghysels et al. (2006), and the Dynamic Conditional Score

(DCS)/Generalised Autoregressive Conditional Score (GAS) models ofHarvey(2013)

and Creal et al.(2013).

2.2 The model of return with time-varying zero probability

The model of return with time-varying conditional zero probability is given by

rt = tzt; zt = wtIt1t1=2; wt IID(0; w2); Pt 1(wt= 0) = 0; (4)

(6)

The variable Itdetermines whether return rtis zero or not. If It = 1, then rt6= 0 with

probability 1, and if It= 0, then rt= 0. The probability of a zero return conditional

on the past is thus 0t = 1 1t. For convenience we will sometimes refer to 1t (and

transformations thereof, e:g: ht = ln(1t=0t)) as the zero probability, since 0t can

straightforwardly be obtained via 1t (and transformations thereof, e:g: 0t = 1 1t).

The symbolism It ? wt means It and wtare independent at t conditional on the past.

The motivation for letting 1t enter the way it does in zt, is that this ensures that

V art 1(z) = 2w (see Property 2 in Section2.3).

In (4), t can be specied as a wide range of volatility models in terms of the

zero-adjusted return

ert= twt: (6)

We refer to this quantity as \zero-adjusted" return, since ert= rt1=21t whenever It= 1.

For example, the GARCH(1,1) model in terms of zero-adjusted return is given by

2

t = 0+ 1er2t 1+ 1t 12 ; (7)

whereas the zero-adjusted log-GARCH(1,1) model is given by

ln 2

t = 0+ 1ln er2t 1+ 1ln 2t 1: (8)

In both cases their ordinary counterparts are obtained as special cases when 1t

is constant and equal to 1. In empirical practice we observe rt rather than the

zero-adjusted return ert. But for a given set of values of 1t (or estimates of 1t,

rather), we can obtain ert (or an estimate of ert, rather) whenever It = 1, since then

we have ert= rt1t1=2. Whenever It= 0, the zero-adjusted return ertwill be unobserved

or \missing". Nevertheless, algorithms that handles missing values can be used for estimation and inference. Details of how we implement estimation with missing values

is given in Section 3.2 and in the Appendix . An alternative way of specifying t,

which avoids the need for an algorithm that handles missing values, is to let the zero-adjusted return enter the volatility equation only at non-zero locations. This

is for example the strategy employed by Hautsch et al. (2013) in their model of

positively-valued variables (e:g: volume). For example, a simplied version of their

log-GARCH(1,1) specication is ln 2

t = 0+ 1(ln ert 12 )It 1+ ln t 12 .

2.3 Some general properties

An attractive feature of the model (4){(5) is that many of its properties follow

straightforwardly (assuming the quantities in question exist) as a function of the underlying models of volatility and zero probability. These properties are:

1. Ordinary models of return are obtained when the zero probability is constant

and equal to zero, i:e: when 1t = 1, for all t.

2. Although zt is not IID conditional on the past when 1t is non-constant, zt

does have a constant conditional mean Et 1(zt) equal to zero, and a constant

conditional second moment V art 1(zt) equal to 2w. As a consequence, the

return series frtg remains a martingale dierence sequence even though 1t is

(7)

3. Both the conditional and unconditional second-moment properties of the un-derlying volatility model are retained:

V art 1(rt) = t2w2; and V ar(rt) = E(t2)w2: (9)

In a zero-adjusted stationary GARCH(1,1), for example, where the

identi-ability condition is 2

w = 1, we have that V art 1(rt) = t2 and V ar(rt) =

0=(1 1 1). This holds regardless of whether 1t is constant or

time-varying.

4. The conditional second-moment assuming return is non-zero is given by

V ar(rtjIt= 1; It 1) = t21t12w: (10)

In other words, conditional volatility is scaled upwards under the assumption that return is non-zero, and the more so the higher the conditional zero proba-bility 0t.

5. The sth. conditional moment of return is given by

Et 1(rst) = ts1t(2 s)=2E(wst): (11)

Higher order (i:e: s > 2) conditional moments (in absolute value) are thus scaled upwards by positive conditional zero probabilities, whereas the opposite is the case for lower order (i:e: s < 2) conditional moments (in absolute value). In particular, conditional skewness (s = 3) and conditional kurtosis (s = 4) become

more pronounced when 0t > 0, whereas Et 1(rt) in absolute value is usually

scaled downwards since Ejwtj < 1 for most densities of empirical relevance.

6. If there is no feedback between 1t and t, and if their unconditional moments

exist, then the sth. unconditional moment is

E(rs

t) = E(ts)E(1t(2 s)=2)E(wts): (12)

The eect is thus similar to that of conditional moments: Higher order (i:e: s > 2) unconditional moments (in absolute value) are scaled upwards by positive zero probabilities, whereas the opposite is the case for lower order (i:e: s < 2) unconditional moments (in absolute value)

7. The probability density function (pdf) and cumulative distribution function

(cdf) of zt conditional on the past are, respectively, given by

fz(zt) =

3=2_1t fw(zt1t1=2) if zt6= 0;

(1 1t) if zt= 0; (13)

Fz(zt) = 1tFw(zt1t1=2) + 1fzt0g(1 1t); (14)

where fw is the pdf of wtconditional on the past, Fw is the cdf of wt conditional

on the past and 1fzt0g is an indicator function equal to 1 if zt 0 and 0

(8)

8. The pdf and cdf of rt conditional on the past are fr(rt) = _1t3=2fer(rt1t1=2) if rt 6= 0; (1 1t) if rt= 0; (15) Fr(rt) = 1tFer(rt1t1=2) + 1frt0g(1 1t); (16)

where fer(ert) = t1fw(ert=t) is the pdf of ert conditional on the past, Fer(ert) =

Fw(ert=t) is the cdf of ert conditional on the past and 1frt0g is an indicator

function equal to 1 if rt 0 and 0 otherwise.

9. If Fw is strictly increasing and c 2 (0; 1), then the cth. quantile of zt and rt

conditional on the past are

zc;t = Fz 1(c) = 8 > < > : _1t1=2F 1 w (c=1t) if c < Fw(0)1t 0 if Fw(0)1t c < Fw(0)1t+ 0t _1t1=2F 1 w h (c 0t) 1t i if c Fw(0)1t+ 0t; (17) rc;t = Fr 1(c) = 8 > < > : _1t1=2F 1 er (c=1t) if c < Fer(0)1t 0 if Fer(0)1t c < Fer(0)1t+ 0t _1t1=2F 1 er h (c 0t) 1t i if c Fer(0)1t+ 0t: (18)

The conditional (100 c)% Value-at-Risk (VaRc) of zt and rt, respectively, are

therefore dened as zc;t and rc;t. Note that, since F_er1(x) = tFw1(x),

equation (18) can be written as

rc;t = tFz 1(c): (19)

This is particularly convenient when the density of ert is unknown (e:g: when

estimation is by QML).

10. If Fw is strictly increasing and c > 0, then the (100 c)% Expected Shortfall

(ESc) of zt and rt conditional on the past are given by Et 1(ztjzt zc;t) and

Et 1(rtjrt rc;t), respectively, where Et 1(ztjzt zc;t) = _c1tEt 1 wt1fwtFw1(c=1t)g if c < Fw(0)1t; (20) Et 1(rtjrt rc;t) = _c1tEt 1 ert1fertF_r_e1(c=1t)g if c < Fer(0)1t: (21)

Note that, since F 1

er (x) = tFw1(x), equation (21) can also be written as

Et 1(rtjrt rc;t) = tEt 1(ztjzt zc;t) if c < Fer(0)1t: (22)

This formulation is particularly convenient when the density of ert is unknown

(9)

3 Models of

1t

and

t

If 0 < 1t < 1, then it follows from (15) that the log-likelihood at t conditional on

the past can be written as

ln fr(rt) = Itln fer(rt1t1=2) + Itln 3=21t + (1 It) ln(1 1t); (23)

= Itln fer(ert) + Itln 1t+ (1 It) ln(1 1t); (24)

since fer(ert) = 1t1=2fer(rt1t1=2) when It = 1. The total log-likelihood is therefore given

byPn_t=1ln fr(rt) = LogL + LogL, where

LogL =

n X

t=1

Itln fer(ert) and LogL =

n X

t=1

Itln 1t+ (1 It) ln(1 1t): (25)

When there is no feedback from t to 1t, then the models of t and 1t, respectively,

can be estimated in two separate steps. First, 1t can be estimated by maximising

LogL. Second, the tted values of 1t can be used to generate estimates of ert, which

can subsequently be used to estimate t by maximising LogL. In the next two

subsections we consider such models that can be estimated in two separate steps.

Then, in subsection3.3, we propose a joint model of 1t and twith feedback eects.

3.1 Models of

1t

The model (4)-(5) admits a wide range of specications of 1t. In the simplest, 1t is

constant and can be estimated by computing the fraction of non-zero returns. Another simple specication, which may be particularly useful in the presence of periodic, say,

intraday zero probabilities, is 1t = P24i=1idit, where the dit's are dummy variables

associated with the hours of the day. Then each i is readily estimated by computing

the fraction of non-zero returns in each hour of the day. The logistic representation

of this model is ht = 0 +P24i=2idit, where 1t = 1=(1 + exp( ht)). A third class

of models that is straightforwardly estimated is one in which 1t is a function of a

deterministic time-trend, say, ht = 0 + t. The motivation for this specication

is that market developments (e:g: the inux of high-frequency algorithmic trading, increased trading volume, increased quoting frequency, lower tick-size, etc.) may have reduced the probability of zeros in a gradual and monotonous way.

In many situations the models just described are not sucient to adequately

capture the zero probability dynamics. The 1t may be autoregressively dependent

and/or determined by other variables (e:g: volume, news, etc.), either instead of or in addition to periodicity eects and trends. For computational simplicity one could

consider modelling all this in a linear probability model with It as the left-hand side

variable, with the usual problems that this entails (e:g: tted probabilities outside the unit interval). Another option, which avoids the drawbacks of the linear probability

model, is the dynamic logit model proposed by Hautsch et al. (2013) for trading

(10)

(ACM) model byRussell and Engle (2005), and its specication is ht= 0+ K X k=1 kst k + L X l=1 lht l; st = p It 1t 1t(1 1t); st 2 R; (26)

where st conditional on the past is IID(0; 1). We will henceforth refer to (26) as an

Autoregressive Conditional Logit (ACL) model. The fhtg is an ARMA process, so

in the rst order case, i:e: ht = 0 + 1st 1+ 1ht 1, each parameter has the usual

ARMA-like interpretation. The 0 controls the level of the unconditional probability

E(1t), whereas the 1 controls the impact of a shock st: The larger in absolute value,

the greater the departure from recent values of 1t. The 1 is a persistence parameter

(j1j < 1 entails stability): The closer to 1, the higher persistence of 1t. Finally,

the ACL can be augmented with additional conditioning information or covariates by

including them in the ht speciction, yielding an ACL-X model of 1t.

3.2 Models of

t

When there is no feedback from t to 1t, then the former can be estimated in

a second step conditional on the estimates of 1t. One strategy in formulating a

specication for t, or ln t, is to simply skip the zeros. An example of this isFrancq

et al.(2013), where the log-GARCH(1,1) specication (without asymmetry) is ln 2

t =

0+ 1er2t 1It 1+ 1ln t 12 . This, in fact, this is equivalent to replacing a zero on ert with the value 1 (rather than a small non-negative value, which is the more common approach), and may not the best solution empirically since it creates a \jumpy" or erratic contribution from the log-ARCH term. A similar eect is induced in the

GARCH model if the ARCH-term er2

t 1is replaced by er2t 1It 1. Moreover, if the model

relies on the assumption that Pt 1(wt = 0) = 0, then parameter estimates will in fact

be biased, seeSucarrat and Escribano(2013). An alternative strategy is to treat zeros

as \missing values". The main advantage with this approach is that the contribution of the ARCH-term is less erratic, and that the properties of the volatility model in question carry over more straighforwardly. In particular, the properties derived in

Section2.3 are easier to exploit.

We propose the following general procedure to estimate models of twhile treating

ert as missing at zero locations:

1. Record the locations at which the observed return rt is zero and non-zero,

respectively. Use these locations to estimate 1t by maximising LogL.

2. Obtain an estimate of ertby multiplying rtwith b1=21t , where b1tis the tted value

of 1t from Step 1. At zero locations the zero-adjusted return ert is unobserved

or \missing".

3. Use an estimation procedure that handles missing values to estimate the

volatil-ity model t by maximising LogL.

Sucarrat and Escribano(2013) propose an algorithm for the log-GARCH model where missing values are replaced by estimates of the conditional expectation. If Gaussian (Q)ML is used for estimation, then this can be viewed as a variant of the Expectation

(11)

Maximisation (EM) algorithm. A similar algorithm can be devised for many addi-tional volatility models, including the GARCH model. The Appendix contains the

details of this algorithm, whereas Section4.2 contains a small simulation study that

compares its accuracy with ordinary methods.

Additional conditioning variables or covariates (\X"), e:g: past values of leverage, volume, duration and periodicity/seasonality terms, can be added to the GARCH and log-GARCH specications, respectively, to form GARCH-X and log-GARCH-X

models, see Han and Kristensen (2014), Francq and Thieu (2015), Sucarrat et al.

(2015), and Francq and Sucarrat (2015). A zero-adjusted version of the

log-GARCH-X model constitutes a particularly attractive alternative, since it does not impose any negativity restrictions on the parameters, and since estimation, inference and missing values can be handled via its ARMA-X representation.

3.3 A joint model of

1t

and

t

with two-way feedback

If there is feedback from t to 1t, then the models of 1t and t cannot be estimated

separately. Here, we propose a joint model with two-way feedback together with a QML estimation procedure. The model is a combination of the log-GARCH model and the ACL model, so we refer to it as a log-GARCH-ACL model. In general form the model is ln 2 t = 0+ P X p=1 1pln er2t p+ K X k=1 1kst k+ Q X q=1 1qln 2t q+ L X l=1 1lht l (27) ht = 0+ P X p=1 2pln er2t p+ K X k=1 2kst k+ Q X q=1 2qln t q2 + L X l=1 2lht l: (28)

The exponential specications ensure the positivity of t and 1t, and enable more

exible dynamics (e:g: parameters can be negative). Also, the logit-specication

ensures that 1t 2 (0; 1). The motivation for the log-GARCH specication for ln 2t

instead of, say, the EGARCH ofNelson (1991), is that the latter is not amenable to

QML estimation and inference in the presence of missing values of ert.2 Also, it is not

clear that an ML-based procedure for the EGARCH would yield consistent estimates

due to invertibility issues, seeWintenberger (2013), since the invertibility-problem is

in fact compounded in the presence of missing values.

For notational economy we will hereafter work with the rst order specication only. The rst order version of the model can be written as

yt= ! + A1vt 1+ B1yt 1; (29) where yt= (ln t2; ht)0, ! = (10; 10)0, vt = (ln er2t; st)0, A1 = 11 11 21 21 and B1 = 11 11 21 21 : (30) If jE(ln w2

t)j < 1, which is usually the case for the most commonly used densities in

(12)

nance (e:g: the Student's t and the GED), then a stability condition of the system

is that all the solutions of j12 (A1 D + B1)cj = 0 are greater than one in modulus,

where D = 1 0 1 0 (31)

and A1 D is the Hadamard product of A1 and D. The estimation procedure that we

propose is based on QML estimation of the VARMA representation. The motivation for this is to get the system into a form such that missing values can be handled

with the algorithm outlined in Section 3.1. The details of the estimation procedure,

together with simulations, are contained in the Appendix .

Conditioning variables or covariates (\X") can be added to either (27) or (28), or

both, since the transformation to the VARMA representation needed for estimation is not aected by the additional variables. The model can also be extended with additional endogenous variables like (say) volume, volatility proxies, volatilities from other return series, and so on. Specically, let yt = (ln t2; ht; ln 3t2; : : : ; ln Mt2 )0 and

xt be the vectors of endogenous and exogenous variables, respectively, where the time

index in xt does not necessarily mean that all (or any) of the exogenous terms are

contemporaneous. The 2

3t; : : : ; 2Mt are either return-volatilities or the conditional

expectations of the square of a positively valued variable, e:g: volume. The rst order version of the system (generalisation to higher orders is straightforward) can then be written as

yt= ! + A1vt 1+ B1yt 1+ C1xt; (32)

where ! = (10; 10; 30; : : : ; M0)0, vt = (ln er1t2; st; ln er23t; : : : ; ln erMt2 ; )0, and where

A1, B1 and C are appropriately sized coecient matrices. The additional

vari-ables er3t; : : : ; erMt either take on values in the whole real space (e:g: return) or are

positively-valued (e:g: volume or duration). In the latter case the ln er2

t specication

is a logarithmic Multiplicative Error Model (MEM), see Brownlees et al. (2012) for

a survey of MEMs. The stability properties of the system can be investigated in the same way as earlier, and QML estimation and inference procedures are available via the VARMA-X representation.

4 Simulations

4.1 The eect on parameter and risk estimates

If the zero probability is time-varying, then the estimation theory of common volatility models is not valid. Here, we study how this aects parameter and risk estimates. In the simulations the Data Generating Process (DGP) of return is given by

rt = tItwt1t1=2; wt N(0; 1); t = 1; : : : ; n = 10000; (33)

where the 0-DGP is governed by a deterministic trend equal to

(13)

The term t _{= t=n is thus \relative" time with t} _{2 (0; 1]. We use three parameter}

congurations for the 0-DGP: (0; ) = (1; 0), (0; ) = (0:1; 3) and (0; ) = (0:2; 3).

These yield fractions of zeros over the sample equal to 0, 0.1 and 0.2, respectively. The DGPs of the GARCH and log-GARCH models, respectively, are given by

2

t = 0+ 1ert 12 + t 12 ; (35)

ln 2

t = 0+ 1ln er2t 1+ ln t 12 ; (36)

with (0; 1; 1) = (0:02; 0:1; 0:8) in each. In both cases estimation proceeds by

replacing er2

t with rt2in the recursions. For the log-GARCH, whenever rt2 = 0, its value

is set to 1 (i:e: the specication of Francq et al. (2013), but without asymmetry).

Estimation of the GARCH model is by Gaussian QML, whereas estimation of the

log-GARCH is by Gaussian QML via the ARMA-representation, see Sucarrat et al.

(2015).

The upper row of graphs in Figure1contains the average parameter biases, where

the bias in replication i is computed as average estimate_i truei (the no. of

repli-cations is 1000). The general tendency is clear: The higher the proportion of zeros,

the greater the bias. Seemingly, this is not the case for 0 in the GARCH model.

However, this is simply due to the y-scale of the graph, since closer inspection reveals that there is indeed a (small) upwards bias. Generally, the magnitude of the bias is smaller for the GARCH. The exact bias in the log-GARCH, however, depends on the value used to replace zeros. So a dierent replacement-value for zeros may in fact result in a smaller bias than for the GARCH. The middle and lower rows of graphs in

Figure1contain the Mean Percentage Errors (MPEs) of the risk estimates, computed

as 100 Pn_t=1(xt 1) in replication i, where xt = estimatet=truet. For example, the

volatility percentage error at t is 100 (xt 1) with xt = bt=t. The VaR and ES

graphs are for a risk level equal to 1%. Again, the graphs show a clear tendency: The higher the proportion of zeros, the greater the bias. Moreover, the bias is always positive.3

Additional simulations with fat-tailed wt, other values on 0; 1; 1 and risk level

(for VaR and ES), and with a dierent 0-DGP, produce similar results. These simu-lations are not reported, but are available on request.

4.2 A missing values algorithm

I order to study the nite sample bias of the algorithm outlined in Section 3.2, we

undertake a simulation study similar to that above. The DGP is exactly the same,

but estimation proceeds dierently. In the GARCH, whenever er2

t is zero, then it is

replaced by an estimate of its conditional expectation Et 1(ert2) at that point (see the

Appendix ). Estimation of the log-GARCH proceeds similarly, except that now it is

3_{This is in line with previous studies on the eect of discreteness.} _{Gottlieb and Kalay} ₍₁₉₈₅₎

found that variance estimates of daily stock returns were biased upwards due to discreteness, but their analysis assumed the nature of the discreteness was constant, and that the conditional density of returns was normal. Cho and Frees (1988) also found that volatility was overestimated in most cases, using a measure of how quickly prices change. Li and Mykland (2014) nd that Realised Volatility (RV) is biased and overestimates volatility in the presence of discreteness, and that the bias is increasing in the sampling frequency.

(14)

ln er2

t that is replaced by an estimate of E(ln er2t) whenever rt is zero.

Figure2contains the parameter biases for the GARCH(1,1) and log-GARCH(1,1)

models, respectively. A solid blue line stands for the bias produced by the algorithm, whereas a dotted red line stands for the bias of ordinary Gaussian QML estimation without zero-adjustment. The Figure conrms that the algorithm provides approxi-mately unbiased estimates in nite samples in the presence of missing values. Nomi-nally, the biases produced by the ordinary method may appear small. However, as we will see in the empirical applications, such small nominal dierences in the parameters can produces large dierences in the dynamics.

Additional simulations with fat-tailed wt, other values on 0; 1 and 1, and with

a dierent 0-DGP, produce similar results. These simulations are not reported, but are available on request.

5 Empirical application

In order to shed light on how returns with time-varying zero probabilities aect volatility dynamics, Value-at-Risk (VaR) and Expected Shortfall (ES) in practice, we

revisit three of the return series in Sucarrat and Escribano (2013). These series are

of interest, since they exhibit a variety of zero probability characteristics. The three series are the daily Standard and Poor's 500 stock market index (SP500) return, the daily Apple stock price return and the daily Ekornes stock price return. The rst two return series are well-known, whereas the third is a leading Nordic furniture manufacturer listed on the Oslo Stock Exchange. Ekornes is a medium-sized company in international terms, since its market value is approximately 300 million euros (at the end of the series). Our interest in Ekornes is due to its relatively large { for daily returns { proportion of zeros over the sample (about 19%). The source of the data

is Yahoo Finance (http://finance.yahoo.com). All three returns are computed as

(ln St ln St 1)100, where Stis the index level or stock price at day t. Saturdays and

Sundays, where returns are usually 0, are not included in our sample. Descriptive

statistics are contained in the upper part of Table 2. The statistics conrm that the

returns exhibit the usual properties of excess kurtosis compared with the normal, and ARCH as measured by rst order serial correlation in the squared return. The number of zeros varies from only 2 observations (about 0.1% of the sample) for SP500 to 667 observations (about 19% of the sample) for Ekornes.

The middle part of Table 2 contains estimates of three dynamic logit models for

each return. The three models are:

Constant: ht = 0;

Trend: ht = 0+ t; t = t=n; t 2 (0; 1];

ACL(1,1): ht = 0+ 1st 1+ 1ht 1:

In the rst model the zero probability is constant, in the second it is governed by

a deterministic trend (t _{is \relative time") and in the third it is an ACL(1,1). For}

SP500 returns, it is the rst logit specication that ts the data best according to the Schwarz (1978) information criterion (SIC). Accordingly, we use its tted values

(15)

best model according to SIC is the ACL(1,1).

The bottom part of Table2contains estimates of two GARCH(1,1) specications

for each return. These are

Ordinary: 2

t = 0+ 1rt 12 + 1t 12 ;

0-adj: 2

t = 0+ 1ert 12 + 1t 12 :

Estimation of the Ordinary specication proceeds by Gaussian QML without

adjust-ment of the observed returns rt. Estimation of the second specication is also by

Gaussian QML, but here the observed returns are replaced by estimates of the

zero-adjusted ones, ert, treating zeros as missing values. For SP500, in which there are only

2 zeros, the two sets of estimates are virtually identical. For the two other returns, by contrast, the nominal dierences vary between 0.003 and 0.007. These may appear small. However, as we will see shortly, these small nominal parameter dierences { together with the dierent treatment of zeros { can lead to substantially dierent risk measure dynamics.

Figure3contains graphs of the tted conditional zero probabilities b0t (upper row

of graphs), and ratios of the conditional risk measures from the two estimation meth-ods. The ratio of the tted conditional standard deviations (second row of graphs)

is computed as bt;Ordinary=bt;0-adj, i:e: the values from the Ordinary specication over

those from the 0-adjusted specication. The VaR and ES ratios (third and fourth rows of graphs) are computed similarly. The rst column of graphs are those of SP500. Unsurprisingly, since the SP500 return series only contain 2 zeros, and since the estimated parameters are virtually identical, the ratios are essentially equal to 1 throughout the sample. This is reected in the Mean Percentage Errors (MPEs),

computed as n 1Pn

t=1(xt 1) 100, where xt is the ratio in question. The second

column of graphs are those of Apple. The tted zero probability declines over the sample in a non-stationary fashion, and in the latter part it is essentially zero. This is reected in the ratio between the conditional standard deviations. In the rst part of the sample, until 1998 approximately, the conditional standard deviations of the Ordinary specication are about 3-5% higher, whereas in the latter part they are only about 1% higher (the MPE over the whole sample is 2.36%). Interestingly, there is not much dierence between the specications for the 1% VaR, and this holds throughout the sample. For the 1% ES, however, the values of the Ordinary specication start at about 8-10% higher. Then they fall steadily before stabilising towards the end at 3-4% lower than the values of the 0-adjusted specication. Finally, the third column of graphs are those of Ekornes. The evolution of the zero probabilities are very dif-ferent from the others' in two ways. First, they are substantially higher throughout the sample. Second, in contrast to those of Apple, the dynamics appears stationary. This is reected in the evolution of the ratios. The conditional standard deviations of the Ordinary method, for example, are { on average { 8.38% higher than those of the 0-adjustment method, and stably so throughout. Again, the 1% VaRs are, on average, very similar throughout the sample, and again the 1% ESs are dierent. However, this time the ESs are not only dierent, but substantially dierent, since the Ordinary specication yields a 1% ES that is on average 11.51% higher than that of the 0-adjusted specication { and this dierence appears stationary throughout the sample.

(16)

All in all, the empirical comparison reveals that the biased parameter estimates of the Ordinary method can lead to substantially dierent values on two common risk measures, namely volatility and ES. Moreover, the extent and sign (i.e. whether it is higher or lower) of the magnitude depend on how big the zero probability is, and on the exact nature of the zero probability dynamics (e:g: whether it is stationary or not).

6 Conclusions

We propose a new class of nancial return models that allows for a time-varying zero probability. A key feature of the new class is that standard volatility models (e:g: ARCH, SV and continuous time models) are nested and obtained as special cases when the zero probability is constant and equal to zero. Another attraction is that the properties of the new class (e:g: conditional volatility, skewness, kurtosis, Value-at-Risk, Expected Shortfall, etc.) are obtained as functions of the underlying volatility model. The new class allows for autoregressive conditional dynamics in both the zero probability and volatility specications, and for a two-way feedback between the two. In the absence of a feedback eect from volatility to the zero prob-ability specication, then estimation becomes particularly simple, since the model of zero probability and the model of volatility can then be estimated separately. Our empirical illustration shows that risk estimates can be substantially biased if the time-varying zero probability is not accommodated appropriately.

References

Ball, C. A. (1988). Estimation Bias Induced by Discrete Security Prices. Journal of Finance 43, 841{865.

Bauwens, L., C. Hafner, and S. Laurent (2012). Handbook of Volatility Models and Their Applications. New Jersey: Wiley.

Bien, K., I. Nolte, and W. Pohlmeier (2011). An inated multivariate integer count hurdle model: an application to bid and ask quote dynamics. Journal of Applied Econometrics 26, 669{707.

Bollerslev, T. (1986). Generalized autoregressive conditional heteroscedasticity. Jour-nal of Econometrics 31, 307{327.

Brownlees, C., F. Cipollini, and G. Gallo (2012). Multiplicative Error Models. In L. Bauwens, C. Hafner, and S. Laurent (Eds.), Handbook of Volatility Models and Their Applications, pp. 223{247. New Jersey: Wiley.

Cho, D. C. and E. W. Frees (1988). Estimating the Volatility of Discrete Stock Prices. Journal of Finance 43, 451{466.

Creal, D., S. J. Koopmans, and A. Lucas (2013). Generalized Autoregressive Score Models with Applications. Journal of Applied Econometrics 28, 777{795.

(17)

Engle, R. (1982). Autoregressive Conditional Heteroscedasticity with Estimates of the Variance of United Kingdom Inations. Econometrica 50, 987{1008.

Engle, R. F. and J. R. Russell (1998). Autoregressive Conditional Duration: A New Model of Irregularly Spaced Transaction Data. Econometrica 66, 1127{1162. Francq, C. and G. Sucarrat (2015). An Equation-by-Equation Estimator of a

Multivariate Log-GARCH-X Model of Financial Returns. https://mpra.ub.

uni-muenchen.de/67140/.

Francq, C. and L. Q. Thieu (2015). Qml inference for volatility models with covariates.

http://mpra.ub.uni-muenchen.de/63198/.

Francq, C., O. Wintenberger, and J.-M. Zakoan (2013). GARCH Models Without Positivity Constraints: Exponential or Log-GARCH? Forthcoming in Journal of

Econometrics, http//dx.doi.org/10.1016/j.jeconom.2013.05.004.

Geweke, J. (1986). Modelling the Persistence of Conditional Variance: A Comment. Econometric Reviews 5, 57{61.

Ghysels, E., P. Santa-Clara, and R. Valkanov (2006). Predicting volatility: getting the most out of return data sampled at dierent frequencies. Journal of Economet-rics 131, 59{95.

Gottlieb, G. and A. Kalay (1985). Implications of the Discreteness of Observed Stock prices. Journal of Finance 40, 135{153.

Han, H. and D. Kristensen (2014). Asymptotic theory for the QMLE in GARCH-X models with stationary and non-stationary covariates. Journal of Business and Economic Statistics, ??{??

Harvey, A. C. (2013). Dynamic Models for Volatility and Heavy Tails. New York: Cambridge University Press.

Hausman, J., A. Lo, and A. MacKinlay (1992). An ordered probit analysis of trans-action stock prices. Journal of nancial economics 31, 319{379.

Hautsch, N., P. Malec, and M. Schienle (2013). Capturing the zero: a new class of zero-augmented distributions and multiplicative error processes. Journal of

Finan-cial Econometrics. Forthcoming. http://jfec.oxfordjournals.org/content/

early/2013/03/29/jjfinec.nbt002.

Li, Y. and P. A. Mykland (2014). Rounding Errors and Volatility Estimation. Journal of Financial Econometrics. Forthcoming. doi:10.1093/jjnec/nbu005.

Liesenfeld, R., I. Nolte, and W. Pohlmeier (2006). Modelling Financial Transaction Price Movements: A Dynamic Integer Count Data Model. Empirical Economics 30, 795{825.

Ljung, G. and G. Box (1979). On a Measure of Lack of Fit in Time Series Models. Biometrika 66, 265{270.

(18)

Milhj, A. (1987). A Multiplicative Parametrization of ARCH Models. Research Report 101, University of Copenhagen: Institute of Statistics.

Nelson, D. B. (1991). Conditional Heteroskedasticity in Asset Returns: A New Ap-proach. Econometrica 59, 347{370.

Pantula, S. (1986). Modelling the Persistence of Conditional Variance: A Comment. Econometric Reviews 5, 71{73.

R Core Team (2014). R: A Language and Environment for Statistical Computing. Vienna, Austria: R Foundation for Statistical Computing.

Russell, J. R. and R. F. Engle (2005). A Discrete-State Continuous-Time Model of Financial Transaction Prices and Times: The Autoregressive Conditional Multinomial-Autoregressive Conditional Duration Model. Journal of Business and Economic Statistics 23, 166{180.

Rydberg, T. H. and N. Shephard (2003). Dynamics of Trade-by-Trade Price Move-ments: Decomposition and Models. Journal of Financial Econometrics 1, 2{25. Schwarz, G. (1978). Estimating the Dimension of a Model. The Annals of Statistics 6,

461{464.

Shephard, N. (2005). Stochastic Volatility: Selected Readings. Oxford: Oxford Uni-versity Press.

Sucarrat, G. and A. Escribano (2013). Unbiased QML Estimation of Log-GARCH

Models in the Presence of Zero Returns. MPRA Paper No. 50699. Online athttp:

//mpra.ub.uni-muenchen.de/50699/.

Sucarrat, G., S. Grnneberg, and A. Escribano (2013). Estimation and Inference in Univariate and Multivariate Log-GARCH-X Models When the Conditional Density

is Unknown. MPRA Paper No. 49344. Online athttp://mpra.ub.uni-muenchen.

de/49344/.

Sucarrat, G., S. Grnneberg, and A. Escribano (2015). Estimation and Infer-ence in Univariate and Multivariate Log-GARCH-X Models When the Condi-tional Density is Unknown. ComputaCondi-tional Statistics and Data Analysis.

Forth-coming. DOI: http://dx.doi.org/10.1016/j.csda.2015.12.005. Working

Pa-per version: http://mpra.ub.uni-muenchen.de/62352/.

Wintenberger, O. (2013). Continuous Invertibility and Stable QML Estimation of the EGARCH(1,1) model. Scandinavian Journal of Statistics 40, 846{867.

(19)

A Derivation of properties 7 to 10

A.1 Pdfs and cdfs

Let Xt = wtIt1t1=2, and let Pt 1(Xt xt) denote the cdf of Xt conditional on the

past. Then Pt 1(Xt xt) = Pt 1(wtIt1t1=2 xt) (37) (a) = Pt 1(wtIt1t1=2 xt; It = 1) + Pt 1(wtIt1t1=2 xt; It = 0)(38) (b) = Pt 1(wt1t1=2 xt; It= 1) + Pt 1(0 x; It = 0) (39) (c) = Pt 1(wt xtp1t)Pt 1(It= 1) + Pt 1(0 x)Pt 1(It= 0) (40) (d) = Pt 1(wt xtp1t)1t+ 10x0t; (41)

where we use: (a) P (A) = P (A \ B) + P (A \ Bc_{), (b) I}

t = 1 in wtIt1t1=2 in the

rst term and It = 0 in the second, (c) wt and It are independent conditional on the

past and (d) Pt 1(0 x) reduces to the indicator function for non-random events.

Replacing Xt with zt in the derivation gives (14), whereas replacing Xt with rt and

wt with ert gives (16). Next, the pdfs in (13) and (15) are obtained by dierentiating

the cdfs.

A.2 Quantiles

Let Xt = wtIt1t1=2 with cdf FX(xt) = Fw(xtp1t)1t+ 1fxt0g0t. We wish to nd

the generalised inverse of FX, given by FX1(c) = inffx 2 R : FX(x) cg. Suppose

rst that c < Fw(0)1t, so that xt< 0. Then

Fw(xtp1t)1t= c , xtp1t = Fw1(c=1t) , xt= 1t1=2Fw1(c=1t): (42)

Suppose now that c Fw(0)1t+ 0t. Then

Fw(xtp1t)1t+ 0t = c , Fw(xtp1t) = (c 0t)=1t; (43)

so that xt = 1t1=2Fw1[(c 0t)=1t]. Finally, if Fw(0)1t c < Fw(0)1t + 0t,

then xt = 0 by the denition of the generalised inverse. Replacing Xt with zt in the

derivation gives (17), whereas replacing Xt with rt and wt with ert gives (18).

A.3 Tail expectations

Let xt denote a realisation of the random variable Xt = wtIt1t1=2 at t, so that

FX(xt) = Pt 1(Xt xt). If 0 < c < FX(0)1t, then the c-level (lower) tail-expectation

of Xt conditional on the past is given by

1 cEt 1 Xt1fXtF_X1(c)g = 1 c Z AytdFX(yt); (44)

(20)

where A = ( 1; _1;t1=2F 1

w (c=1;t)). Because c < FX(0)1t, we have that F_X1(c) < 0, so that the area we integrate over only includes negative numbers. In this region FX(xt) = 1tFw(xtp1t) with derivative dFX(xt)=dxtequal to 1t3=2fw(xtp1t). Hence,

Et 1 Xt1_fX_t_F 1 X (c)g = _1t3=2 Z Aytfw(yt p 1t) dyt: (45)

Letting ut = ytp1t gives dyt= dut=p1t, so that the area of integration is changed

to ( 1; F 1 w (c=1t)). This gives Et 1 Xt1fXtF_X1(c)g = 1t Z _F 1 w (c=1t) 1 utfw(ut) dut (46) = 1tEt 1 wt1fwtFw1(c=1t)g ; (47) so that 1 cEt 1 Xt1_fX_t_F 1 X (c)g = _c1tEt 1 wt1_fw_F 1 w [c=1t])g : (48)

Replacing Xt with zt gives (20), whereas replacing Xt with rt and wt with ert gives

(21).

B Missing values estimation algorithm

In Section 3.2 we propose an estimation procedure where volatility models are

esti-mated in a second step conditional on estimates of ert from a rst step, treating zeros

as missing values. Here, we provide the details of how our missing values algorithm is implemented for the GARCH(1,1) and log-GARCH(1,1) models. Extensions of the algorithm to specications of higher orders and/or with covariates is straightforward and self-explanatory.

Let b(k)₀ ; b₁(k) and b₁(k) denote the parameter estimates of a GARCH(1,1) model

after k iterations with some numerical method (e:g: Newton-Raphson). The initial

values are at k = 0. If there are no zeros so that rt = ert for all t, then the kth.

iteration of the numerical method proceeds in the usual way: 1. Compute, recursively, for t = 1; : : : ; n:

b2

t = b(k 1)0 + b(k 1)1 ert 12 + b1(k 1)b2t 1: (49)

2. Compute the log-likelihood Pn_t=1ln fer(ert; bt) and other quantities (e:g: the

gra-dient and/or Hessian) needed by the numerical method to generate b(k)₀ ; b(k)₁

and b₁(k).

Usually, fer is the Gaussian density, so that the estimator may be interpreted as a

Gaussian QML estimator. The algorithm we propose modies the kth. iteration in several ways. Let G denote the set that contains the locations of the non-zeros, and

let n _{denote the number of non-zero returns. The kth. iteration now proceeds as}

(21)

1. Compute, recursively, for t = 1; : : : ; n: a) r2_t ₌ er2 t if t 2 G b2 t if t =2 G; where bt2 = b(k 1)0 + b1(k 1)r2t 1+ b1(k 1)b2t 1; (50) b) b2 t = b(k 1)0 + b(k 1)1 r2t 1+ b1(k 1)bt 12 : (51)

2. Compute the log-likelihoodP_t2Gln fer(ert; bt) and other quantities (e:g: the

gra-dient and/or Hessian) needed by the numerical method to generate b(k)₀ ; b(k)₁

and b₁(k).

Step 1.a) means r2_t _{is equal to an estimate of its conditional expectation at the}

locations of the zero-values. In Step 2 the symbolism t 2 G means the log-likelihood only includes contributions from non-zero locations. A practical implication of this is that any likelihood comparison (e:g: via information criteria) with other models

should be in terms of the average log-likelihood, i:e: division by n _{rather than n.}

QML Estimation of the log-GARCH model is via its ARMA-representation, since the standard Gaussian ML estimator must be interpreted as exact ML in the presence

of missing values, see Sucarrat and Escribano (2013). If jE(ln w2

t)j < 1, then the

ARMA(1,1) representation exists and is given by ln er2

t = 0+ 1ln er2t 1+ 1ut 1+ ut; ut= ln wt2 E(ln w2t); (52)

where 0 = 0 + (1 1)E(ln w2t), 1 = 1+ 1, 1 = 1 and ut is zero-mean and

IID conditionally on the past. Accordingly, the usual ARMA-methods can be used

to estimate 0, 1 and 1, and hence the log-GARCH parameters 1 and 1. To

identify 0, however, an estimate of E(ln w2t) is needed. Sucarrat et al. (2015) show

that, under very general assumptions, the formula ln [n 1Pn

t=1exp(but)] provides

a consistent estimate (see also Francq and Sucarrat (2015)). To accommodate the

missing values, this formula is modied to lnn 1P

t2Gexp(but)

.

C Estimation of the joint model of

1t

and

t

with

two-way feedback

For notational economy we illustrate the rst order case only, since the extension to higher order specications is straightforward. The procedure starts by casting the

ln 2

t equation into its \ARMA-X" form and then changing the ht equation

accord-ingly. Assuming that jE(ln w2

t)j < 1 and denoting ~yt = (ln ert2; ht)0, = E(ln wt2),

ut= (ln wt2 ; st)0 and ut = (ln w2t ; 0)0, the VARMA representation is given by

ln er2

t = 0 + 1ln ert 12 + 11st 1+ 11 ut 1+ 11ht 1+ ut; (53)

ht = 0+ 2ln er2t 1+ 21st 1+ 21 ut 1+ 21ht 1; (54)

where

0 = 0+(1 11)E(ln w2t), 0 = 0 21E(ln wt2), 1 = (11+11), 2 = (21+

(22)

being IID(0; 2

u). The stability condition for this representation is identical to the

untransformed model, i:e: the log-GARCH-ACL specication, but the unconditional expectations are now given by

E(ln er2 t) = (1 21) 0+ 110 (1 1)(1 21) 211; (55) E(ht) = 2 0+ (1 1)0 (1 1)(1 21) 211: (56)

More compactly, the VARMA representation can be written as

~yt= 0+ 1~yt 1+ 1ut 1+ ut; (57) where 0 = ! + (12 D B1 D), 1 = (A1 D + B1) and 1 = 11 11 21 21 : (58)

To estimate the VARMA parameters 0, 1 and 1 we replace the density fer in the

log-likelihood at t, which is given by (see the beginning of Section 3)

ln fr(rt) = Itln fer(ert) + Itln 1t+ (1 It) ln(1 1t); (59)

with an instrumental QML density (e:g: the normal) fu1t in the IID error u1t. The

joint log-likelihood at t conditional on the past thus becomes

Itln fu1t(ut) + Itln 1t+ (1 It) ln(1 1t): (60)

Maximisation of the total log-likelihood provides estimates of the VARMA

param-eters. All the parameters of interest, apart from 0 and 0, can be identied from

the VARMA estimates. In order to identify 0 and 0 an estimate of is needed,

since the VARMA intercepts are given by 10= 0+ (1 11) and 20= 0 21,

respectively. To this end we use the same formula as in the univariate case to estimate

, i:e: lnn 1P

t2Gexp(bu1t)

, where bu1t are the residuals from the rst equation

in the estimated VARMA model. Next, from the denition of 0 we can solve for

0 and 0, respectively, in order to obtain plug-in estimators of 10 and 0. Table 1

provides a small Monte Carlo study of this estimation procedure. The simulations show that estimates are close to their population counterparts in nite samples for

(23)

Table 1: QML estimation of the log-GAR CH-A C L mo del via the VARMA represen tation DGP T b0 b11 b11 b 11 b 11 b0 b21 b21 b 21 b 21 b E(ln w 2 t) b E( 0t ) wt N (0 ;1): A 5000 0.04 0.301 0.053 0.082 0.059 0.152 0.053 0.094 0.051 0.949 -1.318 0.37 5 10000 0.05 0.302 0.054 0.094 0.053 0.152 0.055 0.094 0.042 0.950 -1.321 0.376 B 5000 0.02 0.101 0.006 0.796 -0.006 0.189 0.001 0.099 -0.003 0.937 -1.272 0.050 10000 0.00 0.101 0.001 0.797 -0.002 0.158 0.001 0.100 -0.004 0.947 -1.273 0.05 0 C 5000 0.00 0.101 0.055 0.794 0.052 0. 214 0.051 0.101 0.006 0.945 -1.27 7 0.076 10000 0.00 0.101 0.053 0.795 0.051 0.209 0.052 0.105 0.001 0.948 -1.275 0.074 wt t(10): A 5000 0.06 0.302 0.062 0.089 0.053 0.148 0.050 0.096 0.055 0.950 -1.457 0.440 10000 0.06 0.303 0.059 0.086 0.051 0.148 0.050 0.100 0.054 0.949 -1.455 0.440 B 5000 0.00 0.101 0.000 0.794 -0.001 0.210 0.001 0.104 -0.003 0.930 -1.391 0.049 10000 0.00 0.101 0.000 0.796 0.001 0.168 0.000 0.101 0.002 0.945 -1.394 0.050 C 5000 0.00 0.104 0.055 0.791 0.052 0. 215 0.051 0.105 0.005 0.945 -1.39 7 0.112 10000 0.01 0.103 0.056 0.797 0.048 0.209 0.051 0.107 0.004 0.948 -1.398 0.112 The table con tains the av erage parameter estimates from 100 sim ulations. All computations in the programming language R ( R Core Team ( 2014 )) using the same initial values across DGPs. wt N (0 ;1), wt is standard normal with E (ln w 2 t) = 1: 27. wt t(10), wt is standardised t with 10 degrees of freedom and E (ln w 2 t) = 1: 39. T , sample size. DGP A, (0 ;11 ;11 ;11 ;11 ) = (0 ;0 :3 ;0 :05 ;0 :1 ;0 :05) and (0 ;21 ;21 ;21 ;21 ) = (0 :15 ;0 :05 ;0 :05 ;0 :05 ;0 :95). DGP B, (0 ;11 ;11 ;11 ;11 ) = (0 ;0 :1 ;0 ;0 :8 ;0) and (0 ;21 ;21 ;21 ;21 ) = (0 :15 ;0 ;0 :1 ;0 ;0 :95). DGP C, (0 ;11 ;11 ;11 ;11 ) = (0 ;0 :1 ;0 :05 ;0 :8 ;0 :05) and (0 ;21 ;21 ;21 ;21 ) = (0 :2 ;0 :05 ;0 :1 ;0 ;0 :95). E (0t ), unconditional zero probabilit y (estimated by the fraction of zeros, i:e: 1 T 1 PT t=1 It ). All computations in R ( R Core Team ( 2014 )).

(24)

Table 2: Descriptive statistics, dynamic logit models and

GARCH-models of SP500, Apple and Ekornes returns (see Section 5)

Descriptive statistics: s2 _s4 _ARCH [p val] n 0s b0 SP500 1.73 10.30 143:10 [0:00] 3684 2 0.001 Apple 9.25 55.03 7:12 [0:01] 7303 294 0.040 Ekornes 5.70 10.32 54:01 [0:00] 3546 667 0.189 Dynamic logit-models: b0

(s:e:) (s:e:)b1 (s:e:)b1 (s:e:)b SIC Logl

SP500 Constant 7:158

(0:707) 0.0115 17:04

Trend 7:200

(1:309) (2:478)0:673 0.0137 17:00

ACL(1,1) 0:032

(4e 05) (4e 05)1:147 (4e 05)0:997 0.0116 9:022

Apple Constant 3:171 (0:060) 0.3387 1232:5 Trend 1:870 (0:094) (0:263)3:437 0.3102 1123:9 ACL(1,1) 1e 09 (4e 05) (0:011)0:024 (9e 05)0:999 0.3095 1116:9 Ekornes Constant 1:462 (0:043) 0.9692 1714:3 Trend 1:183 (0:083) (0:150)0:576 0.9673 1706:9 ACL(1,1) 0:445 (0:130) (0:036)0:207 (0:087)0:701 0.9599 1689:6 GARCH-models: b0 (s:e:) (s:e:)b1 b 1 (s:e:) SP500 Ordinary 0:015 (0:003) (0:008)0:083 (0:009)0:908 0-adj. 0:015 (0:003) (0:008)0:083 (0:009)0:908 Apple Ordinary 0:168 (0:033) (0:008)0:087 (0:010)0:901 0-adj. 0:175 (0:037) (0:010)0:093 (0:012)0:894 Ekornes Ordinary 0:036 (0:009) (0:002)0:019 (0:004)0:974 0-adj. 0:039 (0:011) (0:004)0:025 (0:005)0:968

s2_{, sample variance. s}4_{, sample kurtosis. ARCH,} _{Ljung and Box} ₍₁₉₇₉_{) test}

statistic of rst-order serial correlation in the squared return. p val, the p-value of the test-statistic. n, number of returns. 0s, number of zero returns. b0, proportion of zero returns. s:e:, approximate standard errors (obtained

via the numerically estimated Hessian). k, the number of estimated model coecients. LogL, log-likelihood. SIC, the Schwarz (1978) information criterion. All computations in R (R Core Team(2014)).

(25)

0.00 0.05 0.10 0.15 0.20

−0.04

0.00

0.04

0.08

Average zero probability Bias ●● ● ● ● ● α0 GARCH Log−GARCH 0.00 0.05 0.10 0.15 0.20 −0.04 0.00 0.04 0.08

Average zero probability

● ● ● ● ● ● α1 0.00 0.05 0.10 0.15 0.20 −0.04 0.00 0.04 0.08

● ● ● ● ● ● β1 0.00 0.05 0.10 0.15 0.20 −1 0 1 2 3

Bias in % (MPE) ● ● ● σt GARCH 0.00 0.05 0.10 0.15 0.20 −1 0 1 2 3 Average zero−probability ● ● ● 1% VaR 0.00 0.05 0.10 0.15 0.20 −1 0 1 2 3 Average zero−probability ● ● ● 1% ES 0.00 0.05 0.10 0.15 0.20 0 5 15 25

Bias in % (MPE) ● ● ● σt Log−GARCH 0.00 0.05 0.10 0.15 0.20 −1 1 2 3 4 5 Average zero−probability ● ● ● 1% VaR 0.00 0.05 0.10 0.15 0.20 −1 1 2 3 4 5 Average zero−probability ● ● ● 1% ES

Figure 1: Simulated parameter and risk estimation biases in GARCH(1,1) and

(26)

0.00 0.05 0.10 0.15 0.20 −0.04 0.02 Average zero−probability GARCH bias ● ● ● ● ● ● α0: Algorithm Ordinary 0.00 0.05 0.10 0.15 0.20 −0.04 0.02 Average zero−probability ● ● ● ● ● ● α1: 0.00 0.05 0.10 0.15 0.20 −0.04 0.02 Average zero−probability ● ● ● ● ● ● β1 0.00 0.05 0.10 0.15 0.20 −0.05 0.05 Average zero−probability Log−GARCH bias ● ● ● ● ● ● α0: Algorithm Ordinary 0.00 0.05 0.10 0.15 0.20 −0.05 0.05 Average zero−probability ● ● ● ● ● ● α1: 0.00 0.05 0.10 0.15 0.20 −0.05 0.05 Average zero−probability ● ● ● ● ● ● β1

Figure 2: Simulated parameter biases in GARCH(1,1) and log-GARCH(1,1) models for the missing values algorithm in comparison with ordinary methods (see Section 4.2) 2000 2005 2010 0.0 0.2 0.4 0−prob: SP500 1985 1990 1995 2000 2005 2010 0.0 0.2 0.4 Apple 2000 2005 2010 0.0 0.2 0.4 Ekornes 2000 2005 2010 0.8 0.9 1.0 1.1 1.2 SD ratio: MPE = 0.04% 1985 1990 1995 2000 2005 2010 0.8 0.9 1.0 1.1 1.2 MPE = 2.36% 2000 2005 2010 0.8 0.9 1.0 1.1 1.2 MPE = 8.38% 2000 2005 2010 0.8 1.0 1.2 1% VaR ratio: MPE = 0.00% 1985 1990 1995 2000 2005 2010 0.8 1.0 1.2 MPE = 0.07% 2000 2005 2010 0.8 1.0 1.2 MPE = 0.49% 2000 2005 2010 0.8 1.0 1.2 1.4 1% ES ratio: MPE = 0.03% 1985 1990 1995 2000 2005 2010 0.8 1.0 1.2 1.4 MPE = 3.17% 2000 2005 2010 0.8 1.0 1.2 1.4 MPE = 11.51%

Figure 3: Fitted 0-probabilities, and the ratios of tted t, 1% VaR and 1% ES (see