We model $x_i^{(\delta)}$ using a modified version of the Lognormal Log-ACD (LL-ACD) model proposed by Allen, Chan, McAleer, and Peiris (2008). The LL-ACD(p,q) model is specified as follows:

$$x_i = \exp(c + \gamma' Z_i + \Psi_i), \tag{2.12}$$

$$\Psi_i = \sum_{j=1}^{p} \beta_j \Psi_{i-j} + \sum_{j=1}^{q} \alpha_j \varepsilon_{i-j} + \varepsilon_i, \tag{2.13}$$

$$\varepsilon_i \sim \text{i.i.d. } N(0, \sigma_\varepsilon^2), \tag{2.14}$$

in which $Z_i$ is a vector of covariates and $\gamma$ is the associated parameter vector. The model is therefore equivalent to a log-linear regression of $x_i$ on $Z_i$ with ARMA-type error terms $\Psi_i$.

Footnote 5: In Li, Nolte, and Nolte (2018a), it is proven that a non-parametric duration-based volatility estimator is more than 6 times more efficient than a RV estimator, while the parametric duration-based volatility estimator is more than 20 times more efficient than the non-parametric duration-based volatility estimator.

2.4 Price Duration Modelling | 61

Let us define the parameter vector $\theta = (c, \gamma', \beta_1, \ldots, \beta_p, \alpha_1, \ldots, \alpha_q, \sigma_\varepsilon^2)'$. Based on the dataset $\{X, Z\}$, where $X = \{x_i^{(\delta)}\}_{i=1:I}$ and $Z = \{Z_i\}_{i=1:I}$, the model can be easily estimated via Quasi Maximum Likelihood (QML) with the following conditional log-likelihood function:
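$$\ln L(\theta; X, Z) = -\frac{I}{2}\ln 2\pi - \frac{I}{2}\ln \sigma_\varepsilon^2 - \frac{1}{2}\sum_{i=1}^{I} \frac{\varepsilon_i^2}{\sigma_\varepsilon^2}. \tag{2.15}$$

It is then clear that the estimator of the density $F_x(\cdot \mid \mathcal{F}_{t_{i-1}^{(\delta)}}, \theta)$ is simply a log-normal density. We can then construct the ICV estimator based on the estimated parameter vector $\hat{\theta}$ and the estimated error terms $\hat{\varepsilon}_i$ as:

$$ICV(0,T) = -\sum_{i=1}^{I} \ln\left(1 - \Phi(\hat{\varepsilon}_i / \hat{\sigma}_\varepsilon)\right) \left(\frac{\delta}{P(t_{i-1}^{(\delta)})}\right)^2, \tag{2.16}$$

in which $\Phi(\cdot)$ is the CDF of a standard normal distribution.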
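The recursion in (2.13), the log-likelihood in (2.15) and the estimator in (2.16) can be illustrated with a minimal Python sketch. This is not the estimation code used in this chapter. The function names, the zero pre-sample values for $\Psi$ and $\varepsilon$, and the argument `prices` (standing for $P(t_{i-1}^{(\delta)})$) are assumptions made for illustration only.

```python
import numpy as np
from math import erf, sqrt


def ll_acd_residuals(log_x, mean_part, beta, alpha):
    """Back out eps_i from (2.12)-(2.13).

    log_x     : array of ln x_i
    mean_part : array of c + gamma'Z_i
    beta      : AR coefficients beta_1..beta_p
    alpha     : MA coefficients alpha_1..alpha_q
    Pre-sample Psi and eps are treated as zero (an assumption).
    """
    psi = np.asarray(log_x, dtype=float) - np.asarray(mean_part, dtype=float)
    eps = np.zeros(len(psi))
    for i in range(len(psi)):
        ar = sum(b * psi[i - j - 1] for j, b in enumerate(beta) if i - j - 1 >= 0)
        ma = sum(a * eps[i - j - 1] for j, a in enumerate(alpha) if i - j - 1 >= 0)
        eps[i] = psi[i] - ar - ma
    return eps


def log_likelihood(eps, sigma2):
    """Gaussian conditional log-likelihood as in (2.15)."""
    n = len(eps)
    return (-0.5 * n * np.log(2.0 * np.pi)
            - 0.5 * n * np.log(sigma2)
            - 0.5 * np.sum(np.asarray(eps) ** 2) / sigma2)


def norm_cdf(z):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))


def icv(eps_hat, sigma_hat, delta, prices):
    """ICV estimator as in (2.16); prices[i] plays the role of P at the start of duration i."""
    survival = np.array([1.0 - norm_cdf(e / sigma_hat) for e in eps_hat])
    return -np.sum(np.log(survival) * (delta / np.asarray(prices, dtype=float)) ** 2)
```

In practice one would maximise `log_likelihood` over $\theta$ with a numerical optimiser and then pass the fitted residuals to `icv`; that step is omitted here.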
The LL-ACD model is chosen for the following reasons. Firstly, the log-linear form allows us to include seasonality components and other explanatory variables freely while guaranteeing a positive fitted duration $\hat{x}_i^{(\delta)}$. Secondly, the QML estimation of the LL-ACD model, as derived by Allen, Chan, McAleer, and Peiris (2008) using the results in Bollerslev and Wooldridge (1992), ensures that the parameter estimates are consistent even when the log-normal density is misspecified. This is a crucial property of the model that validates our analysis, especially when explanatory variables are included. It is worth noting that estimates of $\gamma$ are still vulnerable to endogeneity bias due to simultaneity or omitted variables, which is likely to be present in our analysis. However, since we focus mainly on the overall goodness-of-fit of the model, with the aim of improving the quality of volatility estimates, rather than on the marginal effects of the included variables, this is not a major concern. Thirdly, (2.12) allows $\gamma$ to be estimated conveniently via OLS regressions, since the dynamic structure in the residuals does not alter the unbiasedness of the OLS estimator. This allows us to examine the relative importance of explanatory variables using simple OLS regressions without estimating the dynamic parameters. Also, the OLS estimates can be used to initialize the LL-ACD estimation, which can speed up the estimation significantly.
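The third point can be sketched as a plain OLS regression of $\ln x_i$ on a constant and $Z_i$. The snippet below is only an illustration: the function name `ols_log_duration` is hypothetical, and the idea of reusing the OLS residuals as starting values for $\Psi_i$ is one possible way to initialize the LL-ACD estimation, not necessarily the procedure used here.

```python
import numpy as np


def ols_log_duration(x, Z):
    """Regress ln x_i on a constant and Z_i by ordinary least squares.

    Returns (c_hat, gamma_hat, residuals); the residuals are estimates of
    the correlated error term Psi_i in (2.12).
    """
    y = np.log(np.asarray(x, dtype=float))
    X = np.column_stack([np.ones(len(y)), Z])      # prepend the intercept column
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)   # least-squares solution
    residuals = y - X @ coef
    return coef[0], coef[1:], residuals
```

The returned `c_hat` and `gamma_hat` could then be passed as starting values to a QML routine that also estimates the $\beta_j$, $\alpha_j$ and $\sigma_\varepsilon^2$.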
2.4.1 Inclusion and Selection of Variables
Table 2.1 Description of MMS variables

| Name | Notation | Parameter | Description |
|---|---|---|---|
| Volume | $VOL_i$ | $\gamma_{VOL}$ | Logarithm of the total trading volume per second in $(t_{i-1}^{(\delta)}, t_i^{(\delta)}]$. |
| Total quote depth | $TQ_i$ | $\gamma_{TQ}$ | Logarithm of the sum of the best bid and ask depth per second in $(t_{i-1}^{(\delta)}, t_i^{(\delta)}]$. |
| Number of transactions | $NT_i$ | $\gamma_{NT}$ | Logarithm of the number of transactions per second in $(t_{i-1}^{(\delta)}, t_i^{(\delta)}]$. |
| Bid-ask spread | $BAS_i$ | $\gamma_{BAS}$ | Mean bid-ask spread in $(t_{i-1}^{(\delta)}, t_i^{(\delta)}]$. |
| Quote depth difference | $QD_i$ | $\gamma_{QD}$ | The absolute difference between the logarithm of the sum of the best bid and the (sum of) best ask depth in $(t_{i-1}^{(\delta)}, t_i^{(\delta)}]$. |
| Order imbalance | $OI_i$ | $\gamma_{OI}$ | The absolute difference between the logarithm of the sum of the number of buyer- and seller-initiated orders in $(t_{i-1}^{(\delta)}, t_i^{(\delta)}]$. |
| Order flow | $OF_i$ | $\gamma_{OF}$ | The absolute difference between the logarithm of the sum of the buyer- and seller-initiated volume in $(t_{i-1}^{(\delta)}, t_i^{(\delta)}]$. |

Empirically, price durations are subject to diurnal effects, which are usually filtered out prior to model estimation (Engle and Russell, 1998). The LL-ACD model allows for joint estimation of seasonality parameters and the ACD parameters, which can potentially lead to efficiency gains in parameter estimation (Hautsch, 2012).
We specify the seasonality regressors in a flexible Fourier form as proposed by Andersen and Bollerslev (1997b):

$$s_i = \sum_{j=1}^{P} \upsilon_j t_i^j + \sum_{j=1}^{Q} \left[\upsilon_{c,j}\cos(t_i \cdot 2\pi j) + \upsilon_{s,j}\sin(t_i \cdot 2\pi j)\right], \tag{2.17}$$

in which $P$ and $Q$ are the predetermined degrees of the polynomial and trigonometric components, respectively, and $t_i$ is the calendar time of the $i$-th event divided by the total length of a trading day. $\upsilon_j$, $\upsilon_{c,j}$ and $\upsilon_{s,j}$ are parameters to be estimated. We can include each component of $s_i$ in $Z_i$ to estimate the seasonality parameters. In all of our model estimations, the degrees of the flexible Fourier regressors are set to $P=1$ and $Q=3$.
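As an illustration of (2.17), the sketch below builds the $P + 2Q$ regressors that enter $Z_i$. The function name `fourier_regressors` is hypothetical, and the input `t` is assumed to already be the time of day expressed as a fraction of the trading day.

```python
import numpy as np


def fourier_regressors(t, P=1, Q=3):
    """Build the flexible Fourier regressors of (2.17).

    t : time of each event as a fraction of the trading day
    Returns an array with P polynomial columns followed by Q (cos, sin) pairs,
    i.e. P + 2Q columns in total.
    """
    t = np.asarray(t, dtype=float)
    columns = [t ** j for j in range(1, P + 1)]      # polynomial terms t^j
    for j in range(1, Q + 1):
        columns.append(np.cos(2.0 * np.pi * j * t))  # cosine term of order j
        columns.append(np.sin(2.0 * np.pi * j * t))  # sine term of order j
    return np.column_stack(columns)
```

With the defaults $P=1$ and $Q=3$ the result has seven columns, whose coefficients correspond to $\upsilon_1$, $\upsilon_{c,1}$, $\upsilon_{s,1}$, and so on.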
A key contribution of this chapter is the inclusion and selection of MMS covariates in intraday volatility estimation. Based on our discussion in the literature review section, we include the following covariates in the estimation procedure: trading volume, total quote depth, the number of transactions, bid-ask spread, quote depth difference, order imbalance and order flow. The definitions of these variables are summarized in Table 2.1. As discussed in Section 2.3, for the $VOL_i$, $TQ_i$ and $NT_i$ variables, we use a measure of accumulated speed instead of the exact quantity to ensure a meaningful
interpretation of the associated parameters. We include both the contemporaneous MMS covariates and their one-duration lagged versions in the LL-ACD model to control for potential lead-lag relationships between price durations and these covariates. However, we note that in this setting there is no clear economic interpretation of the coefficients of the lagged covariates, due to the randomness of the length of the past duration.
To summarize the discussion above, the $\gamma' Z_i$ component in (2.12) can be written as follows:

$$\begin{aligned}
\gamma' Z_i = {} & s_i + \gamma_{VOL,0} VOL_i + \gamma_{TQ,0} TQ_i + \gamma_{NT,0} NT_i + \gamma_{BAS,0} BAS_i + \gamma_{QD,0} QD_i + \gamma_{OI,0} OI_i + \gamma_{OF,0} OF_i \\
& + \gamma_{VOL,1} VOL_{i-1} + \gamma_{TQ,1} TQ_{i-1} + \gamma_{NT,1} NT_{i-1} + \gamma_{BAS,1} BAS_{i-1} + \gamma_{QD,1} QD_{i-1} + \gamma_{OI,1} OI_{i-1} + \gamma_{OF,1} OF_{i-1}.
\end{aligned} \tag{2.18}$$

In empirical studies, one can include a richer set of explanatory variables in $\gamma' Z_i$ that may further improve the goodness-of-fit of the model. However, as the number of parameters increases, the efficiency of the parameter estimates deteriorates, which leads to less efficient volatility estimates for a given sample size. To improve the performance of volatility estimation, we propose to include only the most relevant covariates in the estimation of the LL-ACD model. The relevance of the covariates is determined adaptively using an OLS regression based on the log-linear form of (2.12), treating $\Psi_i$ as a correlated error term.
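The covariate part of (2.18) amounts to placing each duration's MMS covariates next to those of the previous duration. A minimal sketch is given below; the function name `with_one_lag` is hypothetical, and dropping the first duration (which has no lag) is an assumption about how the sample boundary is handled.

```python
import numpy as np


def with_one_lag(M):
    """Stack contemporaneous and one-duration lagged covariates side by side.

    M : I-by-7 array whose row i holds (VOL_i, TQ_i, NT_i, BAS_i, QD_i, OI_i, OF_i)
    Returns an (I-1)-by-14 array with rows [M_i, M_{i-1}]; the first duration
    is dropped because its lag is unavailable (an assumption).
    """
    M = np.asarray(M, dtype=float)
    return np.hstack([M[1:], M[:-1]])
```

The dependent variable and the seasonality regressors would need to be trimmed in the same way so that all rows stay aligned.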
We use best subset regression (BSR) to select the MMS covariates that are most relevant to price duration modelling. BSR is a classical statistical method (see, e.g., Beale, Kendall, and Mann (1967) and Hocking and Leslie (1967)) that is frequently used in variable selection problems. It has recently regained attention due to the development of optimization methods in Bertsimas, King, and Mazumder (2016). To discuss this selection method in detail, let $\gamma = \{\nu, \gamma_Z\}$, where $\nu$ is the $(P+2Q)$-by-1 vector of seasonality parameters and $\gamma_Z$ is the 14-by-1 vector of MMS parameters. We start with the regression model:

$$\ln x_i^{(\delta)} = c + \gamma' Z_i + \Psi_i. \tag{2.19}$$

For each $K$ ranging from 1 to 14, we solve the following nonconvex problem:

$$\min_{c,\gamma} \left\| \ln x_i^{(\delta)} - c - \gamma' Z_i \right\|_2^2 \quad \text{subject to} \quad \|\gamma_Z\|_0 \le K, \tag{2.20}$$

where $\|\cdot\|_2$ is the $l_2$ norm, taken over the observations $i = 1, \ldots, I$, and $\|\gamma_Z\|_0$ denotes the pseudo-norm of $\gamma_Z$ that counts the number of non-zero elements in $\gamma_Z$. Note that the constraint applies only to $\gamma_Z$, so the seasonality parameters $\nu$ are always included. The problem above can be expressed as a Mixed Integer Optimization (MIO) problem, as suggested by Bertsimas, King, and Mazumder (2016), and can be solved very efficiently using MIO optimizers. The detailed optimization setup is documented in Appendix B.3.
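For intuition, problem (2.20) can also be solved by exhaustive enumeration when the number of candidate covariates is small. The sketch below does exactly that and is not the MIO formulation of Appendix B.3: it swaps in a brute-force search, which is feasible for 14 covariates but does not scale. The function name `best_subsets` and its arguments are hypothetical.

```python
import itertools
import numpy as np


def best_subsets(y, S, M, max_k=None):
    """Exhaustive best subset regression for each subset size K.

    y : ln x_i
    S : seasonality regressors, always included (may have zero columns)
    M : candidate MMS covariates, subject to the cardinality constraint
    Returns {K: (rss, subset)}, where subset is a tuple of column indices of M.
    Searching subsets of size exactly K suffices for the constraint <= K,
    because adding a regressor never increases the residual sum of squares.
    """
    y = np.asarray(y, dtype=float)
    M = np.asarray(M, dtype=float)
    n, m = M.shape
    max_k = m if max_k is None else max_k
    base = np.column_stack([np.ones(n), S])          # intercept + seasonality
    results = {}
    for k in range(1, max_k + 1):
        best_rss, best_subset = np.inf, None
        for subset in itertools.combinations(range(m), k):
            X = np.column_stack([base, M[:, list(subset)]])
            coef, *_ = np.linalg.lstsq(X, y, rcond=None)
            rss = float(np.sum((y - X @ coef) ** 2))
            if rss < best_rss:
                best_rss, best_subset = rss, subset
        results[k] = (best_rss, best_subset)
    return results
```

With 14 covariates this evaluates $2^{14} - 1 = 16{,}383$ regressions in total, which is still cheap; the MIO approach becomes important for much larger candidate sets.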
In essence, BSR for a given $K$ finds the optimal combination of $K$ different covariates that minimizes the mean squared error. We will refer to the optimized model for a given $K$ as the $K$-optimal model. Intuitively, as $K$ ranges from 1 to 14, more important MMS covariates will be included first in the $K$-optimal models, and less important MMS covariates will only be included when more important ones are already in the model. Therefore, the number of times each MMS covariate is included across the $K$-optimal models serves as a natural ranking of the relative importance of each MMS variable. To determine the overall optimal combination of MMS covariates to be included, we can pick the best model among all $K$-optimal models using information criteria, such as the Akaike Information Criterion (AIC) or the Bayesian Information Criterion (BIC). We will refer to the overall best model among all the $K$-optimal models as the $K^*$-optimal model, where $K^*$ is the number of regressors that optimizes the chosen model selection criterion.
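The ranking and the choice of $K^*$ can be sketched as follows. The function name `rank_and_select` and the input layout are hypothetical, and the BIC is computed in its common Gaussian-regression form $n \ln(\mathrm{RSS}/n) + k \ln n$, which is an assumption about the exact criterion used.

```python
import numpy as np


def rank_and_select(results, n_obs, n_fixed, n_covariates):
    """Rank covariates by inclusion count and pick K* by BIC.

    results      : {K: (rss, subset)} for the K-optimal models
    n_obs        : number of observations
    n_fixed      : parameters always included (intercept and seasonality terms)
    n_covariates : number of candidate MMS covariates
    Returns (inclusion counts per covariate, BIC per K, K*).
    """
    counts = np.zeros(n_covariates, dtype=int)
    bic = {}
    for k, (rss, subset) in results.items():
        for j in subset:
            counts[j] += 1                      # one more K-optimal model contains j
        n_par = n_fixed + k
        bic[k] = n_obs * np.log(rss / n_obs) + n_par * np.log(n_obs)
    k_star = min(bic, key=bic.get)              # smallest BIC wins
    return counts, bic, k_star
```

A covariate with a higher count enters the $K$-optimal models earlier and is therefore ranked as more important.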
Other variable selection schemes are also available, for example, regularization methods such as the LASSO by Tibshirani (1994) or the elastic net by Zou and Hastie (2005), and dimension reduction methods such as principal component regression (PCR) or partial least squares (PLS) regression (Wold, Sjöström, and Eriksson, 2001). The best subset selection approach has the advantage of a very straightforward economic interpretation without many tuning parameters. The relative performance of BSR and shrinkage estimators remains an open debate (see, e.g., Bertsimas, King, and Mazumder (2016) and Hastie, Tibshirani, and Tibshirani (2017)), but the BSR approach provides a simple way to rank the relative importance of the variables based on their inclusion in the optimal models. For shrinkage estimators, it is not immediately clear what criteria should be used to rank the contribution of the variables, and the performance of these estimators depends crucially on the choice of the tuning parameters. As to the PCR and PLS approaches, one may argue that these estimators can extract some latent factors in the system of regressors, but latent factors lack a clear economic interpretation as they are just linear combinations of the regressors. A summary in Hastie, Tibshirani, and Friedman (2009) shows that the variable selection methods described above have similar in-sample performance, and we choose the best subset regression method due to its simplicity and more straightforward economic interpretation.