Concluding remarks - Further developments of two point process models for fine scale time serie

Kullback-Leibler (K-L) information is a unique overall measure of the discrepancy between a fitted model and the true model which generates the observed sample. AIC, TIC, and GIC are derived as (asymptotically) unbiased estimators of K-L information. Therefore, they are K-L information-based criteria for model evaluation. Akaike’s discovery of the formal relationship between Kullback-Leibler information and maximum likelihood combines model estimation and model evaluation under a single theoretical framework, optimization, which sets the information-theoretic criteria apart from all other model evaluation criteria and procedures. Under the general assumption that the true model is unknown and has infinite dimension, it can be theoretically justified that the information criteria, derived by minimizing the relative expected K-L information, have the following unique properties: (1) the selected best model gives an optimal balance between bias (model accuracy) and variance (model complexity), and so has the minimum predictive error among all the candidate models; (2) the selected best model changes with the sample size, so that a more complex model can be selected as more information becomes available in larger samples.
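For orientation, the two simplest criteria named above can be written in their standard textbook forms. The notation here is an assumption chosen for illustration and may differ from the notation used elsewhere in the thesis: \ell(\hat{\theta}) is the maximized log-likelihood, k is the number of free parameters, \hat{J} is minus the average Hessian of the log-likelihood, and \hat{I} is the average outer product of the score, both evaluated at \hat{\theta}.

```latex
\[
\mathrm{AIC} = -2\,\ell(\hat{\theta}) + 2k,
\qquad
\mathrm{TIC} = -2\,\ell(\hat{\theta})
  + 2\,\operatorname{tr}\!\bigl(\hat{J}^{-1}\hat{I}\bigr).
\]
```

When the candidate model is correctly specified, \hat{I} and \hat{J} estimate the same matrix, the trace term tends to k, and TIC reduces to AIC. In both cases the candidate with the smallest criterion value is preferred.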

Although the information-theoretic criteria make an automatic model selection process possible, it should be noted that the best way to ensure selecting a ‘good’ or ‘useful’ model is to reduce the number of candidate models fitted to the data by thoughtful a priori model formulation. Once a group of reasonable candidate models is obtained, it is always valid to apply AIC for model evaluation within an M-estimator framework.

This makes the information-theoretic approach applicable to a vast range of statistical models, and the computation required for model evaluation is also simplified considerably by the constant trace term specification. More empirical studies on real data are needed to test our research findings.

Efron (2000) made the following comments on model selection and the hypothesis testing approach:

“Model selection is another area of statistical research where important development seem to be building up, but without a definitive breakthrough. ... The fact is that classic Fisherian estimation and testing theory are a good start, but not much more than that, on model selection. In particular, maximum likelihood estimation theory and model fitting do not account for the number of free parameters being fit, and that is why frequentist methods like Mallow’s Cp, the Akaike Information Criterion, and Cross-validation have evolved.”

In this sense, the information-theoretic approach has been justified as a valid and better alternative for model evaluation in general. Based on the theoretical justification and examples presented in this chapter, various ACD models will be evaluated and compared following an information-theoretic approach in the next chapter.

Model selection for ACD models – an application

3.1 Introduction

High-frequency financial data, which record transaction-by-transaction observations across time, have become more widely available in recent years due to advances in information technology. Such data provide market micro-structure information (O’Hara, 1995) in the form of transaction times and associated asset variables, such as stock price and volume. These high-frequency data differ from more traditional financial time series recorded over discrete time intervals because they usually contain large numbers of irregularly spaced observations in (near) continuous time, and therefore require alternative stochastic models. Since high-frequency data can be viewed as points in continuous time, it follows that models based on stochastic point processes may be useful in a statistical study of high-frequency data (Bauwens and Hautsch, 2006).

The ACD model is a point process model which takes the interval specification as formulated in Engle and Russell (1998). The random durations between consecutive transactions are modelled with an autoregressive structure, as we have seen in Section 2.4.3. During the last decade, various extended ACD models have been developed, aiming to describe the market micro-structure better and to fit the error term distribution more closely to real stock transaction data.
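To make the autoregressive duration structure concrete, the following is a minimal simulation sketch of the standard EACD(1,1) form, x_i = psi_i * eps_i with psi_i = omega + alpha * x_{i-1} + beta * psi_{i-1} and unit-mean exponential errors. The function name, the parameter names (omega, alpha, beta) and the choice to start the recursion at the unconditional mean are assumptions made for illustration; they are not taken from the thesis.

```python
import random


def simulate_eacd11(n, omega, alpha, beta, seed=0):
    """Simulate n durations from an EACD(1,1) process (illustrative sketch).

    x_i = psi_i * eps_i,  psi_i = omega + alpha * x_{i-1} + beta * psi_{i-1},
    where eps_i are i.i.d. exponential with mean 1.
    """
    if not (omega > 0 and alpha >= 0 and beta >= 0 and alpha + beta < 1):
        raise ValueError("need omega > 0, alpha, beta >= 0 and alpha + beta < 1")
    rng = random.Random(seed)
    mean = omega / (1.0 - alpha - beta)  # unconditional mean duration
    psi, x = mean, mean                  # assumed start-up values
    durations, cond_means = [], []
    for _ in range(n):
        psi = omega + alpha * x + beta * psi   # conditional expected duration
        x = psi * rng.expovariate(1.0)         # duration = psi * unit-mean error
        durations.append(x)
        cond_means.append(psi)
    return durations, cond_means


if __name__ == "__main__":
    xs, psis = simulate_eacd11(5000, omega=0.1, alpha=0.1, beta=0.8)
    print("sample mean duration:", sum(xs) / len(xs))
```

With omega = 0.1, alpha = 0.1 and beta = 0.8 the unconditional mean duration is 0.1 / (1 - 0.9) = 1, so the sample mean of a long simulated series should be close to 1.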

Probably because of the inherent limitations of hypothesis testing as a tool for model evaluation, the literature has so far devoted little attention to model comparison, despite the availability of many competing ACD model specifications. There are several possible reasons for this. The first is that, because the estimation of an ACD model falls within the quasi-MLE framework, standard maximum likelihood theory is no longer applicable to the related inferential issues. A second reason may be the inherent difficulty of testing non-nested models. A more fundamental concern is that no known theory can justify what ‘good’ property a selected best model has in terms of overall model performance, even if a candidate model can be identified as the best model by following the hypothesis testing approach. Hence, nothing can generally be said about ranking models.

The most popular model diagnostic check in ACD model papers is to use the L-B statistic (Equation A.20) to test for serial dependence in the estimated residual (and squared residual) series. Luca and Gallo (2004) used quantile-quantile plots (qq-plots) as a visual check on model goodness-of-fit. Bauwens et al. (2004) compared a group of selected ACD models based on the evaluation of density forecasts and the calculated Pearson’s Q-statistic (Equation 2.39). Bauwens et al. (2004) stated that they were aiming to compare ACD models using the criterion of predictive ability (p590), but they also acknowledged that their method is not useful for a comparison of misspecified models (p592). Meitz and Terasvirta (2006) introduced a number of new misspecification tests using the Lagrange Multiplier principle to evaluate ACD models. Because these articles followed the hypothesis testing approach for model evaluation, they all imply that a ‘true’ model (of finite dimension) exists and is measurable. Therefore they always aim to test whether a fitted model is ‘adequate’ or not compared to the ‘true’ model. This is fundamentally contrary to the general assumptions of the information-theoretic approach. Given a stock market situation, the trading process of a share can be affected by numerous factors, many of which may be unmeasurable. Therefore, the general assumptions backing up the hypothesis testing approach are unrealistic and we consider that the hypothesis testing approach of model evaluation for ACD models is theoretically inferior. So far, we have found no attempts to use the information-theoretic approach for model evaluation for ACD models.
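As an illustration of the residual diagnostic mentioned above, the sketch below computes the Ljung-Box (L-B) statistic in its standard form, Q = n(n+2) * sum over k of rho_k^2 / (n - k), where rho_k is the lag-k sample autocorrelation. The function name and arguments are assumptions for illustration, and the formula is the textbook one, which may differ in notation from Equation A.20.

```python
def ljung_box(series, max_lag):
    """Ljung-Box Q statistic for lags 1..max_lag (illustrative sketch)."""
    n = len(series)
    if not 0 < max_lag < n:
        raise ValueError("max_lag must satisfy 0 < max_lag < len(series)")
    mean = sum(series) / n
    dev = [v - mean for v in series]
    denom = sum(d * d for d in dev)
    if denom == 0:
        raise ValueError("series is constant; autocorrelations are undefined")
    total = 0.0
    for k in range(1, max_lag + 1):
        # lag-k sample autocorrelation
        rho_k = sum(dev[t] * dev[t - k] for t in range(k, n)) / denom
        total += rho_k ** 2 / (n - k)
    return n * (n + 2) * total


if __name__ == "__main__":
    alternating = [1.0 if t % 2 == 0 else -1.0 for t in range(100)]
    print(ljung_box(alternating, max_lag=1))
```

Under the null hypothesis of no serial correlation, Q is usually compared with a chi-square quantile. Applying the function to the residual series and to its squares mirrors the two checks described in the text.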

Quasi-maximum likelihood (QMLE) implies ‘model misspecification’. If we may interpret the typical hypothesis testing statement ‘given the null hypothesis is true’ as ‘given the hypothesized distribution is correct’, one should always question the validity of a hypothesis test under a misspecification situation. Under the general assumptions of the information-theoretic approach, any of our specified models is almost surely misspecified. As pointed out in Section 2.4.3, for an ACD model the log-likelihood function used for calculating the QMLE, i.e. for parameter estimation, is different from the log-likelihood function which defines the probabilistic property of the model.

It is the latter log-likelihood function that is used to define the conditional distribution model (as defined in Section 2.2) of an ACD process. Therefore, the model evaluation for candidate ACD models falls within an M-estimator framework. The ACD model literature has made no distinction between these important, but different, likelihood functions.

In this chapter, we fit various ACD models to two real stock transaction data sets: IBM data collected from the New York Stock Exchange and Darby data from the Kuala Lumpur Stock Exchange. Quasi-maximum likelihood estimates are computed for model estimation, and the information-theoretic approach is followed for model evaluation and comparison. The structure of this chapter is as follows.

Section 3.2 deals with the description of the data sets, preparation of raw data, and exploratory data analysis (EDA). The treated data are ready for confirmatory analysis and EDA results are presented. Some technical details of raw data treatment are also discussed.

Section 3.3 specifies the fitted ACD models. Six ACD models are considered for analysis: four basic ACD models, namely EACD(1,1), EACD(2,2), WACD(1,1), and WACD(2,2); and two mixed distribution ACD models, namely mixed exponential ACD(1,1) and mixed lognormal-gamma ACD(1,1). All six ACD models are fitted to the two real data sets.

In order to minimize the serial dependence in the estimated residual series and to enforce the unit mean constraint on it, a penalty term is added to the QMLE function to form the optimization objective function for parameter estimation. This is different from the estimation procedure employed in Engle and Russell (1998). In Section 3.4, the model estimation results are presented and the details of model estimation are discussed.
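The idea of a penalized objective can be sketched as follows for an EACD(1,1) model. The text above does not give the form of the penalty, so everything specific here is an assumption for illustration only: the exponential quasi log-likelihood, the squared deviation of the residual mean from 1, the sum of squared residual autocorrelations, and the hypothetical weights weight_mean and weight_acf. It should not be read as the estimation procedure actually used in Section 3.4.

```python
import math


def eacd11_residuals(durations, omega, alpha, beta):
    """Conditional means psi_i and residuals x_i / psi_i for an EACD(1,1)."""
    start = sum(durations) / len(durations)  # assumed start-up value
    psi, prev = start, start
    psis, resid = [], []
    for x in durations:
        psi = omega + alpha * prev + beta * psi
        psis.append(psi)
        resid.append(x / psi)
        prev = x
    return psis, resid


def penalized_objective(params, durations, weight_mean=10.0,
                        weight_acf=10.0, max_lag=5):
    """Negative exponential quasi log-likelihood plus a hypothetical penalty."""
    omega, alpha, beta = params
    if omega <= 0 or alpha < 0 or beta < 0 or alpha + beta >= 1:
        return math.inf  # outside the admissible parameter region
    psis, resid = eacd11_residuals(durations, omega, alpha, beta)
    neg_loglik = sum(math.log(p) + e for p, e in zip(psis, resid))

    n = len(resid)
    rbar = sum(resid) / n
    mean_penalty = (rbar - 1.0) ** 2           # unit mean constraint
    dev = [e - rbar for e in resid]
    denom = sum(d * d for d in dev)
    acf_penalty = 0.0                          # serial dependence in residuals
    if denom > 0:
        for k in range(1, min(max_lag, n - 1) + 1):
            rho_k = sum(dev[t] * dev[t - k] for t in range(k, n)) / denom
            acf_penalty += rho_k ** 2
    return neg_loglik + weight_mean * mean_penalty + weight_acf * acf_penalty
```

A function of this shape can be handed to any general-purpose numerical minimizer. Setting both weights to zero recovers the plain exponential quasi log-likelihood, which makes it easy to see how much the penalty moves the estimates.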

Section 3.5 presents the model evaluation results. AIC scores are calculated for model evaluation. Pearson’s Q-statistic is also calculated as extra evidence to cross-check the evaluation results given by the information-theoretic criteria. First we evaluate and compare the four basic ACD models. In the second subsection, mixed lognormal-gamma ACD(1,1), a newly proposed model, is compared with existing ACD(1,1) models. In the third subsection, a two-stage ACD model fitting procedure is proposed. The Q-statistics and a bootstrap version of qq-plots are used to provide further empirical evidence to confirm the AIC model assessment results.

In document Further developments of two point process models for fine scale time series : a thesis presented in partial fulfilment of the requirements for the degree of Doctor of Philosophy in Statistics at Massey University, Albany (Auckland), New Zealand (Page 73-79)