Forecasting mid-price movements with machine learning techniques

The prediction of limit order book flow is of interest to both practitioners and researchers. Among the possible prediction tasks, Publication I and Publication II focus on the prediction of mid-price movements. In both Publications, this is posed as a classification problem over three labels indicating whether the mid-price is stationary, decreasing or increasing. The prediction horizon is specified as the number of future order book events counted from the time at which the prediction is made. Publication I analyzes the predictability of the mid-price direction after 1, 2, 3, 5 and 10 events, Publication II for a 10-event horizon.

Before addressing the findings, it is important to remark that the contributions of Publication I do not include complex applications or fine-tuning of ML methods aimed at maximizing performance on the above-mentioned prediction task. Rather, Publication I introduces a benchmark limit order book dataset designed for future ML applications (and, clearly, for the current one). In this respect, the dataset construction, order book processing, feature extraction, and input normalization (see Section 3) are among the major tasks. To make the results reproducible and a reliable benchmark, an experimental protocol based on increasing foldings (see Section 3.4 in Publication I) has been designed and accurately described. By contrast, the data-handling part in Publication II is minimal and corresponds to a direct application to the very same dataset developed in Publication I (specifically, to the subsample that uses z-score normalization). But whereas Publication I follows the traditional approach of forming input vectors from features at a given time instance, Publication II uses tensor representations, in which the time dimension is retained. Converting a tensor representation into a vector representation in fact leads to a loss of temporal information.
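To make the three-label classification target concrete, the following Python sketch assigns a label to each order book event for a horizon of k events. It is only an illustration under simple assumptions: the function name `midprice_labels`, the `threshold` parameter and the rule of comparing the mid-price k events ahead with the current one are not taken from the Publications, which may define the labels differently (for instance through smoothing).

```python
import numpy as np

def midprice_labels(best_ask, best_bid, horizon, threshold=0.0):
    """Label each event by the mid-price move `horizon` events ahead.

    Returns integers: -1 (decreasing), 0 (stationary), +1 (increasing).
    The last `horizon` events have no future reference and are dropped.
    NOTE: this direct-difference rule is an assumption for illustration.
    """
    ask = np.asarray(best_ask, dtype=float)
    bid = np.asarray(best_bid, dtype=float)
    mid = (ask + bid) / 2.0
    if horizon < 1 or horizon >= len(mid):
        raise ValueError("horizon must satisfy 1 <= horizon < number of events")
    # Change between the current mid-price and the one `horizon` events later.
    change = mid[horizon:] - mid[:-horizon]
    labels = np.zeros(change.shape, dtype=int)
    labels[change > threshold] = 1
    labels[change < -threshold] = -1
    return labels
```

Calling the function with `horizon` set to 1, 2, 3, 5 or 10 would give one label series per horizon, each slightly shorter than the event series because the final events have no future mid-price to compare against.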

Turning to the actual results, Publication I implements two ML methods under three different data-normalization schemes and a common experimental protocol. The performance measures reported in Publication I show that, even in noisy high-frequency LOB data, ML can effectively retrieve signals useful for prediction. At shorter horizons, however, performance is affected by the noisiness of the data, i.e. by microstructure effects that make the very-short-term prediction of mid-price movements challenging. Nevertheless, the random mid-price dynamics, given as noise superimposed on the efficient price, does not seem to be a completely independent process: the input vectors extracted from the past dynamics of the observed mid-price prove capable of achieving satisfactory performance measures also at the 1-, 2- and 3-event horizons. However, if these prediction approaches are to be exploited by practitioners, reacting within the next event is unfeasible (the median inter-event duration is 64 milliseconds), so predictability over a slightly longer horizon is more relevant from a practical perspective. Despite the data size, the out-of-sample performance (F1) is up to 43% for both methods, showing that basic machine learning techniques can provide a satisfactory classification of the direction of mid-price movements. As for the normalization methods, the results clearly indicate that the role played by data normalization is not secondary. Furthermore, there is evidence that specific combinations of ML algorithm and normalization scheme lead to considerably different performance measures than others.
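As an illustration of how an out-of-sample F1 figure can be obtained under a protocol of increasing folds, the sketch below builds expanding training sets by trading day and computes a macro-averaged F1 score. Both helper names (`expanding_folds`, `macro_f1`) are hypothetical, and the choices of splitting by day and of macro-averaging are assumptions; the exact protocol is the one described in Section 3.4 of Publication I.

```python
import numpy as np

def expanding_folds(day_ids):
    """Yield (train_idx, test_idx): train on the first i days, test on day i+1.

    `day_ids` holds one day identifier per sample (assumed layout).
    """
    day_ids = np.asarray(day_ids)
    days = np.unique(day_ids)  # sorted unique day identifiers
    for i in range(1, len(days)):
        train_idx = np.flatnonzero(np.isin(day_ids, days[:i]))
        test_idx = np.flatnonzero(day_ids == days[i])
        yield train_idx, test_idx

def macro_f1(y_true, y_pred, classes=(-1, 0, 1)):
    """Unweighted mean of the per-class F1 scores."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    scores = []
    for c in classes:
        tp = np.sum((y_pred == c) & (y_true == c))
        fp = np.sum((y_pred == c) & (y_true != c))
        fn = np.sum((y_pred != c) & (y_true == c))
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never occurs.
        scores.append(2 * tp / denom if denom > 0 else 0.0)
    return float(np.mean(scores))
```

In such a setup a classifier would be refitted on each `train_idx`, evaluated on the corresponding `test_idx`, and the fold-level scores averaged, so that every test day lies strictly after the data used for training.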

RO 1 of this dissertation aims to explore the applicability of standard ML methods for the mid-price movement prediction task, also addressing the role played by different normalization schemes and forecasting horizons. To pursue this goal, Publication I applies two simple ML methods under three data-normalization approaches and evaluates out-of-sample forecasting performance measures. The results confirm the feasibility of simple ML methods for this prediction task and show that normalization is a key ingredient, perhaps more relevant than the classifier. The ultra-high-frequency dataset developed for answering this question is made freely available, together with a detailed description of the experimental protocol and the definition of the feature set and the variables it provides, fulfilling the second part of the research goal.

Linear discriminant analysis is applied in Publication II and proves to be an effective method for the mid-price direction prediction task. However, when relying on a tensor representation of the data, the corresponding multilinear discriminant analysis boosts all the performance measures by up to approximately 15% and 5% with respect to the worst and best competing input-vector alternatives, respectively. This indicates the importance of the temporal information captured in the tensor representation: not only are current features important for the prediction task, but the interrelations among their lags also contribute substantially to the performance improvements. Furthermore, Publication II develops a regressor operating on the tensor representation and discusses a scheme for selecting the best-performing model state based on the algorithm's learning dynamics. This method leads to the highest F1 scores among LDA, MDA, the benchmarks of Publication I and the bag-of-features algorithms in (Passalis, Tsantekidis et al., 2017).
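The difference between the two input representations can be sketched in a few lines of Python. The helper `to_tensor_samples` and the `window` parameter are hypothetical names introduced here for illustration, and the layout (features along rows, lags along columns) is an assumption rather than the exact construction used in Publication II.

```python
import numpy as np

def to_tensor_samples(features, window):
    """Stack the last `window` events into one (n_features, window) sample.

    `features` has shape (n_events, n_features); the result has shape
    (n_events - window + 1, n_features, window).
    """
    features = np.asarray(features, dtype=float)
    n_events, n_features = features.shape
    if window < 1 or window > n_events:
        raise ValueError("window must satisfy 1 <= window <= n_events")
    n_samples = n_events - window + 1
    samples = np.empty((n_samples, n_features, window))
    for t in range(n_samples):
        # Columns are ordered from the oldest to the newest event.
        samples[t] = features[t:t + window].T
    return samples
```

Flattening with `samples.reshape(len(samples), -1)` keeps the same numbers but turns each sample into a single vector, so a vector-based method such as LDA no longer sees which entries are lags of the same feature. A multilinear method operating on the two-mode samples can instead treat the feature mode and the time mode separately.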

RQ 1 is addressed by implementing four ML methods based on a tensor representation of the time series and comparing them with the results from input-vector alternatives applied in the literature. Our results show that the improvement in forecasting performance is generally significant for all four measures we considered (in this regard, see Table 6.1).

5.2 Long-range correlations in limit order book markets

Publication III provides a study of the scaling behavior of duration-related variables extracted from the order book data of five securities. The scaling exponent is extracted with the DFA method, computed for inter- and cross-event durations. The scaling exponents we find for the order-to-order, trade-to-trade and cancel-to-cancel series are consistent with earlier analyses in the literature (e.g. Ivanov, Yuen and Perakakis, 2014; Gu et al., 2014). Power-law exponent estimates are very consistent (α ≈ 0.6) across different stocks (and sides of the book), suggesting that fractality in durations is a general phenomenon (in line with Ivanov, Yuen, Podobnik et al., 2004), although there are some minor differences, especially in the cross-event durations, between the stocks traded in Helsinki, Copenhagen and Stockholm. This suggests some exchange-specific features in the long-range correlations, e.g. not all market participants trade on multiple exchanges.

Our analysis finds evidence of crossovers in the time series we analyzed. This confirms the finding of Ivanov, Yuen, Podobnik et al. (2004), extending it to different duration series and detecting it across several stocks. This points out that the observed fractality is complex and time-related, being the compound effect of different scaling exponents characteristic of different time-horizon domains. Fractal properties in LOB markets therefore seem to be quite complex, reflecting the complexity of the underlying markets. Interestingly, the observed crossovers are located around the daily, weekly and monthly time scales. This analysis was possible only thanks to the availability of a long data period; indeed, earlier studies identified two crossovers (e.g. Ivanov, Yuen, Podobnik et al., 2004; Ivanov, Yuen and Perakakis, 2014; Tiwari et al., 2017). This evidence supports the idea that there might be participants trading at different horizons or, in any case, interactions between daily, weekly and monthly goals. This is aligned with the theoretical argument of (U. A. Müller, Dacorogna, Davé, Pictet et al., 1993), although Publication III provides evidence only with respect to durations.

Furthermore, we find evidence of great symmetry with respect to the book side, indicating either a general and uniform behavior among market participants or their tendency to submit both buy and sell limit orders. Importantly, this is the first analysis in which different duration series are jointly analyzed within the same order book data, i.e. for a given stock and a given period we consider all the possible durations between different LOB events. Our findings unveil a true multi-level and interacting complexity: all the series and the corresponding cross-series are multifractal, with a characteristic scaling exponent that is very consistent on a daily level. This may indicate, for example, that trading algorithms are similar in the way past information is processed for placing limit orders, market orders and cancellations.
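For readers unfamiliar with the DFA method mentioned above, the following Python sketch shows the standard procedure for a single duration series: integrate the mean-centred series, detrend it within windows of size n, and estimate α as the slope of log F(n) against log n. The function names, the first-order (linear) detrending and the choice of window sizes are assumptions made for illustration, not the exact settings of Publication III.

```python
import numpy as np

def dfa_fluctuations(series, scales, order=1):
    """Return the DFA fluctuation F(n) for each window size n in `scales`."""
    x = np.asarray(series, dtype=float)
    profile = np.cumsum(x - x.mean())  # integrated, mean-centred series
    fluct = []
    for n in scales:
        n_windows = len(profile) // n
        # Non-overlapping windows; a trailing remainder is discarded.
        segments = profile[:n_windows * n].reshape(n_windows, n)
        t = np.arange(n)
        sq_residuals = []
        for seg in segments:
            coeffs = np.polyfit(t, seg, order)  # local polynomial trend
            sq_residuals.append(np.mean((seg - np.polyval(coeffs, t)) ** 2))
        fluct.append(np.sqrt(np.mean(sq_residuals)))
    return np.array(fluct)

def dfa_exponent(series, scales, order=1):
    """Estimate alpha as the slope of log F(n) against log n."""
    fluct = dfa_fluctuations(series, scales, order)
    slope, _intercept = np.polyfit(np.log(scales), np.log(fluct), 1)
    return slope
```

For uncorrelated noise the estimated exponent should be close to 0.5, while values above 0.5 indicate persistent long-range correlations. A crossover shows up as a change of slope in the log-log plot of F(n), so fitting `dfa_exponent` separately over different ranges of `scales` is one simple way to obtain a scaling exponent per regime.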

Publication III also studies the relationship between the intra-day scaling exponent and some economic variables (such as daily return and volatility). For the correlation between the scaling exponent and volatility, we find clear positive significance for all the inter-event series. Clear clusters in the correlation between α and the economic variables are observed for the trade-to-trade durations, although some of them are not significant, while widely dispersed values are observed for the cancel-to-cancel and order-to-order series. Very generally, this whole analysis points to a true complexity in the order book dynamics, unveiling general long-range autocorrelation in the duration series (α > 0.5), but of a complex nature that varies with the time scale (crossovers).
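One simple way to quantify such an association is to correlate the daily scaling exponents with a daily economic variable and to assess significance by permutation. The sketch below does this with the Pearson coefficient; the function name, the use of Pearson correlation and the permutation test are all assumptions made for illustration, since the correlation measure and test used in Publication III are not restated here.

```python
import numpy as np

def correlation_with_permutation_test(alpha, variable, n_permutations=2000, seed=0):
    """Pearson correlation between two daily series and a permutation p-value.

    `alpha` holds one scaling exponent per day and `variable` the matching
    daily economic variable (e.g. volatility); both are assumed inputs.
    """
    alpha = np.asarray(alpha, dtype=float)
    variable = np.asarray(variable, dtype=float)
    observed = np.corrcoef(alpha, variable)[0, 1]
    rng = np.random.default_rng(seed)
    exceed = 0
    for _ in range(n_permutations):
        # Shuffling breaks any day-by-day pairing between the two series.
        shuffled = rng.permutation(variable)
        if abs(np.corrcoef(alpha, shuffled)[0, 1]) >= abs(observed):
            exceed += 1
    # Add-one correction keeps the estimated p-value strictly positive.
    p_value = (exceed + 1) / (n_permutations + 1)
    return observed, p_value
```

Applied stock by stock and series by series, a routine of this kind yields one correlation and one p-value per combination, which is the sort of collection in which clusters or dispersed values across stocks can then be inspected.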

RQ 2 is answered by analyzing the scaling exponent for different sets of duration series, not limited to a single stock nor to a certain side of the book. Findings from Publication III show that long-range autocorrelations are ubiquitous in all the time series under investigation and exhibit a generally remarkable homogeneity. Furthermore, the availability of long-span high-frequency data makes it possible to unveil up to three different scaling exponents applicable at different sampling frequencies.

RQ 3 deals with the associations between economic variables and long-range autocorrelations. Publication III analyzes the association for three duration series and six economic variables, finding it to be of a very heterogeneous nature across the variables and, for the order and cancellation series, across the stocks too. Indeed, for certain variables and time series the association is significant, largely positive, and consistent across the different factors, while for others the opposite holds.
