6.2 Discussion
6.2.2 Goal Evaluation
This section revisits the goals set out in section 1.3 and discuss to what degree they are fulfilled.
-80%
2006-12-04 2007-12-04 2008-12-03 2009-12-03 2010-12-03 2011-12-03 2012-12-02
Cumulative Return
Time EEF Trader 2007 - 2012
Benchmark EEF
(a) The experimental test run on OBX from January 2007 to February 2012
-40 %
2001-12-05 2003-12-05 2005-12-04 2007-12-04 2009-12-03 2011-12-03 2013-12-02
Cumulative Return
Time EEF Trader 2003 - 2012
Benchmark EEF
(b) The experimental test run on Dow from January 2003 to January 2012
-80 %
2007-12-04 2008-03-03 2008-06-01 2008-08-30 2008-11-28 2009-02-26
Cumulative Return
Time EEF Trader 2008 - 2009
Benchmark EEF
(c) The financial crisis on OBX. January 2008 to January 2009
2007-12-04 2008-03-03 2008-06-01 2008-08-30 2008-11-28 2009-02-26
Cumulative Return
Time EEF Trader 2008 - 2009
Benchmark EEF
(d) The financial crisis on Dow. January 2008 to January 2009
Figure 6.1: Presentation of EFF runs in different scenarios against the benchmark.
System Architecture
Seamless Module Exchange. The system and its network of nodes ensure seam-less exchange of more than just modules, but of any internal part of the trading system. To ease the exchange of large parts the trading module has been intro-duced. This contains a set of nodes and allows for exchange of larger chunks of functionality with ease. The goal stated that these changes should be available to do post-compilation. This aspect has been disregarded. It was found that there were not worth the effort to make a decent enough interface to support this feature, thus making it easier and more efficient to leave such configuration in the code.
The modules and nodes are still easy to put together, but requires compilation afterwards. To retain the ability to test many different approaches at one run the system facilitates for the creation of several trading systems at once, and to run them in parallel from the interface.
134 Discussion
Scalability and Parallelization. The nodes operate asynchronously and on sep-arate threads. This allows for parallelization, and the system utilizes all available processor cores it is given. This goes some way towards scalability, but it still limits the system to run on one machine. Through a system of job scheduling and with more general communication interfaces between modules or nodes, the system could be spread on a cluster, but this is far outside of the scope of the four month deadline.
Implemented in Less Than Four Months. The complete system was imple-mented in a bit more than four months. In man-hours the implementation time did exceed the estimates. An estimated 1500 man hours were used in the coding phase of this project and 25441 KLOC2 were produced. Tuning of the parameters in the trading systems is not considered as implementation time. This system is scaled a bit down, but in retrospect it should have been scaled much further down to allow for more time to design and test different trading systems.
Technical
Time Series Analysis Methods. No machine learning technique showed any predictive ability on the Dow experiments, though this can be attributed to the efficiency of this market. On OBX however several performed quite well. SVR and SVM performed best as can be seen from the results before transaction costs.
The fact that they committed too many trades and incurred heavy transaction costs is irrelevant of their predictive ability. The ANN techniques seem to have some predictive ability though this is much smaller. In all cases it is clear that it requires substantial inefficiencies in the market and as comparison against RSI and MACD shows does it not give better performance than naive technical analysis techniques. On a side note GRNN and PNN was found to be excruciatingly slow, especially on datasets on any size. This makes them harder to test and tune and will limit their use in any live system. Autoregression showed consistently poor performance.
System Training Scheme. Evolution is used as a meta-parameter optimizer, but for many simpler systems (RSI and MACD and FTL) this is the only training they have. In this system the evolution is only run at the start. Many experiments show that the system starts out well, but then degrade in performance the further it gets from the training period. Future plans for the system is to run evolution on a trailing window in the same way as lower level training. This way the systems are continuously tweaked for optimal performance at the cost of some computational performance. There might also be some issues with overfitting and the evolution process must have some mechanic to stop it before this happens in too large a
21 KLOC = 1000 lines of code
degree. All the classifiers have their own learning mechanisms. Resilient Prop-agation gives the best general results for MLPNN training and is used to train both the feed-forward and recurrent neural networks. For the GRNN and PNN the conjugate gradient algorithm is used to find the optimal sigma value, while an intelligent modified grid search were used to find the parameters of SVM [41; 61].
Little focus were given to test different SVM and GRNN training methods, rather expert advice were followed.
Robustness and Adaptivity. The training and test datasets are quite different periods. In the Dow case training is bearish, while the test has two bull periods with the financial crisis in the middle. With OBX it trains in a bull period, then runs on the financial crisis and a slight bull period before and after. In the OBX experiments the only systems with decent performance as those that are evolved to buy more or less the whole index and stay in it. This indicates little or no adaptability. In the OBX experiments however some traders like EEF manages to deliver strong results throughout the time period, many of these does however have very high variances. Most are also correlated with the market as the beta values indicate. There are some exceptions Like EEF, RSI and SVM and these can be said to be quite adaptable.
As for robustness any trader that delivers consistent return without too high a variance could be said to be robust as the input data from any financial market can be said to be noisy at best. Some like the ANNs does manage this, and these are the most robust trading systems in this study.
Financial
Input Data. A set of relevant input data had been identified, see section 2.2 for an overview. Many of these were however not tested in the system due to lack of the required data. Tests with fundamental data were rejected from the scope of this project and the focus was set on the mostly technical data that can be created from pure price data. The framework is designed to easily integrate a fundamental data source. One possible conclusion from the tests is that technical data is not enough to identify anomalies that can be capitalized on in a market with transaction costs of the level simulated in these experiments.
Risk and Return. Risk is not a direct part of the fitness function for evolution.
There is a possibility of constraining risk, though this was not activated during the experiments. The fitness function uses only average daily returns. The evolu-tionary training period is 3 years in the Dow Jones case. The results show that such an evolutionary approach is able to incorporate risk. This can be seen most clearly in the example of the ”follow-the-leader”(FTL) trader. This chooses how many of the last winners to pick. Even in the case were there are no transaction
136 Discussion
costs FTL evolves to pick as many stocks as it is allowed, clearly the most risk averse selection, even though it is only evaluated on return.
One could even argue that a bit too much risk consciousness is present in many trading systems, possibly due to the fact that the training period for Dow is quite bearish. In the experiment with transaction costs on Dow only one trader outperforms the benchmark on returns, while four outperforms it on variance. In the OBX experiments with a more bullish training period traders were less risk averse, though FTL still evolves to pick n − 1 of the stocks, which is its upper limit.
It is evident that the system does not need an explicit risk utility function to incorporate risk. Though to appease potential investors the system does have functionality to set a constraint on variance.
Portfolio Evaluation. The framework has several different portfolio evaluation schemes, from the simple approach of picking the n winners in a historical period, to providing a mean-variance optimized portfolio. Early experiments found that a uniform binary vector discrete selection of a set of well performing stocks forming a portfolio is generally better for learning than a more complex non-uniform portfolio scaled on risk or return.