The Bull-Bear Trading Engine
15. Standard Deviation: The standard deviation of the change in closing prices for a specified period.
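As an illustration of this indicator, the following is a minimal sketch rather than the implementation used in this work. It assumes that "change" means the simple difference between consecutive closes and that the sample (n - 1) standard deviation is intended; the text specifies neither. The function name and the example prices are hypothetical.

```python
from statistics import stdev


def rolling_change_stdev(closes, period):
    """Standard deviation of close-to-close changes over a rolling window.

    Assumptions (not stated in the text): changes are simple differences,
    and the sample (n - 1) standard deviation is used.
    """
    if period < 2:
        raise ValueError("period must be at least 2 to compute a sample stdev")
    # Day-on-day change in the closing price.
    changes = [later - earlier for earlier, later in zip(closes, closes[1:])]
    # One value per day on which a full window of `period` changes exists.
    return [
        stdev(changes[end - period:end])
        for end in range(period, len(changes) + 1)
    ]


# Hypothetical closing prices, for illustration only.
example_closes = [100.0, 101.0, 103.0, 102.0, 105.0]
example_result = rolling_change_stdev(example_closes, period=3)
```

Whether price differences or percentage returns are the better input depends on the market and on how the indicator is used by the rules.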
6.10 Stationarity
6.11.2 Criticisms of the Method
There are several criticisms that could be levelled at this work which are important to explore. These fall into two main categories: criticism of the results and their interpretation, and criticism of the viability of the system in terms of management and the psychology of trading.
^ Delta is how the net present value of the trade varies with the market price. A delta-neutral trade’s net present value is consequently unaffected by price movements.
The results are not risk adjusted: This is a very important issue. Financial theory asserts that excess returns (i.e. returns greater than those made by an index such as the FTSE 100 or S&P 500) can only be made by taking on more risk, by gearing up investments. The economists would argue that the system is really under-performing the market, but that it is geared up so that the percentage returns are greater than those made by the indices. If this were the case, then risk-adjusting the returns would expose the extent of market under-performance. This argument is misplaced, and risk adjusting the returns is a debatable proposition:
1. For this system to work depends on the market not behaving as economists believe. Therefore, the behaviour of this system must be in some sense disjoint to financial theory. It is of debatable wisdom then to attempt to apply (linear) financial theory to an area of non-linear behavioural study that must, by definition, lie outside the domain of linear econometrics. The robustness of linear models collapses when taken even slightly outside the assumptions of the problem domain - this study relies on assumptions of randomness, equilibrium pricing, investor rationality, and market efficiency being infringed, and the market price consequently having exploitable dynamics. To then suddenly assert that all the assumptions are valid again is dubious.
2. This study contributes to a growing volume of work advocating a paradigm shift to a non-linear view of economics. Arguments by analogy are often weak, but in this case can vividly illustrate the problem. Assume that the financial theory of the day operates around the motion of the planets. If someone then devised the current financial theory, it would be ludicrous to expect them to “moon adjust” their results to gain credibility. Typically, when paradigm shifts take place, the existing theory that is replaced is a special case of the new theory, and importantly - effects become visible that were invisible from the viewpoint of the previous paradigm^. Consequently, for a non-linear system, a non-linear version of risk-adjustment must be carried out.
^ For instance, Newtonian mechanics is a special case of the Theory of Relativity, but concepts such as time-dilation or gravitational lensing are invisible from within Newtonian mechanics [Cham82].
3. The most widely used measure of risk, the variance of the P&L, depends on what length scale is used. The length scale over which to find the variance must then be chosen ad hoc [Pete91]. For the variance to have meaning, the distribution of returns must be normal or Gaussian. It has been acknowledged since the 1960s [Osbo59] that the distribution of returns is fat-tailed - i.e. not normal. Mandelbrot [Mand82] asserts that if the markets display stable Paretian behaviour then their variance is infinite. Moreover, as the system only ever stops and reverses, it is always in the market, so the volatility of the P&L must be a simple function of the volatility of the market. According to the financial theorists, market volatility exhibits GARCH behaviour, where periods of persistent high volatility are followed by persistent low volatility, with random changes between the two. The variance is then an inadequate measure of risk, just as the average score of a die, 3.5, says little about any individual throw.
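The dependence of the variance on the length scale can be seen in a small sketch. It makes one loud assumption: the daily P&L is synthetic, independent Gaussian noise, which is exactly the idealisation that point 3 disputes. It is used here only to show that even in the most benign case a variance figure is meaningless until the horizon is stated. The function name, the horizons and the series are hypothetical and are not results from this work.

```python
import random
from statistics import pvariance


def pnl_variance_at_scale(daily_pnl, horizon):
    """Variance of P&L aggregated over non-overlapping blocks of `horizon` days."""
    blocks = [
        sum(daily_pnl[start:start + horizon])
        for start in range(0, len(daily_pnl) - horizon + 1, horizon)
    ]
    return pvariance(blocks)


# Synthetic daily P&L (assumption: i.i.d. Gaussian, unlike real markets).
rng = random.Random(0)
daily_pnl = [rng.gauss(0.0, 1.0) for _ in range(10_000)]

# The same P&L series gives a different "risk" at each length scale.
variances = {h: pnl_variance_at_scale(daily_pnl, h) for h in (1, 5, 20)}
for horizon, variance in variances.items():
    print(f"horizon={horizon:>2} days  variance={variance:.2f}")
```

For independent Gaussian data the variance grows roughly in proportion to the horizon. For fat-tailed or volatility-clustered data no such simple scaling need hold, which is the substance of the objection.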
The whole exercise is simply curve fitting: This is another very serious criticism, and one that is extremely difficult to counter with experimentally supported argument. The argument goes that as the system is developed, the developer runs tests, and makes modifications on the basis of the results of those tests. The first time those tests are run, the test is genuine, but thereafter, a feedback loop exists through the experimenter, and the validity of the results is progressively compromised as more and more experiments are run.
The defence against this criticism is that all the project development was done on the Eurodollar 3 month market. Only once the system was operational was data from other markets used. The P&Ls shown in Figure 6.6 were the first use of those data streams; the maturity experiment was also blind, as were the commodity experiments. This defence is not complete, as economists would then assert that all the markets are highly correlated, and so there has been an implicit use of future data once the system was complete. However, it is clear from Figure 6.6 that even if the markets are highly correlated, the system responds to them in different ways. If the system behaves differently to different but correlated markets, then either it is acting randomly or it is responding to some behavioural component of the markets that does not show up in the correlation calculations.
The system is behaving randomly: It is likely that some aspect of the behaviour of this system is effectively random, as it is being driven by a market that, at times, is probably near random. However, if the system’s behaviour is completely random then it is unlikely that a confidence level of 86% would emerge for the existence of some dependence on maturity. While this value is not so high as to place the existence of this effect beyond all doubt, it is too high to be dismissed out-of-hand as the product of a random effect.
The system operates completely systematically on gold. The probability of sustaining a series of losses as long as is shown in Figure 6.11 as the result of random trading is very low. It must be emphasised that this is not the system simply losing equity through commissions, bid-ask spread and slippage - these factors contribute approximately 10% of the total losses made. There are simply practically no winning trades, and this demonstrates that the system is, in some sense, operating consistently.
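The scale of "very low" can be illustrated with a binomial tail. This is a rough sketch under stated assumptions, not a calculation from the data behind Figure 6.11. It assumes that a randomly trading system wins each trade independently with probability 0.5 before costs, and the trade counts below are hypothetical placeholders, since the actual number of trades is not given here.

```python
from math import comb


def prob_at_most_k_wins(n_trades, k_wins, p_win=0.5):
    """Probability of k_wins or fewer winning trades out of n_trades.

    Assumption: trades are independent, each winning with probability p_win.
    """
    return sum(
        comb(n_trades, wins) * p_win**wins * (1 - p_win) ** (n_trades - wins)
        for wins in range(k_wins + 1)
    )


# Hypothetical figures: 40 trades, of which at most 2 are winners.
p_random = prob_at_most_k_wins(n_trades=40, k_wins=2)
print(f"P(at most 2 wins in 40 random trades) = {p_random:.3g}")
```

Under these assumptions a long run with almost no winners is far more plausibly the result of a consistent, if wrong, model than of chance.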
The results are only one sample path: This point is closely related to the previous criticism - if a system has a stochastic component to its behaviour, then it will clearly take some time to assess its behaviour. It is true that this is one sample path, but there are some points that are important to recognise:
1. No out-sample test conducted as part of this investigation uses less than 1100 days of market data - this constitutes simulated trading on 12 markets over 3 time frames over a period in excess of 4 years. Through an appeal to reality, most managers would feel that they would be in a position to make an informed assessment of a trader’s performance after this time.
2. The system has consistent behaviour: The dynamics are similar for the closed cumulative profit and loss curves for the initial investigation and the maturity experiment (Figures 6.7, 6.10) despite the fact that these two P&Ls come from experiments whose target markets only overlap to a limited extent.
3. To a very real degree, this is a criticism that is independent of the quality of the work: once data from a market has been used, there is no way to generate a completely independent sample path for that period. This is more a criticism of the fundamental methodology of analysing time-series and attempting to find exploitable determinism that is precluded by financial theory, than a criticism specifically aimed at this work.
The system is untradable: The validity of this statement depends entirely on one’s risk tolerance and what one’s definition of success is. Although the concept of risk as an analytical tool was criticised earlier in this chapter, it is useful as an intuitive notion of the likelihood of simply losing money. For how long and to what degree can the dealer, and more importantly, his management, handle holding a losing position? As for success, what is the benchmark? Until the technical analysts agree with the financial theorists about what the system has to do to be a success, this is not a question that can easily be answered, short of “put us in the top 5% of hedge funds”.
A case has been made that the system should be used on the longer maturity markets. Similarly, it appears that the system trades the US markets consistently better than the other markets that have been tested. To test this hypothesis thoroughly will unfortunately take around 4 years for the market to generate new, untainted data.
6.12 Summary
The main points of this chapter were as follows:
• The problem was to develop a genetic algorithm based system that could find and exploit tradable behaviour in interest rate futures markets.
• The data available was daily open, high, low and closing prices for a range of European, Japanese and American government bond futures markets. Additional data such as trading volumes and open interest were also available.
• It appears that the system works more effectively on the longer maturity markets. It is difficult to account for this effect from a conventional econometric viewpoint.
• The system was unable to form reliable models for either of the commodity futures markets tested.
• From an analysis of the rules, it appears that price movements have a tradable degree of persistence.