Conclusion - Data-Snooping, Technical Trading Rule Performance, and the Bootstrap

This paper applies a new methodology that allows researchers to control for data-snooping biases to compute the statistical significance of investment performance while accounting for the dependencies resulting from in-

Table VI

Technical Trading Rule Summary Statistics: Out-of-Sample Dow Jones Industrial Average (1987–1996) and the Standard and Poor’s 500 Futures (1984–1996) with the Mean Return Criterion

This table provides summary statistics, White’s Reality Check p-value, and the nominal p-value for the best-performing rule, chosen with respect to the mean return criterion, and the recursive cumulative wealth rule, for both the out-of-sample Dow Jones Industrial Average_{~1987–1996! and} the Standard and Poor’s 500 Futures_{~1984–1996!. The nominal p-value is that which results from applying the Reality Check methodology to} the best trading rule only, thereby ignoring the effects of the data-snooping. The cumulative wealth trading rule bases today’s signal on the best trading rule as of yesterday, according to total accumulated wealth. The recursive cumulative wealth rule is not the best trading rule ex post, thus the Reality Check p-value does not apply.

Dow Jones Industrial Average S&P 500 Futures

Summary Statistics Best Rule

Cumulative

Wealth Rule Best Rule

Cumulative Wealth Rule

Annualized average return 14.4% 2.8% 9.4% 25.5%

Nominal p-value 0.000 0.322 0.042 0.895

White’s Reality Check p-value 0.341 n0a 0.908 n0a

Total number of trades 6 676 43 210

Number of winning trades 4 234 22 56

Number of losing trades 2 442 21 154

Average number of days per trade 411.7 3.7 76.5 14.3

Average return per trade 34.38% 0.04% 3.00% 20.33%

Number of long trades 4 338 22 104

Number of long winning trades 3 140 12 31

Number of long losing trades 1 198 10 73

Average number of days per long trade 598.0 4.3 98.6 17.1

Average return per long trade 48.16% 0.24% 5.76% 0.16%

Number of short trades 2 338 21 106

Number of short winning trades 1 94 10 25

Number of short losing trades 1 244 11 81

Average number of days per short trade 39.0 3.2 53.4 11.6

Average return per short trade 6.82% _20.16% 0.12% _20.82%

1682

The

Journal

vestigating several investment rules. We believe that this methodology deserves to be widely used in finance: There is an obvious focus in finance on infor- mation and decision rules that can be used to predict financial returns, but it is often forgotten that this predictability may be the result of a large number of researchers’ joint search for a successful model specification with predictive power. Many researchers, such as Merton ~1987!, have called for a remedy to control for data-snooping biases, and the methodology in this paper provides just such a tool. It summarizes in a single statistic the significance of the best-performing model after accounting for data-snooping.

Aside from being important in assessing the inf luence of data-snooping bias in performance measurement studies, the approach of this paper also has substantial value to investors who are searching for successful investment strategies. Suppose that, after experimenting with a large number of decision rules, an investor comes up with what appears to be a highly successful rule that outperforms the benchmark strategy. The investor is then left with the task of assessing just how much of the performance is a result of data-snooping, and how much is due to genuine superior performance. In the presence of complicated dependencies across the rules being evaluated, this is a very difficult question to answer, and only a bootstrap methodology such as the one offered in this paper appears to be feasible. Furthermore, since the investor would know the exact identity of the universe of investment rules from which the optimal rule is drawn, the approach of this paper is eminently suited for such an assessment.

Our analysis allows us to reassess previous results on the performance of technical trading rules. We find that the results of BLL appear to be robust to data-snooping, and indeed there are trading rules that perform even bet- ter than the ones considered by BLL. Hence their result that the best performing technical trading rule is capable of generating profits when applied to the DJIA stands up to inspection for data-snooping effects. This finding is valid in all four subperiods considered by BLL. However, we also find that the superior performance of the best technical trading rule is not repeated in the out-of-sample experiment covering the 10-year period 1987–1996. In this sample the results are completely reversed and the best-performing trading rule is not even statistically significant at standard critical levels. This result is also borne out when data on a more readily tradable futures contract on the S&P 500 index are considered: Again there is no evidence that any trading rule outperforms over the sample period.

Three conclusions appear to be possible from these findings. First, the out-of-sample results may simply not be representative, possibly because of the unusually large one-day movement occurring on October 19, 1987. Al- though this argument can never be rejected outright, we want to emphasize that the out-of-sample trading period is rather long~3,291 days!, which would seem to lend support to the claim that we can evaluate the trading rules’ performance reasonably precisely in the postsample period. Also, the out-of- sample results are robust to whether or not data on 1987 are included in the sample. In a finite sample, very large movements in stock prices such as Data-Snooping and Technical Trading Rule Performance 1683

those occurring on October 19, 1987 would, if anything, actually tend to improve the performance of the best trading rule because some of the rules inevitably would have been short in the index on that date and hence would have earned returns of 22 percent in a single day.25

Second, the 7,846 trading rules that we consider may of course have been selected from an even larger universe of rules. If this is the case, then the p-value adjusted for data-snooping is biased toward zero under the assump- tion that the included rules are also the ones that performed quite well during the historical sample period. This explanation is a logical possibility, but the experiments reported in this paper also show that it can have merit only as long as two conditions are both satisfied: The omitted trading rules cannot improve substantially on the best-performing trading rule drawn from the current universe, and the omitted trading rules should generate payoffs that are largely orthogonal to the payoffs of the included trading rules so that they will increase the effective span. We think that we have been suf- ficiently careful in choosing the number and types of trading rules included in the adopted universe so that it is unlikely that these conditions are si- multaneously satisfied.

Third, it is possible that, historically, the best technical trading rule did indeed produce superior performance, but that, more recently, the markets have become more eff icient and hence such opportunities have disap- peared.26_{This conclusion certainly seems to match up well with the cheaper} computing power, the lower transaction costs and increased liquidity in the stock market that may have helped to remove possible short-term patterns in stock returns.

Appendix A. Trading Rule Parameters

This appendix describes the parameterizations of the 7,846 trading rules used to generate the full universe of rules under consideration.

A.1. Filter Rules

x5 change in security price ~x 3 price! required to initiate a position; y5 change in security price ~ y 3 price! required to liquidate a position; e5 used for an alternative definition of extrema where a low ~high! can be

defined as the most recent closing price that is less~greater! than the n previous closing prices;

25_{Indeed, as shown in Table III, the best trading rule from the BLL universe under the} mean return criterion generates a mean return of 8.6 percent in the period from January 1987 through December 1996. However, the best rule~200-day variable moving average with a 1 percent band! from the BLL universe in the period January 1988 through December 1996 generates a mean return of only 5.6 percent. Furthermore, the large universe provides a best rule during subperiod 5 that generates a mean return of 14.4 percent, whereas the best rule ~20-day filter rule of 0.10! during the period beginning in 1988 provides a mean return of only 13.9 percent.

26_Ready_{~1997! also reports a decline in the ability of technical trading rules to predict daily} returns over the period 1990–1995.

c5 number of days a position is held, ignoring all other signals during that time; x5 0.005, 0.01, 0.015, 0.02, 0.025, 0.03, 0.035, 0.04, 0.045, 0.05, 0.06, 0.07, 0.08, 0.09, 0.1, 0.12, 0.14, 0.16, 0.18, 0.2, 0.25, 0.3, 0.4, 0.5@24 values#; y5 0.005, 0.01, 0.015, 0.02, 0.025, 0.03, 0.04, 0.05, 0.075, 0.1, 0.15, 0.2 @12 values#; e5 1, 2, 3, 4, 5, 10, 15, 20 @8 values#; c5 5, 10, 25, 50 @4 values#.

Noting that y must be less than x, there are 185 x-y combinations. Number of filter rules5 x 1 ~x * e! 1 ~x * c! 1 ~x-y combinations!

5 24 1 192 1 96 1 185 5 497. A.2. Moving Averages

n5 number of days in a moving average; m5 number of fast-slow combinations of n;

b5 fixed band multiplicative value;

d5 number of days for the time delay filter;

c5 number of days a position is held, ignoring all other signals during that time; n5 2, 5, 10, 15, 20, 25, 30, 40, 50, 75, 100, 125, 150, 200, 250 @15 values#; m5

(

i51 n21 i5 105; b5 0.001, 0.005, 0.01, 0.015, 0.02, 0.03, 0.04, 0.05 @8 values#; d5 2, 3, 4, 5 @4 values#; c5 5, 10, 25, 50 @4 values#.

Note that a 1 percent band filter and a 10-day holding period are applied to all combinations of moving averages with a fast MA of one, two, and five days and a slow MA of 50, 150, and 200 days. This addition of nine rules allows our universe of trading rules to encompass all of BLL’s trading rules.

Number of rules5 n 1 m 1 ~b * ~n 1 m!! 1 ~d * ~n 1 m!! 1 ~c * ~n 1 m!! 1 9 5 15 1 105 1 960 1 480 1 480 1 9 5 2,049.

A.3. Support and Resistance

n5 number of days in the support and resistance range;

e5 used for an alternative definition of extrema where a low ~high! can be defined as the most recent closing price that is less~greater! than the n previous closing prices;

b5 fixed band multiplicative value;

d5 number of days for the time delay filter;

c5 number of days a position is held, ignoring all other signals during that time;

n5 5, 10, 15, 20, 25, 50, 100, 150, 200, 250 @10 values#;

e5 2, 3, 4, 5, 10, 20, 25, 50, 100, 200 @10 values#;

b5 0.001, 0.005, 0.01, 0.015, 0.02, 0.03, 0.04, 0.05 @8 values#; d5 2, 3, 4, 5 @4 values#;

c5 5, 10, 25, 50 @4 values#.

Number of rules5 @~1 1 c! * ~n 1 e!# 1 @~b * ~n 1 e!! * ~1 1 c!# 1 @d * c * ~n 1 e!#

5 100 1 800 1 320 5 1,220. A.4. Channel Breakouts

n5 number of days for the channel;

x5 difference between the high price and the low price ~x 3 high price! required to form a channel;

b5 fixed band multiplicative value;

c5 number of days a position is held, ignoring all other signals during that time;

n5 5, 10, 15, 20, 25, 50, 100, 150, 200, 250 @10 values#; x5 0.005, 0.01, 0.02, 0.03, 0.05, 0.075, 0.10, 0.15 @8 values#; b5 0.001, 0.005, 0.01, 0.015, 0.02, 0.03, 0.04, 0.05 @8 values#;

c5 5, 10, 25, 50 @4 values#.

Noting that b must be less than x, there are 43 x-b combinations. Number of rules5 ~n * x * c! 1 @n * b * ~x-b combinations!#

5 320 1 1,720 5 2,040. A.5. On-Balance Volume Averages

n5 number of days in a moving average; m5 number of fast-slow combinations of n;

b5 fixed band multiplicative value;

d5 number of days for the time delay filter;

c5 number of days a position is held, ignoring all other signals during that time; n5 2, 5, 10, 15, 20, 25, 30, 40, 50, 75, 100, 125, 150, 200, 250 @15 values#; m5

(

i51 n21 i5 105; b5 0.001, 0.005, 0.01, 0.015, 0.02, 0.03, 0.04, 0.05 @8 values#; d5 2, 3, 4, 5 @4 values#; c5 5, 10, 25, 50 @4 values#. Number of rules5 n 1 m 1 ~b * ~n 1 m!! 1 ~d * ~n 1 m!! 1 ~c * ~n 1 m!! 5 15 1 105 1 960 1 480 1 480 5 2,040.

Appendix B. Reality Check Technical Results

For the convenience of the reader, we replicate the main results of White ~1997! and briefly interpret these. In what follows, the notation corresponds to that of the text unless otherwise noted.

Let P denote the probability measure governing the behavior of the time series $Z_t%. Also, n denotes convergence in distribution, and p& denotes convergence in probability. In White’s~1999! notation, used here, f*5 f ~Z, b*! whereb*5 plimn ZbT. As no parameters are estimated in our application, we

have written E~ f ! in the text.

PROPOSITION2.1: Suppose that n102~ Nf 2 E~ f*!! n N~0,V! for V positive semi-

definite. ~a! If E~ fk*!, . 0 for some 1 # k # l, then for any 0 # c , E~ fk*!,

P@ Nf_k . c# r 1 as T r `. ~b! If l . 1 and E~ f1*! . E~ fk*!, for all k 5 2,..., l,

then P@ Nf₁ . Nf_k for all k5 2,..., l# r 1 as T r `.

Part~a! says that if some model ~e.g., the best model! beats the benchmark, then this is eventually revealed by a positive estimated relative performance. Part ~b! says that the best model eventually has the best estimated performance relative to the benchmark, with probability approaching one. The next result provides the basis for hypothesis tests of the null of no predictive superiority over the benchmark, based on the predictive model selection criterion.

PROPOSITION2.2: Suppose that n102~ Nf 2 E~ f*!! n N~0,V! for V positive semi-

definite. Then as t r ` max k51, . . . , ln 102_{$ Nf} k2 E~ fk*!% n Vl[ max k51, . . . , l$Zk% ~B1! and min k51, . . . , ln 102_{$ Nf} k2 E~ fk*!% n Wl[ min k51, . . . , l$Zk%, ~B2!

where Z is an l 3 1 vector with components Zk, k 5 1,..., l, distributed as

N~0,V!.

COROLLARY2.4: Under the conditions of Theorem 2.3 of White (1999), we have

that as t r ` r

~

L@ QVl*6Z1, . . . , ZT1t#, L

@

_k_{51, . . . , l}max n102~ Nfk2 E~ fk*!!

#!

p &0 ~B3! and r

~

L@ RWl*6Z1, . . . , ZT1t#, L

@

_k_{51, . . . , l}min n102~ Nfk2 E~ fk*!!

#!

p &0, ~B4! where QVl*[ max k51, . . . , ln 102_{~ Nf} k*2 Nfk!, ~B5! RWl*[ min k51, . . . , ln 102_{~ Nf} k*2 Nfk!, ~B6!

L denotes the probability law of the indicated random vector, and r is any

distance metric on the space of probability laws.

Thus, by comparing QVlto the quantiles of a large sample of realizations of QVl*,

we can compute a p-value appropriate for testing H0: maxk51, . . . , lE~ fk*! # 0,

that is, that the best model has no predictive superiority relative to the benchmark. White~1999! calls this the bootstrap “Reality Check p-value.”

The level of the test can be driven to zero at the same time that the power approaches one according to the next result, as the test statistic diverges at a rate n102 under the alternative.

PROPOSITION 2.5: Suppose that conditions A.1(a) or A.1(b) of White’s (1997)

appendix hold, and suppose that E~ f1*! . 0 and E~ f1*! . E~ fk*!, for all k 5

2, . . . , l. Then for any 0, c , E~ f1*!, P @ QVl . n

102_c_{# r 1 as T r `.}

COROLLARY 2.6: Let g : U r R~U , Rm! be continuously differentiable such

that the Jacobian of g, Dg, has full row rank one at E@hk*# [ U, k 5 0,..., l.

Suppose either: (i) Assumptions A9 and B of White (1999) hold and there are no estimated parameters; or (ii) Assumptions A9, B, and C of White (1999) hold, and either: (a) H 5 0 and qn 5 cn2g for constants c . 0, 0 , g , 1

such that ~ng1e0R! log log R r 0 as T r ` for some e . 0; or ~b! ~n0R! log log Rr 0 as T r `. Then for Nf*computed using Politis and Romano’s (1994) stationary bootstrap, as Tr `

r~L@n102_{~ Nf}*_{2 Nf !6Z}

1, . . . , ZT1t#, L@n102~ Nf 2 m*!#!

&&0, ~B7! where r and L@{# are as previously defined, H is the Jacobian of h, Nf [ ~ Nf1, . . . , Nfl!', Nfk [ g~ Nhk*! 2 g~ Nh0*!, m* [ ~ m1*, . . . ,ml*!, and mk* [ g~E @hk*#! 2

g~E @h0*#! and qnis defined in Appendix C.

Maintaining the original definitions of QVl* and RWl* in terms of Nfkand Nfk*, we

have the following corollary.

COROLLARY2.7: Under the conditions of Corollary 2.6, we have that as tr `

~

L@ QVl*6Z1, . . . , ZT1t#, L

@

_k_{51, . . . , l}max n102~ Nfk2 mk*!

#!

p &&0 ~B8! and r

~

L@ RWl*6Z1, . . . , ZT1t#, L

@

_k_{51, . . . , l}min n102~ Nfk2 mk*!

#!

p &&0. ~B9!

The test is performed by imposing the element of the null least favorable to the alternative—that is, mk 5 0, k 5 1,..., l; thus the Reality Check

p-value is obtained by comparing QVl to the Reality Check order statistics,

obtained as described in Section II. As before, the test statistic diverges to infinity at the rate n102 under the alternative.

PROPOSITION 2.8: Let Nf, m*, and V be as defined above. Suppose

n102_{~ Nf}

1 2 m1*! n N~0, v11! for v11 $ 0, and suppose that m1* . 0 and, if

l. 1, m1* . mk* for all k5 2,..., l. Then for any 0 , c , m1*, P@ QVl. n

102_c_{# r 1}

as T r `.

Note that it is reasonable to expect the conditions required for the above results to hold for the data we are examining. As pointed out by BLL, al- though stock prices do not seem to be drawn from a stationary distribution, the compounded daily returns ~log-differenced prices! can plausibly be as- sumed to satisfy the stationarity and dependence conditions sufficient for the bootstrap to yield valid results. It is possible to imagine time series for returns with highly persistent dependencies in the higher order moments that might violate the mixing conditions of White ~1999!, but the standard models for stock returns do not exhibit such persistence.

Appendix C. The Stationary Bootstrap

Politis and Romano~1994! present a resampling technique, called the stationary bootstrap, that can be applied to a strictly stationary and weakly dependent time series to generate a pseudo-time series that is stationary. Here we describe our application of the stationary bootstrap and the algorithm used to generate the pseudo-time series of returns. The notation corresponds to that of the text.

We use a resampled version of Nf 5 n21

(

tT5Rft11 to deliver the Reality

Check p-value for testing the hypothesis that the selected ~best! model has no predictive superiority over the benchmark model. The resampled statistic is computed as Nf*_{5 n}21

(

t5R T ft11* , ~C1! ft11* [ f ~Zu~t!11,b!, t5 R,...,T, ~C2!

and u~t! is a random index chosen according to the Politis and Romano sta- tionary bootstrap algorithm. For this, we choose a priori a “smoothing pa- rameter” q 5 qn, 0 , qn # 1, qn r 0, nqn r ` as n r `, and proceed as

follows:

1. Set t5 R. Draw u~t! 5 u~R! at random, independently and uniformly

In document Data-Snooping, Technical Trading Rule Performance, and the Bootstrap (Page 32-44)