Reducing Active Return Variance by Increasing Betting Frequency
Newfound Research LLC February 2014
For more information about Newfound Research call us at +1-617-531-9773, visit us at www.thinknewfound.com or e-mail us at [email protected]
Abstract
In this paper we assume a framework where N-day forecasts result in active allocations that are held over N days before new active allocations are
generated. To increase betting frequency, overlapping forecast periods must be utilized, which introduces potential correlation between active allocations. We explore the implications of this correlation on bet-size positioning and find that the circulant nature of the correlation matrix guarantees that the minimum variance portfolio is constructed through an equal-weighting scheme.
Therefore, to minimize active return variance, we find that we should generate a new portfolio daily and invest 1/Nth of our wealth in that portfolio.
Introduction
Active managers seek to take off-benchmark bets (“active allocations”) to create consistent positive-expectancy returns. The difference between the manager’s returns and the benchmark’s returns is the manager’s “active return.” For a manager to be successful over the long run, positive expectancy in active returns is mandatory; in the short run, the variance of active returns can dominate performance. Managing active return variance, therefore, is critical.
In this paper we will take a simplified view that active return variance is driven entirely by variance in the accuracy of our investment models; we classify an active allocation recommended by the model as either accurate or inaccurate.
Beyond increasing model accuracy, to reduce active return variance we seek to exploit the law of large numbers, which “guarantees” stable long-term results for the averages of random events. The more frequently we bet, the more our results will tend towards our predictable long-term average.
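As a minimal illustration of this effect, the sketch below simulates independent bets that are each correct with probability p and estimates the standard deviation of the realized hit rate as the number of bets grows. The function names, the seed, the trial count and the value p = 0.6 are illustrative assumptions, and the bets here are fully independent, which is the idealized case rather than the overlapping case studied later.

```python
import numpy as np


def hit_rate_std(p, n_bets, n_trials=20000, seed=0):
    """Simulated standard deviation of the average accuracy over n_bets independent bets."""
    rng = np.random.default_rng(seed)
    # Each trial counts the correct bets out of n_bets independent Bernoulli(p) bets.
    correct = rng.binomial(n_bets, p, size=n_trials)
    return float(np.std(correct / n_bets))


def theoretical_std(p, n_bets):
    """Standard deviation of the mean of n_bets independent Bernoulli(p) outcomes."""
    return float(np.sqrt(p * (1.0 - p) / n_bets))


if __name__ == "__main__":
    # Assumed bet counts: yearly, monthly, weekly and daily bets over one year.
    for n in (1, 12, 52, 252):
        print(n, hit_rate_std(0.6, n), theoretical_std(0.6, n))
```

The simulated values should track sqrt(p(1-p)/n), so dispersion around the long-run accuracy shrinks with the square root of the number of independent bets.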
Setup
Let's begin with the assumption that we have a model that provides active allocations that we buy and hold for one year. To follow this strategy, we run the model every December 31st, invest in the proposed allocations, and sit tight for the rest of the year. While we believe that the active returns we generate have positive expectancy, they do have non-zero variance. In other words, over the long run, we expect to add value – but our active returns can potentially move into negative territory in the short-run.
If we had an infinite horizon, this method would be fine: by the law of large numbers, volatility, as a percentage of the expected excess return, would decrease.
Unfortunately, we don't have an infinite horizon to invest over. If we hit a low probability "unlucky streak" with our model, we've lost several years of investing returns we likely will never be able to make up. Furthermore, if we make our strategy available to outside investors, each investor will have a unique active return volatility to expectancy ratio based on when they invested. Therefore, we
have to take every step possible to minimize the variance of our active returns such that the positive expectancy of our returns can dominate the results more quickly.
Assuming we cannot improve the accuracy of our model, to reduce our variance, we should bet more frequently. Ideally, we want to take more
completely independent bets. Modern portfolio theory solves this problem by diversifying assets across different securities.
In our model, the only way to take more bets is to bet more frequently.
The question, then, is how much more frequently? Does it help, for example, if we rebalance December 31st and January 31st? We have to remember that the law of large numbers is applied to independent bets. Placing a second bet in January may not help much: the period we are forecasting and holding overlaps by 11 months, and therefore we would expect that the accuracy of our model would be highly correlated for the periods.
Auto-correlation in Model Accuracy
Consider two forecasts whose periods overlap. Let A be the portion of the first forecast period that precedes the overlap, B the overlapping sub-period shared by both forecasts, and C the portion of the second forecast period that follows the overlap. Here we introduce the notation “>>” for dominance: if A >> B, then the sign of the return over the combined period [A, B] is determined by the sign of the return over A. In other words, A >> B if and only if the absolute value of the log return over A is greater than the absolute value of the log return over B.
Let s0 and s1 denote the accuracy of the first and second signals, respectively. We assume that the probability of our model providing a correct signal (value of 1) is p and an incorrect signal (value of 0) is (1-p) (the choice of 0 and 1 for signal values is convenient and entails no loss of generality).
Furthermore, we assume that model accuracy is determined by the sub-period that dominates, as introduced above. We can then construct a table of joint probabilities of the accuracy of our two signals conditional on the dominance environment:
s0 / s1        0 / 0      0 / 1      1 / 0      1 / 1
A >> B >> C    (1-p)^2    p(1-p)     p(1-p)     p^2
A >> B << C    (1-p)^2    p(1-p)     p(1-p)     p^2
A << B << C    (1-p)^2    p(1-p)     p(1-p)     p^2
A << B >> C    (1-p)      0          0          p
s0 * s1        0          0          0          1
To calculate the correlation between model accuracy s0 and s1, we first need the covariance, cov[s0,s1] = E[s0 * s1] - E[s0] * E[s1]. The table above defines our probability space once we have the probability of each of the dominance relationships.
We would expect each of these dominance probabilities to be determined by the lengths of A, B and C relative to one another. For example, if A is 1 month, B is 11 months, and C is 1 month, then we would intuitively expect A << B >> C to have a higher probability, because the volatility of returns increases with the square root of time.
Using our conditional probability table, we can calculate E[s0*s1]. Since s0*s1 is 0 in all cases except when s0=1 and s1=1, we find that E[s0*s1] = P(A >> B >> C)*p^2 + P(A >> B << C)*p^2 + P(A << B << C)*p^2 + P(A << B >> C)*p.
Therefore, we have:
$$\rho(s_0, s_1) = \frac{(w_1 + w_2 + w_3)\,p^2 + w_4\,p - p^2}{p(1 - p)}$$
where w1 = P(A >> B >> C), w2= P(A >> B << C), w3= P(A << B << C) and w4= P(A << B >> C). Knowing that w1+w2+w3+w4=1, we can collect terms and ultimately end up with:
$$\rho(s_0, s_1) = w_4$$
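The collection of terms can be checked with exact arithmetic. The sketch below evaluates the correlation from the joint-probability table for arbitrary dominance probabilities and compares it with w4; the function name and the example probabilities are hypothetical and chosen only so that they sum to 1.

```python
from fractions import Fraction


def accuracy_correlation(w1, w2, w3, w4, p):
    """Correlation of s0 and s1 implied by the joint-probability table."""
    # E[s0 * s1]: only the (s0, s1) = (1, 1) column contributes.
    e_s0s1 = (w1 + w2 + w3) * p**2 + w4 * p
    # Each signal is Bernoulli(p): mean p and variance p(1 - p).
    covariance = e_s0s1 - p * p
    return covariance / (p * (1 - p))


if __name__ == "__main__":
    # Hypothetical dominance probabilities (they sum to 1) and accuracy p.
    w = [Fraction(1, 10), Fraction(2, 10), Fraction(3, 10), Fraction(4, 10)]
    print(accuracy_correlation(*w, Fraction(3, 5)) == w[3])
```

Because the accuracy p cancels, the same value of w4 is returned for any p strictly between 0 and 1.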
Using Monte-Carlo techniques, we can estimate the P(A << B >> C) values for different monthly overlap lengths:
Overlap (months)   11       10       9        8        7        6        5        4        3        2        1
P(A << B >> C)     73.88%   62.73%   53.97%   46.47%   39.71%   33.30%   27.37%   21.65%   16.10%   10.66%   5.32%
If we limit ourselves to only investing at the end of the month, we have 12 possible times a year we can invest a portion of our portfolio. To achieve our goal of minimizing our active return volatility (which is a direct consequence of our model accuracy), we want to invest as frequently as possible in independent bets. Unfortunately, the accuracies of our signals are not independent: in fact, they are highly correlated, as the simulation results above show. So how do we best divvy up our capital? Do we invest twice a year and maximize our exposure to the least correlated bets? Do we gain anything by investing more frequently even though it will be in more highly correlated bets?
Consider our model accuracy correlation matrix:
      t=0   t=1   t=2   t=3   t=4   t=5   t=6   t=7   t=8   t=9   t=10  t=11
t=0   100%  74%   63%   54%   46%   40%   33%   27%   22%   16%   11%   5%
t=1   74%   100%  74%   63%   54%   46%   40%   33%   27%   22%   16%   11%
t=2   63%   74%   100%  74%   63%   54%   46%   40%   33%   27%   22%   16%
t=3   54%   63%   74%   100%  74%   63%   54%   46%   40%   33%   27%   22%
t=4   46%   54%   63%   74%   100%  74%   63%   54%   46%   40%   33%   27%
t=5   40%   46%   54%   63%   74%   100%  74%   63%   54%   46%   40%   33%
t=6   33%   40%   46%   54%   63%   74%   100%  74%   63%   54%   46%   40%
t=7   27%   33%   40%   46%   54%   63%   74%   100%  74%   63%   54%   46%
t=8   22%   27%   33%   40%   46%   54%   63%   74%   100%  74%   63%   54%
t=9   16%   22%   27%   33%   40%   46%   54%   63%   74%   100%  74%   63%
t=10  11%   16%   22%   27%   33%   40%   46%   54%   63%   74%   100%  74%
t=11  5%    11%   16%   22%   27%   33%   40%   46%   54%   63%   74%   100%
If we assume a constant variance for our model accuracy, then without loss of generality, we can assume our variance is 1 and assume that our correlation matrix is also our covariance matrix.
Since our stated goal is to reduce model accuracy variance, what we are seeking to do is allocate our capital in such a way that, in aggregate, model accuracy variance is minimized through diversification. In other words, what weights will give us the minimum variance portfolio? Fortunately, such a portfolio has a closed-form solution:
$$w = \frac{C^{-1}\mathbf{1}}{\mathbf{1}^{T} C^{-1}\mathbf{1}}$$
Solving this equation, therefore, should provide us with the weights that will give us the minimum variance in model accuracy.
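A minimal sketch of this closed-form solution follows, assuming NumPy is available. The function name and the two-asset example are illustrative and not part of the original analysis.

```python
import numpy as np


def min_variance_weights(cov):
    """Fully invested minimum variance weights: C^{-1} 1 / (1^T C^{-1} 1)."""
    cov = np.asarray(cov, dtype=float)
    ones = np.ones(cov.shape[0])
    # Solve C x = 1 instead of forming the inverse explicitly.
    x = np.linalg.solve(cov, ones)
    return x / (ones @ x)


if __name__ == "__main__":
    # Hypothetical example: two uncorrelated bets with variances 1 and 4.
    example_cov = np.diag([1.0, 4.0])
    w = min_variance_weights(example_cov)
    print(w, w @ example_cov @ w)
```

The resulting portfolio variance is w^T C w, which can be compared across candidate weightings of the same covariance matrix.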
An interesting consequence of this analysis is that the solution is entirely independent of the expected accuracy of the model itself.
Solving the equation for the correlation matrix above, we find that an equal-weight scheme minimizes the variance. Is this result unique to our correlation matrix? It is simulated, after all; what if we had an error in our estimation?
Fortunately, the result is due to the special design of our correlation matrix. The matrix is a symmetric circulant matrix, which is a square matrix where each row vector is rotated one element to the right relative to the preceding row vector. A special property of such a matrix is that its inverse is also symmetric circulant (see footnote 1). This property guarantees that the vector $C^{-1}\mathbf{1}$ is equal to $k\mathbf{1}$ for some constant k. The simple intuition is that any vector’s dot product against a vector of 1s is the sum of the elements of the original vector; since the matrix is circulant, each row has the same sum. Therefore, a circulant matrix multiplied by a vector of 1s results in a vector with identical elements.
Therefore, when we divide this term by $\mathbf{1}^{T} k \mathbf{1}$ we are simply normalizing, giving us an equal-weight solution.
In plain English, no matter the model accuracy or the correlation in model accuracy in overlapping periods, it will be best to spread our capital equally among all of the bets.
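The circulant argument can be checked numerically. The sketch below builds a strictly circulant matrix by rotating a first row one step to the right per row and confirms that the minimum variance weights come out equal. The table as printed does not wrap around in this sense (its first row ends at 5% while its second row begins at 74%), so the wrapped first row used here, which mirrors the rounded correlations for lags 0 through 6, is an explicit assumption of this sketch rather than the paper's data. All function names are hypothetical.

```python
import numpy as np


def circulant(first_row):
    """Circulant matrix: each row is the previous row rotated one step to the right."""
    first_row = np.asarray(first_row, dtype=float)
    return np.array([np.roll(first_row, k) for k in range(len(first_row))])


def min_variance_weights(cov):
    """Fully invested minimum variance weights: C^{-1} 1 / (1^T C^{-1} 1)."""
    ones = np.ones(cov.shape[0])
    x = np.linalg.solve(cov, ones)
    return x / x.sum()


# Assumed wrap-around: lag k and lag 12 - k share the same correlation.
half = [1.00, 0.74, 0.63, 0.54, 0.46, 0.40, 0.33]
first_row = half + half[-2:0:-1]

C = circulant(first_row)
weights = min_variance_weights(C)

if __name__ == "__main__":
    print(C.sum(axis=1))  # every row has the same sum
    print(weights)        # every weight equals 1/12
```

Any other symmetric first row that yields an invertible circulant matrix gives the same equal weights, since only the equality of the row sums matters.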
We can continue to reduce our variance by increasing the number of portfolios we hold. So instead of monthly, we may choose to create 52 weekly portfolios, or even 252 trading-day portfolios! However, at a certain point the cost of trading will outweigh the reduction in variance.
1 See Wang, Jun-qing and Dong, Chang-zhou, “Inverse Matrix of Symmetric Circulant Matrix on Skew Field”
Empirical Evidence
As a test, we assumed we had a model with 60% accuracy at predicting the direction of forward 21-day (approximately 1-month) returns for the S&P 500.
We then constructed 21 indices, each with a starting date offset by one day from the prior index, with the first index starting on February 14th, 1950. Each index predicts the sign of the return for the S&P 500 over the next 21 days and invests accordingly: long exposure if we predict a positive sign and short exposure if we predict a negative sign. We ensure 60% accuracy, over the long run, by using forward information about the S&P 500’s return and drawing from a uniform distribution to determine whether to choose the sign (and therefore the positioning) correctly or incorrectly.
From these 21 indices, we then construct 5 indices: an index that uses only the 1st index, an index that equal-weights the 1st and 11th indices, an index that equal-weights the 1st, 6th, 11th and 16th indices, an index that equal-weights the 1st, 4th, 7th, 10th, 13th, 16th and 19th ("every 3rd") indices, and finally an index that equal-weights all the indices. The indices are rebalanced to their equal-weight configuration monthly.
Active returns for each index are computed by taking 21-day returns of the index and subtracting 21-day returns of the S&P 500. We find that there is a statistically significant reduction in the standard deviation of active returns at the 5% significance level.
Index                    Lower Confidence   Standard Deviation   Upper Confidence
1st                      5.8781%            5.8803%              5.8876%
1st & 11th               5.3864%            5.3884%              5.3951%
1st, 6th, 11th & 16th    5.0912%            5.0931%              5.0994%
Every 3rd                4.7008%            4.7025%              4.7083%
All                      4.4087%            4.4103%              4.4157%
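The construction above can be sketched on synthetic data. The code below is not the code behind the table: it assumes i.i.d. normal daily log returns in place of S&P 500 data, averages the offset indices' log returns with equal weights every day rather than rebalancing monthly, measures active returns over non-overlapping 21-day blocks, and computes no confidence bounds. All names, the seed and the 1% daily volatility are hypothetical.

```python
import numpy as np

HOLD = 21  # forecast and holding period in trading days


def tranche_positions(returns, offset, p, rng):
    """Daily +1/-1 positions for one index that re-forecasts every HOLD days."""
    pos = np.zeros(len(returns))
    hits = []
    for start in range(offset, len(returns) - HOLD + 1, HOLD):
        window = returns[start:start + HOLD]
        true_sign = 1.0 if window.sum() >= 0 else -1.0
        # A uniform draw decides whether this forecast is right (probability p).
        correct = rng.uniform() < p
        pos[start:start + HOLD] = true_sign if correct else -true_sign
        hits.append(correct)
    return pos, hits


def active_return_std(n_days=HOLD * 500, p=0.6, seed=0):
    """Std of 21-day active returns per composite index, plus realized accuracy."""
    rng = np.random.default_rng(seed)
    # Assumption: synthetic daily log returns stand in for the benchmark.
    returns = rng.normal(0.0, 0.01, size=n_days)
    positions, hits = [], []
    for offset in range(HOLD):
        pos, h = tranche_positions(returns, offset, p, rng)
        positions.append(pos)
        hits.extend(h)
    positions = np.array(positions)
    # Keep only the days on which every offset index is invested.
    valid = slice(HOLD - 1, n_days - HOLD + 1)
    groups = {
        "1st": [0],
        "1st & 11th": [0, 10],
        "1st, 6th, 11th & 16th": [0, 5, 10, 15],
        "Every 3rd": list(range(0, HOLD, 3)),
        "All": list(range(HOLD)),
    }
    result = {}
    for name, members in groups.items():
        # Daily active log return: average position times return, minus benchmark.
        daily = ((positions[members].mean(axis=0) - 1.0) * returns)[valid]
        m = (len(daily) // HOLD) * HOLD
        blocks = daily[:m].reshape(-1, HOLD).sum(axis=1)
        result[name] = float(blocks.std())
    return result, float(np.mean(hits))


if __name__ == "__main__":
    stds, accuracy = active_return_std()
    print(accuracy)
    for name, value in stds.items():
        print(name, value)
```

Under these assumptions the composite of all 21 offset indices should show a visibly lower standard deviation of active returns than the single index, in the same direction as the table, although the magnitudes depend on the synthetic return process.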
Conclusion
In this paper, we propose a simple methodology for reducing active return variance. The methodology exploits the law of large numbers, a statistical phenomenon that “guarantees” that if we take more bets, performance deviation from the long-term expected return is reduced. The law of large numbers assumes that bets are independent and identically distributed.
We assume a framework where N-day forecasts result in active allocations that are held over N days before new active allocations are generated. To increase betting frequency, overlapping forecast periods must be utilized, which
introduces potential correlation between active allocations. We explore the implications of this correlation on bet-size positioning and find that the circulant nature of the correlation matrix guarantees that the minimum variance portfolio is constructed through an equal-weighting scheme. Therefore, to minimize active return variance, we find that we should generate a new portfolio daily and invest 1/Nth of our wealth in that portfolio.
Beyond reducing active return variance, this methodology may carry the side benefit of increasing strategy capacity, since the portfolio is divided into N equal-weight sub-portfolios that all rebalance on different days, reducing market impact without necessarily increasing aggregate turnover.
The limitation of this methodology is in operational and trading friction where fixed costs can create a realized drag on active returns that outweighs the benefits of reduced active return variance. We do not address this restriction in this paper and believe it is an area that warrants further study.
Monte Carlo Simulation Python Code
import numpy
import pandas

n = 1000000  # number of Monte Carlo draws per overlap length
lp = 0.5     # model accuracy p (the resulting correlation does not depend on it)
lc = []      # estimated correlation for each overlap length

for overlap in range(11, 0, -1):
    # Sub-period lengths in months: B is the overlap, A and C are the remainders
    a = 12 - overlap
    b = overlap
    c = 12 - overlap

    # Sub-period log returns; volatility scales with the square root of time
    a_v = numpy.random.normal(scale=numpy.sqrt(a / 12.), size=[1, n])
    b_v = numpy.random.normal(scale=numpy.sqrt(b / 12.), size=[1, n])
    c_v = numpy.random.normal(scale=numpy.sqrt(c / 12.), size=[1, n])

    # Dominance: the sub-period with the larger absolute return dominates
    a_dom_b = numpy.abs(a_v) > numpy.abs(b_v)
    b_dom_a = ~a_dom_b
    b_dom_c = numpy.abs(b_v) > numpy.abs(c_v)
    c_dom_b = ~b_dom_c

    a_dom_b_and_b_dom_c = numpy.logical_and(a_dom_b, b_dom_c)
    a_dom_b_and_c_dom_b = numpy.logical_and(a_dom_b, c_dom_b)
    b_dom_a_and_b_dom_c = numpy.logical_and(b_dom_a, b_dom_c)
    b_dom_a_and_c_dom_b = numpy.logical_and(b_dom_a, c_dom_b)

    p0 = numpy.mean(a_dom_b_and_b_dom_c)  # A >> B >> C
    p1 = numpy.mean(a_dom_b_and_c_dom_b)  # A >> B << C
    p2 = numpy.mean(b_dom_a_and_b_dom_c)  # A << B >> C
    p3 = numpy.mean(b_dom_a_and_c_dom_b)  # A << B << C
    event_prob = numpy.array([p0, p1, p2, p3])

    # Joint probabilities of (s0, s1) = (0,0), (0,1), (1,0), (1,1) per environment
    cond_prob = numpy.array([
        [(1-lp)**2, lp*(1.-lp), lp*(1.-lp), lp**2],  # A >> B >> C
        [(1-lp)**2, lp*(1.-lp), lp*(1.-lp), lp**2],  # A >> B << C
        [(1-lp), 0., 0., lp],                        # A << B >> C
        [(1-lp)**2, lp*(1.-lp), lp*(1.-lp), lp**2],  # A << B << C
    ])

    # E[s0 * s1] is the total probability of the (1, 1) outcome
    e_s0s1 = cond_prob.transpose().dot(event_prob)[-1]
    e_x = lp
    std_x = numpy.sqrt(lp * (1. - lp))
    lc.append((e_s0s1 - e_x**2) / std_x**2)

df = pandas.Series(lc, index=[str(x) + ' Month Overlap' for x in range(11, 0, -1)])
print(df)
Past performance is no guarantee of future returns.
• IMPORTANT: The projections or other information generated by Newfound Research LLC regarding the likelihood of various investment outcomes are hypothetical in nature, do not reflect actual investment results, and are not guarantees of future results.
• All investing is subject to risk, including the possible loss of the money you invest. Diversification does not ensure a profit or protect against a loss. There is no guarantee that any particular asset allocation or mix of funds will meet your investment objectives or provide you with a given level of income.
• These materials represent an assessment of the market environment at specific points in time and are intended neither to be a guarantee of future events nor as a primary basis for investment decisions. The performance results should not be construed as advice meeting the particular needs of any investor.
Neither the information presented nor any opinion expressed herein constitutes a solicitation for the purchase or sale of any security. Past performance is not indicative of future performance and investments in equity securities do present risk of loss. Newfound Research LLC’s results are historical and their ability to repeat could be affected by material market or economic conditions, among other things.