4.4.1
Choice of regression for the ARP model
In Chapter 3 we presented the ARP model, which searches for a portfolio with a constant return per time period. In that model, we used a regression-based strategy as opposed to a correlation-based one.
In fact, we could not have chosen correlation as a suitable nonlinear objective function for the ARP model, due to the nature of the constant return assumption. For instance, the correlation between time, the benchmark we chose for the ARP (e.g. time represented as numbers from 1 to h), and a hypothetical investment whose returns are perfectly constant would be undefined. As the variance of the constant return investment is zero, we cannot calculate the Pearson correlation coefficient.
Now suppose our returns are as close as possible to being perfectly constant, i.e. h − 1 constant returns and a single return that is different from the others by a very small amount. The correlation of this example does not follow a predictable pattern, and this can be verified by artificially constructing simple examples.
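The following is a minimal sketch of one such artificially constructed example. The horizon h = 10, the return level 0.01 and the helper names `pearson` and `near_constant` are illustrative assumptions, not values or code from the ARP experiments.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length sequences.

    Returns None when either sequence is constant, because the
    coefficient is then undefined (zero standard deviation).
    """
    if min(x) == max(x) or min(y) == max(y):
        return None
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

h = 10                                  # illustrative horizon (assumption)
time_index = list(range(1, h + 1))      # the benchmark: 1, 2, ..., h

def near_constant(position, eps):
    """h returns of 0.01, with the return at `position` shifted by eps."""
    returns = [0.01] * h
    returns[position] += eps
    return returns

# Perfectly constant returns: the coefficient is undefined.
constant_case = pearson(time_index, [0.01] * h)

# One deviating return, placed in the first or the last period.
early = pearson(time_index, near_constant(0, 1e-6))
late = pearson(time_index, near_constant(h - 1, 1e-6))
late_larger = pearson(time_index, near_constant(h - 1, 1e-3))

print(constant_case, early, late, late_larger)
```

In this sketch the sign and magnitude of the coefficient are determined by the period in which the single deviating return occurs, not by how small the deviation is: `early` and `late` have opposite signs, while `late` and `late_larger` agree up to rounding. The correlation therefore gives no useful measure of how close the returns are to being constant.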
4.4.2
Pearson correlation coefficient
In this chapter we seek to minimise the most familiar measure of dependence between two random variables, the Pearson product-moment correlation coefficient, defined in Equation (4.14) (where the absolute value component was added to correctly represent the objective of our MNP model). The Pearson correlation coefficient between two variables is defined as the covariance of the two variables divided by the product of their standard deviations.
The Pearson coefficient is known to have some limitations. For example, being a measure of linear association between two variables, it may fail to capture a nonlinear relationship. Also, the coefficient can be drastically influenced by a few extreme outliers. In general, the Pearson coefficient is not considered suitable for random variables that are not normally distributed. Egan (2007), for example, analysed the returns of the S&P500 index and concluded that the normal and lognormal distributions are a poor fit for its returns, the best fit being a t-distribution with location/scale parameters. The disadvantages of the Pearson coefficient are discussed in Joe (1997) and Hutchinson & Lai (1990).
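The sensitivity to outliers can be illustrated with a small sketch. The data below are invented for illustration only and the function name `pearson` is an assumption; nothing here is taken from the experiments in this chapter.

```python
import math

def pearson(x, y):
    """Pearson correlation: covariance divided by the product of standard deviations."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Eight points with only a weak (slightly negative) linear association.
x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [1, -1, 1, -1, 1, -1, 1, -1]
r_clean = pearson(x, y)

# The same data plus a single extreme observation at (50, 50).
r_with_outlier = pearson(x + [50], y + [50])

print(r_clean, r_with_outlier)
```

A single extreme observation is enough to move the coefficient from a weak negative value to a value close to one, even though the other eight points are unchanged.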
There are alternative approaches that are considered more sensitive to nonlinear relationships, and which could be used as our objective function instead. The most popular alternative is the Spearman rank correlation coefficient ρ. The Spearman ρ is a “quasi-ordinal” correlation coefficient which is equivalent to the Pearson coefficient after the variables have been transformed into rank orders. The Spearman coefficient is often described as being “nonparametric” as a perfect Spearman correlation is obtained when two variables are related by any monotonic function, in contrast with the Pearson correlation, which only gives a perfect value when the variables are related by a linear function.
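As a minimal sketch of this equivalence, the Spearman coefficient can be computed by applying the Pearson formula to ranks. The helper names and the data (an exponential relationship, which is monotonic but not linear) are illustrative assumptions, and ties are not handled.

```python
import math

def pearson(x, y):
    """Pearson correlation: covariance divided by the product of standard deviations."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def ranks(values):
    """Rank each value from 1 (smallest) upward; assumes there are no ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        result[index] = rank
    return result

def spearman(x, y):
    """Spearman rho as the Pearson coefficient of the rank-transformed data."""
    return pearson(ranks(x), ranks(y))

# A monotonic but nonlinear relationship: y = exp(x).
x = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0]
y = [math.exp(v) for v in x]

rho_pearson = pearson(x, y)
rho_spearman = spearman(x, y)
print(rho_pearson, rho_spearman)
```

Here the Spearman coefficient is one because the relationship is monotonic, whereas the Pearson coefficient is strictly below one because the relationship is not linear.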
Another variable dependence coefficient that was introduced to address some of the Pearson coefficient's deficiencies is the Distance correlation, introduced by Székely et al. (2007) and Székely & Rizzo (2009). The Distance coefficient is analogous to the Pearson coefficient, but it is based on the pairwise Euclidean distances of each vector of random variables. This coefficient requires the calculation of the distance variance, distance standard deviation and distance covariance, defined similarly. The Distance correlation coefficient R satisfies 0 ≤ R ≤ 1, and an important property is that R = 0 if and only if the random variables are statistically independent.
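A minimal sketch of the sample distance correlation for two univariate series is given below, using double-centred pairwise distance matrices. The function names and the data are illustrative assumptions; the sample statistic only approximates the population property that R = 0 under independence.

```python
import math

def centred_distances(values):
    """Pairwise absolute distances, double-centred by row, column and grand means."""
    n = len(values)
    d = [[abs(a - b) for b in values] for a in values]
    row_means = [sum(row) / n for row in d]      # equal to column means (symmetry)
    grand_mean = sum(row_means) / n
    return [[d[k][l] - row_means[k] - row_means[l] + grand_mean
             for l in range(n)] for k in range(n)]

def distance_correlation(x, y):
    """Sample distance correlation R of two equal-length sequences."""
    n = len(x)
    a, b = centred_distances(x), centred_distances(y)
    dcov2 = sum(a[k][l] * b[k][l] for k in range(n) for l in range(n)) / n ** 2
    dvar_x = sum(v * v for row in a for v in row) / n ** 2
    dvar_y = sum(v * v for row in b for v in row) / n ** 2
    if dvar_x * dvar_y == 0.0:
        return 0.0                               # a constant series carries no dependence
    # max(...) guards against a tiny negative value caused by rounding.
    return math.sqrt(max(dcov2, 0.0) / math.sqrt(dvar_x * dvar_y))

# y is fully determined by x, yet the two are linearly uncorrelated.
x = [-2.0, -1.0, 0.0, 1.0, 2.0]
y = [v * v for v in x]

mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
covariance = sum((p - mean_x) * (q - mean_y) for p, q in zip(x, y)) / len(x)
r_distance = distance_correlation(x, y)
print(covariance, r_distance)
```

In this example the covariance (and hence the Pearson coefficient) is zero, while the distance correlation is clearly positive, reflecting the nonlinear dependence of y on x.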
We leave the investigation of these alternative correlation measures for future work.
4.4.3
Alternative approaches to MNP
In this work we defined an MNP as a portfolio that is uncorrelated with the market benchmark, taking the underlying index as the sole representation of the market. An alternative definition is to consider the market as represented by multiple factors, as in the Fama-French three-factor model (Fama & French (1993)) and the Carhart four-factor model (Carhart (1997)). If we redefine an MNP as a portfolio that is “independent” of multiple factors (instead of a single one), then to retain a similar objective function we would need to transform Equation (4.14) into a function that minimises the multiple correlation coefficient (which takes values between zero and one), a much more complex and impractical task. An alternative is to redefine an MNP in the form of a “factor immune/independent portfolio”, as defined below.
Under a factor assumption with $M$ factors the standard (linear) equation for the return $r_{it}$ on asset $i$ at time $t$ is:
$$r_{it} = \alpha_i + \sum_{j=1}^{M} \beta_{ij} F_{jt} + \text{some random noise element} \tag{4.30}$$
In other words, the return on asset $i$ at time $t$ is made up of an asset-dependent term $\alpha_i$ plus factor terms formed as a linear sum of the factors, where $\beta_{ij}$ is the coefficient for asset $i$ in relation to factor $j$. Here the coefficients $\alpha_i$ and $\beta_{ij}$ are time independent and are typically estimated from in-sample data using multiple least-squares regression.
The basic approach is therefore as follows. Using multiple least-squares regression, estimate the coefficients $\alpha_i$ and $\beta_{ij}$ in the in-sample period $[1, \ldots, T]$. Let $\hat{\alpha}_i$ and $\hat{\beta}_{ij}$ be the estimates. If we have a weight $w_i$ associated with investment in asset $i$ (where $\sum_{i=1}^{N} w_i = 1$), we approximate the portfolio return at time $t$ by the weighted sum of asset returns, namely $\sum_{i=1}^{N} w_i r_{it}$.
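The estimation step for a single asset can be sketched as follows, solving the normal equations of the least-squares problem directly. The function names `solve_linear` and `estimate_factor_model` and the noise-free data are hypothetical and chosen for illustration; in practice a statistical package would normally be used.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting.

    Assumes the matrix is square and non-singular.
    """
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]   # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):                   # back substitution
        known = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - known) / m[r][r]
    return x

def estimate_factor_model(returns, factors):
    """Least-squares fit of r_t = alpha + sum_j beta_j F_jt for one asset.

    returns: list of T asset returns; factors: list of T tuples (F_1t, ..., F_Mt).
    Returns (alpha_hat, [beta_hat_1, ..., beta_hat_M]).
    """
    rows = [[1.0] + list(f_t) for f_t in factors]    # intercept column plus factors
    k = len(rows[0])
    xtx = [[sum(row[p] * row[q] for row in rows) for q in range(k)] for p in range(k)]
    xty = [sum(row[p] * r for row, r in zip(rows, returns)) for p in range(k)]
    coefficients = solve_linear(xtx, xty)
    return coefficients[0], coefficients[1:]

# Hypothetical in-sample data for one asset, M = 2 factors, no noise.
factors = [(1.0, 2.0), (2.0, 1.0), (3.0, 5.0), (4.0, 3.0), (5.0, 8.0), (6.0, 4.0)]
returns = [0.5 + 2.0 * f1 - 1.0 * f2 for f1, f2 in factors]

alpha_hat, beta_hat = estimate_factor_model(returns, factors)
print(alpha_hat, beta_hat)
```

Because the illustrative returns contain no noise, the fit recovers the generating coefficients (0.5, 2.0 and -1.0) up to rounding. Repeating the fit for each asset i gives the full set of estimates.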
Turning to the factor equation and neglecting the noise term, it follows that the return on the portfolio at time t is given by:
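$$\sum_{i=1}^{N} w_i r_{it} = \sum_{i=1}^{N} w_i \left( \hat{\alpha}_i + \sum_{j=1}^{M} \hat{\beta}_{ij} F_{jt} \right) = \sum_{i=1}^{N} w_i \hat{\alpha}_i + \sum_{j=1}^{M} \left( \sum_{i=1}^{N} w_i \hat{\beta}_{ij} \right) F_{jt} \tag{4.31}$$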
So under this equation the portfolio return (at time t) is composed of two terms:
$\sum_{i=1}^{N} w_i \hat{\alpha}_i$, a term dependent only on the assets in the portfolio
$\sum_{j=1}^{M} \left( \sum_{i=1}^{N} w_i \hat{\beta}_{ij} \right) F_{jt}$, a term involving time-dependent factors
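The decomposition into these two terms can be checked numerically with a small sketch. All coefficient estimates, weights and factor values below are hypothetical numbers chosen for illustration, with N = 3 assets and M = 2 factors.

```python
# Hypothetical estimates for N = 3 assets and M = 2 factors.
alpha_hat = [0.002, -0.001, 0.003]
beta_hat = [[1.1, 0.3],
            [0.8, -0.2],
            [1.4, 0.5]]
weights = [0.5, 0.8, -0.3]        # long and short positions summing to one
factors_t = [0.010, -0.004]       # factor values F_jt at a single time t

n_assets, n_factors = len(weights), len(factors_t)

# Left-hand side: weighted sum of the fitted asset returns (noise neglected).
asset_returns = [
    alpha_hat[i] + sum(beta_hat[i][j] * factors_t[j] for j in range(n_factors))
    for i in range(n_assets)
]
portfolio_return = sum(weights[i] * asset_returns[i] for i in range(n_assets))

# Right-hand side: the asset-only term plus the factor term.
alpha_term = sum(weights[i] * alpha_hat[i] for i in range(n_assets))
exposures = [
    sum(weights[i] * beta_hat[i][j] for i in range(n_assets))
    for j in range(n_factors)
]
factor_term = sum(exposures[j] * factors_t[j] for j in range(n_factors))

print(portfolio_return, alpha_term + factor_term)
```

The two printed values agree up to rounding. The list `exposures` holds the portfolio-level coefficient of each factor, which is the quantity that drives the factor term.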
We want to minimise dependence on the market, where the market is driving the factors that we observe. Hence we want the influence of the factor term on the portfolio return to be as small as possible.
One way to achieve this is simply to minimise $\left| \sum_{t=1}^{T} \sum_{j=1}^{M} \left( \sum_{i=1}^{N} w_i \hat{\beta}_{ij} \right) F_{jt} \right|$, so the total factor contribution to portfolio return (summed over all time periods) is as small as possible. Other objectives are also possible, e.g. minimise $\sum_{t=1}^{T} \left| \sum_{j=1}^{M} \left( \sum_{i=1}^{N} w_i \hat{\beta}_{ij} \right) F_{jt} \right|$, which minimises the sum of the absolute values of factor terms at each time period $t$. In fact, we did perform some preliminary investigations with this model and tested it with publicly available Fama-French factors for the US market, for portfolios composed of assets from the S&P500 index. The results we obtained were not encouraging and thus we did not do further research on this model.
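As a minimal sketch, the two candidate objectives can be evaluated for a fixed set of weights as shown below. The function name `factor_objectives` and all numbers are hypothetical; the sketch only evaluates the objectives and does not perform the optimisation over the weights, nor does it reproduce the preliminary experiments mentioned above.

```python
def factor_objectives(weights, beta_hat, factors):
    """Evaluate the two factor-term objectives for fixed weights.

    weights:  list of N portfolio weights
    beta_hat: N x M list of estimated factor coefficients
    factors:  list of T tuples (F_1t, ..., F_Mt)
    Returns (|sum over t of factor term|, sum over t of |factor term|).
    """
    n_assets, n_factors = len(weights), len(beta_hat[0])
    exposures = [
        sum(weights[i] * beta_hat[i][j] for i in range(n_assets))
        for j in range(n_factors)
    ]
    per_period = [
        sum(exposures[j] * f_t[j] for j in range(n_factors))
        for f_t in factors
    ]
    return abs(sum(per_period)), sum(abs(term) for term in per_period)

# Hypothetical inputs: N = 3 assets, M = 2 factors, T = 4 periods.
weights = [0.5, 0.8, -0.3]
beta_hat = [[1.1, 0.3],
            [0.8, -0.2],
            [1.4, 0.5]]
factors = [(0.010, -0.004), (-0.020, 0.006), (0.015, 0.001), (0.005, -0.003)]

total_abs, sum_abs = factor_objectives(weights, beta_hat, factors)
print(total_abs, sum_abs)
```

By the triangle inequality the first objective never exceeds the second. The first allows factor terms of opposite sign in different periods to cancel, whereas the second penalises the factor term in every period.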
4.5
Conclusions
In this chapter we considered the problem of constructing a market neutral portfolio where we can hold both long and short positions in assets. We formulated this problem as a mixed-integer nonlinear program, minimising the absolute value of the correlation between portfolio return and index return, and solved it using the Minotaur software package.
Computational results were presented for eleven different problem instances derived from universes defined by S&P international equity indices. These indicated that in-sample we could achieve very low correlations (in many cases zero correlation) in reasonable computation times. Out-of-sample correlations were higher, but for the majority of cases examined the market neutral portfolios constructed using the approach given in this chapter outperformed their benchmark indices.
Computational results for the test problems considered indicated that the proposed model outperformed an alternative approach based on minimising the absolute value of the regression slope (the zero-beta approach).
We compared our approach with the performance of seven funds that adopt market neutral strategies with respect to the S&P 500. This comparison indicated that for three of these seven funds we had similar correlations, while the other four funds had lower correlations than our market neutral portfolios. However, in contrast to these seven funds (only one of which outperformed the index, and then only slightly), our market neutral portfolios outperformed the index by a significant amount in the vast majority of the cases examined.
Chapter 5
Exchange-traded funds: a survey and performance analysis
5.1
Introduction
ETFs (exchange-traded funds) have grown significantly in recent years, in terms of the number and size of funds and in trading volume. At their simplest, ETFs offer replication of a market index such as the S&P500, and thereby offer the investor exposure to a market index in a much more flexible manner than a conventional mutual fund. In some countries ETFs also offer tax advantages over mutual or index funds.
Estimates of the total size of the ETF market vary, but as an indication BlackRock (2011) estimates it was approximately US$1.5 trillion (i.e. US$1.5 × 10^12) at the end of 2011. The market size has doubled since late 2008. Due to the growth of the market for ETFs, regulators around the world have become concerned at their potential for inducing (or exacerbating) market risk and instability.
In the light of the growing importance of ETFs, we survey and classify existing ETFs and analyse their performance in replicating the behaviour of their underlying assets. We were able to identify 8192 ETFs (of which some are no longer active); we were able to find sufficient information to classify 6937 active ETFs. We selected a subset of 822 ETFs to conduct a detailed statistical performance analysis.
This chapter is structured as follows. We first discuss how plain vanilla ETFs and synthetic (leveraged/inverse) ETFs are constructed, mention regulatory concerns with regard to ETFs and review the academic literature relating to ETFs. We then describe our ETF survey database and generate insights into the ETF market by classifying this data (involving 6937 ETFs). The performance analysis of 822 ETFs follows.