2. THE ARBITRAGE PRICING THEORY AS A CONCEPTUAL FRAMEWORK
2.2. Motivation for a conceptual framework
2.2.5. Limitations of the APT framework
The APT framework fulfils the role of an informative conceptual framework within which equilibrium relationships can be modelled and within which the return generating process can be explored. Propositions stemming from the APT find support; the return generating process is characterized by more than one factor and expected returns reflect the structure of the return generating process to some extent. Idiosyncratic risk factors are not priced suggesting that they are of no concern when modelling the return generating process. Systematic risk is of primary importance within the APT framework. The APT framework outperforms a single-factor alternative in explaining the time series behaviour of returns and expected returns. However, a comprehensive assessment and interpretation of the framework requires that limitations of the framework are acknowledged and criticisms noted.
Dhrymes, Friend and Gultekin (1984) re-examine the results obtained by Roll and Ross (1980) arguing that any empirical investigation of a new theory or concept should be subjected to replication to confirm the findings. The authors point out that more than five factors may be necessary to describe the return generating process of returns on NYSE and AMEX securities. Using an almost identical dataset23 to that of Roll and Ross (1980), Dhrymes et al. (1984) show that that the number of groups for which a five-factor structure is inadequate is greater than that suggested by Roll and Ross (1980). Whereas Roll and Ross (1980) arrive at the conclusion that it is unlikely that more than five factors are necessary to
23 Dhrymes et al. (1984) replace 13 securities for which data is unavailable and a further 11 securities characterized by a large number of missing observations.
describe the return generating process based upon a finding that in only 2.3 percent (1 out of 42) of groups a five-factor decomposition is inadequate, Dhrymes et al. (1984) find that a five-factor decomposition is inadequate for 16.6 percent (7 out of 42) of groups. This discrepancy is attributed to Roll and Ross’ (1980) use of groups with missing observations or the greater precision of the statistical software used by Dhrymes et al. (1984), or both.
Regardless of the source of error, this limitation potentially translates into erroneous conclusions regarding the complexity of the return generating process.
Dhrymes et al. (1984) also investigate how the number of factors in the return generating process varies with group size. Results indicate that returns on the smallest group (15 securities) are characterized by at most two factors whereas returns for groups consisting of 90 and 240 securities are described by at most nine and six factors respectively. This suggests that the number of factors is positively related to the number of securities in a group.
Whereas Beenstock and Chan (1986) attempt to rationalize the positive relationship between the number of factors and group size, and Hughes (1984) shows that the first five factors are the most important in explaining the variation in returns, the findings of Dhrymes et al.
(1984) should not be ignored. What is of concern is the large discrepancy in the number of factors in the return generating process of group of varying sizes. Furthermore, Dhrymes et al. (1984) do not investigate the group size at which the number of factors stabilizes.24 The question that arises is whether in practice a two-factor model is sufficient and does not omit any relevant factors or whether a nine-factor model should be adopted on the basis of statistical and practical significance. With such a large discrepancy in the number of factors, the problem is not easily solved. Moreover, although an indication of an upper bound is desirable, it is not provided by Dhrymes et al. (1984). The consequences of a failure to identify a stable or upper bound to the number of factors in the return generating process is best articulated by Bodie, Kane and Marcus (2005) who state that if a model requires hundreds of explanatory factors, the consequence is a failure to simplify the return generating process. These findings indicate a major limitation; the APT framework fails to unambiguously identify the number of factors that are sufficient to describe the return generating process. Notwithstanding this limitation, what is certain is that more than one factor characterizes the return generating process.
24 Only five factors are estimated for groups consisting of 240 securities as a result of “accelerating computer costs.”
Central to the APT framework is the APT model which postulates that expected returns reflect compensation for exposures to multiple systematic risk factors in the return generating process. Roll and Ross (1980) find that at least three factors are priced, Chen (1983) finds that between two and four factors are priced, Hughes (1984) finds that between three and six factors are priced and Beenstock and Chan (1986) show that up to three factors are priced.
Dhrymes et al. (1984) investigate whether a five-factor model explains the cross-section of expected returns by testing the null hypothesis of risk premia jointly equalling zero (see Chen, 1983). The null hypothesis is rejected in only 14.29 percent (6 out of 42) of the groups considered suggesting that a central component of the APT framework, the APT model, fails to explain expected returns. Furthermore, when factors representative of idiosyncratic risk such as the standard deviation and the skewness of individual returns are incorporated into the model, the null hypothesis is rejected for only 4.76 percent (2 out of 42) of groups. These findings contrast with those of Roll and Ross (1980) who find that at least one or more factors are priced in 28.57 percent (12 out of 42) of groups when idiosyncratic risk factors are considered. Dhrymes et al.’s (1984) findings suggest that APT factor loadings fail to explain expected returns and that APT factor loadings fare even worse in the presence of idiosyncratic risk factors. Moreover, when securities are sorted into groups by mean returns, no statistically significant risk premia vectors are found. Dhrymes et al. (1984) attribute this discrepancy in results to Roll and Ross’ (1980) reliance upon the validity of the assumption regarding the normality of cross-sectional Generalized Least Squares (GLS) risk premia estimates and the use of tests that fail to consider departures from these assumptions. These findings cast some doubt upon the validity of the APT framework; the APT model is a critical component of the framework and if it does not hold, then why should an investigation of the return generating process be based upon a framework of which a central proposition is deficient. Dhrymes et al. (1984) state that the finding that few risk premia vectors differ statistically from zero reduces the APT to an explanation of the return generating process only. If this is the case, the APT can no longer be considered as a comprehensive conceptual framework which can be used to model both the return generating process of returns and to study equilibrium relationships.
Another set of tests of the APT considers whether the model is able to explain anomalies not explained by other models (Brown & Reilly, 2009). In essence, such tests are a comparison of alternate models on criteria other than explanatory power. One such anomaly is the so called January effect, which may bias asset pricing tests in favour of finding significant
relationships between risk and expected returns. For example, Tinic and West (1984) show that when January is excluded from the analysis of the CAPM, there is a breakdown in the relationship between expected returns and risk. M.Gultekin and N.Gultekin (1987) investigate whether the January effect plays a role in tests of the APT model using CRSP return data for the July 1962 to December 1981 period. Preliminary analysis reveals that there is a strong seasonal effect in returns; mean returns are almost 30 percent for January whereas the month with the second highest mean returns is November with 11 percent. This seasonality is reflected in cross-sectional regressions estimated using only January returns;
the null hypothesis of risk-premia being jointly equal to zero is always rejected in each of the sample groups of 30 and 90 securities. However, when returns for all other months excluding January (average of eleven months) are considered, risk premia are priced in just under a tenth (9 percent) and under a third (27 percent) of the groups respectively (see M.Gultekin &
N. Gultekin, 1987: Table III, Panel A&B). While the seasonality results are robust to group size in that the January effect is observed in groups of 30 and 90 securities, the presence of a strong January effect is likely to bias results in favour of the APT model if January is considered together with the other eleven months. This is confirmed by M. Gultekin and N.
Gultekin’s (1987) findings that the statistical significance of the risk premia is greatly diminished when the sample excludes January and returns on the remaining eleven months are used. These findings suggest that the APT framework is only able to explain the risk-return relationship in January and is therefore, not applicable during other months. If this is the case, then drawing inferences from APT literature may not be appropriate as results reflect the presence of seasonal effects and are not a true reflection of equilibrium relationships.
Reinganum (1981) argues that the APT framework is a plausible alternative if it can explain differences in returns on firms of different sizes. The author’s test is based upon the assumption that securities with similar factor loadings should have similar returns. To test this hypothesis, a two-stage procedure using CRSP data for securities traded on the NYSE and AMEX since July 1962 is employed. In the first stage, control portfolios are created by grouping securities with similar factor loadings into control groups where factor loadings are estimated over a sixteen year period. Excess returns on securities are then estimated by subtracting control portfolio returns from security returns. In the second stage, having obtained excess returns, securities are grouped into portfolios according to size. The rationale is that because all securities within the control groups have similar exposure to factors, excess
returns are risk-adjusted and therefore, should be near zero. The validity of the APT model can then be investigated by testing the null hypothesis of average excess returns jointly equalling zero (Reinganum, 1981). Reinganum (1981) however finds that this is not the case;
average excess returns are not equal to zero regardless of whether three, four or five-factor models are used to model APT risk. Returns on the smallest portfolio are positive and statistically significant whereas returns on the largest portfolio are negative and statistically significant. Returns on smaller portfolios are higher than returns on larger portfolios. A formal test of the equality of means confirms that there is evidence of a size effect after controlling for APT risk. This suggests that the APT framework fails to account for all risk in returns and fails to explain the size effect. In failing to do so, the APT framework fails to explain a phenomenon not explained by a simpler model. Thus, the indictment against the APT framework comes from the finding that it fails to account for anomalies that arise within the CAPM; the APT does not perform better relative to a simpler framework on criteria other than explanatory power. If the APT model does not present an improvement over the CAPM in the cross-sectional context, then the argument for adopting the more complicated APT framework is somewhat weakened.25
At the core of the APT framework lies the assumption that only systematic factors drive returns and as idiosyncratic risk can be mitigated through diversification, only systematic risk factors are relevant to security returns (Reinganum, 1981; Burmeister et al., 1994). Roll and Ross (1980), Chen (1983), Beenstock and Chan (1986) and Yli-Olli and Virtanen (1993) either find no evidence or limited evidence of priced idiosyncratic risk factors. Dhrymes et al.
(1985) examine this proposition using CRSP return data for the July 1962 to December 1981 period. It is hypothesized that measures of idiosyncratic risk, namely the total and residual standard deviation, should be irrelevant when considered together with systematic measures of risk. Results for three group sizes (30, 60 and 90 securities) indicate that both systematic and idiosyncratic risk factors derived over the first sub-period (July 1962 to March 1972) provide little insight into the behaviour of expected returns although, idiosyncratic risk appears to be more important relative to systematic risk. For groups of the 30 securities, the null hypothesis that the risk premia vector is zero is rejected in 3.33 percent (1 out of 30) of
25 Reinganum (1981) does however note that a number of hypotheses are tested jointly; for example, the ability to explain anomalies and also indirectly that only undiversifiable factors are relevant in explaining the cross-sectional characteristics of expected returns. For this reason, tests cannot reveal with absolute certainty which hypotheses are supported. Potential sources of error cited are the (potentially incorrect) assumption of a linear return generating process, the (in)ability to completely diversify away idiosyncratic variance and the existence of arbitrage on the NYSE and AMEX.
groups when factor loadings are used exclusively as explanatory factors. However, when own standard deviation and residual standard deviation are used as explanatory factors in addition to factor loadings, the vector of risk premia is found to be statistically insignificant for all groups suggesting that measures of idiosyncratic risk subsume any explanatory power that factor loadings may have. Furthermore, measures of total and residual standard deviation by themselves are statistically significant for 17 percent (5 out of 30) of groups suggesting that idiosyncratic risk is more important than systematic risk in explaining expected returns.
These results are generalizable to groups of 60 and 90 securities (Dhrymes et al., 1985: Table III).
In the second sub-period (March 1972 to December 1981), factor loadings by themselves appear to play somewhat more important role. For groups of 30 securities, the risk premia vector is statistically significant for 20 percent (6 out of 30) of groups. However, the risk premia vector becomes statistically insignificant for all groups when idiosyncratic risk factors are considered in addition to factor loadings. This again suggests that any explanatory power attributable to factor loadings is subsumed by measures of idiosyncratic risk. By themselves, the own and residual standard deviation are statistically significant for 20 percent (6 out of 30) and 13.333 percent (4 out of 30) of groups respectively. These findings are generalizable to groups of 60 and 90 securities; fewer risk premia vectors are statistically significant when idiosyncratic risk measures are considered in addition to factor loadings and idiosyncratic risk measures appear to be marginally more important than factor loadings in explaining expected returns. If systematic risk factors explain expected returns and account for risk, then own variance and residual standard deviation should not play a role. Yet, Dhrymes et al.’s (1985) results suggest that these idiosyncratic risk factors may be just as or even more important than factor loadings. Such findings are concerning as they are incompatible with central propositions of the APT framework.
M.Gultekin and N.Gultekin (1987) present evidence supporting the findings of Dhrymes et al. (1985). The authors report that when residual standard deviation is introduced after excluding January returns, the number of priced risk premia decreases even further. However, the most pronounced result is observed when January returns are considered in isolation.
Whereas risk premia are always statistically significant in January for groups of 30 and 90 securities, factor loadings are priced in 60 percent (18 out of 30) and 90 percent (9 out of 10) of groups respectively when residual standard deviation is incorporated as an additional
explanatory factor. Reported t-statistics indicate that the residual standard deviation by itself is statistically significant in more than half of the groups, regardless of size. Such results are concerning; if loadings on APT factors, which are assumed to capture systematic risk fail to explain expected returns and if factors that proxy for idiosyncratic risk are priced, then this suggests that the APT framework’s assumption that systematic risk is the only risk category that is important for explaining returns may not be valid (Dhrymes et al., 1985).
Perhaps the most notable criticism is that the APT is a complete generalization and does not identify the risk factors that characterize the return generating process (Burmeister et al., 1994). This limitation is widely recognized in the literature. Kandir (2008) states that a major criticism of the APT framework is that the framework derives factors statistically and does not identify them. Priestley (1996) suggests that factor analytic and principal component techniques make estimated risk premia uninterpretable. Brown and Reilly (2009) state that the APT framework does not identify the factors that describe returns and note that this is considered to be the primary shortcoming of the APT. Without knowing the identity of factors, it is impossible to determine which factors drive returns and to meaningfully interpret risk premia (Kandir, 2008). Dhrymes et al. (1984) are more forthcoming in their criticism of the APT and argue that without knowing the economic meaning of the factors, it is difficult to determine how the empirical application of the APT framework is useful for predictive and explanatory purposes. This contrasts with the CAPM framework, which identifies the market proxy as the single risk factor and therefore, makes the CAPM and the underlying single-factor model easier to apply once a suitable market proxy has been identified (Brown &
Reilly, 2009). Roll and Ross (1980) note this shortcoming in their empirical investigation of the APT and argue that further research should be undertaken to determine the identity of the underlying factors. It is suggested that if there are a few systematic sources of risk, these are likely to be related to economic aggregates such as the GNP and interest rates. It is further argued that factors that have explanatory power for returns should be considered as substitutes for the unidentified factors. The authors go onto state that the formulation of the linear factor model within the APT framework motivates for further research into the theoretical and empirical structure of the model so as to better understand which systematic factors drive returns. The need to interpret unidentified systematic risk factors is recognized by Chen (1983) as the most important direction of further research.