The earlier risk-adjusted techniques (frequently called traditional measures of performance evaluation) include the Sharpe (1966) measure, the Treynor (1965) measure and the Jensen (1968) measure. These measures are based on the idea that the combination of any portfolio with the risk-free asset is located on a straight line in expected return/beta space or expected return/standard deviation space.
The Jensen measure has been the most commonly used performance measure in academic and non-academic empirical studies. It may be defined as the difference between the return on a portfolio and the equilibrium return predicted by the CAPM. One of the most important problems associated with the Jensen measure is that, as it is derived directly from the CAPM framework, it requires the identification of the market portfolio as the benchmark portfolio, and the efficiency of this portfolio is a necessary condition for the performance results to be valid. If this condition is not met, the measured performance of any fund is a function of the selected index. This question was first raised by Roll (1978) on theoretical grounds. Following Roll, many other studies questioned the sensitivity of the Jensen measure to the benchmark portfolio used (e.g. Lehmann and Modest, 1987; Grinblatt and Titman, 1989; and Elton, Gruber, Das and Hlavka, 1993), and the benchmark issue is one of the most pertinent issues in performance studies.
Grinblatt and Titman (1989) addressed the problem of the appropriate benchmark and concluded that “the appropriate benchmark portfolio consists only of those assets that can be included in the portfolio being evaluated.” (p. 411). This argument implies that the missing asset problem, in tests of the CAPM, does not apply to portfolio performance evaluation. In this context, a more intuitive interpretation can be given to Jensen’s measure: it is the difference between the return on the active portfolio and the return on a combination of a passive portfolio with the risk-free asset that has the same risk as the active portfolio. Since the manager is free to invest in a passive portfolio and in the risk-free asset, this interpretation holds independently of the validity of the CAPM (Blake, Elton and Gruber, 1993). In this context, Jensen’s measure may also be applied to bond portfolios.
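As a minimal sketch of the single-index case, Jensen’s measure can be estimated as the intercept of an ordinary least squares regression of the fund’s excess return on the benchmark’s excess return. The function name `jensen_alpha` and the return figures below are illustrative assumptions, not data from any of the studies cited.

```python
import numpy as np

def jensen_alpha(fund_returns, benchmark_returns, risk_free):
    """Return (alpha, beta) from an OLS regression of fund excess returns
    on benchmark excess returns. All inputs are per-period returns."""
    fund_excess = np.asarray(fund_returns, dtype=float) - risk_free
    bench_excess = np.asarray(benchmark_returns, dtype=float) - risk_free
    # Design matrix: a column of ones (its coefficient is alpha) plus the benchmark.
    X = np.column_stack([np.ones_like(bench_excess), bench_excess])
    coef, *_ = np.linalg.lstsq(X, fund_excess, rcond=None)
    return float(coef[0]), float(coef[1])

# Hypothetical data: a fund built to hold 0.8 of the benchmark's excess
# return and to lose 0.1 percent per period (for example, to fees).
risk_free = 0.002
benchmark = np.array([0.012, -0.008, 0.022, 0.002, -0.018, 0.032])
fund = risk_free - 0.001 + 0.8 * (benchmark - risk_free)

alpha, beta = jensen_alpha(fund, benchmark, risk_free)
print(f"alpha = {alpha:.4f}, beta = {beta:.2f}")
```

Under this reading, the estimated alpha is the average gap between the fund’s return and the return on a passive position that puts a fraction beta in the benchmark and the remainder in the risk-free asset.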
The selection of only one index may, however, fail to consider the alternative choices available to the portfolio manager. For example, Elton, Gruber, Das and Hlavka (1993) found that the evidence of positive performance obtained previously by Ippolito (1989) can be attributed to the use of an incorrect benchmark, which did not account for the performance of small stocks. When applying a multiple index model with three indices (the S&P 500 index plus a small stock index and a bond index) to the same sample, they observed that funds underperformed. The development of multiple index measures constitutes an attempt to overcome this type of problem. If it is known that the return is generated by N factors, then N diversified portfolios are sufficient to describe portfolio returns and a linear combination of these N portfolios is efficient.
Another argument for the use of multiple index measures results from the APT, which assumes that the expected return may be described as a linear function of the sensitivities to more than one factor. Deviations from this function are a measure of the manager’s selection capacity. On a theoretical (and even practical) basis, multiple index versions of Jensen’s measure seem to be more appropriate than single-index versions.14
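The effect of an omitted index on measured performance can be illustrated with a small sketch. The function `multi_index_alpha` and the four-period return series are hypothetical and chosen only so that the arithmetic is easy to follow; they do not reproduce the sample or the indices used by Elton, Gruber, Das and Hlavka (1993).

```python
import numpy as np

def multi_index_alpha(fund_excess, index_excess):
    """Return (alpha, betas) from an OLS regression of fund excess returns
    on the excess returns of one or more indices (one column per index)."""
    y = np.asarray(fund_excess, dtype=float)
    F = np.asarray(index_excess, dtype=float)
    if F.ndim == 1:
        F = F.reshape(-1, 1)  # treat a single index as a one-column matrix
    X = np.column_stack([np.ones(len(y)), F])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return float(coef[0]), coef[1:]

# Hypothetical excess returns on a large-stock index and a small-stock index.
# The two series are uncorrelated and the small-stock index has a positive mean.
large = np.array([0.02, -0.02, 0.02, -0.02])
small = np.array([0.03, 0.03, -0.01, -0.01])

# A fund that splits its exposure equally and loses 0.1 percent per period.
fund = -0.001 + 0.5 * large + 0.5 * small

alpha_single, _ = multi_index_alpha(fund, large)
alpha_multi, betas = multi_index_alpha(fund, np.column_stack([large, small]))
print(f"single-index alpha = {alpha_single:.4f}")
print(f"multi-index alpha  = {alpha_multi:.4f}")
```

In this constructed case the single-index alpha is positive only because the small-stock exposure is left out of the benchmark; once the second index is included, the intercept falls back to the negative value built into the data.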
In spite of their attractiveness, these approaches to performance evaluation are not free of criticism. In fact, there is still one important problem: which multiple index or multiple factor model to use? The APT does not specify the number or the identity of the factors that affect expected returns. As a consequence, several alternative techniques have been proposed and used in the evaluation of the performance of managed portfolios, such as: statistically derived factors through factor analysis and principal component analysis (see, for example, Lehmann and Modest, 1987), indices that represent the type of investments retained in the portfolio (e.g. Blake, Elton and Gruber, 1993; and Sharpe, 1992) or macroeconomic variables related to aggregate economic activity, inflation and interest rates (among others, Kryzanowski, Lalancette and To, 1997; and Elton, Gruber and Blake, 1995). The set of indices that are adequate for evaluating performance is a question that has not yet been completely solved (Elton and Gruber, 1997) and the search for the “true model” continues. Recent research, however, seems to indicate that good models already exist, and that no more than a small number of easily identifiable factors/indices are enough to explain the return on stock and bond portfolios (Blake, Elton and Gruber, 1993; Elton, Gruber and Blake, 1995, 1999).
14 Sharpe (1994) formalized a generalized version of the Sharpe (1966) ratio as an alternative measure in a multiple index context. While the traditional measure compares the average portfolio return in excess of the risk-free rate with the standard deviation of the portfolio returns, the generalized version compares the portfolio return in excess of a benchmark return with the standard deviation of that difference. This measure is usually called the “Information Ratio”.
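The two ratios contrasted in footnote 14 can be sketched as follows. The function names, the constant risk-free rate and the sample returns are assumptions made for illustration; the sample standard deviation (`ddof=1`) is one common convention, not necessarily the one used in the papers cited.

```python
import numpy as np

def sharpe_ratio(portfolio_returns, risk_free):
    """Average return in excess of a constant risk-free rate, divided by
    the standard deviation of the portfolio returns."""
    returns = np.asarray(portfolio_returns, dtype=float)
    return float((returns.mean() - risk_free) / returns.std(ddof=1))

def information_ratio(portfolio_returns, benchmark_returns):
    """Average return in excess of the benchmark, divided by the standard
    deviation of that difference."""
    diff = (np.asarray(portfolio_returns, dtype=float)
            - np.asarray(benchmark_returns, dtype=float))
    return float(diff.mean() / diff.std(ddof=1))

# Hypothetical per-period returns.
portfolio = [0.02, 0.00, 0.04, 0.02]
benchmark = [0.01, 0.01, 0.02, 0.02]

print(f"Sharpe ratio      = {sharpe_ratio(portfolio, 0.01):.3f}")
print(f"Information ratio = {information_ratio(portfolio, benchmark):.3f}")
```

The only difference between the two functions is the reference return: a risk-free rate in the first and a benchmark portfolio in the second.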
3.3. SINGLE-INDEX VERSUS MULTIPLE INDEX MODELS: EVIDENCE ON BOND FUND PERFORMANCE
As we have mentioned before, compared to stock funds, there are fewer empirical studies on bond fund performance evaluation. A possible explanation for this might be the discussion about the proper risk measure for bond portfolios and the problems in applying the same measures used for stocks. Asset pricing theories, in general, assume a linear relation between the risk measures and the expected excess return of the assets. In principle, these models are applicable to all financial securities but their implications are usually examined for stocks.15
15 While the risk of common stocks is generally identified as being systematic or unsystematic, the perceived risks of bonds are commonly identified as interest rate risk, duration risk, purchasing power risk and default risk. The lack of convincing empirical evidence to show that covariance risks are priced in bond markets contributes to the different descriptions of risk. Furthermore, empirical models for fixed income returns are typically directed at the problem of solving for the prices of derivative claims, and their application to portfolios with unobserved weights, such as mutual funds, is not straightforward (Farnsworth, 1997b).
On the other hand, prior to the 1970s, interest rates were very stable and most bond portfolios followed buy-and-hold strategies, so their performance probably did not differ much. This environment changed in the late 1970s and especially in the 1980s, when interest rates rose sharply and became more volatile. The incentive towards more active bond management led to an increasing demand for techniques to evaluate bond portfolio managers as their performance became substantially more dispersed.
The first studies on bond portfolio performance used a single-index measure with duration, a measure of the sensitivity of a bond’s price to changes in interest rates, as the risk measure (Wagner and Tito, 1977a, 1977b). They derived a bond market line similar to the security market line used in evaluating equity fund performance. In this case, duration simply replaces beta as the risk variable.16 However, it is well known that duration is an exact measure of interest rate sensitivity only if all term structure shifts are parallel and infinitesimally small. Some studies investigated the explanatory power of the single-index model using duration as the risk measure. Ilmanen (1992) found that duration explains quite well (between 80 and 90 percent) the cross-sectional variation in Government bond returns. With respect to corporate bonds, Ilmanen, McGuire and Warga (1994) found that duration is an incomplete measure of risk and that its explanatory power decreases as credit quality declines.
As an alternative to the models based on duration, regression-based index models, commonly used for stocks, also began to be derived for bonds. Under these models, risk is measured by beta, the regression coefficient, and the intercept alpha represents the performance measure (as in Jensen’s measure). Instead of a stock index, a bond index is used as the benchmark portfolio.
16 This approach is described in detail in investments textbooks such as Reilly and Brown (1997) and Sharpe, Alexander and Bailey (1999).
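The limitation of duration noted above, that it is exact only for parallel and infinitesimally small shifts, can be checked numerically. The sketch below assumes a hypothetical 10-year bond paying a 6 percent annual coupon and priced off a flat yield of 6 percent, so that any change in yield is a parallel shift by construction; the function names are illustrative.

```python
def cash_flows(face, coupon_rate, years):
    """Annual coupons plus repayment of the face value at maturity."""
    coupon = face * coupon_rate
    return [coupon] * (years - 1) + [coupon + face]

def bond_price(face, coupon_rate, years, yld):
    """Present value of the cash flows at a flat annual yield."""
    flows = cash_flows(face, coupon_rate, years)
    return sum(cf / (1 + yld) ** t for t, cf in enumerate(flows, start=1))

def modified_duration(face, coupon_rate, years, yld):
    """Macaulay duration divided by (1 + yield)."""
    flows = cash_flows(face, coupon_rate, years)
    price = bond_price(face, coupon_rate, years, yld)
    macaulay = sum(t * cf / (1 + yld) ** t
                   for t, cf in enumerate(flows, start=1)) / price
    return macaulay / (1 + yld)

def duration_error(face, coupon_rate, years, yld, shift):
    """Absolute gap between the exact relative price change and the
    first-order estimate -modified_duration * shift."""
    p0 = bond_price(face, coupon_rate, years, yld)
    exact = bond_price(face, coupon_rate, years, yld + shift) / p0 - 1
    approx = -modified_duration(face, coupon_rate, years, yld) * shift
    return abs(exact - approx)

small_error = duration_error(100, 0.06, 10, 0.06, 0.0001)  # 1 basis point
large_error = duration_error(100, 0.06, 10, 0.06, 0.02)    # 200 basis points
print(f"error for a 1 bp shift:   {small_error:.2e}")
print(f"error for a 200 bp shift: {large_error:.2e}")
```

The approximation error is negligible for the one basis point shift and becomes material for the 200 basis point shift, even though both shifts are parallel. Non-parallel shifts, which this flat-yield sketch cannot represent, would add a further source of error.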
Gudikunst and McCarthy (1992) used a single-index model with a broad bond market index and concluded that the performance of the funds, net of expenses, was almost equal to that of the bond market index. They also applied a multi-index model and found that other variables are important in explaining bond portfolio returns. These additional variables are the change in market duration, a yield spread on BBB-rated bonds, the standard deviation of past returns and the growth rate of total fund assets.17
Blake, Elton and Gruber (1993) compare single-index to multiple index models using a set of different bond market indices. They argue that conclusions on bond fund performance are not substantially affected by the choice of indices, once the high yield effect is accounted for, and that fewer factors than in the stock market seem to explain expected returns on bonds. The bond funds in the sample underperformed the relevant market indices by almost the amount of the average management fees.
In a later study, Elton, Gruber and Blake (1995) developed a relative18 APT model for bonds and used this model to evaluate bond fund performance. Following previous studies on common stocks (for example, Chen, Roll and Ross, 1986), they considered as factors driving expected bond returns a set of portfolio returns (measuring variables such as market returns, default risk and term risk) and two fundamental economic variables: a measure of unexpected changes in inflation and a measure of unexpected changes in Gross National Product (GNP). In the context of the (relative) APT model, bond funds continued to show neutral performance once expenses were added back, and negative performance on a net-of-expenses basis.
17 A similar model was applied to the Portuguese bond fund market (Silva and Armada, 1998). With very few exceptions, Portuguese bond funds showed significant negative performance.
18 The authors differentiate between relative pricing and equilibrium models. What they develop is not an equilibrium model but a relative pricing model. They assume absence of arbitrage, which is a necessary although not sufficient equilibrium condition; it only guarantees that assets have relative prices (not absolute) consistent with equilibrium.
Gallo, Lockwood and Swanson (1997) also evaluated the performance of a sample of US-based international bond funds against both single and multiple benchmarks. For the single benchmark model they considered an international bond market index, and for the multiple benchmark model they considered three risk factors: a world bond market index, a US bond index and a factor representing currency risk. Their results indicated that funds, in general, were unable to outperform either benchmark over the total sample period. Fund managers did, however, outperform the multi-index benchmark for the first half of the sample period. Furthermore, they concluded that the multi-index model was the more appropriate model for evaluating international bond fund performance.
In research on the performance of high-yield (low-grade) bond funds, using a single-index model with a high-yield index as the market index, Gudikunst and McCarthy (1997) find, once again, no evidence of superior performance. Extending the work of Cornell and Green (1991), they also analyse a multi-index model as a risk-return-generating process for high-yield bond portfolios. The variables used include the high-yield bond index excess return, the federal Government deficit, industrial production, the yield spread on a BBB-rated bond index and the volatility of previous fund returns. The results seem to indicate that high-yield bond returns are explained not only by the bond markets but also by variables representing economic conditions, as found by Cornell and Green.
Kahn (1998), using the style analysis approach of Sharpe (1992)19, investigated bond fund performance and also found no evidence of superior performance among bond funds. Similar conclusions were reached by Singh and Dresnack (1998) for US municipal bond funds, using the single-index model with a municipal bond index as the factor index.
Recent research on global bond funds20 (Detzler, 1999) reaches similar conclusions. Detzler’s study applies a wide range of benchmarks: two single-index models (one using a world Government bond index and the other a broad bond index including US Government and corporate bonds) and three multi-index models (using Government bond indices of the major countries, a corporate bond index and exchange rate variables). On average, the 19 global funds did not outperform any of these five benchmarks during the period from 1988 to 1995. The first multi-index model, a 10-factor benchmark, includes excess returns in local currencies on five Government bond indices (Canada, US, Germany, Japan and UK), a corporate bond factor21 and exchange rate movements for the Canadian Dollar, Deutsche Mark, Japanese Yen and British Pound. The second one, a US dollar benchmark, includes six factors: excess returns in US dollars on the same five Government bond indices and the corporate bond factor. The third one, a fully-hedged benchmark, also includes six factors: the currency-hedged excess returns on the five Government bond indices and the corporate bond factor.
19 This methodology consists of separating each month’s portfolio return into two main components: the return attributable to style, usually measured by an asset class factor model, and the return attributable to selection. The style return represents the return on an appropriate benchmark and the selection return measures the performance relative to the benchmark.
20 Global bond funds (also called world bond funds) invest in both US and non-US bonds. They differ from international bond funds, which usually invest exclusively in non-US bonds.
21 This factor was included in order to capture the higher default risk of corporate bonds. As monthly returns on non-US corporate bond indices are not available, Detzler considers only a US corporate bond index.
Maag and Zimmermann (2000) investigate the performance of a sample of 40 German bond funds over the period 1988-1996. They consider several single-index models, several multi-index models and also an asset class factor model. Across all of these models they obtained alphas of the same order of magnitude, with more negative than positive alphas. The conclusion that resulted from the application of all the different models and index specifications was that bond funds underperform.
The conclusion that can be drawn from almost all of the few available empirical studies is that bond funds are not able to beat the market; on average, they present neutral or even negative performance. This conclusion is similar to what has been found for stock funds. Furthermore, although bond fund performance, as measured by alpha, seems to be less sensitive to the benchmark problem, empirical evidence suggests that multiple benchmark models are, in general, more appropriate.
However, performance measures obtained using these regression-based index models, besides being subject to the traditional criticisms of estimation error and appropriateness of benchmarks, assume that returns and risk are stationary. In the case of bond portfolios this assumption can be even more critical, as bond fund managers are probably more market timers than security pickers, particularly for funds that hold mostly Government bonds. Their performance depends heavily on the ability to predict future interest rates and to adjust the duration of the fund accordingly. If they predict an increase (decrease) in interest rates they should decrease (increase) the duration. As duration is closely related to beta, regression-based index models with fixed betas may not be appropriate due to beta non-stationarity.
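A stylized sketch shows how a fixed-beta regression can misread a timing strategy. It assumes a hypothetical manager with perfect foresight who holds a beta of 1.5 in periods when the benchmark’s excess return is positive and 0.5 when it is negative, and who has no selection ability at all; the figures are invented for illustration.

```python
import numpy as np

def fixed_beta_regression(fund_excess, market_excess):
    """Return (alpha, beta) from an OLS regression with a constant beta."""
    x = np.asarray(market_excess, dtype=float)
    y = np.asarray(fund_excess, dtype=float)
    X = np.column_stack([np.ones(len(x)), x])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return float(coef[0]), float(coef[1])

# Hypothetical benchmark excess returns, alternating up and down periods.
market = np.array([0.02, -0.02, 0.02, -0.02])

# Time-varying exposure: long duration (high beta) before the benchmark
# rises, short duration (low beta) before it falls.
true_beta = np.where(market > 0, 1.5, 0.5)

# No selectivity: each period's return is exactly beta times the benchmark.
fund = true_beta * market

alpha, beta = fixed_beta_regression(fund, market)
print(f"estimated alpha = {alpha:.4f}, estimated beta = {beta:.2f}")
```

The regression reports a single average beta and a non-zero intercept, even though the fund earns nothing beyond its period-by-period benchmark exposure. The intercept therefore reflects the changing beta rather than security selection, which is the sense in which a constant-beta model may be inappropriate here.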
Assuming that expected returns and risk are time-varying presumes the existence of some return predictability. If this is the case, there should be some publicly available information, related to economic conditions, that is useful for predicting security returns. This type of empirical evidence has important implications for market efficiency, asset pricing and performance evaluation and therefore cannot be disregarded. In the next section we review and discuss the issue of return predictability.