
5. TESTS OF MARKET EFFICIENCY

5.3 Further tests of market efficiency for the remaining sub samples

The previous section tested the efficiency of the B365 home win odds by means of a spread graph, independent sample t-tests and several regressions. This section applies the same tests to the other sub samples: the odds of the big five teams in the Premier League, the odds of teams that were promoted to the Premier League, the odds of teams with large followings and, finally, the odds of teams with obscure followings. These sub samples are further divided into home win odds and away win odds. To keep the discussion concise, the current study does not describe the results of each sub sample in a separate section but presents an overview of the spread graphs of the different sub samples together with an overview of the results of the independent sample t-tests. Based on these findings, the sub samples that show the most signs of inefficiency are tested further by means of the regression approach used in the previous section. This may yield some specifics that are of use for the following chapter, which tries to find profitable trading strategies. In addition, the previous section showed that the results of the independent sample t-tests are close to the results of the regressions, so running all the regressions once more would be excessive. The same approaches are used to validate the assumptions and run the tests. Full outputs as well as the bundled spread tables are given in the appendixes, which are separated into the five sub samples.

The spread graphs of the different sub samples are given in figure 9. As explained earlier, spread graphs are an excellent way of testing the efficiency of the odds. The previous chapter indicated that bookmakers overprice the favorite, which results in lower implied probabilities; if the realized probabilities stay constant, the spreads turn positive. It is therefore expected that the spread is substantially positive, especially in the lower odd ranges. If one assumes that the same bookmaker’s profit margin is built into the odds of both the favorite and the underdog, the spread would gradually decline and become negative for the higher odd ranges: if the favorite is overpriced, the underdog must be underpriced. The spread is therefore a natural measure for testing efficiency.
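The construction of a spread per odd range can be illustrated with a minimal sketch. It assumes decimal odds, takes the implied probability as 1 divided by the odds (so the bookmaker's margin is still included), and bundles the odds into ranges of equal width. The function name, the range width of 0.5 and the example bets are hypothetical and are not taken from the study's data.

```python
from collections import defaultdict


def spread_by_odds_range(bets, bin_width=0.5):
    """Return {range_start: (implied, realized, spread)} per odds range.

    bets: iterable of (decimal_odds, won) pairs, where won is True or False.
    Assumption: implied probability = 1 / decimal odds, averaged per range.
    """
    bins = defaultdict(list)
    for odds, won in bets:
        # Bundle the odds into ranges of equal width, e.g. 1.5-2.0, 4.0-4.5.
        start = int(odds / bin_width) * bin_width
        bins[start].append((odds, won))

    result = {}
    for start, items in sorted(bins.items()):
        implied = sum(1.0 / odds for odds, _ in items) / len(items)
        realized = sum(1 for _, won in items if won) / len(items)
        # Spread = realized probability minus implied probability.
        result[start] = (implied, realized, realized - implied)
    return result


# Hypothetical bets: favorites in the 1.5 range, underdogs in the 4.0 range.
example_bets = [
    (1.5, True), (1.6, True), (1.8, False), (1.9, True),
    (4.0, False), (4.2, False), (4.4, True), (4.1, False), (4.3, False),
]
spreads = spread_by_odds_range(example_bets)
for start, (implied, realized, spread) in spreads.items():
    print(f"{start:.1f}: implied={implied:.3f} realized={realized:.3f} spread={spread:+.3f}")
```

In this made-up example the favorites win more often than their odds imply (positive spread) and the underdogs less often (negative spread), which is the downward pattern described above.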

Several of the sub samples indeed show this downward trend. The promoted teams home win sub sample in particular depicts the expected pattern. Other sub samples of interest, besides the whole sample home win, are the whole sample away win, the big five away win and the large following away win. One problem with away odds is the low frequency of occurrences in the lower odd ranges: an away win is generally more difficult to accomplish than a home win, and consequently away matches are priced higher.

Figure 9. Spread graphs of all the different sub samples, separated into home win odds (the left figures) and away win odds (the right figures).

Large Followings Win this spread is initially positive and thereafter decreases gradually to negative spreads.

Additionally, some sub samples are of interest although they do not portray this downward trend. The big five home win sub sample and the large following home win sub sample, for example, show positive spreads across all odd ranges. This would be a clear sign of inefficiently set odds and could result in some interesting trading strategies for punters.

To further grasp the meaning of these spread graphs, the results of the independent sample t-tests are given in table 12 below. The approach is similar to the one used for the whole sample home win odds. The only difference is that, for the remaining sub samples, the current study tests only one of the assumptions underlying the independent sample t-test. The independent sample t-test has two assumptions, of which one should be satisfied to yield accurate results. The first is that the samples are large, i.e. n1, n2 > 30. The second is that the quantitative variable is normally distributed. For the whole sample home win odds both assumptions were tested for the sake of completeness. The remaining sub samples, however, are tested only on the normality assumption of the quantitative variable, by means of the Shapiro-Wilk test.

Although it may be somewhat risky to base our findings on one assumption test alone, the whole sample home win odds indicated that, for populations of this size, the central limit theorem should normally apply. This implies that the sample means taken from these populations approximately follow a Gaussian distribution (Bowerman, O’Connell & Hand 2001). All the different sub samples have finite populations of much more than 30. In a way this implies that the first assumption is satisfied for the sub samples and that the results would thus be accurate. However, the current study considers this too risky a basis for its results and therefore does not attach any meaning to the results of the independent sample t-tests if the quantitative variable is not normally distributed. Further reassurance concerning the accuracy of our approach is given by Conover (1999), Royston (1995) and Shapiro and Wilk (1965), who all claim that the Shapiro-Wilk test is the most accurate test for detecting non-normality in small to medium samples. Because all of the sub samples have between 8 and 60 observations of the quantitative variable, i.e. after bundling the odds into ranges, the Shapiro-Wilk test seems to offer the most accurate test statistics. The results are presented in table 12.

Table 12. Independent sample t-test results for all the different sub samples.

Sub sample | Shapiro-Wilk for variable | t | Sig. (2-tailed)
Whole sample away win | 0,317 | -0,335 | 0,739
Whole sample away win lower half | 0,967 | 0,430 | 0,682
Big five home win | 0,033 | Not relevant | Not relevant
Big five home win lower half | Not relevant | Not relevant | Not relevant
Big five away win | 0,021 | Not relevant | Not relevant
Big five away win lower half | Not relevant | Not relevant | Not relevant
Promoted teams home win | 0,805 | 0,303 | 0,764
Promoted teams home win lower half | 0,334 | 0,871 | 0,409
Promoted teams away win | 0,485 | -0,380 | 0,707
Promoted teams away win lower half | 0,938 | -0,027 | 0,980
Large following home win | 0,204 | 0,660 | 0,513
Large following home win lower half | 0,787 | 1,326 | 0,210
Large following away win | 0,388 | -0,215 | 0,831
Large following away win lower half | 0,064 | 0,811 | 0,441
Obscure followings home win | 0,933 | 0,229 | 0,820
Obscure followings home win lower half | 0,797 | 0,046 | 0,964
Obscure followings away win | 0,080 | -0,341 | 0,735
Obscure followings away win lower half | 0,998 | 0,365 | 0,734

Table 12 indicates that two variables are not normally distributed: the quantitative variables of the big five teams in the Premier League, for both the home win odds and the away win odds. For these samples the results of the independent sample t-test are therefore not given, because they might be biased. Recall that the independent sample t-test measures the difference in means between the realized probabilities and the implied probabilities of the outcomes in each odd range. The independent sample t-tests are thus directly related to the spread graphs, as the spread for each odd range is the realized probability minus the implied probability for that range. Significant spreads, or significant results for the independent sample t-test, thus indicate inefficiently set odds.
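The comparison of means behind these tests can be sketched as follows. This is a minimal illustration of an equal-variance independent sample t statistic using only the Python standard library; it is not the study's own procedure, and the function name and the probabilities per odd range are hypothetical. A two-tailed p value would additionally require the t distribution, which the standard library does not provide; a statistics package such as SciPy (`scipy.stats.ttest_ind`, and `scipy.stats.shapiro` for the normality check) could be used for that.

```python
import math
import statistics


def pooled_t_statistic(sample_a, sample_b):
    """Equal-variance independent sample t statistic and degrees of freedom."""
    n_a, n_b = len(sample_a), len(sample_b)
    mean_a, mean_b = statistics.mean(sample_a), statistics.mean(sample_b)
    # statistics.variance returns the sample variance (n - 1 in the denominator).
    var_a, var_b = statistics.variance(sample_a), statistics.variance(sample_b)
    df = n_a + n_b - 2
    pooled_var = ((n_a - 1) * var_a + (n_b - 1) * var_b) / df
    std_err = math.sqrt(pooled_var * (1.0 / n_a + 1.0 / n_b))
    return (mean_a - mean_b) / std_err, df


# Hypothetical realized and implied probabilities, one value per odd range.
realized = [0.70, 0.55, 0.42, 0.30, 0.18]
implied = [0.66, 0.53, 0.41, 0.31, 0.20]

t_stat, df = pooled_t_statistic(realized, implied)
print(f"t = {t_stat:.3f} with {df} degrees of freedom")
```

With 8 degrees of freedom the two-tailed 5% critical value is about 2.306, so a t statistic this close to zero would not be significant, which mirrors the kind of result reported in table 12.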

Based upon the table several things become clear. First, in five of the seven sub samples for which the test is reported, the lower half sample has a lower p value than the complete sample, in some cases only marginally (the exceptions are the promoted teams away win and the obscure followings home win samples). This may indicate that for the lower odd ranges spreads are indeed slightly more positive. Second, none of the p values is significant, which gives no evidence that the sub populations tested by the current study are inefficient. The external validity of the results is discussed further in the limitations part later on. Lastly, the table shows some results that are in line with the spread graphs. For example, the p value of the lower half promoted teams home win sample is indeed lower than the p value of the complete promoted teams home win sample. This is in line with the graph of the promoted teams home sample, which shows a clear downward spread pattern. Nevertheless, this downward pattern is not strong enough to produce significant p values for the independent sample t-tests. The table also shows that the p values for the large followings home win sample are the lowest of all sub samples. This is something one could have expected from the spread graphs as well, since the large followings home win graph shows clear positive spreads along almost all odd bundles.

Overall, one may conclude that none of the sub samples, and thus their populations, show real signs of inefficiency. The spread graphs and independent sample t-tests do indicate, in line with theory, that the lower odds have slightly higher spreads. None of the results, however, is significant. Based upon these results and the spread graphs, it is worth checking whether a regression on the large following home win sample confirms our results. The results of this regression may then offer some further specifics of this sub sample that may be of use for creating a profitable trading strategy in chapter 6.

Unfortunately, the regressions for the large followings home win odds do not yield useful models. As for the whole sample home win odds, six regressions were created. Three of these regressions used no logarithmic function and three did use a logarithmic function. All regressions had spread or (ln)spread as the dependent variable. The independent variable for regression number 1 was odds or (ln)odds; regression number 2 added a dummy variable measuring the range of odds; and regression number 3 added an interaction variable. The full results of these tests are given in appendix F. The results are of little use: none of the adjusted R squares of the regressions is positive, the Durbin-Watson tests of the residuals all deviate to some extent from 2 and all of the independent variables are insignificant. The negative adjusted R square is probably due to the fact that R square is less than k/(n-1), where k is the number of independent variables and n is the number of observations (Bowerman, O’Connell & Hand, 2001). The regressions therefore have little explanatory or predictive power. Some of these aspects are discussed further in a later part of the study called suggestions for further research.
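The two diagnostics mentioned above can be made concrete with a short sketch. It uses the standard definitions of the adjusted R square, 1 - (1 - R²)(n - 1)/(n - k - 1), and of the Durbin-Watson statistic, the sum of squared successive residual differences divided by the sum of squared residuals. The values of n, k, R square and the residuals are hypothetical and are not taken from appendix F.

```python
def adjusted_r_squared(r_squared, n, k):
    """Adjusted R square for n observations and k independent variables."""
    return 1.0 - (1.0 - r_squared) * (n - 1) / (n - k - 1)


def durbin_watson(residuals):
    """Durbin-Watson statistic; values near 2 suggest no first-order autocorrelation."""
    numerator = sum(
        (residuals[t] - residuals[t - 1]) ** 2 for t in range(1, len(residuals))
    )
    denominator = sum(e ** 2 for e in residuals)
    return numerator / denominator


# Hypothetical case: 12 bundled odd ranges and 3 independent variables,
# so the adjusted R square turns negative once R square falls below 3/11.
n, k = 12, 3
threshold = k / (n - 1)
for r2 in (0.10, threshold, 0.50):
    print(f"R2={r2:.3f} -> adjusted R2={adjusted_r_squared(r2, n, k):+.4f}")

# Perfectly alternating residuals give a statistic well above 2.
print(durbin_watson([1.0, -1.0, 1.0, -1.0]))
```

The sketch shows why a small number of observations combined with several independent variables easily produces a negative adjusted R square, even when R square itself is positive.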

Overall, one may conclude this chapter by indicating that the spread graphs portray some patterns that are in line with theory. In particular, several of these spread graphs showed positive spreads for the lower odds that gradually declined as the odds increased in value.

The literature also argued that odds are set inefficiently. The spread graphs indeed portray positive and negative spreads, i.e. differences between realized and implied probabilities, but according to the current study’s independent sample t-test results and regression results, these positive and negative spreads are not significant enough to label any of the study’s sub populations as inefficient. Based upon the results one may also conclude that, in line with theory, spreads are bigger in the lower odd ranges than for the whole populations or samples. However, these spreads were not significant enough to label any of the ‘lower half’ sub samples as inefficient. Chapter 6 now briefly investigates whether punters can use some of the information from the above results to create trading strategies that result in profits.
