2.5 Performance evaluation of the statistical models and betting

2.5.3 Robustness

The robustness of the end wealth calculations is assessed by evaluating how sensitive the results are to two factors: (1) which golfers are bet on; (2) which tournaments are bet on. To keep the following simple, only the two best performing strategies are considered.

(1) Which golfers are bet on: I test whether the results are driven by bets on golfers with specific values of ’odds’ and ’volume matched’.

I run a grid search where the two betting strategies are evaluated for golfers in different ranges of odds and matched volume. That is, I evaluate the betting strategies on golfers with low odds (< 50) and high odds (≥ 50) as well as on liquid golfers (golfers on whom more than £5000 is matched) and less liquid golfers (golfers on whom £1000 to £5000 is matched). I have chosen not to look at illiquid golfers, because only very small bets can be placed on them. From Table 2.15 it is clear that the positive return is mainly driven by bets on golfers with odds < 50: in that group the end wealth ranges from 1.34 to 1.72, whereas for odds ≥ 50 it stays between 0.96 and 1.05.

Table 2.15: End wealth by betting on select groups of golfers.

                  2a.fractional CL              2b.fractional Discounted CL
Matched volume    £1000 to £5000    ≥ £5000     £1000 to £5000    ≥ £5000
Odds < 50         1.54              1.34        1.72              1.46
Odds ≥ 50         0.96              1.05        0.97              1.03
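The segmentation behind Table 2.15 can be sketched roughly as follows. This is a minimal illustration and not the code used in the thesis: the record fields (`tournament`, `odds`, `matched`, `fraction`, `net_return`), the function names and the example bets are all hypothetical, and the wealth update (wealth multiplied by one plus the summed gain per tournament) is an assumed stand-in for Equation 2.16. The liquidity cut-off follows the table (≥ £5000).

```python
from collections import defaultdict
from math import prod


def segment(odds, matched):
    """Map a bet to its (odds band, liquidity band) cell; None for illiquid golfers."""
    if matched < 1000:
        return None  # illiquid golfers are left out, as in the text
    odds_band = "odds < 50" if odds < 50 else "odds >= 50"
    liquidity_band = "1000 to 5000" if matched < 5000 else ">= 5000"
    return odds_band, liquidity_band


def end_wealth_by_segment(bets, start_wealth=1.0):
    """Compound wealth separately within each cell of the grid."""
    # Net gain per cell and tournament, as a fraction of current wealth.
    gains = defaultdict(lambda: defaultdict(float))
    for bet in bets:
        cell = segment(bet["odds"], bet["matched"])
        if cell is not None:
            gains[cell][bet["tournament"]] += bet["fraction"] * bet["net_return"]
    # Assumed update rule: wealth is multiplied by (1 + gain) after each tournament.
    return {
        cell: start_wealth * prod(1.0 + g for g in per_tournament.values())
        for cell, per_tournament in gains.items()
    }


# Made-up bets for illustration only (not thesis data).
# net_return is per unit staked: odds - 1 for a winning back bet, -1 for a losing one.
bets = [
    {"tournament": 1, "odds": 20.0, "matched": 8000, "fraction": 0.02, "net_return": 19.0},
    {"tournament": 1, "odds": 80.0, "matched": 2000, "fraction": 0.01, "net_return": -1.0},
    {"tournament": 2, "odds": 20.0, "matched": 8000, "fraction": 0.02, "net_return": -1.0},
    {"tournament": 2, "odds": 30.0, "matched": 500, "fraction": 0.01, "net_return": -1.0},
]
wealth = end_wealth_by_segment(bets)
```

Each cell is evaluated as if only the bets in that cell had been placed, which is what allows the cells to be compared with each other.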

(2) Which tournaments are bet on: To test whether the results are driven by bets on only a few tournaments, I create 1000 bootstrapped resamples (with replacement) of the 139 tournaments with odds (see Table 2.4). For each of the 1000 bootstrapped samples of 139 tournaments, I calculate the end wealth of the strategies according to Equation 2.16.

Figure 2.11 contains boxplots of the end wealth for the 1000 bootstrapped samples for the two betting strategies ’2a.fractional CL’ and ’2b.fractional Discounted CL’. The results indicate that the strategies could enable a bettor to make a very large positive return. The positive return does not appear to be driven by a few specific tournaments in the sample.

Figure 2.11: Boxplot of end wealth for the bootstrapped resamples for the two best performing strategies. The numbering (e.g. 2a) refers to the numbered list on page 65; fractional refers to the fractional Kelly strategy.
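The resampling step can be illustrated with a short sketch. It assumes that each tournament contributes one multiplicative growth factor to wealth, so that end wealth is the product of the factors; this is a simplification and may differ from Equation 2.16. The function names and the randomly generated factors below are hypothetical and are not the thesis data.

```python
import random
from math import prod


def bootstrap_end_wealth(growth_factors, n_resamples=1000, seed=0):
    """Resample tournaments with replacement and compound each resample."""
    rng = random.Random(seed)
    n = len(growth_factors)
    return [prod(rng.choices(growth_factors, k=n)) for _ in range(n_resamples)]


def quantile(sorted_values, q):
    """Simple non-interpolated quantile of an already sorted list."""
    index = min(len(sorted_values) - 1, int(q * len(sorted_values)))
    return sorted_values[index]


# Made-up growth factors for 139 tournaments (illustration only).
data_rng = random.Random(42)
factors = [1.0 + data_rng.uniform(-0.05, 0.07) for _ in range(139)]

samples = sorted(bootstrap_end_wealth(factors))
# Quartiles of the bootstrapped end wealth, i.e. the box of a boxplot.
summary = {q: quantile(samples, q) for q in (0.25, 0.5, 0.75)}
```

If a handful of tournaments drove the result, many resamples would omit them and the lower part of the distribution would fall towards or below the start wealth.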

How much to bet?

Before deciding whether to implement the betting system proposed above in the real world, it would be relevant to investigate how much money one could expect to win. It is clear that the odds are going to change if you place big bets on a golfer. If you back a golfer, the odds will likely fall; if you lay a golfer, the odds will likely increase. The size of the increase or decrease is likely related to the liquidity of the golfer, but it is not possible to derive an explicit formula for how the odds will change as a function of the size of the bet placed.

Chapter 3 Conclusion

In this thesis I evaluate whether Betfair’s golf prediction markets are efficient. I create a novel dataset containing: (1) winning market prices from the biggest public prediction market, Betfair, for all golf tournaments in the PGA Tour and the European Tour in 2011 and 2012 and (2) historical golf results from the PGA Tour, European Tour, Champions Tour and Nationwide Tour from the beginning of 2002 to the end of 2012.

I evaluate whether the golf prediction markets are efficient by testing whether market prices are well adjusted to two sets of relevant historical data. First, I perform weak form tests to see if prices are efficiently adjusted to historical prices. Secondly, I perform semi-strong form tests to see if prices are efficiently adjusted to results from previous golf tournaments.

I test for weak form and semi-strong form efficiency by building betting strategies which aim at achieving a positive return. A positive return indicates market inefficiency. My approach to evaluating prediction market efficiency differs somewhat from that of other papers dealing with prediction markets. For example, Cowgill & Zitzewitz (2013), Forsythe et al. (1992) and Smith et al. (2006) aim to evaluate whether prediction markets provide more precise probability estimates than corporate experts, exit polls and bookmakers, respectively. My approach is, however, in line with other studies of market efficiency in sports markets; see e.g. Bolton & Chapman (1986), Benter (1994) and Sung & Johnson (2012).

My analyses lead to the following two main findings:

Firstly, I am not able to achieve a positive return using the naive betting strategies of betting on favorites and on longshots, respectively. I thus find no sign of weak form inefficiency in the market.

Secondly, by using the most profitable proposed betting strategy based on results from previous golf tournaments, a bettor is able to more than double his start wealth over the two year period from the beginning of 2011 to the end of 2012. The robustness of this finding is evaluated via (1) bootstrapped resampling of the tournaments as well as (2) simulations of the betting strategies for golfers in different ranges of odds and matched volume. The result stands up to the robustness tests. The fact that the betting strategy enables a bettor to achieve a positive return over the two year period indicates that prices are not well adjusted to e.g. results from previous golf tournaments; Betfair’s golf markets thus seem semi-strong form inefficient.

My conclusions are weakened by the fundamental structure of golf tournaments. I have relatively few tournaments with many golfers, much competition and possibly important attributes that are hard to quantify (e.g. the psychological state of the golfers). The shown ability to achieve a positive return could have been caused by e.g. a sampling bias, whereby some tournament outcomes are overrepresented in the two year sample I analyze, compared to the ’true population of golf tournament outcomes’. More tournaments could be included in the analyses in the future to strengthen the conclusions.

The findings in this thesis lead to the following two considerations.

Firstly, it would be worth investigating the possibilities of implementing the betting strategy in real life. The ideas and approaches proposed in this thesis could be further refined in order to improve performance before a real life implementation. The existing attributes could be incorporated into the model in different ways and new attributes could be added to the model in order to obtain a better and less biased fit. The added attributes could describe the golfers’ individual weather preferences, psychological state, preferences with regard to type of course etc. Benter (1994) writes that it took him “approximately five man-years of effort to [...] organize the database and develop a handicapping model”. More time is thus likely to improve the performance of the proposed models and strategies.

Secondly, this thesis provides evidence that suggests that prediction markets are inefficient estimators of event probabilities. I propose that more studies such as this one are needed before prediction markets can confidently be used as efficient probability estimators.

When deciding which method to use in order to answer the question “What is the probability of...?”, the amount of available data should play a key role.

My findings indicate that prices in prediction markets do not always correspond to true event probabilities even for very liquid markets such as Betfair’s golf markets. An analytical approach could potentially provide more efficient estimates of event probabilities than prediction markets.

Bibliography

Arrow, K. J., Forsythe, R., Gorham, M., Hahn, R., Hanson, R., Ledyard, J. O., Levmore, S., Litan, R., Milgrom, P., Nelson, F. D. et al. (2008). The promise of prediction markets. Science 320(5878), 877.

Beard, H. (2009). Golf: An Unofficial and Unauthorized History of the World’s Most Preposterous Sport. Simon and Schuster.

Beilock, S. L. & Carr, T. H. (2001). On the fragility of skilled performance: What governs choking under pressure? Journal of Experimental Psychology: General 130(4), 701.

Benter, B. (1994). Computer based horse race handicapping and wagering systems: a report. In: Efficiency of Racetrack Betting Markets (Hausch, D. B., Lo, V. S. & Ziemba, W. T., eds.). Academic Press, pp. 183–198.

Betfair (2014). Historical golf odds. http://data.betfair.com/. [Online; accessed: 2014-01-18].

Bolton, R. N. & Chapman, R. G. (1986). Searching for positive returns at the track: A multinomial logit model for handicapping horse races. Management Science 32(8), 1040–1060.

Cowgill, B. & Zitzewitz, E. (2013). Corporate prediction markets: Evidence from Google, Ford, and Koch Industries.

Edelman, D. (2007). Adapting support vector machine methods for horserace odds prediction. Annals of Operations Research 151(1), 325–336.

Ehrenberg, R. G. & Bognanno, M. L. (1990). The incentive effects of tournaments revisited: Evidence from the European PGA Tour. Industrial and Labor Relations Review, 74S–88S.

Fama, E. F. (1970). Efficient capital markets: A review of theory and empirical work. The Journal of Finance 25(2), 383–417.

Forsythe, R., Nelson, F., Neumann, G. R. & Wright, J. (1992). Anatomy of an experimental political stock market. American Economic Review 82, 1142–1142.

Fox, J. (2008). Applied regression analysis and generalized linear models. Sage.

Franck, E., Verbeek, E. & Nüesch, S. (2010). Prediction accuracy of different market structures, bookmakers versus a betting exchange. International Journal of Forecasting 26(3), 448–459.

Griffith, R. M. (1949). Odds adjustments by American horse-race bettors. The American Journal of Psychology.

Hausch, D. B., Lo, V. S. & Ziemba, W. T. (1994). Efficiency of Racetrack Betting Markets, vol. 2. World Scientific Publishing.

Hausch, D. B., Lo, V. S., Ziemba, W. T. & Ziemba, W. (2008). Efficiency of Racetrack Betting Markets, vol. 2. World Scientific Publishing.

Jones, E., Oliphant, T., Peterson, P. et al. (2001–). SciPy: Open source scientific tools for Python. URL http://www.scipy.org/.

Kelly, J. L. (1956). A new interpretation of information rate. Information Theory, IRE Transactions on 2(3), 185–189.

Lessmann, S., Sung, M.-C. & Johnson, J. E. (2007). Adapting least-square support vector regression models to forecast the outcome of horseraces. The Journal of Prediction Markets 1(3), 169–187.

MacLean, L., Ziemba, W. T. & Blazenko, G. (1992). Growth versus security in dynamic investment analysis. Management Science 38(11), 1562–1585.

Manski, C. F. (2006). Interpreting the predictions of prediction markets. Economics Letters 91(3), 425–429.

McFadden, D. (1974). Conditional logit analysis of qualitative choice behavior.

Orszag, J. M. (1994). A new look at incentive effects and golf tournaments. Economics Letters 46(1), 77–88.

Ottaviani, M. & Sørensen, P. N. (2008). The favorite-longshot bias: an overview of the main explanations. Handbook of Sports and Lottery Markets (eds. Hausch, DB and Ziemba, WT), North-Holland/Elsevier, 83–102.

Pedregosa, F., Varoquaux, G., Gramfort, A., Michel, V., Thirion, B., Grisel, O., Blondel, M., Prettenhofer, P., Weiss, R., Dubourg, V., Vanderplas, J., Passos, A., Cournapeau, D., Brucher, M., Perrot, M. & Duchesnay, E. (2011). Scikit-learn: Machine learning in Python. Journal of Machine Learning Research 12, 2825–2830.

Sauer, R. D. (1998). The economics of wagering markets. Journal of Economic Literature 36(4), 2021–2064.

Schmidt, C. & Werwatz, A. (2002). How accurate do markets predict the outcome of an event?

Shmanske, S. (2005). Odds-setting efficiency in gambling markets: Evidence from the PGA Tour. Journal of Economics and Finance 29(3), 391–402.

Smith, M. A., Paton, D. & Williams, L. V. (2006). Market efficiency in person-to-person betting. Economica 73(292), 673–689.

Smith, M. A., Paton, D. & Williams, L. V. (2009). Do bookmakers possess superior skills to bettors in predicting outcomes? Journal of Economic Behavior & Organization 71(2), 539–549.

Smoczynski, P. & Tomkins, D. (2010). An explicit solution to the problem of optimizing the allocations of a bettor’s wealth when wagering on horse races. Mathematical Scientist 35(1).

Stevenson, A. & Lindberg, C. (2010). New Oxford American Dictionary, Third Edition. Oxford University Press.

Sung, M. & Johnson, J. E. (2012). Comparing the effectiveness of one- and two-step conditional logit models for predicting outcomes in a speculative market. The Journal of Prediction Markets 1(1), 43–59.

Tan, P.-N., Steinbach, M. & Kumar, V. (2013). Introduction to data mining. Pearson Education India.

Tanaka, R. & Ishino, K. (2012). Testing the incentive effects in tournaments with a superstar. Journal of the Japanese and International Economies 26(3), 393–404. URL http://www.sciencedirect.com/science/article/pii/S0889158312000196.

Tziralis, G. & Tatsiopoulos, I. (2012). Prediction markets: An extended literature review. The Journal of Prediction Markets 1(1), 75–91.

Verbeek, M. (2008). A guide to modern econometrics. John Wiley & Sons.

Wolfers, J. & Zitzewitz, E. (2006). Interpreting prediction market prices as probabilities. Tech. rep., National Bureau of Economic Research.

Yahoo (2014a). Historical golf results. http://sports.yahoo.com/golf/. [Online; accessed: 2014-01-18].

Yahoo (2014b). THE PLAYERS Championship. http://sports.yahoo.com/golf/pga/leaderboard/2013/13. [Online; accessed: 2014-01-18].

Yahoo (2014c). Yahoo data license. http://info.yahoo.com/guidelines/us/yahoo/ydn/ydn-3955.html. [Online; accessed: 2014-01-18].

Ziemba, W. T. (2008). Chapter 10 - efficiency of racing, sports, and lottery betting markets. In: Handbook of Sports and Lottery Markets (Hausch, D. B. & Ziemba, W. T., eds.), Handbooks in Finance. San Diego: Elsevier, pp. 183–222.

Appendix A Appendix

A.1 Golf dictionary

Table A.1: Golf terms

Term           Description

Bogey          A score of one over par.

Fairway        The area between the tee box and the putting green where the grass is cut even and short.

Golfer         A person who plays golf.

Handicap       A handicap is a numerical measure of a golfer’s potential playing ability based on the tees played for a given course.

Hazard         Special areas on the golf course that have additional rules for play. There are generally two types: (1) water hazards, e.g. ponds, lakes, and rivers; and (2) bunkers, which are sand traps.

Par            The pre-determined number of strokes that a scratch (or 0 handicap) golfer should require to complete a hole or a round.

Rough          A grass area on the golf course where the grass is cut higher than the grass on the fairway and the green. It is typically a disadvantageous area to hit from.

Stroke play    The most common scoring system in golf. It involves counting the total number of strokes used on each hole during a given round, or series of rounds. The winner is the player who has used the fewest strokes over the course of the round, or rounds.

Tee box        The starting point of a golf hole.

Source: Stevenson & Lindberg (2010)

In document Market Efficiency in Person-to-Person Betting on Golf Tournaments (Page 72-84)