Faculty of Science
Machine Learning and Financial Advice
Christian Igel
Department of Computer Science
Outline
1 Machine Learning at DIKU
2 Example Applications in Finance
3 Risks of Automated Investment Tools
4 ML, Fairness, and Discrimination
Machine learning
Machine learning is a branch of computer science and applied statistics covering software that improves its performance at a given task based on sample data or experience.
Why machine learning?
• Computer systems are required for tasks whose solutions cannot be specified in the traditional way, e.g., because
  • the designer’s knowledge is limited, and/or
  • the sheer complexity and variability preclude an accurate description.
• However, large amounts of data describing the task are often available or can be automatically obtained.
• To take proper advantage of this information, we need systems that self-adapt and automatically improve based on sample data – systems that learn.
Machine learning turns data into knowledge
Machine learning in finance
• Customer credit scoring
• Automatic trading
• Credit card approval and fraud detection
• Computer-aided rating
• Prediction of mortgage underwriting
• Foreign exchange rate forecasting
• Predicting stock initial public offerings
• Forecasting economic turning points
• Bankruptcy prediction
• Signature verification
• Risk management
• Enhancing auditing by irregularity detection
• Identification of customer characteristics and targeted marketing
Driver assistance systems
Stallkamp, Schlipsing, Salmen, Igel: Man vs. Computer: Benchmarking Machine Learning Algorithms for Traffic Sign Recognition. Neural Networks 32, 2012
Sport analytics
Schlipsing, Salmen, Tschentscher, Igel: Adaptive Pattern Recognition in Real-time Video-based Soccer Analysis. Journal
DIKU researchers in ML and data mining
Machine Learning Lab
http://image.diku.dk/MLLab
Image Section
http://www.diku.dk/english/research/imagesection
DIKU faculty doing machine learning, information retrieval, and pattern recognition: Corinna Cortes (head of Google Research New York, adjunct), Ingemar Cox, Marleen De Bruijne, Sune Darkner, Aasa Feragen, Christian Igel (head of ML Lab), Francois Lauze, Christina Lioma, Mads Nielsen (head of Image Group), Marco Loog (TU Delft, adjunct), Søren Olsen, Yevgeny Seldin, Jon Sporring, Kim Steenstrup Pedersen, . . .
Support Vector Machines (SVMs)
Kernel: $k(x,z) = \langle \Phi(x), \Phi(z) \rangle$, where Φ is the feature map.
Cortes, Vapnik: Support-Vector Networks, Machine Learning 20, 1995
Glasmachers, Igel: Maximum-Gain Working Set Selection for SVMs, Journal of Machine Learning Research 7, 2006
Glasmachers, Igel: Maximum Likelihood Model Selection for 1-Norm Soft Margin SVMs with Multiple Parameters, IEEE Transactions on Pattern Analysis and Machine Intelligence 32, 2010
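The kernel identity above can be checked numerically. The following is a minimal sketch, assuming a homogeneous quadratic kernel on two-dimensional inputs; the kernel choice and the names `phi`, `dot`, and `kernel` are illustrative assumptions and are not taken from the cited papers.

```python
import math

def phi(x):
    """Explicit feature map for the quadratic kernel on 2-D inputs (illustrative choice)."""
    x1, x2 = x
    return (x1 * x1, math.sqrt(2.0) * x1 * x2, x2 * x2)

def dot(u, v):
    """Standard inner product of two equally long vectors."""
    return sum(a * b for a, b in zip(u, v))

def kernel(x, z):
    """k(x, z) = (x . z)^2, computed in input space without building phi(x)."""
    return dot(x, z) ** 2

x, z = (1.0, 2.0), (3.0, -1.0)
in_input_space = kernel(x, z)            # evaluates the kernel directly
in_feature_space = dot(phi(x), phi(z))   # evaluates <phi(x), phi(z)> explicitly
print(in_input_space, in_feature_space)
```

Both quantities agree up to rounding, which is why an SVM can work in the feature space while only ever evaluating k on pairs of inputs.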
Trees and Random Forests
[Figure: classification tree for spam detection. Leaves are labelled "spam" or "email"; internal nodes split on thresholds of features such as ch$, remove, ch!, george, hp, CAPMAX, receive, edu, CAPAVE, free, business, and 1999.]
Hastie, Tibshirani, & Friedman. The Elements of Statistical Learning. Springer,
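How a tree, and a forest of trees, arrives at a prediction can be sketched in a few lines. This is a toy illustration under stated assumptions: the feature names and thresholds are borrowed from the figure labels, but the tree structure, the leaf labels, and the names `TOY_TREE`, `TOY_FOREST`, `predict_tree`, and `predict_forest` are hypothetical and do not reproduce the tree in the figure.

```python
from collections import Counter

# A node is either a leaf label (str) or a tuple
# (feature, threshold, subtree_if_below, subtree_otherwise).
# Structure and leaf labels are invented for illustration.
TOY_TREE = ("ch$", 0.0555,
            ("remove", 0.06, "email", "spam"),
            ("hp", 0.405, "spam", "email"))

def predict_tree(node, sample):
    """Follow threshold tests from the root until a leaf label is reached."""
    while not isinstance(node, str):
        feature, threshold, below, otherwise = node
        node = below if sample[feature] < threshold else otherwise
    return node

def predict_forest(trees, sample):
    """Majority vote over the predictions of the individual trees."""
    votes = Counter(predict_tree(tree, sample) for tree in trees)
    return votes.most_common(1)[0][0]

TOY_FOREST = [TOY_TREE,
              ("remove", 0.06, "email", "spam"),
              ("ch!", 0.191, "email", "spam")]

message = {"ch$": 0.2, "remove": 0.0, "hp": 0.0, "ch!": 0.5}
print(predict_tree(TOY_TREE, message), predict_forest(TOY_FOREST, message))
```

A real random forest would additionally learn each tree from a bootstrap sample of the data with randomised feature selection; the sketch covers prediction only.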
Business example: Credit scoring
A credit score measures the creditworthiness of a client.
[Flowchart: Client applies for loan → Application evaluation → Loan evaluation → Good: client granted loan / Bad: client declined loan]
Joint work with Danske Bank; figures in this section provided by Kasper Nybo Hansen.
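The evaluation step in the flowchart can be sketched as a score followed by a threshold decision. The example below is a minimal sketch assuming a logistic-regression-style score; the feature names, weights, bias, and threshold are invented for illustration and do not describe the model used in the joint work.

```python
import math

# Hypothetical weights; a real model would be fitted to historical loan data.
WEIGHTS = {"income_thousands": 0.04, "years_employed": 0.15, "missed_payments": -0.9}
BIAS = -2.0

def credit_score(client):
    """Estimated probability that the client is a good credit risk."""
    z = BIAS + sum(weight * client[name] for name, weight in WEIGHTS.items())
    return 1.0 / (1.0 + math.exp(-z))

def evaluate_application(client, threshold=0.5):
    """Map the score to the two outcomes of the flowchart."""
    return "granted" if credit_score(client) >= threshold else "declined"

reliable = {"income_thousands": 60, "years_employed": 8, "missed_payments": 0}
risky = {"income_thousands": 25, "years_employed": 1, "missed_payments": 3}
print(evaluate_application(reliable), evaluate_application(risky))
```

The threshold is a business decision: raising it declines more applicants and trades lost revenue against fewer defaults.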
Results from MSc thesis
[Figure: bar chart of accuracy (axis from 0.76 to 0.88) for LDA, LOG, K-NN, RF, CART, C4.5, SVM, and Mod. RF; labelled values: 0.846, 0.835, 0.833.]
Application in cyber fraud detection
• Cybercrime is one of the top four economic crimes, leading to financial loss and damaging the reputation of institutions.
Global Economic Crime Survey 2014, PwC
• With Nets A/S, we apply machine learning for detecting online identity theft.
• General and individual user models are learnt; deviations from the models indicate attacks.
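The idea of flagging deviations from a learnt user model can be sketched as follows. This is a minimal sketch assuming a single numeric feature per user (a transaction amount) and a mean and standard deviation as the "model"; the feature, the threshold of 3.0, and all function names are assumptions, not the method developed with Nets A/S. Only an individual user model is shown; a general model would be fitted the same way on data pooled across users.

```python
from statistics import mean, stdev

def fit_user_model(amounts):
    """Summarise a user's past behaviour by mean and sample standard deviation."""
    return {"mean": mean(amounts), "std": stdev(amounts)}

def deviation(model, amount):
    """Distance of a new observation from the model, in standard deviations."""
    return abs(amount - model["mean"]) / model["std"]

def is_suspicious(model, amount, threshold=3.0):
    """Flag observations that deviate strongly from the user's usual behaviour."""
    return deviation(model, amount) > threshold

history = [42.0, 38.5, 51.0, 45.0, 40.0, 47.5]  # invented past transactions
model = fit_user_model(history)
print(is_suspicious(model, 46.0), is_suspicious(model, 900.0))
```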
Alert
In May 2015, the U.S. Securities and Exchange Commission (SEC) and the Financial Industry Regulatory Authority (FINRA) issued an alert to “provide investors with a general overview of automated investment tools”:
• Understand any terms and conditions.
• Consider the tool’s limitations, including any key assumptions.
• Recognize that the automated tool’s output directly depends on what information it seeks from you and what information you provide.
• Be aware that an automated tool’s output may not be right for your financial needs or goals.
• Safeguard your personal information.
Alert
For example . . .
“an automated investment tool [. . . ] may be programmed to consider limited options. For example, an automated investment tool may only consider investments offered by an affiliated firm.”
What is the risk of this happening when the advisor is human?
New York Times with Cynthia Dwork
NYT: Some people have argued that algorithms eliminate discrimination because they make decisions based on data, free of human bias. Others say algorithms reflect and perpetuate human biases. What do you think?
C. DWORK: Algorithms do not automatically eliminate bias.
Suppose a university, with admission and rejection records dating back for decades and faced with growing numbers of applicants, decides to use a machine learning algorithm that, using the historical records, identifies candidates who are more likely to be admitted.
ML and discrimination
This is all true, but it can be turned around:
ANONYMOUS: Suppose the university solely relies on human resource case handlers who, using their personal opinions and experience, identify candidates who are more likely to be admitted.
Historical biases from the case handler’s socialisation, historical biases from experiences many years ago, as well as biases caused by the high variance in the limited lifetime experience may guide the case handler, and past discrimination as well as random effects may lead to future discrimination.
Fairness through awareness I
C. DWORK: “Fairness Through Awareness” makes the observation that sometimes, in order to be fair, it is important to make use of sensitive information while carrying out the classification task.
This may be a little counterintuitive: The instinct might be to hide information that could be the basis of discrimination.
Fairness through awareness II
NYT: What’s an example?
C. DWORK: Suppose we have a minority group in which bright students are steered toward studying math, and suppose that in the majority group bright students are steered instead toward finance.
An easy way to find good students is to look for students studying finance [. . . ]
But not only is it unfair to the bright students in the minority group, it is also low utility.
[. . . ] cultural awareness tells us that “minority+math” is similar to “majority+finance.” A classification algorithm that has this sort of cultural awareness is both more fair and more useful.
New York Times, August 10, 2015, interview with Cynthia Dwork by Claire Cain Miller
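The math/finance example from the interview can be made concrete with a toy population. The sketch below is an illustration under invented data: every record, the two rules, and the function names are assumptions chosen to mirror the example, not an implementation of the "Fairness Through Awareness" framework itself.

```python
# Toy population: (group, field, is_bright). All records are invented.
students = [
    ("majority", "finance", True), ("majority", "finance", True),
    ("majority", "math", False), ("majority", "history", False),
    ("minority", "math", True), ("minority", "math", True),
    ("minority", "finance", False), ("minority", "history", False),
]

def unaware_rule(group, field):
    """Ignores group membership: 'good students study finance'."""
    return field == "finance"

def aware_rule(group, field):
    """Uses the sensitive attribute: 'minority+math' is treated like 'majority+finance'."""
    preferred = {"majority": "finance", "minority": "math"}
    return field == preferred[group]

def recall_bright(rule, group):
    """Fraction of the bright students in a group that the rule selects."""
    bright = [(g, f) for g, f, b in students if g == group and b]
    return sum(rule(g, f) for g, f in bright) / len(bright)

def accuracy(rule):
    """Fraction of all students for whom the rule matches the true label."""
    return sum(rule(g, f) == b for g, f, b in students) / len(students)

for rule in (unaware_rule, aware_rule):
    print(rule.__name__, recall_bright(rule, "minority"), accuracy(rule))
```

On this toy data the group-unaware rule selects none of the bright minority students and is also less accurate overall, which matches the point that it is both unfair and of low utility.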
Summary
• ML is a powerful tool in finance.
• Using ML does not by itself solve all problems related to fairness and discrimination.