Suzy Moat
Data Science Lab
Behavioural Science, WBS [email protected]
Quantifying human behaviour using online data
The advantage of looking forward
Figure: Future Orientation Index 2010. Ratio of Google searches for “2011” to searches for “2009” during 2010 for 45 countries. Scale from 0.2 (more Google searches for “2009”) to 1.8 (more Google searches for “2011”).
Figure: Future Orientation Index 2012, on the same 0.2 to 1.8 scale.
Suzy Moat & Tobias Preis. Based on Preis, Moat, Stanley and Bishop (2012).
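A minimal sketch of the index as the figure caption describes it, assuming it is the plain ratio of search volume for the coming year to search volume for the previous year within one calendar year. The function name, the country labels and the volumes are hypothetical and are not data from the study.

```python
def future_orientation_index(volume_next_year, volume_previous_year):
    """Ratio of searches for the coming year to searches for the previous year."""
    if volume_previous_year <= 0:
        raise ValueError("search volume for the previous year must be positive")
    return volume_next_year / volume_previous_year


# Made-up 2010 search volumes for two illustrative countries.
volumes_2010 = {
    "Country A": {"2011": 150.0, "2009": 100.0},
    "Country B": {"2011": 40.0, "2009": 80.0},
}

index_by_country = {
    country: future_orientation_index(v["2011"], v["2009"])
    for country, v in volumes_2010.items()
}
print(index_by_country)
```

A value above 1 means more searches for “2011” than for “2009”, and a value below 1 means the opposite.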
Richer countries look forward
Figure, panels A and B: search volume over time with weekly granularity, 2008 to 2010, for the search terms “2007”, “2008”, “2009”, “2010” and “2011”.
Figure: Future-Orientation Index (0.0 to 2.0) and GDP / capita [10^4 USD] (1 to 4).
Preis, Moat, Stanley & Bishop (2012)
Photo: Perpetual Tourist
Anticipating market moves
Hypothetical strategy
Compare the number of page views in week t with weeks t-1, t-2 and t-3.
Page views decreased: BUY stock in week t+1
Page views increased: SELL stock in week t+1
Moat et al. (2013); Preis et al. (2013)
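The rule on this slide could be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: it assumes that "decreased" and "increased" are judged against the mean of the three preceding weeks, and the function name, the HOLD case for no change and the weekly counts are all hypothetical.

```python
from statistics import mean


def trading_signal(page_views, t, lookback=3):
    """Return the hypothetical action for week t+1 from weekly page views."""
    if t < lookback:
        raise ValueError("not enough history before week t")
    # Assumption: the baseline is the mean of weeks t-3, t-2 and t-1.
    baseline = mean(page_views[t - lookback:t])
    change = page_views[t] - baseline
    if change < 0:
        return "BUY"   # page views decreased
    if change > 0:
        return "SELL"  # page views increased
    return "HOLD"      # no change; this case is not covered on the slide


# Made-up weekly page view counts.
weekly_views = [120, 130, 125, 110, 150]
signals = [trading_signal(weekly_views, t) for t in range(3, len(weekly_views))]
print(signals)
```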
Wikipedia: Dow Jones companies
Views data: significant difference
Figure: density (0.0 to 0.6) of return [Std. Dev. of Random Strategies] (-2 to 2) for Wikipedia Views DJIA Companies, Wikipedia Edits DJIA Companies and Random Strategy.
Moat, Curme, Avakian, Kenett, Stanley & Preis (2013)
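The figure's axis expresses returns in standard deviations of random strategies. A minimal sketch of that rescaling, assuming the return is centred on the mean of the random strategies and divided by their sample standard deviation; the function name and all numbers are hypothetical.

```python
from statistics import mean, stdev


def return_in_random_sds(strategy_return, random_returns):
    """Express a strategy's return in standard deviations of random strategies."""
    mu = mean(random_returns)
    sigma = stdev(random_returns)  # sample standard deviation (an assumption)
    if sigma == 0:
        raise ValueError("random strategy returns have zero spread")
    return (strategy_return - mu) / sigma


# Made-up returns of five random strategies and of one strategy under test.
random_returns = [-0.2, -0.1, 0.0, 0.1, 0.2]
score = return_in_random_sds(0.3, random_returns)
print(round(score, 3))
```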
Wikipedia: Financial topics
Views data: significant difference
Figure: density (0.00 to 1.00) of return [Std. Dev. of Random Strategies] (-2 to 2) for Wikipedia Views Financial Topics, Wikipedia Edits Financial Topics and Random Strategy.
Moat, Curme, Avakian, Kenett, Stanley & Preis (2013)
Wikipedia: Actors and filmmakers?
Figure: density (0.0 to 0.4) of return [Std. Dev. of Random Strategies] (-2 to 2) for Wikipedia Views Actors & Filmmakers and Random Strategy.
Moat, Curme, Avakian, Kenett, Stanley & Preis (2013)
How keywords perform
Figure: return (in random strategy sds, -1 to 2) for the keywords “debt”, “culture”, “stocks”, “credit”, “garden” and “train”, with reference levels at random strategy mean + 1 sd and random strategy mean + 2 sds.
Preis, Moat & Stanley (2013)
Financial relevance
Indicator of financial relevance: # occurrences in FT / # hits on Google
Returns significantly correlated with indicator of financial relevance
Figure: return (in random strategy sds, -1 to 2), with reference levels at random strategy mean + 1 sd and random strategy mean + 2 sds.
Preis, Moat & Stanley (2013)
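The indicator on this slide is a ratio of two counts, which could be sketched as below. The function name, the term labels and the counts are hypothetical and are not values from the study.

```python
def financial_relevance(ft_occurrences, google_hits):
    """Occurrences in the FT normalised by the number of hits on Google."""
    if google_hits <= 0:
        raise ValueError("number of Google hits must be positive")
    return ft_occurrences / google_hits


# Made-up counts: (occurrences in FT, hits on Google) per illustrative term.
counts = {
    "term_a": (900, 1_000_000),
    "term_b": (50, 2_000_000),
}

relevance = {term: financial_relevance(ft, hits) for term, (ft, hits) in counts.items()}

# Rank terms from most to least financially relevant under this indicator.
ranked = sorted(relevance, key=relevance.get, reverse=True)
print(ranked)
```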
What is searched for before falls?
55 groups of search terms
Business and politics most related
Terms shown on slide: debt, money, crisis, internet, technology
Figure: cumulative returns (%), -100 to 200, for Random Strategy, Politics I and Business.
Curme, Preis, Stanley & Moat (2014)
Photographers as sensors
Flickr and tourist numbers
Scenicness and wellbeing
People report better health in more scenic locations
Figure, panels A to D: average percentage of greenspace (0 to 0.99, GREENSPACE), average scenic rating (1 to 8, SCENICNESS) and average rates of poor health (SMR, 0 to 3.2, POOR HEALTH), with London, Birmingham, Manchester, Newcastle, Liverpool and Sheffield marked; and probability of the model given the data (AICw, 0.00 to 1.00) for the models Scenicness only, Greenspace only, and Scenicness and Greenspace, across All areas, Urban, Suburban and Rural.
Seresinhe, Preis & Moat (under review)
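The figure reports the probability of each model given the data as Akaike weights (AICw). A minimal sketch of the standard Akaike weight calculation, in which each model's weight is exp(-Δ/2) normalised over all candidate models and Δ is the model's AIC minus the lowest AIC. The model labels and AIC values below are made up for illustration and are not results from the study.

```python
import math


def akaike_weights(aic_values):
    """Convert a mapping of model name -> AIC into Akaike weights."""
    best = min(aic_values.values())
    # Relative likelihood of each model, from its AIC difference to the best one.
    relative = {name: math.exp(-0.5 * (aic - best)) for name, aic in aic_values.items()}
    total = sum(relative.values())
    return {name: value / total for name, value in relative.items()}


# Made-up AIC values for three hypothetical candidate models.
aic = {"model_1": 102.0, "model_2": 110.0, "model_3": 100.0}
weights = akaike_weights(aic)
print({name: round(w, 3) for name, w in weights.items()})
```

The weights sum to 1, so they can be read as the relative support for each model within the set compared.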
Measuring flu with Google
Ginsberg et al., Nature 457, 1012 (2009)
Flu data
Google Trends estimate
Butler, Nature 494, 155 (2013)
“The press reports may have triggered many fl
Figure: influenza-like illness [%] (2 to 6) over time [weeks], 2010 to 2013, showing observed value, predicted value, 80% and 95% prediction intervals, the training period and the out-of-sample nowcast.
Preis & Moat (2014)
Flu estimate errors significantly reduced
To what extent can Internet data help us measure and even predict human behaviour?
[email protected]
@suzymoat
computer science statistics physics mathematics
crime science
finance health
economics