Big Data for the Common Good: A regulation perspective
Dr Nikolaos Korfiatis (n.korfiatis@uea.ac.uk)
Assistant Professor in Business Analytics and Regulation
University of East Anglia (UEA) and Centre for Competition Policy (CCP)
About the speaker
• PhD (CBS/Denmark, 2009), MEng (KTH/Sweden, 2006)
• Academic Appointments
• 2014 - Assistant Professor in Business Analytics and Regulation, UEA & CCP
• 2011-2014 – Senior Researcher/Director for Data Science and Analytics, Frankfurt Big Data Lab
• 2009-2011 – Senior Researcher/Postdoc, Department of Economics, University of Copenhagen
• Industrial Appointments
• 2014 - Senior Data Scientist in Residence for Adastra Germany (Automotive, Banking)
Agenda
• What is the common good? What is the role of big data in shaping it? What are the misfits?
• Big Data working for the benefit of consumers
• Case studies
• The case of online reviews and endorsements
• The case of the sharing economy
• How do we move forward ?
The common good
• A utilitarian perspective: the greatest possible good for the greatest possible number of individuals
• A theological perspective: “Do not live entirely isolated, having retreated into yourselves, as if you were already [fully] justified, but gather instead to seek together the common good” (Codex Sinaiticus, Epistle of Barnabas)
• A political perspective: Promoting the general welfare
A definition for big data (Wikipedia)
• “Big data is a broad term for data sets so large or complex that traditional data processing applications are inadequate.
Challenges include analysis, capture, data curation, search, sharing, storage, transfer, visualization, and information privacy.
• The term often refers simply to the use of predictive analytics or other certain advanced methods to extract value from data, and seldom to a particular size of data set.
• Accuracy in big data may lead to more confident decision making. And better decisions can mean greater operational efficiency, cost reductions and reduced risk.”
Information Load and Information Overload
Are big data used for the benefit of consumers?
Bias
Grinberg, Nir, et al. "Extracting Diurnal Patterns of Real World Activity from Social Media." ICWSM. 2013.
Signal
Informed Minority vs Uninformed Majority
From Citizens to Consumers
What if …
• Big data was used to protect consumer interests rather than being an enabler of corporate goals (e.g. focusing on improving experience rather than targeting sales)?
• Big data helped consumers reward positive experiences and drive more competition for quality?
• Big data enabled legislators to prevent abuse (e.g. cartel formation)?
Case study 1: Online reviews and endorsements
Definitions
Review valence: The rating that a customer gives.
95% of websites provide a 5-star rating scale ranging from extremely negative (1 star) to extremely positive (5 stars).
Hotel websites (e.g. TripAdvisor, Booking.com) are more inclined to use a 10-star rating scale.
Review volume: The total number of reviews a product has received up to a given time point.
Definitions
• Review helpfulness: Most retailers allow consumers to rate the information content by asking if a review
How do the ratings look ?
• For a rating $r_i \in \{1, 2, 3, 4, 5\}$ given at time point t, the distribution is found empirically to be J-shaped (Hu et al. 2009).
Korfiatis et al. 2012, ECRA.
Number of reviews per unit of the rating scale. (Total number of reviewed items/books: Nbooks=7262)
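As a rough illustration of what "J-shaped" means for a 5-star scale, the sketch below counts reviews per star and applies a simple heuristic. The function names, the heuristic itself, and the ratings are assumptions made for illustration only; they are not the test used by Hu et al. (2009) nor the data behind the figure.

```python
from collections import Counter


def rating_histogram(ratings, scale=(1, 2, 3, 4, 5)):
    """Count how many reviews fall on each unit of the rating scale."""
    counts = Counter(ratings)
    return [counts.get(star, 0) for star in scale]


def looks_j_shaped(hist):
    """Heuristic (an assumption, not a published test): the top rating is
    the most frequent, and the lowest rating is more frequent than the
    least frequent middle rating."""
    lowest, middle, highest = hist[0], hist[1:-1], hist[-1]
    return highest == max(hist) and lowest > min(middle)


# Made-up ratings, for illustration only.
ratings = [5] * 60 + [4] * 20 + [3] * 6 + [2] * 4 + [1] * 10
hist = rating_histogram(ratings)
print(hist, looks_j_shaped(hist))
```

A bell-shaped histogram such as `[5, 10, 30, 10, 5]` fails the heuristic because its mode sits in the middle of the scale rather than at the top.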
Why reviews matter (2)
• Help consumers to better evaluate alternatives and shrink the size of the consideration set (Pavlou and Gefen, 2004)
• Reduce information asymmetry in transactions with incomplete information about the seller (Dellarocas, 2003)
• Drive growth for businesses: a one-star increase in a restaurant's Yelp rating led to an increase of 5% to 9% in revenues (Luca, 2011)
So should we trust them ?
Mechanism problems and firm motives
• Reviewer side
• Self selection
• Material relationship (Astroturfing)
• Rating / voting motives are different (aspect rating)
• Firm side
• “Ballot stuffing” / Shilling (for own benefit)
• Flame wars (against competitors) (Mayzlin et al., 2013, AER)
Regulator’s perspective
• Are online retailers doing enough to avoid review manipulation?
• Google not doing enough for click fraud avoidance (Tuzhilin, 2006)
• Should heuristics be established with which review aggregators and review sites must comply (Malbon, 2013)?
• Who owns this content?
Are regulators aware of this problem?
Regulatory Landscape
• UK CMA (2008) “Consumer Protection from Unfair Trading”
• Prohibiting “falsely representing oneself as a consumer”
• US Federal Trade Commission (FTC, 2009) and (FTC, 2013) Guides Governing Endorsements, Testimonials
• Any material relationship should be disclosed to the consumer (targeted at astroturfing)
Reviews decline over time
• Hu, Nan, Ling Liu, and Vallabh Sambamurthy. "Fraud detection in online consumer reviews." Decision Support Systems 50.3 (2011): 614-626.
Attacking the review mechanism with ballot stuffing
Strategy
• Consider the following estimator of the average rating for $N$ reviews, indexed by event time $i$:
$$\bar{r} = \frac{1}{N} \sum_{i} r_i$$
• Consider a subset of fake reviews $N_f$.
• If $N_f$ arrives early in event time and $N_f$ is sufficiently large, the influence of a genuine review $r_g$ on the estimator will be minimal ($r_g / N_f$).
But is it different from genuine biases?
• Self selection bias:
• Customers who are more positive towards a producer/brand tend to be the first to buy the product and thus the first to provide ratings.
• Rating scale also favours under-reporting bias for the median item.
• Sequential bias:
• Previous reviews influence the valence of the next review to arrive.
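To make concrete why a block of early ratings weighs so heavily on a simple average, whether those ratings are fake (ballot stuffing) or come from genuinely self-selected early buyers, the following minimal sketch measures how far one later review moves the mean. The function names and the scenario (all early reviews at 5 stars, one later 1-star review) are hypothetical and chosen only for illustration.

```python
def average_rating(ratings):
    """Plain mean of the ratings: r_bar = (1/N) * sum(r_i)."""
    return sum(ratings) / len(ratings)


def influence_of_next_review(existing, new_rating):
    """Shift in the average caused by appending one more review."""
    return average_rating(existing + [new_rating]) - average_rating(existing)


# Hypothetical scenario: n_early 5-star reviews arrive first,
# then a single genuine 1-star review arrives.
for n_early in (5, 50, 500):
    early = [5] * n_early
    shift = influence_of_next_review(early, 1)
    print(n_early, round(shift, 4))
```

In this setup the shift equals (1 - 5) / (n_early + 1), so it shrinks roughly in proportion to the number of early reviews, which is the sense in which a late genuine review has minimal influence.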
Other influencers on review manipulation
• Mayzlin et al., 2013, AER
• Review aggregator policy might make the process of posting a fake review costly (depending on whether a purchase is required before posting).
• Proximity to a competitor might influence the proportion of fake reviews through competition between firms for a constrained space/customer inflow (e.g. hotels).
Case study 2: The sharing economy
The “Sharing” economy
• Development of technology and trust mechanisms (e.g. reputation building using online reviews/star ratings) allows for the co-exploitation of assets by more than one agent (collaborative consumption)
• Examples:
• Ride sharing: BlaBlaCar
• Taxi: Uber
• Rentals/B&B: Airbnb
• Etc.
Airbnb: Rentals from private owners
Should the regulators intervene ?
• Public finance approach: How easy is it for asset owners to evade taxes?
• Social welfare: In two-sided markets such as the rental market, do these apps create adverse effects?
• Should regulators monitor how the “disruptive” potential of these apps creates an unfair situation for the uninformed majority?