Big Data for the Common Good: A regulation perspective

(1)


Dr Nikolaos Korfiatis (n.korfiatis@uea.ac.uk)

Assistant Professor in Business Analytics and Regulation

University of East Anglia (UEA) and Centre for Competition Policy (CCP)

(2)

About the speaker

• PhD (CBS/Denmark, 2009), MEng (KTH/Sweden, 2006)

• Academic Appointments

• 2014 - Assistant Professor in Business Analytics and Regulation, UEA & CCP

• 2011-2014 – Senior Researcher/Director for Data Science and Analytics, Frankfurt Big Data Lab

• 2009-2011 – Senior Researcher/Postdoc, Department of Economics, University of Copenhagen

• Industrial Appointments

• 2014 - Senior Data Scientist in Residence for Adastra Germany (Automotive, Banking)

(3)

Agenda

• What is the common good ? What’s the role of big data in shaping it ? What are the misfits ?

• Big Data working for the benefit of consumers

• Case studies

• The case of online reviews and endorsements

• The case of the sharing economy

• How do we move forward ?

(4)

The common good

• A utilitarian perspective: the greatest possible good for the greatest possible number of individuals

• A theological perspective: “Do not live entirely isolated, having retreated into yourselves, as if you were already [fully] justified, but gather instead to seek together the common good” (Codex Sinaiticus, Epistle of Barnabas)

• A political perspective: Promoting the general welfare

(5)

(6)

A definition for big data (Wikipedia)

• “Big data is a broad term for data sets so large or complex that traditional data processing applications are inadequate.

Challenges include analysis, capture, data curation, search, sharing, storage, transfer, visualization, and information privacy.

• The term often refers simply to the use of predictive analytics or other certain advanced methods to extract value from data, and seldom to a particular size of data set.

• Accuracy in big data may lead to more confident decision making. And better decisions can mean greater operational efficiency, cost reductions and reduced risk.”

(7)


(8)

Information Load and Information Overload

(9)

Are big data used for the benefit of consumers ?

(10)

(11)

Bias

(12)

Grinberg, Nir, et al. "Extracting Diurnal Patterns of Real World Activity from Social Media." ICWSM. 2013.

(13)

Signal

(14)

(15)

(16)

Informed Minority vs Uninformed Majority

(17)

From Citizens to Consumers

(18)

What if …

• Big data was used to protect consumer interests rather than being an enabler of corporate goals (e.g. focusing on improving experience rather than targeting sales) ?

• Big data helped consumers reward positive experiences and drive more competition for quality ?

• Big data enabled legislators to prevent abuse (e.g. cartel formation) ?

(19)

Case study 1: Online reviews and endorsements

(20)

(21)

Definitions

• Review valence: The rating that a customer gives

• 95% of the websites provide a 5-star rating scale ranging from extremely negative (1 star) to extremely positive (5 stars).

• Hotel websites (e.g. TripAdvisor, Booking.com) are more inclined to use a 10-star rating scale.

• Review volume: The total number of reviews a product has received up to a given time point

(22)

Definitions

• Review helpfulness: Most retailers allow consumers to rate the information content by asking if a review was helpful

(23)

How do the ratings look ?

• For a rating $r_i \in \{1, 2, 3, 4, 5\}$ given at time point t, the distribution is found empirically to be J-shaped (Hu et al. 2009).

Figure: Number of reviews per unit of the rating scale (total number of reviewed items/books: $N_{books} = 7262$). Source: Korfiatis et al. 2012, ECRA.
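The definitions of review valence, review volume and the per-star rating distribution can be illustrated with a minimal Python sketch. The sample data and the function names (`review_volume`, `mean_valence`, `rating_distribution`) are hypothetical and chosen only for illustration; they are not taken from the cited studies.

```python
from collections import Counter

# Hypothetical sample: (time point, star rating) pairs for one product.
reviews = [(1, 5), (2, 5), (3, 1), (4, 4), (5, 5), (6, 2), (7, 5), (8, 1)]


def review_volume(reviews, t):
    """Number of reviews received up to and including time point t."""
    return sum(1 for time, _ in reviews if time <= t)


def mean_valence(reviews, t):
    """Average star rating over the reviews received up to time point t."""
    ratings = [rating for time, rating in reviews if time <= t]
    return sum(ratings) / len(ratings) if ratings else None


def rating_distribution(reviews, scale=(1, 2, 3, 4, 5)):
    """Number of reviews per unit of the rating scale."""
    counts = Counter(rating for _, rating in reviews)
    return {star: counts.get(star, 0) for star in scale}


if __name__ == "__main__":
    print("volume at t=4:", review_volume(reviews, 4))
    print("mean valence at t=4:", mean_valence(reviews, 4))
    print("distribution:", rating_distribution(reviews))
```

In this made-up sample most ratings sit at 5 stars, a smaller group sits at 1 star, and the middle of the scale is nearly empty, which is the kind of shape the J-shaped finding refers to.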

(24)

Why reviews matter (2)

• Help consumers evaluate alternatives better and shrink the size of the consideration set (Pavlou and Gefen, 2004)

• Reduce information asymmetry in transactions with incomplete information about the seller (Dellarocas, 2003)

• Drive growth for businesses: A one-star increase in a restaurant's Yelp rating led to an increase of 5% to 9% in revenues (Luca, 2011)

(25)

So should we trust them ?

(26)

Mechanism problems and firm motives

• Reviewer side

• Self selection

• Material relationship (Astroturfing)

• Rating / voting motives are different (aspect rating)

• Firm side

• “Ballot stuffing” / Shilling (for own benefit)

• Flame wars (against competitors) (Mayzlin et al., 2013, AER)

(27)

(28)

(29)

Regulator’s perspective

o Are online retailers doing enough to avoid review manipulation ?

o Google not doing enough for click fraud avoidance (Tuzhilin, 2006)

o Should heuristics be established with which review aggregators and review sites should comply (Malbon, 2013) ?

o Who owns this content ?

(30)

Are regulators aware of this problem ?

(31)

Regulatory Landscape

• UK CMA (2008) “Consumer Protection from Unfair Trading”

• Prohibiting “falsely representing oneself as a consumer”

• US Federal Trade Commission (FTC, 2009) and (FTC, 2013) Guides Governing Endorsements, Testimonials

• Any material relation should be disclosed to the consumer (targeted to astroturfing)

(32)

Reviews decline over time

• Hu, Nan, Ling Liu, and Vallabh Sambamurthy. "Fraud detection in online consumer reviews." Decision Support Systems 50.3 (2011): 614-626.

(33)

Attacking the review mechanism with ballot stuffing

Strategy

• Consider the following estimator of the average rating for N reviews, indexed by event time i:

$$\bar{r} = \frac{1}{N} \sum_{i} r_i$$

• Consider a subset of fake reviews $N_f$

• If $N_f$ arrives early in event time and $N_f$ is sufficiently large, the influence of a genuine review $r_g$ on the estimator will be minimal: $r_g / N_f$
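The dilution effect described above can be sketched numerically. The following Python example is an illustration under stated assumptions, not the method of the cited paper: the function names (`average_rating`, `stuffed_average`), the 5-star fake rating and the single 1-star genuine review are all hypothetical. It computes the plain average rating after a batch of early fake reviews is followed by genuine ones.

```python
def average_rating(ratings):
    """Plain mean of a list of star ratings."""
    return sum(ratings) / len(ratings)


def stuffed_average(n_fake, fake_rating, genuine_ratings):
    """Average rating when n_fake fake reviews arrive before the genuine ones."""
    return average_rating([fake_rating] * n_fake + list(genuine_ratings))


if __name__ == "__main__":
    genuine = [1]  # hypothetical: one genuine, very negative review
    for n_fake in (0, 5, 50):
        avg = stuffed_average(n_fake, fake_rating=5, genuine_ratings=genuine)
        print(f"fake reviews: {n_fake:3d}  average rating: {avg:.2f}")
```

As the number of early fake 5-star reviews grows, the single genuine 1-star review moves the average less and less, which is the sense in which its influence shrinks with $N_f$.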

(34)

But is it different from genuine biases ?

• Self selection bias:

• Customers who are more positive towards a producer/brand tend to be the first to buy the product and thus the first to provide ratings.

• Rating scale also favours under-reporting bias for the median item.

• Sequential bias:

• Previous reviews influence the valence of the next review to arrive.

(35)

Other influences on review manipulation

• Mayzlin et al., 2013, AER

• Review aggregator policy might make the process of posting a fake review costly (depending on whether or not a purchase is required before posting).

• Proximity to a competitor might influence the proportion of fake reviews through competition between firms for a constrained space/customer inflow (e.g. hotels).

(36)

Case study 2: The sharing economy

(37)

(38)

The “Sharing” economy

• Development of technology and trust mechanisms (e.g. reputation building using online reviews/star ratings) allows for the co-exploitation of assets by more than one agent (collaborative consumption)

• Examples:

• Ride sharing: BlaBlaCar

• Taxi: Uber

• Rentals/B&B: Airbnb

• Etc.

(39)

Airbnb: Rentals from private owners

(40)

(41)

Should the regulators intervene ?

• Public finance approach: How easy is it for asset owners to evade taxes ?

• Social welfare: In two-sided markets such as the rental market, do these apps create adverse effects ?

• Should the regulators monitor how the “disruptive” potential of these apps creates an unfair situation for the uninformed majority ?

(42)