HKU Big Data and Privacy Workshop. Privacy Risks of Big Data Analytics From a Regulator’s Point of View

Academic year: 2022

(1)

Privacy Risks of Big Data Analytics – From a Regulator’s Point of View HKU Big Data and Privacy Workshop

(2)

1. Data protection principles

2. Big data analytics and privacy

Big Data Analytics and Mobile Apps

(3)

Personal Data Flow

[Diagram. Stages shown: Collection; Storage, Use or Processing; Retention/Erasure. Other label: IT System.]

OECD Privacy Framework Principles shown: Collection Limitation, Data Quality, Use Limitation

(4)

1. Data protection principles

2. Big data analytics and privacy

Big Data Analytics and Mobile Apps

(5)

Failures of Big Data Analytics

Big Data and Privacy

(6)

Failures of Big Data Analytics – Google Flu Prediction

It does not always work…

• Underestimated by half in 2009 when compared with CDC data

• Overestimated by half in 2012 when compared with CDC data

• Predictor of flu or predictor of winter?

• A black-box approach makes it hard for people to judge

(7)

Failures of Big Data Analytics – US Presidential Election

“Past performance does not guarantee future results…”

• Colorado professors built a data model that correctly “backward predicted” the eight US presidential election results since 1980

• It failed to forward predict the 2012 election…

(8)

Privacy risks of big data analytics

(9)

Privacy Risks of Big Data Analytics

1. Sense of rights violation or “surprise”

2. Re-identification

3. Negative impact/discrimination

(11)

Correct prediction can still be creepy

(12)

The Surprise of Big Data Analytics – Target’s Pregnancy Prediction

If it works in this way…

(13)

The Surprise of Big Data Analytics – Target’s Pregnancy Prediction

Target learnt this lesson:

“Then we started, in the same mailer, mixing baby items with other things we know they would never buy, like lawn mower… as long as the pregnant woman doesn’t know she has been spied on, it works and she would use the coupons…”

(14)

Privacy Risks of Big Data Analytics

1. Sense of rights violation or “surprise”

2. Re-identification

(15)

The myth of anonymisation

(16)

The Myth of Anonymisation

AOL released “anonymised” search records of 650,000 people over a three-month period

User 4417749 was identified as Ms Arnold of Lilburn, Georgia, through the keywords she entered

Her searches also included “nicotine effect”, “dry mouth”, “hand tremors”, “bipolar disorder” – do we need to

(17)

The Myth of Anonymisation

“Anonymised” Massachusetts state employee hospital records

– State employee hospital records released for research

– Governor reassured the public that the data was de-identified

– Governor’s own record re-identified by a researcher by

(18)

The Myth of Anonymisation

How much data do you need to identify someone?

– 87% of the US population can be identified by using Zip code, gender and date of birth;

– 53% by place, gender and date of birth; and

– 18% by county, gender and date of birth.
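
The percentages above measure how often a combination of quasi-identifiers is unique within a population. A minimal Python sketch of that measurement follows, assuming a handful of invented records. The field names (`zip`, `county`, `gender`, `dob`) and the helper `unique_fraction` are hypothetical, and the toy output says nothing about the real US figures.

```python
from collections import Counter

# Hypothetical toy records; every value here is invented for illustration.
RECORDS = [
    {"zip": "Z1", "county": "North", "gender": "F", "dob": "1944-03-02"},
    {"zip": "Z1", "county": "North", "gender": "F", "dob": "1971-08-15"},
    {"zip": "Z2", "county": "North", "gender": "F", "dob": "1971-08-15"},
    {"zip": "Z2", "county": "North", "gender": "M", "dob": "1980-01-20"},
    {"zip": "Z1", "county": "North", "gender": "M", "dob": "1980-01-20"},
]


def unique_fraction(records, fields):
    """Return the share of records whose values for `fields` are unique.

    A record that is the only one with its combination of values can be
    singled out by anyone who knows those values from another source.
    """
    keys = [tuple(record[field] for field in fields) for record in records]
    counts = Counter(keys)
    unique = sum(1 for key in keys if counts[key] == 1)
    return unique / len(records)


fine_grained = unique_fraction(RECORDS, ["zip", "gender", "dob"])
coarse = unique_fraction(RECORDS, ["county", "gender", "dob"])
print(f"zip + gender + dob:    {fine_grained:.0%} unique")
print(f"county + gender + dob: {coarse:.0%} unique")
```

In this toy set, coarsening the location from zip code to county lowers the share of unique records, which is the same direction as the drop from 87% to 18% on the slide.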

(19)

The Myth of Anonymisation

The only way to make data anonymous is to make it useless…

Professor Paul Ohm (University of Colorado Law School)

(20)

Privacy Risks of Big Data Analytics

1. Sense of rights violation or “surprise”

2. Re-identification

3. Negative impact/discrimination

(21)

Before we look at discrimination, let’s look at the reality of big data analytics

(22)

The Reality of Big Data Analytics

Big data analytics:

✓ Correlation

✗ Causation

(23)

The (Academic) Reality of Big Data Analytics

US spending on science, space and technology reveals suicides by hanging, strangulation and suffocation?

(24)

The (Academic) Reality of Big Data Analytics

Number of Nicolas Cage films reveals swimming-pool drownings?

(25)

The (Academic) Reality of Big Data Analytics

Divorce rate in Maine reveals per capita consumption of margarine?
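
Pairings like the ones on these slides can arise whenever two unrelated quantities both drift in the same direction over time. The Python sketch below illustrates this under stated assumptions: `series_a` and `series_b` are made-up yearly figures (not the data behind the slides), and `pearson` and `year_on_year` are hypothetical helper names. It compares the correlation of the raw levels with the correlation of the year-on-year changes.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


def year_on_year(values):
    """Differences between consecutive values, which removes a linear trend."""
    return [later - earlier for earlier, later in zip(values, values[1:])]


years = range(10)
# Two invented series: each is an upward trend plus its own unrelated wiggle.
series_a = [10 + 2 * t + (1 if t % 2 else -1) for t in years]
series_b = [50 + 5 * t + (3 if t % 3 == 0 else -1) for t in years]

r_levels = pearson(series_a, series_b)
r_changes = pearson(year_on_year(series_a), year_on_year(series_b))
print(f"correlation of levels:  {r_levels:.2f}")
print(f"correlation of changes: {r_changes:.2f}")
```

The levels correlate strongly only because both series rise over time; once the shared trend is removed, the year-on-year changes show essentially no relationship. A pattern matcher that looks only at levels would report the first number.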

(26)

The Reality of Big Data Analytics

But do we really care about the difference between correlation and causation?

(27)

The (Commercial) Reality of Big Data Analytics

(28)

The (Commercial) Reality of Big Data Analytics

(29)

The (Commercial) Reality of Big Data Analytics

(30)

The Reality of Big Data Analytics

(31)

The Reality of Big Data Analytics

Marketers are not interested in theories, they are interested in results.

So if it works, what’s the problem?

So if users of table feet protectors pay back their loans promptly, what’s wrong with lending to them?

The problem lies with the ‘have not’, those that you

(32)

The Reality of Big Data Analytics

Is there a solution to this?

– Need to know what big data is and isn’t good at

IS

• Pattern matcher

• Gives recommendations

ISN’T

• Substitutes for proper data collection and

(33)

Privacy Challenge of Big Data Analytics

Risks recap:

1. The (unintended) impacts on people when it is working;

2. The risks of re-identifying people from anonymised sensitive data; and

3. The “targeted not”.

(34)

Big Data Analytics
