Big Privacy Rises to the Challenges of Big Data


Report: Big Idea

Privacy Regulations Can Shock Data Miners, Yet Big Data Demands a New Privacy Compact

“Big Privacy” Rises to the Challenges of Big Data

April 11, 2014

Steve Wilson

Vice President and Principal Analyst

Content Editor: R “Ray” Wang


Table of Contents

Purpose and Intent

Executive Summary

Everyone Suffers When Online Crosses the Line

The Big Business of Big Data

Big Data Cannot Ignore Privacy Law

Check the Fine Print in “PII”!

Classical Data Privacy Controls

Big Data “Oil Spills”

Google Finds that Public Can Still Be Private

Facebook Was Too Clever with Photo Tagging

Target Gets Too Close to Its Female Customers

More Privacy Shocks Are Likely to Come

“Big Privacy” to Deal with Big Data

A New Big Privacy Compact

Parallax Points of View

Alan Lepofsky, Vice President and Principal Analyst, Constellation

Dr. Janice Presser, CEO, The Gabriel Institute

Constellation Research Panel

Disclosures

Endnotes

Analyst Bio: Steve Wilson

About Constellation Research


Purpose and Intent

This report aims to enhance decision makers’ appreciation of the regulatory and social impacts of data analytics and Big Data by exposing some surprising strengths of data privacy law as well as shortcomings in standard privacy principles. We set out a fresh compact that innovative digital companies can strike with their communities to respect the incredible and privileged insights that Big Data provides.

This report offers insights into three of Constellation’s primary business research themes: Data to Decisions, Matrix Commerce and the Next-Generation Customer Experience.

Executive Summary

Big Data concerns the extraction of knowledge and insights from the vast underground rivers of unstructured data that course unseen through cyberspace. It represents one of the biggest challenges to privacy and data protection society has yet seen. Never before has so much personal information been available so freely to so many.

Personally Identifiable Information (PII) is the lifeblood of most digital enterprises today. Many social media business models are fueled by a generally one-sided bargain for PII, and the fairness of this value exchange is currently a hot topic. Data analytics and data mining are able to pull PII almost out of thin air. Collectively, digital businesses may have gone too far in their enthusiasm for Big Data, sacrificing the trust of their users for short-term commercial gain.

Big Data promises vast benefits for a great many stakeholders but the benefits are jeopardized by the excesses of a few. Some cavalier online businesses are propelled by a naive assumption that data in the “public domain” is up for grabs; they err on the side of abandon. Many think the law has not kept pace with technology, but technologists often underestimate the strength of conventional data protection laws and regulations. The extraction of PII from raw data may be interpreted as a collection and as such is subject to longstanding data protection statutes. On the other hand, orthodox privacy policies and freeze-frame user agreements do not cater for the way PII can be conjured tomorrow from raw data collected today. It is unfortunate that privacy compliance efforts so often give the impression of being preoccupied with unwieldy policy documents and simplistic compulsory notices about cookies.

Thus, the fit between Big Data and data privacy standards is complex and sometimes surprising. While existing laws are not to be underestimated, Constellation calls for a fresh compact with users that engages them in the far-reaching upside of transforming data to decisions. We call on Big Data businesses to exercise restraint in using powerful analytic tools, to be transparent about their business models, to offer consumers fair value for their data and to innovate in privacy as well as data mining.


Everyone Suffers When Online Crosses the Line

Consider Pay-as-You-Drive car insurance, a new product with premiums scaled according to how you drive. By analyzing data from automobile “black boxes”, the insurance company can tell not only how far the car has gone (so that infrequent drivers can enjoy discounted fees) but can also detect where it has gone, how fast it has been going and so on. Higher risk driving behaviors attract extra levies or other forms of disincentive.

GPS signals could be used to inform these services, but explicit vehicle tracking arouses privacy fears. So some Pay-as-You-Drive systems promise not to use GPS, and instead draw only on more innocuous speed and time measurements. Yet, the privacy picture is not so simple. Recent research at the University of Denver has shown that when combined with map data, speed and time can be used to infer the location of a car at any time, just as precisely as GPS coordinates. Dr. Rinku Dewri and his colleagues write that because of this, “customer privacy expectations in non-tracking telematics applications need to be reset and new policies need to be implemented to inform customers of possible risks”.1

Constellation does not allege that insurance companies are exploiting automobile black box data in this way, but the temptations of Big Data prove time and time again to be irresistible. If time and speed data can be accessed by third parties and linked to maps or other data sets to extract insights about drivers, it may only be a matter of time before this routinely happens.

When businesses go too far with advanced data analytics and leave users feeling violated or betrayed, then everyone suffers. Disillusioned customers don’t just abandon the firms that have squandered their trust; they also lose confidence in cyberspace more broadly and withdraw from other new and worthwhile services, like e-government, e-health, digital payments and e-commerce at large.

Constellation Chairman and CEO Ray Wang has argued cogently for a correction to the way business is done around Big Data, so customers take back some control:

“We won’t be able to build sustainable digital business models until we agree on some limits to how customer data can be used. A compact must be reached on the balance between privacy and convenience.”2

This research report begins to unpack what such a new compact might look like. First, let’s review how Big Data processes and business models convert raw data into Personally Identifiable Information (PII) and expose how this extraction collides with international privacy best practice.

The Big Business of Big Data

It’s not for nothing people call it “data mining”. The raw material of Big Data – namely all the ones and zeroes coursing beneath us in the digital environment – is often likened to crude oil, alluding to the enormous riches to be extracted from an undifferentiated matrix. Look at photo data, for instance, and the rapid evolution of tools for monetizing it. These tools range from simple metadata embedded in digital photos which record when, where and with what sort of device they were taken, through to increasingly sophisticated pattern recognition and facial recognition algorithms. Image analysis can extract places and product names from photos and automatically pick out objects. It can identify faces by re-purposing biometric templates that originate from social network users tagging their friends for fun in entirely unrelated images. Image analysis lets social media companies work out what people are doing, when and where, and who they’re doing it with, thus revealing personal preferences and relationships, without anyone explicitly “liking” anything or “friending” anyone.

The ability to mine photo data defines a new digital gold rush. Like petroleum engineering, image analysis is very high tech. There is extraordinary research and development (R&D) going on in face and object recognition. The “infomopolies” like Facebook and Google (whose fortunes are made on nothing other than information) and digital media companies like Apple have invested enormously in their own R&D and in acquiring start-ups in this space. And, of course, they pay over the odds for photo companies like Picasa, Instagram and Snapchat3 - not merely because photos are fun and tagging them is cool, but because the potential for extracting intelligence from images is unbounded.

So more than data mining, Big Data is really about data refining, as suggested by Figure 1, transforming unstructured information into fresh insights, decisions and value.

Figure 1. The Metaphor of Data Refining

[Figure: image analysis refines photo data into Location (from placenames, landmarks), Linkages (from who took the photo, recognised companions), Deduced Likes (from trend data, recognised objects, emotions), Behaviour Patterns (from trend data) and, potentially, Future Intelligence.]

Business models for monetizing photo data are still embryonic. Some entrepreneurs are beginning to access photo data from online social networks. For example “Facedeals”, a proof of concept from advertising invention lab Redpepper, provides automated check-in to retail stores by face recognition; the initial registration process draws on images and other profile information made available by Facebook (with the member’s consent) over a public API (see http://redpepperland.com/lab/details/check-in-with-your-face). It is not clear if Facedeals accesses the biometric templates, but nothing in Facebook’s privacy and data use policies restrains the company from providing or selling the templates. But as we shall see, international privacy regulations do in fact restrict the uses that can be made of the byproducts of Big Data, should they be personal. Facebook has been taken to task for stretching social data analytics beyond what members reasonably expected to occur.4 We believe more surprises like this await digital businesses in retail, healthcare and other industries.

Big Data Cannot Ignore Privacy Law

It’s often said that technology has outpaced privacy law, yet by and large that’s just not the case. Technology has certainly outpaced the intuitions of consumers, who are increasingly alarmed at what Big Data can reveal about them behind their backs. However, data privacy principles set down in 1980 by the Organisation for Economic Co-operation and Development (OECD) still work well, despite predating the World Wide Web by about a decade. Enforcement of privacy laws is gaining momentum everywhere. Outside the U.S., rights-based privacy law has proven effective against many of today’s more worrying business practices. Digital entrepreneurs can feel entitled to make any use they like of data that comes their way, but in truth 30-year-old privacy law says otherwise.

Information innovators ignore international privacy law at their peril. In this section, we will see why, by reviewing the surprising definition of Personally Identifiable Information, and how good old technology-neutral privacy principles are as relevant as ever.

Check the Fine Print in “PII”!

Privacy is personal, as they say, “by definition”. But it’s important to check the technical definition of personal data, because the fine print often surprises.

The U.S. General Services Administration (GSA) defines Personally Identifiable Information as “information that can be used to distinguish or trace an individual's identity, either alone or when combined with other personal or identifying information that is linked or linkable to a specific individual” (underline added).5

This means that items of data can constitute PII if other data can be combined to identify the person concerned. And note carefully that the fragments are each regarded as PII rather than the whole data that eventually identifies someone. People often presume that PII stands for Personally Identifying (rather than Identifiable) Information. The difference is subtle but very important. The definition means that some data can and should be classified as PII before it is identified, rather than after, with due consideration to the context of the data flows and the potential for identification. This after all is only prudent; if personal data needs certain safeguards, then it is best they be applied before the data is identified and it’s too late.
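To make the "linkable" part of the definition concrete, the following is a minimal sketch of how nameless records can become identified once they are joined to a second data set. Everything in it is an assumption made for illustration: the records, the field names (postcode, birth_year, sex), the function names and the choice of a simple exact-match join are all invented, and none of it comes from the GSA definition or from any real data set.

```python
from collections import defaultdict

# Hypothetical toy data: every name, field and value here is invented.
# "De-identified" records released without names.
deidentified = [
    {"postcode": "2000", "birth_year": 1975, "sex": "F", "diagnosis": "asthma"},
    {"postcode": "2000", "birth_year": 1982, "sex": "M", "diagnosis": "diabetes"},
    {"postcode": "3056", "birth_year": 1975, "sex": "F", "diagnosis": "migraine"},
]

# A separate data set that does carry names (for example, a public register).
register = [
    {"name": "Alice Example", "postcode": "2000", "birth_year": 1975, "sex": "F"},
    {"name": "Bob Example", "postcode": "2000", "birth_year": 1982, "sex": "M"},
    {"name": "Carol Example", "postcode": "3056", "birth_year": 1975, "sex": "F"},
    {"name": "Dana Example", "postcode": "3056", "birth_year": 1975, "sex": "F"},
]

# Fields that identify nobody on their own but are shared by both data sets.
QUASI_IDENTIFIERS = ("postcode", "birth_year", "sex")


def link(records, identified, keys=QUASI_IDENTIFIERS):
    """Pair each nameless record with the names that match it on `keys`."""
    index = defaultdict(list)
    for person in identified:
        index[tuple(person[k] for k in keys)].append(person["name"])
    return [(rec, index.get(tuple(rec[k] for k in keys), [])) for rec in records]


def reidentified(records, identified, keys=QUASI_IDENTIFIERS):
    """Return {name: diagnosis} for records that match exactly one person."""
    return {
        names[0]: rec["diagnosis"]
        for rec, names in link(records, identified, keys)
        if len(names) == 1
    }


if __name__ == "__main__":
    for name, diagnosis in reidentified(deidentified, register).items():
        print(f"{name}: {diagnosis}")
```

In this toy case two of the three nameless records match exactly one person in the register, while the third matches two people and stays ambiguous. The point of the sketch is only that the nameless fragments were already identifiable before anyone ran the join, which is why the definition treats them as PII from the outset.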

For further practical guidance on classifying and treating PII, please see “Is it Personal Information or Not? Embrace the Uncertainty”, http://constellationr.com/content/it-personal-information-or-not-embrace-uncertainty.


Constellation Research Panel

Valuable comments on this research were received from the following people (all responsibility for the text remains with Constellation):

Professor Graham Greenleaf, Faculty of Law, University of New South Wales.

Associate Professor David Lindsay, Faculty of Law, Monash University.

Dr. Alana Maurushat, Senior Lecturer, Faculty of Law, University of New South Wales.

Rich Toohey, President of Rewards Member Experience, Marriott.

Disclosures

Your trust is important to us, and as such, we believe in being open and transparent about our financial relationships. With our clients’ permission, we publish their names on our website.


Analyst Bio: Steve Wilson

Steve Wilson is Vice President and Principal Analyst at Constellation Research, Inc. He focuses on digital identity, privacy and cyber security across the business research themes of Consumerization of IT and Next-Generation Customer Experience.

Experience

Steve has worked in ICT innovation, research, development and analysis for over 25 years. He holds double degrees in physics and electrical engineering. His career began in R&D and medical software engineering in Australia and the U.S. He moved into cyber security in 1995 and specialized in identity management, holding R&D leadership and Principal Consultant roles with Security Domain (later Baltimore Technologies), KPMG, PwC and SecureNet. In 2004, Steve founded Lockstep Consulting to concentrate on identity and privacy research and analysis. He is personally responsible for numerous breakthroughs in difficult areas of identity infrastructure and governance, including national scale authentication, PKI, smartcards, digital credentials, fraud control and privacy engineering. He has provided advice on identity frameworks to the governments of Australia, Hong Kong, New Zealand, Malaysia, Singapore, Kazakhstan and Macau.

Influence

Steve has been involved in security public policy and industry development for over 16 years. He was a member of the Australian Law Reform Commission’s Developing Technology committee (2007-08), the Federal Privacy Commissioner’s PKI Reference Group (2000) and the National E-Authentication Council (1998-2001). He contributed to the American Bar Association PKI Assessment Guidelines (1999-2002) and was co-author of the APEC Electronic Authentication guidelines (1998-2001). Steve chaired the Certification Forum of Australasia over 1999-2002 and the OASIS PKI Committee from 2007 to 2008.

He is a current member of the International Association of Privacy Professionals, the Australian Government Gatekeeper PKI Advisory Committee, and the Privacy Coordination Committee of the National Strategy for Trusted Identities in Cyberspace (NSTIC).

Patents

System and method for anonymously indexing electronic record systems US 8,347,101; AU 2005220988

Authenticating electronic financial transactions US 8,286,865; US 8,608,065; NZ 589160

Verified anonymous code signing AU 2012101460

Decoupling identity in the Internet of Things (pending).

Twitter: @steve_lockstep


About Constellation Research

Constellation Research is a research and advisory firm that helps organizations navigate the challenges of digital disruption through business model transformation and the judicious application of disruptive technologies. This renowned group of experienced analysts, led by R “Ray” Wang, focuses on business-themed research, including Digital Marketing Transformation; Future of Work; Next-Generation Customer Experience; Data to Decisions; Matrix Commerce; Technology Optimization and Innovation; and Consumerization of IT and the New C-Suite.

Unlike the legacy analyst firms, Constellation Research is disrupting how research is accessed, what topics are covered and how clients can partner with a research firm to achieve success. Over 225 clients have joined from an ecosystem of buyers, partners, solution providers, C-suite, boards of directors and vendor clients. Our mission is to identify, validate and share insights with our clients. Most of our clients share a common trait - the passion for learning, innovating and delivering impactful results.

Organizational Highlights

Founded and headquartered in the San Francisco Bay Area, United States, in 2010.

Named Institute of Industry Analyst Relations (IIAR) New Analyst Firm of the Year in 2011.

Serving over 225 buy-side and sell-side clients around the globe.

Experienced research team with an average of 21 years of practitioner, management and industry experience.

Creators of the Constellation Supernova Awards – the industry’s first and largest recognition of innovators, pioneers and teams who apply emerging and disruptive technology to drive business value.

Organizers of the Constellation Connected Enterprise – an innovation summit and best practices knowledge-sharing retreat for business leaders.

Founders of Constellation Academy, experiential workshops in applying disruptive technology to disruptive business models.

Website: www.ConstellationR.com

Twitter: @ConstellationRG

Contact: [email protected] Sales: [email protected]

Unauthorized reproduction or distribution in whole or in part in any form, including photocopying, faxing, image scanning, e-mailing, digitization, or making available for electronic downloading is prohibited without written permission from Constellation Research, Inc. Prior to photocopying, scanning, and digitizing items for internal or personal use, please contact Constellation Research, Inc. All trade names, trademarks, or registered trademarks are trade names, trademarks, or registered trademarks of their respective owners.

Information contained in this publication has been compiled from sources believed to be reliable, but the accuracy of this information is not guaranteed. Constellation Research, Inc. disclaims all warranties and conditions with regard to the content, express or implied, including warranties of merchantability and fitness for a particular purpose, nor assumes any legal liability for the accuracy, completeness, or usefulness of any information contained herein. Any reference to a commercial product, process, or service does not imply or constitute an endorsement of the same by Constellation Research, Inc.

This publication is designed to provide accurate and authoritative information in regard to the subject matter covered. It is sold or distributed with the understanding that Constellation Research, Inc. is not engaged in rendering legal, accounting, or other professional service. If legal advice or other expert assistance is required, the services of a competent professional person should be sought. Constellation Research, Inc. assumes no liability for how this information is used or applied nor makes any express warranties on outcomes. (Modified from the Declaration of Principles jointly adopted by the American Bar Association and a Committee of Publishers and Associations.)

San Francisco | Andalucia | Austin | Belfast | Boston | Chicago | Colorado Springs | Denver | London | Los Angeles | Monta Vista | New York | Pune | Sacramento | San Diego | Santa Monica | Sedona | Sydney | Tokyo | Toronto | Washington D.C.
