Privacy and Data Analytics

by Yves Le Roux, Principal Consultant, Technical Sales, CA Technologies

It has been widely reported that the global market for business analytics software grew roughly 14 percent in 2011, fueled by pervasive hype about “big data” as well as new technological innovations. The same reports say that between now and 2016, the business analytics market will have a compound annual growth rate of 9.8 percent, reaching $50.7 billion [1]. Companies and governments that apply powerful analytic capabilities to big data are able to find patterns and insight. At the same time, we see an increasing interest in privacy (for example, the proposed EU Data Protection Regulation and President Obama's Consumer Privacy Bill of Rights). This paper examines how to protect privacy rights without undermining the possibilities opened up by data analytics.

What Is Information Privacy?

Privacy is the ability of an individual or group to seclude themselves or information about themselves and thereby reveal themselves selectively. The boundaries and content of what is considered private differ among cultures and individuals, but share basic common themes.

Information Privacy concerns exist wherever personally identifiable information is collected and stored – in digital form or otherwise. Improper or non-existent disclosure control can be the root cause for privacy issues.

The growing importance of Information Communications Technologies (ICTs) and transborder data flows, and their implications for privacy, first attracted the interest of the Organisation for Economic Co-operation and Development (OECD) in 1969. In 1977, an Expert Group chaired by The Hon. Justice Michael Kirby of Australia was created to begin work on guidelines. As a result, on 23 Sep 1980, the OECD Guidelines on the Protection of Privacy and Transborder Flows of Personal Data [2] were adopted. (Figure 1 illustrates the personal data ecosystem.) The OECD Privacy Principles, which are part of these guidelines, provide the most commonly used international privacy framework, and serve as the basis for the creation of leading practice privacy programs and additional principles.

Part two of the OECD Guidelines on the Protection of Privacy and Transborder Flows of Personal Data describes principles that are technologically neutral and written in commonly understood language.

The principles are:

Collection Limitation

• Data should be obtained fairly, legally, and, when appropriate, with the consent or knowledge of the subject of the data.

Data Quality

• Personal data should be accurate, up-to-date, complete, and relevant to their purpose.

About the author

Yves Le Roux is a Principal Consultant in Technical Sales at CA Technologies.

After his graduation from Paris University in 1970, Yves Le Roux worked in the Rothschild Group where, among other tasks, he was in charge of network security and other security-related issues. In 1981, he joined the French Ministry of Industry, where he was in charge of the Open Systems Standardization programs.

In 1986, he took the position of European Information Security Manager at Digital Equipment. Then, he joined the security research and development team. In 1999, he went to Entrust Technologies, a PKI software vendor. In 2003, Yves joined CA Technologies as a Technology Strategist.

Yves has co-authored three books on security. He was a lecturer at Paris University and has spoken at many conferences. He was a member of the European Network and Information Security Agency (ENISA) Permanent Stakeholders’ Group (PSG). He is also a member of the Cloud Standards Customer Council Security Working Group, and the Cloud Security Alliance Privacy Level Agreement Working Group.


Purpose Specification

• The purposes for personal data collection should be made clear before, not after, the data is collected. Later use of the data should be compatible with the stated purposes. If the purposes change, the changes should be specified.

Use Limitation

• Without the consent of the subject, or legal authorization, personal data should not be disclosed or used for previously unspecified purposes.

Security Safeguards

• Reasonable security precautions should be taken against loss, unauthorized access, modification, or disclosure of personal data.

Openness

• There should be a general policy of openness about developments, practices, and policies with respect to personal data. Means should be readily available to establish the existence and nature of personal data, the purposes of their use, and the identity and contact information of the data controller [3].

Individual Participation

An individual should have rights, including:

• The right to find out from data controllers whether they have data relating to the individual.

• The right to obtain data relating to the individual in a timely and reasonable manner, at a reasonable charge, and in a form intelligible to the individual.

• If access is denied, reasons must be provided and the individual must be able to challenge the denial.

• If the individual successfully challenges the data relating to them, the individual may have the data deleted, corrected, completed, or amended.

Accountability Principle

• A data controller should be accountable for complying with these principles.
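The Purpose Specification and Use Limitation principles can be read as a simple check that a data controller might run before each use of personal data. The following is only a minimal sketch under stated assumptions: the `PersonalRecord` class, the `may_use` function, and the purpose labels are hypothetical names introduced for illustration, and "compatible with the stated purposes" is simplified here to an exact match on a purpose label.

```python
from dataclasses import dataclass, field


@dataclass
class PersonalRecord:
    """Hypothetical record: personal data plus the purposes attached to it."""
    subject: str
    value: str
    # Purposes made clear to the subject before collection.
    stated_purposes: set[str] = field(default_factory=set)
    # Purposes the subject agreed to after collection.
    consented_purposes: set[str] = field(default_factory=set)


def may_use(record: PersonalRecord, purpose: str, legally_authorized: bool = False) -> bool:
    """Allow a use only if it was specified at collection, later consented
    to by the subject, or authorized by law (simplified Use Limitation)."""
    return (
        purpose in record.stated_purposes
        or purpose in record.consented_purposes
        or legally_authorized
    )


record = PersonalRecord("alice", "alice@example.com", stated_purposes={"order_delivery"})
print(may_use(record, "order_delivery"))  # True: purpose stated at collection
print(may_use(record, "marketing"))       # False: previously unspecified purpose
record.consented_purposes.add("marketing")
print(may_use(record, "marketing"))       # True: subject has since consented
```

In practice, deciding whether a new use is compatible with the stated purposes is a legal judgment rather than a set lookup; the sketch only shows where such a check would sit.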


As highlighted in the Business Software Alliance Global Cloud Computing Scorecard [4], based on these principles, most advanced economy countries have comprehensive data protection laws in place and have established independent privacy commissioners.

What Is Data Analytics?

ISACA [5] defines data analytics as processes and activities designed to obtain and evaluate data to extract useful information. The results of data analytics may be used to identify areas of key risk, fraud, errors, or misuse; improve business efficiencies; verify process effectiveness; and influence business decisions [6].

For example, in March 2010, as authorities scrambled to find the source of a salmonella outbreak that sickened hundreds around the country, investigators from the U.S. Centers for Disease Control and Prevention successfully used the shopper cards that millions of Americans swipe every time they buy groceries. Investigators followed the trail of grocery purchases to a Rhode Island company that makes salami, then zeroed in on the pepper used to season the meat [7].

For business, the most promising part is predictive analytics, which exploits patterns found in historical and transactional data to identify risks and opportunities. Predictive analytics [8] is used in actuarial science, marketing, financial services, insurance, telecommunications, retail, travel, healthcare, pharmaceuticals and other fields.

In the 16 Feb 2012 issue of The New York Times Magazine, a fascinating article [9] entitled “How Companies Learn Your Secrets” discusses the powerful statistical techniques that some companies are using to analyze sales and other data to gain insights into their customers' behaviors and needs.

In Web 2.0 and cloud computing, as LinkedIn founder and chairman Reid Hoffman said at the Web 2.0 Expo 2011 in San Francisco [10], data will come in two forms: explicit and implicit. Explicit data is data that users willingly give through social networks, blog posts and tweets, while implicit data is data collected in the background, like geographic location information.

Some examples of data analytics applied to such data are described below.

• If a cloud provider reads the taglines of a user’s photographs and learns that a John Doe (who is not a user of the service) is in one of the photos skiing, the provider will know that John Doe has some skiing interest and may use it for marketing purposes.

• If a user places a draft document with a cloud provider and allows three colleagues to access the document, the system would have transactional information on the user and the user’s colleagues. The same information could also show relationships among the four individuals and their institutional affiliations.

• If two publicly owned companies considering a merger share documents with each other through a cloud provider, the transactional and relationship information might reveal confidential plans even if the provider does not read the actual documents.
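The second and third examples can be made concrete with a short sketch showing how sharing metadata alone, without reading any document, exposes relationships and likely affiliations. Everything here is an assumption made for illustration: the `share_log` structure, the e-mail addresses, and the use of the e-mail domain as a stand-in for institutional affiliation are hypothetical and do not describe any real provider's system.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical sharing log: (document id, owner, people granted access).
# Only transactional metadata is recorded; document contents are never read.
share_log = [
    ("draft-1", "user@uni-a.example",
     ["c1@uni-a.example", "c2@uni-b.example", "c3@uni-b.example"]),
    ("plan-7", "cfo@alpha.example", ["cfo@beta.example"]),
]


def infer_relationships(log):
    """Link every pair of people who can access the same document."""
    links = defaultdict(set)
    for doc_id, owner, collaborators in log:
        people = sorted({owner, *collaborators})
        for a, b in combinations(people, 2):
            links[(a, b)].add(doc_id)
    return dict(links)


def infer_affiliations(log):
    """Group people by e-mail domain, a crude proxy for affiliation."""
    groups = defaultdict(set)
    for _, owner, collaborators in log:
        for person in (owner, *collaborators):
            domain = person.split("@", 1)[1]
            groups[domain].add(person)
    return dict(groups)


relationships = infer_relationships(share_log)
affiliations = infer_affiliations(share_log)
# The alpha/beta link surfaces from metadata alone.
print(("cfo@alpha.example", "cfo@beta.example") in relationships)  # True
print(sorted(affiliations))
```

The link between the two companies appears without the provider opening a single file, which is the point the merger example makes.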

The phenomenon of “big data”, namely, the vast quantities of data that can be stored, linked, and analyzed, brings with it the possibility of finding information, trends, and insights that were not previously obvious or capable of being ascertained.


Data Analytics and Privacy

The ability to store data indefinitely and strides in analytics present enormous potential for using personal data for other purposes, possibly bringing significant economic or social benefits to both individuals and organizations. However, using personal data in ways that neither the organization nor the individual anticipated when the data was collected can also contribute to the perception that privacy is at risk.

The first problem is that many companies do not inform customers of their data collection, retention, and analysis policies, as the OECD Privacy Principles recommend. In Feb 2012, the Article 29 Working Party, which is made up of the data protection authorities of the European Union member states, asked its French member (the CNIL) to lead an investigation into the new privacy policy of an online search provider [11]. The investigation found that the new policy would allow the provider to display ads related to the user’s activity on some smart phones (personal number, calling party numbers, date and time of calls) and to the user’s location data. As a result, the CNIL and the EU data protection authorities are deeply concerned about the combination of data across services and have strong doubts about the lawfulness and fairness of such processing. On 27 Feb 2012, the CNIL sent a letter to the provider setting out these concerns [12] and, on 16 March, a questionnaire with 69 questions to be answered before 5 Apr 2012 [13].

The second problem, related to the OECD Use Limitation Principle, is that individuals want to know about new, unrelated uses and to be able to choose whether to consent to them. Obtaining consent presents risks, whether organizations rely on the permission obtained initially (directly or through procedures that include it, such as license approval or terms and conditions) or go back to individuals to obtain consent for new purposes. Many purposes for collecting personal data may be difficult to explain and equally difficult to understand. If the initial consent language is made overly broad to cover any potential use of personal data, individuals may not know or understand what could happen to their personal data, and any consent they provide is arguably less than informed. Consequently, their trust in the organization may be placed at risk. Returning to individuals to obtain consent may, in some instances, also put that trust at risk, depending on how often consent is requested and what the new uses are.

The third problem is related to the Individual Participation Principle and more particularly to the access and correction rights. The dynamic “information life cycle” that characterizes the collection, storage, use, and disclosure of personal data online makes it challenging to exercise the rights of access, correction, and erasure in a practical way. As an example, if an individual wanted to know how and why a particular advertisement was served to them while they were surfing the Internet, how would they go about finding out? Who would they ask? Given the increasing reliance on various transactional data for automated risk management and profiling, the need for accuracy and the ability to correct information is likely even greater now than in the past. But rapid dissemination, indexing, caching, and mirroring of data also pose problems for individuals seeking to correct personal data (or have it removed).

The fourth problem, for the companies holding the data, is related to the OECD Security Safeguards Principle. Collecting increasing amounts of personal data can create security challenges for organizations, as more personal data is potentially at risk of privacy breaches. How and where are they storing these vast databanks of sensitive information? What are they doing to protect this information from prying eyes?

Currently, data protection laws mainly place duties on the data controller, who (either alone or jointly or in common with other persons) determines the purposes for which and the manner in which any personal data are, or are to be, processed. Given the relatively static data transfers and comparatively simple business models and relationships in place when data protection principles were first written, the concept of data controller did not contemplate scenarios where many players could be considered data controllers. Increasingly complex business models and relationships, as well as new technologies, can make it challenging to determine who the data controller is and therefore who is responsible for protecting the personal data. Subcontracting, outsourcing, evolving partnerships between organizations in value chains, behavioral advertising, and other emerging business models can add layers of complexity in determining responsibilities and identifying roles. Often an entity can be a controller for one use of information and a co-controller, processor, or sub-processor for another.

A Potential Solution to the Issue Related to Privacy and Data Analytics

In 2009, the Centre for Information Policy Leadership started an innovative, 21st century approach to data protection, “Accountability-Based Privacy Governance” [14], which tries to clearly delineate what accountability, and the related concept of responsibility, means for organizations that collect, store, and process information.

This approach requires companies to implement programs that foster compliance with data protection principles and to be able to explain how those programs provide the required protections for individuals. Accountability obligates organizations to take responsibility for the safe and appropriate processing and storage of data, wherever it occurs. It requires them to implement effective data and privacy protection policies that correspond to accepted external criteria found in law, regulation and industry best practices. Accountability asks that organizations analyze and understand the risks that data use raises for individuals, and take necessary and appropriate steps to mitigate those risks. It further requires that organizations make judicious decisions about data use, even when traditional individual consent or choice may not be available.

The Accountability Project identified nine common fundamentals that an accountable organization should implement:

1. Policies

2. Executive oversight

3. Staffing and delegation

4. Education and awareness

5. Ongoing risk assessment and mitigation

6. Program risk assessment oversight and validation

7. Event management and complaint handling

8. Internal enforcement

9. Redress

Privacy programs must have the following characteristics:

1. Accountable organizations would adopt privacy policies consistent with commonly accepted external criteria: applicable law, regulation, and recognized external guidelines. Such policies would also reflect the organization’s values and the promises it has made to individuals.

2. Accountable organizations would implement mechanisms to put policies into effect and communicate those policies to individuals.

3. Accountable organizations would integrate privacy protections into governance and apply them across all aspects of the organization where they are relevant. The organizations would designate a person or persons at an appropriately senior level to be responsible for privacy and data protection initiatives throughout the organization.

4. Accountable organizations would put in place an internal oversight program. In addition, accountable organizations would oversee and raise awareness of third-party vendors and suppliers with whom they do business to ensure that they are meeting the obligations created by law, regulation and the organization’s privacy promises to its customers.

5. Organizations adhering to requirements of accountability would be prepared to demonstrate to a regulator their commitment to accountability and their capacity to provide necessary data and privacy protections. They would do so by providing evidence, when asked, that they have implemented each of the elements described above.
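One way an organization might prepare for such a request is to keep a register that maps each of the nine fundamentals to the evidence that backs it, and to check which ones still lack any. This is only an illustrative sketch: the `missing_evidence` function, the register layout, and the file names are hypothetical and are not part of the Accountability Project's material.

```python
# The nine fundamentals identified by the Accountability Project.
FUNDAMENTALS = (
    "Policies",
    "Executive oversight",
    "Staffing and delegation",
    "Education and awareness",
    "Ongoing risk assessment and mitigation",
    "Program risk assessment oversight and validation",
    "Event management and complaint handling",
    "Internal enforcement",
    "Redress",
)


def missing_evidence(evidence):
    """Return the fundamentals for which no evidence has been recorded."""
    return [name for name in FUNDAMENTALS if not evidence.get(name)]


# Hypothetical evidence register: fundamental -> supporting documents.
evidence = {
    "Policies": ["privacy-policy-v3.pdf"],
    "Executive oversight": ["board-minutes-2012-03.pdf"],
    "Redress": [],  # listed, but nothing to show yet
}

for gap in missing_evidence(evidence):
    print("No evidence on file for:", gap)
```

An entry with an empty list counts as missing, so the register cannot be satisfied simply by naming a fundamental without attaching anything to it.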

The practical success of an accountability approach will rely significantly on its broad implementation across the marketplace. Going forward, the “Accountability-Based Privacy Governance Project” will focus in a more detailed way on the infrastructure necessary for successful implementation of an accountability approach by organizations and by regulators.

This is a technical way to address the issue. But the issue is more than technical: any solution must also take due account of the three main groups of actors involved, namely individuals, government policy-makers, and organizations. Three related topics have to be studied:

1. Protection and Security

2. Rights and responsibilities for using data

3. Accountability and enforcement

In May 2012, the World Economic Forum issued a document entitled “Rethinking Personal Data: Strengthening Trust” [15]. This report recommends that all stakeholders take four main steps:

1. Engage in a structured, robust dialogue to restore trust in the personal data ecosystem.

2. Develop and agree on principles to encourage the trusted flow of personal data.

3. Develop new models of governance for collective action.

4. Establish “living labs”.

These discussions will have to focus on:

1. Protection and Security

2. Rights and responsibilities for using data

3. Accountability and enforcement

Given the World Economic Forum’s good relationships with business, political, academic, and other leaders of society, we may hope to see some progress in the near future.


References

[1] http://www.eweek.com/c/a/Enterprise-Applications/Business-Analytic-Market-to-Reach-507B-by-2016-on-Big-Data-Hype-IDC-179369/

[2] http://www.oecd.org/document/18/0,3746,en_2649_34255_1815186_1_1_1_1,00.html

[3] Data controller is defined as a party who, according to domestic law, is competent to decide about the contents and use of personal data regardless of whether or not such data are collected, stored, processed or disseminated by that party or by an agent on its behalf. (See: OECD Guidelines on the Protection of Privacy and Transborder Flows of Personal Data.)

[4] http://portal.bsa.org/cloudscorecard2012/assets/PDFs/BSA_GlobalCloudScorecard.pdf

[5] ISACA was formerly known as the Information Systems Audit and Control Association. Today, it goes by its acronym and has more than 100,000 constituents in 180 countries.

[6] http://www.isaca.org/Knowledge-Center/Research/ResearchDeliverables/Pages/Data-Analytics-A-Practical-Approach.aspx

[7] http://www.newsobserver.com/2010/03/11/381609/cdc-sleuths-use-shopper-card-data.html#storylink=cpy

[8] http://en.wikipedia.org/wiki/Predictive_analytics

[9] http://www.nytimes.com/2012/02/19/magazine/shopping-habits.html?_r=1&pagewanted=all

[10] http://mashable.com/2011/03/30/reid-hoffman-data/

[11] http://www.cnil.fr/english/news-and-events/news/article/googles-new-privacy-policy-raises-deep-concerns-about-data-protection-and-the-respect-of-the-euro/

[12] http://www.cnil.fr/fileadmin/documents/en/Courrier_Google_CE121115_27-02-2012-EN.pdf

[13] http://www.cnil.fr/fileadmin/documents/La_CNIL/actualite/questionnaire_to_Google-2012-03-16.pdf

[14] http://www.informationpolicycentre.com/accountability-based_privacy_governance/

[15] http://www3.weforum.org/docs/WEF_IT_RethinkingPersonalData_Report_2012.pdf

NOTICES

Copyright © 2012 CA. All rights reserved. All trademarks, trade names, service marks and logos referenced herein belong to their respective companies.

The information in this publication could include typographical errors or technical inaccuracies, and CA, Inc. (“CA”) and the authors assume no responsibility for its accuracy or completeness. The statements and opinions expressed in this publication are those of the authors and are not necessarily those of CA.

Certain information in this publication may outline CA’s general product direction. However, CA may make modifications to any CA product, software program, service, method or procedure described in this publication at any time without notice, and the development, release and timing of any features or functionality described in this publication remain at CA’s sole discretion. CA will support only the referenced products in accordance with (i) the documentation and specifications provided with the referenced product, and (ii) CA’s then-current maintenance and support policy for the referenced product. Notwithstanding anything in this publication to the contrary, this publication shall not: (i) constitute product documentation or specifications under any existing or future written license agreement or services agreement relating to any CA software product, or be subject to any warranty set forth in any such written agreement; (ii) serve to affect the rights and/or obligations of CA or its licensees under any existing or future written license agreement or services agreement relating to any CA software product; or (iii) serve to amend any product documentation or specifications for any CA software product.

Any reference in this publication to third-party products and websites is provided for convenience only and shall not serve as the authors’ or CA’s endorsement of such products or websites. Your use of such products, websites, any information regarding such products or any materials provided with such products or on such websites shall be at your own risk.

To the extent permitted by applicable law, the content of this publication is provided “AS IS” without warranty of any kind, including, without limitation, any implied warranties of merchantability, fitness for a particular purpose, or non-infringement. In no event will the authors or CA be liable for any loss or damage, direct or indirect, arising from or related to the use of this publication, including, without limitation, lost profits, lost investment, business interruption, goodwill or lost data, even if expressly advised in advance of the possibility of such damages. Neither the content of this publication nor any software product or service referenced herein serves as a substitute for your compliance with any laws (including but not limited to any act, statute, regulation, rule, directive, standard, policy, administrative order, executive order, and so on (collectively, “Laws”)) referenced herein or otherwise or any contract obligations with any third parties. You should consult with competent legal counsel regarding any such Laws or contract obligations.
