
(1)

Project acronym: NEBULA

Project name: A novel vocational training programme on cloud computing skills

Project code: 540226-LLP-1-2013-1-GR-LEONARDO-LMP

Document Information

Document ID name: Nebula_WP4_D4.3.1_Learning_Material_and_Content_2015_30_04

Document title: Nebula VET program learning material and content

Type: Slides

Date of Delivery: 30/04/2015

Work package: WP4

Activity: D.4.3.1

Dissemination level: Public

Document History

Versions | Date | Changes | Type of change | Delivered by

Version 1.0 | 15/04/2015 | Initial Document | - | UCBL and INSA of Lyon

Version 2.0 | 26/06/2015 | Edition | Modifications according to feedback provided by partners | UCBL and INSA of Lyon

Acknowledgement

The persons of UCBL in charge of producing the course are Parisa Ghodous, Catarina Ferreira Da Silva, Jean Patrick Gelas and Mahmoud Barhamgi. The persons from UCBL involved in preparing, translation and review are Hind Benfenatki, Gavin Kemp and Olivier Georgeon.

The persons of “INSA of Lyon” in charge of producing the course are Frédérique Biennier and Nabila Benharkat. The persons from INSA of Lyon involved in preparing, translation and review are Francis Ouedraogo and Youakim Badr.

Disclaimer

The information in this document is subject to change without notice. All rights reserved.

The course is the property of UCBL and INSA of Lyon. No copying or distributing, in any form or by any means, is allowed without the prior written agreement of the owner of the property rights. This publication reflects the views only of the author, and the Commission cannot be held responsible for any use, which

(2)

Module 3 objectives

The aim of this module is to provide the student with the capabilities to analyse the risks and legal implications associated with the migration process, assessing their influence on the data, processes, and applications

Note: for intellectual property reasons, the logotype of UCBL must remain in every use of this course content, as well as the note “copyright DUNOD” mentioned in some slides with figures.

(3)

Risk, security, and legal analysis for migration to cloud

(4)

In your opinion, how should you take care of privacy?

• Can you define personal data / private life?

• How do you define privacy?

• Do you know some techniques to provide privacy?

• Do you know how your personal data is protected in the Cloud?

• Do you know the legal framework you deal with regarding privacy?

(5)

In your opinion, how can you assess the risks associated with Cloud migration?

• In this part you will

– Learn basic principles to define private life and define privacy requirements

– Get information to identify how personal data can be protected

– Learn basic elements of legal frameworks

– Identify privacy challenges to take into account when migrating to the Cloud

(6)

PART 3 OVERVIEW

1. Privacy

2. Actors involved in the cloud computing model

3. Solutions

(7)

Cloud Computing

Cloud computing has proven to be a successful paradigm that largely simplifies the deployment of data storage and computation capabilities for enterprises. It provides interesting characteristics:

Flexible pay-per-use pricing model, users pay only for what they consume

No upfront cost for consumed (hardware/software) resources

Scalable (and seemingly unlimited) storage and computation resources

No need to manage the allocated resources

(8)

Cloud Computing, the ugly face ...

However, the previously stated benefits come at a high price:

Users and enterprises lose control over the systems that manage their data and applications

Users don't know where their data is stored and who can access it

Cloud providers (and their management staff, e.g., database and network administrators, software developers, etc.) may:

‒ View the user’s sensitive information

‒ Process the users’ data for various purposes, e.g., sending targeted advertisements, snooping on people, or selling the data to interested parties

(9)

Privacy concerns through an example

A Conference Management System (CMS):

A typical CMS can do the following tasks:

Distributes the papers to the program committee (PC) members, based on their preferences and conflicts of interest

Organizes the collection and the distribution of reviews and discussion

Ranks papers according to their scores

Sends out reminder emails, as well as notifications of acceptance or rejection

Produces reports such as lists of sub-reviewers, acceptance statistics and conference program

(10)

Privacy concerns through an example

A Cloud-based CMS has the following advantages (for the conference chair):

The conference chair does not need to install and host a Web server or install CMS software on the server. He/she only needs to create the conference account “in the cloud”

The whole business of managing the server (including backups and security) is done by someone else, who benefits from economies of scale

Accounts for authors and PC members already exist, and do not need to be managed on a per-conference basis

(11)

Privacy concerns through an example

Data is stored indefinitely, and reviewers are spared the need to keep copies of their own reviews

The system can help complete forms such as the PC member invitation form and the paper submission form by suggesting likely colleagues based on past collaboration history

For all of these reasons, cloud-based CMSs such as EasyChair and EDAS are an immense contribution to the academic community

(12)

Privacy concerns through an example

Data Privacy Concerns: Accidental or deliberate disclosure

The Cloud-based CMS administrators become custodians of a huge quantity of data about the submission and the reviewing behaviour of researchers, aggregated across multiple conferences

This data could be deliberately or accidentally disclosed, with unwelcome consequences:

Reviewer anonymity could be compromised, as well as the confidentiality of PC discussions

The acceptance success records could be identified for researchers, over a period of years

(13)

Privacy concerns through an example

Data Privacy Concerns: Accidental or deliberate disclosure

The aggregated reviewing profile (fair/unfair, thorough/scant, harsh/undiscerning, prompt/late, etc.) of researchers could be disclosed

The data could be abused by:

Hiring or promotions committees

Funding and award committees

Researchers choosing collaborators and associates

...

The mere existence of the data makes the system administrators vulnerable to bribery and coercion

(14)

Privacy concerns through an example

Data Privacy Concerns: Accidental or deliberate disclosure

The problem of data privacy existed before the emergence of Cloud Computing, but the Cloud “magnifies” it:

Before the cloud: data privacy breaches were about one conference

With the Cloud: data privacy breaches concern thousands of conferences over decades, presenting tremendous opportunities for abuse if the data gets into the wrong hands

(15)

Privacy concerns through an example

Data Privacy Concerns: beneficial data mining

The data could also be exploited for some beneficial purposes:

Fraud and unwanted behaviour detection and prevention

Researchers who systematically unfairly accept each other’s papers, or rivals who systematically reject each other’s papers

Reviewers who reject a paper and later submit to another conference a paper with similar ideas

Undesirable submission patterns and behaviours by individual researchers:

Parallel or serial submissions of the same paper

(16)

Privacy concerns through an example

Data Privacy Concerns: beneficial data mining

The data could also be exploited for some beneficial purposes:

The data could be used to understand the way conferences are administered

ACM and IEEE could use the data to construct quality metrics for the conferences

How much “new blood” is entering the community

How a conference changed over its different editions

The types of authors who submit to the conference

This raises important questions as to who is allowed to mine the data and for what purposes

(17)

Actors involved in the cloud computing model

Data owners: The legal or natural persons to whom the data belongs

Reviewers and authors in a CMS

Patients in an Electronic Medical Record

Privacy concerns:

The main concern of data owners is to protect their data and identities against all unauthorized access or uses

They may also have privacy preferences that must be respected, e.g., a patient may allow only his primary physician to access his medical information while he is under treatment, and may refuse the usage of his data for research purposes, etc.

(18)

Actors involved in the cloud computing model

Data users: The people who query the data for various reasons

Physicians who consult the medical data of their patients for treatments

Medical researchers who access the patients' data to study the side effects of a given medicine, etc.

Privacy concerns

The main concern of data users is to protect their queries and identities

Example: a researcher who is trying to discover the side effects of a given medicine may require his identity and queries (about his current research) to be protected, to keep his research secret from his peers

(19)

Actors involved in the cloud computing model

Service or cloud providers: they include all IT staff required to run and manage the cloud

Network administrators

Database administrators

Software developers

Management and technical staff

(20)

Current solutions for data privacy

Current solutions for data owners

Current solutions for data users

(21)

Current privacy solutions for data owners

There exist two categories of solutions

Encryption-based solutions → Protect the privacy against cloud providers

Privacy-aware access control solutions → Protect the privacy against data users

(22)

Encryption-based solutions

Encryption is a simple and promising solution to protect the confidentiality of data from the cloud provider

Idea:

• Data is encrypted before it is stored in the cloud

• Malicious cloud insiders cannot view the private sensitive data

Limitations:

• Encryption limits the cloud’s ability to process the users' queries on their behalf

(23)

Encryption-based solutions

Different techniques were proposed to allow the cloud to process queries on encrypted data, without decrypting them:

Data partitioning techniques

Order preserving encryption techniques

Searchable encryption techniques

(24)

Data partitioning techniques

Data elements are organized into groups, called Buckets

Each Bucket has boundaries and a Tag

All data elements inside a bucket are associated with the same tag

Example: the Bucket B1 includes the employees whose age is in [18, 30]

The client (i.e., data owner) should store the boundaries of all buckets

Query model:

The client should:

Determine the buckets that intersect with his query (based on the boundaries)

Retrieve the buckets from the cloud server

Decrypt the data of retrieved buckets, and remove the data elements that don't satisfy the query
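The query model above can be sketched as follows. This is a minimal illustration, not a definitive implementation: the bucket boundaries `B2` and `B3`, the employee records, and every function name are hypothetical, and `toy_encrypt` is a deliberately simple stand-in for a real cipher (it is NOT secure). The point is only to show that the server filters by tag while the client decrypts and removes false positives.

```python
import hashlib
import json
import secrets


def _keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    """Toy keystream from SHA-256 in counter mode (illustration only, NOT secure)."""
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + nonce + counter.to_bytes(8, "big")).digest()
        counter += 1
    return out[:length]


def toy_encrypt(key: bytes, record: dict) -> bytes:
    plaintext = json.dumps(record).encode()
    nonce = secrets.token_bytes(16)
    stream = _keystream(key, nonce, len(plaintext))
    return nonce + bytes(p ^ s for p, s in zip(plaintext, stream))


def toy_decrypt(key: bytes, blob: bytes) -> dict:
    nonce, body = blob[:16], blob[16:]
    stream = _keystream(key, nonce, len(body))
    return json.loads(bytes(c ^ s for c, s in zip(body, stream)))


# Client-side secret state: the key and the (inclusive) bucket boundaries.
KEY = secrets.token_bytes(32)
BUCKETS = {"B1": (18, 30), "B2": (31, 45), "B3": (46, 65)}


def tag_for(age: int) -> str:
    for tag, (low, high) in BUCKETS.items():
        if low <= age <= high:
            return tag
    raise ValueError(f"age {age} is outside every bucket")


def upload(employees):
    """Return what the cloud stores: (tag, ciphertext) pairs only."""
    return [(tag_for(e["age"]), toy_encrypt(KEY, e)) for e in employees]


def range_query(server_rows, low, high):
    # 1. Determine the buckets whose boundaries intersect the query range.
    tags = {t for t, (b_low, b_high) in BUCKETS.items() if b_low <= high and low <= b_high}
    # 2. Retrieve those buckets (the server can only filter by tag).
    retrieved = [blob for tag, blob in server_rows if tag in tags]
    # 3. Decrypt locally and remove the elements that do not satisfy the query.
    records = [toy_decrypt(KEY, blob) for blob in retrieved]
    return [r for r in records if low <= r["age"] <= high], len(retrieved)


employees = [
    {"name": "Ana", "age": 25},
    {"name": "Ben", "age": 29},
    {"name": "Cyd", "age": 33},
    {"name": "Dee", "age": 50},
]
server_rows = upload(employees)
matches, fetched = range_query(server_rows, 28, 35)
```

Here the query for ages 28 to 35 touches two buckets, so the client downloads and decrypts more records than finally match; that extra work is the client overhead discussed on the next slide.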

(25)

Data partitioning techniques

Limitations:

There is a trade-off between the ensured privacy protection and the performance:

Large volume buckets offer better protection, but involve increased computation overhead for the client

Small volume buckets offer poor protection, but less computation on the client side

The client overhead is not negligible

(26)

Order preserving encryption techniques

The encryption scheme preserves the order relation between the original values and their encrypted values

Example: if a > b, then e(a) > e(b)

The cloud can compute simple queries involving simple operations on encrypted data, e.g., MAX, MIN, COUNT, >, <, =

Access control model cannot be implemented on the server side

Limitations

Malicious cloud providers may progressively learn the mapping between real and encrypted values
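The order-preserving property can be illustrated with a toy mapping. This sketch is an assumption made for illustration, not a scheme from the slides or from any library: it maps each integer of a small domain to a running sum of key-dependent positive gaps, which is strictly increasing by construction. It is NOT secure; the names `build_ope_table`, `KEY`, and the age domain are hypothetical.

```python
import hashlib
import hmac
from itertools import accumulate


def build_ope_table(key: bytes, domain_size: int) -> list:
    """Toy order-preserving map for the integers 0..domain_size-1.

    Each value maps to a running sum of key-dependent gaps (each gap >= 1),
    so the table is strictly increasing. Illustration only, NOT secure.
    """
    gaps = []
    for value in range(domain_size):
        digest = hmac.new(key, str(value).encode(), hashlib.sha256).digest()
        gaps.append(1 + int.from_bytes(digest[:2], "big"))
    return list(accumulate(gaps))


KEY = b"demo key held by the data owner"  # hypothetical key
TABLE = build_ope_table(KEY, 121)          # ages 0..120


def encrypt(age: int) -> int:
    return TABLE[age]


def decrypt(ciphertext: int) -> int:
    return TABLE.index(ciphertext)


ages = [42, 19, 67, 30]
stored = [encrypt(a) for a in ages]        # what the cloud sees

# Server side: works on ciphertexts only. The client supplies encrypt(30)
# as the encrypted bound of the comparison.
server_max = max(stored)
server_count_over_30 = sum(1 for c in stored if c > encrypt(30))

# Client side: decrypt the answer.
oldest = decrypt(server_max)
```

Because the ciphertext order mirrors the plaintext order, a provider that observes enough encrypted values and queries can narrow down the mapping, which is the limitation stated above.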

(27)

Homomorphic encryption techniques

The homomorphic encryption scheme makes it possible to answer all types of queries on encrypted data

Based on the idea that all query operations can be implemented through two operations: addition and multiplication (i.e., all query operators can be translated using these two operations)

Limitations

Impractical: for example, the computation of a simple query may take years
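To make the two operations concrete, here is a minimal sketch with toy parameters. It swaps in two well-known partially homomorphic schemes rather than a fully homomorphic one: unpadded ("textbook") RSA, which supports multiplication of ciphertexts only, and Paillier, which supports addition only. Neither is the general scheme described above, the primes are far too small for real use, and all function names are hypothetical.

```python
import math
import secrets

# Toy primes: far too small for real use, chosen so the arithmetic is easy to follow.
P, Q = 61, 53
N = P * Q

# --- Multiplication on ciphertexts: unpadded ("textbook") RSA ---
E, D = 17, 2753  # E * D == 1 mod (P - 1) * (Q - 1)


def rsa_encrypt(m: int) -> int:
    return pow(m, E, N)


def rsa_decrypt(c: int) -> int:
    return pow(c, D, N)


def rsa_multiply(c1: int, c2: int) -> int:
    """Multiplying ciphertexts multiplies the hidden plaintexts (mod N)."""
    return (c1 * c2) % N


# --- Addition on ciphertexts: Paillier with generator g = N + 1 ---
N2 = N * N
LAM = (P - 1) * (Q - 1) // math.gcd(P - 1, Q - 1)  # lcm(P - 1, Q - 1)
MU = pow(LAM, -1, N)                               # modular inverse (Python 3.8+)


def paillier_encrypt(m: int) -> int:
    while True:
        r = secrets.randbelow(N - 1) + 1           # random r in [1, N - 1]
        if math.gcd(r, N) == 1:
            break
    return (pow(N + 1, m, N2) * pow(r, N, N2)) % N2


def paillier_decrypt(c: int) -> int:
    u = pow(c, LAM, N2)
    return ((u - 1) // N * MU) % N


def paillier_add(c1: int, c2: int) -> int:
    """Multiplying ciphertexts adds the hidden plaintexts (mod N)."""
    return (c1 * c2) % N2


product = rsa_decrypt(rsa_multiply(rsa_encrypt(6), rsa_encrypt(7)))
total = paillier_decrypt(paillier_add(paillier_encrypt(20), paillier_encrypt(22)))
```

In both cases the party doing the arithmetic never sees 6, 7, 20 or 22; it only combines ciphertexts. A fully homomorphic scheme has to support both operations on the same ciphertexts, which is where the cost mentioned in the limitation comes from.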

(28)

Privacy-aware access control solutions

The objective of these solutions is to protect the privacy of data against data users

Most of these solutions are based on RBAC (Role Based Access Control) models

→ Rules: <Recipient, Data Item, Purpose, Conditions>

Recipient: the entity requesting the data

Data item: the requested data

Purpose: the objective for which the data is requested

Conditions: the set of conditions that should be met

→ Examples:

A physician may access the Lab Tests of a patient in case of emergency

A physician may access the personal information of a patient if the latter agrees
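The rule format above can be sketched as a small default-deny evaluator. This is an illustrative assumption, not the model of any particular system: the field names, the context keys `emergency` and `patient_consent`, and the purpose value `"treatment"` (the slide's examples do not state a purpose) are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Context = Dict[str, object]


@dataclass(frozen=True)
class Rule:
    recipient: str                        # role of the entity requesting the data
    data_item: str                        # the requested data
    purpose: str                          # objective for which the data is requested
    condition: Callable[[Context], bool]  # must hold for access to be granted


# The two examples from the slide, with an assumed purpose of "treatment".
RULES: List[Rule] = [
    Rule("physician", "lab_tests", "treatment",
         lambda ctx: ctx.get("emergency") is True),
    Rule("physician", "personal_information", "treatment",
         lambda ctx: ctx.get("patient_consent") is True),
]


def is_allowed(recipient: str, data_item: str, purpose: str, ctx: Context) -> bool:
    """Grant access only if some rule matches and its condition holds (default deny)."""
    return any(
        rule.recipient == recipient
        and rule.data_item == data_item
        and rule.purpose == purpose
        and rule.condition(ctx)
        for rule in RULES
    )
```

For instance, `is_allowed("physician", "lab_tests", "treatment", {"emergency": True})` is granted, while the same request with the purpose `"research"` is denied because no rule covers it.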

(29)

Privacy-aware access control solutions

Pros and Cons:

These solutions provide fine-grained access control (at the

attribute level)

Offer different levels of accuracy for a data item

Extensible, simple and easy to implement and use

Not always doable when the data stored on the cloud is encrypted

Most of these solutions assume the cloud to be a trusted entity (that can verify the privacy-aware access rules)

(30)

Current privacy solutions for data users

There are different types of data that cloud users may consider privacy sensitive:

Query-related information

Example: A scientific researcher may not want to disclose his queries, in order to protect his ongoing inventions

Identity

Example: An HIV patient may want to keep his identity private when he asks questions about HIV symptoms

Contextual information

(31)

Query related solutions

PIR-based solutions:

Most of the current solutions are based on PIR (Private Information Retrieval) protocols

The idea behind PIR protocols is to execute a query on an untrusted server without letting the server learn anything about the executed queries or their results

PIR protocols are cryptographic (i.e., queries and their answers are ciphertexts)

Different protocols exist for one or several servers

Limitations:

PIR-based solutions are computationally very expensive, and thus impractical (a query may take years to be answered)
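The multi-server case can be illustrated with the classic two-server XOR construction. This sketch is a swapped-in textbook scheme, not necessarily one the slides have in mind: it relies on two replicas that do not collude rather than on ciphertexts, and the database contents and function names are hypothetical. Each server sees only a uniformly random bit vector, so neither learns which record was requested.

```python
import secrets
from functools import reduce

RECORD_SIZE = 8
# Both servers are assumed to hold an identical replica of this database.
DATABASE = [name.ljust(RECORD_SIZE).encode() for name in
            ["alpha", "bravo", "charlie", "delta", "echo", "foxtrot"]]


def xor_bytes(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def server_answer(database, selection):
    """Server side: XOR together the records whose selection bit is 1."""
    chosen = [rec for rec, bit in zip(database, selection) if bit]
    return reduce(xor_bytes, chosen, bytes(RECORD_SIZE))


def make_queries(index: int, size: int):
    """Client side: a random bit vector, and the same vector with one bit flipped."""
    q1 = [secrets.randbelow(2) for _ in range(size)]
    q2 = list(q1)
    q2[index] ^= 1
    return q1, q2


def retrieve(index: int) -> bytes:
    q1, q2 = make_queries(index, len(DATABASE))
    a1 = server_answer(DATABASE, q1)  # computed by server 1
    a2 = server_answer(DATABASE, q2)  # computed by server 2
    # Every record selected by both vectors cancels out; only record `index` remains.
    return xor_bytes(a1, a2)
```

Note that each server still has to touch, on average, half of the database for a single lookup, which hints at why PIR becomes expensive on large datasets.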

(32)

Query related solutions

Plan-based solutions:

These works are motivated by the observation that different plans for the same query reveal different information to the server about the user's intention (behind the query)

These techniques modify mature query optimization techniques to produce privacy-aware query plans that satisfy the users' constraints and preferences

Examples of users' constraints and preferences:

Enforcing specific value constraints (e.g., name = John Doe) on a specific trusted server

Using a specific copy of a relation from a specific server

Limitations:

This type of solution requires users to have knowledge about the servers involved in resolving their queries

(33)

Identity related solutions

Most of the solutions are based on Digital Identity Management (DIM) systems

DIMs allow users to authenticate to cloud service providers without revealing their identities

A typical DIM involves the following entities:

Cloud service providers (CSPs)

Identity providers (IdPs): assign identity attributes to users

Registrars: verify the identity attributes given by an IdP to users, then issue a certificate to the user

Users: a user can authenticate himself to CSPs using the certificate and gain access to authorized services

(34)

Identity related solutions

Using a DIM system, the user can choose and manage the identity attributes that he wishes to use

Examples of DIM systems: Microsoft's Identity Metasystem and CardSpace

Limitations:

Different Cloud services may require different DIMs, which poses interoperability issues between DIMs

(35)

General conclusions

Privacy protection is still a real challenge for the adoption of Cloud Computing in privacy-critical and sensitive domains, e.g., healthcare, finance, military, etc.

The existing solutions for privacy preservation are still unsatisfactory for both data owners and consumers, but research is in constant progress

The most effective solutions today rely on trust, auditing and traceability:

Trust: Cloud service providers should be selected based on how trusted they are

Trust is computed from past interactions with the cloud provider, aggregated across a sufficiently large number of users

(36)

General conclusions

Auditing: mechanisms are needed to monitor the different operations within a cloud to detect and prevent suspicious queries and data accesses

Traceability: mechanisms are needed to track down the origins of the different operations within a cloud (e.g., who accessed a given data item, and for what purposes, etc.)
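One way to picture auditing and traceability together is an append-only access log in which each entry commits to the previous one, so that later tampering is detectable. This is a minimal sketch under assumptions: the hash-chain design, the class name `AuditLog`, and the field names are illustrative choices, not a mechanism prescribed by the slides.

```python
import hashlib
import json
from datetime import datetime, timezone


class AuditLog:
    """Append-only access log in which every entry commits to the previous one."""

    def __init__(self):
        self.entries = []

    @staticmethod
    def _digest(body: dict) -> str:
        # Hash every field except the stored hash itself.
        payload = {k: v for k, v in body.items() if k != "hash"}
        return hashlib.sha256(json.dumps(payload, sort_keys=True).encode()).hexdigest()

    def record(self, actor: str, data_item: str, purpose: str) -> dict:
        previous_hash = self.entries[-1]["hash"] if self.entries else "0" * 64
        body = {
            "time": datetime.now(timezone.utc).isoformat(),
            "actor": actor,
            "data_item": data_item,
            "purpose": purpose,
            "previous_hash": previous_hash,
        }
        body["hash"] = self._digest(body)
        self.entries.append(body)
        return body

    def verify(self) -> bool:
        """Auditing: recompute the chain and report whether it is intact."""
        previous_hash = "0" * 64
        for entry in self.entries:
            if entry["previous_hash"] != previous_hash or entry["hash"] != self._digest(entry):
                return False
            previous_hash = entry["hash"]
        return True

    def who_accessed(self, data_item: str):
        """Traceability: who accessed a given data item, and for what purpose."""
        return [(e["actor"], e["purpose"]) for e in self.entries if e["data_item"] == data_item]
```

A hypothetical usage: after `log.record("dr_smith", "patient_42/lab_tests", "treatment")`, calling `log.who_accessed("patient_42/lab_tests")` returns the actor and purpose, and `log.verify()` turns false if any stored entry is edited afterwards.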

(37)

Legal constraints

• Risk management

– Due to their consequences...

– Technologies

– Confidentiality

– Tracking / monitoring

• Encryption

– Allowed keys and algorithms

– Communication

– Encrypted data storage

– Different legal constraints depending on the countries

• Encryption vs scrambled data

• Data privacy

– Personal data

(38)

Data privacy

• Personal data

– User related information

• Name, addresses, competencies, phone number, email…

• IP address, computer name, visited URLs, geolocation…

– Activity related information

• Log files, Access control system logs…

• « Physical » control (access badges, video…)

• Private life violation

– Personal data collection

• User must be informed

• Personal (and private) data / files left on computers

(39)

Big brother is watching you… Private life protection (1/2)

• Different legal contexts

– Market-based regulation: US

• Federal Trade Commission

• Improper (unfair) sites won’t be visited

• Hyper-protection for minors

– Legal specification: EU

• Worldwide protection for Europeans

(40)

Big brother is watching you… Private life protection (2/2)

• Major stakes

– Worldwide exchanges over the Internet

– Legal framework?

– Common principles

• Fair and unfair practices

(41)

Cloud impact on Privacy (1)

• Rise of actors

– Personal data are totally distributed

– Responsibilities management

– Difficult to get a consistent view

• Multiple policies

– Depending on actors

– Shared infrastructure / services

– New protection needs

• Charter analysis

• Threats related to the provider

– Difficulties due to the extra-territoriality of the cloud

(42)

Cloud impact on Privacy (2)

• Your personal data means money for service providers…

– To use a service

• “Pay by providing some data”

– Data quality

– Trust level associated with the provider

• Personal data are necessary to achieve the operation supported by the service

– Provider charter

– Risk related to linked processes / linked data

– To make a service profitable

• Economy of personal data

– Anonymised or not

– Pricing for addresses, mail, email, phone numbers…

– Integrated in the service economic / pricing model

(43)

Legal constraints...

• Integration of constraints related to data protection act

– Tracking risk

– Privacy at work

• Access control

• Proxy configuration / deconfiguration

• Emails can be confidential

• Legal precedent

– Hyperprotection for people on the EU side

– Hyperprotection for minors in the US

– Risk for companies only if the privacy violation is involved in / is used to justify a penalty

– Usage charter

(44)

Private life, privacy and web sites (1)

• Traces / data left on the Internet

– Name and address of the computer

– Computer’s parameters

– Cookies

– Visited pages

• Information collection

– Identity

– Address (mail and email)

– Phone numbers

– (Advertising) quiz

– Consideration of

• Goods

(45)

Private life, privacy and web sites (2)

• Practices that may be « unclear »…

– Sell “Clients / prospects files”

• May be forbidden in EU depending on the way the file has been made / on what the file contains

• Problems due to data exportation

– Safe Harbour

– Advertisement on data collection

• E.g. Microsoft

– Linked processes on the collected data

• Traces and workflow recognition

• Customer profile identification

(46)

Fair and unfair (1)

• Fair practices

– Transparency

• Personal data collection

• Processes involving personal data

– Absolute need

• Cyber-control used in conjunction with other security tools

– Equity

• Goals associated to personal data processing

– Proportionality

(47)

Fair and unfair (2)

• Unfair practices

– Data collection…

– Shelf life of the collected data

– Linked processes

– Sell personal data

• Consequences

– Users must be informed

– Legal notice

– Secured storage and secured processes involving personal data

– Adapted analysis / mining processes on personal data

• Anonymisation
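As a rough illustration of what preparing personal data for analysis can look like, the sketch below replaces a direct identifier with a keyed hash and generalises an exact age into a range. Strictly speaking this is pseudonymisation plus generalisation, which is weaker than full anonymisation because whoever holds the key can re-link records. The record fields, `SECRET_KEY`, and the function names are hypothetical.

```python
import hashlib
import hmac

# Hypothetical key, assumed to be kept by the data controller and never given to analysts.
SECRET_KEY = b"kept by the data controller"


def pseudonym(identifier: str) -> str:
    """Replace a direct identifier by a keyed hash (stable, but not reversible without the key)."""
    return hmac.new(SECRET_KEY, identifier.encode(), hashlib.sha256).hexdigest()[:16]


def age_band(age: int, width: int = 10) -> str:
    """Generalise an exact age into a range such as '30-39'."""
    low = (age // width) * width
    return f"{low}-{low + width - 1}"


def prepare_for_mining(record: dict) -> dict:
    """Keep only what the analysis needs; drop name and email entirely."""
    return {
        "user": pseudonym(record["email"]),
        "age_band": age_band(record["age"]),
        "pages_visited": record["pages_visited"],
    }


raw = {"name": "Jane Roe", "email": "jane@example.org", "age": 34, "pages_visited": 12}
prepared = prepare_for_mining(raw)
```

The same email always yields the same pseudonym, so visits can still be grouped per user, while the name and exact age never reach the mining process.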

(48)

Data privacy at work

• Private life at work

– Private files

– Mail confidentiality protection also applies to emails tagged as “private”

– User authentication

• Login/password

• Physical key protection while implementing a PKI

• Biometrics-based authentication

• Activity related control

– Reporting files

• Survivable systems

• Usage of resources, productivity measures...

– Activity reporting

• Re-building Workflow process

(49)

Case study

• Based on the migration use case description, define the requirements that the provider will have to fulfil and write a usage charter for users

– Identify the personal data involved

– Identify the way they will be processed / stored…

– Define what a fair practice could be

– Define what an unfair practice could be

– Identify the risks related to the cloud platform (based on previous exercises)

– Identify requirements for the provider and write the user charter
