
A Client-Based Privacy Manager for Cloud Computing

Miranda Mowbray

HP Labs

Long Down Avenue, Stoke Gifford Bristol, BS34 8QZ. UK

+44-117-3128178

Miranda.Mowbray@hp.com

Siani Pearson

HP Labs

Long Down Avenue, Stoke Gifford Bristol, BS34 8QZ. UK

+44-117-3128438

Siani.Pearson@hp.com

ABSTRACT

A significant barrier to the adoption of cloud services is that users fear data leakage and loss of privacy if their sensitive data is processed in the cloud. In this paper, we describe a client-based privacy manager that helps reduce this risk, and that provides additional privacy-related benefits. We assess its usage within a variety of cloud computing scenarios. We have built a proof-of-concept demo that shows how privacy may be protected by reducing the amount of sensitive information sent to the cloud.

Categories and Subject Descriptors

C.2.4 [Computer-Communication Networks]: Distributed systems – distributed applications.

General Terms

Management, Security

Keywords

Cloud computing, Privacy

1. INTRODUCTION

The central idea of cloud computing services is that these services are carried out on behalf of customers on hardware that the customers do not own or operate. The customer sends input data to the cloud, this data is processed by an application provided by the cloud service provider, and the result is sent back to the customer. Cloud computing is particularly attractive to businesses in times of financial recession and credit squeezes, because using cloud services enables them to replace capital expenditure on hardware and software sized for their worst-case computing requirements with operating expenditure that reflects the amount of computing they actually use. Some cloud computing services are aimed at individual consumers rather than businesses; they offer easy availability over the Web of a service which might be difficult or costly for the individual to buy as software. However, current cloud services pose an inherent challenge to data privacy, because they typically result in data being present in unencrypted form on a machine owned and operated by a different organization from the data owner. There are threats of

unauthorized uses of the data by service providers and of theft of data from machines in the cloud. Fears of leakage of sensitive data or loss of privacy are a significant barrier to the adoption of cloud services [8]. These fears may be justified: in 2007, criminals targeted the prominent cloud service provider Salesforce.com, and succeeded in stealing customer emails and addresses using a phishing attack [7].

Moreover, there are laws placing geographical and other restrictions on the processing by third parties of personal and sensitive information. These laws place limits on the use of cloud services as currently designed. For example a UK business storing data about individual customers with some cloud computing services could find itself in breach of UK law, if the services’ standard subscription agreements do not give any assurances that the computers that the data is stored on are adequately secure [18].

In this paper we describe a client-based privacy manager. Most of the features of the privacy manager require a corresponding service-side component for effective operation, and hence require some cooperation from the service provider. The reason for having a client-side component for these features, rather than leaving them to be implemented entirely on the server side, is that this architecture provides a user-centric trust model that helps users to control their sensitive information, assuming that the service provider cooperates with them. These features can assist the user in clearly communicating his privacy-related preferences to the service provider, and can also assist the service provider in compliance with privacy laws and regulations. There is however one feature of the privacy manager, obfuscation, which in some circumstances can be used by users to protect the privacy of their data even if there is no cooperation from the service provider - indeed, even if the service provider is malicious.

2. PROBLEM SCENARIOS

2.1 Sales Force Automation

One very popular set of cloud services for businesses is Salesforce.com’s Sales Force Automation suite [17]. For these services, the business uploads its sales data to databases on Salesforce.com’s computers. Once it is there, salespeople and managers in the business can use Salesforce.com’s software over the web to analyse their sales data and answer queries such as who the top 10 purchasers of a particular product are, or how the last week’s sales break down by region. Since storage and analysis of a large database is computationally intensive, it makes sense for the business to use cloud services for this as opposed to purchasing computing hardware and software to do it in-house. Detailed sales data is generally commercially sensitive – businesses are not willing to share it with their competitors – and in many cases will also contain individual information about the customers who have made purchases, such as their email addresses and product preferences. The security threat that we consider in this scenario is the theft of sales data from the service provider’s system, followed by possible resale to business competitors or identity thieves.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. COMSWARE’09, June 16–19, 2009, Dublin, Ireland.

2.2 Customized End-User Services

Information about end-user context may be gathered automatically, and user data in the cloud assessed, in order to provide targeted end-user services. For example, in a non-enterprise scenario, a user could be notified which of his friends are near his current location. The assessed data might include: name, location, availability (for example, derived from calendars), recommendations, likes and dislikes, names of service providers used, phone contacts, details of phone calls including target and duration, lists and contact details of relatives, friends, work colleagues, etc.

The main threats in this type of scenario involve:

• Personal information about a user being collected, used, stored and/or propagated in a way that would not be in accordance with the wishes of this user.

• People getting inappropriate or unauthorized access to personal data in the cloud by taking advantage of certain vulnerabilities, such as lack of access control enforcement, data being exposed ‘in clear’, policies being changeable by unauthorized entities, or uncontrolled and/or unprotected copies of data being spread within the cloud.

• Legal non-compliance. In particular, restrictions on transborder data flow may apply, and also some of the data may be of types subject to additional regulations.

2.3 Share Portfolio Calculation

This is a more specific example than the two above. The application is the calculation of the current value of a user’s share portfolio. The application receives data from the user specifying the number of shares in different companies in a portfolio. Whenever the user wishes to know the current value of the portfolio, he sends a query to the application, which looks up the current value of the relevant shares, calculates the total value of the portfolio, and returns this value to the user.

The threat in this scenario is a leak of information about the user’s share ownership from the service provider’s system, followed by possible misuse. As this is financial data, the user may be particularly keen to keep it private, and there may also be additional regulations limiting its communication and use.

2.4 Requirements

The following requirements arise from privacy legislation and from consideration of the scenarios above:

R1. minimization of personal and sensitive data used and stored within the cloud infrastructure

R2. security protection of data used and stored within the cloud infrastructure: safeguards must prevent unauthorized access, disclosure, copying, use or modification of personal information

R3. purpose limitation: data usage within the cloud has to be limited to the purpose for which it was collected and should only be divulged to those parties authorized to receive it.

R4. user centric design: the user should be given choice about whether or not his information is collected to be used within the cloud, his consent should be solicited over the gathering and usage of such information, and he should be given control over the collection, usage and disclosure of personal and sensitive information

R5. user feedback: notice should be provided to the user about what information will be collected, how it will be used, how long it will be stored in the cloud, etc., and there should be transparency about how personal information that is collected is going to be used within the cloud.

Privacy legislation may also impose other requirements, such as conformance to rules on data retention and disposal, and data access (in the sense of users being able to get access to personal information stored about them – in this case, in the cloud – to see what is being held about them and to check its accuracy). A further aspect is that it is necessary to respect cross-border transfer obligations, but this is particularly difficult to ensure within cloud computing, so it is likely that legislation will need to evolve to allow compliance in dynamic, global environments: the notion of accountability is likely to provide a way forward.

Privacy laws differ between country blocks, and also between national legislations. The basic principles given in [13] apply to most countries, and many national privacy laws are based on them. There is however a difference in view: in the EU privacy is a basic right, whereas in the Asia-Pacific region privacy legislation is more centered on avoiding harm. Depending on jurisdiction there may be additional restrictions on the processing of certain sensitive types of data, such as health or financial data.

3. OUR SOLUTION

In this section we present the overall architecture of our solution, provide more detail about the functionality provided by a central component of this solution, and then consider how this solution may address certain issues raised in the previous section.

3.1 Overall Architecture

The overall architecture of our solution is illustrated in Figure 1. Privacy Manager software on the client helps the user to protect his privacy when accessing cloud services. A central feature of the Privacy Manager is that it can provide an obfuscation and de-obfuscation service, to reduce the amount of sensitive information held within the cloud. In addition, the Privacy Manager assists the user to express privacy preferences about the treatment of his personal information, use multiple personae, review and correct information stored in the cloud, etc. Further detail about these features is given below.


[Figure 1: Overview of our solution. The client runs Privacy Manager software with Obfuscation, Preferences, Personae, Feedback and Data Access modules; the user’s data is sent in obfuscated form over the Internet to the cloud application.]

3.2 Privacy Manager

In this section we describe the features of the Privacy Manager in more detail.

3.2.1 Obfuscation

The first feature of the Privacy Manager provides obfuscation and de-obfuscation of data. This feature can automatically obfuscate some or all of the fields in a data structure before it is sent off to the cloud for processing, and translate the output from the cloud back into de-obfuscated form. The obfuscation and de-obfuscation is done using a key which is chosen by the user and not revealed to cloud service providers. This means that applications in the cloud cannot de-obfuscate the data. Moreover, an attacker who uses the same application will not be able to de-obfuscate the user’s data by observing the results when he obfuscates his own data, since his obfuscation key will not be the same as the user’s key. Since this obfuscation is controlled by the user, it should be more attractive to privacy-sensitive users than techniques for data minimization that they do not control.

In general, the more information that is obfuscated within a data structure, the smaller the set of applications which can run using the obfuscated data structure as input, and the slower the obfuscation process. In some cases, it is not an option to obfuscate all the personal and sensitive data in the data structure. Data items that are not obfuscated may be used by cloud services for personalization of user content and targeting of advertising. The other features of the Privacy Manager allow users some control over the handling of these data items by the cloud services.

3.2.2 Preference setting

A second feature of the Privacy Manager is a method for allowing users to set their preferences about the handling of personal data that is stored in an unobfuscated form within the cloud. A similar approach has been taken within P3P [21] and PRIME [19]. The resultant policies can then be associated with data sent to the cloud, and preferably cryptographically bound to it (by encrypting both the policy and data under a key shared by the sender and receiver). For stickiness of the privacy policy to the data, public key enveloping techniques can be used. Alternatively, it is possible to use policy-based encryption of credential blobs (a form of Identifier-Based Encryption (IBE) technology) [2]: the policies could be used directly as IBE encryption keys to encrypt the transferred material [3].

Part of this specification could involve the purpose for which the personal data might be used within the cloud, and this could be checked within the cloud before access control were granted, using mechanisms specified via [4]. Note that, unlike the obfuscation feature, this feature is only useful if there is a corresponding policy enforcement mechanism within the cloud.
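The idea of binding a policy to data can be sketched in a few lines. The sketch below is illustrative only: it uses an HMAC under the shared key to make the data-policy pairing tamper-evident, rather than the public-key enveloping or IBE techniques mentioned above, and the field names are hypothetical.

```python
import hmac, hashlib, json

def bind_policy(data: dict, policy: dict, key: bytes) -> dict:
    """Bind a privacy policy to data so that neither can be altered
    independently without detection. (Toy sketch: a real deployment
    would also encrypt the package, e.g. with enveloping or IBE.)"""
    package = json.dumps({"data": data, "policy": policy}, sort_keys=True)
    tag = hmac.new(key, package.encode(), hashlib.sha256).hexdigest()
    return {"package": package, "tag": tag}

def verify_binding(bound: dict, key: bytes) -> bool:
    expected = hmac.new(key, bound["package"].encode(),
                        hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, bound["tag"])

key = b"shared-sender-receiver-key"          # key shared with the receiver
bound = bind_policy({"email": "joe@example.com"},
                    {"purpose": "order-fulfilment",
                     "no_third_party": True}, key)
assert verify_binding(bound, key)

# Swapping in a weaker policy invalidates the binding.
tampered = dict(bound,
                package=bound["package"].replace("order-fulfilment",
                                                 "marketing"))
assert not verify_binding(tampered, key)
```

The point of the sketch is only that the policy travels with the data and cannot be silently replaced; the enforcement mechanism on the service side is a separate matter.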

3.2.3 Data access

The Privacy Manager contains a module that allows users to access personal information in the cloud, in order to see what is being held about them, and to check its accuracy. This is essentially an auditing mechanism which will detect privacy violations once they have happened, rather than a mechanism to prevent violations from happening in the first place. Nevertheless the basic principles of data access and accuracy [13] are considered to be part of privacy in many national privacy laws. So under these laws, the service providers need to be able to make this information accessible to the user. This module enables, organises and logs this access on the client machine. Providing data access when data is spread over a very large number of machines is a highly challenging problem, although it may be a legal requirement: solving this problem is outside the scope of this paper. If the data is spread over only a few machines, it should be relatively straightforward for the service provider to enable data access.

3.2.4 Feedback

The Feedback module manages and displays feedback to the user regarding usage of his personal information, including notification of data usage in the cloud. This module could monitor personal data that is transferred from the platform – for example location information, usage tracking, behavioural analysis, etc. (while the Preferences feature would allow the user to control such collection). It could also have an explanatory role, including education about privacy issues and providing informed choice to the user, beyond expression of preferences.

3.2.5 Personae

This feature allows the user to choose between multiple personae when interacting with cloud services. For example, in some contexts a user might not want to reveal any personal information and just act in an anonymous manner, whereas in other contexts he might wish for partial or full disclosure of identity. The user’s choice of persona may drive the strength of obfuscation that is used. For example, there may be certain data items within a data set which the obfuscation mechanism will obfuscate if the data is associated with one persona, but not if it is associated with other personae of the same user.
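A minimal sketch of how a persona might drive the choice of which fields to obfuscate; the personae and field names here are invented for illustration:

```python
# Hypothetical persona definitions: the fields each persona obfuscates
# before data leaves the client.
PERSONAE = {
    "anonymous":  {"name", "email", "location", "phone"},
    "friends":    {"phone"},
    "colleagues": {"location", "phone"},
}

def fields_to_obfuscate(persona: str, record: dict) -> set:
    """Return the fields of this record that the Privacy Manager
    should obfuscate, given the user's active persona."""
    return PERSONAE[persona] & record.keys()

record = {"name": "Joe", "email": "joe@example.com", "location": "Bristol"}
assert fields_to_obfuscate("anonymous", record) == {"name", "email", "location"}
assert fields_to_obfuscate("friends", record) == set()
```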


3.3 How Our Solution Addresses the Problem Scenarios

We now consider how this solution may be used to address the issues raised within the scenarios presented in Section 2.

3.3.1 Sales Force Automation

Suppose that the sales data sent to the cloud for a Sales Force Automation service has entries consisting of a customer, product, status (purchase, failure etc), price and time. The Privacy Manager obfuscation module translates the customer, product and status into pseudonyms, multiplies the price by a factor, and moves the time forward by a time interval. The obfuscation software will generate new pseudonym maps and price factors for each new user. (The pseudonym maps may be implemented by association tables, or by a deterministic symmetric encryption function; in the latter case different maps correspond to different keys.)
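A toy version of this obfuscation scheme is sketched below. It stands in a keyed HMAC for the deterministic pseudonym map, with a client-side reverse table playing the role of the association table; the field names and parameters are illustrative, not taken from the actual system.

```python
import hmac, hashlib
from datetime import datetime, timedelta

class Obfuscator:
    """Client-side obfuscation sketch: keyed pseudonyms for categorical
    fields, a secret multiplier for prices, a secret shift for times."""
    def __init__(self, key: bytes, price_factor: float, time_shift: timedelta):
        self.key, self.factor, self.shift = key, price_factor, time_shift
        self._reverse = {}   # pseudonym -> original; kept only on the client

    def pseudonym(self, value: str) -> str:
        p = hmac.new(self.key, value.encode(), hashlib.sha256).hexdigest()[:10]
        self._reverse[p] = value
        return p

    def obfuscate(self, entry: dict) -> dict:
        """Obfuscate one sales entry before it is sent to the cloud."""
        return {
            "customer": self.pseudonym(entry["customer"]),
            "product":  self.pseudonym(entry["product"]),
            "status":   self.pseudonym(entry["status"]),
            "price":    entry["price"] * self.factor,
            "time":     entry["time"] + self.shift,
        }

    def deobfuscate_revenue(self, value: float) -> float:
        return value / self.factor

    def deobfuscate_name(self, pseudonym: str) -> str:
        return self._reverse[pseudonym]

ob = Obfuscator(b"user-secret", price_factor=3.7,
                time_shift=timedelta(days=11))
row = ob.obfuscate({"customer": "joe@example.com", "product": "CoolWidget",
                    "status": "purchase", "price": 100.0,
                    "time": datetime(2009, 6, 16)})
assert row["customer"] != "joe@example.com"       # nothing in clear
assert ob.deobfuscate_name(row["customer"]) == "joe@example.com"
assert abs(ob.deobfuscate_revenue(row["price"]) - 100.0) < 1e-9
```

Because the pseudonyms are deterministic for a given key, equality-based queries (grouping, top-10 rankings) still work on the obfuscated data, while an observer without the key learns neither names nor true prices.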

Typical queries such as the names and total sales revenue of the ten best-selling products, and the email address of the customer who spent most on these, can then be run on obfuscated data in the cloud. In this case the obfuscation software translates back the answer from the cloud by mapping back the product and customer pseudonyms, and dividing the revenue figure by the secret factor. The process is illustrated in Figure 2. An enterprise sales rep wants to find the email address of the customer who has spent most on the CoolWidget product. His client runs Privacy Manager software, whose integrity is protected by a Trusted Platform Module. The obfuscation feature of the Privacy Manager obfuscates his query, and sends the result to a cloud-based application for sales force automation, running on the service provider’s hardware. The application consults the obfuscated sales database for the enterprise and sends back an answer. The answer is in obfuscated form: the software de-obfuscates it to reveal the required email address. The answer might be sent with an advertisement targeted by using information from the enterprise account and the services that the enterprise user previously used.

[Figure 2: Using a cloud service with obfuscation to find the address of the customer who has spent most on CoolWidgets. At the enterprise boundary, the Privacy Manager obfuscates the query “CoolWidget fan’s email?” to “42ilu fan’s email?”; the cloud application consults the obfuscated data and answers with the pseudonym “mjm75k”, which the Privacy Manager de-obfuscates to “joe@example.com”.]

As mentioned in Section 3.2.1, not all applications can operate on input data that has been obfuscated in a non-trivial way, but many useful applications can. The marketing literature for Salesforce.com’s Sales Force Automation suite lists 87 features. We have determined that 80 of these can theoretically be implemented using input data that has been obfuscated in the manner described above. The remaining seven features either use the ability to send mass emailing directly from Salesforce.com – and so require Salesforce.com to have access to unobfuscated customer email lists – or allow the calculation of arbitrary mathematical functions on data elements.

We describe this feature as “obfuscation” rather than “encryption” because the obfuscated data still retains some information about the original data. It may be possible for some types of information about the sales to be obtained by analysis of the obfuscated data. For example, with the obfuscation method just described, by guessing that the most common status will correspond to “purchase” it may be possible to deduce from the obfuscated data what the ratio is of the total purchase values of the most popular and second most popular products. For additional security, more complex obfuscation methods can be chosen; for example the pseudonym corresponding to the status could depend on the customer as well as the actual status value, and fake data entries can be added whose effect on the answer from the cloud will be removed by the obfuscation of queries and de-obfuscation of answers. Nevertheless, even the simple obfuscation method described above ensures that customer email addresses or product names and prices cannot be stolen directly from the service provider’s system, as they are never present in the clear in this system.

Database records that have been obfuscated using different keys cannot be compared directly, so the obfuscation feature has to take key management into account. A way of addressing this issue is for the privacy manager to retain a record of which keys were used during which date ranges, to query database records from a given date range using the appropriate obfuscation key, and to combine the de-obfuscated answers for each relevant date range. For a sales database application most queries are likely to involve only one or at most a small number of date ranges. Provided that keys are not changed very frequently, the amount of state that will need to be kept will be small. Backup copies of this state can be stored so that it is still possible to de-obfuscate past sales data if this state is accidentally deleted.
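The date-range key management just described might look roughly as follows; the key values and date ranges are invented for illustration:

```python
from datetime import date

# Hypothetical key history kept (and backed up) by the Privacy Manager:
# each obfuscation key covers a contiguous date range.
KEY_HISTORY = [
    (date(2008, 1, 1), date(2008, 12, 31), b"key-2008"),
    (date(2009, 1, 1), date(2009, 12, 31), b"key-2009"),
]

def keys_for_query(start: date, end: date):
    """Split a query's date range into sub-ranges, one per obfuscation
    key, so that each sub-query can be obfuscated with the right key
    and the de-obfuscated answers combined on the client."""
    out = []
    for lo, hi, key in KEY_HISTORY:
        if lo <= end and start <= hi:          # ranges overlap
            out.append((max(start, lo), min(end, hi), key))
    return out

# A query spanning the key change yields two sub-queries.
subqueries = keys_for_query(date(2008, 11, 1), date(2009, 2, 1))
assert [k for _, _, k in subqueries] == [b"key-2008", b"key-2009"]
```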

Useful applications in areas other than sales force automation – such as orchestrating marketing campaigns and assessing their effectiveness – can be obfuscated in a very similar way.

3.3.2 Customized End-User Services

In this scenario, the user sets his preferences as to the treatment of personal data using the Preference setting feature of the privacy manager. For instance, for the service telling him which of his friends are near, he might state a preference for his friends’ contact details not to be used for direct marketing by third parties, while accepting that his own identity and location will be used to target advertisements sent to him with the service. He may use the Persona feature as a simple and intuitive way of selecting one particular set of preferences for the use of data in a given context. For example the user may have one preset persona for communications with friends and another for communications with colleagues, which specify different sets of preferences.


The Privacy Manager can use this preference information to determine the appropriate degree of obfuscation to be carried out on the data. This helps balance privacy protection against the user’s desire for customized services.

The user’s preferences are sent by the Privacy Manager on the client to a service-side component which governs enforcement of the policies. The service-side component ensures that these preferences remain attached to any personal information stored, used and shared in the cloud, and follow that data if it were transferred or propagated, preventing it being used in any way that is not compatible with that policy and thereby ensuring that the user has control over the usage of his data.

In some cases it may be that the service cannot be provided according to the user’s stated preferences. In that case, a service-side component communicates with the Feedback module of the Privacy Manager, which notifies the user and finds out what action he wishes to take.

Once the user has released data into the cloud, there are two ways in which he may learn of the ways that his data is being used. One is that the service-side Feedback component may contact the Feedback module of the Privacy Manager and notify him of data use, without him having to actively request this. The other is that he uses the Data Access module of the Privacy Manager to request access to his data (for example, to check the accuracy of data stored about him). The Data Access module communicates with yet another service-side component that is responsible for ensuring compliance with legal requirements of data access.

3.3.3 Share Portfolio Calculation

For this scenario it is possible to use obfuscation to protect information about the user’s share ownership from being misused. The client does not communicate the user’s portfolio directly to the application. Instead, it constructs two different portfolios such that the true portfolio is some linear combination of these. (The coefficients of the linear equation relating the portfolios act as the user’s obfuscation/deobfuscation key, and are not revealed to the service provider.) The client sends the two portfolios to the application separately, as the obfuscated input data. When the user wishes to know the current value of his portfolio, the client sends a request for the current value of each of the two portfolios in the obfuscated data. It then combines the two answers from the cloud using the linear equation to obtain the current value of the user’s portfolio.
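The portfolio-splitting scheme can be sketched as follows. The coefficients act as the user’s secret key; in this toy version the decoy portfolios may contain fractional or negative share counts, which the application need not distinguish from ordinary portfolios.

```python
import random

def split_portfolio(portfolio: dict, coeff_a=0.6, coeff_b=0.4):
    """Split the true portfolio into two decoys p1, p2 with
    true = coeff_a*p1 + coeff_b*p2. The coefficients stay on the
    client and act as the obfuscation/deobfuscation key."""
    p1 = {s: n + random.uniform(1, 100) for s, n in portfolio.items()}
    # Solve true = a*p1 + b*p2 for p2, share by share:
    p2 = {s: (portfolio[s] - coeff_a * p1[s]) / coeff_b for s in portfolio}
    return p1, p2

def combine_values(v1: float, v2: float, coeff_a=0.6, coeff_b=0.4) -> float:
    """Portfolio value is linear in the share counts, so the two
    values returned by the cloud combine with the same coefficients."""
    return coeff_a * v1 + coeff_b * v2

prices = {"ACME": 12.0, "XYZ": 30.0}            # looked up in the cloud
true_portfolio = {"ACME": 50, "XYZ": 20}
p1, p2 = split_portfolio(true_portfolio)
v1 = sum(prices[s] * n for s, n in p1.items())  # computed in the cloud
v2 = sum(prices[s] * n for s, n in p2.items())  # computed in the cloud
true_value = sum(prices[s] * n for s, n in true_portfolio.items())
assert abs(combine_values(v1, v2) - true_value) < 1e-6
```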

The unobfuscated data describing the user’s true portfolio is never present in the service provider’s system (or anywhere else in the cloud), so it cannot leak from this system, even if the service provider is malicious.

Notice that for this scenario our solution does not require the service provider to make any changes to the application, or to provide any additional services (such as the service-side parts of data access and feedback). Exactly the same application can be used for obfuscated and unobfuscated input data. Indeed, the service provider may be unaware that a pair of portfolios is the obfuscated portfolio of a single customer rather than the unobfuscated portfolios of two different customers.

3.3.4 Assessment of our Approach

In this section we trace the requirements given in Subsection 2.4 to the architecture proposed above. The following solutions at least partially address these requirements R1-R5:

• Data minimization is provided via the obfuscation feature (addressing R1)

• We assume that access control, etc. will be deployed on the services side in order to protect any data stored within the cloud (addressing R2)

• Purpose limitation (R3) is addressed by the preference setting feature and its service-side component.

• Our architecture has a user-centric design (R4). In particular, the preference-setting feature allows the user greater control over the usage of his data, and the personae feature makes this more intuitive.

• Feedback (R5) is provided via the feedback and data access features.

3.4 Discussion: When Our Solution is Not Suitable

Our solution is not suitable for all cloud applications.

Theoretically, a user with data x and a service provider with data y could use Yao’s protocol for secure two-party computation [22] to enable the user to learn f(x,y) without the service provider learning x or the user learning y, where f is any polynomial-time functionality. So theoretically any polynomial-time application could be calculated in a fully obfuscated fashion, if the service provider were willing to implement the application using Yao’s protocol. However, implementing Yao’s protocol on a large data set x may in general require the user to have a rather large amount of storage and computation power. (The obfuscation methods described in this paper require much less computation and storage by the user than Yao’s protocol would need to compute the same results for a large data set.) For users with limited computing resources there is thus a tradeoff between the extent to which data is obfuscated and the set of applications that can effectively be used, even when the service provider gives full cooperation. Nevertheless, if the service provider cooperates then the other features of our solution can still be used.

The picture is different if the service provider does not provide full cooperation. Some cloud service providers that base their business models on the sale of user data to advertisers (or other third parties) may not be willing to allow the user to use their applications in a way that preserves his privacy. Other providers may be willing to respect users’ privacy wishes, but not to implement the service-side code that is necessary for some of the privacy manager’s features. Yet other service providers may claim to cooperate, but not be trustworthy. In these cases, the features of our solution other than obfuscation will not be effective, since they require the honest cooperation of the service provider. There is still a possibility that in these cases a user may be able to use obfuscation to protect the privacy of his data. However, the ability to use obfuscation without any cooperation from the service provider depends not only on the user having sufficient computing resources to carry out the obfuscation and de-obfuscation, but also on the application having been implemented in such a way that it will work with obfuscation. For example, a service that is customized with a map showing the area around a US user’s zip code might theoretically be implemented in a way that would allow a user to obtain the correct customized result without revealing his zip code to the service provider. But a common method of implementing this type of service is to pass the input zip code directly to a map server, and mash up the map with the result from the rest of the service. With such an implementation it is difficult for the user to obtain the correct result without revealing the correct zip code to the application. As a more general example, for some applications it may be difficult to discover the set of input values that are treated as valid by the application. Without some knowledge of the set of valid inputs, it is not possible to design an obfuscation function such that the obfuscated input data is still valid input.

Despite this, based on our analysis of SalesForce’s service offerings, we believe that many existing cloud services can be used in an obfuscated fashion without any cooperation from the service provider.

4. OTHER APPROACHES AND RELATED WORK

Some companies obfuscate data by hand, in an ad-hoc fashion, before sending the obfuscated data to the cloud for processing. A large pharmaceutical company has complained that this is a major bottleneck for expanded use of cloud computing [12].

One approach to the problem focuses on security of sensitive or personal data once it is in the cloud, for example ensuring separation of different customers’ data, encrypting data in transit but allowing applications to decrypt it, and checking virtual machine security. This approach is necessary to protect sensitive data items that cannot be obfuscated, but it does not address some of the legal issues. Moreover ensuring security within a large complex cloud system is a hard technical problem. Where sensitive data items can be obfuscated, it is safer for the customer to obfuscate them, so that they are never present in the cloud in the clear, and the customer does not have to rely on the service provider’s security controls.

Some storage-as-a-service providers, such as JungleDisk, Amazon S3 and Mozy, encrypt data files with a key stored only on the user’s machine. Storage-as-a-service with no personalization can use data files encrypted in such a way that no-one but the user can decrypt them (in particular, cloud applications cannot decrypt them). However, cloud services which process or use some items of the data cannot use such encrypted files as input. Some such cloud services could use as input databases that had been obfuscated using Voltage’s Format-Preserving Encryption [20]. This encrypts specific data fields while retaining the format of data records and preserving referential integrity. Similarly, TC3 Health Inc.’s HIPAA-compliant software pseudonymizes sensitive items before processing data using cloud computing [1]. However, it appears that cloud services which calculate the sum of several data entries cannot use data encrypted with these methods as input. Hence these methods are not sufficient to deal with, for example, the database queries described in Section 3.4.1.

Related obfuscation techniques have been used within other domains. Within intrusion detection, for example, two research prototypes encrypt the parts of the log that relate to personal information. Firstly, the IDA (Intrusion Detection and Avoidance) prototype [19] pseudonymises the subject fields within audit records by encryption. Secondly, the AID (Adaptive Intrusion Detection) system [6] performs pseudonymisation by encryption under a shared secret key, which is changed from time to time; the use of public key encryption was also examined. Special pseudonyms may be defined for groups, where the identity of a single member can only be revealed by the cooperation of a certain number of group members. One example would be where the key for decryption could be split into two halves, which are given to the security administrator and the data protection officer.
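The split-key escrow idea in the last sentence can be sketched with XOR secret sharing, in which neither half alone reveals anything about the key. This is an illustrative sketch, not code from the AID prototype:

```python
import secrets

def split_key(key: bytes) -> tuple[bytes, bytes]:
    """Split a key into two XOR shares; each share alone is
    indistinguishable from random bytes."""
    share_a = secrets.token_bytes(len(key))
    share_b = bytes(a ^ k for a, k in zip(share_a, key))
    return share_a, share_b

def recombine(share_a: bytes, share_b: bytes) -> bytes:
    """XOR the two shares back together to recover the key."""
    return bytes(a ^ b for a, b in zip(share_a, share_b))

# e.g. one share for the security administrator, one for the
# data protection officer; only together can they decrypt.
key = secrets.token_bytes(16)
admin_share, dpo_share = split_key(key)
assert recombine(admin_share, dpo_share) == key
```

More elaborate threshold schemes (e.g. Shamir secret sharing) generalise this to "any k of n group members", but the two-share XOR case suffices for the example given in the text.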

Furthermore, Lindell and Pinkas [10] introduced the idea of privacy-preserving data mining, in which two parties owning confidential data cooperate to efficiently compute the result of a function that depends on both sets of data, in such a way that the only information either party learns is what can be deduced from the result of the function. This work builds on Yao’s protocol [22], and there is a body of research on this problem (see [11] for a bibliography). A consumer and provider of a cloud service who agree to use one of the protocols for privacy-preserving data mining might be able to ensure that no more information is transferred from the customer to the provider than the minimum necessary for the service. However, these protocols assume that both parties have sufficient computing power to run the protocol, which may require the storage and processing of a large amount of data. The common business scenario for cloud computing is that the consumer of the service has only limited computing power available in-house, and almost all the computing power necessary for the service is provided by the service provider.
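As a much simpler illustration of the flavour of such protocols, consider a toy additive secret-sharing "secure sum" (this is not Yao's protocol, and it assumes honest-but-curious parties): each party splits its private value into random shares that only reveal the value when all are combined, yet the shares still allow the total to be computed.

```python
import secrets

M = 2**61 - 1  # working modulus; must exceed any possible total

def share(value: int, n: int) -> list[int]:
    """Split a value into n additive shares mod M.  Any n-1 shares
    are uniformly random and reveal nothing about the value."""
    shares = [secrets.randbelow(M) for _ in range(n - 1)]
    shares.append((value - sum(shares)) % M)
    return shares

def secure_sum(values: list[int]) -> int:
    """Each party shares its value with every other party; summing
    all shares recovers the total without exposing any single input."""
    n = len(values)
    all_shares = [share(v, n) for v in values]
    # party j adds up the one share it received from each party
    partial = [sum(all_shares[i][j] for i in range(n)) % M for j in range(n)]
    return sum(partial) % M

assert secure_sum([3, 5, 7]) == 15
```

Even this toy version shows the asymmetry problem noted above: every participant must generate, exchange and store shares, so the client cannot simply offload all the work to the cloud.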

Proxy systems, such as the now defunct anonymizer.com, re-package Web surfing requests to disguise their origin. However, they do not alter data entered on the Web page. A proxy system could be used in conjunction with data obfuscation for users who wish to keep their identity, as well as their data, confidential. Some products perform deep content inspection on network traffic and detect or filter based on policies and linguistic analysis [16]. However, they are designed to block communications that contain sensitive data or to encrypt at the file level; they do not turn an output containing obfuscated data back into the original.
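By contrast, a client-based privacy manager can reverse its own obfuscation. The sketch below is a deliberately simple example invented for illustration (a multiplicative scheme; the paper's actual obfuscation methods are those of Section 3.4.1 and are not reproduced here). Numeric fields are scaled by a secret constant that never leaves the client, so sums computed in the cloud remain meaningful after de-obfuscation:

```python
import secrets

class Obfuscator:
    """Toy client-side obfuscator: multiplies numeric fields by a
    secret constant before upload, and divides results coming back.
    Linear queries (sums, averages) on obfuscated data stay valid."""

    def __init__(self):
        # secret scaling factor; kept on the client, never sent to the cloud
        self.k = secrets.randbelow(10**6) + 2

    def obfuscate(self, values):
        return [v * self.k for v in values]

    def deobfuscate(self, result):
        return result / self.k

ob = Obfuscator()
cloud_input = ob.obfuscate([120, 340, 560])  # cloud never sees real figures
cloud_result = sum(cloud_input)              # cloud computes on obfuscated data
assert ob.deobfuscate(cloud_result) == 1020  # client recovers the true total
```

The point of the round trip is exactly what DLP-style products lack: the output containing obfuscated data is turned back into the original on the client side.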

The Privacy Manager features described in Sections 3.2.2-3.2.5 build upon similar approaches used in client-server and peer-to-peer (P2P) systems [6, 9]. In particular:

• The preference setting feature is similar to privacy management tools that enable inspection of service-side policies about the handling of personal data (for example, software that allows browsers to automatically detect the privacy policy of websites and compare it to the preferences expressed by the user, highlighting any clashes [21]).

• The feedback feature can use a range of HCI techniques for improving notice [14], and could also play a role in pseudonymous audit [19].

• The data access feature is similar to secure online access mechanisms that enable individuals to check and update the accuracy of their personal data [17].


• The personae feature could offer an anonymous persona, by using network anonymity techniques and providing pseudonymisation tools that allow individuals to withhold their true identity from the cloud, and only reveal it when absolutely necessary [6, 9, 15]. Existing technologies include anonymous web browsers, pseudonymous email and pseudonymous payment. The mechanisms may be designed for complete anonymity, or else pseudonymity (i.e. anonymity that is reversible if needed, for example in case of fraud).
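The preference setting feature in the first bullet can be sketched as a simple comparison of user preferences against a site's declared policy, flagging any clashes. The category names and handling levels below are invented for illustration and are not actual P3P vocabulary:

```python
# Hypothetical data-handling levels, ordered from most to least restrictive.
LEVELS = {"none": 0, "aggregate-only": 1, "third-party": 2}

def find_clashes(user_prefs: dict, site_policy: dict) -> list:
    """Return the data categories where the site's declared handling
    is more permissive than the user's stated preference.
    A category missing from the policy is treated as most permissive."""
    return [cat for cat, allowed in user_prefs.items()
            if LEVELS[site_policy.get(cat, "third-party")] > LEVELS[allowed]]

prefs  = {"email": "none", "purchases": "aggregate-only"}
policy = {"email": "none", "purchases": "third-party"}
assert find_clashes(prefs, policy) == ["purchases"]
```

A real implementation would parse a machine-readable policy (as P3P-aware browsers did) rather than a hand-built dictionary, but the clash-detection logic has this shape.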

5. CURRENT STATUS

This is work in progress. We have implemented a proof-of-concept demo of the obfuscation feature of the privacy manager in the first scenario. It implements the more complex obfuscation methods described in Section 3.4.1. Figure 3 is part of a screenshot from this demo. This demo shows that obfuscation works for an application which performs some processing on the input data.

As a next step we are investigating other ways of enhancing privacy in cloud computing, in particular to ensure the provision of relevant notice, choice, legitimacy and purpose limitation. These include use of privacy infomediaries and enforceable ‘sticky’ electronic privacy policies. These may be combined with, or used independently of, the solution described above. Notably, the client software above could be extended to manage personal privacy controls that are enforced within the cloud. Specifically, we plan to investigate how consent and revocation of consent can be provided within cloud computing environments, as part of research carried out within EnCoRe (Ensuring Consent and Revocation) – a UK project examining solutions in the area of consent and revocation with respect to personal information [5].

6. ACKNOWLEDGEMENTS

Thanks to Rob Whitmore for technical assistance, and to the anonymous referees for their useful comments on an earlier draft of this paper.

Figure 3: User interface for Privacy Manager sales database

7. REFERENCES

[1] Amazon Web Services LLC. 2009. Case Studies: TC3 Health. Web page. http://aws.amazon.com/solutions/case-studies/tc3-health/

[2] Boneh, D. and Franklin, M. 2001. Identity-based Encryption from the Weil Pairing. In Advances in Cryptology – CRYPTO 2001, G. Goos, J. Hartmanis and J. van Leeuwen, Eds. Springer LNCS series 2139. Springer, Berlin / Heidelberg, 213-229. DOI= http://dx.doi.org/10.1007/3-540-44647-8_13

[3] Casassa Mont, M., Pearson, S. and Bramhall, P. 2003. Towards Accountable Management of Identity and Privacy: Sticky Policies and Enforceable Tracing Services. In Proceedings of the IEEE Workshop on Data and Expert Systems Applications (Prague, Czech Republic, September 1 – 5, 2003). DEXA’03. IEEE Computer Society, Washington DC, USA, 377-382. DOI= http://dx.doi.org/10.1109/DEXA.2003.1232051

[4] Casassa Mont, M. and Thyne, R. 2006. A Systemic Approach to Automate Privacy Policy Enforcement in Enterprises. In Proceedings of the 6th Workshop on Privacy Enhancing Technologies (Cambridge, UK, June 28 – 30, 2006). PET’06. Springer LNCS series 4258, Springer Berlin / Heidelberg, 118-134. DOI= http://dx.doi.org/10.1007/11957454_7

[5] EnCoRe. EnCoRe: Ensuring Consent and Revocation. Project web site. http://www.encore-project.info

[6] Fischer-Hübner, S. 2001. IT-Security and Privacy: Design and Use of Privacy-Enhancing Security Mechanisms. Springer LNCS series 1958, Springer Berlin / Heidelberg. DOI= http://dx.doi.org/10.1007/3-540-45150-1

[7] Greenberg, A. 2008. Cloud Computing’s Stormy Side. Forbes Magazine (19 Feb 2008).

[8] Horrigan, J.B. 2008. Use of cloud computing applications and services. Pew Internet & American Life project memo (Sept 2008).

[9] Information Commissioner’s Office, UK, 2007. Privacy enhancing technologies (PETs). Data protection guidance note (29 March 2007).

[10] Lindell, Y. and Pinkas, B. 2002. Privacy Preserving Data Mining. J. Cryptology 15 (3), 151-222. DOI= http://dx.doi.org/10.1007/s00145-001-0019-2

[11] Liu, K. 2006. Privacy Preserving Data Mining Bibliography. Web site. http://www.cs.umbc.edu/~kunliu1/research/privacy_review.html

[12] Mather, T. 2008. More Cloud Computing. RSA Conference 365 blog (26 Sept 2008). https://365.rsaconference.com/blogs/tim_mather/2008/09/26/more-cloud-computing

[13] Organization for Economic Co-operation and Development (OECD). 1980. Guidelines Governing the Protection of Privacy and Transborder Flow of Personal Data. OECD, Geneva.

[14] Patrick, A. and Kenny, S. 2003. From Privacy Legislation to Interface Design: Implementing Information Privacy in Human-Computer Interactions. In Privacy Enhancing Technologies, R. Dingledine, Ed. PET 2003, Springer LNCS series 2760, Springer-Verlag Berlin, 107-124.

[15] PRIME, Privacy and Identity Management for Europe. 2008. Project web page. https://www.prime-project.eu/

[16] RSA Security. 2008. Data Loss Prevention (DLP) Suite. Web page. http://www.rsa.com/node.aspx?id=3426

[17] Salesforce.com, Inc. 2000-2009. Sales Force Automation. Web page. http://www.salesforce.com/products/sales-force-automation/

[18] Salmon, J. 2008. Clouded in uncertainty – the legal pitfalls of cloud computing. Computing magazine (24 Sept 2008). http://www.computing.co.uk/computing/features/2226701/clouded-uncertainty-4229153

[19] Sobirey, M., Fischer-Hübner, S. and Rannenberg, K. 1997. Pseudonymous Audit for Privacy Enhanced Intrusion Detection. Elsevier Computers and Security 16 (3), 207. DOI= http://dx.doi.org/10.1016/S0167-4048(97)84519-1

[20] Voltage Security. 2009. Format-Preserving Encryption. Web page. http://www.voltage.com/technology/Technology_FormatPreservingEncryption.htm

[21] World Wide Web Consortium (W3C). Platform for Privacy Preferences (P3P) Project web site. http://www.w3.org/P3P

[22] Yao, A. C. 1986. How to Generate and Exchange Secrets. In Proceedings of the 27th Symposium on Foundations of Computer Science (FOCS). IEEE, 162-167.
