Data Security in the Cloud

Published: 20th March 2011

Windows Azure Security Overview

Module Manual

The information contained in this document represents the current view of Microsoft Corporation on the issues discussed as of the date of publication. Because Microsoft must respond to changing market conditions, it should not be interpreted to be a commitment on the part of Microsoft, and Microsoft cannot guarantee the accuracy of any information presented after the date of publication.

This White Paper is for informational purposes only. MICROSOFT MAKES NO WARRANTIES, EXPRESS, IMPLIED OR STATUTORY, AS TO THE INFORMATION IN THIS DOCUMENT.

Complying with all applicable copyright laws is the responsibility of the user. Without limiting the rights under copyright, no part of this document may be reproduced, stored in or introduced into a retrieval system, or transmitted in any form or by any means (electronic, mechanical, photocopying, recording, or otherwise), or for any purpose, without the express written permission of Microsoft Corporation.

Microsoft may have patents, patent applications, trademarks, copyrights, or other intellectual property rights covering subject matter in this document. Except as expressly provided in any written license agreement from Microsoft, the furnishing of this document does not give you any license to these patents, trademarks, copyrights, or other intellectual property.

Microsoft, Active Directory, Hyper-V, SQL Azure, Visual Basic, Visual C++, Visual C#, Visual Studio, Windows, Windows Azure, Windows Live and Windows Server are either registered trademarks or trademarks of Microsoft Corporation in the United States and/or other countries.

The names of actual companies and products mentioned herein may be the trademarks of their respective owners.

Overview

Data security is one of the most important factors that decision makers must take into consideration. Data is an extremely valuable asset, and most customers feel insecure about the cloud environment because they do not know exactly where their data is located or how it is protected. This is a serious barrier for customers considering the cloud; customers must be sure that their data is safe and private before entrusting it to a shared storage infrastructure.

Under the traditional information technology (IT) model, an organization is accountable for all aspects of its data protection regime, from how it uses sensitive personal information to how it stores and protects such data stored on its own computers. Cloud computing changes the paradigm because information is moved offsite to data centers owned and managed by cloud providers.

Responsibility for the physical hosting and protection of data is taken away from the customers, yet – even though the data physically resides in a cloud provider’s data centers – cloud customers still own their data and remain ultimately responsible for controlling its use and protecting the legal rights of individuals whose information they have gathered.

Microsoft understands its responsibilities concerning customer data. The challenge of storing and protecting such data is not new to the company; Microsoft’s online services such as Hotmail and Live Services have hosted customer data since the launch of the MSN® network in 1995. Today Microsoft is a leader in addressing the privacy and security issues associated with hosting customer data in the cloud.

This document describes the privacy, policies, infrastructure, and security mechanisms designed to protect customer data in Windows Azure.

Customer Concerns

Concerns over information security and privacy top the list of issues that customers evaluate before storing their data in the cloud. When speaking to customers, the following issues will likely be raised:

• Is it still my data?

• Who gets to see my data?

• Do I know if someone looked at my data?

• How can I control access to my data?

• Where, physically, is my data?

• What laws and regulations apply?

• Is the integrity of my data assured?

• Is my data erased effectively when I delete it?

• Where does my responsibility to protect my data end, and where does the cloud service provider’s responsibility begin?

These concerns relate overwhelmingly to two subjects: privacy and security. The subjects deal with different issues, yet they overlap in several areas.

Security issues such as access control and data integrity are handled using security infrastructures and technologies, such as secure communication, cryptography and identity management. Privacy issues such as data ownership and transparency are handled through the enforcement of privacy policies and regulations such as ISO/IEC 27001:2005, and management standards such as SAS 70 Type II.

For both security and privacy, international standards and regulations allow providers and customers to monitor the level of compliance with and implementation of the relevant policies. For instance, ISO/IEC 27001 is an Information Security Management System (ISMS) standard published by the International Organization for Standardization (ISO) and the International Electrotechnical Commission (IEC). ISO/IEC 27001 formally outlines a management system that brings information security under explicit management control. SAS 70 Type II is an internationally recognized auditing standard that provides guidance to service auditors when assessing the internal controls of a service organization and issuing a service auditor’s report.

Windows Azure uses a number of mechanisms to secure customer data and relies on Microsoft’s Global Foundation Services (GFS) group to formulate data center security and privacy policies. These policies and procedures are enforced by the Online Services Security and Compliance (OSSC) Information Security Management System (ISMS).

Microsoft Global Foundation Services (GFS) runs Microsoft's data centers for Microsoft Online services such as Hotmail and Dynamics CRM. The OSSC team within GFS is responsible for compliance with ISMS regulations and standards. For more information about the services GFS provides, visit:

http://www.globalfoundationservices.com

For more information about the Microsoft Online Privacy Statement and how it relates to customer concerns mentioned here, visit:

http://privacy.microsoft.com/en-us/default.mspx

Azure Storage Architecture

Traditional on-premises applications are designed to run on a single server. Data is stored in memory or persisted to the file system. In the cloud, this architecture no longer applies. The cloud is made up of a grid of compute nodes load-balanced by the cloud fabric; consequently, information must be saved in a shared location that all compute nodes are able to access.

The solution is an independent storage mechanism that provides storage as a service to both cloud applications and on-premises applications. Storage in Windows Azure is deployed on separate hardware from the compute nodes. To handle massive amounts of data, storage must be scalable and reliable, so a multi-layer architecture is used. The top layer validates, authenticates, and authorizes requests, then routes them to the partition layer and data layer that actually store the bits. Security validation is done as early as possible to prevent malicious resource consumption.
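The idea of validating requests before they reach the storing layers can be illustrated with a minimal sketch. The class names (`FrontEnd`, `PartitionLayer`, `Request`) and the simple signature comparison are hypothetical stand-ins chosen for illustration; they are not the actual Windows Azure Storage implementation.

```python
import hmac
from dataclasses import dataclass


@dataclass
class Request:
    account: str     # storage account the caller claims to use
    signature: str   # credential presented with the request
    key: str         # name of the item being requested


class PartitionLayer:
    """Hypothetical stand-in for the layers that actually store the bits."""

    def __init__(self):
        self.lookups = 0
        self.data = {"photo.jpg": b"image-bytes"}

    def read(self, key):
        self.lookups += 1
        return self.data.get(key)


class FrontEnd:
    """Top layer: rejects bad requests before any storage work is done."""

    def __init__(self, partition, expected_signatures):
        self.partition = partition
        self.expected_signatures = expected_signatures  # account -> signature

    def handle(self, request):
        expected = self.expected_signatures.get(request.account)
        # Constant-time comparison; unknown accounts are rejected outright.
        if expected is None or not hmac.compare_digest(expected, request.signature):
            return 403, None
        return 200, self.partition.read(request.key)
```

Because the check runs in the front end, a rejected request never increments `PartitionLayer.lookups`, which is the resource-protection property the paragraph above describes.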

Data Protection

In Windows Azure there are three major types of data storage: Windows Azure Storage, SQL Azure, and Azure AppFabric Cache. Each storage type has its own properties: its own pros and cons and its own typical usage scenario. Yet all three storage types are designed to protect the data they store. This data protection is implemented in several domains:

Protection Against Data Loss

All data in Azure data stores is replicated three times across multiple physical computers in the data center. This architecture provides automatic failover and load balancing.

One node is considered primary, and the others are marked as secondary nodes. If the primary node suffers a hardware failure, one of the secondary nodes takes its place and is marked as the primary. A new replica is then created immediately so that three live replicas are preserved at all times.
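The failover behavior described above can be sketched as follows. This is an illustrative model only: the `ReplicaSet` class, the node names, and the pool of spare nodes are assumptions, and the real fabric's placement and promotion logic is not described in this document.

```python
class ReplicaSet:
    """Toy model of one primary plus secondaries, kept at three replicas."""

    TARGET_REPLICAS = 3

    def __init__(self, nodes, spare_nodes):
        self.primary = nodes[0]
        self.secondaries = list(nodes[1:])
        self.spares = list(spare_nodes)  # hypothetical pool of healthy nodes

    def replicas(self):
        return [self.primary] + self.secondaries

    def fail(self, node):
        """Handle a hardware failure of `node`."""
        if node == self.primary:
            # Promote the first secondary to primary.
            self.primary = self.secondaries.pop(0)
        else:
            self.secondaries.remove(node)
        # Re-replicate onto spare nodes until the target count is restored.
        while len(self.replicas()) < self.TARGET_REPLICAS and self.spares:
            self.secondaries.append(self.spares.pop(0))
```

For example, starting from nodes `a`, `b`, `c` with spare `d`, calling `fail("a")` promotes `b` and adds `d` as a new secondary.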

Customer data is spread across multiple physical servers within the geolocation specified for the data store. In this way, the data store achieves high availability and stability for all applications, from the smallest to the largest, without requiring intensive administrative effort.

If a customer wishes to mitigate major catastrophic events that might disable an entire data center, it is possible to create different stores in different data centers and use Microsoft's sync services, such as SQL Data Sync, to synchronize data between geographic locations.

Secure Communication

Communication between roles and storage is secure by default. If desired, it is possible to require SSL for all communication with all types of storage in the Windows Azure Platform. Secure communication is, however, essential when sending sensitive information from storage to applications running on-premises. Storage is independent, so it cannot distinguish between traffic coming from Windows Azure nodes and traffic originating from the Internet. Channels inside Windows Azure data centers are considered secure, but channels originating from the public Internet are untrusted. SSL ensures that all traffic runs on a secure channel, so configuring storage to accept only requests that arrive over SSL provides an important layer of security.
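On the client side, an application can refuse to talk to storage over anything other than SSL. The following is a minimal sketch of such a guard; the function name `require_https` and the account name in the example endpoint are hypothetical.

```python
from urllib.parse import urlparse


def require_https(endpoint: str) -> str:
    """Return the endpoint unchanged, or raise if it is not an HTTPS URL."""
    parsed = urlparse(endpoint)
    if parsed.scheme.lower() != "https" or not parsed.netloc:
        raise ValueError(f"refusing non-SSL storage endpoint: {endpoint!r}")
    return endpoint


# Hypothetical storage endpoint used only for illustration.
BLOB_ENDPOINT = require_https("https://myaccount.blob.core.windows.net")
```

A check like this complements, rather than replaces, configuring the storage service itself to require SSL.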

Isolation

Windows Azure prevents interaction between data containers by creating logical and physical separations. Storage is implemented by a shared infrastructure that isolates data containers through a number of mechanisms. Each of the different storage infrastructures provided by the Windows Azure platform contains a mechanism (implemented as a layer in the multi-layer architecture) responsible for isolating data containers.

For example, networks are physically and logically isolated. (For more information, see the networking white paper.) Data containers are provisioned among customers such that internal data routing ensures logical isolation.

Access Control

All communication with all types of storage must be authenticated and authorized. The only exception is in the use of public blobs, described later in this document.

Each type of storage has its own access control mechanism. Windows Azure Storage and Azure AppFabric Cache follow the same principle. The owner of the store is provided with a secret key. This access key provides full access to the data; thus it must be protected and handled with care.

SQL Azure implements the traditional SQL access control model. Initial access is established using a connection string that contains a user name and a password. Access to each of the database objects is controlled by the login and role mechanisms.

The Windows Azure Storage Access Control mechanisms will be described in detail later in this document.

Privacy in Microsoft

Since the launch of the Microsoft Network (MSN) in 1995, Microsoft has built and hosted a wide variety of services:

• Familiar consumer-oriented services such as the Windows Live Hotmail web-based email service and the Bing search engine

• Enterprise-oriented services such as the Microsoft Dynamics CRM Online business software and the Microsoft Business Productivity Online Standard Suite from Microsoft Online Services

• Many behind-the-scenes services that handle online billing and advertising functions for Microsoft customers

Microsoft’s Global Foundation Services (GFS) provides the cloud infrastructure for these services, with a focus on adherence to numerous regulatory, statutory, and industry standards. The OSSC team within GFS works with partners and other teams throughout the company to manage security risks to global online services at Microsoft in order to fulfill its mission to provide trustworthy, available online businesses that create a competitive advantage for Microsoft.

All Microsoft data centers are managed according to the following privacy principles:

Accountability in handling personal information within Microsoft and with external vendors and partners

Notice to individuals about how we collect, use, retain, and disclose their personal information

Collection of personal information from individuals only for the purposes identified in the privacy notice we have provided

Choice and consent for individuals regarding how we collect, use, and disclose their personal information

Use and retention of personal information in accordance with the privacy notice we provide to individuals and the consent that the individuals have provided in return

Disclosure or onward transfer of personal information to vendors and partners only for purposes that are identified in the privacy notice, and in a secure fashion

Quality assurance steps to ensure that personal information in our records is accurate and relevant to the purposes for which it was collected

Access for individuals who want to inquire about and, when appropriate, review and update their personal information in our possession

Enhanced security of personal information to help protect against unauthorized access and use

Monitoring and enforcement of compliance with our privacy policies, both internally and with our vendors and partners, along with established processes to address inquiries, complaints, and disputes

Microsoft’s software development teams apply the PD3+C principles, defined in the Security Development Lifecycle (SDL), throughout the company’s development and operational practices:

• Privacy by Design – Microsoft uses this principle in multiple ways during the development, release, and maintenance of applications to ensure that data collected from customers has a specific purpose and that the customer is given appropriate notice in order to enable informed decision-making. When data to be collected is classified as highly sensitive, additional security measures such as encrypting while in transit, at rest, or both may be taken.

• Privacy by Default – Microsoft’s offerings ask customers for permission before collecting or transferring sensitive data. Once authorized, such data is protected by means such as access control lists (ACLs) in combination with identity authentication mechanisms.

• Privacy in Deployment – Microsoft discloses privacy mechanisms to organizational customers as appropriate to allow them to establish appropriate privacy and security policies for their users.

• Communications – Microsoft actively engages the public through publication of privacy policies, white papers, and other documentation pertaining to privacy.

Windows Azure storage is no exception to the policies governing other online services hosted by Microsoft. The software running Windows Azure was developed with privacy in mind under the PD3+C principles defined in the SDL.

Windows Azure Access Control

Windows Azure Storage has a simple access-control model. Each Windows Azure subscription can create one or more storage accounts, and each storage account has a single secret key, called the Storage Account Key (SAK), that is used to control access to all data in that storage account. This supports the typical scenario, under which storage is associated with applications, and these applications have full control over their associated data. Any application that wants to have access to data in a storage account needs to have the appropriate SAK.

A more sophisticated access-control model can be achieved by creating a custom application front end to the storage, giving the application the storage key, and letting the application authenticate remote users and even authorize individual storage requests.

Storage account keys can be reset using the subscription credentials via the Windows Azure Portal or the Service Management API (SMAPI). To support periodically changing SAKs without any breaks in service, a storage account can have two secret keys associated with it at one time, where either key grants full access to all of the data. There is then a three-step sequence for changing the secret key: adding the new key as authorized to the storage service; changing the key used by all applications accessing the service; and removing the old key so that it is no longer authorized.
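The three-step rotation can be sketched with a small in-memory model. The `StorageAccount` class, its method names, and the representation of applications as dictionaries are hypothetical; in practice the key changes would be made through the portal or SMAPI rather than through code like this.

```python
import secrets


class StorageAccount:
    """Toy model of an account that authorizes at most two keys at once."""

    MAX_KEYS = 2

    def __init__(self, initial_key):
        self.authorized = {initial_key}

    def add_key(self):
        if len(self.authorized) >= self.MAX_KEYS:
            raise RuntimeError("remove a key before adding another")
        key = secrets.token_urlsafe(32)
        self.authorized.add(key)
        return key

    def remove_key(self, key):
        self.authorized.discard(key)

    def is_authorized(self, key):
        return key in self.authorized


def rotate_key(account, applications, old_key):
    """Rotate the key without a window in which applications are locked out."""
    new_key = account.add_key()          # Step 1: authorize the new key.
    for app in applications:             # Step 2: switch every application.
        app["storage_key"] = new_key
    account.remove_key(old_key)          # Step 3: de-authorize the old key.
    return new_key
```

Between steps 1 and 3 both keys are valid, so applications that have not yet been switched keep working.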

Shared Access Signatures

The owner of the storage account keys (SAKs) has full control over blobs, tables, and queues.

Customers may want some blobs to be made public. One common scenario is static data (such as pictures) that needs to be accessible directly from a browser. To achieve this, the blob container can be marked as public.

Sometimes, however, you want to be able to access your blobs from a browser without making them publicly accessible to everyone. In this case, a Shared Access Signature is needed. Using the storage account key, you can create a special URL that enables access to the blobs by clients that do not hold the key. This special URL should be distributed only to customers to whom you want to give access.

The URL for a Shared Access Signature includes additional components that specify the container or blob to make accessible, the interval over which the Shared Access Signature is valid, the permissions associated with the signature, any signed identifiers associated with the request, and the signature itself. Shared Access Signatures can only be created by the SAK owner because the key is required to create an HMAC signature. Figure 1 describes the structure of the Shared Access Signature’s URL.

Figure 1
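To make the role of the HMAC signature concrete, the sketch below builds a signed URL from the components listed above. It is an approximation under stated assumptions: the order of fields in the string that is signed, the query parameter names (`st`, `se`, `sr`, `sp`, `si`, `sig`), and the use of HMAC-SHA256 over a Base64-encoded key reflect one reading of the Blob service format and should be checked against the REST API reference before use. The account, container, and key values are placeholders.

```python
import base64
import hashlib
import hmac
from urllib.parse import urlencode


def sign_shared_access(account_key_b64, resource, permissions,
                       start, expiry, identifier=""):
    """Compute a Base64 HMAC-SHA256 signature over the access parameters."""
    # Assumed field order; verify against the service documentation.
    string_to_sign = "\n".join([permissions, start, expiry, resource, identifier])
    key = base64.b64decode(account_key_b64)
    digest = hmac.new(key, string_to_sign.encode("utf-8"), hashlib.sha256).digest()
    return base64.b64encode(digest).decode("ascii")


def shared_access_url(blob_url, account_key_b64, resource, permissions,
                      start, expiry, resource_type="b", identifier=""):
    """Append the access parameters and their signature to a blob URL."""
    signature = sign_shared_access(account_key_b64, resource, permissions,
                                   start, expiry, identifier)
    query = {"st": start, "se": expiry, "sr": resource_type, "sp": permissions}
    if identifier:
        query["si"] = identifier
    query["sig"] = signature
    return blob_url + "?" + urlencode(query)
```

Because the signature covers the permissions and validity interval, a recipient cannot extend the lifetime or widen the permissions without invalidating the URL, and only the holder of the storage account key can produce a valid signature.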

Container Policy

A container-level access policy provides an additional level of control over Shared Access Signatures on the server side. Establishing a container-level access policy serves to group Shared Access Signatures and to provide additional restrictions for signatures that are bound by the policy. You can use a container-level access policy to change the start time, expiry time, or permissions for a signature, or to revoke it, even after it has been issued.

A container-level access policy provides greater flexibility in issuing Shared Access Signatures. Instead of specifying the signature's lifetime and permissions on the URL, it is possible to specify these parameters within the access policy, stored as metadata on the container in which the signed resource (container or blob) resides. For example, to change the lifetime of one or more signatures, one can simply modify the container-level access policy rather than have to reissue the signatures. Similarly, Shared Access Signatures cannot be canceled after being issued, but if a container policy is used it is possible to quickly revoke a signature by modifying the container-level access policy.

A container-level access policy includes a signed identifier, a value that may be up to 64 characters long and must be unique within the container. The value of this signed identifier is specified in the signedidentifier field in the URL of the Shared Access Signature. These policies are created using the Blob Service REST API, and a maximum of five policies can be associated with a container.
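The way a stored policy allows a signature to be changed or revoked after issue can be modeled in a few lines. This sketch is purely illustrative: the `Container` class and its methods are hypothetical and only mirror the limits stated above (identifiers of at most 64 characters, at most five policies per container); they are not the Blob Service REST API.

```python
from datetime import datetime, timezone


class Container:
    """Toy model of container-level access policies keyed by signed identifier."""

    MAX_POLICIES = 5
    MAX_IDENTIFIER_LENGTH = 64

    def __init__(self):
        self.policies = {}

    def set_policy(self, identifier, start, expiry, permissions):
        if len(identifier) > self.MAX_IDENTIFIER_LENGTH:
            raise ValueError("signed identifier is longer than 64 characters")
        if identifier not in self.policies and len(self.policies) >= self.MAX_POLICIES:
            raise ValueError("a container can hold at most five policies")
        self.policies[identifier] = {
            "start": start, "expiry": expiry, "permissions": permissions,
        }

    def revoke(self, identifier):
        # Deleting the policy invalidates every signature that refers to it.
        self.policies.pop(identifier, None)

    def allows(self, identifier, permission, now):
        policy = self.policies.get(identifier)
        if policy is None:
            return False
        in_window = policy["start"] <= now <= policy["expiry"]
        return in_window and permission in policy["permissions"]
```

Since the lifetime and permissions live on the container rather than in the URL, calling `set_policy` again with a new expiry changes every signature bound to that identifier without reissuing any URLs.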

AppFabric Cache Access Control

The Azure AppFabric Cache Security model is simple. The owner of the cache is provided with a security token (ACS Token) in the cache portal. All cache access requests must be authenticated.

To create a cache object, a CacheFactory must be created. The CacheFactory is provided with the key, either in code or by writing it to the configuration file.

All communication with the cache runs over a secure channel established by WCF message security. The channel is protected by implementing WS-* security standards.

Figure 2 shows the cache's security token as presented by the management portal. The portal provides a basic configuration snippet, which contains the key that has to be inserted into the application's configuration file before using the cache.

Figure 2

SQL Azure Security Model

SQL Azure is a database in the cloud. It was designed to look as similar as possible to traditional on-premises SQL databases. The developer and the IT pro can continue to use their existing knowledge and tools to manage and work with SQL Azure.

SQL Azure uses the TDS protocol over TCP port 1433 in exactly the same way as on-premises databases, so data access technologies like ADO.NET and Entity Framework can use it transparently. For network administrators, this means that their environment must be configured to allow outbound TCP connections on port 1433 to enable applications and tools to connect to SQL Azure.

SQL Azure contains a built-in firewall to filter incoming traffic. This firewall is configured using firewall rules that can be written to the master database or directly in the management portal, as shown in Figure 3 and Figure 4. It is also possible to enable or disable connections from Windows Azure by clicking "Allow other Windows Azure services to access this server" in the firewall rules configuration, as shown in Figure 3.

Figure 3

Figure 4
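The effect of firewall rules on incoming connections can be sketched as a simple range check. This example assumes each rule is a named range with a start and end IPv4 address; the rule name and the addresses (taken from documentation-reserved ranges) are hypothetical, and the function is a model of the filtering idea rather than the SQL Azure firewall itself.

```python
from ipaddress import ip_address


def is_allowed(client_ip, rules):
    """Return True if client_ip falls inside any (start, end) rule range."""
    addr = int(ip_address(client_ip))
    return any(
        int(ip_address(start)) <= addr <= int(ip_address(end))
        for start, end in rules.values()
    )


# Hypothetical rule set: one office network range.
FIREWALL_RULES = {
    "Office": ("203.0.113.0", "203.0.113.255"),
}
```

With these rules, a connection from 203.0.113.42 is accepted and one from 198.51.100.7 is rejected; an empty rule set rejects every address.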

SQL Azure provides the same set of security principals that are available in SQL Server with SQL Server Authentication. You can use these to authorize access and secure your data:

SQL Server Logins: Authenticate access to SQL Azure at the server level.

Database Users: Grant access to SQL Azure at the database level.

Database Roles: Group users and grant access at the database level.

SQL Azure supports only the standard SQL authentication mechanism, in which the database server stores and manages the credentials for user logins.

Connection to the database is established by presenting a connection string that contains the credentials of the user trying to connect. Connection strings that contain clear-text credentials obviously contain sensitive information and must be protected and handled with care.

During the provisioning process, SQL Azure creates a login for the customer that is the server-level principal, similar to the sa login in SQL Server. This login will have administrative capabilities in the virtual SQL Server that will be provisioned for the customer. This account will then be used to create additional user logins for authentication, as well as database users and roles for authorization.

When connecting to the SQL Azure service, the user will need to provide credentials required for login. These credentials will be protected using SSL while in transit (all communication between the SQL Azure database and your application requires SSL encryption at all times). Connections will be re-authenticated every 60 minutes, with the user client software resending the credentials. At this point, any password reset will be enforced for that connection. For performance reasons, when a password is reset in SQL Azure, the connection will not be re-authenticated immediately, even if the connection is reset due to connection pooling. This is different from the behavior of an on-premises SQL Server.

Encryption

Currently there is no support for native encryption in either Windows Azure Storage or SQL Azure. Data in Windows Azure Storage is stored as clear text, and there is no option to encrypt the data other than by developing your own encryption code. SQL Azure does not currently support the Transparent Data Encryption (TDE) feature available in Microsoft SQL Server.

For data accessed only by services hosted by Windows Azure, this is not a major concern. Encrypting data in a hosted service requires that all instances of the service have access to the encryption keys, and if a service chooses to encrypt data, these keys should be stored in certificates held in the Windows Azure certificate store. However, the security policies implemented by Microsoft’s data centers help ensure that attackers cannot gain physical access to the machines running Azure services or to the disks that implement Azure storage. In addition, the data destruction policies eliminate the possibility of data becoming inadvertently accessible to other Azure subscribers.

Therefore, as long as a service implements an appropriate level of transport security (e.g., SSL), the only feasible way in which an attacker could gain access to the data would be to obtain or steal the details of either the Windows Live ID used by the Azure subscription or the Management Certificate used to administer the hosted services. Such an attacker would in any case have access to the certificate store and therefore the means to decrypt the data, so performing encryption in this scenario would be of little benefit.

Encryption is valuable if you are accessing data located in Azure storage from applications running on computers outside of the Azure environment rather than from services hosted on Azure. In this scenario, different users running these applications might need to maintain confidentiality from each other. You can encrypt data within a client application, for instance, by using a private AES key, known only to the user, that is generated on a client machine or elsewhere within the enterprise. The encrypted payload is then uploaded to Azure storage. The data can be decrypted only by the user providing this key (which can be stored in the user’s profile using DPAPI). Other users can generate their own private keys, and these keys should not be disclosed. A client application, such as a Web browser, that attempts to read the encrypted data directly from Azure storage without providing the appropriate key will not be able to decrypt the data.

Hybrid Applications

There are scenarios under which sensitive data should not be deployed to the cloud. The most common reasons are laws and regulations prohibiting the use of data storage in the cloud. The nature of regulation changes slowly, so currently there are many cases in which local laws are not up-to-date with rapidly evolving cloud technologies. There are of course other cases, in which custom data protection infrastructures prevent data from being written to the cloud.

In these cases, Windows Azure Connect allows for a direct and safe connection between the local data center and the cloud. This enables scenarios in which data continues to live on-premises but computation is done in the cloud. The main caveats are of course scalability and performance; local data storage is usually not designed to work on the same scale as storage in the cloud. Similarly, the long communication channel between the data store and the computation nodes will affect performance.

Conclusion

Taking data out of the company’s private data center and storing it in the cloud is never an easy step for decision makers and IT pros to make. Microsoft understands the importance of information security and privacy and implements all the steps necessary to ensure that anything stored in its data centers is safe and private. All storage infrastructures provided by Windows Azure implement data-protection mechanisms such as access control, secure communication, and isolation.

Information stored in Microsoft's data centers is managed under strict privacy policies and regulations. Microsoft is committed to protecting your privacy and your data. Microsoft also keeps close watch on all of the evolving security standards and regulations in order to continually maintain compliance. It is not surprising, therefore, that Microsoft’s cloud infrastructure has obtained an Authorization to Operate (ATO) under the Federal Information Security Management Act of 2002 (FISMA), which authorizes US government organizations to store their data in Windows Azure.
