
Handling Recordkeeping Risks in the Cloud

In document 5926.pdf (Page 111-121)

2. LITERATURE REVIEW

2.3. The Technology

2.3.2. Defining Cloud Computing

2.3.2.1. Handling Recordkeeping Risks in the Cloud

As mentioned in Chapter 1, NARA and ARMA International have remarked on the potential recordkeeping risks associated with Cloud Computing. A variety of researchers and governmental agencies have also looked into possible risks and have provided more detailed information regarding both risks and best practices. For example, the U.S. government’s CIO Council, in conjunction with the Federal Compliance Committee, published Creating Effective Cloud Computing Contracts for the Federal Government: Best Practices for Acquiring IT as a Service (CIO Council 2012). This document highlights ten areas in which federal agencies “require improved collaboration and alignment during the contract formation process…when acquiring cloud computing services” (3): selecting a cloud service; CSP (i.e., “Cloud Service Provider”) and End-User Agreements; Service Level Agreements (SLAs); CSP, agency, and integrator roles and responsibilities; standards; security; privacy; e-Discovery; Freedom of Information Act (FOIA); and e-Records. Although “e-Records” obviously relates to recordkeeping, several of the other areas can also affect recordkeeping activities in the organization. The European Union’s Article 29 Data Protection Working Party (Article 29 Data Protection Working Party 2012) attributes many of these risks to “a lack of control over personal data as well as insufficient information with regard to how, where and by whom the data is being processed/sub-processed” (2).

Table 1 - Comparing Cloud to Related Architectures, by Feature and Source

Feature Cluster Grid Cloud P2P

Server Types:
  Commodity (Buyya et al. 2009) (Čibej, Sulistio, and Buyya 2009) (Stockinger 2007)
  High-End (Buyya et al. 2009) (Čibej, Sulistio, and Buyya 2009) (Stockinger 2007)
  Commodity computers and high-end servers and network attached storage (Buyya et al. 2009)
  Edge of Network (i.e., desktop) (Stockinger 2007)
  Commodity (Agrawal et al. 2010)

Ownership:
  Single (Buyya et al. 2009) (Čibej, Sulistio, and Buyya 2009) (Stockinger 2007)
  Multiple (Buyya et al. 2009) (Čibej, Sulistio, and Buyya 2009) (Stockinger 2007)
  Single (Buyya et al. 2009) (Mell and Grance 2009b) (Wyld 2009)
  Multiple (Stockinger 2007)
  Single (Buyya et al. 2009)

Discovery:
  Membership Services (Buyya et al. 2009) (Čibej, Sulistio, and Buyya 2009) (Stockinger 2007)
  Centralized Index & Decentralized Information (Buyya et al. 2009) (Čibej, Sulistio, and Buyya 2009) (Stockinger 2007)
  Membership Services (Buyya et al. 2009)
  Decentralized (Buyya et al. 2009) (Čibej, Sulistio, and Buyya 2009) (Stockinger 2007)

Ease of Use:
  Difficult (Jones 2008) (Klems 2008) (Weinhardt et al. 2009)
  Easy (Jones 2008) (Klems 2008) (Weinhardt et al. 2009)

Security/Privacy:
  Traditional login/password-based. Medium level of privacy – depends on user privileges. (Buyya et al. 2009)
  Public/private key pair based authentication and mapping a user to an account. Limited support for privacy. (Buyya et al. 2009)
  Each user/application provided a virtual machine. High security/privacy guaranteed. Support for setting per-file access control list (ACL). (Buyya et al. 2009)
  Credential delegations and user authorization (Foster et al. 2008) (Stockinger 2007) (Vaquero et al. 2009)
  Security through isolation (Vaquero et al. 2009)
  Simple Use of Webforms (over SSL) (Foster et al. 2008)
  Public Key Infrastructure (PKI) and X.509 SSL certificates (Youseff, Butrico, and Da Silva 2008)

Resource Sharing:
  Collaboration (VOs, fair share), policies & procedures (Stockinger 2007) (Vaquero et al. 2009)
  Assigned resources are not shared (Vaquero et al. 2009)

Resource Management:
  Centralized (Buyya et al. 2009) (Čibej, Sulistio, and Buyya 2009) (Stockinger 2007)
  Distributed (Bote-Lorenzo, Dimitriadis, and Gómez-Sánchez 2003) (Buyya et al. 2009) (Čibej, Sulistio, and Buyya 2009) (Stockinger 2007) (Taylor and Harrison 2008) (Vaquero et al. 2009) (Weinhardt et al. 2009)
  Centralized (Buyya et al. 2009) (Vaquero et al. 2009) (Weinhardt et al. 2009)
  Distributed (Stockinger 2007)

Capacity:
  Stable and guaranteed (Buyya et al. 2009) (Čibej, Sulistio, and Buyya 2009) (Stockinger 2007)
  Varies, but typically high (Buyya et al. 2009) (Čibej, Sulistio, and Buyya 2009) (Stockinger 2007)
  Provisioned on demand (Agrawal et al. 2010) (Buyya et al. 2009) (Wang et al. 2010) (Zhang, Cheng, and Boutaba 2010)
  Varies (Stockinger 2007)

Speed (Latency/Bandwidth):
  Low latency, high bandwidth (Buyya et al. 2009) (Čibej, Sulistio, and Buyya 2009) (Stockinger 2007)
  High latency, low bandwidth (Buyya et al. 2009) (Čibej, Sulistio, and Buyya 2009) (Stockinger 2007)
  Low latency, high bandwidth (Buyya et al. 2009)
  High latency, low bandwidth (Stockinger 2007)

Application Development:
  Local (Weinhardt et al. 2009)
  In the Cloud (Weinhardt et al. 2009)

Business and/or Funding Model:
  Limited, not open market (Buyya et al. 2009)
  Limited, not open market (Buyya et al. 2009)
  Pay as you go; utility pricing (Agrawal et al. 2010) (Buyya et al. 2009) (Foster et al. 2008) (Jones 2008) (Knorr and Gruman 2008) (Weinhardt et al. 2009) (Wilton 2010) (Yachin 2008) (Zhang, Cheng, and Boutaba 2010)
  Public good or privately assigned; project-oriented resource sharing; policy (Buyya et al. 2009) (Foster et al. 2008) (Weinhardt et al. 2009)
  Tiered, per-unit, and subscription-based pricing (Youseff, Butrico, and Da Silva 2008)

Standardization & Interoperability:
  Virtual Interface Architecture (VIA)-based (Buyya et al. 2009) (Čibej, Sulistio, and Buyya 2009) (Stockinger 2007)
  Some open grid forum standards (Buyya et al. 2009) (Čibej, Sulistio, and Buyya 2009) (Foster et al. 2008) (Stockinger 2007)
  Web Services (SOAP and REST) (Buyya et al. 2009) (Foster et al. 2008)
  No Standards (Stockinger 2007)

Provenance Management:
  Done via workflow systems (Foster et al. 2008)
  Relatively unexplored (Foster et al. 2008)

Computational Model:
  Batch (Foster et al. 2008) (Weinhardt et al. 2009)
  Interactive (Foster et al. 2008) (Weinhardt et al. 2009)
  Various (e.g., batch, interactive, distributed, parallel) (Stockinger 2007)
  Long-lived services based on hardware virtualization (Klems 2008)
  Short-lived batch-style processing (job execution) (Klems 2008)

Scalability:
  100s (Buyya et al. 2009) (Čibej, Sulistio, and Buyya 2009) (Stockinger 2007)
  1000s (Buyya et al. 2009) (Čibej, Sulistio, and Buyya 2009) (Jones 2008) (Stockinger 2007)
  100s to 1000s (Buyya et al. 2009)
  Millions (Stockinger 2007)
  Nodes and sites scalability (Vaquero et al. 2009)
  Elastic (Agrawal et al. 2010) (Wang et al. 2010) (Zhang, Cheng, and Boutaba 2010)
  Nodes, sites, and hardware scalability (Vaquero et al. 2009)

Virtualization:
  Virtualization of data and computing resources (Stockinger 2007) (Vaquero et al. 2009) (Zhang, Cheng, and Boutaba 2010)
  Virtualization of hardware and software platform (Vaquero et al. 2009) (Wang et al. 2010) (Zhang, Cheng, and Boutaba 2010)

Self-Management, Failure Management:
  Limited (often failed tasks/applications are restarted) (Buyya et al. 2009)
  Limited (often failed tasks/applications are restarted) (Buyya et al. 2009)
  Strong support for failover and content replication. VMs can be easily migrated from one node to another (Buyya et al. 2009)
  Reconfigurability (Stockinger 2007) (Vaquero et al. 2009)
  Reconfigurability, self-healing (Stockinger 2007) (Vaquero et al. 2009)

Software Dependencies:
  Application domain-dependent software (grid middleware required) (Stockinger 2007) (Taylor and Harrison 2008) (Vaquero et al. 2009)
  Application domain-independent software (Vaquero et al. 2009)

Allocation/Scheduling:
  Centralized (Buyya et al. 2009) (Čibej, Sulistio, and Buyya 2009) (Stockinger 2007)
  Decentralized (Buyya et al. 2009) (Čibej, Sulistio, and Buyya 2009) (Stockinger 2007)
  Both centralized and decentralized (Buyya et al. 2009)
  Decentralized (Stockinger 2007)

Recordkeeping risks associated with cloud computing and the steps needed to mitigate these risks according to the CIO Council and the Article 29 Data Protection Working Party are:

Lack of availability of one’s data. Because many cloud providers rely on proprietary technology and because there are still no widely accepted interoperability standards for cloud computing, one may be subject to vendor lock-in. Under such a situation, if one chooses to terminate the contract with one’s CSP, or if the CSP goes out of business or merges with another company, one may be at risk of losing access to one’s data. The CIO Council recommends that any agency ensure that it explicitly specifies data ownership in its service contract and that it validates that the CSP has the capability (and intent) to transfer data back to the agency in the event of contract termination or to a different CSP if and when the agency desires.

Lack of data integrity. Although cloud providers engage in resource isolation techniques to ensure that information cannot be accessed by cloud consumers other than those authorized to access it, the underlying computing resources (e.g., hardware) are nonetheless shared among a variety of consumers. An agency needs to ensure that it understands how resource isolation occurs and what security safeguards the CSP provides. The CSP should provide auditing capabilities and should be given clear metrics that the agency needs to evaluate on a regular basis. In addition, processing logs should be available to the agency on a regular basis.

Lack of Confidentiality. A cloud provider may receive a request or demand for information from law enforcement agencies. According to the CIO Council and Article 29 Data Protection Working Party, the agency needs to understand how such a request could impact its own information. For example, if the agency receives an e-Discovery request, it needs to be confident that the cloud provider can meet that request in a timely manner. In addition, if the cloud provider stores information outside the agency’s jurisdiction, that information may be subject to legal requirements that conflict with the agency’s privacy requirements, especially if it is held outside the country. If the data needs to be stored in a particular jurisdiction, this needs to be specified in the cloud contract as well. Finally, the CSP should be required to sign a confidentiality agreement and to follow the same confidentiality rules that the agency’s personnel have to follow.

Lack of Intervenability. Recordkeepers sometimes need to access records to make corrections or to erase erroneous information. The agency needs to be sure that the cloud provider can offer this ability. According to the CIO Council and Article 29 Data Protection Working Party, contracts need to specify clearly the roles and responsibilities of all parties to the contract, to minimize the risk that the agency will be unable to intervene in its records as its policies and mandates require.

Lack of Isolation. Because underlying resources are shared within a cloud computing arrangement, it is physically possible for cloud providers to link information from several different customers. If an administrator has sufficient access rights, he or she could link these pieces of information in ways that are not acceptable to one or more of the customers. Therefore, agencies must make sure that the CSP ensures that no one has access to more information than they actually need to have. In addition, the CSP needs to let the agency know what technical measures it takes to isolate the information so that the risk of unauthorized access is minimized.

Lack of Transparency. The CSP needs to be transparent about its policies and procedures regarding security, privacy, and handling of the data. In many cases, cloud providers layer their services. That is, some portions of their services are actually provided by yet other vendors. In such cases, the agency not only needs to know who all the parties that touch the data are; it also needs to ensure contractually that all of these vendors are willing and able to meet its recordkeeping requirements. In addition, the agency needs to understand how the cloud provider can comply with its retention periods. The agency needs to ensure that the data is erased or anonymized acceptably. Doing this requires either destroying or demagnetizing storage media, or overwriting the data sufficiently. Special software tools exist that will overwrite data multiple times to ensure it is unrecognizable (Article 29 Data Protection Working Party 2012). However, one needs to recognize that 100% secure infrastructure is not possible in the Cloud. This is why full transparency of the CSP is essential, according to the CIO Council and Article 29 Data Protection Working Party.
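To make the idea of overwriting data multiple times concrete, the following is a minimal Python sketch of a multi-pass overwrite of a single local file. It is an illustration only and is not taken from the Article 29 Working Party opinion or from any particular erasure tool; the function names, the default of three passes, and the chunk size are all assumptions. It also assumes a storage medium on which writing to a file really replaces the underlying blocks. On solid-state drives, copy-on-write file systems, and replicated or snapshotted cloud storage, that assumption generally does not hold, which is one reason the agency depends on the CSP to explain how erasure is actually carried out.

```python
import os
import secrets


def overwrite_in_place(path: str, passes: int = 3, chunk_size: int = 64 * 1024) -> None:
    """Overwrite the file's bytes with random data, `passes` times.

    Hypothetical helper: the pass count and chunk size are illustrative
    defaults, not values recommended by any standard cited in the text.
    """
    size = os.path.getsize(path)
    with open(path, "r+b") as handle:
        for _ in range(passes):
            handle.seek(0)
            remaining = size
            while remaining > 0:
                block = secrets.token_bytes(min(chunk_size, remaining))
                handle.write(block)
                remaining -= len(block)
            # Ask the operating system to push this pass to the device
            # before the next pass begins.
            handle.flush()
            os.fsync(handle.fileno())


def overwrite_and_delete(path: str, passes: int = 3) -> None:
    """Overwrite the file, then remove its directory entry."""
    overwrite_in_place(path, passes=passes)
    os.remove(path)
```

A recordkeeper could call `overwrite_and_delete("expired_record.pdf")` on a locally held copy once its retention period has lapsed. Copies held by the CSP or its subcontractors are outside the reach of such a script, so they still have to be covered by the contract.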

Of course, any agency needs to ensure that the CSP is aware of all recordkeeping requirements, including laws and regulations by which the agency must abide when handling records. The CIO Council notes that risks can be lowered by including recordkeeping personnel in the requirements definition process and including them in communications channels (32). In addition, the CSP must be willing and able to transfer records of long-term value to an agency-specified archive according to retention period requirements.
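When records are transferred out of a CSP, whether back to the agency at contract termination or to an agency-specified archive, one way an agency might confirm that nothing was lost or altered in transit is a fixity check: compute a checksum for every file before and after the transfer and compare the two lists. The sketch below is a minimal illustration under that assumption. Neither the CIO Council nor the Article 29 Working Party prescribes this method, and the function names and the choice of SHA-256 are hypothetical choices made for the example.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of one file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root: Path) -> dict[str, str]:
    """Map each file's path (relative to `root`) to its checksum."""
    return {
        p.relative_to(root).as_posix(): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def compare_manifests(before: dict[str, str], after: dict[str, str]) -> dict[str, list[str]]:
    """Report files that went missing, appeared unexpectedly, or changed."""
    shared = set(before) & set(after)
    return {
        "missing": sorted(set(before) - set(after)),
        "unexpected": sorted(set(after) - set(before)),
        "altered": sorted(name for name in shared if before[name] != after[name]),
    }
```

In use, the agency would build one manifest from the records as held by the CSP and a second from the records as received by the archive. An empty report from `compare_manifests` suggests the transfer was complete, while any entry under `missing` or `altered` identifies specific records to raise with the provider.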
