Data Intensive Storage Services for Cloud Environments
Dimosthenis Kyriazis
National Technical University of Athens, Greece
Athanasios Voulodimos
National Technical University of Athens, Greece
Spyridon V. Gogouvitis
National Technical University of Athens, Greece
Theodora Varvarigou
National Technical University of Athens, Greece
Table of Contents
Foreword xiv
Preface xvi
Acknowledgment xix
Chapter 1
Commercial and Distributed Storage Systems 1
Spyridon V. Gogouvitis, National Technical University of Athens, Greece
Athanasios Voulodimos, National Technical University of Athens, Greece
Dimosthenis Kyriazis, National Technical University of Athens, Greece
Chapter 2
Key Distributed Components for a Large-Scale Object Storage 9
Miriam Allalouf, Jerusalem College of Engineering, Israel
Ghislain Chevalier, Orange Labs, France
Danny Harnik, IBM Haifa Research Labs, Israel
Sivan Tal, Infinidat Ltd., Israel
Chapter 3
Content Centric Storage and Current Storage Systems 27
Michael C. Jaeger, Siemens AG, Corporate Technology, Germany
Uwe Hohenstein, Siemens AG, Corporate Technology, Germany
Chapter 4
Business Models and Billing Challenges 47
Javier Martinez Elicegui, Telefonica I+D, Spain
Lei Xu, Umea University, Sweden
Emilio Garcia Escobar, Telefonica I+D, Spain
Chapter 5
Towards Federation and Interoperability of Cloud Storage Systems 60
Sebastian Dippl, Siemens AG Corporate Technology, Germany
Michael C. Jaeger, Siemens AG Corporate Technology, Germany
Achim Luhn, Siemens AG Corporate Technology, Germany
Alexandra Shulman-Peleg, IBM Haifa Research Lab, Israel
Gil Vernik, IBM Haifa Research Lab, Israel
Chapter 6
SLA Management in Storage Clouds 72
Nikoletta Mavrogeorgi, National Technical University of Athens, Greece
Spyridon V. Gogouvitis, National Technical University of Athens, Greece
Athanasios Voulodimos, National Technical University of Athens, Greece
Vasilios Alexandrou, National Technical University of Athens, Greece
Chapter 7
Cloud Access Control Mechanisms 94
Ciro Formisano, Engineering Ingegneria Informatica SPA, Italy
Lucia Bonelli, Engineering Ingegneria Informatica SPA, Italy
Kanchanna Ramasamy Balraj, Engineering Ingegneria Informatica SPA, Italy
Alexandra Shulman-Peleg, IBM Haifa Research Lab, Israel
Chapter 8
Compliance in the Cloud 109
Lucia Bonelli, Engineering Ingegneria Informatica, Italy
Luisa Giudicianni, Engineering Ingegneria Informatica, Italy
Angelo Immediata, Engineering Ingegneria Informatica, Italy
Antonio Luzzi, Engineering Ingegneria Informatica, Italy
Chapter 9
Media Convergence and Cloud Technologies: Smart Storage, Better Workflows 132
Mirko Lorenz, Deutsche Welle, Germany
Linda Rath-Wiggins, Deutsche Welle, Germany
Wilfried Runde, Deutsche Welle, Germany
Alberto Messina, RAI, Italy
Paola Sunna, RAI, Italy
Giorgio Dimino, RAI, Italy
Maurizio Montagnuolo, RAI, Italy
Roberto Borgotallo, RAI, Italy
Chapter 10
Telecommunication Industry: Storage and Mobility 145
Fredrik Solsvik, Telenor ASA, Norway
Michel Dao, Orange Labs, France
Chapter 11
Data Intensive Enterprise Applications 158
Peter Izsak, SAP Research Israel, Israel
Aidan Shribman, SAP Research Israel, Israel
Chapter 12
Cloud Computing for Earth Observation 166
Roberto Cossu, European Space Agency (ESRIN), Italy
Claudio Di Giulio, European Space Agency (ESRIN), Italy
Fabrice Brito, Terradue, Italy
Dana Petcu, Institute e-Austria, Austria & West University of Timisoara, Romania
Chapter 13
Cloud-TM: An Elastic, Self-Tuning Transactional Store for the Cloud 192
Joao Barreto, Technical University Lisbon, Portugal
Pierangelo Di Sanzo, Sapienza Universita di Roma, Italy
Roberto Palmieri, Sapienza Universita di Roma, Italy
Paolo Romano, Technical University Lisbon, Portugal
Chapter 14
Storage Security and Technical Challenges of Cloud Computing 225
Shantanu Pal, University of Calcutta, India
Chapter 15
Flashing in the Cloud: Shedding Some Light on NAND Flash Memory Storage Systems 241
Jalil Boukhobza, University of Western Brittany, France
Chapter 16
XtreemFS: A File System for the Cloud 267
Jan Stender, Zuse Institute Berlin, Germany
Michael Berlin, Zuse Institute Berlin, Germany
Alexander Reinefeld, Zuse Institute Berlin, Germany
Compilation of References 286
About the Contributors 309
Detailed Table of Contents
Foreword xiv
Preface xvi
Acknowledgment xix
Chapter 1
Commercial and Distributed Storage Systems 1
Spyridon V. Gogouvitis, National Technical University of Athens, Greece
Athanasios Voulodimos, National Technical University of Athens, Greece
Dimosthenis Kyriazis, National Technical University of Athens, Greece
Distributed storage systems are becoming the method of data storage for the new generation of applications, as they appear to be a promising solution for handling the immense volume of data produced in today's rich and ubiquitous digital environment. In this chapter, the authors first present the requirements end users pose on Cloud Storage solutions. Then they compare some of the most prominent commercial distributed storage systems against these requirements. Lastly, the authors present the innovations the VISION Cloud project brings in the field of Storage Clouds.
Chapter 2
Key Distributed Components for a Large-Scale Object Storage 9
Miriam Allalouf, Jerusalem College of Engineering, Israel
Ghislain Chevalier, Orange Labs, France
Danny Harnik, IBM Haifa Research Labs, Israel
Sivan Tal, Infinidat Ltd., Israel
This chapter discusses distributed mechanisms that serve as building blocks in the construction of the VISION Cloud object service. Two are fundamental building blocks in the creation of a large-scale clustered object storage: distributed file systems and distributed data management systems. In addition, the authors study two complementary topics that aim to improve the qualities of the underlying infrastructure: resource allocation mechanisms and improvements to data mobility via data reduction.
Chapter 3
Content Centric Storage and Current Storage Systems 27
Michael C. Jaeger, Siemens AG, Corporate Technology, Germany
Uwe Hohenstein, Siemens AG, Corporate Technology, Germany
Content-centric storage represents an approach for handling large amounts of data. It is one of the innovations pursued by the VISION Cloud project. The goal of the VISION Cloud project is the development of an industry-grade storage system using cloud technology. The envisaged use of the VISION Cloud involves the storage and management of millions of data items, potentially several hundreds of terabytes in size. On the one hand, the technical foundations must be capable of efficiently storing such an amount of data. On the other hand, the VISION Cloud must provide an adequate API that allows efficient navigation, search, and access to the right data item in this storage. For the latter purpose, VISION Cloud provides a data access layer, which is called "Content Centric Interface." Applications can use this data access layer for accessing the VISION Cloud storage from a content-centric point of view, abstracted from the actual storage representation. The content-centric interface is different from existing cloud storage interfaces and is similar, from an architectural point of view, to object-relational mapping frameworks for traditional applications with relational database systems.
Chapter 4
Business Models and Billing Challenges 47
Javier Martinez Elicegui, Telefonica I+D, Spain
Lei Xu, Umea University, Sweden
Emilio Garcia Escobar, Telefonica I+D, Spain
The advent of the Cloud has raised a number of challenges, both for customers and service providers. Companies willing to embrace the new paradigm must face some entrance barriers, such as security, privacy and trust concerns, vendor lock-in risk, legal issues, etc. While service providers may work to minimize these barriers, they must be especially careful when defining what may constitute the most crucial aspect for the success of their offerings: the business model. Different incarnations of the cloud (IaaS, PaaS, and SaaS) add to the possibility of offering public or private solutions, or even federated models. On top of this is the billing strategy: the ubiquitous pay-per-use approach (either in its most common post-paid incarnation, or in a novel pre-paid version) is only the starting point for a wide range of innovative solutions, including bundling or QoS considerations, which the European project VISION Cloud is tackling as part of its research efforts. This chapter aims to provide a comprehensive discussion of the most relevant business factors that the Cloud confronts.
Chapter 5
Towards Federation and Interoperability of Cloud Storage Systems 60
Sebastian Dippl, Siemens AG Corporate Technology, Germany
Michael C. Jaeger, Siemens AG Corporate Technology, Germany
Achim Luhn, Siemens AG Corporate Technology, Germany
Alexandra Shulman-Peleg, IBM Haifa Research Lab, Israel
Gil Vernik, IBM Haifa Research Lab, Israel
While it is common to use storage in a cloud-based manner, the question of true interoperability is rarely fully addressed. This question becomes even more relevant since the steadily growing amount of data that needs to be stored will quite soon exceed the capacity of a single system in terms of resources, availability, and network throughput. The logical conclusion is that a network of systems needs to be created that is able to cope with the requirements of big data applications and data deluge scenarios. This chapter shows how federation and interoperability will fit into a cloud storage scenario. The authors take a look at the challenges that federation imposes on autonomous, heterogeneous, and distributed cloud systems, and present approaches that help deal with the special requirements introduced by the VISION Cloud use cases from healthcare, media, telecommunications, and enterprise domains. Finally, the authors give an overview of how VISION Cloud addresses these requirements in its research scenarios and architecture.
Chapter 6
SLA Management in Storage Clouds 72
Nikoletta Mavrogeorgi, National Technical University of Athens, Greece
Spyridon V. Gogouvitis, National Technical University of Athens, Greece
Athanasios Voulodimos, National Technical University of Athens, Greece
Vasilios Alexandrou, National Technical University of Athens, Greece
The need for online storage and backup of data constantly increases. Many domains, such as media, enterprises, healthcare, and telecommunications, need to store large amounts of data and access them rapidly at any time and from any geographic location. Storage Cloud environments satisfy these requirements and can therefore provide an adequate solution for these needs. Customers of Cloud environments do not need to own any hardware for storing their data or handle management tasks, such as backups, replication levels, etc. In order for customers to be willing to move their data to Cloud solutions, proper Service Level Agreements (SLAs) should be offered and guaranteed. An SLA is a contract between the customer and the service provider, where the terms and conditions of the offered service are agreed upon. In this chapter, the authors present existing SLA schemas and SLA management mechanisms and compare various features that Cloud providers support with existing SLAs. Finally, they address the problem of managing SLAs in cloud computing environments exploiting the content term that concerns the stored objects, in order to provide more efficient capabilities to the customer.
Chapter 7
Cloud Access Control Mechanisms 94
Ciro Formisano, Engineering Ingegneria Informatica SPA, Italy
Lucia Bonelli, Engineering Ingegneria Informatica SPA, Italy
Kanchanna Ramasamy Balraj, Engineering Ingegneria Informatica SPA, Italy
Alexandra Shulman-Peleg, IBM Haifa Research Lab, Israel
Cloud storage systems provide highly scalable and continuously available storage services to millions of geographically distributed clients. In order for users to trust their data to these systems, they need to be confident that their data is secure. Thus, cloud services should implement an access control mechanism preventing unauthorized access and manipulation of their data. This chapter presents the existing access control mechanisms and describes their advantages and limitations in the Cloud set-up. The authors address the main access control aspects, which include managing identities and defining access policies. Furthermore, they describe more complex cases of identity federation and integration of separate identity silos, which are required in various scenarios, such as collaboration, merger or acquisition, or migration. For each topic, the authors present the existing solutions and describe the motivation for the architecture developed by the VISION Cloud project.
Chapter 8
Compliance in the Cloud 109
Lucia Bonelli, Engineering Ingegneria Informatica, Italy
Luisa Giudicianni, Engineering Ingegneria Informatica, Italy
Angelo Immediata, Engineering Ingegneria Informatica, Italy
Antonio Luzzi, Engineering Ingegneria Informatica, Italy
Despite the huge economic, handling, and computational benefits of cloud technology, the multitenant and geographically distributed nature of clouds hides a large number of security and regulatory issues to be addressed. The main reason for these problems is the unavoidable loss of physical control that customers are forced to accept when opting for the cloud model. This aspect, together with the lack of knowledge (i.e., transparency) of the vendor's infrastructure implementation, poses a difficult problem when customers are asked to respond to audit findings, produce support for forensic investigations, and, more generally, to ensure compliance with information security standards and regulations. Yet, support for security standards compliance is a necessity for cloud providers to overcome customers' hesitancy and meet their expectations. In this context, tracking, auditing, and reporting practices, while transcending the compliance regimes, represent the primary vehicle of assurance for security managers and auditors on the achievement of security and regulatory compliance objectives. The aim of this chapter is to provide a roundup of crucial requirements resulting from common security certification standards and regulations. Then, the chapter gives an overview of approaches and methodologies for addressing compliance coming from the most relevant initiatives on cloud security and a survey of what storage cloud vendors declare to do in terms of compliance. Finally, the SIEM-based approach as a supporting technology for the achievement of security compliance objectives is described, and the architecture of the security compliance component of the VISION Cloud architecture is presented.
Chapter 9
Media Convergence and Cloud Technologies: Smart Storage, Better Workflows 132
Mirko Lorenz, Deutsche Welle, Germany
Linda Rath-Wiggins, Deutsche Welle, Germany
Wilfried Runde, Deutsche Welle, Germany
Alberto Messina, RAI, Italy
Paola Sunna, RAI, Italy
Giorgio Dimino, RAI, Italy
Maurizio Montagnuolo, RAI, Italy
Roberto Borgotallo, RAI, Italy
Why do media organizations look out for cloud storage? In short, the media industry as a whole is facing various challenges. Due to digital convergence there is more material, less time, and multiple channels to fill, while budgets get smaller. TV, video on demand, and mobile content have become big drivers in pushing a search for innovative storage solutions. In addition to that, the opportunity to work with raw data, which can be used for deeper analysis, mapping, visualization, and personalized services is another aspect of why there is a need for novel storage solutions, preferably in the cloud. The media industry could lower production costs and increase speed to market of time critical reporting. This book chapter provides an overview of how far VISION Cloud can provide novel concepts for these demands.
Chapter 10
Telecommunication Industry: Storage and Mobility 145
Fredrik Solsvik, Telenor ASA, Norway
Michel Dao, Orange Labs, France
The operators Telefonica, Orange, and Telenor represent the telecommunication industry in the VISION Cloud project. Together, they provide a telco-oriented use case, which provides feedback and requirements to the work on the reference architecture being developed. The use cases are developed based on the identified challenges and opportunities that relate to storage and mobility technologies. The use cases validate the reference architecture of VISION Cloud based on prototype tests and experimentations that enable the use case to be evaluated in scenarios. Telecommunication industry challenges are being addressed by the advancements made in the VISION Cloud project iterations, which take the inputs from the telco use case and other use cases into consideration. This chapter is a study of the telecommunication industry's challenges and possibilities with respect to the cloud storage technology advancements made in VISION Cloud.
Chapter 11
Data Intensive Enterprise Applications 158
Peter Izsak, SAP Research Israel, Israel
Aidan Shribman, SAP Research Israel, Israel
Today almost all big enterprises act globally, which results in a growing need for a new kind of data analytics. Imagine a company where data from distribution and sales needs to be combined with increasing online sales on multiple platforms and marketing across new social media channels. Here, new real-time analytics using Cloud Computing concepts can open new perspectives. SAP has had a strong presence in the Business Intelligence (BI) market. The company pioneered concepts to collect, combine, and analyze company-wide information. As a result, SAP customers enjoy BI capabilities that are strongly integrated with their SAP operational systems (e.g., ERP, CRM). In recent years, companies have leveraged Cloud Computing as a means for lowering the Total Cost of Ownership (TCO) of various types of business applications that are provided On-Demand. SAP already offers products such as SAP Business ByDesign, which is offered as a Software-as-a-Service (SaaS) On-Demand product. A feature-rich Cloud storage solution such as VISION Cloud enables SAP to integrate new innovations into its On-Demand software portfolio. This chapter describes how VISION Cloud enriches SAP's Instant Business Intelligence analytical On-Demand service.
Chapter 12
Cloud Computing for Earth Observation 166
Roberto Cossu, European Space Agency (ESRIN), Italy
Claudio Di Giulio, European Space Agency (ESRIN), Italy
Fabrice Brito, Terradue, Italy
Dana Petcu, Institute e-Austria, Austria & West University of Timisoara, Romania
This chapter elaborates on the impact and benefits Cloud Computing may have on Earth Observation. Earth Observation satellites in fact generate tera- to petabytes of data, and Cloud Computing provides many capabilities that allow efficient storage and exploitation of such data. Several scenarios related to Earth Observation activities are analyzed in order to identify the possible benefits from the adoption of Cloud Computing. As concrete proofs-of-concept, several activities related to Cloud Computing in the context of Earth Observation are presented and discussed. Technical details are provided for a particular framework used by Earth Observation applications that has made the transition from using Grid services towards using Cloud services. Special attention is given to the avoidance of the vendor lock-in problem.
Chapter 13
Cloud-TM: An Elastic, Self-Tuning Transactional Store for the Cloud 192
Joao Barreto, Technical University Lisbon, Portugal
Pierangelo Di Sanzo, Sapienza Universita di Roma, Italy
Roberto Palmieri, Sapienza Universita di Roma, Italy
Paolo Romano, Technical University Lisbon, Portugal
By shifting data and computation away from local servers towards very large scale, worldwide data centers, Cloud Computing promises very compelling benefits for both cloud consumers and cloud service providers: freeing corporations from large IT capital investments via usage-based pricing schemes, drastically lowering barriers to entry and capital costs; leveraging the economies of scale for both service providers and users of the cloud; facilitating deployment of services; and attaining unprecedented scalability levels. However, the promise of infinite scalability catalyzing much of the recent hype about Cloud Computing is still menaced by one major pitfall: the lack of programming paradigms and abstractions capable of bringing the power of parallel programming into the hands of ordinary programmers. This chapter describes Cloud-TM, a self-optimizing middleware platform aimed at simplifying the development and administration of applications deployed on large scale Cloud Computing infrastructures.
Chapter 14
Storage Security and Technical Challenges of Cloud Computing 225
Shantanu Pal, University of Calcutta, India
Cloud computing has leaped ahead as one of the biggest technological advances of the present time. In the cloud, users can upload or retrieve their desired data from anywhere in the world at any time, making this the most important and primary function in cloud computing technology. While this technology reduces the geographical barriers and improves the scalability in the way we compute, keeping data in a Cloud Data Center (CDC) faces numerous challenges from unauthorized users and hackers within the system. Creating proper Service Level Agreements (SLA) and providing high-end storage security is the biggest barrier to better Quality of Service (QoS) and the implementation of a safer cloud computing environment for the Cloud Service Users (CSU) as well as for the Cloud Service Providers (CSP). Therefore, cloud applications need to have increased QoS and effective security measures and policies set in place to provide better services and to deny unauthorized access. The purpose of this chapter is to examine the cloud computing technology behind innovative business approaches and the establishment of SLAs in cloud computing applications. This chapter provides a clear understanding of different cloud computing security challenges, risks, attacks, and solutions that exist in the present heterogeneous cloud computing environment. Storage security, different cloud infrastructures, and their many advantages and limitations are also discussed.
Chapter 15
Flashing in the Cloud: Shedding Some Light on NAND Flash Memory Storage Systems 241
Jalil Boukhobza, University of Western Brittany, France
Data and storage systems are one of the most important issues to tackle when dealing with cloud computing. Performance, in terms of data transfer and energy cost, predictability, and scalability are the main challenges researchers are faced with, and new techniques for storing, managing, and accessing huge amounts of data are required to make cloud computing technology feasible. With the emergence of flash memories in mass storage systems and the advantages they can provide in terms of speed and power efficiency as compared to traditional disks, one must rethink the storage system architectures accordingly. Indeed, the integration of flash memories is considered a key to leveraging the performance of data-centric computing. The purpose of this chapter is to introduce flash memory storage systems by focusing on their specific architectures and algorithms, and finally their integration into servers and data centers.
Chapter 16
XtreemFS: A File System for the Cloud 267
Jan Stender, Zuse Institute Berlin, Germany
Michael Berlin, Zuse Institute Berlin, Germany
Alexander Reinefeld, Zuse Institute Berlin, Germany
Cloud computing poses new challenges to data storage. While cloud providers use shared distributed hardware, which is inherently unreliable and insecure, cloud users expect their data to be safely and securely stored, available at any time, and accessible in the same way as their locally stored data. In this chapter, the authors present XtreemFS, a file system for the cloud. XtreemFS reconciles the need of cloud providers for cheap scale-out storage solutions with that of cloud users for reliable, secure, and easy data access. The main contributions of the chapter are: a description of the internal architecture of XtreemFS, which presents an approach to building large-scale distributed POSIX-compliant file systems on top of cheap, off-the-shelf hardware; a description of the XtreemFS security infrastructure, which guarantees isolation of individual users despite shared and insecure storage and network resources; a comprehensive overview of replication mechanisms in XtreemFS, which guarantee consistency, availability, and durability of data in the face of component failures; and an overview of the snapshot infrastructure of XtreemFS, which allows capturing and freezing momentary states of the file system in a scalable and fault-tolerant fashion. The authors also compare XtreemFS with existing solutions and argue for its practicability and potential in the cloud storage market.
Compilation of References 286
About the Contributors 309