• No results found

CLOUD COMPUTING: An architectural and technological overview

N/A
N/A
Protected

Academic year: 2021

Share "CLOUD COMPUTING: An architectural and technological overview"

Copied!
5
0
0

Loading.... (view fulltext now)

Full text

(1)

CLOUD COMPUTING:

An architectural and technological overview

Gianmario Motta1, Nicola Sfondrini2, Daniele Sacco3 Department of Industrial and Information Engineering

University of Pavia Via Ferrata 1, Pavia, Italy

1[email protected], 2[email protected], 3[email protected]

Abstract— We present a Systematic Literature Review (SLR) on Cloud Computing that selected 39 papers from first tier journals and conferences in the period 2008-2012. The selective approach captures the relevant points of view on Cloud Computing, namely the overall paradigm in terms of delivery and deployment. Afterwards, we point out the technological issues about networking and data management in Cloud Computing. This SLR extends previous reviews thanks to a longer time period and a wider research sample.

Cloud Computing, Literature Review, Cloud Deployment Model, Cloud Delivery Model, IT management

I. INTRODUCTION

Undoubtedly, Cloud Computing (CC) is a fashionable topic from 2007 onwards. The issue we deal with is about the kind of comprehension we have about CC. In more precise terms we state it as “What cloud computing really is and what level of maturity has reached”. To answer it, we have conducted a Systematic Literature Review (SLR). In the following sections we illustrate its method.

A. The method of Systematic Literature Review (SLR)

SLR is an “explicitly formulated, reproducible and up-to-date summary” [1]. As opposed to narrative reviews, it is based on a specified structured method. Our SLR includes the following steps:

 Formulation of the research question

 Selection of sources and inclusion of primary studies

 Quality assessment and data extraction

 Summary of study result

 Results interpretation.

B. Formulation of the research

The overall research question is, as we said earlier, “What cloud computing really is and what level of maturity has reached”. Once the question is stated, the second point is how to classify articles. This point is discussed by various authors [2] [3]. We have classified research by a two levels grid (Table 1). Each of these research domains is illustrated by a dedicated section in section 2.

TABLE 1-CLASSIFICATION OF CLOUD COMPUTING ISSUES Primary Domains Secondary Domain Content Service Level Management Performance, Security Performance studies to refine workflow setting and security.

Business Issues

Cost, Legal Issues

The economic value and privacy issues.

Other Papers that cover multiple domains

C. Source Selection

A SLR may be complete or selective. We follow the latter approach. We are aware that in this case some relevant ideas may be missed, but analyzing thousands of papers would be frankly infeasible. Our selective SLR has been based on web search engines:

 IEEE Computer Society

 Google Scholar

 ACM Digital Library

We have selected English-written sources given the small number of relevant documents in other languages. Books were not considered, because they often deal many different concepts that may cover too many areas. In short, 39 articles were selected trough three steps:

 An initial set of pertinent studies is extracted from the titles retrieved by search engines by reading title, abstract and introduction.

 Eliminate short papers, English papers, non-international

 Select the final set of papers based on to their adherence to research questions.

Within each of the research perspectives we have mentioned in Table 1, a source may fall in various categories, that we list in Table 2.

TABLE 2-CATEGORIES OF SOURCES

Category Description

Case study An investigation on a single individual, group, incident, community or enterprise Theory Guidelines on or introduction to a particular

(2)

research issue

Survey An investigation on a given topic based on the analysis of a given sample

Simulation A study that introduces simulation methods and related results

Position paper

Presents an opinion about an issue Literature

Review

Profiles the literature related to a given topic II. FINDINGS

We here discuss the positions emerging from the SLR, segmented by domain [2] [4] [5] .

A. The definitions of Cloud Computing (CC)

Probably the most popular definition in business environment, NIST states that CC “is a model for enabling convenient, on-demand network access to a shared pool of configurable computing resources (e.g. networks, servers, storage, applications, and services) that can be rapidly provisioned and released with minimal management effort or service provider interaction” [6]. Definitions by CC or IT vendors reflect their specific viewpoints rather than a common understanding.

Amazon’s definition is close to NIST, since their business is on CC services and not on hardware or software products: “Cloud Architectures are designs of software applications that use Internet-accessible on-demand services. Applications built on Cloud Architectures are such that the underlying computing infrastructure is used only when it is needed (for example to process a user request), draw the necessary resources on-demand (like compute servers or storage), perform a specific job, then relinquish the unneeded resources and often dispose themselves after the job is done. While in operation the application scales up or down elastically based on resource needs” [7].

According to Microsoft CC is rather a brilliant future; “[it] represents the platform for the next generation of business. CC is driving the transformation of the IT industry across the entire stack: hardware model delivering incredibly powerful and efficient hardware at a fraction of the cost; application model allowing developers to rapidly create highly available secure cloud applications; operations model keeping cloud applications available 24X7 with 9-to-5 management” [8]. In turn, Sun addresses architecture implications; actually “[it is a set of] services that are encapsulated, have an API, and are available over the network. This definition encompasses using both compute and storage resources as services” [9].

In more general terms, researchers regard CC as an evolution of a set of existing technologies [10], that is driven by enabling technologies as (a) Virtualization, (b) Monitoring, metering and orchestration of service flow, (c) Web service and SOA [11]. Although CC is a new technological approach, it combines several previous technologies [6] [10]. Some sources relate it to previous paradigms as Grid Computing, Utility Computing, Cluster Computing and Service Oriented Architecture. Due to this

relation and to the excitement in business, a multitude of points of view came out [10] [12].

Authors agree that CC is driving substantial changes to the use of IT resources, enabling users to exploit all available resources through Internet with a standard interface [13] and that it transforms corporate IT to something that is operated by a third party [14]. Other researchers consider CC as a killer technology that absorbed the best features of previous technologies and replaces them completely. In fact, the input-output feature of CC resembles the service computing one and the storage feature is quite similar to the pervasive computing one [15]. Finally, researchers highlight virtualization and, from the economic point of view, the payment system “pay-as-you-go” [16].

1) Common characteristics

In short, common characteristics of CC can be stated as follows.

On-demand self-service provisioning: Computing

resources can be acquired and used at anytime and anywhere allowing customizing the computing environments without the need for a direct service provider intervention. The user with administrative privileges has complete control of the configuration parameters and can set the desired characteristics of computing power, storage and network.

Resource pooling: Cloud service Providers share their resources with multiple users but they are configured to appear as a unique platform for individual end users. This multi-tenancy model allows to dynamically assign to several users physical and virtual resources according to their request without providing specific information on their position [17].

Rapid elasticity and scalability: The scalability is one of the most important feature of CC and allows to acquire more resources during a peak of demand and release them once they are no longer required. In this way the user has the illusion of infinite resources that can be used at the precise moment they become to be necessary for the proper service delivery [18] [19].

Broad network access: Resources are available over Internet through standard interfaces that allow the access from several devices such as PDA, laptops, desktops and mobile phones. For this reason, a key factor lies in the stability and availability of bandwidth that is guaranteed just from few years on a global scale [20].

Measured service: Cloud services are automatically managed and metered using appropriate monitoring and reporting systems for providing transparency for both the provider and end user. The use of specific metrics has allowed billing based on real usage of resources transforming, in this way, the IT in a commodity.

2) Delivery Models

Cloud providers deliver on three service layers: infrastructure, platform, and application.

Software as a Service (SaaS): It exists from long time and even anticipates CC. It can be considered as the basic idea behind the cloud and epitomized as “all-software in the end” [21]. A SaaS is an implementation of a business

(3)

application that (a) is hosted in a cloud infrastructure and (b) is provided to customers through the Internet. This approach eliminates, for end users, the need to install and run the applications on their computers and reduces the cost of software by on-demand pricing [10]. SaaS is on the top of the service stack and customers cannot control the underlying infrastructure and application platforms. In fact, the freedom of the end users is limited to the execution of the application and to certain user-specific configuration [22].

Platform as a Service (PaaS): A recent addition, it encloses everything for the whole software engineering lifecycle, from programming to deployment [23]. The potential consumers of a PaaS service are software developers and testers, who are able to use infrastructure and do not need a tight control on it [6]. This level includes two separate sections: (a) programming environment, with tools to support development and debugging, and (b) execution environment to deploy final applications [24]. Most PaaS vendors lock developers into their development platforms, and prevent direct communication with lower computing infrastructures or provide APIs with limited functionalities [25].

Infrastructure as a Service (IaaS): It is the lowest level of the service stack and contains all physical and virtual resources used to construct the cloud. Cloud infrastructure services are based on virtualized platforms, that are an evolution of the virtual private server, and offer to customers the possibility to buy on consumption whole range of physical resources, software and storage [22]. In this way IaaS services avoid procurement, capacity planning, installation and configuration and reduce costs and time to deploy a system [24]. The consumers are not able to control the underlying cloud infrastructure but have a complete control on operating systems, storage capacity and deployed applications [6]. The high freedom and the wide possibilities of configuration led some researchers to divide this layer into sub layers related to the resources they depend on (a) computational resources, (b) data storage (c) communications [26].

3) Deployment Models

Each level of the cloud technology can be by the following models.

Private cloud: it is operated solely for one organization. A private cloud may be built and managed by the organization or external providers and may exist on or off premise [6]. Private clouds use virtualization and automated management technologies to ensure an high control on performance, reliability and security [23]. Furthermore, a local data center reduces administrative and management tasks. This view has often been criticized for its similarity to the traditional proprietary server farm and the need to incur initial capital costs, a stark contrast to the basic philosophy of CC.

Public cloud: it provides computing services to public or industry group and it is owned by an external CSP [6]. Public clouds do not require to invest initial capital on infrastructure and shift risks to infrastructure providers. However, it implies a lack of control on data, network and

security settings, which hampers their effectiveness and spread in many business scenarios [23].

Community cloud: Multiple organizations with common

requirements, as mission, policy and compliance, share cloud infrastructures across administrative domains. The infrastructure constituents of a community cloud can be managed by a partner organization and may exist on or off premise [6]. An alternative is the Virtual Private Cloud (VPC) that is essentially a virtual platform running on top of public clouds, that is shared with other companies [23].

Hybrid cloud: It is a blended deployment model, with a composition of two or more clouds (private, community, or public) that are handled as a unique entity by a standardized technology that enables data and application portability. Generally, it addresses the limitations of the previous approaches [6]. A hybrid cloud is commonly used to offload processes and data from the company private cloud to a reliable public cloud. This deployment model offers better flexibility, control and security over application data than public clouds but it requires a careful study to determine the best split of the components between public and private cloud [23].

Figure 1 summarizes the delivery and deployment models and explains how delivery models operate on deployment models. Within the delivery model we have shown the ways of nesting systems that ensure flexibility and efficiency.

Figure 1. Delivery and Deployment models B. Technological Issues

CC uses several current technologies and therefore perpetuate their issues. On the other side, the innovative service characteristics of CC pushed researchers to analyze the quality of service. By looking into cloud computing as a white box, researchers have identified potential critical areas: network configuration and data management.

1) Networking

In cloud computing two different networks coexist: (a) a logical network, made of virtual machines, that are deployed on physical servers interconnected by logical links, and (b) a physical network, made of physical servers and physical networking devices. This dual network requires to map optimized flows on the physical network and to allocate bandwidth to meet communication requirements. One approach is based on a “1-cuts” algorithm for node

(4)

assignment [27]. However, dynamics of resources looks too complex to be managed by predictions based on simulations. Therefore, a system of voluntary collaboration between resources is necessary, as suggested by the principle Catallaxy. The agents dynamically allocate resources by routing traffic demands in an optimal way, lowering cost and optimizing performance [28] [29]. Other solutions for agility and flexibility in cloud computing propose a virtualized optical network [30] or an implementation of a reliable multicast service to reduce bandwidth on links while ensuring required reliability and flexibility [31].

2) Data Management

CC uses a layered storage area to run computing cycles and redundancy replication to ensure high availability. Moreover, the flexible architecture of CC provides, ideally, an infinite amount of resources, that allow data-intensive flows otherwise unapproachable [32]. Invisible to cloud users, this structure ensures a stable, reliable and massively parallel computing engine. Some researchers have directed their efforts to create a distributed programming framework that supports simplified data-parallel applications. The implementation of this framework ensures long-term storage for large amounts of data and provides a natural data-parallel computing framework that operates the distributed datasets by assigning a group of operators to each element in datasets [33] [34]. The structure of CC guarantees optimal management of data, but raises critical issues in data mining. In mining methods and algorithms should exploit available resources and avoid unnecessary redundant operations. Some researchers have identified customizable templates, assembled recursively and capable to handle a wide class of SQL data mining queries [35] [36]. In addition to traditional mining, CC has fostered new capabilities in automatic data management and content interpretation. Most popular search engines use CC to parse file contents in the attempt of creating a semantic web [37].

III. CONCLUSION

Our SLR has illustrated the current academic research landscape on Cloud Computing (CC) and has highlighted relevant trends from a technological viewpoint. In our review, we found different definitions that still show a conceptual uncertainty. Almost all articles, even if classified in different domains, have addressed technological issues related to the adoption of cloud systems. These issues are developed, in most cases, from the provider’s perspective, while research should still cover the quality of the service from the end-user’s viewpoint [38]. Relevant work on the effective use of PaaS, that might be a strategic element in the future competition of vendors, is still scarce. Nor CC standards have been really defined, as unified APIs, to ensure systems interoperability.

New ways of approaching IT are coming into common usage through software allowing dynamic data storage (e.g. Dropbox), collaborative writing (e.g. Google documents) and many others [39].

Finally, CC has the potential to introduce a new vision of globalization, transforming it into the largest reorganization

of the world since the Industrial Revolution, but only the evolution of next years, will show the real route taken by this technology today still not completely mature.

REFERENCES

[1] M. Egger, G. D. Smith, and K. O’Rourke, “Rationale, potentials, and promise of systematic reviews”, Systematic reviews in health care: Meta-analysis in context, pp. 3-19, 2001.

[2] H. Yang and M. Tate, Where are we at with cloud computing?: a descriptive literature review, 2009.

[3] I. Sriram and A. Khajeh-Hosseini, “Research agenda in cloud technologies”, arXiv:1001.3259,. 2010.

[4] A. Khajeh-Hosseini, I. Sommerville, and I. Sriram, “Research challenges for enterprise cloud computing”, arXiv:1001.3257, 2010. [5] G. Motta and N. Sfondrini, “Research studies on cloud computing: a

systematic literature review”, 17th International Business Information Management Association Conference (IBIMA), 2011.

[6] P. Mell and T. Grance, The NIST definition of cloud computing. National Institute of Standards and Technology, 53, 2009.

[7] J. Varia, Cloud architectures, White Paper of Amazon, jineshvaria. s3, 2008.

[8] D. Chappell, Introducing the Windows Azure platform. Retrieved May, 30, 2010.

[9] J. Carolan, S. Gaede, J. Baty, G. Brunette, A. Licht, J. Remmell, L. Tucker, and J. Weise, Introduction to Cloud Computing architecture-White Paper, 2009.

[10] L. Wang, J. Tao, M. Kunze, A. C. Castellanos, D. Kramer, and W. Karl, “Scientific cloud computing: Early definition and experience”, IEEE, pp. 825-830, 2008.

[11] M. A. Vouk, “Cloud computing–Issues, research and implementations”, Journal of Computing and Information Technology, 16, pp. 235-246, 2008.

[12] K. Hwang, “Massively distributed systems: From grids and p2p to clouds”, Advances in Grid and Pervasive Computing, pp. 1, 2008. [13] J. Voas and J. Zhang, “Cloud Computing: New Wine or Just a New

Bottle?”, IT professional, 11, pp. 15-17, 2009.

[14] I. Foster, Y. Zhao, I. Raicu, and S. Lu, “Cloud computing and grid computing 360-degree compared”, IEEE, pp. 1-10, 2008.

[15] L. Mei, W. Chan, and T. Tse, “A tale of clouds: paradigm comparisons and some thoughts on research issues”, IEEE, pp. 464-469, 2008.

[16] L. M. Vaquero, L. Rodero-Merino, J. Caceres, and M. Lindner, “A break in the clouds: towards a cloud definition”, ACM SIGCOMM Computer Communication Review, 39, pp. 50-55, 2008.

[17] Z. Hengxi, L. Chunlin, S. Zhengjun, and Z. Xiaoqing, “Resource Pool-Oriented Resource Management for Cloud Computing”, Business, Economics, Financial Sciences, and Management, pp. 829-832, 2012.

[18] M. Armbrust, A. Fox, R. Griffith, A. D. Joseph, R. Katz, A. Konwinski, G. Lee, D. Patterson, A. Rabkin, and I. Stoica, “A view of cloud computing”, Communications of the ACM, 53, pp. 50-58, 2010.

[19] J. Ejarque, J. Álvarez, R. Sirvent, and R. M. Badia, “Resource Allocation for Cloud Computing: A Semantic Approach” Open Source Cloud Computing Systems: Practices and Paradigms, 90, 2012.

[20] J. Fiaidhi, I. Bojanova, J. Zhang, and L. J. Zhang, “Enforcing Multitenancy for Cloud Computing Environments” IT professional, pp. 16-18, 2012.

[21] H. Erdogmus, “Cloud computing: does nirvana hide behind the nebula?”, IEEE software, pp. 4-6, 2009.

(5)

[22] C. Hoefer and G. Karagiannis, Taxonomy of cloud computing services, 2010.

[23] Q. Zhang, L. Cheng, and R. Boutaba, “Cloud computing: state-of-the-art and research challenges”, Journal of Internet Services and Applications, 1, pp. 7-18, 2010.

[24] A. Lenk, M. Klems, J. Nimis, S. Tai, and T. Sandholm “What's inside the Cloud? An architectural map of the Cloud landscape”, IEEE Computer Society, pp. 23-31, 2009.

[25] R. L. Grossman, “The case for cloud computing”, IT professional, 11, pp. 23-27, 2009.

[26] L. Youseff, M. Butrico, and D. Da Silva, “Toward a unified ontology of cloud computing”, IEEE, pp. 1-10, 2008.

[27] Y. Hou, M. Zafer, K. Lee, D. Verma, and K. K. Leung “On the mapping between logical and physical topologies”, IEEE, pp. 1-10, 2009.

[28] W. Streitberger and T. Eymann, “A simulation of an economic, self-organising resource allocation approach for application layer networks”, Computer Networks, 53, pp. 1760-1770, 2009.

[29] S. Pallickara, J. Ekanayake, and G. Fox, “An Overview of the Granules Runtime for Cloud Computing”, IEEE, pp. 412-413, 2008. [30] B. G. Chun and P. Maniatis, “Augmented smartphone applications

through clone cloud execution”, USENIX Association, pp. 8, 2009. [31] M. Matos, A. Sousa, J. Pereira, R. Oliveira, E. Deliot, and P. Murray,

“CLON: Overlay networks and gossip protocols for cloud environments”, On the Move to Meaningful Internet Systems: OTM, pp. 549-566, 2009.

[32] X. Llorà, B. Ács, L. S. Auvil, B. Capitanu, M. E. Welge, and D. E. Goldberg, “Meandre: Semantic-driven data-intensive flows in the clouds”, IEEE, pp. 238-245, 2008.

[33] Y. Gu and R. Grossman, “Exploring data parallelism and locality in wide area networks”, IEEE, pp. 1-10, 2008.

[34] R. L. Grossman, Y. Gu, M. Sabala, and W. Zhang, “Compute and storage clouds using wide area high performance networks”, Future Generation Computer Systems, 25, pp. 179-183, 2009.

[35] J. L. Johnson, “SQL in the Clouds” Computing in Science & Engineering, pp. 12-28, 2009.

[36] R. Grossman and Y. Gu, “Data mining using high performance data clouds: experimental studies using sector and sphere”, ACM, pp. 920-927, 2008.

[37] P. Mika and G. Tummarello, “Web semantics in the clouds”, IEEE Intelligent Systems, pp. 82-87, 2008.

[38] M. Pastaki Rad, A. Sajedi Badashian, G. Meydanipour, M. Ashurzad Delcheh, M. Alipour, and H. Afzali, “A Survey of Cloud Platforms and Their Future”, Computational Science and Its Applications– ICCSA, pp. 788-796, 2009.

[39] G. Motta and N. Sfondrini, “Cloud computing and enterprises: a survey”, The 2nd International Conference on Computer and Management (CAMAN), 2012.

References

Related documents