Technological Comparison Between Cloud and Grid

International Journal of Emerging Technology and Advanced Engineering

Website: www.ijetae.com (ISSN 2250-2459, ISO 9001:2008 Certified Journal, Volume 3, Issue 4, April 2013)

Satheesh Kavuri 1, GangadharaRao Kancharla 2, Basaveswara Rao Bobba 3

1 Dhanekula Institute Of Engineering & Technology, Vijayawada, INDIA

2 Dept. Of Computer Science & Engineering, Acharya Nagarjua University, Guntur, INDIA

3 Computer Center, Acharya Nagarjua University, Guntur, INDIA

Abstract-- The objective of this paper is to highlight the differences between cloud and grid computing in terms of both performance and functionality. The development of the computer industry has been driven by progress in distributed computing, parallel computing, grid computing, and cloud computing. Grid and cloud computing are emerging technologies that are advancing rapidly, and there is strong debate over how they relate to each other and how they perform. This paper covers the cloud deployment models and service types and the similarities and differences between cloud computing and grid computing; it also discusses where cloud computing improves on grid computing and notes the problems common to both. It presents a side-by-side comparison of grid and cloud computing services and lists areas of future research relevant to them. A close comparison should help the two communities understand each other more effectively.

Keywords -- cloud computing; grid computing; comparison; Cloud Deployment models; similarities

I. INTRODUCTION

The evolution of cloud computing is one of the major advances both in computing and in the economics of computing technology. Since the start of the 21st century, computer networks have developed rapidly, and information technology is increasingly blended into daily life. The concept of cloud computing was jointly proposed by Google and IBM in 2007 [2]. The development of the computer industry has been driven by progress in distributed computing, parallel computing, and grid computing, and the cloud computing movement has risen out of that progress. This study describes the types of cloud computing services and deployment models and the similarities and differences between cloud computing and grid computing; it also discusses where cloud computing improves on grid computing and notes the problems common to both computing models, along with some security issues.

Cloud computing is a cost-efficient computing approach in which consumers access records, information, and applications from a Web browser.

Cloud computing can also be termed dynamic computing because it provides resources dynamically, when they are required.

There are not many "exclusive" differences between cloud computing and grid computing. After all, both are used to maximize resources, both have elements that interact with each other, and both are meant to provide the user with a simplified presentation of services. It is easy to confuse them. To understand what makes each unique, we need to look at the way in which tasks are computed within each environment.

Criteria for adoption of cloud: The criteria [1] that should be considered when adopting the cloud are listed below.

1. Size of IT resources

2. Utilization of existing resources

3. Sensitivity of data

4. Availability of services

5. Data lock-ins and data transfers

6. Performance unpredictability

7. Software licensing

8. Legal terms and conditions

This paper is organized as follows. Section II describes the progression from distributed systems to the cloud by way of grids. Section III gives a general introduction to grid computing, including its characteristics, services, applications, and pros and cons. Section IV gives a general introduction to cloud computing along the same lines. Section V presents the similarities and differences between grid computing and cloud computing. Section VI concludes the paper and is followed by the references.

II. FROM DISTRIBUTED COMPUTING TO CLOUD COMPUTING THROUGH GRID COMPUTING

This concept is borrowed by both cloud and grid computing. Various enterprises adopt different architectures for distributed computing, but here we focus mainly on grid and cloud computing. Cloud computing eliminates the need for organizations to build and maintain expensive data centers. It enables organizations to stand up new systems quickly and easily. Its pay-as-you-go rental model allows organizations to defer costs. It increases business continuity by providing inexpensive disaster-recovery options. It reduces the need for organizations to maintain a large IT staff [12]. The following figure illustrates the computing paradigm shift of the last half century.

Specifically, the figure identifies six distinct phases. In Phase 1, users used terminals to connect to powerful mainframes shared by many users. In Phase 2, stand-alone personal computers (PCs) became powerful enough to satisfy users' daily work, but there was no data sharing. In Phase 3, computer networks allowed multiple computers to connect and communicate with each other. In Phase 4, users could communicate across many different types of networks, forming a global network. Phase 5 brought the electronic grid to facilitate shared computing power and storage resources (distributed computing); people used PCs to access a grid of computers in a transparent manner. Now, in Phase 6, cloud computing lets us exploit all available resources on the Internet in a scalable and simple way.

Fig 1: Computing paradigm shift

(source - ComputationWorld'12, Nice 7/23/2012)

In general, cloud computing overlaps with many existing technologies, such as distributed computing, cluster computing, utility computing, and grid computing. The cloud rests on the same basic idea as the grid, and grid computing is the backbone of cloud computing. The following figure gives an overview of the relationship between clouds and the other computing domains they overlap with. Web 2.0 covers almost the whole spectrum of service-oriented applications, with cloud computing lying at the large-scale end. Grid computing overlaps with all of these fields and is generally considered to be of lesser scale than supercomputers and clouds.

Web 2.0 technologies allow functionally complex web applications to run in browsers without loading applications onto local PCs [15]. Internet services that provide access to data through special program interfaces, allowing data to be processed online, appear to be extremely useful as well.

The Evolution of Grid Computing to Cloud Computing:

Fig 3: Foundation of Cloud Computing

Source: Adapted from IBM (2009) White Paper—Seeding the Clouds: Key Infrastructure Elements for Cloud Computing, February 2009 (p. 6).

III. GRID COMPUTING

In the late 1990s and early 2000s, the idea of grid computing, a type of distributed computing that harnesses the power of many computers to handle large computational tasks, was all the rage, at least among organizations with high-performance computing (HPC) needs. Grid computing requires software that can divide a program and farm out its pieces, as one large system image, to several thousand computers. One concern about grids is that if one piece of the software on a node fails, other pieces of the software on other nodes may fail too. Large system images, and the hardware needed to operate and maintain them, can contribute to large capital and operating expenses.

Grid computing provides consistent, inexpensive access to computational resources (supercomputers, storage systems, data sources, instruments, and people) regardless of their physical location or access point. A grid is basically a system [4] that uses the processing capabilities of different computing units to process a single task. The task is managed by a central computing machine, which divides it into numerous subtasks and has them processed by different computing machines in the cluster. As soon as the machines complete their subtasks, they send the results back to the central machine, which controls all the tasks. The results are then combined and a single output is produced.
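The divide, distribute, and combine flow described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the API of any grid middleware: the function names (`split_task`, `process_subtask`, `run_on_grid`) are hypothetical, a local thread pool stands in for the remote machines, and a sum of squares stands in for the real workload.

```python
from concurrent.futures import ThreadPoolExecutor


def split_task(numbers, parts):
    """Central machine: divide one task into roughly equal subtasks."""
    size = max(1, -(-len(numbers) // parts))  # ceiling division
    return [numbers[i:i + size] for i in range(0, len(numbers), size)]


def process_subtask(chunk):
    """Worker machine: process one subtask (here, a sum of squares)."""
    return sum(x * x for x in chunk)


def run_on_grid(numbers, workers=4):
    """Farm out the subtasks, collect partial results, combine them."""
    subtasks = split_task(numbers, workers)
    # Assumption: threads stand in for separate machines in the grid.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partial_results = list(pool.map(process_subtask, subtasks))
    return sum(partial_results)  # single combined output


if __name__ == "__main__":
    print(run_on_grid(list(range(1, 101))))
```

For the integers 1 to 100, the four partial sums combine to 338350, the same value a single machine would compute on its own.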

Fig 4: A grid is a collection of machines, sometimes referred to as “nodes,” “resources,” “members,” “donors,” “clients,” “hosts,” “engines,” and many other such terms.

They all contribute any combination of resources to the grid as a whole. Some resources may be used by all users of the grid while others may have specific restrictions.

The technology is also more cost-effective, enabling better use of critical funds. Standardized protocols [11] and tools still need to be agreed upon. As a result, the concept is still being perfected. Many agree, though, that the potential for grid computing systems is limitless.

Real World Examples of Grid Computing: the SETI@Home (Search for Extraterrestrial Intelligence) project, BOINC (Berkeley Open Infrastructure for Network Computing), Einstein@Home, GIMPS (Great Internet Mersenne Prime Search), LHC@home, World Community Grid, e-RDA, etc. [9]

Characteristics of Grid Computing: A complete definition of a grid can be built from its main characteristics [3], which may be described as follows.

Large scale: a grid must deal with a number of resources ranging from just a few to millions, which is a serious problem in its own right.

Heterogeneity: Heterogeneous computing systems comprise growing numbers of increasingly diverse computing resources that can be local to one another or geographically distributed.

Resource sharing: resources in a grid belong to many different organizations that allow other organizations (i.e. users) to access them.

Multiple administrations: Each organization may apply its own administrative policies governing how the resources it owns can be accessed and used. This leads to security problems.

Resource coordination: Computational grids are large-scale high-performance distributed computing environments that provide dependable, consistent, and pervasive access to high-end computational resources.

Transparent access: A computational grid is a hardware and software infrastructure that provides dependable, consistent, pervasive, and inexpensive access to high-end computational capabilities.

Dependable access: a grid must assure the delivery of services under established Quality of Service (QoS) requirements.

Consistent access: a grid must be built with standard services, protocols, and interfaces, thus hiding the heterogeneity of the resources while allowing for scalability.

Pervasive access: the grid must grant access to available resources by adapting to a dynamic environment in which resource failure is commonplace.

Grid Types: Grids can be categorized into three broader categories that focus on the functional aspect:

Data Grid: A data grid is a grid computing system that deals with data i.e. the controlled sharing and management of large amounts of distributed data. Data Grid is the storage component of a grid environment. Scientific and engineering applications require access to large amounts of data, and often this data is widely distributed. A data grid provides seamless access to the local or remote data required to complete compute intensive calculations.

Examples: Biomedical Informatics Research Network (BIRN) and the Southern California Earthquake Center (SCEC).

Computational Grid: A computational grid is a hardware and software infrastructure that provides dependable, consistent, pervasive, and inexpensive access to high-end computational capabilities.

Example: Science Grid (US Department of Energy)

Scavenging grid: A scavenging grid is most commonly used with large numbers of desktop machines, whose owners are usually given control over when their resources are available to participate in the grid.

Example: SETI (Search for Extraterrestrial Intelligence)@Home

Grid Services: Grid services are software components that provide access to a set of grid resources such as data sources, high-performance equipment, and computational resources. A service stresses interoperability and may be dynamically discovered and used. According to OGSA, the service abstraction may be used to specify access to computational resources, storage resources, and networks in a unified way. Computational grids are composed of a number of heterogeneous resources, which may be owned and managed by different administrators. Each of these resources may offer one or more services. A service may be:

• A single application with a well-defined API.

• A single application used to access services on other resources managed by a different systems administrator.

• A collection of coupled applications, with pre-defined interdependencies between elements of the collection. Each element provides a sub-service that must be combined in a particular order.

• An interface for managing access to a resource.

Grid computing has great potential, but there are still features [39][16] waiting to be implemented. The following disadvantages have been identified.

• The results of all processes are first sent to all nodes within the grid and then collaboratively assessed. Until the final assessment is made, it is not possible to define or declare a final outcome. This is a serious problem for time-sensitive projects.

• Grid computing requires an advanced infrastructure: small servers and fast connections between the servers. To maximize the potential of that infrastructure, it also requires quality tools and software, and skilled technicians to manage the grid.

• There is also a problem with people who do not want to share their resources, despite the fact that everybody involved in the resource sharing would benefit.

• Grids are complicated to build and use, and users currently require some level of expertise.

IV. CLOUD COMPUTING

Cloud computing is neither network computing nor centralized computing. In centralized computing, the documents or applications are located on a single server and are accessed through an organization's local private network. [1]

Cloud computing can be defined as a data service, software, and storage service in which the end user is not aware of the physical location and system configuration that deliver the services. In cloud computing, an application does not access the resources it requires directly; rather, it accesses them through something like a service. "Cloud Computing is a paradigm in which information is permanently stored in servers on the internet and cached temporarily on clients that include desktops, entertainment centers, table computers, notebooks, wall computers, handhelds, sensors, monitors, etc." [12]

At one end is Software as a Service (SaaS), which allows a pay-per-use model to be applied to acquiring software with no on-premise hosting. Platform as a Service (PaaS) provides an elastically scalable compute platform, including middleware, to which applications can be deployed. Finally, Infrastructure as a Service (IaaS) enables servers, storage, and networks to be provisioned on demand and on a pay-as-you-use basis, with self-administration of the infrastructure.

Cloud computing is dynamically scalable because users consume only the amount of online computing resources they actually want. Cloud computing is the use of a third-party service (Web services) to meet computing needs.

Cloud computing is a model for enabling convenient, on-demand network access to a shared pool of configurable computing resources (e.g., networks, servers, storage, applications, and services) that can be rapidly provisioned and released with minimal management effort or service provider interaction.

This cloud model promotes availability and is composed of five essential cloud characteristics, three service models and four deployment models.

The U.S. National Institute of Standards and Technology (NIST) has a set of working definitions that separate cloud computing into service models and deployment models. Those models and their relationship to the essential characteristics of cloud computing are shown in the following figure.

In general the cloud model is composed of five essential characteristics, three service models, and four deployment models.

Essential Characteristics of cloud model

• On-demand self-service: A consumer can unilaterally provision computing capabilities, such as server time and network storage, as needed automatically without requiring human interaction with each service provider.

• Broad network access: Capabilities are available over the network and accessed through standard mechanisms that promote use by heterogeneous thin or thick client platforms.

• Resource pooling: The provider's computing resources are pooled to serve multiple consumers, with different physical and virtual resources dynamically assigned and reassigned according to consumer demand.

• Rapid elasticity: Capabilities can be rapidly and elastically provisioned, in some cases automatically, to quickly scale out, and rapidly released to quickly scale in.

• Measured service: Cloud systems automatically control and optimize resource use by leveraging a metering capability at some level of abstraction appropriate to the type of service.
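Rapid elasticity and measured service can be illustrated together with a small simulation. The sketch below is an assumption-laden toy, not a provider's autoscaling API: the function names, the capacity of 100 requests per instance, and the price of 10 cents per instance-hour are all hypothetical values chosen for illustration.

```python
import math


def instances_needed(load, capacity_per_instance, min_instances=1):
    """Rapid elasticity: smallest instance count that covers the load."""
    return max(min_instances, math.ceil(load / capacity_per_instance))


def simulate(hourly_load, capacity_per_instance=100, cents_per_instance_hour=10):
    """Scale out and in each hour while metering instance-hours.

    Returns (instances per hour, total instance-hours, total cost in cents).
    The capacity and price defaults are illustrative assumptions.
    """
    schedule = []
    instance_hours = 0
    for load in hourly_load:
        n = instances_needed(load, capacity_per_instance)
        schedule.append(n)
        instance_hours += n  # measured service: one instance for one hour
    return schedule, instance_hours, instance_hours * cents_per_instance_hour


if __name__ == "__main__":
    print(simulate([50, 250, 900, 120]))
```

With the hourly loads 50, 250, 900, and 120, the pool grows from 1 to 9 instances and shrinks back to 2, and the consumer is billed for 15 instance-hours rather than for 9 instances held for all four hours.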

It is a new way for small and medium enterprises to access and use software; it is also a new way for vendors to sell their products as services. Cloud-based software is available almost immediately, it does not have to be fully customized to the user's environment, and users do not need extensive in-house IT staff or equipment dedicated to it. For users looking for a cost-effective, efficient way to deploy business software, cloud computing may be the answer. The National Institute of Standards and Technology (NIST) has defined four cloud computing deployment models: private cloud, community cloud, public cloud, and hybrid cloud.

Private Cloud: The cloud infrastructure is operated solely for an organization. A private cloud is a proprietary computing architecture, owned or leased by a single organization, which provides hosted services behind a firewall to "customers" within the organization. Private clouds improve average server utilization and allow the use of low-cost servers and hardware while providing higher efficiencies.

Examples of Private Cloud:

• Eucalyptus

• Ubuntu Enterprise Cloud - UEC (powered by Eucalyptus)

• Amazon VPC (Virtual Private Cloud)

• VMware Cloud Infrastructure Suite

• Microsoft ECI data center

Two private cloud scenarios are as follows:

On-site Private Cloud: Applies to private clouds implemented at a customer's premises. The private cloud may be centralized at a single subscriber site or may be distributed over several subscriber sites.

Fig 7: On-site Private Cloud

Outsourced Private Cloud: Applies to private clouds where the server side is outsourced to a hosting company. As shown in the figure, an outsourced private cloud has two security perimeters, one implemented by a cloud subscriber (on the right) and one implemented by a provider (on the left).

Fig 8: Out-sourced Private Cloud

Community Cloud: The cloud infrastructure is shared by several organizations and supports a specific community that has shared concerns. In our view, community clouds are going to be one of the most popular solutions in the coming months. A community cloud has two possible scenarios:

Fig 9 : On-site Community Cloud

On-site Community Cloud Scenario: Applies to community clouds implemented on the premises of the customers composing a community cloud.

Outsourced Community Cloud: Applies to community clouds where the server side is outsourced to a hosting company.

Examples of Community Cloud:

• Google Apps for Government

• Microsoft Government Community Cloud

Public Cloud: Organizations run their applications from a data center provided by the cloud provider, who is responsible for providing the infrastructure, servers, storage, and networking necessary to ensure the availability and scalability of the applications. Public clouds are also called provider clouds. They are the most ubiquitous kind of cloud, and the term is almost a synonym for cloud computing. Subscribers connect to providers via the public Internet.

Examples of Public Cloud:

• Google App Engine

• Microsoft Windows Azure

• IBM Smart Cloud

• Amazon EC2

The following figure is essentially similar to the private cloud diagram, except that it shows a subscriber facility implementing a security perimeter.

Fig 11: Public Cloud

Hybrid Cloud: A hybrid cloud is a composition of at least one private cloud and at least one public cloud. The cloud infrastructure is a composition of two or more clouds (private, community, or public) that remain unique entities but are bound together by standardized or proprietary technology that enables data and application portability (e.g., cloud bursting for load-balancing between clouds). For example, an organization might use a public cloud service, such as Amazon Simple Storage Service (Amazon S3) for archived data but continue to maintain in-house storage for operational customer data.

Ideally, the hybrid approach allows a business to take advantage of the scalability and cost-effectiveness that a public cloud computing environment offers without exposing mission-critical applications and data to third-party vulnerabilities.
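The cloud bursting idea mentioned above can be sketched as a simple placement rule. This is a hypothetical illustration rather than any vendor's mechanism: the function name `place_requests`, the tuple layout, and the policy of never sending sensitive work to the public cloud are assumptions made for the example.

```python
def place_requests(requests, private_capacity):
    """Fill the private cloud first; burst the overflow to a public cloud.

    `requests` is a list of (name, units, sensitive) tuples. In this sketch,
    sensitive work never leaves the private cloud; if it does not fit there,
    it is reported as rejected.
    """
    placement = {"private": [], "public": [], "rejected": []}
    free = private_capacity
    # Sensitive work is placed first so it gets first call on private capacity.
    for name, units, sensitive in sorted(requests, key=lambda r: not r[2]):
        if units <= free:
            placement["private"].append(name)
            free -= units
        elif not sensitive:
            placement["public"].append(name)  # cloud bursting
        else:
            placement["rejected"].append(name)
    return placement


if __name__ == "__main__":
    demo = [("archive", 40, False), ("billing", 30, True),
            ("reports", 50, False), ("crm", 30, True)]
    print(place_requests(demo, private_capacity=80))
```

In the demo, the two sensitive workloads take 60 of the 80 private units, and the two non-sensitive workloads, which no longer fit, burst to the public cloud.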

Examples of Hybrid Cloud:

• Windows Azure (capable of Hybrid Cloud)

• VMware vCloud (Hybrid Cloud Services)

Fig 12: Hybrid Cloud

The Cloud Computing Stack/Service model in Cloud: Many business organizations are looking to cloud computing services to reduce the cost and complexity of their business infrastructure and its maintenance. The term application service provider (ASP) has been replaced by "on-demand software" and "software as a service." Cloud computing is often described as a stack, in response to the broad range of services built on top of one another under the moniker "Cloud".

The diagram below depicts the cloud computing stack. It shows three distinct categories within cloud computing: Software as a Service, Platform as a Service, and Infrastructure as a Service. These three fundamental classifications are often referred to as the SPI model, where SPI refers to Software, Platform, or Infrastructure (as a Service), respectively.

The goal of Software as a Service (SaaS) is to replace the applications running on a personal computer, so that no special software needs to be installed locally. In SaaS, the application is deployed over the Internet.

A SaaS provider licenses the software to consumers as a service, on demand, by subscription, on a pay-as-you-go basis, or at no charge. These applications are accessible from various client devices through a simple interface such as a web browser. Examples of this approach are Salesforce.com CRM and SugarCRM.

The goal of Platform as a Service (PaaS) is to provide a runtime system and application framework. A PaaS provides an environment for software development, storage, and hosting, delivered as a service over the Internet. Examples of PaaS are Google App Engine, Force.com, Microsoft Azure, WOLF, etc.

The goal of Infrastructure as a Service (IaaS) is to provide traditional computing resources such as servers, storage, and other low-level network and hardware resources in a virtual, on-demand fashion over the Internet, where the consumer is able to deploy and run arbitrary software, which can include operating systems and applications. Examples include services like GoGrid and Amazon's EC2 and S3. [14]

V. JUXTAPOSING GRIDS AND CLOUDS

Broadly, one can say that cloud computing is a combination of thick or thin clients, grid computing, and utility computing. In short, cloud computing has become a significant technology trend and could reshape the IT sector and the IT marketplace. It provides a deployment environment for application software that elastically and autonomously scales its compute and storage capacity to meet the application's immediate requirements. Companies that have the means to invest in research, technological innovation, and technology adoption tend to be the front runners in any upcoming technology, and the same is true of the cloud: Amazon and Google were among the first companies to make huge investments in cloud computing.

Similarities and differences: This section discusses the similarities and differences between cloud and grid computing. A common feature is that both grids and clouds attempt to deliver utility computing; however, they realize it differently. The two are often mistaken for one another, but they are far from the same thing. We look at why cloud computing may be advantageous over grid computing, what issues to consider in both, and some security concerns.

While there are many similarities between grid and cloud computing, it is the differences that matter most. [13]

Similarities:

• Cloud computing and grid computing are both scalable. Scalability is accomplished through load balancing of application instances running separately on a variety of operating systems and connected through Web services. CPU and network bandwidth are allocated and de-allocated on demand. The system's storage capacity goes up and down depending on the number of users, the number of instances, and the amount of data transferred at a given time.

• Both computing types involve multi-tenancy and multitasking, meaning that many customers can perform different tasks, accessing a single or multiple application instances. Sharing resources among a large pool of users helps reduce infrastructure costs and peak load capacity.

• Consumers can be afraid of sending sensitive data through a large number of computers. For the user, security is a high-priority issue because customer data and programs reside on the provider's premises.

• Data can be manipulated regardless of its location.

• Data may be moved repeatedly between different computers, which creates a bottleneck in the process, since the data is not always available everywhere and sometimes must be made available.

• Cloud and grid computing provide service-level agreements (SLAs) for guaranteed uptime availability of, say, 99 percent.

• Both computing types must determine the amount of unused resources.
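To make the SLA figure in the list above concrete, an uptime percentage can be converted into the downtime it permits. The helper below is a small worked calculation; the function name `allowed_downtime_hours` is hypothetical, and a non-leap year of 8760 hours is assumed.

```python
def allowed_downtime_hours(uptime_percent, period_hours):
    """Maximum downtime permitted by an uptime guarantee over a period."""
    return period_hours * (100 - uptime_percent) / 100


HOURS_PER_YEAR = 365 * 24  # 8760, assuming a non-leap year

for sla in (99, 99.9, 99.99):
    hours = allowed_downtime_hours(sla, HOURS_PER_YEAR)
    print(f"{sla}% uptime allows about {hours:.2f} hours of downtime per year")
```

A 99 percent guarantee therefore still permits about 87.6 hours, or roughly 3.65 days, of downtime per year, which is why the exact percentage in an SLA matters.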

Differences: Grid computing [22] is better suited for organizations with large amounts of data being requested by a small number of users (or few but large allocation requests), whereas cloud computing is better suited to environments where there are a large number of users requesting small amounts of data (or many but small allocation requests). This section aims to compare Grids and Clouds across a wide variety of perspectives, from architecture, security model, business model, programming model, virtualization, data model, compute model, to provenance and applications.

The business model for grids is project-oriented: the users or community represented by a project proposal are granted a certain number of service units (i.e. CPU hours) that they can spend.
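A project-oriented allocation of this kind behaves like a prepaid budget. The class below is a minimal sketch under assumptions: the name `ProjectAllocation`, the cost rule of CPUs multiplied by hours, and the refusal of jobs that exceed the remaining balance are illustrative choices, not the accounting rules of any particular grid.

```python
class ProjectAllocation:
    """Tracks a project's budget of service units (here, CPU hours)."""

    def __init__(self, project, service_units):
        self.project = project
        self.remaining = service_units

    def charge(self, cpus, hours):
        """Debit a job's cost; refuse the job if the budget cannot cover it."""
        cost = cpus * hours  # assumption: one service unit per CPU hour
        if cost > self.remaining:
            raise ValueError(
                f"{self.project}: job needs {cost} SUs, "
                f"only {self.remaining} left"
            )
        self.remaining -= cost
        return cost


if __name__ == "__main__":
    alloc = ProjectAllocation("demo-project", 1000)
    alloc.charge(cpus=64, hours=10)
    print(alloc.remaining)
```

Once the granted units are spent, further jobs are refused until a new allocation is awarded, in contrast to a cloud's pay-as-you-go billing.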

Architecture: Grid architecture [21] provides an overview of the Grid components, defines the purpose and functions of its components, and indicates how the components interact with one another. Grids provide protocols and services at five different layers as identified in the Grid protocol architecture as shown in the figure.

The Fabric layer comprises the physical resources, including computational resources, storage systems, network resources, catalogues, software modules, sensors, and other system resources, which are shared within the Grid. The Connectivity layer contains the core communication and authentication protocols for easy and secure network transactions. The GSI (Grid Security Infrastructure) protocol underlies every Grid transaction. The Resource layer uses the communication and security protocols (GRAM) to control secure negotiation, initiation, monitoring, accounting, and payment for the sharing of functions of individual resources.

Fig 14: Grid architecture (adapted from Foster)

The Collective layer is responsible for all global resource management and for interaction with collections of resources. This layer is responsible for community authorization together with accounting and payment services. The Application layer involves the user applications that are deployed on the Grid. Only a Grid-enabled (gridified) application can run in parallel and use multiple processors of a Grid setting, or be executed on different heterogeneous machines.

Clouds are usually described as a large pool of computing and/or storage resources that can be accessed via standard protocols through an abstract interface. There are multiple definitions of Cloud architecture; for comparison with the Grid architecture, we define a four-layer architecture for Cloud Computing composed of 1) fabric, 2) unified resource, 3) platform, and 4) application layers.

Fig 15 : Cloud architecture (Source:NIST)

The fabric layer contains the raw hardware-level resources, such as compute resources, storage resources, and network resources. The unified resource layer contains resources that have been abstracted/encapsulated (usually by virtualization) so that they can be exposed to the upper layer and to end users as integrated resources, for instance, a virtual computer/cluster, a logical file system, a database system, etc. The platform layer adds a collection of specialized tools, middleware, and services on top of the unified resources to provide a development and/or deployment platform. Finally, the application layer contains the applications that would run in the Clouds. Clouds in general provide services at three different levels (IaaS, PaaS, and SaaS) as follows, although some providers can choose to expose services at more than one level.

Infrastructure as a Service (IaaS) provides the consumer with the capability to provision fundamental computing resources on which the consumer is able to deploy and run arbitrary software. The consumer does not manage or control the underlying cloud infrastructure. Typical examples are the Amazon EC2 (Elastic Compute Cloud) service and S3 (Simple Storage Service). [25]

Platform as a Service (PaaS) provides the consumer with the capability to deploy onto the cloud infrastructure consumer-created applications using programming languages and tools supported by the provider (e.g., Java, Python, .NET). The consumer has control over the deployed applications. An example is Google's App Engine, which enables users to build Web applications.

Software as a Service (SaaS) provides the capability to


Resource Management: This section deals with resource management in Grids and Clouds, covering the compute model, data model, virtualization, monitoring, and provenance.

Compute Model: Most Grids use a batch-scheduled compute model, in which a local resource manager (LRM) manages the compute resources for a Grid site, and users submit batch jobs to request some resources for some time. The Cloud Computing compute model will likely look very different, with resources in the Cloud being shared by all users at the same time. This should allow latency-sensitive applications to operate natively on Clouds, although ensuring that a good enough level of QoS is delivered to end users is likely to be one of the major challenges for Cloud Computing as Clouds grow in scale and number of users.
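As a rough illustration of the batch-scheduled model, the following Python sketch simulates a hypothetical local resource manager that starts jobs in strict first-come, first-served order on a fixed number of nodes. The names `Job` and `fcfs_schedule`, the site size, and the job requests are assumptions made for illustration; they do not correspond to any particular LRM.

```python
from dataclasses import dataclass
import heapq


@dataclass
class Job:
    name: str
    nodes: int    # number of nodes requested
    runtime: int  # requested time, in arbitrary time units


def fcfs_schedule(jobs, total_nodes):
    """Return {job name: start time} under first-come, first-served scheduling."""
    free = total_nodes
    now = 0
    running = []  # min-heap of (finish time, nodes held)
    starts = {}
    for job in jobs:
        if job.nodes > total_nodes:
            raise ValueError(f"{job.name} requests more nodes than the site has")
        # Advance the clock until enough nodes are free for the job at the head of the queue.
        while free < job.nodes:
            finish, released = heapq.heappop(running)
            now = max(now, finish)
            free += released
        starts[job.name] = now
        free -= job.nodes
        heapq.heappush(running, (now + job.runtime, job.nodes))
    return starts


# Hypothetical queue on a four-node site.
queue = [Job("A", 4, 10), Job("B", 2, 5), Job("C", 4, 3)]
starts = fcfs_schedule(queue, total_nodes=4)
print(starts)
```

In this toy run every job after the first has to wait in the queue before it starts, which suggests why latency-sensitive applications fit a batch-scheduled model poorly.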

Data Locality: To achieve good scalability at Internet scales for Clouds, Grids, and their applications, data must be distributed over many computers, and computations must be steered towards the best place to execute in order to minimize communication costs. Google's MapReduce system runs on top of the Google File System, within which data is loaded, partitioned into chunks, and each chunk replicated. In Grids, data storage usually relies on shared file systems (e.g. NFS, GPFS, PVFS, Lustre), where data locality cannot be easily exploited. One approach is to make schedulers data-aware, so that they can leverage data locality information when scheduling computational tasks [20]; this approach has been shown to improve job turnaround time significantly.
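The idea of a data-aware scheduler can be sketched in a few lines of Python. The example below is a minimal sketch, not the scheduler of any cited system: it assumes a hypothetical placement of replicated chunks (`node_chunks`) and sends each task to the node that already holds most of its input, breaking ties by current load.

```python
def pick_node(task_chunks, node_chunks, node_load):
    """Choose the node holding the most input chunks of the task; break ties by lower load."""
    def score(node):
        local = len(task_chunks & node_chunks[node])
        return (-local, node_load[node], node)
    return min(node_chunks, key=score)


# Hypothetical placement of replicated chunks on three nodes.
node_chunks = {
    "n1": {"c1", "c2"},
    "n2": {"c2", "c3"},
    "n3": {"c3", "c4"},
}
node_load = {"n1": 0, "n2": 0, "n3": 0}

# Each task lists the chunks it needs to read.
tasks = {"t1": {"c1", "c2"}, "t2": {"c3"}, "t3": {"c3", "c4"}}

assignment = {}
for name, chunks in tasks.items():
    node = pick_node(chunks, node_chunks, node_load)
    assignment[name] = node
    node_load[node] += 1  # account for the newly placed task

print(assignment)
```

A scheduler that ignores locality could place `t1` on `n3`, forcing both of its chunks to cross the network; the locality-aware choice avoids that transfer.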

Data management architectures are important to ensure that data management implementations scale to the required dataset sizes, in terms of the number of files, objects, and disk space used, while at the same time ensuring [18] that data element information can be retrieved quickly and efficiently.

Grids have been making progress in combining compute and data management with data-aware schedulers, but we believe that Clouds will face significant challenges in handling data-intensive applications unless serious effort is invested in harnessing the data locality of application access patterns. Although data-intensive applications may not be typical of what Clouds deal with today, as the scale of Clouds grows, it may be just a matter of time before many Clouds have to handle them.

Data Model: Data Grids have been specifically designed to tackle data intensive applications in Grid environments, with the concept of virtual data playing a crucial role.

Virtual data captures the relationship between data, programs and computations and prescribes various abstractions that a data grid can provide: location transparency, where data can be requested without regard to data location; materialization transparency, where data can be either recomputed on the fly or transferred upon request; and representation transparency, where data can be consumed and produced regardless of their actual physical format and storage. Internet Computing is expected to become centralized around Cloud Computing, in which storage, computing, and all kinds of other resources will mainly be provisioned by the Cloud, as shown in the following figure.

Fig 16: The triangle model of next-generation internet language
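Returning to the virtual data concept described above, materialization transparency can be illustrated with a toy catalog in Python. This is only a sketch under stated assumptions: the class `VirtualDataCatalog` and its methods are hypothetical, and an in-memory dictionary stands in for remote storage. A caller asks for a dataset by name and does not know whether an existing copy is returned or the data is recomputed from its recorded derivation.

```python
class VirtualDataCatalog:
    """Toy catalog: a dataset is either stored somewhere or derivable from a recipe."""

    def __init__(self):
        self.replicas = {}  # dataset name -> stored value (stands in for a remote copy)
        self.recipes = {}   # dataset name -> (function, list of input dataset names)

    def store(self, name, value):
        self.replicas[name] = value

    def define(self, name, func, inputs):
        self.recipes[name] = (func, inputs)

    def get(self, name):
        # Materialization transparency: the caller cannot tell which branch runs.
        if name in self.replicas:
            return self.replicas[name]  # "transfer" an existing copy
        func, inputs = self.recipes[name]
        value = func(*(self.get(i) for i in inputs))  # recompute on the fly
        self.replicas[name] = value  # keep the derived result for later requests
        return value


catalog = VirtualDataCatalog()
catalog.store("raw", [3, 1, 2])
catalog.define("sorted", sorted, ["raw"])
catalog.define("largest", lambda xs: xs[-1], ["sorted"])

result = catalog.get("largest")
print(result)
```

The recorded recipes also capture the relationship between data, programs and computations that the text attributes to virtual data.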

Virtualization: Virtualization has become an indispensable ingredient for almost every Cloud; the most obvious reasons are for abstraction and encapsulation. Clouds need to run multiple (or even up to thousands or millions of) user applications and all the applications appear to the users as if they were running simultaneously and could use all the available resources. Virtualization provides the necessary abstraction such that the underlying fabric (raw compute, storage, network resources) can be unified as a pool of resources and resource overlays (e.g. data storage services, Web hosting environments) can be built on top of them. Virtualization also enables each application to be encapsulated such that they can be configured, deployed, started, migrated, suspended, resumed, stopped, etc., and thus provides better security, manageability, and isolation. Grids do not rely on virtualization as much as Clouds do, but that might be more due to policy and having each individual organization maintain full control of their resources (i.e. by not virtualizing them).


Monitoring: In a Cloud, different levels of services can be offered to an end user; the user is only exposed to a predefined API, and the lower-level resources are opaque to the user (especially at the PaaS and SaaS level, although some providers may choose to expose monitoring information at these levels). The user does not have the liberty to deploy her own monitoring infrastructure, and the limited information returned may not provide the level of detail she needs to determine the resource status. Essentially, monitoring in Clouds requires a fine balance of business application monitoring, enterprise server management, virtual machine monitoring, and hardware maintenance, and will be a significant challenge for Cloud Computing as it sees wider adoption and deployment.

Programming Model: The programming model in Grid environments does not differ fundamentally from that of traditional parallel and distributed environments. Grids primarily target large-scale scientific computations, so programs must scale to leverage large numbers of resources, must run fast and efficiently, and must finish correctly, which means that reliability and fault tolerance must be considered. MPI (Message Passing Interface) is the most commonly used programming model [28] in parallel computing, in which a set of tasks use their own local memory during computation and communicate by sending and receiving messages. MPICH-G2 is a Grid-enabled implementation of MPI. It offers the familiar MPI interface while providing integration with the Globus Toolkit. In Grids, many applications are loosely coupled; their computations may require large numbers of data sets and may take considerable time to compute and communicate. MapReduce is another parallel programming model, providing a programming model and runtime system for the processing of large datasets. It is based on a simple model with just two key functions, map and reduce, borrowed from functional languages. The map function applies a specific operation to each of a set of items, producing a new set of items; the reduce function performs aggregation on a set of items. The MapReduce runtime system automatically partitions input data and schedules the execution of programs on a large cluster of commodity machines. Although Clouds have adopted some common communication protocols such as HTTP and SOAP, the integration and interoperability of all the services and applications remain the biggest challenge, as users need to tap into a federation of Clouds instead of a single Cloud provider.
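The map and reduce functions described above can be illustrated with a single-process word count in Python. This is a minimal sketch of the programming model only: the helper names (`map_phase`, `shuffle`, `reduce_phase`) are assumptions for illustration, and the automatic partitioning and cluster scheduling performed by a real MapReduce runtime are deliberately left out.

```python
from collections import defaultdict


def map_phase(records, map_fn):
    """Apply map_fn to every record; each call yields (key, value) pairs."""
    for record in records:
        yield from map_fn(record)


def shuffle(pairs):
    """Group all values that share the same key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups, reduce_fn):
    """Aggregate the list of values collected for each key."""
    return {key: reduce_fn(key, values) for key, values in groups.items()}


def word_map(line):
    for word in line.lower().split():
        yield word, 1


def word_reduce(word, counts):
    return sum(counts)


lines = ["grid and cloud", "cloud computing", "grid computing and cloud"]
counts = reduce_phase(shuffle(map_phase(lines, word_map)), word_reduce)
print(counts)
```

Because each call to `word_map` depends only on its own input line, the map calls could run on different machines, which is the property the runtime system exploits.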

Application Model: Grids generally support many different kinds of applications, ranging from high performance computing (HPC) to high throughput computing (HTC). HPC applications execute tightly coupled parallel jobs efficiently within a particular machine with low-latency interconnects and are generally not executed across a wide area network Grid; these applications typically use the Message Passing Interface (MPI) to achieve the needed inter-process communication. On the other hand, Grids have also seen great success in the execution of more loosely coupled applications, which tend to be managed and executed through workflow systems or other sophisticated and complex applications. Related to the loosely coupled nature of HTC applications, there are other application classes, such as Multiple Program Multiple Data (MPMD) [19], MTC, capacity computing, utility computing, and embarrassingly parallel applications, each with its own niche.

Cloud Computing could in principle cater to a similar set of applications. The one exception that will likely be hard to achieve in Cloud Computing (but has had much success in Grids) is HPC applications that require fast, low-latency network interconnects for efficient scaling to many processors. As Cloud Computing is still in its infancy, the applications that will run on Clouds are not well defined, but we can certainly characterize them as loosely coupled, transaction oriented and likely to be interactive.

Job scheduling: Job scheduling is a fundamental concept of grid technology that makes it possible to use all kinds of resources. It can divide a huge task into many independent, unrelated subtasks and then let every node work on them. Even if a node fails or crashes and does not return a result, the whole process is not affected, because the task is reassigned to other nodes. Like grid computing, cloud computing builds a huge resource pool by grouping all the resources. However, the resources provided by a cloud are meant to complete a specific task.
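The divide-and-reassign behaviour described above can be sketched in Python as follows. This is a minimal sketch under stated assumptions: the function `run_on_grid`, the node names, and the simulated failure of `node2` are all hypothetical, and a real grid scheduler would dispatch subtasks in parallel rather than in a loop.

```python
def run_on_grid(subtasks, nodes, execute):
    """Assign each independent subtask to a node; on failure, retry on the next node."""
    results = {}
    for index, subtask in enumerate(subtasks):
        for attempt in range(len(nodes)):
            node = nodes[(index + attempt) % len(nodes)]
            try:
                results[index] = execute(node, subtask)
                break
            except RuntimeError:
                continue  # node failed; reassign the subtask to another node
        else:
            raise RuntimeError(f"subtask {index} failed on every node")
    return results


FAILED_NODES = {"node2"}  # hypothetical crashed node


def execute(node, numbers):
    """Stand-in for remote execution: sum a slice of the data."""
    if node in FAILED_NODES:
        raise RuntimeError(f"{node} is down")
    return sum(numbers)


data = list(range(1, 13))                            # one large task: sum 1..12
subtasks = [data[i:i + 3] for i in range(0, 12, 3)]  # four independent subtasks
partial = run_on_grid(subtasks, ["node1", "node2", "node3"], execute)
total = sum(partial.values())
print(total)
```

The failure of `node2` is absorbed by the scheduler: the affected subtask is simply rerun elsewhere and the final result is unchanged.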

Security Model: Interoperability can become a serious issue for cross-data-center, cross-administration-domain interactions. Grids are built on the assumption that resources are heterogeneous [24] and dynamic, and each Grid site may have its own administration domain and operational autonomy. Security plays an important role in Grid infrastructure.


The Grid security infrastructure facilitates accounting, auditing and delegation, so that a program can be authorized to access resources on a user's behalf and can further delegate to other programs; privacy, integrity and segregation, so that resources belonging to one user cannot be accessed by unauthorized users and cannot be tampered with during transfer; and coordinated resource allocation, reservation, and sharing, taking into consideration both global and local resource usage policies. The public-key based GSI (Grid Security Infrastructure) protocols are used for authentication, communication protection, and authorization. GRUBER (Grid Resource Usage SLA Broker) is an example that uses distributed policy enforcement points to enforce both local usage policies and global SLAs (Service Level Agreements), which allows resources at individual sites to be efficiently shared in multi-site, multi-VO environments. Currently, the security model for Clouds seems to be relatively simpler and less secure than the security model adopted by Grids. Cloud infrastructures typically rely on Web forms (over SSL) to create and manage account information for end users, and allow users to reset their passwords and receive new passwords via e-mail over unsafe, unencrypted communication. The Grid approach to security might be more time consuming, but it adds an extra level of security to help prevent unauthorized access. Security is one of the largest concerns for the adoption of Cloud Computing. We outline seven risks [23] that a Cloud user should raise with vendors before committing:

1. Privileged user access: sensitive data processed outside the enterprise brings with it an inherent level of risk.

2. Regulatory compliance: a customer needs to verify whether a Cloud provider has external audits and security certifications and whether its infrastructure complies with regulatory security requirements.

3. Data location: a customer will not know where her data will be stored, so this should be raised with the provider.

4. Data segregation: the customer needs to ensure that one customer's data is fully segregated from another customer's data.

5. Recovery: the customer needs to confirm that the Cloud provider has an efficient replication and recovery mechanism to restore data.

6. Investigative support: Cloud services are especially difficult to investigate.

7. Long-term viability: data should remain viable even if the Cloud provider is acquired by another company.

The following table compares cluster, grid and cloud computing against five essential characteristics of cloud computing.

Table 1

Comparing Cluster, Grid and Cloud Computing

Characteristic            Cluster   Grid   Cloud
On-demand Self Service    No        No     Yes
Broad Network Access      Yes       Yes    Yes
Resource Pooling          Yes       Yes    Yes
Rapid Elasticity          No        No     Yes
Measured Service          No        Yes    Yes

Problems of cloud computing

• Data transfer rates are very low because of heavy network traffic and low bandwidth; transferring 1 terabyte of data can take 1 to 2 days.

• Interoperability issues.

• Hidden costs, such as Internet charges and maintenance charges levied by the cloud provider.

• Users with sensitive data may be reluctant to entrust it to external providers or to providers outside their borders.

• The cloud is generally located at a single site, which increases the risk of complete cloud failure.
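The transfer-time figure in the first item can be checked with simple arithmetic. The sketch below assumes a decimal terabyte (10^12 bytes) and an ideal link with no protocol overhead; the bandwidth values are illustrative assumptions, not measurements. Under these assumptions, a transfer time of 1 to 2 days corresponds to a sustained rate of roughly 46 to 93 Mbit/s.

```python
def transfer_days(size_bytes, bandwidth_bits_per_s):
    """Ideal transfer time in days, ignoring protocol overhead and congestion."""
    seconds = size_bytes * 8 / bandwidth_bits_per_s
    return seconds / 86_400  # seconds per day


ONE_TERABYTE = 10**12  # decimal terabyte, in bytes (assumption)

for mbps in (50, 100, 1000):
    days = transfer_days(ONE_TERABYTE, mbps * 10**6)
    print(f"{mbps:>5} Mbit/s: {days:.2f} days")
```

Real transfers are slower than this ideal estimate because of shared links and protocol overhead.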

VI. CONCLUSION

The main aim of this work is to present an overview of grid computing and cloud computing and to highlight the differences between them. Both computing techniques provide a promising and efficient way of using computing and storage resources. A fundamental understanding of the differences between the two will help users choose the system best suited to their needs. In our view, clouds and grids share many similarities in their vision, architecture and technology, but they also differ in various aspects such as security, programming model, business model, compute model, data model, applications, and abstractions. A close comparison such as this can help the two communities understand, share and evolve infrastructure and technology within and across both, and accelerate progress.

REFERENCES

[1] Open Cloud Consortium website, http://opencloudconsortium.org, accessed 10 November 2010.

[2] Center Bo Wang, HongYu Xing, "The Application of Cloud Computing in Education Informatization, Modern Educational Tech...", Computer Science and Service System (CSSS), 2011 International Conference on, IEEE, 27-29 June 2011, 978-1-4244-9762-1, pp 2673 – 2676.

[3] http://www.gridservices.com/


[5] Center Bo Wang, HongYu Xing, "The Application of Cloud Computing in Education Informatization, Modern Educational Tech...", Computer Science and Service System (CSSS), 2011 International Conference on, IEEE, 27-29 June 2011, 978-1-4244-9762-1, pp 2673 – 2676.

[6] Yogesh Simmhan, Girish Subramanian, "Bridging the Gap between Desktop and the Cloud for eScience Applications".

[7] M. Feldman, "Slow Moving Clouds Fast Enough for HPC", HPC Wire, August 10, 2009.

[8] N. Leavitt, "Is Cloud Computing Really Ready for Prime Time?", Computer, vol. 42, no. 1, 2009, pp. 15–20.

[9] http://www.cc.gatech.edu/

[10] I. Foster and C. Kesselman, The Globus Project: a Status Report. In Proc. IPPS/SPDP'98 Workshop on Heterogeneous Computing, pp. 4–18, 1998.

[11] I. Foster, Computational Grids, pp. 15–52. In [10], 1998.

[12] www.csrc.nist.gov

[13] Shuai Zhang, Shufen Zhang, Xuebin Chen, Xiuzhen Huo, "The Comparison Between Cloud Computing and Grid Computing", 2010 International Conference on Computer Application and System Modeling (ICCASM 2010), IEEE, 2010. NIST, http://csrc.nist.gov/groups/SNS/cloudcomputing/

[14] http://www.cloudconsulting.com/

[15] D. Skillicorn and D. Talia. Models and languages for parallel computation. ACM Computing Surveys, 30(2), June 1998.

[16] I. Foster and N. T. Karonis. A grid-enabled MPI: Message passing in heterogeneous distributed computing systems. In Supercomputing. IEEE, November 1998. www.supercomp.org/sc98.

[17] I. Raicu, Y. Zhao, I. Foster, A. Szalay. "Accelerating Large scale Data Exploration through Data Diffusion", International Workshop on Data-Aware Distributed Computing 2008.

[18] I. Foster, J. Vöckler, M. Wilde, Y. Zhao, Chimera: A Virtual Data System for Representing, Querying, and Automating Data Derivation, SSDBM 2002: 37-46.

[19] Y. Zhao, M. Hategan, B. Clifford, I. Foster, G. von Laszewski, I. Raicu, T. Stef-Praun, and M. Wilde. "Swift: Fast, reliable, loosely coupled parallel computation", IEEE Int. Workshop on Scientific Workflows, pages 199–206, 2007.

[20] B. Ludäscher, I. Altintas, C. Berkley, D. Higgins, E. Jaeger, M. Jones, E. Lee, J. Tao, Y. Zhao, "Scientific workflow management and the Kepler system", Concurrency and Computation: Practice and Experience, 18(10).

[21] P. Groth, S. Miles, W. Fang, S. Wong, K. Zauner, L. Moreau. "Recording and using provenance in a protein compressibility experiment", in Proceedings of the 14th IEEE Int. Symposium on High Performance Distributed Computing (HPDC), 2005.

[22] I. Raicu, Z. Zhang, M. Wilde, I. Foster, P. Beckman, K. Iskra, B. Clifford. "Towards loosely-Coupled Programming on Petascale Systems", IEEE/ACM Supercomputing 2008.

[23] J. Brodkin. "Gartner: Seven cloud-computing security risks", http://www.networkworld.com/news/2008/070208-cloud.html, 2008.

[24] Shuai Zhang, Shufen Zhang, Xuebin Chen, Xiuzhen Huo, "Cloud Computing Research and Development Trend", 2010 Second International Conference on Future Networks, 2010. Read Luis Ferreira's Grid Computing - Products and Services.

[25] Amazon Simple Storage Service (Amazon S3), http://aws.amazon.com/s3, 2008.