Grid Computing and Alternative Distributed Computing Solutions
Noman Islam
Oct, 2007
Introduction
The defining characteristic of a grid [1]:
“The essence of grid computing lies in the efficient and optimal utilization of a wide range of heterogeneous, loosely coupled
A Three Point Check List for Grids [4]
1. Coordinates resources that are not subject to centralized control
– A grid integrates and coordinates resources and users that live within different control domains
2. Uses standard, open, general-purpose protocols and interfaces
– A grid is built from multi-purpose protocols and interfaces that address such fundamental issues as
3. Delivers nontrivial qualities of service
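The checklist can be read as a conjunction: a system counts as a grid only if all three points hold. The sketch below is one hedged way to express that reading. The `SystemProfile` class, its field names, and the two example systems are hypothetical illustrations, not part of the checklist in [4].

```python
from dataclasses import dataclass
from typing import List


@dataclass
class SystemProfile:
    """Hypothetical description of a distributed system (illustrative only)."""
    name: str
    decentralized_control: bool    # point 1: resources span control domains
    open_standard_protocols: bool  # point 2: standard, open, general-purpose
    nontrivial_qos: bool           # point 3: nontrivial qualities of service


def failed_points(profile: SystemProfile) -> List[int]:
    """Return the checklist points (1, 2, 3) that the system does not meet."""
    checks = [
        profile.decentralized_control,
        profile.open_standard_protocols,
        profile.nontrivial_qos,
    ]
    return [number for number, ok in enumerate(checks, start=1) if not ok]


def is_grid(profile: SystemProfile) -> bool:
    """A system passes the checklist only when no point fails."""
    return not failed_points(profile)


# Assumed example: a cluster run by one central scheduler fails point 1.
central_cluster = SystemProfile("central-cluster", False, True, True)
# Assumed example: a multi-institution federation meeting all three points.
federation = SystemProfile("federation", True, True, True)
```

Under these assumptions, `is_grid(central_cluster)` is false because point 1 fails, while `is_grid(federation)` is true.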
Introduction to Cluster Computing
• A cluster is a group of tightly coupled computers that work together so closely that in many respects they can be viewed as a single computer
• They are often connected to each other through a fast LAN
• Cluster Categories
– High-availability (HA) clusters
– Load-balancing clusters
Grid Vs Cluster Computing
• The key difference between grids and traditional clusters is that grids connect collections of computers which do not fully trust each other, or which are geographically dispersed
• Grid computing is optimized for workloads that consist of many independent jobs or packets of work, which do not have to share data with each other during the computation. Grids manage the allocation of jobs to computers, each of which performs its work independently of the rest of the grid.
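The workload shape described above (many independent jobs that share no data while running) can be sketched in a few lines. This is only a local illustration under stated assumptions: a thread pool stands in for the grid's computers, and `run_job` and `run_independent_jobs` are hypothetical names, not the API of any grid middleware.

```python
from concurrent.futures import ThreadPoolExecutor


def run_job(job_id: int) -> int:
    """One independent unit of work: it depends only on its own input."""
    return sum(i * i for i in range(job_id * 1000))


def run_independent_jobs(job_ids, workers: int = 4) -> dict:
    """Hand each job to any free worker; jobs never exchange data."""
    job_ids = list(job_ids)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() keeps results in input order, whichever worker ran each job
        results = list(pool.map(run_job, job_ids))
    return dict(zip(job_ids, results))
```

Because no job reads another job's output, the scheduler is free to place each one on whichever worker is available, which is the property that lets such workloads spread across loosely coupled machines.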
• Grids consist of heterogeneous resources (integrating storage, networking, and computation resources), whereas clusters provide mainly computational resources
• Clusters usually contain a single type of processor and operating system; grids can contain machines from different vendors
• Grids are dynamic by their nature. Clusters typically contain a static number of processors and resources; resources come and go on the grid. Resources are provisioned onto and removed from the grid on an ongoing basis
• Grids are inherently distributed over a local, metropolitan, or wide-area network. Usually, clusters are physically contained in the same complex in a single location; grids can be (and are) located everywhere. Cluster interconnect technology delivers extremely low network
• Grids offer increased scalability. Physical proximity and network latency limit the ability of clusters to scale out; due to their dynamic nature, grids offer the promise of high scalability
• However, cluster and grid computing are becoming completely complementary. Many grids incorporate clusters among the resources they manage. Indeed, a grid user may be unaware that their workload is in fact being executed on a remote cluster. And while there are differences between grids and clusters, these differences afford them an
• As networking capability and bandwidth advance, problems that were previously the exclusive domain of cluster computing will be solvable by grid computing. It is vital to comprehend the balance between the inherent scalability of grids and the
Introduction to P2P
• P2P is a class of applications that takes advantage of resources (storage, cycles, content, human presence) available at the edges of the Internet
• A pure peer-to-peer network does not have the notion of clients or servers, but
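To illustrate a network with no fixed client or server role, the sketch below models each node as a peer that can both answer requests and issue them. It is an in-process simulation with no real networking, and the `Peer` class and its methods are hypothetical names chosen for illustration rather than any real P2P protocol.

```python
class Peer:
    """A node that plays both roles: it serves content and requests it."""

    def __init__(self, name):
        self.name = name
        self.files = {}      # content this peer can serve to others
        self.neighbors = []  # peers this node knows directly

    def connect(self, other):
        """Create a symmetric link; neither side is 'the server'."""
        if other not in self.neighbors:
            self.neighbors.append(other)
            other.neighbors.append(self)

    def serve(self, filename):
        """Server role: answer a neighbor's request from local content."""
        return self.files.get(filename)

    def fetch(self, filename):
        """Client role: ask direct neighbors, then keep a copy to serve later."""
        for peer in self.neighbors:
            data = peer.serve(filename)
            if data is not None:
                self.files[filename] = data
                return data
        return None
```

For example, if peer `a` holds a file and is linked to `b`, and `b` is linked to `c`, then once `b` has fetched the file from `a`, `c` can fetch it from `b`: the same node acted first as a client and then as a server.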
Grid Vs P2P
• Grids were motivated by the requirements of professional communities needing to access remote resources, federate datasets, and/or pool computers for large-scale simulations and data analyses. Grid computing was initially developed to address the needs of scientific collaborations, but commercial interest is growing
• P2P has been popularized by grass-roots, mass-culture file-sharing and highly parallel computing applications that scale in some
• Grids integrate resources that are more powerful, more diverse, and better connected than the typical P2P resource
– Grid resource: a cluster, storage system, database, or scientific instrument administered in an organized fashion according to some well-defined policy.
• P2P systems often deal with intermittent participation and highly variable behavior.
• A grid often involves only modest numbers of participants, although the amount of activity can be large.
– Early grid implementations did NOT address scalability and self-management as priorities
• In grids, much work has gone into creating and operating persistent, multipurpose infrastructure services for authentication, authorization, discovery, resource access, data movement, and so on. Less effort has been devoted to managing participation in the absence of trust
• P2P offers high scalability, fault tolerance, self-configuration, and automatic problem determination. P2P systems have tended to focus on the integration of simple resources (individual computers) by protocols. The persistence properties of such infrastructures are not specifically engineered but are rather emergent
• A P2P system lacks a central point of management, which makes it well suited to providing anonymity. Grid environments, on the other hand, usually have some form of centralized management and security (for instance, in resource management or workload scheduling).
• Lack of centralization means that P2P systems are:
– More scalable
– More tolerant of single-point failures than grid computing systems. (Although grids are much more resilient than tightly coupled distributed systems, a grid inevitably includes some key elements that can become single points of failure.)
• Also, while an important characteristic of grid computing is that resources are dynamic, in P2P systems the resources are much more dynamic in nature and generally are more fleeting than resources on a grid
• A final distinction between the two systems is standards: the general lack of standards in the P2P world contrasts with the host of standards in the grid universe. And, thanks to entities like the Global Grid Forum, the grid universe has a
Grid Vs CORBA
• CORBA
– OGSA and CORBA are both based on the concept of service-oriented architecture (SOA)
– CORBA assumes object orientation (after all, it is part of the name), but grid computing does not
Distributed Computing Environment
• The Distributed Computing Environment (DCE) is a software system developed in the early 1990s by a consortium that included Apollo Computer (later part of Hewlett-Packard), IBM, Digital Equipment Corporation, and others. DCE supplies a framework and toolkit for developing client/server applications. The framework includes a remote procedure call (RPC) mechanism known as DCE/RPC, a naming (directory) service, a time service, an authentication service, an authorization
Grid Vs DCE
• DCE is not so much an architecture as an environment that facilitates distributed computing; grid computing (in the form of OGSA) is more of an end-to-end architecture designed to encapsulate
Conclusion
• We have examined grid computing and its importance at the enterprise level
• We have also analyzed the similarities and differences between grid computing and four major distributed computing systems
• Based on the benefits of these paradigms,
References
[1] Matt Haynos (Program Director, Grid Marketing and Strategy, IBM), “Perspectives on grid: Grid computing -- Next-generation distributed computing”, http://users.cs.cf.ac.uk/David.W.Walker/IGDS/GridCourse.htm
[2] Yin Chen, “Grid Vs Peer-to-Peer”, http://freewebs.com/yinchenagain/doc/p2p.pdf
[3] Wikipedia, the Free Encyclopedia, http://www.wikipedia.org