Grids: Concepts,
Technologies and
Applications
Geoffrey Fox
Computer Science, Informatics, Physics Pervasive Technology Laboratories Indiana University Bloomington IN 47401
April 25 2005
So what is a Grid?
n Supporting human decision making with a network of at least four large computers, perhaps six or eight small computers, and a great assortment of disc files and magnetic tape units - not to mention remote consoles and teletype stations - all churning away. (Licklider 1960)
n Coordinated resource sharing and problem solving in
dynamic multi-institutional virtual organizations
n Infrastructure that will provide us with the ability to
dynamically link together resources as an ensemble to support the execution of large-scale, resource-intensive, and
distributed applications.
n Realizing the thirty-year dream of science fiction writers who
have spun yarns featuring worldwide networks of
interconnected computers that behave as a single entity.
Internet Scale Distributed Services
n Grids use Internet technology and are distinguished by
managing or organizing sets of network connected resources
• Classic Web allows independent one-to-one access to
individual resources
• Grids integrate together and manage multiple
Internet-connected resources: People, Sensors, computers, data systems
n Organization can be explicit as in
• TeraGrid which federates many supercomputers;
• Deep Web Technologies IR Grid which federates multiple
data resources;
• CrisisGrid which federates first responders, commanders,
sensors, GIS, (Tsunami) simulations, science/public data
n Organization can be implicit as in Internet resources such as
curated databases and simulation resources that “harmonize a community”
Different Visions of the Grid
n Grid just refers to the technologies
• Or Grids represent the full system/Applications
n DoD’s vision of Network Centric Computing is just a Grid
(linking sensors, warfighters, commanders, backend resources) and they are building the GIG (Global Information Grid)
n Utility Computing or X-on-demand (X=data, computer ..) is
major computer Industry interest in Grids
n e-Science or Cyberinfrastructure are virtual organization Grids
supporting global distributed science (note sensors, instruments and people are all distributed)
n Skype (Kazaa) VOIP system is a Peer-to-peer Grid (and
VRVS/GlobalMMCS like Internet A/V conferencing are Collaboration Grids)
n Commercial 3G Cell-phones and DoD ad-hoc network initiative
are forming mobile Grids
e-moreorlessanything and the Grid
n e-Business captures an emerging view of corporations as dynamic virtual organizations linking employees, customers and stakeholders across the world.
• The growing use of outsourcing is one example
n e-Science is the similar vision for scientific research with international participation in large accelerators, satellites or distributed gene analyses.
n The Grid integrates the best of the Web, traditional enterprise software, high performance computing and Peer-to-peer systems to provide the information technology e-infrastructure for e-moreorlessanything.
n A deluge of data of unprecedented and inevitable size must be managed and understood.
n People, computers, data and instruments must be linked.
n On demand assignment of experts, computers, networks and storage resources must be supported
More Broad Classes of Grid Applications
n Enterprise Grid supports information system for an organization; includes “university computer center”, “(digital) library”, sales, marketing, manufacturing …
n Outsourcing Grid links different parts of an enterprise together
(Gridsourcing)
• Manufacturing plants with designers
• Animators with electronic game or film designers and
producers
• Coaches with aspiring players (e-NCAA or e-NFL etc.)
n Customer Grid links businesses and their customers as in many
web sites such as amazon.com
n e-Multimedia can use secure peer-to-peer Grids to link creators,
distributors and consumers of digital music, games and films
respecting rights
n Distance education Grid links teacher at one place, students all
over the place, mentors and graders; shared curriculum, homework, live classes …
e-Defense and e-Crisis
n Grids support Command and Control and provide Global Situational Awareness
• Link commanders and frontline troops to each other and to archival and real-time data; link to what-if simulations
• Dynamic heterogeneous wired and wireless networks
• Security and fault tolerance essential
n System of Systems; Grid of Grids
• The command and information infrastructure of each ship is a Grid; each fleet is linked together by a Grid; the President is informed by and informs the national defense Grid
• Grids must be heterogeneous and federated
n Crisis Management and Response enabled by a Grid linking sensors, disaster managers, and first responders with decision support
Types of Computing Grids
n Running “Pleasingly Parallel Jobs” as in United Devices, Entropia (Desktop Grid) “cycle stealing systems”
n Can be managed (“inside” the enterprise as in Condor) or more informal (as in SETI@Home)
n Computing-on-demand in Industry where jobs spawned are perhaps very large (SAP, Oracle …)
n Support distributed file systems as in Legion (Avaki), Globus with (web-enhanced) UNIX programming paradigm
• Particle Physics will run some 30,000 simultaneous jobs
n Linking Supercomputers as in TeraGrid
n Pipelined applications linking data/instruments, compute, visualization
n Seamless Access where Grid portals allow one to choose one of multiple resources with a common interface
Utility and Service Computing
n An important business application of Grids is believed to be utility computing
n Namely support a pool of computers to be assigned as needed to
take-up extra demand
• Pool shared between multiple applications
n Natural architecture is not a cluster of computers connected to
each other but rather a “Farm of Grid Services” connected to Internet and supporting services such as
• Web Servers
• Financial Modeling
• Run SAP
• Data-mining
• Simulation response to crisis like forest fire or earthquake
• Media Servers for Video-over-IP
n Note classic Supercomputer use is to allow full access to do
“anything” via ssh etc.
• In service model, one pre-configures services for all programs
and users access a portal to run jobs, with fewer security issues
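The idea of a pool of computers assigned as needed and shared between multiple applications can be sketched as follows. This is only an illustration: the class name `ComputePool`, its methods and the application names are hypothetical and do not come from any Grid toolkit.

```python
from collections import defaultdict


class ComputePool:
    """Hypothetical shared pool of identical machines (illustrative only)."""

    def __init__(self, size: int) -> None:
        self.free = size                  # machines not assigned to anyone
        self.assigned = defaultdict(int)  # application name -> machines held

    def acquire(self, app: str, wanted: int) -> int:
        """Give `app` up to `wanted` machines; return how many were granted."""
        granted = min(wanted, self.free)
        self.free -= granted
        self.assigned[app] += granted
        return granted

    def release(self, app: str, count: int) -> None:
        """Return up to `count` of the machines held by `app` to the pool."""
        returned = min(count, self.assigned[app])
        self.assigned[app] -= returned
        self.free += returned


pool = ComputePool(size=10)
web = pool.acquire("web_servers", 6)        # normal load
sim = pool.acquire("crisis_simulation", 6)  # extra demand, only 4 machines left
pool.release("web_servers", 3)              # load drops, machines go back
```

The second request is only partly satisfied because the pool is shared; a real utility computing service would also need scheduling policy, accounting and security, which are left out here.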
Some Important Styles of Grids
n Computational Grids were origin of concepts and link computers across the globe – high latency stops this from being used as parallel machine
n Knowledge and Information Grids link sensors and information
repositories as in Virtual Observatories or BioInformatics
• More detail on next slide
n Education Grids link teachers, learners, parents as a VO with
learning tools, distant lectures etc.
n e-Science Grids link multidisciplinary researchers across
laboratories and universities
n Community Grids focus on Grids involving large numbers of
peers rather than focusing on linking major resources – links Grid and Peer-to-peer network concepts
n Semantic Grid links Grid and AI communities with Semantic Web
(ontology/meta-data enriched resources) and Agent concepts
Information/Knowledge Grids
n Distributed (10’s to 1000’s of) data sources (instruments, file systems, curated databases …)
n Data Deluge: 1 (now) to 100’s petabytes/year (2012)
• Moore’s law for Sensors
n Possible filters assigned dynamically (on-demand)
• Run image processing algorithm on telescope image
• Run Gene sequencing algorithm on compiled data
n Needs decision support front end with “what-if” simulations
n Metadata (provenance) critical to annotate data
n Integrate across experiment as in multi-wavelength astronomy
[Figure: SERVOGrid. Labels: Field Trip Data; Streaming Data; Sensors; Databases; Repositories; Federated Databases; Data Filter Services; Analysis and Visualization Portal; Discovery Services; Research Simulations; Customization Services; From Research to Education; Research; Education; Education Grid; Computer Farm]
iSERVO in a nutshell
n Designed to link data-sets (repositories and real time),
computations and earthquake scientists in ACES (Asia Pacific) Cooperation
• Australia China Japan USA
n Exemplified by SERVOGrid in USA led by JPL
n Supports simulation and datamining as services
n Adopts conservative WS-I+ Web Service Interoperability
standards
n Builds full “Grid” in a library fashion as a Grid of Grids
• GIS (Geographic Information System) Grid built as a set of OGC compatible Web Services “talking” GML
• iSERVO federates separate Grids in each country/organization/function
• A Grid is “just” a collection of Services aka distributed programs
n Multi-scale simulations supported by Grid workflow
n Portals based on NSF Middleware Initiative NMI Open Grid
Computing Environment OGCE
[Figure: DAME, Distributed Aircraft Maintenance Environment (Rolls Royce and UK e-Science Program). Labels: In flight data; Airline; Maintenance Centre; Ground Station; Global Network such as SITA; Internet, e-mail, pager; Engine Health (Data) Center. ~ Gigabyte per aircraft per engine per transatlantic flight; ~5000 engines]
[Figure: NASA Aerospace Engineering Grid]
[Figure: Virtual Observatory Astronomy Grid. Integrate Experiments: Radio; Far-Infrared; Visible; Visible + X-ray; Dust Map; Galaxy Density Map]
[Figure: e-Chemistry Laboratory. Labels: Experiments-on-demand; Grid Resources; Grid-enabled Output Streams]
[Figure: CERN LHC Data Analysis Grid]
SERVOGrid (Complexity) Computing Model
[Figure: Labels: HPC Simulation; Data Filters (Distributed Filters massage data for simulation); Grid Data Assimilation; OGSA-DAI Grid Services; Other Grid and Web Services; Analysis, Control, Visualize. This type of Grid integrates with parallel computing: multiple HPC facilities but only use one at a time; many simultaneous data sources and sinks]
Sources of Grid Technology
n Grids support distributed collaboratories or virtual organizations integrating concepts from
n The Web
n Agents
n Distributed Objects (CORBA Java/Jini COM)
n Globus, Legion, Condor, NetSolve, Ninf and other High Performance Computing activities
n Peer-to-peer Networks
n With perhaps the Web and P2P networks being the most important for “Information Grids” and Globus for “Compute Grids”
The Essence of Grid Technology?
n We will start from the Web view and assert that basic paradigm is
n Meta-data rich Web Services communicating via messages
n These have some basic support from some runtime such as .NET, Jini (pure Java), Apache Tomcat+Axis (Web Service toolkit), Enterprise JavaBeans, WebSphere (IBM) or GT3/4 (Globus Toolkit 3/4)
• These are the distributed equivalent of operating system functions as in UNIX Shell
• Called Hosting Environment or platform
n W3C standard WSDL defines IDL (Interface standard) for Web Services
Meta-data
n Meta-data is usually thought of as “data about data”
n The Semantic Web is at its simplest considered as adding meta-data to web pages
n For example, the hospital web-page has meta-data telling you its location, phone-number, specialties which can be used to automate Google-style searches to allow planning of disease/accident treatment from web
n Modern trend (Semantic Grid) is meta-data about web-services e.g. specify details of interface and usage
• Such as that a bioinformatics service is free or that its input bandwidth is limited
n Provenance – history and ownership – of data very important
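Meta-data about services, as opposed to meta-data about web pages, can be pictured as a small record attached to each service and used to automate selection. The sketch below is an assumption-laden illustration: the `ServiceMetadata` fields, the `find_services` helper and the catalog entries are all hypothetical and are not taken from any Semantic Grid standard.

```python
from dataclasses import dataclass, field


@dataclass
class ServiceMetadata:
    """Hypothetical meta-data record describing one web service."""
    name: str
    free_to_use: bool   # e.g. "this bioinformatics service is free"
    max_input_mb: int   # e.g. "input is of limited amount"
    provenance: list = field(default_factory=list)  # history and ownership notes


def find_services(catalog, needs_free: bool, input_mb: int):
    """Return names of services whose meta-data satisfies the request."""
    return [
        s.name
        for s in catalog
        if (s.free_to_use or not needs_free) and s.max_input_mb >= input_mb
    ]


catalog = [
    ServiceMetadata("gene_alignment", True, 50, ["owned by lab A"]),
    ServiceMetadata("image_processing", False, 500, ["owned by observatory B"]),
    ServiceMetadata("sequence_search", True, 5, ["owned by lab C"]),
]

matches = find_services(catalog, needs_free=True, input_mb=20)
```

Here the search keeps only the free service that accepts a 20 MB input; the provenance list travels with each record so that history and ownership stay attached to the data.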
A typical Web Service
n In principle, services can be in any language (Fortran .. Java ..
Perl .. Python) and the interfaces can be method calls, Java RMI Messages, CGI Web invocations, or be totally compiled away (inlining)
n The simplest implementations involve XML messages (SOAP) and
programs written in net friendly languages like Java and Python
[Figure: a typical Web Service. Labels: Portal Service; Security; Catalog; Payment (Credit Card); Warehouse; Shipping control; WSDL interfaces; Web Services]
Typical Grid Architecture
[Figure: Labels: Raw (HPC) Resources; Database; Middleware; System Services (“Core Grid”); Application Service; Portal Services; User Services. Each Blob is a Computer Program!]
Classic Grid Architecture
[Figure: Labels: Resources (Database, Netsolve, Computing, Security, Collaboration, Composition, Content Access); Middle Tier (Brokers, Service Providers); Clients (Users and Devices). Middle Tier becomes Web Services]
Peer to Peer Grid
[Figure: Peer to Peer Grid, a democratic organization. Labels: Databases; Peers; User Facing Web Service Interfaces; Service Facing Web Service Interfaces; Event/Message Brokers]
What is Happening?
n Grid ideas are being developed in (at least) four communities
• Web Service – W3C, OASIS, (DMTF)
• Grid Forum (High Performance Computing, e-Science)
• Enterprise Grid Alliance (Commercial “Grid Forum” with a
near term focus)
n Service Standards are being debated
n Grid Operational Infrastructure is being deployed
n Grid Architecture and core software being developed
• Apache has several important projects as do academia; large
and small companies
n Particular System Services are being developed “centrally” –
OGSA framework for this in GGF; WS-* for OASIS/W3C/Microsoft-IBM
n Lots of fields are setting domain specific standards and building
domain specific services
n USA started but now Europe is probably in the lead and Asia
will soon catch USA if momentum (roughly zero for USA) continues
Technical Activities of Note
n Look at different styles of Grids such as Autonomic (Robust
Reliable Resilient)
n New Grid architectures hard due to investment required
n Program the Grid – Workflow
n Access the Grid – Portals, Grid Computing Environments
n Critical Services Such as
• Security – build message based not connection based
• Notification – event services
• Metadata – Use Semantic Web, provenance
• Fabric and Service Management
• Databases and repositories – instruments, sensors
• Computing – Submit job, scheduling, distributed file systems
• Visualization, Computational Steering
• Network performance
Low Level WS-*
High Level
Web services
• Web Services build loosely-coupled, distributed applications (wrapping existing codes and databases) based on the SOA (service oriented architecture) principles.
• Web Services interact by exchanging messages in SOAP format
• The contracts for the message exchanges that implement those interactions are described via WSDL interfaces.
Philosophy of Web Service Grids
• Much of Distributed Computing was built by natural extensions of computing models developed for sequential machines
• This leads to the distributed object (DO) model represented by Java and CORBA
– RPC (Remote Procedure Call) or RMI (Remote Method Invocation) for Java
• Key people think this is not a good idea as it scales badly and ties distributed entities together too tightly
– Distributed Objects Replaced by Services
• Note CORBA was considered too complicated in both organization and proposed infrastructure
– and Java was considered as “tightly coupled to Sun”
– So there were other reasons to discard
• Thus replace distributed objects by services connected by “one-way” messages and not by request-response messages
Plethora of Standards
• Java is very powerful partly due to its many “frameworks” that generalize libraries e.g.
– Java Media Framework
– Java Database Connectivity JDBC
• Web Services have a corresponding collection of specifications that represent critical features of the distributed operating systems for “Grids of Simple Services”
– About 60 WS-* specifications introduced in last 2-3 years
– These are low level with higher level standards such as access database (OGSA-DAI) or “Submit a job” built on top of these
• Many battles both between standards bodies and between companies as each tries to set standards they consider best; thus there are multiple standards for many of the key Web Service functionalities
• Microsoft a key player and stands to benefit as Web Services open up enterprise software space to all participants
– e.g. MQSeries (IBM) and Tibco have to change their messaging systems to support new open standards
WS-I Interoperability
• Critical underpinning of Grids and Web Services is the gradually growing set of specifications in the Web Service Interoperability Profiles
• Web Services Interoperability (WS-I) Interoperability Profile 1.0a (http://www.ws-i.org) gives us XSD, WSDL1.1, SOAP1.1, UDDI in basic profile and parts of WS-Security in their first security profile.
• We imagine the “60 Specifications” being checked out and evolved in the cauldron of the real world and occasionally best practice identifies a new specification to be added to WS-I which gradually increases in scope
– Note only 4.5 out of 60 specifications have “made it” in this definition
Layered Architecture for Web Services and Grids
[Figure: layered stack. Service Internet layers: Application Specific Grids; Generally Useful Services and Grids; Workflow WSFL/BPEL; Service Management (“Context etc.”); Service Discovery (UDDI) / Information; Service Internet Transport Protocol; Service Interfaces WSDL. Base Hosting Environment. Bit level Internet (OSI Stack): Protocol HTTP FTP DNS …; Presentation XDR …; Session SSH …; Transport TCP UDP …; Network IP …; Data Link / Physical]
WS-* implies the Service Internet
We have the classic (CISCO, Juniper ….) Internet routing the
flood of ordinary packets in OSI stack architecture
Web Services build the “Service Internet” or IOI (Internet on
Internet) with
• Routing via WS-Addressing not IP header
• Fault Tolerance (WS-RM not TCP)
• Security (WS-Security/SecureConversation not IPSec/SSL)
• Data Transmission by WS-Transfer not HTTP
• Information Services (UDDI/WS-Context not DNS/Configuration files)
• At message/web service level and not packet/IP address level
Software-based Service Internet possible as computers “fast”
Familiar from Peer-to-peer networks and built as a software
overlay network defining Grid (analogy is VPN)
SOAP Header contains all information needed for the “Service
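To make the idea of routing at the message level concrete, the sketch below builds a SOAP envelope whose header carries WS-Addressing fields, so the destination lives in the message rather than in the IP header. It is a minimal illustration using only the Python standard library. The namespaces are those of SOAP 1.1 and WS-Addressing 1.0 (the slide does not say which versions are meant), and the endpoint URL, action URI, message id, `Payload` element and the helper name `build_envelope` are all hypothetical.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope
WSA_NS = "http://www.w3.org/2005/08/addressing"        # WS-Addressing 1.0


def build_envelope(to: str, action: str, message_id: str, payload: str) -> bytes:
    """Hypothetical helper: wrap `payload` in a SOAP envelope with addressing headers."""
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")

    # The header carries what an overlay router needs: destination, action, id.
    header = ET.SubElement(envelope, f"{{{SOAP_NS}}}Header")
    ET.SubElement(header, f"{{{WSA_NS}}}To").text = to
    ET.SubElement(header, f"{{{WSA_NS}}}Action").text = action
    ET.SubElement(header, f"{{{WSA_NS}}}MessageID").text = message_id

    # The body carries the application data (an illustrative element name).
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    ET.SubElement(body, "Payload").text = payload

    return ET.tostring(envelope, encoding="utf-8")


envelope = build_envelope(
    to="http://example.org/services/simulation",  # hypothetical endpoint
    action="http://example.org/actions/submit",   # hypothetical action URI
    message_id="urn:uuid:00000000-0000-0000-0000-000000000001",
    payload="run earthquake scenario 7",
)
```

A broker in the software overlay could read the `To` header and forward the message without caring which transport delivered it.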
Consequences of Rule of the Millisecond
• Useful to remember critical time scales
– 1) 0.000001 ms – CPU does a calculation
– 2a) 0.001 to 0.01 ms – Parallel Computing MPI latency
– 2b) 0.001 to 0.01 ms – Overhead of a Method Call
– 3) 1 ms – wake-up a thread or process
– 4) 10 to 1000 ms – Internet delay
• 2a), 4) implies geographically distributed metacomputing can’t in general compete with parallel systems
• 3) << 4) implies a software overlay network is possible without significant overhead
– We need to explain why it adds value of course!
• 2b) versus 3) and 4) describes regions where method and message based programming paradigms important
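The time scales above can be turned into a rough rule of thumb. The sketch below is illustrative only: the numbers are the order-of-magnitude figures from the slide, not measurements, and the helper names `choose_linkage` and `overlay_overhead_fraction` are hypothetical.

```python
# Order-of-magnitude time scales from the slide, in milliseconds.
TIME_SCALES_MS = {
    "cpu_calculation": 1e-6,           # 1)
    "mpi_latency": (1e-3, 1e-2),       # 2a)
    "method_call": (1e-3, 1e-2),       # 2b)
    "thread_wakeup": 1.0,              # 3)
    "internet_delay": (10.0, 1000.0),  # 4)
}


def choose_linkage(required_latency_ms: float) -> str:
    """Pick a paradigm: methods below the ~1 ms threshold, messages otherwise."""
    if required_latency_ms < TIME_SCALES_MS["thread_wakeup"]:
        return "method"
    return "message"


def overlay_overhead_fraction(internet_delay_ms: float) -> float:
    """Cost of one ~1 ms software hop relative to the network delay it rides on."""
    return TIME_SCALES_MS["thread_wakeup"] / internet_delay_ms


tight_loop = choose_linkage(0.01)     # needs method-call speed
remote_job = choose_linkage(50.0)     # tolerates Internet-scale delay
overhead = overlay_overhead_fraction(100.0)
```

With a 100 ms Internet delay, one extra software hop of about 1 ms adds roughly 1% to the latency, which is the sense in which 3) << 4) makes a software overlay network affordable.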
Linking Modules
n From method based to RPC to message based to event-based publish-subscribe Message Oriented Middleware
[Figure: three ways of linking modules]
• Closely coupled Java/Python …: Module A and Module B linked by Method Call, .001 to 1 millisecond
• Coarse Grain Service Model: Service A and Service B linked by Messages, 0.1 to 1000 millisecond latency
• Publish-subscribe: Publisher Post Events; “Listener” Subscribe to Events; Message Queue in the
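The last step of this progression, event-based publish-subscribe, can be sketched in a few lines. This is an in-process toy, not Message Oriented Middleware: the `Broker` class, the topic name and the event contents are hypothetical, and a real broker would sit on the network between services.

```python
from collections import defaultdict
from typing import Callable


class Broker:
    """Hypothetical in-process event broker (illustrative only)."""

    def __init__(self) -> None:
        self._subscribers = defaultdict(list)  # topic -> list of listeners

    def subscribe(self, topic: str, listener: Callable[[dict], None]) -> None:
        """Register a listener; the publisher never learns who it is."""
        self._subscribers[topic].append(listener)

    def publish(self, topic: str, event: dict) -> int:
        """Deliver `event` to every listener on `topic`; return the count."""
        listeners = self._subscribers.get(topic, [])
        for listener in listeners:
            listener(event)
        return len(listeners)


received = []
broker = Broker()
broker.subscribe("sensor/readings", received.append)  # the "Listener"
count = broker.publish("sensor/readings", {"value": 42})
unheard = broker.publish("sensor/alarms", {"level": "high"})  # nobody subscribed
```

Unlike a method call, the publisher does not name the receiver, which is what lets services be added or removed without changing the code that posts events.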
What is a High Performance Computer?
n We might wish to consider three classes of multi-node computers
n 1) Classic MPP with microsecond latency and scalable internode bandwidth (tcomm/tcalc ~ 10 or so)
n 2) Classic Cluster which can vary from configurations like 1) to 3)
but typically have millisecond latency and modest bandwidth
n 3) Classic Grid or distributed systems of computers around the
network
• Latencies of inter-node communication – 100’s of milliseconds
but can have good bandwidth
n All have same peak CPU performance but synchronization costs
increase as one goes from 1) to 3)
n Cost of system (dollars per gigaflop) decreases by factors of 2 at
each step from 1) to 2) to 3)
n One should NOT use classic MPP if class 2) or 3) suffices unless
some security or data issue dominates over cost-performance
n One should not use a Grid as a true parallel computer – it can
link parallel computers together for convenient access etc.
What is a Simple Service?
• Take any system – it has multiple functionalities
– We can implement each functionality as an independent distributed service
– Or we can bundle multiple functionalities in a single service
• Whether functionality is an independent service or one of many method calls into a “glob of software”, we can always make them Web services by converting the interface to WSDL
• Simple services are obtained by taking functionalities and making them as small as possible subject to “rule of millisecond”
– Distributed services incur messaging overhead of one (local) to
100’s (far apart) of milliseconds to use message rather than method call
– Use scripting or compiled integration of functionalities ONLY
when one requires <1 millisecond interaction latency
• Apache web site has many projects that are multiple functionalities presented as (Java) globs and NOT (Java) Simple Services
– Makes it hard to integrate sharing common security, user profile, file access .. services
Grids of Grids of Simple Services
• Link via methods, messages, streams
• Services and Grids are linked by messages
• Internally to service, functionalities are linked by methods
• A simple service is the smallest Grid
• We are familiar with method-linked hierarchy
[Figure: method-linked hierarchy. Labels: Lines of Code; Methods; Objects; Programs; Packages]
[Figure: Overlay and Compose Grids of Grids. Labels: Methods; Services; Component Grids; CPUs; Clusters; MPPs; Compute Resource Grids; Databases; Federated Databases; Sensor; Sensor Nets; Data]
Component Grids?
• So we build collections of Web Services which we package as component Grids
– Visualization Grid
– Sensor Grid
– Utility Computing Grid
– Person (Community) Grid
– Earthquake Simulation Grid
– Control Room Grid
– Crisis Management Grid
• We build bigger Grids by composing component Grids using the Service Internet
Critical Infrastructure (CI) Grids built as Grids of Grids
[Figure: Grids of Grids. Critical Infrastructure Grids: Flood CIGrid (Flood Service and Filters); Gas CIGrid (Gas Service and Filters); Electricity CIGrid; … Component Grids: Visualization Grid; Collaboration Grid; Sensor Grid; Compute Grid; GIS Grid. Core Grid Services: Physical Network; Registry; Metadata; Data Access/Storage; Security; Notification; Workflow; Messaging; Portals]
Two-level Programming I
n The paradigm implicitly assumes a two-level Programming Model
n We make a Service (same as a “distributed object” or “computer program” running on a remote computer) using conventional technologies
• C++ Java or Fortran Monte Carlo module
• Data streaming from a sensor or Satellite
• Specialized (JDBC) database access
n Such services accept and produce data from users files and database
n The Grid is built by coordinating such services assuming we have solved problem of programming the service
[Figure: a Service and its Data]
Two-level Programming II
n The Grid is discussing the composition of distributed services with the runtime interfaces to Grid as opposed to UNIX pipes/data streams
n Familiar from use of UNIX Shell, PERL or Python scripts to produce real applications from core programs
n Such interpretative environments are the single processor analog of Grid Programming
n Some projects like GrADS from Rice University are looking at integration between service and composition levels but dominant effort looks at each level separately
[Figure: Service 1, Service 2, Service 3 and Service 4 linked together]
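The second programming level, composing services much as a shell script composes programs with pipes, can be sketched as below. Everything here is a stand-in: the three "services" are ordinary local functions with hypothetical names, whereas in a Grid each would be a remote service exchanging messages, and `compose` plays the role a workflow engine would play.

```python
from functools import reduce
from typing import Callable, Sequence

Service = Callable[[dict], dict]  # a service takes a message, returns a message


def compose(services: Sequence[Service]) -> Service:
    """Chain services so each one's output message feeds the next."""
    def workflow(message: dict) -> dict:
        return reduce(lambda msg, svc: svc(msg), services, message)
    return workflow


# Level one: each service is programmed with conventional technology.
def read_sensor(msg: dict) -> dict:
    return {**msg, "raw": [3, 1, 2]}  # stand-in for streaming data


def filter_data(msg: dict) -> dict:
    return {**msg, "clean": sorted(msg["raw"])}


def summarize(msg: dict) -> dict:
    return {**msg, "max": msg["clean"][-1]}


# Level two: the application is the composition, like `a | b | c` in a shell.
pipeline = compose([read_sensor, filter_data, summarize])
result = pipeline({"source": "station-1"})
```

The composition level never looks inside a service; it only passes messages along, which is why the two levels can be developed separately.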
What Should One Do?
n Grids and Service Oriented Architectures will
• Change landscape in mature areas like enterprise software
• Support new distributed applications in Science, Government, Education, Business and Community areas
• Encourage trends like outsourcing and globalization in all activities
n Web Service/Grid standards and infrastructure are still in their infancy but broad principles reasonably clear
n Many large scale software development activities are inconsistent with modern architectures