Introduction to Grid Computing
What is the GRID?
Grid Computing Technology
• Grid computing enables the virtualization of distributed computing and data resources such as processing, network bandwidth, and storage capacity to create a single system image, granting users and applications seamless access to vast IT capabilities.
• Just as an Internet user views a unified instance of content via the Web, a grid user essentially sees a single, large virtual computer.
The Application-Infrastructure Gap
[Figure: dynamic and/or distributed applications; global community]
What can the Grid really do?
• Help you build a secure IT infrastructure
• Remote job submission so you can distribute your workload
• Middleware and programming APIs for developing distributed services and the applications that use them
Grid Computing’s Potential
[Figure: virtual computers and virtual databases spanning UNC-CH, NCSU, Duke, WFU, WSSU, NCArts, NCAT, UNC-C, UNC-A, ECSU, WCU, ASU, ECU, UNC-G, NCCU, UNC-W, UNC-P, FSU]
• Unified view of data and computers
– Computers and data appear to be local
• Efficient access to large data sets
– Caching
– Replication
• Attributes
– Single sign-on, security
– Policy-based resource sharing
Types of Grid
• Computational Grid
– Presents a virtual pool of computing resources
• Data Grid
– Presents a virtual pool of storage
• Access Grid
Technology and Trends
Brief History of Grid Computing
• Generation 0: No Grid, only the Internet and the Web
• Generation 1: Internet Computing
– SETI@home, peer-to-peer
– Many experimental grid and metacomputing systems
• Generation 2: Execution-Oriented Grid
– Use the grid as a batch supercomputing system
– Basic set of services, no standardized access
– Workable security based on X.509
• Generation 3: Service-Oriented Grid
– Distributed services accessible globally
– Based on Web Services standards
What is really the Grid?
• A set of computers, servers, and clusters connected by a good network
• Linux or Windows on each machine (mostly Linux)
• Grid middleware installed, mostly Globus
• Middleware set up properly, including the certificate infrastructure
• Tools used on top
– Scheduler to distribute the computing workload
• Gridified applications
– Use the Globus library or another library such as MPI
– Grid services
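To make the scheduler's role concrete, here is a minimal Python sketch of one way a scheduler could spread a workload over machines: a greedy "largest job first, least-loaded node next" policy. This is an illustration only, not how Globus or any particular grid scheduler works; the function name, job names, node names, and cost estimates are all hypothetical.

```python
import heapq
from typing import Dict, List, Tuple


def distribute_jobs(job_costs: Dict[str, float], nodes: List[str]) -> Dict[str, List[str]]:
    """Greedy load balancing: give each job to the currently least-loaded node.

    job_costs maps a (hypothetical) job name to its estimated run time.
    """
    if not nodes:
        raise ValueError("at least one node is required")
    # Min-heap of (accumulated_cost, node_name); the least-loaded node is on top.
    heap: List[Tuple[float, str]] = [(0.0, node) for node in nodes]
    heapq.heapify(heap)
    plan: Dict[str, List[str]] = {node: [] for node in nodes}
    # Place the most expensive jobs first; ties are broken by job name.
    for job, cost in sorted(job_costs.items(), key=lambda kv: (-kv[1], kv[0])):
        load, node = heapq.heappop(heap)
        plan[node].append(job)
        heapq.heappush(heap, (load + cost, node))
    return plan


if __name__ == "__main__":
    jobs = {"a": 4.0, "b": 3.0, "c": 2.0, "d": 1.0}
    print(distribute_jobs(jobs, ["n1", "n2"]))
```

A real grid scheduler would also account for data location, queue policies, and node failures, which this sketch ignores.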
Components of the Grid
• High-speed network
• Grid middleware
• Grid-enabled application
Grid Middleware
• Key technology in Grid Computing
– Provides uniform, high-level access to a wide range of resources (including networks)
– Addresses interdomain issues of security, policy, etc.
– Permits application-level management and monitoring of end-to-end performance
• Middleware-level and higher-level APIs, libraries, and tools for programmers to implement
The Globus Project
• Project by
– Argonne National Laboratory
– University of Southern California's Information Sciences Institute
– University of Chicago
– University of Edinburgh
– Swedish Center for Parallel Computers
• Leader
– Ian Foster and Carl Kesselman
• Produces open-source software that is a de facto grid standard
• Globus is now maintained by the Globus Alliance
Grid Architecture
• Fabric layer
– Protocols and interfaces that provide access to computing resources such as CPU and storage
• Connectivity layer
– Protocols for grid-specific network transactions, such as security (GSI)
• Resource layer
– Protocols to access a single resource from an application
• GRAM (Grid Resource Allocation and Management)
• GridFTP (data access)
• Grid Resource Information Service
• Collective layer
– Protocols that manage and access groups of resources
[Figure: layer stack: Fabric, Connectivity, Resource, Collective, Application]
Grid Security Infrastructure
• Core part of grid system
• Extension of
– SSL (Secure Sockets Layer) / TLS (Transport Layer Security)
– X.509 Certificate
• Provides
– Single sign-on capability
How GSI works
[Figure: GSI flow. The user obtains a signed certificate from a CA; the user proxy, holding a proxy credential, provides single sign-on using the grid certificate and ID; remote process creation requests go to the gatekeeper of each GSI-enabled GRAM server, which maps the grid identity to a local ID and starts the process.]
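The mapping step, where the gatekeeper turns a grid identity into a local ID, can be sketched in a few lines of Python. The sketch assumes a map file in the style of a Globus grid-mapfile (a quoted certificate subject followed by a local account name); the subjects and account names below are invented for illustration, and certificate verification itself is not shown.

```python
import shlex
from typing import Dict, Optional

# Hypothetical map file contents: quoted certificate subject, then local account.
SAMPLE_GRIDMAP = '''
# subject                                   local id
"/O=Grid/OU=KU/CN=Somchai Example"          somchai
"/O=Grid/OU=AIT/CN=Jane Example"            jane,guest
'''


def parse_gridmap(text: str) -> Dict[str, str]:
    """Return {certificate subject: first listed local account}."""
    mapping: Dict[str, str] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        parts = shlex.split(line)  # honors the double quotes around the subject
        if len(parts) != 2:
            continue  # ignore malformed entries in this sketch
        subject, accounts = parts
        mapping[subject] = accounts.split(",")[0]
    return mapping


def local_id_for(subject: str, mapping: Dict[str, str]) -> Optional[str]:
    """Look up the local account for an already-authenticated subject."""
    return mapping.get(subject)


if __name__ == "__main__":
    gridmap = parse_gridmap(SAMPLE_GRIDMAP)
    print(local_id_for("/O=Grid/OU=KU/CN=Somchai Example", gridmap))
```

A subject with no entry yields `None`, which corresponds to the gatekeeper refusing the request.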
GRAM
[Figure: Application, Broker, Co-allocator, and MDS; RSL requests passed to GRAM servers in front of LSF, SGE, and PBS]
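RSL (Resource Specification Language) is the job description passed to GRAM. As a rough illustration, the Python sketch below assembles a simple RSL conjunction of `(attribute=value)` clauses. The helper names are hypothetical and not part of any Globus library, and quoting every value is a simplification made here for safety rather than a requirement of RSL.

```python
from typing import Sequence, Union

Value = Union[str, int, Sequence[Union[str, int]]]


def quote(value: Union[str, int]) -> str:
    """Wrap a value in double quotes, doubling any embedded quote."""
    return '"' + str(value).replace('"', '""') + '"'


def build_rsl(**attrs: Value) -> str:
    """Build a '&(name=value)...' request from keyword arguments.

    Lists and tuples become space-separated values, as used for
    multi-valued attributes such as 'arguments'.
    """
    clauses = []
    for name, value in attrs.items():
        if isinstance(value, (list, tuple)):
            rendered = " ".join(quote(v) for v in value)
        else:
            rendered = quote(value)
        clauses.append(f"({name}={rendered})")
    return "&" + "".join(clauses)


if __name__ == "__main__":
    print(build_rsl(executable="/bin/echo", arguments=["hello", "grid"], count=2))
```

The attribute names used in the example (`executable`, `arguments`, `count`) are common RSL attributes; which ones a given GRAM installation accepts depends on its configuration.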
Bridging the Gap: Service-Oriented Infrastructure
• Service-oriented applications
– Wrap applications as services
– Compose applications into workflows
• Service-oriented infrastructure
– Provision physical resources to support application workloads
[Figure: users invoke workflows composed from applications wrapped as services (labels: Users, Workflows, Composition, Invocation, Appln, Service)]
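The two ideas of wrapping applications as services and composing them into workflows can be sketched with a toy, in-process Python registry. This is only a stand-in for real grid or Web services: the class, the service names, and the sample functions are all hypothetical, and no network transport or security is involved.

```python
from typing import Any, Callable, Dict, List


class ServiceRegistry:
    """Toy registry: each 'service' is a named function taking one payload."""

    def __init__(self) -> None:
        self._services: Dict[str, Callable[[Any], Any]] = {}

    def register(self, name: str, func: Callable[[Any], Any]) -> None:
        """Wrap an existing application function as a named service."""
        self._services[name] = func

    def invoke(self, name: str, payload: Any) -> Any:
        """Invoke one service by name; raises KeyError if it is unknown."""
        return self._services[name](payload)

    def run_workflow(self, steps: List[str], payload: Any) -> Any:
        """Compose services: feed each step's output into the next step."""
        for name in steps:
            payload = self.invoke(name, payload)
        return payload


if __name__ == "__main__":
    registry = ServiceRegistry()
    registry.register("tokenize", str.split)  # application 1: split text into words
    registry.register("count", len)           # application 2: count the items
    print(registry.run_workflow(["tokenize", "count"], "wrap compose provision"))
```

In a service-oriented grid the registry lookup and each invocation would cross the network, but the composition pattern is the same.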
Open Grid Services Architecture (OGSA)
• Reference architecture for next-generation grid technology
• Merges Grid and Web Services technologies
• Open standards for grid computing
– Sponsored by the GGF (an organization modeled after the IETF)
– Primary working groups: OGSA and OGSI
– Many vendors involved: IBM, Sun, Oracle, AVAKI, UD, etc.
• (But ANL and IBM seem to have the upper hand)
Globus as Service-Oriented Infrastructure
[Figure: user applications and tools reach computers, storage, and specialized resources through Globus services (GRAM, GridFTP, DAIS, Reliable File Transfer, MyProxy, MDS-Index) and user services running in host environments, sharing uniform interfaces, security mechanisms, Web service transport, and monitoring]
Applying the Grid
How to Develop Grid Applications
• Use the grid as a parallel supercomputer
– Use grid-enabled MPI (MPICH-G2) for parallel programming
• Mostly applies to technical computing
• Use the grid as a powerful server system
– Use a portal plus a grid scheduler to distribute jobs
– Load balancing in the scheduler
• Use the grid to develop SOA applications
– Grid services and workflows
Traditional Grid-Enabled Application
[Figure: a user station reaches compute resources over the Internet through a resource broker, with tasks dispatched by grid and cluster middleware (MPI, Globus)]
Service Oriented Grid
[Figure: service-oriented grid. Servers run an OS, grid middleware, and services; services are registered; applications and workflows reach them over the network]
Potential Applications
• E-commerce
– Infrastructure for B2B communication
• E-government
– Secure information exchange for government operations
• Smart card, e-citizen
• E-Military
Potential Applications
• E-health care
– Secure information exchange for the health care industry
• Thailand population: 63 million people
• Public hospitals: 200
• Private hospitals: 337
• 30 Baht Project: 1,480 baht per capita
• 27,000 doctors countrywide
– Massive storage and processing are needed for medical data
• Patient records
• Diagnostic records
Potential Applications
• E-entertainment
– Video streaming, secure exchange of digital media
– Distributed gaming servers, secure subscription to gaming services
• E-learning
– Access Grid
What industries are using grid computing now?
• Automotive and aerospace
– for collaborative design and data-intensive testing
• Financial services
– for running long, complex scenarios and arriving at more accurate decisions
• Life sciences
– for analyzing and decoding strings of biological and chemical information
• Government
– for enabling seamless collaboration and agility in both civil and military departments and agencies
• Higher education
– for enabling advanced data- and compute-intensive research
An eBusiness Use of Globus: SAP Demonstration @ GlobusWorld
• 3 Globus-enabled applications:
– CRM: Internet Pricing Configurator (IPC)
– CRM: Workforce Management (WFM)
– SCM: Advanced Planner & Optimizer (APO)
• Applications modified to:
– Adjust to varying demand & resources
– Use Globus to discover & provision resources
[Figure: SAP AG R/3 Internet Pricing & Configurator (IPC). Web browsers and batch processes (typically several thousand requests) send price queries to the IPC dispatcher, which delegates each request to an IPC server; the response is a price list depending on time, discount, number of items, …]
ThaiGrid Drug Discovery Project
• Partners: KU/IBM
• Problem: From over 1,000 active compounds available in the Thai medicinal plants database, find the smallest set of compounds with the potential to be used as a drug
– Very compute intensive: several months of computing time
• Solution: Use the grid to increase computing power by 10 to 100 times
Concept
[Figure: a user station reaches compute resources over the Internet through a resource broker, with tasks dispatched by grid and cluster middleware (MPI, Globus)]
[Figure: components: portal and portlet, broker server, registration server, scheduler, MMJFS, GAMESS service, file server, OGSA-DAI, backend DB, molecular DB, GridFTP]
Grid-based data assimilation using GA and Remote Sensing
• Partners: KU/AIT
• Problem:
– Very long computation time, from months to years
– Large amounts of RS data need to be moved around and processed
• Solution:
– Use the grid to harvest more computing power
– Hide the data assimilation process behind the service
8/22/2006 TAM2005 37
Massive Rendering of Graphics Images
• Partners: KU/Imagimax
• Problem:
– Rendering a massive number of images for animation requires a massive amount of computing power
• Solution:
– Grid-based rendering support middleware based on NPACI ROCKS
– Use the grid to link multiple clusters together
– Portal to submit
Nimrod-G, Australia
[Figure: Nimrod-G application areas: Astrophysics; Air Pollution; Antenna Design; Airfoil Design; Circuit Design; Monte Carlo; Computational Chemistry; Public Health Policy; CFD; Cardiac Modelling; Climate]
A Typical eScience Use of Globus: Network for Earthquake Eng. Simulation
Links instruments, data, computers, people
Grids for Physics:
LHC Computing Grid
Grid2003 → Open Science Grid
• 30 sites (2100-2800 CPUs) & growing
• 400-1300 concurrent jobs
• 8 substantial applications + CS experiments
• Running since October 2003
Korea
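System architecture: DartGrid's layered model
“We’ve used GT3 to build the largest database grid system for Traditional Chinese Medicine, integrating about 50 TCM-relevant databases.” (Zhejiang U.)
[Figure: DartGrid screenshot with panels labeled VO address bar, semantic browsing panel, ontology tree, virtual organization resource browsing panel, Q3 semantic query display panel, and semantic registration panel]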
Global Grid Forum (www.ggf.org)
• The Global Grid Forum (GGF) is a community-initiated forum of researchers and practitioners working on "grid" technologies.
• GGF's primary objectives are to
– promote and support the development, deployment, and implementation of Grid technologies and applications
– create and document "best practices": technical specifications, user experiences, and implementation guidelines
• GGF has many working groups working on various grid standards
Conclusion
• Grid Computing is a high-impact technology
– Listed by MIT as one of the 10 most important technologies that will change the world
• The Grid can help make the use of IT infrastructure more efficient by
– Providing uniform access and security
– Virtualizing the resources