• No results found

Grid Computing Research

N/A
N/A
Protected

Academic year: 2021

Share "Grid Computing Research"

Copied!
29
0
0

Loading.... (view fulltext now)

Full text

(1)

Grid Computing Research

Ian Foster

Mathematics and Computer Science Division Argonne National Laboratory

and

Department of Computer Science The University of Chicago

(2)

[email protected] ARGONNE CHICAGO

Overview

z Grid computing & its importance to DOE z Grid R&D at Argonne

z Defining and instantiating a Grid Services

architecture

z Data Grids

(3)

[email protected] ARGONNE CHICAGO

On-demand creation of powerful virtual computing systems

The Grid: The Web on Steroids

http:// http:// Web: Uniform access to HTML documents Grid: Flexible, high-perf access to all significant resources Sensor nets Data archives Computers Software catalogs Colleagues

(4)

[email protected] ARGONNE CHICAGO

Grid Computing: Take 2

z Enable communities (“virtual organizations”)

to share geographically distributed resources as they pursue common goals—in the

absence of central control, omniscience, trust relationships

z Via investigations of

– New applications that become possible when resources can be shared in a coordinated way – Protocols, algorithms, persistent

(5)

[email protected] ARGONNE CHICAGO

Grids and DOE

z Distinctive characteristics of much DOE

science encourages Grid concepts

– Unique and expensive facilities: accelerators, microscopes, supercomputers, …

– Large-scale, multidisciplinary science: climate, materials, high energy physics, …

– Large-scale simulation, data-intensive science

z The question is not whether to Grid-enable

(6)

[email protected] ARGONNE CHICAGO

Grid Communities & Applications:

Data Grids for High Energy Physics

Tier2 Centre ~1 TIPS Online System

Offline Processor Farm ~20 TIPS

CERN Computer Centre

FermiLab ~4 TIPS France Regional

Centre Germany Regional Centre Italy Regional Centre

Institute Institute Institute Institute ~0.25TIPS Physicist workstations ~100 MBytes/sec ~100 MBytes/sec ~622 Mbits/sec ~1 MBytes/sec

There is a “bunch crossing” every 25 nsecs. There are 100 “triggers” per second Each triggered event is ~1 MByte in size

Physicists work on analysis “channels”.

Each institute will have ~10 physicists working on one or more channels; data for these channels should be cached by the institute server

Physics data cache

~PBytes/sec

~622 Mbits/sec or Air Freight (deprecated)

Tier2 Centre ~1 TIPS Tier2 Centre ~1 TIPS Tier2 Centre ~1 TIPS Caltech ~1 TIPS ~622 Mbits/sec Tier 0 Tier 0 Tier 1 Tier 1 Tier 2 Tier 2 Tier 4 Tier 4 1 TIPS is approximately 25,000 SpecInt95 equivalents

(7)

[email protected] ARGONNE CHICAGO tomographic reconstruction real-time collection wide-area dissemination

desktop & VR clients with shared controls

Advanced Photon Source

Grid Communities & Applications:

Online Instrumentation

archival storage

(8)

[email protected] ARGONNE CHICAGO

Grid Communities and Applications:

Mathematicians Solve NUG30

z Community=an informal

collaboration of

mathematicians and computer scientists

z Condor-G delivers 3.46E8

CPU seconds in 7 days

(peak 1009 processors) in U.S. and Italy (8 sites)

z Solves NUG30 quadratic

assignment problem

14,5,28,24,1,3,16,15, 10,9,21,2,4,29,25,22, 13,26,17,30,6,20,19, 8,18,7,27,12,11,23

(9)

[email protected] ARGONNE CHICAGO

Grid R&D at Argonne

z Started in 1995 (I-WAY experiment, s/w)

z Globus R&D (much joint with USC/ISI)

– Innovative security, resource management, data access, information, communication, fault detection, etc., etc. technologies

– Large user base among tool developers

– Widespread adoption in “production” grids: e.g., NASA IPG, NSF NTG, DOE DISCOM

– Exciting application demonstrations

(10)

[email protected] ARGONNE CHICAGO Ambient mic (tabletop) Presenter mic Presenter camera Audience camera

Access Grid Collaboration Technology

z

Designed spaces for

group interactions

z

Hands free audio

z

Multiple video and

audio streams

(11)

[email protected] ARGONNE CHICAGO

Addressing Cross-Cutting

Technical Issues

z Development of Grid protocols & services

– Protocol-mediated access to remote resources – New services: e.g., resource brokering

– “On the Grid” = speak Intergrid protocols – Mostly (extensions to) existing protocols

z Development of Grid APIs & SDKs

– Facilitate application development by supplying higher-level abstractions

z The (hugely successful) model is the Internet

(12)

[email protected] ARGONNE CHICAGO

Layered Grid Architecture

(By Analogy to Internet Architecture)

Application

Fabric

“Controlling things locally”: Access to, & control of, resources

Connectivity

“Talking to things”: communication (Internet protocols) & security

Resource

“Sharing single resources”:

negotiating access, controlling use

Collective

“Managing multiple resources”: ubiquitous infrastructure services

User

“Specialized services”: user- or appln-specific distributed services

Internet Transport Application

Link

(13)

[email protected] ARGONNE CHICAGO

Instantiating the Grid Architecture:

Connectivity Layer Protocols & Services

z Communication

– Internet protocols: IP, DNS, routing, etc.

z Security: Grid Security Infrastructure (GSI)

– Uniform authentication & authorization mechanisms in multi-institutional setting

– Single sign-on, delegation, identity mapping – Public key technology, SSL, X.509, GSS-API – Supporting infrastructure: Certificate

Authorities, key management, etc.

(14)

[email protected] ARGONNE CHICAGO

Instantiating the Grid Architecture:

Resource Layer Protocols & Services

z Grid Resource Allocation Mgmt (GRAM)

– Remote allocation, reservation, monitoring, control of compute resources

z GridFTP protocol (FTP extensions)

– High-performance data access & transport

z Grid Resource Information Service (GRIS)

– Access to structure & state information

z Network reservation, monitoring, control

z All integrated with GSI: authentication,

authorization, policy, delegation

(15)

[email protected] ARGONNE CHICAGO

Instantiating the Grid Architecture:

Collective Layer Protocols & Services

z Index servers aka metadirectory services

– Custom views on dynamic resource collections assembled by a community

z Resource brokers (e.g., Condor Matchmaker)

– Resource discovery and allocation

z Replica catalogs

z Co-reservation and co-allocation services z Etc., etc.

(16)

[email protected] ARGONNE CHICAGO Compute Resource SDK API Access Protocol Source Code Repository SDK API Lookup Protocol

Example: User Portal

Web Portal

Source code discovery, application configuration

Brokering, co-allocation, certificate authorities

Access to data, access to computers, access to network performance data, … Communication, service discovery (DNS), authentication, authorization, delegation Storage systems, clusters, networks, …

User Appln Collective Resource Connect Fabric

(17)

[email protected] ARGONNE CHICAGO Compute Resource SDK API Access Protocol Checkpoint Repository SDK API C-point Protocol

Example:

High-Throughput Computing System

High Throughput Computing System Dynamic checkpoint, job management, failover, staging

Brokering, certificate authorities

Access to data, access to computers, access to network performance data

Communication, service discovery (DNS), authentication, authorization, delegation Storage systems, schedulers

User Appln Collective Resource Connect Fabric

(18)

[email protected] ARGONNE CHICAGO

Data Grid R&D

z “Enable a geographically distributed

community to pool their resources to perform sophisticated, computationally intensive analyses on Petabytes of data”

z Specific issues addressed

– Network QoS for bulk data transfer – Replica management technologies

– High-speed data transfer protocols & tools – Community-based access control

(19)

[email protected] ARGONNE CHICAGO

Network Quality of Service Research:

Overall Picture

z Secure, policy-driven bandwidth allocation

for high-end applications

z Immediate and advance reservations

z Co-reservation and co-allocation of

multiple resources for end-to-end flows

z All supported in a modular architecture

widely used for Grid apps (Globus Toolkit)

z Experience with differentiated services

z Future opportunities in all-optical context z Collaborative effort with LBNL

(20)

[email protected] ARGONNE CHICAGO

The GARNET QoS Research Testbed

Linux Linux

Cisco 7507 Cisco 7507 Cisco 7507 ESnet Testbed ANL Ultra MREN EMERGE Testbed Ultra UW Madison iCAIR Univ of Chicago Univ of Illinois LBNL NREN Testbed NASA Ultra ATM Fast Ethernet Sandia Cave

(21)

[email protected] ARGONNE CHICAGO

Bulk Transfer (LAN)

0 20000 40000 60000 80000 100000 0 50 100 150 200 250 Time B a ndw idt h ( K bps ) background foreground competitive

When a reservation begins, the bulk-transfer backs off

When a reservation ends, the bulk-transfer speeds up

The competitive UDP traffic never interferes

(22)

[email protected] ARGONNE CHICAGO

Bulk Transfer (WAN)

0 10000 20000 30000 40000 50000 0 20 40 60 80 100 120 140 160 180 Time B a n d w id th (K b p s) background foreground competitive

(23)

[email protected] ARGONNE CHICAGO

Data Grid Architecture

Discipline-Specific Data Grid Application

Coherency control, replica selection, task management, virtual data catalog, virtual data code catalog, …

Replica catalog, replica management, co-allocation, certificate authorities, metadata catalogs,

Access to data, access to computers, access to network performance data, …

Communication, service discovery (DNS), authentication, authorization, delegation

Storage systems, clusters, networks, network caches, …

User Appln Collective Resource Connect Fabric

(24)

[email protected] ARGONNE CHICAGO

Globus Data-Intensive

Services Architecture

Library Program Legend globus-url-copy Replica Programs Custom Servers globus_gass_copy globus_ftp_client globus_ftp_control

globus_common GSI (security)

globus_io OpenLDAP client

globus_replica_catalog globus_replica_manager Custom Clients globus_gass_transfer globus_gass Already exist

(25)

[email protected] ARGONNE CHICAGO

High-Level View of Earth System Grid:

A Model Architecture for Data Grids

Metadata Catalog Replica Catalog Tape Library Disk Cache Attribute Specification

Logical Collection and Logical File Name

Disk Array Disk Cache

Application Replica Selection Multiple Locations NWS Selected Replica

GridFTP commands Performance Information & Predictions

Replica Location 1 Replica Location 2 Replica Location 3

(26)

[email protected] ARGONNE CHICAGO

Future Directions

z “Grid computing” represents a significant

success story for DOE research and

opportunity for turbocharging DOE science

z Timely to take things to the next level, by

– Aggressive programs focused on Grid-enabling key DOE applications (climate, physics, combustion, etc.)

– New efforts focused on security, next-gen optical technologies, Grid tools, etc.

– Infrastructure: faster networks, protocols, services (ESnet -> ESgrid?)

(27)

[email protected] ARGONNE CHICAGO

Abbreviated Acknowledgments

z Globus project Co-PI: Carl Kesselman

z Other Globus principals include: Steve

Tuecke, Ann Chervenak, Bill Allcock, Gregor von Laszewski, Lee Liming, Steve Fitzgerald

z Access Grid: Rick Stevens & many others

z DOE Grid collaborators: Arie Shoshani,

Reagan Moore, Miron Livny, Bill Johnston, Brian Tierney, others

z ESG collaborators: Dean Williams & others

(28)

[email protected] ARGONNE CHICAGO

Future Directions

z “Grid computing” represents a significant

success story for DOE research and

opportunity for turbocharging DOE science

z Timely to take things to the next level, by

– Aggressive programs focused on Grid-enabling key DOE applications (climate, physics, combustion, etc.)

– New efforts focused on security, next-gen optical technologies, Grid tools, etc.

– Infrastructure: faster networks, protocols, services (ESnet -> ESgrid?)

(29)

[email protected] ARGONNE CHICAGO

For More Information

z Book (Morgan Kaufman)

– www.mkp.com/grids z Globus – www.globus.org z Grid Forum – www.gridforum.org z

PPDG

– www.ppdg.net

References

Related documents

El siguiente resumen del cuadro clínico de la familia intenta exponer algunas de las cuestiones que los antropólogos podrían aclarar. Desgraciadamente lo que se esboza aquí no es

【注】 1

However, composites with 2 vol% silica had the best mechanical properties of specimens with silica nanoparticles included at 43.8 MPa and 3.05 GPa for flexural strength and

Also configuration wise it seems that using configurations with more Naive Bayes then Hoeffding trees seem to perform better, however also a configuration solely based on

When an RCVF or SNDRCVF command is used with multiple display devices, the default value WAIT(*YES) prevents further processing until an input-capable field is returned to the

In addition, when you are building your business based on automated field workers, the downtime from damaged devices or loss of data can quickly have a bigger impact than the cost

Theoretical Content Practical Content Week Specific Learning Outcomes Teacher’s Activities Resources Specific

By having Archer replace Rossetti‘s beloved in The House of Life with Madame Olenska, Wharton is showing the reader early on that only in books can Archer and Madame Olenska