The Problem Solving Environments of TeraGrid, Science Gateways, and the Intersection of the Two

(1)

JIM BASNEY1, STUART MARTIN2, JP NAVARRO2, MARLON PIERCE3, TOM SCAVO1, LEIF STRAND4, TOM URAM2,5, NANCY WILKINS-DIEHR6, WENJUN WU2, CHOONHAN YOUN6

1 NATIONAL CENTER FOR SUPERCOMPUTING APPLICATIONS, UNIVERSITY OF ILLINOIS AT URBANA-CHAMPAIGN

2 ARGONNE NATIONAL LABORATORY

3 INDIANA UNIVERSITY

4 CALIFORNIA INSTITUTE OF TECHNOLOGY

5 UNIVERSITY OF CHICAGO

6 SAN DIEGO SUPERCOMPUTER CENTER, UNIVERSITY OF CALIFORNIA AT SAN DIEGO

The Problem Solving Environments of TeraGrid, Science Gateways, and the Intersection of the Two

(2)

TeraGrid, what is it

A unique combination of fundamental CI components

(3)

Gateways, what are they

Problem Solving Environments for Science

Portal or client-server interfaces to high end resources

 Web developments and the explosion of digital data have increased the importance of the internet and the web for science

 Only 16 years since the availability of web browsers

 Developments in web technology

• From static HTML to CGI forms to the wikis and social web pages of today

 Full impact on science yet to be felt

 Web usage model resonates with scientists

But, need persistency if the Web is to have a profound impact on science (this is key for all PSEs)

TeraGrid provides common infrastructure for gateway developers

(4)

TeraGrid’s Infrastructure for Gateways

Problem

 Local compute resources are typically not enough for Gateways

Goal

 Make it easy to use any TeraGrid site from a Gateway

Approach

 Provide a set of client APIs and command line tools for use in Gateways/portals

 Maintain and deploy a set of common services on each site

(5)

Infrastructure Capabilities

Information Discovery

 Find deployed services

 Get details about the compute resources

Data Management

 Move data to and from compute resources

Execution Management

 Submit and monitor remote computational jobs

Security

(6)

Security

Based on Grid Security Infrastructure (GSI)

 Uses X.509 PKI

 End entity certificates (e.g. issued to a person or host)

 User proxy certificates (valid for a limited period of time)

Enables single sign-on to all TG resources

Enables delegation

 Users/clients can disconnect and let services perform actions securely on their behalf

Integrated in grid middleware services
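
The relationship between end entity credentials, time-limited proxies, and delegation can be sketched as a toy model. This is an illustration only, with no cryptography and no GSI library calls: the `Credential` class, its fields, and the subject strings are hypothetical, and the rule that a proxy never outlives its issuer is modelled simply by capping the expiry time.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass(frozen=True)
class Credential:
    """Toy stand-in for a certificate: a subject, an expiry, and an issuer link."""
    subject: str
    not_after: datetime
    issuer: Optional["Credential"] = None  # None for an end entity credential

    def delegate(self, lifetime: timedelta, now: datetime) -> "Credential":
        """Issue a short-lived proxy that cannot outlive the credential signing it."""
        expiry = min(now + lifetime, self.not_after)
        return Credential(self.subject + "/CN=proxy", expiry, issuer=self)

    def is_valid(self, now: datetime) -> bool:
        """The chain is usable only while every link in it is unexpired."""
        cred: Optional[Credential] = self
        while cred is not None:
            if now >= cred.not_after:
                return False
            cred = cred.issuer
        return True


# Hypothetical user: a one-year end entity credential and a 12-hour proxy.
now = datetime(2008, 12, 1, 9, 0)
user = Credential("/O=Example/CN=Jane Doe", now + timedelta(days=365))
proxy = user.delegate(timedelta(hours=12), now)
# Delegating to a service: the requested 48 hours is capped by the proxy's own expiry.
service_proxy = proxy.delegate(timedelta(hours=48), now)
```

In this sketch the service can keep acting for the user after the user disconnects, but only until the 12-hour proxy at the root of the delegated chain expires.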

(7)

GSI in Action

[Diagram labels. GT4 Client: Globus WS Client, grid-proxy-init, end entity credential and key, proxy credential and key. GT4 Server: Java WS Container, Globus Web Service, Gridmap.]

(8)
(9)

Gateway Workflow with GSISSH

[Diagram: gateway jobs reach two resources through GSISSH. Resource A: GSISSH Service, Scheduler (e.g., PBS), Compute Nodes, Local Jobs. Resource B: GSISSH Service, Scheduler (e.g., LSF), Compute Nodes, Local Jobs.]

Client does:

• myproxy-logon (once)

• Move files with gsiscp
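
The two client steps listed above can be sketched as command lines assembled in Python. This is a minimal sketch that only builds argument lists and does not run them; the host names, account name, and file paths are placeholders, and the 12-hour lifetime is an arbitrary choice.

```python
from typing import List, Tuple


def myproxy_logon_cmd(server: str, username: str, hours: int) -> List[str]:
    """Fetch a short-lived proxy credential from a MyProxy server (run once)."""
    # -s: MyProxy server, -l: account name on that server, -t: lifetime in hours
    return ["myproxy-logon", "-s", server, "-l", username, "-t", str(hours)]


def gsiscp_cmd(source: str, host: str, destination: str) -> List[str]:
    """Copy a local file to a resource; gsiscp takes scp-style arguments."""
    return ["gsiscp", source, f"{host}:{destination}"]


def gateway_session(server: str, username: str, host: str,
                    files: List[Tuple[str, str]]) -> List[List[str]]:
    """One logon followed by one copy per (local, remote) file pair."""
    commands = [myproxy_logon_cmd(server, username, hours=12)]
    commands += [gsiscp_cmd(src, host, dst) for src, dst in files]
    return commands


# Placeholder names; a real gateway would pass each list to subprocess.run(cmd, check=True).
commands = gateway_session(
    "myproxy.example.org", "gateway", "resource-a.example.org",
    [("input.dat", "/scratch/input.dat")],
)
```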

(10)

Remote Execution Management

Grid Resource Allocation and Management (GRAM)

Provide an abstraction layer on top of various local resource managers (PBS, Condor, LSF, SGE, …)

 Defines a common job description language

 Client API and command line tools to asynchronously access remote LRMs

 Fault tolerant

 GSI Security

 “job” Workflow

 File staging before and after job execution

 Lastly, File cleanup
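
The idea of one job description on top of several local resource managers can be illustrated with a small adapter sketch. This is not GRAM's job description language or API: `JobDescription` and the translator functions are hypothetical, and only the job name and wall-clock limit are mapped onto the PBS `qsub` and LSF `bsub` command lines.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class JobDescription:
    """Hypothetical common description; not GRAM's actual schema."""
    name: str
    script: str
    walltime_minutes: int


def to_pbs(job: JobDescription) -> List[str]:
    # PBS expects the wall-clock limit as HH:MM:SS
    hours, minutes = divmod(job.walltime_minutes, 60)
    return ["qsub", "-N", job.name,
            "-l", f"walltime={hours:02d}:{minutes:02d}:00", job.script]


def to_lsf(job: JobDescription) -> List[str]:
    # LSF accepts the run limit in minutes
    return ["bsub", "-J", job.name, "-W", str(job.walltime_minutes), job.script]


TRANSLATORS: Dict[str, Callable[[JobDescription], List[str]]] = {
    "pbs": to_pbs,
    "lsf": to_lsf,
}


def submit_command(lrm: str, job: JobDescription) -> List[str]:
    """Translate the common description into the command for one LRM."""
    try:
        return TRANSLATORS[lrm](job)
    except KeyError:
        raise ValueError(f"unsupported local resource manager: {lrm}") from None


job = JobDescription(name="disloc", script="run.sh", walltime_minutes=90)
```

The gateway writes one `JobDescription`; which scheduler sits behind a given resource becomes a lookup rather than gateway-specific code.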

(11)

Traditional LRM Interaction

[Diagram: Resource A with a Scheduler (e.g., PBS) and Compute Nodes, running Local Jobs.]

Satisfies many users and use cases

TACC’s Ranger (62976 cores!) is the Costco of HTC ;-), one

(12)

[Diagram: Resource A with a GRAM4 Service in front of the Scheduler (e.g., PBS) and Compute Nodes. Local Jobs run alongside remote GRAM Jobs submitted through the gramJob API.]

Adds remote execution capability

Enable clients/devices to manage jobs from off of the cluster (Gateways!)

(13)

GRAM Benefit

[Diagram: GRAM Jobs submitted through the gramJob API reach two resources. Resource A: GRAM4 Service, Scheduler (e.g., PBS), Compute Nodes, Local Jobs. Resource B: GRAM4 Service, Scheduler (e.g., LSF), Compute Nodes, Local Jobs.]

(14)

Gateway Perspective

GRAM4 jobs

Scalable jo

(15)

Data Management - GridFTP

GridFTP

 High-performance, secure, reliable data transfer protocol optimized for high-bandwidth wide-area networks

 GSI Security

 Third-party transfers

 Parallel Transfers

 Striping

 Lots of small files (LOSF)

(16)

Data Management - RFT

Reliable File Transfer

 Adds reliability on top of GridFTP

 GSI Security

 Throttles requests

 Retries non-fatal transfer errors

 Resumes transfers from the last known position

 Requires delegation in order to contact GridFTP servers on user’s behalf
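
The retry and resume behaviour listed above can be sketched as a small loop. This is a generic illustration, not RFT's implementation or API: `reliable_transfer`, `TransientTransferError`, and the flaky sender used to exercise them are hypothetical names, and throttling is left out.

```python
from typing import Callable, Iterable


class TransientTransferError(Exception):
    """A non-fatal error; records the offset reached before the failure."""

    def __init__(self, offset_reached: int):
        super().__init__(f"transfer interrupted at byte {offset_reached}")
        self.offset_reached = offset_reached


def reliable_transfer(send: Callable[[int], int], total: int,
                      max_retries: int = 5) -> int:
    """Call send(offset) until `total` bytes have moved; return retries used.

    `send` resumes at `offset` and returns the new offset, or raises
    TransientTransferError carrying the offset it reached.
    """
    offset = 0
    retries = 0
    while offset < total:
        try:
            offset = send(offset)
        except TransientTransferError as err:
            if retries == max_retries:
                raise  # give up: too many non-fatal errors
            retries += 1
            offset = max(offset, err.offset_reached)  # resume, do not restart
    return retries


def make_flaky_sender(total: int, chunk: int,
                      fail_at: Iterable[int]) -> Callable[[int], int]:
    """Simulated transfer that fails once at each offset listed in fail_at."""
    failures = set(fail_at)

    def send(offset: int) -> int:
        while offset < total:
            if offset in failures:
                failures.discard(offset)
                raise TransientTransferError(offset)
            offset = min(offset + chunk, total)
        return offset

    return send


retries_used = reliable_transfer(make_flaky_sender(100, 25, [50, 75]), total=100)
```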

(17)

[Diagram labels. Science Gateway: Web Browser, Web Authn, Web Interface, Webapp, WS GRAM Client, community credential and key, proxy credential. Resource Provider: Java WS Container, WS GRAM Service, proxy certificate and key, community account.]

(18)

[Diagram labels. GT4 Client: Globus WS Client, GridShib SAML Tools, end entity credential and key, proxy certificate with SAML. GT4 Server: Java WS Container (with GridShib for GT), SAML PIP, Globus Web Service, Policy, Logs.]

(19)

[Diagram labels. Science Gateway: Web Browser, Web Authn, Web Interface, Webapp, Security Context (username, attributes), GridShib SAML Tools, WS GRAM Client, community credential and key, proxy credential with SAML, proxy certificate. Resource Provider: Java WS Container (with GridShib for GT), SAML PIP, WS GRAM Service, Policy, Logs.]

(20)

Information Management

 TeraGrid’s Integrated Information Services are a network of web services responsible for aggregating the availability of TeraGrid capability kits, software, and services across all the infrastructure providers

 Where are the job submission, file-transfer, and login services needed by Gateways?

 What is the queue status and estimated delay for each resource?
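
A gateway could use answers to these two questions to choose where to run a job. The sketch below works on already-fetched records; the record layout, field names, and resource names are invented for illustration and do not reflect the actual Information Services schema or query interface.

```python
from typing import Dict, List, Optional

# Hypothetical records, as a gateway might hold them after querying an
# information service. Field names are assumptions, not a real schema.
records: List[Dict] = [
    {"resource": "resource-a", "services": ["gram", "gridftp", "gsissh"], "est_delay_min": 42},
    {"resource": "resource-b", "services": ["gram", "gridftp"], "est_delay_min": 7},
    {"resource": "resource-c", "services": ["gsissh"], "est_delay_min": 1},
]


def pick_resource(records: List[Dict], service: str) -> Optional[Dict]:
    """Among resources offering `service`, return the one with the shortest estimated delay."""
    candidates = [r for r in records if service in r["services"]]
    if not candidates:
        return None
    return min(candidates, key=lambda r: r["est_delay_min"])


best_for_gram = pick_resource(records, "gram")
```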

(21)
(22)

High-Availability Design

[Diagram labels: Clients; info.teragrid.org; info.dyn.teragrid.org; TeraGrid Dynamic DNS; Dynamic paths; Static paths; Service Provider Information Services.]

Server failover propagates globally in 15 minutes

(23)

Today, there are approximately 29 gateways

using the TeraGrid

(24)

Selected Highlights from the PSE08 paper

The Social Informatics Data (SID) Grid

The Geosciences Network (GEON)

QuakeSim

Computational Infrastructure for Geodynamics

(CIG)

(25)

Social Informatics Data Grid

Heavy use of “multimodal” data.

 Subject might be viewing a video, while a researcher collects heart rate and eye movement data.

Events must be synchronized for analysis, large datasets result

Extensive analysis capabilities are not something that each researcher should have to create for themselves.

NSF Program Officers, September 10, 2008

(26)

How does SIDGrid use the TeraGrid?

Computationally intensive tasks

 Speech, gesture, facial expression, and physiological measurements

 Media transcoding for pitch analysis of audio tracks

 Once stored in raw form, data streams are converted to formats compatible with software for annotation, coding, integration, analysis

 fMRI image analysis

Workflows for massive job submissions and data transfers using Virtual Data System (VDS)

Workflows converted to concrete execution plan via Pegasus Grid planner

 TeraGrid information service (MDS)

 Replica location service (RLS)
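
Turning an abstract workflow into a concrete execution plan can be sketched in a few lines. This is a simplified illustration, not the Pegasus planner or the VDS format: the task names, the replica catalog contents, the URLs, and the single fixed site are all assumptions, and the standard-library `graphlib` module (Python 3.9+) stands in for the dependency ordering.

```python
from graphlib import TopologicalSorter
from typing import Dict, List, Set


def concretize(dependencies: Dict[str, Set[str]],
               inputs: Dict[str, List[str]],
               replicas: Dict[str, str],
               site: str) -> List[Dict]:
    """Order tasks by dependency and resolve logical input names to physical URLs."""
    plan = []
    for task in TopologicalSorter(dependencies).static_order():
        stage_in = [replicas[name] for name in inputs.get(task, [])]
        plan.append({"task": task, "site": site, "stage_in": stage_in})
    return plan


# Abstract workflow: each task maps to the set of tasks it depends on.
dependencies = {
    "transcode": set(),
    "pitch_analysis": {"transcode"},
    "annotate": {"pitch_analysis"},
}
# Logical file names needed from outside the workflow (hypothetical).
inputs = {"transcode": ["session42.raw"]}
# Stand-in for a replica location lookup: logical name -> physical URL.
replicas = {"session42.raw": "gsiftp://storage.example.org/sidgrid/session42.raw"}

plan = concretize(dependencies, inputs, replicas, site="resource-a")
```

A real planner would also choose among several sites using information service data; here the site is simply passed in.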

(27)

 The goal of GEON is

 to advance the field of geoinformatics and

 to prepare and train current and future generations of geoscience researchers, educators, and practitioners in the use of cyberinfrastructure to further their research, education, and professional goals.

 GEON is providing several key features

 data access, computational simulations, personal work spaces and analyses environments

 identifying best practices with the objective of dramatically advancing geoscience research and

(28)

How does GEON use the TeraGrid?

Computationally intensive tasks

 Quickly construct earth models, access observed earthquake recordings, and simulate them efficiently to understand subsurface structure and the characteristics of seismic wave propagation

 SYNSEIS (SYNthetic SEISmogram generation tool) provides access to seismic waveform data and simulates seismic records using 2D and 3D models.

 Conduct advanced calculations for simulating seismic waveforms of either earthquakes or explosions at regional distances (< 1000 km).

GSI (security), GAMA (account management), GridFTP (data transfer), GRAM (job submission), MyWorkspace (job monitoring)

Account management for classroom use, MyProjects

(29)

QuakeSim - Some Design Choices

Build portals out of portlets (Java Standard)

 Reuse capabilities from our Open Grid Computing Environments (OGCE) project, the REASoN GPS Explorer project, and many TeraGrid Science Gateways.

 Decorate with Google Maps, Yahoo UI gadgets, etc.

Use Java Server Faces to build individual component portlets.

 Build standalone tools, then convert to portlets at the very end.

Use simple Web Services for accessing codes and data.

 Keep It Stateless …

Use Condor-G and Globus job and file management services for interacting with high performance computers.

 TeraGrid

Favor Google Maps and Google Earth for their simplicity, interactivity and open APIs.

 Generate KML and GeoRSS

Use Apache Maven based build and compile system, SVN
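
The "Generate KML and GeoRSS" choice above can be illustrated with a minimal KML writer. QuakeSim's services are described as Java-based; this Python sketch only shows the shape of the output, and the function name, placemark name, and coordinates are made up for the example.

```python
import xml.etree.ElementTree as ET

KML_NS = "http://www.opengis.net/kml/2.2"


def placemark_kml(name: str, lon: float, lat: float, description: str = "") -> str:
    """Return a KML document containing a single point placemark."""
    ET.register_namespace("", KML_NS)  # serialize KML as the default namespace

    def tag(local: str) -> str:
        return f"{{{KML_NS}}}{local}"

    kml = ET.Element(tag("kml"))
    document = ET.SubElement(kml, tag("Document"))
    placemark = ET.SubElement(document, tag("Placemark"))
    ET.SubElement(placemark, tag("name")).text = name
    if description:
        ET.SubElement(placemark, tag("description")).text = description
    point = ET.SubElement(placemark, tag("Point"))
    # KML orders coordinates as longitude,latitude,altitude
    ET.SubElement(point, tag("coordinates")).text = f"{lon},{lat},0"
    return ET.tostring(kml, encoding="unicode")


kml_text = placemark_kml("Example station", -118.25, 34.05, "hypothetical GPS site")
```

The resulting string can be saved as a `.kml` file and opened in Google Earth.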

(30)

[Diagram: a Browser Interface talks to Portlets + Client Stubs, which call WSDL-described services over SOAP/HTTP. Host 1 (Quaketables): DB Service, JDBC, DB. Host 2 (Grid): Job Sub/Mon and File Services, Operating and Queuing Systems. Host 3 (G Maps): Visualization or Map Service.]

(31)

Two Approaches to the Middle Tier

[Diagram labels. Fat Client: Portal Comp., Grid Client, Grid Protocol (SOAP), Grid Service, Backend Resource. Thin Client: Portal Comp., HTTP + SOAP, Web Service, Grid Client, Grid Service, Backend Resource.]

(32)
(33)

Disloc output converted to KML and

(34)
(35)

“SWARM: Scheduling Large-scale Jobs over the Loosely-Coupled HPC Clusters,” S. L. Pallickara and M. E. Pierce, Friday, December 12, 2 p.m. to 2:30 p.m.

http://escience2008.iu.edu/sessions/SWARM.shtml

[Diagram labels: Standard Web Service Interface; Request Manager; Resource Ranking Manager; Data Model Manager; QBET Web Service; Fault Manager; Job Execution Manager (Condor-G with Birdbath); User A’s Job Board; User A’s Resource Pool (tokens for resources X, Y, Z); User A’s Job Queue; Job Distributor; RDBMS; MyProxy Server; High Performance Computing Clusters: Grid style clusters and Condor computing nodes.]

(36)

Membership-governed organization

 40 institutional members, 9 foreign affiliates

Supports and promotes Earth science by developing and maintaining software for computational geophysics

(37)

How does CIG use the TeraGrid?

 Seismograms allow scientists to understand the ground motion

 Computationally-intensive simulations run on TeraGrid using an assortment of 3D and 1D earth models produce synthetic seismograms

 Necessary input datasets provided via the portal

 Daemon (Python, Pyre) constantly polls the web site looking for work to do

 Uses GSI-OpenSSH and MyProxy credentials to submit jobs, monitor jobs, and transfer output back to the portal

 Posts status updates to the web site using HTTP POST

 Users can download results in ASCII and Seismic Analysis Code (SAC) format

 Visualizations include "beachball" graphics depicting the earthquake's source mechanism, and maps showing the locations of the earthquake and the seismic stations using GMT (http://gmt.soest.hawaii.edu/)

 Researchers quickly receive results and can concentrate on the scientific aspects of the output rather than on the details of running the analysis on a supercomputer

 Future Directions

 Parameter explorations
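
The polling daemon described above can be sketched as a loop with the site-specific parts passed in as functions. This is not CIG's code and does not use Pyre: `fetch_work`, `run_job`, and `post_status` are hypothetical callables standing in for the portal query, the GSI-OpenSSH job handling, and the HTTP POST status update.

```python
import time
from typing import Callable, Dict, Iterable, Optional


def poll_once(fetch_work: Callable[[], Iterable[Dict]],
              run_job: Callable[[Dict], None],
              post_status: Callable[[str, str], None]) -> int:
    """Handle every pending request once; return how many were handled."""
    handled = 0
    for request in fetch_work():
        post_status(request["id"], "running")
        try:
            run_job(request)  # submit, monitor, and copy output back
        except Exception as err:
            post_status(request["id"], f"failed: {err}")
        else:
            post_status(request["id"], "done")
        handled += 1
    return handled


def run_daemon(fetch_work: Callable[[], Iterable[Dict]],
               run_job: Callable[[Dict], None],
               post_status: Callable[[str, str], None],
               interval_s: float = 30.0,
               max_cycles: Optional[int] = None,
               sleep: Callable[[float], None] = time.sleep) -> None:
    """Poll forever, or for max_cycles cycles when that is given."""
    cycle = 0
    while max_cycles is None or cycle < max_cycles:
        poll_once(fetch_work, run_job, post_status)
        cycle += 1
        sleep(interval_s)
```

Because the web site is only ever polled and posted to, the daemon needs no inbound network access from the portal, which keeps the portal and the compute side loosely coupled.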

(38)

Conclusions

Technical requirements of some PSEs dictate seamless access to high-end compute and data resources

A robust, flexible and scalable infrastructure can provide a foundation for many PSEs

PSEs themselves must be treated as sustainable infrastructure

Researchers will not truly rely on PSEs for their work unless
