(1)

Grids and Peer-to-Peer Networks for e-Science

PTLIU Laboratory for Community Grids

Geoffrey Fox and Community Grid Staff and Students

Computer Science, Informatics, Physics

Indiana University, Bloomington IN 4740

http://grids.ucs.indiana.edu/ptliupages

[email protected]

(2)

Summary

• Grid: Global Computing Infrastructure with a myriad of heterogeneous devices connected by diverse networks
    - Measure and study their performance
    - Related to but different from classical parallel computing performance studies
• Web services: New object models providing universality in a service model of electronic capability
    - Simulate, data access/storage etc.
    - Nodes of application level systems one can model
• Systems involve multiple devices connected together – synchronization of these is performance driver
    - Communities or virtual organizations are e-Science collective systems

(3)

Trends of Importance

• Resources of increasing performance or functionality
    - Computers (ASCI, Earth Simulator to TeraGrid), storage, sensors, networks, PDA’s
    - More and more data distributed around the world
• Applications of increasing sophistication
    - Size, multi-scales, multi-disciplines
    - Compose simulations from different disciplines
• New algorithms and mathematical techniques
• Traditional Computer science
    - Compilers, Parallelism, Objects, Components
• Grid and Internet Concepts and Technologies
    - Enabling new applications

(4)

Projected Top 500 Until Year 2009

• First, Tenth, 100th, 500th, SUM of all 500 Projected in Time

Earth Simulator from Japan

http://geofem.tokyo.rist.or.jp/

(5)

PACI 13.6 TF Linux TeraGrid

[Figure: TeraGrid network diagram. Four sites linked through the Chicago & LA DTF core switch/routers. Site capacities recoverable from the figure:]

NCSA: 500 nodes, 8 TF, 4 TB memory, 240 TB disk
SDSC: 256 nodes, 4.1 TF, 2 TB memory, 225 TB disk
Caltech: 32 nodes, 0.5 TF, 0.4 TB memory, 86 TB disk
Argonne: 64 nodes, 1 TF, 0.25 TB memory, 25 TB disk
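As a quick check of the slide title, the per-site figures can be summed. This is only a sketch of the arithmetic: the numbers are the ones printed on the slide, while the dictionary layout and the helper name `total` are assumptions made for illustration.

```python
# Per-site figures as printed on the TeraGrid slide.
sites = {
    "NCSA":    {"nodes": 500, "tflops": 8.0, "memory_tb": 4.0,  "disk_tb": 240},
    "SDSC":    {"nodes": 256, "tflops": 4.1, "memory_tb": 2.0,  "disk_tb": 225},
    "Caltech": {"nodes": 32,  "tflops": 0.5, "memory_tb": 0.4,  "disk_tb": 86},
    "Argonne": {"nodes": 64,  "tflops": 1.0, "memory_tb": 0.25, "disk_tb": 25},
}


def total(key: str) -> float:
    """Sum one column over all four sites (hypothetical helper)."""
    return sum(site[key] for site in sites.values())


# The peak-performance column adds up to the 13.6 TF quoted in the title.
print(f"{total('tflops'):.1f} TF across {total('nodes')} nodes")
```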

(6)

Small Devices Increasing in Importance

• There is growing interest in wireless portable displays in the confluence of cell phone and personal digital assistant markets
• By 2005, 60 million internet ready cell phones sold each year
• 65% of all Broadband Internet accesses via non desktop appliances

[Image label: CM5]

Integration of PDA’s and supercomputers (etc.) implies very heterogeneous systems spanning traditional performance fields

(7)

The HPCC Thrust has run its course?

• The 1990 HPCC 10 year initiative was largely aimed at parallel computing enabling large scale simulations for a broad range of computational science and engineering problems
• It was in many ways a success and we have methods and machines that can (begin to) tackle most 3D simulations
    - ASCI simulations particularly impressive
    - DoE still putting substantial resources into basic software and algorithms from adaptive meshes to PDE solver libraries
• Machines are still increasing in performance exponentially and should achieve petaflops in next 7-10 years
• Not obvious that there will be major changes in parallel …

(8)

e-Science

• e-Science implies integration of data and researchers around the model and builds on
    - Parallel Computers for Simulation
    - Sensors (satellites or ground based) for data
    - Database for knowledge
    - Networks to link …

(9)

Classic Grid Architecture

[Figure: classic Grid architecture diagram. Labels: Resources; Database; Netsolve; Computing; Security; Collaboration; Composition; Content Access; Middle Tier; Brokers; Service Providers]

(10)

Astronomy is Facing a Major Data Avalanche

One e-Science Example

Multi-Terabyte Sky Surveys and Archives (Soon: Multi-Petabyte), Billions of Detected Sources, Hundreds of Measured Attributes per Source …

[Figure: total area of 3m+ telescopes in the world in m², and total number of CCD pixels in megapixels, as a function of time.] Growth over 25 years is a factor of 30 in glass, 3000 in pixels.
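The two growth factors quoted for the figure can be turned into doubling times. The sketch below assumes steady exponential growth over the 25 years, which the slide does not state; the function name `doubling_time` is illustrative.

```python
import math


def doubling_time(years: float, growth_factor: float) -> float:
    """Years per doubling, assuming steady exponential growth (an assumption)."""
    return years / math.log2(growth_factor)


glass = doubling_time(25, 30)      # telescope collecting area
pixels = doubling_time(25, 3000)   # CCD pixel count

print(f"glass doubles about every {glass:.1f} years")
print(f"pixels double about every {pixels:.1f} years")
```

Under that assumption, detector capacity doubles more than twice as fast as collecting area, which is one way to read the "data avalanche" claim.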

(11)

The Changing Style of Observational

Astronomy

[Figure: flow diagram with stages marked "Now" and "Future". Recoverable labels:]
    - Pointed, heterogeneous observations (~ MB - GB)
    - Small samples of objects (~ 10^1 - 10^3)
    - Archives of pointed observations (~ TB)
    - Large, homogeneous sky surveys (multi-TB, ~ 10^6 - 10^9 sources)
    - Multiple, federated sky surveys and archives (~ PB)
    - Virtual Observatory

(12)

What is the NVO? - Content

[Figure: NVO content. Labels:]
    - Source Catalogs, Image Data
    - Query Tools
    - Specialized Data: Spectroscopy, Time Series, Polarization
    - Information Archives: Derived & legacy data: NED, Simbad, ADS, etc.
    - Analysis/Discovery Tools: Visualization, Statistics
    - Standards

(13)

What is the NVO? - Components

• Information Providers
    - e.g. ADS, NED, ...
• Data Providers
    - Surveys, observatories, archives, SW repositories
• Service Providers

(14)

Grid/P2P Use of Internet I

ROBERT B. COHEN, PH.D. COHEN COMMUNICATIONS GROUP [email protected] 212-986-7720

Global Grid Forum Toronto Feb 18 2002

Cohen’s Rival Estimate

Mainly Digital Video

(15)

Grid/P2P Use of Internet II

S2S Server to Server

Digital Video “on demand”

P2P Grid

(16)

Use of Object Technologies I

• The commercial success claimed for Object and component technology has not yet been clearly repeated in HPCC and indeed in modeling & simulation
    - Object technologies do not naturally support either high performance or parallelism
    - C++ can be high performance but Java (as a language) is not uniformly so (it is improving)
    - We suggest that Web Services could change this
• Fortran (including Fortran90) will continue to decline in importance and interest – the community should prefer not to use it
    - Its use will not attract the best students
• Not essential to write modules in object oriented language
    - It is essential to package modules in object framework

(17)

Use of Object Technologies II

• There is emerging HPCC component architecture allowing production of more modern libraries (integration Infrastructure)
    - DoE has very large CCA – Common Component Architecture – effort
    - Package software (“system and applications”) as distributed objects – not as traditional libraries
• CORBA, HLA, Java and Web Services are not naturally high performance as component models
    - High performance often not essential for coarse grain objects
    - Web Services support multiple implementations allowing …

(18)

Object Size & Distributed/Parallel Simulations

• All interesting systems consist of linked entities
    - Particles, grid points, people or groups thereof
• Linkage translates into message passing
    - Cars on a freeway
    - Phone calls
    - Forces between particles
• Amount of communication tends to be proportional to surface area of entity whereas simulation time proportional to volume
• So communication/computation is surface/volume and decreases in importance as entity size increases
• In parallel computing, communication synchronized; in distributed computing “self contained objects” (whole programs) which can be scheduled asynchronously
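The surface/volume argument can be made concrete with a toy model. The sketch below assumes each entity is a cubic block of grid points, that communication scales with the six faces and computation with the interior; the cube shape and the function name are illustrative assumptions, not something the slides specify.

```python
def comm_to_compute_ratio(side: int) -> float:
    """Surface/volume ratio for a cubic block of side**3 grid points.

    Simplified model: communication ~ points on the six faces,
    computation ~ all points in the block.
    """
    surface = 6 * side ** 2
    volume = side ** 3
    return surface / volume


# The ratio falls as 6/side, so larger entities are less communication bound.
for side in (2, 10, 100):
    print(f"side={side:4d}  communication/computation={comm_to_compute_ratio(side):.2f}")
```

In this model a whole program treated as one coarse-grain object sits at the far right of the curve, which is why modest, asynchronous communication between services can be acceptable.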

(19)

Some Problem Classes

• Classic HPCC: synchronized objects with regular time structure (communication overhead decreases as problem size increases)
    - Includes PDE and interacting particle based applications
    - Give scaling parallelism on large MPP’s
• Grid: Internet Technology and Commercial Application Integration: Large objects with modest communications and without difficult time synchronization
    - Compose as independent (pipelined) services
    - Includes some approaches to multi-disciplinary simulation linkage
• Hardest: smallish objects with irregular time synchronization
    - Event driven simulations (HLA-RTI) used here

[Figure labels: Sets of Grid Points; Sets of Service (programs)]

(20)

What is a Web Service I

• A web service is a computer program running on either the local or remote machine with a set of well defined interfaces (ports) specified in XML (WSDL)
• In principle, computer program can be in any language (Fortran .. Java .. Perl .. Python) and the interfaces can be implemented in any way whatsoever
    - Interfaces can be method calls, Java RMI Messages, CGI Web invocations, or totally compiled away (inlining)
• But the simplest implementations involve XML messages (SOAP) and programs written in net friendly languages like Java and Python
• Web Services separate the meaning of a port (message) interface from its implementation
• Enhances/Enables Re-usable component model of ANY electronic resource
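The separation between a port's meaning and its implementation can be sketched with one operation reached through two bindings. Everything named here (`add`, `call_direct`, `call_via_xml`, the element names) is hypothetical, and the XML is a simplified stand-in for a real SOAP message rather than a conforming one.

```python
import xml.etree.ElementTree as ET


def add(a: float, b: float) -> float:
    """The capability itself; callers should not care how it is reached."""
    return a + b


def call_direct(a: float, b: float) -> float:
    """Binding 1: the port is 'compiled away' into a plain method call."""
    return add(a, b)


def call_via_xml(a: float, b: float) -> float:
    """Binding 2: the same port carried as XML request/response messages."""
    # Client side: serialize the request.
    request = ET.Element("add")
    ET.SubElement(request, "a").text = str(a)
    ET.SubElement(request, "b").text = str(b)
    wire_request = ET.tostring(request)

    # Service side: parse the request, run the capability, serialize the reply.
    parsed = ET.fromstring(wire_request)
    result = add(float(parsed.findtext("a")), float(parsed.findtext("b")))
    response = ET.Element("addResponse")
    response.text = repr(result)
    wire_response = ET.tostring(response)

    # Client side again: parse the reply.
    return float(ET.fromstring(wire_response).text)


# Both bindings give the same answer; only the transport differs.
print(call_direct(2, 3), call_via_xml(2, 3))
```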

(21)

What is a Web Service II

• Web Services have important implication that ALL interfaces are XML messages based. In contrast:
• Most Windows programs have interfaces defined as interrupts due to user inputs
• Most software has interfaces defined as methods which might be implemented as a message but this is often NOT explicit

[Figure labels: Security; Catalog; Payment; Credit Card; Warehouse]

(22)

[Figure: Web Services (WS) between raw resources and clients. Labels: Raw Data; Raw Resources; (Virtual) XML Data Interface; Web Service (WS); XML WS to WS Interfaces; (Virtual) XML Knowledge (User) Interface; Clients; etc.]

(23)

Details of WSDL Protocol Stack

• UDDI finds where programs are
    - remote (distributed) programs are just Web Services
• WSFL links programs together (under revision?)
• WSDL defines interface (methods, parameters, data formats)
• SOAP defines structure of message including serialization of information
• HTTP is negotiation/transport protocol
• TCP/IP is layers 3-4 of OSI
• Physical Network is layer 1 of OSI

[Figure: protocol stack, top to bottom]
UDDI or WSIL
WSFL
WSDL
SOAP or RMI
HTTP or SMTP or IIOP or RMTP
TCP/IP
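The layering of SOAP over HTTP can be illustrated by building the bytes by hand. This is a sketch under stated assumptions: the operation `getStatus`, the parameter `jobId`, the host and the path are all hypothetical, and a real client would use a SOAP toolkit driven by the service's WSDL. The only fixed value is the SOAP 1.1 envelope namespace.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace
ET.register_namespace("soap", SOAP_NS)


def soap_envelope(operation: str, params: dict) -> bytes:
    """SOAP layer: wrap an operation and its parameters in Envelope/Body."""
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    op = ET.SubElement(body, operation)
    for name, value in params.items():
        ET.SubElement(op, name).text = str(value)
    return ET.tostring(envelope)


def http_post(host: str, path: str, payload: bytes) -> bytes:
    """HTTP layer: wrap the SOAP payload in a POST request, ready for TCP."""
    head = (
        f"POST {path} HTTP/1.1\r\n"
        f"Host: {host}\r\n"
        "Content-Type: text/xml; charset=utf-8\r\n"
        f"Content-Length: {len(payload)}\r\n"
        "\r\n"
    )
    return head.encode("ascii") + payload


# Hypothetical operation, host and path, for illustration only.
payload = soap_envelope("getStatus", {"jobId": 42})
request = http_post("example.org", "/services/job", payload)
print(request.decode("ascii"))
```

Swapping `http_post` for an SMTP or IIOP wrapper would leave `soap_envelope` untouched, which is the point of the stack.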

(24)

Message or Event Based Interconnection

[Figure labels: Resource; Database; Software; XML Skin]

e-Science/Grid/P2P Networks are XML Specified Resources connected by XML specified messages

Implementation of resource and connection may or may not be XML

(25)

What is a Grid Web Service?

• There are generic Grid system services: security, collaboration, persistent storage, universal access
    - OGSA (Open Grid Services Architecture) is implementing these as extended Web Services
• An Application Web Service is a capability used either by another service or by a user
    - It has input and output ports – data is from sensors or other services
• Consider Satellite-based Sensor Operations as a Web Service
    - Satellite management (with a web front end)
    - Each tracking station is a service
    - Image Processing is a pipeline of filters – which can be grouped into different services
    - Data storage is an important system service
    - Big services built hierarchically from “basic” services

(26)

Sensor Web Service

[Figure labels: Distributed Sensor Web Service; Output Web Service port; Universal sensor access for people/computers; Input Web Service port; Different format Sensor Data]

(27)

Application Web Services

• Note Service model integrates sensors, sensor analysis, simulations and people
• An Application Web Service is a capability used either by another service or by a user
    - It has input and output ports – data is from users, sensors or other services
    - Big services built hierarchically from “basic” services

[Figure labels: Sensor Data as a Web service (WS); Data Analysis WS; Sensor Management WS; Visualization WS; Simulation WS; Filter WS (three, captioned "Build as multiple Filter Web Services"); Prog WS (two)]

(28)

The Application Service Model

• As bandwidth of communication (between) services increases one can support smaller services
• A service “is a component” and is a replacement for a library in case where performance allows
• Services (components) are a sustainable model of software development – each service has documented capability with standards compliant interfaces
    - XML defines interfaces at several levels
    - WSDL at Service interface level and XSIL or equivalent for scientific data format
• A service can be written as Perl, Python, Java Servlet, Enterprise Javabean, CORBA (C++ or Fortran) Object …
• Communication protocol can be RMI (Java), IIOP (CORBA) or SOAP (HTTP, XML) ……

(29)
(30)

Some Science Web Services

• These build on general (community) web services

(31)

Education as a Web Service

• Can link to Science as a Web Service and substitute educational modules
• “Learning Object” XML standards already exist from IMS/ADL http://www.adlnet.org – need to update architecture
• Web Services for virtual university include:
    - Registration
    - Performance (grading)
    - Authoring of Curriculum
    - Online laboratories for real and virtual instruments
    - Homework submission
    - Quizzes of various types (multiple choice, random parameters)
    - Assessment data access and analysis
    - Synchronous Delivery of Curricula
    - Scheduling of courses and mentoring sessions
    - Asynchronous access, data-mining and knowledge discovery

(32)

Different Web Service Organizations

• Everything is a resource implemented as a Web Service, whether it be:
    - back end supercomputers and a petabyte of data
    - Microsoft PowerPoint and this file
• All Resources communicate via messages
• Grids and Peer to Peer (P2P) networks can be integrated by building both in terms of Web Services with different (or in fact sometimes the same) implementations of core services such as registration, discovery, life-cycle, collaboration and event or message transport …
    - Gives a Peer-to-Peer Grid

(33)

Peer to Peer Grid

[Figure labels: Database; JXTA; Web Service Interfaces; Event/Message Brokers; Integrate P2P and Grid/WS]

(34)

Role of Event/Message Brokers

• We will use events and messages interchangeably
    - An event is a time stamped message
• Our systems are built from clients, servers and “event brokers”
    - These are logical functions – a given computer can have one or more of these functions
    - In P2P networks, computers typically multifunction; in Grids one tends to have separate function computers
    - Event Brokers “just” provide message/event services; servers provide traditional distributed object services as Web services
• There are functionalities that only depend on event itself and perhaps the data format; they do not depend on details of application and can be shared among several applications
    - NaradaBrokering is designed to provide these functionalities
    - MPI provided such functionalities for all parallel computing

(35)

NaradaBrokering implements an Event Web Service

• Filter is mapping to PDA or slow communication channel (universal access) – see our PDA adaptor
• Workflow implements message process
• Routing illustrated by JXTA
• Destination-Source matching illustrated by JMS using Publish-Subscribe …

[Figure labels: Web Service 1; (Virtual) Queue; Web Service 2; Destination-Source Matching; Filter; Routing; workflow; WSDL]

(36)

Features of Event Service I

• MPI nowadays aims at a microsecond latency
• The Event Web Service aims at a millisecond latency
    - Typical distributed system travel times are many milliseconds (to seconds for Geosynchronous satellites)
    - Different performance/functionality trade-off
• Messages are not sent directly from publisher P to subscriber S but rather from P to Broker B and from Broker B to S
• Synchronous systems: B acts as a real-time router/filterer
    - Messages can be archived and software multicast
• Asynchronous systems: B acts as an XML database and workflow engine
• Subscription is, in each case, roughly equivalent to a database query
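The publisher-to-broker-to-subscriber path, with a subscription treated as a query, can be sketched in a few lines. This is an in-memory illustration only: the class and method names (`Event`, `EventBroker`, `subscribe`, `publish`, `replay`) are hypothetical and are not the NaradaBrokering API.

```python
import time
from dataclasses import dataclass, field
from typing import Callable, List, Tuple


@dataclass
class Event:
    """An event is a time stamped message."""
    topic: str
    payload: str
    timestamp: float = field(default_factory=time.time)


Predicate = Callable[[Event], bool]


class EventBroker:
    """B in the P -> B -> S path; publishers never address subscribers."""

    def __init__(self) -> None:
        self._subscriptions: List[Tuple[Predicate, Callable[[Event], None]]] = []
        self.archive: List[Event] = []

    def subscribe(self, query: Predicate, deliver: Callable[[Event], None]) -> None:
        # A subscription is a standing query over future events.
        self._subscriptions.append((query, deliver))

    def publish(self, event: Event) -> int:
        # Synchronous role: archive, then route to every matching subscriber.
        self.archive.append(event)
        delivered = 0
        for query, deliver in self._subscriptions:
            if query(event):
                deliver(event)
                delivered += 1
        return delivered

    def replay(self, query: Predicate) -> List[Event]:
        # Asynchronous role: the same kind of query run over the archive.
        return [event for event in self.archive if query(event)]


broker = EventBroker()
inbox: List[Event] = []
broker.subscribe(lambda e: e.topic == "lecture", inbox.append)
broker.publish(Event("lecture", "slide 1"))
broker.publish(Event("chat", "hello"))
print([e.payload for e in inbox])
```

Using the same predicate for live routing and for `replay` is what makes the "subscription is roughly a database query" remark concrete.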

(37)

Features of Event Web Service II

• In principle Message brokering can be virtual and compiled away in the same way that WSDL ports can be bound in real time to optimal transport mechanism
    - All Web Services are specified in XML but can be implemented quite differently
    - Audio Video Conferencing sessions could be negotiated using SOAP (raw XML) messages and agree to use certain video codecs transmitted by UDP/RTP
• There is a collection of XML Schema – call it GXOS – specifying event service and requirements of message streams and their endpoints
    - One can sometimes compile message streams specified in GXOS to MPI or to local method call
• Event Service must support dynamic heterogeneous …

(38)

Features of Event Web Service III

• The event web service is naturally implemented as a dynamic distributed network
    - Required for fault tolerance and performance
• A new classroom joins my online lecture
    - A broker is created to handle students – multicast locally my messages to classroom; handle with high performance local messages between students
• Company X sets up a firewall
    - The event service sets up brokers either side of firewall to optimize transport through the firewall
• Note all message based applications use same message service
    - Web services imply ALL applications are (possibly virtual) message based
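The classroom example can be sketched as a two-level broker tree: the lecturer's broker forwards each message once to a broker created for the classroom, which fans it out locally. The class `RelayBroker` and its methods are hypothetical names for this illustration, and in-memory lists stand in for network links.

```python
from typing import List


class RelayBroker:
    """A broker that delivers to local clients and relays to downstream brokers."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.clients: List[List[str]] = []          # local client inboxes
        self.downstream: List["RelayBroker"] = []   # brokers reached over a link
        self.forwarded = 0                           # messages sent broker-to-broker

    def attach_client(self) -> List[str]:
        inbox: List[str] = []
        self.clients.append(inbox)
        return inbox

    def attach_broker(self, broker: "RelayBroker") -> None:
        self.downstream.append(broker)

    def publish(self, message: str) -> None:
        for inbox in self.clients:        # local software multicast
            inbox.append(message)
        for broker in self.downstream:    # one copy per downstream broker
            self.forwarded += 1
            broker.publish(message)


lecturer_site = RelayBroker("lecturer-site")
classroom = RelayBroker("classroom")        # created when the classroom joins
lecturer_site.attach_broker(classroom)
students = [classroom.attach_client() for _ in range(30)]

lecturer_site.publish("slide 12")
print(lecturer_site.forwarded, len(students))
```

One message crosses the long link regardless of class size; the 30 deliveries happen on the classroom side.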

(39)

Broker Network

[Figure: broker network. Labels: Database; Resource; Broker (six); Software multicast; (P2P) Community (two); For message/events service]

(40)

System Structure I

• Systems are a dynamic mix of structured and unstructured entities
• P2P systems like JXTA support unstructured systems realized by opportunistic messaging “broadcast locally” over a certain “network distance”
• Java Message Service JMS supports structured systems where clients (message endpoints) link to one of a known set of “central servers”
• Event system must support
    - Advertise capability – Publish
    - Advertise need – Subscribe both for type and form of messages
    - Transport designated messages/events

(41)

Single Server P2P Illusion

[Figure labels: Collaboration Server; Database]

(42)

System Structure II

• One could think that the world is a well defined structure of unstructured systems
    - Unstructured dynamic systems are P2P (JXTA) Peer Groups
    - Peer Groups could be cluster of students in a class for distance learning or cluster of Grid (OGSA) Web services generated to support running a job
• But maybe it is a set of structured communities with unstructured connection
• NaradaBrokering needs to support both models and those in between
    - Currently has JMS mode, JXTA mode and Native (most powerful) mode
• P2P usually thought of as a set of “unruly dangerous clients” but can equally well be used securely as a middleware interaction mode between web services

(43)

[Figure labels: Database; Grid Middleware (four); MP Group (two)]

(44)

Community Grids Laboratory Activities I

• Core NaradaBrokering Event Service
    - Operation in JMS or JXTA mode to demonstrate integration of central and peer-to-peer mode
    - Focus is Performance and Capabilities (see later)
• Garnet synchronous collaboration environment used for distance education and seminars
    - Built first on commercial JMS but ported to Narada – shows that one can afford to use message service in synchronous application sharing
• Interface of Garnet to PDA with message size filtering and optimized HHMS message service
    - This filtering also needed for slow clients – mix of dial-ups and Internet2 clients in a collaboration
    - Event system supports (XML) client profiles
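Message size filtering driven by a client profile can be sketched as follows. The slides say profiles are XML; here a plain dataclass stands in for a parsed profile, and the field `max_message_bytes`, the size limits and the function name are assumptions for illustration. Oversized events are simply dropped, whereas a real filter might downsample or summarize them instead.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class ClientProfile:
    """Stand-in for a parsed (XML) client profile; fields are hypothetical."""
    name: str
    max_message_bytes: int


def filter_events(events: List[bytes], profile: ClientProfile) -> List[bytes]:
    """Keep only the events small enough for this client's link or device."""
    return [event for event in events if len(event) <= profile.max_message_bytes]


events = [b"x" * 100, b"y" * 50_000]          # a text event and a large image event
pda = ClientProfile("pda", max_message_bytes=1_024)
desktop = ClientProfile("internet2-desktop", max_message_bytes=1_000_000)

print(len(filter_events(events, pda)), len(filter_events(events, desktop)))
```

The same function serves PDAs and dial-up clients; only the profile differs.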

(45)
(46)

NaradaBrokering and JMS

Low Rate; Small Messages

(47)

NaradaBrokering and JXTA

Comparing Pure JXTA, Narada-JXTA and Direct P2P

There is a bug in JXTA and this was only just fixed

Narada-JXTA provides JXTA guaranteed long distance delivery

Small Payload

(48)

JXTA is getting slower

Pure Narada 2 hops

Client

Client Narada

Narada

Client  JXTA  JXTA  Client

Client  JXTA  Narada  JXTA  Client

Client  JXTA  JXTA  Client multicast

(49)

PowerPoint can be converted to SVG via Illustrator or Web export

Batik Viewer on PC

(50)

PDA Collaboration Event Filter

[Figure labels: GMS; JMS or Narada; "Doing This now"]

GMSME: iPaq H3650, WinCE 3.0, Personal-Java 1.1, Wireless 11 Mbit/s IEEE 802.11b

(51)

Community Grids Laboratory Activities II

• Use of JMS (Narada) to support asynchronous collaboration including early GXOS Schema XML based News Groups and Web Site management
    - Integrated with Apache Slide and Jetspeed portals
• Audio-Video Conferencing as a Web service
    - H323 and SIP as Web services using XML Session Schema
    - NaradaBrokering support of UDP
• Computing Portals as Web services; NaradaBrokering could support events (status, performance, job flow) linking operational job to control servers and …

(52)

NaradaBrokering Futures

• Higher Performance – reduce minimum transit time to around one millisecond
• Substantial operational testing
• Security – allow Grid (Kerberos/PKI) security mechanisms
• Support of more protocols with dynamic switching as in JXTA – SOAP, RMI, RTP/UDP
• Integration of simple XML database model using JXTA Search to manage distributed archives
• More formal specification of “native mode” and dynamic instantiation of brokers
• General Collaborative Web services

(53)

Collaborative Web Service Access

• Intercept and multicast messages produced by Web Service

[Figure: Collaborative Web Service with Master Client, Client, Event (Message) Service and Web Service Interceptor. Annotations:]
    - Web Service has a port on which collaborative modes set
    - Web Service can be “front-end” (in middle tier) to complex …
    - Web Service Interceptor Providing General Services
    - Set Collaboration and Message Mode
    - Collaboration as a Web …

(54)

Collaborative Replicated Web Services

• Intercept and multicast messages SENT to Web Service

[Figure labels: Object; Object Display; Object Viewer; Event (Message) Service; Master; Web Service; Web Service Interceptor Providing General Services; Set Collaboration Mode]
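The replicated model can be sketched as an interceptor that takes each message sent to the Web Service and multicasts it to every replica, so all copies pass through the same state changes. The names `CounterService`, `ReplicatingInterceptor`, `handle` and `send`, and the dictionary message format, are hypothetical; a real interceptor would sit on the service's message port rather than on Python objects.

```python
from typing import Dict, List


class CounterService:
    """Stand-in for one replica of a Web Service holding shared state."""

    def __init__(self) -> None:
        self.value = 0

    def handle(self, message: Dict[str, int]) -> int:
        # The only operation in this toy service: add an amount to the state.
        self.value += message["amount"]
        return self.value


class ReplicatingInterceptor:
    """Intercepts messages SENT to the service and multicasts them to replicas."""

    def __init__(self, replicas: List[CounterService]) -> None:
        self.replicas = list(replicas)

    def send(self, message: Dict[str, int]) -> List[int]:
        # Every replica receives the same input in the same order.
        return [replica.handle(message) for replica in self.replicas]


replicas = [CounterService() for _ in range(3)]   # master plus two participants
interceptor = ReplicatingInterceptor(replicas)
interceptor.send({"amount": 2})
interceptor.send({"amount": 3})
print([replica.value for replica in replicas])
```

This relies on each replica being deterministic; the alternative on the previous slide multicasts the single service's output instead, which avoids that requirement.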
