Grid Technology Implications for ACES and SERVOGrid


(1)

Grid Technology Implications for ACES and SERVOGrid

Brisbane, Australia

June 5 2003

Geoffrey Fox Marlon Pierce

Community Grids Lab Indiana University

[email protected]

http://academia.web.cern.ch/academia/lectures/grid/

(2)

What is Grid Technology?

• Grids support distributed collaboratories or virtual organizations integrating concepts from

• The Web

• Distributed Objects (CORBA Java/Jini COM)

• Globus Legion Condor NetSolve Ninf and other High Performance Computing activities

• Peer-to-peer Networks

• With perhaps the Web being the most important for “Information Grids” and Globus for “Compute Grids”

– Information Grids are basis of SERVOGrid

(3)

Paradigms Protocols Platforms and Hosting

• We can start from the Web view where the basic Grid paradigm is

• Meta-data rich Web Services communicating via messages

• These have some basic support from some runtime such as .NET, Jini (pure Java), Apache Tomcat+Axis (Web Service toolkit), Enterprise JavaBeans, WebSphere (IBM) or GT3 (Globus Toolkit 3)

– These are the distributed equivalent of operating system functions as in UNIX Shell

(4)

Taxonomy of Grid Functionalities

Name of Grid Type: Description of Grid Functionality

• Compute/File Grid: Run multiple jobs with distributed compute and data resources (Global “UNIX Shell”)

• Desktop Grid: “Internet Computing” and “Cycle Scavenging” with secure sandbox on large numbers of untrusted computers

• Information Grid: Grid service access to distributed information, data and knowledge repositories

• Complexity or Hybrid Grid: Hybrid combination of Information and Compute/File Grid emphasizing integration of experimental data, filters and simulations

• Campus Grid: Grid supporting University community computing

• Enterprise Grid: Grid supporting a company’s enterprise infrastructure

(5)

[Figure labels: Database; Database; Repositories; Federated Databases; Sensor Nets; Streaming Data; Loosely Coupled Filters; Closely Coupled Compute Nodes; Analysis and Visualization]

(6)

SERVOGrid (Complexity) Computing Model

[Figure labels: HPC Simulation; Data Filter (repeated); Distributed Filters massage data for simulation; OGSA-DAI Grid Services; Other Grid and Web Services; Analysis, Control, Visualize]

This type of Grid integrates with parallel computing

Multiple HPC facilities but only use one at a time

Many simultaneous data sources and sinks

(7)

Taxonomy of Grid Operational Style

Description of Grid Operational or Architectural Style

• Semantic Grid: Integration of Grid and Semantic Web meta-data and ontology technologies

• Peer-to-peer Grid: Grid built with peer-to-peer mechanisms

• Lightweight Grid: Grid designed for rapid deployment and minimum life-cycle support costs

• Collaboration Grid: Grid supporting collaborative tools like the Access Grid, whiteboard and shared applications

• R3 or Autonomic Grid: Fault tolerant and self-healing Grid (Robust, Reliable, Resilient: R3)

(8)

SERVOGrid Grid Requirements

• Seamless Access to Data repositories and large scale computers

• Integration of multiple data sources including sensors, databases, file systems with analysis system

– Including filtered OGSA-DAI

• Rich meta-data generation and access with SERVOGrid specific Schema extending industry standards

• Portals with component model for user interfaces and web control of all capabilities

(9)

What is a Web Service I

• A web service is a computer program running on either the local or remote machine with a set of well defined interfaces (ports) specified in XML (WSDL)

• In principle, the computer program can be in any language (Fortran .. Java .. Perl .. Python) and the interfaces can be implemented in any way whatsoever

– Interfaces can be method calls, Java RMI Messages, CGI Web invocations, or totally compiled away (inlining), but

• The simplest implementations involve XML messages (SOAP) and programs written in net friendly languages like Java and Python

• Web Services separate the meaning of a port (message) interface from its implementation
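
The separation of a message interface from its implementation can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the operation name `scale`, the namespace `urn:example:servogrid` and the helper functions are hypothetical, and a real deployment would use a SOAP toolkit and a WSDL description rather than hand-built XML.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace
APP_NS = "urn:example:servogrid"  # hypothetical application namespace


def build_request(operation, params):
    """Wrap an operation name and its parameters in a SOAP envelope."""
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    call = ET.SubElement(body, f"{{{APP_NS}}}{operation}")
    for name, value in params.items():
        ET.SubElement(call, f"{{{APP_NS}}}{name}").text = str(value)
    return ET.tostring(envelope, encoding="unicode")


def handle_request(xml_text, implementations):
    """Dispatch the message to whatever code implements the named operation."""
    envelope = ET.fromstring(xml_text)
    call = envelope.find(f"{{{SOAP_NS}}}Body")[0]
    operation = call.tag.split("}", 1)[1]
    args = {child.tag.split("}", 1)[1]: child.text for child in call}
    return implementations[operation](**args)


def scale_in_python(value, factor):
    # One possible implementation; a Fortran or Java program behind the
    # same message interface would be indistinguishable to the caller.
    return float(value) * float(factor)


message = build_request("scale", {"value": 2.5, "factor": 4})
result = handle_request(message, {"scale": scale_in_python})
```

The caller only ever sees the XML message; swapping `scale_in_python` for another implementation does not change `build_request` at all.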

(10)

[Figure labels: Raw Data; Raw Resources; Web Service (WS, repeated); (Virtual) XML Data Interface; XML WS to WS Interfaces; (Virtual) XML Knowledge (User) Interface; Render to XML Display Format; (Virtual) XML Rendering; Clients; etc.]

(11)

What are System and Application Services?

• There are generic Grid system services: security, collaboration, workflow, notification

– OGSA (Open Grid Service Architecture) is implementing these as extended Web Services

• An Application Web Service is a capability used either by another service or by a user

– It has input and output ports – data is from sensors or other services

• Consider Satellite-based Sensor Operations as a Web Service

– Satellite management (with a web front end)

– Each tracking station is a service

– Image Processing is a pipeline of filters – which can be grouped into different services

– Data storage is an important system service

– Big services built hierarchically from “basic” services
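
A minimal sketch of "big services built hierarchically from basic services", assuming each service is modelled as a plain Python callable from an input port to an output port. The filter names and numeric thresholds are hypothetical and stand in for real image-processing steps.

```python
from typing import Callable, Iterable, List

Service = Callable[[Iterable[float]], List[float]]  # input port -> output port


def compose(*stages: Service) -> Service:
    """Build a bigger service by chaining basic services output-to-input."""
    def composite(data):
        for stage in stages:
            data = stage(data)
        return list(data)
    return composite


# Hypothetical "basic" filter services.
def remove_dropouts(samples):
    return [s for s in samples if s >= 0]


def calibrate(samples):
    return [s * 0.5 for s in samples]


def threshold(samples):
    return [s for s in samples if s > 1.0]


clean = compose(remove_dropouts, calibrate)    # filters grouped into one service
image_processing = compose(clean, threshold)   # a service built from services

output = image_processing([4.0, -1.0, 1.0, 6.0])
```

Because `compose` returns something with the same shape as its inputs, grouped filters can themselves be grouped again, which is the hierarchical construction the slide describes.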

(12)

Application Web Services

• Note Service model integrates sensors, sensor analysis, simulations and people

• An Application Web Service is a capability used either by another service or by a user

– It has input and output ports – data is from users, sensors or other services

– Big services built hierarchically using workflow from “basic” services

[Figure labels: Sensor Data as a Web service (WS); Sensor Management WS; Data Analysis WS; Simulation WS; Visualization WS; Filter WS (repeated); Prog WS (repeated); Workflow builds as multiple Filter Web Services or as multiple …]

(13)

Grid Politics

• There is a Global Grid Forum meeting 3 times per year with about 700 attendees per meeting

– Exchange information and define standards for “everything” not done in W3C and OASIS

– e.g. Grid Service, Security, What is a Job, Database, Computer, How to build portals ….

• There is a large project called Globus developing software largely for “compute/file” Grids

• There are some 50 Grid projects (mainly in Europe and USA) developing software and applications as well as installing infrastructure

– Some are “deployment”: EDG NMI VDT …..

• There are related initiatives called CyberInfrastructure (NSF USA) and e-Science (UK)

• There is a proposed OMII (Open Middleware Infrastructure Institute)

(14)

OGSA/OGSI Top Level View

• OGSA is the set of “core” Grid services

– Stuff you can’t live without

– If you built a Grid you would need to invent these things

http://www.gridforum.org/Meetings/ggf7/docs/default.htm

http://www.globusworld.org/globusworld_web/jw2_program_tut.htm

[Figure labels: Hosting Environment; Web Services and OGSI; Broadly applicable services: registry, authorization, monitoring, data access, etc.; More specialized services: data replication, workflow, etc.; Domain-specific services; Models for resources & other entities; Other models]

(15)

OGSI Open Grid Service Interface

• http://www.gridforum.org/ogsi-wg

• It is a “component model” for web services.

• It defines a set of behavior patterns that each OGSI service must exhibit.

• Every “Grid Service” portType extends a common base type.

– Defines an introspection model for the service

– You can query it (in a standard way) to discover

• What methods/messages a port understands

• What other port types does the service provide?

• If the service is “stateful” what is the current state?

• A set of standard portTypes for

– Message subscription and notification

– Service collections

• Each service is identified by a URI called the “Grid Service Handle”

• GSHs are bound dynamically to Grid Service References (GSRs, typically WSDL documents)

– A GSR may be transient. GSHs are fixed.
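
The two ideas on this slide, introspection through a common base type and a fixed handle bound to a transient reference, can be sketched as follows. This is a toy model, not the OGSI interface: the class names, the `describe` method and the example URLs are all hypothetical.

```python
class HandleResolver:
    """Maps fixed handles (GSH stand-ins) to the current reference (GSR stand-in)."""

    def __init__(self):
        self._bindings = {}

    def bind(self, handle, reference):
        self._bindings[handle] = reference  # rebinding replaces the old reference

    def resolve(self, handle):
        return self._bindings[handle]


class GridService:
    """Toy stand-in for the common base type that every service extends."""
    port_types = ("GridService",)

    def __init__(self):
        self.state = {}

    def describe(self):
        """Introspection: which port types, which operations, what state."""
        operations = sorted(
            name for name in dir(self)
            if not name.startswith("_") and callable(getattr(self, name))
        )
        return {
            "portTypes": self.port_types,
            "operations": operations,
            "state": dict(self.state),
        }


class CounterService(GridService):
    port_types = GridService.port_types + ("Counter",)

    def __init__(self):
        super().__init__()
        self.state["count"] = 0

    def increment(self):
        self.state["count"] += 1


resolver = HandleResolver()
handle = "http://example.org/handles/counter-1"  # hypothetical fixed handle
resolver.bind(handle, "http://host-a.example.org/counter?wsdl")
first = resolver.resolve(handle)
resolver.bind(handle, "http://host-b.example.org/counter?wsdl")  # service moved
second = resolver.resolve(handle)

counter = CounterService()
counter.increment()
description = counter.describe()
```

Clients hold on to `handle` only; the reference it resolves to can change without the client noticing, which is the point of keeping the handle fixed and the reference transient.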

(16)

Two-level Programming I

• The paradigm implicitly assumes a two-level Programming Model

• We make a Service (same as a “distributed object” or “computer program” running on a remote computer) using conventional technologies

– C++ Java or Fortran Monte Carlo module

– Data streaming from a sensor or Satellite

– Specialized (JDBC) database access

• Such nuggets accept and produce data from users, files and databases

• The Grid is built by coordinating such nuggets assuming we have solved problem of programming the nugget


(17)

Two-level Programming II

• The Grid is discussing the linkage and distribution of the nuggets, with the only addition being runtime interfaces to the Grid as opposed to UNIX data streams

• Familiar from use of UNIX Shell, PERL or Python scripts to produce real applications from core programs

• Such interpretative environments are the single processor analog of Grid Programming and this tends to be called workflow

• Workflow is the composition of multiple services (programs) together to make a new service

– Includes “Software Bus”, “Application Integration”, “Co-ordination Languages” etc.

[Figure labels: Nugget1; Nugget2; Nugget …]

(18)

Workflow

• Workflow has at least 4 parts

– “Programming Environment” – typically GUI to drag and drop services and their linkages (familiar from AVS etc. which was workflow for visualization)

– Language – from XML to extended Python

– Compiler – converting Language into executable

– Runtime controlling flow of information and notification events

• Can use Python, Mathematica, Matlab, JavaSpaces, IBM BPEL4WS, DoE CCA etc.

– Don’t think current systems are very near “what we will want” but expect much progress over next 3 years and plenty of systems to work with
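
Three of the four parts listed above (language, compiler, runtime) can be illustrated with a very small sketch. The XML vocabulary here is invented for the example and is not BPEL4WS or any other real workflow language; the services are local Python functions standing in for remote ones.

```python
import xml.etree.ElementTree as ET

# Language: a hypothetical XML workflow description.
WORKFLOW_XML = """
<workflow>
  <step service="read_sensor"/>
  <step service="smooth"/>
  <step service="summarize"/>
</workflow>
"""

# Stand-ins for the services that the workflow links together.
SERVICES = {
    "read_sensor": lambda _: [1.0, 2.0, 6.0],
    "smooth": lambda xs: [(a + b) / 2 for a, b in zip(xs, xs[1:])],
    "summarize": lambda xs: max(xs),
}


def compile_workflow(xml_text, services):
    """Compiler: turn the XML description into an ordered list of callables."""
    root = ET.fromstring(xml_text.strip())
    return [(step.get("service"), services[step.get("service")])
            for step in root.iter("step")]


def run(plan, notify):
    """Runtime: pass data from step to step and emit one notification per step."""
    data = None
    for name, service in plan:
        data = service(data)
        notify(f"{name} finished")
    return data


events = []
result = run(compile_workflow(WORKFLOW_XML, SERVICES), events.append)
```

The fourth part, the drag-and-drop programming environment, would simply be a GUI that writes out something like `WORKFLOW_XML`.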

(19)

e-Science and the Data Deluge

Particle Physics

• 2006/7: First pp collisions at TeV energies at the Large Hadron Collider at CERN in Geneva

• ATLAS/CMS Experiments involve 2000 physicists from 200 organizations in US, EU, Asia

• Need to store, access, process, analyse 10 Petabytes/yr with 200 Teraflop/s distributed computation

• Building hierarchical Grid infrastructure to distribute data and computation

• Many tens of millions of dollars of funding for global particle physics Grids – GriPhyN, PPDataGrid, iVDGL, EU DataGrid, EU DataTag, UK GridPP projects
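
To give a feel for what 10 Petabytes/yr means as a sustained rate, here is a short back-of-envelope calculation. It assumes decimal units (1 PB = 10^15 bytes), a 365-day year, and perfectly even data taking, none of which is stated on the slide.

```python
SECONDS_PER_YEAR = 365 * 24 * 3600  # ignoring leap years
PETABYTE = 10**15                   # decimal units assumed


def sustained_rate(bytes_per_year):
    """Average bytes per second needed to move a yearly volume."""
    return bytes_per_year / SECONDS_PER_YEAR


lhc_rate = sustained_rate(10 * PETABYTE)
lhc_rate_mb = lhc_rate / 10**6        # megabytes per second
lhc_rate_gbit = lhc_rate * 8 / 10**9  # gigabits per second
```

Under these assumptions the average comes out at a little over 300 MB/s, or roughly 2.5 Gbit/s, before any replication of the data between sites.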

(20)

Astronomy and its Data Deluge

• Virtual Observatories – NVO, AVO, AstroGrid

– Store all wavelengths, need distributed joins

– NVO 500 TB/yr from 2004

• Laser Interferometer Gravitational-Wave Observatory (LIGO)

– Search for direct evidence for gravitational waves

– LIGO 250 TB/yr, random streaming from 2002

• VISTA Visible and IR Survey Telescope in 2004

– 250 GB/night, 100 TB/yr, Petabytes in 10 yrs

• New phase of astronomy, storing, searching and analysing Petabytes of data

The total area of astronomical telescopes in m², and CCDs measured in

(21)

Engineering, Chemistry, Environmental BioInformatics and Medical Applications

• Real-Time Industrial Health Monitoring

– UK DAME project for Rolls Royce Aero Engines

– 1 GB sensor data/flight, 100,000 engine hours/day

• Combinatorial Chemistry – experiments on demand

• Earth Observation

– ESA satellites generate 100 GB/day

– NASA 15 PB by 2007

• Bioinformatics

– Tens of TB of high value curated data

• Medical Images to Information

(22)

Importance of Metadata

• Metadata is ‘data about data’, e.g. catalogues, indices, directory structures

• Librarians work with books which have same basic ‘schema’, e.g. title, author(s), publisher, date, etc.

• Need for hierarchical, community-based approach to defining metadata and schemas

– e.g. CML, SERVOGridML ……..

• Metadata important for interoperability of databases/federated archives, and for construction of intelligent search agents
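
As a rough illustration of a hierarchical schema and of metadata-driven search, the sketch below extends a shared base schema with community-specific fields and queries a small catalogue. The field names (`region`, `code_name`) and the records are invented for the example and are not taken from SERVOGridML or CML.

```python
BASE_SCHEMA = {"title", "authors", "date"}                # shared, library-style fields
COMMUNITY_SCHEMA = BASE_SCHEMA | {"region", "code_name"}  # hypothetical extension


def missing_fields(record, schema):
    """Return the schema fields that a metadata record does not provide."""
    return sorted(schema - set(record))


def search(catalogue, **criteria):
    """Return titles of records whose metadata match every criterion."""
    return [r["title"] for r in catalogue
            if all(r.get(k) == v for k, v in criteria.items())]


catalogue = [
    {"title": "Run 12", "authors": ["A. Author"], "date": "2003-06-05",
     "region": "California", "code_name": "sim-a"},
    {"title": "Run 13", "authors": ["B. Author"], "date": "2003-06-06",
     "region": "Japan", "code_name": "sim-a"},
]
incomplete = {"title": "Run 14", "authors": ["C. Author"]}

hits = search(catalogue, region="Japan")
gaps = missing_fields(incomplete, COMMUNITY_SCHEMA)
```

A search agent never opens the underlying data files here; it works entirely from the metadata, which is why agreed schemas matter for interoperability.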

(23)

Simulation Output as Digital Library

• Digital Libraries usually for archiving of text, audio and video data

• Scientific data require transformation, data-mining and visualisation tools

(24)

Emergence of a new research methodology?

• Traditional scientific methodologies are theory and experiment

• Last half of 20th century saw emergence of scientific simulation as a third methodology

• This century will see emergence of a fourth methodology - collection-based research

(25)

OGSA-DAI

(Malcolm Atkinson, Edinburgh) UK e-Science Grid Core Programme

Development of Data Access and Integration Services for OGSA

http://umbriel.dcs.gla.ac.uk/NeSC/general/projects/OGSA_DAI

– Access to XML Databases

– Access to Relational Databases

(26)

OGSA-DAI Key Services

• GridDataService (GDS): Access to data & DB operations

• GridDataServiceFactory (GDSF): Makes GDS & GDSF

• GridDataServiceRegistry (GDSR): Discovery of GDS(F) & Data

• GridDataTranslationService (GDTS): Translates or Transforms Data

• GridDataTransportDepot (GDTD): Data transport with persistence

Integrated Structured Data Transport

Relational & XML models supported

Role-based Authorisation
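
The registry, factory and service roles listed above follow a familiar pattern: discover a factory, ask it for a service, then use the service to reach the data. The sketch below mimics that flow with an in-memory SQLite database. The class and method names are illustrative only and are not the OGSA-DAI interfaces; the table and its contents are made up.

```python
import sqlite3


class GridDataService:
    """GDS stand-in: gives access to one data resource."""

    def __init__(self, connection):
        self._connection = connection

    def perform(self, sql, params=()):
        return self._connection.execute(sql, params).fetchall()


class GridDataServiceFactory:
    """GDSF stand-in: makes GDS instances for a data resource."""

    def __init__(self, connection):
        self._connection = connection

    def create_service(self):
        return GridDataService(self._connection)


class GridDataServiceRegistry:
    """GDSR stand-in: discovery of factories by resource name."""

    def __init__(self):
        self._factories = {}

    def register(self, name, factory):
        self._factories[name] = factory

    def find(self, name):
        return self._factories[name]


# An in-memory SQLite table plays the role of a relational database.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE faults (name TEXT, slip_rate REAL)")
db.executemany("INSERT INTO faults VALUES (?, ?)", [("A", 1.5), ("B", 30.0)])

registry = GridDataServiceRegistry()
registry.register("faults-db", GridDataServiceFactory(db))

service = registry.find("faults-db").create_service()
rows = service.perform("SELECT name FROM faults WHERE slip_rate > ?", (10.0,))
```

The client code at the bottom never touches `sqlite3` directly, so the same three steps would work if the factory wrapped an XML database instead.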

(27)

[Figure labels: Client (three clients); Grid Data Service; Relational database; XML database; Directory / File system]

(28)

Integration of Data and Filters

• One has the OGSA-DAI Data repository interface combined with WSDL of the (Perl, Fortran, Python …) filter

• User only sees WSDL not data syntax

• Some non-trivial issues as to where the filtering compute power is

– Microsoft says filter next to data

[Figure labels: DB; Filter; WSDL of Filter]
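
The question of where the filtering compute power sits can be made concrete by counting how many records leave the repository in each case. This is a toy model: the `DataRepository` class, the magnitude threshold and the records are hypothetical, and a real system would also weigh the compute load placed on the data host.

```python
class DataRepository:
    """Toy repository that counts how many records cross the network."""

    def __init__(self, records):
        self._records = list(records)
        self.records_sent = 0

    def fetch_all(self):
        # Filter runs at the client, so every record is shipped.
        self.records_sent += len(self._records)
        return list(self._records)

    def fetch_filtered(self, keep):
        # Filter runs next to the data, so only matches are shipped.
        selected = [r for r in self._records if keep(r)]
        self.records_sent += len(selected)
        return selected


def large_event(record):
    return record["magnitude"] >= 6.0


records = [{"id": i, "magnitude": m}
           for i, m in enumerate([3.1, 6.4, 4.8, 7.0, 2.2])]

remote = DataRepository(records)
at_client = [r for r in remote.fetch_all() if large_event(r)]

local = DataRepository(records)
at_data = local.fetch_filtered(large_event)
```

Both placements return the same answer; they differ only in `records_sent`, which is the cost the "filter next to data" position is trying to reduce.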

(29)

OGSA OGSI & Hosting Environments

• Start with Web Services in a hosting environment

• Add OGSI to get a Grid service and a component model

• Add OGSA to get Interoperable Grid “correcting” differences in base platform and adding key functionalities

[Figure labels: Network; Hosting Environment for WS; OGSI on Web Services; Broadly applicable services: registry, authorization, monitoring, data access, etc.; More specialized services: data replication, workflow, etc.; Domain-specific services; Models for resources & other entities; Other models; OGSA Environment; Possibly OGSA; Not OGSA]

(30)

Permeating Principles and Policies

• Meta-data rich Message-linked Web Services as the permeating paradigm

• “User” Component Model such as “Enterprise JavaBean (EJB)” or .NET.

• Service Management framework including a possible Factory mechanism

• High level Invocation Framework describing how you interact with system components.

– This could for example be used to allow the system to be built from either W3C or GGF style (OGSI) Web Services and to protect the user from changes in their specifications.

• Security is a service but the need for fine grain selective authorization encourages

• Policy context that sets the rules for each particular Grid.

– Currently OGSA supports policies for routing, security and resource use.

• The Grid Fabric or set of resources needs mechanisms to manage them. This includes automatic recording of meta-data and configuration of software.

• Quality of service (QoS) for the Network and this implies performance monitoring and bandwidth reservation services.

– Challenging as end-to-end and not just backbone QoS is needed.

• Messaging systems like MQSeries from IBM provide robustness from asynchronous delivery and can abstract destination and allow customization of content such as converting between different interface specifications.

(31)

Virtualization

• The Grid could and sometimes does virtualize various concepts

• Location: URI (Uniform Resource Identifier) virtualizes URL

• Replica management (caching) virtualizes file location generalized by GriPhyN virtual data concept

• Protocol: message transport and WSDL bindings virtualize transport protocol as a QoS request

• P2P or Publish-subscribe messaging virtualizes matching of source and destination services

• Semantic Grid virtualizes Knowledge as a meta-data query

• Brokering virtualizes resource allocation
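
One of the items above, publish-subscribe messaging virtualizing the match between source and destination, is easy to sketch. The `Broker` class and the topic names are hypothetical and do not correspond to any particular messaging product.

```python
from collections import defaultdict


class Broker:
    """Minimal topic-based publish-subscribe broker."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, topic, callback):
        self._subscribers[topic].append(callback)

    def publish(self, topic, message):
        # The publisher names a topic, never a destination service.
        for callback in self._subscribers[topic]:
            callback(message)
        return len(self._subscribers[topic])


broker = Broker()
received = []
broker.subscribe("sensor/gps", received.append)
broker.subscribe("sensor/gps", lambda m: received.append(m.upper()))

delivered = broker.publish("sensor/gps", "station moved")
ignored = broker.publish("sensor/seismic", "no listeners")
```

The source never learns who, or how many, received the message; adding or removing a destination is a change to the broker's table only.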

(32)

Interfaces and Functionality and Semantics I

• The Grid platform tries to minimize detail in protocols and maximize detail in interfaces to enhance scaling

• However rich meta-data and semantics are critical for correct and interesting operation

– Put as much semantic interpretation as you can into specific services

– Lack of Semantic interoperation is in fact main weakness of today’s Grids and Web services

• Everything becomes a service whether system or application level

• There are some very important “Global Services”

– Discovery (look up) and Registration of service metadata

– Workflow

(33)

Interfaces and Functionality and Semantics II

• There are many other generally important services

• OGSA-DAI The Database Service

• Portal Service linked to by WSRP (Web services for Remote Portals)

• Notification of events

• Job submission

• Provenance – interpret meta-data about history of data

• File Interfaces

• Sensor service – satellites …

• Visualization

(34)

Web Services as a Portlet

• Each Web Service naturally has a user interface specified as “just another port”

– Customizable for universal access

• This gives each Web Service a Portlet view specified (in XML as always) by WSRP (Web services for Remote Portals)

• So component model for resources “automatically” gives a component model for user interfaces

– When you build your application, you define the portlet at the same time

[Figure labels: Application or Content source; WSDL; Web Service; WSRP; Application as a WS; General Application Port Interface with other Web Services; User Face of Web Service; WSRP Ports define WS as a Portlet]

Web Services have other ports (Grid Service) to be
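
The idea that a user interface is "just another port" on the same service can be sketched as a class with one application port and one rendering port, plus a portal that aggregates the rendered fragments. This is a hand-rolled illustration and not the WSRP protocol; `EchoService`, `render` and `portal_page` are hypothetical names.

```python
from html import escape


class EchoService:
    """Hypothetical application service with two ports."""

    def invoke(self, text):
        # Application port: used by other services.
        return text.upper()

    def render(self, text=""):
        # User-interface port: returns a markup fragment for a portal to embed.
        result = escape(self.invoke(text))
        return f'<div class="portlet"><h3>Echo</h3><p>{result}</p></div>'


def portal_page(fragments):
    """Aggregate portlet fragments from several services into one page."""
    return "<html><body>" + "".join(fragments) + "</body></html>"


service = EchoService()
page = portal_page([service.render("a < b")])
```

Because the fragment is produced by the service itself, writing the application and defining its portlet happen in the same place, as the slide suggests.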

(35)

Online Knowledge Center built from Portlets

• Web Services provide a component model for the middleware (see large “common component architecture” effort in Dept. of Energy)

• Should match each WSDL component with a corresponding user interface component

• Thus one “must use” a component model for the portal with again an XML specification (portalML) of portal component

(36)

Sample page with several portlets:

(37)

Provide information about application and host parameters

Select application to edit

(38)

Categories of Worldwide Grid Services to be exploited by SERVOGrid

• 1) Types of Grid

– R3

– Lightweight

– P2P

– Federation and Interoperability

• 2) Core Infrastructure and Hosting Environment

– Service Management

– Component Model

– Service wrapper/Invocation

– Messaging

• 3) Security Services

– Certificate Authority

– Authentication

– Authorization

– Policy

• 4) Workflow Services and Programming Model

– Enactment Engines (Runtime)

– Languages and Programming

– Compiler

– Composition/Development

• 5) Notification Services

• 6) Metadata and Information Services

– Basic including Registry

– Semantically rich Services and meta-data

– Information Aggregation (events)

– Provenance

• 7) Information Grid Services

– OGSA-DAI/DAIT

– Integration with compute resources

– P2P and database models

• 8) Compute/File Grid Services

– Job Submission

– Job Planning Scheduling Management

– Access to Remote Files, Storage and Computers

– Replica (cache) Management

– Virtual Data

– Parallel Computing

• 9) Other services including

– Grid Shell

– Accounting

– Fabric Management

– Visualization Data-mining and Computational Steering

– Collaboration

• 10) Portals and Problem Solving Environments

• 11) Network Services

(39)

What should SERVOGrid do?

• Make use of Grid technologies and architecture from around the world

• Coordinate with broad community through Global Grid Forum and OMII

• Decide on domain specific standards SERVOGridML

• Agree on particular approach within choices in international suite (use GT3 or not?, use portlets or not?, choose meta-data technology) and define SERVOGrid community practice

• Develop software system infrastructure and applications specific to solid earth science

(40)

Proposed OMII Activities: Central Gaps

Gaps in Grid Styles and Execution Environment

• Need for both robust (fault tolerant) and lightweight (suitable for small groups) Grid styles identified

– Peer-to-peer style supports smaller decentralized virtual organizations

• Note opportunities for modern middleware ideas to be used – lightweight, message-based

• Note that Enterprise JavaBeans not optimized for Science which has high volume dataflow

• Federated Grid Architecture natural for integration of heterogeneous functionality, style and security

(41)

[Figure labels: Information Grid; Enterprise Grid; Compute Grid; Campus Grid; R1; R2; Teacher; Students; Dynamic light-weight Peer-to-peer Collaboration Training Grid; Overlapping Heterogeneous]

(42)

[Figure labels: (a) Layered OGSA Grid; Application Service (repeated); OGSA Interface; Core Service (repeated); OGSA Mediation; Grid-1; Grid-2; Appl. Service (repeated); OGSA or non OGSA Interface-1; OGSA or non OGSA Interface-2]

(43)

Many Gaps in Generic Services

• Some gaps like Workflow and Notification are to make production versions of current projects

– Just in UK workflow from DAME, DiscoveryNet, EDG, Geodise, ICENI, myGrid, Unicore plus Cardiff, NEReSC ….

• RGMA and Semantic Grid offer improved meta-data and Information services compared to UDDI and MDS (Globus)

– Need comprehensive federated Information service

• Security requires architecture supporting dynamic fine-grain authorization

• UK e-Science has pioneered Information Grids but gap is continuation of OGSA-DAI, integration with other services and P2P decentralized models

(44)

Gaps in Other Grid services

• Portals and User Interfaces – Noted gap that many not using Grid Computing Environment “best practice” with component based user-interfaces matching component-based middleware

• Programming Models (using workflow runtime)

• Fabric Management (should be integrated with central service management and Information system), Computational Steering, Visualization, Datamining, Accounting, Gridmake, Debugging, Semantic Grid tools (consistent with Information system), Collaboration, provenance

• Application-specific services
