Web Services and Grid Architectur
and their application to
Earthquake Science
USC AIST Meeting
August 31 2004
Geoffrey Fox
Community Grids Lab
Indiana University
Philosophy of Web Service Grids
•
Much of Distributed Computing was built by natural
extensions of computing models developed for sequential
machines
•
This leads to the
distributed object
(DO) model represented
by Java and
CORBA
– RPC (Remote Procedure Call) or RMI (Remote Method Invocation) for Java
•
Key people think this is not a good idea as it scales badly
and ties distributed entities together too tightly
– Distributed Objects Replaced by Services
•
Note
CORBA
was considered too complicated in both
organization and proposed infrastructure
– and Java was considered as “tightly coupled to Sun” – So there were other reasons to discard
Web services
•
Web Services
build
loosely-coupled,
distributed applications,
based on the
SOA
principles.
•
Web Services interact
by exchanging
messages in
SOAP
format
•
The contracts for the
message exchanges that
implement those
interactions are
described via
WSDL
What is a Grid?
•
You won’t find a clear description of what is Grid and how
does
differ
from a
collection of Web Services
– I see no essential reason that Grid Services have different requirements than Web Services
– Geoffrey Fox, David Walker, e-Science Gap Analysis, June 30 2003. Report UKeS-2003-01,
http://www.nesc.ac.uk/technical_papers/UKeS-2003-01/index.html.
– Notice “service-building model” is like programming language – very personal!
•
Grids were once defined as “
Internet Scale Distributed
Computing
” but this isn’t good as Grids depend as much if
not more on data as well as simulations
e-Infrastructure
e-Infrastructure builds on the inevitable increasing performance of networks and computers linking them together to support new flexible linkages between computers, data systems and people
• Grids and peer-to-peer networks are the technologies that build e-Infrastructure
• e-Infrastructure called CyberInfrastructure in USA
We imagine a sea of conventional local or global connections supported by the “ordinary Internet”
• Phones, web page accesses, plane trips, hallway conversations
• Conventional Internet technology manages billions of
broadcast or low (one client to Server) or broadcast links
On this we superimpose high value multi-way organizations (linkages) supported by Grids with optimized resources and system support
• Low multiplicity fully interactive real-time sessions
N plus N Community Resources
Grid Community databases have analogy to Television and the
Large and Small Grids
N resources in a community (N is billions for the world and 1000-10000 for many scientific fields)
Communities are arranged hierarchically with real work being done in “groups” of M resources – M could be 10-100 in e-Science Metcalfe’s law: value of network grows like square of number of
nodes M – we call Grids where this true Metcalfe or M2 Grids
Nature of Interaction depends on size of M or N
• N plus N Shared Information Grids for largish N
• M2 Metcalfe Grids for smaller M < N
Technology support depends on M/N – might use a relatively
static DHT (Distributed Hash Table) for large N and a distributed shared memory for small M
Grids must merge with peer-to-peer networks to support both N
M
2
Interactions
• Superimpose M way “Grids” on the sea
(heatbath) of “2 by N” or N plus N
“ordinary” interactions
Implement Grid as a softwar
overlay network
Information Complexity I
n
Consider a community of N resources with
groups of
size M
with each group
complexity C
• N/M Groups
n
Information in systems varies from
coherent
(harmonious)
to
incoherent
limits
• Web and Grid data resources supply coherence as in curated
astronomy, bioinformatics, geophysics database
• Can consider N plus N Grids as Coherent or Harmonious Grids
n
I = (NM)
0.5. (C/M)
Incoherent to
N . (C/M)
Coherent
nIn this language
Grids
do one or both of
• Coherence/Harmony – common shared asynchronous
resources
• Interactivity – Increase complexity to M2 with real-time
Information Complexity II
n
N plus N Community
database has I =
N Coherent
•
Improving on
N
0.5incoherent
case
n
Nearest Neighbor
groups is I =
(NM)
0.5•
Becoming I =
N
in limit M = N
•
M is correlation length in Complex Systems approach
n
M-ary Interactive group
(M
2Metcalfe Grids) has C = M
2an
I =
(NM
3)
0.5Incoheren
to I =
NM Coherent
•
Coherent case most natural in science due to
synergy
between Metcalfe and Coherence Grids
n
“
Small World
(logarithmic) networks” and hierarchical
Architecture of (Web Service) Grids
Grids built from
Web Services
communicating through
an overlay network built in SOFTWARE on the
“ordinary internet” at the application level
Grids provide the
special quality of service
(security,
performance, fault-tolerance) and customized services
needed for “distributed complex enterprises”
We need to work with Web Service community as they
debate the 60 or so proposed Web Service specifications
• Use Web Service Interoperability WS-I as “best practice”
• Must add further specifications to support high performance
• Database “Grid Services” for N plus N case
Importance of SOAP
• SOAP defines a very obvious message structure
with a
header
and a
body
• The
header
contains information used by the
“
Internet operating system
”
–
Destination, Source, Routing, Context, Sequence
Number …
• The
message body
is only used by the
application
and will never be looked at by “operating system”
except to encrypt, compress it etc.
• Much discussion in field revolves around
what is in
header!
Web Services
• Java is very powerful partly due to its many “frameworks” that generalize libraries e.g.
– Java Media Framework
– Java Database Connectivity JDBC
• Web Services have a correspondingly collections of specifications that represent critical features of the distributed operating systems for “Grids of Simple Services”
– Some 60 active WS-* specifications for areas such as – a. Core Infrastructure Specifications
– b. Service Discovery
– c. Security
– d. Messaging
– e. Notification
– f. Workflow and Coordination – g. Characteristics
A List of Web Services I
• a) Core Service Architecture
• XSD XML Schema (W3C Recommendation) V1.0 February 1998,
V1.1 February 2004
• WSDL 1.1 Web Services Description Language Version 1.1,
(W3C note) March 2001
• WSDL 2.0 Web Services Description Language Version 2.0,
(W3C under development) March 2004
• SOAP 1.1 (W3C Note) V1.1 Note May 2000
• SOAP 1.2 (W3C Recommendation) June 24 2003
• b) Service Discovery
• UDDI (Broadly Supported OASIS Standard) V3 August 2003
• WS-Discovery Web services Dynamic Discovery (Microsoft,
BEA, Intel …) February 2004
• WS-IL Web Services Inspection Language, (IBM, Microsoft)
A List of Web Services II
• c) Security
• SAML Security Assertion Markup Language (OASIS) V1.1 May
2004
• XACML eXtensible Access Control Markup Language (OASIS) V1.0 February 2003
• WS-Security 2004 Web Services Security: SOAP Message
Security (OASIS) Standard March 2004
• WS-SecurityPolicy Web Services Security Policy (IBM,
Microsoft, RSA, Verisign) Draft December 2002
• WS-Trust Web Services Trust Language (BEA, IBM, Microsoft,
RSA, Verisign …) May 2004
• WS-SecureConversation Web Services Secure Conversation
Language (BEA, IBM, Microsoft, RSA, Verisign …) May 2004
• WS-Federation Web Services Federation Language (BEA, IBM,
A List of Web Services III
• d) Messaging
• WS-Addressing Web Services Addressing (BEA, IBM, Microsoft)
March 2004
• WS-MessageDelivery Web Services Message Delivery (W3C
Submission by Oracle, Sun ..) April 2004
• WS-Routing Web Services Routing Protocol (Microsoft) October 2001
• WS-RM Web Services Reliable Messaging (BEA, IBM, Microsoft,
Tibco) v0.992 March 2004
• WS-Reliability Web Services Reliable Messaging (OASIS Web
Services Reliable Messaging TC) March 2004
• SOAP MOTM SOAP Message Transmission Optimization Mechanism
(W3C) June 2004
• e) Notification
• WS-Eventing Web Services Eventing (BEA, Microsoft, TIBCO)
January 2004
• WS-Notification Framework for Web Services Notification with
WS-Topics, WS-BaseNotification, and WS-BrokeredNotification (OASIS)
OASIS Web Services Notification TC Set up March 2004
A List of Web Services IV
• f) Coordination and Workflow, Transactions and Contextualization
• WS-CAF Web Services Composite Application Framework including WS-CTX,
WS-CF and WS-TXM below (OASIS Web Services Composite Application Framework TC) July 2003
• WS-CTX Web Services Context (OASIS Web Services Composite Application Framework TC) V1.0 July 2003
• WS-CF Web Services Coordination Framework (OASIS Web Services Composite Application Framework TC) V1.0 July 2003
• WS-TXM Web Services Transaction Management (OASIS Web Services Composite Application Framework TC) V1.0 July 2003
• WS-Coordination Web Services Coordination (BEA, IBM, Microsoft) September 2003
• WS-AtomicTransaction Web Services Atomic Transaction (BEA, IBM, Microsoft) September 2003
• WS-BusinessActivity Web Services Business Activity Framework (BEA, IBM, Microsoft) January 2004
• BTP Business Transaction Protocol (OASIS) May 2002 with V1.0.9.1 May 2004
• BPEL Business Process Execution Language for Web Services (OASIS) V1.1 May 2003
• WS-Choreography (W3C) V1.0 Working Draft April 2004
• WSCI (W3C) Web Service Choreography Interface V1.0 (W3C Note from BEA, Intalio, SAP, Sun, Yahoo)
A List of Web Services V
• h) Metadata and State
• RDF Resource Description Framework (W3C) Set of recommendations expanded from original February 1999 standard
• DAML+OIL combining DAML (Darpa Agent Markup Language) and OIL (Ontology Inference Layer) (W3C) Note December 2001
• OWL Web Ontology Language (W3C) Recommendation February 2004
• WS-DistributedManagement Web Services Distributed Management Framework with MUWS and MOWS below (OASIS)
• WSDM-MUWS Web Services Distributed Management: Management Using Web Services (OASIS) V0.5 Committee Draft April 2004
• WSDM-MOWS Web Services Distributed Management: Management of Web Services (OASIS) V0.5 Committee Draft April 2004
• WS-MetadataExchange Web Services Metadata Exchange (BEA,IBM, Microsoft, SAP) March 2004
• WS-RF Web Services Resource Framework including WS-ResourceProperties,
WS-ResourceLifetime, WS-RenewableReferences, WS-ServiceGroup, and
WS-BaseFaults (OASIS) Oasis TC set up April 2004 and V1.1 Framework March 2004
• ASAP Asynchronous Service Access Protocol (OASIS) with V1.0 working draft G June 2004
A List of Web Services VI
•
g) General Service Characteristics
•
WS-Policy
Web Services Policy Framework (BEA, IBM,
Microsoft, SAP)
May 2003
•
WS-PolicyAssertions
Web Services Policy Assertions
Language (BEA, IBM, Microsoft, SAP)
May 2003
•
WS-Agreement
Web Services Agreement Specification
(GGF under development)
May
2004
•
i) User Interfaces
•
WSRP
Web Services for Remote Portlets (OASIS)
OASIS Standard
August 2003
WS-I Interoperability
•
Critical underpinning of Grids and Web Services is the
gradually growing set of specifications in the Web Service
Interoperability Profiles
•
Web Services Interoperability
(WS-I) Interoperability
Profile 1.0a." h
ttp://www.ws-i.org.
gives us
XSD,
WSDL1.1, SOAP1.1, UDDI
in basic profile and parts of
WS-Security
in their first security profile.
•
We imagine the “60 Specifications” being checked out
and evolved in the
cauldron of the real world
and
occasionally best practice identifies a new specification to
be added to
WS-I
which
gradually increases in scope
Web Services Grids and
WS-I+
• WS-I Interoperability doesn’t cover all the capabilities need tosupport Grids
• WS-I+ is designed to minimal extension of WS-I to support “most current” Grids: it adds support for
– Enhanced SOAP Addressing (WS-Addressing) – Fault tolerant (reliable) messaging
– Workflow as in IBM-Microsoft standard BPEL
• Security and Notification best practice and support will probably get added soon
– There are Web Service frameworks here but various IBM v Microsoft v Globus differences to be resolved
• Portlet-based User Interfaces could be added
• UK OMII Open Middleware Infrastructure Institute is adopting this approach to support UK e-Science program
– Currently UK e-Science largely either uses GT2 (as in EDG) or Simple Web Services for “database Grids”
Bit
level
Internet
(OSI
Stack)
Layered Architecture for Web Services and Grids
Base Hosting Environment
Protocol HTTP FTP DNS …
Presentation XDR …
Session SSH …
Transport TCP UDP …
Network IP …
Data Link / Physical
Servic Internet
Application Specific Grids
Generally Useful Services and Grids
Workflow WSFL/BPEL
Service Management (“Context etc.”)
Service Discovery (UDDI) / Information
Service Internet Transport
Protocol
Service Interfaces WSDL
Servic Context
Higher Level
How
SERVOGrid
Fits In
•
There is core Web Services – the “operating
system of the world” – controlled by WS-*
•
There is workflow – programming the Grid
•
There are very general Web Services and Grids
such as
– Database
– Collaboration
– Job Submittal
•
There are some relatively general services and
Grids such as
– Visualization
– GIS
•
There are application specific services such as
Layers of Onion
All SERVOGrid capabilities are built as Web Services with this structure
3 level programming model Application
(level 1 Programming)
Application Semantics (Metadata, Ontology) Level 2 “Programming”
Systems Metadata (Context, State)
Basic WS-* Infrastructure
Web Service 1
Workflow (level 3) Programming
Working up from the Bottom
We have the classic (CISCO, Juniper ….) Internet routing the flood of ordinary packets in OSI stack architecture
Web Services build the “Service Internet” or IOI (Internet on
Internet) with
• Routing via WS-Addressing not IP header
• Fault Tolerance (WS-RM not TCP)
• Security (WS-Security/SecureConversation not IPSec/SSL)
• Information Services (UDDI/WS-Context not DNS/Configuration files)
• At message/web service level and not packet/IP address level Software-based Service Internet possible as computers “fast” Familiar from Peer-to-peer networks and built as a software
overlay network defining Grid (analogy is VPN)
Consequences of Rule of the Millisecond
• Useful to remember critical time scales
– 1) 0.000001 ms – CPU does a calculation – 2) 0.001 to 0.01 ms – MPI latency
– 3) 1 to 10 ms – wake-up a thread or process – 4) 10 to 1000 ms – Internet delay
• 4) implies geographically distributed metacomputing can’t in general compete with parallel systems (OK for some cases)
• 3) << 4) implies RPC not a critical programming abstraction as it ties distributed entities together and gains a time that is typically only 1% of inevitable network delay
– However many service interactions are at their heart RPC but implemented differently at times e.g. asynchronously
• 2) says MPI is not relevant for a distributed environment as low latency cannot be exploited
Linking Modules
n
From method based to RPC to message based to
event-based
Module A Module B Method Call .001 to 1 millisecondService A Service
B Messages
0.1 to 1000 millisecond latency
Coarse Grain Service Model
Closely coupled Java/Python …
Service B Service A
Publisher Post Events “Listener
Subscribe to Events
What is a Simple Service?
• Take any system – it has multiple functionalities
– We can implement each functionality as an independent distributed service
– Or we can bundle multiple functionalities in a single service • Whether functionality is an independent service or one of many
method calls into a “glob of software”, we can always make them as Web services by converting interface to WSDL
• Simple services are gotten by taking functionalities and making as small as possible subject to “rule of millisecond”
– Distributed services incur messaging overhead of one (local) to
100’s (far apart) of milliseconds to use message rather than method call
– Use scripting or compiled integration of functionalities ONLY when require <1 millisecond interaction latency
• Apache web site has many projects that are multiple functionalities presented as (Java) globs and NOT (Java) Simple Services
Grids of Grids of Simple Services
• Link via methods messages streams
• Services and Grids are linked by messages
• Internally to service, functionalities are linked by methods
• A simple service is the smallest Grid
• We are familiar with method-linked hierarch
Lines of Code Methods Objects Programs Packages
Overlay and Compose
Grids of Grids Methods Services Component Grids
CPUs Clusters Compute
Resource Grids MPPs
Databases DatabasesFederated
Sensor Sensor Nets
Data
Component Grids?
• So we build collections of Web Services which we
package as
component Grids
–
Visualization Grid
–
Sensor Grid
–
Utility Computing Grid
–
Person (Community) Grid
–
Earthquake Simulation Grid
–
Control Room Grid
–
Crisis Management Grid
Critical Infrastructure (CI) Grids built as Grids of Grids
Gas Service and Filters
Physical Network Registr
y Metadata
Flood Service and Filters
Flood CIGrid
…
Electricity Gas CIGridCIGrid
…
Data
Access/Storage Securit
y Notification Workflow Messaging
Portal
s VisualizationGrid Collaboration
Grid
Sensor Grid Compute
Grid GIS Grid
Database Database Analysis and Visualizatio Portal Repositorie Federated Databases Data Filte Services
Field Trip Data Streaming Data Sensor s
?
Discovery Services SERVOGrid Researc Simulation s Research Education Customization Services From Researc to Education Educatio Grid Computer FarmGeoscience Research and Education Grids
GI Grid
Sensor Grid Database Grid
IOI and CIE
• Let us study the two layers IOI (Service Internet On the Bit Internet) and CIE (Context and Information Environment)
• IOI is most “straightforward” as it is providing reasonably well understood capabilities at a new “level”
• CIE is roughly the inter-service “shared memory” used to manage and control them at “distributed operating system level
– Critical is “shared” (a database service) versus message based CIE
IOI
Application Specific Grids
Generally Useful Services and Grids
Workflow WSFL/BPEL
Service Management (“Context etc.”)
Service Discovery (UDDI) / Information
Service Internet Transport
Protocol
Service Interfaces WSDL
CIE Higher Level
Minicompute r Firewall Compute r Serve r PDA Mode m Laptop computer Workstatio n Peer s Peer s Audio/Video Conferencing Client Audio/Video Conferencing Client NaradaBrokering Broker Network
NaradaBrokering
Web Service B
Stream
Server-enhance Messaging
NB supports messages and streams
NaradaBrokering and IOI
• “Software Overlay Network” features • Support for Multiple Transport protocols • Support for multiple delivery mechanisms
– Reliable Delivery
– Exactly-once Delivery – Ordered Delivery
– Optional Delivery optimization modules for different modes • Compression/Decompression of payloads with optional module • Coalescing/Fragmentation of payloads with optional module • NTP Time Service
• Security Service
• Performance Monitoring
• Performance optimized routing with optional module
Virtualizing Communication
n Communication specified in terms of user goal and Quality of
Service – not in choice of port number and protocol
n Bit Internet Protocols have become overloaded e.g. MUST use UDP for A/V latency requirements but CAN’t use UDP as
firewall will not support ………
n A given “Service Internet” communication can involve multiple
transport protocols and multiple destinations – the latter possibly determined dynamically
Dial-u Filter Satellit
UDP
Firewal HTTP
A
B1
Hand-Hel Protocol Fas
Link
Software Multicast B2
B3
NB Broker Client Filtering
Performance Monitoring
n
Every broker
incorporates a
Monitoring service
that
monitors links
originating from the node.
n
Every link measures and exposes a set of
metrics
• Average delays, jitters, loss rates, throughput.n
Individual links can
disable
measurements for
individual or the entire set of metrics.
n
Measurement
intervals can
also be varied
n
Monitoring Service,
returns measured
metrics to
NaradaBrokering Service Integration
S1 P1 P2 S2
S1 S2
S1 S2
Proxy Messaging
Handler Messaging
Notification
S? Service P? Proxy
NB Transport
Internal to Service: SOAP Handlers/Extensions/Plug-ins Java (JAX-RPC) .NET Indigo and special cases: PDA's gSOAP, Axis C++
Standard SOAP Transport
Fast Web Service Communication I
• IOI Application level Internet allows one to optimize
message streams at the cost of “startup time”,
Web Services
can deliver the fastest possible interconnections
with or
without reliable messaging
• Typical results from Grossman (UIC) comparing Slow
SOAP over TCP with binary and UDP transport (latter
gains a factor of
1000
)
Pure SOAP SOAP over UDP Binary over UDP
Fast Web Service Communication II
• Mechanism only works for
streams
– sets of related
messages
• SOAP header in streams is constant
except for
sequence number (Message ID), time-stamp ..
• One needs two types of new Web Service
Specification
• “
WS-StreamNegotiation
” to define how one can use
WS-Policy to send messages at start of a stream to
define the methodology for treating remaining
messages in stream
Fast Web Service Communication III
•
Then use “WS-StreamNegotiation” to
negotiate stream
in
Tortoise SOAP – ASCII XML over HTTP and TCP –
–
Deposit basic SOAP header through connection – it is part of
context for stream (linking of 2 services)
–
Agree on firewall penetration, reliability mechanism, binary
representation and fast transport protocol
–
Naturally transport UDP plus WS-RM
•
Use “WS-FlexibleRepresentation” to define encoding of a
Fast
transport
(On a different port) with messages just having
“FlexibleRepresentationContextToken”, Sequence Number,
Time stamp if needed
–
RTP packets have essentially this structure
–
Could add stream termination status
•
Can monitor and control with original negotiation stream
CIE: Common Service Information and Metadata
• Consider a collection of services working together
– Workflow tells you how to specify service interaction but more basically there is shared information or context
specifying/controlling collection
• WS-RF and WS-GAF have different approaches to contextualization – supplying a common “context” which at its simplest is a token to represent state
• More generally core shared information includes dynamic service metadata and the equivalent of configuration information.
• One can supports such a common context either as pool of messages or as message-based access to a “database” (Context Service)
• Two services linked by a stream are perhaps simplest example of a collection of services needing context
• Note that there is a tension between storing metadata in messages and
services.
Database SDE SDE Servic e SDE SDE Servic e SDE SDE Servic e SDE SDE Servic e SDE SDE Servic e SDE SDE Servic e SDE SDE Servic e Individual Services Web Service Ports
Grid or Domain Specific Metadata Catalogs System or Federated Registry or Metadata
Catalog
Database1 Database2 Database3
Four Metadata Architectures
Messages
Notification Architecture
• Point-to-Poin
• Or Brokere
• NaradaBrokering will support both
WS-Eventing
and
WS-Notification
as well as Java Message
Service JMS that is Java Notification standard
Service B Service A
Bro
ker Publish
Subscribe
Queues Messages Supports creation and subscription of topics
Service B Service A
Architecture Characteristics
•
Build as Component
Web Services Grids
– Simulation
– Visualization
– GIS
– Database
•
NaradaBrokering
provides
– Fault Tolerance
– Support for High Performance Streams – Basic Dynamic Information Environment – Notification
•
HPSearch
provides
– More flexible information environment with scripting
– Can prototype simple workflow before implementing in BPEL