Semantic and Streaming Grids

(1)

Semantic and Streaming Grids

Chinese Academy of Sciences Dec 6 2005

Geoffrey Fox

Computer Science, Informatics, Physics
Pervasive Technology Laboratories
Indiana University, Bloomington IN 47401

[email protected]

(2)

Four Data Streaming Application Areas

• Data Assimilation applied to link the data deluge (satellites, sensors, seismometers) in real time to small and large scale parallel simulations
– Use in Earthquake Science
• Department of Defense (and Homeland Security) have built the Global Information Grid with a target architecture NCOW (Network Centric Operations and Warfare)
– They submit no jobs; rather stream data to brokers from which they are filtered and distributed
– Includes their rather dated distributed simulation HLA
• Audio-Video Conferencing implemented with services and Grid messaging
• Hand-held Grid linking PDA/cell-phones to Grids
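The "stream data to brokers from which they are filtered and distributed" pattern can be sketched as a toy in-process broker. This is only an illustration under stated assumptions: the `Broker` class, the `seismometer` topic and the magnitude threshold are hypothetical names chosen here, not part of the Global Information Grid or any system named on the slide.

```python
from collections import defaultdict
from typing import Callable, Dict, List, Tuple

Message = dict
Predicate = Callable[[Message], bool]
Handler = Callable[[Message], None]


class Broker:
    """Toy broker (hypothetical): sources publish to topics, subscribers filter."""

    def __init__(self) -> None:
        self._subs: Dict[str, List[Tuple[Predicate, Handler]]] = defaultdict(list)

    def subscribe(self, topic: str, handler: Handler,
                  keep: Predicate = lambda m: True) -> None:
        # Each subscription carries its own filter predicate.
        self._subs[topic].append((keep, handler))

    def publish(self, topic: str, message: Message) -> int:
        # Deliver to every subscriber whose filter accepts the message.
        delivered = 0
        for keep, handler in self._subs[topic]:
            if keep(message):
                handler(message)
                delivered += 1
        return delivered


broker = Broker()
all_events: List[Message] = []
strong_events: List[Message] = []
broker.subscribe("seismometer", all_events.append)
broker.subscribe("seismometer", strong_events.append,
                 keep=lambda m: m["magnitude"] >= 5.0)

# The sensor submits no job; it just streams readings to the broker.
for magnitude in (2.1, 5.4, 6.7):
    broker.publish("seismometer", {"station": "S1", "magnitude": magnitude})
```

Here the unfiltered subscriber receives all three readings, while the filtered one sees only the two at or above the threshold.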

(3)

Data Deluged Science

• In the past, we worried about data in the form of parallel I/O or MPI-IO, but we didn’t consider it as an enabler of new science and new ways of computing
• Data assimilation was not central to HPCC
• DoE ASCI set up because didn’t want test data!
• Now particle physics will get 100 petabytes from CERN
– Nuclear physics (Jefferson Lab) in same situation
– Use around 30,000 CPU’s simultaneously 24X7
• Weather, climate, solid earth (EarthScope)
• Bioinformatics curated databases (Biocomplexity only 1000’s of data points at present)
• Virtual Observatory and SkyServer in Astronomy
• Environmental Sensor nets

(4)

Information/Knowledge Grids

• Distributed (10’s to 1000’s) of data sources (instruments, file systems, curated databases …)
• Data Deluge: 1 (now) to 100’s petabytes/year (2012)
– Moore’s law for Sensors
• Possible filters assigned dynamically (on-demand)
– Run image processing algorithm on telescope image
– Run Gene sequencing algorithm on compiled data
• Needs decision support front end with “what-if” simulations
• Metadata (provenance) critical to annotate data
• Integrate across experiment as in multi-wavelength astronomy

(5)

[Figure: Grid architecture diagram. Legend: MetaData (MD), Filter Service (FS), Sensor Service (SS), Other Service (OS). Services link a Database, a Portal, Another Grid and Another Service, refining Raw Data → Data → Information → Knowledge → Wisdom → Decisions.]

(6)

Semantic Grid and Services

• Implications of SOA (Service Oriented Architectures) for SG (Semantic Grid)
– Build services to implement SG
• Implications of SG for SOA
– Build metadata rich systems of services using SG
• Services receive data in SOAP messages, manipulate it and produce transformed data as further messages
• Meta-data is carried in SOAP messages
• Meta-data controls processing and transport of SOAP Messages
• Knowledge is created from data by services
• The Grid enhances Web services with semantically rich system and application specific management
• One must exploit and work around the different approaches to meta-data and their manipulation in Web Services

(7)

Structure of SOAP Messages

• SOAP Messages have System information in the header including WS-Policy based meta-data defining processing options
– Processed by Handlers
• Application data and meta-data are in the body (controversies here!)
– Processed by the Service itself
• Some meta-data like WS-RF is logically “only in messages”
• Other meta-data, like that in WS-Context or the SRB, is stored in the logical equivalent of XML databases
• We only need to preserve semantic structure (XML/SOAP Infoset) so transport in fast XML and store in efficient relational databases
[Figure: a SOAP message with header blocks H1 to H4 and a Body; container handlers F1 to F4 and the Service. Labels: Service, Container Handlers, Container Workflow.]
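The header/body split described on this slide can be sketched with the Python standard library. This is a minimal illustration, not a real Web Service container: the SOAP 1.1 envelope namespace is standard, but the `urn:example:meta` namespace, the `Priority` and `Transform` header blocks and all function names are assumptions made up for the example.

```python
import xml.etree.ElementTree as ET

SOAP = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace
META = "urn:example:meta"  # hypothetical namespace for illustrative blocks


def build_envelope(headers: dict, payload: str) -> ET.Element:
    """Put system meta-data in the Header and application data in the Body."""
    env = ET.Element(f"{{{SOAP}}}Envelope")
    header = ET.SubElement(env, f"{{{SOAP}}}Header")
    for name, value in headers.items():
        ET.SubElement(header, f"{{{META}}}{name}").text = value
    body = ET.SubElement(env, f"{{{SOAP}}}Body")
    ET.SubElement(body, f"{{{META}}}Data").text = payload
    return env


def run_handlers(env: ET.Element, handlers: dict) -> dict:
    """Container step: each handler processes the header block it is registered for."""
    context = {}
    for block in env.find(f"{{{SOAP}}}Header"):
        name = block.tag.split("}")[1]  # strip the "{namespace}" prefix
        if name in handlers:
            context[name] = handlers[name](block.text)
    return context


def service(env: ET.Element, context: dict) -> str:
    """Service step: only the body is touched; header results arrive via context."""
    data = env.find(f"{{{SOAP}}}Body/{{{META}}}Data").text
    return data.upper() if context.get("Transform") == "upper" else data


env = build_envelope({"Priority": "3", "Transform": " upper "}, "raw sensor reading")
context = run_handlers(env, {"Priority": int, "Transform": str.strip})
result = service(env, context)
```

The design point is that the service never parses the header itself; the handlers' output controls how the body is processed.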

(8)

What Type of Services are there?

• There are a horde of support services supplying security, collaboration, database access, user interfaces
• The support services are either associated with system or application
– We will study the WS-* and GS-* which implicitly or explicitly define many support services
• There are generalized filter services which are applications that accept messages and produce new messages with some data derived from that in input
– Simulations (including PDE’s and reactive systems), Data-mining, Transformations, Agents and Reasoning are all termed filters here
• There are services like “author ontology”, “parse RDF” or “attach provenance” that directly support Semantic Grid
• But all services and their interactions are bathed in sea of meta-data and so implicitly need and support the Semantic Grid

(9)

It’s a Composite Hierarchical World

• Filters can be a workflow which means they are “just collections of other simpler services”
– One needs meta-data to control the workflow
• Services are programs that accept messages and produce messages
• Grids are a distributed collection of services supporting managed shared resources
– Management requires meta-data
• Grids are distributed systems that accept distributed messages and produce distributed result messages
– Can always talk about Grids and view a service or a workflow as a special case of a Grid
• It just requires meta-data to send a message to a Grid and have it routed to the “correct computer” holding the “requested service”
– Meta-data allows mapping of virtual to real addresses
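The composite view on this slide (a workflow is itself a service, and a Grid routes a message to a service by a virtual name) can be sketched as follows. All names here (`workflow`, `Grid`, `calibrate`, `annotate`, the `to` field) are hypothetical and chosen only for illustration.

```python
from typing import Callable, Dict

Message = Dict[str, object]
Service = Callable[[Message], Message]  # a service: message in, message out


def workflow(*stages: Service) -> Service:
    """A workflow is a collection of simpler services that is itself a service."""
    def run(message: Message) -> Message:
        for stage in stages:
            message = stage(message)
        return message
    return run


class Grid:
    """Routes a message to the service registered under a virtual name."""

    def __init__(self) -> None:
        self._registry: Dict[str, Service] = {}  # meta-data: virtual name -> service

    def register(self, virtual_name: str, service: Service) -> None:
        self._registry[virtual_name] = service

    def send(self, message: Message) -> Message:
        # The "to" field is the meta-data that maps virtual to real address.
        return self._registry[message["to"]](message)


def calibrate(message: Message) -> Message:
    return {**message, "value": message["value"] * 2}


def annotate(message: Message) -> Message:
    return {**message, "provenance": "calibrated"}


grid = Grid()
grid.register("clean-data", workflow(calibrate, annotate))
reply = grid.send({"to": "clean-data", "value": 21})
```

Because `workflow` returns something with the same shape as a single service, workflows can be nested or registered on a Grid without the caller knowing the difference.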

(10)

Semantically Rich Services with a Semantically Rich Distributed Operating Environment

[Figure: Grids of Grids Architecture. Sensor Services, Filter Services, Other Services and MetaData (MD) are linked by SOAP Message Streams to a Database, a Portal, Another Grid and Another Service; Raw Data is refined to Data, Information, Knowledge, Wisdom and Decisions. A truncated annotation reads "is same as outward facing applicatio".]

(11)

GIS Grids and Sensor Grids

• OGC has defined a suite of data structures and services to support Geographical Information Systems and Sensors
• GML (Geography Markup Language) defines specification of geo-referenced data
• SensorML and O&M (Observation and Measurements) define meta-data and data structure for sensors
• Services like Web Map Service, Web Feature Service, Sensor Collection Service define services interfaces to access GIS and sensor information
• Grid workflow links services that are designed to support streaming input and output messages
• We are building Grid (Web) service implementations of these specifications for NASA’s SERVOGrid

(12)

A Screen Shot From the WMS Client

(13)

WMS uses WFS that uses data sources

<gml:featureMember>
  <fault>
    <name> Northridge2 </name>
    <segment> Northridge2 </segment>
    <author> Wald D. J.</author>
    <gml:lineStringProperty>
      <gml:LineString srsName="null">
        <gml:coordinates>
          -118.72,34.243 -118.591,34.176
        </gml:coordinates>
      </gml:LineString>
    </gml:lineStringProperty>
  </fault>
</gml:featureMember>
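A client of the WFS would typically parse such a feature into a name and a list of coordinate pairs. The sketch below uses only the Python standard library. Two assumptions are made explicit: the `xmlns:gml` declaration is added because the slide fragment omits it, and the first longitude is written as -118.72 on the assumption that it is a west longitude like the second point. The function name `parse_fault` is hypothetical.

```python
import xml.etree.ElementTree as ET

GML_NS = {"gml": "http://www.opengis.net/gml"}  # GML namespace (added; not on the slide)

FEATURE = """
<gml:featureMember xmlns:gml="http://www.opengis.net/gml">
  <fault>
    <name> Northridge2 </name>
    <gml:lineStringProperty>
      <gml:LineString srsName="null">
        <gml:coordinates>-118.72,34.243 -118.591,34.176</gml:coordinates>
      </gml:LineString>
    </gml:lineStringProperty>
  </fault>
</gml:featureMember>
"""


def parse_fault(xml_text: str):
    """Return (fault name, [(lon, lat), ...]) from one gml:featureMember."""
    root = ET.fromstring(xml_text)
    fault = root.find("fault")
    name = fault.findtext("name").strip()
    coords_text = fault.findtext(".//gml:coordinates", namespaces=GML_NS)
    # GML 2 style: tuples separated by whitespace, values by commas.
    points = [tuple(float(v) for v in pair.split(","))
              for pair in coords_text.split()]
    return name, points


name, points = parse_fault(FEATURE)
```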

(14)

Electric Power and Natural Gas data from LANL Interdependent Critical Infrastructure Simulations

[Figure: WMS client screenshot with toolbar callouts: Zoom-in, Zoom-out, FeatureInfo mode, Measure distance mode, Clear Distance, Drag and Drop mode, Refresh to initial map.]

(15)

Typical use of Grid Messaging in NASA

[Figure labels: Datamining Grid, Sensor Grid, GIS Grid, Grid Eventing.]

(16)

Typical use of Grid Messaging

[Figure labels: HPSearch (Manages); NaradaBrokering; Sensor Grid; WS-Context (Stores dynamic data); Filter or Datamining; WFS (GIS data); Post before Processing; Post after Processing; Notify; Subscribe; Grid Database; Archives; Web Feature Service; GIS Grid; "Geographica" (truncated).]

(17)

Real Time GPS and Google Maps

Subscribe to live GPS station. Position data from SOPAC is

combined with Google map clients.

Select and zoom to GPS station location, click icons for more information.

(18)

Integrating Archived Web Feature Services

Google maps can be integrated with Web Feature Service Archives to filter and browse seismic records.

(19)

Google Maps as Service accessed from our WMS

(20)

3 XML Databases of Importance

• WS-Context controlling a workflow
• (Extended) UDDI supporting semantic service discovery
• WFS or ASFS (see later) provides application specific data/meta-data repository
• These have different performance, scalability and data unit size requirements
• In our implementation, each is currently “just an Oracle/MySQL” database front ended by filters that convert between XML (GML for WFS) and object-relational Schema
– Example of Semantics (XML) versus representation (SQL) difference
• OGSA-DAI offers Grid interface to databases – we could use it but don’t as we only need to expose WFS and not MySQL to Grid
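The "database front ended by filters" idea can be sketched with an in-memory SQLite database standing in for Oracle/MySQL. This is an assumption-laden illustration: the `fault` table, its two columns and the functions `store_feature` and `load_feature` are invented here and are not the schema or API of the actual WFS implementation.

```python
import sqlite3
import xml.etree.ElementTree as ET


def store_feature(conn: sqlite3.Connection, xml_text: str) -> None:
    """Inbound filter: XML semantics -> relational representation."""
    feature = ET.fromstring(xml_text)
    conn.execute(
        "INSERT INTO fault (name, author) VALUES (?, ?)",
        (feature.findtext("name").strip(), feature.findtext("author").strip()),
    )


def load_feature(conn: sqlite3.Connection, name: str) -> str:
    """Outbound filter: relational row -> XML seen by the rest of the Grid."""
    row = conn.execute(
        "SELECT name, author FROM fault WHERE name = ?", (name,)
    ).fetchone()
    feature = ET.Element("fault")
    ET.SubElement(feature, "name").text = row[0]
    ET.SubElement(feature, "author").text = row[1]
    return ET.tostring(feature, encoding="unicode")


conn = sqlite3.connect(":memory:")  # stand-in for the production database
conn.execute("CREATE TABLE fault (name TEXT PRIMARY KEY, author TEXT)")
store_feature(
    conn, "<fault><name> Northridge2 </name><author> Wald D. J.</author></fault>"
)
xml_out = load_feature(conn, "Northridge2")
```

Callers only ever see XML in and XML out; the SQL representation stays hidden behind the two filters, which is the semantics-versus-representation split the slide describes.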

(21)

Information Management/Processing

• SOAP messages transport information expressed in a semantically rich fashion between sources and services that enhance and transform information so that complete system provides Data → Information → Knowledge transformation
– Semantic Web technologies like RDF and OWL help us have rich expressivity
• We build application specific information management/transformation systems ASIS for each application domain
• One special domain is the system itself where the metadata associated with services, sessions, Grids, messages, streams and workflow is itself managed and supported by an SIIS

(22)

Generalizing a GIS

• Geographical Information Systems (GIS) have been hugely successful in all fields that study the earth and related worlds
– They define Geography Syntax (GML) and ways to store, access, query, manipulate and display geographical features
– In SOA, GIS corresponds to a domain specific XML language and a suite of services for different functions above
• However such a universal information model has not been developed in other areas even though there are many fields in which it appears possible
– BIS Biological Information System
– MIS Military Information System
– IRIS Information Retrieval Information System
– PAIS Physics Analysis Information System
– SIIS Service Infrastructure Information System

(23)

ASIS Application Specific Information System I

• a) Discovery capabilities that are best done using WS-* standards
• b) Domain specific metadata and data including search/store/access interface (cf WFS). Let’s call the generalization ASFS (Application Specific Feature Service)
– Language to express domain specific features (cf GML). Let’s call this ASL (Application Specific Language)
– Tools to manipulate information expressed in language and key data of application (cf coordinate transformations). Let’s call this ASTT (Application Specific Tools and Transformations)
– ASL must support Data sources such as sensors (cf OGC metadata and data sensor standards) and repositories. Sensors need (common across applications) support of streams of data
– Queries need to support archived (find all relevant data in past) and streaming (find all data in future with given properties)
– Note all AS Services behave like Sensors and all sensors are wrapped as services
– Any domain will have “raw data” (binary) and that which has been filtered to ASL. Let’s call the raw form ASBD (Application Specific Binary Data)
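The two query styles named above, archived ("find all relevant data in past") and streaming ("find all data in future with given properties"), can be contrasted in a small sketch. The `FeatureService` class and its method names are hypothetical stand-ins for an ASFS, not an interface defined in the talk.

```python
from typing import Callable, List, Tuple

Feature = dict
Predicate = Callable[[Feature], bool]
Callback = Callable[[Feature], None]


class FeatureService:
    """Hypothetical ASFS stand-in supporting archived and streaming queries."""

    def __init__(self) -> None:
        self._archive: List[Feature] = []
        self._standing: List[Tuple[Predicate, Callback]] = []

    def query_archive(self, predicate: Predicate) -> List[Feature]:
        # Archived query: evaluated once over data already stored.
        return [f for f in self._archive if predicate(f)]

    def query_stream(self, predicate: Predicate, callback: Callback) -> None:
        # Streaming query: a standing predicate applied to future data only.
        self._standing.append((predicate, callback))

    def ingest(self, feature: Feature) -> None:
        self._archive.append(feature)
        for predicate, callback in self._standing:
            if predicate(feature):
                callback(feature)


def is_strong(feature: Feature) -> bool:
    return feature["magnitude"] >= 5.0


asfs = FeatureService()
asfs.ingest({"id": 1, "magnitude": 6.1})
asfs.ingest({"id": 2, "magnitude": 3.0})

past = asfs.query_archive(is_strong)       # matches data already in the past
future: List[Feature] = []
asfs.query_stream(is_strong, future.append)
asfs.ingest({"id": 3, "magnitude": 7.2})   # delivered to the standing query
asfs.ingest({"id": 4, "magnitude": 1.5})   # filtered out
```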

(24)

ASIS Application Specific Information System II

• Let’s call this ASVS (Application Specific Visualization Services) generalizing WMS for GIS
• The ASVS should both visualize information and provide a way of navigating (cf GetFeatureInfo) database (the ASFS)
• The ASVS can itself be federated and presents an ASFS output interface
• d) There should be application service interface for ASIS from which all ASIS services inherit
• e) There will be other user services interfacing to ASIS
• All user and system services will input and output data in ASL using filters to cope with ASBD
[Figure labels: AS Tool (generic); AS “Sensor”; AS Repository; AS Service (user defined); ASVS Display; AS Tool (generic); Messages using ASL; Filter, Transformation, Reasoning, Data-mining, Analysis.]

(25)

[Figure labels: Everything Is a Service or a message/Information Nugget; Military Information Management System; Directly; GS-*; WS-*; ASVS; Filters/ASTT.]

(26)

[Figure labels: MI or Military Information Object: Unit of Managed Information expressed in ASL; OGSA-DAI and Sensor Standards; "Info-" (truncated); WS-Notification; WS-Eventing; ASFS.]

(27)

Two-level Programming I

• The Web Service (Grid) paradigm implicitly assumes a two-level Programming Model
• We make a Service (same as a “distributed object” or “computer program” running on a remote computer) using conventional technologies
– C++ Java or Fortran Monte Carlo module
– Data streaming from a sensor or Satellite
– Specialized (JDBC) database access
• Such services accept and produce data from users files and database
• The Grid is built by coordinating such services assuming we have solved problem of programming the service
[Figure labels: Service, Data.]

(28)

Two-level Programming II

• The Grid is discussing the composition of distributed services with the runtime interfaces to Grid as opposed to UNIX pipes/data streams
• Familiar from use of UNIX Shell, PERL or Python scripts to produce real applications from core programs
• Such interpretative environments are the single processor analog of Grid Programming
• Some projects like GrADS from Rice University are looking at integration between service and composition levels but dominant effort looks at each level separately
[Figure labels: Service1, Service2, Service3, Service4.]
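The single-processor analog mentioned above, a script that chains core programs the way a UNIX pipe does, can be sketched with Python generators. The stage names (`read_sensor`, `despike`, `running_mean`) and the spike limit are hypothetical; in a Grid each stage would be a remote service rather than a local function.

```python
from functools import reduce
from typing import Callable, Iterable, Iterator


def read_sensor(values: Iterable[float]) -> Iterator[float]:
    """Source stage: emits readings one at a time, like a data stream."""
    for value in values:
        yield value


def despike(stream: Iterable[float], limit: float = 100.0) -> Iterator[float]:
    """Filter stage: drops readings whose magnitude exceeds the limit."""
    for value in stream:
        if abs(value) <= limit:
            yield value


def running_mean(stream: Iterable[float]) -> Iterator[float]:
    """Transform stage: emits the mean of all readings seen so far."""
    total = 0.0
    for count, value in enumerate(stream, start=1):
        total += value
        yield total / count


def compose(source: Iterable[float], *stages: Callable) -> Iterable[float]:
    """Level-two 'programming': wire stages together, like `a | b | c`."""
    return reduce(lambda stream, stage: stage(stream), stages, source)


result = list(compose(read_sensor([1.0, 3.0, 500.0, 5.0]), despike, running_mean))
```

Each stage is written once with conventional code (the first level); the `compose` call is the second level that only decides how stages are connected.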

(29)

3 Layer Programming Model

• Level 1 Programming inside services
– Application expressed in Java, Fortran, C++, MPI etc.
• Level 2 Programming choosing services by virtualization
– Application Semantics (Metadata, Ontology), Semantic Grid
• Level 3 Grid Programming composing multiple services
– Service Workflow, Transactions, Mediation, WS-* Infrastructure
• Substantial work in UK e-Science program, international semantic web community
[Figure labels: Web Service 1, WS 2, WS N-1, Web Service N.]

(30)

Consequences of Rule of the Millisecond

• Useful to remember critical time scales
  1) 0.000001 ms – CPU does a calculation
  2a) 0.001 to 0.01 ms – Parallel Computing MPI latency
  2b) 0.001 to 0.01 ms – Overhead of a Method Call
  3) 1 ms – wake-up a thread or process either?
  4) 10 to 1000 ms – Internet delay: Workflow
• So use pointers and the computer memory system when latencies of ≤ 1 millisecond are needed, but use a URI looked up in a context store when longer delays are allowed
• Transfer data when read-only and long latency allowed
• Always choose the slowest allowed methodology and remember when in doubt, Moore’s law favors computer performance and systems always get more complex and harder to maintain.
[Figure label: Classic Programming.]
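The rule of thumb on this slide can be written down as a small decision function. This is one possible reading, not a definitive statement of the rule: the 1 ms threshold comes from the slide, but treating "long latency" as anything above 1 ms, checking the read-only case before the URI case, and the function name `choose_mechanism` are assumptions.

```python
def choose_mechanism(allowed_latency_ms: float, read_only: bool = False) -> str:
    """Pick how to reach data, following the 'rule of the millisecond'."""
    if allowed_latency_ms <= 1.0:
        # Tight budget: stay inside one program's address space.
        return "pointer in local memory"
    if read_only:
        # Assumed ordering: immutable data can simply be copied to the consumer.
        return "transfer a copy of the data"
    # Longer delays allowed: use the slowest acceptable, most flexible option.
    return "URI looked up in a context store"


choices = {
    "method call (0.01 ms)": choose_mechanism(0.01),
    "workflow step (100 ms)": choose_mechanism(100.0),
    "archived image (1000 ms)": choose_mechanism(1000.0, read_only=True),
}
```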

(31)

GlobalMMCS Web Service Architecture

[Figure labels: SIP, H323, AccessGrid, Native XGSP, Admire; Gateways convert to uniform XGSP Messaging; High Performance (RTP) and XML/SOAP and ..; Media Servers; Filters; Session Server; XGSP-based Control; NaradaBrokering: All Messaging; Use Multiple Media servers to scale to many codecs and many versions of audio/video mixing; NB Scales as distributed Web Services.]

(32)

GlobalMMCS Architecture

[Figure labels: Event Messaging Service (NaradaBrokering); XGSP Conference Control Service; Audio Video Web Service; Instant Messaging Web Service; Shared Display Web Service; Shared …. Web Service.]
• Non-WS collaboration control protocols are “gatewayed” to XGSP
• NaradaBrokering supports TCP (chat, control, shared display, PowerPoint etc.) and UDP (Audio-Video conferencing)

(33)

XGSP Example: New Session

<CreateAppSession>
  <ConferenceID> GameRoom </ConferenceID>
  <ApplicationID> chess </ApplicationID>
  <AppSessionID> chess-0 </AppSessionID>
  <AppSession-Creator> John </AppSession-Creator>
  <Private> false </Private>
</CreateAppSession>
<SetAppRole>
  <AppSessionID> chess-0 </AppSessionID>
  <UserID> Bob </UserID>
  <RoleDescription> black </RoleDescription>
</SetAppRole>
<SetAppRole>
  <AppSessionID> chess-0 </AppSessionID>
  <UserID> Jack </UserID>
  <RoleDescription> white </RoleDescription>
</SetAppRole>
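Messages of this shape could be generated programmatically rather than written by hand. The sketch below builds them with the Python standard library; the element names are taken from the slide, while the helper functions `create_app_session` and `set_app_role` are hypothetical and do not belong to any XGSP library.

```python
import xml.etree.ElementTree as ET


def create_app_session(conference: str, application: str, session: str,
                       creator: str, private: bool = False) -> ET.Element:
    """Build a CreateAppSession message with the fields shown on the slide."""
    message = ET.Element("CreateAppSession")
    fields = (
        ("ConferenceID", conference),
        ("ApplicationID", application),
        ("AppSessionID", session),
        ("AppSession-Creator", creator),
        ("Private", str(private).lower()),  # "true" / "false" as on the slide
    )
    for tag, value in fields:
        ET.SubElement(message, tag).text = value
    return message


def set_app_role(session: str, user: str, role: str) -> ET.Element:
    """Build a SetAppRole message assigning a role within a session."""
    message = ET.Element("SetAppRole")
    ET.SubElement(message, "AppSessionID").text = session
    ET.SubElement(message, "UserID").text = user
    ET.SubElement(message, "RoleDescription").text = role
    return message


session_msg = create_app_session("GameRoom", "chess", "chess-0", "John")
role_msgs = [set_app_role("chess-0", "Bob", "black"),
             set_app_role("chess-0", "Jack", "white")]
wire_text = ET.tostring(role_msgs[0], encoding="unicode")
```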

(34)

XGSP AV Signaling Protocol with H.323

[Figure: message sequence diagram involving the H323 Terminal and the H323 Gateway. Message labels: H225.Setup; H225.Connect; JoinAVSession; JoinAVSession OK; Terminal Capability Set and ACK (twice); OpenLogicChannel (Video) and ACK; JoinAVSession (Video) and ACK with video RTPLink <IP Addr, Port>; OpenLogicChannel (Audio) and ACK; JoinAVSession (Audio) and ACK with Audio RTPLink <IP Addr, Port>; "with the RTPLinks <IP Addr, Port> & capability description".]

(35)

NaradaBrokering 2003-2006

• Messaging infrastructure for collaboration, peer-to-peer and Grids
– Implements JMS and native high-performance protocols (message transit time of 1 to 2 ms per hop)
• Order-preserving message transport with QoS and security profiles
• Support for different underlying transport such as TCP, UDP, Multicast, RTP
• SOAP message support and WS-Eventing, WS-RM and WS-Reliability. WS-Notification when specification agreed
• Active replay support: Pause and Replay live streams.
• Stream Linkage: can link permanently multiple streams – used in annotation of real-time video streams
• Replicated storage support for fault tolerance and resiliency to storage failures.
• Management: HPSearch Scripting Interface to streams and brokers (uses WS-Management)
• Broker Topics and Message Discovery: Locate appropriate
• Integration with Axis2 Web Service Container (?)
• High Performance Transport supporting SOAP Infoset
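The "active replay" feature (pause and replay of live streams) can be illustrated with an ordered event log and a per-viewer read position. This is a conceptual sketch only: `ReplayableStream` and `Viewer` are hypothetical classes and do not reflect the NaradaBrokering API.

```python
from typing import List, Optional


class ReplayableStream:
    """Order-preserving log of published events that can be re-read later."""

    def __init__(self) -> None:
        self._log: List[str] = []

    def publish(self, event: str) -> int:
        self._log.append(event)
        return len(self._log) - 1  # sequence number of the stored event

    def replay(self, start: int = 0, stop: Optional[int] = None) -> List[str]:
        return list(self._log[start:stop])


class Viewer:
    """Consumer that can pause a live stream and later catch up."""

    def __init__(self, stream: ReplayableStream) -> None:
        self.stream = stream
        self.position = 0
        self.paused = False

    def poll(self) -> List[str]:
        if self.paused:
            return []  # events keep accumulating in the stream's log
        events = self.stream.replay(self.position)
        self.position += len(events)
        return events


stream = ReplayableStream()
viewer = Viewer(stream)
stream.publish("frame-0")
stream.publish("frame-1")
first = viewer.poll()

viewer.paused = True
stream.publish("frame-2")
stream.publish("frame-3")
while_paused = viewer.poll()

viewer.paused = False
caught_up = viewer.poll()
```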

(36)

Average Video Delays for one broker – Performance scales proportional to number of brokers

[Figure: plot of Latency (ms) against # Receivers, for one session and for multiple sessions, at 30 frames/sec.]

(37)
(38)

GlobalMMCS SWT Client

[Figure: client screenshot with callouts: Chat, TV, Webcam, Video Mixer, GIS.]

(39)

[Figure: e-Annotation screenshot with callouts: Archived Stream List, Real Time Stream List, Archived Stream Player, Real Time Stream Player, Annotated Stream Player, e-Annotation Player, e-Annotation Whiteboard.]

(40)

Location of software for Grid Projects in Community Grids Laboratory

• http://www.naradabrokering.org provides Web service (and JMS) compliant distributed publish-subscribe messaging (software overlay network)
• http://www.globlmmcs.org is a service oriented (Grid) collaboration environment (audio-video conferencing)
• http://www.crisisgrid.org is an OGC (open geospatial consortium) Geographical Information System (GIS) compliant GIS and Sensor Grid (with POLIS center)
• http://www.opengrids.org has WS-Context, Extended UDDI etc.
• The work is still in progress but core part of NaradaBrokering is quite mature
• All software is open source and freely available

(41)

Summary

• Virtualization everywhere
• Focus on semantics not representation to get performance combined with expressivity for transport and data access
• All this enabled by powerful meta-data services
• Grids add management to rich but potentially chaotic set of Web Services; management and coherence enabled by meta-data
• Can define general information architectures (ASIS, GIS, SIIS) for both applications and system
• Knowledge from filters that span simulations, data-mining, reasoning and agents
• A service is just a special case of a Grid
• Build systems from SubGrids (Gridlets)
