Implementing Geographical Information System Services for SERVOGrid
Marlon Pierce
SERVOGrid Components
n
Component (“portlet”)-based portals.
•
OGCE mentioned by Chris Hill
n
Web Services for “execution grid” services
•
Ant-based job specification
• File transfer
•
Distributed session management (“context”).
n
Geographic Information System (GIS) services for
“data grid” services.
•
Web Map Service
• Web Feature Service
•
GIS-compatible information services.
n
Support for streaming, real-time data.
n
Distributed service management/orchestration
Guiding Principles
n
Grids are composed of families of services
•
Data, execution, information, …
n
Use “WS-I+” approach to building service families.
•
Build Grids out of Web Service standards conservatively.
n WS-Interoperability is the starting point.
•
See position paper
http://grids.ucs.indiana.edu/ptliupages/publications/WebServiceGrids.pdf
n
SOAP and WSDL provide a universal messaging
framework and a service definition language.
•
All services should communicate with the same message format.
• Message delivery is left as an exercise.
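As an illustration of "the same message format" for every service, here is a minimal sketch that wraps a request in a SOAP 1.2 envelope using only the Python standard library. The envelope namespace is the one defined by SOAP 1.2; the application namespace `urn:example:servogrid`, the operation name `runJob`, and its parameters are hypothetical and do not come from the SERVOGrid services.

```python
import xml.etree.ElementTree as ET

# SOAP 1.2 envelope namespace (W3C SOAP 1.2 recommendation).
SOAP_NS = "http://www.w3.org/2003/05/soap-envelope"
# Hypothetical application namespace, used only for this illustration.
APP_NS = "urn:example:servogrid"


def build_envelope(operation: str, params: dict) -> bytes:
    """Wrap an operation name and its parameters in a SOAP 1.2 envelope."""
    ET.register_namespace("env", SOAP_NS)
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    ET.SubElement(envelope, f"{{{SOAP_NS}}}Header")
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    request = ET.SubElement(body, f"{{{APP_NS}}}{operation}")
    for name, value in params.items():
        ET.SubElement(request, f"{{{APP_NS}}}{name}").text = str(value)
    return ET.tostring(envelope, encoding="utf-8")


# Hypothetical request: operation and parameter names are made up.
message = build_envelope("runJob", {"code": "PI", "region": "California"})
```

The function only builds the message. How it is delivered (HTTP, a messaging broker, something else) is deliberately left outside the sketch.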
Pattern Informatics (PI)
n
PI is a technique developed at University of California,
Davis for analyzing earthquake seismic records to
forecast regions with high future seismic activity.
•
They have correctly forecast the locations of 15 of the last 16
earthquakes with magnitude > 5.0 in California.
n
See Tiampo, K. F., Rundle, J. B., McGinnis, S. A., &
Klein, W. Pattern dynamics and forecast methods in
seismically active regions. Pure Appl. Geophys. 159, 2429-2467 (2002).
•
http://citebase.eprints.org/cgi-bin/fulltext?format=application/pdf&identifier=oai%3AarXiv.org%3Acond-mat%2F0102032
n
PI is being applied to other regions of the world, and John
Rundle has gotten a lot of press.
Pattern Informatics in a Grid Environment
n
PI in a Grid environment:
• Hotspot forecasts are made using publicly available seismic records.
n Southern California Earthquake Data Center
n Advanced National Seismic System (ANSS) catalogs
• Code location is unimportant, can be a service through remote execution
• Results need to be stored, shared, modified
• Grid/Web Services can provide these capabilities
n
Problems:
• How do we provide programming interfaces (not just user interfaces) to the
above catalogs?
• How do we connect remote data sources directly to the PI code?
• How do we automate this for the entire planet?
n
Solutions:
• Use GIS services to provide the input data, plot the output data
n Web Feature Service for data archives
n Web Map Service for generating maps
• Use HPSearch tool to tie together and manage the distributed data sources and
GIS Behind the Scenes
n
The web features are served up by a Web Feature Service.
n
Web Map Service aggregates maps
• NASA OnEarth + our own renderings.
n
We re-implement Open Geospatial Consortium standards using
Web Service Standards.
• SOAP messages, WSDL service definitions.
• Will allow us to separate messages from HTTP transport layer in future.
n
More WMS Info:
• http://grids.ucs.indiana.edu/ptliupages/publications/acm-gis-sayar.pdf.
• http://grids.ucs.indiana.edu/ptliupages/publications/Geoinformatics05_asayar.pdf.
n
More WFS Info:
• http://grids.ucs.indiana.edu/ptliupages/publications/gwpap243.pdf
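For readers unfamiliar with the Web Map Service, the sketch below builds a conventional OGC WMS 1.1.1 `GetMap` request as an HTTP key-value query. It shows the standard OGC form, not the SOAP/WSDL re-implementation described above. The endpoint `http://example.org/wms`, the layer names, and the bounding box are assumptions chosen for illustration.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def getmap_url(endpoint, layers, bbox, width, height,
               srs="EPSG:4326", image_format="image/png"):
    """Build a WMS 1.1.1 GetMap URL; bbox is (minx, miny, maxx, maxy)."""
    params = {
        "SERVICE": "WMS",
        "VERSION": "1.1.1",
        "REQUEST": "GetMap",
        "LAYERS": ",".join(layers),
        "STYLES": "",  # empty value requests the default style
        "SRS": srs,
        "BBOX": ",".join(str(v) for v in bbox),
        "WIDTH": width,
        "HEIGHT": height,
        "FORMAT": image_format,
    }
    return endpoint + "?" + urlencode(params)


# Hypothetical endpoint and layer names.
url = getmap_url("http://example.org/wms", ["hotspots", "faults"],
                 (-125.0, 32.0, -114.0, 42.0), 800, 600)

# Parse the query back to check what a server would receive.
query = parse_qs(urlsplit(url).query, keep_blank_values=True)
```

An aggregating WMS of the kind described above would issue requests like this to several upstream map servers and overlay the returned images.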
Tying It All Together: HPSearch
n
HPSearch is an engine for orchestrating distributed Web
Service interactions
•
It uses an event system and supports both file transfers and
data streams.
•
Legacy name
n
HPSearch flows can be scripted with JavaScript
•
HPSearch engine binds the flow to a particular set of remote
services and executes the script.
n
HPSearch engines are Web Services; they can be distributed
and interoperate for load balancing.
•
Boss/Worker model
n
ProxyWebService: a wrapper class that adds notification
and streaming support to a Web Service.
[Diagram: PI data flow orchestrated by HPSearch]
Components (host):
• WFS (Gridfarm001)
• Data Filter (Danube)
• PI Code Runner (Danube)
• GML (Danube)
• WMS
• WS Context (Tambora)
• HPSearch (TRex), HPSearch (Danube)
Steps: Accumulate Data, Run PI Code, Create Graph, Convert RAW -> GML
Legend: actual data flow, virtual data flow
Notes:
• HPSearch hosts an AXIS service for remote deployment of scripts.
• NaradaBroker network: used by HPSearch engines as well as for data transfer.
• HPSearch controls the Web Services.
• Final output is pulled by the WMS.
• HPSearch engines communicate using the NB messaging infrastructure.
• Data can be stored in and retrieved from the 3rd-party repository (Context Service).
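The idea of binding a scripted flow to concrete services and then executing it can be sketched as follows. This is a toy, in-process stand-in written in Python, not the HPSearch API (whose flows are scripted in JavaScript). The class `FlowEngine`, the step names, and the three local functions that play the role of remote Web Services are all hypothetical.

```python
from typing import Callable, Dict, List

Service = Callable[[object], object]


class FlowEngine:
    """Toy orchestration engine: binds step names to services, then runs them in order."""

    def __init__(self) -> None:
        self._bindings: Dict[str, Service] = {}

    def bind(self, step: str, service: Service) -> None:
        self._bindings[step] = service

    def run(self, flow: List[str], data: object) -> object:
        # The output of each step becomes the input of the next one.
        for step in flow:
            if step not in self._bindings:
                raise KeyError(f"no service bound for step {step!r}")
            data = self._bindings[step](data)
        return data


# Hypothetical local functions standing in for remote Web Services.
def data_filter(records):
    return [r for r in records if r["magnitude"] >= 3.0]


def pi_code_runner(records):
    # Placeholder: counts events per grid cell instead of running the real PI code.
    counts = {}
    for r in records:
        counts[r["cell"]] = counts.get(r["cell"], 0) + 1
    return counts


def to_gml_like(counts):
    # Placeholder serialisation; a real converter would emit GML.
    return [f"<cell id='{c}' count='{n}'/>" for c, n in sorted(counts.items())]


engine = FlowEngine()
engine.bind("filter", data_filter)
engine.bind("run_pi", pi_code_runner)
engine.bind("convert", to_gml_like)

catalog = [
    {"cell": "A1", "magnitude": 3.2},
    {"cell": "A1", "magnitude": 2.1},
    {"cell": "B2", "magnitude": 4.0},
    {"cell": "A1", "magnitude": 5.1},
]
result = engine.run(["filter", "run_pi", "convert"], catalog)
```

Keeping the flow (a list of step names) separate from the bindings is what lets the same script run against a different set of hosts.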
RDAHMM: GPS Time Series Segmentation
Slide Courtesy of Robert Granat, JPL
n
Complex data with subtle signals is difficult for humans
to analyze, leading to gaps in analysis
n
HMM segmentation provides an automatic way to focus
attention on the most interesting parts of the time series
GPS displacement (3D), length two years.
Divided automatically by HMM into 7 classes.
Features:
• Dip due to aquifer drainage (days 120-250)
• Hector Mine earthquake (day 626)
• Noisy period at
Towards Real-Time RDAHMM
n
A real-time version of RDAHMM could
potentially be used to detect state change
events in live data from a GPS station.
n
SCIGN maintains 125+ GPS stations, so trivially
parallel RDAHMM clones can monitor state
changes in the entire network.
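Because each station is analysed independently, the monitoring step parallelises trivially. The sketch below assumes each station already has a daily sequence of class labels (for example, from an HMM segmenter) and counts how many stations change state on each day. The station names and label sequences are made up, and the thread pool merely illustrates the independence of the per-station work.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Dict, List


def state_change_days(labels: List[int]) -> List[int]:
    """Return the day indices at which the class label differs from the previous day."""
    return [day for day in range(1, len(labels)) if labels[day] != labels[day - 1]]


def network_changes(stations: Dict[str, List[int]]) -> Dict[int, int]:
    """Count, for each day, how many stations changed state on that day."""
    # Each station is processed independently, so the work maps cleanly onto a pool.
    with ThreadPoolExecutor() as pool:
        per_station = list(pool.map(state_change_days, stations.values()))
    counts: Dict[int, int] = {}
    for days in per_station:
        for day in days:
            counts[day] = counts.get(day, 0) + 1
    return counts


# Hypothetical station names and label sequences.
stations = {
    "STN1": [0, 0, 1, 1, 1],
    "STN2": [2, 2, 3, 3, 2],
    "STN3": [0, 0, 0, 0, 0],
}
changes = network_changes(stations)
```

In this toy input two stations change state on day 2 and one on day 4, so `changes` maps day 2 to a count of 2 and day 4 to a count of 1.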
NaradaBrokering: Message Transport for Distributed Services
n
NB is a distributed messaging software system.
• http://www.naradabrokering.org
n
NB system virtualizes transport links between components.
• Supports TCP/IP, parallel TCP/IP, UDP, SSL.
n
See e.g.
More Information
n
Contact:
[email protected]
n
GIS Work at CGL: www.crisisgrid.org
•
Software, demos, publications
•
Several recent manuscript submissions are/will be
posted soon.
n
HPSearch at CGL: www.hpsearch.org
n
SERVOGrid Web Sites
•
Our fine parent project
•
http://servo.jpl.nasa.gov/
Acknowledgements
n
Geoffrey Fox, Community Grids Lab
director.
n
Shrideep Pallickara: NaradaBrokering
design/development lead
n
Grad Students: Ahmet Sayar, Galip Aydin,
SERVO Apps and Their Data
n GeoFEST: Three-dimensional viscoelastic finite element model for calculating
nodal displacements and tractions. Allows for realistic fault geometry and characteristics, material properties, and body forces.
• Relies upon fault models with geometric and material properties.
n Virtual California: Program to simulate interactions between vertical strike-slip
faults using an elastic layer over a viscoelastic half-space.
• Relies upon fault and fault friction models.
n Pattern Informatics: Calculates regions of enhanced probability for future
seismic activity based on the seismic record of the region
• Uses seismic data archives
n RDAHMM: Time series analysis program based on Hidden Markov Modeling.
Produces feature vectors and probabilities for transitioning from one class to another.
• Used to analyze GPS and seismic catalog archives.
• Can be adapted to detect state change events in real time.
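To make the Hidden Markov Model idea concrete, here is a minimal sketch of the decoding step only: given already-known class means, spreads, and transition probabilities, the Viterbi algorithm assigns each sample of a 1-D time series to its most likely class. This is a generic textbook routine, not RDAHMM itself, which also has to fit the model. All numerical values below are invented for illustration.

```python
import math
from typing import List, Sequence


def gaussian_logpdf(x: float, mean: float, std: float) -> float:
    """Log density of a 1-D normal distribution."""
    return -0.5 * math.log(2 * math.pi * std * std) - (x - mean) ** 2 / (2 * std * std)


def viterbi(obs: Sequence[float], means: Sequence[float], stds: Sequence[float],
            log_trans: Sequence[Sequence[float]], log_start: Sequence[float]) -> List[int]:
    """Most likely state sequence for observations with Gaussian emissions."""
    if not obs:
        raise ValueError("obs must not be empty")
    n_states = len(means)
    score = [log_start[s] + gaussian_logpdf(obs[0], means[s], stds[s])
             for s in range(n_states)]
    back = []  # back[t][s] = best predecessor of state s at time t + 1
    for x in obs[1:]:
        prev, score, pointers = score, [], []
        for s in range(n_states):
            best = max(range(n_states), key=lambda p: prev[p] + log_trans[p][s])
            pointers.append(best)
            score.append(prev[best] + log_trans[best][s]
                         + gaussian_logpdf(x, means[s], stds[s]))
        back.append(pointers)
    # Trace the best path backwards from the best final state.
    state = max(range(n_states), key=lambda s: score[s])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path


# Invented two-class example: a quiet level near 0 and a displaced level near 5.
series = [0.1, -0.2, 0.0, 5.1, 4.9, 5.2]
sticky = [[math.log(0.95), math.log(0.05)],
          [math.log(0.05), math.log(0.95)]]
classes = viterbi(series, means=[0.0, 5.0], stds=[1.0, 1.0],
                  log_trans=sticky, log_start=[math.log(0.5), math.log(0.5)])
```

The change of class between the third and fourth samples is the kind of state change event that a real-time variant would report.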
Some SERVOGrid
Problems with Conventional Web Services
n
Transport: HTTP Request/Response is a poor
choice for non-trivial data transport.
•
Much better to stream out data without knowing the
content-length.
n
Representation: ASCII XML is inefficient in obvious
and not so obvious ways.
•
For example, WS security depends upon
canonicalization to make reproducible message digests.
n
Efficiency and performance are not just high
performance computing problems.
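The canonicalization point can be illustrated with a short sketch: two XML documents that carry the same information but differ in attribute order and quoting hash differently as raw text, and identically once canonicalized. The example uses `xml.etree.ElementTree.canonicalize` (C14N 2.0, Python 3.8+) as a stand-in for whichever canonicalization a given WS security stack mandates. The documents themselves are invented.

```python
import hashlib
import xml.etree.ElementTree as ET


def digest(xml_text: str, canonical: bool) -> str:
    """SHA-256 digest of the text, optionally after XML canonicalization."""
    data = ET.canonicalize(xml_text) if canonical else xml_text
    return hashlib.sha256(data.encode("utf-8")).hexdigest()


# Same information, different surface form (attribute order, quotes, spacing).
doc_a = '<job name="pi" region="california"><input>catalog</input></job>'
doc_b = "<job region='california'   name='pi'><input>catalog</input></job>"

raw_match = digest(doc_a, canonical=False) == digest(doc_b, canonical=False)
canonical_match = digest(doc_a, canonical=True) == digest(doc_b, canonical=True)
```

Here `raw_match` is false and `canonical_match` is true. The price of that reproducibility is an extra parse-and-serialize pass over every signed message, which is one of the less obvious costs of the text representation.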
NaradaBrokering and Web Services
n
SOAP 1.2 defines a message routing model across distributed
SOAP nodes.
•
Naturally maps to an NB implementation.
•
This has just been released from www.naradabrokering.org
n
NB also has support for WS-Eventing and
WS-ReliableMessaging.
n
More generally, we argue for the use of software messaging
substrates to provide/implement desirable “quality of
service” features
•
Transport, routing/addressing, reliability, security, discovery, etc.
•Specific service capabilities (like “run job”, “move file”, “query data”)
Efficient XML Representation
n
The XML Infoset provides an abstract data model.
•
SOAP 1.2 is defined using the Infoset.
n
This separates XML from “angle bracket notation”
restrictions.
•
Infoset-compliant binary representations are possible.
•
No loss of data, so you can translate between binary and ASCII
representations.
n
Current lab research investigates hand-held applications.
•
See
http://grids.ucs.indiana.edu/ptliupages/publications/OptSOAP_CTS05
n
But easily extensible to high performance transport
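A lossless translation between text and binary forms can be sketched with a toy encoding. The length-prefixed format below is invented for this illustration and covers only elements, attributes, and character data (no comments, processing instructions, or namespace declarations), so it is far from a complete Infoset encoding. It is not the representation used in the lab's research.

```python
import struct
import xml.etree.ElementTree as ET


def _put(text) -> bytes:
    """Length-prefixed UTF-8 string; None is stored as the empty string."""
    raw = (text or "").encode("utf-8")
    return struct.pack(">I", len(raw)) + raw


def encode(elem: ET.Element) -> bytes:
    """Serialize an element tree into the toy binary form."""
    out = [_put(elem.tag), struct.pack(">I", len(elem.attrib))]
    for key, value in elem.attrib.items():
        out += [_put(key), _put(value)]
    out += [_put(elem.text), _put(elem.tail), struct.pack(">I", len(elem))]
    out += [encode(child) for child in elem]
    return b"".join(out)


def _get(buf: bytes, pos: int):
    (size,) = struct.unpack_from(">I", buf, pos)
    start = pos + 4
    return buf[start:start + size].decode("utf-8"), start + size


def decode(buf: bytes, pos: int = 0):
    """Rebuild an element from the toy binary form; returns (element, next position)."""
    tag, pos = _get(buf, pos)
    (n_attrs,) = struct.unpack_from(">I", buf, pos)
    pos += 4
    elem = ET.Element(tag)
    for _ in range(n_attrs):
        key, pos = _get(buf, pos)
        value, pos = _get(buf, pos)
        elem.set(key, value)
    text, pos = _get(buf, pos)
    tail, pos = _get(buf, pos)
    elem.text, elem.tail = text or None, tail or None
    (n_children,) = struct.unpack_from(">I", buf, pos)
    pos += 4
    for _ in range(n_children):
        child, pos = decode(buf, pos)
        elem.append(child)
    return elem, pos


# Invented sample document.
doc = '<position station="STN1"><lat>33.5</lat><lon>-117.2</lon></position>'
original = ET.fromstring(doc)
blob = encode(original)
restored, end = decode(blob)
```

Serializing `restored` back to text gives the same document as serializing `original`, which is the "no loss of data" property in miniature. The sketch makes no claim about size or speed.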
RDAHMM: SCIGN GPS Network Analysis
Slide Courtesy of Robert Granat, JPL
n Have found a way to detect regional aseismic signals
n This software is being integrated with the Quakesim web portal
n Scenarios for use with real time streaming data through the web portal are
currently being investigated
Now segment all 127 GPS stations
In blue: Number of stations that change state on a given day
In red: Seismic activity
Days with many state changes often do not
Support for Streaming Data
n
We use NaradaBrokering messaging software to manage data
streams and filters.
• Open source, Java-based software from the Community Grids Lab
• Based on topic-based publication/subscription for delivery of messages
from/to multiple endpoints.
• “Message” can be anything, including SOAP and binary data streams.
• We use this for audio/video collaboration.
• More recently using it to build Web Service messaging substrates
n SOAP 1.2 routing model, WS-Reliability, WS-Eventing
n
NB ensures reliable delivery of events in the case of broker or client
failures and prolonged entity disconnects.
• Also supports replay.
n
Implements high-performance protocols (message transit time of 1
GPS Stations
n
Current implementation provides real-time access to
SOPAC GPS Services
n
As a case study we implemented services to provide real-time
access to GPS position messages collected from several SOPAC
networks.
n
Next step is to couple data assimilation tools (such as RDAHMM) to
real-time streaming GPS data.
n
Next steps
• Programming APIs: currently we assume the subscriber speaks
NaradaBrokering Java APIs (either NB’s native API or Java Messaging Service).
n Need to investigate appropriate Web Service standards and C/C++ bindings.
• SOAP enveloping of the GML message stream.
• A Sensor Collection Service will be implemented to provide metadata
Position Messages
n
SOPAC provides 1-2Hz real-time position
messages from various GPS networks in a
binary format called RYO.
n
Position messages are broadcast
through RTD server ports.
n
We have implemented tools to convert
RYO messages into ASCII text and
Real-Time Access to Position Messages
n
We have a Forwarder tool that connects to an RTD
server port to forward RYO messages to an NB
topic.
n
The RYO-to-ASCII converter tool subscribes to this
topic to collect binary messages and converts
them to ASCII. It then publishes the ASCII
messages to another NB topic.
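The forwarder and converter chain can be sketched with a toy in-memory topic broker. Nothing here uses the NaradaBrokering API: the `Broker` class, the topic names `gps/ryo` and `gps/ascii`, and the binary record layout (a 4-byte station id followed by three doubles) are hypothetical stand-ins, and the real RYO format is not reproduced.

```python
import struct
from collections import defaultdict
from typing import Callable, DefaultDict, Iterable, List


class Broker:
    """Toy in-memory topic broker; a stand-in for a real messaging system."""

    def __init__(self) -> None:
        self._subs: DefaultDict[str, List[Callable]] = defaultdict(list)

    def subscribe(self, topic: str, handler: Callable) -> None:
        self._subs[topic].append(handler)

    def publish(self, topic: str, message) -> None:
        for handler in list(self._subs[topic]):
            handler(message)


# Hypothetical binary layout standing in for RYO: station id + x, y, z.
RECORD = struct.Struct(">4sddd")


def forwarder(broker: Broker, raw_records: Iterable[bytes]) -> None:
    """Publish each binary record, as read from a server port, to the binary topic."""
    for record in raw_records:
        broker.publish("gps/ryo", record)


def attach_converter(broker: Broker) -> None:
    """Subscribe to the binary topic and republish each record as ASCII text."""
    def on_binary(message: bytes) -> None:
        station, x, y, z = RECORD.unpack(message)
        line = f"{station.decode('ascii')} {x:.3f} {y:.3f} {z:.3f}"
        broker.publish("gps/ascii", line)
    broker.subscribe("gps/ryo", on_binary)


broker = Broker()
attach_converter(broker)
received: List[str] = []
broker.subscribe("gps/ascii", received.append)  # any downstream consumer
forwarder(broker, [RECORD.pack(b"STN1", 1.0, 2.5, -3.25)])
```

Because each stage only knows topic names, further filters can be chained by subscribing to one topic and publishing to the next without changing the stages already in place.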
n