Scripting based architecture for Management of Streams and Services in Real-time Grid Applications
Authors
Harshawardhan Gadgil, Geoffrey Fox, Shrideep Pallickara, Marlon Pierce
Community Grids Lab, Indiana University, Bloomington
Robert Granat
NASA, Jet Propulsion Lab, Pasadena
Presented By
Harshawardhan Gadgil
Talk Outline
• Introduction to HPSearch Architecture
• A quick view of NaradaBrokering middleware
• System Goals
– RDAHMM Example
HPSearch
• A JavaScript-based scripting runtime serving as an administration console
– Currently uses the Rhino (http://www.mozilla.org/rhino) implementation of JavaScript
• Management viewed as
– Setting up the distributed application (involves setting up the broker network and initializing system components)
– Querying run-time system metadata
§ For logging purposes
§ Monitoring metadata to help dynamically rewire the system to improve performance
HPSearch Architecture
Component Summary
• Binds URIs to a scripting language
– Allows us to manage (manipulate) the resource identified by the URI
§ E.g. read / write to files, sockets, topics
§ Read from a database, or fetch data from an FTP/HTTP resource
§ This data can then be streamed to the distributed application, or data can be read from a stream and processed / stored
• “Host objects” allow us to dynamically access the host system
– Useful for constructing objects that monitor system metadata and perform management tasks
– PerfMetrics gathers system performance data and
HPSearch Architecture Diagram
[Figure: HPSearch architecture. Labeled components: Request Handler, JavaScript Shell, Task Scheduler, Flow Handler, Web Service EP, and other objects inside each HPSearch Kernel; URIHandler, DBHandler, WSDLHandler, and WSProxyHandler; WSProxy services; the broker network; and external resources (database, Web service, files, sockets, topics) reached over JDBC, SOAP/HTTP, and network protocols. Figure notes: HPSearch control events use PUB/SUB on a predefined topic; data buffers are sent / received as Narada events.]
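The URI binding from the component summary can be pictured as a scheme-to-handler lookup. The sketch below is a minimal illustration in plain JavaScript, not HPSearch code: `HandlerRegistry`, `memoryHandler`, and the `mem:` scheme are hypothetical names, and an in-memory map stands in for real files, sockets, or topics.

```javascript
// Hypothetical sketch: none of these names come from HPSearch itself.
// A registry maps a URI scheme to a factory that builds a handler
// exposing read() and write() for the resource behind the URI.
class HandlerRegistry {
  constructor() {
    this.handlers = new Map(); // scheme -> factory(uri) => handler
  }

  register(scheme, factory) {
    this.handlers.set(scheme.toLowerCase(), factory);
  }

  // Select a handler from the part of the URI before the first ':'.
  open(uri) {
    const colon = uri.indexOf(":");
    if (colon <= 0) {
      throw new Error("URI has no scheme: " + uri);
    }
    const scheme = uri.slice(0, colon).toLowerCase();
    const factory = this.handlers.get(scheme);
    if (!factory) {
      throw new Error("No handler registered for scheme: " + scheme);
    }
    return factory(uri);
  }
}

// In-memory stand-in for a real resource such as a file or socket.
const memoryStore = new Map();

function memoryHandler(uri) {
  return {
    // Append one item to the queue kept for this URI.
    write(data) {
      const queue = memoryStore.get(uri) || [];
      queue.push(data);
      memoryStore.set(uri, queue);
    },
    // Remove and return the oldest item, or null when nothing is queued.
    read() {
      const queue = memoryStore.get(uri) || [];
      return queue.length > 0 ? queue.shift() : null;
    },
  };
}

const registry = new HandlerRegistry();
registry.register("mem", memoryHandler);

const resource = registry.open("mem://example/buffer");
resource.write("hello");
console.log(resource.read()); // prints "hello"
```

Keeping the lookup keyed on the scheme means a script can treat every resource the same way once it is opened; supporting a new kind of resource only requires registering another factory.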
NaradaBrokering
• NaradaBrokering
– Messaging infrastructure for collaboration, peer-to-peer, and Grid applications
– Implements high-performance protocols (message transit time of 1 to 2 ms per hop)
– Order-preserving, optimized message transport with QoS and security profiles for sent and received messages
– Supports different underlying protocols such as TCP, UDP, multicast, and RTP
HPSearch + NaradaBrokering
Managing Streams
• HPSearch uses NaradaBrokering to route data streams
– Each stream is represented by a topic name
– Components subscribe / publish to the specified topic
§ The WSProxy component automatically maps topics to Input / Output streams
§ Each write(byte[] buffer) and byte[] read() call is mapped to a NaradaBrokering event
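The topic-to-stream mapping can be illustrated with a small in-memory model. This is a sketch under stated assumptions, not the WSProxy implementation and not the NaradaBrokering API: `Broker`, `TopicOutputStream`, `TopicInputStream`, and the topic name `gps/raw` are hypothetical. It shows only the idea that one `write` produces one event and one `read` consumes one event.

```javascript
// Hypothetical in-memory broker; it only imitates topic-based routing
// and does not use the NaradaBrokering API.
class Broker {
  constructor() {
    this.subscribers = new Map(); // topic -> array of callbacks
  }

  subscribe(topic, callback) {
    if (!this.subscribers.has(topic)) {
      this.subscribers.set(topic, []);
    }
    this.subscribers.get(topic).push(callback);
  }

  // Deliver one event to every subscriber of the topic.
  publish(topic, payload) {
    const callbacks = this.subscribers.get(topic) || [];
    for (const callback of callbacks) {
      callback({ topic: topic, payload: payload });
    }
  }
}

// Output side: each write(buffer) becomes exactly one published event.
class TopicOutputStream {
  constructor(broker, topic) {
    this.broker = broker;
    this.topic = topic;
  }

  write(buffer) {
    // Copy the bytes so later changes to the caller's buffer do not
    // alter an event that has already been published.
    this.broker.publish(this.topic, Uint8Array.from(buffer));
  }
}

// Input side: events are queued on arrival; each read() returns the
// payload of one event, or null when no event is waiting.
class TopicInputStream {
  constructor(broker, topic) {
    this.queue = [];
    broker.subscribe(topic, (event) => this.queue.push(event.payload));
  }

  read() {
    return this.queue.length > 0 ? this.queue.shift() : null;
  }
}

const broker = new Broker();
const input = new TopicInputStream(broker, "gps/raw");
const output = new TopicOutputStream(broker, "gps/raw");

output.write([1, 2, 3]);
output.write([4, 5]);
console.log(Array.from(input.read())); // prints [ 1, 2, 3 ]
```

Because the two stream objects share only a topic name, the producer and consumer never hold a reference to each other, which is what lets components be rewired by changing topics.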
System Goals
• Investigate the use of HPSearch as a management console to deploy system and application components using scripting
• Use HPSearch in scientific / grid applications that can use streaming data
– Data filtering is essential in most cases: we do not want to transfer the entire data set when only a small percentage of the data will be used
§ E.g. choosing data that satisfies certain input constraints
– Data re-ordering might be required to format input data to match the requirements of the executable
§ E.g. converting GML to remove XML elements
• Use publish-subscribe methodologies to connect components that process data in a stream
RDAHMM Example
• GPS time series have modes caused by unknown underlying physical processes
• RDAHMM allows us to identify these modes, and the time periods in which each physical process dominated the sequence, without any a priori knowledge of these processes
• Helps to determine
– The actual physical causes
– When the system is entering a new mode (perhaps a hint of some important seismic event)
RDAHMM Example
Application Components
• GPS database (surface displacement time series collected by SCIGN, http://www.scign.org)
• Data filter (filters data, removes unwanted components, and reorders data as required by the actual RDAHMM executable)
• RDAHMM executable (performs the time series analysis); its output is transferred to the graph plotting application
HPSearch Engine
Applications
Streaming Data Filtering
[Figure: data flow from the GPS data sensor source, through the Data Filter (filters the input data to get only the estimate and error values) and RDAHMM (analyzes the data), to the Matlab plotting script, which produces the output graph. HPSearch drives the flow through handlers; each handler controls the operation of one service.]
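The filtering step can be sketched as a function that keeps only two columns of each record. The record layout here is an assumption made for illustration (whitespace-separated station, date, estimate, and error fields); it is not the actual SCIGN format, and `filterRecord` and `filterStream` are hypothetical names rather than part of HPSearch or RDAHMM.

```javascript
// Assumed input layout (hypothetical, not the SCIGN format):
//   "<station> <date> <estimate> <error>", separated by whitespace.
const ESTIMATE_COLUMN = 2;
const ERROR_COLUMN = 3;

// Keep only the estimate and error values of one record.
// Returns null for lines that are blank, too short, or non-numeric.
function filterRecord(line) {
  const fields = line.trim().split(/\s+/);
  if (fields.length <= ERROR_COLUMN) {
    return null;
  }
  const estimate = Number(fields[ESTIMATE_COLUMN]);
  const error = Number(fields[ERROR_COLUMN]);
  if (!Number.isFinite(estimate) || !Number.isFinite(error)) {
    return null;
  }
  return estimate + " " + error;
}

// Apply the filter to a batch of records and drop the rejected lines.
function filterStream(lines) {
  return lines.map(filterRecord).filter((record) => record !== null);
}

const sample = [
  "STN1 2004-01-01 12.5 0.3",
  "malformed line",
  "STN1 2004-01-02 12.7 0.4",
];
console.log(filterStream(sample).join("\n"));
```

Rejecting malformed lines inside the filter keeps the downstream executable from receiving input it cannot parse, at the cost of silently dropping records; a production filter would likely log what it discards.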
Sample Output
Related Work
• Scripting languages have been very popular and successful for Rapid Application Deployment (RAD)
– Sash by IBM allows RAD and provides handlers for various tasks such as reading from databases and LDAP registries, invoking Web Services, and providing a GUI
– Jython and Matlab are popular in many scientific communities, e.g. GeoDISE
– The XCAT project at the IU Extreme Lab uses Jython for scripting to deploy distributed components
– Karajan (Ant-like scripting) is used for deploying applications over the grid
– WSRF::Lite uses Perl to implement the Web Services Resource Framework and host grid services
Future Work
• Investigate system and management scaling with an increasing number of components
• Use of security for streams
• Negotiation of optimal (high-performance) transport
• More handlers for different aspects of NaradaBrokering’s management (broker / topic discovery, security, replay, etc.)
• Investigate interaction with WS Management
To conclude…
• Presented a scripting based architecture for management of data streams and distributed services
• Shown how we can use publish-subscribe methodologies to connect components that process data in a stream
• Create and deploy quick data filtering applications
More Information
• HPSearch Project Website
http://www.hpsearch.org
• NaradaBrokering Project @ IU
http://www.naradabrokering.org
• CGL Publications
Questions / Comments