Managing Dynamic
Metadata and Context
Mehmet S. Aktas
Outline
p
Introduction
p
Problem Statement,
Hypothesis, Design Goals
p
Literature Survey
p
Research Issues
p
Milestones
p
Contributions
Context
p
Def:
"Context is any information that can be used to
characterize the situation of an entity, where an entity can be a
person, place, or computational object." Dey A. et al., 1999
p
Context is metadata associated with both services and their
activities
p
Context can be
n independent of any interaction
p static context
§ Examples: type or endpoint of a service, less likely to change
p dynamic context
§ Examples: throughput of a service, likely to change over time
n generated as result of interaction
p information associated with an activity or session
Gaggle of Services
p
Gaggle of Services
n
is a set of actively collaborating managed services put
together for a particular functionality, such as
collaboration, visualization or sensor Grid
n
its services collaborate toward a particular common goal
p Example: emergency preparedness and response
Motivation
p
Current Grid Information Services provide information
describing services independent of their interactions.
p
We need management of all information associated with
services for:
n
correlating activities of widely distributed services
p workflow-style, SOA based applications
n
management of events especially in multimedia collaboration
p distributed session management
Motivation II
p
More reasons for management of Context
n
enabling uniform query capabilities over both dialog and
monolog context information
p "Give me the list of services satisfying QoS requirements C:{a,b,c..} and
participating in sessions S:{x,y,z..}"
n
enabling real-time replay/playback capabilities in
collaboration based sessions
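The example query above combines interaction-independent metadata (QoS) with interaction-dependent metadata (session membership). A minimal sketch of such a uniform query follows. The `ServiceRecord` type, the `find_services` function, and the sample endpoints and thresholds are hypothetical names introduced for illustration; the slides do not define a query API.

```python
from dataclasses import dataclass, field


@dataclass
class ServiceRecord:
    """Hypothetical record combining both kinds of context for one service."""
    endpoint: str
    qos: dict = field(default_factory=dict)      # interaction-independent metadata
    sessions: set = field(default_factory=set)   # interaction-dependent metadata


def find_services(records, min_qos, sessions):
    """Return endpoints that meet every QoS threshold and take part in every session."""
    matches = []
    for rec in records:
        meets_qos = all(rec.qos.get(name, 0) >= threshold
                        for name, threshold in min_qos.items())
        in_sessions = set(sessions) <= rec.sessions
        if meets_qos and in_sessions:
            matches.append(rec.endpoint)
    return matches


records = [
    ServiceRecord("svc-a", {"throughput": 120}, {"x", "y"}),
    ServiceRecord("svc-b", {"throughput": 40}, {"x", "y"}),
    ServiceRecord("svc-c", {"throughput": 200}, {"x"}),
]
result = find_services(records, {"throughput": 100}, {"x", "y"})
```

Here only `svc-a` satisfies both the QoS threshold and the session constraint. A single call answers both parts of the question, which is the point of a uniform interface.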
Application Use Domain
p
Multimedia Collaboration domain:
Global MMCS
n multiple A/V services talk to various collaboration clients and services
n defines a general session collaboration protocol (XGSP)
n XGSP enables different collaboration tools to talk to each other, e.g.
AccessGrid, H.323
n needs a distributed session management system
p
Characteristics of the domain
n widely distributed services
n metadata of events (archival data)
p mostly read-only
Application Use Domain - II
p
Workflow-style distributed application: Geographic Information System Grid
n sensor grid data services generate events when an event of a certain
magnitude occurs
n firing off various codes, filtering, analyzing raw data, generating images,
maps
n needs a distributed context management to correlate workflow activities
p
Characteristics of domain
n any number of widely distributed services can be involved
n conversation metadata
p transient
[Figure: workflow among WMS GUI, WFS, HPSearch, Data Filter, PI Code, Data Filter and the Context Information Service, with numbered steps; data is exchanged via http://..../..../..txt and http://..../..../tmp.xml]

session:
<context xsd:type="ContextType" timeout="100">
  <context-service>http://.../WMS</context-service>
  <activity-list mustUnderstand="true" mustPropagate="true">
    <service>http://.../WMS</service>
    <service>http://.../HPSearch</service>
  </activity-list>
</context>

user profile:
<context xsd:type="ContextType" timeout="100">
  <context-service>http://.../HPSearch</context-service>
  <parent-context>http://../abcdef:012345</parent-context>
  <content> profile information related to WMS </content>
</context>

activity:
<context xsd:type="ContextType" timeout="100">
  <context-service>http://.../HPSearch</context-service>
  <parent-context>http://../abcdef:012345</parent-context>
  <content> shared data for HPSearch activity </content>
  <activity-list mustUnderstand="true" mustPropagate="true">
    <service>http://.../DataFilter1</service>
    <service>http://.../PICode</service>
    <service>http://.../DataFilter2</service>
  </activity-list>
</context>

shared state:
<context xsd:type="ContextType" timeout="100">
  <context-id>http://../abcdef:012345</context-id>
  <context-service>http://.../HPSearch</context-service>
  <content>http://danube.ucs.indiana.edu:8080\x.xml</content>
</context>
SOAP header for Context:
<?xml version="1.0" encoding="UTF-8"?>
<soap:Envelope xmlns:soap="http://www.w3...">
  <soap:Header encodingStyle="WSCTX URL" mustUnderstand="true">
    <context xmlns="ctxt schema" timeout="100">
      <context-id>http..</context-id>
      <context-service> http.. </context-service>
      <context-manager> http.. </context-manager>
      <activity-list mustUnderstand="true" mustPropagate="true">
        <p-service>http://../WMS</p-service>
        <p-service>http://../HPSearch</p-service>
      </activity-list>
    </context>
  </soap:Header>
  ...
What are examples of dynamically generated metadata in a real-life scenario?
• session associated dynamic metadata
• user profile
• activity associated dynamic metadata
• service associated dynamically generated metadata
3,4: WMS starts a session and invokes HPSearch, passing a session id, to run the workflow script for PI Code
5,6,7: HPSearch runs the workflow script and generates an output file in GML format (and PDF format) as the result
8: HPSearch writes the URI of the output file into the Context
9: WMS polls the information from the Context Service
10: WMS retrieves the output file generated by the workflow script and generates a map
<context xsd:type="ContextType" timeout="100">
  <context-service>http://.../HPSearch</context-service>
  <content> HPSearch associated additional data generated
  during execution of workflow. </content>
</context>
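Steps 8 and 9 of the walkthrough (HPSearch writes the output URI into the context, WMS polls the Context Service for it) can be sketched as follows. This is an in-memory stand-in under stated assumptions, not the actual service: the `ContextService` class, its method names, the `session-1` identifier, the example URI, and the reading of `timeout` as seconds are all hypothetical.

```python
import time


class ContextService:
    """Hypothetical in-memory stand-in for the Context Information Service."""

    def __init__(self):
        self._contexts = {}

    def create(self, context_id, timeout):
        # Assumption: timeout is in seconds; the slides do not state the unit.
        self._contexts[context_id] = {
            "content": None,
            "expires": time.monotonic() + timeout,
        }

    def write(self, context_id, content):
        self._contexts[context_id]["content"] = content

    def read(self, context_id):
        ctx = self._contexts.get(context_id)
        if ctx is None or time.monotonic() > ctx["expires"]:
            return None  # unknown or expired context
        return ctx["content"]


def poll(service, context_id, attempts=5, delay=0.01):
    """Poll until content appears or the attempts run out (step 9)."""
    for _ in range(attempts):
        content = service.read(context_id)
        if content is not None:
            return content
        time.sleep(delay)
    return None


service = ContextService()
service.create("session-1", timeout=100)          # steps 3,4: WMS starts a session
before = poll(service, "session-1", attempts=2)   # nothing has been written yet
service.write("session-1", "http://example.org/output.gml")  # step 8
after = poll(service, "session-1")                # step 9: WMS obtains the URI
```

Polling keeps WMS and HPSearch decoupled: neither needs the other's endpoint to exchange the result location, only the shared context identifier.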
Problem Statement
What is a novel process of building Information
Services, maintaining dynamic session-related
Hypothesis
p A fault-tolerant, high performance, scalable information system
n maintaining widely distributed dynamically generated metadata for
Gaggle of Services
n providing uniform interface to context information
p utilization of existing Grid Information Services for
interaction-independent context to improve search capabilities
n enabling coordination of widely distributed services in Gaggles
p workflow-style Grid applications
n enabling distributed event management and various capabilities for
A/V conferencing applications
p discovery of entities in a session
Architectural Design Goals
p
Key design goals
n
scalability
p with respect to #
p widely distributed services
n
performance
p high responsiveness, reduced access latency
n
fault tolerance
p high availability of information
p robust to replica crashes
n
flexibility
Literature Survey
p
Main Stream Grid Information Services
n
MDS, R-GMA, UDDI (Grimoires)
p
Specifications for stateful service interactions
n
WS-CAF, WSRF, WS-Metadata Exchange
Systems compared: Grimoires UDDI Extension (myGrid); R-GMA (European Data Grid); MDS4 (GT4)
Functionality: registry and discovery of services and workflows; performance monitoring, information monitoring and discovery
Distribution, Organizational: centralized; decentralized, hierarchical, peer-to-peer; decentralized, hierarchical
Provided data: application-oriented; application-oriented, resource-oriented; application-oriented, resource-oriented, stateful interaction data
Components: registry; registry, producers; aggregator services, information sources
Limitations in Grid Information
Services
p
Lack of support for session related dynamic metadata
n MDS4 adopts the WSRF approach, which does not scale when managing
activities of multiple services sharing the same state
p
Lack of support for advanced query capabilities
n ex: “Give me list of WFS services participating “fault displacement
WS-CA
WS-Context - Key Concepts
p
WS Composite Application Framework (WS-CAF)
n WS-Context, WS-Coordination, WS-Transaction Management
p
WS-Context
n defines context, context service and mapping on SOAP
n shared data to correlate service activities
n context information dependent on the type of the activity
p transactional activity: the URI of the coordinator in a session
n context service maintains associated context
n participants of an activity register with context service for lifecycle of
Web Services Resource Framework
Key Concepts
p
defines standard interfaces and behaviors for distributed
system integration
n standard XML-based information model
n standard interfaces for push and pull mode access to service data
p
enables every service to expose state data for query, update
n monitoring shared state
p
models resource state as private to a service
p
supports resource oriented approach for stateful interactions
n requires the identity of the resource to be passed in the SOAP
WS-Metadata Exchange
Key Concepts
p
WS Metadata is key to interactions
n WS-Policy: capabilities, requirements, general characteristics of
services
n WSDL: describes message operations, supported network protocols
used by services
p
WS-Metadata Exchange
n provides mechanism for sharing information about the capabilities of
individual Web services
n allows querying a WS endpoint to retrieve the metadata needed to
interact with it
Limitations in
Specifications for Service
Communication
p
WSRF does not actually accomplish state management by just
enabling access and update rights
n heterogeneous service environment
n workflow-style applications
p
WSRF and WS-Metadata Exchange model service metadata as
private to a service
n does not scale in managing activities of multiple services
n WS-Metadata Exchange defines only how to access
interaction-independent metadata
p
WS-Context is promising, but it has limitations
n simple framework for context management
n limited query capability
TupleSpaces Paradigm
p
a communication paradigm
n space-based asynchronous communication
n first described in Linda project in 1982 at Yale
n pioneered by David Gelernter
p
Linda is a coordination language using primitive
operations on shared data in shared space
p data-centric coordination model
p
communication units are tuples
n data-structure consisting of one or more typed fields
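The tuple-based coordination described above can be illustrated with a minimal single-process sketch. The method names follow Linda's classic `out`, `rd`, and `in` primitives (`in_` here, because `in` is a Python keyword). Several simplifications are assumptions of this sketch: the operations return `None` instead of blocking, matching is by value with `None` as a wildcard rather than by field type, and the tuple contents are invented for illustration.

```python
class TupleSpace:
    """Minimal single-process tuple space; None in a template is a wildcard."""

    def __init__(self):
        self._tuples = []

    def out(self, tup):
        """Write a tuple into the shared space."""
        self._tuples.append(tuple(tup))

    @staticmethod
    def _matches(template, tup):
        # A template matches a tuple of the same length whose fields are equal
        # wherever the template field is not the wildcard.
        return len(template) == len(tup) and all(
            t is None or t == v for t, v in zip(template, tup))

    def rd(self, template):
        """Return the first matching tuple without removing it, or None."""
        for tup in self._tuples:
            if self._matches(template, tup):
                return tup
        return None

    def in_(self, template):
        """Remove and return the first matching tuple, or None."""
        tup = self.rd(template)
        if tup is not None:
            self._tuples.remove(tup)
        return tup


space = TupleSpace()
space.out(("context", "session-1", "http://example.org/output.gml"))
peeked = space.rd(("context", "session-1", None))   # read, tuple stays in the space
taken = space.in_(("context", None, None))          # read and remove
empty = space.rd(("context", None, None))           # nothing left to match
```

Writers and readers never address each other directly; they only agree on the shape of the tuple, which is what makes the model data-centric.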
JavaSpaces [Sun Microsystems]
p
JavaSpaces is an object oriented
n strongly influenced by Linda model
n Java based, platform independent
p
spaces are transactionally secure
n mutually exclusive access to objects
p
spaces are persistent
n temporal, spatial uncoupling
p
spaces are associative
n content based search
p
limitations
n centralized
n inefficient reading/writing performance
Research Issues
p
Recap on key design goals:
n
scalability, performance, fault tolerance
p
research issues related to replicating dynamic metadata
n
deployment (dynamic vs. static replication)
p Where to place replicas of given context metadata?
p What properties must a new location meet?
p How to know if a replica location is stable?
Research Issues II
n
consistency
p What is the appropriate consistency model?
p How, and in what direction, do replicas exchange updates?
p How can we utilize an ordering capability based on NTP (Network
Time Protocol) to provide consistency on the replicated context metadata?
p
performance
n
efficient metadata access
p How to choose a replica server to best serve a client request?
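One possible answer to the NTP ordering question above is to timestamp every update and apply updates in timestamp order, so that all replicas converge on the same state. The sketch below assumes last-writer-wins semantics, NTP-synchronized clocks, and a replica identifier as tie-breaker. These are illustrative choices, not the consistency model the slides commit to, and the `Update` type and sample values are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Update:
    timestamp: float   # assumed to come from an NTP-synchronized clock
    replica_id: str    # tie-breaker when two timestamps are equal
    key: str
    value: str


def apply_updates(updates):
    """Apply updates in (timestamp, replica_id) order; the latest write per key wins."""
    state = {}
    for u in sorted(updates, key=lambda u: (u.timestamp, u.replica_id)):
        state[u.key] = u.value
    return state


# Updates may reach a replica in any order; sorting restores a single global order.
received_out_of_order = [
    Update(12.0, "r2", "ctx-1", "uri-v2"),
    Update(10.0, "r1", "ctx-1", "uri-v1"),
    Update(12.0, "r1", "ctx-2", "profile-a"),
]
state = apply_updates(received_out_of_order)
```

Because the sort key is deterministic, two replicas that receive the same set of updates in different orders end with identical state. The scheme is only as accurate as the clock synchronization, which is why the question of how far NTP can be relied on remains open.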
Research Issues III
p
scalability
n
load balancing strategies
p How to manage load balancing?
p
other research issues
n
replay/playback capabilities
p How to enable real-time replay/playback capabilities?
n
session recovery
p How to enable session recovery?
n
uniform interface to context
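For the replay/playback question above, one simple approach is an append-only, timestamp-ordered event log from which a time window can be replayed. The sketch below is an assumption-laden illustration: `EventLog`, its methods, and the event strings are hypothetical, and a real-time player would additionally sleep for the gap between consecutive offsets.

```python
import bisect


class EventLog:
    """Hypothetical log of (timestamp, event) pairs for one session."""

    def __init__(self):
        self._timestamps = []
        self._events = []

    def record(self, timestamp, event):
        # Insert in timestamp order so late-arriving events still replay correctly.
        i = bisect.bisect_right(self._timestamps, timestamp)
        self._timestamps.insert(i, timestamp)
        self._events.insert(i, event)

    def replay(self, start=float("-inf"), end=float("inf")):
        """Yield (offset, event) for events in [start, end].

        Offsets are relative to the first replayed event.
        """
        lo = bisect.bisect_left(self._timestamps, start)
        hi = bisect.bisect_right(self._timestamps, end)
        if lo == hi:
            return
        base = self._timestamps[lo]
        for ts, ev in zip(self._timestamps[lo:hi], self._events[lo:hi]):
            yield ts - base, ev


log = EventLog()
log.record(5.0, "join:alice")
log.record(9.0, "leave:alice")
log.record(7.0, "mute:alice")      # arrives late, is still ordered correctly
replayed = list(log.replay(start=6.0))
```

The same log can serve session recovery: replaying from the beginning reconstructs the session state that existed before a failure.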
Milestones
p
Implementation of TupleSpaces paradigm
p
Uniform Update and Query (search, discovery)
Services
p
Sequencer Service
n
ensures that an order is imposed on actions/events that
Milestones II
p
Storage (Replication) Service
n decide # and placement of replicas
n enable autonomous behavior
n support robust behavior for replica crashes
p
Access (Request Distribution) Service
n distribute requests among object replicas
p
Expeditor Service
n generalized caching mechanism
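The slides do not specify how the Expeditor Service caches context. As one plausible reading, the sketch below shows a cache whose entries expire after a per-entry timeout, mirroring the `timeout` attribute that contexts carry. `TTLCache`, its methods, and the injectable clock are hypothetical names introduced for illustration.

```python
import time


class TTLCache:
    """Hypothetical cache whose entries expire after a per-entry timeout."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock      # injectable so expiry can be tested without sleeping
        self._entries = {}

    def put(self, key, value, timeout):
        self._entries[key] = (value, self._clock() + timeout)

    def get(self, key):
        entry = self._entries.get(key)
        if entry is None:
            return None
        value, expires = entry
        if self._clock() >= expires:
            del self._entries[key]   # drop stale entries lazily on access
            return None
        return value


now = [0.0]                          # fake clock used only for this example
cache = TTLCache(clock=lambda: now[0])
cache.put("ctx-1", "http://example.org/output.gml", timeout=100)
fresh = cache.get("ctx-1")           # still within the timeout
now[0] = 100.0
stale = cache.get("ctx-1")           # timeout reached, entry is dropped
```

Expiring entries bounds how stale a cached context can become, which matters for dynamic metadata that changes over time.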
Evaluation of Hypothesis
p
Qualitative evaluation
n Does the system deliver what it promises in terms of functionality?
p Example test domains: Geographical Information System Grid, Global
MMCS
p How does the system function in case of replica crashes?
p
Quantitative evaluation
n How well does the system deliver what it promises in terms of
performance?
n What are the performance costs and gains that come with
scalability and fault tolerance?
p trade-offs between fault tolerance, scalability and performance
p what limitations do the trade-offs impose on the practical use of my
system?
p what # of replicas is needed for a certain availability?
p what is the cost of fault tolerance?
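The question about the number of replicas needed for a given availability has a standard back-of-the-envelope answer if replica failures are assumed to be independent: with per-replica availability p, the data is reachable unless all n replicas are down, so availability is 1 - (1 - p)^n. The sketch below computes the smallest such n. The independence assumption and the function name `replicas_needed` are illustrative and do not come from the slides.

```python
def replicas_needed(replica_availability, target_availability):
    """Smallest n with 1 - (1 - p)**n >= target, assuming independent replica failures."""
    if not (0 < replica_availability < 1 and 0 < target_availability < 1):
        raise ValueError("availabilities must be strictly between 0 and 1")
    n = 1
    # Add replicas until the probability that at least one is up reaches the target.
    while 1 - (1 - replica_availability) ** n < target_availability:
        n += 1
    return n


n_for_995 = replicas_needed(0.9, 0.995)
n_for_99 = replicas_needed(0.5, 0.99)
```

With replicas that are each available 90% of the time, three replicas are enough for a 99.5% target. Correlated failures (shared network, shared site) would require more, so the result is a lower bound for the quantitative evaluation rather than a measurement.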
Contribution of this Thesis
p
Identifies a novel approach for building Information
Services managing session related context.
p
Identifies a novel approach for providing fault tolerance
and scalability while providing high performance when
managing dynamic metadata
n Identifies a dynamic replication mechanism for widely distributed
Summary
p
This thesis addresses the following problems
n Lack of support in Grid Information Services for context
(session-related dynamic metadata) management to correlate activities in workflow-style applications:
p by providing a novel approach for management of widely distributed,
shared session-related dynamic metadata
n Lack of support in Grid Information Services to provide distributed
session management:
p by providing a distributed event management system enabling session
failure recovery or replay/playback capabilities
n Lack of search capabilities in Grid Information Services:
p by providing uniform search interface to both interaction independent and