XML Metadata Services

(1)


SKG06 http://www.culturegrid.net/SKG2006/

Guilin China

November 3 2006

Mehmet S. Aktas, Sangyoon Oh, Geoffrey C. Fox and Marlon Pierce

Presented by Geoffrey Fox
Computer Science, Informatics, Physics
Pervasive Technology Laboratories
Indiana University, Bloomington IN 47401

(2)


Different Metadata Systems

• There are many WS-* specifications addressing metadata, defined broadly:
    • WS-MetadataExchange
    • WS-RF
    • UDDI
    • WS-ManagementCatalog
    • WS-Context
    • ASAP
    • WBEM
• And many different implementations, from (extended) UDDI through MCAT of the Storage Resource Broker
• And of course representations including RDF and OWL
• Further, there is system metadata (such as UDDI for core services) and metadata catalogs for each application domain, such as WFS (Web Feature Service) for GIS (Geographical Information Systems)
• They have different scope and different QoS trade-offs
    • e.g. Distributed Hash Tables (Chord) to achieve scalability in large-scale networks

(3)

Different Trade-offs

• It has never been clear how a poor lonely service is meant to know where to look up metadata, or whether metadata is meant to be thought of as a database (UDDI, WS-Context) or as the contents of a message (WS-RF, WS-MetadataExchange)
• We identified two very distinct QoS trade-offs:
    1) Large-scale, relatively static metadata, as in a (UDDI) catalog of all the world's services
    2) Small-scale, highly dynamic metadata, as in dynamic workflows for sensor integration and collaboration
        • Fault-tolerance and ability to support dynamic changes with a delay of a few milliseconds
        • But only a modest number of involved services (up to 1000's in a session)

(4)


(5)

WS-Context compliant XML Metadata Services

• We designed and built a WS-Context compliant XML Metadata Service supporting distributed or central paradigms.
• This service supports the extensive metadata requirements of rich interacting systems, such as:
    • correlating activities of widely distributed services, e.g. workflow-style GIS Service Oriented Architectures
    • optimizing Grid/Web Service messaging performance, e.g. mobile computing environments
    • managing dynamic events, especially in multimedia collaboration, e.g. collaboration Grid/Web Service applications
    • providing information to enable session failure recovery

(6)

Context as Service Metadata

• We define all metadata (static, semi-static, dynamic) relevant to a service as "Context".
• Context can be associated with a single service, a session (service activity), or both.
• Context can be independent of any interaction
    • slowly varying, quasi-static context
    • Ex: type or endpoint of a service, less likely to change
• Context can be generated as a result of service interactions
    • dynamic, highly updated context
    • information associated with an activity or session
    • Ex: session-id, URI of the coordinator of a

(7)

Hybrid XML Metadata Services -> WS-Context + extended UDDI

• We combine the functionalities of two services, WS-Context and extended UDDI, in one hybrid service to manage Context (service metadata).
    • WS-Context: controlling a workflow
    • (Extended) UDDI: supporting semantic service discovery
• This approach enables uniform query capabilities on the service metadata catalog.

(8)

Distributed Hybrid WS-Context XML Metadata Services

[Figure: clients connect over HTTP(S) through WSDL interfaces to an Extended UDDI Service and to Hybrid-WSContext Service replicas (Replica Server-1, Replica Server-2, ..., Replica Server-N). Each service is backed by a database accessed through JDBC. The replicas act as publishers and subscribers on a topic-based publish-subscribe messaging system.]

(9)

Key Features

• Publish-Subscribe is exploited to support replicated storage, e.g.
    • Initial storage of context
    • Update to make copies consistent
    • Access context
• Use of a JavaSpaces cache running in memory on each WS-Context node
    • Naturally supports Get Context by name requests
    • Backed up every ~30 milliseconds to a MySQL database
• If a query can be satisfied by the JavaSpaces cache, the query can be satisfied in < 1 ms plus the few milliseconds of

(10)

TupleSpaces-Based Caching Strategies

• TupleSpaces is a communication paradigm
    • asynchronous communication
    • pioneered by David Gelernter
    • first described in the Linda project in 1982 at Yale
    • communication units are tuples
        • a tuple is a data structure consisting of one or more typed fields
• Hybrid WS-Context Service employs/extends TupleSpaces:
    • all accesses are in memory; overhead is negligible (less than 1 msec for inquiries)
    • data sharing: mutually exclusive access to tuples
    • associative lookup: content-based search, appropriate for key-based caching
    • temporal and spatial uncoupling of communicating parties
    • e.g. a tuple: ("context_id", Context). This indicates a tuple with two fields: a) a string, "context_id", and b) a Java object, "Context".
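The tuple operations described on this slide can be illustrated with a small in-memory sketch. This is not the JavaSpaces API and not the authors' implementation; the class `TupleSpace`, its methods `write`, `read` and `take`, and the use of `None` as a wildcard field are all assumptions made for illustration. It shows mutually exclusive access (a lock) and associative, content-based lookup by template.

```python
import threading


class TupleSpace:
    """Hypothetical minimal tuple space: write, read (copy) and take (remove)."""

    def __init__(self):
        self._tuples = []
        self._lock = threading.Lock()  # gives mutually exclusive access to tuples

    def write(self, tup):
        """Store a tuple in the space."""
        with self._lock:
            self._tuples.append(tuple(tup))

    @staticmethod
    def _matches(template, tup):
        # Associative lookup: a None field in the template matches any value.
        return len(template) == len(tup) and all(
            t is None or t == v for t, v in zip(template, tup)
        )

    def read(self, template):
        """Return the first matching tuple without removing it, or None."""
        with self._lock:
            for tup in self._tuples:
                if self._matches(template, tup):
                    return tup
        return None

    def take(self, template):
        """Remove and return the first matching tuple, or None."""
        with self._lock:
            for i, tup in enumerate(self._tuples):
                if self._matches(template, tup):
                    return self._tuples.pop(i)
        return None


# A two-field tuple in the spirit of ("context_id", Context); values are made up.
space = TupleSpace()
space.write(("context_id_42", {"session": "abc"}))
found = space.read(("context_id_42", None))  # key-based retrieval by template
```

Because lookup is by template rather than by address, the writer and the reader never need to know about each other, which is the temporal and spatial uncoupling mentioned above.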

(11)

Managing Context: UDDI vs. WS-Context

Purpose
    UDDI: standard way of publishing and discovering generic Web Service information
    WS-Context: standard way of maintaining distributed session state information
Metadata characteristics
    UDDI: interaction-independent, rarely changing, small size
    WS-Context: interaction-dependent, highly dynamic, small size
Types of typical queries
    UDDI: high degree of complexity in inquiry arguments, to improve the selectivity and increase the precision of the search results
    WS-Context: simplicity in inquiry arguments, mostly key-based retrieval queries; selectivity of queries is one
Scalability
    UDDI: whole Grid; UDDI is a domain-independent service for generic service metadata
    WS-Context: sub-Grids; a modest number of interacting Web Services participating in an activity
Desired features
    better expressive power of service metadata (e.g., RDF-enabled UDDI Registries), up-to-date service entries (e.g., leasing-capable UDDI Registries), domain-specific capabilities (e.g., geospatial query capabilities), persistent storage

(12)


A general performance evaluation

on the most recent implementation

(13)

Prototype Evaluation - I

• Performance Experiment: We investigate the practical usefulness of the system by exploring the following research questions.
    • What is the baseline performance of the hybrid WS-Context Service implementation for given standard operations?
    • What is the effect of the network latency on the baseline performance of the system?

(14)

[Figure: four test configurations. In each, a single-threaded client (1 user, 1000 transactions) calls the server through a WSDL interface.]

Test-1. Dummy Server
Test-2. Hybrid-WSContext inquiry/publication without database access (Hybrid-WSContext Service with Publishing and Querying modules, JDBC Handler, Expeditor)
Test-3. Hybrid-WSContext inquiry/publication with database access
Test-4. Extended UDDI inquiry/publication (Extended UDDI Server Engine)

(15)

The experimental study indicates that the proposed system can provide comparable performance for standard operations with the existing metadata

TESTBED: Cluster node configuration
Processor: Intel® Xeon™ CPU (2.40GHz)
RAM: 2GB total
Network Bandwidth: 900 Mbits/sec [1] (among the cluster nodes)
OS: GNU/Linux (kernel release 2.4.22)
Java Version: Java 2 platform, Standard Edition (1.4.2-beta-b19)
SOAP Engine: Axis 2 (in Tomcat 5.5.8)

[Figure: Round Trip Time Chart for Inquiry Requests. Average response time (msec) per request for Test-1: Dummy service; Test-2: WS-Context inquiry with memory access; Test-3: WS-Context inquiry with database access; Test-4: UDDI inquiry.]

Metadata Services     Avg. latency for inquiries
hybrid WS-Context     8.41 ms
extended UDDI         17.5 ms
JUDDI                 40 ms
UDDI-MT               20.37 ms
JWSD                  18.99 ms

(16)

Prototype Evaluation - II

• Scalability Experiment: We investigate the scalability of the system by finding answers to the following research questions.
    • What is the performance degradation of the system for standard operations under increasing message sizes?
    • What is the performance degradation of the system for standard operations under increasing message rates?

(17)

TEST-1 - Hybrid-WSContext inquiry/publication with increasing message sizes

TEST-2 - Hybrid-WSContext inquiry/publication with increasing message rates (# of messages per

[Figure: test setup. A single-threaded WS-Context client (1 user, 100 transactions) calls the Hybrid FTHPIS-WSContext Service (Publishing and Querying modules, JDBC Handler, Expeditor) through WSDL. For the message-rate test, clients connect over HTTP(S) to thread-pooled Hybrid-WSContext Service instances.]

5 Client distributed to cluster nodes 1 to 5, with each running

(18)

[Figure: average round trip time (milliseconds) versus context payload size (KB), for inquiry and publication. T_inquiry = T(RTT), T_publication = T(RTT).]

The results indicate that the cost of inquiry and publication operations remains the same as the context's payload size increases from 100 Bytes up to 10 KBytes. We also see that the hybrid WS-Context shows better performance than the OGSA-DAI approach, although the latter technology is more powerful.

TESTBED: Cluster node configuration for hybrid WS-Context tests
RAM: 2GB total
Network Bandwidth: 900 Mbits/sec [1] (among the cluster nodes)
OS: GNU/Linux (kernel release 2.4.22)
Java Version: Java 2 platform, Standard Edition (1.4.2-beta-b19)
SOAP Engine: Axis 2 (in Tomcat 5.5.8)

Metadata Services     Avg. latency for inquiries for 64 KByte data retrieval
hybrid WS-Context     14.55 ms
OGSA-DAI WSRF 2.1     232 ms

=> OGSA-DAI results are from http://www.ogsadai.org.uk/documentation/scenarios/performance

(19)

The results indicate that the proposed system can scale up to 940 simultaneous querying clients or 222 simultaneous publishing clients, where each client sends one query per second, for small-size context payloads with 30 milliseconds fault

TESTBED: Cluster node configuration

RAM 2GB total

Network Bandwidth 900 Mbits/sec.[1] (among the cluster nodes)

OS GNU/Linux (kernel release 2.4.22)

Java Version Java 2 platform, Standard Edition (1.4.2-beta-b19)

SOAP Engine Axis 2 (in Tomcat 5.5.8)

[Figure: average round trip time (ms) versus message rate (messages per second).]

(20)

Axis2 Performance on Multicore Machines

[Figure: round trip time (ms) versus messages per second for Grid Farm, Sun Fire (6 cores), Sun Fire (8 cores), HP xw9300, and Dell Intel Xeon. Machine labels in the figure: 2 chips with 2 cores/chip; 2 chips with 1 core/chip; 1 chip with 8 cores/chip; 1 chip with 6 cores/chip; Xeon; Opteron.]

(21)

DISTRIBUTION TEST

We investigate scalability when moving from a centralized server to a distributed one under heavy workloads.

• 5 clients are distributed to cluster nodes 1 to 5, each running 1 to 15 threads firing messages to randomly selected servers.
• Numbered rectangles correspond to an N-node FTHPIS system with various Publish-Subscribe topologies (this does NOT affect performance)
• 5 different FTHPIS systems were tested, with N ranging from 1 to 5 under the same workload.

[Figure: clients connect over HTTP(S) through WSDL to thread-pooled servers; the numbered rectangles show the topologies built from node-1 to node-5.]

(22)


The results indicate that the scalability of metadata store can be increased when moving from a centralized service to a distributed system.

TESTBED: Cluster node configuration
RAM: 2GB total
Network Bandwidth: 900 Mbits/sec [1] (among the cluster nodes)
OS: GNU/Linux (kernel release 2.4.22)
Java Version: Java 2 platform, Standard Edition (1.4.2-beta-b19)
SOAP Engine: Axis 2 (in Tomcat 5.5.8)

[Figure: message rate (msg/second) versus number of nodes (1 to 5).]

Hybrid WS-Context inquiry operation

# of nodes   message rate   mean ± error (ms)   Stdev (ms)
1            940            47.05 ± 0.24        33.52
2            1005           40.76 ± 0.43        38.22
3            1082           38.58 ± 0.45        34.93
4            1148           36.28 ± 0.42        32.24
5            1221           34.13 ± 0.4         30.76
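To make the scaling in the inquiry table easier to read, the message rates can be expressed relative to the single-node case. The sketch below only does that arithmetic on the reported numbers; the helper name `relative_throughput` is an assumption, and message rate is taken to be in messages per second as on the chart axis.

```python
# Message rates reported for the hybrid WS-Context inquiry operation:
# number of nodes -> messages per second.
message_rate = {1: 940, 2: 1005, 3: 1082, 4: 1148, 5: 1221}


def relative_throughput(rates, baseline_nodes=1):
    """Return each rate divided by the rate of the baseline configuration."""
    base = rates[baseline_nodes]
    return {nodes: rate / base for nodes, rate in rates.items()}


speedup = relative_throughput(message_rate)
for nodes in sorted(speedup):
    print(f"{nodes} node(s): {speedup[nodes]:.2f}x the single-node message rate")
```

Read this way, five nodes sustain roughly 1.3 times the single-node message rate, so the gain is real but well below linear.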

The caching algorithm is non-optimal, as it does database access BEFORE Publish-Subscribe. Reversing this choice should lead to throughput

(23)

Prototype Evaluation - III

• Fault Tolerance Experiment: We investigate the empirical cost of having fault-tolerance by finding answers to the following research questions.
    • What is the cost of fault-tolerance in terms of execution time of standard operations on a tight cluster?
    • How does the cost of fault-tolerance change when the replica

(24)

[Figure: two test topologies, each with a client and replica nodes node-1 to node-5; one topology also labels the connections link-1 to link-4.]

Test-1. LAN experiment. All nodes and client are located on a tightly coupled local area network.

Test-2. WAN experiment. Nodes are located on a loosely coupled wide area network.

(25)

FAULT-TOLERANCE EXPERIMENT TEST

Summary of machine configurations

gf6.ucs.indiana.edu
    Location: Bloomington, IN, USA
    Processor: Intel® Xeon™ CPU (2.40GHz)
    RAM: 2GB
    OS: GNU/Linux (kernel release 2.4.22)
    Java Version: Java 2, STE, (1.4.2-beta-b19)
complexity.ucs.indiana.edu
    Location: Indianapolis, IN, USA
    Processor: Sun-Fire-880, sun4u sparc SUNW
    RAM: 16GB
    OS: SunOS 5.9
    Java Version: Java HotSpot(TM) 64-Bit Server VM (1.4.2-01)
lonestar.tacc.utexas.edu
    Location: Austin, TX, USA
    Processor: Intel(R) Xeon(TM) CPU 3.20GHz
    RAM: 4GB
    OS: GNU/Linux (kernel release 2.6.9)
    Java Version: Java 2, STE, (1.4.2-beta-b19)
tg-login.sdsc.teragrid.org
    Location: San Diego, CA, USA
    Processor: GenuineIntel IA-64, Itanium 2, 4 processors
    RAM: 8GB
    OS: GNU/Linux
    Java Version: Java 2, STE, (1.4.2-beta-b19)
vlab2.scs.fsu.edu
    Location: Tallahassee, FL, USA
    Processor: Dual Core AMD Opteron(tm) Processor 270
    RAM: 2GB
    OS: GNU/Linux (kernel release 2.6.16)
    Java Version: Java 2, STE, (1.4.2-beta-b19)

(26)

[Figure: time (msec) versus number of replicas (1 to 5) for four cases:
Test1 - LAN testing case, publication
Test2 - WAN testing case, publication
Test3 - Inquiry operation (request granted locally with memory access)
Test4 - Inquiry operation (request granted locally with database access)]

FAULT-TOLERANCE TEST RESULTS

The results point out the inevitable trade-off between the fault-tolerance (degree of replication or high availability of data) and performance. The lower the level of fault-tolerance, the higher the performance would be for publication operations.

(27)

An Application Case Scenario

and

an application-specific

performance evaluation

(28)

• Handheld Flexible Representation (HHFR) is open source software for fast communication in mobile Web Services. HHFR supports:
    • streaming messages, separation of message contents, and usage of a context store
    • http://www.opengrids.org/hhfr/index.html
• We use the WS-Context service as the context-store for redundant message parts of the SOAP messages.
    • redundant data is static XML fragments encoded in every SOAP message
    • redundant metadata is stored as context associated with the service conversation in place
• The empirical results show that we gain 83% in message size and on average 41% in transit time by using the WS-Context service.
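The idea of moving redundant message parts into a context-store can be sketched as follows. This is an illustration only, not HHFR or WS-Context code: the dictionary `context_store` stands in for the service, the function names `save_context`, `strip_redundant` and `restore` are hypothetical, and the sketch assumes every message in a session carries the same static fields.

```python
# Stand-in for the WS-Context service: session id -> static message parts.
context_store = {}


def save_context(session_id, static_parts):
    """Save the unchanging parts of a session's messages once."""
    context_store[session_id] = dict(static_parts)


def strip_redundant(session_id, message):
    """Drop every field whose value is already held in the context-store."""
    static = context_store[session_id]
    return {k: v for k, v in message.items() if static.get(k) != v}


def restore(session_id, slim_message):
    """Rebuild the full message from the stored context plus the slim message."""
    full = dict(context_store[session_id])
    full.update(slim_message)
    return full


# Hypothetical static fields and payload, chosen only for illustration.
static_parts = {
    "namespace": "http://example.org/sensor/v1",
    "encoding": "UTF-8",
    "endpoint": "http://example.org/services/sensor",
}
save_context("session-1", static_parts)

message = dict(static_parts, payload="reading=21.5")
slim = strip_redundant("session-1", message)
rebuilt = restore("session-1", slim)
```

Only the changing field travels on the wire after the first exchange, which is where a message-size saving of the kind reported above would come from; the actual percentages depend on the real message contents.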

(29)

Optimizing Grid/Web Service Messaging Performance

[Figure: an HHFR Endpoint (Mobile) and an HHFR Endpoint (Conventional) negotiate over SOAP and then exchange a stream of messages in the preferred representation. The Context-Store holds the HHFR scheme, representation, headers, and stream info; context is saved with setContents and retrieved with getContents.]

(30)

Performance with and without Context-store

Message Size              Without Context-store        With Context-store
                          Ave.±error     Stddev        Ave.±error     Stddev
Medium: 513 byte (sec)    2.76±0.034     0.187         1.75±0.040     0.217
Large: 2.61 KB (sec)      5.20±0.158     0.867         2.81±0.098     0.538

• Experiments ran over HHFR
• Optimized messages are exchanged over HHFR after saving redundant/unchanging parts to the Context-store
• Save on average 83% of message size, 41% of transit time

(31)

System Parameters

• T_access: time to access a Context-store (i.e. save a context to, or retrieve a context from, the Context-store) from a mobile client
• T_RTT: round trip time to exchange a message through an HHFR channel
• N: number of simultaneous streams supported by stream summed over ALL mobile clients
• T_wsctx: time to process a setContext operation
• T_axis: time consumed for Axis processing
• T_trans: transmission time through the network

(32)

Context-store: System Parameters

[Figure: a Mobile Client (Endpoint B) and a Service Provider (Endpoint A) exchange messages over the high performance channel of HHFR, with round trip time T_RTT. Access to the Context-store (Information Service) is drawn as a path through the client, the network, Axis, and WS-CTX.]

T_access = T_axis + T_wsctx + T_trans

(33)

Summary of T_axis and T_wsctx measurements

T_access = T_wsctx + T_axis + T_trans

• Data binding overhead at the Web Service Container is the dominant factor in message processing

[Figure: measured time (msec) for T_wsctx.]

(34)

Performance Model and Measurements

• C_hhfr = n t_hhfr + O_a + O_b
• C_soap = n t_soap
• Breakeven point: n_be t_hhfr + O_a + O_b = n_be t_soap
• O_a(WS) is roughly 20 milliseconds

                              Average±error (sec)    Stddev (sec)
Context-store Access (O_a)    4.127±0.042            0.516
Negotiation (O_b)             5.133±0.036            0.825

O_a: overhead for accessing the
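Solving the breakeven equation for n_be gives n_be = (O_a + O_b) / (t_soap - t_hhfr), which only has a positive solution when t_hhfr < t_soap. The sketch below evaluates that expression. O_a and O_b are the measured averages from the table; the per-message times `T_SOAP` and `T_HHFR` are hypothetical values chosen for illustration, since the slides do not give them, and the function name is an assumption.

```python
def breakeven_messages(o_a, o_b, t_soap, t_hhfr):
    """Return n_be such that n_be * t_hhfr + o_a + o_b == n_be * t_soap."""
    if t_soap <= t_hhfr:
        raise ValueError("no breakeven point unless t_hhfr < t_soap")
    return (o_a + o_b) / (t_soap - t_hhfr)


# Measured overheads from the table (seconds).
O_A = 4.127  # Context-store access
O_B = 5.133  # negotiation

# Hypothetical per-message times in seconds; NOT taken from the slides.
T_SOAP = 0.50
T_HHFR = 0.25

n_be = breakeven_messages(O_A, O_B, T_SOAP, T_HHFR)
print(f"HHFR breaks even after about {n_be:.1f} messages per stream")
```

Streams shorter than n_be messages do not recover the fixed negotiation and context-store overhead, so plain SOAP remains cheaper for them under this model.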

(35)

String Concatenation

• Measure the total time to process a stream
• Independent variables
    • Number of messages per stream
    • Size of the message

[Figure: time for finishing message stream (sec) versus number of messages per stream, for HHFR (16 strings per message) and SOAP (16 strings per message).]