Analysis - Dynamic Discovery, Creation and Invocation of Type Adaptors for Web Service Workflow

6.5 Evaluation

6.5.4 Analysis

Hypothesis H1 states that the performance cost of translation should be linear or better for fxml-T to be a scalable implementation. In Section 6.5.1, testing with increasing input document size shows fxml-T to have a linear increase in the cost of translation. Although a quadratic expansion is observed when schema sizes are increased, we find that this overhead emanates from the xml parsing routines used and not from the transformation itself, whose cost is shown to remain linear in Figure 6.12. Hypothesis H2 states that M-Binding composition should ideally come with virtually zero performance cost. In Section 6.5.2, testing of increasingly complex M-Binding composition shows that the inclusion of mappings from other documents does not affect translation performance in a significant way. Finally, to fulfil hypothesis H3 and ensure fxml-T is a practical

[Plot: User CPU Time (Seconds) against File Size (KBytes) for Perl, FXML and Python, each with a quadratic fit]

Figure 6.15: Transformation Performance against a random selection of Sequence Data Records from the DDBJ service

Doc Size (KB)   fxml (s)   perl (s)   python (s)
 2.04           0.30       0.36       0.17
 5.01           0.30       0.34       0.17
11.00           0.35       0.45       0.18
15.96           0.43       0.72       0.22
20.51           0.43       0.68       0.22
30.89           0.61       1.22       0.27
40.20           0.57       0.78       0.20
52.12           0.74       1.30       0.26

m value         0.0088     0.0183     0.0017
difference                 0.0095     -0.0072
percentage                 107.6%     -81.2%

Figure 6.16: A summary of translation performance for bioinformatics data

implementation, we test fxml-T against real bioinformatics data sources. Figure 6.15 illustrates that fxml-T performance is reasonable compared to other xslt implementations.
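The m values in Figure 6.16 are consistent with the slope of a least-squares straight line fitted to each timing column, with the difference and percentage rows expressed relative to the fxml slope. The following sketch recomputes them from the tabulated data. Treating m as a least-squares slope is an assumption inferred from the numbers, not something stated in the text, and the function name is illustrative.

```python
# Timings copied from Figure 6.16 (document size in KB, user CPU time in seconds).
SIZES_KB = [2.04, 5.01, 11.00, 15.96, 20.51, 30.89, 40.20, 52.12]
TIMES = {
    "fxml": [0.30, 0.30, 0.35, 0.43, 0.43, 0.61, 0.57, 0.74],
    "perl": [0.36, 0.34, 0.45, 0.72, 0.68, 1.22, 0.78, 1.30],
    "python": [0.17, 0.17, 0.18, 0.22, 0.22, 0.27, 0.20, 0.26],
}


def least_squares_slope(xs, ys):
    """Slope of the ordinary least-squares line through the points (xs, ys)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


# Assumption: the "m value" row is this slope, in seconds per KB.
slopes = {name: least_squares_slope(SIZES_KB, ys) for name, ys in TIMES.items()}

for name, m in slopes.items():
    diff = m - slopes["fxml"]
    pct = 100.0 * diff / slopes["fxml"]
    print(f"{name}: m = {m:.4f} s/KB, difference = {diff:+.4f}, percentage = {pct:+.1f}%")
```

Under that assumption, the recomputed slopes, differences and percentages agree with the tabulated rows to the precision shown in the table.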

6.6 Conclusions

fxml-T implements our transformation formalisation fully. It supports the translation of documents based on mappings between components within source and destination schemas. To express complex relations between elements, for example the mapping of elements based on other attribute values, fxml-T supports predicate evaluation. When the manipulation of string values is required, regular expressions may be used to extract characters of interest. fxml-T also provides an implementation of the core msl constructs, namely schema components and typed documents. The msl specification [26] does present inference rules that describe the process of document validation, i.e. the notion that an xml document conforms fully to the schema that describes it. However, we have not implemented this feature within fxml-T since third-party schema validators can be used. Validation must be used; otherwise, it is possible to specify transformations that produce invalid documents.
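To make predicate evaluation and regular-expression extraction concrete, the following sketch mimics both using Python's standard library. It is not fxml-M syntax and does not use fxml-T; the element names, the `type` attribute, and the identifier pattern are all hypothetical and chosen only for illustration.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical source document: the meaning of <ref> depends on its "type" attribute.
SOURCE = """
<record>
  <ref type="accession">ID: AB000059 (DDBJ)</ref>
  <ref type="comment">free text</ref>
</record>
"""


def translate(source_xml):
    """Map <ref type="accession"> to <accession>, keeping only the identifier."""
    src = ET.fromstring(source_xml)
    dst = ET.Element("sequence")
    for ref in src.findall("ref"):
        # Predicate: only map elements whose attribute has the expected value.
        if ref.get("type") != "accession":
            continue
        # Regular expression: extract the characters of interest from the string value.
        match = re.search(r"[A-Z]{2}\d{6}", ref.text or "")
        if match:
            ET.SubElement(dst, "accession").text = match.group(0)
    return dst


result = translate(SOURCE)
print(ET.tostring(result, encoding="unicode"))
```

Only the element that satisfies the predicate is mapped, and only the matched substring is carried into the destination document.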

Through evaluation of the fxml-T library against similar xml translation tools, we have shown that our implementation scales well when input document size is increased. While a quadratic expansion in translation time is observed when increasing schema sizes, we find that this increase emanates from the xml parsing subroutines used. The actual cost of translation remains linear with respect to input schema size. Therefore, our overall translation cost would be linear if more efficient xml parsing routines were used. In terms of M-Binding composition, our implementation performs very well: increasing the number of M-Bindings included adds virtually no cost to the overall translation performance.

The languages fxml-M and xslt [34] are obviously closely related because they both cater for xml translation. At a basic level, they provide operators that allow items of text in the source document to be replaced with different text in a destination document. However, fxml-M offers one significant benefit over xslt: fxml-M supports the composition of mappings in a predictable manner. With xml schema, the definition of elements, attributes and types can be imported from an external document. This is a useful feature when combining data from different sources because schema definitions do not need to be rewritten. If this were to occur with two xml schemas that both have an M-Binding to define the translation to another xml representation, the M-Binding definitions may also be

Invocation and Discovery Architecture

The goal of the WS-MED architecture is to provide a generalised set of software components that can be exploited by any technology making use of Web Services standards (such as wsdl and xml schema) to translate xml data between different formats via an intermediate owl representation. In the previous two Chapters, we have focused on the core syntactic mediation technology, namely the mapping language fxml-M (Chapter 5), its corresponding implementation fxml-T (Chapter 6), and the internal workings of the Configurable Mediator (C-Mediator). While these contributions create the necessary infrastructure to support a scalable data translation approach, more software components are required to complete the big picture given in Figure 4.12 (Chapter 4). For example, analysis of current applications shows that the discovery and sharing of Type Adaptors is not well supported: most users create their own library of adaptors and rarely share them with other individuals.

If we are to consider the WS-MED architecture as a generic solution to the workflow harmonisation problem, we must support users and other software components in the sharing and discovery of Type Adaptors, the dynamic invocation of target services, and the generation of canonical xml representations for owl concept instances. A more detailed list of these additional architecture requirements follows:

1. Invocation of target services

In a constantly changing environment where services appear and disappear at any time, services that have not been used before may be discovered to achieve particular goals. Therefore, it is important for an invocation component to cater for the execution of previously unseen services defined using wsdl.

2. Derive a canonical model from the intermediary representation

Since our intermediary-based mediation approach relies on a canonical xml representation of owl concept instances to act as a lingua franca, a mechanism is required to automatically derive such a model. To be compatible with the C-Mediator, an xml schema is required to validate owl concept instances (called an OWL-X IS).

3. Discovery of Type Adaptors

To autonomously find the appropriate Type Adaptor to harmonise the flow of data between two services, a discovery and publishing facility is required that supports the advertising and retrieval of Type Adaptors based on their conversion capabilities. To conform to existing Web Service standards, we base this part of the architecture on a uddi-compliant registry.
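The third requirement can be pictured as a lookup keyed on conversion capability. The sketch below is an in-memory stand-in for such a registry, not the uddi-based design itself; the class, the method names, and the type identifiers are hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class TypeAdaptor:
    """Hypothetical advertisement: an adaptor converts one type into another."""
    name: str
    source_type: str
    target_type: str


class AdaptorRegistry:
    """In-memory stand-in for a registry that indexes adaptors by capability."""

    def __init__(self):
        self._by_capability = defaultdict(list)

    def publish(self, adaptor):
        """Advertise an adaptor under its (source, target) conversion capability."""
        key = (adaptor.source_type, adaptor.target_type)
        self._by_capability[key].append(adaptor)

    def discover(self, source_type, target_type):
        """Return every adaptor advertised for this exact conversion."""
        return list(self._by_capability.get((source_type, target_type), []))


# Hypothetical adaptors and type identifiers.
registry = AdaptorRegistry()
registry.publish(TypeAdaptor("ddbj-to-owl", "ddbj:SequenceRecord", "owl:SequenceDataRecord"))
registry.publish(TypeAdaptor("owl-to-fasta", "owl:SequenceDataRecord", "fasta:Sequence"))

found = registry.discover("ddbj:SequenceRecord", "owl:SequenceDataRecord")
print([adaptor.name for adaptor in found])
```

Indexing by the pair of source and target types keeps retrieval independent of how an adaptor is implemented, which is the property the requirement asks for.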

In this Chapter, we present the WS-HARMONY architecture components that enable the execution of wsdl-specified Web Services; the generation of OWL-X IS (owl instance schemas); and the advertising, sharing, and discovery of Type Adaptors. We test Type Adaptor discovery cost in the context of workflow execution times and show that discovery time is minimal in comparison to the execution of target services. Our Dynamic Web Service Invoker is also tested against another Web Service invocation api (Apache Axis [10]) and is shown to be faster, particularly as the message size increases.

The contribution of this Chapter is a set of architecture components to satisfy the requirements above for automatic workflow harmonisation:

• A Dynamic Web Service Invoker that can efficiently execute previously unseen wsdl-specified Web Services.

• An owl xml instance schema (OWL-X IS) generator that consumes owl

• An approach for the description of Type Adaptor components using wsdl and a component to automatically generate wsdl definitions of M-Binding capability.

• A registration, sharing and discovery mechanism for Type Adaptors using the Grimoires [93] service registry.

This Chapter is organised as follows: Section 7.1 gives an overview of wsdl, how services are typically invoked, why the invocation of previously unseen services is problematic, and how the Dynamic Web Service Invoker overcomes such problems. Section 7.2 discusses the relationship between owl ontologies and xml schema, with an example to show the OWL-X IS generator at work using an algorithm that automatically creates OWL-X IS. In Section 7.3, we concentrate on the description and discovery of Type Adaptors, presenting a uniform description method based on wsdl. Section 7.4 evaluates our Dynamic Web Service Invoker against a leading Web Service invocation api, and shows that the cost of discovering Type Adaptors in WS-HARMONY is insignificant compared to the execution of target services. Finally, we conclude in Section 7.5.

7.1 Dynamic Web Service Invocation

In this Section, we describe the problem faced by software components that are designed to enable the invocation of previously unseen Web Services. After a brief introduction to wsdl, and an explanation of the invocation problem, we present a solution that utilises a standardised xml view for service input and output messages. Finally, our implementation is presented in the form of the Dynamic Web Service Invoker (dwsi).
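As a rough illustration of invoking a service without pre-generated stubs, the sketch below builds a request message at run time from an operation name and a dictionary of parameters. It is not the dwsi implementation; the service namespace, operation, and parameter names are hypothetical, and only the SOAP 1.1 envelope namespace is standard.

```python
import xml.etree.ElementTree as ET

SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace


def build_request(service_ns, operation, params):
    """Build a SOAP 1.1 request envelope for an operation known only at run time."""
    envelope = ET.Element(f"{{{SOAP_ENV}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_ENV}}}Body")
    call = ET.SubElement(body, f"{{{service_ns}}}{operation}")
    for name, value in params.items():
        # Each input part becomes a child element; values are sent as text.
        ET.SubElement(call, name).text = str(value)
    return envelope


# Hypothetical service namespace, operation and parameter.
request = build_request("http://example.org/ddbj", "getEntry", {"accession": "AB000059"})
print(ET.tostring(request, encoding="unicode"))
```

Because the message is assembled as an xml tree rather than through operation-specific classes, the same function serves any operation whose name and input parts are read from a wsdl description at run time.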
