The C-Mediator is a component that consumes M-Binding documents and uses them to direct the transformation of data from one format to another via an intermediate OWL representation. This process is broken into three stages: (i) conversion from the source XML format to OWL (conceptual realisation); (ii) modelling of the OWL concept instance; (iii) conversion from OWL to a destination XML format (conceptual serialisation). Stages (i) and (iii) are performed by the Translation Engine, which is implemented using the fxml-T functions defined in Section 6.3. Figure 6.6 shows how these functions are combined to create the Transformation Engine.
The Transformation Engine takes four inputs: a source XML schema, a source XML document, a destination XML schema and an M-Binding in XML format. The xmls->fxml:schema function is used to convert the source and destination XML schemas to fxml:schema structures. The source document is converted to an fxml:td using the xml->fxml:td function (consuming the source schema already converted to an fxml:schema). The M-Binding document is converted to an fxml:binding and then passed with the source fxml:td, source fxml:schema, and destination fxml:schema to the fxml:transform function. Once the document translation has been completed, the output is converted from an fxml:td to an XML document using the fxml:td->xml function.
[Figure 6.6: The Transformation Engine takes four documents as input (Source Schema <XMLS>, M-Binding <XML>, Source Document <XML>, Destination Schema <XMLS>) and produces the Destination Document <XML>, combining xmls->fxml:schema, xml->fxml:binding, xml->fxml:td, fxml:transform and fxml:td->xml.]
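The composition described above can be sketched in a few lines. This is only an illustrative sketch, not the fxml-T implementation: the five fxml-T functions are passed in as plain callables, the Python names (FxmlT, transformation_engine) are hypothetical, and the argument order of fxml:transform is an assumption, since the text only lists what is passed to it.

```python
from typing import Any, Callable, NamedTuple


class FxmlT(NamedTuple):
    """Hypothetical bundle of stand-ins for the five fxml-T functions."""
    xmls_to_schema: Callable[[str], Any]             # xmls->fxml:schema
    xml_to_binding: Callable[[str], Any]             # xml->fxml:binding
    xml_to_td: Callable[[str, Any], Any]             # xml->fxml:td
    transform: Callable[[Any, Any, Any, Any], Any]   # fxml:transform
    td_to_xml: Callable[[Any], str]                  # fxml:td->xml


def transformation_engine(fx: FxmlT, source_schema: str, source_doc: str,
                          dest_schema: str, m_binding: str) -> str:
    """Combine the fxml-T functions in the order described in the text."""
    # Convert both XML schemas to fxml:schema structures.
    src_schema = fx.xmls_to_schema(source_schema)
    dst_schema = fx.xmls_to_schema(dest_schema)
    # Convert the source document, using the already converted source schema.
    src_td = fx.xml_to_td(source_doc, src_schema)
    # Convert the M-Binding document to an fxml:binding.
    binding = fx.xml_to_binding(m_binding)
    # Translate the document (argument order is an assumption).
    dst_td = fx.transform(binding, src_td, src_schema, dst_schema)
    # Serialise the resulting fxml:td back to an XML document.
    return fx.td_to_xml(dst_td)


# Placeholder stubs that only tag their inputs, to show the data flow.
stubs = FxmlT(
    xmls_to_schema=lambda xsd: ("schema", xsd),
    xml_to_binding=lambda xml: ("binding", xml),
    xml_to_td=lambda xml, schema: ("td", xml, schema),
    transform=lambda b, td, s, d: ("td", "transformed", b, td, s, d),
    td_to_xml=lambda td: f"<out>{td[1]}</out>",
)
result = transformation_engine(stubs, "S.xsd", "<doc/>", "D.xsd", "<binding/>")
```

Passing the functions in as parameters keeps the sketch independent of any particular fxml-T implementation; real versions of these functions would replace the stubs.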
After the initial conversion from the source XML format to an OWL concept instance (serialised in XML), the concept instance must be validated against its ontology definition. The C-Mediator uses Jena to perform this stage of the mediation, creating an inference model from the ontology definition and importing the concept instance into it. During this stage, concept hierarchies are calculated and any instances imported are classified. From the perspective of our use case, this means that the output from the DDBJXML service (a DDBJ Sequence Data Record concept) is also classified as an instance of the Sequence Data Record concept. Therefore, input to a service consuming a Sequence Data Record, such as the NCBI-Blast service, is valid.
The C-Mediator and its interaction with our DWSI and the two target Web Services from our use case is illustrated in Figure 6.7. In this diagram, the C-Mediator is shown converting data from DDBJXML format to FASTA format via an instance of the Sequence Data Record concept. We show all the documents necessary for each conversion process (e.g. XML schemas and M-Binding documents) and where they originate (e.g. WSDL definitions, manually specified or automatically generated). To illustrate the mechanics of the C-Mediator, we follow the conversion process in four stages, as they are labelled in Figure 6.7:
1. The Dynamic WSDL Invoker (DWSI) consumes the accession id and invokes the DDBJ service to retrieve a complete sequence data record. The document returned is of type DDBJXML.
2. The DDBJXML sequence data record is converted to an instance of the sequence data record concept using the Translation Engine. The Translation Engine consumes the sequence data record, the XML schema describing it (taken from the DDBJ WSDL definition), a schema describing a valid instance of the sequence data record concept (generated by the OWL-XIS generator), and the realisation M-Binding document. The Translation Engine produces an instance of the sequence data record concept which is imported into the Mediation Knowledge Base (a Jena store).
3. To transform the sequence data record concept instance to FASTA format, the Translation Engine is used again, this time consuming the OWL concept instance (in XML format), the schema describing it (generated by the OWL-XIS generator), the schema describing the output format (from the NCBI-Blast WSDL) and the serialisation M-Binding. The output produced is the sequence data in FASTA format.
4. The DWSI consumes the FASTA formatted sequence data record and uses it as input to the NCBI-Blast service.
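The four stages, together with the classification step that makes the final invocation valid, can be sketched as follows. This is a toy illustration under stated assumptions, not the C-Mediator's implementation: MediationKB is a hypothetical stand-in for the Jena store that only computes the transitive closure of subclass relations, the concept identifiers are illustrative, and the service invoker and the two Translation Engine passes are passed in as placeholder callables.

```python
from typing import Callable, Dict, Set, Tuple


def superclasses(concept: str, subclass_of: Dict[str, Set[str]]) -> Set[str]:
    """Return the concept plus every ancestor reachable via subclass_of."""
    seen: Set[str] = set()
    stack = [concept]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(subclass_of.get(current, ()))
    return seen


class MediationKB:
    """Toy stand-in for the Jena-backed Mediation Knowledge Base."""

    def __init__(self, subclass_of: Dict[str, Set[str]]) -> None:
        self.subclass_of = subclass_of
        self.types: Dict[str, Set[str]] = {}  # instance id -> concepts

    def import_instance(self, instance_id: str, asserted: str) -> None:
        # Classify the instance under its asserted concept and all ancestors.
        self.types[instance_id] = superclasses(asserted, self.subclass_of)

    def is_instance_of(self, instance_id: str, concept: str) -> bool:
        return concept in self.types.get(instance_id, set())


def run_workflow(accession_id: str,
                 invoke: Callable[[str, str], str],
                 realise: Callable[[str], Tuple[str, str, str]],
                 serialise: Callable[[str], str],
                 kb: MediationKB) -> str:
    record = invoke("DDBJ", accession_id)                  # stage 1
    instance_id, concept, instance_xml = realise(record)   # stage 2
    kb.import_instance(instance_id, concept)
    if not kb.is_instance_of(instance_id, "Sequence_Data_Record"):
        raise ValueError("instance is not a Sequence_Data_Record")
    fasta = serialise(instance_xml)                        # stage 3
    return invoke("NCBI-Blast", fasta)                     # stage 4


# Illustrative ontology fragment and placeholder stubs (all hypothetical).
ontology = {"DDBJ_Sequence_Data_Record": {"Sequence_Data_Record"}}
kb = MediationKB(ontology)
result = run_workflow(
    "ACC-1",
    invoke=lambda service, payload: f"{service}({payload})",
    realise=lambda rec: ("record-1", "DDBJ_Sequence_Data_Record",
                         f"<owl>{rec}</owl>"),
    serialise=lambda xml: f">fasta {xml}",
    kb=kb,
)
```

In the C-Mediator itself the classification is delegated to a Jena inference model; the closure computed here only mimics the subsumption reasoning needed for this use case.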
[Figure 6.7 diagram: the workflow input, accession_id [xsd:string], is passed by the Dynamic WSDL Invoker to the DDBJ service (GetEntry), which returns record [DDBJXML] over SOAP / HTTP (1). Within the Configurable Mediator, a Translation Engine consumes the DDBJ XML Schema (from the DDBJ WSDL), the Sequence Data OWL Instance Schema (from the OWL-XIS Generator) and the manually specified DDBJ XML -> Seq-Data-Ont M-Binding, and stores a Sequence_Data_Record instance in the Mediation KB (Jena), which holds the Sequence Data Ontology (2). A second Translation Engine consumes the Sequence Data OWL Instance Schema, the NCBI XML Schema (from the NCBI Blast WSDL) and the manually specified Seq-Data-Ont -> FASTA M-Binding to produce sequence_data [FASTA] (3), which the Dynamic WSDL Invoker passes to the NCBI-Blast service (runAndWaitFor), yielding result [resultType] as the workflow output (4). The concept URI is taken from the semantic annotation.]
Figure 6.7: A detailed view of the Configurable Mediator in the context of