The architecture consists of execution engines that are identical in design. Each engine consists of internal modules, each of which provides a distinct set of capabilities. These modules cooperate with each other to support the execution of service-oriented workflows. This section presents the design of the engine’s internal modules and describes their interactions. It is organised as follows: Section 4.2.1 describes the internal modules of the architecture’s execution engine component, and Section 4.2.2 discusses the interactions between these modules.
4.2.1
Description of the Engine’s Internal Modules
Figure 4.2 provides an architectural overview of the engine, its internal modules, and the interactions between these modules, the end user (e.g. a scientist), and the execution environment. This section provides a concise description of each of these modules, which include:
1. Compiler module.
2. Partitioner module.
3. Analyser module.
4. Deployer module.
5. Monitor module.
6. Executor module.
7. Datastore module.

[Figure 4.2: diagram of the Engine Hosting Machine showing the Compiler, Partitioner, Analyser, Deployer, Monitor, Executor, Datastore, and Knowledge elements, together with the Scientist, the Service-oriented Environment, and two Interface elements; the interactions are numbered (1) to (11).]

4.2 Engine Design 135
136 System Architecture Design
Compiler Module
The compiler module is built from a set of procedures matching the production rules of the Orchestra language. It ensures the correctness of the workflow specification and constructs a directed acyclic graph that represents the workflow: the vertices are computational tasks (e.g. service invocations), and the edges represent the data dependencies between these tasks. This executable data structure can be traversed, analysed, and decomposed into smaller parts that may be distributed to remote engines for execution.
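The graph structure described above can be illustrated with a minimal Python sketch. The class name `WorkflowGraph`, its methods, and the task names are hypothetical and are not taken from the engine's implementation; the traversal uses Kahn's algorithm, which is one common way to order the tasks of a directed acyclic graph, not necessarily the one used by the compiler.

```python
from collections import defaultdict, deque


class WorkflowGraph:
    """Hypothetical DAG: vertices are tasks, edges are data dependencies."""

    def __init__(self):
        self.tasks = set()
        self.successors = defaultdict(set)    # task -> tasks consuming its output
        self.predecessors = defaultdict(set)  # task -> tasks whose output it needs

    def add_dependency(self, producer, consumer):
        """Record that `consumer` needs data produced by `producer`."""
        self.tasks.update((producer, consumer))
        self.successors[producer].add(consumer)
        self.predecessors[consumer].add(producer)

    def topological_order(self):
        """Return the tasks in an order that respects every dependency.

        Raises ValueError if the graph contains a cycle.
        """
        indegree = {t: len(self.predecessors[t]) for t in self.tasks}
        ready = deque(sorted(t for t, d in indegree.items() if d == 0))
        order = []
        while ready:
            task = ready.popleft()
            order.append(task)
            for nxt in sorted(self.successors[task]):
                indegree[nxt] -= 1
                if indegree[nxt] == 0:
                    ready.append(nxt)
        if len(order) != len(self.tasks):
            raise ValueError("workflow graph contains a cycle")
        return order
```

A traversal of this kind yields an order in which every task appears after the tasks that produce its inputs, and the cycle check rejects graphs that are not acyclic.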
Partitioner Module
This module is responsible for decomposing the workflow into smaller partitions (e.g. sub workflows) and modifying the overall structure of the workflow or its partitions as necessary. It relies on the analyser module to partition the workflow and produce a plan for the deployment of the overall workflow onto execution engines.
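As an illustration only, the following Python sketch decomposes a workflow by grouping tasks according to the engine chosen for each of them. Grouping by engine is an assumption made for this example; the function name `partition_by_engine` and the task-to-engine mapping `placement` are hypothetical.

```python
from collections import defaultdict


def partition_by_engine(edges, placement):
    """Group tasks into partitions, one per engine (assumed strategy).

    edges:     iterable of (producer, consumer) data dependencies
    placement: dict mapping each task name to an engine name
    Returns the partitions and the edges that cross partition boundaries,
    i.e. the dependencies that require data to move between engines.
    """
    partitions = defaultdict(set)
    for task, engine in placement.items():
        partitions[engine].add(task)
    cross_edges = [(u, v) for u, v in edges if placement[u] != placement[v]]
    return dict(partitions), cross_edges
```

The cross-partition edges are the points at which intermediate data would have to be forwarded from one engine to another.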
Analyser Module
This module performs placement analysis using information collected from the execution environment to determine candidate engines for executing the workflow partitions. It may cooperate with the partitioner module to restructure the overall workflow by decomposing it into smaller partitions as necessary.
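A minimal sketch of one possible placement heuristic is given below. It assumes that each candidate engine is scored by an estimated transfer time (latency plus payload size divided by bandwidth) and that the engine with the lowest estimate is selected. This cost model, the `LinkMetrics` type, and the function names are assumptions for illustration, not the analysis performed by the analyser module.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LinkMetrics:
    latency_s: float      # latency between engine and service, in seconds
    bandwidth_bps: float  # bandwidth between engine and service, in bytes/second


def transfer_time(metrics, payload_bytes):
    """Estimated time to move `payload_bytes` over the link (assumed model)."""
    return metrics.latency_s + payload_bytes / metrics.bandwidth_bps


def choose_engine(candidate_links, payload_bytes):
    """Pick the candidate engine with the lowest estimated transfer time.

    candidate_links: dict mapping engine name -> LinkMetrics
    """
    if not candidate_links:
        raise ValueError("no candidate engines")
    return min(
        candidate_links,
        key=lambda engine: transfer_time(candidate_links[engine], payload_bytes),
    )
```

Under this model a low-latency engine is preferred for small payloads, whereas a high-bandwidth engine is preferred once the payload is large enough for transfer time to dominate.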
Deployer Module
Based on the placement analysis outcome, this module generates a deployment plan for transmitting the workflow partitions to remote engines and triggering their execution.
Monitor Module
This module monitors network resources such as the remote engines and services involved in the workflow. It collects QoS metrics that relate to these resources, such as network latency and bandwidth. These metrics are typically stored in a knowledge base module, and can be used in placement analysis to assist in decisions that relate to the partitioning of the workflow and its deployment.
Executor Module
This module is responsible for invoking services, collecting the invocations’ results, and forwarding these results to remote engines as necessary.
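The executor's three responsibilities can be sketched as follows. Services are modelled as plain Python callables and forwarding as a callback; the `Task` type, the `run_partition` function, and the `send` callback are hypothetical names introduced for this example and do not describe the engine's actual interfaces.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Task:
    name: str                         # name under which the result is stored
    service: str                      # key of the service to invoke
    inputs: tuple = ()                # names of the results used as arguments
    forward_to: Optional[str] = None  # remote engine that needs the result


def run_partition(tasks, services, initial_inputs, send):
    """Invoke each task's service in order, collect and forward results.

    tasks:          list of Task, already in dependency order
    services:       dict mapping service name -> callable
    initial_inputs: dict of values available before execution starts
    send:           callable(engine, name, value) used for forwarding
    """
    results = dict(initial_inputs)
    for task in tasks:
        args = [results[name] for name in task.inputs]
        results[task.name] = services[task.service](*args)  # invoke and collect
        if task.forward_to is not None:
            send(task.forward_to, task.name, results[task.name])  # forward
    return results
```

Only results that a remote engine depends on are forwarded; all other results remain local to the engine that produced them.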
Datastore Module
This module maintains information and data relevant to the workflows that are being executed, or have been executed, by the engine. The implementation of this module is described in Section 5.4.7.
4.2.2
Interactions of the Engine’s Internal Modules
The engine’s modules cooperate to compile a workflow specification, partition the workflow into smaller sub workflows, perform placement analysis to determine the most appropriate network locations at which the partitions may be executed, and generate a plan to deploy the workflow partitions onto multiple engines for execution in a monitored service-oriented environment. Based on the illustration provided in Figure 4.2, the interactions between these modules are enumerated as follows:
1. The compiler accepts a workflow specification as input.
3. The partitioner instructs the monitor to collect information from the environment to assist in the workflow partitioning process.
4. The monitor gathers QoS metrics relating only to the network latency and bandwidth between the services and the engines participating in the workflow. These metrics are discussed in Section 4.3.3 of this chapter.
5. The information gathered by the monitor is passed to the analyser, which attempts to determine the most appropriate network locations at which to execute the partitions.
6. The placement analysis results are passed to the partitioner, which relies on these results to restructure the workflow as necessary and generate a deployment plan of the workflow partitions.
7. The partitioner passes the deployment plan to the deployer component.
8. The deployer transmits workflow partitions to remote engines for execution.
9. The compiler analyses a workflow partition specification and generates its dataflow graph, which is then passed to the executor component.
10. The executor invokes services and communicates with remote engines.
11. The executor reads and writes data from and to a persistent datastore.
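The deployment part of this sequence (steps 1 and 3 to 8) can be sketched as a chain of calls. Each module is represented by a callable supplied by the caller; the function and parameter names are hypothetical, the hand-over of the compiled graph to the partitioner is an assumption, and the plan is assumed to map each engine to the partition it receives.

```python
def deploy_workflow(spec, compile_spec, collect_metrics, analyse, partition, transmit):
    """Run the assumed deployment sequence and return the deployment plan.

    Every argument after `spec` is a callable standing in for one module.
    """
    graph = compile_spec(spec)            # step 1: compiler accepts the specification
    metrics = collect_metrics(graph)      # steps 3-4: monitor gathers QoS metrics
    placement = analyse(graph, metrics)   # step 5: analyser determines locations
    plan = partition(graph, placement)    # step 6: partitioner builds the plan
    for engine, part in plan.items():     # steps 7-8: deployer transmits partitions
        transmit(engine, part)
    return plan
```

Passing the modules in as callables keeps the sketch independent of any concrete implementation and makes the order of the interactions explicit.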