7 Overview – iPipeline
7.2 The ‘thin’ framework layer
7.2.1 Workflow Desktop
The user interacts with the application via the workflow desktop component. It is here that the user creates their workflows using the domain-specific processes provided by the problem domain developer. Figure 7-2 illustrates the desktop with its key features identified.
Figure 7-2: iPipeline’s workflow desktop component
The desktop can expand infinitely in any direction to accommodate a workflow of
any size or any number of workflows. The single most important part of the workflow
desktop is the toolbox. It is through the toolbox that the user is provided with access
to the process nodes from which they will build a workflow. The contents of the
toolbox will change depending on the specific problem domain being studied. It is
through the toolbox that iPipeline’s ‘thin’ framework layer connects to the ‘thick’
framework layer. The content of the toolbox is defined in the ‘thick’ framework layer.
The toolbox is divided into three sections mirroring the three phases of the
information processing cycle (see Chapter 5). The toolbox divisions, their mapping to
the information processing cycle and a description of each division are given in
Table 7-2.
Toolbox division | Information processing cycle phase | Description
Data Source | Input Phase | Processes encapsulating algorithms that provide data for processing.
Data Manipulation | Processing Phase | Processes encapsulating algorithms that manipulate the data in the pipeline.
Data Visualisation | Output Phase | Processes encapsulating algorithms that visualise the data in the pipeline.
Table 7-2: Divisions of the toolbox and their relationship to the information processing cycle
To construct a workflow the user places nodes from the toolbox onto the desktop surface and connects them together. At a minimum a workflow must be composed of one data source process and one data visualisation process. Such a workflow will simply create a visualisation that displays the data loaded into the pipeline. More commonly one (or more) data manipulation processes will be inserted between the data source and data visualisation processes. The definition of a valid workflow can therefore be given as:
1. The workflow contains a route through its directed graph that starts at a data source process and ends at a data visualisation process.
2. The route may optionally contain any number of data manipulation processes.
3. All processes have correctly configured settings panels.
A valid workflow must have at least one (1) route through its directed graph that meets the conditions above but may have more. Figure 7-2 has two (2) such routes: the minimal case (route 1), where a data source process and a data visualisation process are directly connected, and the optional case (route 2), where a data manipulation process has been inserted between them.
Figure 7-3: Workflow to merge five data files into a single data source.
Chapter 7: Design of iPipeline
Each route can be executed to produce a set of data that is delivered to the data visualisation process at the end of the route. Since Figure 7-2 has two routes through its directed graph it will deliver two sets of data to the iRaster visualisation at the end of the workflow. These will be:
1. The raw data read from a data file by the input file process at the start of the workflow (route 1 in Figure 7-2).
2. A sorted set of data created by the optional data manipulation process (route 2 in Figure 7-2).
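The validity rule and the two routes of Figure 7-2 can be illustrated with a minimal sketch. It is not iPipeline's implementation: the names `Division`, `Process`, `find_routes`, `is_valid` and the `configured` flag are hypothetical, and the workflow is assumed to be held as a dictionary of processes plus a list of directed connections.

```python
from dataclasses import dataclass
from enum import Enum


class Division(Enum):
    """The three toolbox divisions of Table 7-2."""
    DATA_SOURCE = "Data Source"
    DATA_MANIPULATION = "Data Manipulation"
    DATA_VISUALISATION = "Data Visualisation"


@dataclass(frozen=True)
class Process:
    name: str
    division: Division
    configured: bool = True  # hypothetical flag: settings panel is correctly configured


def find_routes(processes, connections):
    """Return every route that starts at a data source and ends at a visualisation."""
    successors = {}
    for upstream, downstream in connections:
        successors.setdefault(upstream, []).append(downstream)

    routes = []

    def walk(path):
        if processes[path[-1]].division is Division.DATA_VISUALISATION:
            routes.append(path)  # a route ends at the first visualisation reached
            return
        for nxt in successors.get(path[-1], []):
            # Skip nodes already on the path (cycle guard) and never re-enter a source.
            if nxt in path or processes[nxt].division is Division.DATA_SOURCE:
                continue
            walk(path + [nxt])

    for name, process in processes.items():
        if process.division is Division.DATA_SOURCE:
            walk([name])
    return routes


def is_valid(processes, connections):
    """Valid: every process is configured and at least one route exists."""
    all_configured = all(p.configured for p in processes.values())
    return all_configured and len(find_routes(processes, connections)) > 0


# The workflow of Figure 7-2, with assumed node names.
processes = {
    "input_file": Process("input_file", Division.DATA_SOURCE),
    "sort": Process("sort", Division.DATA_MANIPULATION),
    "iRaster": Process("iRaster", Division.DATA_VISUALISATION),
}
connections = [("input_file", "iRaster"), ("input_file", "sort"), ("sort", "iRaster")]

routes = find_routes(processes, connections)
for number, route in enumerate(routes, start=1):
    print(f"route {number}: {' -> '.join(route)}")
print("valid:", is_valid(processes, connections))
```

With the connections listed in this order, the sketch reports the direct route first and the route through the sort process second, matching routes 1 and 2 above.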
The power of visual programming becomes apparent when the workflow becomes more complex. Figure 7-2 dealt with a simple case where a single file of data is loaded into the pipeline. Figure 7-3 extends the workflow to load five (5) files of data. This situation might arise where a researcher wishes to combine simultaneous recordings from five different sensors into a single set of data. The problem domain developer has provided a data manipulation process to merge multiple datasets into a single dataset. Modifying the workflow only takes moments. A classical computer program would require changes to the source code and recompilation to achieve the same result. This can realise considerable time savings as operations can be inserted into (or removed from) the workflow without having to re-code or recompile the application. The processes encapsulate computer code that is “written once; re-used anywhere”.
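The idea that processes are written once and then rearranged freely can be sketched by treating a route as an ordered list of reusable steps. This is an illustration only: `merge`, `sort_records` and `run_route` are hypothetical names, merging is assumed to be simple concatenation, and the five recordings are made-up sample data. In iPipeline the rearrangement is done graphically on the desktop rather than in code.

```python
from itertools import chain


def merge(datasets):
    """Combine several datasets into one (assumed here to be simple concatenation)."""
    return list(chain.from_iterable(datasets))


def sort_records(records):
    """Return the records in ascending order without modifying the input."""
    return sorted(records)


def run_route(steps, data):
    """Pass the data through each process on the route in turn."""
    for step in steps:
        data = step(data)
    return data


# Five hypothetical recordings, each a list of (timestamp, sensor) samples.
recordings = [
    [(0.4, "s1"), (0.1, "s1")],
    [(0.3, "s2")],
    [(0.5, "s3"), (0.2, "s3")],
    [(0.7, "s4")],
    [(0.6, "s5")],
]

# Inserting or removing a process changes only the list of steps,
# not the code inside any process.
merged_only = run_route([merge], recordings)
merged_and_sorted = run_route([merge, sort_records], recordings)
```

Here `merged_only` keeps the samples in file order, while `merged_and_sorted` orders them by timestamp; the two results differ only because one step was added to the route.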
Figure 7-4 further extends the Figure 7-3 example by introducing a new process: the export to file process. The workflow in Figure 7-3 loads, merges and sorts five sets of data. This processing takes time, and it might be useful to create and permanently store a single file with this processing already completed. Figure 7-4 shows such a workflow, which produces two files.
Figure 7-4: Workflow to merge five datasets. Two new files are created, the first of raw data and the second after sorting the raw data.
The first file stores the result of combining five sets of raw data. The second file stores the same data after it has been sorted. The benefits to the researcher can be summarised as:
1. The single file of raw data is easier to use and distribute than five files of raw data.
2. The result of merging the raw data files and sorting the result is preserved, allowing these steps to be eliminated in future.
3. The workflow is naturally parallelisable. Writing the merged raw data file to disk can occur at the same time as sorting the raw data, as they appear on different routes through the workflow.
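The third benefit can be illustrated with a small sketch in which the two routes that leave the merge process run concurrently. It makes several assumptions that do not come from iPipeline: the function names `export_to_file` and `sort_records` are hypothetical, JSON is used as a stand-in file format, the merged data is a made-up list of integers, and a thread pool is chosen purely for illustration.

```python
import json
import tempfile
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def export_to_file(records, path):
    """Hypothetical export process: write the records to disk as JSON."""
    path.write_text(json.dumps(records))
    return path


def sort_records(records):
    """Return a sorted copy, leaving the merged raw data untouched."""
    return sorted(records)


merged = [5, 3, 4, 1, 2]  # made-up result of merging five raw datasets

with tempfile.TemporaryDirectory() as tmp:
    out_dir = Path(tmp)
    with ThreadPoolExecutor(max_workers=2) as pool:
        # The two routes share no intermediate results, so both can start at once.
        raw_job = pool.submit(export_to_file, merged, out_dir / "merged_raw.json")
        sort_job = pool.submit(sort_records, merged)
        raw_path = raw_job.result()
        sorted_records = sort_job.result()
    # The second export depends on the sort, so it runs after the sort has finished.
    sorted_path = export_to_file(sorted_records, out_dir / "merged_sorted.json")

    raw_on_disk = json.loads(raw_path.read_text())
    sorted_on_disk = json.loads(sorted_path.read_text())
```

Because `sort_records` works on a copy, the export of the raw data and the sort never interfere with each other, which is the property that makes the two routes safe to run side by side.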
The workflow desktop provides a flexible visualisation of the “program” that the user is creating. At the same time it provides the ability to rapidly reconfigure that program and preserve the results.
7.2.2 Process base classes