Application Design and Development - The Helium Prototype

The Helium Prototype

5.2 Application Design and Development

Developing a prototype required a number of decisions about how to build a usable and testable application. The system comprises several components that interact with each other and together provide a usable framework onto which more complex analysis and visualization tools can be built.

The basic requirement was that users should be able to visualize pedigree data loaded either from external files or, preferably, from a suitable Germinate database.

The application design was split into a number of key areas, which were then designed and implemented to give basic core methods onto which more detailed and specific functionality would be added after user feedback and testing sessions.

A user-centred, iterative design process was used in the development of the Helium prototype. Because the author was employed within a plant genetics department and heavily involved in additional projects with European plant breeding companies and academic partners, there was daily contact with plant geneticists and regular contact with commercial plant breeding companies, both by email and at organised project meetings at which this work was routinely presented. Prototypes were also shown at conferences in both the biological and visualization domains. Further feedback was gained from one-to-one sessions with breeders, in which the work being undertaken was shown in greater detail. Additionally, sample datasets were obtained from breeders to test the visualization; these were commercially sensitive and are not presented in this work. This allowed Helium to be tested on data outside the barley datasets detailed in Section 2.6.

The daily interaction, constant dialogue and iterative design process ensured that the features added were directly beneficial to end users and that development stayed in line with the precise feedback from domain experts.

This constant feedback meant that an Agile software development methodology could be used, which facilitated the evolutionary development of the Helium application. It was important that working prototypes were used for demonstration and feedback throughout the development process, to ensure that development was heading in the right direction and was meeting the research requirements of end users. Users also commonly had additional ideas when presented with the prototypes, and this development approach allowed those ideas to be incorporated where appropriate.

Design

The overview provides a high-level view of all the data in the pedigree being examined. In the test dataset this means a layout of all the barley plant lines. The overview can be colour coded for a single parameter, such as the winter/spring genetic divide or DUS characters loaded from Germinate, and node sizing can be enabled to draw emphasis to plant lines that are commonly used in crosses. In addition, emphasis such as a thicker outline around a node can highlight a plant line of particular interest, allowing the user to determine where it sits in relation to the other plant lines in the pedigree and offering a reference point for the other data display panels in the application. A selected node (plant line) becomes the focus for all other displays within the visualization, in that the information contained in those displays relates to the selected node.
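The colour coding and node sizing described above can be illustrated with a minimal sketch. It assumes the pedigree is available as (parent, child) pairs and that each plant line has one categorical attribute; the plant line names, the palette, the size range and the function name node_styles are all hypothetical and are not taken from Helium itself.

```python
from collections import Counter

# Hypothetical pedigree as (parent, child) pairs; names are illustrative only.
EDGES = [
    ("A", "C"), ("B", "C"),
    ("A", "D"), ("C", "D"),
    ("A", "E"), ("D", "E"),
]

# Hypothetical categorical attribute (e.g. seasonal habit) per plant line.
HABIT = {"A": "winter", "B": "spring", "C": "winter", "D": "spring", "E": "spring"}

# Illustrative palette; unknown or missing categories fall back to grey.
PALETTE = {"winter": "#1f77b4", "spring": "#ff7f0e"}
DEFAULT_COLOUR = "#999999"


def node_styles(edges, categories, min_size=10.0, max_size=30.0):
    """Return {node: (size, colour)} for an overview display."""
    nodes = {name for edge in edges for name in edge}
    # Count how often each line appears as a parent, i.e. is used in a cross.
    uses = Counter(parent for parent, _child in edges)
    most_used = max(uses.values(), default=0)
    styles = {}
    for node in sorted(nodes):
        # Linear scale from min_size (never a parent) to max_size (most used).
        fraction = uses[node] / most_used if most_used else 0.0
        size = min_size + fraction * (max_size - min_size)
        colour = PALETTE.get(categories.get(node), DEFAULT_COLOUR)
        styles[node] = (size, colour)
    return styles


styles = node_styles(EDGES, HABIT)
```

Here plant line A is a parent in three crosses and so receives the largest size, while B, used once less than any other parent, and E, never used as a parent, stay near or at the minimum. A linear scale is only one possible choice; a logarithmic scale may suit pedigrees in which a few lines dominate.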

The overview uses nodes and edges to show genetic structure and selectively highlights edges based on the selected nodes to show related plant lines. The amount of detail at this level is severely restricted and is intended only to give a broad overview of the dataset and its size. Details such as plant line names were omitted from this display as they would be too small to read and would therefore serve no useful purpose. DUS characters are selected from a drop-down list, which users can change on the fly.

Detail level 1

This level shows a more detailed layout of the pedigree based on a single plant line selected on the overview display. Moving the overview window will update this display to show the plant lines that are under the highlighting box on the overview visualization. Once a plant line is selected, this display will update to show all of its ancestors or descendants, depending on the options chosen in the overview dialogue. These plant lines will be colour coded using the overview colouring scheme, but there will be an option to subsequently colour this display by other parameters of the dataset, such as phenotypic or genotypic data. All plant lines at this zoom level will be visible, with names clearly displayed within nodes. Edges will also be clearly displayed. This can be regarded as a detail and overview stage.

1. Varietal name or other naming convention.

2. High level overview of some phenotypic or genotypic characteristic.

a. This could be something like resistance to a particular pathogen based on experimental work; or

b. The existence of a particular allele or haplotype at a given genetic locus identified by genotyping or sequencing data.
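The ancestor or descendant selection described for detail level 1 amounts to a reachability query on the pedigree graph. The following is a minimal sketch of that idea, assuming the pedigree is held as (parent, child) pairs; the plant line names and the helper names build_index and reachable are hypothetical and do not reflect how Helium or yFiles implement this.

```python
from collections import defaultdict, deque

# Hypothetical pedigree as (parent, child) pairs; names are illustrative only.
EDGES = [
    ("A", "C"), ("B", "C"),
    ("C", "D"), ("X", "D"),
    ("D", "E"), ("Y", "E"),
]


def build_index(edges):
    """Index the pedigree in both directions for fast lookups."""
    parents = defaultdict(set)
    children = defaultdict(set)
    for parent, child in edges:
        parents[child].add(parent)
        children[parent].add(child)
    return parents, children


def reachable(start, neighbours):
    """Breadth-first search returning every node reachable from start."""
    seen = set()
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in neighbours.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    # Guard against malformed (cyclic) input listing a line as its own relative.
    seen.discard(start)
    return seen


parents, children = build_index(EDGES)
ancestors_of_d = reachable("D", parents)      # follow child -> parent links
descendants_of_c = reachable("C", children)   # follow parent -> child links
```

Passing the parent index yields the ancestors of the selected plant line and passing the child index yields its descendants, so a single traversal routine can serve both display options.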

The attributes of the graph node that are available are as follows: node position, node shape, node size and node colour.

For clarity, the focus was only on node size and node colour, having already established that users preferred round nodes. Node position was determined by the yFiles Sugiyama-type graph layout algorithm.
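Sugiyama-type layouts place the nodes of a directed graph on horizontal layers before ordering nodes within each layer and assigning coordinates. As an illustration of the layering step only, the sketch below uses longest-path layering, one common and simple technique. It is not the yFiles implementation, whose internals are not described here, and the plant line names and the function name assign_layers are hypothetical.

```python
from collections import defaultdict, deque

# Hypothetical pedigree as (parent, child) pairs; names are illustrative only.
EDGES = [
    ("A", "C"), ("B", "C"),
    ("A", "D"), ("C", "D"),
    ("B", "E"), ("D", "E"),
]


def assign_layers(edges):
    """Longest-path layering: founders on layer 0, each child below all parents."""
    children = defaultdict(list)
    indegree = defaultdict(int)
    nodes = set()
    for parent, child in edges:
        children[parent].append(child)
        indegree[child] += 1
        nodes.update((parent, child))

    layer = {node: 0 for node in nodes}
    # Start from founders (no recorded parents) and process in topological order.
    queue = deque(sorted(node for node in nodes if indegree[node] == 0))
    visited = 0
    while queue:
        node = queue.popleft()
        visited += 1
        for child in children[node]:
            # A child sits at least one layer below its deepest parent.
            layer[child] = max(layer[child], layer[node] + 1)
            indegree[child] -= 1
            if indegree[child] == 0:
                queue.append(child)
    if visited != len(nodes):
        raise ValueError("pedigree contains a cycle and cannot be layered")
    return layer


layers = assign_layers(EDGES)
```

In this example D is a child of both A (layer 0) and C (layer 1), so it is pushed down to layer 2 and every edge points downwards. A complete layered layout would go on to reduce edge crossings within layers and assign x-coordinates, which this sketch does not attempt.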

Level 2 detailed data view

Detail level 1 showed a very general overview of the data held in the dataset; subsequent zoom levels may or may not include additional information that could not be shown at the high-level overview. This level shows background information about a plant line and displays it in a data panel, thus forming the classical details-on-demand design pattern described by Shneiderman in his visual information seeking mantra (Shneiderman 1996), a term originally defined by Kreitzberg (Kreitzberg 1991). Selecting a node or edge of interest on the graph representation of the pedigree will trigger the retrieval of additional information from the database backend. Due to the potential volume of additional data, this data panel will most likely be a tabbed pane, with tabs providing a logical split between additional data types.
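The details-on-demand behaviour can be sketched as a lookup that runs only when a plant line is selected and groups the results by tab. The example below uses an in-memory SQLite database purely as a stand-in for the backend. The table names, column names, tab titles and sample values are hypothetical and do not reproduce the Germinate schema or any real data.

```python
import sqlite3


def make_demo_db():
    """Create a throwaway in-memory database with a hypothetical schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE passport (line TEXT, field TEXT, value TEXT);
        CREATE TABLE phenotype (line TEXT, trait TEXT, value REAL);
        """
    )
    # Made-up sample rows for a single illustrative plant line.
    conn.executemany(
        "INSERT INTO passport VALUES (?, ?, ?)",
        [("LineA", "breeder", "Example Breeder"), ("LineA", "year", "1990")],
    )
    conn.executemany(
        "INSERT INTO phenotype VALUES (?, ?, ?)",
        [("LineA", "height_cm", 85.0)],
    )
    return conn


# One parameterised query per tab of the data panel (tab titles are assumed).
TABS = {
    "Background": "SELECT field, value FROM passport WHERE line = ? ORDER BY field",
    "Phenotypes": "SELECT trait, value FROM phenotype WHERE line = ? ORDER BY trait",
}


def details_for(conn, selected):
    """Fetch the rows for each tab; return nothing when no line is selected."""
    if selected is None:
        return {}
    return {
        tab: conn.execute(sql, (selected,)).fetchall()
        for tab, sql in TABS.items()
    }


conn = make_demo_db()
panel = details_for(conn, "LineA")
```

Nothing is queried until a selection exists, and a plant line with no recorded data simply yields empty tabs, which reflects the sporadic data coverage a real backend is likely to have.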

Examples of data that are displayed include background information on the plant line, background information on genetic markers relating to the plant line or calculated phenotypic data such as field trial results. The details will very much depend on the context in which the pedigree is being examined but the following data types currently exist.

It has already been shown that a large volume and diversity of additional data is held on each plant line, although coverage is somewhat sporadic (Section 2.6). While most of this data should be displayed as additional information in its own visualization space, there are categories, such as phenotype data, where the information can be overlaid on the original pedigree representation. This also extends to things such as breeder information, which can be overlaid to show which breeders are responsible for which plant lines.

To summarise, there are a large number of additional data types that are important in the comprehension of the pedigree data. These data types form background information that breeders need in order to effectively analyse the data for important information, trends or patterns. They may be represented by overlaying the information on the pedigree view, in the case of simple data types, or by creating custom views showing additional visualization methods. The data displayed should represent the plant line of interest on the main pedigree view and should change with the selection, or be cleared if nothing is selected. This helps to avoid confusion with the visualization.

In document Visualizing genetic transmission patterns in plant pedigrees. (Page 117-121)