Notations and description languages for information visualisation

The literature covered so far describes high-level concepts related to visualisation. To apply these concepts in practice, visualisation grammars, declarative embedded languages and libraries have been developed over the years. For this thesis, the interest is in how these languages allow specifying the mappings from data items to visual marks and defining composite views.

The category of visualisation grammars includes the Grammar of Graphics (GoG) (Wilkinson, 2005), the layered grammar of graphics (LGoG) (Wickham, 2010), Vega (https://vega.github.io/vega/) and Vega-Lite (Satyanarayan et al., 2016), which is an extension of Vega. They provide commands and rules at different abstraction levels to specify visualisations. While the internal models and semantics of the rules vary among these grammars, they all follow the underlying idea of the original GoG, which is to describe visualisations with multiple basic blocks that map data items to visual encodings. These basic blocks are composed of data transformations, definitions of scales that transform values from data space to abstractions, different ways of projecting these scales with different coordinate systems, and rules for assigning data and values to positional and retinal channels of visual marks. These grammars also support graphical features beyond the rendering of visual marks, such as drawing the axes that represent the scales, or tooltips and captions in common charts. The grammars are accompanied by implementations in different languages, which also lead to different ways of building visualisations. In particular, Vega and Vega-Lite have JavaScript implementations that use static specifications of visualisations, stored in files with a structured notation for describing key-value pairs (JSON). LGoG, on the other hand, has been implemented as ggplot2 (http://ggplot2.org/) in the R programming environment, which requires either running scripts or entering commands interactively to build visualisations.
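As a rough illustration of the basic blocks described above, the following sketch builds a small Vega-Lite specification as a Python dictionary and serialises it to JSON. The dataset and its field names (`year`, `sales`, `region`) are hypothetical and chosen only for the example; it is a minimal sketch of the declarative style, not a specification taken from the cited works.

```python
import json

# Hypothetical inline dataset; the field names are illustrative only.
rows = [
    {"year": 2015, "sales": 12, "region": "north"},
    {"year": 2016, "sales": 18, "region": "north"},
    {"year": 2015, "sales": 9, "region": "south"},
    {"year": 2016, "sales": 14, "region": "south"},
]

spec = {
    "data": {"values": rows},
    # Data transformation: keep only rows with positive sales.
    "transform": [{"filter": "datum.sales > 0"}],
    # Visual mark drawn for the (aggregated) data items.
    "mark": "bar",
    "encoding": {
        # Positional channels; each one implies a scale and an axis.
        "x": {"field": "year", "type": "ordinal"},
        "y": {
            "field": "sales",
            "type": "quantitative",
            "aggregate": "sum",
            "scale": {"zero": True},
            "axis": {"title": "Total sales"},
        },
        # Retinal channel: colour distinguishes the regions.
        "color": {"field": "region", "type": "nominal"},
    },
}

# The JSON text is what a Vega-Lite runtime would consume.
spec_json = json.dumps(spec, indent=2)
print(spec_json)
```

The specification is static data: nothing is drawn until the JSON is handed to a Vega-Lite runtime, which is the contrast with the script-driven workflow of ggplot2 noted above.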

In terms of hierarchical visualisations and composition, support in these languages and grammars varies along three lines: default functionality, unique commands for composition, and visual variable mapping, with the last further divided into layout arrangements and retinal property mappings. Default functionality means that one composition method is used by default unless another is explicitly defined. This is the case for superimposing multiple encodings of the same data in GoG, LGoG and Vega; Vega-Lite does not have any default functionality regarding compositions. Unique commands means that the language has specific properties or operators to define compositions, which is the case for the facet operator used in LGoG and Vega-Lite, as well as the repeat, concat and layer operators in Vega-Lite. The facet operator juxtaposes multiple subsets with the same encoding in the two grammars that have it. Concat and repeat are used to juxtapose views of the same dataset; the difference between them is that concat allows a completely different specification for each additional view, while repeat provides a systematic, parameterised composition. The parameterisation means that multiple charts can be generated with only one visualisation parameter varying; for example, bar chart views of two different quantitative variables changing over time. Layer superimposes different encodings on top of each other; it can be combined with repeat and concat to generate multiple superimposed charts.
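The four Vega-Lite operators discussed above can be sketched as follows, again as Python dictionaries that mirror the JSON notation. The data path and the field names (`date`, `sales`, `profit`, `region`) are hypothetical, the helper `bar` is introduced only for this illustration, and `hconcat` is used as one of the concatenation forms that Vega-Lite offers.

```python
import json

# Placeholder data source and hypothetical fields: date, sales, profit, region.
data = {"url": "data/example.csv"}


def bar(y_field):
    """Unit view: yearly bar chart of one quantitative field."""
    return {
        "mark": "bar",
        "encoding": {
            "x": {"field": "date", "type": "temporal", "timeUnit": "year"},
            "y": {"field": y_field, "type": "quantitative", "aggregate": "sum"},
        },
    }


# Facet: same encoding, one juxtaposed view per subset (here, per region).
facet_spec = {
    "data": data,
    "facet": {"column": {"field": "region", "type": "nominal"}},
    "spec": bar("sales"),
}

# Repeat: same dataset, one view per listed field; only the y field varies.
repeat_spec = {
    "data": data,
    "repeat": {"column": ["sales", "profit"]},
    "spec": bar({"repeat": "column"}),
}

# Concat: same dataset, but each view has its own independent specification.
scatter = {
    "mark": "point",
    "encoding": {
        "x": {"field": "sales", "type": "quantitative"},
        "y": {"field": "profit", "type": "quantitative"},
    },
}
concat_spec = {"data": data, "hconcat": [bar("sales"), scatter]}

# Layer: different encodings superimposed in a single view.
line = bar("profit")
line["mark"] = "line"
layer_spec = {"data": data, "layer": [bar("sales"), line]}

specs = {
    "facet": facet_spec,
    "repeat": repeat_spec,
    "concat": concat_spec,
    "layer": layer_spec,
}
for name, spec in specs.items():
    print(name, json.dumps(spec)[:60], "...")
```

In this sketch, facet and repeat both wrap a single inner `spec`, which is what makes them systematic, whereas concat and layer take a list of views that may differ arbitrarily.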

| | GoG | LGoG | Vega | Vega-Lite | HiVE |
|---|---|---|---|---|---|
| Juxtapose multiple subsets | C, L | O, L | G, L | O, L | V |
| Juxtapose one subset | C, L | U | G | O, L | U |
| Superimpose multiple subsets | V | V | G | V | V |
| Superimpose one subset | D | D | D | O | U |
| Nesting | U | U | G | L | D |

Table 2.3 Support for composition methods in the main notations used for describing visualisations. U means unsupported, D means default functionality, L means support via layout definition, G means support by a special group visual mark, O means a specific operator to specify the composition and V means support via visual variable assignment. Vega and Vega-Lite provide various levels of support for the different combinations.

Some of these grammars also use the properties that specify the mapping of data items to visual channels as a means of composition. In the GoG, this is done either through the configuration of the coordinate system or by assigning categorical variables to retinal properties. As it is a grammar based on algebraic compositions of the variables of a dataset, with operations such as unions and cartesian products, the resulting number of dimensions used in the composition defines the arrangement of juxtaposed views. The arrangement also depends on the order of the variables in the algebraic composition. Furthermore, as all values of a dataset are shown (unless aggregation is specified), when a categorical variable is assigned to the colour channel, for instance, points are painted according to the values of the assigned variable. When colouring by a categorical variable with three values, three colours will be used, signalling the existence of subsets in the data. This differs from the other three grammars, which use a mix of special aggregation abstractions and visual variable assignment for working with subsets of data. LGoG provides a group property that defines which categorical variable is used to create the subsets, but assigning certain types of variables to certain channels, such as colour, will have the same result as in the GoG. Vega contains a special group abstract mark that defines panels for multiple views; these can also be parameterised by the data to generate a variety of combinations of encodings through juxtaposition, superimposition and nesting.
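The role that variable assignment plays in composition in the Grammar of Graphics can be illustrated with a toy sketch. It is not Wilkinson's formal algebra: the functions `levels`, `panel_grid` and `colour_by`, the dataset and the palette are all hypothetical, and the cartesian product from Python's standard library merely stands in for the crossing of categorical variables.

```python
from itertools import product

# Hypothetical dataset: two categorical variables and one quantitative one.
rows = [
    {"site": "x", "species": "a", "value": 1.0},
    {"site": "x", "species": "b", "value": 2.0},
    {"site": "y", "species": "c", "value": 3.0},
    {"site": "y", "species": "a", "value": 4.0},
]


def levels(rows, variable):
    """Distinct values of a categorical variable, in first-seen order."""
    seen = []
    for row in rows:
        if row[variable] not in seen:
            seen.append(row[variable])
    return seen


def panel_grid(rows, facet_vars):
    """Cross the levels of the faceting variables: one panel per combination.

    The number of variables gives the number of layout dimensions, and
    their order decides which variable changes slowest across the panels.
    """
    return list(product(*(levels(rows, v) for v in facet_vars)))


def colour_by(rows, variable, palette):
    """Assign one colour per level (assumes the palette has enough colours)."""
    mapping = dict(zip(levels(rows, variable), palette))
    return [mapping[row[variable]] for row in rows]


grid_site_first = panel_grid(rows, ["site", "species"])
grid_species_first = panel_grid(rows, ["species", "site"])
colours = colour_by(rows, "species", ["red", "green", "blue"])

print(len(grid_site_first), grid_site_first[:2])
print(len(grid_species_first), grid_species_first[:2])
print(colours)
```

Both grids contain the same six panels, but swapping the order of the variables changes how they are arranged, and the three species values map to three colours without any explicit grouping step.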

Table 2.3 summarises the composition methods in each notation, with the HiVE notation introduced previously added for comparison. Besides the clear differences between them, it is also notable that nesting is generally under-supported by these grammars, which also reflects their limited support for hierarchical data. Like the frameworks discussed previously, these grammars have varying strengths and weaknesses in terms of support for composition. This also relates to the differences between them and the research presented in this thesis, primarily because designing a general language for visualisation requires definitions of syntax and semantics in addition to the model. The work presented in this thesis focuses on the development of the model rather than on the language used to specify it in practical use.

2.2.3 Tools and environments for visual exploration

In addition to programming languages, end-user applications for visual data exploration have also been developed. These range from tools such as Mondrian, which enables users to analyse datasets with various visualisation methods, to Tableau (Tableau Software, 2018), which enables users to design, share and explore interactive visualisations. Other similar tools are Spotfire (TIBCO Software Inc., 2018) and Qlik (Qlik, 2018). For this thesis, the interest is in the aspects of these applications that involve hierarchical and temporal data visualisations. While some of these tools enable lower-level reconfigurations, one aspect of particular interest is the possibility of configuring coordinated multiple views in a dashboard approach, which is present in most of them. In order to generate shareable dashboards, these tools also allow users to transform their data as needed, cleaning and modifying attributes.

Among the state-of-the-art tools, Tableau stands out in particular due to its origins in academic research (Stolte et al., 2002). It includes mappings from data to visual variables in a similar manner to the Grammar of Graphics, while also offering predefined configurations to users. In terms of hierarchical and visual composition aspects, it has a tabular view of the data and visual space, with limited support for a hierarchical arrangement of views. Both Tableau and Spotfire allow users to rearrange views freely in dashboard mode. However, this rearrangement is not connected to the data, as there is no native support for hierarchically connecting views so that they are superimposed according to the composition methods described above.

These tools are included as part of the survey in chapter 6.
