2.2.1 Content versus Intent
The word model is widely used, but it can mean a variety of things. In the context of modeling biochemistry, a model can imply a set of statements in the mind of the modeler representing biochemical knowledge, i.e. a mind model. The modeler translates these statements into machine-readable mathematical symbols and relationships, resulting in a formal model. When either of these representations is translated into visual objects in a visual medium, the resulting model is a visual model.
In this chapter, I will use model to refer to the formal model in a particular mathematical framework such as rule-based modeling or reaction network modeling. The specific mathematical statements about entities and processes used in the model, as well as any systematic transformation thereof, will be referred to as the content of a model. The biochemical statements in the mind of the modeler that were used to create the model will be referred to as the intent of the model. A mapping of either of these representations to a set of visual objects and relationships will be referred to as a visualization.
The separation of these concepts allows us to rationally treat the problem of interconversion. The intent of a model is limited only by the vernacular of biochemistry, and can be considered unrestricted for all practical purposes. The content of a model is restricted by the formal definitions of the mathematical framework used. The visualization of both model intent and content is restricted at a formal level by the choice of visual objects and notations, and at a practical level by the cognitive and aesthetic appeal of the generated diagram. The methods that I develop in this chapter will use the content of a rule-based model to generate the visual representation, but will aim to convey the intent of the model and appeal to intuition.
2.2.2 Local versus Global
Both model intent and content are typically composed of individual statements about small numbers of entities, for example, a reaction with two reactant species and one product species representing a particular binding interaction. A visualization examining one such statement is considered to be local. A visualization of the whole model, composed from many such statements, is considered to be global. At the local level, the focus is on detail and the visualization is tailored to present the maximum amount of detail that is possible. At the global level, the focus is on identifying higher-order motifs and trends in the model, and a certain level of coarse-graining may be needed to uncover these. A comparison to real-world maps is applicable here: maps of larger geographic areas have to approximate features found in detailed maps of smaller areas in order to be useful. In this chapter, we will specifically distinguish between local and global visualizations of rule-based models and we will define appropriate coarse-graining procedures during the generation of global visualizations.
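The relationship between local detail and a coarse-grained global view can be illustrated with a minimal Python sketch. It is not the coarse-graining procedure defined later in this chapter; the function name coarse_grain, the node names and the grouping are hypothetical and chosen only for illustration. Each detailed edge stands for one local relationship, and the global view is obtained by collapsing nodes into user-defined groups.

```python
def coarse_grain(edges, group_of):
    """Collapse a directed graph by replacing each node with its group.

    edges: iterable of (source, target) pairs from the detailed graph.
    group_of: dict mapping a node to a group label; nodes without an
    entry keep their own name. Edges inside a single group are dropped.
    """
    coarse = set()
    for src, dst in edges:
        g_src = group_of.get(src, src)
        g_dst = group_of.get(dst, dst)
        if g_src != g_dst:
            coarse.add((g_src, g_dst))
    return sorted(coarse)


# Hypothetical detailed relationships between sites of three molecules.
detailed_edges = [
    ("R.site1", "R.site2"),
    ("R.site2", "A.sh2"),
    ("A.sh2", "A.kinase"),
    ("A.kinase", "B.tyr"),
]
# Hypothetical grouping: every site is assigned to its parent molecule.
group_of = {
    "R.site1": "R", "R.site2": "R",
    "A.sh2": "A", "A.kinase": "A",
    "B.tyr": "B",
}
global_edges = coarse_grain(detailed_edges, group_of)
print(global_edges)
```

As with maps of larger areas, the collapsed graph loses site-level detail but makes the molecule-level chain from R through A to B easier to see.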
2.2.3 Flow versus Adjacency
An important visualization objective in diagrams of biochemistry is to emphasize the temporal order of events, particularly sequences and cycles [53]. Delineating specific paths, such as pathways, feedback loops and feed-forward loops, is important for visual comprehension, and I will collectively refer to them as signal flows. Encoding the temporal order along a graphical dimension, i.e. aligning signal flows top-down or left-right in the diagram, can drastically improve visual comprehension [54]. However, a high density of edges can preclude a good visual alignment of flows, in which case force-directed layouts are efficient. These layouts emphasize adjacency relationships [55] rather than temporal order, and the resulting diagrams are harder to comprehend visually. For a generated diagram to be useful, it needs to be sufficiently small and sparse in edges that signal flows can be aligned in an optimal manner.
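To make the idea of encoding temporal order along a graphical dimension concrete, the following Python sketch assigns each node of a directed graph to a layer so that every edge points from a shallower layer to a deeper one (a longest-path layering computed in topological order). This is an illustrative assumption, not a layout method developed in this work: the function name assign_layers and the example nodes are hypothetical, and the sketch assumes an acyclic graph, so feedback loops would first have to be broken.

```python
from collections import defaultdict, deque


def assign_layers(edges):
    """Return {node: layer} such that every edge goes to a deeper layer.

    Assumes the graph is acyclic; raises ValueError if a cycle is found.
    """
    successors = defaultdict(list)
    indegree = defaultdict(int)
    nodes = set()
    for src, dst in edges:
        successors[src].append(dst)
        indegree[dst] += 1
        nodes.update((src, dst))

    # Nodes without incoming edges start the flow at layer 0.
    layer = {n: 0 for n in nodes if indegree[n] == 0}
    queue = deque(sorted(layer))
    visited = 0
    while queue:
        node = queue.popleft()
        visited += 1
        for nxt in successors[node]:
            # Place each node one layer below its deepest predecessor.
            layer[nxt] = max(layer.get(nxt, 0), layer[node] + 1)
            indegree[nxt] -= 1
            if indegree[nxt] == 0:
                queue.append(nxt)
    if visited != len(nodes):
        raise ValueError("graph has a cycle; break feedback edges first")
    return layer


# Hypothetical signal flow with one shortcut edge.
edges = [
    ("ligand", "receptor"),
    ("receptor", "adaptor"),
    ("adaptor", "kinase"),
    ("receptor", "kinase"),
]
layers = assign_layers(edges)
print(layers)
```

Drawing each layer as one row (or column) aligns the flow top-down (or left-right). In a dense graph many edges span several layers and cross each other, which is the situation in which such an alignment breaks down.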
2.2.4 Art versus Automation
There are many aspects to producing a diagram: synthesizing the content, defining the notation and attribute mappings, drawing the actual elements on some visual medium, and laying out the visual elements in an optimal manner. When every one of these aspects is left to the discretion of the diagrammer, the diagram becomes a one-off art project that requires a heavy investment of time. It would be preferable to automate as many of these steps as possible. However, artistic and aesthetic considerations do play a role in the usefulness of a diagram [54], so a balance is necessary between automation and artistic discretion. In this work, we focus on automated generation of the content of a diagram, and coarse-graining of diagrams with minimal user input. The development of automated layout algorithms is beyond the scope of this work.
2.2.5 Abstraction versus Enumeration
When defining a naming system or notation for a set of objects, there is a tendency to enumerate all possible states of those objects and to assign labels or features to every possibility indiscriminately. I call this the enumerative approach. Manual cataloguing, annotating and diagramming approaches are typically of this type. Another way of defining a system or notation is to generalize over the set of objects and create a few carefully defined types or classes of objects, then map the real-world possibilities as instances of those classes. I call this the abstract approach. Mathematical modeling frameworks are typically of this form. Enumeration is superficially the most direct approach, but it inevitably encounters a level of complexity that hinders automation and comprehension. Abstraction is much harder to do, but the right abstraction for a task can drastically reduce the complexity involved, improve clarity and enable automation.
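The difference in complexity between the two approaches can be illustrated with a small Python sketch. The molecule, its site names and the helper functions are hypothetical and serve only as an example. The enumerative approach lists every combination of site states explicitly, whereas the abstract approach describes a whole class of states with a pattern that constrains only the sites it mentions.

```python
from itertools import product


def enumerate_states(sites):
    """Enumerative approach: list every combination of site states."""
    names = sorted(sites)
    combos = product(*(sites[name] for name in names))
    return [dict(zip(names, combo)) for combo in combos]


def matches(pattern, state):
    """Abstract approach: a pattern constrains only the sites it names."""
    return all(state[site] == value for site, value in pattern.items())


# Hypothetical molecule with four sites, each unphosphorylated ("u")
# or phosphorylated ("p").
sites = {f"y{i}": ("u", "p") for i in range(1, 5)}

all_states = enumerate_states(sites)
pattern = {"y1": "p"}  # "site y1 is phosphorylated", other sites unspecified
covered = [state for state in all_states if matches(pattern, state)]

print(len(all_states), "enumerated states")
print(len(covered), "states described by a single pattern")
```

The number of enumerated states doubles with every additional two-state site, while the pattern stays the same size.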