Visualization
GRAPH DESIGN PRINCIPLES
When creating graphs, you should pay attention to a few simple design guidelines to generate easy-to-read, efficient, and effective graphs. You should know and understand the following list of graph design principles:
• Reduce nondata ink.
• Distinct attributes.
• Gestalt principles.
• Emphasize exceptions.
• Show comparisons.
• Annotate data.
• Show causality.
Try applying these principles to your graphs, and notice how the graphs not only improve aesthetically but also become much simpler to understand.
Reduce Nondata Ink
One of the most powerful lessons that I have learned comes from Edward Tufte. In his book, The Visual Display of Quantitative Information, he talks about the data-ink ratio. The data-ink ratio is defined as the amount of ink that is used to display the data in a graph, divided by the total amount of ink that was used to plot the entire graph. For example, take any bar chart. If the chart uses a bounding box, an excessive number of grid lines, or unnecessary tick marks on the axes, it increases the ink that was used to paint nondata elements in the graph. Three-dimensional bars and background images are some of the worst offenders against this principle. Get rid of them. They do not add anything to make a graph more legible and do not help to communicate information more clearly. Reduce nondata ink. It is a simple principle, but it is very powerful. Figure 1-6 shows how a graph can look before and after applying the principle of reducing nondata ink. The right side of the figure shows the same data as on the left side, but in a way that is much more legible.
[Figure 1-6: two bar charts of Risk (y-axis) by Department (x-axis: Engineering, HR, Sales, Legal, IT, Finance, Marketing); left: before, right: after reducing nondata ink]
Figure 1-6 An example illustrating the data-ink ratio and how reducing nondata ink helps improve the legibility of a graph
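The definition of the data-ink ratio given above can be written as a formula. The numbers in the second line are purely hypothetical (they are not measurements of Figure 1-6) and only illustrate how removing nondata ink raises the ratio while the data ink stays the same.

```latex
\[
\text{data-ink ratio} =
  \frac{\text{ink used to display the data}}
       {\text{total ink used to plot the graph}}
\]
% Hypothetical example: 30 units of data ink and 70 units of nondata ink.
% Removing 60 units of nondata ink (box, grid lines, tick marks) leaves 10.
\[
\frac{30}{30 + 70} = 0.30
\qquad\longrightarrow\qquad
\frac{30}{30 + 10} = 0.75
\]
```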
Distinct Attributes
We briefly touched on the topic of perception in the preceding section. One perceptual principle relates to the number of different attributes used to encode information. If you have to display multiple data dimensions in the same graph, make sure not to exceed five distinct attributes to encode them. For example, if you are using shapes, do not use more than five shapes. If you are using hue (or color), keep the number of distinct colors low. Although the human visual system can identify many different colors, our short-term memory cannot retain more than about eight of them for a simple image.
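One possible way to stay within the limit of five distinct attributes is to keep the most frequent categories and fold everything else into a single catch-all bucket before assigning shapes or colors. The following is a minimal sketch under that assumption; the function name `cap_categories`, the "Other" label, and the sample protocol data are hypothetical and not part of the original text.

```python
from collections import Counter

MAX_DISTINCT = 5  # limit on distinct attributes suggested in the text


def cap_categories(values, max_distinct=MAX_DISTINCT, other_label="Other"):
    """Keep the most frequent categories and fold the rest into one bucket,
    so that at most `max_distinct` values need their own shape or color."""
    counts = Counter(values)
    if len(counts) <= max_distinct:
        return list(values)
    # Reserve one of the slots for the catch-all bucket.
    keep = {name for name, _ in counts.most_common(max_distinct - 1)}
    return [v if v in keep else other_label for v in values]


# Hypothetical example: protocol names seen in a log file (7 distinct values).
protocols = (["tcp"] * 6 + ["udp"] * 5 + ["icmp"] * 4 + ["gre"] * 3
             + ["esp", "ah", "sctp"])
capped = cap_categories(protocols)
print(sorted(set(capped)))
```

The trade-off of this approach is that rare categories lose their individual identity; if the rare values are the interesting ones, a different selection rule would be needed.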
Gestalt Principles
To reduce search time for viewers of a graph and to help them detect patterns and recognize important pieces of information, a school of psychology called Gestalt theory⁸ is often consulted. Gestalt principles are a set of visual characteristics. They can be used to highlight data, tie data together, or separate it. The six Gestalt principles are presented in the following list and illustrated in Figure 1-7:
• Proximity: Objects grouped together in close proximity are perceived as a unit. Based on the location, clusters and outliers can be identified.
8 Contrary to a few visualization books that I have read, Gestalt is not the German word for pattern. Gestalt
• Closure: Humans tend to perceive objects that are almost a closed form (such as an interrupted circle) as the full form. If you were to cover this line of text halfway, you would still be able to guess the words. This principle can be used to eliminate bounding boxes around graphs. A lot of charts do not need the bounding box; the human visual system “simulates” it implicitly.
• Similarity: Be it color, shape, orientation, or size, we tend to group similar-looking elements together. We can use this principle to encode the same data dimensions across multiple displays. If you are using the color red to encode malicious IP addresses in all of your graphs, there is a connection that the visual system makes automatically.
• Continuity: Elements that are aligned are perceived as a unit. Nobody would interpret every little line in a dashed line as its own data element. The individual lines make up a dashed line. We should remember this phenomenon when we draw tables of data. The grid lines are not necessary; just arranging the items is enough.
• Enclosure: Enclosing data points with a bounding box, or putting them inside some shape, groups those elements together. We can use this principle to highlight data elements in our graphs.
• Connection: Connecting elements groups them together. This is the basis for link graphs. They are a great way to display relationships in data. They make use of the “connection” principle.
[Figure 1-7: six panels titled Proximity, Closure, Continuity, Enclosure, Similarity, and Connection]
Figure 1-7 Illustration of the six Gestalt principles. Each of the six images illustrates one of the Gestalt principles. They show how each of the principles can be used to highlight data, tie data together, and separate it.
Emphasize Exceptions
A piece of advice for generating graphical displays is to emphasize exceptions. For example, use the color red to highlight important or exceptional areas in your graphs. By following this advice, you will refrain from overusing visual attributes that overload graphs. Stick to the basics, and make sure your graphs communicate what you want them to communicate.
[Figure 1-8: bar chart of Risk (y-axis) by Department (x-axis: Engineering, HR, Sales)]
Figure 1-8 This bar chart illustrates the principle of highlighting exceptions. The risk in the sales department is the highest, and this is the only bar that is colored.
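The idea behind Figure 1-8 can be sketched in a few lines: give every bar a neutral color and reserve the highlight color for the exceptional value. This is only an illustration under stated assumptions. The helper `highlight_colors` is hypothetical, the exception is assumed to be the maximum, and the risk scores are invented because the text does not give the values behind the figure.

```python
def highlight_colors(values, highlight="red", neutral="lightgray"):
    """Return one color per value: `highlight` for the maximum value,
    `neutral` for everything else. Ties for the maximum are all highlighted."""
    if not values:
        return []
    peak = max(values)
    return [highlight if v == peak else neutral for v in values]


# Hypothetical risk scores; the actual values of Figure 1-8 are not in the text.
departments = ["Engineering", "HR", "Sales"]
risk = [6, 4, 11]
colors = highlight_colors(risk)

for dept, score, color in zip(departments, risk, colors):
    print(f"{dept:<12} {score:>3}  {color}")
```

The resulting list holds one color per bar, so it can be handed to a plotting library that accepts per-bar colors (for instance, the `color` argument of matplotlib's `Axes.bar`).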
Show Comparisons
A powerful method of showing and highlighting important data in a graph is to compare graphs. Instead of just showing the graph with the data to be analyzed, also show a graph of “normal” behavior or of the same data from a different time (see Figure 1-9). The viewer can then compare the two graphs to immediately identify anomalies, exceptions, or simply differences.
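The comparison a viewer makes between two graphs can also be prepared in the data, for example to decide which differences deserve emphasis. The sketch below assumes two sets of counts, a baseline and a current one, and flags entries whose relative change exceeds a threshold. The function name `flag_differences`, the 50 percent threshold, and the per-port connection counts are all hypothetical choices for illustration.

```python
def flag_differences(baseline, current, threshold=0.5):
    """Return (key, before, after, relative_change) for every key whose
    relative change from baseline to current exceeds `threshold`."""
    flagged = []
    for key in sorted(set(baseline) | set(current)):
        before = baseline.get(key, 0)
        after = current.get(key, 0)
        if before == 0:
            # No baseline value: any activity counts as an unbounded change.
            change = float("inf") if after else 0.0
        else:
            change = (after - before) / before
        if abs(change) > threshold:
            flagged.append((key, before, after, change))
    return flagged


# Hypothetical connection counts per destination port.
normal = {22: 40, 80: 900, 443: 1200}              # "normal" behavior
today = {22: 410, 80: 870, 443: 1250, 6667: 30}    # data to be analyzed

flagged = flag_differences(normal, today)
for port, before, after, change in flagged:
    print(f"port {port}: {before} -> {after} (relative change {change})")
```

With this sample data, ports 22 and 6667 are flagged, while the small changes on ports 80 and 443 are not.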
Annotate Data
Graphs without legends, axis labels, or units are not very useful. The only time when this is acceptable is when you want the viewer to qualitatively understand the data and the exact units of measure or the exact data values are not important. Even in those cases, however, a little bit of text is needed to convey what data is visualized and what the viewer is looking at. In some cases, the annotations can come in the form of a figure caption or a text bubble in the graph (see Figure 1-10). Annotate as much as needed, but not more. You do not want the graphs to be overloaded with annotations that distract from the real data.