
Treating the Multimodal Page as a Multilayered Semiotic

3.2 The GeM presentation layers: the layout base

3.2.2 Realisation information

The second part of the layout base goes beyond what is recorded in the GeM base layer in order to identify the particular forms that are employed within the identified layout units. Following common linguistic usage, we say that each layout unit specified in the layout segmentation has a visual realisation.

As already suggested, the most obvious difference in realisation is the mode that has been used: the verbal or the graphical/pictorial mode. Following this distinction, the layout base differentiates between two kinds of elements: textual elements and graphical elements. These have differing sets of attributes that describe their particular layout properties.

For the textual elements, we need to characterise the narrow typographical features employed. This is already a well-understood area of design and so will not be particularly highlighted in our discussion here. Very briefly, when describing the use of type in a document, we need to specify the typeface (Times Roman, Helvetica, etc.), the style of type (normal, italic, bold, etc.) and the type size (usually measured in points, 1/72nd of an inch). We also add in here the colour of the letters and an increasing range of graphical effects that nowadays find application, such as shadows, glows, and 3D textures. Following Walker’s (2001, p26) use of terms from Twyman (1982), we can consider these properties as intrinsic features that “reside in the characters themselves and, particularly, in the system that produces those characters”. The reference to the system used is important as it makes clear that the technology employed determines the status of the particular graphical feature considered. Thus, in online documents, colour is simply an extra intrinsic feature of the characters displayed, whereas in older print technology, colour would have been part of a more or less elaborate manipulation of the printing process and so would be considered extrinsic.

Extrinsic features also include properties of typographical display that come about by manipulating the space ‘around’ the characters. In this area we need, for example, also to specify the distance between successive lines of print, technically called leading (rhyming with ‘sledding’), since this influences the perceived texture of a block of type enormously and is a determinant of visual discrimination. Leading is also usually measured in points. A more complete specification of font use in a block of text is then traditionally written with a pair of numbers, e.g., ‘12/14pt’, which means that a 12pt typeface is being used with 14pt between the successive baselines of type—therefore leaving 2pt extra (made up by the leading) between lines. As Schriver (1997, p261) usefully explains, this terminology can be confusing since by convention “10-point leading” and “10 points of leading” mean completely different things: the first refers to the distance between baselines (thereby including one line of type) and the second refers to the actual extra distance that is inserted between lines.
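The arithmetic behind a specification such as ‘12/14pt’ can be made concrete with a short sketch. The following Python fragment is only an illustration and not part of the GeM model; the names `TypeSpec`, `parse_type_spec` and `points_to_inches` are invented here, and the conversion assumes the 72-points-per-inch convention mentioned above.

```python
import re
from typing import NamedTuple

POINTS_PER_INCH = 72  # assumption: the 1/72-inch point described in the text


class TypeSpec(NamedTuple):
    """A 'size/baseline-distance' type specification, both values in points."""
    size_pt: float      # type size
    baseline_pt: float  # distance between successive baselines ("14-point leading")

    @property
    def extra_leading_pt(self) -> float:
        # The extra space inserted between lines ("2 points of leading").
        return self.baseline_pt - self.size_pt


def parse_type_spec(spec: str) -> TypeSpec:
    """Parse a string such as '12/14pt' into its two components."""
    match = re.fullmatch(r"\s*(\d+(?:\.\d+)?)\s*/\s*(\d+(?:\.\d+)?)\s*pt\s*", spec)
    if match is None:
        raise ValueError(f"not a size/leading specification: {spec!r}")
    return TypeSpec(float(match.group(1)), float(match.group(2)))


def points_to_inches(points: float) -> float:
    """Convert a measurement in points to inches."""
    return points / POINTS_PER_INCH


spec = parse_type_spec("12/14pt")
print(spec.size_pt, spec.baseline_pt, spec.extra_leading_pt)
```

Keeping the baseline distance and the extra leading as separate values mirrors the terminological distinction drawn by Schriver: the first includes one line of type, the second does not.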

Whereas leading is concerned with vertical layout, we may also need to indicate how characters and words are stretched out horizontally. There may be more or less space between characters (termed kerning) and the characters themselves may be stretched or squeezed. The former has always been available in traditional type-based print, while the latter only really became a possibility that can be productively employed in normal document design with the advent of computer-based typesetting. Again we see technology moving previously extrinsic features to become intrinsic.

Although both of these kinds of horizontal manipulation are important for professional document design, we will not refer to them further here unless they become both especially prominent visually and important for distinguishing layout elements that we are working with.

These ways of characterising typographic design decisions made on the page were developed with respect to traditional print technology, going back several hundred years. Both the ‘10pts of leading’ used for vertical spacing and kerning were originally managed by literally inserting thin strips of printer’s ‘lead’ between the lines of type or between characters.

Nevertheless, despite the completely different technological basis that is now responsible for getting pages onto paper or screen, the traditional vocabulary still provides a good way of describing and distinguishing the design options that are available in modern document design. The techniques and effects that they describe are all basic tools of the trade for influencing the visual perception of the basic organisation of pages (cf. Schriver 1997, pp283–288 and Waller 1990).

Figure 3.3 Example of differential use of leading in the example Gannet page

We can illustrate something of their continuing utility by drawing attention to the three distinct leadings used in our Gannet example; this is suggested graphically in Figure 3.3. Measuring the distance between lines in the Gannet page, we can discern three distinct regions on the page. The top lines of text are the furthest apart, the lines of the middle block are less far apart, and those of the bottom block are the closest together of all. The arrows on the lower right of the figure show the relative size differences. This shows that selection of leading and type size is still one of the ways that document designers manipulate typographic realisation in order to signal implicitly to the reader divisions of the page into elements and those elements’ respective importance and inter-relationships.

We can then start characterising textual layout elements with specifications of the following form, which give the values of particular attributes depending on the type of layout element in question:

layout element

The ‘xref’ attributes of each element identify which particular layout elements we are dealing with: the numbering and its correspondence to the base units of the page will be shown below in Figure 3.4. The first layout element shown here refers to the main block of text on the page; the second refers to the single block covering the additional bird information given in the lower portion of the page. We can see that the font information specifies that the text realisations are identical apart from the size and leading. Separating this realisational information from the particular layout elements making up the layout layer segmentation makes it possible to capture generalisations by grouping all the layout elements that share a realisation within the same specification.
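The idea of keeping realisation information apart from the layout elements, and of letting several elements share one specification, can be sketched as a small data structure. The Python fragment below is a hypothetical illustration only: the class `TextRealisation`, the helper functions, the identifier `main-block` and all font values are invented for this sketch and are not the actual GeM annotation of the Gannet page. Only the identifier `L1.4` is taken from the discussion here.

```python
from dataclasses import dataclass, fields
from typing import List, Sequence, Tuple


@dataclass(frozen=True)
class TextRealisation:
    """One realisation specification shared by one or more layout elements."""
    xrefs: Tuple[str, ...]  # identifiers of the layout elements it applies to
    font_family: str
    font_size_pt: float
    leading_pt: float


def differing_attributes(a: TextRealisation, b: TextRealisation) -> List[str]:
    """Names of the typographic attributes on which two realisations differ."""
    return [
        f.name
        for f in fields(a)
        if f.name != "xrefs" and getattr(a, f.name) != getattr(b, f.name)
    ]


def realisation_of(xref: str, specs: Sequence[TextRealisation]) -> TextRealisation:
    """Look up the specification that covers a given layout element."""
    for spec in specs:
        if xref in spec.xrefs:
            return spec
    raise KeyError(xref)


# Invented values: two blocks that differ only in size and leading.
main_text = TextRealisation(("main-block",), "Times", 12.0, 14.0)
bird_info = TextRealisation(("L1.4",), "Times", 10.0, 11.0)

print(differing_attributes(main_text, bird_info))
```

Because `xrefs` is a tuple, a further layout element with the same typography would simply be added to an existing specification rather than receiving a specification of its own.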

Generalisations are also captured by the hierarchical structure defined over the layout elements. The information concerning the second layout unit, L1.4, is therefore ‘inherited’ by all its subordinate layout elements.

These elements are those expressing information about the Gannet’s ‘haunt’, ‘nest’, etc. in the lower part of the page (see Figure 3.4 for clarification). For these, all that needs to be expressed concerning their typographical realisation are the deviations from what was inherited from L1.4. We describe this below when we introduce the layout structure in detail.
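Inheritance of this kind amounts to resolving an element's properties by starting from its parent's resolved properties and overriding them with the locally stated deviations. The following Python sketch shows one way this could work; it is an assumption-laden illustration, not the GeM implementation. The class `LayoutElement`, the child identifier `L1.4.1` and all property values are hypothetical.

```python
from typing import Any, Dict, Optional


class LayoutElement:
    """A layout element that records only its own deviations from its parent."""

    def __init__(self, ident: str, parent: Optional["LayoutElement"] = None,
                 **own: Any) -> None:
        self.ident = ident
        self.parent = parent
        self.own = own  # properties stated explicitly on this element

    def resolved(self) -> Dict[str, Any]:
        """Full property set: inherited values overridden by local ones."""
        inherited = self.parent.resolved() if self.parent is not None else {}
        return {**inherited, **self.own}


# Invented values for illustration only.
block = LayoutElement("L1.4", font_family="Times", font_size_pt=10.0, leading_pt=11.0)
heading = LayoutElement("L1.4.1", parent=block, font_weight="bold")

print(heading.resolved())
```

The subordinate element states a single property, yet its resolved description contains everything needed to characterise its realisation.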

Layout elements also specify the properties of embedded textual base units that are typographically highlighted against their environment in this way. These include exactly the same attributes as ordinary text elements with an extra attribute that refers back to the embedding element. The layout element explicitly notes those properties of the embedded element which make it stand out against its context (e.g., bold, italic, size, colour, etc.) by annotating these with particular values; the remaining properties are ‘inherited’ from its surrounding context.

The particular font attributes and values used here are for purposes of illustration only; although they are useful for comparison across the texts we deal with, a more complete and standardised representation is preferable. There have been many proposals for the properties and values that could be used for this representation: those developed for the linguistically-based analysis of multimodality are relatively simple (e.g., Sefton 1990, Lim 2004), while those used in professional document design and the print industry are correspondingly more complex. The actual properties and values that we will assume for the GeM model are taken from the well developed character models that we discussed in the previous chapter in the context of page rendering. These models, defined for the markup language XML and its related formatting objects and style sheet specifications, provide a well established degree of international standardisation and are being continuously extended in sophistication.

Adopting such standardised attributes also contributes to our ability to collect data since ever more documents are being produced using these description languages in any case; the properties they define for layout elements then become available ‘for free’. Even if it is the case that for some particular analytic purpose a less complex characterisation (typeface, size, leading, colour, etc.) is sufficient, this should be based on a systematic simplification of the current standards. Adopting or developing further ad hoc characterisations that are not directly related to those standards is best avoided. We return to this issue in Chapter 6 below.
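A ‘systematic simplification’ can be pictured as a fixed projection from standard property names onto the smaller vocabulary needed for a given analysis. The Python sketch below assumes the property names `font-family`, `font-size`, `line-height` and `color` as they are used in style sheet and formatting object specifications; the mapping table `SIMPLIFIED_VIEW`, the function `simplify` and the sample values are invented for illustration, and treating `line-height` as the counterpart of leading is itself a simplifying assumption.

```python
from typing import Dict

# Hypothetical projection: simplified attribute name -> standard property name.
SIMPLIFIED_VIEW: Dict[str, str] = {
    "typeface": "font-family",
    "size": "font-size",
    "leading": "line-height",
    "colour": "color",
}


def simplify(properties: Dict[str, str]) -> Dict[str, str]:
    """Project a full standard property set onto the simplified attributes.

    Properties outside the mapping are dropped; nothing is renamed ad hoc,
    so every simplified attribute can be traced back to a standard one.
    """
    return {
        simple: properties[standard]
        for simple, standard in SIMPLIFIED_VIEW.items()
        if standard in properties
    }


# Invented sample values.
full = {
    "font-family": "Helvetica",
    "font-size": "10pt",
    "line-height": "12pt",
    "color": "black",
    "font-stretch": "condensed",
}
print(simplify(full))
```

Since the simplified attributes are defined only through the mapping, a richer analysis can later widen the table without invalidating descriptions that were made with the smaller set.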

For properties appropriate for layout units realised in the visual mode there is much less agreement. Here there are proposals from each of the communities discussed in the previous chapter, such as, for example, the ‘interactive meanings’ of Kress and van Leeuwen (1996, p154), O’Toole’s (1994, p24) ‘functions and systems in painting’, Lim’s (2004, p236) ‘system network for graphics’ and many more. It will be some time still before an empirically motivated set of properties adequate for document analysis in general has been established. For present purposes, we will simply note whether the visual is a photograph, naturalistic drawing, line drawing or diagram. Further distinctions will be considered as motivated by the requirements of individual genres, for example in the area of diagrams, where we have arrows, connecting lines, and a host of other more conventional elements that must be distinguished for effective analysis (cf., e.g., Bertin 1983, Winn 1989, Tufte 1997).

In document John Bateman. Multimodality and Genre (Page 137-141)