Layout structure - The GeM presentation layers: the layout base

Treating the Multimodal Page as a Multilayered Semiotic

3.2 The GeM presentation layers: the layout base

3.2.3 Layout structure

Finally, we specify how the individually identified layout units are grouped together into larger elements that collectively make up the composition of the page. For instance, a heading and the text to which it belongs form together a larger layout unit; or, similarly, the cells of a table form the larger layout unit “table”. In order to capture this kind of spatial grouping of elements, we need to specify how larger segments can be constructed and positioned relative to one another. We have developed several criteria for this, such as considerations of ‘framing’ and ‘visual integrity’. By these means layout elements are progressively grouped into larger elements, building up a hierarchical structure with the entire page, or page spread, as the root and leading down through ever smaller elements to the smallest layout elements, those identified in the segmentation part of the layout base. The hierarchical structure itself is represented in the third part of the layout base: the GeM layout structure.

3 As can be seen, for example, from current standardisation efforts such as the OASIS Open Document Format http://docs.oasis-open.org/office/v1.1/OS/OpenDocument-v1.1.pdf.

As discussed in the previous chapter, in Reichenberger et al. (1995) we proposed identifying visually-motivated layout chunks by progressively decreasing the resolution of an image of the page; more sophisticated methods adopted from the automatic layout analysis community could also be usefully employed at this point. The grouping into chunks can be applied in several steps, thus forming larger and larger layout chunks out of the basic layout units up to the entire document. One chunk may then come to consist of layout elements of different realisations (text and graphics). Typical layout elements will share some visual features (font-family, font-size, ...), while differing visually from their surroundings (e.g., by background colour, a surrounding box or whitespace and framing). Here we allow a very limited form of functional consideration in terms of ‘what belongs together’. This is a valuable channel to segmentation that is influenced by our generic or specific knowledge of how a document is to be ‘read’.

It is also useful to contrast our layout structure with the document structure of Power et al. described in the previous chapter. Since the layout structure is explicitly oriented towards the visual make-up of the page, it is naturally more ‘surface-oriented’ than document structure and differs from that structure in that

• it reflects the production and canvas constraints which the realisation of a given document structure is subject to (decisions about pagination, columns, margins, hyphenation, etc.);

• it includes navigational elements, that is, layout elements which are not derived from the content but which serve to guide the reader through the document (e.g. page numbers, pointers, running heads, titles);

• it specifies the position of layout elements on the page.

Figure 3.4 Layout structure for the 1972 Gannet example page according to the GeM model

For our Gannet example, the layout structure is then relatively straightforward although, as mentioned above, there does remain some indeterminacy at the top of the page. Differential texture (cf. Figure 3.3) and the application of segmentation procedures such as the XY-tree give us four bands running horizontally across the page. Moreover, if we proceed by reducing resolution and filtering for areas of similar texture and layout element properties, we arrive at three layout units running across the top band of the page, just above the drawing: one consisting of the base units 1 and 3, one consisting of just base unit 4, and one on the right consisting of base units 2 and 5. Whether this is what the designer intended or not is a separate question, one which we will return to in more depth in Chapter 4.
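The horizontal banding step can be illustrated with a projection profile, the basic operation behind XY-cut style segmentation. What follows is only a minimal sketch and not the GeM procedure itself: it assumes the page has already been reduced to a small binary grid (1 = ink, 0 = whitespace), and the function name `horizontal_bands`, the `min_gap` threshold and the toy page are all illustrative assumptions.

```python
def horizontal_bands(page, min_gap=1):
    """Split a binary page (list of rows, 1 = ink) into horizontal bands.

    A band is a run of rows bounded by ink; two bands are separated by at
    least `min_gap` consecutive empty rows. Returns (start, end) row-index
    pairs with `end` exclusive.
    """
    bands = []
    start = None      # first row of the band currently being built
    last_ink = None   # last ink-bearing row seen in that band
    for y, row in enumerate(page):
        if any(row):
            if start is None:
                start = y
            elif y - last_ink - 1 >= min_gap:
                # The empty gap is wide enough: close the band, open a new one.
                bands.append((start, last_ink + 1))
                start = y
            last_ink = y
    if start is not None:
        bands.append((start, last_ink + 1))
    return bands


# Hypothetical toy page: four blocks of ink separated by blank rows.
toy_page = [
    [1, 0, 1],
    [0, 0, 0],
    [1, 1, 1],
    [1, 1, 1],
    [0, 0, 0],
    [0, 1, 0],
    [0, 0, 0],
    [1, 1, 0],
]
bands = horizontal_bands(toy_page)
```

Applying the same profile column-wise inside each band, and then recursing, would yield a tree of alternating horizontal and vertical cuts. Raising `min_gap` merges bands separated only by narrow whitespace, which is one simple way of mimicking a lower-resolution view of the page.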

A layout structure for the entire page is then given in Figure 3.4: the low resolution starting point for the visual decomposition is shown upper right in the figure. We also use here portions of the original page to indicate the base units to which the layout elements correspond. The structure shows that layout units are typically grouped into collections which are then themselves grouped into larger collections still. Each collection consists of elements showing some kind of commonality in the features of its member elements. This is the main criterion employed in the present case.

For example, the three elements of the header are grouped together because of their consistency in type face selection, size and leading; an alternative would have been to maintain these separate sub-elements (L1.1.1–L1.1.3) as direct descendants of the page. The material at the bottom of the page is also grouped together into a single layout element (L1.4); again homogeneity of texture (type face, leading, size) is the main cue here together with a rather weak framing separating it off from the main text body. Within this element, the individual elements are identifiable by virtue of the list headers in bold.

Each layout element has associated with it the particular properties that characterise it: that is, for the text elements, we have information about type face, leading, size and so on and, for the image, information about the type of drawing that is involved. Generalisations about groups of elements are made by moving the shared properties and values upwards in the layout structure. Thus, for example, the fact that all of the items in the final list of the page (L1.4.1–L1.4.7) exhibit the same formatting and typographical features is captured by placing this information on the parent element L1.4. This then also expresses that the sub-elements are visually similar: whether they are also similar in intent is then a question addressed when we deal with the rhetorical layer.
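The idea of moving shared properties upwards can be sketched as a small tree structure. This is a hedged illustration only: the class `LayoutUnit`, the helper `hoist_shared` and the property values (`"small"`, `"tight"`, `"normal"`) are hypothetical placeholders and are not taken from the GeM annotation scheme or from the actual Gannet page.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class LayoutUnit:
    """A node in a hypothetical layout structure tree."""
    ident: str
    props: Dict[str, str] = field(default_factory=dict)
    children: List["LayoutUnit"] = field(default_factory=list)
    # Excluded from repr/compare to avoid parent <-> child recursion.
    parent: Optional["LayoutUnit"] = field(default=None, repr=False, compare=False)

    def add(self, child: "LayoutUnit") -> "LayoutUnit":
        child.parent = self
        self.children.append(child)
        return child

    def effective_props(self) -> Dict[str, str]:
        """Local properties merged over those inherited from ancestors."""
        inherited = self.parent.effective_props() if self.parent else {}
        return {**inherited, **self.props}


def hoist_shared(unit: LayoutUnit) -> None:
    """Move properties shared by all children of `unit` up onto `unit`."""
    for child in unit.children:
        hoist_shared(child)
    if len(unit.children) < 2:
        return  # nothing to generalise over
    first = unit.children[0].props
    shared = {
        key: value for key, value in first.items()
        if all(c.props.get(key) == value for c in unit.children)
    }
    for key, value in shared.items():
        if unit.props.get(key, value) != value:
            continue  # parent already states something different
        unit.props[key] = value
        for child in unit.children:
            del child.props[key]


# Placeholder values; only the unit identifiers follow the text.
page = LayoutUnit("L1")
body = page.add(LayoutUnit("L1.3", {"font-size": "normal"}))
footer = page.add(LayoutUnit("L1.4"))
for i in range(1, 8):
    footer.add(LayoutUnit(f"L1.4.{i}", {"font-size": "small", "leading": "tight"}))
hoist_shared(page)
```

After hoisting, the seven list items carry no local properties of their own, yet `effective_props()` still recovers their formatting from L1.4, which is the sense in which the parent states a generalisation over its children.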

We use the layout structure and its units to identify and group visual elements while the ‘content’ of these elements is provided by the base units of the base layer. We can therefore set up associations between layout units on the one hand and base layer units on the other. Consider, for example, the single layout unit in the centre of the page, L1.3: this is a paragraph of text. The orthographic sentences of this paragraph do not, however, contribute individually to the layout: they simply follow the text-flow determined for paragraphs. The sentences are nevertheless identifiable in the base layer. We therefore have an association between the single layout element, L1.3, and the set of base units 7–10 (cf. Table 3.2).

The layout structure captures the overall visual dependencies and realisations evident on a page but does not yet fully determine the page or page segment layout. Further information is required about the actual position of each unit in the document (on, or within, its page). For this, we introduce the area model. This serves to determine the position of each layout element in a way that abstracts beyond the specifics of individual documents. As we have seen with respect to grids in the previous chapter, pages often partition their space into sub-areas. Whereas this need not always be done when designing pages, if a page exhibits grid-like properties for the reader when interacting with the page, then we certainly need to capture this in our initial analysis layer. There is a further relationship to be seen here to the XY-trees that we introduced in the previous chapter; these are also determined by the relations recoverable on the page rather than by design intent. We leave the precise connection between these constructs open at this point: it is to be expected that as the methods available for inducing XY-trees become more sophisticated, they will become candidates for the automatic detection of area models: until that is the case, however, we need to allow the analyst to determine the structure to be used.

The simplest area model is a generalisation of the modular grid specification and partitions page space into sub-areas. For instance, a page may be designed in three rows arranged vertically: the area for the running head (row-1), the area for the page body (row-2), and the area for the page number (row-3). The page body space may itself consist of a number of columns arranged horizontally. These rows/columns need not be of equal size. For the present, we restrict ourselves to rectangular areas and sub-areas, and allow each area to be arbitrarily divided further into sub-grids, each with their own dimensions. We will also see below some cases where the area model does not consist solely of a rectangular grid: in general, we can imagine any shape serving as a guideline for positioning layout elements. This style of document layout is becoming increasingly common in modern documents mainly because it is as straightforward for computer-supported design systems to align text and other elements along curves as it is along straight lines.

Each page grid has a single root, which structures the document (page) into rectangular sub-areas in a table-like fashion. The sub-areas defined by an area model then provide abstract locations for specifying where particular layout elements or groups of layout elements are positioned on the page. The location of a layout element is sufficiently determined by saying in which cell of which area/sub-area it is to be placed. In addition, the layout elements located within an area are assumed to be aligned vertically with each other with respect to their left edge and horizontally at the top of their row. If this is not the case, then it is necessary to provide ‘offset’ values to record the discrepancy: these can either be explicit numeric values or abstract qualitative values like ‘right’ or ‘centre’. The relationship between layout structure and area model is suggested graphically in Figure 3.5.
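A rectangular area model of this kind can be sketched as a recursive grid in which a layout unit is located by a path of cells. The sketch below rests on several assumptions that are not part of the GeM specification: the names `Area`, `Placement` and `resolve` are hypothetical, sizes are expressed as fractions of the enclosing area, and the row and column proportions are invented for illustration. Offsets are recorded on the placement but not applied to the computed rectangle.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple, Union

Offset = Union[float, str]  # explicit numeric value or qualitative, e.g. "centre"
Box = Tuple[float, float, float, float]  # (x, y, width, height)


@dataclass
class Area:
    """A rectangular area divided into rows and columns.

    Sizes are fractions of the area's own height/width and need not be equal.
    """
    name: str
    row_heights: List[float] = field(default_factory=lambda: [1.0])
    col_widths: List[float] = field(default_factory=lambda: [1.0])
    # Optional sub-grids, keyed by the (row, col) cell they subdivide.
    sub_areas: Dict[Tuple[int, int], "Area"] = field(default_factory=dict)

    def cell_box(self, row: int, col: int, box: Box) -> Box:
        """Rectangle of cell (row, col) when this area occupies `box`."""
        x, y, w, h = box
        cell_x = x + w * sum(self.col_widths[:col])
        cell_y = y + h * sum(self.row_heights[:row])
        return (cell_x, cell_y, w * self.col_widths[col], h * self.row_heights[row])


@dataclass
class Placement:
    """Location of a layout unit: one (row, col) per nesting level."""
    unit: str
    path: List[Tuple[int, int]]
    h_offset: Offset = "left"  # default assumption: aligned at the left edge
    v_offset: Offset = "top"   # default assumption: aligned at top of the row


def resolve(root: Area, placement: Placement) -> Box:
    """Follow the cell path down through sub-areas to a page rectangle."""
    area, box = root, (0.0, 0.0, 1.0, 1.0)
    last = len(placement.path) - 1
    for depth, (row, col) in enumerate(placement.path):
        box = area.cell_box(row, col, box)
        if depth < last:
            area = area.sub_areas[(row, col)]
    return box


# Three rows (running head, body, page number); the body has two columns.
page_grid = Area("page", row_heights=[0.1, 0.8, 0.1])
page_grid.sub_areas[(1, 0)] = Area("body", col_widths=[0.5, 0.5])

placement = Placement("some-unit", path=[(1, 0), (0, 1)])
box = resolve(page_grid, placement)  # right-hand column of the page body
```

Keeping the grid separate from the placements means the same `Area` tree can be reused across pages, while each layout unit only names its cell and, where necessary, its offsets.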

We have found the separation of layout structure and area model useful in our analyses for three main reasons:

• First, it is quite possible for minor variations in the precise placement of layout elements to occur for genre-specific, canvas or production constraint reasons without altering the hierarchical organisation of the layout structure; separating the two levels of information makes it easier to recognise the commonality. This is related strongly, just as was the grid, to issues of the virtual canvas used in a document. The area model is an essential part of the definition of such virtual canvases.

• Second, the separation allows us to state generalisations over the physical placement that are inconvenient or difficult to express at the level of individual layout units. For example, one major function of the area model is to allow us to specify alignment relations over quite diverse configurations of layout units on the page: as we saw in the previous chapter, such alignment can be a significant factor in the assessment of good continuity and the forming of layout elements. These alignments are naturally captured by the area model specification.

• And finally, we also use the area model to explore Gestalt properties of the page layout as a whole. Properties such as centering and polarisation, as proposed by Kress and van Leeuwen (cf. Chapter 2, Section 2.2.2) may be pursued in terms of area model configurations. Polarisation may be indicated by grid modules that are arranged so as to provide more or less symmetrical ‘sites’ for positioning layout elements that serve rhetorically distinct purposes; centering by more complex grids with defined central and non-central (margin) areas for alignment. We will see cases of these below.

Figure 3.5 Graphical representation of the general method of correspondence used to relate layout structure and the area model in the GeM framework

Although this covers an extensive range of documents, there are still two problematic cases that do not fit into the layout structure presented so far:

• Insets: Layout elements can displace or intrude into the space of other layout elements.

• Separators: Certain graphical elements (lines, arrows) do not always fit into a grid structure, but instead serve to indicate column or row separation, thereby ‘showing’ the area model explicitly in the page design.

For insets, their content forces the content of the units that they are inset against to flow around them. In the GeM model both the inset and the relative ‘background’ are made to be ‘child’ elements of a single parent layout segment. To determine the precise location of the inset, the height and the width of the inset as well as its alignment inside the parent element’s space have to be explicitly given. A second, slightly different situation occurs when the inset layout element intrudes into the space of another layout element without displacing it. The latter element does not need to be a textual flow-object. One example of this is when text intrudes into the empty space surrounding illustrations, and the background colour of the illustration is the same as the background colour of the text. Here both layout elements are siblings in the layout structure and have their own locations in the sub-area associated with their parent. These locations are adjacent. Thirdly, one element may appear in front of another, obscuring its content; in this case the layout structure treats them as if they were separate and places them within a stack of area models arranged in the third dimension orthogonal to the page. We will, however, generally omit this complexity in our discussion here.

Separators are two-dimensional graphical elements, often of type “line”, that are used as delimiters between cells, columns or rows, rather than to form cells by themselves. These lines may have a variety of further attributes, such as decorations of various kinds, arrowheads (determining direction), colours, thickness and so on. We will see examples of these in the documents analysed in later chapters.

The area model of our Gannet page is relatively simple and does not employ very much of the flexibility that we have now introduced. Nevertheless, its thorough description belongs to a full analysis of the genre and so we will conclude our account of the layout structure of this Gannet page with its area model. As suggested with respect to grids in the previous chapter, an appropriate area model in this case is a straightforward manuscript grid. Moreover, because we are trying with an area model to describe what was actually done in the design of a page rather than expressing guidelines for design, we must also specify the precise height of the horizontal bands running across the page. There is also structure to represent within these bands: in particular at the top of the page, where we have at least three layout units.

Figure 3.6 Correspondence between the layout structure and the area model of the 1972 Gannet page

The lower portion of the page is also interesting in that despite a partial resemblance to a table in some respects, this is not supported visually for the lower part of the page as a whole. There is no regular alignment in the vertical dimension of ‘cells’ as is the case with tables. Instead we need here a typographical flow rule of the following form: for each element, take the first sub-element as a label (in bold) and separate it with whitespace from the second sub-element; if there are further sub-elements, separate these by whitespace also. Since the whitespace is dependent on the actual text that is present rather than any text-extrinsic spatial organisation, the positioning remains within the typography and is not to be captured in the area model.
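The flow rule just stated can be written down as a few lines of code. This is a minimal sketch under stated assumptions: the function name `flow_item` is hypothetical, bold is indicated with asterisks purely as a plain-text stand-in, the width of the whitespace gap is arbitrary, and the sample strings are placeholders rather than text from the Gannet page.

```python
def flow_item(sub_elements, gap="   "):
    """Render one list item according to the whitespace flow rule.

    The first sub-element is treated as the label and marked as bold;
    each sub-element is separated from the next by `gap`.
    """
    if not sub_elements:
        return ""
    label, *rest = sub_elements
    return gap.join([f"*{label}*", *rest])


# Placeholder content only.
line = flow_item(["Label", "first value", "second value"])
```

Because the horizontal position of each sub-element depends on the length of the text preceding it, successive items rendered this way do not line up in columns, which is why no cell structure is attributed to this part of the page.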

The correspondence relation between the layout structure of this page, as set out in Figure 3.4 above, and an appropriate area model is shown in Figure 3.6. This shows that the area model in the present case has rather little alignment to capture apart from the overall margins of the page; certainly within the itemised list at the bottom of the page there is no discernible alignment and so there is no reason to impose a stronger area model. The lower part of the area model accordingly consists of a single ‘cell’ for the layout chunk corresponding to the entire list occupying layout unit L1.4; the layout within this cell follows from a combination of paragraph text-flow properties and whitespace insertion as suggested in the rule above. This forms part of the generic information given for the seven subordinated layout elements and so is captured as a generalisation in the parent element L1.4 rather than being repeated for each paragraph. It is then a matter of further empirical investigation to establish how widespread, in terms of genres, production, designers, etc., this particular technique of framing layout elements is. Its resemblance to primitive spacing techniques when using a typewriter cannot, however, be overlooked, which already serves to ‘position’ the page generically.

This also serves as an interesting illustration of just how quickly things have changed. The page, published in 1972, is clearly straining at the production and consumption constraints holding at that time. The units of the page operate primarily within the narrow typographic mode, relying on text-flow for distributing its layout elements around the page, but with deviations (such as those seen in the spacing required within L1.4) that will in later documents be taken up entirely within the spatial mode. Then we will see explicit spatial information being taken up in the area model rather than relying on typographic ‘rules’ in the layout structure. The present page is, for this reason, very much in a transitional stage as our detailed analytical decomposition makes clear.

In document John Bateman. Multimodality and Genre (Page 141-149)