

5.2 Web Design from an Evolutionary Perspective

5.2.2 Information Design: An Authoring Activity

This section distinguishes between the era before the Web, the HTML era (from the advent of the Web until 1997), and the current XML era (W3C 1998). The beginning of the HTML era focused exclusively on authoring. Only hypertext documents were supported, as the name of the so-called Web programming language, HTML, suggests: Hypertext Markup Language, a language for instructions, or tags, strewn throughout text documents. Over time, HTML came to support other media, such as images and time-sensitive media like video and audio. This is reflected in the term hypermedia, which is sometimes used to distinguish such documents from hypertext and sometimes synonymously with hypertext. We will use hypertext as a generic term in the following discussion.

The hypertext concept is older than HTML; its basic idea was formulated by Vannevar Bush as early as the end of World War II, in view of the emerging wealth of technical information carriers. Ted Nelson coined the term itself in the 1960s. Hypertext documents are composed of the following:

• Nodes, links, and anchors, as introduced in Chapter 1; and

• Meshes and other aggregates. Meshes designate coherent sets of nodes and links, and were called hypertext documents in the era before the Web. Examples of aggregates include views (e.g., for experts and laymen among readers), paths (pre-determined reading sequences), and meta-nodes (meshes that can be embedded in enveloping meshes, like a node). The simple Web of the HTML era didn’t support them, but their significance has grown strongly with the advent of advanced modularization concepts, e.g., navigational structures like “star-shaped navigation” (see section 5.4.6), to mention one interaction design example.

Though HTML initially considered only the authoring aspect, it represented a backward step compared with popular hypertext systems even in this respect, and even with regard to the fundamental vision of non-linear explorative documents (see Chapter 1). The Web’s popularity was possible only thanks to its simplicity and free worldwide availability. Its major weaknesses are mentioned briefly below insofar as they are relevant from the design perspective:

• HTML can be understood as a (classic) document description language with hypertext tags grafted on. This tempts people to disregard the atomicity principle of nodes; many “HTML” documents (actually nodes) are several pages long, and the basic hypertext idea of non-sequential reading is present only in rudimentary form or in exceptional cases.

• HTML mixes orthogonal aspects like hypertext structure (via tags for links and anchors), document structure (headers, lists, etc.), and layout (background color, italics, etc.).


• Though the Web recognizes the distributed software architecture with browsers and servers introduced in Chapter 4, it lacks the “horizontal” software architecture of abstract machines. Examples include the classic Dexter architecture, which separates the content and mesh management from the presentation, or the layering suggested in Chapter 3 and in this chapter.

• HTML is text-centric. Other media often occur only as link destinations (dead-end roads); many media types are not supported as link sources at all, or have been supported only recently. It was not until the advent of SMIL (see Chapter 6) that the description of temporal media could be covered on the Web.

• The Web’s evolution increased the significance of the first drawback above. The support for structuring and formatting within nodes improved gradually, while important hypertext aspects, e.g., user-definable node and link types, reverse links, separate storage of links and nodes, non-trivial destination anchors, etc., are still missing.
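The hypertext aspects named in the last bullet can be made concrete with a small sketch. It is only an illustration under stated assumptions: the names `Anchor`, `Link`, and `LinkBase` are hypothetical and do not come from HTML or from any particular hypertext system. The sketch shows links stored apart from node content, carrying a user-definable type, pointing at anchors inside a node, and being queryable in the reverse direction.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Anchor:
    """A spot inside a node; `fragment` may name a span, not just the whole node."""
    node_id: str
    fragment: str = ""


@dataclass(frozen=True)
class Link:
    """A typed link kept outside the nodes it connects (hypothetical model)."""
    source: Anchor
    target: Anchor
    link_type: str  # user-definable, e.g. "explains" or "cites"


class LinkBase:
    """Stores links separately from node content and indexes both directions."""

    def __init__(self):
        self._outgoing = {}  # node_id -> links that start in that node
        self._incoming = {}  # node_id -> links that end in that node

    def add(self, link):
        self._outgoing.setdefault(link.source.node_id, []).append(link)
        self._incoming.setdefault(link.target.node_id, []).append(link)

    def links_from(self, node_id, link_type=None):
        """Forward links, optionally filtered by link type."""
        return [link for link in self._outgoing.get(node_id, [])
                if link_type is None or link.link_type == link_type]

    def links_to(self, node_id):
        """Reverse links: everything that points at this node."""
        return list(self._incoming.get(node_id, []))


links = LinkBase()
links.add(Link(Anchor("intro", "para-2"), Anchor("dexter", "layers"), "explains"))
links.add(Link(Anchor("history"), Anchor("dexter"), "cites"))
sources = [link.source.node_id for link in links.links_to("dexter")]
```

Because the link base is a separate structure, the question "which nodes point here?" needs no scan of node content, which is what embedding links in the text of each node would require.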

To better understand the authoring aspect of XML, in contrast to the above, we first have a look at HTML’s origin. It dates back to SGML, the Standard Generalized Markup Language for the world of print shops and publishing companies. “Generalized” means that SGML defines valid tags and rules to be used for an entire class of documents (i.e., a specific field of application and the documents it normally uses). The results are document type definitions (DTDs). An SGML parser can read DTDs and check documents to see whether or not they conform to a DTD. However, special software has to be written to interpret and execute the instructions specified by tags. Publishing companies use DTDs to distinguish between different book, magazine, and brochure formats, forms, and many other things. In the beginning, HTML was nothing but an SGML-DTD for the “screen” format, extended by tags for links and anchors as “grafted on” hypertext functionalities. Later HTML versions corresponded to new DTDs. Browsers of the HTML era are not SGML parsers; instead, they have support for a few DTDs (the supported HTML versions) hardcoded into them, including the way they interpret tags and translate commands. The “rendering” is also hardcoded, and only the introduction of CSS (cascading style sheets) enabled reusable layouts and a rudimentary way of separating the layout from the structure.
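The division of labor described above, where a generic parser checks conformance and separate software gives tags their meaning, can be sketched as follows. This is a deliberately simplified stand-in for a DTD, not real DTD validation: the hypothetical `MEMO_RULES` table only records which child elements each element may contain, whereas actual DTDs also constrain order, cardinality, and attributes. The "memo" document class and all names are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical document class "memo": element name -> allowed child elements.
MEMO_RULES = {
    "memo": {"to", "subject", "body"},
    "to": set(),
    "subject": set(),
    "body": {"para"},
    "para": set(),
}


def check(element, rules):
    """Return a list of violations; an empty list means the subtree conforms."""
    if element.tag not in rules:
        return [f"unknown element <{element.tag}>"]
    problems = []
    for child in element:
        if child.tag not in rules[element.tag]:
            problems.append(f"<{child.tag}> not allowed inside <{element.tag}>")
        else:
            problems.extend(check(child, rules))
    return problems


def conforms(xml_text, rules, root_tag):
    """Parse the text and check it against the containment rules."""
    root = ET.fromstring(xml_text)
    if root.tag != root_tag:
        return [f"root must be <{root_tag}>, found <{root.tag}>"]
    return check(root, rules)


good = "<memo><to>Ann</to><subject>Hi</subject><body><para>Text</para></body></memo>"
bad = "<memo><to>Ann</to><body><link/></body></memo>"
good_result = conforms(good, MEMO_RULES, "memo")
bad_result = conforms(bad, MEMO_RULES, "memo")
```

Note that the checker says nothing about how a memo should look on screen or paper; that interpretation would live in separate software, as the text describes.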

The XML era dawned when standard PCs were ready to “digest” SGML parsers. It was a natural step to standardize a simplified version of SGML to make the wealth of possibilities of a generic markup language usable. Together with the emergence of XML, an enormous number of “simple programming languages” have been defined as XML-DTDs (or, more recently, as XML schemas), including a language to describe remote procedure calls (SOAP), a language to describe financial transactions (XML-EDI), a counterpart to HTML (XHTML), and many more (see Chapter 6). Since XML lets you formally describe the syntax but not the semantics, modern browsers can parse arbitrary XML schemas and documents, but they (essentially) can execute only XHTML. Almost all the weaknesses of HTML mentioned above have meanwhile been addressed in various XML standards. Whether and how these partly competing standards will proliferate remains to be seen.
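The gap between parsing syntax and executing semantics can be illustrated with a minimal sketch. It assumes a hypothetical set `KNOWN`, standing in for the vocabulary a program has been written to interpret, and it ignores XML namespaces, which real XHTML documents use. Any well-formed document parses, but only tags in the known vocabulary could be acted on.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for the one vocabulary this program can "execute".
KNOWN = {"html", "body", "p", "em"}


def parse_and_classify(xml_text):
    """Parse any well-formed XML, then split its tags into known and unknown."""
    root = ET.fromstring(xml_text)  # syntax check only; no meaning attached
    tags = {element.tag for element in root.iter()}
    return sorted(tags & KNOWN), sorted(tags - KNOWN)


page = "<html><body><p>Hi <em>there</em></p></body></html>"
invoice = "<invoice><amount currency='EUR'>12.50</amount></invoice>"

page_result = parse_and_classify(page)
invoice_result = parse_and_classify(invoice)
```

Both documents parse without error. For the invoice, however, every tag lands in the unknown list: the parser accepts the syntax, while the meaning of `invoice` and `amount` would have to be supplied by additional software.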

We can identify a few basic rules for the design of document-based Web applications, i.e., for the authoring aspect, from the above discussion:

• Meshes should form the center of information design.

• Aspects such as layout and content, node and mesh, etc., should be separated conceptually, even if a technology doesn’t support such a separation.

• The selected technology should support advanced concepts, e.g., central link management, at least in the design, ideally also in the content management system (hidden from the end user), and in intranets even in the implementation technology itself. XML-based solutions should be given preference over proprietary approaches.
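The rules above can be followed in the design even when the delivery format is plain HTML. The following is a minimal sketch under stated assumptions, not a recommended implementation: `NODES`, `LINKS`, `LAYOUT`, `render`, and `dangling_links` are hypothetical names. Content, mesh (centrally managed links), and layout are maintained separately and combined only when a page is generated.

```python
from html import escape

# Content: nodes hold text only, with no embedded links and no layout.
NODES = {
    "intro": {"title": "Introduction", "text": "Hypertext predates the Web."},
    "history": {"title": "History", "text": "Bush and Nelson shaped the idea."},
}

# Mesh: links are managed centrally as (source, target, label) triples.
LINKS = [("intro", "history", "Read the history")]

# Layout: presentation choices kept apart from content and mesh.
LAYOUT = {"title_tag": "h1", "text_tag": "p"}


def render(node_id, nodes, links, layout):
    """Combine the three separately maintained aspects into one HTML fragment."""
    node = nodes[node_id]
    title_tag, text_tag = layout["title_tag"], layout["text_tag"]
    parts = [
        f"<{title_tag}>{escape(node['title'])}</{title_tag}>",
        f"<{text_tag}>{escape(node['text'])}</{text_tag}>",
    ]
    for source, target, label in links:
        if source == node_id:
            parts.append(f'<a href="{escape(target)}.html">{escape(label)}</a>')
    return "\n".join(parts)


def dangling_links(nodes, links):
    """Central link management makes broken endpoints easy to detect."""
    return [(s, t) for s, t, _ in links if s not in nodes or t not in nodes]


intro_html = render("intro", NODES, LINKS, LAYOUT)
broken = dangling_links(NODES, LINKS + [("intro", "missing", "Nowhere")])
```

Changing `LAYOUT` restyles every page, and changing `LINKS` rewires the mesh, without touching the node texts. The delivered HTML still mixes the aspects, but the design does not.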