2.4 Metadata, Information, and Knowledge
2.4.6 The Semantic Web
As seen in previous sections, the WWW has developed both increased functionality as a hypertext system and the ability to better structure information using metadata. This work also forms the foundation for the Semantic Web [21], in which the resources and information available on the WWW will be enhanced with reasoning and understanding, not just for human consumption, but for automated processing and inferencing by software agents. The strategy for the Semantic Web is broken down into several layers (figure 2.8), where the technologies of each layer build upon those of the layers below: an instantiation of the data-information-knowledge hierarchy.
Figure 2.8: The Semantic Web (from [21])
The Semantic Web is not a replacement for the current WWW, but an extension, and so its basis remains the vast amount of distributed data available using common and open Internet protocols, and addressable using URIs.
Information, created by structuring this data, is then enriched as the Semantic Web layers are ascended.
While some data on the web is completely unstructured, other data has some basic structure due to its encoding in XML, and further structure is held in the links between data (the hyperstructure). But this structure within and between XML documents is arbitrary; it carries no explicit meaning that could be automatically understood and processed by a piece of software.
RDF builds on this by providing an assertion layer, in which metadata is used to add meaning to a resource on the WWW. RDFS can be used to define how the metadata can be asserted.
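The assertion layer can be pictured as a set of subject-predicate-object statements about a resource. The following is a minimal sketch in plain Python rather than in any RDF serialisation or library; the document URI and literal values are hypothetical, and the choice of the Dublin Core element namespace for the predicates is an assumption made purely for illustration.

```python
from collections import namedtuple

# One assertion: a predicate (the metadata term) relates a subject
# (the resource being described) to an object (the value).
Triple = namedtuple("Triple", ["subject", "predicate", "object"])

# Assumption: Dublin Core element namespace used for the predicate terms.
DC = "http://purl.org/dc/elements/1.1/"
# Hypothetical resource on the WWW, identified by its URI.
DOC = "http://example.org/report.html"

assertions = [
    Triple(DOC, DC + "creator", "A. Author"),
    Triple(DOC, DC + "title", "An Example Report"),
]

def describe(resource, triples):
    """Collect every asserted property of one resource into a dict."""
    return {t.predicate: t.object for t in triples if t.subject == resource}

metadata = describe(DOC, assertions)
```

Because both the resource and the metadata terms are named by URIs, software can match assertions made in different places without relying on document layout.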
The next layer, provided through the Web Ontology Language (OWL) [18] and based upon earlier work such as DAML+OIL, is needed to describe the relationships between these asserted identifiers and to document these terms and concepts. The formal definition of relations amongst terms in a vocabulary is called an ontology, and in the Semantic Web it will typically consist of a taxonomy and a set of inference rules. For example, an ontology might define a faculty as an organisational sub-division of a university, and a department as a sub-division of a faculty. An inference rule might state that a student who is a member of a department is therefore also a member of the faculty of which that department is a sub-division. Ontologies also allow expressions of equivalence between terms in different ontologies; this is particularly important in a distributed environment such as the WWW, where it is very unlikely that the same ontologies will be used, or even be appropriate, throughout the entire system.
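The university example can be sketched as a small forward-chaining loop over triples. This is an illustration in plain Python, not OWL or any reasoner's API; the `ex:` terms and the individuals are hypothetical. Note one assumption: the rule is written generally (membership of any unit implies membership of its parent unit), so it also derives university membership, one step beyond the example in the text.

```python
# Hypothetical vocabulary terms; none of these come from a published ontology.
MEMBER_OF = "ex:memberOf"
SUB_DIVISION_OF = "ex:subDivisionOf"

facts = {
    ("ex:ComputerScience", SUB_DIVISION_OF, "ex:Engineering"),  # department -> faculty
    ("ex:Engineering", SUB_DIVISION_OF, "ex:University"),       # faculty -> university
    ("ex:alice", MEMBER_OF, "ex:ComputerScience"),              # student -> department
}

def infer_membership(triples):
    """Naive forward chaining: apply the rule until nothing new is derived.

    Rule: (x memberOf u) and (u subDivisionOf p)  =>  (x memberOf p)
    """
    known = set(triples)
    while True:
        derived = {
            (who, MEMBER_OF, parent)
            for (who, p1, unit) in known if p1 == MEMBER_OF
            for (child, p2, parent) in known
            if p2 == SUB_DIVISION_OF and child == unit
        }
        if derived <= known:  # fixpoint reached
            return known
        known |= derived

closure = infer_membership(facts)
```

The derived statement that `ex:alice` is a member of `ex:Engineering` was never asserted directly; it follows only from the taxonomy combined with the rule, which is the kind of inference the ontology layer is intended to support.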
The Semantic Web is very much an active research area, and the upper layers shown in figure 2.8 are still to be developed. It will be realised when we can create and use many programs that collect Web content from diverse sources, process (and in that sense understand) the information, and exchange the results with other programs.
Metadata in Support of Streaming Media
Chapter 2 documented the development of hypermedia systems and how they have become distributed over networks, more specifically the Internet.
Multimedia systems which use temporal multimedia (such as audio and video) have also developed in distributed environments, where they demand additional support from the underlying network and transport protocols; hypermedia systems have integrated such multimedia content through the development of hypermedia models and presentation languages with explicit support for time. Standardised metadata and vocabularies provide a framework in which to generate and manipulate information to support structured processing, and these are being applied to large-scale distributed hypertext through the Semantic Web. In the first part of this chapter a lifecycle model of streaming is presented as a new way to evaluate the systems studied in chapter 2 (section 3.1). The model characterises the process media goes through as it traverses a distributed multimedia system; functionality in systems is categorised into three layers, and the provision for these layers across the lifecycle is examined.
It is asserted that, in all these systems, the structured interpretation of data is paramount to its use as hypermedia and its interpretation as knowledge (section 3.3.1); that there is a commonality in the use of various types of metadata to support and enhance distributed multimedia applications; and that there is a generalisation of structure in metadata to which the principles of structured interpretation, such as those from Open Hypermedia, are still applicable.
After analysis of metadata in multimedia systems in the context of the lifecycle model (section 3.2), the second part of the chapter describes the deficiencies in supporting metadata through the distribution, or streaming, stage of the lifecycle. To address these issues, the notion of continuous metadata is introduced (section 3.3), where explicit support for time-based processing of associated metadata is extended back across the network from presentation towards production.
The aspects of the metadata layer that continuous metadata is required to embrace are investigated, and finally the use of continuous metadata is illustrated in several motivational scenarios (section 3.4).
3.1 Components of Distributed Multimedia Systems
In section 2.2.1 it was seen that, from the users’ perspective, distributed multimedia can be classified into three types of application: presentational, conversational, and combined [80].
From a systems perspective, the basic elements needed to support these applications can then be categorised as [5]:
1. Explicit support for continuous media.
2. Quality of Service.
3. Synchronisation.
4. Group communication.
Fulfilling these conditions poses various obstacles at all levels in a multimedia system, from basic network support for quality of service and multicast, through protocol support for encoding and transporting high bandwidth, time sensitive media, to synchronised presentation and navigation. Chapter 2 described research and solutions associated with supporting these requirements.