
4.2 Knowledge representation strategies

4.2.6 Topic maps

Topic maps can be defined as a standard for describing knowledge structures and associating them with information resources [27]. The purpose of a topic map is to convey knowledge about resources through a superimposed layer (or map) of those resources. This is done by capturing (i) the subjects which the resources define and (ii) the relationships between these subjects [28].

Topic maps incorporate techniques from three different fields, namely traditional indexing, library science and knowledge representation, and add advanced techniques for linking and addressing information [27]. A topic map is the equivalent of the traditional index found at the back of a conventional book, but extended and modified for the dynamic needs of electronic communication.

To better understand the relationship between topic maps and traditional indices, a discussion of traditional indices is warranted. There are three types of indices, each emphasizing a different aspect of the index idea: traditional indices, glossaries and thesauri.

Indices A traditional index in a book can be seen as a map of the knowledge contained in that book. The index typically lists the topics covered in the book and references to the topics (typically in the form of page numbers) [27]. An index is made up of the following components:

• An (alphabetical) list of topic names (of which there can be more than one).

• Associations between topics.

• References to occurrences of the topics (pointed to via a “locator”, such as a page or section number).

Topics, associations and occurrences are also the key constructs in the topic map model, and their role in the topic map model will be discussed in more detail later in this section.
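
The three components of an index listed above can be sketched as a small data structure. The Python fragment below is only an illustration: the class name `IndexEntry`, its fields, the `lookup` helper and the sample entries (including the page numbers) are assumptions made here and are not taken from [27].

```python
from dataclasses import dataclass, field


@dataclass
class IndexEntry:
    """One entry in a back-of-book index (hypothetical structure)."""
    names: list[str]                                   # a topic may be listed under several names
    locators: list[int] = field(default_factory=list)  # page numbers of its occurrences
    see_also: list[str] = field(default_factory=list)  # associations to other topics


# Invented sample data, for illustration only.
index = {
    "topic map": IndexEntry(["topic map", "TM"], [47, 51], ["index"]),
    "index": IndexEntry(["index"], [48], ["glossary", "thesaurus"]),
}


def lookup(name: str) -> list[int]:
    """Return the locators of the entry that carries the given name."""
    for entry in index.values():
        if name in entry.names:
            return entry.locators
    return []


print(lookup("TM"))
```

Looking a topic up under any of its names yields the same locators, which is the behaviour a reader expects from the index of a printed book.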

Glossaries and thesauri A glossary is a list of terms and definitions. It can be seen as an index with topic names and their definitions. A distinguishing feature of a glossary is that the definition of a topic is given directly in-line with the topic, therefore no locators are necessary to point to the desired definition. The glossary tells us what a specific term means and also other information relevant to the term.

A thesaurus can be seen as a network of interrelated terms in which the associations between the terms are of prime importance. Given a specific term, a thesaurus will give a list of synonyms for that term, terms in broader or narrower categories, as well as any other related items. Associations in a thesaurus can be seen as typed. This enables us to say not only that two terms are related, but also why and how they are related [27]. Typing also allows related terms to be grouped, which simplifies navigation.
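
How typed associations support grouping can be illustrated with a short Python sketch. The relation names (`broader`, `narrower`, `synonym`, `related`), the helper `group_by_type` and the sample terms are assumptions chosen for this example only.

```python
from collections import defaultdict

# (term, association type, associated term); invented sample data.
RELATIONS = [
    ("cat", "broader", "mammal"),
    ("cat", "narrower", "tabby"),
    ("cat", "synonym", "feline"),
    ("cat", "related", "kitten"),
]


def group_by_type(term):
    """Group the terms associated with `term` by the type of the association."""
    grouped = defaultdict(list)
    for source, relation, target in RELATIONS:
        if source == term:
            grouped[relation].append(target)
    return dict(grouped)


print(group_by_type("cat"))
```

Because every association carries a type, a navigation aid can present broader terms, narrower terms and synonyms as separate groups instead of one undifferentiated list.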

The structures mentioned above are all common ways of mapping knowledge in printed media. For them to be useful in the realm of electronic communication, a method is needed for representing this knowledge in such a way that it can be exchanged between people and/or machines. The structure of interest in the context of topic maps is the conceptual graph.

Conceptual graphs The conceptual graph is a type of semantic network (see section 4.2.2) used for the representation of knowledge. Nodes in the graph can be seen as concepts, which are defined by concept-types and conceptual relations between types. Each relation is linked to a requisite number of concepts and each concept is linked to one or more relations with directed arcs [29]. Concept-types, concepts and conceptual relations comprise the primitives of the language, and are clarified briefly below.

Concept-types are used for the representation of classes of concepts that may exist in the conceptual graph system. Concept-types impose a type-hierarchy on the system, defining subtypes and supertypes of concepts. It is interesting to note that the type-hierarchy for a conceptual graph need not be tree-like in nature; a more graph-like structure is acceptable. It is also important to note that, while the type-hierarchy is assumed to be predefined for the system, facilities for the definition of new types also exist.
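
A type-hierarchy that is graph-like rather than tree-like can be sketched as a mapping from each concept-type to its direct supertypes. The sample hierarchy and the function `is_subtype` below are assumptions made for illustration and do not come from [29].

```python
# concept-type -> direct supertypes (invented sample hierarchy).
# "cat" has two supertypes, so the hierarchy is not a tree.
SUPERTYPES = {
    "cat": ["pet", "mammal"],
    "pet": ["animal"],
    "mammal": ["animal"],
    "animal": [],
}


def is_subtype(candidate, ancestor):
    """True if `candidate` equals `ancestor` or inherits from it along any path."""
    if candidate == ancestor:
        return True
    return any(is_subtype(parent, ancestor)
               for parent in SUPERTYPES.get(candidate, []))


# Defining a new type at run time only requires a new entry.
SUPERTYPES["tabby"] = ["cat"]

print(is_subtype("tabby", "animal"))
```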

A concept is an instance of a concept-type, and denotes that the concept exists in the system. Concepts can also be labelled with so-called referent fields that refer to specific individual instances of a concept. The referent field can be as simple as a unique name or as complicated as group information [29].

Conceptual relations show the roles concepts play in relation to each other. Typical examples include:

• ATTR indicates a concept is an attribute of another concept.

• AGNT indicates a relation connecting the instigator (or agent) of an action with the action itself.

• OBJ indicates a concept as an object of another.

• MANR indicates that a concept defines the manner in which another is done.

• LOC indicates a location where an event takes place.

A conceptual relation normally defines the link between two concepts, but it is also possible for a relation to be unary or ternary in nature [29].

With the primitives defined, an example of a conceptual graph can be supplied. Suppose we want to express the following with a conceptual graph:

A black cat called “Sam” sits on a green mat.

Figure 4.5 below represents the above statement as a conceptual graph. Rectangles denote concepts (‘cat’, ‘sit’, ‘green’, ‘mat’, ‘black’) and ovals denote relations (‘agnt’, ‘loc’, ‘attr’).

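[Figure: concept boxes ‘cat:sam’, ‘sit’, ‘mat’, ‘black’ and ‘green’, joined by relation ovals: ‘agnt’ between ‘cat:sam’ and ‘sit’, ‘loc’ between ‘sit’ and ‘mat’, ‘attr’ between ‘cat:sam’ and ‘black’, and ‘attr’ between ‘mat’ and ‘green’.]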

Figure 4.5: An example of a simple conceptual graph adapted from [29].
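
The example statement can also be written down as data. The sketch below is one possible encoding in Python, not the notation of [29]: the class `Concept`, the tuple layout of `relations` and in particular the order of the two concepts in each relation are assumptions of this illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Concept:
    type: str                       # concept-type, e.g. "cat"
    referent: Optional[str] = None  # referent field, e.g. "sam"


cat = Concept("cat", "sam")
sit = Concept("sit")
mat = Concept("mat")
black = Concept("black")
green = Concept("green")

# Each relation links two concepts. The (source, target) order used here
# is an assumption of this sketch and is not read off the figure.
relations = [
    ("agnt", sit, cat),    # the cat is the agent of the sitting
    ("loc", sit, mat),     # the sitting takes place on the mat
    ("attr", cat, black),  # black is an attribute of the cat
    ("attr", mat, green),  # green is an attribute of the mat
]


def related(concept, relation):
    """Return the concepts linked from `concept` by the given relation."""
    return [target for name, source, target in relations
            if name == relation and source == concept]


print(related(cat, "attr"))
print(related(sit, "loc"))
```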

The basic model of conceptual graphs discussed briefly above is very similar to that of the topics and associations found in indices. Topic maps are exactly this combination of indices and conceptual graphs: they join the topic/occurrence model of the index with the concept/relation model suggested by conceptual graphs [27].

The above discussion introduced the principles behind conceptual graphs. These principles can be directly applied to the elements of the topic map model, which are discussed below.

Elements of the topic map model

The key concepts in the topic map model are topics, associations and occurrences. A brief discussion of these fundamental concepts follows below.

Topics Topics are the most fundamental structure in the topic map model. A topic refers to the object or node in the topic map that stands in for or represents some real world subject. A subject can be anything conceivable by a human being. In his paper, Pepper notes that there should be a one-to-one relationship between topics and subjects, with each subject represented by just one topic [27]. The topic map is said to be consistent if this is true.

A topic can therefore be seen as a resource in the system and acts as a proxy for some subject [28].

The relationship between a topic and its subject is defined as reification. Reification of a subject allows topic characteristics to be assigned to the topic that reifies it. A topic characteristic can be one of the following: a name, an occurrence or a role played by the topic in an association.

Each of these topic characteristics also has a scope, which specifies the extent to which a characteristic assignment is valid. The scope of a characteristic may be specified either explicitly as a set of topics, or implicitly [28].

Topic types Each individual topic is an instance of zero or more topic types. Topic types are also defined as topics in their own right and may or may not be explicitly indicated.

Topic names A topic may have zero or more names. Each topic name is considered to be valid in a certain scope. Each name may exist in multiple forms: it has exactly one base form, known as the base name, and may have one or more variants for use in specific processing contexts.

The base name is the base form of a topic name and is always a string. The base name provides the default label for a topic that an application can use for processing activities, unless a variant exists that is deemed more appropriate in the processing context. Variant names are alternative forms of the base name, each optimized for a specific purpose. An application will choose the most appropriate variant name by examining the parameters of the variants. Parameters are a set of topics that express a variant’s processing context [27, 28].
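
The choice between a base name and its variants can be sketched as follows. This is a simplified illustration and not the XTM processing model: the classes `Variant` and `TopicName`, the method `label_for` and the rule "use the first variant whose parameters are all present in the processing context" are assumptions made for this example. Parameters are represented by plain strings here, whereas in the model they are topics.

```python
from dataclasses import dataclass, field


@dataclass
class Variant:
    value: str
    parameters: frozenset  # stand-ins for the topics describing the processing context


@dataclass
class TopicName:
    base_name: str                               # the base name is always a string
    variants: list = field(default_factory=list)

    def label_for(self, context):
        """Return the first variant whose parameters all occur in `context`;
        fall back to the base name when no variant applies."""
        for variant in self.variants:
            if variant.parameters <= context:
                return variant.value
        return self.base_name


# Invented sample data.
name = TopicName(
    "Topic maps",
    [Variant("topic maps", frozenset({"sort"})),
     Variant("TM", frozenset({"display", "compact"}))],
)

print(name.label_for({"display", "compact"}))
print(name.label_for({"print"}))
```

The second call finds no suitable variant and therefore falls back to the base name, which mirrors the role of the base name as the default label.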

Occurrences A topic may be linked to one or more information resources that are considered relevant to the topic in question. Such resources are called occurrences of the topic [27].

An occurrence could be any piece of information (see previous chapter) that is in some way relevant to the subject in question. Occurrences are usually external to the topic map document itself (although internal occurrences are also allowed) and are pointed to using some addressing mechanism, such as the uniform resource identifiers (URIs) used in the XML topic maps (XTM) specification [28].

An occurrence may also be an instance of a single class of occurrence. This class is known as the occurrence type of the occurrence.

Associations Topics and occurrences allow for the organization of information resources according to topic/subject and the creation of simple indexes. To be of any constructive use, the model must also be able to describe relationships between topics. This is exactly what topic associations do. An association is a relationship between one or more topics, each of which plays a role as a member of the association.

Association types Associations between topics can be grouped according to their type. Association types are also defined in terms of topics. The XTM specification also specifies special association classes, which define “class-instance” and “superclass-subclass” relationships between topics [28].

Association roles Each topic or member in an association has a specific role in that association. This is called the association role of the member. The role expresses a topic’s involvement as a member of an association. Association roles may also be typed, and the type of an association role is also a topic. Associations are also inherently multi-directional (i.e. if A is related to B, then B must be related to A).
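
The interplay of topics, occurrences, associations and roles can be summarised in a minimal Python sketch. It is not an implementation of the XTM specification: the classes `Topic` and `Association`, the function `related_topics`, the role and type names and the `example.org` address are all assumptions made for illustration, and types and roles are represented by plain strings although the model defines them as topics.

```python
from dataclasses import dataclass, field


@dataclass
class Topic:
    name: str
    occurrences: list = field(default_factory=list)  # URIs of relevant resources


@dataclass
class Association:
    type: str      # association type (a topic in the full model)
    members: dict  # role name -> Topic playing that role (one topic per role here)


# Invented sample data.
puccini = Topic("Puccini")
tosca = Topic("Tosca", occurrences=["http://example.org/tosca.html"])

associations = [
    Association("composed-by", {"composer": puccini, "work": tosca}),
]


def related_topics(topic):
    """List (association type, own role, other role, other topic name) tuples.
    The association can be traversed from any member, reflecting the
    multi-directional nature of associations."""
    found = []
    for assoc in associations:
        for role, member in assoc.members.items():
            if member is topic:
                for other_role, other in assoc.members.items():
                    if other is not topic:
                        found.append((assoc.type, role, other_role, other.name))
    return found


print(related_topics(puccini))
print(related_topics(tosca))
```

A single stored association answers the question from both sides, so no separate inverse relationship has to be recorded.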

Topic map semantics The basic elements of the topic map model discussed above are not enough to make it a valid knowledge representation scheme. Clearly defined semantics are also required, as noted earlier in this chapter.

Currently there exist two standard syntaxes for topic maps, named HyTime topic maps (HyTM) and XTM [28]. Some of the XTM elements were mentioned in the discussion above, but there are also other syntactical conventions and concepts that were not mentioned (such as subject identity, topic and subject indicators, and published subjects, to name a few). These conventions are discussed in detail in the XTM specification [28], which the interested reader is encouraged to explore further.

In document A multi-agent collaborative personalized web mining system model. (Page 47-53)