4.2 Knowledge representation strategies
4.2.5 Ontologies
It has been the quest of philosophers for hundreds of years to define the nature of existence.
These philosophers spoke of ontology as the science of being, what it means to exist.
Today, ontologies are seen as a form of knowledge representation with uses in three main areas [20]:
• Communication between (i) implemented computational systems, (ii) humans and (iii) humans and implemented computational systems.
• Computational inference for (i) internally representing and manipulating plans and planning information and (ii) analyzing the internal structures, algorithms, inputs and outputs of implemented systems in theoretical and conceptual terms.
• Reuse and organization of knowledge. This is the structuring or organizing of libraries containing planning and domain information (knowledge base).
The uses listed above serve as an affirmation of ontologies as a knowledge representation technique, but do not supply an adequate definition. A popular definition of an ontology is that it is an “explicit specification of a conceptualization” [21]. Conceptualization, in this context, refers to the expression of knowledge about a certain application domain in terms of entities. These entities (things in the world, the relationships they hold and the constraints between them) provide an abstract model of how people and/or machines think about the domain. The model is usually restricted to some subject area such as manufacturing, biology or the web. Specification is the representation of this conceptualization in some concrete form. One of the steps in specification necessarily involves encoding the conceptualization in a knowledge representation language. We will discuss the commonly used representations later in this section.
Components of an ontology
The main components of an ontology are concepts, relations, instances and axioms [22]. Concepts represent a set or class of entities within a domain. Concepts can be classified into two groups:
• Primitive concepts possess only the necessary properties for admission to the class. So if we have the primitive concept α with property x, all concepts belonging to the same class as α will have property x. There may, however, be other concepts with property x that do not belong to that class.
• Defined concepts possess properties that are both necessary and sufficient for the concept to acquire admission to the class. If the defined concept β has property y then every concept that has property y belongs to the same class β belongs to.
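The difference between necessary conditions and necessary-and-sufficient conditions can be illustrated with a minimal Python sketch. The concept names (Bird, Triangle), the property names and the two helper functions are hypothetical and were chosen only for illustration; they are not taken from [22].

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Concept:
    name: str
    conditions: frozenset  # properties every member must have
    defined: bool          # True: the conditions are also sufficient


def may_belong(properties, concept):
    """Necessary check: an entity lacking a condition cannot be a member."""
    return concept.conditions <= set(properties)


def must_belong(properties, concept):
    """Sufficient check: membership can only be inferred for defined concepts."""
    return concept.defined and concept.conditions <= set(properties)


# Hypothetical primitive concept: feathers are necessary, but not sufficient.
bird = Concept("Bird", frozenset({"has_feathers"}), defined=False)
# Hypothetical defined concept: the two properties fully characterize it.
triangle = Concept("Triangle", frozenset({"polygon", "three_sides"}), defined=True)

feather_duster = {"has_feathers", "has_handle"}
print(may_belong(feather_duster, bird))    # membership is not ruled out
print(must_belong(feather_duster, bird))   # but it cannot be inferred
print(must_belong({"polygon", "three_sides", "red"}, triangle))
```

For the primitive concept, having the property only leaves membership open. For the defined concept, the same test is enough to classify the entity.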
Relations describe the interactions between concepts or their properties. They can also be classified into two broad categories:
• Taxonomies that organize concepts into a tree structure, with child nodes called sub-concepts and parent nodes called super-concepts. These relationships structure the tree and further define the relationships between the concepts. Specialization relationships indicate that one concept is a kind of another, more general concept. Partitive relationships describe a typical whole-part relationship, where a concept (the whole) is made up of other concepts (the parts). It is interesting to note that the definitions of specialization and partitive relationships correspond exactly to the definitions of generalization and aggregation relationships in the unified modelling language (UML) [22].
• Associative relationships relate concepts across the tree structures mentioned above. Associative relationships describe the properties concepts have, for instance their names (Nominative), their location with respect to another concept (Locative), the processes a concept is involved in or has internally, etc. Many of these types of relationships exist and are used to relate concepts with each other [22].
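The two categories of relations can be sketched as simple Python data structures. The vehicle domain, the mapping names (IS_A, PART_OF, ASSOCIATIVE) and the helper functions are assumptions made for illustration only. The sketch also assumes that every concept has at most one parent in each tree.

```python
# Hypothetical vehicle domain; each mapping goes from child to parent.
IS_A = {"Car": "Vehicle", "Truck": "Vehicle", "Vehicle": "Artifact"}  # specialization
PART_OF = {"Piston": "Engine", "Engine": "Car", "Wheel": "Car"}       # partitive

# Associative relationships cut across the trees: (subject, kind, object).
ASSOCIATIVE = [
    ("Car", "nominative", "automobile"),
    ("Car", "locative", "Garage"),
]


def ancestors(concept, hierarchy):
    """Walk a child-to-parent mapping upwards and list every concept reached."""
    result = []
    while concept in hierarchy:
        concept = hierarchy[concept]
        result.append(concept)
    return result


def associations(concept):
    """Return the (kind, object) pairs recorded for a concept."""
    return [(kind, obj) for subj, kind, obj in ASSOCIATIVE if subj == concept]


print(ancestors("Car", IS_A))        # super-concepts of Car
print(ancestors("Piston", PART_OF))  # wholes that contain a Piston
print(associations("Car"))
```

Keeping the specialization and partitive trees in separate mappings mirrors the separation between the two taxonomic relationship types, while the associative triples are free to link concepts in different trees.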
Relations can be organized into taxonomies as well. They can also have properties that more precisely define their nature and how they relate concepts with each other. These properties can include cardinality of the relation, transitivity of a relation, whether or not a relation is universally applicable to a concept, restrictions on a relation etc.
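Relation properties such as cardinality and transitivity can be made concrete with a small sketch. The Relation class, its field names and the example relations are hypothetical. The sketch covers only a maximum-cardinality restriction and transitivity, not the full range of properties mentioned above.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Relation:
    name: str
    transitive: bool = False
    max_cardinality: Optional[int] = None  # max objects per subject; None = unlimited
    pairs: set = field(default_factory=set)

    def add(self, subject, obj):
        """Assert subject -> obj, enforcing the cardinality restriction."""
        current = {o for s, o in self.pairs if s == subject}
        if (self.max_cardinality is not None
                and obj not in current
                and len(current) >= self.max_cardinality):
            raise ValueError(f"{self.name}: cardinality exceeded for {subject}")
        self.pairs.add((subject, obj))

    def holds(self, subject, obj):
        """True if the relation holds directly, or via a chain when transitive."""
        if (subject, obj) in self.pairs:
            return True
        if not self.transitive:
            return False
        seen, frontier = set(), [subject]
        while frontier:
            node = frontier.pop()
            for s, o in self.pairs:
                if s == node and o not in seen:
                    if o == obj:
                        return True
                    seen.add(o)
                    frontier.append(o)
        return False


# Hypothetical transitive relation.
part_of = Relation("part_of", transitive=True)
part_of.add("Piston", "Engine")
part_of.add("Engine", "Car")
print(part_of.holds("Piston", "Car"))

# Hypothetical relation restricted to one object per subject.
has_capital = Relation("has_capital", max_cardinality=1)
has_capital.add("France", "Paris")
try:
    has_capital.add("France", "Lyon")
except ValueError as error:
    print(error)
```

Declaring the properties on the relation itself, rather than on individual assertions, is one way of letting a reasoner apply them uniformly to every pair of concepts the relation connects.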
Once the conceptualization has been made concrete (as described in the next section), an ontology has been produced.
Instances are the entities that are represented by a concept. In their paper, Stevens, Goble and Bechhofer note that an ontology should not contain instances, as it is meant to be a conceptualization of the domain [22]. The combination of an ontology with its associated instances makes up the knowledge base for that domain. Axioms are used to place constraints on values for classes or instances. They also include general rules about the domain [22].
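The separation between an ontology and its instances, and the role of axioms as constraints, can be sketched as follows. The Person concept, the age axiom and the dictionary layout are assumptions made for illustration; they do not represent a particular ontology language.

```python
# Ontology: concepts and axioms only, kept free of instances.
ontology = {
    "concepts": {"Person": {"age"}},  # concept -> property names
    "axioms": [
        # (concept, description, predicate over one instance)
        ("Person", "age is non-negative", lambda inst: inst["age"] >= 0),
    ],
}

# Instances are stored separately from the ontology.
instances = [
    {"concept": "Person", "name": "alice", "age": 34},
    {"concept": "Person", "name": "bob", "age": -5},
]

# The knowledge base combines the ontology with its instances.
knowledge_base = {"ontology": ontology, "instances": instances}


def violations(kb):
    """Return (instance name, axiom description) for every violated axiom."""
    found = []
    for inst in kb["instances"]:
        for concept, description, predicate in kb["ontology"]["axioms"]:
            if inst["concept"] == concept and not predicate(inst):
                found.append((inst["name"], description))
    return found


print(violations(knowledge_base))  # [('bob', 'age is non-negative')]
```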
With the components of an ontology defined, it is still not clear how ontologies for different domains are created. The process of building an ontology is called ontological engineering and is introduced in the next section.
Building ontologies: ontological engineering
As is the case with software, ontologies are engineered, and a methodology is needed to define the stages involved in building an ontology, the guidelines and principles that outline the activities of each stage, and a life cycle that shows the relationships between the stages. The emerging discipline of ontological engineering is concerned with the ontology life cycle, which describes the steps involved in the development of an ontology and the relationships between these steps. The end goal of ontological engineering is the use and support of ontologies throughout this life cycle [20].
As ontological engineering is still an emerging discipline, it borrows some ideas from more mature engineering fields such as software engineering. It is therefore not surprising that methodologies for the development of ontologies can be divided into stage-based approaches (e.g. TOVE [23]), similar to the classic waterfall model in software engineering, and iterative prototype-refining approaches (e.g. METHONTOLOGY [24]), similar to the spiral model in software engineering. In both approaches there is a distinction between an informal stage, where the ontology is described using diagrams or natural language, and a formal stage, where these high-level descriptions are encoded in a formal knowledge representation language so that the ontology is understandable and processable by a machine.
In their paper, Stevens et al. propose a skeletal life cycle for building ontologies [22]. The key stages of this approach are: specification, design, conceptualization, integration, formalization, and evaluation. These are illustrated in figure 4.4 below.
Figure 4.4: Ontology life cycle [22]. (Labels shown in the figure: specification, design, conceptualization, integration, formalization, evaluation, informal stage, formal stage.)
The specification stage attempts to develop a requirements specification for the ontology by identifying the intended purpose and scope of the ontology. This is done through the acquisition of domain knowledge from which the ontology will be built. Sources of domain knowledge range from experts in the domain to research papers to other ontologies. This broad range of sources contributes to making the building of ontologies a difficult, time-consuming and expensive task, as different members of a community may have conflicting opinions on how the domain they are considering should be modelled. It is therefore crucial that all interested parties (people and computer systems) commit to an agreed-upon ontology. With this in mind, Holsapple and Joshi name five approaches to ontology design that help to ensure an acceptable design for an ontology [25]. They are:
• Inspiration, which constitutes an individual's viewpoint about the domain being modelled.
• Induction where modelling is guided by analyzing specific cases in the domain.
• Deduction where general principles about a domain are accepted and applied adaptively to ontology construction.
• Synthesis, which accepts non-overlapping base ontologies that provide a partial characterization of the domain.
• Collaboration of individuals, reflecting their experience and viewpoints. An initial ontology can be defined as a starting point for discussion.
There are numerous steps involved in all five approaches, and they all have certain distinct advantages and disadvantages. In the article by Holsapple and Joshi, these steps are discussed in detail [25].
In the conceptualization stage, the key concepts that exist in the domain, their properties and the relationships between these concepts are identified. These components were discussed earlier in this section. The goal of this stage is to structure the domain knowledge acquired in the previous stages into an explicit conceptual model of the domain. The resulting ontology is usually described in an informal manner at this stage. This leads to the integration stage, where existing ontologies are either used as they are or customized for use in the domain. The use of generic ontologies can also aid in giving deeper definitions of the concepts defined in the domain.
In the formalization stage, the informal conceptualization and the integrated elements of existing ontologies are represented using a formal language. More specifically, the formalisms discussed in this chapter are used (i.e. frame-based systems and description logics (DLs)). Another alternative for the formal representation of an ontology is the ontology inference layer (OIL) language.
OIL is an effort to produce a layered architecture for specifying ontologies [26]. The language has been designed to provide the modelling primitives commonly used in frame-based systems, semantics based on description logics, and automated reasoning support that can be specified and provided in a computationally efficient manner.
The evaluation stage determines the suitability of an ontology for its intended application. Evaluation assesses the competency of the ontology to satisfy the requirements of the domain. This means that the consistency, completeness and conciseness of the ontology must be determined. Conciseness in this sense means the absence of redundancy in the definition of the ontology, together with appropriate granularity [22].
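One much-simplified reading of these three criteria is sketched below. The specific checks are assumptions made for illustration rather than the evaluation procedure of [22]: consistency is reduced to the absence of cycles in the taxonomy, completeness to coverage of a list of required terms, and conciseness to the absence of concepts with identical definitions. All names and the example data are hypothetical.

```python
def is_consistent(is_a):
    """Consistency (simplified): no concept may be its own ancestor."""
    for start in is_a:
        seen, node = set(), start
        while node in is_a:
            node = is_a[node]
            if node == start or node in seen:
                return False
            seen.add(node)
    return True


def missing_terms(definitions, required_terms):
    """Completeness (simplified): required terms the ontology does not define."""
    return sorted(set(required_terms) - set(definitions))


def redundant_concepts(definitions):
    """Conciseness (simplified): groups of concepts sharing one definition."""
    by_definition = {}
    for name, props in definitions.items():
        by_definition.setdefault(frozenset(props), []).append(name)
    return [sorted(names) for names in by_definition.values() if len(names) > 1]


# Hypothetical ontology under evaluation.
definitions = {
    "Car": {"has_engine", "has_wheels"},
    "Automobile": {"has_engine", "has_wheels"},
    "Bicycle": {"has_wheels"},
}
is_a = {"Car": "Vehicle", "Automobile": "Vehicle", "Bicycle": "Vehicle"}
required = ["Car", "Bicycle", "Truck"]

print(is_consistent(is_a))
print(missing_terms(definitions, required))
print(redundant_concepts(definitions))
```

In this example the taxonomy is acyclic, the required term Truck is not covered, and Car and Automobile are flagged as redundant because their definitions coincide.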
The general discussion on ontologies given above enables us to investigate a specialized ontology specification language (called topic maps) that can be used for the purposes of electronic indices.
Topic maps will be discussed in the next section.