
2.2 Knowledge representation methods and technologies

2.2.4 Ontologies

Despite the historical significance of ontology as the branch of metaphysics that studies categories of beings, the adoption of constructs in artificial intelligence called ontologies is recent compared to those discussed in the previous sections. More precisely, the consolidation of ontologies as theories of a modeled world open for consumption by machines coincides with the birth of Web ontologies, i.e. ontologies represented in formal languages that can be rendered using standards of wide acceptance on the Web. In 1993, Gruber gave an initial definition of ontology that was widely accepted in the computer science domain [Gru93a]:

An ontology is an explicit specification of a conceptualization.

The author further elaborated upon this definition in 2009 [Gru09]:

...an ontology defines a set of representational primitives with which to model a domain of knowledge or discourse. The representational primitives are typically classes (or sets), attributes (or properties), and relationships (or relations among class members). The definitions of the representational primitives include information about their meaning and constraints on their logically consistent application.

In the literature, an ontology is also described as an artifact whose structure matches both its domain and its intended task, and whose design is motivated by some specific purpose, e.g., the capability to solve some reasoning task or modeling problem [GP09].

Web ontologies have since become widespread standard artifacts for representing and reasoning upon knowledge. Their formal concepts are defined either within a given domain or across multiple domains, along with the logical relationships between these concepts and the constraints upon them. Ontologies combine many types of semantic structures, both hierarchical (e.g. taxonomies, meronomies) and non-hierarchical (e.g. the equality and opposition relationships in thesauri), and can combine any number of them.

Ontologies also allow for a significant, albeit conceptually lax, decomposition into terminological and assertional components. A terminological component, or TBox, is a statement that describes the mechanisms in a part of the world in terms of controlled vocabularies. An assertional component, or ABox, is a statement that describes the population of that part of the world in compliance with the controlled vocabularies defined by TBoxes; in other words, a fact. In an alternative terminology, TBoxes contain intensional knowledge and ABoxes contain extensional knowledge [Gru93b]. There are no strict rules as to which levels of representation should be handled by ABoxes rather than by TBoxes, and this is in fact an open issue in knowledge representation. Nevertheless, the distinction has some essential implications for the practice of modeling systems. The ability to single out assertional components, such as factual and domain knowledge about the objects and event occurrences in the real world, allows us to keep controlled vocabularies general, reusable and, in principle, independent of the model population. This is far less evident when dealing with conceptual models such as those of relational databases, where the underlying semantics are implicitly expressed in relational schemas.
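
The TBox/ABox decomposition described above can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the `TBox` and `ABox` structures, the `violations` helper and the vocabulary (`Painter`, `Person`, `created`, and so on) are hypothetical and do not come from any ontology library. The sketch also treats property domains and ranges as closed-world integrity checks, which is a simplification: under OWL semantics they would license the inference of new types rather than reject facts.

```python
from __future__ import annotations
from dataclasses import dataclass, field


@dataclass
class TBox:
    """Terminological knowledge: the controlled vocabulary (illustrative)."""
    subclass_of: dict[str, str] = field(default_factory=dict)             # class -> direct superclass
    properties: dict[str, tuple[str, str]] = field(default_factory=dict)  # property -> (domain, range)

    def classes(self) -> set[str]:
        """All class names mentioned in the hierarchy."""
        return set(self.subclass_of) | set(self.subclass_of.values())

    def ancestors(self, cls: str | None) -> set[str]:
        """Return cls together with all of its superclasses."""
        seen: set[str] = set()
        while cls is not None and cls not in seen:
            seen.add(cls)
            cls = self.subclass_of.get(cls)
        return seen


@dataclass
class ABox:
    """Assertional knowledge: the population, stated with TBox vocabulary."""
    types: dict[str, str] = field(default_factory=dict)              # individual -> asserted class
    facts: list[tuple[str, str, str]] = field(default_factory=list)  # (subject, property, object)


def violations(tbox: TBox, abox: ABox) -> list[str]:
    """List the assertions that do not comply with the vocabulary.

    Simplifying assumption: domain and range act as closed-world checks.
    """
    problems = []
    for individual, cls in abox.types.items():
        if cls not in tbox.classes():
            problems.append(f"{individual}: unknown class {cls}")
    for subject, prop, obj in abox.facts:
        if prop not in tbox.properties:
            problems.append(f"{prop}: unknown property")
            continue
        domain, range_ = tbox.properties[prop]
        if domain not in tbox.ancestors(abox.types.get(subject)):
            problems.append(f"{subject}: not in domain {domain} of {prop}")
        if range_ not in tbox.ancestors(abox.types.get(obj)):
            problems.append(f"{obj}: not in range {range_} of {prop}")
    return problems


# The vocabulary is defined once and is independent of any population.
tbox = TBox(
    subclass_of={"Painter": "Person", "Painting": "Artwork"},
    properties={"created": ("Person", "Artwork")},
)
# The population can change without touching the vocabulary.
abox = ABox(
    types={"leonardo": "Painter", "mona_lisa": "Painting"},
    facts=[("leonardo", "created", "mona_lisa")],
)
print(violations(tbox, abox))  # expected: []
```

Keeping the two structures separate mirrors the reuse argument made above: the same `tbox` object could be paired with any number of different `abox` populations.
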
Note that the classical TBox/ABox dichotomy is not the only decomposition of knowledge accepted in computer science: in the information extraction field, for instance, ontological knowledge can be modeled to distinguish referential domain entities, conceptual hierarchies, entity relationships and the domain population [NN06].

The state of the art in Web ontology authoring is the Web Ontology Language (OWL) [Gro09]. Its definition relies on a stack of representational schemas that cover multiple layers of Berners-Lee’s Semantic Web layer cake (cf. Section 2.1). OWL provides the framework for bridging RDFS and ontologies, but its foundations can be traced back to frame languages and beyond. In fact, it superseded the earlier ontology language DAML+OIL [MFHS02], which combined features of the OIL language, described earlier while discussing frames, with the DARPA Agent Markup Language (DAML), a United States defense program for the creation of machine-readable representations for the Web [HM00].

OWL is actually a family of languages that allow knowledge engineers to model domains under the principles of Description Logics (DL). These are knowledge representation languages that model concepts, roles, individuals and their relationships through an axiomatic theory. Languages in the OWL family are founded on model-theoretic formal semantics and define fragments of First-Order Logic (FOL) with well-defined computational properties. In the first version of the language, these fragments are called OWL-Lite, OWL-DL and OWL-Full, and consist of specifications of which constructs of the OWL language should be used for coding ontologies and under which restrictions. Each OWL fragment offers a different trade-off between expressivity and decidability [MvH04], with OWL-DL being the most widely used, as it is the most expressive fragment that guarantees the decidability of OWL reasoning procedures.
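
As an illustration of what it means for a DL language to define a fragment of FOL, a DL axiom can be read through its standard first-order translation. The concept, role and individual names below (Painter, Person, Artwork, created, leonardo, monaLisa) are hypothetical and chosen only for the example; they are not taken from any ontology discussed in this work.

```latex
\begin{align*}
% Terminological axiom in DL syntax
&\mathsf{Painter} \sqsubseteq \mathsf{Person} \sqcap \exists\,\mathsf{created}.\mathsf{Artwork} \\
% Its first-order reading
&\forall x\,\bigl(\mathsf{Painter}(x) \rightarrow \mathsf{Person}(x) \land
  \exists y\,(\mathsf{created}(x,y) \land \mathsf{Artwork}(y))\bigr) \\
% Assertions about individuals, which are already first-order atoms
&\mathsf{Painter}(\mathit{leonardo}), \qquad \mathsf{created}(\mathit{leonardo},\mathit{monaLisa})
\end{align*}
```

Concepts map to unary predicates, roles to binary predicates and individuals to constants, so every axiom of this shape stays within a restricted, variable-bounded portion of FOL.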

A revision of the language, meant to accommodate the feature requests that arose from the heavy adoption of OWL in semantic applications, led to a W3C recommendation for OWL 2 in late 2009 [Gro09]. In OWL 2, the distinction between the existing OWL fragments was dropped in favor of three new language profiles, all decidable but with different computational properties. These profiles are called OWL 2 EL, OWL 2 QL and OWL 2 RL, and are named after their intended usage in knowledge management procedures for largely expressive ontologies, query answering procedures and rule systems, respectively [MGH+09]. Because they address computational complexity issues rather than computability, these profiles are also called tractable fragments, since they pose several restrictions on OWL language constructs and axioms in order to address certain scalability requirements deriving from interoperability with rule languages and relational databases. Other innovations contributed by the OWL 2 specification that are interesting for our work include the introduction of reflexive, irreflexive, and asymmetric object properties, i.e. properties holding between individuals; a clearer distinction between ontology locations, set in statements following an import-by-location scheme, and their logical names; a versioning system with an impact on physical and logical referencing; use of the Internationalized Resource Identifier (IRI) [DS05] as a generalization of the URI for referencing entities and ontologies; and support for punning, a meta-modeling capability that allows the same term to reference, under certain restrictions, multiple entities of different types, e.g. classes, individuals and properties [GWPS09].
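
The three object property characteristics mentioned above have simple set-theoretic readings, which the following Python sketch spells out. The function names and the sample relations (`parent_of`, `same_as`) are hypothetical, and the checks run over a finite, explicitly listed set of pairs; this is a closed-world simplification, whereas an OWL 2 axiom constrains every model of the ontology.

```python
def is_reflexive(pairs, individuals):
    """Every individual is related to itself."""
    return all((x, x) in pairs for x in individuals)


def is_irreflexive(pairs):
    """No individual is related to itself."""
    return all(x != y for x, y in pairs)


def is_asymmetric(pairs):
    """If x is related to y, then y is not related to x."""
    return all((y, x) not in pairs for x, y in pairs)


people = {"anna", "bob", "carl"}

# A parenthood-like relation: nobody is their own parent, and it never holds both ways.
parent_of = {("anna", "bob"), ("bob", "carl")}

# An identity-like relation: every individual is related to itself.
same_as = {(p, p) for p in people}

print(is_irreflexive(parent_of), is_asymmetric(parent_of))  # expected: True True
print(is_reflexive(same_as, people))                        # expected: True
```

Note that asymmetry implies irreflexivity under this reading: a pair (x, x) is its own inverse, so an asymmetric relation cannot contain it.
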