2.2 Description Logic and OWL Learning
2.2.2 Basic approaches in DL learning
Learning in description logics is essentially a search problem: the learner searches for an accurate concept in a search space that consists of a potentially infinite set of concepts constructed from the vocabulary of the language of a given knowledge base. Concepts in the search space are generated by refinement operators and are organised in an ordering structure.
There are two types of refinement operator: downward and upward refinement operators. Given a concept, a downward refinement operator computes a set of concepts that are more specific than the given concept. An upward refinement operator computes a set of concepts that are more general than the given concept. In description logics and OWL, the generality and specificity relations between concepts are based on the inclusion (subclass/superclass) relationship. A description C subsumes a description D (or C is a superclass of D) if all instances of D are also instances of C. In that case, C is said to be more general than D, or D more specific than C; C is called a generalisation of D and D a specialisation of C. A refinement operator based on these relations can be defined as follows:
Definition 2.20 (Refinement operator in description logics). Let L be a description logic language and C a concept in L. A downward (respectively upward) refinement operator ρ is a mapping from C to a set of concepts 𝒟 in L such that ∀D ∈ 𝒟: D ⊑ C (respectively C ⊑ D). Each D ∈ 𝒟 is called a refinement of C.
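As a minimal illustration of Definition 2.20, restricted to named concepts only, the sketch below treats the direct subclasses of a class as its downward refinements and the direct superclasses as its upward refinements. The class hierarchy and the names SUPERCLASSES, rho_down, rho_up and is_subsumed_by are hypothetical and chosen only for this illustration; subsumption is read off the told hierarchy rather than computed by a reasoner.

```python
# Hypothetical toy hierarchy: each class maps to its direct superclasses.
SUPERCLASSES = {
    "Person": {"TOP"},
    "Male": {"Person"},
    "Female": {"Person"},
    "Father": {"Male"},
    "Mother": {"Female"},
}


def rho_down(concept):
    """Downward refinement: the direct subclasses of a named concept."""
    return {c for c, parents in SUPERCLASSES.items() if concept in parents}


def rho_up(concept):
    """Upward refinement: the direct superclasses of a named concept."""
    return set(SUPERCLASSES.get(concept, set()))


def is_subsumed_by(d, c):
    """True if d is subsumed by c in the told hierarchy (reflexive, transitive)."""
    if d == c or c == "TOP":
        return True
    return any(is_subsumed_by(p, c) for p in SUPERCLASSES.get(d, set()))


# Every downward refinement D of Person satisfies D ⊑ Person,
# and every upward refinement D of Father satisfies Father ⊑ D.
print(rho_down("Person"))
print(rho_up("Father"))
```

Operators used in practice also build complex concepts (conjunctions, restrictions and so on); this sketch covers only the replacement of a named concept by its neighbours in the hierarchy.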
Definition 2.20 implies that refinement can be applied recursively and that the process may not terminate. In practice, therefore, a refinement task is often given a maximal length for the concepts in the refinement result, which keeps the refinement finite. There are several ways to define concept length. In this thesis, the concept length computation is adopted from [82] as follows:
Definition 2.21 (Length of a concept in description logics). The length of a concept in description logics is the total number of symbols in the concept, including class names, role names, individuals, and constructors.
In particular, the length of an ALC concept is defined as follows:
Definition 2.22 (Length of an ALC concept). The length of a concept C, denoted by |C|, is inductively defined as follows:
|A| = |⊤| = |⊥| = 1 (A is an atomic concept)
|¬D| = 1 + |D|
|D ⊓ E| = |D ⊔ E| = 1 + |D| + |E|
|∀r.D| = |∃r.D| = 2 + |D|
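The inductive definition above translates directly into a recursive function. The following is a minimal sketch, assuming a hypothetical encoding of ALC concepts as nested tuples: a string for an atomic concept, TOP or BOTTOM; ("not", D); ("and", D, E); ("or", D, E); ("all", r, D); and ("some", r, D). The encoding and the function name length are illustrative assumptions, not part of [82].

```python
def length(concept):
    """Length of an ALC concept, following the inductive definition."""
    if isinstance(concept, str):
        # Atomic concept, TOP or BOTTOM.
        return 1
    op = concept[0]
    if op == "not":
        return 1 + length(concept[1])
    if op in ("and", "or"):
        return 1 + length(concept[1]) + length(concept[2])
    if op in ("all", "some"):
        # One symbol for the quantifier, one for the role name.
        return 2 + length(concept[2])
    raise ValueError(f"unknown constructor: {op!r}")


# Male AND hasChild SOME TOP: 1 + |Male| + |hasChild SOME TOP| = 1 + 1 + 3
father = ("and", "Male", ("some", "hasChild", "TOP"))
print(length(father))  # 5
```

A length bound for refinement can then be enforced by discarding any candidate whose length exceeds the chosen maximum.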
Corresponding to the two types of refinement operator, there are two basic search directions, top-down and bottom-up, which give rise to the two basic inductive learning approaches: the top-down and the bottom-up approach, respectively.
Top-down (specialisation) approach
In the top-down learning approach, the search starts from a general concept, usually Thing (TOP), and uses a downward refinement operator to specialise concepts in the search space until it finds a concept, or a set of concepts, that covers all positive examples and no negative ones. The basic tasks of a downward refinement are:
• replacing primitive concepts and properties in the description by their sub-concepts
or sub-properties, and
• specialising the range of properties, and
Figure 2.1: The top-down approach for learning the concept Father described in Example 2.5. The search starts from the most general concept TOP and uses a refinement operator ρ to specialise the descriptions until it reaches the accurate concept Male AND hasChild SOME TOP.
[Figure: refinement tree rooted at TOP. Nodes: Male; Female; hasChild some TOP; Male and Female; Male and hasChild some TOP; Female and hasChild some TOP. Edge labels: ρ(TOP), ρ(. . .). Node annotations: overly general, overly specific, incorrect & incomplete, accurate concept.]
An example of the top-down approach, learning the Father concept described in Example 2.5, is shown in Figure 2.1.
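The top-down search can be sketched in a few lines of code. The example below is a minimal illustration under stated assumptions, not the learning algorithm of any particular system: the individuals, class memberships and hasChild assertions are hypothetical toy data (they are not taken from Example 2.5), instance checking is done by closed-world lookup instead of DL reasoning, and the operator rho only adds one conjunct from a fixed list at each step.

```python
from collections import deque

# Hypothetical toy data, interpreted under a closed-world view.
INDIVIDUALS = {"john", "peter", "tom", "mary", "anna"}
CLASSES = {"Male": {"john", "peter", "tom"}, "Female": {"mary", "anna"}}
ROLES = {"hasChild": {("john", "peter"), ("mary", "peter"), ("peter", "anna")}}

# Building blocks the toy refinement operator may add as conjuncts.
BASE = ["Male", "Female", ("some", "hasChild", "TOP")]


def instances(concept):
    """Individuals of the toy data that belong to the concept."""
    if concept == "TOP":
        return set(INDIVIDUALS)
    if isinstance(concept, str):
        return set(CLASSES[concept])
    if concept[0] == "some":
        _, role, filler = concept
        fillers = instances(filler)
        return {s for (s, o) in ROLES[role] if o in fillers}
    if concept[0] == "and":
        return instances(concept[1]) & instances(concept[2])
    raise ValueError(f"unsupported concept: {concept!r}")


def conjuncts(concept):
    """Flatten nested conjunctions into a list of conjuncts."""
    if isinstance(concept, tuple) and concept[0] == "and":
        return conjuncts(concept[1]) + conjuncts(concept[2])
    return [concept]


def rho(concept):
    """Toy downward refinement: specialise by adding one new conjunct."""
    if concept == "TOP":
        return list(BASE)
    used = conjuncts(concept)
    return [("and", concept, b) for b in BASE if b not in used]


def learn(positives, negatives, max_conjuncts=3):
    """Breadth-first top-down search for a concept covering all positives only."""
    queue = deque(["TOP"])
    while queue:
        concept = queue.popleft()
        covered = instances(concept)
        if not positives <= covered:
            continue  # overly specific: specialising further cannot help
        if not (negatives & covered):
            return concept  # covers all positives and no negatives
        if len(conjuncts(concept)) < max_conjuncts:
            queue.extend(rho(concept))  # overly general: specialise
    return None


fathers = {"john", "peter"}
others = {"tom", "mary", "anna"}
print(learn(fathers, others))
```

On this toy data the search visits TOP, Male, Female and hasChild some TOP, discards Female and Male and Female as overly specific, and stops at the conjunction of Male and hasChild some TOP. Pruning overly specific concepts is safe here because each application of rho can only shrink the set of covered individuals.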
Bottom-up (generalisation) approach
In the bottom-up learning approach, the search is in the reverse direction. It starts from a very specific concept and uses an upward refinement operator to generate more general concepts. Basic tasks of an upward refinement are:
• replacing examples with their most specific concepts using the Most Specific Concepts operator (MSC) [2, 5, 30, 74], and
• removing axioms from the concept.
MSC is the most popular upward refinement operator in description logics. However, it may not be available for some description logic languages; currently, the number of languages for which the MSC is supported is limited. Figure 2.2 illustrates a bottom-up learning approach for learning the Parent concept described in Example 2.5.
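The generalisation step of removing axioms can be sketched as follows. This is a minimal illustration under stated assumptions: a concept is a conjunction encoded as a frozenset of conjunct names, the starting concept is written by hand as a stand-in for the result of an MSC computation (the MSC itself is not computed), and the instance data and all identifiers are hypothetical.

```python
from collections import deque

# Hypothetical toy data: instances of each conjunct, closed-world view.
INDIVIDUALS = {"john", "peter", "tom", "mary", "anna"}
ATOM_INSTANCES = {
    "Male": {"john", "peter", "tom"},
    "hasChild some Person": {"john", "mary", "peter"},
}


def extension(concept):
    """Instances of a conjunction; the empty conjunction stands for TOP."""
    result = set(INDIVIDUALS)
    for atom in concept:
        result &= ATOM_INSTANCES[atom]
    return result


def rho_up(concept):
    """Toy upward refinement: generalise by removing one conjunct."""
    return [concept - {atom} for atom in sorted(concept)]


def generalise(start, positives, negatives):
    """Breadth-first bottom-up search from an overly specific concept."""
    queue = deque([start])
    seen = {start}
    while queue:
        concept = queue.popleft()
        covered = extension(concept)
        if negatives & covered:
            continue  # overly general: generalising further cannot help
        if positives <= covered:
            return concept  # covers all positives and no negatives
        for candidate in rho_up(concept):
            if candidate not in seen:
                seen.add(candidate)
                queue.append(candidate)
    return None


# Hand-written stand-in for an MSC result (an assumption of this sketch).
start = frozenset({"Male", "hasChild some Person"})
parents = {"john", "mary", "peter"}
others = {"tom", "anna"}
print(generalise(start, parents, others))
```

Starting from the conjunction of Male and hasChild some Person, which misses the positive example mary, the search removes one conjunct at a time and returns hasChild some Person; the alternative Male is rejected because it covers the negative example tom.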
It is important to note that specialisation may produce overly specific concepts (theories that are too strong), so a generalisation step may be needed to correct them. Analogously, a bottom-up approach may need specialisation to correct overly general concepts generated by the upward refinement operator [104].
Figure 2.2: The bottom-up approach for learning the concept Parent described in Example 2.5. The search starts from an example and uses the most specific concept (MSC) operator and the removal of axioms to generalise the descriptions until it reaches an accurate concept, hasChild SOME TOP.
[Figure: generalisation steps starting from the example john. Nodes: john; Male and hasChild some Person; hasChild some Person; Male. Edge labels: MSC(john, peter); remove axioms. Node annotations: Parent is john: overly specific; overly specific; accurate concept; incorrect & incomplete.]
In DL/OWL learning, the top-down approach is used more widely, as it can take advantage of the rich concept hierarchy in the ontology.