
2.2 Description Logic and OWL Learning

2.2.2 Basic approaches in DL learning

Learning in description logics is essentially a search problem: the learner searches for an accurate concept in a search space that consists of a potentially infinite set of concepts constructed from the vocabulary of the language of a given knowledge base. Concepts in the search space are generated by refinement operators and are organised in an ordering structure.

There are two types of refinement operator: downward and upward refinement operators. Given a concept, a downward refinement operator computes a set of concepts that are more specific than the given concept. An upward refinement operator computes a set of concepts that are more general than the given concept. In description logics and OWL, the generality and specificity relations between concepts are based on the inclusion (subclass/superclass) relationship. A description C subsumes a description D (or C is a superclass of D) if all instances of D are also instances of C. Therefore, C is said to be more general than D, or D is more specific than C. C is called a generalisation of D and D is a specialisation of C. A refinement operator based on these relations can be defined as follows:

Definition 2.20 (Refinement operator in description logics). Given a description logic language L and a concept C in L, a downward (respectively upward) refinement operator ρ is a mapping from C to a set of concepts 𝒟 in L such that ∀D ∈ 𝒟: D ⊑ C (respectively C ⊑ D). Each D ∈ 𝒟 is called a refinement of C.
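The definition can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the toy data in `EXTENSIONS`, the encoding of a concept as a conjunction of atomic names, and the helper names `instances`, `is_subsumed_by` and `refine_down`. Subsumption is checked against the listed instances only, which is a closed-world simplification and not a substitute for a DL reasoner.

```python
from typing import Dict, FrozenSet, List, Set

# Hypothetical toy data: each atomic concept name maps to its known instances.
EXTENSIONS: Dict[str, Set[str]] = {
    "Person": {"john", "mary", "peter", "anna"},
    "Male": {"john", "peter"},
    "Female": {"mary", "anna"},
}

# Assumption: a concept is a conjunction of atomic names; the empty set is TOP.
Concept = FrozenSet[str]


def instances(concept: Concept) -> Set[str]:
    """Return the individuals that belong to every conjunct of the concept."""
    result: Set[str] = set().union(*EXTENSIONS.values())  # TOP covers everyone
    for name in concept:
        result &= EXTENSIONS[name]
    return result


def is_subsumed_by(d: Concept, c: Concept) -> bool:
    """Toy check of D ⊑ C: every known instance of D is an instance of C."""
    return instances(d) <= instances(c)


def refine_down(concept: Concept) -> List[Concept]:
    """Downward refinement: specialise by adding one atomic conjunct."""
    return [concept | {name} for name in sorted(EXTENSIONS) if name not in concept]


top: Concept = frozenset()
for refinement in refine_down(top):
    # Each refinement must be more specific than the concept it came from.
    assert is_subsumed_by(refinement, top)
```

Adding a conjunct can only shrink the set of instances, so every concept returned by `refine_down` satisfies the downward condition D ⊑ C of the definition in this toy setting.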

Definition 2.20 implies that the refinement process can be applied recursively and that it may not terminate. In practice, a refinement task is therefore usually given a maximal length for the concepts in its result, which keeps the refinement finite. There are several ways to define concept length. In this thesis, the computation of concept length is adopted from [82] as follows:

Definition 2.21 (Length of a concept in description logics). The length of a concept in description logics is the total number of symbols in the concept, including class names, role names, individuals, and constructors.

In particular, the length of an ALC concept is defined as follows:

Definition 2.22 (Length of an ALC concept). The length of a concept C, denoted by |C|, is inductively defined as follows:

|A| = |⊤| = |⊥| = 1 (A is an atomic concept)

|¬D| = 1 + |D|

|D ⊓ E| = |D ⊔ E| = 1 + |D| + |E|

|∀r.D| = |∃r.D| = 2 + |D|
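The inductive definition translates directly into a recursive function. The following Python sketch assumes a hypothetical encoding that is not part of the thesis: atomic concepts, TOP and BOTTOM are plain strings, and complex concepts are tuples tagged with `"not"`, `"and"`, `"or"`, `"all"` or `"some"`.

```python
from typing import Tuple, Union

# Hypothetical encoding: a string is an atomic concept, "TOP" or "BOTTOM";
# a tuple is a complex concept whose first element names the constructor.
Concept = Union[str, Tuple]


def length(concept: Concept) -> int:
    """Compute |C| for an ALC concept following the inductive definition."""
    if isinstance(concept, str):
        return 1  # |A| = |TOP| = |BOTTOM| = 1
    tag = concept[0]
    if tag == "not":  # ("not", D)
        return 1 + length(concept[1])
    if tag in ("and", "or"):  # ("and", D, E) or ("or", D, E)
        return 1 + length(concept[1]) + length(concept[2])
    if tag in ("all", "some"):  # ("all", r, D) or ("some", r, D)
        return 2 + length(concept[2])  # constructor + role name + filler
    raise ValueError(f"unknown constructor: {tag!r}")


# Male AND hasChild SOME TOP
father = ("and", "Male", ("some", "hasChild", "TOP"))
print(length(father))  # 5
```

For the concept Male AND hasChild SOME TOP this gives 1 + |Male| + |∃hasChild.⊤| = 1 + 1 + (2 + 1) = 5.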

Corresponding to the two types of refinement operator, there are two basic search directions: top-down and bottom-up. These give rise to the two basic inductive learning approaches: the top-down and the bottom-up approach, respectively.

Top-down (specialisation) approach

In the top-down learning approach, the search starts from a general concept, usually Thing (TOP), and uses a downward refinement operator to specialise concepts in the search space until it finds a concept, or a set of concepts, that covers all positive examples and no negative ones. The basic tasks of a downward refinement are:

• replacing primitive concepts and properties in the description by their sub-concepts or sub-properties, and

• specialising the range of properties, and

Figure 2.1: The top-down approach for learning the concept Father described in Example 2.5. The search starts from the most general concept TOP and uses a refinement operator ρ to specialise the descriptions until it reaches the accurate concept Male AND hasChild SOME TOP.

[Figure 2.1 diagram: a refinement tree rooted at TOP. Refining TOP yields Male, Female and hasChild some TOP; further refinement yields Male and Female, Male and hasChild some TOP, and Female and hasChild some TOP. Nodes are annotated as overly general, overly specific, incorrect and incomplete, or accurate concept.]

An example of the top-down learning approach, applied to learning the Father concept described in Example 2.5, is shown in Figure 2.1.
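The top-down search for Father can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the learning algorithm of the thesis: the individuals and `EXTENSIONS` are hypothetical toy data, a concept is a conjunction of three fixed building blocks, the refinement operator only adds conjuncts, and `max_conjuncts` is a stand-in for the maximal concept length.

```python
from collections import deque
from typing import Deque, Dict, FrozenSet, List, Optional, Set

# Hypothetical toy data, invented for illustration.
INDIVIDUALS: Set[str] = {"john", "peter", "tom", "mary", "anna"}
EXTENSIONS: Dict[str, Set[str]] = {
    "Male": {"john", "peter", "tom"},
    "Female": {"mary", "anna"},
    "hasChild some TOP": {"john", "peter", "mary"},
}

Concept = FrozenSet[str]  # conjunction of building blocks; empty set is TOP


def covered(concept: Concept) -> Set[str]:
    """Individuals that satisfy every conjunct of the concept."""
    result = set(INDIVIDUALS)
    for name in concept:
        result &= EXTENSIONS[name]
    return result


def refine_down(concept: Concept) -> List[Concept]:
    """Specialise a concept by adding one conjunct."""
    return [concept | {name} for name in sorted(EXTENSIONS) if name not in concept]


def learn_top_down(positives: Set[str], negatives: Set[str],
                   max_conjuncts: int = 3) -> Optional[Concept]:
    """Breadth-first specialisation starting from TOP."""
    queue: Deque[Concept] = deque([frozenset()])
    seen: Set[Concept] = {frozenset()}
    while queue:
        concept = queue.popleft()
        cover = covered(concept)
        if not positives <= cover:
            continue  # misses a positive; specialising further cannot fix this
        if not cover & negatives:
            return concept  # covers all positives and no negatives
        if len(concept) >= max_conjuncts:
            continue  # bound reached; stop refining this branch
        for child in refine_down(concept):
            if child not in seen:
                seen.add(child)
                queue.append(child)
    return None


father = learn_top_down({"john", "peter"}, {"tom", "mary", "anna"})
print(sorted(father))  # ['Male', 'hasChild some TOP']
```

Concepts that still cover a negative example are overly general and are refined further, while concepts that miss a positive example are discarded, because adding conjuncts can never recover a lost positive.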

Bottom-up (generalisation) approach

In the bottom-up learning approach, the search is in the reverse direction. It starts from a very specific concept and uses an upward refinement operator to generate more general concepts. Basic tasks of an upward refinement are:

• replacing examples with their most specific concepts using the Most Specific Concepts operator (MSC) [2, 5, 30, 74], and

• removing axioms from the concept.

MSC is the most popular upward refinement operator in description logics. However, it may not be available for some languages of description logics. Currently, the number of description logic languages supported by the MSC is limited. Figure 2.2 illustrates a bottom-up learning approach for learning the Parent concept described in Example 2.5.
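A bottom-up search can be sketched in the same style. The Python code below is an illustration under strong assumptions: the toy data is hypothetical, `most_specific_description` is only a crude stand-in for a real MSC operator (it collects the building blocks that hold for one example), and generalisation is limited to removing conjuncts.

```python
from collections import deque
from typing import Deque, Dict, FrozenSet, List, Optional, Set

# Hypothetical toy data, invented for illustration.
INDIVIDUALS: Set[str] = {"john", "peter", "tom", "mary", "anna"}
EXTENSIONS: Dict[str, Set[str]] = {
    "Male": {"john", "peter", "tom"},
    "Female": {"mary", "anna"},
    "hasChild some TOP": {"john", "peter", "mary"},
}

Concept = FrozenSet[str]  # conjunction of building blocks; empty set is TOP


def covered(concept: Concept) -> Set[str]:
    """Individuals that satisfy every conjunct of the concept."""
    result = set(INDIVIDUALS)
    for name in concept:
        result &= EXTENSIONS[name]
    return result


def most_specific_description(example: str) -> Concept:
    """Crude stand-in for MSC: conjoin every building block the example satisfies."""
    return frozenset(name for name, ext in EXTENSIONS.items() if example in ext)


def refine_up(concept: Concept) -> List[Concept]:
    """Generalise a concept by removing one conjunct."""
    return [concept - {name} for name in sorted(concept)]


def learn_bottom_up(seed: str, positives: Set[str],
                    negatives: Set[str]) -> Optional[Concept]:
    """Breadth-first generalisation starting from one positive example."""
    start = most_specific_description(seed)
    queue: Deque[Concept] = deque([start])
    seen: Set[Concept] = {start}
    while queue:
        concept = queue.popleft()
        cover = covered(concept)
        if cover & negatives:
            continue  # covers a negative; generalising further cannot fix this
        if positives <= cover:
            return concept  # covers all positives and no negatives
        for parent in refine_up(concept):
            if parent not in seen:
                seen.add(parent)
                queue.append(parent)
    return None


parent_concept = learn_bottom_up("john", {"john", "mary"}, {"tom", "anna"})
print(sorted(parent_concept))  # ['hasChild some TOP']
```

Starting from john, the initial description Male and hasChild some TOP misses mary and is therefore overly specific; removing Male yields hasChild some TOP, which covers both positive examples and neither negative one.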

It is important to note that specialisation may produce overly specific concepts (too strong theories), so a top-down approach may need generalisation to correct those concepts. Analogously, a bottom-up approach may need specialisation to correct overly general concepts generated by the upward refinement operator [104].

Figure 2.2: The bottom-up approach for learning the concept Parent described in Example 2.5. The search starts from an example and uses the most specific concept (MSC) operator and the removal of axioms to generalise the descriptions until it reaches an accurate concept hasChild SOME TOP.

[Figure 2.2 diagram: the search starts from the example john ("Parent is john: overly specific"). MSC(john, peter) yields Male and hasChild some Person; removing axioms yields hasChild some Person and Male. Nodes are annotated as overly specific, accurate concept, or incorrect and incomplete.]

In DL/OWL learning, the top-down approach is used more widely because it can make use of the rich concept hierarchy in the ontology.

In document Symmetric parallel class expression learning : School of Engineering and Advanced Technology, Massey University, New Zealand : a thesis submitted in partial fulfilment of the requirements for the degree of Doctor of Philosophy in Computer Science (Page 46-49)