9.3 Query Execution
9.3.3 Path Type Calculation
One important task in SemRep is to calculate the relation type of paths consisting of more than one edge. Assume there is a path of length 2 between the nodes X, Y, Z with two relations r1, r2, so that the path looks as follows: X r1 Y r2 Z. The task of the path type calculation is to determine the type r of the relation between X and Z. This calculation is not always trivial: depending on the types of r1 and r2, it can hold r = r1, r = r2, or r ≠ r1 and r ≠ r2. In [62], the authors present a simple approach for indirect path type resolution, though they do not consider the types has-a and part-of and do not provide any proofs or reasons for their deductions.
In the following section, we will discuss how the path type r can be determined from the relation types r1, r2 and under which circumstances the type cannot be determined.
Given a relation type r, we denote the inverse relation type by r⁻¹ in some illustrations. To facilitate the discussion of the different cases that need to be considered, we define several path categories and use the terminology depicted in Fig. 9.7.
Homogeneous Path
If r1 = r2 holds, the path p is called a homogeneous path. Since all relation types are transitive, it holds r = r1 = r2. Thus, homogeneous paths are the simplest paths w.r.t. path type calculation. A typical example is a path leading up the lexicographic taxonomy:
SUV is-a Car is-a Vehicle → SUV is-a Vehicle
If r1 ≠ r2 holds, p is called a heterogeneous path. There are two forms of heterogeneous paths: canonical and non-canonical paths.
Canonical Path
If r1 = equal and r2 ≠ equal (or vice versa), p is called a canonical path. The type equal can be seen as the identity (neutral) element in a path. Any relation type r combined with equal leads to r, i.e., equal does not change the path type. Thus, the path type is the type of the non-equal relation (r = r2 if r1 = equal), as the following example shows:
Automobile equal Car is-a Vehicle → Automobile is-a Vehicle
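The two simple categories described so far can be sketched in a few lines of Python. This is only an illustration of the rules stated above, not SemRep's actual implementation; the function name `compose_simple` and the string labels for the relation types are assumptions made for this sketch.

```python
EQUAL = "equal"

def compose_simple(r1, r2):
    """Resolve a path of length 2 if it is homogeneous or canonical.

    Returns the path type r, or None if the path is non-canonical
    (those cases need the rules discussed in the following paragraphs).
    Intended for the types is-a, inverse is-a, has-a, part-of and equal.
    """
    if r1 == r2:
        # Homogeneous path: all relation types are transitive.
        return r1
    if r1 == EQUAL:
        # Canonical path: equal is the neutral element.
        return r2
    if r2 == EQUAL:
        return r1
    return None  # non-canonical path, not handled here


print(compose_simple("is-a", "is-a"))   # SUV is-a Car is-a Vehicle
print(compose_simple("equal", "is-a"))  # Automobile equal Car is-a Vehicle
```

Returning None for the remaining cases keeps the sketch honest about its scope: it does not guess a type for non-canonical paths.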
If r1 ≠ r2 holds and neither r1 nor r2 is equal, we call p a non-canonical path. There are again two forms of non-canonical paths: non-inverse paths and inverse paths.
Non-inverse Path
If p is a non-canonical path and r1 ≠ r2⁻¹ holds, we call p a non-inverse path. This means that one type is is-a resp. inverse is-a and the other type is has-a resp. part-of. Generally speaking, the path consists of one generalization and one aggregation relation. In the field of semantics, aggregation is considered to have a higher binding strength than generalization, so the aggregation type prevails. Thus, r is identical to the aggregation type and is either has-a or part-of. The following example illustrates this:
Laptop is-a Computer has-a CPU → Laptop has-a CPU
One important question is why the aggregation type prevails over the generalization type and whether this assumption always leads to sensible results. As already shown in Section 3.2.1, concepts are defined by different properties, and a has-a or part-of relation is such a typical property. The has-a relation is thus used to define (or specify) the concept c, and since properties are inherited by all sub-concepts c′, the has-a relation holds for all sub-concepts as well. Consider the example between Laptop and CPU as shown in Fig. 9.8. The concept Computer is defined by the property "has-a" CPU. Any sub-concept below Computer, such as Laptop, inherits this property so that the has-a relation holds as well. By contrast, the generalization type cannot hold, since the has-a relation leads to another branch of the taxonomy, as illustrated in the figure. The two branches are independent, i.e., CPU and Computer do not necessarily share any properties. For this reason, there is no semantic foundation to infer an is-a or inverse is-a relation between Laptop and CPU; only the aggregation type holds.
However, an important problem arises if the path leads to a super-concept of the concept c that defines the has-a or part-of relation. Consider the following, slightly adapted example:
Machine inverse is-a Computer has-a CPU → Machine has-a CPU
Figure 9.8: Path type calculation across branches.
Figure 9.9: Illustration of the specific case inverse is-a + is-a.
According to the previous argumentation, it holds <Machine, has-a, CPU>. This statement is no longer universally valid, because the "has-CPU" property is only defined for computers, not for concepts that are more general than Computer. The statement is not completely false either, as some machines contain a CPU, but it should rather be expressed as "There are machines that have a CPU (e.g., computers)".
We call such a case Loss of Generality and, as we will illustrate further below, it occurs in 4 of the 8 possible combinations of r1 and r2. Loss of Generality is taken into account in the confidence calculation, i.e., paths are scored lower if this phenomenon occurs. Still, it has to be remarked that most simple part-of and has-a relations are not generally valid in the first place (as described in Section 3.2.3). For instance, the relation <cellar, part-of, house> already suffers from Loss of Generality, since there are houses without a cellar and vice versa.
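The rule for non-inverse paths can be sketched as follows. This is a minimal illustration under stated assumptions, not SemRep's code: the function name `resolve_non_inverse` and the string labels are hypothetical, and the Loss of Generality flag is only set for the combinations argued in the text above (a generalization edge followed by an aggregation edge). For paths that start with the aggregation edge, the text alone does not say which combinations lose generality, so the sketch returns None for the flag there instead of guessing.

```python
GENERALIZATION = {"is-a", "inverse is-a"}
AGGREGATION = {"has-a", "part-of"}

def resolve_non_inverse(r1, r2):
    """Return (path_type, loss_of_generality) for a non-inverse path.

    path_type is always the aggregation type, which prevails.
    loss_of_generality is True/False where the text gives an argument,
    and None where it cannot be derived from the text (assumption of
    this sketch; the relation type matrix would be needed).
    """
    if r1 in GENERALIZATION and r2 in AGGREGATION:
        path_type = r2
        # is-a first: the sub-concept inherits the property (no loss).
        # inverse is-a first: the path leads to a super-concept (loss).
        loss = (r1 == "inverse is-a")
    elif r1 in AGGREGATION and r2 in GENERALIZATION:
        path_type = r1
        loss = None  # not stated in the text
    else:
        raise ValueError("not a non-inverse path: %r, %r" % (r1, r2))
    return path_type, loss


print(resolve_non_inverse("is-a", "has-a"))          # Laptop ... CPU
print(resolve_non_inverse("inverse is-a", "has-a"))  # Machine ... CPU
```

A confidence calculation could then lower the score of a path whenever the flag is True; how much it is lowered is not specified here.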
Inverse Paths
Finally, if r1 = r2⁻¹ holds, we call such a path an inverse path. There are 4 forms of inverse paths:
1. X is-a Y inverse is-a Z
2. X inverse is-a Y is-a Z
3. X part-of Y has-a Z
4. X has-a Y part-of Z
Case 1 is the classic co-hyponym case, in which X and Z are siblings. They share the properties of the parent node Y, but have some additional, distinct properties. The relation type is related. Let us assume that there is a root concept C at hierarchy level 0 and that Y is situated at level n. The greater n is, the greater the semantic overlap and thus the similarity between X and Z.
Case 2 is the inverse of case 1, but the relation between X and Z cannot be determined in this case. Obviously, it holds Y ⊂ X and Y ⊂ Z, but how X and Z relate to each other cannot be derived by means of semantic deduction. As also depicted in Fig. 9.9, the following examples lead to different results:
Fruit inverse is-a Apple is-a Plant Structure → Fruit is-a Plant Structure
Fruit inverse is-a Apple is-a Tree Fruit → Fruit inverse is-a Tree Fruit
Fruit inverse is-a Apple is-a Fructus → Fruit equal Fructus⁷⁶
Since the type cannot be uniquely determined, SemRep returns undecided in this case.
Case 3 is similar to case 1, but Y does not describe common properties that X and Z inherit. Instead, Y can be interpreted as a place where X and Z may co-occur. SemRep concludes the related type in this case, but in contrast to case 1 this decision seems less reliable. The following example shows a case where related seems to be an appropriate decision:
Student part-of School has-a Teacher → Student related Teacher
However, in the following slightly modified example this decision seems much more vague:
Student part-of School has-a Janitor → Student related Janitor
The fact that two independent concepts may occur at the same place does not always suggest a sensible related relation. In the student-janitor example, "unrelated" would be the best decision, as there is no sensible relationship between a student and a janitor.
Finally, case 4 is similar to case 2. It cannot be uniquely resolved, so SemRep returns undecided.
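The four inverse cases can be summarized in a small lookup. The following Python sketch is an illustration only; the names `INVERSE` and `resolve_inverse` are hypothetical, and has-a and part-of are treated as inverses of each other, as implied by cases 3 and 4.

```python
# Each relation type mapped to its inverse type (r -> r^-1).
INVERSE = {
    "is-a": "inverse is-a",
    "inverse is-a": "is-a",
    "part-of": "has-a",
    "has-a": "part-of",
}

def resolve_inverse(r1, r2):
    """Resolve an inverse path (r1 is the inverse of r2).

    Cases 1 and 3 lead upwards first (is-a, part-of) and yield related;
    cases 2 and 4 lead downwards first and cannot be resolved.
    """
    if INVERSE.get(r1) != r2:
        raise ValueError("not an inverse path: %r, %r" % (r1, r2))
    if r1 in ("is-a", "part-of"):
        return "related"    # case 1 (co-hyponyms) and case 3 (co-occurrence)
    return "undecided"      # case 2 and case 4


for r1, r2 in [("is-a", "inverse is-a"), ("inverse is-a", "is-a"),
               ("part-of", "has-a"), ("has-a", "part-of")]:
    print(r1, "+", r2, "->", resolve_inverse(r1, r2))
```

Note that the sketch treats cases 1 and 3 identically, although the text argues that related is a weaker conclusion in case 3; a confidence score would have to capture that difference separately.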
⁷⁶ Fructus is the Latin name for 'Fruit'.
Figure 9.10: Relation type matrix for paths of length 2.
Conclusions
In this subsection, we have described step by step the different combinations of r1 and r2 within an indirect path, and which semantic path type r can be concluded from them. The results stem from a thorough semantic study of concepts within a lexicographic hierarchy. An overview of the input types and the respective results is given in Fig. 9.10.
Dark green cells contain the relation types of homogeneous paths and light green cells the relation types of canonical (heterogeneous) paths. Non-inverse path types are marked yellow and orange, with orange describing path types with Loss of Generality. Finally, inverse types are marked blue; two of the four inverse combinations cannot be resolved (undecided).
Paths of length greater than 2 are resolved analogously. As an example, consider the following path of length 3: (SUV is-a car equal automobile is-a vehicle). In this case, the first two relations are resolved as illustrated in Fig. 9.10. Thus, the two relations <SUV, is-a, car> and <car, equal, automobile> lead to the relation <SUV, is-a, automobile>. Now, the original path is reduced to (SUV is-a automobile is-a vehicle). This path of length 2 can be resolved again using the illustrated approach. According to Fig. 9.10, it can be directly concluded: <SUV, is-a, vehicle>.
Combinations involving the type related are not handled by the approach and are generally not allowed, except for combinations with the type equal. In this case, it holds related + equal → related; otherwise, undecided is returned, because no type can be reasonably concluded.
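The reduction of longer paths can be sketched as a left-to-right fold over a pairwise composition function. The following Python sketch is a reading of the rules in this subsection, not the SemRep implementation, and Fig. 9.10 remains the authoritative matrix. The names `compose` and `resolve_path` are hypothetical. Two points are assumptions of this sketch: equal + related is treated like related + equal, and an intermediate undecided result is propagated to the end of the path.

```python
from functools import reduce

INVERSE = {"is-a": "inverse is-a", "inverse is-a": "is-a",
           "part-of": "has-a", "has-a": "part-of"}
AGGREGATION = {"has-a", "part-of"}

def compose(r1, r2):
    """Path type of a path of length 2 with relation types r1, r2."""
    if "undecided" in (r1, r2):
        return "undecided"   # assumption: undecided propagates
    if r1 == "equal":
        return r2            # equal is neutral (also covers equal + related)
    if r2 == "equal":
        return r1
    if "related" in (r1, r2):
        return "undecided"   # related only combines with equal
    if r1 == r2:
        return r1            # homogeneous path
    if INVERSE[r1] == r2:    # inverse path
        return "related" if r1 in ("is-a", "part-of") else "undecided"
    # Non-inverse path: the aggregation type prevails.
    return r1 if r1 in AGGREGATION else r2

def resolve_path(relations):
    """Reduce a non-empty list of relation types from left to right."""
    return reduce(compose, relations)


# (SUV is-a car equal automobile is-a vehicle)
print(resolve_path(["is-a", "equal", "is-a"]))
```

The fold resolves the first two relations first, as in the SUV example above. Whether a different reduction order would always yield the same type is not claimed here.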