4.6 Aboutness Systems: Example of the Flat Document Vector Space Model
4.6.3 Rules
The next step is to define the aboutness proof system for the simple vector space model. First, the vector space aboutness decision needs to be defined. According to [Huibers, 1996], given a document d represented by the set of descriptors χ(d) and a query q represented by χ(q), d is about q if rsv(χ(d), χ(q)) > 0.¹ In terms of the vector space model, this implies that the vectors for d and q have a non-zero entry in at least one common position, i.e. they share at least one index term. Both [Huibers, 1996] and [Wong et al., 2001] have shown that a document is about a query in the simple vector space model if they share information. For the Situation Theory framework of [Huibers, 1996], this entails the proposition that rsv(χ(d), χ(q)) > 0 if and only if χ(d) ∩ χ(q) ≢ ∅. We reuse this proposition in the discussion of the aboutness rules.
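The proposition can be illustrated with a minimal sketch. It assumes binary index-term vectors and an unnormalised inner product as rsv; the function names and the small vocabulary are hypothetical and serve only as an illustration.

```python
def is_about(doc_terms, query_terms):
    """Aboutness decision of the proposition: the descriptor sets overlap."""
    return len(set(doc_terms) & set(query_terms)) > 0


def rsv_dot(doc_terms, query_terms, vocabulary):
    """Inner product of binary term vectors (an assumed, unnormalised rsv)."""
    d = [1 if term in doc_terms else 0 for term in vocabulary]
    q = [1 if term in query_terms else 0 for term in vocabulary]
    return sum(di * qi for di, qi in zip(d, q))


# Hypothetical vocabulary and descriptor sets.
vocabulary = ["house", "garden", "garage", "car"]
doc = {"house", "garden"}
query = {"house", "garage"}
unrelated = {"car"}

# rsv > 0 exactly when the descriptor sets share an index term.
overlap_case = (rsv_dot(doc, query, vocabulary) > 0, is_about(doc, query))
disjoint_case = (rsv_dot(doc, unrelated, vocabulary) > 0, is_about(doc, unrelated))
```

In both cases the two components of the tuple agree, which is what the proposition states for this kind of rsv.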
For vector space retrieval, we would like to exclude Reflexivity in order to avoid the logical anomalies described in Section 4.4. Singleton Reflexivity, in contrast, is given for vector space retrieval. We have to show that map(A) ≡ {φ} and map(B) ≡ {φ} imply rsv(A, B) > 0, where A and B are sets of index terms. This is the case if some index term belongs to both A and B. Since φ is a member of both, A ∩ B ≢ ∅, and Singleton Reflexivity is given according to the proposition.
Transitivity does not hold, as the example of S ≡ {⟨⟨house⟩⟩, ⟨⟨garden⟩⟩}, T ≡ {⟨⟨house⟩⟩, ⟨⟨garage⟩⟩} and U ≡ {⟨⟨garage⟩⟩, ⟨⟨car⟩⟩} shows. Here S □→ T and T □→ U, but not S □→ U, as the index term sets of S and U do not overlap. Thus, Transitivity is not given.
¹ Strictly speaking, this is a different function from the rsv above, as the arguments are different, but giving it a different name would have made the background less readable. In future aboutness discussions, too, we use rsv for all functions that deliver the retrieval status value.
Symmetry is given. Say S ≡ map(A) and T ≡ map(B). With the premise S □→ T, we want to conclude that T □→ S. This is straightforward: S □→ T means A ∩ B ≢ ∅, and since ∩ is commutative, B ∩ A ≢ ∅ as well. Thus, Symmetry is given.
Set Equivalence is given. Let us assume that map(A) ≡ map(B) and map(A) □→ map(C) are given. We have to show that map(B) □→ map(C) follows. From the premises, we know by the definition of map that A ≡ B and A ∩ C ≢ ∅, which implies B ∩ C ≢ ∅. This proves that Set Equivalence holds.
If Euclid were a property of the aboutness system, we would be able to derive T □→ U from S □→ T and S □→ U. Say that S ≡ map(A), T ≡ map(B) and U ≡ map(C). Then A ∩ B ≢ ∅ and A ∩ C ≢ ∅. However, this does not mean that B and C overlap in information, as the following example demonstrates: let us assume that map(A) ≡ {⟨⟨garden⟩⟩, ⟨⟨house⟩⟩}, map(B) ≡ {⟨⟨garden⟩⟩, ⟨⟨car⟩⟩} and map(C) ≡ {⟨⟨house⟩⟩}. Then B ∩ C ≡ ∅, and Euclid is not given.
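The counterexamples for Transitivity and Euclid, and the Symmetry argument, can be replayed with a short sketch. It assumes that aboutness is decided purely by set overlap, as in the proposition; the helper name about is hypothetical.

```python
def about(a, b):
    """Assumed aboutness decision: the two index-term sets overlap."""
    return bool(set(a) & set(b))


# Transitivity example from the text.
S = {"house", "garden"}
T = {"house", "garage"}
U = {"garage", "car"}
transitivity_premises = about(S, T) and about(T, U)
transitivity_conclusion = about(S, U)

# Euclid example from the text.
A = {"garden", "house"}
B = {"garden", "car"}
C = {"house"}
euclid_premises = about(A, B) and about(A, C)
euclid_conclusion = about(B, C)

# Symmetry: set intersection is commutative, so the direction is irrelevant.
symmetry_holds = about(S, T) == about(T, S)
```

In both counterexamples the premises hold while the conclusion fails, so neither rule can be a property of the system.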
Next, the combination rules are examined. In order to prove that Left Monotonic Union holds, we need to find out whether S ⊗ U □→ T is given if we know that S □→ T. Let us assume that S ≡ map(A), T ≡ map(B) and S ⊗ U ≡ map(C). Then A ∩ B ≢ ∅, as S is about T, and C ⊇ A by the definition of map. It follows that C ∩ B ≢ ∅, and Left Monotonic Union is given.
For Right Monotonic Union, we need to show that S □→ T implies S □→ T ⊗ U. Let us assume that S ≡ map(A), T ≡ map(B) and T ⊗ U ≡ map(C). Then A ∩ B ≢ ∅, as S is about T, and C ⊇ B follows from the definition of map. Therefore A ∩ C ≢ ∅, and Right Monotonic Union is given.
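Both monotonic unions can also be checked exhaustively on a small vocabulary. The following sketch assumes that composition ⊗ corresponds to the union of the underlying index-term sets, which is one way to satisfy C ⊇ A and C ⊇ B; the function names and the vocabulary are hypothetical.

```python
from itertools import combinations


def about(a, b):
    """Assumed aboutness decision: the two index-term sets overlap."""
    return bool(a & b)


def nonempty_subsets(vocabulary):
    """All non-empty subsets of the vocabulary as frozensets."""
    items = sorted(vocabulary)
    return [
        frozenset(combo)
        for size in range(1, len(items) + 1)
        for combo in combinations(items, size)
    ]


def monotonic_unions_hold(vocabulary):
    """Check Left and Right Monotonic Union for every triple of subsets."""
    subsets = nonempty_subsets(vocabulary)
    for a in subsets:
        for b in subsets:
            if not about(a, b):
                continue  # the premise S about T does not hold
            for u in subsets:
                if not about(a | u, b):  # Left Monotonic Union
                    return False
                if not about(a, b | u):  # Right Monotonic Union
                    return False
    return True


result = monotonic_unions_hold({"house", "garden", "garage", "car"})
```

The check is only a finite illustration; the general argument is the superset reasoning given in the text.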
Cut would allow us to state S □→ U, given that S ⊗ T □→ U and S □→ T. Let us assume that S ≡ map(A), T ≡ map(B) and U ≡ map(C). Then (A ∪ B) ∩ C ≢ ∅ and A ∩ B ≢ ∅. Yet this does not necessarily mean that A ∩ C ≢ ∅. Cut is not given.
Right Weakening is also not given. From {⟨⟨car⟩⟩} □→ {⟨⟨house⟩⟩, ⟨⟨car⟩⟩}, we cannot conclude {⟨⟨car⟩⟩} □→ {⟨⟨house⟩⟩}.
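A concrete counterexample makes the failure of Cut visible, and the Right Weakening example from the text can be replayed in the same way. The sketch assumes set overlap as the aboutness decision and set union for ⊗; the sets chosen for Cut are hypothetical and do not come from the source.

```python
def about(a, b):
    """Assumed aboutness decision: the two index-term sets overlap."""
    return bool(set(a) & set(b))


# Cut: hypothetical sets for which both premises hold.
A = {"house", "garden"}   # underlying set of S
B = {"garden", "car"}     # underlying set of T
C = {"car"}               # underlying set of U
cut_premises = about(A | B, C) and about(A, B)
cut_conclusion = about(A, C)

# Right Weakening: the example from the text.
rw_premise = about({"car"}, {"house", "car"})
rw_conclusion = about({"car"}, {"house"})
```

In both cases the premises are satisfied while the conclusion is not.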
Mix is supported, since Left Monotonic Union is supported and Mix is a special case of it with the additional knowledge that the added information is also about the query. Similarly, Context-Free And is supported, as Right Monotonic Union is supported.
Deep containment is not given for our simple vector space model. Thus, Containment, Containment Composition, Absorption, Right Containment Monotonicity, Non-conflict-containment, Closed World Aboutness Assumption and Containment Preclusion are all supported only for surface containment in the model. We defined that a situation S contains a situation T if their underlying descriptor sets A and B share at least one information item. Then, obviously, A ∩ B ≢ ∅.
Absorption follows from the definitions of composition and containment in map. Right Containment Monotonicity is given, as Right Monotonic Union is given. As preclusion is not defined for the simple vector space model, Non-conflict-containment and Containment Preclusion are both not applicable. The Closed World Assumption is also not given, because two situations might be in no containment relationship but still share index terms.
Because preclusion is not defined for the simple vector space model, all the other non-aboutness rules using it are not applicable: Mutual Preclusion, Guarded Left Monotonicity, Guarded Right Monotonicity, Qualified Left Monotonicity and Qualified Right Monotonicity. There is no inherent way for the simple vector space model to control or qualify its monotonic behaviour. It cannot support conservative monotonicity.
For the non-aboutness rules, we have already excluded Mutual Preclusion. Simple Anti-Aboutness is more of a statement than a rule: we state that we consider it to be anti-aboutness if two situations are not about each other. We can show that Simple Anti-Aboutness is the only way for the simple vector space model to support anti-aboutness.
Negation Rational is clearly not given for the model. With it, from S □↛ T we could conclude that S □↛ T ⊗ U. However, although {⟨⟨car⟩⟩} □↛ {⟨⟨house⟩⟩}, we can still conclude {⟨⟨car⟩⟩} □→ {⟨⟨house⟩⟩, ⟨⟨car⟩⟩}. Strict Negation Rational is more of a statement, with which we would like to control the behaviour of systems that support Negation Rational in order to avoid inconsistencies, as shown in Section 4.4.4. As Negation Rational is not given, neither is Strict Negation Rational. Therefore, the only non-aboutness rule that could hold is Simple Anti-Aboutness, if we decide that a non-overlap of information means a contradiction in the information. This would, however, be a rather strong assumption, as, for example, ⟨⟨house⟩⟩ and ⟨⟨garden⟩⟩ do not overlap 'syntactically' but can be informationally related.
Thus, we are not able to control the monotonic behaviour of the model using preclusion, anti-aboutness or its other rules. There are, however, many other ways of controlling the monotonic behaviour of an IR model. A commonly used method is to introduce a threshold θ > 0, so that d is about q only if rsv(χ(d), χ(q)) > θ. We call such a vector space model a thresholded vector space model [Wong et al., 2001]. Thresholds are an enhancement to the original model developed by Salton. We now briefly analyse some reasoning changes introduced by such a threshold.
Using the example of this model, we would like to introduce the notion of conditionally supported rules, as presented in [Wong et al., 2001]. This time we only have to investigate those rules that are already supported by the simple vector space model, as we said in Section 4.1 that conditions do not create new aboutness behaviour but constrain existing behaviour.
Singleton Reflexivity is still fully supported by the thresholded vector space model. We have to show that, under the premises map(A) ≡ {φ} and map(B) ≡ {φ}, rsv(A, B) > θ holds. This is the case, as rsv(A, B) = 1, which is the maximum rsv and therefore exceeds any threshold θ < 1.
Similarly, for Symmetry: if the overlap of information is big enough to guarantee S □→ T, then it must also be big enough to ensure T □→ S, as $t_i$ and $u_i$ are interchangeable in rsv without changing its overall value. Symmetry is still fully supported.
The last of the simple rules supported by the simple vector space model is Set Equivalence. It is fully supported by the thresholded vector space model because we have not changed the equivalence relation. No formal proof is necessary.
For Left Monotonic Union, we can say that rsv(A, B) > θ, as S □→ T. The question is whether aboutness is preserved if we extend S to S ⊗ U, i.e. whether S ⊗ U □→ T. Looking at rsv from page 54, rsv could easily fall below the threshold θ if the impact of the extension is stronger on the denominator $\sqrt{\sum_{i=1}^{n} t_i^2 \times \sum_{i=1}^{n} u_i^2}$ than on the numerator $\sum_{i=1}^{n} t_i \times u_i$. Thus, Left Monotonic Union is now only conditionally given.
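The effect of the threshold on Left Monotonic Union can be illustrated numerically. The sketch below assumes that rsv is the cosine measure suggested by the numerator and denominator mentioned in the text, that the vectors are binary, and that ⊗ adds the index terms of U to S; the threshold value 0.6 and all names are hypothetical.

```python
import math


def cosine_rsv(t, u):
    """Cosine of two term vectors; returns 0.0 if either vector is all zeros."""
    numerator = sum(ti * ui for ti, ui in zip(t, u))
    denominator = math.sqrt(sum(ti * ti for ti in t) * sum(ui * ui for ui in u))
    return numerator / denominator if denominator else 0.0


def about(t, u, theta):
    """Thresholded aboutness decision: rsv must exceed theta."""
    return cosine_rsv(t, u) > theta


# Assumed vector positions: house, garden, garage, car.
s = [1, 0, 0, 0]           # S contains only house
t = [1, 0, 0, 0]           # T contains only house
s_extended = [1, 1, 1, 1]  # S ⊗ U, where U adds garden, garage and car
theta = 0.6                # hypothetical threshold

before = about(s, t, theta)           # premise: S is about T
after = about(s_extended, t, theta)   # conclusion of Left Monotonic Union
symmetric = math.isclose(cosine_rsv(s_extended, t), cosine_rsv(t, s_extended))
```

Here the extension leaves the numerator at 1 but doubles the denominator, so the rsv drops from 1.0 to 0.5 and falls below the assumed threshold. The last line reflects the Symmetry argument: swapping the arguments does not change the value.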
Right Monotonic Union is also conditionally supported for the thresholded vector space model according to [Wong et al., 2001]. Both monotonic unions are only conditionally supported, as the respective threshold has to be passed. We have thus shown the impact of thresholds on the combination rules in particular, where information is added. We skip the remaining rules from Section 4.4, as we discuss the plain vector space model only as an example for our method.