3.4 Labeling Conjunctive Queries
3.4.1 Single-Atom Case
We will show in this section that U_atom is decomposable. Consequently, the discussion and labeling algorithm from Section 3.3.2 apply directly. The set {{S_i} | S_i ∈ S}, composed of singleton sets containing each of the security views, serves as a generating set for the labeler. For a complete end-to-end labeling algorithm, we only need to define implementations of the two subroutines introduced at the beginning of Section 3.3. The first determines, given 𝒱, 𝒱′ ⊆ U_atom, whether 𝒱 ⪯ 𝒱′. The second computes the GLB function: given 𝒱 and 𝒱′, it finds a 𝒱″ such that (⇓𝒱) ⊓ (⇓𝒱′) = (⇓𝒱″).
Determining whether 𝒱 ⪯ 𝒱′ can be done using standard techniques from the literature on equivalent view rewriting, such as Compton’s algorithm [24]. When 𝒱 is a set of single-atom views, the following criterion is both necessary and sufficient.
Theorem 3.4.2 (Ordering Single-Atom Views). 𝒱 ⪯ 𝒱′ precisely when, for each V ∈ 𝒱, there exist a view V′ ∈ 𝒱′ and a homomorphism θ such that (a) θV′ = V, (b) θ maps existential variables to existential variables, (c) if x is an existential variable and y ≠ x then θy ≠ θx, and (d) θ maps constants to themselves.
Proof of Theorem 3.4.2. Suppose 𝒱 ⪯ 𝒱′. Then for every single-atom view V ∈ 𝒱 there exists a conjunctive rewriting R′ over the views in 𝒱′ whose expansion R′⁺ is equivalent to V, with homomorphisms θ : V → R′⁺ and θ′ : R′⁺ → V. Since V is a single-atom query which is homomorphic to R′⁺, we may assume without loss of generality (by eliminating redundant body atoms) that R′ is a single-atom query as well, and therefore there is a single-atom view V′ ∈ 𝒱′ such that {V} ⪯ {V′}.
We now know that, by performing a substitution on the distinguished variables of V′, we can obtain a query R′⁺ that is homomorphic to V. Such a substitution cannot force an equality constraint between an existential variable of R′⁺ and any other variable, so (b) and (c) must hold. Every variable that is distinguished in R′ must also be distinguished in R′⁺, so constraint (a) is also satisfied. Finally, a substitution on distinguished variables cannot affect the constants of R′⁺, so constraint (d) holds.
Conversely, a query obtained by performing a substitution on the distinguished variables of R′⁺, which therefore leaves the remaining variables unchanged, immediately satisfies conditions (a), (b), and (d). Condition (c) follows from the fact that substitution is defined in a manner that prevents capture of existential variables.
Given single-atom views V and V′, there is only one possible choice of θ that maps the unique body atom of V′ to that of V. To determine whether {V} ⪯ {V′}, it suffices to check whether θ satisfies the conditions of Theorem 3.4.2; this can be done in linear time.
More generally, if 𝒱 and 𝒱′ are sets of single-atom views then the preceding theorem tells us that 𝒱 ⪯ 𝒱′ if and only if for every V ∈ 𝒱 there is some V′ ∈ 𝒱′ such that {V} ⪯ {V′}; this check can be performed in O(|𝒱| · |𝒱′|) time.
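The check described above can be sketched in a few lines of Python. This is only an illustrative sketch under an assumed encoding that does not come from the text: a term is a pair ('c', value), ('d', name), or ('e', name) for a constant, distinguished variable, or existential variable, and a single-atom view is a pair (relation name, tuple of terms). The function names leq_single and leq are hypothetical.

```python
from collections import Counter

# Assumed encoding (hypothetical, for illustration only):
#   term = ('c', value) | ('d', name) | ('e', name)
#   view = (relation_name, tuple_of_terms)

def leq_single(v, v_prime):
    """True if {v} is below {v_prime} by the criterion of Theorem 3.4.2."""
    (rel, terms), (rel_p, terms_p) = v, v_prime
    if rel != rel_p or len(terms) != len(terms_p):
        return False
    theta = {}  # the only candidate mapping, built position by position
    for src, dst in zip(terms_p, terms):
        if src[0] == 'c':
            if src != dst:          # (d) constants map to themselves
                return False
            continue
        if theta.setdefault(src, dst) != dst:
            return False            # theta must be a function
        if src[0] == 'e' and dst[0] != 'e':
            return False            # (b) existential -> existential
    # (c) the image of an existential variable is shared with no other variable
    image_count = Counter(theta.values())
    return all(image_count[theta[x]] == 1 for x in theta if x[0] == 'e')

def leq(views, views_prime):
    """Set-level check: every view on the left is below some view on the right."""
    return all(any(leq_single(v, vp) for vp in views_prime) for v in views)
```

Each call to leq_single makes one pass over the argument positions plus one pass over the mapping, so it runs in time linear in the arity; leq performs at most one such call per pair of views.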
We next turn our attention to the problem of computing Greatest Lower Bounds, or GLBs. Our approach is based on a procedure GLBSingleton for computing the GLB of two singleton sets of views {V} and {V′}; this can be extended to multi-element sets of views in a manner to be explained shortly.
GLBSingleton is based on the idea of unification. It begins by computing a generalized Most General Unifier [12], or MGU, of the bodies of V and V′. This is computed by a subroutine called GenMGU, which differs from a standard MGU computation in three ways. First, if the algorithm attempts to unify a constant with an existential variable, the unification fails. Second, if the algorithm attempts to unify an existential variable with an existential or distinguished variable, the result is an existential variable. Third, if the algorithm attempts to unify two distinguished variables, the result is another distinguished variable. We explain these differences using some examples.
First, we show why the unification of a constant with an existential variable must fail.
Example 3.4.3. Consider the following Boolean views:
V13() :− M(9, 'Jim')
V14() :− M(x, y)
The first view tests whether Meeting contains a particular tuple and the second checks whether it contains any tuples at all. The standard MGU of the body atoms is equal to the first atom, but the actual GLB of the views should be ⊥: there is no single-atom query that can be rewritten in terms of V13 and also in terms of V14. This is a consequence of conditions (b) and (d) in Theorem 3.4.2.
Next, we illustrate the reasons for our handling of existential and distinguished variables.
Example 3.4.4. Consider views V6 and V7 from Figure 3.4:
V6(x, y) :− C(x, y, z)
V7(x, z) :− C(x, y, z)
In our new representation they become [C(x_d, y_d, z_e)] and [C(x_d, y_e, z_d)] respectively. Their GenMGU is [C(x_d, y_e, z_e)], i.e. V9 from Figure 3.4. This makes intuitive sense, as V9, the projection on the first attribute of Contact, accurately represents the overlap between V6 and V7, i.e. the information that can be computed from either V6 or V7 in isolation.
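To make the annotated representation concrete, the following sketch converts a single-atom rule into a tagged atom. The encoding is an assumption made for illustration: variables are wrapped in a hypothetical Var type, every other argument is treated as a constant, and tags 'd', 'e', and 'c' mark distinguished variables, existential variables, and constants.

```python
from typing import NamedTuple

class Var(NamedTuple):
    """Hypothetical wrapper marking an argument as a variable."""
    name: str

def annotate(head_args, body_rel, body_args):
    """Tag each body argument as distinguished ('d'), existential ('e'),
    or constant ('c'), depending on whether it appears in the head."""
    head_vars = {a.name for a in head_args if isinstance(a, Var)}
    tagged = []
    for a in body_args:
        if isinstance(a, Var):
            tagged.append(('d' if a.name in head_vars else 'e', a.name))
        else:
            tagged.append(('c', a))
    return (body_rel, tuple(tagged))
```

For instance, annotating V6(x, y) :− C(x, y, z) tags x and y as distinguished and z as existential, matching the bracketed form used in the example.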
Once GenMGU is available, an extra check is needed to rule out some corner cases as shown in the next example.
Example 3.4.5. Consider the following Boolean views:
V14() :− M(x, y)
V15() :− M(z, z)
The GenMGU of the body atoms is [M(w_e, w_e)], but the GLB should be ⊥ by the same reasoning as in Example 3.4.3. This is a consequence of condition (c) of Theorem 3.4.2.
The check to eliminate such cases is conceptually straightforward. It involves finding situations where computing GenMGU forces a new equality constraint on two values in the same original atom, and where at least one of these values was an existential variable. If we find such a situation or if GenMGU fails, GLBSingleton returns ⊥; otherwise it returns the output of GenMGU.
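One way to realize GenMGU together with this extra check for a pair of single-atom views is sketched below. It is not the text's implementation: it assumes the tagged encoding ('c', value), ('d', name), ('e', name) for terms and (relation name, tuple of terms) for views, uses a small union-find over argument positions, returns None in place of ⊥, and names result variables v0, v1, ... by first occurrence. All of these choices are assumptions.

```python
def glb_singleton(v, v_prime):
    """Sketch of GLBSingleton for two single-atom views; None stands for ⊥."""
    (rel, terms), (rel_p, terms_p) = v, v_prime
    if rel != rel_p or len(terms) != len(terms_p):
        return None
    parent = {}

    def find(n):
        parent.setdefault(n, n)
        while parent[n] != n:
            parent[n] = parent[parent[n]]  # path halving
            n = parent[n]
        return n

    def node(side, term):
        # Constants are shared; variables are kept apart by their source atom.
        return term if term[0] == 'c' else (side,) + term

    for s, t in zip(terms, terms_p):       # unify position by position
        parent[find(node('L', s))] = find(node('R', t))

    classes = {}
    for n in list(parent):
        classes.setdefault(find(n), []).append(n)

    for members in classes.values():
        consts = [m for m in members if m[0] == 'c']
        exists = [m for m in members if m[0] != 'c' and m[1] == 'e']
        # GenMGU fails on clashing constants or constant-with-existential.
        if len(consts) > 1 or (consts and exists):
            return None
        # Extra check: no new equality inside one original atom that
        # involves an existential variable of that atom.
        for side in ('L', 'R'):
            same = [m for m in members if m[0] == side]
            if len(same) > 1 and any(m[1] == 'e' for m in same):
                return None

    result_term, out = {}, []
    for s in terms:
        root = find(node('L', s))
        if root not in result_term:
            members = classes[root]
            consts = [m for m in members if m[0] == 'c']
            if consts:
                result_term[root] = consts[0]
            else:
                kind = 'e' if any(m[1] == 'e' for m in members) else 'd'
                result_term[root] = (kind, f"v{len(result_term)}")
        out.append(result_term[root])
    return (rel, tuple(out))
```

On the inputs of Examples 3.4.3 and 3.4.5 this sketch returns None, and on V6 and V7 from Example 3.4.4 it returns an atom with one distinguished and two existential variables, i.e. V9 up to renaming.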
We next verify the correctness of the process outlined above.
Theorem 3.4.6. V_S = GLBSingleton(V, V′) satisfies ⇓{V} ⊓ ⇓{V′} = ⇓{V_S}.
Proof Sketch. GLBSingleton attempts to compute the Most General Unifier of V and V′. However, it prevents each existential variable in V from being unified with any other variable or constant. This ensures that V_S can be obtained by performing a substitution on the distinguished variables in V, and therefore {V_S} ⪯ {V}, so that ⇓{V_S} ⪯ ⇓{V}. An analogous argument shows that ⇓{V_S} ⪯ ⇓{V′}. It follows that
⇓{V_S} ⪯ ⇓{V} ⊓ ⇓{V′}
For the other direction, suppose that {V″} ⪯ {V} and {V″} ⪯ {V′} for some V″ ∈ U_atom. Then there must be a substitution on the distinguished variables in V (resp. V′) that yields an atom that is isomorphic to V″. In particular, if there is a constant or existential variable at some position in the body of V (resp. V′) then there must be a constant or existential variable at the same position in V″. Similarly, if the terms at two different positions are equal in V (resp. V′) then the terms at the same positions in V″ must also be equal.
This means that V_S and V″ are isomorphic on the existential variables and constants of V_S. The remaining variables in V_S are all distinguished, so by performing a substitution on the distinguished variables of V_S, we can obtain a view that is isomorphic to V″, and therefore {V″} ⪯ {V_S}. It follows that
⇓{V} ⊓ ⇓{V′} ⪯ ⇓{V_S}
and therefore the two must be equal, completing the proof.
GLBSingleton can be extended to non-singleton sets for a complete implementation of GLB(𝒱, 𝒱′). We simply compute the pairwise GLBSingleton of singleton sets containing each pair of views V ∈ 𝒱, V′ ∈ 𝒱′ and union all the results together. This completes the description of GLB, giving us the last tool we need to label queries using the techniques from Section 3.3.2.
Given 𝒱, 𝒱′ ⊆ U_atom, a set 𝒱″ ⊆ U_atom such that (⇓𝒱) ⊓ (⇓𝒱′) = (⇓𝒱″) can be computed as follows:
procedure GLB(𝒱, 𝒱′)
    𝒱″ ← ∅
    for each view V ∈ 𝒱 do
        for each view V′ ∈ 𝒱′ do
            if GLBSingleton(V, V′) ≠ ⊥ then
                𝒱″ ← 𝒱″ ∪ {GLBSingleton(V, V′)}
            end if
        end for
    end for
    return 𝒱″
end procedure
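The procedure translates almost directly into executable form. The sketch below is a hedged illustration rather than a definitive implementation: it takes the singleton routine as a parameter (here called glb_singleton, a hypothetical name), assumes that routine returns None in place of ⊥, and assumes views are hashable so that the result can be a Python set.

```python
def glb(views, views_prime, glb_singleton):
    """Union of the pairwise singleton GLBs, skipping pairs whose GLB is ⊥.

    glb_singleton is assumed to return None for ⊥ and a hashable view otherwise.
    """
    result = set()
    for v in views:
        for v_p in views_prime:
            g = glb_singleton(v, v_p)   # computed once per pair
            if g is not None:
                result.add(g)
    return result
```

The loop calls the singleton routine exactly once for every pair of views, so the total cost is the number of pairs times the cost of one singleton computation. Passing the routine as an argument keeps the sketch independent of any particular view encoding.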
In order to verify the correctness of this procedure, we must first show that U_atom is decomposable, in the sense discussed above.
Proposition 3.4.7. Suppose that V ∈ U_atom and 𝒱, 𝒱′ ⊆ U_atom. If {V} ⪯ 𝒱 ∪ 𝒱′ then either {V} ⪯ 𝒱 or else {V} ⪯ 𝒱′.
Proof. Suppose {V} ⪯ 𝒱 ∪ 𝒱′. Then there must be a rewriting R using the views in 𝒱 ∪ 𝒱′ that is homomorphic to V. Since V contains exactly one body atom, we may assume without loss of generality (by folding R⁺ if needed) that the expansion R⁺ of R contains exactly one body atom. If this body atom originates from 𝒱 then {V} ⪯ 𝒱. On the other hand, if the body atom originates from 𝒱′ then {V} ⪯ 𝒱′.
We next verify the correctness of the GLB procedure shown above.
Theorem 3.4.8. GLB(𝒱, 𝒱′) returns a set 𝒱_S such that (⇓𝒱) ⊓ (⇓𝒱′) = (⇓𝒱_S).
Proof. 𝒱_S is a set of views of the form GLBSingleton(V, V′) where V ∈ 𝒱 and V′ ∈ 𝒱′. For each pair of views we have
{GLBSingleton(V, V′)} ⪯ {V} ⪯ 𝒱 ⪯ ⇓𝒱 and {GLBSingleton(V, V′)} ⪯ {V′} ⪯ 𝒱′ ⪯ ⇓𝒱′
so that each such singleton is a lower bound of both ⇓𝒱 and ⇓𝒱′. Hence
⇓𝒱_S ≡ ⋃_{V ∈ 𝒱, V′ ∈ 𝒱′} ⇓{GLBSingleton(V, V′)} ⪯ (⇓𝒱) ⊓ (⇓𝒱′)
In the other direction, suppose 𝒱″ ⪯ (⇓𝒱) ⊓ (⇓𝒱′), and let V″ ∈ 𝒱″. Then {V″} ⪯ (⇓𝒱), so there must be some V ∈ 𝒱 such that {V″} ⪯ {V} because U_atom is decomposable. Similarly, {V″} ⪯ (⇓𝒱′), so there must be some V′ ∈ 𝒱′ such that {V″} ⪯ {V′}. It follows that {V″} ⪯ {GLBSingleton(V, V′)} ⪯ (⇓𝒱_S). Taking the union over all V″ ∈ 𝒱″, we conclude that
𝒱″ = ⋃_{V″ ∈ 𝒱″} {V″} ⪯ (⇓𝒱_S)
and therefore 𝒱_S is a greatest lower bound, as required.