
3.2 Computing all Canonical Covers

3.2.1 Partial Covers

The set of all canonical covers of $\Sigma$ forms a simple hypergraph on the FDs in $\Sigma^{*a}$. We may thus use the terms defined for hypergraphs for canonical covers as well. In particular, we shall talk about autonomous sets of FDs, and (partial) superedges. Note that in this context the superedges are the atomic covers, while the edges are the canonical covers.

Definition 3.27. We call a set of FDs in $\Sigma^{*a}$ autonomous if it is autonomous for the hypergraph $CC(\Sigma)$. When talking about transversals, we always mean transversals of $CC(\Sigma)$.

Lemma 3.28. A set $G \subseteq \Sigma^{*a}$ is a cover of $\Sigma$ iff it intersects with all minimal transversals of $CC(\Sigma)$.

Proof. G is a cover iff it is a superedge of CC(Σ). Furthermore, CC(Σ) is simple, and by Lemma 3.17 the edges of a simple hypergraph are the minimal sets which intersect with all minimal transversals. Thus superedges are simply sets (not necessarily minimal) which intersect with all minimal transversals.
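Lemma 3.28 can be sanity-checked by brute force on a toy instance. The sketch below is only an illustration: the four-FD set `SIGMA_ATOMIC` is assumed to be the atomic closure of {A → B, B → A, A → C}, and the helpers `fd`, `closure`, `subsets` and `minimal` are hypothetical names introduced here, not part of the text. Canonical covers are taken to be the minimal covers inside the atomic closure, as the hypergraph view suggests.

```python
from itertools import combinations

def closure(attrs, fds):
    """Attribute closure of `attrs` under `fds` (pairs of frozensets)."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return frozenset(result)

def fd(lhs, rhs):
    """Hypothetical helper: fd("A", "B") stands for the FD A -> B."""
    return (frozenset(lhs), frozenset(rhs))

def is_cover(G, sigma_atomic):
    """G is a cover iff it implies every FD of the atomic closure."""
    return all(rhs <= closure(lhs, G) for lhs, rhs in sigma_atomic)

def subsets(items):
    """All subsets of `items`, as frozensets."""
    items = list(items)
    for size in range(len(items) + 1):
        for combo in combinations(items, size):
            yield frozenset(combo)

def minimal(sets):
    """Keep only the inclusion-minimal members of a family of sets."""
    sets = list(sets)
    return [s for s in sets if not any(t < s for t in sets)]

# Assumed toy atomic closure (not taken from the text).
SIGMA_ATOMIC = frozenset({fd("A", "B"), fd("B", "A"), fd("A", "C"), fd("B", "C")})

covers = [G for G in subsets(SIGMA_ATOMIC) if is_cover(G, SIGMA_ATOMIC)]
canonical_covers = minimal(covers)  # edges of CC(Sigma)
transversals = [T for T in subsets(SIGMA_ATOMIC)
                if all(T & E for E in canonical_covers)]
minimal_transversals = minimal(transversals)

# Lemma 3.28: cover <=> intersects every minimal transversal.
lemma_holds = all(
    is_cover(G, SIGMA_ATOMIC) == all(G & T for T in minimal_transversals)
    for G in subsets(SIGMA_ATOMIC)
)
```

On this instance the two canonical covers differ only in whether they contain A → C or B → C, and the minimal transversals are {A → B}, {B → A} and {A → C, B → C}.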

As superedges become (atomic) covers for the hypergraph CC(Σ), partial superedges become partial covers.

Definition 3.29. Let $\Sigma$ be a set of FDs and $G \subseteq S \subseteq \Sigma^{*a}$. We call $G$ a partial cover of $\Sigma$ on $S$ if $G$ is a partial superedge of $CC(\Sigma)$ on $S$.

When $S$ is autonomous, testing whether a set of FDs is a partial cover on $S$ is easy:

Lemma 3.30. Let $S \subseteq \Sigma^{*a}$ be autonomous, and let $\Sigma_0 \subseteq \Sigma^{*a}$ be an atomic cover of $\Sigma$. Then a set $G \subseteq S$ is a partial cover on $S$ iff $G \cup (\Sigma_0 \setminus S)$ is a cover of $\Sigma$.

Clearly $G \cup (\Sigma_0 \setminus S)$ is a cover of $\Sigma$ iff $G \cup (\Sigma_0 \setminus S) \vDash \Sigma_0 \cap S$, which allows us to perform this test quickly.
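The test of Lemma 3.30 reduces to a handful of attribute-closure computations. The following is a minimal sketch under stated assumptions: FDs are modelled as pairs of frozensets, the set `S` passed in is assumed (not checked) to be autonomous, and `fd`, `closure` and `is_partial_cover` are hypothetical names chosen for this illustration. The toy sets are likewise assumptions.

```python
def closure(attrs, fds):
    """Attribute closure of `attrs` under `fds` (pairs of frozensets)."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return frozenset(result)

def fd(lhs, rhs):
    """Hypothetical helper: fd("A", "B") stands for the FD A -> B."""
    return (frozenset(lhs), frozenset(rhs))

def is_partial_cover(G, S, sigma0):
    """Test of Lemma 3.30; S is ASSUMED autonomous, sigma0 an atomic cover.

    G is a partial cover on S iff G together with the part of sigma0
    outside S implies the part of sigma0 inside S.
    """
    G, S, sigma0 = set(G), set(S), set(sigma0)
    candidate = G | (sigma0 - S)
    return all(rhs <= closure(lhs, candidate) for lhs, rhs in sigma0 & S)

# Toy data (assumed): an atomic cover and a set S of FDs with LHS A or B.
SIGMA0 = {fd("A", "B"), fd("B", "A"), fd("A", "C"), fd("C", "D")}
S = {fd("A", "B"), fd("B", "A"), fd("A", "C"),
     fd("B", "C"), fd("A", "D"), fd("B", "D")}

G_good = {fd("A", "B"), fd("B", "A"), fd("B", "C")}
G_bad = {fd("A", "B"), fd("A", "C")}  # cannot derive B -> A

good = is_partial_cover(G_good, S, SIGMA0)  # True
bad = is_partial_cover(G_bad, S, SIGMA0)    # False
```

Only the FDs in `sigma0 & S` need to be checked, since the FDs of `sigma0` outside `S` are contained in the candidate set by construction.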

We will identify some autonomous (but not necessarily minimal) sets of $CC(\Sigma)$. Theorem 3.18 relates autonomous sets to the minimal transversals of $CC(\Sigma)$. The following lemmas establish some results about the form of these minimal transversals.

Lemma 3.31. Let $S \subseteq \Sigma^{*a}$ be a minimal transversal of $CC(\Sigma)$ and $X \to A \in S$. Then $\overline{S} = \Sigma^{*a} \setminus S$ is not a cover of $\Sigma$, but $\overline{S} \cup \{X \to A\}$ is.

Proof. By Lemma 3.28, $\overline{S}$ is not a cover of $\Sigma$ since it does not intersect with $S$. If $\overline{S} \cup \{X \to A\}$ were not a cover, then every cover would contain an FD in

$$\overline{\overline{S} \cup \{X \to A\}} = S \setminus \{X \to A\}$$

Thus $S \setminus \{X \to A\}$ would be a transversal, which contradicts the minimality of $S$.

Definition 3.32. The sets of attributes $X$ and $Y$ are equivalent under a set of FDs $\Sigma$, written $X \leftrightarrow Y$, if $X \to Y$ and $Y \to X$ lie in $\Sigma$.

Lemma 3.33. Let $X \to A$, $Y \to B$ be contained in a common minimal transversal $S \subseteq \Sigma^{*a}$ of $CC(\Sigma)$. Then $X$ and $Y$ are equivalent under $\overline{S} = \Sigma^{*a} \setminus S$.

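Proof. By Lemma 3.31 we have

$$\overline{S} \nvDash Y \to B$$

$$\overline{S} \cup \{X \to A\} \vDash Y \to B \tag{3.1}$$

Let us denote the closure of $Y$ under $\overline{S}$ by $Y^{*}_{\overline{S}}$. If $X \nsubseteq Y^{*}_{\overline{S}}$ then

$$Y^{*}_{\overline{S}} = Y^{*}_{\overline{S} \cup \{X \to A\}}$$

which contradicts (3.1). Thus $\overline{S} \vDash Y \to X$, and by symmetry $\overline{S} \vDash X \to Y$.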

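Definition 3.34. Let $\Sigma$ be a set of FDs on $R$. We denote the set of FDs in $\Sigma^{*a}$ with LHS equivalent to $X \subseteq R$ as

$$EQ_X := \{Y \to Z \in \Sigma^{*a} \mid Y \leftrightarrow X\}$$

The partition of $\Sigma^{*a}$ into non-empty equivalence sets is denoted as

$$EQ := \{EQ_X \mid \exists Y.\, X \to Y \in \Sigma^{*a}\}$$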
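The partition EQ can be computed by grouping FDs on the closure of their left-hand sides. The sketch below assumes that equivalence of two attribute sets is judged by implication, so that two LHSs are equivalent exactly when they have the same attribute closure. The names `fd`, `closure` and `equivalence_classes`, and the toy atomic closure, are assumptions made for illustration.

```python
from collections import defaultdict

def closure(attrs, fds):
    """Attribute closure of `attrs` under `fds` (pairs of frozensets)."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return frozenset(result)

def fd(lhs, rhs):
    """Hypothetical helper: fd("A", "B") stands for the FD A -> B."""
    return (frozenset(lhs), frozenset(rhs))

def equivalence_classes(sigma_atomic):
    """Group FDs by the closure of their LHS.

    Assumption: X and Y are equivalent iff each implies the other,
    which holds iff their attribute closures coincide.
    """
    groups = defaultdict(set)
    for lhs, rhs in sigma_atomic:
        groups[closure(lhs, sigma_atomic)].add((lhs, rhs))
    return [frozenset(group) for group in groups.values()]

# Assumed atomic closure of {A -> B, B -> A, A -> C, C -> D}.
SIGMA_ATOMIC = {fd("A", "B"), fd("B", "A"), fd("A", "C"), fd("B", "C"),
                fd("C", "D"), fd("A", "D"), fd("B", "D")}

EQ = equivalence_classes(SIGMA_ATOMIC)
sizes = sorted(len(block) for block in EQ)  # [1, 6]
```

Here A and B are equivalent, so the six FDs with LHS A or B form one class, while C → D forms a class of its own.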

Theorem 3.35. Let $\Sigma$ be a set of FDs on $R$. Then every set $EQ_X \in EQ$ is autonomous.

Proof. By Lemma 3.33 all FDs in a (maximal) connected component of $Tr(CC(\Sigma))$ have equivalent LHSs under $\Sigma$. Thus $EQ_X$ is the union of vertex sets of maximal connected components of $Tr(CC(\Sigma))$, and therefore an isolated set of $Tr(CC(\Sigma))$. By Theorem 3.18 isolated sets of $Tr(CC(\Sigma))$ are autonomous for $CC(\Sigma)$.

We are now ready to prove our main theorem for this section.

Theorem 3.36. Let $\Sigma$ be a set of FDs on $R$. A set $G \subseteq \Sigma^{*a}$ is a cover of $\Sigma$ iff $G \cap EQ_X$ is a partial cover on $EQ_X$ for every $EQ_X \in EQ$.

Proof. By Theorem 3.35 the sets $EQ_X$ form a partition of $\Sigma^{*a}$ into autonomous sets, so the theorem is a special case of Lemma 3.22.
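Reading Theorem 3.36 as a blockwise test (a set is a cover iff its restriction to each equivalence class is a partial cover on that class), we can compare it against the direct cover test on a small instance. This is a brute-force sketch, not an efficient procedure: it uses the whole atomic closure as the atomic cover required by Lemma 3.30, and all function names and the toy data are assumptions introduced here.

```python
from collections import defaultdict
from itertools import combinations

def closure(attrs, fds):
    """Attribute closure of `attrs` under `fds` (pairs of frozensets)."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return frozenset(result)

def fd(lhs, rhs):
    """Hypothetical helper: fd("A", "B") stands for the FD A -> B."""
    return (frozenset(lhs), frozenset(rhs))

def implies_all(fds, targets):
    return all(rhs <= closure(lhs, fds) for lhs, rhs in targets)

def equivalence_classes(sigma_atomic):
    """Group FDs by LHS closure (equivalence judged by implication)."""
    groups = defaultdict(set)
    for lhs, rhs in sigma_atomic:
        groups[closure(lhs, sigma_atomic)].add((lhs, rhs))
    return [frozenset(group) for group in groups.values()]

def blockwise_is_cover(G, sigma_atomic):
    """Check every block of EQ with the test of Lemma 3.30,
    using sigma_atomic itself as the atomic cover."""
    for block in equivalence_classes(sigma_atomic):
        candidate = (G & block) | (sigma_atomic - block)
        if not implies_all(candidate, block):
            return False
    return True

def subsets(items):
    items = list(items)
    for size in range(len(items) + 1):
        for combo in combinations(items, size):
            yield frozenset(combo)

# Assumed atomic closure of {A -> B, B -> A, A -> C, C -> D}.
SIGMA_ATOMIC = frozenset({fd("A", "B"), fd("B", "A"), fd("A", "C"),
                          fd("B", "C"), fd("C", "D"), fd("A", "D"),
                          fd("B", "D")})

# Compare the blockwise test with the direct test on all 128 subsets.
agree = all(
    blockwise_is_cover(G, SIGMA_ATOMIC) == implies_all(G, SIGMA_ATOMIC)
    for G in subsets(SIGMA_ATOMIC)
)
```

The point of the blockwise form is that each block can be searched for partial covers independently of the others.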

Theorem 3.36 allows us to split the task of finding and representing all canonical covers of $\Sigma$ into several smaller tasks. For every $EQ_X \in EQ$ we find the set $C_X$ of all non-redundant partial covers on $EQ_X$. By Theorem 3.10 these describe $CC(\Sigma)$ in decomposed form:

$$CC(\Sigma) = C_{X_1} \vee \ldots \vee C_{X_n}$$

While we could easily compute $CC(\Sigma)$ by taking their cross-union, this decomposed description of $CC(\Sigma)$ is usually much smaller, and thus better suited for most tasks.
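To make the size difference concrete, the cross-union can be expanded with a Cartesian product: one partial cover is picked from each family and the picks are united. The sketch below assumes the cross-union of families of sets is defined this way. The function name `cross_union` and the two families, written by hand for a toy schema with FDs as plain strings, are illustrative assumptions.

```python
from itertools import product

def cross_union(families):
    """Expand a decomposed description into the full family of covers.

    `families` is a list of families of frozensets; the result holds
    one union per way of choosing a member from every family.
    """
    return {frozenset().union(*choice) for choice in product(*families)}

# Hand-written toy families (assumed): partial covers on two blocks.
C_A = [frozenset({"A->B", "B->A", "A->C"}),
       frozenset({"A->B", "B->A", "B->C"})]
C_C = [frozenset({"C->D"})]

canonical_covers = cross_union([C_A, C_C])
n_expanded = len(canonical_covers)                  # 2
n_decomposed = sum(len(f) for f in (C_A, C_C))      # 3
```

The decomposed form stores the sum of the family sizes, whereas the expanded form can contain up to their product, so the gap widens quickly once several blocks each admit more than one partial cover.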

Note that the equivalence classes $EQ_X$ need not be minimal autonomous sets of $CC(\Sigma)$. If we could find smaller autonomous sets we could speed up the computation of $CC(\Sigma)$ even more. However, we will show later (Theorem 3.81) that finding the minimal autonomous sets of $CC(\Sigma)$ is hard. In Section 3.4 we will give an algorithm to find finer, but not necessarily minimal autonomous sets.