
3.2 Computing all Canonical Covers

3.2.1 Partial Covers

The set of all canonical covers of Σ forms a simple hypergraph on the FDs in Σ∗a. We may thus use the terms defined for hypergraphs for canonical covers as well. In particular, we shall talk about autonomous sets of FDs, and (partial) superedges. Note that in this context the superedges are the atomic covers, while the edges are the canonical covers.

Definition 3.27. We call a set of FDs in Σ∗a autonomous if it is autonomous for the hypergraph CC(Σ). When talking about transversals, we always mean transversals of CC(Σ).

Lemma 3.28. A set G ⊆ Σ∗a is a cover of Σ iff it intersects with all minimal transversals of CC(Σ).

Proof. G is a cover iff it is a superedge of CC(Σ). Furthermore, CC(Σ) is simple, and by Lemma 3.17 the edges of a simple hypergraph are the minimal sets which intersect with all minimal transversals. Thus superedges are simply sets (not necessarily minimal) which intersect with all minimal transversals.
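The characterisation in Lemma 3.28 can be checked by brute force on a toy example. The sketch below is illustrative only: the hypergraph is a hypothetical stand-in for CC(Σ) whose vertices `f1`, `f2`, `f3` play the role of FDs, and the helper names are assumptions rather than anything defined in the text.

```python
from itertools import combinations

def minimal_transversals(vertices, edges):
    """Brute-force all inclusion-minimal vertex sets that intersect every edge."""
    hitting = [
        frozenset(c)
        for r in range(len(vertices) + 1)
        for c in combinations(sorted(vertices), r)
        if all(set(c) & e for e in edges)
    ]
    return [t for t in hitting if not any(u < t for u in hitting)]

def is_superedge(g, edges):
    """A superedge is any vertex set that contains at least one edge."""
    return any(e <= g for e in edges)

def hits_all(g, transversals):
    """True if g intersects every given transversal."""
    return all(g & t for t in transversals)

# Hypothetical simple hypergraph standing in for CC(Σ).
vertices = {"f1", "f2", "f3"}
edges = [frozenset({"f1", "f3"}), frozenset({"f2", "f3"})]
transversals = minimal_transversals(vertices, edges)

# Compare both characterisations on every subset of the vertex set.
subsets = [
    frozenset(c)
    for r in range(len(vertices) + 1)
    for c in combinations(sorted(vertices), r)
]
agree = all(is_superedge(g, edges) == hits_all(g, transversals) for g in subsets)
```

Enumerating all subsets is exponential, so this is only meant to make the statement concrete on small inputs.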

As superedges become (atomic) covers for the hypergraph CC(Σ), partial superedges become partial covers.

Definition 3.29. Let Σ be a set of FDs and G ⊆ S ⊆ Σ∗a. We call G a partial cover of Σ on S if G is a partial superedge of CC(Σ) on S.

When S is autonomous, testing whether a set of FDs is a partial cover on S is easy:

Lemma 3.30. Let S ⊆ Σ∗a be autonomous, and let Σ0 ⊆ Σ∗a be an atomic cover of Σ. Then a set G ⊆ S is a partial cover on S iff G ∪ (Σ0 \ S) is a cover of Σ.

Clearly G ∪ (Σ0 \ S) is a cover of Σ iff G ∪ (Σ0 \ S) ⊨ Σ0 ∩ S, which allows us to perform this test quickly.
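The test G ∪ (Σ0 \ S) ⊨ Σ0 ∩ S reduces to a handful of attribute-closure computations. The following is a minimal sketch, assuming FDs are represented as pairs of frozensets and that S is already known to be autonomous; the names `fd`, `closure`, `implies` and `is_partial_cover` are hypothetical helpers, not part of the text.

```python
def fd(lhs, rhs):
    """Build an FD lhs -> rhs from strings of single-letter attribute names."""
    return (frozenset(lhs), frozenset(rhs))

def closure(attrs, fds):
    """Attribute closure of attrs under fds, by naive fixpoint iteration."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

def implies(fds, dependency):
    """True if fds implies the single FD `dependency`."""
    lhs, rhs = dependency
    return rhs <= closure(lhs, fds)

def is_partial_cover(g, s, sigma0):
    """Test of Lemma 3.30: g | (sigma0 - s) must imply every FD in sigma0 & s.

    Assumes s is autonomous and sigma0 is an atomic cover.
    """
    candidate = set(g) | (set(sigma0) - set(s))
    return all(implies(candidate, d) for d in set(sigma0) & set(s))

# Assumed example: Σ0 = {A→B, B→A, A→C, C→D}; S holds the FDs with LHS A or B.
sigma0 = {fd("A", "B"), fd("B", "A"), fd("A", "C"), fd("C", "D")}
s = {fd("A", "B"), fd("B", "A"), fd("A", "C"),
     fd("B", "C"), fd("A", "D"), fd("B", "D")}

ok = is_partial_cover({fd("A", "B"), fd("B", "A"), fd("B", "C")}, s, sigma0)  # True
bad = is_partial_cover({fd("A", "B"), fd("A", "C")}, s, sigma0)  # False: B→A is lost
```

The naive closure loop is quadratic in the worst case; a linear-time closure algorithm could be substituted without changing the test itself.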

We will identify some autonomous (but not necessarily minimal) sets of CC(Σ). Theorem 3.18 relates autonomous sets to the minimal transversals of CC(Σ). The following lemmas establish some results about the form of these minimal transversals.

Lemma 3.31. Let S ⊆ Σ∗a be a minimal transversal of CC(Σ) and X → A ∈ S. Then S̄ = Σ∗a \ S is not a cover of Σ, but S̄ ∪ {X → A} is.

Proof. By Lemma 3.28, S̄ is not a cover of Σ since it does not intersect with S. If S̄ ∪ {X → A} were not a cover, then every cover would contain an FD in its complement

Σ∗a \ (S̄ ∪ {X → A}) = S \ {X → A}

Thus S \ {X → A} would be a transversal, which contradicts the minimality of S.

Definition 3.32. The sets of attributes X and Y are equivalent under a set of FDs Σ, written X ↔ Y, if X → Y and Y → X lie in Σ∗.

Lemma 3.33. Let X → A, Y → B be contained in a common minimal transversal S ⊆ Σ∗a of CC(Σ). Then X and Y are equivalent under S̄ = Σ∗a \ S.

Proof. By Lemma 3.31 we have

S̄ ⊭ Y → B

S̄ ∪ {X → A} ⊨ Y → B (3.1)

Let us denote the closure of Y under S̄ by Y∗S̄. If X ⊈ Y∗S̄, then X → A can never be applied when computing the closure of Y, so

Y∗S̄ = Y∗S̄∪{X→A}

Since B ∉ Y∗S̄, this contradicts (3.1). Thus S̄ ⊨ Y → X, and by symmetry S̄ ⊨ X → Y.

Definition 3.34. Let Σ be a set of FDs on R. We denote the set of FDs in Σ∗a with LHS equivalent to X ⊆ R as

EQX := {Y → Z ∈ Σ∗a | Y ↔ X}

The partition of Σ∗a into non-empty equivalence sets is denoted as

EQ := {EQX | ∃Y. X → Y ∈ Σ∗a}
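Since X ↔ Y holds exactly when X and Y have the same attribute closure, the partition EQ can be obtained by grouping FDs on the closure of their LHS. The sketch below assumes the atomic closure Σ∗a has already been computed and is passed in as a list of (LHS, RHS) pairs; the helper names and the example FDs are hypothetical.

```python
from collections import defaultdict

def closure(attrs, fds):
    """Attribute closure of attrs under fds, by naive fixpoint iteration."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

def equivalence_sets(fds):
    """Partition fds so that two FDs share a block iff their LHSs are equivalent.

    Two LHSs are equivalent iff their closures coincide, so the closure of
    the LHS serves as the grouping key.
    """
    groups = defaultdict(set)
    for lhs, rhs in fds:
        groups[frozenset(closure(lhs, fds))].add((lhs, rhs))
    return list(groups.values())

def fd(lhs, rhs):
    """Build an FD lhs -> rhs from strings of single-letter attribute names."""
    return (frozenset(lhs), frozenset(rhs))

# Assumed atomic closure of Σ = {A→B, B→A, A→C, C→D}.
atomic_closure = [
    fd("A", "B"), fd("B", "A"), fd("A", "C"), fd("B", "C"),
    fd("A", "D"), fd("B", "D"), fd("C", "D"),
]
eq = equivalence_sets(atomic_closure)
```

In this example the FDs with LHS A or B fall into one block and C → D forms a block of its own.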

Theorem 3.35. Let Σ be a set of FDs on R. Then every set EQX ∈ EQ is autonomous.

Proof. By Lemma 3.33 all FDs in a (maximal) connected component of Tr(CC(Σ)) have equivalent LHSs under Σ. Thus EQX is the union of vertex sets of maximal connected components of Tr(CC(Σ)), and therefore an isolated set of Tr(CC(Σ)). By Theorem 3.18 isolated sets of Tr(CC(Σ)) are autonomous for CC(Σ).

We are now ready to prove our main theorem for this section.

Theorem 3.36. Let Σ be a set of FDs on R. A set G ⊆ Σ∗a is a cover of Σ iff G ∩ EQX is a partial cover on EQX for every EQX ∈ EQ.

Proof. By Theorem 3.35 the sets EQX form a partition of Σ∗a into autonomous sets, so the theorem is a special case of Lemma 3.22.

Theorem 3.36 allows us to split the task of finding and representing all canonical covers of Σ into several smaller tasks. For every EQX ∈ EQ we find the set CX of all non-redundant partial covers on EQX. By Theorem 3.10 these describe CC(Σ) in decomposed form:

CC(Σ) = CX1 ∨ … ∨ CXn

While we could easily compute CC(Σ) by taking their cross-union, this decomposed description of CC(Σ) is usually much smaller, and thus better suited for most tasks.
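To illustrate the size difference, the cross-union can be expanded by picking one partial cover from each family and taking the union. This is a small sketch under assumed inputs: the three families and their FD labels are hypothetical placeholders, and `cross_union` is an illustrative helper name.

```python
from itertools import product

def cross_union(*families):
    """Return every union formed by choosing one set from each family."""
    return {frozenset().union(*choice) for choice in product(*families)}

# Hypothetical families of non-redundant partial covers, one per equivalence set.
c1 = [frozenset({"A->B", "B->A", "A->C"}), frozenset({"A->B", "B->A", "B->C"})]
c2 = [frozenset({"C->D"}), frozenset({"C->E", "E->D"})]
c3 = [frozenset({"F->G"}), frozenset({"F->H", "H->G"})]

covers = cross_union(c1, c2, c3)

stored_decomposed = len(c1) + len(c2) + len(c3)  # partial covers kept: 6
stored_expanded = len(covers)                    # full covers listed: 8
```

The decomposed form grows with the sum of the family sizes, while the expanded form grows with their product, so the gap widens quickly as the number of equivalence sets increases.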

Note that the equivalence classes EQX need not be minimal autonomous sets of CC(Σ). If we could find smaller autonomous sets we could speed up the computation of CC(Σ) even more. However, we will show later (Theorem 3.81) that finding the minimal autonomous sets of CC(Σ) is hard. In Section 3.4 we will give an algorithm to find finer, but not necessarily minimal, autonomous sets.

In document On fast and space efficient database normalization : a dissertation presented in partial fulfilment of the requirements for the degree of Doctor of Philosophy in Information Systems at Massey University, Palmerston North, New Zealand (Page 72-74)