
4.5 Combining Normal Forms

4.5.2 DNF and EKNF

As dependency preserving BCNF decompositions do not always exist, combining DNF and BCNF is not always a feasible option. To avoid this problem, or as an alternative should a dependency preserving BCNF decomposition not exist, one could try to ensure other normal forms which always allow dependency preserving decompositions. The most well-known normal form with this property is 3NF. However, as was pointed out in [46], 3NF does not always enforce beneficial decompositions, even when they would not cause any loss of dependencies. The following example, taken from [46], illustrates this.

Example 4.13. Let R = ABC and Σ = {A → B, B → A}. Then AC and BC are minimal keys of R, and thus all attributes are prime. Therefore R is already in 3NF, even though dependency preserving decompositions exist, such as {AB, BC} or {AB, AC}.
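The claims of Example 4.13 can be checked mechanically. The following is a minimal brute-force sketch, not the algorithms developed in this work: all function names (`closure`, `minimal_keys`, `is_3nf`, `preserves_dependencies`) are hypothetical helpers introduced for illustration, and the enumeration over all attribute subsets is exponential, so it is only meant for small schemas.

```python
from itertools import combinations

def closure(attrs, fds):
    """Attribute closure of attrs under fds, a list of (lhs, rhs) frozenset pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return frozenset(result)

def nonempty_subsets(schema):
    """Yield all non-empty subsets of schema, smallest first."""
    items = sorted(schema)
    for size in range(1, len(items) + 1):
        for combo in combinations(items, size):
            yield frozenset(combo)

def minimal_keys(schema, fds):
    """All minimal keys: superkeys that contain no smaller superkey."""
    keys = []
    for cand in nonempty_subsets(schema):
        if closure(cand, fds) >= schema and not any(k < cand for k in keys):
            keys.append(cand)
    return keys

def is_3nf(schema, fds):
    """3NF: every non-trivial FD X -> A has a superkey X or a prime attribute A."""
    prime = set().union(*minimal_keys(schema, fds))
    for lhs in nonempty_subsets(schema):
        implied = closure(lhs, fds)
        if implied >= schema:
            continue  # lhs is a superkey, nothing to check
        if any(a not in prime for a in implied - lhs):
            return False
    return True

def preserves_dependencies(decomposition, fds):
    """Check that the FDs projected onto the subschemas imply every FD in fds."""
    projected = []
    for sub in decomposition:
        for lhs in nonempty_subsets(sub):
            projected.append((lhs, closure(lhs, fds) & sub))
    return all(rhs <= closure(lhs, projected) for lhs, rhs in fds)

def fd(lhs, rhs):
    """Build an FD from two strings of single-letter attribute names."""
    return (frozenset(lhs), frozenset(rhs))

R = frozenset("ABC")
SIGMA = [fd("A", "B"), fd("B", "A")]

keys = sorted("".join(sorted(k)) for k in minimal_keys(R, SIGMA))
print(keys)                                                               # ['AC', 'BC']
print(is_3nf(R, SIGMA))                                                   # True
print(preserves_dependencies([frozenset("AB"), frozenset("BC")], SIGMA))  # True
print(preserves_dependencies([frozenset("AB"), frozenset("AC")], SIGMA))  # True
```

Under these assumptions the sketch confirms that R is in 3NF although both decompositions mentioned in the example preserve all dependencies.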

As an improvement, the authors of [46] suggest a new normal form which is stronger than 3NF but still allows dependency preserving decompositions. They strengthen 3NF by allowing as RHS of a non-key FD only those prime attributes which appear in the LHS of an atomic key dependency. Note that atomic FDs are called elementary in [46].

Definition 4.55. [46] Let $R$ be a schema with FDs $\Sigma$. A FD $X \to A$ is called elementary if $\Sigma^*$ contains no FD $X' \to A$ with $X' \subsetneq X$. A key is elementary if it forms the LHS of an elementary FD. An attribute is an elementary key attribute if it lies in an elementary key of $R$.

Definition 4.56. [46] Let R be a schema with FDs Σ. Then R is in elementary key normal form (EKNF) if for every non-trivial FD X → A on R

(a) X is a key of R, or

(b) A is an elementary key attribute for R.

Note that the schema R from Example 4.13 is not in EKNF, since neither A nor B are elementary key attributes. Thus EKNF may enforce useful decomposition which 3NF does not.
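Definitions 4.55 and 4.56 can be turned into a direct, if inefficient, test. The sketch below is an illustration under stated assumptions rather than a method from [46] or from this work: the helper names are hypothetical, FDs are given as pairs of frozensets over single-letter attributes, and every subset of the schema is enumerated. To decide whether X → A is elementary it suffices to remove one attribute of X at a time, since attribute closure is monotone.

```python
from itertools import combinations

def closure(attrs, fds):
    """Attribute closure of attrs under fds, a list of (lhs, rhs) frozenset pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return frozenset(result)

def nonempty_subsets(schema):
    """Yield all non-empty subsets of schema, smallest first."""
    items = sorted(schema)
    for size in range(1, len(items) + 1):
        for combo in combinations(items, size):
            yield frozenset(combo)

def is_elementary(lhs, a, fds):
    """lhs -> a is non-trivial, holds, and no proper subset of lhs determines a."""
    if a in lhs or a not in closure(lhs, fds):
        return False
    return not any(a in closure(lhs - {b}, fds) for b in lhs)

def elementary_key_attributes(schema, fds):
    """Union of all keys that form the LHS of some elementary FD."""
    attrs = set()
    for lhs in nonempty_subsets(schema):
        is_key = closure(lhs, fds) >= schema
        if is_key and any(is_elementary(lhs, a, fds) for a in schema):
            attrs |= lhs
    return attrs

def eknf_violations(schema, fds):
    """Non-trivial FDs X -> A where X is no key and A is no elementary key attribute."""
    allowed = elementary_key_attributes(schema, fds)
    violations = []
    for lhs in nonempty_subsets(schema):
        implied = closure(lhs, fds)
        if implied >= schema:
            continue  # condition (a): lhs is a key
        for a in sorted((implied & schema) - lhs):
            if a not in allowed:  # condition (b) fails as well
                violations.append(("".join(sorted(lhs)), a))
    return violations

R = frozenset("ABC")
SIGMA = [(frozenset("A"), frozenset("B")), (frozenset("B"), frozenset("A"))]

print(elementary_key_attributes(R, SIGMA))  # set()
print(eknf_violations(R, SIGMA))            # [('A', 'B'), ('B', 'A')]
```

For Example 4.13 neither AC nor BC is an elementary key, because AC → B and BC → A are already implied by A → B and B → A. Both FDs in Σ are therefore reported as EKNF violations.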

As it turns out, algorithm “dependency preserving DNF decomposition” already produces a decomposition into EKNF.

Lemma 4.57. Let R be a single schema with FDs Σ. If R is in dependency preserving DNF, then it is also in EKNF.

Proof. Assume that $R$ is not in EKNF. Then there exists a FD $X \to A \in \Sigma^{*a}$ such that $X$ is not a key of $R$, and $A$ does not lie in the LHS of any key FD in $\Sigma^{*a}$. Furthermore, $R$ is strictly c-dominated by the decomposition

$$D := \{R \setminus A\} \cup \{S \subsetneq R \mid S \text{ is not a key of } R\}$$

since $A$ does not lie in $R \setminus A$, which is the only key schema in $D$.

It remains to show that $D$ is dependency preserving. Clearly the only FDs in $\Sigma^{*a}$ which do not lie in $\Sigma^*[D]$ are key FDs containing $A$. They must be of the form $Y \to A$, since $A$ does not lie in the LHS of any key FD. However, the FDs $Y \to X$ and $X \to A$

Theorem 4.58. Let $D$ be a decomposition produced by algorithm “dependency preserving DNF decomposition”. Then every schema $R_X \in D$ is in dependency preserving DNF w.r.t. $\Sigma^*[R_X]$.

Proof. Let $D_X$ be any dependency preserving decomposition of $R_X$. Then $D_X$ is dominated by the single schema

$$R'_X := \bigcup \{R_j \in D_X \mid R_j \text{ is a key of } R_X\}$$

Clearly $R'_X$ preserves all key FDs in $\Sigma^*[D]$. Thus $EQ_X[D_X] \subseteq EQ_X[R'_X]$, so $EQ_X[R'_X]$ implies $EQ_X[R_X]$. However, $R_X$ has been constructed minimal such that $EQ_X[R_X]$ has some partial cover property. Thus $R'_X = R_X$, which shows that $R_X$ dominates every dependency preserving decomposition $D_X$. It follows that $R_X$ is in dependency preserving DNF.

Corollary 4.59. Algorithm “dependency preserving DNF decomposition” produces a decomposition into EKNF.

Proof. Follows immediately from the last lemma and theorem.

We note that this result is only due to our construction method, i.e., dependency preserving DNF does not imply EKNF in general.

Example 4.14. Let R = ABCD and Σ = {A → B, B → C, CD → A}. Then the decomposition D = {ABC, ACD} is in dependency preserving DNF (note that it is not strictly c-dominated by {AB, BC, ACD} since C already appears in the key schema ACD). However, D is not in EKNF, since ABC contains B → C which violates EKNF.

One could say that the benefit of EKNF is that it enforces “locally” well-designed schemas, something which may not be forced by DNF if this “local optimization” does not provide a significant benefit for the overall size of the decomposition. The same holds true for BCNF or other “local” normal forms, i.e., normal forms which consider only a single schema.
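The EKNF violation in Example 4.14 can be reproduced with a small brute-force sketch. It does not test the DNF property itself. It only checks that D = {ABC, ACD} preserves Σ and that, within the schema ABC, the projected FD B → C has a non-key LHS while C lies in no minimal key of ABC. Since an elementary key is always a minimal key, C then cannot be an elementary key attribute. All helper names are hypothetical and chosen for illustration.

```python
from itertools import combinations

def closure(attrs, fds):
    """Attribute closure of attrs under fds, a list of (lhs, rhs) frozenset pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return frozenset(result)

def nonempty_subsets(schema):
    """Yield all non-empty subsets of schema, smallest first."""
    items = sorted(schema)
    for size in range(1, len(items) + 1):
        for combo in combinations(items, size):
            yield frozenset(combo)

def project(sub, fds):
    """Non-trivial FDs X -> A implied by fds with X and A inside the schema sub."""
    projected = []
    for lhs in nonempty_subsets(sub):
        for a in sorted((closure(lhs, fds) & sub) - lhs):
            projected.append((lhs, frozenset(a)))
    return projected

def preserves_dependencies(decomposition, fds):
    """Check that the projected FDs of all subschemas imply every FD in fds."""
    projected = [f for sub in decomposition for f in project(sub, fds)]
    return all(rhs <= closure(lhs, projected) for lhs, rhs in fds)

def minimal_keys(sub, fds):
    """Minimal keys of the schema sub with respect to the FDs implied by fds."""
    keys = []
    for cand in nonempty_subsets(sub):
        if closure(cand, fds) >= sub and not any(k < cand for k in keys):
            keys.append(cand)
    return keys

SIGMA = [
    (frozenset("A"), frozenset("B")),
    (frozenset("B"), frozenset("C")),
    (frozenset("CD"), frozenset("A")),
]
ABC, ACD = frozenset("ABC"), frozenset("ACD")

print(preserves_dependencies([ABC, ACD], SIGMA))        # True

b_to_c = (frozenset("B"), frozenset("C"))
print(b_to_c in project(ABC, SIGMA))                    # True: B -> C lies in ABC
print(closure(frozenset("B"), SIGMA) >= ABC)            # False: B is not a key of ABC
keys_abc = minimal_keys(ABC, SIGMA)
print(["".join(sorted(k)) for k in keys_abc])           # ['A']
print(any("C" in k for k in keys_abc))                  # False: C is not prime in ABC
```

In this sketch C is not even prime in ABC, so under these assumptions B → C already violates 3NF for the single schema ABC, and hence EKNF.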

Chapter 5 Summary

We will briefly summarize the main results we obtained, and related problems which still remain open.

5.1 Main Results

In chapter 2 we developed algorithms for computing a dependency preserving BCNF decomposition. The main result was a “linear resolution” algorithm for computing the atomic closure $\Sigma^{*a}$ for a set of functional dependencies $\Sigma$. While $\Sigma^{*a}$ can be exponential in $\Sigma$, we identified polynomial cases and showed that for finding a dependency preserving BCNF decomposition, it suffices to compute a subset of $\Sigma^{*a}$. Finally, we demonstrated how the results can be extended to a complex-valued data model.

The “linear resolution” algorithm was then used in chapter 3 to compute the set of all canonical covers CC(Σ). For that we showed how hypergraphs can be decomposed using autonomous sets, which led to an efficient representation of CC(Σ). Perhaps more important than the actual algorithm, we obtained insights into how functional dependencies interact. In particular, Theorem 3.36 allows us to split the task of creating a canonical cover into smaller, independent tasks of creating partial covers. Our theory of autonomous sets may well have applications in other disciplines.

In chapter 4 we returned to database normalization by defining a new normal form “DNF” and providing algorithms for computing decompositions into DNF. This new normal form was characterized both semantically and syntactically, and one of the main difficulties was in showing that the characterizations match. We established that in some sense, DNF is the proper generalization of existing normal forms, in particular BCNF, onto multiple schemas. Finally, we showed how dependency preserving DNF decompositions can be computed, and how dependency preserving DNF and BCNF can be obtained simultaneously.

Overall, this work focused on computing good schema decompositions. We provided characterizations of such “good” decompositions and practical algorithms to obtain them. The results offer new insights, in particular into the interaction of functional dependencies, and are of immediate practical use in creating automated design tools.

In document On fast and space efficient database normalization : a dissertation presented in partial fulfilment of the requirements for the degree of Doctor of Philosophy in Information Systems at Massey University, Palmerston North, New Zealand (Page 134-137)