• No results found

The Difficulty of Designing for 4NF

8.10 ADVANCED 4NF DESIGN*

8.10.2 The Difficulty of Designing for 4NF

The inference rules for MVDs are useful because they can help eliminate redundant MVDs and FDs. Also, as in the case of FDs alone, using nonredundant dependency sets can improve the design produced by the 4NF decomposition algorithm de- scribed earlier. However, even in the absence of redundant dependencies things can go awry. We illustrate some of the problems on a number of examples. Three issues are considered: loss of dependencies, redundancy, and design using both FDs and MVDs.

8.10 Advanced 4NF Design* 247

A contracts example. Consider the schema

Contracts(Buyer, Vendor, Product, Currency)

where a tuple of the formJohn Doe, Acme, Paper Clips, USD means that buyer John Doe has a contract to buy paper clips from Acme, Inc., using U.S. currency. Suppose that our relation represents contracts of an international network of buyers and companies. Although the contract was consummated in dollars, if Acme sells some of its products in DM (perhaps it is a German company), it may be convenient to store the contract in two tuples: one with the financial information expressed in USD, the other information expressed in DM. In general, Contracts satisfies the rule that, if a company accepts several currencies, each contract is described in each one. This can be expressed using the following combined MVD:

Buyer Vendor →→ Product | Currency (8.16)

To be explicit, this MVD means

Contracts= (Buyer Vendor Product)  (Buyer Vendor Currency). The second rule of our international network is that, if two vendors supply a certain product and accept a certain currency and a buyer of that product has a contract to buy certain product with one of the two vendors, then this buyer must have a contract for that product with the other vendor as well. For example, if, in addition to the above tuple, Contracts containedMary Smith, OfficeMin, Paper Clips, USD, then it must also contain the tuples John Doe, OfficeMin, Paper Clips, USD and Mary Smith, Acme, Paper Clips, USD. This type of con- straint is expressed using the following MVD:

Product Currency →→ Buyer | Vendor (8.17)

Let us try to follow the 4NF decomposition algorithm. If we first decompose using the dependency (8.16), we get the following lossless decomposition:

(Buyer, Vendor, Product)

(Buyer, Vendor, Currency) (8.18)

Observe that, once this decomposition is done, the second MVD can no longer be applied because no join dependency holds in either one of the above schemas.7The

situation here is very similar to the problem we faced with the BCNF decomposition algorithm: Some dependencies might get lost in the process. In our case, it is the dependency (8.17). The same problem exists if we first decompose using the second dependency above, but in this case we lose (8.16).

Unlike losing FDs during BCNF decomposition, losing MVDs is potentially a more serious problem because the result might still harbor redundancy even if

7It is not quite obvious because we have not discussed the tools appropriate for verifying such facts.

However, in this concrete example our claim can be checked using the definition of the natural join directly. We again recommend [Maier 1983] as a good reference for learning about such techniques.

every relation in the decomposition is in 4NF! To see this, consider the following relation for the Contracts schema:

Buyer Vendor Product Currency

B1 V1 P C

B2 V2 P C

B1 V2 P C

B2 V1 P C

It is easy to check that this relation satisfies MVDs (8.16) and (8.17). For instance, to verify (8.16) take the projections on decomposition schema (8.18).

Buyer Vendor Product Buyer Vendor Currency

B1 V1 P B1 V1 C

B2 V2 P B2 V2 C

B1 V2 P B1 V2 C

B2 V1 P B2 V1 C

Joining these two relations (using the natural join) clearly yields the original relation for Contracts. A closer look shows that the above relations still contain a great deal of redundancy. For instance, the first relation twice says that product P is supplied by vendors V1and V2. Furthermore, it twice says that P is wanted by buyers

B1and B2. The first relation seems to beg for further decomposition into (Buyer,

Vendor) and (Vendor, Product), and the second relation begs to be decomposed into (Buyer, Currency) and (Vendor, Currency). Alas, none of these wishes can be granted because none of these decompositions is lossless (for example, Vendor is not a key of the first relation). As a result, decomposition (8.18) suffers from the usual update anomalies even though each relation is in 4NF! Furthermore, since 4NF implies BCNF, even BCNF does not guarantee complete elimination of redundancy in the presence of MVDs!

A dictionary example. For another example, consider a multilingual dictionary relation, Dictionary(English, French, German), which provides translations from one language to another. As expected, every term has a translation (possibly more than one) into every language, and the translations are independent of each other. These constraints are easily captured using MVDs.

English →→ French | German

French →→ English | German (8.19)

German →→ English | French

The problem, as before, is that applying any one of these MVDs in the 4NF decom- position algorithm loses the other two dependencies, and the resulting decompo- sition exhibits the usual update anomalies.

8.10 Advanced 4NF Design* 249

A multilingual thesaurus example. Let us enhance the previous example so that every term is now associated with a unique concept and each concept has an associated description (for the purpose of this example, ignore the language used for the description). The corresponding schema becomes Dictionary(Concept, Description, English, French, German), where the dependencies are

English → Concept French → Concept

German → Concept (8.20)

Concept → Description

Concept →→ English | French | German

An example of a concept is A5329 with description “homo sapiens” and translations {human, man}, {homme}, and {Mensch, Mann}.

The 4NF decomposition algorithm suggests that we start by picking up an MVD that violates 4NF and then use it in the decomposition process. Since every FD is also an MVD, we might choose English → Concept first, which yields the schema (English, Concept) and (English, French, German, Description). Using the transitivity rule for MVDs, we can derive the MVD English→→ French | German | Description and further decompose the second relation into (English, French), (English, German), and (English, Description).

The resulting schema has two drawbacks. First, it is lopsided toward English whereas the original schema was completely symmetric. Second, every one of the bilingual relations, such as (English, French), redundantly lists all possible translations from English to French and back. For example, if a and b are English synonyms, c and d are French synonyms, and a translates into c, then the English- French dictionary (English, French) has all four tuples: a, c, a, d, b, c, and b, d.

A better way to use the 4NF decomposition algorithm is to compute the attribute closure (defined on page 220) of the left-hand side of the MVD in (8.20) with respect to functional dependencies in (8.20) and derive the following MVD by the augmentation rule:

Concept Description →→ English | French | German

We can then apply the 4NF decomposition algorithm using this MVD, which yields the decomposition (Concept, Description, English), (Concept, Descrip- tion, French), and (Concept, Description, German). We can further decompose each of these relations into BCNF using the FDs alone, spinning off (Concept De- scription). Not only do we end up with a decomposition into 4NF, but also all dependencies are preserved.

Related documents