• No results found

Second Normal Form: Eliminating Partial Dependencies

Before we explore second normal form, you must understand the concept offunctional dependence.For this definition, we’ll use two arbitrary attributes, cleverly named “A” and “B.” Attribute B isfunctionally dependenton attribute A if at any moment in time, there is no more than one value of attribute B associated with a given value of attribute A. Lest you wonder what planet the author lived on before this one, let’s try to make the definition more understandable. First, if we say that attribute B is functionally depend- ent on attribute A, what we are also saying is that attribute Adeterminesattribute B, or that A is adeterminant(unique identifier) of attribute B. Second, let’s look again at the first normal form relations in our Acme Industries example:

INVOICE: # Invoice Number, Customer Number, Customer Name, Customer Address, Customer City, Customer State, Customer Zip Code, Customer Phone, Terms,

Ship Via, Order Date, Total Order Amount INVOICE LINE ITEM: # Invoice Number, # Product Number,

Product Description, Quantity, Unit Price, Extended Amount

In the INVOICE relation, we can easily see that Customer Number is functionally dependent on Invoice Number because at any point in time, there can be only one value of Customer Number associated with a given value of Invoice Number. The very fact that the Invoice Number uniquely identifies the Customer Number in this relation means that, in return, the Customer Number isfunctionally dependenton the Invoice Number.

In the INVOICE LINE ITEM relation, we can also say that Product Description is functionally dependent on Product Number because, at any point in time, there is only one value of Product Description associated with the Product Number. How- ever, the fact that the Product Number is only part of the key of the INVOICE LINE ITEM is the very issue addressed by second normal form.

A relation is said to be insecond normal formif it meets both the following criteria:

The relation is in first normal form.

All non-key attributes are functionally dependent on theentireprimary key. If we look again at Product Description, it should be easy to see that Product Num- beralonedetermines the value. Said another way, if the same product appears as a line item on many different invoices, the Product Description is the sameregardlessof the Invoice Number. Or we can say that Product Description is functionally dependent on onlypart ofthe primary key, meaning it depends only on Product Number and not on thecombinationof Invoice NumberandProduct Number.

Color profile: Generic CMYK printer profile Composite Default screen

CHAPTER 6

Logical Database Design Using Normalization

157

Demystified / Databases Demystified / Oppel/ 225364-9 / Chapter 6

It should also be clear by now that second normal form only applies to relations where we have concatenated primary keys (that is, those made up of multiple attrib- utes). If we have a primary key composed of only a single attribute, as we do with the first normal form version of the Invoice relation, and the primary key is atomic (that is, has no subparts that make sense by themselves), as all attributes should be, then it is simply not possible for anything to depend onpartof the primary key. It follows, then, that any first normal form relation that has only a single attribute for its primary key is automaticallyin second normal form.

Looking at the INVOICE LINE ITEM relation, however, second normal form vio- lations should be readily apparent: Product Description and Unit Price depend only on the Product Number instead of thecombinationof Invoice Number and Product Num- ber. But not so fast! What about price changes? If Acme decides to change their prices, how could we possibly want that change to be retroactive for every invoice we have ever created? After all, an invoice is an official record that we must maintain for seven years, per current tax laws. This is a common dilemma with fast-changing attributes such as prices. Either we must be able to recall the price at any point in time or we must store the price with the invoice so we can reproduce the invoice as needed (that is, when the friendly tax auditors come calling). For simplicity, we have elected to store the price in two places, one being the current selling price and the other being the price at the time the sale was made. Because the later is a snapshot at a point in time that is not expected to change, there are no anomalies to this seemingly redundant storage. An alternative would be to store a date-sensitive price history somewhere that we could use to reconstruct the correct price for any invoice. That is a practical alternative here, but you would never be able to do that with stock or commodities market transac- tions, for example. The point is that while the sales pricelooksredundant, there are no anomaliesto the additional attribute, so it does no harm. Notice that we adjusted the at- tribute names so their meaning is abundantly clear.

Once we find a second normal form violation, the solution is to move the attrib- ute(s) that is (are) partially dependent to a new relation where it depends on theen- tirekey instead ofpartof the key. Here is our invoice example rewritten into second normal form:

INVOICE: # Invoice Number, Customer Number, Customer Name, Customer Address, Customer City, Customer State, Customer Zip Code, Customer Phone, Terms,

Ship Via, Order Date, Total Order Amount INVOICE LINE ITEM: # Invoice Number, # Product Number,

Quantity, Sale Unit Price, Extended Amount PRODUCT: # Product Number, Product Description,

List Unit Price

The improvement from our first normal form solution is that maintenance of the Product Description now has no anomalies. We can set up a new product independent Color profile: Generic CMYK printer profile

of there being an invoice for the product. If we wish to change the Product Description, we may do so by merely changing one value in one row of data. Also, should the last invoice for a particular product be deleted from the database for whatever reason, we won’t lose its description (it will still be in the row in the Product relation).Alwaysre- member that the reason we are normalizing is to eliminate these anomalies.

Third Normal Form: Eliminating