186
Each row is unique
Example 1:
This Album table does not meet the criteria for First Normal Form. The main problem is in the tracks column. The column Tracks contains a list of songs rather than a single value. This would make it very difficult to locate information about any single song.
One solution that often occurs to beginning database developers is to enumerate a list of columns such as Track1, Track2, Track3, etc. to some arbitrary number of tracks. This also violates FN1 by creating a repeating group. Say, for argument’s sake, you made 13 track columns. What happens to an Album with fourteen tracks? What if an album only has one or two tracks? Also consider what you would need to do to find any individual track. You would need to query thirteen separate columns.
The following table is in first normal form:
Figure 54: Album Table in First Normal Form
AlbumTitle Track Artist ArtistCountry
Abby Road Here comes the sun Beatles UK
Abby Road Octopus’s Garden Beatles UK
Abby Road Something Beatles UK
Blond on Blond Rainy Day Woman Bob Dylan US
Blond on Blond Sad Eyed Lady of the lowlands
Bob Dylan US
Blond on Blond Stuck in Mobile with the Memphis blues again
Bob Dylan US
It is obvious from looking at the table above that First Normal Form is not sufficient. Every column contains a single value and there are no arrays or repeating groups, but there is a great deal of redundancy.
187 Example 2:
In our spreadsheet example the first attribute LastName/Dept stores two different types of values, last names and Department names. The attribute Title also has an issue. Lisa Brown has two titles,
“Program Assistant” and “Lab Assistant.” In First Normal Form each row of an attribute must contain only a single value.
It may not be obvious, at first why these things are a problem. Think about it from the point of view of someone querying the database. If they want to find a Department’s phone they have to search through all the faculty and staff names to find it. They could apply various filters, such as searching for values that have no status or position, but that is not guaranteed to return just what they want. A database user expects to be able to just ask for the Department names and find them. The Title column is even more problematic for the database searcher. Suppose, you want to find all the employees who have a title “Lab Assistant.” When the attribute contains a list of values you can’t simply search for that title.
You would have to use some kind of pattern search or string function to extract the title from the list.
There is also no way to ensure consistency or data integrity when you have a list of values for an attribute.
To get the data to conform to First Normal Form, the first thing to do is to separate LastName and Dept into two attributes. Since there is already a Dept attribute, call the new attribute, DeptName. Here is the first half of the spreadsheet with the correction:
LastName FirstName DeptName Phone
Building
code Building
Able Susan 206.555.2356 BE
Broadway Edison
Admissions Admissions 206.555.1000 BE
Broadway Edison
Anderson Elliot 206.555.1029 SA South Annex
Anderson Jolene 206.555.9001 SA South Annex
Bradley Lisa 206.555.2323 BE
Broadway Edison
188
Brown Martin 206.555.1200 SA South Annex
Information
Technology 206.555.1200 SA South Annex
Figure 55: Contact Table wiht Dept and Name seperarted
The next problem is more difficult. Title can have multiple values for a single employee. One
temptation is to add columns such as Title1, Title2, Title3, but this solution generates more problems than it solves. For the vast majority of employees who only have one title, Title2 and Title3 would be always be empty. Also, what if some enterprising employee were working four positions and had four titles. There would be no room for the fourth. For someone querying the database, this solution opens up even worse problems. If you were searching for all the employees who held a particular job title you would have to query three different attributes.
The only way to solve this problem is to break the entity into two or more separate entities. Job Title will be a separate entity. We will also need a linking entity, since there is a many-to-many relationship between employees and job titles.
One last issue remains. As you learned in the last chapter on database design, each entity should have a primary key, an attribute that uniquely identifies each row stored in the entity. In the Tutor database and most examples the book has used natural keys, that is keys that arise from some combination of the natural attributes of an entity. But in this example, just to show an alternative approach, we will use surrogate keys. Each row will be assigned a arbitrary number in sequence. Most Relational Database Management Systems have a utility to provide such keys. In SQL Server it is “Identity,” in Access it is called an “autonumber.”
The overall Contact Entity will have the key “ContactKey.” The new Title entity will have “TitleKey” for a primay key and the linking entity will have a composite key consisting of “ContactKey” and “TitleKey”
When we are done our data will look like this:
189
ContactKey LastName FirstName DeptName Phone
Building code
1 Able Susan 206.555.2356 BE
2 Admissions Admissions 206.555.1000 BE
3 Anderson Elliot 206.555.1029 SA
Technology 206.555.1200 SA
Figure 56: Contact table, First Normal Form
Building
Building
Address Office Dept Type Status Email Broadway
Edison 1700 Broadway 314 HUM Instruction FT [email protected] Broadway
Edison 1700 Broadway 124 ADM
South Annex 1650 Broadway 212 IT Instruction PT [email protected] South Annex 1650 Broadway 113 IT Instruction PT [email protected] Broadway
Edison 1700 Broadway 114 MAT Staff FT [email protected]
South Annex 1650 Broadway 201 IT Exempt [email protected]
South Annex 1650 Broadway 200
Figure 57: Contact table cont
TitleKey TitleName 1 Professor
2 Program Assistant 3 Dean
4 Lab Assistant
Figure 58: Title Table
ContactKey TitleKey
Figure 59:Contact Title Table
Below is an ERD of our efforts so far:
190 Contact
PK ContactKey LastName Firstname DeptName Phone BuildingCode BuildingName BuildingAddress Office
Dept Type Status Email
Title
PK TitleKey TitleName
ContactTitle PK,FK1 ContactKey PK,FK2 TitleKey
Figure 60: Contact ERD First Normal Form
Professor Collins lays out theTutor diagram so that both he and Sharon can see it clearly.
191
Figure 61: Tutor ERD Version 1
“As you recall,” he begins, “To conform to First Normal Form you need to eliminate all repeating groups or arrays and all multi-valued dependencies.” Together they go through each table. Professor Collins pauses at the Request table. “All the tables look good, but I do have a question about the attribute
‘Request Notes.’ Will there be instances with more than one entry for notes? If so it would be better to make an new entity called RequestNotes, or something like that. That way each request can have
192
several notes if necessary, and you aren’t wasting disk space by reserving note space for requests that don’t have any notes. The primary key could be the date and time the note was entered and it would be tied to the Request table by the RequestKey repeated as a Foreign Key.”
Sharon nods. “I hadn’t thought about that. I suppose that there could be several notes as the status of the request changes. I will ask Terry about it to be sure. But the idea of separating it just to save disk space makes sense too. I’ll change it.”
“Ok, let’s look to see how well your diagram conforms to Second Normal Form.