• No results found

repeating groups or arrays

In document Hands on Database (Page 185-192)

186

 Each row is unique

Example 1:

This Album table does not meet the criteria for First Normal Form. The main problem is in the tracks column. The column Tracks contains a list of songs rather than a single value. This would make it very difficult to locate information about any single song.

One solution that often occurs to beginning database developers is to enumerate a list of columns such as Track1, Track2, Track3, etc. to some arbitrary number of tracks. This also violates FN1 by creating a repeating group. Say, for argument’s sake, you made 13 track columns. What happens to an Album with fourteen tracks? What if an album only has one or two tracks? Also consider what you would need to do to find any individual track. You would need to query thirteen separate columns.

The following table is in first normal form:

Figure 54: Album Table in First Normal Form

AlbumTitle Track Artist ArtistCountry

Abby Road Here comes the sun Beatles UK

Abby Road Octopus’s Garden Beatles UK

Abby Road Something Beatles UK

Blond on Blond Rainy Day Woman Bob Dylan US

Blond on Blond Sad Eyed Lady of the lowlands

Bob Dylan US

Blond on Blond Stuck in Mobile with the Memphis blues again

Bob Dylan US

It is obvious from looking at the table above that First Normal Form is not sufficient. Every column contains a single value and there are no arrays or repeating groups, but there is a great deal of redundancy.

187 Example 2:

In our spreadsheet example the first attribute LastName/Dept stores two different types of values, last names and Department names. The attribute Title also has an issue. Lisa Brown has two titles,

“Program Assistant” and “Lab Assistant.” In First Normal Form each row of an attribute must contain only a single value.

It may not be obvious, at first why these things are a problem. Think about it from the point of view of someone querying the database. If they want to find a Department’s phone they have to search through all the faculty and staff names to find it. They could apply various filters, such as searching for values that have no status or position, but that is not guaranteed to return just what they want. A database user expects to be able to just ask for the Department names and find them. The Title column is even more problematic for the database searcher. Suppose, you want to find all the employees who have a title “Lab Assistant.” When the attribute contains a list of values you can’t simply search for that title.

You would have to use some kind of pattern search or string function to extract the title from the list.

There is also no way to ensure consistency or data integrity when you have a list of values for an attribute.

To get the data to conform to First Normal Form, the first thing to do is to separate LastName and Dept into two attributes. Since there is already a Dept attribute, call the new attribute, DeptName. Here is the first half of the spreadsheet with the correction:

LastName FirstName DeptName Phone

Building

code Building

Able Susan 206.555.2356 BE

Broadway Edison

Admissions Admissions 206.555.1000 BE

Broadway Edison

Anderson Elliot 206.555.1029 SA South Annex

Anderson Jolene 206.555.9001 SA South Annex

Bradley Lisa 206.555.2323 BE

Broadway Edison

188

Brown Martin 206.555.1200 SA South Annex

Information

Technology 206.555.1200 SA South Annex

Figure 55: Contact Table wiht Dept and Name seperarted

The next problem is more difficult. Title can have multiple values for a single employee. One

temptation is to add columns such as Title1, Title2, Title3, but this solution generates more problems than it solves. For the vast majority of employees who only have one title, Title2 and Title3 would be always be empty. Also, what if some enterprising employee were working four positions and had four titles. There would be no room for the fourth. For someone querying the database, this solution opens up even worse problems. If you were searching for all the employees who held a particular job title you would have to query three different attributes.

The only way to solve this problem is to break the entity into two or more separate entities. Job Title will be a separate entity. We will also need a linking entity, since there is a many-to-many relationship between employees and job titles.

One last issue remains. As you learned in the last chapter on database design, each entity should have a primary key, an attribute that uniquely identifies each row stored in the entity. In the Tutor database and most examples the book has used natural keys, that is keys that arise from some combination of the natural attributes of an entity. But in this example, just to show an alternative approach, we will use surrogate keys. Each row will be assigned a arbitrary number in sequence. Most Relational Database Management Systems have a utility to provide such keys. In SQL Server it is “Identity,” in Access it is called an “autonumber.”

The overall Contact Entity will have the key “ContactKey.” The new Title entity will have “TitleKey” for a primay key and the linking entity will have a composite key consisting of “ContactKey” and “TitleKey”

When we are done our data will look like this:

189

ContactKey LastName FirstName DeptName Phone

Building code

1 Able Susan 206.555.2356 BE

2 Admissions Admissions 206.555.1000 BE

3 Anderson Elliot 206.555.1029 SA

Technology 206.555.1200 SA

Figure 56: Contact table, First Normal Form

Building

Building

Address Office Dept Type Status Email Broadway

Edison 1700 Broadway 314 HUM Instruction FT [email protected] Broadway

Edison 1700 Broadway 124 ADM

South Annex 1650 Broadway 212 IT Instruction PT [email protected] South Annex 1650 Broadway 113 IT Instruction PT [email protected] Broadway

Edison 1700 Broadway 114 MAT Staff FT [email protected]

South Annex 1650 Broadway 201 IT Exempt [email protected]

South Annex 1650 Broadway 200

Figure 57: Contact table cont

TitleKey TitleName 1 Professor

2 Program Assistant 3 Dean

4 Lab Assistant

Figure 58: Title Table

ContactKey TitleKey

Figure 59:Contact Title Table

Below is an ERD of our efforts so far:

190 Contact

PK ContactKey LastName Firstname DeptName Phone BuildingCode BuildingName BuildingAddress Office

Dept Type Status Email

Title

PK TitleKey TitleName

ContactTitle PK,FK1 ContactKey PK,FK2 TitleKey

Figure 60: Contact ERD First Normal Form

Professor Collins lays out theTutor diagram so that both he and Sharon can see it clearly.

191

Figure 61: Tutor ERD Version 1

“As you recall,” he begins, “To conform to First Normal Form you need to eliminate all repeating groups or arrays and all multi-valued dependencies.” Together they go through each table. Professor Collins pauses at the Request table. “All the tables look good, but I do have a question about the attribute

‘Request Notes.’ Will there be instances with more than one entry for notes? If so it would be better to make an new entity called RequestNotes, or something like that. That way each request can have

192

several notes if necessary, and you aren’t wasting disk space by reserving note space for requests that don’t have any notes. The primary key could be the date and time the note was entered and it would be tied to the Request table by the RequestKey repeated as a Foreign Key.”

Sharon nods. “I hadn’t thought about that. I suppose that there could be several notes as the status of the request changes. I will ask Terry about it to be sure. But the idea of separating it just to save disk space makes sense too. I’ll change it.”

“Ok, let’s look to see how well your diagram conforms to Second Normal Form.

Things You Should Know

In document Hands on Database (Page 185-192)