©2017 Cengage Learning®. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part, except for use as permitted in a license distributed with a certain product or service or otherwise on a password-protected website or school-approved learning management system for classroom use.
.
Chapter 6
©2017 Cengage Learning®. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part, except for use as permitted in a license distributed with a certain product or service or otherwise on a password-protected website or school-approved learning management system for classroom use.
▪
Normalization
is a process for assigning attributes to entities. It
reduces data redundancies and helps eliminate the data anomalies.
▪
Normalization works through a series of stages called normal forms:
▪
First normal form (1NF)
▪
Second normal form (2NF)
▪
Third normal form (3NF)
▪
Boyce-Codd normal form (BCNF)
▪
Fourth normal form (4NF)
▪
Fifth normal Form (5 NF)
▪
Domain Key normal form (DKNF)
▪
The highest level of normalization is not always desirable.
©2017 Cengage Learning®. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part, except for use as permitted in a license distributed with a certain product or service or otherwise on a password-protected website or school-approved learning management system for classroom use.
▪
The Need for Normalization
▪
Case of a Construction Company
▪
Building project -- Project number, Name, Employees
assigned to the project.
▪
Employee -- Employee number, Name, Job classification
▪
The company charges its clients by billing the hours spent
on each project. The hourly billing rate is dependent on the
employee’s position.
▪
Periodically, a report is generated.
▪
The table whose contents correspond to the reporting
requirements is shown in Table 5.1.
©2017 Cengage Learning®. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part, except for use as permitted in a license distributed with a certain product or service or otherwise on a password-protected website or school-approved learning management system for classroom use.
A few employees works for
one project.
Project Num :
15
Project Name :
Evergreen
Employee Num :
101, 102, 103,
105
Scenario
©2017 Cengage Learning®. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part, except for use as permitted in a license distributed with a certain product or service or otherwise on a password-protected website or school-approved learning management system for classroom use.
Project Num :
15
Project Name :
Evergreen
Emp Num
Emp Name
Job Class
Chr Hours
Hrs Billed
Total
101
102
103
105
©2017 Cengage Learning®. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part, except for use as permitted in a license distributed with a certain product or service or otherwise on a password-protected website or school-approved learning management system for classroom use.
©2017 Cengage Learning®. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part, except for use as permitted in a license distributed with a certain product or service or otherwise on a password-protected website or school-approved learning management system for classroom use.
Table Structure Matches the Report
Format
©2017 Cengage Learning®. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part, except for use as permitted in a license distributed with a certain product or service or otherwise on a password-protected website or school-approved learning management system for classroom use.
▪
Problems with the Figure 5.1
▪
The project number is intended to be a primary key, but it
contains nulls.
▪
The table displays data redundancies.
▪
The table entries invite data inconsistencies.
▪
The data redundancies yield the following anomalies:
▪
Update anomalies.
▪
Addition anomalies.
▪
Deletion anomalies.
©2017 Cengage Learning®. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part, except for use as permitted in a license distributed with a certain product or service or otherwise on a password-protected website or school-approved learning management system for classroom use.
▪
Conversion to First Normal Form
▪
A relational table must not contain
repeating groups
.
▪
Repeating groups can be eliminated by adding the
appropriate entry in at least the primary key
column(s).
©2017 Cengage Learning®. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part, except for use as permitted in a license distributed with a certain product or service or otherwise on a password-protected website or school-approved learning management system for classroom use.
Before
After
©2017 Cengage Learning®. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part, except for use as permitted in a license distributed with a certain product or service or otherwise on a password-protected website or school-approved learning management system for classroom use.
▪
1NF Definition
▪
The term first normal form (
1NF
) describes
the tabular format in which:
▪
All the key attributes are defined.
▪
There are no repeating groups in the table.
▪
All attributes are dependent on the primary
key.
©2017 Cengage Learning®. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part, except for use as permitted in a license distributed with a certain product or service or otherwise on a password-protected website or school-approved learning management system for classroom use.
First Normal Form (1NF)
Student
Age
Subject
Waseem
15
Biology, Maths
Ahmed
14
Maths
Sajid
17
Maths
12
In First Normal Form, any row must not have a column in
which more than one value is saved, like separated with
commas. Rather than that, we must separate such data
into multiple rows.
©2017 Cengage Learning®. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part, except for use as permitted in a license distributed with a certain product or service or otherwise on a password-protected website or school-approved learning management system for classroom use.
Student
Age
Subject
Waseem
15
Biology
Waseem
15
Maths
Ahmed
14
Maths
Sajid
17
Maths
13
Student Table following 1NF will be :
Using the First Normal Form, data redundancy increases, as
there will be many columns with same data in multiple rows
but each row as a whole will be unique.
©2017 Cengage Learning®. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part, except for use as permitted in a license distributed with a certain product or service or otherwise on a password-protected website or school-approved learning management system for classroom use.
▪
Dependency Diagram
▪
The primary key components are bold, underlined, and shaded
in a different color.
▪
The arrows above entities indicate all desirable dependencies,
i.e., dependencies that are based on PK.
▪
The arrows below the dependency diagram indicate less
desirable dependencies --
partial dependencies and transitive
dependencies
.
©2017 Cengage Learning®. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part, except for use as permitted in a license distributed with a certain product or service or otherwise on a password-protected website or school-approved learning management system for classroom use.
Functional Dependencies
15
▪
Partial Dependency
– when an
non-key
attribute
is
determined by a part, but not the whole, of a COMPOSITE
primary key (The Primary Key must be a Composite Key).
©2017 Cengage Learning®. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part, except for use as permitted in a license distributed with a certain product or service or otherwise on a password-protected website or school-approved learning management system for classroom use.
Functional Dependencies
16
▪
Transitive Dependency
– when a non-key attribute determines
another non-key attribute.
©2017 Cengage Learning®. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part, except for use as permitted in a license distributed with a certain product or service or otherwise on a password-protected website or school-approved learning management system for classroom use.
▪
Conversion to Second Normal Form
▪
Starting with the 1NF format, the database can be
converted into the 2NF format by
▪
Writing each key component on a separate line, and
then writing the original key on the last line and
▪
Writing the dependent attributes after each new
key.
PROJECT (
PROJ_NUM, PROJ_NAME
)
EMPLOYEE (
EMP_NUM, EMP_NAME, JOB_CLASS, CHG_HOUR
)
ASSIGN (
PROJ_NUM, EMP_NUM, HOURS
)
©2017 Cengage Learning®. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part, except for use as permitted in a license distributed with a certain product or service or otherwise on a password-protected website or school-approved learning management system for classroom use.
©2017 Cengage Learning®. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part, except for use as permitted in a license distributed with a certain product or service or otherwise on a password-protected website or school-approved learning management system for classroom use.
A table is in 2NF if:
▪
It is in 1NF and
▪
It includes no partial dependencies; that is, no attribute
is dependent on only a portion of the primary key.
(It is still possible for a table in 2NF to exhibit
transitive
dependency
; that is, one or more attributes may be
functionally dependent on nonkey attributes.)
©2017 Cengage Learning®. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part, except for use as permitted in a license distributed with a certain product or service or otherwise on a password-protected website or school-approved learning management system for classroom use.
▪
Conversion to Third Normal Form
▪
Create a separate table with attributes in a
transitive functional dependence relationship.
PROJECT (
PROJ_NUM, PROJ_NAME
)
ASSIGN (
PROJ_NUM, EMP_NUM, HOURS
)
EMPLOYEE (
EMP_NUM, EMP_NAME, JOB_CLASS
)
JOB (
JOB_CLASS, CHG_HOUR
)
©2017 Cengage Learning®. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part, except for use as permitted in a license distributed with a certain product or service or otherwise on a password-protected website or school-approved learning management system for classroom use.
▪
3NF Definition
▪
A table is in 3NF if:
▪
It is in 2NF and
▪
It contains no transitive dependencies.
©2017 Cengage Learning®. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part, except for use as permitted in a license distributed with a certain product or service or otherwise on a password-protected website or school-approved learning management system for classroom use.
◻
For a table to be in BCNF, following conditions must
be satisfied:
1.
R(relation) must be in 3rd Normal Form
2.
For each Functional Dependency ( X Y ), X should be
a super Key.
◻
Consider the following relationship R(A,B,C,D) and
following relationships:
A
BCD
BC
AD
D B
Above relationship is already in 3NF. Keys are A & BC
32
©2017 Cengage Learning®. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part, except for use as permitted in a license distributed with a certain product or service or otherwise on a password-protected website or school-approved learning management system for classroom use.
◻
Hence, in the functional dependency, A BCD, A
is the superkey.
◻
In the second relation, BC AD, BC is also a key.
◻
But in D B, D is not a key.
◻
Hence, we can break our relationship R, into R1
and R2.
R(A,B,C,D)
R1(A,D,C) R2(D,B)
©2017 Cengage Learning®. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part, except for use as permitted in a license distributed with a certain product or service or otherwise on a password-protected website or school-approved learning management system for classroom use.
©2017 Cengage Learning®. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part, except for use as permitted in a license distributed with a certain product or service or otherwise on a password-protected website or school-approved learning management system for classroom use.
©2017 Cengage Learning®. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part, except for use as permitted in a license distributed with a certain product or service or otherwise on a password-protected website or school-approved learning management system for classroom use.
BCNF (Activity)
©2017 Cengage Learning®. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part, except for use as permitted in a license distributed with a certain product or service or otherwise on a password-protected website or school-approved learning management system for classroom use.
©2017 Cengage Learning®. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part, except for use as permitted in a license distributed with a certain product or service or otherwise on a password-protected website or school-approved learning management system for classroom use.
▪
BCNF Definition
▪
A table is in BCNF if every determinant in
that table is a candidate key. If a table
contains only one candidate key, 3NF and
BCNF are equivalent.
©2017 Cengage Learning®. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part, except for use as permitted in a license distributed with a certain product or service or otherwise on a password-protected website or school-approved learning management system for classroom use.
Multivalued Dependencies
▪
A multivalued dependency occurs when a
determinant is matched with a particular set of
values:
Employee Degree
Employee Sibling
PartKit Part
▪
The determinant of a multivalued dependency can
©2017 Cengage Learning®. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part, except for use as permitted in a license distributed with a certain product or service or otherwise on a password-protected website or school-approved learning management system for classroom use.
3-3
0
©2017 Cengage Learning®. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part, except for use as permitted in a license distributed with a certain product or service or otherwise on a password-protected website or school-approved learning management system for classroom use.
Two Multivalued Dependencies
EmployeeName
EmployeeDegree
EmployeeSibling
1
Chau
BS
Eileen
2
Chau
BS
Jonathan
3
Green
BS
Nikki
4
Green
MS
Nikki
5
Green
PhD
Nikki
6
Jones
AA
Frank
7
Jones
AA
Fred
8
Jones
AA
Sally
9
Jones
BA
Frank
10
Jones
BA
Fred
11
Jones
BA
Sally
EMPLOYEE_DEGREE_SIBLING
©2017 Cengage Learning®. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part, except for use as permitted in a license distributed with a certain product or service or otherwise on a password-protected website or school-approved learning management system for classroom use.
Eliminating Anomalies from
Multivalued Dependencies
▪
Multivalued dependencies are not a problem if they
are in a separate relation, so:
▪
Always put multivalued dependencies into their own
relation.
©2017 Cengage Learning®. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part, except for use as permitted in a license distributed with a certain product or service or otherwise on a password-protected website or school-approved learning management system for classroom use.
That Very Strange Table Again
Now we understand why this is a very strange table.
It has multivalued dependencies!
©2017 Cengage Learning®. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part, except for use as permitted in a license distributed with a certain product or service or otherwise on a password-protected website or school-approved learning management system for classroom use.
4NF
©2017 Cengage Learning®. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part, except for use as permitted in a license distributed with a certain product or service or otherwise on a password-protected website or school-approved learning management system for classroom use.
35
Denormalization
▪
Normalization is one of many database design goals.
▪
However, normalized tables result in:
▪
additional processing
▪
loss of system speed
▪
When normalization purity is difficult to sustain due to
conflict in:
▪
design efficiency
▪
information requirements
▪
processing speed
→
Denormalize by
▪
use of lower normal form
©2017 Cengage Learning®. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part, except for use as permitted in a license distributed with a certain product or service or otherwise on a password-protected website or school-approved learning management system for classroom use.
Denormalization
▪
In order to generate the report below, a temporary
denormalized table is used since the last four semesters
of each faculty member could be different due to
©2017 Cengage Learning®. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part, except for use as permitted in a license distributed with a certain product or service or otherwise on a password-protected website or school-approved learning management system for classroom use.
Denormalization
▪
EVALDATA is the master data table which is normalized
▪
FACHIST is created via a series of queries in order to produce
the desired report
©2017 Cengage Learning®. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part, except for use as permitted in a license distributed with a certain product or service or otherwise on a password-protected website or school-approved learning management system for classroom use.
Exercise 1
©2017 Cengage Learning®. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part, except for use as permitted in a license distributed with a certain product or service or otherwise on a password-protected website or school-approved learning management system for classroom use.
Exercise 2
Member
Cod
Skill
Code
Skill
Type
Title
Name
Last Name
Age
Group
Code
City
Code
©2017 Cengage Learning®. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part, except for use as permitted in a license distributed with a certain product or service or otherwise on a password-protected website or school-approved learning management system for classroom use.
Expert
Code
Expertise
Code
The
Expertise
Name
Last Name
Department
Code
Department
Name
Location
21
113
System
Alice
Adams
55
Brown
NYC
35
113
System
Tom
Hank
32
Green
LA
179
Database
204
Program
50
179
Database
Robert
Cody
40
Yellow
Mexico
77
148
Web
Lisa
Baker
52
Black
Hawaii
179
Database
EXERCISE 3: Analyze and prepare Normalization Expert
database system To have the least redundancy.
©2017 Cengage Learning®. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part, except for use as permitted in a license distributed with a certain product or service or otherwise on a password-protected website or school-approved learning management system for classroom use.
Code
Term
Subject Code
Subject Name
Subject
Category
Category Name
Credit
Hours
Grade
4501
1/45
A01
Mathematics
02
Natural Science
3
F
1/45
A03
Urdu Language
01
Basic Studies
3
C
2/45
A01
Mathematics
02
Natural Science
3
D
2/45
A04
English 1
01
Basic Studies
3
D+
4502
1/45
A01
Mathematics
02
Natural Science
3
B
1/45
A04
English 1
01
Basic Studies
3
C
2/45
A02
Database System
03
Computer
Science
3
B
2/45
A05
Physics
02
Natural Science
3
D
4503
1/45
A06
Data
communication
03
Computer
Science
3
A
1/45
A04
English 1
01
Basic Studies
3
C
2/45
A05
Physics
02
Natural Science
3
D
4506
1/45
A03
Urdu Language
01
Basic Studies
3
B+
1/45
A04
English 1
01
Basic Studies
3
C
EXERCISE 4: Analyze and prepare Normalization
©2017 Cengage Learning®. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part, except for use as permitted in a license distributed with a certain product or service or otherwise on a password-protected website or school-approved learning management system for classroom use.
Stereos To Go
Invoice
Order
No.
Date:
Account
No.
Ite
m
Numbe
r
Product
Description/Manufacturer
Qt
y
Pric
e
Produc
tCod
e
1
2
3
4
5
Date Shipped:
Custome
r:
Address:
City
Stat
e
Zip
Code
10001
0000-000-0000-0
John Smith
2036-26 Street
Sacramento CA
95819
SAGX730 Pioneer Remote A/V Receiver
AT10 Cervwin Vega Loudspeakers
CDPC725 Sony Disc-Jockey CD Changer
Subtotal
Shipping &
Handling
Sales Tax
Total
132985
10000
10306
153291
1
1
1
56995
35995
39995
06/15/2010
06/15/2010
E
x
e
r
c
i
s
e
5
©2017 Cengage Learning®. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part, except for use as permitted in a license distributed with a certain product or service or otherwise on a password-protected website or school-approved learning management system for classroom use.
©2017 Cengage Learning®. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part, except for use as permitted in a license distributed with a certain product or service or otherwise on a password-protected website or school-approved learning management system for classroom use.
©2017 Cengage Learning®. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part, except for use as permitted in a license distributed with a certain product or service or otherwise on a password-protected website or school-approved learning management system for classroom use.
©2017 Cengage Learning®. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part, except for use as permitted in a license distributed with a certain product or service or otherwise on a password-protected website or school-approved learning management system for classroom use.
©2017 Cengage Learning®. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part, except for use as permitted in a license distributed with a certain product or service or otherwise on a password-protected website or school-approved learning management system for classroom use.
Table 4-3 contains sample data for parts and for vendors who supply those parts. In
discussing these data with users, we find that part numbers (but not descriptions)
uniquely identify parts and that vendor names uniquely identify vendors.
▪
Convert this table to a relation (named PART SUPPLIER) in first normal form.
Illustrate the relation with the sample data in the table.
▪
List the functional dependencies in PART SUPPLIER relation and identify a
candidate key.
©2017 Cengage Learning®. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part, except for use as permitted in a license distributed with a certain product or service or otherwise on a password-protected website or school-approved learning management system for classroom use.