Normalization
• A step by step process to produce more
efficient and accurate database design
Anomalies
• An inconsistent, incomplete or incorrect state
of database
• Four types of anomalies are of concern here;
Normalization
• A strongly recommended step
• Normalized design makes the maintenance of
database easier
• Normalization applied on each table of a DB
Normalization
• Performed after the logical database design • Informally also performed during conceptual
Normalization Process
•
Different forms or levels of normalization
•
Called first, second, third and so on forms
•
Each form has got certain conditions
•
If a table fulfils the condition(s) for a
Normalized DB Design
•
Process is applied on each table
•
The minimum form in which all
tables are in is called the normal
form of the entire database
•
Objective is to place the DB in
Functional Dependency
• Normalization is based on functional
dependencies (FDs)
• A type of relationship between attributes of a
Functional Dependency
Definition: If A and B are attributes of
a relation R, then B is functionally
Functional Dependency
•
It does not mean that A derives B,
although it may be the case sometime
•
Means that if we know value of A then
Functional Dependency
•
Attribute or set of attributes on the left
side are called determinant and on the
right are called dependents
•
Like R (a, b, c, d, e)
FD Example
FD Example from Table
stId stName stAdr prName prCrdts
S1020 Sohail Dar I-8 Islamabad MCS 64
S1038 Shoaib Ali G-6 Islamabad BCS 132
S1015 Tahira Ejaz L Rukh Wah MCS 64 S1015 Tahira Ejaz L Rukh Wah BCS 132 S1018 Arif Zia E-8,
Islamabad.
FDs and Keys
• We can determine the keys of a table seeing
its FDs
• The determinant of an FD that determines all
FDs and Keys
• A minimal super key is the candidate key, so if a
FDs and Keys
• If there is no other FD where a subset of this
determinant/SK is a super key, then it is a candidate key
FDs and Keys
EMP(eId, eName, eAdr, eDept, prId, prSal)
eId
eName, eDept, eAdr
eId, prId
prSal
STD(stId, stName, prName, adr, nic, cgpa)
stId
stName, prName, adr, nic, cgpa
Inference Rules
•
Called inference axioms or armstrong
axioms
•
These are rules that establish certain
FDs from a given set of FDs
Reflexivity
•
If B is a subset of A then A B,
it also implies that A A
always hold, that is
stName, stAdr stName
Augmentation
•
If we have A B then
AC BC that is
if stId stName then
Transitivity
•
If A B and B C then A C
that is
If
stId prName and prName credits
Then
Additivity or Union
• If A B and A C then A BC
if empId eName and empId qual
Then we can write it as
Projectivity or Decomposition
•
If A BC then A B and A C
if empId eName, qual
Then we can write it as
Pseudotransitivity
If A B and CB D then AC D
if stId stName and
stName, fName stAdr
Then we can write it as
First Normal Form
• A relation is in first normal form iff every
attribute in every tuple contains an atomic value
• There is no multivaued (repeating group) in
First Normal Form
•
One of the basic properties of relation
•
Multiple values create problems in
performing different operations, like, select
or join
•
Remember the treatment of multivalued
First Normal Form
STD(stId, stName, stAdr, prName, bkId)
stId stName stAdr prName bkId
S1020 Sohail Dar I-8 Islamabad MCS B00129 S1038 Shoaib Ali G-6 Islamabad BCS B00327 S1015 Tahira Ejaz L Rukh Wah MCS B08945,
First Normal Form
stId stName stAdr prName bkId
S1020 Sohail Dar I-8 Islamabad MCS B00129
S1038 Shoaib Ali G-6 Islamabad BCS B00327
S1015 Tahira Ejaz L Rukh Wah MCS B08945
S1015 Tahira Ejaz L Rukh Wah MCS B06352
S1018 Arif Zia E-8,
Islamabad.
BIT B08474
Second Normal Form
Full Functional Dependence
Consider the relation
CLASS(crId, stId, stName, fId, room, grade)
crId, stId
stName, fId, room, grade
stId
stName
Second Normal Form
• A relation is in 2nd normal form iff it is in the first
Second Normal Form
CLASS(crId, stId, stName, fId, room, grade)
Anomalies
• Redundancy
Anomalies
crId stId stName fId room grade
C3456 S1020 Suhail Dar F2345 104 B C5678 S1020 Suhail Dar F4567 106
C9876 S1123 Ali Ahmad F2232 101
C3456 S1038 Shoaib Ali F2345 104 A
Second Normal Form
Relation is decomposed based on the FDs
CLASS(crId, stId, stName, fId, room, grade)
crId, stId
stName, fId, room, grade
stId
stName crId
fId, room
STD(stId, stName)
Second Normal Form
Third Normal Form
Transitive Dependency
STD(stId, stName, stAdr, prName, prCrdts)
stId
stName, stAdr, prName, prCrdts
Anomalies
stId stName stAdr prName prCrdts
Third Normal Form
•
STD(stId, stName, stAdr, prName, prCrdts)
•
stId
stName, stAdr, prName, prCrdts
•
prName
prCrdts
3NF Relations
Boyce-Codd Normal Form
• A general form of 3NF• Every relation in BCNF is in 3NF vice-versa is
not always true
BCNF
• A table is in BCNF if every determinant is a
candidate key
• Situation when table in 3NF is not in BCNF
• A non-key determines a part of the composite
BCNF
FACULTY(fId, dept, office, rank, dateHired) fId, dept office, rank, dateHired
office dept
• Table is in 3NF, not in BCNF since the office is not a
candidate key
• Generates multiple overlapping candidate keys so
BCNF
• We decompose the table again to bring it into
BCNF