Relational Database Design Theory
Informal guidelines for good relational designs
Functional dependencies
Normal forms and normalization
1NF, 2NF, 3NF BCNF, 4NF, 5NF
Inference rules on functional dependencies
Additional properties for relational decompositions
Nonadditive join property
2NF, 3NF
2NF
Boyce-Codd Normal Form
BCNF
Difference from 3NF:
3NF allows A to be a key attribute 3NF allows A to be a key attribute
Every relation in BCNF is also in 3NF
Most relation schemas that are in 3NF are also in BCNF but not all:
1NF, 2NF, 3NF, BCNF
Test on any non-trivial XA
Violations normalization
1NF Multi-valued attributes New relation for each multi-valued attributes 2NF a) X is a super key
or
b) X is not a key
1) partial key -> non-key attribute
(partial FD)
New relation for the partial key and its dependent attributes b) X is not a key
or
c) A is a key attribute
(partial FD) dependent attributes
3NF a) X is a super key or
c) A is a key attribute
1) or
2) Non-key attribute -> non-key attribute
(transitive FD)
Above and
New relation for the non-key attribute and its dependent attributes
BCNF a) X is a super key 1) or 2) or
3) non-key -> key attribute
Above and
New relation for the non-key attribute and its dependent attributes
Relational Database Design Theory
Informal guidelines for good relational designs
Functional dependencies
Normal forms and normalization
1NF, 2NF, 3NF BCNF, 4NF, 5NF
Functional dependencies and keys
Additional properties for relational decompositions
Nonadditive join property
Keys and Functional
Dependencies
Superkey
No two distinct tuples in any state r of R can have the
same value for SK
Functional dependency: SK R
Key Key
Superkey of R; and it is minimal (removing any attribute
A from K leaves a set of attributes that is not a superkey
any more)
Functional dependency: K R, and for any A in K,
K-{A}R does not hold
Given a set of functional dependencies, can we find the keys of R?
Inference Rules for FDs
Armstrong’s inference rules (complete)
Reflexivity rule: if Y ⊆ X then X → Y
Augmentation rule: if X → Y then XZ → YZ
Transitivity rule: if X → Y and Y → Z then X → ZTransitivity rule: if X → Y and Y → Z then X → Z
Other rules
Decomposition Rule:
if X → YZ then: X → Y and X → Z
Union or Additive Rule:
Inference Rules for FDs
Specify a set of FDs that can be easily determined from attribute semantics
Infer additional FDs using inference rules The closure of a FD set F
The closure of a FD set F
the set of all FDs that include F as well as all FDs that
Closure of attribute set
The closure of attribute set X under FD set F, denoted as X+
The set of attributes that are functionally determined by
X based on F
How can we computationally find X+
How can we determine a set of attributes X is a key?
Example
SSN+ Dnumber+ Dnumber+ {SSN, Dnumber}+ {SSN, Dnumber, Ename}+Example
SSN+ = {SSN, Ename, Bdate, Address, Dnumber, Dname,
Dmgr_ssn} Dmgr_ssn}
Dnumber+ = {Dnumber, Dname, Dmgr_ssn}
{SSN, Dnumber}+ = {SSN, Ename, Bdate, Address,
Dnumber, Dname, Dmgr_ssn}
{SSN, Dnumber, Ename}+ = {SSN, Ename, Bdate, Address,
Dnumber, Dname, Dmgr_ssn}
Finding Keys based on Attribute
Closure
If X+ = R, then X is a superkey
If X+ = R, and (X-{A})+ != R, then X is a key
How do we find all keys given a FD set F? How do we find all keys given a FD set F? How do we find one key given a FD set F?
Example
Initialization
K = {SSN, Ename, Bdate, Address, Dnumber, Dname, Dmgr_ssn} K = {SSN, Ename, Bdate, Address, Dnumber, Dname, Dmgr_ssn}
Decrease one attribute at a time
…
Relational Database Design Theory
Informal guidelines for good relational designs
Functional dependencies
Normal forms and normalization
1NF, 2NF, 3NF BCNF, 4NF, 5NF
Functional dependencies and keys
Additional properties for relational decompositions
Nonadditive join property
Dependency Preservation Property
Informally, given a decomposition D of R and a FD set F on R, each FD in F either appear directly in D or could be inferred
Claim 1. it is always possible to find a dependency-preserving decomposition D with respect to F such preserving decomposition D with respect to F such that each relation Ri in D is in 3NF
Non-additive Join Property
A decomposition D = {R1, R2, ..., Rm} of R has the lossless (nonadditive) join property with
respect to the set of FDs F on R if, for every legal relation state r, the following holds, where * is the natural join of all the relations in D:
π
π
* (
π
R1(r), ...,
π
Rm(r)) = rlossless for “loss of information”, i.e. “addition of
spurious information”
How to test whether a decomposition satisfies the lossless join property?
Successive Lossless Join
Decomposition
Claim 2 (Preservation of non-additivity in
successive decompositions):
If a decomposition D = {R1, R2, ..., Rm} of R has the
lossless (non-additive) join property with respect to F
and if a decomposition Di = {Q1, Q2, ..., Qk} of Ri has
the lossless (non-additive) join property with respect to the projection of F
then the decomposition D2 = {R1, R2, ..., Ri-1, Q1, Q2,
..., Qk, Ri+1, ..., Rm} of R has the non-additive join property with respect to F.
Normalization with BCNF and
Lossless Join Property
In Practice
Relational design from ER model or existing tables/reports
Normalization for 3NF or BCNF with lossless join property
property
Sometimes normal forms are violated deliberately to achieve better performance (less join operations)