• No results found

Normalization. CIS 331: Introduction to Database Systems

N/A
N/A
Protected

Academic year: 2021

Share "Normalization. CIS 331: Introduction to Database Systems"

Copied!
34
0
0

Loading.... (view fulltext now)

Full text

(1)

Normalization

CIS 331:

Introduction to

Database Systems

(2)

Normalization: Reminder

Why do we need to normalize?

To avoid redundancy (less storage space needed, and data is consistent)

(3)

Vladimir Vacic, Temple University 3

Normalization: Reminder

Ssn c-id Grade Name Address 123 cs331 A smith Main

123 cs351 B smith Main 123 cs211 A smith Main

Clearly, Name and Address are redundant

(larger relation + you have to update 3 rows to update the Address)

(4)

Normalization: Reminder

Insertion anomaly:

Ssn c-id Grade Name Address 123 cs331A smith Main

… … … … …

(5)

Vladimir Vacic, Temple University 5

Normalization: Reminder

Delete anomaly:

Cannot delete Shmoo’s enrolment without loosing his address as well

Ssn c-id Grade Name Address 123 cs331 A jones 123 Main 124 cs351 B smith 124 Broad 125 cs211 A shmoo 42 Penn

(6)

Normal Forms

First Normal Form – 1NF

Second Normal Form – 2NF Third Normal Form – 3NF Fourth Normal Form – 4NF Fifth Normal Form – 5NF

(7)

Vladimir Vacic, Temple University 7

First Normal Form (1NF)

1NF: all attributes are atomic

(“no repeating groups”)

Anne Michael Rumpelstiltskin Peter Mary John Smith First Name

(8)

First Normal Form (1NF)

Anne Michael Rumpelstiltskin Rumpelstiltskin Peter Mary John Smith Smith Smith First Name

Last Name Normalized

(9)

Vladimir Vacic, Temple University 9

First Normal Form (1NF)

Not in 1NF 195 kg Metatron 165 lb Uriel 201 lb Gabriel 192 lb Raphael 187 lb Michael Weight Name

(10)

First Normal Form (1NF)

Normalized to 1NF kg 195 Metatron lb 165 Uriel lb 201 Gabriel lb 192 Raphael lb 187 Michael Unit Weight Name

(11)

Vladimir Vacic, Temple University 11

First Normal Form (1NF)

P2, P3 S3 P3 S2 P1, P3, P4 S1 Part Supplier Not in 1NF

(12)

First Normal Form (1NF)

null P3 P2 S3 null null P3 S2 P4 P3 P1 S1 Part3 Part2 Part1 Supplier Is this relation in 1NF?

(13)

Vladimir Vacic, Temple University 13

First Normal Form (1NF)

P2 P3 S3 S3 P3 S2 P1 P3 P4 S1 S1 S1 Part Supplier Normalized to 1NF

(14)

Second Normal Form (2NF)

2NF:

1NF and

all non-key attributes are fully dependent on the PK (“no partial dependencies”)

80 Ericsson Av. A CIS331 Erik Address Grade Course_ID Student Not in 2NF

(15)

Vladimir Vacic, Temple University 15

Second Normal Form (2NF)

A CIS362 Hildur C CIS331 Inge B CIS331 Sven A CIS331 Erik Grade Course_ID Student 212 Reykjavik St. Hildur 192 Freya Blvd. Inge 12 Olafson St. Sven 80 Ericsson Av. Erik Address Student Normalized to 2NF

(16)

Second Normal Form (2NF)

Address

Grade Course_ID

(17)

Vladimir Vacic, Temple University 17

Third Normal Form (3NF)

3NF: 2NF and no transitive dependencies 4.00 A CIS362 Hildur 2.00 C CIS331 Inge 3.00 B CIS331 Sven 4.00 A CIS331 Erik Grade_value Grade Course_ID Student Not in 3NF

(18)

Third Normal Form (3NF)

A CIS362 Hildur C CIS331 Inge B CIS331 Sven A CIS331 Erik Grade Course_ID Student 2.00 C 3.00 B 4.00 A Grade_value Grade

(19)

Vladimir Vacic, Temple University 19

Third Normal Form (3NF)

Value

Grade Course_ID

(20)

BCNF: Reminder

Informally:

Everything depends on the full key, and nothing but the key

Formally:

For every FD a b in F+

(21)

Vladimir Vacic, Temple University 21

Normal forms: BCNF vs. 3NF

If a relation is in BCNF, it is always in 3NF (but not the converse)

In practice, aim for

BCNF

If that’s impossible, we accept

3NF; but we insist on lossless join and dependency preservation

(22)

Normal forms: BCNF vs. 3NF

Let’s consider the ‘classic’ example: STC (Student, Teacher, Course)

Teacher Course

Student, Course Teacher

(23)

Vladimir Vacic, Temple University 23

Normal forms: BCNF vs. 3NF

STC (Student, Teacher, Course)

Teacher Course Student, Course Teacher

1) (Teacher, Course) (Student, Course)

(BCNF? Y+Y - Lossless? No - Dep. pres.? No)

2) (Teacher, Course) (Student, Teacher)

(24)

Normal forms: BCNF vs. 3NF

STC (Student, Teacher, Course)

Teacher Course Student, Course Teacher

in this case it is impossible to have both

BCNF and

(25)

Vladimir Vacic, Temple University 25

Normal forms: BCNF vs. 3NF

STC (Student, Teacher, Course)

Teacher Course Student, Course Teacher

S C

(26)

Normalization: Examples

342 P4 Paris 10 Kohl S7 (null) P3 London 12 Spiritoso S2 125 P4 Paris 10 Jones S1 500 P1 Paris 10 Jones S1 257 P3 Paris 10 Jones S1 Qty Part City Status_City Name Supplier Start by determining

(27)

Vladimir Vacic, Temple University 27

Normalization: Example

FDs

Supplier Name

Supplier City Status_City Supplier Status_City

(Supplier, Part) Qty

Partial Dependencies

(Supplier, Part) Name (Supplier, Part) City

(28)

Normalization: Example

Paris 10 Kohl S7 London 12 Spiritoso S2 Paris 10 Jones S1 City Status_City Name Supplier 257 P3 S1 500 P1 S1 Qty Part Supplier

(29)

Vladimir Vacic, Temple University 29

Normalization: Example

We took care of the partial dependencies, but what about transitive dependencies?

Paris 10 Kohl S7 London 12 Spiritoso S2 Paris 10 Jones S1 City Status_City Name Supplier

(30)

Normalization: Example

Paris Kohl S7 London Spirit S2 Paris Jones S1 City Name Supplier 342 P4 S7 (null) P3 S2 125 P4 S1 257 P3 S1 500 P1 S1 Qty Part Supplier Status_City City

(31)

Vladimir Vacic, Temple University 31

Normalization: Examples

Silberschatz et al.:

7.2 (lossless-join decomposition) hint: use the theorem!

“A decomposition is a lossless-join

decomposition if the joining attribute is a

superkey in at least one of the new relations” 7.16 (lossless-join decomposition)

(32)

Normalization: Final Thoughts

There are higher normal forms (4NF, 5NF), but we will not talk about them

In practice, “normalized” means in BCNF or 3NF

Luckily, in practice, ER diagrams lead to normalized tables (but do not rely on luck)

(33)

Vladimir Vacic, Temple University 33

Normalization: Overview

Why do we normalize?

To avoid redundancy (less storage space needed, and data is consistent)

To avoid update/delete anomalies

A good decomposition should:

be a lossless join decomposition

(you can recover original tables with a join) preserve dependencies

(34)

Normalization: Overview

Boyce-Codd Normal Form (BCNF):

“Everything should depend on the key, the

whole key, and nothing but the key” (so help

me Codd – joke attributed to C.J. Date )

1NF (all attributes are atomic) 2NF (no partial dependencies)

References

Related documents