DATABASE DESIGN

Academic year: 2021

Good Design vs. Bad Design

Normalization 1

(2)

Guidelines

⚫ Design relation schema so that it is easy to explain its meaning

⚫ Do not combine attributes from multiple entity types and relationship types into a single relation

(3)

Guidelines

⚫ Design base relation schemas so that no redundant information exists in tuples

⚫ Significant effect on storage space

(4)

Guidelines

⚫ Design base relation schemas so that no update anomalies will occur

⚫ Types of update anomalies:

◦ Insertion

◦ Deletion

◦ Modification

(5)

Easy to describe? One entity or two?

⚫ S# - Suppliers, P# - Parts;

⚫ Primary Key = {S#, P#}


(6)

Data Redundancy

⚫ City of a supplier is listed several times

⚫ Waste of storage


(7)

Data Anomalies

Update Anomaly

◦ If we change the city of a supplier, we must update all the tuples related to this supplier. If we miss one tuple, the data is no longer consistent.


(8)

Data Anomalies

Insertion Anomaly

◦ Until a supplier actually supplies a part, we can’t record (insert) its city.


(9)

Data Anomalies

Deletion Anomaly

◦ If we delete all the shipments of a supplier, its city is gone too.
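The three anomalies can be reproduced on a small in-memory version of the combined relation. This is a minimal sketch, assuming the relation has the columns S#, City, P# and Qty; the rows reuse values from the supplier and shipment tables that appear later in these slides, and the names `shipments`, `cities_of` and `can_insert` are hypothetical.

```python
# Hypothetical single relation keyed on (S#, P#): the supplier's city repeats per shipment.
shipments = [
    {"S#": "S1", "City": "London", "P#": "P1", "Qty": 300},
    {"S#": "S1", "City": "London", "P#": "P2", "Qty": 200},
    {"S#": "S2", "City": "Paris", "P#": "P1", "Qty": 300},
    {"S#": "S3", "City": "New York", "P#": "P2", "Qty": 200},
]

def cities_of(rows, supplier):
    """Return the set of cities recorded for one supplier."""
    return {r["City"] for r in rows if r["S#"] == supplier}

# Update anomaly: changing only one of S1's tuples leaves two different cities.
partially_updated = [dict(r) for r in shipments]
partially_updated[0]["City"] = "Amsterdam"
inconsistent = cities_of(partially_updated, "S1")

# Deletion anomaly: removing S3's only shipment also removes its city.
after_delete = [r for r in shipments if r["S#"] != "S3"]
lost_city = cities_of(after_delete, "S3")

# Insertion anomaly: a supplier with no shipment has no P#,
# so the primary key {S#, P#} cannot be filled in.
def can_insert(row):
    """A tuple needs every primary-key attribute to be present."""
    return row.get("S#") is not None and row.get("P#") is not None

s5_ok = can_insert({"S#": "S5", "City": "Athens", "P#": None, "Qty": None})

print(inconsistent)  # two different cities for S1
print(lost_city)     # empty set: the city of S3 is gone
print(s5_ok)         # False
```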


(10)

Normalization of Relations

⚫ Takes a relation schema through a series of tests

◦ Certify whether it satisfies a certain normal form

◦ Proceeds in a top-down fashion

Normal form tests

(11)


Normal Forms - definitions

⚫ 1NF: All domain values in R are atomic

⚫ 2NF: R is in 1NF and every non-key attribute is fully dependent on the key

⚫ 3NF: R is in 2NF and every non-key attribute is non-transitively dependent on the key

Mnemonic: The Key (1NF), The Whole Key (2NF), And Nothing But The Key (3NF)

(12)

First Normal Form

⚫ Part of the formal definition of a relation in the basic (flat) relational model

⚫ Only attribute values permitted are single atomic (or indivisible) values

⚫ Does not allow nested relations

◦ Tuple with a relation within it

(13)

First Normal Form (cont’d.)

⚫ Techniques to achieve first normal form

◦ Remove nested relation and non-atomic attributes into a new relation

◦ Propagate the primary key into it

(14)

Normalization of Repeating Groups

Before: S# (P#, QTY)*

S#   P#   QTY
S1   P1   300
     P2   200
     P3   400
S2   P1   300
     P2   400
S3   P2   200

3 “records”

After: S# P# QTY

S#   P#   QTY
S1   P1   300
S1   P2   200
S1   P3   400
S2   P1   300
S2   P2   400
S3   P2   200

6 “records”
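The flattening of the repeating group can be sketched in a few lines. This is an illustrative sketch only: the dictionary `nested` and the function `flatten` are hypothetical names, and the data is the before/after example from this slide.

```python
# Nested form: one "record" per supplier, holding a repeating group of (P#, QTY).
nested = {
    "S1": [("P1", 300), ("P2", 200), ("P3", 400)],
    "S2": [("P1", 300), ("P2", 400)],
    "S3": [("P2", 200)],
}

def flatten(nested_relation):
    """Propagate the key S# into every (P#, QTY) pair, yielding atomic tuples."""
    return [
        (supplier, part, qty)
        for supplier, group in nested_relation.items()
        for part, qty in group
    ]

flat = flatten(nested)
for row in flat:
    print(row)

print(len(nested), "nested records ->", len(flat), "flat records")
```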

(15)

Definition of Functional Dependency

Constraint between two sets of attributes from the database

Cannot determine which FDs hold and which do not unless the meaning of, and relationships among, the attributes are known

(16)

Functional Dependence

Take a relation R(X, Y, …), where X and Y are sets of attributes:

⚫ X → Y means

◦ Y is functionally dependent on X

◦ X (functionally) determines Y

⚫ In Plain English:

◦ Each X-value in R is associated with precisely one Y-value in R.

Or, equivalently

◦ If two tuples have the same X-value, they must have the same Y-value.
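The second formulation translates directly into a check over a relation instance. The sketch below is an assumption-laden illustration (the function name `fd_holds` and the toy rows are hypothetical). Note that an instance can only refute an FD; whether an FD really holds depends on the meaning of the attributes.

```python
def fd_holds(rows, lhs, rhs):
    """Return True if no two rows agree on lhs but differ on rhs."""
    seen = {}
    for row in rows:
        x = tuple(row[a] for a in lhs)
        y = tuple(row[a] for a in rhs)
        # Remember the first Y-value seen for this X-value; any other Y-value violates X -> Y.
        if seen.setdefault(x, y) != y:
            return False
    return True

rows = [
    {"X": 1, "Y": "a"},
    {"X": 1, "Y": "a"},
    {"X": 2, "Y": "a"},
]

print(fd_holds(rows, ["X"], ["Y"]))  # True: each X-value has one Y-value
print(fd_holds(rows, ["Y"], ["X"]))  # False: Y-value "a" appears with X = 1 and X = 2
```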


(17)

Functional Dependencies

⚫ {Part, Warehouse} → {Qty}

⚫ {Warehouse} → {W_Address}

⚫ {Part, Warehouse} → {W_Address}

INVENTORY (Part, Warehouse, Qty, W_Address)

Part   Warehouse   Qty   W_Address
P1     W1          15    Columbus
P2     W1          10    Columbus
P3     W1          25    Columbus
P1     W2          10    Dayton
P2     W2          35    Dayton
P3     W2          11    Dayton
P1     W3          44    Cincinnati

Primary Key = {Part, Warehouse}

(18)

Full Functional Dependency

Y is fully functionally dependent on X if both conditions hold:

1. X → Y, i.e., Y depends on X

2. There does NOT exist a proper subset Z of X (Z ⊂ X) such that Z → Y (i.e., Y does not also depend on just a part of X)

INVENTORY (Part, Warehouse, Qty, W_Address)

Part   Warehouse   Qty   W_Address
P1     W1          15    Columbus
P2     W1          10    Columbus
P3     W1          25    Columbus
P1     W2          10    Dayton
P2     W2          35    Dayton
P3     W2          11    Dayton
P1     W3          44    Cincinnati

Here Qty is fully dependent on {Part, Warehouse}, while W_Address is not, because {Warehouse} → {W_Address}.
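The two conditions can be tested mechanically against the INVENTORY instance. This is a rough sketch, assuming the seven rows shown on the slide; `fd_holds`, `partial_determinants` and `fully_dependent` are hypothetical helper names, and a check on sample data can only suggest, not prove, that a dependency holds in general.

```python
from itertools import combinations

COLS = ("Part", "Warehouse", "Qty", "W_Address")
INVENTORY = [dict(zip(COLS, t)) for t in [
    ("P1", "W1", 15, "Columbus"), ("P2", "W1", 10, "Columbus"),
    ("P3", "W1", 25, "Columbus"), ("P1", "W2", 10, "Dayton"),
    ("P2", "W2", 35, "Dayton"), ("P3", "W2", 11, "Dayton"),
    ("P1", "W3", 44, "Cincinnati"),
]]

def fd_holds(rows, lhs, rhs):
    """True if no two rows agree on lhs but differ on rhs."""
    seen = {}
    for row in rows:
        x = tuple(row[a] for a in lhs)
        y = tuple(row[a] for a in rhs)
        if seen.setdefault(x, y) != y:
            return False
    return True

def partial_determinants(rows, lhs, rhs):
    """Proper, non-empty subsets Z of lhs for which Z -> rhs also holds."""
    return [
        z
        for size in range(1, len(lhs))
        for z in combinations(lhs, size)
        if fd_holds(rows, z, rhs)
    ]

def fully_dependent(rows, lhs, rhs):
    """Condition 1 (lhs -> rhs) and condition 2 (no proper subset determines rhs)."""
    return fd_holds(rows, lhs, rhs) and not partial_determinants(rows, lhs, rhs)

key = ("Part", "Warehouse")
print(fully_dependent(INVENTORY, key, ("Qty",)))             # True
print(fully_dependent(INVENTORY, key, ("W_Address",)))       # False
print(partial_determinants(INVENTORY, key, ("W_Address",)))  # [('Warehouse',)]
```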

(19)

Second Normal Form

Based on concept of full functional dependency

⚫ To second-normalize, decompose into a number of 2NF relations

◦ Nonprime attributes are associated only with the part of the primary key on which they are fully functionally dependent

(20)


Second Normal Form: general definition

Second Normal Form: no nonprime attribute is partially dependent on any candidate key.

{Part, Warehouse} → Qty

Warehouse → W_Address

W_Address is non-prime.

W_Address is partially dependent on the candidate key {Part, Warehouse}, because it depends on Warehouse alone.

Thus (Part, Warehouse, Qty, W_Address) is not in 2NF.

(21)

Decompose To Satisfy 2NF

INVENTORY’ (Part, Warehouse, Qty)

Part   Warehouse   Qty
P1     W1          15
P2     W1          10
P3     W1          25
P1     W2          10
P2     W2          35
P3     W2          11
P1     W3          44

{Part, Warehouse} → Qty

LOCATION (Warehouse, W_Address)

Warehouse   W_Address
W1          Columbus
W2          Dayton
W3          Cincinnati

Warehouse → W_Address

Every non-key attribute is fully dependent on the key
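The decomposition amounts to two projections, and joining them back should reproduce the original relation. Below is a minimal sketch under that reading; `project`, `natural_join` and `as_set` are hypothetical helpers written for this example, not part of the slides.

```python
COLS = ("Part", "Warehouse", "Qty", "W_Address")
INVENTORY = [dict(zip(COLS, t)) for t in [
    ("P1", "W1", 15, "Columbus"), ("P2", "W1", 10, "Columbus"),
    ("P3", "W1", 25, "Columbus"), ("P1", "W2", 10, "Dayton"),
    ("P2", "W2", 35, "Dayton"), ("P3", "W2", 11, "Dayton"),
    ("P1", "W3", 44, "Cincinnati"),
]]

def project(rows, attrs):
    """Projection with duplicate elimination, keeping first-seen order."""
    result = []
    for row in rows:
        t = {a: row[a] for a in attrs}
        if t not in result:
            result.append(t)
    return result

def natural_join(left, right):
    """Combine every pair of rows that agree on their common attributes."""
    joined = []
    for l in left:
        for r in right:
            if all(l[a] == r[a] for a in set(l) & set(r)):
                joined.append({**l, **r})
    return joined

def as_set(rows):
    """Order-independent view of a relation over COLS."""
    return {tuple(row[c] for c in COLS) for row in rows}

inventory2 = project(INVENTORY, ("Part", "Warehouse", "Qty"))
location = project(INVENTORY, ("Warehouse", "W_Address"))
rejoined = natural_join(inventory2, location)

print(len(inventory2), len(location))          # 7 3
print(as_set(rejoined) == as_set(INVENTORY))   # True: nothing lost, nothing spurious
```

Each warehouse address is now stored once in `location` instead of once per part.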

(22)

What are the Functional Dependencies?

S# - Suppliers, P# - Parts

Primary Key = {S#, P#}

P# → Pname, Color, Weight, Pcity

S# → Status, City

{S#, P#} → Qty

(23)

Result : 2NF

S

S#   STATUS   CITY
S1   20       London
S2   10       Paris
S3   30       New York
S4   20       London

P

P#   PNAME   COLOR   WEIGHT   PCITY
P1   Nut     Red     12       London
P2   Bolt    Green   17       Paris
P3   Screw   Blue    17       Rome
P4   Screw   Red     14       London
P5   Cam     Blue    12       Paris
P6   Cog     Red     19       London

SP

S#   P#   QTY
S1   P1   300
S1   P2   200
S1   P3   400
S1   P4   200
S1   P5   100
S1   P6   100
S2   P1   300
S2   P2   400
S3   P2   200
S4   P2   200
S4   P4   300
S4   P5   400

(24)

Data Anomalies Solved

Insertion Anomaly

◦ We can enter the information that S5 is located in Athens, even though S5 does not currently supply any parts.

S

S#   STATUS   CITY
S1   20       London
S2   10       Paris
S3   30       New York
S4   20       London
S5   30       Athens

SP is unchanged from the 2NF result (12 tuples).

(25)

Data Anomalies Solved

Deletion Anomaly

◦ Delete the shipment connecting S3 and P2 by deleting the appropriate tuple from SP; we do not lose the information that S3 is located in New York.

S

S#   STATUS   CITY
S1   20       London
S2   10       Paris
S3   30       New York
S4   20       London

SP is the 2NF result; the tuple to delete is (S3, P2, 200).

(26)

Data Anomalies Solved

Update Anomaly

◦ Change the city for S1 from London to Amsterdam by changing it once.

S

S#   STATUS   CITY
S1   20       Amsterdam
S2   10       Paris
S3   30       New York
S4   20       London

SP is unchanged from the 2NF result (12 tuples).
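The three fixes above can be replayed on in-memory copies of S and SP. This is only a sketch: holding S as a dictionary keyed on S# and SP as a dictionary keyed on (S#, P#) is an assumption made for illustration, while the data values come from the slides.

```python
# S keyed on S#; SP keyed on the composite key (S#, P#).
S = {
    "S1": {"Status": 20, "City": "London"},
    "S2": {"Status": 10, "City": "Paris"},
    "S3": {"Status": 30, "City": "New York"},
    "S4": {"Status": 20, "City": "London"},
}
SP = {
    ("S1", "P1"): 300, ("S1", "P2"): 200, ("S1", "P3"): 400,
    ("S1", "P4"): 200, ("S1", "P5"): 100, ("S1", "P6"): 100,
    ("S2", "P1"): 300, ("S2", "P2"): 400, ("S3", "P2"): 200,
    ("S4", "P2"): 200, ("S4", "P4"): 300, ("S4", "P5"): 400,
}

# Insertion: S5 can be recorded although it supplies no parts yet.
S["S5"] = {"Status": 30, "City": "Athens"}
s5_shipments = [key for key in SP if key[0] == "S5"]

# Deletion: removing S3's only shipment leaves S3's city in S.
del SP[("S3", "P2")]
s3_city = S["S3"]["City"]

# Update: the city of S1 is stored once, so one assignment changes it everywhere.
S["S1"]["City"] = "Amsterdam"

print(s5_shipments)      # []
print(s3_city)           # New York
print(S["S1"]["City"])   # Amsterdam
```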

(27)

Possible Data Anomalies in 2NF

S

S#   STATUS   CITY
S1   20       London
S2   10       Paris
S3   30       New York
S4   20       London

Transitive dependence: S# → City → Status

A supplier’s city has one status value associated with it

(28)

Possible Data Anomalies in 2NF

⚫ Insertion

◦ Can’t enter the status of a city until a supplier is located in that city.

⚫ Deletion

◦ Deleting the only tuple for a supplier causes the status of its city to be lost.

⚫ Update

◦ The status of a city appears many times, so changing it means updating every tuple for that city.

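The transitive dependence S# → City → Status can be spotted on the sample S relation. The following is an illustrative sketch only: `fd_holds` is a hypothetical helper, and a check on four rows suggests a dependency but does not prove it holds in general.

```python
S = [
    {"S#": "S1", "Status": 20, "City": "London"},
    {"S#": "S2", "Status": 10, "City": "Paris"},
    {"S#": "S3", "Status": 30, "City": "New York"},
    {"S#": "S4", "Status": 20, "City": "London"},
]

def fd_holds(rows, lhs, rhs):
    """True if no two rows agree on lhs but differ on rhs."""
    seen = {}
    for row in rows:
        x = tuple(row[a] for a in lhs)
        y = tuple(row[a] for a in rhs)
        if seen.setdefault(x, y) != y:
            return False
    return True

key_to_city = fd_holds(S, ["S#"], ["City"])        # S# -> City
city_to_status = fd_holds(S, ["City"], ["Status"])  # City -> Status
city_to_key = fd_holds(S, ["City"], ["S#"])         # City -> S# (City would be a key)

# Status depends on the key only through City, and City is not itself a key.
transitive = key_to_city and city_to_status and not city_to_key

# Redundancy: London's status is stored once per London supplier.
london_copies = sum(1 for row in S if row["City"] == "London")

print(transitive)     # True
print(london_copies)  # 2
```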

(29)

Normalization Steps


1. Eliminate any repeating groups or nested tables

2. Show functional dependencies

3. Identify candidate key(s)

4. Any attributes FD on part of the Key?

If so, decompose (normalize) to 2NF

(30)

2. Show Functional Dependencies

FD1: property_id# → {county_name, lot#, area, price, tax_rate}

FD2: {county_name, lot#} → {property_id#, area, price, tax_rate}

FD3: county_name → {tax_rate}

(31)

3. Identify Candidate Key

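FD1: property_id# → {county_name, lot#, area, price, tax_rate}

FD2: {county_name, lot#} → {property_id#, area, price, tax_rate}

FD3: county_name → {tax_rate}

Candidate keys: {property_id#} and {county_name, lot#}, since by FD1 and FD2 each determines every other attribute.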
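One common way to find candidate keys from a set of FDs is to compute attribute closures and keep the minimal attribute sets whose closure covers the whole relation. The slides do not name a method, so this is a swapped-in standard technique sketched under that assumption; `closure` and `candidate_keys` are hypothetical names.

```python
from itertools import combinations

ATTRS = {"property_id#", "county_name", "lot#", "area", "price", "tax_rate"}
FDS = [
    ({"property_id#"}, {"county_name", "lot#", "area", "price", "tax_rate"}),  # FD1
    ({"county_name", "lot#"}, {"property_id#", "area", "price", "tax_rate"}),  # FD2
    ({"county_name"}, {"tax_rate"}),                                           # FD3
]

def closure(attrs, fds):
    """All attributes functionally determined by attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

def candidate_keys(all_attrs, fds):
    """Minimal attribute sets whose closure is the whole relation (brute force)."""
    keys = []
    for size in range(1, len(all_attrs) + 1):
        for combo in combinations(sorted(all_attrs), size):
            candidate = set(combo)
            if any(k <= candidate for k in keys):
                continue  # contains a smaller key, so it is not minimal
            if closure(candidate, fds) == all_attrs:
                keys.append(candidate)
    return keys

keys = candidate_keys(ATTRS, FDS)
print(keys)  # the two keys: {property_id#} and {county_name, lot#}
```

The brute-force search is exponential in the number of attributes, which is acceptable for a six-attribute example.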

(32)

4. Any attributes FD on part of the Key?

⚫ Getting to 2NF

◦ FD3 violates 2NF:

◦ tax_rate is not fully functionally dependent on the whole key {county_name, lot#}

FD1: property_id# → {county_name, lot#, area, price, tax_rate}

FD2: {county_name, lot#} → {property_id#, area, price, tax_rate}

FD3: county_name → {tax_rate}
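The 2NF test in step 4 can be sketched as a search for nonprime attributes determined by a proper part of some candidate key. This is a hedged illustration: the candidate keys are taken as given from step 3, and `closure` and `partial_dependencies` are hypothetical helper names.

```python
from itertools import combinations

ATTRS = {"property_id#", "county_name", "lot#", "area", "price", "tax_rate"}
KEYS = [{"property_id#"}, {"county_name", "lot#"}]  # candidate keys from step 3
FDS = [
    ({"property_id#"}, {"county_name", "lot#", "area", "price", "tax_rate"}),  # FD1
    ({"county_name", "lot#"}, {"property_id#", "area", "price", "tax_rate"}),  # FD2
    ({"county_name"}, {"tax_rate"}),                                           # FD3
]

def closure(attrs, fds):
    """All attributes functionally determined by attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

def partial_dependencies(attrs, keys, fds):
    """Pairs (part_of_key, nonprime_attribute) that violate 2NF."""
    prime = set().union(*keys)
    nonprime = attrs - prime
    found = []
    for key in keys:
        for size in range(1, len(key)):  # proper, non-empty subsets only
            for part in combinations(sorted(key), size):
                dependents = (closure(part, fds) - set(part)) & nonprime
                for attribute in sorted(dependents):
                    found.append((set(part), attribute))
    return found

violations = partial_dependencies(ATTRS, KEYS, FDS)
print(violations)  # [({'county_name'}, 'tax_rate')]
```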

(33)

Decomposed

◦ Was:

◦ Now:

Tables now in 2NF

No FDs lost

LOTS = LOTS1 * LOTS2 (where * denotes the natural join)
