• No results found

database_design21.ppt

N/A
N/A
Protected

Academic year: 2020

Share "database_design21.ppt"

Copied!
38
0
0

Loading.... (view fulltext now)

Full text

(1)
(2)

Agenda

Announcements

Chapter 17 material – Physical Database

Design

15 Minute Break

(3)

Annoucements

PMI meeting this evening at the IT Club

12/29

 Discuss actions to date  Organize future activities

Collect emails for further interest

Project Designs due 1/8 or ealier

 Any questions?

Project Presentations start 1/5

(4)

Review: Logical DB Design

Logical DB Design is about converting ERDs or

Object Models into a logical record structure

The logical record structure is then converted to a

relational model

Logical Record Structure: a way of describing

records at the system level

We use relational analysis to convert logical record

structure to a set of relations

Relational analysis is a standard technique for

reducing models into non-redundant forms by

(5)

Review: DB Design 101

If we wanted to do so, we could store every item in a student table just like that below.

What is wrong with this design?

STUDENT_ID NAME DOB CITA_110_GRADE CITA_205_GRAD E

Tlombardi Thomas Lombardi Feb 12, 1970 75 Olombardi Ola Lombardi Jun 13, 1972 89

JSmith John Smith May 27, 1975

(6)

Review: DB Design 101

Database Normalization: a technique for

designing relational databases that aims to reduce data duplication, null fields, and data anomalies

If the same piece of data is stored in more than 1

place, it is possible for an update to put the data into an inconsistent state thus violating data

integrity.

Sufficiently normalized tables preserve data

integrity

Different types of databases require different

(7)

Review: Relations

Relation

: a table or list of values

Attribute in a relational model

: a column

of the table of list of values

Relational Model

: A set of tables that

describe the data in a system

(8)

Review: Relations

USER-ID USERNAME PASSWORD

1 Tlombardi Pwasd

2 Bob 23424jkd

GROUP-ID GROUPNAME

1 Admin

2 Member

GROUP-ID USER-ID

1 1

2 1

(9)

Review: Non-Normal Form

ORDER-NO ORDER-DATE ORDER-LINES

Ord1

Ord2

6 June 1998

3 May 1998

PART-NO QTY-ORDERED p1 10 p6 30

PART-NO QTY-ORDERED p5 10

p6 50 p2 30

(10)

Review: 1st Normal Form

ORDER-NO

ORDER-DATE PART-NO QTY-ORDERED

Ord1 6 June 1998 p1

10

Ord1 6 June 1998 p6 30

Ord2 3 May 1998 p5 10

Ord2 3 May 1998 p6 50

Ord2 3 May 1998Each attribute has a discrete value!p2 30

(11)

Review: Functional Dependencies

Functional dependency: where one value of

an attribute determines a single value of another attribute.

For example, in our previous case,

ORDER-DATE is functionally dependent on ORDER-NO.

There can be only one ORDER-DATE for each

ORDER-NO!

In the same way, there can be only 1

(12)

Review: Relation Keys

Relation Key

: a set of attributes whose

values identify a unique row in a relation

 Can include more than 1 column

 Relations can have more than 1 relation key  Must apply for all possible contents of the

relation

Key attribute

: an attribute is a key attribute

if it is at least part of one relation key

Non-key attribute

: an attribute that is not

(13)

Review: Normal-Form Relations

Normal-form Relations:

a set of relations

that describes the data in the system, but

where each data component is a simple value

 We would like relation keys to determine

unique values of non-key attributes

Data Redundancies

 The design goal is to eliminate data

redundancies

 Relations without such redundancy satisfy a

(14)

Review: Normal Forms

1NF: each column is a discrete value

2NF: All non-key attributes must be functionally

dependent on the whole of each relation key

3NF: There can be no functional dependencies

between non-key attributes

BCNF (Optimal Normal Form): set of relations

that have no single-value redundancy

4NF/5NF and beyond: address problems

arising from multivalued dependencies

Mutlivalued dependency: where one value of

(15)

Normalization and its Discontents

Is a higher degree of normalization better?

The higher the degree of normalization the more the

database is write optimized

The lower the degree of normalization the more the

database is read optimized

Different types of databases require different levelsA reporting system that is too normalized will

require more joins and impact performance adversely

A transaction system that is denormalized will result

(16)

Chapter 17: Physical DB Design

Databases are implemented using database

management systems (DBMS)

Each DBMS has its own properties and

proprietary technologies

The benefit of moving from logical DB design to

physical DB design

 Ensures that design is free of proprietary solutions

 Materialized views are unique to Oracle

(17)

Chapter 17: Physical DB Design

Major questions of physical DB design: how will

records be stored? how will records be accessed?

The default option is the table scan: row by row

search for data in the order of storage

 In general table scans are horrible for performance!!!

Direct access: Retrieval of records based on

indexed fields (keywords)

Index: uses one or more fields as a key to find

records quickly

(18)

Chapter 17: Database Types

Relational Databases

 Oracle

SQL Server

 MySQL

Hierarchical Databases

 IBM Information Management System (IMS)

Network Databases

CODASYL Standard (Consortium on Data Systems

Languages)

Object Databases

(19)

Chapter 17: Convert LRS

Converting the Logical Record Structure to a

data model supported by the DBMS

 RDBMS – simply convert LRS into relations

that will become tables in database

Network DB

 Logical records become record types  Logical record links become set types

 Hierarchical DB

 Only one parent

(20)

Chapter 17: Access

Requirements

How will the data be accessed?Example: the login screen

Every login screen accepts a username / password

 You can be sure that the database table behind

this application has an index on the username field

 SELECT * FROM USER WHERE

USERNAME=‘TOM’;

 The USERNAME field should be indexed so that

(21)

Chapter 17: Access

The record types accessed by each request

 Which tables? How often?

The sequence in which records are accessed

 ASC, DESC, ordered on multiple fields

The access keys

 What fields need to be indexed?

The items retrieved from each record

What fields need to be pulled back?

The number of records accessed

(22)

Chapter 17: DB Structures

Indexes are only part of the story of physical

design

Tablespaces are containers for tables that affect

how data can be accessed

Defines the storage characteristics for data including

 Maps to logical containers to physical files and devices

 Allocates new storage when needed in extents

Tablespaces can be spread across several physical

devices

Indexes and Tables can be stored on separate

(23)

Chapter 17: DB Structures

Block Sizes – the chunks of data allocated to a

table

Block Size should be multiple of Disk Block SizeBlock Size should be tuned to record length

Buffers, Buffer Pools, Cache, Cache-hit ratioStored Procedures, Triggers, Distributed

Transactions

DB Configuration and DB Design

 In order to meet your performance requirements,

you may need to tweak design purity

(24)

Chapter 17: Design Trade-Offs

Performance vs. Storage – Indexes require

space. You can always lean on storage to improve performance

Performance vs. Structure – Redundancy in

data can improve performance in certain areas.

Read vs. Write – Indexes improve retrieval

metrics, but degrade the performance of

insert/update because the index must be rebuilt

Performance vs. Administration – Indexes

require maintenance

 Rebuild indexes periodically

(25)

Chapter 17: Database Design

We have talked about the differences

between

OLTP Databases  Data warehouses

Let’s review these differences in light of

(26)

Break

15 Minute Break

When we get back we’ll discuss Program

(27)

Chapter 16: Program Design

We have captured business logic in a number

of different ways:

 DFD

 Sequence Diagrams  Transition DiagramsStructured English  Decision Tables

How do we turn these analysis and design

(28)

Chapter 16: Program Design

Identify automation boundaries in a DFD

 This highlights which part of a process should

be automated

Deciding which tasks should be automated vs.

manual can influence the feasibility of a project

Identify component subsystem boundaries

 What are the major components?

 Can we work on these concurrently?

 How will we test these? Can they be tested

(29)

Chapter 16: Program Design

After subsystems are chosen, process

specifications are refined for each DFD process

DFD processes are converted into program

modules

Program modules are arranged on a structure

chart

Structure chart displays program modules and

their interconnection

(30)

Chapter 16: Program Design

Program Design Metrics

 How do we know if our modules, packages,

programs make sense in terms of design?

 We use coding metrics to help us

Static Analysis ToolsPMD

 CheckStyle

Dynamic Analysis Tools  Findbugs

(31)

Chapter 16: Program Design

Coupling

describes the nature, direction,

and quantity of parameters passed between

modules

 loose coupling

Cohesion

describes how system functions

are coded into modules

 One function, one module = high strength

Loose coupling and high strength =

(32)

Chapter 16: Coupling

Content coupling: when a module makes direct

references to the contents of another module

These should be passed as parameters

 Imagine if an object had all public variables!!!  This is horrible!!!

Common-Environment Coupling: when

modules refer to the same data structure in a common environment

Shared files

(33)

Chapter 16: Coupling

Control coupling

: when a module passes a

control element to another module

 Control flags, function codes, switches

 Changes made to called function require

changes to calling function

Data coupling

: when a module passes data

through a well-defined interface to another

module

 Most desirable form of coupling

(34)

Chapter 16: Cohesion

Coincidental Strength: there is no meaningful

relationship between the parts in a module

Logical Strength: all of the elements in a module

perform similar tasks: edit, read, write

Temporal Strength: all functions are grouped

according to when they are used: initialization, etc.

Procedural Strength

Communication Strength: all functions that

communicate with each other are in the same module

(35)

Chapter 16: Code Complexity

McCabe’s Cyclomatic Complexity Number

measures the number branches in code to estimate the complexity of code.

 The higher the number the more complicated the code  Coupling/Cohesion metrics can help plan for testing

Modules that have many other modules that depend on

them should be more thoroughly tested

 Modules that have many other modules that depend on

them may need to be restructured

Quality metrics

(36)

Software Architecture Notes

Transactions across multiple systems

 Distributed Transactions

 Oracle

 SQL Server

 Message Queue  LDAP system

 Transaction coordinator uses 2 Phase Commit

Protocol to ensure that all members in the transaction agree

(37)

Software Architecture Notes

Sacrificing design purity for performance

 The SQL Server 2000 driver debacle

 SQL Server 2000 Driver for Java had a bug in its code

that made statement caching not work

 The driver rebuilt a query from scratch each time it was

called

 Executed 100,000 times

 Worked perfectly with other platforms Oracle, SQL Server 2005

 For performance, added if it’s SQL Server 2000 logic

then do this other crazy thing!!!

(38)

Software Architecture Notes

Sacrificing design purity for ease of maintainence

Validation engine required frequent changes due to new and changing requirements

 Rather than change interfaces again and again we invented the bag interface

 Rules for the bag: you can put anything you want into it

Every validator accepted a bag as a parameter with its configuration items inside

References

Related documents

If you want to transfer data into the BW system from other database management systems using DB Connect, you need to install both the database-specific DB Client as well as

Using Databases to Improve Business Performance and Decision Making Essentials of Management Information Systems Chapter 5 Foundations of Business Intelligence: Databases

• Physical database design is the process of implementing the logical data model in a DBMS?. • We shall assume that the logical data model is the relational

Database Administra�on, Physical database design and tuning, Distributed database systems,

The lower RH in the mound humus layer after stump removal seems to be an effect of soil 435 mixing. 436 437 Acknowledgements

INCIDENCE OF DENTAL CARIES IN TUBE-FED CHILDREN AND TUBE- FED CHILDREN RECEIVING ORAL FEEDING THERAPY: A..

This dissertation will focus on reliable and efficient signal reconstruction implementation based on Orthogonal Matching Pursuit (OMP) algorithm from the fewer measurements in

the acute care analysis is based on office of Statewide Health Planning and development (oSHPd) utilization data for acute care discharges and days in licensed general acute