Object-persistence formats. Files. Object-oriented databases. Data management layer design. Relational databases. Relational Database Example.

(1)

Data management layer design

Modelling and Systems Development Lecture 12

Mapping problem-domain objects

to object-persistence formats

Object-persistence formats

• Files

– Sequential and random-access

• Relational databases

• Object-oriented databases

• Object-relational databases

Files

• Directly supported by programming language

– Example: java.io package

• Sequential access

– Optimized for operations on entire file – Access to specific objects not efficient – Example: java.io.InputStream class

• Random access

– Optimized for finding specific objects – Entire-file operations not efficient – Example: java.io.RandomAccessFile

Relational databases

• Primary key

– an attribute that uniquely identifies a row

• Foreign key

– an attribute that is a primary key from a different table

– in order to simulate associations between objects

• Referential integrity

– guarantees that the foreign-key links are valid

• Structured Query Language (SQL)

– the standard language for accessing relational database tables

Relational Database Example

Figure 13-3 Goes Here

Primary keysare underlined.

Foreign keysin Order are double underlined.

Referential integrityis suggested through the red ovals.

Object-oriented databases

• Two approaches

– Adding persistence extensions to OO languages

– Separate DBMS

• Standards from ODMG (

Object Data

Management Group

): ODL, OML, OQL

(2)

Object-oriented databases

• Support complex data:

– MM

– CAD/CAM, GIS

– Finance, Healthcare, Telecom

• Object ID assigned

• Some support for inheritance

• Sharp learning curve

Object-relational databases

• Relational databases extended to

handle the storage of objects

• Use of user-defined data types

• Extended SQL to handle complex

object data

• Inheritance tends to be language

dependent

Comparing object storage formats

Mapping to object-persistence

formats

• Storing the problem domain (PD) objects

introduces conversion requirements

• Put functionality for storing and retrieving

in

data management

(DM) layer

– DM classes dependent on PD classes, not vice versa

• Add primary and foreign keys

– Unless they add too much overhead

Appointment System: PD and

DM Layers

Patient Appt PD Layer DM Layer DM-Patient DM-Appt contains methods for storing and retrieving Appt-objects

Mapping to an OODB

factoring out MI in 2 different ways Superclass1 -attribute1 -attribute2 Superclass2 -attribute3 -attribute4 Class1 -attribute5 -attribute6 Superclass1 -attribute1 -attribute2 Class1 -attribute5 -attribute6 Superclass2 -attribute3 -attribute4 1 1 Superclass1 -attribute1 -attribute2 Class1 -attribute3 -attribute4 -attribute5 -attribute6

(3)

Mapping PD objects to ORDBMS

schema

1. Map PD classes to DM classes and tables 2. Map attributes to

columns; distinguish

1. elementary types 2. set types 3. class types (object

ID’s) 4. derived attributes Class -attr1: elem_type -attr2: set_type -attr3: class_type Patient DM-Patient Patient_Table these types of attributes are supported by the ORDBMS tables

Mapping PD objects to ORDBMS

schema

3. Map inheritance to associations 4. Map associations to columns (object ID’s) Class1 Class2 Superclass Class1_Table Class2_Table Superclass_tbl -Superclass_table[1..1] -Class2_table[1..*] 1 Class1_Table 1..*

Mapping PD objects to ORDBMS

schema

Patient Appt Person Symptom 0..* 1..* Patient_Table -Person_Table[1..1] -Appts[0..*] -Symptoms[1..*] -name: String 0..* Symptom_Table -name[1..1] -Patients[0..*]

Also add read/write methods in DM_Patient and other DM classes

Person_Table Appt_Table -name[1..1] ... -date[1..1] ... 1

Mapping PD objects to RDBMS

schema

• Get rid of multivalued attributes • Replace object ID’s by foreign keys

Patient_Table -Person_ID ... Person_Table Appt_Table -name -Person_ID ... -date -Person_ID Symp_Table -name -Symp_ID Pat_Symp_Table -Person_ID -Symp_ID Patient Appt Person Symptom 0..* 1..* -name: String 0..* 1

Navigation is not preserved

Album title Track title 1 * «table» Albums key title «table» Tracks key albumKey title

Building a DM layer

• Very complicated to map to an RDBMS

• Several design patterns can be used

(see link on MSO website)

• Many tools are now available

(4)

Optimizing RDBMS-based object

storage

• Two dimensions for optimization:

– Storage efficiency (minimizing storage space)

– Speed of access (minimizing time to retrieve desired information)

Optimizing storage efficiency

• Reduce redundant data

• Limit null values

– Multiple possible interpretations can

lead to mistakes

• A well-formed logical data model does

not contain redundancy or many null

values

Normalization

• A model is in first normal form (1NF) if

it does not lead to multi-valued fields or

repeating fields

– Every row has the same number of columns

– All fields contain precisely one value

Normalization

Order -Order_no -Cust_ID -Date -Last_Name -First_Name -State -Tax_Rate Prod_Order -Order_no -Product_no -Product_descr -Product_price -Product_qty 1..* 0..*

1NF: repeating product fields in original Order table are now in a separate table. However, there are still partial dependencies.

Normalization

Order -Order_no -Cust_ID -Date -State -Tax_Rate Prod_Order -Order_no -Product_no -Product_qty 1..* 0..* Customer -Cust_ID -Last_Name -First_Name Product -Product_no -Product_descr -Product_price 0..* 1..1

2NF: With partial dependencies removed there only remains a transitivedependency: the Tax rate depends only on the State.

Normalization

Order -Order_no -Cust_ID -Date -State Prod_Order -Order_no -Product_no -Product_qty 1..* 0..* Customer -Cust_ID -Last_Name -First_Name Product -Product_no -Product_descr -Product_price 0..* 1..1

3NF: With OO-modelling you would obtain a similar model right away!

State

-State -Tax_Rate

1..1

(5)

Optimizing data access speed

• Denormalize by

adding fields:

– of look-up tables – of tables in 1-1 relationship Order -Order_no -Date -Payment_type -Payment_desc 0..* Payment_type -Payment_type -Payment_desc 1..1

Optimizing data access speed

• Clustering

– put similar records close together on the hard disk

• Indexing

– more space (additional tables)

– more speed (especially when index is put in memory) when retrieving data

– less speed when updating data

Payment Type Index

_{Guidelines for creating indexes}

• Use indexes sparingly for transactionsystems • Use many indexes to increase response times

in decision supportsystems • For each table

– Create a unique index based on the primary key – Create an index based on the foreign key

• Create an index for fields used frequently for grouping or sorting