Data management layer design
Modelling and Systems Development Lecture 12
Mapping problem-domain objects
to object-persistence formats
Object-persistence formats
• Files
– Sequential and random-access
• Relational databases
• Object-oriented databases
• Object-relational databases
Files
• Directly supported by programming language
– Example: java.io package
• Sequential access
– Optimized for operations on entire file – Access to specific objects not efficient – Example: java.io.InputStream class
• Random access
– Optimized for finding specific objects – Entire-file operations not efficient – Example: java.io.RandomAccessFile
Relational databases
• Primary key
– an attribute that uniquely identifies a row
• Foreign key
– an attribute that is a primary key from a different table
– in order to simulate associations between objects
• Referential integrity
– guarantees that the foreign-key links are valid
• Structured Query Language (SQL)
– the standard language for accessing relational database tables
Relational Database Example
Figure 13-3 Goes Here
Primary keysare underlined.
Foreign keysin Order are double underlined.
Referential integrityis suggested through the red ovals.
Object-oriented databases
• Two approaches
– Adding persistence extensions to OO languages
– Separate DBMS
• Standards from ODMG (
Object DataManagement Group
): ODL, OML, OQL
Object-oriented databases
• Support complex data:
– MM
– CAD/CAM, GIS
– Finance, Healthcare, Telecom
• Object ID assigned
• Some support for inheritance
• Sharp learning curve
Object-relational databases
• Relational databases extended to
handle the storage of objects
• Use of user-defined data types
• Extended SQL to handle complex
object data
• Inheritance tends to be language
dependent
Comparing object storage formats
Mapping to object-persistence
formats
• Storing the problem domain (PD) objects
introduces conversion requirements
• Put functionality for storing and retrieving
in
data management
(DM) layer
– DM classes dependent on PD classes, not vice versa
• Add primary and foreign keys
– Unless they add too much overhead
Appointment System: PD and
DM Layers
Patient Appt PD Layer DM Layer DM-Patient DM-Appt contains methods for storing and retrieving Appt-objectsMapping to an OODB
factoring out MI in 2 different ways Superclass1 -attribute1 -attribute2 Superclass2 -attribute3 -attribute4 Class1 -attribute5 -attribute6 Superclass1 -attribute1 -attribute2 Class1 -attribute5 -attribute6 Superclass2 -attribute3 -attribute4 1 1 Superclass1 -attribute1 -attribute2 Class1 -attribute3 -attribute4 -attribute5 -attribute6Mapping PD objects to ORDBMS
schema
1. Map PD classes to DM classes and tables 2. Map attributes to
columns; distinguish
1. elementary types 2. set types 3. class types (object
ID’s) 4. derived attributes Class -attr1: elem_type -attr2: set_type -attr3: class_type Patient DM-Patient Patient_Table these types of attributes are supported by the ORDBMS tables
Mapping PD objects to ORDBMS
schema
3. Map inheritance to associations 4. Map associations to columns (object ID’s) Class1 Class2 Superclass Class1_Table Class2_Table Superclass_tbl -Superclass_table[1..1] -Class2_table[1..*] 1 Class1_Table 1..*Mapping PD objects to ORDBMS
schema
Patient Appt Person Symptom 0..* 1..* Patient_Table -Person_Table[1..1] -Appts[0..*] -Symptoms[1..*] -name: String 0..* Symptom_Table -name[1..1] -Patients[0..*]Also add read/write methods in DM_Patient and other DM classes
Person_Table Appt_Table -name[1..1] ... -date[1..1] ... 1
Mapping PD objects to RDBMS
schema
• Get rid of multivalued attributes • Replace object ID’s by foreign keys
Patient_Table -Person_ID ... Person_Table Appt_Table -name -Person_ID ... -date -Person_ID Symp_Table -name -Symp_ID Pat_Symp_Table -Person_ID -Symp_ID Patient Appt Person Symptom 0..* 1..* -name: String 0..* 1
Navigation is not preserved
Album title Track title 1 * «table» Albums key title «table» Tracks key albumKey title
Building a DM layer
• Very complicated to map to an RDBMS
• Several design patterns can be used
(see link on MSO website)
• Many tools are now available
Optimizing RDBMS-based object
storage
• Two dimensions for optimization:
– Storage efficiency (minimizing storage space)
– Speed of access (minimizing time to retrieve desired information)
Optimizing storage efficiency
• Reduce redundant data
• Limit null values
– Multiple possible interpretations can
lead to mistakes
• A well-formed logical data model does
not contain redundancy or many null
values
Normalization
• A model is in first normal form (1NF) if
it does not lead to multi-valued fields or
repeating fields
– Every row has the same number of columns
– All fields contain precisely one value
Normalization
Order -Order_no -Cust_ID -Date -Last_Name -First_Name -State -Tax_Rate Prod_Order -Order_no -Product_no -Product_descr -Product_price -Product_qty 1..* 0..*1NF: repeating product fields in original Order table are now in a separate table. However, there are still partial dependencies.
Normalization
Order -Order_no -Cust_ID -Date -State -Tax_Rate Prod_Order -Order_no -Product_no -Product_qty 1..* 0..* Customer -Cust_ID -Last_Name -First_Name Product -Product_no -Product_descr -Product_price 0..* 1..12NF: With partial dependencies removed there only remains a transitivedependency: the Tax rate depends only on the State.
Normalization
Order -Order_no -Cust_ID -Date -State Prod_Order -Order_no -Product_no -Product_qty 1..* 0..* Customer -Cust_ID -Last_Name -First_Name Product -Product_no -Product_descr -Product_price 0..* 1..13NF: With OO-modelling you would obtain a similar model right away!
State
-State -Tax_Rate
1..1
Optimizing data access speed
• Denormalize by
adding fields:
– of look-up tables – of tables in 1-1 relationship Order -Order_no -Date -Payment_type -Payment_desc 0..* Payment_type -Payment_type -Payment_desc 1..1Optimizing data access speed
• Clustering
– put similar records close together on the hard disk
• Indexing
– more space (additional tables)
– more speed (especially when index is put in memory) when retrieving data
– less speed when updating data
Payment Type Index
Guidelines for creating indexes
• Use indexes sparingly for transactionsystems • Use many indexes to increase response times
in decision supportsystems • For each table
– Create a unique index based on the primary key – Create an index based on the foreign key
• Create an index for fields used frequently for grouping or sorting