Part III: Relational SQL
PERSON_IDENTIFIER_TYPE *ID_TYPE
PERSON_IDENTIFIER_TYPE *ID_TYPE
INACTIVE_DATE
Figure 8-1 is an Entity Relationship Diagram (ERD) for my finished analysis. I'll use this as a context as I cover JDBC in the following chapters. Now that we have the analysis completed, let's move on to the design.
Figure 8-1. Entity relationship diagram for the sample HR database
8.3 Relational Database Design
At this point, we have a theoretical analysis of the HR database. Before we create a physical implementation, we need to consider how it will be implemented. This is the step in which we decide which data types we will use for the attributes, determine how to constrain those data types, and define external primary keys, among other things. Let's start by deciding which data types to use.
8.3.1 Selecting Data Types
One of the beautiful things about Oracle is that it does not have presentation data types. There is no money type, for example. Not having presentation data types keeps things simple. The number of data types you need to work with is kept to a bare minimum. With Oracle, you get a small number of data types that allow you to work with the following four basic types of data:
• Binary • Character
• Date • Numeric
For binary data, you have the following Oracle data types to work with:
RAW
A varying-length binary type that can hold up to 2 KB
LONG RAW
A varying-length binary type that can hold up to 2 GB
BLOB
A varying-length binary type that can hold up to 4 GB
BFILE
An external file that can hold up to 4 GB
For character data, you have the following types at your disposal:
CHAR (or NCHAR)
A fixed-length character type right-padded with space characters up to its constraining size
VARCHAR2 (or NVARCHAR2)
A varying-length character type that can hold as many characters as will fit within its constraining size
LONG
A varying-length character type that can hold up to 2 GB
CLOB
A varying-length character type that can hold up to 4 GB
When dealing with character data, it's a good idea not to use CHAR, because the side effects of its fixed length require you to right-pad VARCHAR2 data values in order to do comparisons. LONG and CLOB are very specialized and are needed only in rare occasions. That leaves us with VARCHAR2 as the character data type of choice.
The other two types of data you will work with are dates and numbers. For date values, you have the data type DATE. For numeric data, you have the NUMBER type with up to 38 digits of precision.
A VARCHAR2 data type must be constrained with a maximum size, while NUMBER can be constrained or unconstrained as desired. If you are going to use a multi-byte character set in the database, then you need to make the VARCHAR2 or NVARCHAR2 columns larger to hold the same amount of data. On that thought, I suggest you be liberal in the amount of storage you give your VARCHAR2 data types.
When it comes to constraining the size of numbers, I don't. Why should I specify a maximum size when I don't have to? It seems to me that constraining numbers is an old habit from a time when it was necessary to do so for storage management. Since Oracle uses only the number of bytes required to represent something to store it, i.e., varying-length storage, there is no point in constraining numbers, which builds in obsolescence.
So all this discussion has led up to using three data types: • DATE
• NUMBER • VARCHAR2
Things couldn't get much simpler. Before I write the actual DDL statements to create tables for the HR application, let's talk about DDL coding conventions.
8.3.2 DDL Coding Conventions
Whether you call them conventions or standards, when everyone on a development team plays by the same rules, it's more efficient and just plain easier. I say conventions rather than
standards, because I never found a standard I didn't need to break occasionally in order for things to make sense. Here are my suggested conventions for writing DDL:
1. Make table names singular. For example: PERSON, not PERSONS.
2. Make a primary entity's primary key a sequence-generated number named using the table's name suffixed with _ID. For example: PERSON_ID.
3. Create a sequence for each primary entity's table using the table's name suffixed with _ID. For example: PERSON_ID.
4. Create an index for each primary entity's table using the table's name suffixed with _PK. For example: PERSON_PK.
5. Create any required unique indexes for external primary keys using the table's name suffixed with _UK#. For example, PERSON_UK1.
6. Do not use a parent table's primary key constraint (PKC) as part of the definition for a child table's PKC.
7. Use one of the following two methods to create the PKCs for code tables. First, use the code value as the PKC of the code table. Second, create a dumb key just as you do for primary entities. These two methods are equally valid and fraught with complications. Using code values makes decision support queries easier to write but introduces the problem of lost relationships that the primary entities suffered from in our first analysis. 8. Always create foreign key constraints, even if you must leave them disabled because
they are conditional. This helps to document your database. You can always implement a conditional constraint with a database trigger.
If you use these conventions, it will be easy for you to identify the PKCs and unique keys for a given table, transfer system knowledge to other team members, and simplify your documentation process.
8.3.3 Writing the DDL
Now that we have an application context to work from, and some DDL coding conventions to work with, it's time to write some DDL for our HR database. Writing the code for the DDL is a process by which we take our logical model -- the entities, attributes, internal and external primary keys, and relationships -- and transform them into SQL code to create the physical
implementation: tables, columns, PKCs and unique indexes, and foreign key constraints. We'll start with the PERSON entity. First, here's the table definition:
create table PERSON (
person_id number not null, last_name varchar2(30) not null,
first_name varchar2(30) not null, middle_name varchar2(30),
birth_date date not null, mothers_maiden_name varchar2(30) not null ) tablespace USERS pctfree 20
storage (initial 100 K next 100 K pctincrease 0)
Next, here's the PKC:
alter table PERSON add constraint PERSON_PK primary key ( person_id ) using index
tablespace USERS pctfree 20
storage (initial 10 K next 10 K pctincrease 0)
Here's our external unique constraint:
create unique index PERSON_UK1 on PERSON ( last_name,
first_name, birth_date,
mothers_maiden_name )
tablespace USERS pctfree 20
storage (initial 100 K next 100 K pctincrease 0)
And finally, here's our sequence:
create sequence PERSON_ID start with 1
order
That takes care of PERSON. Now let's do the same for LOCATION:
create table LOCATION (
location_id number not null, parent_location_id number,
code varchar2(30) not null, name varchar2(80) not null, start_date date not null, end_date date )
tablespace USERS pctfree 20
storage (initial 100 K next 100 K pctincrease 0)
alter table LOCATION add constraint LOCATION_PK primary key ( location_id ) using index
tablespace USERS pctfree 20
storage (initial 10 K next 10 K pctincrease 0)
create unique index LOCATION_UK1 on LOCATION ( code,
start_date,
parent_location_id )
tablespace USERS pctfree 20
storage (initial 100 K next 100 K pctincrease 0)
start with 1 order
Here's the PERSON_LOCATION intersection:
create table PERSON_LOCATION ( person_id number not null, location_id number not null, start_date date not null, end_date date )
tablespace USERS pctfree 20
storage (initial 100 K next 100 K pctincrease 0)
alter table PERSON_LOCATION add constraint PERSON_LOCATION_PK
primary key ( person_id, start_date ) using index
tablespace USERS pctfree 20
storage (initial 10 K next 10 K pctincrease 0)
and the PERSON_IDENTIFIER entity:
create table PERSON_IDENTIFIER ( person_id number not null, id varchar2(30) not null, id_type varchar2(30) not null ) tablespace USERS pctfree 20
storage (initial 100 K next 100 K pctincrease 0)
alter table PERSON_IDENTIFIER add constraint PERSON_IDENTIFIER_PK primary key ( person_id, id, id_type ) using index
tablespace USERS pctfree 20
storage (initial 100 K next 100 K pctincrease 0)
and finally, the PERSON_IDENTIFIER_TYPE entity:
create table PERSON_IDENTIFIER_TYPE ( code varchar2(30) not null, description varchar2(80) not null, inactive_date date )
tablespace USERS pctfree 20
storage (initial 100 K next 100 K pctincrease 0)
alter table PERSON_IDENTIFIER_TYPE add constraint PERSON_IDENTIFIER_TYPE_PK primary key ( code )
using index
tablespace USERS pctfree 20
storage (initial 100 K next 100 K pctincrease 0)
Now that we have some of our needed table definitions, let's create the DDL for foreign key constraints. The person table has no foreign key constraints, so we'll start with the LOCATION table:
alter table LOCATION add constraint LOCATION_FK1
foreign key ( parent_location_id ) references LOCATION ( location_id )
alter table PERSON_LOCATION add constraint PERSON_LOCATION_FK1 foreign key ( person_id ) references PERSON ( person_id )
alter table PERSON_LOCATION add constraint PERSON_LOCATION_FK2 foreign key ( location_id ) references LOCATION ( location_id )
and then PERSON_IDENTIFIER:
alter table PERSON_IDENTIFIER add constraint PERSON_IDENTIFIER_FK1 foreign key ( person_id ) references PERSON ( person_id )
alter table PERSON_LOCATION add constraint PERSON_LOCATION_FK2
foreign key ( id_type ) references PERSON_IDENTIFIER_TYPE ( code )
Now that we have our DDL, we can move on to the next step in our process, which is to actually create the database objects. Normally, you'd use Oracle's SQL*Plus to accomplish this task. However, since this is a book about JDBC, I'll show you how to use JDBC to execute the DDL instead. In Chapter 9, we'll cover the execution of DDL and DML. Among other things, you'll see how to execute the DDL to create the HR tables.