• No results found

Single inheritance occurs when you relate a class to a single parent through generalization. The more specific class inherits attributes and behavior from the more general class.

To transform this situation into a relational table, you can take one of two approaches: mapping the classes directly, or mapping them through spreading the attributes and behavior to the subclasses.

Direct Mapping To map classes directly, you create a single table for each class in the inheritance hierarchy, then create a foreign key relationship for each generalization relationship. Figure 11-11 repeats Figure 7-6, the

Identification inheritance hierarchy.

To transform the hierarchy in Figure 11-11 into tables, you create a single table for each of the classes in the diagram, including the abstract classes. Following the guidelines in the previous section on "Classes," you create a primary key out of explicit or implicit object identifiers. Identification has implicit object identity, so you create a primary key column for the Identification table, usually an integer column that gets a value from an automatic sequence number generator. We'll name this column IdentificationID, meaning that it identifies an Identification object. Because the Identification class also has a composite aggregation association to the Person class, the primary key also contains the primary key of the Person table, PersonID.

CREATE TABLE Identification (

PersonID INTEGER NOT NULL REFERENCES Person, IdentificationID INTEGER NOT NULL,

Constraint Identification_PK PRIMARY KEY (PersonID, IdentificationID))

Figure 11-11: The Identification Inheritance Hierarchy

For each subclass of the root parent, you create the class, then create the same column for each subclass as the primary key for the superclass. If you created multiple primary key columns in the superclass for some reason, you create the same columns in the subclass. With LawEnforcementID, for example, you create the same primary key columns, PersonID and IdentificationID.

CREATE TABLE LawEnforcementID (

PersonID INTEGER NOT NULL REFERENCES Identification, IdentificationID INTEGER NOT NULL REFERENCES Identification, BadgeNumber INTEGER NOT NULL UNIQUE,-candidate key Constraint LawEnforcementID_PK

PRIMARY KEY (PersonID, IdentificationID))

In addition to making the column a primary key using the PRIMARY KEY column or table constraint, you must also make it a foreign key with a REFERENCES column constraint or a FOREIGN KEY column constraint. In the example, PersonID references not the Person table but the Identification table. See earlier in this "Foreign Affairs" section for some perspectives on these constraints. This foreign key constraint partially represents the generalization relationship, but not necessarily fully. If you insert a row in LawEnforcementID, the PersonID and IdentityID of that row must match a row in Identification. The foreign key constraint enforces this restriction. However, if you insert an Identity row, there is nothing that links the IdentityID of that row to any subclass table. Even conceptually, this makes

no sense, because SQL has no way to tell what kind of table you mean to create, as it doesn't understand generalization. You can't put another foreign key constraint on Identity, as it would mean a circular dependency.

Note

This situation applies when your parent class is an abstract class. If the class is concrete, then you can create an object of that class without worrying about subclasses. This analysis suggests that an abstract class needs a trigger that enforces its abstractness. The trigger must ensure that if you insert a row in a table corresponding to an abstract class that you also insert a row into one of the several subclasses of the abstract class. If you use this approach, step back and evaluate your overall system complexity before adding the trigger. You may find that it makes less trouble to maintain the integrity at the application level, particularly for smaller databases. Once your stable of applications grows to a decent size, though, you want to enforce as much integrity on the server as possible, as the next application could foul things up.

Spreading The basic idea behind generalization is inheritance, where each subclass inherits the attributes and operations of the superclasses. You can represent the classes directly as in the prior section on "Direct Mapping," then you can use joins and unions to represent the inheritance. Alternatively, you can do the inheritance up front by creating the attributes and operations in the subclasses, which I call spreading.

Spreading has the main advantage that you no longer need to refer to the superclasses for data values or existence. This eliminates many joins and unions and triggers and thus reduces complexity in both your schema and in your application code. So, why wouldn't you want to do this? The downside of this advantage is denormalization of the database schema design. You get a good deal of unnecessary redundancy in your data, leading to insert, update, and delete anomalies (see the later section on "Normalizing Relations").

Consider the Passport class from Figure 11-11. Using the direct approach, you get this schema design: CREATE TABLE Passport (

PersonID INTEGER NOT NULL REFERENCES Identification ON DELETE CASCADE,

IdentificationID INTEGER NOT NULL REFERENCES Identification ON DELETE CASCADE,

PassportNumber INTEGER NOT NULL UNIQUE,-candidate key IssuingOffice VARCHAR2(100) NOT NULL,

Constraint Passport_PK

PRIMARY KEY (PersonID, IdentificationID))

Using the spreading approach, you get this design instead: CREATE TABLE Passport (

PersonID INTEGER NOT NULL REFERENCES Identification, IdentificationID INTEGER NOT NULL REFERENCES Identification, ExpireDate DATE NOT NULL CHECK (ExpireDate > IssueDate), IssueDate DATE NOT NULL,

PassportNumber INTEGER NOT NULL UNIQUE,-candidate key IssuingOffice VARCHAR2(100) NOT NULL,

Constraint Passport_PK

PRIMARY KEY (PersonID, IdentificationID))

Instead of referring to the ExpiringID table for the ExpireDate and Issue Date, this design copies the columns down the hierarchy into the Passport table (and into all the other subclasses of ExpiringID as well). Passport stands alone as a useful table with all its data available to users. A use case that needs passport information need only access the one table. However, whenever you update the ExpireDate or IssueDate of a passport, you must also update the same columns in the superclass row that corresponds to the passport. That means an additional trigger or code in your application.

As with any design decision, you must consider the broad impact of the decision on overall system quality. Does the increased complexity from the trigger balance well against the reduced complexity in the application code that accesses passport information? For example, if the hierarchy is quite deep, you could wind up accessing several classes by joins rather than a single table that provides all the data. However, the trigger processing contributes little

to the maintenance of the system, because once done you leave it alone. In this case, with a single level in the hierarchy, it's probably not worth it.

It's also worth noting that you can mix the two approaches in your system. You can spread certain deep hierarchies while leaving other, more shallow hierarchies to use the direct approach. Any time you mix design approaches, you risk confusing those who come after you. If you do it, you should carefully document what you're doing, preferably in the database itself using table comments or some kind of repository documentation.

Multiple Inheritance

Multiple inheritance occurs when a class has generalization relationships to more than one superclass. The subclass inherits all the attributes and operations of its superclasses.

There is really no good way to represent multiple inheritance in a relational database schema. You are usually better off if you can restructure your design to avoid multiple inheritance, either through the use of interface types or through factoring the elements into separate but conceptually redundant classes. The "elegance" of multiple inheritance forces you to pay too high a price in the semantics of the relational table and its foreign keys [Date and Darwen 1998, pp. 299—315].

As an example of a poor transformation and its problems, consider the approach of building a subclass table that includes the primary keys of two superclasses. As an example, what if you decided to have a Digital Photograph be both a Document and an Image (Figure 11-12)?

Figure 11-12: Multiple Generalizations

You might create the Digital Photograph table this way: CREATE TABLE DigitalPhotograph (

DocumentID INTEGER NOT NULL REFRENCES Document, ImageID INTEGER NOT NULL REFERENCES Image, Photograph BIT VARYING (65536))

Now you need to choose a primary key. Do you choose DocumentID or ImageID? Both are unique identifiers, but you can only have one primary key to use as a foreign key in other tables, such as in the Driver License example in the section on "Roles, Multiplicity, and Foreign Key Columns." As well, you want the primary key to represent the generalization relationship, but since you can only have one, you can't.

When you get into some of the standard conundrums of representing multiple inheritance, such as the diamond- shaped hierarchy in which the two superclasses refer in turn to common superclass, you get into even more trouble (Figure 11-13). You could potentially have two separate objects in the common superclass with different primary keys. Which primary key do you use in the subclass row? If you use the spreading approach, do you create two attributes for each one in the common superclass? Worse and worse.

In C++ and other programming languages, the language designers have built various contorted solutions to this problem, such as virtual base classes in C++. You don't have that option in the relational schema and its standard language SQL-92.

Figure 11-13: Multiple Generalizations with a Common Superclass The Digital photograph is better done as an interface realization: CREATE TABLE DigitalPhotograph (

DigitalPhotographID INTEGER PRIMARY KEY,

DocumentID INTEGER NOT NULL REFRENCES Document, ImageID INTEGER NOT NULL REFRENCES Image,

Photograph BIT VARYING (65536))

The object (the row) stands alone with a primary key unique to the Digital Photograph table. There are references to the document and image, and the table conceptually inherits the interface behavior and behaves as a realized document and image. Semantically, this is comprehensible, if strained, in the context of a relational schema. Alternatively, you can recast the relationship between the classes as an association instead of a generalization or realization. From the OO design perspective, the main difference is that you abjure the possibility of treating the photograph itself as a document or image. Instead, you "get" a document or image related to the photograph. The Document and Image classes turn encapsulate a photograph (where that applies) and the Digital Photograph class in turn encapsulates a document and an image. Since all these classes directly relate, you must create a ternary association rather than a binary one; see the later section on "Ternary and Higher-Cardinality Associations" for details on transforming ternary associations. The results in this case look like this:

CREATE TABLE DigitalPhotograph (

DigitalPhotographID INTEGER PRIMARY KEY, Photograph BIT VARYING (65536))

CREATE TABLE PhotoUsedAs ( DigitalPhotographID INTEGER,

ImageID INTEGER NOT NULL REFERENCES Image,

CONSTRAINT PhotoUsedAs_PK (DigitalPhotographID, DocumentID, ImageID))

The obvious difference is that you now have two tables instead of one, with the Photo Used As table corresponding to the ternary association. That table has as its primary key the three primary keys of the participating tables. This constraint implies the restriction that there can be only one combination of any particular photo, document, and image in the database. You may also need a multiplicity constraint if there can be only one document and image associated with the photo. That means a UNIQUE constraint on the DigitalPhotographID. Since this implies a functional dependency that is not on the primary key, you may want to normalize this back to the single table. Make DigitalPhotographID the primary key to get it into fourth normal form.

Note

This seems to be a rare case where standard OO design techniques yield a schema that is not in fourth normal form. You can recognize these situations fairly easily and deal with them with standard normalization techniques, which you should always have in the back of your mind. See the later section on "Normalizing Relations" for details. Alternatively, this is probably yet another justification for avoiding multiple inheritance.