Even if you could create domains in RDBMS schemas, you would still need a way to relate your UML types to your domains and data types. Before transforming classes, you should have a table in some form that shows how to map UML type expressions into the SQL of your target DBMS. If you have multiple targets, you should have multiple tables. For optimal reusability, you should package your UML type package diagram and the table together in a reuse repository for use in multiple data models and schema designs.
The transformations may be simple ("if UML type is string, then SQL type is VARCHAR(254)," for example), but usually you will find various ifs, ands, or buts that make your job more complex. For example, does a string translate into a VARCHAR(254), or can you specify a shorter size? Do you sometimes want a CHAR instead of a VARCHAR? Enumerated types pose a real problem. If there is a way to define domains, you can encode the CHECK constraint that corresponds to the list of values into the CREATE DOMAIN statement:
CREATE DOMAIN Boolean AS CHAR(1) NOT NULL DEFAULT 'T' CONSTRAINT Bool_Constraint CHECK (Boolean IN ('T', 'F'));
If you don't have domains, and mostly you don't, you must put the CHECK constraint (and probably the DEFAULT and NOT NULL clauses as well) into each attribute that uses the type. Alternatively, you can create a type table and put the values into it, then create a foreign key constraint to that table. You would definitely do that if you thought you might add new values to the enumeration over time. For a Boolean type, that isn't true; for other enumerations, it is often true.
You may also want to specify some display values to use in your programs. For example, if your user interface presents a field that takes a value of an enumerated type, you might want to make it a dropdown list. It can then display fullword natural language values such as "True" and "False" for a Boolean type. You can encode this into a type table, a single table for each type with the following structure:
CREATE TABLE <Name> Type ( Value <Data Type> PRIMARY KEY, DisplayString VARCHAR2(50) NOT NULL, Description VARCHAR2(2000))
In this parameterized CREATE TABLE statement, <Name> is your UML type name and <Data Type> is the SQL data type that corresponds to your enumerated values. The type is usually either CHAR(n) for character code representations or NUMBER for numeric code representations. You insert one row of data for each enumeration value. For example, the possible states of the legal status of a criminal organization include Legally Defined, On Trial, Alleged, and Unknown. The following Oracle7 SQL creates a type table for the UML LegalStatus Type: CREATE TABLE LegalStatusType (
Value CHAR(1) PRIMARY KEY,
DisplayString VARCHAR2(50) NOT NULL, Description VARCHAR2(2000));
INSERT INTO LegalStatusType (Value, DisplayString) VALUES ('L', 'Legally Defined');
INSERT INTO LegalStatusType (Value, DisplayString) VALUES ('T', 'On Trial');
VALUES ('A', 'Alleged');
INSERT INTO LegalStatusType (Value, DisplayString) VALUES ('U', 'Unknown');
CREATE TABLE LegalStatusTypeDefault( Value CHAR(1) PRIMARY KEY);
INSERT INTO LegalStatusTypeDefault (Value) VALUES ('U');
This series of SQL statements sets up the table, inserts a row for each enumeration value with an alphabetic code, and sets up a singleton table containing the default code. The advantage in this approach is that you can add to or modify the value set by inserting or updating rows in the type table. You can use the default table to modify the default value. An alternative way to indicate defaults is with a Boolean tag column in the type table, IsDefault or something like that. The singleton table is clearer in most cases and is easier to maintain from your application.
Note
Sometimes you see type tables in relational database designs that merge several types into a single table. This is a bad design idea for two reasons. First, the Value column must have the type of value that is correct for the enumeration. It may be NUMBER, it may be CHAR, or it may be a DATE. Your choice is either to use the wrong type by choosing one of these or to add a column for each type and make it null for enumerations that are not of that type. Neither of these is good design. The first approach is bad because it doesn't do the job. The second approach is bad because you introduce a complex constraint into the table. The second reason is that you introduce control coupling into your database unnecessarily. By adding a discriminator column, usually the type name, to distinguish just those rows that apply to a given type, you force applications to add control logic to their SQL retrieval.
Classes
A UML class contains attributes and operations. The transformation is easy, at least for the attributes: each class becomes a table in a relational database, and each attribute becomes a column in the table. For example, in the Person subsystem from Figure 11-1, you create a table for each of the classes in the diagram: Person, Address, Identification, ExpiringID, LawEnforcementID, and so on. For example, here is the transformation of the Person class and attributes into a standard CREATE TABLE statement:
CREATE TABLE Person (
PersonID INTEGER PRIMARY KEY,
Sex CHARACTER(2) NOT NULL CHECK (Sex IN ('M', 'F')), BirthDate DATE NOT NULL,
Height FLOAT, Weight FLOAT,
MaritalStatus CHARACTER(1) NULL CHECK (MaritalStatus IN ('S', 'M', 'D', 'W')), Comment VARCHAR(200))
The trick to producing CREATE TABLE statements is in transforming the attribute types into SQL data types and single-attribute constraints. See the following sections on "Attributes" and "Domains and Data Types" for details on these decisions. There is also the problem of the primary key attributes; see the section on "Constraints and Object Identity" for details on producing a PRIMARY KEY constraint.
Warning
A small worry is the length of your class name. Standard SQL-92 limits the length of table names to a maximum of 128 characters, including schema name qualifier, but you'll be lucky if your system gives you anything like that. All the systems I've used limit you to 31 or 32 characters (Oracle, Informix, Sybase, and SQL Server) or even 18 characters (DB2).
But what about the operations? There is nowhere in the CREATE TABLE statement for these. In most relational databases these days you have at least some possibilities for representing behavior as well as structure. See the following section on "Operations and Methods" for some suggestions.