Beyond just looking at short- or long-term goals, though, you really need to step back from your design and look at the big picture before you finally accept the choices you need to make. Work through the scenarios from your use cases and from the use cases in the systems you are merging into your schema to see how the schema actually supports them. As you do that, you'll emerge with a bigger and bigger picture of the merged schema as a whole rather than as a collection of parts. The synergy of the whole system and its interrelationships is the key to the final set of strategies for integrating schemas.
The first thing to look at is whether there are any missing relationships. If you want to do this systematically, you can create a spreadsheet matrix for each subsystem with all the subsystem classes along the top and down the side. Fill in the cells with a value if there is a relationship. The value itself can be "True," or you can use the relationship type (such as generalization or association), or you can use the multiplicity. Don't forget the recursive relationships along the diagonal of the matrix, where classes relate to themselves. You can leave the cells above the diagonal (or below, whichever you wish) blank if you think of the relationships as symmetric. If you use directed arrows, on the other hand, you need to represent the different directions of the relationship in different cells. Also, include transitive generalizations, where a class is a subclass of another indirectly at a lower level. Below this matrix, you can list the relationships to classes outside the subsystem, which you'll use later.
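The matrix described above can also be kept in code rather than in a spreadsheet. The following is only a minimal sketch, assuming symmetric relationships; the class names, relationship labels, and the helper names build_matrix and blank_cells are all hypothetical, chosen for illustration.

```python
from itertools import combinations_with_replacement

# Hypothetical subsystem: the classes and relationships are illustrative only.
classes = ["Person", "Address", "Organization"]
relationships = {
    ("Person", "Address"): "association 1..*",
    ("Person", "Organization"): "association 0..*",
    ("Organization", "Organization"): "association 0..1",  # recursive: on the diagonal
}


def build_matrix(classes, relationships):
    """Return matrix[a][b]: the relationship label for the pair, or None if blank."""
    matrix = {a: {b: None for b in classes} for a in classes}
    for (a, b), label in relationships.items():
        matrix[a][b] = label
        matrix[b][a] = label  # assumption: relationships are treated as symmetric
    return matrix


def blank_cells(classes, matrix):
    """List each unordered pair of classes, diagonal included, with no relationship."""
    return [
        (a, b)
        for a, b in combinations_with_replacement(classes, 2)
        if matrix[a][b] is None
    ]


matrix = build_matrix(classes, relationships)
missing = blank_cells(classes, matrix)
for a, b in missing:
    print(f"No relationship recorded between {a} and {b}")
```

Each pair in missing is a blank cell to review by hand. If you use directed relationships instead, you would drop the symmetric assignment and examine every ordered pair.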
Now look at the blank cells. First, consider each association that isn't there. Would it make sense in any potential application to associate the classes? Is the association meaningful? Is it useful enough to add some complexity to the data model? If you can answer "yes" to all these questions, create a new association.
Second, consider each generalization that isn't there. Would connecting the classes as superclass-subclass be meaningful? Again, is it worth adding the complexity? The comments on the various design choices in the previous section on "Business Rule Integration" show you some of the trade-offs here. If it makes sense, reorganize the class hierarchy to include the new class at the appropriate place in the hierarchy.
After you've gone through this exercise for each subsystem, step out another level and consider the relationships between subsystems. If it helps, draw a package diagram and annotate the package dependencies with the exact relationships you've listed at the bottom of your various spreadsheets. First, find any cycles. A cycle in the package diagram is where a dependency from a package leads to a package on which the first package is dependent, directly or indirectly.
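If the package diagram is large, you can look for cycles mechanically. The sketch below uses a standard depth-first search over a dependency map; the package names and the function name find_cycle are hypothetical, and the map itself is an assumed representation of the package diagram.

```python
def find_cycle(dependencies):
    """Return one dependency cycle as a list of package names, or None.

    dependencies maps each package to the packages it depends on directly.
    """
    UNVISITED, IN_PROGRESS, DONE = 0, 1, 2
    state = {}
    path = []

    def visit(pkg):
        state[pkg] = IN_PROGRESS
        path.append(pkg)
        for dep in dependencies.get(pkg, ()):
            dep_state = state.get(dep, UNVISITED)
            if dep_state == IN_PROGRESS:
                # dep is already on the current path, so the path loops back to it.
                return path[path.index(dep):] + [dep]
            if dep_state == UNVISITED:
                cycle = visit(dep)
                if cycle:
                    return cycle
        path.pop()
        state[pkg] = DONE
        return None

    for pkg in dependencies:
        if state.get(pkg, UNVISITED) == UNVISITED:
            cycle = visit(pkg)
            if cycle:
                return cycle
    return None


# Hypothetical package dependencies, for illustration only.
packages = {
    "Billing": ["Customer"],
    "Customer": ["Contact"],
    "Contact": ["Billing"],
    "Reporting": ["Billing"],
}
cycle = find_cycle(packages)
print(" -> ".join(cycle) if cycle else "No cycles")
```

The function reports only the first cycle it finds; after you break that cycle, run it again to look for others.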
Second, find any relationships that are composite aggregations (the black diamond that establishes a class as contained within another class). Without a really strong reason, you should not have such relationships between packages. Composites are tightly coupled, usually with overlapping object identity. Therefore, they belong in the same unit of cohesion in the system.
Third, find any generalization relationships between packages. Again, subclasses are tightly coupled to their superclasses, as an object contains an instance of both classes. There are, however, certain situations that justify putting a superclass in a different subsystem.
For example, you can reuse a packaged component such as a framework through "black box" reuse, which requires subclassing the component class to make it specific to your domain. That means by definition that the superclass is in a different package, as the "black box" nature of the situation means you can't modify the reused component package.
Another example: you can have class hierarchies with disjoint subtrees that relate to completely different parts of the system. The joint superclass may exist only for implementation or very general feature inheritance. This is the case, for example, in systems that have a single root object (CObject, ooObject, and so on). The hierarchical nature of UML is somewhat inhibiting in this regard. It might be better to treat packages as overlapping, with some classes being part of a generalization tree package as well as an association-related package. One represents the entire class hierarchy starting at an arbitrary root, while other packages represent the use of those classes. The more overlaps you have, though, the more difficult it is to understand everything that's going on because of the increasing complexity of relationships between packages.
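The second and third checks above (composite aggregations and generalizations that cross package boundaries) are easy to automate once you have the relationships in a list. This is a minimal sketch; the Relationship record, the cross_package function, and every class and package name are hypothetical. It only flags candidates, and you still have to judge whether a flagged generalization falls under one of the justified exceptions, such as "black box" reuse of a framework class.

```python
from typing import NamedTuple


class Relationship(NamedTuple):
    kind: str    # "composition", "generalization", or "association"
    source: str  # the whole, the subclass, or one end of an association
    target: str  # the part, the superclass, or the other end


def cross_package(relationships, package_of, kinds=("composition", "generalization")):
    """Return the tightly coupled relationships whose two ends are in different packages."""
    return [
        r
        for r in relationships
        if r.kind in kinds and package_of[r.source] != package_of[r.target]
    ]


# Hypothetical model, for illustration only.
package_of = {
    "Order": "Sales",
    "OrderLine": "Sales",
    "Invoice": "Billing",
    "InvoiceLine": "Sales",
    "PersistentObject": "Framework",
}
relationships = [
    Relationship("composition", "Order", "OrderLine"),
    Relationship("composition", "Invoice", "InvoiceLine"),
    Relationship("generalization", "Invoice", "PersistentObject"),
    Relationship("association", "Invoice", "Order"),
]

flagged = cross_package(relationships, package_of)
for r in flagged:
    print(f"{r.kind}: {r.source} ({package_of[r.source]}) -> {r.target} ({package_of[r.target]})")
```

In this made-up model, the Invoice composite is a candidate for moving InvoiceLine into the same package, while the generalization to a framework root class is the kind of case you would probably accept.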
Fourth, and finally, look at the number of dependencies between each package and the other parts of the system. The objective in decomposing your system into subsystems is to uncouple the subsystem from the rest of the system to the highest possible degree. The fewer dependencies you have, the better. On the other hand, each dependency represents a form of reuse. If a subsystem is highly reusable, it almost certainly means that more than one other system will depend on it for its features. To understand the trade-offs, you should measure the dependencies with the coupling and reuse potential metrics from Chapter 9. If a subsystem has a low reuse potential, you should have a relatively low coupling metric for each dependency and a low number of dependencies. The higher the reuse potential, the more dependencies the package can support. On the other side, a package that depends on more than just a few other packages should not have a particularly high reuse potential because of its strongly coupled nature. If you find this combination, go back and understand why you have rated the reuse potential so high. Should the entire set of subsystems be part of a larger reusable subsystem (a "framework"), for example?
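As a rough first pass before applying the Chapter 9 metrics, you can simply count dependencies in each direction. The sketch below is not those metrics; it is a stand-in under stated assumptions. The reuse ratings, the thresholds max_outgoing and high_reuse, the function names, and the package names are all hypothetical values you would replace with your own.

```python
from collections import Counter


def dependency_counts(dependencies):
    """Return {package: (outgoing, incoming)} counts of direct dependencies."""
    outgoing = {pkg: len(set(deps)) for pkg, deps in dependencies.items()}
    incoming = Counter(dep for deps in dependencies.values() for dep in set(deps))
    names = sorted(set(outgoing) | set(incoming))
    return {pkg: (outgoing.get(pkg, 0), incoming.get(pkg, 0)) for pkg in names}


def suspicious(counts, reuse_potential, max_outgoing=2, high_reuse=0.7):
    """Packages rated highly reusable that nonetheless depend on many other packages.

    The thresholds are arbitrary illustrations, not recommended values.
    """
    return [
        pkg
        for pkg, (out, _incoming) in counts.items()
        if out > max_outgoing and reuse_potential.get(pkg, 0.0) >= high_reuse
    ]


# Hypothetical dependencies and reuse ratings, for illustration only.
dependencies = {
    "Utilities": [],
    "Customer": ["Utilities"],
    "Tax": ["Utilities"],
    "Billing": ["Customer", "Utilities", "Tax"],
}
reuse_potential = {"Utilities": 0.9, "Billing": 0.8, "Customer": 0.4, "Tax": 0.3}

counts = dependency_counts(dependencies)
to_review = suspicious(counts, reuse_potential)
print(counts)
print("Review the reuse rating of:", to_review)
```

Here the made-up Billing package depends on three others yet carries a high reuse rating, which is the combination the text tells you to go back and question.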
Note
Another method for evaluating the cohesion of your subsystems is to use the logical horizon of classes. The logical horizon of a class is the transitive closure of the association and generalization relationships from the class that terminate in a multiplicity of 1..1 or 0..1 [Blaha and Premerlani 1998, pp. 62–64; Feldman and Miller 1986]. Walk all the paths from the class to other classes. If an association to the other class has a multiplicity of 1..1 or 0..1, that class is in the logical horizon, but the path ends there. You can walk up a generalization relationship to the class's superclass, but not down again to its siblings. The classes with the largest horizons are usually at the center of a subsystem that includes most of the logical horizon of that class. As with most mechanistic methods, however, using a logical horizon approach can be misleading and does not guarantee a highly cohesive subsystem, particularly if the associations are incomplete or wrong (never happens, right?).
Because you are changing the subsystem relationships, you may want to make one more pass through the subsystem analysis to see if anything has changed because of moving classes between subsystems.
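The logical horizon from the note can be computed with a simple graph walk. The note's wording about where a path ends can be read more than one way, so treat this as a sketch under one stated assumption: it takes the "transitive closure" reading, continuing through every association end with a multiplicity of 1..1 or 0..1 and up every generalization, and stopping only where the far end is to-many. The function name, the two input maps, and the class names are hypothetical.

```python
def logical_horizon(start, to_one, superclass):
    """Return the set of classes in the logical horizon of start.

    to_one maps a class to the classes it reaches through association ends
    with multiplicity 1..1 or 0..1. superclass maps a class to its direct
    superclass. To-many association ends are simply left out of to_one.
    Assumption: the walk continues through each to-one end (transitive closure).
    """
    horizon = set()
    stack = [start]
    while stack:
        cls = stack.pop()
        neighbors = list(to_one.get(cls, ()))
        if cls in superclass:
            neighbors.append(superclass[cls])  # walk up, never down to siblings
        for nxt in neighbors:
            if nxt != start and nxt not in horizon:
                horizon.add(nxt)
                stack.append(nxt)
    return horizon


# Hypothetical model, for illustration only.
to_one = {
    "OrderLine": ["Order", "Product"],
    "Order": ["Customer"],
}
superclass = {"Customer": "Party", "Supplier": "Party"}

for cls in ("OrderLine", "Order", "Customer"):
    print(cls, sorted(logical_horizon(cls, to_one, superclass)))
```

In this made-up model, OrderLine has the largest horizon, which would make it a candidate center for a subsystem. Supplier never appears in any horizon computed here because the walk goes up to Party but not back down to its siblings.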
Summary
Not everyone chooses their parents wisely. Developing data models around legacy systems brings into sharp focus the culture of the organization that created those systems. This culture has a large impact on your legacy development and maintenance efforts through several mechanisms:
Norms
Values, attitudes, and beliefs
Rituals
Folklore
Shared language
Given these elements of your organizational culture, you may or may not be able to leverage existing data and technology in your legacy system to provide the basis for your system development. Sometimes it's better to start fresh; sometimes it's better to renovate the old house. Part of your job is to develop the scope of the new system based on both the system requirements and the system culture. On the other hand, if you renovate, you need to pay a lot more attention to culture to understand both the scope of the old system and the scope of your changes to it.
The next three chapters (the rest of the book) show you how to transform your data model into one of the three schemas: relational, object-relational, or object-oriented. Each chapter follows a similar structure, showing you the transformation of the following elements:
Structures
Relationships
Business rules
Design guidelines
Data definition languages
Finally, the last stage of building your data model is the integration of all the different views into a single shared conceptual model. You generalize concepts brought together into class hierarchies. You integrate business rules and resolve any logical conflicts between them. Lastly, you step back and look at the big picture to optimize the total view of the system.
You are now at the point where the rubber meets the road. Unfortunately, you are in a computer game where the road changes its character depending on where you are and what tools you are carrying. Moving from the data model to the schema is a tricky process. You must find your way through the maze of culture, you must understand the tools available, and you must understand how to leverage what you are carrying with you. Good luck, and do watch out for the human sacrifice in Chapter 12.
Chapter 11: Designing a Relational Database Schema
Thou shalt prepare a table before me against them that trouble me: thou hast anointed my head with oil, and my cup shall be full. But thy loving-kindness and mercy shall follow me all the days of my life: and I will dwell in the house of the Lord for ever.
Church of England Prayer Book, 23:4
Overview
Transforming a data model into a database schema is not easy, but it's not going to put another person on the moon either. The techniques for turning an ER model into a relational model are well understood [Teorey 1999, Chapter 4]. Transforming a UML data model into a relational database uses these techniques and adds some more to take some of the expanded features of such models into account. The first three sections of this chapter go into detail on the structures, relationships, and constraints of your UML data model and their transformation into the relational schema. The next section of the chapter discusses the concept of data normalization. You really can't talk about relational databases without talking about normalization. If you approach it from the OO perspective, though, you can eliminate much of its complexity and detail. It is still important to understand how your database structure meets the "normal" requirements of the relational world, but it's much easier to get there from an OO data model.
The last section summarizes the sequence of steps you take to transform the OO data model into an SQL-92 schema. It also shows you some of the specialized things you can do by using the nonstandard elements of a database manager, in this case Oracle7.