The ER model is generally sufficient for "traditional" database applications. But more recent applications of database technology (e.g., CAD/CAM, telecommunications, images/graphics, multimedia, data mining/warehousing, geographic information systems) call for a richer model.
The EER model extends the ER model by, in part, adding the concept of specialization (and its inverse, generalization), which is analogous to the same-named concept (also called extension or subclassing) from object-oriented design/programming. An entity type may be recognized as having one or more subclasses, with respect to some criterion.
This represents a specialization of the entity type. A subclass inherits the features of its parent (the entity type), and can be given its own "local" features. By features we mean not only attributes but also relationships. (This is entirely analogous to what we find in OO programming languages, where a subclass inherits its parent's features (instance variables and methods) and can be defined to have new ones specific to it.)
To illustrate, using E&N's example in Figure 4.1, suppose that we have an entity type EMPLOYEE with attributes Name, SSN, BirthDate, and Address. Specializing this entity type with respect to job type, we might identify SECRETARY, TECHNICIAN, and ENGINEER as subclasses, with attributes TypingSpeed, TGrade, and EngType, respectively. The idea is that (at any moment in time) every member of SECRETARY's entity set is also a member of EMPLOYEE's entity set, and similarly for members of the entity sets of TECHNICIAN and ENGINEER. Hence, a SECRETARY entity, also being an EMPLOYEE entity, can participate in relationship types involving EMPLOYEE (such as WORKS_FOR, although no such relationship type is shown in the figure).
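The OO analogy can be made concrete with a minimal Python sketch. The class and attribute names follow the EMPLOYEE example; the `works_for` function, the department name, and the sample values are hypothetical and stand in for a WORKS_FOR relationship type that the figure does not show.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class Employee:
    # Attributes of the superclass EMPLOYEE.
    name: str
    ssn: str
    birth_date: str
    address: str


@dataclass
class Secretary(Employee):
    typing_speed: int  # "local" attribute of SECRETARY


@dataclass
class Technician(Employee):
    t_grade: str  # "local" attribute of TECHNICIAN


@dataclass
class Engineer(Employee):
    eng_type: str  # "local" attribute of ENGINEER


def works_for(employee: Employee, department: str) -> Tuple[str, str]:
    """Hypothetical WORKS_FOR relationship instance: (SSN, department)."""
    return (employee.ssn, department)


# A SECRETARY entity carries the inherited attributes plus its own.
sec = Secretary("Ann Lee", "123-45-6789", "1980-01-15", "12 Elm St",
                typing_speed=80)

# Because every Secretary is also an Employee, it can participate in a
# relationship that is declared on Employee.
pair = works_for(sec, "Research")
```

Here `isinstance(sec, Employee)` holds, mirroring the rule that every member of SECRETARY's entity set is also a member of EMPLOYEE's entity set.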
Specialization is the process of defining a set of subclasses of an entity type, usually on the basis of some distinguishing characteristics (or "dimensions") of the entities in the entity type. Interestingly, you may introduce multiple specializations of a single class, each based upon a different distinguishing characteristic of (the entities of) that class.
For example, in Figure 4.1, EMPLOYEE is specialized according to job type (resulting in subclasses SECRETARY, TECHNICIAN, and ENGINEER) and also on the basis of method of pay (resulting in subclasses SALARIED_EMPLOYEE and HOURLY_EMPLOYEE). This results in a situation (not typically seen in OO programming) in which a single entity can be an instance of two sibling subclasses (or, more generally, of two subclasses neither of which is an ancestor of the other).
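One way to picture two specializations of the same class is to treat each entity set as a plain set of keys. The sketch below is illustrative only: the SSN values and the assignment of employees to subclasses are made up, and only the class names come from the example.

```python
# Entity set of the superclass EMPLOYEE, identified by (made-up) SSNs.
employee = {"111", "222", "333", "444"}

# Specialization 1: by job type.
job_type = {
    "SECRETARY": {"111"},
    "TECHNICIAN": {"222"},
    "ENGINEER": {"333"},
}

# Specialization 2: by method of pay.
pay_method = {
    "SALARIED_EMPLOYEE": {"111", "333"},
    "HOURLY_EMPLOYEE": {"222"},
}


def subclasses_of(ssn):
    """Return the names of all subclasses whose entity set contains ssn."""
    found = []
    for specialization in (job_type, pay_method):
        for name, members in specialization.items():
            if ssn in members:
                found.append(name)
    return found


# Employee 333 belongs to two subclasses at once, one from each
# specialization; neither subclass is an ancestor of the other.
both = subclasses_of("333")

# Every subclass entity set is a subset of EMPLOYEE's entity set.
all_subsets = all(
    members <= employee
    for specialization in (job_type, pay_method)
    for members in specialization.values()
)
```

In this data, employee 444 belongs to no subclass in either specialization, which is allowed unless a constraint says otherwise.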
A reasonable question to ask is: what is the purpose of extending the ER model to include specialization? A few answers:
• So as not to carry around null-valued attributes that don't apply
• To define relationships in which only subclass entities may participate (e.g., trade union membership is applicable only to hourly employees)
The term generalization refers to the inverse of specialization. That is, generalization refers to recognizing the need/benefit of introducing a superclass to one or more classes that have already been postulated. Hence, generalization builds a class hierarchy in a bottom-up manner whereas specialization builds it in a top-down manner.
Constraints and Characteristics of Specialization/Generalization
Constraints on Specialization and Generalization:
A subclass D of a class C is said to be predicate-defined (or condition-defined) if, for any member of the entity set of C, we can determine whether or not it is also a member of D by examining the values of its attributes (and seeing whether those values satisfy the so-called defining predicate). If all subclasses in a given specialization are predicate-defined, with their defining predicates testing the same attribute of C, then the specialization is said to be attribute-defined.
For example, Figure 4.4 shows an EER diagram indicating that the value of an EMPLOYEE entity's Job_Type attribute determines in which subclass (if any) that entity has membership.
When there is no algorithm for determining membership in a subclass, we say that the subclass is user-defined. For such subclasses, membership of entities cannot be decided automatically and hence must be specified by a user.
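A predicate-defined subclass can be sketched as a Boolean function over an entity's attributes. In the sketch below, the attribute name Job_Type comes from the example, while the specific attribute values ("Secretary", "Technician", "Engineer", "Manager") and the dictionary representation of an entity are assumptions made for illustration.

```python
# Defining predicates, one per subclass. All of them test the same
# attribute (Job_Type), so the specialization is attribute-defined.
DEFINING_PREDICATES = {
    "SECRETARY": lambda e: e["Job_Type"] == "Secretary",
    "TECHNICIAN": lambda e: e["Job_Type"] == "Technician",
    "ENGINEER": lambda e: e["Job_Type"] == "Engineer",
}


def subclass_membership(entity):
    """Return the subclasses whose defining predicate the entity satisfies."""
    return [name for name, pred in DEFINING_PREDICATES.items() if pred(entity)]


engineer = {"Name": "Bo", "Job_Type": "Engineer"}
manager = {"Name": "Cy", "Job_Type": "Manager"}

in_engineer = subclass_membership(engineer)
in_none = subclass_membership(manager)  # belongs to no subclass
```

For a user-defined subclass no such function exists; membership would have to be recorded explicitly for each entity, for instance as a stored set of keys.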
Disjointness vs. overlapping constraint:
Let C be a class and S1, ..., Sk be the (immediate) subclasses of C arising from some specialization thereof. For this specialization to satisfy the disjointness constraint requires that no instance of C be an instance of more than one of the Si's. In other words, for every i and j with 1 <= i < j <= k, the intersection of the entity sets of subclasses Si and Sj must be empty.
If a specialization is not defined to satisfy the disjointness constraint, we say that it is overlapping, meaning that there is no restriction prohibiting an entity from being a member of two or more subclasses.
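The pairwise-intersection condition translates directly into a check over entity sets. This is a minimal sketch in which entity sets are modeled as Python sets of made-up keys; the function name `is_disjoint` is hypothetical.

```python
from itertools import combinations


def is_disjoint(subclass_sets):
    """True if every pair Si, Sj (i < j) has an empty intersection."""
    return all(a.isdisjoint(b) for a, b in combinations(subclass_sets, 2))


# Job-type subclasses with no shared members: disjoint.
disjoint_case = is_disjoint([{"111"}, {"222"}, {"333"}])

# Entity "333" appears in two subclasses: overlapping.
overlapping_case = is_disjoint([{"111", "333"}, {"222"}, {"333"}])
```

Note that a check like this only reports whether the current entity sets happen to be disjoint; the constraint itself is a statement about every legal database state.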
Completeness Constraint: partial vs. total
The total specialization constraint requires that every entity of a superclass be a member of at least one of its (immediate) subclasses. A partial constraint is simply the absence of the total constraint (and hence is no constraint at all).
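Totality can be checked in the same set-based style: the superclass entity set must be covered by the union of the subclass entity sets. The function name `is_total` and the sample keys are assumptions for illustration.

```python
def is_total(superclass_set, subclass_sets):
    """True if every superclass entity is in at least one subclass."""
    covered = set().union(*subclass_sets)
    return superclass_set <= covered


employees = {"111", "222", "333"}

# Every employee is salaried or hourly: total.
total_case = is_total(employees, [{"111", "333"}, {"222"}])

# Employee "333" is in no subclass: partial.
partial_case = is_total(employees, [{"111"}, {"222"}])
```

Because this test says nothing about whether the subclass sets intersect, it can be combined freely with a disjointness test.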
Note that the disjointness and completeness constraints are independent of one another, giving rise to four possible combinations:
• disjoint, total
• disjoint, partial
• overlapping, total
• overlapping, partial
Hierarchies vs. Lattices:
In a specialization hierarchy (i.e., tree), each subclass has exactly one (immediate) superclass. In a specialization lattice, a subclass may have more than one (immediate) superclass. (See Figures 4.6 and 4.7.) Having more than one superclass gives rise to the concept of multiple inheritance.
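Python supports multiple inheritance, so a small lattice can be sketched directly. The classes Manager and EngineeringManager and all attribute values below are illustrative assumptions, not a transcription of the figures.

```python
class Employee:
    def __init__(self, name, **kwargs):
        super().__init__(**kwargs)
        self.name = name


class Engineer(Employee):
    def __init__(self, eng_type, **kwargs):
        super().__init__(**kwargs)
        self.eng_type = eng_type


class Manager(Employee):
    def __init__(self, department, **kwargs):
        super().__init__(**kwargs)
        self.department = department


class EngineeringManager(Engineer, Manager):
    """A subclass with two immediate superclasses (a lattice, not a tree)."""


# Keyword arguments are passed along the chain of super().__init__ calls,
# so each class picks up the attribute it defines.
em = EngineeringManager(name="Dana", eng_type="Civil", department="Research")

# The linearized ancestor list; Employee appears only once even though
# it is reachable through both Engineer and Manager.
ancestors = [cls.__name__ for cls in EngineeringManager.__mro__]
```

The instance `em` ends up with the attributes of both immediate superclasses as well as those of the shared ancestor Employee.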
Specialization vs. Generalization:
These are analogous to top-down refinement and bottom-up synthesis, respectively. In real life, most projects are developed using a combination of these approaches.