• No results found

CASE STUDY: SCHEMA REFINEMENT FOR THE STUDENT REGISTRATION SYSTEM

CASE

STUDY Having spent all that effort studying relational normalization theory, we can nowuse it to verify our design for the Student Registration System as outlined in

Section 5.7. The good news is that we did a pretty good job of converting the E-R diagram in Figure 5.14, page 114, into the relations in Figures 5.15 and 5.16, so most of the relations turn out to be in Boyce-Codd normal form. However, you did not struggle through this chapter in vain—read on!

To determine whether a schema is in a normal form we need to collect all FDs relevant to it. One source is thePRIMARY KEYand theUNIQUEconstraints. However, there might be additional dependencies that are not captured by these constraints or the E-R diagram. They can be uncovered only by careful examination of the schema and of the specifications of the application, a process that requires much care and concentration. If no new dependencies are found, all FDs in the schema are the primary and the candidate keys (or the FDs entailed by them), so the schema is in BCNF. If additional FDs are uncovered, we must check if the schema is in a desirable normal form and, if not, make appropriate changes.

In our case, we can verify that all relation schemas in Figure 5.16, except Class, are in BCNF, as they have no FDs that are not entailed by the keys. This verification is not particularly hard because these schemas have six or fewer attributes. It is harder in the case of Class, which has ten.

We illustrate the process using the Class schema. Along the way, we uncover a missing functional dependency and then normalize Class. First, let us list the key constraints specified in theCREATE TABLEstatement for that relation.

1. CrsCode SectionNo Semester Year→ ClassTime 2. CrsCode SectionNo Semester Year→ Textbook 3. CrsCode SectionNo Semester Year→ Enrollment 4. CrsCode SectionNo Semester Year→ MaxEnrollment 5. CrsCode SectionNo Semester Year→ ClassroomId 6. CrsCode SectionNo Semester Year→ InstructorId 7. Semester Year ClassTime InstructorId→ CrsCode 8. Semester Year ClassTime InstructorId→ Textbook 9. Semester Year ClassTime InstructorId→ SectionNo 10. Semester Year ClassTime InstructorId→ Enrollment 11. Semester Year ClassTime InstructorId→ MaxEnrollment 12. Semester Year ClassTime InstructorId→ ClassroomId 13. Semester Year ClassTime ClassroomId→ CrsCode 14. Semester Year ClassTime ClassroomId→ Textbook 15. Semester Year ClassTime ClassroomId→ SectionNo 16. Semester Year ClassTime ClassroomId→ Enrollment

8.12 Case Study: Schema Refinement for the Student Registration System 253

17. Semester Year ClassTime ClassroomId→ MaxEnrollment

18. Semester Year ClassTime ClassroomId→ InstructorId CASESTUDY

Verifying that additional dependencies hold in a large schema can be difficult: One has to consider every subset of the attributes of Class that is not a superkey and check if it functionally determines some other attribute. This “check” is not based on any concrete algorithm. The decision that a certain FD does or does not hold in a relation is strictly a matter of how the designer understands the semantics of the corresponding entity in the real-world enterprise that is being modeled by the database, and it is inherently error prone. However, research is being conducted to help with the problem. For instance, FDEXPERT [Ram 1995] is an expert system that helps database designers discover FDs using knowledge about typical enterprises and their design patterns.

Unfortunately, we do not have an expert system handy, so we do the analysis the hard way. Consider the following candidate FD:

ClassTime ClassroomId InstructorId → CrsCode

It is easy to see why this FD does not apply: Different courses can be taught by the same instructor in the same room at the same time—if all this happens in different semesters and years. Many other FDs can be rejected through a similar argument. However, since in Section 5.7 we assumed that at most one textbook can be used in any particular course, the following FD is an appropriate addition to the set of constraints previously specified for Class:

CsrCode Semester Year → Textbook (8.21)

Although the textbook used in a course can vary from semester to semester, if a certain course is offered in a particular semester and is split in several sections because of large enrollment, all sections use the same textbook.10

It is now easy to see the problem with the design of Class: The left-hand side of the above dependency is not a key, and Textbook does not belong to any key either. For these reasons, Class is not in 3NF. The 3NF synthesis algorithm on page 237 suggests that the situation can be rectified by splitting the original schema into the following pair:

Class1, with all the attributes of Class, except Textbook, and FDs 1, 3–7, 9–13, 15–18 (these numbers refer to the numbered list of FDs provided earlier.) Textbooks(CrsCode, Semester, Year, Textbook), with the single FD CrsCode Semester Year→ Textbook.

The resulting schemas are both in BCNF, since one can verify by direct inspection that all of their FDs are entailed by key constraints. The 3NF synthesis algorithm also guarantees that the above decomposition is both lossless and dependency preserving.

Let us now consider a more realistic situation in which classes can have more than one recommended textbook and all sections of the class in a particular

CASE

Related documents