• No results found

EXERCISES

157 Explanation 6

• The explanation of the first line is as in Example 6.4. The resulting relation is the first operand of an invocation of “”.

RELATION { N INTEGER } { } is the second operand, denoting the empty relation of the same type as as the first operand. Of course, to be a subset of an empty relation is the same as to be (i.e., be equal to) that empty relation, so the invocation of “” yields TRUE if

and only if the first operand is in fact that empty relation.

Obviously, wherever we can use “” we could instead use “”, from the equivalence of r1  r2 and

r2  r2. Moreover, the given explanation of Example 6.6 clearly shows that “=” comparison can be used to express every constraint that can be expressed in the form IS_EMPTY(r) (which could also be written as r = r WHERE FALSE). But the availability of “=” comparisons on values of all types, including relations in particular, is surely required for relational completeness. Under that assumption we could argue that relational completeness is all that is theoretically needed for complete support for constraints. Now, I claimed that every constraint can be expressed as a single comparison of the form r1  r2. You might be wondering how equality of relations can be expressed using a single invocation of “”. Clearly,

r1 = r2 is equivalent to r1  r2 AND r2  r1, but that expression is an invocation of AND, not “”. Example 6.7 shows one way of doing it with a single invocation of “”.

Example 6.7: Relation equality using a single invocation of “”

((r1 MINUS r2)UNION(r2 MINUS r1)){}⊆RELATION { } { }

Explanation 6.7

• Recall: r1 MINUS r2 is equivalent to r1 NOT MATCHING r2 but requires r1 and r2 to be relations of the same type.

( ( r1 MINUS r2 ) UNION ( r2 MINUS r1 ) ) yields the relation whose body consists of every tuple of r1 that is not also a tuple of r2 and every tuple of r2 that is not also a tuple of r1. This is sometimes called the symmetric difference of r1 and r2 (and a relational language might well provide a dyadic operator as a shorthand for expressing it).

Note that the symmetric difference of sets A and B is the empty set if and only if A=B (i.e., they are one and the same set).

• Noting that a projection of relation r is empty if and only if r itself is empty, we can test the symmetric difference for being empty by taking its projection over no attributes and testing that projection for being a subset of the empty relation of degree zero (recall that in TutorialD you can use the name TABLE_DUM for this relation if you prefer).

Download free eBooks at bookboon.com

158

To prove that every expression of the form IS_EMPTY(r)is equivalent to some expression of the form

r1  r2 I merely note that IS_EMPTY(r) is equivalent to r ( r WHERE FALSE ). And as r1  r2

is equivalent to IS_EMPTY(r1 MINUS r2) it is clear that a language can support either IS_EMPTY or just one of our three relation comparison operators with equal expressive power. That gives four choices, so far, for the operator that allows us to express any theoretically expressible constraint as a single invocation of that operator on one or two relations. There are more!

Use of Truth-Valued Aggregate Operators

Our relvar EXAM_MARK really ought to be subject to a constraint requiring every value for the Mark attribute to lie in the range 0 to 100. That is easy enough to express using IS_EMPTY, as Example 6.8 shows, but many people would prefer to say that every mark shall lie within the required range instead of saying that no mark shall lie outside it.

Example 6.8: Restricting exam marks to between 0 and 100

CONSTRAINT Marks_between_0_and_100

IS_EMPTY ( EXAM_MARK WHERE Mark  0 OR Mark 100 ) ;

In Chapter 5 you met the aggregate operator AND, named after its own basis operator. AND(r,c), where

r is a relation and c is a condition, is true if and only if every tuple of r satisfies c. (In Rel ALL is a synonym for aggregate AND. You might find ALL(r,c) more intuitive than AND(r,c).) Use of aggregate AND allows many constraints to be expressed more succinctly and more clearly than use of any of the other methods we have met so far, as Example 6.9 shows in the case of our constraint on exam marks. Note, however, that this example depends on an enhancement in Version 2 of Tutorial D, as described in Chapter 5, Section 5.3, Aggregate Operators.

Example 6.9: Restricting exam marks to between 0 and 100 using aggregate AND as supported in Version 2 of Tutorial D

CONSTRAINT Marks_between_0_and_100_using_AND

AND ( EXAM_MARK, Mark  0 AND Mark  100 ) ;

To show that aggregate AND is in fact yet another candidate for our single additional operator, and that in fact every constraint can be expressed as an invocation of that operator, I note the equivalence of IS_EMPTY(r) and AND(r, FALSE). No tuple satisfies the condition FALSE, so AND(r, FALSE) is false whenever r contains at least one tuple and is true only when r is empty.

Download free eBooks at bookboon.com

Click on the ad to read more

159

We have aggregate OR too, so, recalling from Chapter 2 that “for all x, p(x)” is equivalent to “there does not exist x such that NOT(p(x))”, we can note that AND(r, c) is equivalent to NOT(OR(r, NOT(c))), from which it follows that AND(r, FALSE) is equivalent to NOT(OR(r, TRUE)). Finally, as OR(r, TRUE) is false only when r is empty, we can note that OR(r, TRUE) is equivalent to NOT(IS_EMPTY(r)). Faced with such a plethora of choice for general methods of expressing constraints, Tutorial D does not arbitrate in favour of any of the noted candidates, allowing the user to choose freely from among them whichever is deemed most suitable for each particular purpose. The availability of logical connectives gives the user the further freedom to decide how best to arrange the database constraint into declared constraints, individually named and formulated. Sadly, we cannot say the same for the commercially available DBMSs at the time of writing (2014), for we are not aware of any widely available SQL implementation that supports any of the noted candidates for use in constraints: they don’t support the international standard’s CREATE ASSERTION statement, which would be SQL’s counterpart of Tutorial D’s CONSTRAINT statement, and they don’t permit table expressions to appear inside SQL’s so-called “table constraints”. Typically, the SQL user is restricted to certain special-purpose shorthands of the kinds described in the next section.

6JGFGUKIPQHGEQHTKGPFN[OCTKPGRQYGTCPFRTQRWNUKQPUQNWVKQPUKUETWEKCNHQT/#0&KGUGN6WTDQ 2QYGTEQORGVGPEKGUCTGQHHGTGFYKVJVJGYQTNFoUNCTIGUVGPIKPGRTQITCOOGsJCXKPIQWVRWVUURCPPKPI HTQOVQM9RGTGPIKPG)GVWRHTQPV

(KPFQWVOQTGCVYYYOCPFKGUGNVWTDQEQO

.QYURGGF'PIKPGU/GFKWOURGGF'PIKPGU6WTDQEJCTIGTU2TQRGNNGTU2TQRWNUKQP2CEMCIGU2TKOG5GTX

The Wake

Download free eBooks at bookboon.com

160

6.4

Useful Shorthands for Expressing Constraints

In Chapter 5 I showed how a relational database language can be extended by defining new relational operators—“shorthands”—in terms of the existing ones. If the existing language is relationally complete, then such extensions do not increase the language’s expressive power—there is no need for that—but, judiciously chosen, they do make some problems easier to solve by providing shorthands that are not only convenient but, by raising the level of abstraction, might also be easier to understand than the longhands on which they are defined. In Chapter 5 I illustrated this point by showing you the handful of such operators that have been “judiciously chosen” for Tutorial D, these having been proposed by various writers over the years. Unfortunately, very little in the way of useful shorthands has been proposed for use in constraints; and what little there is is subject to a certain amount of controversy. Yet the requirement for shorthands seems to be compelling, not just for the convenience of users but also for performance, as I will now explain.

Suppose that we require every constraint to be expressed using an expression of the form IS_EMPTY(r). Then consider a simple constraint such as the one to make sure every exam mark is in the range of 0 to 100 and assume it is expressed as shown in either of Examples 6.8 and 6.9. Suppose that a certain update statement is used to add a single tuple to EXAM_MARK. Whether IS_EMPTY or aggregate AND is chosen for the constraint declaration, a naïve evaluation would involve the system in examining each existing EXAM_MARK tuple as well as the one being added. But the existing tuples are all known to satisfy the condition Mark  0 AND Mark  100, for if one of them didn’t the database would have been

visibly inconsistent at the previous statement boundary. If the system could somehow work out that it is sufficient just to check incoming tuples, then simple update operations would be executed very much more quickly. But such optimizations involve sophisticated expression analysis. While we rightly expect industrial-strength DBMSs to attempt such optimizations, their degree of success is likely to be limited in practice. When we can identify a certain class of constraints that lend themselves to more efficient methods of evaluation, one way of guaranteeing that the system will adopt those more efficient methods is to provide an alternative way of expressing the constraint, applicable only to constraints of that class. If that alternative method is easier for the user to write, and perhaps clearer for the reader too, then the addition to the language can be justified even though it is theoretically redundant.

I will now describe some of the special classes of constraint that have been identified and the shorthands typically used for expressing them, but please note carefully that with just one exception (key constraints) these shorthands are not available in Tutorial D. For one thing, they are somewhat controversial but, more importantly, many people have been beguiled by the impoverished state of the existing commercial technology into believing that the term “constraint”, as used in the present context, applies only to what can be expressed using the available shorthands.

Download free eBooks at bookboon.com

161