RELVARS, RELATIONS, AND PREDICATES - Databases, Types, and the Relational Model: The Third Mani

Everything we have said about relvars and relations in this chapter so far is accurate, of course, but there is another way to think about these matters—not the way (we hasten to add) in which the database community usually does think about them, but we venture to suggest it should be. To be more specific, it is common to think of a relvar as if it were just a simple abstraction of a traditional computer file; but we think the ideas discussed in this section can lead to a deeper understanding of what relational databases are really all about.

Consider the suppliers-and-parts database once again. That database contains three relvars, and of course each of those relvars is supposed to represent some portion of reality in some way. In fact, we can make this statement more precise: Each of those relvars represents some predicate, and each of those predicates is, in essence, a generic statement about some portion of reality. Here is an example:

Supplier S# is under contract, is named SNAME, has status STATUS, and is located in city CITY.

This predicate is the intended interpretation, also known as the intension (note the spelling), for the suppliers relvar S.

So what exactly is a predicate?² In general, it is a function; like all functions, it takes a set of parameters and it returns a result when it is invoked. But a predicate in particular is a truth-valued function, meaning the result it returns when invoked is a truth value (TRUE or FALSE). In the case of the predicate just shown for relvar S, for example, the parameters are the attributes of the relvar heading, namely S#, SNAME, STATUS, and CITY, and they stand for values of the corresponding types (i.e., S#, NAME, INTEGER, and CHAR, respectively). When we invoke the function (or instantiate the predicate, as the logicians say), we substitute arguments for the parameters. Suppose we substitute the argument values S1, Smith, 20, and London, respectively. Then we obtain the following proposition:

Supplier S1 is under contract, is named Smith, has status 20, and is located in city London.

¹ Similar remarks apply to candidate keys, but here the practical benefits of providing a shorthand are overwhelming.

² Actually there is a slight lack of consensus on this question in the literature; not all logicians would agree with every

Now, a proposition in logic is, in general, a statement that is unconditionally either true or false; however, the particular propositions we are interested in here are supposed to be ones that evaluate to TRUE, a point we will return to in just a moment. In the example just shown, the proposition is indeed true—at least, so we believe—because a tuple corresponding to that proposition does currently appear in relvar S (see Fig. 2.1).
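The substitution step just described can be sketched in a few lines of Python. This is only an illustration, not part of the model itself: the class `Supplier`, its field names (`sno`, `sname`, `status`, `city`), and the function `instantiate` are hypothetical names chosen here to stand in for the heading of relvar S and for the act of instantiating its predicate.

```python
from typing import NamedTuple


class Supplier(NamedTuple):
    """One tuple of relvar S. The field names are assumptions standing in
    for the attributes S#, SNAME, STATUS, and CITY."""
    sno: str
    sname: str
    status: int
    city: str


def instantiate(t: Supplier) -> str:
    """Substitute the attribute values of t for the parameters of the
    suppliers predicate, yielding the text of a proposition."""
    return (
        f"Supplier {t.sno} is under contract, is named {t.sname}, "
        f"has status {t.status}, and is located in city {t.city}."
    )


proposition = instantiate(Supplier("S1", "Smith", 20, "London"))
print(proposition)
```

Note that the sketch produces only the wording of the proposition; whether that proposition is true depends on the state of the real world, which the database is taken by convention to reflect.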

More generally, let relvar R represent predicate P; then P is the relvar predicate for relvar R.¹ Moreover, let the current value of R be r; then each tuple t in r can be regarded as representing a certain proposition p, derived by invoking, or instantiating, P with the attribute values from t being substituted for the parameters of P. And (very important!) we assume by convention that each such proposition (that is, each proposition represented by some tuple in r) evaluates to TRUE. Thus, given the sample value for relvar S shown in Fig. 2.1, we assume the following propositions all evaluate to TRUE:

Supplier S1 is under contract, is named Smith, has status 20, and is located in city London.

Supplier S2 is under contract, is named Jones, has status 10, and is located in city Paris.

Supplier S3 is under contract, is named Blake, has status 30, and is located in city Paris.

And so on. Furthermore, we subscribe, noncontroversially, to the Closed World Assumption [121], which says that if a given tuple plausibly could appear in the relvar at some time but in fact does not, then the corresponding proposition is understood by convention to be one that evaluates to FALSE at the time in question. For example, the tuple

TUPLE { S# S#('S6'), SNAME NAME('Lopez'), STATUS 30, CITY 'Madrid' }

is (let us agree) a plausible supplier tuple; however, it does not appear in the current value of relvar S as shown in Fig. 2.1, and so we are entitled to assume that the corresponding proposition—

Supplier S6 is under contract, is named Lopez, has status 30, and is located in city Madrid.

—evaluates to FALSE at this time. In other words, the relvar contains, at any given time, all and only the tuples that represent true propositions at that time.
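Under the Closed World Assumption, then, the truth value of a proposition reduces to a membership test against the current body of the relvar. The following Python sketch illustrates the idea under stated assumptions: the set `S` holds only the three supplier tuples quoted above (Fig. 2.1 is not reproduced here and may contain more), and the function name `holds` is hypothetical.

```python
# Partial body of relvar S: only the three tuples quoted in the text.
# Each tuple is (S#, SNAME, STATUS, CITY).
S = {
    ("S1", "Smith", 20, "London"),
    ("S2", "Jones", 10, "Paris"),
    ("S3", "Blake", 30, "Paris"),
}


def holds(t, body):
    """Closed World Assumption: the proposition represented by tuple t
    is taken to be TRUE if and only if t appears in the body."""
    return t in body


smith = ("S1", "Smith", 20, "London")
lopez = ("S6", "Lopez", 30, "Madrid")

print(holds(smith, S))  # True: the tuple appears in S
print(holds(lopez, S))  # False: plausible, but absent from S
```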

More terminology: Again, let P be the relvar predicate or intension for relvar R, and let the value of R at some given time be r. Then r (or the body of r, to be more precise) constitutes the extension of P at that given time. (Observe, therefore, that the extension varies over time but the intension does not.) Another way of saying the same thing is to say that relation r is “the current manifestation” of predicate P. Yet another way is to say that r contains exactly the tuples that make P evaluate to TRUE (at the time in question).

Now, if we think of each relvar as containing the tuples that make its predicate evaluate to TRUE at the time in question, it follows that we can think in a similar way about arbitrary relational expressions. For example, consider the following expression:

S { S#, SNAME, STATUS }

This expression denotes a projection (see the section “Relational Operators” later in this chapter) of the current value of relvar S on attributes S#, SNAME, and STATUS. The result of that projection contains all tuples of the form

TUPLE { S# s, SNAME n, STATUS t }

such that a tuple of the form

TUPLE { S# s, SNAME n, STATUS t, CITY c }

currently appears in relvar S for some CITY value c. Thus, that result represents the current extension of a predicate that looks like this:

There exists some city CITY such that supplier S# is under contract, is named SNAME, has status STATUS, and is located in city CITY.

Observe that the result relation has three attributes and the corresponding predicate has three parameters; CITY is not a parameter to that predicate but a bound variable instead, thanks to the fact that it is quantified by the phrase “there exists some city.”¹ Another, perhaps clearer, way of making the same point (i.e., that the predicate has three parameters, not four) is to observe that the predicate as just stated is logically equivalent to this one:

Supplier S# is under contract, is named SNAME, has status STATUS, and is located in some city.

This version of the predicate very clearly has just three parameters.
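The correspondence between projection and existential quantification can be checked mechanically. The Python sketch below is an illustration under stated assumptions: `S` contains only the three supplier tuples quoted earlier in this section, and `exists_city` is a hypothetical helper that evaluates the three-parameter predicate by searching for some qualifying CITY value.

```python
# Partial body of relvar S: (S#, SNAME, STATUS, CITY).
S = {
    ("S1", "Smith", 20, "London"),
    ("S2", "Jones", 10, "Paris"),
    ("S3", "Blake", 30, "Paris"),
}

# Projection S { S#, SNAME, STATUS }: drop CITY; the set removes duplicates.
projection = {(sno, sname, status) for (sno, sname, status, _city) in S}


def exists_city(sno, sname, status, body):
    """TRUE iff there exists some city c such that the tuple
    (sno, sname, status, c) appears in body. CITY is bound, not a parameter."""
    return any((s, n, t) == (sno, sname, status) for (s, n, t, _c) in body)


# Evaluate the quantified predicate over a few candidate 3-tuples,
# including one (S6) that has no matching supplier tuple.
candidates = projection | {("S6", "Lopez", 30)}
extension = {c for c in candidates if exists_city(*c, body=S)}

print(extension == projection)  # True
```

The candidates that satisfy the quantified predicate are exactly the tuples of the projection, which is what it means to say that the projection is the current extension of that predicate.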

It follows from the foregoing that virtual relvars in particular represent certain predicates (see the section “Virtual Relvars” later in this chapter). For example, let virtual relvar SST be defined as follows:

VAR SST VIRTUAL ( S { S#, SNAME, STATUS } ) ;

Then the relvar predicate for relvar SST is precisely:

Supplier S# is under contract, is named SNAME, has status STATUS, and is located in some city.
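One way to picture a virtual relvar is as an expression that is re-evaluated against the current value of the underlying relvar whenever it is referenced. The Python sketch below takes that view; it is an analogy, not a description of how any particular system implements virtual relvars. The function `sst` is a hypothetical stand-in for SST, and `S` holds only the three supplier tuples quoted earlier.

```python
# Partial body of base relvar S: (S#, SNAME, STATUS, CITY).
S = {
    ("S1", "Smith", 20, "London"),
    ("S2", "Jones", 10, "Paris"),
    ("S3", "Blake", 30, "Paris"),
}


def sst():
    """Stand-in for virtual relvar SST: the projection of the *current*
    value of S on S#, SNAME, and STATUS, recomputed on every call."""
    return {(sno, sname, status) for (sno, sname, status, _city) in S}


before = sst()
S.add(("S6", "Lopez", 30, "Madrid"))  # update the base relvar
after = sst()

print(("S6", "Lopez", 30) in before)  # False
print(("S6", "Lopez", 30) in after)   # True
```

The predicate for SST is the same before and after the update; only its extension has changed.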

¹ Bound variables are not variables in the usual programming sense; they are variables in the sense of predicate logic. See reference [78] or reference [112] if you need further explanation of quantifiers, bound variables, and related matters.

There are a few more points to be made regarding predicates and propositions. First, we have said that a predicate has a set of parameters. Of course, that set can be empty—and if it is, then the predicate in question degenerates to a proposition (certainly it is unconditionally either true or false). In other words, all propositions are predicates, but most predicates are not propositions.

Second, we have said that a predicate is “a generic statement about some portion of reality.” We can now see that it is precisely the fact that the statement is, in general, parameterized that makes it generic (and if the set of parameters is empty, then that “generic” statement becomes rather specific!).

Third, we have also said, in the case of a relvar predicate specifically, that the parameters correspond to the attributes of the relvar; thus, a relvar of degree n represents a predicate with n parameters, or what the logicians call an n-place predicate. However, no harm is done, logically speaking, if we think of that set of n parameters as constituting a single tuple parameter (and corresponding arguments, in some instantiation of the predicate, as constituting a single tuple value). Thus, we can simplify our discussions somewhat by considering relvar predicates always to be monadic, meaning they are defined in terms of just one (tuple) parameter. And we can then go on to think of a given tuple t (of the appropriate tuple type) as either satisfying or violating a given relvar predicate P. To be specific, tuple t satisfies predicate P if and only if the proposition obtained from P by substituting t for its (tuple) parameter evaluates to TRUE, and it violates it if and only if it does not satisfy it.

Fourth, a matter of notation: For clarity, we have deliberately used different symbols, R and P, for a relvar and its predicate. Sometimes, however, there are good reasons to conflate the two. Thus, we will occasionally write expressions of the form R(t), meaning, specifically, that tuple t appears in relvar R and therefore satisfies the relvar predicate corresponding to R.

Relations vs. Types

It follows from everything we have said in this section so far that:

• Types comprise the things we can talk about.

• Relations comprise the truths we utter about those things.

In other words, types give us things we can talk about (in effect, they give us our vocabulary) and relations give us the ability to say things about the things we can talk about. (There is a nice analogy here that might help you remember and appreciate these important points: Types are to relations as nouns are to sentences.) In the case of suppliers, for example, the things we can talk about are supplier numbers, names, integers, and character strings, and the things we can say are things of the form “The supplier with the specified supplier number is under contract, has the specified name, has the status denoted by the specified integer, and is located in the city denoted by the specified character string.”

Note the following important corollaries! In order (as we put it earlier) to “represent some portion of reality”:

1. Types and relations are both necessary: without types, we have nothing to talk about; without relations, we cannot say anything.

2. Types and relations are sufficient, as well as necessary: we do not need anything else, logically speaking. (Well, we need relvars too, in order to reflect the fact that reality changes over time, but not to reflect any particular portion of reality.)

3. Types and relations are not the same thing.

With regard to the last of these points, incidentally, we saw in the previous section that there is a logical difference between relvars and types—a relvar is a variable, and variables are not types. By the same token, we now see that there is a logical difference between relations and types as well—a relation is a value, and values are not types either.
