• No results found

are used instead), it is important to have at least a basic understanding of them to better comprehend what the higher-level languages are doing.

Relational Algebra

Relational algebra is a language that describes how the database should run a query in order to return the desired results. Because relational algebra describes how to run a query, it is referred to as a procedural language.

A relational algebra expression consists of two relations that act as operands and an operation. The operation produces an additional relation as output without any side effects on the input operands. Relations are closed under relational algebra, meaning that both the inputs and the outputs of an expression are relations. The closure property allows expressions to be nested using the output of one expression to be the input of another.

All relational algebra operations can be broken down into a base set of five operations. While other operations do exist, any operation outside the base set can be expressed in terms of the base set of operations. The base set of operations in relational algebra consists of selection, projection, Cartesian product, union, and set difference.

Relational algebra operations can operate on either a single relation (unary) or a pair of relations (binary). While most operations are binary, the selection and projection operations operate on a single relation and are unary.

In addition to the base operations, this section discusses the intersection and join operations.

To provide an example of relational algebra operations, consider the simple relations defined in Tables 1.4 and 1.5.

ptg18221911

Union (A ∪ B)

The union operator produces a relation that includes all the tuples in the operand relations (see Table 1.6). It can be thought of as an “or” operation in that the output relation has all the members that are in either relation A OR relation B.

Intersection (A ∩ B)

The intersection operator produces a relation that includes all tuples in both relation A and relation B (see Table 1.7).

Table 1.4 Relation A Color

Red White Blue

Table 1.5 Relation B Color

Orange White Black

Table 1.6 A B Color

Red White Blue Orange Black

Table 1.7 A B Color

White

ptg18221911 Relational Languages 11

Difference (A - B)

The difference operator produces a relation that contains the tuples that are members of the left operand without the tuples that are members of the right operand (see Table 1.8).

Cartesian Product (A ë B)

The Cartesian product produces a relation that includes all possible ordered pairings of all tuples from operand A with all tuples from operand B (see Table 1.9). The degree of the output relation is the sum of the degree of each operand relation. The cardinality of the output relation is the product of the cardinalities of the input relations. In our example, both relations A and B have a degree of 1. Therefore, the output relation has a degree of 1 + 1 = 2. Similarly, both relations A and B have a cardinality of three, so the output relation has a degree of 3 * 3 = 9.

Selection (σpredicate (A))

Selection produces a relation with only the tuples from the operand that satisfy a given predicate. Remember that, unlike the previous operations, selection is a unary operation and operates on only a single relation.

As an example of the selection operation, we again consider the os relation from earlier in the chapter. In the example, the os relation is being searched for all tuples that contain an api value that is greater than 19 (see Table 1.10).

Table 1.8 A - B Color

Red Blue

Table 1.9 A ë B

A. Color B. Color

Red Orange

Red White

Red Black

White Orange

White White

White Black

Blue Orange

Blue White

Blue Black

ptg18221911

Projection (Πa1, a2,…,an(A))

Projection produces a relation containing only the attributes that are specified on the operand. The output relation has the values from the attributes listed in the operand, and the operation removes the duplicates.

Like selection, projection is also a unary operation working on a single input relation.

As an example, we again use the relation depicted in Table 1.1. This time, only the values for the attribute codename are included in the resulting relation (see Table 1.11).

Joins

The join relations can be considered a class of relations that are similar to the Cartesian product of two operand relations. Usually, a query does not need to return the complete pairing of tuples from the two operands that are produced by the Cartesian product.

Instead, it is usually more useful to limit the output relation to only those pairings that meet certain criteria. This is where the different join operations are useful.

Natural join is a useful join variant as it conceptually allows two relations to be combined into a single relation connecting the relations over a set of common attributes.

For example, if we consider the os relation in Table 1.1 and the normalized device

relation in Table 1.3, we can produce a relation that combines the two relations using the

device.os_version and os.version attributes from each of the input relations. The results are depicted in Table 1.12.

Table 1.10 σapi>19 (os)

version codename api

5.1 Lollipop 22

5.0 Lollipop 21

Table 1.11 (Πcodename(os)) codename

Lollipop KitKat Jelly Bean

Table 1.12 A ⋈ B

device.version device.manufacturer os.version os.codename os.api

Galaxy Nexus Samsung 4.3 Jelly Bean 18

Nexus 5 LG 5.1 Lollipop 21

Nexus 6 Motorola 5.1 Lollipop 21

ptg18221911 Relational Languages 13

Notice how the result of the natural join is the same unnormalized relation as in Table 1.2. By using a join operation, we are now able to perform additional operations on the output relation to produce the same results that would have been obtained if the data was combined in one table.

Natural join is really a specific type of theta join that uses the equality operation over a set of attributes. Theta join allows the use of any operation to combine the two operand relations. Equality (producing a natural join) is just one of the most common cases.