Integrity Constraints (IC’s) - Relational Databases

2.2 Relational Databases

2.2.2 Integrity Constraints (IC’s)

In a database, we represent a certain "world", and our data must respect some properties. These properties can be ensured by the use of integrity constraints. Integrity constraints are widely used in database management systems, and have a lot of power and expres- siveness.

Integrity constraints are closed first-order L-formulas, where L is a first-order lan- guage defined in the standard way. In the sequel, we denote relation symbols by N, N1, ..., Nm. We denote by ¯x, ¯y, ¯z, sequences of pairwise distinct variables with appropriate arity, such that xi denotes the ith component of ¯x. Also, we represent conjunctions of atomic formulas referring to built-in predicates by ϕ.

We now describe several integrity constraints classes. 2.2.2.1 Keys and Functional Dependencies

Within a relation, we must have a way to distinguish tuples. This is expressed in terms of the values of their attributes. We want the tuples to be uniquely identified by the set of values of a set of attributes. Therefore, the key integrity constraint was developed. In a relational database, they refer to a set of attributes of a relation for which it holds that no two distinct tuples have the same values for those attributes. Therefore, given a key

2. PRELIMINARIES 2.2. Relational Databases

constraint, we have a set of one or more attributes that, taken collectively, allow us to uniquely identify a tuple in a relation.

Definition 2.23(Keys). Given a relation name N , and a set of names of attributes A (the ones that form the key constraint), such that AR(N ) = n, the key constraint, with respect to N and A, expressed by SK(N, A), is defined as follows:

∀x,¯¯y¬  N (¯x) ∧ N (¯y) ∧ ^ i∈#N A xi = yi∧ _ j∈{0,1,2,3,...,n}\#N A xj 6= yj   (2.1)

Consider the Customers relation in Table 2.1 and the following integrity constraint: SK(Customers, {Id}) Customers Id Name 1 John 1 Peter 2 Michael

Table 2.1: Key constraint example

There is a clear violation of the key integrity constraint, since we have two distinct tuples in relation Customers with the same value for the attribute Id.

Keys can be seen as a special case of a more general type of integrity constraints: the functional dependencies. Given a relation R, a set of attributes X in R is said to functionally determine another set of attributes Y , also in R, if each X value is associated with precisely one Y value. This can also be expressed by X → Y .

The difference between key constraints and functional dependencies, is that, when- ever we have a key constraint, a set of attributes functionally determines all the attributes, while in the functional dependency, it may be the case where the attributes that are being functionally determined are not all attributes, but only some of them.

Definition 2.24 (Functional Dependencies). Given a relation name N , two sets of names of attributes A and B from relation N , such that A → B, the functional dependency with respect to N, A and B, expressed by F D(N, A, B), is defined as follows:

∀x,¯¯y¬  N (¯x) ∧ N (¯y) ∧ ^ i∈#N A xi = yi∧ _ j∈#N B xj 6= yj   (2.2)

Since key constraints are a special case of the functional dependencies, we can define the first in terms of the second in the following way: given a relation N and the set A of attributes that form the candidate key, and recalling that the sequence of attributes of a

2. PRELIMINARIES 2.2. Relational Databases

relation N , hκ0, κ1, ..., κni, is given by A(N ):

SK(N, A) = F D(N, A, {κ0, κ1, ..., κn} \ A)

Intuitively, we are saying that a set of attributes functionally determines the whole rest of attributes existing the relation.

2.2.2.2 Inclusion Dependencies

Inclusion dependencies (INDs) state that the sequence of values of a sequence of attributes of the tuples in a relation, must exist as the sequence of values of a sequence of attributes of the tuples of another (possibly the same) relation.

Definition 2.25 (Inclusion Dependencies). Given two names of relations N1 and N2, two sequences of names of attributes A = hκ1, κ2, ..., κki and B = hε1, ε2, ..., εki from relation N1 and relation N2respectively, let the inclusion dependency, with respect to N1, N2, A, B, expressed by IN D(N1, N2, A, B), where the values of the attributes in A must exist as the values of the attributes in B, defined as:

∀_x_¯∃_y_¯ " ¬N₁(¯x) ∨ (N2(¯y) ∧ k ^ i=1 x #N1_κi = y#N2_εi ) # (2.3)

From the previous definition, we conclude that if a sequence of values for a sequence of attributes exists in relation N1, the same sequence of values for another sequence of attributes must exist in relation N2. If it doesn’t exist in N1, it may or may not exist in N2. This is often written as N1[Y1] ⊆ N2[Y2]where Y1 (respectively Y2) is the sequence of values of the sequence of attributes of N1 (respectively N2) corresponding to the attributes that form the inclusion dependency [Cho07].

Accounts

Account Number Branch Name

111 Sete Rios

222 FCT

333 Coimbra

Branches

Branch Name Branch City Sete Rios Lisboa

FCT Almada

Benfica Lisboa

Aliados Porto

Table 2.2: Relations of the example

Consider the following relations: Accounts and Branches, in Table 2.2. If we take into consideration the inclusion dependency, represented by Accounts[Branch Name] ⊆ Branches [Branch Name], there is a clear violation of the integrity constraint, since Coimbra does not exist as the value of the attribute Branch Name in relation Branches.

2. PRELIMINARIES 2.2. Relational Databases

2.2.2.3 Denial Constraints

Denial constraints (DCs) are a kind of integrity constraint that prevent a certain general property to hold in a database. We present two special cases of the denial constraints, the check constraints and the domain constraints. Afterwards, we formally introduce the definition of a denial constraint.

Check constraints are a kind of integrity constraint that restrict the domain of a specific attribute, by imposing some mathematical restriction over the value of that attribute, that must always be obeyed.

Definition 2.26(Check Constraint). Given a relation name N , an attribute name κ, a mathe- matical operator θ ∈ {>, <, ≥, ≤, =, 6=} and a value V of the domain of κ, let a check constraint, with respect to N , κ, θ and V , expressed by CC(N, κ, θ, V ), be defined as follows:

∀¯x¬[N (¯x) ∧ x#N

κ θ V ] (2.4)

Examples of check constraints may be: a person must be over 21 years old to be a costumer in a bank, a bank account balance must be greater than 100, and so on. Consider Table2.3, representing the Employee relation, belonging to a database D:

Name Age

John 22

Peter 32

Paul 35

Table 2.3: The Employee relation

Now consider the check constraint constraint: CC(Employee, Age, <, 36). The relation would be inconsistent, since there are people younger than 36.

Domain constraints are a kind of integrity constraint that do not allow the value of a certain attribute to be outside of a user specified set of values. We restrict the domain Do of an attribute to a subset Do0 of Do.

Definition 2.27(Domain Constraints). Given a relation name N , an attribute name κ and a specific domain Do of the form Do = {val1, val2, ..., valk}, let a domain constraint, with respect to N , κ and Do, expressed by DoC(N, κ, Do), be defined as follows:

∀x¯¬ " N (¯x) ∧ k ^ i=1 x_#N κ 6= vali # (2.5)

Examples of domain constraints may be: the sex of a person must be either Male of Female; a person’s civil state should be either single, married or divorced. Consider Table

2. PRELIMINARIES 2.3. Answer Set Programming

In document Database repairs with answer set programming (Page 31-35)