Mathematical Interlude
5.2 Inductive Definitions and Proofs, Formal Languages
5.2.1 Inductive definitions
John McGregor (a Scottish squire from the 17th century) had four children:
Mary, James, Robert and Lucy.
James died childless. Mary had two children. Robert and Lucy had three children each.
William, Mary’s first child, had one child.
...
and so the story goes on.
We may not know the descendants of John McGregor or their number, but we have no trouble in understanding what a descendant of John McGregor is. It is either a child of McGregor, or a child of a child, or a child of a child of a child,... and so on.
Using the concept of a finite sequence it is not difficult to give an explicit definition of the set of a’s descendants, where a is a person.
x is a descendant of a iff there is a finite sequence (a₁, . . . , aₙ), such that a₁ is a child of a, aᵢ₊₁ is a child of aᵢ, for all i = 1, . . . , n − 1, and aₙ = x.
The sequence just described shows the chain connecting x to a. The condition concerning the sequence can be relaxed:
A person x is a descendant of a iff there is a finite sequence in which x is the last member, and every member of it is either a child of a or a child of some previous member.
There is another way of defining the set of descendants, which does not employ finite sequences. Consider the following two properties of a set X:
(I) If x is a child of a, then x∈ X (i.e., every child of a is a member of X).
(II) If x∈ X and y is a child of x, then y ∈ X (i.e., every child of a member of X is a member of X).
It is obvious that the set of all descendants of a satisfies (I) and (II), i.e., if X is the set of all descendants of a, then (I) and (II) are true.
There are other sets that satisfy (I) and (II); for example, the set of all persons (because every child of a is a person and every child of a person is a person); or the set of all people that are descendants either of a or of b. But every set that satisfies (I) and (II) includes as a subset the set of descendants of a: First, by (I), it contains as members all the children of a;
second, by (II), it also contains all the children’s children; hence, by (II) again, it contains all children’s children’s children, and so on. Therefore we have:
The set of all descendants of a is the smallest set that satisfies (I) and (II).
Here by “smallest” we mean that it is included as a subset in every set that satisfies (I) and (II). Note that if there is a smallest set it must be unique: if Y1 and Y2 are both smallest sets, then Y1 ⊆ Y2 and Y2 ⊆ Y1.
Therefore we can also say:
x is a descendant of a iff it belongs to every set satisfying (I) and (II).
Frege was the first to give definitions of this type.
Note: Instead of (I) and (II) we can use a single condition: their conjunction. This condition can be stated as follows:
(III) If x is either a child of a or a child of a member of X, then x∈ X.
The existence of a smallest set that satisfies a given condition is a property of the condition.
Not every condition has this property. Consider, for example, the condition of being non-empty. There is no smallest non-empty set. Because if b and c are any two different objects, both {b} and {c} are non-empty; but there is no non-empty set that is a subset of both ({b} ∩ {c} = ∅). Each of {b} and {c} is a minimal non-empty set: it has no proper subset which is not empty; but it is not the smallest non-empty set. Or consider the following condition on X:
(IV) At least three children of McGregor are members of X.
Given that Mary, James, Robert and Lucy are McGregor’s children, each of the following sets satisfies (IV):
{Mary, James, Robert} {James, Robert, Lucy}
But no set that is a subset of both satisfies (IV), because their intersection, {James, Robert}, contains only two of McGregor’s children.
If Y is the smallest set satisfying the condition P, then it is (i) a member of the family of all sets satisfying P, and (ii) a subset of every set in this family. [By a ‘family of sets’ we mean a set whose members are sets.] Hence, Y is the intersection of all sets satisfying P.
Note: In 5.1.2 we defined intersections of a finite number of sets. The definition generalizes easily to any non-empty family, F, of sets: The intersection of the members of F is the set consisting of those objects that are members of every set in F. An analogous generalization applies to unions: The union of all the sets in the family F is the set whose members are all objects that belong to some member of F.
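For a small finite universe, the characterization of the smallest set as an intersection can be checked by brute force. The following Python sketch enumerates every subset of a tiny population, keeps those that satisfy (I) and (II), and intersects them. The data are an assumption made for illustration: John's four children and Mary's child William come from the text, while the names Anne and Hugh, and the helper names, are hypothetical.

```python
from itertools import combinations

# Parent -> children. "Anne" and "Hugh" are hypothetical names.
CHILDREN = {
    "John": {"Mary", "James", "Robert", "Lucy"},
    "Mary": {"William", "Anne"},
    "William": {"Hugh"},
}
PERSONS = {"John", "Mary", "James", "Robert", "Lucy", "William", "Anne", "Hugh"}

def satisfies_I_and_II(X, a):
    """True iff X contains all children of a and all children of its members."""
    cond_I = CHILDREN.get(a, set()) <= X
    cond_II = all(CHILDREN.get(x, set()) <= X for x in X)
    return cond_I and cond_II

def subsets(universe):
    """Yield every subset of a finite universe."""
    items = sorted(universe)
    for size in range(len(items) + 1):
        for combo in combinations(items, size):
            yield set(combo)

def smallest_set(a):
    """Intersect the family of all subsets of PERSONS satisfying (I) and (II)."""
    family = [X for X in subsets(PERSONS) if satisfies_I_and_II(X, a)]
    return set.intersection(*family), family

descendants_of_john, family = smallest_set("John")
print(sorted(descendants_of_john))
```

The family is never empty here, because PERSONS itself satisfies both conditions. The intersection is itself a member of the family, which is what makes it the smallest set rather than merely a lower bound.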
Operations on Sets, Monotonicity and Fixed Points
Our first definition of descendants tells us how to get each descendant by some finite, bottom-up construction of a sequence. The second definition represents a top-down approach, in which we form the intersection of all the sets that satisfy certain conditions. There is a connection between the two definitions. It is brought out by regarding (I) and (II) not only as conditions, but as rules that determine operations on sets. If X is the set that is operated on, then the rules are as follows.
(I∗) If x is a child of a, add x to X.
(II∗) If x∈ X and y is a child of x, add y to X.
To apply (I∗) to X means to add to X all the children of a. (If there are no children of a, or if all of them are already in X, no new members are added.) To apply (II∗) to X means
to add to it all the children of its members. (If no member of X has children, or if all the children of the members of X are already in X, no new members are added.)
Henceforth we use ‘(I∗)’ and ‘(II∗)’, ambiguously, to refer to the rule, as well as to the operation determined by it.
Obviously, (I∗) and (II∗) can either augment X or leave it unchanged. They cannot decrease it. Operations having this property are called non-decreasing. Also, the outcome of applying either (I∗) or (II∗) to X does not decrease if X is augmented. Operations that have this property are called monotone.
A set that is left unchanged by applying to it an operation is said to be a fixed point of the operation. If the operation is non-decreasing we also say that the set is closed under the operation.
If ‘F (X)’ denotes the outcome of applying the operation F to the set X, then the properties above can be summarized as follows:
Non-Decreasing: X ⊆ F (X)
Monotone: X ⊆ X′ ⇒ F (X) ⊆ F (X′)
Fixed Point: F (X) = X
For a non-decreasing F , a fixed point of F is said to be closed under F.
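The three properties can be tested exhaustively on a small example. The sketch below, in Python, takes F to be a single operation that adds the children of a and the children of the members of X, and checks each property over all subsets of a four-person universe. The reduced child relation and the names F, CHILDREN and PERSONS are assumptions chosen for the illustration.

```python
from itertools import combinations

# A reduced, illustrative child relation (parent -> children).
CHILDREN = {"John": {"Mary", "James"}, "Mary": {"William"}}
PERSONS = ["John", "Mary", "James", "William"]

def F(X, a="John"):
    """Add to X the children of a and the children of every member of X."""
    added = set(CHILDREN.get(a, set()))
    for x in X:
        added |= CHILDREN.get(x, set())
    return X | added

# Every subset of PERSONS, as frozensets.
subsets = [frozenset(c)
           for size in range(len(PERSONS) + 1)
           for c in combinations(PERSONS, size)]

non_decreasing = all(X <= F(X) for X in subsets)
monotone = all(F(X) <= F(Y) for X in subsets for Y in subsets if X <= Y)
fixed_points = [set(X) for X in subsets if F(X) == X]

print(non_decreasing, monotone)
print(fixed_points)
```

With these data the fixed points are exactly the sets that contain Mary, James and William; whether John himself belongs makes no difference.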
Obviously, a set is a fixed point of (I∗) and (II∗) iff all children of a are already in it, and, for each of its members, it contains also all the member’s children. But this simply means that the set satisfies (I) and (II). Hence, the sets that satisfy (I) and (II) are exactly the fixed points of (I∗) and (II∗). Our aim is therefore to construct the smallest fixed point of (I∗) and (II∗). This is achieved as follows.
We start with an initial set, X0, such that X0 = ∅. Applying (I∗) to it, we get a set, X1, consisting of all the children of a. By applying (II∗) to X1 we get a set, X2, consisting of all the children of a and all their children. Again, by applying (II∗) to X2 we get X3, which consists of the children of a, the children’s children, and the children’s children’s children. And so on.
We get in this way a sequence
X0, X1, . . . , Xn, . . . which is non-decreasing: X0 ⊆ X1 ⊆ . . . ⊆ Xn⊆ . . ..
All sets in this sequence contain only descendants of a. It is also easily seen that every descendant of a is a member of some set in the sequence. Hence the union of all the sets in the sequence is exactly the set of all descendants of a. It is the smallest fixed point of (I∗) and (II∗).
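The bottom-up construction just described can be sketched in a few lines of Python. The code applies (I∗) once to the empty set and then iterates (II∗) until the sequence reaches a plateau. As before, the child relation is partly invented: the names Anne and Hugh are hypothetical, and the function names are chosen only for this example.

```python
# Parent -> children. "Anne" and "Hugh" are hypothetical names.
CHILDREN = {
    "John": {"Mary", "James", "Robert", "Lucy"},
    "Mary": {"William", "Anne"},
    "William": {"Hugh"},
}

def rule_I(X, a):
    """(I*): add every child of a to X."""
    return X | CHILDREN.get(a, set())

def rule_II(X):
    """(II*): add every child of a member of X to X."""
    new = set()
    for x in X:
        new |= CHILDREN.get(x, set())
    return X | new

def stages_for(a):
    """Return the sequence X0, X1, X2, ... up to the first fixed point."""
    stages = [set()]                      # X0 is the empty set
    stages.append(rule_I(stages[-1], a))  # X1: the children of a
    while True:
        nxt = rule_II(stages[-1])
        if nxt == stages[-1]:             # plateau: nothing new was added
            return stages
        stages.append(nxt)

stages = stages_for("John")
for i, X in enumerate(stages):
    print(i, sorted(X))
```

Because the example family is finite, the loop stops; the last stage is the union of all the stages and is left unchanged by both rules.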
Note: The sets of such a sequence can either go increasing all the way, or reach a “plateau”, remaining the same from some point on. In our example, the first is the case if time goes on indefinitely and there are always new descendants of a; the second is the case if, from some time on, no new descendants are added.
We can combine (I∗) and (II∗) into a single operation, which adds to X the children of a, as well as the children of the members of X. This corresponds to the conjunction, (III), of (I) and (II):
(III∗) If x is either a child of a or a child of a member of X, add x to X.
Applied to the empty set, (III∗) adds to it the children of a (as does (I∗)). Afterwards, since our set contains already the children of a, applications of (III∗) are the same as applications of (II∗).
Our case exemplifies the general features of inductive definitions:
(a) We are given certain conditions [in our example: (I) and (II)]. There is a smallest set that satisfies them and this is the set we define.
(b) We recast the conditions as rules that determine non-decreasing monotone operations on sets [in our example: (I∗) and (II∗)]. A set satisfies the defining conditions iff it is a fixed point of these operations.
(c) Starting with the empty set, and iterating the operations we get a non-decreasing sequence of sets whose union is the smallest fixed point.
It is always possible to replace the set of conditions by their conjunction [in our example:
(III)], and to use the operation that corresponds to it [in our example: (III∗)], i.e., to iterate this single operation. This can be preferred for the purpose of a general treatment. But usually it is easier to grasp the construction if we separate the conditions. In particular, it is convenient to distinguish two kinds of rules:
• Base Rules: These are the rules for starting the process. They enable us, unconditionally, to put in our set certain objects.
• Recursive Rules: These are the rules that we keep iterating. They enable us to add, as new members, objects that are related (in the relevant way) to members of the set.
In our example (I∗) is a base rule and (II∗) is a recursive rule. If we combine the rules into one, then, when this is applied to ∅, it acts as the conjunction of all the base rules; afterwards it acts as the conjunction of all the recursive rules.
The term ‘inductive rule’ is sometimes used as a synonym of ‘recursive rule’. But it is also used more broadly to refer to all the rules of the inductive definition.
Note: The conditions of an inductive definition must be such that there is a smallest set satisfying them. They should moreover determine, in the way just illustrated, non-decreasing monotone operations. There are logical characterizations of conditions that have these properties. We shall not go into them here. The general theory of inductive definitions is a subject by itself.
Note: ‘Induction’ has several meanings. You probably know the term as it is used to characterize empirical generalizations; e.g., from the observed cases of human mortality we infer by inductive generalization that all humans are mortal. Do not confuse the two uses of
‘induction’. [The common root of the two refers to the “inducing” of new facts by old ones. In empirical induction, we infer new unobserved cases from observed ones. This type of inference does not have logical or mathematical certainty. In inductive definitions the “inducing” is part of the definition: the fact that c is a descendant of a is “induced” by the facts that c is a child of b and b is a descendant of a.]
The term ‘recursive definition’ is sometimes used as a synonym for ‘inductive definition’. One also speaks of definition by recursion. Unfortunately, ‘recursion’ is used also to denote any computational process based on some algorithm. Thus, both ‘induction’ and ‘recursion’ have more than one meaning.
Terminology and Notation: The conditions that figure in an inductive definition are also known as the clauses, or the inductive clauses, of the definition.
To cut the terminology short, it is customary to regard these clauses also as rules for adding members. This means that we can speak of (I) and (II) as if they were, respectively, (I∗) and (II∗); we may thus say that a set is closed under (II), or that it is a fixed point of (I) and (II).
It is customary to use the same symbol in the role of the set-variable that is used in stating the conditions (in our example, ‘X’), as well as a name for the inductively defined set. If ‘Da’ is to denote the set of all descendants of a, then its inductive definition will have the form:
(1) If x is a child of a, then x∈ Da.
(2) If x∈ Da and y is a child of x, then y ∈ Da.
We then say that Da is defined inductively by (1) and (2), meaning that it is the smallest set satisfying these conditions. And we also say that Da is the smallest fixed point of (1) and (2).
Here are some other examples of inductively defined sets. We denote them as ‘S1’, ‘S2’, etc.
The set S1:
(1) 2∈ S1.
(2) 3∈ S1.
(3) If x∈ S1, then 2x ∈ S1.
(4) If x∈ S1, then 3x ∈ S1.
Here (1) and (2) are the base rules and (3) and (4) are the recursive rules. Obviously, (1) and (2) can be replaced by the single base rule:
(1′) 2, 3 ∈ S1.
And the other two rules can be combined into a single recursive rule:
(2′) If x∈ S1, then 2x ∈ S1 and 3x ∈ S1.
After the first step we get the set {2, 3} and then, with each iteration of the recursive rules, we add to our set all the products of set members with 2 and with 3. The first four sets in the sequence are:
∅, {2, 3}, {2, 3, 4, 6, 9}, {2, 3, 4, 6, 9, 8, 12, 18, 27}
It is not difficult to see that S1 consists of all natural numbers that can be expressed as products > 1, of 2’s and 3’s, that is: all numbers of the form 2ᵐ3ⁿ, where m, n ≥ 0 and at least one of m, n is non-zero. (Recall that x⁰ = 1 and x¹ = x.)
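The first stages of the construction of S1 are easy to reproduce mechanically. The Python sketch below applies, in each round, the base rule (put 2 and 3 in the set) together with the recursive rules (add 2x and 3x for every member x). The helper names are assumptions made for this example.

```python
def step(X):
    """One round: apply the base rule and both recursive rules to X."""
    return X | {2, 3} | {2 * x for x in X} | {3 * x for x in X}

def is_product_of_2s_and_3s(n):
    """True iff n > 1 and n has no prime factor other than 2 and 3."""
    if n <= 1:
        return False
    for p in (2, 3):
        while n % p == 0:
            n //= p
    return n == 1

stages = [set()]
for _ in range(3):
    stages.append(step(stages[-1]))

for stage in stages:
    print(sorted(stage))
```

Every number produced in these stages passes the `is_product_of_2s_and_3s` check, in line with the description of S1 as the set of products greater than 1 of 2's and 3's.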
The set S2:
(1) 2, 3∈ S2
(2) If x, y ∈ S2, then x·y ∈ S2
Clause (2) means that S2 is closed under products; i.e., it contains, with every two members, also their product.
It is not difficult to see that S1 = S2. The argument, which is easy, shows how the property of being the smallest set satisfying the condition is used:
S2 contains 2 and 3 and is closed under products. Hence it contains all products of 2’s and 3’s. Therefore S2 satisfies the conditions that define S1.
Since S1 is the smallest set satisfying these conditions, we have: S1⊆ S2.
Vice versa, the set of all products > 1 of 2’s and 3’s contains 2 and 3 and is closed under products. Hence it satisfies the conditions that define S2.
Since S2 is the smallest set satisfying these conditions, we have: S2⊆ S1.
Putting the two together we get: S1 = S2.
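The equality can also be checked numerically up to a bound. The Python sketch below is a finite check, not a proof: it iterates each definition from the empty set and discards every number above an arbitrary cutoff LIMIT. The truncation is harmless here because all members are at least 2 and the rules only multiply, so a member below the cutoff is always obtained from smaller members. The names LIMIT and bounded_closure are assumptions of this example.

```python
LIMIT = 1000  # arbitrary cutoff chosen for this illustration

def bounded_closure(base, rule, limit):
    """Iterate base rule + recursive rule from the empty set, keeping numbers <= limit."""
    X = set()
    while True:
        nxt = {n for n in X | base | rule(X) if n <= limit}
        if nxt == X:          # nothing new below the cutoff
            return X
        X = nxt

def rule_s1(X):
    """Recursive rules of S1: add 2x and 3x for every member x."""
    return {2 * x for x in X} | {3 * x for x in X}

def rule_s2(X):
    """Recursive rule of S2: add x*y for every two members x, y."""
    return {x * y for x in X for y in X}

s1 = bounded_closure({2, 3}, rule_s1, LIMIT)
s2 = bounded_closure({2, 3}, rule_s2, LIMIT)
print(s1 == s2)
```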
This case is easy. But, in general, the question whether two given inductive definitions define the same set can be very difficult.
The set S3:
(1) 1∈ S3.
(2) If x∈ S3, then 2x ∈ S3.
It is not difficult to see that S3 is just the set consisting of all powers of 2:
{2⁰, 2¹, 2², 2³, . . . , 2ⁿ, . . .}
The set S4:
(1) 3, 5∈ S4.
(2) If x∈ S4, then x+3 ∈ S4.
(3) If x∈ S4, then x+5 ∈ S4.
S4 is the analogue of S1 (with 2 and 3 replaced by 3 and 5) in which products have been replaced by sums. It is not difficult to see that S4 consists of all numbers > 0 that can be written as 3m + 5n, where m, n are natural numbers. Just as S4 is the analogue of S1, so the following set is the analogue of S2.
The set S5:
(1) 3, 5∈ S5.
(2) If x, y ∈ S5, then x+y ∈ S5.
As in the case of products, one can show that S4 = S5. It can also be shown that this is the same as the following S6.
The set S6:
(1) 3, 5, 6, 8∈ S6.
(2) If x∈ S6 and x ≥ 8 then x+1 ∈ S6.
S6 is simply the set consisting of 3, 5, 6, 8, and all numbers greater than 8.
[To see that S4⊆ S6 note that 3, 5 ∈ S6, that of all numbers ≤ 8 only 3, 5, 6, 8 are sums of 3’s and 5’s; consequently, S6 is closed under (2) and (3) in the definition of S4. To see that S6⊆ S4, note that every number among 3, 5, 6, 8 is a sum of 3’s and 5’s and each number from 9 on is obtainable by adding to some number from 3, 5, 6, 8 a sum of 3’s and 5’s.]
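As a sanity check on the bracketed argument, the three definitions can be compared up to a bound. The Python sketch below iterates each definition from the empty set and keeps only numbers up to an arbitrary cutoff; since every rule here only increases numbers, the truncated result agrees with the full set below the cutoff. The names LIMIT and bounded_closure are assumptions of this example, and the comparison is evidence only, not a proof.

```python
LIMIT = 200  # arbitrary cutoff chosen for this illustration

def bounded_closure(base, rule, limit):
    """Iterate base rule + recursive rule from the empty set, keeping numbers <= limit."""
    X = set()
    while True:
        nxt = {n for n in X | base | rule(X) if n <= limit}
        if nxt == X:
            return X
        X = nxt

# S4: closed under adding 3 and adding 5.
s4 = bounded_closure({3, 5}, lambda X: {x + 3 for x in X} | {x + 5 for x in X}, LIMIT)
# S5: closed under sums of two members.
s5 = bounded_closure({3, 5}, lambda X: {x + y for x in X for y in X}, LIMIT)
# S6: from 8 upwards, closed under adding 1.
s6 = bounded_closure({3, 5, 6, 8}, lambda X: {x + 1 for x in X if x >= 8}, LIMIT)

print(s4 == s5 == s6)
print(sorted(s4)[:8])
```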
In the preceding examples, the recursive rules add to the set numbers of growing size. Consequently, the set keeps growing and the fixed point is infinite. As the following example shows, this need not hold in general.
The set S7:
(1) 7∈ S7.
(2) If n∈ S7 and n is odd, then 2n ∈ S7.
(3) If n∈ S7 and n > 4, then n − 2 ∈ S7.
By iterating these rules we put into our set the following numbers: 7, 14, 5, 3, 12, 10, 8, 6, 4. Additional applications of the rules do not yield new numbers. Hence,
S7 ={7, 14, 5, 3, 12, 10, 8, 6, 4}
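Since S7 is finite, the iteration can be run to completion without any cutoff. A minimal Python sketch, with the function name step chosen for this example:

```python
def step(X):
    """Apply clauses (1), (2) and (3) of the definition of S7 once."""
    doubled = {2 * n for n in X if n % 2 == 1}   # (2): n odd gives 2n
    lowered = {n - 2 for n in X if n > 4}        # (3): n > 4 gives n - 2
    return X | {7} | doubled | lowered           # (1): 7 is always put in

X = set()
while step(X) != X:   # stop at the first fixed point
    X = step(X)

print(sorted(X))
```

The loop stops because doubling applies only to the odd members 3, 5 and 7, and subtraction stops at 4, so no number outside the listed nine is ever produced.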
Homework 5.9 Let k be a fixed natural number. Let Xk be the set defined, inductively, by the following clauses:
(1) k ∈ Xk.
(2) If x∈ Xk and x is even, then x/2∈ Xk.
(3) If x∈ Xk and x is odd, then (3x+1)/2∈ Xk.
Write down (in the curly brackets notation) the sets Xk for the cases:
k = 0, 1, 2, 3, 5, 6, 15, 17.
Does there exist a number k for which Xk is infinite? This is an open, and apparently very difficult, problem in number theory.
Many examples of inductive definitions that apply to objects that are not numbers are given in 5.2.3. We had already one example: the set of descendants. Here is one of the same kind.
The Set of Maternal Descendants: Let ‘maternal descendant’ mean a descendant via the mother-child relation. Note that the connecting chain must consist of females, except, possibly, the last descendant. Using ‘MDa’ for the set of maternal descendants of a, the clauses of the definition are:
(MD1) If a is female and x is a child of a, then x∈ MDa.
(MD2) If x∈ MDa and x is female and y is a child of x, then y ∈ MDa.
Note: If a is not female, MDa is empty. Formally, one shows that ∅ satisfies the two conditions for MDa: Since a is not female, the antecedent of the first condition is false and the condition holds vacuously. ∅ satisfies also the second condition, since no x is in ∅.
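The vacuous case can be seen in a small computation. In the Python sketch below the family data are largely invented for illustration: the names Anne, Hugh and Ellen are hypothetical, and so is the assignment of persons to the set FEMALE.

```python
# Parent -> children. "Anne", "Hugh" and "Ellen" are hypothetical names.
CHILDREN = {
    "John": {"Mary", "James", "Robert", "Lucy"},
    "Mary": {"William", "Anne"},
    "William": {"Hugh"},
    "Anne": {"Ellen"},
}
FEMALE = {"Mary", "Lucy", "Anne", "Ellen"}  # assumed for this example

def maternal_descendants(a):
    """Iterate (MD1) and (MD2) from the empty set up to a fixed point."""
    X = set()
    while True:
        nxt = set(X)
        if a in FEMALE:                        # (MD1)
            nxt |= CHILDREN.get(a, set())
        for x in X:
            if x in FEMALE:                    # (MD2)
                nxt |= CHILDREN.get(x, set())
        if nxt == X:
            return X
        X = nxt

print(sorted(maternal_descendants("John")))
print(sorted(maternal_descendants("Mary")))
```

For John neither clause ever adds anything, so the result is the empty set. For Mary the chain passes through Anne but not through William, so Ellen is included and Hugh is not.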
Inductive Definitions of Relations
The machinery of inductive definitions can be applied to define relations, where these, recall, are sets of pairs, or of n-tuples. The conditions determine rules for adding certain pairs, or n-tuples, to the set that is being constructed.
Here, for example, is the definition of the descendant relation, Des, which is the set of all pairs (x, z) in which x is a descendant of z. This definition is obtained from that of a’s descendants by replacing the fixed parameter ‘a’ by a variable, say ‘z’, and by suitable replacements of ‘x’ by ‘(x, z)’.
(1) If x is a child of z, then (x, z)∈ Des.
(2) If (x, z)∈ Des and y is a child of x then (y, z) ∈ Des.
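The same iteration works when the objects being added are pairs. The Python sketch below builds Des from a small child relation given as (child, parent) pairs. The data are reduced for brevity, the name Hugh is hypothetical, and CHILD_OF and step are names chosen for this example.

```python
# (child, parent) pairs; "Hugh" is a hypothetical name.
CHILD_OF = {
    ("Mary", "John"),
    ("James", "John"),
    ("William", "Mary"),
    ("Hugh", "William"),
}

def step(des):
    """Apply clauses (1) and (2) of the definition of Des once."""
    base = set(CHILD_OF)                                    # (1)
    induced = {(y, z)
               for (x, z) in des
               for (y, parent) in CHILD_OF if parent == x}  # (2)
    return des | base | induced

des = set()
while step(des) != des:
    des = step(des)

print(sorted(des))
```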
Notation: Let s be the successor function, defined for natural numbers: s(x) = x + 1.
Many relations over natural numbers can be defined inductively, in terms of the successor function. Here is one.
(1) (x, s(x))∈ R (i.e., this holds for all natural numbers x).
(2) If (x, y)∈ R, then (x, s(y)) ∈ R.
(1) puts in R all pairs of the form (x, s(x)). Then, an application of (2) adds all the pairs (x, s(s(x))), another application adds the pairs (x, s(s(s(x)))), and so on. It is not difficult to see that R consists exactly of all pairs (x, y) in which x < y. Hence, (1) and (2) define inductively the smaller-than relation, <, solely in terms of the successor function. If, instead of ‘(x, y)∈ R’ we write ‘x < y’, we get the usual form of this definition.
(1) x < s(x)
(2) If x < y, then x < s(y).
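On an initial segment of the natural numbers, the claim that R is the smaller-than relation can be verified directly. The Python sketch below restricts both clauses to the numbers 0, . . . , N, where the bound N is an assumption introduced only to keep the sets finite.

```python
N = 8  # work inside {0, ..., N}; the bound is only for illustration

def s(x):
    """Successor function."""
    return x + 1

def step(R):
    """Apply clauses (1) and (2) once, staying inside {0, ..., N}."""
    base = {(x, s(x)) for x in range(N)}                # (1): s(x) <= N
    induced = {(x, s(y)) for (x, y) in R if s(y) <= N}  # (2)
    return R | base | induced

R = set()
while step(R) != R:
    R = step(R)

less_than = {(x, y) for x in range(N + 1) for y in range(N + 1) if x < y}
print(R == less_than)
```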
Inductive techniques can be used to define various functions. (Recall that functions are construed in set theory as relations of a particular kind.) Take, for example, the addition function and let it be the relation Sum. Since addition is a binary function, Sum is a ternary relation:
Sum = {(x, y, z) : z = x + y}
It is not difficult to see that the following inductive definition defines it in terms of the successor function.
(1) (x, 0, x)∈ Sum
(2) If (x, y, z)∈ Sum, then (x, s(y), s(z)) ∈ Sum
Rewriting statements of the form ‘(x, y, z)∈ Sum’ in the form ‘x + y = z’, we get:
(1′) x + 0 = x
(2′) If x + y = z, then x + s(y) = s(z)
Rewriting (2′) in the equivalent form: x + s(y) = s(x + y), yields the following customary form of the definition:
(1′) x + 0 = x
(2″) x + s(y) = s(x + y)
This definition shows directly the iterated process. Given that s(0) = 1, s(1) = 2, s(2) = 3, etc., we can get the value of m + n for every particular m and n. For example:
5 + 0 = 5
5 + 1 = 5 + s(0) = s(5 + 0) = s(5) = 6
5 + 2 = 5 + s(1) = s(5 + 1) = s(6) = 7
5 + 3 = 5 + s(2) = s(5 + 2) = s(7) = 8
etc.
Multiplication is definable inductively in terms of the successor function and addition:
(1) x· 0 = 0
(2) x· s(y) = (x · y) + x
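The two definitions translate almost literally into recursive functions. In the Python sketch below, the clause x + 0 = x and the clause x + s(y) = s(x + y) become the two branches of add, and the two multiplication clauses become the two branches of mul. One liberty is taken, as an assumption of the sketch: a non-zero second argument y is treated as the successor of y − 1, so the code uses y - 1 to recover the predecessor.

```python
def s(x):
    """Successor function on the natural numbers."""
    return x + 1

def add(x, y):
    """x + 0 = x;  x + s(y) = s(x + y)."""
    if y == 0:
        return x
    return s(add(x, y - 1))        # y is the successor of y - 1

def mul(x, y):
    """x * 0 = 0;  x * s(y) = (x * y) + x."""
    if y == 0:
        return 0
    return add(mul(x, y - 1), x)   # y is the successor of y - 1

print(add(5, 3), mul(4, 6))
```

Evaluating add(5, 3) unfolds exactly as in the computation of 5 + 3 by repeated use of the successor function. The functions are meant for small arguments only, since each call recurses once per unit of its second argument.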
Homework 5.10 Give inductive definitions of the following relations over the natural numbers, solely in terms of the successor function.
1. The less-than-or-equal relation, R≤, consisting of all pairs (x, y) in which x≤ y.
2. The ternary relation S consisting of all triples of the form (x, x + n, x + 2n) where x and n are natural numbers.