• No results found

Evaluation revisited

This section allows us to round off earlier discussions, in sections 2.3, 2.4 and 2.8, about evaluation, the various kinds of (head,. . . ) normal form and computation and equivalence rules.

In a (typed) functional programming language like Miranda, we can ask for the system to evaluate expressions of any type. If the object is a function, the result of the evaluation is simply the message

<function>

This reflects the fact that we can print no representation of the function

quatransformation, but only in someintensional way by a normal form for its code.

In addition to this, the ultimate values computed by programs will be finite — as finite beings we cannot wait an infinite time for a complete print-out, however close we feel we come sometimes! We would therefore argue that the values which we seek as final results of programs are non- functional.

Definition 2.28 Theorderof a type is defined thus, writing∂(τ) for the order ofτ:

∂(τ) = 0 ifτ ∈B ∂(τ×σ) = max(∂(τ), ∂(σ))

∂(τ⇒σ) = max(∂(τ) + 1, ∂(σ))

Definition 2.29 The terms we evaluate are not only zeroth-order, i.e.of

groundtype, they also have the second property of being closed containing as they do no free variables. The results will thus be closed (β-)normal forms of zeroth-order type. It is these that we call theprintablevalues.

In our exampleλ-calculus, the closed normal forms inN are 0, succ0, succ(succ0), . . .

2.11. EVALUATION REVISITED 57 in other words are 0 andsucc nwherenitself is in normal form.

For a pair of printable types, (τ×σ), the closed normal forms will be (t, s)

wheret,sare closed normal forms of typeτ andσ.

How do we prove that these are the only printable values of these types? We consider the (notationally) simpler case omitting product types. An induction over the structure of numerical terms suffices:

• A variable is not closed.

• 0 is a normal form

• succ tWe argue by induction fort.

• (f e) where f is of functional type. Closed terms of this type have three forms

– λx . t or P rec t1 t2 If f has either of these forms, we have a contradiction, since (f e) will form a redex (by induction for e

in the case ofP rec).

– (g h) where g is an application. First we expandg fully as an application, writing the term as

g1g2 . . . gkh

Each of thegi is closed, and g1 must be of function type. Also

g1is not an application sog1g2must be a redex in contradiction to f, (that is g h), being in normal form

A similar argument establishes the result when product types are added. What is important to note here is that we have no redexes for the equiv- alence rules here: we have reached a normal form excluding such redexes without applying the reduction rules. Clearly this depends upon our twin assumptions of

Closure This excludes terms of the form (fst p, snd p)

where pis a variable of typeN×N, say.

Printable types This excludes the obviousη-redexes, such as

λx . λy .(xy)

We therefore feel justified in making the distinction between the two kinds of rule which we called computation and equivalence rules above. The computation rules suffice for the evaluation of particular terms, whilst the equivalence rules are used when reasoning about the general behaviour of functions (applied to terms which may contain variables).

Chapter 3

Constructive Mathematics

The aim of this brief chapter is to introduce the major issues underlying the conflict between ‘constructive’ and ‘classical’ mathematics, but it cannot hope to be anything other than an hors d’oeuvre to the substantial and lengthy dialogue between the two schools of thought which continues to this day.

Luckily, there are other sources. Bishop gives a rousing call to the constructive approach in the prologue to [BB85], which is followed by a closely argued ‘Constructivist Manifesto’ in the first chapter. Indeed the whole book is proof of the viability of the constructivism, developing as it does substantial portions of analysis from such a standpoint. It contains a bibliography of further work in the field.

An invaluable historical account of the basis of the conflict as well as subsequent activity in the field can be found in the historical appendix of [Bee85]. The first part of [Bee85], entitled ‘Practice and Philosophy of Constructive Mathematics’, also gives a most capable summary of both the scope and the foundations of constructive mathematics. [Dum77] is also a good introduction, looking in detail at the philosophy of intuitionism, and the recent survey [TvD88] also serves its purpose well.

Bishop identifies constructivism with realism, contrasting it with the idealism of classical mathematics. He also says that it gives mathematical statements ‘empirical content’, as opposed to the purely ‘pragmatic’ nature of parts of classical mathematics, and sums up the programme of [BB85] thus ‘to give numerical [i.e. computational] meaning to as much as possible of classical abstract analysis’.

A constructive treatment of mathematics has a number of interlinked aspects. We look at these in turn now.

Existence and Logic

What constitutes a proof ofexistenceof an object with certain properties? A mathematician will learn as a first-year undergraduate that to prove

∃x.P(x) it is sufficient to prove that∀x.¬P(x) is contradictory. The con- structivist would argue that all this proof establishes is the contradiction: the proof of existence must supply a t and show that P(t) is provable. Proofs of existential statements abound in mathematics: the fundamental theorem of algebra states that a polynomial of degree n has n complex roots; the intermediate value theorem asserts that continuous functions which change sign over a compact real interval have a zero in that interval, just to take two examples.

Sanction for proof by contradiction is given by the law of the excluded middle

A∨ ¬A

which states in particular that

∃x.P(x) ∨ ¬∃x.P(x)

If ∀x.¬P(x), which is equivalent to ¬∃x.P(x), is contradictory, then we must accept ∃x.P(x). Our view of existence thus leads us to reject one of the classical logical laws, which are themselves justified by an idealistic view of truth: every statement is seen as true or false, independently of any evidence either way. If we are to take the strong view of existence, we will have to modify in a radical way our view of logic and particularly our view of disjunction and existential quantification.

Bishop gives the interesting example that the classical theorem that ev- ery bounded non-empty set of reals has a least upper bound, upon which results like the intermediate value theorem depend, not only seems to de- pend for its proof upon non-constructive reasoning, itimpliescertain cases of the law of the excluded middle which are not constructively valid. Con- sider the set of reals{rn|n∈N}defined by

rn ≡df 1 if P(n)

≡df 0 if not

The least upper bound of this set is 1 if and only if∃x.P(x); we can certainly tell whether the upper bound is 1 or not, so

∃x.P(x) ∨ ¬∃x.P(x)

As a consequence of this, not only will a constructive mathematics depend upon a different logic, but also it will not consist of the same results. One

61 interesting effect of constructivising is that classically equivalent results often split into a number of constructivelyinequivalentresults, one or more of whichcan be shown to be valid by constructive means.

The constructive view of logic concentrates on what it means to prove or to demonstrate convincingly the validity of a statement, rather than con- centrating on the abstract truth conditions which constitute the semantic foundation of classical logic. We examine theseproof conditionsnow.

To prove a conjunction A∧B we should prove bothAandB; to prove

A∨B we should prove one of A,B, and know which of the two we have proved. Under this interpretation, the law of the excluded middle is not valid, as we have no means of going from an assertion to a proof of either it or its negation. On the other hand, with this strong interpretation, we can extract computationally meaningful information from a proof ofA∨B, as it allows us to decide which of A or B is proved, and to extract the information contained in the proof of whichever of the two statements we have.

How might an implication A⇒B be given a constructive proof? The proof should transform the information in a proof ofA into similar infor- mation for B, in other words we give a function taking proofs of A into proofs of B. A proof of a universally quantified formula∀x.P(x) is also a transformation, taking an arbitraryainto a proof of the formulaP(a). We shall have more to say about the exact nature of functions in the section to come.

Rather than thinking of negation as a primitive operation, we can define the negation of a formula¬Ato be an implication,

A⇒ ⊥

where⊥is the absurd proposition, which has no proof. A proof of a negated formula has no computational content, and the classical tautology¬¬A⇒

Awill not be valid — take the example whereAis an existential statement. Finally, to give a proof of an existential statement ∃x.P(x) we have to give two things. First we have to give awitness ,a say, for whichP(a) is provable. The second thing we have to supply is, of course, the proof of

P(a). Now, given this explanation, we can see that a constructive proof of

∃x.P(x) ∨ ¬∃x.P(x)

constitutes a demonstration of thelimited principle of omniscience. It gives a decision procedure for the predicatesP(x) over the natural numbers, say, and can be used to decide the truth or otherwise of the Goldbach Conjec- ture, as well as giving a solution to the halting problem. It therefore seems not to be valid constructively, or at least no constructive demonstration of its validity has so far been given!

Given the explanation of the logic above, we can see that we have to abandon many classical principles, like¬¬A⇒A,¬(¬A∧ ¬B)⇒(A∨B),

¬∀x.¬P(x)⇒ ∃x.P(x) and so on.

We shall see in the section to come that as well as modifying our logic we have to modify our view of mathematical objects.

Mathematical objects

The nature of objects in classical mathematics is simple: everything is a set. The pair of objectsaandb ‘is’ the set{{a},{a, b}}and the number 4 ‘is’ the set consisting of

∅, {∅}, { ∅,{∅} }, { ∅,{∅},{ ∅,{∅} } }

Functions are represented by sets of pairs constituting their graph, so that the successor function on the natural numbers is

{(0,1),(1,2),(2,3), . . .}

which is itself shorthand for

{ {{∅},{∅,{∅}}}, {{{∅}},{{∅},{ ∅,{∅} }}}. . .

Objects like this are infinite, and an arbitrary function graph will be in- finitary, that is it will have no finite description. Such objects fail to have computational content: given the finitary nature of computation, it is im- possible completely to specify such an object to an algorithm. This is an example of a fundamental tenet of constructive mathematics:

Every object in constructive mathematics is either finite, like natural or rational numbers, or has afinitary description, such as the rule

λx . x+ 1

which describes the successor function over the natural numbers The real numbers provide an interesting example: we can supply a descrip- tion of such a number by a sequence of approximations, (an)n say. This

sequence is finitary if we can write down an algorithm or rule which allows us to compute the transformation

n7→an

63 A second aspect of the set theoretic representation is the loss of distinc- tion between objects of different conceptual type. A pair is a set, a number is a set, a function is a set, and so on. We are quite justified in forming a set like

(3,4)∪0∪succ

although its significance is less than crystal clear! This does not reflect usual mathematical practice, in which the a priori distinctions between numbers, functions and so on are respected. In other words, the objects of mathematics are quite naturally thought of as havingtypes rather than all having the trivial type ‘set’. We summarise this by saying that

Constructive mathematics is naturally typed.

One obvious consequence of this is that quantification in a constructive setting will always be typed.

If we accept that a typed system is appropriate, what exactly do we mean by saying that ‘object a has type A’ ? To understand what this means, we must explain type-by-type what are the objects of that type, and what it means for two objects of the type to be equal. For instance, for the type of functions, A ⇒ B, we might say that objects of the type are (total) algorithms taking objects of A to objects of B, and that two algorithms are deemed equal if they give the same results on every input (theextensionalequality on the function space).

Each objectashould be given in such a way that we can decide whether it is a member of the type A or not. Consider an example. Suppose that we say that the real numbers consist of the class of sequences (an)n which

are convergent, that is which satisfy

∀n.∃m.∀i≥m.∀j≥m .|ai−aj|<1/n

When presenting such a sequence it is not sufficient to give the sequence, it must be presented with thewitness that it has the property required. In this case the witnessing information will be a modulus of convergence functionµtogether with aproof that

∀n.∀i≥µ(n).∀j≥µ(n).|ai−aj|<1/n

This is an example of the general principle:

Principle of Complete Presentation. Objects in construc- tive mathematics are completely presented, in the sense that if an object ais supposed to have type A thena should contain sufficient witnessing information so that the assertion can be verified.

This is a principle to which Bishop adheres in principle, but to smooth the presentation of the results, he adopts a policy of systematic suppression of the evidence, invoking it only when it is necessary. This schizophrenic attitude will also pervade the systems we shall introduce.

There is a relationship between this view and the classical. The con- structive model of objects like the reals can be seen as related to the classi- cal; witnessing evidence which can always be provided by a non-constructive existence proof in the classical setting is incorporated into the object itself by the constructivist.

A final area of note is equality over infinite objects like the reals. For the natural numbers, we can judge whether two objects are equal or not, simply by examining their form. For the reals, we are not interested so much in the syntactic form of two numbers as their respectivelimits. Two reals (an)n, (bn)n (we suppress the evidence!) have the same limit, and so

are deemed equal, if

∀n.∃m.∀i≥m.∀j≥m .|ai−bj|<1/n

We cannot expect that for an arbitrary pair of reals that we can decide whether a=b or a6=b,as one consequence of this is the limited principle of omniscience. A final precept which is useful here is

Negative assertions should be replaced by positive assertions whenever possible.

In this context we replace ‘6=’ by a notion of ‘apartness’. Two real numbers (an)n, (bn)n areseparated,a#b, if

∃n.∃m.∀i≥m.∀j≥m .|ai−bj|>1/n

This is a strong enough notion to replace the classically equivalent inequal- ity. We shall return to the topic of the real numbers in section 7.6.

Formalizing Constructive Mathematics

The reader should not conclude from the foregoing discussion that there is a ‘Golden Road’ to the formalization of constructive mathematics. The type theory which we explore below represents one important school of many, and we should say something about them now. [Bee85] provides a very useful survey addressing precisely this topic, and we refer the reader to this for more detailed primary references. The text covers theories like Intu- itionistic set theory (IZF), Feferman’s theories of operations and classes, [Fef79], as well as various formalized theories of rules, all of which have been proposed as foundations for a treatment of constructive mathematics.

65 One area which is overlooked in this study is the link betweencategory theory and logic, the topic of [LS86]. This link has a number of threads, including the relationship between theλ-calculus and cartesian closed cat- egories, and the category-theoretic models of intuitionistic type theory pro- vided by toposes. The interested reader will want to follow up the primary references in [LS86].

Conclusion

We have seen that constructive mathematics is based on principles quite different from classical mathematics, with the idealistic aspects of the latter replaced by a finitary system with computational content. Objects like functions are given by rules, and the validity of an assertion is guaranteed by a proof from which we can extract relevant computational information, rather than on idealist semantic principles. We lose some theorems, such as

Theorem 3.1 (Intermediate Value Theorem - Classical)

Suppose that f is continuous on [0,1] with f(0) <0 and f(1) >0, then there is anr∈[0,1] withf(r) = 0.

All is not lost, and we can prove the weaker

Theorem 3.2 (Intermediate Value Theorem - Constructive)

Suppose thatf is continuous on [0,1] with f(0)<0 and f(1)>0, then for allε >0 there is anr∈[0,1] with|f(r)|< ε.

The constructive version states that we can get arbitrarily close to the root, and of course, that is all we could expect to do, from a computational point of view. In this respect, we have in the latter theorem a truer picture of our ‘empirical’ capabilities.

For other examples, and more cogent pleading of the constructivist case, we would heartily recommend the opening passages of [BB85]. Indeed, the whole book will repay detailed study. We now pass on to looking at our formal system for type theory.

Chapter 4

Introduction to Type

Theory

This chapter forms the focus of the book, drawing together the three themes of logic, functional programming and constructive mathematics into a single system, which we investigate, develop and criticise in the chapters to come. The short discussion of constructive mathematics introduced the idea that proofs should have computational content; we saw that to achieve this goal, the underlying logic of the system needed to be changed to one in which