Computational Models - The Evolution of Programming Languages

There are various ways of describing a PL. We can write an informal description in natural language (e.g., the Pascal User Manual and Report (Jensen and Wirth 1976)), a formal description in natural language (e.g., the Algol report (Naur et al. 1960)), or a document in a formal notation (e.g., the Revised Report on Algol 68 (van Wijngaarden et al. 1975)). Alternatively, we can provide a formal semantics for the language, using the axiomatic, denotational, or operational approaches.

There is, however, an abstraction that is useful for describing a PL in general terms, without the details of every construct, and that is the computational model (also sometimes called the se- mantic model ). The computational model (CM) is an abstraction of the operational semantics of the PL and it describes the effects of various operations without describing the actual implementation.

Programmers who use a PL acquire intuitive knowledge of its CM. When a program behaves in an unexpected way, the implementation has failed to respect the CM.

Example 28: Passing parameters. Suppose that a PL specifies that arguments are passed by value. This means that the programs in Listing 46 and Listing 47 are equivalent. The statement T a = E means “the variable a has type T and initial value E” and the statement x ← a means “the value of a is assigned (by copying) to x”.

Listing 46: Passing a parameter void f (T x); { S; } T a = E; f (a);

Listing 47: Revealing the parameter passing mechanism T a = E; { T x; x ← a; S; }

A compiler could implement this program by copying the value of the argument, as if executing the assignment x ← a. Alternatively, it could pass the address of the argument a to the function but not allow the statement S to change the value of a. Both implementations respect the CM. 2 Example 29: Parameters in FORTRAN. If a FORTRAN subroutine changes the value of one of its formal parameters (called “dummy variables” in FORTRAN parlance) the value of the corresponding actual argument at the call site also changes. For example, if we define the subroutine

subroutine p (count) integer count . . . . count = count + 1 return end

then, after executing the sequence integer total

total = 0 call p(total)

the value of total is 1. The most common way of achieving this result is to pass the address of the argument to the subroutine. This enables the subroutine to both access the value of the argument and to alter it.

Unfortunately, most FORTRAN compilers use the same mechanism for passing constant arguments. The effect of executing

call p(0)

is to change all instances of 0 (the constant “zero”) in the program to 1 (the constant “one”)! The resulting bugs can be quite hard to detect for a programmer who is not familiar with FORTRAN’s idiosyncrasies.

The programmer has a mental CM for FORTRAN which predicts that, although variable arguments may be changed by a subroutine, constant arguments will not be. The problem arises because

typical FORTRAN compilers do not respect this CM. 2

A simple CM does not imply easy implementation. Often, the opposite is true. For example, C is easy to compile, but has a rather complex CM.

Example 30: Array Indexing. In the CM of C, a[i] ≡ ∗(a + i). To understand this, we must know that an array can be represented by a pointer to its first element and that we can access elements of the array by adding the index (multiplied by an appropriate but implicit factor) to this pointer.

In Pascal’s CM, an array is a mapping. The declaration var A : array [1..N ] of real;

introduces A as a mapping:

A : { 1, . . . , N } → R.

A compiler can implement arrays in any suitable way that respects this mapping. 2

Example 31: Primitives in OOPLs. There are two ways in which an OOPL can treat primitive entities such as booleans, characters, and integers.

. Primitive values can have a special status that distinguishes them from objects. A language that uses this policy is likely to be efficient, because the compiler can process primitives without the overhead of treating them as full-fledged objects. On the other hand, it may be confusing for the user to deal with two different kinds of variable — primitive values and objects. . Primitive values can be given the same status as objects. This simplifies the programmer’s

task, because all variables behave like objects, but a naive implementation will probably be inefficient.

A solution for this dilemma is to define a CM for the language in which all variables are objects. An implementor is then free to implement primitive values efficiently provided that the behaviour predicted by the CM is obtained at all times.

In Smalltalk, every variable is an object. Early implementations of Smalltalk were inefficient, because all variables were actually implemented as objects. Java provides primitives with efficient implementations but, to support various OO features, also has classes for the primitive types. 2 Example 32: λ-Calculus as a Computational Model. FPLs are based on the theory of partial recursive functions. They attempt to use this theory as a CM. The advantage of this point of view is that program text resembles mathematics and, to a certain extent, programs can be manipulated as mathematical objects.

The disadvantage is that mathematics and computation are not the same. Some operations that are intuitively simple, such as maintaining the balance of a bank account subject to deposits and withdrawals, are not easily described in classical mathematics. The underlying difficulty is that many computations depend on a concept of mutable state that does not exist in mathematics. FPLs have an important property called referential transparency . An expression is referentially transparent if its only attribute is its value. The practical consequence of referential transparency is that we can freely substitute equal expressions.

For example, if we are dealing with numbers, we can replace E + E by 2 × E; the replacement would save time if the evaluation of E requires a significant amount of work. However, if E was not

referentially transparent — perhaps because its evaluation requires reading from an input stream or the generation of random numbers — the replacement would change the meaning of the program. 2

Example 33: Literals in OOP. How should an OOPL treat literal constants, such as 7? The following discussion is based on (K¨olling 1999, pages 71–74).

There are two possible ways of interpreting the symbol “7”: we could interpret it as an object constructor , constructing the object 7; or as a constant reference to an object that already exists. Dee (Grogono 1991a) uses a variant of the first alternative: literals are constructors. This introduces several problems. For example, what does the comparison 7 = 7 mean? Are there two 7-objects or just one? The CM of Dee says that the first occurrence of the literal “7” constructs a new object and subsequent occurrences return a reference to this (unique) object.

Smalltalk uses the second approach. Literals are constant references to objects. These objects, however, “just magically exist in the Smalltalk universe”. Smalltalk does not explain where they come from, or when and how they are created.

Blue avoids these problems by introducing the new concept of a manifest class. Manifest classes are defined by enumeration. The enumeration may be finite, as for class Boolean with members true and false. Infinite enumerations, as for class Integer, are allowed, although we cannot see the complete declaration.

The advantage of the Blue approach, according to K¨olling (1999, page 72), is that:

. . . . the distinction is made at the logical level. The object model explicitly recognizes these two different types of classes, and differences between numbers and complex objects can be understood independently of implementation concerns. They cease to be anomalies or special cases at a technical level. This reduces the dangers of mis- understandings on the side of the programmer. [Emphasis added.]

Exercise 38. Do you think that K¨olling’s arguments for introducing a new concept (manifest classes) are justified? 2

Summary Early CMs:

. modelled programs as code acting on data;

. structured programs by recursive decomposition of code. Later CMs:

. modelled programs as packages of code and data;

. structured programs by recursive decomposition of packages of code and data. The CM should:

. help programmers to reason about programs; . help programmers to read and write programs; . constrain but not determine the implementation.

Names are a central feature of all programming languages. In the earliest programs, numbers were used for all purposes, including machine addresses. Replacing numbers by symbolic names was one of the first major improvements to program notation (Mutch and Gill 1954).

In document The Evolution of Programming Languages (Page 79-83)