Some Scientific Questions about Programming Languages

designs for machines and devices. For example, over the past decades, theorists have been thinking about new ways of doing exact calculations with real numbers, the integration of simulated and experimental data in virtual worlds, languages for specifying and programming distributed and mobile devices, ways of modelling and proving concurrent systems correct, asynchronous designs for chips, and quantum and biological computers. In time, some theories become the foundations of mature technologies, whilst others stubbornly resist practical or commercial exploitation. Theoretical Computer Science also has many problems that retain importance long after the technology is established. The theory of programming languages is laden with such legacy problems: the theories of types, concurrency, objects, agents are rich in difficulties and mysteries, both conceptual and mathematical.

Theories of programming languages and program construction are a fundamental area of Theoretical Computer Science. There are many programming constructs and program development techniques and tools, all of which are the fruits of, or require, theoretical investigation. In our time, it is widely believed that the development of theories is necessary for the practical craft of program construction to become a mathematically well-founded engineering science. Independently of such a goal, we believe that the development of theories is necessary to satisfy our curiosity and to understand what is being done, what could be done, and how to do better. The scientific approach of the theory of programming languages places three intellectual requirements on the reader:

Intellectual Aims

1. To ask simple questions.

2. To make and analyse simple mathematical models.

3. To think deeply and express ideas precisely.

Now we will formulate a number of questions and set the scene for some mathematical theories to answer them. Our theories will show how it is possible

• to model mathematically any kind of data;

• to model mathematically the syntax of a programming language;

• to model mathematically the semantics of a programming language.

1.2 Some Scientific Questions about Programming Languages

Let us begin by posing some simple questions about a programming language. The questions will make us reflect on our ideas about programming languages and programs. They require us

to think generally and abstractly. They also require us

This is part of the raw material from which theories are made. Most of the questions we pose are insufficiently precise to allow a definite answer. Thus, one of our most important tasks will be

to make questions precise, using mathematical models, and hence turn them into technical problems that can be solved definitively.

By no means all of the questions will be given an answer in this book: readers are invited to return to this section from time to time to see which questions they can formulate and answer precisely.

Let L and L′ be any programming languages, real or imaginary. Try to answer the following seemingly vague questions.

Data

What is data and where does it come from? How do we choose operations on data? How do we compare the choice of different sets of operations? How do we know we have enough operations on data? What exactly are data types? What is an interface to a data type? What is an implementation of a data type? How do we specify data types like integers, reals or strings, independently of programming languages? How do we model data types for users’ applications? To what extent is a data type independent of its implementation? How do we compare two implementations of a data type? Can any data type be implemented using an imperative language? Are the representations of the natural numbers based upon decimal, binary, octal and Roman notations equivalent? How accurate as approximations are implementations of the real numbers? What are the effects of approximating infinite sets of data by finite subsets? Are error messages necessary for data types? What are the base data types of L? What data types can be implemented in L? Can any data type be implemented in L?
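One of the questions above asks whether the decimal, binary, octal and Roman notations for the natural numbers are equivalent. The following is a minimal sketch of one way to start making that question precise, assuming Python as the illustration language. The helper names (`to_base`, `from_base`, `to_roman`, `from_roman`) are hypothetical and do not come from the text. The sketch reads "equivalent" as "there are translations between notations that invert each other", which is only one possible reading.

```python
# Hypothetical helpers for illustration only; none of these names come from the text.

DIGITS = "0123456789"

def to_base(n: int, base: int) -> str:
    """Write the natural number n in positional notation, 2 <= base <= 10."""
    if n < 0 or not 2 <= base <= 10:
        raise ValueError("expected n >= 0 and 2 <= base <= 10")
    if n == 0:
        return "0"
    out = []
    while n > 0:
        n, r = divmod(n, base)
        out.append(DIGITS[r])
    return "".join(reversed(out))

def from_base(s: str, base: int) -> int:
    """Read a numeral in positional notation back into a natural number."""
    n = 0
    for ch in s:
        d = DIGITS.index(ch)  # raises ValueError for a non-digit character
        if d >= base:
            raise ValueError(f"digit {ch!r} is not valid in base {base}")
        n = n * base + d
    return n

# Value/symbol pairs for standard Roman numerals, largest first.
ROMAN = [(1000, "M"), (900, "CM"), (500, "D"), (400, "CD"),
         (100, "C"), (90, "XC"), (50, "L"), (40, "XL"),
         (10, "X"), (9, "IX"), (5, "V"), (4, "IV"), (1, "I")]

def to_roman(n: int) -> str:
    """Write n as a Roman numeral. Assumption: only 1..3999 is covered."""
    if not 1 <= n <= 3999:
        raise ValueError("this sketch only covers 1..3999")
    out = []
    for value, symbol in ROMAN:
        count, n = divmod(n, value)
        out.append(symbol * count)
    return "".join(out)

def from_roman(s: str) -> int:
    """Read a Roman numeral. This reader is lenient: it also accepts
    some non-standard strings such as 'IIII'."""
    n, i = 0, 0
    for value, symbol in ROMAN:
        while s.startswith(symbol, i):
            n += value
            i += len(symbol)
    if i != len(s):
        raise ValueError(f"cannot read {s!r} as a Roman numeral")
    return n
```

Within the range 1 to 3999 every translation here can be undone, so the four notations carry the same information on that range. The comparison becomes subtler outside it: this Roman notation has no numeral for zero and no numerals beyond 3999, whereas the positional notations cover every natural number.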

Syntax

What is syntax and why is there so much of it? Is any notation a syntax? How do we choose and make a syntax using symbols and rules? How do we check for errors in syntax? How do we transform program syntaxes, as in substitution and expansion, compilation and interpretation? How do we specify and transform texts for pretty printing, internet publication and slide projection? What are the syntactic categories, such as identifiers, types, operations, declarations, commands, procedures, and classes? What are scope and binding rules? What are the benefits of prefix, infix, postfix, or mixfix notations? How do we define exactly the syntax of L? How do we specify the set of legal programs of L? Is there an algorithm that checks that the syntax of a program of L is correct?

Semantics

What is semantics? Why do we need to define behaviour formally? How do we choose one from the many possible semantics for a construct? Is there a “right” semantics for every construct? What is input-output behaviour? What are deterministic and nondeterministic behaviours? Can every partial function be extended to a total function by adding an undefined flag? How do we model the behaviour of any program containing while, goto, and recursion? Can one use tests that return a “don’t know” flag? What happens when a program raises an exception? How do procedures work? What is encapsulation and information hiding? What is parallel execution and concurrent communication? What are type systems, classes and objects, inheritance and polymorphism? What is a program library? How do we define exactly the semantics of L? How do we specify the meaning of the data types and commands of L? How do we specify the operation or dynamic behaviour of programs of L? How do we specify the input-output behaviour of programs required by users? What is a program trace? What is the relationship between the number of steps in a computation and its run time? What exactly does it mean for two programs of L to be equivalent?
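The question about extending a partial function to a total function with an undefined flag can be illustrated by a small sketch, assuming Python as the illustration language. The names `UNDEFINED`, `extend` and `div` are hypothetical. The sketch assumes that membership of the domain of definition can be tested directly, which is the easy case.

```python
class Undefined:
    """A single extra value that plays the role of the undefined flag."""
    def __repr__(self) -> str:
        return "UNDEFINED"

UNDEFINED = Undefined()

def extend(f, in_domain):
    """Turn a partial function f into a total one.

    Assumption: in_domain is a test that always returns True or False
    and says whether f has a value on the given arguments.
    """
    def total(*args):
        return f(*args) if in_domain(*args) else UNDEFINED
    return total

# Integer division has no value when the divisor is 0.
div = extend(lambda a, b: a // b, lambda a, b: b != 0)

print(div(7, 2))
print(div(7, 0))
```

The sketch does not settle the question. When a function is undefined because the program computing it fails to terminate, no such simple domain test is available in general, and that is where the question becomes interesting.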

Expressiveness and Power

How expressive or powerful is L? Can L implement any desired data type, function or specification? Which specifications cannot be implemented in L? Is L as expressive as another programming language L′? There are four possibilities:

• L and L′ are equivalent;

• L can accomplish all that L′ can and more;

• L′ can accomplish all that L can and more; or

• L can accomplish some tasks that L′ cannot, and L′ can accomplish some tasks that L cannot.
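The four possibilities above are exactly the four ways in which two sets can be related by inclusion. As a rough sketch, assuming each language is modelled simply by the set of tasks it can accomplish (a strong simplification, and not a definition taken from the text), the classification can be written in Python as follows. The function name `compare` and the sample task names are hypothetical.

```python
def compare(tasks_l: frozenset, tasks_l_prime: frozenset) -> str:
    """Classify two languages, each modelled by the set of tasks it can accomplish."""
    if tasks_l == tasks_l_prime:
        return "L and L' are equivalent"
    if tasks_l > tasks_l_prime:   # proper superset
        return "L can accomplish all that L' can and more"
    if tasks_l < tasks_l_prime:   # proper subset
        return "L' can accomplish all that L can and more"
    return "each can accomplish some task that the other cannot"

# Made-up task sets, for illustration only.
a = frozenset({"sort", "search"})
b = frozenset({"sort", "search", "parse"})
c = frozenset({"sort", "render"})

print(compare(a, a))
print(compare(b, a))
print(compare(a, b))
print(compare(a, c))
```

The hard part, which the sketch takes for granted, is saying precisely what a task is and what it means for a language to accomplish one.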

What exactly does it mean for two languages L and L′ to be equivalent in expressiveness or power? Which are the most expressive: imperative, object-oriented, functional or logic programming languages? Are parallel languages more expressive than sequential languages? Is L a universal language, i.e., can it implement any specification that can be implemented in some other programming language? Do universal languages exist? What is the smallest set of imperative constructs necessary to make a universal language?

Program Properties

What properties of the programs of L can be specified? Are there properties that cannot be specified? What exactly are static and dynamic properties? What properties of the programs of L can be checked by algorithms? What properties are decidable, or undecidable, when the program is being compiled? What properties are decidable, or undecidable, when the program is being executed? What is the relationship between the expressiveness or power of the language L and its decidable properties? What properties of programs in L can be proved? Given a program and an identifier, can we decide whether or not

(i) the identifier appears in the program?

(ii) given an input, the identifier changes its value in the resulting computation?

(iii) the identifier changes its value in some computation?

Given a program and an input, can we decide whether or not the program halts when executed on the input? Given two programs, can we decide whether or not they are equivalent?
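Question (i) above concerns only the text of the program, so it can be answered by inspecting the syntax. The following is a minimal sketch, assuming Python itself stands in for the language L and using the standard `ast` module. The function name `identifier_appears` is hypothetical, and the check only covers variable names, parameter names, and function and class names.

```python
import ast

def identifier_appears(source: str, name: str) -> bool:
    """Decide whether `name` occurs as an identifier in the program text.

    Assumption: `source` is a syntactically correct Python program.
    Attribute names, import aliases and keyword arguments are not covered.
    """
    tree = ast.parse(source)
    for node in ast.walk(tree):
        if isinstance(node, ast.Name) and node.id == name:
            return True   # a variable being read or assigned
        if isinstance(node, ast.arg) and node.arg == name:
            return True   # a function parameter
        if isinstance(node, (ast.FunctionDef, ast.ClassDef)) and node.name == name:
            return True   # a function or class being defined
    return False

program = """
x = 1  # y is mentioned only in this comment
while x < 10:
    x = x * 2
"""

print(identifier_appears(program, "x"))
print(identifier_appears(program, "y"))
```

The check works on the parsed program rather than on the raw characters, so the `y` in the comment does not count as an occurrence. Questions (ii) and (iii) are of a different kind, because they ask about what happens in computations and not about the program text.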

Correctness

How do we specify the purpose of a program in L? To what extent can we test whether or not a program of L meets its specification? How do we prove the correctness of a program with respect to its specification? Given a specification method, are there programs that are correct with respect to a specification but which cannot be proved to be correct? Is there an algorithm which, given a specification and a program, decides whether or not the program meets its specification?
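To make the question about testing against a specification concrete, here is a small sketch, assuming Python as the illustration language and assuming a specification is modelled as a predicate on input-output pairs. All names (`meets_sorting_spec`, `insertion_sort`, `failing_inputs`) are hypothetical, and sorting is chosen only as a familiar example.

```python
from collections import Counter

def meets_sorting_spec(inp: list, out: list) -> bool:
    """Specification of sorting: the output is in ascending order and
    contains exactly the same elements as the input."""
    ordered = all(out[i] <= out[i + 1] for i in range(len(out) - 1))
    return ordered and Counter(inp) == Counter(out)

def insertion_sort(xs: list) -> list:
    """The program under test."""
    result = []
    for x in xs:
        i = len(result)
        while i > 0 and result[i - 1] > x:
            i -= 1
        result.insert(i, x)
    return result

def failing_inputs(program, spec, inputs):
    """Return the test inputs on which the program violates the specification."""
    return [x for x in inputs if not spec(x, program(x))]

tests = [[], [1], [3, 1, 2], [2, 2, 1], [5, 4, 3, 2, 1]]
print(failing_inputs(insertion_sort, meets_sorting_spec, tests))

# A faulty program that silently drops duplicates is caught by one of the tests.
faulty = lambda xs: sorted(set(xs))
print(failing_inputs(faulty, meets_sorting_spec, tests))
```

An empty list of failures says only that the program meets the specification on the inputs that were tried. It is not a proof of correctness, which is why the questions about testing and about proof are posed separately.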

Compilation

What exactly is compilation? How do we structure a compiler from the syntax of L to the syntax of L′? What does it mean for a compiler from L to L′ to be correct? Is correctness defined using a class of specifications, or by comparing the semantics of L and L′? How do we prove that a compiler from L to L′ is correct?

Efficiency

Are the programs of L efficient for a class of specifications? If L and L′ are equally expressive, will the programs written in them execute equally efficiently? Can L be compiled efficiently on a von Neumann machine, and is the code generated efficient? Are imperative programming languages more efficient than logic and functional languages? Are programs involving only while and other so-called structured constructs less efficient than those allowing gotos? Are parallel languages more efficient than sequential languages?

Attempting to answer the questions should reveal what the reader knows about programming languages, before and after studying this book.

In document Data, Syntax and Semantics: An Introduction to Modelling Programming Languages (Page 33-36)