Many languages which permit the kind of expression whose translation is discussed in this chapter also use a variety of compile-time types – real, integer, character, and so on. This at once puts a burden of error handling on the translator and means that it must become more voluminous than ever (though not more complicated).
The ‘type’ of a node depends on the ‘types’ of its subnodes and the operation which must be carried out. Some prohibited combinations of types and operators will be classified as compile-time errors[4] – e.g. in FORTRAN, REAL*LOGICAL, COMPLEX**COMPLEX, etc. – whilst those type combinations which are permitted will require individual treatment. The type is easily found recursively, either by a preliminary tree-walk or at the same time as the instructions which compute the value are being generated. Using a preliminary tree-walk can help if, for example, the choice of a code fragment depends on the order of evaluation of the subnodes.
[4] These errors are sometimes wrongly called ‘semantic’ errors. They’re really long-range syntactic errors. True semantic errors arise when a program asks for an impossible sequence of operations at run-time.
The TranBinOp procedure must look at the types of the left and right subnodes of an expression to select a code fragment on the basis of the type combination revealed. In figure 5.17 the source language allows only real, integer and Boolean variables, and on the object machine all arithmetic instructions have a floating point variant. In this strange language integer+real is integer and real+integer is real. I assume that each node in the expression tree contains an ‘atype’ field which records its arithmetic type and that constant nodes and symbol table descriptors have an appropriate value stored in their ‘atype’ field.

    let TranBinOp(op, nodep, regno) be
    { let first, second = nodep.left, nodep.right

      /* node reversal, etc. omitted to clarify algorithm */
      TranArithExpr(first, regno)
      TranArithExpr(second, regno+1)

      op := op++‘r’
      { let Ltype = first.atype
        let Rtype = second.atype

        /* Boolean operands are tested first: two Boolean
           operands would otherwise pass the Ltype=Rtype test */
        if Ltype=BOOL | Rtype=BOOL then
        { Error("Boolean expression in arithmetic context")
          nodep.atype := INT /* avoid secondary error */
        }
        elsf Ltype=Rtype then
        { Gen((Ltype=INT -> op, ‘f’++op), regno, regno+1)
          nodep.atype := Ltype
        }
        elsf Ltype=INT & Rtype=REAL then
        { /* convert second operand to INT type */
          Gen(FIXr, regno+1, 0); Gen(op, regno, regno+1)
          nodep.atype := INT
        }
        elsf Ltype=REAL & Rtype=INT then
        { /* convert second operand to REAL type */
          Gen(FLOATr, regno+1, 0); Gen(‘f’++op, regno, regno+1)
          nodep.atype := REAL
        }
        else
          CompilerFail("garbaged types in TranBinOp", nodep)
      }
    }

    Figure 5.17

The Boolean test and the final CompilerFail call in figure 5.17 illustrate how the translator can detect type-combination errors and how easy it is to write a ‘self-checking’ translator. The fact that translators can easily carry out type processing, and can as a side effect check type validity in hierarchies of nodes, allows the syntax analyser to avoid any type checking even when, as with Boolean/arithmetic types in ALGOL 60, the syntax specifies permissible combinations of operations. Thus the translator can easily detect that ‘(a=b)*(c=d)’ is an invalid expression, if the language definition makes it so. The translator’s diagnostic, which could describe the error as an invalid combination of types, is close to my intuition about the kind of misunderstanding on the part of the user which might lead to the inclusion of such a construct in a program. The message that would be produced if a syntax analyser detected the error – ‘invalid sequence of operators’ or ‘logical operation expected’ (see chapter 3 and section IV) – isn’t nearly so helpful.
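The type checking of a hierarchy of nodes described above can be sketched as a recursive walk over the tree. The sketch is in Python rather than the notation of the figures, and everything named in it (the Node class, find_type, leaf, TypeCombinationError, and the assumption that relational operators accept only arithmetic operands) is introduced here for illustration. The only rule taken from the text is that of figure 5.17, in which the result of an arithmetic operation takes the type of its left operand.

```python
from dataclasses import dataclass
from typing import Optional

INT, REAL, BOOL = "integer", "real", "Boolean"
ARITH_OPS = {"+", "-", "*", "/"}
REL_OPS = {"=", "<"}              # assumption: arithmetic operands only


class TypeCombinationError(Exception):
    """Raised for a prohibited combination of operand types."""


@dataclass
class Node:
    op: str                           # operator, or "leaf" for a name or constant
    left: Optional["Node"] = None
    right: Optional["Node"] = None
    atype: Optional[str] = None       # leaves arrive with atype already filled in


def leaf(atype: str) -> Node:
    """Hypothetical helper: a variable or constant node of a known type."""
    return Node("leaf", atype=atype)


def find_type(node: Node) -> str:
    """Fill in node.atype throughout the tree, subnodes first, and return it."""
    if node.op == "leaf":
        return node.atype
    ltype = find_type(node.left)
    rtype = find_type(node.right)
    if ltype == BOOL or rtype == BOOL:
        raise TypeCombinationError(
            f"invalid combination of types: {ltype} {node.op} {rtype}")
    if node.op in ARITH_OPS:
        node.atype = ltype            # figure 5.17: result has the left operand's type
    elif node.op in REL_OPS:
        node.atype = BOOL
    else:
        raise ValueError(f"unknown operator {node.op!r}")
    return node.atype
```

Given a tree for ‘(a=b)*(c=d)’ with integer leaves, find_type types each ‘=’ node as Boolean and then rejects the ‘*’ node, so the diagnostic names the offending type combination rather than a sequence of operators.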
Allowing a complicated type structure in expressions, then, merely means that the tree-walking translator must check for and individually process each valid type combination. The translator becomes more voluminous, though not more complicated. If the type structure is really intricate, as it is in COBOL or PL/1, it may be more economical to use a lookup matrix to detail the actions required for each operation rather than to code them explicitly as in figure 5.17.
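A lookup matrix for the four valid combinations of figure 5.17 might look like the following sketch. It is written in Python, and the ACTIONS table, the select_code function and the use of a plain list to stand in for the translator's output are all assumptions made for illustration. Each entry records the conversion to apply to the second operand (if any), whether the floating point variant of the instruction is wanted, and the type of the result.

```python
INT, REAL, BOOL = "integer", "real", "Boolean"

# (left type, right type) -> (conversion of second operand, floating variant?, result type)
ACTIONS = {
    (INT, INT): (None, False, INT),
    (REAL, REAL): (None, True, REAL),
    (INT, REAL): ("FIXr", False, INT),
    (REAL, INT): ("FLOATr", True, REAL),
}


def select_code(op, ltype, rtype, regno, code):
    """Append the instructions for one binary operation to `code`.

    The operands are assumed to be in registers regno and regno+1 already.
    Returns the type of the result.
    """
    if BOOL in (ltype, rtype):
        raise ValueError("Boolean expression in arithmetic context")
    if (ltype, rtype) not in ACTIONS:
        raise RuntimeError("garbaged types in select_code")
    convert, floating, result = ACTIONS[(ltype, rtype)]
    if convert is not None:
        code.append((convert, regno + 1, 0))
    opcode = ("f" if floating else "") + op + "r"
    code.append((opcode, regno, regno + 1))
    return result


code = []
result = select_code("ADD", REAL, INT, 1, code)
# code is now [("FLOATr", 2, 0), ("fADDr", 1, 2)] and result is REAL
```

Adding a type or an operator to such a translator means adding rows to the table rather than branches to the procedure, which is where the economy would come from when the number of combinations is large.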
Summary
This chapter demonstrates the power of tree-walking techniques to generate remarkably good code from a tree which represents an arithmetic expression. It shows the strength of the design technique set out in chapter 3 – first produce a general mechanism which can translate the most general form of a particular kind of node, refine it to work better if the node has a particular structure, refine it still further if there are any special cases worthy of consideration.
The power of the tree-walking technique comes not from the quality of the final code – code like that illustrated in this chapter can be produced in a number of different ways – but from the technique of structured refinement. If the translator never progresses beyond the stage in which it uses the most general mechanism it is still usable as a translator. It cannot be too strongly emphasised that a correct but sub-optimal translator is infinitely preferable to one which produces excellent code in almost every case yet breaks down whenever presented with a novel situation. I don’t claim that tree walking eliminates such bugs but I do claim that it minimises them. Once the general mechanism is working you can move on to refine any crucial code fragments which are seriously sub-optimal, and using tree walking means that this refinement doesn’t disturb any other sections of your translator, which go on operating just as they did before.
Later chapters show how the tree-walking mechanism can handle other kinds of node, but it is perhaps in the area of expression evaluation, as demonstrated in this chapter and the next, that it is at its most impressive.
Chapter 6
Translating Boolean Expressions
It’s possible to treat Boolean expressions in exactly the same way as arithmetic expressions – generate instructions to evaluate the subnodes, generate an instruction (or a sequence of instructions) to combine the Boolean values. Thus Boolean ‘and’ and Boolean ‘or’ operations can be translated using the TranBinOp procedure of chapter 5 with the machine instructions AND and OR. However Boolean expressions more often than not appear as the conditional test in an ‘if’ or a ‘while’ statement and, as a result, are used more as sections of program which select between alternative paths of computation than as algebraic expressions which compute a truth value. Most good compilers therefore try to generate ‘jumping’ code for Boolean expressions in these contexts. First of all, however, it is necessary to demonstrate the code fragments which are required when Boolean expressions are regarded as value-calculating mechanisms.
6.1 Evaluating a Relational Expression
Primary constituents of Boolean expressions can be constants, variables, elements of data structures or the results of function calls, just as in the case of an arithmetic expression. Relational expressions, however, are a little different and require special treatment since the object code fragment must usually execute a ‘test’ instruction to discover the result of a comparison. In this chapter I use the SKIP family of instructions to test the truth of a relation such as ‘<’ or ‘=’. Consider the expression ‘a<b’, in the context ‘x := a<b’. Here ‘x’ is a Boolean variable, which should contain true or false after the statement has been executed. The expression ‘a<b’ is then a means of calculating the value true or the value false. Figure 6.1 shows a translation procedure for relational expression nodes which can generate code to do this, together with the code generated as a result of giving it the operation ‘LT’, a pointer to the tree for ‘a<b’ and the
    let TranRelation(op, nodep, regno) be
    { Gen(LOADn, regno, TRUE)
      TranBinOp(‘SKIP’++op, nodep, regno+1)
      Gen(LOADn, regno, FALSE)
    }

    Tree:    <
            / \
           a   b

    Code:  LOADn   1, TRUE
           LOAD    2, a
           SKIPLT  2, b
           LOADn   1, FALSE

Figure 6.1: Computing the value of a relational expression
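The behaviour of the code in figure 6.1 can be checked with a small simulation. The sketch below is in Python and rests on several assumptions that are not taken from the text: that TRUE and FALSE are encoded as 1 and 0, that a SKIP instruction skips the following instruction when its relation holds, and that both operands are simple variables (so the call of TranBinOp is replaced by a LOAD and a SKIP written out directly). The names tran_relation, run and TESTS are hypothetical.

```python
TRUE, FALSE = 1, 0                    # assumed encodings of the truth values

# Relations understood by the toy machine; others would be added here.
TESTS = {"LT": lambda x, y: x < y}


def tran_relation(op, left, right, regno):
    """Return instructions that leave TRUE or FALSE in register `regno`."""
    return [
        ("LOADn", regno, TRUE),            # assume the relation holds
        ("LOAD", regno + 1, left),         # first operand into the next register
        ("SKIP" + op, regno + 1, right),   # skip the next instruction if it holds
        ("LOADn", regno, FALSE),           # executed only when it does not hold
    ]


def run(code, memory):
    """Toy interpreter for the three kinds of instruction used above."""
    regs = {}
    pc = 0
    while pc < len(code):
        opcode, reg, operand = code[pc]
        pc += 1
        if opcode == "LOADn":
            regs[reg] = operand
        elif opcode == "LOAD":
            regs[reg] = memory[operand]
        elif opcode.startswith("SKIP"):
            if TESTS[opcode[4:]](regs[reg], memory[operand]):
                pc += 1
        else:
            raise ValueError(f"unknown instruction {opcode}")
    return regs


code = tran_relation("LT", "a", "b", 1)
```

Running the generated code with a less than b leaves TRUE in register 1 because the second LOADn is skipped; with a not less than b the second LOADn executes and overwrites the register with FALSE.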
    Statement:    if a<b then <statement>

    Simple code:  LOADn      1, TRUE
                  LOAD       2, a
                  SKIPLT     2, b
                  LOADn      1, FALSE
                  JUMPFALSE  1, labela
                  {<statement>}
          labela:

    ‘Best’ code:  LOAD       1, a
                  SKIPLT     1, b
                  JUMP       , labela
                  {<statement>}
          labela: