3.2 Representation of Language Elements
3.2.4 Control Structures
A node representing a control structure generally results in several disjoint code sequences rather than a single code sequence. The meanings of and relationships among the sequences depend primarily upon the source language, and hence general schemata can be used to specify them. Each of the disjoint sequences can then be thought of as an abstract machine operation with certain defined properties, and implemented individually.
The goto statement is implemented by an unconditional jump instruction. If the jump leaves a block or procedure then additional operations, discussed in Section 3.3, are needed to adjust the state. In expression-oriented languages, a jump out of an expression may require adjustment of a hardware stack used for temporary storage of intermediate values. This adjustment is not necessary when the stack is simply an area of memory that the compiler manages as a stack, computing the necessary offsets at compile time. (Unless use of a hardware stack permits cheaper access functions, it should be avoided for this reason.)

Schemata for common control structures are given in Figure 3.9. The operation `condition(expression, truelabel, falselabel)' embodies the jump cascade discussed in Section 3.2.3. The precise mechanism used to implement the analogous `select' operation depends upon the set $k_1, \ldots, k_m$. Let $k_{\min}$ be the smallest and $k_{\max}$ the largest values in this set. If `most' of the values in the range $[k_{\min}, k_{\max}]$ are members of the set then `select' is implemented as shown in Figure 3.10a. Each element of target that does not correspond to an element of $k_1, \ldots, k_m$ is set to `L0'. When the selector set is sparse and its span is large (for example, the set $\{0, 5000, 10000\}$), a decision tree or perfect hash function should be used instead of an array. The choice of representation is strictly a space/time tradeoff, and must be made by the code generator for each case clause. The source-to-target mapping must specify the parameters to be used in making this choice.
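The choice between the two representations of `select' can be sketched as follows. This is a minimal illustration in Python, not a method given in the text: the threshold `min_density`, the function names, and the use of strings as labels are all assumptions made for the example, and a sorted key list searched by bisection stands in for the decision tree. The case set is assumed to be non-empty.

```python
from bisect import bisect_left

def plan_select(cases, default="L0", min_density=0.5):
    """Pick a representation for select(e, k1, L1, ..., kn, Ln, L0).

    cases maps selector values to labels. min_density is a hypothetical
    tuning parameter: the fraction of [kmin, kmax] that must be occupied
    before a jump table is considered worthwhile.
    """
    keys = sorted(cases)
    kmin, kmax = keys[0], keys[-1]
    span = kmax - kmin + 1
    if len(keys) / span >= min_density:
        # Dense set: one table entry per value in [kmin, kmax];
        # entries with no matching case go to the default label.
        table = [cases.get(k, default) for k in range(kmin, kmax + 1)]
        return ("table", kmin, table)
    # Sparse set: keep only the keys and search them.
    return ("search", keys, [cases[k] for k in keys])

def dispatch(plan, k, default="L0"):
    """Return the label that select would jump to for selector value k."""
    if plan[0] == "table":
        _, kmin, table = plan
        if kmin <= k < kmin + len(table):
            return table[k - kmin]
        return default
    _, keys, labels = plan
    pos = bisect_left(keys, k)
    if pos < len(keys) and keys[pos] == k:
        return labels[pos]
    return default
```

With the cases 1, 2 and 4 the sketch builds a four-entry table whose entry for 3 is the default label; with the cases 0, 5000 and 10000 it stores three keys instead of 10001 table entries, at the price of a search on every dispatch.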
a) if e then clause;

        condition(e, L1, L2)
    L1: clause
    L2:

b) if e then clause1 else clause2;

        condition(e, L1, L2)
    L1: clause1
        GOTO L
    L2: clause2
    L:

c) case e of k1: clause1; ...; kn: clausen else clause0;

        select(e, k1, L1, ..., kn, Ln, L0)
    L1: clause1
        GOTO L
        ...
    Ln: clausen
        GOTO L
    L0: clause0
    L:

d) while e do clause;

        GOTO L
    L1: clause
    L:  condition(e, L1, L2)
    L2:

e) repeat clause until e

    L1: clause
        condition(e, L2, L1)
    L2:

f) for i := e1 by e2 to e3 do clause;

        forbegin(i, e1, e2, e3)
        clause
        forend(i, e2, e3)

Figure 3.9: Implementation Schemata for Common Control Structures
By moving the test to the end of the loop in Figure 3.9d, we reduce by one the number of jumps executed each time around the loop without changing the total number of instructions required. Further, if the target machine can execute independent instructions in parallel, this schema provides more opportunity for such parallelism than one in which the test is at the beginning.
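The saving can be checked by counting jump instructions. The sketch below is an illustration in Python, not code from the text; it assumes that `condition' compiles to a single conditional jump, and it counts every jump instruction executed, whether or not the jump is taken. The parameter n is the number of times the body runs.

```python
def jumps_test_at_top(n):
    """Count jumps for:  L: if not e goto L2; clause; GOTO L; L2:"""
    executed = 0
    remaining = n
    while True:
        executed += 1          # conditional jump at L (taken only on exit)
        if remaining == 0:
            break
        remaining -= 1         # clause
        executed += 1          # GOTO L closes the loop
    return executed

def jumps_test_at_bottom(n):
    """Count jumps for Figure 3.9d:  GOTO L; L1: clause; L: condition; L2:"""
    executed = 1               # initial GOTO L
    remaining = n
    while True:
        executed += 1          # conditional jump at L (falls through on exit)
        if remaining == 0:
            break
        remaining -= 1         # clause at L1
    return executed
```

Under these assumptions the first layout executes 2n + 1 jump instructions and the second n + 2, so each trip around the loop costs one jump fewer, while both layouts contain the same number of instructions.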
`Forbegin' and `forend' can be quite complex, depending upon what the compiler can deduce about the bounds and step, and how the language definition treats the controlled variable. As an example, suppose that the step and bounds are constants less than $2^{12}$, the step is positive, and the language definition states that the value of the controlled variable is undefined on exit from the loop. Figure 3.10b shows the best IBM 370 implementation for this case, which is probably one of the most common. (We assume that the body of the loop is too complex to permit retention of values in registers.) Note that the label LOOP is defined within the `forbegin' operation, unlike the labels used by the other iterations in Figure 3.9. If we permit the bounds to be general expressions, but specify the step to be 1, the general schema of Figure 3.10c holds. This schema works even if the value of the upper bound is the largest representable integer, since it does not attempt to increment the controlled variable after reaching the upper bound. More complex cases are certainly possible, but they occur only infrequently. It is probably best to implement the abstract operations by subroutine calls in those cases (Exercise 3.9).
a) General schema for `select' (Figure 3.9c)

    target : array [kmin .. kmax] of address;
    k : integer;

    k := e;
    if k >= kmin and k <= kmax then goto target[k] else goto L0;

b) IBM 370 code for special-case forbegin ... forend

          LA    1,e1          e1 = constant < 2^12
    LOOP  ST    1,i
          ...                 Body of the clause
          L     1,i
          LA    2,e2          e2 = constant < 2^12
          LA    3,e3          e3 = constant < 2^12
          BXLE  1,2,LOOP

c) Schema for forbegin ... forend when the step is 1

        i := e1; t := e3;
        if i > t then goto l3 else goto l2;
    l1: i := i + 1;
    l2: ...  (* Body of the clause *)
        if i < t then goto l1;
    l3:

Figure 3.10: Implementing Abstract Operations for Control Structures
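The overflow property of the schema in Figure 3.10c can be demonstrated with a small simulation. The sketch below is in Python and is an illustration only: `INT_MAX` assumes a 32-bit machine, and `checked_inc` is a hypothetical helper that fails where a real increment would overflow. The second loop, which increments before testing, is included purely for contrast.

```python
INT_MAX = 2**31 - 1   # assumed largest representable integer (32-bit)

def checked_inc(i):
    """Increment that fails the way an overflow-trapping machine would."""
    if i == INT_MAX:
        raise OverflowError("controlled variable overflowed")
    return i + 1

def for_step_one(e1, e3, body):
    """Figure 3.10c: the test precedes the increment, so i never passes t."""
    i, t = e1, e3
    if i > t:
        return              # goto l3: the body never runs
    while True:
        body(i)             # l2: body of the clause
        if not i < t:
            return          # fall through to l3
        i = checked_inc(i)  # l1: i := i + 1

def for_increment_first(e1, e3, body):
    """Contrast: increment, then test i <= t. Fails when t == INT_MAX."""
    i, t = e1, e3
    while i <= t:
        body(i)
        i = checked_inc(i)
```

Calling `for_step_one(INT_MAX - 2, INT_MAX, print)` visits the last three representable values and stops, whereas `for_increment_first` with the same bounds raises `OverflowError` after the final iteration.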
Procedure and function invocations are control structures that also manipulate the state. Development of the instruction sequences making up these invocations involves decisions about the form of parameter transmission, and the construction of the activation record, the area of memory containing the parameters and local variables.
A normal procedure invocation, in its most general form, involves three abstract operations:

Callbegin: Obtain access to the activation record of the procedure.
Transfer: Transfer control to the procedure.
Callend: Relinquish access to the activation record of the procedure.

Argument computation and transmission instructions are placed between `callbegin' and `transfer'; instructions that retrieve and store the values of result parameters lie between `transfer' and `callend'. The activation record of the procedure is accessible to the caller between `callbegin' and `callend'.
In simple cases, when the procedure calls no other procedures and does not require complex parameters, the activation record can be deleted entirely and the parameters treated as local variables of the environment statically surrounding the procedure declaration. The invocation then reduces to a sequence of assignments to these variables and a simple subroutine jump. If, as in the case of elementary functions, only one or two parameters are involved then they can be passed in registers. Note that such special treatment leads to difficulties if the functions are invoked as formal parameters. The identity of the procedure is not fixed under those circumstances, and hence special handling of the call or parameter transmission is impossible. Invocations of formal procedures also cause problems if, as in ALGOL 60, the number and types of the parameters are not statically specified and must be verified at execution time. These dynamic checks require additional instructions not only at the call site, but also at the procedure entry. The latter instructions must be avoided by a normal call, and therefore it is useful for the procedure to have two distinct entry points: one with and one without the tests.
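The idea of two entry points can be sketched in a high-level setting. The Python below is only an analogy, with hypothetical names throughout: `make_entries`, `normal_entry` and `checked_entry` do not come from the text, and Python type objects stand in for the parameter descriptions a compiler would pass. The checked entry verifies the number and types of the arguments and then continues at the normal entry, which a statically checked call reaches directly.

```python
def make_entries(body, param_types):
    """Return (normal_entry, checked_entry) for one procedure body.

    param_types is the tuple of types the procedure expects; it plays
    the role of the information checked at execution time.
    """
    def normal_entry(*args):
        # Entry used by ordinary calls: no dynamic tests.
        return body(*args)

    def checked_entry(*args):
        # Entry used when the procedure is invoked as a formal parameter.
        if len(args) != len(param_types):
            raise TypeError("wrong number of arguments")
        for arg, expected in zip(args, param_types):
            if not isinstance(arg, expected):
                raise TypeError("argument has the wrong type")
        return normal_entry(*args)

    return normal_entry, checked_entry

# Usage: an ordinary call pays nothing for the tests.
add, add_checked = make_entries(lambda a, b: a + b, (int, int))
result = add(1, 2)
```

Only calls through `add_checked` execute the verification; a call that the compiler has already checked goes through `add` and skips it.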
Declarations of local variables produce executable code only when some initialization is required. For dynamic arrays, initialization includes bounds computation, storage allocation, and construction of the array descriptor. Normally only the bounds computation would be realized as in-line code; a library subroutine would be invoked to perform the remaining tasks.

At least for test purposes, every variable that is not explicitly initialized should be implicitly assigned an initial value. The value should be chosen so that its use is likely to lead to an error report; values recognized as illegal by the target machine hardware are thus best. Under no circumstances should 0 be used for implicit initialization. If it is, the programmer will too easily overlook missing explicit initialization or assume that the implicit initialization is a defined property of the language and hence write incorrect programs.
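The argument against 0 can be illustrated with a small experiment. The Python sketch below is an analogy rather than a description of any real target machine: the class `Uninitialized` stands in for a bit pattern that the hardware would reject, and `new_frame` and `total_price` are hypothetical names invented for the example. The function deliberately forgets to assign one of its variables.

```python
class Uninitialized:
    """Stand-in for a value the hardware would recognize as illegal."""
    def __repr__(self):
        return "<uninitialized>"

UNINIT = Uninitialized()

def new_frame(names, fill):
    """Create the local variables of a procedure, all set to fill."""
    return {name: fill for name in names}

def total_price(frame):
    # Bug: 'tax' is never explicitly assigned before it is used.
    frame["price"] = 100
    return frame["price"] + frame["tax"]

# Filling with 0 hides the bug: the call returns a plausible number.
hidden = total_price(new_frame(["price", "tax"], 0))

# Filling with the illegal value turns the first use into an error.
try:
    total_price(new_frame(["price", "tax"], UNINIT))
    reported = False
except TypeError:
    reported = True
```

With 0 as the fill value the missing assignment goes unnoticed; with the illegal value the first use of `tax` is reported immediately.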
Procedure and type declarations do not usually lead to code that is executed at the site of the declaration. Type declarations only result in machine instructions if array descriptors or other variables must be initialized. As with procedures, these instructions constitute a subprogram that is not called at the point of declaration.
ALGOL 68 identity declarations of the form

    m id = expression

are consistently replaced by initialized variable declarations

    m id' := expression

Here id' is a new internal name, and every applied occurrence of id is consistently replaced by id'↑. The initialization remains the only assignment to id'. Simplification of this schema is possible when the expression can be evaluated at compile time and all occurrences of id replaced by this value.

The same schema describes argument transmission for the reference and strict value mechanisms, in particular in ALGOL 68. Transmission of a reference parameter is implemented by initialization of an internal reference variable:

    ref m parameter = argument

becomes

    ref m variable := argument.

We have already met the internal transformation used by the value and name mechanisms in Section 2.5.3. In the result and value/result mechanisms, the result is conveniently assigned to the argument after return. In this way, transmission of the argument address to the procedure is avoided. When implementing value/result transmission for FORTRAN, one should generate the result assignment only in the case that the argument was a variable. (Note that if the argument address is transmitted to the procedure then the caller must