Despite the desirability of keeping all operands in registers, many procedures require an area in memory for several purposes, namely,
1. to provide homes for variables that either don’t fit into the register file or may not be kept in registers, because their addresses are taken (either explicitly or implicitly, as for call-by-reference parameters) or because they must be indexable;
2. to provide a standard place for values from registers to be stored when a procedure call is executed (or when a register window is flushed); and
FIG. 5.4 A stack frame with the current and old stack pointers. (Figure labels: previous stack frame, old sp, current stack frame, sp, offset of a from sp, decreasing memory addresses.)
Since many such quantities come into existence on entry to a procedure and are no longer accessible after returning from it, they are generally grouped together into an area called a frame, and the frames are organized into a stack. Most often the frames are called stack frames. A stack frame might contain values of parameters passed to the current routine that don’t fit into the registers allotted for receiving them, some or all of its local variables, a register save area, compiler-allocated temporaries, a display (see Section 5.4), etc.
To be able to access the contents of the current stack frame at run time, we assign them memory offsets one after the other, in some order (described below), and make the offsets relative to a pointer kept in a register. The pointer may be either the frame pointer fp, which points to the first location of the current frame, or the stack pointer sp, which points to the current top of stack, i.e., just beyond the last location in the current frame. Most compilers choose to arrange stack frames in memory so the beginning of the frame is at a higher address than the end of it. In this way, offsets from the stack pointer into the current frame are always non-negative, as shown in Figure 5.4.
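The offset assignment described above can be sketched in Python. This is only an illustration: the function `layout_frame`, the slot names, and the 4-byte word size are assumptions, not taken from the text, and the fp-relative offsets assume that fp equals sp plus the frame size.

```python
def layout_frame(slots, word=4):
    """Assign sp-relative offsets to frame slots, one after the other.

    slots is a list of (name, size_in_bytes) pairs (a hypothetical format).
    Returns (offsets, frame_size); every offset is measured upward from sp.
    """
    offsets = {}
    next_off = 0
    for name, size in slots:
        # Round each slot up to a whole number of words.
        size = (size + word - 1) // word * word
        offsets[name] = next_off
        next_off += size
    return offsets, next_off


def fp_relative(offsets, frame_size):
    """Convert sp-relative offsets to fp-relative ones, assuming fp == sp + frame_size."""
    return {name: off - frame_size for name, off in offsets.items()}


sp_offsets, frame_size = layout_frame(
    [("temp", 4), ("a", 4), ("buf", 10), ("saved_r1", 4)]
)
fp_offsets = fp_relative(sp_offsets, frame_size)
```

With the frame laid out toward lower addresses, every sp-relative offset is non-negative and every fp-relative offset is negative under the stated assumption.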
Some compilers use both a frame pointer and a stack pointer, with some variables addressed relative to each (Figure 5.5). Whether one should choose to use the stack pointer alone, the frame pointer alone, or both to access the contents of the current stack frame depends on characteristics of both the hardware and the languages being supported. The issues are (1) whether having a separate frame pointer wastes a register or is free; (2) whether the short offset range from a single register provided in load and store instructions is sufficient to cover the size of most frames; and (3) whether one must support memory allocation functions like the C library’s alloca( ), which dynamically allocates space in the current frame and returns a pointer to that space. Using the frame pointer alone is generally not a good idea, since we need to save the stack pointer or the size of the stack frame somewhere anyway, so as to be able to call further procedures from the current one. For most architectures, the offset field in load and store instructions is sufficient for most stack frames and there is a cost for using an extra register for a frame pointer, namely, saving it to memory and restoring it and not having it available to hold the value of a variable. Thus, using only the stack pointer is appropriate and desirable if it has sufficient range and we need not deal with functions like alloca( ).
FIG. 5.5 A stack frame with frame and stack pointers. (Figure labels: previous stack frame, fp (old sp), current stack frame, offset of a from fp, sp, decreasing memory addresses.)
The effect of alloca( ) is to extend the current stack frame, thus making the stack pointer point to a different location from where it previously pointed. This, of course, changes the offsets of locations accessed by means of the stack pointer, so they must be copied to the locations that now have the corresponding offsets. Since one may compute the address of a local variable in C and store it anywhere, this dictates that the quantities accessed relative to sp must not be addressable by the user and that, preferably, they must be things that are needed only while the procedure invocation owning the frame is suspended by a call to another procedure. Thus, sp-relative addressing can be used for such things as short-lived temporaries, arguments being passed to another procedure, registers saved across a call, and return values. So, if we must support alloca( ), we need both a frame pointer and a stack pointer. While this costs a register, it has relatively low instruction overhead, since on entry to a procedure, we (1) save the old frame pointer in the new frame, (2) set the frame pointer with the old stack pointer’s value, and (3) add the length of the current frame to the stack pointer, and, essentially, reverse this process on exit from the procedure. On an architecture with register windows, such as SPARC, this can be done even more simply. If we choose the stack pointer to be one of the out registers and the frame pointer to be the corresponding in register, as the SPARC UNIX System V ABI specifies, then the save and restore instructions can be used to perform the entry and exit operations, with saving registers to memory and restoring left to the register-window spill and fill trap handlers.
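The three entry steps and the effect of alloca( ) can be sketched with a toy stack model in Python. The class `Stack`, its word-addressed memory, the starting address 1000, and the offset `LOCAL_OFF` are all hypothetical; the sketch also assumes the stack grows toward lower addresses, so step (3) is written as a subtraction.

```python
class Stack:
    """Toy word-addressed stack that grows toward lower addresses (an assumption)."""

    def __init__(self, top=1000):
        self.mem = {}
        self.sp = top
        self.fp = top

    def enter(self, frame_len):
        old_sp = self.sp
        self.mem[old_sp - 1] = self.fp   # (1) save the old fp in the new frame
        self.fp = old_sp                 # (2) set fp to the old sp's value
        self.sp = old_sp - frame_len     # (3) extend the stack by the frame length

    def alloca(self, n):
        self.sp -= n                     # extend the current frame by n words
        return self.sp                   # address of the newly allocated space

    def leave(self):
        self.sp = self.fp                # undo (3), releasing any alloca space too
        self.fp = self.mem[self.sp - 1]  # undo (2) and (1)


s = Stack()
s.enter(frame_len=8)
LOCAL_OFF = -3                           # hypothetical fp-relative offset of a local
addr_before = s.fp + LOCAL_OFF
sp_off_before = addr_before - s.sp
block = s.alloca(16)
addr_after = s.fp + LOCAL_OFF
sp_off_after = addr_after - s.sp
s.leave()
```

The local keeps the same fp-relative address across the alloca( ) call, while its sp-relative offset grows by the size of the allocated block, which is why user-addressable variables are better addressed from fp in this setting.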
An alternative that increases the range addressable from sp is to make it point some fixed distance below the top of the stack (i.e., within the current stack frame), so that part of the negative offset range from it is usable, in addition to positive offsets. This increases the size of stack frames that can be accessed with single load or store instructions in return for a small amount of extra arithmetic to find the real top of the stack in the debugger and any other tools that may need it. Similar things can be done with fp to increase its usable range.
5.4 The Run-Time Stack
At run time we do not have all the symbol-table structure present, if any. Instead, we must assign addresses to variables during compilation that reflect their scopes and use the resulting addressing information in the compiled code. As discussed in Section 5.3, there are several kinds of information present in the stack; the kind of interest to us here is support for addressing visible nonlocal variables. As indicated above, we assume that visibility is controlled by static nesting. The structure of the stack includes a stack frame for each active procedure,⁴ where a procedure is defined to be active if an invocation of it has been entered but not yet exited. Thus, there may be several frames in the stack at once for a given procedure if it is recursive, and the nearest frame for the procedure statically containing the current one may be several levels back in the stack. Each stack frame contains a dynamic link to the base of the frame preceding it in the stack, i.e., the value of fp for that frame.⁵
In addition, if the source language supports statically nested scopes, the frame contains a static link to the nearest invocation of the statically containing procedure, which is the stack frame in which to look up the value of a variable declared in that procedure. That stack frame, in turn, contains a static link to the nearest frame for an invocation of its enclosing scope, and so on, until we come to the global scope. To set the static link in a stack frame, we need a mechanism for finding the nearest invocation of the procedure (in the stack) that the current procedure is statically nested in. Note that the invocation of a procedure not nested in the current procedure is itself a nonlocal reference, and the value needed for the new frame’s static link is the scope containing that nonlocal reference. Thus,
1. if the procedure being called is nested directly within its caller, its static link points to its caller’s frame;
2. if the procedure is at the same level of nesting as its caller, then its static link is a copy of its caller’s static link; and
3. if the procedure being called is n levels higher than the caller in the nesting structure, then its static link can be determined by following n static links back from the caller’s static link and copying the static link found there.
An example of this is shown in Figures 5.6 and 5.7. For the first call, from f( ) to g( ), the static link for g( ) is set to point to f( )’s frame. For the call from g( ) to h( ), the two routines are nested at the same level in the same routine, so h( )’s static link is a copy of g( )’s. Finally, for the call from l( ) to g( ), g( ) is nested three levels higher in f( ) than l( ) is, so we follow three static links back from l( )’s and copy the static link found there.

   procedure f( )
   begin
      procedure g( )
      begin
         call h( )
      end
      procedure h( )
      begin
         call i( )
      end
      procedure i( )
      begin
         procedure j( )
         begin
            procedure k( )
            begin
               procedure l( )
               begin
                  call g( )
               end
               call l( )
            end
            call k( )
         end
         call j( )
      end
      call g( )
   end

FIG. 5.6 An example of nested procedures for static link determination.

4. We assume this until Chapter 15, where we optimize away some stack frames.
5. If the stack model uses only a stack pointer to access the current stack frame and no frame pointer, then the dynamic link points to the end of the preceding frame, i.e., to the value of sp for that frame.
As discussed below in Section 5.6.4, a call to an imported routine or to one in a separate package must be provided a static link along with the address used to call it.
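The three rules for setting a static link can be sketched in Python and checked against the call sequence of Figure 5.6. The class `Frame`, the table `LEVEL`, and the function `call` are hypothetical names introduced for illustration; the sketch assumes that every call is legal under the scope rules, so a callee one level deeper than its caller is taken to be nested directly within it.

```python
class Frame:
    def __init__(self, proc, static_link, dynamic_link):
        self.proc = proc
        self.static_link = static_link    # nearest frame of the enclosing procedure
        self.dynamic_link = dynamic_link  # frame of the caller


# Static nesting depth of each procedure in Figure 5.6 (f is outermost).
LEVEL = {"f": 0, "g": 1, "h": 1, "i": 1, "j": 2, "k": 3, "l": 4}


def call(caller, callee):
    """Create the callee's frame, choosing its static link by the three rules."""
    if LEVEL[callee] == LEVEL[caller.proc] + 1:
        link = caller                     # rule 1: callee nested directly in caller
    else:
        n = LEVEL[caller.proc] - LEVEL[callee]
        link = caller.static_link         # rule 2 when n == 0
        for _ in range(n):                # rule 3: follow n further static links
            link = link.static_link
    return Frame(callee, link, caller)


fr_f = Frame("f", None, None)
fr_g = call(fr_f, "g")
fr_h = call(fr_g, "h")
fr_i = call(fr_h, "i")
fr_j = call(fr_i, "j")
fr_k = call(fr_j, "k")
fr_l = call(fr_k, "l")
fr_g2 = call(fr_l, "g")                   # the call from l( ) back to g( )
```

In this run the second frame for g( ) has l( )’s frame as its dynamic link but f( )’s frame as its static link, which is the situation Figure 5.7(b) is described as showing.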
Having set the static link for the current frame, we can now do up-level addressing of nonlocal variables by following static links to the appropriate frame. For now, we assume that the static link is stored at offset sl_off from the frame pointer fp (note that sl_off is the value stored in the variable StaticLinkOffset used in Section 3.6). Suppose we have procedure h( ) nested in procedure g( ), which in turn is nested in f( ). To load the value of f( )’s variable i at offset i_off in its frame while executing h( ), we would execute a sequence of instructions such as the following LIR:

   r1 <- [fp+sl_off]     || get frame pointer of g( )
   r2 <- [r1+sl_off]     || get frame pointer of f( )
   r3 <- [r2+i_off]      || load value of i
FIG. 5.7 (a) Static nesting structure of the seven procedures and calls among them in Figure 5.6, and (b) their static links during execution (after entering g( ) from l( )).
While this appears to be quite expensive, it isn’t necessarily. First, accessing nonlocal variables is generally infrequent. Second, if nonlocal accesses are common, a mechanism called a display can amortize the cost over multiple references. A display keeps all or part of the current sequence of static links in either an array of memory locations or a series of registers. If the display is kept in registers, nonlocal references are no more expensive than references to local variables, once the display has been set up. Of course, dedicating registers to holding the display may be disadvantageous, since it reduces the number of registers available for other purposes. If the display is kept in memory, each reference costs at most one extra load to get the frame pointer for the desired frame into a register. The choice regarding whether to keep the display in memory or in registers, or some in each, is best left to a global register allocator, as discussed in Chapter 16.
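A memory-resident display can be sketched in Python as an array indexed by static nesting depth. The text does not say how the display is kept up to date; the scheme below, which saves and restores a single entry on procedure entry and exit, is one common approach and is an assumption here, as are the names `display`, `enter`, `leave`, and `load_nonlocal`. Frames are modeled as plain dictionaries, and the depths follow Figure 5.6.

```python
MAX_DEPTH = 5
display = [None] * MAX_DEPTH   # display[d] is the visible frame at nesting depth d


def enter(depth, frame):
    """On procedure entry: install the new frame and return the entry it replaces."""
    saved = display[depth]
    display[depth] = frame
    return saved


def leave(depth, saved):
    """On procedure exit: put back the display entry saved at entry."""
    display[depth] = saved


def load_nonlocal(depth, name):
    """One indexed load finds the frame; no chain of static links is followed."""
    return display[depth][name]


# Enter f, g, h, i, j, k, l in the order of the calls in Figure 5.6.
chain = [("f", 0), ("g", 1), ("h", 1), ("i", 1), ("j", 2), ("k", 3), ("l", 4)]
for proc, depth in chain:
    enter(depth, {"proc": proc})

in_l = load_nonlocal(1, "proc")        # while in l( ), depth 1 is i( )
saved = enter(1, {"proc": "g"})        # l( ) calls g( )
in_g = load_nonlocal(1, "proc")        # depth 1 is now the new g( )
outer = load_nonlocal(0, "proc")       # f( ) is still visible at depth 0
leave(1, saved)                        # g( ) returns to l( )
back_in_l = load_nonlocal(1, "proc")   # i( ) is visible at depth 1 again
```

Each nonlocal reference costs one array access regardless of how many nesting levels separate the reference from the declaration, at the price of one save and one restore per call.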
5.5 Parameter-Passing Disciplines
There are several mechanisms for passing arguments to and returning results from procedures embodied in existing higher-level languages, including (1) call by value, (2) call by result, (3) call by value-result, (4) call by reference, and (5) call by name. In this section, we describe each of them and how to implement them, and mention some languages that use each. In addition, we discuss the handling of label parameters, which some languages allow to be passed to procedures. We use the term arguments or actual arguments to refer to the values or variables being passed to a routine and the term parameters or formal parameters to refer to the variables they are associated with in the called routine.
Conceptually, call by value passes an argument by simply making its value available to the procedure being called as the value of the corresponding formal parameter. While the called procedure is executing, there is no interaction with the caller’s variables, unless an argument is a pointer, in which case the callee can use it to change the value of whatever it points to. Call by value is usually implemented by copying each argument’s value to the corresponding parameter on entry to the called routine. This is simple and efficient for arguments that fit into registers, but it can be very expensive for large arrays, since it may require massive amounts of memory traffic. If we have the caller and callee both available to analyze when we compile either of them, we may be able to determine that the callee does not store into a call-by-value array parameter and either it does not pass it on to any routines it calls or the routines it calls also do not store into the parameter (see Section 19.2.1); in that case, we can implement call by value by passing the address of an array argument and allowing the callee to access the argument directly, rather than having to copy it.
Versions of call by value are found in C, C++, Algol 60, and Algol 68. In C and Algol 68, it is the only parameter-passing mechanism (C++ also provides reference parameters), but in all three, the parameter passed may be (and for some C and C++ types, always is) the address of an object, so it may have the effect of call by reference, as discussed below. Ada in parameters are a modified form of call by value: they are passed by value, but are read-only within the called procedure.
Call by result is similar to call by value, except that it returns values from the callee to the caller rather than passing them from the caller to the callee. On entry to the callee, it does nothing; when the callee returns, the value of a call-by-result parameter is made available to the caller, usually by copying it to the actual argument associated with it. Call by result has the same efficiency considerations as call by value. It is implemented in Ada as out parameters.
Call by value-result is precisely the union of call by value and call by result. On entry to the callee, the argument’s value is copied to the parameter and on return, the parameter’s value is copied back to the argument. It is implemented in Ada as in out parameters and is a valid parameter-passing mechanism for Fortran.
Call by reference establishes an association between an actual argument and the corresponding parameter on entry to a procedure. At that time, it determines the address of the argument and provides it to the callee as a means for accessing the argument. The callee then has full access to the argument for the duration of the call; it can change the actual argument arbitrarily often and can pass it on to other routines. Call by reference is usually implemented by passing the address of the actual argument to the callee, which then accesses the argument by means of the address. It is very efficient for array parameters, since it requires no copying, but it can be inefficient for small arguments, i.e., those that fit into registers, since it precludes their being passed in registers. This can be seen by considering a call-by-reference argument that is also accessible as a global variable. If the argument’s address is passed to a called routine, accesses to it as a parameter and as a global variable both use the same location; if its value is passed in a register, access to it as a global variable will generally use its memory location, rather than the register.
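The difference between sharing a location and copying a value shows up when the argument is also a global variable. The Python sketch below models memory locations with a hypothetical `Cell` class; the callee `body` and the two calling wrappers are likewise illustrative names, and the value-result wrapper follows the copy-in, copy-out description given earlier.

```python
class Cell:
    """A memory location holding one value (a stand-in for a variable)."""

    def __init__(self, value):
        self.value = value


g = Cell(1)                    # a global variable


def body(param, glob):
    param.value += 1           # store through the parameter
    return glob.value          # then read the global directly


def call_by_reference(arg):
    return body(arg, g)        # callee works on the argument's own location


def call_by_value_result(arg):
    local = Cell(arg.value)    # copy in on entry
    seen = body(local, g)
    arg.value = local.value    # copy out on return
    return seen


seen_by_ref = call_by_reference(g)
final_by_ref = g.value

g.value = 1                    # reset the global for the second experiment
seen_by_vr = call_by_value_result(g)
final_by_vr = g.value
```

Both mechanisms leave the global with the same final value, but under call by reference the callee sees its own store when it reads the global, while under call by value-result it still sees the old value until the copy back on return.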
A problem may arise when a constant is passed as a call-by-reference parameter. If the compiler implements a constant as a shared location containing its value that is accessed by all uses of the constant in a procedure, and if it is passed by reference to another routine, that routine may alter the value in the location and hence alter the value of the constant for the remainder of the caller’s execution. The usual remedy is to copy constant parameters to new anonymous locations and to pass the addresses of those locations.
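The constant problem and its usual remedy can be sketched as follows in Python. `Cell`, `FIVE`, `increment`, `pass_shared`, and `pass_copy` are hypothetical names; `FIVE` stands for the shared location a compiler might allocate for the constant 5.

```python
class Cell:
    """A memory location holding one value."""

    def __init__(self, value):
        self.value = value


FIVE = Cell(5)                 # one shared location for every use of the constant 5


def increment(param):
    param.value += 1           # callee stores into its by-reference parameter


def pass_shared(const_cell):
    increment(const_cell)      # passes the address of the shared location itself


def pass_copy(const_cell):
    # Remedy: copy the constant to a new anonymous location and pass that.
    increment(Cell(const_cell.value))


pass_copy(FIVE)
after_copy = FIVE.value        # the shared constant is untouched
pass_shared(FIVE)
after_shared = FIVE.value      # the "constant" has been altered
```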
Call by reference is a valid parameter-passing mechanism for Fortran. Since C, C++, and Algol 68 allow addresses of objects to be passed as value parameters, they, in effect, provide call by reference also.
The semantics of parameter passing in Fortran allow either call by value-result or call by reference to be used for each argument. Thus, call by value-result can be used for values that fit into registers and call by reference can be used for arrays, providing the efficiency of both mechanisms for the kinds of arguments for which they perform best.
Call by name is the most complex parameter-passing mechanism, both conceptually and in its implementation, and it is really only of historical significance since Algol 60 is the only well-known language that provides it. It is similar to call by reference in that it allows the callee access to the caller’s argument, but differs in that the address of the argument is (conceptually) computed at each access to the argument, rather than once on entry to the callee. Thus, for example, if the argument is the expression a[i] and the value of i changes between two uses of the argument, then the two uses access different elements of the array. This is illustrated in Figure 5.8, where i and a[i] are passed by the main program to procedure f( ). The first use of the parameter x fetches the value of a[1], while the second use sets a[2]. The call to outinteger( ) prints 5 5 2. If call by reference were being used, both uses would access a[1] and the program would print 5 5 8. Implementing call by name requires a mechanism for computing the address of the argument at each access; this is generally done by providing a parameterless procedure called a thunk. Each call to an argument’s thunk returns its current address. This, of course, can be a very expensive mechanism. However, many simple cases can be recognized by a compiler as identical to call by reference. For example, passing a simple variable, a whole array, or a fixed element of an array always results in the same address, so a thunk is not needed for such cases.
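Thunks can be sketched in Python as parameterless functions that return an argument’s current location, modeled here as a (container, key) pair. Figure 5.8 is not reproduced in this section, so the body of `f` below is an assumed reconstruction chosen to match the behavior described above (fetch through x, increment j, then store through x); `run`, `load`, and `store` are hypothetical helpers.

```python
def load(thunk):
    box, key = thunk()         # ask the thunk for the current location
    return box[key]


def store(thunk, value):
    box, key = thunk()
    box[key] = value


def f(x, j):
    """x and j are thunks; each call yields the argument's current location."""
    k = load(x)                # first use of x
    store(j, load(j) + 1)      # j := j + 1
    store(x, load(j))          # second use of x
    return k


def run(by_name):
    a = {1: 5, 2: 8}           # integer array a[1:2]
    env = {"i": 1}
    if by_name:
        x = lambda: (a, env["i"])   # location recomputed at every access
    else:
        addr = (a, env["i"])        # location computed once, on entry
        x = lambda: addr
    j = lambda: (env, "i")
    first = a[1]               # arguments of the output call, left to right
    middle = f(x, j)
    last = a[2]
    return (first, middle, last)


by_name_result = run(by_name=True)
by_reference_result = run(by_name=False)
```

Under these assumptions the two runs give (5, 5, 2) and (5, 5, 8), matching the outputs stated in the text. The by-reference case also shows why a compiler can drop the thunk when the location cannot change: a thunk that always returns the same location is equivalent to passing that location directly.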
Labels may be passed as arguments to procedures in some languages, such as Algol 60 and Fortran, and they may be used as the targets of gotos in the procedures they are passed to. Implementing this functionality requires that we pass both the code address of the point marked by the label and the dynamic link of the corresponding frame. A goto whose target is a label parameter executes a series of