Control Abstraction - Engineering A Compiler pdf

The procedure is, fundamentally, an abstraction that governs the transfer of control and the naming of data. This section explores the control aspects of a procedure’s behavior. The next section ties this behavior into the naming disciplines imposed in procedural languages.

In Algol-like languages, procedures have a simple and clear call/return discipline. On exit from a procedure, control returns to the point in the calling procedure that follows its invocation. If a procedure invokes other procedures, they return control in the same way. Figure 7.1 shows a Pascal program with several nested procedures. The call tree and execution history to its right summarize what happens when it executes. Fee is called twice: the first time from Foe and the second time from Fum. Each of these calls creates an instance, or an invocation, of Fee. By the time that Fum is called, the first instance of Fee is no longer active. It has returned control to Foe. Control cannot return to that instance of Fee; when Fum calls Fee, it creates a new instance of Fee.

The call tree makes these relationships explicit. It includes a distinct node for each invocation of a procedure. As the execution history shows, the only procedure invoked multiple times in the example is Fee. Accordingly, Fee has two distinct nodes in the call tree.

When the program executes the assignment x := 1; in the first invocation of Fee, the active procedures are Fee, Foe, Fie, and Main. These all lie on the path from the first instance of Fee to the program’s entry in Main. Similarly, when it executes the second invocation of Fee, the active procedures are Fee, Fum, Foe, Fie, and Main. Again, they all lie on the path from the current procedure to Main.
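The set of active procedures can be made concrete with a small C sketch. It assumes the call structure that the text describes for Figure 7.1 (Main calls Fie, Fie calls Foe, Foe calls Fee and then Fum, and Fum calls Fee). The names active, history, enter, leave, snapshot, and run_main are hypothetical bookkeeping introduced only for this illustration; they are not part of the Pascal program.

```c
#include <assert.h>
#include <string.h>

#define MAX_DEPTH 16

/* Names of the currently active procedures, innermost last (hypothetical). */
static const char *active[MAX_DEPTH];
static int depth = 0;

/* One entry per execution of "x := 1" in Fee, filled in by snapshot(). */
static char history[2][64];
static int snapshots = 0;

static void enter(const char *name) {
    assert(depth < MAX_DEPTH);
    active[depth++] = name;
}

static void leave(void) { depth--; }

/* Record the active procedures, from the current one back to Main. */
static void snapshot(void) {
    assert(snapshots < 2);
    char *out = history[snapshots++];
    out[0] = '\0';
    for (int i = depth - 1; i >= 0; i--) {
        strcat(out, active[i]);
        if (i > 0) strcat(out, ",");
    }
}

/* snapshot() stands in for the assignment x := 1 */
static void Fee(void) { enter("Fee"); snapshot(); leave(); }
static void Fum(void) { enter("Fum"); Fee(); leave(); }
static void Foe(void) { enter("Foe"); Fee(); Fum(); leave(); }
static void Fie(void) { enter("Fie"); Foe(); leave(); }

/* Plays the role of Main; returns the number of snapshots taken. */
int run_main(void) {
    depth = 0;
    snapshots = 0;
    enter("Main");
    Fie();
    leave();
    return snapshots;
}
```

After a call to run_main, history[0] holds the path for the first invocation of Fee and history[1] the path for the second. Each recorded path runs from Fee back to Main, and no procedure off that path appears in it.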

The call and return mechanism used in Pascal ensures that all the currently active procedures lie along a single path through the call graph. Any procedure not on that path is uninteresting, in the sense that control cannot return to it. When it implements the call and return mechanism, the compiler must arrange to preserve enough information to allow the calls and returns to operate correctly. Thus, when Foe calls Fum, the calling mechanism must preserve the information needed to allow the return of control to Foe. (Fum may diverge, or not return, due to a run-time error, an infinite loop, or a call to another procedure that does not return.)

This simple call and return behavior can be modelled with a stack. As α calls β, it pushes the address for a return onto the stack. When β wants to return, it pops the address off the stack and branches to that address. If all procedures have followed the discipline, popping a return address off the stack exposes the next appropriate return address.
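A minimal sketch of this stack model in C follows. It is an illustration under stated assumptions, not how any particular compiler or processor implements calls: plain integers stand in for code addresses, the values 100 and 200 are made up, and push_return, pop_return, and simulate_nested_calls are hypothetical names. A third procedure, gamma, is added so that the second pop has something to expose.

```c
#include <assert.h>

#define STACK_MAX 64

/* The control stack: each entry is a "return address".  Real hardware would
   store code addresses; integers stand in for them here (an assumption). */
static int control_stack[STACK_MAX];
static int top = 0;                  /* number of entries on the stack */

/* Caller side: save the address of the instruction that follows the call. */
void push_return(int return_address) {
    assert(top < STACK_MAX);
    control_stack[top++] = return_address;
}

/* Callee side: pop the saved address; control would branch to it. */
int pop_return(void) {
    assert(top > 0);
    return control_stack[--top];
}

/* Simulates alpha calling beta, and beta calling gamma.
   Returns 1 if each return lands where it should. */
int simulate_nested_calls(void) {
    push_return(100);                      /* alpha calls beta; 100 follows the call in alpha */
    push_return(200);                      /* beta calls gamma; 200 follows the call in beta */
    int gamma_returns_to = pop_return();   /* gamma returns */
    int beta_returns_to  = pop_return();   /* beta returns; address exposed by the first pop */
    return gamma_returns_to == 200 && beta_returns_to == 100 && top == 0;
}
```

The last-in, first-out order is what makes the discipline work: the most recent call is always the first to return, so its address is always on top.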

This mechanism is sufficient for our example, which lacks recursion. It works equally well for recursion. In a recursive program, the implementation must preserve a cyclic path through the call graph. The path must, however, have finite length; otherwise, the recursion never terminates. Stacking the return addresses has the effect of unrolling the path. A second call to procedure Fum pushes a second return address onto the stack, in effect creating a distinct space to represent the second invocation of Fum. The same constraint applies to recursive and non-recursive calls: the stack needs enough space to represent the execution path.

    #include <stdio.h>

    #define ERROR_VALUE (-1)  /* assumed value; the figure leaves ERROR_VALUE undefined */

    int fibonacci(int ord);
    int fib(int ord, int *f0, int *f1);

    int main(void)
    {
        printf("Fib(5) is %d.", fibonacci(5));
        return 0;
    }

    int fibonacci(int ord)
    {
        int one, two;

        if (ord < 1) {
            puts("Invalid input.");
            return ERROR_VALUE;
        }
        else if (ord == 1)
            return 0;
        else
            return fib(ord, &one, &two);
    }

    /* Requires ord >= 2; fibonacci() guarantees this. */
    int fib(int ord, int *f0, int *f1)
    {
        int result, a, b;

        if (ord == 2) {    /* base case */
            *f0 = 0;
            *f1 = 1;
            result = 1;
        }
        else {             /* recurse */
            (void) fib(ord - 1, &a, &b);
            result = a + b;
            *f0 = b;
            *f1 = result;
        }
        return result;
    }

Figure 7.2: Recursion Example

To see this more clearly, consider the C program shown in Figure 7.2. It computes the fifth Fibonacci number using the classic recursive algorithm. When it executes, the routine fibonacci invokes fib, and fib invokes itself, recursively. This creates a series of calls:

    Procedure    Calls
    main         fibonacci(5)
    fibonacci    fib(5,*,*)
    fib          fib(4,*,*)
    fib          fib(3,*,*)
    fib          fib(2,*,*)

Here, the asterisk (*) indicates an uninitialized return parameter.

This series of calls has pushed five entries onto the control stack. The top three entries contain the address immediately after the call in fib. The next entry contains the address immediately after the call in fibonacci. The final entry contains the address immediately after the call to fibonacci in main.

After the final recursive call, denoted fib(2,*,*) above, fib executes the base case and the recursion unwinds. This produces a series of return actions:

    Call            Returns to      The result(s)
    fib(2,*,*)      fib(3,*,*)      1  (*one = 0; *two = 1;)
    fib(3,*,*)      fib(4,*,*)      1  (*one = 1; *two = 1;)
    fib(4,*,*)      fib(5,*,*)      2  (*one = 1; *two = 2;)
    fib(5,*,*)      fibonacci(5)    3  (*one = 2; *two = 3;)
    fibonacci(5)    main            3
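The return sequence above can be checked with an instrumented variant of fib from Figure 7.2. The sketch keeps the figure's algorithm and adds bookkeeping that is not in the original: the names ret_record, trace, n_returns, depth, and max_depth are hypothetical. It counts only activations of fib, so the deepest nesting it reports is four; adding the call from main to fibonacci gives the five control-stack entries that the text describes.

```c
#include <assert.h>

#define MAX_CALLS 16

/* Hypothetical instrumentation: one record per return from fib. */
struct ret_record { int ord, result, f0, f1; };
static struct ret_record trace[MAX_CALLS];
static int n_returns = 0;
static int depth = 0;       /* activations of fib that have not yet returned */
static int max_depth = 0;   /* deepest nesting of fib seen so far */

/* Same algorithm as fib in Figure 7.2, with bookkeeping added.
   Requires ord >= 2, as in the figure. */
int fib(int ord, int *f0, int *f1) {
    int result, a, b;

    if (++depth > max_depth) max_depth = depth;

    if (ord == 2) {                  /* base case */
        *f0 = 0;
        *f1 = 1;
        result = 1;
    } else {                         /* recurse */
        (void) fib(ord - 1, &a, &b);
        result = a + b;
        *f0 = b;
        *f1 = result;
    }

    /* Log the return: which call is returning, and with what values. */
    assert(n_returns < MAX_CALLS);
    trace[n_returns++] = (struct ret_record){ ord, result, *f0, *f1 };
    depth--;
    return result;
}
```

Calling fib(5, &one, &two) fills trace with four records in the order the calls return, starting with fib(2,*,*) and ending with fib(5,*,*).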

The control stack correctly tracks these return addresses. This mechanism is sufficient for Pascal-style call and return. In fact, some computers have hardwired this stack discipline into their call and return instructions.

More complex control flow

Some programming languages allow a procedure to return a procedure and its run-time context. When the returned object is invoked, the procedure executes in the run-time context from which it was returned. A simple stack is inadequate to implement this control abstraction. Instead, the control information must be saved in some more general structure, such as a linked list, where traversing the structure does not imply deallocation. (See the discussion of heap allocation for activation records in the next section.)
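C has no built-in way to return a procedure together with its context, so the sketch below imitates one by pairing a function pointer with a heap-allocated record. It is meant only to show why the context cannot live in a stack frame that is popped on return. All of the names (closure, counter_context, make_counter, invoke, release) are hypothetical, and languages that support this feature directly may implement it quite differently.

```c
#include <assert.h>
#include <stdlib.h>

/* A returned procedure bundled with the run-time context it needs. */
struct closure {
    int (*code)(void *context);   /* the procedure */
    void *context;                /* its saved context, allocated on the heap */
};

struct counter_context { int count; int step; };

/* The procedure body: runs in the context captured when the closure was made. */
static int counter_code(void *context) {
    struct counter_context *ctx = context;
    ctx->count += ctx->step;
    return ctx->count;
}

/* Returns a procedure plus its context.  The context must outlive this call,
   so it cannot sit in make_counter's stack frame; it is heap allocated. */
struct closure make_counter(int step) {
    struct counter_context *ctx = malloc(sizeof *ctx);
    assert(ctx != NULL);
    ctx->count = 0;
    ctx->step = step;
    return (struct closure){ counter_code, ctx };
}

/* Invoke the returned procedure in its saved context. */
int invoke(struct closure c) { return c.code(c.context); }

/* Deallocation is a separate, explicit step; returning from make_counter
   does not free the context. */
void release(struct closure c) { free(c.context); }
```

Two closures made by separate calls to make_counter keep separate contexts, so invoking one does not disturb the other. Neither context is freed until release is called, which is the sense in which leaving the creating procedure does not imply deallocation.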

7.3 Name Spaces

Most procedural languages provide the programmer with control over which procedures can read and write individual variables. A program will contain multiple name spaces; the rules that determine which statements can legally access each name space are called scoping rules.

7.3.1 Scoping Rules

Specific programming languages differ in the set of name spaces that they allow the programmer to create. Figure 7.3 summarizes the name scoping rules of several languages. Fortran, the oldest of these languages, creates two name spaces: a global space that contains the names of procedures and common blocks, and a separate name space inside each procedure. Names declared inside a procedure’s local name space supersede global names for references within the procedure. Within a name space, different attributes can apply. For example, a local variable can be mentioned in a save statement. This has the effect of making the local variable a static variable; its value is preserved across calls to the procedure.
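The text's example is Fortran's save statement. As a rough analogue in C, the language of Figure 7.2, a local variable declared static also keeps its value across calls. The sketch below contrasts it with an ordinary local variable; the function names are hypothetical, and the comparison with Fortran is an approximation rather than a statement about Fortran's exact semantics.

```c
#include <assert.h>
#include <limits.h>

/* calls_made keeps its value between invocations, much as a Fortran local
   named in a save statement would. */
int count_calls(void) {
    static int calls_made = 0;      /* initialized once, not on every call */
    assert(calls_made < INT_MAX);   /* guard against signed overflow */
    calls_made++;
    return calls_made;
}

/* An ordinary (automatic) local is created afresh on every call. */
int count_calls_automatic(void) {
    int calls_made = 0;
    calls_made++;
    return calls_made;              /* always 1 */
}
```

Successive calls to count_calls return 1, 2, 3, and so on, while count_calls_automatic returns 1 every time. The name calls_made is still visible only inside its own procedure in both cases; static changes the variable's lifetime, not its scope.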

The programming language C has more complex scoping rules. It creates a global name space that holds all procedure names, as well as the names of global variables. It introduces a separate name space for all of the procedures in a single file (or compilation unit). Names in the file-level scope are declared with the attribute static; they are visible to any procedure in the file. The file-level scope holds both procedures and variables. Each procedure creates its own name space for variables and parameters. Inside a procedure, the programmer can create additional name spaces by opening a block (with { and }). A block can declare its own local names; it can also contain other blocks.
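These levels can be seen together in one short file. The sketch below is illustrative only: every name in it (shared, file_only, helper, read_global, scope_demo) is hypothetical, and the deliberate reuse of names is there to show which declaration each reference resolves to, not to recommend the style.

```c
#include <assert.h>

int shared = 1;               /* global name space: visible to every file in the program */
static int file_only = 2;     /* file-level name space: visible only within this file */

/* A file-level procedure: static hides procedure names from other files too. */
static int helper(void) { return file_only; }

/* No local declaration hides it here, so this refers to the global shared. */
int read_global(void) { return shared; }

int scope_demo(void) {
    int shared = 10;          /* procedure-level name; hides the global shared */
    int total = shared;       /* 10 */
    {
        int shared = 100;     /* block-level name; hides the procedure-level shared */
        total += shared;      /* 10 + 100 = 110 */
        {
            int file_only = 1000;   /* nested block; hides the file-level name */
            total += file_only;     /* 110 + 1000 = 1110 */
        }
    }
    total += shared;          /* procedure-level name again: 1110 + 10 = 1120 */
    total += helper();        /* helper sees the file-level file_only: 1120 + 2 = 1122 */
    assert(total == 1122);
    return total;
}
```

Each reference resolves to the declaration in the innermost enclosing name space, and a hidden name becomes visible again as soon as the block that hid it closes.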

