
Our Intermediate Languages: MIR, HIR, and LIR

    void make_node(p,n)
       struct node *p;
       int n;
    {  struct node *q;
       q = malloc(sizeof(struct node));
       q->next = nil;
       q->value = n;
       p->next = q;
    }

    void insert_node(n,l)
       int n;
       struct node *l;
    {  if (n > l->value)
          if (l->next == nil) make_node(l,n);
          else insert_node(n,l->next);
    }

FIG. 4.5 Example pair of C procedures.

    make_node:
        begin
        receive p(val)
        receive n(val)
        q <- call malloc,(8,int)
        *q.next <- nil
        *q.value <- n
        *p.next <- q
        return
        end
    insert_node:
        begin
        receive n(val)
        receive l(val)
        t1 <- l*.value
        if n <= t1 goto L1
        t2 <- l*.next
        if t2 != nil goto L2
        call make_node,(l,type1;n,int)
        return
    L2: t4 <- l*.next
        call insert_node,(n,int;t4,type1)
        return
    L1: return
        end

Intermediate Representations

ways to compute them very efficiently and, in particular, without any branches. For example, for PA-RISC, the MIR instruction t1 <- t2 min t3 can be translated (assuming that ti is in register ri) to²

    MOVE   r2,r1     /* copy r2 to r1 */
    COM,>= r3,r2     /* compare r3 to r2, nullify next if >= */
    MOVE   r3,r1     /* copy r3 to r1 if not nullified */
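The effect of the nullified move can be imitated in C with a mask instead of a branch. This is only a sketch of the idea, not PA-RISC code: the function name min_nullify is hypothetical, the unsigned-to-int conversion assumes a two's-complement int, and whether a given compiler emits branch-free code for it depends on the target.

```c
#include <assert.h>

/* Hypothetical helper mirroring the three-instruction sequence:
   r1 <- r2, then r1 <- r3 unless the compare nullifies the second move. */
int min_nullify(int r2, int r3)
{
    int r1 = r2;                      /* MOVE r2,r1 */
    int nullify = (r3 >= r2);         /* COM,>= r3,r2 */

    /* mask is all ones when the second move executes, all zeros when it
       is nullified, so the select below needs no branch. */
    unsigned mask = (unsigned)nullify - 1u;
    r1 = (int)(((unsigned)r3 & mask) | ((unsigned)r1 & ~mask));

    assert(r1 == r2 || r1 == r3);     /* result is always one of the inputs */
    return r1;
}
```

For instance, min_nullify(3, 7) and min_nullify(7, 3) both select 3.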

Also, note that we have provided two ways to represent conditional tests and branches: either (1) by computing the value of a condition and then branching based on it or its negation, e.g.,

    t3 <- t1 < t2
    if t3 goto L1

or (2) by computing the condition and branching in the same mir instruction, e.g.,

    if t1 < t2 goto L1

The former approach is well suited to an architecture with condition codes, such as SPARC, POWER, or the Intel 386 architecture family. For such machines, the comparison can sometimes be subsumed by a previous subtract instruction, since t1 < t2 if and only if 0 < t2 - t1, and it may be desirable to move the compare or subtract away from the conditional branch to allow the condition codes to be determined in time to know whether the branch will be taken or not. The latter approach is well suited to an architecture with compare and branch instructions, such as PA-RISC or MIPS, since the MIR instruction can usually be translated to a single machine instruction.
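The equivalence between t1 < t2 and 0 < t2 - t1 holds for mathematical integers; with fixed-width arithmetic the subtraction can overflow, which is why condition codes typically record more than the sign of the result. The following C sketch sidesteps the issue by widening before subtracting. The name less_via_subtract and the choice of 32-bit operands are assumptions made for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: decides t1 < t2 from the difference t2 - t1 alone. */
int less_via_subtract(int32_t t1, int32_t t2)
{
    /* Widening to 64 bits means the difference cannot overflow. */
    int64_t diff = (int64_t)t2 - (int64_t)t1;
    int result = (0 < diff);

    assert(result == (t1 < t2));   /* agrees with the direct comparison */
    return result;
}
```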

4.6.2

High-Level Intermediate Representation (HIR)

In this section we describe the extensions to MIR that make it into the higher-level intermediate representation HIR.

An array element may be referenced with multiple subscripts and with its subscripts represented explicitly in HIR. Arrays are stored in row-major order, i.e., with the last subscript varying fastest. We also include a high-level looping construct, the for loop, and a compound if. Thus, MIRInst needs to be replaced by HIRInst, and the syntax of AssignInst, TrapInst, and Operand needs to be changed as shown in Table 4.4. An IntExpr is any expression whose value is an integer.
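Row-major storage determines how a multiply subscripted reference maps to a linear offset. A minimal C sketch follows, assuming zero-based subscripts and extents known at run time; the function name row_major_offset and its parameters are hypothetical and not part of HIR.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: element offset of a[sub[0], sub[1], ..., sub[rank-1]]
   in an array with extents dim[0..rank-1], using zero-based subscripts. */
size_t row_major_offset(const size_t *dim, const size_t *sub, size_t rank)
{
    size_t offset = 0;
    for (size_t k = 0; k < rank; k++) {
        assert(sub[k] < dim[k]);             /* subscript within bounds */
        /* Horner-style accumulation: the last subscript ends up with
           stride 1, so it varies fastest. */
        offset = offset * dim[k] + sub[k];
    }
    return offset;
}
```

With extents 2 x 3 x 4, incrementing the last subscript moves one element, while incrementing the middle subscript moves four.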

The semantics of the for loop are similar to those of Fortran’s do, rather than C’s for statement. In particular, the meaning of the HIR for loop in Figure 4.7(a) is given by the MIR code in Figure 4.7(b), with the additional proviso that the body of the loop must not change the value of the control variable v. Note that Figure 4.7(b) selects one or the other loop body (beginning with L1 or L2), depending on whether opd2 > 0 or not.

2. The MOVE and COM opcodes are both pseudo-ops, not actual PA-RISC instructions. MOVE can be


TABLE 4.4 Changes to XBNF description of instructions and operands to turn MIR into HIR.

    HIRInst     ->  AssignInst | GotoInst | IfInst | CallInst | ReturnInst
                 |  ReceiveInst | SequenceInst | ForInst | IfInst
                 |  TrapInst | Label : HIRInst
    ForInst     ->  for VarName <- Operand [by Operand] to Operand do
                    HIRInst* endfor
    IfInst      ->  if RelExpr then HIRInst* [else HIRInst*] endif
    AssignInst  ->  [VarName | ArrayRef] <- Expression
                 |  [*] VarName [. EltName] <- Operand
    TrapInst    ->  trap Integer
    Operand     ->  VarName | ArrayRef | Const
    ArrayRef    ->  VarName [ {Subscript ⋈ ,} ]
    Subscript   ->  IntExpr

    for v <- opd1 by opd2 to opd3
       instructions
    endfor

(a)

        v <- opd1
        t2 <- opd2
        t3 <- opd3
        if t2 > 0 goto L2
    L1: if v < t3 goto L3
        instructions
        v <- v + t2
        goto L1
    L2: if v > t3 goto L3
        instructions
        v <- v + t2
        goto L2
    L3:

(b)

FIG. 4.7 (a) Form of the HIR for loop, and (b) its semantics in MIR.
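The expansion in Figure 4.7(b) can be traced directly in C by keeping its labels and gotos. The sketch below is an illustration under stated assumptions: the function hir_for_trace and its output buffer are hypothetical, the loop body is replaced by recording the current value of v, and a zero step is rejected with an assertion because the expansion would not terminate when opd1 >= opd3. Overflow of v + t2 is not handled.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical driver: executes "for v <- opd1 by opd2 to opd3" following
   the MIR expansion and stores the value of v seen by each execution of
   the body. Returns the number of body executions; at most cap values
   are stored in out. */
size_t hir_for_trace(int opd1, int opd2, int opd3, int *out, size_t cap)
{
    size_t count = 0;
    int v = opd1;
    int t2 = opd2;
    int t3 = opd3;

    assert(t2 != 0);                     /* assumption: nonzero step */
    if (t2 > 0) goto L2;
L1: if (v < t3) goto L3;                 /* descending loop */
    if (count < cap) out[count] = v;     /* stands in for "instructions" */
    count++;
    v = v + t2;
    goto L1;
L2: if (v > t3) goto L3;                 /* ascending loop */
    if (count < cap) out[count] = v;     /* stands in for "instructions" */
    count++;
    v = v + t2;
    goto L2;
L3: return count;
}
```

Because opd2 and opd3 are copied into t2 and t3 before the loop starts, the step and bound are evaluated once, as in a Fortran do loop; a C for statement would re-evaluate them on every iteration.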

4.6.3

Low-Level Intermediate Representation (LIR)

In this section we describe the changes to MIR that make it into the lower-level intermediate code LIR. We give only the necessary changes to the syntax and semantics of MIR by providing replacements for productions in the syntax of MIR (and additional ones, as needed) and descriptions of the additional features. The changes to the XBNF description are as shown in Table 4.5. Assignments and operands need to be changed to replace variable names by registers and memory addresses. Calls need to be changed to delete the argument list, since we assume that parameters are passed in the run-time stack or in registers (or, if there are more than a predetermined number of parameters, the excess ones are also passed in the run-time stack).

TABLE 4.5 Changes in the XBNF description of MIR instructions and expressions to create LIR.

    LIRInst       ->  RegAsgnInst | CondAsgnInst | StoreInst | LoadInst
                   |  GotoInst | IfInst | CallInst | ReturnInst
                   |  SequenceInst | Label: LIRInst
    RegAsgnInst   ->  RegName <- Expression
                   |  RegName ( Integer , Integer ) <- Operand
    CondAsgnInst  ->  RegName <- ( RegName ) Operand
    StoreInst     ->  MemAddr [ ( Length ) ] <- Operand
    LoadInst      ->  RegName <- MemAddr [ ( Length ) ]
    GotoInst      ->  goto {Label | RegName [{+ | -} Integer]}
    CallInst      ->  [RegName <-] call {ProcName | RegName} , RegName
    Operand       ->  RegName | Integer
    MemAddr       ->  [ RegName ] [( Length )]
                   |  [ RegName + RegName ] [ ( Length ) ]
                   |  [ RegName [+ | -] Integer ] [ ( Length ) ]
    Length        ->  Integer

There are five types of assignment instructions, namely, those that

1. assign an expression to a register;

2. assign an operand to an element of a register (in LIR, the element is represented by two integers, namely, the first bit position and the width in bits of the element separated by a comma and enclosed in parentheses);

3. conditionally assign an operand to a register depending on the value in a register;

4. store an operand at a memory address; or

5. load a register from a memory address.
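The second kind of assignment, to an element of a register, amounts to a bit-field insert. A minimal C sketch follows. It assumes 32-bit registers and that bit position 0 is the least significant bit; the passage above does not fix the numbering direction, so that choice, like the function name assign_reg_element, is an assumption made for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: returns reg with the width-bit element starting at
   bit position first replaced by the low-order bits of opd. */
uint32_t assign_reg_element(uint32_t reg, unsigned first, unsigned width,
                            uint32_t opd)
{
    assert(width >= 1 && width <= 32 && first <= 32 - width);

    /* width low-order one bits; width == 32 is handled separately because
       shifting a 32-bit value by 32 is undefined in C. */
    uint32_t field = (width == 32) ? UINT32_MAX
                                   : ((UINT32_C(1) << width) - 1u);
    uint32_t mask = field << first;

    /* Clear the element, then merge in the operand's bits. */
    return (reg & ~mask) | ((opd << first) & mask);
}
```

For example, writing the value 5 into the element (8,4) of a register holding 0xFFFFFFFF changes only bits 8 through 11.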

A memory address is given in square brackets (“[” and “]”) as the contents of a register, the sum of the contents of two registers, or a register’s contents plus or minus an integer constant. The length specification, if it appears, is an integer number of bytes.
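The three address forms can be modeled as a small tagged structure with a function that computes the effective address. This is a sketch only: the type and function names are hypothetical, and it assumes 32-bit addresses and a file of 32 integer registers.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of the three MemAddr forms. */
enum mem_addr_kind {
    ADDR_REG,          /* [ RegName ]              */
    ADDR_REG_REG,      /* [ RegName + RegName ]    */
    ADDR_REG_OFFSET    /* [ RegName +/- Integer ]  */
};

struct mem_addr {
    enum mem_addr_kind kind;
    unsigned base;     /* register number of the first register */
    unsigned index;    /* second register, ADDR_REG_REG only */
    int32_t offset;    /* signed constant, ADDR_REG_OFFSET only */
    unsigned length;   /* length in bytes; 0 means none was given */
};

/* Computes the address denoted by m, given the register contents. */
uint32_t effective_address(const struct mem_addr *m, const uint32_t regs[32])
{
    assert(m->base < 32);
    switch (m->kind) {
    case ADDR_REG:
        return regs[m->base];
    case ADDR_REG_REG:
        assert(m->index < 32);
        return regs[m->base] + regs[m->index];
    case ADDR_REG_OFFSET:
        /* Unsigned arithmetic wraps, so a negative offset subtracts. */
        return regs[m->base] + (uint32_t)m->offset;
    }
    assert(0 && "unknown address kind");
    return 0;
}
```

Storing "plus or minus an integer constant" as a single signed offset keeps the structure to one field for that form; a representation closer to the grammar would keep the sign and the integer separately.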

In a call instruction that has two or more register operands, the next-to-last contains the address to be branched to and the last names the register to be used to store the return address, which is the address of the call instruction.

Section 4.7 Representing MIR, HIR, and LIR in ICAN

For the names of registers, we reserve r0, r1, ..., r31 for integer or general-purpose registers, f0, f1, ..., f31 for floating-point registers, and s0, s1, ... for symbolic registers.