
Optimizing Compilers

In the last section we discussed the placement of optimizations in the overall compilation process. In what follows, the wrap-up section of each chapter devoted to optimization includes a diagram like the one in Figure 1.7 that specifies a reasonable sequence for performing almost all the optimizations discussed in the text in an aggressive optimizing compiler. Note that we say “aggressive” because we assume that the goal is to improve performance as much as is reasonably possible without compromising correctness. In each of those chapters, the optimizations discussed there are highlighted by being in bold type. Note that the diagram includes only optimizations, not the other phases of compilation.

The letters at the left in Figure 1.7 specify the type of code that the optimizations to its right are usually applied to, as follows:

A These optimizations are typically applied either to source code or to a high-level intermediate code that preserves loop structure and sequencing and array accesses in essentially their source-code form. Usually, in a compiler that performs these optimizations, they are done very early in the compiling process, since the overall process tends to lower the level of the code as we move along from one pass to the next.

B,C These optimizations are typically performed on medium- or low-level intermediate code, depending on whether the mixed or low-level model is used.

D These optimizations are almost always done on a low-level form of code, one that may be quite machine-dependent (e.g., a structured assembly language) or that may be somewhat more general, such as our LIR, because they require that addresses have been turned into base register + offset form (or something similar, depending on the addressing modes available on the target processor) and because several of them require low-level control-flow code.

E These optimizations are performed at link time, so they operate on relocatable object code. One interesting project in this area is Srivastava and Wall’s OM system, which is a pilot study for a compiler system that does all optimization at link time.
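The base register + offset form mentioned under D can be pictured with a small sketch. It is written in Python rather than the book's own notation, and the instruction tuples, opcode spellings, and temporary names (`t1`, `t2`) are assumptions made for illustration only; they are not the book's LIR.

```python
def lower_array_load(dest, base, index, elem_size):
    """Lower `dest = a[index]` to explicit address arithmetic.

    Hypothetical low-level form: ("load", dest, base_reg, offset) reads
    the word at address base_reg + offset.
    """
    if isinstance(index, int):
        # Constant index: the displacement folds into the offset field.
        return [("load", dest, base, index * elem_size)]
    return [
        ("mul", "t1", index, elem_size),  # t1 <- index * elem_size
        ("add", "t2", base, "t1"),        # t2 <- address of a[index]
        ("load", dest, "t2", 0),          # dest <- word at t2 + 0
    ]

print(lower_array_load("r3", "r1", 3, 4))  # [('load', 'r3', 'r1', 12)]
```

Once accesses are in this form, the address computations are ordinary instructions that later low-level optimizations can see and improve.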

The boxes in Figure 1.7, in addition to corresponding to the levels of code appropriate for the corresponding optimizations, represent the gross-level flow among the optimizations. For example, constant folding and algebraic simplifications are in a box connected to other phases by dotted arrows because they are best structured as subroutines that can be invoked anywhere they are needed.
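A minimal sketch of that structuring, in Python rather than the book's own notation: one `simplify` routine folds constants and applies a few algebraic identities, and any phase can call it on an expression it has just built. The tuple-based expression form and all names here are assumptions made for illustration, not taken from the text.

```python
import operator

# Hypothetical expression form: an int constant, a str variable name,
# or a tuple (op, left, right) with op in {"+", "-", "*"}.
_OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}

def simplify(expr):
    """Fold constants and apply a few algebraic identities, bottom-up."""
    if not isinstance(expr, tuple):
        return expr  # constant or variable: nothing to do
    op, left, right = expr
    left, right = simplify(left), simplify(right)
    # Constant folding: both operands are known at compile time.
    if isinstance(left, int) and isinstance(right, int):
        return _OPS[op](left, right)
    # Algebraic simplifications (integer arithmetic assumed).
    if op == "+" and left == 0:
        return right
    if op in ("+", "-") and right == 0:
        return left
    if op == "*" and (left == 0 or right == 0):
        return 0
    if op == "*" and left == 1:
        return right
    if op == "*" and right == 1:
        return left
    return (op, left, right)

# Any phase can call the same routine on an expression it has just formed.
print(simplify(("+", ("*", 2, 3), ("*", "x", 1))))  # ('+', 6, 'x')
```

Keeping the routine free of any dependence on a particular phase is what lets it be invoked wherever a transformation exposes a new constant or identity.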

Introduction to Advanced Topics

FIG. 1.7 Order of optimizations.

The branches from C1 to either C2 or C3 represent a choice of the methods one uses to perform essentially the same optimization (namely, moving computations to places where they are computed less frequently without changing the semantics of the program). They also represent a choice of the data-flow analyses used to perform the optimizations.

The detailed flow within the boxes is much freer than between the boxes. For example, in box B, doing scalar replacement of aggregates after sparse conditional constant propagation may allow one to determine that the scalar replacement is worthwhile, while doing it before constant propagation may make the latter more effective. An example of the former is shown in Figure 1.8(a), and of the latter in Figure 1.8(b). In (a), upon propagating the value 1 assigned to a to the test a = 1, we determine that the Y branch from block B1 is taken, so scalar replacement of aggregates causes the second pass of constant propagation to determine that the Y branch from block B4 is also taken. In (b), scalar replacement of aggregates allows sparse conditional constant propagation to determine that the Y exit from B1 is taken.

Section 1.5 Placement of Optimizations in Aggressive Optimizing Compilers

Similarly, one ordering of global value numbering, global and local copy propagation, and sparse conditional constant propagation may work better for some programs and another ordering for others.
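The sensitivity to ordering can be illustrated with a toy sketch in the spirit of case (b). This is not the book's example and not sparse conditional constant propagation itself: the mini-IR, the function names, and the very simple straight-line constant propagator are all assumptions made for illustration, and the scalar replacement assumes the aggregate is not aliased.

```python
# Hypothetical mini-IR: ("assign", target, value) or
# ("if_eq", operand, const, label). A target or operand is a scalar
# name such as "a" or a field reference such as ("field", "s", "a").

def scalar_replace(code):
    """Rewrite each field reference ("field", s, f) as the scalar "s_f".
    Assumes the aggregate is never aliased."""
    def scalar(x):
        return f"{x[1]}_{x[2]}" if isinstance(x, tuple) else x
    out = []
    for stmt in code:
        if stmt[0] == "assign":
            out.append(("assign", scalar(stmt[1]), scalar(stmt[2])))
        else:
            out.append(("if_eq", scalar(stmt[1]), stmt[2], stmt[3]))
    return out

def known_branches(code):
    """Toy constant propagation over straight-line code. Returns the
    labels of tests whose outcome is known at compile time. Only scalar
    names are tracked; field references are treated as unknown."""
    consts, decided = {}, {}
    for stmt in code:
        if stmt[0] == "assign":
            _, target, value = stmt
            if isinstance(target, str):
                if isinstance(value, str):
                    value = consts.get(value, value)
                if isinstance(value, int):
                    consts[target] = value
                else:
                    consts.pop(target, None)
        else:
            _, operand, const, label = stmt
            if isinstance(operand, str) and operand in consts:
                decided[label] = consts[operand] == const
    return decided

program = [
    ("assign", ("field", "s", "a"), 1),
    ("if_eq", ("field", "s", "a"), 1, "B1"),
]
print(known_branches(program))                  # {}
print(known_branches(scalar_replace(program)))  # {'B1': True}
```

Run on the original code, the propagator learns nothing about the test; run after scalar replacement, it can decide the branch, which is the kind of interaction that makes the flow within a box flexible.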

FIG. 1.8 Examples of (a) the effect of doing scalar replacement of aggregates after constant propagation, and (b) before constant propagation.

Further study of Figure 1.7 shows that we recommend doing both sparse conditional constant propagation and dead-code elimination three times each, and instruction scheduling twice. The reasons are somewhat different in each case:

1. Sparse conditional constant propagation discovers operands whose values are constant each time such an operand is used. Doing it before interprocedural constant propagation helps to transmit constant-valued arguments into and through procedures, and interprocedural constants can help to discover more intraprocedural ones.

2. We recommend doing dead-code elimination repeatedly because several optimizations and groups of optimizations typically create dead code, and eliminating it as soon as reasonably possible or appropriate reduces the amount of code that other compiler phases (be they optimizations or other tasks, such as lowering the level of the code from one form to another) have to process.

3. Instruction scheduling is recommended to be performed both before and after register allocation because the first pass takes advantage of the relative freedom of code with many symbolic registers, rather than few real registers, while the second pass includes any register spills and restores that may have been inserted by register allocation.
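The point made in item 2, that other optimizations create dead code and so dead-code elimination pays off when rerun, can be sketched as follows. The three-field instruction form, the pass names, and the single-assignment and side-effect-free assumptions are all hypothetical simplifications chosen for illustration; they are not the book's algorithms.

```python
def propagate_constants(code):
    """Replace uses of names known to hold a constant.
    Assumes each name is assigned at most once."""
    consts, out = {}, []
    for dest, op, args in code:
        args = [consts.get(a, a) for a in args]
        if op == "const":
            consts[dest] = args[0]
        out.append((dest, op, args))
    return out

def eliminate_dead_code(code, live_out):
    """Drop assignments whose result is never used, using a backward
    liveness scan. Assumes every operation is free of side effects."""
    live, kept = set(live_out), []
    for dest, op, args in reversed(code):
        if dest in live:
            live.discard(dest)
            live.update(a for a in args if isinstance(a, str))
            kept.append((dest, op, args))
    return kept[::-1]

code = [
    ("a", "const", [4]),
    ("b", "const", [5]),
    ("c", "add", ["a", "b"]),
    ("d", "mul", ["c", "x"]),
]

before = eliminate_dead_code(code, {"d"})
after = eliminate_dead_code(propagate_constants(code), {"d"})
print(len(before), len(after))  # 4 2
```

Dead-code elimination alone removes nothing here, but once constant propagation has rewritten the uses of `a` and `b`, their definitions become dead and a second elimination pass halves the code that later phases must process.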

Finally, we must emphasize that implementing the full list of optimizations in the diagram results in a compiler that is both very aggressive at producing high-performance code for a single-processor system and quite large, but that does not deal at all with issues such as code reorganization for parallel and vector machines.

1.6

Reading Flow Among the Chapters

There are several approaches one might take to reading this book, depending on your background, needs, and several other factors. Figure 1.9 shows some possible paths through the text, which we discuss below.

1. First, we suggest you read this chapter (as you’re presumably already doing) and Chapter 2. They provide the introduction to the rest of the book and the definition of the language ICAN in which all algorithms in the book are written.

2. If you intend to read the whole book, we suggest reading the remaining chapters in order. While other orders are possible, this is the order the chapters were designed to be read.


FIG. 1.9 Reading flow among the chapters in this book.

3. If you need more information on advanced aspects of the basics of compiler design and implementation, but may skip some of the other areas, we suggest you continue with Chapters 3 through 6.

4. If your primary concern is optimization, are you interested in data-related optimization for the memory hierarchy, as well as other kinds of optimization?

(a) If yes, then continue with Chapters 7 through 10, followed by Chapters 11 through 18 and 20.

(b) If not, then continue with Chapters 7 through 10, followed by Chapters 11 through 18.

5. If you are interested in interprocedural optimization, read Chapter 19, which covers interprocedural control-flow, data-flow, alias, and constant-propagation analyses, and several forms of interprocedural optimization, most notably interprocedural register allocation.

6. Then read Chapter 21, which provides descriptions of four production compiler suites from Digital Equipment Corporation, IBM, Intel, and Sun Microsystems and includes examples of other intermediate-code designs, choices and orders of optimizations to perform, and techniques for performing optimizations and some of the other tasks that are parts of the compilation process. You may also wish to refer to the examples in Chapter 21 as you read the other chapters.

The three appendixes contain supporting material on the assembly languages used in the text, concrete representations of data structures, and access to resources for compiler projects via ftp and the World Wide Web.