
Instrumentation to Regulate Compiler Behavior

In document dtj v01 06 feb1988 pdf (Page 94-96)

We also used instrumentation data gathered during a compilation to actually modify the overall flow of the compiler, and thus improve the compiler's performance. In particular, the compiler uses instrumentation data to modify its behavior according to the availability of memory. This kind of optimization is often seen in computer operating systems and in general manufacturing processes, but rarely seen in software tools such as compilers. This section describes the use of instrumentation data to diagnose and solve a paging-rate problem we detected during the development of the compiler.

The VAX Ada compiler consists of a number of phases that process the internal tree representation of Ada source code in a series of tree traversals, or walks. Walks in the semantics phase modify the tree representation to reflect the semantic meaning of the Ada code. Later walks, prior to optimization and code generation, add code-generation information to the tree.

Each of these walks is instrumented to show the amount of CPU time, elapsed time, page faults, and I/O operations involved. An analysis of this information during the development of VAX Ada showed that a very large number of page faults often occurred for typical program units. Even with larger than normal working sets, the paging rate was high enough to significantly increase the load on the system, thus affecting overall system performance and responsiveness for all users. Comparison of the paging rates with the same data for other parts of the compiler, and against the totals for the whole compilation, showed that a very large proportion of the page faults occurred during the walks that added code-generation information.
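The per-walk measurements described above can be sketched as follows. This is a minimal illustration in Python for a Unix-like system, not the VAX Ada implementation: the names `WalkStats`, `instrumented_walk`, and `fault_share` are hypothetical, and the standard `resource` module stands in for whatever VMS services the compiler actually used.

```python
import resource
import time
from dataclasses import dataclass


@dataclass
class WalkStats:
    name: str
    cpu_seconds: float
    elapsed_seconds: float
    page_faults: int
    io_operations: int


def instrumented_walk(name, walk, tree):
    """Run one tree walk and record the resources it consumed."""
    before = resource.getrusage(resource.RUSAGE_SELF)
    start = time.perf_counter()
    walk(tree)
    elapsed = time.perf_counter() - start
    after = resource.getrusage(resource.RUSAGE_SELF)
    return WalkStats(
        name=name,
        # User plus system CPU time spent inside the walk.
        cpu_seconds=(after.ru_utime + after.ru_stime)
        - (before.ru_utime + before.ru_stime),
        elapsed_seconds=elapsed,
        # Major plus minor page faults taken during the walk.
        page_faults=(after.ru_majflt + after.ru_minflt)
        - (before.ru_majflt + before.ru_minflt),
        # Block input plus output operations during the walk.
        io_operations=(after.ru_inblock + after.ru_oublock)
        - (before.ru_inblock + before.ru_oublock),
    )


def fault_share(stats):
    """Fraction of all recorded page faults attributable to each walk."""
    total = sum(s.page_faults for s in stats)
    return {s.name: (s.page_faults / total if total else 0.0) for s in stats}
```

Comparing each walk's share of the total, as `fault_share` does, is the kind of comparison that singled out the walks adding code-generation information.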

The trouble with any "static" solution to this problem is that page faults are a property of the amount of physical memory available to the compiler. The amount of physical memory varies based on both the VAX hardware configuration and the use of that hardware by other VMS processes running concurrently with the process executing the compiler.

In an effort to solve this problem, we measured the size of the tree for typical Ada subprograms. We found the tree size to be significantly smaller than the size of the code doing the individual tree walks. Furthermore, the code for the tree walks was larger than typical VMS working sets. Thus, the code for each walk was paged out by the subsequent walk and then paged back in again for the next subprogram. We concluded that the high paging rates were caused by our inside-out code-generation approach, which was designed to minimize the use of virtual memory.

To reduce the paging of the code, we chained together the trees for sets of subprograms and did each walk across all the elements of the set before applying the subsequent walk to any of them. This approach is contrary to the earlier goal of reducing memory usage by doing one subprogram completely before doing the next one. However, in this context the earlier goal is more accurately stated as "keeping the memory usage to within the amount of memory that is available."
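The change in traversal order amounts to interchanging two loops. The sketch below is illustrative only: the function names are hypothetical, and counting how often the active walk changes is an assumed, rough proxy for how often walk code must be paged back in.

```python
def per_subprogram_order(walks, trees):
    """Original order: finish every walk on one subprogram before the next."""
    return [(walk, tree) for tree in trees for walk in walks]


def per_walk_order(walks, trees):
    """Revised order: apply one walk to every tree in the set, then the next walk."""
    return [(walk, tree) for walk in walks for tree in trees]


def walk_switches(schedule):
    """Count how often the active walk changes between consecutive steps."""
    return sum(1 for prev, cur in zip(schedule, schedule[1:]) if prev[0] != cur[0])
```

With three walks and four subprograms, the original order changes walks at every step (11 switches), while the revised order changes walks only twice. The cost is that all four trees must be held in memory at once.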

As a result of our observations, we also made the compiler "self-correcting" in a release following version 1.0. We instrumented the compiler to measure the amount of virtual memory available, the amount of physical memory available, and the pre-code-generation size of each subprogram's trees. In addition, very conservative heuristics estimate the additional memory required for the code-generation information for each subprogram. Together, the measurements and heuristics are used by the compiler to build the largest possible set of subprograms that do not present a danger of exceeding the available virtual memory. Furthermore, the sets are chosen so that the code for the largest phase plus the size of all the trees for the subprograms in the set are less than the size of the working set extent of the VMS process.
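One way to picture the set-building step is a greedy grouping under two limits. The following is a sketch under stated assumptions, not the compiler's actual algorithm: the names are hypothetical, the growth factor of 2.0 is an invented stand-in for the "very conservative heuristics", and the estimated post-code-generation size is used for both limits.

```python
from dataclasses import dataclass


@dataclass
class Subprogram:
    name: str
    tree_size: int  # measured pre-code-generation tree size


def estimated_final_size(sub, growth_factor=2.0):
    """Conservative guess at the tree size after code-generation
    information is added. The factor is an assumption for illustration."""
    return int(sub.tree_size * growth_factor)


def build_sets(subprograms, virtual_memory_free, working_set_extent,
               largest_phase_code_size):
    """Greedily group consecutive subprograms into the largest sets that
    respect both the virtual memory limit and the working set extent."""
    sets, current, used = [], [], 0
    for sub in subprograms:
        need = estimated_final_size(sub)
        fits = (
            used + need <= virtual_memory_free
            and largest_phase_code_size + used + need < working_set_extent
        )
        # Close the current set when the next subprogram would not fit.
        if current and not fits:
            sets.append(current)
            current, used = [], 0
        # A subprogram too large for either limit still gets a set of its own.
        current.append(sub)
        used += need
    if current:
        sets.append(current)
    return sets
```

Because both limits are read at run time, the same code yields large sets on a lightly loaded machine and falls back toward one subprogram per set when memory is scarce.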

This modification successfully lowered the paging rates of the compiler, hence improving elapsed time and system performance. The exact numbers vary according to the actual VAX hardware configuration and Ada code being compiled. However, figures for the code-generation phases were often halved, resulting in 30 percent or more overall improvement for the whole compilation.

This dynamic measurement of working set, virtual memory, and tree size, and the subsequent tuning of the selection of sets to the process's available resources, mean that all resources, large or small, were fully exploited. This technique is applicable for enhancing the performance of any compute-bound programs that also use significant amounts of virtual memory.

Instrumentation as a Debugging and Maintenance Aid

In addition to using instrumentation to obtain resource measurements, we have used it to debug the compiler. We have also found it to be a useful maintenance aid.

Instrumentation data is read by calling one of a number of routines either from the VMS Debugger or from code triggered by an event. (Events are special places in the compiler code.) The routine displays the instrumentation data on the terminal (so the programmer can see it right away) and in the listing file (for post-mortem examination). The debugger or event-driven routines are capable of producing human-readable listings of large and complex data structures. The listings help simplify the task of debugging the compiler, as it can be very time-consuming to examine directly a very complex data structure, such as a tree, with a general-purpose tool like the VMS Debugger. (An example listing appears at the end of this paper in the section Self-description.)

Each event is specified in the compiler code by a DEILEVENT macro. This macro takes one or more parameters. The first parameter is the name of the event, and subsequent parameters specify additional code that causes instrumentation data to be displayed.

An event will not occur unless its name has been given either on the command line that invoked the compiler or via a simple interpreter that is linked into the compiler. The interpreter displays event names and allows breakpoints to be set or canceled on particular events. For example, the Ada compiler implements a sophisticated syntax error recovery scheme that attempts a large variety of local corrections when an error is detected. When the parser makes an unexpected correction, events in the recovery code can be set to gather the data to determine why. Events in the recovery code are set by the setting of breakpoints on all events whose names start with PAILRECOVERY. The result is an informative display at the start of error recovery, and another display as each kind of recovery is attempted. The displays can then be used to determine the reason for the particular recovery chosen.
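The event mechanism can be approximated by a small registry in which an event does nothing unless a breakpoint matching its name has been set. This is an illustrative sketch only: the class, its methods, and the event names used below are hypothetical, and output is collected in a list rather than written to a terminal and a listing file.

```python
class EventRegistry:
    """Named events that run their display code only when a breakpoint
    has been set on a matching name prefix."""

    def __init__(self):
        self.enabled_prefixes = set()
        self.output = []  # stands in for the terminal and the listing file

    def set_breakpoint(self, prefix):
        """Enable every event whose name starts with the given prefix."""
        self.enabled_prefixes.add(prefix)

    def cancel_breakpoint(self, prefix):
        """Disable a previously set prefix, if present."""
        self.enabled_prefixes.discard(prefix)

    def is_enabled(self, name):
        return any(name.startswith(p) for p in self.enabled_prefixes)

    def event(self, name, *displays):
        """Counterpart of the event macro: the first argument names the
        event, the remaining arguments are callables producing display data."""
        if not self.is_enabled(name):
            return False
        for display in displays:
            self.output.append(f"{name}: {display()}")
        return True
```

A usage example with invented event names: after `registry.set_breakpoint("RECOVERY")`, a call such as `registry.event("RECOVERY_START", lambda: "unexpected token")` records a display line, while `registry.event("SEMANTICS_WALK", ...)` stays silent. Passing the display code as callables means disabled events cost only a prefix check.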

Setting an event yields the precise information needed to determine why the compiler code made a particular decision, as opposed to the more general information given by the VMS Debugger. Often the time saved in analyzing each problem exceeds the amount of time required initially to put the events into the code. Furthermore, such events are still in place for the benefit of future developers who need to make enhancements or debug other problems.
