Effective Partial Evaluation:
Principles and Applications
The COMPOSE project
A Selection of Representative Publications January 1996 – May 1998
This collection of published papers provides a comprehensive presentation of the principles, techniques, and applications of program specialization that the Compose group has been studying in the past 3 years.
The topic of program specialization has been the common theme of our work. It is one of the facets of program adaptation [5]. Although program specialization has been actively studied since the seventies, a number of open problems remained to enable it to scale up to real-size applications. The only evidence regarding its effectiveness had been demonstrated by successful applications of manual program specialization such as the Synthesis operating system.
Contributions in program specialization. We took up this challenge as the primary goal of our research project. We designed and implemented a complete specializer for C programs, called Tempo [1, 3]. The functionalities and features of Tempo have been driven by the needs of practical applications. This strategy has led us to develop drastically new approaches to program specialization. The most important contributions towards this goal include an approach that enables programs to be specialized not only at compile time, as is usually the case, but also at run time [4, 12]. As a result, a whole class of new applications can now be treated by program specialization: programs whose specialization opportunities rely on run-time values.
Another set of contributions addresses new analysis techniques that maximize the use of specialization values [6, 7] and are crucial to obtaining a high degree of specialization. These new principles and techniques have been integrated in Tempo.
Systems Applications. Validation of this work has been primarily focused on systems applications. Our most important result has been achieved on the SUN Remote Procedure Call (RPC). More specifically, we have specialized the encoding/decoding layers (XDR) of the RPC with respect to a call signature. This 1984 industrial-strength program was quite a challenge due to its size and the fact that it was written without any consideration for program specialization. Application of Tempo to the RPC is quite successful: the specialized XDR layers are up to 3.7 times faster than the original; for a complete round-trip call, the improvement is up to a factor of 1.5 [9, 11].
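To make the idea of specializing a marshaling layer with respect to a call signature more concrete, here is a minimal sketch in C. It is a hypothetical miniature and not the Sun XDR API: the names `stream`, `marshal_int`, `marshal_pair`, and `encode_pair` are assumptions made for illustration, byte-order conversion is omitted, and the residual function is written by hand rather than produced by Tempo.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical miniature of a generic marshaling layer (not the Sun XDR API). */
enum dir { ENCODE, DECODE };

struct stream {
    enum dir op;          /* direction, fixed once the stub is chosen */
    unsigned char *pos;   /* current position in the buffer */
    size_t left;          /* bytes remaining in the buffer */
};

/* Generic layer: dispatches on the direction and checks the
 * remaining space for every integer it processes. */
int marshal_int(struct stream *s, int *v)
{
    if (s->left < sizeof *v)
        return 0;
    if (s->op == ENCODE)
        memcpy(s->pos, v, sizeof *v);
    else
        memcpy(v, s->pos, sizeof *v);
    s->pos += sizeof *v;
    s->left -= sizeof *v;
    return 1;
}

/* Generic stub for a call whose signature is (int, int). */
int marshal_pair(struct stream *s, int *a, int *b)
{
    return marshal_int(s, a) && marshal_int(s, b);
}

/* Hand-written residual code, assuming the direction is ENCODE and
 * the caller guarantees room for two integers: the dispatch and the
 * per-integer space checks have been evaluated away. */
void encode_pair(unsigned char *buf, const int *a, const int *b)
{
    memcpy(buf, a, sizeof *a);
    memcpy(buf + sizeof *a, b, sizeof *b);
}
```

Under these assumptions, `encode_pair` fills a buffer with the same bytes as `marshal_pair` called on an `ENCODE` stream, but without testing the direction or the remaining space at each step.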
Declaring specialization of programs. As we study new techniques to broaden the applicability of program specialization, we are expanding the scope of our research into areas that contribute to the use of this approach. In fact, applying program specialization beyond “academic examples” raises methodology questions. Given the numerous functionalities of a program specializer, how can one interact with it in a high-level way? What run-time environment does program specialization require to ensure correct use of specialized programs? Can a degree of specialization be guaranteed for a given program?
We have developed a declarative approach aimed at addressing some of these issues. Specialization classes declare what, how, and when to specialize programs in a non-intrusive way. This work has been developed in an object-oriented framework [16]. In fact, we are also using a JavaVM-to-C front-end [10] in order to investigate the specialization of Java programs.
Structuring programs for specialization. Beyond explicitly declaring specialization, the next natural step is to explore ways in which programs could be structured such that specialization is somehow guaranteed. This issue is leading us to study various software architectures to determine whether they structurally expose specialization opportunities. The most natural software architecture to study is the interpreter, owing to the large body of work on this topic in the area of program specialization. Furthermore, interpretation-based implementation of languages plays an important role in Domain-Specific Languages (DSLs), a very promising area which has recently emerged.
We have proposed a methodology for developing DSLs [14] based on interpreters and abstract machines. This methodology has been validated on a realistic application, namely, a language for specifying device drivers for graphic cards [15]. We showed that specifications of device drivers in our DSL are 10 times smaller than the equivalent program written in C, while being equivalently efficient. Not surprisingly, efficiency is obtained by specializing the DSL interpreter with respect to a given specification. Finally, we demonstrated that the restrictions introduced in a DSL allow various domain-specific properties to become provable.
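The effect of specializing an interpreter with respect to a given specification can be sketched in C as follows. This is a toy example under stated assumptions, unrelated to the actual device-driver DSL: the two-instruction language, the names `interpret`, `scale_spec`, and `run_scale_spec`, and the residual code are all hypothetical and written by hand.

```c
#include <stddef.h>

/* Toy instruction set for a hypothetical accumulator language. */
enum opcode { OP_ADD, OP_MUL };

struct instr {
    enum opcode op;
    int arg;
};

/* Generic interpreter: decodes and dispatches on every instruction
 * each time the program is run. */
int interpret(const struct instr *prog, size_t len, int input)
{
    int acc = input;
    for (size_t i = 0; i < len; i++) {
        switch (prog[i].op) {
        case OP_ADD: acc += prog[i].arg; break;
        case OP_MUL: acc *= prog[i].arg; break;
        }
    }
    return acc;
}

/* A fixed specification: (input + 2) * 10. */
static const struct instr scale_spec[] = {
    { OP_ADD, 2 },
    { OP_MUL, 10 },
};

/* Hand-written residual code for scale_spec: the loop and the
 * dispatch depend only on the specification, so they disappear;
 * only the operations on the unknown input remain. */
int run_scale_spec(int input)
{
    int acc = input;
    acc += 2;
    acc *= 10;
    return acc;
}
```

For any input, `interpret(scale_spec, 2, input)` and `run_scale_spec(input)` compute the same value; the second simply no longer pays for decoding the specification.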
Beyond DSLs, we have explored a broad set of software architectures. We observed that the flexibility of software architectures typically remains in the implementation, incurring interpretive overhead. We showed that program specialization can remove this interpretive overhead systematically because it targets opportunities which are structurally exposed by the software architectures [8]. This opens a vast area of applications in software engineering [2].
Organization of this document
After listing the references, we provide a reading guide to the above-cited papers, i.e., very short abstracts with page numbers relative to the present document. The published articles follow.
Copyright Notice
This document is provided by the contributing authors as a means to ensure dissemination of scholarly and technical work on a noncommercial basis. Copyright and all rights therein are maintained by the authors or by other copyright holders. It is understood that all persons copying this information will adhere to the terms and constraints invoked by each author’s copyright.
More information / Availability of prototypes
The leading theme of the Compose project is the design and development of adaptive programs and systems. Compose is one of the research groups at IRISA.
IRISA (Institut de Recherche en Informatique et Systèmes Aléatoires) is a public research laboratory located in Rennes, France. It is composed of two units: an INRIA research unit and a CNRS research unit which is associated with the University of Rennes 1 and INSA of Rennes.
For up-to-date information, including publicly available prototypes used in our experiments, please visit our web pages.
IRISA / Compose project Tel: +33 2 99 84 71 00
Campus universitaire de Beaulieu Fax: +33 2 99 84 71 71
35042 Rennes Cedex http://www.irisa.fr/compose/
FRANCE mailto:[email protected]
References
[1] C. Consel, L. Hornof, J. Lawall, R. Marlet, G. Muller, J. Noyé, S. Thibault, and N. Volanschi. Tempo: Specializing systems applications and beyond. ACM Computing Surveys, Symposium on Partial Evaluation, 1998. To appear.
[2] C. Consel, L. Hornof, J. Lawall, R. Marlet, G. Muller, J. Noyé, S. Thibault, and N. Volanschi. Partial evaluation for software engineering. ACM Computing Surveys, Symposium on Partial Evaluation, 1998. To appear.
[3] C. Consel, L. Hornof, F. Noël, J. Noyé, and E.N. Volanschi. A uniform approach for compile-time and run-time specialization. In O. Danvy, R. Glück, and P. Thiemann, editors, Partial Evaluation, International Seminar, Dagstuhl Castle, number 1110 in Lecture Notes in Computer Science, pages 54–72, February 1996.
[4] C. Consel and F. Noël. A general approach for run-time specialization and its application to C. In Conference Record of the 23rd Annual ACM SIGPLAN-SIGACT Symposium on Principles Of Programming Languages, pages 145–156, St. Petersburg Beach, FL, USA, January 1996. ACM Press.
[5] Charles Consel. Program adaptation based on program transformation. ACM Computing Surveys, 28(4es):164–167, 1996.
[6] L. Hornof and J. Noyé. Accurate binding-time analysis for imperative languages: Flow, context, and return sensitivity. In PEPM’97 [13], pages 63–73.
[7] L. Hornof, J. Noyé, and C. Consel. Effective specialization of realistic programs via use sensitivity. In P. Van Hentenryck, editor, Proceedings of the Fourth International Symposium on Static Analysis, SAS’97, volume 1302 of Lecture Notes in Computer Science, pages 293–314, Paris, France, September 1997. Springer-Verlag.
[8] R. Marlet, S. Thibault, and C. Consel. Mapping software architectures to efficient implementations via partial evaluation. In Conference on Automated Software Engineering, pages 183–192, Lake Tahoe, Nevada, November 1997. IEEE Computer Society.
[9] G. Muller, R. Marlet, E.N. Volanschi, C. Consel, C. Pu, and A. Goel. Fast, optimized Sun RPC using automatic program specialization. In Proceedings of the 18th International Conference on Distributed Computing Systems, Amsterdam, The Netherlands, May 1998. IEEE Computer Society Press. To appear.
[10] G. Muller, B. Moura, F. Bellard, and C. Consel. Harissa: A flexible and efficient Java environment mixing bytecode and compiled code. In Proceedings of the 3rd Conference on Object-Oriented Technologies and Systems, pages 1–20, Portland (Oregon), USA, June 1997. Usenix.
[11] G. Muller, E.N. Volanschi, and R. Marlet. Scaling up partial evaluation for optimizing the Sun commercial RPC protocol. In PEPM’97 [13], pages 116–125.
[12] F. Noël, L. Hornof, C. Consel, and J. Lawall. Automatic, template-based run-time specialization: Implementation and experimental study. In International Conference on Computer Languages, Chicago, IL, May 1998. IEEE Computer Society Press. Also available as IRISA report PI-1065.
[13] ACM SIGPLAN Symposium on Partial Evaluation and Semantics-Based Program Manipulation, Amsterdam, The Netherlands, June 1997. ACM Press.
[14] S. Thibault and C. Consel. A framework of application generator design. In M. Harandi, editor, Proceedings of the Symposium on Software Reusability, pages 131–135, Boston, Massachusetts, USA, May 1997. Software Engineering Notes, 22(3).
[15] S. Thibault, R. Marlet, and C. Consel. A domain-specific language for video device drivers: from design to implementation. In Conference on Domain Specific Languages, pages 11–26, Santa Barbara, CA, October 1997. Usenix.
[16] E.N. Volanschi, C. Consel, G. Muller, and C. Cowan. Declarative specialization of object-oriented programs. In OOPSLA’97 Conference Proceedings, pages 286–300, Atlanta, USA, October 1997. ACM Press.
Reading Guide
Overviews
Program Adaptation based on Program Transformation [5] (page 7)
Much research effort in program transformation has traditionally been devoted to compiling programs better. This position paper advocates another promising application of program transformation: program adaptation. That is, the ability for a program to adapt to the context in which it is used. We claim that program adaptation is a key technique in the program development process and is applicable to real-size systems. We first explain why program adaptation is important and what this process involves. Then, we discuss how program adaptation can be achieved. Finally, we outline research directions which can contribute to this line of work. [ACM Comp. Survey’96]
Tempo: Specializing Systems Applications and Beyond [1] (page 11)
Tempo is a partial evaluator designed to exploit specialization opportunities that exist in systems applications. As Tempo consists of many subcomponents, many of which are research topics themselves, our strategy was to develop the simplest subcomponents needed to achieve our goal. Whenever possible, we simply reused existing technology. When limitations prevented us from achieving our goal, we developed the necessary new techniques. The results of this development process, including an assessment and some future perspectives, are summarized in this paper. [ACM Comp. Survey – SOPE’98]
Partial Evaluation for Software Engineering [2] (page 16)
Up to now, partial evaluation has focused on the specialization process. Less attention has been devoted to validating the technology on concrete applications. This paper presents methods that are essential to integrate partial evaluation into the software engineering process, either explicitly by declaring specialization opportunities, or implicitly by using software architectures and mechanisms that are known to expose predictable specialization opportunities. [ACM Comp. Survey – SOPE’98]
Accurate Binding-Time Analyses
Accurate BTA for Imperative Languages: Flow, Context, and Return Sensitivity [6] (page 20)
The accuracy of binding-time information directly determines the degree of specialization of off-line partial evaluators. We have designed and implemented a binding-time analysis for an imperative language, and integrated it into our partial evaluator for C, called Tempo. This binding-time analysis includes a number of new features, not available in any existing partial evaluator for an imperative language, which are critical when specializing existing programs such as operating system components, namely: flow, context and return sensitivity. [PEPM’97]
Effective Specialization of Realistic Programs via Use Sensitivity [7] (page 31)
Our empirical studies have demonstrated that real-sized applications extensively use non-liftable values such as pointers and data structures. Therefore, it is essential that the binding-time analysis accurately treats non-liftable values. To achieve this accuracy, we introduce the notion of use sensitivity, and present a use-sensitive binding-time analysis for C programs. This analysis has been implemented and integrated into our partial evaluator for C, called Tempo. Experimental studies validate our analysis and show its critical importance. [SAS’97]
Run-Time Specialization
A General Approach for Run-Time Specialization and its Application to C [4] (page 53)
Specializing programs with respect to run-time invariants allows a program to adapt to execution contexts that are valid for a limited time. This paper describes a general approach to automatic run-time specialization, based on code templates. Our approach improves on previous work in that: (1) templates are automatically produced from the source program and its invariants, (2) the approach is not machine dependent, (3) it is formally defined and proved correct, and (4) it is efficient, as shown by our implementation for the C language. [POPL’96]
Automatic Run-Time Specialization: Implementation and Experimental Study [12] (page 65)
In this paper we describe new techniques to implement run-time specialization using code templates. Templates are automatically generated from ordinary programs. They are compiled and optimized before run time, thus minimizing the time to generate code at run time. Experimental results obtained on scientific and graphics code indicate that our approach is highly effective. Run-time specialized programs run nearly (80% on average) as fast as fully optimized programs, improving performance up to a factor of 10. Specialization is amortized in as few as 3 runs. [ICCL’98]
Partial Evaluator Design
A Uniform Approach for Compile-Time and Run-Time Specialization [3] (page 82)
In this paper, we propose design principles and techniques for developing an effective off-line partial evaluator. By structuring the essential components, we are able to tackle a rich language like C without compromising the resulting implementation. By targeting a specific application area, namely system software, we developed a partial evaluator capable of treating realistic programs. Because our design clearly separates preprocessing and processing aspects, we are able to introduce run-time specialization in our partial evaluation system as a new way of exploiting information produced by the preprocessing phase. [PE’96]
Systems Applications
Scaling up Partial Evaluation for Optimizing the Sun Commercial RPC Protocol [11] (page 101)
The Sun commercial RPC (Remote Procedure Call) protocol is implemented in a highly generic way. We report here a successful experiment in using partial evaluation to optimize it. Our study describes specialization opportunities in the RPC code and shows the incapacity of traditional binding-time analyses to treat them. We identify the problems of precision in those analyses and solve them in Tempo, our partial evaluator for C, succeeding in specializing the RPC code. [PEPM’97]
Fast, Optimized Sun RPC Using Automatic Program Specialization [9] (page 112)
This paper presents an experiment that achieves automatic optimization of an existing, commercial RPC implementation, namely the Sun RPC. It runs up to 1.5 times faster than the original Sun RPC. The contributions of this work are: (1) the optimized code is safely produced by an automatic tool and thus does not entail any additional maintenance; (2) to the best of our knowledge this is the first successful specialization of mature, commercial, representative system code; and (3) the optimized Sun RPC runs significantly faster than the original code. [ICDCS’98]
Declaring Specialization of Programs
Declarative Specialization of Object-Oriented Programs [16] (page 129)
Designing and implementing generic software components is encouraged by languages such as object-oriented ones. However, it comes at a price: genericity often incurs a loss of efficiency. This paper presents an approach aimed at reconciling genericity and efficiency. To do so, we introduce declarations to the Java language to enable a programmer to specify separately how generic programs should be specialized for a particular usage pattern. Our approach has been implemented as a compiler from our extended language into standard Java. [OOPSLA’97]
Structuring Programs for Specialization
A Framework of Application Generator Design [14] (page 144)
This paper presents a framework for the development of effective application generators. This framework consists of a two-level design process. The first level is the identification of operations that express the fundamental computations of the application domain. The second level is the design of a domain-specific language which allows one to express variations within a family of applications. This language is implemented in terms of the operations defined by the first level. We show that the uniform application of partial evaluation enables automatic application generation from a micro-program to its implementation. [SSR’97]
A Domain-Specific Language for Video Device Drivers: from Design to Implementation [15] (page 151)
This paper validates our proposed framework [14] for designing and developing domain-specific languages (DSL), providing automatic generation of efficient implementations of DSL programs. We describe an example of a complete DSL for video display device drivers and the benefits of the DSL approach in this application. This illustrates some of the generally claimed benefits of using DSLs: increased productivity, higher-level abstraction, and easier verification. The DSL, named GAL, has been fully implemented with our approach and is publicly available. [DSL’97]
Mapping Software Architectures to Efficient Implementations via Partial Evaluation [8] (page 167)
The source of inefficiency in software architectures can be identified in the data and control integration of components, because flexibility is present not only at the design level but also in the implementation. We propose the use of program specialization in software engineering as a systematic way to improve performance and, in some cases, to reduce program size. We study several representative, flexible mechanisms found in software architectures: selective broadcast, pattern matching, interpreters, layers, and generic libraries. We show how partial evaluation can systematically be applied in order to optimize those mechanisms. [ASE’97]
Java
Harissa is a JavaVM-to-C converter. We have included the following paper because it is being used as a front-end to Tempo to perform specialization of Java programs.
Harissa: a Flexible and Efficient Java Environment Mixing Bytecode and Compiled Code [10] (page 177)
In this paper, we present Harissa, a Java environment which reconciles portability and efficiency while preserving the ability to dynamically load bytecode. Harissa’s compiler translates Java bytecode to C, incorporating aggressive optimizations such as virtual-method call optimization based on the Class Hierarchy Analysis. Extensive experimental studies show that the C code produced by Harissa’s compiler is more efficient than all other alternative ways of executing Java programs (that were available to us): it is up to 140 times faster than the JDK interpreter, up to 13 times faster than the Softway Guava JIT, and 30% faster than Toba, another bytecode to C compiler. [COOTS’97]
Program Adaptation based on Program Transformation
Charles Consel
University of Rennes/IRISA, Rennes, France. E-mail: [email protected]
Much research effort in program transformation has traditionally been devoted to compiling programs better. Key examples include program optimization and program parallelization. Program transformation has sometimes been used in program development to derive implementations from high-level specifications.
This position paper advocates another promising application of program transformation: program adaptation. That is, the ability for a program to adapt to the context in which it is used. We claim that program adaptation is a key technique in the program development process and is applicable to real-size systems.
In the following sections, we first explain why program adaptation is important and what this process involves. Then, we discuss how program adaptation can be achieved. Finally, we outline research directions which can contribute to this line of work.
1. WHY DOES PROGRAM ADAPTATION MATTER?
A key feature of modern computing environments is their changing nature: heterogeneous machines are connected together and tasks can migrate as they execute; connections can evolve in time and space; hardware platforms offer vastly different functionalities and performance; software environments provide applications with changing services; etc.
This situation calls for programs to adapt to these changing parameters. A conventional way to achieve this adaptation consists of structuring programs in terms of modules and layers to enable various functionalities to be added to match varying features of the computing environment. Unfortunately, what seems to be an adequate strategy at the design level appears to be a drawback at the implementation level for performance reasons. As a consequence, a well-structured system has to undergo manual optimization to achieve acceptable efficiency. This situation is illustrated by the microkernel-based operating systems: they provide coarse-grained adaptability at the expense of efficiency [3].
2. WHAT IS PROGRAM ADAPTATION?
Program adaptation is aimed at pushing further the generality of programs. Besides modules and layers, it stresses parameterization of the fundamental operations of an application to lead to highly generic programs. In contrast with other approaches, the genericity introduced in a program includes declarative aspects which allow it to adapt to various parameters of its context of usage. These parameters range from properties about its input values to features of underlying operating systems. The declarative nature of the program genericity contributes to make the benefits of the adaptation predictable; this aspect appears to be crucial for applications where performance is critical, like systems software.
3. HOW DOES PROGRAM ADAPTATION WORK?
It has long been known that program transformation is a key technology to reconcile generality and efficiency. More specifically, various experiments have demonstrated that a significant optimization is obtained from specializing programs with respect to invariants which become valid at various stages of a program's lifetime (e.g., compile time, load time, run time). Examples of such experiments can be found in widely different areas such as graphics [13] and operating systems [16]. However, these studies have been limited to manual code transformation, thus trading efficiency for safety and maintainability.
Partial evaluation [10; 4], a form of program transformation, is a particularly well-suited technique to adapt general programs to aspects of their computing environment. It is aimed at specializing programs with respect to partially known inputs. This area of research is now reaching a level of maturity that makes it applicable to real-sized problems. In fact, not only are there now partial evaluation systems for languages like C, but the program specialization approach is at the basis of the development of adaptable systems in a number of major research projects and in different areas such as networking [14; 18], graphics [11], and operating systems [2; 15; 8].
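As a minimal sketch of specialization with respect to a partially known input, consider a generic power function whose exponent is known early. The function names are illustrative only, and the residual function is written by hand (after trivial simplification) to suggest what a partial evaluator could produce for n = 3; it is not output from an actual tool.

```c
/* Generic program: both inputs are unknown until the call. */
long power(long x, unsigned n)
{
    long result = 1;
    while (n > 0) {       /* loop control depends only on n */
        result *= x;
        n--;
    }
    return result;
}

/* Hand-written residual program for the known input n = 3.
 * The loop test and the counter, which depend only on n, have
 * been evaluated away; only the multiplications on the unknown
 * input x remain. */
long power_3(long x)
{
    return x * x * x;
}
```

For every `x` whose cube fits in a `long`, `power(x, 3)` and `power_3(x)` return the same value; the specialized version simply performs no loop control at run time.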
4. RESEARCH DIRECTIONS FOR PROGRAM ADAPTATION
Various lines of work are playing an important role in the development of program adaptation. Mostly, they include partial evaluation, run-time code generation, data specialization, and program design.
Partial Evaluation. Various partial evaluation systems for real-sized languages like C have been developed [1; 5]. Such systems, combined with other tools, make it possible to achieve program adaptation. To perform program adaptation whenever invariants become valid, a new form of partial evaluation has been introduced to enable programs to specialize at run time [6].
Run-Time Code Generation. To best adapt programs to hardware features, code generation must sometimes be postponed until run time. This strategy allows, for instance, a program to select the best instruction a machine can offer for given operand values (not known until run time) [9; 7; 12].
Data Specialization. This form of partial evaluation is aimed at identifying early computations and encoding them as data structures. Then, a staged program communicates intermediate results between early phases and late ones, thus avoiding their recalculation [11]. Knoblock and Ruf have demonstrated that this approach appears to be particularly well-suited for applications where there is a huge number of invariants, as in graphics.
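The staging described above can be sketched in C as follows. This is an illustrative example only: the computation, the split into a loader and a reader, and the names `shade`, `shade_cache`, `shade_load`, and `shade_read` are assumptions made for this sketch and are not taken from the cited work.

```c
/* Original computation: a and b are early inputs, x is a late input. */
double shade(double a, double b, double x)
{
    double early = a * a + 2.0 * b;   /* depends only on early inputs */
    return early * x + 1.0;
}

/* Data structure holding the results of the early computations. */
struct shade_cache {
    double early;
};

/* Early phase: runs once per (a, b) and records the early results
 * as data instead of generating code. */
void shade_load(struct shade_cache *cache, double a, double b)
{
    cache->early = a * a + 2.0 * b;
}

/* Late phase: runs for every late value x and only reads the cache,
 * so the early computation is never repeated. */
double shade_read(const struct shade_cache *cache, double x)
{
    return cache->early * x + 1.0;
}
```

A caller would invoke `shade_load` once when `a` and `b` become known and then call `shade_read` for each new `x`; the saving grows with the number of late values processed per set of early values.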
Program Design. Experience has shown that the introduction of a new language is a difficult process which seldom leads to wide acceptance, especially in industry. On the other hand, considering the rapid changes which occur in computer science, new aspects of programming need to be expressed somehow. We believe that these conflicting constraints can be addressed by a declarative approach whose goal is to provide information for specific purposes, like program specialization in the case of program adaptation. To do so, a promising avenue of research consists of developing languages specific to a given application domain [17]. Such dedicated languages manipulate the fundamental concepts of a domain. They facilitate the design, verification, and reuse of software components.
5. CONCLUSION
This position paper has argued that program adaptation is a key aspect in the design of programs to cope with constantly changing computing environments. Program adaptation is part of the design process and takes the form of declarations. These declarations are used by program transformations to adapt a program at various stages of its lifetime, in a predictable manner. Program adaptation is being experimented with in the context of different areas such as networking, graphics and operating systems.
REFERENCES
[1] L. O. Andersen. Self-applicable C program specialization. In C. Consel, editor, ACM Workshop on Partial Evaluation and Semantics-Based Program Manipulation, pages 54–61, Yale University, 1992. Research Report 909.
[2] B.N. Bershad, S. Savage, P. Pardyak, E. Gun Sirer, M.E. Fiuczynski, D. Becker, C. Chambers, and S. Eggers. Extensibility, safety and performance in the SPIN operating system. In Proceedings of the 1995 ACM Symposium on Operating Systems Principles, pages 267–283, 1995.
[3] A. Bricker, M. Gien, M. Guillemont, J. Lipkis, D. Orr, and M. Rozier. A new look at microkernel-based UNIX operating systems; lessons in performance and compatibility. In Proceedings of the USENIX Summer Conference, 1991.
[4] C. Consel and O. Danvy. Tutorial notes on partial evaluation. In ACM Symposium on Principles of Programming Languages, pages 493–501, 1993.
[5] C. Consel, L. Hornof, F. Noël, J. Noyé, and E.-N. Volanschi. A uniform approach for compile-time and run-time specialization. In O. Danvy, R. Glück, and P. Thiemann, editors, Partial Evaluation, International Workshop, 1996. To appear.
[6] C. Consel and F. Noël. A general approach for run-time specialization and its application to C. In Proceedings of the 23rd ACM SIGPLAN-SIGACT Symposium on Principles Of Programming Languages, pages 145–156, 1996.
[7] D.R. Engler, W.C. Hsieh, and M.F. Kaashoek. `C: a language for high-level, efficient, and machine-independent dynamic code generation. In Proceedings of the 23rd ACM SIGPLAN-SIGACT Symposium on Principles Of Programming Languages, pages 131–144, 1996.
[8] D.R. Engler, M.F. Kaashoek, and J.W. O'Toole. Exokernel: an operating system architecture for application-level resource management. In Proceedings of the 1995 ACM Symposium on Operating Systems Principles, pages 251–266, 1995.
[9] D.R. Engler and T.A. Proebsting. DCG: an efficient retargetable dynamic code generation system. In Proceedings of the Sixth International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS VI), pages 263–273, 1994.
[10] N. D. Jones, C. K. Gomard, and P. Sestoft. Partial Evaluation and Automatic Program Generation. Prentice-Hall International, 1993.
[11] T. B. Knoblock and E. Ruf. Data specialization. In ACM SIGPLAN Conference on Programming Language Design and Implementation, 1996. To appear.
[12] M. Leone and P. Lee. Deferred Compilation: The Automation of Run-Time Code Generation. Technical Report CMU-CS-93-225, Carnegie Mellon University, December 1993.
[13] B. Locanthi. Fast BitBlt with asm() and CPP. In European Unix Users Group Conference Proceedings (EUUG), 1987.
[14] A.B. Montz, D. Mosberger, S.W. O'Malley, L.L. Peterson, T.A. Proebsting, and J.H. Hartman. Scout: A Communications-Oriented Operating System. Technical Report 94–20, Department of Computer Science, The University of Arizona, 1994.
[15] C. Pu, T. Autrey, A. Black, C. Consel, C. Cowan, J. Inouye, L. Kethana, J. Walpole, and K. Zhang. Optimistic incremental specialization: streamlining a commercial operating system. In Proceedings of the 1995 ACM Symposium on Operating Systems Principles, pages 314–324, 1995.
[16] C. Pu, H. Massalin, and J. Ioannidis. The Synthesis kernel. ACM Computing Systems, 1(1):11–32, 1988.
[17] S. Thibault and C. Consel. A Framework of Application Generator Design. INRIA Technical Report 3005, Rennes, France, October 1996.
[18] E.N. Volanschi, G. Muller, and C. Consel. Safe operating system specialization: the RPC case study. In First Annual Workshop on Compiler Support for System Software, February 1996.
Tempo: Specializing Systems Applications and Beyond
C. Consel, L. Hornof, R. Marlet, G. Muller, S. Thibault, E.-N. Volanschi, IRISA/INRIA-University of Rennes, France
and J. Lawall
Oberlin College, Ohio and
J. Noyé
Ecole des Mines de Nantes, France
Tempo is a partial evaluator for C programs. It targets real-size, systems applications. This paper summarizes the development process, assesses the resulting partial evaluator, and provides some perspectives for future work.
Tempo is a partial evaluator designed to exploit specialization opportunities in systems applications. Obtaining efficient systems has become increasingly important in the context of pervasive high-speed networks, interactive multimedia, and large distributed databases. Toward this end, Tempo was developed by first studying specific applications to be optimized, and then working backwards to create a tool that would perform the desired specialization. As Tempo consists of many subcomponents, many of which are research topics themselves, our strategy was to combine the simplest subcomponents that our studies showed were needed. Whenever possible, we simply reused existing technology. When limitations were encountered, we identified the fundamental problem and developed new techniques to overcome it. This paper summarizes the development process, assesses the resulting partial evaluator, and provides some perspectives for future work.
1. SPECIALIZING SYSTEMS PROGRAMS
Name: Compose project (http://www.irisa.fr/compose)
Numerous specialization opportunities exist in operating systems programs since, for software engineering reasons, they must be general and modular [Consel et al. 1993; Hornof 1997; Pu et al. 1995; Volanschi et al. 1996]. Some of these opportunities are a result of the way components are combined. For example, calling contexts of generic library functions often include invariants that can be exploited. Other opportunities are created while the operating system is running. For example, values that become known when a file is opened remain invariant during subsequent reads and writes to that file [Pu et al. 1995]. To take advantage of such opportunities, a partial evaluator must:
- Treat the C programming language, since operating systems are typically written in C.
- Exploit specialization opportunities, such as those due to generality and modularity.
- Exploit run-time invariants, like those found in file reads and writes.
- Handle large applications, because certain operating systems contain up to a million lines of code.
- Ensure predictability, by informing the user how the code will be transformed during specialization.
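To make the file example concrete, here is a minimal hand-written sketch of the kind of transformation involved. It is not output from Tempo. The struct file layout and the functions generic_read and read_memory_readable are hypothetical names introduced only for illustration, and the sketch assumes that the fields readable and kind become known when the file is opened and never change afterwards, while the read position stays dynamic.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical file descriptor, for illustration only. */
enum kind { KIND_MEMORY, KIND_NULL };

struct file {
    enum kind kind;     /* fixed when the file is opened: invariant */
    int readable;       /* fixed when the file is opened: invariant */
    const char *data;   /* backing bytes */
    size_t size;        /* number of backing bytes */
    size_t pos;         /* changes on every read: not an invariant */
};

/* Generic read: re-tests the invariants on every call. */
long generic_read(struct file *f, char *out, size_t n)
{
    if (!f->readable)
        return -1;
    switch (f->kind) {
    case KIND_NULL:
        return 0;
    case KIND_MEMORY: {
        size_t left = f->size - f->pos;
        if (n > left)
            n = left;
        memcpy(out, f->data + f->pos, n);
        f->pos += n;
        return (long)n;
    }
    }
    return -1;
}

/* What a specializer could plausibly leave behind once it knows that
 * readable == 1 and kind == KIND_MEMORY: the tests are gone and only
 * the operations on the varying position remain. */
long read_memory_readable(struct file *f, char *out, size_t n)
{
    size_t left = f->size - f->pos;
    if (n > left)
        n = left;
    memcpy(out, f->data + f->pos, n);
    f->pos += n;
    return (long)n;
}
```

The specialized routine is only valid while the invariants hold, which is why the installation of specialized code has to be tied to events such as opening the file.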
Tempo addresses each of these needs as follows:
Reusing Existing Analyses. In order to treat the complex constructs of the C programming language, Tempo includes a number of standard analyses [Consel et al. 1996]. Dealing with pointers and imperative features requires both an alias analysis and a side-effect analysis [Banning 1979; Emami et al. 1994]. For these analyses, a flow-sensitive, context-insensitive analysis based on existing research was sufficient for our purposes.
New Analysis Features. A number of improvements to the binding-time analysis were necessary to exploit the aforementioned specialization opportunities [Hornof 1997]. For example, return and use sensitivity were needed to effectively treat the complex way in which data is passed interprocedurally [Hornof and Noye 1997; Hornof et al. 1997].
Template-Based Run-Time Specialization. The fundamental challenge in exploiting run-time invariants is to generate the best possible code in the least amount of time, since the cost of code generation is incurred at run time. Tempo provides a template-based run-time specializer [Consel and Noel 1996; Noel et al. 1996]. Templates are automatically generated, compiled, and optimized at compile time. This approach minimizes run-time costs while producing high-quality code.
Module-Oriented Specialization. The bottleneck of large applications is often located in only small parts of the code. Tempo facilitates the optimization of these small parts with a feature known as module-oriented specialization [Consel et al. 1996]. The set of functions to be optimized is extracted from the original application and reinserted after specialization. An interface file describes the context from which the functions are extracted, and Tempo preserves the interface during specialization.
Visualization Tools. To enable a systems programmer to understand and follow the specialization process, Tempo includes visualization tools that display the results of each analysis.
2. ASSESSMENT
By incorporating the features described above into Tempo, we were able to successfully specialize systems programs as well as programs from other domains.
Operating Systems
One major concern in distributed systems is obtaining a fast remote procedure call protocol (RPC), a protocol that makes a remote procedure look like a local one [Sun Microsystem 1989]. A considerable amount of work has been dedicated to optimizing RPC in different operating systems, either by building new implementations or by manually optimizing critical sections of existing code [Schroeder and Burrows 1990]. Once a given client/server interface is fixed, specialization opportunities exist in the data encoding/decoding functions. We used Tempo to automatically optimize the Sun RPC, a commercial RPC implementation that has become a de facto standard [Muller, Marlet, Volanschi, Consel, Pu, and Goel 1997; Muller, Volanschi, and Marlet 1997]. Optimized encoding/decoding functions ran up to 3.7 times faster, yielding an overall speedup of 1.5 for the round-trip call. Tempo has also been used to optimize other operating system applications, including signal delivery and memory allocation.
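The following sketch illustrates, under simplifying assumptions, why a fixed client/server interface exposes specialization opportunities in encoding functions. It is not the Sun RPC code and not Tempo output: the stream structure and the functions generic_u32, generic_encode_pair, and specialized_encode_pair are hypothetical. The assumed invariants are the direction (encoding), the call signature (two 32-bit integers), and a buffer of exactly 8 bytes.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical marshalling stream; all names are illustrative. */
enum direction { DIR_ENCODE, DIR_DECODE };

struct stream {
    enum direction dir;   /* known once the stub is fixed */
    unsigned char *cur;   /* where the next word goes */
    size_t left;          /* bytes remaining in the buffer */
};

/* Generic micro-layer: dispatches on the direction and checks for
 * overflow for every single integer. */
static int generic_u32(struct stream *s, uint32_t *v)
{
    if (s->left < 4)
        return 0;
    if (s->dir == DIR_ENCODE) {
        s->cur[0] = (unsigned char)(*v >> 24);
        s->cur[1] = (unsigned char)(*v >> 16);
        s->cur[2] = (unsigned char)(*v >> 8);
        s->cur[3] = (unsigned char)(*v);
    } else {
        *v = ((uint32_t)s->cur[0] << 24) | ((uint32_t)s->cur[1] << 16)
           | ((uint32_t)s->cur[2] << 8) | (uint32_t)s->cur[3];
    }
    s->cur += 4;
    s->left -= 4;
    return 1;
}

/* Generic encoder for a call that takes two integers. */
int generic_encode_pair(unsigned char *buf, size_t size,
                        uint32_t a, uint32_t b)
{
    struct stream s = { DIR_ENCODE, buf, size };
    return generic_u32(&s, &a) && generic_u32(&s, &b);
}

/* Hand-written residual for the assumed invariants: the dispatch and
 * the overflow checks have been evaluated away, leaving only stores. */
int specialized_encode_pair(unsigned char *buf, uint32_t a, uint32_t b)
{
    buf[0] = (unsigned char)(a >> 24);
    buf[1] = (unsigned char)(a >> 16);
    buf[2] = (unsigned char)(a >> 8);
    buf[3] = (unsigned char)a;
    buf[4] = (unsigned char)(b >> 24);
    buf[5] = (unsigned char)(b >> 16);
    buf[6] = (unsigned char)(b >> 8);
    buf[7] = (unsigned char)b;
    return 1;
}
```

Both encoders write the same bytes for a buffer of 8 bytes; the specialized one simply no longer pays for generality it does not use.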
Other domains
Specializing systems programs required the development of a sophisticated partial evaluator. Tempo is also capable of successfully specializing programs in other domains, which are often simpler or less hand-optimized.
Scientific Algorithms. Significant speedups were obtained by specializing scientific algorithms, such as cubic spline interpolation, Romberg integration, and fast Fourier transform (FFT) [Noel et al. 1996].
Graphics Programs. Specializing graphics programs, such as image filtering [Noel et al. 1996] and ray tracing, has also proven effective.
Software Engineering. Tempo has also been used in software architectures in order to tightly integrate separate components into an efficient implementation [Marlet et al. 1997]. Tempo plays a key role in a framework of application-generator design, exploiting the fact that partial evaluation is especially suited to specialize interpreters [Thibault and Consel 1997; Thibault et al. 1997].
3. FUTURE WORK
We are continuing to use Tempo to specialize more applications. Additionally, using Tempo as a core transformation engine, we are pursuing two directions to widen its applicability and improve its usability.
Back-End Specializer. In order to extend the applicability of Tempo, we are investigating ways of using it in a multi-language specialization platform. By translating languages such as Fortran, C++, and Java into C, we can use Tempo as a back-end specializer. For example, we have added a front-end containing the cfront translator to specialize part of the Chorus IPC subsystem, written in C++. We have also developed Harissa, a translator from Java to C, to specialize Java programs [Muller, Moura, Bellard, and Consel 1997].
Specialization Toolkit. We are also working on complementing Tempo with other tools to make specialization easier for users who are not programming language experts. Some tools to assist module-oriented specialization have already been developed in collaboration with systems researchers at the Oregon Graduate Institute. TypeGuard and MemGuard analyze large programs and provide information concerning invariant usage; Replugger allows a generic version of a function to be replaced with a specialized version on the fly [footnote 1]. A compiler for specialization classes [footnote 2], when interfaced with Tempo, allows the user to express what, when, and how to specialize a program using a high-level specification language [Volanschi et al. 1997]. Finally, we are developing a tool that uses profiling information to detect bottlenecks and specialization opportunities in programs.
Acknowledgements
This research was supported in part by SEPT/France Telecom grant CTI 951W009, ARPA grant N00014-94-1-0845, and NSF grant CCR-92243375.
REFERENCES
Banning, J. 1979. An efficient way to find the side effects of procedure calls and the aliases of variables. In Conference Record of the 6th Annual ACM Symposium on Principles Of Programming Languages (San Antonio, TX, USA, Jan. 1979), pp. 29-41. ACM Press.
Consel, C., Hornof, L., Noel, F., Noye, J., and Volanschi, E. 1996. A uniform approach for compile-time and run-time specialization. In O. Danvy, R. Gluck, and P. Thiemann Eds., Partial Evaluation, International Seminar, Dagstuhl Castle, Number 1110 in Lecture Notes in Computer Science (Feb. 1996), pp. 54-72.
Consel, C. and Noel, F. 1996. A general approach for run-time specialization and its application to C. In Conference Record of the 23rd Annual ACM SIGPLAN-SIGACT Symposium on Principles Of Programming Languages (St. Petersburg Beach, FL, USA, Jan. 1996), pp. 145-156. ACM Press.
Consel, C., Pu, C., and Walpole, J. 1993. Incremental specialization: The key to high performance, modularity and portability in operating systems. In Partial Evaluation and Semantics-Based Program Manipulation (Copenhagen, Denmark, June 1993), pp. 44-46. ACM Press. Invited paper.
Emami, M., Ghiya, R., and Hendren, L. 1994. Context-sensitive interprocedural points-to analysis in the presence of function pointers. In Proceedings of the ACM SIGPLAN '94 Conference on Programming Language Design and Implementation (June 1994), pp. 242-256. ACM SIGPLAN Notices, 29(6). ACM Press.
Hornof, L. 1997. Static Analyses for the Effective Specialization of Realistic Applications. Ph. D. thesis, Universite de Rennes I.
Hornof, L. and Noye, J. 1997. Accurate binding-time analysis for imperative languages: Flow, context, and return sensitivity. In ACM SIGPLAN Symposium on Partial Evaluation and Semantics-Based Program Manipulation (Amsterdam, The Netherlands, June 1997), pp. 63-73. ACM Press.
Hornof, L., Noye, J., and Consel, C. 1997. Effective specialization of realistic programs via use sensitivity. In P. Van Hentenryck Ed., Proceedings of the Fourth International Symposium on Static Analysis, SAS'97, Volume 1302 of Lecture Notes in Computer Science (Paris, France, Sept. 1997), pp. 293-314. Springer-Verlag.
Marlet, R., Thibault, S., and Consel, C. 1997. Mapping software architectures to efficient implementations via partial evaluation. In Conference on Automated Software Engineering (Lake Tahoe, Nevada, Nov. 1997), pp. 183-192. IEEE Computer Society.
Muller, G., Marlet, R., Volanschi, E., Consel, C., Pu, C., and Goel, A. 1997. Fast, optimized Sun RPC using automatic program specialization. Rapport de recherche RR-3220 (July), INRIA, Rennes, France.
Muller, G., Moura, B., Bellard, F., and Consel, C. 1997. Harissa: A flexible and efficient Java environment mixing bytecode and compiled code. In Proceedings of the 3rd Conference on Object-Oriented Technologies and Systems (Portland (Oregon), USA, June 1997), pp. 1-20. Usenix.
Muller, G., Volanschi, E., and Marlet, R. 1997. Scaling up partial evaluation for optimizing the Sun commercial RPC protocol. In ACM SIGPLAN Symposium on Partial Evaluation and Semantics-Based Program Manipulation (Amsterdam, The Netherlands, June 1997), pp. 116-125. ACM Press.
Noel, F., Hornof, L., Consel, C., and Lawall, J. 1996. Automatic, template-based run-time specialization: Implementation and experimental study. Rapport de recherche 1065 (Nov.), IRISA, Rennes, France.
Pu, C., Autrey, T., Black, A., Consel, C., Cowan, C., Inouye, J., Kethana, L., Walpole, J., and Zhang, K. 1995. Optimistic incremental specialization: Streamlining a commercial operating system. In Proceedings of the 1995 ACM Symposium on Operating Systems Principles (Copper Mountain Resort, CO, USA, Dec. 1995), pp. 314-324. ACM Operating Systems Reviews, 29(5), ACM Press.
Schroeder, M. and Burrows, M. 1990. Performance of Firefly RPC. ACM Transactions on Computer Systems 8, 1 (February), 1-17. ACM Press.
Sun Microsystem 1989. NFS: Network file system protocol specification. RFC 1094 (March), Sun Microsystem. ftp://ds.internic.net/rfc/1094.txt.
Thibault, S. and Consel, C. 1997. A framework of application generator design. In Proceedings of the Symposium on Software Reusability (May 1997), pp. 131-135. ACM Press.
Thibault, S., Marlet, R., and Consel, C. 1997. A domain-specific language for video device drivers: from design to implementation. In Conference on Domain Specific Languages (Santa Barbara, CA, Oct. 1997), pp. 11-26. Usenix.
Volanschi, E., Consel, C., Muller, G., and Cowan, C. 1997. Declarative specialization of object-oriented programs. In OOPSLA'97 Conference Proceedings (Atlanta, USA, Oct. 1997), pp. 286-300. ACM Press.
Volanschi, E., Muller, G., and Consel, C. 1996. Safe operating system specialization: the RPC case study. In Workshop Record of WCSSS'96 - The Inaugural Workshop on Compiler Support for Systems Software (Tucson, AZ, USA, Feb. 1996), pp. 24-28.
Footnote 1: TypeGuard, MemGuard, and Replugger are publicly available at http://www.cse.ogi.edu/DISC/projects/synthetix/toolkit/.
Footnote 2: The Java Specialization Classes compiler is publicly available at http://www.irisa.fr/compose/sc/.
C. Consel, L. Hornof, R. Marlet, G. Muller, S. Thibault, E.-N. Volanschi, IRISA/INRIA-University of Rennes, France
and J. Lawall
Oberlin College, Ohio and
J. Noye
Ecole des Mines de Nantes, France
Up to now, partial evaluation has focused on the specialization process. Less attention has been devoted to validating the technology on concrete applications. This paper presents methods that are essential to integrate partial evaluation into the software engineering process, either explicitly by declaring specialization opportunities, or implicitly by using software architectures and mechanisms that are known to expose predictable specialization opportunities.
1. BEYOND THE SPECIALIZATION PROCESS
Partial evaluation has been intensively studied in the past twenty years. It has made major advances regarding the understanding of specialization from both a semantic and an implementation viewpoint. Many variations with respect to various language paradigms and features have been explored. Though prototype implementations have been putting theory into practice for several (mainly declarative) programming languages, they have only been applied to small programs. Less attention has been devoted to validating the technology on concrete applications.
Recently, however, attention has turned toward making a real proof of concept, so as to convince potential end-users in various communities. To do so, a partial evaluator has to successfully specialize large pieces of code, written in languages used by the software industry. Widely-used programming languages like Fortran and C are now being targeted for the development of partial evaluators [Baier et al. 1994; Andersen 1994; Consel et al. 1996].
However, as opposed to an optimizing compiler which calls for little parameterization, a specializer is a complex tool that requires specific expertise from the programmer to guide the optimization process. As a result, partial evaluation is seldom used outside of its own community.
This paper presents essential methods for partial evaluation to be actually integrated in a software engineering process, either explicitly by declaring specialization opportunities, or implicitly by using software architectures and mechanisms that are known to expose predictable specialization opportunities. Another use of partial evaluation for software engineering, not treated in this paper, is program understanding [Blazy and Facon 1997].
2. EXPLICIT SPECIALIZATION
As partial evaluators become more powerful, they also become more difficult to use. The user must identify the portion of code to specialize and describe the invariants with respect to which specialization should be carried out. Then the user must choose between traditional compile-time specialization and new forms of specialization such as run-time specialization [Consel and Noel 1996], incremental specialization [Consel et al. 1993], and data specialization [Knoblock and Ruf 1996]. Finally, specialized routines must be installed, so that they are used in valid specialization contexts.
To make this process accessible to a non-expert, a simple interface must be provided, allowing a high-level description of what, when, and how to specialize. Specialization classes [Volanschi et al. 1997] are an example of such a specification that automatically exploits and manages specialization. Specialization classes are fully integrated in the object-oriented paradigm. They introduce declarations to the Java language that enable a programmer to specify how programs should be specialized for a particular usage pattern.
This approach has been implemented as a compiler from our extended language into standard Java. It is currently being used for experimentation with Java specialization through Harissa (a Java to C translator) [Muller et al. 1997] and Tempo (a C specializer) [Consel et al. 1996; Consel et al. 1998].
Specialization classes allow a programmer to declare specialization opportunities. Beyond this explicit approach, one may wonder whether opportunities can be exposed implicitly.
3. IMPLICIT SPECIALIZATION OF SOFTWARE ARCHITECTURES
Experience has shown that specialization works well on generic programs. This observation suggests that good specialization can be guaranteed for some programming styles and software architectures. Thus, partial evaluation could be embedded in a built-in mechanism responsible for the efficient instantiation of a software architecture.
A software architecture expresses how systems should be built from various components and how those components should interact. A key feature of software architectures is flexibility, resulting in re-usability, extensibility, and adaptability. Because flexibility is present not only at the design level but also in the implementation, it may introduce a performance overhead. Sources of inefficiency can be identified in the integration of data (exchanged or shared) and control (means of communication) between components. Partial evaluation has been shown to improve the performance of a wide range of software architectures and mechanisms [Marlet et al. 1997]:
Selective Broadcast. In the selective broadcast (or implicit invocation) architecture [Shaw and Garlan 1996], components are independent agents that interact with each other by subscribing to certain types of events and sending broadcast messages. Specializing with respect to a subscription converts broadcasting into explicit calls to registered agents.
Pattern Matching. In an environment like Field [Reiss 1990], pattern matching is used to select broadcast events and decode message arguments. Specializing the pattern matcher and the decoder with respect to the pattern generates automata-like routines.
Software Layers. A layered system is a hierarchical organization of a program where each layer provides services to the layer above it and acts as a client to the layer below. An example is Sun's implementation of the Remote Procedure Call (RPC) protocol. If we specialize with respect to the data interface description, partial evaluation tightly merges the micro-layers that perform the encoding/decoding of data to/from a network-independent representation, resulting in simple memory transfers [Muller et al. 1997].
Interpreters. Scripting languages glue together powerful components (building blocks) written in traditional systems programming languages. For flexibility and simplicity, they are often interpreted. If we specialize with respect to the script, the partial evaluation of the interpreter acts as a compiler [Jones 1996]. This approach applies to domain-specific languages as well, with the additional specialization of the components [Thibault and Consel 1997]. It has been successfully put into practice for the automatic generation of efficient video card drivers [Thibault et al. 1997].
Generic Libraries. Complex data structures consist of shape as well as actual content information. Routines in very generic libraries need to test the validity of such arguments and complete bounds checking before performing the actual service. Specializing with respect to the shape of a data structure eliminates verifications: the safety interface layer is compiled away.
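The Interpreters item can be illustrated with a small hand-written sketch. The toy stack language, the function interpret, the program script, and the residual script_residual are all hypothetical and were not produced by a partial evaluator. The sketch assumes the script is known at specialization time (static) while its argument is only known at run time (dynamic), and it omits stack-overflow checks for brevity.

```c
/* Toy stack-machine interpreter; every name here is illustrative. */
enum opcode { OP_LOAD_ARG, OP_PUSH, OP_ADD, OP_MUL, OP_HALT };

struct instr {
    enum opcode op;
    int operand;
};

/* Generic interpreter: fetches, decodes, and dispatches at run time. */
int interpret(const struct instr *prog, int arg)
{
    int stack[16];
    int sp = 0;
    int pc = 0;
    for (;;) {
        const struct instr *i = &prog[pc++];
        switch (i->op) {
        case OP_LOAD_ARG: stack[sp++] = arg;        break;
        case OP_PUSH:     stack[sp++] = i->operand; break;
        case OP_ADD:      sp--; stack[sp - 1] += stack[sp]; break;
        case OP_MUL:      sp--; stack[sp - 1] *= stack[sp]; break;
        case OP_HALT:     return stack[sp - 1];
        }
    }
}

/* The static input: a script computing arg * 3 + 4. */
static const struct instr script[] = {
    { OP_LOAD_ARG, 0 }, { OP_PUSH, 3 }, { OP_MUL, 0 },
    { OP_PUSH, 4 },     { OP_ADD, 0 },  { OP_HALT, 0 }
};

/* Plausible residual of interpret with respect to script: pc, sp, and
 * the opcode dispatch were static and have been evaluated away; only
 * operations that depend on arg remain. */
int script_residual(int arg)
{
    int stack[16];
    stack[0] = arg;
    stack[1] = 3;
    stack[0] *= stack[1];
    stack[1] = 4;
    stack[0] += stack[1];
    return stack[0];
}
```

An ordinary C compiler can then reduce the residual further, in this case to the expression arg * 3 + 4, which is the sense in which specializing the interpreter acts as a compiler for the script.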
The above invariants need not be available at compile time in order to allow successful optimization: partial evaluation can be performed at run time as well [Consel and Noel 1996]. In contrast with unstructured programming, the improvement here can be predicted and guaranteed. Since partial evaluation is automatic, it does not defeat the goals of software engineering: performance is improved while flexibility is retained.
4. CONCLUSION
In order to bridge the gap between software engineers, programmers, and partial evaluation, we have proposed two directions. On the one hand, explicit partial evaluation is specified using a high-level description that describes specialization opportunities and hides the intricacies of a partial evaluator. On the other hand, partial evaluation is implicitly guaranteed when using a wide range of software architectures.
Both tracks are actively pursued within the Compose project, as we foresee that partial evaluation will not be successful outside of its community until, paradoxically, it has completely disappeared from the scene.
REFERENCES
Andersen, L. 1994. Program Analysis and Specialization for the C Programming Language. Ph. D. thesis, Computer Science Department, University of Copenhagen. DIKU Technical Report 94/19.
ASE'97. 1997. Conference on Automated Software Engineering (Lake Tahoe, Nevada, Nov. 1997). IEEE Computer Society.
Baier, R., Gluck, R., and Zochling, R. 1994. Partial evaluation of numerical programs in Fortran. In ACM SIGPLAN Workshop on Partial Evaluation and Semantics-Based Program Manipulation (Orlando, FL, USA, June 1994), pp. 119-132. Technical Report 94/9, University of Melbourne, Australia.
Blazy, S. and Facon, P. 1997. Application of formal methods to the development of a software maintenance tool. In Conference on Automated Software Engineering (Lake Tahoe, Nevada, Nov. 1997), pp. 162-171. IEEE Computer Society.
Consel, C., Hornof, L., Lawall, J., Marlet, R., Muller, G., Noye, J., Thibault, S., and Volanschi, N. 1998. Tempo: Specializing systems applications and beyond. ACM Computing Surveys, Symposium on Partial Evaluation. To appear.
Consel, C., Hornof, L., Noel, F., Noye, J., and Volanschi, E. 1996. A uniform approach for compile-time and run-time specialization. In O. Danvy, R. Gluck, and P. Thiemann Eds., Partial Evaluation, International Seminar, Dagstuhl Castle, Number 1110 in Lecture Notes in Computer Science (Feb. 1996), pp. 54-72.
Consel, C. and Noel, F. 1996. A general approach for run-time specialization and its application to C. In Conference Record of the 23rd Annual ACM SIGPLAN-SIGACT Symposium on Principles Of Programming Languages (St. Petersburg Beach, FL, USA, Jan. 1996), pp. 145-156. ACM Press.
Consel, C., Pu, C., and Walpole, J. 1993. Incremental specialization: The key to high performance, modularity and portability in operating systems. In Partial Evaluation and Semantics-Based Program Manipulation (Copenhagen, Denmark, June 1993), pp. 44-46. ACM Press. Invited paper.
Danvy, O., Gluck, R., and Thiemann, P. Eds. 1996. Partial Evaluation, International Seminar, Dagstuhl Castle, Number 1110 in Lecture Notes in Computer Science (Feb. 1996).
Jones, N. 1996. What not to do when writing an interpreter for specialisation. In O. Danvy, R. Gluck, and P. Thiemann Eds., Partial Evaluation, International Seminar, Dagstuhl Castle, Number 1110 in Lecture Notes in Computer Science (Feb. 1996), pp. 216-237.
Knoblock, T. and Ruf, E. 1996. Data specialization. In Proceedings of the ACM SIGPLAN '96 Conference on Programming Language Design and Implementation (May 1996), pp. 215-225. ACM SIGPLAN Notices, 31(5). Also TR MSR-TR-96-04, Microsoft Research, February 1996.
Marlet, R., Thibault, S., and Consel, C. 1997. Mapping software architectures to efficient implementations via partial evaluation. In Conference on Automated Software Engineering (Lake Tahoe, Nevada, Nov. 1997), pp. 183-192. IEEE Computer Society.
Muller, G., Moura, B., Bellard, F., and Consel, C. 1997. Harissa: A flexible and efficient Java environment mixing bytecode and compiled code. In Proceedings of the 3rd Conference on Object-Oriented Technologies and Systems (Portland (Oregon), USA, June 1997), pp. 1-20. Usenix.
Muller, G., Volanschi, E., and Marlet, R. 1997. Scaling up partial evaluation for optimizing the Sun commercial RPC protocol. In ACM SIGPLAN Symposium on Partial Evaluation and Semantics-Based Program Manipulation (Amsterdam, The Netherlands, June 1997), pp. 116-125. ACM Press.
Reiss, S. P. 1990. Connecting tools using message passing in the Field environment. IEEE Software 7, 4 (July), 57-66.
Shaw, M. and Garlan, D. 1996. Software Architecture. Prentice Hall.
Thibault, S. and Consel, C. 1997. A framework of application generator design. In Proceedings of the Symposium on Software Reusability (May 1997), pp. 131-135.
Thibault, S., Marlet, R., and Consel, C. 1997. A domain-specific language for video device drivers: from design to implementation. In Conference on Domain Specific Languages (Santa Barbara, CA, Oct. 1997), pp. 11-26. Usenix.
Volanschi, E., Consel, C., Muller, G., and Cowan, C. 1997. Declarative specialization of object-oriented programs. In OOPSLA'97 Conference Proceedings (Atlanta, USA, Oct. 1997), pp. 286-300. ACM Press.
Accurate Binding-Time Analysis For Imperative Languages:
Flow, Context, and Return Sensitivity
Luke Hornof
[email protected]
Irisa
Campus Universitaire de Beaulieu
35042 Rennes Cedex, France
Jacques Noye
[email protected]
Ecole des Mines de Nantes
4 rue Alfred Kastler
44070 Nantes Cedex 03, France
Abstract
Since a binding-time analysis determines how an off-line partial evaluator will specialize a program, the accuracy of the binding-time information directly determines the degree of specialization. We have designed and implemented a binding-time analysis for an imperative language, and integrated it into our partial evaluator for C, called Tempo [9]. This binding-time analysis includes a number of new features, not available in any existing partial evaluator for an imperative language, which are critical when specializing existing programs such as operating system components [24, 25].
Flow sensitivity. A different binding-time description is computed for each program point, allowing the same variable to be considered static at one program point and dynamic at another.
Context sensitivity. Each function call is analyzed with the context of the call site, generating multiple binding-time annotated instances of the same function definition.
Return sensitivity. A different binding-time description is computed for the side effects and the return value of a function.
1 Introduction
Automatic program specialization is emerging as a key software engineering concept which allows software to be generic without sacrificing performance. The motivation for our work on Tempo [9], a partial evaluator for C, is to demonstrate that partial evaluation can provide a realistic basis for automatic program specialization. Therefore, we have chosen to deal with a widely used language, namely C, and focus on optimizing existing, realistic applications. One of the main areas of applications we are looking at is operating system code. Indeed, this is an area where the conflict between generality (an operating system must, by definition, deal with a wide variety of situations) and performance is especially acute. It is therefore not surprising that many opportunities for applying partial evaluation to operating systems code have been identified [11, 12, 28].
However, we have discovered that existing partial-evaluation technology is not sufficiently advanced to effectively specialize the corresponding programs. This is due to a lack of accuracy of binding-time analyses in dealing with typical features of imperative programs, such as pointers, aliases, and side-effecting functions. We have found that flow, context, return, and use sensitivity are necessary in a binding-time analysis in order to successfully specialize systems programs.
Use sensitivity is addressed in [18]. The basic idea is that, at specialization time, the value of a variable is allowed to be computed in certain contexts even if the variable identifier is residualized in others. An accurate handling of pointers and structures makes it essential that a single residualized use of an object does not force all other uses to be residualized. This led us to develop an analysis in two different phases. The first phase determines which parts of the program can be computed at specialization time, whereas the second phase determines the actual transformations which will be applied at specialization time.
This paper focuses on the first phase of the analysis, which determines which parts of the program are static, i.e. can be computed at specialization time, and describes how to obtain flow, context, and return sensitivity. Firstly, flow sensitivity allows a different binding time to be associated with a variable at different program points, i.e. a variable is allowed to be static at one point and dynamic at another. Secondly, systems code contains calls to the same function which occur in different system states. Context sensitivity permits each call to be analyzed with respect to its specific state, allowing the different static values in each state to be exploited by each call. Finally, a system procedure typically returns some sort of constant error status. Return sensitivity allows the binding-time analysis to take advantage of this constant value, even if the system function contains dynamic constructs.
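The last point, a constant error status returned by a procedure with dynamic side effects, can be sketched as follows. This is a hand-written illustration under stated assumptions, not an example from this paper: the names table, store, client, store_2, and client_res are hypothetical, the index passed to store is assumed static (here 2), and the stored value v is assumed dynamic.

```c
#define SLOTS 4

int table[SLOTS];            /* contents are dynamic */

/* Source procedure: a dynamic side effect, but a status code that
 * depends only on the static index i. */
int store(int i, int v)
{
    if (i < 0 || i >= SLOTS)
        return -1;           /* static when i is static */
    table[i] = v;            /* dynamic side effect */
    return 0;                /* static return value */
}

int client(int v)
{
    if (store(2, v) != 0)    /* test on the returned status */
        return -1;
    return 1;
}

/* Plausible residual with a return-sensitive analysis: the status is
 * known to be 0 at specialization time, so the test in the caller
 * disappears and the residual callee no longer returns a value. */
void store_2(int v)
{
    table[2] = v;
}

int client_res(int v)
{
    store_2(v);
    return 1;
}
```

Without return sensitivity, the dynamic assignment inside store would make the whole call dynamic, and the status test in client would have to be kept in the residual program.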
We have implemented an inter-procedural flow, context, and return-sensitive binding-time analysis and integrated it into Tempo. The analysis deals with a wide subset of C, including in particular multiple returns, pointers, and structures. As a result, significant existing applications can be handled without major rewriting. The results of the analysis are used to drive both Tempo's compile-time and run-time specializer [9, 10]. We have found that, with this extra precision obtained by our analysis, we are able to effectively specialize systems code [24, 25, 31]. It has also been applied successfully to many other application domains such as domain-specific language interpreters [30], and, in the context of run-time specialization, scientific programming and image processing [27].
In the next section, Sect. 2, we explain flow, context, and return sensitivity and show how they improve the precision of the binding-time analysis. The details of the analysis are then presented in Sect. 3. Existing applications on which this analysis is being applied are given in Sect. 4. Related work is addressed in Sect. 5 and final remarks are made in Sect. 6.
2 Sensitivities
Let us look at a few examples that exhibit flow, context, and return sensitivity. For each example, we first present the initial source program. Then we give the program annotated by the first phase of the binding-time analysis, where static constructs are overlined and dynamic constructs underlined. The second phase of the binding-time analysis determines which action (i.e. transformation) to apply to each construct during specialization. The evaluate action instructs the specializer to evaluate the construct and residualize instructs the specializer to residualize it. We present the action-annotated program, where overlined constructs are to be evaluated and underlined constructs are to be residualized. Finally, we show the resulting specialized program.
2.1 Flow Sensitivity
A ow-sensitive analysis associates a dierent state with each program point. This allows variables that are read and written multiple times to be associated with dierent binding-times at dierent locations in a program.
In the example in Fig. 1, the function is analyzed with an initial binding-time description specifying that parameters
x,y, andp are all static, and that the global variable dis
dynamic. The variablexis read and written multiple times,
and is static at some program points and dynamic at others. The left-hand side, orlvalue, of an assignment is considered static if it depends only on static data. This is why, in the example, all of the variables which occur on the left-hand side of an assignment are static.
Pointers and aliasing may create an ambiguous deni-tion, an assignment for which the analysis cannot statically determine which location will be modied at run time. In the example, we assume that pointerpmay point to eitherx
ory, which creates an ambiguous assignment (binding-time
annotated aliases appear in comments next to dereferenced
pointers). Since the assignment is dynamic, both locations must become dynamic.
The action annotations differ only slightly from the binding-time annotations. Static constructs become evaluate constructs, and dynamic constructs become residualize constructs. The only exceptions to this are the static left-hand sides of dynamic assignments, which are annotated residualize, instructing the specializer to residualize the variable identifiers on the left-hand sides (instead of evaluating the variable and lifting the resulting value).
The subsequent specialization phase is guided by the action annotations. Evaluate constructs are evaluated and residualize constructs are residualized. Evaluate statements disappear completely. Evaluate expressions are evaluated, and the resulting value is lifted into the residual code. Residualize expressions and statements are residualized.
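To make the flow-sensitive treatment of Fig. 1 concrete, the following is a minimal sketch, not the analysis implemented in Tempo. It replays the five assignments of f() and keeps one binding time per variable, updated after each statement. The type and function names (Bt, Assign, analyse, analyse_fig1) are hypothetical. The sketch assumes that the pointer p itself is static, that an unambiguous assignment overwrites the binding time of its target, and that an ambiguous assignment joins the new binding time into every possible target.

```c
#include <assert.h>
#include <stddef.h>

/* Binding times, ordered U < S < D so that the join is a maximum. */
typedef enum { BT_U = 0, BT_S = 1, BT_D = 2 } Bt;

static Bt bt_lub(Bt a, Bt b) { return a > b ? a : b; }

enum { VAR_X, VAR_Y, VAR_D, NVARS };

/* One assignment: the locations it may define and the variables it reads. */
typedef struct {
    int targets[NVARS];
    int ntargets;       /* more than one target means an ambiguous definition */
    int operands[2];
    int noperands;
} Assign;

/* Analyse straight-line code; out[i] is the binding time of statement i. */
void analyse(const Assign *code, size_t n, Bt state[NVARS], Bt out[])
{
    assert(code != NULL && state != NULL && out != NULL);
    for (size_t i = 0; i < n; i++) {
        Bt rhs = BT_S;  /* an expression with no dynamic operand is static */
        for (int k = 0; k < code[i].noperands; k++)
            rhs = bt_lub(rhs, state[code[i].operands[k]]);
        out[i] = rhs;
        if (code[i].ntargets == 1) {
            /* Unambiguous definition: the state at this point is replaced. */
            state[code[i].targets[0]] = rhs;
        } else {
            /* Ambiguous definition (assumption): join into every alias. */
            for (int k = 0; k < code[i].ntargets; k++) {
                int t = code[i].targets[k];
                state[t] = bt_lub(state[t], rhs);
            }
        }
    }
}

/* The body of f() in Fig. 1, with x and y static and d dynamic on entry. */
void analyse_fig1(Bt out[5], Bt final_state[NVARS])
{
    static const Assign body[5] = {
        { { VAR_X },        1, { VAR_X, VAR_Y }, 2 },  /* x = x + y; */
        { { VAR_X },        1, { VAR_D },        1 },  /* x = d;     */
        { { VAR_X },        1, { VAR_X, VAR_Y }, 2 },  /* x = x + y; */
        { { VAR_X, VAR_Y }, 2, { VAR_D },        1 },  /* *p = d;    */
        { { VAR_X },        1, { VAR_X, VAR_Y }, 2 },  /* x = x + y; */
    };
    final_state[VAR_X] = BT_S;
    final_state[VAR_Y] = BT_S;
    final_state[VAR_D] = BT_D;
    analyse(body, 5, final_state, out);
}
```

Under these assumptions only the first assignment is static, which agrees with the specialized code of Fig. 1: the first statement disappears, y is still static in the third statement (so its value 3 is lifted), and y is dynamic after the ambiguous assignment through p.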
2.2 Context Sensitivity
Context sensitivity enables a function to be analyzed with respect to different states, or contexts, producing an annotated instance of the function for each context. Since annotated instances are separate, each one can exploit the static values of its specific context.
The second example shows a function f() which contains a sequence of calls to g(), as given in Fig. 2. Function f() is analyzed with an initial binding-time description specifying that global d is dynamic. The context of the first call consists of a static actual parameter, a static non-local variable x, and a dynamic non-local variable y (binding times of the non-local variables appear in comments). An instance of the function is then annotated with respect to this context. Notice that x becomes dynamic while analyzing the body of g(), which creates a different context for the second call to g(). Therefore, a second instance of the function is created and annotated with respect to this new context. The third call to g() has the same context as the second call, so a new instance is not created.
The corresponding actions are then produced and are used to specialize the program. In the residual program, each different instance of function g() produces a different residual function definition. Since the third call to g() had the same context as the second call, it also shares the same residual function definition.
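The reuse of instances described above can be sketched as a lookup table keyed by the calling context. This is an illustrative sketch only. The names Context, InstanceTable, instance_for, x_after_g, and analyse_f are hypothetical, and a context is reduced here to the binding times of the parameter z and of the non-local variables x and y from Fig. 2.

```c
#include <assert.h>
#include <stddef.h>

typedef enum { BT_U = 0, BT_S = 1, BT_D = 2 } Bt;   /* U < S < D */

static Bt bt_lub(Bt a, Bt b) { return a > b ? a : b; }

/* A calling context of g(): binding times of z and of the non-locals. */
typedef struct { Bt z, x, y; } Context;

#define MAX_INSTANCES 8

typedef struct {
    Context seen[MAX_INSTANCES];
    size_t count;
} InstanceTable;

static int same_context(Context a, Context b)
{
    return a.z == b.z && a.x == b.x && a.y == b.y;
}

/* Return the index of the annotated instance for ctx, creating a new
 * instance only if this context has not been seen; -1 if the table is full. */
int instance_for(InstanceTable *t, Context ctx)
{
    assert(t != NULL);
    for (size_t i = 0; i < t->count; i++)
        if (same_context(t->seen[i], ctx))
            return (int)i;
    if (t->count == MAX_INSTANCES)
        return -1;
    t->seen[t->count] = ctx;
    return (int)t->count++;
}

/* Effect of the body of g(), "x = (x + z) + y", on the binding time of x. */
Bt x_after_g(Context ctx)
{
    return bt_lub(bt_lub(ctx.x, ctx.z), ctx.y);
}

/* Replay f() from Fig. 2: x = 1; y = d; followed by three calls g(5). */
void analyse_f(int instance_of_call[3], size_t *n_instances)
{
    InstanceTable table = { .count = 0 };
    Bt x = BT_S;   /* x = 1 */
    Bt y = BT_D;   /* y = d */
    for (int i = 0; i < 3; i++) {
        Context ctx = { BT_S, x, y };   /* the actual parameter 5 is static */
        instance_of_call[i] = instance_for(&table, ctx);
        x = x_after_g(ctx);             /* g() may change the state seen by the next call */
    }
    *n_instances = table.count;
}
```

With this encoding the first call yields one instance and the second and third calls share another, which mirrors the two residual definitions g1() and g2() in Fig. 2.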
2.3 Return Sensitivity
Return sensitivity allows a function to return a static value even though the function contains dynamic side-effects and is therefore residualized.
In the third example, shown in Fig. 3, the function is analyzed with an initial binding-time description specifying that global variable d is dynamic. Return sensitivity allows the static value returned by g() to be used at its call site, which in turn enables the multiplication to be considered static as well. At the function's definition, we indicate that the function contains dynamic side-effects by annotating the identifier g as dynamic and that it returns a static value by annotating its return type int as static. At the call site, the identifier is annotated as both static and dynamic.
The specializer exploits the static value returned by g() to perform the multiplication, and residualizes the call in order to residualize its side-effects. Notice that the specialized definition of g() no longer returns a value.
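As a sanity check of this example, the source and specialized programs of Fig. 3 can be placed side by side and run with the same value of d. The sketch below is only a test harness. The suffixes _src and _spec and the helper run() are hypothetical names added so that both versions fit in one file.

```c
#include <assert.h>
#include <stddef.h>

int x, d;

/* Source program of Fig. 3. */
int g_src(int z)
{
    x = z + d;          /* dynamic side-effect */
    return (z + 3);     /* static return value */
}

void f_src(void)
{
    x = (g_src(1) * 2) + d;
}

/* Specialized program of Fig. 3: g() keeps its side-effect but no longer
 * returns a value, and (1 + 3) * 2 has been evaluated to 8. */
void g_spec(void)
{
    x = 1 + d;
}

void f_spec(void)
{
    g_spec();
    x = 8 + d;
}

/* Run one version of f() with d set to dval and return the final x. */
int run(void (*f)(void), int dval)
{
    assert(f != NULL);
    d = dval;
    x = 0;
    f();
    return x;
}
```

Both versions leave the same value in x for any d, and the residual call to g_spec() performs the same write to x as the original call, which is why the call cannot simply be dropped.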
Source code

    int d;
    void f(int x, int y, int *p)
    {
      x = x + y;
      x = d;
      x = x + y;
      *p = d;
      x = x + y;
    }

Binding-time annotated code

    int d;
    void f(int x, int y, int *p)
    {
      x = x + y;
      x = d;
      x = x + y;
      *p = d;   /* alias: p -> {x, y} */
      x = x + y;
    }

Action annotated code

    int d;
    void f(int x, int y, int *p)
    {
      x = x + y;
      x = d;
      x = x + y;
      *p = d;   /* alias: p -> {x, y} */
      x = x + y;
    }

Specialized code (w.r.t. x = 2, y = 3)

    int d;
    void f(int x, int y, int *p)
    {
      x = d;
      x = x + 3;
      *p = d;
      x = x + y;
    }

Figure 1: Flow sensitivity
3 The Binding-Time Analysis
We shall make the above-mentioned ideas precise by describing our binding-time analysis using a data-flow analysis framework (see, for instance, [1, 20]) on the subset of C described in Fig. 4. For the sake of conciseness, this subset contains only a limited number of expressions and statements; further details on the intra-procedural aspects of the analysis can be found in [18]. Note also that non-void function calls are assumed to assign their return value directly to an identifier, which can then be used in subsequent calculations. This strategy simplifies the analysis without restricting its applicability. We assume that all programs are transformed prior to the analysis, if needed, so that they conform with this constraint. Also, the analysis presented is further simplified by the fact that it does not handle recursive functions.
3.1 Intra-procedural aspects
Locations and States
We refer to the sets of values propagated by the analysis as states. States are elements of $\mathit{Location} \to \mathit{Bt}$, where $\mathit{Bt}$ is the lattice $U < S < D$ with least upper bound operator $\sqcup$. $U$ stands for undefined, $S$ for static, and $D$ for dynamic. In the intra-procedural case and in the absence of structures, $\mathit{Location} = \mathit{Identifier}$, provided all identifiers have been renamed in order to be unique. That is, each actual memory location associated with a given variable identifier is modeled by a single abstract location denoted by the identifier.
The binary operator $\setminus$ of type $\mathit{State} \times \mathit{Locations} \to \mathit{State}$ resets a set of locations to the bottom element $U$.
In the following, we shall use a graph representation of states. The application of a state is modeled by a lookup function which takes a graph (a set of location/binding-time pairs) and a location, and returns the corresponding binding time. Not all locations need to occur in the graph. A location which does not occur in the graph is considered to be undefined (the lookup function returns $U$).
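The graph representation of states can be sketched in C as an array of location/binding-time pairs. This is one possible encoding under stated assumptions, not the data structure used in Tempo. The names Bt, Pair, State, bt_lub, state_lookup, and state_reset are hypothetical, locations are represented by identifier strings, and resetting a location is implemented by dropping its pair so that a later lookup returns U.

```c
#include <assert.h>
#include <string.h>

/* The lattice U < S < D, encoded so that the enum order is the lattice order. */
typedef enum { BT_U = 0, BT_S = 1, BT_D = 2 } Bt;

/* Least upper bound: the larger of the two binding times. */
Bt bt_lub(Bt a, Bt b) { return a > b ? a : b; }

#define MAX_PAIRS 16

typedef struct { const char *loc; Bt bt; } Pair;

/* A state as a graph: a set of location/binding-time pairs. */
typedef struct {
    Pair pairs[MAX_PAIRS];
    size_t n;
} State;

/* Lookup: a location that does not occur in the graph is undefined. */
Bt state_lookup(const State *s, const char *loc)
{
    assert(s != NULL && loc != NULL);
    for (size_t i = 0; i < s->n; i++)
        if (strcmp(s->pairs[i].loc, loc) == 0)
            return s->pairs[i].bt;
    return BT_U;
}

/* Reset: return a copy of s in which every location in locs is back to U. */
State state_reset(const State *s, const char *const locs[], size_t nlocs)
{
    State r = { .n = 0 };
    assert(s != NULL);
    for (size_t i = 0; i < s->n; i++) {
        int dropped = 0;
        for (size_t k = 0; k < nlocs; k++)
            if (strcmp(s->pairs[i].loc, locs[k]) == 0)
                dropped = 1;
        if (!dropped)
            r.pairs[r.n++] = s->pairs[i];
    }
    return r;
}
```

Dropping a pair instead of storing an explicit U entry keeps the graph small and relies on the convention that absent locations are undefined.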
Pre-processing
We assume that, prior to binding-time analysis, an alias analysis and a definition analysis have been executed. The alias analysis gives, for each dereference expression *exp at program point e, the set aliases(e) of corresponding aliases, i.e. a set of locations. The definition analysis computes, for each statement at program point s, the set of locations defs(s) which may be defined (through an assignment) within the statement. The function unambiguous-defs() additionally computes, for each assignment, the set of locations unambiguously defined by the assignment. If there is a single location associated to the
Source code

    int x, y, d;
    void f()
    {
      x = 1;
      y = d;
      g(5);
      g(5);
      g(5);
    }
    void g(int z)
    {
      x = (x + z) + y;
    }

Binding-time annotated code

    int x, y, d;
    void f()
    {
      x = 1;
      y = d;
      g(5);   /* x, y */
      g(5);   /* x, y */
      g(5);   /* x, y */
    }
    void g(int z)   /* x, y */
    {
      x = (x + z) + y;
    }
    void g(int z)   /* x, y */
    {
      x = (x + z) + y;
    }

Action annotated code

    int x, y, d;
    void f()
    {
      x = 1;
      y = d;
      g(5);   /* x, y */
      g(5);   /* x, y */
      g(5);   /* x, y */
    }
    void g(int z)   /* x, y */
    {
      x = (x + z) + y;
    }
    void g(int z)   /* x, y */
    {
      x = (x + z) + y;
    }

Specialized code

    int x, y, d;
    void f()
    {
      y = d;
      g1();
      g2();
      g2();
    }
    void g1()
    {
      x = 6 + y;
    }
    void g2()
    {
      x = (x + 5) + y;
    }

Figure 2: Context sensitivity

Source code

    int x, d;
    void f()
    {
      x = (g(1) * 2) + d;
    }
    int g(int z)
    {
      x = z + d;
      return (z + 3);
    }

Binding-time annotated code

    int x, d;
    void f()
    {
      x = (g(1) * 2) + d;
    }
    int g(int z)
    {
      x = z + d;
      return (z + 3);
    }

Action annotated code

    int x, d;
    void f()
    {
      x = (g(1) * 2) + d;
    }
    int g(int z)
    {
      x = z + d;
      return (z + 3);
    }

Specialized code

    int x, d;
    void f()
    {
      g();
      x = 8 + d;
    }
    void g()
    {
      x = 1 + d;
    }

Figure 3: Return sensitivity
Domains:

    const ∈ Integer
    id    ∈ Identifier
    bop   ∈ BinaryOperator

Abstract syntax:

    exp       ::= const                       constant
                | id                          variable
                | &lexp                       reference
                | *exp                        dereference
                | exp bop exp                 binary expression
    lexp      ::= id                          variable
                | *exp                        dereference
    stmt      ::= lexp = exp                  assignment
                | if (exp) stmt else stmt     conditional statement
                | do stmt while (exp)         loop
                | { stmt* }                   block
                | id(exp*)                    void function call
                | id = id(exp*)               non-void function call
                | return exp                  function return
                | return                      void function return
    type-spec ::= int | char | ...            base types
                | *type-spec                  pointer type
    decl      ::= type-spec id                declaration
    func-def  ::= type-spec id(decl*) stmt    function definition
    program   ::= decl* func-def*             program

Figure 4: Syntax of C subset

left-hand side of the assignment, the assignment is unambiguous; it unambiguously defines the location. Otherwise, because of aliasing, several locations are associated with the left-hand side of the assignment and the assignment is ambiguous; the defined location cannot be determined statically. The set of locations unambiguou