
2.5 Applications of Type-Safe Linguistic Reflection

2.5.1 Genericity and Efficiency

There are a number of application areas for the styles of strongly typed linguistic reflection described earlier. One of these is in supporting highly generic programs efficiently. The advantage of genericity is that it may promote software reuse, with associated economic benefits, by making the programs that are written more generally applicable than their non-generic counterparts. Thus for a given application it is more likely that existing software will be available, reducing the amount of new code that needs to be written.

A number of languages support polymorphic functions [Mat85, Tur85, Per87, MTH89, MBC+89, DM90, HWA+90, She90]. These achieve genericity by allowing the programmer to abstract over details of types. For example, a single function may be defined that computes the length of a homogeneous list of any element type. This is possible because the type of the list elements does not affect the way in which the length of the list is calculated. This variety of polymorphism is known as parametric polymorphism. Another variety, inclusion polymorphism, allows types to be partly abstracted over. For example, a function that expects a record parameter with a single field, name, may also be passed a record with two fields, name and address. The extra information is ignored by the function.
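As a minimal sketch of the two varieties, the Python fragment below restates the list-length and record examples using optional type hints. The names `length`, `HasName`, `Person`, `Contact` and `greeting` are illustrative assumptions and are not taken from any of the cited languages. Python does not enforce these hints at run-time, so the annotations only document the intent.

```python
from dataclasses import dataclass
from typing import Protocol, Sequence, TypeVar

T = TypeVar("T")


def length(items: Sequence[T]) -> int:
    """Parametric polymorphism: the element type T never affects the computation."""
    count = 0
    for _ in items:
        count += 1
    return count


class HasName(Protocol):
    """Any record type with at least a `name` field."""
    name: str


@dataclass
class Person:
    name: str


@dataclass
class Contact:
    name: str
    address: str


def greeting(record: HasName) -> str:
    """Inclusion polymorphism: extra fields such as `address` are simply ignored."""
    return "Hello, " + record.name
```

Both `greeting(Person("Ann"))` and `greeting(Contact("Ann", "1 High St"))` are accepted, and in neither function does the behaviour vary with the type of the argument.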

While these forms of polymorphism allow generic functions to be defined, their use is confined to cases where the generic computation does not depend on the types of the operands. There also exist application areas where a generic operation may sensibly be defined over many different types, but where the type of the data does affect the computation. Some examples are natural join, deep equality testing and pretty-printing. In these cases the ‘same’ operation may be performed on instances of many different types, with details of each computation being determined by the particular type. For example, with natural join the types of the input relations determine both the type of the result relation and the algorithm to produce the result. This constitutes a form of ad hoc polymorphism [Str67].

Generic programs whose behaviour depends on the types of their data can be written using type-safe linguistic reflection. The technique involves defining generators that, supplied with the types for a particular call, produce source representations of code to perform the operation for those types. The generators are used differently in compile-time and run-time reflection. With compile-time reflection the following actions are performed:

• During compilation, generator definitions are compiled.

• Also during compilation, calls to generators are executed. The generators produce new source code that is specialised to operate on particular types. They have access to type information accumulated by the compiler during compilation up to the point of the generator calls.

• The new source representations are incorporated into the original source program, replacing the calls to the generators.

• After each generator call, compilation continues from the point where the call was encountered. The new code produced by the generator is compiled and type-checked as if it had been part of the original program.

• When compilation has been completed all the reflective constructs have been ‘compiled away’ from the resulting compiled code.

With run-time reflection the following actions are performed:

• When a generic operation is required to be applied to some data during execution, a generator is called and the data passed to it. The generator is able to discover the types of the data.

• The generator produces new source code to operate on those particular types.

• The new source code is compiled. If compilation succeeds the resulting compiled code is applied to the data.

In both compile-time and run-time reflection the code produced by the generators may be executed many times after the process of reflection has taken place, with no further overheads due to the genericity. This contrasts with the interpretive scheme that would be required to provide the same genericity if reflection were not used. In such a scheme the costs of specialisation would be borne every time a generic operation was performed.

To illustrate this difference, consider both reflective and non-reflective implementations of a generic operation to perform natural join in a language without built-in support for relations. The reflective implementation produces a specialised version of natural join whenever it is required. This version is specialised to the types of the input relations, which specify the names and types of their attributes, and to the type of the result relation. It is possible to verify before any call to the specialised function that it is supplied with relations of the correct types. The body of the function itself therefore need not contain any checking for the well-formedness of the input relations. In addition, the computation of the result type and the construction of the algorithm for producing the result tuples can be performed in the generator rather than in the specialised function, which may be executed many times for each execution of the generator.
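A possible shape for such a generator is sketched below in Python. Several choices here are assumptions made for illustration: relations are represented as lists of dataclass instances, the generated algorithm is a hash join, and the names `generate_join`, `Employee` and `Department` are hypothetical. Because Python's `compile` does not type-check, the compatibility check is written explicitly in the generator.

```python
from dataclasses import dataclass, fields, make_dataclass


def generate_join(left_cls, right_cls):
    """Hypothetical generator: specialise natural join to two tuple types."""
    left = {f.name: f.type for f in fields(left_cls)}
    right = {f.name: f.type for f in fields(right_cls)}
    common = [n for n in left if n in right]
    extra = [n for n in right if n not in left]
    # Validity checking is done once, here, not in the generated function.
    for n in common:
        if left[n] != right[n]:
            raise TypeError(f"attribute {n!r} differs in type between the inputs")
    # The result type is computed from the input types.
    result_cls = make_dataclass(
        left_cls.__name__ + "Join" + right_cls.__name__,
        [(n, left[n]) for n in left] + [(n, right[n]) for n in extra],
    )
    # Source text for a hash join with the attribute names written in directly.
    key_l = "(" + "".join(f"l.{n}, " for n in common) + ")"
    key_r = "(" + "".join(f"r.{n}, " for n in common) + ")"
    args = ", ".join([f"l.{n}" for n in left] + [f"r.{n}" for n in extra])
    source = (
        "def join(lefts, rights):\n"
        "    index = {}\n"
        "    for r in rights:\n"
        f"        index.setdefault({key_r}, []).append(r)\n"
        f"    return [Result({args})\n"
        f"            for l in lefts for r in index.get({key_l}, [])]\n"
    )
    namespace = {"Result": result_cls}
    exec(compile(source, "<generated join>", "exec"), namespace)
    return result_cls, namespace["join"]


@dataclass
class Employee:
    name: str
    dept: str


@dataclass
class Department:
    dept: str
    floor: int


# The generator runs once; the specialised function may then run many times.
EmployeeDepartment, join_employee_department = generate_join(Employee, Department)
```

The generated `join` body contains no inspection of attribute names or types. It refers to `l.dept`, `r.dept` and `r.floor` directly, which is the sense in which the type-dependent work has been moved into the generator.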

Without reflection, interpretation is required to provide the genericity. This solution requires a more loosely typed representation of relations, where all relations have the same type, for example a list of attribute names together with a two-dimensional array of values. A single natural join function can then be defined for all relations. The disadvantage is that more computation is required at run-time: on every call the compatibility of the input relations must be checked and the algorithm to produce the result tuples determined from examination of the input relations.
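The interpretive alternative might look like the following Python sketch, which assumes the loosely typed representation just described: a relation is a pair of an attribute-name list and a list of rows. The function name `natural_join` and the particular well-formedness checks are illustrative assumptions.

```python
def natural_join(left, right):
    """Interpretive natural join over (attribute names, rows) pairs."""
    left_names, left_rows = left
    right_names, right_rows = right

    # Well-formedness must be checked on every call.
    for names, rows in ((left_names, left_rows), (right_names, right_rows)):
        if len(set(names)) != len(names):
            raise ValueError("duplicate attribute name")
        for row in rows:
            if len(row) != len(names):
                raise ValueError("row width does not match the attribute list")

    # The join plan is also rediscovered on every call by examining the inputs.
    common = [n for n in left_names if n in right_names]
    extra = [n for n in right_names if n not in left_names]
    left_key = [left_names.index(n) for n in common]
    right_key = [right_names.index(n) for n in common]
    extra_pos = [right_names.index(n) for n in extra]

    # Hash join driven by positions computed above.
    index = {}
    for r in right_rows:
        index.setdefault(tuple(r[i] for i in right_key), []).append(r)
    rows = [
        list(l) + [r[i] for i in extra_pos]
        for l in left_rows
        for r in index.get(tuple(l[i] for i in left_key), [])
    ]
    return (list(left_names) + extra, rows)
```

Everything above the hash join is repeated each time the function is called, even when the same pair of relation shapes is joined repeatedly.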

In the reflective solution to the natural join problem, the type-dependent details of instances of a family of functions are generated. Thus the generator can be thought of as a highly generic abstraction over the functions. Another example of this approach is a set of four traversal functions over recursive data types [She91]. These functions generalise the list map and fold functions, allowing them to be applied to any recursive data type. Sheard has also used the technique to define a deep equality test for any type [She90]. Similarly, forms systems for data entry and access can be automatically generated from type definitions. Cooper has used such a technique to provide a rich repertoire of interaction modes over any structures that may be defined in a range of data models [Coo90b].

The genericity achievable via linguistic reflection depends on the ability of a generator to access type details and generate program fragments that are tailored to the types given when the generator is executed. This constitutes a form of ad hoc polymorphism, but the genericity attained in these examples exceeds the capabilities of current polymorphic type systems [SFS+90]. In most polymorphic systems, the behaviour of polymorphic functions must be essentially invariant over the range of input types. The examples listed above have behaviour that varies too much to be accommodated by current polymorphic systems.

In conclusion, linguistic reflection supports the definition of generic programs whose behaviour depends on the types of their inputs, and that are more efficient at run-time than the equivalent interpretive versions. Efficiency is gained by allowing the input data to be represented in a more specialised form while still supporting generic abstractions over the data. This allows validity checking and algorithm construction to be performed earlier.
