These classes are described fully elsewhere [dear87]. To give the reader a flavour of PAIL
! index into class string
! takes a sub string length 1
! from string class at position pos
! gets next lexeme from
! the class identifier string
! ++ is concatenation
! return str
! build string vector
! build procs vector
! build menu title
    let type = lex()                          ! "'n" is the newline character
    let field = lex()
    name := name ++ type ++ " " ++ field ++ " ; "
    strings := strings ++ field ++ ++ type ++ ",'n"
    procs := procs ++ if type = "pntr"
                      then "proc() ; Trav(p( " ++ field ++ ") ),'n"
                      else "proc() ; write p( " ++ field ++ "),'n"
end
while last ~= ")"
name := name ++ ")'n"                         ! last part of structure name
strings := strings ++ "'"****'" ]'n"          ! last entry of strings vector
procs := procs ++ "proc() ; {} ]'n"           ! last entry of procedures vector
! next create string containing program representation
let prog := "proc( proc( pntr ) Trav -> proc( pntr ) )'n" ++
            "proc( pntr p )'nbegin'n" ++
            name ++ strings ++ procs ++
            "let this.menu = menu( strings,procs )'n" ++
            "if p is " ++ title ++ " then this.menu( 20,20 ) else Error()'n" ++ "end'n"
structure gen( proc( proc( pntr ) -> proc( pntr ) ) maker )
let S = gen( proc( proc( pntr ) t -> proc( pntr ) ) ; nullproc )
compiler( prog,S )                            ! return the result of compilation, that is
                                              ! S containing the required procedure
end
example 4
The procedure Trav can now be refined to use this procedure. Whenever a class is found for which no traversal procedure exists in the trav.procs table mk.trav.proc will be called to create a traversal procedure. The generator procedure is then extracted from the structure and called with the generic pointer traverser ( Trav itself ) as a parameter. The resulting procedure can then be stored in the table and finally called to traverse the structure that caused the procedure to be generated. The Trav procedure will therefore look something like this.
let Trav = proc( pntr p )
begin
    structure gen( proc( proc( pntr ) -> proc( pntr ) ) maker )
    structure trav.pack( proc( pntr ) trav )
    let key = class.identifier( p )               ! get class of instance
    let traverser := s.lookup( key,trav.procs )   ! look for display procedure
    if traverser is trav.pack                     ! found one so
    then traverser( trav )( p )                   ! call it with p as a parameter
    else begin
        let package = mk.trav.proc( key )         ! create a display package
        let T = package( maker )                  ! get generator from package
        let bound = T( Trav )                     ! generate a display proc
        traverser := trav.pack( bound )           ! re-package display procedure
        s.enter( key,trav.procs,traverser )       ! and put it into the table
        bound( p )                                ! finally call it
    end
end
example 5
The browser is now complete. The traversal procedure Trav maintains and uses the trav.procs table, which stores the procedures that display particular classes. Whenever a display procedure cannot be found by Trav, the procedure mk.trav.proc is called to generate the necessary compiled code. This code may need access to the Trav procedure; therefore, mk.trav.proc returns a display generating procedure to which Trav is passed as a parameter. This step is equivalent to linking in a conventional system. The newly generated procedure is then put into the table so that it can be called to display subsequent instances of that structure class.
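The same control flow can be sketched outside PS-algol. The following Python fragment is only an illustration under stated assumptions: dataclasses stand in for structure classes, `exec` stands in for the callable compiler, and the names `trav_procs`, `mk_trav_proc`, `trav`, `generated`, `Leaf` and `Node` are hypothetical. Instead of drawing a menu, each generated procedure returns a list of text lines.

```python
from dataclasses import dataclass, fields, is_dataclass
from typing import Any, Callable, Dict, List

Traverser = Callable[[Any], List[str]]

trav_procs: Dict[type, Traverser] = {}   # stands in for the trav.procs table
generated: List[str] = []                # classes for which code had to be generated


def mk_trav_proc(cls: type) -> Callable[[Traverser], Traverser]:
    """Write source text for a traverser of cls, compile it, return its generator."""
    src = ["def maker(trav):", "    def traverse(p):", "        out = []"]
    for f in fields(cls):
        src.append(f"        v = p.{f.name}")
        src.append("        if is_record(v):")
        src.append(f"            out.append('{f.name}: ' + type(v).__name__)")
        src.append("            out.extend(trav(v))   # follow the pointer")
        src.append("        else:")
        src.append(f"            out.append('{f.name}: ' + repr(v))")
    src += ["        return out", "    return traverse"]
    namespace = {"is_record": is_dataclass}
    exec("\n".join(src), namespace)      # plays the part of the callable compiler
    generated.append(cls.__name__)
    return namespace["maker"]


def trav(p: Any) -> List[str]:
    key = type(p)                        # class of the instance
    proc = trav_procs.get(key)           # look for an existing procedure
    if proc is None:                     # none found, so generate one
        maker = mk_trav_proc(key)
        proc = maker(trav)               # bind the generic traverser ("linking")
        trav_procs[key] = proc           # later instances reuse it
    return proc(p)


# Hypothetical sample classes, unknown to trav when it was written.
@dataclass
class Leaf:
    n: int


@dataclass
class Node:
    name: str
    child: Leaf
```

Calling `trav` on a `Node` generates code for `Node` and then, lazily, for `Leaf`. A second call on any `Node` finds both procedures in the table and generates nothing.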
7.6 Firewalls
The language type rules have not been broken in the browsing program. However, the discovery of the structure class types using the class.identifier procedure has been permitted. The procedure closure has remained sacrosanct and has provided a fire-wall through which this program cannot penetrate. Nevertheless, the need to see inside a closure or, indeed, an activation does arise, for example, when a symbolic debugger is used. The need to see inside such objects also arises when a system is in need of repair. This is seen as being equivalent to the hardware engineer placing probes on a board to identify faults within it. The scheme described does not handle such cases, which are clearly in need of more investigation. It is thought that different levels of object interpretation may be needed in this case.
7.7 Performance
The alternative approaches to the above scheme will now be considered. The first is to halt the system with an error message when a structure class for which no traversal procedure exists is found. The user would then have to write, compile, debug and enter into the table a procedure to traverse the object. The solution outlined above is several orders of magnitude faster than this. The second alternative would be to write the browser in a lower level language - not a viable compromise in terms of software engineering costs.
The procedure shown in example 1 to traverse the class
structure x( int a ; string b ; pntr c )
takes the browser 4.5 seconds of user time to write, compile, enter into the trav.procs table and display the menu on a SUN 3/260. If the procedure is already in the table, the combined time required for lookup and display is less than a sixtieth of a second.
7.8 Persistence
In a conventional programming system the scheme described would be very expensive. The traversal program would have to recreate the traversal procedures in every invocation. In a persistent programming language the table trav.procs may reside in the persistent store and therefore any changes made to the tables will exist as long as they are accessible. Consequently the traverser program never has to recompile traversal procedures. The program in effect learns about new data structures. It does so in a lazy manner, as it only learns how to display the classes that it is actually required to display. This may be viewed pictorially in figure 3.
[diagram: root of persistence, Persistent Store, browser, compiler, trav.procs]
figure 3
7.9 Browser Software Architecture
As the browser evolved it became apparent that it was more important than merely a method of traversing data structures. What had evolved was a new software architecture. The important features of the architecture are:
1. strong ( static ) type checking
2. late ( dynamic ) demand driven binding
3. dynamic linking of code
4. adaptive in nature.
These features are discussed below.
The browser is built entirely using the mechanisms provided by the PS-algol language. The language is statically type checked, apart from the projection out of the infinite union pntr, where dynamic type checking is employed. The procedures written by the browser are type
checked by the callable compiler and only syntactically correct programs are admitted into the architecture.
The system is adaptive in nature. The browser can traverse any data structure composed from any of the infinite number of types in the system. These types do not have to have been declared at the time the browser was written. The system adapts itself to operate on new data structures as required.
The binding of procedures to the architecture is performed extremely late. Indeed, procedures that traverse a particular class are not even written unless required. The storage of procedures in an extensible data structure with dynamic lookup on class is necessary in order to permit the flexibility required. The kind of binding performed here is the weakest possible in a strongly typed system.
The procedures written by the browser and compiled by the callable compiler are dynamically linked into running code. The linking is performed when the procedure which is returned from the callable compiler is entered into the table.
Two brief examples, showing how this architecture has been exploited in the fields of bootstrapping and databases, are discussed below.
7.10 Browsers as a bootstrapping tool
Code files for the Persistent Abstract Machine consist of a heap of objects prefixed by five words of header information. In order to bootstrap the system, such a file must be created by the bootstrap compiler. The problem is the mapping of the complex graph structure, which comprises a Persistent Abstract Machine heap, onto the file system. This task is normally carried out by the POMS, but in the bootstrap system the compiler is written in PS-algol and no Persistent Abstract Machine code is running.
The problem may be solved in a similar way to that used in the browser. A set of rules exists for the creation of valid objects. The object management system must maintain not one but two tables. The first is similar to the one maintained by the browser - that is, a table of output procedures for each object class encountered. The second is an address mapping table. This maps object pointers in the address space of the bootstrap compiler to addresses within the code file. The first table may be persistent but the second table is recreated with the production of every heap.
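The two-table arrangement can be illustrated with a small Python sketch. Everything concrete in it is an assumption made for illustration: the object layout (a count of pointers, the pointer words, then one value word), the zero-filled header, and the names `Node`, `output_procs` and `build_heap`. Only the five-word header length and the roles of the two tables come from the text.

```python
from typing import Any, Callable, Dict, List, Tuple

HEADER_WORDS = 5  # header length from the text; its contents are not given, so zeros are used


class Node:
    """Hypothetical object class: one integer value plus pointers to other nodes."""
    def __init__(self, value: int, refs: "List[Node]" = None):
        self.value = value
        self.refs = list(refs or [])


# First table: per-class output procedures (size, children, emit).
# Unlike the address map, this table could be kept between runs.
OutputProc = Tuple[Callable[[Any], int],
                   Callable[[Any], List[Any]],
                   Callable[[Any, Dict[int, int]], List[int]]]

output_procs: Dict[type, OutputProc] = {
    Node: (
        lambda o: 2 + len(o.refs),
        lambda o: o.refs,
        lambda o, addr: [len(o.refs)] + [addr[id(r)] for r in o.refs] + [o.value],
    ),
}


def build_heap(root: Any, procs: Dict[type, OutputProc]) -> List[int]:
    """Flatten the graph reachable from root into a list of words."""
    addr: Dict[int, int] = {}     # second table: rebuilt for every heap
    order: List[Any] = []
    next_free = HEADER_WORDS
    stack = [root]
    while stack:                  # pass 1: assign a file address to every object
        obj = stack.pop()
        if id(obj) in addr:
            continue              # shared or cyclic reference, already placed
        size, children, _ = procs[type(obj)]
        addr[id(obj)] = next_free
        next_free += size(obj)
        order.append(obj)
        stack.extend(reversed(children(obj)))
    words = [0] * HEADER_WORDS
    for obj in order:             # pass 2: emit objects with translated pointers
        _, _, emit = procs[type(obj)]
        words.extend(emit(obj, addr))
    return words
```

Adding a new object class to a heap then only requires a new entry in `output_procs`; `build_heap` itself is unchanged.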
This solution allows the object formats to be easily changed since they are controlled by the browsing program. No programmer time is required in the production of a heap containing a different set of object classes. Therefore, the decision of which objects are in the heap is inexpensive. The production of a virtual image using this technique is much cheaper than hand building a virtual image or even writing a program to produce a custom made one.
7.11 Adaptive Databases
The technology used in the browser has also been used in the production of a relational database system [coop87]. Traditionally, databases are implemented by creating a canonical relation structure [cod70]. Relations introduced by the programmer of the system are then mapped onto this canonical representation. Relations then require a level of secondary interpretation at run time.
The database system constructed by Cooper uses the techniques first invented in the PS-algol object browser. When a user of the database system defines a new class of relation the system generates a set of creation and selector functions for a data structure. The programs are compiled using the callable compiler and entered into a table. Whenever the relation is accessed the appropriate selector functions are used. In this way each data structure is stored in the most appropriate manner for that relation without the need for secondary interpretation of the data structures. Furthermore the expensive task of programming the movement of objects to and from backing store is performed by the POMS.
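A minimal Python sketch of this idea follows. It is not Cooper's system: the names `relation_procs` and `define_relation` are hypothetical, a tuple is assumed as the stored representation, and `exec` again stands in for the callable compiler.

```python
from typing import Callable, Dict, List

# Table of generated procedures, indexed by relation name.
relation_procs: Dict[str, Dict[str, Callable]] = {}


def define_relation(name: str, attrs: List[str]) -> None:
    """Generate, compile and register creation and selector functions."""
    if not attrs or not all(a.isidentifier() for a in attrs):
        raise ValueError("attributes must be a non-empty list of identifiers")
    params = ", ".join(attrs)
    src = [f"def create({params}):", f"    return ({params},)"]
    for i, a in enumerate(attrs):
        src += [f"def select_{a}(row):", f"    return row[{i}]"]
    namespace: Dict[str, object] = {}
    exec("\n".join(src), namespace)      # stand-in for the callable compiler
    relation_procs[name] = {
        k: v for k, v in namespace.items()
        if callable(v) and not k.startswith("__")
    }
```

Each selector indexes directly into the tuple chosen for its own relation, so no canonical relation structure has to be interpreted when a row is accessed.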
7.12 Conclusions
It has been illustrated how a browser may be written in a closed strongly typed environment. This has been achieved without having to use dynamic typing, or make the requirement that every data structure has to have a printString method as in the Smalltalk-80 system. In the system described the programmer may still write a display procedure manually, thus specializing the program's default action as in the Smalltalk case. It is also possible to have different display formats for objects by having more than one display table. The program is allowed to discover the type of objects, even when the type of an object may have been deliberately hidden by the programmer. This raises the issue of who should be able to break these fire walls. The browser needs to be able to see inside objects if it is to be used as a debugger, but the programmer may not want the contents of, say, an abstract type to be discovered.
The architecture developed in the browser has been explored and two further examples of how the architecture may be exploited have been given.
8 Conclusions
This thesis presents research into the design and construction of persistent programming systems. This work has been performed as part of the Persistent Information Space Architecture ( PISA ) project [ack86b].
The main areas in which research has been performed are :
1. programming language design ;
2. programming language implementation ;
3. compiler construction ;
4. abstract program graphs ; and
5. adaptive object browsers.
8.1 Programming Language Design
The importance of good programming language notations cannot be overstated. The provision of a good notation permits the programmer to concentrate on the complexities of a given task rather than the mapping of that task onto a particular language. Research into programming languages has been carried out using the persistent languages PS-algol and Napier.
The main areas explored in the language domain are:
1. machine independent graphics ;
2. environments ; and
3. polymorphism.
8.1.1 Graphics
When the work documented in this thesis was started, PS-algol had no raster graphics. It did not therefore provide any means of utilising the power of the graphics facilities provided on the then new ICL Perq computers [icl83]. Several experimental language implementations [mor86c,mor86b] were constructed in order to discover how graphics facilities could be integrated with a high level language.
This kind of problem led to the conclusion, reached by myself and others [atk85a], that control over binding mechanisms is extremely important in large persistent information spaces. The datatype environment was introduced in order to provide a mechanism that would allow incremental system construction and change within a large system.
The integration of graphics facilities into a high level language permits sophisticated machine independent user interfaces to be constructed. Graphics objects are language objects with full civil rights; this means that they may be stored in the persistent store and manipulated by procedures. For example, the persistent object browser makes use of the menus provided by the graphics facilities. Menus are held within procedure closures in the persistent store, allowing them to be rapidly displayed when required.
The PS-algol graphics facilities have been used to build the front ends to a number of sophisticated applications including a windowing system [cut87] and an object oriented database with inheritance [ben87].
8.1.2 Environments
Much of the work has involved discovering what special problems arise in persistent systems. One of the problems that emerged early during the research was the need to control complexity in large systems. Indeed, the problems of building large systems have been known for many years.
The way in which objects are bound together in a large system is especially important. During the development of the browser a design flaw was discovered and the knowledge that the browser had gained had to be discarded. This was necessary because of the way that the system had been bound together. In this case, too much static binding had been used, which did not permit enough change.
The environment datatype is a simple mechanism with clean semantics that are easy to understand. Environments provide a way of smoothly integrating the programming language with the programming environment. They also provide a structuring mechanism over the name space which is similar to the structure imposed by directories on a file system. Problems still remain in this area, in particular, how functions like ls in Unix may be expressed in a strongly typed system [atk87].
8.1.3 Polymorphism
The cheapest way to build a software system is to construct it from components already written [brk86]. In order to achieve this, a type system is required that is powerful enough to describe all the objects in a system. Polymorphism provides the mechanism for abstracting over types. However, the search for an all powerful type system is not an end in itself. Having one mechanism, the type system, for checking the legal composition of objects makes the system simpler, with the attendant cost benefits.
8.2 Abstract Machine Design
To design programming languages and not implement them is pointless, yet this often happens. It is only through implementations that engineering lessons are learned. Sometimes paper designs cannot be realised by current implementation technologies and the design has to be revised - this is part of the design process. Much has been learned from implementing PS-algol, the first persistent programming language. Without the implementations of PS-algol, the language Napier would not have evolved. Many important, though small, advances have been made during the research into machine support :
1. modularisation ;
2. uniform object formats ; and
3. efficient implementation of non uniform parametric polymorphism.
8.2.1 Modularisation
The Persistent Abstract Machine, like all the components of the Napier system, is constructed in a modular fashion. Each layer in the machine presents a well defined interface. This has two main benefits: the first is in maintenance costs, the second is as a research vehicle.
Parnas cites information hiding as one of the most effective ways of avoiding rework [par79]. The PS-algol machine has proven expensive to maintain. This is partly due to its size and partly due to its complexity. Much of this complexity has arisen in this machine due to its nature - that of a research vehicle.
The Napier support environment is also a large, complex piece of software, and as such requires maintenance. It is hoped that the modular design of the architecture will result in lower costs in the future. More importantly, the modular design of the architecture allows experimentation into language implementations to be performed independently. For example, it is possible to change the persistent object management strategy without changing the Persistent Abstract Machine. This will allow the interactions between different parts of the system to be explored.
8.2.2 Uniform Object Format
One of the biggest advances in the Persistent Abstract Machine is a simple one. The abstract machine has no knowledge of the type systems of the languages that it implements. One result of this is that the machine has a uniform object format. The heap is the only dynamic storage system supported by the system. Objects are partitioned into pointer and non-pointer fields, minimising the potentially high cost of garbage collection and persistent object management.
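The benefit of such a partition can be seen in a small Python sketch of a mark phase. It is an illustration only and does not reproduce the Persistent Abstract Machine's actual object layout: the heap is modelled as a dictionary from address to a pair of (pointer fields, non-pointer bytes), and the names `HeapObject` and `mark` are hypothetical.

```python
from typing import Dict, List, Set, Tuple

# Assumed uniform format: every object is (pointer fields, non-pointer fields).
HeapObject = Tuple[List[int], bytes]


def mark(heap: Dict[int, HeapObject], roots: List[int]) -> Set[int]:
    """Return the addresses reachable from roots, using no type information."""
    marked: Set[int] = set()
    stack = list(roots)
    while stack:
        address = stack.pop()
        if address in marked or address not in heap:
            continue
        marked.add(address)
        pointers, _ = heap[address]   # the non-pointer part is never inspected
        stack.extend(pointers)
    return marked
```

Because every object exposes its pointers in the same way, the traversal never needs to consult the type system of the language whose objects it is scanning.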
8.2.3 Parametric Polymorphism
The parametric polymorphism provided by Napier has a large impact on the machine design. One consequence of polymorphism is that the compiler cannot tell statically how big a polymorphic object is or whether it is a pointer type or not.
An efficient implementation of first class polymorphic procedures has been achieved without adversely affecting the performance of non-polymorphic ones. The implementation is novel in that it implements parametric polymorphism for non-uniformly sized objects. The technique makes use of the block retention architecture provided by the Persistent Abstract Machine. The technique may be extended without modification to support a powerful notion of abstract types. It is thought that this mechanism may also be used to support inclusion polymorphism.
8.3 Compiler construction
During the development of the Napier language many benefits have emerged from using the persistent store as a compiler construction vehicle. The most obvious benefits are: