2. State-Saving in Rule-Based Systems
3.1. Design of a Functional Rule-Based System
One of the main investigations of this thesis was the analysis of the design and implementation of a large functional application. The process of building large applications in imperative languages is well known, but has its own problems. However, the process for functional languages is not well documented.
Early research indicated that there are problems which arise in functional programs that are not evident in imperative programs. These problems are the issues of (i) the manipulation of state and the related issue of store, and (ii) doing input and output. In imperative systems there are variables that hold values of state and which may be accessed or updated arbitrarily. Many imperative languages use lexical scoping to limit access to variables, but global variables are accessible everywhere. Each variable has a position in the computer’s store and may be accessed and updated. A procedure in an imperative language may access and update the variables even if the variables have not been passed as an argument. Similarly, input and output in imperative languages can be done in arbitrary places. The input and output streams are part of a global environment that can be easily accessed without explicit mention of them if used in a function.
The functional rule-based system was designed with five main parts. Three components constitute the recognize-act cycle: the matcher, conflict resolution, and act. A compiler translates the textual form of the rules into a form used by the matcher and the act process. A run-time system provides the infrastructure to glue the other four parts together.
Initially there seems to be a problem with retaining and updating state for both production memory and working memory. Functional languages do not provide updatable global variables, so how is it possible to implement a system which is inherently state-saving? The matcher needs access to both production and working memory, conflict resolution needs access to a selected subset of both production and working memory, and the act process needs to change the contents of working memory. An answer to this question will be seen in this chapter.
In the traditional imperative model, much of the global state is available in all parts of the system. In addition, any part of the state can be updated at any time, regardless of whether or not it is appropriate to that part of the system. This method of updating allows bugs to be easily introduced, although object-oriented techniques provide a discipline which reduces this problem [Stroustrup86]. Because functional systems do not have a global environment which can be accessed at any time, any items of state that
system has implicit access to state.
In the functional implementation, the run-time system of the rule-based system passes state explicitly from one part of the system to another, removing the need for any global updatable state. Because no part of the system needs access to everything held in the state, the relevant items can be passed to any part of the system. For example, the match phase of the main cycle only needs access to the production memory and the working memory. No other items in the state are needed and no others are passed on.
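The selective passing described above can be illustrated with a minimal Haskell sketch. The record `SystemState`, its fields, the `cycleCount` item, and the trivial equality-based `match` are all hypothetical stand-ins that do not come from the thesis; the only point is that the matcher's type mentions production memory and working memory and nothing else.

```haskell
-- Hypothetical types; the thesis does not give these definitions.
type Production = String
type WMElement  = String

-- The whole state, held only by the run-time system.
data SystemState = SystemState
  { prodMem    :: [Production]
  , workMem    :: [WMElement]
  , cycleCount :: Int            -- an item the matcher never needs
  }

-- The matcher receives only the two memories, never the whole state.
-- Equality stands in for real pattern matching (an assumption).
match :: [Production] -> [WMElement] -> [(Production, WMElement)]
match pm wm = [ (p, w) | p <- pm, w <- wm, p == w ]

-- The run-time system selects the relevant items and passes them on.
runMatch :: SystemState -> [(Production, WMElement)]
runMatch st = match (prodMem st) (workMem st)
```

Because `match` cannot name `cycleCount`, the type checker itself guarantees that the match phase neither reads nor alters that part of the state.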
Another aspect of passing explicit state in functional languages which is not seen in imperative languages is the need to plumb in the state. State has to be passed explicitly from function to function, just as water pipes are passed from room to room in a central heating system. Consider the example:
work :: (a -> b) -> [a] -> [b]
work f l = [f a | a <- l, test a]
Suppose we wish to count the number of times f is applied to its argument. In an imperative language, it would be possible to add a line of code which updated the state of a global variable and the type of the function would not need to change. In a functional language this technique cannot be used. The state has to be made explicit, thereby changing the type of the function to:
type State = Int
works :: (a -> b) -> ([a], State) -> ([b], State)
works f (l, s) = (list, s + sum statevals)
    where
    (list, statevals) = unzip [(f a, 1) | a <- l, test a]
This explicit change of the type and the extra code has to be done by design; it cannot be added as an afterthought. This is plumbing.
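For readers who want to run the example, the following is a compilable Haskell rendering of the two functions. The text leaves `test` undefined, so the sketch assumes a hypothetical `test = even`, which in turn fixes the element type to `Int` rather than the polymorphic `a` of the original signatures.

```haskell
type State = Int

-- Hypothetical filter; the original text does not define `test`.
test :: Int -> Bool
test = even

-- Original version: no record of how often f is applied.
work :: (Int -> b) -> [Int] -> [b]
work f l = [ f a | a <- l, test a ]

-- Plumbed version: the counter enters and leaves through the type.
works :: (Int -> b) -> ([Int], State) -> ([b], State)
works f (l, s) = (list, s + sum statevals)
  where
    -- Each application of f contributes 1 to the count.
    (list, statevals) = unzip [ (f a, 1) | a <- l, test a ]
```

Applying `works (* 2)` to `([1, 2, 3, 4], 0)` yields `([4, 8], 2)`: two elements pass `test`, so `f` is applied twice. Every caller of `works` must now supply and receive the counter, which is why the change cannot be confined to a single line.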
Figure 3.1 shows how the five main parts of the system fit together. The run-time system retains all the state and then passes the appropriate items to other parts of the system. The details of the items passed to each part are described in section 3.2, but
only some state items are needed in each part of the system. In figure 3.1, pm represents production memory and wm represents working memory.
[Figure 3.1 diagram: boxes for the compiler, match, conflict resolution, act, and the run-time system; arrow labels include filename, productions, pm, wm elems, conflict set, production, and env'.]
Figure 3.1: How the functional rule-based system fits together
The run-time system is the interface to the outside world, thus providing a mechanism for doing input and output. A large part of the design was a compiler that would recognize a language which specifies the rules for the rule-based system. Input to the compiler is in a textual form. Output from the compiler is in a form used by the match process, namely a list of productions which are saved in production memory. This requires interaction with the state-saving mechanism.
The match function takes the current working memory and current production memory and does an exhaustive match by matching every clause of every production against every working memory element. The result of this function is a conflict set, which is returned to the run-time system. The conflict set is passed through a conflict resolution function which selects one production to execute. The selected production together with the whole of working memory is passed to the act function, which executes the production and updates the working memory by either adding and deleting elements or doing input and output. The new working memory is passed back to the run-time system for the next iteration of the recognize-act cycle.
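The cycle described above can be sketched in Haskell as follows. This is an illustration under stated assumptions, not the thesis implementation: the `Production` record, the single-clause `condition`, the first-match `resolve` strategy, and the omission of input and output are all simplifications introduced here.

```haskell
-- Hypothetical representations, chosen only to keep the sketch small.
type WMElement = Int

data Production = Production
  { condition :: WMElement -> Bool                           -- one clause
  , action    :: WMElement -> [WMElement] -> [WMElement]     -- new wm
  }

type ConflictSet = [(Production, WMElement)]

-- Exhaustive match: every production against every wm element.
match :: [Production] -> [WMElement] -> ConflictSet
match pm wm = [ (p, w) | p <- pm, w <- wm, condition p w ]

-- Conflict resolution: simply the first instantiation (an assumption).
resolve :: ConflictSet -> Maybe (Production, WMElement)
resolve []      = Nothing
resolve (c : _) = Just c

-- Act: execute the chosen production, returning the new working memory.
act :: (Production, WMElement) -> [WMElement] -> [WMElement]
act (p, w) wm = action p w wm

-- The run-time system threads working memory through each iteration
-- and stops when the conflict set is empty.
run :: [Production] -> [WMElement] -> [WMElement]
run pm wm = case resolve (match pm wm) of
  Nothing -> wm
  Just c  -> run pm (act c wm)
```

Working memory appears as both an argument and a result of `act`, and `run` passes the result into the next iteration, so no updatable global store is needed.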