Chapter 12 Formal Specification
Rule 1. Create_object (o, c):
12.6 METHODS OF DECOMPOSITION
At one extreme, you can have a specification that is very abstract and closely resembles the model (as does our example); in such instances you must deal with the difficult task of convincingly demonstrating the correspondence between the code and the specification. At a much more detailed level, the specification might closely match the operations visible at the interface to the system—function for function, and parameter for parameter. Such a specification will be very complex and unreadable, and a formal proof that it corresponds to the model may be impractical. These alternatives are shown qualitatively in figure 12-7. At an even more detailed extreme, the specification represents the internal procedures of the system rather than the visible interface. The correspondence proof to the model may be extremely difficult (or at least no easier than the second case), but the correspondence to the code may be close enough to permit a partial proof.
Several specification techniques deal with these large differences in levels of abstraction in various ways. They correspond, roughly, to the techniques used in FDM, old HDM, and Gypsy, although some techniques are used by more than one methodology.
12.6.1 Data Structure Refinement
The data structure refinement method, used in our example and in FDM, employs a refinement of detail at different levels of abstraction. Each layer of specification is a state machine that completely describes the system. The top layer is highly abstract and combines multiple data types, variables, and functions into a few simple functions. The second layer adds more detail, possibly dividing generic functions about subjects and objects at the top layer into specific functions about specific types of objects. Once the second layer is written and has been shown to map into the upper layer (in the sense that we mapped the specification into the model in our example), the upper-layer specification is no longer needed. The second layer is a more concrete description of the system and, when proved to satisfy the mapped invariants and constraints, satisfies the same security properties as the top layer.

Figure 12-7. Extremes of Specification Detail. A detailed specification will make the code correspondence simpler but the formal proof harder (and maybe impractical), whereas a highly abstract specification will make the code correspondence impractical or unconvincing. [The figure shows three cases, each placing a specification between the implementation and the model: an abstract specification (unconvincing argument to the implementation, easy proof to the model); a detailed interface specification (easy argument, hard proof); and a procedure specification (proof? to the implementation, very hard proof to the model).]
Similarly, we can add more detail at the next-lower layer and have yet more functions. Once we add a layer, do the mappings to the upper layer, and complete the proofs, we no longer need the upper layers (unless we someday need to modify and re-prove the lower layer). The bottom layer (the one closest to the implementation) may closely correspond to variables and functions in the code, making it a very precise and detailed description of the interface to the system and a specification from which designers can implement a system.
The data structure refinement technique does not provide you with any clues for designing the internals of the system. The lowest level of specification only describes the system interface; it says nothing about the design. Making a credible code correspondence argument that the underlying software accurately implements this specification requires traditional software engineering techniques such as code inspection and testing.
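The relationship between two layers can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names `TopState`, `SecondState`, `map_up`, `top_invariant`, and `open_file_for_read` are hypothetical, access classes are modeled as plain integers, and the invariant is checked at run time on one state rather than proved for all states as a real methodology would require.

```python
from dataclasses import dataclass, field


@dataclass
class TopState:
    """Top layer: generic subjects and objects (hypothetical example)."""
    object_class: dict = field(default_factory=dict)   # object -> access class
    subject_class: dict = field(default_factory=dict)  # subject -> access class
    readers: set = field(default_factory=set)          # (subject, object) pairs


@dataclass
class SecondState:
    """Second layer: generic objects are divided into files and mailboxes."""
    file_class: dict = field(default_factory=dict)
    mailbox_class: dict = field(default_factory=dict)
    subject_class: dict = field(default_factory=dict)
    file_readers: set = field(default_factory=set)      # (subject, file name)
    mailbox_readers: set = field(default_factory=set)   # (subject, mailbox name)


def map_up(s: SecondState) -> TopState:
    """Mapping: express a second-layer state in top-layer terms."""
    objects = {("file", n): c for n, c in s.file_class.items()}
    objects.update({("mailbox", n): c for n, c in s.mailbox_class.items()})
    readers = {(subj, ("file", n)) for subj, n in s.file_readers}
    readers |= {(subj, ("mailbox", n)) for subj, n in s.mailbox_readers}
    return TopState(objects, dict(s.subject_class), readers)


def top_invariant(t: TopState) -> bool:
    """Top-layer invariant: a subject reads only objects at or below its class."""
    return all(t.subject_class[s] >= t.object_class[o] for s, o in t.readers)


def open_file_for_read(s: SecondState, subj: str, name: str) -> bool:
    """Second-layer function: a file-specific form of a generic read request."""
    if s.subject_class[subj] >= s.file_class[name]:
        s.file_readers.add((subj, name))
        return True
    return False


# Usage: after any sequence of second-layer calls, the mapped state
# should still satisfy the top-layer invariant.
state = SecondState(file_class={"payroll": 2, "memo": 0},
                    subject_class={"alice": 1})
open_file_for_read(state, "alice", "memo")
open_file_for_read(state, "alice", "payroll")
assert top_invariant(map_up(state))
```

The point of the sketch is the direction of the mapping: the lower layer carries more detail, and its states are translated upward so that the upper layer's invariants can be restated and checked against it.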
12.6.2 Algorithmic Refinement
In contrast to the data structure refinement technique, whose lowest-layer specification presents the external view of the system, the algorithmic refinement technique, used in HDM and illustrated in table 12-1, allows you to specify some of the internal structure of the system. The technique most directly applies to systems designed with internal layers, as discussed in section 11.1. The technique views a system as a series of layered abstract state machines. Each machine makes available a set of functions for use by the machine above. The implementation of each function in a machine consists of an abstract program that calls functions in the machine below. (For simplicity, only call statements are shown in the programs in the table, but in general the programs may contain the usual semantics of programming languages.) The lowest-level machine provides the most primitive functions of the system: those that cannot be further decomposed.
The abstract machine concept is best illustrated with an example of a three-layer machine implementing a file system (table 12-2). The bottom, most primitive machine (machine 0) knows only about disks, disk blocks, and memory. It provides a few primitive functions, such as
disk_block_read (disk_name, block_address, buffer_address)
and knows nothing about the concept of files or access control.
Machine 1 provides a primitive flat file system, with functions typical of a file system manager:
file_descriptor = open(file_index)
file_read(file_descriptor, offset, buffer)
where file_index is simply an integer pointing to the file on disk. The implementation of functions in machine 1 consists of abstract programs that use the functions of machine 0 to create a file system out of disk blocks, using file indexes (stored on disk blocks) to keep track of files.
Layer  Machine                 Formal Specifications    Abstract Programs
       (interface to system)
                ↓
N      top-level machine       func A, func B           proc A[N]:   call A[N-1]; call C[N-1]; return
       (interface                                       proc B[N]:   call B[N-1]; call A[N-1]; return
       specification)
N-1    intermediate machine    func A, func B, func C   proc A[N-1]: call A[N-2]; call C[N-2]; return
                                                        proc B[N-1]: call B[N-2]; call A[N-2]; call A[N-2]; return
                                                        proc C[N-1]: call C[N-2]; return
N-2    intermediate machine                             proc A[N-1], proc B[N-1], proc C[N-1]
...    ...                     ...                      ...
1      intermediate machine                             proc A[1], proc B[1]
0      primitive machine                                proc A[0], proc B[0]

Table 12-1. Algorithmic Refinement. The approach of specifying layered abstract machines allows the internal structure of a system (below the top-level interface) to be modeled. The top-level machine provides the functions visible at the interface to the system.
Machine 2 implements a hierarchical file system containing directories and files within directories. It provides file names as strings of characters and functions for access control to files. It implements directories (using files in machine 1) that store names of files and access control information.
In the algorithmic refinement technique, the highest-layer machine implements the interface to the system as it appears to users. Each function call at the interface results in a possible cascade of calls to lower-layer machines.
Abstract Machine   Data Structures    Functions
Machine 2          Files              Create/delete files/directories
                   Directories        Read/write files
                                      Access control functions
Machine 1          Files              Create/delete files
                   File descriptors   Read/write files
Machine 0          Disk blocks        Read/write disk blocks

Table 12-2. Example of Three-Machine System. The higher-level machines provide increasingly complex file system functions.
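The three-machine file system can be sketched as three small Python classes, each written only in terms of the machine below it. This is a rough illustration, not the specification language of any methodology. The class names, the 16-byte block size, and the method signatures are assumptions: `file_read` returns bytes instead of filling a caller's buffer, and the file index and directory are kept in memory here, whereas the text stores them on disk blocks and in machine 1 files respectively.

```python
BLOCK_SIZE = 16  # assumed block size, kept small for illustration


class Machine0:
    """Primitive machine: knows only about disk blocks."""

    def __init__(self, n_blocks: int):
        self.blocks = [bytes(BLOCK_SIZE) for _ in range(n_blocks)]

    def disk_block_read(self, block_address: int) -> bytes:
        return self.blocks[block_address]

    def disk_block_write(self, block_address: int, data: bytes) -> None:
        # Pad or cut the data to exactly one block.
        self.blocks[block_address] = data.ljust(BLOCK_SIZE, b"\0")[:BLOCK_SIZE]


class Machine1:
    """Flat file system: files are named by an integer file index."""

    def __init__(self, m0: Machine0):
        self.m0 = m0
        self.index = {}       # file_index -> list of block addresses
        self.open_files = []  # file descriptor -> file_index
        self.next_free = 0    # next unallocated block address

    def create(self, file_index: int, n_blocks: int) -> None:
        self.index[file_index] = list(
            range(self.next_free, self.next_free + n_blocks))
        self.next_free += n_blocks

    def open(self, file_index: int) -> int:
        self.open_files.append(file_index)
        return len(self.open_files) - 1

    def file_write(self, fd: int, offset: int, data: bytes) -> None:
        blocks = self.index[self.open_files[fd]]
        for i, byte in enumerate(data):
            pos = offset + i
            addr = blocks[pos // BLOCK_SIZE]
            block = bytearray(self.m0.disk_block_read(addr))
            block[pos % BLOCK_SIZE] = byte
            self.m0.disk_block_write(addr, bytes(block))

    def file_read(self, fd: int, offset: int, length: int) -> bytes:
        blocks = self.index[self.open_files[fd]]
        out = bytearray()
        for pos in range(offset, offset + length):
            block = self.m0.disk_block_read(blocks[pos // BLOCK_SIZE])
            out.append(block[pos % BLOCK_SIZE])
        return bytes(out)


class Machine2:
    """Named files with access control, built on machine 1."""

    def __init__(self, m1: Machine1):
        self.m1 = m1
        self.directory = {}  # path -> (file_index, set of permitted readers)
        self.next_index = 0

    def create_file(self, path: str, readers: set, n_blocks: int = 1) -> None:
        self.m1.create(self.next_index, n_blocks)
        self.directory[path] = (self.next_index, set(readers))
        self.next_index += 1

    def write_file(self, path: str, data: bytes) -> None:
        file_index, _ = self.directory[path]
        self.m1.file_write(self.m1.open(file_index), 0, data)

    def read_file(self, user: str, path: str, length: int) -> bytes:
        file_index, readers = self.directory[path]
        if user not in readers:
            raise PermissionError(f"{user} may not read {path}")
        return self.m1.file_read(self.m1.open(file_index), 0, length)


# Usage: one call at the top interface cascades down to disk block reads.
fs = Machine2(Machine1(Machine0(8)))
fs.create_file("/home/alice/notes", {"alice"}, n_blocks=2)
fs.write_file("/home/alice/notes", b"layered abstract machines")
print(fs.read_file("alice", "/home/alice/notes", 7))
```

Note that machine 2 never touches a disk block directly and machine 0 has no notion of a file or a user, which mirrors the division of knowledge shown in table 12-2.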
When you write a specification using this technique, you write two things for each abstract machine: a formal state-machine specification that resembles a single-layer specification of the sort used in the data structure refinement technique; and an abstract program for each function in the machine, providing an algorithmic description of the function in terms of calls to functions in the lower-layer machine. Code correspondence proofs using a specification such as this require proving that the abstract programs at all layers correspond to the real programs in the system.
Proof of a specification developed with these techniques first requires proving that the highest-layer machine specification corresponds to the model, in a manner identical to the one used to prove a specification in the data structure refinement technique. Then, in a manner analogous to (but mechanically quite different from) proving the consistency of mappings between layers, we must prove that the abstract program for the highest-layer machine correctly implements its specification, given the specification of the functions of the next-lower-layer machine. The process is repeated down to the lowest layer, at which point we must assume that the specification of the lowest layer primitive machine is implemented correctly. In the overall proof, it is necessary to specify how data structures in each machine are mapped onto data structures in the next-lower machine.
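The obligation at each layer can be illustrated with a toy check in Python. The sketch below is exhaustive testing over a deliberately tiny state space, not a proof, and it stands in for what a verification system would establish for all states. All names (`spec_file_read`, `prog_file_read`, `map_state`, `check_refinement`) and the two-byte block size are assumptions made for the example.

```python
from itertools import product

BLOCK = 2  # bytes per block; tiny so that every state can be enumerated


def disk_block_read(disk, addr):
    """Lower machine: its only primitive function."""
    return disk[addr]


def spec_file_read(contents, offset):
    """Upper-machine specification: the file is a flat string of bytes."""
    return contents[offset]


def map_state(disk, block_list):
    """Data structure mapping: file contents are the listed blocks, in order."""
    return b"".join(disk[a] for a in block_list)


def prog_file_read(disk, block_list, offset):
    """Abstract program: implements the read using lower-machine calls only."""
    block = disk_block_read(disk, block_list[offset // BLOCK])
    return block[offset % BLOCK]


def check_refinement():
    """Compare program and specification on every small state and input."""
    block_values = [bytes(p) for p in product(range(2), repeat=BLOCK)]
    checked = 0
    for disk in product(block_values, repeat=3):         # every 3-block disk
        for block_list in product(range(3), repeat=2):   # every 2-block file
            contents = map_state(disk, block_list)
            for offset in range(len(contents)):
                assert (prog_file_read(disk, block_list, offset)
                        == spec_file_read(contents, offset))
                checked += 1
    return checked


print(check_refinement(), "cases agree")
```

The three ingredients named in the text all appear: the specification of the upper function, the abstract program written against the lower machine, and the mapping of upper-machine data structures onto lower-machine ones. The lower machine's own function is simply assumed correct, as the text requires for the primitive machine.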
Each layer in the real system corresponds to a layer of the specification, with functions that closely match the functions in the abstract programs. As a result, it should be much easier to argue for correspondence between the specification and the code in this case than if you had only an interface specification, as in the data structure refinement technique. In fact, it has been proposed (but never proved) that someday it might be possible to write a translator that converts an abstract program into a computer-language program.
Unfortunately, the algorithmic refinement technique suffers from several drawbacks that make its use a bit more theoretical than practical (though pieces of practical systems have been developed using this technique and show promise for the near future). The primary drawback is the difficulty involved in carrying out proofs of the abstract algorithms. It is much more difficult to prove an algorithm than to prove a mapping, and such a proof becomes intractable for all but fairly small algorithms. Abstract program proofs differ little from concrete program proofs; the only reason there is greater hope of proving abstract programs is that these programs can be written in a highly restricted language that need not deal with many details of real programming.
Another drawback—this one far from fatal—is that the top-level specification is quite complex because it represents the real interface to the system. Because the specification is so close to the real system, proving its correspondence directly to the model has all the same problems with level of detail that we faced with the data structure refinement technique, where we proposed a single very detailed specification between the model and the code (the leftmost extreme of figure 12-7).
The reason this second drawback is not fatal is that nothing restrains us from applying the multiple levels of the data structure refinement technique above the top-level abstract machine (fig. 12-8). Using this method, we can have the benefits of both worlds, but do not go to your corner software store looking for an off-the-shelf system that implements this combination of techniques, at least not for a few years.
12.6.3 Procedural Abstraction
Gypsy’s specification technique might be called procedural abstraction. Gypsy directly models the way a system is implemented: as a set of nested procedure calls. As in the algorithmic refinement technique, each function in a Gypsy specification is equivalent to a function in the implementation, but Gypsy does not require the system to be built in layers, as does HDM. The specification of a Gypsy function describes how the function manipulates its arguments, not how the function affects a global state of the system. Gypsy goes further than HDM and FDM in allowing you to specify the functions of every internal procedure in the system, not just the interface to the system or to each layer.
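The idea of specifying each procedure by its effect on its own arguments can be approximated in Python with entry and exit conditions attached to every procedure. This is not Gypsy syntax, and Gypsy proves such conditions rather than checking them while the program runs. The decorator `specified` and the procedures `truncate` and `merge_truncated` are hypothetical names introduced only for this sketch.

```python
from functools import wraps


def specified(entry, exit_cond):
    """Attach an entry condition and an exit condition to a procedure.

    entry(*args) must hold before the call; exit_cond(result, *args)
    must hold afterward. Neither refers to any global system state.
    """
    def decorate(proc):
        @wraps(proc)
        def checked(*args):
            assert entry(*args), f"entry condition of {proc.__name__} violated"
            result = proc(*args)
            assert exit_cond(result, *args), (
                f"exit condition of {proc.__name__} violated")
            return result
        return checked
    return decorate


@specified(
    entry=lambda items, limit: limit >= 0,
    exit_cond=lambda result, items, limit: (
        len(result) <= limit and result == items[:len(result)]),
)
def truncate(items, limit):
    """Internal procedure: keep at most `limit` leading items."""
    return items[:limit]


@specified(
    entry=lambda a, b, limit: limit >= 0,
    exit_cond=lambda result, a, b, limit: len(result) <= limit,
)
def merge_truncated(a, b, limit):
    """Outer procedure: relies on the specification of the nested call."""
    return truncate(a + b, limit)


print(merge_truncated([1, 2], [3, 4], 3))
```

Both the outer procedure and the internal one it calls carry their own specification, which is the sense in which this style reaches every internal procedure and not just the external interface.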
Because Gypsy specifications are so closely aligned to the code (in fact, the Gypsy language includes a PASCAL-like programming language), Gypsy might be viewed as more a program-proving system than a specification system. But Gypsy does permit you to write specifications without code and to prove abstract properties about those specifications without writing the programs. When used in this manner, the specification for the set of top-level procedures accessible from outside the system resembles the specifications for the top-layer interface machine in HDM and for the bottom-layer interface in FDM.

Figure 12-8. Combination of Specification Techniques. Though not yet demonstrated in practice, a merge of both the data structure refinement and algorithmic refinement techniques can achieve the benefits of both. [The figure shows a data structure refinement stack (top-level, intermediate-level, and bottom-level specifications) above an algorithmic refinement stack (top-level, intermediate-level, and bottom-level abstract machines), with the bottom-level specification equated to the top-level abstract machine.]