Verification - Cache coherency protocols - Arrows for knowledge based circuits

6.6 Cache coherency protocols

6.6.2 Verification

We first verified a battery of sanity properties of the model, such as exclusiveness of ownership, the possibility of completing memory operations without bus operations, liveness of the processors and memory, and that a cache always knows the value of the memory line when it completes a memory operation but can be ignorant of it at other times.
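To illustrate the first of these properties, here is a minimal sketch of checking exclusiveness of ownership by exhaustive exploration of a toy transition system. The state encoding (one ownership flag per cache), the `successors` relation and every name are assumptions made for this example. It is not the circuit model verified in the text, which is checked symbolically.

```python
from collections import deque

N_CACHES = 3  # hypothetical number of caches


def successors(state, broadcast):
    """Toy bus. A state is a tuple of per-cache ownership flags."""
    yield state  # idle step: no bus operation
    for i in range(N_CACHES):
        if broadcast:
            # every cache observes the request, so all others relinquish
            yield tuple(j == i for j in range(N_CACHES))
        else:
            # nobody else observes the request, so stale owners remain
            yield tuple(owns or j == i for j, owns in enumerate(state))
    if any(state):
        yield (False,) * N_CACHES  # the owner copies the line back


def reachable(initial, broadcast):
    """Breadth-first exploration of the reachable states."""
    seen, frontier = {initial}, deque([initial])
    while frontier:
        state = frontier.popleft()
        for nxt in successors(state, broadcast):
            if nxt not in seen:
                seen.add(nxt)
                frontier.append(nxt)
    return seen


def ownership_is_exclusive(broadcast):
    """The invariant AG(at most one owner), checked on all reachable states."""
    initial = (False,) * N_CACHES
    return all(sum(state) <= 1 for state in reachable(initial, broadcast))
```

In this toy the invariant holds when acquisitions are broadcast and fails when they are not, which is the kind of error such a sanity check is meant to catch.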

Our main correctness property is that the processors have a suitable view of memory. In general memory consistency is difficult to specify and somewhat subjective; it is now common for higher-performance weak consistency models to push some of the problem back into software by loosening the ordering of reads and writes as viewed by different processors. Steinke and Nutt (2004) develop a theory that accounts for many consistency models.

In this case we can show sequential memory consistency, a concept due to Lamport (1979). According to Alur, McMillan, and Peled (2000, §2.2):

The intuition behind sequential consistency ... is that an implementation of a collection of concurrent objects should appear to be correct to an observer that is able to record the history of each individual process, but has no global clock by which to determine the relative order of events of different processes.

We might say that in general consistency need only respect causality (§4.3.1).
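The quoted intuition can be made concrete for a single finite history. The following is a minimal sketch, assuming a single memory line, a hypothetical encoding of each process's history as a sequence of `('r', value)` and `('w', value)` pairs, and an initial value of 0. It searches for one interleaving that respects every process's program order and in which each read returns the most recent write. The search is exponential in the worst case, and it checks one recorded history rather than a whole system.

```python
from functools import lru_cache


def sequentially_consistent(histories, initial=0):
    """Search for a witness interleaving of the per-process histories."""
    histories = tuple(tuple(h) for h in histories)

    @lru_cache(maxsize=None)
    def search(positions, current):
        # positions[k] is how many operations of process k are scheduled
        if all(p == len(h) for p, h in zip(positions, histories)):
            return True
        for k, (p, h) in enumerate(zip(positions, histories)):
            if p == len(h):
                continue  # process k has nothing left to schedule
            op, value = h[p]
            if op == 'r' and value != current:
                continue  # this read cannot be scheduled next
            advanced = positions[:k] + (p + 1,) + positions[k + 1:]
            if search(advanced, value if op == 'w' else current):
                return True
        return False

    return search((0,) * len(histories), initial)


# One process writes 1. The other may see 0 then 1, but not 1 then 0.
FORWARDS = sequentially_consistent([[('w', 1)], [('r', 0), ('r', 1)]])
BACKWARDS = sequentially_consistent([[('w', 1)], [('r', 1), ('r', 0)]])
```

Note that the observer needs no global clock: only the order within each process is fixed, and the relative order of events of different processes is whatever the witness interleaving chooses.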

Alur et al. (2000) go on to argue that verifying the sequential consistency of a finite-state system is undecidable, and conclude that each sequentially-consistent finite-state system obeys some stronger property. In our model we have the very strong property that a processor always reads the value most recently written by any processor. We break this assertion into two cases, one for each possible value of the line. In the positive case we have:

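\[
\mathbf{AG}\Bigl(\bigwedge_{i \leftarrow \mathit{caches}} \mathit{procwrite}_i \land \mathit{procval}_i \land \mathit{proccont}_i \longrightarrow \mathbf{AX}\bigl(\mathbf{A}\bigl[\bigl(\bigwedge_{j \leftarrow \mathit{caches}} \mathit{procread}_j \land \mathit{proccont}_j \longrightarrow \mathit{procval}_j\bigr) \;\mathbf{U}\; \bigl(\bigvee \cdots\bigr)\bigr]\bigr)\Bigr)
\]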

and similarly for the negative case. Intuitively we have that, after completing a write of a one, in the next state all processors read a one until some processor completes the writing of a zero. We use an unbounded semantics for the until modality; it may be that the processors stop writing to memory at some point, and so we do not require the standard eventuality condition.
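An until without the eventuality condition is commonly called a weak until. Writing it as W to distinguish it from the standard until (a notational assumption made only for this remark), the usual identities relating the two are:

```latex
\begin{align*}
\varphi \mathbin{\mathbf{W}} \psi
  &\equiv (\varphi \mathbin{\mathbf{U}} \psi) \lor \mathbf{G}\,\varphi
  && \text{(on a single path)} \\
\mathbf{A}[\varphi \mathbin{\mathbf{W}} \psi]
  &\equiv \lnot\, \mathbf{E}[\lnot\psi \mathbin{\mathbf{U}} (\lnot\varphi \land \lnot\psi)]
  && \text{(in CTL)}
\end{align*}
```

The second identity says that the weak until fails on some path exactly when the left operand is violated at a point before the right operand has ever held, which is why a path on which no further write completes still satisfies the property.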

It is also the case that if a clean cache knows the value of the line then all caches do, and it is this property that prevents us from modelling the exclusive unmodified (E) state in the MOESI classification. In fact we can show that the memory register value is always common knowledge to all the caches, which we demonstrate by adding the test

kTest (cknows_caches "memory value")

to one of the caches and verifying that it is always true. This result depends crucially on perfect recall as caches do not record the state of the line in their local states unless they are the owner. Thus it does not hold under the observational semantics for knowledge (§2.3.2).
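The dependence on perfect recall can be seen in a two-run toy system. This is a minimal sketch under assumptions of our own: a global state is a hypothetical pair `(value, on_bus)`, a single agent observes the value only while it is on the bus, and knowledge is evaluated by quantifying over every point of every run that the agent's view cannot distinguish from the current one. None of these names come from the model in the text.

```python
def observe(state):
    """The agent sees the value only while it is broadcast on the bus."""
    value, on_bus = state
    return value if on_bus else None


# Two runs of length two: the value is broadcast at time 0, then hidden.
RUNS = [[(v, True), (v, False)] for v in (0, 1)]


def observational(run, t):
    """Local state is just the current observation."""
    return observe(run[t])


def perfect_recall(run, t):
    """Local state is the whole sequence of observations so far."""
    return tuple(observe(s) for s in run[:t + 1])


def knows(view, run, t, fact):
    """fact holds at every point with the same view as (run, t)."""
    here = view(run, t)
    return all(fact(r[u])
               for r in RUNS for u in range(len(r))
               if view(r, u) == here)


def knows_value(view, run, t):
    """The agent knows that the value is 0, or knows that it is 1."""
    return any(knows(view, run, t, lambda s, v=v: s[0] == v) for v in (0, 1))


RUN_ONE = RUNS[1]
RECALL_KNOWS = knows_value(perfect_recall, RUN_ONE, 1)
OBSERVATION_KNOWS = knows_value(observational, RUN_ONE, 1)
```

At time 1 the agent with perfect recall still knows the value it saw at time 0, whereas the observational agent, whose local state is blank in both runs, does not.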

This latter property is not surprising as the central memory simply records what everyone sees. However it also shows that this component is redundant in this model, which is clearly not true of any realistic shared memory system. We discuss this issue and the closely related problem of modelling cache flushes in the next section.

6.6.3 Concluding remarks

In contrast to the parametrised, compositional model we have described here, Baukus and van der Meyden (2004) manually expand the asynchronous composition of the memory process and two cache controllers. This is necessary as the modelling language of MCK lacks facilities for describing asynchronous systems, and the complexity of the resulting artifact makes it difficult to see that it is correct. For example, in their MCK models the {Copy Back} clause that cleans a dirty cache by writing the line back to memory has that cache and the central memory communicating without the other cache making an observation. This violates our assumption that bus communication is a broadcast. As a result the value in the central memory is not commonly known as the {Copy Back} clause is always enabled when a cache is dirty.

This part of their model also illustrates the cache flushing problem that we discussed earlier. In particular, when a cache controller flushes the line back to the memory we expect it to be forgotten. The knowledge-based specification of Baukus and van der Meyden (2004, §5) requires that a cache reset the variable it uses to track the line but not that it forget the value; that the cache does forget the value relies on the use of the observational semantics for the knowledge post-condition in their {Read Miss} clause, which is oblivious to history. In contrast the perfect-recall semantics we use here does not allow an agent to voluntarily forget anything, and so we cannot treat this facet of the protocol.

We also note that if the {Copy Back} action is broadcast in their model then the caches retain (perfect-recall) knowledge of the line after {Copy Back} and {Flush} operations. In the case of the Write-Once protocol we consider here, two consecutive bus writes by cache i indicate that a {Copy Back} has occurred, and hence that i has transferred ownership to the central memory, whose value is then commonly known. We conclude that their completeness result for this case hinges on an improper modelling of the bus. Indeed, if one does treat the bus as a broadcast then the asynchronous semantics for knowledge coincides with the synchronous one in our model.
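The inference from consecutive bus writes can be sketched as a scan over a recorded bus trace. The trace encoding, a list of hypothetical `(cache, operation)` pairs, and the reading of "consecutive" as "adjacent on the bus" are assumptions made for this example.

```python
def copy_backs(bus_log):
    """Indices of bus writes that repeat the immediately preceding bus
    write by the same cache, and so are inferred to be copy backs."""
    return [k for k in range(1, len(bus_log))
            if bus_log[k][1] == 'write' and bus_log[k - 1] == bus_log[k]]


# Cache 1 reads the line, writes it through once, then copies it back.
LOG = [(0, 'write'), (1, 'read'), (1, 'write'), (1, 'write'), (0, 'write')]
INFERRED = copy_backs(LOG)
```

Every cache that observes the bus can run the same scan on its own observation history, which is why under a broadcast bus and perfect recall the transfer of ownership is visible to all of them.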

A proper treatment of flushing requires us to account for the motivation of this operation: to recycle the space for another cache line. We conclude that perfect recall is not the right semantics for this task as it yields implementations with too many states; in practice cache protocols trade communication for space, whereas perfect recall favours memory over communication. One could imagine instead using the clock semantics and then model checking the implementation for perfect recall, but it is unclear whether this has any benefit over a standard model. Other options include making our KBP formalism space-aware, or adding a forgetting operation. We leave further exploration to future work.
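A rough count shows why the two semantics differ so much in their demand for space. The sketch below assumes an agent that can make one of `k` observations at each instant up to a hypothetical time bound `horizon`, and counts the raw local states each semantics distinguishes before any minimisation; actual implementations can be far smaller than these upper bounds.

```python
def perfect_recall_states(k, horizon):
    """One local state per observation sequence of length 1..horizon+1."""
    return sum(k ** (t + 1) for t in range(horizon + 1))


def clock_states(k, horizon):
    """One local state per (time, current observation) pair."""
    return k * (horizon + 1)


RECALL = perfect_recall_states(2, 3)
CLOCK = clock_states(2, 3)
```

With two observations and times 0 to 3 this gives 30 histories against 8 clock states; the first count grows exponentially with the horizon and the second only linearly.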

One may wonder if our assumption that the system is globally synchronous limits the applicability of our implementations. We defend it by observing that we are modelling a single bus that all of the cache controllers are synchronised to; the processors can proceed at their own rate in some other clock domain or even asynchronously, but must synchronise with their controllers for memory operations. As none of this has any impact on the cache coherency protocol used on the bus, we can disregard it. Alglave, Maranget, Sarkar, and Sewell (2010) further argue that while such an assumption may not be entirely correct, it is adequate to capture the main ideas of even quite complex memory models.

We note that symmetry reduction (Cohen et al. 2009) may be a large win in this setting.
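To give a feel for the potential saving, the following minimal sketch counts global configurations of identical caches up to permutation. It assumes each cache is summarised by one of four hypothetical line states (named after the usual Write-Once states), ignores reachability, the memory and the bus, and uses sorting as the canonical representative of each orbit. The reduction is only sound when the property being checked is itself symmetric in the caches.

```python
from itertools import product

LINE_STATES = ('Invalid', 'Valid', 'Reserved', 'Dirty')  # assumed summary


def canonical(state):
    """Representative of the orbit of state under cache permutations."""
    return tuple(sorted(state))


def count(n_caches):
    """Return (number of configurations, number of orbits)."""
    configurations = set(product(LINE_STATES, repeat=n_caches))
    orbits = {canonical(s) for s in configurations}
    return len(configurations), len(orbits)


THREE_CACHES = count(3)
```

For three caches this is 64 configurations against 20 orbits, and the gap widens quickly as caches are added.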

Extending this approach to hierarchical cache coherency protocols (Clarke et al. 1995) that include bus bridges would require some new theory as these violate our broadcast assumption.

6.7 Concluding remarks

Here we have described how we augment the circuit Arrows of the previous chapter with constructs for knowledge-based programming, and shown how we can implement the algorithms of Chapter 3 symbolically. We have applied these techniques to several examples, including that of cache coherency on a shared bus.

Conclusions and future work

We set out to convince the reader that mechanically reasoning about knowledge is useful when designing some kinds of systems. To that end we presented a theory of the implementation of knowledge-based programs in Chapter 3 that underpinned the symbolic approach shown in Chapters 5 and 6, which drew on the tradition of modelling circuits as functional programs that we surveyed in Chapter 4. Here we review our experience of using Arrows for KBPs and the KBP formalism itself, and point to future work.

7.1 Arrows for Knowledge-based Circuits

By embedding our modelling language in Haskell we have a superior foundation for experimentation to our previous MCK tool (§2.3.2). The new approach has much better support for data types. It is far easier to parametrise protocols and communication topologies and does not require recourse to another language to do so. Combinationally cyclic circuits (§4.1) ease composition and lead to smaller models than would otherwise be possible (§6.6). This greater flexibility leads to much better performance (§6.5) with the existing model checking algorithms. From a software engineering perspective the system itself is much more modular, maintainable and extensible, smaller and simpler, and we did not invest effort in the typical language processing drudgery of parsing, type checking, etc. We claim that the EDSL approach is the best way to build experimental language processors.

With respect to the tradition of describing circuits as functional programs that we surveyed in Chapter 4, Arrows have in a sense brought us back to the combinatory approach of µFP (§4.2.1) with the option of writing our definitions in pointwise or point-free styles (§5.1). By building the synchronous isomorphism into the Arrow structure (§4.3.5, §5.1.2) we have substantially avoided the explicit tuple spaghetti of µFP.

The reader may wonder if the conceptual and syntactic overhead of the techniques we use is necessary to resolve the issues we canvassed in Chapters 4 and 6. The following sections discuss the major facets of our approach and compare them to the alternatives.
