4.4 Induction on the Coherence Order: RingBuffer
4.4.4 Lemma 5 and Theorems 8 and 9
Now we can complete the proof of Lemma 5 by unifying the invariants for tryProd and tryCons. We will show that when a tryCons invocation reads the buffer at a particular index written by a companion tryProd invocation, the reader and writer indices used to calculate the index are the same. This proof brings together the invariants of Lemma 7 and its tryCons counterpart.
By the semantics of reads, if a write and read of a buffer memory location are associated then the calculated index used to determine the memory location in the associated tryProd and tryCons invocations must be the same. Let the reader index value used in the tryCons calculation be r and the writer index value used in the tryProd calculation be w. If the reader and writer indices are not the same, then by the calculations on line 10 of tryCons and line 11 of tryProd we know that r = w + x × N for some integer x ≠ 0.
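The relation r = w + x × N can be illustrated with a small sketch. It assumes that the calculations on line 10 of tryCons and line 11 of tryProd reduce an unbounded index modulo the buffer length N; the listing is not reproduced here, so that reading, along with the helper names `slot` and `wrap_count`, is an assumption made for illustration.

```python
def slot(index: int, n: int) -> int:
    """Map an unbounded reader or writer index to a buffer slot.

    Assumption: the slot is computed as index mod n.
    """
    return index % n


def wrap_count(r: int, w: int, n: int) -> int:
    """Return the integer x with r == w + x * n.

    Only defined when r and w map to the same slot; otherwise the
    read and the write touch different buffer locations.
    """
    if slot(r, n) != slot(w, n):
        raise ValueError("r and w map to different buffer slots")
    return (r - w) // n


N = 4  # hypothetical buffer length
assert wrap_count(6, 6, N) == 0    # same index: the case Lemma 5 establishes
assert wrap_count(2, 6, N) == -1   # x < 0: the reader is one lap behind
assert wrap_count(10, 6, N) == 1   # x > 0: the reader is one lap ahead
```

The two nonzero cases are exactly the ones the proof rules out: the same slot reached from indices that are a whole number of laps apart.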
We consider the cases for x, starting with x < 0. Then r = w + x × N implies r ≤ w − N. We know that if the tryProd containing the write to the buffer executed completely, then w and its own reader index r0 are related according to Lemma 7 by w < r0 + N. Then we have w − N < r0, and by assumption we have r ≤ w − N, which gives us r ≤ w − N < r0 and hence r < r0.
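The chain of inequalities for the case x < 0 can be laid out step by step. This only restates the argument above; it writes the reader index seen by the tryProd as r_0 and assumes the buffer length N is positive.

```latex
\begin{align*}
r = w + x \times N,\ x \le -1 \quad &\Longrightarrow \quad r \le w - N
    && \text{(since } N > 0\text{)} \\
w < r_0 + N \quad &\Longrightarrow \quad w - N < r_0
    && \text{(Lemma 7)} \\
r \le w - N < r_0 \quad &\Longrightarrow \quad r < r_0
\end{align*}
```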
Figure 4.13c illustrates the contradiction in these two reads. By reader index monotonicity and because r < r0, the write from which the tryProd got its reader index r0 is coherence-order after the tryCons that computed its index location using r. This is the co edge pointing downward in the figure. However, through the read of the buffer and the visibility orders, we can establish that the coherence-order-later write happened-before the coherence-order-earlier write, and thereby derive a contradiction. Intuitively, the specified visibility orders would make the “future” write visible: through them, the read would learn about a write that has yet to take place.
The case for x > 0 is similar but uses the tryCons invariant to prove that the tryCons must have seen a writer index coherence order after another tryProd write to the same buffer index. In turn, this implies that it would have ignored the new state of that index to read into the past.
With Lemma 5 in hand we can prove the two main theorems. Theorem 8 follows from the fact that the two executed tryCons procedures must have read different reader index values. Then by Lemma 5 their paired tryProds must also write distinct writer indices and thus be in coherence order. Theorem 9 follows from two facts. First, the two tryProds must have used writer indices that are equal modulo N. Second, the later tryProd must have seen a reader index that was greater than its writer index less the buffer length, w − N < r. Since r is larger than w − N, there must be some tryCons for any reader index value less than or equal to w − N, and by Lemma 5 it must have read from the earlier tryProd’s index write.
4.5 Summary
Here, we have presented a sound, stateless logic for reasoning about the correctness of lock-free concurrent algorithms executing on the JOM. As examples, we proved the correctness of Dekker and RingBuffer. The result is that these algorithms, when paired with their attendant specified orders and given to our compiler, can produce fast, correct code for many memory models.
CHAPTER 5
Related Work
For a long time, researchers have known that correctness can depend critically on the execution order of two instructions. Fences are a crude way of ensuring that two instructions execute in order. Kuperstein, Vechev, and Yahav [44] used a notion of specified orders as the output of a synthesis algorithm. Like us, they see these orders as part of a correct program, but inferred from a correctness property rather than specified. The idea of specified orders appeared for the first time in publications in 2014–2015, namely in the 2014 PhD dissertation of [54] and in the POPL 2015 paper by [19]. Crary and Sullivan’s POPL 2015 paper introduced the RMC memory model together with a semantic foundation that includes specified orders. More recently, a “Placed Before” intra-thread ordering relation was proposed for the C++ concurrency standard [59]. It captures the key idea of the visibility ordering (specifying ordering dependencies explicitly) but with a focus on ruling out thin-air reads.
Beyond the concept of specified orders we will consider work related to this thesis in three categories: fence insertion, memory models and semantics, and verification for weak memory models.
5.1 Fence Insertion
Many authors have presented approaches to insert fences that enforce sequential consistency, including Lee et al. [53], Fang et al. [29], and Alglave et al. [4]. An alternative to programmer specification of orders is inference of orders. The idea of inference is somewhat different from type inference, which can be understood as articulating program invariants. In the case of order inference, the challenge is to articulate assumptions needed to prove a correctness property.
Kuperstein, Vechev, and Yahav [44] presented promising work on inference; they infer specified orders from a program, a correctness property, and a memory model. Their approach first runs a whole-program state-space exploration algorithm that produces a logical formula, then solves the formula to get a set of specified orders, and finally uses those orders to insert fences. Their approach to enforcing an order (i1, i2) is to insert a fence right after i1 or right before i2. The whole-program nature of their approach means that while the inserted fences are sound in the given context, they may be unsound in a different context. Still, their approach can give worthwhile feedback to an algorithm designer who tries to specify a set of orders that are sufficient to prove correctness. We note that our choice of correctness property (opacity) of transactional memory algorithms is currently beyond the capabilities of the Kuperstein-Vechev-Yahav approach. What is needed here is a more powerful language for specifying correctness properties along with a suitable generalization of the approach. This can be an exciting direction for future work.
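The placement rule for an order (i1, i2), a fence right after i1 or right before i2, can be sketched as follows. This is a minimal illustration and not the algorithm of Kuperstein, Vechev, and Yahav: it assumes a single thread of straight-line code with unique instruction labels, and the names `insert_fences` and `FENCE` are hypothetical.

```python
FENCE = "fence"  # hypothetical marker for a full memory fence


def insert_fences(program, orders, after_first=True):
    """Return a copy of `program` with a fence enforcing each order.

    program: list of unique instruction labels in program order.
    orders: iterable of (i1, i2) pairs, with i1 before i2 in `program`.
    after_first: place the fence right after i1 if True,
                 right before i2 otherwise.
    """
    result = list(program)
    for i1, i2 in orders:
        p1, p2 = result.index(i1), result.index(i2)
        if p1 >= p2:
            raise ValueError(f"{i1} does not precede {i2} in program order")
        # A fence already between i1 and i2 enforces this order too.
        if FENCE in result[p1 + 1:p2]:
            continue
        result.insert(p1 + 1 if after_first else p2, FENCE)
    return result


program = ["write_buf", "other", "write_idx"]
order = [("write_buf", "write_idx")]
assert insert_fences(program, order) == ["write_buf", FENCE, "other", "write_idx"]
assert insert_fences(program, order, after_first=False) == [
    "write_buf", "other", FENCE, "write_idx"
]
```

Both placements enforce the same order here. The choice between them matters only once several orders share instructions, which is one reason a whole-program analysis can do better than this per-order rule.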
Kuperstein et al. [45], Meshman et al. [63], and Dan et al. [21] have presented approaches to a related inference problem, allowing degrees of infinite-state programs, but seemingly without specified orders as an intermediate step towards fences. Their approaches can likely be recast as inference of specified orders. Again, a direction for future work is to make their specification languages more powerful to enable specification of correctness of TM algorithms.
Liu et al. [57] presented an execution-based approach to inference, in which they run the program on a memory model and then use the traces to infer fences. This technique can likely be recast to infer specified orders instead.
Specified orders are restrictions on the possible executions of a program. In this way they are similar to previous work on using annotations to restrict scheduling for correctness [20] and for testing [40]. Our work differs in its focus on the restriction of the possible executions due to instruction reordering (or the appearance thereof) as opposed to the restriction of the possible executions due to thread scheduling.