RingBuffer Algorithm and Specification - Induction on the Coherence Order: RingBuffer

4.4 Induction on the Coherence Order: RingBuffer

4.4.1 RingBuffer Algorithm and Specification

Our implementation, detailed in Figure 4.12, is a simplified form of the two-thread ring buffer that appears in the Linux kernel documentation [38]. It closely follows that of [82], though we

do not consider allocation and we use monotonic read and write indices1_{. It implements a queue}

using a fixed number of memory locations, which means that when the buffer is full tryProd will fail and when it is empty tryCons will fail.

We treat wi, ri, b, and N as fixed constants representing the write index offset (0), read index offset (1), buffer offset (2), and buffer size respectively. We also include the succeed and fail result constants for clarity.

Each procedure returns a value when the corresponding expression evaluates to a natural number. When the tryProd procedure successfully enqueues an element x in q it evaluates to succeed, otherwise it evaluates to fail. When the tryCons procedure successfully dequeues from q, it evaluates to the dequeued value, otherwise it evaluates to fail.

The writer index is managed by tryProd write [wic] := w’ (green code/cell). It represents the tail of the queue. The reader index is managed by the tryCons write [ric] := r + 1 (red code/cell). It represents the head of the queue. Both the reader and writer indices are allowed to increase indefinitely. When the buffer (blue cells) is empty the head and tail are equal (modulo N). When the buffer is full the writer index is one fewer than the reader index (both modulo N). The buffer index is the position at which a tryProd or tryCons enqueues or respectively dequeues an x. The buffer index is always calculated modulo the length of the buffer (N).

The correctness of the RingBuffer algorithm hinges on how the state of the writer index and reader index evolve over time. Specifically, we must show how the state of each evolves individ- ually and then we must use that information to show how they evolve together.

As examples, the individual indices must progress by one index at a time. Otherwise, a tryCons invocation may skip an element in the queue or a tryProd invocation may leave an element that has already been dequeued for a tryCons to dequeue again. Together, the indices must stay within the bounds of the buffer size with respect to one another or the algorithm will exhibit similar problems.

We define the three theorems in our specification for RingBuffer. In our theorems we use

1_{This is worthy of special note since the logic of [82] employs ghost state with overflowing indices which is a} non-trivial addition to the logic.

tryCons(q) =

let wic = q + wi in let ric = q + ri in let w = [wic] in let r = [ric] in

let limited = w mod N == r mod N in if limited then

fail else

let index = q + b + (r mod N) in let x = [index] in let _ = [ric] := r + 1 in x tryProd(q, x) = let wic = q + wi in let ric = q + ri in let w = [wic] in let r = [ric] in let w' = w + 1 in

let limited = w' mod N == r mod N in if limited then

fail else

let index = q + b + (w mod N) in let _ = [index] := x in let _ = [wic] := w' in succeed init: [q + wi] := 0 [q + b + 0] := 0 [q + ri] := 0 [q + b + (N-1)] := 0 ⠇ q+wi q+ri q+b+0 q+b+1 … q+b+(N-1) buﬀer, size N indices n₁ w r n₂ n_N svo svo svo svo svo svo Figure 4.12: RingBuffer

the following conventions. We use LtryProd and LtryCons to represent arbitrary label sequences

locating instances of their respective procedures. We use ri∗, wi∗, and buff∗ to represent label sequences locating the read index, write index, and buffer location accesses within the procedures where ∗ ∈ {rd, wr} for reads and writes. Finally we use L1, seq to represent sequence

concatenation.

Lemma 5(Paired). If we have L_tryCons , fail and L. tryProd .= succeed then

L_tryProd, buffwr −−−rf→ L_tryCons, buffrd iff L_tryProd, wird .= L_tryCons, rird

Informally, if a tryCons expression and tryProd expression both succeed, then the tryCons reads from the write to the buffer in the tryProd if and only if the reader and writer indices (resp.) used to calculate the buffer index are the same.

Lemma 5 is a natural correctness criteria for an unbounded linear queue. It guarantees that the nth_{dequeue is paired with the n}th_{enqueue, where n}th_{is defined here by the monotonic reader}

and writer indices. As evidence, it’s possible to use Lemma 5 to prove the key theorems for the concurrent queue of Herlihy and Wing [36] (see Appendix J).

Theorems 8 and 9 focus on the bounded nature of the queue. Intuitively, each dequeue (tryCons) should be paired with a newly enqueued value (tryProd) and not an old one, prevent-

ing stale values, and a dequeue (tryCons) musth have executed when more than one enqueue (tryProd) is attempted at a particular position in the buffer, preventing overwrites.

Theorem 8 (Produce). If we have L_tryCons1 , LtryCons2, LtryCons1 ,. fail, LtryCons2 ,. fail, LtryProd1, buffwr −−−rf→ L_tryCons1, buffrd and L_tryProd2, buffwr −−−rf→ L_tryCons2, buffrd then L_tryProd1 , LtryProd2

Informally, If two distinct tryCons invocations succeed, then their corresponding tryProd invocations are distinct. That is, no two tryCons invocations can read from the same tryProd and thus no tryCons can read a stale value.

Theorem 9(Consume). If we have L_tryProd1 , LtryProd2, LtryProd1 .= succeed, LtryProd2 .=

succeed and LtryProd1, buffwr, lloc .= LtryProd2, buffwr, lloc then there exists a LtryConssuch that,

if L_tryCons , fail then L. tryProd1, buffwr−−−rf→ LtryCons, buffrd

Informally, If two distinct tryProd invocations succeed and write to the same place in the buffer, then there must be a tryCons invocation that reads from the write of the first tryProd invocation. That is, values in the buffer can’t be overwritten without having been read. Note that we must assume the successful execution of the tryCons (LtryCons1 , fail) in our conclusion.. This is a weakness of our abstraction over the expression semantics in dealing with the situation where the tryCons has not fully evaluated but the increment of the reader index inside the procedure executed. In the language of Herlihy and Wing, the tryCons is concurrent with the second tryProd but the update to the index has taken place allowing the second tryProd to proceed and reuse the buffer index.

Our mechanized the proofs of these theorems is approximately 3000 lines in Coq without whitespace. Notably, there is a large amount of duplicate proof code shared between tryCons and tryProd invariants. Here, we give a proof sketch that focuses on how our logic supports complex algorithms using induction on the partial coherence order. We will work through a series of lemmas building up to the proof of Theorems 8 and 9. We first give a simple example to show how the coherence order supports proofs of invariants and then examine a more complex example where it forms a scaffolding for write elimination.

In document Prove Once, Run Efficiently Anywhere: Tools for Lock-free Concurrent Algorithms (Page 108-112)