Memory Consistency Conditions - The STAPL Parallel Container Framework

In sequential programming, invocations are completed in the order in which they were issued in the program. In a parallel system this requirement is often relaxed to improve performance.

1. pContainer Default Memory Consistency Model

Based on the termination guarantees introduced in the previous section, we now describe the default memory consistency conditions of the pContainer methods. We first introduce a set of notations and rules, and later in this section we show how these rules interact through a set of examples.

We adapted from [6] a set of notations to formally specify the pContainer memory consistency model. Let E be an execution of a program that has concurrent method invocations by multiple threads in multiple locations and on multiple pContainer elements. E is a sequence of method invocations as they occurred as a result of interleavings of the actions of all the threads in the system. E can be thought of as a trace of a particular execution of a program. We use the notation E|i to denote the subsequence of E consisting of all invocations performed by and responses received by a thread i. Similarly, we use E|x to indicate the subsequence of E consisting of all invocations and responses that are performed on an element x of the pContainer.
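The restriction notation can be illustrated with a minimal Python sketch. The `Event` record and the two helper functions are hypothetical names introduced only for this illustration; they are not part of STAPL. The sketch assumes a trace E is simply a list of events in the order they occurred.

```python
from collections import namedtuple

# Hypothetical record for one event of an execution trace E (not a STAPL type).
Event = namedtuple("Event", ["thread", "element", "kind", "value"])

def restrict_to_thread(E, i):
    """E|i: the subsequence of E performed by thread i, in trace order."""
    return [e for e in E if e.thread == i]

def restrict_to_element(E, x):
    """E|x: the subsequence of E performed on element x, in trace order."""
    return [e for e in E if e.element == x]

# A small example trace with two threads and two elements.
E = [
    Event(0, "x", "write", 1),
    Event(1, "x", "write", 2),
    Event(0, "y", "write", 1),
    Event(1, "x", "read", None),
]

E_0 = restrict_to_thread(E, 0)     # the two writes issued by thread 0
E_x = restrict_to_element(E, "x")  # the three operations on element x
```

Both restrictions preserve the relative order of the events in E, which is what makes them subsequences rather than arbitrary subsets.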

To simplify reasoning about the possible method interleavings and values returned by an execution E, we introduce the notion of a permutation of the method invocations as a linear sequence of all method invocations in the system. The memory consistency model (MCM) specifies the restrictions on the possible permutations corresponding to a particular execution E. Fences serve as global synchronization points that force the completion of all previous pContainer methods. In the following we discuss the guarantees for the method invocations between fences.

The pContainer MCM: For an execution E, a pContainer guarantees that there is a permutation P of all method invocations in E such that:

1. The methods in P occur sequentially (no overlapping).

2. For each element x, the restriction of P to just those methods on x, denoted P|x, satisfies the specification of the data type of x. (E.g., if x is a register that supports Read and Write, then each Read returns the value of the latest preceding Write invoked on x.)

3. For each thread i, the restriction of E to just the Coll and Synch methods invoked by i, denoted E|(Coll ∪ Synch)|i, must equal P|(Coll ∪ Synch)|i. That is, the permutation P has all the collective and synchronous methods by i in the same order as they were invoked. However, no guarantee is given as to how Synch methods at different locations are ordered in P.

4. For each element x and each thread i, the restriction of P to the methods on x invoked by i, denoted P|x|i, consists of all the Synch, Asynch, and Split Phase methods on x invoked by i in E, in the order of their invocation.

5. Consider any element x and let Oi and Oj be two operations on x in E such that Oi is invoked by some thread i, Oj is invoked by some other thread j, and Oi completes (i.e., receives its ACK) before Oj is invoked. Then Oi is ordered in P before Oj.
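As a rough illustration of how conditions 2 and 4 constrain a candidate permutation P, the following Python sketch checks them for elements that behave as read/write registers. The `Op` record, its `seq` field (position of the invocation in its thread's program order), and the function names are assumptions made for this example only. Condition 1 is implicit in representing P as a list; conditions 3 and 5 are not checked here.

```python
from collections import namedtuple

# Hypothetical operation record. For a write, value is the value written;
# for a read, value is the value the read returned.
Op = namedtuple("Op", ["thread", "seq", "element", "kind", "value"])

def satisfies_register_spec(P, initial=0):
    """Condition 2: every read returns the latest preceding write on its element."""
    current = {}
    for op in P:
        if op.kind == "write":
            current[op.element] = op.value
        elif current.get(op.element, initial) != op.value:
            return False
    return True

def preserves_per_element_program_order(P):
    """Condition 4: for each (thread, element) pair, P keeps invocation order."""
    last_seq = {}
    for op in P:
        key = (op.thread, op.element)
        if op.seq < last_seq.get(key, -1):
            return False
        last_seq[key] = op.seq
    return True

def is_valid_permutation(P):
    return satisfies_register_spec(P) and preserves_per_element_program_order(P)

# Thread 0 writes 1 to x and then reads 1 back; thread 1 writes 2 to x.
w0 = Op(0, 0, "x", "write", 1)
r0 = Op(0, 1, "x", "read", 1)
w1 = Op(1, 0, "x", "write", 2)

ok = is_valid_permutation([w1, w0, r0])         # read sees the latest write (1)
stale = is_valid_permutation([w0, w1, r0])      # latest write is 2, read returned 1
reordered = is_valid_permutation([r0, w0, w1])  # read placed before its own write
```

In this example only the first candidate explains the observed read value while keeping thread 0's operations on x in program order.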

In the remainder of this section we illustrate some of the ordering relations for pContainer methods that follow from the consistency conditions introduced above.

For asynchronous methods the PCF guarantees that subsequent invocations from the same thread affecting the same element x will receive their implicit acknowledgments in the order in which they were invoked (condition 4). For example, let us assume a thread in location L0 performs the sequence of asynchronous write invocations depicted in Figure 21(a). The acknowledgments for two consecutive writes on x (AW^0(x), AW^1(x)) are guaranteed to arrive in the order in which the writes were issued. The superscript is used to distinguish subsequent invocations of the same type on the same element/location. The acknowledgment for the write to y has no relationship with the acknowledgments for x. Figure 21(a) shows three possible valid interleavings, with the acknowledgment for y received in an arbitrary order relative to the acknowledgments for the writes to x. Corresponding to Figure 21(a), we include in Figure 21(b) three possible valid interleavings as perceived by the user.
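The count of three valid interleavings can be reproduced with a short Python sketch that enumerates every ordering of the three acknowledgments and keeps those consistent with condition 4. The string labels (with the superscripted writes spelled AW^0 and AW^1) and the helper name are chosen for this illustration only.

```python
from itertools import permutations

# Acknowledgments for the three asynchronous writes issued from L0.
acks = ["ACK AW^0(x)", "ACK AW(y)", "ACK AW^1(x)"]

def respects_condition_4(order):
    """The two writes to x must be acknowledged in invocation order."""
    return order.index("ACK AW^0(x)") < order.index("ACK AW^1(x)")

valid_orders = [o for o in permutations(acks) if respects_condition_4(o)]

for order in valid_orders:
    print(", ".join(order))
```

Of the six possible orderings, only the three in which the acknowledgment for the first write to x precedes the one for the second write survive; the acknowledgment for y is free to appear in any position.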

L0 : AW^0(x, 1), AW(y, 1), AW^1(x, 2) fence ACK AW^0(x), ACK AW^1(x), ACK AW(y)

or

L0 : AW^0(x, 1), AW(y, 1), AW^1(x, 2) fence ACK AW^0(x), ACK AW(y), ACK AW^1(x)

or

L0 : AW^0(x, 1), AW(y, 1), AW^1(x, 2) fence ACK AW(y), ACK AW^0(x), ACK AW^1(x)

(a) Asynchronous method relative execution order

[Timeline diagrams: three executions on L0 of AW(x,1), AW(y,1), and AW(x,2), each shown with its acknowledgment and the fence.]

(b) Interleavings as perceived by user

Fig. 21. Asynchronous methods ordering. (a) Relative order for acknowledgments. The superscript is used to distinguish subsequent invocations of the same type on the same element/location (e.g., two writes or reads of the same variable). The "or" denotes that any of the interleavings is valid. (b) Possible interleavings.

When invoking methods concurrently on the same pContainer element from threads in different locations, there is no guarantee about the order in which they will terminate. Let us assume the following method invocations from two different locations:

Li : AW(x, 1), SR^0(x), ACK SR(1 or 2), ACK AW(x) fence SR^1(x), ACK SR(a)

Lj : AW(x, 2), SR^0(x), ACK SR(1 or 2), ACK AW(x) fence SR^1(x), ACK SR(a)

The first read invocations (SR^0) on both locations do not have a deterministic result. Each can return either 1 or 2, and the result can be different on the two locations. After the fence, it is guaranteed that both reads (SR^1) will return the same value a, though it is not known whether a is 1 or 2. Assuming the element x was initially zero, it is guaranteed that none of the reads in the example will return 0, because of the ordering guarantees provided for accesses to the same element within a thread: the reads are always executed after the previous writes in program order.
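The outcomes described above can be explored with a small Python model. This is only a sketch under simplifying assumptions: x is treated as a single sequentially accessed register with initial value 0, the four pre-fence operations are interleaved in every order that keeps each location's write before its own read, and the value of x once all of them have completed stands in for the value a returned by both post-fence reads. The tuple encoding and function names are hypothetical.

```python
from itertools import permutations

# Pre-fence operations encoded as (location, kind, value); hypothetical encoding.
ops = [
    ("Li", "write", 1), ("Li", "read", None),
    ("Lj", "write", 2), ("Lj", "read", None),
]

def program_order_kept(P):
    """Each location's write to x precedes that location's read of x."""
    return all(
        P.index((loc, "write", v)) < P.index((loc, "read", None))
        for loc, v in (("Li", 1), ("Lj", 2))
    )

def run(P, initial=0):
    """Apply P to a register; return (read at Li, read at Lj, value after fence)."""
    x = initial
    reads = {}
    for loc, kind, value in P:
        if kind == "write":
            x = value
        else:
            reads[loc] = x
    return reads["Li"], reads["Lj"], x

outcomes = {run(P) for P in permutations(ops) if program_order_kept(P)}

for read_i, read_j, after_fence in sorted(outcomes):
    print(f"SR^0 at Li = {read_i}, SR^0 at Lj = {read_j}, a = {after_fence}")
```

In this model no pre-fence read ever observes the initial value 0, the two first reads may differ, and the value seen after the fence is either 1 or 2 depending on which write is ordered last.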
