In Chapter III, Section B we introduced the RTS and described the sync rmi, async rmi and rmi fence as the main communication and synchronization primi- tives. The rmi fence construct is provided by the RTS to ensure that previously spawned messages are received on their target locations. There are currently two fence calls available in stapl: rmi fence collective call, where all threads in the execution group need to perform the invocation and one sided os fence that can be invoked independently of the other threads of computation. From the individual thread’s point of view, both fences ensure all acknowledgments for the pending operations are received. The global fence ensures the acknowledgments are globally received by all pending operations. The stapl runtime provides additional primitives that subsume the fence semantic such as one-sided/collective reduce/broadcast.
We introduce in this section a terminology to specify the start and the termina- tion of the pContainer methods and the specification of when pContainer methods are completed.
• Collective methods: start CollM and termination ACK CollM (e.g., constructor). ACK CollM is received by the invoking thread when the method returns.
• Synchronous methods: start SM and the termination is denoted as ACK SM (e.g., get element). For a SM(x), the ACK SM(val) is received by the invoking thread when the method returns.
• Asynchronous methods: start AM and termination ACK AM (e.g., set element). For a AM(x) there is no explicit acknowledgment sent from the pContainer to the user. To reason about correctness we logically assume that the ACK AM is received by the invoking thread at any point in the program order before any one of the following events:
– Encountering a fence call
– A subsequent SM(x) or SPM(x) receives its acknowledgment. Synchronous and split phase methods on an element x of a pContainer forces acknowl- edgments for all pending asynchronous methods operating on the same element x.
– A subsequent AM(x) receives its acknowledgment. Asynchronous method invocations from the same thread and on the same element x of a pContainer receive their acknowledgments in order.
• Split phase methods: start SPM and termination ACK SPM (e.g., split phase get element). The invocation of a split phase method returns immediately to the user a future. We denote the creation of the future as SFuture. The return value corresponding to the future is obtained by invoking the method get and we represent this as Future.get. The acknowledgment for the Future.get is denoted as ACK F and it is received when the method returns. The acknowl- edgment of a split phase method and implicitly the value returned, is logically received by the invoking thread at any point in the program order after the future creation and before any one of the following events:
– Encountering a fence call.
– When Future.get receives its acknowledgment ACK F.
The fence receives its acknowledgment when it returns together with the ac- knowledgments of all pending asynchronous and split phase methods. Our framework guarantees that for all method invocations there is an acknowledgment and this is an important property of a distributed system referred to as liveness[6].
L0
SR(x) ACK_SR(0) AW(x,1)
L1
AW(y,2) SR(y) ACK_AW(y), ACK_SR(2)
L3
SPR(z) Future(z) Future(z).get() ACK_SPR(0)
Fence, ACK_AW(x), ACK_Fence
Fig. 20. Completion guarantees. The time increases from left to right.
In the following sections we use SRLi(x) to denote the beginning of a synchronous method that will read an element x of a pContainer. We use ACK SRLi(val) to denote the acknowledgment and the returned value. The index Li is a thread identifier and is
used to distinguish among invocations in different threads. Similarly, for asynchronous write operations we use AW and ACK AW and for split phase reads we use SPR and ACK SPR. In Figure 20 we include a picture to exemplify the termination guarantees introduced in the previous section. We exemplify using asynchronous writes (AW), synchronous reads (SR), and split phase reads (SPR), but the same holds for any element-wise pContainer method. The example accesses elements corresponding to three different pContainer elements identified by their GIDs, x, y, z. For all examples
considered in this chapter the elements have all value 0 initially. Each row in the figure contains method invocations in program order from left to right. Each invocation takes a certain amount of time which is represented as the length of the segment between start and acknowledgment. Location L0 performs a synchronous read of element x. The invocation returns ACK SR(0) when the method returns. The next AW operation receives the acknowledgment before the subsequent fence returns. Location L1 performs an AW on y followed by a SR on y. The SR operation since it is on y implies the ACK AW(y) and the ACK SR(2). Location L2 performs a split phase read of z. The invocation returns immediately to the user a future. When the get method of the future is invoked the ACK SPR(2) is received.