Chapter 7 Specifying and Checking Effects for Framework APIs
7.2 Safe, Reusable Parallel Frameworks
7.3.1 DPJ Frameworks
We focused on the framework operations needed for the two benchmarks but ensured that the operations themselves were general, i.e., were not specifically tied to the needs of the benchmarks, as discussed below. Adding more operations is not difficult.
Parallel array framework: We implemented a framework called DPJDisjointArray with an interface similar to a subset of the ParallelArray API for Java [1]. The API supports the following operations:
1. A create method that creates an array with a user-supplied factory method, as discussed in Section 7.2.1.
2. A withMapping method that maps one array to another, element by element, with a user-supplied mapping function. Like ParallelArray, we provide two forms of the mapping: the first takes an index variable, and the second does not. As in the factory method pattern, we use a method region parameter R to ensure that the mapping function creates a new output object for each element, and the mapping function is allowed to write R.
3. A reduce method that reduces the array to an object, given a starting element and a user-specified Reducer that combines two elements into one. Following the pattern discussed in Section 7.2, the two elements coming into the Reducer method are parameterized by method region parameters R1 and R2, and the user-supplied method is allowed to write the regions bound to these parameters. Using distinct parameters ensures that the Reducer cannot violate disjointness, e.g., by storing one object into a field of the other.
The framework implementation is a thin wrapper that uses a ParallelArray instance internally to provide all the operations.
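To make the shape of these three operations concrete, the following is a minimal sketch in plain Java. It is not the DPJ implementation: the class name DisjointArraySketch, the method name withIndexedMapping, and the accessors are hypothetical, and Java streams stand in for ParallelArray. Because plain Java has no region parameters, the sketch assumes a run-time identity check in place of the static guarantee that every element is a fresh object.

```java
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Set;
import java.util.function.BiFunction;
import java.util.function.BinaryOperator;
import java.util.function.Function;
import java.util.function.IntFunction;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

// Hypothetical plain-Java analogue of the disjoint array framework.
// DPJ enforces freshness statically with region parameters; this sketch
// can only check at run time that no object appears twice in the array.
final class DisjointArraySketch<T> {
    private final List<T> elts;

    private DisjointArraySketch(List<T> elts) {
        this.elts = requireDistinct(elts);
    }

    // create: the factory is called once per index and must return a new object.
    static <T> DisjointArraySketch<T> create(int n, IntFunction<T> factory) {
        List<T> out = IntStream.range(0, n).parallel()
                .mapToObj(factory)
                .collect(Collectors.toList());
        return new DisjointArraySketch<>(out);
    }

    // Mapping form that receives the element index (name is an assumption).
    <U> DisjointArraySketch<U> withIndexedMapping(BiFunction<Integer, T, U> mapper) {
        List<U> out = IntStream.range(0, elts.size()).parallel()
                .mapToObj(i -> mapper.apply(i, elts.get(i)))
                .collect(Collectors.toList());
        return new DisjointArraySketch<>(out);
    }

    // Mapping form without the index.
    <U> DisjointArraySketch<U> withMapping(Function<T, U> mapper) {
        return withIndexedMapping((i, x) -> mapper.apply(x));
    }

    // reduce: combines all elements in parallel, then folds in the starting
    // element exactly once. The reducer is assumed to be associative.
    T reduce(T start, BinaryOperator<T> reducer) {
        return elts.parallelStream().reduce(reducer)
                .map(r -> reducer.apply(start, r))
                .orElse(start);
    }

    T get(int i) { return elts.get(i); }

    int size() { return elts.size(); }

    // Rejects arrays in which two slots hold the same object (by identity).
    private static <E> List<E> requireDistinct(List<E> xs) {
        Set<E> seen = Collections.newSetFromMap(new IdentityHashMap<>());
        for (E x : xs) {
            if (!seen.add(x)) {
                throw new IllegalStateException("two array slots alias the same object");
            }
        }
        return xs;
    }

    public static void main(String[] args) {
        DisjointArraySketch<int[]> a = create(4, i -> new int[] {i});
        DisjointArraySketch<int[]> doubled = a.withMapping(x -> new int[] {2 * x[0]});
        int[] sum = doubled.reduce(new int[] {0}, (x, y) -> new int[] {x[0] + y[0]});
        System.out.println(sum[0]); // prints 12
    }
}
```

A factory such as i -> shared, which returns the same object for every index, is rejected by this sketch only when the array is constructed; in DPJ the corresponding program would not type check.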
Parallel tree framework: We wrote a framework that provides a tree of user-specified arity (i.e., each inner node has at most arity children) with data of generic type T stored in every node. The API supports the following operations:
1. A buildTree method that takes a DPJDisjointContainer elts of objects of type T and a positive arity and inserts the objects into the leaves of the tree. The user provides an index function that takes a T to insert, a T at the current (inner or leaf) node, and a T at the parent node of the current node, and computes which of the children of the current node to follow next when inserting the object in the subtree rooted at the current node. The framework creates the inner nodes as necessary and populates each one with a fresh object of type T, using a user-specified factory method.
2. A visitPO method that recursively does a parallel postorder tree traversal. As shown in Figure 7.8, this method takes a user-supplied visit method that, given a T object at the current node and an ArrayList of V (result) objects produced from visiting the children (or null if the current node is a leaf), produces a V object for this node. Again we use two region parameters, R1 and R2, to ensure that disjointness of the T objects is preserved by the traversal.
    public class DisjointTree<type T<region Elt>, region Cont>
        implements DisjointContainer<T,Cont> {

      public <effect E#> double visitPO(POVisitor<T, effect E> visitor)
          reads Cont writes Elt:* effect E { ... }

      public interface POVisitor<type T<region Elt>,
                                 type V<region VR>, effect E> {
        public <region R1, R2> V<R2>
          visit(T<R1> data, ArrayList<V<R2>, Cont> childResults)
            reads Cont writes R1, R2 effect E;
      }
    }
Figure 7.8: The postorder visitor from the region-based spatial tree.
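The control structure of the postorder traversal can be sketched in plain Java with the fork-join library. This is an illustration under stated assumptions, not the framework code: TreeSketch, Node, and VisitTask are hypothetical names, the tree is built by hand rather than with buildTree, and the region and effect parameters of Figure 7.8 are dropped, so nothing prevents a visitor in this sketch from touching another node's data.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveTask;

// Hypothetical plain-Java analogue of the parallel postorder traversal.
final class TreeSketch {
    // Stand-in for POVisitor; childResults is null when the node is a leaf.
    interface POVisitor<T, V> {
        V visit(T data, List<V> childResults);
    }

    static final class Node<T> {
        final T data;
        final List<Node<T>> children = new ArrayList<>();

        Node(T data) { this.data = data; }
    }

    // One task per node: visit all children in parallel, then the node itself.
    private static final class VisitTask<T, V> extends RecursiveTask<V> {
        private final Node<T> node;
        private final POVisitor<T, V> visitor;

        VisitTask(Node<T> node, POVisitor<T, V> visitor) {
            this.node = node;
            this.visitor = visitor;
        }

        @Override
        protected V compute() {
            if (node.children.isEmpty()) {
                return visitor.visit(node.data, null);
            }
            List<VisitTask<T, V>> tasks = new ArrayList<>();
            for (Node<T> child : node.children) {
                tasks.add(new VisitTask<>(child, visitor));
            }
            invokeAll(tasks); // returns once every child subtree has been visited
            List<V> results = new ArrayList<>();
            for (VisitTask<T, V> t : tasks) {
                results.add(t.join()); // results stay in child order
            }
            return visitor.visit(node.data, results);
        }
    }

    static <T, V> V visitPO(Node<T> root, POVisitor<T, V> visitor) {
        return ForkJoinPool.commonPool().invoke(new VisitTask<>(root, visitor));
    }

    public static void main(String[] args) {
        Node<int[]> root = new Node<>(new int[] {1});
        Node<int[]> left = new Node<>(new int[] {2});
        Node<int[]> right = new Node<>(new int[] {3});
        right.children.add(new Node<>(new int[] {4}));
        root.children.add(left);
        root.children.add(right);
        // The visitor updates only the current node's data and returns a result.
        POVisitor<int[], Integer> subtreeSum = (data, kids) -> {
            int s = data[0];
            if (kids != null) for (int k : kids) s += k;
            data[0] = s;
            return s;
        };
        System.out.println(visitPO(root, subtreeSum)); // prints 10
    }
}
```

The visitor in main writes only the data of the node it was handed, which is the discipline that R1 and R2 enforce statically in the DPJ version.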
Parallel pipeline framework: We implemented a framework called DPJPipeline that supports applications structured as a pipeline of stages. Following Intel’s Threading Building Blocks (TBB) [101] and the StreamIt language [118], we call the operation applied by each stage a filter. Each data element flows sequentially through the stages, but different stages can apply their filters to different elements at the same time, creating pipeline parallelism. This parallel control structure cannot be expressed directly in DPJ as described in Chapters 3 through 6.
The DPJPipeline API is parameterized by a type T<TR> for the type of an element, a region PR for the pipeline internals, and an effect E that bounds the user-specified effects of the filters. The effect E is constrained not to interfere with writing under TR or PR, or with itself, ensuring that filters may safely update the data elements and the pipeline state. The API provides two interfaces for the user to implement: a filter and a factory method for creating a filter. The API also provides the following methods for the user to invoke directly:
1. A method appendStageWithFilter that accepts a user-defined filter factory, uses it to create a fresh filter, and inserts a stage with that filter at the tail of the pipeline.
2. A method launch that launches one task for each pipeline stage.
Internally, each stage is represented by an object of type Stage (a private class, not visible to the user) that stores the user-specified Filter for that stage and maintains an output buffer for the data items produced by that stage. The output buffer of a stage is the input buffer for the next stage. Extending our framework to a recursive fork-join graph, as supported in StreamIt, or a general DAG would not be difficult.
Effect management for this framework works as follows. Method region parameters on the user-defined factory methods as discussed previously ensure that each filter and each element is a freshly-created object, each in its own region. The Filter interface looks like this:
    public interface Filter<type T<region TR>, region FR, effect E> {
      public <region R> T<R> op(T<R> item) writes R, FR effect E;
    }
As in the previous examples, this method is invoked only by the framework, in the stage implementation. At a particular invocation of op, R is bound to the region of the data element being operated on, which is under the region bound to TR in the DPJPipeline class, and FR is bound to the region associated with the current stage, which is under the region bound to PR in the DPJPipeline class. The actual effect bound to E is supplied in the instantiation of the framework and is constrained as discussed above. Thus the user-defined filter operation is limited to updating the regions of the data object and the filter state, and performing any other noninterfering effects. In particular, it cannot update a data element being operated on by a concurrent filter, or a different filter.
The framework implementation passes the object returned by the filter operation from one stage to the next. The returned object need not be the same as the object passed in. However, the region parameter R ensures that the object returned has the same region bound to its type as the input object. In particular, the return object cannot be a data element processed concurrently by a different stage, or even a data element reachable from such a data element, except through a partially-specified RPL.
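The stage structure described in this section can be sketched in plain Java as follows. The sketch rests on several assumptions that are not taken from the DPJ implementation: the class name PipelineSketch is hypothetical, launch takes the input elements as a list and returns the outputs of the last stage, each stage runs on its own thread, the buffers are unbounded queues, and an empty Optional marks the end of the stream. The region and effect parameters of Filter are omitted, so the noninterference properties discussed above are a convention here rather than a checked guarantee.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.function.Supplier;

// Hypothetical plain-Java analogue of the pipeline framework.
final class PipelineSketch<T> {
    // Stand-in for Filter; may return the same object or a new one.
    interface Filter<T> {
        T op(T item);
    }

    // One stage: its filter plus the output buffer read by the next stage.
    private static final class Stage<T> {
        final Filter<T> filter;
        final BlockingQueue<Optional<T>> out = new LinkedBlockingQueue<>();

        Stage(Filter<T> filter) { this.filter = filter; }
    }

    private final List<Stage<T>> stages = new ArrayList<>();

    // Creates a fresh filter from the factory and appends a stage at the tail.
    void appendStageWithFilter(Supplier<Filter<T>> factory) {
        stages.add(new Stage<>(factory.get()));
    }

    // Runs one thread per stage; returns the last stage's output in input order.
    List<T> launch(List<T> inputs) {
        BlockingQueue<Optional<T>> in = new LinkedBlockingQueue<>();
        for (T x : inputs) in.add(Optional.of(x));
        in.add(Optional.empty()); // end-of-stream marker

        List<Thread> threads = new ArrayList<>();
        for (Stage<T> s : stages) {
            s.out.clear();
            BlockingQueue<Optional<T>> stageIn = in;
            Thread t = new Thread(() -> runStage(s, stageIn));
            threads.add(t);
            t.start();
            in = s.out; // this stage's output feeds the next stage
        }
        try {
            for (Thread t : threads) t.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for stages", e);
        }
        List<T> results = new ArrayList<>();
        for (Optional<T> o : in) o.ifPresent(results::add);
        return results;
    }

    private static <T> void runStage(Stage<T> s, BlockingQueue<Optional<T>> in) {
        try {
            Optional<T> item;
            while ((item = in.take()).isPresent()) {
                s.out.put(Optional.of(s.filter.op(item.get())));
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } finally {
            s.out.add(Optional.empty()); // always pass the end marker downstream
        }
    }

    public static void main(String[] args) {
        PipelineSketch<int[]> p = new PipelineSketch<>();
        p.appendStageWithFilter(() -> x -> { x[0] += 1; return x; }); // updates in place
        p.appendStageWithFilter(() -> x -> new int[] {x[0] * 2});     // returns a new object
        List<int[]> inputs = new ArrayList<>();
        for (int i = 0; i < 3; i++) inputs.add(new int[] {i});
        for (int[] r : p.launch(inputs)) System.out.println(r[0]); // prints 2, 4, 6
    }
}
```

The two filters in main show both behaviors the text allows: the first updates the element it was given, and the second returns a different object that then flows to the next buffer.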