Advanced Multiprocessor Programming
Jesper Larsson Träff
Research Group Parallel Computing
Combining and Counting (Chap. 12)
Parallelizing associative function application: get-and-increment of shared counter as example.
Trivial:
Protect counter with lock (Java: synchronized method = monitor with condition variable)
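The trivial solution can be sketched as follows. This is a minimal illustration, not code from the lecture: the class names `LockedCounter` and `LockedCounterDemo` and the helper `run` are hypothetical.

```java
// Sketch: shared counter protected by the object's monitor lock.
class LockedCounter {
    private int value = 0;

    // synchronized: at most one thread at a time executes this on a given instance
    synchronized int getAndIncrement() {
        return value++;
    }

    synchronized int get() {
        return value;
    }
}

class LockedCounterDemo {
    // Starts `threads` threads, each doing `perThread` increments,
    // and returns the final counter value.
    static int run(int threads, int perThread) {
        LockedCounter counter = new LockedCounter();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < perThread; i++) counter.getAndIncrement();
            });
            workers[t].start();
        }
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return counter.get();
    }

    public static void main(String[] args) {
        System.out.println(run(4, 10000));
    }
}
```

All increments are serialized through one lock, so no update is lost, but concurrent callers queue up behind each other.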
Properties:
•With no contention (concurrent get-and-increment), O(1) and
Tree-based implementation (how?):
•Always O(log n) operations, even when no contention
•But, possibly, also O(log n) on contention
Latency/throughput trade-offs:
Use better data structure to get better throughput, possibly at the cost of higher latency per operation
[Figure: combining tree. The root (RT) holds the counter value 0; all other nodes are idle (I).]
Combining tree:
•The shared value (counter) is maintained at root node
•Each thread is assigned to a leaf node
•At most two threads share a leaf node
•At most two threads share an interior node
•Node has a status, indicating what to do next
To update shared value (increment counter):
Thread starts at leaf and works up the tree. If it meets
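public class Node {
    enum Node_status { IDLE, FIRST, SECOND, RESULT, ROOT }

    boolean locked;
    Node_status status;
    int firstval, secondval;
    int result;
    Node parent;

    public Node() { // root constructor
        status = Node_status.ROOT;
        locked = false;
    }

    public Node(Node parent) { // non-root constructor, used by CombiningTree
        this.parent = parent;
        status = Node_status.IDLE;
        locked = false;
    }
    // precombine(), combine(), op(), distribute() follow
}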
Node status in the figures: I(dle), R(oo)T, F(irst), S(econd), R(esult).
Each node also holds a result and the value from the first thread/value from the second thread.
Node[] leaves; // field of CombiningTree: leaf nodes, two threads per leaf

public CombiningTree(int width) {
    Node[] nodes = new Node[width-1];
    nodes[0] = new Node(); // the root
    for (int i=1; i<width-1; i++)
        nodes[i] = new Node(nodes[(i-1)/2]); // parent of node i is node (i-1)/2
    leaves = new Node[(width+1)/2];
    for (int i=0; i<(width+1)/2; i++)
        leaves[i] = nodes[width-1-i-1];
}
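The array layout used by the constructor can be checked with a few lines of index arithmetic. The sketch below is an illustration only: the class `TreeLayout` and its methods are hypothetical names, and the thread-to-leaf mapping assumes the `ThreadID.get()/2` rule used in getandinc().

```java
// Sketch: index arithmetic of the array-based tree layout.
class TreeLayout {
    // Index of the parent of node i (i >= 1).
    static int parent(int i) {
        return (i - 1) / 2;
    }

    // Index in nodes[] of the leaf used by thread t: leaves[t/2] = nodes[width-1-(t/2)-1].
    static int leafOf(int t, int width) {
        return (width - 1) - t / 2 - 1;
    }

    // Number of edges from node i up to the root (node 0).
    static int depth(int i) {
        int d = 0;
        while (i != 0) {
            i = parent(i);
            d++;
        }
        return d;
    }

    public static void main(String[] args) {
        int width = 8; // 8 threads, 7 nodes, 4 leaves
        for (int t = 0; t < width; t++) {
            int leaf = leafOf(t, width);
            System.out.println("thread " + t + " -> node " + leaf + ", depth " + depth(leaf));
        }
    }
}
```

For width 8, threads 0 and 1 share node 6, threads 6 and 7 share node 3, and every leaf is two edges away from the root.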
Thread A: getandinc()
[Animation: tree states while A moves from its leaf to the root. Initially the root (RT) holds 0 and all other nodes are I.]
1. Reserve combining path where thread is active (precombine): nodes on A's path change from I to F
2. Write values on active path (combine): A's value 1 is written into the F nodes
3. Perform op on last node: the root value changes from 0 to 1
4. Distribute: nodes on A's path return to I
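public int getandinc() throws InterruptedException {
    Stack<Node> stack = new Stack<Node>();
    Node leaf = leaves[ThreadID.get()/2];
    Node node = leaf;
    // 1. precombine: reserve path up to the last node
    while (node.precombine()) node = node.parent;
    Node last = node;
    // 2. combine: write values on the active path
    node = leaf;
    int combined = 1;
    while (node!=last) {
        combined = node.combine(combined);
        stack.push(node);
        node = node.parent;
    }
    // 3. perform op on the last node
    int prior = last.op(combined);
    // 4. distribute results back down the path
    while (!stack.empty()) {
        node = stack.pop();
        node.distribute(prior);
    }
    return prior;
}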
synchronized boolean precombine() throws InterruptedException {
    while (locked) wait();
    switch (status) {
    case IDLE:
        status = Node_status.FIRST;
        return true;
    case FIRST: // passive thread, will have to wait
        locked = true;
        status = Node_status.SECOND;
        return false;
    case ROOT:
        return false;
    default: // cannot happen
        throw new IllegalStateException("unexpected status " + status);
    }
}
synchronized int combine(int combined) throws InterruptedException {
    while (locked) wait(); // wait on condition variable
    locked = true;
    firstval = combined;
    switch (status) {
    case FIRST:
        return firstval;
    case SECOND:
        return firstval+secondval;
    default: // cannot happen
        throw new IllegalStateException("unexpected status " + status);
    }
}
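synchronized int op(int combined) throws InterruptedException {
    switch (status) {
    case ROOT:
        int prior = result;
        result += combined;
        return prior;
    case SECOND:
        secondval = combined;
        locked = false;
        notifyAll(); // wake up waiting threads
        while (status!=Node_status.RESULT) wait();
        locked = false;
        notifyAll();
        status = Node_status.IDLE;
        return result;
    default: // cannot happen
        throw new IllegalStateException("unexpected status " + status);
    }
}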
synchronized void distribute(int prior) {
    switch (status) {
    case FIRST:
        status = Node_status.IDLE;
        locked = false;
        break;
    case SECOND:
        result = prior+firstval;
        status = Node_status.RESULT;
        break;
    default: // cannot happen
        throw new IllegalStateException("unexpected status " + status);
    }
    notifyAll(); // wake up threads waiting on this node
}
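The arithmetic behind combine(), op() and distribute() for two threads meeting at one node can be traced sequentially. The sketch below is only an illustration without any concurrency; `CombineArithmetic` and `combineTwo` are hypothetical names.

```java
// Sketch: values computed when a first and a second thread combine at one node
// and the first thread applies the combined value at the root.
class CombineArithmetic {
    // Returns {new root value, result of first thread, result of second thread}.
    static int[] combineTwo(int root, int firstval, int secondval) {
        int combined = firstval + secondval; // combine() in status SECOND
        int prior = root;                    // op() at the root: remember the old value
        int newRoot = root + combined;       // op() at the root: result += combined
        int firstResult = prior;             // the first thread returns prior
        int secondResult = prior + firstval; // distribute(): result = prior + firstval
        return new int[] { newRoot, firstResult, secondResult };
    }

    public static void main(String[] args) {
        int[] r = combineTwo(0, 1, 1);
        System.out.println(r[0] + " " + r[1] + " " + r[2]);
    }
}
```

With root value 0 and two increments of 1, the root becomes 2, the first thread obtains 0 and the second thread obtains 1, as if the two increments had been applied one after the other.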
Thread A: getandinc(), with a second concurrent thread
[Animation: A has reserved its path (F nodes, value 1) and the root holds 0. A second thread marks its own leaf F, reaches a node already marked F by A and changes it to S, and writes its value 1 there. A combines both values to 2 at the S node, and the root value changes from 0 to 2.]
Properties
•Fine-grained locking by synchronized methods. There is no lock on the whole data structure
•Blocking: threads will have to wait on locked nodes for active thread to complete update
•Linearizable
•Not unfair (what does that mean?)
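The pieces above can be assembled into a self-contained sketch and exercised with several threads. The names `CNode`, `CombiningTreeSketch` and `CombiningTreeDemo` are hypothetical, and the thread id is passed as a parameter (an assumption replacing `ThreadID.get()`). The check that every value is returned exactly once is necessary for linearizability of get-and-increment, but it is not a proof.

```java
import java.util.ArrayDeque;
import java.util.Deque;

class CNode {
    enum Status { IDLE, FIRST, SECOND, RESULT, ROOT }
    boolean locked = false;
    Status status;
    int firstval, secondval, result;
    final CNode parent;

    CNode() { status = Status.ROOT; parent = null; }                  // root
    CNode(CNode parent) { status = Status.IDLE; this.parent = parent; }

    synchronized boolean precombine() throws InterruptedException {
        while (locked) wait();
        switch (status) {
            case IDLE: status = Status.FIRST; return true;
            case FIRST: locked = true; status = Status.SECOND; return false;
            case ROOT: return false;
            default: throw new IllegalStateException("precombine: " + status);
        }
    }

    synchronized int combine(int combined) throws InterruptedException {
        while (locked) wait();
        locked = true;
        firstval = combined;
        switch (status) {
            case FIRST: return firstval;
            case SECOND: return firstval + secondval;
            default: throw new IllegalStateException("combine: " + status);
        }
    }

    synchronized int op(int combined) throws InterruptedException {
        switch (status) {
            case ROOT: int prior = result; result += combined; return prior;
            case SECOND:
                secondval = combined;
                locked = false;
                notifyAll();                      // let the first thread combine
                while (status != Status.RESULT) wait();
                locked = false;
                notifyAll();
                status = Status.IDLE;
                return result;
            default: throw new IllegalStateException("op: " + status);
        }
    }

    synchronized void distribute(int prior) {
        switch (status) {
            case FIRST: status = Status.IDLE; locked = false; break;
            case SECOND: result = prior + firstval; status = Status.RESULT; break;
            default: throw new IllegalStateException("distribute: " + status);
        }
        notifyAll();
    }
}

class CombiningTreeSketch {
    private final CNode[] leaves;

    CombiningTreeSketch(int width) {
        CNode[] nodes = new CNode[width - 1];
        nodes[0] = new CNode();
        for (int i = 1; i < nodes.length; i++) nodes[i] = new CNode(nodes[(i - 1) / 2]);
        leaves = new CNode[(width + 1) / 2];
        for (int i = 0; i < leaves.length; i++) leaves[i] = nodes[nodes.length - i - 1];
    }

    int getAndIncrement(int threadId) throws InterruptedException {
        Deque<CNode> stack = new ArrayDeque<>();
        CNode leaf = leaves[threadId / 2];
        CNode node = leaf;
        while (node.precombine()) node = node.parent;        // 1. precombine
        CNode last = node;
        int combined = 1;
        for (node = leaf; node != last; node = node.parent) { // 2. combine
            combined = node.combine(combined);
            stack.push(node);
        }
        int prior = last.op(combined);                        // 3. op
        while (!stack.isEmpty()) stack.pop().distribute(prior); // 4. distribute
        return prior;
    }
}

class CombiningTreeDemo {
    // True if the values 0 .. width*ops-1 are each returned exactly once.
    static boolean allDistinct(int width, int ops) {
        CombiningTreeSketch tree = new CombiningTreeSketch(width);
        int[] hits = new int[width * ops];
        Thread[] workers = new Thread[width];
        for (int t = 0; t < width; t++) {
            final int id = t;
            workers[t] = new Thread(() -> {
                try {
                    for (int i = 0; i < ops; i++) {
                        int v = tree.getAndIncrement(id);
                        synchronized (hits) { if (v >= 0 && v < hits.length) hits[v]++; }
                    }
                } catch (InterruptedException e) { Thread.currentThread().interrupt(); }
            });
            workers[t].start();
        }
        for (Thread w : workers) {
            try { w.join(); } catch (InterruptedException e) { return false; }
        }
        for (int h : hits) if (h != 1) return false;
        return true;
    }

    public static void main(String[] args) {
        System.out.println(allDistinct(8, 1000));
    }
}
```

The sketch assumes thread ids 0 .. width-1 with at most two threads per leaf, matching the tree layout built by the constructor.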