

6.1.1 Loop Masks

Each divergent loop has to maintain a mask that is true for all instances that are still active in the loop. The loop can only be exited when this mask is false for all instances. While the loop is still iterating, results of inactive instances must not be altered. Therefore, a special φ-function, the loop mask phi, is generated in the loop header (m_b in Figure 6.2). Its first incoming value is the mask of the incoming edge from the preheader; its second is the mask of the loop back edge.
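The role of the loop mask can be illustrated with a small simulation. The following is only a sketch in Python, not the IR-level code the text describes: the function name `run_masked_loop`, the per-instance trip counts, and the representation of masks as lists of booleans are all assumptions made for illustration. Every instance is assumed to execute the body at least once.

```python
def run_masked_loop(trip_counts):
    """Simulate a divergent loop over len(trip_counts) instances.

    Instance i wants to execute the loop body trip_counts[i] times
    (assumed >= 1). Returns the per-instance results and the number
    of vector iterations that were executed.
    """
    width = len(trip_counts)
    result = [0] * width
    # First phi operand: mask of the edge from the preheader.
    # Here all instances are assumed to enter the loop.
    mask = [True] * width
    iterations = 0
    # The loop may only be exited once the mask is false for all instances.
    while any(mask):
        iterations += 1
        # The body is computed for all lanes at once ...
        updated = [r + 1 for r in result]
        # ... but blended so that inactive instances keep their old result.
        result = [u if m else r for u, r, m in zip(updated, result, mask)]
        # Second phi operand: mask of the back edge, i.e. the instances
        # that have not yet reached their exit condition.
        mask = [m and r < n for m, r, n in zip(mask, result, trip_counts)]
    return result, iterations
```

With trip counts `[3, 1, 2, 5]`, the loop runs five vector iterations, yet each instance ends with exactly its own trip count because its lane is frozen once its mask bit turns false.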

Also, to ensure correct execution after the loop is finished, a divergent loop with multiple rewire exit blocks needs to record which instance left the loop over which edge. This is achieved by introducing loop exit masks that are maintained by the following instructions (Function createLoopExitMasks): a mask update operation (m_up) and the loop exit mask phi, which is a φ-function in the loop header (m_exit). The update operation is the disjunction of the loop exit mask phi of the current loop and the accumulated mask of the next inner loop that is left via this exit, if there is one. Otherwise, the second operand is simply the exit condition of the exit edge. The loop exit mask phi has one incoming value from the preheader and one from the latch. The value coming from the latch is the result of the update operation. The value coming from the preheader is an empty mask (all elements set to false). Note that, for an exit that leaves multiple loops, the complete exit mask only needs to be persisted in the outermost loop that is left. The inner loops only require information about which instances left in their current iteration.
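The accumulation performed by the loop exit mask phi and its update operation can be sketched for the simplest case: a single loop with two exits and no nesting. The Python below is an illustrative simulation only. The function `track_exits`, its two exit conditions, and the doubling loop body are hypothetical and do not come from the text.

```python
def track_exits(values, limit, max_iter):
    """Single divergent loop with two exits, one vector iteration per pass.

    Exit A is taken when an instance's value exceeds `limit`.
    Exit B is taken when the iteration count reaches `max_iter`.
    Returns the two accumulated loop exit masks.
    """
    width = len(values)
    active = [True] * width
    # Value of each loop exit mask phi coming from the preheader: all false.
    exit_a = [False] * width
    exit_b = [False] * width
    it = 0
    while any(active):
        # Hypothetical loop body, applied only to active instances.
        values = [v * 2 if m else v for v, m in zip(values, active)]
        it += 1
        # Exit masks of the two exit edges in this iteration.
        leave_a = [m and v > limit for m, v in zip(active, values)]
        leave_b = [m and not a and it >= max_iter
                   for m, a in zip(active, leave_a)]
        # Mask update: disjunction of the phi value and the edge's exit mask.
        # The result is what the phi receives from the latch.
        exit_a = [e or l for e, l in zip(exit_a, leave_a)]
        exit_b = [e or l for e, l in zip(exit_b, leave_b)]
        active = [m and not a and not b
                  for m, a, b in zip(active, leave_a, leave_b)]
    return exit_a, exit_b
```

After the loop, each instance is set in exactly one of the two masks, which is the information later code needs to decide which exit block's effects apply to which instance.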

After mask generation, each rewire loop exit mask thus has one update operation per loop that is left and one loop exit mask phi in the header of each loop that is left.

Note that, again, the analyses presented in Chapter 5 allow us to generate more efficient code: If an exit block is optional, we omit its loop exit mask because it is equal to the active mask. If the entire loop is not divergent, the loop mask and all loop exit masks can be omitted. This is because in such a loop, all instances that enter the loop will exit together through the same exit.

6.1 Mask Generation 91

Function createLoopExitMasks(Loop L)
begin
    if L is divergent then
        foreach E ∈ rewire exit blocks of L do
            ExitMaskPhis[E][L] ← Mask(PHI);
        end
    end
    foreach N ∈ nested loops of L do
        createLoopExitMasks(N);
    end
    if L is not divergent then
        return;
    end
    Block P ← preheader of L;
    foreach X → E ∈ rewire exit edges of L do
        Mask exitMaskPhi ← ExitMaskPhis[E][L];
        exitMaskPhi.blocks.push(P);
        exitMaskPhi.values.push(Mask(VALUE, false));
        Mask maskUpdate ← Mask(OR, exitMaskPhi);
        if exit leaves multiple loops and L is not innermost loop left by this exit then
            N ← next nested loop of exit;
            maskUpdate.push(ExitMaskUpdates[E][N]);
        else
            maskUpdate.push(ExitMasks[X→E]);
        end
        ExitMaskUpdates[E][L] ← maskUpdate;
        if L is top level loop of exit then
            ExitMasks[X→E] ← maskUpdate;
        end
        exitMaskPhi.blocks.push(latch);
        exitMaskPhi.values.push(maskUpdate);
    end
    createCombinedLoopExitMask(L);
end

Function createCombinedLoopExitMask(Loop L)
begin
    Mask combinedMask ← Mask(OR);
    foreach E ∈ rewire exit blocks of L do
        combinedMask.push(ExitMaskUpdates[E][L][1]);
    end
    CombinedExitMasks[L] ← combinedMask;
end

Combined Loop Exit Masks. Finally, to reduce the number of instructions required to persist loop results, a combined loop exit mask may be used during select generation (see Section 6.2). This mask combines all information about instances that left the loop in the current iteration. In case of a loop that contains more nested loops, the current iteration of the parent includes all iterations of all nested loops. Thus, the combined loop exit mask is a disjunction of all accumulated loop exit masks of exits from nested loops and the exit masks of exits from the current loop. Function createCombinedLoopExitMask shows this in pseudo code.
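As a rough illustration of the disjunction described above, the following Python sketch combines per-exit masks into one combined mask. It is not the thesis's implementation: the helper names `mask_or` and `combined_exit_mask`, the list-of-booleans mask representation, and the split of the inputs into two lists are assumptions chosen for readability.

```python
from functools import reduce


def mask_or(a, b):
    """Element-wise disjunction of two masks of equal width."""
    return [x or y for x, y in zip(a, b)]


def combined_exit_mask(width, nested_accumulated, own_exit_masks):
    """Combine all exit information of one iteration of a loop.

    nested_accumulated: accumulated exit masks of exits taken from
        nested loops during this iteration of the parent loop.
    own_exit_masks: exit masks of the exits of the loop itself in
        this iteration.
    """
    empty = [False] * width  # neutral element of the disjunction
    return reduce(mask_or, nested_accumulated + own_exit_masks, empty)
```

An instance is set in the result if it left the loop over any exit during the current iteration, regardless of whether the exit sits in the loop itself or in one of its nested loops.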

6.1.2 Running Example

Figure 6.2 shows the masks generated for the Mandelbrot kernel. The divergent loop has one uniform exit (b → e) and one varying exit (c → f). The uniform exit does not require a dedicated exit mask, but the varying one does. It is maintained by the φ-function m_exit in the loop header b and updated by the disjunction m_up in c. Since m_exit is initialized with false, the disjunction accumulates those instances that have left the loop in each iteration, given by m_{c→f}. The mask in block f is exactly this accumulated exit mask. The mask in e is simply the active mask if the exit is taken. This is because the exit condition is uniform and the block is an optional exit: if the exit is taken, it is taken by all instances that are still active. In block g, the masks from both sides are merged by a disjunction. In this case, since the block is by all (that is, it is executed by all instances), the mask is equal to true. The combined loop exit mask m_comb has no uses before Select Generation (Section 6.2). It consists of a disjunction of the exit masks of the loop exits in the current iteration (m_{b→e}

Block a:
    m_a ← true
    ...
    m_{a→b} ← m_a
    br b

Block b:
    m_b ← phi(m_{a→b}, m_{d→b})
    m_exit ← phi(false, m_up)
    ...
    cond_b ← iter ≥ maxIter
    m_{b→c} ← m_b ∧ ¬cond_b
    m_{b→e} ← m_b ∧ cond_b
    br cond_b, e, c

Block c:
    m_c ← m_{b→c}
    ...
    cond_c ← x² + y² > scaleSq
    m_{c→d} ← m_c ∧ ¬cond_c
    m_{c→f} ← m_c ∧ cond_c
    m_up ← m_exit ∨ m_{c→f}
    br cond_c, f, d

Block d:
    m_d ← m_{c→d}
    m_comb ← m_{b→e} ∨ m_{c→f}
    ...
    m_{d→b} ← m_d
    br b

Block e:
    m_e ← m_{b→e}
    ...
    m_{e→g} ← m_e
    br g

Block f:
    m_f ← m_up
    ...
    m_{f→g} ← m_f
    br g

Block g:
    m_g ← m_{e→g} ∨ m_{f→g}
    ...

Figure 6.2: Mask generation for the Mandelbrot kernel. m_exit is the accumulated exit mask of edge c → f. m_up is the update operation of that exit mask. m_comb is the combined exit mask.
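The masks of Figure 6.2 can be traced with a small simulation. The Python below is a sketch under several assumptions: the function name `mandelbrot_masks`, the default parameter values, and the placement of the z ← z² + c update in block c are illustrative choices, since the figure elides the actual computation. Only the mask logic follows the figure.

```python
def mandelbrot_masks(points, max_iter=8, scale_sq=4.0):
    """Trace the masks of the Mandelbrot loop for a vector of points.

    points: list of (cx, cy) pairs, one per instance.
    Returns (m_e, m_f, m_g) as lists of booleans.
    """
    width = len(points)
    x = [0.0] * width
    y = [0.0] * width
    m_b = [True] * width       # first phi operand: m_{a->b} = m_a = true
    m_exit = [False] * width   # first phi operand of m_exit: false
    m_be = [False] * width
    it = 0                     # uniform iteration counter
    while any(m_b):
        # Block b: uniform exit condition.
        cond_b = it >= max_iter
        m_be = [m and cond_b for m in m_b]
        m_c = [m and not cond_b for m in m_b]          # m_c = m_{b->c}
        # Block c: assumed body, applied only to instances active in c.
        nx = [xi * xi - yi * yi + cx if m else xi
              for xi, yi, (cx, _), m in zip(x, y, points, m_c)]
        ny = [2 * xi * yi + cy if m else yi
              for xi, yi, (_, cy), m in zip(x, y, points, m_c)]
        x, y = nx, ny
        cond_c = [xi * xi + yi * yi > scale_sq for xi, yi in zip(x, y)]
        m_cd = [m and not c for m, c in zip(m_c, cond_c)]
        m_cf = [m and c for m, c in zip(m_c, cond_c)]
        m_up = [e or f for e, f in zip(m_exit, m_cf)]  # m_up = m_exit OR m_{c->f}
        # Block d and back edge: the phis receive their latch values.
        it += 1
        m_exit = m_up
        m_b = m_cd                                     # m_{d->b} = m_d = m_{c->d}
    m_e = m_be                                         # active mask when b->e is taken
    m_f = m_exit                                       # accumulated exit mask (m_up)
    m_g = [e or f for e, f in zip(m_e, m_f)]
    return m_e, m_f, m_g
```

For the points (0, 0), (2, 2), (1, 1) and (-1, 0), the second and third instances escape over c → f in different iterations and are accumulated in m_f, while the first and fourth stay bounded and leave together over the uniform exit b → e. The merged mask m_g is true for every instance because all instances entered the loop.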