
3.3 Equivalent Expressions Analysis

3.3.4 Equivalent Expressions Semantics

Similar to the analysis of accuracy and resource usage, a set of equivalent expressions can be computed with semantics. That is, we define structures, i.e. sets of equivalent expressions, that can be manipulated with arithmetic operators.

function FRONTIER(ε)
    frontier ← ε
    for e ∈ ε do
        for e′ ∈ ε do
            if Area(e′) < Area(e) and AbsError(e′) < AbsError(e) then
                frontier ← frontier \ {e}
            end if
        end for
    end for
    return frontier
end function

Figure 3.2. The algorithm used to compute fr(ε), i.e. the Pareto frontier from a set of equivalent expressions ε.

In our equivalent expressions semantics, an element of $\wp(\mathrm{AExpr}_{\equiv})$ is used to assign a set of expressions to each node in an expression parse tree. To begin with, at each leaf of the tree, the variable or constant is assigned a set containing only itself; for example, for x, the set $\epsilon_x$ of equivalent expressions is $\epsilon_x = \{x\}$. After this, we propagate the equivalent expressions in the parse tree's direction of flow, using (3.27) defined below, where fr is the algorithm shown in Figure 3.2:

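\[
\epsilon_x \otimes \epsilon_y := \mathrm{fr}\bigl(\mathrm{cl}\,I_k(E_\otimes(\epsilon_x, \epsilon_y))\bigr),
\quad \text{where } E_\otimes(\epsilon_x, \epsilon_y) = \{ e_x \otimes e_y \mid e_x \in \epsilon_x \wedge e_y \in \epsilon_y \}, \text{ and } \otimes \in \{+, -, \times, /\}. \tag{3.27}
\]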

Note that we overload ⊗, which originally denotes an arithmetic computation, to denote the construction of equivalent expressions. The equation implies that the propagation procedure recursively constructs a set of equivalent subexpressions for the parent node from the sets of its two child expressions, and uses the depth-limited equivalence function $\mathrm{cl}\,I_k$ to work out a larger set of equivalent expressions. Similarly, we can define another equation that propagates equivalent subexpressions in an expression with a unary minus:

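\[
-\epsilon := \mathrm{fr}\bigl(\mathrm{cl}\,I_k(E_{\mathrm{unary}}(\epsilon))\bigr),
\quad \text{where } E_{\mathrm{unary}}(\epsilon) = \{ -e \mid e \in \epsilon \}.
\]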
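The pruning step fr used in these definitions follows the algorithm of Figure 3.2. A minimal Python sketch is given below; the function name `frontier`, the callables `area` and `abs_error`, and all cost figures are assumptions made for illustration, not values from the analysis itself.

```python
from typing import Callable, Hashable, Iterable, Set, TypeVar

E = TypeVar("E", bound=Hashable)


def frontier(exprs: Iterable[E],
             area: Callable[[E], float],
             abs_error: Callable[[E], float]) -> Set[E]:
    """Drop every expression that some other expression beats strictly
    in both area and absolute error, as in Figure 3.2."""
    candidates = set(exprs)
    result = set(candidates)
    for e in candidates:
        for other in candidates:
            if area(other) < area(e) and abs_error(other) < abs_error(e):
                result.discard(e)
                break  # one dominating expression is enough
    return result


# Hypothetical (area, error) pairs, chosen only to exercise the function.
costs = {
    "(a + b) + c": (2.0, 3.0),
    "a + (b + c)": (2.0, 1.0),
    "(a + c) + b": (3.0, 0.5),
    "c + (b + a)": (4.0, 4.0),
}
pareto = frontier(costs, lambda e: costs[e][0], lambda e: costs[e][1])
```

Because the comparison is strict in both dimensions, an expression that merely ties with another on area is kept; only `"c + (b + a)"` is removed in this toy instance.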

To reduce the computational effort, we propagate only those expressions on the Pareto frontier through the DFG. Although the worst-case complexity of this process is exponential, the selection by Pareto optimality accelerates the algorithm significantly. For example, consider the sample DFG in Figure 3.3; for the subexpression a + b, we have:

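\[
\epsilon_a + \epsilon_b = \mathrm{fr}\bigl(\mathrm{cl}\,I_k(E_+(\epsilon_a, \epsilon_b))\bigr) = \mathrm{fr}\bigl(\mathrm{cl}\,I_k(E_+(\{a\}, \{b\}))\bigr) = \mathrm{fr}(\{a + b,\ b + a\}). \tag{3.29}
\]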

Figure 3.3. The DFG for the sample expression (a + b) × (a + b).
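The propagation step for a + b can be sketched in Python as follows. This is an illustration under stated assumptions: expressions are nested tuples, `commute` is a toy stand-in for $I_k$ that only swaps the operands of the root operator, and fr defaults to the identity because no cost model is given here. None of these helper names come from the original text.

```python
from itertools import product


def E_op(op, xs, ys):
    """E_op(xs, ys): combine every pair of child expressions with `op`."""
    return {(op, ex, ey) for ex, ey in product(xs, ys)}


def commute(exprs):
    """Toy stand-in for I_k: swap the operands of a root + or *."""
    out = set()
    for e in exprs:
        if isinstance(e, tuple) and e[0] in ("+", "*"):
            op, left, right = e
            out.add((op, right, left))
    return out


def closure(step, exprs):
    """cl(step)(exprs): apply `step` until no new expression appears."""
    seen = set(exprs)
    while True:
        new = step(seen) - seen
        if not new:
            return seen
        seen |= new


def lifted(op, xs, ys, fr=set):
    """fr(cl I_k(E_op(xs, ys))), with the placeholder fr keeping everything."""
    return fr(closure(commute, E_op(op, xs, ys)))


eps_a, eps_b = {"a"}, {"b"}          # leaves hold only themselves
eps_sum = lifted("+", eps_a, eps_b)  # the set passed to fr in (3.29)
eps_prod = lifted("*", eps_sum, eps_sum)
```

With the placeholder fr, `eps_sum` holds both `a + b` and `b + a`, and the parent node combines every pair drawn from that set.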

Alternatively, we could view the semantics in terms of DFGs representing the algorithm for finding equivalent expressions. The parsing of an expression directly determines the structure of its DFG. For instance, consider the tree structure of the expression $e_0 = (a + b) \times (a + b)$, as shown in Figure 3.3. This tree structure can be used to generate the DFG illustrated in Figure 3.4, which, when data-flow analysis is applied, discovers a set of expressions equivalent to $e_0$. The circles labeled 3 and 7 in this diagram are shorthand for the operations $E_+$ and $E_\times$ respectively, where $E_+$ and $E_\times$ are defined in (3.27).

[Figure 3.4: DFG with nodes numbered 1 to 10; node labels include a, b, +, ×, ∪, I_k and Frontier.]

For our example in Figure 3.4, similar to the construction of data-flow equations in Section 2.3 of Chapter 2, we can produce a set of equations from the data-flow of the DFG, which now produces equivalent expressions:

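\[
\begin{aligned}
A(1) &= A(1) \cup \{a\}, & A(2) &= A(2) \cup \{b\}, \\
A(3) &= E_+(A(1), A(2)), & A(4) &= A(3) \cup A(5), \\
A(5) &= I_k(A(4)), & A(6) &= \mathrm{fr}(A(5)), \\
A(7) &= E_\times(A(6), A(6)), & A(8) &= A(7) \cup A(9), \\
A(9) &= I_k(A(8)), & A(10) &= \mathrm{fr}(A(9)).
\end{aligned} \tag{3.30}
\]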

By solving this system of equations for the value A(10), we find a set of expressions that are equivalent to the original and that offer an optimized trade-off between area and accuracy. Because of loops in the DFG, it is no longer trivial to find the solution. In general, the analysis equations are solved iteratively, using the DFA approach discussed in Section 2.3.1 of Chapter 2. We can regard the set of equations as a single transfer function F as in (3.31), where F takes as input the values A(1), . . . , A(10) appearing on the right-hand sides of (3.30) and outputs the values A(1), . . . , A(10) appearing on the left-hand sides. Our aim is then to find an input $\vec{x}$ to F such that $F(\vec{x}) = \vec{x}$, i.e. a fixpoint of F.

\[
F\bigl((A(1), \ldots, A(10))\bigr) = \bigl(A(1) \cup \{a\}, \ldots, \mathrm{fr}(A(9))\bigr). \tag{3.31}
\]

Initially we assign $A(i) = \emptyset$ for $i \in \{1, 2, \ldots, 10\}$, and we denote $\vec{\emptyset} = (\emptyset, \ldots, \emptyset)$. Then we compute the least fixpoint of F:

\[
\operatorname{lfp} F := \bigcup_{n \in \mathbb{N}} F^n(\vec{\emptyset}). \tag{3.32}
\]

This expression can be computed iteratively by first evaluating $F(\vec{\emptyset})$, $F^2(\vec{\emptyset}) = F(F(\vec{\emptyset}))$, and so forth, until the fixpoint is reached at some iteration n, i.e. $F(F^n(\vec{\emptyset})) = F^{n+1}(\vec{\emptyset}) = F^n(\vec{\emptyset})$. Hence, for any iteration m > n, $F^m(\vec{\emptyset}) = F^n(\vec{\emptyset})$. The value n is finite, because the relation can only reach a finite number of expressions.

When lfp F is computationally intensive, we can limit the number of iterations n to compute an under-approximation (a subset) of lfp F.
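The iteration described above, including the optional cap on the number of iterations, can be sketched in Python for the ten equations of (3.30). This is a sketch under assumptions rather than the actual analysis: `I_k` is a toy relation that only swaps the operands of the root operator, `fr` is a placeholder that keeps every expression (no cost model is supplied here), and the names `lfp` and `max_iters` are hypothetical.

```python
def E_op(op, xs, ys):
    """Combine every pair of child expressions with `op`."""
    return frozenset((op, x, y) for x in xs for y in ys)


def I_k(exprs):
    """Toy stand-in for I_k: swap the operands at the root."""
    return frozenset((op, right, left) for (op, left, right) in exprs)


def fr(exprs):
    """Placeholder frontier that keeps everything (assumption)."""
    return frozenset(exprs)


def F(A):
    """Transfer function of (3.31); A[i - 1] holds the value A(i)."""
    a1, a2, a3, a4, a5, a6, a7, a8, a9, a10 = A
    return (
        a1 | {"a"},         # A(1)
        a2 | {"b"},         # A(2)
        E_op("+", a1, a2),  # A(3)
        a3 | a5,            # A(4)
        I_k(a4),            # A(5)
        fr(a5),             # A(6)
        E_op("*", a6, a6),  # A(7)
        a7 | a9,            # A(8)
        I_k(a8),            # A(9)
        fr(a9),             # A(10)
    )


def lfp(transfer, bottom, max_iters=None):
    """Iterate from `bottom` until the state stops changing, or until
    `max_iters` applications have been made (under-approximation)."""
    state, n = bottom, 0
    while max_iters is None or n < max_iters:
        nxt = transfer(state)
        if nxt == state:
            break
        state, n = nxt, n + 1
    return state, n


bottom = tuple(frozenset() for _ in range(10))
solution, steps = lfp(F, bottom)
partial, _ = lfp(F, bottom, max_iters=5)
```

In this toy setting `solution[9]`, i.e. A(10), contains the four products of `a + b` and `b + a`, while the capped run returns a subset of it.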

The fixpoint solution lfp F gives a set of equivalent expressions derived using our method, which is found at A(10). In essence, the depth limit acts as a sliding window. The semantics allow hierarchical transformation of subexpressions using a depth-limited search and the propagation of a set of subexpressions that are locally Pareto optimal to the parent expressions in a bottom-up hierarchy.

The problem with the semantics above is that the time complexity of $\mathrm{cl}\,I_k$ scales poorly, since in the worst case the number of subexpressions to explore increases exponentially with k. An alternative is therefore to change the structure of the DFG slightly, as shown in Figure 3.5. The difference is that the Pareto frontier filters the results at each iteration, decreasing the number of expressions to process in the next iteration, whereas the former approach filters out the Pareto-suboptimal candidates only at the end of the iterative procedure. The latter method therefore prunes the set of discovered candidates more frequently than the former. Equivalently, this approach yields the following semantics for arithmetic operations on equivalent expressions as an alternative to (3.27):

\[
\epsilon_x \otimes \epsilon_y := \mathrm{cl}\,(\mathrm{fr} \circ I_k)\,(E_\otimes(\epsilon_x, \epsilon_y)), \text{ where } \otimes \in \{+, -, \times, /\},
\qquad
-\epsilon := \mathrm{cl}\,(\mathrm{fr} \circ I_k)\,(E_{\mathrm{unary}}(\epsilon)). \tag{3.33}
\]

Figure 3.5. The alternative DFG for (a + b) × (a + b).
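The difference between pruning once at the end and pruning inside the closure loop can be illustrated with a small Python sketch. It reads the two formulas literally, as fr applied to a closure and as a closure of fr composed with the rewrite step. The rewrite table `REWRITES`, the cost table `COST` and the labels `e0` to `e4` are hypothetical, and the helper names are not those of the original implementation.

```python
def frontier(exprs, cost):
    """Keep expressions not strictly beaten in both cost components."""
    exprs = set(exprs)
    return {
        e for e in exprs
        if not any(cost[o][0] < cost[e][0] and cost[o][1] < cost[e][1]
                   for o in exprs)
    }


def closure(step, start):
    """cl(step)(start): add step(seen) until nothing new appears."""
    seen = set(start)
    while True:
        new = step(seen) - seen
        if not new:
            return seen
        seen |= new


# Hypothetical one-step rewrites and (area, error) costs.
REWRITES = {"e0": {"e1", "e2"}, "e1": {"e3"}, "e2": {"e4"},
            "e3": set(), "e4": set()}
COST = {"e0": (4, 4), "e1": (3, 3), "e2": (5, 5), "e3": (2, 2), "e4": (1, 1)}


def I_k(exprs):
    """Toy stand-in for I_k: everything reachable in one rewrite."""
    return set().union(*(REWRITES[e] for e in exprs))


# Pruning only at the end: fr(cl I_k(S)).
end_pruned = frontier(closure(I_k, {"e0"}), COST)

# Pruning at every iteration: cl(fr . I_k)(S).
step_pruned = closure(lambda s: frontier(I_k(s), COST), {"e0"})
```

In this toy instance the first variant visits all five expressions before pruning, whereas the second discards `e2` immediately and therefore never derives `e4`, so it handles fewer candidates per iteration.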

In the rest of this chapter, we use frontier_trace to indicate our equivalent expression finding semantics, and greedy_trace to represent the alternative method.

In addition to the above approaches, another possibility is to view the optimization from the perspective of denotational semantics. We can define a recursive function $\mathcal{O}[\![e]\!]\sigma^\sharp$, which accepts an expression e and $\sigma^\sharp$, an input condition on the variables. This function produces a set of optimized expressions equivalent to e. The set of input conditions will be formally defined in Chapter 4. Therefore, for $e_1, e_2 \in \mathrm{AExpr}$ and a variable x:

\[
\mathcal{O}[\![e_1 \otimes e_2]\!]\sigma^\sharp = f_{\sigma^\sharp}\bigl\{ e_1' \otimes e_2' \mid e_1' \in \mathcal{O}[\![e_1]\!]\sigma^\sharp,\ e_2' \in \mathcal{O}[\![e_2]\!]\sigma^\sharp \bigr\},
\qquad
\mathcal{O}[\![x]\!]\sigma^\sharp = \{x\},
\]

In document Structural optimization of numerical programs for high-level synthesis (Page 98-103)