Scalable Methods for Rewriting - Equivalent Expressions Analysis

3.3 Equivalent Expressions Analysis

3.3.2 Scalable Methods for Rewriting

The above rules of equivalence relate an expression to all of its equivalent expressions. In general, because of combinatorial explosion, the set of all equivalent expressions is too large to derive in full, which motivates us to develop scalable methods that run quickly even on large expressions.

Instead of deriving the full set of equivalent expressions, we define a new relation $\rightsquigarrow$, a subset of ≡, which is identical to our equivalence relation ≡ except that we place a few restrictions on it. This new relation is obtained by removing the equivalence relation rules in (3.14). Firstly, reflexivity is removed, because it is not necessary to rediscover expressions that are already known. Secondly, we remove transitivity from ≡, because we retain the flexibility to apply $\rightsquigarrow$ in a series of steps to generate equivalent expressions. Finally, we disallow symmetry in (3.14) for the reduction rules in (3.19) and (3.20), so that these rules are applied in one direction only. This reduces the space-time complexity of the search, and is justified because performance metrics such as accuracy and resource usage often improve when the number of terms in the expression is reduced.

To make use of the new relation we define the following category of functions:

Definition 3.2. We call a function an equivalent expression generator (EEG) function if and only if the function takes as an input an initial set of equivalent expressions, and generates another set of expressions equivalent to those in the input set.

For instance, an EEG function $I : \wp(\mathrm{AExpr}_\equiv) \to \wp(\mathrm{AExpr}_\equiv)$, where $\wp(\mathrm{AExpr}_\equiv)$ denotes the power set of all equivalent expressions $\mathrm{AExpr}_\equiv$, can be defined as follows:

$$I(\epsilon) = \{\, e' \in \mathrm{AExpr} \mid e \rightsquigarrow e' \wedge e \in \epsilon \,\}, \tag{3.22}$$

where $\epsilon$ is a set of equivalent expressions.
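To make (3.22) concrete, the following Python sketch builds a one-step EEG function over a toy expression type. The tuple encoding of expressions, the helper names `root_rewrites` and `step`, and the choice of commutativity and associativity as the only rewrite rules are assumptions made for illustration; they are not the rule set of the inference system defined earlier.

```python
from typing import FrozenSet, Iterator, Tuple, Union

# Assumed encoding: a variable is a str, an operation is (op, left, right).
Expr = Union[str, Tuple[str, "Expr", "Expr"]]


def root_rewrites(e: Expr) -> Iterator[Expr]:
    """Yield rewrites applied at the root of e (illustrative rule subset)."""
    if isinstance(e, str):
        return
    op, left, right = e
    yield (op, right, left)  # commutativity: a op b -> b op a
    if not isinstance(left, str) and left[0] == op:
        # associativity: (a op b) op c -> a op (b op c)
        yield (op, left[1], (op, left[2], right))
    if not isinstance(right, str) and right[0] == op:
        # associativity: a op (b op c) -> (a op b) op c
        yield (op, (op, left, right[1]), right[2])


def step(e: Expr) -> Iterator[Expr]:
    """Yield every e' reachable from e by one rule at one position."""
    yield from root_rewrites(e)
    if not isinstance(e, str):
        op, left, right = e
        for new_left in step(left):
            yield (op, new_left, right)
        for new_right in step(right):
            yield (op, left, new_right)


def I(exprs: FrozenSet[Expr]) -> FrozenSet[Expr]:
    """One-step EEG function: all single-step rewrites of members of exprs."""
    return frozenset(e2 for e in exprs for e2 in step(e))


if __name__ == "__main__":
    start = frozenset({("+", ("+", "a", "b"), "c")})
    for expr in sorted(I(start), key=repr):
        print(expr)
```

Because reflexivity is not part of the one-step relation, the input expression itself is not necessarily in the output; it reappears only if some rewrite happens to produce it.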

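We define a functional:

$$\mathrm{cl}_N : (\wp(\mathrm{AExpr}_\equiv) \to \wp(\mathrm{AExpr}_\equiv)) \to (\wp(\mathrm{AExpr}_\equiv) \to \wp(\mathrm{AExpr}_\equiv)), \tag{3.23}$$

which takes as input an EEG function $f$ and produces another EEG function $f'$:

$$f'(\epsilon) := \bigcup_{i=0}^{N} f^i(\epsilon). \tag{3.24}$$

Here, $f$ and $f'$, respectively the input and output of $\mathrm{cl}_N$, are both EEG functions, and $f^i$ denotes $i$ repeated applications of $f$, so that $f^0(\epsilon) = \epsilon$. In the rest of the section, we omit the brackets surrounding the input of $\mathrm{cl}_N$ for simplicity, e.g. $\mathrm{cl}_N(f)$ can be written as $\mathrm{cl}_N f$.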

As an example use of the functional $\mathrm{cl}_N$, we can substitute $f$ with $I$ in $\mathrm{cl}_N f(\epsilon)$ to generate a set of equivalent expressions, by taking the union of $N$ steps of repeated application of $I$ to $\epsilon$. By further allowing $N$ to approach $\infty$, we obtain the full set of equivalent expressions of $\epsilon$ that can be discovered using our inference system, i.e. the transitive closure of equivalent expressions related by $f$, from an initial set of equivalent expressions $\epsilon$:

$$\mathrm{cl}_\infty I(\epsilon) = \bigcup_{i=0}^{\infty} I^i(\epsilon). \tag{3.25}$$
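The functional can be sketched in Python as a higher-order function that wraps an EEG function and returns the union of its first N iterates. The names `cl` and `succ` are hypothetical, and small integers stand in for expressions purely to keep the example short; any function from frozensets to frozensets could be passed instead.

```python
from typing import Callable, FrozenSet

EEG = Callable[[FrozenSet], FrozenSet]


def cl(f: EEG, n: int) -> EEG:
    """Return f' with f'(eps) = union of f^i(eps) for i = 0..n."""
    def f_prime(eps: FrozenSet) -> FrozenSet:
        current = frozenset(eps)   # f^0(eps) = eps
        result = current
        for _ in range(n):
            current = f(current)   # f^i(eps) from f^(i-1)(eps)
            result = result | current
        return result
    return f_prime


def succ(s: FrozenSet) -> FrozenSet:
    """Toy stand-in for an EEG function: x is 'rewritten' to x + 1 below 3."""
    return frozenset(x + 1 for x in s if x < 3)


if __name__ == "__main__":
    print(sorted(cl(succ, 2)(frozenset({0}))))   # [0, 1, 2]
    print(sorted(cl(succ, 10)(frozenset({0}))))  # [0, 1, 2, 3]
```

In this toy case the result stops growing once N reaches 3, which is the finite analogue of letting N approach infinity.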

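Alternatively, we can view $\mathrm{cl}_\infty f$ as computing the least fixpoint of $g$:

$$\mathrm{lfp}\, g = \bigcup_{i=0}^{\infty} g^i(\emptyset), \quad \text{where } g(\varepsilon) := f(\varepsilon) \cup \epsilon. \tag{3.26}$$

Here $\varepsilon$ is the argument of $g$, and $\epsilon$ is the fixed initial set of equivalent expressions.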
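The fixpoint view in (3.26) can be checked on a small example by iterating g from the empty set until nothing changes. This is only a sketch: it assumes f is monotone and that finitely many elements are reachable, so the loop terminates. The names `lfp`, `succ` and `EPS0` are hypothetical, and integers again stand in for expressions.

```python
from typing import Callable, FrozenSet


def lfp(g: Callable[[FrozenSet], FrozenSet]) -> FrozenSet:
    """Iterate g from the empty set until a fixpoint is reached."""
    x: FrozenSet = frozenset()
    while True:
        nxt = g(x)
        if nxt == x:
            return x
        x = nxt


def succ(s: FrozenSet) -> FrozenSet:
    """Toy stand-in for an EEG function f."""
    return frozenset(x + 1 for x in s if x < 3)


EPS0 = frozenset({0})  # the fixed initial set


def g(eps: FrozenSet) -> FrozenSet:
    """g(eps) = f(eps) united with the initial set, as in (3.26)."""
    return succ(eps) | EPS0


if __name__ == "__main__":
    print(sorted(lfp(g)))  # [0, 1, 2, 3]
```

The iterates are the empty set, {0}, {0, 1}, {0, 1, 2} and {0, 1, 2, 3}, after which g returns its argument unchanged.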

We may further omit the $\infty$ from $\mathrm{cl}_\infty$ to denote the transitive closure, e.g. the example in (3.25) can be written simply as $\mathrm{cl}\, I(\epsilon)$.

In practice, it is often infeasible to generate the full transitive closure of a given expression. We therefore impose further constraints on how we discover equivalent expressions.

Firstly, instead of exploring the full transitive closure, that is, allowing the number of steps N in (3.24) to be infinite, we may restrict N to a small finite value so that a smaller set of equivalent expressions is computed. In later experiments, we have chosen N = 10.

Secondly, the complexity of finding equivalent expressions is reduced by fixing the structure of subexpressions at a certain depth $k$ in the original expression. Depth is defined as follows: first, the root of the parse tree of an expression is assigned depth $d = 1$; then we recursively define the depth of a node as one more than the depth of its greatest-depth parent. If the depth of a node is greater than $k$, then we fix the structure of its child nodes by disallowing any equivalence transformation beyond this node. We let $I_k$ denote this “depth-limited” equivalence finding function, where $k$ is the depth limit used. We can then use $\mathrm{cl}_N I_k$ and $\mathrm{cl}\, I_k$ to denote the functions that compute, respectively, the union of $N$ steps of $I_k$ and the transitive closure.

This approach is similar to Martel’s depth-limited equivalent expression transform [Mar07]. However, Martel’s method eventually allows transformation of subexpressions beyond the depth limit, because rules of equivalence would transform these to have a smaller depth. This contributes to a time complexity at least exponential in the expression size. In contrast, our technique has a time complexity that does not depend on the size of the input expression, but grows with the depth limit $k$. Note that the full equivalence closure in (3.25), using the inference system we defined earlier, is at least $O((2n - 3)!!)$, where $n$ is the number of terms in an expression, as we discussed earlier. As the maximum number of terms in a binary tree of depth $k$ grows at a rate $O(2^k)$, the number of equivalent expressions that can be discovered is at least $O((2 \times 2^k - 3)!!)$ with respect to $k$. In the production of experimental results, $k$ is chosen to be either 2 or 3.
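To give a feel for how quickly the double-factorial bound grows, the short Python sketch below evaluates (2n - 3)!! with n = 2^k for the depth limits used in the experiments. The helper name `double_factorial` is hypothetical, and taking n = 2^k literally from the asymptotic expression above is an assumption made for illustration; the figures are orders of magnitude for the bound, not measured counts of discovered expressions.

```python
def double_factorial(m: int) -> int:
    """Return m!! = m * (m - 2) * (m - 4) * ..., with m!! = 1 for m <= 1."""
    result = 1
    while m > 1:
        result *= m
        m -= 2
    return result


if __name__ == "__main__":
    for k in (2, 3):
        n = 2 ** k  # assumed number of terms for depth limit k
        print(f"k={k}, n={n}, (2n-3)!! = {double_factorial(2 * n - 3)}")
    # k=2, n=4, (2n-3)!! = 15
    # k=3, n=8, (2n-3)!! = 135135
```

Increasing k by one takes the bound from 15 to 135135, which illustrates why only small depth limits are practical.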

Finally, we use an iterative algorithm to accelerate the computation of $\mathrm{cl}_N f(\epsilon)$, where $f$ is a ∪-distributive EEG function (see Definition 3.3) such as $I_k$. In each iteration, we keep track of the equivalent expressions newly discovered in that iteration, so that in the next iteration we apply $f$ only to those expressions and avoid redundant computation. This algorithm is shown in Figure 3.1. Its correctness is discussed in greater depth in Appendix A.

Definition 3.3. We say an EEG function f is ∪-distributive if and only if it satisfies f(a ∪ b) = f(a) ∪ f(b) for all sets of expressions a and b.

The algorithms we have described so far do not incorporate the analyses detailed in Sections 3.1 and 3.2; hence, they do not guide the optimization process with objectives to minimize. The following section explains how the analyses can be used to steer the algorithms to optimize the trade-off between accuracy and area in synthesized circuits of transformed expressions.

function CLOSURE(f, N, ϵ)
    s_0 ← s'_0 ← ϵ
    for i ← 1, ..., N do
        s'_i ← f(s'_{i−1}) − s_{i−1}
        s_i ← s_{i−1} ∪ s'_i
        if s'_i = ∅ then
            return s_i
        end if
    end for
    return s_i
end function

Figure 3.1. Our algorithm to compute $\mathrm{cl}_N f(\epsilon)$, which discovers a set of equivalent expressions with a ∪-distributive EEG f from an initial set of equivalent expressions $\epsilon$.
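The frontier-based iteration of Figure 3.1 can be sketched in Python and compared against the direct union of iterates. The names `closure`, `naive_cl`, `GRAPH` and `neighbours` are hypothetical; the toy function maps each integer to its successors in a small graph, which makes it ∪-distributive by construction, and it stands in for an EEG function such as the depth-limited one.

```python
from typing import Callable, FrozenSet

EEG = Callable[[FrozenSet], FrozenSet]


def closure(f: EEG, n: int, eps: FrozenSet) -> FrozenSet:
    """Frontier-based union of up to n steps of a union-distributive f."""
    seen = frozenset(eps)       # s_0
    frontier = frozenset(eps)   # s'_0
    for _ in range(n):
        frontier = f(frontier) - seen  # s'_i: only newly found elements
        seen = seen | frontier         # s_i
        if not frontier:               # nothing new, so stop early
            break
    return seen


def naive_cl(f: EEG, n: int, eps: FrozenSet) -> FrozenSet:
    """Direct union of f^i(eps) for i = 0..n, used only for comparison."""
    current = frozenset(eps)
    result = current
    for _ in range(n):
        current = f(current)
        result = result | current
    return result


# Toy union-distributive function: element-wise successor lookup.
GRAPH = {0: {1, 2}, 1: {0, 3}, 2: {3}, 3: set()}


def neighbours(s: FrozenSet) -> FrozenSet:
    return frozenset(y for x in s for y in GRAPH[x])


if __name__ == "__main__":
    start = frozenset({0})
    for n in range(5):
        assert closure(neighbours, n, start) == naive_cl(neighbours, n, start)
    print(sorted(closure(neighbours, 10, start)))  # [0, 1, 2, 3]
```

The saving comes from applying f only to the frontier: elements already in `seen` are never expanded again, and the early exit avoids running all N steps once no new element appears.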

In document Structural optimization of numerical programs for high-level synthesis (Page 94-98)