Generation of Disjoint MIMO Patterns - Exhaustive Pattern Enumeration

4.2 Exhaustive Pattern Enumeration

4.2.5 Generation of Disjoint MIMO Patterns

The disjoint pattern enumeration algorithm produces the set of all feasible disjoint MIMO patterns, denoted DPS. According to Theorem 2, each disjoint pattern dp ∈ DPS is composed of two or more connected patterns that satisfy the input, output and convexity constraints. We use the set of all feasible connected MIMO patterns, denoted CPS, as the base from which all the disjoint patterns are produced.

We observe that the number of output nodes of any feasible disjoint pattern is simply the sum of those of its constituent connected patterns. Based on this observation, we classify the patterns according to their number of output nodes. We define CPS_i and DPS_i as the sets of all feasible connected patterns and disjoint patterns, respectively, with exactly i output nodes.

CHAPTER 4. SCALABLE CUSTOM INSTRUCTIONS IDENTIFICATION 70

Note that, according to our definition, CPS_i ∩ DPS_i = ∅. Feasible disjoint patterns with n output nodes can be generated by combining feasible connected patterns with fewer than n output nodes. More formally, we have to consider all possible partitions of n (a partition of a positive integer n is a way of writing n as a sum of positive integers) except for the single-element partition n itself. For example, the partitions of the integer 4 are 4, 3 + 1, 2 + 2, 2 + 1 + 1 and 1 + 1 + 1 + 1. Therefore

\[
DPS_4 = (CPS_3 \times CPS_1) \cup (CPS_2 \times CPS_2) \cup (CPS_2 \times CPS_1 \times CPS_1) \cup (CPS_1 \times CPS_1 \times CPS_1 \times CPS_1)
\]

where × and ∪ denote the cross product and union operations, respectively. However, we can simplify the disjoint pattern generation process by replacing certain parts of the above equation with DPS_i. The equations for disjoint patterns with up to 5 output nodes are shown below.

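\[
\begin{aligned}
DPS_1 &= \emptyset \\
DPS_2 &= CPS_1 \times CPS_1 \\
DPS_3 &= (CPS_2 \times CPS_1) \cup (CPS_1 \times CPS_1 \times CPS_1) \\
      &= (CPS_2 \times CPS_1) \cup (DPS_2 \times CPS_1) \\
DPS_4 &= (CPS_3 \times CPS_1) \cup (CPS_2 \times CPS_2) \cup (CPS_2 \times CPS_1 \times CPS_1) \cup (CPS_1 \times CPS_1 \times CPS_1 \times CPS_1) \\
      &= (CPS_3 \times CPS_1) \cup (CPS_2 \times CPS_2) \cup \bigl((CPS_2 \times CPS_1) \cup (CPS_1 \times CPS_1 \times CPS_1)\bigr) \times CPS_1 \\
      &= (CPS_3 \times CPS_1) \cup (CPS_2 \times CPS_2) \cup (DPS_3 \times CPS_1) \\
DPS_5 &= (CPS_4 \times CPS_1) \cup (CPS_3 \times CPS_2) \cup (DPS_4 \times CPS_1)
\end{aligned}
\]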
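The unsimplified expansion of DPS_n, with one cross-product term per partition of n, can be reproduced mechanically. The following Python fragment is only an illustrative sketch: the function names `partitions` and `disjoint_terms` are our own and do not appear in the original algorithm. It lists every partition of n with at least two parts, each of which corresponds to one term such as CPS_2 × CPS_1 × CPS_1.

```python
def partitions(n, max_part=None):
    """Yield every partition of n as a non-increasing tuple of positive integers."""
    if max_part is None:
        max_part = n
    if n == 0:
        yield ()
        return
    for first in range(min(n, max_part), 0, -1):
        # The remaining parts may not exceed `first`, which keeps tuples non-increasing.
        for rest in partitions(n - first, first):
            yield (first,) + rest


def disjoint_terms(n):
    """Partitions of n with at least two parts (hypothetical helper).

    Each returned tuple (a, b, ...) stands for the term CPS_a x CPS_b x ...
    """
    return [p for p in partitions(n) if len(p) >= 2]


for parts in disjoint_terms(4):
    print(" x ".join(f"CPS{k}" for k in parts))
# Prints:
# CPS3 x CPS1
# CPS2 x CPS2
# CPS2 x CPS1 x CPS1
# CPS1 x CPS1 x CPS1 x CPS1
```

The four printed terms are exactly the four products in the unsimplified equation for DPS_4.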

The above equations indicate that the disjoint patterns should be generated in increasing order of the number of output nodes (i.e., DPS_2, DPS_3, ...). Moreover, each cross product operation is performed on two sets, i.e., each disjoint pattern is obtained by composing two previously generated patterns (connected or disjoint), which simplifies the generation algorithm. Note that, starting from DPS_6, cross products of more than two sets need to be performed; for example, the term CPS_2 × CPS_2 × CPS_2 cannot be resolved in this manner. However, the term CPS_2 × CPS_2 appears during the generation of DPS_4. By re-using such intermediate results, we can still ensure that the cross product is always performed on two sets.

Figure 4.5: Non-connectivity/Convexity check based on upward scope. (a) p2 connects with p1. (b) p2 introduces non-convexity.
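The remark about DPS_6 can be checked with a short script. The sketch below assumes that the general form of the simplified equations is "every two-part product CPS_a × CPS_b with a + b = n, plus DPS_(n-1) × CPS_1"; this form is inferred from the equations listed for DPS_2 to DPS_5 and is not stated explicitly in the text. The helper names `all_partitions`, `covered` and `missing` are hypothetical.

```python
from functools import lru_cache


def all_partitions(n, max_part=None):
    """Return the set of partitions of n as non-increasing tuples."""
    max_part = n if max_part is None else max_part
    if n == 0:
        return {()}
    result = set()
    for first in range(min(n, max_part), 0, -1):
        for rest in all_partitions(n - first, first):
            result.add((first,) + rest)
    return result


@lru_cache(maxsize=None)
def covered(n):
    """Partitions reached by the assumed two-set form of the DPS_n equation."""
    # CPS_a x CPS_(n-a) with a >= n-a, so each unordered pair appears once.
    pairs = {(a, n - a) for a in range(n - 1, 0, -1) if a >= n - a}
    # DPS_(n-1) x CPS_1 appends one part of size 1 (DPS_1 is empty).
    extended = {p + (1,) for p in covered(n - 1)} if n > 2 else set()
    return frozenset(pairs | extended)


def missing(n):
    """Multi-part partitions of n that the two-set terms do not reach."""
    wanted = {p for p in all_partitions(n) if len(p) >= 2}
    return sorted(wanted - covered(n), reverse=True)


for n in range(2, 7):
    print(n, missing(n))
# The last line printed is: 6 [(2, 2, 2)]
```

Under this assumption nothing is missing up to n = 5, while for n = 6 the only partition left over is 2 + 2 + 2, which matches the term CPS_2 × CPS_2 × CPS_2 singled out above.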

Pruning

Directly computing the right-hand side of each equation for DPS_i may produce infeasible or redundant patterns. For example, if we combine two connected patterns that overlap with each other, the resulting pattern will either be connected or have fewer output nodes than expected. Non-convex patterns may also be generated in this process. To avoid this, we must ensure that each feasible disjoint pattern is generated by combining two patterns p1 and p2 (disjoint or connected) such that (1) p1 and p2 are disjoint from each other and (2) there is no path from p1 to p2 or from p2 to p1. The second condition ensures that combining the two patterns does not result in a non-convex disjoint pattern.

We define the upward scope of a pattern p, upScope(p), for this purpose. It is the collection of all the predecessors of the nodes in pattern p. When combining two patterns p1 and p2, if p1 ∩ upScope(p2) ≠ ∅ or p2 ∩ upScope(p1) ≠ ∅, they must not be combined, because the non-connectivity condition, the convexity condition, or both will be violated. To see why, assume that p2 ∩ upScope(p1) ≠ ∅. Then there must exist nodes v ∈ p2 and u ∈ p1 such that v ∈ predecessors(u), and hence a path ⟨v, ..., x_i, ..., u⟩ between v and u. If every x_i belongs to either p1 or p2, the combined subgraph is connected; otherwise, the combined subgraph is non-convex. Figure 4.5 shows these two cases. During disjoint pattern generation, the upward scope of each pattern needs to be computed and stored to perform this check.
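The upward scope check can be sketched in a few lines of Python. This is a minimal illustration under our own assumptions: the DFG is given as a dictionary `preds` mapping each node to its direct predecessors, patterns are sets of nodes, and the names `up_scope` and `can_combine` as well as the example graph are hypothetical. The explicit overlap test implements condition (1), the scope tests implement condition (2).

```python
def up_scope(pattern, preds):
    """Return all transitive predecessors of the nodes in `pattern`."""
    seen = set()
    stack = list(pattern)
    while stack:
        node = stack.pop()
        for parent in preds[node]:
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def can_combine(p1, p2, preds):
    """True if p1 and p2 are disjoint and neither reaches the other."""
    if p1 & p2:
        return False  # overlapping patterns, condition (1) fails
    if p1 & up_scope(p2, preds) or p2 & up_scope(p1, preds):
        return False  # a path exists between them, condition (2) fails
    return True


# Hypothetical DFG with edges 0->2, 1->2, 2->4, 3->4 and 5->6.
preds = {0: [], 1: [], 2: [0, 1], 3: [], 4: [2, 3], 5: [], 6: [5]}

print(can_combine({2}, {3}, preds))  # no path in either direction
print(can_combine({4}, {0}, preds))  # node 0 lies in upScope({4})
```

In the second call the path from node 0 to node 4 passes through node 2, which belongs to neither pattern, so the union would be non-convex and the pair is rejected.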

To further prune the search space, we first number the nodes according to reverse topologically sorted order. Next, we define CPS^v_i as the set of feasible connected patterns with i output nodes and v as the smallest-numbered node. A similar definition applies to DPS^v_i. Clearly,

\[
DPS_i = \bigcup_{v \in \text{valid nodes}} DPS^v_i, \qquad DPS = \bigcup_{i=2}^{MAXOUT} DPS_i
\]

where MAXOUT is the output constraint.
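The numbering and the per-node classification can be illustrated as follows. The sketch assumes that "reverse topologically sorted order" means that every node receives a smaller number than all of its predecessors, so that sink nodes are numbered first; the DFG encoding (`preds` maps a node to its direct predecessors) and the function names are our own.

```python
from collections import defaultdict


def reverse_topological_numbers(preds):
    """Number nodes so that each node precedes all of its predecessors."""
    succ_count = {node: 0 for node in preds}
    for parents in preds.values():
        for parent in parents:
            succ_count[parent] += 1
    # Start from the sinks, i.e. nodes without successors.
    ready = sorted(node for node, count in succ_count.items() if count == 0)
    order = {}
    while ready:
        node = ready.pop(0)
        order[node] = len(order)
        for parent in preds[node]:
            succ_count[parent] -= 1
            if succ_count[parent] == 0:
                ready.append(parent)
    return order


def bucket_by_smallest_node(patterns, order):
    """Group patterns by their smallest-numbered node v (the superscript in CPS^v_i)."""
    buckets = defaultdict(list)
    for pattern in patterns:
        v = min(pattern, key=order.__getitem__)
        buckets[v].append(pattern)
    return dict(buckets)


# Hypothetical diamond-shaped DFG: a -> b, a -> c, b -> d, c -> d.
preds = {"a": [], "b": ["a"], "c": ["a"], "d": ["b", "c"]}
order = reverse_topological_numbers(preds)
patterns = [frozenset("ab"), frozenset("b"), frozenset("bd"), frozenset("c")]
buckets = bucket_by_smallest_node(patterns, order)
```

Because every pattern falls into exactly one bucket, the union of the buckets over all nodes v recovers the original set, which is the property expressed by the first equation above.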

Algorithm 4 details the disjoint pattern generation steps. It computes DPS^v_i for each valid node v in the innermost loop according to the corresponding equation (line 8), aggregates them to form DPS_i (line 20) and finally DPS (line 21).

DPS^v_i is computed by combining the pattern sets of node v with the pattern sets of node u, where u comes after v in reverse topologically sorted order (line 6). Non-symmetrical terms, such as CPS_1 × CPS_2, should be combined twice with their places exchanged (lines 18–19). The upward scope check helps reduce the design space in two places. First, node u can be bypassed entirely if it falls in upScope(v) (line 7); otherwise the non-connectivity or convexity condition would be violated. Second, a constituent pattern p1 from the pattern set of v can be bypassed if upScope(p1) contains u (line 10). Each of these two checks bypasses a whole set of combinations at a time and greatly reduces the search space. A normal upward scope check between the two constituent patterns is conducted before combining them (line 13). Lastly, the resultant pattern is added to DPS^v_i subject to the input check (lines 16–17).

Algorithm 4: Feasible disjoint pattern enumeration

DPSetGen
begin
 1   DPS := ∅;
 2   for i = 2 to MAXOUT do
 3       DPS_i := ∅;
 4       for all valid nodes v of DFG in reverse topological order do
 5           DPS^v_i := ∅;
 6           for all valid nodes u s.t. order(u) > order(v) do
 7               if u ∈ upScope({v}) then continue with the next u;
 8               for every term T on r.h.s. of the equation of DPS_i do
                     Let T = T1 × T2;
 9                   for all the patterns p1 in T1 with smallest node v do
10                       if u ∈ upScope(p1) then
11                           continue with the next p1;
12                       for all patterns p2 in T2 with smallest node u do
13                           if p1 ∩ upScope(p2) ≠ ∅ or p2 ∩ upScope(p1) ≠ ∅ then
14                               continue to the next p2;
15                           tmp := p1 ∪ p2;
16                           if InCheck(tmp) then
17                               DPS^v_i := DPS^v_i ∪ {tmp};
18                   if T1 ≠ T2 then
19                       repeat lines 9 to 17 by exchanging the place of T1 and T2;
20           DPS_i := DPS_i ∪ DPS^v_i;
21       DPS := DPS ∪ DPS_i;
end
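The core of the generation step can be sketched in Python as follows. This is a simplified illustration, not a transcription of Algorithm 4: it omits the per-node loops over v and u (and therefore the two early bypass checks) and relies on a set to remove duplicates, it only supports MAXOUT up to 5 because it uses the two-set equations listed earlier, and the input check is passed in as a hypothetical callable `in_check`. The connected pattern sets are assumed to be supplied by the caller; in the example they are listed by hand for a hypothetical DFG consisting of two independent chains 0 → 1 and 2 → 3.

```python
from itertools import product


def up_scope(pattern, preds):
    """Transitive predecessors of every node in `pattern`."""
    seen, stack = set(), list(pattern)
    while stack:
        for parent in preds[stack.pop()]:
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return frozenset(seen)


def generate_dps(cps, preds, max_out, in_check=lambda pattern: True):
    """Map each i in 2..max_out to DPS_i; cps maps i to CPS_i (frozensets)."""
    if max_out > 5:
        raise ValueError("the two-set equations used here stop at DPS_5")
    scopes = {}  # upward scopes are computed once and stored

    def scope(pattern):
        if pattern not in scopes:
            scopes[pattern] = up_scope(pattern, preds)
        return scopes[pattern]

    dps = {1: set()}
    for i in range(2, max_out + 1):
        # Terms of the DPS_i equation: CPS_a x CPS_(i-a), then DPS_(i-1) x CPS_1.
        terms = [(cps.get(a, []), cps.get(i - a, []))
                 for a in range(i - 1, 0, -1) if a >= i - a]
        terms.append((dps[i - 1], cps.get(1, [])))
        found = set()
        for t1, t2 in terms:
            for p1, p2 in product(t1, t2):
                if p1 & p2:
                    continue  # the two patterns overlap
                if p1 & scope(p2) or p2 & scope(p1):
                    continue  # upward scope check (line 13)
                combined = p1 | p2
                if in_check(combined):
                    found.add(combined)
        dps[i] = found
    del dps[1]
    return dps


F = frozenset
preds = {0: [], 1: [0], 2: [], 3: [2]}
cps = {1: [F({0}), F({1}), F({0, 1}), F({2}), F({3}), F({2, 3})]}
result = generate_dps(cps, preds, max_out=3)
print(len(result[2]), len(result[3]))  # prints: 9 0
```

In this example each of the nine patterns in DPS_2 pairs one pattern from the first chain with one from the second, and DPS_3 is empty because any third single-output pattern either overlaps with or is connected to a pattern already chosen from the same chain.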

In document Design methodologies for instruction-set extensible processors (Page 87-91)