The Move Step - An Algorithm for Cost-Based Buffering

7.3 An Algorithm for Cost-Based Buffering

7.3.6 The Move Step

Let P be a maximal path in A with the property that none of its internal vertices has out-degree larger than 1. Let ν, ω ∈ V(A) be its endpoints (i.e., P = A[ν,ω]).

We use a Dijkstra-like [Dij59] algorithm to propagate initial candidate pairs with node ω and position κ(ω) along P to ν. If ω ∈ N\{s}, there is only one initial candidate pair: the pair representing the sub-tree consisting of ω only. If ω is a Steiner node with out-degree larger than one, the initial candidate pairs are a result of the merge step (see Section 7.3.8). All these initial candidate pairs are stored in a heap with key cost(ident(CC)) + cost(invert(CC)).

While the heap is non-empty, we do the following. First, we erase a candidate pair CC with minimum key from the heap. If node(CC) = ν, we mark it as a final candidate pair. Otherwise, we create new candidate pairs by propagation along the edge e = κ((ν, node(CC))), where (ν, node(CC)) is the edge entering node(CC) in A. We create two types of new candidate pairs: candidate pairs that arise from CC by propagation along a wire or a via without inserting new repeaters, and candidate pairs that arise by repeater insertion.
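The heap-driven loop described above can be sketched as follows. This is a minimal illustration, not the implementation from the thesis: the class `CandidatePair`, the function `move_step`, and the callback `propagate` (which stands in for both propagation types) are hypothetical names, and a candidate pair is reduced to its node and the two costs that form the key.

```python
import heapq
from dataclasses import dataclass
from itertools import count


@dataclass
class CandidatePair:
    """Hypothetical stand-in: only the fields needed for the heap key."""
    node: str           # vertex of the path at which the pair currently sits
    cost_ident: float   # cost of the ident part
    cost_invert: float  # cost of the invert part

    @property
    def key(self) -> float:
        # Heap key as in the text: cost(ident) + cost(invert).
        return self.cost_ident + self.cost_invert


def move_step(initial, nu, propagate):
    """Pop pairs by minimum key; finalize those at nu, expand all others.

    `propagate(cc)` is an assumed callback returning the new candidate
    pairs created from cc (with or without repeater insertion).
    """
    tie = count()  # tie-breaker so the heap never compares CandidatePair objects
    heap = [(cc.key, next(tie), cc) for cc in initial]
    heapq.heapify(heap)
    final = []
    while heap:
        _, _, cc = heapq.heappop(heap)
        if cc.node == nu:
            final.append(cc)  # reached the upstream endpoint of the path
            continue
        for new_cc in propagate(cc):
            heapq.heappush(heap, (new_cc.key, next(tie), new_cc))
    return final
```

Pruning of almost-dominated pairs is deliberately left out of this sketch; it would be applied at the point where new pairs are pushed.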

(a) Non-parallel mode. Creating a new candidate pair left of u would introduce an electrical violation.

(b) Parallel mode. Creating a new candidate pair left of u would introduce an electrical violation. The new candidate ends the parallel mode.

Figure 7.5: Propagation of candidate pairs with repeater insertion. We apply the move step of Section 7.3.6 on the path from ν to ω. By binary search we look for a position u “between” p(CC) and κ(ν) such that we do not create an electrical violation. We create a new candidate with position u that represents repeater insertion.

Propagation without repeater insertion. We propagate CC to position κ(ν). The resulting candidate pair is added to the heap unless it is almost-dominated by a candidate pair that is currently contained in the heap or that has been removed before. The new candidate pair CC′ has node(CC′) = ν and p(CC′) = κ(ν). Note that CC and CC′ have the same type.

Propagation with repeater insertion. When we create new candidate pairs by inserting further repeaters, we have to distinguish whether CC is of parallel type or not.

If CC is not of parallel type, we do the following. For each repeater l ∈ L we look for a point u on the straight path segment between p(CC) and κ(ν) (if e is a wiring edge) or in {p(CC), κ(ν)} (if e is a via). We want to create a new candidate pair CC(u, l) arising by propagation to u and insertion of a repeater of type l in the valid part as described in Section 7.3.3. Position u is chosen such that the downstream capacitance of the inserted repeater does not exceed the capacitance limit of l and such that the slew limit of the new candidate is not smaller than the slew target. By binary search we can find u as far away from p(CC) as possible. If no such point exists, we choose u = p(CC). This is the case if the capacitance of the valid part of CC is already larger than the capacitance limit of l or if a repeater of type l cannot drive CC at all without violating the slew limit. We add CC(u, l) to the heap unless it is almost-dominated by a candidate pair contained in the heap or already removed from it, and set node(CC(u, l)) = ν if u = κ(ν). An example of non-parallel repeater insertion during the move step can be found in Figure 7.5(a).
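The binary search for the farthest admissible position can be sketched as follows. This is a minimal sketch under stated assumptions: the segment from the current position to κ(ν) is parametrized by t ∈ [0, 1], the hypothetical predicate `is_feasible(t)` bundles the capacitance and slew checks, and feasibility is assumed to be monotone (once a position violates a limit, all positions farther away do too). The function name and the tolerance parameter are illustrative only.

```python
def farthest_feasible(is_feasible, is_via=False, tol=1e-6):
    """Return the largest t in [0, 1] for which is_feasible(t) holds.

    t = 0 is the current position of the candidate pair, t = 1 is kappa(nu).
    Assumes monotone feasibility. Returns 0.0 if no position is feasible,
    mirroring the fallback to the current position in the text.
    """
    if not is_feasible(0.0):
        return 0.0           # repeater cannot drive the pair at all
    if is_feasible(1.0):
        return 1.0           # the whole edge can be traversed
    if is_via:
        return 0.0           # a via offers only its two endpoints
    lo, hi = 0.0, 1.0        # invariant: lo is feasible, hi is infeasible
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if is_feasible(mid):
            lo = mid
        else:
            hi = mid
    return lo
```

For instance, with a purely illustrative linear capacitance model `cap(t) = 10 + 40 * t` and a capacitance limit of 30, calling `farthest_feasible(lambda t: 10 + 40 * t <= 30)` converges to the midpoint of the segment.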

If CC is of parallel type, we create candidate pairs resolving the parallel mode. For each inverter l ∈ L and each polarity pol ∈ {ident, invert} we use binary search to find the point u as far away from p(CC) as possible such that, after propagation to position u and insertion of an inverter of type l in the pol-part of CC, we obtain a candidate pair without capacitance violations and for which no slew limit is smaller than the slew target. As before, we choose u on the straight path segment between p(CC) and κ(ν) if e is a wiring edge and in {p(CC), κ(ν)} if e is a via. If no such point u exists, we choose u := κ(ν). We add the resulting candidate pair to the heap unless it is almost-dominated. An example of parallel repeater insertion during the move step can be found in Figure 7.5(b).

When we compute the positions u we only take electrical violations into account. Positions for which delays are optimum can be between p(CC) and u. In our practical application, tile sizes are small compared to optimum distances between repeaters, and the error we make by computing u based on electrical violations only is also small. In the presence of large tile sizes we can make use of the stationary slew slew_target in the pre-computed optimum repeater chain (Bartoschek et al. [Bar+09]) and compute u such that CC(u, l) does not result in a slew larger than slew_target at CC. This can be achieved by reducing the slew limit of every valid part of CC to slew_target.

Every time we add a candidate pair CC to the heap, we erase from the heap all candidate pairs CC′ (≠ CC) that CC almost-dominates. Note that due to non-negativity of the cost function, a candidate pair CC₁ can never be almost-dominated by a pair CC₂ that is created after CC₁ is extracted from the heap unless CC₁ almost-dominates CC₂ as well. When pruning almost-dominated candidates we have to make sure that whenever CC₁ and CC₂ almost-dominate each other, we never erase the earlier candidate pair.
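The pruning rule, including the requirement that the earlier pair survives a mutual almost-domination, can be sketched as follows. This is an illustrative sketch only: the almost-domination relation is not defined in this section, so it is passed in as the hypothetical callback `almost_dominates(a, b)`, and the heap is simplified to a plain list of queued pairs.

```python
def insert_with_pruning(queued, removed, new_cc, almost_dominates):
    """Try to add new_cc to the queued pairs and return the updated list.

    queued  -- pairs currently in the heap (simplified here to a list)
    removed -- pairs already extracted from the heap
    almost_dominates(a, b) -- assumed relation, True if a almost-dominates b
    """
    # Reject new_cc if any earlier pair, queued or removed, almost-dominates it.
    # Because this check runs first, a mutual almost-domination is always
    # resolved in favour of the earlier pair.
    if any(almost_dominates(old, new_cc) for old in queued + removed):
        return queued
    # Otherwise erase every queued pair that new_cc almost-dominates.
    kept = [old for old in queued if not almost_dominates(new_cc, old)]
    kept.append(new_cc)
    return kept
```

As a toy example with pairs reduced to a single cost and the placeholder relation `a <= 1.1 * b`, inserting 9.5 next to a queued 10.0 is rejected (the two dominate each other, so the earlier pair stays), while inserting 8.0 replaces the 10.0.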

Although almost-domination is already sufficient to prune most candidate pairs, their number can still be exponential in general (polynomial runtime bounds can be given under restrictions on the input, see Permin [Per16]). In order to obtain acceptable running times in practice, Permin [Per16] developed and implemented speed-up techniques such as future costs and caching of transitions that lead to almost-dominated solutions. By increasing the values γ_cost, γ_cap, γ_slewlim we can obtain even further speed-ups, albeit at the cost of worse solutions.

In the next section we show how we can prune and avoid most candidate pairs and thus obtain a polynomial running time. These techniques are used in the fast version of the algorithm.

7.3.7 Speed-up Techniques for the Move Step in the Fast Version

In document Timing-Constrained Global Routing with Buffered Steiner Trees (Page 147-149)