A Simple Delay Model - Fast Repeater Tree Construction

Since we want to evaluate the properties of our topologies with respect to timing, we somehow have to compute a slack at root and sinks. It would be prohibitively slow to insert repeaters into each topology we want to evaluate. Therefore, we propose a simple delay model that estimates the timing from the geometric structure of the topology. The delay model will compute arrival times and required arrival times for all nodes of a topology giving us a slack that can be used to evaluate the topology. The delay model mainly consists of two components: delay over wire segments and delay due to bifurcations.

We have seen in Section 4.1 how buffering a long net linearizes the delay. Given a buffering mode m, the estimated delay for a net between two points x and v is given by

\[ \mathrm{delay} := m_d \, \lVert x - v \rVert. \tag{5.1} \]
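As a small illustration of Equation (5.1), the following sketch evaluates the linear wire delay for one net. It is not taken from the source: the function name `wire_delay`, the choice of the l1 (Manhattan) distance for the norm, and the numeric values are assumptions made for the example.

```python
def wire_delay(unit_delay, x, v):
    """Estimated delay of a buffered net between points x and v (Eq. 5.1).

    unit_delay plays the role of the per-length delay of the buffering mode.
    Assumption: the norm is the l1 (Manhattan) distance.
    """
    distance = abs(x[0] - v[0]) + abs(x[1] - v[1])
    return unit_delay * distance


# Hypothetical values: 0.5 delay units per length unit, points 700 units apart.
example_delay = wire_delay(0.5, (0, 0), (300, 400))
print(example_delay)  # 350.0
```

Doubling the distance doubles the estimate, which is the linearization the buffered net is expected to show.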

Every internal node of a topology with outdegree two is a bifurcation and thus an additional capacitance load for the circuit driving both of the two outgoing branches (compared to alternative direct connections). The real delay caused by bifurcations is hard to estimate beforehand. It will depend on the strength of the driver, the additional capacitance, and the position of the driver compared to the sinks. In Section 4.1.6 we computed the parameter $d_{node}$ estimating the average effect of a bifurcation. It is a very rough estimation, but we will show in Section 8.5 that the used value serves us well. To evaluate the delay through a topology, we will add the additional delay to each outgoing edge of a node with two children.

It is reasonable to assume that the additional load capacitance will be smaller for the less critical branch. Uncritical side paths are more likely to be buffered by a small repeater with nearly negligible capacitance. We therefore allow $d_{node}$ to be distributed between both involved edges. We denote by $d_{node}(e)$ the amount assigned to edge $e$. We introduce a new parameter $\eta$ controlling how uneven the distribution of $d_{node}$ can be. If $e$ is an outgoing edge of a node with outdegree 1, then we require $d_{node}(e) = 0$. Otherwise, we require that $d_{node}(e) \ge \eta d_{node}$. For two edges $e, e'$ leaving the same internal node we require $d_{node}(e) + d_{node}(e') = d_{node}$. The parameter $\eta$ has to be between 0 and 1/2 for these requirements to be satisfiable, since both edges must receive at least $\eta d_{node}$ and the two shares sum to $d_{node}$.
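The requirements on the distribution can be checked mechanically. The sketch below is a hypothetical helper, not part of the original work: the names `feasible_share_interval` and `is_valid_split`, the tolerance parameter, and the sample numbers are assumptions for illustration only.

```python
def feasible_share_interval(d_node, eta):
    """Interval of shares one edge of a bifurcation may receive."""
    if not 0.0 <= eta <= 0.5:
        raise ValueError("eta must lie between 0 and 1/2")
    # Lower bound from the requirement share >= eta * d_node; the upper
    # bound follows because the sibling edge needs at least the same amount.
    return eta * d_node, (1.0 - eta) * d_node


def is_valid_split(share_e, share_e2, d_node, eta, tol=1e-9):
    """Check both requirements for two edges leaving the same node."""
    lower, _ = feasible_share_interval(d_node, eta)
    sums_up = abs(share_e + share_e2 - d_node) <= tol
    bounded = share_e >= lower - tol and share_e2 >= lower - tol
    return sums_up and bounded


# Hypothetical numbers: d_node = 20, eta = 0.25, so each edge gets 5 to 15.
print(feasible_share_interval(20.0, 0.25))   # (5.0, 15.0)
print(is_valid_split(5.0, 15.0, 20.0, 0.25))  # True
print(is_valid_split(2.0, 18.0, 20.0, 0.25))  # False
```

With eta = 1/2 the interval collapses to a single point and the split is forced to be even; with eta = 0 the whole penalty may be assigned to one edge.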

Next, we have to determine the arrival time at the root node. If the edge leaving the root has buffering mode $m$ assigned, then we assume that the root will have to drive capacitance $m_{cap}$. We set the arrival time at the root to

\[ at_T(r) := \max\left\{ at_r^r(m_{cap}),\ at_r^f(m_{cap}) \right\}. \]

Note that for maximizing the worst slack, an accurate arrival time at the root is not important because each change affects the slack at all sinks in the same way.

Finally, we have to determine the required arrival time for each sink. In our simple delay model, we only want to handle a single RAT value and not a pair of functions. Therefore, we evaluate the RAT function at the slew target of the incoming edge. As shown in Section 4.1.5, the capacitance of a sink has to be taken into account when the delay is estimated. This is done by subtracting the appropriate sinkdelay from the resulting RAT.

Given a sink with required arrival time function rat, pin capacitance cap, and buffering mode m at the sink’s incident edge, we define the RAT used in the delay model as

\[ \mathrm{sinkrat}(rat, cap, m) := \min\left\{ rat^r(m_s^r),\ rat^f(m_s^f) \right\} - \mathrm{sinkdelay}(cap). \tag{5.2} \]

Given a topology and a buffering mode assignment $F : E(T) \to M$, we can now estimate the slack at sink $s$ to be

\[ \sigma_s := \mathrm{sinkrat}(rat_s, \mathrm{capin}(s), m_s) - \sum_{e=(v,w) \in E(T_{[r,s]})} \left( d_{node}(e) + F(e)_d \, \lVert Pl(v) - Pl(w) \rVert \right) - at_T(r) \]

with $m$ being the buffering mode of the arc entering $s$ and $m_s$ the corresponding slew target.
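To make the slack estimate concrete, the following sketch evaluates Equation (5.2) and the slack formula for a single root-to-sink path. It is an illustration under stated assumptions, not the original implementation: the RAT functions are modeled as hypothetical linear functions of the slew, the value of sinkdelay is passed in as a precomputed number because its definition lies outside this section, and the edge data are invented.

```python
def sinkrat(rat_rise, rat_fall, slew_rise, slew_fall, sink_delay):
    """Single RAT value of a sink (Eq. 5.2).

    rat_rise / rat_fall: callables mapping a slew to a required arrival time.
    slew_rise / slew_fall: slew targets of the mode on the incoming edge.
    sink_delay: precomputed sinkdelay(cap) value (assumed given).
    """
    return min(rat_rise(slew_rise), rat_fall(slew_fall)) - sink_delay


def estimated_slack(sink_rat, path_edges, at_root):
    """Slack at a sink: RAT minus path delay minus root arrival time.

    path_edges: iterable of (d_node_share, unit_delay, length), one entry
    per edge on the path from the root to the sink.
    """
    path_delay = sum(share + unit_delay * length
                     for share, unit_delay, length in path_edges)
    return sink_rat - path_delay - at_root


# Hypothetical sink: linear RAT functions and a slew target of 40 for both.
rat_value = sinkrat(lambda slew: 500.0 - 0.5 * slew,
                    lambda slew: 480.0 - 0.25 * slew,
                    40.0, 40.0, sink_delay=10.0)

# Hypothetical path: one edge without bifurcation penalty, one with share 12.
edges = [(0.0, 0.5, 100.0), (12.0, 0.5, 60.0)]
print(rat_value)                                 # 460.0
print(estimated_slack(rat_value, edges, 20.0))   # 348.0
```

Because the root arrival time is subtracted from every sink in the same way, shifting `at_root` moves all estimated slacks by the same amount, which matches the remark that its accuracy does not matter for maximizing the worst slack.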

Figure 5.3 shows how our delay model correlates with the slacks that are achieved after buffering. For each instance of a 22 nm design we depict the difference between the slack of the topology used for repeater insertion and the slack of the final result. Although there are some outliers where we overestimate the strength of the root and are about 50 picoseconds too optimistic, the vast majority of instances are estimated correctly to within 20 picoseconds.

Figure 5.3: Correlation between estimated slacks and exact slacks. For each instance (slightly more than 300 000) of a middle-sized 22 nm design the difference (y-axis) between the slack in our delay model and the final slack after buffering is shown. The instances are sorted by the distance (x-axis) of the most critical sink to the root.

5.1.1 Time Tree

Algorithm 1 TimeTree

Input: A topology $T$, an embedding $Pl$, a buffering mode assignment $F : E(T) \to M$, and parameters $d_{node}$, $\eta$
Output: Arrival time function $at_T$, RAT function $rat_T$, and a $d_{node}$ assignment

for $v \in V(T)$ traversed in postorder do
    if $v$ is a leaf then
        Let $e$ be the incoming edge to $v$ if $v \ne r$
        $rat_T(v) := \mathrm{sinkrat}(rat_v, \mathrm{capin}(v), F(e))$
    else
        if $|\delta^+(v)| = 1$ then    ▷ $v$ is root or Steiner point along a path
            Let $\{a\} = \delta^+(v)$
            $rat_T(v) := rat_T(a) - F((v, a))_d \, \lVert Pl(v) - Pl(a) \rVert$
            $d_{node}((v, a)) := 0$
        else
            Let $\{a, b\} = \delta^+(v)$ with $rat_T(a) \le rat_T(b)$
            $\alpha := rat_T(a) - F((v, a))_d \, \lVert Pl(v) - Pl(a) \rVert$
            $\beta := rat_T(b) - F((v, b))_d \, \lVert Pl(v) - Pl(b) \rVert$
            $rat_T(v) := \max_{\eta d_{node} \le d \le (1-\eta) d_{node}} \min\{\alpha - d,\ \beta - (d_{node} - d)\}$
            $d_{node}((v, a)) := rat_T(a) - rat_T(v)$
            $d_{node}((v, b)) := d_{node} - d_{node}((v, a))$
        end if
    end if
end for
Let $e$ be the outgoing edge of $r$
$at_T(r) := \max\{ at_r^r(F(e)_{cap}),\ at_r^f(F(e)_{cap}) \}$
for $v \in V(T) \setminus \{r\}$ traversed in preorder do
    Let $w$ be the parent of $v$
    $at_T(v) := at_T(w) + d_{node}((w, v)) + F((w, v))_d \, \lVert Pl(w) - Pl(v) \rVert$
end for

During topology construction, we will only maintain the required arrival times and update them incrementally. Arrival times will not be explicitly calculated. However, it is often desirable to compute the delay model of a given topology. This can be done with Algorithm 1 (TimeTree). It first traverses the topology bottom-up, computes required arrival times, and distributes $d_{node}$. Then, arrival times are computed in a second top-down traversal. Both traversals have a running time that is linear in the size of the topology as each update step can be done in constant time.
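The two traversals of Algorithm 1 (TimeTree) can be sketched in Python as follows. This is a minimal sketch under several assumptions, not the original implementation: the `Node` class, the dictionary `unit_delay` holding the per-length delay of each edge's buffering mode, and the l1 distance are hypothetical choices; the sink RAT values and the root arrival time are passed in precomputed; and at a bifurcation the sketch assigns the clamped maximizer d of the max-min expression directly as the share of edge (v, a), which is one reading of the assignment step.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    pos: tuple                      # placement Pl(v) as (x, y)
    children: list = field(default_factory=list)
    sink_rat: float = 0.0           # precomputed sinkrat value (leaves only)


def l1(p, q):
    # Assumption: distances are measured in the l1 (Manhattan) norm.
    return abs(p[0] - q[0]) + abs(p[1] - q[1])


def time_tree(root, unit_delay, d_node, eta, at_root):
    """Return (at, rat, share) dictionaries for a topology of outdegree <= 2."""
    at, rat, share = {}, {}, {}

    def wire(v, w):
        return unit_delay[(v.name, w.name)] * l1(v.pos, w.pos)

    def bottom_up(v):               # postorder: children before parent
        for child in v.children:
            bottom_up(child)
        if not v.children:
            rat[v.name] = v.sink_rat
        elif len(v.children) == 1:
            a = v.children[0]
            rat[v.name] = rat[a.name] - wire(v, a)
            share[(v.name, a.name)] = 0.0
        else:
            a, b = v.children       # raises if a node has more than 2 children
            alpha = rat[a.name] - wire(v, a)
            beta = rat[b.name] - wire(v, b)
            # min{alpha - d, beta - (d_node - d)} is concave in d, so the
            # unconstrained maximizer can simply be clamped to the interval.
            d = (alpha - beta + d_node) / 2.0
            d = min(max(d, eta * d_node), (1.0 - eta) * d_node)
            rat[v.name] = min(alpha - d, beta - (d_node - d))
            share[(v.name, a.name)] = d
            share[(v.name, b.name)] = d_node - d

    def top_down(w):                # preorder: parent before children
        for v in w.children:
            at[v.name] = at[w.name] + share[(w.name, v.name)] + wire(w, v)
            top_down(v)

    bottom_up(root)
    at[root.name] = at_root
    top_down(root)
    return at, rat, share


# Hypothetical instance: root r, Steiner point s, sinks t1 (critical) and t2.
t1 = Node("t1", (10, 10), sink_rat=100.0)
t2 = Node("t2", (20, 0), sink_rat=200.0)
s = Node("s", (10, 0), children=[t1, t2])
r = Node("r", (0, 0), children=[s])
delays = {("r", "s"): 1.0, ("s", "t1"): 1.0, ("s", "t2"): 1.0}

at, rat, share = time_tree(r, delays, d_node=20.0, eta=0.25, at_root=5.0)
print(share[("s", "t1")], share[("s", "t2")])   # 5.0 15.0
print(rat["r"] - at["r"], rat["t1"] - at["t1"])  # 70.0 70.0
```

In this instance the critical sink t1 receives the smallest permitted share of the bifurcation penalty, and the slack at the root equals the slack at the most critical sink. The recursive traversals are chosen for brevity; for very deep topologies an explicit stack would avoid Python's recursion limit.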
