3. Methodology and Project Development
3.5. Methodology and Experiment Sets
3.5.5. Clock Cell Analysis
After the experiments on slew and fanout were completed, the next step was to analyse the selection of inverters used as repeaters.
When selecting repeater cells for the clock tree, only inverters have been considered. Inverters are preferred over buffers because of their inherently lower delay.
Typically, an inverter has half the parasitic capacitance, and thus half the transition delay, of a buffer in the same technology. However, when using inverters, the phase with which the signal arrives at the desired point must be considered. If the signal arrives inverted, an additional inverter must be added to restore the phase. This consideration is usually not necessary with buffers, since the output phase of a buffer is the same as its input phase, whereas the output phase of an inverter is the opposite of its input phase.
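The phase consideration can be illustrated with a minimal sketch. The function names are hypothetical and do not belong to any tool; the only assumption is that each inverter flips the clock polarity once, so a branch with an odd number of inverters delivers an inverted clock.

```python
def arrives_inverted(num_inverters: int) -> bool:
    """Return True if the clock reaches the sink with the opposite phase."""
    return num_inverters % 2 == 1


def extra_inverters_needed(num_inverters: int) -> int:
    """Return how many additional inverters (0 or 1) restore the source phase."""
    return 1 if arrives_inverted(num_inverters) else 0


# A branch with three inverter levels needs one additional inverter,
# while a branch with four levels already preserves the phase.
for levels in (3, 4):
    print(levels, extra_inverters_needed(levels))
```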
When selecting clock cells, several considerations must be taken into account. The use of big clock cells has the following consequences in the clock tree:
Increased cell parasitic capacitance and cell delay.
Increased cell dynamic power.
Increased cell leakage power.
Increased driving power.
On the other hand, it should be expected that the use of small clock cells will result in:
Reduced cell parasitic capacitance and cell delay.
Reduced cell dynamic power.
Reduced cell leakage power.
Reduced driving power.
Driving power is understood here as the capability of a clock cell to correctly propagate a signal change to its output fanout.
Insufficient driving power means that the driver is not able to switch the clock signal at each of the clock cells it drives. A higher fanout at the output of a repeater requires higher driving power, and a lower fanout requires lower driving power.
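The relation between driving power and fanout can be sketched with a simplified load check. This is an illustrative model only, not the criterion applied by the flow: it assumes identical sinks, ignores wire capacitance, and uses hypothetical capacitance values in femtofarads.

```python
def can_drive(max_load_ff: float, sink_cap_ff: float, fanout: int) -> bool:
    """Check whether the total sink capacitance stays within the driver limit."""
    return fanout * sink_cap_ff <= max_load_ff


def max_fanout(max_load_ff: float, sink_cap_ff: float) -> int:
    """Largest number of identical sinks the driver can switch in this model."""
    return int(max_load_ff // sink_cap_ff)


# Hypothetical values: a driver that tolerates 64 fF, sinks of 2 fF each.
MAX_LOAD_FF = 64.0
SINK_CAP_FF = 2.0

print(max_fanout(MAX_LOAD_FF, SINK_CAP_FF))
print(can_drive(MAX_LOAD_FF, SINK_CAP_FF, fanout=33))
```

In this model, doubling the load limit of the driver doubles the fanout it can serve, which is the reason bigger cells are a natural match for larger fanout constraints.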
Aside from these variations when cells of different sizes are used, other considerations must be taken into account regarding clock tree building.
As explained before, the flow used has the Concurrent Clock and Data Optimization (CCD) configuration option activated by default.
This configuration option builds the clock tree taking the datapath into account. With this option active, the goal is to minimize the Worst Negative Slack of the design. This differs from the more typical zero-skew clock building technique, where the key aspect is balancing every tree branch to have the same delay from the clock pin to each of the flip-flops at the end of the clock tree, in order to reduce the skew.
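The difference between both objectives can be illustrated with a minimal sketch. It is not the algorithm used by the tool: it only evaluates setup slack for a hypothetical three flip-flop pipeline, ignores hold checks, and uses made-up delays in picoseconds. All names and values are assumptions for illustration.

```python
def skew(latency_ps):
    """Global skew: latest minus earliest clock arrival among the flip-flops."""
    return max(latency_ps.values()) - min(latency_ps.values())


def worst_slack(paths, latency_ps, period_ps, setup_ps=0):
    """Smallest setup slack over all paths; a negative value is a violation."""
    slacks = []
    for launch, capture, data_delay_ps in paths:
        # Delaying the capture clock gives the data more time to arrive,
        # delaying the launch clock gives it less.
        required = period_ps + latency_ps[capture] - setup_ps
        arrival = latency_ps[launch] + data_delay_ps
        slacks.append(required - arrival)
    return min(slacks)


# Hypothetical pipeline: the path A -> B is slow, the path B -> C is fast.
PATHS = [("A", "B", 1200), ("B", "C", 600)]
PERIOD_PS = 1000

balanced = {"A": 500, "B": 500, "C": 500}  # zero-skew style tree
skewed = {"A": 500, "B": 700, "C": 500}    # clock of B delayed on purpose

for name, latency in (("balanced", balanced), ("skewed", skewed)):
    print(name, skew(latency), worst_slack(PATHS, latency, PERIOD_PS))
```

In this example the balanced tree has no skew but a worst slack of -200 ps, whereas the deliberately skewed tree has 200 ps of skew and no negative slack. A slack-driven strategy would therefore prefer the unbalanced tree.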
It is therefore to be expected that the use of CCD changes the behaviour obtained when modifying the repeaters used, compared with a zero-skew balancing technique.
In a zero-skew balancing technique, modifying the clock cells in one tree branch requires applying the same modification to all the tree branches to keep the tree balanced.
Once Concurrent Clock and Data Optimization is applied, given the Worst Negative Slack minimization strategy, several additional conditions must be considered.
Changes in the clock tree wirelength and in the number of clock cells, and even variations in parts of the structure, may be needed to keep the clock balance that yields the best results in terms of Worst Negative Slack.
Three inverters are used to build the clock tree. All three are used in the clock tree building step, while only the two smallest are used in the balancing step.
Several experiment sets have been defined and performed regarding repeater usage. The first experiment involves the use of bigger clock cells, while the second focuses on the use of smaller clock cells.
As seen before, the use of bigger or smaller clock cells is correlated with the driving power needed and thus with the output fanout of a repeater. Following this consideration, bigger clock cells have been paired with larger fanout constraints, whereas smaller clock cells have been paired with lower fanout constraints.
3.5.5.1. Experiment Set Definition
The sets of clock cells used are the following ones.
Reference repeater set:
a. Clock Tree Building: INV_D16, INV_D12, INV_D6
b. Clock Tree Balancing: INV_D12, INV_D6
Bigger repeater set:
a. Clock Tree Building: INV_D24, INV_D16, INV_D8
b. Clock Tree Balancing: INV_D16, INV_D8
Smaller repeater set:
a. Clock Tree Building: INV_D14, INV_D10, INV_D4
b. Clock Tree Balancing: INV_D10, INV_D4
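The three repeater sets can be written down as a small configuration structure, sketched below. The dictionary layout and helper names are hypothetical, and the code assumes that the numeric suffix of each cell name (for example 16 in INV_D16) orders the cells by size. The loop checks that every balancing list contains exactly the two smallest inverters of its building list.

```python
REPEATER_SETS = {
    "reference": {
        "building": ["INV_D16", "INV_D12", "INV_D6"],
        "balancing": ["INV_D12", "INV_D6"],
    },
    "bigger": {
        "building": ["INV_D24", "INV_D16", "INV_D8"],
        "balancing": ["INV_D16", "INV_D8"],
    },
    "smaller": {
        "building": ["INV_D14", "INV_D10", "INV_D4"],
        "balancing": ["INV_D10", "INV_D4"],
    },
}


def size_index(cell: str) -> int:
    """Extract the numeric suffix, assumed here to reflect the cell size."""
    return int(cell.rsplit("_D", 1)[1])


def two_smallest(cells):
    """Return the two smallest cells of a list, ordered by size index."""
    return sorted(cells, key=size_index)[:2]


for name, cells in REPEATER_SETS.items():
    expected = two_smallest(cells["building"])
    consistent = sorted(cells["balancing"], key=size_index) == expected
    print(name, expected, consistent)
```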
The experiment sets selected for each block are summarized in the following tables:
Block 1 Fanout = 32 Fanout = 64 Fanout = 128 Fanout = 256 Fanout = 512
Bigger repeaters X X X
Smaller repeaters X X X
Table 3.8: Experiment set definition table for the clock cell selection with fanout variation constellation at Block 1.
Block 2 Fanout = 32 Fanout = 64 Fanout = 128 Fanout = 256 Fanout = 512
Bigger repeaters X X X
Smaller repeaters X X X
Table 3.9: Experiment set definition table for the clock cell selection with fanout variation constellation at Block 2.
Block 3 Fanout = 32 Fanout = 64 Fanout = 128 Fanout = 256 Fanout = 512
Bigger repeaters X X
Smaller repeaters X X X
Table 3.10: Experiment set definition table for the clock cell selection with fanout variation constellation at Block 3.
The first two blocks were run with the same constellations of fanout and clock cells. For Block 3, the experiment set was modified according to the conclusions drawn from the results and analysis of the first two blocks.
Finally, due to problems in the selection of clock cells, an additional run was done in Block 3 using several repeaters.
This last run used both the reference and the smaller repeater sets, and it is covered because of the results obtained.