ASIC Tutorial: Low Power Design for SoCs
©M.J. Irwin, PSU, 1999
Power Reduction Techniques in the SoC Clock Network
Clock Power
• Why clock power is important/large
» Generally the signal with the highest frequency
» Typically drives a large load
– all sequential logic elements
– all precharged/dynamic logic
– distributed throughout chip, so lots of wiring
» DEC 21164’s clock accounts for 40% of total chip power
– 3.75nF total clock load
– 20W (out of 50W) in clock distribution network
Processor Power Budgets
[Figure: concentric pie charts of the power budget split among clock, datapath, memory, and I/O (pads). From the inner circle outward: low-end embedded microprocessor; high-end CPU with on-chip cache; MPEG2 decoder ASIC; ATM switch ASIC.]
Clock Power Reduction
$P_{clock} = C V_{dd}^{2} f$
• Minimize voltage (V) using half-swing clocks
• Minimize clock load (C)
» clock gating
» careful routing, distributed drivers
• Minimize clock frequency (f)
» double-edge-triggered (DET) flip-flops
» localized PLL to multiply frequency of clock
• GALS design approach
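The three levers in the power equation can be compared with a short calculation. The following is a minimal sketch, assuming the first-order model in which clock power scales with the square of the clock swing; the function name `clock_power` and all numeric values are hypothetical and chosen only for illustration, not taken from the tutorial.

```python
def clock_power(c_load, v_swing, freq):
    """First-order dynamic clock power in watts: C * V^2 * f."""
    return c_load * v_swing ** 2 * freq


# Hypothetical baseline values (illustrative assumptions only).
C_LOAD = 1e-9   # total clock load in farads
VDD = 3.3       # supply voltage in volts
FREQ = 100e6    # clock frequency in hertz

baseline = clock_power(C_LOAD, VDD, FREQ)
half_swing = clock_power(C_LOAD, VDD / 2, FREQ)   # reduce V
half_load = clock_power(C_LOAD / 2, VDD, FREQ)    # reduce C
half_freq = clock_power(C_LOAD, VDD, FREQ / 2)    # reduce f

for name, power in [("baseline", baseline),
                    ("half swing", half_swing),
                    ("half load", half_load),
                    ("half frequency", half_freq)]:
    print(f"{name:15s} {power:.3f} W  ({power / baseline:.0%} of baseline)")
```

Under this simple model, halving the swing has a quadratic effect, while halving the load or the frequency has a linear one. Real savings depend on how the reduced-swing clock is generated and on driver overheads, which this sketch ignores.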
Reduced Swing Clock
[Figure: waveforms between Vdd and Gnd for a regular clock and a half-swing clock, showing the N-device clock and the P-device clock in each case, with the thresholds Vtp and Vtn marked.]
Half Swing Clocks
• Advantages
» as long as Vtn (Vtp) is less (greater) than 1/2 Vdd, the on-off characteristics of the nfet (pfet) are unchanged
• Disadvantages
» sequential element delay approximately doubled (propagation delay and setup/hold time) due to increased on-resistance
» half-swing clock generation is done via charge sharing, so sleep modes are problematic
» not appropriate for very low voltage systems
Clock Gating
• Most popular method for power reduction of clock signals and functional units (FUs)
» functional units are often idle, e.g., floating point units
» need a circuit to generate the enable signal
– increases complexity of control logic
– timing is critical to avoid clock glitches at the AND gate output
» additional gate delay on the clock signal
– the masking AND gate can replace a buffer in the clock distribution tree
[Figure: clock and enable signals combined to drive the clock input of a functional unit.]
Glitch Free Clock Gating
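[Figure: two gated clock circuits, (1) and (2), each deriving a Gated Clock from Clock; a register (REG) fed through a 0/1 multiplexer with inputs A and B; timing waveforms for Clock, Gated Clock (1), and Gated Clock (2).]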
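The glitch problem at the AND gate output can be illustrated with a small discrete-time simulation. This is a minimal sketch, assuming a commonly used latch-based scheme in which the enable is captured by a latch that is transparent only while the clock is low; whether this matches the circuits in the figure is an assumption. The waveform lengths, the enable pattern, and all function names are hypothetical.

```python
def clock_wave(n_cycles, samples_per_half=4):
    """Square wave: each cycle is a low phase followed by a high phase."""
    wave = []
    for _ in range(n_cycles):
        wave += [0] * samples_per_half + [1] * samples_per_half
    return wave


def and_gate(clk, enable):
    """Naive gating: the enable is ANDed directly with the clock."""
    return [c & e for c, e in zip(clk, enable)]


def latch_gate(clk, enable):
    """Latch-based gating: the enable only propagates while the clock is low."""
    out = []
    latched = 0
    for c, e in zip(clk, enable):
        if c == 0:
            latched = e          # latch is transparent during the low phase
        out.append(c & latched)  # latch holds its value during the high phase
    return out


def pulse_widths(wave):
    """Lengths (in samples) of every run of consecutive 1s."""
    widths, run = [], 0
    for v in wave:
        if v:
            run += 1
        elif run:
            widths.append(run)
            run = 0
    if run:
        widths.append(run)
    return widths


clk = clock_wave(4)
# Hypothetical enable that changes in the middle of two high phases.
enable = [1] * 14 + [0] * 8 + [1] * 10

print("naive AND  :", pulse_widths(and_gate(clk, enable)))
print("with latch :", pulse_widths(latch_gate(clk, enable)))
```

In this run the naive AND gate produces two truncated pulses where the enable changes during the high phase, while the latch-based version only ever emits full-width pulses or none at all.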
Gated Clock FSM Architecture
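[Figure: FSM built from combinational logic and a register; the register's gated clock is derived from Clock, a latch, and the activation function (AF).]
AF: activation function, which evaluates to logic 1 when the clock needs to be stopped.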
Clock Tree Construction to Facilitate Gating
[Figure: clock tree in which an idle condition gates the clock delivered to a subtree.]
Clock gating can be inserted at multiple levels in the clock tree.
An entire subtree can be shut off if all gating conditions are satisfied.
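The idea of shutting off a whole subtree can be sketched as a recursive walk over the clock tree. This is a minimal illustration, assuming each leaf carries a simple idle flag and that an internal node may be gated only when every leaf below it is idle; the class `ClockNode`, the function names, and the example tree are hypothetical.

```python
class ClockNode:
    """A node in a clock tree; leaves represent clocked modules."""

    def __init__(self, name, children=None, idle=False):
        self.name = name
        self.children = children or []
        self.idle = idle  # only meaningful for leaves


def is_idle(node):
    """A leaf is idle if flagged; an internal node if all children are idle."""
    if not node.children:
        return node.idle
    return all(is_idle(child) for child in node.children)


def gating_points(node):
    """Return the highest nodes in the tree whose clock can be shut off."""
    if is_idle(node):
        return [node.name]
    points = []
    for child in node.children:
        points.extend(gating_points(child))
    return points


# Hypothetical tree: subtree A is fully idle, subtree B is only partly idle.
tree = ClockNode("root", [
    ClockNode("A", [ClockNode("R1", idle=True), ClockNode("R2", idle=True)]),
    ClockNode("B", [ClockNode("R3", idle=True), ClockNode("R4", idle=False)]),
])

print(gating_points(tree))  # ['A', 'R3']
```

Gating at node A removes the switching of the wiring and buffers below A as well as of the leaves, whereas under B only the single idle leaf can be gated.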
H-Tree Clock Network
Clock Driver Distribution Comparison
Dimension (cm) SD (W) DD (W)
0.25 0.052 0.051
0.5 0.206 0.101
0.75 0.464 0.152
1.0 0.825 0.202
1.25 1.29 0.253
1.5 1.85 0.303
1.75 2.53 0.354
SD = single driver, DD = distributed driver (H-tree)
Conditions: 3.3V supply, 100MHz frequency, 1 micron feature size
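The trend in the table is easier to see when the two columns are normalized. The sketch below uses only the values listed above; the variable names and the choice of normalization (SD divided by the square of the dimension, DD divided by the dimension) are assumptions made for illustration.

```python
# (dimension in cm, single-driver power in W, distributed-driver power in W)
table = [
    (0.25, 0.052, 0.051),
    (0.50, 0.206, 0.101),
    (0.75, 0.464, 0.152),
    (1.00, 0.825, 0.202),
    (1.25, 1.29, 0.253),
    (1.50, 1.85, 0.303),
    (1.75, 2.53, 0.354),
]

print(" dim   SD/DD   SD/dim^2   DD/dim")
for dim, sd, dd in table:
    print(f"{dim:4.2f}  {sd / dd:6.2f}  {sd / dim ** 2:9.3f}  {dd / dim:7.3f}")

ratios = [sd / dd for _, sd, dd in table]
sd_per_area = [sd / dim ** 2 for dim, sd, _ in table]
dd_per_length = [dd / dim for dim, _, dd in table]
```

In these data SD/dim^2 and DD/dim both stay nearly constant, so single-driver power grows roughly with the square of the chip dimension while distributed-driver power grows roughly linearly, and the gap between them widens as the chip gets larger.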
Clock Tree Structure Affects Gating
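[Figure: two clock trees, (a) and (b), driving registers R1 to R4 from Clock through internal nodes A and B. Labels in (a): registers in the order R1, R2, R3, R4 with conditions x1, x2, x1+x3, x2+x4. Labels in (b): registers in the order R1, R3, R2, R4 with conditions x1, x3, x2, x4.]
Assuming x1, x2, x3, x4 are mutually exclusive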
Multiple Frequency Clocks
[Figure: a system clock of frequency f is distributed to the modules (RISC core, parallel serial interface, bus interface, I/O controller); local PLLs generate the module clocks f1, f2, and f3, with f < f1 < f2 < f3.]
Key is in the design of the local circuits used to generate the clock signal in each module
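The potential benefit of distributing a slow global clock and multiplying it locally can be estimated with the same first-order C·V²·f model. The following is a rough sketch under stated assumptions: all capacitances, frequencies, and the per-PLL overhead are hypothetical values, and the function names are introduced here only for illustration.

```python
def cv2f(c, v, f):
    """First-order dynamic power in watts."""
    return c * v * v * f


def single_clock_power(c_global, modules, vdd, f_max):
    """Everything (global wiring and all modules) is clocked at f_max."""
    c_total = c_global + sum(c_local for c_local, _ in modules)
    return cv2f(c_total, vdd, f_max)


def multi_freq_power(c_global, modules, vdd, f_ref, p_pll):
    """Global network runs at f_ref; each module runs at its own frequency
    and pays a fixed overhead p_pll for its local clock multiplier."""
    total = cv2f(c_global, vdd, f_ref)
    for c_local, f_local in modules:
        total += cv2f(c_local, vdd, f_local) + p_pll
    return total


# Hypothetical numbers (illustrative assumptions only).
VDD = 3.3
C_GLOBAL = 200e-12                      # global clock wiring, farads
MODULES = [(50e-12, 50e6),              # (local load, local frequency)
           (50e-12, 100e6),
           (50e-12, 200e6)]
F_REF = 25e6                            # slow global reference clock
P_PLL = 10e-3                           # assumed overhead per local PLL, watts

p_single = single_clock_power(C_GLOBAL, MODULES, VDD, f_max=200e6)
p_multi = multi_freq_power(C_GLOBAL, MODULES, VDD, F_REF, P_PLL)
print(f"single fast clock : {p_single:.3f} W")
print(f"multiple clocks   : {p_multi:.3f} W")
```

With these assumed numbers the multiple-frequency arrangement uses less power, but the comparison is sensitive to the local clock generator overhead, which is why the design of those local circuits is the key issue.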
Clock Frequency Multipliers
Circuit   Tech    Input Freq   Vdd    Power Diss   Area
PLL (1)   0.8 µ   50MHz        5V     16mW         0.31mm^2
PLL (2)   0.5 µ   50MHz        3.3V   10mW         0.52mm^2
DDL (3)   -       33MHz        3.8V   49.4mW       -
(1) Young, 1992
(2) Alvarez, 1995
(3) Gupta
GALS Design Style
• Reduce clock power consumption by using a Globally Asynchronous, Locally Synchronous (GALS) design style
• Overheads for
» local clock generation
– independent clock generators
– low power global clock reference signal with local clock frequency multipliers
» global asynchronous communication
• Skew tolerant
GALS Architecture
[Figure: RISC core, parallel serial interface, bus interface, and I/O controller, with local PLLs generating the clocks f1, f2, and f3; the modules exchange data using a handshake protocol.]
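Communication between locally clocked modules can be sketched as a request/acknowledge handshake. The slides do not say which protocol is used, so the following assumes a four-phase (return-to-zero) handshake; the class `Channel`, the function names, and the clock periods are hypothetical, and the model ignores metastability, which a real design would handle with synchronizers on the req and ack wires.

```python
class Channel:
    """Wires shared by two modules running on unrelated clocks."""

    def __init__(self):
        self.req = False
        self.ack = False
        self.data = None


def sender_tick(ch, pending):
    """One sender clock edge of a four-phase handshake."""
    if not ch.req and not ch.ack and pending:
        ch.data = pending.pop(0)   # drive data, then raise request
        ch.req = True
    elif ch.req and ch.ack:
        ch.req = False             # data was taken, drop request


def receiver_tick(ch, received):
    """One receiver clock edge of a four-phase handshake."""
    if ch.req and not ch.ack:
        received.append(ch.data)   # capture data, then acknowledge
        ch.ack = True
    elif not ch.req and ch.ack:
        ch.ack = False             # request dropped, return to idle


def transfer(items, sender_period=2, receiver_period=3, max_time=1000):
    """Move items across the channel; each side ticks on its own period."""
    ch = Channel()
    pending = list(items)
    received = []
    for t in range(max_time):
        if t % sender_period == 0:
            sender_tick(ch, pending)
        if t % receiver_period == 0:
            receiver_tick(ch, received)
        if len(received) == len(items) and not ch.req and not ch.ack:
            break
    return received


print(transfer([1, 2, 3]))
```

Because each side only reacts to the state of the req and ack wires, the transfer completes correctly regardless of the ratio between the two local clock periods, which is the sense in which the global communication is independent of clock skew.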