Sequential Circuit Design
Lan-Da Van (范倫達), Ph. D.
Department of Computer Science
National Chiao Tung University
Taiwan, R.O.C.
Fall, 2009
[email protected]
Lecture 6
Outlines
Introduction
Sequencing Methods
Latches and Flip-Flops
Sequential System Design
Conclusion
Sequential Machines
Use memory elements to make primary output values depend on (state + primary inputs).
Varieties:
  Mealy machines: outputs are a function of present state and inputs;
  Moore machines: outputs depend only on state.
Machine computes next state N, primary outputs O
from current state S, primary inputs I.
Next-state function:
N = δ(I,S).
Output function (Mealy):
O = λ(I,S).
Duty cycle: fraction of clock period for which clock is
active (e.g., for active-low clock, fraction of time clock
is 0).
Sequencing Elements
Latch: level sensitive
Transparent latch, D latch
Flip-flop: edge triggered
Master-slave flip-flop, D flip-flop, D register
Timing Diagrams
[Figure: latch and flop symbols (D, clk, Q) with a timing diagram of clk, D, Q (latch), and Q (flop); the latch is transparent while clk is active, the flop is edge-triggered]
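The difference between level-sensitive and edge-triggered behavior can be illustrated with a small behavioral sketch. It assumes idealized, zero-delay elements, an active-high clock, and waveforms given as lists of 0/1 samples; the function name `simulate` and the sample waveforms are illustrative only.

```python
def simulate(clk, d, q0=0):
    """Return (q_latch, q_flop) sample lists for sampled clk and d waveforms."""
    q_latch, q_flop = [], []
    latch, flop, prev_clk = q0, q0, 0
    for c, x in zip(clk, d):
        if c == 1:
            latch = x                  # latch is transparent while clk is high
        if prev_clk == 0 and c == 1:
            flop = x                   # flop samples D only on the rising edge
        q_latch.append(latch)
        q_flop.append(flop)
        prev_clk = c
    return q_latch, q_flop


clk = [0, 1, 1, 0, 0, 1, 1, 0]
d = [1, 1, 0, 0, 0, 0, 1, 1]
q_latch, q_flop = simulate(clk, d)
print(q_latch)  # [0, 1, 0, 0, 0, 0, 1, 1]
print(q_flop)   # [0, 1, 1, 1, 1, 0, 0, 0]
```

The latch output follows every change of D while the clock is high, whereas the flop output changes only at the two rising edges.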
Memory Elements
Store a value as controlled by one or more
control inputs.
May have multiple control inputs.
Clock, Load, S-R, …
In CMOS, memory is created by:
capacitance (dynamic);
feedback (static).
Storage element
Latch: transparent when internal memory is being set from input.
Flip-flop: not transparent; reading the input and changing the output are separate events.
Memory Categories
Memory Arrays
  Random Access Memory
    Read/Write Memory (RAM) (Volatile)
      Static RAM (SRAM)
      Dynamic RAM (DRAM)
    Read Only Memory (ROM) (Nonvolatile)
      Mask ROM
      Programmable ROM (PROM)
      Erasable Programmable ROM (EPROM)
      Electrically Erasable Programmable ROM (EEPROM)
      Flash ROM
  Serial Access Memory
    Shift Registers
      Serial In Parallel Out (SIPO)
      Parallel In Serial Out (PISO)
    Queues
      First In First Out (FIFO)
      Last In First Out (LIFO)
  Content Addressable Memory (CAM)
Setup & Hold Times
Setup time: time before the clock event during which the data input must be stable.
Hold time: time after the clock event for which the data input must remain stable.
[Figure: clock and data waveforms showing the setup and hold window]
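The two definitions describe a window around the clock event in which the data input must not change. A minimal sketch of that check follows; the function name `timing_violations` and all numbers are hypothetical, with times in ps.

```python
def timing_violations(data_transitions, clock_edge, t_setup, t_hold):
    """Return the data transition times that fall inside the forbidden window.

    The window opens t_setup before the clock edge and closes t_hold after it.
    """
    window_start = clock_edge - t_setup
    window_end = clock_edge + t_hold
    return [t for t in data_transitions if window_start < t < window_end]


# Hypothetical numbers: clock edge at 1000 ps, 60 ps setup, 20 ps hold.
bad = timing_violations([900, 950, 1010, 1100], clock_edge=1000,
                        t_setup=60, t_hold=20)
print(bad)  # [950, 1010]
```

A transition at 950 ps arrives too late to meet setup, and one at 1010 ps changes too soon to meet hold.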
Sequencing Methods
Flip-flops
2-Phase Latches
Pulsed Latches
[Figure: pipelines built from flip-flops (clk), 2-phase transparent latches (φ1, φ2), and pulsed latches (φp), each with combinational logic between sequencing elements; clock waveforms are annotated with Tc, Tc/2, t_nonoverlap, and t_pw]
Timing Diagrams
Contamination and Propagation Delays
[Figure: timing diagrams for combinational logic (A to Y), a flop, and a latch, annotated with t_pd, t_cd, t_pcq, t_ccq, t_pdq, t_cdq, t_setup, and t_hold]
t_pd     Logic Prop. Delay
t_cd     Logic Cont. Delay
t_pcq    Latch/Flop Clk-Q Prop. Delay
t_ccq    Latch/Flop Clk-Q Cont. Delay
t_pdq    Latch D-Q Prop. Delay
t_cdq    Latch D-Q Cont. Delay
t_setup  Latch/Flop Setup Time
Max-Delay: Flip-Flops
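[Figure: flip-flops F1 and F2, both clocked by clk, with combinational logic between Q1 and D2; the timing diagram shows t_pcq, t_pd, and t_setup within one cycle T_c]
$$t_{pd} \le T_c - \underbrace{(t_{setup} + t_{pcq})}_{\text{sequencing overhead}}$$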
Max-Delay Example (1/2)
Suppose the registers are built from flip-flops with a setup time of 62 ps, hold time of -10 ps, propagation delay of 90 ps, and contamination delay of 75 ps.
Max-Delay Example (2/2)
$$T_c \ge t_{pcq} + t_{pd} + t_{setup}$$
$$t_{pd} = 590 + 60 + 100 + 80 + 100 + 70 = 1000\ \text{ps}$$
$$T_c \ge 90 + 1000 + 62 = 1152\ \text{ps}$$
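The arithmetic of this example can be reproduced with a short sketch. The function name `min_cycle_time_flop` is illustrative; the delay values are the ones quoted in the example, in ps.

```python
def min_cycle_time_flop(t_pcq, t_pd, t_setup):
    """Minimum clock period for a flip-flop path: T_c >= t_pcq + t_pd + t_setup."""
    return t_pcq + t_pd + t_setup


# Logic delays along the path, as listed in the example (ps).
path_delays_ps = [590, 60, 100, 80, 100, 70]
t_pd = sum(path_delays_ps)
t_c = min_cycle_time_flop(t_pcq=90, t_pd=t_pd, t_setup=62)
print(t_pd, t_c)  # 1000 1152
```

The sequencing overhead is the part of the cycle not spent in logic, here 1152 - 1000 = 152 ps.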
Max Delay: 2-Phase Latches
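[Figure: latches L1 (φ1), L2 (φ2), and L3 (φ1) separated by Combinational Logic 1 and Combinational Logic 2; the timing diagram over one cycle T_c shows t_pdq1, t_pd1, t_pdq2, and t_pd2 along the path D1, Q1, D2, Q2, D3]
$$t_{pd} = t_{pd1} + t_{pd2} \le T_c - \underbrace{(2 t_{pdq})}_{\text{sequencing overhead}}$$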
Max Delay: Pulsed Latches
[Figure: pulsed latches L1 and L2 with combinational logic between them]
$$t_{pd} \le T_c - \underbrace{\max(t_{pdq},\ t_{pcq} + t_{setup} - t_{pw})}_{\text{sequencing overhead}}$$
Max-Delay Example
Re-compute the ALU self-bypass path cycle
time if the flip-flop is replaced with a pulsed
latch. The pulsed latch has a pulse width of
150 ps, a setup time of 40 ps, a hold time of
5 ps, a clk-to-Q propagation delay of 82 ps
and contamination delay of 52 ps, and a
D-to-Q propagation delay of 92 ps.
Solution:
$$t_{pd} \le T_c - \underbrace{\max(t_{pdq},\ t_{pcq} + t_{setup} - t_{pw})}_{\text{sequencing overhead}}$$
$$T_c \ge \max(92 + 1000,\ 82 + 1000 + 40 - 150) = 1092\ \text{ps}$$
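As a check on the solution, the pulsed-latch constraint can be evaluated directly. This is a sketch with an illustrative function name; the numbers are those given in the example, in ps, with the 1000 ps logic delay carried over from the flip-flop example.

```python
def min_cycle_time_pulsed(t_pd, t_pdq, t_pcq, t_setup, t_pw):
    """Minimum clock period: T_c >= t_pd + max(t_pdq, t_pcq + t_setup - t_pw)."""
    overhead = max(t_pdq, t_pcq + t_setup - t_pw)
    return t_pd + overhead


t_c_pulsed = min_cycle_time_pulsed(t_pd=1000, t_pdq=92, t_pcq=82,
                                   t_setup=40, t_pw=150)
print(t_c_pulsed)  # 1092
```

Because the pulse is wider than t_pcq + t_setup, the second term of the max is negative (-28 ps) and the D-to-Q delay of 92 ps sets the overhead.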
Min-Delay: Flip-Flops
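$$t_{cd} \ge t_{hold} - t_{ccq}$$
[Figure: flip-flops F1 and F2, both clocked by clk, with combinational logic CL between Q1 and D2; the timing diagram shows t_ccq, t_cd, and t_hold]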
Min-Delay Example
In the ALU self-bypass example with the
flip-flop from Fig. 7.6, the earliest input to
the late bypass multiplexer is the imm value
coming from another flip-flop. Will this path
experience any hold time failures?
Solution: No. The late bypass mux has t_cd = 45 ps. The flip-flops have t_hold = -10 ps and t_ccq = 75 ps. Hence, t_cd = 45 ps is larger than t_hold - t_ccq = -10 - 75 = -85 ps.
Min-Delay: 2-Phase Latches
$$t_{cd1},\ t_{cd2} \ge t_{hold} - t_{ccq} - t_{nonoverlap}$$
[Figure: latches L1 (φ1) and L2 (φ2) with combinational logic CL between Q1 and D2; the timing diagram shows t_nonoverlap, t_ccq, t_cd, and t_hold]
Hold time reduced by nonoverlap.
Paradox: hold applies twice each cycle, vs. only once for flops.
But a flop is made of two latches!
Min-Delay: Pulsed Latches
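$$t_{cd} \ge t_{hold} - t_{ccq} + t_{pw}$$
[Figure: pulsed latches L1 and L2, both clocked by φp, with combinational logic CL between Q1 and D2; the timing diagram shows t_pw, t_ccq, t_cd, and t_hold]
Hold time increased by pulse width.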
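The three min-delay constraints can be compared side by side. The sketch below reuses the flip-flop and pulsed-latch parameters quoted in the earlier examples and the 45 ps mux contamination delay; applying the pulsed-latch hold and clk-to-Q numbers to a 2-phase latch and the 50 ps nonoverlap are assumptions made only for illustration.

```python
def min_cd_flop(t_hold, t_ccq):
    """Flip-flops: t_cd >= t_hold - t_ccq."""
    return t_hold - t_ccq


def min_cd_two_phase(t_hold, t_ccq, t_nonoverlap):
    """2-phase latches: t_cd >= t_hold - t_ccq - t_nonoverlap."""
    return t_hold - t_ccq - t_nonoverlap


def min_cd_pulsed(t_hold, t_ccq, t_pw):
    """Pulsed latches: t_cd >= t_hold - t_ccq + t_pw."""
    return t_hold - t_ccq + t_pw


T_CD_PATH = 45  # contamination delay of the late bypass mux (ps)

required = {
    "flop": min_cd_flop(t_hold=-10, t_ccq=75),
    # Hypothetical: pulsed-latch hold/clk-to-Q values with a 50 ps nonoverlap.
    "two_phase": min_cd_two_phase(t_hold=5, t_ccq=52, t_nonoverlap=50),
    "pulsed": min_cd_pulsed(t_hold=5, t_ccq=52, t_pw=150),
}
for name, t_cd_min in required.items():
    status = "ok" if T_CD_PATH >= t_cd_min else "hold violation"
    print(name, t_cd_min, status)
```

With these numbers the flip-flop and 2-phase cases have ample margin, while the pulsed latch requires 103 ps of contamination delay, which a 45 ps path does not provide.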
Time Borrowing
In a flop-based system:
Data launches on one rising edge
Must setup before next rising edge
If it arrives late, system fails
If it arrives early, time is wasted
Flops have hard edges
In a latch-based system
Data can pass through latch while transparent
Long cycle of logic can borrow time into next
As long as each loop completes in one cycle
Time Borrowing Example
[Figure: 2-phase latch pipelines clocked by φ1 and φ2 with combinational logic between latches, showing (a) borrowing time across a half-cycle boundary and (b) borrowing time across a pipeline stage boundary]
How Much Borrowing?
2-Phase Latches
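[Figure: latches L1 (φ1) and L2 (φ2) with Combinational Logic 1 between Q1 and D2; the timing diagram marks T_c, T_c/2, the nominal half-cycle 1 delay, t_borrow, t_nonoverlap, and t_setup]
$$t_{borrow} \le \frac{T_c}{2} - (t_{setup} + t_{nonoverlap})$$
Pulsed Latches
$$t_{borrow} \le t_{pw} - t_{setup}$$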
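A short sketch of the two borrowing bounds follows. The bound involving the pulse width t_pw is taken here to apply to pulsed latches. The pulsed-latch numbers come from the earlier pulsed-latch example; the 2-phase numbers (1000 ps cycle, 60 ps nonoverlap) are hypothetical. All times are in ps.

```python
def max_borrow_two_phase(t_c, t_setup, t_nonoverlap):
    """2-phase latches: t_borrow <= T_c/2 - (t_setup + t_nonoverlap)."""
    return t_c / 2 - (t_setup + t_nonoverlap)


def max_borrow_pulsed(t_pw, t_setup):
    """Pulsed latches: t_borrow <= t_pw - t_setup."""
    return t_pw - t_setup


# Hypothetical 2-phase design: 1000 ps cycle, 40 ps setup, 60 ps nonoverlap.
borrow_2phase = max_borrow_two_phase(t_c=1000, t_setup=40, t_nonoverlap=60)
# Pulsed latch from the max-delay example: 150 ps pulse, 40 ps setup.
borrow_pulsed = max_borrow_pulsed(t_pw=150, t_setup=40)
print(borrow_2phase, borrow_pulsed)  # 400.0 110
```

Under these assumptions a 2-phase latch can borrow most of a half cycle, whereas a pulsed latch can borrow only what the pulse width leaves after setup.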
Clock Skew
We have assumed zero clock skew
Clocks really have uncertainty in arrival time
Decreases maximum propagation delay
Increases minimum contamination delay
Decreases time borrowing
Clock must arrive at all memory elements in time to
load data.
Clock Skew: Flip-Flops
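[Figure: flip-flops F1 and F2 with skewed clocks, shown once for the setup case and once for the hold case]
$$t_{pd} \le T_c - \underbrace{(t_{pcq} + t_{setup} + t_{skew})}_{\text{sequencing overhead}}$$
$$t_{cd} \ge t_{hold} - t_{ccq} + t_{skew}$$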
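Rearranging the two flip-flop constraints for t_skew gives the largest skew a path can tolerate. The sketch below applies this to the flip-flop numbers from the earlier examples; the 1200 ps clock period is a hypothetical value chosen for illustration, and the function names are not from the source.

```python
def setup_skew_budget(t_c, t_pcq, t_pd, t_setup):
    """Largest t_skew satisfying t_pd <= T_c - (t_pcq + t_setup + t_skew)."""
    return t_c - (t_pcq + t_pd + t_setup)


def hold_skew_budget(t_cd, t_ccq, t_hold):
    """Largest t_skew satisfying t_cd >= t_hold - t_ccq + t_skew."""
    return t_cd + t_ccq - t_hold


T_C_PS = 1200  # hypothetical clock period (ps)
setup_budget = setup_skew_budget(T_C_PS, t_pcq=90, t_pd=1000, t_setup=62)
hold_budget = hold_skew_budget(t_cd=45, t_ccq=75, t_hold=-10)
print(setup_budget, hold_budget)  # 48 130
```

A negative budget would mean the path already fails with zero skew; a setup failure can be fixed by lengthening T_c, but a hold failure cannot.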
Clock Skew: Latches
[Figure: latches L1 (φ1), L2 (φ2), and L3 (φ1) separated by Combinational Logic 1 and Combinational Logic 2, with nodes D1, Q1, D2, Q2, D3, Q3]
2-Phase Latches
$$t_{pd} \le T_c - \underbrace{(2 t_{pdq})}_{\text{sequencing overhead}}$$
$$t_{cd1},\ t_{cd2} \ge t_{hold} - t_{ccq} - t_{nonoverlap} + t_{skew}$$
$$t_{borrow} \le \frac{T_c}{2} - (t_{setup} + t_{nonoverlap} + t_{skew})$$
Pulsed Latches
$$t_{pd} \le T_c - \underbrace{\max(t_{pdq},\ t_{pcq} + t_{setup} - t_{pw} + t_{skew})}_{\text{sequencing overhead}}$$
$$t_{cd} \ge t_{hold} + t_{pw} - t_{ccq} + t_{skew}$$
$$t_{borrow} \le t_{pw} - (t_{setup} + t_{skew})$$
Two-Phase Clocking
If setup times are violated, reduce clock speed
If hold times are violated, chip fails at any speed
In this class, working chips are most important
No tools to analyze clock skew
An easy way to guarantee hold times is to use
2-phase latches with big nonoverlap times
Signal Skew
Machine data signals must obey setup and hold
times — avoid signal skew.
Data Shoot Through
Latches do not cut combinational logic when clock is
active.
Latch-based machines must use multiple ranks of
latches.
Multiple ranks require multiple phases of clock.
Data shoot through occurs if single-phase latch is
used.
Unbalanced Delays
Logic with unbalanced delays leads to inefficient use
of logic:
Retiming Solution
Retiming moves memory elements through
combinational logic:
Property:
Retiming changes encoding of values in registers, but proper
values can be reconstructed with combinational logic.
Retiming must preserve number of latches OR registers
Summary
Flip-Flops:
Very easy to use, supported by all tools
2-Phase Transparent Latches:
Lots of skew tolerance and time borrowing
Pulsed Latches:
Outlines
Introduction
Sequencing Methods
Latches and Flip-Flops
Sequential System Design
Conclusion
Dynamic Latch (1/3)
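Pass Transistor Latch
  Pros
    Tiny
    Low clock load
  Cons
    V_t drop
    Stored charge leaks away
    Backdriving
    Diffusion input
  Used in 1970's
  [Figure: pass-transistor latch with input D, output Q, and clock φ]
Transmission gate
  No V_t drop
  Stored charge leaks away
  Backdriving
  Diffusion input
  [Figure: transmission-gate latch with input D, output Q, and clocks φ and its complement]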
Dynamic Latch (2/3)
Store charge on inverter gate capacitance:
φ = 0: transmission gate is off, inverter output is
determined by storage node.
φ = 1: transmission gate is on, inverter output follows
D input.
Dynamic Latch (3/3)
Inverting buffer
  No V_t drop
  Stored charge leaks away
  No backdriving
  Fixes either
    Diffusion input (upper side)
    Output noise sensitivity with inverted output (bottom side)
Setup and hold times are determined by the transmission gate: must ensure that the value stored on the transmission gate is solid.
Stick Diagram
[Figure: stick diagram of the latch with labels V_DD, V_SS, D, Q', φ, and φ']
Physical Layout
[Figure: physical layout of the latch with labels V_DD, V_SS, D, Q', φ, and φ']
Static Latch (1/3)
Must use feedback to restore value.
Some latches are static on one phase (pseudo-static)
— load on one phase, activate feedback on other
phase.
Static Latch (2/3)
Tristate feedback
  No V_t drop
  Leakage compensation
  Backdriving risk
  Diffusion input
  Not isolated from output noise
  Requires inverted clock
Buffered input
  No V_t drop
  Leakage compensation
  No backdriving
  No diffusion input
  Not isolated from output noise
  Requires inverted clock
[Figure: two static latch schematics (tristate feedback, buffered input), each with input D, internal node X, output Q, and clocks φ and its complement]
Static Latch (3/3)
Buffered output
  No V_t drop
  Leakage compensation
  No backdriving
  No diffusion input
  Isolated from output noise
  Requires inverted clock
Widely used in Artisan standard cells
  Very robust (most important)
  Rather large
  Rather slow (1.5 to 2 FO4 delays)
  High clock loading
[Figure: buffered-output static latch with input D, internal node X, output Q, and clocks φ and its complement]
Multiplexer Static Latches
Negative latch
Mux static latch
  No V_t drop
  Leakage compensation
  No backdriving
  No diffusion input
  Requires inverted clock
Recirculating Quasi-Static Latch
Eliminates a problem of the dynamic latch: the value stored on the capacitor leaks away over time.
Quasi-static: the latch data will vanish if the clocks are stopped (i.e., static on one phase).
Clocked Inverter
φ = 0: both clocked transistors are off, so the output is floating.
φ = 1: both clocked transistors are on, so the circuit acts as an inverter to drive the output.
[Figure: clocked inverter schematic and symbol]
Clocked Inverter Latch
φ = 0: i1 is off, i2-i3 form feedback circuit.
φ = 1: i2 is off, breaking feedback; i1 is on, driving i3
and output.
Flip-Flops
Not transparent—use multiple storage elements to
isolate output from input.
Edge-Trigger:
Master-Slave Flip-Flop
[Figure: master latch followed by slave latch, with input D and output Q]
φ = 0: master latch is disabled; slave latch is enabled, but the master latch output is stable, so the slave passes the master's stored value to the output.
φ = 1: master latch is enabled, loading value from input; slave latch is disabled, maintaining old output value.
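The two clock phases described above can be modeled by chaining two level-sensitive latches. This is a behavioral sketch with idealized zero-delay latches; the class and function names are illustrative, and waveforms are lists of 0/1 samples.

```python
class Latch:
    """Level-sensitive latch: output follows d while enabled, else holds."""

    def __init__(self, q=0):
        self.q = q

    def update(self, enable, d):
        if enable:
            self.q = d
        return self.q


def master_slave(phi, d):
    """Return the Q samples of a master-slave flip-flop for clock phi and input d."""
    master, slave = Latch(), Latch()
    out = []
    for p, x in zip(phi, d):
        m = master.update(p == 1, x)  # master loads the input while phi = 1
        q = slave.update(p == 0, m)   # slave passes the master value while phi = 0
        out.append(q)
    return out


phi = [1, 1, 0, 0, 1, 1, 0, 0]
d = [1, 1, 0, 1, 0, 0, 1, 1]
q = master_slave(phi, d)
print(q)  # [0, 0, 1, 1, 1, 1, 0, 0]
```

The output changes only when φ falls, and it takes the value D had just before the fall; changes of D while φ = 0 never reach Q.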
Latch-Based Flip-Flop
The storage nodes have to be refreshed at periodic
intervals
Clock Skew Problem
[Figure: two cascaded D-latches]
The 1-1 clock overlap introduces a race condition.
During the 1-1 overlap, node A is driven by both D
and B.
Outlines
Introduction
Sequencing Methods
Latches and Flip-Flops
Sequential System Design
Conclusion
Procedure
Step 1: Specification
Step 2: Formulation: obtain a state diagram or state table
Step 3: State Assignment: obtain a state table if only a state diagram is available, and assign binary codes to the states
Step 4: Flip-Flop Input Equation Determination: select flip-flop types and derive flip-flop input equations from the next-state entries in the table
Step 5: Output Equation Determination: derive output equations from the output entries in the table
Step 6: Optimization: optimize the equations
Step 7: Technology Mapping: find a circuit from the equations and map it to flip-flops and gates
Step 8: Verification
State Transition Graphs/Tables
Basic functional description of FSM.
Symbolic truth table for next-state, output functions:
no structure of logic;
no encoding of states.
State transition graph and table are functionally
equivalent.
State Assignment
Must find binary encoding for symbolic states: state assignment.
State assignment affects:
  combinational logic area;
  combinational logic delay;
  memory element area.
Example: One-bit Counter (1/4)
Easy to specify as one-bit counter.
Harder to specify n-bit counter behavior.
Can specify n-bit counter as structure made of 1-bit
counters.
State table:
Count  Cin  Next Count  Cout (Carry Out)
0      0    0           0
0      1    1           0
1      0    1           0
Implementation (2/4)
XOR computes next value of this bit of counter.
NAND/inverter computes carry-out.
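A minimal sketch of this implementation: XOR gives the next count bit, and AND (a NAND followed by an inverter) gives the carry-out. Chaining 1-bit cells forms an n-bit counter, as the previous slide suggests. The function names and the LSB-first bit ordering are choices made for this sketch.

```python
def one_bit_counter(count, c_in):
    """Return (next_count, c_out) for one counter bit."""
    next_count = count ^ c_in  # XOR computes the next value of this bit
    c_out = count & c_in       # NAND followed by an inverter, i.e. AND
    return next_count, c_out


def n_bit_counter_step(bits, c_in=1):
    """Advance an n-bit counter built from 1-bit cells; bits[0] is the LSB."""
    next_bits, carry = [], c_in
    for b in bits:
        nb, carry = one_bit_counter(b, carry)
        next_bits.append(nb)
    return next_bits, carry


for count in (0, 1):
    for c_in in (0, 1):
        print(count, c_in, one_bit_counter(count, c_in))

print(n_bit_counter_step([1, 1, 0]))  # ([0, 0, 1], 0): 3 increments to 4
```

The carry-out of each cell feeds the carry-in of the next, so a bit toggles only when all lower bits are 1.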
(3/4)
[Figure: layout floorplan with cells l1 (latch), n (NAND), i (INV), x (XOR), and l2 (latch); signals C_in and C_out; supply rails V_DD and V_SS]
(1/5)
Behavior of a machine that recognizes "01" in a continuous stream of bits.
Operation:
  Waits for 0 to appear in state bit1.
  Goes into separate state bit2 when 0 appears.
  If 1 appears immediately after 0, there can't be a 01 on the next cycle, so the machine can go back to wait for 0 in state bit1.
Time   0     1     2     3     4     5
Input  0     0     1     1     0     1
State  bit1  bit2  bit2  bit1  bit1  bit2
Next   bit2  bit2  bit1  bit1  bit2  bit1
State Transition Table (2/5)
Operation:
  Waits for 0 to appear in state bit1.
  Goes into separate state bit2 when 0 appears.
  If 1 appears immediately after 0, there can't be a 01 on the next cycle, so the machine can go back to wait for 0 in state bit1.
Input  Present  Next  Output
0      bit1     bit2  0
1      bit1     bit1  0
0      bit2     bit2  0
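The state transition table can be executed directly as a lookup from (input, present state) to (next state, output). In the sketch below, the entry for input 1 in state bit2 takes its next state (bit1) from the operation description; its output value of 1, marking a recognized "01", is an assumption of this sketch. The names `TABLE` and `run` are illustrative.

```python
# (input, present state) -> (next state, output)
TABLE = {
    (0, "bit1"): ("bit2", 0),
    (1, "bit1"): ("bit1", 0),
    (0, "bit2"): ("bit2", 0),
    (1, "bit2"): ("bit1", 1),  # assumed output: "01" has just been seen
}


def run(inputs, state="bit1"):
    """Return a list of (present, next, output) tuples, one per input bit."""
    trace = []
    for x in inputs:
        nxt, out = TABLE[(x, state)]
        trace.append((state, nxt, out))
        state = nxt
    return trace


trace = run([0, 0, 1, 1, 0, 1])
print([present for present, _, _ in trace])
# ['bit1', 'bit2', 'bit2', 'bit1', 'bit1', 'bit2']
print([out for _, _, out in trace])
# [0, 0, 1, 0, 0, 1]
```

For the input stream 0, 0, 1, 1, 0, 1 the state sequence matches the behavior trace given for the recognizer.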
State Transition Graph (3/5)
01 Recognizer Encoding (4/5)
Choose bit1=0, bit2=1, and then truth table is as follows:
Input  Present  Next  Output
0      0        1     0
1      0        0     0
0      1        1     0
Implementation (5/5)
After encoding, truth table can be implemented in gates:
[Figure: gate-level implementation using D/Q storage elements]
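One possible pair of gate equations for the encoding bit1 = 0, bit2 = 1 is next = NOT input and output = input AND present. This is a sketch, not necessarily the circuit in the original figure: it assumes that input 1 in state bit2 returns to bit1 with output 1, and the function name `recognizer_gates` is illustrative. The code checks the equations against every input and state combination.

```python
def recognizer_gates(inp, present):
    """Candidate gate-level logic for the encoded 01 recognizer."""
    nxt = 1 - inp        # inverter on the input
    out = inp & present  # AND of input and present-state bit
    return nxt, out


# Encoded truth table: (input, present) -> (next, output).
ENCODED = {
    (0, 0): (1, 0),
    (1, 0): (0, 0),
    (0, 1): (1, 0),
    (1, 1): (0, 1),  # assumed row: 1 after 0 is recognized, return to bit1
}

matches = all(recognizer_gates(i, s) == ENCODED[(i, s)] for (i, s) in ENCODED)
print(matches)  # True
```

With this encoding the next state does not depend on the present state at all, which illustrates how state assignment affects the amount of combinational logic.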
Power Optimization
Memory elements stop glitch propagation: