Custom Single-purpose processors
• A single-purpose processor is a digital system intended to solve a specific computation task.
• A custom single-purpose processor is designed to execute a specific task within the embedded system (ES).
• An embedded system designer choosing to use a custom single-purpose, rather than a general-purpose, processor to implement part of a system's functionality may achieve several benefits:
– performance may be fast
– size may be small
• Here we start with a review of combinational and sequential design, and then describe a method for converting programs to custom single-purpose processors.
Combinational logic design
• A combinational circuit is a digital circuit whose output is purely a function of its current inputs; such a circuit has no memory of past inputs.
• A transistor is the basic electrical component of digital systems. Combinations of transistors form components called logic gates.
• The basic principle of an NPN transistor acting as a switch: when a high voltage (typically +5 volts, referred to as logic 1) is applied to the gate, the transistor conducts, so current flows. When a low voltage (referred to as logic 0, typically ground) is applied to the gate, the transistor does not conduct.
Creation of gates using transistors
Basic logic gates
Combinational circuit design
• Q: y is 1 if a is equal to 1, or if b and c are both equal to 1. z is 1 if a is equal to 1 and b is equal to 1, or if b or c is equal to 1, but not both.
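The problem statement above can be written directly as Boolean expressions. The following is a minimal sketch, assuming that "b or c, but not both" means exclusive-or; the function names `comb_y` and `comb_z` are illustrative and not from the slides.

```c
#include <stdbool.h>

/* y = a + bc: y is 1 if a is 1, or if b and c are both 1. */
bool comb_y(bool a, bool b, bool c) {
    return a || (b && c);
}

/* z = ab + (b xor c): assumes "b or c, but not both" is exclusive-or. */
bool comb_z(bool a, bool b, bool c) {
    return (a && b) || (b != c);
}
```

Because the outputs depend only on the current values of a, b and c, calling either function twice with the same arguments always gives the same result, which is the defining property of a combinational circuit.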
RT-level combinational components
• RT-level design uses combinational components that are more powerful than gates.
• Such components are:
– Multiplexer
– Decoder
– Adder
– Comparator
– ALU
Multiplexer
• A multiplexor, sometimes called a selector, allows only one of its data inputs to pass through to the output, according to the select inputs.
• If there are m data inputs, then there
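The selecting behaviour of a multiplexor can be sketched in software as an array lookup. This is an illustration only: the function name `mux`, the 8-bit data width and the array representation of the data inputs are assumptions, not part of the slides.

```c
#include <assert.h>
#include <stdint.h>

/* m-to-1 multiplexer: passes inputs[sel] through to the output.
 * inputs: the m data inputs, sel: value on the select inputs. */
uint8_t mux(const uint8_t *inputs, unsigned m, unsigned sel) {
    assert(sel < m);   /* a select value must name an existing data input */
    return inputs[sel];
}
```

For example, with the four data inputs 10, 20, 30 and 40, a select value of 2 passes 30 to the output.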
Decoder
• A decoder converts its binary input I into a one-hot output O. "One-hot" means that exactly one of the output lines can be 1 at a given time.
• Thus, if there are n outputs, then there
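A one-hot output can be modelled as an integer with exactly one bit set. The sketch below assumes at most 32 output lines and uses the hypothetical name `decode`; output line i corresponds to bit i of the result.

```c
#include <assert.h>
#include <stdint.h>

/* Decoder: binary input i -> one-hot output in which only line i is 1. */
uint32_t decode(unsigned i, unsigned n_outputs) {
    assert(n_outputs <= 32 && i < n_outputs);
    return (uint32_t)1 << i;
}
```

A quick way to check the one-hot property of a value o is that o is nonzero and `(o & (o - 1)) == 0`.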
Adder
• An adder adds two n-bit binary inputs A and B, generating an n-bit output
Comparator
• A comparator compares two n-bit binary inputs A and B, generating outputs that indicate whether A is less than, equal to, or greater than B.
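The adder and comparator can be sketched for n = 8 as follows. The struct and function names are illustrative, and the carry output of the adder is an assumption added here, since the slide text breaks off after the n-bit output.

```c
#include <stdbool.h>
#include <stdint.h>

typedef struct { uint8_t sum; bool carry; } add_out;      /* carry is assumed */
typedef struct { bool less, equal, greater; } cmp_out;

/* 8-bit adder: the sum wraps to 8 bits, carry reports the lost ninth bit. */
add_out add8(uint8_t a, uint8_t b) {
    unsigned wide = (unsigned)a + (unsigned)b;
    add_out o = { (uint8_t)(wide & 0xFFu), wide > 0xFFu };
    return o;
}

/* 8-bit comparator: exactly one of the three outputs is 1. */
cmp_out compare8(uint8_t a, uint8_t b) {
    cmp_out o = { a < b, a == b, a > b };
    return o;
}
```

For instance, adding 200 and 100 gives 300, which does not fit in 8 bits, so the sum output is 44 and the carry output is 1.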
ALU
• An ALU (arithmetic-logic unit) can perform a variety of arithmetic and logic functions on its n-bit inputs A and B.
• The select lines choose the current function; if there are m possible functions, then there must be at least log2(m) select lines.
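A small sketch of an ALU and of the select-line count. The set of four functions, the 8-bit width and all names are assumptions chosen for illustration; `select_lines` rounds log2(m) up, because the number of lines must be a whole number.

```c
#include <assert.h>
#include <stdint.h>

enum alu_op { ALU_ADD, ALU_SUB, ALU_AND, ALU_OR };

/* The select value chooses which function is applied to inputs a and b. */
uint8_t alu(enum alu_op sel, uint8_t a, uint8_t b) {
    switch (sel) {
    case ALU_ADD: return (uint8_t)(a + b);
    case ALU_SUB: return (uint8_t)(a - b);
    case ALU_AND: return a & b;
    case ALU_OR:  return a | b;
    }
    assert(0 && "unknown ALU function");
    return 0;
}

/* Smallest number of select lines able to encode m functions: ceil(log2(m)). */
unsigned select_lines(unsigned m) {
    assert(m >= 1 && m <= (1u << 31));
    unsigned lines = 0;
    while ((1u << lines) < m)
        lines++;
    return lines;
}
```

With the four functions above, two select lines are enough; a fifth function would require a third line.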
Sequential logic design
• A sequential circuit is a digital circuit whose outputs are a function of the current as well as previous input values.
• In other words, sequential logic possesses memory.
• One of the most basic sequential circuits is the flip-flop. A flip-flop stores
Registers
• A register stores n bits from its n-bit data input I, with those stored bits appearing at its output O.
• A register usually has at least two control inputs, clock and load.
• For a rising-edge-triggered register, the inputs I are only stored when load is
Shift registers
• A shift register has a one-bit data input I, and at least two control inputs, clock and shift.
• When clock is rising and shift is 1, the value of I is stored in the n'th bit, while the n'th bit is stored in the (n-1)'th bit, and likewise, until the second bit is stored in the first bit.
• The first bit is typically shifted out,
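The shifting rule above can be sketched for n = 8. In this illustration bit 7 of `bits` plays the role of the n'th bit and bit 0 the role of the first bit; the type and function names are hypothetical, and each call stands for one rising clock edge.

```c
#include <stdbool.h>
#include <stdint.h>

typedef struct { uint8_t bits; } shift_reg8;   /* bit 0 is the "first" bit */

/* One rising clock edge. Returns the value of the first bit before the edge,
 * which is the bit that leaves the register when shift is 1. */
bool shift_reg8_clock(shift_reg8 *r, bool shift, bool in) {
    bool out = (r->bits & 1u) != 0;
    if (shift) {
        /* every bit moves one place toward the first bit; in enters at the top */
        r->bits = (uint8_t)((r->bits >> 1) | (in ? 0x80u : 0x00u));
    }
    return out;
}
```

A 1 shifted in at the n'th bit therefore reaches the first bit after seven further shifts and is shifted out on the shift after that.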
Counters
• A counter is a register that can also increment (add binary 1 to) its stored binary value.
• A counter has a clear input, which resets all stored bits to 0, and a count input, which enables incrementing on the clock edge.
• There are two types of counters:
– Asynchronous counters (up/down): no common clock pulse is needed to count; only the first flip-flop is clocked, and each later stage is triggered by the output of the previous stage
– Synchronous counters (up/down): need clock
Sequential logic design, e.g.:
• Q: You want to construct a clock divider. Slow down your pre-existing clock so that you output a 1 for every four clock cycles.
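One possible behaviour for this clock divider is a 2-bit counter whose output is 1 only in its last state. The sketch below is an assumption about the intended design, not the slide's worked solution; the names `clk_div4` and `clk_div4_tick` are illustrative, and each call models one rising edge of the pre-existing clock.

```c
#include <stdbool.h>
#include <stdint.h>

typedef struct { uint8_t count; } clk_div4;   /* 2-bit state: 0..3 */

/* Call once per rising edge of the fast clock; returns the divided output. */
bool clk_div4_tick(clk_div4 *d) {
    bool out = (d->count == 3);                   /* 1 on every fourth cycle */
    d->count = (uint8_t)((d->count + 1u) & 0x3u); /* wrap after 3 */
    return out;
}
```

Starting from count 0, the output sequence is 0, 0, 0, 1 and then repeats, so 16 input cycles produce four output pulses.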
CUSTOM SINGLE-PURPOSE PROCESSOR DESIGN
Custom single-purpose processor basic model
[Figure: controller and datapath. The controller has external control inputs and external control outputs; the datapath has external data inputs and external data outputs; datapath control inputs and datapath control outputs connect the two.]
[Figure: a view inside the controller and datapath. The controller contains a state register and next-state and control logic; the datapath contains registers and functional units.]
How?
• The designer can apply all the combinational and sequential logic design techniques to build datapath components and controllers.
• The designer now has nearly all the knowledge needed to build a custom single-purpose processor for a given program, since a processor consists of a controller and a datapath.
• Here we describe a technique for
Explanation with e.g.:
• QSTN: Design a CSP circuit to find the greatest common divisor (GCD) of two numbers, i.e., if the inputs are 12 and 8, the output should be 4, or if the inputs are 13 and 5, the output should be 1.
Solution
• To begin building our single-purpose processor implementing the GCD program, we first convert our program into a complex state diagram called a finite state machine with data (FSMD).
• In an FSMD, states and arcs may include arithmetic expressions, and these expressions may use external inputs and outputs or variables.
• First we have to learn how a "while loop" and an "if-else" statement can be converted to a state diagram.
Step 1: Problem view with functionality
[Figure: black-box view of the GCD processor with ports x_i, y_i, go_i and d_o]
• We can use templates to convert this program to a state diagram.
Step 3: Divide the functionality into a datapath part and a controller part
• The datapath part should consist of an interconnection of combinational and sequential components.
• The controller part should consist of a basic state diagram, i.e., one containing only boolean actions and conditions.
• Construction of the datapath in 4 steps:
– 1. First, we create a register for any declared variable. In the example, these are x and y. We treat an output port as having an implicit variable, so we create a register d and connect it to the output port. We also draw the input and output ports.
– 2. Second, we create a functional unit for each arithmetic operation in the state diagram. In the example, there are two subtractions, one comparison for less than, and one comparison for inequality, yielding two subtractors and two comparators, as shown in the figure.
– 3. Third, we connect the ports, registers and functional units. For each write to a variable in the state diagram, we draw a connection from the write's source (an input port, a functional unit, or another register) to the variable's register. For each arithmetic and logical operation, we connect sources to an input of the operation's corresponding functional unit. When more than one source is connected to a register, we add an appropriately-sized multiplexor.
– 4. Finally, we create a unique identifier for each control input and
• Construction of the controller part:
– We replace every variable write by actions that set the select signals of the multiplexor in front of the variable's register such that the write's source passes through, and we assert the load signal of that register.
– We replace every logical operation in a condition by the corresponding functional unit control output.
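The split into datapath and controller described above can be sketched as a small cycle-by-cycle simulation of the GCD example. This is a sketch under stated assumptions: the state names, the control signal names (`x_sel`, `x_ld`, and so on) and the 8-bit width are invented for illustration, go_i is assumed to be already asserted, and both inputs are assumed nonzero because the subtraction-based loop does not terminate otherwise.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Datapath state: one register per variable, plus d for the output port. */
typedef struct { uint8_t x, y, d; } datapath;

/* Controller outputs: multiplexor selects and register load signals. */
typedef struct { bool x_sel, y_sel, x_ld, y_ld, d_ld; } ctrl;

typedef enum { S_LOAD, S_TEST, S_SUB_Y, S_SUB_X, S_DONE } state;

/* One clock edge of the datapath: functional units read the old register
 * values, and a register changes only when its load signal is asserted. */
static void datapath_clock(datapath *dp, ctrl c, uint8_t x_i, uint8_t y_i) {
    uint8_t x_minus_y = (uint8_t)(dp->x - dp->y);   /* subtractor 1 */
    uint8_t y_minus_x = (uint8_t)(dp->y - dp->x);   /* subtractor 2 */
    uint8_t x_old = dp->x;
    if (c.d_ld) dp->d = x_old;
    if (c.x_ld) dp->x = c.x_sel ? x_i : x_minus_y;  /* mux in front of x */
    if (c.y_ld) dp->y = c.y_sel ? y_i : y_minus_x;  /* mux in front of y */
}

uint8_t gcd_processor(uint8_t x_i, uint8_t y_i) {
    assert(x_i > 0 && y_i > 0);
    datapath dp = {0, 0, 0};
    state s = S_LOAD;
    while (true) {
        ctrl c = {false, false, false, false, false};
        state next = s;
        switch (s) {
        case S_LOAD:    /* x = x_i; y = y_i */
            c.x_sel = c.y_sel = c.x_ld = c.y_ld = true;
            next = S_TEST;
            break;
        case S_TEST:    /* conditions come from the two comparators */
            if (dp.x == dp.y) next = S_DONE;
            else next = (dp.x < dp.y) ? S_SUB_Y : S_SUB_X;
            break;
        case S_SUB_Y:   /* y = y - x */
            c.y_ld = true;
            next = S_TEST;
            break;
        case S_SUB_X:   /* x = x - y */
            c.x_ld = true;
            next = S_TEST;
            break;
        case S_DONE:    /* d_o = x */
            c.d_ld = true;
            break;
        }
        datapath_clock(&dp, c, x_i, y_i);
        if (s == S_DONE) return dp.d;
        s = next;
    }
}
```

Note that the controller only sets boolean signals and reads boolean conditions, while all arithmetic stays inside `datapath_clock`. Calling `gcd_processor(12, 8)` walks through load, test, x = x - y, test, y = y - x, test, done, and returns 4.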
RT-level custom single-purpose processor design
• We often start with a state machine
– Rather than an algorithm
– Cycle timing is often too central to functionality
• Example
– Bus bridge that converts a 4-bit bus to an 8-bit bus
– Start with FSMD
– Known as register-transfer (RT) level
– Exercise: complete the design
Problem Specification
Bridge: a single-purpose processor that converts two 4-bit inputs, arriving one at a time over data_in along with a rdy_in pulse, into one 8-bit output on data_out along with a rdy_out pulse.
[Figure: Sender, Bridge and Receiver. Signals: data_in(4) and rdy_in from Sender to Bridge; data_out(8) and rdy_out from Bridge to Receiver; clock.]
FSMD for the Problem
Inputs: rdy_in: bit; data_in: bit[4];
Outputs: rdy_out: bit; data_out: bit[8]
Variables: data_lo, data_hi (as used in the state actions below)
States and actions; transitions are guarded by rdy_in=0 or rdy_in=1:
– WaitFirst4
– RecFirst4Start: data_lo=data_in
– RecFirst4End
– WaitSecond4
– RecSecond4Start: data_hi=data_in
– RecSecond4End
– Send8Start: data_out=data_hi & data_lo, rdy_out=1
– Send8End: rdy_out=0
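The Bridge FSMD can be sketched as a state machine advanced once per clock cycle. The state names and actions follow the specification above, but the transition structure is an assumption, since the diagram's arcs did not survive: each Wait state is taken to leave when rdy_in becomes 1, each End state when rdy_in returns to 0, and the & in the Send8Start action is read as concatenation of the two 4-bit halves. The type and function names are illustrative.

```c
#include <stdbool.h>
#include <stdint.h>

typedef enum {
    WAIT_FIRST4, REC_FIRST4_START, REC_FIRST4_END,
    WAIT_SECOND4, REC_SECOND4_START, REC_SECOND4_END,
    SEND8_START, SEND8_END
} bridge_state;

typedef struct {
    bridge_state state;
    uint8_t data_lo, data_hi;   /* 4-bit variables */
    uint8_t data_out;           /* 8-bit output */
    bool rdy_out;
} bridge;

/* One clock cycle: perform the current state's action, then pick the next state. */
void bridge_clock(bridge *b, bool rdy_in, uint8_t data_in) {
    switch (b->state) {
    case WAIT_FIRST4:
        if (rdy_in) b->state = REC_FIRST4_START;
        break;
    case REC_FIRST4_START:
        b->data_lo = data_in & 0x0Fu;
        b->state = REC_FIRST4_END;
        break;
    case REC_FIRST4_END:
        if (!rdy_in) b->state = WAIT_SECOND4;
        break;
    case WAIT_SECOND4:
        if (rdy_in) b->state = REC_SECOND4_START;
        break;
    case REC_SECOND4_START:
        b->data_hi = data_in & 0x0Fu;
        b->state = REC_SECOND4_END;
        break;
    case REC_SECOND4_END:
        if (!rdy_in) b->state = SEND8_START;
        break;
    case SEND8_START:
        b->data_out = (uint8_t)((b->data_hi << 4) | b->data_lo);
        b->rdy_out = true;
        b->state = SEND8_END;
        break;
    case SEND8_END:
        b->rdy_out = false;
        b->state = WAIT_FIRST4;
        break;
    }
}
```

Under these assumptions, sending 0xA and then 0x5, each accompanied by a rdy_in pulse, yields data_out = 0x5A together with a one-cycle rdy_out pulse.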
RT-level custom single-purpose processor design (cont.)
[Figure: (a) Controller for the Bridge: the same states as the FSMD, with variable writes replaced by load signals: RecFirst4Start asserts data_lo_ld=1, RecSecond4Start asserts data_hi_ld=1, Send8Start asserts data_out_ld=1 and rdy_out=1, Send8End sets rdy_out=0. (b) Datapath for the Bridge: signals rdy_in, rdy_out, data_in(4) and data_out; registers data_lo, data_hi and data_out with load inputs data_lo_ld, data_hi_ld and data_out_ld; clk goes to all registers.]
Optimizing custom single-purpose processor design
• Optimization is the task of making design metric values the best possible.
• Optimization in custom single-purpose processor design means:
– Optimizing the original program
– Optimizing the FSMD
– Optimizing the datapath
Optimizing the original program
• Analyze program attributes and look for areas of possible improvement:
– number of computations
– size of variable
– time and space complexity
– operations used
GCD program

original program
0: int x, y;
1: while (1) {
2:   while (!go_i);
3:   x = x_i;
4:   y = y_i;
5:   while (x != y) {
6:     if (x < y)
7:       y = y - x;
       else
8:       x = x - y;
     }
9:   d_o = x;
   }

optimized program
0: int x, y, r;
1: while (1) {
2:   while (!go_i);
     // x must be the larger number
3:   if (x_i >= y_i) {
4:     x = x_i;
5:     y = y_i;
     }
6:   else {
7:     x = y_i;
8:     y = x_i;
     }
9:   while (y != 0) {
10:    r = x % y;
11:    x = y;
12:    y = r;
     }
13:  d_o = x;
   }

• Replace the subtraction operation(s) with the modulo operation in order to speed up the program.
• Original program, GCD(42, 8): 9 iterations to complete the loop. x and y values evaluated as follows: (42, 8), (34, 8), (26, 8), (18, 8), (10, 8), (2, 8), (2, 6), (2, 4), (2, 2).
• Optimized program, GCD(42, 8): 3 iterations to complete the loop. x and y values evaluated as follows: (42, 8), (8, 2), (2, 0).
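The iteration counts quoted above can be reproduced with a short sketch of the two inner loops. The function names and the `tests` out-parameter are illustrative additions; "iterations" is taken here to mean the number of times the loop condition is evaluated, including the final failing test, which is the reading that matches the counts 9 and 3.

```c
#include <assert.h>

/* Original loop: repeated subtraction. *tests counts loop-condition evaluations. */
unsigned gcd_sub(unsigned x, unsigned y, unsigned *tests) {
    assert(x > 0 && y > 0);      /* the loop never terminates if an input is 0 */
    *tests = 1;                  /* the first evaluation of x != y */
    while (x != y) {
        if (x < y) y = y - x;
        else       x = x - y;
        (*tests)++;              /* the condition is evaluated again */
    }
    return x;
}

/* Optimized loop: modulo, with x forced to be the larger number first. */
unsigned gcd_mod(unsigned x_i, unsigned y_i, unsigned *tests) {
    unsigned x, y, r;
    if (x_i >= y_i) { x = x_i; y = y_i; }
    else            { x = y_i; y = x_i; }
    *tests = 1;                  /* the first evaluation of y != 0 */
    while (y != 0) {
        r = x % y;
        x = y;
        y = r;
        (*tests)++;
    }
    return x;
}
```

For the inputs 42 and 8 both functions return 2, with `tests` equal to 9 for the subtraction version and 3 for the modulo version.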
Optimizing the finite state machine with datapath (FSMD)
• Areas of possible improvement:
• Merge states
– states with constants on transitions can be eliminated, since the transition taken is already known
– states with independent operations can be merged
• Separate states
– states which require complex operations (a*b*c*d) can be broken into smaller states to reduce hardware size
• Scheduling
– the task of assigning operations from the
[Figure: original FSMD and optimized FSMD for the GCD program]
– eliminate state 1: transitions have constant values
– merge state 2 and state 2J: no loop operation in between them
– merge state 3 and state 4: assignment operations are independent of one another
– merge state 5 and state 6: transitions from state 6 can be done in state 5
– eliminate state 5J and 6J: transitions from each state can be done from state 7 and state 8, respectively
– eliminate state 1-J: transition from state 1-J can be done directly from state 9
Optimizing the datapath
• Sharing of functional units
– one-to-one mapping, as done previously, is not necessary
– if the same operation occurs in different states, those occurrences can share a single functional unit
• Multi-functional units
– ALUs support a variety of operations, so one ALU can be shared among operations occurring in different states
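As an illustration of sharing, the two subtractions of the GCD example (y - x and x - y) occur in different states, so a single subtractor with multiplexors on its operands could serve both. The sketch below is an assumption about how such sharing might look, not a design from the slides; `shared_sub` and the `swap` select signal are hypothetical names.

```c
#include <stdbool.h>
#include <stdint.h>

/* One shared subtractor: the select signal swap chooses the operand order,
 * so the same unit computes x - y (swap = 0) or y - x (swap = 1). */
uint8_t shared_sub(bool swap, uint8_t x, uint8_t y) {
    uint8_t a = swap ? y : x;    /* multiplexor on the first operand */
    uint8_t b = swap ? x : y;    /* multiplexor on the second operand */
    return (uint8_t)(a - b);     /* the single subtractor */
}
```

The trade-off is one subtractor saved at the cost of two multiplexors and one extra control signal from the controller.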