In this section we investigate two meta-heuristic algorithms for solving the behavioral
synthesis problem: (i) simulated annealing and (ii) evolutionary algorithms [78, 42,
79, 66, 43, 32, 52]. Meta-heuristic algorithms are interesting in this context because large
DFGs can be scheduled with fast run-times. Furthermore, they can easily be stopped
early if the optimal solution is not required, but just a solution which falls
within the area requirement. The power constraint has not yet been implemented
in these algorithms.
For these algorithms we target DFG fragments to be scheduled and a
time constraint which specifies the maximum number of control steps allowed for the
execution of the DFG fragment. The DFGs considered here are acyclic directed
graphs with vertices $\sigma_i$, representing the operators to be executed, and edges
$\sigma_i \to \sigma_l$, specifying the order in which they have to be executed for the
computation to be correct ($\sigma_i$ has to be executed before $\sigma_l$). The DFG is
augmented with a source vertex (connecting to inputs, $I$) and a target vertex
(connecting from outputs, $O$). To execute operations we use the same resource
library of functional units, defined in Table 6.2.
With the hard time frame constraint we need to find a schedule in which to execute
the operations in the DFG on some FUs such that we finish all operators before the
time frame $T$ (without violating their dependencies) and at the same time minimize
the area. This involves trade-offs between scheduling e.g. many $\{+, -, >\}$
operations in parallel (requiring more cheap ALUs) and serializing more $\{*\}$
operations (requiring fewer expensive mul1 units), as well as trade-offs between
different subtypes of FUs (fast or slow). All this depends strongly on the specific
DFG and the time frame $T$ we have available.
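As an illustration only, the resource library of Table 6.2 and a small DFG could be represented as follows. This is a minimal Python sketch and not the implementation used in this work: the names `FU`, `LIBRARY`, `OPS`, `EDGES` and `candidates` are assumptions, and the four-operator DFG is a made-up example. The energy column of Table 6.2 is left out because the power constraint is not used here.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class FU:
    name: str
    ops: frozenset   # operator symbols this unit type can execute
    area: float      # silicon area, taken from Table 6.2
    slots: int       # execution time in time-slots, taken from Table 6.2

LIBRARY = [
    FU("add", frozenset("+"), 2032.75, 1),
    FU("sub", frozenset("-"), 2032.75, 1),
    FU("comp", frozenset(">"), 2032.75, 1),
    FU("ALU", frozenset("+->"), 2965.00, 1),
    FU("mul1", frozenset("*"), 41978.50, 3),
    FU("mul2", frozenset("*"), 28414.50, 6),
    FU("mul3", frozenset("*"), 14638.75, 17),
]

# Hypothetical DFG: operator index -> symbol, and edges (i, l) meaning
# that operator i has to be executed before operator l.
OPS = {0: "*", 1: "*", 2: "+", 3: ">"}
EDGES = [(0, 2), (1, 2), (2, 3)]

def candidates(symbol):
    """Return all FU types able to execute the given operator symbol."""
    return [fu for fu in LIBRARY if symbol in fu.ops]
```

For a multiplication, `candidates("*")` returns mul1, mul2 and mul3, which is the fast and large versus slow and small choice discussed above.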
6.3.1 Problem formulation
First, we formulate the behavioral synthesis problem as an ILP problem. We have
a DFG with operators $\sigma_i$, $i = 1 \ldots n$, and dependencies $\sigma_i \to \sigma_l$,
a resource library with functional units of type $FU_j$, $j = 1 \ldots m$, each having a
silicon area $w_j$, and a time interval $k = 1 \ldots T$, giving for each operator
$\sigma_i$ a time interval in which it can be scheduled: $S_i \ldots L_i$. We want to
minimize the used silicon area. Let us start by introducing the variables in our
formulation:
$x$: Let $x_{i,j,k}$ be a 0-1 integer variable associated with the operator $\sigma_i$:
$x_{i,j,k} = 1$ if $\sigma_i$ is scheduled to start in time step $k$ and assigned to
execute on $FU_j$, and $x_{i,j,k} = 0$ otherwise.
$N$: Let $N_j$ be an integer variable which denotes the number of functional units of
type $FU_j$ we will allocate on our IC. The objective function is:

minimize $A = \sum_{j=1}^{m} w_j N_j$    (6.2)
The objective function (equ. 6.2) states that we want to minimize the total used
silicon area: it sums over all functional unit types and, for each type, multiplies its
area by the number required for the schedule. The first constraint (equ. 6.3) simply
states that all operators must be scheduled to start in some time step and on some
$FU_j$. The second constraint (equ. 6.4) specifies that for each DFG dependency
$\sigma_i \to \sigma_l$, operator $l$ can only start after operator $i$ finishes,
$t_l \geq d_j + t_i$ (which depends on which FU $i$ is scheduled on). The third
constraint (equ. 6.5) states that an FU can only execute one operation at a time. The
final constraint (equ. 6.6) ensures that nowhere is more power used than is
available. This last constraint will be ignored in the following.
[Figure 6.9 plot; labels recovered from the image: cost gradient, Feasible, Infeasible, Perturbation, $A$, solutions $i \ldots i+3$ with costs $A_i \ldots A_{i+3}$.]
Figure 6.9: Crossing from one island of the solution space to another by keeping the
infeasible solutions, when the perturbation is smaller than the minimum required
distance. The sequence of $\phi_j$'s indicated by the dots are the actual solutions, and
the sequence of $F(\phi_j) = A_j$ indicated by the crosses corresponds to the feasible
solutions the area cost function is computed from.
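To make the objective and the dependency constraint of Section 6.3.1 concrete, the following Python sketch evaluates them for a given schedule. It is an illustration under stated assumptions, not the ILP formulation itself: a schedule is assumed to be a dictionary mapping an operator to its pair (start step, FU type), $N_j$ is assumed to be the peak number of operators running simultaneously on type $j$ (which is what the one-operation-at-a-time constraint implies), and the names `AREA`, `DELAY`, `units_needed`, `area` and `respects_dependencies` are hypothetical.

```python
from collections import Counter

AREA = {"ALU": 2965.00, "mul1": 41978.50}   # w_j, values from Table 6.2
DELAY = {"ALU": 1, "mul1": 3}               # d_j in time-slots, from Table 6.2

def units_needed(schedule):
    """N_j: peak number of operators executing at once on each FU type."""
    busy = Counter()
    for start, fu in schedule.values():
        for step in range(start, start + DELAY[fu]):
            busy[(fu, step)] += 1
    need = Counter()
    for (fu, _step), count in busy.items():
        need[fu] = max(need[fu], count)
    return need

def area(schedule):
    """Objective A = sum over FU types of w_j * N_j."""
    return sum(AREA[fu] * n for fu, n in units_needed(schedule).items())

def respects_dependencies(schedule, edges):
    """Check t_l >= t_i + d_j for every dependency sigma_i -> sigma_l."""
    return all(schedule[l][0] >= schedule[i][0] + DELAY[schedule[i][1]]
               for i, l in edges)

# Two multiplications feeding one addition (made-up example).
edges = [(0, 2), (1, 2)]
parallel = {0: (1, "mul1"), 1: (1, "mul1"), 2: (4, "ALU")}
serial = {0: (1, "mul1"), 1: (4, "mul1"), 2: (7, "ALU")}
```

Both example schedules respect the dependencies, but the serialized one needs a single mul1 instead of two and therefore has the smaller area, at the price of a longer time frame.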
6.3.2 Representation and feasibility
We use a solution vector containing $n$ tuples (one for each operator), each consisting
of the pair $(k_i, j_i)$, where $k_i$ is the time step in which operator $i$ starts and
$j_i$ is the FU type to execute it on ($k_i \in S_i \ldots L_i$ and $j : \sigma_i \in FU_j$).
Let the schedule be defined by:

$\phi = [(k_1, j_1), (k_2, j_2), \ldots, (k_n, j_n)]$
In both simulated annealing and evolutionary algorithms we will likely produce
(and start with) solutions which are infeasible, where infeasible means that we are
violating DFG dependencies. We therefore need to make the solution feasible,
$\phi \to \phi'$. We also use this feasibility algorithm to allow for easy crossing over
regions of infeasible solutions, as illustrated in Figure 6.9. We keep the infeasible
solution, but compute its cost by first making the solution feasible and then
computing the cost of the feasible solution. This requires, however, that the
feasibility algorithm is deterministic, such that the best (feasible) solution can be
regenerated from a possibly infeasible best solution. This is a better approach than
working with a penalty function or removing the infeasible solutions.
First, let us revisit the ASAP algorithm. Before the algorithm starts, assume we
assign an operator $\sigma_i$ to a time step $t_i \in S_i \ldots L_i$ and with $j_i$ equal
to the fastest $FU_j$. The output is the earliest time $S'_l$ at which the other operators
$\sigma_l$ can be scheduled when $\sigma_i$ is scheduled in time step $k_i$. Only
successors to $\sigma_i$ are affected: $S_l \leq S'_l$. Critical for this to be of any use
is $S'_l \leq L_l \;\forall l$. Assume that we at some point get $S'_l > L_l$ after
assigning operator $r$ to time step $t_r$ ($\in S_r \ldots L_r$, $S_r \leq L_r$). Let $p$ be
the longest path $\sigma_r \to \sigma_l$ and $q$ the longest path $\sigma_l \to \sigma_r$
(going 'backwards'): $S'_l \geq t_r + |p|$ and $L_r \leq L_l - |q|$. Since the DFG is
acyclic, $|p| = |q|$, so $S'_l \geq t_r + |p|$ and $L_r + |p| \leq L_l$. Therefore
$S'_l > L_l \Leftrightarrow t_r + |p| > L_r + |p|$, or $t_r > L_r$, which is a
contradiction.
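The interval computation can be sketched in Python as a forward (ASAP) and a backward (ALAP) pass over the DFG. This is a minimal illustration under assumptions that are not from the text: time steps are numbered $1 \ldots T$, an operator started in step $k$ with delay $d$ allows its successors to start in step $k + d$, and the function name `asap_alap` and its `fixed` argument (operators already pinned to a start step) are hypothetical.

```python
from graphlib import TopologicalSorter

def asap_alap(delay, edges, T, fixed=None):
    """Return (S, L): earliest and latest start step of every operator.

    delay: operator -> execution time on its fastest FU
    edges: list of (i, l), operator i must finish before operator l starts
    fixed: optional operator -> start step that has already been assigned
    """
    fixed = fixed or {}
    preds = {op: set() for op in delay}
    succs = {op: set() for op in delay}
    for i, l in edges:
        preds[l].add(i)
        succs[i].add(l)
    order = list(TopologicalSorter(preds).static_order())

    S = {}
    for op in order:                      # ASAP: forward pass
        earliest = max((S[p] + delay[p] for p in preds[op]), default=1)
        S[op] = max(earliest, fixed.get(op, 1))
    L = {}
    for op in reversed(order):            # ALAP: backward pass
        latest = min((L[s] for s in succs[op]), default=T + 1) - delay[op]
        L[op] = min(latest, fixed.get(op, latest))
    return S, L

# Made-up DFG: two multiplications (delay 3) feed an addition, then a compare.
delay = {0: 3, 1: 3, 2: 1, 3: 1}
edges = [(0, 2), (1, 2), (2, 3)]
S, L = asap_alap(delay, edges, T=8)
S_fixed, L_fixed = asap_alap(delay, edges, T=8, fixed={0: 3})
```

Pinning operator 0 to step 3, which lies inside its interval, pushes the earliest start of its successors later but never beyond their latest start, in line with the argument above.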
The same applies to the ALAP algorithm, and by running both algorithms in
succession we reduce the time intervals for all other operators $\sigma_l$:
$k_l \in S'_l \ldots L'_l$, with $S_l \leq S'_l$, $S'_l \leq L'_l$, $L'_l \leq L_l$.
Up until now we have assumed that $j_i$ was assigned to the fastest FU. The available
delay is the minimal $L'_l$ time of its successors $\sigma_l$ minus the start time:
$delay_i = \min\{L'_l\} - k_i$. So any $FU_j$ with $d_j \leq delay_i$ can be chosen.
The algorithm for feasibility is as follows:
Initial  Set $\phi'$ empty.
Step 1  Pick an unscheduled operator $\sigma_r$ in $\phi$.
Step 2  Schedule $\sigma_r$ in time step: $\phi'.k_r = \phi.k_r$.
Step 3  Compute $delay_r = \min\{L'_l\} - k_r$.
Step 4  If $d_{\phi.j_r} \leq delay_r$: $\phi'.j_r = \phi.j_r$; else assign $\phi'.j_r = j$ ($j$ is the one with the slowest allowable execution), where $\sigma_r \in FU_j$ and $d_j \leq delay_r$.
Step 5  ASAP (update $S_l \to S'_l$).
Step 6  ALAP (update $L_l \to L'_l$).
Step 7  For all unscheduled operators $\sigma_l$ in $\phi$: if $\phi.k_l < S'_l$ set $\phi.k_l = S'_l$, and if $\phi.k_l > L'_l$ set $\phi.k_l = L'_l$.
Step 8  If any unscheduled operators remain in $\phi$, go to Step 1.

The algorithm works by iteratively scheduling operators one at a time, each
time running ASAP and ALAP to reduce the valid time intervals for the unscheduled
operators, so that a feasible schedule can be obtained. The algorithm is deterministic
and has complexity $O(n^2)$.
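The feasibility algorithm can be sketched in Python as follows. This is one possible reading of Steps 1 to 8, not the implementation used in this work. The assumptions are: operators are picked in index order (optionally starting with a given operator `first`), time steps are numbered $1 \ldots T$, $T$ is large enough for a feasible schedule to exist, the FU test in Step 4 compares the delay $d_j$ of the requested FU with $delay_r$, and ties between equally slow FUs are broken by name. The names `make_feasible`, `cands` and `first` are hypothetical.

```python
from graphlib import TopologicalSorter

def make_feasible(phi, cands, edges, T, first=None):
    """Repair a schedule phi (operator -> (k, j)) into a feasible phi'.

    cands[op] maps each FU type able to execute op to its delay d_j.
    """
    ops = sorted(phi)
    preds = {o: set() for o in ops}
    succs = {o: set() for o in ops}
    for i, l in edges:
        preds[l].add(i)
        succs[i].add(l)
    order = list(TopologicalSorter(preds).static_order())
    k = {o: phi[o][0] for o in ops}   # working start steps (phi.k)
    done = {}                         # phi': operators scheduled so far

    def intervals():
        # Scheduled operators use their chosen FU, all others the fastest.
        d = {o: cands[o][done[o][1]] if o in done else min(cands[o].values())
             for o in ops}
        S, L = {}, {}
        for o in order:               # ASAP, forward pass
            S[o] = max((S[p] + d[p] for p in preds[o]), default=1)
            if o in done:
                S[o] = done[o][0]
        for o in reversed(order):     # ALAP, backward pass
            L[o] = min((L[s] for s in succs[o]), default=T + 1) - d[o]
            if o in done:
                L[o] = done[o][0]
        return S, L

    def clamp(S, L):                  # Step 7
        for o in ops:
            if o not in done:
                k[o] = min(max(k[o], S[o]), L[o])

    S, L = intervals()
    clamp(S, L)
    todo = [o for o in ops if o != first]
    if first is not None:
        todo.insert(0, first)
    for r in todo:                                        # Steps 1 and 8
        slack = min((L[s] for s in succs[r]), default=T + 1) - k[r]  # Step 3
        j = phi[r][1]
        if cands[r][j] > slack:                           # Step 4
            fits = [f for f in cands[r] if cands[r][f] <= slack]
            j = max(fits, key=lambda f: (cands[r][f], f))
        done[r] = (k[r], j)                               # Step 2
        S, L = intervals()                                # Steps 5 and 6
        clamp(S, L)
    return done
```

Because the operator order and all tie-breaks are fixed, calling `make_feasible` twice on the same $\phi$ returns the same $\phi'$, which is the determinism required to regenerate a feasible solution from a stored infeasible one.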
6.3.3 Simulated annealing
The simulated annealing algorithm is a meta-heuristic algorithm for solving ILP
problems which borrows from the physical model of near-adiabatic crystallization, i.e.
the formation of a perfect crystal lattice.
Simulated annealing algorithm:
Initial  Generate an initial feasible solution vector $\to \phi$ and compute its area cost $A$.
Step 1  Perturb $\phi$ by randomly moving an operator in time and changing its FU assignment $\to \phi'$.
Step 2  Generate a feasible solution from the perturbed solution vector: $F(\phi') \to \phi'_{feasible}$.
Step 3  Compute the area cost of $\phi'_{feasible} \to A'$.
Step 4  If the new cost is smaller than that of the existing solution ($A' < A$), accept the new solution $\phi'$; otherwise conditionally accept $\phi'$ if $\exp(-(A' - A)/Temp) > random(1)$ is true.
Step 5  Update the solution space $(\phi', A', Temp') \to (\phi, A, Temp)$ and, while not in thermal equilibrium, go to Step 1.
Step 6  Reduce the temperature exponentially: $Temp' = \alpha Temp$, with $0 < \alpha < 1$.
Step 7  If the temperature $Temp'$ is larger than $Temp_{crystal}$ (the stopping temperature) and $A'$ is larger than $A_{accept}$, go to Step 1.

In the iteration step a random operator $\sigma_i$ is chosen and random (acceptable)
values are inserted for both $k_i$ and $j_i$. Then the schedule is made feasible, starting
with scheduling $\sigma_i$ and then scheduling the rest. In this way we ensure that the
perturbation survives the feasibility process. Then, depending on the cost and the
temperature, we accept this new schedule or not. The fundamental difference between
simulated annealing and local search lies in the ability, at high temperatures, to move
uphill, i.e. to accept solutions which are less optimal (as well as always moving
downhill, i.e. accepting more optimal solutions). This is handled by the accept function
maintaining the Boltzmann distribution from statistical mechanics. Initially the
algorithm is started with a random solution which is made feasible. The thermal
equilibrium condition repeats the inner loop a certain number of times; this number is
determined in the following chapter.
$Temp_{crystal}$ stops the algorithm if the temperature comes down to 1. It can be
shown mathematically that by selecting the correct temperature function specific to
the problem, the simulated annealing algorithm will find the optimal solution.
However, the time spent on finding the optimal solution can be shown to be equal to
or larger than the time to perform an exhaustive search. We set the start temperature
to $10000$, and it can be shown that an adiabatic cool-off in temperature corresponds
to an exponential temperature decay, i.e. the new temperature is generated by
$Temp' = \alpha Temp$ with $0 < \alpha < 1$. We determine the appropriate value for
$\alpha$ in the following chapter.
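The simulated annealing loop described in this section can be sketched in Python as follows. This is a generic illustration, not the implementation used in this work. The perturbation, the feasibility algorithm and the area cost are passed in as the functions `perturb`, `repair` and `cost`, which are hypothetical names. The values `alpha=0.9` and `inner_loops=20` are placeholders, since the text defers both choices to the following chapter. Tracking the best solution seen so far and testing it against $A_{accept}$ is an assumption of this sketch. The toy problem at the end (a vector of integers whose sum plays the role of the area) only serves to make the example runnable.

```python
import math
import random

def simulated_annealing(phi, perturb, repair, cost, *, temp=10000.0,
                        temp_crystal=1.0, alpha=0.9, inner_loops=20,
                        a_accept=float("-inf"), rng=None):
    """Return (feasible best solution, its area cost)."""
    rng = rng if rng is not None else random.Random(0)
    area = cost(repair(phi))
    best, best_area = phi, area
    while temp > temp_crystal and best_area > a_accept:
        for _ in range(inner_loops):          # stand-in for thermal equilibrium
            cand = perturb(phi, rng)          # perturbed, possibly infeasible
            cand_area = cost(repair(cand))    # cost of its feasible version
            delta = cand_area - area
            # Always accept downhill moves; accept uphill moves with
            # probability exp(-delta / temp).
            if delta < 0 or math.exp(-delta / temp) > rng.random():
                phi, area = cand, cand_area   # keep the unrepaired vector
                if area < best_area:
                    best, best_area = phi, area
        temp *= alpha                         # exponential cooling
    return repair(best), best_area

# Toy stand-ins (assumptions): feasible vectors have entries in 0..9.
def repair(v):
    return [min(max(x, 0), 9) for x in v]

def cost(v):
    return sum(v)

def perturb(v, rng):
    w = list(v)
    w[rng.randrange(len(w))] = rng.randint(-3, 12)
    return w

solution, final_area = simulated_annealing(
    [9, 9, 9, 9], perturb, repair, cost, temp=100.0, alpha=0.8)
```

Note that the loop stores the perturbed vector itself and only uses its repaired version for the cost, which is the island-crossing idea of Figure 6.9.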
6.3.4 Evolutionary algorithm
The evolutionary algorithm approach is a meta-heuristic algorithm for solving ILP
problems which is biologically inspired and implements the concept of survival of
the fittest.
Evolutionary algorithm:
Initial  Generate an initial set of feasible solution vectors $\to \Phi = \{\phi\}$, the population, compute their respective area costs $\mathbf{A} = \{A\}$ and set the generation count to zero, $G = 0$.
Step 1  Extract the half of the population $\Phi$ with the lowest area cost $\to \Phi_{1/2}$ (the other half is discarded) and set $\Phi' = \emptyset$.
Step 2  Select two elements from $\Phi_{1/2} \to \{\phi_a, \phi_b\}$, the parent solution vectors, and remove these elements from the set: $\Phi_{1/2} \setminus \{\phi_a, \phi_b\} \to \Phi'_{1/2}$.
Step 3  Select a random crossover position and form two new solution vectors $\{\phi_a, \phi_b\} \to \{\psi, \varphi\}$, the child solution vectors.
Step 4  Mutate $\{\psi, \varphi\}$ by randomly moving an operator in time and changing its FU assignment $\to \{\psi', \varphi'\}$, using a low probability $\chi$ for mutating the solution vectors.
Step 5  Add the parent and the child solution vectors to the new population: $\Phi' + \{\phi_a, \phi_b, \psi', \varphi'\} \to \Phi''$.
Step 6  Update the solution sets ($\Phi'_{1/2}$ [...]
Step 7  Generate feasible solutions from the perturbed solution vectors in $\Phi'$: $F(\Phi'_{perturbed}) \to \Phi'_{feasible}$.
Step 8  Compute the area cost of $\Phi'_{feasible} \to A'_{feasible}$.
Step 9  Increment the generation count $G$ and update the solution space $(\Phi', A') \to (\Phi, A)$.
Step 10  If the best solution $A_{best}$ is larger than $A_{accept}$ and the generation $G$ is less than $G_{stop}$, go to Step 1.

The algorithm works by first deleting the most unfit half of the population. Then
for a pair of survivors we select a random cross point and perform the crossover,
thereby producing two new children. Then, with a low probability, we add a
mutation to the children. The children are then made feasible (in the same way as for
simulated annealing), their cost functions are evaluated and they are put into
the new population. The fundamental difference between local search/simulated
annealing and the evolutionary algorithm is the use of a population of solutions in
the latter. The deletion of the most unfit half in principle works as the downhill
moving part, and the crossover and mutation as the potential downhill/uphill
moving part. Initially the algorithm is started with a set of random solutions, made
feasible and evaluated. The mutation rate is included in evolutionary algorithms
to prevent the entire population from converging to a single collection of similar
solutions. The mutation rate should not be the principal solution space exploration
method of the algorithm and should be very low; we chose $\chi = 0.01$. The generation
count terminates the main loop if more than $G_{stop}$ generations have passed. In
the following chapter we determine both the population size and the $G_{stop}$
parameter.

Module   Oprs        Area       Time-slots   E/time-slot [nJ]
add      {+}          2032.75    1           0.0266
sub      {−}          2032.75    1           0.0266
comp     {>}          2032.75    1           0.0266
ALU      {+, −, >}    2965.00    1           0.0266
mul1     {∗}         41978.50    3           0.1046
mul2     {∗}         28414.50    6           0.0523
mul3     {∗}         14638.75   17           0.0319
input    i              43.00    1           0.0
output   o              43.00    1           0.0
Table 6.2: 16 bit functional unit library based on balsa-cost numbers, available to
the synthesis algorithm.
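The evolutionary algorithm of Section 6.3.4 can be sketched in Python as follows. This is a generic illustration under assumptions, not the implementation used in this work: solution vectors are plain lists, the feasibility algorithm, the area cost and the mutation are passed in as the hypothetical functions `repair`, `cost` and `mutate`, survivors are paired at random, and the population size is assumed to be a multiple of four so that it stays constant from one generation to the next. The default `g_stop=50` is a placeholder, since the text determines this parameter in the following chapter. The toy problem at the end only makes the example runnable.

```python
import random

def evolve(population, repair, cost, mutate, *, chi=0.01, g_stop=50,
           a_accept=float("-inf"), rng=None):
    """Return (feasible best solution, its area cost)."""
    rng = rng if rng is not None else random.Random(0)

    def fitness(phi):
        # Cost is always evaluated on the feasible version of a vector.
        return cost(repair(phi))

    population = sorted(population, key=fitness)
    generation = 0
    while fitness(population[0]) > a_accept and generation < g_stop:
        # Keep the half with the lowest area cost, drop the unfit half.
        survivors = population[:len(population) // 2]
        rng.shuffle(survivors)
        new_population = []
        for a, b in zip(survivors[0::2], survivors[1::2]):   # parent pairs
            cut = rng.randrange(1, len(a))                   # crossover point
            children = [a[:cut] + b[cut:], b[:cut] + a[cut:]]
            children = [mutate(c, rng) if rng.random() < chi else c
                        for c in children]
            new_population += [a, b] + children
        population = sorted(new_population, key=fitness)
        generation += 1
    return repair(population[0]), fitness(population[0])

# Toy stand-ins (assumptions): feasible vectors have entries in 0..9.
def repair(v):
    return [min(max(x, 0), 9) for x in v]

def cost(v):
    return sum(v)

def mutate(v, rng):
    w = list(v)
    w[rng.randrange(len(w))] = rng.randint(0, 9)
    return w

seed_rng = random.Random(1)
start = [[seed_rng.randint(0, 9) for _ in range(4)] for _ in range(8)]
best, best_area = evolve(start, repair, cost, mutate, g_stop=30)
```

Because the parents are carried over into the new population, the best area cost in this sketch can never get worse from one generation to the next.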
Figure 6.10: (Left) Partition of our CDFG into DFG fragments. (Right) The
corresponding task graph to the partition of the CDFG.