Implementation and Performance Evaluation
LUC MOREAU l.moreau@ecs.soton.ac.uk
Department of Electronics and Computer Science, University of Southampton, Southampton
SO17 1BJ, United Kingdom.
Received June 15, 2000; Revised February 22nd, 2001
Editor: Takayasu Ito
Abstract. We have recently defined a new algorithm for distributed garbage collection based on
reference-counting [20, 24]. At the heart of the algorithm, we find tree rerooting, a mechanism
able to reduce third-party dependencies by reorganising diffusion trees. In reality, the algorithm
describes a spectrum of algorithms according to the policy used to manage messages. In this
paper, we present the implementation of the algorithm and evaluate its performance. We have
implemented two policies, which are extremes of the spectrum, respectively using and not using
tree rerooting. In addition, two different strategies for managing action queues have been
implemented. The conclusions of our experimentations are the following. Tree rerooting offers more
parallelism during distributed GC activity; we explain this phenomenon by the length reduction of
causality chains in the distributed GC. Grouping messages per destination dramatically reduces
the number of messages, but requires a more complex implementation as messages have to be
sorted per destination. Speedup of 100% has been observed on some benchmarks.
Keywords: distributed garbage collection, distributed reference counting, performance
evaluation, benchmark
1. Introduction
Over the last decade, distributed symbolic computing has found useful applications
outside research laboratories. Environments for developing distributed applications
are now shipped by major software suppliers, and are used to produce advanced
applications involving complex interactions between multiple clients and servers.
Java, which plays a dynamic role in this context, is bundled with the RMI
communication layer [12], able to activate methods on remote objects.
In particular, RMI provides a distributed garbage collector (also written
distributed GC or DGC) that turns out to be a very valuable technology as it
automatically maintains pointer consistency: it ensures that an object will not be
reclaimed as long as it is referred to locally or remotely.
The author has recently published an algorithm for distributed garbage collection
[20]. This algorithm based on distributed reference counting was developed and
prototyped as part of NeXeme [23], a distributed implementation of Scheme, based
on the message-passing library Nexus [9]. This algorithm has been studied in
detail, and mechanical proofs of safety and liveness have been carried out using
variants and optimisations.
The algorithm was implemented in C and uses the Nexus library [9] for
communications. Local garbage collection is handled by the Boehm-Demers-Weiser collector
[10]. The implementation is about 5000 lines of code, plus an extra 2000 for
instrumentation. Plans to implement the algorithm in Java are under way; combined
with the NexusRMI stub compiler [4], it would give access to a multi-language
garbage-collected distributed environment.
This paper is organised as follows. First, we state our assumptions regarding the
distributed computing model and we define some terminology in Section 2. We
analyse current distributed reference counting algorithms in Section 3, and explain
the design rationale of our distributed GC algorithm. Then, we summarise its
principles in Section 4. The overall implementation design is presented in Section
5. Some benchmark programs are described and algorithm performance is evaluated
in Section 6. Finally, a comparison with related work and a summary conclude the
paper.
2. Models and Terminology
In this section, we introduce the models of distribution and communication that we
have adopted as well as some useful terminology. This paper studies distributed
reference counting on a network of processes that can communicate only by exchange
of messages. A computation executes on a set of processes and consists of a set of
threads. An individual thread executes a sequential program, which may read and
write data shared with other threads executing in the same process. In this context,
a process denotes a memory space holding a number of objects. Processes may be
distributed across several physical machines, but also a single machine can host
several processes; in both cases, the only means by which processes communicate
is by exchange of messages. We shall assume that communications are reliable and
ordered between any pair of processes.
In the description of our garbage collector implementation, an address denotes
the location of an object in a process; such an address is only meaningful within
its process. The need for distributed garbage collection comes from the usage of a
notion of remote reference by which a process can refer to an object in a possibly
remote process. As opposed to an address, a remote reference can be communicated
to other processes. We define the owner of an object as the process where the object
is located; by extension, the owner of a reference denotes the owner of the object
the reference points at.
2.1. The Communication Model
In this Section, we present the characteristics of the communication model that
is expected by our implementation. Even though they are heavily influenced by
facets of a distributed object system. A host pointer is a network representative of
an object, able to refer to an object in a possibly remote process; alternatively, it
may be called global pointer [9] or network pointer [3]. We take a view of message
passing such that the arrival of a message initiates a computation to handle the
message [9]. Therefore, remote method invocation as in Network Objects [3] or
Java RMI [12] is implemented by two messages, to activate the computation and to
transport the result, respectively.
The actual interface to the communication layer is library dependent. Generally,
a communication is specified by providing a host pointer, a handler identifier, and
a data buffer, in which data are serialised. Initiating the communication causes
the data buffer to be transferred to the process designated by the host pointer,
after which the routine specified by the handler is executed, potentially in a new
thread of control. Both the data buffer and the object referred to by the host
pointer are made available to the handler. Host pointers must be a predefined
data type supported by the communication library; they must be serialisable and
transportable to remote processes.
2.2. Garbage Collection and Reachability
In each process, objects are allocated in a heap managed by a local garbage collector.
The purpose of a local garbage collector is to collect objects that are no longer
locally reachable. Local reachability is defined in terms of a set of roots. The roots are
locations holding addresses of heap objects; generally, roots are processor registers,
program stacks and global variables. An object in the heap is locally reachable if its
address is held in a root or if its address is held in another locally reachable heap
object [14].
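As a hedged illustration of this definition, the following Python sketch computes the set of locally reachable objects by traversing the heap from the roots. The heap encoding (a dictionary mapping each address to the addresses held by that object) and the name `locally_reachable` are assumptions made for this example only; they are not taken from the implementation described in this paper.

```python
def locally_reachable(roots, heap):
    """Return the set of addresses reachable from the roots.

    roots: iterable of addresses held in roots (hypothetical encoding).
    heap:  dict mapping an address to the list of addresses stored
           in the object located at that address.
    """
    reachable = set()
    # Start from addresses held in roots that denote heap objects.
    stack = [addr for addr in roots if addr in heap]
    while stack:
        addr = stack.pop()
        if addr in reachable:
            continue
        reachable.add(addr)
        # An object is reachable if its address is held in a reachable object.
        stack.extend(a for a in heap[addr] if a in heap)
    return reachable


# Object 3 refers to object 1, but nothing refers to object 3.
heap = {1: [2], 2: [], 3: [1]}
live = locally_reachable([1], heap)
garbage = set(heap) - live
```

In this example, objects 1 and 2 are locally reachable, whereas object 3 may be collected.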
The communication library and its host pointers offer the means to refer to objects
in remote processes. In this context, the purpose of a distributed garbage collector
is to collect objects that are no longer globally reachable. An object o is globally
reachable if one of the following conditions holds:
1. if o is locally reachable in a process taking part in the computation,
2. if there is a globally reachable object that contains the address of o,
3. if there is a globally reachable object that contains a host pointer referring to
o.
(Note that a more detailed description of how a host pointer refers to an object will
be given later in the paper.)
Following the custom in DGC literature, an application is regarded as composed
of two separate activities: the mutator performs the actions specified by the
Distributed reference counting and its derivatives are by and large the most
commonly found techniques used to implement distributed garbage collection. Even
though such techniques are not able to reclaim distributed cycles, they are
frequently adopted because they are easy to interface with uniprocessor garbage
collectors, which only need to provide finalizers (that cause reference decrement
messages to be sent) and a notion of a weak pointer (so that DGC tables do not cause
objects to be preserved incorrectly by the collector) and do not need to support
special forms of tracing.
Reference counting was initially conceived in the context of uniprocessor
applications [5]. In a reference counting system, each object is associated with a reference
counter. An object's counter is incremented every time a new reference to the
object is created, and decremented every time such a reference is destroyed. A
reference counter equal to zero indicates that there is no reference to the object
and therefore the space occupied by the object may be reclaimed safely.
The reference counting technique may naively be extended to a distributed setting
by introducing two messages increment and decrement, but unfortunately this
results in an incorrect algorithm. The essence of the problem is summarised in Figure
1, where a reference ptr to object o of process P1 is passed from process P2 to
process P3, immediately followed by process P3 deleting its reference to ptr. A naive
extension of reference counting would send an increment message to increment the
counter on P1 when ptr is sent to P3 and a decrement message to decrement the
counter on P1 when P3 deletes its reference. A race condition between increment
and decrement messages may cause the counter on P1 to be decremented before it
is incremented, possibly making it null temporarily; a null counter would make the
object ready for collection, even though there still is a live reference ptr on P2.
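The race can be made concrete with a small Python sketch, given here as an illustration under stated assumptions rather than as part of the implementation. The function `deliver` is hypothetical: it applies increment and decrement messages to the owner's counter in their order of arrival and records the lowest value observed. The initial counter value of 1 is an assumption standing for the reference held by P2.

```python
def deliver(messages, initial_count=1):
    """Apply messages to the owner's counter in arrival order.

    Returns (final_count, lowest_count_observed). A lowest count of 0
    means the object could have been reclaimed while still referenced.
    """
    count = initial_count
    lowest = count
    for message in messages:
        if message == "increment":
            count += 1
        elif message == "decrement":
            count -= 1
        else:
            raise ValueError("unknown message: " + message)
        lowest = min(lowest, count)
    return count, lowest


# Messages arrive in the order in which they were caused: no problem.
in_order = deliver(["increment", "decrement"])
# The decrement sent by P3 overtakes the increment: the counter hits zero.
raced = deliver(["decrement", "increment"])
```

Both orders end with the same counter value, but in the second one the counter is temporarily null, which is exactly the situation the naive extension cannot rule out.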
[Figure 1: processes P1, P2 and P3, object o and reference ptr; labelled actions: (1) method invocation, (2) increment(ptr), (3) reference ptr, (4) decrement(ptr).]
extension, which essentially fall into three categories described below.
Triangular Protocols. The scenario of Figure 1 involves three different processes,
namely the owner, the sender and the receiver of a reference. Lermen and Maurer
[15, 31] were the first to introduce a protocol between these processes in order to
make the copy of a reference safe. Birrell et al. [3] also proposed another algorithm,
which was adopted in Network Objects [3] and in Java RMI [12]. We summarise
these algorithms below.
Lermen and Maurer [15, 31] introduce two new messages create and acknowledge.
When a reference is duplicated, a create message is sent to its owner, which increases
the owner's reference counter and is followed by an acknowledge message to the
reference receiver. When a reference is deleted, a decrement message is sent to the
owner; the receiver ensures that no race condition can take place between a create
message and a decrement message by sending a decrement message only after an
acknowledgement for this reference has been received.
In this protocol, the owner is involved every time the emitter sends a reference to
a receiver. Lermen and Maurer's schema requires the receiver to maintain a count
of both the number of copies made and the number of acknowledgements received;
decrement messages can only be sent when both are equal.
Birrell et al. also present a triangular algorithm as part of Network Objects, a
distributed object language with remote method invocation [3]. Similarly to Lermen
and Maurer, their protocol ensures that reference counters are not decremented
prematurely. For this purpose, synchronisations are introduced between garbage
collection and mutator activities.
The owner of an object maintains a "dirty" set, which contains identifiers for
all the processes that have a reference to the object. When a client first receives a
reference, it makes a dirty call to the owner. When the reference is no longer locally
reachable, as determined by the client's local GC, the client makes a clean call and
deletes the reference. In order to avoid conflicts between dirty and clean calls, the
emitter keeps the reference reachable until it receives an acknowledgement from
the receiver. The acknowledgement may be returned as part of the result when the
reference is passed as an argument to the remote method call; otherwise, an extra
message may be required when the reference is returned as the result of a remote
method call.
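The bookkeeping performed by the owner in this scheme can be sketched as follows. This is a minimal Python illustration of the dirty set only: the class `DirtySetOwner` and its method names are hypothetical, and the acknowledgement handshake between emitter and receiver is deliberately left out.

```python
class DirtySetOwner:
    """Owner-side view of one object: the set of clients holding a reference."""

    def __init__(self):
        self.dirty = set()

    def dirty_call(self, client):
        # Made by a client when it first receives the reference.
        self.dirty.add(client)

    def clean_call(self, client):
        # Made by a client once its local GC finds the reference unreachable.
        self.dirty.discard(client)

    def remotely_referenced(self):
        return bool(self.dirty)


owner = DirtySetOwner()
owner.dirty_call("P2")
owner.dirty_call("P3")
owner.clean_call("P2")
still_referenced = owner.remotely_referenced()  # P3 still holds a reference
```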
In Birrell's algorithm, distributed reference counting activity is synchronous with
the application. In particular, unmarshalling may be suspended by dirty calls.
Furthermore, the emitter of a reference is only allowed to free its reference after the
method invocation has terminated on the receiver: this may potentially maintain
the existence of a pointer in the emitter for garbage collection purpose only, even
though its presence is not required by the mutator.
Other triangular protocols have been proposed. The distributed variant of the
Train GC [11] relies on a reference-counting style pointer-tracking mechanism (on
tolerant version of distributed reference counting.
Weighted Reference Counting. Weighted reference counting (WRC) [2, 34, 6]
avoids race conditions between messages by sending decrement messages only. WRC
associates a weight with each object and with each reference. When an object
and its first reference are created, they are given the same weight. The algorithm
maintains the invariant that the weight of an object is equal to the sum of weights
of references pointing at it. Whenever a reference is deleted, a message is sent to
the owner with the reference weight, whose effect is to decrement the object weight
by the reference weight. When a reference is copied, its weight is (equally) divided
between the two copies without communication with the owner.
Different solutions may be adopted when the weight of a reference reaches one
and cannot be divided further. (i) A new object containing a reference to the
initial object may be created; such a new object, usually called indirection cell,
and its associated reference are given equal weights. The new reference is then
propagated according to the same algorithm. (ii) A message may be sent to the
owner in order to request more weight.
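The weight invariant can be illustrated by a short Python sketch. All names (`WeightedObject`, `WeightedRef`, `create`, `copy_ref`, `delete_ref`) and the initial weight of 8 are assumptions chosen for the example; the decrement message is modelled by a direct update of the object weight, and the two solutions for a weight of one are not implemented, only signalled by an exception.

```python
class WeightedObject:
    def __init__(self, weight):
        self.weight = weight


class WeightedRef:
    def __init__(self, target, weight):
        self.target = target
        self.weight = weight


def create(weight=8):
    """Create an object and its first reference with the same weight."""
    obj = WeightedObject(weight)
    return obj, WeightedRef(obj, weight)


def copy_ref(ref):
    """Split the weight between the two copies, without contacting the owner."""
    if ref.weight <= 1:
        # An indirection cell or a request for more weight would be needed here.
        raise ValueError("weight can no longer be divided")
    half = ref.weight // 2
    ref.weight -= half
    return WeightedRef(ref.target, half)


def delete_ref(ref):
    """Model the decrement message carrying the weight of the deleted reference."""
    ref.target.weight -= ref.weight
    ref.weight = 0


obj, r1 = create()
r2 = copy_ref(r1)   # weights 4 and 4, object weight still 8
delete_ref(r2)      # object weight drops to 4, the weight of r1
```

After every operation, the object weight equals the sum of the weights of the remaining references, and it reaches zero only when the last reference has been deleted.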
Indirect Reference Counting. Piquer's indirect reference counting (IRC) [27]
does not use increment messages, which resulted in race conditions with decrement
messages in the naive extension. Instead, his algorithm relies on a representation
of the path along which references are propagated in a distributed application.
While, in general, the propagation path of references can form a cyclic structure, a
spanning tree, called the diffusion tree, may be obtained by remembering the path
along which a reference is propagated to a process for the first time.
Indirect reference counting requires each process in the diffusion tree to maintain a
counter for each reference it holds. Every time a process sends a reference remotely,
it increases the counter associated with this reference. The diffusion tree is built and
maintained at runtime: each process that has received a reference for the first time
just needs to remember which process the reference was sent from; by definition,
the receiving process is said to be the sender's child in the diffusion tree. When
a reference is deleted by a process, indirect reference counting sends a decrement
message to the parent of this process in the diffusion tree. If a process sends a
reference to another process that already holds a copy of the reference, the sender
increases its reference counter, assuming that the receiver will join the diffusion
tree as its child; as the receiver already holds a copy, a decrement message needs
to be sent back to the message emitter. In a stable situation (when all messages
have been propagated), the sum of all counters for a given reference indicates the
number of remote copies of the reference.
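The following Python sketch illustrates the diffusion tree and its counters for a single reference. It is a simplified model written for this explanation: the class `IrcProcess` and the functions `send` and `drop` are hypothetical names, messages are delivered instantly instead of travelling over a network, and a process is assumed to release its copy only when its mutator no longer uses it and its counter is zero.

```python
class IrcProcess:
    def __init__(self, name, owner=False):
        self.name = name
        self.parent = None      # parent in the diffusion tree
        self.counter = 0        # number of children in the diffusion tree
        self.holds = owner      # a copy of the reference is present
        self.in_use = owner     # the mutator still uses the reference


def send(sender, receiver):
    """The sender passes the reference to the receiver."""
    sender.counter += 1
    if receiver.holds:
        # The receiver already has a copy: it sends a decrement back.
        sender.counter -= 1
    else:
        receiver.holds = True
        receiver.in_use = True
        receiver.parent = sender


def drop(process):
    """The mutator of the process stops using the reference."""
    process.in_use = False
    _try_release(process)


def _try_release(process):
    if (process.holds and not process.in_use
            and process.counter == 0 and process.parent is not None):
        process.holds = False
        parent, process.parent = process.parent, None
        parent.counter -= 1     # decrement message to the parent
        _try_release(parent)


p1, p2, p3 = IrcProcess("P1", owner=True), IrcProcess("P2"), IrcProcess("P3")
send(p1, p2)
send(p2, p3)
drop(p2)   # P2 must keep its copy because P3 is still its child
```

In this run, P2 keeps its copy after its mutator has dropped it, solely because P3 remains its child in the diffusion tree.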
Indirect reference counting introduces third party dependencies, where a reference
is kept in a process for the sole purpose of IRC because its children in the
diffusion tree still contain active references. Such references, which Piquer calls zombie
pointers, are harmful because they prevent the completeness of the garbage
nisations they force with the mutator; (ii) the third party dependencies they
introduce; (iii) the number of messages they generate; (iv) the type of network
connectivity they require; (v) their ability to support mobile objects.
Today, Birrell's algorithm is the most widely used, as it is part of Java RMI [12]
and Network Objects [3]. As summarised above, this algorithm however introduces
synchronisations between the mutator and the garbage collector, which may slow
down the computation.
Birrell's and Piquer's algorithms introduce third party dependencies which
prevent the sender of a reference from freeing it. In Birrell's case, dependencies take
place for the duration of a remote method invocation. In Piquer's, they last as
long as references are reachable in the receiver (which can be much longer than in
Birrell's case). Depending on the solution adopted when pointer weights become
one, weighted reference counting introduces third party dependencies or
synchronisations. An indirection cell introduces an unwanted form of dependency;
alternatively, contacting the owner to obtain more weight introduces an undesired delay
to the mutator. The frequency at which these events occur is however much lower
than in Birrell's and Piquer's because they only happen when pointer weights can
no longer be divided.
If we analyse the number of generated messages, all algorithms require only one
message when a reference is deleted. WRC and IRC require no extra message when
a reference is communicated, though WRC may require a communication with the
owner when the weight becomes one. Triangular protocols require up to two
additional messages; Lermen and Maurer's solution systematically requires two
messages because the sender is indifferent to whether the receiver already holds a
reference.
Another interesting aspect for comparison is how processes are required to be
connected in order to propagate DGC messages. IRC is unique because DGC
messages follow the same path as mutator's messages. Other algorithms require direct
connectivity with the owner: this may not necessarily be a valid assumption in the
presence of firewalls, which make some part of the network invisible [1]. Finally,
except for IRC, none of these algorithms support mobile objects.
Shapiro, Dickman and Plainfosse [30, 29] analyse the problem of third party
dependency and are the first to propose a mechanism to collapse chains of pointers.
Details of their algorithm are presented in the related work section. At this point,
it is important to note that their technique supports mobile objects and partial
connectivity: it is able to adapt to the network configuration by preventing the
short-cutting of pointers if and when required [1]; on the downside, DGC activity is
synchronised with the mutator activity.
Our goals are similar to Shapiro, Dickman and Plainfosse's, because we also
want to support mobile objects and to collapse the distributed state, whenever
possible according to connectivity (or even according to locality [21]). However,
we adopted a modular approach: first, we investigated the collapse of state for
addition, our solution does not introduce any synchronisation with the mutator,
and keeps the number of messages to be exchanged low, by requiring a maximum
of two messages, only when a reference is received for the first time. Since the first
publication of our algorithm, Dickman [7] has independently discovered a variant
of our rerooting technique; we delay a deeper comparison with his approach until
the related work section.
In this paper, we focus only on the distributed garbage collection of non-mobile
objects, and the mechanism we have introduced to avoid third party dependencies:
tree rerooting. In particular, we investigate the performance implications of this
technique. Our experimentation may be summarised as follows: tree rerooting
introduces more parallelism by reducing the length of causality chains; grouping
DGC messages dramatically decreases DGC traffic. In the following section, we
describe our algorithm.
4. The Algorithm
The essence of our algorithm can be summarised as follows. When a process P
receives a reference for the first time, indirect reference counting dictates that the
process has to join the reference's diffusion tree as the child of the sender process.
Tree rerooting is the sequence of operations by which a process P can become a
direct child of the owner process (if not already so). Tree rerooting requires a
sequence of two messages: the first one is a request from P to the owner to change
parent, and the second one is a message from the owner to the previous parent
informing it of the transfer.
In the rest of the section, we present the key features of the algorithm. Our
algorithm has been presented, formalised and proved correct in other papers [20,
24]. The complete formal specification of the algorithm in Coq is available from:
http://www.ecs.soton.ac.uk/lavm/coq/dr
At an abstract level, we deal with a notion of remote reference, which we call DGC
pointer. A reference counter is associated with each DGC pointer. A DGC pointer
ptr is conventionally represented by two boxes. The first box is used only in the
owner, where it contains the address of the object ptr refers to; it is empty in the
other processes. The second box contains a reference counter (initially zero).
We also assume that each participating process maintains a table, called a
receive-table, that contains the set of DGC pointers received by the process and currently
regarded as reachable by the local GC; to this end, the receive-table is not
considered as a root for the local GC. (This aspect will be discussed further in the
implementation section.) In addition to regular messages sent by the mutator,
process is the owner of an object o and each process also contains a pointer ptr
referring to o. Each instance of the pointer is associated with a reference counter;
receive tables are represented as boxes annotated by the letter "R" in the right-hand
side of each process.
Reference counters, receive tables and DGC messages are used in the following
rules that regulate the sending of pointers (S), the receiving of pointers (R1, R2,
R3), the handling of DGC messages (DEC, INC), and local garbage collection
(GC).
In the text, we conventionally use numbers in brackets to refer to actions in
figures; they identify successive messages and their respective effect on the local
state of a process.
[Figure: processes P1, P2 and P3, each with a receive table R and an instance of ptr; object o; actions (1) to (5), where (2) and (4) are messages passing ptr.]
Figure 2. Pointer Diffusion
S: Every time a pointer is sent to a remote process, its associated counter is
incremented by one. In Figure 2, ptr is sent from P1 to P2 (2) and then to P3 (4); its
reference counter is incremented on P1 (1) and P2 (3). In a given process, the
reference counter associated with a pointer represents the number of branches
leaving that process in the pointer's diffusion tree. In P3, the counter for ptr is
zero because the pointer has not been passed to other processes.
R1: If a process receives a pointer sent by its owner, and if the pointer does not
belong to the receive table, then the pointer is entered in the receive table. The
receive table on P2 points to ptr (cf. action (3)); when a pointer is received for
the first time, the initial value of its reference counter is set to 0.
R2: If a process receives a pointer sent from a process other than its owner and
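[Figure: processes P1, P2 and P3, each with a receive table R; object o; actions (6) inc_dec(ptr, P2), (7), (8) dec(ptr), (9); counter values 0 and 2 are shown for ptr.]
Figure 3. Tree Reorganisation: Rerooting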
containing a reference to the emitter P2. Figure 2 shows that P3 receives ptr
from P2 (5); in Figure 3, action (6) is such an inc_dec message.
INC: When a process receives an inc_dec message referring to a pointer and a
third process, it increments the counter associated with the pointer and it posts
a dec message to the third process. In Figure 3, after the inc_dec message
is received, the reference counter of ptr is incremented (7) by P1, and a dec
message is sent to P2 (8).
DEC: When a process receives a dec message pertaining to a pointer, it
decrements the counter associated with the pointer. After receiving the dec message,
the reference counter of ptr is decremented by P2 (9), giving the value 0.
GC: The purpose of the local GC is to detect data structures that are no longer
reachable from the set of local roots. As DGC pointers are regarded as regular
data structures by the local GC, the local GC can detect when a DGC pointer is
no longer locally reachable. When such an event occurs, the pointer is removed
from the receive-table, and a decrement message dec is sent to the pointer's
owner.
R3: If a process receives a pointer sent by any emitter and if the pointer is already
present in its receive table, then a dec message is sent back to the emitter.
The key idea of the algorithm is the inc_dec message followed by the dec
message, whose effect is to propagate reference counter values safely from internal
processes of the diffusion tree to the owner (which is the root of the diffusion tree
for the current pointer). As this operation grafts a subtree onto the root of the
of ptr. In Figure 3, after the tree rerooting operation has taken place, the sum of
all the counters is found on the owner. In other words, the effect of the inc_dec
message is to flatten the diffusion tree.
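The rules can be exercised on the scenario of Figures 2 and 3 with the following Python sketch, which is only a toy model written for this explanation. The class `Process`, the function `run` and the message encoding are hypothetical; a single pointer is tracked, each process has one FIFO inbox (which preserves order between any pair of processes), and the owner is assumed to treat its own pointer as always present in its receive table.

```python
from collections import deque


class Process:
    def __init__(self, name, owner=None):
        self.name = name
        self.owner = owner if owner is not None else self
        self.counter = 0                 # reference counter for ptr
        self.in_table = owner is None    # assumption: always true on the owner
        self.inbox = deque()

    def send_ptr(self, dest):            # rule S
        self.counter += 1
        dest.inbox.append(("ptr", self))

    def local_gc(self):                  # rule GC
        self.in_table = False
        self.owner.inbox.append(("dec", None))

    def step(self):
        kind, other = self.inbox.popleft()
        if kind == "ptr" and self.in_table:          # rule R3
            other.inbox.append(("dec", None))
        elif kind == "ptr":
            self.in_table = True                     # rules R1 and R2
            if other is not self.owner:              # rule R2: start rerooting
                self.owner.inbox.append(("inc_dec", other))
        elif kind == "inc_dec":                      # rule INC
            self.counter += 1
            other.inbox.append(("dec", None))
        elif kind == "dec":                          # rule DEC
            self.counter -= 1


def run(processes):
    """Deliver pending messages until every inbox is empty."""
    progress = True
    while progress:
        progress = False
        for process in processes:
            if process.inbox:
                process.step()
                progress = True


p1 = Process("P1")
p2, p3 = Process("P2", owner=p1), Process("P3", owner=p1)
everyone = [p1, p2, p3]
p1.send_ptr(p2); run(everyone)
p2.send_ptr(p3); run(everyone)
counters_after_rerooting = (p1.counter, p2.counter, p3.counter)
```

Once the inc_dec and dec messages have been handled, the counter of P2 is back to 0 and the owner P1 counts both remote copies, as in Figure 3.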
All participating processes maintain another table called send-table, not
represented in the figures. When a reference counter is increased, the pointer is entered
in the process send-table, if it was not present. When a reference counter is
decreased and reaches 0, the pointer is removed from the send-table. In order to
ensure safety and liveness, send-tables as opposed to receive-tables are defined as
roots of local garbage collectors. The role of the send-table is the following. On
processes other than the owner, pointers held by the send-table are kept reachable
while messages containing copies of these pointers are in transit and until rerooting
has completed. On the owner, the send-table is key to the safety of the algorithm:
pointers on the owner are kept reachable while a copy is in transit or reachable
anywhere in the network [24]. Consequently, the data associated with the pointer
will not be reclaimed by the owner's local GC, because the send-table makes the
pointer, and therefore the data, reachable from the local GC roots.
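The send-table discipline described above amounts to keeping exactly the pointers whose counter is non-null. A minimal Python sketch, with the hypothetical class `SendTable` and pointers identified by plain strings, could look as follows; how the table is registered as a root with the local collector is outside the scope of the sketch.

```python
class SendTable:
    """Pointers with a non-null reference counter, treated as local GC roots."""

    def __init__(self):
        self.counters = {}

    def increase(self, ptr):
        # Entering the pointer on its first increase keeps it reachable.
        self.counters[ptr] = self.counters.get(ptr, 0) + 1

    def decrease(self, ptr):
        count = self.counters[ptr] - 1
        if count == 0:
            # No copy depends on this process any more: stop rooting it.
            del self.counters[ptr]
        else:
            self.counters[ptr] = count

    def roots(self):
        return set(self.counters)


table = SendTable()
table.increase("ptr")
table.increase("ptr")
table.decrease("ptr")
rooted = table.roots()   # "ptr" is still rooted: one copy remains
```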
Our algorithm avoids the race condition between increment and decrement
messages of naive reference counting by always incrementing the owner's reference
counter, with an inc_dec message, before decrementing it on another process with
a dec message. We should note that the algorithm requires in-order message
delivery between any pair of processes, so as to avoid a dec message overtaking an
inc_dec message; indeed, not doing so may result in the undesirable decrementing
of a counter before its incrementing.
A Spectrum of Algorithms. An important aspect of our algorithm is that the
rerooting operation is not synchronous with the mutator's activity. Rerooting
requires two messages, but this takes place for the first arrival of a reference at a
process only. In networks where full connectivity to the owner is not available,
rerooting does not have to be performed. In addition, the algorithm may be
optimised in two ways. First, all DGC messages may be delayed and grouped: by
grouping DGC messages, one can significantly reduce the amount of DGC-specific
traffic. Second, R2 immediately followed by GC may be optimised by a single
dec message to the process that sent the pointer: this optimisation only requires
a single message instead of two, and it involves two processes instead of three.
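The first optimisation, grouping delayed DGC messages per destination, can be sketched in a few lines of Python. The function `group_by_destination` and the representation of a pending message as a (destination, message) pair are assumptions made for illustration; the action queues of the actual implementation are not shown here.

```python
from collections import defaultdict


def group_by_destination(pending):
    """Turn a list of (destination, message) pairs into one batch per destination.

    The relative order of messages sent to the same destination is kept,
    which matters because the algorithm relies on in-order delivery.
    """
    batches = defaultdict(list)
    for destination, message in pending:
        batches[destination].append(message)
    return dict(batches)


pending = [
    ("P1", ("inc_dec", "ptr_a", "P2")),
    ("P2", ("dec", "ptr_b")),
    ("P1", ("dec", "ptr_c")),
]
batches = group_by_destination(pending)
# Three delayed messages are sent as two network messages.
```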
Our algorithm's ability to use different strategies adapted to the network
connectivity and message traffic is unique. In fact, the algorithm and its optimisations
characterise a whole spectrum of reference counting algorithms. At one end of the
spectrum, eager activation of rerooting with rule R2 tends to flatten the
diffusion tree. At the other end of the spectrum, the activation of rerooting may be
delayed as much as possible: in this case, inc_dec messages are never sent, and
the algorithm behaves as Piquer's indirect reference counting algorithm [28]. In
our experiments, we shall see that eager rerooting generates more messages than
indirect reference counting. However, the increase of messages is compensated by
Our implementation of the algorithm relies on two libraries for communications
and local garbage collection: we use the Nexus message-passing library for
communications [9] and the Boehm-Demers-Weiser local garbage collector [10]. The
Boehm-Demers-Weiser garbage collector was used because it is conservative and
multithreaded and therefore can easily coexist with the Nexus C library; it is also
the garbage collector used by Bigloo at the heart of our distributed Scheme,
NeXeme [23]. A major concern was to design a generic implementation, which could
be made independent of these libraries.
5.1. The Nexus Communication Library
Section 2.1 introduced the broad characteristics of the communication model
assumed by our implementation. In this Section, we explain the actual representation
of host pointers adopted by Nexus.
According to the Nexus communication paradigm, a communication flows over
a virtual communication link identified by a startpoint and an endpoint. A
startpoint can be thought of as a capability granting rights to operate on the associated
endpoint. Both startpoints and endpoints can be created dynamically; the
startpoint has the additional property that it can be moved between processes; on the
contrary, endpoints cannot be copied between processes.
A communication link supports a single communication operation: an
asynchronous remote service request (RSR). An RSR is applied to a startpoint by
providing a procedure name and a data buffer. For the endpoint linked to the startpoint,
the RSR transfers the data buffer to the address space in which the endpoint is
located and remotely invokes the specified procedure, providing the endpoint and
the data buffer as arguments.
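To fix ideas, the startpoint, endpoint and RSR notions can be mimicked by a small in-process Python model. This is not the Nexus API: the classes `Endpoint` and `Startpoint`, the method `rsr` and the handler table encoding are hypothetical, and the request is executed synchronously in the same address space, whereas a real RSR is asynchronous and crosses process boundaries.

```python
class Endpoint:
    """Stays in its process; refers to a user object and a table of handlers."""

    def __init__(self, user_object, handlers):
        self.user_object = user_object
        self.handlers = handlers    # procedure name -> callable(endpoint, buffer)


class Startpoint:
    """A capability on an endpoint; unlike an endpoint, it may be copied."""

    def __init__(self, endpoint):
        self._endpoint = endpoint

    def copy(self):
        return Startpoint(self._endpoint)

    def rsr(self, procedure, buffer):
        # Transfer the buffer and invoke the named procedure on the endpoint side.
        handler = self._endpoint.handlers[procedure]
        handler(self._endpoint, bytes(buffer))


received = []
endpoint = Endpoint(
    user_object="some object",
    handlers={"log": lambda ep, buf: received.append((ep.user_object, buf))},
)
startpoint = Startpoint(endpoint).copy()   # as if sent to another process
startpoint.rsr("log", b"hello")
```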
We refer the reader to the Nexus literature describing the rationale of Nexus
design. In particular, we note that Nexus is portable over multiple communication
protocols: the virtual communication link identified by a startpoint-endpoint pair
is able to hide the details of the actual communication protocol used.
5.2. Distributed GC Pointers
We consider that each process memory space is divided in two heaps: the first one
is managed by a local GC, which can automatically deallocate objects; the second
heap is under the responsibility of the communication library. (In the case of Nexus,
object deallocation is not automatic.)
Figure 4 shows how the heap managed by the local garbage collector can coexist
with the heap managed by the communication library. Users' programs allocate
objects in the garbage collected heap. Host pointers pointing at these objects are
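[Figure: a DGC pointer with fields user, reference count, mutex, hp, life cycle and protocol, pointing at a user object in the heap managed by the local garbage collector; a host pointer (Nexus startpoint and endpoint) and a handlers table in the heap managed by the communication library.]
Figure 4. Objects, Host Pointers, and DGC pointers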
allocated in the garbage collected heap and is associated with a handler table; a
startpoint is bound to an endpoint and may be copied to remote heaps.
Host pointers may be regarded as network representatives of user objects. In order
to define our distributed garbage collector, we introduce a new kind of object, called
a DGC pointer, that is a garbage-collected, network representative of a user object.
The combination of Nexus with our distributed garbage collector is intended to
provide a component of the runtime system of a distributed language. The
language designer has a wide range of options to specify what programming interface
is actually made available for the programmer. High-level annotations for
distribution such as future could be provided [19]; alternatively, DGC pointers can be
made first-class objects, which would provide a low-level but powerful programming
interface, as in NeXeme [23]. The libraries are directly usable from C, and at the
implementation level, DGC pointers are regarded as regular data structures, which
can be collected by the local GC; host pointers are invisible to the user and are used
only by the communication library.
A DGC pointer is a data structure allocated in the garbage-collected heap. The
real implementation refines the "two boxes" notation of Section 4, by using several
fields:
- a user address, which points at the user data on the owner, and which is null
  on any other process; the user data resides in the garbage collected heap;
- a remote process;
- a mutex to ensure exclusive access to the DGC pointer content;
- the distributed garbage collection protocol handling this object;
- a state information describing the point in the DGC pointer's life cycle, which
  we explain below.
Figure 5 displays two processes P1, P2. Process P1 is the owner of an object, and
a DGC pointer p points at this object; the same pointer was sent to P2. On P1, the
user address of p points at the object, whereas it is null on P2. The host pointer
field refers to a Nexus startpoint, which itself points to an endpoint ep on P1. The
endpoint ep only exists in the owner process P1, and also refers to the object.
[Figure: processes P1 and P2, each with a heap managed by the local garbage collector holding DGC pointer p (fields user and hp) and a heap managed by the communication library holding a startpoint sp; the endpoint ep and the user object are on P1 only.]
Figure 5. Pointers on two processes
Section 2.2 defined global reachability in an abstract manner. We can restate the
third condition in terms of our implementation. An object o is globally reachable if
one of the following conditions holds:
1. if o is locally reachable in a process taking part in the computation,
2. if there is a globally reachable object containing the address of o,
3. if there is a globally reachable object containing (the address of) a DGC pointer
p, if p contains the address of a startpoint sp associated with an endpoint ep,
and if ep contains the address of o. (Note that ep and o are required to be in
the same process, and sp and ep are allowed to be in different processes.)
The idea of a send-table containing DGC pointers with a non-null reference counter
As we have defined send-tables to be roots for the local GCs, the first and third
conditions may now be merged into a single one; our algorithm safely approximates
the global reachability of an object o if one of the following conditions holds:
1. if o is locally reachable in a process taking part in the computation,
2. if there is a globally reachable object containing the address of o.
As a result, when a local garbage collector detects that an object is no longer
locally reachable, it also means that no other DGC pointer to this object exists in
the system, whether in a remote process or in a message in transit.
From a local garbage collector's point of view, DGC pointers are regular data
structures, for which local reachability can be determined as for any other object.
DGC pointers have a life cycle composed of three successive states, and our
distributed GC implementation has an explicit representation of this state in a field of
a DGC pointer. The only allowed transitions are from live to phantom and
phantom to dead; they are only fired by the local garbage collector detecting the
non-reachability of a DGC pointer.
1. live: When a dgc pointer is created, it is said to be live, and it remains so as long as it is locally reachable.
2. phantom: The local garbage collector is responsible for detecting when a dgc pointer is no longer locally reachable. Our implementation uses finalization to change the pointer state to "phantom". During this new stage, the pointer is no longer used (not even reachable) by the mutator. It is removed from the receive-table but it remains in use by the dgc so that rule (GC) can be completed, i.e. a dec message is sent to the owner.
3. dead: Once rule (GC) has terminated for a dgc phantom pointer, the local garbage collector can detect that it is no longer locally reachable for a second time; then, the dgc pointer can again be finalized and it enters the new phase "dead". In this third phase, resources used by the associated host pointer in the communication library should explicitly be deallocated.
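The three-stage life cycle above can be sketched as a small state machine. This is only an illustration in Python, not the implementation evaluated in this paper: the names `Stage`, `DgcPointer` and `finalize` are hypothetical, and the finalization triggered by the local garbage collector is modelled by an explicit method call.

```python
from enum import Enum


class Stage(Enum):
    LIVE = "live"
    PHANTOM = "phantom"
    DEAD = "dead"


# The only transitions allowed by the text: live -> phantom -> dead.
ALLOWED = {Stage.LIVE: Stage.PHANTOM, Stage.PHANTOM: Stage.DEAD}


class DgcPointer:
    """Hypothetical stand-in for a dgc pointer with an explicit stage field."""

    def __init__(self, name):
        self.name = name
        self.stage = Stage.LIVE  # a freshly created pointer is live

    def finalize(self):
        """Model one finalization by the local gc; returns the new stage."""
        if self.stage not in ALLOWED:
            raise ValueError(f"{self.name} is already dead")
        self.stage = ALLOWED[self.stage]
        return self.stage
```

Calling `finalize` twice on a fresh pointer moves it to phantom and then to dead; a third call is rejected, which mirrors the absence of any transition out of the dead stage.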
In the implementation, there may be a delay between the moment a live dgc pointer is detected to be unreachable by the local gc, at which point it is marked as phantom, and the moment it is removed from the receive-table. Therefore, a receive-table contains live pointers, but may also contain phantom pointers, unreachable by the mutator and waiting to be removed. Consequently, rules R1 and R2 need to be refined: their firing requires that a process receives a dgc pointer that is not currently in the receive-table with the live flag. In this case, our implementation
All dgc pointers have a three-stage life cycle, but a dgc protocol determines the distributed garbage collection policy that regulates this object: the policy determines when, where, or what kind of message pertaining to this pointer has to be sent. The dgc protocol is specified by an explicit field of a dgc pointer. Protocols are also represented by a data structure whose fields are shown in Figure 6.
When a message containing a dgc pointer is received by a process, the dgc protocol uses a receive-table to determine whether the pointer is present locally. Similarly, the protocol maintains a send-table that contains all dgc pointers with a non-null reference counter. (We remind the reader that the send-table is a root for the local garbage collector but the receive-table is not.)
The local garbage collector plays an essential role because it detects when a dgc pointer is no longer locally reachable; during finalization, it moves dgc pointers to the phantom or dead stages of their life cycles. During these phases, the dgc or the communication layer may have to send messages; these operations may be long and may require a substantial amount of memory. It is not suitable to perform these operations during the finalization itself; indeed, finalization may be performed inside the garbage collector critical section, and one prefers to leave this section as quickly as possible to avoid starving other threads running in parallel and requiring more memory. There are further implementation-dependent considerations to be taken into account: is the finalizer run in a new thread? What is the priority of the finalizer thread? In our implementation, finalized dgc pointers are entered in a finalization queue (cf. (1) in Figure 6).
A dedicated thread, called the dgc thread, is responsible for transferring pointers from the finalization queue into an action queue (cf. (2) in Figure 6). This action queue is also directly used by protocols when, for instance, a dgc message must be emitted upon receiving a pointer. As we do not want the dgc activity to delay the mutator, dgc messages are not immediately sent, but enqueued in the action queue. The same dgc thread is also responsible for activating items from the action queue (3); this operation typically requires calling a procedure of the communication layer.
Finally, the polling thread of the communication library is responsible for reading incoming messages and activating the corresponding handlers. The distributed garbage collector implementation provides some handlers for handling dec and inc_dec messages or initialising remote processes.
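The hand-over between finalizers, the dgc thread and the action queue can be sketched as follows. This is a minimal Python illustration under stated assumptions: the class `DgcRuntime`, its method names and the `sent` list (which stands in for the communication layer) are all hypothetical, and every finalized pointer is simply turned into a dec action.

```python
import queue
import threading

_STOP = object()  # sentinel used only to shut this sketch down


class DgcRuntime:
    def __init__(self):
        self.finalization_queue = queue.Queue()  # step (1): filled by finalizers
        self.action_queue = queue.Queue()        # steps (2) and (3)
        self.sent = []                           # stand-in for the communication layer
        self._thread = threading.Thread(target=self._dgc_loop)

    def start(self):
        self._thread.start()

    def finalizer(self, pointer):
        """Called by the local gc: only enqueue, never send a message here."""
        self.finalization_queue.put(pointer)

    def enqueue_action(self, action):
        """Used by protocols to defer a dgc message instead of sending it."""
        self.action_queue.put(action)

    def _dgc_loop(self):
        while True:
            pointer = self.finalization_queue.get()
            if pointer is _STOP:
                self._drain_actions()
                return
            self.action_queue.put(("dec", pointer))  # step (2)
            self._drain_actions()                    # step (3)

    def _drain_actions(self):
        while True:
            try:
                action = self.action_queue.get_nowait()
            except queue.Empty:
                return
            self.sent.append(action)  # a real system would call the communication layer

    def stop(self):
        self.finalization_queue.put(_STOP)
        self._thread.join()
```

In this sketch the dgc thread only wakes up when a pointer is finalized, so deferred protocol actions wait for the next finalization; a fuller implementation would also wake the thread when an action is enqueued.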
The protocol data structure contains pointers to the receive and send tables, to the finalize and action queues, to functions for pointer creation, finalization, communication notification, and deallocation, and to all necessary mutexes to protect access to critical resources. Most of our code is parameterised by the protocol managing the current dgc pointer; this organisation facilitates easy dispatch. This
[Figure 6: diagram relating the Protocol data structure (receive table, send table, finalize queue, action queue, create ptr, send ptr, receive ptr, enqueue inc_dec, enqueue dec, deallocate, unreferenced) to the local GC, the dgc thread, the polling thread and the communication layer; the numbers (1), (2) and (3) mark the pointer transfers described in the text.]
Figure 6. Protocol and associated data structures
5.4. Implemented Protocols
In order to study the benefits of the inc_dec message on the diffusion tree reorganisation, we have implemented the following protocols.
5.4.1. Tree Rerooting with inc_dec Messages. If a dgc pointer p is received by a process, and if p is not a pointer in its live stage present in its receive-table, and if the emitter and receiver are both different from the pointer's owner, then an inc_dec message is prepared and enqueued in the action queue. When processed, the inc_dec message will be sent to the pointer's owner. If a dgc pointer p is received by a process, and if p is live and present in its receive-table, then a dec message is prepared and enqueued in the action queue.
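The reception rule of the tree rerooting protocol can be sketched as a single function. The sketch is hedged in several ways: the function name, the tuple layout of actions and the dictionary used as receive-table are hypothetical; registering a newly received pointer as live is an assumption; and the cases where the emitter or the receiver is the owner are not detailed in this section, so the sketch enqueues nothing for them. The dec message is addressed to the emitter, and the inc_dec message carries the emitter so that the owner can later send it a dec message.

```python
def on_receive(pointer, emitter, receiver, owner, receive_table, action_queue):
    """Return (and enqueue) the dgc action caused by receiving `pointer`, if any."""
    if receive_table.get(pointer) == "live":
        # The receiver already holds a live copy: cancel the extra reference.
        action = ("dec", pointer, emitter)
    else:
        receive_table[pointer] = "live"  # assumption: first reception registers it
        if emitter != owner and receiver != owner:
            # Tree rerooting: ask the owner to become the parent of the receiver.
            action = ("inc_dec", pointer, owner, emitter)
        else:
            action = None  # cases not spelled out in this section
    if action is not None:
        action_queue.append(action)  # deferred, never sent immediately
    return action
```

With three processes, a pointer owned by P0 and forwarded from P1 to P2 yields an inc_dec action addressed to P0; a second copy arriving at P2 yields a dec action addressed to its emitter.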
Let us observe that we do not send dgc messages immediately, but add them to the action queue. This approach has two benefits: (i) Receiving a pointer p
we may optimise them, as explained in Section 5.6.
5.4.2. Indirect Reference Counting. In this protocol, inc_dec messages are not used. dgc pointers now have an extra field, called emitter, which contains a reference to the process that emitted the pointer. If a dgc pointer p is received by a process, and if p is not a live pointer present in its receive-table, then the emitter field for p is set to the process that emitted p. Otherwise, dec messages are prepared as in the first protocol. Once the pointer is finalized and reaches the phantom state, a dec message is no longer sent to its owner but to the process contained in the emitter field of the pointer.
5.4.3. The Null Protocol. The null protocol does not maintain reference counters for pointers. In our implementation, processes are denoted by null protocol pointers. Such pointers are added to every communication so that recipients of a message can determine its originator.
5.5. Actions
Actions are operations to be performed by the distributed gc. Actions are managed by the action queue. Currently, the algorithm supports three types of actions.
5.5.1. Sending a dec message. In the simple case, this consists of sending a single dec message related to a pointer. In the most complex case, this action sends a message to decrease the reference counters of several pointers, by an amount specified for each pointer.
5.5.2. Sending an inc_dec message. In the simple case, this consists of sending an inc_dec message to a pointer's owner. In the most complex case, the message acts upon the reference counters of several pointers.
5.5.3. Deallocating a pointer. This action deallocates all resources used by a host pointer in the heap managed by the communication library.
5.6. Implemented Action Queues
The action queue is a data structure containing actions. Any implementation strategy is valid for action queues, as long as it preserves the safety conditions set by the algorithm: (i) dec messages cannot overtake inc_dec messages if they are related to a same pointer; (ii) deallocation of a pointer cannot take place before
is made to merge messages that are related to the same pointers.
5.6.2. The sorted action queue. Each action for sending a dec message entered in the queue is merged with a similar action related to the same pointer. Therefore, actions for sending dec messages are associated with a number specifying the amount by which a counter has to be decreased. In order to preserve the safety condition, a dec message is only extracted from a sorted queue if there is no inc_dec message waiting to be processed. In order to facilitate these operations, actions are sorted by their type: inc_dec messages are kept in one queue, while the other actions are maintained in a second queue. As inc_dec messages have a lower frequency, we have decided not to merge them.
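The merging behaviour of the sorted action queue can be sketched as follows. This is a minimal Python illustration, not the evaluated implementation: the class and method names are hypothetical, deallocation actions are left out, and dec actions are grouped per destination so that one extracted message may carry counters for several pointers.

```python
from collections import deque


class SortedActionQueue:
    def __init__(self):
        self._inc_dec = deque()  # inc_dec actions are kept apart and never merged
        self._dec = {}           # destination -> {pointer: amount to decrease}

    def add_inc_dec(self, pointer, owner, emitter):
        self._inc_dec.append((pointer, owner, emitter))

    def add_dec(self, pointer, destination, amount=1):
        # Merge with any pending dec for the same pointer and destination.
        per_pointer = self._dec.setdefault(destination, {})
        per_pointer[pointer] = per_pointer.get(pointer, 0) + amount

    def pop(self):
        """Extract the next message; dec only when no inc_dec is waiting."""
        if self._inc_dec:
            return ("inc_dec",) + self._inc_dec.popleft()
        if self._dec:
            destination = next(iter(self._dec))  # oldest destination first
            return ("dec", destination, self._dec.pop(destination))
        return None
```

Serving every waiting inc_dec before any dec is a conservative way of ensuring that a dec message never overtakes an inc_dec message for the same pointer.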
A further optimisation is possible: if a dec message is sent to the owner of a pointer, and if the sorted queue contains an inc_dec message for the same pointer (to be followed by a dec message to a process P), then these messages can be replaced by a single dec message to P directly. While this optimisation reduces the number of messages, we have not implemented it, because it would not have been used in any significant manner in the benchmarks we propose in the following section. However, systematically applying this optimisation after delaying inc_dec messages gives us irc, which we have implemented.
6. Performance Evaluation
In this section, we evaluate and compare the different protocols and the different action queues that we have implemented. First, we describe the benchmark programs that we have designed; second, we present our experimental results.
6.1. Benchmark Programs
The scientific programming community has developed a series of benchmarks for sequential and parallel languages. Similarly, Gabriel's benchmarks and nofib are respectively used by the Lisp and Haskell communities; Feeley [8] has produced a series of programs to measure the efficiency of MultiLisp. We dramatically lack benchmarks specifically designed for evaluating the performance of distributed garbage collectors.
Unfortunately, we lack real applications to evaluate the overall effect of dgc algorithms. This is a major obstacle to evaluating algorithms, but we are confident that such applications will become available as platforms such as Java rmi become widely used.
Instead, we have specifically devised two benchmarks that exhibit the performance of our implemented policies: cycle highlights the effect of inc_dec messages, whereas diffuse shows the benefits of the sorted action queue. The benchmarks are
[...] migrate while keeping a pointer to their home base; making sure that the mobile agent becomes independent of the locations it has visited is an essential requirement to ensure its robustness. The second benchmark, diffuse, is an abstraction of the behaviour of a distributed search across different WWW servers, where these servers cooperate in order to find reachable pages with a specific content [18]. The problem in this context is detecting the termination of the search, which can be implemented by using distributed garbage collection [31]; ensuring a prompt detection of the distributed termination is of paramount importance in order to obtain an efficient system.
6.1.1. Cycle. The first benchmark is aimed at measuring the benefit of reorganising the diffusion tree, using inc_dec messages; its description can be found in Figure 7. We consider a fixed number $N$ of processes, forming a cycle, where the successor of process $P_{N-1}$ is process $P_0$. The first process allocates a dgc pointer, passes it to the second process, which in turn passes it to the third one, and so on. In order to avoid sending a pointer $p$ to a process that has already received it, we create a new dgc pointer $p'$ every $N-1$ steps. The pointer $p'$ is defined so as to point to a newly created object, which contains the address of the current pointer $p$; as a result, $p$ is locally reachable from $p'$. Then, we repeat the algorithm with $p'$.
Algorithm parameter: $N$
Sequence of processes: $P_0, P_1, \ldots, P_{N-1}$
Number of cycles: $c$
Initial Configuration:
Create a dgc pointer $p$ on process $P_0$; send $(p, c, N-2)$ to $P_1$.
Process $P_i$ receiving $(p, j, k)$:
If $k > 0$, then send $(p, j, k-1)$ to process $P_{(i+1) \bmod N}$.
If $k = 0$ and $j = 0$, then lose any reference to $p$ and terminate.
If $k = 0$ and $j > 0$, then create a dgc pointer $p'$ referring to a new object that contains the address of $p$, and send $(p', j-1, N-2)$ to process $P_{(i+1) \bmod N}$.
Figure 7. Cycle Benchmark
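The mutator side of Figure 7 can be replayed by a short simulation, which makes it easy to check how many messages and pointers a run involves. The sketch below is a single-threaded Python model under stated assumptions: the function name `run_cycle` is hypothetical, pointers are represented by integers, no dgc message is modelled, and the number of cycles is passed as an explicit argument.

```python
from collections import deque


def run_cycle(n_processes, n_cycles):
    """Replay the mutator messages of the cycle benchmark.

    Returns (messages sent, pointers created, list of (process, pointer) deliveries).
    """
    if n_processes < 2:
        raise ValueError("the benchmark needs at least two processes")
    pointers_created = 1  # p is created on P_0
    # Each in-flight message is (destination index, pointer id, j, k).
    in_flight = deque([(1 % n_processes, 0, n_cycles, n_processes - 2)])
    messages_sent = 1
    deliveries = []
    while in_flight:
        i, p, j, k = in_flight.popleft()
        deliveries.append((i, p))
        successor = (i + 1) % n_processes
        if k > 0:
            in_flight.append((successor, p, j, k - 1))
            messages_sent += 1
        elif j > 0:
            new_pointer = pointers_created  # p' refers to an object containing p
            pointers_created += 1
            in_flight.append((successor, new_pointer, j - 1, n_processes - 2))
            messages_sent += 1
        # k == 0 and j == 0: lose the reference and terminate
    return messages_sent, pointers_created, deliveries
```

In this model each pointer is delivered to $N-1$ processes, so a run with $c$ cycles creates $c+1$ pointers and sends $(c+1)(N-1)$ mutator messages.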
Let us analyse the reachability of the different pointers. After a dgc pointer $p$ has been communicated successively to $N-1$ processes, a dgc pointer $p'$ is created. As $p'$ is referring to an object that contains $p$, $p$ remains reachable as long as $p'$ is reachable. When the iterations terminate, the last dgc pointer is released, which in turn releases the previous one, and so on. If indirect reference counting
i+1modN
of the reachability chain is $N$ with indirect reference counting. Tree rerooting does not force a pointer $p$ to remain reachable when its child in the diffusion tree is also reachable. As a result, except on its owner, any dgc pointer becomes garbage as soon as the pointer has been sent remotely and tree rerooting has taken place.
[Figure 8: timelines of three processes (Proc0, Proc1, Proc2) between times 348 and 366, showing mutator messages carrying pointers p1 to p5, DEC and INCDEC messages, and the event labels A to G.]
Figure 8. Diffusion tree reorganisation (1 unit = 1 ms)
Figure 8 displays an example of execution. It displays the timelines of three processes participating in the computation. Thin gray lines represent the mutator's messages carrying pointers; bullets represent dgc pointer creation. We see that a pointer p1 is passed from process 0 (label A) to process 1, and then to process 2, where a new pointer p2 is created (label C), passed to 0, and then 1, and so on.
Dashed-dotted lines represent inc_dec messages, whereas dashed lines denote dec messages. We also see that when process 2 receives the first pointer, it sends an inc_dec message (label E) to process 0, its owner, which in turn sends a dec message (label F) to process 1. The benefit of this reorganisation is that the pointer p1 is no longer needed on process 1: it may be finalized and a dec message may be sent from process 1 to 2 (label G, at time 363). This example illustrates that tree rerooting avoids the third party dependencies introduced by Piquer's indirect reference counting.
The cycle program is typical of mobile applications, which for instance keep a pointer to their home base while they migrate. Every hop of a mobile program potentially keeps the home base pointer in the send-table of the previous process.
messages, (ii) total time (including time to clean up), (iii) length of causality chains (to be explained in Section 6.3).
6.1.2. Diffuse. In the cycle program, the mutator's activity is sequential, as it repeatedly propagates a single pointer to process after process. Even though a limited distributed garbage collection activity may take place in parallel, there is not much room for optimising dgc activity. Therefore, we designed a second benchmark, diffuse, which measures the benefit of grouping dgc messages per destination (cf. Figure 9).
Algorithm parameter: $N$
Sequence of processes: $P_0, P_1, \ldots, P_{N-1}$
Width: $W$
Depth: $D$
Initial Configuration:
Create a dgc pointer $p$ on process $P_0$; send $(p, W, D)$ to $W$ randomly chosen processes.
Process $P_i$ receiving $(p, w, d)$:
If $d = 0$, then lose any reference to $p$ and terminate.
If $d > 0$, then send $(p, w, d-1)$ to $w$ randomly chosen processes.
Figure 9. Diffuse Benchmark
We consider a number $N$ of processes, such that each process knows about the $N-1$ other processes. A process receives a dgc pointer and two integers $w$ (width) and $d$ (depth). If $d$ is zero, then the pointer is discarded. Otherwise, the same pointer is propagated to $w$ destination processes chosen randomly, with a depth given by $d-1$.
This program diffuses a pointer along a tree of a specified depth and width, whose processes are decided at runtime. We select the width and depth such that the total number $N$ of processes is much smaller than $w^d$: as a result, a same pointer will be communicated to a same process several times during a short period.
Let us analyse the reachability of the single dgc pointer $p$ used by the diffuse benchmark. The first time a pointer $p$ is sent to $w$ processes, a tree rerooting can take place. We expect each process rapidly to obtain a copy of $p$ as we assumed that the total number of processes $N$ is much smaller than $w^d$. Indeed, even though no further activity is performed by a process after it has sent a pointer, the frequency
reachability can be detected by child processes and remaining dgc messages can be propagated, which will eventually make the pointer unreachable in the owner.
The first three interesting measures are similar to the ones for cycle, the last two are specific to this benchmark: (i) number of messages, (ii) total time (including time to clean up), (iii) length of causality chains (to be explained in Section 6.3), (iv) counter values in a dec message, (v) number of dgc pointers in a single dec message.
6.2. Measure Quality
Measuring the performance of a distributed garbage collector is not a trivial task because, as for any other distributed algorithm, execution may be non-deterministic due to the scheduling of processes or threads and to message propagation. The benchmark diffuse even uses random number generators.
The absence of a global clock has also influenced the design of our benchmarks. In order to measure a time duration, we made sure that the measures were taken on a single process. Graphical visualisations display timelines for several processes; some of these timelines may be shifted with respect to each other because time 0 is not defined globally.
In addition, there are issues that are specific to garbage collection. There is a strong partnership between the distributed garbage collector and local garbage collectors: it is their cooperation that provides automatic distributed memory management. In particular, several dgc events are triggered by finalization initiated by local garbage collectors, when some objects are detected to be garbage. For instance, sending a dec message when a pointer is no longer needed or deallocating communication resources are both started by finalizers. Consequently, the performance of our dgc algorithm cannot be measured independently of the local collector (see Note 2).
Furthermore, specific properties of local garbage collectors come into effect. For instance, in our case, the local garbage collector is incremental, and needs to allocate objects in order to reclaim existing garbage and activate finalizers. In order to circumvent this feature, we created a thread whose sole purpose is to allocate garbage to be sure that objects that are relevant to our experiments get finalized.
6.3. Comparison
For each benchmark, we display execution runs that give some intuition about the patterns of message exchange and the overall execution time. Then, we extract and discuss specific measurements that are not time related. We have successfully run our programs on Linux 2.0 and SunOS 5.6, with communication taking place over a 10 Mb/s ethernet. As they all presented similar patterns of communication, we present here runs where all processes ran on a single Linux machine.
versions.) In Figure 10 (a), mutator activity is interleaved with dgc message exchanges and is spread between 2233 and 4629 ms. In Figure 10 (b), mutator activity is between 1850 and 2061 ms. The overall computation terminates after 18000 ms in Figure 10 (a) but only after 35000 ms in Figure 10 (b).
With irc, the distributed garbage collection activity does not take place while the mutator is proceeding, but only starts after the mutator has accomplished its execution and has released the last pointer, in the last process, after the end of the last iteration. A sequence of dec messages is then propagated in a direction opposite to the one in which the pointer was propagated. On the other hand, as previously illustrated in Figure 8, dgc activity is interleaved with the mutator computation when inc_dec messages are used. A very large number of dgc messages may be propagated at a very early stage in parallel. As a result, we can see that the time to clear all pointers allocated by the cycle benchmark is halved when using tree rerooting.
This behaviour may be explained as follows. We consider the configuration reached after n iterations of the cycle benchmark, and we assume that the last pointer in the last process is still regarded as live. We imagine that all inc_dec and subsequent dec messages have been propagated. We then obtain a configuration such that for each iteration of cycle, there remains a single instance of a dgc pointer that is live.
Consequently, once the last pointer in the last process is detected to be garbage, a domino effect will follow, freeing the pointer it is referring to, and so on. The number of dec messages that remain to be propagated is given by the number of iterations that were completed. On the contrary, with irc, no tree rerooting took place, and the number of dec messages is given by the number of iterations multiplied by the number of processes in a cycle.
The benchmark was designed so as to model mobile computations, hopping from process to process. If we consider the limit of a very high number N and a single iteration, after all inc_dec and dec messages were propagated, there remains a single dec message to send, whereas in the case of irc, N messages still have to be sent.
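The two counts given above can be written down as a one-line calculation. The Python helper below is only a restatement of that arithmetic; its name and signature are hypothetical.

```python
def remaining_dec_messages(iterations, processes, tree_rerooting):
    """dec messages still to be propagated once the mutator has finished."""
    if tree_rerooting:
        return iterations            # one live pointer instance per iteration
    return iterations * processes    # irc: every hop of every iteration


# Ratio between the two protocols for the same run: the number of processes.
def irc_overhead(iterations, processes):
    return (remaining_dec_messages(iterations, processes, False)
            / remaining_dec_messages(iterations, processes, True))
```

For a single iteration over N processes, this gives one remaining message with tree rerooting and N with irc, as stated in the text.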
Figure 10 also shows that using tree rerooting reduces the total duration of the benchmark (in this figure, 1 unit is 1 ms). The time to complete the execution of the mutator is longer with tree rerooting than with irc, because some dgc activity takes place (including the activation of local collectors). When using tree rerooting, about three times more messages are exchanged than with irc, but these messages can be propagated in parallel, which results in a halved execution duration.
The absence of parallelism when using irc can be explained in a more formal way. The length of the longest causality chain is substantially higher in the indirect reference counting algorithm than in the presence of tree rerooting. Causality [17] is a partial relationship between events, which expresses the causal dependencies between events: if event $e_1$ causally depends on event $e_2$, then $e_2$ necessarily occurs
[Figure 10: timelines of six processes (P0 to P5) for two runs of the cycle benchmark; the time axis of the first panel runs from 0 to 16000 and that of the second panel from 0 to 35000.]
A coloured version of these figures is available from
http://www.ecs.soton.ac.uk/~lavm/papers/hosc01-colour.tar.gz.
[...] $e_2$, then $e_2$ causally depends on $e_1$;
2. receiving a message causally depends on sending it;
3. if event $e$ creates a dgc pointer $p$ pointing at another dgc pointer $p'$, then $e$ causally depends on the last event incurred by $p'$ in the same host.
The causality relationship is a partial order from which we can derive a directed acyclic graph, starting at the beginning of the execution and converging to the last event of the computation. We can determine the longest path between the beginning and the end of the computation, which we call the longest causality chain. The longest causality chain is an indication of the number of events that have to be executed in sequence. The shorter the chain, the more parallelism we can observe.
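Computing the longest causality chain amounts to finding a longest path in a directed acyclic graph. The sketch below shows one way to do this in Python, assuming the causal dependencies have already been recorded as a mapping from each event to the events it depends on; the function name and the event labels are hypothetical, and the chain length is counted in events.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def longest_causality_chain(depends_on):
    """Number of events on the longest chain of causal dependencies.

    `depends_on` maps each event to an iterable of the events it causally
    depends on; the dependencies must form a directed acyclic graph.
    """
    length = {}
    # static_order() yields every event after all of its predecessors.
    for event in TopologicalSorter(depends_on).static_order():
        predecessors = depends_on.get(event, ())
        length[event] = 1 + max((length[p] for p in predecessors), default=0)
    return max(length.values(), default=0)
```

For instance, four events that each depend on the previous one give a chain of length 4, whereas one send followed by two independent receptions and a final event gives a chain of length 3, reflecting the two receptions that can proceed in parallel.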
[Figure 11: plot of the length of causality chains (log 3 scale, ticks from 81 to 6561) against the number of cycle iterations (0 to 70), with one curve for Indirect Reference Counting and one for Diffusion Tree Reorganisation.]
Figure 11. Length of the longest causality chain for cycle with 3 processes
Figure 11 displays the difference between the lengths of the longest causality chains in the two algorithms. In order to show that this difference is proportional to the number of processes involved in the computation, the y-axis scale is expressed as the logarithm of the number of hosts. We can see that the space between the curves remains constant as the number of iterations increases.
The second benchmark evaluates the performance of action queue strategies. In order to gain some intuition, Figure 12 displays the messages that were sent for two
[Figure 12: timelines of three processes (P0 to P2) for two runs of the diffuse benchmark; the time axis of the first panel runs from 0 to 9000 and that of the second panel from 0 to 4500.]
irc, because we assumed that w is large compared to the number of processes: as a result, the number of inc_dec messages is of the same order as the number of processes, and therefore negligible.
The mutator duration is in the range 4500-5000 ms for each run. However, overall execution ends when the last pointer is reclaimed, which identifies the detection of the distributed termination. In the case of fifo queues, it takes about 10000 ms while it only takes 5000 ms with ordered queues. This phenomenon is explained by the dramatic reduction of the number of dgc messages when sorted action queues are used.
The following table gives a small sample of the number of messages exchanged during runs of the diffuse program (again with three processes). The column mutator indicates the number of messages exchanged by the mutator; fifo and sorted respectively indicate the number of dgc messages using the fifo and sorted action queues.
depth  width  mutator  fifo  sorted
6      3      1092     732   104
6      4      5460     3640  394
As the width increases, dgc pointers are more frequently sent to processes to which they were previously sent. Consequently, at any given moment, more dgc messages are waiting to be sent to a same destination. The sorted queue strategy takes advantage of this situation to group several messages into a single one to a same destination.
The table shows that increasing the width causes a fivefold increase in the number of messages sent by the mutator. The same increase is also observed in the number of messages sent by the fifo strategy; as a result, the ratio of dgc messages to mutator messages remains high and roughly constant at about 66%. On the other hand, the sorted queue strategy generates fewer messages and only has a fourfold increase: the same ratio diminishes from 9% to 7%.
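The ratios quoted above can be recomputed directly from the table. The short Python check below uses only the figures of the table; the function names are hypothetical. It also records an observation of ours rather than a statement of the paper: both mutator counts coincide with the sum of the powers of the width from 1 up to the depth.

```python
ROWS = [
    {"depth": 6, "width": 3, "mutator": 1092, "fifo": 732, "sorted": 104},
    {"depth": 6, "width": 4, "mutator": 5460, "fifo": 3640, "sorted": 394},
]


def dgc_ratios(row):
    """Ratio of dgc messages to mutator messages for each queue strategy."""
    return row["fifo"] / row["mutator"], row["sorted"] / row["mutator"]


def growth(column):
    """Increase factor of a column when the width goes from 3 to 4."""
    return ROWS[1][column] / ROWS[0][column]


def sum_of_powers(width, depth):
    """width + width**2 + ... + width**depth."""
    return sum(width ** level for level in range(1, depth + 1))


for row in ROWS:
    fifo_ratio, sorted_ratio = dgc_ratios(row)
    print(row["width"], f"{fifo_ratio:.1%}", f"{sorted_ratio:.1%}")
```

The fifo ratio stays close to two thirds for both widths, while the sorted ratio drops from about 9.5% to about 7.2%; the mutator column grows by a factor of exactly 5 and the sorted column by a factor of about 3.8.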
7. Related Work
Literature on distributed garbage collection is abundant; we refer to Jones' book [14] for a complete chapter on the subject and to the garbage collection home page [13], which lists more than 1600 references. Section 3 gave an overview and a discussion of the current work on distributed reference counting. Two different approaches however deserve a further comparison. Shapiro, Dickman and Plainfosse [30, 29] were the first to study the problem of third party dependency. However, we first turn our attention to Dickman, who has independently published an algorithm for restructuring diffusion trees [7] since the first publication of our algorithm.
Dickman's algorithm assumes a model of communication based on remote method
reducing third party dependencies (Piquer's zombie pointers).
The first one consists of rerooting a subtree when the root of the subtree performs a remote method invocation to the owner (the root of the diffusion tree). The request for rerooting is piggy-backed on the method call. When the result is returned, the owner's reference counter is incremented, and the owner is made the parent of the process that initiated the call. A dec message has to be sent by the rerooted process to its previous parent, in order to decrease its previous parent's reference counter. It is important to observe that the dec message can only be sent after the termination of the remote method invocation, otherwise race conditions may arise similarly to the naive extension of reference counting. Consequently, long processing during the remote method invocation will potentially prevent the earlier release of resources on the previous parent. Similarly to Dickman's algorithm, we could piggy-back our inc_dec message on a remote method invocation, but our dec message may be sent earlier, allowing an early release of resources. Furthermore, our tree rerooting technique may proceed even though no remote method invocation is performed on the owner.
The second technique proposed by Dickman reduces the depth of the tree in a less radical manner. When a process holding a reference to an object receives a new copy of the reference, irc and our algorithm dictate that it should send a dec message to the sender of the message. Instead, choosing between the previous parent and the message sender, Dickman sends a dec message to the process that has the deeper depth, in effect grafting the current subtree to the process with the shallower depth. The realisation of this algorithm turns out to be non-trivial because depths of processes are not propagated instantaneously, and therefore the decision to change parent may be made on possibly out-of-date depth information. We have shown that tree rerooting introduces more parallelism by reducing the longest chain of causality; Dickman's tree structuring reduces chains of causality, but not as much as tree rerooting. Therefore, besides the risk of increasing the chain by a wrong decision to restructure the tree, his technique does not reduce the cost of dgc as radically as ours. However, we believe that it still has some use when the owner is not directly reachable, for instance because it is hidden behind a firewall.
Our algorithm describes a spectrum of algorithms from eager activation of tree rerooting to its lazy activation, which leads to indirect reference counting. Dickman introduces a further dimension by allowing a choice in the tree restructuring: if the reorganisation is closer to the root, it allows more parallelism in the dgc activity and it reduces the number of third party dependencies.
Shapiro, Dickman and Plainfosse [30, 29] address the problem of distributed garbage collection for mobile objects. Our approach separates distributed reference counting from the problem of object migration, for which we devised a specific algorithm [22]. Our comparison therefore focuses on non-mobile objects. They introduce the notion of SSP (stub-scion pair) chains, which essentially is their representation of diffusion trees. They allow tree rerooting, which they call chain
piggy- left to the distributed garbage collector adopting a specific protocol to clear unused stubs and scions. Their approach suffers from the same drawbacks as Dickman's algorithm because tree rerooting remains dependent on the mutator activating a remote method call on the owner. Baggio [1] showed that SSP chains could adapt to the network configuration. In the absence of direct connectivity, e.g. behind a firewall, short-cutting does not have to take place. A similar flexibility is also present in our algorithm because tree rerooting is optional.
8. Conclusion
The garbage collector we present in this paper is based on distributed reference counting. Like other reference-counting algorithms, ours is unable to reclaim distributed cycles. However, we should observe that there is a range of applications that do not create distributed cycles. In particular, Tel and Mattern [31] have shown that the problem of termination in distributed systems is equivalent to distributed gc. Reference counting can be used because processes form a hierarchy. Similarly, groups [25] have a hierarchical organisation and can be reference counted.
This paper concludes the investigation of an algorithm for distributed garbage collection based on reference counting. This algorithm has been specified, and its correctness has been proved mechanically. In this paper, we have described a complete implementation and evaluated some of its performance aspects. The conclusion of our experiments is that tree rerooting offers more parallelism in the dgc as causality chains become shorter and that grouping dgc messages reduces dgc traffic. A complete performance evaluation would require real-world applications using distributed gc, but we currently lack such applications. Future work concerns a Java implementation of the algorithm, which would provide a multi-lingual environment, using Nexus/Globus as a communication layer and the NexusRMI stub compiler [4]. In order to support garbage collection of mobile objects, a further study is required to understand the interactions between tree rerooting and object migration [22].
Acknowledgements
The author is grateful to Richard Jones, whose careful reading and suggestions led to various improvements of the technical contents and presentation of the paper. The author also wishes to acknowledge Danius Michaelides, Christian Queinnec, and the anonymous referees for their useful comments.
Notes
1. Safe approximation means that if an object is globally reachable, then it is also globally
However, in the particular case of the cycle program using indirect reference counting, it is known that once the reference counter of a pointer $p_1$ becomes zero, the pointer $p_2$ pointed to by $p_1$ has lost its last active reference; therefore a dec message may be sent immediately for $p_2$, without waiting for the finalizer activation. Therefore, by using knowledge that is specific to the problem, some benchmarks may be improved. However, we did not implement such a variant because it does not remain valid for other dgc strategies.
References
1. Aline Baggio. System support for transparency and network-aware adaptation in mobile environments. In ACM Symposium on Applied Computing special track on Mobile Computing Systems and Applications, Atlanta, Georgia, 1998.
2. David I. Bevan. Distributed Garbage Collection using Reference Counting. In PARLE Parallel Architectures and Languages Europe, volume 259 of Lecture Notes in Computer Science, pages 176-187. Springer-Verlag, June 1987.
3. Andrew Birrell, David Evers, Greg Nelson, Susan Owicki, and Edward Wobber. Distributed Garbage Collection for Network Objects. Technical Report 116, Digital Systems Research Center, 130 Lytton Avenue, Palo Alto, CA 94301, December 1993.
4. Fabian Breg and Dennis Gannon. Compiler support for an RMI implementation using NexusJava. Technical report, Indiana University, 1997.
5. George E. Collins. A Method for Overlapping and Erasure of Lists. Communications of the ACM, 3(12):655-657, December 1960.
6. Peter Dickman. Optimising Weighted Reference Counts for Scalable Fault-Tolerant Distributed Object-Support Systems, 1992.
7. Peter Dickman. Diffusion Tree Redirection for Indirect Reference Counting. In Tony Hosking, editor, Proceedings of the Second International Symposium on Memory Management, Minneapolis, MN, October 2000. ACM.
8. Marc Feeley. An Efficient and General Implementation of Futures on Large Scale Shared-Memory Multiprocessors. PhD thesis, Brandeis University, 1993.
9. Ian Foster, Carl Kesselman, and Steven Tuecke. The Nexus Approach to Integrating Multithreading and Communication. Journal of Parallel and Distributed Computing, 37:70-82, 1996.
10. H.-J. Boehm and M. Weiser. Garbage Collection in an Uncooperative Environment. Software - Practice and Experience, 18(9):807-820, 1988.
11. R. L. Hudson, R. Morrison, J. E. B. Moss, and D. S. Munro. Garbage Collecting the World: One Car at a Time. In Proceedings of OOPSLA'97, Atlanta, USA, 1997.
12. Java Remote Method Invocation Specification, November 1996.
13. Richard Jones. The Garbage Collection Page. http://www.cs.ukc.ac.uk/people/staff/rej/gc.html.
14. Richard Jones and Rafael Lins. Garbage Collection. Algorithms for Automatic Dynamic Memory Management. Wiley, 1996.
15. C.-W. Lermen and D. Maurer. A Protocol for Distributed Reference Counting. In Lisp and Functional Programming, pages 343-354, 1986.
16. Luigi V. Mancini and S. K. Shrivastava. Fault-tolerant reference counting for garbage collection in distributed systems. Computer Journal, 34(6):503-513, December 1991.
17. Friedemann Mattern. Virtual time and global states of distributed systems. In M. Cosnard et al., editors, Proceedings of the International Workshop on Parallel and Distributed Algorithms, pages 215-226, Amsterdam, 1989. Elsevier Science Publishers.
18. Danius Michaelides, Luc Moreau, and David DeRoure. A Uniform Approach to Programming the World Wide Web. Computer Systems Science and Engineering, 14(2):69-91, 1999.
19. Luc Moreau. Correctness of a Distributed-Memory Model for Scheme. In Second
(ICFP'98),pages204{215,September1998.AlsoinACMSIGPLANNoties,34(1):204-215,
January1999.
21. Lu Moreau. Hierarhial Distributed Referene Counting. In Proeedings of the First
ACMSIGPLANInternationalSymposiumonMemoryManagement(ISMM'98),pages57{
67,Vanouver,BC,Canada,Otober1998. AlsoinACMSIGPLANNoties,34(3):57{67,
Marh1999.
22. LuMoreau.DistributedDiretoryServieandMessageRouterforMobileAgents.Siene
ofComputerProgramming,39(2{3):249{272,2001.
23. LuMoreau,David DeRoure, and Ian Foster. NeXeme: a Distributed Sheme Based on
Nexus.InThirdInternationalEuroparConferene(EURO-PAR'97),volume1300ofLeture
NotesinComputerSiene,pages581{590,Passau,Germany,August1997.Springer-Verlag.
24. LuMoreauandJeanDuprat.AConstrutionofDistributedRefereneCounting.Tehnial
ReportRR1999-18,EoleNormaleSuperieure,Lyon,Marh1999.
25. LuMoreauandChristianQueinne.DesignandSemantisofQuantum:aLanguageto
Con-trolResoureConsumptioninDistributed Computing. InUsenixConfereneon
Domain-SpeiLanguages(DSL'97),pages183{197,Santa-Barbara,California,Otober1997.
26. LuMoreau,VitorTan,andNiholasGibbins. TransparentMigrationandOwnership of
MobileAgents.Tehnialreport,UniversityofSouthampton,2000.
27. JoseM.Piquer.IndiretRefereneCounting: ADistributedGarbageColletionAlgorithm.
InParallelArhiteturesandLanguagesEurope(PARLE'91),pages150{165,1991.
28. JoseM.Piquer.IndiretDistributedGarbageColletion:HandlingObjetMigration.ACM
TransationsonProgrammingLanguagesandSystems,18(5):615{647,September1996.
29. David Plainfosse and Marc Shapiro. A Survey of Distributed Garbage Collection Techniques. In Henry G. Baker, editor, International Workshop on Memory Management (IWMM95), number 986 in Lecture Notes in Computer Science, pages 211-249, Kinross, Scotland, 1995.
30. Marc Shapiro, Peter Dickman, and David Plainfosse. SSP Chains: Robust, Distributed References Supporting Acyclic Garbage Collection. Rapport de Recherche 1799, INRIA-Rocquencourt, November 1992.
31. Gerard Tel and Friedemann Mattern. The Derivation of Distributed Termination Detection Algorithms from Garbage Collection Schemes. ACM Transactions on Programming Languages and Systems, 15(1):1-35, January 1993.
32. P. W. Trinder, K. Hammond, J. S. Mattson, A. S. Partridge, and S. L. Peyton Jones. GUM: a portable parallel implementation of Haskell. In Proceedings of the SIGPLAN'96 Conference on Programming Language Design and Implementation (PLDI'96), pages 79-88, 1996.
33. Peter Van Roy. On the separation of concerns in distributed programming: Application to distribution structure and fault tolerance in Mozart. In Fourth International Workshop on Parallel and Distributed Computing for Symbolic and Irregular Applications (PDCSIA99), Sendai, Japan, July 1999. World Scientific.
34. Paul Watson and Ian Watson. An Efficient Garbage Collection Scheme for Par