Data Management in Distributed Systems: A Scalability Taxonomy

Academic year: 2020


DATA MANAGEMENT IN DISTRIBUTED SYSTEMS: A SCALABILITY TAXONOMY

A VIJAY SRINIVAS AND D JANAKIRAM

Abstract. Data management is a key aspect of any distributed system. This paper surveys data management techniques in various distributed systems, starting from Distributed Shared Memory (DSM) systems to Peer-to-Peer (P2P) systems. The central focus is on scalability, an important non-functional property of distributed systems. A scalability taxonomy of data management techniques is presented. Detailed discussion of the evolution of data management techniques in the different categories as well as the state of the art is provided. As a result, several open issues are inferred, including use of P2P techniques in data grids and distributed mobile systems and the use of optimal data placement heuristics from Content Distribution Networks (CDNs) for P2P grids.

1. Introduction. Data management is an important facet of distributed systems. Data management encompasses the ability to describe data, to handle multiple copies (replication or caching) of data objects or files, to support meta-data, and to query and access data. Different approaches for data management have given importance to these different aspects and provide explicit support for them, while other aspects are implicitly or indirectly supported. For instance, Distributed Shared Memory (DSM) systems and shared object spaces handled consistency of replicated data, but supported meta-data indirectly through object lookups.

Orthogonal to the above mentioned issues of managing data, the main non-functional challenges are fault-tolerance, scalability and security, as illustrated in [32]. We survey various distributed systems from the perspective of scalability of data management solutions and provide a scalability taxonomy. We classify data management approaches into three categories: Centralized/Naively Distributed (CND) techniques, Sophisticated/Intermediate Data (SID) management techniques and Large Scale Data (LSD) management techniques. We give a brief view of the evolution of data management in each of the categories.

CND techniques for data management were used by DSM systems such as TreadMarks [10], Munin [25] and shared object spaces such as Linda [24], Orca [36] and TSpaces [4]. Many of these systems provide application transparent replica consistency management. They use centralized or naively distributed components to achieve the same. For instance, TSpaces uses a centralized server for consistency maintenance and for object lookups, while JavaSpaces [81] uses a centralized transaction coordinator.

SID techniques have been used mainly in data management in grid computing systems such as [51], which provides a Replica Management Service (RMS). Some of these systems are characterized by data sharing across autonomous organizations at intermediate scale (possibly thousands of nodes). These approaches mainly manage replicated data in a grid computing environment. Data grids [27] handle data management as first class entities in addition to computation issues. They are characterized by the size of the data sets, which could be of the order of gigabytes or even terabytes. High Energy Physics (HEP) applications such as GriPhyN [31] and CERN [79] are examples of data grids. Other approaches that use SID techniques include Content Distribution Networks (CDNs) and data management in distributed mobile systems. CDNs such as Akamai [43] have been proposed to deliver web content to users from closer to the edge of the Internet, enabling web servers to scale up. Data management in distributed mobile systems is characterized by data sharing in the presence of mobile nodes, exemplified by systems such as Coda [74]. The common feature across these different systems is the scale of operation (thousands of nodes) that distinguishes SID techniques for data management. Many of these systems assume that failures are rare and that reliable servers (distributed, not centralized) are available.

LSD management techniques do not assume reliable servers. The distinguishing feature of LSD techniques is that the execution of services is delegated to the edges of the Internet, resulting in high scalability and fault-tolerance. LSD techniques work well over the Internet and could handle millions of nodes/data entities. Peer-to-Peer file sharing systems such as Napster [57] and Gnutella [33], P2P file storage management systems such as PAST [15] and Oceanstore [49] as well as P2P extensions to Distributed DataBase Management Systems (DDBMS) such as PIER [38] and PeerDB [60] all fall into the LSD category.

A taxonomy of data grids has been provided in [87]. It compares data grids with related data management approaches such as CDNs, DDBMS and P2P systems. A functional perspective of data management that focuses on data location, integration, sharing and query processing as well as the different P2P systems that address these functionalities is given in [50]. A survey of P2P content distribution has been provided in [77]. It examines P2P architectures from the perspective of non-functional properties such as performance, security, fairness, fault-tolerance and scalability. Our survey is broader and tries to provide the equivalent survey for grids, P2P systems, CDNs and DDBMS. We also provide a scalability taxonomy that distinguishes our survey from others. Further, we discuss state of the art in several of these areas and discuss how ideas/concepts/techniques from one area can be applied to others. The reader must keep in mind that though the authors have made an effort to be unbiased, the survey has limitations as it is perceived through their looking glass.

Distributed & Object Systems Lab, Dept. of Computer Science & Engg., Indian Institute of Technology, Madras, India, http://dos.iitm.ac.in, {a vs, djrams.iitm.ernet.in}

The rest of the paper is organized as follows. Section 2 discusses the CND techniques for data management and includes DSMs and shared object spaces. Section 3 discusses the SID techniques and includes data management in grids, CDNs, and distributed mobile systems. Section 4 discusses P2P data management techniques. Section 5 explores the state of the art data management techniques in distributed systems. Section 6 concludes the paper and includes a taxonomy figure and gives directions for future research.

2. CND Techniques: Data Replication in DSMs and Shared Object Spaces. DSM provides an illusion of globally shared memory, in which processors can share data, without the application developer needing to specify explicitly where data is stored and how it should be accessed. DSM abstraction is particularly useful for parallel computing applications, as demonstrated by TreadMarks [10]. Collaborative applications such as on-line chatting and collaborative browsing would be easier to develop over a DSM.

Page based DSMs can be more efficient, due to the availability of hardware support for detecting memory accesses. But due to the larger granularity of sharing, page based DSMs may suffer from false sharing. Relaxed consistency models including Release Consistency (RC) and its variants such as lazy RC allow false sharing to be hidden more efficiently than strict consistency models [64]. Munin [25] was an early DSM system which focused on reducing the communication required for consistency maintenance. It provides a software implementation of RC. TreadMarks [10] is another DSM system that provides an implementation of release consistency. Java/DSM [91] provides a Java Virtual Machine (JVM) abstraction over TreadMarks. It is an example of page based DSMs, similar to Munin and TreadMarks.

Release consistency is a widely known relaxed consistency model for DSMs. Memory accesses are divided into synchronization (sync) and non-synchronization (nsync) operations. The nsync operations are either data operations or special operations not used for synchronization. The sync operations are further divided into acquire and release operations. An acquire is like a read operation to gain access to a shared location. A release is the complementary operation performed to allow access to the shared location. Acquire and release operations can be thought of as conventional operations on locks. There are two variations of RC: $RC_{sc}$, which realizes sequential consistency, and $RC_{pc}$, which realizes processor consistency. $RC_{sc}$ maintains program order from an acquire to any operation that follows it, from an operation to a release and between special operations. $RC_{pc}$ is similar, except that write to read program order is not maintained for special operations. Eager RC, as the original RC became subsequently known [48], requires ordinary shared memory accesses to be performed only when a subsequent release operation is due by the same processor. Lazy RC (LRC) is a variation of RC in which processors further delay performing modifications until subsequent acquires by other processors, and modifications are performed only at the acquiring processor. LRC intuitively assumes competing shared accesses to be separated by synchronization operations.
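The difference between eager and lazy RC can be illustrated with a toy simulation. The following is a minimal sketch under simplifying assumptions (one lock, whole-value updates, no diffs or timestamps); it is not how Munin or TreadMarks are implemented, and the names `Processor`, `EagerLock` and `LazyLock` are hypothetical.

```python
class Processor:
    """A node with a local copy of shared memory (hypothetical toy model)."""

    def __init__(self, name):
        self.name = name
        self.memory = {}    # local copy of shared locations
        self.pending = {}   # writes buffered since the last release

    def write(self, addr, value):
        self.memory[addr] = value
        self.pending[addr] = value


class EagerLock:
    """Eager RC: buffered writes are pushed to every processor at release."""

    def __init__(self, processors):
        self.processors = processors

    def acquire(self, proc):
        pass  # nothing to fetch; earlier releases already pushed their writes

    def release(self, proc):
        for other in self.processors:
            if other is not proc:
                other.memory.update(proc.pending)
        proc.pending = {}


class LazyLock:
    """Lazy RC: writes stay with the lock and reach only the next acquirer."""

    def __init__(self):
        self.notices = {}  # modifications made under this lock so far

    def acquire(self, proc):
        proc.memory.update(self.notices)  # pulled by the acquirer only

    def release(self, proc):
        self.notices.update(proc.pending)
        proc.pending = {}
```

With `LazyLock`, a processor that never acquires the lock never receives the modification, which is the communication saving that LRC aims for; with `EagerLock`, every processor is updated at each release.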

2.1. Shared Object Spaces. Object based DSMs (also known as shared object spaces) alleviate the false sharing problem by letting applications specify granularity of sharing. Examples of object based DSMs include Linda [24], Orca [36], TSpaces [4], JavaSpaces [81] as well as an object based DSM in the .NET environment [75]. Orca relies on an update mechanism based on totally ordered group communication to serialize access to replicas. Even though a study has shown that the overhead of totally ordered group communication affects application performance minimally [37], the study was done on a Myrinet cluster. Orca has not been evaluated on the Internet scale. TSpaces is a shared object space from IBM [4] that adds database functionality to Linda tuple space [24] and is implemented in Java to take advantage of its wider usability. In addition to the traditional Linda primitives of in, out, read, TSpaces supports set oriented operators and a novel rendezvous operator called rhonda. Global shared objects [90] allows heap objects in a JVM to be shared across nodes. Based on memory access patterns of applications, it also proposes various consistency mechanisms to be realized efficiently. However, it uses locks and per-object lock managers for keeping replicas consistent. It does not address failures of the lock manager. JavaSpaces specification from Sun [81] provides a distributed persistent shared object space using Java RMI and Java serialization. It provides Linda-like operations on the tuple space and uses Jini's transaction specification to achieve serializability of write operations. It also does not address fault tolerance, an important issue for Internet scale systems.
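The Linda primitives mentioned above can be sketched with a single-process tuple space. This is only an illustration under stated assumptions: it is not the TSpaces or JavaSpaces API, the operations here return `None` instead of blocking when no tuple matches, `None` in a template is used as a wildcard, and `in_` stands in for Linda's `in` because `in` is a reserved word in Python.

```python
class TupleSpace:
    """Toy, non-blocking, single-process tuple space (hypothetical sketch)."""

    def __init__(self):
        self._tuples = []

    def out(self, *fields):
        """Deposit a tuple into the space."""
        self._tuples.append(tuple(fields))

    @staticmethod
    def _matches(template, tup):
        # A template matches when lengths agree and every non-wildcard
        # field is equal to the corresponding tuple field.
        return len(template) == len(tup) and all(
            t is None or t == f for t, f in zip(template, tup)
        )

    def read(self, *template):
        """Return a matching tuple without removing it, or None."""
        for tup in self._tuples:
            if self._matches(template, tup):
                return tup
        return None

    def in_(self, *template):
        """Return a matching tuple and remove it from the space, or None."""
        tup = self.read(*template)
        if tup is not None:
            self._tuples.remove(tup)
        return tup
```

A centralized object like this one is exactly the kind of component that the observations later in this section identify as a scalability bottleneck.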

2.1.1. Globe. Globe [3] attempted to address the challenges of building software infrastructure for developing applications over the Internet. A key design objective of Globe was to provide a uniform model for distributed computing. This means that Globe provides a uniform way to access common services (such as naming, replication and communication) without sacrificing distribution transparency. Objects in Globe encapsulate policies for replication, migration, etc. Each object comprises multiple sub-objects, allowing an object to be physically distributed. The different sub-objects of an object include one each for semantics (functionality), communication (sending/receiving messages), replication and control flow. This helps the programmer to separate functionality from orthogonal non-functional properties such as replication. Objects also help in realizing distribution transparency by hiding implementation details behind well defined interfaces. The implementation framework of Globe is flexible, meaning that different implementations of the same interfaces are possible. It also provides an efficient mechanism for object lookups by using a tree based hierarchical naming space. It must be observed that distributed object middleware such as CORBA [61] also provide similar services such as naming and trading. But they cannot provide object-specific policies that can be provided in Globe.

2.2. Software Availability and Usage Summary. To the knowledge of the authors, TSpaces and JavaSpaces are widely used and are available as open source software. Linda is a specification and has been implemented by several groups. Orca and Globe are research prototypes; information on their deployment and use is not available.

2.3. Observations. We have proposed a generic scalability model for analyzing distributed systems in [6]. It takes the view that scalability of distributed systems should be analyzed considering related issues such as consistency, synchronization, and availability. We give below the essence of the model.

$scalability = f(avail, sync, consis, workload, faultload)$

avail is availability, which can be quantified as the ratio of the number of transactions accepted versus those submitted.

consis is consistency, itself a function of update ordering and consistency granularity. Update ordering refers to the update ordering mechanisms across replicas of an object and can be one of causal, serializable or PRAM. Consistency granularity refers to the grain size at which consistency needs to be maintained.

sync refers to synchronization among the replicas. The two dimensions of synchronization are how often the replicas are synchronized and the mode of synchronization (push/pull).

workload can be broken down into workload intensity (number of transactions per second or number of clients) and workload service demand characterization (CPU time for operations).

faultload refers to the failure sequences and the number as well as location of the replicas.
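Two of the measurable inputs of the model, availability and workload intensity, can be computed directly from a transaction trace. The sketch below only illustrates those two definitions; the function f itself is not specified here and is not implemented, and the class and field names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class TraceSummary:
    """Counts taken from one measurement interval (hypothetical names)."""

    submitted: int      # transactions submitted by clients
    accepted: int       # transactions the system accepted
    duration_s: float   # length of the interval in seconds

    def availability(self):
        # avail: ratio of transactions accepted to transactions submitted.
        return self.accepted / self.submitted

    def intensity(self):
        # workload intensity: submitted transactions per second.
        return self.submitted / self.duration_s
```

For instance, an interval of 10 seconds with 1000 submitted and 950 accepted transactions gives an availability of 0.95 at an intensity of 100 transactions per second.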

The scalability model given above is useful to identify bottlenecks in distributed systems. By applying the scalability model on shared object spaces, we have identified the key bottlenecks that inhibit existing shared object spaces (with the exception of Globe) from scaling up to the Internet:

Centralized Components

Many existing DSMs and shared object spaces have some centralized components that affect their scalability. For instance, Orca has a sequencer for realizing totally ordered group communication, while others like TSpaces [4] have a centralized component for object lookups.

Failures

Existing shared object spaces do not handle failures. For instance, JavaSpaces and global shared objects do not handle failures of the transaction coordinator, while Orca does not handle failure of the sequencer.

Object Lookup

Given an object identifier (id), efficient mechanisms must exist that map the id to the node that either stores a replica or stores meta-data about the replica. Existing shared object spaces such as TSpaces use centralized lookup mechanisms. Object lookup mechanisms in distributed object middleware such as CORBA and DCOM also have difficulty in handling failures and scaling up.

Consistency

Several existing DSM systems such as TreadMarks, Munin and shared object spaces such as JavaSpaces […] these mechanisms have not been evaluated in Internet scale systems. Peer-to-Peer (P2P) systems which have been scaled to the Internet, such as Pastry [69] and Tapestry [17], assume replicas are read-only.

3. SID Techniques for Data Management.

3.1. Computing Grids. Globus [39], a de-facto standard toolkit for grid computing systems, relies on explicit data transfers between clients and computing servers. It uses the GridFTP protocol [19] that provides an authentication based efficient data transfer mechanism for large grids. Globus also allows data catalogues, but leaves catalogue consistency to the application. The paper [51] explores the interfaces required for a Replica Management Service (RMS) that acts as a common entry point for replica catalogue service, meta-data access as well as wide area copy. It does not address consistency issues per se. Further, the RMS is centralized and may not scale up. The other grid paper that has addressed data management issues [29] outlines possible use-cases and gives a higher level view of the data management requirements in a grid. The quorum scheme it describes for handling read-write may have to be modified in an Internet kind of an environment to handle quorum dynamics. Further, it does not address various granularities of replication and uses locks for synchronization. The paper [78] also addresses read-write data consistency in a grid environment based on a lazy update propagation algorithm. The update propagation algorithm is based on timestamps and may not scale up to work in a large scale grid environment (update conflicts are handled manually by the application programmer, a non-trivial task). Attempts have also been made to extend the existing 2 Phase Commit (2PC) based algorithms [82]. These would need global agreement and may be expensive in an Internet setting.

3.2. Data Grids. A generic architecture for handling large data sets in grid computing environments has been proposed in [27]. It describes the way data grid services such as replication and replica selection can be built over basic services of data and meta-data access. It assumes that replicas (file instances) are read-only.

GriPhyN [31] attempts to support large-scale data management in High Energy Physics (HEP) applications as well as for astronomy and gravitational wave physics. GriPhyN provides users transparent access to both raw and processed data (the term virtual data is used to refer to both). It can convert raw data to processed data by scheduling required computations and data transfers. GriPhyN is built on top of Globus. It takes application meta-data and maps it into a Directed Acyclic Graph (DAG), which is an abstract representation of the required actions on data sets. A request planner takes the DAG and transforms it into a concrete DAG, which can be executed by a grid scheduling system such as Condor-G [42].
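The step from an abstract DAG to an executable ordering can be sketched with Python's standard `graphlib` module (Python 3.9 or later). This is an illustration only, not GriPhyN's planner: the task names, the site names and the `concretize` helper are hypothetical, and a real planner would also choose replicas and insert data transfers.

```python
from graphlib import TopologicalSorter

# Hypothetical abstract DAG: each task maps to the set of tasks it depends on.
abstract_dag = {
    "transfer_raw": set(),
    "reconstruct": {"transfer_raw"},
    "analyze": {"reconstruct"},
    "transfer_result": {"analyze"},
}


def plan(dag):
    """Return the tasks in an order that respects every dependency."""
    return list(TopologicalSorter(dag).static_order())


def concretize(dag, site_of):
    """Bind each abstract task to a site, giving a toy 'concrete' plan."""
    return [(task, site_of[task]) for task in plan(dag)]
```

A scheduler can then submit the tasks in the returned order, or in parallel wherever the dependencies allow.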

CERN, the European organization for nuclear research, is also involved in handling computation on large data sets in the HEP area. Object level as well as file level replication for data grids has been explored in [79], a CERN effort. It also assumes files are read only and can be replicated without need for consistency protocols. They support replica catalogs to handle meta-data. Actual file/object transfers are achieved using GridFTP [19].

Data related activities on the grid such as queuing, monitoring and scheduling need to be carefully managed, as data could become a bottleneck for data intensive applications. Currently, these data related tasks are performed manually or by simple scripts. The main goal of Stork [85] was to make data a first class citizen on the grid. Data placement jobs have different characteristics from compute intensive jobs and so may have to be treated differently. Stork is a separate scheduler for scheduling and managing data intensive jobs on the grid. Data related activities are represented in the form of a DAG. Stork can interact with higher level planners such as Directed Acyclic Graph Manager (DAGman), which is a part of Condor-G. Enhancements have been made to DAGman to make it submit compute intensive jobs to grid schedulers such as Condor-G and data intensive jobs to Stork. Stork also supports different heterogeneous storage systems and various data transfer protocols. Case studies have demonstrated the use of Stork as a pipeline between two heterogeneous storage systems and for runtime adaptation of data transfers.

3.3. Content Distribution Networks. Web servers had difficulty in handling the flash crowd problem. The flash crowd problem refers to a large number of requests coming in suddenly, overwhelming the server's bandwidth, or CPU or back-end transaction infrastructure. Web servers have bursty request nature, for instance during a football match in World Cup or during an election counting process, resulting in the flash crowd problem. Content Distribution Networks (CDNs) such as Akamai [43] have been proposed to handle this problem and to enable web servers to scale up. A separate infrastructure of dedicated servers spread across the Internet was built by several companies to offload content distribution from web servers or to deliver content from the edge of the Internet. Akamai's CDN consists of over twelve thousand servers across thousand different […]

Studies have shown that caching is beneficial in CDNs as they mainly deliver images or videos (static content) [44]. Akamai CDNs achieved cache hit rates of nearly 88% in another study that compared the CDNs with P2P file sharing systems for distributing content [76]. This shows that CDNs are beneficial for content delivery and can reduce response time for clients. However, another study has shown that the average response time for clients is not affected by employing CDNs [44]. But they avoid the worst case of badly performing servers rather than routing client requests to an optimal CDN server.

Cache consistency becomes a challenging issue in order to deliver non-static content to clients. Traditional caching mechanisms such as leasing [22] may not be directly applicable to CDNs. Origin servers would have to keep track of each CDN proxy that caches an object (web document) from the server. The origin server must also manage the lease related issues for that CDN proxy, including notifying the CDN proxy on updates to the object. The CDN proxy has to renew the lease to receive further notifications. Mechanisms for CDNs must be scalable, requiring the CDN proxies to cooperatively maintain consistency. Cooperative leases have been proposed as a scalable mechanism for maintaining cache consistency in CDNs [12, 11]. Each object is assigned a $\Delta$ parameter, which indicates the time or the rate $1/\Delta$ at which an origin server notifies interested CDN proxies of updates to that object. This allows consistency to be relaxed, implying that a CDN proxy can be notified only once every $\Delta$ time units, instead of after every update. Leases are cooperative, meaning that a CDN proxy acts as a leader for a CDN proxy group for lease related interactions with an origin server. The leader is responsible for notifying the other CDN proxies. This reduces both the state maintained at the origin server and the number of updates it must send.
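The two ideas behind cooperative leases, rate-limited notification and a leader that relays to its group, can be sketched as follows. This is a simplified illustration and not the protocol of [12, 11]: lease expiry and renewal are omitted, a suppressed update is simply dropped rather than sent once the interval elapses, and the class and parameter names are hypothetical.

```python
class Proxy:
    """A CDN proxy; `members` is non-empty only on a group leader."""

    def __init__(self, name):
        self.name = name
        self.cache = {}
        self.members = []

    def notify(self, obj, value):
        self.cache[obj] = value
        for member in self.members:   # the leader relays to its group
            member.notify(obj, value)


class OriginServer:
    """Keeps one lease (one leader) per object instead of one per proxy."""

    def __init__(self):
        self.values = {}
        self.delta = {}      # obj -> minimum time between notifications
        self.leaders = {}    # obj -> leader proxy holding the lease
        self.last_sent = {}  # obj -> time of the last notification

    def grant_lease(self, obj, leader, delta):
        self.leaders[obj] = leader
        self.delta[obj] = delta

    def update(self, obj, value, now):
        """Apply an update; notify the leader at most once per delta."""
        self.values[obj] = value
        last = self.last_sent.get(obj)
        if last is None or now - last >= self.delta[obj]:
            self.leaders[obj].notify(obj, value)
            self.last_sent[obj] = now
            return True
        return False  # notification suppressed in this sketch
```

The origin server holds state for one leader per object, and each update costs it at most one message, regardless of how many proxies are in the group.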

3.4. Data Management in Distributed Mobile Systems. Distributed Mobile Systems (DMS) are distributed systems in which some nodes may be mobile and may have constraints. These constraints could be battery or memory or computing power related. Data could either be stored on or be accessed from mobile devices. Different kinds of management have been identified in [54], with respect to the level of transparency to applications. Client transparent adaptation allows applications to seamlessly access data without being aware of mobility, with the system providing complete support. The other extreme is a laissez-faire model in which adaptation is entirely at user level, with the system providing no support. There is a wealth of strategies between the two extremes that allow applications to be aware of mobility in varying degrees, including application aware adaptation and extended client server models.

Coda [74] was one of the early file systems that allows clients to seamlessly access information, an example of client transparent adaptation. The main goal of Coda was to enable operations to be performed on a shared data repository, even in the face of disconnected operations. Disconnections may be frequent in DMS. Venus is the cache manager on each client that manages the cache, hiding mobility from the application. Venus caches volume mappings, with a volume referring to a subtree of the Coda namespace. In the face of connected operations, Coda uses server replication and callback based cache coherence to ensure session semantics (contents will be latest when a session is starting and after it ends) for applications. During disconnections, Venus relies on cache contents and propagates failure to the application when a cache miss occurs. When disconnection ends, Coda reverts back to server replication by using reintegration operations using logs.

Application aware adaptation has been used in the Odyssey system [21]. Odyssey provides a clean separation between the concerns of the system and the application: the system monitors resource dynamics and notifies applications if required, but retains control of the resource allocation mechanism, while applications specify the mapping of resource levels to fidelity levels. Fidelity is defined as the degree to which client data matches with the server's. It has multiple dimensions: consistency, frame rate and image quality for video data, as well as resolution for spatial data. Building a system that allows diverse fidelity levels necessitates type awareness: client code is responsible for handling particular data types. This is achieved through the use of wardens, which are specialized code components that encapsulate system level support at the client. Wardens are subordinate to Viceroy, which is responsible for centralized resource management.

Odyssey is an example of client based application aware adaptation. Rover [13] is a system that allows client-server adaptation. This means that some code required for adaptation would also reside in the server. Rover uses the concept of Relocatable Dynamic Objects (RDOs) for data types handled by the application. The application programmer splits the program containing RDOs into those that reside on the client and those that run on servers. This requires that the adaptation code be resident on origin servers. Another approach has been taken to avoid this, named as proxy based adaptation. The adaptation is done by the proxy, which acts on behalf of clients. The Barwan project [30] is an example. Flexible client server model for application aware […] consistency, an unbounded consistency mechanism that allows replicas to diverge, but be consistent after an unspecified time.

3.5. Software Availability and Usage Summary. Globus is a widely used toolkit and is available as open source software. Stork is a research prototype, while GriPhyN and CERN have been deployed and used. Akamai's CDNs are widely deployed and used, while cooperative leases [12] is a research prototype. Coda and Odyssey are the distributed mobile systems software that are widely deployed and used.

4. Large Scale Data Management Techniques.

4.1. P2P Data Management. We first give an overview of P2P file sharing systems starting from the initial unstructured P2P systems such as Napster to super-peer systems such as Kazaa before discussing structured P2P systems. We go on to discuss P2P storage management systems such as Oceanstore.

4.1.1. P2P File Sharing Systems. P2P as an area became popular only after the advent of Napster, a file sharing system. Napster [57] was used for sharing music files. Meta-data about files is stored in a global directory, which is stored in a centralized server. The meta-data stored information about music files themselves, which were downloaded from peers. Gnutella [33] came up with a decentralized search protocol for file sharing applications. Gnutella can be seen to be a purely decentralized unstructured P2P system. The term unstructured refers to the lack of structure in the overlay, which is mostly a random graph. Search was achieved by flooding the network or by using random walks. Freenet added a mechanism to route requests to possible content locations, based on best effort semantics. Freenet also adds a notion of anonymity to the data shared. The main advantage of the unstructured P2P systems was that complex queries could be easily handled. By complex queries, we mean queries such as get all nodes with processing speed > 3 GHz and RAM > 1 GB and storage > 100 GB. This is because the query is sent to each node and evaluated explicitly. However, deterministic guarantees for searching are difficult to provide in these systems.
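Flooding with a hop limit shows both properties described above: arbitrary predicates are easy to evaluate, but a match outside the flooding radius is missed. The following is a minimal sketch, not the Gnutella wire protocol; the overlay, the resource table and the function names are hypothetical.

```python
from collections import deque

# Hypothetical overlay (adjacency lists) and per-node resources.
OVERLAY = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c"]}
RESOURCES = {
    "a": {"cpu_ghz": 2.0, "ram_gb": 0.5, "storage_gb": 50},
    "b": {"cpu_ghz": 3.2, "ram_gb": 0.5, "storage_gb": 200},
    "c": {"cpu_ghz": 3.4, "ram_gb": 2.0, "storage_gb": 500},
    "d": {"cpu_ghz": 3.6, "ram_gb": 4.0, "storage_gb": 1000},
}


def wants(res):
    """The complex query from the text, evaluated locally at each node."""
    return res["cpu_ghz"] > 3 and res["ram_gb"] > 1 and res["storage_gb"] > 100


def flood_query(overlay, resources, start, predicate, ttl):
    """Return the nodes within `ttl` hops of `start` that satisfy `predicate`."""
    matches, seen = set(), {start}
    queue = deque([(start, 0)])
    while queue:
        node, hops = queue.popleft()
        if predicate(resources[node]):
            matches.add(node)
        if hops == ttl:
            continue  # hop limit reached: do not forward further
        for neighbour in overlay[node]:
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append((neighbour, hops + 1))
    return matches
```

Starting from node `a` with a hop limit of 2, node `d` satisfies the query but is never reached, which is why such systems cannot give deterministic search guarantees.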

Initial attempts at introducing structure to the overlay in P2P systems resulted in super-peer systems, with some nodes (which have better capabilities) acting as super-peers. The other nodes act as clients to the super-peers, which form a P2P overlay among themselves. Super-peers made searching more efficient for complex queries, by exploiting the heterogeneous nature of nodes (some nodes have better capabilities and more importantly, better connectivity than others). An example of a popular super-peer system is Kazaa (http://www.kazaa.com). However, handling super-peer failures requires replicating super-peers (otherwise the clients may become disconnected). K-replicas can be created in each cluster, resulting in reduced load on the super-peers [93]. However, this may make replicas client aware. Other design issues in super-peer systems include cluster size and dynamic layer management. A large cluster size is good for aggregate bandwidth, but may create bottlenecks. A small cluster size avoids bottlenecks, but may reduce search efficiency. Dynamic layer management allows nodes to play super-peer or client nodes adaptively, thereby making the super-peer network more efficient [95].

The third generation of P2P systems introduced structure in the overlay network. The motivation came from providing deterministic search guarantees, partitioning the load over the available machines effectively, scaling to large numbers and achieving fault-tolerance. The Distributed Hash Table (DHT) was mainly used as the structure for overlay formation. It was based on the Plaxton data structure [23]. Nodes are given identifiers (ids) from an id space. Application objects are also given ids from the same space. The DHT provides a mapping from the application object id (key) to the node id that is responsible for that key. Each node has a routing table consisting of neighbours and performs routing functions to look up objects. Various DHTs have been proposed, each having different routing algorithms and routing table maintenance. Geometric interpretations of DHTs have been given in [45] (but the focus of that paper was mainly to study the static resilience of DHTs). Chord [40] is based on a ring, Content Addressable Network (CAN) is based on a hypercube, the Plaxton data structure is based on a tree, and Pastry [69] is a hybrid geometry combining the tree and the ring. We discuss some of these structured P2P systems in more detail below.

Chord provides the lookup abstraction of DHTs through the method lookup(key), which maps a key to a node responsible for it. Chord uses consistent hashing to assign m-bit identifiers to both Chord nodes and application objects. The ids are arranged in a ring fashion (modulo $2^m$). A key $k$ maps to the first node whose id is equal to or follows $k$ in the identifier space (this node is known as successor($k$)). Each node maintains a pointer to its successor in the ring. Routing proceeds along the ring till a key is straddled between two node ids, with the second node id being the destination. Each node also maintains information on $O(\log(N))$ (for […] table were to fail, only efficiency is affected, but not correctness. As long as each node is able to connect to its successor, routing is guaranteed to finish in $O(\log(N))$ time.
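The key-to-node mapping and the successor-only routing described above can be sketched as follows. This is a minimal illustration and not the Chord implementation: it keeps a global sorted list of node ids, which no real node has, and it walks successor pointers only, without the additional routing state that speeds lookups up. The names `chord_id`, `successor` and `lookup_by_successors` are hypothetical.

```python
import hashlib
from bisect import bisect_left

M = 16  # identifier length in bits (an arbitrary choice for this sketch)


def chord_id(name, m=M):
    """Hash a node address or object name to an m-bit identifier."""
    digest = hashlib.sha1(name.encode()).digest()
    return int.from_bytes(digest, "big") % (2 ** m)


def successor(node_ids, key):
    """First node id equal to or following `key`; `node_ids` must be sorted."""
    i = bisect_left(node_ids, key)
    return node_ids[i % len(node_ids)]  # wrap around the ring


def in_interval(x, a, b, size):
    """True if x lies in the half-open ring interval (a, b] modulo size."""
    if a == b:  # a single node owns the whole ring
        return True
    return 0 < (x - a) % size <= (b - a) % size


def lookup_by_successors(node_ids, start, key, m=M):
    """Walk successor pointers from `start` until `key` is straddled."""
    size = 2 ** m
    ring = sorted(node_ids)
    current, hops = start, 0
    while True:
        nxt = ring[(ring.index(current) + 1) % len(ring)]
        if in_interval(key, current, nxt, size):
            return nxt, hops
        current, hops = nxt, hops + 1
```

On a 6-bit ring with nodes 1, 8, 14, 21, 32, 38, 42, 48, 51 and 56, key 54 maps to node 56 and key 60 wraps around to node 1.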

CAN routes over a hypercube. Each CAN node stores a chunk (or zone) of the hash table. Each node also stores information on adjacent zones in the table. This is again to speed up routing. Lookup requests for a particular key are routed towards a CAN node whose zone contains that key. Requests are routed by correcting bits ($n$ bits for an $n$-dimensional hypercube). Generally tree based DHTs such as the Plaxton data structure allow bits to be corrected in order (from MSB to LSB of key), while hypercube based DHTs allow bit correction in any order. This makes routing more resilient to node/link failures.
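The resilience argument can be made concrete with the abstract bit-correction view used above, in which node ids are $n$-bit strings and neighbours differ in exactly one bit. The sketch below is an illustration of that abstraction only, not of CAN's zone-based routing; the function names and the greedy, non-backtracking strategy are assumptions of this example.

```python
def differing_bits(current, dst, n):
    """Bit positions where `current` and `dst` differ, MSB first."""
    return [i for i in range(n - 1, -1, -1) if ((current ^ dst) >> i) & 1]


def route(src, dst, n, alive, any_order=True):
    """Greedy bit-correcting route from src to dst over live nodes.

    With any_order=False only the most significant differing bit may be
    corrected at each hop (tree-like); with any_order=True any differing
    bit whose neighbour is alive may be corrected (hypercube-like).
    Returns the path as a list of node ids, or None if the route is stuck.
    """
    path, current = [src], src
    while current != dst:
        candidates = differing_bits(current, dst, n)
        if not any_order:
            candidates = candidates[:1]
        for bit in candidates:
            nxt = current ^ (1 << bit)
            if nxt in alive:
                break
        else:
            return None  # every permitted next hop has failed
        current = nxt
        path.append(current)
    return path
```

If node 4 fails in a 3-dimensional hypercube, the MSB-first route from 0 to 7 is blocked at the first hop, while the any-order route detours through node 2 and still arrives.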

Pastry can be viewed as having a hybrid geometry due to its use of tree based routing and ring like neighbour formation. It provides a route abstraction to applications. The route(msg, key) ensures that the message with a given id is routed to a node with the closest matching id as key among all live nodes. Each node keeps track of its immediate neighbours in the node id space by maintaining leaf sets. They also store information about a few other nodes that have prefix matching ids in the form of a routing table. Pastry takes into account network locality in routing. This means that a given message will be routed to the nearest node that is alive and that has the closest matching id as the key. Routing takes place by prefix matching, with each hop taking the message one bit closer in the node id space, resulting in $O(\log(N))$ hops.

4.1.2. P2P File Storage Systems. Ivy [56] is a read/write P2P file system that provides an NFS-like abstraction for programmers. Ivy provides NFS-like semantics in a failure free environment. Under network partitions and failures, Ivy uses logs to allow applications to detect and resolve conflicts. Ivy logs are specific to each participant and host. The logs are stored in DHash, a DHT based P2P block storage system over which Ivy is built. Participants can read other logs, but each participant writes only to its own log while updating the file system. Ivy uses version vectors to detect conflicting updates and provides information to application level conflict resolvers. The Ivy system demonstrated performance within a factor of 2-3 of NFS performance in a WAN testbed.
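Conflict detection with version vectors reduces to a component-wise comparison. The following is a generic sketch of that comparison, not Ivy's code; representing a vector as a dictionary from participant name to update count, and the name `compare`, are assumptions of this example.

```python
def compare(a, b):
    """Compare two version vectors (dicts: participant -> update count).

    Returns "equal", "before" (a happened before b), "after" (b happened
    before a), or "concurrent" (neither dominates: a conflict to resolve).
    """
    participants = set(a) | set(b)
    a_le_b = all(a.get(p, 0) <= b.get(p, 0) for p in participants)
    b_le_a = all(b.get(p, 0) <= a.get(p, 0) for p in participants)
    if a_le_b and b_le_a:
        return "equal"
    if a_le_b:
        return "before"
    if b_le_a:
        return "after"
    return "concurrent"
```

Only the "concurrent" case needs to be handed to an application level conflict resolver; in the other cases one version simply supersedes the other.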

PAST [15] is an Internet based P2P storage utility. It offers persistent storage services, availability, security and scalability. PAST provides insert, reclaim and retrieve operations on files. Since a file cannot be inserted multiple times, files are assumed to be immutable in PAST. It must be noted that PAST is an extension of Pastry to provide a file storage system. On insertion of a file into PAST, the file is routed by Pastry to $k$ nodes with closest matching ids as the file id and that are alive. The set $k$ will be diverse with respect to location, capabilities and connectivity due to the randomization of the identifier space. File availability is ensured as long as all $k$ nodes do not fail simultaneously. It provides security using optional smart cards that are based on a public-key cryptosystem.

Oceanstore [49] is an Internet based file system that provides persistence and availability of files by using a two-tiered system. The upper tier consists of capable machines with good connectivity. These machines act as an inner circle of servers for serializing updates. The lower tier consists of less capable machines which only provide storage resources to the system. Pond [67] is an Oceanstore realization that provides fault tolerant durable storage to applications. It uses erasure coding to store data. Erasure coding [20] is a technique that allows a block to be split into $m$ fragments, which are encoded into $n$ fragments ($n > m$). The key property of erasure coding is that it ensures that the block can be reconstructed from any $m$ of the $n$ coded fragments. Oceanstore uses Tapestry [17], another DHT, to store the erasure coded fragments (based on fragment number + block id). Oceanstore uses primary copy replication to ensure consistency of file blocks. It handles read/write data by a versioning mechanism in which any write operation creates a new version of the data. The problem is then reduced to one of finding the most recent version of the file.
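The "any $m$ of $n$" property can be demonstrated with the simplest possible erasure code, a single XOR parity fragment, for which $n = m + 1$. This is a sketch of the property only and is not the code used by Pond, which tolerates the loss of more than one fragment; the function names are hypothetical and the block length is assumed to be a multiple of $m$.

```python
from functools import reduce


def xor_bytes(a, b):
    """Byte-wise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def encode(block, m):
    """Split `block` into m fragments and append one XOR parity fragment."""
    if len(block) % m:
        raise ValueError("block length must be a multiple of m")
    size = len(block) // m
    fragments = [block[i * size:(i + 1) * size] for i in range(m)]
    return fragments + [reduce(xor_bytes, fragments)]


def decode(fragments, m):
    """Rebuild the block from n = m + 1 fragments, at most one being None."""
    missing = [i for i, f in enumerate(fragments) if f is None]
    if len(missing) > 1:
        raise ValueError("this code tolerates only one lost fragment")
    fragments = list(fragments)
    if missing:
        present = [f for f in fragments if f is not None]
        # XOR of all surviving fragments equals the lost one.
        fragments[missing[0]] = reduce(xor_bytes, present)
    return b"".join(fragments[:m])
```

Whichever single fragment is lost, data or parity, the remaining $m$ fragments are enough to recover the original block.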

4.1.3. Observations. Ivy has the disadvantage that it leaves write conflict resolution to the application, limiting its scalability. PAST provides a persistent caching and storage management layer on top of Pastry. It provides insert, lookup and reclaim operations on files. However, it also assumes files are immutable, as files cannot be inserted multiple times with the same id. Oceanstore's versioning mechanism has not been proved scalable. The evaluations of Oceanstore and Pond [67] have not considered conflicting write operations and have assumed there is a single write per data block. Moreover, Oceanstore assumes an inner circle of reliable servers to ensure consistency. Further, all three storage systems (Ivy, PAST and Oceanstore) have been built over DHTs. DHTs provide support for only limited queries (of the exact matching kind) and may not allow application specific criteria for data placement. In the words of [47], virtualization (through DHTs) destroys


4.2. P2P Extensions to DDBMS. A simplistic view of a traditional distributed database management system is that it uses a centralized server to provide a global schema and ACID properties through transactions. Several approaches have extended these techniques to work in a decentralized manner, to apply to Internet or P2P systems. Active XML [9] provides dynamic XML documents over web services for distributed data integration. It is a model for replicating (whole file) and distributing (parts of a file) XML documents by introducing location aware queries in XPath and XQuery. It also provides a framework by which peers perform decentralized query processing in the presence of distribution and replication. It allows peers to optimize localized query evaluation costs by a series of replication steps.

Edutella [58] attempts to design and implement a schema based P2P infrastructure for the semantic web. It uses the W3C standards RDF and RDF Schema as the schema language to annotate resources on the web. It uses RDF-QEL as an expressive query exchange language to retrieve the data stored in the P2P network. It uses super-peer routing indices that include schema and other index information.

Piazza [83] is a peer data management system that facilitates decentralized sharing of heterogeneous data. Each peer contributes schemas, mappings, data and/or computation. Piazza provides query answering capabilities over a distributed collection of local schemas and pairwise mappings between them. It essentially provides a schema mediation mechanism for data integration over a P2P system.

P2P Information Exchange and Retrieval (PIER) [38] is a P2P query engine for query processing in Internet scale distributed systems. PIER provides a mechanism for scalable sharing and querying of fingerprint information, used in network monitoring applications such as intrusion detection. It provides best effort results, as achieving ACID properties may be difficult in Internet scale systems. The query engine does not assume that data is loaded into databases on all peers, but that it is available in its natural habitat in file systems. PIER is realized over CAN, the hypercube based P2P system.

PeerDB [60] is an object management system that provides sophisticated searching capabilities. PeerDB is realized over BestPeer [59], which provides P2P enabling technologies. PeerDB can be viewed as a network of local databases on peers. It allows data sharing without a global schema by using meta-data for each relation and its attributes. A query proceeds in two phases: in the first phase, relations that match the user's search are returned by searching on neighbours. After the user selects the desired relations, the second phase begins, where queries are directed to nodes containing the selected relations. Mobile agents are dispatched to perform the queries in both phases.

4.3. Software Availability and Usage Summary. Gnutella and Napster have been widely deployed and used. Chord is a research prototype that is also available as open source software. Pastry is also available as open source software and has also been used widely. CAN and Ivy are research prototypes about which deployment information is not available. PAST and Oceanstore are research prototypes that have been deployed and used in the PlanetLab testbed.

Edutella is available as open source software. The authors do not have information on the deployment/availability of the other research prototypes Piazza, PeerDB and Active XML. PIER has been deployed in the PlanetLab testbed.

5. State of the Art Data Management.

5.1. SID Techniques: State of the Art.

5.1.1. P2P Techniques in Grids. JuxMem [2] provides a data sharing service for grids by integrating DSM concepts with P2P systems. It is realized over (Juxtapose) JXTA [34], an emerging framework for developing P2P applications. JuxMem uses cluster advertisements to advertise the amount of memory each peer can provide to the global storage. It is organized into a federation of clusters, with each cluster having a Cluster Manager (CM). The CM is responsible for storing all cluster advertisements in its group. The CMs across clusters form a DHT. Specifically, the amount of memory provided in the cluster advertisement is hashed and the CM with the closest matching id in the DHT stores this advertisement. When a client asks for a block of memory with a given rounded size (only fixed size blocks can be supported), the size is hashed and the cluster advertisement which provides that size is retrieved from the CM with the closest matching id. The cluster advertisement has the details of the actual storage provider. Recent extensions to JuxMem [14] provide mechanisms to decouple consistency protocols from fault-tolerance mechanisms. This allows the use of standard DSM consistency protocols to integrate fault-tolerance components. In particular, DSM consistency schemes


well as an atomic multicast protocol, which is achieved by using consensus protocols based on Failure Detectors (FDs) [26]. The data sharing mechanisms of JuxMem have only been evaluated at the cluster level.

The replica location problem has been addressed in grids using P2P concepts in [5]. It proposes a P2P realization of the Replica Location Service (RLS), a key component of data grids. The Logical File Name (LFN) is hashed to give the identifier for a replica. The node whose id most closely matches the LFN hash contains the LFN to Physical File Name (PFN) mapping. This is the meta-data stored in RLS for file lookup. It also proposes an update protocol to handle consistency of meta-data. The RLS realization is based on Kademlia [63]. Kademlia is a structured P2P system that uses a novel XOR metric for routing: the distance between two nodes is defined as the eXclusive OR (XOR) of their numeric ids. A Kademlia node forms log(n) neighbours, where neighbour $i$ is at XOR distance $[2^{i}, 2^{i+1}]$. The neighbour set is the same as that formed by the tree based DHT PRR [23]. Even the failure-free routing in Kademlia is similar to PRR, in that bits are corrected from left to right. However, in the case of failures, the XOR metric allows bits to be corrected in any order. This implies that the static resilience of Kademlia is better compared to PRR [45].
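The XOR metric and the resulting neighbour ranges can be sketched in a few lines. This is an illustrative fragment with hypothetical function names, not code from Kademlia or from the RLS realization of [5]; it assumes integer ids and reads the neighbour range as the half-open interval 2^i <= d < 2^(i+1), so that every distance falls in exactly one range.

```python
from typing import Iterable, List


def xor_distance(a: int, b: int) -> int:
    """Kademlia-style distance: bitwise XOR of the two numeric ids."""
    return a ^ b


def bucket_index(own_id: int, other_id: int) -> int:
    """Index i of the neighbour range with 2**i <= distance < 2**(i + 1)."""
    d = xor_distance(own_id, other_id)
    if d == 0:
        raise ValueError("a node is not its own neighbour")
    return d.bit_length() - 1


def closest_nodes(target: int, node_ids: Iterable[int], count: int) -> List[int]:
    """The count node ids nearest to target under the XOR metric."""
    return sorted(node_ids, key=lambda n: xor_distance(n, target))[:count]
```

Since XOR is symmetric and any differing bit reduces the distance when corrected, a lookup can make progress through whichever neighbour is still alive, which is the intuition behind the static resilience argument above.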

5.1.2. Replica Placement in CDNs. Optimal placement of replicas in CDNs is a non-trivial task and has not been addressed. QoS aware replica placement was proposed in [92] to meet the QoS requirements of clients with the objective of minimizing the replication cost. The replication cost includes the cost of storage and consistency management, while QoS is specified in terms of distance metrics such as hop count. Two problems are formulated: replica-aware and replica-blind. In the replica-aware model, the CDN servers are aware of where object replicas are stored in the CDN network. This helps the servers to redirect client requests to the nearest replica. In the replica-blind model, application or network level routing ensures client requests are routed to CDN servers, with servers being oblivious to replica location. Each replica (CDN server) serves the requests coming to it. Dynamic programming techniques are used to arrive at near optimal solutions for the optimal replica placement problem, which is shown to be NP-complete.
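To make the replica-aware problem statement concrete, the following sketch searches exhaustively for the cheapest set of servers such that every client has a replica within a hop-count bound. It is an assumption-laden illustration: exhaustive search is substituted for the dynamic programming techniques of [92], it is only practical for a handful of servers, and all names and data structures are hypothetical.

```python
from itertools import combinations
from typing import Dict, List, Optional, Set, Tuple


def cheapest_placement(
    servers: List[str],
    clients: List[str],
    hops: Dict[str, Dict[str, int]],  # hops[client][server] = hop count
    cost: Dict[str, int],             # cost[server] = replication cost
    max_hops: int,
) -> Optional[Tuple[int, Set[str]]]:
    """Return (total cost, replica set) of a cheapest feasible placement, or None."""
    best: Optional[Tuple[int, Set[str]]] = None
    for size in range(1, len(servers) + 1):
        for subset in combinations(servers, size):
            # QoS constraint: every client reaches some replica within max_hops.
            feasible = all(
                any(hops[c][s] <= max_hops for s in subset) for c in clients
            )
            if feasible:
                total = sum(cost[s] for s in subset)
                if best is None or total < best[0]:
                    best = (total, set(subset))
    return best
```

The number of candidate subsets grows exponentially with the number of servers, which is consistent with the NP-completeness result mentioned above and explains the interest in near optimal heuristics.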

5.1.3. Distributed Mobile Storage System. Segank [80] provides an abstraction of a shared storage system for heterogeneous storage elements. The motivation was that traditional mechanisms for managing data in distributed mobile environments, such as Coda and Bayou, have time consuming merge operations. In Coda, updates are released to the server before becoming visible on clients. If servers are physically far away, this could increase the time after which updates become visible. Bayou uses full replication, leading to potentially expensive merge operations. Segank handles the data location problem, in which data could be located on any subset of devices, by using a location and topology sensitive multicast-like operation (named segankast). It allows lazy P2P propagation of invalidation information to handle consistency of replicated data. It also uses a distributed snapshot mechanism to ensure a consistent image across all devices for backup. It must be observed that Segank uses only unstructured P2P system concepts. This implies that Segank cannot provide deterministic search guarantees.

5.2. Large Scale Data Management: State of the Art. We shall explain the current state of the art in P2P data management along four directions: integrating structured and unstructured P2P systems, providing Quality of Service (QoS) guarantees in P2P systems, composable consistency for P2P systems and large scale DHT deployment. We also explain the state of the art in P2P DBMS.

5.2.1. Integrating Structured and Unstructured P2P Systems. An attempt has been made in [55] to improve structured P2P systems along three directions where they were traditionally known to perform worse compared to unstructured P2P systems: handling churn, exploiting heterogeneity and handling complex queries. In P2P systems, node/network dynamics resulting in routing-table updates and/or data movement is known as churn. The paper [55] shows that MS Pastry, an implementation of Pastry, can handle churn well by using a periodic routing table maintenance protocol. This protocol updates failed routing table entries. It also has a passive routing table repair protocol. The authors demonstrate that by exploiting structure, MS Pastry can handle churn better than unstructured P2P systems. Heterogeneity is difficult to handle in structured P2P systems due to constraints on data placement and neighbour selection. MS Pastry handles heterogeneity in two ways: first, by using super-peer concepts; second, by modifying neighbour selection to handle capacity. MS Pastry is also extended to handle complex queries by introducing new techniques for flooding or random walks. Flooding is achieved by sending the message to all nodes in the routing table. Random walk is achieved by using a tag containing the set of nodes to visit, a queue of nodes in the routing table row and a bound on the number of rows to traverse. A few other efforts have also been made recently to make structured P2P systems handle


range queries [16], multi-dimensional queries [65] as well as a query algebra [73]. A Scalable Wide Area Resource Discovery (SWORD) [62] has been built to realize resource discovery over WANs by supporting multi-attribute range queries over DHTs.

Another approach to integrate structured and unstructured P2P systems has been made in the Vishwa computing grid middleware [53]. Vishwa uses the task management layer to handle initial task deployment and load adaptability of the tasks. The task management layer is realized using unstructured P2P concepts and allows capability based resource clustering. The reconfiguration layer of Vishwa is realized as a structured P2P layer and stores information needed to handle node/network failures. The two-layered architecture has also been used for data management in Virat [1,7]. Virat provides a shared object space abstraction over a wide-area distributed system. Virat has been extended to a replica management middleware for P2P systems [8]. The unstructured layer forms neighbours based on node capabilities (in terms of processing power, memory available, storage capacity and load conditions). A structured DHT is built over this unstructured layer by using the concept of virtual nodes. Virat achieves dynamic replica placement on nodes with given capabilities, which would be very useful in computing/data grids. A detailed performance comparison is also made with a replica mechanism realized over OpenDHT [68], a state of the art structured P2P system. It has been demonstrated that the 99th percentile response time for Virat does not exceed 600 ms, whereas for OpenDHT it goes beyond 2000 ms in an Internet testbed.

5.2.2. Composable Consistency for P2P Systems. A flexible consistency model known as composable consistency, suitable for a variety of P2P applications, has been proposed in [72]. The authors initially surveyed consistency requirements for P2P applications such as personal file access, real time collaboration and database or directory services. The survey showed that different applications need different semantics for read/write and for replica divergence. The main contribution of [72] is the classification of consistency requirements along five orthogonal dimensions: concurrency, the degree of conflicting read/write access; replica synchronization, the degree of replica divergence; failure handling, the data access semantics in the presence of inaccessible replicas; update visibility, the time after which local updates may be made globally visible; view isolation, the time after which remote updates must be made locally visible. A rich collection of consistency semantics for shared data can be composed by combining the above five options. Performance studies have shown that composable consistency in the Swarm system outperforms Coda [74] in a file sharing scenario, while for a replicated BerkeleyDB database, it provides different consistency mechanisms from strong to time-based.

5.2.3. Providing QoS Guarantees in P2P Systems. Guaranteeing Quality of Service (QoS) parameters such as response time or throughput in P2P systems is a challenging task. An initial attempt was made in [70] at using P2P system concepts for the Domain Name System (DNS), which requires efficient data location. It showed that though P2P DNS could provide better fault-tolerance than conventional DNS, the lookup performance of O(log(N)) provided by DHTs was far worse compared to conventional DNS. Cooperative DNS (CoDoNS) [89] was proposed to tackle three problems of conventional DNS: susceptibility to Denial of Service (DoS) attacks; lookup delays, especially for flash crowds; lack of cache coherency, preventing quick service relocation in emergencies. CoDoNS has been proposed as a backward compatible replacement for conventional DNS. It provides O(1) lookup time by using the proactive caching layer of Beehive [88]. Beehive enables DHTs to achieve O(1) lookup performance by proactive replication. Traditionally, prefix matching DHTs store an application object at the closest matching node, with each routing step successively matching prefixes, resulting in O(log(N)) lookup performance. By aggressively caching the object all along the lookup path, Beehive achieves O(1) lookup performance for that object. Since Beehive associates different replication levels with different application objects, an average lookup performance of O(1) is achieved. CoDoNS builds a DNS based on a self-organizing P2P overlay formed across organizations (if each organization can provide a server for CoDoNS). CoDoNS associates a domain name with the node whose id most closely matches the domain name's hashed id. If the home node fails, the node with the next best matching id takes over as the home node for that particular domain. Performance studies over the PlanetLab testbed show that CoDoNS achieves lower lookup latencies, can handle slashdot effects and can quickly disseminate updates. However, the use of DHTs as the basis leaves CoDoNS vulnerable to network partitions. For example, if an organization is partitioned from the outside world, conventional DNS would ensure that local lookups worked correctly, whereas with CoDoNS even local lookups may fail (a DHT lookup may go outside the local network even for local lookups, the stretch property of DHTs). This suggests that SkipNets [35] may be a better choice for realizing DNS than DHTs. This is because data in SkipNets is


5.2.4. Large Scale Deployment. OpenDHT [68] is a public large scale DHT deployment that allows clients to use DHTs without having to deploy them. It provides a shared storage space abstraction using the get and put primitives. The main motivation for OpenDHT is that it is hard to deploy long running distributed system services, especially in the public domain. OpenDHT is deployed on PlanetLab (http://www.planet-lab.org/), a global testbed for deploying planetary scale services. OpenDHT is deployed on infrastructure nodes which alone participate in DHT routing and storage. Clients only use the storage space through the get and put interface on gateway (infrastructure) nodes. OpenDHT allows different mutually untrusting applications to share the DHT. It ensures that clients get a fair share of storage resources without imposing arbitrary quotas, a trade-off between fairness and flexibility. This is achieved by associating a Time-to-Live (TTL) with application objects and letting them expire if clients do not renew them. OpenDHT provides the storage abstraction of DHTs, in contrast to the lookup abstraction of Chord or the routing abstraction of Pastry.

It is realized over Bamboo DHT (bamboo-dht.org), which is similar to Pastry but has differences in handling node dynamics. OpenDHT is not a shared object space. The level of abstraction provided to the programmer is different. For instance, the programmer has to take care of object serialization, RTTI (run time type inferencing) etc. to realize an object storage on top of the byte storage that OpenDHT provides. OpenDHT provides limited consistency for the shared byte space. Conflict resolution (for concurrent writes) is left to the application, similar to the Bayou system that ensures eventual consistency, a very loose form of consistency. But conflict resolution is a non-trivial task for the application programmer. The performance of OpenDHT (especially worst case response time) suffers due to the presence of stragglers or slow nodes. This has been improved by using delay aware and iterative routing in [71].
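The put/get abstraction with TTL based expiry can be sketched with a single-process stand-in. This is not the OpenDHT client interface: the class and method names are hypothetical, values live in local memory rather than on DHT nodes, and returning every unexpired value stored under a key is an assumption made here to show how concurrent writes are left for the application to resolve.

```python
import time
from typing import Callable, Dict, List, Tuple


class TtlStore:
    """In-memory byte store whose entries expire unless they are renewed."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        self._items: Dict[bytes, List[Tuple[bytes, float]]] = {}

    def put(self, key: bytes, value: bytes, ttl_seconds: float) -> None:
        """Store value under key; putting the same value again renews its TTL."""
        expiry = self._clock() + ttl_seconds
        entries = [(v, e) for v, e in self._items.get(key, []) if v != value]
        entries.append((value, expiry))
        self._items[key] = entries

    def get(self, key: bytes) -> List[bytes]:
        """Return all unexpired values for key and drop the expired ones."""
        now = self._clock()
        live = [(v, e) for v, e in self._items.get(key, []) if e > now]
        if live:
            self._items[key] = live
        else:
            self._items.pop(key, None)
        return [v for v, _ in live]
```

Expiry reclaims storage from clients that stop renewing their objects without any fixed per-client quota, which mirrors the fairness argument in the text; serialization of application objects into bytes remains the caller's job.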

5.2.5. State of the Art P2P DDBMS. Atlas P2P Architecture (APPA) [86] is the current state of the art data management solution for large scale P2P systems. It uses a three layered architecture, with the P2P network forming the lowest layer. This layer could be realized using unstructured, structured or super-peer based P2P concepts. Above this layer, the basic P2P services layer is built. This provides P2P data sharing and retrieving (key based) in the P2P network, support for peer communication, support for peer dynamics (join and leave) and group membership management. Over the basic services layer, advanced P2P data management services such as schema management, replication, query processing and security are built. The shared data is in XML format and queries are expressed in XQuery in order to make use of web services. It is realized over JXTA. It provides replica management by extending traditional centralized log based reconciliation techniques for P2P systems. It assumes the existence of a shared storage space for distributed reconciliation by peers. This requires consensus protocols for realization and may be expensive. It has not been evaluated in large scale systems.

A recent effort has been made to provide a middleware based data replication scheme in [94] by using Snapshot Isolation (SI) as the isolation level. In an SI based DBMS, read operations of a transaction T are handled from a snapshot of the database (the set of transactions committed when T started). This implies that read operations never conflict with write operations and only write-write conflicts can occur, resulting in more concurrency and consequently better performance. It has been proposed at the cluster level and may not be applicable for P2P systems due to its strong assumption of a totally ordered multicast.
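The behaviour of Snapshot Isolation described above can be sketched with a tiny single-process multi-version store. This is a hedged illustration rather than the replication middleware of [94]: the class names are hypothetical, there is no replication or multicast, and write-write conflicts are resolved with a first-committer-wins rule, which is one common way of realizing SI.

```python
from typing import Any, Dict, List, Optional, Tuple


class SnapshotStore:
    """Multi-version key-value store; each commit gets an increasing timestamp."""

    def __init__(self) -> None:
        self.versions: Dict[str, List[Tuple[int, Any]]] = {}
        self.commit_ts = 0

    def begin(self) -> "Transaction":
        return Transaction(self, self.commit_ts)


class Transaction:
    def __init__(self, store: SnapshotStore, start_ts: int) -> None:
        self._store = store
        self._start_ts = start_ts
        self._writes: Dict[str, Any] = {}

    def read(self, key: str) -> Optional[Any]:
        """Read own writes first, else the newest version committed before start."""
        if key in self._writes:
            return self._writes[key]
        for ts, value in reversed(self._store.versions.get(key, [])):
            if ts <= self._start_ts:
                return value
        return None

    def write(self, key: str, value: Any) -> None:
        self._writes[key] = value  # buffered until commit

    def commit(self) -> bool:
        """Abort (return False) if a written key was committed after our snapshot."""
        for key in self._writes:
            history = self._store.versions.get(key, [])
            if history and history[-1][0] > self._start_ts:
                return False
        self._store.commit_ts += 1
        for key, value in self._writes.items():
            self._store.versions.setdefault(key, []).append(
                (self._store.commit_ts, value)
            )
        return True
```

Readers never block or abort in this sketch; only a transaction that overwrites a key changed since its snapshot is rejected, which is the write-write conflict case mentioned in the text.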

5.3. Software Availability and Usage Summary. JuxMem and Segank are research prototypes. Deployment information on Structella is not available. Vishwa and Virat are research prototypes that are available as open binaries. OpenDHT has been deployed on the PlanetLab testbed and is also available as open source software. APPA is a research prototype.

6. Conclusions. We have presented a scalability taxonomy of data management solutions in distributed systems. We group data management work done in DSMs and shared object spaces in the Centralized/Naively Distributed (CND) data management category. The Sophisticated/Intermediate Data (SID) management techniques include data management in grid computing systems and data grids as well as Content Distribution Networks (CDNs) and data management in distributed mobile systems. These solutions scale better than CND techniques by using distributed data management, instead of centralized approaches. They, however, assume an inner set of reliable servers which take care of consistency and reliability issues. In order to take the data management services to the edges of the Internet, Large Scale Data (LSD) management techniques make use of P2P concepts. They consequently provide better scalability and fault-tolerance, but at the cost of relaxing consistency (most approaches provide probabilistic guarantees or eventual consistency).


Fig. 6.1. Pictorial Representation of Scalability Taxonomy

It can be observed that LSD techniques such as Virat [8] handle a large number of small data objects. The case of handling a large number of large data objects arises when existing data grids become purely P2P, instead of using SID techniques. The existing LSD techniques may not work in this case, as the size of data objects calls for special mechanisms to handle some operations including updates. Incremental updates or function shipping in combination with LSD data management techniques may have to be explored.

Another interesting avenue for exploration is the use of LSD techniques combined with node mobility. The solutions which have been proposed for handling data management in distributed mobile systems do not use P2P concepts, but assume the presence of reliable servers that handle mobile client requests. When mobile nodes form the P2P overlay, churn could be very high due to node mobility. This, coupled with the device constraints, may open up a wealth of research questions.

Optimal data placement techniques which have been proposed for CDNs [92] can be used in P2P grids. Existing data management techniques in grids (or even P2P grids such as P-Grid [46]) do not address optimal replica placement issues. The work [8] provides heuristics for replica placement in P2P grids, but the resulting placement of replicas may not be exactly optimal. Thus, we see that techniques for data management in one category can be applied to others to open up research in large scale data management.

REFERENCES

[1] A Vijay Srinivas, M Venkateshwara Reddy, and D Janakiram, Designing a Replication Service for Large Peer-to-Peer Data Grids, IEEE Distributed Systems Online, 7 (2006).

[2] Gabriel Antoniu, Luc Bougé, and Mathieu Jan, JUXMEM: An Adaptive Supportive Platform for Data Sharing on the Grid, Scalable Computing: Practice and Experience, 6 (2005), pp. 45-55.

[3] Maarten van Steen, Philip Homburg, and Andrew S. Tanenbaum, Globe: A Wide-Area Distributed System, IEEE Concurrency, 7 (1999), pp. 70-78.

[4] P Wyckoff, S W McLaughry, T J Lehman, and D A Ford, T Spaces, IBM Systems Journal, 37 (1998), pp. 454-474.

[5] A. Chazapis, A. Zissimos, and N. Koziris, A Peer-to-Peer Replica Management Service for High-Throughput Grids, in


[6] A Vijay Srinivas and D Janakiram, A Model for Characterizing the Scalability of Distributed Systems, ACM SIGOPS Operating Systems Review, 39 (2005), pp. 64-72.

[7] A Vijay Srinivas and D Janakiram, A Peer-to-Peer Framework for Collaborative Data Sharing Over the Internet, Tech. Report IITM-CSE-DOS-2005-28, accepted for publication in IEEE International Conference on Collaborative Computing: Networking, Applications and Worksharing (CollaborateCom 2006), IEEE Computer Society Press.

[8] A Vijay Srinivas and D Janakiram, Node Capability Aware Replica Management for Peer-to-Peer Grids, Technical Report IITM-CSE-DOS-2006-04, Distributed & Object Systems Lab, Indian Institute of Technology, Communicated to IEEE Transactions on Software Engineering.

[9] S. Abiteboul, A. Bonifati, G. Cobéna, I. Manolescu, and T. Milo, Dynamic XML documents with distribution and replication, in SIGMOD '03: Proceedings of the 2003 ACM SIGMOD international conference on Management of data, New York, NY, USA, 2003, ACM Press, pp. 527-538.

[10] C. Amza, A. Cox, S. Dwarkadas, P. Keleher, H. Lu, R. Rajamony, W. Yu, and W. Zwaenepoel, TreadMarks: Shared Memory Computing on Networks of Workstations, IEEE Computer, 29 (1996), pp. 18-28.

[11] Anoop George Ninan, Purushottam Kulkarni, Prashant Shenoy, Krithi Ramamritham, and Renu Tewari, Scalable Consistency Maintenance in Content Distribution Networks Using Cooperative Leases, IEEE Transactions on Knowledge and Data Engineering, 15 (2003), pp. 813-828.

[12] Anoop Ninan, Purushottam Kulkarni, Prashant Shenoy, Krithi Ramamritham, and Renu Tewari, Cooperative Leases: Scalable Consistency Maintenance in Content Distribution Networks, in WWW '02: Proceedings of the 11th international conference on World Wide Web, New York, NY, USA, 2002, ACM Press, pp. 1-12.

[13] Anthony D Joseph, Joshua A Tauber, and M Frans Kaashoek, Mobile Computing with the Rover Toolkit, IEEE Transactions on Computers, 46 (1997), pp. 337-352.

[14] G. Antoniu, J.-F. Deverge, and S. Monnet, How to Bring Together Fault Tolerance and Data Consistency to Enable Grid Data Sharing, Concurrency and Computation: Practice and Experience, 17 (2006). To appear.

[15] Antony Rowstron and Peter Druschel, Storage management and caching in PAST, a large-scale, persistent peer-to-peer storage utility, in SOSP '01: Proceedings of the eighteenth ACM symposium on Operating systems principles, New York, NY, USA, 2001, ACM Press, pp. 188-201.

[16] Artur Andrzejak and Zhichen Xu, Scalable, Efficient Range Queries for Grid Information Services, in P2P '02: Proceedings of the Second International Conference on Peer-to-Peer Computing, Washington, DC, USA, 2002, IEEE Computer Society, pp. 33-40.

[17] B. Y. Zhao, L. Huang, J. Stribling, S. C. Rhea, A. D. Joseph, and J. D. Kubiatowicz, Tapestry: A Resilient Global-Scale Overlay for Service Deployment, IEEE Journal on Selected Areas in Communications, 22 (2004), pp. 41-53.

[18] D. Bauer, P. Hurley, R. Pletka, and M. Waldvogel, Bringing efficient advanced queries to distributed hash tables, in LCN '04: Proceedings of the 29th Annual IEEE International Conference on Local Computer Networks (LCN'04), Washington, DC, USA, 2004, IEEE Computer Society, pp. 6-14.

[19] Bill Allcock, Joe Bester, John Bresnahan, Ann L. Chervenak, Ian Foster, Carl Kesselman, Sam Meder, Veronika Nefedova, Darcy Quesnel, and Steven Tuecke, Data Management and Transfer in High-Performance Computational Grid Environments, Parallel Computing, 28 (2002), pp. 749-771.

[20] J. Blomer, M. Kalfane, R. Karp, M. Karpinski, M. Luby, and D. Zuckerman, An XOR-based erasure-resilient coding scheme, 1995.

[21] Brian D Noble, M Satyanarayanan, Dushyanth Narayanan, James Eric Tilton, Jason Flinn, and Kevin R. Walker, Agile Application-Aware Adaptation for Mobility, in SOSP '97: Proceedings of the sixteenth ACM symposium on Operating Systems Principles, New York, NY, USA, 1997, ACM Press, pp. 276-287.

[22] C Gray and D Cheriton, Leases: an Efficient Fault-Tolerant Mechanism for Distributed File Cache Consistency, in SOSP '89: Proceedings of the twelfth ACM symposium on Operating systems principles, New York, NY, USA, 1989, ACM Press, pp. 202-210.

[23] C Greg Plaxton, Rajmohan Rajaraman, and Andrea W Richa, Accessing Nearby Copies of Replicated Objects in a Distributed Environment, in SPAA '97: Proceedings of the ninth annual ACM symposium on Parallel Algorithms and Architectures, New York, NY, USA, 1997, ACM Press, pp. 311-320.

[24] N. Carriero and D. Gelernter, Linda in Context, Communications of the ACM, 4 (1989), pp. 444-458.

[25] J. B. Carter, Design of the Munin Distributed Shared Memory System, Journal of Parallel and Distributed Computing, 29 (1995), pp. 219-227.

[26] T. D. Chandra and S. Toueg, Unreliable Failure Detectors for Reliable Distributed Systems, Journal of the ACM, 43 (1996), pp. 225-267.

[27] Chervenak, A, Foster, I, Kesselman, C, Salisbury, C, and Tuecke, S, The Data Grid: Towards an Architecture for the Distributed Management and Analysis of Large Scientific Datasets, Journal of Network and Computer Applications, 23 (2001), pp. 187-200.

[28] U. Dayal, K. Ramamritham, and T. M. Vijayaraman, eds., Proceedings of the 19th International Conference on Data Engineering, March 5-8, 2003, Bangalore, India, IEEE Computer Society, 2003.

[29] Dirk Düllmann and Ben Segal, Models for Replica Synchronisation and Consistency in a Data Grid, in HPDC '01: Proceedings of the 10th IEEE International Symposium on High Performance Distributed Computing (HPDC-10'01), Washington, DC, USA, 2001, IEEE Computer Society, p. 67.

[30] Eric A Brewer, Randy H Katz, Elan Amir, Hari Balakrishnan, Yatin Chawathe, Armando Fox, Steven D Gribble, Todd Hodes, Giao Nguyen, Venkata N Padmanabhan, Mark Stemm, Srinivasan Seshan, Tom Henderson, Joshua A Tauber, and M Frans Kaashoek, A Network Architecture for Heterogeneous Mobile Computing, IEEE Personal Communications, 5 (1998), pp. 8-24.

[31] Ewa Deelman, Carl Kesselman, Gaurang Mehta, Leila Meshkat, Laura Pearlman, Kent Blackburn, Phil Ehrens, Albert Lazzarini, Roy Williams, and Scott Koranda, GriPhyN and LIGO, Building a Virtual Data Grid for Gravitational Wave Scientists, in Proceedings of the 11th IEEE International Symposium on High Performance


Journal of Grid Computing, 2 (2004), pp. 207-222.

[33] Gnutella, The Gnutella protocol specification v0.4. http://www9.limewire.com/developer/gnutella_protocol_0.4.pdf 2000.

[34] L. Gong, JXTA: A Network Programming Environment, IEEE Internet Computing, 5 (2001), pp. 88-95.

[35] Harvey, Nicholas J. A., Jones, Michael B., Saroiu, Stefan, Theimer, Marvin, and Wolman, Alec, Skipnet: A scalable overlay network with practical locality properties, in Proceedings of the Fourth USENIX Symposium on Internet Technologies and Systems (USITS '03), Seattle, United States, March 2003, USENIX Association.

[36] Henri E Bal, M Frans Kaashoek, and Andrew S Tanenbaum, Orca: A Language for Parallel Programming of Distributed Systems, IEEE Transactions on Software Engineering, 18 (1992), pp. 190-205.

[37] Henri E Bal, Raoul Bhoedjang, Rutger Hofman, Ceriel Jacobs, Koen Langendoen, Tim Ruhl, and M Frans Kaashoek, Performance evaluation of the Orca shared-object system, ACM Transactions on Computer Systems, 16 (1998), pp. 1-40.

[38] R. Huebsch, J. M. Hellerstein, N. Lanham, B. T. Loo, S. Shenker, and I. Stoica, Querying the Internet with PIER, in VLDB 2003, Proceedings of 29th International Conference on Very Large Data Bases, September 9-12, 2003, Berlin, Germany, J. C. Freytag, P. C. Lockemann, S. Abiteboul, M. J. Carey, P. G. Selinger, and A. Heuer, eds., Morgan Kaufmann, 2003, pp. 321-332.

[39] I. Foster and C. Kesselman, Globus: A Metacomputing Infrastructure Toolkit, Intl Journal of Supercomputer Applications, 11 (1997), pp. 115-128.

[40] I. Stoica, R. Morris, D. Karger, M. F. Kaashoek, and H. Balakrishnan, Chord: A Scalable Peer-to-Peer Lookup Service for Internet Applications, IEEE/ACM Transactions on Networking, 11 (2003), pp. 17-32.

[41] L. Iftode, J. P. Singh, and K. Li, Scope Consistency: a Bridge Between Release Consistency and Entry Consistency, in SPAA '96: Proceedings of the eighth annual ACM symposium on Parallel algorithms and architectures, New York, NY, USA, 1996, ACM Press, pp. 277-287.

[42] J. Frey, T. Tannenbaum, M. Livny, I. Foster, and S. Tuecke, Condor-G: A Computation Management Agent for Multi-Institutional Grids, in HPDC '01: Proceedings of the 10th IEEE International Symposium on High Performance Distributed Computing (HPDC-10'01), Washington, DC, USA, 2001, IEEE Computer Society, p. 55.

[43] John Dilley, Bruce Maggs, Jay Parikh, Harald Prokop, Ramesh Sitaraman, and Bill Weihl, Globally Distributed Content Delivery, IEEE Internet Computing, 06 (2002), pp. 50-58.

[44] K. L. Johnson, J. F. Carr, M. S. Day, and M. F. Kaashoek, The measured performance of content distribution networks, Computer Communications, 24 (2001), pp. 202-206.

[45] K Gummadi, R Gummadi, S Gribble, S Ratnasamy, S Shenker, and I. Stoica, The Impact of DHT Routing Geometry on Resilience and Proximity, in SIGCOMM '03: Proceedings of the 2003 conference on Applications, technologies, architectures, and protocols for computer communications, New York, NY, USA, 2003, ACM Press, pp. 381-394.

[46] Karl Aberer, Philippe Cudre-Mauroux, Anwitaman Datta, Zoran Despotovic, Manfred Hauswirth, Magdalena Punceva, and Roman Schmidt, P-Grid: a Self-Organizing Structured P2P System, ACM SIGMOD Record, 32 (2003), pp. 29-33.

[47] P. J. Keleher, B. Bhattacharjee, and B. D. Silaghi, Are Virtualized Overlay Networks Too Much of a Good Thing?, in IPTPS '01: Revised Papers from the First International Workshop on Peer-to-Peer Systems, London, UK, 2002, Springer-Verlag, pp. 225-231.

[48] Kourosh Gharachorloo, Daniel Lenoski, James Laudon, Phillip Gibbons, Anoop Gupta, and John Hennessy, Memory Consistency and Event Ordering in Scalable Shared-Memory Multiprocessors, in ISCA '90: Proceedings of the 17th annual international symposium on Computer Architecture, New York, NY, USA, 1990, ACM Press, pp. 15-26.

[49] J. Kubiatowicz, D. Bindel, Y. Chen, S. Czerwinski, P. Eaton, D. Geels, R. Gummadi, S. Rhea, H. Weatherspoon, C. Wells, and B. Zhao, OceanStore: an Architecture for Global-Scale Persistent Storage, SIGARCH Computer Architecture News, 28 (2000), pp. 190-201.

[50] L G Alex Sung, Nabeel Ahmed, R. and Herman Li, Mohamed Ali Soliman, and David Hadaller, A Survey of Data Management in Peer-to-Peer Systems. CS856 Web Data Management, 2005. School of Computer Science, University of Waterloo.

[51] L Guy, P Kunszt, E Laure, H Stockinger, and K Stockinger, Replica Management in Data Grids. Technical Report, GGF Working Draft, 2002.

[52] M. Ahamad and R. Kordale, Scalable Consistency Protocols for Distributed Services, IEEE Transactions on Parallel and Distributed Systems, 10 (1999), pp. 888-903.

[53] M. V. Reddy, A. V. Srinivas, T. Gopinath, and D. Janakiram, Vishwa: A Reconfigurable Peer-to-Peer Middleware for Grid Computing, in 35th International Conference on Parallel Processing, IEEE Computer Society Press, 2006, pp. 381-390.

[54] Mahadev Satyanarayanan, Accessing Information on Demand at any Location. Mobile Information Access, IEEE Personal Communications, 3 (1996), pp. 26-33.

[55] Miguel Castro, Manuel Costa, and Antony Rowstron, Debunking Some Myths About Structured and Unstructured Overlays, in Proceedings of the 2nd Usenix Symposium on Networked System Design and Implementation, Boston, MA, May 2005.

[56] A. Muthitacharoen, R. Morris, T. M. Gil, and B. Chen, Ivy: a Read/Write Peer-to-Peer File System, SIGOPS Operating Systems Review, 36 (2002), pp. 31-44.

[57] NAPSTER, Napster media sharing system. http://www.napster.com

[58] W. Nejdl, W. Siberski, and M. Sintek, Design issues and challenges for RDF- and schema-based peer-to-peer systems, SIGMOD Record, 32 (2003), pp. 41-46.

[59] W. S. Ng, B. C. Ooi, and K.-L. Tan, BestPeer: A Self-Configurable Peer-to-Peer System, in Proceedings of the 18th International Conference on Data Engineering, 26 February - 1 March 2002, San Jose, CA, IEEE Computer Society, 2002,


etal.[28℄,pp.633644.

[61℄ ObjetManagementGroup,TheCommonObjetRequestBroker:ArhitetureandSpeiation. 2.3.1,Otober1999.

[62] Oppenheimer, D., Albrecht, J., Patterson, D., and Vahdat, A., Design and Implementation Tradeoffs for Wide-area Resource Discovery, in Proceedings. 14th IEEE International Symposium on High Performance Distributed Computing, 2005. HPDC-14, Washington, DC, USA, July 2005, IEEE Computer Society, pp. 113–124.

[63] Petar Maymounkov and David Mazières, Kademlia: A Peer-to-Peer Information System Based on the XOR Metric, in IPTPS '01: Revised Papers from the First International Workshop on Peer-to-Peer Systems, London, UK, 2002, Springer-Verlag, pp. 53–65.

[64] Peter J Keleher, The Relative Importance of Concurrent Writers and Weak Consistency Models, in ICDCS '96: Proceedings of the 16th International Conference on Distributed Computing Systems (ICDCS '96), Washington, DC, USA, 1996, IEEE Computer Society, p. 91.

[65] Prasanna Ganesan, Beverly Yang, and Hector Garcia-Molina, One Torus to Rule Them All: Multi-Dimensional Queries in P2P Systems, in WebDB '04: Proceedings of the 7th International Workshop on the Web and Databases, New York, NY, USA, 2004, ACM Press, pp. 19–24.

[66] M. Raynal, G. Rhia-kime, and M. Ahamad, Serializable to Causal Transactions for Collaborative Applications, in Proceedings of the 23rd Euromicro Conference, Budapest, Hungary, September 1997.

[67] S. Rhea, P. Eaton, D. Geels, H. Weatherspoon, B. Zhao, and J. Kubiatowicz, Pond: The OceanStore Prototype, in Proceedings of the Conference on File and Storage Technologies, USENIX Association, 2003.

[68] S. Rhea, B. Godfrey, B. Karp, J. Kubiatowicz, S. Ratnasamy, S. Shenker, I. Stoica, and H. Yu, OpenDHT: a public DHT service and its uses, in SIGCOMM '05: Proceedings of the 2005 conference on Applications, technologies, architectures, and protocols for computer communications, New York, NY, USA, 2005, ACM Press, pp. 73–84.

[69] A. Rowstron and P. Druschel, Pastry: Scalable, Distributed Object Location and Routing for Large-Scale Peer-to-Peer Systems, in Proceedings of the 18th IFIP/ACM International Conference on Distributed Systems Platforms (Middleware 2001), Heidelberg, Germany, November 2001, pp. 329–350.

[70] Russ Cox, Athicha Muthitacharoen, and Robert Morris, Serving DNS Using a Peer-to-Peer Lookup Service, in IPTPS '01: Revised Papers from the First International Workshop on Peer-to-Peer Systems, London, UK, 2002, Springer-Verlag, pp. 155–165.

[71] S. Rhea, B. G. Chun, J. Kubiatowicz, and S. Shenker, Fixing the Embarrassing Slowness of OpenDHT on PlanetLab, in Proceedings of USENIX WORLDS 2005, USENIX Association, 2005.

[72] Sai Susarla and John Carter, Flexible Consistency for Wide area Peer Replication, in Proceedings of the 25th International Conference on Distributed Computing Systems (ICDCS), Washington, DC, USA, 2005, IEEE Computer Society.

[73] K.-U. Sattler, P. Rösch, E. Buchmann, and K. Böhm, A Physical Query Algebra for DHT-based P2P Systems, in Proceedings of the 6th Workshop on Distributed Data and Structures, Lausanne, Switzerland, July 2004.

[74] M. Satyanarayanan, J. J. Kistler, P. Kumar, M. E. Okasaki, E. H. Siegel, and D. C. Steere, Coda: A Highly Available File System for a Distributed Workstation Environment, IEEE Transactions on Computers, 39 (1990), pp. 447–459.

[75] T. Seidmann, Replicated Distributed Shared Memory For The .NET Framework, in Proceedings of 1st Int. Workshop on C# and .NET Technologies on Algorithms, Computer Graphics, Visualization, Computer Vision and Distributed Computing, Plzen, Czech Republic, February 2003.

[76] Stefan Saroiu, Krishna P Gummadi, Richard J Dunn, Steven D Gribble, and Henry M. Levy, An Analysis of Internet Content Delivery Systems, SIGOPS Operating Systems Review, 36 (2002), pp. 315–327.

[77] Stephanos Androutsellis-Theotokis and Diomidis Spinellis, A Survey of Peer-to-Peer Content Distribution Technologies, ACM Computing Surveys, 36 (2004), pp. 335–371.

[78] H. Stockinger, Distributed Database Management Systems and the Data Grid, in MSS '01: Proceedings of the Eighteenth IEEE Symposium on Mass Storage Systems and Technologies, Washington, DC, USA, 2001, IEEE Computer Society, p. 1.

[79] H. Stockinger, A. Samar, K. Holtman, W. E. Allcock, I. Foster, and B. Tierney, File and Object Replication in Data Grids, Cluster Computing, 5 (2002), pp. 305–314.

[80] Sumeet Sobti, Nitin Garg, Fengzhou Zheng, Junwen Lai, Yilei Shao, Chi Zhang, Elisha Ziskind, Arvind Krishnamurthy, and Randolph Y. Wang, Segank: A Distributed Mobile Storage System, in FAST '04: Proceedings of the 3rd USENIX Conference on File and Storage Technologies, Berkeley, CA, USA, 2004, USENIX Association, pp. 239–252.

[81] Sun Microsystems, JS JavaSpaces Service Specification. http://java.sun.com/products/jini/2.0/doc/specs/html/js-spec.html, 2001.

[82] Sushant Goel, Hema Sharda, and David Taniar, Atomic Commitment and Resilience in Grid Database Systems, International Journal of Grid and Utility Computing, 1 (2005), pp. 46–60.

[83] I. Tatarinov, Z. Ives, J. Madhavan, A. Halevy, D. Suciu, N. Dalvi, X. Dong, Y. Kadiyska, G. Miklau, and P. Mork, The Piazza Peer Data Management Project, SIGMOD Record, 32 (2003).

[84] D. B. Terry, K. Petersen, M. Spreitzer, and M. Theimer, The Case for Non-transparent Replication: Examples from Bayou, IEEE Data Engineering Bulletin, 21 (1998), pp. 12–20.

[85] Tevfik Kosar and Miron Livny, Stork: Making Data Placement a First Class Citizen in the Grid, in ICDCS '04: Proceedings of the 24th International Conference on Distributed Computing Systems (ICDCS '04), Washington, DC, USA, 2004, IEEE Computer Society, pp. 342–349.

[86] P. Valduriez and E. Pacitti, Data Management in Large-Scale P2P Systems, in VECPAR, M. J. Daydé, J. Dongarra, V. Hernández, and J. M. L. M. Palma, eds., vol. 3402 of Lecture Notes in Computer Science, Springer, 2004, pp. 104–118.

[87] S. Venugopal, R. Buyya, and K. Ramamohanarao, A Taxonomy of Data Grids for Distributed Data Sharing, Management and Processing, ACM Computing Surveys, (2006). To appear.

[88] Venugopalan Ramasubramanian and Emin G Sirer, Exploiting Power Law Query Distributions for O(1) Lookup Performance in Peer to Peer Overlays, in Proceedings of the First Symposium on Networked Systems Design and

of the 2004 conference on Applications, technologies, architectures, and protocols for computer communications, New York, NY, USA, 2004, ACM Press, pp. 331–342.

[90] Weijian Fang, Cho-Li Wang, and Francis C M Lau, On the Design of Global Object Space for Efficient Multi-threading Java Computing on Clusters, Parallel Computing, 29 (2003), pp. 1563–1587.

[91] Weimin Yu and Alan Cox, Java/DSM: A Platform for Heterogeneous Computing, in ACM 1997 Workshop on Java for Science and Engineering Computation, June 1997.

[92] Xueyan Tang and Jianliang Xu, QoS-Aware Replica Placement for Content Distribution, IEEE Transactions on Parallel and Distributed Systems, 16 (2005), pp. 921–932.

[93] B. Yang and H. Garcia-Molina, Designing a super-peer network, in Dayal et al. [28], pp. 49–62.

[94] Yi Lin, Bettina Kemme, Marta Patino-Martinez, and Ricardo Jimenez-Peris, Middleware Based Data Replication Providing Snapshot Isolation, in SIGMOD '05: Proceedings of the 2005 ACM SIGMOD international conference on Management of data, New York, NY, USA, 2005, ACM Press, pp. 419–430.

[95] L. Zhenyun Zhuang and Member-Yunhao Liu, Dynamic Layer Management in Superpeer Architectures, IEEE Transactions on Parallel and Distributed Systems, 16 (2005), pp. 1078–1091.

Edited by: Thomas Ludwig

Received: May 25, 2006
