SCORE: Simulator for cloud optimization of resources and energy consumption
Damián Fernández-Cerero a, Alejandro Fernández-Montes a, Agnieszka Jakóbik b, Joanna Kołodziej c, Miguel Toro a
a Escuela Técnica Superior de Ingeniería Informática, Universidad de Sevilla, Av. Reina Mercedes s/n, Sevilla, Sevilla 41012, Spain
b Institute of Computer Science, Cracow University of Technology, Warszawska 24, Cracow 31-155, Poland
c Research and Academic Computer Network (NASK), Kolska 12, Warsaw 01-045, Poland
Keywords: Data center; Energy saving; Cloud computing; Energy-aware scheduling
Abstract
Achieving efficiency both in terms of resource utilisation and energy consumption is a complex challenge, especially in large-scale wide-purpose data centers that serve cloud-computing services. Simulation presents an appropriate solution for the development and testing of strategies that aim to improve efficiency before they are applied in production environments. Various cloud simulators have been proposed to cover different aspects of the operation environment of cloud-computing systems. In this paper, we present the SCORE tool, which is dedicated to the simulation of energy-efficient monolithic and parallel-scheduling models and to the execution of heterogeneous, realistic and synthetic workloads. The simulator has been evaluated through empirical tests. The results of the experiments confirm that SCORE is a performant and reliable tool for testing energy-efficiency, security, and scheduling strategies in cloud-computing environments.
1. Introduction
Cloud computing (CC) and large-scale web services have had a notable impact on the data-center scenario and the big-data environment, since they enable huge amounts of data to be processed in a reliable and distributed way. However, CC services in the big-data era should meet new requirements from end users, such as fast response and low latency.
Major CC service providers, such as Google, Microsoft and Amazon, are constantly developing new applications and services. As the number of these services grows, many types of applications have to be deployed on the same hardware. Virtualisation of resources has enabled resource utilisation to be improved through the design of general, wide-purpose data centers. These facilities can handle an enormous workload with various requirements. Therefore, many well-known solutions, such as fragmenting the data center into a set of clusters that are each responsible for executing only one kind of application, are no longer needed.
Large-scale data centers which execute heterogeneous workloads on shared hardware resources bring new challenges in addition to those inherent to small and medium-sized clusters, not least because scheduling such an amount of work may exceed the capacity of a centralized monolithic scheduler. In order to overcome this limitation, several scheduling models with different degrees of parallelism have been developed, such as the two-level approach of Mesos [1], the shared-state approach of Omega [2], and various approaches for scheduling in large-scale grid systems [3].
The aforementioned infrastructures usually consume as much energy as do many factories and small cities, and account for approximately 1.5% of global energy consumption [4].
Although data centers may be used by users worldwide, they are usually deployed on a continental basis. Therefore, these facilities are usually under higher pressure during day-time hours than during night-time hours. This and other reasons, such as the fear of any change that could break operational requirements [5], and the complexity of the systems involved, lead to an over-provision of data-center infrastructures. This decision leads to servers being kept underused or in an idle state, which is highly inefficient from an energy-consumption perspective.
Many measures have been implemented in order to minimise the energy consumption in data centers, such as chiller-free cooling systems, and hardware improvements, such as dynamic voltage and frequency scaling (DVFS). However, another approach that may lower the energy consumption considerably in data centers is seldom implemented. This strategy involves switching off idle servers.
Strategies that shut down idle machines may have a negative impact on the performance of the whole system, since inactive machines are not able to execute large workloads at short notice. Therefore, the optimal energy-aware strategies must guarantee an appropriate level of reduction of energy consumption. In order to test these strategies and to measure their impact in terms of performance and energy consumption, a trustworthy simulation tool is required. In addition, the chosen simulator has to be able to reproduce the conditions present in real data centers. This requirement is critical, since the simulation is usually the step prior to implementing these strategies in working data centers. Therefore, the "optimal" energy-aware cloud simulator should guarantee the following achievements: a) Low resource consumption: data centers are composed of thousands of servers which consume a huge amount of energy; b) Simulation of parallel-scheduling models: the monolithic-scheduling model may prove ineffective in large-scale CC systems; c) Code that is easy to replicate and extend: a generic model focused on shut-down, power-on and scheduling strategies, since many low-level aspects, such as hardware and networking details, may be overlooked; and d) Validation of results against a trustworthy source: this means that the simulation tool has been compared with the real-life system that it simulates.
Several simulators were evaluated in this paper. These tools followed various simulation models, such as: a) Discrete-event systems; b) Multi-agent systems; c) Multi-paradigm systems; and d) Hierarchic systems. The simulators under evaluation presented a wide range of purposes and levels of detail. However, by using these simulators, it is very difficult to achieve the aforementioned properties. In this paper, we propose a new data-center simulation tool, namely the SCORE Cloud simulator, which is our proposal for an "optimal" energy-aware CC simulation package. SCORE is based on the Google Omega lightweight simulator [2], which was extended by the implementation of the hybridisation of the discrete-event and the multi-agent scheduling and resource utilisation models. The Google Omega lightweight simulator has been validated against Google data centers, which makes it suitable for being the core of a trustworthy simulation tool. This is critical, since the authors have not been able to validate SCORE against real-world data centers due to the huge size of the clusters taken into consideration.
The paper is organised as follows. In Section 2, a simple comparative analysis is presented of the most relevant cloud-computing simulators available. In Section 3, the high-level architectural decisions of SCORE are highlighted. The scheduling models under evaluation are described in Section 3.1. In Section 3.3, the core modules of SCORE, which are related to energy efficiency, are described. A number of the parameters available for the configuration of the experimentation are shown in Section 4. In Section 4.2, the workload employed to perform the experimentation in SCORE is characterised. In Section 5, the data available as a result of the experiments performed in SCORE, as well as the means to retrieve this data, are explained. The various experimentation scenarios and the result parameters are shown in Section 6. Finally, the conclusions and future work are discussed in Section 7.
2. Related Work
In recent years, many simulators have been developed for the modelling of the main components of computational cloud systems. In this section, a simple survey and comparative analysis is provided. The following basic trade-offs should be considered when developing a simulation tool: a) Performance versus features; b) Performance versus accessibility; c) Accessibility versus features; and d) Performance versus accuracy.
Although visual general-purpose simulators, such as Insight Maker [6], could be used to simulate computational cloud systems, the development of the model of an energy-efficient cloud-computing infrastructure, such as that described earlier, would be failure-prone and over-sized.
On the other hand, there are non-general-purpose simulators, such as ElasticTree [7] and CloudSched [8], which are focused on the energy consumption of networking elements and on scheduling policies, respectively. Due to this specialisation, these simulators present major limitations and restrictions.
In addition, we evaluated wide-ranging cloud-computing simulation tools, which cover more elements of cloud-computing (CC) systems. Each of the frameworks studied simulates different aspects of CC systems to a different degree of detail. These modelling and implementation decisions render each system different in terms of performance and features provided, which, in turn, makes some of them more suitable for the aforementioned purpose. This class of simulators includes:
Table 1
Simulator comparison.
Simulator | Scheduling models | Energy aware | Energy strategies | Scheduling policies | Performance
GridSim | N | N | N | Y | Medium
CloudSim | N | Y | Y | Y | Medium
GreenCloud | N | Y | Y | Y | Low
Google Omega paper | Y | N | N | N | High
Grid’5000 Toolbox | N | Y | Y | N | High
• GridSim. This toolkit [9], based on the SimJava library [10], models various aspects of grid systems, such as users, machines, applications, and networks. However, it fails to consider several cloud-computing features. It also lacks the energy-consumption perspective.
• CloudSim. This popular cloud simulator is based on SimJava and GridSim and is mainly focused on IaaS-related operation environment features [11]. It presents a high level of detail, and therefore allows several VM allocation and migration policies to be defined, networking features to be considered, and energy consumption to be taken into account. However, it has certain disadvantages when applied to the simulation of large data-center environments: this high level of detail means that CloudSim is considered cumbersome to execute, especially for data centers composed of thousands to tens of thousands of machines. In addition, it is not principally designed to simulate multiple scheduling models, but takes a largely monolithic approach.
• GreenCloud. This simulator is an extension of the NS2 network simulator. Its purpose is to measure and compute the energy consumption at every level of the data center, and it pays special attention to network components [12]. However, its packet-level nature compromises performance in order to raise the level of detail, which may not be optimal for the simulation of large data centers. In addition, it is not designed to offer ease of development and extension of various scheduling models.
• Google Omega lightweight simulator. This simulator is designed for the comparison of various scheduling models in large clusters. To this end, it focuses on maximizing the performance of the simulations by reducing the level of detail. However, it is not designed to allow other scheduling strategies to be easily developed and extended. In addition, this tool fails to consider energy consumption.
• Grid’5000 Toolbox. Grid’5000 was built upon a network of dedicated clusters. The infrastructure of Grid’5000 is geographically distributed over various sites, of which the initial 9 are located in France. Grid’5000 Toolbox [13] simulates the behaviour of Grid’5000 resources for real workloads while changing the state of the resources according to several energy policies. The simulator includes:
a) A GUI that allows the user to perform a set of simulations for each location and to execute a set of energy policies; b) A graphical visualisation of the resources during the simulation, including their states, and future and past jobs; c) A graphical view of the results through several charts and spreadsheets.
On the other hand, the simulator fails to include various scheduling frameworks and it does not simulate the behaviour or consumption of network devices and resources.
Several of these simulators are well-known, and in-depth comparisons have been presented in the literature [14–17]. In this work, the most important aspects regarding the development and application of power-off/on strategies in order to minimise the energy consumption are presented. Among these:
• Scheduling models: This parameter reflects whether the simulation tool has different scheduling models implemented, such as parallel, distributed, and monolithic approaches. In addition, this parameter also considers whether the tool is designed so that new scheduling frameworks, and not only allocation policies, can easily be developed and extended.
• Energy aware: This parameter reflects whether the simulation tool is capable of measuring and computing energy consumption and efficiency parameters.
• Shut-down and Power-on policies: This parameter reflects whether the simulation tool has different shut-down and power-on algorithms implemented. In addition, it also considers whether the tool is designed so that new energy policies can easily be developed and extended.
• Scheduling strategies: This parameter reflects whether the simulation tool has different allocation/scheduling algorithms implemented. In addition, it also considers whether the tool is designed so that new strategies can easily be developed and extended.
• Performance: This parameter reflects the amount of time needed for a simulation to be run in a comparable environment. The amount of computational and memory resources is also considered.
A short comparison of the simulation tools described is presented in Table 1.
The simulator described in this work, SCORE, is a high-performance cloud-computing simulation tool focused on energy efficiency in large data centers. This tool is designed to be easily extended, and offers several strategies already implemented and ready to use, for various scheduling models, allocation, and shut-down and power-on strategies. The high performance and ease of use have been achieved by minimising the low-level features, such as networking, low-level machine details, and low-level task details.
Fig. 1. SCORE architecture.
3. SCORE Architecture
The SCORE simulator developed herein is based on the model proposed by Robinson in [18]. The main workflow of that model can be defined as follows: 1. Definition and initialisation of the problem; 2. Determination of the modelling and general objectives; 3. Identification of the model inputs; 4. Identification of the model outputs; and 5. Determination of the model content and level of detail.
Based on this workflow model in CC, we developed the SCORE architecture as presented in Fig. 1. The SCORE source code is publicly available at:
https://github.com/DamianUS/cluster-scheduler-simulator.
The architectural model is composed of two main modules, namely the core simulator (CS) and the efficiency module (EF). The core simulator is the core execution engine, and it has been inherited from the Omega lightweight simulator as explained in Section 2. Although the main layers in the CS model are based on the same structure present in the original simulator, we made several modifications in order to be able to perform the experimentation related to security and energy efficiency. Each experiment is a set of executions and each execution defines all the operational environment details, such as energy policies and scheduling models.
The architecture of CS is a 3-layer architecture with the Workload generation as the first layer, which is responsible for the generation of the CC workload that will be used in every run of a single experiment. The workload is created only once. This is critical, since it enables the parameters and other aspects used within each run to be reliably compared. The workload is generated by the various data-center utilisation patterns. For instance, the day/night pattern is the predominant pattern in the data centers that execute large web services and applications, since this is common human behaviour.
A workload is composed of a set of Jobs. In the same way, a Job is composed of a Bag of Tasks. A Task is the minimum execution unit that may be deployed on a computational server. Each Task is mapped to a Linux container which is deployed in a similar (but more lightweight) way as a virtual machine (VM). This modern and more flexible virtualisation strategy replaces the traditional virtualisation strategy based on independent virtual machines. Each Task deployed on a Linux container requires a given amount of computational and memory resources for a given time to be successfully completed. In the current version of this simulator, neither Linux-container migration nor server consolidation is considered. Thus, once a Task is deployed on a server, it runs on this machine until its completion.
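The Workload, Job and Task hierarchy described above can be illustrated with a minimal Python sketch. The class and field names below are assumptions made for illustration only; they are not taken from the SCORE source code.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class Task:
    """Minimum execution unit; it stays on one server until completion."""
    cpu_cores: float   # CPU cores required by the task
    ram_gb: float      # memory required by the task, in GB
    duration_s: float  # time needed to complete, in seconds


@dataclass
class Job:
    """A Job is a bag of Tasks."""
    job_id: int
    tasks: List[Task] = field(default_factory=list)

    def total_cpu(self) -> float:
        # Aggregate CPU demand if all tasks run at the same time.
        return sum(t.cpu_cores for t in self.tasks)


@dataclass
class Workload:
    """A Workload is a set of Jobs, generated once per experiment."""
    jobs: List[Job] = field(default_factory=list)

    def task_count(self) -> int:
        return sum(len(j.tasks) for j in self.jobs)


# Hypothetical Batch job: 50 tasks of 0.3 cores, 0.5 GB and 90 s each.
batch_job = Job(job_id=1, tasks=[Task(0.3, 0.5, 90.0) for _ in range(50)])
workload = Workload(jobs=[batch_job])
```

Making Task immutable reflects the fact that, without migration or consolidation, a task's requirements and placement do not change once it is deployed.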
The Core Engine Simulation layer performs all the simulation duties, reads the workload generated, and performs the scheduling decisions to deploy the tasks on the worker nodes. The Cluster is defined by its Descriptor and represents the number of computational servers and their features. The current version of SCORE does not provide any networking capabilities due to two main reasons: a) The main goal of this work is to provide a performant and low resource-consuming tool capable of simulating large-scale data centers. These requirements impose serious restrictions regarding the level of detail of the developed features. b) There are several simulation tools that focus on networking details, as stated in Section 2.
Finally, the Scheduling models layer implements various scheduling frameworks, such as Omega [2], Mesos [1], and Monolithic models. These schedulers perform the resource-allocation process, and dictate the scheduling decisions to the Core Engine Simulation layer.
We developed the EF module in order to apply several energy-efficiency and resource-selection strategies. Therefore, this module is logically divided into two layers of the same name. The Efficiency strategies layer defines the policies for shutting down and powering on computational servers with the objective of optimising the energy consumption of the data center. These policies can act over the Cluster, changing the energy state of the working nodes from powered-on to shut-down and vice versa. Finally, in the Resource-selection strategies layer, several approaches for the reservation of resources are implemented, such as maximizing the dispersion of tasks, deploying them randomly, and minimising the dispersion of tasks. These selection decisions therefore have an impact on the overall performance and energy-saving results.
Fig. 2. Monolithic scheduler architecture, B - Batch-type task, S - Service-type task, M - Worker Node.
3.1. Scheduling models
• Monolithic schedulers: In this model, a single, centralized scheduling algorithm is employed for all jobs. Google Borg [19] is an example of this kind of scheduling model. The workflow of the monolithic scheduler is shown in Fig. 2.
• Two-level schedulers: In this model, the resource allocation and task placement concerns are separated. There is a unique, centralised active resource manager that offers computing resources to multiple parallel, independent, application-level scheduling nodes, as shown in Fig. 3. This approach allows the task-placement logic to be developed for every single application, but also allows the cluster state meta-data to be shared between these schedulers. Mesos [1] is an example of this kind of scheduler.
• Shared-state schedulers: In this model, the cluster meta-data is shared between all scheduling agents. The scheduling process is performed by using an out-of-date copy of this shared cluster meta-data. When one of these parallel schedulers makes a scheduling decision based on the probably stale cluster meta-data, the scheduling agent commits the scheduling decision as a transaction in an optimistic way, as shown in Fig. 4. Hence, if any of the scheduling operations committed cannot be applied because the chosen computing resources are no longer free, then that scheduling operation is repeated by the scheduler until no conflicts are found. Google Omega is an example of this kind of model.
• Fully-distributed schedulers: In this model, the scheduling frameworks have various independent scheduling nodes which work with a local and out-of-date vision of the cluster state with no central coordination. Sparrow [20] is an example of this kind of scheduler.
• Hybrid schedulers: In this model, several scheduling strategies are used (typically a fully distributed architecture is combined with a monolithic or shared-state design) depending on the workload. There are usually two scheduling paths: a distributed path for short or batch tasks, and a centralized path for the remaining tasks. Mercury [21] provides an example of this kind of scheduler.
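The optimistic commit-and-retry cycle of the shared-state model can be sketched as follows. This is a minimal Python illustration under simplifying assumptions (only CPU cores are tracked, and a first-fit choice is used); the names CellState, pick_machine and schedule are hypothetical and do not correspond to the Omega or SCORE code.

```python
from typing import Dict, Optional, Tuple


class CellState:
    """Shared cluster meta-data: free CPU cores per machine."""

    def __init__(self, free_cores: Dict[str, float]) -> None:
        self.free_cores = dict(free_cores)

    def snapshot(self) -> Dict[str, float]:
        # Each scheduler plans on a private copy that may become stale.
        return dict(self.free_cores)

    def commit(self, machine: str, cores: float) -> bool:
        # The transaction succeeds only if the resources are still free.
        if self.free_cores.get(machine, 0.0) >= cores:
            self.free_cores[machine] -= cores
            return True
        return False


def pick_machine(view: Dict[str, float], cores: float) -> Optional[str]:
    # First-fit choice on the scheduler's (possibly stale) view.
    for machine, free in view.items():
        if free >= cores:
            return machine
    return None


def schedule(cell: CellState, cores: float,
             stale_view: Optional[Dict[str, float]] = None,
             max_attempts: int = 100) -> Tuple[Optional[str], int]:
    """Return (machine, attempts); on conflict, refresh the view and retry."""
    view = stale_view if stale_view is not None else cell.snapshot()
    for attempt in range(1, max_attempts + 1):
        machine = pick_machine(view, cores)
        if machine is None:
            return None, attempt
        if cell.commit(machine, cores):
            return machine, attempt
        view = cell.snapshot()  # conflict detected: re-synchronise
    return None, max_attempts


cell = CellState({"m1": 2.0, "m2": 4.0})
stale = cell.snapshot()   # scheduler A copies the cell state
cell.commit("m1", 2.0)    # scheduler B claims m1 in the meantime
machine, attempts = schedule(cell, 2.0, stale_view=stale)
print(machine, attempts)  # m2 2
```

In this example the first attempt conflicts because the stale view still shows m1 as free; the second attempt, made on a fresh copy, succeeds on m2.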
3.2. SCORE energy-awareness model
SCORE has been developed to enrich the Google Omega Lightweight Simulator. A main feature included is the capability of performing energy-efficiency analysis by applying an energy-consumption model. The CPU is considered in order to compute the energy consumption. The proposed energy-awareness model considers the following states for each CPU core in a server: a) On: 150 W; b) Idle: 70 W. The energy consumption is linearly computed in terms of the utilisation of each CPU core. The following machine power states have also been considered: a) Off: 10 W; b) Shutting down: 160 W × number of cores; c) Powering on: 160 W × number of cores.
Fig. 3. Two-level scheduler architecture, SA - Scheduler Agent, O - Resource offer, C - Commit.
The total energy consumed by the whole data center is measured from time to time by checking the power state of every machine. This time interval is a configurable parameter.
Regarding the shut-down process time parameters, the following values have been assumed: a) T_On→Hibernated: 10 s, and b) T_Hibernated→On: 30 s. The power states and transitions are shown in Fig. 5. All the aforementioned power and time parameters can be modified for each experiment.
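The power figures above can be turned into a small worked example. The sketch below is in Python and rests on two assumptions that the text does not spell out: that "linearly computed" means a linear interpolation between the Idle and On values of each core, and that energy is accumulated by multiplying each periodic power sample by the sampling interval. All names are hypothetical.

```python
from enum import Enum
from typing import Sequence

CORE_ON_W = 150.0               # per-core power at full utilisation
CORE_IDLE_W = 70.0              # per-core power when idle
MACHINE_OFF_W = 10.0            # whole machine when off
TRANSITION_W_PER_CORE = 160.0   # shutting down or powering on


class PowerState(Enum):
    ON = "on"
    OFF = "off"
    SHUTTING_DOWN = "shutting_down"
    POWERING_ON = "powering_on"


def machine_power_w(state: PowerState, core_utilisations: Sequence[float]) -> float:
    """Instantaneous power of one machine; utilisations are in [0, 1]."""
    if state is PowerState.OFF:
        return MACHINE_OFF_W
    if state in (PowerState.SHUTTING_DOWN, PowerState.POWERING_ON):
        return TRANSITION_W_PER_CORE * len(core_utilisations)
    # Assumed linear model: idle power plus a share of (on - idle).
    return sum(CORE_IDLE_W + (CORE_ON_W - CORE_IDLE_W) * u
               for u in core_utilisations)


def energy_kwh(power_samples_w: Sequence[float], interval_s: float) -> float:
    """Accumulate periodic power samples into energy, in kWh."""
    joules = sum(power_samples_w) * interval_s
    return joules / 3.6e6  # 1 kWh = 3.6e6 J


# A 4-core machine with one busy core, one half-loaded core and two idle cores.
p_on = machine_power_w(PowerState.ON, [1.0, 0.5, 0.0, 0.0])  # 400.0 W
p_off = machine_power_w(PowerState.OFF, [0.0] * 4)           # 10.0 W
```

Under these assumptions, one hour at 400 W sampled every second yields 0.4 kWh, whereas the same machine switched off consumes 0.01 kWh over the same hour.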
3.3. SCORE energy-efficiency modules
As mentioned above, the energy-efficiency tier is composed of the following three modules:
• Shut-down module: This module is responsible for shutting down the computational servers in order to minimise energy consumption. Several strategies may be used in order to shut down the machines. Each shut-down strategy is implemented in the form of a Shut-down policy.
• Power-on module: This module is responsible for waking up the machines required to meet present or future workload demands. Several strategies may be used in order to minimise the negative performance impact caused by machines that are not available to immediately execute tasks because they are shut down. Each of these strategies is implemented in the form of a Power-on policy.
• Scheduling module: This module is responsible for determining which tasks should be deployed on which machine.
3.3.1. Shut-down policies
Power-off policies are responsible for deciding whether or not a machine should be shut down and for triggering the order for the shut-down operation.
In this work, the authors have divided the process into making the decision of whether a shut-down action must be taken, and carrying out the actual action of ordering the shut-down of the machine, in order to allow various combinations of strategies to be performed. The workflow of this process is illustrated in Fig. 6.
Shut-down decision policies can be: deterministic, such as always shutting down; or probabilistic, such as shutting down machines following the exponential policy. These decision policies always return a Boolean value which determines whether a given machine must be shut down. In order to make the decision, the policy may check various Cluster variables after a task has finished and its resources have been freed.
Fig. 4. Shared-state scheduler architecture, U - Cluster State Update.
Fig. 5. Machine power states.
Several shut-down policies have been implemented and tested in depth, as shown in Section 6. These policies include: Never power off, Always power off, Shut down depending on data-center load, Leave a security margin, Exponential, and Gamma. In addition, various power-off decision policies can be combined by using the logical operators AND and OR to achieve policies of a more flexible and complex nature.
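The idea of Boolean decision policies that can be combined with logical operators can be sketched in Python as follows. Only deterministic policies are shown; the probabilistic Exponential and Gamma policies are omitted because their exact definition is not given here. The policy names, the cluster dictionary and its keys are assumptions for illustration, not the SCORE API.

```python
from typing import Callable, Dict

# A decision policy maps a cluster snapshot to True (shut down) or False.
Cluster = Dict[str, float]
Decision = Callable[[Cluster], bool]


def always_power_off(cluster: Cluster) -> bool:
    return True


def never_power_off(cluster: Cluster) -> bool:
    return False


def load_below(threshold: float) -> Decision:
    """Shut down only while data-center load is under the threshold."""
    def decide(cluster: Cluster) -> bool:
        return cluster["used_cores"] / cluster["total_cores"] < threshold
    return decide


def keep_security_margin(margin: float) -> Decision:
    """Shut down only if enough powered-on free capacity would remain."""
    def decide(cluster: Cluster) -> bool:
        free_after = cluster["free_on_cores"] - cluster["machine_cores"]
        return free_after / cluster["total_cores"] >= margin
    return decide


def both(a: Decision, b: Decision) -> Decision:
    return lambda cluster: a(cluster) and b(cluster)


def either(a: Decision, b: Decision) -> Decision:
    return lambda cluster: a(cluster) or b(cluster)


# Hypothetical snapshot taken after a task has released its resources.
cluster = {"total_cores": 100, "used_cores": 20,
           "free_on_cores": 30, "machine_cores": 4}
cautious = both(load_below(0.3), keep_security_margin(0.3))
permissive = either(load_below(0.3), keep_security_margin(0.3))
```

With this snapshot the load is 20%, and shutting the machine down would leave 26% of the cores free and powered on, so the cautious combination refuses the shut-down while the permissive one allows it.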
3.3.2. Power-on policies
Power-on policies are responsible for maintaining sufficient resources available in order to properly execute the arriving jobs. As the complement to shut-down policies, the strategies developed are critical to guarantee that heterogeneous workloads and peak loads can be executed without causing a negative impact on the overall data-center performance, without breaking SLAs/SLOs, and without affecting the user experience.
As opposed to power-off policies, which make a decision and perform an action independently for each machine, power-on policies work with the overall cell state in order to turn on multiple machines if required by the workload. This power-on business logic is conceptually executed when a new job is scheduled, as opposed to the shut-down business logic, which is executed when resources are released. Of course, each scheduler has different scheduling processes, so powering-on cycles can therefore vary depending on the scheduler. In order to make power-on decisions, the system may require several items of operational information, such as: a) The Job that triggered the power-on action; and b) The scheduler model that triggered the power-on action.
Fig. 6. Power-off module architecture.
In order to be realistic, this simulation tool usually works with a heterogeneous workload with no evident usage pattern. This kind of workload is really complex to predict, and therefore shut-down policies may negatively impact the data-center performance if they are unable to predict near-future workload requirements properly.
Various deterministic and probabilistic power-on decision and action policies have been developed to face this challenge. These policies include: Never power on, Power-on only the required machines, Power-on a fixed number of machines to maintain a security margin, Power-on a percentage of machines to maintain a security margin, and power-on policies which make decisions based on statistical distributions, such as Exponential, and Gamma.
In addition, it is especially interesting that various power-on decision policies may be combined using the logical operators AND and OR to achieve policies of a more flexible and complex nature. With this strategy, the benefits of predictive policies may be enjoyed without giving up the possibility of turning on the required machines if the prediction fails or a peak load arrives.
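As an illustration of how two power-on policies might be combined, the following Python sketch computes the number of machines to wake up under a "required machines" policy and under a "percentage security margin" policy, and then takes the larger of the two. Interpreting the OR combination as a maximum over machine counts is an assumption of this sketch, as are all function and parameter names.

```python
import math


def machines_required(job_cores_needed: float, free_on_cores: float,
                      cores_per_machine: int) -> int:
    """Power on only the machines needed to fit the triggering job."""
    deficit = job_cores_needed - free_on_cores
    if deficit <= 0:
        return 0
    return math.ceil(deficit / cores_per_machine)


def machines_for_margin(total_machines: int, on_machines: int,
                        free_on_cores: float, cores_per_machine: int,
                        margin_fraction: float) -> int:
    """Power on enough machines to keep a fraction of all cores free."""
    target_free = margin_fraction * total_machines * cores_per_machine
    deficit = target_free - free_on_cores
    if deficit <= 0:
        return 0
    needed = math.ceil(deficit / cores_per_machine)
    # Never ask for more machines than are currently powered off.
    return min(needed, total_machines - on_machines)


def combined_power_on(*counts: int) -> int:
    """Assumed OR-style combination: satisfy the most demanding policy."""
    return max(counts)


# Hypothetical cell: 100 four-core machines, 40 powered on, 10 free cores.
reactive = machines_required(25.0, 10.0, 4)
predictive = machines_for_margin(100, 40, 10.0, 4, 0.25)
to_power_on = combined_power_on(reactive, predictive)
```

Here the reactive policy asks for 4 machines and the margin policy for 23, so the combined policy powers on 23 machines; if the margin policy asked for none, the reactive policy would still cover the triggering job.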
3.3.3. Scheduling strategies
SCORE designs the scheduling strategy as a plug-in piece that is established at experiment creation time. This scheduling strategy is used by all the schedulers of all scheduling models in order to determine on which machines the tasks should be deployed. Thus, the scheduling strategy works as a black box which takes the information of the whole cluster and of the job from these schedulers, and returns to them the mapping between tasks and machines to be applied. Once this mapping is available, each scheduling model deploys these tasks on the chosen machines depending on its own scheduling logic, as illustrated in Fig. 7.
Several scheduling strategies have been developed. These strategies include: those based on the ETC-matrix genetic process [22], such as ETC minimising makespan [23] and ETC minimising energy [24]; Random; Spread tasks the maximum; Greedy minimising energy; Greedy minimising makespan; Spread tasks the minimum; and Spread tasks the minimum with randomness.
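The black-box contract described above (cluster and job information in, a task-to-machine mapping out) can be sketched in Python. The function below is a hypothetical, CPU-only approximation of the two spreading strategies; it is not the SCORE implementation, and the tie-breaking rule is an arbitrary choice made so that results are deterministic.

```python
from typing import Dict, List, Optional


def place(task_cores: List[float], free_cores: Dict[str, float],
          spread: bool) -> Optional[Dict[int, str]]:
    """Map each task index to a machine name, or None if a task cannot fit.

    spread=True  approximates 'Spread tasks the maximum': prefer the
                 machine with the most free cores.
    spread=False approximates 'Spread tasks the minimum': prefer the
                 fullest machine that still fits the task.
    """
    free = dict(free_cores)  # work on a copy; the caller applies the mapping
    mapping: Dict[int, str] = {}
    for index, cores in enumerate(task_cores):
        candidates = [m for m, f in free.items() if f >= cores]
        if not candidates:
            return None
        # Ties are broken by machine name to keep the result deterministic.
        key = lambda m: (free[m], m)
        chosen = max(candidates, key=key) if spread else min(candidates, key=key)
        free[chosen] -= cores
        mapping[index] = chosen
    return mapping


cluster = {"m1": 4.0, "m2": 4.0, "m3": 4.0}
spread_out = place([1.0, 1.0, 1.0], cluster, spread=True)
packed = place([1.0, 1.0, 1.0], cluster, spread=False)
```

In this example the spreading variant uses three different machines, while the packing variant places all three tasks on one machine, which leaves the other two idle and therefore eligible for shut-down.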
4. Experiment analysis
4.1. General parameters
Fig. 7. Scheduling-strategy workflow.
4.2. Workload parameters
In order to perform realistic experimentation, the workload present in Google traces [25] is considered. Based on the studies made by the research community in [26,27], it is known that realistic jobs are composed of one or more tasks, sometimes thousands of tasks. In addition, two types of jobs are to be considered:
• Batch jobs: This workload is composed of jobs which perform a computation and then finish. These jobs have a determined start and end. MapReduce jobs are an example of a Batch job.
• Service jobs: This workload is composed of long-running jobs which provide end-user operations and infrastructure services. As opposed to Batch jobs, these jobs have no determined end. Web servers or services such as BigTable [28] are good examples of a Service job.
In order to properly define the workload and create the model for the generated jobs, the following job attributes are considered: a) Inter-arrival time, which represents the time elapsed between two consecutive Service jobs or between two Batch jobs; b) Number of tasks, which is usually higher for Batch jobs than for Service jobs; c) Job duration, which may be modified by the machine performance profile; and d) Resource usage, which represents the amount of CPU and RAM that every task in the job consumes [2].
4.2.1. Workload generation
Regarding the generation of the workload, several approaches are implemented and may be used, ranging from uniform generators, followed by several degrees of exponential generators and generators that rely on traces to model the tasks. This workload is generated at the beginning and used for all the experiments of that execution. The following workload generators are available in SCORE:
• Uniform: This strategy generates jobs at a uniform rate, of a uniform size and which consume the same amount of resources.
• Exponential: This approach generates workloads with jobs that have the inter-arrival time, the number of tasks, and the task duration sampled from exponential distributions. Two versions of the Exponential generator have been implemented in order to create both flat and day/night patterned workloads.
• Exponential built from a trace file: This generator creates workloads where the duration and the number of tasks of all jobs are sampled from exponential distributions built from a trace file. In addition, the Exponential and the Exponential built from a trace file can be chosen on a per-parameter basis.
• Trace file: This approach generates workloads that reproduce a trace file.
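A flat Exponential generator of the kind listed above can be sketched in a few lines of Python. This is an illustrative approximation only: the function name, the rounding of the task count and the use of a fixed seed are assumptions, and the day/night variant is not covered.

```python
import random
from typing import List, Tuple

# (submission time in s, number of tasks, task duration in s)
GeneratedJob = Tuple[float, int, float]


def exponential_workload(n_jobs: int, mean_interarrival_s: float,
                         mean_tasks: float, mean_duration_s: float,
                         seed: int = 0) -> List[GeneratedJob]:
    """Sample inter-arrival time, task count and duration exponentially."""
    rng = random.Random(seed)  # fixed seed: the workload is created once
    now = 0.0
    jobs: List[GeneratedJob] = []
    for _ in range(n_jobs):
        # expovariate takes the rate, i.e. the inverse of the mean.
        now += rng.expovariate(1.0 / mean_interarrival_s)
        n_tasks = max(1, round(rng.expovariate(1.0 / mean_tasks)))
        duration = rng.expovariate(1.0 / mean_duration_s)
        jobs.append((now, n_tasks, duration))
    return jobs


# Hypothetical Batch workload: 50 tasks and 90 s per task on average.
batch = exponential_workload(1000, mean_interarrival_s=10.0,
                             mean_tasks=50.0, mean_duration_s=90.0)
```

Seeding the generator makes the workload reproducible, which matches the requirement that the same workload be reused across all runs of an experiment so that policies can be compared reliably.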
Trace-related workload generators require a trace file that contains one job per line. Each line must present the following columns separated by whitespace: 1. Submission time; 2. Number of tasks; 3. Job duration; 4. Number of CPU cores required by the tasks; and 5. Amount of RAM required by the tasks.
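A reader for this five-column format might look like the following Python sketch. The names TraceJob and parse_trace are hypothetical, and the units of each column are left uninterpreted because they are not specified here.

```python
from typing import List, NamedTuple


class TraceJob(NamedTuple):
    submission_time: float
    num_tasks: int
    duration: float
    cpu_cores: float  # CPU cores required by the tasks
    ram: float        # RAM required by the tasks


def parse_trace(text: str) -> List[TraceJob]:
    """Parse a trace with one job per line and five whitespace-separated columns."""
    jobs: List[TraceJob] = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        if len(fields) != 5:
            raise ValueError(
                f"line {line_no}: expected 5 columns, got {len(fields)}")
        jobs.append(TraceJob(float(fields[0]), int(fields[1]),
                             float(fields[2]), float(fields[3]),
                             float(fields[4])))
    return jobs


# Two made-up trace lines, used only to exercise the parser.
sample = "0.0 50 90.0 0.3 0.5\n12.5 9 2000.0 0.5 1.2\n"
trace_jobs = parse_trace(sample)
```

Rejecting lines with the wrong number of columns makes a malformed trace fail early instead of silently producing a distorted workload.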
Table 2
Configurable experiment parameters.
Parameter | Description | Values
Cluster | Parameters related to the data center that must be fixed for all experiments |
#Machines | Data-center size | [1-∞]
#Cores | Number of CPU cores for every machine | [1-∞]
RAM | Amount of RAM in GB for every machine | [0.1-∞]
Heterogeneity | Flag to decide whether data-center machines are heterogeneous | Boolean
Machine performance profile | Describes the performance of every machine in the data center. The lower this value, the more performant the server is | Array, size: number of machines [0.01-∞]
Machine security profile | Describes the security of every machine in the data center. The higher this value, the more secure the server is [23] | Array, size: number of machines [1-5]
Machines energy profile | Describes the energy consumption of every machine in the data center. The lower this value, the more energy-efficient the server is | Array, size: number of machines [0.01-∞]
Power-on time | The time required to boot a server, in seconds | [0.1-∞]
Shut-down time | The time required to hibernate a server, in seconds | [0.1-∞]
Performance | Parameters related to performance iterated in order to create all experiment variations |
Per-job algorithm time | Time spent (in seconds) by the scheduler in order to make a job-level scheduling decision. This simulates the performance of the scheduling algorithm | Array [0.001-∞]
Per-task algorithm time | Time spent (in seconds) by the scheduler in order to make a task-level scheduling decision. This simulates the performance of the scheduling algorithm | Array [0.001-∞]
Blacklist | The percentage of machines not to be used | Array [0.0-∞]
Inter-arrival | This parameter rewrites the inter-arrival time generated for all jobs, replacing it with a fixed time instead | Array [0.001-∞]
Energy | Parameters related to performance iterated in order to create all experiment variations |
Shut-down policies | The shut-down policies to be run, such as: Always power off, Exponential, and Gamma | Array
Power-on | The power-on policies to be run | Array
Scheduling | The scheduling strategies to be run | Array
Sorting | These strategies are used by greedy scheduling strategies to sort the candidate servers for their later selection | Array
Specific | Parameters used by specific schedulers |
Schedulers assigned to workloads | Mapping that describes how many and which schedulers are assigned to which workload type (Batch/Service) used by non-monolithic schedulers. Each row adds a new scheduler to serve a workload | Map[Scheduler name -> Workload name]
Conflict mode | Approach used by non-monolithic schedulers to decide whether a commit results in conflict | resource-fit, sequence-numbers
Transaction mode | Approach used by non-monolithic schedulers to decide what to do when a commit results in conflict | all-or-nothing, incremental
5. Information retrieval
The results of the experiments are stored in one or several Google protocol buffer files. The main blocks of these protocol buffer files include the following:
• Experiment Environment section stores the information related to the global configuration of the experiments.
• Experiment Results section stores the information related to global results for each of the experiments performed.
• Workload Stats section stores the workload-specific information for both Batch and Service jobs for each Experiment Result, including the details of the genetic process when used.
• Scheduler Stats section stores the scheduler-specific information for each Experiment Result, including daily and workload-related details.
Fig. 8. Simulation results graphic scripts.
Table 3
Experiment outputs.
Parameter | Description | Values
Performance | Output parameters related to the data center overall and per-workload performance |
Queue time until first deploy | Represents the time a job waits in the queue until its first task is scheduled (in seconds). | [0.0-∞]
Queue time until full deploy | Represents the time a job waits in the queue until it is totally scheduled (not completion). | [0-∞]
Timed-out jobs | Number of jobs left unscheduled after 100 unsuccessful scheduling tries for a given job or 1000 tries for any given task in a job. | [0.0-∞]
Scheduler occupation | Percentage of scheduler utilisation on average | [0.0-100.0]
Job scheduling attempts | Number of scheduling operations needed to fully deploy a job. | [0-∞]
Task scheduling attempts | Number of task scheduling operations needed to fully deploy a job. | [0-∞]
Energy-efficiency | Output parameters related to the data-center resource and energy efficiency. |
Energy consumed | Total data-center energy consumption (in kWh) | [0.0-∞]
Energy saved vs. current system | Total energy saved by applying energy-efficiency policies compared to the same scenario with no energy-efficiency policies applied (in kWh). | [0.0-∞]
Shut-downs | Number of shut-down operations. | [0-∞]
Idle resources | Percentage of resources operating in an Idle state on average. | [0.0-100.0]
kWh saved per shut-down operation | This represents the energy saved against the number of shut-downs performed. It shows the goodness of the shut-down operations performed. | [0.0-∞]
• The Efficiency Stats section stores the energy-efficiency parameters for each Experiment Result.
• The Measurements section stores the performance and energy-efficiency metrics gathered in each measurement, performed every few seconds for each Experiment Result, in order to show the cluster evolution.
The information stored in the protocol buffer files cannot be visualized directly. Hence, a set of Python scripts that create human-readable and graphical information from these files has been implemented. The results of some of these graphic scripts are illustrated in Fig. 8.
5.1. Output indicators
Table 3 presents the most relevant result indicators of the experimentation performed, in terms of performance and energy efficiency.
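To make the queue-time and time-out indicators of Table 3 concrete, the following sketch derives them from per-task scheduling timestamps. The function names and the input layout are hypothetical, and treating the limits of 100 job-level and 1000 task-level tries as "greater than or equal" thresholds is an assumption about how the rule in Table 3 is applied.

```python
def queue_times(submitted_at, task_scheduled_at):
    """Return (until_first_deploy, until_full_deploy) in seconds.

    task_scheduled_at holds one timestamp per task, or None for a task
    that has not been scheduled yet. A missing value yields None for the
    indicator that cannot be computed.
    """
    scheduled = [t for t in task_scheduled_at if t is not None]
    first = min(scheduled) - submitted_at if scheduled else None
    fully_deployed = scheduled and len(scheduled) == len(task_scheduled_at)
    full = max(scheduled) - submitted_at if fully_deployed else None
    return first, full


def is_timed_out(job_attempts, task_attempts, job_limit=100, task_limit=1000):
    """Apply the time-out rule described for the 'Timed-out jobs' output."""
    return job_attempts >= job_limit or any(a >= task_limit for a in task_attempts)
```

For example, a job submitted at second 10 whose three tasks are scheduled at seconds 12, 15 and 11 waits 1 second until its first deploy and 5 seconds until its full deploy.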
6. Examples of usage
In this section, a set of experiments is run in order to illustrate certain experiment parameterisations and results related to both performance and energy efficiency. Although each experiment may show a different subset of parameters and results, all the parameters are used and returned in each experiment. In addition, several parameters have been fixed in order to keep the tests simple, such as always using the default power-on policy, which powers on machines when a workload cannot be served.
Each experiment runs for a 7-day operation period and processes a day-night patterned workload that uses, on average, approximately 30% of the resources, with load peaks reaching approximately 60% of data-center utilisation. The parameters of the jobs in this workload are generated by using an exponential distribution for the inter-arrival time, number of tasks, and duration, while the security constraints are generated randomly. Regarding these parameters, the following values are used: a) Batch jobs are composed of 50 tasks and Service jobs of 9 tasks on average; b) Batch-job tasks take 90 seconds and Service-job tasks 2,000 seconds to finish on average; and c) Batch-job tasks consume 0.3 CPU cores and 0.5 GB of memory, while Service-job tasks consume 0.5 CPU cores and 1.2 GB of memory.
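The workload generation described above can be sketched as follows. This is an illustration under stated assumptions, not the generator shipped with the tool: `generate_jobs` and its parameters are hypothetical names, the mean inter-arrival time is not given in the text and must be supplied by the caller, and the day-night pattern and the random security constraints are omitted.

```python
import random


def generate_jobs(kind, n_jobs, mean_interarrival_s, mean_tasks,
                  mean_task_duration_s, cpus_per_task, mem_gb_per_task, rng):
    """Draw inter-arrival time, task count and task duration per job
    from exponential distributions with the given means."""
    jobs = []
    clock = 0.0
    for _ in range(n_jobs):
        # expovariate takes the rate (1 / mean) as its argument.
        clock += rng.expovariate(1.0 / mean_interarrival_s)
        # Rounding and clamping to at least one task slightly shifts the
        # mean; this is a simplification of the sketch.
        n_tasks = max(1, round(rng.expovariate(1.0 / mean_tasks)))
        duration_s = rng.expovariate(1.0 / mean_task_duration_s)
        jobs.append({
            "kind": kind,
            "arrival_s": clock,
            "n_tasks": n_tasks,
            "task_duration_s": duration_s,
            "cpus_per_task": cpus_per_task,
            "mem_gb_per_task": mem_gb_per_task,
        })
    return jobs


rng = random.Random(42)
# Means taken from the text; the inter-arrival means are placeholders.
batch = generate_jobs("Batch", 100, 10.0, 50, 90.0, 0.3, 0.5, rng)
service = generate_jobs("Service", 20, 60.0, 9, 2000.0, 0.5, 1.2, rng)
```

Passing an explicit `random.Random` instance keeps the runs reproducible, which is convenient when comparing policies on the same workload.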
The data center is composed of 1,000 machines, each with 4 CPU cores and 8 GB of RAM. The energy, performance, and security profiles of these machines are randomly generated.
6.1. Genetic process experimentation
In this test, a set of experiments focused on illustrating certain results of the genetic process of the ETC-matrix-based scheduling strategies is run by using: a) a monolithic scheduling model, and b) fixed algorithm times.
The results corresponding to Batch jobs and the Single-path monolithic scheduler are presented in Table 4, where it can be observed that the genetic scheduling strategy focused on minimising the makespan achieves better performance results than the energy-focused strategy. The table also shows how the average makespan evolves between the genetic-process epochs.
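As an illustration of what a makespan-minimising genetic process over an ETC (expected time to compute) matrix looks like, the sketch below evolves task-to-machine assignments and records the best makespan per epoch. It is a generic textbook-style genetic algorithm, not the operators or parameters used by SCORE: the function names, population size, one-point crossover and mutation rate are all assumptions.

```python
import random


def makespan(assignment, etc):
    """Largest total load on any machine; etc[task][machine] is in seconds."""
    loads = [0.0] * len(etc[0])
    for task, machine in enumerate(assignment):
        loads[machine] += etc[task][machine]
    return max(loads)


def evolve(etc, epochs=100, pop_size=20, mutation_rate=0.1, rng=None):
    """Return (best_assignment, history); history[i] is the best makespan
    at epoch i. Needs at least 2 tasks and pop_size >= 4."""
    rng = rng or random.Random()
    n_tasks, n_machines = len(etc), len(etc[0])
    population = [[rng.randrange(n_machines) for _ in range(n_tasks)]
                  for _ in range(pop_size)]
    history = []
    for _ in range(epochs + 1):
        population.sort(key=lambda a: makespan(a, etc))
        history.append(makespan(population[0], etc))
        # Elitism: the better half survives unchanged.
        survivors = population[: pop_size // 2]
        children = []
        while len(survivors) + len(children) < pop_size:
            p1, p2 = rng.sample(survivors, 2)
            cut = rng.randrange(1, n_tasks)  # one-point crossover
            child = p1[:cut] + p2[cut:]
            # Mutation: occasionally move a task to a random machine.
            child = [rng.randrange(n_machines) if rng.random() < mutation_rate
                     else gene for gene in child]
            children.append(child)
        population = survivors + children
    return survivors[0], history


rng = random.Random(7)
etc = [[rng.uniform(10.0, 200.0) for _ in range(4)] for _ in range(30)]
best, history = evolve(etc, epochs=100, rng=rng)
print(f"epoch 0: {history[0]:.2f} s, epoch 100: {history[-1]:.2f} s")
```

Because the best individuals are carried over unchanged, the recorded makespan never increases from one epoch to the next, which mirrors the kind of Epoch 0 versus Epoch 100 comparison reported in Table 4.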
6.2. Performance experimentation for the Omega scheduler
Several parameters are available for evaluating the impact of the energy-efficiency policies and of the algorithm performance on the operation environment. A number of these parameters are presented after having executed a new set of experiments by using the following simulation configuration: a) the Omega scheduling model, with 4 schedulers responsible for serving Batch jobs and 1 scheduler for Service jobs; b) one scheduling strategy which strives to maximise machine usage while minimising resource contention; and c) the Resource-fit conflict mode.
Table 4
Parameters of the minimising-makespan genetic process.

| Shut-down | Scheduling | Savings (%) | Makespan Avg. (s) | Epoch 0 (s) | Epoch 100 (s) |
|---|---|---|---|---|---|
| Never off | Makespan | N/A | 236.01 | 324.02 | 202.07 |
| Never off | Energy | N/A | 290.55 | N/A | N/A |
| Random | Makespan | 55.88 | 258.41 | 344.51 | 273.99 |
| Random | Energy | 55.31 | 311.91 | N/A | N/A |
Table 5
Omega performance experimentation.

| Shut-down policy | Transaction mode | Per-job alg. time (s) | Per-task alg. time (ms) | Savings (%) | Queue time until first deploy (ms) | Queue time until full deploy (ms) |
|---|---|---|---|---|---|---|
| Never off | all-or-nothing | 0.1 | 10 | N/A | 0.0 | 0.0 |
| Never off | all-or-nothing | 0.1 | 100 | N/A | 560.0 | 1,021.4 |
| Never off | all-or-nothing | 1.0 | 10 | N/A | 0.2 | 0.2 |
| Never off | all-or-nothing | 1.0 | 100 | N/A | 791.8 | 1,604.0 |
| Never off | incremental | 0.1 | 10 | N/A | 0.0 | 0.0 |
| Never off | incremental | 0.1 | 100 | N/A | 12.9 | 27.5 |
| Never off | incremental | 1.0 | 10 | N/A | 0.0 | 0.0 |
| Never off | incremental | 1.0 | 100 | N/A | 18.5 | 38.9 |
| Always off | all-or-nothing | 0.1 | 10 | 46.08 | 0.1 | 0.8 |
| Always off | all-or-nothing | 0.1 | 100 | 40.86 | 2,294.7 | 5,368.4 |
| Always off | all-or-nothing | 1.0 | 10 | 44.97 | 1.2 | 2.2 |
| Always off | all-or-nothing | 1.0 | 100 | 38.78 | 3,503.1 | 9,061.6 |
| Always off | incremental | 0.1 | 10 | 45.06 | 0.1 | 0.5 |
| Always off | incremental | 0.1 | 100 | 45.10 | 36.9 | 84.6 |
| Always off | incremental | 1.0 | 10 | 45.02 | 1.1 | 2.5 |
Table 6
Mesos energy-efficiency experimentation.

| Shut-down policy | Per-job alg. time (s) | Per-task alg. time (ms) | Energy consumed (kWh) | Energy saved (kWh) | # shut-downs | Idle resources (%) |
|---|---|---|---|---|---|---|
| Never off | 0.1 | 10 | 57,259 | 0 | 0 | 69.30 |
| Never off | 0.1 | 100 | 57,359 | 0 | 0 | 69.29 |
| Never off | 1.0 | 10 | 57,372 | 0 | 0 | 69.27 |
| Never off | 1.0 | 100 | 57,440 | 0 | 0 | 69.28 |
| Always off | 0.1 | 10 | 31,138 | 25,774 | 18,520 | 5.97 |
| Always off | 0.1 | 100 | 31,748 | 25,208 | 18,473 | 7.28 |
| Always off | 1.0 | 10 | 31,642 | 25,321 | 19,719 | 7.00 |
| Always off | 1.0 | 100 | 31,808 | 25,137 | 17,550 | 7.48 |
| Random | 0.1 | 10 | 32,003 | 24,949 | 9,136 | 7.97 |
| Random | 0.1 | 100 | 32,694 | 24,295 | 8,968 | 9.55 |
| Random | 1.0 | 10 | 32,702 | 24,225 | 8,816 | 9.82 |
| Random | 1.0 | 100 | 32,739 | 24,241 | 9,606 | 9.65 |
| Exponential | 0.1 | 10 | 34,952 | 21,842 | 1,156 | 16.43 |
| Exponential | 0.1 | 100 | 36,119 | 20,823 | 1,286 | 18.33 |
| Exponential | 1.0 | 10 | 35,953 | 20,951 | 1,214 | 17.87 |
| Exponential | 1.0 | 100 | 36,513 | 20,408 | 1,816 | 19.49 |
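The results corresponding to Batch jobs are presented in Table 5, where it can be observed that the algorithm performance and the conflict-handling strategy have a notable impact both on queue times and on energy consumption. For instance, with no shut-down policy, a per-job algorithm time of 0.1 s and a per-task algorithm time of 100 ms, the queue time until first deploy is 560.0 ms in the all-or-nothing mode and 12.9 ms in the incremental mode.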
6.3. Energy-efficiency experimentation for the Mesos scheduler
In addition to performance parameters, the energy-efficiency parameters constitute the core of the simulation tool. In order to illustrate these parameters, several experiments have been run by using the same configuration as laid out in Section 6.2 for the Omega scheduler.
The results corresponding to Batch jobs are presented in Table 6, where it can be observed that the shut-down policies exert a major impact on energy consumption and hardware stress, and that a high level of energy efficiency can be achieved without performing many shut-down operations, thereby minimising the performance impact.
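The last observation can be checked with the "KWh saved per shut-down operation" indicator of Table 3, computed here as the reported energy saved divided by the number of shut-downs. The snippet below is a small worked example on three rows of Table 6 (per-job time of 0.1 s, per-task time of 10 ms); the function name is hypothetical, and returning 0.0 when no shut-down is performed is an assumption.

```python
# (policy, energy saved in kWh, number of shut-downs), copied from Table 6
# for the 0.1 s per-job / 10 ms per-task configuration.
TABLE_6_ROWS = [
    ("Always off", 25_774, 18_520),
    ("Random", 24_949, 9_136),
    ("Exponential", 21_842, 1_156),
]


def kwh_saved_per_shutdown(energy_saved_kwh, shutdowns):
    """Energy saved per shut-down operation (assumed 0.0 with no shut-downs)."""
    if shutdowns == 0:
        return 0.0
    return energy_saved_kwh / shutdowns


for policy, saved, shutdowns in TABLE_6_ROWS:
    ratio = kwh_saved_per_shutdown(saved, shutdowns)
    print(f"{policy}: {ratio:.2f} kWh saved per shut-down")
```

For these rows the ratio is roughly 1.4 kWh for Always off, 2.7 kWh for Random and 18.9 kWh for Exponential: the Exponential policy retains about 85% of the energy saved by Always off while performing about 6% of its shut-down operations.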
7. Summary and future work
Our simulation tool, SCORE, is presented as an extension to the Google Omega lightweight simulator, a simulator focused on the comparison of the performance between scheduling models, especially parallel frameworks in large-scale data centers. The model of the Google Omega lightweight simulator has been enhanced with extensions and improvements for: a) the development of an energy consumption model; b) the extension and creation of allocation policies; c) the extension and creation of energy-efficiency policies based on shutting down and powering on machines; d) the addition of heterogeneity to data-center machines; and e) the consideration of security profiles.
It has been shown that the application of all the features of this tool to heterogeneous workloads in large data centers results in major potential improvements from both the energy-efficiency and performance points of view.
As future work, several aspects of the tool are being improved. These improvements include:
• Greater ease of use and extensibility of the implemented code.
• Addition of a visual interface to set the experiment configurations and to execute the experiments.
• Addition of a real-time visualizer of the power and performance state of the machines.
• Incorporation of support for other workload constraints, such as time and precedence constraints.
• Task dependencies, which imply, among other things, networking considerations.
• Development of task migration and server consolidation techniques.
Acknowledgement
The research is supported by the VPPI-University of Sevilla. This article is based upon work from COST Action IC1406 “High-Performance Modelling and Simulation for Big Data Applications” (cHiPSet), supported by COST (European Cooperation in Science and Technology).
References
[1]B.Hindman,A.Konwinski,M.Zaharia,A.Ghodsi,A.D.Joseph,R.H.Katz,S.Shenker,I.Stoica,Mesos:Aplatformforfine-grainedresourcesharingin thedatacenter.,NSDI11(2011)22.
[2]M.Schwarzkopf,A.Konwinski,M.Abd-El-Malek,J.Wilkes, Omega:flexible,scalableschedulersforlargecomputeclusters,in:Proceedingsofthe8th ACMEuropeanConferenceonComputerSystems,ACM,2013,pp.351–364.
[3] J.Kołodziej,Evolutionaryhierarchicalmulti-criteriametaheuristicsforschedulinginlarge-scalegridsystems,419,Springer,2012.
[4] J.Koomey,Growthindatacenterelectricityuse2005to2010,AreportbyAnalyticalPress,completedattherequestofTheNewYorkTimes9(2011).
[5] A.Fernández-Montes,D.Fernández-Cerero,L.González-Abril,J.A.Álvarez-García,J.A.Ortega, Energywastingatinternetdatacentersduetofear, PatternRecognit.Lett.67(2015)59–65.
[6] S.Fortmann-Roe,Insightmaker:Ageneral-purposetoolforweb-basedmodeling&simulation,SimulationModel.PracticeTheory47(2014)28–45.
[7] B.Heller,S.Seetharaman,P.Mahadevan,Y.Yiakoumis,P.Sharma,S.Banerjee,N.McKeown,Elastictree:Savingenergyindatacenternetworks.,in: Nsdi,10,2010,pp.249–264.
[8] W.Tian,Y.Zhao,M.Xu,Y.Zhong,X.Sun,Atoolkitformodelingandsimulationofreal-timevirtualmachineallocationinaclouddatacenter,IEEE Trans.Autom.Sci.Eng.12(1)(2015)153–161.
[9] R.Buyya,M.Murshed,Gridsim:Atoolkitforthemodelingandsimulationofdistributedresourcemanagementandschedulingforgridcomputing, ConcurrencyComput.14(13-15)(2002)1175–1220.
[10] F.Howell,R.McNab,Simjava:Adiscreteeventsimulationlibraryforjava,Simul.Series30(1998)51–56.
[11]R.N.Calheiros,R.Ranjan,A.Beloglazov,C.A.DeRose,R.Buyya,Cloudsim:atoolkitformodelingandsimulationofcloudcomputingenvironmentsand evaluationofresourceprovisioningalgorithms,Software41(1)(2011)23–50.
[12] D.Kliazovich,P.Bouvry,S.U.Khan,Greencloud:apacket-levelsimulatorofenergy-awarecloudcomputingdatacenters,J.Supercomput.62(3)(2012) 1263–1283.
[13] A.Fernández-Montes,L.Gonzalez-Abril,J.A.Ortega,L.Lefèvre,Smartschedulingforsavingenergyingridcomputing,ExpertSyst.Appl.39(10)(2012) 9443–9450.
[14] K.Bahwaireth,E.Benkhelifa,Y.Jararweh,M.A.Tawalbeh,etal.,Experimentalcomparisonofsimulationtools forefficientcloudandmobilecloud computingapplications,EURASIPJ.Inf.Secur.2016(1)(2016)1–14,doi:10.1186/s13635-016-0039-y.
[15] W.Tian,M.Xu,A.Chen,G.Li,X.Wang,Y.Chen,Open-sourcesimulatorsforcloudcomputing:Comparativestudyandchallengingissues,Simul.Model. Pract.Theory58(2015)239–254.
[16] C.Badii,P.Bellini,I.Bruno,D.Cenni,R.Mariucci,P.Nesi,Icarocloudsimulatorexploitingknowledgebase,Simul.Model.Pract.Theory62(2016)1–13.
[17]G.Kecskemeti,Dissect-cf:asimulatortofosterenergy-awareschedulingininfrastructureclouds,Simul.Model.Pract.Theory58(2015)188–218.
[18] S.Robinson,Conceptualmodellingforsimulationpartii:aframeworkforconceptualmodelling,J.Oper.Res.Soc.59(3)(2008)291–304.
[19] A.Verma,L.Pedrosa,M.Korupolu,D.Oppenheimer,E.Tune,J.Wilkes,Large-scaleclustermanagementatgooglewithborg,in:Proceedingsofthe TenthEuropeanConferenceonComputerSystems,ACM,2015,p.18.
[20]K.Ousterhout,P.Wendell,M.Zaharia,I.Stoica,Sparrow:distributed,lowlatencyscheduling,in:ProceedingsoftheTwenty-FourthACMSymposium onOperatingSystemsPrinciples,ACM,2013,pp.69–84.
[21]K.Karanasos,S.Rao,C.Curino,C.Douglas,K.Chaliparambil,G.M.Fumarola,S.Heddaya,R.Ramakrishnan,S.Sakalanaga,Mercury:Hybridcentralized anddistributedschedulinginlargesharedclusters.,in:USENIXAnnualTechnicalConference,2015,pp.485–497.
[22] A.Jakobik,D.Grzonka,J.Kolodziej,H.González-Vélez,Towardssecurenon-deterministicmeta-schedulingforclouds,in:30thEuropeanConference onModellingandSimulation,ECMS2016,Regensburg,Germany,May31-June3,2016,Proceedings.,2016,pp.596–602,doi:10.7148/2016-0596.
[23] A.Jakobik,D.Grzonka,F.Palmieri,Non-deterministicsecuritydrivenmetaschedulerfordistributedcloudorganizations,Simul.Model.PracticeTheory (availableonline4November2016).https://doi.org/10.1016/j.simpat.2016.10.011.
[24] A.Jakóbik,D.Grzonka,J.Kołodziej,Securitysupportiveenergyawareschedulingandscalingfor cloudenvironments,in:EuropeanConferenceon ModellingandSimulation,ECMS2017,Budapest,Hungary,May23-26,2017,Proceedings.,2017,pp.583–590,doi:10.7148/2017-0583.
[25]C.Reiss,J.Wilkes,J.L.Hellerstein,Obfuscatoryobscanturism:makingworkloadtracesofcommercially-sensitivesystemssafetorelease,in:3rd Inter-nationalWorkshoponCloudManagement(CLOUDMAN),IEEE,Maui,HI,USA,2012,pp.1279–1286.
[26]O.A.Abdul-Rahman,K.Aida,TowardsunderstandingtheusagebehaviorofGooglecloudusers:themiceandelephantsphenomenon,in:IEEE Inter-nationalConferenceonCloudComputingTechnologyandScience(CloudCom),Singapore,2014,pp.272–277.
[27]S.Di,D.Kondo,C.Franck,CharacterizingcloudapplicationsonaGoogledatacenter,in:42ndInternationalConferenceonParallelProcessing(ICPP), Lyon,France,2013.
[28]F.Chang,J.Dean,S.Ghemawat,W.C.Hsieh,D.A.Wallach,M.Burrows,T.Chandra,A.Fikes,R.E.Gruber,Bigtable:Adistributedstoragesystemfor structureddata,ACMTrans.Comput.Syst.(TOCS)26(2)(2008)4.