Harman, M., Lakhotia, K., Singer, J., White, D., and Yoo, S. (2013) Cloud
engineering is search based software engineering too. Journal of Systems
and Software, 86 (9). pp. 2225-2241. ISSN 0164-1212
Copyright © 2013 Elsevier Ltd.
A copy can be downloaded for personal non-commercial research or
study, without prior permission or charge
The content must not be changed in any way or reproduced in any format
or medium without the formal permission of the copyright holder(s)
When referring to this work, full bibliographic details must be given
http://eprints.gla.ac.uk/71317/
Deposited on: 20 August 2013
Enlighten – Research publications by members of the University of Glasgow
http://eprints.gla.ac.uk
Cloud engineering is Search Based Software Engineering too
Mark Harman a, Kiran Lakhotia a, Jeremy Singer b, David R. White b,∗, Shin Yoo a
a University College London, Gower Street, London WC1E 6BT, United Kingdom
b School of Computing Science, University of Glasgow, G12 8QQ, United Kingdom
Article info
Article history: Received 2 July 2012; Accepted 15 October 2012; Available online 23 November 2012
Keywords: Search Based Software Engineering (SBSE); Cloud computing
Abstract
Many of the problems posed by the migration of computation to cloud platforms can be formulated and solved using techniques associated with Search Based Software Engineering (SBSE). Much of cloud software engineering involves problems of optimisation: performance, allocation, assignment and the dynamic balancing of resources to achieve pragmatic trade-offs between many competing technical and business objectives. SBSE is concerned with the application of computational search and optimisation to solve precisely these kinds of software engineering challenges. Interest in both cloud computing and SBSE has grown rapidly in the past five years, yet there has been little work on SBSE as a means of addressing cloud computing challenges. Like many computationally demanding activities, SBSE has the potential to benefit from the cloud; ‘SBSE in the cloud’. However, this paper focuses, instead, on the ways in which SBSE can benefit cloud computing. It thus develops the theme of ‘SBSE for the cloud’, formulating cloud computing challenges in ways that can be addressed using SBSE.
© 2012 Elsevier Inc. All rights reserved.
1. Introduction
In many ways cloud computing is not conceptually new: it effectively integrates a set of existing technologies and research areas on a grand scale, but there is nothing inherently new about most of the specific technical challenges involved. The business model that underpins cloud computing is new. Rather than viewing software and the hardware upon which it executes as fixed assets to be purchased and subsequently subject to depreciation, the cloud model treats both software and hardware as rented commodities.
This is a profound change, because it removes both the business issue of depreciation and more technically oriented concerns over resource fragmentation and smooth scalability, along with definite commitments to particular platforms, protocols, and interoperability. Nevertheless, in order to avail themselves of these advantages, both the cloud provider and their clients will need new ways of designing, developing, deploying and evolving software, thereby posing new questions for the research community.
The pace of the recent industrial uptake of cloud computing, with its associated technical challenges, also increases the significance of existing conceptual and pragmatic questions that had previously failed to receive widespread attention. Examples include the tailoring of operating system environments and testing in a virtualised environment. Some of these technical issues present new difficulties that must be overcome, whilst others offer opportunities that will enable software engineers to test and deploy code in new ways.
∗ Corresponding author.
Cloud computing represents a further step along a road that has taken our conception of computation from the rarefied and ideal world of formal calculi to the more mundane and inexact world of engineering reality. In the 1970s and 1980s, there was a serious debate about whether such a thing as ‘software engineering’ even existed (Hoare, 1978). Many argued that software was not an engineering artefact and that to treat it as such was not only wrong but dangerous, with special opprobrium (Dijkstra, 1978) being reserved for any dissenters who dared to advocate software testing (DeMillo et al., 1979).
However, as time passed and systems became larger, there was a gradual realisation that as size and complexity increase, algorithms give way to systems and programs give way to software. A detailed and complete formal proof of a 50 million line banking system may, indeed, be conceptually desirable, but it is no more practical than a quantum mechanical description of the operation of the human circulatory system. The view of software as an engineering artefact, whereby it is meaningful to speak of software engineering as a discipline, gradually became accepted – even by its erstwhile detractors (Hoare, 1996, 1996).
As the recognition of software development as an engineering activity gained increasing traction, the view of software engineering as a discipline fundamentally concerned with optimisation also gained wider attention and acceptance. This optimisation-centric view of software engineering is typified by the development of the area of research and practice known as Search Based Software Engineering (SBSE) (Harman and Jones, 2001; Harman, 2007), a topic of growing interest for more than ten years (Freitas and Souza, 2011). SBSE has now permeated almost every area of software engineering activity (Zhang et al., 2008; Afzal et al., 2009; Ali et al., 2010; Räihä, 2010; Afzal and Torkar, 2011; Harman et al., 2012a).
Cloud engineering propels us further along this journey towards a vision of software engineering in which optimisation lies at the very centre. The central justification for deploying software on a cloud platform rests upon a claim about optimisation. That is:
Optimisation of resource usage can be achieved by consolidating hardware and software infrastructure into massive datacentres, from which these resources are rented by consumers on-demand.
Cloud computing is thus the archetypal example of an optimisation-centric view of software engineering. Many of cloud computing’s problems and challenges revolve around optimisation, while most of its claimed advantages are unequivocally and unashamedly phrased in terms of optimisation objectives. SBSE is thus a natural fit for cloud computing; it is primarily concerned with the formulation of software engineering problems as optimisation problems, which can subsequently be solved using search algorithms.
As well as providing a natural set of intellectual tools with which to address the challenges and opportunities of cloud computing, SBSE may also provide a convenient bridge between the engineering problems of cloud computing and its business imperatives. Optimisation objectives connect these two concerns, seeking architectures that maximise flexibility, assignments that minimise fragmentation, (re)designs that minimise cost per transaction and so on.
This paper explores this symbiotic relationship between SBSE and cloud computing. Many papers have been written on the advantages of migrating software engineering activities to the cloud (Abadi, 2009; Sotomayor et al., 2009; Armbrust et al., 2010). No doubt there are many advantages of such a migration to SBSE computation in the cloud, some of which have already been realised (Le Goues et al., 2012). In this paper we pursue a different direction. Rather than showing that the cloud benefits SBSE (we term this SBSE in the cloud), this paper explores and develops how SBSE may benefit the cloud (we term this SBSE for the cloud). Thus we ask the following question:
How can SBSE help to optimise the design, development, deployment and evolution of cloud computing for its providers and their clients?
We reach out to two audiences with this paper: first, we aim to persuade SBSE researchers that cloud computing presents relevant research challenges; second, we aim to show cloud practitioners that SBSE has potential to solve many of their optimization problems. The primary contributions of this paper¹ are:
1. To set out a research agenda of SBSE for cloud computing, giving a detailed road map.
2. To make the road map concrete we specify, in detail, five problems within the broad areas of ‘SBSE for the cloud’. We re-formulate these problems as search problems in the standard manner for SBSE research (Harman and Jones, 2001; Harman, 2007), by proposing specific solution representations and evaluation functions to guide the search.
3. To review how cloud systems pose new challenges and opportunities within existing application areas of SBSE, and in particular search based software testing.
¹ This is an invited ‘road map’ paper which forms part of Journal of Systems and Software Special Issue on ‘Software Engineering for/in the Cloud’.
The rest of this paper is organised as follows. First, we give a brief background of cloud computing and SBSE in Section 2. Then we discuss the engineering challenges facing cloud providers and their clients in Section 3. Section 4 gives examples of where and how SBSE can be applied to solve cloud engineering problems, which are new areas of application for SBSE techniques. In Section 5 we outline where the nature of cloud computing offers opportunities for improvement over past SBSE applications. In Section 6, we discuss some common challenges of applying these methods in a cloud scenario before concluding in Section 7.
2. Background
We now provide a brief overview of cloud computing and Search Based Software Engineering.
2.1. Cloud computing
Cloud computing (Mell and Grance, 2009) is a relatively recent and currently high-profile method of IT deployment. Compute facilities, such as virtualised server hosting or remote storage accessed via an API, are provided by a cloud provider to a cloud client.² Often a client is a third party company, who will use the compute facilities to provide an external service to their users. For example, Amazon is a cloud provider supplying a storage API to Dropbox, who use the API to provide a file synchronisation service to domestic and commercial users.
In many ways, cloud computing resembles a move away from conventional desktop computing that has been the hallmark of the previous decade, towards a centralised computing model. One stark difference to the mainframes of the timesharing past is that the new paradigm is supported by vast datacentres containing tens of thousands of servers working in unison. These datacentres are described by Barroso and Hölzle (2009) as ‘Warehouse-Sized Computers’ (WSCs), reflecting their view that the machines can collectively be regarded as a single entity. The coordination of these machines is supported by specialist software layers, including virtualisation technologies such as Xen (Barham et al., 2003) and distributed services such as BigTable (Chang et al., 2008).
Most of the technologies involved in constructing and using cloud computing are not new, but rather it is their particular combination and large-scale deployment that is novel. The emergence of cloud computing would not have been possible without the growth in virtualisation and widespread internet access. Cloud computing promises to commoditise data storage, processing and serving in the way envisioned by utility computing (Rappa, 2004) and service oriented architecture (Papazoglou and van den Heuvel, 2007).
2.1.1. Cloud architectures
Cloud computing is frequently presented as a layered architecture, as shown in Fig. 1. However, in practice these layers are not distinct. For example, data storage services may be regarded as infrastructure, a platform or an application, depending on exactly how those services are being used. We deliberately avoid using these somewhat artificial distinctions.
² In this paper, the word ‘user’ is reserved for the end-users of software, i.e. the customers of a client. This is complicated somewhat by the fact that some cloud providers serve both clients and users; for example, Google provide App Engine as a service to developers, but they also provide Gmail to users.
Fig. 1. The cloud stack architecture.
Cloud hardware is typically composed of commodity server-grade x86 computers arranged in a shallow hierarchy composed of racks and clusters of racks. They are physically located in a series of geographically distributed datacentres. In the case of Amazon Web Services, these datacentres are split into discrete availability zones.
Servers run hypervisor technology such as Xen or VMware, which manages multiple virtual machines (VMs) on each physical machine. VMs are provided directly to client companies by providers such as RackSpace or Amazon at the infrastructure layer. Each live virtual machine is an instance of an offered VM configuration, which is principally defined by the memory and processing power available to the VM. Every instance mounts a machine image, also known as an operating system image. An image includes an operating system, and the required server software such as web servers or transaction processing software.
The most straightforward example of a platform-level service is Google App Engine, where a customer provides code and Google automatically manages the scaling to respond to incoming requests.
Applications comprise most usage of cloud computing by the general public. Popular current examples include Microsoft Office 365 and Gmail.
2.1.2. Cloud research
Rather than attempting a detailed literature review of cloud computing, we present a simple, informal, graphical overview of cloud research. Fig. 2 is a word cloud generated from the titles of around 25,000 papers returned by a keyword search for cloud at the ACM guide to computing literature. The search was performed on 15 May 2012. In this cloud word cloud, the size of a word reflects its frequency in the underlying corpus. Common English words and those occurring rarely are not included in the diagram.
From a cursory inspection of Fig. 2, the largest words are indicative of the topic (cloud, computing) and the technology (applications, networks, web, etc.). To drill down to the research themes of cloud computing, we remove from the underlying corpus all terms that occur in the concise NIST definition of cloud computing, which is the first sentence of Section 2 in Mell and Grance (2009). Fig. 3 shows the resulting filtered cloud word cloud.
There are two key points to highlight from Fig. 3.
1. As mentioned in Section 1, cloud research issues are not new. The cloud word cloud shows strong links with ancestral research fields including distributed, parallel, and grid computing.
2. The concepts presented in the word cloud are amenable to SBSE optimisation. Apparent terms such as scheduling, clustering, reconstruction, and optimisation all admit the potential of SBSE. Terms such as estimation, modelling, prediction and simulation show that the cloud research community is attempting to apply well-known existing system optimisation techniques to the cloud domain.
2.1.3. Cloud word cloud in the cloud
As an aside, the cloud word clouds in Figs. 2 and 3 were generated in the cloud. The paper title metadata was collected from the ACM web portal using 50 Amazon EC2 t1.micro Linux instances, and stored in a single S3 bucket. The estimated cost of this computation is $0.89, using Amazon Web Services’ on-demand prices from May 2012 in the EU (Ireland) region. The graphical representations of the cloud word clouds are produced by Wordle, an online word cloud generation service (Viegas et al., 2009).
This exercise serves to demonstrate the accessibility and economy of cloud computing for simple data processing tasks. We fully expect the rapid, widespread adoption of cloud computing, as predicted by industry analysts IDC (2011).
2.2. Search Based Software Engineering
Search Based Software Engineering (SBSE) is the name given to a field of research and practice in which computational search and optimisation techniques are used to address problems in software engineering (Harman and Jones, 2001). The approach has proved successful as a means of attacking challenging problems in which there are potentially many different and possibly conflicting objectives, the measurement of which may be subject to noise, imprecision and incompleteness. Notable successes have been achieved for problems such as test data generation, modularisation, automated patching, and regression testing, with industrial uptake (Cornford et al., 2003; Wegener and Bühler, 2004; Afzal et al., 2010; Lakhotia et al., 2010b; Cadar et al., 2011; Yoo et al., 2011) and the provision of tools such as AUSTIN (Lakhotia et al., 2010a), Bunch (Mitchell and Mancoridis, 2006), EvoSuite (Fraser and Arcuri, 2011), GenProg (Le Goues et al., 2012), Milu (Jia and Harman, 2008) and SWAT (Alshahwan and Harman, 2011).
The first step in the application of SBSE consists of a reformulation of a software engineering problem as a ‘search problem’ (Harman and Jones, 2001; Harman et al., 2012a). This formulation involves the definition of a suitable representation of candidate solutions to the problem, or some representation from which these solutions can be obtained, and a measure of the quality of a given solution: an evaluation function. Strictly speaking, all that is required of an evaluation function is that it enables one to compute, for any two candidate solutions, which is the better of the two.
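The two ingredients named above, a representation and an evaluation function that only needs to say which of two candidates is better, can be illustrated with a minimal hill climber. This is a sketch under stated assumptions rather than a method from the paper: the toy problem (moving an integer tuple towards a hidden target), the names `hill_climb`, `neighbours`, `better` and `TARGET` are all hypothetical and chosen only for illustration.

```python
def hill_climb(initial, neighbours, better, max_steps=1000):
    """First-ascent hill climbing that relies only on pairwise comparison."""
    current = initial
    for _ in range(max_steps):
        for candidate in neighbours(current):
            if better(candidate, current):
                current = candidate  # accept the first improving neighbour
                break
        else:
            break  # no neighbour is better: a local optimum has been reached
    return current


# Hypothetical toy problem: the representation is a tuple of integers.
TARGET = (3, 7, 1)


def distance(vector):
    """Manhattan distance to the (illustrative) target."""
    return sum(abs(a - b) for a, b in zip(vector, TARGET))


def neighbours(vector):
    """All vectors that differ from `vector` by +/-1 in one position."""
    for i in range(len(vector)):
        for delta in (-1, 1):
            yield vector[:i] + (vector[i] + delta,) + vector[i + 1:]


def better(a, b):
    """Evaluation function in its weakest form: is candidate a better than b?"""
    return distance(a) < distance(b)


result = hill_climb((0, 0, 0), neighbours, better)
```

Note that `hill_climb` never inspects a numeric fitness value; any comparison that orders two candidates would serve, which is the point made in the paragraph above.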
Fig. 2. Cloud word cloud, summarizing 25,000 research papers on cloud computing.
There are many surveys, overviews and reviews of SBSE that provide more in-depth treatment and coverage of results specific to individual engineering subdisciplines such as requirements analysis (Zhang et al., 2008), predictive modelling (Harman, 2010; Afzal and Torkar, 2011), non-functional properties (Afzal et al., 2009), program comprehension (Harman, 2007), design (Räihä, 2010) and testing (McMinn, 2004; Harman, 2008; Afzal et al., 2009; Ali et al., 2010). There is also a tutorial paper aimed primarily at the reader with no prior knowledge of computational search or SBSE applications (Harman et al., 2012b) and a recent bibliometric analysis of ten years of SBSE literature (2001–2010) (Freitas and Souza, 2011).
Fig. 4 shows the recent rapid rise in publications on SBSE.
In this paper we introduce the idea of ‘SBSE for the Cloud’ by defining representations and evaluation functions for several cloud computing problems. We do this to illustrate how cloud computing problems can be re-formulated as SBSE problems. We seek to provide reasonable coverage of cloud computing applications and issues, but are sure that there are many more such reformulations waiting to be explored.
3. Engineering challenges in cloud computing
The engineering challenges posed by cloud computing can be divided between those facing the providers of cloud infrastructure and those faced by their clients. For clients, there are challenges in the (re-)engineering of their systems to ensure they are suitable for cloud deployment, as well as reducing costs by optimising resource usage. Cloud providers share similar goals in reducing resource usage, but they must also focus on maintaining their service level agreements (SLAs). Fig. 5 details the potential application areas for SBSE methods that we have identified. The figure is annotated with corresponding section numbers for those discussed later in the paper.
3.1. Challenges for cloud clients
Fig. 4. Yearly SBSE publication rates 1976–2010. Source: SBSE Repository (Zhang et al., 2012).
Fig. 5. An overview of the software engineering tasks facing cloud providers and their clients that may be amenable to an SBSE approach.
From the view of a client, the crucial feature of cloud computing is a reliance on cloud providers for their computational and storage needs. Instead of making capital investment in hardware and hosting their own servers, a company may choose to rent facilities from providers on a pay-as-you-go (tm) basis. As with other forms of outsourced deployment, this enables lower overheads through economies of scale and expertise. However, the pay-as-you-go model has additional advantages:
1. Computing expenses are now regarded as an operational overhead, rather than depreciating capital. Thus, large and risk-laden upfront expenditure is not required.
2. The opportunity for scalability is unprecedented. Capacity can be added ‘on-demand’, without fixed costs and at short notice.
3. There is no economic difference between using a single processor for a hundred hours versus using a hundred processors for one hour. This enables clients to quickly complete tasks that previously would have required longer time periods or substantial capital investment.
Cloud deployment presents a series of engineering and business challenges for cloud clients:
3.1.1. Predictive modelling
In order to best optimise cloud deployment through most efficient use of available resources and minimisation of costs involved, it will be necessary to predict behaviour, load, use and other profiles that affect resource consumption and costs. Fortunately, there has been much work on predictive modelling and the optimisation of predictive models using search based techniques (Harman, 2010; Afzal and Torkar, 2011) that can be applied to cloud system behaviour prediction.
3.1.2. Scalability
Scalability does not come for free. Existing software must be re-engineered to take advantage of the scalability of the cloud. Scalability relies upon parallelism, which is typically provided by either user-level parallelism (many users accessing the same service) or data-level parallelism (data can be processed in parallel). Software must be designed or refactored to have a loosely coupled and asynchronous architecture to reduce bottlenecks, demand must be anticipated using predictive modelling, and a policy for managing scale must be established.
3.1.3. Fault tolerance
The distributed nature of systems in the cloud and the lower reliability (Vishwanath and Nagappan, 2010) of cloud platforms require a new approach to fault tolerance. Software should be tolerant to complete hardware failure: a virtual machine can be lost at any time, and similarly it should be anticipated that API calls can fail. Expectations of network performance, such as latency, have to be revised. The fragility of the underlying platform is in stark contrast to the expectations of customers using the web services that so often rely on the cloud – such services must have a high level of availability and low latency. Testing procedures must be suitably adapted; the canonical example is Netflix’s ‘Chaos Monkey’ (Netflix Tech Blog, 2008).
3.1.4. Resource efficiency
The cloud pay-as-you-go model makes explicit the variable costs of computation. This provides a greater incentive for companies to be as efficient as possible in their use of resources. The spare cycles of the past are to a great degree eliminated, and therefore decisions such as how and when to scale, which technologies or toolchains to use, and efficient software design, become far more important than they have been during the pre-cloud era. For example, if a company can use a smaller VM instance by requiring less RAM, then it will reduce its operating costs.
3.1.5. Evaluating systems
New scales mean new challenges. Systems that work fine at small scales, or in simulation, may face new problems at larger scales caused by bottlenecks that were not previously apparent. This is prevalent in the analysis of ‘Big Data’ (Jacobs, 2009). Of particular concern to researchers is that the problem of evaluating new ideas applies to their research as much as, or more than, software development in industry. Problems may only occur at a certain level of scale and concurrency. We discuss this issue further in Section 6.
3.1.6. Security
Maintaining data and system security is a challenge, not least because data that was previously hosted on a company’s internal servers may now reside on a third party machine and possibly reside alongside the data and computation of third parties. Traditional firewalls and intrusion detection systems do not apply and the sandboxing of virtualisation limits the visibility of network traffic, impairing the monitoring of this traffic by cloud clients as a defensive measure.
3.1.7. Managing the business model
Reese (2009) notes that it is sometimes lost in discussions of cloud computing that scalability in itself is not an end, rather that scalability must coincide with the goals of the business. For example, serving extra users may not necessarily be profitable, if those users are unlikely to purchase a given product or service. Critical to achieving a sustainable business model is the management of scalability and the predictive modelling of demand.
3.1.8. Accelerated development cycles
The difficulty of distributing software to individual users is often avoided by employing cloud computing. Only server-side software need be updated when changing a service provided over the web, for example. Furthermore, the roll-out of new software can be limited to given users or geographical areas, a technique sometimes referred to as canarying or A/B testing. Thus traditional barriers to release are no longer an obstacle, and companies such as Google frequently release multiple revisions of their software in a single week. This brings new challenges in an accelerated lifecycle, such as continuous testing, version control and maintaining documentation.
3.1.9. Risk management
There are two major risks facing an organisation hoping to migrate or deploy to the cloud (Armbrust et al., 2009, 2010; Cliff, 2010). The first is the issue of vendor lock-in, particularly when making use of platform-as-a-service, which relies on provider-specific APIs. An accepted standard for cloud interoperability has yet to be proposed, and as providers continue to diverge in their approach to provision, such standardisation looks increasingly unlikely. Projects such as Eucalyptus (Nurmi et al., 2009) have suggested a more pragmatic approach, by creating open-source implementations of proprietary APIs. The second major risk surrounds data governance, when data is trusted to a third party (the cloud host) and in situations particular to cloud computing, such as coresidency.
3.2. Challenges for cloud providers
Principally, cloud providers are concerned with maintaining their service level agreements as efficiently as possible, and in a scalable manner:
3.2.1. Virtual machine management
It is convenient to consider the machines that populate a datacentre as a single massively parallel unit. From this point of view, the problems of resource management are clear: scheduling must take place across tens of thousands of machines; individual virtual machine instances must be assigned to physical servers; loads must be monitored and balanced; VMs can be migrated between servers, and memory can be paged across the network (Williams et al., 2011). Coresidency also brings challenges of new pathologies, in the same way that hard disk ‘thrashing’ may affect conventional shared systems.
3.2.2. Managing oversubscription
Many cloud providers take advantage of the variable workloads of clients by utilising oversubscription to reduce their costs. In an oversubscribed system, the amount of resources provisioned to clients exceeds the total capacity of the physical resources available to the provider. The challenge is to anticipate when demand might outstrip supply, and to take measures to mitigate such situations without violating SLAs.
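The definition of oversubscription above can be made concrete with a small calculation. The following is an illustrative sketch only: the function names, the 10% safety headroom and the memory figures are assumptions introduced here, not values from the paper or from any provider.

```python
def oversubscription_ratio(provisioned, capacity):
    """Resources promised to clients divided by physical capacity.

    A value above 1.0 means the host is oversubscribed.
    """
    return sum(provisioned) / capacity


def steps_at_risk(demand_trace, capacity, headroom=0.1):
    """Indices of time steps where aggregate demand exceeds the capacity
    left after reserving a safety headroom (a hypothetical policy)."""
    threshold = capacity * (1.0 - headroom)
    return [t for t, demands in enumerate(demand_trace)
            if sum(demands) > threshold]


# Hypothetical host: four VMs, each promised 8 GB, on 24 GB of physical RAM.
provisioned_gb = [8, 8, 8, 8]
capacity_gb = 24

# Hypothetical per-VM memory demand (GB) observed at three time steps.
trace = [[2, 3, 4, 5], [6, 7, 6, 5], [1, 1, 1, 1]]

ratio = oversubscription_ratio(provisioned_gb, capacity_gb)
risky = steps_at_risk(trace, capacity_gb)
```

In this toy trace only the second time step approaches the physical limit; a search-based or predictive approach would aim to anticipate such steps before they occur.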
3.2.3. Translating SLAs to low-level behaviour
There is currently a disconnect between the high-level business model of SLAs and the low-level resource management of kernels and hypervisors.
3.2.4. Scalable service provision
As well as providing infrastructure, most large cloud providers also provide some services, in the form of an API. In constructing such services, they face many of the same challenges as their cloud clients. For example, distributed large-scale storage systems have seen considerable development in the last five years. The main challenge is to ensure scalability and resilience (DeCandia et al., 2007).
3.2.5. Resource efficiency
Providers aim to reduce their expenditure through [...] For example, modern datacentres are for the most part power-inefficient (Barroso and Hölzle, 2009). The main overhead of running a datacentre is climate management. Most effort in this area focuses on the design of cooling systems, but once efficiency improves, the focus may shift onto ‘energy-scalable’ computing, that is, ensuring that low utilisation corresponds to low power consumption and vice versa.
4. Example applications of SBSE to cloud engineering
We now provide illustrative examples of specific optimisation problems in cloud computing, in order to demonstrate the applicability of SBSE to cloud engineering. To avoid imprecision, we give detailed suggestions on how each task may be formulated as a search problem, and also reference related work in SBSE that could be applied in this domain. Detailed descriptions of SBSE algorithms and the software engineering applications to which they have hitherto been applied can be found elsewhere in surveys on SBSE (Harman et al., 2012a).
We have chosen optimisation problems that can potentially benefit both cloud providers and their clients. The examples given in Sections 4.1, 4.3 and 4.5 address specific problems faced by cloud providers of Infrastructure As A Service, Platform As A Service or Software As A Service. The problems and solutions laid out in Sections 4.2 and 4.4 can serve to benefit both cloud providers and clients.
4.1. Provider resource efficiency: cloud stack configuration
One challenge to both providers and clients, noted in Section 3, is to reduce the resource consumption of their systems.
In this section, we propose an approach that will help cloud providers optimise their cloud stack configuration of physical servers inside a datacentre. Fig. 6 illustrates a typical server software architecture in a cloud environment. A physical server in a cloud datacentre executes a hypervisor, generally Xen or VMware. The hypervisor runs a single operating system, such as a modified version of Linux, as the dom0 host operating system. This host is controlled by the cloud provider to administer the node; residing in dom0 provides it with special access privileges. Guest operating system instances are the virtual machines utilised by a cloud client, and execute within domU. These are multiplexed on the node by the dom0 host. Each of these layers may be configured independently.
There are many opportunities for optimisation: both the dom0 host and the domU guest operating systems can be subject to configuration tuning, in order to optimize particular performance characteristics, such as network bandwidth and energy efficiency.
Cloud providers use host configuration parameters to offer a range of instance configurations, each with fixed characteristics. For example, the Amazon t1.micro instance currently offers 613 MB of RAM and up to two virtual processing units. These characteristics may map directly onto the appropriate Xen configuration file parameters:
    maxmem = 613
    vcpus = 2
Such host configuration files specify how the physical resources of the underlying hardware node should be divided amongst guest operating systems. This configuration can even be altered dynamically.
The guest operating system settings are generally configured at boot-time, for example through the boot parameters to the Linux kernel (Red Hat Enterprise, 2007). Many of these parameters are performance-sensitive. Some parameters can be configured at run-time, but even changes that require a reboot may be applied in a [...] anticipated.
4.1.1. Formulation
We can represent the parameters of the hypervisor or operating system as a value vector containing multiple types, suitable for the application of a hill-climber or genetic algorithm. Search operators should respect parameter boundaries and constraints that determine the structure of feasible solutions. In an offline scenario, in which the optimisation is performed in a safe sandbox environment disconnected from the live cloud instances, this may be achieved by penalising infeasible solutions using a specially designed fitness function. In online optimisation, however, the search algorithm will evaluate the fitness of candidate solutions in situ using the live cloud instances as the fitness function. Consequently, the search will have to be limited to the subspace of feasible solutions.
In general it seems reasonable to assume that an individual user’s previous workload patterns may be indicative of future behaviour, and optimise accordingly. This approach would be similar to the work on SBSE for compiler optimisation, which searches the space of parameter settings: SBSE has been advocated as a tuning mechanism for compilers (Cooper et al., 1999). Hoste and Eeckhout (2008) demonstrated how a compiler can be tuned by targeting different performance objectives for the code it produces. Compilers have a surprising number of possible parameters that can be tuned in this manner: gcc has about 200. Indeed, one way to tune an operating system or other cloud stack component would be to search for suitable compiler settings that enable a recompiled version to perform better in a given scenario.
4.1.2. Evaluation function
The evaluation function for the tuning of cloud stack components would depend on the performance objectives to be monitored and optimised. Obvious choices are dynamic performance attributes including memory usage, execution time and energy consumption. Since these properties are subject to variability depending on the given workload, the evaluation function will be forced to seek normalisation through sampling and statistical analysis. This is a cross-cutting issue in cloud optimisation and is discussed further in Section 6.
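One simple way to obtain a usable evaluation function from noisy measurements is to sample each candidate several times and compare a robust summary statistic. The sketch below uses the median of repeated samples; this choice, the number of repeats, and the synthetic `noisy_runtime` measurement are assumptions made for illustration and are not prescribed by the paper.

```python
import random
import statistics


def sampled_fitness(measure, config, repeats=30):
    """Measure a configuration repeatedly; return (median, standard deviation)."""
    samples = [measure(config) for _ in range(repeats)]
    return statistics.median(samples), statistics.stdev(samples)


def better(measure, a, b, repeats=30):
    """Pairwise comparison on medians (lower measured cost is better)."""
    return (sampled_fitness(measure, a, repeats)[0]
            < sampled_fitness(measure, b, repeats)[0])


# Synthetic stand-in for a real, noisy measurement such as execution time.
_rng = random.Random(1)


def noisy_runtime(config):
    """Hypothetical runtime: a base value plus Gaussian measurement noise."""
    return config["base_seconds"] + _rng.gauss(0.0, 0.5)


fast = {"base_seconds": 10.0}
slow = {"base_seconds": 12.0}
median_fast, spread_fast = sampled_fitness(noisy_runtime, fast)
median_slow, _ = sampled_fitness(noisy_runtime, slow)
```

A single sample of `noisy_runtime` could easily mis-rank two close configurations; repeated sampling trades evaluation cost for reliability, and a formal statistical test could replace the plain median comparison where the difference between candidates is small.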
4.2. Client resource efficiency: image specialisation
Cloud providers usually find themselves running many instances using the same machine image, the full features of which are unlikely to be required in every case. On a given virtual machine, a customer will usually be running a specific application, such as a LAMP (Linux-Apache-MySQL-PHP) stack. Either through explicit declaration of the intended use by customers, or by permitted monitoring of activity, the provider could tailor the image to better fit the needs of the customer.
It seems wasteful to use only a fraction of software within an image; the larger the scale of the datacentre, the worse this problem of ‘unused software plant’ will be. Specialisation could help in reducing image size, reducing execution time, and lowering the time required to migrate an image across the network (Voorsluys et al., 2009). The beneficiaries of such specialisation would be both cloud providers and cloud clients. For a cloud provider it will result in less demand on physical hardware. For cloud clients, specialisation offers one way of cutting cost, for example by consuming fewer resources.
Fig. 6. Schematic diagram of cloud application execution stack.
At present cloud providers already offer many different images³ for customers on the EC2 cloud platform. Each image includes a particular operating system release and a set of bundled applications. The ‘BitNami WordPress Stack’ image provides Ubuntu Linux and all necessary software to support WordPress: BitNami, WordPress, Apache, MySQL, PHP and phpMyAdmin⁴. Such coarse-grained specialisation can be formulated as the selection of a set of appropriate Linux distribution packages (e.g. RPMs) to include in a VM image. There is a well-defined dependency graph structure for RPM packages, which could be co-opted to create specialised Linux images for the cloud.
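To illustrate how a package dependency graph could drive coarse-grained specialisation, the following sketch computes the transitive closure of the packages an application needs; anything outside the closure is a candidate for removal. The graph `DEPENDS_ON` and its package names are invented for illustration and do not describe any real distribution or the RPM tooling itself.

```python
from typing import Dict, Iterable, Set


def dependency_closure(required: Iterable[str],
                       depends_on: Dict[str, Set[str]]) -> Set[str]:
    """Return the required packages plus everything they transitively depend on."""
    selected: Set[str] = set()
    stack = list(required)
    while stack:
        package = stack.pop()
        if package in selected:
            continue  # already visited; also guards against dependency cycles
        selected.add(package)
        stack.extend(depends_on.get(package, set()))
    return selected


# Hypothetical dependency graph; names and edges are illustrative only.
DEPENDS_ON = {
    "wordpress": {"php", "mysql", "apache"},
    "php": {"libc"},
    "mysql": {"libc"},
    "apache": {"libc"},
    "phpmyadmin": {"php", "mysql"},
    "x11": {"libc"},
}
```

Under these assumptions, `dependency_closure(["wordpress"], DEPENDS_ON)` keeps the web stack and its shared library while leaving `x11` and `phpmyadmin` out of the image.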
There exists the interesting problem of identifying the trade-off between the frequency of use of a module versus potential size reduction should it be removed. If removing a rarely used module can result in a significant space saving, a workaround may be worth considering.
More fine-grained VM specialisation is also possible. A single RPM (e.g., a Linux library such as libstdc++) may contain superfluous functionality for the purpose of a given VM instance. Program slicing might help to reduce the library to the bare essentials for the use cases required (Guo and Engler, 2011). Even more ambitious specialisation may be achieved by rewriting application and operating system components. Manual specialisation effort (Madhavapeddy et al., 2010) demonstrates clear potential; the application of SBSE should automate this process.
4.2.1. Formulation
For the reduction of image size through the deletion of unused modules, the search could use a dependency graph representation. Searching for compromises between usage and size reduction is closely related to the cost-benefit trade-offs in requirements engineering (Zhang et al., 2007; Durillo et al., 2009).
More generally, this is a type of ‘partial evaluation’ of the machine image with respect to the intended application. Partial Evaluation (PE) has been studied for many years as a way of specialising programs to specific computation (Beckman et al., 1976; Bjorner et al., 1987; Jones, 1996). However, highly non-trivial engineering work would be required to adapt and develop existing PE approaches to the scale required, such as modifying the Linux kernel, which may run to several millions of lines of code.
3 https://aws.amazon.com/amis (accessed April 2012).
4 https://aws.amazon.com/amis/bitnami-wordpress-stack-3-3-1-0-ubuntu-10-04.
Another approach would be to use program slicing (Harman and Hierons, 2001; Silva, 2012) to ‘slice away’ unused parts of an image that can be statically identified. Like partial evaluation, slicing has been studied for many years (Weiser, 1979). However, despite many advances in slicing technologies, scalability is currently limited.
An alternative to PE and slicing for image specialisation would be to use a search based approach to find parts of the image that can be removed. More ambitiously, a search based approach might also seek to construct a new version of the image, specialised for a particular application, using the original image to provide both a starting point and an oracle with which to determine the correct behaviour of the re-write. A natural choice of formulation for this problem would rest on the use of genetic programming. Related SBSE work using Genetic Programming (GP) (Koza, 1992) includes recent advances in automated solutions to bug fixing (Arcuri and Yao, 2008; Weimer et al., 2009), platform and language migration (Langdon and Harman, 2010) and non-functional property optimisation (White et al., 2008, 2011; Sitthi-Amorn et al., 2011).
A typical representation in a GP search is an abstract syntax tree. However, the GP process inherently involves maintaining many copies of the program under optimisation and would hence be memory-intensive. Fortunately, recent applications of genetic programming to the problem of cloning systems and subsystems have developed constrained representations to overcome such scalability issues. For example, Langdon and Harman (2010) use a grammar based GP approach, in which changes are constrained by a grammar specialised to the program, while Le Goues et al. (2012) represent a solution as a sequence of edits to the original in a manner reminiscent of earlier work on search based transformation (Ryan, 2000; Fatiregun et al., 2005). Either approach could be used to scale GP to the image specialisation problem. Profiling could also focus the search, identifying those software components that are frequently executed.
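The memory advantage of an edit-based representation can be sketched as follows: each individual stores only a short list of edits, and a variant is materialised from the shared original when it needs to be evaluated. This is an illustrative simplification with a single deletion operator; the `Edit` type and `apply_edits` function are hypothetical names and do not reproduce the implementation of any of the cited systems.

```python
from typing import List, Sequence, Tuple

# An edit is (operator, index into the original); only "delete" is modelled here.
Edit = Tuple[str, int]


def apply_edits(original: Sequence[str], edits: Sequence[Edit]) -> List[str]:
    """Build a variant from the shared original by applying deletion edits."""
    deleted = {index for op, index in edits if op == "delete"}
    # The original is never mutated, so all individuals can share one copy.
    return [part for i, part in enumerate(original) if i not in deleted]
```

For instance, with `original = ["kernel", "libc", "x11", "httpd"]`, the individual `[("delete", 2)]` yields a variant without `x11` while occupying only one tuple of storage.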
Sufficient knowledge of a system’s behaviour will play a critical role. The upper limit of the deletion approach is defined by the minimal functionality that must be preserved. It is the behavioural characteristics that would provide a pseudo-specification for the specialisation; the resulting software should be able to handle all system behaviours that clients are interested in. The primary source of this knowledge should be usage profiles.
Accumulated profile data would provide a certain level of
confidence in determining the behavioural characteristics that
2010; Lakhotia et al., 2010a; Alshahwan and Harman, 2011; Fraser and Arcuri, 2011) could also be used to specifically target these apparently unexecuted regions of code to raise confidence that they are not executed. Code that resists execution for the application in question, even when search is used specifically to target its execution, might be speculatively removed.
4.2.2. Evaluation function
Natural target metrics are image size, memory usage, and execution time. For image size, we may consider the trade-off between the size of images and the functionality provided. Image size is static, eliminating the need to make repeated observations. With regard to memory usage and execution time, we may consider whether average, peak or minimal requirements are most important. The average draw on either of these resources may not be as important as the variance, and we might seek to raise the minimum draw on resources and reduce the maximum draw, thereby aiding load prediction and management on a cloud infrastructure.
If an online approach is adopted, then an engineer might also be presented with occasional reports of Pareto-optimal trade-offs (Harman, 2007) currently possible for heavily used applications. In all optimisation work, it is important to consider the appropriate point at which to deploy human judgement in an otherwise automated process (Harman, 2007, 2010). These human judgements could be informed by the automated construction of speculative optimisation reports from realtime image specialisers, running concurrently with the applications they serve and which absorb and make use of otherwise redundant datacentre resources to optimise for the future.
4.3. Virtual machine assignment and consolidation by providers
Virtualisation offers opportunities for power consumption reduction using consolidation, whereby some virtual machines may be migrated from one physical server to another. Traditional consolidation can be employed so that some servers may be powered down or transitioned to a low-power configuration. Servers continue to use significant amounts of power when they are idle; for example, Google report that idle power is ‘generally never below 50% of peak’ (Fan et al., 2007). Hence consolidation may be more desirable than load-balancing, although powering down servers has an associated cost in the time taken to restore them.
The problem of allocating VMs to servers is essentially one of bin-packing, albeit dynamic in nature and under many constraints. The constraints are implied by the SLAs between the cloud provider and its clients. For example, we may seek to make power savings as described above, but also wish to avoid cohosting two VMs from the same customer on the same machine or network segment, so as not to affect the terms of an SLA in regard to availability or responsiveness. To complicate matters, it is common practice to oversubscribe physical hardware (Williams et al., 2011), such that demand may in some circumstances be expected to exceed supply. Furthermore, the energy profile of an application and of a server appears to be highly variable and experimental studies investigating servers’ energy consumption have reported conflicting findings (Barroso and Hölzle, 2009; Lee and Zomaya, 2012).
The allocation of VMs to servers has two overriding goals: firstly, to reduce energy consumption by shutting down unoccupied machines when demand is low within a datacentre, i.e. traditional ‘consolidation’; secondly, to make efficient use of machines that are running without impacting on a customer’s VM in a way that SLAs cannot be met. Aggressive consolidation may lead to competition for resources on heavily loaded servers, and it may also be problematic if demand rises faster than powered-down servers
of demand are related.
There exists a substantial body of literature on virtual machine management (Srikantaiah et al., 2008; Beloglazov and Buyya, 2010; Lee and Zomaya, 2012), which invariably suggests handwritten heuristics. The application of SBSE methods offers an opportunity to increase the sophistication and efficiency of management.
4.3.1. Formulation
Let us first consider the simplest case of a single static situation. We may express this task as a search problem by considering a solution as a mapping from a description of the demand for resources onto an existing infrastructure of physical machines. To build on the formulation from Lee and Zomaya (2012), a server $s_i$ out of $w$ servers belongs to the set $S$, and the utilisation $U_i$ of a single server $s_i \in S$ can be described as the sum of the utilisation $u_{i,j}$ due to each resident virtual machine $v_j$ running on $s_i$ thus:
$$U_i = \sum_{j=1}^{n} u_{i,j} \tag{1}$$
The energy consumption of a single server, $c_i$, must then be modelled as a function of its utilisation,
$$c_i = b + f(U_i) \tag{2}$$
where $b$ indicates the minimal (base) energy usage of a server that is idling. However, this assumes that all utilisation is homogeneous, whereas in fact we might regard $U_i$ as a $k$-dimensional vector, giving utilisation for each of the $k$ types of resource (CPU, memory, energy, etc.) provided by a server. Lee and Zomaya also assume that $f$ is a linear function, which conflicts with the behaviour reported by Barroso and Hölzle (2009).
Momentarily discarding these limitations, the problem can then be considered as searching for an assignment consisting of a vector $A$ such that $A_j$ describes where virtual machine $v_j$ should reside. The goal is to minimise overall energy consumption, $C$:
$$C = \sum_{i=1}^{w} c_i \cdot R_i \tag{3}$$
where $R_i$ gives a boolean value indicating if a server is running at least one virtual machine:
$$R_i = \begin{cases} 1 & \text{if } \exists j : A_j = i \\ 0 & \text{otherwise} \end{cases} \tag{4}$$
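Eqs. (1) to (4) can be combined into a single objective function over the assignment vector. The sketch below makes two simplifying assumptions that are not part of the formulation: servers are indexed from 0, and the utilisation contributed by a VM is the same on whichever server hosts it (so $u_{i,j}$ is given as one value per VM). The function and parameter names are hypothetical.

```python
from typing import Callable, Sequence


def total_energy(
    assignment: Sequence[int],         # A: assignment[j] is the server hosting VM j
    vm_utilisation: Sequence[float],   # utilisation contributed by each VM
    servers: int,                      # w
    base: float,                       # b
    f: Callable[[float], float],       # utilisation-to-energy model
) -> float:
    """Eq. (3): sum of c_i = b + f(U_i) over servers hosting at least one VM."""
    utilisation = [0.0] * servers      # U_i, Eq. (1)
    running = [False] * servers        # R_i, Eq. (4)
    for vm, server in enumerate(assignment):
        utilisation[server] += vm_utilisation[vm]
        running[server] = True
    # Idle servers contribute nothing because R_i = 0 for them.
    return sum(base + f(utilisation[i]) for i in range(servers) if running[i])
```

With a linear `f`, consolidating all VMs onto one server removes one base cost `b` per server that is emptied, which is why the objective rewards consolidation.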
It is not unreasonable to assume that a provider may wish to separate virtual machines from the same customer onto separate servers, i.e. letting $\mathrm{cust}(v_x)$ denote the customer who is running virtual machine $v_x$, we want to enforce the following constraint:
$$\forall i, j,\ i \neq j : \mathrm{cust}(v_i) = \mathrm{cust}(v_j) \Rightarrow A_i \neq A_j \tag{5}$$
There will be further constraints based on the risk profile of highly utilised servers. Furthermore, we may consider the costs of transition from the current allocation $A_t$ at time $t$ to the suggested allocation $A_{t+1}$. If we assume a fixed cost for VM migration, then we may try to minimise the difference between $A_t$ and $A_{t+1}$. If a server is to be shut down or restarted, then we must take the associated cost into consideration.
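Both the separation constraint of Eq. (5) and a fixed-cost migration measure are straightforward to check on an integer assignment vector. The following is a minimal sketch; the function names are hypothetical, and counting differing positions is only one possible reading of "the difference between" two allocations.

```python
from typing import Sequence


def violates_separation(assignment: Sequence[int], customer: Sequence[str]) -> bool:
    """True if two distinct VMs of the same customer share a server (Eq. 5)."""
    seen = set()
    for vm, server in enumerate(assignment):
        key = (customer[vm], server)
        if key in seen:
            return True
        seen.add(key)
    return False


def migrations(current: Sequence[int], proposed: Sequence[int]) -> int:
    """Number of VMs whose server differs between the two allocations."""
    return sum(1 for a, b in zip(current, proposed) if a != b)
```

A search could reject violating candidates outright or add a penalty term, and could multiply `migrations` by the assumed fixed per-VM migration cost.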
Much depends on whether the problem is treated statically, in the sense that a single allocation is found for a given situation, or whether the goal is to construct a policy to be executed dynamically. One interesting hybrid approach would be to search for a static allocation that not only optimises a given evaluation function, but also takes into account future scenarios, for example in order to minimise the cost of migrations given potential anticipated demand changes. This idea is similar to that of Emberson and Bate (2010) in their SBSE work on allocating processes to processors in a multiprocessor system.
The static allocation of VMs to physical servers can be represented using the integer vector, $A$. A dynamic solution will be represented as a program or heuristic function
$$A_{t+1} = f(A_t, U_t) \tag{6}$$
A more ambitious policy could incorporate more complex information, such as anticipated demand, into this function. The above discussion is heavily simplified, and the constraints involved will depend on the design of the datacentre, as well as the goals of the parties involved. Justafort (2012) considers the impact at application level, for example, whereas Stillwell et al. (2012) optimise on heterogeneous platforms.
Searching over a vector space could be performed using Simulated Annealing (Kirkpatrick et al., 1983), the technique used by Emberson and Bate.
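A generic simulated annealing loop over integer assignment vectors might look like the sketch below. It is not the configuration used by Emberson and Bate: the neighbourhood (move one VM to a random server), the geometric cooling schedule and all parameter defaults are assumptions chosen for illustration, and the `cost` callable stands in for whichever evaluation function is adopted. The initial assignment is assumed to be non-empty.

```python
import math
import random
from typing import Callable, List, Sequence


def anneal(
    initial: Sequence[int],
    servers: int,
    cost: Callable[[Sequence[int]], float],
    steps: int = 5000,
    start_temp: float = 10.0,
    cooling: float = 0.999,
    seed: int = 0,
) -> List[int]:
    """Simulated annealing over VM-to-server assignment vectors (minimisation)."""
    rng = random.Random(seed)
    current = list(initial)
    current_cost = cost(current)
    best, best_cost = list(current), current_cost
    temp = start_temp
    for _ in range(steps):
        # Neighbour: move one randomly chosen VM to a randomly chosen server.
        candidate = list(current)
        candidate[rng.randrange(len(candidate))] = rng.randrange(servers)
        candidate_cost = cost(candidate)
        delta = candidate_cost - current_cost
        # Always accept improvements; accept worse moves with Boltzmann probability.
        if delta <= 0 or rng.random() < math.exp(-delta / temp):
            current, current_cost = candidate, candidate_cost
            if current_cost < best_cost:
                best, best_cost = list(current), current_cost
        temp *= cooling  # geometric cooling
    return best
```

For example, `anneal([0, 1, 2, 3], 4, lambda a: len(set(a)))` searches for an assignment that occupies as few servers as possible; constraints such as Eq. (5) could be folded into `cost` as penalties.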
The search for a dynamic heuristic would be well-suited to a Genetic Programming approach, with a function mapping virtual machines to physical servers represented as an expression tree. Previous work on scheduling in GP provides examples of how this might be achieved (Jakobović and Budin, 2006; Jakobović et al., 2007).
4.3.2. Evaluation function
The objectives of this task are to minimise power consumption within the constraints specified by an SLA whilst taking into account potential future usage. SLA constraints will include providing enough physical RAM, ensuring sufficiently low latency, and providing enough independence of execution to maintain uptime guarantees, for example by avoiding the coresidency of a client’s own virtual machines on a single server. Future usage must be taken into account to ensure that violation of an SLA constraint is unlikely to occur after reassignment.
There are three approaches to evaluating the quality of a solution: modelling, simulation and physical evaluation; the last of these can involve real-time feedback from the datacentre. The CloudSim toolkit (Calheiros et al., 2011) provides an appropriate level of abstraction for this evaluation function and was used previously by Beloglazov and Buyya (2010).
4.4. Scale management for cloud clients
One of the key advantages of cloud computing is the ability to automatically scale a server infrastructure for an application in response to changes in demand. A cloud client will wish to minimise their expenditure; they are billed for the computational resources that they consume. A higher specification VM, with more RAM or a faster processor, incurs a higher cost. Further, costs are typically based on ‘instance hours’, i.e. how long a VM is running for, regardless of the amount of computation performed. Idle instances cost money and it is important to bring more VMs up when there is a surge in demand, and scale an application down by terminating VM instances when demand drops. As noted in Section 3.1, this scalability must be managed.
Thus a cloud application must be configured such that it minimises resource usage while maintaining a certain level of throughput. This can be achieved by either anticipating demand or by constructing a set of rules that react to changes in an application’s usage profile. Fluctuations can be caused by factors including promotional campaigns, the ‘Slashdot effect’ where a popular website links to a smaller website, and even behavioural patterns of users. An example often cited is the electricity spike caused by kettles switched on at half time during popular sporting events.
Some variations in demand are predictable, while others are not. Cloud providers Amazon and Microsoft Azure allow a client to configure rules for scaling an application. These rules are based on metrics including maximum, minimum and average CPU usage, the number of items in a server’s request queue and so on. They are typically designed by a developer and system administrator.
Fig. 7 shows an example snippet from a Microsoft configuration document. The top half contains rules for a scheduled scaling of an application (also called ‘constraint rules’), while the bottom half shows rules that define when to react to fluctuations in an application’s usage (also called ‘reactive rules’). Both types of rules should be configured in a way that maximises performance, but minimises cost. For example, one would not want to scale beyond the level specified by the constraint rules, because otherwise a client will be charged for idle instances. Equally, the reactive rules need to be configured such that they are robust to sudden spikes in user demand. We can consider using SBSE to optimise such rules in order to find an optimal trade-off between performance and cost.
4.4.1. Formulation
The rules shown in Fig. 7 are described in an XML format. As such they are already in a format amenable to a GP approach. The structure of an XML document is defined in different schemas that can be used by the search to ensure it only generates valid XML documents.
Rules are composed from conditions and associated actions. Condition operators such as equals, greater, greaterOrEqual, less, lessThan, not can be used to form part of the function set. When a condition evaluates to true, the corresponding action is performed. Actions are also functions that take a variable number of parameters. For example, the scale action is parameterised by the type of VM to scale, as well as how many replicated VMs should be created.
The terminal set can be constructed from the attributes defined for the different elements in the XML schema, along with their values. For example, a startTime attribute is of type timestamp. In addition, the terminal set needs to contain all possible VM configuration options along with the data types required by the function set (e.g. the range of integers).
4.4.2. Evaluation function
We can treat this as a multi-objective optimisation problem where our objectives are latency and cost. Assume we have an existing load test suite. We may evaluate a given set of constraint and reactive rules by measuring the average latency observed for the load tests. Equally, we can measure the cost of executing a load test suite in terms of computational resources required.
Evaluating a set of scaling rules by running an entire load test is likely to be too computationally expensive in practice. Forrest et al. (2009) showed in their work on automatic patch fixing that sampling from a set of test cases can be more efficient than using an entire test suite. In their work, test cases were used to validate candidate patches, and a patch was considered valid if it passed all test cases. They found that a sampling method provided sufficient information to a GP in order to guide it towards possible solutions. Candidate solutions only have to be evaluated on the complete test suite once an optimisation run has finished, or after a given number of generations.
Since the evaluation function uses two conflicting objectives, the GP will not generate a single solution. Instead we will obtain a Pareto front. A Pareto front is a set of non-dominated, Pareto optimal, solutions. Solutions are said to be non-dominated if no other solution exists that is better in all objectives, i.e. both latency and cost.
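Extracting the non-dominated set from a population of evaluated rule sets can be sketched as below. The sketch uses the usual dominance relation (no worse in every objective and strictly better in at least one), which is slightly stricter than the informal wording above; the `(latency, cost)` tuple layout and function names are assumptions for illustration.

```python
from typing import List, Tuple

Point = Tuple[float, float]  # (latency, cost), both to be minimised


def dominates(a: Point, b: Point) -> bool:
    """a dominates b: no worse in every objective and strictly better in one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def pareto_front(points: List[Point]) -> List[Point]:
    """Keep the points that no other point dominates (quadratic-time filter)."""
    return [p for p in points if not any(dominates(q, p) for q in points)]
```

For the points `(100, 5)`, `(80, 7)`, `(120, 4)`, `(90, 8)` and `(100, 6)`, the last two are dominated (by `(80, 7)` and `(100, 5)` respectively), leaving three trade-off solutions on the front.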
Since the output of the search will be a set of equally good
Fig. 7. Example configuration rules for when to scale an application. Snippet taken from a Microsoft Azure configuration file (Microsoft, 2012). Such rules can be generated and optimized using genetic programming.
Visualising a Pareto front in two dimensions can help with this process as it illustrates the trade-offs between lower latency and cost.
4.5. Spot-price management
We have seen how SBSE might be used to reduce the cost for cloud clients by evolving optimised scaling rules for an application. Now we examine the management of demand from the position of a cloud provider.
From the point of view of a cloud provider, when a server is not running at full capacity, underutilisation of resources equates to lost potential revenue. Thus, it is important to maximise the usage of their cloud infrastructure at any given time, given sufficient spare capacity to cope with anticipated demand surges. One way to address this problem is to auction off unused resources, such as Amazon’s Spot instances (Amazon, 2012). A client can bid for a spot instance, and when their bid matches or exceeds the price of a spot instance, the instance becomes available to them. A spot instance remains available to the client for as long as their bid price matches or exceeds the price of an instance. Prices are periodically adjusted by the cloud provider, and when a spot price exceeds a client’s bid, they lose their instance.
To maximise the effectiveness of spot instances, a cloud provider must efficiently divide spare capacity amongst suitable instance configurations, and subsequently make those instances available for auction.
4.5.1. Representation
We propose to formulate the problem of auctioning off unused capacity as a bin-packing problem, in a similar manner to the consolidation problem in Section 4.3. Bins denote servers with spare capacity and the objects to place into the bins are VM instance types. We assume that a provider only offers a limited number of spot instance sizes. The goal is to distribute VM instances across a server infrastructure such that the amount of idle server resources is minimised. For ease of explanation we will only consider the distribution of VM instances over a fixed time period of one hour.
Let us denote the different types of VM instance a provider may offer as $vt_1, \ldots, vt_n$. We can simply represent the possible allocation of VM instances to server bins as a matrix, as shown in Fig. 8. Rows in the matrix denote VM instance types and the columns denote server bins. The numbers in the rows/columns denote how many new instances of a particular VM type are allocated to a specific server.
4.5.2. Evaluation function
As an evaluation function we propose to minimise spare capacity. More formally, let $h_i$ denote the spare capacity (headroom) of server $s_i$. This is determined with reference to Eq. (1):
$$h_i = 1 - U_i = 1 - \sum_{j=1}^{n} u_{i,j} \tag{7}$$
We must then find a new allocation of spot instances to servers, represented by the matrix $B$ as illustrated in Fig. 8, where $B_{i,j}$ denotes the number of new VM instances of type $j$ to be deployed on server $i$.
The evaluation function will be to reduce $h_i$ over time, under the constraints implied by SLAs. Offering the VMs according to the matrix $B$ (whether found directly or output by a search-generated policy) is not enough to improve utilisation, as buyers for the new VM instances must be found. In particular, $B$ does not generate a price at which a particular VM type should be auctioned off.
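The effect of a candidate allocation on headroom can be computed directly, as in the sketch below. It assumes that each VM instance type consumes a fixed fraction of a server's capacity (the hypothetical `type_size` list), that `allocation[i][j]` plays the role of $B_{i,j}$, and that an allocation exceeding a server's headroom is simply rejected; none of these modelling choices is prescribed by the formulation above.

```python
from typing import List, Sequence


def remaining_headroom(
    headroom: Sequence[float],               # h_i before allocation, Eq. (7)
    allocation: Sequence[Sequence[int]],     # allocation[i][j]: type-j instances on server i
    type_size: Sequence[float],              # assumed capacity fraction per instance type
) -> List[float]:
    """Headroom left on each server after deploying the proposed spot instances."""
    remaining = []
    for h_i, row in zip(headroom, allocation):
        used = sum(count * size for count, size in zip(row, type_size))
        if used > h_i + 1e-9:                # small tolerance for rounding error
            raise ValueError("allocation exceeds spare capacity")
        remaining.append(h_i - used)
    return remaining
```

A search over allocations could then minimise `sum(remaining_headroom(...))`, treating infeasible allocations as invalid.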
Fig. 8. Example matrix representing possible VM instance type allocations to server bins.
Fig. 9. Share of papers targeting each SBSE research application. Source: SBSE Repository, Zhang et al. (2012) (data from 1976 to May 2012).
Ben-Yehuda et al. (2011) showed that spot instance prices appear to be set, not by supply and demand, but rather by a randomised algorithm. In particular, the authors found the following formula to be a good fit for historical price setting:
$$P_i = P_{i-1} + \Delta_i \tag{8}$$
where $\Delta_i = -0.7 \times \Delta_{i-1} + \epsilon(\sigma)$. In their formula, $\epsilon(\sigma)$ denotes white noise with standard deviation $\sigma$. The authors matched $\sigma$ to a value of $(0.39 \times (C - F))$, where $C$ denotes the maximum (ceiling) and $F$ denotes the minimum (floor) price of a particular VM instance. Once a search algorithm has found an optimal allocation of VM instance types for idle resources, the VM images can be pre-built and sold off. The price for an instance type can then be set using the above formula.
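The recurrence of Eq. (8) can be simulated in a few lines. The sketch below assumes Gaussian white noise and a first price change that starts from zero, neither of which is stated above, and it implements only the recurrence itself: how prices are kept between the floor and the ceiling is not described here and is deliberately left out. The function name and parameters are hypothetical.

```python
import random
from typing import List


def simulate_prices(start: float, floor: float, ceiling: float,
                    steps: int, seed: int = 0) -> List[float]:
    """Generate prices with P_i = P_{i-1} + d_i, where d_i = -0.7 * d_{i-1} + noise."""
    rng = random.Random(seed)
    sigma = 0.39 * (ceiling - floor)   # standard deviation of the white noise
    prices = [start]
    delta = 0.0                        # assumption: no price change before the first step
    for _ in range(steps):
        delta = -0.7 * delta + rng.gauss(0.0, sigma)
        prices.append(prices[-1] + delta)
    return prices
```

The negative coefficient means that a price rise tends to be followed by a partial fall, so successive changes are anti-correlated.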
5. Challenges and opportunities for existing SBSE methods
In the previous section we presented an overview of challenges posed by cloud systems that will lead to new applications and objectives for the SBSE research and practitioner community. In this section we turn our attention to existing areas of SBSE research that face new challenges and new opportunities when applied in the cloud domain.
Search Based Software Engineering research has produced many distinct subdisciplines, most of which will remain comparatively untouched by the migration to cloud platforms. For instance, the problems of requirements engineering (Cheng and Atlee, 2007) and the associated SBSE research (Zhang et al., 2008), will not necessarily change fundamentally. The main challenge is likely to remain the elicitation, understanding and balancing of multiple requirements and their relationships and tensions, in the presence of multiple competing and conflicting stakeholders, with poorly understood and ill-expressed needs and desires.
However, for other areas of SBSE research and practice, there will be a significant change when the application focus moves from traditional platforms and business models to a cloud paradigm.
5.1. Search based testing for the cloud
One of the biggest changes can be expected in the area of SBSE for testing, known as Search Based Software Testing (SBST) (McMinn, 2004; Harman, 2007; Ali et al., 2010; Harman et al., 2012a). The changes in this area are not entirely problematic; cloud platforms offer significant advantages to the tester that researchers in SBST may seek to exploit. This is particularly significant, because testing and debugging form approximately half of the overall research activity in SBSE (see Fig. 9). As such, an effect on SBST will have a significant impact on the SBSE research agenda as a whole.
5.1.1. On-demand test environments
One of the major bottlenecks in testing is the time it takes to prepare test environments and execute test cases. Even when tests can be executed in parallel, limited availability of hardware means overall test execution time remains high. Further, up until now, it has not been easy to set up a test infrastructure to run tests in parallel. Cloud platforms offer a new solution to this problem. Apart from unit tests, many tests such as integration and load tests depend on an execution environment. We can easily replicate such environments by cloning a VM image. These can then be distributed across many virtual machines to run in parallel.
5.1.2. Snapshotting
In a cloud environment, every application runs inside a virtual machine. Thus, it is possible to take a snapshot of an application, including its entire runtime environment, at an arbitrary stage in its execution. This opens up many new possibilities for debugging.
Traditionally software is deployed and installed on client machines. Most software companies have little or no control over the execution environment of their application.⁵ This lack of information, or rather the huge configuration space applications run within, means debugging a failure is a difficult process. In cloud applications, the specification of a user’s machine is likely to have much less impact on the correct running of software; users’ machines are becoming more and more akin to dumb terminals. Since an application resides in the cloud, developers have full access and control over its environment. Thus, when a failure occurs, we are able to take a snapshot of the entire VM state for debugging purposes.
In a similar manner, there is the possibility of rapidly forking a running VM instance (Lagar-Cavilla et al., 2009). This will further enable efficient software testing, by enabling a search process to bifurcate the execution of a problem based on, for example, a branch predicate.
5.1.3. Multi-version deployment
One way to test for regression faults, or evaluate patches, is to deploy modified software only to a certain group of users, known
5 For example some Windows applications are deployed under Linux with the
For example, Facebook recently evaluated its pay per post feature on a relatively small audience in Australia and New Zealand, while the rest of the world continued to use the ‘old version’. This means we are able to easily collect profile information across versions. We may use this information to help SBST, for example through seeding of optimisation algorithms.
5.1.4. Quantifiable cost
It is also worth noting that the cloud offers a new way of evaluating the cost of testing. Le Goues et al. (2012) used the cloud to measure the cost of a bug fix. This is simply the monetary charge incurred for the computational resources used, such as the time it took to find a patch. In testing we could equally measure the cost of code coverage, the cost of finding a fault and so on. Just as a cloud client might consider whether scaling to another VM will bring sufficient return in revenue, we may also consider the return on investment of additional resources applied to testing. The elastic resources of the cloud must be used efficiently.
5.1.5. Short release cycles
Previously, bugs were expensive to fix once software had been deployed. This was partly because software was ‘shipped’, and thus it was not easy to patch deployed software applications. Even when patches were available for download online, no assumptions could be made about whether a patch had been downloaded and installed by a user or not. This in turn led to its own security vulnerabilities, since hackers could use patches to detect security vulnerabilities, and then exploit users that had not updated their software with the latest patch.
Cloud computing also offers new opportunities in this area. Software no longer needs to be shipped, as it is provided in the cloud. Thus, when a VM is updated with a patch, all users of an application automatically start using the patched version, without their knowledge or intervention.
The ability to easily deploy software is leading to ever shorter release cycles. Small changes can be pushed to every user of the software, almost in an on-demand manner. These short release cycles will require improvements in how regression testing is performed. One research avenue could be how to make regression testing fit into a cloud development cycle. It needs to be efficient, so it does not counteract the notion of ‘short release cycle’, while remaining robust.
5.2. Software maintenance
Much previous work in the area of software maintenance within SBSE has actually focused on refactoring software. For example, a common goal has been to refactor software to achieve a given quality in terms of cohesion or coupling metrics (Lutz, 2001).
As cloud computing relies on a centralised system of software distribution, it presents new opportunities for automating maintenance through search that were not possible in a standard desktop computing scenario. Firstly, software changes can be rolled out to users without user intervention. Secondly, these updates can be rolled out to a limited audience through canarying to limit the impact of any erroneous modifications, as well as the facility to rapidly back out those changes should it be required. This could potentially allow SBSE researchers to automate more of the process than has previously been palatable, as the risk profile of software updates is reduced.
Running many virtual machine instances, potentially duplicating software across many machines, offers new opportunities for fault-finding and repair. Snapshotting, as described above, allows improved fault localisation by effectively capturing the information
bug may be replicated given the correct inputs. In addition, it opens up the possibility of attempting to repair the problem automatically, particularly if the problem relates to performance or resource consumption.
For example, consider the work of Carzaniga et al. (2008), who used different paths through a system, such as similar API calls, to find a workaround for a given problem. Their work relied on the redundancy inherent in software in that the same functionality can often be found in multiple locations. A similar approach can be followed using different versions of the same software, and those versions can also be used as oracles when automating repair. In the cloud, there may exist many different versions of the same software, all available to a repair mechanism through a global API. Similarly, we may use snapshots to roll-back, modify, and re-run a system after a crash or performance problem. An obvious method of repair is to vary the software’s environment, by using a different machine image or physical location in the datacentre.
6. Cross-cutting issues
This section discusses issues that must be addressed in the design and implementation of experiments investigating SBSE for the cloud. Researchers and engineers will encounter three common themes: the problem of effective evaluation of their ideas or designs, how best to sample the available data in order to make their decisions, and the phenomenon of increasing specialisation.
6.1. Prototyping and evaluation
Cloud datacentres are typically composed of servers in the tens of thousands, and cost hundreds of millions of dollars to construct. Constructing a datacentre is economically infeasible for an academic institution, but we must be able to evaluate our research with sufficient fidelity to enable us to establish its credibility. Engineers working in the industry face similar problems, and may be limited to using existing facilities when demand for them is low, or to evaluation using canarying: limited roll-out of changes to a restricted audience. If neither of these options is available, they face the same challenges as academic researchers.
When working at the application level, for example when carrying out resilience testing, existing cloud services can be used. However, any work that requires access to the lower levels of the cloud stack will need a cloud testbed. The first option is to use simulation, and there are a number of cloud simulation tools under development (Kliazovich et al., 2010; Calheiros et al., 2011). Such simulators may be appropriate for evaluating coarse-grained behaviour under a given VM management system, by simulating migration and consolidation patterns without concern for detailed application behaviour.
The efficiency and fidelity of simulation restrict its application. Therefore, it seems that part of the research agenda into cloud computing must be the evaluation of a ‘scale model’ cloud approach. Such small-scale cloud models could be constructed by building fragments of cloud systems (e.g. a single rack of servers), or by replicating larger slices of a datacentre using less capable hardware. In designing both simulations and scale models of real cloud systems, there will be natural open research questions about whether the models correctly simulate the behaviour exhibited by the real world cloud. This is not an issue that is specific to SBSE for the cloud, but an issue for any work that seeks to evaluate and experiment with a model or simulation of the cloud and, indeed, is an issue for any model of any system, such as, for example, models and simulations of financial systems (Cohen et al., 1983; Panayi et al., in press).
These issues are challenging but they are not insurmountable. In the financial modelling domain, for example, Darley and Outkin (2007) were able to construct a very effective model of the markets which was used to predict the impact of Nasdaq stock market decimalisation. If complex dynamic systems such as financial markets can be effectively modelled and simulated, then there are grounds for hope that the same may be true of cloud models.
Furthermore, the issue of designing a suitable cloud model is, itself, an optimisation problem, making SBSE an ideal candidate for cloud modelling too. The optimisation objective is to design the model (selecting parameters, architecture and properties) such that predictions and behaviours observed using the model are appropriate simulations of real world clouds. This approach to design and tuning of models and simulations using search based optimisation has already been applied to financial models (Rogers and von Tessin, 2004; Narzisi et al., 2006). There is no reason to assume that it cannot also be applied to optimising the design of cloud models and simulations.
The further challenge of generating representative workloads is also problematic. In particular, we might hope to replicate realistic workloads as seen in existing datacentres. Due to the multi-residency of datacentres, this data may not be publicly available. Some providers have released limited data (Mishra et al., 2010; Google, 2012).
6.2. Monitoring and sampling
When discussing design issues at the high level, the difficulty of measuring properties that are essential components of an evaluation function can be lost. For example, consider power efficiency: not only is it difficult to accurately measure power consumption and relate that consumption to a particular virtual machine or application, but the power consumption of a given application or service may be heavily dependent on the demand and input profile for that service. Given that there are a large number of potential execution scenarios, some form of sampling is inevitable; this raises the issue of how best to choose that sample. Too large a sample may lead to an over-expensive evaluation function that cannot be practically accommodated within the constraints imposed by many SBSE algorithms. Search algorithms typically require large numbers of evaluations, so evaluation has to be efficient. However, too small a sample may yield an estimate of quality that may provide insufficient guidance for the search.
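The trade-off between sample size and the quality of the fitness estimate can be sketched as follows. The power model, the noise level and the scenario list are invented for illustration only; the point is that the cost of one fitness evaluation grows linearly with the sample size, while the standard error of the estimate shrinks only with its square root.

```python
import random
import statistics

def measure_power(config, scenario, rng):
    """Hypothetical noisy power reading (watts) for one execution scenario."""
    return 100.0 + 10.0 * config + 50.0 * scenario + rng.gauss(0, 8.0)

def sampled_fitness(config, scenarios, sample_size, rng):
    """Estimate mean power from a random sample of scenarios.

    Returns (estimate, standard_error, measurements_used).
    """
    sample = rng.sample(scenarios, sample_size)
    readings = [measure_power(config, s, rng) for s in sample]
    std_err = statistics.stdev(readings) / sample_size ** 0.5
    return statistics.mean(readings), std_err, sample_size

rng = random.Random(1)
scenarios = [i / 1000 for i in range(1000)]  # stand-in for demand/input profiles
results = {n: sampled_fitness(1, scenarios, n, rng) for n in (10, 400)}
for n, (estimate, std_err, cost) in results.items():
    print(f"n={n}: estimate={estimate:.1f} W, std. error={std_err:.2f}, cost={cost}")
```

In this sketch a forty-fold increase in measurements buys roughly a six-fold reduction in standard error, which is the kind of diminishing return that has to be weighed against the large number of evaluations a search algorithm needs.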
Finding the right sample size will, itself, be a matter of tuning pragmatics or theoretical analysis for the SBSE approach used, but this is not insurmountable. Compared to previous SBSE work, the process of designing an efficient evaluation function will require more effort.
The variance in performance characteristics over the sample also raises the issue of what form of aggregation is most appropriate. This will depend on the applications in hand and will be influenced by the target properties of the service level agreement between cloud provider and client. For instance, in some cloud settings, it will be important to reduce the variance in performance to increase predictability of execution. However, in other cases, the engineer may seek to reduce the average load or the peak load on resources such as time and memory.
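The choice of aggregation can change which candidate solution the search prefers. The following small sketch, using made-up response-time samples for two hypothetical configurations, shows the same data ranked differently under mean, peak and variance aggregation.

```python
import statistics

def aggregate(samples, objective):
    """Collapse per-scenario measurements into one fitness value (lower is better)."""
    if objective == "mean":
        return statistics.mean(samples)
    if objective == "peak":
        return max(samples)
    if objective == "variance":
        return statistics.pvariance(samples)
    raise ValueError(f"unknown objective: {objective}")

# Illustrative response times (ms) for two hypothetical configurations.
steady = [12, 12, 12, 12]
bursty = [5, 5, 5, 25]

for objective in ("mean", "peak", "variance"):
    scores = {"steady": aggregate(steady, objective),
              "bursty": aggregate(bursty, objective)}
    preferred = min(scores, key=scores.get)
    print(f"{objective}: {scores} -> prefers {preferred}")
```

Here the bursty configuration wins on average load, while the steady one wins on peak load and on predictability, so the aggregation effectively encodes which property of the service level agreement matters.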
6.3. Online optimisation
Many applications of SBSE to cloud systems will require online optimisation rather than offline optimisation, by performing in situ adaptation in the live environment. This is in contrast to most previous work, which has assumed pre-deployment optimisation in a clean-room environment. Online optimisation is yet to be seriously adopted by the SBSE community, and a major inhibiting factor is the cost of computation: optimisation methods such as evolutionary algorithms can require significant computational resources themselves.
A possible solution to this problem has recently been proposed, in the form of Amortised Optimisation (Yoo, 2012). Yoo suggests distributing the cost of optimisation across multiple executions of the target system: each execution of the target system provides a single observation of the evaluation function. While the optimisation process takes place over longer timescales, the total amount of computational overhead for the optimisation is significantly reduced, allowing it to take place in the live environment.
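The idea of spending one fitness observation per execution can be sketched as below. This is only an illustrative hill climber written under our own assumptions, not the algorithm of Yoo (2012): the class name, the `next_config`/`report` interface, the integer "pool size" parameter and the cost function are all hypothetical.

```python
import random

class AmortisedHillClimber:
    """Hill climber that advances by exactly one observation per system execution."""

    def __init__(self, initial, mutate, seed=0):
        self.rng = random.Random(seed)
        self.mutate = mutate
        self.best = initial
        self.best_cost = None
        self._pending = initial  # configuration the next execution should use

    def next_config(self):
        return self._pending

    def report(self, cost):
        """Record the cost observed by the execution that just finished."""
        if self.best_cost is None or cost < self.best_cost:
            self.best, self.best_cost = self._pending, cost
        self._pending = self.mutate(self.best, self.rng)

def neighbour(pool_size, rng):
    """Move to an adjacent pool size, never below 1."""
    return max(1, pool_size + rng.choice([-1, 1]))

def run_service(pool_size):
    """Hypothetical target system: observed cost is lowest at pool size 7."""
    return (pool_size - 7) ** 2

optimiser = AmortisedHillClimber(initial=1, mutate=neighbour)
for _ in range(200):  # 200 ordinary executions of the live system
    config = optimiser.next_config()
    optimiser.report(run_service(config))
print(optimiser.best, optimiser.best_cost)
```

The optimiser does almost no work of its own between executions; the expensive part, evaluating a candidate, is paid for by runs of the system that would have happened anyway.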
Cloud systems provide an amenable environment for amortised optimisation, as the parallel execution of the same application across multiple virtual machines presents the possibility of speeding up the amortised optimisation by aggregating many parallel observations of the evaluation function. This is analogous to past use of island models in distributed optimisation algorithms (Whitley, 2001).
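A minimal sketch of aggregating parallel observations follows, assuming each virtual machine runs the same application with a different candidate configuration in each round. The cost function and the integer parameter are hypothetical, the virtual machines are simulated sequentially, and this simple best-of-round scheme is a simplification rather than a full island model.

```python
import random

def observe(pool_size):
    """Hypothetical cost observed by one VM running with this configuration."""
    return (pool_size - 7) ** 2

def parallel_round(best, best_cost, n_vms, rng):
    """One round: every VM tries a neighbour of the current best; keep the best result."""
    candidates = [max(1, best + rng.choice([-1, 1])) for _ in range(n_vms)]
    costs = [observe(c) for c in candidates]  # in a real cloud: one execution per VM
    round_cost, round_best = min(zip(costs, candidates))
    if round_cost < best_cost:
        return round_best, round_cost
    return best, best_cost

rng = random.Random(0)
best, best_cost = 1, observe(1)
for _ in range(20):
    best, best_cost = parallel_round(best, best_cost, n_vms=8, rng=rng)
print(best, best_cost)
```

With eight observations per round, almost every round yields an improving neighbour, so far fewer rounds of wall-clock time are needed than when a single execution supplies one observation at a time.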
7. Conclusions
Cloud computing offers the SBSE research agenda a stimulus of new challenges and opportunities. There is the challenge of using SBSE to address the trade-offs inherent in cloud computing. There are also new opportunities for extending past work in testing and maintenance within SBSE to cloud-specific applications.
Though the fundamental technical concepts of the cloud paradigm will be familiar to many, the emphasis they place on the combination of virtualisation, dynamic re-assignment, distribution and parallelism does raise important new research questions. This novel shift of emphasis also lends additional importance to many already known, but underdeveloped, applications of software engineering optimisation.
The shift to a pay-as-you-go business model for the deployment and use of computation also creates new challenges and opportunities. This shift is a ‘disruptive innovation’, because the cost of computation can more readily be measured directly in monetary terms. This is a significant change in the underlying computing business model. One might imagine that real-currency costing may ultimately have an impact on global finance. We may soon see trading in ‘computational futures’ and, perhaps, even the cost of computation as a ‘global currency standard’.
The ability to directly measure cost will also have a profound effect on research, and particularly on SBSE research. Hitherto, cost measurement has proved to be a thorny problem in software engineering research: our forebears were forced to rely on cost surrogates. For example, those who sought to measure development cost had to be satisfied with function points, code size and person months of developer effort. Similarly, in place of test cost, we have become accustomed to using the test suite size, execution time and test process duration. Cloud computing may change this: development and test cost can be measured in dollars or yuan.
Most incarnations of SBSE are manifestations of the trade-off between cost and value objectives. For these SBSE formulations, the ability to measure cost in terms of the direct bottom line for the organisation is likely to add realism and actionability. Results for optimisation that produce reduced costs can be directly measured in financial gain to the adopters, making the business case for adoption simpler and more compelling.
The outputs of SBSE research for cloud are also likely to be of high impact because of the wide uptake of cloud research. Evidence of the growth of interest and activity in cloud computing abounds. In this paper we have also presented evidence that the problems of cloud engineering are highly amenable to SBSE solutions, reformulating five such problems as SBSE problems. The rapid uptake of