• No results found

Cloud engineering is search based software engineering too

N/A
N/A
Protected

Academic year: 2021

Share "Cloud engineering is search based software engineering too"

Copied!
18
0
0

Loading.... (view fulltext now)

Full text

(1)

Harman, M., Lakhotia, K., Singer, J., White, D., and Shin, Y. (2013) Cloud

engineering is search based software engineering too. Journal of Systems

and Software, 86 (9). pp. 2225-2241. ISSN 0164-1212

Copyright © 2013 Elsevier Ltd.

A copy can be downloaded for personal non-commercial research or

study, without prior permission or charge

The content must not be changed in any way or reproduced in any format

or medium without the formal permission of the copyright holder(s)

When referring to this work, full bibliographic details must be given

http://eprints.gla.ac.uk/71317/

Deposited on: 20 August 2013

Enlighten – Research publications by members of the University of Glasgow

http://eprints.gla.ac.uk

(2)

ContentslistsavailableatSciVerseScienceDirect

The

Journal

of

Systems

and

Software

j ourna l h o m e p a g e :w w w . e l s e v i e r . c o m / l o c a t e / j s s

Cloud

engineering

is

Search

Based

Software

Engineering

too

Mark

Harman

a

,

Kiran

Lakhotia

a

,

Jeremy

Singer

b

,

David

R.

White

b,∗

,

Shin

Yoo

a

aUniversityCollegeLondon,GowerStreet,LondonWC1E6BT,UnitedKingdom bSchoolofComputingScience,UniversityofGlasgow,G128QQ,UnitedKingdom

a

r

t

i

c

l

e

i

n

f

o

Articlehistory: Received2July2012 Accepted15October2012 Available online 23 November 2012 Keywords:

SearchBasedSoftwareEngineering(SBSE) Cloudcomputing

a

b

s

t

r

a

c

t

Manyoftheproblemsposedbythemigrationofcomputationtocloudplatformscanbeformulated andsolvedusingtechniquesassociatedwithSearchBasedSoftwareEngineering(SBSE).Muchofcloud softwareengineeringinvolvesproblemsofoptimisation:performance,allocation,assignmentandthe dynamicbalancingofresourcestoachievepragmatictrade-offsbetweenmanycompetingtechnicaland businessobjectives.SBSEisconcernedwiththeapplicationofcomputationalsearchandoptimisationto solvepreciselythesekindsofsoftwareengineeringchallenges.InterestinbothcloudcomputingandSBSE hasgrownrapidlyinthepastfiveyears,yettherehasbeenlittleworkonSBSEasameansofaddressing cloudcomputingchallenges.Likemanycomputationallydemandingactivities,SBSEhasthepotentialto benefitfromthecloud;‘SBSEinthecloud’.However,thispaperfocuses,instead,ofthewaysinwhich SBSEcanbenefitcloudcomputing.Itthusdevelopsthethemeof‘SBSEforthecloud’,formulatingcloud computingchallengesinwaysthatcanbeaddressedusingSBSE.

© 2012 Elsevier Inc. All rights reserved.

1. Introduction

Inmanywayscloudcomputingisnotconceptuallynew:it effec-tivelyintegratesasetofexistingtechnologiesandresearchareas onagrandscale,butthereisnothinginherentlynewaboutmostof thespecifictechnicalchallengesinvolved.Thebusinessmodelthat

underpinscloudcomputingisnew.Ratherthanviewingsoftware

andthehardwareuponwhichitexecutesasfixedassetstobe pur-chasedandsubsequentlysubjecttodepreciation,thecloudmodel

treatsbothsoftwareandhardwareasrentedcommodities.

Thisisaprofoundchange,becauseitremovesboththebusiness issueofdepreciationandmoretechnicallyorientedconcernsover resourcefragmentationandsmoothscalability,alongwithdefinite commitmentstoparticularplatforms,protocols,and interoperabil-ity.Nevertheless,inordertoavailthemselvesoftheseadvantages,

boththecloudproviderandtheirclientswillneednewwaysof

designing,developing,deployingandevolvingsoftware,thereby

posingnewquestionsfortheresearchcommunity.

The pace of the recent industrial uptake of cloud

comput-ing, withits associated technical challenges, also increasesthe

significanceofexistingconceptualandpragmaticquestionsthat

hadpreviouslyfailedtoreceivewidespreadattention.Examples

includethetailoringofoperatingsystemenvironmentsandtesting

∗ Correspondingauthor.

E-mailaddresses:[email protected](M.Harman),[email protected]

(K.Lakhotia),[email protected](J.Singer),[email protected]

(D.R.White),[email protected](S.Yoo).

inavirtualisedenvironment.Someofthesetechnicalissuespresent newdifficultiesthatmustbeovercome,whilstothersoffer oppor-tunitiesthatwillenablesoftwareengineerstotestanddeploycode

innewways.

Cloudcomputingrepresentsafurtherstepalongaroadthathas

takenourconceptionofcomputationfromtherarefiedandideal

worldofformalcalculitothemoremundaneandinexactworld

ofengineeringreality.Inthe1970sand1980s,therewasa

seri-ousdebateaboutwhethersuchathingas‘softwareengineering’

evenexisted(Hoare,1978).Manyarguedthatsoftwarewasnot

anengineeringartefactandthattotreatitassuchwasnotonly

wrongbut dangerous,withspecial opprobrium(Dijkstra,1978)

beingreservedforanydissenterswhodaredtoadvocatesoftware testing(DeMilloetal.,1979).

However,astimepassedandsystemsbecamelarger,therewas

agradualrealisationthatassizeandcomplexityincreases,

algo-rithmsgivewaytosystemsandprogramsgivewaytosoftware.

Adetailedandcompleteformalproofofa50millionlinebanking systemmay,indeed,beconceptuallydesirable,butitisnomore practicalthanaquantummechanicaldescriptionoftheoperation ofthehumancirculatorysystem.Theviewofsoftwareasan

engi-neeringartefact,whereby itis meaningful tospeakofsoftware

engineeringasadiscipline,graduallybecameaccepted–evenby itserstwhiledetractors(Hoare,1996,1996).

Astherecognitionofsoftwaredevelopmentasanengineering

activitygainedincreasingtraction,theviewofsoftware engineer-ingasadisciplinefundamentallyconcernedwithoptimisationalso gainedwiderattentionandacceptance.Thisoptimisation-centric viewofsoftwareengineeringistypifiedbythedevelopmentofthe

0164-1212/$–seefrontmatter © 2012 Elsevier Inc. All rights reserved.

(3)

areaofresearchandpracticeknownasSearchBasedSoftware Engi-neering(SBSE)(HarmanandJones,2001;Harman,2007),atopicof growinginterestformorethantenyears(FreitasandSouza,2011).

SBSEhasnowpermeatedalmosteveryareaofsoftware

engineer-ingactivity(Zhangetal.,2008;Afzaletal.,2009;Alietal.,2010; Räihä,2010;AfzalandTorkar,2011;Harmanetal.,2012a).

Cloudengineeringpropelsusfurtheralongthisjourneytowards avisionofsoftwareengineeringinwhichoptimisationliesatthe verycentre.Thecentraljustificationfordeployingsoftwareona cloudplatformrestsuponaclaimaboutoptimisation.Thatis:

Optimisationofresourceusagecanbeachievedbyconsolidating hardwareandsoftwareinfrastructureintomassivedatacentres, fromwhichtheseresourcesarerentedbyconsumerson-demand.

Cloud computing is thus the archetypal example of an

optimisation-centricviewofsoftwareengineering.Manyofcloud

computing’sproblemsand challenges revolvearound

optimisa-tion,whilemostofitsclaimedadvantagesareunequivocallyand unashamedlyphrasedintermsofoptimisationobjectives.SBSEis thusanaturalfitforcloudcomputing;itisprimarilyconcerned withtheformulationofsoftwareengineeringproblemsas

optimi-sationproblems,whichcansubsequentlybesolvedusingsearch

algorithms.

Aswell as providing a natural set of intellectual toolswith

whichtoaddressthechallengesandopportunitiesofcloud

com-puting,SBSEmayalsoprovideaconvenientbridgebetweenthe

engineeringproblemsofcloudcomputinganditsbusiness impera-tives.Optimisationobjectivesconnectthesetwoconcerns,seeking architecturesthatmaximiseflexibility,assignmentsthatminimise fragmentation,(re)designsthatminimisecostpertransactionand soon.

ThispaperexploresthissymbioticrelationshipbetweenSBSE

and cloud computing. Many papers have been written on the

advantages of migrating software engineering activities to the

cloud(Abadi,2009;Sotomayoretal.,2009;Armbrustetal.,2010).

NodoubttherearemanyadvantagesofsuchamigrationtoSBSE

computationinthecloud,someofwhichhavealreadybeenrealised (LeGouesetal.,2012).Inthispaperwepursueadifferent

direc-tion.RatherthanshowingthatthecloudbenefitsSBSE(weterm

thisSBSEinthecloud)insteadthispaperexploresanddevelops howSBSEmaybenefitthecloud(wetermthisSBSEforthecloud). Thusweaskthefollowingquestion:

How can SBSE help to optimise the design, development,

deploymentandevolutionofcloudcomputingforitsproviders andtheirclients?

Wereachouttotwoaudienceswiththispaper:first,weaim

to encourage SBSE researchers that cloud computing presents

relevantresearchchallenges;second,weaimtoshowcloud practi-tionersthatSBSEhaspotentialtosolvemanyoftheiroptimization problems.Theprimarycontributionsofthispaper1are:

1TosetoutaresearchagendaofSBSEforcloudcomputing,giving adetailedroadmap.

2To make the road map concrete we specify, in detail, five

problemswithinthebroadareasof‘SBSEforthecloud’.We

re-formulatetheseproblemsassearchproblemsin thestandard

mannerfor SBSEresearch(HarmanandJones, 2001;Harman,

2007),byproposingspecificsolutionrepresentationsand evalu-ationfunctionstoguidethesearch.

1 Thisisaninvited‘roadmap’paperwhichformspartofJournalofSystemsand

SoftwareSpecialIssueon‘SoftwareEngineeringfor/intheCloud’.

3Toreviewhowcloudsystemsposenewchallengesand

opportu-nitieswithinexistingapplicationareasofSBSE,andinparticular searchbasedsoftwaretesting.

Therestofthispaperisorganisedasfollows.First,wegiveabrief backgroundofcloudcomputingandSBSEinSection2.Thenwe dis-cusstheengineeringchallengesfacingcloudprovidersandtheir clientsinSection3.Section4givesexamplesofwhereandhow

SBSEcanbeappliedtosolvecloudengineeringproblems,which

arenewareasofapplicationforSBSEtechniques.InSection5we outlinewherethenatureofcloudcomputingoffersopportunities forimprovementoverpastSBSEapplications.InSection6,we dis-cusssomecommonchallengesofapplyingthesemethodsinacloud scenariobeforeconcludinginSection7.

2. Background

WenowprovideabriefoverviewofcloudcomputingandSearch

BasedSoftwareEngineering.

2.1. Cloudcomputing

Cloudcomputing(MellandGrance,2009)isarelativelyrecent andcurrentlyhigh-profilemethodofITdeployment.Compute facil-itiessuchasvirtualisedserverhostingorremotestorageaccessed viaanAPI,areprovidedbyacloudprovidertoacloudclient.2Oftena clientisathirdpartycompany,whowillusethecomputefacilities toprovideanexternalservicetotheirusers.Forexample,Amazon

isacloudprovidersupplyingastorageAPItoDropbox,whouse

theAPItoprovideafilesynchronisationservicetodomesticand

commercialusers.

Inmanyways,cloudcomputingresemblesamoveawayfrom

conventional desktopcomputingthat hasbeenthehallmark of

thepreviousdecade,towardsacentralisedcomputingmodel.One

starkdifferencetothemainframesofthetimesharingpastisthat

thenew paradigmis supported by vast datacentrescontaining

tensofthousandsofserversworkinginunison.Thesedatacentres

aredescribedbyBarrosoandHölzle(2009)as‘Warehouse-Sized

Computers’(WSCs),reflectingtheirviewthatthemachinescan col-lectivelyberegardedasasingleentity.Thecoordinationofthese machinesissupportedbyspecialistsoftwarelayers,including vir-tualisation technologiessuchas Xen (Barhamet al., 2003) and distributedservicessuchasBigTable(Changetal.,2008).

Mostof thetechnologiesinvolvedin constructingand using

cloudcomputingarenotnew,butratheritistheirparticular combi-nationandlarge-scaledeploymentthatisnovel.Theemergenceof

cloudcomputingwouldnothavebeenpossiblewithoutthegrowth

invirtualisationandwidespreadinternetaccess.Cloudcomputing

promisestocommoditisedatastorage,processingandservingin

thewayenvisionedbyutilitycomputing(Rappa,2004)andservice orientedarchitecture(PapazoglouandvandenHeuvel,2007). 2.1.1. Cloudarchitectures

Cloudcomputingisfrequentlypresentedasalayered architec-ture,asshowninFig.1.However,inpracticetheselayersarenot

distinct.For example,datastorageservicesmayberegardedas

infrastructure,aplatformoranapplication,dependingonexactly

howthoseservicesarebeingused.Wedeliberately avoidusing

thesesomewhatartificialdistinctions.

2Inthispaper,theword‘user’isreservedfortheend-usersofsoftware,i.e.the

customersofaclient.Thisiscomplicatedsomewhatbythefactthatsomecloud providersservebothclientsandusers;forexample,GoogleprovideAppEngineas aservicetodevelopers,buttheyalsoprovideGmailtousers.

(4)

Fig.1.Thecloudstackarchitecture.

Cloudhardware istypicallycomposedof commodity

server-grade×86computersarrangedinashallowhierarchycomposed

ofracksandclustersofracks.Theyarephysicallylocatedinaseries ofgeographicallydistributeddatacentres.InthecaseofAmazon WebServices,thesedatacentresaresplitintodiscreteavailability zones.

Servers runhypervisor technologysuchas Xen orVMWare,

which managemultiple virtualmachines(VMs)oneach

physi-cal machine.VMs are provided directly toclient companiesby

providerssuchasRackSpaceorAmazonattheinfrastructurelayer. EachlivevirtualmachineisaninstanceofanofferedVM configu-ration,whichisprincipallydefinedbythememoryandprocessing poweravailabletotheVM.Everyinstancemountsamachineimage,

alsoknownasanoperatingsystemimage. Animageincludesan

operatingsystem,andtherequiredserversoftwaresuchasweb

serversortransactionprocessingsoftware.

Themoststraightforwardexampleofaplatform-levelservice

is Google App Engine, where a customer provides code and

Googleautomaticallymanagesthescalingtorespondtoincoming

requests.

Applicationscomprisemostusageofcloudcomputingbythe

generalpublic.PopularcurrentexamplesincludeMicrosoftOffice

365andGmail.

2.1.2. Cloudresearch

Ratherthan attemptingadetailed literaturereview of cloud

computing,wepresentasimple,informal,graphicaloverviewof

cloudresearch.Fig.2isawordcloudgeneratedfromthetitlesof around25,000papersreturnedbyakeywordsearchforcloudatthe

ACMguidetocomputingliterature.Thesearchwasperformedon

15May2012.Inthiscloudwordcloud,thesizeofawordreflects itsfrequencyintheunderlyingcorpus.CommonEnglishwordsand thoseoccurringrarelyarenotincludedinthediagram.

FromacursoryinspectionofFig.2,thelargestwordsare indica-tiveofthetopic(cloud,computing)andthetechnology(applications, networks,web,etc.).Todrilldowntotheresearchthemesofcloud

computing,weremovefromtheunderlyingcorpusalltermsthat

occurintheconciseNISTdefinitionofcloudcomputing,whichis thefirstsentenceofSection2inMelland Grance(2009).Fig.3

showstheresultingfilteredcloudwordcloud. TherearetwokeypointstohighlightfromFig.3.

1AsmentionedinSection1,cloudresearchissuesarenotnew. Thecloudwordcloudshowsstronglinkswithancestralresearch fieldsincludingdistributed,parallel,andgridcomputing.

2The concepts presented in the word cloud are amenable to

SBSEoptimisation.Apparenttermssuchasscheduling, cluster-ing,reconstruction, and optimisationall admitthepotentialof

SBSE.Termssuchasestimation,modelling, predictionand

sim-ulationshowthatthecloudresearchcommunityisattempting

toapplywell-knownexistingsystemoptimisationtechniquesto

theclouddomain.

2.1.3. Cloudwordcloudinthecloud

Asanaside,thecloudwordcloudsinFigs.2and3were gener-atedinthecloud.Thepapertitlemetadatawascollectedfromthe

ACMwebportalusing50AmazonEC2t1.microLinuxinstances,

andstoredinasingleS3bucket.Theestimatedcostofthis

com-putationis$0.89,usingAmazonWebServices’on-demandprices

fromMay2012intheEU(Ireland)region.Thegraphical represen-tationsofthecloudwordcloudsareproducedbyWordle,anonline wordcloudgenerationservice(Viegasetal.,2009).

Thisexerciseservestodemonstratetheaccessibilityand econ-omyofcloudcomputingforsimpledataprocessingtasks.Wefully expecttherapid,widespreadadoptionofcloudcomputing,as pre-dictedbyindustryanalystsIDC(2011).

2.2. SearchBasedSoftwareEngineering

SearchBased SoftwareEngineering(SBSE)isthenamegiven

toafieldofresearchandpracticeinwhichcomputationalsearch andoptimisationtechniquesareusedtoaddressproblemsin

soft-wareengineering (Harman and Jones, 2001).The approach has

provedsuccessfulasameansofattackingchallengingproblemsin whichtherearepotentiallymanydifferentandpossiblyconflicting

objectives,themeasurement ofwhich maybesubjecttonoise,

imprecision and incompleteness. Notable successes have been

achievedforproblemssuchastestdatageneration,modularisation, automatedpatching,andregressiontesting,withindustrialuptake (Cornfordetal.,2003;WegenerandBühler,2004;Afzaletal.,2010; Lakhotiaetal.,2010b;Cadaretal.,2011;Yooetal.,2011)andthe provisionoftoolssuchasAUSTIN(Lakhotiaetal.,2010a),Bunch (MitchellandMancoridis,2006),EvoSuite(FraserandArcuri,2011), GenProg(LeGouesetal.,2012),Milu(JiaandHarman,2008)and SWAT(AlshahwanandHarman,2011).

ThefirststepintheapplicationofSBSEconsistsofa

reformu-lation ofa software engineering problemasa ‘search problem’

(HarmanandJones,2001;Harmanetal.,2012a).Thisformulation involvesthedefinitionofasuitable representationofcandidate

solutions to the problem, or some representation from which

thesesolutionscanbeobtained,andameasureofthequalityof agivensolution:anevaluationfunction.Strictlyspeaking,allthat isrequiredofanevaluationfunctionisthatitenablesoneto com-pute,foranytwocandidatesolutions,whichisthebetterofthe two.

(5)

Fig.2.Cloudwordcloud,summarizing25,000researchpapersoncloudcomputing.

Therearemanysurveys,overviewsandreviewsofSBSEthat

providemorein-depthtreatmentandcoverageofresultsspecificto individualengineeringsubdisciplinessuchasrequirements analy-sis(Zhangetal.,2008),predictivemodelling(Harman,2010;Afzal andTorkar,2011),non-functionalproperties(Afzaletal.,2009), programcomprehension(Harman,2007),design(Räihä,2010)and testing(McMinn,2004;Harman,2008;Afzaletal.,2009;Alietal., 2010).Thereisalsoatutorialpaperaimedprimarilyatthereader withnopriorknowledgeofcomputationalsearchorSBSE applica-tions(Harmanetal.,2012b)andarecentbibliometricanalysisof tenyears’ofSBSEliterature(2001–2010)(FreitasandSouza,2011).

Fig.4showstherecentrapidriseinpublicationsonSBSE. Inthispaperweintroducetheideaof‘SBSEfortheCloud’by definingrepresentationsandevaluationfunctionsforseveralcloud

computingproblems.Wedothistoillustratehowcloud

comput-ingproblemscanbere-formulatedasSBSEproblems.Weseekto

providereasonablecoverageofcloudcomputingapplicationsand

issues,butaresurethattherearemanymoresuchreformulations waitingtobeexplored.

3. Engineeringchallengesincloudcomputing

Theengineeringchallengesposedbycloudcomputingcanbe

dividedbetweenthose facingtheprovidersof cloud

infrastruc-tureand those faced bytheirclients.For theformer, there are challengesinthe(re-)engineeringoftheirsystemstoensurethey

aresuitable forclouddeployment,aswellas reducingcostsby

optimisingresourceusage.Cloudproviderssharesimilargoalsin reducingresourceusage,buttheymustalsofocusonmaintaining theirservicelevelagreements(SLAs).Fig.5detailsthepotential

applicationareasforSBSEmethodsthatwehaveidentified.The

figureisannotatedwithcorrespondingsectionnumbersforthose

discussedlaterinthepaper. 3.1. Challengesforcloudclients

Fromtheviewofaclient,thecrucialfeatureofcloud comput-ingisarelianceoncloudprovidersfortheircomputationaland

storageneeds.Insteadofmakingcapitalinvestmentinhardware

(6)

Fig.4.YearlySBSEpublicationrates1976–2010. Source:SBSERepository(Zhangetal.,2012).

Fig.5. AnoverviewofthesoftwareengineeringtasksfacingcloudprovidersandtheirclientsthatmaybeamenabletoanSBSEapproach.

andhostingtheirownservers,acompanymayinsteadchooseto

rentfacilitiesfromprovidersonapay-as-you-go(tm)basis.Aswith

otherformsofoutsourceddeployment,thisenableslower

over-heads through economies of scale and expertise. However,the

pay-as-you-gomodelhasadditionaladvantages:

1Computingexpensesarenowregardedasanoperational

over-head,ratherthandepreciatingcapital.Thus,largeandrisk-laden upfrontexpenditureisnotrequired.

2Theopportunityforscalabilityisunprecedented.Capacitycanbe added‘on-demand’,withoutfixedcostsandatshortnotice. 3Thereisnoeconomicdifferencebetweenusingasingle

proces-sorforahundredhoursversususingahundredprocessorsfor

onehour.Thisenablesclientstoquicklycompletetasksthat pre-viouslywouldhaverequiredlongertimeperiodsorsubstantial capitalinvestment.

Clouddeploymentpresentsaseriesofengineeringandbusiness challengesforcloudclients:

3.1.1. Predictivemodelling

Inordertobestoptimiseclouddeploymentthroughmost

effi-cientuseofavailableresourcesandminimisationofcostsinvolved itwillbenecessarytopredictbehaviour,load,useandother pro-filesthataffectresourceconsumptionandcosts.Fortunately,there

hasbeenmuchworkonpredictivemodellingandthe

optimisa-tionofpredictivemodelsusingsearchbasedtechniques(Harman, 2010;AfzalandTorkar,2011)thatcanbeappliedtocloudsystem behaviourprediction.

3.1.2. Scalability

Scalabilitydoesnotcomeforfree.Existingsoftwaremustbe

re-engineeredto takeadvantageof thescalabilityof thecloud. Scalability relies upon parallelism, which is typically provided

byeitheruser-levelparallelism(manyusersaccessingthesame

service)ordata-levelparallelism(datacanbeprocessedin par-allel).Softwaremustbedesignedorrefactoredtohavealoosely

coupled and asynchronous architecture, to reduce bottlenecks

(7)

demandmust beanticipated usingpredictive modelling,and a policyformanagingscalemustbeestablished.

3.1.3. Faulttolerance

Thedistributednatureofsystemsinthecloudandthelower

reliability(VishwanathandNagappan,2010)of cloudplatforms requireanewapproachtofaulttolerance.Softwareshouldbe tol-eranttocompletehardwarefailure:avirtualmachinecanbelost atanytime,andsimilarlyitshouldbeanticipatedthatAPIcallscan fail.Expectationsofnetworkperformance,suchaslatency,haveto berevised.Thefragilityoftheunderlyingplatformisinstark con-trasttotheexpectationsofcustomersusingthewebservicesthat sooftenrelyonthecloud–suchservicesmusthaveahighlevelof availabilityandlowlatency.Testingproceduresmustbesuitably adapted;thecanonicalexampleisNetflix’s‘ChaosMonkey’(Netflix TechBlog,2008).

3.1.4. Resourceefficiency

Thecloudpay-as-you-gomodelmakesexplicitthevariablecosts ofcomputation.Thisprovidesagreaterincentiveforcompaniesto beasefficientaspossibleintheiruseofresources.Thesparecycles ofthepastaretoagreatdegreeeliminated,andthereforedecisions suchashowandwhentoscale,whichtechnologiesortoolchainsto use,andefficientsoftwaredesign,becomefarmoreimportantthan theyhavebeenduringthepre-cloudera.Forexample,ifacompany canuseasmallerVMinstancebyrequiringlessRAM,thenitwill reduceitsoperatingcosts.

3.1.5. Evaluatingsystems

Newscalesmeannewchallenges.Systems thatworkfineat

smallscales,or insimulation,mayfacenewproblemsatlarger

scalescausedbybottlenecksthatwerenotpreviouslyapparent.

Thisisprevalentintheanalysisof‘BigData’(Jacobs,2009).Of par-ticularconcerntoresearchersisthattheproblemofevaluatingnew ideasappliestotheirresearchasmuchas,ormorethan,software

developmentin industry.Problemsmayonlyoccurata certain

levelof scaleand concurrency.We discussthis issuefurtherin

Section6. 3.1.6. Security

Maintainingdataandsystemsecurityisachallenge,notleast

becausedatathatwaspreviouslyhostedona company’s

inter-nalserversmaynowresideonathirdpartymachineandpossibly residealongsidethedataandcomputationofthirdparties. Tradi-tionalfirewallsandintrusiondetectionsystemsdonotapplyand thesandboxingofvirtualisationlimitsthevisibilityofnetwork traf-fic,impairingthemonitoringofthistrafficbycloudclientsasa

defensivemeasure.

3.1.7. Managingthebusinessmodel

Reese(2009)notesthatitissometimeslostindiscussionsof cloudcomputing that scalability in itself is not an end, rather

that scalability must coincide with the goals of the business.

For example, serving extra users may not necessarily be

prof-itable, if those users are unlikely to purchase a given product

orservice. Criticaltoachievinga sustainablebusiness model is

themanagement of scalabilityand thepredictive modelling of

demand.

3.1.8. Accelerateddevelopmentcycles

The difficultyof distributing software toindividual users is

oftenavoided by employing cloudcomputing.Only server-side

softwareneedbeupdatedwhenchangingaserviceprovidedover

theweb,forexample.Furthermore,theroll-outofnewsoftware

canbelimitedtogivenusersorgeographicalareas,atechnique sometimesreferredtoascanaryingorA/Btesting.Thustraditional

barrierstoreleasearenolongeranobject,andcompaniessuchas Googlefrequentlyreleasemultiplerevisionsoftheirsoftwarein asingleweek.Thisbringsnewchallengesinanaccelerated life-cycle,suchascontinuoustesting,versioncontrolandmaintaining documentation.

3.1.9. Riskmanagement

Thereare two major risks facingan organisationhoping to

migrateordeploytothecloud(Armbrustetal.,2009,2010;Cliff, 2010).Thefirst istheissueofvendorlock-in,particularlywhen

making useof platform-as-a-service, which relies on

provider-specificAPIs.Anacceptedstandardforcloudinteroperabilityhas yettobeproposed,andasproviderscontinuetodivergeintheir

approach to provision, such standardisation looks increasingly

unlikely. Projectssuch asEucalyptus (Nurmi et al., 2009)have

suggesteda more pragmaticapproach, bycreatingopen-source

implementationsofproprietaryAPIs.Thesecondmajorrisk

sur-roundsdatagovernance,whendataistrustedtoathirdparty(the cloudhost)andinsituationsparticulartocloudcomputing,suchas coresidency.

3.2. Challengesforcloudproviders

Principally, cloudproviders are concernedwith maintaining

theirservicelevelagreementsasefficientlyaspossible,andina

scalablemanner:

3.2.1. Virtualmachinemanagement

Itisconvenienttoconsiderthemachinesthatpopulatea data-centreasasinglemassivelyparallelunit.Fromthispointofview,

theproblemsofresourcemanagementareclear:schedulingmust

takeplaceacrosstensofthousandsofmachines;individual vir-tualmachineinstancesmustbeassignedtophysicalservers;loads

mustbemonitoredandbalanced;VMscanbemigratedbetween

servers,andmemorycanbepagedacrossthenetwork(Williams

etal.,2011).Coresidencyalsobringschallengesofnewpathologies, inthesamewaythatharddisk‘thrashing’mayaffectconventional sharedsystems.

3.2.2. Managingoversubscription

Manycloudproviderstakeadvantageofthevariableworkloads ofclientsbyutilisingoversubscriptiontoreducetheircosts.Inan

oversubscribedsystem,theamountof resourcesprovisionedto

clientsexceedsthetotalcapacityofthephysicalresourcesavailable totheprovider.Thechallengeistoanticipatewhendemandmight outstripsupply,andtotakemeasurestomitigatesuchsituations withoutviolatingSLAs.

3.2.3. TranslatingSLAstolow-levelbehaviour

Thereiscurrentlyadisconnectbetweenthehigh-levelbusiness

modelofSLAsandthelow-levelresourcemanagementofkernels

andhypervisors.

3.2.4. Scalableserviceprovision

Aswellasprovidinginfrastructure,mostlargecloudproviders alsoprovidesomeservices,intheformofanAPI.Inconstructing suchservices,theyfacemanyofthesamechallengesastheircloud clients.Forexample,distributedlarge-scalestoragesystemshave

seenconsiderabledevelopmentin thelast fiveyears. Themain

challengeistoensurescalabilityandresilience(DeCandiaetal., 2007).

3.2.5. Resourceefficiency

Providers aimto reduce theirexpenditure through

(8)

ple,moderndatacentresareforthemostpartpower-inefficient (BarrosoandHölzle,2009).Themainoverheadofrunninga data-centreisclimatemanagement.Mosteffortinthisareafocuseson

thedesignofcoolingsystems,butonceefficiencyimproves,the

focusmayshiftonto‘energy-scalable’computing,thatisensuring

thatlowutilisationcorrespondstolowpowerconsumptionand

vice-versa.

4. ExampleapplicationsofSBSEtocloudengineering

Wenowprovideillustrativeexamplesofspecificoptimisation

problemsincloudcomputing,inordertodemonstratethe

appli-cabilityof SBSEtocloudengineering. Toavoid imprecision,we

givedetailed suggestionsonhoweach taskmaybeformulated

asasearchproblem,andalsoreferencerelatedworkinSBSEthat couldbeappliedinthisdomain.DetaileddescriptionsofSBSE

algo-rithmsandthesoftwareengineeringapplicationstowhich they

havehithertobeenappliedcanbefoundelsewhereinsurveyson

SBSE(Harmanetal.,2012a).

We have chosen optimisation problemsthat can potentially

benefitboth,cloudprovidersaswellastheirclients.The exam-plesgiveninSections4.1,4.3and4.5addressspecificproblems facedbycloudprovidersofInfrastructureAsAService,PlatformAs AServiceorSoftwareAsAService.Theproblemsandsolutionslaid outinSections4.2and4.4canservetobenefitboth,cloudproviders aswellasclients.

4.1. Providerresourceefficiency:cloudstackconfiguration

Onechallengetobothprovidersandclients,notedinSection3, istoreducetheresourceconsumptionoftheirsystems.

Inthis section,weproposeanapproachthat willhelp cloud

providers optimise their cloud stack configuration of physical

serversinsideadatacentre.Fig.6illustratesatypicalserver soft-warearchitectureinacloudenvironment.Aphysicalserverina clouddatacentreexecutesahypervisor,generallyXenorVMware. Thehypervisorrunsasingleoperatingsystem,suchasamodified versionofLinux,asthedom0hostoperatingsystem.Thishostis controlledbythecloudprovidertoadministerthenode;residing indom0providesitwithspecialaccessprivileges.Guest operat-ingsysteminstancesarethevirtualmachinesutilisedbyacloud

client,and executewithindomU.These aremultiplexedonthe

nodebythedom0host.Eachoftheselayersmaybeconfigured

independently.

Therearemanyopportunitiesforoptimisation:boththedom0

hostand thedomU guest operating systems can besubject to

configurationtuning,inordertooptimizeparticularperformance characteristics,suchasnetworkbandwidthandenergyefficiency.

Cloudprovidersusehostconfigurationparameterstooffera

rangeofinstanceconfigurations,eachwithfixedcharacteristics.For

example,theAmazont1.microinstancecurrentlyoffers613MB

ofRAManduptotwovirtualprocessingunits.These

characteris-ticsmaymapdirectlyontotheappropriateXenconfigurationfile

parameters:

maxmem =613 vcpus= 2

Suchhostconfigurationfilesspecifyhowthephysicalresources

oftheunderlyinghardwarenodeshouldbedividedamongstguest

operatingsystems.Thisconfigurationcanevenbealtered

dynam-ically.

Theguestoperatingsystemsettingsaregenerallyconfiguredat

boot-time,forexamplethroughthebootparameterstotheLinux

kernel(RedHatEnterprise,2007).Manyoftheseparametersare

performance-sensitive.Someparameterscanbeconfiguredat

run-time,butevenchangesthatrequirearebootmaybeappliedina

anticipated.

4.1.1. Formulation

Wecanrepresenttheparametersofthehypervisororoperating systemasavaluevectorcontainingmultipletypes,suitableforthe applicationofahill-climberorgeneticalgorithm.Searchoperators

shouldrespectparameterboundariesandconstraintsthat

deter-minethestructureoffeasiblesolutions.Inanofflinescenario,in whichtheoptimisationisperformedinasafesandboxenvironment disconnectedfromthelivecloudinstances,thismaybeachieved bypunishingunfeasiblesolutionsusingaspeciallydesignedfitness function.Inonlineoptimisation,however,thesearchalgorithmwill evaluatethefitnessofcandidatesolutionsinsituusingthelivecloud instancesasthefitnessfunction.Consequently,thesearchwillhave tobelimitedtothesubspaceoffeasiblesolutions.

In generalit seemsreasonable toassumethat an individual

user’s previous workload patterns may be indicative of future

behaviour,andoptimiseaccordingly.Thisapproachwouldbe sim-ilartotheworkonSBSEforcompileroptimisation,whichsearches

thespace ofparametersettings: SBSEhasbeenadvocated asa

tuningmechanismforcompilers(Cooperetal.,1999).Hosteand Eeckhout(2008)demonstratedhowacompilercanbetunedby targetingdifferentperformanceobjectivesforthecodeitproduces.

Compilershaveasurprisingnumberofpossibleparametersthat

canbetunedinthismanner:gcchasabout200.Indeed,oneway

totuneanoperatingsystemorothercloudstackcomponentwould betosearchforsuitablecompilersettingsthatenablearecompiled versiontoperformbetterinagivenscenario.

4.1.2. Evaluationfunction

The evaluation function for the tuning of cloud stack

com-ponents would depend on the performance objectives to be

monitoredand optimised.Obviouschoices aredynamic

perfor-mance attributes includingmemory usage,execution time and

energyconsumption.Sincethesepropertiesaresubjectto variabil-itydependingonthegivenworkload,theevaluationfunctionwill beforcedtoseeknormalisationthroughsamplingandstatistical analysis.Thisisacross-cuttingissueincloudoptimisationandis discussedfurtherinSection6.

4.2. Clientresourceefficiency:imagespecialisation

Cloud providers usually find themselves running many

instancesusingthesamemachineimage,thefullfeaturesofwhich areunlikelytoberequiredineverycase.Onagivenvirtualmachine, acustomerwillusuallyberunningaspecificapplication,suchas

aLAMP(Linux-Apache-MySQL-PHP)stack.Eitherthroughexplicit

declarationoftheintendedusebycustomers,orbypermitted mon-itoringofactivity,theprovidercouldtailortheimagetobetterfit theneedsofthecustomer.

Itseemswastefultouseonlyafractionofsoftwarewithinan image;thelargerthescaleofthedatacentre,theworsethis prob-lemof‘unusedsoftwareplant’willbe.Specialisationcouldhelp inreducingimagesize,reducingexecutiontime,andloweringthe timerequiredtomigrateanimageacrossthenetwork(Voorsluys etal.,2009).Thebeneficiariesofsuchspecialisationwouldbeboth, cloudprovidersaswellascloudclients.Foracloudprovideritwill resultinlessdemandonphysicalhardware.Forcloudclients, spe-cialisationoffersonewayofcuttingcost,forexamplebyconsuming lessresources.

Atpresent cloud providersalready offermany different

(9)

Fig.6. Schematicdiagramofcloudapplicationexecutionstack.

images3 for customers onthe EC2 cloudplatform. Each image

includesa particularoperatingsystemreleaseandasetof

bun-dledapplications.The‘BitNamiWordPressStack’imageprovides

UbuntuLinuxandallnecessarysoftwaretosupportWordPress:

BitNami,WordPress,Apache,MySQL,PHPandphpMyAdmin4.Such

coarse-grainedspecialisationcanbeformulatedastheselectionofa setofappropriateLinuxdistributionpackages(e.g.RPMs)toinclude inaVMimage.Thereisawell-defineddependencygraphstructure

forRPMpackages,whichcouldbeco-optedtocreatespecialised

Linuximagesforthecloud.

Thereexiststheinterestingproblemofidentifyingthe trade-offbetweenthefrequencyofuseofamoduleversuspotentialsize reductionshoulditberemoved.Ifremovingararelyusedmodule canresultinasignificantspacesaving,aworkaroundmaybeworth considering.

More fine-grained VMspecialisation is also possible. A

sin-gleRPM (e.g.,a Linux librarysuchas libstdc++)maycontain

superfluousfunctionalityforthepurposeofagivenVMinstance. Programslicingmighthelptoreducethelibrarytothebare essen-tialsfortheusecasesrequired(GuoandEngler,2011).Evenmore ambitiousspecialisationmaybeachievedbyrewritingapplication

andoperating systemcomponents.Manual specializationeffort

(Madhavapeddy et al., 2010)demonstrates clear potential; the applicationofSBSEshouldautomatethisprocess.

4.2.1. Formulation

Forthereductionofimagesizethroughthedeletionofunused modules,thesearchcoulduseadependencygraphrepresentation.

Searchingforcompromisesbetweenusageandsizereductionis

closelyrelatedtothecost-benefittrade-offsinrequirements engi-neering(Zhangetal.,2007;Durilloetal.,2009).

More generally, this is a type of ‘partial evaluation’ of the

machineimagewithrespecttotheintendedapplication.Partial

Evaluation(PE)hasbeenstudiedformanyyearsasawayof spe-cialisingprogramstospecificcomputation(Beckmanetal.,1976; Bjorneretal.,1987;Jones,1996).However,highlynon-trivial

engi-neeringworkwould berequiredtoadapt and develop existing

PEapproachestothescalerequired,suchasmodifyingtheLinux kernel,whichmayruntoseveralmillionsoflinesofcode.

3 https://aws.amazon.com/amis(accessedApril2012).

4

https://aws.amazon.com/amis/bitnami-wordpress-stack-3-3-1-0-ubuntu-10-04.

Anotherapproach wouldbetouseprogramslicing(Harman

and Hierons,2001; Silva, 2012)to ‘slice away’ unusedparts of animagethatcanbestaticallyidentified.Likepartialevaluation, slicinghasbeenstudiedformanyyears(Weiser,1979).However, despitemanyadvancesinslicingtechnologies,scalabilityis cur-rentlylimited.

AnalternativetoPEandslicingforimagespecialisationwould betouseasearchbasedapproachtofindpartsoftheimagethat

canberemoved.Moreambitiously,asearchbasedapproachmight

alsoseektoconstructanewversionoftheimage,specialisedfor aparticularapplication,usingtheoriginalimagetoprovidebotha startingpointandanoraclewithwhichtodeterminethecorrect behaviourofthere-write.Anaturalchoiceofformulationforthis

problemwouldrestontheuseofgeneticprogramming.Related

SBSEworkusingGeneticProgramming(GP)(Koza,1992)includes recentadvancesinautomatedsolutionstobugfixing(Arcuriand Yao,2008;Weimeretal.,2009),platformandlanguagemigration (LangdonandHarman,2010)andnon-functionalproperty optimi-sation(Whiteetal.,2008,2011;Sitthi-Amornetal.,2011).

AtypicalrepresentationinaGPsearchisanabstractsyntaxtree.

However,theGPprocess inherentlyinvolvesmaintainingmany

copies oftheprogram under optimisationand would hencebe

memory-intensive.Fortunately,recentapplicationsofgenetic

pro-grammingtotheproblemofcloningsystemsandsubsystemshave

developedconstrainedrepresentationstoovercomesuch scalabil-ityissues.Forexample,LangdonandHarman(2010)useagrammar basedGPapproach,inwhichchangesareconstrainedbyagrammar specialisedtotheprogram,whileLeGouesetal.(2012)representa solutionasasequenceofeditstotheoriginalinamanner reminis-centofearlierworkonsearchbasedtransformation(Ryan,2000; Fatiregunetal.,2005).EitherapproachcouldbeusedtoscaleGP totheimagespecialisationproblem.Profilingcouldalsofocusthe search,identifyingthosesoftwarecomponentsthatarefrequently executed.

Sufficientknowledgeofasystem’sbehaviourwillplayacritical

role.Theupperlimitofthedeletionapproachisdefined bythe

minimalfunctionalitythatmustbepreserved.Itisthebehavioural characteristicsthatwouldprovideapseudo-specificationforthe specialisation;theresultingsoftwareshouldbeabletohandleall systembehavioursthatclientsareinterestedin.Theprimarysource ofthisknowledgeshouldbeusageprofiles.

Accumulated profile data would provide a certain level of

confidence in determining the behavioural characteristics that

(10)

2010; Lakhotia et al., 2010a; Alshahwan and Harman, 2011; FraserandArcuri,2011)couldalsobeusedtospecificallytarget

theseapparentlyunexecutedregionsofcodetoraiseconfidence

that theyare not executed.Code that resistsexecution forthe

applicationinquestion,evenwhensearchisusedspecificallyto targetitsexecution,mightbespeculativelyremoved.

4.2.2. Evaluationfunction

Naturaltargetmetricsareimagesize,memoryusage,and exe-cutiontime.Forimagesize,wemayconsiderthetrade-offbetween thesizeofimagesandthefunctionalityprovided. Imagesizeis static,eliminatingtheneedtomakerepeatedobservations.With

regardstomemoryusageandexecutiontime, wemayconsider

whetheraverage,peakorminimalrequirementsaremost

impor-tant.Theaveragedrawoneitheroftheseresourcesmaynotbeas importantasthevariance,andwemightseektoraisetheminimum

drawonresourcesandreducethemaximumdraw,therebyaiding

loadpredictionandmanagementonacloudinfrastructure.

Ifanonlineapproachisadopted,thenanengineermightalso bepresentedwithoccasionalreportsofPareto-optimaltrade-offs (Harman,2007)currentlypossibleforheavilyusedapplications.In alloptimisationwork,itisimportanttoconsidertheappropriate

pointatwhichtodeployhumanjudgementinanotherwise

auto-matedprocess(Harman,2007,2010).Thesehumanjudgements

couldbeinformedbytheautomatedconstructionofspeculative

optimisation reports from realtime image specialisers, running

concurrentlywiththeapplicationstheyserveandwhichabsorband makeuseofotherwiseredundantdatacentreresourcestooptimise forthefuture.

4.3. Virtualmachineassignmentandconsolidationbyproviders

Virtualisation offers opportunities for power consumption

reduction using consolidation, whereby somevirtual machines

maybemigratedfromonephysicalservertoanother.Traditional

consolidationcanbeemployedsothatsomeserversmaybe

pow-ereddownortransitionedtoalow-powerconfiguration.Servers

continuetousesignificantamountsofpowerwhentheyareidle;

forexample,Googlereportthatidlepoweris‘generallyneverbelow 50%ofpeak’(Fanetal.,2007).Henceconsolidationmaybemore

desirablethan load-balancing,althoughpoweringdownservers

hasanassociatedcostinthetimetakentorestorethem.

TheproblemofallocatingVMstoserversisessentiallyoneof bin-packing,albeitdynamicinnatureandundermanyconstraints. TheconstraintsareimpliedbytheSLAsbetweenthecloudprovider

anditsclients.Forexample,wemayseektomakepowersavings

asdescribedabove,butalsowishtoavoidcohostingtwoVMsfrom

thesamecustomeronthesamemachineornetworksegment,so

asnottoaffectthetermsofanSLAin regardstoavailabilityor

responsiveness.Tocomplicatematters,itiscommonpracticeto

oversubscribephysicalhardware(Williamsetal.,2011),suchthat

demandmayinsomecircumstancesbeexpectedtoexceedsupply.

Furthermore,theenergyprofileofanapplicationandofaserver appearstobehighlyvariableandexperimentalstudies investigat-ingservers’energyconsumptionhavereportedconflictingfindings (BarrosoandHölzle,2009;LeeandZomaya,2012).

The allocation of VMs to servers has two overriding goals:

firstly,toreduceenergyconsumptionbyshuttingdownunoccupied machineswhendemandislowwithinadatacentre,i.e.traditional ‘consolidation’.Secondly,tomake efficientuseofmachinesthat

arerunningwithoutimpactingonacustomer’sVMinawaythat

SLAscannotbemet.Aggressiveconsolidation mayleadto

com-petitionforresourcesonheavilyloadedservers,anditmayalso

beproblematicifdemandrisesfasterthanpowered-downservers

ofdemandarerelated.

Thereexistsasubstantialbodyofliteratureonvirtualmachine management(Srikantaiahetal.,2008;BeloglazovandBuyya,2010; Lee and Zomaya, 2012), which invariably suggest handwritten heuristics.TheapplicationofSBSEmethodsoffersanopportunity toincreasethesophisticationandefficiencyofmanagement. 4.3.1. Formulation

Letusfirstconsiderthesimplestcaseofasinglestaticsituation. Wemayexpressthistaskasasearchproblembyconsideringa solu-tionasamappingfromadescriptionofthedemandforresources ontoanexistinginfrastructureofphysicalmachines.Tobuildon theformulationfromLeeandZomaya(2012),aserverssioutofw serversbelongstothesetS,andtheutilisationUiofasingleserver si∈Scanbedescribedasthesumoftheutilisationui,jduetoeach residentvirtualmachine

v

jrunningonsithus:

Ui= n



j=1

ui,j (1)

Theenergy consumption ofa singleserver, ci,mustthen be

modelledasafunctionofitsutilisation,

ci=b+f(Ui) (2)

wherebindicatestheminimal(base)energyusageofaserverthat isidling.However,thisassumesthatallutilisationishomogeneous, whereasinfactwemightregardUiasak-dimensionalvector,giving utilisationforeachofthektypesofresource(CPU,memory,energy, etc.)providedbyaserver.LeeandZomayaalsoassumethatfisa linearfunction,whichisinconflicttothatreportedbyBarrosoand Hölzle(2009).

Momentarilydiscardingtheselimitations,theproblemcanthen beconsideredassearchingforanassignmentconsistingofavector AsuchthatAjdescribeswherevirtualmachine

v

jshouldreside.The goalistominimiseoverallenergyconsumption,C:

C=

w



i=1

ci·Ri (3)

whereRigivesabooleanvalueindicatingifaserverisrunningat leastonevirtualmachine:

Ri=



1 if

j:Aj=i

0 otherwise

(4)

Itisnotunreasonabletoassumethataprovidermaywishto

separatevirtualmachinesfromthesamecustomerontoseparate

servers,i.e.lettingcust(

v

x)denotethecustomerwhoisrunning virtualmachine

v

x,wewanttoenforcethefollowingconstraint:

i,j,i/=j:cust(

v

i)=cust(

v

j)⇒Ai/=Aj (5)

Therewillbefurtherconstraintsbasedontherisk profileof highlyutilisedservers.Furthermore,wemayconsiderthecostsof transitionfromthecurrentallocationAtattimettothesuggested allocationAt+1.IfweassumeafixedcostforVMmigration,thenwe maytrytominimisethedifferencebetweenAtandAt+1.Ifaserver istobeshutdownorrestarted,thenwemusttaketheassociated costintoconsideration.

Muchdependsonwhethertheproblemistreatedstatically,in thesensethatasingleallocationisfoundforagivensituation,or

whetherthegoalistoconstructapolicytobeexecuted

dynam-ically.Oneinterestinghybridapproachwouldbetosearchfora staticallocationthatnotonlyoptimisesagivenevaluation func-tion,butalsotakesintoaccountfuturescenarios,forexamplein ordertominimisethecostofmigrationsgivenpotentialanticipated

(11)

demandchanges.ThisideaissimilartothatofEmbersonandBate (2010)intheirSBSEworkonallocatingprocessestoprocessorsin

amultiprocessorsystem.

Thestatic allocation of VMs tophysical servers canbe

rep-resentedusingtheintegervector,A.Adynamicsolutionwillbe representedasaprogramorheuristicfunction

At+1=f(At,Ut) (6)

Amoreambitiouspolicycouldincorporatemorecomplex infor-mation,suchasanticipateddemand,intothisfunction.Theabove discussionisheavilysimplified,andtheconstraintsinvolvedwill dependonthedesignofthedatacenter,aswellasthegoalsofthe partiesinvolved.Justafort(2012)considerstheimpactat applica-tionlevel,forexample,whereStillwelletal.(2012)optimiseon

heterogeneousplatforms.

Searchingoveravectorspacecouldbeperformedusing

Sim-ulatedAnnealing(Kirkpatricketal.,1983),thetechniqueusedby

EmbersonandBate.

Thesearchforadynamicheuristicwouldbewell-suitedtoa

GeneticProgrammingapproach,withafunctionmappingvirtual

machinestophysicalserversrepresentedasanexpressiontree.

PreviousworkonschedulinginGPprovidesexamplesofhowthis

mightbeachieved(Jakobovi ´cand Budin,2006; Jakobovicetal., 2007).

4.3.2. Evaluationfunction

Theobjectivesofthis taskare tominimisepower

consump-tionwithintheconstraintsspecifiedbyanSLAwhilsttakinginto accountpotentialfutureusage.SLAconstraintswillinclude

pro-vidingenoughphysicalRAM,ensuresufficientlylowlatency,and

providingenoughindependenceofexecutiontomaintainuptime

guarantees–forexample,byavoidingthecoresidencyofaclient’s ownvirtualmachinesonasingleserver.Futureusagemustbetaken intoaccounttoensurethatviolationofanSLAconstraintisunlikely tooccurafterreassignment.

Therearethreeapproachestoevaluationthequalityofa solu-tion:modelling,simulationandphysicalevaluation;thelattercan

involve real-time feedback from the datacentre. The CloudSim

toolkit(Calheiros et al.,2011) providesan appropriate level of abstractionforthisevaluationfunctionandwasusedpreviously byBeloglazovandBuyya(2010).

4.4. Scalemanagementforcloudclients

Oneofthekey advantages ofcloudcomputingis theability

toautomaticallyscaleaserverinfrastructureforanapplicationin responsetochangesindemand.Acloudclientwillwishtominimise theirexpenditure;theyarebilledforthecomputationalresources

thattheyconsume.AhigherspecificationVM,withmoreRAMor

afasterprocessor,incursahighercost.Further,costsaretypically basedon‘instancehours’,i.e.howlongaVMisrunningfor, regard-lessoftheamountofcomputationperformed.Idleinstancescost

moneyanditisimportanttobringmoreVMsupwhenthereis

asurgeindemand,andscaleanapplicationdownbyterminating

VMinstanceswhen demanddrops.AsnotedinSection3.1,this

scalabilitymustbemanaged.

Thus a cloud application must be configured such that it

minimises resource usage while maintaining a certain level of

throughput.Thiscanbeachievedbyeitheranticipatingdemand

orbyconstructingasetofrulesthatreacttochangesinan applica-tion’susageprofile.Fluctuationscanbecausedbyfactorsincluding promotionalcampaigns,the‘Slashdoteffect’whereapopular web-sitelinkstoasmallerwebsite,andevenbehaviouralpatternsof users.Anexampleoftencited istheelectricity spikecaused by kettlesswitchedonathalftimeduringpopularsportingevents.

Somevariationsin demandarepredictable,while othersare

not.CloudprovidersAmazonandMicrosoftAzureallowaclient

toconfigurerulesforscalinganapplication.Theserulesarebased

onmetricsincludingmaximum,minimumandaverageCPUusage,

thenumberofitemsinaservers’requestqueueandsoon.Theyare typicallydesignedbyadeveloperandsystemadministrator.

Fig.7showsanexamplesnippetfromaMicrosoftconfiguration document.Thetophalfcontainsrulesforascheduledscalingof anapplication(alsocalled‘constraintrules’),whilethebottomhalf showsrulesthatdefinewhentoreacttofluctuationsinan applica-tion’susage(alsocalled‘reactiverules’).Bothtypesofrulesshould

beconfiguredinawaythatmaximisesperformance,butminimises

cost.Forexample,onewouldnotwanttoscalebeyondthelevel

specifiedbytheconstraintrules,becauseotherwiseaclientwillbe chargedforidleinstances.Equally,thereactiverulesneedtobe con-figuredsuchthattheyarerobusttosuddenspikesinuserdemand. WecanconsiderusingSBSEtooptimisesuchrulesinordertofind

anoptimaltrade-offbetweenperformanceandcost.

4.4.1. Formulation

TherulesshowninFig.7aredescribedinanXMLformat.As

suchtheyarealreadyinaformatamenabletoaGPapproach.The structureofanXMLdocumentisdefinedindifferentschemasthat

canbeusedbythesearchtoensureitonlygeneratesvalidXML

documents.

Rulesare composedfrom conditionsand associated actions.

Condition operators such as equals, greater,

greaterOrE-qual, less, lessThan, notcanbeused toform partofthe

functionset.Whenaconditionevaluatestotrue,the

correspond-ing action is performed. Actions are alsofunctions that take a

variablenumberofparameters.Forexample,thescaleactionis

parameterisedbythetypeofVMtoscale,aswellashowmany

replicatedVMsshouldbecreated.

Theterminalsetcanbeconstructedfromtheattributesdefined forthedifferentelementsintheXMLschema,alongwiththeir val-ues.Forexample,astartTimeattributeisoftypetimestamp.In addition,theterminalsetneedstocontainallpossibleVM config-urationoptionsalongwiththedatatypesrequiredbythefunction set(e.g.therangeofintegers).

4.4.2. Evaluationfunction

We cantreat thisas a multi-objectiveoptimisation problem

whereourobjectivesarelatencyandcost.Assumewehavean exist-ingloadtestsuite.Wemayevaluateagivensetofconstraintand reactiverulesbymeasuringtheaveragelatencyobservedforthe loadtests.Equally,wecanmeasurethecostofexecutingaloadtest suiteintermsofcomputationalresourcesrequired.

Evaluatingasetofscalingrulesbyrunninganentireloadtest islikelytobetoocomputationallyexpensiveinpractice.Forrest etal.(2009)showedintheirworkonautomaticpatchfixingthat samplingfromasetoftestcasescanbemoreefficientthanusing anentiretestsuite.Intheirwork,testcaseswereusedtovalidate candidatepatches,andapatchwasconsideredvalidifitpassedall testcases.Theyfoundthatasamplingmethodprovidedsufficient informationtoaGPinordertoguideittowardspossiblesolutions. Candidatesolutionsonlyhavetobeevaluatedonthecompletetest suiteonceanoptimisationrunfinished,orafteragivennumberof generations.

Sincetheevaluationfunctionusestwoconflictingobjectives,

theGPwillnotgenerateasinglesolution.Insteadwewillobtaina Paretofront.AParetofrontisasetofnon-dominating,Pareto opti-mal,solutions.Solutionsaresaidtobenon-dominatedifnoother solutionexiststhatisbetterinallobjectivesi.e.bothlatencyand cost.

Sincethe outputofthesearch willbea setof equallygood

(12)

Fig.7. Exampleconfigurationrulesforwhentoscaleanapplication.SnippettakenfromaMicrosoftAzureconfigurationfile(Microsoft,2012).Suchrulescanbegenerated andoptimizedusinggeneticprogramming.

Visualisinga Paretofrontintwo dimensionscanhelpwiththis

processasitillustratesthetrade-offsbetweenlowerlatencyand cost.

4.5. Spot-pricemanagement

WehaveseenhowSBSEmightbeusedtoreducethecostfor

cloudclientsbyevolvingoptimisedscalingrulesforanapplication.

Nowweexaminethemanagementofdemandfromthepositionof

acloudprovider.

Fromthepointofviewofacloudprovider,whena serveris

notrunningatfullcapacity,underutilisationofresourcesequates tolostpotentialrevenue.Thus, itisimportant tomaximisethe usageoftheircloudinfrastructureatanygiventime,given suffi-cientsparecapacitytocopewithanticipateddemandsurges.One waytoaddressthisproblemistoauctionoffunusedresources,such asAmazon’sSpotinstances(Amazon,2012).Aclientcanbidfora

spotinstance,andwhen theirbidmatchesorexceeds theprice

ofaspotinstance,theinstancebecomesavailabletothem.Aspot instanceremainsavailabletotheclientforaslongastheirbidprice matchesorexceedsthepriceofaninstance.Pricesareperiodically

adjustedbythecloudprovider,andwhenaspotpriceexceedsa

client’sbid,theylosetheirinstance.

Tomaximisetheeffectivenessofspotinstances,acloudprovider

mustefficientlydividesparecapacity amongstsuitableinstance

configurations,andsubsequentlymakethoseinstancesavailable

forauction.

4.5.1. Representation

Weproposetoformulatetheproblemofauctioningoffunused

capacityasabin-packingproblem,inasimilarmannertothe con-solidationprobleminSection4.3.Binsdenoteserverswithspare

capacityand theobjectstoplaceintothebinsareVMinstance

types. We assume that a provider only offers a limited

num-berofspotinstancesizes.ThegoalistodistributeVMinstances acrossaserverinfrastructuresuchthattheamountofidleserver resourcesareminimised.Foreaseofexplanationwewillonly con-siderthedistributionofVMinstancesoverafixedtimeperiodof

onehour.

LetusdenotethedifferenttypesofVMinstanceaprovidermay offerasvt1,...,vtn.Wecansimplyrepresentthepossibleallocation ofVMinstancestoserverbinsasamatrix,asshowninFig.8.Rows

inthematrixdenoteVMinstancetypesandthecolumnsdenote

serverbins.Thenumbersintherow/columnsdenotehowmany

newinstancesofaparticularVMtypeareallocatedtoaspecific server.

4.5.2. Evaluationfunction

Asanevaluationfunctionweproposetominimisespare capac-ity.Moreformally,lethidenotethesparecapacity(headroom)of serversi.ThisisdeterminedwithreferencetoEq.(1):

hi=1−Ui=1− n



j=1

ui,j (7)

Wemustthenfindanewallocationofspotinstancestoservers, representedbythematrixBasillustratedinFig.8whereandBi,j denotesthenumberofnewVMinstancesoftypej,tobedeployed onserveri.

Theevaluationfunctionwillbetoreducehiovertime,under theconstraintsimpliedbySLAs.OfferingtheVMsaccordingtothe matrixB(whetherfounddirectlyoroutputbyasearch-generated policy)isnotenoughtoimproveutilisation,asbuyersforthenew VMinstancesmustbefound.Inparticular,Bdoesnotgeneratea priceatwhichaparticularVMtypeshouldbeauctionedoff.

Ben-Yehuda et al. (2011) showed that spot instance prices

appear tobeset,not bysupply and demand,but rathera

ran-domisedalgorithm.Inparticular,theauthorsfoundthefollowing formulatobeagoodfitforhistoricalpricesetting:

Pi=Pi−1+i (8)

wherei=−0.7×i−1+



().Intheirformula,



()denoteswhite noisewithstandarddeviation.Theauthorsmatchedtoavalue

Fig.8.ExamplematrixrepresentingpossibleVMinstancetypeallocationstoserver bins.

(13)

Fig.9. ShareofpaperstargetingeachSBSEresearchapplication. Source:SBSERepositoryZhangetal.(2012)(datafrom1976toMay2012).

of(0.39×(C−F)),whereCdenotesthemaximum(ceiling)andF denotestheminimum(floor)priceofaparticularVMinstance.Once asearchalgorithmhasfoundanoptimalallocationofVMinstance typesforidleresources,theVMimagescanbepre-builtandsold off.Thepriceforaninstancetypecanthenbesetusingtheabove formula.

5. ChallengesandopportunitiesforexistingSBSEmethods

Intheprevioussectionwepresentedanoverviewofchallenges

posedby cloudsystems that willlead tonew applications and

objectivesfortheSBSEresearchandpractitionercommunity.In

thissectionweturnourattentiontoexistingareasofSBSEresearch

thatfacenewchallengesandnewopportunitieswhenappliedin

theclouddomain.

SearchBasedSoftwareEngineeringresearchhasproducedmany

distinctsubdisciplines,mostofwhichwillremaincomparatively untouchedbythemigrationtocloudplatforms.Forinstance,the problemsofrequirementsengineering(ChengandAtlee,2007)and theassociatedSBSEresearch(Zhangetal.,2008),willnot necessar-ilychangefundamentally.Themainchallengeislikelytoremainthe elicitation,understandingandbalancingofmultiplerequirements andtheirrelationshipsandtensions,inthepresenceofmultiple

competingandconflictingstakeholders,withpoorlyunderstood

andill-expressedneedsanddesires.

However, for other areas of SBSE research and practice,

there will be a significant change, when the application focus

movesfromtraditionalplatformsandbusinessmodelstoacloud

paradigm.

5.1. Searchbasedtestingforthecloud

One of the biggest changes can be expected in the area of

SBSEfortesting,knownasSearchBasedSoftwareTesting(SBST)

(McMinn, 2004; Harman, 2007;Ali et al.,2010; Harman et al., 2012a).The changes in this areaare not entirelyproblematic;

cloud platforms offer significant advantages to the tester that

researchersinSBSTmayseektoexploit.Thisisparticularly sig-nificant,becausetestinganddebuggingformapproximatelyhalfof theoverallresearchactivityinSBSE(seeFig.9).Assuch,aneffecton SBSTwillhaveasignificantimpactoftheSBSEresearchagendaas whole.

5.1.1. On-demandtestenvironments

Oneofthemajorbottlenecksintestingisthetimeittakesto preparetestenvironmentsandexecutetestcases.Evenwhentests canbeexecutedinparallel,limitedavailabilityofhardwaremeans overalltestexecutiontimeremainshigh.Further,upuntilnow,it hasnotbeeneasytosetupatestinfrastructuretoruntestsin paral-lel.Cloudplatformsofferanewsolutiontothisproblem.Apartfrom unittests,manytestssuchasintegrationandloadtestsdependon

anexecutionenvironment.Wecaneasilyreplicatesuch

environ-mentsbycloningaVMimage.Thesecanthenbedistributedacross manyvirtualmachinestoruninparallel.

5.1.2. Snapshotting

Inacloudenvironment,everyapplicationrunsinsideavirtual

machine. Thus, it is possibleto takea snapshot of an

applica-tion, including its entire runtimeenvironment, at an arbitrary

stageinitsexecution.Thisopensupmanynewpossibilitiesfor debugging.

Traditionally software is deployed and installed on client

machines. Most software companies have little or no control

abouttheexecutionenvironmentoftheirapplication.5Thislackof

information,orratherthehugeconfigurationspaceapplications

run within,meansdebugging a failure is a difficult process. In

cloudapplications,thespecificationofauser’smachineislikelyto havemuchlessimpactonthecorrectrunningofsoftware;users’

machinesarebecomingmoreandmoreakintodumbterminals.

Since an application resides in thecloud, developers have full

access and control over its environment. Thus, when a failure

occurs,weareabletotakeasnapshotoftheentireVMstatefor

debuggingpurposes.

Inasimilarmanner,thereisthepossibilityofrapidlyforkinga runningVMinstance(Lagar-cavillaetal.,2009).Thiswillfurther enableefficientsoftwaretesting,byenablingasearchprocessto bifurcatetheexecutionofaproblembasedon,forexample,abranch predicate.

5.1.3. Multi-versiondeployment

Onewaytotestforregressionfaults,orevaluatepatches,isto

deploymodifiedsoftwareonlytoacertaingroupofusers,known

5ForexamplesomeWindowsapplicationsaredeployedunderLinuxwiththe

(14)

Forexample,Facebookrecentlyevaluateditspayperpostfeature onarelativelysmallaudienceinAustraliaandNewZealand,while therestoftheworldcontinuedtousethe‘oldversion’.Thismeans weareabletoeasilycollectprofileinformationacrossversions.We mayusethisinformationtohelpSBST,forexamplethroughseeding ofoptimisationalgorithms.

5.1.4. Quantifiablecost

It is also worth noting that the cloud offersa new way of

evaluating the cost of testing. Le Goues et al. (2012) usedthe

cloudtomeasurethecostofabugfix.Thisissimplythe

mone-tarychargeincurredforthecomputationalresourcesused,such

asthetime it tooktofinda patch.In testing wecouldequally

measure the cost of code coverage, the cost of finding a fault

andsoon.Justasacloudclientmightconsiderwhetherscaling

toanother VMwill bring sufficient return in revenue, wemay

alsoconsider the return oninvestment of additional resources

appliedtotesting.Theelasticresourcesofthecloudmustbeused efficiently.

5.1.5. Shortreleasecycles

Previously,bugswereexpensivetofixoncesoftwarehadbeen

deployed.Thiswaspartlybecausesoftwarewas‘shipped’,andthus itwasnoteasytopatchdeployedsoftwareapplications.Evenwhen patcheswereavailablefordownloadonline,noassumptionscould

bemadewhetherapatchhadbeendownloadedandinstalledby

auserornot.Thisinturnleadtoitsownsecurityvulnerabilities, sincehackerscouldusepatchestodetectsecurityvulnerabilities, andthenexploitusersthathadnotupdatedtheirsoftwarewiththe latestpatch.

Cloudcomputingalsooffersnewopportunitiesinthisarea. Soft-warenolongerneedstobeshipped,asitisprovidedinthecloud. Thus,whenaVMisupdatedwithapatch,allusersofan applica-tionautomaticallystartusingthepatchedversion,withouttheir knowledgeorintervention.

Theabilitytoeasilydeploysoftwareisleadingtoevershorter releasecycles.Smallchangescanbepushedtoeveryuserofthe

software,almostin anon-demandmanner.These shortrelease

cycleswillrequireimprovementsinhowregressiontestingis

per-formed.Oneresearchavenuecouldbehowtomake regression

testingfitintoaclouddevelopmentcycle.Itneedstobeefficient, soitdoesnotcounteractthenotionof‘shortreleasecycle’,while remainingrobust.

5.2. Softwaremaintenance

Muchpreviousworkintheareaofsoftwaremaintenancewithin

SBSEhasactuallyfocusedonrefactoringsoftware.Forexample,

acommongoalhasbeentorefactorsoftwaretoachieveagiven

qualityintermsofcohesionorcouplingmetrics(Lutz,2001). Ascloudcomputingreliesonacentralisedsystemofsoftware distribution,itpresentsnewopportunitiesforautomating mainte-nancethroughsearchthatwerenotpossibleinastandarddesktop computingscenario.Firstly,softwarechangescanberolledoutto

userswithoutuserintervention.Secondly,theseupdates canbe

rolledouttoa limited audiencethroughcanarying tolimitthe

impactofanyerroneousmodifications,aswellasthefacilityto

rapidlybackoutthose changesshoulditberequired.Thiscould

potentiallyallowSBSEresearcherstoautomatemoreoftheprocess thanhaspreviouslybeenpalatable,astheriskprofileofsoftware updatesisreduced.

Runningmanyvirtualmachineinstances,potentially

duplicat-ingsoftwareacrossmanymachines,offersnewopportunitiesfor

fault-findingandrepair.Snapshotting,asdescribedabove,allows improvedfaultlocalisationbyeffectivelycapturingtheinformation

bugmaybereplicatedgiventhecorrectinputs.Inaddition,itopens upthepossibilityofattemptingtorepairtheproblem automati-cally,particularlyiftheproblemrelatestoperformanceorresource consumption.

Forexample,considertheworkofCarzanigaetal.(2008),who useddifferentpathsthrougha system,suchassimilarAPIcalls,

tofindaworkaroundforagivenproblem. Theirworkreliedon

theredundancyinherentinsoftwareinthatthesame

function-alitycanoftenbefoundinmultiplelocations.Asimilarapproach canbefollowedusingdifferentversionsofthesamesoftware,and thoseversionscanalsobeusedasoracleswhenautomatingrepair. Inthecloud,theremayexistmanydifferentversionsofthesame software,allavailabletoarepairmechanismthroughaglobalAPI. Similarly,wemayusesnapshotstoroll-back,modify,andre-runa

systemafteracrashorperformanceproblem.Anobviousmethod

ofrepairistovarythesoftware’senvironment,byusingadifferent machineimageorphysicallocationinthedatacentre.

6. Cross-cuttingissues

This section discusses issues that must be addressed in the

designandimplementationofexperimentsinvestigatingSBSEfor

thecloud.Researchersandengineerswillencounterthree

com-monthemes:theproblemofeffectiveevaluationoftheirideasor

designs,howbesttosampletheavailabledatainordertomake

theirdecisions,andthephenomenonofincreasingspecialisation. 6.1. Prototypingandevaluation

Clouddatacentresaretypicallycomposedofserversinthetens ofthousands,andcosthundredsofmillionsofdollarstoconstruct. Constructing adatacentre iseconomicallyinfeasible foran

aca-demicinstitution,butwemustbeabletoevaluateourresearch

withsufficientfidelitytoenableustoestablishtheircredibility. Engineersworkingintheindustryfacesimilarproblems,andmay

belimitedtousingexisting facilitieswhendemandfor themis

low,orevaluationusingcanarying:limitedroll-outofchangesto arestrictedaudience.Ifneitheroftheseoptionsareavailable,they facethesamechallengesasacademicresearchers.

Whenworkingattheapplicationlevel,forexamplewhen

car-ryingoutresiliencetesting,existing cloudservicescanbeused.

However, anywork that requires access tothe lower levelsof

thecloudstackwillneeda cloudtestbed.The firstoptionis to usesimulation,andthereareanumberofcloudsimulationtools underdevelopment(Kliazovichetal.,2010;Calheirosetal.,2011). Suchsimulatorsmaybeappropriateforevaluatingcourse-grained

behaviourunderagivenVMmanagementsystem,bysimulating

migrationandconsolidationpatternswithoutconcernfordetailed applicationbehaviour.

Theefficiencyandfidelityofsimulationrestrictitsapplication. Therefore,itseemsthatpartoftheresearchagendaintocloud com-putingmustbetheevaluationofa‘scalemodel’cloudapproach.

Suchsmall-scalecloudmodelscouldbeconstructedbybuilding

fragmentsof cloudsystems (e.g.a singlerackofservers),orby replicatinglargerslicesofadatacentreusinglesscapablehardware. Indesigningbothsimulationsandscalemodelsofrealcloud sys-tems,therewillbenaturalopenresearchquestionsaboutwhether themodelscorrectlysimulatethebehaviourexhibitedbythereal worldcloud.ThisisnotanissuethatisspecifictoSBSEfor the cloud,butanissueforanyworkthatseekstoevaluateand exper-imentwithamodelorsimulationofthecloudand,indeed,isan issueforanymodelofanysystem,suchasforexample,modelsand simulationsoffinancialsystems(Cohenetal.,1983;Panayietal., inpress).

(15)

Theseissuesarechallengingbuttheyarenotinsurmountable.

Inthefinancialmodellingdomainforexample,DarleyandOutkin

(2007),wereabletoconstructaveryeffectivemodelofthemarkets whichwasusedtopredicttheimpactofNasdaqstockmarket dec-imalisation.Ifcomplexdynamicsystemssuchasfinancialmarkets canbeeffectivelymodelledandsimulated,thenthereisgrounds forhopethatthesamemaybetrueofcloudmodels.

Furthermore,theissueofdesigningasuitablecloudmodelis,

itself,anoptimisationproblem,makingSBSEanidealcandidate

forcloudmodellingtoo.Theoptimisationobjectiveistodesignthe model(selectingparameters,architectureandproperties)suchthat

predictionsandbehavioursobservedusingthemodelare

appro-priatesimulationsofrealworldclouds.Thisapproachtodesign

andtuningofmodelsandsimulationsusingsearchbased

optimi-sationhasalreadybeenappliedtofinancialmodels(Rogersandvon Tessin,2004;Narzisietal.,2006).Thereisnoreasontoassumethat itcannotalsobeappliedtooptimisingthedesignofcloudmodels andsimulations.

Thefurtherchallengeofgeneratingrepresentativeworkloads

is also problematic. In particular, we might hope to replicate

realisticworkloadsasseeninexistingdatacentres.Duetothe multi-residencyofdatacentres,thisdatamaynotbepubliclyavailable. Someprovidershavereleasedlimiteddata(Mishraetal.,2010; Google,2012).

6.2. Monitoringandsampling

Whendiscussingdesignissuesatthehighlevel,thedifficultyof measuringpropertiesthatareessentialcomponentsofan evalua-tionfunctioncanbelost.Forexample,considerpowerefficiency:

notonlyisitdifficulttoaccuratelymeasurepowerconsumption

and relatethat consumption to a particularvirtual machine or

application,butthepowerconsumptionofagivenapplicationor

servicemaybeheavilydependentonthedemandandinput

pro-fileforthatservice.Giventhattherearealargenumberofpotential executionscenarios,someformofsamplingisinevitable;thisraises

theissueofhowbesttochoosethatsample.Toolargeasample

mayleadtoanover-expensiveevaluationfunctionthatcannotbe

practicallyaccommodatedwithintheconstraintsimposedbymany

SBSEalgorithms.Searchalgorithmstypicallyrequirelarge

num-bersofevaluations,soevaluationhastobeefficient.However,too smallasamplemayyieldanestimateofqualitythatmayprovide insufficientguidanceforthesearch.

Findingtherightsamplesizewill,itself,beamatteroftuning pragmaticsortheoreticalanalysisfortheSBSEapproachused,but

thisisnotinsurmountable.ComparedtopreviousSBSEwork,the

processofdesigninganefficientevaluationfunctionwillrequire moreeffort.

Thevarianceinperformancecharacteristicsoverthesamplealso raisestheissueofwhatformofaggregationismostappropriate. Thiswilldependontheapplicationsinhandandwillbeinfluenced

bythetargetpropertiesoftheservicelevelagreementbetween

cloudproviderandclient.Forinstance,onsomecloudsettings,it willbeimportanttoreducethevarianceinperformancetoincrease predictabilityofexecution.However,inothercases,theengineer mayseektoreducetheaverageloadorthepeakloadonresources

suchastimeandmemory.

6.3. Onlineoptimisation

ManyapplicationsofSBSEtocloudsystemswillrequireonline optimisationratherthanofflineoptimisation,byperforminginsitu adaptationintheliveenvironment.Thisisincontrasttomost

pre-viouswork,whichhasassumedpre-deploymentoptimisationin

aclean-roomenvironment.Onlineoptimisationisyettobe seri-ouslyadoptedbytheSBSEcommunity,andamajorinhibitingfactor

isthecostofcomputation:optimisationmethodssuchas

evolu-tionaryalgorithmscanrequiresignificantcomputationalresources themselves.

Apossiblesolutiontothisproblemhasrecentlybeenproposed, intheformofAmortisedOptimisation(Yoo,2012).Yoosuggests distributingthecostofoptimisationacrossmultipleexecutionsof thetargetsystem:eachexecutionofthetargetsystemprovidesa singleobservationoftheevaluationfunction.Whilethe optimisa-tionprocesstakesplaceoverlongertimescales,thetotalamount

of computational overhead for the optimisation is significantly

reduced,allowingittotakeplaceintheliveenvironment.

Cloud systems providean amenable environment for

amor-tisedoptimisation,astheparallelexecutionofthesameapplication acrossmultiplevirtualmachinespresentsthepossibilityof

speed-inguptheamortisedoptimisationbyaggregatingmanyparallel

observations of the evaluation function. This is analogous to

pastuseofislandmodelsindistributionoptimisationalgorithms (Whitley,2001).

7. Conclusions

CloudcomputingofferstheSBSEresearchagendaastimulusof newchallengesandopportunities.Thereisthechallengeofusing SBSEtoaddressthetrade-offsinherentincloudcomputing.There arealsonewopportunitiesforextendingpastworkintestingand maintenancewithinSBSEtocloud-specificapplications.

Though the fundamental technical concepts of the cloud

paradigmwillbefamiliartomany,theemphasis theyplace on

thecombinationofvirtualisation,dynamicre-assignment, distribu-tionandparallelismdoesraiseimportantnewresearchquestions. Thisnovelshiftofemphasisalsolendsadditionalimportanceto manyalreadyknown,butunderdeveloped,applicationsofsoftware engineeringoptimisation.

Theshifttoapay-as-you-gobusinessmodelforthedeployment

anduseofcomputationalsocreatesnewchallengesand

opportu-nities.Thisshiftisa‘disruptive innovation’,becausethecostof

computationcanmorereadilybemeasureddirectlyinmonetary

terms.Thisis asignificantchangeintheunderlyingcomputing

businessmodel.Onemightimaginethatreal-currencycostingmay

ultimatelyhave animpactonglobalfinance.We maysoon see

tradingin‘computationalfutures’and,perhaps,eventhecostof computationasa‘globalcurrencystandard’.

Theabilitytodirectlymeasurecostwillalsohaveaprofound effect onresearch,and particularly onSBSEresearch. Hitherto,

costmeasurementhasprovedtobeathornyprobleminsoftware

engineeringresearch:ourforebearswereforcedtorelyoncost

sur-rogates.Forexample,thosewhosoughttomeasuredevelopment

costhadtobesatisfiedwithfunctionpoints,codesizeand per-sonmonthsofdevelopereffort.Similarly,inplaceoftestcost,we

havebecomeaccustomedtousingthetestsuitesize,execution

timeandtestprocessduration.Cloudcomputingmaychangethis: developmentandtestcostcanbemeasuredindollarsoryuan.

MostincarnationsofSBSEaremanifestationsofthetrade-off

betweencostandvalueobjectives.FortheseSBSEformulations,

theabilitytomeasurecostintermsofthedirectbottomlinefor theorganisationislikelytoaddrealismandactionability.Results foroptimisationthatproducereducedcostscanbedirectly mea-suredinfinancialgaintotheadopters,makingthebusinesscase

foradoptionsimplerandmorecompelling.

TheoutputsofSBSEresearchforcloudarealsolikelytobeofhigh impactbecauseofthewideuptakeofcloudresearch.Evidenceof thegrowthofinterestandactivityincloudcomputingabound.In

thispaperwehavealsopresentedevidencethattheproblemsof

cloudengineeringarehighlyamenabletoSBSEsolutions,

reformu-latingfivesuchproblemsasSBSEproblems.Therapiduptakeof

References

Related documents

Reporting. 1990 The Ecosystem Approach in Anthropology: From Concept to Practice. Ann Arbor: University of Michigan Press. 1984a The Ecosystem Concept in

Favor you leave and sample policy employees use their job application for absence may take family and produce emails waste company it discusses email etiquette Deviation from

The Lithuanian authorities are invited to consider acceding to the Optional Protocol to the United Nations Convention against Torture (paragraph 8). XII-630 of 3

We amend the real EF data by generating a certain number of papers by author X and linking each of them with 35 randomly chosen users (35 is the average paper degree in the

The three-dimensional frequency selective surface (3D FSS) with band reject multiple transmission zeros and pseudo-elliptic response is designed from two-dimensional (2D)

There are eight government agencies directly involved in the area of health and care and public health: the National Board of Health and Welfare, the Medical Responsibility Board

Insurance Absolute Health Europe Southern Cross and Travel Insurance • Student Essentials. • Well Being

A number of samples were collected for analysis from Thorn Rock sites in 2007, 2011 and 2015 and identified as unknown Phorbas species, and it initially appeared that there were