
Sustainable Computing: Informatics and Systems


Academic year: 2021


Contents lists available at SciVerse ScienceDirect

Journal homepage: www.elsevier.com/locate/suscom

Maximizing the detection probability of overheating server components with sensor placement based on thermal dynamics

Xiaodong Wang a,∗, Xiaorui Wang a, Guoliang Xing b, Cheng-Xian Lin c

a The Ohio State University, USA
b Michigan State University, USA
c Florida International University, USA

Article info

Article history:
Received 4 September 2012
Accepted 29 January 2013

Keywords: Data center; CFD; Sensor placement; Thermal monitoring; Overheating detection

Abstract

Server overheating has become a well-known issue in today's data centers that host a large number of high-density servers. The current practice of server overheating detection is to monitor the server inlet temperature with the temperature sensor on the server enclosure, or the CPU temperature with on-die thermal sensors. However, this is in contrast to the fact that different components in a server may have different overheating thresholds, which are closely related to their respective thermal failure rates and expected lifetimes. Moreover, the thermal correlation between the inlet (or CPU) and other server components can be different for every server model. As a result, relying on the single inlet or CPU temperature for server overheating detection is over-simplistic, which may lead to either degraded detection performance or false alarms that can result in excessive cooling power, leading to unnecessarily low inlet temperature.

In this paper, we propose a model-based approach that leverages thermal dynamics to intelligently choose sensor placement locations for precise overheating server component detection. We first formulate the detection problem as a constrained optimization problem. We then adopt Computational Fluid Dynamics (CFD) to establish the thermal model and analyze the thermal status of the server enclosure under various overheating conditions, such as inlet overheating, fan failures and CPU overloading. Based on the CFD analysis, we apply data fusion and advanced optimization techniques to find a near-optimal solution for sensor placement locations, such that the probability of detecting different overheating components is significantly improved. Our empirical results on a real rack server testbed demonstrate the detection performance of our solution. Extensive simulation results also show that the proposed solution outperforms other commonly used overheating monitoring solutions in terms of detection probability and error rate.

© 2013 Elsevier Inc. All rights reserved.

1. Introduction

In recent years, server overheating has become one of the most important concerns in large-scale data centers. Due to considerations such as real estate and integrated management, data centers continue to increase their computing capabilities by deploying high-density servers (e.g., blade servers). As a result, the increasingly high server and thus power densities can lead to some serious problems. First, the reduced server space may result in a greater probability of thermal failures for various components within the servers, such as processors, hard disks, and memories. Such failures may cause undesired server shutdowns and service disruption. Second, even though some components may not fail immediately, their lifetimes may be significantly reduced due to overheating. It is reported in [1–3] that the lifetime of an electronic device decreases exponentially with the increase of the operating temperature. Finally, the generated heat dissipation can also lead to negative environmental implications. Therefore, it is important for each server component to run at a temperature below its overheating threshold.

∗ Corresponding author. Tel.: +1 865 384 7365.
E-mail addresses: [email protected], [email protected] (X. Wang), [email protected] (X. Wang), [email protected] (G. Xing), lincx@fiu.edu (C.-X. Lin).

However, in today's data centers, how to precisely detect whether any component in a server is overheating remains an open question. The current practice of detecting and monitoring an overheating server can be divided into two categories. The first category is a coarse-grained approach that only uses the temperature at a proxy component, e.g., CPU [4], or at a fixed location, e.g., the server inlet, for server overheating monitoring. This is in contrast to the fact that different components in a server may have different overheating thresholds, which are closely related to their respective thermal failure rates and expected lifetimes. Relying on a single threshold at the server inlet or at the proxy component is therefore over-simplistic, because the thermal correlation between the inlet (or the proxy component) and each server component can be different for every server model. As a result, monitoring only the inlet temperature or a proxy component, such as the CPU, may lead to either missed detection of overheating for the components other than the CPU, resulting in a degraded system lifetime, or false alarms that result in excessive cooling power to unnecessarily lower the inlet temperature.

2210-5379/$ – see front matter © 2013 Elsevier Inc. All rights reserved.

The second category of server thermal monitoring approaches assumes that each component has its own built-in thermal sensor. Extensive research [5–8] on server thermal management has recently been conducted based on this assumption. Unfortunately, today's high-density servers are not equipped with a thermal sensor on every component. In most servers, only the processors have on-die sensors, while some memory chips may also have built-in sensors. Therefore, it is important to provide a mechanism for measuring the temperatures of other components (e.g., hard disk, network chips), such that the previously proposed thermal management schemes can work effectively. More importantly, even if every component has its own thermal sensor, those sensors are used only for the control loops of those components in an isolated way. As a result, they cannot provide a system-level thermal picture that can help the fan system of the server and the cooling systems in the data center to efficiently cool down overheating components. Furthermore, low-end sensors used in server components commonly have measurement noise and hardware biases that may lead to failed detection or false alarms. Recent studies [9,10] have shown that the collaborative data fusion of multiple sensors can significantly improve the detection accuracy. Therefore, it is preferable to have server-level thermal monitoring with multiple sensors that can precisely detect overheating components.

In this paper, we propose to leverage the thermal dynamics in a server to intelligently place sensors for precise detection of overheating server components. Our sensor placement solution features a model-based approach, which adopts Computational Fluid Dynamics (CFD) as a theoretical foundation to establish the thermal model and analyze the thermal status of the server enclosure under various overheating conditions. CFD is a powerful mechanical fluid dynamic analysis approach and is widely used to analyze the fluid dynamics in various engineering fields, such as aircraft engine design and thermal analysis for buildings. CFD has already been used by computer system packaging designers to make intelligent decisions on server component layout design, but not yet for sensor placement in the server box. While CFD-based thermal monitoring has shown promise, a key limitation of CFD is its high computation overhead. As a result, CFD cannot be effectively used to report thermal emergencies in real time. In this work, we propose to use CFD to analyze the thermal dynamics offline and then optimally place sensors based on the analysis results to conduct online overheating detection. Such an integrated approach can enable us to achieve the benefits of both the systematic modeling of thermal dynamics (from CFD), as well as online measurement calibration and fast responsiveness (from sensors). Our solution provides a way to equip external sensors on the existing servers deployed in data centers for more accurate overheating monitoring. The proposed solution can also be used on future servers to place more sensors on the motherboard during the design phase.

In our integrated thermal monitoring solution, we first use CFD to model the thermal environment of a given rack server box under different overheating conditions, including inlet overheating, fan failure and CPU overloading. We then calculate the most correlated regions in the server box for each specific component by correlation analysis. Accordingly, for a given number of sensors, we seek to place them in the server box such that the overheating components can be detected with the maximum detection probability, while the error rate of the detection can be bounded. We formulate this problem as a constrained optimization problem. Based on the CFD analysis, we design a heuristic algorithm to find a near-optimal sensor placement solution. In our algorithm, we apply data fusion techniques to allow sensors to make collaborative detection decisions of server component overheating. Specifically, the contributions of this paper are four-fold.

• While the current thermal monitoring solutions rely on simplistic sensor placement, i.e., a single sensor at the inlet or the CPU, we propose a novel sensor placement scheme to intelligently place sensors for maximized overheating detection probabilities of each server component of interest.

• We use CFD analysis as a theoretic foundation to design our proposed sensor placement scheme. Our CFD analysis models the thermal dynamics of a rack server box in various overheating scenarios, including inlet overheating, CPU overloading, and fan failure.

• We formulate optimal sensor placement as a constrained optimization problem and propose a heuristic algorithm to find a near-optimal solution. Temperature correlation analysis is conducted to find the most correlated regions for each server component.

• We evaluate our sensor placement scheme in a real-world rack server box. Both our empirical and simulation results demonstrate that our placement solution can significantly improve hot server detection performance.

The remainder of this paper is organized as follows. Section 2 highlights the distinction of our work by discussing related work. Section 3 presents the data fusion model, the formulation of the server overheating detection problem, as well as the temperature threshold setting for each component. Section 4 introduces the fundamentals of the Computational Fluid Dynamics approach and provides an example of how to model a rack server box. Section 5 elaborates on how to use the analytical results from CFD in our sensor placement problem and proposes a heuristic algorithm to solve the problem. In Section 6, we introduce our experiment methodology and then evaluate our sensor placement scheme using both simulation and experiments on a hardware testbed. In Section 7, we discuss an interesting variant problem formulation, as well as a potential application of our sensor placement scheme. Section 8 concludes the paper and discusses possible future work.

2. Related work

Thermal management for computer systems has been widely studied in the past. Skadron et al. have proposed a temperature-aware microprocessor management tool, HotSpot [11], which uses thermal resistances and capacitances to model the temperature of microprocessors. Performance and thermal behaviors of storage systems are extensively studied in [12], which identifies the knob for temperature optimization of high speed disks. Lin et al. [8] have proposed a software thermal management scheme for DRAM memory, which has been implemented on real machines. However, few studies have been done on the joint thermal monitoring and management across different system components. Jeohwang et al. have modeled the thermal profile for an operating server system and a rack in [13] to provide a bridge between the individual component thermal status and data center thermal profile. A joint energy, thermal and cooling management technique (JETC) is proposed in [14] to optimize the cooling and operating energy for both CPU and memory. Different from all the previous work that addresses a single component individually, our work focuses on the joint thermal monitoring of multiple components in a single rack server system.

Data center thermal management has also attracted a lot of research efforts. El-Sayed et al. [15] studied how to safely raise the operating temperature set point of data center cooling systems such that more cooling power can be saved. An automated, online, predictive thermal management scheme for data centers is also proposed in [16]. Workload scheduling according to the data center thermal profile has been studied in [17]. Another important aspect of data center thermal management, namely temperature and thermal prediction, has also been studied. A fast prediction framework for data center transient temperature is proposed in [18]. A prediction system that uses online thermal sensor readings to reach fast and accurate data center temperature prediction is proposed in [19]. Chen et al. [20] proposed to combine both online sensor readings and CFD analysis results for data center temperature prediction. Although our work presented in this paper focuses on the thermal monitoring issue for overheating server components, it is actually complementary to all the above-mentioned data center-level thermal management studies. One of the goals of a data center-level thermal management system is to run the data center cooling system more efficiently. A higher risk of overheating is often introduced by such a cooling management system. With our thermal monitoring system at the server component level, overheating issues can be more efficiently monitored and captured.

Sensors have been deployed to conduct thermal management in computer systems. The existing thermal management with sensors can be categorized into two classes. The first class is to deploy sensors in server rooms and large data centers for environment temperature monitoring. For example, a hybrid wired and wireless sensor network is used in [21] for data center thermal monitoring. Sensors are also used in [9] to detect the overheating servers at the single system level. The second class is to deploy sensors inside or around different computer components for component-specific thermal monitoring. For example, the current CPU thermal management schemes deploy on-die thermal sensors to monitor the CPU temperature at runtime [22]. Temperature sensor circuits have also been adopted in the DRAM design to provide thermal monitoring for memory chips [23]. The chip-level thermal profile is also studied in [24] by using runtime temperature sensor readings. Our work is different from all the aforementioned research. We use Computational Fluid Dynamics (CFD) and the temperature correlation of different components to guide sensor placement, such that the efficiency of the thermal emergency detection can be maximized.

Different sensor deployment approaches for improved monitoring and detection performance have also been studied before. A sensor placement scheme based on the Multivariate Gaussian Process model is proposed in [25]. Though it provides informative monitoring results, an offline training stage before the actual deployment is required. This is not feasible for thermal monitoring of production server systems because thermal emergencies should not be created for the collection of the training data. A fast sensor placement approach for fusion-based target detection is also proposed in [10] to minimize the number of deployed sensors while achieving assured detection performance. Different from the aforementioned work, we propose a new model-based sensor deployment approach, which leverages the theoretical computational results from CFD to maximize the detection performance of server component thermal emergency.

3. Overheating server component detection

In this section, we first introduce the detection model for overheating server components. We then formulate overheating server component detection as a constrained optimization problem. Lastly, we introduce how to set the overheating temperature threshold for each component.

3.1. Overheating component detection model

In the design of a computer system, it is always desirable to optimize the cooling efficiency of the system. However, due to the difference in functionalities and the variance in manufacturing processes, each component in the system usually requires a different safe operating environment temperature. Therefore, in order for the computer system to operate more efficiently and safely, the operating environment temperature of each component should be monitored separately based on its own requirements. Ideally, individual thermal monitoring and cooling mechanisms should be provided to each single component. For example, the current design of CPUs incorporates on-die thermal sensors, such that the temperature of the CPU chip can be monitored at runtime. Moreover, a heat sink is usually attached on top of the CPU chip to increase the air flow rate over the CPU, such that the cooling efficiency can be improved. Unfortunately, there is usually no such on-die sensor embedded in other components, such as memory chips and network chips. Therefore, new techniques are needed to monitor the operating environment of all the components, such that their overheating conditions can be detected and reported promptly. In this paper, we propose to place additional sensors into the computer system box to monitor the operating environment temperatures of all the components in the computer system.

With all the components and cooling equipment running, the thermal environment inside a computer box is complex, which could cause more noise in the sensor readings. Furthermore, the number of sensors that can be placed into a high-density server box is limited, as one wants to maximize the space utilization for all kinds of server components and avoid complex wiring and costly installation in the already compact server box. Thus, the additional sensor nodes added to the server box should collaborate with each other to maximize their utility. To address these challenges, we adopt data fusion [26], a widely adopted collaborative sensing technique, to jointly process noisy data from multiple sensors.

It is clear that temperatures at distant locations from a component are less likely to be correlated with the ambient temperature of that component. Therefore, we define a fusion region for each monitored component as a disc with a fusion radius $R$, where each monitored component is located at the center of that disc. The sensors within the fusion region of a monitored component should collaborate to make the overheating detection decision for that component. Moreover, because of the complex air flows inside the system, temperatures at different locations within the fusion region have different correlations with the ambient temperature of the monitored component. For example, based on the air flow direction, the temperatures at locations behind the CPU are more correlated with the CPU ambient temperature, compared with the temperatures at the locations in front of the CPU. Therefore, we further define a correlation threshold $Th(i,j)$ for each pair of location $i$ and component location $j$. To contribute to the ambient temperature monitoring for component $j$, sensors should be placed at a location $i$ within the fusion radius of component $j$, where the correlation value is larger than $Th(i,j)$.
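The two membership criteria above (inside the fusion disc, and correlation above the threshold) can be sketched as a simple filter. This is only an illustration under stated assumptions: the function name `fusion_group`, the tuple layout of `candidates`, and the use of one threshold per component instead of a per-pair $Th(i,j)$ are hypothetical simplifications, not part of the paper.

```python
import math


def fusion_group(component_xy, candidates, radius, corr_threshold):
    """Return the candidate locations that may contribute to one component.

    component_xy   -- (x, y) of the monitored component (centre of the disc)
    candidates     -- iterable of (x, y, corr), where corr is the correlation
                      between that location and the component (hypothetical
                      layout chosen for this sketch)
    radius         -- fusion radius R
    corr_threshold -- single threshold standing in for Th(i, j)
    """
    cx, cy = component_xy
    group = []
    for x, y, corr in candidates:
        inside_disc = math.hypot(x - cx, y - cy) <= radius
        if inside_disc and corr > corr_threshold:
            group.append((x, y))
    return group


# Hypothetical candidate locations around a component at the origin.
candidates = [(1.0, 0.0, 0.9), (5.0, 0.0, 0.99), (0.0, 1.0, 0.2)]
selected = fusion_group((0.0, 0.0), candidates, radius=2.0, corr_threshold=0.8)
```

Only the first candidate passes both tests here: the second is highly correlated but outside the disc, and the third is close but weakly correlated.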

To decide the ambient temperature at the monitored component location, we adopt a data fusion scheme which calculates the average temperature of all the reported temperatures from sensors that meet the above two criteria. We compare the average temperature value with a detection threshold $\eta_j$. If the average temperature is higher than the threshold, the decision of a component operating in an overheating environment is positive. The ambient temperature, $T_j$, of component $j$ can be derived from the temperature reading, $T_i$, at the location $(x_i, y_i)$ of sensor $i$. The approach we use to derive the temperature $T_j$ is explained in Section 5.2. For now, we just denote this derivation as $T_j = f_j(T_i)$. Measurement noise is usually included in the sensor readings. Denote the measurement noise strength measured by sensor $i$ as $N_i$, which follows the zero-mean normal distribution with a variance of $\sigma^2$, i.e., $N_i \sim \mathcal{N}(0, \sigma^2)$ [25]. We assume that all the temperature sensors are identical, such that they follow the same measurement distribution. The final reported temperature for the location of component $j$ can be presented as

$$T_j = f_j(T_i) + N_i^2 \tag{1}$$

where $N_i^2$ is the noise in energy form. The noise is taken out from the transformation since it is additive to the real temperature readings. Assuming there are $n_j$ sensors within the data fusion group of a component at location $j$, the detection probability of the overheating component $j$ in a specific overheating scenario can be calculated as

$$P_{Dj} = P\left(\frac{1}{n_j}\sum_{i=1}^{n_j}\left(f_j(T_i) + N_i^2\right) > \eta_j\right) \tag{2}$$

where $\eta_j$ is the detection threshold of overheating for the component at location $j$. Because of the measurement noise from the sensor device, $\eta_j$ includes both the real temperature threshold for a component, denoted as $C_j$, and the measurement noise. With a high noise level from the measurement, a detection system is likely to report a false alarm when there is no real event. In our case, we define the false alarm rate when the environment of the monitored component is actually not overheating as follows

$$P_{Fj} = P\left(\frac{1}{n_j}\sum_{i=1}^{n_j}\left(N_i^2 + C_j\right) > \eta_j\right) \tag{3}$$

We assume Gaussian noise, i.e., $N_i/\sigma \sim \mathcal{N}(0, 1)$. Therefore, $\sum_{i=1}^{n_j}(N_i/\sigma)^2$ follows the Chi-square distribution with $n_j$ degrees of freedom, whose cumulative distribution function is denoted as $\chi_{n_j}(\cdot)$. Hence, Eqs. (2) and (3) can be modified as follows:

$$P_{Dj} = 1 - \chi_{n_j}\left(\frac{n_j\eta_j - \sum_{i=1}^{n_j} f_j(T_i)}{\sigma^2}\right) \tag{4}$$

$$P_{Fj} = 1 - \chi_{n_j}\left(\frac{n_j(\eta_j - C_j)}{\sigma^2}\right) \tag{5}$$

3.2. Problem formulation

We assume that there are $M$ components in a computer server, whose operating ambient temperatures need to be monitored. Given $N$ sensors ($N \le M$), we need to find the placement of these $N$ sensors such that we can detect the overheating emergency at any of the $M$ locations with the highest possible confidence. We assume $N \le M$ because it is preferable to place as few sensors as possible in the server box for thermal monitoring purposes, considering the complexity and high cost of the wiring design on the motherboard. Our goal is to maximize the average detection probability of all the monitored locations

$$\max \; \frac{1}{M}\sum_{j=1}^{M} P_{Dj} \tag{6}$$

subject to the following constraint

$$P_{Fj} \le \alpha, \quad 1 \le j \le M \tag{7}$$

where $\alpha$ is the tolerable detection false alarm rate bound. We note that the false alarm rate needs to be bounded in many practical scenarios in order to reduce the waste of system resources. For a certain sensor placement, $P_{Fj} \le \alpha$ is a necessary condition in our problem. By Eq. (5), we convert the constraint in Eq. (7) to $\eta_j \ge \sigma^2\chi_{n_j}^{-1}(1-\alpha)/n_j + C_j$, a constraint for the detection threshold $\eta_j$ at monitored location $j$, where $\chi_{n_j}^{-1}(\cdot)$ is the inverse function of $\chi_{n_j}(\cdot)$. Using this equation, we can obtain the threshold that satisfies the false alarm rate bound while maximizing the detection probability. From Eq. (4) we know that $P_{Dj}$ decreases when $\eta_j$ increases. Therefore, to maximize the detection probability, we replace the inequality in the constraint with an equality and use only the lower bound. Hence, $\eta_j$ can be calculated as

$$\eta_j = \frac{\sigma^2\chi_{n_j}^{-1}(1-\alpha)}{n_j} + C_j \tag{8}$$
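To make the threshold computation concrete, the following is a minimal numerical sketch of Eqs. (4), (5) and (8), assuming the Chi-square cumulative distribution function is evaluated with a plain series expansion of the regularized incomplete gamma function and inverted by bisection. The function names and all numeric inputs are hypothetical; the series is only suitable for moderate arguments and is not the authors' implementation.

```python
import math


def chi2_cdf(x, k):
    """Chi-square CDF with k degrees of freedom (series expansion)."""
    if x <= 0.0:
        return 0.0
    a, z = k / 2.0, x / 2.0
    term = 1.0 / a
    total = term
    n = 1
    while term > 1e-16 * total:
        term *= z / (a + n)
        total += term
        n += 1
    return min(1.0, total * math.exp(a * math.log(z) - z - math.lgamma(a)))


def chi2_ppf(p, k):
    """Inverse Chi-square CDF for 0 < p < 1, found by bisection."""
    lo, hi = 0.0, 1.0
    while chi2_cdf(hi, k) < p:
        hi *= 2.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if chi2_cdf(mid, k) < p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def detection_threshold(alpha, n_j, sigma2, c_j):
    """Eq. (8): smallest threshold whose false alarm rate equals alpha."""
    return sigma2 * chi2_ppf(1.0 - alpha, n_j) / n_j + c_j


def false_alarm_rate(eta_j, n_j, sigma2, c_j):
    """Eq. (5)."""
    return 1.0 - chi2_cdf(n_j * (eta_j - c_j) / sigma2, n_j)


def detection_probability(eta_j, mapped_temps, sigma2):
    """Eq. (4); mapped_temps holds the values f_j(T_i) of the fusion group."""
    n_j = len(mapped_temps)
    return 1.0 - chi2_cdf((n_j * eta_j - sum(mapped_temps)) / sigma2, n_j)


# Hypothetical numbers: 3 sensors, unit noise variance, C_j = 60 degrees C.
eta = detection_threshold(alpha=0.05, n_j=3, sigma2=1.0, c_j=60.0)
p_d_hot = detection_probability(eta, [70.0, 70.0, 70.0], sigma2=1.0)
p_d_cool = detection_probability(eta, [50.0, 50.0, 50.0], sigma2=1.0)
```

Because Eq. (8) takes the constraint with equality, the false alarm rate evaluated at the returned threshold comes back as the chosen bound, which is a convenient self-check for the sketch.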

3.3. Component temperature threshold

Before solving the problem in Section 3.2, we need to set the overheating threshold for each component in the system. Among all the factors that contribute to the lifetime of semiconductor devices, operating junction temperature, i.e., the highest temperature inside the semiconductor device, is a critical deciding factor. With a higher junction temperature, devices tend to fail sooner. There has been research [1,11] studying the temperature-induced failure mechanisms of semiconductor devices. In most of the models studied, the operating junction temperature shows an exponential impact on the failure rate $\lambda$ of a device, which is:

$$\lambda \propto \exp\left(-\frac{E_a}{kT_J}\right) \tag{9}$$

where $k$ is Boltzmann's constant, $8.6 \times 10^{-5}$ eV/K. $E_a$ and $T_J$ are the activation energy of electromigration and the operating junction temperature (in kelvin), respectively. The common activation energy for Al and Al with silicon is 0.6 eV.

Hardware components from manufacturers often come with a warranty time. For example, both Intel and AMD sell their products with a three-year warranty package. Note that this warranty time indicates the time period that the device should work properly without hard intrinsic failures, even running under extreme conditions within the specification. However, as a common practice, computer systems usually serve for a longer period of time than three years with upgrades to some components, such as adding new disks for larger storage space. To extend the working time, we need to lower the operating ambient temperature threshold of each component. Given the extended lifetime requirement $t'$ and the lifetime requirement $t$ under warranty, we can use Eq. (9) to calculate the new operating junction temperature threshold $T_J'$ as

$$\frac{1}{T_J'} = \frac{k}{E_a}\ln\left(\frac{t'}{t}\right) + \frac{1}{T_J} \tag{10}$$

In this work, we use sensors to monitor the temperature of the operating environment, which is the ambient temperature of a working component. The ambient temperature $T_A$ can be calculated using the junction temperature $T_J$ in Eq. (9) as

$$T_A = T_J - P \times \theta_{JA} \tag{11}$$

where $P$ is the operating power of the device and $\theta_{JA}$ is the junction-to-ambient thermal resistance [27].
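As a worked illustration of Eqs. (10) and (11), the sketch below derates a junction temperature limit for a longer lifetime and converts it to an ambient threshold. The junction limit of 75 °C, the thermal resistance of 41.8 °C/W and the 7-year target are the CPLD figures quoted in this section; the 3-year baseline lifetime, the 0.6 eV activation energy applied to this chip, and the 0.02 W operating power are assumptions made only for this example.

```python
import math

BOLTZMANN_EV_PER_K = 8.617e-5  # Boltzmann's constant in eV/K


def derated_junction_temp_c(t_j_c, baseline_years, target_years, ea_ev=0.6):
    """Eq. (10): junction limit (deg C) that stretches lifetime to target_years.

    Temperatures are converted to kelvin because Eq. (9) is an Arrhenius law.
    """
    t_j_k = t_j_c + 273.15
    inv_new = (BOLTZMANN_EV_PER_K / ea_ev) * math.log(target_years / baseline_years)
    inv_new += 1.0 / t_j_k
    return 1.0 / inv_new - 273.15


def ambient_threshold_c(t_j_c, power_w, theta_ja_c_per_w):
    """Eq. (11): ambient temperature that keeps the junction at t_j_c."""
    return t_j_c - power_w * theta_ja_c_per_w


# 75 deg C and 41.8 deg C/W come from the text; 3 years and 0.02 W are assumed.
new_t_j = derated_junction_temp_c(75.0, baseline_years=3.0, target_years=7.0)
new_t_a = ambient_threshold_c(new_t_j, power_w=0.02, theta_ja_c_per_w=41.8)
```

Under these assumptions the derated junction limit lands close to 61 °C and the ambient threshold slightly below it, which is in line with the roughly 60 °C ambient threshold reported for this chip.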

Based on all the above derivations and related values from data sheets of different components, we set the operating environment temperature threshold $C_j$ for component $j$ in our work by one of the following three methods: (1) Directly taken from the data sheet. For some of the components in the computer system, the maximum operating environment temperature is listed in the data sheet or the manual. Fig. 1 is the platform used in our experiment. It is a 2U DELL rack server equipped with an AMD Opteron 2222 SE Dual-Core processor. The maximum operating temperature listed on the data sheet for this type of CPU is 69 °C. (2) Converted from the junction temperature threshold. For example, the maximum junction temperature and the junction-to-ambient thermal resistance for the Lattice ispMACH CPLD chip in our system are 75 °C and 41.8 °C/W, respectively. Applying Eqs. (10) and (11) with a lifetime requirement of 7 years, we can get the ambient threshold as 60 °C. (3) For the unknown type of chips or the chips whose data sheets are not available, we use 43 °C, the default System Board Ambient Temperature setting required by OpenManage, DELL's server management tool.

Fig. 1. The DELL PowerEdge 2950 2U rack server used in our hardware testbed. The yellow boxes are the chips whose operating environment temperatures need to be monitored. The red dashed box in the lower picture highlights the front panel assembly of the server. The red dashed box in the upper picture highlights the temperature sensor used by the DELL server to monitor the temperature at the inlet. Except for the CPU and memory, chips that need to be monitored for temperature are indexed and highlighted with yellow boxes. (For interpretation of the references to color in this figure caption, the reader is referred to the web version of the article.)

3.4. Overheating detection probability maximization for combined overheating scenarios

In Section 3.2, we have formulated the detection probability maximization problem under a specific overheating scenario such as inlet overheating or CPU overloading. However, in practice, there are usually no means to know what kind of overheating scenario is going to happen at a future time. Therefore, it is important to prepare the system for multiple possible overheating scenarios. One simplistic way to achieve this goal is to consider every possible overheating scenario one by one and deploy sensors for every scenario. Although this approach does not require any change to our previous detection model, it can result in a large number of sensors if the number of overheating scenarios is large. This kind of monitoring system is not desirable because the space in the server box to deploy additional sensors is limited. To mitigate this problem, we propose to maximize the average detection probability across multiple different overheating scenarios.

Assume we have $K$ possible overheating scenarios. To get the average detection probability across multiple overheating scenarios, the overheating detection probability model in Eq. (2) needs to be modified as:

$$P_{Dj} = \frac{1}{K}\sum_{k=1}^{K} P\left(\frac{1}{n_j}\sum_{i=1}^{n_j}\left(f_j^k(T_i) + N_i^2\right) > \eta_j\right) \tag{12}$$

where $f_j^k(\cdot)$ is the temperature mapping from sensor location $i$ to component location $j$ in the overheating scenario $k$.

Similar to Eq. (4), under the Gaussian noise assumption, we can transform Eq. (12) to the following equation:

$$P_{Dj} = 1 - \frac{1}{K}\sum_{k=1}^{K}\chi_{n_j}\left(\frac{n_j\eta_j - \sum_{i=1}^{n_j} f_j^k(T_i)}{\sigma^2}\right) \tag{13}$$

Based on the above overheating detection probability model for the overheating component detection under multiple overheating scenarios, we can formulate the probability maximization problem as:

$$\max \; \frac{1}{M}\sum_{j=1}^{M} P_{Dj} \tag{14}$$

subject to the same constraint as shown in Eq. (7). As shown in the experimental results presented in Section 6.5, this formulation leads to a smaller number of sensors to be placed in a server with the desired overheating detection probability.
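The scenario-averaged detection probability of Eq. (12) can also be checked empirically. The sketch below is a Monte Carlo estimate under the same Gaussian noise assumption, compared against the closed form of Eq. (13) for the special case of two sensors, where the Chi-square survival function reduces to $e^{-x/2}$. The function names, the seed and the scenario temperatures are hypothetical and serve only to illustrate the averaging over $K$ scenarios.

```python
import math
import random


def simulate_detection(scenarios, eta_j, sigma, trials=20000, seed=0):
    """Monte Carlo estimate of Eq. (12).

    scenarios -- list of K lists; each inner list holds the mapped
                 temperatures f_j^k(T_i) of the fusion group in scenario k
    """
    rng = random.Random(seed)
    total = 0.0
    for temps in scenarios:
        hits = 0
        for _ in range(trials):
            # Fused reading: average of mapped temperature plus squared noise.
            fused = sum(t + rng.gauss(0.0, sigma) ** 2 for t in temps) / len(temps)
            if fused > eta_j:
                hits += 1
        total += hits / trials
    return total / len(scenarios)


def closed_form_two_sensors(scenarios, eta_j, sigma):
    """Eq. (13) for n_j = 2, where 1 - chi2_cdf(x, 2) = exp(-x / 2)."""
    total = 0.0
    for temps in scenarios:
        x = (2.0 * eta_j - sum(temps)) / sigma ** 2
        total += 1.0 if x <= 0.0 else math.exp(-x / 2.0)
    return total / len(scenarios)


# Two hypothetical scenarios, each seen by the same two sensors.
scenarios = [[61.0, 61.0], [59.0, 59.0]]
estimate = simulate_detection(scenarios, eta_j=62.0, sigma=1.0)
exact = closed_form_two_sensors(scenarios, eta_j=62.0, sigma=1.0)
```

The simulated estimate and the closed form should agree up to sampling error, which shrinks as `trials` grows.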

4. CFD modeling for server box and components

In this section, we first introduce Computational Fluid Dynamics (CFD), the tool we use to analyze the thermal environment inside the server box. We then provide an example to demonstrate how to model a server box and each of its components in practice using Fluent [28], a widely used CFD modeling software package.

4.1. CFD modeling

CFD is a fluid mechanics approach that analyzes properties of fluid flows based on numerical methods and algorithms. CFD analysis gives great insight into the flow pattern and distribution of a targeted environment. Compared with the traditional experimental method of studying the flow pattern distribution, such as using flow sensors, CFD has significant advantages. First, CFD can reach a high resolution in the space and time domains, while the traditional method usually can only study a limited number of points and time instants. Second, CFD can be applied to virtually any problem using realistic operating condition setups, while experimental methodology can only work on limited conditions and environments. Third, the scale of CFD simulation can cover a wide range, while the traditional method usually only works on a laboratory-scale model.

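The key for CFD modeling is to solve the governing transport equations represented in the following conservation law form:

$$\frac{\partial(\rho\phi)}{\partial t} + \frac{\partial(\rho U_j\phi)}{\partial x_j} = \frac{\partial}{\partial x_j}\left(\Gamma_{\phi,\mathrm{eff}}\frac{\partial\phi}{\partial x_j}\right) + S_\phi \tag{15}$$

where $\phi$ represents different parameters such as mass, velocity, temperature or turbulence properties; $\rho$ is the fluid (air) density; $t$ is the time for transient simulations; $x_j$ is the coordinate variable for $x$, $y$ or $z$ with $j$ being 1, 2 or 3; $U_j$ is the velocity in different directions; $\Gamma_{\phi,\mathrm{eff}}$ is the diffusion coefficient; and $S_\phi$ is the source for the particular variable. For example, when $\phi$ is the air temperature, $S_\phi$ stands for the volumetric heat rate from a source component. The four equation terms represent the transient, convection, diffusion, and source parts of the transport phenomenon in the spatial domain [29].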


The partial differential equations listed in Eq. (15) represent a system, where all the transport equations are coupled together and must be solved simultaneously. For a complicated environment, such as a server enclosure, closed-form solutions are hard to find for the air flow and heat transfer of the entire system. Therefore, the most fundamental consideration in CFD is how to treat a continuous fluid in a discretized fashion, such that numerical methods can be applied to find the solutions. Most CFD software packages apply the control volume method to find numerical solutions.
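To give a feel for the control volume idea, the sketch below solves a drastically reduced form of Eq. (15): steady one-dimensional diffusion with a uniform source, i.e., only the diffusion and source terms, on a uniform cell-centred grid with fixed boundary temperatures. It is a toy illustration, not the solver used by Fluent or any other package; the function name, the constant diffusion coefficient and the tridiagonal (Thomas) solve are choices made for this example.

```python
def solve_steady_diffusion_1d(n_cells, length, gamma, source, t_left, t_right):
    """Finite-volume solution of d/dx(gamma * dT/dx) + source = 0.

    Each cell balances the diffusive flux through its two faces against the
    source integrated over the cell. Boundary faces sit half a cell away from
    the first and last cell centres, hence the factor 2 on those links.
    Returns the temperatures at the cell centres.
    """
    dx = length / n_cells
    g = gamma / dx
    a_w = [g] * n_cells          # link to the west neighbour
    a_e = [g] * n_cells          # link to the east neighbour
    b = [source * dx] * n_cells  # source integrated over the cell
    a_w[0] = 0.0
    a_e[-1] = 0.0
    a_p = [a_w[i] + a_e[i] for i in range(n_cells)]
    a_p[0] += 2.0 * g
    b[0] += 2.0 * g * t_left
    a_p[-1] += 2.0 * g
    b[-1] += 2.0 * g * t_right

    # Thomas algorithm: forward elimination, then back substitution.
    p = [0.0] * n_cells
    q = [0.0] * n_cells
    p_prev = q_prev = 0.0
    for i in range(n_cells):
        denom = a_p[i] - a_w[i] * p_prev
        p[i] = a_e[i] / denom
        q[i] = (b[i] + a_w[i] * q_prev) / denom
        p_prev, q_prev = p[i], q[i]
    temps = [0.0] * n_cells
    temps[-1] = q[-1]
    for i in range(n_cells - 2, -1, -1):
        temps[i] = p[i] * temps[i + 1] + q[i]
    return temps


# No source: the discrete solution follows the linear profile between walls.
linear = solve_steady_diffusion_1d(4, 1.0, 1.0, 0.0, 20.0, 40.0)
# Uniform heat source between two walls held at 20: the middle is hottest.
heated = solve_steady_diffusion_1d(5, 1.0, 1.0, 100.0, 20.0, 20.0)
```

A real server model adds convection, turbulence and three dimensions, which is why the full problem is handed to a dedicated CFD package.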

4.2. Example of server box CFD modeling

Using CFD to model a continuous fluid requires the discretization of the spatial domain into small cells. One method to perform this discretization is to generate a volumetric grid. After the discretization, necessary boundary conditions and suitable algorithms need to be applied to solve the above-mentioned transport equations. Several popular software packages, such as Fluent, FLOTHERM, Flovent and Phoenics, can be used for CFD modeling. In our project, we use Fluent, a widely used CFD software package from ANSYS Inc., to perform the geometry meshing and solution finding.

The CFD model we establish in this example is for the DELL PowerEdge 2950 server box, shown in Fig. 1. In the first step, we use Gambit, which is a grid generator, to establish the geometry of this server. Basically, we choose different geometric shapes and perform unification or split operations to establish the geometric model for the entire server based on the real measured scales. Then we add different geometric shapes into the server box geometry to model the server components, such as the system fan and CPU sink, according to their physical location and corresponding scale. After all components are added into the geometric model, we need to specify different boundary types, such as the server walls, the fans, and the inlets/outlets of the server box. The last step is to divide the entire geometric model into smaller scale cells by applying geometry meshing in Gambit. The grid size is a user-specified parameter. With a finer grid, more accurate CFD modeling can be reached. However, a fine grid increases the computational burden in the following stage when the transport equations are solved by numerical methods. We use 1 mm as the grid size to mesh the geometry. Although the CFD geometry model takes some time to generate because of the complicated component layout in the server box, we note that it is a one-time effort that can be reused for the analysis of all different overheating conditions for the same server, which is feasible for an offline sensor placement approach.

After meshing the entire server in Gambit, we export the grid to the second software package, Fluent, to solve the transport equations in Eq. (15). Fluent requires all the boundary conditions of our geometric model to be specified. For example, we need to specify the power dissipation of each heat dissipating component, such as the CPU, memory, disk and all the other system chips. We also need to specify the inlet temperature and the system fan speed. After all the parameters are set up, the standard k-epsilon two-equation turbulence model is chosen to simulate the turbulent flow. Each simulation of one running condition takes about 20 min to finish. Fig. 2 shows a colored cross-section temperature map after solving the transport equations in Fluent. This is a scenario in which all the components are running under the power setting specified on their data sheets.

5. CFD-guided sensor placement

In this section, we introduce how to use the results from the CFD analysis to guide sensor placement inside the server box, with the goal of maximizing the overheating detection probability for all the components. We then introduce a heuristic algorithm for solving this detection probability maximization problem.

Fig. 2. Colored temperature map (°C) of the DELL server running CPU intensive benchmarks. The small black boxes indicate all the chips whose temperatures need to be monitored. The large box in the middle is the CPU sink. The four vertical short lines in the middle represent the four system fans. The four horizontal thin blocks underneath the CPU sink represent the memory modules. The temperature of the memory closest to the CPU sink is also required to be monitored. The disk is on the left side of the graph.

5.1. Overview of our approach

Using CFD tools for our sensor placement in the server box primarily involves two steps. In the first step, we establish a geometric model for the server box in Gambit, mesh the geometry, and export the grid to Fluent. We then take measurements of the incoming air temperature and air flow rate at the inlet of the server. These measurements, along with the power consumption of each component and the fan speed, are the input parameters to Fluent. We repeat the first step, tuning the actuating parameter of each overheating scenario, to obtain multiple CFD analysis results. For example, in an overheating scenario caused by inlet overheating, we set the inlet temperature to several different values and run the CFD analysis for each. Based on the CFD results with different inlet temperatures, we obtain the temperature correlation between any spatial location, defined by the CFD grid, and each component location. We also use the CFD data to obtain an approximation function for each pair of spatial location and targeted component location, such that the temperature at the targeted location can be calculated from the temperature at any highly correlated spatial location.

In the second step, we feed the results from the CFD analysis, including the overheating scenario temperature data and the correlation data, to our optimization algorithm to find the best locations for sensor placement. We assume that our sensor placement needs to monitor the temperature of the point above the center of each component's top face. To solve the placement problem efficiently, we develop our algorithm based on the Constrained Simulated Annealing approach [30]. The algorithm is explained in detail in the following sections.

5.2. Component ambient temperature function and correlation

In Section 3.1, we denote the temperature of the component at location $j$ reported by sensor $i$ through the relationship $T_j = f_j(T_i)$. Because of the complex fluid dynamics and thermal distribution in the server box, the temperature at location $i$ can be very different from the temperature at location $j$, even if the physical distance between the two locations is short. Therefore, we need a function mapping from $T_i$ to $T_j$, such that the sensor at location $i$ can be used to report the component temperature $T_j$. We use the CFD analysis results from the last section to derive this mapping. We first repeat the CFD analysis with different parameter settings. For example, in the inlet overheating scenario, the inlet temperature is changed across different runs of the CFD analysis. Based on all the temperature data from the different CFD runs, we establish a second-order polynomial model to approximate the relationship between any temperature $T_i$ and the component temperature $T_j$ as:

$$T_j = a_{j,i} T_i^2 + b_{j,i} T_i + c_{j,i} \qquad (16)$$
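As a minimal sketch of how the coefficients in Eq. (16) could be obtained for one sensor/component pair, the Python snippet below fits the quadratic by ordinary least squares. The paper does not state which fitting method was used, so least squares, the function names `fit_quadratic` and `predict_component_temperature`, and any temperatures passed in are assumptions for illustration only.

```python
def fit_quadratic(t_i, t_j):
    """Least-squares fit of t_j ~ a*t_i**2 + b*t_i + c; returns (a, b, c).

    t_i: temperatures at a candidate sensor location, one per CFD run.
    t_j: temperatures at the component location in the same runs.
    """
    # Design-matrix rows [t^2, t, 1], one row per CFD run.
    rows = [[t * t, t, 1.0] for t in t_i]
    # Normal equations (X^T X) beta = X^T y, a 3x3 linear system.
    xtx = [[sum(r[p] * r[q] for r in rows) for q in range(3)] for p in range(3)]
    xty = [sum(r[p] * y for r, y in zip(rows, t_j)) for p in range(3)]
    # Augmented matrix solved by Gauss-Jordan elimination with partial pivoting.
    m = [xtx[p] + [xty[p]] for p in range(3)]
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(3):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * w for v, w in zip(m[r], m[col])]
    return tuple(m[p][3] / m[p][p] for p in range(3))


def predict_component_temperature(coeffs, t_i):
    """Evaluate Eq. (16) for a single sensor reading t_i."""
    a, b, c = coeffs
    return a * t_i ** 2 + b * t_i + c
```

At least three CFD runs with distinct sensor temperatures are needed for the fit to be well defined; with more runs the fit averages out run-to-run deviations.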

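We have also introduced in Section 3.1 that our sensor placement scheme only places sensors at locations that have high temperature correlations with the monitored targets. Therefore, we use the same set of CFD data as in the above function approximation to calculate the spatial correlation between the temperature $T_i$ and the component temperature $T_j$. Pearson's correlation is a widely adopted metric [31] that measures the degree of association between two variables. Assuming that we have $n$ sets of CFD data with different inlet temperature settings, we can calculate Pearson's correlation $r(T_i, T_j)$ by

$$r(T_i, T_j) = \frac{\sum_{k=1}^{n} (T_i^k - \bar{T}_i)(T_j^k - \bar{T}_j)}{\sqrt{\sum_{k=1}^{n} (T_i^k - \bar{T}_i)^2}\,\sqrt{\sum_{k=1}^{n} (T_j^k - \bar{T}_j)^2}} \qquad (17)$$

where $T_i^k$ and $T_j^k$ are the temperatures in the $k$-th CFD run, and $\bar{T}_i$ and $\bar{T}_j$ are their means over the $n$ runs.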
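The correlation in Eq. (17) can be computed directly from the per-run temperatures. The following Python sketch does so and then keeps only the grid locations whose correlation with a component reaches a cutoff. The cutoff value, the dictionary layout of `grid_temps`, and the function names are assumptions for illustration; the paper does not specify the threshold it uses.

```python
import math


def pearson_r(t_i, t_j):
    """Pearson's correlation between two equally long temperature series."""
    n = len(t_i)
    mean_i = sum(t_i) / n
    mean_j = sum(t_j) / n
    cov = sum((a - mean_i) * (b - mean_j) for a, b in zip(t_i, t_j))
    var_i = sum((a - mean_i) ** 2 for a in t_i)
    var_j = sum((b - mean_j) ** 2 for b in t_j)
    if var_i == 0 or var_j == 0:
        # A constant series carries no correlation information.
        return 0.0
    return cov / math.sqrt(var_i * var_j)


def highly_correlated_locations(grid_temps, component_temps, threshold):
    """Return grid locations whose correlation with the component is >= threshold.

    grid_temps maps a location (e.g. an (x, y) tuple) to its temperatures
    across the CFD runs; component_temps holds the component's temperatures
    in the same runs.
    """
    return [loc for loc, temps in grid_temps.items()
            if pearson_r(temps, component_temps) >= threshold]
```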

The polynomial function approximations and correlation values are all inputs to the algorithm in the next section.

5.3. Sensor placement algorithm

Procedure 1. CFD-guided sensor placement (D)

Input: Sensor number N, component location lists x[K] and y[K], CFDdata, correlation data rdata, overheating threshold list C[K]
Output: Placement solution D

1. for j = 1 to K do
2.     x[j]min = xj − R; x[j]max = xj + R
3.     y[j]min = yj − R; y[j]max = yj + R
4. end for
5. xmin = min(x[K]min); xmax = max(x[K]max);
6. ymin = min(y[K]min); ymax = max(y[K]max);
7. (P, D)
8.     = CSA(N, xmin, xmax, ymin, ymax, C[K], CFDdata, rdata)
9. return D

Our goal is to find the optimal sensor placement locations in the server box to maximize the average overheating detection probability for all the monitored component locations. We propose to use a nonlinear programming solver based on the Constrained Simulated Annealing (CSA) algorithm [30]. CSA is an extension of the conventional Simulated Annealing algorithm for solving global constrained optimization problems with discrete variables. Theoretically, CSA can reach a global optimal solution by converging asymptotically to a constrained global optimum with a probability of 1. However, a limitation of CSA is that its computational complexity grows exponentially with respect to the number of variables and the solution search space [30,10]. Therefore, before we apply CSA, we first reduce the search space of the algorithm by calculating the plausible search space according to the component locations. In our sensor placement problem, we propose to utilize sensors that are within the fusion range of a component location to collaboratively decide if the operating environment temperature of that component is overheating. Therefore, a sensor location is plausible for a component only if it lies inside the fusion range $R$ of that component. We aggregate the plausible search spaces of all components by finding the maximum and minimum possible $x$ and $y$ values of a sensor. The aggregated region is then used as the search space for the sensor placement algorithm. The pseudocode of this algorithm is listed in Procedure 1.

Fig. 3. Comparison at multiple locations in the server between temperature measurements on the testbed and CFD simulation results. The testbed runs the same CPU intensive workload as in Fig. 2.

Lines 1–6 calculate the plausible solution search region. Based on the CFD and correlation analysis, i.e., CFDdata and rdata, lines 7–8 use the CSA solver to find the placement solution D that maximizes the detection probability P. The algorithm outputs the placement solution D.
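The search-space reduction in lines 1–6 amounts to a bounding box around all components, padded by the fusion range. A minimal Python sketch, following the prose description (the extreme $x$ and $y$ values a sensor can take while staying within $R$ of some component), is given below. The function name `search_region` and the coordinates in the usage note are hypothetical, and the CSA solver itself is not reproduced here.

```python
def search_region(xs, ys, fusion_range):
    """Bounding box of all per-component plausible regions.

    xs, ys: component coordinates (same length, one entry per component).
    fusion_range: the fusion range R, in the same unit as the coordinates.
    Returns (x_min, x_max, y_min, y_max).
    """
    # Per-component square region [x - R, x + R] x [y - R, y + R],
    # aggregated by taking the extreme values over all components.
    x_min = min(x - fusion_range for x in xs)
    x_max = max(x + fusion_range for x in xs)
    y_min = min(y - fusion_range for y in ys)
    y_max = max(y + fusion_range for y in ys)
    return x_min, x_max, y_min, y_max
```

For components at (10, 5), (40, 30), and (25, 12) with a fusion range of 8, the box spans x from 2 to 48 and y from −3 to 38. In practice the box would presumably also be clipped to the physical server enclosure, which this sketch does not do.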

6. Evaluation

In this section, we first validate our CFD model by comparing the CFD analysis results with real sensor measurements. We then introduce the experiment setup and the methodology used for the performance evaluation on our hardware testbed. After that, the overheating component detection performance is evaluated in both simulation and hardware testbed experiments in three individual overheating scenarios (inlet overheating, fan failure, and CPU overloading) and in a combined overheating scenario built from these three.

6.1. Model validation and experiment methodology

To validate our server model in the CFD analysis, we place 19 sensors into the server box. The server is placed in an isolated server room with a dedicated air conditioning system. We measure the temperature under a normal server running condition, in which the server is running the SPEC CPU2006 benchmarks at an average inlet temperature of 19.6 °C, with a 0.5 °C fluctuation because of the air conditioning actuation. The measurements are taken when the server is in a stable thermal state with the sensors placed in the closed enclosure. The sensors we use for the real temperature measurement are TelosB sensor motes [32]. We choose this type of sensor because we can collect the temperature readings wirelessly without opening the server enclosure. We note that our approach does not depend on a particular sensor type and can utilize either wired or wireless communication (though wireless sensors can be less intrusive to the already complicated server environment).

Fig. 3 shows the comparison between the CFD analysis temperature results and the testbed measurements. We can see that the temperature difference between the CFD analysis and the real measurements is about 6.3% on average, which shows that our computational CFD result is sufficiently close to the real temperature measurements. If a different type of sensor that is smaller in size is used, the difference can be further reduced.
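The paper does not spell out how the average difference is computed. One plausible reading, sketched below in Python, is the mean of the per-sensor absolute differences relative to the measured values. Both this definition and the function name `mean_relative_difference` are assumptions, not the authors' stated method.

```python
def mean_relative_difference(simulated, measured):
    """Mean of |simulated - measured| / |measured| over all sensors, in percent.

    simulated: CFD temperatures at the sensor locations.
    measured: testbed readings at the same locations (must be non-zero).
    """
    ratios = [abs(s - m) / abs(m) for s, m in zip(simulated, measured)]
    return 100.0 * sum(ratios) / len(ratios)
```

With hypothetical CFD values of 21, 33, and 44 °C against measured values of 20, 30, and 40 °C, the per-sensor differences are 5%, 10%, and 10%, so the average is about 8.3%.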

Fig. 4. Server temperature map of a partial inlet overheating scenario. The red dashed boxes are the chips whose environment temperatures exceed their individual overheating thresholds. Triangles indicate the sensors placed by the CFD-guided approach when the given sensor number is four. The black crosses indicate the four sensors placed by the baseline Chip Best approach. (For interpretation of the references to color in this figure caption, the reader is referred to the web version of the article.)

We evaluate five sensor placement strategies across all the experiments. CFD-guided sensor placement is the approach we propose in this work, which places sensors based on the analytical results from CFD analysis. Chip Best is the placement resulting from a best-effort approach. To get this best performance, we first place sensors at all the exact chip locations in the overheating experiment, one for each chip, to collect the temperature data. Then, for a given number of $N$ sensors (less than the number of chips, $M$), we find the combination of $N$ locations that results in the best detection performance among all possible combinations. Note that it is infeasible to use Chip Best in a real implementation, because it needs to test all the different combinations of sensor/chip pairing and select the best one. Different from Chip Best, Chip Average calculates the average detection performance of all the possible combinations. Random is a simple heuristic strategy that places sensors randomly in the server box; its results are averaged over 10 runs of random placement. Uniform Grid divides the server box into uniform-sized grid cells and places one sensor in each cell randomly.
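To make the Chip Best and Chip Average baselines concrete, the Python sketch below enumerates every combination of $N$ chip locations and reports the best and the mean score. The scoring model is a deliberate simplification: each chip-mounted sensor is assumed to detect a fixed, known set of overheating chips (the hypothetical `coverage` map), whereas the paper's detection decision is based on data fusion.

```python
from itertools import combinations


def detection_probability(sensor_chips, coverage, overheating):
    """Fraction of overheating chips detected by sensors at sensor_chips."""
    covered = set()
    for chip in sensor_chips:
        covered |= coverage[chip]
    return len(covered & overheating) / len(overheating)


def chip_best_and_average(chips, n_sensors, coverage, overheating):
    """Exhaustively score every N-subset of chip locations.

    Returns (best score, average score), i.e. the Chip Best and
    Chip Average results under the simplified coverage model.
    """
    scores = [detection_probability(combo, coverage, overheating)
              for combo in combinations(chips, n_sensors)]
    return max(scores), sum(scores) / len(scores)
```

With 11 monitored chips and 4 sensors, this enumeration already covers 330 combinations, each of which would require its own physical test on real hardware.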

In all of our experiments, we evaluate the average detection probability and the error rate of the different placement approaches. The average detection probability is defined as the number of overheating chips that are detected divided by the total number of overheating chips. The error rate covers both false alarms and missed detections. For all of our testbed results, we run each overheating experiment 10 times and calculate the average value of each performance metric. No averaging is needed in simulation, since the CFD temperature results do not vary when the experiment settings remain the same.
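The two metrics can be sketched in Python as follows. The detection probability follows the definition above. The error rate is assumed here to be the number of false alarms plus missed detections divided by the number of monitored chips, since the paper does not give the exact denominator. The function name and chip labels are hypothetical.

```python
def detection_metrics(overheating, reported, monitored):
    """Return (detection probability, error rate) for one experiment.

    overheating: set of chips that truly overheat.
    reported: set of chips the sensors flag as overheating.
    monitored: set of all monitored chips.
    """
    detected = overheating & reported
    false_alarms = reported - overheating
    missed = overheating - reported
    # Detection probability: detected overheating chips / all overheating chips.
    probability = len(detected) / len(overheating) if overheating else 1.0
    # Assumed error rate: (false alarms + misses) / monitored chips.
    error_rate = (len(false_alarms) + len(missed)) / len(monitored)
    return probability, error_rate
```

For example, with 11 monitored chips of which 9 overheat, a placement that flags 7 of the overheating chips plus 1 healthy chip has a detection probability of 7/9 and, under the assumed definition, an error rate of 3/11.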

6.2. Inlet overheating detection

In this subsection, we evaluate the detection performance under a partial inlet overheating condition. Partial inlet overheating is often hard to capture with the single inlet temperature sensor on the front panel assembly in Fig. 1. Ideally, one could adjust the air conditioning system in the room (e.g., reducing its blowing range) to emulate inlet overheating caused by cooling systems. However, due to limited access to the air conditioning system in the room, we use a hair dryer to blow warm air into the server at the lower left corner of the front inlet to emulate partial inlet overheating in our testbed experiment. To calculate the spatial temperature correlation and the target temperature function, CFD analysis is conducted in different scenarios with different inlet overheating temperatures. As a result, the sensor placement solution computed by our algorithm can handle the dynamics in different inlet overheating scenarios, even though we only test a subset of those scenarios. Fig. 4 shows the temperature distribution of the server box under the highest partial inlet overheating temperature. We can see that 9 chips (red dashed frames in the figure) out of the total 11 monitored chips are overheating in this scenario.

Fig. 5. Average detection probability of the proposed CFD-guided solution and the baselines in the inlet overheating case (simulation).

Fig. 5 shows the average detection probability in the partial inlet overheating scenario. We see that the CFD-guided approach has the highest overheating detection probability. Compared with Chip Best, CFD shows a maximum performance advantage of about 22% when the sensor number is 2. This is mainly because when a sensor is placed at the exact location of one chip by Chip Best, it cannot always provide temperature monitoring for other chips, as chips are usually not placed close to each other. Although Chip Best may show acceptable overheating component detection performance when the number of sensors is large, this performance is actually hard to achieve without testing all the combinations of sensor locations for the given number of sensors. Without exhaustively testing all the combinations, one can choose chip locations randomly, leading to the detection performance of the Chip Average scheme. We see that the CFD-guided placement outperforms Chip Average at all sensor numbers in the experiment, with a highest performance gain of 45% when the sensor number is 2. The other two baselines, Random and Uniform Grid, show significantly worse performance than CFD-guided, Chip Best, and Chip Average since they are only heuristic approaches. To illustrate the difference between CFD-guided and Chip Best, a placement example with 4 sensors is given in Fig. 4. We see that the CFD placement does not place sensors on any of the chips. Instead, it places sensors in between chips, such that each sensor can cover more chips, thus leading to better detection results. Fig. 6 shows the average error rate in this scenario. We see that CFD-guided placement shows significantly lower error rates than the other two chip-location placement schemes. This demonstrates that, with the analytical results from CFD analysis, the placement can cover more targets, which leads to fewer missed detections.

Figs. 7 and 8 show the detection probability and detection error rate on the hardware testbed. We extract the sensor placement locations from the simulations and place all the sensors into the server box accordingly. Because of the limited space, we place at most five sensors per scheme into the server box. Since we evaluate three different sensor placement schemes, the maximum number of sensors placed in the server at the same time is 15. From the results we see that the detection probability and detection error performance on the hardware testbed match the simulation results well. Among the three schemes, CFD-guided shows the best detection performance and Chip Average has the worst.

Fig. 6. Average detection error rate of the proposed CFD-guided solution and the baselines in the inlet overheating case (simulation).

Fig. 7. Average detection probability of the proposed CFD-guided solution and the baselines in the inlet overheating case (testbed).

6.3. Fan failure detection

In this experiment, we conduct both simulation and hardware testbed experiments on a fan failure scenario. To ensure the safe operation of the system, we disable only a single fan in the system. To calculate the spatial temperature correlation and the target temperature function, several runs of CFD analysis with different fan speeds are conducted. Similar to the inlet overheating scenario discussed before, our sensor placement solution can handle the dynamics in different fan failure scenarios, because the CFD analysis is conducted with different fan speeds. Fig. 9 shows the colored temperature map of the server with a single fan disabled. The missing line at one of the fan positions represents the failed fan. We see that 4 chips (marked with red frames) out of the total 11 monitored chips are operating in the overheating environment.

The average overheating detection probability from simulation is shown in Fig. 10. We see that the CFD placement approach requires only two sensors to reach 100% overheating component detection for all four overheating locations, while Chip Best requires three sensors. The placements with two sensors by these two approaches are marked in Fig. 9. We see that the CFD placement tries to cover all the overheating chips in the right corner by putting only one sensor in the middle of the chips. Compared with Chip Average, CFD shows significantly better performance, with a 60% higher detection probability. As expected, the Uniform Grid and Random schemes perform much worse than the other placement schemes. Fig. 11 shows the average error rate of the fan failure scenario in simulations. We see that despite some random errors, CFD outperforms the other two baseline approaches. Chip Average shows the worst performance among the three approaches.

Figs. 12 and 13 show the detection probability and detection error rate on the hardware testbed based on the sensor placement locations extracted from the simulation. From Fig. 12 we see that CFD has similar performance to Chip Best, and both of them still outperform the Chip Average scheme significantly. Fig. 13 shows the average error rate in this fan failure case. We see that CFD performs just a little worse than Chip Best, but still performs much better than Chip Average. The degraded performance in this fan failure scenario is most likely caused by the model inaccuracy of the CFD analysis. Disabling a fan makes the thermal fluid dynamics more complex than in the other scenarios, leading to an increase in the modeling error. Please note again that Chip Best is actually not feasible in a real implementation, because it needs to test all the different combinations of sensor/chip pairing and select the best one.

Fig. 8. Average detection error rate of the proposed CFD-guided solution and the baselines in the inlet overheating case (testbed).

Fig. 9. Server temperature map in a scenario with single fan failure. The red dashed frames are the chips whose environment temperatures exceed their individual operating temperature thresholds. The black solid triangles indicate the sensors placed by the proposed CFD-guided approach when the given sensor number is two. The black crosses indicate the two sensors placed by the baseline Chip Best approach. (For interpretation of the references to color in this figure caption, the reader is referred to the web version of the article.)

Fig. 10. Average detection probability of the proposed CFD-guided solution and the baselines in the scenario with single fan failure (simulation).

Fig. 11. Average error rate of the proposed CFD-guided solution and the baselines in the scenario with single fan failure (simulation).

Fig. 12. Average detection probability of the proposed CFD-guided solution and the baselines in the scenario with single fan failure (testbed).

6.4. CPU overloading detection

In this section, we present the simulation results for the overheating scenario induced by CPU overloading. With the widely adopted DVFS technique, CPU power is well known to be a cubic function of CPU frequency [33]. By overclocking the CPU frequency to 1.5× the maximum value listed on the datasheet, a 3× overloaded power consumption can easily be reached. Unfortunately, the platform we use in our hardware experiment does not support CPU overclocking. Therefore, we only show the simulation results in this section for the detection performance under CPU 3× overloading. To calculate the spatial temperature correlation and the target temperature function, several runs of CFD analysis with different CPU power settings are conducted. Note again that our sensor placement solution is designed to handle the dynamics in different CPU overloading scenarios.
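As a rough check of the 3× figure, assume the usual DVFS approximation that dynamic power scales as $P \propto V^2 f$ with the supply voltage $V$ scaled in proportion to the frequency $f$. This assumption is stated here only for illustration; the paper itself cites the cubic relation from [33] without deriving it. The power ratio at 1.5× the listed frequency is then:

```latex
\frac{P_{\mathrm{overclocked}}}{P_{\mathrm{listed}}}
  = \left(\frac{f_{\mathrm{overclocked}}}{f_{\mathrm{listed}}}\right)^{3}
  = 1.5^{3}
  = 3.375
```

This is slightly above the 3× overload used in the simulation, which is consistent with the statement that 3× can easily be reached.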

Fig. 14 shows the colored temperature map for the CPU overloading 3× power scenario. Although the color pattern is quite similar to the result in Fig. 2, i.e., a normal run with the benchmark workload, it shows significantly higher temperatures than the normal run. The highest temperature reaches about 120 °C. Six of the 11 monitored chips are found to be working under overheating conditions. The placement results with three sensors are illustrated in Fig. 14 for both CFD-guided placement and Chip Best. We see that the CFD placement places sensors in the middle of the cluster of overheating chips such that more chips can be covered by the limited number of sensors.

Fig. 15 shows the average detection probability of this CPU overloading scenario. We can see that the CFD placement consistently shows the best detection probability and outperforms both Chip Best and Chip Average. With a sensor number of 2, the detection probability of CFD is twice that of Chip Average. The average error rate of the component overheating detection with CPU overloading is shown in Fig. 16. We see that the CFD placement outperforms both Chip Best and Chip Average for all the different numbers of sensors.

Fig. 13. Average error rate of the proposed CFD-guided solution and the baselines in the scenario with single fan failure (testbed).

Fig. 14. Server temperature map in the scenario of CPU overloading at 3× the power consumption listed on the datasheet. The red dashed boxes are the chips whose environment temperatures exceed their individual operating temperature thresholds. The black solid triangles indicate the sensors placed by the proposed CFD-guided approach when the given sensor number is two. The black solid crosses indicate the two sensors placed by the baseline Chip Best approach.

6.5. Detection performance in combined overheating scenarios

We have evaluated our sensor placement scheme in three different individual overheating scenarios: partial inlet overheating, fan failure, and CPU overloading. As the type of overheating condition is usually unknown before it actually occurs, we need to prepare the system to monitor any of the possible overheating conditions. In this section, we evaluate the overheating detection performance of different sensor placement schemes in a combined overheating scenario. The combined overheating scenario consists of the previous three individual overheating scenarios. We prepare the system by deploying sensors to monitor overheating components in any of the above three overheating conditions based on the formulation in Section 3.4. Specifically, we use the CFD analyses from all three previous overheating scenarios as input to our sensor placement algorithm, which aims to maximize the average overheating detection probability across all three overheating scenarios. We conduct the evaluation first in simulation and then on our testbed.

Figs. 17 and 18 are the simulation results that show the average detection probability and average error rate, respectively, for this combined scenario. We see that CFD has almost the same detection performance as the Chip Best approach. Compared with its detection performance in each of the individual overheating scenarios (as shown in previous sections), CFD performs slightly worse in the combined scenario. This is mainly because the optimization algorithm needs to consider all the overheating scenarios at the same time and make trade-offs between different scenarios. However, as discussed before, Chip Best needs to test all the different combinations of sensor/chip pairing and select the best one, which is actually infeasible in a real implementation. Compared with Chip Average, the CFD-guided approach still performs significantly better on both the detection probability and the detection error rate.

We then test the different sensor placement strategies on our testbed. The overheating detection probability and detection error rate of the hardware experiment are shown in Figs. 19 and 20, respectively. As explained before, since we are unable to overclock the CPU to create the CPU overloading event, a single round of each experiment consists of two overheating scenarios: partial inlet overheating and fan failure. From the results we see that the CFD placement has slightly better performance than Chip Best. Both of them consistently perform better than the Chip Average placement. The hardware results differ slightly from the simulation results because of the deviation in the CFD modeling process.

[Fig. 15: plot of average detection probability (%) versus sensor number for CFD, Chip Best, Chip Average, Random, and Uniform Grid in the CPU overloading scenario.]

Fig. 16. Average error rate in the scenario of CPU overloading 3× power.

Fig. 17. Average detection probability in the combined overheating scenarios (simulation).

Fig. 18. Average error rate in the combined overheating scenarios (simulation).

Fig. 19. Average detection probability in the combined overheating scenarios (testbed).

7. Discussion

In this section, we first discuss a closely related problem, the sensor number minimization problem. We then discuss possible future work based on the overheating component monitoring system using our sensor placement scheme.

7.1. Sensor number minimization

The main design goal of this paper is to optimize the deployment locations of a given number of sensors to reach a maximized average overheating detection probability for all the major components within the server box. While the probability maximization is important for overheating detection, sometimes it is also interesting to know the minimum number of sensors required to reach a targeted overheating detection probability, especially in our server box component overheating detection application. This is because, with all the existing components and wires, the space within the server box is usually very compact, and thus the available space that can be used to deploy additional sensors is usually limited.

The framework proposed in this work for detection probability maximization can be easily modified to serve the sensor number minimization purpose. To formulate the sensor number minimization problem, we can add an additional constraint on the targeted detection probability. More specifically, the formulation is:

$$\arg\min_{(x_i, y_i),\ \forall i} N \qquad (18)$$

subject to the following constraints

$$P_{F_j}(S_N) \le \alpha, \quad 1 \le j \le M \qquad (19)$$

$$P_{D_j}(S_N) \ge \beta, \quad 1 \le j \le M \qquad (20)$$

where $S_N$ is the list of locations of all the $N$ sensors. To solve this problem, we can use the same algorithm proposed in Section 5.3. Basically, we need to find the smallest number of sensors that can provide the required detection probability from the constraint in Eq. (20) and also meet the false alarm rate constraint in Eq. (19). Since this problem is essentially a variant of the proposed detection probability maximization problem and can be solved with a similar algorithm, we do not repeat the experimental results in this paper.
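One simple way to reuse a placement solver for Eqs. (18)–(20) is to increase the sensor count until both constraints hold for every component. The Python sketch below illustrates that idea. The callback `solve_placement` is a hypothetical stand-in for the Section 5.3 algorithm, assumed to return the placement together with the per-component false alarm and detection probabilities. The linear search itself is an assumption, not a procedure given in the paper.

```python
def minimum_sensor_count(max_sensors, solve_placement, alpha, beta):
    """Smallest N whose best placement satisfies Eqs. (19) and (20).

    solve_placement(n) -> (placement, false_alarm_probs, detection_probs),
    where the two probability lists hold one value per monitored component.
    Returns (n, placement), or None if no N up to max_sensors is feasible.
    """
    for n in range(1, max_sensors + 1):
        placement, false_alarm_probs, detection_probs = solve_placement(n)
        meets_false_alarm = all(p <= alpha for p in false_alarm_probs)
        meets_detection = all(p >= beta for p in detection_probs)
        if meets_false_alarm and meets_detection:
            return n, placement
    return None
```

If the achievable detection probability is non-decreasing in the number of sensors, the linear scan could be replaced by a binary search to save solver runs.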

7.2. Other potential applications

We have shown that our proposed sensor placement scheme can be used to deploy sensors that monitor and detect overheating server components under an unknown overheating scenario using combined overheating scenario monitoring. We now discuss how to integrate server-level thermal monitoring into another potential application, overheating root cause diagnosis. Although it is important to capture the overheating components, it is often more desirable to further determine the reason for the overheating. In other words, it is often more desirable to diagnose the root cause of the overheating phenomenon, such that actions, such as increasing the fan speed or lowering the inlet temperature, can be taken to correct the abnormal overheating behavior of the equipment. To accomplish this goal, in addition to the temperature sensors, we can further deploy other types of sensors, such as flow and acoustic sensors, using the same sensor placement framework proposed in this work. With the additional types of sensors, we can further characterize the working behavior of each cooling-related component and condition, such as the server fans, the inlet flow speed, and the flow passage across the server. By characterizing and monitoring the working conditions of these components, we can determine whether they are working properly to provide the desired cooling capabilities. We plan to integrate sensor placement with overheating diagnosis in our future work.

8. Conclusions

Efficient thermal monitoring is critical for today's server systems to ensure safe operation and continuous service. It is also important for each server component to maintain a desirable service lifetime. However, the current practice of server thermal monitoring simply relies on either sensors placed at the server inlet or on-die thermal sensors available only on some components, such as the CPU, memory, or both, which may lead to degraded overheating detection performance for certain components. In this paper, we have presented a novel solution that places additional sensors into the server box for overheating server component detection based on the CFD analysis of the thermal and fluid dynamics inside the server box. Our sensor placement scheme applies the Constrained Simulated Annealing algorithm with a reduced search space to find a sensor placement with maximized overheating component detection probability. Our solution also adopts data fusion techniques to collaboratively make the overheating detection decision, resulting in improved detection performance. We evaluate our CFD-based sensor placement strategy with a real-world 2U rack server in different component overheating scenarios. Our results show that the proposed placement strategy achieves significantly better overheating detection performance than several well-designed baselines. Extensive simulation results also demonstrate the effectiveness of our CFD-guided sensor placement scheme.

Acknowledgements

This work was supported, in part, by the US National Science Foundation under grants CCF-1143605, CNS-1218154, CNS-1143607 (CAREER Award), and CNS-0954039 (CAREER Award), and by the US Office of Naval Research under grant N00014-11-1-0898 (Young Investigator Program).

References

[1] J. Srinivasan, S. Adve, P. Bose, J. Rivers, Lifetime reliability: toward an architectural solution, IEEE Micro 25 (3) (2005) 70–80.

[2] J. Srinivasan, S. Adve, P. Bose, J. Rivers, The case for lifetime reliability-aware microprocessors, in: ISCA, 2004.

[3] F.J. Mesa-Martinez, E.K. Ardestani, J. Renau, Characterizing processor thermal behavior, in: ASPLOS, 2010.

[4] N. Tolia, Z. Wang, P. Ranganathan, C. Bash, M. Marwah, X. Zhu, Unified thermal and power management in server enclosures, in: ASME, 2009.

[5] J. Donald, M. Martonosi, Techniques for multicore thermal management: classification and new exploration, in: ISCA, 2006.

[6] R.Z. Ayoub, K.R. Indukuri, T.S. Rosing, Energy efficient proactive thermal management in memory subsystem, in: ISLPED, 2010.

[7] S. Gurumurthi, A. Sivasubramaniam, Thermal issues in disk drive design: challenges and possible solutions, Transactions on Storage 2 (2006).

[8] J. Lin, H. Zheng, Z. Zhu, E. Gorbatov, H. David, Z. Zhang, Software thermal management of DRAM memory for multicore systems, in: SIGMETRICS, 2008.

[9] X. Wang, X. Wang, G. Xing, J. Chen, C.-X. Lin, Y. Chen, Towards optimal sensor placement for hot server detection in data centers, in: ICDCS, 2011.

[10] Z. Yuan, R. Tan, G. Xing, C. Lu, Y. Chen, J. Wang, Fast sensor placement algorithms for fusion-based target detection, in: RTSS, 2008.

[11] K. Skadron, M. Stan, W. Huang, S. Velusamy, K. Sankaranarayanan, D. Tarjan, Temperature-aware microarchitecture, in: ISCA, 2003.

[12] Y. Kim, S. Gurumurthi, A. Sivasubramaniam, Understanding the performance-temperature interactions in disk I/O of server workloads, in: HPCA, 2006.

[13] J. Choi, Y. Kim, A. Sivasubramaniam, J. Srebric, Q. Wang, J. Lee, Modeling and managing thermal profiles of rack-mounted servers with thermostat, in: HPCA, 2007.

[14] R. Ayoub, R. Nath, T. Rosing, Jetc: joint energy thermal and cooling management for memory and CPU subsystems in servers, in: HPCA, 2012.

[15] N. El-Sayed, I.A. Stefanovici, G. Amvrosiadis, A.A. Hwang, B. Schroeder, Temperature management in data centers: why some (might) like it hot, in: SIGMETRICS, 2012.

[16] J. Moore, J.S. Chase, Weatherman: automated, online, and predictive thermal mapping and management for data centers, in: ICAC, 2006.

[17] J. Moore, J. Chase, P. Ranganathan, R. Sharma, Making scheduling “cool”: temperature-aware workload placement in data centers, in: USENIX, 2005.

[18] M. Jonas, R.R. Gilbert, J. Ferguson, G. Varsamopoulos, S.K.S. Gupta, A transient model for data center thermal prediction, in: IGCC, 2012.

[19] L. Li, C.-J.M. Liang, J. Liu, S. Nath, A. Terzis, C. Faloutsos, Thermocast: a cyber-physical forecasting model for data centers, in: SIGKDD, 2011.

[20] J. Chen, R. Tan, Y. Wang, G. Xing, X. Wang, X. Wang, B. Punch, D. Colbry, A high-fidelity temperature distribution forecasting system for data centers, in: RTSS, 2012.

[21] C.-J.M. Liang, J. Liu, L. Luo, A. Terzis, F. Zhao, RACNet: a high-fidelity data center sensing network, in: SenSys, 2009.

[22] S. Memik, R. Mukherjee, M. Ni, J. Long, Optimizing thermal sensor allocation for microprocessors, IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems 27 (3) (2008) 516–527, http://dx.doi.org/10.1109/TCAD.2008.915538.

[23] T. Yasuda, On-chip temperature sensor with high tolerance for process and temperature variation, in: ISCAS, 2005.

[24] Y. Zhang, A. Srivastava, M. Zahran, Chip level thermal profile estimation using on-chip temperature sensors, in: ICCD, 2008.

[25] A. Krause, C. Guestrin, A. Gupta, J. Kleinberg, Near-optimal sensor placements: maximizing information while minimizing communication cost, in: IPSN, 2006.

[26] P.K. Varshney, Distributed Detection and Data Fusion, Springer-Verlag, Inc, New York, 1996.

[27] S. Marsh, Direct extraction technique to derive the junction temperature of HBT's under high self-heating bias conditions, IEEE Transactions on Electron Devices 47 (2000).

[28] CFD flow modeling software and solutions from Fluent, http://www.fluent.com

[29] S.V. Patankar, Numerical Heat Transfer and Fluid Flow, Hemisphere Publishing Corporation, New York, 1980.

[30] B.W. Wah, Y. Chen, T. Wang, Simulated annealing with asymptotic convergence for nonlinear constrained optimization, Journal of Global Optimization 39 (2007).

[31] A. Verma, G. Dasgupta, T.K. Nayak, P. De, R. Kothari, Server workload analysis for power minimization using consolidation, in: USENIX, 2009.

[32] MEMSIC, TelosB mote, http://www.memsic.com/products/wireless-sensor-networks/wireless-modules.html

[33] K. Choi, W. Lee, R. Soma, M. Pedram, Dynamic voltage and frequency scaling under a precise energy model considering variable and fixed components of the system power dissipation, in: ICCAD, 2004.

XiaodongWangiscurrentlyaPh.D.Studentinthe Depart-mentofElectricalandComputerEngineeringattheThe OhiostateUniversity.BeforejoiningTheOhioState Uni-versity,hewasaPh.D.studentatUniversityofTennessee, Knoxville.HeistherecipientofthefirstMinKao Fel-lowshipofElectricalEngineeringandComputerScience DepartmentatUniversityofTennessee,Knoxvillefrom 2007to2010.HealsoreceivedtheESPNGraduateStudent FellowshipandtheChancellorsAwardforExtraordinary ProfessionalPromiseAwardfromUniversityofTennessee, Knoxville,in2010and2011,respectively.Hereceivedhis M.S.inComputerEngineeringfromUniversityof Ten-nessee,Knoxvillein2009andB.S.degreeinElectrical EngineeringfromShanghaiJiaoTongUniversity,China,in2006.In2007,heworked atPDFSolutionsInc.asaDataAnalysisEngineer.

Xiaorui Wang received the Ph.D. degree from Washington University in St. Louis in 2006. He is an associate professor in the Department of Electrical and Computer Engineering at The Ohio State University. He is the recipient of the US Office of Naval Research (ONR) Young Investigator (YIP) Award in 2011, the US National Science Foundation (NSF) CAREER Award in 2009, the Power-Aware Computing Award from Microsoft Research in 2008, and the IBM Real-Time Innovation Award in 2007. He also received the Best Paper Award from the 29th IEEE Real-Time Systems Symposium (RTSS) in 2008. He is an author or coauthor of more than 60 refereed publications. From 2006 to 2011, he was an assistant professor at the University of Tennessee, Knoxville, where he received the EECS Early Career Development Award, the Chancellor's Award for Professional Promise, and the College of Engineering Research Fellow Award in 2008, 2009, and 2010, respectively. In 2005, he worked at the IBM Austin Research Laboratory, designing power control algorithms for high-density computer servers. From 1998 to 2001, he was a senior software engineer and then a project manager at Huawei Technologies Co. Ltd., China, developing distributed management systems for optical networks. His research interests include power-aware computer systems and architecture, real-time embedded systems, and cyber-physical systems. He is a member of the IEEE and the IEEE Computer Society.

Guoliang Xing received the B.S. degree in electrical engineering and the M.S. degree in computer science from Xian Jiao Tong University, China, in 1998 and 2001, respectively, and the M.S. and D.Sc. degrees in computer science and engineering from Washington University in St. Louis, in 2003 and 2006, respectively. He is an assistant professor in the Department of Computer Science and Engineering at Michigan State University. From 2006 to 2008, he was an assistant professor of computer science at City University of Hong Kong. He is an NSF CAREER Award recipient in 2010. He received the Best Paper Award at the 18th IEEE International Conference on Network Protocols (ICNP) in 2010. His research interests include wireless sensor networks, mobile systems, and cyber-physical systems.

Cheng-Xian Lin is currently an Associate Professor in the Department of Mechanical and Material Engineering at FIU. His prior positions include Associate Professor in the University of Tennessee, Knoxville and Summer Faculty Fellow at Air Force Research Laboratory in WPAFB. He earned his Ph.D. in Mechanical Engineering (Thermal Engineering) from Chongqing University, China. He has authored and co-authored over 150 papers in peer-reviewed journals and conference proceedings. His current research interests include Computational Fluid Dynamics, Heat Transfer, Thermal Management, Energy Efficiency and Renewable Energy in Built Environments. He is a member of the ASME and ASHRAE.
