System Energy Efficiency Lab

(1)

Prof. Tajana Šimunić Rosing, Dept. of Computer Science


(2)

Energy efficiency at global scale

National LambdaRail

NSF GreenLight & IRNC: TransLight/StarLight

2.8 MW Fuel Cell Power Plant, UCSD; Wind from NREL; 2 MW Solar @UCSD

(3)

Key issues for distributed renewable-powered datacenters



Green energy availability varies dramatically

• Instantaneous use leads to significant energy efficiency losses: prediction is needed



Datacenter computing requires consistent performance

• Infrastructure that monitors and manages computation in datacenters has to be aware of performance costs
• Service response times are around 100 ms; batch job throughput may take at most a 10% hit



Energy costs of datacenters are typically higher than green energy availability

• Brown energy needs to be present both to supplement green energy and as “insurance” to meet performance constraints
• Improvements in computation & networking infrastructure energy efficiency are necessary (power, thermal and cooling management)

(4)

Datacenter energy efficiency

Barroso & Hölzle, 2009

(5)

Energy efficiency of the infrastructure

Dynamic thermal management (DTM)

• Workload scheduling:

• Machine learning for dynamic adaptation

• Proactive thermal management

• Reduces thermal hot spots by 80% on average with no performance overhead

• Cooling aware management

• Savings of 70% in cooling subsystem

Dynamic power management (DPM)

• HW level: adaptive power gating gives 40% energy savings with no performance impact
• SW level: 92% reduction in performance variability with DVFS
• Optimal DPM for a class of workloads
• Machine learning to adapt
• Select among specialized policies
• Measured energy savings of 70%
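The "select among specialized policies" idea can be illustrated with a multiplicative-weights update over a set of candidate policies. This is only a minimal sketch: the slide does not say which learning algorithm is used, and the policy names, the loss values, and the learning rate `eta` are all hypothetical.

```python
def select_policy(weights):
    """Return the name of the policy with the largest weight."""
    return max(weights, key=weights.get)


def update_weights(weights, losses, eta=0.5):
    """Multiplicative-weights update; losses are in [0, 1], lower is better."""
    updated = {name: w * (1.0 - eta * losses[name]) for name, w in weights.items()}
    total = sum(updated.values())
    return {name: w / total for name, w in updated.items()}


# Hypothetical policies and per-interval losses (e.g. normalized energy plus delay).
weights = {"fixed_timeout": 1 / 3, "adaptive_timeout": 1 / 3, "always_on": 1 / 3}
observed_losses = [
    {"fixed_timeout": 0.6, "adaptive_timeout": 0.2, "always_on": 0.9},
    {"fixed_timeout": 0.5, "adaptive_timeout": 0.3, "always_on": 0.9},
    {"fixed_timeout": 0.7, "adaptive_timeout": 0.1, "always_on": 0.8},
]
for losses in observed_losses:
    weights = update_weights(weights, losses)

best = select_policy(weights)  # policy with the lowest accumulated loss so far
```

Each policy keeps running in the background as an "expert"; the weights shift toward whichever policy has been cheapest on the recent workload, so the selection adapts when the workload changes.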

NSF Project GreenLight

• Green cyber-infrastructure in energy-efficient mobile facilities
• Closed-loop power and thermal management

(6)

NSF GreenLight: Dashboard & History plots

• Multiple sensor data: temperature, fan speed, liquid flow rate & temperature, power
• Use measurements to develop models

(7)

GMVQ VM Power Cost Prediction

• Goal: Estimate how much power a VM consumes and predict what the cost would be if it migrates to another machine
• Approach: Gaussian Mixture Vector Quantization (GMVQ) to fit a GMM to the training data
• CPU utilization is not enough!
• GMVQ is 3x better than regression

[Figure: power clusters; annotated value: 70 W]

Tajana Simunic Rosing, UCSD
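To make the "fit a GMM to the training data" step concrete, here is a minimal one-dimensional sketch. It is not the GMVQ implementation from the talk: it assumes a single hypothetical feature (memory accesses per cycle), made-up training data, and two mixture components, and it predicts power as a responsibility-weighted average of per-cluster power.

```python
import math


def responsibilities(x, weights, means, variances):
    """Posterior probability of each mixture component for a scalar x."""
    logs = [
        math.log(w) - 0.5 * math.log(2 * math.pi * v) - (x - m) ** 2 / (2 * v)
        for w, m, v in zip(weights, means, variances)
    ]
    top = max(logs)  # subtract the maximum to avoid underflow
    exps = [math.exp(value - top) for value in logs]
    total = sum(exps)
    return [e / total for e in exps]


def fit_gmm(xs, k=2, iters=50):
    """Fit a 1-D Gaussian mixture with EM; means start evenly spaced over the data range."""
    n = len(xs)
    lo, hi = min(xs), max(xs)
    means = [lo + (hi - lo) * j / (k - 1) for j in range(k)]
    mean_all = sum(xs) / n
    var_all = max(sum((x - mean_all) ** 2 for x in xs) / n, 1e-9)
    variances = [var_all] * k
    weights = [1.0 / k] * k
    for _ in range(iters):
        resp = [responsibilities(x, weights, means, variances) for x in xs]  # E-step
        for j in range(k):  # M-step
            nj = max(sum(r[j] for r in resp), 1e-12)
            means[j] = sum(r[j] * x for r, x in zip(resp, xs)) / nj
            var = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, xs)) / nj
            variances[j] = max(var, 1e-9)
            weights[j] = max(nj / n, 1e-12)
    return weights, means, variances


def predict_power(x, model, cluster_power):
    """Responsibility-weighted average of the per-cluster power values."""
    resp = responsibilities(x, *model)
    return sum(r * p for r, p in zip(resp, cluster_power))


# Made-up training data: (memory accesses per cycle, measured VM power in W).
train = [(0.10, 50), (0.12, 51), (0.11, 49), (0.09, 50), (0.13, 50),
         (0.50, 70), (0.52, 71), (0.48, 69), (0.51, 70), (0.49, 70)]
xs = [x for x, _ in train]
model = fit_gmm(xs)

# Average power of each cluster, weighted by how strongly each sample belongs to it.
cluster_power = []
for j in range(2):
    rs = [responsibilities(x, *model)[j] for x in xs]
    cluster_power.append(sum(r * p for r, (_, p) in zip(rs, train)) / sum(rs))

low = predict_power(0.11, model, cluster_power)
high = predict_power(0.50, model, cluster_power)
```

Two VMs with the same CPU utilization can land in different clusters if their memory behavior differs, which is one way to read "CPU utilization is not enough".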

(8)

Energy management with virtualization

[Figure: vGreen architecture. Guests (Apps + OS) run on a hypervisor with a scheduler and workload characterization ("I/O intensive? CPU intensive?") above the hardware (CPU0 ... CPU n, network, disk, I/O). Per-VCPU metrics (wMPC, wIPC, util) roll up into per-VM metrics (vMPC, vIPC, vutil) and per-node metrics (nMPC, nIPC, nutil).]

Per-VM metrics, summed over the VM's VCPUs: vMPC = \sum_{VCPU} wMPC, vIPC = \sum_{VCPU} wIPC, vutil = \sum_{VCPU} util

Per-node (VGNODE) metrics, summed over the node's VMs: nMPC = \sum_{VM} vMPC, nIPC = \sum_{VM} vIPC, nutil = \sum_{VM} vutil

• Scheduling

• Co-locate guests with orthogonal characteristics

• Management policies

• Based on the metrics maintained per guest

• Avg. 35% energy savings
• Avg. 40% speedup

vGreen
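Co-locating guests with orthogonal characteristics can be sketched as a greedy balancing of memory intensity across nodes. This is an illustrative sketch, not the vGreen scheduler: it assumes per-VM MPC is the sum of per-VCPU values, and the VM names, the MPC values (in hypothetical units of memory accesses per 1000 cycles), and the function names are made up.

```python
def vm_mpc(vcpu_mpcs):
    """Roll per-VCPU memory intensity up to a per-VM value (assumed to be a sum)."""
    return sum(vcpu_mpcs)


def place_vms(vms, n_nodes):
    """Greedy placement: most memory-intensive VM first, onto the node with the lowest total MPC."""
    nodes = [[] for _ in range(n_nodes)]
    load = [0] * n_nodes
    for name, mpc in sorted(vms.items(), key=lambda kv: kv[1], reverse=True):
        target = min(range(n_nodes), key=lambda i: load[i])  # least memory-loaded node
        nodes[target].append(name)
        load[target] += mpc
    return nodes


# Two memory-intensive and two CPU-intensive guests, two VCPUs each.
vms = {
    "mem1": vm_mpc([45, 45]),
    "mem2": vm_mpc([40, 40]),
    "cpu1": vm_mpc([5, 5]),
    "cpu2": vm_mpc([2, 3]),
}
placement = place_vms(vms, n_nodes=2)
```

With these numbers each node ends up with one memory-intensive and one CPU-intensive guest, so the two memory-heavy VMs do not contend for the same memory subsystem.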

(9)

vGreen+

Batch jobs: MIPS driven
Services: latency sensitive

• Maximize qMIPS/Watt
• q: QoS ratio < 1
• MIPS: batch job throughput
• Watt: power consumption

vgnodes

vgserv


(10)

Unmodified Xen: RUBiS with batch job

Poor SLA!

vGreen: RUBiS with batch job

Great SLA!

Managing multi-tier applications

RUBiS: auction website

Clients

Webserver (PHP)

Database (MySQL)

• What happens when we combine:
  • Latency-sensitive jobs (e.g. RUBiS)
  • Throughput-sensitive jobs (e.g. batch jobs)
• Preliminary results:
  • More than 10x improvement in SLA with

(11)



State of the art:

• Baseline: running services and batch jobs on separate servers
• Selective consolidation (tChar): vGreen
• Capping (tCap): cap the CPU allotment to bVM to mitigate interference effects (Padala@EuroSys’09, Nathuji@EuroSys’10)

Controller:

• Dynamically control the vCPU allocation of the service VM to maximize the batch job throughput while meeting service response times

VM Scheduling Policies Throughput

Our controller is within 7% of baseline; CPU capping has 25% lower average throughput

[Figure: batch job throughput]
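One control step of such a controller might look like the sketch below. It is a simple threshold rule under stated assumptions, not the controller evaluated in the talk: the 100 ms target comes from the slides, while the `slack` factor, the vCPU limits, and the function names are hypothetical.

```python
def p90(samples_ms):
    """Nearest-rank 90th percentile of response times in milliseconds."""
    ordered = sorted(samples_ms)
    rank = (9 * len(ordered) + 9) // 10  # integer form of ceil(0.9 * n)
    return ordered[rank - 1]


def next_service_vcpus(current, p90_ms, target_ms=100.0,
                       min_vcpus=1, max_vcpus=8, slack=0.7):
    """Grow the service VM when it misses the target; shrink it when there is slack."""
    if p90_ms > target_ms:
        return min(current + 1, max_vcpus)
    if p90_ms < slack * target_ms:
        return max(current - 1, min_vcpus)  # freed vCPU can go to batch jobs
    return current


window = [10, 20, 30, 40, 50, 60, 70, 80, 90, 100]  # made-up response times (ms)
measured = p90(window)
allocation = next_service_vcpus(current=3, p90_ms=measured)
```

Every vCPU the service VM gives back is available to batch jobs, which is how the controller trades service headroom for batch throughput.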

(12)



Use a controller to manage virtual CPUs dynamically

• Maximize the CPUs available to batch jobs while meeting RUBiS SLAs

Energy Efficiency Improvements

• 70% more efficient than running service & batch VMs separately while within 7% of maximum batch job throughput

(13)

Green Energy Prediction

• Data from solar panels at UCSD
  • State of the art: exponentially weighted moving average (EWMA)
    • EWMA: 32.6% error
    • Extended EWMA (eEWMA): 23.4% error
  • Our algorithm:
    • WCMA: less than 9.6% error
• Data gathered from a wind farm in Lake Benton, made available by NREL
  • State of the art:
    • Integrated predictor: 48.2% error
  • Our algorithm: 21.2% error
    • Combination of weighted nearest-neighbor (NN) tables and wind power curve models
• Predict green energy availability for the next 30 min window
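For reference, the EWMA baseline named above can be sketched in a few lines. This is the generic textbook form applied to consecutive windows, not the WCMA or nearest-neighbor predictors from the talk; the smoothing factor `alpha` and the sample values are made up.

```python
def ewma_forecast(history, alpha=0.5):
    """One-step-ahead EWMA forecasts; predictions[i] is the forecast for history[i]."""
    estimate = history[0]
    predictions = [estimate]  # the first forecast is seeded with the first observation
    for observed in history[:-1]:
        estimate = alpha * observed + (1 - alpha) * estimate
        predictions.append(estimate)
    return predictions


def mean_abs_pct_error(actual, predicted):
    """Mean absolute percentage error, skipping the seeded first window."""
    errors = [abs(a - p) / a for a, p in zip(actual[1:], predicted[1:])]
    return sum(errors) / len(errors)


# Made-up average solar power per 30 min window, in watts.
history = [100.0, 200.0, 200.0, 100.0]
predictions = ewma_forecast(history)
error = mean_abs_pct_error(history, predictions)
```

The example shows why a plain EWMA struggles with variable supply: the forecast lags every sharp change, which is the kind of error the slide's percentages quantify.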

(14)

Methodology



Use green energy to schedule “extra” batch jobs.

• MapReduce (MR) type jobs are used for this purpose.
• Initiate more subtasks with the available green energy.
  • Increased throughput
  • Reduced completion time (reduction limited to 10%)
• Kill a subtask if the green energy supply level drops.

[Figure: scheduling model. Inputs: MR job arrivals, web request arrivals, green energy supply. Components: active MR jobs, servers, per-server web request queues.]
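The start/kill decision for extra subtasks can be sketched as below. This is a simplified illustration of the methodology, not the scheduler used in the experiments: it assumes a fixed power cost per subtask, ignores the 10% completion-time limit, and all names and numbers are hypothetical.

```python
def plan_subtasks(predicted_green_w, running_extra, watts_per_subtask, free_slots):
    """Return (to_start, to_kill) extra MapReduce subtasks for the next window."""
    affordable = int(predicted_green_w // watts_per_subtask)
    if affordable >= running_extra:
        # Green supply covers what is running; start more if slots are free.
        return min(affordable - running_extra, free_slots), 0
    # Green supply dropped; kill the subtasks it can no longer power.
    return 0, running_extra - affordable


rising = plan_subtasks(predicted_green_w=500.0, running_extra=2,
                       watts_per_subtask=100.0, free_slots=5)
falling = plan_subtasks(predicted_green_w=150.0, running_extra=4,
                        watts_per_subtask=100.0, free_slots=5)
```

Feeding this function a prediction for the next window, rather than the instantaneous supply, is what lets the scheduler avoid starting subtasks that would soon have to be killed.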

(15)

Experimental setup & validation



Globally distributed datacenters connected

• 5 datacenters, 12 routers, modeled after ESnet
• Solar traces from UCSD
• Wind traces from NREL


Simulator validation

Metric                           Measured value   Simulated value   Avg. error
Avg. power consumption           246 W            251 W             3%
RUBiS QoS ratio                  0.08             0.085             6%
Avg. MapReduce completion time   112 min          121 min           8%



Jobs run within vGreen VMs on a Nehalem server

• RUBiS used for services, with a 100 ms 90th-percentile response time constraint
• MapReduce used for batch jobs, with 10% max job completion time reduction (max 5 cores on the Nehalem server)

(16)

Benefits of Green Energy Prediction

[Figure: normalized quantities for solar, wind, and combined GE efficiency; instantaneous vs. prediction]

Prediction has 93% GE Efficiency

GE Efficiency: ratio of green energy consumed for useful work vs. the total green energy available



Compare our predictive green energy job scheduler against instantaneous green energy use
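The GE efficiency metric defined above is a simple ratio. The sketch below computes it for two illustrative cases; the energy figures are made up and chosen only so that the predictive case matches the 93% reported on the slide, and the instantaneous figure is not a result from the talk.

```python
def ge_efficiency(useful_wh, available_wh):
    """Green energy used for useful work divided by total green energy available."""
    if available_wh <= 0:
        raise ValueError("available green energy must be positive")
    return useful_wh / available_wh


# Made-up numbers: 10 kWh of green energy is available in both cases.
available = 10_000.0
# Energy spent on subtasks that are later killed does no useful work,
# so it does not count toward the numerator.
instantaneous = ge_efficiency(useful_wh=6_000.0, available_wh=available)
predictive = ge_efficiency(useful_wh=9_300.0, available_wh=available)
```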

(17)

Benefits of Green Energy Prediction

[Figure: normalized job completion time; w/o GE vs. instantaneous vs. prediction]

Prediction has 22% faster batch job completion time vs. instantaneous.

On average, 5x fewer batch tasks need to be terminated when using GE prediction vs. instantaneous usage.

[Figure: normalized quantities for solar, wind, and combined GE job %; instantaneous vs. prediction]

38% more jobs complete with prediction vs. instantaneous

GE Job %: ratio of batch jobs completed with GE over all jobs.

(18)

Next steps: Green energy powered global routing

(jointly with Inder Monga, ESnet)

2.8 MW Fuel Cell Power Plant, UCSD; Wind from NREL; 2 MW Solar @UCSD

(19)

Focus Center Research Program

Sponsors

DOD & DARPA

Applied Materials, Novellus, Cadence, AMD, Micron, Freescale, Texas Instruments, IBM, Xilinx, Intel, GlobalFoundries, Raytheon, United Technologies

Interconnect Focus Center [13 Universities]
Director: Prof. Paul Kohl (Georgia Tech). Nanoscale electrical & optical interconnects; energy delivery and thermal management; wireless connectivity; modeling, analysis and assessment of new connectivity solutions.

Center for Circuit & Systems Solutions [13 Universities]
Director: Prof. Larry Pileggi (CMU). Circuit/module infrastructure; enterprise systems; portable electronics; functional diversity and emerging circuits for post CMOS.

Multi-Scale Systems Research Center [10 Universities]
Director: Prof. Jan Rabaey (UC Berkeley). High-level systems design addressing distributed sense and control systems, large-scale and small-scale information technology systems.

Functional Engineered Nano-Architectonics [14 Universities]
Director: Prof. Kang Wang (UCLA). Novel materials and processes which enable fabrication of nanoscale devices and interconnects.

Materials, Structures and Devices [15 Universities]
Director: Prof. Dimitri Antoniadis (MIT). Integration of new materials enabling CMOS extension; carbon-based devices; novel embedded memory; functional diversification; theory, modeling and simulation of new devices.

Gigascale Systems Research Center [15 Universities]
Director: Prof. Sharad Malik (Princeton). Platform architectures; concurrent systems programming; platform viability; resilient systems and alternative computation models.

(20)

DSCS Theme

Distributed sense and control systems

Target: Airborne Platforms (Avionics)

5 faculty (reduced from 8), 3 locations

LSS Theme

Large-scale “energy-intensive” systems

Target: Data centers

9 faculty (stable), 5 locations

SSS Theme

Small-scale “energy-frugal” systems

Target: Human enhancement

5 faculty (reduced from 8), 4 locations

MuSyC in a Nutshell

GRAND CHALLENGE: “Energy-smart” distributed systems that

• Are deeply aware of the balance between energy availability and demand
• Adjust behavior through dynamic and adaptive optimization across all scales of the design hierarchy

• 19 faculty distributed over 9 US universities
• 60 students (many only partly funded)
• 109 publications
• Monthly e-seminars and bi-annual e-workshops

[Pie chart, 2010-2011; segments DSCS, LSS, SSS, Center; values 29%, 32%, 26%, 13%]

(21)

Multiscale Systems Center: Energy Balanced Datacenters

Theme leader: Tajana Simunic Rosing

Realize closed-loop energy management strategies that enable “energy-intensive” large-scale systems to be orders of magnitude more energy-efficient, while ensuring that mission-critical goals are met.

(22)

Combined Energy, Thermal & Cooling (CETC) Management Results

• Average cooling savings of 70% relative to state of the art
• CETC: Our policy
• DLB: Dynamic load balancing (baseline)
• NFMO: Only page migrations allowed
• NMM: No memory clustering
• NCM: No CPU scheduling optimizations
• CETC performance overhead < 0.2%
• CETC page migration rate < 5 pages/sec -> negligible overhead & high stability

Platform: Intel Xeon dual-socket quad-core server with state-of-the-art PI fan controller

Local ambient temp.   # DIMMs   CETC %   NFMO %   NMM %   NCM %   DLB %
45 °C                 8         0.175    0.184    0.093   0.102   0
35 °C                 8         0.115    0.120    0.069   0.100   0
45 °C                 16        0.175    0.187    0.109   0.102   0