• No results found

How Much Power Oversubscription is Safe and Allowed in Data Centers?

N/A
N/A
Protected

Academic year: 2021

Share "How Much Power Oversubscription is Safe and Allowed in Data Centers?"

Copied!
21
0
0

Loading.... (view fulltext now)

Full text

(1)

How Much Power Oversubscription is

Safe and Allowed in Data Centers?

1EECS @ University of Tennessee, Knoxville 2ECE @ The Ohio State University

3IBM Research, Austin

(2)

Introduction

 Power: a first-class constraint in data center design

 Power oversubscription by power capping

 Improves power facility utilization  Improves server performance

 Power capping at different levels

 Servers, racks, and data centers

 However, they all share a common assumption

Power should

never

exceed the rated power capacity?

 Otherwise the circuit breaker (CB) would trip?

(3)

Trip curve of a typical

How Much Power Subscription is Safe?

 A CB trips or not depends on

 Magnitude of the overload

 Duration of the overload

 Ideal upper bound?

 Lower bound of the tolerance band

 This paper

 Investigates CB trip features

 Proposes adaptive power control to

1. Fully utilizes the allowed overload interval for maximized server perf. 2. Safely hosts more servers without

upgrading power facilities

Two minutes

(4)

Proposed Solution: CB-Adaptive

 More than just a standalone controller

 A methodology that adapts the parameters of existing power controllers to engineer their settling times

 Example: adapts a server power controller

[Lefurgy ICAC’07]

1. Obtain the tripping time from the CB tripping curve 2. The desired settling time should be the tripping time

3. Adapt controller parameter K to enforce the settling time

Server

CPU frequency Power measured Power Cap

Error

K ×Error

(5)

CB-Adaptive Design Details

 System model

 p(k) is the power of the server

 d(k) is the change to the CPU frequency

 A is a hardware-specific parameter when the server runs LINPACK

 How to adapt the controller parameter?

 The relationship between the parameter and the settling time  The parameter is a function of the measured power, the rated

current of CB, and the control period.

(

1)

( )

( )

(6)

Two CB-Adaptive Improvements

 Temperature-aware CB-Adaptive

 The CB trip curve is impacted by the ambient temperature.

 The rated current of CB is a linear function of the temperature.

 K is also a function of ambient temperature.

 CB-Proactive

 Delicately increases DVFS level in a proactive way  Further improves the server performance

 When and to what extent the DVFS level is increased?  CB enters the long-delay region

 Increase the frequency to the highest level

0.9 1 1.1 1.2 10 20 30 40 50 Temperature T h e r a te d c u rr e n t

(7)

Discussion on Power Oversubscription

 Possible applications of CB-Adaptive

 Hosting additional servers

 Safety issues

 A typical power delivery system

 Every component can tolerate overloads like CBs

 Overload capacity: power beyond which permanent damage occurs to the component

(8)

More Discussion

 Components other than CBs do not experience

overloads frequently.

 It is less likely that many servers reach their peak power simultaneously.

 Evidenced by a real Google data center [Fan ISCA’07]

 When only a branch circuit is overloaded

 CB-Adaptive can be applied directly

 When multiple branch circuits are overloaded

 CB-Adaptive needs to consider the tripping time of components other than CBs.

(9)

Hardware Testbed

 Dell OptiPlex 380  Rockwell Allen-Bradley 1489-A Industrial CB  Workloads  SPEC CPU2006  SPEC JBB  LINPACK

(10)

Baselines

 NoControl

 Estimates the peak power consumption of a server

 No power caps

 Unsafe and conservative

 P-Control

 Measures the power in every control period

 A non-adaptive proportional controller calculates frequency changes to enforce a power budget.

 P-Control-CB

 The power budget is different from that of P-Control

(11)

Power Control Comparison

 NoControl causes the CB trips. Unsafe

 P-Control & P-Control-CB

Unsafe and conservative

 CB-Adaptive fully utilizes overload intervals of CBs.  Raise CPU freq for

higher performance

50 100 150

0 10 20 30 40 50 60 70 80 90 100 Control Period (5sec)

P o w e r Co n s u m p ti o n (W a tt )

NoControl Rated Limit Long-Delay Upper Limit

50 100 150

0 10 20 30 40 50 60 70 80 90 100

Control Period (5sec)

P o w e r C o n s u m p ti o n (W a tt )

P-Control-CB Long-Delay Upper Limit P-Control

50 100 150

0 10 20 30 40 50 60 70 80 90 100 Control Period (5sec)

P o w e r C o n s u m p ti o n (W a tt )

CB-Adaptive Rated Limit Long-Delay Upper Limit

Converge to 80W after hours

50 100 150 P o w e r C o n s u m p ti o n (W a tt )

CB-Proactive Rated Lim it Long-Delay Uppe r Lim it

Converge to 80W

NoControl

(12)

Performance Comparison

 CB-Adaptive outperforms

P-Control by

 66%, for LINPACK

 29 % to 49%, for SPEC CPU 2006  74%, for SPEC JBB 0.0 0.4 0.8 1.2 1.6 2.0 P-C ontr ol P-C ontr ol-C B CB-A dapt ive CB-P roac tive L IN P A C K P e rf o rm a n c e (G fl o p s ) 0 5000 10000 15000 20000 25000 30000 35000 S P E C J B B P e rf o rm a n c e (b o p s ) LINPACK SPECJBB 0 3 6 9 12 15 18 sphi nx3 wrf lbm tonto Gem sFD TD calc ulix povr ay sopl ex deal II nam d lesl ie3d cact usA DM grom acs zeus mp milc gam ess bwav es xala ncbm k asta r om netp p h264 ref libqu antu m sjen g hm mer gobm k mcf gcc bzip2 perlb ench SPEC2006 benchmarks P e rf o rm a n c e ( B a s e R a ti o )

(13)

Impact of Temperature

 Temperature impacts the trip time significantly.

 Temperature-blind solutions P-Control-CB,

CB-Adaptive and CB-Proactive are

not

safe

.

200 250 300 350 400 450 500 NoCo ntrol (21.7 ℃ ℃℃ ℃ ) NoCo ntrol (26.4 ℃ ℃℃ ℃ ) NoCo ntrol (31.6 ℃ ℃ ℃ ℃ ) NoCo ntrol (34.8 ℃ ℃ ℃ ℃ ) P-Con trol-C B (45 ℃ ℃ ℃ ℃ ) CB-Ad aptiv e (45 ℃ ℃ ℃ ℃ ) CB-Pr oactive (45 ℃ ℃℃ ℃ )

Temperature (degree Celsius)

T ri p T im e ( S e c )

(14)

Temperature-Aware CB-Adaptive

 As the temperature increases, the performance

of servers decreases.

(15)

Power Provisioning Analysis

 NoControl

 The estimation is too conservative  7 servers hosted per branch

 P-Control

 Enforce a power budget instead of an estimation of power  13 servers hosted per branch

 CB-Adaptive

 Enforce a higher power budget than P-Control  20 servers hosted per branch

Rated power of the CB The number of servers

estimated server power =

(16)

Conclusions

 A common assumption of existing power capping

 Peak power should never exceed the rated CB capacity

 This paper

 Systematically studies the CB tripping characteristics

 Identifies ideal upper bound of safe power oversubscription  Proposes two adaptive power control strategies

 Evaluation on safe power oversubscription

 A single server: 38% performance improvement

 Circuit branch: host 54% more servers without upgrading power infrastructure

(17)

Questions?

 Acknowledgements

 NSF CAREER Award CNS-0845390  NSF CSR Grant CNS-0720663

 NSF SHF Grant CCF-1017336

 Prof. Leon Tolbert at the University of Tennessee

(18)
(19)

Control Theoretic Analysis

 How to adapt the controller parameter?

 Details of the derivation

 Z transform of the system model  Z-domain controller

 Calculate the close loop transfer function  Reverse Z transform

1

m

0.02

K

A

=

(20)

Power Provisioning Analysis

 UPS cannot tolerate overloads

 Not a problem because each UPS run at 50% its capacity

(21)

References

Related documents

Open Society Georgia (Soros) 27 Course Baccalaureate Nursing Curriculum 40 Continuing Tbilisi State University Iashvili Children’s Hospital National Medical Center. Model

• When we monitor a process by means of control charts, they tell us whether the process is out of control or not, i.e., whether the process is working under chance causes only

Comparative study of various bomber aircrafts The first step in the design of aircraft is to collect data of existing aircraft of similar purpose i.e., bomber.. This step

Even if we were able to correctly classify each Twitter user, we would not be able to make a reliable estimate of the voting results as (i) several Twitter users may not vote,

She has created mechanisms to feedback customer service and mystery shopping data into real time, revenue producing activities by the front line and outside sales forces.

Using a Public Goods Game, we are interested in assessing how cooperation rates change when agents can play one of two different reactive strategies, i.e., they can pay a cost in

1 Peter 3:21 – And baptism, which is a figure [of their deliverance], does now also save you [from inward questionings and fears], not by the removing of outward body filth

− Helps protect the skin against damage related to dehydration − For the treatment and/or prevention of diaper rash caused. by wetness or urine