Virtualizing Mission-Critical Apps

1PM EST, 3/29/2011

Ilya Mirman

Agenda

The Rise of “The Virtualization Chasm”

3 Fundamental inefficiencies

Best practices

Before Virtualization

[Figure: capacity chart (axis: Capacity, 2 to 16)]

• Traditional IT guarantees apps’ performance by
– Dedicating physical machines (PMs) to apps
– Provisioning sufficient capacity to service peak loads

• Consider an app requiring 16 cores, 8 GB of memory and 10k IOPS (I/O operations per second) of I/O bandwidth to service its peaks

Over-Provisioning Waste

Workloads are ‘bursty’: the average-to-peak load ratio is often under 10%

Dedicating hardware wastes the slack capacity between average & peak
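The slack between average and peak can be put into numbers. The following is a minimal sketch; the function name and the 1.6-core average are illustrative assumptions, chosen to match the "under 10%" ratio on a PM sized for a 16-core peak.

```python
def slack_fraction(average_load: float, provisioned_peak: float) -> float:
    """Return the share of provisioned capacity left idle on average."""
    if provisioned_peak <= 0:
        raise ValueError("provisioned_peak must be positive")
    return 1.0 - average_load / provisioned_peak


# Hypothetical numbers: a PM sized for a 16-core peak, averaging 1.6 cores.
idle = slack_fraction(average_load=1.6, provisioned_peak=16)
print(f"{idle:.0%} of the dedicated capacity is slack")
```

With an average-to-peak ratio of 10%, roughly nine tenths of the dedicated hardware is idle on average.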

Virtualization is Set to Resolve This Waste

Consolidate workloads into shared PMs

This increases average utilization additively

But it also increases interference among VMs

– E.g., Peak traffic of VM1 can interfere with CPU availability for other VMs
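A small sketch of both effects, under assumed numbers: average utilization adds up across consolidated VMs, while coinciding peaks can exceed the PM's capacity. The function name and the (average, peak) pairs are hypothetical.

```python
from typing import List, Tuple


def consolidation_summary(vms: List[Tuple[float, float]], pm_capacity: float):
    """vms holds (average_load, peak_load) pairs in the same unit as pm_capacity.

    Returns (average utilization of the PM, whether coinciding peaks can
    exceed the PM's capacity).
    """
    average_utilization = sum(avg for avg, _ in vms) / pm_capacity
    coincident_peak = sum(peak for _, peak in vms)
    return average_utilization, coincident_peak > pm_capacity


# Hypothetical: four VMs, each averaging 1 core with a 6-core peak, on a 16-core PM.
utilization, can_interfere = consolidation_summary([(1, 6)] * 4, pm_capacity=16)
print(utilization, can_interfere)
```

Each added VM raises the average utilization by its own average load, but once the sum of peaks passes the PM capacity, simultaneous peaks can interfere.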

[Figure: peak workloads of VM1 to VM10, consolidated onto shared PMs]

VMs Compete for Resources

Best-effort resource allocations (vs. dedicated)

– VMs get their allocations, if capacity is available

VMs experience interference when capacity is insufficient

Interference can create congestion, bottlenecks and delays

Performance-insensitive apps can tolerate interference
– Permit simple, risk-free virtualization
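Best-effort allocation can be illustrated with a short sketch. Scaling every VM back proportionally is only one possible policy and is an assumption here, not something the slides specify. The function and VM names are hypothetical.

```python
from typing import Dict


def best_effort_allocate(demands: Dict[str, float], capacity: float) -> Dict[str, float]:
    """Grant every demand in full when capacity allows; otherwise scale all
    demands back by the same factor (an assumed proportional-share policy)."""
    total = sum(demands.values())
    if total <= capacity:
        return dict(demands)
    scale = capacity / total
    return {vm: demand * scale for vm, demand in demands.items()}


# Hypothetical: VM1 peaks at 12 cores while VM2 wants 8 on a 16-core PM.
print(best_effort_allocate({"VM1": 12, "VM2": 8}, capacity=16))
```

When demand exceeds capacity, no VM receives its full demand. A performance-insensitive app can tolerate this shortfall; a performance-sensitive one cannot.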

The Rise of “The Virtualization Chasm”

[Figure: ROI versus percentage of apps virtualized (axis marks at 20%, 40%, 80%, 100%), with regions labeled “Performance-Insensitive Apps” (Virtualization 1.0) and “Production Apps” (Virtualization 2.0), separated by “The Virtualization Chasm”]

Virtualization 1.0: Virtualize performance-insensitive apps
– E.g., print servers, non-critical web apps (the low-hanging fruit)
– 20%-30% of enterprise apps

Virtualization 2.0: Virtualize production apps

The Key Challenge: Ensuring That Production Apps Get Their Resources

Interference results from statistical over-commitment

Apps’ demands can exceed capacity momentarily

Interference may be controlled by two mechanisms

Resource allocation: protect apps against over-commitment

Workload placement: move workloads to minimize interference
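The first mechanism, resource allocation, can be sketched as reservations that are honored before any best-effort sharing. This is an illustration under assumptions: the two-step policy, the function name and the VM names are hypothetical and do not describe any particular hypervisor's scheduler.

```python
from typing import Dict


def allocate_with_reservations(
    demands: Dict[str, float],
    reservations: Dict[str, float],
    capacity: float,
) -> Dict[str, float]:
    """Serve reserved demand first, then share spare capacity proportionally."""
    if sum(reservations.values()) > capacity:
        raise ValueError("reservations over-commit the PM")
    # Step 1: demand covered by a reservation is always granted.
    granted = {vm: min(d, reservations.get(vm, 0.0)) for vm, d in demands.items()}
    # Step 2: remaining demand competes, best-effort, for what is left.
    unmet = {vm: demands[vm] - granted[vm] for vm in demands}
    total_unmet = sum(unmet.values())
    spare = capacity - sum(granted.values())
    if total_unmet > 0:
        scale = min(1.0, spare / total_unmet)
        for vm in demands:
            granted[vm] += unmet[vm] * scale
    return granted


# Hypothetical: a production Exchange VM holds an 8-core reservation on a 16-core PM.
print(allocate_with_reservations(
    demands={"exchange": 8, "print": 6, "web": 6},
    reservations={"exchange": 8},
    capacity=16,
))
```

The reserved VM gets its full 8 cores even though total demand (20) exceeds capacity (16); only the unreserved VMs absorb the shortfall.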

VMware Best Practices: Managing Production Apps’ Performance

Best Practice Guide to Exchange Server Virtualization:
http://www.vmware.com/files/pdf/Exchange_2010_on_VMware_-_Best_Practices_Guide.pdf

Assure Peak Utilization:
“It is recommended that standalone servers…be designed to not exceed 70% utilization during peak period.”

Avoid Over-Commitment:
“For performance-critical Exchange virtual machines (i.e., production systems), try to ensure the total number of vCPUs assigned to all the virtual machines is equal

VMware’s Production Apps Strategy Rests on 2 Rules

VMs running production apps should ensure that:

R-I: “Resource allocations are sufficient to serve peak demands.”

R-II: “Aggregate allocations do not exceed the PM capacity.”

R-I guarantees that an app may get its peak demands served, if capacity is available.

R-II guarantees that the capacity allocation will be available.

For example, if VM1 and VM2 each need 4 vCPUs, we need a PM with ≥ 8 CPUs!
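The two rules can be written as simple checks. This is a sketch: the `VM` class and function names are hypothetical, and only vCPUs are modeled.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class VM:
    name: str
    allocated_vcpus: int
    peak_vcpus: int


def satisfies_rule_1(vm: VM) -> bool:
    """R-I: the VM's allocation is sufficient to serve its peak demand."""
    return vm.allocated_vcpus >= vm.peak_vcpus


def satisfies_rule_2(vms: List[VM], pm_cpus: int) -> bool:
    """R-II: aggregate allocations do not exceed the PM capacity."""
    return sum(vm.allocated_vcpus for vm in vms) <= pm_cpus


vms = [VM("VM1", 4, 4), VM("VM2", 4, 4)]
print(all(satisfies_rule_1(vm) for vm in vms))  # both VMs can serve their peaks
print(satisfies_rule_2(vms, pm_cpus=8))         # an 8-CPU PM is just large enough
print(satisfies_rule_2(vms, pm_cpus=6))         # a 6-CPU PM would be over-committed
```

Taken together, the two rules force the PM to be sized for the sum of all peaks, which is the same sizing as dedicated hardware.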

Wait… Really? Then why virtualize?

Though there’s no sharing of resources, we still enjoy the other benefits of virtualization (app isolation, VM set-up, back-up, etc.)

Virtualization Can Result in 3 Fundamental Inefficiencies

1. Over-provisioning inefficiency
2. Workload packing inefficiency
3. Non-adaptive control inefficiency

How to Avoid Over-Provisioning Waste?

To avoid waste: increase average workload without increasing reservations
– Add performance-insensitive apps with high average workload
– E.g., consolidate spam-filter apps, email archival apps alongside mission-critical apps

Need an additional best-practice rule: smart consolidation

Best Practice #1: Maintain a consolidation balance between performance-sensitive and performance-insensitive workloads
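A worked example of balanced consolidation, with assumed numbers: the performance-sensitive VMs keep their peak-sized reservations, and performance-insensitive VMs are added without reservations, which raises average utilization. All loads below are hypothetical.

```python
from typing import List


def average_utilization(average_loads: List[float], capacity: float) -> float:
    """Average utilization of a PM given the average load of each hosted VM."""
    return sum(average_loads) / capacity


PM_CORES = 16
sensitive_reservations = [8, 8]  # sized for peaks; together they fill the PM
sensitive_averages = [1, 1]      # the production apps are mostly idle
insensitive_averages = [4, 4]    # e.g. spam filter, email archival; no reservations

reserved = sum(sensitive_reservations)
before = average_utilization(sensitive_averages, PM_CORES)
after = average_utilization(sensitive_averages + insensitive_averages, PM_CORES)
print(reserved, before, after)
```

Reservations stay at 16 cores in both cases, while average utilization rises from 12.5% to 62.5% because the insensitive workloads run in the slack.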

A Greatly Simplified Example

[Figure: manual ad hoc assignment of virtualized workloads VM1 to VM6 onto PM1 to PM3]

CPU capacity: 16 cores

Memory capacity: 8 GB

What If We Get New VMs?

[Figure: ad hoc assignment of VM7 to VM10 uses PM1 to PM5; optimized assignment uses PM1 to PM3]

Can we do better?

Optimized assignment uses 40% fewer resources (3 PMs vs. 5)

What Can We Learn from This Example?

Changes may require (re-)assignment of workloads

Even a trivialized example can be very complex

Complexity and waste can grow dramatically
– When the number of VMs increases
– When physical machines vary
– When there are constraints (e.g., storage access, security policies)
– When the rate of change is high

Ad hoc processes can lead to costly inefficiencies

Overcoming the Packing Inefficiency

Use improved workload placement algorithms

Look holistically at all workloads and resources
– Exploit the flexibility of performance-insensitive workloads

Exploit the dynamics of workload peaks & troughs

Best Practice #2: Use improved workload placement algorithms
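As one illustration of an improved placement algorithm, the sketch below compares placing VMs in arrival order (first fit) with first-fit decreasing, a standard bin-packing heuristic. The slides do not name a specific algorithm, so the heuristic is an assumption here. The 16-core, 8 GB PM capacity comes from the simplified example; the VM sizes are hypothetical.

```python
from typing import List, Tuple

Demand = Tuple[int, int]  # (cores, memory_gb)
PM_CAPACITY: Demand = (16, 8)


def first_fit(vms: List[Tuple[str, Demand]], capacity: Demand = PM_CAPACITY) -> List[List[str]]:
    """Place each VM, in the given order, on the first PM with room for it."""
    pms = []  # each entry: [used_cores, used_memory_gb, [vm names]]
    for name, (cores, mem) in vms:
        if cores > capacity[0] or mem > capacity[1]:
            raise ValueError(f"{name} does not fit on an empty PM")
        for pm in pms:
            if pm[0] + cores <= capacity[0] and pm[1] + mem <= capacity[1]:
                pm[0] += cores
                pm[1] += mem
                pm[2].append(name)
                break
        else:
            # No existing PM has room: bring a new PM into use.
            pms.append([cores, mem, [name]])
    return [pm[2] for pm in pms]


def first_fit_decreasing(vms: List[Tuple[str, Demand]], capacity: Demand = PM_CAPACITY) -> List[List[str]]:
    """Sort VMs by their largest normalized demand, then apply first fit."""
    def dominant_share(item: Tuple[str, Demand]) -> float:
        _, (cores, mem) = item
        return max(cores / capacity[0], mem / capacity[1])

    return first_fit(sorted(vms, key=dominant_share, reverse=True), capacity)


# Hypothetical arrival order: three small VMs followed by three large ones.
arrivals = [("VM1", (4, 2)), ("VM2", (4, 2)), ("VM3", (4, 2)),
            ("VM4", (12, 6)), ("VM5", (12, 6)), ("VM6", (12, 6))]
print(len(first_fit(arrivals)), "PMs when placed in arrival order")
print(len(first_fit_decreasing(arrivals)), "PMs with first-fit decreasing")
```

On this toy input, arrival-order placement needs 4 PMs and first-fit decreasing needs 3. A production placement engine would also have to handle constraints, migration cost and time-varying demand, which this sketch ignores.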

Mission-Critical App Example

[Figure: IOPS rate (k-IOPS, 1 to 10) versus time of day (hourly, 15:00 through 14:00)]

Virtualized MS Exchange app

High IOPS during the night (2AM-5AM)

Peak: 10 k-IOPS

What If Workloads Grow?

What if VM1 needs more memory & storage?

[Figure: assignment of VM1 to VM6 across PM1 to PM4]

Can we do better?

Optimized assignment uses 25% fewer resources

Adaptive vs. Non-Adaptive Workload Control

• Workload demands (and interference) change over time
– E.g., Exchange server is active through the night
– Why keep its reservation during the day?

• Static workload mgmt is limited in handling emergent problems
– App profiles reflect long-term statistics; fluctuations can cause interference

• Adaptive workload control offers superior mgmt
– Exploit workload dynamics to reduce the waste of static policies
– Eliminate emergent interference
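The gain from adapting a reservation to the time of day can be estimated with a small sketch. The hourly profile is hypothetical: 10 k-IOPS from 2AM to 5AM, as in the Exchange example, and an assumed 1 k-IOPS for the rest of the day.

```python
from typing import List


def reserved_iops_hours(hourly_demand: List[int], adaptive: bool) -> int:
    """Total reservation over the day, in IOPS-hours.

    Static: hold the peak reservation around the clock.
    Adaptive: reserve each hour's expected demand only.
    """
    if adaptive:
        return sum(hourly_demand)
    return max(hourly_demand) * len(hourly_demand)


# Hypothetical profile: 10 k-IOPS in hours 2, 3 and 4; 1 k-IOPS otherwise.
profile = [10_000 if 2 <= hour < 5 else 1_000 for hour in range(24)]
static = reserved_iops_hours(profile, adaptive=False)
adaptive = reserved_iops_hours(profile, adaptive=True)
print(static, adaptive, f"{1 - adaptive / static:.0%} of the static reservation is freed")
```

Under these assumptions the static policy reserves 240,000 IOPS-hours per day and the adaptive one 51,000, which leaves the daytime capacity available to other workloads.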

Best Practice #3: Provide adaptive control to optimize resource use & avoid interference

Best Practice #4: Use forward-looking workload projection
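Forward-looking projection can take many forms. The sketch below uses a deliberately simple one, assumed here for illustration: the projected demand for each hour is the mean of the same hour over previous days. The function names, history values and threshold are hypothetical.

```python
from typing import List


def project_hourly_demand(history: List[List[float]]) -> List[float]:
    """Project the next day's demand per hour as the mean of past days."""
    if not history or any(len(day) != 24 for day in history):
        raise ValueError("history must contain at least one 24-hour day")
    return [sum(day[hour] for day in history) / len(history) for hour in range(24)]


def hours_to_reserve(projection: List[float], threshold: float) -> List[int]:
    """Hours whose projected demand is high enough to justify a reservation."""
    return [hour for hour, demand in enumerate(projection) if demand >= threshold]


# Hypothetical history: two days with a night-time peak between 2AM and 5AM.
day1 = [9_000 if 2 <= h < 5 else 1_000 for h in range(24)]
day2 = [11_000 if 2 <= h < 5 else 1_000 for h in range(24)]
projection = project_hourly_demand([day1, day2])
print(hours_to_reserve(projection, threshold=5_000))
```

A controller could use such a projection to raise the reservation shortly before the projected peak hours and release it afterwards, rather than reacting only once interference has occurred.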

Adaptive Control: Too Complex for Manual Management

Manual management requires administrators to:
– Master voluminous details of hypervisor and application internals
– Manage interference and waste problems manually
– Manage resource allocations and move applications as workloads change
– Maintain tight coordination between virtualization & app administrators

Virtualizing Production Apps:

Conclusions

Workload placement can be very inefficient

– Over-provisioning waste; workload-packing waste; non-adaptive inefficiencies

Virtualization is much too complex for manual administration

Must be augmented by workload management:
– Eliminate the over-provisioning waste through balanced consolidation
– Minimize the workload-packing waste by exploiting workload features
– Support adaptive control to optimize resource use & avoid interference

Virtualization 2.0 Strategy:

Thank you!
