
Deterministic capacity planning for OpenStack


(1)

Deterministic capacity planning for OpenStack

Keith Basil

Principal Product Manager, Red Hat

Sean Cohen

Principal Product Manager, Red Hat

Tushar Katarki

(2)

http://sharpwriter.deviantart.com/art/Welcome-to-the-Internet-Please-Follow-me-322248378 http://creativecommons.org/licenses/by-nc-nd/3.0/

(3)

AGENDA

OpenStack as an Elastic Cloud
Determinism in Infrastructure
Compute for Elastic Clouds
Storage for Elastic Clouds
Networking for Elastic Clouds
Putting It All Together

(4)

Keith Basil

personal

Virginia hare scrambler, plays chess..

professional

Red Hat

Cloudscaling, Time Warner Cable, FederalCloud.com, Cisco and a couple of startups

blended

(5)

Sean Cohen

personal

Jazzman, oil painting & tennis...

professional

Red Hat

Dot Hill Systems, Cloverleaf Communications, VerticalNet

blended

(6)

Tushar Katarki

personal

Two kids and the wife, squash, hike/bike

professional

Red Hat

15 years in IT infrastructure development

Sun Microsystems, Oracle

(7)

Hello..

I’m Your Elastic Cloud.

HELLO, my name is

(8)

OpenStack ...

✦ Is open source software and a vibrant community
✦ Provides a framework for an elastic cloud

(9)

Elastic Cloud != Enterprise Virtualization

Elastic Cloud Workloads

✦ Applications expect failure
✦ Smaller stateless VMs
✦ Applications scale out horizontally with VMs of predetermined capacity
✦ Lifecycle measured in hours to minutes

Enterprise Virt Workloads

✦ Workloads NOT designed to tolerate failure
✦ Larger stateful VMs
✦ Workloads scale up within custom VMs (more vCPU, vRAM)
✦ Lifecycle measured in years

Scale Up

- Servers are like pets.

Scale Out

(10)

Difference in the resource requests?

I want 6 vCPUs, 4 GB RAM and a 120 GB disk please. 8)

I want an m1.small please. 8)

One is user determined. One is provider determined.

(11)

OpenStack in 2 Minutes!

Tenant: I would like an m1.medium VM please!
Keystone: Umm, do I know you? I need to see some papers!!
Keystone: Papers are good.
Nova: Ok, we need to find a place to build this VM.
Nova (to a node with capacity): Tag - you’re it!
Node: Time to get to work!
Node: Neutron, I need a network with all the trimmings!
Neutron: Here’s your IP, default route and FW settings.
Node: Cinder, have that volume ready for me?
Cinder: Indeed I do. Don’t forget to mount it!
Node: Hey Glance, can I get the RHEL 6.4 image?
Tenant: Thank you OpenStack!! 8)

[Also pictured: Swift, Glance, the instance, and nodes labeled with their capacity]

(12)

Your Mission, Should You Choose to Accept It..

“If you’re going to do operations reliably, you need to make it reproducible and programmatic.”

“Applications are what matter. Anything that gets apps deployed faster and helps companies manage the proliferation of apps is good. Hence, DevOps.”

- Mark Imbriaco, VP of Ops, Digital Ocean

- Mike Loukides

(13)

http://sharpwriter.deviantart.com/art/Welcome-to-the-Internet-Please-Follow-me-322248378 http://creativecommons.org/licenses/by-nc-nd/3.0/

devOps headband, BOFH Slayer gun handle and OpenStack unicorn branding added for effect. Not for redistribution.

The goal is to keep your devOps heroes in play!

(14)
(15)

Let's Break The Myth...

There is no such thing as “infinite scale” in cloud computing.

All computing requests, even for virtualized resources, ultimately map to physical devices, which are finite resources.

(16)

✦ Every provider has limits, even if they’re massive.

✦ Adding the word Cloud simply squeezes the limit balloon

✦ It doesn’t eliminate the issue, even with “elasticity.”

✦ The service provider is responsible for risk mitigation of the capacity it rents.

(17)
(18)

Why History matters..

✦ Capacity planning and performance monitoring in the context of Public providers:

✦ Can be done only by understanding the history of a specific cloud provider.

✦ Requires a cloud performance application to understand both:

✦ Current state of the provider

(19)

Cloud tenants have a service level expectation
Cloud Operators have business constraints

[Figure: the Implicit Contract between the Operator and the Tenants]

(20)

Capacity Planning in the Cloud

• Cloud users buy services based on capacity, protected by an SLA
• Cloud providers need deterministic capacity planning to support elastic growth

(21)

Deterministic Capacity Planning

✦ Determinism is the best measure we have for predicting the effort and expense of making a process consistently performant.

✦ When your service becomes a critical part of a customer’s infrastructure, their fate becomes wedded to the SLAs you deliver.

✦ In cloud computing, a service’s performance will be measured not by its average speed but by the consistency of its speed.
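
To make the last point concrete, here is a minimal Python sketch comparing two hypothetical latency samples that share the same average but differ sharply in consistency. The sample values, the function name `summarize`, and the choice of a nearest-rank 99th percentile are illustrative assumptions, not measurements from any provider.

```python
import statistics


def summarize(samples_ms):
    """Return mean, population std-dev and nearest-rank p99 of latency samples."""
    ordered = sorted(samples_ms)
    # Nearest-rank percentile with integer math: ceil(0.99 * n) - 1.
    p99_index = (99 * len(ordered) + 99) // 100 - 1
    return {
        "mean": statistics.mean(ordered),
        "stdev": statistics.pstdev(ordered),
        "p99": ordered[p99_index],
    }


# Hypothetical samples in milliseconds, invented for illustration only.
steady = [10] * 100             # every request takes 10 ms
spiky = [5] * 95 + [105] * 5    # mostly fast, with occasional long stalls

steady_stats = summarize(steady)
spiky_stats = summarize(spiky)
print(steady_stats)
print(spiky_stats)
```

Both samples average 10 ms, yet the second has a 99th percentile of 105 ms. A target written only against the average would not tell the two services apart.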

(22)

Modeling Performance

✦ Using this information, we’re able to more accurately determine the capacity of a Public provider

✦ Monitoring performance spikes and valleys over time.

✦ This means we can more accurately model for performance,

(23)

Benchmarks can provide useful insight for performance analysis and capacity planning

(24)

Deterministic Concepts & Goals

AWS and GCE as models

You want 2048, not Tetris®

✦ Scheduling made easy

✦ Scaling made easy

✦ Optimal hardware use (no holes or hot spots)

(25)

How do we achieve determinism

for these core OpenStack

(26)
(27)

Compute Instance Family

Solving resource contention in Compute

CPU

Disk Memory

(28)

Fraction  Size    n1-standard.class  m1.class
1/1       xlarge  n1-standard-8      m1.xlarge
1/2       large   n1-standard-4      m1.large
1/4       medium  n1-standard-2      m1.medium
1/8       small   n1-standard-1      m1.small

(29)

We can take this approach with OpenStack

xlarge, large, medium, small

Solve for the biggest VM in the class

We can easily derive the entire instance family because smaller instances are fractional proportions of the largest. This facilitates efficient hardware use and scheduling.
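
As a rough illustration of "solve for the biggest VM in the class", the sketch below derives a four-size family by halving the largest instance at each step (1/1, 1/2, 1/4, 1/8). The `XLARGE` numbers and the function name `derive_family` are hypothetical and chosen only so the arithmetic is easy to follow; they are not flavor definitions from the talk.

```python
def derive_family(largest, names=("xlarge", "large", "medium", "small")):
    """Derive each smaller size as 1/2, 1/4, 1/8 ... of the largest one."""
    family = {}
    for step, name in enumerate(names):
        divisor = 2 ** step  # 1, 2, 4, 8
        family[name] = {res: amount / divisor for res, amount in largest.items()}
    return family


# Hypothetical definition of the largest instance in the class.
XLARGE = {"vcpus": 8, "ram_gb": 32, "disk_gb": 320}

family = derive_family(XLARGE)
for name, spec in family.items():
    print(name, spec)
```

Only the largest size has to be designed against the hardware; every other size follows from it, so any mix of sizes still adds up to whole multiples of the largest.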

(30)

Efficient Bin-Packing with Fractional Proportions

Compute Hardware Node (general compute instance family)

128GB memory, (16) 1TB disks, (2) E5-2670 CPU

Given this machine config, it would support:

(4) n1-standard-8-d     (8) m1.xlarge
(8) n1-standard-4-d     (16) m1.large
(16) n1-standard-2-d    (32) m1.medium
(32) n1-standard-1-d    (64) m1.small

[Figure: the node filled with xlarge, large, medium and small instances]
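
The per-node counts can be reproduced with a short calculation: find how many of the largest instance fit on the node for each resource, take the minimum, and multiply up for the fractional sizes. In this sketch the node figures come from the slide (assuming one vCPU per hardware thread, 1 TB = 1000 GB, no oversubscription and no capacity reserved for the host), while the `XLARGE` size is an assumed value picked so the result lines up with the m1 column. None of these assumptions come from the talk itself.

```python
def instances_per_node(node, largest, names=("xlarge", "large", "medium", "small")):
    """Return ({size: count}, limiting_resource) for one hardware node."""
    # How many of the largest instance fit if only one resource mattered.
    fit_by_resource = {res: node[res] // need for res, need in largest.items()}
    limiting = min(fit_by_resource, key=fit_by_resource.get)
    largest_fit = fit_by_resource[limiting]
    # Each step down the family is half the size, so twice as many fit.
    counts = {name: largest_fit * 2 ** step for step, name in enumerate(names)}
    return counts, limiting


# Node from the slide: 2 x E5-2670 (8 cores / 16 threads each), 128 GB, 16 x 1 TB.
NODE = {"vcpus": 32, "ram_gb": 128, "disk_gb": 16000}

# Assumed size of the largest instance (illustrative, not from the talk).
XLARGE = {"vcpus": 4, "ram_gb": 15, "disk_gb": 1680}

counts, limiting = instances_per_node(NODE, XLARGE)
print(counts, "limited by", limiting)
```

Reporting the limiting resource alongside the counts shows which component to change when a hardware configuration leaves capacity stranded.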

(31)

Efficient Scheduling with Fractional Proportions

GENERAL COMPUTE NODE

General Purpose Instance Families

n1-standard ✦ m1 ✦ A1 - A4

MEMORY OPTIMIZED NODE

Memory Optimized Instance Families

n1-highmem ✦ m2, cr1 ✦ A5 - A7

CPU OPTIMIZED NODE

CPU Optimized Instance Families

n1-highcpu ✦ c1, cc2, c3

[Figure: scheduling places each instance family onto its matching node type, each node filled with xlarge, large, medium and small instances]
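
The idea of steering each instance family to its own node type can be sketched as a toy placement function. This is not the Nova scheduler: the family-to-class mapping, the slot accounting in units of one "small", and all node names are hypothetical and exist only to show the principle.

```python
# Cost of each size in "small" slots (small = 1/8 of an xlarge).
SLOT_COST = {"small": 1, "medium": 2, "large": 4, "xlarge": 8}

# Hypothetical mapping from instance family to node class.
NODE_CLASS_FOR_FAMILY = {"m1": "general", "m2": "memory", "c1": "cpu"}


def schedule(flavor, nodes):
    """Place a flavor such as 'm1.large' on the first matching node with room.

    Returns the node name, or None when no node of that class has capacity.
    """
    family, size = flavor.split(".")
    wanted_class = NODE_CLASS_FOR_FAMILY[family]
    cost = SLOT_COST[size]
    for node in nodes:
        if node["class"] == wanted_class and node["free_slots"] >= cost:
            node["free_slots"] -= cost
            return node["name"]
    return None


# Three tiny hypothetical nodes, each with room for one xlarge (8 slots).
nodes = [
    {"name": "general-01", "class": "general", "free_slots": 8},
    {"name": "memory-01", "class": "memory", "free_slots": 8},
    {"name": "cpu-01", "class": "cpu", "free_slots": 8},
]

requests = ("m1.large", "m1.large", "m1.small", "m2.xlarge", "c1.medium")
placements = [schedule(flavor, nodes) for flavor in requests]
print(placements)
```

Because every size is a power-of-two fraction of the largest, free capacity is always a whole number of small slots. A `None` result is an unambiguous signal that this node class needs to scale out.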

(32)

Compute Calculator Intro

Designed to help determine optimal compute hardware configurations

✦ Visually shows resource constraints

✦ Allows custom instance families

(33)
(34)

Block Storage Volume Types

Solving resource contention in Block Storage

Throughput

General Storage Performance

(35)

What Are the Public Clouds Doing with Storage?

Performance Optimized –

✦ guaranteed IOPS (SSDs)

✦ IOPS per GB with low latency

✦ for I/O intensive workloads

✦ Billed by size and IO usage

Capacity Optimized (standard) –

✦no IOPS guarantees

✦workloads with moderate IO

✦Billed by size and IO usage

Blended Approach (Performance Scaled with Capacity) –

✦ Ephemeral disks deprecated!

✦ IOPS scale with volume size

✦ Attached volume limits

(36)

Block Storage Classes in OpenStack

PERFORMANCE OPTIMIZED STORAGE NODE

Performance Optimized Storage

all SSDs

THROUGHPUT OPTIMIZED STORAGE NODE

Throughput Optimized Storage

fast SAS drives with RAID 5/6 ✦ throughput tuned network ✦ high bandwidth internal bus

GENERAL STORAGE NODE

Capacity (General) Optimized Storage

larger SATA HDDs

[Figure: Cinder scheduling onto each storage node type, with the nodes drawn as banks of SSDs and HDDs]

(37)

Storage Tiers with OpenStack Cinder

OPERATOR

1. Define storage back ends

2. Create Volume Types

✦ General ✦ Performance ✦ Throughput

TENANT

3. Create Volumes

    # cinder create \
        --volume_type IOPS_OPTIMIZED_TYPE \
        --display_name volume-1 50

(38)

Capacity (General) Optimized Storage

✦ Raw capacity of the storage
✦ Replication
✦ RAID type

Raw capacity divided by the factor below gives usable capacity:

RAID TYPE   2-Way Replication   3-Way Replication
RAID5       2.2                 3.3
RAID6       2.4                 3.6
RAID10      4                   n/a

Example: Twelve (12) 1TB disks, configured for RAID6 and 2-way replication, would yield 5.0TB of usable capacity (12TB / 2.4).
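
The example above works out as 12 TB / 2.4 = 5.0 TB, which suggests the factors listed for each RAID type are divisors applied to raw capacity. A minimal sketch under that reading follows; the function and variable names are illustrative, and treating the factors as divisors is an assumption that happens to match the slide's example.

```python
# Raw-to-usable divisors from the slide, keyed by (RAID type, replica count).
CAPACITY_DIVISOR = {
    ("RAID5", 2): 2.2,
    ("RAID5", 3): 3.3,
    ("RAID6", 2): 2.4,
    ("RAID6", 3): 3.6,
    ("RAID10", 2): 4.0,  # the slide lists RAID10 with 3-way replication as n/a
}


def usable_capacity_tb(disk_count, disk_size_tb, raid_type, replicas):
    """Estimate usable capacity in TB from raw capacity and the slide's divisors."""
    key = (raid_type, replicas)
    if key not in CAPACITY_DIVISOR:
        raise ValueError(f"no divisor listed for {raid_type} with {replicas}-way replication")
    raw_tb = disk_count * disk_size_tb
    return round(raw_tb / CAPACITY_DIVISOR[key], 1)


example_tb = usable_capacity_tb(12, 1, "RAID6", 2)
print(example_tb)
```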

(39)

Performance Optimized Storage

✦ IOPS scale linearly with VM count
✦ Limits should be seen as triggers for storage scale out

[Figure: Write Latency]
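
If IOPS demand grows linearly with VM count, treating a limit as a scale-out trigger becomes simple arithmetic. The sketch below is one way to express it. The per-VM IOPS figure, the per-node IOPS limit and the 80% trigger threshold are hypothetical planning inputs, not values from the talk.

```python
def storage_nodes_needed(vm_count, iops_per_vm, node_iops_limit, trigger_pct=80):
    """Return how many storage nodes keep each node below its trigger level."""
    demand = vm_count * iops_per_vm                        # linear in VM count
    usable_per_node = node_iops_limit * trigger_pct // 100
    # Integer ceiling division; always keep at least one node.
    return max(1, -(-demand // usable_per_node))


# Hypothetical inputs: 100 IOPS per VM, 50,000 IOPS per storage node.
for vms in (300, 400, 401, 1000):
    print(vms, "VMs ->", storage_nodes_needed(vms, 100, 50_000), "storage node(s)")
```

With these inputs the first node is enough up to 400 VMs. The 401st VM crosses the trigger and calls for a second node before the hard limit is reached.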

(40)

Throughput Optimized Storage

✦ Throughput response matters
✦ The Read/Write mix matters

(41)


Storage Planning

● Step 0: What is my Cloud Storage offering?
    ● Capacity Based
    ● Performance (IOPS) Based
    ● Throughput (Bandwidth) Based

● Step 1: What Storage Tiers do I need?
    ● Capacity Optimized, Performance Optimized, Throughput Optimized

● Step 2: Storage Capacity Planning
    ● Workload projections
    ● Performance Observations, Metrics to be optimized, and Calculators

● Step 3: Procure and Deploy

● Step 4: Manage and Steer

(42)
(43)

Core Network

Solving resource contention for the Network

Throughput

Resiliency Latency

(44)

Enterprise vs Cloud Fabric

Traditional Enterprise Topology Modern Cloud Friendly Topology

(45)

Network Elasticity is Required..

[Figure: Elastic Cloud Resource Map, a grid of compute NODEs with BLOCK STOREs among them]

(46)

Because your cloud will grow..

(47)

Core Fabric Requirements

OpenStack friendly networking features:

✦ Availability and Resiliency (multi-path, per-flow routing)

✦Resource Node (compute/storage) Data Throughput

✦Network Latency

(48)

Spine and Leaf Topology

Ask your friendly network vendor for guidance

Cisco, ARISTA, Brocade, Juniper, Force10, etc.

(49)
(50)
(51)

Plan for the Resource Service Level

[Figure: Cloud Controller, Network Fabric and Compute/Storage, with the Resource Service Level]

(52)

High level architecture

OpenStack Core Services
✦ Core services

Deterministic Resources
✦ General Purpose Compute
✦ Performance Storage
✦ General (Capacity) Storage
✦ Deterministic Network

Scale Out (as needed)

(53)
(54)

Resources

✦ https://github.com/noslzzp/cloud-resource-calculator

What is DevOps?

✦ http://oreil.ly/1jBcsAu - free!

Open source tools include:

✦ Graphite ✦ Ganglia

Public Cloud Benchmarks

✦ Cloudharmony.com ✦ Cloudsleuth.com

(55)

Thank You!

Red Hat Enterprise Linux OpenStack Platform High Availability
Arthur Berezin, Technical Product Manager, Red Hat
Wednesday, April 16, 2:30 pm - 3:30 pm

Deploying Red Hat Enterprise Linux OpenStack Platform in the enterprise with FlexPod
Arthur Enright, Field Product Manager, Red Hat
NetApp and Cisco
Wednesday, April 16, 3:40 pm - 4:40 pm

Deep dive: OpenStack Compute
Steve Gordon, Technical Product Manager, Red Hat
Thursday, April 17, 9:45 am - 10:45 am
