Deterministic capacity planning
for OpenStack
Keith Basil
Principal Product Manager, Red Hat
Sean Cohen
Principal Product Manager, Red Hat
Tushar Katarki
http://sharpwriter.deviantart.com/art/Welcome-to-the-Internet-Please-Follow-me-322248378 http://creativecommons.org/licenses/by-nc-nd/3.0/
AGENDA
✦ OpenStack as an Elastic Cloud ✦ Determinism in Infrastructure ✦ Compute for Elastic Clouds ✦ Storage for Elastic Clouds ✦ Networking for Elastic Clouds ✦ Putting It All Together
Keith Basil
personal
Virginia hare scrambler, plays chess..
professional
Red Hat
Cloudscaling, Time Warner Cable, FederalCloud.com, Cisco and a couple of startups
blended
Sean Cohen
personal
Jazzman, oil painting & tennis...
professional
Red Hat
Dot Hill Systems, Cloverleaf Communications, VerticalNet
blended
Tushar Katarki
personal
Two kids and the wife, squash, hike/bike
professional
Red Hat
15 years in IT infrastructure development Sun Microsystems, Oracle
Hello..
I’m Your Elastic Cloud.
HELLO my name is
OpenStack ...
✦Is open source software with a vibrant community ✦Provides a framework for an elastic cloud
Elastic Cloud != Enterprise Virtualization
Elastic Cloud Workloads
✦Applications expect failure ✦Smaller stateless VMs
✦Applications scale out horizontally with
VMs of predetermined capacity
✦Lifecycle measured in hours to minutes
Enterprise Virt Workloads
✦Workloads NOT designed to tolerate failure ✦Larger stateful VMs
✦Workloads scale up within custom VMs
(more vCPU, vRAM)
✦Lifecycle measured in years
Scale Up
- Servers are like pets.
Scale Out
Difference in the resource requests?
I want 6 vCPUs, 4 GB and 120 GB disk please. 8)
I want an m1.small please 8)
One is user determined. One is provider determined.
I would like an m1.medium VM please!
Umm, Do I know you? I need to see some papers!!
Keystone
Ok, we need to find a place to build this VM.
Tag - you’re it!
Papers are good. Time to get to work!
Neutron, I need a network with all the trimmings!
Here’s your IP, default route and FW settings.
Cinder, have that volume ready for me?
Indeed I do. Don’t forget to mount it!
Hey Glance, can I get the RHEL 6.4 image?
Thank you OpenStack!! 8)
[Figure labels: Nova, Node, Neutron, Cinder, Swift, Glance, instance, capacity]
OpenStack in 2 Minutes!
Your Mission, Should You Choose to Accept It..
“If you’re going to do operations reliably, you need to make it reproducible and programmatic.”
“Applications are what matter. Anything that gets apps deployed faster and helps companies manage the proliferation of apps is good. Hence, DevOps.”
- Mark Imbriaco, VP of Ops, Digital Ocean
- Mike Loukides
http://sharpwriter.deviantart.com/art/Welcome-to-the-Internet-Please-Follow-me-322248378 http://creativecommons.org/licenses/by-nc-nd/3.0/
devOps headband, BOFH Slayer gun handle and OpenStack unicorn branding added for effect. Not for redistribution.
The goal is to
keep your devOps heroes in play!
Let's Break The Myth...
There is no such thing
as
“infinite scale” in cloud
computing
All computing requests, even for
virtualized resources, ultimately map to physical devices -> finite resources
✦ Every provider has limits, even if they’re massive.
✦ Adding the word Cloud simply squeezes the limit balloon
✦ It doesn’t eliminate the issue, even with “elasticity.”
✦ The service provider is responsible for risk mitigation of the
capacity it rents.
Why History matters..
✦Capacity planning and performance monitoring in the context
of Public providers:
✦Can be done only by understanding the history of a specific
cloud provider.
✦Requires both cloud performance application to understand
✦Current state of the provider
Cloud tenants have a service level expectation
Cloud Operators have business constraints
[Figure: Operators and Tenants bound by an Implicit Contract]
Capacity Planning in the Cloud
•Cloud users buy services based on capacity, protected by an SLA
•Cloud providers need deterministic capacity planning to support elastic growth
Deterministic Capacity Planning
✦Determinism is the best measure we have for predicting the effort and expense of making a process consistently performant
✦When your service becomes a critical part of a customer’s infrastructure, their fate becomes wedded to the SLAs you deliver.
✦ In Cloud Computing, the service’s performance will not be measured by its average speed but by the consistency of its speed
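One way to make "consistency over average" concrete is to compare the spread of two sets of measurements that share the same mean. The sketch below is illustrative only: the sample values are hypothetical throughput readings, not benchmark results, and the coefficient of variation is just one possible consistency metric.

```python
from statistics import mean, pstdev


def consistency_report(samples):
    """Summarize a series of performance samples.

    Returns the mean, the population standard deviation, the coefficient
    of variation (stdev / mean) and the worst (lowest) sample.
    """
    avg = mean(samples)
    spread = pstdev(samples)
    return {"mean": avg, "stdev": spread, "cv": spread / avg, "worst": min(samples)}


# Hypothetical throughput samples in MB/s, chosen so both means are equal.
steady = [98, 101, 100, 99, 102]
spiky = [40, 160, 100, 20, 180]

for name, samples in (("steady", steady), ("spiky", spiky)):
    report = consistency_report(samples)
    print(f"{name}: mean={report['mean']:.1f} "
          f"cv={report['cv']:.2f} worst={report['worst']}")
```

Both series average the same, but the second one has a far larger coefficient of variation, which is what a tenant actually experiences.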
Modeling Performance
✦Using this information, we’re able to more accurately
determine the capacity of a Public provider
✦ Monitoring performance spikes and valleys over time.
✦This means we can more accurately model for performance,
Benchmarks can provide useful insight for performance analysis and capacity planning
Deterministic Concepts & Goals
AWS and GCE as models
You want 2048, not Tetris®
✦ Scheduling made easy
✦ Scaling made easy
✦ Optimal hardware use
(no holes or hot spots)
How do we achieve determinism
for these core OpenStack
Compute Instance Family
Solving resource contention in Compute
CPU
Disk Memory
Proportion   n1-standard.class   m1.class     Size
1/1          n1-standard-8       m1.xlarge    xlarge
1/2          n1-standard-4       m1.large     large
1/4          n1-standard-2       m1.medium    medium
1/8          n1-standard-1       m1.small     small
We can take this approach with OpenStack
xlarge large
medium
small
Solve for the biggest VM in the class
We can easily derive the entire instance family because smaller instances are fractional proportions of the largest. This facilitates efficient hardware use and scheduling.
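A minimal sketch of "solve for the biggest VM, then derive the rest", assuming each step down the family is exactly half of the previous size. The xlarge figures used here (4 vCPUs, 15 GB memory, 1680 GB disk) are illustrative assumptions, not values taken from the slides, and `derive_family` is a hypothetical helper name.

```python
from fractions import Fraction

SIZES = ("xlarge", "large", "medium", "small")


def derive_family(largest, sizes=SIZES):
    """Return {size: spec}, where each size is half of the previous one.

    Fractions keep the proportions exact (1/1, 1/2, 1/4, 1/8).
    """
    family = {}
    for step, size in enumerate(sizes):
        share = Fraction(1, 2 ** step)
        family[size] = {res: amount * share for res, amount in largest.items()}
    return family


# Assumed spec for the largest instance in the class (illustrative only).
XLARGE = {"vcpus": 4, "ram_gb": 15, "disk_gb": 1680}

for size, spec in derive_family(XLARGE).items():
    print(size, {res: float(amount) for res, amount in spec.items()})
```

A fractional vCPU for the smallest size simply means a share of a core; whether that is acceptable depends on the CPU allocation policy chosen for the cloud.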
xlarge
Efficient Bin-Packing with Fractional Proportions
xlarge
Compute Hardware Node (general compute instance family)
128GB memory, (16) 1TB disks, (2) E5-2670 CPU
[Figure: compute node bin-packed with xlarge, large, medium and small instances]
Given the machine config above, it would support:
(4) n1-standard-8-d, (8) n1-standard-4-d, (16) n1-standard-2-d, (32) n1-standard-1-d
(8) m1.xlarge, (16) m1.large, (32) m1.medium, (64) m1.small
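The per-node counts can be sketched as a simple "most constrained resource wins" calculation. Everything numeric below is an assumption for illustration: the node is modeled as 32 hardware threads (two 8-core CPUs with hyper-threading), 128 GB memory and 16000 GB disk with no overcommit, and the xlarge spec (4 vCPUs, 15 GB, 1680 GB) is a stand-in rather than a value from the slides.

```python
from fractions import Fraction

# Assumed usable capacity of one compute node (no overcommit).
NODE = {"vcpus": 32, "ram_gb": 128, "disk_gb": 16000}

# Assumed spec for the largest instance in the family (illustrative only).
XLARGE = {"vcpus": 4, "ram_gb": 15, "disk_gb": 1680}

SHARES = {
    "xlarge": Fraction(1, 1),
    "large": Fraction(1, 2),
    "medium": Fraction(1, 4),
    "small": Fraction(1, 8),
}


def instances_per_node(node, largest, share):
    """Return (count, limiting_resource) for one instance size.

    The count is bounded by whichever resource runs out first.
    """
    counts = {res: int(node[res] // (largest[res] * share)) for res in largest}
    limiting = min(counts, key=counts.get)
    return counts[limiting], limiting


for size, share in SHARES.items():
    count, limiting = instances_per_node(NODE, XLARGE, share)
    print(f"{size}: {count} per node (limited by {limiting})")
```

Because every size is a fixed fraction of the largest, the count doubles at each step down and the limiting resource stays the same, which is what keeps the node free of holes.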
Efficient Scheduling with Fractional Proportions
[Figure: scheduling places each instance family on a matching node type, each node packed with xlarge, large, medium and small instances]
GENERAL COMPUTE NODE
General Purpose Instance Families
✦ n1-standard ✦ m1 ✦ A1 - A4
MEMORY OPTIMIZED NODE
Memory Optimized Instance Families
✦ n1-highmem ✦ m2,cr1 ✦ A5 - A7
CPU OPTIMIZED NODE
CPU Optimized Instance Families
✦ n1-highcpu ✦ c1,cc2,c3
Compute Calculator Intro
Designed to help determine optimal compute hardware configurations
✦Visually shows resource
constraints
✦Allows custom instance
families
Block Storage Volume Types
Solving resource contention in Block Storage
Throughput
General Storage Performance
What Are the Public Clouds Doing with Storage?
Performance Optimized –
✦ guaranteed IOPS (SSDs)
✦ IOPS per GB with low latency
✦ for I/O intensive workloads
✦ Billed by size and IO usage
Capacity Optimized (standard) –
✦no IOPS guarantees
✦workloads with moderate IO
✦Billed by size and IO usage
Blended Approach
(Performance Scaled with Capacity) –
✦ Ephemeral disks deprecated!
✦ IOPS scale with volume size
✦ Attached volume limits
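As a rough illustration of "IOPS scale with volume size", a blended tier can be modeled as a per-GB ratio clamped between a floor and a ceiling. The ratio, floor and ceiling below are hypothetical placeholders, not figures from any particular provider.

```python
def provisioned_iops(size_gb, iops_per_gb=3, floor=100, ceiling=3000):
    """IOPS granted to a volume whose performance scales with its size.

    All default values are hypothetical; substitute the tier's real policy.
    """
    return max(floor, min(ceiling, size_gb * iops_per_gb))


for size_gb in (10, 50, 500, 5000):
    print(f"{size_gb} GB -> {provisioned_iops(size_gb)} IOPS")
```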
Block Storage Classes in OpenStack
[Figure: Cinder scheduling places volumes on SSD-based and HDD-based storage nodes]
PERFORMANCE OPTIMIZED STORAGE NODE
Performance Optimized Storage
✦ all SSDs
THROUGHPUT OPTIMIZED STORAGE NODE
Throughput Optimized Storage
✦ fast SAS drives with RAID 5/6 ✦ throughput tuned network ✦ high bandwidth internal bus
GENERAL STORAGE NODE
Capacity (General) Optimized Storage
✦ larger SATA HDDs
Storage Tiers with OpenStack Cinder
8^)
Operators
RULE!
8^)
1. Define storage back ends
2. Create Volume Types
   ✦ General ✦ Performance ✦ Throughput
3. Create Volumes
   # cinder create \
       --volume_type IOPS_OPTIMIZED_TYPE \
       --display_name volume-1 50
[Figure labels: OPERATOR, TENANT]
✦ Raw capacity of the storage ✦ Replication
✦ RAID type
Capacity (General) Optimized Storage
RAID TYPE   2-Way Replication   3-Way Replication
RAID5       2.2                 3.3
RAID6       2.4                 3.6
RAID10      4                   n/a
Example:
Twelve (12), 1TB disks, configured for RAID6 and 2-way replication would yield 5.0TB of usable capacity.
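The example implies that the table values act as divisors on raw capacity (12 TB / 2.4 = 5.0 TB). Under that reading, which is an inference from the example rather than something the slide states, usable capacity can be sketched as follows. The function and table names are hypothetical.

```python
# Raw-to-usable divisors from the table, keyed by (RAID type, replica count).
# RAID10 with 3-way replication is listed as n/a, so it is left out.
OVERHEAD = {
    ("RAID5", 2): 2.2,
    ("RAID5", 3): 3.3,
    ("RAID6", 2): 2.4,
    ("RAID6", 3): 3.6,
    ("RAID10", 2): 4.0,
}


def usable_capacity_tb(disks, disk_tb, raid, replicas):
    """Usable capacity in TB for a given disk count, RAID type and replication."""
    factor = OVERHEAD.get((raid, replicas))
    if factor is None:
        raise ValueError(f"no overhead factor for {raid} with {replicas}-way replication")
    return disks * disk_tb / factor


# The slide's example: twelve 1 TB disks, RAID6, 2-way replication.
print(round(usable_capacity_tb(12, 1.0, "RAID6", 2), 1))
```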
✦ IOPS scale linearly with VM count ✦ Limits should be seen as triggers for
storage scale out
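Treating a limit as a trigger can be sketched as a check of projected IOPS demand against a fraction of what the back end can deliver. The 80% trigger ratio and the sample numbers are hypothetical, and the linear demand model simply restates the bullet above.

```python
def storage_scale_out_needed(vm_count, iops_per_vm, backend_iops_limit,
                             trigger_ratio=0.8):
    """Return True once projected demand reaches the scale-out trigger.

    Demand is modeled as growing linearly with VM count; trigger_ratio is
    a hypothetical safety margin below the back end's hard limit.
    """
    demand = vm_count * iops_per_vm
    return demand >= backend_iops_limit * trigger_ratio


# Hypothetical back end rated for 10000 IOPS, 50 IOPS per VM.
for vms in (100, 170):
    print(vms, "VMs ->", storage_scale_out_needed(vms, 50, 10000))
```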
Performance Optimized Storage
Write Latency
Throughput Optimized Storage
✦ Throughput response matters ✦ The Read/Write mix matters
Storage Planning
● Step 0: What is my Cloud Storage offering?
  ● Capacity Based
  ● Performance (IOPS) Based
  ● Throughput (Bandwidth) Based
● Step 1: What Storage Tiers do I need?
  ● Capacity Optimized, Performance Optimized, Throughput Optimized
● Step 2: Storage Capacity Planning
  ● Workload projections
  ● Performance Observations, Metrics to be optimized, and Calculators
● Step 3: Procure and Deploy
● Step 4: Manage and Steer
Core Network
Solving resource contention for the Network
Throughput
Resiliency Latency
Enterprise vs Cloud Fabric
Traditional Enterprise Topology Modern Cloud Friendly Topology
Network Elasticity is Required..
[Figure: Elastic Cloud Resource Map, a growing grid of compute NODE and BLOCK STORE units]
Because your cloud will grow..
Core Fabric Requirements
OpenStack friendly networking features:
✦Availability and Resiliency
(multi-path, per-flow routing)
✦Resource Node (compute/storage) Data Throughput
✦Network Latency
Spine and Leaf Topology
Ask your friendly network vendor for guidance
Cisco, ARISTA, Brocade, Juniper, Force10, etc.
Plan for the Resource Service Level
[Figure: Cloud Controller, Network Fabric, Compute/Storage, Resource Service Level]
High level architecture
[Figure: OpenStack Core Services (core services); Deterministic Resources (General Purpose Compute, Performance Storage, General (Capacity) Storage, Deterministic Network); Scale Out (as needed)]
Resources
✦ https://github.com/noslzzp/cloud-resource-calculator
✦ What is DevOps?
http://oreil.ly/1jBcsAu - free!
Open source tools include:
✦Graphite ✦Ganglia
Public Cloud Benchmarks
✦Cloudharmony.com ✦Cloudsleuth.com
Thank You!
Red Hat Enterprise Linux OpenStack Platform High Availability
Arthur Berezin — Technical Product Manager, Red Hat Wednesday, April 16
2:30 pm - 3:30 pm
Deploying Red Hat Enterprise Linux OpenStack Platform in the enterprise with FlexPod
Arthur Enright — Field Product Manager, Red Hat NetApp and Cisco
Wednesday, April 16 3:40 pm - 4:40 pm
Deep dive: OpenStack Compute
Steve Gordon — Technical Product Manager, Red Hat Thursday, April 17
9:45 am - 10:45 am