
Cloud Performance Considerations


Disclaimer

This document represents the author's views and opinions.

It does not necessarily represent IBM's position or strategies.


Agenda

§ Why cloud computing

§ What is cloud computing

§ What are the business perspectives

§ What is different about the cloud

§ Open questions

IT Costs are Increasing

[Figure: spending (US$B, $0 to $300) and installed base (M units, 0 to 50) for new server spending, server management and administration costs, and power and cooling costs. Source: IDC, 2008]

§  The cost to manage systems has doubled since 2000

§  The cost to power and cool systems has doubled since 2000

§  Devices accessing data over networks are doubling every 2.5 years

§  Bandwidth consumed is doubling every 1.5 years

§  Data is doubling every 18 months [1]

§  Server processing capacity is doubling every 3 years [2]

§  10G Ethernet ports tripling

[1] WW TB capacity shipped on enterprise disk storage systems
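The doubling periods above are easier to compare once converted to an approximate annual growth rate. The following is a minimal sketch, assuming steady exponential growth (an assumption, not something the slide states); the function name `annual_growth` is illustrative.

```python
def annual_growth(doubling_years: float) -> float:
    """Annual growth rate implied by a doubling period,
    assuming steady exponential growth."""
    return 2 ** (1.0 / doubling_years) - 1.0


# Doubling periods quoted on the slide, in years.
periods = {
    "devices accessing data over networks": 2.5,
    "bandwidth consumed": 1.5,
    "data": 1.5,  # 18 months
    "server processing capacity": 3.0,
}

for name, years in periods.items():
    print(f"{name}: about {annual_growth(years):.0%} per year")
```

Under that assumption, data and bandwidth grow by roughly 59% per year while server processing capacity grows by roughly 26% per year, so demand outpaces capacity.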

What's Driving Cloud Computing?

1.  Cost Reduction:

    1.  Efficiency: virtual resources for hardware utilization (memory, disk, machines)

    2.  Sharing of hardware/maintenance: multitenancy for cost reduction

    3.  Automation: automate mundane tasks

    4.  Commodity hardware for most public clouds

    – Cloud: Highly virtualized with many users sharing the same hardware

2.  Technology Maturity Cycle

    1.  New: Wow, it works!

    2.  Commercialization: Will it make money long term?

    3.  Good enough: Functionality is good enough for the majority of users. Users have a lower tolerance for poor ease of use, care less about the technical details, etc.

    4.  Standardization: If users don't care about technical details, we can standardize and virtualize.

    5.  Business: Focus higher in the solution stack

    – Cloud: Companies that are moving to the cloud are focusing on their business, not technology.

3.  Payment model: Pay per use to lower the bar to adoption

    1.  Pay up front for all required capital, vs.

    2.  Finance terms (deferred financial cost), vs.

    3.  Pay per use (for public cloud)

    – Cloud: Pay per use with immediate time to value

Is Cloud Computing Growing?

[Figure: mind share and market share]


What is Different about the Cloud

Data center

•  Customers buy hw and sw

•  10s to 100s of hw servers

•  Servers are in silos

•  Enterprise applications

•  Few failures

•  Heterogeneous hw

Cloud

•  Customers rent hw and sw

•  1000s to 10,000s of hw servers

•  Elastic capacity (+/- servers)

•  Enterprise and other apps

•  Constant failures

•  Commodity hw

•  Quality of Experience (QoE) is very important to customers

•  Users run on virtualized hw

Grid

•  Customers buy hw and sw

•  100s to 1000s of hw servers

•  Shared servers

•  Mostly batch apps

•  Need to account for failures

•  Homogeneous hw

"By 2012, one out of five businesses will own no IT assets at all." Gartner, 01/18/2010

http://www.gartner.com/it/page.jsp?id=1278413

Is Performance Important to the Success of the Cloud?

§  Five of the 10 obstacles and opportunities for cloud computing are related to quality-of-service aspects such as availability, performance, capacity or scalability.

§  Obstacle #1 "Availability of service" discusses availability risks for cloud computing as a result of, e.g., programming errors, overload of common services or Distributed Denial of Service (DDoS) attacks

§  Obstacle #4 "Data transfer bottlenecks" discusses the growing data intensity of applications and how this impacts data transfer rates and costs in the cloud

§  Obstacle #5 "Performance unpredictability" discusses performance risks caused by, e.g., inefficiencies in I/O sharing and by high performance computing

§  Obstacle #6 "Scalable storage" discusses the difficulties of applying cloud computing to solutions requiring highly scalable persistent storage

§  Obstacle #8 "Scaling quickly" discusses the difficulties of quickly scaling up and down in response to load without violating service level agreements.


IBM offers highly integrated cloud solutions for different client requirements regarding workloads, service levels and delivery models

Workloads determine type and fit of Cloud Services

[Figure: workloads plotted by gain (low to high) against pain (low to high)]

Service Level expectations require different Cloud Management Services (Tier 1 to Tier 4)

•  Availability

•  Redundancy

•  Monitoring

•  End to End Process Mgmt

•  Core Infrastructure Services

•  Server Management

•  Storage Management

•  Security, Patch, Risk

•  Problem/Change

•  Audit Checking

•  Software License Mgmt

•  Application Management

•  Compliance Checking

•  MW and DBMS Services

•  Network Connectivity

•  Help Desk

•  Business Continuity

Different Cloud Delivery Models accommodate different needs regarding architectural control, operations and asset ownership (Delivery Models 1 to 5)

•  Private Cloud: enterprise data center

•  Managed Private Cloud: enterprise data center, IBM operated

•  Hosted Private Cloud: IBM owned and operated

•  Shared Cloud Services: Enterprise A, Enterprise B, Enterprise C

•  Public Cloud Services: User A, User B, User C, User D, User E

What are the Layers in the Cloud

Infrastructure as a Service

•  Servers, Networking, Storage, Data Center Fabric

•  Shared virtualized, dynamic provisioning

Platform as a Service

•  Middleware, Database, High Volume Transactions

•  Web 2.0 Application Runtime, Java Runtime

•  Development Tooling

Software as a Service

•  Collaboration, Business Processes, CRM/ERP/HR, Industry Applications


Is the Cloud More Complex: Virtualization

[Figure: a normal server stack (Operating System, JVM, Application server, Application) with a queue at each layer, beside a virtualized server (Operating System, Hypervisor, and three Guest OS stacks, each with JVM, Application server, Application) with two new queues]

§  Multiple hardware and software queues in a normal server

§  Virtualization adds two new queues (guest OS and hypervisor), forming a network of software queues

§  Memory and disk space are fixed resources that are shared even more
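As a rough illustration of why two extra queues matter, the sketch below adds up the mean time a request spends in each layer, treating every layer as an independent M/M/1 queue. That independence is a simplifying assumption that ignores feedback between layers, and the layer names, service rates, and load are hypothetical values, not measurements.

```python
def mm1_residence(arrival_rate: float, service_rate: float) -> float:
    """Mean time in an M/M/1 queue (waiting plus service): 1 / (mu - lambda)."""
    if arrival_rate >= service_rate:
        raise ValueError("unstable queue: arrival rate must be below service rate")
    return 1.0 / (service_rate - arrival_rate)


def stack_latency(arrival_rate: float, service_rates: dict) -> float:
    """Sum of per-layer residence times, assuming independent layers."""
    return sum(mm1_residence(arrival_rate, mu) for mu in service_rates.values())


# Hypothetical service rates in requests per second for each layer.
native = {"os": 2000.0, "jvm": 1500.0, "app_server": 1200.0, "application": 1000.0}
# Virtualization adds the hypervisor and guest OS queues to the same stack.
virtualized = {**native, "hypervisor": 2500.0, "guest_os": 2000.0}

load = 800.0  # requests per second (hypothetical)
print(f"native stack:      {stack_latency(load, native) * 1000:.2f} ms")
print(f"virtualized stack: {stack_latency(load, virtualized) * 1000:.2f} ms")
```

Even in this simple model, each added queue contributes latency that grows sharply as the load approaches that layer's service rate.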

Is the Cloud More Complex: Scale Out and Network Functions

§  Network is a critical resource for persistent storage, input and output traffic

§  Network attached storage is a shared pool of multiple storage pods

[Figure: three virtualized hosts (Operating System, Hypervisor, and three Guest OS stacks each) connected to two Network Attached Storage units, with two new queues]

Is the Cloud More Complex: Virtual Machine Mobility

§  VMs leave, appear, move, grow

[Figure: three virtualized hosts connected to two Network Attached Storage units, with one additional Guest OS stack (JVM, Application server, Application) shown outside the hosts]

IBM CloudBurst Appliance

[Figure: rack layout of the appliance: BladeCenter chassis with HS22 blade servers, an x3650 M2 management node, DS3400 storage with EXP3000 expansion units, Ethernet and Fibre Channel switch modules, and PDUs, connected to the customer network]

§  Compute, Network, and Storage resources are integrated into the appliance

How is Cloud Performance Analysis Done

§  Dynamic modeling is required to characterize non-locality due to feedback between layered subsystems

–  Classical queuing theory is not that helpful

–  Discrete event simulation approaches are needed

[Figure: Servers, Switches, NAS. Labels: "Backpressure"; "NAS bottleneck shows up at ..."]

A bottleneck at the NAS may slow the execution at the server due to backpressure
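The backpressure effect can be sketched with a very small simulation. The example below is an illustration under stated assumptions, not the author's model: a saturated server stage feeds a NAS stage through a finite buffer, service times are exponential, and job departure times are computed with a job-by-job recurrence (a compact form of discrete event simulation). All names and parameter values are hypothetical.

```python
import random


def simulate(n_jobs, server_time, nas_time, buffer_slots, seed=0):
    """Two-stage line (server -> NAS) with a finite buffer in between.

    A job that finishes at the server cannot leave until the NAS side
    has room, so a slow NAS blocks the server (backpressure).
    """
    rng = random.Random(seed)
    left_server = []  # time each job leaves the server stage
    left_nas = []     # time each job leaves the NAS stage
    blocked = 0.0     # total time the server sat blocked
    for n in range(n_jobs):
        start = left_server[n - 1] if n else 0.0
        done = start + rng.expovariate(1.0 / server_time)
        # The NAS side holds buffer_slots waiting jobs plus one in service,
        # so job n needs job n - buffer_slots - 1 to have left the NAS.
        k = n - buffer_slots - 1
        room_at = left_nas[k] if k >= 0 else 0.0
        leave = max(done, room_at)
        blocked += leave - done
        left_server.append(leave)
        nas_start = max(leave, left_nas[n - 1] if n else 0.0)
        left_nas.append(nas_start + rng.expovariate(1.0 / nas_time))
    return {
        "throughput": n_jobs / left_nas[-1],
        "server_blocked_fraction": blocked / left_server[-1],
    }


fast_nas = simulate(20000, server_time=1.0, nas_time=0.5, buffer_slots=2)
slow_nas = simulate(20000, server_time=1.0, nas_time=2.0, buffer_slots=2)
print("fast NAS:", fast_nas)
print("slow NAS:", slow_nas)
```

With the slow NAS, the server spends a large share of its time blocked even though its own service time is unchanged, which is how a storage bottleneck can look like a server problem.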


The Cloud Performance Challenge

§  Quality of Experience (QoE) depends upon (hybrid) cloud service performance

–  Excellent QoE accelerates adoption and is a functional requirement

–  QoE crosses boundaries of internet, network, system, application performance and resilience

§  Competitive pressure will require competitive performance from all vendors to keep customers

–  IaaS and PaaS paradigms allow customers to move (e.g., for price, QoE, etc.)

–  e.g., Amazon EC2 and IBM Compute Cloud can run the same software

–  QoS and SLAs are an important differentiator

§  Performance of the cloud will evolve to near real-time business

–  Communication needs are near real-time for correctness

–  Complex event processing needs to be done quickly to be useful

Great engineering comes from creating predictable results at predictable costs…

§  Cloud computing is a new paradigm which will have new performance challenges

–  It incorporates prior component performance challenges too

–  Hybrid clouds expand this further (e.g., network hops / latency)

–  Customer expectations will require education

Open Question: Comparing Cloud Performance

§  It can't!!

§  There aren't any industry-defined benchmarks because the workload classes vary greatly and have dynamic lifetimes

§  And a benchmark needs to include cost and availability as key factors

§  Perhaps a benchmark framework is needed that workloads are plugged into?

§  Perhaps a meta-benchmark analysis is needed to provide a score?
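One way to picture a pluggable framework with a meta-score is sketched below. Everything here is hypothetical: the `Result` fields, the `value` formula (throughput weighted by availability and divided by cost), and the geometric mean used for the overall score are illustrative choices, not an established benchmark.

```python
from dataclasses import dataclass
from math import prod
from typing import Callable, Dict


@dataclass
class Result:
    throughput: float     # work units per second
    cost_per_hour: float  # price of the resources used
    availability: float   # fraction of time the service was usable, 0..1


# Hypothetical plug-in interface: a workload takes a cloud name, returns a Result.
Workload = Callable[[str], Result]


def value(r: Result) -> float:
    """Performance per unit cost, discounted by availability."""
    return r.throughput * r.availability / r.cost_per_hour


def run_suite(cloud: str, workloads: Dict[str, Workload]) -> Dict[str, Result]:
    """Run every plugged-in workload against one cloud."""
    return {name: run(cloud) for name, run in workloads.items()}


def meta_score(results: Dict[str, Result], baseline: Dict[str, Result]) -> float:
    """Geometric mean of per-workload value ratios against a baseline."""
    ratios = [value(results[name]) / value(baseline[name]) for name in baseline]
    return prod(ratios) ** (1.0 / len(ratios))


# Canned numbers standing in for real measurements.
baseline = {"web": Result(100.0, 1.0, 0.99), "batch": Result(50.0, 2.0, 0.999)}
candidate = {"web": Result(400.0, 1.0, 0.99), "batch": Result(50.0, 2.0, 0.999)}
print(f"meta-score vs. baseline: {meta_score(candidate, baseline):.2f}")
```

A geometric mean keeps one workload with a very large ratio from dominating the score, but any real framework would still have to agree on workload classes and weights.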

Open Question: Central storage vs. Local Disks vs. Combination vs. New …

Central shared storage (SAN or NFS)

(+) Provisioning is fast

(+) Live migration is supported

(-) VM disk I/O is slow due to disk and network contention

[Figure: host machine (Guest OS, Host OS, Hypervisor) attached to central shared storage, which holds the image repository (Image 1, Image 2) and the virtual disk store (root disk, data disk); images are copied within central storage]

Local disks

(+) VM disk I/O is fast

(-) Provisioning is slow due to network image copying

(-) Live migration is not supported

[Figure: host machine (Guest OS, Host OS, Hypervisor) with local root disk and data disks; images (Image 1, Image 2) are copied from the image repository on a repository server]
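The provisioning cost of local disks comes down to simple arithmetic on image size and network bandwidth. The sketch below is a back-of-the-envelope estimate with hypothetical numbers; it assumes concurrent copies share the link equally and ignores protocol overhead and disk write speed.

```python
def copy_seconds(image_gb: float, link_gbit_per_s: float, concurrent_copies: int = 1) -> float:
    """Time to copy one image when concurrent copies share the link equally."""
    image_gbit = image_gb * 8  # gigabytes to gigabits
    return image_gbit * concurrent_copies / link_gbit_per_s


# Hypothetical 10 GB image.
print(copy_seconds(10, 1))      # one copy on a 1 Gbit/s link
print(copy_seconds(10, 1, 4))   # four simultaneous copies on the same link
print(copy_seconds(10, 10, 4))  # four simultaneous copies on a 10 Gbit/s link
```

Provisioning many VMs at once multiplies the copy time on a shared link, which is the cost that central storage avoids at the price of slower run-time disk I/O.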

Open Question: Optimal Approaches for Bin Packing and Moving VMs

When deploying services in a cloud, a balance must be found between performance and capacity of the service, and the memory available on nodes. This is further complicated if the number of replicas of an application is limited, for instance by the available number of licenses. The analysis of interference between services must scale to large numbers of host nodes, applications, replicas of applications, and classes of users. This paper combines a multi-dimensional packing heuristic and network flow optimization to satisfy simultaneous constraints on throughputs, processor utilizations, memory availability and license availability, at a minimum cost and with a minimum of host processors.

Jim Zhanwen Li, John Chinneck, Murray Woodside, and Marin Litoiu. 2009. Deployment of Services in a Cloud Subject to Memory and License Constraints. In Proceedings of the 2009
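To make the packing problem concrete, here is a minimal sketch of a first-fit-decreasing heuristic over two dimensions (CPU and memory) with a per-application license cap. This is a generic textbook-style heuristic chosen for illustration, not the method of the cited paper, which also uses network flow optimization; all names and numbers are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Host:
    cpu: float  # remaining CPU capacity
    mem: float  # remaining memory capacity
    placed: list = field(default_factory=list)


def place(replicas, host_cpu, host_mem, license_limit):
    """First-fit decreasing over (app, cpu, mem) replicas.

    Returns (hosts, rejected). A replica is rejected when its application
    has no license left or when it cannot fit on an empty host.
    """
    hosts, rejected, used = [], [], {}
    # Largest dominant resource share first.
    order = sorted(replicas, key=lambda r: max(r[1] / host_cpu, r[2] / host_mem), reverse=True)
    for app, cpu, mem in order:
        no_license = used.get(app, 0) >= license_limit.get(app, float("inf"))
        if no_license or cpu > host_cpu or mem > host_mem:
            rejected.append((app, cpu, mem))
            continue
        for host in hosts:
            if host.cpu >= cpu and host.mem >= mem:
                break
        else:
            host = Host(host_cpu, host_mem)  # open a new host
            hosts.append(host)
        host.cpu -= cpu
        host.mem -= mem
        host.placed.append(app)
        used[app] = used.get(app, 0) + 1
    return hosts, rejected


replicas = [("web", 2, 4)] * 3 + [("db", 2, 12)] * 2
hosts, rejected = place(replicas, host_cpu=4, host_mem=16, license_limit={"db": 1})
print(len(hosts), [h.placed for h in hosts], rejected)
```

In this example the single database license forces one replica to be rejected, and the remaining four replicas fit on two hosts.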

Open Question: Performance Fault Diagnosis and Analysis

§  Intermittent backpressure causes lower level hw and/or sw to slow down

§  The problem may appear to move if it is caused by a VM and the VM moves

§  The problem may appear to move if it is caused by a VM and the problem VM dies

§  The problem may appear to move if it is caused by a VM and the problem VM starts up

§  The problem may appear to move if it is caused by hw and the VM moves

§  Several VMs may show the same symptom separated in space and time

§  What data and how much to monitor, with 10^4 to 10^5 elements

§  Expert system / analytics are needed to help in the identification of problems

§  Extend analysis to predict hw failures before they occur
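A toy version of the "does the problem follow the VM or stay with the hardware" question is sketched below. It is only a heuristic illustration with hypothetical sample data and names: it compares how often each VM and each host appears in slow observations and reports the element with the highest rate.

```python
from collections import defaultdict


def slow_rates(samples):
    """samples: iterable of (vm, host, is_slow) observations.

    Returns ({vm: slow_rate}, {host: slow_rate}).
    """
    by_vm, by_host = defaultdict(list), defaultdict(list)
    for vm, host, is_slow in samples:
        by_vm[vm].append(is_slow)
        by_host[host].append(is_slow)

    def rate(flags):
        return sum(flags) / len(flags)

    return ({vm: rate(f) for vm, f in by_vm.items()},
            {host: rate(f) for host, f in by_host.items()})


def prime_suspect(samples):
    """Element (VM or host) most consistently associated with slowness."""
    vm_rates, host_rates = slow_rates(samples)
    candidates = [("vm", name, r) for name, r in vm_rates.items()]
    candidates += [("host", name, r) for name, r in host_rates.items()]
    return max(candidates, key=lambda c: c[2])


# vm-a is slow on host-1 and is still slow after moving to host-2.
follows_vm = [
    ("vm-a", "host-1", True),
    ("vm-b", "host-1", False),
    ("vm-a", "host-2", True),
    ("vm-c", "host-2", False),
]
print(prime_suspect(follows_vm))
```

A real diagnosis system would need time windows, migration events, and far more robust statistics, but the same idea of attributing symptoms to VMs versus hosts applies.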

Open Question: The CAP Theorem and Performance

§  Three properties of shared-data, distributed systems

1.  Consistency: once an update is made, all observers see it

2.  Availability: all database transactions should be processed accurately and promptly

3.  Partition tolerance: the system tolerates network partitions

§  CAP Theorem

–  Only two of the three properties can be achieved at any one time

–  Network partitions are a given in distributed systems

–  So one has to pick between consistency and availability

§  How will distributed architectures change to optimize for each pair of properties?

–  Eventual consistency, non-relational databases?

Gilbert, Seth, and Nancy Lynch. "Brewer's conjecture and the feasibility of consistent, available, partition-tolerant web services." ACM SIGACT News, v. 33 issue 2, 2002, p. 51-59.
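The choice between consistency and availability during a partition can be shown with a toy replicated value. This is a deliberately simplified sketch with hypothetical names: in "CP" mode a write is refused unless a majority of replicas is reachable, while in "AP" mode the write is accepted on whatever replicas can be reached and the replicas diverge.

```python
class Replica:
    def __init__(self):
        self.value = None


def write(replicas, reachable, value, mode):
    """Try to write `value` to the replicas at the `reachable` indices.

    mode "CP": refuse the write without a majority (gives up availability).
    mode "AP": always accept (gives up consistency during a partition).
    """
    majority = len(replicas) // 2 + 1
    if mode == "CP" and len(reachable) < majority:
        return False
    for i in reachable:
        replicas[i].value = value
    return True


# A partition leaves the client able to reach only replica 0 of 3.
cp = [Replica() for _ in range(3)]
print(write(cp, [0], "x=1", "CP"), [r.value for r in cp])

ap = [Replica() for _ in range(3)]
print(write(ap, [0], "x=1", "AP"), [r.value for r in ap])
```

The "AP" replicas must later be reconciled, which is the territory of eventual consistency.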

[Figure: cloud reference architecture with three roles]

Cloud Service Consumer

•  Consumer In-house IT

•  Cloud Service Integration Tools

Cloud Service Provider

•  Cloud Services

•  Common Cloud Management Platform: OSS – Operational Support Services, BSS – Business Support Services

•  Virtualized Infrastructure – Server, Storage, Network, Facilities

•  Security & Resiliency

Cloud Service Developer

•  Service Development Tools

e.g. Service Activation

•  process optimization

e.g. Provisioning

•  image copy

•  instance creation

•  partitioning

e.g. Run Time Performance

•  Integration of storage, hypervisor, network components

•  Dedicated nodes

© Copyright IBM Corporation 2010. All rights reserved.

U.S. Government Users Restricted Rights - Use, duplication or disclosure restricted by GSA ADP Schedule Contract with IBM Corp.

THE INFORMATION CONTAINED IN THIS PRESENTATION IS PROVIDED FOR INFORMATIONAL PURPOSES ONLY. WHILE EFFORTS WERE MADE TO VERIFY THE COMPLETENESS AND ACCURACY OF THE INFORMATION CONTAINED IN THIS PRESENTATION, IT IS PROVIDED AS IS WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED. IN ADDITION, THIS INFORMATION IS BASED ON IBM'S CURRENT PRODUCT PLANS AND STRATEGY, WHICH ARE SUBJECT TO CHANGE BY IBM WITHOUT NOTICE. IBM SHALL NOT BE RESPONSIBLE FOR ANY DAMAGES ARISING OUT OF THE USE OF, OR OTHERWISE RELATED TO, THIS PRESENTATION OR ANY OTHER DOCUMENTATION. NOTHING CONTAINED IN THIS PRESENTATION IS INTENDED TO, NOR SHALL HAVE THE EFFECT OF, CREATING ANY WARRANTIES OR REPRESENTATIONS FROM IBM (OR ITS SUPPLIERS OR LICENSORS), OR ALTERING THE TERMS AND CONDITIONS OF ANY AGREEMENT OR LICENSE GOVERNING THE USE OF IBM PRODUCTS AND/OR SOFTWARE.
