IBM Platform Computing Cloud Service Ready to use Platform LSF & Symphony clusters in the SoftLayer cloud

(1)

© 2014 IBM Corporation


IBM Platform Computing Cloud Service

Ready to use Platform LSF & Symphony clusters in the SoftLayer cloud

(2)


Agenda

• Mapping client needs to cloud technologies
• Addressing your pain points
• Introducing IBM Platform Computing Cloud Service
• Product features and benefits
• Use cases

(3)


HPC cloud characteristics and economics are different from general-purpose computing

• High-end hardware and special-purpose devices (e.g. GPUs) are typically used to supply the needed processing, memory, network, and storage capabilities
• The performance requirements of technical computing and service-oriented workloads mean that performance may be impacted in a virtualized cloud environment, especially when latency or I/O is a constraint
• HPC cluster/grid utilization is usually in the 70-90% range, removing a major potential advantage of a public cloud service provider for stable workload volumes
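The utilization argument can be made concrete with a small break-even sketch. The prices used here are hypothetical placeholders, not figures from this deck. The assumption is that an owned core costs the same whether it is busy or idle, so its effective cost per used core-hour is the amortized cost divided by utilization.

```python
def effective_cost_per_used_core_hour(owned_cost_per_core_hour, utilization):
    """Amortized cost of an owned core-hour, spread over the hours actually used."""
    if not 0 < utilization <= 1:
        raise ValueError("utilization must be in (0, 1]")
    return owned_cost_per_core_hour / utilization


def break_even_utilization(owned_cost_per_core_hour, cloud_rate_per_core_hour):
    """Utilization above which owning is cheaper than paying per use."""
    return owned_cost_per_core_hour / cloud_rate_per_core_hour


if __name__ == "__main__":
    OWNED = 0.05   # hypothetical amortized USD per core-hour on premise
    CLOUD = 0.10   # hypothetical pay-per-use USD per core-hour
    print(break_even_utilization(OWNED, CLOUD))
    for u in (0.3, 0.7, 0.9):
        print(u, effective_cost_per_used_core_hour(OWNED, u))
```

Under these placeholder prices, a cluster that stays in the 70-90% utilization range is well above the break-even point, which is why pay-per-use capacity is most attractive for peaks rather than for stable workload volumes.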

HPC Workloads Recommended for Private Cloud

HPC Workloads with Best Potential for Virtualized Public & Hybrid Cloud

(4)


IBM’s HPC cloud strategy provides a flexible approach to address a variety of client needs

Evolve existing infrastructure to HPC Cloud to enhance responsiveness, flexibility, and cost effectiveness.

Enable integrated approach to improve HPC cost and capability

60%

Access additional HPC capacity with variable cost model

Private Clouds / Hybrid Clouds / Public Clouds

Based on HPC Cloud’s potential impact, organizations are evolving their infrastructures to enable private cloud deployments, exploring hybrid clouds, and considering public clouds.

(5)


Are you experiencing any of these pain points?

•  Unable to meet business objectives (delay to market, etc.)
•  Existing resources insufficient to meet peak compute demand
   –  Long run times on existing cluster or grid
   –  No access to local technical computing resources (workstation users)
•  Technical resources expensive and time consuming to acquire
•  The skills/staff to architect and manage a technical computing infrastructure can be difficult to acquire

[Chart: Planned Daily Cycle (24 x 365), Financial Services. Demand by hour of day.]

[Chart: Planned Project. Demand for April, May and June.]

(6)


IBM Platform Computing Cloud Service

Making the cloud work for you

Build
•  Complete, ready to run clusters in the cloud
•  Add additional capacity in hours instead of months

Manage
•  Seamless workload management, on-premise and in the cloud
•  Transparent user experience

Support
•  24X7 cloud operation support
•  Access to technical computing expertise when you need it

Protect
•  Data encryption, dedicated physical machines and network
•  Security through physical isolation

(7)


Ready to use Platform LSF & Platform Symphony clusters in the cloud

IBM Platform Computing Cloud Service (SaaS)

IBM Platform LSF

IBM Platform Symphony

SoftLayer, an IBM Company

Infrastructure

24X7 CloudOps Support

(8)


Dedicated physical and virtual machine infrastructure as a service

•  13+ data centers

•  17 network PoPs

•  Global private network

•  Bare metal and virtual machines

190,000+ SERVERS

21,000+ CUSTOMERS

22,000,000+ DOMAINS

(9)


Ready to use Platform LSF & Platform Symphony clusters in the cloud

DIFFERENTIATOR / RATING / IBM ADVANTAGES

Workload I/O intensity
Rating scale (AWS, IBM): low intensity workloads to high intensity workloads
•  SoftLayer’s architecture outperforms equivalent AWS instances by >50% for high I/O workloads

Control (APIs, hardware / network configurability)
Rating scale: low degree of control and customization to high degree of control and customization
•  SoftLayer offers hundreds of hardware configurations vs. 14 for AWS
•  ~2,000 APIs for SoftLayer vs. ~60 for AWS and none for RAX

Integrated platform of multiple architectures
Rating scale: single platform to seamless integration
•  Unified integration & control panel for multiple cloud architectures
•  RAX requires paid bridge, different control interfaces

(10)


Non-shared physical machines for added security and performance

•  Dedicated and isolated compute environment

•  All machine instances are dedicated to the client

•  Each cluster is isolated on a VLAN

•  Only the VPN gateway has an addressable interface

•  All customer data at rest is encrypted on shared file systems

•  When machine instances are decommissioned, the disks are scrubbed using DoD-approved methods

(11)


Optimal performance for technical computing apps

Industrial Manufacturing Benchmark – Structural Mechanics

EDA Benchmark (IBM-MESA)

Note: Benchmark results were obtained by IBM and have not yet been externally audited or validated.

(12)


Run and supported by dedicated, 24X7 HPC Cloud Operations Team

CloudOps functions
•  Pre-provisioning: provide guidance to the client on how to enable VPN, multi-cluster settings & security settings in the client's on-premise environment
•  One-time setup testing: extensive testing of the cluster prior to release to the client
•  Extensive testing of the cluster on every flex-up event prior to release to the client
•  Email alerts prior to flex-down & cluster shutdown operations
•  Email alerts in case of any overage (compute hours, download bandwidth)
•  Provide billing details of monthly usage, including overage details
•  Provide support under IBM SLA by experts highly experienced in Platform Computing products

Value: quality, peace of mind & minimum disruption to business
•  Extensive quality checks ensure minimum loss of usage hours & disruptions
•  Proactive alerts ensure that in-progress critical jobs are not killed in case of flex-down, cluster shutdowns and overages
•  Highly trained & experienced support ensures smooth on-boarding and minimizes disruptions

(13)


Industry-leading workload management

•  20 years managing distributed scale-out systems with 2000+ customers in many industries
•  High performance workload management combined with intelligent resource scheduling engine
•  Unmatched scalability (small clusters to global grids) and production-proven reliability
•  Heterogeneous – manages System x and Power plus 3rd party systems, virtual and bare metal, accelerators / GPU, cloud, etc.
•  Shared services for both compute and data intensive workloads
•  Integrated solutions with vertical reference architectures

23 of 30 largest commercial enterprises

Over 5M CPUs under management

60% of top financial services companies

(14)


IBM Platform LSF

Overview

Powerful workload management for demanding, distributed and mission-critical high performance computing environments.

Key Capabilities

•  Powerful

-  Policy and resource-aware scheduling

-  Resource consolidation for optimal performance

-  Advanced self-management

•  Flexible

-  Heterogeneous platform support

-  Policy-driven automation

-  CLI, web services, APIs

•  Scalable

-  Thousands of concurrent users and jobs

-  Virtualized pool of shared resources

-  Flexible control, multiple policies

Client Benefits

•  Optimal utilization: reduced infrastructure cost

•  Robust capabilities: improved productivity

•  High throughput: faster time to results

(15)


IBM Platform Symphony

Overview

Low-latency grid management platform for distributed computing and analytics with sophisticated resource sharing

Key Capabilities
•  Accelerates service-oriented applications
•  Extreme app scalability and throughput with very low latency
•  Compute and data-intensive applications on a single platform
•  Sophisticated, hierarchical resource sharing
•  Open and flexible: choice of OS, frameworks and languages

Client Benefits
•  Increase performance and analytic result quality
•  Reduce IT costs: increase utilization, simplify application onboarding, reduce administration costs

Low Latency / High throughput: sub-millisecond, 17,000 tasks per second

Large Scale: 10k cores per application, 40k cores per grid

Efficient shared services

Heterogeneous & Open: Linux, Windows, AIX, C/C++, C#, Java, Excel, Python, R

(16)


Use case 1 – hybrid cluster

The problem
•  Existing resources cannot meet peak demand
•  Resources are expensive and time consuming to acquire
•  Skills to architect and manage clusters are difficult to find
•  Fixed or reduced budgets
•  On-premise constraints in space, cooling and power

The solution
•  Fully functioning IBM Platform LSF or Symphony clusters are provisioned on the SoftLayer cloud and connected to the on-premise cluster, expanding capacity as needed
•  Leverage MultiCluster capability for managed forwarding of jobs from on-premise cluster to off-premise cluster

The value
•  Access to additional compute capacity on a temporary basis as needed
•  Near-zero wait times
•  Reduce costs by paying for only what is used
•  Pay for additional capacity as an operating expense
•  Fully supported, end-to-end solution, from the on-premise to the on-cloud clusters
•  Expected and reliable performance from running technical computing workloads on physical machines
•  Transparent access to cloud resources; the end user experience does not change
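The idea of forwarding jobs from the on-premise cluster to the cloud cluster can be illustrated with a toy placement policy. This is a hypothetical sketch and not how Platform LSF MultiCluster actually decides: the `Job` class, the `dispatch` function and the slot counts are all invented for illustration. Each job runs locally if it fits, is forwarded to the cloud cluster otherwise, and stays pending if neither side has room.

```python
from dataclasses import dataclass


@dataclass
class Job:
    job_id: str
    slots: int  # number of slots (cores) the job requests


def dispatch(jobs, local_free_slots, cloud_free_slots):
    """Return a mapping of job_id to 'on-premise', 'cloud' or 'pending'."""
    placement = {}
    for job in jobs:
        if job.slots <= local_free_slots:
            # Prefer the on-premise cluster while it has room.
            local_free_slots -= job.slots
            placement[job.job_id] = "on-premise"
        elif job.slots <= cloud_free_slots:
            # Overflow: forward the job to the cloud cluster.
            cloud_free_slots -= job.slots
            placement[job.job_id] = "cloud"
        else:
            placement[job.job_id] = "pending"
    return placement


if __name__ == "__main__":
    queue = [Job("a", 8), Job("b", 8), Job("c", 16), Job("d", 32)]
    print(dispatch(queue, local_free_slots=16, cloud_free_slots=16))
```

From the submitting user's point of view nothing changes in this sketch: jobs go into one queue, and only the placement differs.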

(17)


Use case 2 – stand-alone cluster in the cloud

The problem
•  New and emerging need for technical computing
•  Skills to architect and manage clusters are difficult to find
•  Resources are expensive and time consuming to acquire
•  Inconsistent demand does not justify the investment

The solution
•  Fully functioning Platform LSF and Symphony clusters are provisioned on the SoftLayer cloud, providing resources as needed

The value
•  Market-leading Platform LSF and Platform Symphony software
•  Access to technical computing resources on a temporary basis without the need to acquire, install and configure the infrastructure and cluster software
•  Keep costs low by paying for only what is used
•  Pay for capacity as an operating expense
•  Fully supported solution

(18)


Is IBM Platform Computing Cloud Service a good fit for you?

Business pain points
•  Are you experiencing lost profit due to missed deadlines?
•  Do you experience pressure to convert your compute environment capital expense to operational expense?
•  Have you ever missed a deadline or delayed a project because technical computing resource procurement took too long?

Technology pain points
•  Do your users ever scale back their analyses to lower fidelity or less accuracy in order to fit them into the local compute environment or a time window?
•  Do you regularly, occasionally, or permanently have fewer resources (CPUs, disk, memory, etc.) than you would like to have to service your users' compute demand?
•  Do you experience a large variance in compute resource utilization?
•  Have you reached, or will you reach, the capacity of your datacenter(s), and do you need a plan to grow beyond that capacity?

(19)


IBM Platform Computing Cloud Service

Making the Cloud Work for You

Unmatched Expertise: Analytics, Technical Computing, Software, Services and ISV Partnerships

IBM Hybrid Cloud

Consolidation: Supporting heterogeneous IBM and non-IBM infrastructure

Cloud Leadership: Expertise from Client Engagements

Unmatched Capabilities: Policy-driven Workload Management

powered by

On SmartCloud / On Premise


(21)


SoftLayer and Amazon EC2 Products tested

NAME         IaaS Provider           CPU Cores  Memory (GB)  Disk Space (GB)  Physical / Virtual  Hourly Rate (USD)
SL PM        SoftLayer               16         64           1000[1]          Physical            $1.85[2]
SL VM        SoftLayer               8          8            500[3]           Virtual             $0.88
SL PM (ded)  SoftLayer               16         64           1000[1]          Physical            $3.83[5]
EC2 CC2      Amazon EC2 (CC2)        32         60.5         3360             Virtual             $2.40[4]
EC2 2XL      Amazon EC2 (c1.xlarge)  8          7            840              Virtual             $0.58

SL Physical Machine: Intel(R) Xeon(R) CPU E5-2650 0 @ 2.00GHz
SL Physical Machine (dedicated): Intel(R) Xeon(R) CPU E5-2690 0 @ 2.90GHz
SL Virtual Machine: Intel(R) Xeon(R) CPU E5-2650 v2 @ 2.60GHz
Amazon CCI2: Intel(R) Xeon(R) CPU E5-2670 0 @ 2.60GHz
Amazon 2XL: Intel(R) Xeon(R) CPU E5-2650 0 @ 2.00GHz
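To compare the listed configurations on a common basis, the hourly rate can be divided by the number of listed CPU cores. The sketch below does only that arithmetic on the table's figures. It is an illustration, not part of the original benchmark method, and it ignores the footnotes attached to some rates (their text is not included here) as well as differences in memory, disk and processor generation.

```python
# Hourly rate (USD) and listed CPU cores, copied from the products table.
HOURLY_RATE_USD = {
    "SL PM": 1.85,
    "SL VM": 0.88,
    "SL PM (ded)": 3.83,
    "EC2 CC2": 2.40,
    "EC2 2XL": 0.58,
}
CPU_CORES = {
    "SL PM": 16,
    "SL VM": 8,
    "SL PM (ded)": 16,
    "EC2 CC2": 32,
    "EC2 2XL": 8,
}


def rate_per_core_hour(name):
    """Hourly rate divided by the listed core count for one configuration."""
    return HOURLY_RATE_USD[name] / CPU_CORES[name]


if __name__ == "__main__":
    for name in HOURLY_RATE_USD:
        print(f"{name:12s} {rate_per_core_hour(name):.4f} USD per core-hour")
```

A lower rate per core-hour does not by itself mean better value; the price-performance charts that follow weigh the rate against measured run time.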

(22)


Memory Bandwidth

[Chart: STREAM (higher is better). Series: COPY, SCALE, ADD, TRIAD. Systems: SL PM, SL VM, EC2 CCI2, EC2 2XL, SL PM (ded).]

[Chart: STREAM Price Performance (higher is better). Series: COPY, SCALE, ADD, TRIAD. Systems: SL PM, SL VM, EC2 CCI2, EC2 2XL, SL PM (ded).]
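For readers unfamiliar with STREAM, the four series named in these charts correspond to four simple vector kernels. The sketch below shows what each kernel computes and how a bandwidth figure is derived from bytes moved and elapsed time, assuming 8-byte floating-point elements. It is a pure-Python illustration of the definitions: the function names are made up for this example, and interpreted Python is far too slow to measure real memory bandwidth.

```python
# Bytes moved per array element by each kernel, assuming 8-byte doubles:
# COPY and SCALE touch two arrays, ADD and TRIAD touch three.
BYTES_PER_ELEMENT = {"COPY": 16, "SCALE": 16, "ADD": 24, "TRIAD": 24}


def run_stream_kernels(a, b, c, scalar=3.0):
    """Apply the four STREAM kernels in order, updating the lists in place."""
    n = len(a)
    for i in range(n):  # COPY:  c = a
        c[i] = a[i]
    for i in range(n):  # SCALE: b = scalar * c
        b[i] = scalar * c[i]
    for i in range(n):  # ADD:   c = a + b
        c[i] = a[i] + b[i]
    for i in range(n):  # TRIAD: a = b + scalar * c
        a[i] = b[i] + scalar * c[i]


def bandwidth_mb_per_sec(kernel, n_elements, elapsed_sec):
    """Bandwidth in MB/s (1 MB = 10**6 bytes) for one pass of a kernel."""
    return BYTES_PER_ELEMENT[kernel] * n_elements / elapsed_sec / 1e6


if __name__ == "__main__":
    a, b, c = [1.0] * 4, [2.0] * 4, [0.0] * 4
    run_stream_kernels(a, b, c)
    print(a, b, c)
```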

(23)


CPU Performance

[Chart: SuperPI, elapsed time (lower is better). Systems: SL PM, SL VM, EC2 CCI2, EC2 2XL, SL PM (ded).]

[Chart: SuperPI Price-Performance, throughput per dollar (higher is better). Systems: SL PM, SL VM, EC2 CCI2, EC2 2XL, SL PM (ded).]

(24)


Network Bandwidth

[Chart: openMPI, bandwidth (Mbits/s) vs. message size (bytes), logarithmic axes. Systems: SL VM, EC2 2XL, EC2 CCI2, SL PM, SL PM Dedicated.]

(25)


Network Latency

[Chart: openMPI Latency, MPI on 2 nodes. Systems: SL VM, EC2 2XL, EC2 CCI2, SL PM, SL PM (ded).]

(26)


Input / Output Performance

[Chart: I/O Bandwidth - WRITE (higher is better), kB/sec vs. I/O file size (factor of memory size). Systems: SL VM, EC2 2XL, EC2 CCI2, SL PM, SL PM Ded.]

[Chart: I/O Bandwidth - READ (higher is better), kB/sec vs. I/O file size (factor of memory size). Systems: SL VM, EC2 CCI2, EC2 2XL, SL PM, SL PM Ded.]

(27)


Software Compilation

[Chart: Software Compile Performance, elapsed time (s) (lower is better). Systems: SL VM, SL PM, EC2 2XL, EC2 CCI, SL PM Ded.]

[Chart: Software Compile Price-Performance, runs / $. Systems: SL VM, SL PM, EC2 2XL, EC2 CCI, SL PM Ded.]

(28)


Life Science (BWA)

Life Sciences Benchmark (BWA), elapsed time in seconds (lower is better)

SL PM (ded)  20846.481
SL PM        26509.368
SL VM        25897.44
EC2 CCI2     22442.7
EC2 2XL      37491

Life Sciences Benchmark (BWA) Price Performance, $ / run

SL PM (ded)  22.21
SL PM        7.79
SL VM        6.33
EC2 CCI2     14.96
EC2 2XL      6.04
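The price-performance figures in this deck ($ / run, runs / $) can be related to the hourly rates with a simple formula: cost per run is the hourly rate times the elapsed time in hours. The sketch below assumes that this is how the slide's values were derived and that "EC2 CCI2" here is the instance listed as "EC2 CC2" in the products table. With those assumptions it reproduces the slide's $ / run values for SL VM, EC2 CCI2 and EC2 2XL. The two physical-machine values on the slide (7.79 and 22.21) do not follow exactly from the table rates, which carry footnotes whose text is not included here, so they are left out.

```python
# Hourly rates (USD) from the products table and BWA elapsed times (seconds)
# from this slide, for the three configurations the formula reproduces.
HOURLY_RATE_USD = {"SL VM": 0.88, "EC2 CCI2": 2.40, "EC2 2XL": 0.58}
BWA_ELAPSED_SEC = {"SL VM": 25897.44, "EC2 CCI2": 22442.7, "EC2 2XL": 37491.0}


def cost_per_run(hourly_rate_usd, elapsed_sec):
    """Cost in USD of one run billed at an hourly rate."""
    return hourly_rate_usd * elapsed_sec / 3600.0


def runs_per_dollar(hourly_rate_usd, elapsed_sec):
    """Inverse metric used on the 'Runs / $' charts."""
    return 1.0 / cost_per_run(hourly_rate_usd, elapsed_sec)


if __name__ == "__main__":
    for name, rate in HOURLY_RATE_USD.items():
        elapsed = BWA_ELAPSED_SEC[name]
        print(f"{name:9s} {cost_per_run(rate, elapsed):6.2f} $/run "
              f"{runs_per_dollar(rate, elapsed):.3f} runs/$")
```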

(29)


EDA Benchmark (IBM-MESA)

[Chart: EDA - IBM Mesa, elapsed time (sec) (lower is better). Systems: SL PM (ded), SL PM, SL VM, EC2 2XL, EC2 CCI2.]

[Chart: EDA - IBM Mesa - Price-Performance, runs / $. Systems: SL PM (ded), SL PM, SL VM, EC2 2XL, EC2 CCI2.]

(30)


Provisioning Time

[Chart: Provisioning Time (sec), logarithmic axis. Systems: SL PM, SL VM, EC2 CCI2, EC2 2XL, SL PM Ded.]

(31)


Industrial Manufacturing – Structural Mechanics

[Charts: speedup (relative to EC2 2XL) vs. CPUs, four panels: One Node - S4D, One Node - S6, Two Nodes - S4D, Two Nodes - S6. Systems: SL PM, EC2 CCI2, SL VM, EC2 2XL, SL PM (ded).]

(32)


Industrial Manufacturing – CFD

[Chart: OpenFoam Speedup Backplane (higher is better), speedup (relative to EC2 2XL) vs. # cores. Systems: SL PM (ded), SL PM, SL VM, EC2 CCI2, EC2 2XL.]

[Chart: OpenFoam Speedup Ethernet (higher is better), speedup (relative to EC2 2XL) vs. # cores. Systems: SL PM (ded), SL PM, SL VM, EC2 CCI2, EC2 2XL.]
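The speedup axis on the structural mechanics and CFD charts is expressed relative to EC2 2XL. Assuming the usual definition (baseline elapsed time divided by the elapsed time of the system under test), the calculation can be sketched as follows. The timings behind the speedup charts are not given in this text, so the BWA elapsed times reported earlier in the deck are used purely as sample input.

```python
# BWA elapsed times (seconds) from the Life Sciences slide, used only as
# sample input; the speedup charts themselves are for other benchmarks.
BWA_ELAPSED_SEC = {
    "SL PM (ded)": 20846.481,
    "SL PM": 26509.368,
    "SL VM": 25897.44,
    "EC2 CCI2": 22442.7,
    "EC2 2XL": 37491.0,
}


def speedup(baseline_elapsed_sec, elapsed_sec):
    """Speedup of a system relative to a baseline; values above 1 are faster."""
    if elapsed_sec <= 0:
        raise ValueError("elapsed time must be positive")
    return baseline_elapsed_sec / elapsed_sec


if __name__ == "__main__":
    baseline = BWA_ELAPSED_SEC["EC2 2XL"]
    for name, elapsed in BWA_ELAPSED_SEC.items():
        print(f"{name:12s} {speedup(baseline, elapsed):.2f}x")
```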
