Virtual InfiniBand Clusters for HPC Clouds


1 10.04.2012 Marius Hillenbrand - Virtual InfiniBand Clusters for HPC Clouds SYSTEM ARCHITECTURE GROUP, STEINBUCH CENTRE FOR COMPUTING

SYSTEM ARCHITECTURE GROUP, STEINBUCH CENTRE FOR COMPUTING

Virtual InfiniBand Clusters for HPC Clouds

April 10, 2012

Marius Hillenbrand, Viktor Mauch, Jan Stoess, Konrad Miller, Frank Bellosa

KIT – University of the State of Baden-Wuerttemberg and

(2)

High Performance Computing + Clouds?

HPC Applications

Weather forecast, crash test simulations

Today in use in all scientific disciplines

Supercomputers / HPC Clusters

Owned and operated by single institutions

Fixed and inflexible run-time environments

Cloud Promise: Infrastructure-as-a-Service

Rent a service instead of buying and operating HW

Pay and use capacity adapting to current demand

(3)

Analysis: Clouds Today

Contemporary clouds not viable for HPC

High communication latency and jitter

Performance acceptable for loosely-coupled applications [1,2]

Communication-intensive workloads do not scale [3,4]

Only premium offers compete with small commodity clusters (EC2 cluster compute instances) [5]

Existing clouds cannot run communication-intensive applications, which are crucial for HPC

[1] Juve et al.: Scientific workflow applications on Amazon EC2, 2010.

[2] Montero et al.: An elasticity model for HTC clusters, 2011.

[3] Napper and Bientinesi: Can cloud computing reach the TOP500?, 2009.

[4] Gupta and Milojicic: Evaluation of HPC applications on cloud, 2011.

[5] Church et al.: IaaS clouds vs. clusters for HPC: A performance study, 2010.

(4)

Proposal: Clouds on HPC

Base Cloud environment on HPC infrastructure

InfiniBand clusters

BlueGene supercomputers

Future PCI Express interconnects

(6)

Differences of HPC and Clouds

Network

Clouds: Gigabit/10G Ethernet, 77.5 µs in EC2 premium VMs

HPC: InfiniBand, BlueGene torus, PCI Express, 2 4 µs with InfiniBand

Network QoS

Clouds: Best effort

HPC: QoS features in HW

Flexibility

Clouds: on-demand (re)configuration; custom OS image; exchangeable SW layers

HPC: months for installation, weeks for re-partitioning; fixed userbase, applications; HW constraints are fixed,

(7)

HPC Cloud Architecture

(12)

Network Isolation

Goal: Prevent illegitimate traffic between virtual clusters

Base: InfiniBand Partitions

Membership per node, not per VM

Applications freely choose partition to use

Our extension: Transparent enforcement of partitions per VM
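
The partition check that per-VM enforcement builds on can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes the standard InfiniBand P_Key layout (a 16-bit key whose top bit marks full membership), and the VM names, the `VM_ALLOWED_PKEYS` table, and the `enforce_pkey` helper are hypothetical.

```python
FULL_MEMBER_BIT = 0x8000  # top bit of the 16-bit P_Key marks full membership
PARTITION_MASK = 0x7FFF   # low 15 bits identify the partition


def pkeys_match(a: int, b: int) -> bool:
    """Two P_Keys may communicate if they name the same partition
    and at least one of them is a full member."""
    same_partition = (a & PARTITION_MASK) == (b & PARTITION_MASK)
    one_full_member = bool((a | b) & FULL_MEMBER_BIT)
    return same_partition and one_full_member


# Hypothetical host-side policy: the P_Keys each VM is allowed to use.
VM_ALLOWED_PKEYS = {
    "vm-a1": {0x8001},
    "vm-a2": {0x8001},
    "vm-b1": {0x8002},
}


def enforce_pkey(vm: str, requested_pkey: int) -> int:
    """Reject a request that names a partition the VM was not assigned."""
    if requested_pkey not in VM_ALLOWED_PKEYS.get(vm, set()):
        raise PermissionError(f"{vm} may not use P_Key {requested_pkey:#06x}")
    return requested_pkey
```

With membership tracked per VM rather than per node, two VMs on the same host can belong to different virtual clusters without being able to pick each other's partition.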

(13)

Network Performance Isolation

Goal: Ensure bandwidth and latency SLAs

Base: InfiniBand

Virtual Lanes

Configurable traffic scheduling

Known policies for QoS [6]

Applications freely choose traffic class

Our extension: Transparent enforcement of traffic classes per VM

[6] Alfaro et al.: A formal model to manage the InfiniBand arbitration tables providing QoS, 2007.
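
To illustrate how configurable traffic scheduling across virtual lanes can share a link, here is a simplified weighted round-robin sketch. It is an assumption-laden toy: weights count packets per round, whereas the real InfiniBand arbitration tables are more detailed, and the `schedule` function and lane numbering are hypothetical.

```python
from collections import deque


def schedule(queues, weights, rounds):
    """Serve each lane up to `weight` packets per round, in lane order.

    queues:  dict mapping lane id -> deque of pending packets
    weights: dict mapping lane id -> packets the lane may send per round
    Returns the packets in the order they would leave the port.
    """
    sent = []
    for _ in range(rounds):
        for lane, weight in weights.items():
            for _ in range(weight):
                if queues[lane]:
                    sent.append(queues[lane].popleft())
    return sent


# Lane 0 gets twice the share of lane 1.
queues = {
    0: deque(["a1", "a2", "a3"]),
    1: deque(["b1", "b2", "b3"]),
}
order = schedule(queues, weights={0: 2, 1: 1}, rounds=2)
```

If the host, not the application, decides which lane a VM's traffic uses, one tenant cannot claim a higher-weight lane than the provider assigned.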

(14)

Implementation: Intercept Commands

Base: HPC network virtualization

Proposed by Liu et al. [7]

Apps issue send/receive operations directly to HW

Connection establishment via host OS

Applied with SR-IOV

Our extension: Intercept connection management in the host

Map users’ partitions and traffic classes

Protect physical network configuration

Enforce isolation transparently to user

[7] Liu et al.: High performance VMM-bypass I/O in virtual machines, 2006.
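
The interception step can be pictured as a rewrite applied to each connection request before it reaches the hardware. The sketch below is hypothetical throughout: `ConnRequest`, `ClusterPolicy`, and `intercept` are illustrative names, and it assumes that a traffic class corresponds to an InfiniBand service level.

```python
from dataclasses import dataclass


@dataclass
class ConnRequest:
    pkey: int           # partition key named by the guest
    service_level: int  # traffic class named by the guest


@dataclass
class ClusterPolicy:
    pkey_map: dict      # guest-visible P_Key -> physical P_Key
    service_level: int  # physical service level chosen by the provider


def intercept(req: ConnRequest, policy: ClusterPolicy) -> ConnRequest:
    """Rewrite a guest's connection request to the physical configuration.

    Unknown partitions are rejected; the guest's traffic class is
    replaced by the one the provider assigned to its virtual cluster.
    """
    if req.pkey not in policy.pkey_map:
        raise PermissionError(f"partition {req.pkey:#06x} is not mapped")
    return ConnRequest(
        pkey=policy.pkey_map[req.pkey],
        service_level=policy.service_level,
    )
```

Because only connection setup passes through the host, send and receive operations can still go directly to the hardware, as in the VMM-bypass design the slide builds on.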

(15)

Virtual HPC Network View

Impression of a dedicated HPC network

Behaving like physical network for user apps and config tools

Custom node addresses, isolation and QoS

Routing customized for communication pattern

Topology state machine per virtual cluster

Simulate configuration interface

Redirect users’ accesses

Repurpose debug tool ibsim for InfiniBand

Cloud provider’s challenge

Virtual cluster placement according to constraints

Merging virtual configuration of users
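
One way to picture custom node addresses per virtual cluster is a translation table that the provider keeps for each tenant. The sketch below is only an illustration under assumptions: it treats node addresses as InfiniBand local identifiers (LIDs), and the `VirtualClusterView` class and its methods are hypothetical.

```python
class VirtualClusterView:
    """Per-tenant mapping from user-chosen addresses to physical ones."""

    def __init__(self, name):
        self.name = name
        self._virt_to_phys = {}

    def assign(self, virtual_lid, physical_lid):
        """Bind a user-chosen address to a physical node address."""
        if physical_lid in self._virt_to_phys.values():
            raise ValueError(f"physical LID {physical_lid} already in use")
        self._virt_to_phys[virtual_lid] = physical_lid

    def to_physical(self, virtual_lid):
        """Translate an address; unknown addresses raise KeyError."""
        return self._virt_to_phys[virtual_lid]

    def visible_nodes(self):
        """Addresses the tenant sees when it inspects its network."""
        return sorted(self._virt_to_phys)


# Two tenants can both use address 1 without conflict.
tenant_a = VirtualClusterView("a")
tenant_b = VirtualClusterView("b")
tenant_a.assign(1, 17)
tenant_b.assign(1, 42)
```

Merging these per-tenant tables into one physical configuration is the provider-side challenge the slide names.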

(16)

Results

Prototype

VMs with InfiniBand access

Automated isolation setup (partitions)

Measurements cannot be published

SR-IOV drivers in non-public beta

PCI passthrough as substitute

MPI application latency (SKaMPI)

77.5 µs in premium cloud offering (10GE)

3.4 µs in our prototype (IB @ 10 Gbit/s)

Conceptual evaluation with published pre-alpha SR-IOV drivers

Transparent enforcement of isolation works

Protection of network configuration is inherent

(17)

Future Work

Transparent Live Migration on HPC Networks

protocol state in hardware

node addresses bound to physical nodes

Low-Latency-Clouds for non-HPC workloads

scale-out workloads bound by latency

future tightly-coupled cloud environments

(18)

Conclusion

Architecture for HPC Cloud Computing

InfiniBand virtualization

Network and performance isolation

Transparent enforcement of isolation

Virtual HPC network view

Impression of exclusive use

Behavior of a physical cluster

Physical network configuration is protected
