SR-IOV In High Performance Computing

NASA Center for Climate Simulation

Hoot Thompson & Dan Duffy

NASA Center for Climate Simulation
NASA Goddard Space Flight Center
Greenbelt, MD 20771

2/13/2012

NASA Center for Climate Simulation

• Focus on the research side of climate study (versus NOAA’s operational position)

• Simulations span multiple time scales

– Days for weather prediction

– Seasons to years for short-term climate prediction

– Centuries for climate change projection

• Examples:

– High fidelity 3.5 km global simulations of cloud and hurricane predictions

– Comprehensive reanalysis of the last thirty years of weather/climate (MERRA)

– Multi-millennium analysis for the Intergovernmental Panel on Climate Change

• Integrated set of supercomputing, visualization and data management technologies

– Discover computational cluster

• 30K traditional Intel cores plus 64 GPUs, roughly 400 TFlops

• DDR/QDR InfiniBand (IB) backbone

• 1 GbE and 10 GbE management infrastructure

• ~4 PBytes RAID based shared parallel file system (GPFS)

– Tape archive of over 20 PBytes

Discover IB/GPFS Architecture

[Figure: Discover IB/GPFS architecture diagram. Diagram labels: B, 1, 2, 3, 4, 5a, 5b, 6a, 6b, 7a-7e. Recoverable annotations follow.]

Compute units:

– Base Unit: 512 Dempsey (3.2 GHz)

– SCU1: 1,024 Woodcrest (2.66 GHz)

– SCU2: 1,024 Woodcrest (2.66 GHz)

– SCU3: 3,096 Westmere (2.8 GHz)

– SCU4: 3,096 Westmere (2.8 GHz)

– SCU5: 4,096 Nehalem (2.8 GHz)

– SCU6: 4,096 Nehalem (2.8 GHz)

– SCU7: 14,400 Westmere (2.8 GHz)

– Data Analysis

Interconnect and storage:

– 24 DDR IB uplinks to each unit (annotation appears twice in the diagram)

– 20 GPFS I/O Nodes: 16 NSD (data), 4 MDS (metadata) (annotation appears twice in the diagram)

– Data File Systems: Data Direct Networks S2A9500, S2A9550, S2A9900

– Metadata File Systems: IBM/Engenio DS4700

– Brocade 48000

– Each circle represents a 288-port DDR IB Switch

– The triangle represents a 2-to-1 QDR IB Switch fabric

Nebula – NASA’s Cloud

• Open-source (OpenStack) cloud computing project and service

• Alternative to costly construction of additional data centers

• Sharing portal for NASA scientists and researchers

– Large, complex data sets

– External partners and the public

• Nebula comprises two components/containers

– Nebula West at NASA Ames

– Nebula East at NASA GSFC

• NCCS team evaluating Nebula as an adjunct to Discover-hosted science processing

• Key question: can clouds match the HPC level of capability needed for climate research?

• Potential obstacle: clouds primarily exist in virtualized space

– Overhead or loss due to virtual machine (VM) versus bare metal

– Node-to-node communication is critical: high speed, low latency, RDMA

Intel Developer Forum

Background And Proposition

• Background

– Discover’s performance is tied to its DDR/QDR IB fabric

– Nebula, like clouds in general, is 10 GbE based

• Question – can clouds deliver HPC level of performance?

– Can 10 GbE compete with high speed, low latency IB?

– What network performance is lost due to virtualization?

– What computational performance is lost due to virtualization?

• Proposition – typical NCCS model

– Build test bed to investigate the virtualization technologies

– Work with vendors to answer questions and address issues

Methodology and Objectives

Compare bare metal against virtualized NIC

– Full software virtualization (SW Virt) – device emulation

– Virtio – split driver, para-virtualization

– Single Root IO Virtualization (SR-IOV)

• Direct assignment

• Mapped Virtual Function (VF)

Determine overhead of executing within VM construct

– VM to VM communication

• Base Network

• Message passing environment (mvapich2)

– Application

• Single node, multi-core

• Multi-node, multi-core

Draw conclusions and comparisons with Discover and Nebula

Benchmarks

Start from the basic benchmarks to analyze system performance and build up towards the application layer.

Benchmark | Version | Description | Download
Nuttcp | nuttcp-7.1.5.c, gcc compiler | Measure raw network bandwidth, similar to netperf | http://lcp.nrl.navy.mil/nuttcp
OSU MPI Benchmarks | MVAPICH2 1.7rc1, Intel compiler | Test latencies and bandwidths of most common MPI functions | http://mvapich.cse.ohio-state.edu/
Linpack | 10.2.6, Intel compiler | Intel version of Linpack | http://software.intel.com/en-us/articles/intel-math-kernel-library-linpack-download/
NAS PB | 3.3.1, Intel compiler | NASA Parallel Benchmarks; CFD kernel benchmarks | http://www.nas.nasa.gov/Resources/Software/npb.html

Configuration | Bare1 | Bare2 | VM1 | VM2
Processor Type | Intel Nehalem | Intel Nehalem | Intel Nehalem | Intel Nehalem
Processor Number | E5520 | E5520 | E5520 | E5520
Processor Speed | 2.27 GHz | 2.27 GHz | 2.27 GHz | 2.27 GHz
Cores per Socket | 4 | 4 | 4 | 4
Number of Sockets | 2 | 2 | 2 | 2
Cores per Node | 8 | 8 | 8 | 8
Theoretical Peak | 72.64 GF | 72.64 GF | 72.64 GF | 72.64 GF
Main Memory | 48 GB | 48 GB | 16 GB | 16 GB
Operating System | Ubuntu 11.04 | Ubuntu 11.04 | Ubuntu 11.04 | Ubuntu 11.04
Kernel | 2.6.38-10.server | 2.6.38-10.serve | 2.6.38-10.server | 2.6.38-10.server
Hypervisor | KVM | KVM | N/A | N/A
Hyperthreading | Off | Off | Off | Off
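
The Theoretical Peak row can be reproduced from the other rows of the configuration table. The sketch below assumes 4 double-precision floating-point operations per core per cycle for this Nehalem-generation processor; that figure is not stated on the slide, but it is the value that yields the listed 72.64 GF. The function name is illustrative only.

```python
def theoretical_peak_gflops(sockets, cores_per_socket, clock_ghz, flops_per_cycle=4):
    """Peak GFLOPS for one node: total cores x clock (GHz) x FLOPs per cycle.

    flops_per_cycle=4 is an assumption (not given in the slides) for
    double-precision SSE on the E5520.
    """
    return sockets * cores_per_socket * clock_ghz * flops_per_cycle


if __name__ == "__main__":
    # Values taken from the configuration table: 2 sockets, 4 cores each, 2.27 GHz.
    peak = theoretical_peak_gflops(sockets=2, cores_per_socket=4, clock_ghz=2.27)
    print(f"Theoretical peak per node: {peak:.2f} GF")
```

The same function applies to the VM columns because each VM is listed with the same core count and clock as its host.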

Test Configuration

[Figure: test configuration. Two Dell R710 servers (each with 2 x E5520 2.27 GHz and 48 GB), each fitted with an Intel 82599EB adapter and attached to the R&D network. Addresses shown in the diagram: 10.10.10.1, 10.10.10.2, 10.10.20.1, 10.10.20.2, 10.0.10.1, 10.0.10.2.]

Nuttcp Results

Bare to Bare | VM to VM Sw Virt | VM to VM Virtio | VM to VM SR-IOV
4418.8401 Mbps, 0 retrans | 137.3301 Mbps, 0 retrans | 5864.0557 Mbps, 212 retrans | 9151.5769 Mbps, 0 retrans
8028.6459 Mbps, 0 retrans | 145.6024 Mbps, 0 retrans | 5678.0625 Mbps, 0 retrans | 9408.0323 Mbps, 0 retrans
9392.7072 Mbps, 0 retrans | 145.7500 Mbps, 0 retrans | 5973.2256 Mbps, 0 retrans | 8714.4063 Mbps, 34 retrans
9415.2675 Mbps, 0 retrans | 138.5963 Mbps, 0 retrans | 6309.8478 Mbps, 0 retrans | 9313.8894 Mbps, 7 retrans
9341.4362 Mbps, 733 retrans | 141.8702 Mbps, 0 retrans | 6223.4034 Mbps, 7 retrans | 9251.8453 Mbps, 0 retrans
9354.0999 Mbps, 208 retrans | 146.1092 Mbps, 0 retrans | 6311.3896 Mbps, 0 retrans | 9193.1103 Mbps, 0 retrans
9414.7318 Mbps, 0 retrans | 146.3042 Mbps, 0 retrans | 6316.7924 Mbps, 0 retrans | 9348.2984 Mbps, 0 retrans
9414.8207 Mbps, 0 retrans | 146.4449 Mbps, 0 retrans | 5955.8176 Mbps, 0 retrans | 9101.7356 Mbps, 73 retrans
9414.9368 Mbps, 0 retrans | 146.2758 Mbps, 0 retrans | 5746.2926 Mbps, 0 retrans | 8958.5032 Mbps, 16 retrans
9415.1618 Mbps, 0 retrans | 146.1043 Mbps, 0 retrans | 5692.8146 Mbps, 0 retrans | 9228.5370 Mbps, 0 retrans
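
One way to compare the four configurations is to reduce each set of nuttcp samples to a median throughput, a retransmission total, and a ratio against bare metal. The sketch below is not part of the original analysis; it copies the sample values from the results above and assumes they group into ten consecutive samples per configuration, in the order the configurations are listed. The median is used here, rather than the mean, because the first bare-metal samples are visibly lower than the rest. The names `SAMPLES` and `summarize` are illustrative.

```python
from statistics import median

# (throughput in Mbps, TCP retransmissions) per sample, copied from the slide.
SAMPLES = {
    "Bare to Bare": [
        (4418.8401, 0), (8028.6459, 0), (9392.7072, 0), (9415.2675, 0),
        (9341.4362, 733), (9354.0999, 208), (9414.7318, 0), (9414.8207, 0),
        (9414.9368, 0), (9415.1618, 0),
    ],
    "VM to VM Sw Virt": [
        (137.3301, 0), (145.6024, 0), (145.7500, 0), (138.5963, 0),
        (141.8702, 0), (146.1092, 0), (146.3042, 0), (146.4449, 0),
        (146.2758, 0), (146.1043, 0),
    ],
    "VM to VM Virtio": [
        (5864.0557, 212), (5678.0625, 0), (5973.2256, 0), (6309.8478, 0),
        (6223.4034, 7), (6311.3896, 0), (6316.7924, 0), (5955.8176, 0),
        (5746.2926, 0), (5692.8146, 0),
    ],
    "VM to VM SR-IOV": [
        (9151.5769, 0), (9408.0323, 0), (8714.4063, 34), (9313.8894, 7),
        (9251.8453, 0), (9193.1103, 0), (9348.2984, 0), (9101.7356, 73),
        (8958.5032, 16), (9228.5370, 0),
    ],
}


def summarize(samples, baseline="Bare to Bare"):
    """Return median Mbps, total retransmissions and ratio to the baseline."""
    base = median(mbps for mbps, _ in samples[baseline])
    summary = {}
    for name, rows in samples.items():
        med = median(mbps for mbps, _ in rows)
        summary[name] = {
            "median_mbps": med,
            "retrans": sum(retrans for _, retrans in rows),
            "vs_bare": med / base,
        }
    return summary


if __name__ == "__main__":
    for name, stats in summarize(SAMPLES).items():
        print(f"{name:18s} {stats['median_mbps']:9.1f} Mbps  "
              f"{stats['vs_bare']:6.1%} of bare  {stats['retrans']} retrans")
```

Under these assumptions the SR-IOV median lands within a few percent of bare metal, Virtio at roughly two thirds, and full software virtualization at under two percent.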

OSU Benchmarks Results – Bandwidth

[Figure: Throughput (MBytes/sec, 0 to 1000) versus Message Size (Bytes, 1 to 10,000,000). Series: Bare to Bare, VM to VM SRIOV, VM to VM Virtio. A "Better" arrow marks the preferred direction.]

OSU Benchmarks Results – Latency

[Figure: Latency (microseconds, 0 to 12000) versus Message Size (MBytes as labeled on the slide, 0 to 5000). Series: Bare to Bare, VM to VM SRIOV, VM to VM Virtio. A "Better" arrow marks the preferred direction.]

OSU Benchmarks Results – Latency (Small)

[Figure: Latency (microseconds, 0 to 200) versus Message Size (Bytes, 0 to 10,000). Series: Bare to Bare, VM to VM SRIOV, VM to VM Virtio. A "Better" arrow marks the preferred direction.]

Linpack Benchmarks Results

[Figure: % of Peak Performance (0% to 90%) versus Problem Size (N, 0 to 60000). Series: Ubuntu1 & Ubuntu2 - Bare Metal, VM to VM SRIOV, VM to VM Virtio. A "Better" arrow marks the preferred direction.]
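
The two axes of the Linpack chart can be related to the configuration table with a little arithmetic. The sketch below is illustrative only: it assumes the runs span two nodes (suggested by the "Ubuntu1 & Ubuntu2" legend, not stated outright), uses the 72.64 GF per-node peak from the configuration table, and assumes a dense N x N matrix of 8-byte doubles. No measured GFLOPS values appear in the slides, so none are supplied here.

```python
PEAK_GF_PER_NODE = 72.64  # theoretical peak per node, from the configuration table


def percent_of_peak(measured_gflops, nodes=2):
    """Express a measured Linpack rate as a percentage of aggregate peak.

    nodes=2 is an assumption based on the chart legend.
    """
    return 100.0 * measured_gflops / (PEAK_GF_PER_NODE * nodes)


def matrix_gib(n):
    """Memory footprint in GiB of a dense n x n matrix of 8-byte doubles."""
    return 8 * n * n / 2**30


if __name__ == "__main__":
    for n in (10000, 30000, 60000):
        print(f"N = {n:6d}: matrix needs about {matrix_gib(n):5.1f} GiB")
```

At N = 60000 the matrix alone is roughly 26.8 GiB, which would sit close to the combined 32 GB of the two 16 GB VMs if the run is indeed split across both.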

Going Forward

Conclusions to-date

– Clear advantages to SR-IOV technology

– Cloud based HPC feasible

– Data requires further analysis to understand Nebula implications

Issues/concerns

– TCP slow start, variability and retransmission impact on HPC processing

Additional testing to close the gap

– More application testing – NAS Parallel and HPCC benchmarks

– Jumbo frames (9000 MTU)

– Bare metal-to-bare metal and VM-to-VM IB

– Different hypervisor – Xen

– Other VM guest types – RedHat, SUSE

– Multiple VMs running, bandwidth sharing

– Add cloud infrastructure to test setup – OpenStack, Eucalyptus
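
The jumbo frames item in the list above can be motivated with a rough estimate of the TCP payload ceiling on a 10 GbE link at the two MTU sizes. This is a back-of-the-envelope sketch, not a measurement. It assumes IPv4, no VLAN tag, standard Ethernet framing overhead (14-byte header, 4-byte FCS, 8-byte preamble, 12-byte inter-frame gap), and the 12-byte TCP timestamp option; none of these settings are stated in the slides.

```python
LINE_RATE_MBPS = 10_000          # 10 GbE
ETH_OVERHEAD = 14 + 4 + 8 + 12   # header + FCS + preamble/SFD + inter-frame gap
IP_TCP_HEADERS = 20 + 20         # IPv4 + TCP base headers (assumed, no IP options)
TCP_TIMESTAMP_OPTION = 12        # timestamp option with padding (assumed enabled)


def tcp_goodput_mbps(mtu, timestamps=True):
    """Upper bound on TCP payload throughput for full-sized frames."""
    payload = mtu - IP_TCP_HEADERS - (TCP_TIMESTAMP_OPTION if timestamps else 0)
    on_wire = mtu + ETH_OVERHEAD
    return LINE_RATE_MBPS * payload / on_wire


if __name__ == "__main__":
    for mtu in (1500, 9000):
        print(f"MTU {mtu}: at most {tcp_goodput_mbps(mtu):.1f} Mbps of TCP payload")
```

Under these assumptions the ceiling at MTU 1500 is about 9415 Mbps, close to the steady bare-metal nuttcp samples, while MTU 9000 raises it to about 9900 Mbps. Larger frames also cut the per-packet work on each host, which may matter more for the virtualized paths.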
