High Throughput File Servers with SMB Direct, Using the 3 Flavors of RDMA network adapters

Jose Barreto

Principal Program Manager

Microsoft Corporation

Abstract

In Windows Server 2012, we introduce the “SMB Direct” protocol, which allows file servers to use high-throughput, low-latency RDMA network interfaces.

However, there are three distinct flavors of RDMA, each with its own specific requirements, advantages and drawbacks.

In this session, we'll look into iWARP, InfiniBand and RoCE, and outline the differences between them. We'll also list the specific vendors that offer each technology and provide step-by-step instructions for anyone planning to deploy them.

The talk will also include an update on RDMA performance and a customer case study.

Summary

• Overview of SMB Direct (SMB over RDMA)

• Three flavors of RDMA

• Setting up SMB Direct

• SMB Direct Performance

• SMB Direct Case Study

SMB Direct (SMB over RDMA)

• New class of SMB file storage for the Enterprise
– Minimal CPU utilization for file storage processing
– Low latency and ability to leverage high speed NICs
– Fibre Channel-equivalent solution at a lower cost
• Traditional advantages of SMB file storage
– Easy to provision, manage and migrate
– Leverages converged network
– No application change or administrator configuration
• Required hardware
– RDMA-capable network interface (R-NIC)
– Support for iWARP, InfiniBand and RoCE
• Uses SMB Multichannel for Load Balancing/Failover

[Diagram: a file client (application, SMB client, R-NIC) and a file server (SMB server, NTFS, SCSI, disk, R-NIC), with user and kernel layers marked, connected through a network with RDMA support.]

What is RDMA?

• Remote Direct Memory Access Protocol
– Accelerated IO delivery model which works by allowing application software to bypass most layers of software and communicate directly with the hardware
• RDMA benefits
– Low latency
– High throughput
– Zero copy capability
– OS / Stack bypass
• RDMA Hardware Technologies
– InfiniBand
– iWARP: RDMA over TCP/IP
– RoCE: RDMA over Converged Ethernet

[Diagram: an SMB client on the client and an SMB server on the file server, each with memory, SMB Direct, NDKPI and an RDMA NIC, connected over Ethernet or InfiniBand.]

SMB over TCP and RDMA

1. Application (Hyper-V, SQL Server) does not need to change.
2. SMB client makes the decision to use SMB Direct at run time.
3. NDKPI provides a much thinner layer than TCP/IP.
4. Remote Direct Memory Access is performed by the network interfaces.

[Diagram: a client (application, SMB client) and a file server (SMB server), split into user and kernel layers with an unchanged API. Each side has memory and two paths to the network: TCP/IP over a NIC, and SMB Direct over NDKPI and an RDMA NIC, connected by Ethernet and/or InfiniBand. Callouts 1 to 4 on each side refer to the numbered points.]

Comparing RDMA Technologies

Type (Cards*), with pros and cons:

• Non-RDMA Ethernet (wide variety of NICs)
– Pros: TCP/IP-based protocol; works with any Ethernet switch; wide variety of vendors and models; support for in-box NIC teaming (LBFO)
– Cons: currently limited to 10Gbps per NIC port; high CPU utilization under load; high latency
• iWARP (Intel NE020*, Chelsio T4)
– Pros: TCP/IP-based protocol; works with any 10GbE switch; RDMA traffic routable
– Cons: currently limited to 10Gbps per NIC port*
• RoCE (Mellanox ConnectX-2, Mellanox ConnectX-3*)
– Pros: Ethernet-based protocol; works with high-end 10GbE/40GbE switches; offers up to 40Gbps per NIC port today*
– Cons: RDMA traffic not routable via existing IP infrastructure; requires DCB switch with Priority Flow Control (PFC)
• InfiniBand (Mellanox ConnectX-2, Mellanox ConnectX-3*)
– Pros: offers up to 54Gbps per NIC port today*; switches typically less expensive per port than 10GbE switches*; switches offer 10GbE or 40GbE uplinks; commonly used in HPC environments
– Cons: not an Ethernet-based protocol; RDMA traffic not routable via existing IP infrastructure; requires InfiniBand switches; requires a subnet manager (on the switch or the host)
• Common to the RDMA types: low CPU utilization under load; low latency

Mellanox ConnectX®-3 dual-Port Adapter with VPI (InfiniBand and Ethernet)

• Mellanox provides end-to-end InfiniBand and Ethernet connectivity solutions (adapters, switches, cables)
– Connecting data center servers and storage
• Up to 56Gb/s InfiniBand and 40Gb/s Ethernet per port
– Low latency, Low CPU overhead, RDMA
– InfiniBand to Ethernet Gateways for seamless operation
• Windows Server 2012 exposes the great value of InfiniBand for storage traffic, virtualization and low latency
– InfiniBand and Ethernet (with RoCE) integration
– Highest Efficiency, Performance and return on investment
• For more information: http://www.mellanox.com/content/pages.php?pg=file_server
– Gilad Shainer

Intel 10GbE iWARP Adapter - NE020

• In production today
– Supports Microsoft’s MPI via ND in Windows Server 2008 R2 and beyond
– See Intel’s Download site (http://downloadcenter.intel.com) for drivers (search “NE020”)
• Drivers inbox since Beta for Windows Server 2012
– Supports Microsoft’s SMB Direct via NDK
– Uses the IETF’s iWARP RDMA technology that is built on top of IP
– The only WAN-routable, “cloud-ready” RDMA technology
– Uses standard Ethernet switches
– Beta drivers available from Intel’s Download site (http://downloadcenter.intel.com); search for “NE020”

Chelsio T4 line of 10GbE adapters (iWARP)

http://www.chelsio.com/wp-content/uploads/2011/07/ProductSelector-0312.pdf

Setting up SMB Direct

• Install hardware and drivers

– Get-NetAdapter

– Get-NetAdapterRdma

• Configure IP addresses

– Get-SmbServerNetworkInterface

– Get-SmbClientNetworkInterface

• Establish an SMB Connection

– Get-SmbConnection

– Get-SmbMultichannelConnection

• Similar to configuring SMB for regular network interfaces

• Verify client Performance Counters

– RDMA Activity – 1/interface

– SMB Direct Connection – 1/connection

– SMB Client Shares – 1/share

• Verify server Performance Counters

– RDMA Activity – 1/interface

– SMB Direct Connection – 1/connection

– SMB Server Shares – 1/share
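The checks listed on this slide can be strung together as a short script. What follows is a minimal sketch rather than a definitive procedure: it uses only the cmdlets and counter set names given above, plus the standard `Get-Counter`, `Get-ChildItem` and `Where-Object` cmdlets. The share path `\\fileserver\share` is a hypothetical placeholder.

```powershell
# Sketch only. Run in an elevated PowerShell session on Windows Server 2012 or later.
# '\\fileserver\share' is a hypothetical path; substitute a real SMB share.

# 1. Hardware and drivers: list the adapters, then those with RDMA enabled.
Get-NetAdapter
Get-NetAdapterRdma | Where-Object { $_.Enabled }

# 2. IP configuration: check how SMB sees the interfaces.
Get-SmbServerNetworkInterface   # run on the file server
Get-SmbClientNetworkInterface   # run on the file client

# 3. Touch the share so a connection exists, then inspect it (client side).
Get-ChildItem -Path '\\fileserver\share' | Out-Null
Get-SmbConnection
Get-SmbMultichannelConnection

# 4. Performance counters (client side): list the counter sets and their instances.
#    On the server, replace 'SMB Client Shares' with 'SMB Server Shares'.
$counterSets = 'RDMA Activity', 'SMB Direct Connection', 'SMB Client Shares'
foreach ($set in $counterSets) {
    Get-Counter -ListSet $set | Select-Object CounterSetName, PathsWithInstances
}
```

If the RDMA path is in use, the RDMA Activity and SMB Direct Connection sets should list instances once traffic has flowed; an empty instance list suggests the connection fell back to TCP/IP.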

InfiniBand details

• Cards

– Mellanox ConnectX-2

– Mellanox ConnectX-3

• Configure a subnet manager on the switch
– Use a managed switch with a built-in subnet manager
• Or use OpenSM on Windows Server 2012
– Included as part of the Mellanox package
– New-Service -Name "OpenSM" -BinaryPathName "`"C:\Program Files\Mellanox\MLNX_VPI\IB\Tools\opensm.exe`" service L 128" -DisplayName "OpenSM" -Description "OpenSM" -StartupType
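The OpenSM command on this slide is cut off after `-StartupType`, and the arguments to `opensm.exe` appear to have lost their dashes in extraction. The sketch below shows one way the service registration might look under two loudly stated assumptions: that the arguments were `--service -L 128`, and that the intended startup type was `Automatic`. Neither is confirmed by the slide text, so check both against the Mellanox documentation before use.

```powershell
# Sketch only. Two ASSUMPTIONS not confirmed by the slide:
#   * '--service -L 128' is a guess at the arguments shown as 'service L 128'
#   * 'Automatic' is a guess at the value truncated after -StartupType
$opensmPath = 'C:\Program Files\Mellanox\MLNX_VPI\IB\Tools\opensm.exe'

# Register OpenSM as a Windows service. The inner quotes protect the space in the path.
New-Service -Name 'OpenSM' `
    -BinaryPathName "`"$opensmPath`" --service -L 128" `
    -DisplayName 'OpenSM' `
    -Description 'OpenSM' `
    -StartupType Automatic

# Start the service and confirm that it is running.
Start-Service -Name 'OpenSM'
Get-Service -Name 'OpenSM'
```

Only one subnet manager needs to be active on the fabric, so this step is unnecessary when the switch already provides one.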

iWARP details

• Cards

– Intel NE020

– Chelsio T4

• Configure the firewall

– SMB Direct with iWARP uses TCP port 5445

– Enable-NetFirewallRule FPSSMBD-iWARP-In-TCP

• Allow cross-subnet access (optional)

– iWARP RDMA technology can be routed across IP subnets
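The two iWARP steps can be sketched as follows. The firewall rule name comes from the slide. The cross-subnet setting is not named on the slide; `Set-NetOffloadGlobalSetting -NetworkDirectAcrossIPSubnets` is the Windows Server 2012 cmdlet parameter that controls it, and using it here is an assumption about what the optional step refers to.

```powershell
# Sketch only. Run elevated on both the file server and the file client.

# Enable the inbound firewall rule for SMB Direct over iWARP (TCP port 5445).
Enable-NetFirewallRule -Name 'FPSSMBD-iWARP-In-TCP'
Get-NetFirewallRule -Name 'FPSSMBD-iWARP-In-TCP' |
    Format-List Name, Enabled, Direction, Action

# Optional: allow RDMA (Network Direct) traffic to cross IP subnets.
# ASSUMPTION: this is the setting the "cross-subnet access" bullet refers to.
Set-NetOffloadGlobalSetting -NetworkDirectAcrossIPSubnets Allowed
Get-NetOffloadGlobalSetting
```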

RoCE details

• Cards

– Mellanox ConnectX-3

– Make sure to configure the NIC for Ethernet

• Configuring Priority Flow Control (PFC) on Windows
– Install-WindowsFeature Data-Center-Bridging
– New-NetQosPolicy "RoCE" -NetDirectPortMatchCondition 445 -PriorityValue8021Action 4
– Enable-NetQosFlowControl -Priority 4
– Enable-NetAdapterQos -InterfaceAlias RDMA1
– Set-NetQosDcbxSetting -Willing 0
– New-NetQosTrafficClass "RoCE" -Priority 4 -Bandwidth 60 -Algorithm ETS
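After applying the PFC settings above, it can help to read them back. This is a minimal sketch using the `Get-` counterparts of the cmdlets on the slide; the policy name `RoCE`, priority 4 and adapter name `RDMA1` are the example values from the slide, not required names.

```powershell
# Sketch only: read back the DCB/QoS configuration applied on this host.
# 'RoCE', priority 4 and 'RDMA1' are the slide's example values.

Get-WindowsFeature -Name Data-Center-Bridging   # should report the feature as installed
Get-NetQosPolicy -Name 'RoCE'                   # policy tagging NetDirect port 445 traffic
Get-NetQosFlowControl -Priority 4               # flow control state for priority 4
Get-NetQosTrafficClass -Name 'RoCE'             # ETS traffic class and bandwidth share
Get-NetAdapterQos -Name 'RDMA1'                 # QoS state on the RDMA adapter
Get-NetQosDcbxSetting                           # DCBX "willing" setting
```

The switch ports need matching PFC settings for the same priority; the host-side configuration alone is not sufficient, as the comparison table notes that RoCE requires a DCB switch with PFC.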

SMB Direct Performance – 1 x 54GbIB

[Diagram: four test configurations, each running an IO micro benchmark against four Fusion IO cards: a single server, and three SMB client/SMB server pairs connected by 10GbE, InfiniBand QDR and InfiniBand FDR.]

SMB Direct Performance – 1 x 54GbIB

*** Preliminary *** results from two Intel Romley machines with 2 sockets each, 8 cores/socket
Both client and server using a single port of a Mellanox network interface in a PCIe Gen3 x8 slot
Data goes all the way to persistent storage, using 4 FusionIO ioDrive 2 cards
Preliminary results based on Windows Server 2012 beta

Workload: 512KB IOs, 8 threads, 8 outstanding

| Configuration | BW (MB/sec) | IOPS (512KB IOs/sec) | %CPU Privileged |
|---|---|---|---|
| Non-RDMA (Ethernet, 10Gbps) | 1,129 | 2,259 | ~9.8 |
| RDMA (InfiniBand QDR, 32Gbps) | 3,754 | 7,508 | ~3.5 |
| RDMA (InfiniBand FDR, 54Gbps) | 5,792 | 11,565 | ~4.8 |
| Local | 5,808 | 11,616 | ~6.6 |

Workload: 8KB IOs, 16 threads, 16 outstanding

| Configuration | BW (MB/sec) | IOPS (8KB IOs/sec) | %CPU Privileged |
|---|---|---|---|
| Non-RDMA (Ethernet, 10Gbps) | 571 | 73,160 | ~21.0 |
| RDMA (InfiniBand QDR, 32Gbps) | 2,620 | 335,446 | ~85.9 |
| RDMA (InfiniBand FDR, 54Gbps) | 2,683 | 343,388 | ~84.7 |
| Local | 4,103 | 525,225 | ~90.4 |

http://smb3.info
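The bandwidth and IOPS columns in the 1 x 54GbIB results are related by the IO size, which gives a quick sanity check. The sketch below recomputes MB/sec from IOPS, assuming that "MB" on this slide means 2^20 bytes (so MB/sec = IOPS × IO size in KB / 1024). The short configuration labels are mine; the numbers are copied from the slide.

```powershell
# Sketch only: recompute bandwidth from IOPS and IO size.
# ASSUMPTION: "MB" here means 2^20 bytes, so MB/sec = IOPS * IoKB / 1024.
$rows = @(
    [pscustomobject]@{ Config = 'Non-RDMA 10GbE'; IoKB = 512; IOPS = 2259;   ReportedMB = 1129 }
    [pscustomobject]@{ Config = 'RDMA IB QDR';    IoKB = 512; IOPS = 7508;   ReportedMB = 3754 }
    [pscustomobject]@{ Config = 'RDMA IB FDR';    IoKB = 512; IOPS = 11565;  ReportedMB = 5792 }
    [pscustomobject]@{ Config = 'Local';          IoKB = 512; IOPS = 11616;  ReportedMB = 5808 }
    [pscustomobject]@{ Config = 'Non-RDMA 10GbE'; IoKB = 8;   IOPS = 73160;  ReportedMB = 571 }
    [pscustomobject]@{ Config = 'RDMA IB QDR';    IoKB = 8;   IOPS = 335446; ReportedMB = 2620 }
    [pscustomobject]@{ Config = 'RDMA IB FDR';    IoKB = 8;   IOPS = 343388; ReportedMB = 2683 }
    [pscustomobject]@{ Config = 'Local';          IoKB = 8;   IOPS = 525225; ReportedMB = 4103 }
)

$check = foreach ($row in $rows) {
    $computed = $row.IOPS * $row.IoKB / 1024
    [pscustomobject]@{
        Config     = $row.Config
        IoKB       = $row.IoKB
        ReportedMB = $row.ReportedMB
        ComputedMB = [math]::Round($computed, 1)
        DeltaMB    = [math]::Round($row.ReportedMB - $computed, 1)
    }
}

$check | Format-Table -AutoSize
```

Every row agrees to within 1 MB/sec except InfiniBand FDR at 512KB, where 11,565 IOs/sec × 0.5 MB gives 5,782.5 MB/sec against the 5,792 shown. The slide marks these results as preliminary and does not explain the difference.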

SMB Direct Performance – 2 x 54GbIB

[Diagram: three test configurations running SQLIO: a single server; a file client (SMB 3.0) connected to a file server (SMB 3.0) through RDMA NICs; and a VM on Hyper-V (SMB 3.0) connected to a file server (SMB 3.0) through RDMA NICs. Storage in each case is JBODs of SSDs behind SAS RAID controllers.]

SMB Direct Performance – 2 x 54GbIB

Preliminary results based on Windows Server 2012 RC

| Configuration | BW (MB/sec) | IOPS (512KB IOs/sec) | %CPU Privileged | Latency (milliseconds) |
|---|---|---|---|---|
| 1 – Local | 10,090 | 38,492 | ~2.5% | ~3 ms |
| 2 – Remote | 9,852 | 37,584 | ~5.1% | ~3 ms |
| 3 – Remote VM | 10,367 | 39,548 | ~4.6% | ~3 ms |

SMB Direct Performance – 3 x 54GbIB

[Diagram: a file client (SMB 3.0) running SQLIO, connected through RDMA NICs to a file server (SMB 3.0) that uses Storage Spaces over JBODs of SSDs attached by SAS HBAs and a SAS RAID controller.]

| Workload | BW (MB/sec) | IOPS (IOs/sec) | %CPU Privileged | Latency (milliseconds) |
|---|---|---|---|---|
| 512KB IOs, 100% read, 2t, 8o | 16,778 | 32,002 | ~11% | ~2 ms |
| 8KB IOs, 100% read, 16t, 2o | 4,027 | 491,665 | ~65% | < 1 ms |

Preliminary results based on Windows Server 2012 RC
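The 3 x 54GbIB figures can be cross-checked the same way, by multiplying IOPS by IO size. The sketch below computes the result in both decimal megabytes (10^6 bytes) and binary megabytes (2^20 bytes), since the slide does not say which unit it uses. The workload labels are shortened; the numbers are copied from the table.

```powershell
# Sketch only: recompute throughput for the 3 x 54GbIB rows in two MB conventions.
$rows = @(
    [pscustomobject]@{ Workload = '512KB read'; IoBytes = 512KB; IOPS = 32002;  ReportedMB = 16778 }
    [pscustomobject]@{ Workload = '8KB read';   IoBytes = 8KB;   IOPS = 491665; ReportedMB = 4027 }
)

$check = foreach ($row in $rows) {
    # Cast to double so the product cannot overflow a 32-bit integer.
    $bytesPerSec = [double]$row.IOPS * $row.IoBytes
    [pscustomobject]@{
        Workload   = $row.Workload
        ReportedMB = $row.ReportedMB
        DecimalMB  = [math]::Round($bytesPerSec / 1e6, 1)   # MB = 10^6 bytes
        BinaryMB   = [math]::Round($bytesPerSec / 1MB, 1)   # MB = 2^20 bytes
    }
}

$check | Format-Table -AutoSize
```

Both reported bandwidth values match the decimal-megabyte column (32,002 × 524,288 bytes is about 16,778 × 10^6 bytes per second), so this table appears to count a megabyte as 10^6 bytes. Keep that in mind when comparing it with results reported in other units.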

Summary

• Overview of SMB Direct (SMB over RDMA)

• Three flavors of RDMA

• Setting up SMB Direct

• SMB Direct Performance

• SMB Direct Case Study

Related Content

• Blog Posts
– http://smb3.info
• TechEd Talks
– WSV328: The Path to Continuous Availability with Windows Server 2012
– VIR306: Hyper-V over SMB: Remote File Storage Support in Windows Server 2012 Hyper-V
– WSV314: Windows Server 2012 NIC Teaming and SMB Multichannel Solutions
– WSV334: Windows Server 2012 File and Storage Services Management
– WSV303: Windows Server 2012 High-Performance, Highly-Available Storage Using SMB
– WSV330: How to Increase SQL Availability and Performance Using WS 2012 SMB 3.0 Solutions
– WSV410: Continuously Available File Server: Under the Hood
