
NVM Express* Based Solid-State Drives: Crossing the Chasm, Going Mainstream


Academic year: 2021


(1)

Make the Future with China!

NVM Express* Based Solid-State Drives:

Crossing the Chasm, Going Mainstream

Jack Zhang, Senior SSD Enterprise Architect, Intel Corporation

Mark Liang, Senior SSD Application Engineer, Intel Corporation

(2)

Agenda

Why NVM Express* SSDs are Going Mainstream

- Industry Trend, NVM Express Review
- NVM Express Leadership, Latest Update
- NVM Express SSD Solutions at Horizontals

How NVM Express SSDs are Crossing the Chasm

- NVM Express SSD Ecosystem and Hardware Design Reference
- NVM Express SSD in Client Segment



(5)

Memory and Storage Hierarchy

NVM Solutions are bringing storage closer to the processor

Relative delay and interface by tier (diagram axes: Relative Delay, Interfaces, Performance, Costs):

- Processor, L1/2 Cache: ~1 ns, on core CPU
- L3 Cache: ~10 ns, on die
- Main Memory: ~100 ns, direct attach
- NAND SSD: ~10,000 ns (10 us) to ~100,000 ns (100 us); PCIe*/NVMe*, SAS, SATA
- Fast HDD: ~10,000,000 ns (10 ms); SAS, SATA*

Source: Intel

NVM Express* (NVMe), PCI Express* (PCIe)
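The relative delays on this slide are easier to compare as ratios. The following is a minimal sketch using only the approximate latencies quoted above; the tier names and the `relative_delay` helper are illustrative, not part of the original deck.

```python
# Approximate access latencies quoted on the slide, in nanoseconds.
LATENCY_NS = {
    "L1/2 cache": 1,
    "L3 cache": 10,
    "Main memory": 100,
    "NAND SSD (low end of range)": 10_000,
    "NAND SSD (high end of range)": 100_000,
    "Fast HDD": 10_000_000,
}


def relative_delay(tier: str, baseline: str = "Main memory") -> float:
    """Return how many times slower `tier` is than `baseline`."""
    return LATENCY_NS[tier] / LATENCY_NS[baseline]


if __name__ == "__main__":
    for tier in LATENCY_NS:
        print(f"{tier:30s} {relative_delay(tier):>14,.2f}x main memory")
```

With these figures, a fast HDD sits five orders of magnitude behind main memory, while the low end of the NAND SSD range is two orders of magnitude behind it.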

(6)

PCI Express* (PCIe*) SSDs Projected to Lead in Data Center

PCIe SSDs lead the way by embracing industry standards

PCI Express* (PCIe*) projected as leading SSD interface in DC by 2018

PCI Express* (PCIe*) is projected to lead even sooner by capacity

[Charts: "Enterprise SSD by Interface" and "Data Center SSD – GB by Interface"; x-axis 2013 to 2018; y-axis 0% to 100%; series PCIe, SAS, SATA; surviving data labels 13%, 17%, 27%, 32%, 46%, 53%]

Source: Intel Market Model and multiple industry analysts

Source: International Data Corporation (IDC). Worldwide Solid State Drive 2014-2018 Forecast, Doc #248727, June 2014

(7)

What is NVM Express*?

- NVM Express* is a standardized high performance software interface for PCI Express* Solid-State Drives
- Architected from the ground up for SSDs to be more efficient, scalable, and manageable
- NVM Express is industry driven to be extensible for the needs of both the client and the data center

"If I had asked people what they wanted, they would have said faster horses" - Henry Ford

(8)

NVM Express* Community

- NVM Express*, Org: Consists of more than 75 companies from across the industry
- Promoter Group: Led by 13 elected companies
- Technical Workgroup: Queuing interface, NVM Express I/O and Admin command set
- Management Interface Workgroup: Out-of-band management over PCI Express* VDM and SMBus

(9)

NVM Express* Technical Overview

- Supports deep queues (64K commands per queue, up to 64K queues)
- Supports MSI-X and interrupt steering
- Streamlined & simple command set (13 required commands)
- Optional features to address target segment
  - Data Center: End-to-end data protection, reservations, etc.
  - Client: Autonomous power state transitions, etc.
- Designed to scale for next generation NVM, agnostic to NVM type used

NVM Express* (NVMe)
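The deep-queue model in the first bullet can be pictured as paired submission and completion queues. The sketch below is a toy model for illustration only: the `QueuePair` class and its methods are hypothetical names, and it does not represent the real doorbell registers or command formats. Only the 64K per-queue limit comes from the slide.

```python
from collections import deque
from dataclasses import dataclass, field

MAX_QUEUE_ENTRIES = 64 * 1024  # 64K commands per queue, per the slide


@dataclass
class QueuePair:
    """Toy model of one submission/completion queue pair."""
    depth: int
    submission: deque = field(default_factory=deque)
    completion: deque = field(default_factory=deque)

    def __post_init__(self):
        if not 1 <= self.depth <= MAX_QUEUE_ENTRIES:
            raise ValueError("depth must be between 1 and 64K")

    def submit(self, command_id: int) -> bool:
        """Host side: queue a command; return False if the queue is full."""
        if len(self.submission) >= self.depth:
            return False
        self.submission.append(command_id)
        return True

    def process(self) -> None:
        """Device side: consume every queued command and post a completion."""
        while self.submission:
            self.completion.append(self.submission.popleft())

    def reap(self) -> list:
        """Host side: collect and clear the posted completions."""
        done = list(self.completion)
        self.completion.clear()
        return done
```

A host could create one such pair per CPU core so that cores never contend for a shared queue, which is one way to read the "up to 64K queues" figure.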

(10)

Serviceable Form Factor for Data Center

A serviceable (hot pluggable) form factor is critical in Data Center

- The SFF-8639* form factor / connector supports NVM Express* (NVMe), SAS, and SATA
- Enables OEMs to transition at their own speed

SFF-8639 can be used with existing platforms using a PCI Express* (PCIe*) adapter

NVMe is a great Data Center investment, near-term and long-term.

(11)

NVM Express* Driver Ecosystem

6.5 | 7.0

SLES 11 SP3

SLES 12

ESXi 5.5

13 | 14

Windows* 8.1

Linux* NVM Express* driver

(12)

Analyzing What Matters

What matters in today’s Data Center is not just IOPs and bandwidth

Let’s look at efficiency of the software stack, latency, and consistency

Server Setup

- Basic 4U Intel® Xeon™ E5 processor based server
- Out of box software setup
- Moderate workload: 8 workers, QD=4, random reads

Not strenuous on purpose – evaluate protocol and not the server

NVM Express*(NVMe)

PCI Express*(PCIe*)

Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests, such as SYSmark* and MobileMark*, are measured using specific computer systems, components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated purchases, including the performance of that product when combined with other products.

Storage Protocols Evaluated

Interface

6Gb SATA*

6Gb SATA

6Gb SAS

12Gb SAS

PCIe* Gen 3

NVMe

(13)

Server setup:

- 2-Socket Intel® Xeon® E5-2690v2 + 64GB RAM + SSD Boot/Swap – EPSD 4U S2600CP Family
- Linux* 2.6.32-461.el6.bz1091088.2.x86_64 #1 SMP Thu May 1 17:05:30 EDT 2014 x86_64 x86_64 x86_64 GNU/Linux
- CentOS 6.5* fresh build, yum -y update (no special kernel or driver)

SSDs used:

- LSI 9207-8i* + 6Gb SAS HGST* Drive @ 400GB & LSI 9207-8i* + 6Gb SATA Intel® SSD DC S3700 @ 400GB
- LSI 9300-8i* + 12Gb SAS HGST* Drive @ 400GB
- Onboard SATA Controller + SATA Intel® SSD DC S3700 @ 400GB
- Intel® SSD DC P3700 Series NVM Express* (NVMe) drive at 400GB

FIO workload:

- fio --ioengine=libaio --description=100Read100Random --iodepth=4 --rw=randread --blocksize=4096 --size=100% --runtime=600 --time_based --numjobs=1 --name=/dev/nvme0n1 --name=/dev/nvme0n1 --name=/dev/nvme0n1 --name=/dev/nvme0n1 --name=/dev/nvme0n1 --name=/dev/nvme0n1 --name=/dev/nvme0n1 --name=/dev/nvme0n1 2>&1 | tee -a NVMeONpciE.log
- 8x workers, QD4, random read, 4k block, 100% span of target, unformatted partition
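The structure of the fio command (shared options followed by one job name per worker) can be made explicit with a small helper. This is a sketch, not part of the original deck: `build_fio_command` is a hypothetical name, the option names are taken from the slide, and the leading `--` on `runtime`, `time_based`, and `numjobs` is an assumption about characters lost in extraction.

```python
import shlex


def build_fio_command(device: str, workers: int = 8) -> list:
    """Assemble the fio argument list described on the slide."""
    args = [
        "fio",
        "--ioengine=libaio",
        "--description=100Read100Random",
        "--iodepth=4",          # QD4 per worker
        "--rw=randread",        # 100% random reads
        "--blocksize=4096",     # 4k blocks
        "--size=100%",          # 100% span of the target
        "--runtime=600",
        "--time_based",
        "--numjobs=1",
    ]
    # One --name per worker, as on the slide's command line.
    args += [f"--name={device}"] * workers
    return args


if __name__ == "__main__":
    # Print the command instead of running it; running it needs fio and a test device.
    print(shlex.join(build_fio_command("/dev/nvme0n1")))
```

Printing rather than executing keeps the sketch safe to run on a machine without fio or without a disposable target device.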

(16)

The Efficiency of NVM Express* (NVMe)

- CPU cycles in a Data Center are precious
- And, each CPU cycle required for an IO adds latency
- NVM Express* (NVMe) takes less than half the CPU cycles per IO that SAS does

With equivalent CPU cycles, NVMe delivers over 2X the IOPs of SAS!
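The link between cycles per IO and IOPS is simple division, sketched below. The clock rate and cycle counts are hypothetical placeholders chosen only to illustrate "less than half the cycles"; they are not measurements from the slides.

```python
def iops_per_core(core_hz: float, cycles_per_io: float) -> float:
    """I/Os per second one core can drive if it spends every cycle on I/O."""
    return core_hz / cycles_per_io


def cpu_time_per_io_us(core_hz: float, cycles_per_io: float) -> float:
    """CPU time that one I/O costs, in microseconds."""
    return cycles_per_io / core_hz * 1e6


# Hypothetical placeholders, NOT figures from the presentation.
CORE_HZ = 3.0e9
SAS_CYCLES_PER_IO = 30_000
NVME_CYCLES_PER_IO = 12_000  # "less than half" of the SAS placeholder

if __name__ == "__main__":
    for name, cycles in (("SAS", SAS_CYCLES_PER_IO), ("NVMe", NVME_CYCLES_PER_IO)):
        print(name, iops_per_core(CORE_HZ, cycles), cpu_time_per_io_us(CORE_HZ, cycles))
```

Because IOPS per core is inversely proportional to cycles per IO, any stack that needs less than half the cycles delivers more than twice the IOPS from the same CPU budget, and each IO also spends less time in software.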

Tests document performance of components on a particular test, in specific systems. Differences in hardware, software, or configuration will affect actual performance. Consult other sources of information to evaluate performance as you consider your purchase. Test and System Configurations: PCI Express*(PCIe*)/NVM Express*(NVMe) Measurements made on Intel® Core™ i7-3770S system @ 3.1GHz and 4GB Mem running Windows*Server 2012 Standard O/S, Intel PCIe/NVMe SSDs, data collected by IOmeter*tool. PCIe/NVMe SSD is under development. SAS Measurements from HGST Ultrastar*SSD800M/1000M (SAS) Solid State Drive Specification. SATA Measurements from Intel Solid State Drive DC P3700 Series Product Specification. For more complete information about performance and benchmark results, visit

(19)

The Latency of NVM Express* (NVMe)

- The efficiency of NVM Express* (NVMe) directly results in leadership latency
- When doubling from 6Gb to 12Gb, SAS only reduces latency by ~60 µs
- NVMe has more than 200 µs lower latency than 12Gb SAS

NVMe delivers the lowest latency of any standard storage interface


(20)

The Consistency of NVM Express* (NVMe)

NVM Express* (NVMe) leadership on latency and efficiency is consistently amazing

SAS is a mature software stack with over a decade of tuning, yet the first generation NVM Express software stack has 2 to 3X better consistency

Tests document performance of components on a particular test, in specific systems. Differences in hardware, software, or configuration will affect actual performance. Consult other sources of information to evaluate performance as you consider your purchase. Test and System Configurations: PCI Express*(PCIe*)/NVM Express*(NVMe) Measurements made on Intel® Core™ i7-3770S system @ 3.1GHz and 4GB Mem running Windows*Server 2012 Standard O/S, Intel PCIe/NVMe SSDs, data collected by IOmeter*tool. PCIe/NVMe SSD is under development. SAS Measurements from HGST Ultrastar*SSD800M/1000M (SAS) Solid State Drive Specification. SATA Measurements from Intel Solid State Drive DC P3700 Series Product Specification. For more complete information about performance and benchmark results, visit

http://www.intel.com/performance. Source: Intel Internal Testing

(21)

NVM Express* (NVMe) Delivers Best in Class IOPs

- 100% random reads: NVMe has >3X better IOPs than SAS 12Gbps
- 70% random reads: NVMe has >2X better IOPs than SAS 12Gbps
- 100% random writes: NVMe has ~1.5X better IOPs than SAS 12Gbps

[Chart: 4K Random Workloads; y-axis IOPS, 0 to 500,000; x-axis 100% Read, 70% Read, 0% Read; series PCIe/NVMe, SAS 12Gb/s, SATA 6Gb/s HE]


(22)

And Best in Class Sequential Performance

- NVM Express* (NVMe) delivers > 2.5 GB/s of read and ~ 2 GB/s of write performance
- 100% reads: NVMe has >2X better performance than SAS 12Gbps
- 100% writes: NVMe has >2.5X better performance than SAS 12Gbps

[Chart: Sequential Workloads; y-axis MBps, 0 to 3,000; x-axis 100% Read, 0% Read; series PCIe/NVMe, SAS 12Gb/s, SATA 6Gb/s HE]


(23)

NVM Express* (NVMe) Development Timeline

Timeline, 2011 to 2015:

- NVMe 1.0 – Mar 2011: Queuing Interface, Command Set, End-to-End Protection, Security, PRPs
- NVMe 1.1 – Oct 2012: Multi-Path IO, Namespace Sharing, Reservations, Autonomous Power Transition, Scatter Gather Lists
- NVMe 1.2 – Q4 2014: Host Memory Buffer, Replay Protected Area, Active/Idle Power and RTD3, Temperature Thresholds, Namespace Management, Controller Memory Buffer, Live Firmware Update, Atomicity Enhancements

NVM Express* (NVMe) revision 1.2 released

(24)

NVM Express* (NVMe) Management Interface

- Defines out-of-band management that is independent of the physical transport and protocol
- Maps the management interface to one or more out-of-band physical interfaces (e.g., I2C, PCI Express*)
- Specifies a management command set for NVM Express* (NVMe*) devices

[Diagram: management stack on a Management Controller (BMC or Host Processor)
- Application Layer: Management Applications (e.g., Remote Console)
- Protocol Layer: NVMe Management Interface
- Transport Layer: Management Component Transport Protocol (MCTP), with MCTP over SMBus/I2C Binding and MCTP over PCIe Binding
- Physical Layer: SMBus/I2C, PCIe VDM]

(25)

Applications Work Better w/ NVM Express*!

- Virtualization: NVM Express* SSDs lower enterprise IT TCO by enabling increased Virtual Machine scalability and optimizing platform utilization
- Private Cloud: Software Defined Infrastructure or hyper convergence is made affordable with high performance SSDs
- Database: Consistent, low latency, high bandwidth performance of NVMe shines in traditional relational databases
- Big data: Analytics and NoSQL databases fully utilize NVMe performance to provide near real time results
- HPC: NVMe keeps up with high bandwidth demands of HPC to help speed up overall workflow times

Products shown for each segment: P3700, P3600, P3500

(26)

High Performance Low TCO with NVMe SSDs

SGI at New Orleans SC14

- Integrated Compute/Storage: 64x NVMe SSDs Intel P3700
- 30 Million IOPS = 30+ Arrays
- 30+ Arrays are 400% more expensive, not counting HBAs, switches, cables, racks, power, software, etc.

(34)

Software-Defined Storage with NVM Express* SSDs

VMware supports NVM Express

[Diagram labels: Policy-driven Control Plane; Virtual Data Services (Data Protection, Mobility, Performance); Virtual Data Plane; x86 Servers, Hypervisor-converged Storage Pool, Virtual SAN; SAN / NAS, SAN/NAS Pool, VVOL; Cloud Object Storage, Object Storage Pool]

[Chart: IOPS, 70/30 R/W – 4K, 100% Random; x-axis # Virtual Machines (4, 8, 16, 32, 64); y-axis 0 to 90,000; series ESXi 5.5 S3700+HDD, ESXi 5.5 P3700+S3500, ESXi 6.0 P3700+S3500]

Reference paper including test systems and configurations:

http://www.intel.com/content/www/us/en/solid-state-drives/intel-vmware-scalable-enterprise-storage-brief.html

(35)

Accelerating Database with NVMe* SSDs

Top Use Cases:

- DB Logs – promotes faster writes and replication
- Pure SSD I/O intensive databases of all types
- DRAM augmentation, e.g., SAP* HANA dynamic tiering, Aerospike
- Intel CAS/B-Cache & TempDB (Sort)

Microsoft* SQL Server (RDBMS) Buffer Pool Extension:

- Faster MS SQL Server tiering
- Increased random I/O throughput
- Reduced I/O latency
- Increased transaction throughput

[Chart: System Cost per Transaction; labels and values in extraction order: $35+, Baseline, w/ Paging, $23+, w/ SSD, $2.72; callouts 37x and 40%]

https://www-ssl.intel.com/content/www/us/en/solid-state-drives/sql-server-optimization-with-ssds-paper.html


(37)

Form Factors for PCI Express*

Data Center

Client

SFF-8639

SATA Express

AIC

SFF-8639

SATA Express

M.2

Add in Card

M.2

BGA

HD SSD FF

(38)

SATA Express* and SFF-8639 Comparison

                   SATAe                SFF-8639
SATA / SAS         SATA                 SATA / SAS
PCI Express®       x2                   x4 or dual x2
Host Mux           Yes                  No
Ref Clock          Optional             Required
EMI                SRIS                 Shielding
Height             7mm                  15mm
Max Performance    2 GB/s               4 GB/s
Bottom Line        Flexibility & Cost   Performance

Source: Seagate* (with permission)

(39)

M.2 Form Factor Comparison

                                    M.2 Socket 2    M.2 Socket 3
SATA* / PCI Express* (PCIe*) x2     Yes, Shared     Yes, Shared
PCIe x4                             No              Yes
Comms Support                       Yes             No
Ref Clock                           Required        Required
Max Performance                     2 GB/s          4 GB/s

[Diagram: Host Socket 2, Host Socket 3, Device w/ B&M Slots]

M.2 Socket 3 is the best option for Data Center PCIe SSDs

(40)

Cable Options for Data Center PCI Express* SSD Topologies

miniSAS HD cables lightly modified for PCI Express* are being used due to the robust connector and high volume manufacturing.

[Diagram labels: SMBUS, PCIe Reset, Reference Clock]

(41)

Basic PCI Express* SSD Topology – 1 Connector

- SFF-8639 Connector directly attached to board
- Mostly used in small form factors such as compute node, blade, etc.

(42)

Basic PCI Express* SSD Topology – 2 Connector

1

2

SFF-8639 Connector

PCIe 3.0 x4

Enterprise SSD

miniSAS HD Connector

PCIe* Cable

External Power

(43)

Basic PCI Express* SSD Topology – 3 Connector

Motherboard

1

3

2

SFF-8639 Connector

Backplane

miniSAS HD Connector

SSD Drive Carrier

PCIe® Cable

miniSAS HD Connector

(44)

Port Expansion Devices - Switches

Use Switches to expand number of PCIe* SSDs

[Diagram labels: Intel CPU; PCIe® 3.0 x8 link; Switch with Port A, Port B, Port C, Port D; four x4 links; four PCIe SSDs]
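Port expansion trades upstream bandwidth for drive count. The sketch below computes the lane oversubscription ratio, assuming the x8 link in the diagram is the switch's upstream port to the CPU and each of the four x4 links feeds one SSD; the `oversubscription` helper is a hypothetical name.

```python
def oversubscription(upstream_lanes: int, downstream_lanes: list) -> float:
    """Ratio of total downstream lanes to upstream lanes behind a switch."""
    if upstream_lanes <= 0:
        raise ValueError("upstream_lanes must be positive")
    return sum(downstream_lanes) / upstream_lanes


if __name__ == "__main__":
    # Assumed reading of the diagram: x8 uplink, four x4 SSD links.
    print(oversubscription(8, [4, 4, 4, 4]))
```

A ratio above 1 means the drives can together request more bandwidth than the uplink carries, which only matters when all of them are busy at the same time.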

(45)

Link Extension Devices – Switches and Retimers

Use Link Extension Devices for longer topologies

[Diagram labels: Intel CPU; Retimer; Switch with Port A, Port B; PCIe SSDs; x4 links; PCIe 3.0 x4 links]

(46)

PCI Express* (PCIe*) Switches and Retimers

PCIe Switches

- Use for link extension and/or port expansion
- Hot-plug and error isolation
- High performance peer-to-peer transfers
- Extra software features

Retimers

- Mostly transparent to software
- Retimers should be more common in PCIe 4.0

Link Extension Devices

- Use when channel has > -20 dB loss at 8 GT/s (PCIe 3.0)

Retimer vs. Re-driver

- Repeater: A Retimer or a Re-driver
- Re-driver: Analog and not protocol aware
- Retimer: Physical Layer protocol aware, software transparent, Extension Device. Forms two separate electrical sub-links.
  - Executes equalization procedure on each sub-link

Recommend using only switches or retimers for link extension of PCIe
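The loss guideline above can be applied as a simple budget check. This sketch assumes the slide's "> -20db" means more than 20 dB of end-to-end insertion loss at 8 GT/s; the segment names and per-segment loss values are hypothetical placeholders, not data from the presentation.

```python
LOSS_BUDGET_DB = 20.0  # threshold from the slide, at 8 GT/s (PCIe 3.0)


def total_loss_db(segment_losses_db: dict) -> float:
    """Sum the insertion loss (positive dB values) of every channel segment."""
    return sum(segment_losses_db.values())


def needs_link_extension(segment_losses_db: dict) -> bool:
    """True if the channel exceeds the budget and calls for a switch or retimer."""
    return total_loss_db(segment_losses_db) > LOSS_BUDGET_DB


# Hypothetical channel, for illustration only.
EXAMPLE_CHANNEL = {
    "motherboard trace": 6.0,
    "riser connector": 1.5,
    "cable": 8.0,
    "backplane trace": 5.0,
    "drive connector": 1.5,
}

if __name__ == "__main__":
    print(total_loss_db(EXAMPLE_CHANNEL), needs_link_extension(EXAMPLE_CHANNEL))
```

Each added connector or cable in the multi-connector topologies consumes part of the same budget, which is why longer topologies tend to need a switch or retimer.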

(47)

Complex PCI Express* Topology – 4 Connector

PCIe x16 slot

miniSAS HD for PCIe

1

2

3

4

PCIe® Cable

Backplane

SFF-8639 Connector

SSD Drive Carrier

Cabled Add in

card with Link

Extension

(48)

Complex PCI Express* Topology – 5 Connector

PCIe x16 slot

Cabled Add in

card with Link

Extension

miniSAS HD for PCIe

Backplane

1

2

4

PCIe x16 Riser

3

5

PCIe®

Cable

SSD Drive Carrier

SFF-8639 Connector

(49)

PCI Express* cabling for future topologies - OCuLink*

Category                  OCuLink*
Standard Based            PCI-SIG*
PCIe® Lanes               X4
Layout                    Smaller footprint
Signal Integrity          Similar on loss dominated channels
PCIe 4.0 ready            16GT/s target
Clock, power              Supports SRIS and 3.3/5V power
Production Availability   Mid 2015

[Figure: OCuLink internal cables and connectors; dimensions 12.85mm, 2.83mm]

Source:

(50)

OCuLink* Provides Flexible Data Center Topologies

[Diagram labels: SFF-8639 Connector, Backplane, Cabled add in card, PCI Express* SSD, Board to board connections]

(51)

Dual Port Active/Active PCI Express* NVM Express* Topology

- Two controllers provide access for Storage Users over redundant networks
- Both controllers have simultaneous access to each drive through separate PCIe fabrics
  - One port on each drive is connected to each fabric
- During normal operation the two controllers work cooperatively sharing the drive
- Each controller monitors its partner’s health via a heartbeat signal
- In the event of a controller/switch/network failure the remaining host can continue system operation
- The system can be serviced and repaired without taking it off-line

[Diagram: Controller A and Controller B, joined by a Heartbeat link; Switch A and Switch B; three Dual Port SSDs, each with Port 0 and Port 1]
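The heartbeat-based failover described above can be sketched in a few lines. Everything here is illustrative: the `Controller` type, the function names, and the one-second timeout are assumptions, since the slide does not specify how the heartbeat is implemented.

```python
from dataclasses import dataclass

HEARTBEAT_TIMEOUT_S = 1.0  # hypothetical timeout; the slide gives no value


@dataclass
class Controller:
    name: str
    last_heartbeat_s: float  # time at which the partner last heard from this controller


def healthy(controller: Controller, now_s: float) -> bool:
    """A controller counts as healthy if its heartbeat is recent enough."""
    return now_s - controller.last_heartbeat_s <= HEARTBEAT_TIMEOUT_S


def serving_controllers(a: Controller, b: Controller, now_s: float) -> list:
    """Names of the controllers that should serve I/O at time now_s."""
    return [c.name for c in (a, b) if healthy(c, now_s)]


if __name__ == "__main__":
    a = Controller("A", last_heartbeat_s=10.0)
    b = Controller("B", last_heartbeat_s=8.0)  # B has gone quiet
    print(serving_controllers(a, b, now_s=10.5))
```

In normal operation both names are returned (active/active); when one heartbeat goes stale, the survivor keeps serving every drive through its own port and fabric.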

(52)

Dual Port PCI Express* SSD Topology

52

2.5”

2.5”

SFF-8639 Dual-port

connectivity solution

An integrated SFF-8639

(53)

NVM Express* Storage Device Management

[Diagrams: three topologies, Server Caching, Server Storage, and External Storage. Labels: Root Complex, PCIe, PCIe Switch, x16 and x4 links, NVMe devices, Controller A, Controller B, SAS, SAS HDD]

Example Pre-boot Management

Inventory, Power Budgeting, Configuration, Firmware Update

Example Out-of-Band Management During System Operation

(54)

NVM Express* Expected in 2015 Client

Intel launches 750 Series PCI Express* Gen3x4 NVMe

*Other names and brands may be claimed as the property of others.

(55)

NVM Express* Forecasted to Lead in Client

- AHCI PCI Express* (PCIe*) SSDs started transition from SATA in client in 2013
  - AHCI PCIe SSD is PCIe hardware with legacy SATA* software interface
- PCIe SSDs estimated to be > 25% of client SSD market in 2016


[Chart: Client SSD Interface Trend; x-axis 2010 to 2018; y-axis 0% to 100%; series SATA II, SATA III, PCIe]

Chart from Forward Insights SSD Insights Q4’14

(56)

Client Form Factors

There are a variety of form factors for client depending on platform needs

- M.2 is an optimized SSD only form factor for laptops
- Intel, SanDisk*, Toshiba*, and HP* proposed a BGA solution for standardization for behind-the-glass usages (e.g., 2-in-1 laptops) – join PCI SIG to participate!
- 2.5” SFF-8639 enables highest capacity and performance

Form factors are there to support client adoption in 2015

(57)

Announced client NVM Express* (NVMe*) SSDs deliver leadership performance over SATA*

SATA is 600 MB/s while PCI Express* (PCIe*) Gen3 x4 is over 4000 MB/s – more than 5X the headroom
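The headroom figure follows from the line rates and encodings of the two interfaces. The sketch below is a worked calculation rather than anything from the deck: it uses the standard 8b/10b encoding of 6 Gb/s SATA and the 128b/130b encoding of PCIe 3.0, and the `link_bandwidth_mb_s` helper is a hypothetical name. Note that 8 GT/s across four lanes is 4000 MB/s before encoding overhead and slightly less after it.

```python
def link_bandwidth_mb_s(gigatransfers_per_s: float, lanes: int,
                        payload_bits: int, encoded_bits: int) -> float:
    """Usable link bandwidth in MB/s after line encoding (1 MB = 10**6 bytes)."""
    bits_per_s = gigatransfers_per_s * 1e9 * lanes * payload_bits / encoded_bits
    return bits_per_s / 8 / 1e6


SATA_6G_MB_S = link_bandwidth_mb_s(6.0, 1, 8, 10)          # 8b/10b encoding
PCIE_GEN3_X4_MB_S = link_bandwidth_mb_s(8.0, 4, 128, 130)  # 128b/130b encoding

if __name__ == "__main__":
    print(f"SATA 6Gb/s:   {SATA_6G_MB_S:.0f} MB/s")
    print(f"PCIe Gen3 x4: {PCIE_GEN3_X4_MB_S:.0f} MB/s")
    print(f"Headroom:     {PCIE_GEN3_X4_MB_S / SATA_6G_MB_S:.2f}x")
```

Either way the ratio stays well above the 5X quoted on the slide; protocol and packet overheads, which this sketch ignores, reduce both figures further in practice.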

However, for client what really matters is power

[Chart: Video Playback, Optimized for Power; y-axis SSD Power (milliWatts), 0 to 80; bars SATA, NVMe]

Note: Measurements made on next generation Intel® Core™ SIP platform running Windows* 8.1 Standard O/S, Intel® Rapid Storage Technology (Intel® RST) prototype driver, leadership SATA SSD and prototype NVMe* SSD, 1080p Video Playback workload description available in Ultrabook™ logo documentation. Power data averaged over 3 runs.

NVMe is best in class – great performance with low power

Tests document performance of components on a particular test, in specific systems. Differences in hardware, software, or configuration will affect actual performance. Consult other sources of information to evaluate performance as you consider your purchase. Test and System Configurations noted above.


(61)

NVM Express* (NVMe) in Fabric Environments

- A primary use case for NVMe is in a Flash appliance
- Hundreds or more SSDs may be attached – too many for PCI Express* based attach
- Concern: Remote SSD attach over a fabric uses SCSI based protocols today – requiring protocol translation

Desire best performance and latency from SSD investment over fabrics like Ethernet, Infiniband, and Intel® Omni Scale Fabric

[Diagram labels: Windows* client, Linux* client, SMB3, ISER, Flash Appliance front-end, NVMe Host Software, three NVMe NVM Subsystems, SCSI on the connecting links]

(62)

Introducing NVM Express* (NVMe) over Fabrics

Extend efficiency of NVMe over front or back-end fabrics.

[Diagram labels: Windows* client, Linux* client, Optimized client, SMB3, SCSI, NVMe over Fabrics, Front-end Fabric, Flash Appliance front-end, NVMe Host Software, Back-end Fabric, three NVMe NVM Subsystems]

(63)

NVM Express* (NVMe) over Fabrics?

Simplicity, Efficiency and End-to-End NVM Express* (NVMe) Model

- NVMe has a single Admin queue pair with 10 required commands

- NVMe supports up to 64K I/O Queues with 3 required commands

- Simplicity of protocol enables hardware automated I/O Queues – transport bridge

- No translation to or from another protocol like SCSI (in firmware/software)

- Inherent parallelism of multiple I/O Queues is exposed

- NVMe commands and structures are transferred end-to-end

Goal: Make remote NVMe equivalent to local NVMe, within ~10 µs added latency.
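
The queue model listed on this slide can be pictured with a small sketch. This is a toy illustration only, not the NVMe driver or the prototype's code: the class and function names are hypothetical, the "controller" is a stand-in loop, and the Admin queue (queue ID 0) is not modeled. It only shows why independent per-core I/O queue pairs expose parallelism without a shared lock. The opcodes are those of the three mandatory NVM I/O commands (Flush, Write, Read).

```python
from collections import deque
from dataclasses import dataclass, field

# The three mandatory NVM I/O commands and their opcodes.
IO_OPCODES = {"flush": 0x00, "write": 0x01, "read": 0x02}
MAX_IO_QUEUES = 64 * 1024  # "up to 64K I/O Queues" from the slide


@dataclass
class QueuePair:
    """Hypothetical submission/completion queue pair (qid 0 would be Admin)."""
    qid: int
    depth: int = 64
    sq: deque = field(default_factory=deque)  # submission queue
    cq: deque = field(default_factory=deque)  # completion queue
    next_cid: int = 0

    def submit(self, op: str, lba: int = 0) -> int:
        """Host side: place a command in the SQ and return its command ID."""
        if len(self.sq) >= self.depth:
            raise BufferError(f"SQ {self.qid} is full")
        cid = self.next_cid
        self.next_cid += 1
        self.sq.append((cid, IO_OPCODES[op], lba))
        return cid

    def process(self) -> None:
        """Stand-in for the controller: drain the SQ, post completions."""
        while self.sq:
            cid, _opcode, _lba = self.sq.popleft()
            self.cq.append((cid, 0))  # status 0 = success

    def reap(self) -> list:
        """Host side: collect and clear posted completions."""
        done = list(self.cq)
        self.cq.clear()
        return done


def make_io_queues(n_cores: int) -> dict:
    """One I/O queue pair per core; I/O queue IDs start at 1."""
    if not 1 <= n_cores <= MAX_IO_QUEUES:
        raise ValueError("queue count out of range")
    return {core: QueuePair(qid=core + 1) for core in range(n_cores)}


if __name__ == "__main__":
    queues = make_io_queues(4)
    cid = queues[2].submit("read", lba=128)
    queues[2].process()
    print(queues[2].reap(), queues[0].reap())
```

Because each core owns its own pair, a command submitted on one queue never touches another queue's state; that independence is what a transport bridge can carry end-to-end.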

(64)

Prototype Exceeds Goal

Recall: Goal is remote NVM Express* equivalent to local NVMe, with up to 10 µs added latency

Prototype delivers 450K IOPS for both the local and remote NVMe devices

Remote NVMe adds 8 µs latency versus local NVMe access

Prototype demonstrated at SF14 IDF
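
The figures on this slide lend themselves to a quick check. The sketch below is illustrative arithmetic, not part of the prototype: it compares the measured 8 µs against the 10 µs goal and applies Little's law (in-flight I/Os = IOPS × latency) to estimate how many additional commands must be outstanding to hold 450K IOPS when each remote I/O takes 8 µs longer. The function and variable names are hypothetical.

```python
def extra_in_flight(iops: float, added_latency_us: float) -> float:
    """Little's law: extra outstanding I/Os needed to absorb added latency."""
    return iops * added_latency_us / 1e6


IOPS = 450_000   # from the slide, local and remote
ADDED_US = 8     # measured added latency for remote NVMe
GOAL_US = 10     # stated goal for added latency

extra = extra_in_flight(IOPS, ADDED_US)
headroom_us = GOAL_US - ADDED_US

print(f"extra in-flight I/Os: {extra:.1f}")          # 3.6
print(f"headroom against goal: {headroom_us} us")    # 2 us
```

In other words, holding the same IOPS over the fabric costs fewer than four additional outstanding commands in this estimate.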

[Figure: prototype block diagram. Host System: FIO Test Applications (Local NVMe IOPS, Remote NVMe IOPS), NVMe Host Software (Common), Transport Abstraction, NVMe PCIe SSD (Intel® P3700), RDMA Transport. RDMA Fabric: 40G Ethernet iWARP RDMA. NVMe over Fabrics Subsystem Target: RDMA Transport, Transport Abstraction, NVMe Transport Bridge, NVMe Controller Software, NVMe PCIe SSD (Intel® P3700).]

Intel i7-4790 3.6GHz Processors, 8GB DDR-1600, Gigabyte* GA-Z97X-UD7 MB, Intel P3700 800G SSDs, Chelsio* T580-CR 40GbE iWARP NIC. RHEL7 Linux, OFED 3.2 Software, FIO V2.1.10. Source: Intel. Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests, such as SYSmark* and MobileMark,* are measured using specific computer systems, components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated purchases, including the performance of that product when combined with other products.

(65)

NVM Express* (NVMe) SSDs are a great Data Center investment, near term and long term

NVMe SSDs deliver the lowest latency of any standard storage interface

The ecosystem is maturing: NVMe SSDs are trending to replace SATA/SAS SSDs and go mainstream, and NVMe SSD solutions/optimizations at horizontals are expediting this transition

NVMe is expected to come to client platforms in 2015, with great performance and power

Innovation continues. Get involved in NVMe over Fabrics!

(66)

Next Steps

Modernize your Data Center with NVM Express* (NVMe) SSDs, a strong foundation for the next decade of storage innovation

Prepare for NVM Express in 2015 Client Platforms

Get involved in NVMe SSD solutions/optimizations at horizontals, such as Ceph, Lustre*, Databases, Virtualization, etc.

Join NVM Express: visit www.nvmexpress.org/join-nvme

- Help define the Management Interface

- Get involved in the definition of NVMe over Fabrics

*Other names and brands may be claimed as the property of others.

(67)

Additional Sources of Information

A PDF of this presentation is available from our Technical Session Catalog: www.intel.com/idfsessionsSZ. This URL is also printed on the top of Session Agenda Pages in the Pocket Guide.

Demos in the showcase: Booth #617

More web based info: www.intel.com/ssd

(68)

Other Technical Sessions/Demos/Posters

Technical Sessions

Session ID | Title | Day | Time | Room |
SSDS001 | NVM Express* Based Solid-State Drives: Crossing the Chasm, Going Mainstream | Wednesday | 15:45 | Room #2 | Session
SSDS002 | Key Factors to Consider When Designing Solid-State Drives into Your Data Center | Wednesday | 17:00 | Room #2 | Session

Demos

General

Exhibitor Booth | Demo Description | Day | Time
#617 | Intel’s First PCIe Gen3 x4 NVMe SSD 750 Series Demo | Wednesday, April 8, 2015; Thursday, April 9, 2015 | 10:30 – 18:00; 10:30 – 15:00
#617 | Intel® SSD DC P3600 Empowers >4.5M IOPS in 1U Server | Wednesday, April 8, 2015; Thursday, April 9, 2015 | 10:30 – 18:00; 10:30 – 15:00

Posters

Session ID | Title | Day | Time
SSDC001 | Innovations for NVMe Solid-State Drives; NVM Express* Ecosystem – Exploring PCIe* Form Factor | Wednesday, April 8, 2015; Thursday, April 9, 2015 | 13:00 – 15:00; 10:45 – 12:15

(69)

Legal Notices and Disclaimers

Intel technologies’ features and benefits depend on system configuration and may require enabled hardware, software or service activation. Learn more at intel.com, or from the OEM or retailer.

No computer system can be absolutely secure.

Tests document performance of components on a particular test, in specific systems. Differences in hardware, software, or configuration will affect actual performance. Consult other sources of information to evaluate performance as you consider your purchase. For more complete information about performance and benchmark results, visit http://www.intel.com/performance.

Test and System Configurations: PCI Express* (PCIe*)/NVM Express* (NVMe) Measurements made on Intel® Core™ i7-3770S system @ 3.1GHz and 4GB Mem running Windows* Server 2012 Standard O/S, Intel PCIe/NVMe SSDs, data collected by IOmeter* tool. PCIe/NVMe SSD is under development. SAS Measurements from HGST Ultrastar* SSD800M/1000M (SAS) Solid State Drive Specification. SATA Measurements from Intel Solid State Drive DC P3700 Series Product Specification. Source: Intel Internal Testing.

Cost reduction scenarios described are intended as examples of how a given Intel-based product, in the specified circumstances and configurations, may affect future costs and provide cost savings. Circumstances will vary. Intel does not guarantee any costs or cost reduction.

This document contains information on products, services and/or processes in development. All information provided here is subject to change without notice. Contact your Intel representative to obtain the latest forecast, schedule, specifications and roadmaps.

Statements in this document that refer to Intel’s plans and expectations for the quarter, the year, and the future, are forward-looking statements that involve a number of risks and uncertainties. A detailed discussion of the factors that could affect Intel’s results and plans is included in Intel’s SEC filings, including the annual report on Form 10-K.

The products described may contain design defects or errors known as errata which may cause the product to deviate from published specifications. Current characterized errata are available on request.

No license (express or implied, by estoppel or otherwise) to any intellectual property rights is granted by this document.

Intel does not control or audit third-party benchmark data or the web sites referenced in this document. You should visit the referenced web site and confirm whether referenced data are accurate.

Intel, Xeon, Core and the Intel logo are trademarks of Intel Corporation in the United States and other countries. *Other names and brands may be claimed as the property of others.

(70)

Legal Notices and Disclaimers

Intel's compilers may or may not optimize to the same degree for non-Intel microprocessors for optimizations that are not unique to Intel microprocessors. These optimizations include SSE2, SSE3, and SSSE3 instruction sets and other optimizations. Intel does not guarantee the availability, functionality, or effectiveness of any optimization on microprocessors not manufactured by Intel. Microprocessor-dependent optimizations in this product are intended for use with Intel microprocessors. Certain optimizations not specific to Intel microarchitecture are reserved for Intel microprocessors. Please refer to the applicable product User and Reference Guides for more information regarding the specific instruction sets covered by this notice.

Notice revision #20110804

(71)

Risk Factors

The above statements and any others in this document that refer to plans and expectations for the first quarter, the year and the future are forward-looking statements that involve a number of risks and uncertainties. Words such as "anticipates," "expects," "intends," "plans," "believes," "seeks," "estimates," "may," "will," "should" and their variations identify forward-looking statements. Statements that refer to or are based on projections, uncertain events or assumptions also identify forward-looking statements. Many factors could affect Intel's actual results, and variances from Intel's current expectations regarding such factors could cause actual results to differ materially from those expressed in these forward-looking statements. Intel presently considers the following to be important factors that could cause actual results to differ materially from the company's expectations. Demand for Intel’s products is highly variable and could differ from expectations due to factors including changes in the business and economic conditions; consumer confidence or income levels; customer acceptance of Intel’s and competitors’ products; competitive and pricing pressures, including actions taken by competitors; supply constraints and other disruptions affecting customers; changes in customer order patterns including order cancellations; and changes in the level of inventory at customers. Intel’s gross margin percentage could vary significantly from expectations based on capacity utilization; variations in inventory valuation, including variations related to the timing of qualifying products for sale; changes in revenue levels; segment product mix; the timing and execution of the manufacturing ramp and associated costs; excess or obsolete inventory; changes in unit costs; defects or disruptions in the supply of materials or resources; and product manufacturing quality/yields. Variations in gross margin may also be caused by the timing of Intel product introductions and related expenses, including marketing expenses, and Intel’s ability to respond quickly to technological developments and to introduce new features into existing products, which may result in restructuring and asset impairment charges. Intel's results could be affected by adverse economic, social, political and physical/infrastructure conditions in countries where Intel, its customers or its suppliers operate, including military conflict and other security risks, natural disasters, infrastructure disruptions, health concerns and fluctuations in currency exchange rates. Results may also be affected by the formal or informal imposition by countries of new or revised export and/or import and doing-business regulations, which could be changed without prior notice. Intel operates in highly competitive industries and its operations have high costs that are either fixed or difficult to reduce in the short term. The amount, timing and execution of Intel’s stock repurchase program and dividend program could be affected by changes in Intel’s priorities for the use of cash, such as operational spending, capital spending, acquisitions, and as a result of changes to Intel’s cash flows and changes in tax laws. Product defects or errata (deviations from published specifications) may adversely impact our expenses, revenues and reputation. Intel’s results could be affected by litigation or regulatory matters involving intellectual property, stockholder, consumer, antitrust, disclosure and other issues. An unfavorable ruling could include monetary damages or an injunction prohibiting Intel from manufacturing or selling one or more products, precluding particular business practices, impacting Intel’s ability to design its products, or requiring other remedies such as compulsory licensing of intellectual property. Intel’s results may be affected by the timing of closing of acquisitions, divestitures and other significant transactions. A detailed discussion of these and other factors that could affect Intel’s results is included in Intel’s SEC filings, including the company’s most recent reports on Form 10-Q, Form 10-K and earnings release.

Rev. 1/15/15

(72)
(73)

Server setup:

- 2-Socket Intel® Xeon™ E5-2690v2 + 64GB RAM + SSD Boot/Swap – EPSD 4U S2600CP Family

- Linux* 2.6.32-461.el6.bz1091088.2.x86_64 #1 SMP Thu May 1 17:05:30 EDT 2014 x86_64 x86_64 x86_64 GNU/Linux

- CentOS 6.5* fresh build, yum -y update (no special kernel or driver)

SSDs used:

- LSI 9207-8i* + 6Gb SAS HGST* Drive @ 400GB & LSI 9207-8i* + 6Gb SATA Intel® SSD DC S3700 @ 400GB

- LSI 9300-8i* + 12Gb SAS HGST* Drive @ 400GB

- Onboard SATA Controller + SATA Intel® SSD DC S3700 @ 400GB

- Intel® SSD DC P3700 Series NVM Express* (NVMe) drive at 400GB

FIO workload:

- fio --ioengine=libaio --description=100Read100Random --iodepth=4 --rw=randread --blocksize=4096 --size=100% --runtime=600 --time_based --numjobs=1 --name=/dev/nvme0n1 --name=/dev/nvme0n1 --name=/dev/nvme0n1 --name=/dev/nvme0n1 --name=/dev/nvme0n1 --name=/dev/nvme0n1 --name=/dev/nvme0n1 --name=/dev/nvme0n1 2>&1 | tee -a NVMeONpciE.log

- 8x workers, QD4, random read, 4k block, 100% span of target, unformatted partition
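
The workload line above repeats one job definition eight times, which is easy to get wrong by hand. A minimal sketch of generating the same argument list is shown here; the helper name `build_fio_argv` and its parameters are hypothetical, and it mirrors the slide in passing the device path as the job name (an explicit `--filename` option is a common alternative that the source does not show). It only builds and prints the command and does not run fio.

```python
import shlex


def build_fio_argv(device="/dev/nvme0n1", workers=8, iodepth=4, runtime_s=600):
    """Build the fio argument list from the slide: one --name per worker."""
    argv = [
        "fio",
        "--ioengine=libaio",
        "--description=100Read100Random",
        f"--iodepth={iodepth}",
        "--rw=randread",
        "--blocksize=4096",
        "--size=100%",
        f"--runtime={runtime_s}",
        "--time_based",
        "--numjobs=1",
    ]
    # Each repeated --name starts another job, giving "8x workers" by default.
    argv += [f"--name={device}"] * workers
    return argv


argv = build_fio_argv()
total_outstanding = 8 * 4  # workers x queue depth for the slide's settings

print(shlex.join(argv))
print("total outstanding I/Os:", total_outstanding)
```

With the slide's settings, eight workers at queue depth 4 keep up to 32 I/Os outstanding against the device.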
