• No results found

Intel Ivybridge vs. AMD Opteron: performance and power implications

N/A
N/A
Protected

Academic year: 2022

Share "Intel Ivybridge vs. AMD Opteron: performance and power implications"

Copied!
24
0
0

Loading.... (view fulltext now)

Full text

(1)

HEPiX Spring 2014

Tony Wong (BNL)

Intel Ivybridge vs. AMD Opteron:

performance and power implications

(2)

Acknowledgements

Thanks to Chris Hollowell and Shuwei Ye (BNL) for doing most of the work. We also thank Dell, HP and Penguin Computing for

making the hardware available for us to evaluate.

(3)

Background

Annual purchase cycle for RHIC-ATLAS Computing Facility (RACF) at BNL

Data Center constraints (space, power and cooling)

Vendor constraints (few AMD options)

Experimental requirements (computing and storage)

Other constraints

Migration to 10 Gb connectivity for RHIC

Compatible with Infiniband solutions?

(4)

Data Center constraints

Space

Approximately ~70% of 15,000 ft2 (~1,400 m2) data center taken

Remaining floor space requires power/cooling upgrades

Power

2 MW of usable UPS-backed power

Current usage ~1.1 MW (55% of maximum)

Cannot go much above ~80% due to configuration inefficiencies (ie, pdu-level redundancy for critical components)

Cooling

2 MW capacity

Few CRAC units on UPS power

(5)

Facility Heatmap (from Synapsense)

(6)

RACF Historical Power Usage

0 1000 2000 3000 4000 5000 6000 7000 8000

MWH

(7)

Historical Worker Node Count

0 500 1000 1500 2000 2500

2006 2007 2008 2009 2010 2011 2012 2013 2014 (proj.)

Worker Nodes

(8)

Experimental Requirements

RHIC

Disk-heavy worker nodes (2-U, dual-socket, multiple large SATA drives)

2+ GB of RAM per physical (Opteron) or logical (Ivybridge) core

10 Gb connectivity (new for 2014)

USATLAS

Disk-light worker nodes (1-U, dual-socket, multiple small SATA drives)

2+ GB of RAM per physical (Opteron) or logical (Ivybridge) core

(9)

Hardware Evaluation

CPU

E5-2695v2 (12 physical or 24 logical cores – Ivybridge 115 W TDP)

E5-2680v2 (10 physical or 20 logical cores – Ivybridge 115 W TDP)

E5-2660v2 (10 physical or 20 logical cores – Ivybridge 95 W TDP)

Opteron 4386 (8 physical cores)

Opteron 6380 (16 physical cores)

Storage

4 x 2 TB or 8 x 1 TB SATA drives (1-U)

12 x 4 TB SATA drives (2-U)

Vendors

HP (Ivybridge)

Dell (Ivybridge)

Penguin Computing (Opteron)

Want to validate HSPEC results with real-life applications

(10)

HSPEC

0 100 200 300 400 500 600

HSPEC

(11)

HSPEC per core

0 2 4 6 8 10 12

HSPEC/core

(12)

ATLAS Full Simulation

0 0.2 0.4 0.6 0.8 1 1.2

Completed Jobs/Minute

1 Job/Core

Ivybridge at higher Thermal Design Power (TDP) perform better than those at lower TDP (ie, E5-2695v2 vs. E5-2660v2)

Opteron 6380’s throughput is ~50% higher than 4386’s

Opteron 6380’s throughput is ~63% of E5-2680v2’s and

~77% of E5-2660v2’s

(13)

ATLAS Full Simulation

0 10 20 30 40 50 60

Time(Minute)

Single Job 1 Job/Core

Multi-job throughput is significantly worse for Ivybridge’s when compared to single-job performance due to

hyperthreading

(14)

Local Disk (random) I/O with Bonnie++ (aggregate)

0 200 400 600 800 1000 1200 1400

Read (MB/s) Write (MB/s)

Mostly dependent on # of drives, quality of drives and controller

Controller impacted 8x1 TB E5-2660v2 results negatively

More (high-quality) drives improves I/O but increases cost/server

(15)

Local Disk (Random) I/O with Bonnie++ (aggregate)

0 100 200 300 400 500 600 700 800 900 1000

4x1 TB E5-2470 (24 threads) -- 2013

8x500 GB E5-2660 (24 threads) - - 2013

4x2 TB Opteron 6380 (32 threads) -- 2014

Read (MB/s) Write (MB/s)

Test of Sandybridge cpu (2013) shows the superior performance of 8-disk configuration vs. 4-disk configuration

In 2014, results show 12-drive I/O is better than 8-drive I/O —> expect 8-drive to have better I/O performance than 4-drive configuration

Note 4-drive write performance in 2014 is already HIGHER than 8-drive write performance in 2013 and much higher than 4-drive configuration in 2013

Variations in SATA link rate (3 Gbps and 6 Gbps) accentuate results but does not alter general trends

(16)

The Effect of HT

HSPEC boost of 18-27% with HT enabled

I/O scales linearly with HT disabled

HT boosts ATLAS job throughput by ~15%

Turning off HT (and cutting back on RAM) increases the price competitiveness of Ivybridge by ~5% --not enough to overcome the price-performance advantages of the Opteron platforms

0.65 0.66 0.67 0.68 0.69 0.7 0.71 0.72 0.73 0.74

no HT (20 threads)

with HT (40 threads)

Completed Jobs/Minute

E5-2660v2

0 10 20 30 40 50 60

no HT (20 threads)

with HT (40 threads)

Time (Minutes)

Single Job 1 Job/Core

(17)

Cost per HSPEC for each configuration

0 5 10 15 20 25 30 35 40 45 50

Cost\HEPSPEC (US$)

USATLAS RHIC

4 x 2 TB 8 x 1 TB

(18)

Power Usage over 5 years

0 10 20 30 40 50 60 70 80 90 100

kWH\HEPSPEC

USATLAS RHIC

(19)

Cost Breakdown and Power Considerations

List prices (in US dollars) for servers (current as of May 7, 2014)

Cores assume dual-socket servers (no hyperthreading on Opteron)

Power cost based on BNL historical average ~6 cents/kwh

RAM upgrade costs $325 for each incremental 16 GB

CPU Cores Server Cost 10

GbE

Composite List Price Power Usage 5-yr Power Cost E5-2660v2 (1U) 40 $12,639 $473 $13,289 (128 GB RAM, no 10

GbE)

280 W $736

E5-2680v2 (1U) 40 $13,759 $473 $14,409 (128 GB RAM, no 10 GbE)

406 W $1,067

E5-2695v2 (1U) 48 $15,439 $473 $16,089 (128 GB RAM, no 10 GbE)

420 W $1,104

Opteron 6380 (1U) 32 $5,985(4x2)

$7,820(8x1)

Incl. $6,635 (96 GB RAM, no 10 GbE)

$8,470 (96 GB RAM, no 10 GbE)

380 W $999

E5-2660v2 (2U) 40 $16,790 $473 $17,263 445 W $1,169

E5-2680v2 (2U) 40 $17,910 $473 $18,383 570 W (est.) $1,498

E5-2695v2 (2U) 48 $19,590 $473 $20,063 598 W $1,572

Opteron 6380 (2U) 32 $10,980 Incl. $10,980 417 W (est.) $1,096

(20)

Procurement Guidance Summary

Composite price (in US dollars) includes server, RAM upgrade (USATLAS) and 10 GbE (RHIC)

Rightmost five columns normalized to a fixed, hypothetical budgetary constraint

Historical 25% discount NOT applied to composite cost

Final FY2013 final prices were $5.7k/server (USATLAS) and $7.3k/server (RHIC)

CPU Composite

List Price

kHSPEC Computing cores

Storage (TB)

Power (kW)

Space in ft2 (Racks)

E5-2660v2 (1U) $13,289 21.0 2,160 432 15 34 (2)

E5-2680v2 (1U) $14,409 21.7 1,960 392 20 34 (2)

E5-2695v2 (1U) $16,089 21.2 2,112 352 18 34 (2)

Opteron 6380 (1U) $6,635(4x2)

$8,470(8x1)

31.8 25.0

3,456 2,720

864 680

41 32

68 (4) 51 (3)

E5-2660v2 (2U) $17,263 30.3 3,120 3,744 35 85 (5)

E5-2680v2 (2U) $18,383 32.3 2,920 3,504 42 85 (5)

E5-2695v2 (2U) $20,063 32.3 3,216 3,216 40 85 (5)

Opteron 6380 (2U) $10,980 35.9 3,904 5,856 51 136 (8)

(21)

Effect of 2014 acquisitions

Facility infrastructure

Net power usage increase under ~100 kW

Net footprint increases up to ~200 ft2 (~19 m2)

Additional infrastructure (CRAC units and PDU’s) installed

Cost/worker node

Minimal ~3% increase for RHIC due to10 Gb connectivity

Ivybridge is pricy compared to Sandybridge and Opteron, even after dropping core count/socket and taking a historical ~25%

discount

Optional memory upgrade increases cost 5-10%

(22)

Implications for the future

With limited space, power and cooling until ~2020, several trends developing:

De-emphasize core count to optimize local disk I/O

Throughput more important than raw cpu performance

Reduce power footprint

Haswell to be released late in 2014

To be marketed as E5-26xxv3 series (up to 14 cores?)

Cannot time a FY14 procurement with Haswell release in hopes of a price drop for Ivybridge

Reported TDP goes up to 160 W – is that a bad omen?

Clouds and decreasing sales volume turning servers into a niche (expensive) market for hardware makers—consolidation among hardware brands a concern. Expand pool of acceptable brands?

(23)

Back-up slides

(24)

Local Disk (Random) I/O with Bonnie ++ (per core)

0 5 10 15 20 25 30 35 40 45 50

Read (MB/s) Write (MB/s)

References

Related documents

Best technology available (Application vs. technology) conventional Sealer / Coating Ink only UV Offset High Gloss Coating Special Substrates.. Drying principles

In- vestment agreement, investment freedom, market size, trade openness, natural re- sources, gross fixed capital formation, infrastructure, inflation, exchange rate stability,

The core elements of the conception of phenomenological language that converge with the uniform account of pain expressions are the following. According to this conception,

It is seen that the measured ratio LAI(LA+TA) decreases slightly for the 3nm and S.lnm devices. Two explanations may be put forward for this exception to the general

Republican presidential candidate Rick Santorum fueled the debate with his positions against abortion, birth control and prenatal testing, and his disdain for the “role of ‘radical

Biomass prediction in tropical forests : the canopy grain approach.. Remote sensing of biomass : principles and applications, Intech,

The third research question asked to what extent CRT, AOT, NFC, and Numeracy are associated with allocating organs to the high survival groups in the MoAs and to what extent

The objective of this paper is to contribute to a better understanding of consumer expectations and needs regarding public transportation and a provision of