Transforming the UL into a Big Data University. Current status and planned evolutions


(1)

Transforming the UL into a Big Data University
Current status and planned evolutions

Sébastien Varrette, PhD
Prof. Pascal Bouvry
Prof. Volker Müller

December 6th, 2013


(4)

In this talk: Storage infrastructure @ UL

Classical storage metrics

✓ Storage capacity: multiples of bytes (e.g. 1 TB = 10^12 bytes, whereas 1 TiB = 1024^4 bytes)

✓ Transfer rate on a medium: Mb/s or MB/s

✓ Other metrics: sequential vs. random R/W speed, IOPS

‣ see the unit-conversion sketch after this slide

The Big Data Challenge: the 4 V's

✓ [ Volume | Velocity | Variety | Veracity ]

‣ Also relevant for Luxembourg's research priorities

‣ Large number of diverse data sources to integrate

‣ Not just about storage capacity!

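To make the unit conventions concrete, here is a minimal Python sketch (illustrative only, not part of the original slides) contrasting decimal and binary capacity units, and Mb/s vs. MB/s:

```python
# Decimal (SI) vs. binary (IEC) storage units, and bits vs. bytes.
TB  = 10**12          # terabyte  (SI)
TiB = 2**40           # tebibyte  (IEC), i.e. 1024**4 bytes

capacity_bytes = 240 * TB                 # e.g. a 240 TB raw disk array
print(capacity_bytes / TiB)               # ~218.3 TiB: ~9% "shrinkage" on paper

# Transfer rates: 10 Gb/s (bits) is only 1.25 GB/s (bytes).
link_bps = 10 * 10**9                     # 10 Gigabit Ethernet
print(link_bps / 8 / 10**9, "GB/s")       # 1.25 GB/s
```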

(5)
(6)

Storage Levels

From CPU registers down to disk, each level is larger, slower and cheaper:

  Level                       Typical size     Access time
  CPU registers               ~500 bytes       sub-ns
  L1 / L2 (SRAM) / L3 cache   64 KB to 8 MB    1-2 / ~10 / ~20 cycles
  Memory (DRAM)               ~1 GB            hundreds of cycles
  Disk                        ~1 TB            tens of thousands of cycles

HDD (SATA @ 7.2 krpm): R/W ~100 MB/s; ~190 IOPS

SSD: R/W ~560 MB/s; ~85,000 IOPS (see the access-time comparison below)
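Using only the HDD/SSD figures quoted above, a small Python sketch (simplified model: purely sequential vs. purely random 4 KiB accesses, no caching) shows why IOPS matter as much as raw throughput:

```python
# Rough access-time estimates from the figures quoted above.
FILE_SIZE = 100 * 10**9        # 100 GB of data, read once

# Sequential throughput (bytes/s) and random IOPS from the slide
hdd_seq, hdd_iops = 100 * 10**6, 190
ssd_seq, ssd_iops = 560 * 10**6, 85_000

print("sequential, HDD:", FILE_SIZE / hdd_seq, "s")        # ~1000 s
print("sequential, SSD:", FILE_SIZE / ssd_seq, "s")        # ~179 s

# Same volume accessed as random 4 KiB requests
n_requests = FILE_SIZE // 4096
print("random, HDD:", n_requests / hdd_iops / 3600, "h")   # ~36 h
print("random, SSD:", n_requests / ssd_iops / 60, "min")   # ~4.8 min
```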

(7)

HDD vs. SSD Performance

[Chart: HDD vs. SSD performance comparison (data not recoverable from the extraction)]

(8)

Interconnect

Latency

✓ time to send a minimal (0 byte) message from A to B

Bandwidth

✓ max amount of data communicated per unit of time



Technology             Effective Bandwidth      Latency
Gigabit Ethernet       1 Gb/s   = 125 MB/s      40 µs to 300 µs
Myrinet (Myri-10G)     9.6 Gb/s = 1.2 GB/s      2.3 µs
10 Gigabit Ethernet    10 Gb/s  = 1.25 GB/s     4 µs to 5 µs
Infiniband QDR         40 Gb/s  = 5 GB/s        1.29 µs to 2.6 µs
SGI NUMAlink           60 Gb/s  = 7.5 GB/s      1 µs

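A minimal Python sketch (simple first-order model, time ≈ latency + size / bandwidth, ignoring protocol overhead) makes the table concrete: latency dominates for small messages, bandwidth for bulk transfers:

```python
# time ≈ latency + message_size / bandwidth  (very simplified model)
def transfer_time(size_bytes, bandwidth_Bps, latency_s):
    return latency_s + size_bytes / bandwidth_Bps

networks = {                       # effective bandwidth [B/s], latency [s]
    "Gigabit Ethernet":    (125e6,  300e-6),
    "10 Gigabit Ethernet": (1.25e9, 5e-6),
    "Infiniband QDR":      (5e9,    2.6e-6),
}

for size in (1_000, 1_000_000_000):            # 1 kB vs. 1 GB message
    for name, (bw, lat) in networks.items():
        print(f"{size:>12} B over {name}: {transfer_time(size, bw, lat) * 1e3:.3f} ms")
```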

(9)

Data Management

Storage architectural classes & I/O layers

✓ DAS (Direct Attached Storage): disks (SATA, SAS, Fibre Channel) attached directly to the host, accessed through the local file system

✓ NAS (Network Attached Storage): file-level access to a dedicated file server over the network (NFS, CIFS, AFP over Ethernet)

✓ SAN (Storage Area Network): block-level access over a dedicated network (Fibre Channel, or iSCSI over Ethernet)

(10)

Data Management / HW Protection

RAID standard levels


RAID combined levels
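As a companion to the RAID level overview, a short Python sketch (textbook capacity formulas, independent of any particular controller) computes usable vs. raw capacity:

```python
# Usable capacity of an array of n identical disks, per RAID level.
def usable_capacity(n_disks, disk_tb, level):
    if level == "RAID0":  return n_disks * disk_tb            # striping, no redundancy
    if level == "RAID1":  return disk_tb                      # n-way mirror
    if level == "RAID5":  return (n_disks - 1) * disk_tb      # 1 parity disk
    if level == "RAID6":  return (n_disks - 2) * disk_tb      # 2 parity disks
    if level == "RAID10": return n_disks // 2 * disk_tb       # mirrored stripes
    raise ValueError(level)

# Example: one 8+2 RAID6 LUN built from 2 TB disks (as on the gaia NFS storage)
print(usable_capacity(10, 2, "RAID6"))   # 16 TB usable out of 20 TB raw
```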


(13)

Data Management / File System (FS)

Logical manner to store, manipulate and access data

Disk file systems

✓ FAT32, NTFS, HFS, ext3, ext4, xfs...

Network file systems

✓ NFS, SMB

Distributed and/or parallel file systems

✓ data is striped across multiple servers for high performance (see the striping sketch below)

✓ generally add robust failover and recovery mechanisms

‣ Lustre, GPFS, FhGFS, GlusterFS
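To illustrate what striping means, here is a deliberately simplified Python model (hypothetical stripe size and server names; not the actual Lustre or GPFS implementation) that spreads a file across object storage servers in round-robin order:

```python
# Simplified round-robin file striping across object storage servers (OSS).
# Stripe size and server names are illustrative placeholders.
STRIPE_SIZE = 1 * 2**20          # 1 MiB stripes
SERVERS = ["oss1", "oss2", "oss3", "oss4"]

def stripe_layout(file_size):
    """Return a list of (server, offset, length) chunks for one file."""
    layout, offset = [], 0
    while offset < file_size:
        length = min(STRIPE_SIZE, file_size - offset)
        server = SERVERS[(offset // STRIPE_SIZE) % len(SERVERS)]
        layout.append((server, offset, length))
        offset += length
    return layout

# A 3.5 MiB file ends up spread over all four servers,
# so reads and writes can proceed from them in parallel.
for chunk in stripe_layout(int(3.5 * 2**20)):
    print(chunk)
```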


(16)

Storage HW Components Hosting

High-density disk enclosures

✓ include [redundant] HW RAID controllers

✓ RAID controller card performance differs!

‣ Basic (low cost): ~300 MB/s

‣ Advanced (expensive): ~1.5 GB/s

✓ Typical enclosure sizing: 4U / 48 to 60 disks of 4 TB (see the rack-capacity sketch below)

Storage racks: 42U capacity, 15 kW

‣ HPC rack: 30-40 kW

‣ Interconnect rack: 6-8 kW

Server rooms

✓ Power (UPS, battery), Cooling, Fire protection...
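A quick Python sketch based on the sizing figures above (the per-enclosure power draw is an assumed placeholder) estimates what one 42U storage rack can host:

```python
# How much raw capacity fits in one 42U storage rack?
RACK_U, RACK_KW = 42, 15                 # rack size and power budget from the slide
ENCL_U, ENCL_DISKS, DISK_TB = 4, 60, 4   # 4U enclosure, 60 x 4 TB disks
ENCL_KW = 1.0                            # assumed power draw per enclosure (placeholder)

by_space = RACK_U // ENCL_U              # enclosures that fit by height
by_power = int(RACK_KW // ENCL_KW)       # limit imposed by the power budget
n_encl = min(by_space, by_power)

print(n_encl, "enclosures ->", n_encl * ENCL_DISKS * DISK_TB, "TB raw per rack")
# -> 10 enclosures -> 2400 TB raw per rack (before RAID and file-system overhead)
```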


(17)

(20)

Data storage @ UL - SIU

Central storage service operated by the SIU for more than 5 years

✓ on a central file server - capacity ~30 TB

‣ including backup / archiving for all university users

‣ archived data versions take about 180 TB (uncompressed)

✓ originally developed to secure "important user data"

Current situation (Dec. 2013)

✓ Central administration: 3 TB

✓ User data: 20 TB

✓ Research data: 196 TB

‣ i.e. ~219 TB of data against ~30 TB of primary capacity: undersized for about 2 years now

(21)
(22)

UL HPC Platform

http://hpc.uni.lu

2 geographical sites

3 server rooms

3 admins (+ 1.5 in 2014)

4 clusters:

387 nodes, 4110 cores (43.21 TFlops)

1208.4 TB (raw storage, incl. backup)

‣ NFS + Lustre

> 5.7 M€ hardware investment so far

Open-Source software stack

✓ Debian, SSH, OpenLDAP, Puppet, FAI...


(23)
(24)

Example: the gaia Cluster

Gaia cluster characteristics

✓ Computing: 250 nodes, 2408 cores; Rpeak ≈ 21.62 TFlops
✓ Storage: 240 TB (NFS) + 576 TB (NFS backup) + 240 TB (Lustre)
✓ Interconnect: Infiniband QDR 40 Gb/s (fat tree), 10 GbE and 1 GbE networks, FC8 links to the disk arrays
✓ Located at Uni.lu Belval (incl. LCSB), connected to the chaos cluster in Kirchberg through a Cisco Nexus C5010 10 GbE switch

Access and service nodes

✓ Cluster access and adminfront: Bull R423 servers (2U, 2*4c Intel Xeon L5620 @ 2.26 GHz, 16 GB RAM)
✓ NFS server: Bull R423 (2U, 2*4c Intel Xeon L5630 @ 2.13 GHz, 24 GB RAM)
✓ Columbus server (IB-attached)

NFS storage: Nexsan E60 + E60X (240 TB), attached via FC8

✓ 120 disks (2 TB SATA 7.2 krpm) = 240 TB raw
✓ Multipathing over 2+2 controllers (cache mirroring)
✓ 12 RAID6 LUNs (8+2 disks) = 192 TB usable (LVM + xfs)

Lustre storage

✓ MDS1 / MDS2: 2x Bull R423 (2U, 2*4c Intel Xeon L5630 @ 2.13 GHz, 96 GB RAM)
✓ MDT: Nexsan E60 (4U, 12 TB): 20 disks (600 GB SAS 15 krpm), multipathing over 2 controllers (cache mirroring), 2 RAID1 LUNs (10 disks) = 6 TB (LVM + Lustre)
✓ OSS1 / OSS2: 2x Bull R423 (2U, 2*4c Intel Xeon L5630 @ 2.13 GHz, 48 GB RAM)
✓ OSTs: 2x Nexsan E60 (2*4U, 2*120 TB): 2*60 disks (2 TB SATA 7.2 krpm) = 240 TB raw, multipathing over 2 controllers (cache mirroring), 2*6 RAID6 LUNs (8+2 disks) = 2*96 TB (LVM + Lustre)

Computing nodes

✓ 1x BullX BCS enclosure (6U): 4 BullX S6030 [160 cores] (16*10c Intel Xeon E7-4850 @ 2 GHz), 1 TB RAM
✓ 2x Viridis enclosures (4U): 96 ultra-low-power SoC nodes [384 cores] (1*4c ARM Cortex A9 @ 1.1 GHz), 4 GB RAM
✓ 1x Dell R820 (4U) [32 cores] (4*8c Intel Xeon E5-4640 @ 2.4 GHz), 1 TB RAM
✓ 5x BullX B enclosures (35U): 60 BullX B500 [720 cores] + 12 BullX B506 [144 cores] (2*6c Intel Xeon L5640 @ 2.26 GHz), 24 GB RAM
✓ GPGPU accelerators [12032 GPU cores in total]: 4 Nvidia Tesla M2070 [448 cores] + 20 Nvidia Tesla M2090 [512 cores]
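The raw vs. usable figures listed above can be cross-checked with a few lines of Python (plain arithmetic on the gaia numbers):

```python
# Cross-check of the gaia storage figures listed above.
# NFS array: 120 x 2 TB disks, carved into 12 RAID6 LUNs of 8+2 disks.
nfs_raw    = 120 * 2                 # 240 TB raw
nfs_usable = 12 * 8 * 2              # 192 TB after RAID6 (2 parity disks per LUN)

# Lustre OSS arrays: 2 enclosures of 60 x 2 TB disks, 6 RAID6 LUNs (8+2) each.
lustre_raw    = 2 * 60 * 2           # 240 TB raw
lustre_usable = 2 * 6 * 8 * 2        # 192 TB after RAID6

print(nfs_raw, nfs_usable, lustre_raw, lustre_usable)   # 240 192 240 192
```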

(25)

UL HPC: HW Investments

Cumulative investment: 5,749,432 € (VAT incl.)

Yearly HW investment (VAT incl.), broken down by category (server room(s) / racks, computing nodes, storage, servers, interconnect, software, power supply, other):

  2006:     25,772 €
  2007:    119,274 €
  2008:     93,187 €
  2009:  1,039,410 €
  2010:    413,482 €
  2011:  2,249,294 €
  2012:    980,834 €
  2013:    828,178 €

(26)

UL HPC: HW Investments (VAT incl.)*

* excluding server rooms

[Chart: UL HPC total yearly HW investment 2006-2013 (VAT incl., excluding server rooms), broken down by category (computing nodes, storage, servers, interconnect, software, power supply, other)]

(27)

UL HPC: Power and cooling

Maximum used power & cooling capacity in the UL server rooms (maximum available power capacity: 229 kW):

  2006:   0.45 kW
  2007:  12.18 kW
  2008:  27.67 kW
  2009:  29.17 kW
  2010:  42.44 kW
  2011:  89.35 kW
  2012: 130.63 kW
  2013: 179.41 kW

(28)

UL HPC Storage

Raw storage capacity per year, totalled over the clusters (nyx, g5k, gaia, chaos) and backup:

  2006:      0 TB
  2007:    6.6 TB
  2008:    6.6 TB
  2009:    6.6 TB
  2010:   32.4 TB
  2011:  512.4 TB
  2012: 1052.4 TB
  2013: 1208.4 TB (of which 576 TB of backup)

(29)

UL HPC Storage: 1208 TB (raw) in 2013

NFS-based storage [392 TB]

✓ home / work directories

Lustre-based storage [240 TB]

✓ SCRATCH

Backup devices [576 TB]

‣ 392 + 240 + 576 = 1208 TB raw in total

(30)

UL HPC Storage Benchmarking

IOzone-based benchmarks, increasing the number of client nodes (20 GB file size)

[Plots: aggregated I/O bandwidth vs. number of nodes - Lustre write and read (MiB/s), NFS read (GB/s)]
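For a much simpler, single-node measurement than the IOzone campaign above, the following Python sketch times a large sequential write and read on a mounted file system (the target path and sizes are placeholders; OS page-cache effects will inflate the read figure unless the file is much larger than RAM):

```python
import os, time

TARGET = "/mnt/lustre/scratch/bench.tmp"   # placeholder path on the file system under test
SIZE   = 2 * 2**30                         # 2 GiB test file
BLOCK  = 4 * 2**20                         # 4 MiB blocks
buf = os.urandom(BLOCK)

# Sequential write
t0 = time.time()
with open(TARGET, "wb") as f:
    for _ in range(SIZE // BLOCK):
        f.write(buf)
    f.flush()
    os.fsync(f.fileno())                   # force the data out to the storage servers
write_mib_s = SIZE / 2**20 / (time.time() - t0)

# Sequential read
t0 = time.time()
with open(TARGET, "rb") as f:
    while f.read(BLOCK):
        pass
read_mib_s = SIZE / 2**20 / (time.time() - t0)

os.remove(TARGET)
print(f"write: {write_mib_s:.0f} MiB/s  read: {read_mib_s:.0f} MiB/s")
```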

(31)

UL HPC and the Grande Region

HPC in the Grande Région and around

Country      Name / Institute            #Cores   Rpeak [TFlops]   Storage [TB]   Manpower [FTEs]
Luxembourg   UL                            4110        43.213          1208.4         3
             CRP GL                         800         6.21            144           1.5
France       TGCC Curie, CEA              77184      1667.2            5000           n/a
             LORIA, Nancy                  3724        29.79             82           5.05
             ROMEO, UCR, Reims              564         4.128            15           2
Germany      Juqueen, Juelich            393216      5033.2             448           n/a
             MPI, RZG                      2556        14.1             n/a           5
             URZ (bwGRiD), Heidelberg      1140        10.125            32           9
Belgium      UGent, VSC                    4320        54.541            82           n/a
             CECI, UMons/UCL               2576        25.108           156           > 4
UK           Darwin, Cambridge Univ        9728       202.3              20           n/a
             Legion, UCLondon              5632        45.056           192           6

(32)

Big Data @ UL: planned evolutions

(33)

Cloud Storage Analysis

Study initiated by LCSB (Credits: Bob Pepin)

✓ Transfer rate / time should not be under-estimated (see the sketch below)

Legal aspects of externalizing data to the cloud?

Year   Requirement [TB]   Yearly Cost [K€]   Total Cost [K€]
2013         200                  -                  -
2014         400                 183                183
2015         600                 302                486
2016         800                 405                890
2017        1000                 507              1,397
2018        1200                 609              2,006
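The warning about transfer rates is easy to quantify with a minimal Python sketch (idealized model: the link is fully saturated, no protocol overhead):

```python
# How long does it take to move a data set to (or back from) a cloud provider?
def transfer_days(size_tb, link_gbps):
    size_bits = size_tb * 10**12 * 8
    return size_bits / (link_gbps * 10**9) / 86_400

for size in (200, 1200):                 # the 2013 and 2018 requirements above
    for link in (1, 10):                 # 1 Gb/s and 10 Gb/s uplinks
        print(f"{size} TB over {link} Gb/s: {transfer_days(size, link):.1f} days")
# 200 TB over 1 Gb/s already takes ~18.5 days of continuous transfer.
```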

(34)

Big-Data @ UL Milestones

Early 2014: RFP for the acquisition of the Big UL NAS

✓ Scalable NAS with an initial effective capacity > 1 PB

✓ Storage / backup of all devices (user + research)

✓ High-performance interface with the HPC platform and the UL IT infrastructure

‣ mounted on the computing nodes via the IB QDR interconnect

‣ mounted on desktops / workstations via the [new] UL network

Continuous increase of the HPC storage capacity

✓ SCRATCH (Lustre-based): 100 K€ / +180 TB / +3 GB/s


(35)

Big-Data @ UL Milestones

UL moves to Belval!

✓ 2016: CDC (Centre de Calcul) in Belval

‣ 3 storage server rooms (68 racks @ 15 kW)

‣ 2 HPC server rooms (52 racks @ 40 kW)

✓ Note: one additional floor for the SIU + partners
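For scale, the aggregate nominal power capacity of the planned rooms follows directly from the figures above (simple arithmetic, assuming every rack is populated at its nominal budget):

```python
# Aggregate nominal power capacity of the planned CDC server rooms in Belval.
storage_rooms = 68 * 15      # 68 racks at 15 kW
hpc_rooms     = 52 * 40      # 52 racks at 40 kW
print(storage_rooms + hpc_rooms, "kW in total")   # 3100 kW, vs. ~229 kW available in 2013
```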

(36)

Big-Data @ UL Milestones

✓ 2014: Phase 1 of the UL internal cloud infrastructure

(37)

Conclusion / Big-Data @ UL: A Necessity

Required to properly support research @ UL

✓ especially on life-science topics

Access / synchronization tool

Backup / archiving tool

Budget requested (as part of the next 4-year plan)

✓ 2014: Big UL NAS [1 M€]

✓ 2015: Medium-size tape library / cloud connector [1.6 M€]

✓ 2016: Capacity extension (Big UL NAS) by 1 PB [800 K€]

✓ 2017: Capacity extension (tape / archiving) [700 K€]
