• No results found

File transfer in UNICORE State of the art

N/A
N/A
Protected

Academic year: 2021

Share "File transfer in UNICORE State of the art"

Copied!
25
0
0

Loading.... (view fulltext now)

Full text

(1)

Bernd Schuller, Björn Hagemeier, Michael Rambadt Federated Systems and Data division

Jülich Supercomputer Centre Forschungszentrum Jülich GmbH

31 May 2012, UNICORE Summit 2012, Dresden

M it g lie d d e r H e lm h o lt z-G e m e in sc h a ft

File transfer in UNICORE

State of the art

(2)

Slide 2

Outline

 Filetransfer in UNICORE: a (short) review

 UFTP

■ Performance tests in a 100 Gbit/s network

■ Tests in PRACE

 GridFTP: recent developments

(3)

Slide 3

Three types of data movement in UNICORE

Storage instances (HOME, TMP, ...) Job directories UNICORE Server Local Filespace Client Client Server Import & Export Server-to-server data movement UNICORE Server Storage instances (HOME, … ) Non-UNICORE Data staging UNICORE Server

(4)

Slide 4

Client-Server transfers

 Web service layer integration

 Client-initiated by invoking SMS ImportFile / ExportFile

 Client-side protocol handler is usually Java code, integrated into

URC/UCC

 Server-side protocol handler

■ Can run in UNICORE/X or directly on the resource (e.g.UFTP)

UNICORE Client UNICORE/X Server Local FS ClusterFS TSI Initiate transfer Transfer data

(5)

Slide 5

Server-server transfers

 Web service layer

 UNICORE server acts as client

 Client initiated by invoking SMS SendFile / ReceiveFile

■ Transfer start can be scheduled

 Same code, features and set of protocols as client-server

UNICORE/X Server Cluster FS TSI Initiate transfer Transfer data UNICORE/X Server Cluster FS TSI UNICORE Client

(6)

Slide 6

UNICORE/X

Data staging

 Part of job processing (XNJS level integration)

■ JSDL DataStaging element: file name and source/target URL

 Peer need not be a UNICORE server

 UNICORE protocols (BFT, …) plus additional ones (scp, ftp, xtreemfs,

mailto, file, gsiftp …)

Remote Server Initiate transfer Transfer data Cluster FS TSI XNJS

(7)

Slide 7

Protocol overview

BFT

(https) ByteIO UFTP GridFTP http(s), ftp scp

WS layer

integration Yes Yes Yes No (1) No No

Data staging Yes Yes Yes Yes Yes Yes

Direct

FS to FS No No Yes Yes No (2) Yes

Efficiency Fair Bad (3) Very

good Very good Good Good

Security Full U6 Full U6 Full U6 Proxy in

SOAP header

User/Pass

In JSDL User/PassIn JSDL Notes:

1) Work on better GridFTP support is in progress 2) http/ftp can be configured to use wget / curl

(8)

Slide 8

(9)

Slide 9

 File transfer library based on „passive FTP“ ideas

 Single „FTP“ port. Dynamic opening of firewall ports for data

connections

 Multiple TCP streams per transfer. Optional data encryption

 Pure Java (plus a bit of native code for Unix user/group ID switching)

(10)

Basic UFTP

UFTPD

Server

Announce client transfer. Pass secret and client IP pseudo FTP socket listen cmd UFTP client Client Data socket(s)  UFPTD server

■ Pseudo-FTP port (open

in firewall)

■ Local command port

(can be SSL protected)

 UFTP client

■ Connects to pseudo-FTP

port

■ Open required data

connections

■ Send/receive data

Secure channel required for authentication

(11)

UFTP integration

command-line client Eclipse-based client

Local RMS (e.g. Torque, LL, LSF, etc.)

UFTPD

UNICORE/X

server Cluster login node

TSI UNICORE/X

2. commands other UNICORE

servers

3. pseudo FTP data transfer listen

cmd

Reference:

B. Schuller, T. Pohlmann: UFTP - high performance data transfer for UNICORE, UNICORE Summit 2011, Proceedings, 7-8 July 2011, Torun, Poland, ed.: / M. Romberg, P. Bala, R. Müller-Pfefferkorn, D. Mallmann, Forschungszentrum Jülich, 2011,

IAS Series Vol. 9, ISBN 978-3-89336-750-4, pp. 135-142

(12)

Slide 12

100 Gigabit/sec testbed

TU Dresden – TU Freiberg

login node login node

100 Gigabit/sec

10 Gigabit/sec (each node) 10 Gigabit/sec

(each node)

 Up to 10 GBit/sec from/to each cluster node

(13)

Slide 13 0 2000 4000 6000 8000 10000 12000 900 950 1000 1050 1100 1150 1200 1250 1300 File size (MB) T ra n sf e r ra te ( M B /s e c)

Single client, single server

100 Gigabit/sec 10 Gigabit/sec (each node) 10 Gigabit/sec (each node) UFTP

Server ClientUFTP

 Up to 1.2GB/sec

(14)

Slide 14

Multiple clients, single server

100 Gigabit/sec 10 Gigabit/sec (each node) 10 Gigabit/sec (each node) UFTP

Server ClientUFTP

 Up to 8 clients  (roughly!) parallel transfers (50GB each) UFTP Client UFTP Client 0 1 2 3 4 5 6 7 8 9 0 200 400 600 800 1000 1200 1400 1600 1800 aggregate per client Number of clients T ra n sf e r ra te ( M B /s e c)

(15)

Slide 15

Multiple client/server pairs

100 Gigabit/sec 10 Gigabit/sec (each node) 10 Gigabit/sec (each node) UFTP

Server ClientUFTP

 Up to 11 (roughly!) parallel

transfers (50GB each)

 12 GB/sec

 98% of line rate

UFTP

Server ClientUFTP

UFTP

Server ClientUFTP

0 2 4 6 8 10 12 0 2000 4000 6000 8000 10000 12000 14000

Number of parallel clients/servers

T o ta l t h ro u g h p u t ( M B /s e c)

(16)

Slide 16

UFTP tests in PRACE

JSC JUGENE

 UFTP is being evaluated

 Small testbed: JSC, SARA

and CINECA sites

 Resources available via

two network interfaces (internet and internal high-speed)

 For tests, UFTP server

is deployed at JSC. Client (UCC) at SARA and CINECA

SARA

CINECA

(17)

Slide 17

UFTP tests in PRACE – results

 Results depend on filesystems and the network interface that is used!

 JSC – CINECA (1 Gbit/s link, 1GB file size)

■ JSC tmp → CINECA tmp : 70 MB/s

■ JUGENE tmp → CINECA dev/null: 128 MB/s

 JSC – SARA (10 Gbit/s link, 10 GB file size)

■ SARA Home → JSC Home: 72 MB/s

■ SARA Scratch → JSC tmp: 126 MB/s

 For comparison (internal 10 Gbit/s link):

(18)

Slide 18

(19)

Slide 19

 Status

■ Data staging only. Callout to existing globus­url­copy tool

■ Proxy cert placed in SOAP header by the client.

 Recent development work (motivated by the EMI project)

■ Use JGlobus libs, updated to not conflict with UNICORE libs

■ In principle, this allows to integrate gridftp clients into UNICORE

clients

■ Use existing proxy cert generated via voms­proxy­init 

(myproxy should be the same)

■ Prototype module for the HiLA library, used with the EMI services

prototypes

(20)

Slide 20

 Investigate full WS level integration (client, SMS) of GridFTP

 The proxy cert problem ...

 See also SRM/LFC work (→ C. Loeschen)

 User/infrastructure requirements increase developer motivation :-)

(21)

Slide 21

Current and future work on

(22)

Slide 22

 Reliability

■ For server-server transfers

■ Status: no retries, no proper restart of failed transfers

■ Goal: server-server transfer is 100% reliable

■ For client-server transfers

■ Status: failed transfers need to be re-done

■ Goal: continue aborted transfer

 Scalability

■ Current: one filetransfer „process“ per file

■ Goal: more efficient transfer of many files

■ Ambitious goal: full „rsync“ (→ efficient replication of full storages)

(23)

Slide 23

 Work on supporting partial downloads (byte ranges)

■ Required for efficient, reliable, scalable transfer

■ Allows to download same file from multiple sources

■ Failure of of individual downloads is „cheaper“

 UFTP protocol extensions

■ Sessions

■ Get/put multiple files in a single UFTP session (as in ftp)

■ Byte ranges

■ Support ls, cd, mkdir and all that. Rsync?

■ Will be available in UFTP 2.0

(24)

Slide 24

 Reliable and scalable server-server transfer using a well-known

method

■ Target server downloads file in chunks

■ Keep download state on the file system to allow restarts without

loss of data

■ Combine chunks as the last step

■ In 6.5.0 already „almost“ possible with BFT („UNICORE https“)

(25)

Slide 25

Questions?

 Thanks

■ Andy Georgi, Stefan Höhlig (TUD)

for generously providing access to the 100Gbit/s testbed

■ Andreas Landhäußer, Gert Ohme (T-Systems SfR)

References

Related documents

These two dimensions explained (28.3%) of the variance resulting in the physical bullying among the students , and they also explained (27.9%) of the changes resulting in

The rst chapter looks at the state of trade unions in Europe and how they have been aected by globalisation; the second chapter is theoretical in nature and shows how the

In 2006, Oregon State University (OSU) Libraries developed an open source software product called Library à la Carte that librarians can use to construct research guides easily

Fig. 7 Bar graphs of the standardised regression coefficients from multilevel regressions of performance on medical school skills-based exams on UKCAT scores. Two sets of

III.5 Lgr4 modulates Wnt pathway, EGFR pathway, and MMPs in breast cancers The Wnt signaling pathway is a key regulator in cancer progression (Anastas & Moon, 2013), cancer

EFVPTC indicates encapsulated follicular variant of PTC; NIFTP, noninvasive follicular thyroid neoplasm with papillary-like nuclear features; PTC, papillary thyroid carcinoma....

First, the negative relationship between monetary policy and pass-through proposed by Taylor (2000) is quantitatively im- portant for Canada, regardless of whether pass-through

First, I'll explain the words to you. Try to listen to the rhythm. Listen again clapping your hands by the rhythm. Now, let's chant line by line. Join in the singing, Yunju. Once