Performance oriented Data Transfer and Sharing Framework for Scientific Computing

Academic year: 2020

(1)
(2)

Outline

Motivation

Requirements for Scientific Data Transfer

Related Works

Our Proposal: GridTorrent Framework

Test Results

Summary

(3)

Motivation

Computational science is becoming data intensive

Scientists are faced with mountains of data that stem from four sources [1]:

New scientific instruments double their output every year or so

Simulations generate floods of data

The Internet and computational Grid allow the

(4)

Motivation (cont.)

Scientific discovery is increasingly driven by data collection [3]

Computationally intensive analyses

Massive data collections

Data distributed across networks of varying capability

Internationally distributed collaborations

Data Intensive Science: 2000-2015

Dominant factor: data growth (1 Petabyte = 1000 TB)

(5)

Motivation (cont.)

Scientific applications that generate petabytes of data are very diverse.

Fusion power

Climate modeling

Earthquake engineering

Astronomy

Bioinformatics

(6)

Motivation (cont.)

Some examples

Climate modeling

Community Climate System Model and other simulation applications generate 1.5 petabytes/year

Bioinformatics

The Pacific Northwest National Laboratory is building new confocal microscopes which will generate 5 petabytes/year

High-energy physics

The Large Hadron Collider (LHC) project at CERN will create 15

(7)

Motivation Conclusion

Scientific community has large set of distributed data

(8)

Requirements for Scientific Data Transfer

Transferring scientific data at large scale requires

efficient

high-performance

reliable

secure

policy-aware management

balanced system

CPU farms

storage

(9)

Is it a new problem?

The answer is no.

Existing attempts to meet the above requirements include:

GridFTP

GridFTPXIO

GridHTTP

TeraGrid Copy (TGCP)

The Replica Location Service (RLS)

(10)

GridFTP

Extension of the standard FTP protocol

Reliable

Secure

High performance

Efficient

The de facto standard for transferring data in many Grid projects

(11)

GridFTP (cont.)

Additional features supported by the GridFTP protocol:

Grid Security Infrastructure (GSI) and Kerberos support

Support for reliable and restartable data transfer: transfers restart from the point of failure when a failure occurs

Partial file transfer: transfer of selected regions of a file

Parallel data transfer: multiple TCP streams between two network endpoints to improve bandwidth

Third-party control of data transfer: the ability to control
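The partial, parallel and restartable transfer features listed above can be pictured with a small sketch. This is not the GridFTP wire protocol or any Globus API. The function names `split_ranges` and `remaining_ranges` are hypothetical, and the even partitioning is an assumption made for illustration.

```python
def split_ranges(file_size: int, streams: int) -> list[tuple[int, int]]:
    """Split a file of file_size bytes into contiguous (offset, length)
    regions, one per parallel stream. Assumed policy: regions differ in
    size by at most one byte."""
    if file_size < 0 or streams < 1:
        raise ValueError("file_size must be >= 0 and streams >= 1")
    base, extra = divmod(file_size, streams)
    ranges = []
    offset = 0
    for i in range(streams):
        # The first `extra` streams carry one additional byte each.
        length = base + (1 if i < extra else 0)
        ranges.append((offset, length))
        offset += length
    return ranges


def remaining_ranges(ranges: list[tuple[int, int]],
                     received: list[int]) -> list[tuple[int, int]]:
    """Given the bytes already received per region, return the regions
    still to fetch, so a restart resumes from the point of failure."""
    todo = []
    for (offset, length), done in zip(ranges, received):
        if done < length:
            todo.append((offset + done, length - done))
    return todo
```

Each region is itself a partial file transfer. Restarting only needs the per-region byte counts recorded before the failure.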

(12)

GridHTTP

Allows large (gigabyte) files to be transferred at optimal speeds using HTTP

Does not deviate from existing HTTP standards, but describes how to use existing headers and methods to produce an encrypted data stream

Supports bulk data transfers via unencrypted HTTP

Supports authentication and authorization with the

(13)

GridFTPXIO

The Globus eXtensible Input/Output (XIO) System

Provides an abstraction layer over transport protocols

Enables different I/O problems to be presented uniformly through a simple open/close/read/write (OCRW) interface

Provides a support framework for developing communication protocols

Provides an interface that enables an existing application written with XIO to access the underlying hardware

Primary usage scenarios:

Independence from the Transmission Control Protocol (TCP)

Ease of Adding GridFTP Support to Third-Party Applications

Ease of Providing GridFTP Access to Data
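The open/close/read/write (OCRW) idea can be sketched as an abstract interface with interchangeable drivers. This is an illustration of the abstraction only, not the Globus XIO API. The class names `Driver` and `MemoryDriver` and the helper `copy_stream` are hypothetical.

```python
from abc import ABC, abstractmethod


class Driver(ABC):
    """Hypothetical OCRW interface: every transport looks the same to the caller."""

    @abstractmethod
    def open(self, target: str) -> None: ...

    @abstractmethod
    def read(self, n: int) -> bytes: ...

    @abstractmethod
    def write(self, data: bytes) -> int: ...

    @abstractmethod
    def close(self) -> None: ...


class MemoryDriver(Driver):
    """Toy driver backed by an in-memory buffer, standing in for a real transport."""

    def __init__(self) -> None:
        self._buf = bytearray()
        self._pos = 0

    def open(self, target: str) -> None:
        self._pos = 0  # start reading from the beginning

    def read(self, n: int) -> bytes:
        chunk = bytes(self._buf[self._pos:self._pos + n])
        self._pos += len(chunk)
        return chunk  # empty bytes signals end of data

    def write(self, data: bytes) -> int:
        self._buf.extend(data)
        return len(data)

    def close(self) -> None:
        pass


def copy_stream(src: Driver, dst: Driver, chunk_size: int = 4) -> int:
    """Copy everything from src to dst using only the OCRW calls."""
    total = 0
    while True:
        block = src.read(chunk_size)
        if not block:
            break
        total += dst.write(block)
    return total
```

Because `copy_stream` depends only on the four calls, swapping `MemoryDriver` for a driver over another transport would not change the application code. That is the independence the usage scenarios above describe.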

(14)

TeraGrid Copy (TGCP)

The TeraGrid Copy (TGCP) solution includes three main components:

GridFTP Service

RFT Service

TGCP shell script

In the striped configuration:

The GridFTP service runs on several nodes of a cluster

The data to be transferred is partitioned among the nodes

each node may use several parallel

(15)

TGCP (cont.)

The tgcp script can use the globus-url-copy tool either

(A) in third-party transfer mode, or

(B) in conventional

(16)

TGCP (cont.)

The RFT Service is used to manage the transfer.

It adds reliability to the transfer request

(17)

The Replica Location Service (RLS)

provides a framework for tracking the physical locations of data that has been replicated.

maps logical names to physical names.

Replication of data items can

reduce access latency,

improve data locality,

increase robustness, scalability and performance for distributed applications.

does not operate in isolation,

used with other components like the Reliable File Transfer
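The logical-to-physical mapping can be sketched with a small in-memory catalog. This is an illustration of the idea, not the RLS API. The class name `ReplicaCatalog` and the example names and URLs in the usage note are hypothetical.

```python
from collections import defaultdict


class ReplicaCatalog:
    """Toy catalog mapping one logical name to many physical replica locations."""

    def __init__(self) -> None:
        self._replicas: dict[str, set[str]] = defaultdict(set)

    def add(self, logical_name: str, physical_name: str) -> None:
        """Register a replica of logical_name stored at physical_name."""
        self._replicas[logical_name].add(physical_name)

    def remove(self, logical_name: str, physical_name: str) -> None:
        """Forget one replica; drop the logical name when none remain."""
        self._replicas[logical_name].discard(physical_name)
        if not self._replicas[logical_name]:
            del self._replicas[logical_name]

    def lookup(self, logical_name: str) -> list[str]:
        """Return all known physical locations, sorted for stable output."""
        return sorted(self._replicas.get(logical_name, ()))
```

A client would call, for example, `catalog.add("lfn:run42", "gsiftp://site-a.example.org/run42")` and later `catalog.lookup("lfn:run42")`, then hand one of the returned locations to a separate transfer component.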

(18)

RLS (cont.)

The current RLS implementation has the following features:

Local Replica Catalogs (LRCs)

Replica Location Indices (RLIs)

LRCs send information about their state to RLIs using soft state protocols.

Optional "Bloom Filter" compression can be used to summarize the contents of the LRC.

The current RLS implementation maintains static
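The Bloom filter summary mentioned above can be sketched as follows. This is a generic Bloom filter, not the RLS implementation. The bit-array size, the number of hash functions and the SHA-256-based hashing are assumptions chosen for illustration.

```python
import hashlib


class BloomFilter:
    """Compact set summary: no false negatives, but false positives are possible."""

    def __init__(self, num_bits: int = 1024, num_hashes: int = 3) -> None:
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, item: str):
        # Derive num_hashes bit positions by salting SHA-256 with an index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item: str) -> None:
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item: str) -> bool:
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(item))
```

An LRC could add every logical name it holds and send only the bit array to an RLI. The index can then answer "possibly here" or "definitely not here" without holding the full catalog.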

(19)

So, if there are solutions...

There is no pure P2P data transfer mechanism used in this area.

There are several different protocols

(20)

Our Proposal: GridTorrent Framework

We propose a new peer-to-peer protocol for distributing scientific data files at an acceptable speed

Just as GridFTP extends FTP, GridTorrent redefines the BitTorrent protocol to adapt it to scientific data transfer

(21)

Why do we need the GridTorrent Framework?

Requirements and characteristics of scientific data transfer

Large and voluminous data sets

Security

Reliability

Efficiency

Scalability

User-friendly environment

Balanced

(22)

Why do we need the GridTorrent Framework? (cont.)

GridTorrent has faster download speed

Large and voluminous data sets

Balanced

GridTorrent allows peers to share bandwidth

Efficiency

GridTorrent is based on BitTorrent

Reliability

(23)

Why do we need the GridTorrent Framework? (cont.)

GridTorrent has a security manager

Security

GridTorrent has a content management framework

User-friendly environment

(24)

Why BitTorrent?

Alternative peer-to-peer protocols

FastTrack

Gnutella

eDonkey

Direct Connect

Ares

Why BitTorrent?

Better bandwidth utilization

Never-before-seen speeds

Limits free riding – tit-for-tat

Limits leech attacks – coupling of upload and download

Spurious files are not propagated
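The tit-for-tat idea, where upload is coupled to download, can be sketched as a peer-selection rule. This is a simplification and not the full BitTorrent choking algorithm: real clients also rotate an optimistic unchoke slot, which is omitted here. The function name `choose_unchoked` and the slot count are assumptions.

```python
def choose_unchoked(download_rates: dict[str, float], slots: int = 4) -> set[str]:
    """Pick the peers we will upload to: those currently giving us the
    best download rates. Peers that upload nothing rank last, which is
    what limits free riding."""
    ranked = sorted(download_rates,
                    key=lambda peer: (-download_rates[peer], peer))
    return set(ranked[:slots])
```

With rates such as `{"a": 10.0, "b": 0.0, "c": 50.0, "d": 5.0}` and two slots, the peers `c` and `a` are served and the non-contributing peer `b` is choked.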

(25)

Why BitTorrent? (cont.)

BitTorrent has proved suitable for distributing very large files.

Many companies use BitTorrent as a distribution protocol:

Amazon S3

Microsoft’s Avalanche (inspired by BitTorrent)

Blizzard (game production company)

(26)

Advantages of GridTorrent Framework

Saves resources by taking advantage of the unused upload capacity of downloaders.

CPU

Network bandwidth

Disk

Reliable

Jobs can be started and stopped using a web interface

Can be deployed on any system

(27)
(28)

GridTorrent Framework Components (cont.)

GridTorrent Framework has three major components:

GridTorrent Client

GridTorrent Content Manager

(29)

GridTorrent Client

It has four components

Torrent Data Sharing Algorithm

Task Manager

WS-Tracker Client

Data Transfer layer

(30)

GridTorrent Content Manager

Four main components:

Task Manager

ACL Manager

Content Manager

(31)

GridTorrent WS-Tracker

It functions as a regular BitTorrent tracker

Sends source and peer lists to peers

Updates their status

It sends the task list obtained from the GridTorrent Content Manager

All communications are secure (SSL)
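The tracker role described above (hand out peer lists, record peer status) can be sketched in a few lines. This is not the GridTorrent WS-Tracker interface. The class name `ToyTracker`, its method names and the record layout are hypothetical, and the web-service transport and SSL layer are left out.

```python
class ToyTracker:
    """In-memory stand-in for a tracker: one swarm of peers per torrent."""

    def __init__(self) -> None:
        # torrent_id -> {peer_id -> {"address": str, "complete": bool}}
        self._swarms: dict[str, dict[str, dict]] = {}

    def announce(self, torrent_id: str, peer_id: str, address: str,
                 complete: bool = False) -> list[str]:
        """Record or update the caller's status and return the addresses
        of the other peers sharing the same torrent."""
        swarm = self._swarms.setdefault(torrent_id, {})
        swarm[peer_id] = {"address": address, "complete": complete}
        return [info["address"] for pid, info in swarm.items() if pid != peer_id]

    def sources(self, torrent_id: str) -> list[str]:
        """Addresses of peers that already hold the complete file."""
        swarm = self._swarms.get(torrent_id, {})
        return [info["address"] for info in swarm.values() if info["complete"]]
```

A peer announces periodically. Each call both refreshes its own status and gives it a current view of the swarm.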

(32)

GridTorrent Content Manager

It allows a content owner to publish content at different access levels:

Public level

User level

Group level
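The three access levels can be sketched as a simple permission check. The slides do not define the exact semantics, so the interpretation below is an assumption: public content is open to anyone, user-level content to named users, and group-level content to members of named groups. All names (`AccessLevel`, `Content`, `can_access`) are hypothetical.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Optional


class AccessLevel(Enum):
    PUBLIC = "public"
    USER = "user"
    GROUP = "group"


@dataclass
class Content:
    name: str
    level: AccessLevel
    allowed_users: set = field(default_factory=set)
    allowed_groups: set = field(default_factory=set)


def can_access(content: Content, user: Optional[str],
               user_groups: frozenset = frozenset()) -> bool:
    """Return True if the (possibly anonymous) user may fetch the content."""
    if content.level is AccessLevel.PUBLIC:
        return True
    if user is None:
        return False  # non-public content requires an identified user
    if content.level is AccessLevel.USER:
        return user in content.allowed_users
    # GROUP level: any shared group grants access.
    return bool(content.allowed_groups & set(user_groups))
```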

(33)

Initial Test Results

File size: 300 MB

Number of streams/sources: 4

Source machines: gridfarm (Bloomington, IN)

LAN test:

Iperf bandwidth (Mbps): 857

Client machine: complexity (Indianapolis, IN)

WAN test:

Iperf bandwidth (Mbps): 30.2

(34)

Initial Test Results (cont.)

Table 1: Download speed (Mbps) of PTCP vs. GridTorrent with 4 streams/sources

            PTCP    GridTorrent (1 stream)    GridTorrent (4 streams)
LAN Test    80      90                        95
WAN Test    42      49                        102

Table 2: GridTorrent bandwidth load balancing on downloaded file segment with 4 streams/sources. Bandwidth usage (MB downloaded from each source)

            Source1    Source2    Source3    Source4
LAN Test    44         53         47         42
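The reported figures can be turned into relative speedups and per-source shares with a short calculation. The numbers are copied from the two tables above. The helper names `speedup` and `shares` are illustrative only.

```python
# Download speeds in Mbps, as reported in Table 1.
TABLE1 = {
    "LAN": {"PTCP": 80, "GridTorrent (1 stream)": 90, "GridTorrent (4 streams)": 95},
    "WAN": {"PTCP": 42, "GridTorrent (1 stream)": 49, "GridTorrent (4 streams)": 102},
}

# MB downloaded from each of the four sources in the LAN test, as reported in Table 2.
LAN_SOURCES_MB = [44, 53, 47, 42]


def speedup(row: dict[str, float], baseline: str = "PTCP") -> dict[str, float]:
    """Speed of each configuration relative to the baseline in the same row."""
    return {name: value / row[baseline]
            for name, value in row.items() if name != baseline}


def shares(values: list[float]) -> list[float]:
    """Fraction of the listed total contributed by each source."""
    total = sum(values)
    return [v / total for v in values]


if __name__ == "__main__":
    for test, row in TABLE1.items():
        for name, ratio in speedup(row).items():
            print(f"{test}: {name} is {ratio:.2f}x PTCP")
    for i, share in enumerate(shares(LAN_SOURCES_MB), start=1):
        print(f"Source{i}: {share:.1%} of the listed LAN download")
```

On these figures, GridTorrent with 4 streams reaches about 2.4 times the PTCP speed in the WAN test and about 1.2 times in the LAN test. Each source contributes between roughly 23% and 28% of the listed LAN download.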

(35)
(36)

Research Issues

The current BitTorrent protocol is designed for the present-day network environment

Modifications needed to provide pure scientific data transfer

Modification of message format and frequency

UDP

GridFTP

Requirements needed to provide pure scientific data transfer

Security

(37)
(38)

References

[1] G. Bell, J. Gray, and A. Szalay, Petascale computational systems, Computer, Volume 39, Issue 1, Jan. 2006, pages 110-112

[2] S. L. Graham, M. Snir, and C. A. Patterson (eds.), Getting Up To Speed: The Future of Supercomputing, NAE Press, 2004, ISBN 0-309-09502-6

[3] Ian Foster, Overview of Grid Computing, http://www-fp.mcs.anl.gov/~foster/Talks/ResearchLibraryGroupGridsApril2002.ppt, last seen 2007

[4] Science-Driven Network Requirements for ESnet, http://
