• No results found

Enhancing UNICORE Storage Management using Hadoop

N/A
N/A
Protected

Academic year: 2021

Share "Enhancing UNICORE Storage Management using Hadoop"

Copied!
21
0
0

Loading.... (view fulltext now)

Full text

(1)

Enhancing UNICORE Storage Management using Hadoop

Di t ib t d Fil S t

Distributed File System

Wasim Bari2, Ahmed Shiraz Memon1, Dr. Bernd Schuller1

1. Jülich Supercomputing Centre, Forschungszentrum Jülich &

2 i f S i ifi C i h

2. Institute of Scientific Computing, RWTH Aachen

(2)

Contents

Contents

Contents – Motivation – Hadoop – UNICORE – UniHadoop - Summary - Q&A

Motivation

Hadoop

p

UNICORE

UniHadoop

Demonstration

Demonstration

Summary

(3)

Motivation

Motivation

Contents – Motivation – Hadoop – UNICORE – UniHadoop - Summary - Q&A

• Computation power has increased exponentially • More Computation with advanced technologiesp g

• Grid Computing, Parallel computing, High Performance Computing

• Environment Modeling

• High Energy and Nuclear Physics • And many more in “Feasible” time • And many more in Feasible time • Applications are producing HUGE data

• Big Bang, Large Hadron Collider at CERN – 15

(4)

Motivation

Motivation

Contents – Motivation – Hadoop – UNICORE – UniHadoop - Summary - Q&A

Traditional Data Storage Systems cannot store such amount of data

• Limited Storage • Disaster Recovery • Disaster Recovery • Number of Files • Parallel Access

Distributed Data Storage System

• Elastic expandable Storage • Elastic expandable Storage • Disaster Recovery

• Fast and Parallel Access • Durable

• Replication • Number of Files

(5)

Motivation

Motivation

Contents – Motivation – Hadoop – UNICORE – UniHadoop - Summary - Q&A

• UNICORE supports simple Storage mechanisms with the Filesystem located at TargetSystem Siteg y

• Becomes complicated while using complex workflows

• Goal: Allows UNICORE to support Distributed Storage System as big shared storage UNICORE UNICORE Distributed Storage System Client Storage System Simple Storage

(6)

Hadoop

Hadoop

Contents – Motivation – Hadoop –UNICORE – UniHadoop - Summary - Q&A

• In 2003 Proposed a highly scalable, fault tolerant, dynamic Distributed File System called Google File System (GFS).

I 2004 i t d d M R d i d l A

• In 2004, introduced MapReduce programming model, A mechanism to achieve parallelism.

• Apache Hadoop is an Open Source implementation of GFS andp p p p MapReduce

• Main contributor is

• Hadoop Distributed File System (HDFS)

• Scalable • Economical • Economical • Efficient and • Reliable

(7)

Hadoop

Hadoop Distributed

Distributed Filesystem

Filesystem (HDFS)

(HDFS)

Contents – Motivation – Hadoop –UNICORE – UniHadoop - Summary - Q&A

• Very Large Distributed File System

– 10K nodes, 100 million files, 10 PB

• Master/Worker Architecture

– Single Master (NameNode)/Many Workers (DataNode)g ( )/ y ( )

• Single Namespace for entire cluster

• Blocks: Files are broken up into smaller chunks

Typically 128 MB block size – Typically 128 MB block size

– Each block replicated on multiple DataNodes

• Client

Fi d l ti f bl k – Finds location of blocks

– Accesses data directly from DataNode – Security

P i i d l i il t POSIX – Permission model similar to POSIX

(8)

HDFS: Reading a file

HDFS: Reading a file

Contents – Motivation – Hadoop –UNICORE – UniHadoop - Summary - Q&A

(9)

Hadoop

Hadoop

Contents – Motivation – Hadoop –UNICORE – UniHadoop - Summary - Q&A

• Abstract Filesystem concept

Supported File and Storagessystems • Supported File- and Storagessystems

– Hadoop Distributed File System (hdfs://)

– Cloudera ( formerly Kosmos File System) (kfs://)

A S3 Fil S t – Amazon S3 File System

• S3 Native File System (s3n://)

• S3 Block File System (s3://)

– Local disk (file://)

– Local disk (file://)

– FTP File System (ftp://)

R ti l ti f f th t d St t

(10)

UNICORE

UNICORE

Contents – Motivation – Hadoop –UNICORE – UniHadoop - Summary - Q&A

• A Ready-to-Run, Open Source Grid Middleware that enables seamless and secure access to the Computing resources

• UNICORE Atomic Services (UAS)

– Storage Management Service (SMS)Sto age a age e t Se ce (S S)

– File Transfer Service (FTS) – Job Management Service

• Implementation of standards OASIS, OGSA, WS-I etc…

(11)

UNICORE Architecture

UNICORE Architecture

(12)

UniHadoop

UniHadoop: Overview

: Overview

Contents – Motivation – Hadoop –UNICORE – UniHadoop - Summary - Q&A

UniHadoop is an integration of UNICORE with Apache Hadoop • UniHadoop is an integration of UNICORE with Apache Hadoop • Inspired by Hadoop design

• Using an Abstract notion of Filesystem • Supports all storage/filesystems

• Specialisation of UNICORE Storage Management Service (SMS).

(13)

UniHadoop

UniHadoop: Architecture

: Architecture

Contents – Motivation – Hadoop –UNICORE – UniHadoop - Summary - Q&A

command -line client Eclipse -based client Portal client, e.g. GridSphere Programming API command -line client Eclipse -based client Portal client, e.g. GridSphere Programming API UNICORE UNICORE Gateway Gateway Hadoop Hadoop UNICORE UNICORE Gateway Gateway Hadoop Hadoop UNICORE XNJS UNICORE Atomic Services OGSA - * UNICORE XNJS UNICORE Atomic Services OGSA - * UNICORE WS - RF Service Registry CIS Information Service SMS FTS SMS FTS UNICORE XNJS UNICORE Atomic Services OGSA - * UNICORE XNJS UNICORE Atomic Services OGSA - * UNICORE WS - RF Service Registry CIS Information Service SMS SMS UNICORE WS - RF hosting env . IDB XACML entity UNICORE WS - RF hosting env . IDB XUUDB UVOS XACML entity UNICORE WS - RF hosting env . IDB XACML entity UNICORE WS - RF hosting env . IDB XUUDB UVOS XACML entity XUUDB UVOS

Target System Interface Target System Interface

XUUDB UVOS

Target System Interface Target System Interface

Local RMS (e.g. Torque, LL, LSF, etc.) Local RMS (e.g. Torque, LL, LSF, etc.)

External Storage

USpace USpace

DSS

Local RMS (e.g. Torque, LL, LSF, etc.)

Local RMS (e.g. Torque, LL, LSF, etc.) Local RMS (e.g. Torque, LL, LSF, etc.)Local RMS (e.g. Torque, LL, LSF, etc.)

External Storage

USpace USpace

DSS DSS

(14)

UniHadoop

UniHadoop: Design

: Design

Contents – Motivation – Hadoop –UNICORE – UniHadoop - Summary - Q&A

• StorageManagement: Base Interface • List directory

• create directory

• change permissions etc...

• SMSBaseImpl: Basic implementation

• FixedStorageImpl: A storage service files for fixed path, such as „/work“

• SMSHadoopImpl: SMS Adaptation for SMSHadoopImpl: SMS Adaptation for Hadoop

• IStorageAdapter: Interface for accessing l l l hi hi l t t

low level hierarchical storage systems • HadoopStorageAdapter: HDFS specific

(15)

UniHadoop

UniHadoop: Usage Scenario 1

: Usage Scenario 1

Contents – Motivation – Hadoop –UNICORE – UniHadoop - Summary - Q&A

• Default Shared Storage “H ” kfl

• “Huge” workflows

Target System 1 Target System 2 Target System 3 Target System N

(16)

UniHadoop

UniHadoop: Usage Scenario 2

: Usage Scenario 2

Contents – Motivation – Hadoop –UNICORE – UniHadoop - Summary - Q&A

A shared storage for every Target System • A shared storage for every Target System

Target System 1 Target System 2 Target System 3 Target System N

Hadoop Storage

Hadoop Storage

Hadoop Storage

(17)

Demonstration

Demonstration

Contents – Motivation – Hadoop –UNICORE – UniHadoop - Summary - Q&A

• Listing directories/files on HDFS • Creating files

Submitting jobs using data from HDFS • Submitting jobs using data from HDFS

(18)

Feature Matrix

Feature Matrix

Contents – Motivation – Hadoop –UNICORE – UniHadoop - Summary - Q&A

Features

Default SMS

Hadoop SMS

p

Unlimited Storage

Χ

Durability

Χ

Durability

Χ

Unlimited File Size

Χ

Disaster Recovery

Χ

U li it d N Of fil

Unlimited No. Of files

Χ

Trash Recoveryy

Χ

Χ

(19)

Summary

Summary

Contents – Motivation – Hadoop –UNICORE – UniHadoop - Extensions – Summary - Q&A

• Requirements

Architecture and Design • Architecture and Design • Usage scenarios

• Hadoop integration provide support for • MapReduce

• MapReduce

• Data analysis and intelligence using Pig, Hive and HBase • A number of available File- and Storagesystems

(20)

Q & A

Q & A

Contents – Motivation – Hadoop –UNICORE – UniHadoop - Summary - Q&A

(21)

Contact

Contact

a.memon@fz-juelich.de

j

UNICORE H d

UNICORE-Hadoop:

Bundled in release 6.2.1 server, www.unicore.eu

Source:

Source:

References

Related documents

The various types of bullying which were going on in the schools were; physical, social, verbal, cyber and psychological.. Accordingly the study revealed that physical and

The rst chapter looks at the state of trade unions in Europe and how they have been aected by globalisation; the second chapter is theoretical in nature and shows how the

Sophomores Welcome Week Begins JR/SR Welcome Week Begins Welcome Week

The purchasers of new homes, as well as the owners of existing residential property that may require the installation of residential sprinklers as a condition of future development

III.5 Lgr4 modulates Wnt pathway, EGFR pathway, and MMPs in breast cancers The Wnt signaling pathway is a key regulator in cancer progression (Anastas & Moon, 2013), cancer

EFVPTC indicates encapsulated follicular variant of PTC; NIFTP, noninvasive follicular thyroid neoplasm with papillary-like nuclear features; PTC, papillary thyroid carcinoma....

First, the negative relationship between monetary policy and pass-through proposed by Taylor (2000) is quantitatively im- portant for Canada, regardless of whether pass-through

Postpartum depression in the Occupied Palestinian Territory a longitudinal study in Bethlehem RESEARCH ARTICLE Open Access Postpartum depression in the Occupied Palestinian Territory