• No results found

FutureGrid Overview

N/A
N/A
Protected

Academic year: 2020

Share "FutureGrid Overview"

Copied!
44
0
0

Loading.... (view fulltext now)

Full text

(1)

https://portal.futuregrid.org

FutureGrid Overview

Geoffrey Fox

[email protected]

http://www.infomall.org https://portal.futuregrid.org

Director, Digital Science Center, Pervasive Technology Institute

Associate Dean for Research and Graduate Studies, School of Informatics and Computing Indiana University Bloomington

Future Internet Technology Building

Tsinghua University, Beijing, China

(2)

FutureGrid key Concepts I

FutureGrid is an international testbed modeled on Grid5000

Supporting international Computer Science and Computational

Science research in cloud, grid and parallel computing (HPC) – Industry and Academia

The FutureGrid testbed provides to its users:

A flexible development and testing platform for middleware and application users looking at interoperability, functionality,

performance or evaluation

Each use of FutureGrid is an experiment that is reproducibleA rich education and teaching platform for advanced

(3)

https://portal.futuregrid.org

FutureGrid key Concepts II

FutureGrid has a complementary focus to both the Open Science Grid and the other parts of XSEDE (TeraGrid).

FutureGrid is user-customizable, accessed interactively and

supports Grid, Cloud and HPC software with and without virtualization.

FutureGrid is an experimental platform where computer science

applications can explore many facets of distributed systems

and where domain sciences can explore various deployment

scenarios and tuning parameters and in the future possibly migrate to the large-scale national Cyberinfrastructure.

FutureGrid supports Interoperability Testbeds – OGF really

needed!

Note much of current use Education, Computer Science Systems and

(4)

FutureGrid key Concepts III

• Rather than loading images onto VM’s, FutureGrid supports

Cloud, Grid and Parallel computing environments by

provisioning software as needed onto “bare-metal” using Moab/

xCAT

– Image library for MPI, OpenMP, MapReduce (Hadoop, Dryad, Twister), gLite, Unicore, Globus, Xen, ScaleMP (distributed Shared Memory), Nimbus, Eucalyptus, OpenNebula, KVM, Windows …..

– Either statically or dynamically

• Growth comes from users depositing novel images in library • FutureGrid has ~4000 (will grow to ~5000) distributed cores

with a dedicated network and a Spirent XGEM network fault and delay generator

Image1

(5)

https://portal.futuregrid.org

FutureGrid Partners

Indiana University (Architecture, core software, Support)

Purdue University (HTC Hardware)

San Diego Supercomputer Center at University of California San Diego

(INCA, Monitoring)

University of Chicago/Argonne National Labs (Nimbus)University of Florida (ViNE, Education and Outreach)

University of Southern California Information Sciences (Pegasus to manage

experiments)

University of Tennessee Knoxville (Benchmarking)

University of Texas at Austin/Texas Advanced Computing Center (Portal)

University of Virginia (OGF, Advisory Board and allocation)

Center for Information Services and GWT-TUD from Technische Universtität

Dresden. (VAMPIR)

(6)

FutureGrid:

a Grid/Cloud/HPC Testbed

Private

Public FG Network

(7)

https://portal.futuregrid.org

Compute Hardware

Name System type # CPUs Cores# TFLOPS Total RAM (GB) Secondary Storage

(TB) Site Status

india IBM iDataPlex 256 1024 11 3072 339 + 16 IU Operational

alamo Dell PowerEdge 192 768 8 1152 30 TACC Operational

hotel IBM iDataPlex 168 672 7 2016 120 UC Operational

sierra IBM iDataPlex 168 672 7 2688 96 SDSC Operational

xray Cray XT5m 168 672 6 1344 339 IU Operational

foxtrot IBM iDataPlex 64 256 2 768 24 UF Operational

Bravo Large Disk & memory 32 128 1.5 (192GB per 3072 node)

144 (12 TB

per Server) IU Operational

Delta Large Disk & memory With Tesla GPU’s

16

16 GPU’s 96 ? 3

1536 (192GB per

node)

96 (12 TB

per Server) IU ~Dec 31 2011

TOTAL Cores

(8)

Storage Hardware

System Type Capacity (TB) File System Site Status

DDN 9550

(Data Capacitor)* 339 shared with IU + 16 TB dedicated Lustre IU Existing System

DDN 6620 120 GPFS UC New System

SunFire x4170 96 ZFS SDSC New System

Dell MD3000 30 NFS TACC New System

IBM 24 NFS UF New System

(9)

https://portal.futuregrid.org

Network Impairment Device

Spirent XGEM Network Impairments Simulator for

jitter, errors, delay, etc

Full Bidirectional 10G w/64 byte packets

up to 15 seconds introduced delay (in 16ns

increments)

0-100% introduced packet loss in .0001%

increments

Packet manipulation in first 2000 bytes

up to 16k frame size

(10)
(11)

https://portal.futuregrid.org

(12)

5 Use Types for FutureGrid

160

approved projects November 16 2011

https://portal.futuregrid.org/projects

Training Education and Outreach (8%)

Semester and short events; promising for small universities

Interoperability test-beds (3%)

Grids and Clouds; Standards; from Open Grid Forum OGF

Domain Science applications (31%)

Life science highlighted (18%), Non Life Science (13%)

Computer science (47%)

Largest current category

Computer Systems Evaluation (27%)

TeraGrid (TIS, TAS, XSEDE), OSG, EGI

Clouds are meant to need less support than other models;

(13)

https://portal.futuregrid.org 13

(14)

Current Education projects

System Programming and Cloud Computing, Fresno State,

Teaches system programming and cloud computing in different computing environments

REU: Cloud Computing, Arkansas, Offers hands-on experience

with FutureGrid tools and technologies

Workshop: A Cloud View on Computing, Indiana School of

Informatics and Computing (SOIC), Boot camp on MapReduce for faculty and graduate students from underserved ADMI

institutions

Topics on Systems: Distributed Systems, Indiana SOIC, Covers

(15)

https://portal.futuregrid.org

Current Interoperability Projects

SAGA,

Louisiana State, Explores use of

FutureGrid components for extensive

portability and interoperability testing of

Simple API for Grid Applications, and scale-up

and scale-out experiments

XSEDE/OGF

Unicore and Genesis Grid

endpoints tests for new US and European

grids

(16)

Current Bio Application Projects

Metagenomics Clustering,

North Texas, Analyzes

metagenomic data from samples collected from

patients

Next Generation Sequencing in the Cloud,

Indiana

and Lilly, investigate clouds for next generation

sequencing using MapReduce

Hadoop-GIS

: Emory, High Performance Query

System for Analytical Medical Imaging, Geographic

Information System like interface to nearly a

(17)

https://portal.futuregrid.org

Current Non-Bio Application Projects

Physics: Higgs boson,

Virginia, Matrix Element

calculations representing production and decay

mechanisms for Higgs and background processes

Business Intelligence on MapReduce,

Cal State -

L.A., Market basket and customer analysis designed

to execute MapReduce on Hadoop platform

CFD and Workload Management

Experimentation

Cummins – a major truck engine

company testing new simulation approaches

(18)

Current Technology Projects

ScaleMP for Gene Assembly,

Indiana Pervasive

Technology Institute (PTI) and Biology, Investigates

distributed shared memory over 16 nodes for

SOAPdenovo assembly of Daphnia genomes

XSEDE,

Virginia, Uses FutureGrid resources as a testbed

for XSEDE software development

EMI,

European Middleware Initiative will deploy software

on FutureGrid for training and use by international users

Bioinformatics and Clouds

, University of Oregon installed

(19)

https://portal.futuregrid.org

Current Computer Science Projects I

Data Transfer Throughput, Buffalo, End-to-end optimization

of data transfer throughput over wide-area, high-speed networks

Elastic Computing, Colorado, Tools and technologies to

create elastic computing environments using IaaS clouds that adjust to changes in demand automatically and transparently

Cloud-TM, Portugal, Cloud-Transactional Memory

programming model

The VIEW Project, Wayne State, Investigates Nimbus and

Eucalyptus as cloud platforms for elastic workflow scheduling and resource provisioning

(20)

Current Computer Science Projects II

Leveraging Network Flow Watermarking for

Co-residency Detection in the Cloud

, Oregon Looking

at security risks in virtualization and ways of

mitigating

Distributed MapReduce

, Minnesota. Support data

analytics with Hadoop with distributed real time

data sources

Evaluation of MPI Collectives for HPC Applications

on Distributed Virtualized Environments

, Rutgers

(21)

https://portal.futuregrid.org 21

Typical FutureGrid Performance Study

(22)

OGF 2010 Demo from Rennes

SDSC SDSC

UF UF

UC UC

Lille Lille

Rennes Rennes

Sophia Sophia ViNe provided the necessary

inter-cloud connectivity to deploy CloudBLAST across 6

Nimbus sites, with a mix of

(23)

https://portal.futuregrid.org

B534 Distributed Systems Class

23

(24)

ADMI Cloudy View on

Computing Workshop

June 2011

Jerome took two courses from IU in this area Fall 2010 and Spring 2011 on

FutureGrid

• ADMI: Association of Computer and Information Science/Engineering Departments at Minority Institutions

Offered on FutureGrid

10 Faculty and Graduate Students from ADMI Universities

The workshop provided information from cloud programming models to case

studies of scientific applications on FutureGrid.

At the conclusion of the workshop, the participants indicated that they would

incorporate cloud computing into their courses and/or research.

Concept and Delivery by

Jerome Mitchell:

Undergraduate ECSU,

(25)

https://portal.futuregrid.org

Workshop Purpose

Introduce ADMI to the basics of the emerging

Cloud Computing paradigm

Learn how it came about

Understand its enabling technologies

Understand the computer systems constraints, tradeoffs, and

techniques of setting up and using cloud

Teach ADMI how to implement algorithms in the Cloud

Gain competence in cloud programming models for distributed

processing of large datasets.

Understand how different algorithms can be implemented and

executed on cloud frameworks

Evaluating the performance and identifying bottlenecks when

(26)

FutureGrid Tutorials

Tutorial topic 1: Cloud Provisioning Platforms

• Using Nimbus on FutureGrid [novice]

• Nimbus One-click Cluster Guide [intermediate]

Using OpenStack Nova on FutureGrid

[novice]

• Using Eucalyptus on FutureGrid [novice]

• Connecting private network VMs across Nimbus clusters using ViNe [novice]

• Using the Grid Appliance to run FutureGrid Cloud Clients [novice]

Tutorial topic 2: Cloud Run-time Platforms • Running Hadoop on Eucalyptus

• Running Twister on Eucalyptus

Other Tutorials and Educational Materials

• Additional tutorials on FutureGrid-related technologies

• FutureGrid community educational materials

Tutorial topic 3: Educational Virtual Appliances

• Running a Grid Appliance on your desktop

• Running a Grid Appliance on FutureGrid

Running an OpenStack virtual appliance on

FutureGrid

• Running Condor tasks on the Grid Appliance

Running MPI tasks on the Grid Appliance • Running Hadoop tasks on the Grid Appliance

• Deploying virtual private Grid Appliance clusters using Nimbus

• Building an educational appliance from Ubuntu 10.04

Customizing and registering Grid Appliance images

using Eucalyptus

Tutorial topic 4: High Performance Computing

• Basic High Performance Computing

Running Hadoop as a batch job using MyHadoop • Performance Analysis with Vampir

Instrumentation and tracing with VampirTrace

Tutorial Topic 5: Experiment Management

(27)

https://portal.futuregrid.org

FutureGrid Viral Growth Model

Users apply for a project

Users improve/develop some software in project

This project leads to new images which are placed in

FutureGrid repository

Project report and other web pages document use

of new images

Images are used by other users

And so on ad infinitum ………

Please bring your nifty software up on FutureGrid!!

(28)

Software Components

Portals including “Support” “use FutureGrid” “Outreach”Monitoring – INCA, Power (GreenIT)

Experiment Manager: specify/workflowImage Generation and Repository

Intercloud Networking ViNE

Virtual Clusters built with virtual networksPerformance library

Rain or Runtime Adaptable InsertioN Service for imagesSecurity Authentication, Authorization,

Note Software integrated across institutions and between

middleware and systems Management (Google docs, Jira, Mediawiki)

• Note many software groups are also FG users

“Research”

(29)

https://portal.futuregrid.org

FutureGrid Software Architecture

Access Services

IaaS, PaaS, HPC, Persitent Endpoints, Portal, Support

Management Services

Image Management, Experiment Management, Monitoring and Information Services

Operations Services

Security & Accounting Services, Development Services

FutureGrid Fabric

Compute, Storage & Network Resources Support ResourcesDevelopment &

Portal Server, ...

Systems Services and Fabric

Base Software and Services, FutureGrid Fabric, Development and Support Resources

Note on Authentication and

Authorization

We have different

environments and

requirements from TeraGrid

(30)

Detailed Software Architecture

Base Software and Services

OS, Queuing Systems, XCAT, MPI, ...

Access Services

Management Services FutureGrid Operations Services Development Services Wiki, Task Management, Document Repository User and Support Services Portal, Tickets, Backup, Storage, PaaS Hadoop, Dryad, Twister, Virtual Clusters, ... Additional Tools & Services Unicore, Genesis II, gLite, ... Image Management FG Image Repository, FG Image Creation Experiment Management Registry, Repository Harness, Pegasus Exper. Workflows, ... Dynamic Provisioning

RAIN: Provisioning of IaaS, PaaS, HPC, ...

(31)

https://portal.futuregrid.org

Motivation:

Image Management and

Rain on FutureGrid

• Allow users to take control of installing the OS on a system on bare metal (without the administrator)

• By providing users with the ability to create their own

environments to run their projects (OS, packages, software) • Users can deploy their environments in both bare-metal and

virtualized infrastructures

• Security is obviously important

• RAIN manages tools to dynamically provide custom HPC environment, Cloud environment, or virtual networks on-demand

(32)

Architecture

API

Image Management

FG Resources

VM Bare MetalImage Image Repository Image Generator

External Services: Bcfg2, Security tools

Image Deploy

FG Shell Portal

Cloud Framework

(33)

https://portal.futuregrid.org

Image Generation

https://portal.futuregrid.org

• Creates and customizes images according to

user requirements

• Images are not aimed to any specific

infrastructure

0 50 100 150 200 250 300 350 400 450 500 CentOS5 Ubuntu10.10 Ti m e (s ) GenerateanImage

Upload image to the repo

Compress image

Install user packages

Install u l packages

Create Base OS

(34)

Image Deployment

• Customizes and deploys images for

specific infrastructures

• Two main infrastructures types:

HPC Deployment Cloud Deployment

0 20 40 60 80 100 120 140 Ti m e (s ) Deploy/StageImageonxCAT/Moab xCAT packimage

Retrieve kernels and update xcat tables Untar image and copy to the right place Retrieve image from repo 0 50 100 150 200 250 300 Ti m e (s ) Deploy/StageImageonCloudFrameworks

Wait un l image is in available status (aprox.) Uploading image to cloud framework from client side Retrieve image from server side to client side

Umount image (varies in different execu ons)

Customize image for specific IaaS framework

(35)

https://portal.futuregrid.org

Dynamic Provisioning Scalability Test

(36)

Test Setup

Environments: Provision in HPC, Eucalyptus,

OpenStack, OpenNebula

Image: Create CentOS5 images

We use FG image generator

Adapts image to kernel/ramdisk so we can deploy in

various environments

HPC: ramdisk modified by XCATEucalyptus: XEN kernel

OpenStack/OpenNebula: General Linux kernel

(37)

https://portal.futuregrid.org

Compute Infrastructure

(38)

Summary – HPC

HPC

111 machines available, we could use all of them

linear scaling

Due to reboot each reprovisioning takes some

(39)

https://portal.futuregrid.org

Summary - OpenStack

Cactus release of OpenStack (not the newest one) • Provisioning is done in batches of 10.

E.g. 30 machines is done through 3 x provisioning of 10 images, and so forth.If we do not do it this way experiments failed (50%)

• Caching of the images in the nodes is needed or scalability is effected significantly.

• Network sometimes gets not properly created (a known problem in OpenStack)

Diabolo has additional features that are a must in scalability experiments. • Scalability was not possible beyond 64 ndes

• We conclude(e.g. Gregor): Cactus out of the box is not suitable for our purposes. However, we have been able to make it work through

(40)

Eucalyptus

Latest open source version 2.03

We created VMS similar to OpenStack due to the

same problems noted as in OpenStack

Problematic is that Eucalyptus (in FG?) does not

allow us to execute commands in quick

(41)

https://portal.futuregrid.org

OpenNebula

We used version 3.0.0

OpenNebula does not cache images by default (we used the

default setup)

We used ssh distribution of images as NFS had terrible

performance problems

We were able to instantiate 148 instances with little problems.In our experiments we observed only one fault.

Open Nebula without cache works well and is suitable for

scalability experiments, however it is slow due to that. A

(42)

Need for fg-rain -move

Within FG we are in need of a tool that simply

moves nodes from one cloud infrastructure to

another to conduct scalability experiments

E.g. Node x at time a could be assigned to

(43)

https://portal.futuregrid.org

Nimbus

Nimbus has in FG been reported to be very

reliable.

We need to expand our environment towards

Nimbus and conduct a similar experiment

(44)

FutureGrid in a nutshell

The FutureGrid project

mission

is to

enable experimental work

that advances:

a) Innovation and scientific understanding of distributed computing and parallel computing paradigms,

b) The engineering science of middleware that enables these paradigms, c) The use and drivers of these paradigms by important applications, and, d) The education of a new generation of students and workforce on the

use of these paradigms and their applications.

The

implementation

of mission includes

• Distributed flexible hardware with supported use

Identified IaaS and PaaS “core” software with supported use

• Growing list of software from FG partners and users

References

Related documents

Introduction of an imidazo- line group on position-2 of the thiazolo ring enhanced the cytotoxic activity of these compounds with a specific effect on the cell cycle G2

In the case of ND, the negative correlation between SR and ND ( − 0. 34) suggests that a large neutrality distance (that is, long sequences of neutral steps in the random walk) makes

ARM is currently the most suitable method for analysis of big market basket data but when there is a large volume of sales transaction with high number of products, the data matrix

An important component of Cisco AVVID (Architecture for Voice, Video and Integrated Data), the Cisco Catalyst 4500 Series extends control from the backbone to the network edge

It is commonly used for itsChemotherapeutic effect, Antioxidant effect ,Antitumoral effect ,Biological effect ,Antiviral and Immunoenhancing effect ,Mechanical effect

The electrodes are activated and the com- bustion temperature to 300°C (14540 nF) has the optimum capacitance is almost three times greater than the elec- trode with the

Instead, the statistical saliency model treats the target as a point in a feature space and measures search time as a function of the distance to the mean and

– Colorado Statutes Title 10 (Insurance), Article 16 (Health Care Coverage), Part 1004 (Health Care Coverage Cooperatives). – Colorado Senate Bill 11-191: “Colorado Uniform