
FutureGrid: What an Experimental Infrastructure Can Do for You


(1)

What FutureGrid Can Do for You

TeraGrid’11 BOF Session

(2)

Agenda

• FutureGrid from User’s Perspective, Geoffrey Fox
• How to Access FutureGrid, Gregor von Laszewski
• HPC on FutureGrid, Warren Smith
• Cloud Computing on FutureGrid, Kate Keahey
• Training, Education and Outreach, Renato Figueiredo
• Experimental Framework Support, Warren Smith

(3)

FutureGrid BOF Overview

TG 11, Salt Lake City, July 18 2011

Geoffrey Fox

[email protected]

http://www.infomall.org https://portal.futuregrid.org

Director, Digital Science Center, Pervasive Technology Institute

(4)

FutureGrid key Concepts I

• FutureGrid supports Computer Science and Computational Science research in cloud, grid and parallel computing (HPC)
• The FutureGrid testbed provides to its users:
  – An interactive development and testing platform for middleware and application users looking at interoperability, functionality, performance or evaluation, with or without virtualization
  – A rich education and teaching platform for advanced cyberinfrastructure (computer science) classes
• FutureGrid has a complementary focus to both the Open Science Grid and the other parts of XSEDE.

(5)

FutureGrid key Concepts II

• Rather than loading images onto VMs, FutureGrid supports Cloud, Grid and Parallel computing environments by dynamically provisioning software as needed onto “bare-metal” using Moab/xCAT
  – Image library for MPI, OpenMP, MapReduce (Hadoop, Dryad, Twister), gLite, Unicore, Xen, Genesis II, ScaleMP (distributed shared memory), Nimbus, Eucalyptus, OpenNebula, OpenStack, KVM, Windows, …
• Growth comes from users depositing novel images in the library
• FutureGrid has ~4300 (will grow to ~5000) distributed cores

(6)

FutureGrid Partners

• Indiana University (Architecture, core software, Support)
• Purdue University (HTC Hardware)
• San Diego Supercomputer Center at University of California San Diego (INCA, Monitoring)
• University of Chicago/Argonne National Labs (Nimbus)
• University of Florida (ViNe, Education and Outreach)
• University of Southern California Information Sciences (Pegasus to manage experiments)
• University of Tennessee Knoxville (Benchmarking)
• University of Texas at Austin/Texas Advanced Computing Center (Portal)
• University of Virginia (OGF, Advisory Board and allocation)
• Center for Information Services and GWT-TUD from Technische Universität Dresden (VAMPIR)

(7)

FutureGrid: a Grid/Cloud/HPC Testbed

(8)

Compute Hardware

Name     System type                           # CPUs        # Cores  TFLOPS  Total RAM (GB)          Secondary Storage (TB)  Site  Status
india    IBM iDataPlex                         256           1024     11      3072                    339 + 16                IU    Operational
alamo    Dell PowerEdge                        192           768      8       1152                    30                      TACC  Operational
hotel    IBM iDataPlex                         168           672      7       2016                    120                     UC    Operational
sierra   IBM iDataPlex                         168           672      7       2688                    96                      SDSC  Operational
xray     Cray XT5m                             168           672      6       1344                    339                     IU    Operational
foxtrot  IBM iDataPlex                         64            256      2       768                     24                      UF    Operational
Bravo*   Large Disk & memory                   32            128      1.5     3072 (192 GB per node)  144 (12 TB per server)  IU    Early user, Aug. 1 general
Delta*   Large Disk & memory with Tesla GPUs   16 + 16 GPUs  96       ? 3     1536 (192 GB per node)  96 (12 TB per server)   IU    ~Sept 15
Total                                          1064          4288     45      16 TB
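
As a cross-check of the Total row, the per-cluster figures can be summed directly. The following is a minimal sketch, assuming the values are transcribed correctly from the Compute Hardware table; the names `CLUSTERS` and `totals` are illustrative only, and Delta's TFLOPS (listed as "? 3") is taken as 3.

```python
# Per-cluster figures transcribed from the Compute Hardware table.
# Tuple layout: (name, cpus, cores, tflops, ram_gb)
CLUSTERS = [
    ("india", 256, 1024, 11.0, 3072),
    ("alamo", 192, 768, 8.0, 1152),
    ("hotel", 168, 672, 7.0, 2016),
    ("sierra", 168, 672, 7.0, 2688),
    ("xray", 168, 672, 6.0, 1344),
    ("foxtrot", 64, 256, 2.0, 768),
    ("bravo", 32, 128, 1.5, 3072),
    ("delta", 16, 96, 3.0, 1536),  # assumption: "? 3" in the table read as 3
]


def totals(clusters):
    """Sum CPUs, cores, TFLOPS and RAM (GB) over all clusters."""
    cpus = sum(c[1] for c in clusters)
    cores = sum(c[2] for c in clusters)
    tflops = sum(c[3] for c in clusters)
    ram_gb = sum(c[4] for c in clusters)
    return cpus, cores, tflops, ram_gb


if __name__ == "__main__":
    cpus, cores, tflops, ram_gb = totals(CLUSTERS)
    print(f"CPUs={cpus} cores={cores} TFLOPS={tflops} RAM={ram_gb} GB")
```

Under these assumptions the sums are 1064 CPUs and 4288 cores, matching the Total row; the TFLOPS sum is 45.5 against the listed 45, and the RAM sum is 15648 GB against the listed 16 TB, so both appear to be rounded.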

(9)

Storage Hardware

System type                Capacity (TB)                         File system  Site  Status
DDN 9550 (Data Capacitor)  339 shared with IU + 16 TB dedicated  Lustre       IU    Existing System
DDN 6620                   120                                   GPFS         UC    New System
SunFire x4170              96                                    ZFS          SDSC  New System
Dell MD3000                30                                    NFS          TACC  New System

(10)

Network Impairment Device

Spirent XGEM Network Impairments Simulator for jitter, errors, delay, etc.

• Full bidirectional 10G w/64 byte packets
• Up to 15 seconds introduced delay (in 16 ns increments)
• 0-100% introduced packet loss in 0.0001% increments
• Packet manipulation in first 2000 bytes
• Up to 16k frame size
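
The delay and loss granularities above imply that a requested impairment is rounded to a step of the device. The following is a minimal sketch of that arithmetic only; it is not the Spirent interface, and the function names and the round-down choice for delay are assumptions made for illustration.

```python
DELAY_STEP_NS = 16                # delay granularity quoted on the slide
MAX_DELAY_NS = 15 * 10**9         # 15 seconds expressed in nanoseconds
LOSS_STEPS_PER_PERCENT = 10_000   # 0.0001 % increments


def quantize_delay_ns(requested_ns: int) -> int:
    """Round a requested delay down to a multiple of 16 ns, capped at 15 s."""
    if requested_ns < 0:
        raise ValueError("delay must be non-negative")
    capped = min(requested_ns, MAX_DELAY_NS)
    return (capped // DELAY_STEP_NS) * DELAY_STEP_NS


def quantize_loss_percent(requested: float) -> float:
    """Round a packet-loss percentage to the nearest 0.0001 % step."""
    if not 0.0 <= requested <= 100.0:
        raise ValueError("loss must be between 0 and 100 percent")
    steps = round(requested * LOSS_STEPS_PER_PERCENT)
    return steps / LOSS_STEPS_PER_PERCENT


if __name__ == "__main__":
    print(quantize_delay_ns(100))              # a 100 ns request
    print(MAX_DELAY_NS // DELAY_STEP_NS)       # number of delay steps
    print(quantize_loss_percent(0.00014))      # a 0.00014 % request
```

With these figures there are 937,500,000 distinct delay settings and 1,000,000 distinct loss settings.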

(11)
(12)

5 Use Types for FutureGrid

122 approved projects as of July 17 2011 (https://portal.futuregrid.org/projects)

• Training Education and Outreach (13)
  – Semester and short events; promising for small universities
• Interoperability test-beds (4)
  – Grids and Clouds; Standards; from Open Grid Forum OGF
• Domain Science applications (42)
  – Life science highlighted (21)
• Computer science (50)
  – Largest current category
• Computer Systems Evaluation (35)
  – TeraGrid (TIS, TAS, XSEDE), OSG, EGI

(13)
(14)
(15)

Selected Current Education projects

• System Programming and Cloud Computing, Fresno State, Teaches system programming and cloud computing in different computing environments
• REU: Cloud Computing, Arkansas, Offers hands-on experience with FutureGrid tools and technologies
• Workshop: A Cloud View on Computing, Indiana School of Informatics and Computing (SOIC), Boot camp on MapReduce for faculty and graduate students from underserved ADMI institutions

(16)

Selected Current Interoperability Projects

• SAGA, Louisiana State, Explores use of FutureGrid components for extensive portability and interoperability testing of Simple API for Grid Applications, and scale-up and scale-out experiments

(17)

Selected Current Bio Application Projects

• Metagenomics Clustering, North Texas, Analyzes metagenomic data from samples collected from patients
• Genome Assembly, Indiana SOIC, De novo

(18)

Selected Current Non-Bio Application Projects

• Physics: Higgs boson, Virginia, Matrix Element calculations representing production and decay mechanisms for Higgs and background processes
• Business Intelligence on MapReduce, Cal State - L.A., Market basket and customer

(19)

Selected Current Computer Science Projects

• Data Transfer Throughput, Buffalo, End-to-end optimization of data transfer throughput over wide-area, high-speed networks
• Elastic Computing, Colorado, Tools and technologies to create elastic computing environments using IaaS clouds that adjust to changes in demand automatically and transparently

(20)

Selected Current Technology Projects

• ScaleMP for Gene Assembly, Indiana Pervasive Technology Institute (PTI) and Biology, Investigates distributed shared memory over 16 nodes for SOAPdenovo assembly of Daphnia genomes
• XSEDE, Virginia, Uses FutureGrid resources as a testbed for XSEDE software development
• Globus Online, Indiana PTI, Chicago, Investigates the feasibility of providing DemoGrid and its

(21)

Typical FutureGrid Performance Study

(22)

ADMI Cloudy View on Computing Workshop, June 2011

Concept and Delivery by Jerome Mitchell

• Jerome took two courses from IU in this area Fall 2010 and Spring 2011 on FutureGrid
• ADMI: Association of Computer and Information Science/Engineering Departments at Minority Institutions
• Offered on FutureGrid
• 10 Faculty and Graduate Students from ADMI Universities
• The workshop covered topics ranging from cloud programming models to case studies of scientific applications on FutureGrid.
• At the conclusion of the workshop, the participants indicated that they would incorporate cloud computing into their courses and/or research.

(23)
(24)

FutureGrid Viral Growth Model

• Users apply for a project
• Users improve/develop some software in project
• This project leads to new images which are placed in FutureGrid repository
• Project report and other web pages document use of new images
• Images are used by other users
• And so on ad infinitum …

(25)
(26)

Elementary FG Access Services

(27)
(28)

FG Portal

• Coordination of Projects and users
  – Project management
    • Membership
    • Results
  – User Management
    • Contact Information
    • Keys, OpenID
• Coordination of Information
  – Manuals, tutorials, FAQ, Help
  – Status
    • Resources, outages, usage, …
• Coordination of the Community
  – Information exchange: Forum, comments, community pages
  – Feedback: rating, polls
• Technology has been established
• Transition technical development to TACC as much as possible so we can focus on other areas at IU

(29)
(30)
(31)

Check your Account Status

• Go to:
  – Accounts-My Portal Account
• Check if the account status bar is green
  – Errors will indicate an issue or a task that requires waiting
• Since you are already here:
  – Upload a portrait

(32)

Get access

Project Lead
• Create a portal account
• Create a project
• Add project members

Project Member
• Create a portal account
• Ask your project lead to add you to the project

Once the project you participate in is approved
• Apply for an HPC & Nimbus account
  – You will need an ssh key

(33)
(34)
(35)

Services Offered

• ViNe can be installed on the other resources via Nimbus
• Access to the resource is requested through the portal
• Pegasus available via Nimbus and Eucalyptus

(36)
(37)

HPC on FutureGrid

Warren Smith

(38)

HPC on FutureGrid

HPC-style usage is supported

Many of the clusters have an HPC partition

Clusters well suited to HPC

Infiniband networks

(39)


(40)

HPC Access

• ssh to login nodes
  – alamo.futuregrid.org, hotel.futuregrid.org, …
  – Uses the public key you’ve uploaded to the portal
• Modules to manage your environment
  – Intel and GNU compilers (others wanted?)
  – MPI, OpenMP
• Torque and Moab to schedule access to compute nodes
  – Reservations?
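
To illustrate how a job reaches the compute nodes through Torque, the following is a minimal sketch that renders a batch script with standard `#PBS` directives. The helper name `render_torque_script`, the module name `openmpi`, the executable `./hello_mpi` and the node and core counts are all hypothetical; actual module names and node sizes on a given FutureGrid cluster are not stated in the slides.

```python
def render_torque_script(job_name, nodes, ppn, walltime, command, modules=()):
    """Return the text of a Torque batch script for an MPI job.

    nodes/ppn map to '#PBS -l nodes=N:ppn=P'; walltime is 'HH:MM:SS'.
    Module names are site-specific and passed in by the caller.
    """
    lines = [
        "#!/bin/bash",
        f"#PBS -N {job_name}",
        f"#PBS -l nodes={nodes}:ppn={ppn}",
        f"#PBS -l walltime={walltime}",
        "#PBS -j oe",              # merge stderr into stdout
        "",
        "cd $PBS_O_WORKDIR",       # start in the submission directory
    ]
    lines += [f"module load {name}" for name in modules]
    # One MPI rank per requested core.
    lines.append(f"mpirun -np {nodes * ppn} {command}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical job: 2 nodes with 8 cores each, 10 minute limit.
    print(render_torque_script("hello", 2, 8, "00:10:00", "./hello_mpi",
                               modules=["openmpi"]))
```

The rendered text would typically be saved to a file on a login node and submitted with `qsub`.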

(41)

Performance Tools

Provide a number of tools to analyze performance

Full support of partner tools

(42)
(43)

Cloud Computing on FutureGrid with Nimbus

Kate Keahey

[email protected]

(44)

What is Nimbus?

• Nimbus Infrastructure: Enable providers to build IaaS clouds
  – Workspace Service, Cumulus
• Nimbus Platform: Enable users to use IaaS clouds
  – Context Broker, Cloudinit.d, Elastic Scaling Tools, Gateway
• High-quality, extensible, customizable, open source implementation
• Enable developers to extend,

(45)

Using Nimbus Infrastructure

[Diagram: pools of nodes]

(46)

Using Nimbus Infrastructure

• Nimbus publishes information about each VM
• Users can find out information about their VM (e.g. what IP the VM was bound to)
• Users can interact directly with their VM in the same way they would with a physical machine.

(47)

Nimbus on FutureGrid

• Hotel (University of Chicago) -- Xe
  – 41 nodes, 328 cores
• Foxtrot (University of Florida) -- Xe
  – 26 nodes, 208 cores
• Sierra (SDSC) -- Xe
  – 18 nodes, 144 cores
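
The node and core counts above can be combined into an overall Nimbus capacity figure. The following is a minimal sketch using only the numbers on this slide; the names `NIMBUS_CLOUDS` and `summarize` are illustrative.

```python
# (nodes, cores) per Nimbus cloud, as listed on the slide.
NIMBUS_CLOUDS = {
    "hotel": (41, 328),
    "foxtrot": (26, 208),
    "sierra": (18, 144),
}


def summarize(clouds):
    """Return total nodes, total cores and cores per node for each cloud."""
    total_nodes = sum(nodes for nodes, _ in clouds.values())
    total_cores = sum(cores for _, cores in clouds.values())
    per_node = {name: cores // nodes for name, (nodes, cores) in clouds.items()}
    return total_nodes, total_cores, per_node


if __name__ == "__main__":
    nodes, cores, per_node = summarize(NIMBUS_CLOUDS)
    print(f"{nodes} nodes, {cores} cores, cores per node: {per_node}")
```

On these figures the three clouds together offer 85 nodes and 680 cores, at 8 cores per node on each site.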

(48)

Sky Computing

Sky Computing = a Federation of Clouds

Approach:
– Combine resources obtained in multiple Nimbus clouds in FutureGrid and Grid’5000
– Combine Context Broker, ViNe, fast image deployment
– Deployed a virtual cluster of over 1000 cores on Grid’5000 and FutureGrid, the largest ever of this type

• Grid’5000 Large Scale Deployment Challenge award
• Demonstrated at OGF 29 06/10
• TeraGrid ’10 poster
• More at: www.isgtw.org/?pid=1002832

Work by Pierre Riteau et al, University of Rennes 1

“Sky Computing”

(49)

Backfill: Lower the Cost of Your Cloud

• Challenge: utilization, catch-22 of on-demand computing
• Solution: new instances
  – Backfill
• Bottom line: up to 100% utilization
• Who decides what backfill VMs run?
• Spot pricing
• Research by Paul Marshall, University of Colorado
• Open Source community contributions via Google Summer of Code (GSoC), Paolo Gomez
• Nimbus release 2.7
• Paper @ CCGrid 2011

[Figure: chart with labels 16%, 31%, 47%, 62%, 78%, 94%]

(50)

BaBar Experiment at SLAC in Stanford, CA

• Using clouds to simulate electron-positron collisions in their detector
• Exploring virtualization as a vehicle for data preservation
• Approach:
  – Appliance preparation and management
  – Distributed Nimbus clouds
  – Cloud Scheduler
• Running production BaBar workloads

UVIC Efforts

(51)

Cloud Computing on FutureGrid

• Several Infrastructure-as-a-Service clouds
  – Nimbus, Eucalyptus, OpenStack (experimental)
• Supported patterns
  – Experimenting with middleware on top of infrastructure clouds
  – Modifying and experimenting with infrastructure clouds

(52)
(53)

FutureGrid Training, Education and Outreach

Presented by Renato Figueiredo

[email protected]

(54)

Overview

Traditional ways of delivering hands-on training and education in parallel/distributed computing have non-trivial dependences on the environment

• Difficult to replicate same environment on different resources (e.g. HPC clusters, desktops)

• Difficult to cope with changes in the environment (e.g. software upgrades)

(55)

TEO Infrastructure - guiding principles

• Fidelity: TEO activities should use full-fledged, executable software: education/training modules
  – Learn using the proper tools
• Reproducibility: Creators of content should be able to install, configure, and test their modules once, and be assured of the same functional behavior regardless of where the module is deployed

(56)

TEO Infrastructure - guiding principles

• Deployability: Students and users should be able to deploy modules in a simple manner, and in a variety of resources
  – Reduce barriers to entry; avoid dependences upon a particular infrastructure
• Community-oriented: Modules should be simple to share, discover, reuse, and expand

(57)

Towards this vision in FutureGrid

• Executable modules: virtual appliances
  – Deployable on FutureGrid resources
  – Deployable on other cloud platforms, as well as on virtualized desktops
• Community sharing: Web 2.0 portal, appliance image repositories

(58)

Virtual appliances

• Leverage existing virtual networking software and virtual appliance images used in other projects
• Focus: integration with FutureGrid resources
  – Leverage network virtualization software
    • FutureGrid includes ViNe and GroupVPN
  – Image deployment, testing, documentation, tutorials
    • KVM/Xen, Nimbus/Eucalyptus

(59)

Virtual appliance clusters

Same image, different VPNs

[Diagram: a “Hadoop + Virtual Network” appliance image is copied and instantiated; one instance becomes a Hadoop worker, another instance becomes another Hadoop worker]

(60)

July 26-30, 2010 NCSA Summer School Workshop

University of Arkansas, Indiana University, University of California at Los Angeles, Penn State, Iowa State, Univ. Illinois at Chicago, University of Minnesota, Michigan State, Notre Dame, University of Texas at El Paso, IBM Almaden Research Center, Washington University, San Diego Supercomputer Center, University of Florida, Johns Hopkins

http://salsahpc.indiana.edu/tutorial

300+ Students (200 on sites from 10 institutes; 100 online)

IU MapReduce and UF Virtual Appliance technologies are supported by FutureGrid.

(61)

Activities: Courses

• Graduate-level “Cloud computing for Data-Intensive Sciences” (Judy Qiu, Fall 2010)
  – Virtualization technologies and tools
  – Infrastructure as a service
  – Parallel programming (MPI, Hadoop)
• FutureGrid supported activities in a new semester-long class offered Fall 2010 at LSU (Gabrielle Allen, Shantenu Jha)

(62)
(63)

Activities: ADMI Workshop

• Cloudy View on Computing workshop
• 10 faculty members and graduate students from HBCUs interested in cloud computing.
• Cloud programming models, case studies of scientific

(64)
(65)

Experiment Management on

FutureGrid

Warren Smith

(66)

Experiment Management Goals

• Support rigorous experimentation
  – Define experiments in detail
  – Record experimental results
    • User-specified measurements (placement and granularity)
  – Share experiment information
    • Experiments can be repeated and verified
    • Variations on experiments can be performed
• Convenient execution of experiments
  – FutureGrid has distributed resources and services

(67)

Experiment Management Approach

• Provide tools to execute distributed experiments
  – Access (potentially many) resources
  – Interact with a number of services
  – Support execution of experiment plans
• Support several usage models
  – Workflow (often large, automatic, batched, unattended)
  – Interactive (attended)
  – Hybrid
• Store experiment information for later use

(68)

Experiment Management Available Components

• Pegasus
  – Workflow-based experiment management
  – Builds on existing Pegasus software
    • Kickstart to record job execution and its environment
    • Details of Pegasus presented elsewhere
• TakTuk
  – Basic interactive experiment management
  – Reuse tool deployed on Grid’5000
• Host List Manager
  – Organize provisioned systems into groups, generate host lists for TakTuk
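
The idea of grouping provisioned systems and emitting host lists can be sketched as follows. This is not the FutureGrid Host List Manager itself: the class and method names, the host names, and the one-host-per-line output format are all assumptions made for illustration.

```python
from collections import defaultdict


class HostListManager:
    """Toy registry that groups hosts and emits plain-text host lists."""

    def __init__(self):
        self._groups = defaultdict(list)

    def add(self, group, host):
        """Add a host to a group, ignoring duplicates within the group."""
        if host not in self._groups[group]:
            self._groups[group].append(host)

    def host_list(self, *groups):
        """Return the hosts of the named groups, one per line, no repeats."""
        seen = []
        for group in groups:
            for host in self._groups.get(group, []):
                if host not in seen:
                    seen.append(host)
        return "\n".join(seen) + ("\n" if seen else "")


if __name__ == "__main__":
    manager = HostListManager()
    # Hypothetical host names for two experiment roles.
    manager.add("servers", "node1.example.org")
    manager.add("clients", "node2.example.org")
    manager.add("clients", "node3.example.org")
    print(manager.host_list("servers", "clients"), end="")
```

A list produced this way could be written to a file and handed to whichever remote-execution tool is in use, subject to that tool's own input format.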

(69)

Experiment Management Planned Components

• Messaging-based Execution and Monitoring System (MEMS)
  – More sophisticated interactive experiment management
  – Integrated message streams for commands, results, and monitoring
• Pegasus provisioning workflows
  – Include resource provisioning into workflow
• Experiment Repository
  – Store and retrieve information about experiments
  – Uses the FG Image Repository as a component.
• User Portal integration
• Convert experiment plans

(70)
(71)
(72)

Research on FutureGrid

• Were there ever experiments you could not run and if so what were the obstacles?
• What do you need to obtain results for your next paper?
  – Resources, repositories, middleware?
• What kind of experiment management tools do you use today and how could they be improved?
• How do you collaborate with colleagues on developing complex experiments?

(73)

Education on FutureGrid

• What types of resources would help you teach a class?
  – Access to hardware? Integrated set of ready-made course materials? Ease of use?
• What would help you teach your next tutorial?
• How would you like to share teaching

(74)

Usage Modalities and Outreach

• What is your ideal scenario of usage?
• What would prevent you from using infrastructure such as FG?
• Where and how do you typically find information about services that enhance your mode of work?
