S06: Open-Source Stack for Cloud Computing

Academic year: 2021


(1)

S06: Open-Source Stack for Cloud Computing

Milind Bhandarkar (Yahoo!), Michael Ryan (Intel), Michael Kozuch (Intel), Richard Gass (Intel)
(2)

Agenda

Sessions:

(A) Introduction 8.30-9.00

(B) Hadoop 9.00-10.00

Break 10.00-10.15

Hadoop 10.15-11.30

Lunch 11.30-12.30

(C) Pig 12.30-1.30

Break 1.30-1.45

(D) Tashi 1.45-3.30

I. Speaker intros

II. Motivation

III. Open Cirrus

IV. Open Cirrus software stack

V. Getting involved

(3)

Session A: Introduction

(4)

Michael Kozuch (Intro)

Michael Kozuch is a Principal Engineer with Intel Labs Pittsburgh and manager of the ILP Systems Research and Engineering group

– Manages the Intel Open Cirrus cluster and is the PI for the Tashi research project

Michael is a 12-year veteran of Intel and contributed to the development of Intel’s VT and TXT technologies

(5)

Milind Bhandarkar (Hadoop)

Lead Yahoo! Grid Solutions Team since June 2005

Contributor to Hadoop since January 2006

Trained 1000+ Hadoop users at Yahoo! & elsewhere

20+ years of experience in Parallel Programming

(6)

Michael Ryan (Tashi)

Michael is currently a research engineer with Intel Labs Pittsburgh

Lead developer for Tashi

Serves as sysadmin for the Intel Open Cirrus site

Coordinates the Global Monitoring service for Open Cirrus

(7)

Richard Gass (PRS)

Richard is currently a research engineer with Intel Labs Pittsburgh

– Lead developer for PRS

– Serves as sysadmin for the Intel Open Cirrus site

Richard has published 9+ scientific papers and is also an (imminent) PhD candidate with

(8)
(9)

Why Open and Cloud Make Sense

• Cloud Computing is a new, critical technology

– Efficiency: Admin costs aggregated

– Scalability: From 1 to 1000 servers in 10 sec. flat

– Empowerment: Anyone can buy a cluster

• Open Communities enable rapid innovation

– Exchange of ideas: Knowledge grows

– Constructive Darwinism: Best tools survive/evolve

– Empowerment: Anyone can build a LAMP stack

Rapidly developing and deploying innovative computing technologies

(10)

Research Interest: Big Data

• Interesting applications are data hungry

• The data grows over time

• The data is immobile

– 100 TB @ 1Gbps ~= 10 days

• Compute comes to the data

• Big Data clusters are the new libraries
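
The "100 TB @ 1Gbps ~= 10 days" figure above can be checked with a short calculation. This is a minimal sketch, assuming decimal units (1 TB = 10^12 bytes, 1 Gbps = 10^9 bits/s) and a fully utilized link with no protocol overhead; the function name `transfer_days` is illustrative and not from the slides.

```python
def transfer_days(size_tb: float, link_gbps: float) -> float:
    """Days needed to move size_tb terabytes over a link_gbps link.

    Assumptions (not from the slides): decimal units and a link that is
    100% utilized, with no protocol overhead or retransmissions.
    """
    bits = size_tb * 1e12 * 8              # terabytes -> bits
    seconds = bits / (link_gbps * 1e9)     # bits / (bits per second)
    return seconds / 86_400                # seconds -> days


days = transfer_days(100, 1)
print(f"{days:.2f} days")  # prints "9.26 days"
```

Under these idealized assumptions the transfer takes a little over nine days, which is consistent with the slide's rounded estimate; real-world overhead would only lengthen it.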

(11)
(12)

Open Cirrus Cloud Computing Testbed

MIMOS*, ETRI*, ISPRAS*, KIT*, UIUC*, IDA*

Collaboration between industry and academia, sharing hardware infrastructure

(13)

Open Cirrus

• Objectives

– Foster systems research around cloud computing

– Vendor-neutral open-source stacks and APIs for the cloud

– Expose research community to enterprise level requirements

– Provide realistic traces of cloud workloads

• How are we unique

– Support for systems research and applications research

– Federation of heterogeneous datacenters

– Collection of interesting data sets

Independently-managed sites…

(14)

User Access to Open Cirrus

• User access is organized around Research Projects

– Led by Principal Investigator (PI)

• Project PIs apply to each site separately

– Identifying additional team members

• Contact information for applications to each site is available on the Open Cirrus Web site

(15)

Open Cirrus* Research Projects

• Example research areas of interest

– Datacenter federation

– Datacenter management

– Web services

– Data-intensive systems

• Projects typically not of interest

– Traditional HPC app development

– Production apps looking for “free” cycles

– Closed-source system development

(16)
(17)

Open Cirrus* Software Components

• Physical Machine Allocation (PRS)

• Cluster Storage (HDFS)

• Virtual Machine Allocation (AWS* Compatible, e.g. Tashi or Eucalyptus)

• Application Services (Hadoop)

• Service layers: Compute Node Services, Global Services, Site Services

• Services shown: Single Sign-On, Global Monitoring, Global User Directories, Data Location, Resource Telemetry, Billing/Accounting

(18)

Physical Machine Allocation: PRS

• PRS dynamically divides compute nodes into isolated subdomains

• Provides each project with a mini-datacenter

• Isolation of experiments

• Subdomains shown: open service research; Tashi development; apps running in a VM mgmt infrastructure (e.g., Tashi, Eucalyptus); production storage service
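
The idea of dividing compute nodes into isolated subdomains can be illustrated with a toy allocator. This is only a sketch of the concept, not PRS code or its API: the function `divide_nodes`, the node names, and the project names are all hypothetical, and real isolation would involve the network and boot configuration, which this example does not model.

```python
from typing import Dict, List


def divide_nodes(nodes: List[str], requests: Dict[str, int]) -> Dict[str, List[str]]:
    """Give each project a disjoint slice of the node pool (toy model)."""
    if sum(requests.values()) > len(nodes):
        raise ValueError("not enough nodes to satisfy all requests")
    domains: Dict[str, List[str]] = {}
    start = 0
    for project, count in requests.items():
        # Slices never overlap, so no node belongs to two subdomains.
        domains[project] = nodes[start:start + count]
        start += count
    return domains


# Hypothetical pool of eight nodes and two project requests.
pool = [f"node{i:02d}" for i in range(8)]
domains = divide_nodes(pool, {"tashi-dev": 3, "storage": 2})
print(domains)
```

Each project ends up with its own "mini-datacenter" of nodes, and the remaining nodes stay unallocated for later requests.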

(19)

Cluster Storage: HDFS

• Storage system aggregating standard devices

– High-performance, parallel access

– High data reliability through replication

• Exposing location information enables intelligent placement of computation
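
The point about location information can be sketched as a scheduler that prefers to run a task on a node already holding a replica of its input block. This is a minimal illustration under stated assumptions, not the HDFS or Hadoop scheduler API: `place_task`, the block IDs, and the node names are hypothetical.

```python
from typing import Dict, List, Optional, Set


def place_task(block: str,
               replicas: Dict[str, List[str]],
               free_nodes: Set[str]) -> Optional[str]:
    """Choose a free node for the task that reads `block`.

    Prefers a node that stores a replica (data-local); otherwise falls back
    to any free node, in which case the data must cross the network.
    """
    for host in replicas.get(block, []):
        if host in free_nodes:
            return host
    # Deterministic fallback: smallest free node name, or None if none free.
    return min(free_nodes) if free_nodes else None


# Hypothetical replica map: each block is stored on three nodes.
replicas = {
    "blk_1": ["node-a", "node-b", "node-c"],
    "blk_2": ["node-d", "node-e", "node-f"],
}
free = {"node-b", "node-x"}

print(place_task("blk_1", replicas, free))  # node-b (holds a replica)
print(place_task("blk_2", replicas, free))  # node-b (no local replica is free)
```

Replication helps here in two ways: it protects the data, and it gives the scheduler several candidate nodes where the computation can run next to the data.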

(20)

Virtual Machine Allocation: Tashi

An open source Apache Software Foundation incubator project

– Infrastructure for cloud computing on Big Data

– http://incubator.apache.org/projects/tashi

– Support for AWS* interface

– OS, FS, and VMM agnostic

Research focus:

(21)

Application Service: Hadoop

An open-source Apache Software Foundation project sponsored by Yahoo!

http://hadoop.apache.org

Provides a scalable, parallel programming model (MapReduce) and the associated runtime
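
The MapReduce programming model can be illustrated with a word count written in plain Python. This is a single-process sketch of the model only and does not use the Hadoop API; the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) are illustrative. In Hadoop, the runtime would run the map and reduce steps in parallel across the cluster and perform the grouping step itself.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple


def map_phase(line: str) -> Iterator[Tuple[str, int]]:
    """Map: emit a (word, 1) pair for every word in one input line."""
    for word in line.lower().split():
        yield word, 1


def shuffle(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Shuffle: group all emitted values by key."""
    grouped: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key: str, values: List[int]) -> Tuple[str, int]:
    """Reduce: combine all values for one key into a single count."""
    return key, sum(values)


def word_count(lines: Iterable[str]) -> Dict[str, int]:
    pairs = (pair for line in lines for pair in map_phase(line))
    return dict(reduce_phase(k, vs) for k, vs in shuffle(pairs).items())


counts = word_count(["the cloud is open", "the stack is open source"])
print(counts)
```

Because each map call sees only one line and each reduce call sees only one key, both steps can be spread over many machines without changing the program's logic.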

(22)
(23)

Summary

Open Communities can shape the development of Cloud Computing

Open Cirrus* is a multi-partner test bed for research in Cloud Computing

The Open Cirrus software stack provides a good starting point for open-source cloud computing software development

(24)

Getting Involved

• Contact Open Cirrus* with research proposals

University Pierre and Marie Curie LIP6 in Paris

– http://opencirrus.org

– http://incubator.apache.org/projects/tashi
