Open Cirrus™: A Global Testbed
for Cloud Computing Research
David O’Hallaron
Director, Intel Labs Pittsburgh
Carnegie Mellon University
Open Cirrus Testbed
• Sponsored by HP, Intel, and Yahoo! (with additional support from NSF).
• 9 sites worldwide, with a target of around 20 in the next two years.
• Each site has 1,000-4,000 cores.
• Shared hardware infrastructure (~15K cores), services, research, and apps.
http://opencirrus.intel-research.net
Open Cirrus Context
Goals
1. Foster new systems and services research around cloud computing
2. Catalyze an open-source stack and APIs for the cloud
Motivation
— Enable more tier-2 and tier-3 public and private cloud providers
How are we different?
— Support for systems research and applications research
• Access to bare metal, integrated virtual-physical migration
— Federation of heterogeneous datacenters
Intel BigData Cluster
Open Cirrus site hosted by Intel Labs Pittsburgh
— Operational since Jan 2009.
— 180 nodes, 1440 cores, 1416 GB DRAM, 500 TB disk
Supporting 50 users, 20 projects from CMU, Pitt, Intel, GaTech
— Cluster management, location- and power-aware scheduling, physical-virtual migration (Tashi), cache-savvy algorithms (Hi-Spade), real-time streaming frameworks (SLIPstream), optical datacenter interconnects (CloudConnect), log-based architectures (LBA)
— Machine translation, speech recognition, programmable matter simulation, ground model generation, online education, real-time brain activity decoding, real-time gesture and object recognition, federated perception, automated food recognition
Idea for a research project on Open Cirrus?
— Send short email abstract to Mike Kozuch, Intel Labs Pittsburgh, michael.a.kozuch@intel.com
Open Cirrus Stack
— Base layer: compute + network + storage resources, power + cooling, and the management and control subsystem.
— The Physical Resource Set (PRS) service manages and allocates these physical resources.
— On top of the PRS service sit the PRS clients, each with their own "physical data center": research allocations, the Tashi service, an NFS storage service, and an HDFS storage service.
— Tashi instantiates virtual clusters on top of the PRS-allocated nodes.
— A BigData application is then layered on the stack:
1. the application runs
2. on Hadoop,
3. on a Tashi virtual cluster,
4. on a PRS.
— An experiment save/restore capability and platform services are added alongside the BigData application.
— User services complete the picture: BigData applications on Hadoop, experiment save/restore, platform services, and user services all run over the Tashi virtual clusters, the NFS and HDFS storage services, and the PRS.
System Organization
Compute nodes are divided into dynamically allocated, VLAN-isolated PRS subdomains.
Apps switch back and forth between virtual and physical.
Example subdomains: open service research, Tashi development, proprietary service research, apps running in a VM management infrastructure (e.g., Tashi), open workload monitoring and trace collection, and a production storage service.
Open Cirrus Stack - PRS
PRS service goals
— Provide mini-datacenters to researchers
— Isolate experiments from each other
— Provide a stable base for other research
PRS service approach
— Allocate sets of co-located physical nodes, isolated inside VLANs (a hypothetical request sketch follows).
PRS code from HP Labs is being merged into the Apache Tashi project.
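To make the resource model concrete, here is a minimal sketch of what a PRS-style allocation request might contain (a set of co-located nodes, VLAN isolation, and a lease). This is a hypothetical illustration in Python, not the actual PRS or Tashi API; all names and values are made up.

```python
# Hypothetical PRS-style allocation request (illustration only, not the real API):
# ask for a set of co-located physical nodes, isolated in their own VLAN.
from dataclasses import dataclass

@dataclass
class PrsRequest:
    project: str          # experiment that will own the mini-datacenter
    num_nodes: int        # number of physical nodes to allocate
    colocate: bool        # prefer nodes in the same rack(s)
    vlan_isolated: bool   # place the nodes in a private VLAN
    lease_hours: int      # return nodes to the free pool when the lease expires

request = PrsRequest(project="hi-spade", num_nodes=20,
                     colocate=True, vlan_isolated=True, lease_hours=72)
print(f"Requesting {request.num_nodes} co-located nodes for {request.project}")
```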
Open Cirrus Stack - Tashi
An open-source Apache Software Foundation project sponsored by Intel, CMU, and HP.
Research infrastructure for cloud computing on Big Data
— Implements the AWS interface (see the client sketch below)
— In daily production use on the Intel cluster for 6 months
• Manages a pool of 80 physical nodes
• ~20 projects / 40 users from CMU, Pitt, and Intel
— http://incubator.apache.org/projects/tashi
Research focus:
— Location-aware co-scheduling of VMs, storage, and power.
— Integrated physical/virtual migration (using PRS)
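Because Tashi exposes an AWS-style interface, a standard EC2 client library should in principle be able to drive it. The sketch below uses the Python boto library under the assumption that the cluster endpoint is EC2-compatible; the hostname, port, credentials, and image ID are placeholders, not real Open Cirrus values.

```python
import boto
from boto.ec2.regioninfo import RegionInfo

# Point boto at an assumed EC2-compatible endpoint (all values are placeholders).
region = RegionInfo(name="tashi", endpoint="tashi.example.org")
conn = boto.connect_ec2(aws_access_key_id="ACCESS_KEY",
                        aws_secret_access_key="SECRET_KEY",
                        is_secure=False, port=8773, region=region)

# Launch a small virtual cluster from a registered image (placeholder image ID).
reservation = conn.run_instances("ami-00000001", min_count=4, max_count=4,
                                 instance_type="m1.small")
for inst in reservation.instances:
    print(inst.id, inst.state)
```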
Credit: Mike Kozuch, Michael Ryan, Richard Gass, Dave O’Hallaron (Intel), Greg Ganger,
Tashi High-Level Design
(Diagram: a Cluster Manager and a Scheduler coordinate a pool of nodes; each node runs the Storage Service and the Virtualization Service.)
— Cluster nodes are assumed to be commodity machines.
— Services are instantiated through virtual machines.
— Data location and power information is exposed to the scheduler and services.
— The Cluster Manager (CM) maintains databases and routes messages; its decision logic is limited.
— Most decisions happen in the scheduler, which manages compute, storage, and power in concert (a toy placement sketch follows).
— The storage service aggregates the capacity of the commodity nodes.
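As a toy illustration of location-aware placement (not Tashi's actual scheduler logic), the sketch below picks the node that already stores the most blocks of a VM's input data, breaking ties by current load:

```python
# Illustrative location-aware placement: prefer the node holding the most input
# blocks so reads stay local; break ties by choosing the least-loaded node.
def place_vm(block_locations, node_load):
    """block_locations: node -> number of input blocks stored there.
       node_load: node -> number of VMs already running."""
    candidates = sorted(block_locations,
                        key=lambda n: (-block_locations[n], node_load.get(n, 0)))
    if candidates:
        return candidates[0]                  # most data-local, least loaded
    return min(node_load, key=node_load.get)  # no locality info: least loaded

blocks = {"node07": 12, "node09": 3, "node21": 12}
load = {"node07": 5, "node09": 1, "node21": 2}
print(place_vm(blocks, load))  # -> node21 (ties on locality, lower load)
```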
Location Matters (calculated)
(Chart: calculated throughput per disk in MB/s, 0-300, for a configuration of 40 racks x 30 nodes x 2 disks, comparing random placement against location-aware placement for Disk-1G, SSD-1G, Disk-10G, and SSD-10G. Location-aware placement wins by 3.6X, 11X, 3.5X, and 9.2X across these configurations.)
Location Matters (measured)
(Chart: measured throughput per disk in MB/s, 0-40, on 2 racks x 14 nodes x 6 disks, comparing random placement against location-aware placement. Location-aware placement wins by 2.9X and 4.7X.)
Open Cirrus Stack – Hadoop
An open-source Apache Software Foundation project sponsored by Yahoo!
— http://wiki.apache.org/hadoop/ProjectDescription
Provides a parallel programming model (MapReduce) and a distributed file system (HDFS).
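As a concrete example of the MapReduce model over HDFS, here is a standard word-count job written for Hadoop Streaming in Python; the input and output paths and the streaming jar location are placeholders.

```python
#!/usr/bin/env python
# Word count for Hadoop Streaming. Run with (paths are placeholders):
#   hadoop jar hadoop-streaming.jar \
#     -input /data/books -output /data/wordcounts \
#     -mapper "python wordcount.py map" -reducer "python wordcount.py reduce" \
#     -file wordcount.py
import sys

def mapper():
    # Emit one (word, 1) pair per word on stdin.
    for line in sys.stdin:
        for word in line.split():
            print("%s\t1" % word.lower())

def reducer():
    # Hadoop sorts map output by key, so counts for a word arrive consecutively.
    current, count = None, 0
    for line in sys.stdin:
        word, n = line.rstrip("\n").split("\t")
        if word != current:
            if current is not None:
                print("%s\t%d" % (current, count))
            current, count = word, 0
        count += int(n)
    if current is not None:
        print("%s\t%d" % (current, count))

if __name__ == "__main__":
    mapper() if sys.argv[1] == "map" else reducer()
```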
Typical Web Service
(Diagram: an external client sends a query over HTTP to the data center; an HTTP server and application servers backed by databases compute the result and return it.)
Examples: web sites serving dynamic content
Characteristics:
• Small queries and results
• Little client computation
• Moderate server computation
Big Data Service
(Diagram: an external client sends a query to a data-intensive computing system (e.g., Hadoop); a parallel query server, parallel compute servers, and parallel data servers operate on a source dataset and derived datasets stored in a parallel file system (e.g., GFS, HDFS) and return the result. External data sources feed the source dataset.)
Examples:
• Search
• Photo scene completion
• Log processing
• Science analytics
Characteristics:
• Small queries and results
• Massive data and computation performed on the server
Streaming Data Service
(Diagram: an external client and external data sources send a continuous query stream to the cloud; a parallel query server, parallel compute servers, and parallel data servers operate on source and derived datasets and return continuous query results.)
Examples: perceptual computing on high data-rate sensors, such as real-time brain activity detection, object recognition, and gesture recognition
Characteristics (a minimal client sketch follows):
• Application lives on the client
• Client uses the cloud as an accelerator
• Data is transferred with the query
• Variable, latency-sensitive HPC on the server
• Often combined with a Big Data service
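The sketch below shows the client side of this pattern under stated assumptions: a hypothetical recognition service URL and a stand-in frame source. Each sensor frame travels to the cloud with the query, and the client measures the round-trip latency. It is illustrative only, not the actual SLIPstream or Gestris code.

```python
# "Cloud as accelerator" client loop: stream frames, wait for low-latency results.
import time
import urllib.request

ENDPOINT = "http://cloud.example.org:8080/recognize"   # placeholder service URL

def read_frame():
    """Stand-in for a camera or sensor read; returns raw bytes."""
    return b"\x00" * 64 * 1024

while True:
    frame = read_frame()
    start = time.time()
    req = urllib.request.Request(ENDPOINT, data=frame,
                                 headers={"Content-Type": "application/octet-stream"})
    with urllib.request.urlopen(req) as resp:           # data travels with the query
        label = resp.read().decode()
    print("result=%s latency=%.0f ms" % (label, (time.time() - start) * 1e3))
```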
Streaming Data Service
Gestris – Interactive Gesture Recognition
Two-player "Gestris" (gesture Tetris) implementation:
• 2 video sources
• Uses a simplified volumetric event-detection algorithm
• 10 cores at 3 GHz each:
  - 1 for camera input and scaling
  - 1 for the game and display
  - 8 for volumetric matching (4 for each video stream)
• Achieves the full 15 fps rate
Arm gesture selects the action.
Streaming Data Meets Big Data
Real-time Brain Activity Decoding
• Magnetoencephalography (MEG) measures the magnetic fields associated with brain activity.
• Its temporal and spatial resolution offers unprecedented insights into brain dynamics.
(Images: ECoG and MEG.)
Credit: Dean Pomerleau (Intel), Tom Mitchell, Gus Sudre and Mark Palatucci (CMU), Wei Wang, Doug Weber and Anto Bagic (UPitt)
Localizing Sources of Magnetic Activity
— An ill-posed problem that applies to both MEG and EEG.
— Very computationally expensive.
— Important for better mapping to fMRI results, for deeper neuroscience understanding of brain processes, and (maybe) for improving decoding.
Goal: determine the spatiotemporal pattern of brain activity most likely to have caused the measured magnetic field.
Big Data Background Processing
Source localization pipeline (inputs: brain structural information and electromagnetic field measurements):
• MEG or EEG field data: pre-processing & filtering (~1 hr/session)
• MRI data: reconstruct the brain (~40 hr/subject)
• Create a co-registered boundary model (~1 hr/subject)
• Model of the electromagnetic field from sources to sensors (~5 min/session)
• Brain activity estimates (movies, time series) (~15 min/session)
Streaming/Big Data Service
Real-Time MEG/EEG Decoding
(Diagram: a stimulus is presented, MEG/EEG imaging data is preprocessed and filtered, and the electromagnetic field data is streamed to the cloud cluster, which performs source localization and brain activity decoding; off-line source modeling and off-line decoder training are each done once. Demo: real-time decoding of brain activity, distinguishing stimuli such as "hand", "foot", "celery", and "airplane".)
Summary and Lessons
Using the cloud as an accelerator for interactive streaming/big data apps is an important usage model.
Location-aware and power-aware workload scheduling are still open problems.
Need integrated physical/virtual allocations to combat cluster squatting.
Storage models are still a problem.
— GFS-style storage systems are not mature, and the impact of SSDs is unknown.
We need an open-source service architecture and reference implementations.
— Access model
— Local and global services
— Application frameworks
Need to investigate new application frameworks.