Open Cirrus™: A Global Testbed
for Cloud Computing Research
David O’Hallaron
Director, Intel Labs Pittsburgh
Carnegie Mellon University
Open Cirrus Testbed
• Sponsored by HP, Intel, and Yahoo! (with additional support from NSF).
• 9 sites worldwide, with a target of around 20 in the next two years.
• Each site has 1,000-4,000 cores.
• Shared hardware infrastructure (~15K cores), services, research, and apps.
http://opencirrus.intel-research.net
Open Cirrus Context
Goals
1. Foster new systems and services research around cloud computing
2. Catalyze an open-source stack and APIs for the cloud
Motivation
— Enable more tier-2 and tier-3 public and private cloud providers
How are we different?
— Support for systems research and applications research
• Access to bare metal, integrated virtual-physical migration
— Federation of heterogeneous datacenters
Intel BigData Cluster
Open Cirrus site hosted by Intel Labs Pittsburgh
— Operational since Jan 2009.
— 180 nodes, 1440 cores, 1416 GB DRAM, 500 TB disk
Supporting 50 users, 20 projects from CMU, Pitt, Intel, GaTech
— Cluster management, location- and power-aware scheduling, physical-virtual migration (Tashi), cache-savvy algorithms (Hi-Spade), real-time streaming frameworks (SLIPstream), optical datacenter interconnects (CloudConnect), log-based architectures (LBA)
— Machine translation, speech recognition, programmable matter simulation, ground model generation, online education, real-time brain activity decoding, real-time gesture and object recognition, federated perception, automated food recognition
Idea for a research project on Open Cirrus?
— Send short email abstract to Mike Kozuch, Intel Labs Pittsburgh, michael.a.kozuch@intel.com
Open Cirrus Stack
— Base layer: compute + network + storage resources, power + cooling, and the management and control subsystem.
— The Physical Resource Set (PRS) service manages and allocates these physical resources.
— On top of the PRS service sit the PRS clients, each with their own "physical data center": research allocations, the Tashi service, an NFS storage service, and an HDFS storage service.
— Tashi instantiates virtual clusters on top of the PRS-allocated nodes.
— A BigData application is then layered on the stack:
1. the application runs
2. on Hadoop,
3. on a Tashi virtual cluster,
4. on a PRS.
— An experiment save/restore capability and platform services are added alongside the BigData application.
— User services complete the picture: BigData applications on Hadoop, experiment save/restore, platform services, and user services all run over the Tashi virtual clusters, the NFS and HDFS storage services, and the PRS.
System Organization
Compute nodes are divided into dynamically allocated, VLAN-isolated PRS subdomains.
Apps switch back and forth between virtual and physical.
Example subdomains: open service research, Tashi development, proprietary service research, apps running in a VM management infrastructure (e.g., Tashi), open workload monitoring and trace collection, and a production storage service.
Open Cirrus Stack - PRS
PRS service goals
— Provide mini-datacenters to researchers
— Isolate experiments from each other
— Provide a stable base for other research
PRS service approach
— Allocate sets of co-located physical nodes, isolated inside VLANs (a hypothetical request sketch follows).
PRS code from HP Labs is being merged into the Apache Tashi project.
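To make the resource model concrete, here is a minimal sketch of what a PRS-style allocation request might contain (a set of co-located nodes, VLAN isolation, and a lease). This is a hypothetical illustration in Python, not the actual PRS or Tashi API; all names and values are made up.

```python
# Hypothetical PRS-style allocation request (illustration only, not the real API):
# ask for a set of co-located physical nodes, isolated in their own VLAN.
from dataclasses import dataclass

@dataclass
class PrsRequest:
    project: str          # experiment that will own the mini-datacenter
    num_nodes: int        # number of physical nodes to allocate
    colocate: bool        # prefer nodes in the same rack(s)
    vlan_isolated: bool   # place the nodes in a private VLAN
    lease_hours: int      # return nodes to the free pool when the lease expires

request = PrsRequest(project="hi-spade", num_nodes=20,
                     colocate=True, vlan_isolated=True, lease_hours=72)
print(f"Requesting {request.num_nodes} co-located nodes for {request.project}")
```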
Open Cirrus Stack - Tashi
An open-source Apache Software Foundation project sponsored by Intel, CMU, and HP.
Research infrastructure for cloud computing on Big Data
— Implements the AWS interface (see the client sketch below)
— In daily production use on the Intel cluster for 6 months
• Manages a pool of 80 physical nodes
• ~20 projects / 40 users from CMU, Pitt, and Intel
— http://incubator.apache.org/projects/tashi
Research focus:
— Location-aware co-scheduling of VMs, storage, and power.
— Integrated physical/virtual migration (using PRS)
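Because Tashi exposes an AWS-style interface, a standard EC2 client library should in principle be able to drive it. The sketch below uses the Python boto library under the assumption that the cluster endpoint is EC2-compatible; the hostname, port, credentials, and image ID are placeholders, not real Open Cirrus values.

```python
import boto
from boto.ec2.regioninfo import RegionInfo

# Point boto at an assumed EC2-compatible endpoint (all values are placeholders).
region = RegionInfo(name="tashi", endpoint="tashi.example.org")
conn = boto.connect_ec2(aws_access_key_id="ACCESS_KEY",
                        aws_secret_access_key="SECRET_KEY",
                        is_secure=False, port=8773, region=region)

# Launch a small virtual cluster from a registered image (placeholder image ID).
reservation = conn.run_instances("ami-00000001", min_count=4, max_count=4,
                                 instance_type="m1.small")
for inst in reservation.instances:
    print(inst.id, inst.state)
```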
Credit: Mike Kozuch, Michael Ryan, Richard Gass, Dave O’Hallaron (Intel), Greg Ganger,
Tashi High-Level Design
(Diagram: a Cluster Manager and a Scheduler coordinate a pool of nodes; each node runs the Storage Service and the Virtualization Service.)
— Cluster nodes are assumed to be commodity machines.
— Services are instantiated through virtual machines.
— Data location and power information is exposed to the scheduler and services.
— The Cluster Manager (CM) maintains databases and routes messages; its decision logic is limited.
— Most decisions happen in the scheduler, which manages compute, storage, and power in concert (a toy placement sketch follows).
— The storage service aggregates the capacity of the commodity nodes.
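As a toy illustration of location-aware placement (not Tashi's actual scheduler logic), the sketch below picks the node that already stores the most blocks of a VM's input data, breaking ties by current load:

```python
# Illustrative location-aware placement: prefer the node holding the most input
# blocks so reads stay local; break ties by choosing the least-loaded node.
def place_vm(block_locations, node_load):
    """block_locations: node -> number of input blocks stored there.
       node_load: node -> number of VMs already running."""
    candidates = sorted(block_locations,
                        key=lambda n: (-block_locations[n], node_load.get(n, 0)))
    if candidates:
        return candidates[0]                  # most data-local, least loaded
    return min(node_load, key=node_load.get)  # no locality info: least loaded

blocks = {"node07": 12, "node09": 3, "node21": 12}
load = {"node07": 5, "node09": 1, "node21": 2}
print(place_vm(blocks, load))  # -> node21 (ties on locality, lower load)
```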
Location Matters (calculated)
(Chart: calculated throughput per disk in MB/s, 0-300, for a configuration of 40 racks x 30 nodes x 2 disks, comparing random placement against location-aware placement for Disk-1G, SSD-1G, Disk-10G, and SSD-10G. Location-aware placement wins by 3.6X, 11X, 3.5X, and 9.2X across these configurations.)
Location Matters (measured)
(Chart: measured throughput per disk in MB/s, 0-40, on 2 racks x 14 nodes x 6 disks, comparing random placement against location-aware placement. Location-aware placement wins by 2.9X and 4.7X.)
Open Cirrus Stack – Hadoop
An open-source Apache Software Foundation project sponsored by Yahoo!
— http://wiki.apache.org/hadoop/ProjectDescription
Provides a parallel programming model (MapReduce) and a distributed file system (HDFS).
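As a concrete example of the MapReduce model over HDFS, here is a standard word-count job written for Hadoop Streaming in Python; the input and output paths and the streaming jar location are placeholders.

```python
#!/usr/bin/env python
# Word count for Hadoop Streaming. Run with (paths are placeholders):
#   hadoop jar hadoop-streaming.jar \
#     -input /data/books -output /data/wordcounts \
#     -mapper "python wordcount.py map" -reducer "python wordcount.py reduce" \
#     -file wordcount.py
import sys

def mapper():
    # Emit one (word, 1) pair per word on stdin.
    for line in sys.stdin:
        for word in line.split():
            print("%s\t1" % word.lower())

def reducer():
    # Hadoop sorts map output by key, so counts for a word arrive consecutively.
    current, count = None, 0
    for line in sys.stdin:
        word, n = line.rstrip("\n").split("\t")
        if word != current:
            if current is not None:
                print("%s\t%d" % (current, count))
            current, count = word, 0
        count += int(n)
    if current is not None:
        print("%s\t%d" % (current, count))

if __name__ == "__main__":
    mapper() if sys.argv[1] == "map" else reducer()
```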
Typical Web Service
(Diagram: an external client sends a query over HTTP to the data center; an HTTP server and application servers backed by databases compute the result and return it.)
Examples: web sites serving dynamic content
Characteristics:
• Small queries and results
• Little client computation
• Moderate server computation
Big Data Service
(Diagram: an external client sends a query to a data-intensive computing system (e.g., Hadoop); a parallel query server, parallel compute servers, and parallel data servers operate on a source dataset and derived datasets stored in a parallel file system (e.g., GFS, HDFS) and return the result. External data sources feed the source dataset.)
Examples:
• Search
• Photo scene completion
• Log processing
• Science analytics
Characteristics:
• Small queries and results
• Massive data and computation performed on the server
Streaming Data Service
(Diagram: an external client and external data sources send a continuous query stream to the cloud; a parallel query server, parallel compute servers, and parallel data servers operate on source and derived datasets and return continuous query results.)
Examples: perceptual computing on high data-rate sensors, such as real-time brain activity detection, object recognition, and gesture recognition
Characteristics (a minimal client sketch follows):
• Application lives on the client
• Client uses the cloud as an accelerator
• Data is transferred with the query
• Variable, latency-sensitive HPC on the server
• Often combined with a Big Data service
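The sketch below shows the client side of this pattern under stated assumptions: a hypothetical recognition service URL and a stand-in frame source. Each sensor frame travels to the cloud with the query, and the client measures the round-trip latency. It is illustrative only, not the actual SLIPstream or Gestris code.

```python
# "Cloud as accelerator" client loop: stream frames, wait for low-latency results.
import time
import urllib.request

ENDPOINT = "http://cloud.example.org:8080/recognize"   # placeholder service URL

def read_frame():
    """Stand-in for a camera or sensor read; returns raw bytes."""
    return b"\x00" * 64 * 1024

while True:
    frame = read_frame()
    start = time.time()
    req = urllib.request.Request(ENDPOINT, data=frame,
                                 headers={"Content-Type": "application/octet-stream"})
    with urllib.request.urlopen(req) as resp:           # data travels with the query
        label = resp.read().decode()
    print("result=%s latency=%.0f ms" % (label, (time.time() - start) * 1e3))
```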
Streaming Data Service
Gestris – Interactive Gesture Recognition
Two-player "Gestris" (gesture Tetris) implementation:
• 2 video sources
• Uses a simplified volumetric event-detection algorithm
• 10 cores at 3 GHz each:
  - 1 for camera input and scaling
  - 1 for the game and display
  - 8 for volumetric matching (4 for each video stream)
• Achieves the full 15 fps rate
Arm gesture selects the action.
Streaming Data Meets Big Data
Real-time Brain Activity Decoding
• Magnetoencephalography (MEG) measures the magnetic fields associated with brain activity.
• Its temporal and spatial resolution offers unprecedented insights into brain dynamics.
(Images: ECoG and MEG.)
Credit: Dean Pomerleau (Intel), Tom Mitchell, Gus Sudre and Mark Palatucci (CMU), Wei Wang, Doug Weber and Anto Bagic (UPitt)
Localizing Sources of Magnetic Activity
— An ill-posed problem that applies to both MEG and EEG.
— Very computationally expensive.
— Important for better mapping to fMRI results, for deeper neuroscience understanding of brain processes, and (maybe) for improving decoding.
Goal: determine the spatiotemporal pattern of brain activity most likely to have caused the measured magnetic field.
Big Data Background Processing
Source localization pipeline (inputs: brain structural information and electromagnetic field measurements):
• MEG or EEG field data: pre-processing & filtering (~1 hr/session)
• MRI data: reconstruct the brain (~40 hr/subject)
• Create a co-registered boundary model (~1 hr/subject)
• Model of the electromagnetic field from sources to sensors (~5 min/session)
• Brain activity estimates (movies, time series) (~15 min/session)
Streaming/Big Data Service
Real-Time MEG/EEG Decoding
(Diagram: a stimulus is presented, MEG/EEG imaging data is preprocessed and filtered, and the electromagnetic field data is streamed to the cloud cluster, which performs source localization and brain activity decoding; off-line source modeling and off-line decoder training are each done once. Demo: real-time decoding of brain activity, distinguishing stimuli such as "hand", "foot", "celery", and "airplane".)
Summary and Lessons
Using the cloud as an accelerator for interactive streaming/big data apps is an important usage model.
Location-aware and power-aware workload scheduling are still open problems.
Need integrated physical/virtual allocations to combat cluster squatting.
Storage models are still a problem.
— GFS-style storage systems are not mature, and the impact of SSDs is unknown.
We need an open-source service architecture and reference implementations.
— Access model
— Local and global services
— Application frameworks
Need to investigate new application frameworks.