Operating a distributed IaaS Cloud
Ashok Agarwal, Patrick Armstrong, Adam Bishop, Andre Charbonneau, Ronald Desmarais, Kyle Fransham, Roger Impey, Colin Leavett-Brown, Michael Paterson, Wayne Podaima, Randall Sobie, Matt Vliet
HEPiX Spring 2011, GSI
University of Victoria, Victoria, Canada
National Research Council of Canada, Ottawa
Outline
• Motivation
  – HEP Legacy Data Project
  – CANFAR: Observational Astronomy
• System Architecture
• Operational Experience
• Future Work
• Summary
Motivation
• Projects requiring modest resources that we believe to be suitable for Infrastructure-as-a-Service (IaaS) clouds:
  – The High Energy Physics Legacy Data project
  – The Canadian Advanced Network for Astronomical Research (CANFAR)
• We expect an increasing number of IaaS clouds to be available for research computing.
HEP Legacy Data Project
• We have been funded in Canada to investigate a possible solution for analyzing BaBar data for the next 5-10 years.
• Collaborating with SLAC, who are also pursuing this goal.
• We are exploiting VMs and IaaS clouds.
• We assume we will be able to run BaBar code in a VM for the next 5-10 years.
• We hope that the results will be applicable to other experiments.
• 2.5 FTEs for 2 years, ending in October 2011.
• 9.5 million lines of C++ and Fortran.
• Compiled size is 30 GB.
• A significant amount of manpower is required to maintain the software.
• Each installation must be validated before generated results will be accepted.
• Moving between SL 4 and SL 5 required a significant amount of work, and SL 5 is likely the last version of SL that will be supported.
CANFAR
• CANFAR is a partnership between:
  – University of Victoria
  – University of British Columbia
  – National Research Council Canadian Astronomy Data Centre
  – Herzberg Institute of Astrophysics
• Will provide computing infrastructure for 6 observational astronomy survey projects.
• Jobs are embarrassingly parallel, much like HEP.
• Each of these surveys requires a different processing environment, which requires:
  – A specific version of a Linux distribution
  – A specific compiler version
  – Specific libraries
• Applications have little documentation.
• These environments are evolving rapidly.
How do we manage jobs on IaaS?
• With IaaS, we can easily create many instances of a VM image.
• How do we manage the VMs once they are booted?
• How do we get jobs to the VMs?
Possible Solutions
• The Nimbus Context Broker allows users to create "One-Click Clusters":
  – Users create a cluster with their VM, run their jobs, then shut it down.
  – However, most users are used to sending jobs to an HTC cluster, then waiting for those jobs to complete.
  – Cluster management is unfamiliar to them.
  – Already used for a big run with STAR in 2009.
• Univa Grid Engine submission to Amazon EC2:
  – Release 6.2 Update 5 can work with EC2.
  – Only supports Amazon.
• This area is evolving very rapidly!
• Other solutions?
Our Solution: Condor + Cloud Scheduler
• Users create a VM with their experiment software installed.
  – A basic VM is created by our group, and users add their analysis or processing software to create their custom VM.
• Users then create batch jobs as they would on a regular cluster, but they specify which VM image should run their jobs.
• Aside from the VM creation step, this is very similar to the HTC workflow.
How does it work?
Step 1: Research and commercial clouds are made available with some cloud-like interface.
Step 2: A user submits to the Condor job scheduler, which has no resources attached to it.
Step 3: Cloud Scheduler detects that there are waiting jobs in the Condor queues, and then makes requests to boot the VMs that match the job requirements.
Step 4: The VMs boot, attach themselves to the Condor queues, and begin draining jobs. Once no more jobs require the VMs, Cloud Scheduler shuts them down.
1. A user submits a job to a job scheduler.
2. The job sits idle in the queue, because there are no resources yet.
3. Cloud Scheduler examines the queue, and determines that there are jobs without resources.
4. Cloud Scheduler starts VMs on IaaS clusters.
5. These VMs advertise themselves to the job scheduler.
6. The job scheduler sees these VMs, and starts running jobs on them.
7. Once all of the jobs are done, Cloud Scheduler shuts down the VMs.
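The steps above can be sketched as a simple polling loop. This is an illustrative Python sketch under stated assumptions, not the actual cloud-scheduler API: the `Cloud` class and `poll` function are hypothetical stand-ins for the real cluster and job-management modules.

```python
from collections import Counter

class Cloud:
    """Minimal stand-in for an IaaS cloud (e.g. Nimbus or EC2)."""
    def __init__(self, name, free_slots):
        self.name = name
        self.free_slots = free_slots

    def boot_vm(self, image):
        self.free_slots -= 1
        return {"image": image, "cloud": self}

    def shutdown(self, vm):
        self.free_slots += 1


def poll(job_queue, clouds, booted):
    """One pass of the loop: boot VMs for unmatched jobs (steps 3-4),
    shut down VMs that no queued job needs any more (step 7)."""
    needed = Counter(job["vm_image"] for job in job_queue)
    have = Counter(vm["image"] for vm in booted)
    # Steps 3-4: boot one VM per queued job that lacks a VM of its type
    for image in needed:
        for _ in range(needed[image] - have[image]):
            cloud = next((c for c in clouds if c.free_slots > 0), None)
            if cloud is None:
                break  # all clouds full; try again on the next poll
            booted.append(cloud.boot_vm(image))
    # Step 7: retire VMs whose image no queued job requires
    for vm in list(booted):
        if needed[vm["image"]] == 0:
            vm["cloud"].shutdown(vm)
            booted.remove(vm)
```

The real scheduler does the same bookkeeping against the live Condor queue and the clouds' IaaS APIs, but the boot-on-demand / retire-on-idle cycle is the essential idea.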
Implementation Details
• We use Condor as our job scheduler.
  – Good at handling heterogeneous and dynamic resources
  – We were already familiar with it
  – Already known to be scalable
• We use the Condor Connection Broker to get around clouds with private IP addresses.
• We primarily support Nimbus and Amazon EC2, with experimental support for OpenNebula and Eucalyptus.
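The Condor Connection Broker (CCB) lets a startd on a private-IP VM register with a broker so the central manager can reach it over a reversed connection. A minimal sketch of the VM-side Condor configuration, assuming the collector doubles as the broker (the network name is a placeholder):

```
# On the worker VM (private IP): register with the broker so that the
# central manager can connect back to us through the reversed channel.
CCB_ADDRESS = $(COLLECTOR_HOST)       # use the collector as the broker
PRIVATE_NETWORK_NAME = cloud.example  # placeholder private-network name
```

With this in place, jobs can be matched and run on VMs that are unreachable by direct inbound connection.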
Implementation Details (cont.)
• Each VM has the Condor startd daemon installed, which advertises to the central manager at start-up.
• We use a Condor Rank expression to ensure that jobs only end up on the VMs they are intended for.
• Users use Condor attributes to specify the number of CPUs, memory, and scratch space that should be on their VMs.
• We have a rudimentary round-robin fairness scheme to ensure that users receive a roughly equal share of resources; it respects Condor priorities.
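The round-robin fairness scheme can be sketched as follows. This is a simplified illustration, not the actual cloud-scheduler code: it interleaves each user's queued jobs so that VM boots are granted to users in turn.

```python
from collections import deque

def round_robin_boot_order(jobs_by_user):
    """Interleave users' queued jobs so each user gets VMs booted in
    turn, yielding a roughly equal share of the available cloud slots.
    jobs_by_user: dict mapping user name -> list of that user's jobs."""
    queues = {user: deque(jobs) for user, jobs in jobs_by_user.items() if jobs}
    order = []
    while queues:
        # One pass over all users with jobs remaining: take one job each
        for user in list(queues):
            order.append(queues[user].popleft())
            if not queues[user]:
                del queues[user]
    return order
```

In the real scheduler this ordering is additionally weighted by Condor user priorities; the sketch shows only the equal-share case.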
Condor Job Description File
Universe   = vanilla
Executable = red.sh
Arguments  = W3-3+3
Log    = red10.log
Output = red10.out
Error  = red10.error
should_transfer_files   = YES
when_to_transfer_output = ON_EXIT
# Run-environment requirements
Requirements = VMType =?= "redshift"
+VMNetwork  = "private"
+VMCPUArch  = "x86"
+VMLoc      = "http://vmrepo.phys.uvic.ca/vms/redshift.img.gz"
+VMMem      = "2048"
+VMCPUCores = "1"
+VMStorage  = "20"
+VMAMI      = "ami-fdee0094"
Queue
CANFAR: MAssive Compact Halo Objects
• Detailed re-analysis of data from the MACHO experiment's dark matter search.
• Jobs perform a wget to retrieve the input data (40 MB) and have a 4-6 hour run time. Low I/O is great for clouds.
• Astronomers are happy with the environment.
Experience with CANFAR
[Figure: distribution of job durations; annotation: "Shorter Jobs"]
VM Run Times (CANFAR)
[Figure: VM run times; maximum allowed VM run time is 1 week]
VM Boots (CANFAR)
Experimental BaBar Cloud
Resources
Resource                 | Cores               | Notes
FutureGrid @ Argonne Lab | 100 cores allocated | Resource allocation to support BaBar
Elephant Cluster @ UVic  | 55 cores            | Experimental cloud cluster; hosts xrootd for cloud
NRC Cloud in Ottawa      | 65 cores            | Hosts VM image repository (repoman)
Amazon EC2               | Proportional to $   | Grant funding from Amazon
Hermes Cluster @ UVic    | Variable (250 max)  | Occasional backfill access
A Typical Week (BaBar)
BaBar MC production
Other Examples
Inside a Cloud VM
Inside a Cloud VM (cont.)
A batch of User Analysis Jobs
Sample     | Jobs | I/O rate
Tau1N-MC   | 64   | 220 MB/s
Tau1N-data | 77   | 440 MB/s
Tau11-MC   | 114  | 1300 MB/s
Cloud I/O for BaBar User Analysis
Some Lessons Learned
• Monitoring cloud resources is difficult.
  – Can you even expect the same kind of knowledge?
• Debugging user VM problems is hard for users, and hard for support.
  – What do you do when the VM network doesn't come up?
• No two EC2 API implementations are the same.
  – Nimbus, OpenNebula, and Eucalyptus are all different.
• Users are nicely insulated from cloud failures: if the VM doesn't come up, the job doesn't get drained.
SLAC Activities
Cloud in a Box:
– LTDA Analysis cloud.
– The idea is to build a secure cloud to run obsolete operating systems without compromising the base OS.
– VMs are on a separate VLAN, and strict firewall rules are in place.
– Users are managed through LDAP on an up-to-date system.
– Uses Condor / Cloud Scheduler / Nimbus for IaaS.
SLAC Team: Homer Neal, Tina Cartaro, Steffen Luitz, Len Moss,
Booker Bense, Igor Gaponenko, Wiko Kroeger, Kyle Fransham
SLAC LTDA Cluster
[Diagram: behind a firewall, sixty BBR-LTDA-VM hosts, each with xrootd, two NICs, a virtual bridge, 24 TB of disk, guest VMs, and an IaaS client; BBR-LTDA-SRV provides NTP, LDAP, DNS, DHCP, NFS (CVS, home and work areas, VM repo) and MySQL; BBR-LTDA-LOGIN provides user login and the LRM. Hosts run RHEL6 with managed legacy-OS guest VMs.]
Note: built using very generic design patterns! Useful to others.
Future Work/Challenges
• Increasing the scale.
  – I/O scalability needs to be proven.
  – Total number of VMs.
• Security? Leverage the work of the HEPiX virtualization working group.
• Booting large numbers of VMs quickly on research clouds.
  – Copy-on-write images (qcow, ZFS-backed storage)?
  – BitTorrent distribution?
  – Amazon does it, so we can too.
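The copy-on-write option can be made concrete with qcow2: each VM gets a thin overlay backed by a shared, read-only base image, so booting avoids copying the full 30 GB image. A sketch of the `qemu-img` invocation, with illustrative paths (not the project's actual image layout):

```python
import subprocess  # run the command with subprocess.check_call on a node with qemu-img

def make_cow_image(base_image, overlay_path):
    """Build the qemu-img command that creates a qcow2 copy-on-write
    overlay on top of a shared base image; the overlay starts nearly
    empty and fetches unmodified blocks from the base on demand."""
    return ["qemu-img", "create",
            "-f", "qcow2",      # overlay format
            "-b", base_image,   # shared backing file
            overlay_path]

# Example: sixty overlays for sixty VM slots (paths are hypothetical)
commands = [make_cow_image("/images/basevm.img", "/scratch/vm%02d.qcow2" % i)
            for i in range(60)]
```

Since every overlay references the same backing file, only the base image needs to be distributed to each cloud site.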
About the code
• Relatively small Python package, with lots of cloud interaction examples:
  http://github.com/hep-gc/cloud-scheduler
Ian-Gables-MacBook-Pro:cloud-scheduler igable$ cat source_files | xargs wc -l
     0 ./cloudscheduler/__init__.py
     1 ./cloudscheduler/__version__.py
   998 ./cloudscheduler/cloud_management.py
  1169 ./cloudscheduler/cluster_tools.py
   362 ./cloudscheduler/config.py
   277 ./cloudscheduler/info_server.py
  1086 ./cloudscheduler/job_management.py
     0 ./cloudscheduler/monitoring/__init__.py
    63 ./cloudscheduler/monitoring/cloud_logger.py
   208 ./cloudscheduler/monitoring/get_clouds.py
   176 ./cloudscheduler/utilities.py
    13 ./scripts/ec2contexthelper/setup.py
    28 ./setup.py
    99 cloud_resources.conf
  1046 cloud_scheduler
   324 cloud_scheduler.conf
   130 cloud_status
  5980 total
Summary
• Modest-I/O jobs can be easily handled on IaaS clouds.
• Early experiences are promising.
• More work is needed to show scalability.
• Lots of open questions.
More Information
• Ian Gable (igable@uvic.ca)
• cloudscheduler.org
• Code on GitHub:
  – http://github.com/hep-gc/cloud-scheduler
  – Run as a proper open source project
Roadmap
[Figure: project roadmap]
Acknowledgements
Start of extra slides
CANFAR
• CANFAR needs to provide computing infrastructure for 6 astronomy survey projects:
Survey                                                     | Team | Telescope
Next Generation Virgo Cluster Survey (NGVS)                | UVic | CFHT
Pan-Andromeda Archaeological Survey (PAndAS)               | UBC  | CFHT
SCUBA-2 All Sky Survey (SASSy)                             | UBC  | JCMT
SCUBA-2 Cosmology Legacy Survey (CLS)                      | UBC  | JCMT
Shapes and Photometric Redshifts for Large Surveys (SPzLS) | UBC  | CFHT
Time Variable Sky (TVS)                                    | UVic | CFHT
Cloud Scheduler Goals
• Don't replicate existing functionality.
• Be able to use existing IaaS and job scheduler software together, today.
• Users should be able to use the familiar HTC tools.
• Support VM creation on Nimbus, OpenNebula, Eucalyptus, and EC2, i.e. all the IaaS resource types people are likely to encounter.
• Adequate scheduling to be useful to our users.
• Simple architecture.
We have been interested in virtualization for some time:
• Encapsulation of applications
• Good for shared resources
• Performs well, as shown at HEPiX
We are interested in pursuing user-provided VMs on clouds. These are steps 4 and 8 as outlined in Tony Cass' "Vision for Virtualization" talk at HEPiX NERSC.