Harnessing the High
Performance Capabilities of
Cloud over the Internet
Jaison Paul Mulerikkal, PhD
HPC Knowledge Portal Meeting 2015 Barcelona, Spain
About Me
• Jaison Paul Mulerikkal
• B Tech (Mahatma Gandhi University, Kerala, India)
• MS (RMIT University, Melbourne, Australia)
• PhD (Australian National University)
• Computational Scientist, NeSI, New Zealand
• Asst Professor at Rajagiri School of Engineering & Technology, Kochi, Kerala, India
Harnessing the High Performance Capabilities of Cloud over the Internet jaisonmpaul@gmail
Kerala
Hegelian Dialectics
• German philosopher Georg Wilhelm Friedrich Hegel explained the philosophy of history as a dialectic between thesis, antithesis and the resulting synthesis.
– A theory emerges first and may then be confronted by an opposing theory.
– In due course, the dialectic between these opposing theories may reach a consensus by assimilating the main aspects of both.
History of Computing
(According to me!)
Cloud Computing
• The most powerful feature of cloud computing is its capacity to deliver computing as a 5th utility, after water, electricity, gas and telephony.
– That was the promise!
• My definition:
– Cloud computing is a form of parallel and distributed system that uses virtualization techniques to orchestrate the large storage, memory and network resources of data centres (or similar resources) as a unified unit, with apparently elastic, on-demand availability to customers.
– It relies on economies of scale to provide infrastructure and application services over a network or the Internet, without upfront commitment from customers and with minimal management, running and maintenance costs.
Hype Cycle - 2014
State of the Cloud – 2014
Private Cloud Usage
HPC in Cloud?
• A 2011 Gartner CIO survey predicted that 23% of computing activity would never move to the cloud. Some High Performance Computing (HPC) applications will be among that 23%.
• Ian Foster noted:
– The one exception that will likely be hard to achieve in cloud computing (but has had much success in Grids) are HPC applications that require fast and low latency network interconnects for efficient scaling to many processors.
Silver Lining in the Cloud
• However, the future of high performance computing in the cloud is not that bleak.
• Specialized clouds and providers, with new tools and technology, could enable HPC applications on the cloud with an acceptable level of speed and efficiency.
– Science Cloud, supported by the Nimbus project, is an early indication of that trend.
Two Questions
• Can the cloud produce HPC-like performance?
• Can the HPC capabilities of the cloud be harnessed effectively over slow networks like the Internet?
• Can the cloud produce HPC-like performance?
Dell Experiment
– Comparison between a bare-metal (BM) installation (with RHEL 6.5) and a virtual machine (VM) running on a hypervisor under OpenStack, on a single node (15 Jul 2014).
– Benchmarks run: NAS Parallel Benchmarks, ANSYS, etc.
• Embarrassingly parallel, compute-intensive applications perform 1-2% lower on the VM relative to BM.
• Applications with very high memory bandwidth requirements may perform up to 25% lower on the VM relative to BM.
Cluster vs Cloud Comparison - ANU
How do we cut the fat?
Docker
A container management system
Background
• Google’s lmctfy project (Let Me Contain That For You)
• Linux containers (LXC):
– Pros: faster lifecycle and limited overhead
– Cons: configuration complexity and weaker security isolation
• Oracle Solaris has a similar concept called Zones
• Docker is built on top of LXC (Linux Containers).
Docker vs VM
Docker vs VMs
• VM hypervisors such as Hyper-V, KVM and Xen are all based on emulating virtual hardware. That makes them fat in terms of system requirements.
• Containers share the host operating system, so they are expected to be much more efficient than hypervisors in terms of system resources.
– The loading time and system resources needed to launch applications could be lower.
• That’s the theory; what happens in practice?
IBM Experiment
• “Passive Benchmarking with docker LXC, KVM & OpenStack” on IBM SoftLayer
– By Boden Russell
• Disclaimer:
– “The tests herein are “passive” – no in depth tuning, analysis, etc. More active testing is warranted. These results do not necessary reflect your workload or exact performance nor are they guaranteed to be statistically sound.”
What OpenStack Is
• A group of 7+ core open source projects aimed at providing comprehensive cloud services.
• It is more than a hypervisor manager.
• It sits above the pool of virtualized resources as a master control plane.
• It provides users with a single point of control and orchestration.
OpenStack Components
• OpenStack Compute (code-named “Nova”)
• OpenStack Object Store (code-named “Swift”)
• OpenStack Image (code-named “Glance”)
• OpenStack Identity (code-named “Keystone”)
• OpenStack Block Storage (code-named “Cinder”)
• OpenStack Networking (formerly code-named “Quantum”)
Docker on OpenStack?
Benchmark Environment Topology @ IBM SoftLayer
[Figure: two test beds with the same layout, one per virtualization technology]
• Controller: glance (api / reg), nova (api / cond / etc), keystone, …, rally
• Compute node: nova (api / cond / etc), cinder (api / sch / vol), dstat, and either docker LXC or KVM
Ref: Boden Russell
Cloudy Performance: Serial VM Boot
[Figure: Average Server Boot Time, in seconds]
• docker: 3.900927941
• KVM: 5.884197426
Ref: Boden Russell
Cloudy Performance: Serial VM Boot
[Figures: compute node CPU usage (percent, usr and sys) over time, one panel each for Docker and KVM]
• Docker averages: usr 1.14, sys 0.44
• KVM averages: usr 12.6, sys 2.08
Ref: Boden Russell
Cloudy Performance: Serial VM Boot
[Figures: compute node used memory over time, one panel each for Docker and KVM]
• Docker: delta 687 MB, 45.8 MB per VM
• KVM: delta 2775 MB, 185 MB per VM
Ref: Boden Russell
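As a quick check on the memory figures above, each per-VM value should equal the memory delta divided by the number of instances booted. The slides do not state that count, so the sketch below infers it from the reported numbers; treat the resulting 15 as an inference, not a quoted fact. The variable names are illustrative only.

```python
# Figures as reported on the slide (MB).
docker_delta_mb, docker_per_vm_mb = 687, 45.8
kvm_delta_mb, kvm_per_vm_mb = 2775, 185

# Inferred number of instances booted in each run (not stated in the source).
docker_instances = round(docker_delta_mb / docker_per_vm_mb)
kvm_instances = round(kvm_delta_mb / kvm_per_vm_mb)

# How much more memory one KVM guest costs than one Docker container.
per_vm_ratio = kvm_per_vm_mb / docker_per_vm_mb

print(docker_instances, kvm_instances, round(per_vm_ratio, 2))
```

Both runs come out at the same instance count, which suggests the two deltas are directly comparable, and the per-instance memory cost of KVM is roughly four times that of Docker in this passive test.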
• Can the HPC capabilities of the cloud be harnessed effectively over slow networks like the Internet?
A Surprise By-product of my PhD
• My PhD research at ANU produced an SOA middleware, ANU-SOAM, intended to deliver high performance for scientific applications that are not embarrassingly parallel.
– ANU-SOAM was modelled on the architecture of the IBM Platform Symphony enterprise SOA middleware.
– The programming model supported by ANU-SOAM, with its Data Service extension, was found to harness cloud computing resources effectively over the Internet to produce HPC results.
Service Oriented Architecture
• SOA is a computing paradigm that treats services as the building blocks of applications.
• In SOA, an atomic unit of computation is exposed as a Service, which is at the disposal of Clients.
• A Resource Manager negotiates the availability and scheduling of services for clients.
SOA Traditional Architecture
High Performance Scientific Computing - Main Challenge
• Inter-dependency of the underlying computational tasks in many scientific applications.
– E.g. the N-body problem.
• When these applications are parallelized, the inter-dependency compels the atomic units of work (tasks) to progress in phases, which we call generations.
• This makes the tasks finer-grained, which increases communication and its cost (overhead) and slows the application down.
• They are not embarrassingly parallel!
N Body Problem
N Body Problem – Conventional Approach
• The naïve N-body simulation (NBS) algorithm starts with a known set of values for the mass, velocity and position of the bodies involved.
• The future positions and velocities of the bodies are predicted by iteratively moving forward in small time increments.
• This sequential algorithm is split into client and service processes, and it is parallelized by sending subsets of the N bodies to each service instance (SI).
• The SIs process these subsets, and the partial results are communicated back to the client, which synchronizes the updates from all SIs.
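The conventional scheme can be sketched in a few lines of Python. This is a minimal illustration, not the ANU-SOAM or Symphony code: the function names, the 2-D setting, the simple Euler-style update and the round-robin split into subsets are all assumptions made for the example, and the "service instances" are ordinary function calls standing in for remote processes.

```python
import math

G = 6.674e-11  # gravitational constant (SI units)

def accelerations(masses, positions, subset):
    """Work done by one (hypothetical) service instance: the acceleration
    of every body in `subset`, caused by all the other bodies."""
    acc = {}
    for i in subset:
        xi, yi = positions[i]
        ax = ay = 0.0
        for j, (xj, yj) in enumerate(positions):
            if i == j:
                continue
            dx, dy = xj - xi, yj - yi
            dist = math.hypot(dx, dy)
            a = G * masses[j] / dist**3
            ax += a * dx
            ay += a * dy
        acc[i] = (ax, ay)
    return acc

def step(masses, positions, velocities, dt, n_services=2):
    """One generation: split the bodies, compute partial results, then
    merge them on the client side and advance by one time increment."""
    n = len(masses)
    # One subset of body indices per service instance (round-robin split).
    subsets = [range(k, n, n_services) for k in range(n_services)]
    # In a real deployment these calls would run in parallel on the SIs.
    partial_results = [accelerations(masses, positions, s) for s in subsets]
    # Client-side synchronization of all partial results.
    acc = {}
    for partial in partial_results:
        acc.update(partial)
    new_velocities = [(vx + acc[i][0] * dt, vy + acc[i][1] * dt)
                      for i, (vx, vy) in enumerate(velocities)]
    new_positions = [(x + vx * dt, y + vy * dt)
                     for (x, y), (vx, vy) in zip(positions, new_velocities)]
    return new_positions, new_velocities
```

Note that every call to `accelerations` needs the positions of all N bodies, not just its own subset. That is why each generation forces a full round trip of particle data through the client in this approach.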
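N Body Problem – Conventional Approach
[Sequence diagram; participants: RM, Client, Service]
1) Client to RM: request resources
2) RM to Client: assign resources
3) Client to Service: send tasks (particle info)
4) Service: parallel computation of partial data
5) Service to Client: send back updated (partial) particle info
6) Client: sync all partial results
7) Next generations*: Client to Service: send tasks (particle info), and so on
* Next-generation tasks depend on the previous generation's results.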
Data Service Extension
• Introduced to deal with the interdependency of tasks in scientific applications (as explained in the NBS example).
• The Data Service allows SIs to communicate with each other without tight coupling, so that many decisions can be taken among the SIs rather than going back to the client every time.
• This will
– reduce communication between the client and the SIs in many applications.
– allow application programmers to move critical application logic from the client side to the service side.
SOAM with Data Service
[Figure: a Resource Manager, a Client and three Service Instances; the Client holds the Common Data and each Service Instance holds a copy of it (C’Data).]
Data Service Functions
• Data common to all SIs can be set (add) using this service.
• The common data is replicated among all SIs and the client process.
• This common data can be accessed (get) by either the client or the SIs.
• Updates to this common data (put) can also be made by individual SIs or client processes, without changing the common data for the generation of tasks in progress.
• put is a deferred operation in the CDS.
• These updates (put) can be synchronized between the SIs and the client process using sync.
• iSync applies sync only to service instances: the common data is synchronized among the SIs but is not updated back on the client side.
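The semantics of add, get, put, sync and iSync can be modelled in-process as follows. This is a toy sketch, not the ANU-SOAM API: the class and method names (`Replica`, `CommonDataService`, `isync`) are hypothetical, replicas live in one Python process instead of on separate hosts, and conflicting puts to the same key are resolved by "last replica wins", which the slides do not specify.

```python
class Replica:
    """Local copy of the common data held by the client or by one SI."""
    def __init__(self, name):
        self.name = name
        self.data = {}     # common data visible to the current generation
        self.pending = {}  # deferred puts, invisible until a sync


class CommonDataService:
    def __init__(self, client, instances):
        self.client = client
        self.instances = list(instances)
        # Updates already shared among SIs but not yet sent to the client.
        self._backlog = {}

    def add(self, key, value):
        # Set common data and replicate it to the client and every SI.
        for replica in [self.client, *self.instances]:
            replica.data[key] = value

    def get(self, replica, key):
        return replica.data[key]

    def put(self, replica, key, value):
        # Deferred: recorded locally, the visible common data is untouched.
        replica.pending[key] = value

    def _collect(self, replicas):
        merged = {}
        for replica in replicas:
            merged.update(replica.pending)
            replica.pending.clear()
        return merged

    def isync(self):
        # Synchronize among service instances only; the client stays stale.
        updates = self._collect(self.instances)
        for replica in self.instances:
            replica.data.update(updates)
        self._backlog.update(updates)

    def sync(self):
        # Synchronize everyone, including updates the client missed earlier.
        updates = self._collect([self.client, *self.instances])
        for replica in self.instances:
            replica.data.update(updates)
        self._backlog.update(updates)
        self.client.data.update(self._backlog)
        self._backlog.clear()


client, si1, si2 = Replica("client"), Replica("si1"), Replica("si2")
cds = CommonDataService(client, [si1, si2])
cds.add("body0", (0.0, 0.0))
cds.put(si1, "body0", (0.5, 0.0))  # deferred: nothing changes yet
cds.isync()                        # si2 now sees the update, the client does not
cds.sync()                         # the client catches up
```

Keeping puts in a separate `pending` map is what makes the operation deferred: every task in a generation reads the same snapshot, and new values only become visible at the next sync or iSync.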
Deferred put
[Figure: deferred put timeline. Labels: CD, Meta, Update, Update (from other), sync; horizontal axis: Time.]
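N Body Problem – ANU-SOAM Approach
[Sequence diagram; participants: RM, Client, DS, Service]
1) Client to RM: request resources
2) RM to Client: assign resources
3) Client to DS: add particle info
4) Client to Service: send tasks
5) Service from DS: get particle info
6) Service: parallel computation of partial data
7) Service to DS: put partial updates
8) Client: iSync command
9) DS: iSync updates between SIs
10) Next generations*: Client to Service: send tasks, and so on
* Next-generation tasks get their updated data from the CDS.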
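To see why this flow matters over a slow link, a rough traffic model helps. The sketch below counts only the bytes crossing the client link under each scheme. It is a back-of-the-envelope model under stated assumptions, not a measurement: the function names, the per-body and per-command sizes, and the assumption that every SI needs the full particle set are all hypothetical.

```python
def conventional_client_bytes(n_bodies, n_services, generations, body_bytes):
    """Client-link traffic when particle data travels with every task."""
    # Each generation: full particle set out to every SI ...
    per_generation = n_services * n_bodies * body_bytes
    # ... and the partial results (N bodies in total) back to the client.
    per_generation += n_bodies * body_bytes
    return generations * per_generation


def data_service_client_bytes(n_bodies, n_services, generations,
                              body_bytes, command_bytes):
    """Client-link traffic when particle data lives in the data service."""
    # One-off replication of the particle set to every SI (add).
    initial = n_services * n_bodies * body_bytes
    # Each generation: one small task message per SI plus one iSync command.
    per_generation = n_services * command_bytes + command_bytes
    # One final sync to bring the results back to the client.
    final = n_bodies * body_bytes
    return initial + generations * per_generation + final


# Hypothetical workload: 1000 bodies, 4 SIs, 100 generations,
# 56 bytes per body and 64 bytes per command message.
conventional = conventional_client_bytes(1000, 4, 100, 56)
with_data_service = data_service_client_bytes(1000, 4, 100, 56, 64)
print(conventional, with_data_service)
```

Under these assumptions the conventional traffic grows with generations times data size, while the data-service traffic grows with generations times command size only. The inter-SI synchronization traffic is not counted because it stays inside the cloud.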
Experimental Setups - within Cloud
• Two types of cloud experimental scenarios:
– 1) ANU-SOAM deployed within the cloud: both the client and the SIs run within a public cloud IaaS.
– 2) ANU-SOAM deployed over the Internet: a public cloud IaaS is accessed across the Internet from a home PC.
• Selecting the right cloud provider turned out to be critical, because ANU-SOAM uses OpenMPI as its communication backbone, and OpenMPI does not support Network Address Translation (NAT).
• Since the Amazon cloud uses NAT to translate the public IP addresses of its compute nodes, Rackspace (which does not use NAT) was chosen, especially to enable the second set of experiments.
NBS – All within Cloud
[Figure: time (sec, 0 to 20) versus number of service instances (1 to 4). Series: NBS-SOAM within cloud, loading time; NBS-SOAM within cloud, compute time; NBS-SYMPHONY within cloud, total time.]
Cloud over Internet
NBS - Cloud over Internet
Conventional SOA Approach
[Figure: time (sec, 0 to 400) versus number of service instances (1 to 4). Series: NBS-SYMPHONY within cloud, total time; NBS-SYMPHONY over the Internet, total time.]
NBS - Cloud over Internet
ANU-SOAM Approach
[Figure: time (sec, 0 to 20) versus number of service instances (1 to 4). Series: NBS-SOAM within cloud, loading time; NBS-SOAM within cloud, compute time; NBS-SOAM over the Internet, loading time; NBS-SOAM over the Internet, compute time.]
Comparison
[Figure: time (sec, 0 to 400) versus number of service instances (1 to 4). Series: NBS-SOAM over the Internet, total time; NBS-SYMPHONY over the Internet, total time.]
Intelligent Betting?
Supercomputing over a Coffee?
This slide is intentionally kept blank
HPC in India
Thank you
Questions?