Advanced Distributed Systems
Cloud Computing
2014-11-26
Cristian Klein Department of Computing Science
Umeå University
2014-11-26 Cloud Computing 1/43 2014-11-26 Cloud Computing 2/43
During this course
• Trends in IT
• Towards a new data center
• What is Cloud computing?
• Types of Clouds
• Making applications Cloud-ready
Trends in IT
Internet is everywhere
Source: w3.org
Source: ITU
• Total Internet bandwidth is increasing exponentially
• Home bandwidth follows Nielsen's law: it doubles every 21 months (some say 12)
• Number of mobile, always-connected users is increasing
More and more users are connected
Frequency barrier hit
Since 2003, (single-core) processor clock frequency has stopped increasing
Management costs increase
Power & Cooling expensive
Increasing prosumerism
• Users both produce and consume content
• Need efficient means to share it
Towards a new data center
Businesses face rapid growth
… regular peaks
• Daily, weekly and yearly patterns
Wikipedia traces
… expected peaks
• Marketing campaigns – Christmas
• Tax filing
Load on the FIFA 98 Website
… unexpected peaks
• Triggered by news/events
Source: Google blog
[Figure: three plots of actual load and provisioned load over time]
• Over-provisioning: wasted resources
• Under-provisioning: lost clients
• Ideally: auto-scaling
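The difference between the three provisioning strategies can be made concrete with a little arithmetic. The following is a minimal sketch, assuming a made-up load trace and made-up capacity levels (none of these numbers come from the slides): it sums the capacity that sits idle (wasted resources) and the load that cannot be served (lost clients).

```python
def provisioning_cost(actual_load, provisioned_load):
    """Return (wasted, unmet) capacity summed over all time steps."""
    wasted = 0
    unmet = 0
    for load, capacity in zip(actual_load, provisioned_load):
        if capacity >= load:
            wasted += capacity - load   # idle capacity: wasted resources
        else:
            unmet += load - capacity    # unserved load: lost clients
    return wasted, unmet


# Hypothetical load samples (requests/s) with a peak in the middle.
load = [10, 20, 80, 30, 10]

over = [80] * len(load)        # static, sized for the peak
under = [30] * len(load)       # static, sized for the typical load
auto = [x + 5 for x in load]   # follows the load with a small safety margin

print("over-provisioning: ", provisioning_cost(load, over))
print("under-provisioning:", provisioning_cost(load, under))
print("auto-scaling:      ", provisioning_cost(load, auto))
```

With these assumed numbers, the static peak-sized allocation wastes the most capacity, the static typical-sized allocation drops load during the peak, and the load-following allocation keeps both figures low.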
What is Cloud computing?
NIST definition
Cloud computing is a model for enabling ubiquitous, convenient, on-demand network access to a shared pool of configurable computing resources (e.g., networks, servers, storage, applications, and services) that can be rapidly provisioned and released with minimal management effort or service provider interaction.
You do not own the computing infrastructure; you rent it
Essential Characteristics
• On-demand self-service
– User can provision resources without human intervention
• Broad network access
– Service is accessible through the network
• Resource pooling
– Resources are pooled to serve several customers
• Rapid elasticity
– Can rapidly increase and decrease capacity
• Measured service
– Resource usage can be monitored and reported
Service models
• SaaS: Software as a Service
– Ready-to-use applications
– E.g.: Gmail, Google Docs, YouTube, Facebook
• PaaS: Platform as a Service
– Ready-to-use tools
– E.g.: Google App Engine
• IaaS: Infrastructure as a Service
– Fundamental computing resources
– E.g.: Amazon EC2, Amazon S3, Windows Azure
Deployment models
• Private cloud (“my stuff”)
– Institution resources managed in a “Cloud-like” fashion
• Community cloud (“our stuff”)
– Share resources among institutions
• Public cloud (“anyone's stuff”)
– Resources to rent by anybody with a credit card
• Hybrid cloud
– A combination of the above
Types of Clouds
Infrastructure as a Service (IaaS)
• Fundamental computing resources
– CPU, RAM, storage, network, etc.
• Mostly in a virtualized way
– Details of the underlying physical resources are hidden from the user
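Virtualization
[Figure: without virtualization, Physical Machine 1 runs OS 1 with Application 1 and Physical Machine 2 runs OS 2 with Application 2; with virtualization, a single physical machine runs a hypervisor hosting Virtual Machine 1 (OS 1, Application 1) and Virtual Machine 2 (OS 2, Application 2)]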
• Hypervisors: Xen, VMware, KVM, Microsoft Hyper-V
• Isolates different users
• Flexible infrastructure management: consolidation
– At VM start: mapping
– During VM execution: migration
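Mapping VMs to physical machines at start time is essentially a bin-packing problem. The sketch below uses the first-fit heuristic as one possible illustration; it is not claimed to be what any particular hypervisor or cloud manager does, and the VM sizes and host capacity are hypothetical.

```python
def first_fit(vm_sizes, host_capacity):
    """Map each VM to the first host with enough free capacity.

    Returns a list of hosts, each a list of the VM sizes placed on it.
    """
    hosts = []  # VMs placed on each host
    free = []   # remaining capacity of each host
    for size in vm_sizes:
        if size > host_capacity:
            raise ValueError("VM does not fit on any host")
        for i, capacity in enumerate(free):
            if size <= capacity:
                hosts[i].append(size)
                free[i] -= size
                break
        else:
            # No running host has room: power on a new one.
            hosts.append([size])
            free.append(host_capacity - size)
    return hosts


# Hypothetical VM sizes in cores, placed on hosts with 8 cores each.
placement = first_fit([4, 2, 6, 2, 1], host_capacity=8)
print(placement)
```

Fewer hosts in the result means better consolidation. Migration during VM execution would allow such a mapping to be revised later, which this sketch does not model.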
Public IaaS Clouds
• Amazon: EC2, S3
• Windows Azure
• RackSpace
• E.g.: Amazon EC2
– 1 year micro instance for free
  • Try it yourself!
– Small: 0.60 $/hr
– 8XL: 4.60 $/hr
• For comparison:
– 8XL in your own datacenter: 1.50 $/hr (estimated)
– Abisko @ HPC2N: 0.07 $/core/hr (source: P-O)
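The price comparison above can be turned into a break-even calculation. This is a rough sketch under a strong assumption that is not stated in the slides: the estimated 1.50 $/hr for an owned 8XL is paid for every hour of the year, busy or idle, whereas the rented instance is paid only for the hours it runs.

```python
HOURS_PER_YEAR = 24 * 365


def yearly_cost_rented(price_per_hour, utilization):
    """Rented: pay only for the fraction of hours actually used."""
    return price_per_hour * HOURS_PER_YEAR * utilization


def yearly_cost_owned(price_per_hour):
    """Owned (assumption): the hourly cost accrues whether busy or idle."""
    return price_per_hour * HOURS_PER_YEAR


def break_even_utilization(rented_price, owned_price):
    """Utilization below which renting is cheaper than owning."""
    return owned_price / rented_price


RENTED_8XL = 4.60  # $/hr, from the slide
OWNED_8XL = 1.50   # $/hr, estimate from the slide

threshold = break_even_utilization(RENTED_8XL, OWNED_8XL)
print(f"Renting is cheaper below {threshold:.0%} utilization")
print(f"Rented at 25% utilization: {yearly_cost_rented(RENTED_8XL, 0.25):.0f} $/year")
print(f"Owned:                     {yearly_cost_owned(OWNED_8XL):.0f} $/year")
```

Under this assumption, renting pays off for machines that are busy less than roughly a third of the time, which is exactly the bursty-load case the earlier slides describe.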
Open-source solutions
• Useful to build private/community/hybrid clouds
• Production-oriented
– Eucalyptus, CloudStack, OpenStack
• Research-oriented
– OpenNebula, Nimbus
IaaS Cloud standards
• De facto: Amazon
– WSDL-based
– createInstance()
– startInstances(), stopInstances()
• Open Cloud Computing Interface (OCCI)
– REST-based
POST /compute/ HTTP/1.1
Category: compute [...]
X-OCCI-Attribute: occi.compute.cores=2
X-OCCI-Attribute: occi.compute.hostname="foobar"

HTTP/1.1 201 OK
Location: http://example.com/vms/foo/vm1
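To show how the OCCI request above is put together, here is a minimal sketch that only renders the request text and sends nothing over the network. The function names are hypothetical, the elided part of the Category header is kept elided as in the slide, and a real request would need further headers (for example Host) that are omitted here.

```python
def occi_attribute(name, value):
    """Render one attribute header; strings are quoted, numbers are not."""
    if isinstance(value, str):
        return f'X-OCCI-Attribute: {name}="{value}"'
    return f"X-OCCI-Attribute: {name}={value}"


def occi_create_request(path, category, attributes):
    """Return the header section of a POST request that creates a resource."""
    lines = [f"POST {path} HTTP/1.1", f"Category: {category}"]
    lines += [occi_attribute(name, value) for name, value in attributes.items()]
    return "\r\n".join(lines) + "\r\n\r\n"


request = occi_create_request(
    "/compute/",
    "compute [...]",  # remaining Category parameters are elided, as in the slide
    {"occi.compute.cores": 2, "occi.compute.hostname": "foobar"},
)
print(request)
```

The server's reply carries the address of the new resource in the Location header, which a client would then use for later operations on that VM.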
“Cloud-ready” applications
• Almost any application can run in the Cloud, but that does not make it “Cloud-ready”
• Use API to auto-scale
– Application needs to be elastic
[Figure: load over time]
1 VM per tier
Client
Application
Database
Scalable architecture
[Figure: Client → Load balancer → Application 1, Application 2; the applications write to the Master DB and read from Slave DB 1; the Master DB replicates to Slave DB 1]
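The read/write split in this architecture can be sketched as a small routing rule: writes go to the master, reads are spread over the slaves. The class below is an illustration only; the server names are hypothetical, two slaves are assumed for the sake of the example, and detecting reads by a leading SELECT is a deliberate simplification.

```python
import itertools


class Router:
    """Send writes to the master and spread reads over the slaves."""

    def __init__(self, master, slaves):
        self.master = master
        # Cycle through the slaves in round-robin order.
        self._slaves = itertools.cycle(slaves)

    def route(self, statement):
        is_read = statement.lstrip().upper().startswith("SELECT")
        return next(self._slaves) if is_read else self.master


router = Router("master-db", ["slave-db-1", "slave-db-2"])
for statement in ["SELECT * FROM users",
                  "INSERT INTO users VALUES (1)",
                  "SELECT * FROM users"]:
    print(router.route(statement), "<-", statement)
```

Because replication takes time, a read routed to a slave may briefly return stale data; an application using this pattern has to tolerate that.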
Add auto-scaling
[Figure: Client → Load balancer → Application 1, Application 2, …, Application n; the applications write to the Master DB and read from Slave DB 1, …, Slave DB n-1; the Master DB replicates to the slaves]
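One simple way to decide how many application instances to run is to divide the current load by the capacity of one instance. The sketch below assumes a hypothetical capacity of 100 requests/s per instance and hypothetical bounds on the number of instances; it only prints the action a controller would request from an IaaS API and does not call any real API.

```python
import math


def desired_instances(load, capacity_per_instance,
                      min_instances=1, max_instances=10):
    """Number of application instances needed to serve `load`."""
    needed = math.ceil(load / capacity_per_instance)
    return max(min_instances, min(max_instances, needed))


def scaling_action(current, desired):
    """Describe what the controller would ask the IaaS API to do."""
    if desired > current:
        return f"start {desired - current} instance(s)"
    if desired < current:
        return f"stop {current - desired} instance(s)"
    return "no change"


# Hypothetical load samples in requests/s.
current = 1
for load in [80, 250, 420, 90]:
    desired = desired_instances(load, capacity_per_instance=100)
    print(f"load={load}: {scaling_action(current, desired)}")
    current = desired
```

A production controller would typically add hysteresis or cool-down periods so that short load spikes do not cause instances to be started and stopped repeatedly.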
IaaS pros and cons
• Pros
– General purpose
– Very flexible
• Cons
– Low-level, fine-grained
– Need to handle application
  • Deployment
  • Elasticity
  • Failure
Platform as a Service (PaaS)
• Provides programming languages, libraries, services and tools
• Higher level than IaaS
• Special purpose
• Less flexible
• Examples
– For web applications: Google App Engine
– For BigData processing: MapReduce
Google App Engine
• Develop and host web applications
• Take advantage of Google technologies for scalability, resilience
• Python, Java, Go
• Several APIs
– Images, Mail, Log
– Blobstore, Datastore, BigTable
Hello Google App Engine
helloworld.py
    import webapp2

    class MainPage(webapp2.RequestHandler):
        def get(self):
            self.response.headers['Content-Type'] = 'text/plain'
            self.response.write('Hello, webapp2 World!')

    app = webapp2.WSGIApplication([('/', MainPage)], debug=True)
app.yaml
    application: helloworld
    version: 1
    runtime: python27
    api_version: 1
    threadsafe: true
    handlers:
    - url: /.*
      script: helloworld.app
Try it, it's free!
MapReduce
• A programming model for processing large data sets (BigData)
– Difficult due to scalability, failure, etc.
• Useful for: indexing, financial analysis, etc.
• Used by Google, Yahoo!, Facebook
• Open-source implementation: Hadoop
MapReduce: Logical View
• Operates on several key-value spaces: (1) input, (2) intermediate, (3) output
• Map: (key1, value1) → (key2, value2)
• Reduce: (key2, list(value2)) → list(value3)
Source: blog.maxgarfinkel.com
MapReduce: Word count
function map(String name, String document):
    for each word w in document:
        emit (w, 1)

function reduce(String word, list partialCounts):
    sum = 0
    for each partialCount in partialCounts:
        sum += partialCount
    emit (word, sum)
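The word-count pseudocode can be tried out on a single machine. The sketch below is a toy, in-memory imitation of the programming model, not Hadoop or any real MapReduce runtime: the helper run_mapreduce is hypothetical and plays the role of the framework, which groups intermediate values by key between the map and reduce phases. The input documents are made up.

```python
from collections import defaultdict


def map_fn(name, document):
    """Emit (word, 1) for every word in the document."""
    for word in document.split():
        yield word, 1


def reduce_fn(word, partial_counts):
    """Sum the partial counts of one word."""
    yield word, sum(partial_counts)


def run_mapreduce(inputs, map_fn, reduce_fn):
    # Map phase, followed by a shuffle that groups values by intermediate key.
    groups = defaultdict(list)
    for key, value in inputs.items():
        for out_key, out_value in map_fn(key, value):
            groups[out_key].append(out_value)
    # Reduce phase: one call per intermediate key.
    results = []
    for key in sorted(groups):
        results.extend(reduce_fn(key, groups[key]))
    return results


documents = {"a.txt": "the cat sat", "b.txt": "the cat ran"}
counts = run_mapreduce(documents, map_fn, reduce_fn)
print(counts)
```

In a real deployment the map and reduce calls run in parallel on many machines, and the framework handles data distribution and failures; only the two user-supplied functions stay the same.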
Spark
• General-purpose BigData engine
• Most operations in-memory
• Allows interactive use
Spark: More Examples
Final thoughts
Clouds vs. Grids
                    Clouds                      Grids
Origin              Industry                    Academic
Economic model      Public: pay-per-use         Grant models
                    Other: domain-specific
                    (e.g., grant models)
Size of allocation  Many “small” users          Few “big” users
Waiting time        On-demand                   Queuing
Duration            Long (months)               Short (< 4h)
Usual workload      Interactive,                Non-interactive jobs
                    long-lived services
Organization        One provider                Multiple providers
Cloud issues
• Privacy
– What happens to your data?
• Security
– Sometimes VMs provide insufficient isolation
• Standardization is lagging behind
– Important to prevent vendor lock-in
– IaaS: less of a problem
– PaaS: major problem
Conclusion
• IT trends
• Frequency barrier
• Infrastructure complexity
• Omnipresent network connection