Cloud Computing –
Virtualized Computing Infrastructures
Erik Elmroth
Cloud computing in plain English
www.youtube.com/watch?v=QJncFirhjPg
Brief outline
• A game-changing trend in IT use
• Revitalization of the datacenters
– Infrastructure providers for service providers
• Compute Clouds
– virtualization
– datacenter infrastructure providing virtual resources as utility
• Umeå research in Cloud computing
– Research topics
– Major projects
“Game Changing Trend”
Growth on the service consumer side
– Individuals, professionally and privately
– Companies: external services or hosting of complete IT environment
– Explosive growth in availability of services on the internet
Revitalization of data centers!
Revitalization of the datacenters:
Service & infrastructure provider cooperation
– Company or individual: sees service, not hardware
– Service provider: provider for the service user, customer of the infra provider
– Infrastructure provider (datacenter): provides infra to the service provider
– SLAs between the parties
Critical performance requirements
- to be cost-efficiently met
Extremely rapid growth (from global scale)
– YouTube (16 months): 100 million movies per day, 20 million unique users per month
– App Store (19 months): over 100,000 iPhone programs, over 3 billion downloads
Regular/planned peaks
– Banks, tax filing
– Market campaign effects
Unexpected peaks
– News-related video streaming
– Stock trading peaks at financial crises
Regional aspects in usage patterns
– Regional concerns (news, events, etc.)
– Time-dependent usage patterns
New Requirements on Datacenters
(Infrastructure Providers)
• Today, load peaks are typically managed by extensive over-provisioning
– COSTLY!!!
• Need for a new datacenter infrastructure that
– provides elasticity:
• scales quickly in response to demand increase (in minutes, not days)
• shrinks dynamically to save resources (energy, other use)
– improves network parameters by locality-awareness
– manages SLAs corresponding to business agreements
– supports a variety of payment schemes (pay-per-use, pre-paid, flat-rate, etc.)
• Today’s clouds provide partial solutions
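A minimal sketch of the elasticity requirement, assuming a simple proportional sizing rule. The load figures, the per-VM capacity, and the 20% headroom are hypothetical values chosen for illustration; the slides do not prescribe a particular control algorithm.

```python
def vms_needed(load, capacity_per_vm, headroom_percent=20):
    """Smallest VM count that serves `load` plus a safety margin.

    Hypothetical sizing rule: integer ceiling division keeps the
    result exact and avoids floating-point rounding.
    """
    numerator = load * (100 + headroom_percent)
    denominator = 100 * capacity_per_vm
    return max(1, (numerator + denominator - 1) // denominator)


def elastic_plan(loads, capacity_per_vm, headroom_percent=20):
    """VM count per interval when capacity follows demand."""
    return [vms_needed(load, capacity_per_vm, headroom_percent) for load in loads]


# Hypothetical demand in requests per second, one value per interval.
loads = [100, 120, 900, 300, 80]
plan = elastic_plan(loads, capacity_per_vm=100)

# Over-provisioning keeps the peak VM count running in every interval.
over_provisioned_vm_intervals = max(plan) * len(loads)
elastic_vm_intervals = sum(plan)

print(plan, over_provisioned_vm_intervals, elastic_vm_intervals)
```

With this made-up profile, the elastic plan uses far fewer VM-intervals than sizing for the peak, which is the cost argument made above.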
Compute Clouds
• Virtual “cloud” of IT resources (within a datacenter)
• Services run on virtual resources, unaware of the physical resources
• Infrastructure – compute, storage, and network
• Utility model – provision on demand, charge back on use
– Notably, as power and running costs become a larger fraction of the total IT cost, the character of IT capacity becomes more utility-like
Before talking more Clouds…
• Virtualization
“Traditional” virtualization
[Figure: layered stack. Hardware (CPU, RAM, Disk, LAN) at the bottom, an Operating System above it, and on top applications alongside virtual machines, each virtual machine with its own OS and applications.]
Hypervisor virtualization
[Figure: layered stack. Hardware (CPU, RAM, Disk, LAN) at the bottom, a Hypervisor above it, and on top virtual machines, each with its own OS and applications.]
Virtualization features
With a virtual machine you can:
• Define machine size as part of physical machine
• Halt and resume execution
• Migrate between physical machines
These features can be used for many purposes!
Server Sprawl
• New application = new server
[Figure: many separate servers, each dedicated to a single role: File/Print, Application, or Database.]
Problems with Server Sprawl
• Hardware
– Increased hardware acquisition costs
– Increased infrastructure requirements
– Increased hardware maintenance costs
– Increased hardware replacement costs
• Administration
– Patch management
– Backup and recovery
– Server management and troubleshooting
Server Consolidation
• Increased hardware utilization
• Reduced costs
– Fewer systems
– Less power
– Less cooling
– Less administration
• Reduced infrastructure
– Fewer racks
– Fewer switches
Multiple OS & Applications
• Run multiple OS
– Shared hardware
• Incompatible applications
• Applications with different OS and library requirements
• Isolation between applications
Other levels of virtualization (with inconsistent naming)
• Operating system virtualization
– Virtualization inside OS
– Full separation between applications, but all running in the same OS
• Application virtualization
– Encapsulation of application in executable; no need for traditional installation of application in OS
– Runs as if installed on hardware, but all access to OS is virtualized
• Desktop virtualization
– As application virtualization, but with encapsulation of the whole desktop
What is Cloud Computing?
An emerging computing paradigm where data and services reside in massively scalable data centers
and can be ubiquitously accessed from any connected device over the internet.
Not only one type of cloud…
NIST definition of cloud computing
(National Institute of Standards and Technology)
5 characteristics
• On-demand self-service
• Broad network access
• Resource pooling
• Rapid elasticity
• Measured service
3 service models
• Software-as-a-Service
• Platform-as-a-Service
• Infrastructure-as-a-Service
4 deployment models
• Private
• Public
• Community
• Hybrid
The Amazon example …
(Why is an internet bookstore entering this market?)
The Amazon example …
• EC2 is a web service that provides resizable compute capacity in the cloud
• S3 provides a web services interface to store and retrieve any amount of data, at any time, from anywhere on the web
• SimpleDB is a web service for running queries on structured data in real time
• CloudFront is a web service for content delivery (software distributions, web content, media files)
• SQS offers a reliable, highly scalable, hosted queue for storing messages as they travel between computers
• Mechanical Turk is a web service for programmatic access to a marketplace for work that requires human intelligence
Standard EC2 Instances (2010)
• Small: 1.7 GB, 1 EC2 Compute Unit (1 virtual core), 160 GB storage, 32-bit platform
• Large: 7.5 GB, 4 EC2 Compute Units (2 virtual cores), 850 GB storage, 64-bit platform
• Extra Large: 15 GB, 8 EC2 Compute Units (4 virtual cores), 1690 GB storage, 64-bit platform
One EC2 Compute Unit is equivalent to a 1.0-1.2 GHz 2007 Opteron or 2007 Xeon processor

Standard Instances    Linux/UNIX        Windows
Small (Default)       $0.10 per hour    $0.125 per hour
Large                 $0.40 per hour    $0.50 per hour
Extra Large           $0.80 per hour    $1.00 per hour
EC2 High CPU Instances (2010)
• Medium: 1.7 GB, 5 EC2 Compute Units (2 virtual cores), 350 GB storage, 32-bit platform
• Extra Large: 7 GB of memory, 20 EC2 Compute Units (8 virtual cores), 1690 GB storage, 64-bit platform

High CPU Instances    Linux/UNIX        Windows
Medium                $0.20 per hour    $0.30 per hour
Extra Large           $0.80 per hour    $1.20 per hour
Pay only for what you use
[Figure: resource need over time, comparing fixed own capacity with on-demand capacity allocation.]
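The pay-per-use argument can be illustrated with a small calculation. The sketch below uses the 2010 Small Linux/UNIX price of $0.10 per hour from the price list; the 24-hour demand profile is hypothetical, and pricing fixed capacity at the same hourly rate is a simplifying assumption made only to allow a like-for-like comparison.

```python
# $0.10 per hour for a Small Linux/UNIX instance (2010 price list),
# kept in cents so the arithmetic stays exact.
SMALL_LINUX_CENTS_PER_HOUR = 10


def on_demand_cost_cents(instances_per_hour):
    """Pay only for the instance-hours actually used."""
    return sum(instances_per_hour) * SMALL_LINUX_CENTS_PER_HOUR


def peak_provisioned_cost_cents(instances_per_hour):
    """Keep enough capacity for the peak running in every hour.

    Assumption: fixed capacity costs the same per instance-hour.
    """
    hours = len(instances_per_hour)
    return max(instances_per_hour) * hours * SMALL_LINUX_CENTS_PER_HOUR


# Hypothetical day: 8 quiet hours, a 4-hour peak, 12 moderate hours.
demand = [2] * 8 + [10] * 4 + [4] * 12

on_demand = on_demand_cost_cents(demand)
peak_sized = peak_provisioned_cost_cents(demand)
print(f"on-demand: ${on_demand / 100:.2f}, sized for peak: ${peak_sized / 100:.2f}")
```

The gap between the two numbers grows with how short and how tall the peak is relative to the average load.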
S3 pricing (2010)
Storage
$0.150 per GB – first 50 TB / month of storage used
$0.140 per GB – next 50 TB / month of storage used
$0.130 per GB – next 400 TB / month of storage used
$0.120 per GB – storage used / month over 500 TB
Data Transfer
$0.100 per GB – all data transfer in
$0.170 per GB – first 10 TB / month data transfer out
$0.130 per GB – next 40 TB / month data transfer out
$0.110 per GB – next 100 TB / month data transfer out
$0.100 per GB – data transfer out / month over 150 TB
Requests
$0.01 per 1,000 PUT, COPY, POST, or LIST requests
$0.01 per 10,000 GET and all other requests*
* No charge for delete requests
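Tiered prices like the storage list above are applied per band, not to the whole volume. The following sketch computes a monthly storage charge from the 2010 storage tiers; the function name and the choice of 1 TB = 1024 GB are assumptions made for illustration, not taken from the price list.

```python
from decimal import Decimal

GB_PER_TB = 1024  # assumption: binary terabytes

# (tier size in GB, price per GB); None marks the open-ended last tier.
STORAGE_TIERS = [
    (50 * GB_PER_TB, Decimal("0.150")),
    (50 * GB_PER_TB, Decimal("0.140")),
    (400 * GB_PER_TB, Decimal("0.130")),
    (None, Decimal("0.120")),
]


def tiered_cost(gigabytes, tiers):
    """Charge each band of usage at its own per-GB price."""
    total = Decimal(0)
    remaining = gigabytes
    for size, price in tiers:
        if remaining <= 0:
            break
        used = remaining if size is None else min(remaining, size)
        total += used * price
        remaining -= used
    return total


# 60 TB stored: the first 50 TB at $0.150, the next 10 TB at $0.140.
print(tiered_cost(60 * GB_PER_TB, STORAGE_TIERS))
```

The same function would work for the data-transfer-out tiers by passing a different tier table.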
Common AWS features
• Provide application platforms, including resources
• Accessed over the web
• Simple to use
• Easy to get started (just need a credit card)
• Pay-per-use
• No contracts (committing to future use)
Cloud attractions
Cost – especially for peaks
Flexibility; rapid scalability and de-scalability
Data replication
Easier cross-institution collaboration
Any {time, place, device} access via web browser
Alternative if departmental or central IT is non-responsive
Priorities: no need to focus on commodity IT
Future of computing
Cloud concerns
Loss of control
Integration: enterprise & federated authorization
Interoperability: with key enterprise apps
Accessibility and user interface limitations of web apps
Reliability, performance, security
Offline access
Features; changes; vendor lock-in
Policy/compliance concerns (privacy)
Business “surprises”; Support; More Logins
Cloud Computing 2015 –
Virtual infrastructure for future service delivery
What?
• Large-scale IT capacity
• Compute + storage + network
• Automatically increase, decrease & migrate
• Large-scale management
How?
• Virtual machines
- Abstracts hardware
• Grid technology
- Distributed virtual resource
• Business Service Management
- Dynamic SLA management
• Autonomic systems
BASIC DEPLOYMENT SCENARIOS
[Figure: three scenarios, bursted internal clouds, federated clouds, and multi-clouds, drawn with the roles Service Provider, Infrastructure Provider, internal infrastructure, and Broker.]
Cloud Resource Management
For whom?
• Service providers
• Infrastructure providers
What?
• Compute + storage + network
• Low and high level management
How?
• Single abstraction – multiple use (scenarios)
• General tools for key functionality
• Flexibility in deployment and configuration
Example (low level management):
Elasticity & access control
Elasticity control
• Control system handling peaks & lows
• Increasing ability to meet SLAs
• Reduces resource consumption
Access control
• Overbooking of elastic services
• Access control quality directly determines income and SLA violation rate
Holistic Cloud management
[Figure: Business Level Objectives and management constraints steering several management components, each with its own algorithms and policies.]
UmU cloud research
– additional examples
Create cloud infrastructure
• Architectures and software for cloud and grid systems
• Methods for improving resource utilization
• Monitoring & accounting
• Self-management, self-optimization
• Algorithms for scheduling and elasticity
• Algorithms for efficient VM migration
Use cloud infrastructure
• Basic understanding of how to develop software to be run on elastic infrastructure
• Tools to create and run cloud services
• Test and development jointly with end users
Migrating large virtual machines
Why migration?
Server consolidation, cloud optimization, resource management, elasticity control, etc.
Basic algorithm for live migration:
1. Transfer all memory pages
2. Repeat:
2.1 Transfer all pages being “dirtied” during migration process
3. Suspend VM
4. Transfer remaining pages
5. Restart VM on destination host
Time from suspend to restart is downtime. VM unavailable during this time.
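The steps above can be sketched as a small simulation. This is a simplified model, not the authors' implementation: it assumes constant transfer and dirtying rates, counts memory in whole pages, and ignores the time needed to restart the VM. All names and parameters are hypothetical.

```python
def simulate_precopy(total_pages, transfer_rate, dirty_rate,
                     stop_threshold, max_rounds=30):
    """Simulate iterative pre-copy live migration.

    Rates are in pages per second. Returns (rounds, pages_left,
    downtime_seconds), where downtime is the time to send the pages
    left after the VM has been suspended.
    """
    to_send = total_pages  # step 1: the first round sends everything
    rounds = 0
    while rounds < max_rounds:
        rounds += 1
        # Pages dirtied while this round was being transferred
        # (step 2.1); at most all pages can be dirty.
        to_send = min(total_pages, to_send * dirty_rate // transfer_rate)
        if to_send <= stop_threshold:
            break
    # Steps 3 to 5: suspend, send what is left, restart on destination.
    downtime = to_send / transfer_rate
    return rounds, to_send, downtime


# A calm VM converges in a few rounds with a very short downtime.
print(simulate_precopy(100_000, 10_000, 1_000, stop_threshold=50))
# A VM dirtying pages faster than they can be sent never converges.
print(simulate_precopy(100_000, 10_000, 20_000, stop_threshold=50))
```

In the second call the remaining set never shrinks, so the whole memory must be sent during suspension, which is the situation described as the challenge in the next slide.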
Challenge
If memory pages are dirtied rapidly relative to transfer time
• VM suspended during extended time
• Network connection timeouts
• Services on the VM crashed
Likely to happen for VMs with “busy” memory access patterns
Solution: page caching and delta compression
• Transfer only difference between current and previously transferred version
• Optimize page order for transfers
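A minimal sketch of the delta idea, assuming the sender keeps a cache of the last version of each page it transferred. XOR of the two versions yields a page that is mostly zeros when few bytes changed, and that compresses well. Here `zlib` stands in for whatever encoder is actually used; the slides do not name one, and the function names are hypothetical.

```python
import random
import zlib

PAGE_SIZE = 4096


def encode_page(current, cached):
    """Sender side: XOR against the cached copy, then compress."""
    delta = bytes(a ^ b for a, b in zip(current, cached))
    return zlib.compress(delta)


def decode_page(payload, cached):
    """Receiver side: decompress, then XOR with its own copy."""
    delta = zlib.decompress(payload)
    return bytes(a ^ b for a, b in zip(delta, cached))


# A page of pseudo-random bytes (hard to compress on its own) ...
rng = random.Random(0)
old_page = bytes(rng.getrandbits(8) for _ in range(PAGE_SIZE))

# ... of which only four bytes are modified after the first transfer.
new_page = bytearray(old_page)
for offset in range(100, 104):
    new_page[offset] ^= 0xFF
new_page = bytes(new_page)

payload = encode_page(new_page, old_page)
restored = decode_page(payload, old_page)
print(len(payload), "bytes sent instead of", PAGE_SIZE)
```

The saving depends entirely on how much of the page changed; a page rewritten from scratch gains nothing from the delta.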
Demo (effect of delta compression)
International collaborations
• RESERVOIR: EU FP7 IP. Introduced federated clouds. EU's first major cloud project.
• OPTIMIS: EU FP7 IP. Optimized cloud services over complete lifecycle. Non-functional aspects.
• VISION Cloud: EU FP7 IP. Pioneering federated storage clouds. Raised level of abstraction. Media and telecom applications.
• eSSENCE: Government's strategic efforts. Methods and software for eScience applications.
• UMIT Research Lab: Umeå initiative for innovation and industry benefits within simulation, visualisation, computation and infrastructure.
Key partners: IBM Haifa Research Labs, SAP Research, ATOS Origin, Universidad Complutense de Madrid, Leeds University, Barcelona Supercomputing Center, Telefonica I+D, and British Telecom
RESERVOIR (EU FP7)
“Resources and Services Virtualization without Barriers”
Next generation infrastructure for service delivery
– Federation of clouds
– Leverage migration – enable migration
– Service definition, automatic QoS, monitoring, accounting/billing
– Open specifications
– For diverse underlying technology (e.g., virtualization technology)
[Figure: a service provider served by several infrastructure providers.]
• UMU and 12 partners
• IBM, Israel (coord)
• Telefonica, Spain
• SAP, Ireland + Israel
• SUN Microsystems, Germany
• Thales, France
• Elsag-Datamat, Italy
• Global Grid Forum
• 5 academic (Spain, Italy, UK, Belgium, Switzerland)
• Budget: 17 (10) M Euro
• Duration: 2008-02 - 2011-03
The Reservoir Architecture
[Figure: layered architecture. The Service Provider sits above the Infrastructure Provider (= Site/Domain/Cloud), which contains the Service Manager (SM), the VEE Management System (VEEM), the VEE Management Enablement Layer, and the Virtualized Physical Resource (e.g., Hypervisor). SLA and SD + SLA label the interfaces.]
Service Manager (SM)
• Monitors service and enforces SLA compliance by managing capacity of Service Components (VEEs) and/or size of Service Tiers
• Deals with mapping of service metrics (response time) to infrastructure metrics (VEE size)
VEE Management System (VEEM)
• Monitors VEEs and finds best VEE placement
• Deals with federation of domains
VEE = Virtual Execution Environment
The Reservoir Architecture
Clear separation of concerns & delegation of responsibility, e.g.,
• SM unaware of placement (local & remote)
• Primary VEEM takes the role of an SM towards remote site
• Remote VEEM sees no difference between local SM and remote VEEM
Service Applications on Reservoir
One multi-VEE application on:
– One VEE host
– Multiple VEE hosts
– Multiple sites
SM may specify placement constraints, e.g.,
– When physical nearness is needed
– For redundancy
– Various types of user requests
The Evolution of the Power Grid
The Burden Iron Works Water Wheel
• Make your own infrastructure
• Not the company’s main business but a considerable competitive advantage
The Pearl Street Station
• The utility industry
• Metering
• Limited reach
The US National Power Grid
• Efficient distribution
• Federation of providers
• The diversity factor
• Economies of scale
Image sources:
http://ieee-virtual-museum.org/collection/event.php?id=3456876
http://www.pbase.com/rbenny/image/29116
http://www.rootsweb.com/~nytigs/BurdenPayrollRecords.htm
http://www.anl.gov/Media_Center/logos22-1/elect
The Evolution of the Compute Grid
• Make your own infrastructure
• Not the company’s main business but a considerable competitive advantage
• The utility industry
• Metering
• Limited reach
• Efficient distribution
• Federation of providers
• The diversity factor
• Economies of scale
Google @ The Dalles, OR
RESERVOIR
“…will move towards a mix of microproduction and large utilities, with increasing numbers of small-scale producers co-existing with large-scale regional producers, and load being distributed among them dynamically…”
There’s Grid in them thar Clouds - Ian Foster
Image sources:
http://www.by-star.net/techspeak/datacenter/
http://www.smcplus.com/applications.asp?id=32
http://www.informationweek.com/galleries/showImage.jhtml?galleryID=62&imageID=13
OPTIMIS (EU FP7)
Scenario 2013+: Most companies use private and public clouds in combination
Create an eco-system for cloud infrastructure
– Self-management
– Self-optimization
– Risk assessment
[Figure: service providers with internal infrastructure, several infrastructure providers, and a broker.]
Cloud providers’ eco-system
– Construction: programming model, service composition
– Deployment optimization: risk, trust, eco-aspects, cost (economy)
– Operation: risk assessment, energy efficiency, data management
– Also: multi-clouds, federated clouds, license management, user locality
• UMU (scientific coordinator) and 11 partners:
• ATOS Origin, Spain (coord)
• British Telecom, UK
• SAP, Ireland
• Fraunhofer, Germany
• Flexiscale, UK
• 451Group, UK
• 5 academic (Spain, Italy, UK, Belgium, Switzerland)
• Budget: 10 (7) M Euro
OPTIMIS
MAJOR OUTCOMES AND BENEFICIARIES
Major outcomes:
• OPTIMIS Toolkit
– Tools for construction, deployment, operation
– General base toolkit for trust, risk, cost and eco aspects
• Reference architectures and guidelines for (bursted) internal clouds, multi-clouds and federated clouds
• Showcase results through business driven validation scenarios
• Market predictions, business models and legal guidelines
Validation scenarios:
• Programming model validation through lifecycle management of on-demand ERP/CRM services
• Extended elasticity via transparent cloud bursting
• Cloud brokerage and federation involving many cloud providers
Key beneficiaries:
• Service Providers
• Infrastructure Providers
Additional stakeholders:
• Brokers
• Independent software vendors
• Service consumers (end-users)
[Figure: OPTIMIS toolkit across the construction, deployment, and operation phases, for Service Provider and Infrastructure Provider.]
OPTIMIS toolkit components: Deployment Optimizer, Admission Control, Service Optimizer, and Cloud Optimizer (each with the OPTIMIS Base Toolkit), plus Construction Optimizer, Configuration Manager, and Programming Model and IDE.
VISION (EU FP7)
Infrastructure for reliable and effective delivery of data-intensive storage services, facilitating the convergence of ICT, media and telecommunications
[Figure: data centers DC_OSLO, DC_WARSAW, DC_HAIFA, DC_ATHENS, DC_MADRID, DC_BERLIN, DC_PARIS, DC_LONDON, DC_ROME.]
Technology Innovations
– Raise Abstraction Level of Storage: objects with user-defined and system-defined metadata
– Data Mobility and Federation: enable comprehensive data migration and interoperability across remote locations
– Computational Storage: technology for specifying and executing computations close to storage
– Content-Centric Storage: facilitate access to data by content and its relationships
– Advanced Capabilities for Cloud-based Storage: support delivery of data-intensive services securely, at the desired QoS, at competitive costs
Validation Scenarios
– Media
– Telco
– Healthcare
– Enterprise
• UMU and 14 partners:
• IBM, Israel (coord)
• Deutsche Welle, Germany
• RAI, Italy
• Telenor, Norway
• Siemens, Germany
• France Telecom, France
• SAP, Germany
• Telefonica, Spain
• Engineering SPA, Italy
• ITricity, Netherlands
• Storage Networking Industry Assoc, Europe
• 3 academic (Italy, Sweden, Greece)
• Budget: 16 (9) M Euro
• Duration: 2010-10 - 2013-09
VISION Cloud
[Figure: photo sharing, social networking, and document sharing applications with their users (VC_User) on top of VISION Cloud.]
• Computational storage enables an application’s performance sensitive computations to be performed close to the storage
• Rich metadata enables sharing of objects across applications
• Content centric access enables each application to have its own view of the storage
• Mobility and interoperability allow users to change providers and to have storage at multiple providers
• Advanced capabilities for cloud-based storage ensure secure access and quality of service
eSSENCE – method development for eScience
• Government's strategic efforts (VR)
• Research on methods development for eScience
• Our focus
– Methods and frameworks for grid and cloud infrastructure
– Methods and tools for applications
• UMU and 2 partners:
• Uppsala University (coord)
• Lund University
• Budget: 102 M SEK
• Duration: 2010 - 2014
UMIT Research Lab
Methods and tools
The parallel revolution
Dynamic scalable IT infrastructure
Foundation of excellent basic research
Engineering research
Interdisciplinary challenges
Industrial applications and innovation
[Figure: Problem, Model, Simulation, Results, supported by Optimization, Computation, IT Infrastructure, Hardware & software, and Visualization & interaction.]
• Interdisciplinary research lab at UmU
• Computing Science
• Physics
• Mathematics
• Applied Physics and Electronics
• Budget: 40 M SEK (several funders)
• Duration: 2009 - 2015 (to be extended)
• Physical location: MIT building, 2nd floor
Senior researchers
Erik Elmroth, Professor
Francisco Hernandez, Assistant professor
Johan Tordsson, Assistant professor
Lei Xu, Post Doc
Project coordinators
Lennart Edblom, PhLic
Christina Igasto, PhD
PhD Students
Ahmed Ali-Eldin
Daniel Henriksson
Ewnetu Bayuh Lakew
Lars Larsson
Wubin Li
Mina Sedaghat
Petter Svärd
P-O Östberg
Others
Tomas Ögren, Systems expert
Sebastian Gröhn, Research assistant
Marcus Karlsson, Research assistant
Mikael Öhman, Research assistant
www.cloudresearch.se