INFO5011
Cloud Computing Semester 2, 2011
Outline
›
Recap of cloud computing service models
›
SaaS architecture considerations
›
IaaS and PaaS comparison
2
Cloud computing – Service models
›
Recap: the cloud computing definition from the US government's
National Institute of Standards and Technology (NIST)
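- Cloud computing is a model for enabling ubiquitous, convenient,
on-demand network access to a shared pool of configurable computing
resources (e.g., networks, servers, storage, applications, and services)
that can be rapidly provisioned and released with minimal management
effort or service provider interaction.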
3
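[Diagram: spectrum of cloud offerings, from lower-level, less management (utility computing, IaaS) to higher-level, more management (SaaS). Services shown: EC2, Azure, AppEngine, Force.com, GoogleApps, SalesForce; categories shown: IaaS, PaaS, SaaS]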
Cloud service models
[Diagram: who manages each layer under four models. Every stack has the same layers: Storage, Server HW, Networking, Servers, Databases, Virtualization, Runtimes, Applications, Security & Integration]
- Private (On-Premise): you manage every layer
- Infrastructure (as a Service): some layers are managed by vendor(s), the rest by you
- Platform (as a Service): some layers are managed by vendor(s), the rest by you
- Software (as a Service): every layer is managed by vendor(s)
Diagram from Azure Academic Materials Syllabus, prepared by David S. Platt, Harvard University Extension School, [email protected] www.rollthunder.com
4
Software As a Service
- Software as a Service (SaaS): The consumer uses an application, but does not control the operating system, hardware or network infrastructure on which it's running.
- Applications are restricted to business applications or applications that would normally be installed in a business network or on a personal computer
- Examples
- Business applications: CRM solutions from salesforce.com
- Business/Personal applications: Gmail, Google Docs, etc.
5
PaaS
- Platform as a Service (PaaS): The consumer uses a hosting environment for their applications. The consumer controls the applications that run in the environment (and possibly has some control over the hosting environment), but does not control the operating system, hardware or network infrastructure on which they are running. The platform is typically an application framework.
6
IaaS
› Infrastructure as a Service (IaaS): The consumer uses "fundamental computing resources" such as processing power, storage, networking components or middleware. The consumer can control the operating system, storage, deployed applications and possibly networking components such as firewalls and load balancers, but not the cloud infrastructure beneath them.
7
Cloud Server and Data Center Map:
http://www.datacentermap.com/cloud.html
SaaS in the news
"Customer relationship management (CRM) continues to be the largest market for SaaS. SaaS revenue within the CRM market is forecast to reach $3.8 billion in 2011, up from $3.2 billion in 2010. Gartner expects SaaS to represent nearly 32 percent of the CRM market's total software revenue in 2011."
8 http://www.gartner.com/it/page.jsp?id=1739214
Gartner Says Worldwide Software as a Service Revenue Is Forecast to Grow 21 Percent in 2011, July 7, 2011
"Worldwide software as a service (SaaS) revenue is forecast to reach $12.1 billion in 2011, a 20.7 percent increase from 2010 revenue of $10 billion, according to Gartner, Inc. The SaaS-based delivery will experience healthy growth through 2015, when worldwide revenue is projected to reach $21.3 billion."
"After more than a decade of use, adoption of SaaS continues to grow and evolve within the enterprise application markets. Initial concerns about security, response time and service availability have diminished for many organizations as SaaS business and computing models have matured and adoption has become more widespread"
SaaS in the news
http://www.nytimes.com/2011/06/28/technology/business-computing/28soft.html
9
"Mr. Conophy, the chief information officer of the InterContinental Hotels Group, decided earlier this year to begin moving nearly all the company’s 25,000 office workers off Microsoft’s e-mail and Office productivity applications and onto Google’s Web-based alternatives."
Microsoft Takes to Cloud to Ward Off Competition
New York Times, June 27, 2011
"Halting such defections is a top priority at Microsoft. Its response arrives Tuesday, when the company begins selling Office 365, a cloud-based version of Microsoft’s e-mail, whiteboard collaboration software and word processing, spreadsheet and presentation programs."
"At $50 a year, Google’s pricing seems far more appealing than the standard price for the Office PC software, from $200 to about $400, depending on features. Office 365 prices are from $2 per user a month to $27 per user a month."
"With cloud-based versions of Word, Excel and PowerPoint, plus several new communications and collaboration tools, that offering could be quite appealing, analysts say. The price, at $72 a year, is somewhat above Google’s, but it carries the Microsoft name and familiarity."
SaaS Motivations
› SaaS, sometimes known as on-demand computing or the internet as a platform, seems like a return to the old terminal and time-sharing system model.
› Incentives for giving up lots of control (individual users, organizations)
- Locally installed software must be installed and configured, then updated with each new release
- Infrastructure, OS and low-level utilities must be maintained
- Every update to the OS sets off a cascade of subsequent revisions to other programs
- For SaaS users, the web browser is like an OS
› Incentives for taking on lots of control (providers)
- Software sold or licensed as a product must be able to cope with a baffling variety of operating environments
- SaaS software runs on platform of the vendor’s choosing
- Updates and bug fixes are deployed in minutes
- …
10 Brian Hayes. 2008. Cloud computing. Commun. ACM 51, 7 (July 2008), 9-11.
http://doi.acm.org/10.1145/1364782.1364786
Two major categories of software as a service
›
Line-of-business services
- offered to enterprises and organizations of all sizes. Line-of-business services are often large, customizable business solutions aimed at facilitating business processes such as finances, supply-chain management, and customer relations. These services are typically sold to customers on a subscription basis.
›
Consumer-oriented services
- offered to the general public. Consumer-oriented services are sometimes sold on a subscription basis, but are often provided to consumers at no cost, and are supported by advertising.
11
Slides 11-24 are based on Frederick Chong and Gianpaolo Carraro, Architecture
Strategies for Catching the Long Tail, Microsoft Corporation, April 2006
Accessible from: http://msdn.microsoft.com/en-us/library/aa479069.aspx
Providing SaaS
12
Changing the business model
›
Shifting the "ownership" of the software from the customer to an external
provider.
›
Reallocating responsibility for the technology infrastructure and
management—that is, hardware and professional services—from the
customer to the provider.
›
Reducing the cost of providing software services, through specialization
and economy of scale.
›
Targeting the "long tail" of smaller businesses, by reducing the minimum
cost at which software can be sold.
13
Selling to the long tail
14
The Amazon long tail
Line of Business software long tail
15
Traditional on-premise software model
SaaS model
Application architecture
›
Most important attributes for SaaS architecture
- Scalable
- Multi-tenancy
- Configuration
- Metadata-based configuration instead of code-based customization
›
Multi-tenancy and isolation may work at different levels for different service
models
- A tenant can be a single consumer or a large organization
16
The Software as a Service Maturity Model
17
SaaS Maturity levels
›
Level I: Ad Hoc/Custom
- The first level of maturity is similar to the traditional application service provider (ASP) model of software delivery, dating back to the 1990s. Each customer has its own customized instance.
›
Level II: Configurable
- At the second level of maturity, the vendor hosts a separate instance of the application for each customer (or tenant). Every instance runs the same code, configured to suit different customers.
›
Level III: Configurable, Multi-Tenant-Efficient
- A single instance serves every customer; only has the "scale up" option
›
Level IV: Scalable, Configurable, Multi-Tenant-Efficient
- Multiple instances; not 1:1 mapping; easy to scale out
18
Higher level architecture
19
Metadata services
›
The metadata service provides customers with the primary means of
customizing and configuring the application to meet their needs. Typical
areas include:
- User interface and branding
- Workflow and business rules
- Extensions to the data model
- Access control
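A minimal Python sketch of metadata-based configuration, as opposed to code-based customization. The class name `MetadataService`, the tenant names and every setting key are hypothetical, chosen only to mirror the four areas listed above; the source article does not prescribe this structure.

```python
from copy import deepcopy

# Defaults shared by every tenant. All keys are illustrative assumptions.
DEFAULTS = {
    "branding": {"logo": "default.png", "theme": "blue"},
    "workflow": {"approval_required": False},
    "data_model": {"extra_fields": []},
    "access_control": {"admin": ["read", "write"], "user": ["read"]},
}


class MetadataService:
    """Stores per-tenant overrides and merges them over the shared defaults."""

    def __init__(self):
        self._overrides = {}

    def configure(self, tenant, section, **settings):
        # Record overrides for one tenant; the application code is untouched.
        self._overrides.setdefault(tenant, {}).setdefault(section, {}).update(settings)

    def config_for(self, tenant):
        # Start from a copy of the defaults so tenants never affect each other.
        config = deepcopy(DEFAULTS)
        for section, settings in self._overrides.get(tenant, {}).items():
            config.setdefault(section, {}).update(settings)
        return config


service = MetadataService()
service.configure("acme", "branding", logo="acme.png")
service.configure("acme", "workflow", approval_required=True)

acme = service.config_for("acme")      # defaults plus acme's overrides
globex = service.config_for("globex")  # defaults only
```

The same application code serves both tenants; only the stored metadata differs.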
20
Multi-Tenant Data Model
›
Dedicated Tenant Database
- Using metadata to keep track of which database belongs to which tenant
›
Shared database, fixed extension set
›
Shared database, custom extension set
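One way to picture the "shared database, custom extension set" option is a shared table keyed by tenant plus a name-value extension table. The sketch below uses Python's built-in sqlite3 module; the table names, column names and sample tenants are assumptions for illustration, not the schema from the source article.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Every tenant's rows live in the same table, separated by tenant_id.
    CREATE TABLE customers (
        tenant_id   TEXT    NOT NULL,
        customer_id INTEGER NOT NULL,
        name        TEXT    NOT NULL,
        PRIMARY KEY (tenant_id, customer_id)
    );
    -- Tenant-specific fields are stored as name-value pairs.
    CREATE TABLE customer_extensions (
        tenant_id   TEXT    NOT NULL,
        customer_id INTEGER NOT NULL,
        field_name  TEXT    NOT NULL,
        field_value TEXT,
        PRIMARY KEY (tenant_id, customer_id, field_name)
    );
""")


def add_customer(tenant_id, customer_id, name, **custom_fields):
    conn.execute("INSERT INTO customers VALUES (?, ?, ?)",
                 (tenant_id, customer_id, name))
    conn.executemany(
        "INSERT INTO customer_extensions VALUES (?, ?, ?, ?)",
        [(tenant_id, customer_id, k, str(v)) for k, v in custom_fields.items()],
    )


def get_customer(tenant_id, customer_id):
    # Every query filters on tenant_id, which is what isolates tenants.
    row = conn.execute(
        "SELECT name FROM customers WHERE tenant_id = ? AND customer_id = ?",
        (tenant_id, customer_id),
    ).fetchone()
    if row is None:
        return None
    record = {"name": row[0]}
    extensions = conn.execute(
        "SELECT field_name, field_value FROM customer_extensions "
        "WHERE tenant_id = ? AND customer_id = ?",
        (tenant_id, customer_id),
    ).fetchall()
    record.update(dict(extensions))
    return record


add_customer("tenant_a", 1, "Alice", loyalty_tier="gold")
add_customer("tenant_b", 1, "Bob", region="EU", vat_number="123")
```

In this sketch each tenant can add its own fields without a schema change, at the cost of an extra lookup per record.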
21
force.com data storage example
22
Diagrams from Craig D. Weissman and Steve Bobrowski, The Design of the Force.com
Multitenant Internet Application Development Platform. In SIGMOD’09
Figure 4. Example of single flex column
Figure 3. Force.com’s data definition and storage model
Scalability – scaling out
›
Scaling the application
- Design the application to run in a stateless fashion
- Design the application to conduct I/O operations asynchronously
- Pool resources such as threads, network connections and database connections
- Write the database operations in such a way as to maximize concurrency and minimize exclusive locking
›
Scaling the data
- Partitioning and repartitioning
- force.com partitions data by tenant
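A sketch of tenant-based partitioning, assuming a simple hash-modulo scheme. The slides only state that data is partitioned by tenant; the hashing rule and the function names `partition_for` and `moved_tenants` are assumptions made for illustration.

```python
import hashlib


def partition_for(tenant_id: str, num_partitions: int) -> int:
    """Map a tenant to a partition so that all of its rows stay together."""
    # A cryptographic digest is used because Python's built-in hash() for
    # strings changes between runs.
    digest = hashlib.sha256(tenant_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_partitions


def moved_tenants(tenants, old_count: int, new_count: int):
    """List the tenants whose data must move when repartitioning."""
    return [t for t in tenants
            if partition_for(t, old_count) != partition_for(t, new_count)]


tenants = ["acme", "globex", "initech", "umbrella"]
placement = {t: partition_for(t, 4) for t in tenants}
to_move = moved_tenants(tenants, 4, 8)  # tenants affected by growing 4 -> 8
```

With plain modulo hashing, changing the partition count can relocate many tenants; that cost is the reason repartitioning is listed as a separate concern.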
23
Operational structure
›
What it takes to deliver the application to customers and to keep it
available and running well at a cost-effective level
- How many 9s in uptime
›
Shared services
- Accurately track customers' usage, and bill them for time or resources used.
- Restrict or throttle access at certain times of the day, or in order to meet other criteria.
- Monitor site access and performance, to ensure that SLAs are being met.
- Perform other functions in order to ensure a seamless experience for your customers that meets or exceeds expectations.
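The "how many 9s" question above turns into simple arithmetic: each additional nine cuts the allowed downtime by a factor of ten. A short sketch, assuming a 365-day year and treating availability as the fraction of total time the service is up (the function names are illustrative):

```python
HOURS_PER_YEAR = 365 * 24  # 8760 hours, ignoring leap years


def availability(nines: int) -> float:
    """Availability as a fraction, e.g. 3 nines -> 0.999."""
    return 1 - 10 ** (-nines)


def allowed_downtime_hours(nines: int) -> float:
    """Yearly downtime budget, in hours, for a given number of nines."""
    return HOURS_PER_YEAR * 10 ** (-nines)


# Downtime budget in hours per year for two to five nines.
budget = {n: allowed_downtime_hours(n) for n in range(2, 6)}
```

Three nines (99.9%) allows about 8.76 hours of downtime a year, while five nines (99.999%) allows only about 5 minutes.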
24
Shared services
›
Operational support services
(OSS)—Handle operational issues
such as account activation,
provisioning, service assurance,
usage, and metering.
›
Business support services
(BSS)—Support billing (including
invoicing, rating, taxation, and
collections) and customer
management (which includes order
entry, customer self services,
customer care, trouble ticketing,
and customer relationship
management).
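The metering, throttling and billing functions described above can be sketched together in a few lines. This is a toy illustration under stated assumptions: a flat per-request price, a fixed per-tenant quota and the class name `UsageMeter` are all hypothetical, not taken from the source article.

```python
from collections import defaultdict


class UsageMeter:
    """Tracks requests per tenant, throttles at a quota, and prices usage."""

    def __init__(self, quota_per_tenant: int, price_per_request: float):
        self.quota = quota_per_tenant
        self.price = price_per_request
        self.usage = defaultdict(int)

    def allow(self, tenant: str) -> bool:
        """Record one request; refuse it once the tenant's quota is used up."""
        if self.usage[tenant] >= self.quota:
            return False          # throttled (operational support)
        self.usage[tenant] += 1   # metered (operational support)
        return True

    def invoice(self, tenant: str) -> float:
        """Amount owed for the metered usage (business support)."""
        return self.usage[tenant] * self.price


meter = UsageMeter(quota_per_tenant=3, price_per_request=0.01)
results = [meter.allow("acme") for _ in range(5)]
```

Here the first three requests are accepted and metered, and the remaining two are throttled and not billed.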
PaaS and IaaS or utility services
›
The line between PaaS and IaaS is fuzzy
›
Many services lie in between
- Many classic IaaS service providers provide computing and storage as separate services
- Amazon EC2 and S3
- Rackspace Cloud Servers and Cloud Files
- Microsoft Windows Azure Virtual Machine/compute and storage
- There might be other storage options
- Amazon EBS, SimpleDB, RDS, Microsoft SQL Azure Database
- There might be other utility-type services for networking, messaging and so on
- Amazon Simple Queue Service, Amazon Virtual Private Cloud, Microsoft virtual network, CDN, …
26
Commonalities among major service providers
› Common services
- Elastic compute cluster
- Persistent storage
- Intra-cloud network
- Wide-area Network
› Comparisons as reported in the paper
27
Slides 27-42 are based on
CloudCmp: comparing public cloud providers. Ang Li, Xiaowei Yang, Srikanth Kandula, and Ming Zhang. In Proceedings of the 10th annual conference on Internet measurement (IMC '10) unless stated otherwise
Provider | Elastic Compute | Storage | Wide-area Network
AWS | Xen VM | SimpleDB (table), S3 (blob), SQS (queue), RDS (relational) | 3 DC locations
Azure | Azure VM | XStore (table, blob, queue), SQL Azure Database (relational) | 6 DC locations (2 each in US, EU and Asia)
AppEngine | Proprietary sandbox | DataStore (table), BlobStore (blob) | Unpublished number of Google DCs
RackSpace | Xen VM | CloudFiles (blob) | 2 DC locations (all in US)
Elastic Compute Cluster
› Elastic Compute Cluster
- Pricing:
- AWS, Azure and CloudServers charge based on how long an instance remains allocated
- AppEngine charges based on how many CPU cycles a customer's application consumes
- Elastic implementation
- Opaque scaling (AWS, Azure and CloudServers)
- Manual or policy based
- Transparent scaling (AppEngine)
- Automatic
- Performance Metrics
- Benchmark finishing time
- Cost
- Scaling latency
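The two pricing models above can be compared with a little arithmetic. The sketch below assumes whole instance-hours rounded up for allocation-based billing and a flat rate per CPU-hour for consumption-based billing; both rules and all rates are illustrative assumptions, not published provider terms.

```python
import math


def allocation_cost(hours_allocated: float, price_per_hour: float,
                    instances: int = 1) -> float:
    """Pay for as long as instances stay allocated, busy or idle."""
    # Assumption: partial hours are billed as full hours.
    return math.ceil(hours_allocated) * price_per_hour * instances


def consumption_cost(cpu_hours_used: float, price_per_cpu_hour: float) -> float:
    """Pay only for the CPU time the application actually consumes."""
    return cpu_hours_used * price_per_cpu_hour


# One instance kept allocated for a day, but busy for only two CPU-hours.
by_allocation = allocation_cost(24, price_per_hour=0.085)
by_consumption = consumption_cost(2, price_per_cpu_hour=0.10)
```

A mostly idle application is cheaper under consumption-based pricing in this example; the comparison can reverse for an application that keeps its instances busy.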
28
An example of Amazon’s scaling mechanism
29
Jeff Barr, Host Your Web Site in the Cloud, published by Sitepoint, 2010
INFO5011 "Cloud Computing" - 2011 (U. Röhm and Y. Zhou)
Persistent Storage Mechanisms
› Persistent Storage
- Table: Relational and non-relational (NoSQL)
- Blob: Binary Large Object, such as an image or video; binary data are stored in a bucket with metadata (a label on the bucket) that allows one blob to be distinguished from another
- Queue: A classic first-in, first-out data storage structure used primarily for passing data from one computing job to another in a loosely coupled fashion
30
Mechanism | Operation | Description
Table | get | Fetch a single row (object)
Table | put | Insert a single row (object)
Table | query | Look up rows (objects) that satisfy a certain condition
Blob | download | Download a single blob
Blob | upload | Upload a single blob
Queue | send | Send a message to a queue
Queue | receive | Retrieve the next message from a queue
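The three mechanisms and their operations can be mimicked with in-memory Python classes. This is a teaching sketch only: the class names are hypothetical and nothing here corresponds to a real provider SDK.

```python
from collections import deque


class Table:
    """Key-addressed rows with get, put and query."""

    def __init__(self):
        self._rows = {}

    def put(self, key, row):
        self._rows[key] = dict(row)

    def get(self, key):
        return self._rows.get(key)

    def query(self, predicate):
        # Look up every row that satisfies the given condition.
        return [row for row in self._rows.values() if predicate(row)]


class BlobStore:
    """Opaque binary objects addressed by name."""

    def __init__(self):
        self._blobs = {}

    def upload(self, name, data: bytes):
        self._blobs[name] = bytes(data)

    def download(self, name) -> bytes:
        return self._blobs[name]


class Queue:
    """First-in, first-out message passing between loosely coupled jobs."""

    def __init__(self):
        self._messages = deque()

    def send(self, message):
        self._messages.append(message)

    def receive(self):
        # Returns None when the queue is empty.
        return self._messages.popleft() if self._messages else None


table = Table()
table.put("u1", {"name": "Ann", "age": 31})
table.put("u2", {"name": "Ben", "age": 19})
adults = table.query(lambda row: row["age"] >= 21)

blobs = BlobStore()
blobs.upload("maggie.jpg", b"\xff\xd8")

queue = Queue()
queue.send("resize maggie.jpg")
queue.send("notify owner")
first = queue.receive()
```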
Blob Storage
›
Amazon S3
- Buckets, objects and keys
- A bucket is a container for objects stored in Amazon S3; Bucket names are drawn from a global namespace; Each S3 account can have a limited number of buckets
- Objects are the fundamental entities stored in Amazon S3. Objects consist of object data and metadata.
- Each object has a key to identify itself within a bucket.
- Objects are addressable
- http://dog.s3.amazonaws.com/maggie.jpg
›
Microsoft Azure Blob Service has a similar container/blob structure
- http://myaccount.blob.core.windows.net/mycontainer/myblob
31
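The two example URLs follow a regular pattern: S3 puts the bucket in the host name and the key in the path, while the Azure example puts the account in the host name and the container and blob in the path. The helper functions below are hypothetical and only reproduce the URL shapes shown on this slide.

```python
from urllib.parse import urlparse

S3_SUFFIX = ".s3.amazonaws.com"


def s3_url(bucket: str, key: str) -> str:
    """Build an S3 URL in the form shown on the slide."""
    return f"http://{bucket}{S3_SUFFIX}/{key}"


def parse_s3_url(url: str):
    """Split an S3 URL of that form into (bucket, key)."""
    parsed = urlparse(url)
    host = parsed.hostname or ""
    if not host.endswith(S3_SUFFIX):
        raise ValueError(f"not an S3 URL of the expected form: {url}")
    return host[: -len(S3_SUFFIX)], parsed.path.lstrip("/")


def azure_blob_url(account: str, container: str, blob: str) -> str:
    """Build an Azure blob URL in the form shown on the slide."""
    return f"http://{account}.blob.core.windows.net/{container}/{blob}"


bucket, key = parse_s3_url("http://dog.s3.amazonaws.com/maggie.jpg")
```

Because the bucket name becomes part of a host name, it has to be unique across all accounts, which is why bucket names are drawn from a global namespace.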
Persistent Storage Pricing & Performance Metrics
›
Storage pricing is based on three types of cost
- Storage, executing requests, network transfer
- S3/Azure Storage Service/Cloud Files charge a fixed per-request cost
- SDB/DataStore charge for the CPU cycles used to carry out the requests
- A complex query costs more than a simple one
›
Performance Metrics
- Operation Response Time
- Time to Consistency
- Cost Per Operation
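The three cost components add up as sketched below, using the fixed per-request model. Every rate in the example is a made-up placeholder, not a price from any provider, and `storage_bill` is a hypothetical helper.

```python
def storage_bill(gb_stored: float, requests: int, gb_transferred: float, *,
                 price_per_gb_month: float, price_per_request: float,
                 price_per_gb_transfer: float) -> dict:
    """Break a monthly storage bill into its three components."""
    bill = {
        "storage": gb_stored * price_per_gb_month,
        "requests": requests * price_per_request,
        "transfer": gb_transferred * price_per_gb_transfer,
    }
    bill["total"] = bill["storage"] + bill["requests"] + bill["transfer"]
    return bill


# Placeholder workload and rates, for illustration only.
bill = storage_bill(
    100, 1_000_000, 50,
    price_per_gb_month=0.15,
    price_per_request=0.00001,
    price_per_gb_transfer=0.10,
)
```

Under the CPU-cycle model used by SDB/DataStore, the request term would depend on the work each request causes rather than on the request count alone.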
32
Intra-cloud network
›
A large cloud provider often operates several data centers
- Intra-datacenter traffic (within a data center) is not charged
- Inter-datacenter traffic is charged based on the volume crossing the data center boundary
›
Performance metrics
- Path capacity measured by TCP throughput
- Path latency
33
Wide-area Network
› Wide-area network in this paper is defined as the collection of network paths between a cloud’s data centers and external hosts (most likely where client app runs)
› Performance metrics
- Optimal wide-area network latency
- Initiate requests from PlanetLab nodes, then compute the minimum latency between each PlanetLab node and any data center owned by a provider
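The optimal wide-area latency metric is a minimum taken per vantage point. A sketch, assuming round-trip times have already been measured; the node names, data center names and all latency values below are made up for illustration.

```python
def optimal_latency(measurements: dict) -> dict:
    """For each node, pick the data center with the lowest measured latency.

    measurements maps node -> {data_center: round_trip_ms}.
    Returns node -> (best_data_center, round_trip_ms).
    """
    best = {}
    for node, per_dc in measurements.items():
        dc = min(per_dc, key=per_dc.get)
        best[node] = (dc, per_dc[dc])
    return best


# Hypothetical measurements from two vantage points to three data centers.
samples = {
    "node-sydney": {"us-east": 210.0, "eu-west": 290.0, "asia": 95.0},
    "node-berlin": {"us-east": 98.0, "eu-west": 22.0, "asia": 240.0},
}
best = optimal_latency(samples)
```

A provider with more, better-placed data centers tends to score lower on this metric because each node has a nearer candidate.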
34
PlanetLab currently consists of 1080 nodes at 532 sites. Diagram from http://www.planet-lab.org/
Implementation
› Computation Metrics
- The purpose of the benchmark tasks is to stress various aspects of the compute infrastructure offered by cloud providers
- Modify a standard Java-based benchmark to suit all cloud services
- The benchmark consists of CPU-, memory- and I/O-intensive tasks
› Storage Metrics
- Various data sizes
- Java client to test typical storage APIs
- Use persistent HTTP connections to avoid the initial SSL handshake and TCP setup time
- Clients in other languages are also provided
› Network metrics
- Standard tools like iperf and ping
- For intra-cloud network performance, allocate a pair of instances in the same or different data centers and run these tools
- For wide-area network performance, instantiate an instance in each data center owned by the provider and ping them from 200+ PlanetLab nodes.
35
Results on CPU
36
[Chart labels: $0.085/h (<1), $0.34/h (2), $0.64/h (4), $0.12/h (1), $0.24/h (2), $0.48/h (4), $0.96/h (8), $0.015/h, $0.03/h, $0.06/h, $0.12/h]
Same performance regardless of instance type! This may be explained by the CPU sharing policy of C2, where a virtual instance can fully use all physical CPUs on a machine if there is no contention
Results on IO
37
Results on scaling latency
38
A Windows instance takes longer to create than a Linux one
Same provisioning time, different booting time; may be caused by a mismatch between the CPU and the Windows code base
Same booting time, different provisioning time; may be caused by differences in hardware infrastructure
Persistent storage (table) results
39
The difference in query performance may be caused by indexing strategies on non-key columns
Intra-datacenter performance
40
Wide-area Network
41
Real-world application results
›
The paper also deploys several real-world applications on different cloud
providers to confirm the results obtained by CloudCmp
›
E-Commerce Website
›
Parallel Scientific Computation
›
Latency Sensitive Website
42
Discussion question
›
Grid computing vs. Cloud computing
›
Multi-tenancy and isolation levels
43
Resources
› SaaS directory http://saasdir.com/
› Google DataStore:
http://code.google.com/appengine/articles/datastore/overview.html
› Ang Li, Xiaowei Yang, Srikanth Kandula, and Ming Zhang. CloudCmp: comparing
public cloud providers. In Proceedings of the 10th annual conference on Internet
measurement (IMC '10). ACM, New York, NY, USA, 1-14.
› Frederick Chong and Gianpaolo Carraro, Architecture Strategies for Catching the Long Tail. Microsoft Corporation, April 2006
44