
Cloud computing taxonomy

Olivier Curé

Université Paris-Est Marne-la-Vallée, LIGM UMR CNRS 8049, France

October 1, 2015


“a cloud provides on demand resources and services over the Internet, usually at the scale and with the reliability of a data center” [1]

Resources are accessed through services, hence the pay-as-you-go pricing model.

This is based on a Service Level Agreement (SLA) between a cloud provider and its customers.

[1] Grossman, R. L. and Gu, Y. (2009). On the varieties of clouds for data intensive computing. Q. Bull. IEEE TC on Data Eng., 32(1):44-50.


Scalability

The ability of a system, network, or process to handle growing amounts of work gracefully, or to be enlarged to accommodate that growth.

Two ways to scale:

vertical scaling, aka scaling up (adding processors, memory), and

horizontal scaling, aka scaling out, either by functional scaling (grouping data by function and spreading the groups across databases) or by splitting data within a functional area across multiple databases, aka sharding.
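Sharding can be illustrated with a minimal Python sketch. It assumes a fixed pool of four databases and hash-based routing on a record key; the names `SHARDS` and `shard_for` are hypothetical and do not come from the slides.

```python
import hashlib

# Hypothetical pool of databases holding one functional area (e.g. users).
SHARDS = ["users_db_0", "users_db_1", "users_db_2", "users_db_3"]


def shard_for(key: str, shards=SHARDS) -> str:
    """Route a record key to one shard using a stable hash."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    # Use the first 8 bytes as an integer, then reduce modulo the pool size.
    index = int.from_bytes(digest[:8], "big") % len(shards)
    return shards[index]
```

The same key always lands on the same shard, but with plain modulo routing, changing the number of shards moves most keys; consistent hashing is one common way to limit that.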


Service Level Agreement (SLA)

An SLA specifies the responsibilities, guarantees and service commitment.

For instance, the service commitment may define that the service uptime during a billing cycle (e.g., a month) should be at least 99%, and if this is not the case, the customer should get a service credit.
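The 99% example can be worked through in a short Python sketch. The 30-day billing cycle and the 10% credit rate are assumptions chosen for illustration; real SLAs define their own cycles and credit schedules.

```python
MINUTES_IN_30_DAYS = 30 * 24 * 60  # 43,200 minutes in the assumed cycle


def uptime_percentage(total_minutes: int, downtime_minutes: float) -> float:
    """Share of the billing cycle during which the service was up."""
    return 100.0 * (total_minutes - downtime_minutes) / total_minutes


def service_credit(uptime_pct: float, monthly_fee: float,
                   threshold_pct: float = 99.0,
                   credit_rate: float = 0.10) -> float:
    """Credit owed when uptime falls below the commitment.

    The 10% credit rate is an assumed figure, not taken from any real SLA.
    """
    return monthly_fee * credit_rate if uptime_pct < threshold_pct else 0.0


# Downtime budget at 99%: about 432 minutes (7.2 hours) per 30-day cycle.
allowed_downtime = MINUTES_IN_30_DAYS * (1 - 0.99)
```

With 600 minutes of downtime the uptime is about 98.6%, which is below the commitment, so a credit is due; with 300 minutes it is about 99.3% and no credit applies.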


Multitenancy

An essential concept of cloud computing.

A principle in software architecture where a single instance of the software runs on a server, serving multiple client organizations (tenants).

The architecture is designed to virtually partition its data and configuration.
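One simple way to partition data virtually is a shared table with a tenant column. The Python sketch below uses an in-memory SQLite database; the table, column and tenant names are hypothetical, and real multitenant systems use many other layouts.

```python
import sqlite3


def make_shared_db() -> sqlite3.Connection:
    """One database instance shared by every tenant."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE contacts (tenant_id TEXT NOT NULL, name TEXT NOT NULL)"
    )
    conn.executemany(
        "INSERT INTO contacts (tenant_id, name) VALUES (?, ?)",
        [("acme", "Alice"), ("acme", "Bob"), ("globex", "Carol")],
    )
    return conn


def contacts_for(conn: sqlite3.Connection, tenant_id: str) -> list:
    """Every query is scoped by tenant_id, so tenants never see each other's rows."""
    rows = conn.execute(
        "SELECT name FROM contacts WHERE tenant_id = ? ORDER BY name",
        (tenant_id,),
    )
    return [name for (name,) in rows]
```

Calling `contacts_for(conn, "acme")` only returns the rows stored for that tenant, even though all tenants share the same table.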


Cloud computing capitalizes [2] on:

grid computing (distributed resources over the network)

server virtualization

cluster computing (to manage lots of computing and storage resources)

Web services (SOA)

utility computing (packaging computing and storage resources as services)

[2] Özsu, M. T. and Valduriez, P. (2011). Principles of Distributed Database Systems. Springer.


Cloud computing vs Grid computing

Cloud computing is designed to support a large number of users while Grid computing is meant to run very large jobs for few users.

Cloud computing involves selecting a provider and running apps in its datacenter(s), while Grid computing involves a federation of multiple organizations.


Server virtualization

A technology that enables multiple applications to run on the same physical server as virtual machines.

They behave as if they were running on distinct physical servers.

Solutions: VMware, Xen (used at Amazon)


Different forms of cloud:

Public cloud: offered to the general public on a pay-as-you-go basis. What is sold is utility computing.

Private cloud: internal datacenters of a business or organization that are not available to the public.

Community cloud: infrastructure shared between several organisations from a specific community with common concerns (security, compliance, jurisdiction, etc.).

Hybrid cloud: a composition of two or more clouds (private, community, or public) that remain unique entities. Can also correspond to multiple cloud systems connected such that programs and data can be moved easily from one deployment system to another.


4 broad categories of cloud services:

Software as a Service (SaaS)

Infrastructure as a Service (IaaS)

Platform as a Service (PaaS)

Data as a Service (DaaS)


Software as a Service

Delivery of application software as a service.

A generalization of the Application Service Provider (ASP) model, in which the cloud provider also provides tools to integrate other apps.

Apps range from email and calendar to CRM, data analysis, etc.

Example: Salesforce.com CRM


Infrastructure as a Service

Delivery of a computing infrastructure as a service (computing, storage and networking)

Easy to scale up (add resources) and down (release resources): elasticity.

Elasticity is achieved via server virtualization: multiple apps run on the same physical server as virtual machines.

Example: Amazon Web Services, Joyent, Rackspace, Eucalyptus.
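Elasticity can be pictured as a rule that recomputes the number of virtual machines from the current load. The Python sketch below is only an illustration: the function name, the capacity figure and the instance bounds are assumptions, not the policy of any provider.

```python
import math


def desired_instances(total_load: float, capacity_per_instance: float,
                      min_instances: int = 1, max_instances: int = 20) -> int:
    """Number of VMs needed to serve total_load, kept within fixed bounds."""
    needed = math.ceil(total_load / capacity_per_instance)
    # Scale up when load grows, release resources when it shrinks.
    return max(min_instances, min(max_instances, needed))
```

For a load of 250 requests/s and an assumed capacity of 100 requests/s per instance, the rule asks for 3 instances; when the load drops to zero it falls back to the minimum.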


Platform as a Service

Delivery of a computing platform with development tools and APIs as a service.

Supports the creation and deployment of custom apps directly on the cloud.

Example: Google App Engine, Joyent, Heroku, CloudBees, Microsoft Azure, Cloud Foundry (VMware)


Data as a Service

Considers that software is becoming a commodity.

Data becomes the main asset, since it can be used with different software.

Pricing model:

Volume-based model: either quantity-based pricing (e.g., 5,000 API calls per day) or pay per call (e.g., a few cents for each call to the API).

Data type-based model: not all data have the same value.
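The two volume-based options can be compared with a small Python sketch. All prices are assumptions for illustration (2 cents per call, and a plan at 5,000 cents per day covering 5,000 calls); amounts are kept in integer cents to avoid rounding issues.

```python
def pay_per_call_cost_cents(calls: int, cents_per_call: int = 2) -> int:
    """Pay-per-call: every API call is billed individually."""
    return calls * cents_per_call


def cheaper_model(calls_per_day: int, plan_cents_per_day: int = 5000,
                  daily_quota: int = 5000, cents_per_call: int = 2) -> str:
    """Pick the cheaper volume-based option for a given daily call volume."""
    if calls_per_day > daily_quota:
        # The assumed plan cannot serve more calls than its daily quota.
        return "pay-per-call"
    per_call_total = pay_per_call_cost_cents(calls_per_day, cents_per_call)
    return "quantity-based" if plan_cents_per_day < per_call_total else "pay-per-call"
```

Under these assumed prices, a client making 1,000 calls a day is better off paying per call, while one making 4,000 calls a day is better off with the quantity-based plan.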

Ensuring data quality and its cleansing are central tasks.

Actors: Urban Mapping (geographic data service), Xignite (financial market data), etc.


Approximate revenues


Salesforce.com: leader in the SaaS enterprise app market, especially in CRM (Customer Relationship Management)

Between 47,000 and 100,000 clients

Contains different components:

Force.com: a PaaS offering with a proprietary language (Apex)

Database.com: the database part of Force.com

Heroku: a PaaS competitor to Force.com, focused on Java and Ruby; supports many DBMSs, both SQL and NoSQL


Salesforce.com CRM Overview Demo


Hard to get details on their infrastructure.

A single instance of Oracle, probably the largest in the world.

Large tables with hundreds of flex columns, which can hold data of many different kinds and datatypes → a kind of NoSQL store.

Multitenancy via joins.

550 million transactions per day in 2011.


Google App Engine

A PaaS approach

Supports Java, Go and Python runtimes

Main concerns: scalability and reliability

GAE restricts your app from any access to the physical infrastructure:

No socket openings

No running background processes (but cron is allowed) or other back-end routines


GAE shares resources among multiple apps but isolates the data and security between each tenant.

Your app can use some Google services but you cannot open ports directly.

GAE imposes a fixed maximum duration on the execution of code.

A GAE app gets a daily limit on each type of request, and each request is subtracted from that daily allotment:

the free version of GAE can scale to 5 million hits per day and 7,400 secure incoming requests/min.

Need more? Pay for it.


Daily quotas and per-minute quotas.

Resources subject to quotas:

CPU time

Requests

Incoming bandwidth

Outgoing bandwidth
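The interplay of daily and per-minute quotas can be sketched in Python as two counters checked on every request. The class and its limits are hypothetical; this is not the accounting GAE itself performs.

```python
from dataclasses import dataclass


@dataclass
class Quota:
    """Tracks one resource (e.g. requests) against two limits."""
    daily_limit: int
    per_minute_limit: int
    used_today: int = 0
    used_this_minute: int = 0

    def try_consume(self, amount: int = 1) -> bool:
        """Consume the resource only if both quotas still have room."""
        if (self.used_today + amount > self.daily_limit
                or self.used_this_minute + amount > self.per_minute_limit):
            return False
        self.used_today += amount
        self.used_this_minute += amount
        return True

    def start_new_minute(self) -> None:
        self.used_this_minute = 0

    def start_new_day(self) -> None:
        self.used_today = 0
        self.used_this_minute = 0
```

A burst can exhaust the per-minute quota while most of the daily allotment is still unused; the request is refused until the next minute starts.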


Quotas for request resources


Quotas for database resources


Quotas for mail API resources and more


GAE provides (as of 2010):

a Java 6 JVM and Java Servlet interface

JDO, JPA, JavaMail

JCache

Google Plugin for Eclipse with local development server and deployment tools

Schedule tasks (to create and manage cron jobs)


Will it play in GAE

Since the JVM on GAEJ supports only a subset of what a standard JVM offers, it is important to know what is supported.

Supported: JDO, JPA, JSF, JSP, JSTL, Java Servlet API 2.4, JAXB, JavaMail, XML APIs (DOM, SAX, XSLT)

Not supported: EJB, JAX-RPC, JAX-WS, JDBC, JNDI, RMI, JMS


Will it play in GAE (2)

JVM-based languages

Compatible: Groovy, JRuby, Jython, Scala, PHP, JavaScript

Libraries and frameworks

Compatible: GWT, log4j, Restlet, Struts (1 and 2), Spring MVC, Tiles, Adobe Flex

Semi-compatible: Grails, Jena Semantic framework

Not compatible: Hibernate


Servlets and JSP in GAEJ

GAE runs the JVM in a secured sandbox environment to isolate apps from one another.

Classical MVC approach:

JSP for views, POJOs for the model and a Servlet for the controller.

The Servlet uses JDO and JDOQL to access data stored in Bigtable.


GAE datastore

GAE takes care of distribution, replication and load balancing.

It is powered by:

Bigtable: a highly distributed and scalable service for storing and managing structured data, and

Google File System (GFS): a scalable, fault-tolerant file system designed for large, distributed, data-intensive applications.


Querying GAE datastore

Either with a standard API (JDO or JPA implementation) or with a low-level API for modeling and persisting entities.

The datastore provides CRUD (Create, Read, Update, Delete) access to entities of Bigtable and querying with JDOQL.

Schemaless, no joins; supports indexing and transactions.


GAE services

Memcache:

In-memory data cache in front of a persistent storage.

The Memcache API supports the JCache interface: a map-like interface to the cached data store.

Expiration mechanism: after a defined number of seconds or at a precise time.

Data in Memcache is not reliable, since it is not persistent.
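The role of such a cache, with expiration and a fallback to persistent storage, can be sketched in Python. This is a generic illustration and not the Memcache or JCache API; `ExpiringCache`, `read_through` and the injectable clock are hypothetical names.

```python
import time


class ExpiringCache:
    """Map-like in-memory cache whose entries expire after a number of seconds."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._items = {}  # key -> (value, expires_at)

    def put(self, key, value, ttl_seconds):
        self._items[key] = (value, self._clock() + ttl_seconds)

    def get(self, key, default=None):
        entry = self._items.get(key)
        if entry is None:
            return default
        value, expires_at = entry
        if self._clock() >= expires_at:
            del self._items[key]  # expired: behave as a cache miss
            return default
        return value


def read_through(cache, key, load_from_store, ttl_seconds=60):
    """Serve from the cache when possible, else reload from persistent storage."""
    value = cache.get(key)
    if value is None:
        value = load_from_store(key)
        cache.put(key, value, ttl_seconds)
    return value
```

Because cached data may vanish at any time, the caller always keeps a way to reload it from the persistent store, here the `load_from_store` callback.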


URL Fetch Service

Used to communicate with other systems through HTTP and HTTPS callouts.

Cannot access ports other than 80 (HTTP) and 443 (HTTPS).

No direct socket connections.
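The port restriction can be expressed as a small URL check. The Python sketch below only models the rule stated above with the standard `urllib.parse` module; `is_fetch_allowed` is a hypothetical helper, not part of the URL Fetch API.

```python
from urllib.parse import urlsplit

DEFAULT_PORTS = {"http": 80, "https": 443}
ALLOWED_PORTS = set(DEFAULT_PORTS.values())


def is_fetch_allowed(url: str) -> bool:
    """True if the URL uses HTTP(S) and targets port 80 or 443."""
    parts = urlsplit(url)
    default_port = DEFAULT_PORTS.get(parts.scheme)
    if default_port is None or not parts.hostname:
        return False  # neither HTTP nor HTTPS, or no host at all
    try:
        port = parts.port  # raises ValueError for a malformed port
    except ValueError:
        return False
    if port is None:
        port = default_port  # no explicit port: use the scheme default
    return port in ALLOWED_PORTS
```

For example, `http://example.com:8080/` and `ftp://example.com/` are rejected, while `https://example.com/` is accepted.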


To manage, monitor and configure an application.

The application dashboard makes it possible to:

View requests and error logs, analyze traffic.

Administer the datastore, manage indexes.

Display quota information (resources, datastore, email).


A cloud-computing app can be connected to several cloud computing platforms.

For instance: GAE and Salesforce.com with the Force.com toolkit for GAE.


Amazon Web Services comprises:

Simple Storage Service (S3)

Elastic Compute Cloud (EC2)

SimpleDB

Simple Queue Service

3 interfaces supported: REST, Query (REST-like) and SOAP


Amazon’s IaaS solution

“a true virtual computing environment, allowing you to use web service interfaces to launch instances with a variety of operating systems, load them with your custom application environment, manage your network’s access permissions, and run your image using as many or few systems as you desire.”

Charges by the number and duration of VM instances used by a customer.

To use Amazon EC2, one needs to select a pre-configured image or create an Amazon Machine Image (AMI) containing apps, libraries, data and configuration settings.


Instance types

6 families of instances with a total of 11 solutions:

Micro instances: 613 MB of RAM, 2 EC2 compute units, EBS storage, 32-bit or 64-bit platform

Standard instances: 3 solutions from 1.7 GB to 15 GB of RAM, 2 to 8 EC2 compute units, 160 GB to 1690 GB of disk, 32-bit and 64-bit platform

High memory instances: 3 solutions from 17.1 GB to 68.4 GB of RAM, 6 to 26 EC2 compute units, 420 GB to 1690 GB of disk

High CPU instances

Cluster compute instances: high CPU with increased network performance (10 Gigabit Ethernet)

Cluster GPU instances: high CPU and increased network performance for applications benefitting from highly parallelized processing


Operating systems and Software

OS: Red Hat Enterprise Linux, SUSE Linux Enterprise, Fedora, Windows Server, Gentoo, Oracle Enterprise Linux, Ubuntu Linux, Debian, Amazon Linux AMI

Databases: IBM DB2, SQL Server, MySQL, Oracle Database 11g

App dev environments: IBM sMash, JBoss Enterprise Application Platform, Ruby on Rails

App servers: WebSphere, Oracle WebLogic


On demand instances


Data transfer


Google Compute Engine

Google’s IaaS.

Linux virtual machines on Google’s infrastructure.

Strong security (data encryption).

Pricing: the minimum charge is 10 minutes; after that, billing is by the minute.
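The billing rule can be written as a one-line function. In this Python sketch, rounding partial minutes up to the next whole minute is an assumption, as is the name `billed_minutes`.

```python
import math


def billed_minutes(uptime_minutes: float, minimum: int = 10) -> int:
    """Minutes charged for one instance run: at least 10, then per minute.

    Partial minutes are rounded up, which is an assumption of this sketch.
    """
    if uptime_minutes <= 0:
        return 0
    return max(minimum, math.ceil(uptime_minutes))
```

An instance stopped after 3 minutes is still billed for 10 minutes, while one that ran for 125 minutes is billed for exactly 125.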


Pricing

4 machine types: standard (8 versions from 0.132$/hr to 0.922$/hr), shared-core (0.019$/hr), high memory (0.305$/hr to 1.2$/hr) and high CPU (0.16$/hr to 0.65$/hr).

Instances are billed on uptime (the number of minutes between when you start an instance and when you stop it).

Network pricing: free within the same zone and towards Google products.

Otherwise, the more you send, the less you pay per GB.

Load balancing: pay per load balancing rule by the hour.

Persistent disk: different prices for provisioned space and snapshots, paid per GB/month (0.10$ resp. 0.125$).

IO operations: 0.10$ per million.
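These rates can be combined into a rough monthly estimate. The Python sketch below uses the hourly, disk and IO prices quoted on this slide; prorating the hourly rate per minute and ignoring network and load balancing charges are simplifying assumptions, and `monthly_estimate` is a hypothetical helper.

```python
def monthly_estimate(hourly_rate: float, uptime_minutes: float,
                     disk_gb: float, io_operations: int,
                     disk_rate: float = 0.10,
                     io_rate_per_million: float = 0.10) -> float:
    """Rough cost in dollars for one instance over one month.

    Covers instance uptime, provisioned persistent disk and IO operations only.
    """
    instance = hourly_rate * uptime_minutes / 60  # hourly rate prorated per minute
    disk = disk_gb * disk_rate                    # provisioned space, per GB/month
    io = io_operations / 1_000_000 * io_rate_per_million
    return round(instance + disk + io, 2)


# Cheapest standard machine (0.132 $/hr) running for 30 days,
# with 100 GB of provisioned disk and 50 million IO operations.
example = monthly_estimate(0.132, 30 * 24 * 60, 100, 50_000_000)
```

In this example the instance accounts for about 95.04$, the disk for 10$ and the IO operations for 5$, for a total of about 110.04$.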


Pricing example 1


Pricing example 2


Comparison is hard:

hard to find a common ground between IaaS and PaaS.

many dimensions: compute, storage, network, scaling.

CloudCmp [3] compares the performance and cost of cloud providers.

Challenges:

what to measure?

how to measure the perceived performance of services?

Metrics considered: speed of CPU, memory, and disk I/O, scaling latency, storage service response time, time to reach consistency, network latency, and available bandwidth

CloudCmp compares: Amazon AWS, Rackspace CloudServers (IaaS), GAE, MS Azure (PaaS)

[3] Li, A. et al. (2010). CloudCmp: comparing public cloud providers. IMC ’10.


RAMCloud

A project on large-scale storage systems kept entirely in DRAM

Combination of scale and low latency

Scale: 10,000 servers with 64 GB of DRAM per server

Access latency: 5-10 µs remote procedure calls

Motivation: DRAM is 5-10 times faster than SSD (flash)
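The scale and latency figures translate into simple orders of magnitude. The Python sketch below uses decimal units (1 TB = 1000 GB) and assumes a client that issues one remote procedure call at a time; both function names are hypothetical.

```python
def total_capacity_tb(servers: int = 10_000,
                      dram_gb_per_server: int = 64) -> float:
    """Aggregate DRAM across the cluster, in decimal terabytes."""
    return servers * dram_gb_per_server / 1000


def sequential_rpcs_per_second(latency_us: int) -> int:
    """Calls per second for a client issuing one RPC at a time."""
    return 1_000_000 // latency_us
```

With the numbers on this slide, the cluster holds 640 TB of DRAM, and a single sequential client can issue between 100,000 (at 10 µs) and 200,000 (at 5 µs) calls per second.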


History


Configuration example


Take-away message

Image: Cloud Computing trends 0611
