Starting for the cloud -- two issues in cluster: resource allocation and overload management


Academic year: 2021


(1)

Starting for the cloud

-- two issues in cluster: resource allocation and overload management

Ziyou Wang, Yan Li, Chao You, Minghui Zhou

Peking University


(2)

Agenda

Cloud Computing: Challenges

Resource Allocation

  Shared cluster

  Resource allocation planning

Overload Management

  Examples

  Automatic degradation mechanism

(3)

Cloud Computing: Challenges

The emergence of cloud computing makes it cost-efficient for application providers to lease computing resources from a third-party provider.

Benefits: increased resource utilization, improved business agility, decreased power consumption…

But how to allocate the various cloud resources effectively among different applications is still an open problem.

When applications hosted in the cloud face overload, meaning that the demand on at least one of the cloud’s resources exceeds the capacity of that resource, what can we do to handle the situation?

(4)

Shared Cluster

Consider one kind of cloud implementation: given that the workloads of different web applications are not correlated, a large-scale cluster, called a shared cluster or data center, is maintained to host a large number of applications simultaneously.

Each application runs on a subset of nodes

Each node may run multiple applications

[Figure labels: Users, Enterprises]

(5)

Resource Allocation: a scenario

As the cluster’s resources are no longer occupied by a single application, the cluster must allocate resources on demand.

For example, when the workload of apps A and C increases:

Place new instances in the data center

Re-allocate the workload

[Figure: application users send requests through a dispatcher to the nodes (Node 1, Node 16, Node 99, Node 150, and other nodes). Each node runs middleware hosting some of apps A to D, the nodes are connected by a high-throughput, low-latency network, and the apps are stored in a repository.]

(6)

Self-adaptive Resource Allocation Model

[Figure: self-adaptive resource allocation, driven by incoming requests, comprising resource allocation planning and resource allocation execution]

(7)

Our Resource Allocation Work

[Figure: middleware architecture. Labeled components: management console (commands, messages), dispatcher (requests), repository, communicator, local valuator, resource allocation planning (with planning cooperation between middleware instances), resource allocation execution, resource partitioner, app deployer, and a virtual machine monitor hosting VMs that each run a customized JOnAS with an app.]

For resource allocation planning, we propose a decentralized approach:

Nodes decide their own resource allocation

Market-based coordination is adopted to help them make resource decisions

So far, the approach has been evaluated with a series of simulated experiments, and it is being implemented in the cluster with JOnAS

(8)

Resource Allocation Planning

To support application prioritization, applications can be assigned different utility values. Accordingly, the goal of resource management is to maximize the total utility value of the satisfied requests.

Inspired by human markets, we model the shared cluster as a market, where shares of application requests are treated as goods and nodes act as dealers that exchange goods.

Based on its local valuation of the goods, each node autonomously and continuously trades with others in order to find a combination of application shares that fits the node’s resource constraints and maximizes its income.
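The node-local decision described above can be sketched as follows. This is a minimal illustration, not the authors' valuation method: the `Share` fields, the per-request cost model, and the greedy income-per-resource-unit heuristic are all assumptions introduced here, and the slides do not say how a node actually searches for its share combination.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Share:
    """A share of one application's requests (hypothetical model)."""
    app: str
    req_per_s: float        # request rate carried by this share
    utility_per_req: float  # assumed utility value assigned to the app
    cost_per_req: float     # assumed resource units per request (> 0)

    @property
    def income(self) -> float:
        return self.req_per_s * self.utility_per_req

    @property
    def cost(self) -> float:
        return self.req_per_s * self.cost_per_req


def choose_shares(candidates, capacity):
    """Greedily pick shares by income per resource unit.

    Returns (chosen_shares, total_income). A greedy pass is a simple
    stand-in heuristic; it is not guaranteed to be optimal.
    """
    chosen, used, total = [], 0.0, 0.0
    ranked = sorted(candidates, key=lambda s: s.income / s.cost, reverse=True)
    for share in ranked:
        if used + share.cost <= capacity:
            chosen.append(share)
            used += share.cost
            total += share.income
    return chosen, total


candidates = [
    Share("A", 100, 3.0, 1.0),
    Share("B", 50, 1.0, 1.0),
    Share("C", 80, 2.0, 0.5),
]
chosen, income = choose_shares(candidates, capacity=150)
print([s.app for s in chosen], income)
```

With a capacity of 150 resource units, the node keeps the shares of apps C and A and leaves app B's share for other nodes, which is the kind of surplus a node would then try to trade.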

(9)

Resource Allocation Planning

When a node wants to sell, more than one node may want to buy.

To make the seller transfer the goods to the appropriate buyers, an

auction mechanism is adopted

[Figure: auction example for app C. Nodes shown, each running the middleware: Node 1 (app A, app C), Node 50 (app A, app B), Node 65 (app B, app C), Node 100 (app B, app D), plus the dispatcher. Step labels: 1. multicast; 2.1 valuation; 2.2 sell; 3. sort; 4. notify/inform. Node 1 offers (app C, 50%, 100 req/s); Node 50 wants 35% of C and Node 65 wants 20%; Node 1 sells 30% of C to Node 50 and 20% to Node 65. The dispatcher entries for app C change from N1 70%, N65 10% to N1 20%, N50 30%, N65 30%.]
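The auction steps can be sketched as below. The percentages follow the example on this slide (Node 1 offers 50% of app C, Node 50 wants 35%, Node 65 wants 20%); the bid prices, the function names, and the rule "highest price first, each buyer gets what is left" are assumptions made for illustration, since the slides do not specify how bids are valued or sorted.

```python
def run_auction(offered_pct, bids):
    """Allocate an offered share among bidders.

    bids: list of (node, wanted_pct, price) tuples, where price is a
    hypothetical local valuation. Returns {node: granted_pct}.
    """
    allocation = {}
    remaining = offered_pct
    # Step 3 (sort): serve the highest-priced bids first.
    for node, wanted, _price in sorted(bids, key=lambda b: b[2], reverse=True):
        if remaining <= 0:
            break
        granted = min(wanted, remaining)
        allocation[node] = granted
        remaining -= granted
    return allocation


def update_dispatcher(table, seller, allocation):
    """Step 4 (notify): move the sold shares in the dispatcher table."""
    updated = dict(table)
    for node, pct in allocation.items():
        updated[node] = updated.get(node, 0) + pct
        updated[seller] -= pct
    return updated


# Step 1 (multicast): Node 1 offers 50% of app C.
# Step 2 (valuation): buyers answer with bids; prices are made up.
bids = [("N50", 35, 1.0), ("N65", 20, 1.5)]
allocation = run_auction(50, bids)

app_c_table = {"N1": 70, "N65": 10}
print(update_dispatcher(app_c_table, "N1", allocation))
```

With these assumed prices, Node 65 is served first and receives its full 20%, and Node 50 receives the remaining 30%, which reproduces the dispatcher update shown on the slide.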

(10)

Our Resource Allocation Work


For resource allocation execution:

Integrate a VMM into the middleware

Automatically load the app and partition the resources at runtime via the VMM

Customize JOnAS for the app, and store the customized image in the repository

Proportional workload dispatching

Currently we use OpenVZ, a lightweight OS-level VMM, as a case study, and we are working on integrating it into the middleware
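Proportional workload dispatching can be illustrated with a smooth weighted round-robin. This is one well-known way to spread requests in proportion to per-node shares; the slides do not say which algorithm the dispatcher uses, and the node names and share values below are hypothetical.

```python
def dispatch_sequence(shares, n_requests):
    """Return the node chosen for each of n_requests requests.

    shares: {node: weight}, e.g. the percentage of an app's requests
    assigned to each node. Uses smooth weighted round-robin so that
    picks are interleaved rather than sent in bursts.
    """
    current = {node: 0 for node in shares}
    total = sum(shares.values())
    sequence = []
    for _ in range(n_requests):
        # Every node accumulates credit equal to its weight.
        for node, weight in shares.items():
            current[node] += weight
        # The node with the most credit serves the request and pays
        # back the total weight.
        chosen = max(current, key=current.get)
        current[chosen] -= total
        sequence.append(chosen)
    return sequence


shares = {"N1": 20, "N50": 30, "N65": 30}  # hypothetical shares of one app
sequence = dispatch_sequence(shares, 8)
counts = {node: sequence.count(node) for node in shares}
print(sequence, counts)
```

Over eight requests the three nodes receive 2, 3, and 3 requests respectively, matching the 20:30:30 ratio.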


(12)

Examples

On September 11th, 2001, for instance, the workload on a popular news web site increased by an order of magnitude in 30 min, with the workload doubling every 7 min in that period.

April 21st, 2010 was China's National Day of Mourning for the Yushu earthquake victims. Theatre and sporting performances were cancelled, karaoke bars were shut, and the culture ministry ordered the suspension of all online music, games, comics, films and TV shows.
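A quick arithmetic check of the first example: if the workload really doubled every 7 minutes throughout the 30-minute window, the implied growth is about 19.5x, which is consistent with "an order of magnitude". The function names are introduced here only for illustration.

```python
import math


def growth_factor(window_min, doubling_min):
    """Growth over window_min minutes when load doubles every doubling_min."""
    return 2 ** (window_min / doubling_min)


def minutes_to_reach(factor, doubling_min):
    """Minutes needed for the load to grow by the given factor."""
    return doubling_min * math.log2(factor)


print(round(growth_factor(30, 7), 1))      # growth over 30 minutes
print(round(minutes_to_reach(10, 7), 1))   # minutes until load is 10x
```

At that rate a tenfold increase is reached after roughly 23 minutes, which leaves very little time for manual intervention.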

(13)

When does overload happen?

Overload prevention is a critical goal, so that a system can remain operational in the presence of overload even when the incoming request rate is several times greater than the system’s capacity.

It is well known that the workload seen by Internet applications varies over multiple time-scales and often in an unpredictable fashion.

Unexpected things are always happening:

Being featured on national television or in a major newspaper.

(14)

The Taobao Architecture

Apache + Application Server + MySQL

200+ applications, thousands of components

12k servers

2k to 3k Java servers

Search

Product Browsing

Product Recommendation

Shop Cart

(15)

The Reality – Manual Service Degradation

In response to overload:

CNN replaced its front page with a simple HTML page that could be transmitted in a single Ethernet packet.

Taobao turned off a sub-system.

All these techniques are implemented manually, though a better approach would be to degrade service gracefully and automatically in response to load.

Which point causes the overload?

Which resource is the bottleneck?

Which service should be degraded or turned off?

(16)

Automatic Degradation Mechanism

Overload Priority defines the priorities of different services and the degradation actions that can be taken.

Overload Detection is responsible for signaling that the application has entered an unstable state.

Overload Localization is triggered to locate the resource bottleneck.

Overload Controller takes appropriate actions to degrade some non-essential services, releasing resources to support key services.

[Figure: the mechanism takes performance metrics from the application's services as input; Overload Detection, Overload Localization, and Overload Controller, guided by Overload Priority, produce degradation actions that are applied to the services.]
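The four parts of the mechanism can be sketched as one control step. This is a hedged illustration only: the utilization threshold, the priority scale, the rule "turn off every service below a priority cut-off", and all function names are assumptions, and the service names are borrowed from the Taobao slide purely as sample data.

```python
from dataclasses import dataclass


@dataclass
class Service:
    name: str
    priority: int        # assumed scale: higher means more important
    enabled: bool = True


def detect_overload(utilization, threshold=0.9):
    """Overload Detection: any resource above the assumed threshold."""
    return any(value > threshold for value in utilization.values())


def localize_bottleneck(utilization):
    """Overload Localization: the most heavily utilized resource."""
    return max(utilization, key=utilization.get)


def degrade(services, keep_priority):
    """Overload Controller: turn off services below keep_priority."""
    turned_off = []
    for service in sorted(services, key=lambda s: s.priority):
        if service.enabled and service.priority < keep_priority:
            service.enabled = False
            turned_off.append(service.name)
    return turned_off


def control_step(services, utilization, keep_priority=2):
    """Run detection, localization, and control once."""
    if not detect_overload(utilization):
        return None, []
    return localize_bottleneck(utilization), degrade(services, keep_priority)


# Overload Priority: hypothetical priorities for sample services.
services = [
    Service("search", 3),
    Service("product browsing", 3),
    Service("product recommendation", 1),
    Service("shop cart", 2),
]
utilization = {"cpu": 0.97, "memory": 0.60, "network": 0.40}
result = control_step(services, utilization)
print(result)
```

Here the CPU is reported as the bottleneck and only the lowest-priority service is switched off. A real controller would also need a rule for restoring services once the load subsides, which the slides do not describe.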

(17)

Automatic Application Degradation

Cluster level degradation

Coarse-grained

  Sub-system level degradation

Resource management

Service differentiation

Node level degradation

Fine-grained

  Component level degradation

(18)

Considerations

Hard to make transparent to the user (what can be degraded, and sometimes how?)

Used alone, it can help delay overload, but it needs to be combined with other techniques to be fully effective:

Dynamic resource allocation

Admission control

Service differentiation
