DynamicCloudSim: Simulating Heterogeneity in Computational Clouds


Marc Bux, Ulf Leser

{bux|leser}@informatik.hu-berlin.de

The 2nd International Workshop on Scalable Workflow Enactment Engines and Technologies (SWEET'13)

(2)
(3)
(4)
(5)

Small Instance: 1.7 GB RAM, 1 EC2 Compute Unit, 160 GB local storage

Compute Unit: equivalent to the CPU capacity of a 1.0-1.2 GHz Opteron or Xeon

No guarantees with respect to I/O throughput, network delay, or bandwidth

(6)

Any one cloud instance is unlike another.

(7)

Heterogeneity in EC2 Cloud Instances

Different CPUs on physical host systems [Jackson10, Schad10]
– Intel Xeon E5430 (2.66 GHz quad)
– AMD Opteron 270 (2 GHz dual)
– AMD Opteron 2218 HE (2.6 GHz dual)

I/O throughput varies as well [Dejun09]
– No correlation between CPU and I/O performance

Figure: Amazon EC2 performance [Schad10]. Source: [Dejun09]

(8)

Dynamic Changes of Performance

Occasional CPU performance slumps and failures during task execution [Dejun09, Jackson10]

Variance in I/O and network throughput [Zaharia08, Jackson10]

Performance depends on hour of day and day of week [Schad10]

Figure: EC2 disk performance vs. VM co-allocation [Zaharia08]

(9)

Vision

Adaptive scheduling of scientific workflows

Exploit heterogeneous resources

(10)

Vision

The standard approach for evaluation is simulation [Braun01, Blythe05]
(11)

(12)

Agenda

1) Simulating Heterogeneity in Computational Clouds

2) Evaluating Established Workflow Schedulers

3) Summary and Outlook

(13)

CloudSim

Datacenter

Host

VM

Task

R. N. Calheiros, R. Ranjan, A. Beloglazov, C. A. F. De Rose, R. Buyya (2011), CloudSim: a toolkit for modeling and simulation of cloud computing environments and evaluation of resource provisioning algorithms, Software - Practice and Experience 41(1):23-50.

More than 250 citations in Google Scholar

(14)

DynamicCloudSim

Datacenter

Heterogeneous Host

Dynamic VM

Error-prone Task

Extend CloudSim with models for

1. Heterogeneous computational resources (Het)
2. Dynamic changes of performance at runtime (DCR)
3. Straggler VMs and failed task executions (SaF)

More fine-grained representation of computational resources
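
The SaF model can be illustrated with a minimal sketch. It assumes, hypothetically, that a straggler VM runs at a fixed fraction of its nominal performance and that a failed task is re-executed after consuming its full nominal runtime. The function names, the 0.5 coefficient, and the retry policy are illustrative assumptions, not DynamicCloudSim's actual API or defaults.

```python
import random


def is_straggler(rng, likelihood):
    """Decide at allocation time whether a VM is a straggler."""
    return rng.random() < likelihood


def effective_mips(base_mips, straggler, coefficient=0.5):
    """Assumed model: a straggler delivers only `coefficient` of its nominal MIPS."""
    return base_mips * coefficient if straggler else base_mips


def run_task(rng, length_mi, mips, failure_likelihood, max_attempts=100):
    """Return (total runtime in seconds, number of attempts).

    Assumed model: every attempt costs the full nominal runtime, and a
    failed attempt is simply repeated until one succeeds.
    """
    nominal = length_mi / mips
    total = 0.0
    for attempt in range(1, max_attempts + 1):
        total += nominal
        if rng.random() >= failure_likelihood:
            return total, attempt
    raise RuntimeError("task did not succeed within max_attempts")


if __name__ == "__main__":
    rng = random.Random(7)
    likelihood = 0.025  # largest SaF likelihood shown in the evaluation slides
    mips = effective_mips(1000.0, is_straggler(rng, likelihood))
    print(run_task(rng, 50_000.0, mips, likelihood))
```

Keeping the random generator as an explicit argument makes a simulation run repeatable, which matters when several schedulers are compared on the same sampled infrastructure.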

(15)

Realism – can we ever get there?

Simulation can never perfectly resemble reality

We model inhomogeneity and dynamic changes by sampling from normal distributions

Default mean and standard deviation (STD) or relative standard deviation (RSD) parameters are obtained from [Zaharia08, Dejun09, Jackson10, Schad10, Iosup11]
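
A minimal sketch of sampling heterogeneous VM performance from a normal distribution, parameterized by a mean and an RSD (standard deviation divided by mean). The function names, the 1000 MIPS mean, and the clamping of negative samples to zero are assumptions made for illustration; they are not taken from DynamicCloudSim.

```python
import random


def sample_performance(rng, mean, rsd, floor=0.0):
    """Draw one performance value from N(mean, (rsd * mean)^2).

    Samples below `floor` are clamped (an assumption of this sketch), since a
    normal distribution can yield negative values for large RSD.
    """
    std = rsd * mean
    return max(floor, rng.gauss(mean, std))


def provision_vms(rng, count, mean_mips, rsd):
    """Assign each of `count` VMs its own sampled CPU performance (Het)."""
    return [sample_performance(rng, mean_mips, rsd) for _ in range(count)]


if __name__ == "__main__":
    rng = random.Random(42)
    # RSD values taken from the x-axis of the evaluation slides.
    for rsd in (0.0, 0.125, 0.25, 0.375, 0.5):
        vms = provision_vms(rng, 8, 1000.0, rsd)
        print(rsd, [round(v) for v in vms])
```

With an RSD of 0 every VM receives exactly the mean, which corresponds to a homogeneous infrastructure.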
(16)

Simulating VM Performance: DynamicCloudSim (DCS) vs CloudSim (CS)

1. Heterogeneous computational resources (Het)
2. Dynamic changes of performance at runtime (DCR)
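
To make the time dimension of DCR concrete, the following sketch re-samples a VM's performance around its baseline once per interval and computes how long a task takes under that trace. A DCR RSD of 0 gives a constant trace. The fixed interval length, the per-interval re-sampling, and all names are simplifying assumptions of this sketch rather than the DynamicCloudSim implementation.

```python
import random


def performance_trace(rng, baseline, dcr_rsd, steps):
    """Performance of one VM over `steps` intervals, sampled around `baseline`."""
    return [max(0.0, rng.gauss(baseline, dcr_rsd * baseline)) for _ in range(steps)]


def runtime(length, trace, step_seconds=1.0):
    """Seconds needed to process `length` work units under the given trace."""
    if length <= 0:
        return 0.0
    remaining = length
    elapsed = 0.0
    for perf in trace:
        capacity = perf * step_seconds
        if capacity >= remaining:
            # The task finishes partway through this interval.
            return elapsed + remaining / perf
        remaining -= capacity
        elapsed += step_seconds
    raise ValueError("trace too short to finish the task")


if __name__ == "__main__":
    rng = random.Random(3)
    static = performance_trace(rng, 100.0, 0.0, 50)    # constant performance
    dynamic = performance_trace(rng, 100.0, 0.25, 50)  # varies at runtime
    print(runtime(2500.0, static), runtime(2500.0, dynamic))
```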

(17)

Agenda

1) Simulating Heterogeneity in Computational Clouds

2) Evaluating Established Workflow Schedulers

a) Scheduling Scientific Workflows

b) Evaluation Workflows

c) Evaluation Results

(18)

(19)

Scheduling of Scientific Workflows

Scheduling: Mapping tasks to the available physical resources

Usual goal: minimize overall execution time

Static Scheduling:

Schedule is assembled prior to workflow execution

Schedule is strictly adhered to at runtime

Adaptive Scheduling:

Monitor computational infrastructure

(20)

Static Schedulers

Baseline: Round Robin

Assign tasks to resources in turn

Equal number of tasks per resource

Elaborate: HEFT (Heterogeneous Earliest Finish Time) [Topcuoglu02]

Implemented in the SWfMS Pegasus

Requires runtime estimates for each task on each resource

Assign the tasks with the longest time to finish to a fixed timeslot on a suitable (well-performing) resource
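
The two static strategies can be sketched as follows. The HEFT variant is deliberately simplified: it ranks tasks by their longest remaining path to the end of the workflow (the upward rank, using average runtime estimates) and places each task on the resource with the earliest finish time. It ignores data transfer costs and the insertion-based slot search of the original algorithm, and it assumes strictly positive runtime estimates. All names are illustrative.

```python
from functools import lru_cache


def round_robin(tasks, resources):
    """Assign tasks to resources in turn."""
    return {t: resources[i % len(resources)] for i, t in enumerate(tasks)}


def heft(successors, cost):
    """Simplified HEFT.

    successors: task -> list of successor tasks (every task is a key)
    cost:       task -> {resource: runtime estimate}
    Returns (assignment, finish_times).
    """
    predecessors = {t: [] for t in successors}
    for t, succs in successors.items():
        for s in succs:
            predecessors[s].append(t)

    @lru_cache(maxsize=None)
    def rank(t):
        # Upward rank: average runtime plus the longest path through successors.
        avg = sum(cost[t].values()) / len(cost[t])
        return avg + max((rank(s) for s in successors[t]), default=0.0)

    ready = {}       # resource -> time at which it becomes free
    finish = {}      # task -> scheduled finish time
    assignment = {}  # task -> resource
    # Descending rank places every task after all of its predecessors.
    for t in sorted(successors, key=rank, reverse=True):
        data_ready = max((finish[p] for p in predecessors[t]), default=0.0)

        def eft(r):
            return max(ready.get(r, 0.0), data_ready) + cost[t][r]

        best = min(cost[t], key=eft)
        finish[t] = eft(best)
        ready[best] = finish[t]
        assignment[t] = best
    return assignment, finish


if __name__ == "__main__":
    succ = {"a": ["b", "c"], "b": [], "c": []}
    est = {
        "a": {"fast": 2.0, "slow": 4.0},
        "b": {"fast": 3.0, "slow": 6.0},
        "c": {"fast": 3.0, "slow": 5.0},
    }
    print(round_robin(list(succ), ["fast", "slow"]))
    print(heft(succ, est))
```

Both schedules are fixed before execution starts, so neither can react if the runtime estimates turn out to be wrong at runtime.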

(21)

Adaptive Schedulers

Baseline: Greedy Task Queue

Assign tasks to resources at runtime in first-come-first-served manner

Adapts to changes of performance at runtime (DCR)

Elaborate: LATE (Longest Approximate Time to End) [Zaharia08]

Developed for Hadoop to increase robustness to instability

10% of tasks progressing at a below-average rate are replicated and speculatively executed

Exploit dynamic changes of performance
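
A minimal sketch of both adaptive strategies. The greedy queue hands the next waiting task to whichever resource becomes idle first. The LATE part estimates each running task's time to end as (1 - progress) / progress rate and selects slow tasks for speculative re-execution. Reading the 10% as a cap relative to the number of running tasks is one interpretation of the slide, and all names and the data layout are assumptions of this sketch.

```python
import heapq


def greedy_queue(task_lengths, speeds):
    """First-come-first-served: each task goes to the resource that is idle first.

    Returns (makespan, list of resource indices in task order).
    """
    idle = [(0.0, r) for r in range(len(speeds))]
    heapq.heapify(idle)
    assignment = []
    makespan = 0.0
    for length in task_lengths:
        free_at, r = heapq.heappop(idle)
        done = free_at + length / speeds[r]
        assignment.append(r)
        makespan = max(makespan, done)
        heapq.heappush(idle, (done, r))
    return makespan, assignment


def late_candidates(running, cap_fraction=0.1):
    """Pick tasks to replicate speculatively.

    running: task -> (progress in (0, 1], elapsed seconds > 0)
    Only tasks with a below-average progress rate qualify; among those, the
    ones with the longest approximate time to end are chosen first.
    """
    if not running:
        return []
    rates = {t: progress / elapsed for t, (progress, elapsed) in running.items()}
    mean_rate = sum(rates.values()) / len(rates)
    time_left = {
        t: (1.0 - running[t][0]) / rate
        for t, rate in rates.items()
        if rate < mean_rate
    }
    cap = max(1, int(cap_fraction * len(running)))
    return sorted(time_left, key=time_left.get, reverse=True)[:cap]


if __name__ == "__main__":
    print(greedy_queue([10.0, 10.0, 10.0, 10.0], [2.0, 1.0]))
    state = {"t1": (0.5, 10.0), "t2": (0.5, 10.0), "t3": (0.1, 10.0), "t4": (0.2, 10.0)}
    print(late_candidates(state))
```

In the greedy example the faster resource ends up with three of the four tasks because it becomes idle more often, without the scheduler needing any runtime estimates.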

(22)

(23)
(24)

Abstract Montage Workflow

(25)

Concrete Montage Workflow

43,318 tasks reading and writing 534 GB of data

10 GB of input files which have to be uploaded to the cloud

(26)
(27)
(28)

Concrete Genomics Workflow

Align 10% of the reads produced in a sequencing experiment against the smallest of human chromosomes (chr22)

Use about 0.2% of the available data

4,266 tasks reading and writing 436 GB of data (2.3 GB upload)

Indexing (bowtie, SHRiMP, PerM)
Alignment (bowtie, SHRiMP, PerM)
Convert (samtools view)
Sort (samtools sort)
Merge (merge)
Preprocess (samtools mpileup)
Variant calling (VarScan)
“Sense-Making” (VCFTools)
Upload to cloud

(29)

(30)

Runtime depending on Heterogeneity (Het)

Average runtime in minutes by RSD parameter for heterogeneous resources (Het)

First panel:
RSD     Static Round Robin   HEFT   Greedy Queue   LATE
0       368                  304    296            311
0.125   371                  301    300            308
0.25    450                  296    303            315
0.375   715                  296    308            313
0.5     1314                 286    300            300

Second panel:
RSD     Static Round Robin   HEFT   Greedy Queue   LATE
0       203                  143    163            178
0.125   220                  148    163            179
0.25    275                  150    166            177
0.375   602                  152    187            182
0.5     747                  149    195            185

(31)

Runtime depending on Dynamic Changes (DCR)

Average runtime in minutes by RSD parameter for dynamic changes at runtime (DCR)

First panel:
RSD     Static Round Robin   HEFT   Greedy Queue   LATE
0       368                  304    296            311
0.125   352                  301    296            317
0.25    394                  357    299            308
0.375   465                  439    311            299
0.5     574                  530    307            289

Second panel:
RSD     Static Round Robin   HEFT   Greedy Queue   LATE
0       203                  143    163            178
0.125   216                  165    166            176
0.25    241                  190    165            179
0.375   295                  255    170            180
0.5     393                  314    207            177

(32)

Runtime with Stragglers and Failures (SaF)

Average runtime in minutes by likelihood of straggler VMs and failed tasks (SaF)

First panel:
Likelihood   Static Round Robin   HEFT   Greedy Queue   LATE
0            368                  304    296            311
0.00625      598                  405    396            316
0.0125       876                  659    586            317
0.01875      1365                 962    790            316
0.025        2559                 1291   1137           321

Second panel:
Likelihood   Static Round Robin   HEFT   Greedy Queue   LATE
0            203                  143    163            178
0.00625      352                  262    237            180
0.0125       617                  411    444            187
0.01875      1025                 604    635            188
0.025        1990                 984    1125           195

(33)

That’s all well and good, but…

Scheduling in SWfMS: Static or Greedy Task Queue

HEFT and LATE have a computational overhead and require information not available in real scenarios:

HEFT: runtime estimates of each task on each machine

LATE: progress rate of each running task

Untapped optimization potential: multiple resource scheduling

(34)

Summary and Outlook

EC2: Heterogeneity and instability in VM performance

DynamicCloudSim introduces several factors of instability into CloudSim

Simulation experiments reproduce known strengths and shortcomings of established schedulers

(35)

Thanks for your attention!

(36)

(37)

Literature

[Braun01] T. D. Braun, H. J. Siegel, N. Beck, L. L. Boloni, M. Maheswaran, A. I. Reuther, J. P. Robertson, M. D. Theys, B. Yao, D. Hensgen, R. F. Freund (2001), A Comparison Study of Eleven Static Heuristics for Mapping a Class of Independent Tasks onto Heterogeneous Distributed Computing Systems, Journal of Parallel and Distributed Computing 61:810–837.

[Blythe05] J. Blythe, S. Jain, E. Deelman, Y. Gil, K. Vahi, A. Mandal, K. Kennedy (2005), Task Scheduling Strategies for Workflow-based Applications in Grids, in: Proceedings of the 5th IEEE International Symposium on Cluster Computing and the Grid, volume 2, Cardiff, UK, pp. 759–767.

(38)

Literature (cont.)

[Jackson10] K. R. Jackson, et al. (2010), Performance Analysis of High Performance Computing Applications on the Amazon Web Services Cloud, in: Proceedings of the 2nd International Conference on Cloud Computing Technology and Science, Indianapolis, USA, pp. 159-168.

[Dejun09] J. Dejun, et al. (2009), EC2 Performance Analysis for Resource Provisioning of Service-Oriented Applications, in: Proceedings of the 7th International Conference on Service Oriented Computing, Stockholm, Sweden, pp. 197-207.

[Zaharia08] M. Zaharia, et al. (2008), Improving MapReduce Performance in Heterogeneous Environments, in: Proceedings of the 8th USENIX Symposium on Operating Systems Design

(39)

Literature (cont.)

[Schad10] J. Schad, J. Dittrich, J.-A. Quiané-Ruiz (2010), Runtime Measurements in the Cloud: Observing, Analyzing, and Reducing Variance, Proceedings of the VLDB Endowment 3(1):460–471.

[Iosup11] A. Iosup, N. Yigitbasi, D. Epema (2011), On the Performance Variability of Production Cloud Services, in: Proceedings of the 2011 11th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing, Newport Beach, California, USA, pp. 104–113.

(40)

Literature (cont.)

[Topcuoglu02] H. Topcuoglu, S. Hariri, M.-Y. Wu (2002), Performance-Effective and Low-Complexity Task Scheduling for Heterogeneous Computing, IEEE Transactions on Parallel and Distributed Systems 13(3):260-274.

[Berriman04] G. B. Berriman, et al. (2004), Montage: a grid-enabled engine for delivering custom science-grade mosaics on demand, in: Proceedings of the SPIE Conference on Astronomical Telescopes and Instrumentation, volume 5493, Glasgow, Scotland, pp. 221-232.

https://code.google.com/p/cloudsim/
https://code.google.com/p/dynamiccloudsim/
