Experiments with Complex Scientific Applications
on Hybrid Cloud Infrastructures
Maciej'Malawski
1,2
,'Piotr'Nowakowski
1
,'Tomasz'Gubała
1
,'Marek'Kasztelnik
1
,'
Marian'Bubak
1,2
,'Rafael'Ferreira'da'Silva
3
,'Ewa'Deelman
3
,'Jarek'Nabrzyski
4'
'
NSFCloud'Workshop'on'Experimental'Support'for'Cloud'CompuOng''
December'11Q12,'2014,'Arlington,'VA'
AGH University of Science and Technology:
1
ACC Cyfronet AGH, ul. Nawojki 11, 30-950 Kraków, Poland
2
Department of Computer Science, al. Mickiewicza 30, 30-095 Kraków, Poland
3
University of Southern California, Information Sciences Institute, Marina Del Rey, CA, USA
4
Center for Research Computing, University of Notre Dame, IN, USA
2
Research Challenges
•
Execution of complex scientific applications on clouds:
workflows and their ensembles
•
Pegasus Workflow Management System
(OCI SI2-SSI #1148515)
•
HyperFlow Workflow Engine
•
Platform for deployment and sharing of scientific applications
on hybrid clouds
•
Atmosphere Framework
•
Algorithms for scheduling, provisioning and cost optimization:
•
Dynamic and Static Algorithms
•
Mathematical Programming
•
Cloud Workflow Simulator
3
Research:
The Atmosphere Framework
Hybrid cloud as a means of provisioning computing power for virtual experiments
Cloud'Management'Portlets'
GUI'host'
(provisions'endQuser'
features'and'access'opOons)'
Provide'GUI'elements'which'enable'service'developers'and'
end'users'to'interact'with'the'Atmosphere'plaYorm'and'
create/deploy'services'on'the'available'cloud'resources'
Atmosphere'Core'
Services'Host'
user'accounts'
Atmosphere'Registry'(AIR)'
available'cloud'sites'
services'and'templates'
Atmosphere'Core'
Secure'RESTful'API'
(Cloud'Facade)'
•
AuthenOcaOon'and'authorizaOon'logic'
•
CommunicaOon'with'underlying'computaOonal'clouds'
•
Launching'and'monitoring'service'instances'
•
CreaOng'new'service'templates'
•
Billing'and'accounOng'
•
Logging'and'administraOve'services'
Worker'
Node'
Worker'
Node'
Worker'
Node'
Head'
Node'
Image'store'
OpenStack'cloud'site'at'ACC'CYFRONET'AGH'
96'CPU'cores'
184'GB'RAM'
4'TB'storage'
private'IP'space'
Worker'node'w/large'
resource'pool'(“fat'node”)'
Head'
Node'
Image'store'
VPHQShare'cloud'site'at'UNIVIE'
Worker'node'w/large'
resource'pool'(“fat'node”)'
128'CPU'cores'
256'GB'RAM'
4'TB'storage'
private'IP'space'
API'
host'
Image'store'
Amazon'ElasOc'Compute'Cloud'(EC2)'–'European'availability'zone'
Massive'
(funcOonally'
limitless)'
hardware'
resource'
pool'
public'IP'space'
Worker'
Node'
Worker'
Node'
4
Research:
Simulation and Scheduling of Large-Scale
Scientific Workflows on IaaS Clouds
•
Large-scale scientific workflows from
Pegasus WMS
•
Workflows of 100,000 tasks
•
Workflow Ensembles
•
Schedule as many workflows as possible
within a budget and deadline
•
Uses a Cloud Workflow Simulator
4
Time
V
M
M. Malawski, G. Juve, E. Deelman, J. Nabrzyski: Cost- and deadline-constrained provisioning
for scientific workflow ensembles in IaaS clouds. SC 2012: 22
5
Research:
Cost Optimization of Applications on Clouds
•
Infrastructure model
•
Multiple compute and storage
clouds
•
Heterogeneous instance types
•
Application model
•
Bag of tasks
•
Multi-level workflows
•
Modeling with AMPL and
CMPL
•
Modeling Language for
Mathematical Programming
•
Cost optimization
•
Under deadline constraints
•
Mixed integer programming
•
Bonmin, Cplex solvers
M. Malawski, K. Figiela, J. Nabrzyski,
Cost minimization for computational applications on hybrid cloud infrastructures
,
Future Generation Computer Systems, 29(7), 2013, pp.1786-1794,
http://dx.doi.org/10.1016/j.future.2013.01.004
M. Malawski, K. Figiela, M. Bubak, E. Deelman, J. Nabrzyski,
Cost Optimization of Execution of Multi-level
Deadline-Constrained Scientific Workflows on Clouds
. PPAM, 2013, 251-260
http://dx.doi.org/10.1007/978-3-642-55224-3_24
0 500 1000 1500 2000 2500 3000 0 10 20 30 40 50 60 70 80 90 100 Cost ($)
Time limit (hours)
20000 tasks, 512 MiB input and 512 MiB output, task execution time 0.1h @ 1ccu machine
Rackspace instances Rackspace and private instances
Amazon's and private instances Multiple providers
Amazon S3 Rackspace Cloud Files Optimal Layer 1 A Layer 2 B B B C Layer 3 D Layer 4 E Layer 5 F 1h 2.5 h 0.5 h 0.3 h 2 h 6 h Private cloud Compute private Amazon Storage Compute m1.small m1.large t1.micro m2.xlarge Task Input Output Application Rackspace Storage Compute rs.1gb rs.2gb rs.4gb rs.16gb
6
Research:
Cloud Performance Evaluation
•
Performance of VM deployment times
•
Virtualization overhead
•
Evaluation of open source cloud
stacks
•
Eucalyptus, OpenNebula, OpenStack
•
Survey of European public cloud
providers
•
Performance evaluation of top cloud
providers
•
EC2, RackSpace, SoftLayer
•
A grant from Amazon has been obtained
6
M. Bubak, M. Kasztelnik, M. Malawski, J. Meizner, P. Nowakowski, S. Varma,
Evaluation of Cloud
Providers for VPH Applications
, poster at CCGrid2013, Delft, the Netherlands, pp.13-16, 2013
IaaS Provider Zoning EEA jClouds API Support BLOB storage support Per-hour instance
billing Access API Published price VM Image Import / Export Relational DB support Score Weight 20 20 10 5 5 5 3 2 1 Amazon AWS 1 1 1 1 1 1 0 1 27 2 Rackspace 1 1 1 1 1 1 0 1 27 3 SoftLayer 1 1 1 1 1 1 0 0 25 4 CloudSigma 1 1 0 1 1 1 1 0 18 5 ElasticHosts 1 1 0 1 1 1 1 0 18 6 Serverlove 1 1 0 1 1 1 1 0 18 7 GoGrid 1 1 0 1 1 1 0 0 15 8 Terremark ecloud 1 1 0 1 1 0 1 0 13 9 RimuHosting 1 1 0 0 1 1 0 1 12 10 Stratogen 1 1 0 0 1 0 1 0 8 11 Bluelock 1 1 0 0 1 0 0 0 5 12 Fujitsu GCP 1 1 0 0 1 0 0 0 5 13 BitRefinery 0 0 0 0 0 1 0 1 0 14 BrightBox 1 0 0 1 1 1 1 0 0 15 BT Global Services 1 0 0 0 1 0 1 0 0 16 Carpathia Hosting 1 0 0 0 0 0 1 0 0 17 City Cloud 1 0 0 1 1 1 0 0 0 18 Claris Networks 0 0 0 1 0 0 0 0 0 19 Codero 0 0 0 1 1 1 0 0 0 20 CSC 1 0 0 0 0 0 1 0 0 21 Datapipe 1 0 0 1 1 0 0 0 0 22 e24cloud 1 0 0 1 0 1 0 0 0 23 eApps 0 0 0 0 0 1 0 0 0 24 FlexiScale 1 0 0 1 1 1 1 0 0 25 Google GCE 1 0 1 1 1 1 0 1 0
26 Green House Data 0 0 0 0 1 0 1 0 0
27 Hosting.com 0 0 0 0 0 1 1 1 0 28 HP Cloud 0 1 1 1 1 1 1 1 0 29 IBM SmartCloud 0 0 1 1 1 1 0 1 0 30 IIJ GIO 0 0 0 0 0 0 0 0 0 31 iland cloud 1 0 0 1 0 1 1 0 0 32 Internap 0 0 1 1 1 1 0 0 0 33 Joyent 0 0 0 1 1 1 0 0 0 34 LunaCloud 1 0 1 1 1 1 0 0 0 35 Oktawave 1 0 1 1 1 1 0 1 0 36 Openhosting.co.uk 1 0 0 0 0 1 0 0 0 37 Openhosting.com 0 1 0 1 1 1 1 0 0 38 OpSource 1 0 1 1 1 1 1 0 0 39 ProfitBricks 1 0 0 1 1 1 0 0 0 40 Qube 1 0 0 0 0 1 0 0 0 41 ReliaCloud 0 0 0 0 0 0 0 0 0 42 SaavisDirect 0 0 1 1 0 1 0 0 0 43 SkaliCloud 0 1 0 1 1 1 1 0 0 44 Teklinks 0 0 0 0 0 0 0 0 0 45 Terremark vcloud 0 1 0 1 1 1 1 0 0 46 Tier 3 0 0 0 0 1 0 0 0 0 47 Umbee 1 0 0 1 1 1 1 0 0 48 VPS.net 1 0 0 0 1 1 0 0 0 49 Windows Azure 1 0 1 1 1 1 0 1 0
7
Experiment:
Evaluation of autoscaling techniques for
Atmosphere cloud platform
•
Challenges
•
Requires repeated tests under
varying workloads
•
Experiments in an isolated
environment
•
Goals
•
Perform autoscaling based on:
•
Complex event processing
•
Time series database
•
Build an isolated environment on
8
Experiment:
Scalability of Scientific Workflows in
HyperFlow Model
•
Challenges
•
Issues on data transfers and data locality
•
Calibrate the performance models of applications
•
Goals
•
Execute large-scale deployments on multi-site NSFCloud facilities
•
Assess the impact of network latency and bandwidth limitations
8
PaaSage application
Hyperflow
engine
scheduler
Task
Cloud
Executor 1
Executor 1
VMs
Workflow
components
Executor
RabbitMQ
Job queue
Results
Monitoring
Ready tasks
Scheduled tasks
Redis
CAMEL
model
Workflow
graph
Workflow
CAMEL
generator
Workflow
generator
PaaSage platform
Upperware
Executionware
Metrics
Metrics
9
Experiment:
Influence of Variability of Clouds on the
Quality of Algorithms
•
Challenges
•
Static scheduling methods assume that the
estimates of task runtimes are available
•
The runtime variations and various
uncertainties influence the actual execution
•
Goals
•
A large-scale experimental testbed will allow
investigating the influence of the uncertainties
•
Development of new models to mitigate
uncertainties negative effects
0.0 0.5 1.0 1.5 2.0