(1)

Scientific and Technical Applications as a Service in the Cloud

Swiss Distributed Computing Day

University of Bern, 28.11.2011 – adapted version

Wibke Sudholt

CloudBroker GmbH

Technoparkstrasse 1, CH-8005 Zurich, Switzerland

Phone: +41 44 633 79 34

Email: [email protected]

Web: http://www.cloudbroker.com

(2)

All rights reserved. © CloudBroker GmbH

Overview

•  High performance computing (HPC) in the cloud

•  CloudBroker Platform

•  Example: Protein modeling in the IBM Cloud for ETH Zurich

•  EuroCloud Swiss

28.11.2011 2 Swiss Distributed Computing Day

(3)
(4)

Cloud Terms

Utility Computing

Computing on Demand

Software as a Service

Infrastructure as a Service

Platform as a Service

Multi-Tenancy

Public Cloud

Private Cloud

Cloud Storage

Pay-per-Use

Elasticity

Scalability

Hybrid Cloud

Cloud Bursting

Self Service

Virtualization

SOA

Clusters

Grids

ASP

Web Services

Internet

(5)

Cloud Computing Definition

•  Access to computer resources on demand without much initial investment in time or money (self service)

•  Only pay for what is actually used in small steps (OpEx instead of CapEx)

•  Nearly unlimited scalability (elasticity)

= Change in business model

(6)

Cloud Services

Software as a Service (SaaS)

•  Web / office / business applications, …

•  Salesforce, Google Apps, ...

Platform as a Service (PaaS)

•  Development / deployment frameworks, distribution / messaging / monitoring systems, databases, …

•  Microsoft Windows Azure, Google App Engine, ...

Infrastructure as a Service (IaaS)

•  Computing power, virtual machines, storage space, …

•  Amazon Web Services, IBM SmartCloud Enterprise, ...

(7)

Problems of Traditional HPC

•  Scientific and technical applications

–  Complex algorithms and applications needing HPC resources (supercomputers, clusters, grids)

–  Mainly used in research and development (R&D), often project-based, with increasing importance

•  HPC computer infrastructure, middleware tools and application software

–  Require expert knowledge

–  Expensive, time-consuming and complex to buy, set up, use and maintain

–  Hard to integrate with existing systems and processes

–  Often operating at capacity limit

⇒  HPC is hardly accessible or affordable for SMEs / small research groups, specialized application purposes or short-term projects

(8)

Advantages of Cloud for HPC

•  Immediate access to computer resources on demand

•  Availability of resources not existing in-house

•  Possibility for spill-over / cloud bursting

•  Temporary, non-binding utilization

•  Pay-per-use with minimal initial investment

•  Nearly unlimited scalability

•  Hardware and, in part, software maintained by cloud providers

(9)

Challenges of Cloud for HPC

•  HPC infrastructure, middleware and applications remain complex to set up, use and maintain, even in the cloud

•  Dynamic features of the cloud and pay-per-use billing add to the complexity

•  Performance limitations for some HPC calculations due to virtualization and available hardware

•  Security concerns for R&D because of outsourcing, internationality, SLAs, multi-tenancy and potential vendor lock-in

•  Hardware and software vendors have to adapt to the pay-per-use and self-service business model

(10)

CloudBroker Platform

(11)

Solutions of CloudBroker GmbH

•  Easy, scalable, secure, integrable and pay-per-use access to scientific and technical applications in the cloud

•  HPC application store / marketplace with direct deployment and execution of applications in the cloud and one bill for everything

•  Using infrastructure as a service (IaaS) from cloud providers

•  Offering platform as a service (PaaS) for software vendors

•  Providing software as a service (SaaS) to end users

•  Application parameters and files remain the same as for local execution and can be easily exported

(12)

CloudBroker Platform

[Figure: architecture of the CloudBroker Platform. Users: R&D end users and software vendors. Access: Web Browser UI, Web Service API, CLI, and CloudBroker integration into generic workbenches and domain-specific gateways. Applications: bioinformatics, molecular modeling, fluid dynamics, … Clouds: Amazon Cloud, IBM Cloud, … Cloud.]

(13)

Business Model

[Figure: business model diagram with the parties Cloud Broker, End Users, Software Vendors and Cloud Providers, and the flows Resources, Applications, Usage and $ (payments).]

(14)

Functionality

[Figure: functional components of the platform, enclosed in a security frame (transport layer security, access rights security). Users: end users, clients, software vendors. Interfaces: Web Browser UI, Web Service API, portals. Modules: User Manager, Application Manager, Process Manager, Process Monitor, Resource Manager, Queuing System, Storage Manager, Image Manager, Scalability and Fault Tolerance Handler, Accounting Module, Billing Module, Payment Module. Cloud Provider Access Manager with Amazon Adapter, IBM Adapter and … Adapter for Amazon Cloud, IBM Cloud and … Cloud.]
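The adapter labels in the functionality diagram (Amazon Adapter, IBM Adapter, … Adapter under a Cloud Provider Access Manager) suggest that provider-specific code sits behind a common interface. The Python sketch below illustrates that general adapter idea only. Every name in it (`CloudAdapter`, `InMemoryAdapter`, `ResourceManager`, `start_instance`, `stop_instance`) is a hypothetical choice for illustration and not the actual CloudBroker Platform API; the fake adapter keeps instances in a dictionary instead of calling a real cloud.

```python
from abc import ABC, abstractmethod


class CloudAdapter(ABC):
    """Hypothetical provider-neutral interface (an assumption, not the real API)."""

    @abstractmethod
    def start_instance(self, image):
        """Start a compute instance from an image and return its id."""

    @abstractmethod
    def stop_instance(self, instance_id):
        """Shut down the instance with the given id."""


class InMemoryAdapter(CloudAdapter):
    """Fake provider that only records instances in a dict, for illustration."""

    def __init__(self, provider):
        self.provider = provider
        self.running = {}
        self._counter = 0

    def start_instance(self, image):
        self._counter += 1
        instance_id = f"{self.provider}-{self._counter}"
        self.running[instance_id] = image
        return instance_id

    def stop_instance(self, instance_id):
        self.running.pop(instance_id)


class ResourceManager:
    """Dispatches requests to the adapter registered for a provider name."""

    def __init__(self, adapters):
        self.adapters = adapters

    def start(self, provider, image):
        return self.adapters[provider].start_instance(image)

    def stop(self, provider, instance_id):
        self.adapters[provider].stop_instance(instance_id)


if __name__ == "__main__":
    manager = ResourceManager({
        "amazon": InMemoryAdapter("amazon"),
        "ibm": InMemoryAdapter("ibm"),
    })
    instance = manager.start("ibm", image="rosetta-image")
    print(instance)
    manager.stop("ibm", instance)
```

With such an interface, supporting a further provider means adding one more adapter class, while the calling code stays unchanged.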

(15)

Security

[Figure: security layers. Parties: Customer, CloudBroker, Cloud Provider. Customer side: client browser or application, corporate IT, corporate security policies and standards. CloudBroker Platform (CBP): industry-standard server security technology, industry-standard application security technology, industry-standard secure data center. Cloud side: security-certified data center, security-certified compute and storage cloud technology, cloud instances as dedicated, secured and restricted virtual machines. Links: SSL-secured connections with authentication, authentication to the cloud and authentication to the VM.]

(16)

Typical Calculation Lifecycle

1.  Prepayment (user)

2.  Software selection and job creation (user)

3.  Data file upload (user) to cloud storage (platform)

4.  Job submission (user)

5.  Compute instance startup or reuse (platform)

6.  Data file upload from cloud storage to master instance (platform)

7.  Calculation on compute instances (platform, application)

8.  Data file download from master instance to cloud storage (platform)

9.  Compute instance shutdown or reuse (platform)

10.  Data file download (user) from cloud storage (platform)

11.  Billing (platform)
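The lifecycle above can be read as an ordered sequence of steps, each carried out by the user, the platform or the application. The Python sketch below encodes it as data to make the division of work visible. The step names and actors are taken from the list; the `Step` type and the `steps_for` helper are hypothetical names chosen for this illustration and do not reflect the platform's API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    number: int
    name: str
    actors: tuple  # who takes part in the step, as given in the list


# Steps and actors copied from the lifecycle list; the structure is an assumption.
LIFECYCLE = (
    Step(1, "Prepayment", ("user",)),
    Step(2, "Software selection and job creation", ("user",)),
    Step(3, "Data file upload to cloud storage", ("user", "platform")),
    Step(4, "Job submission", ("user",)),
    Step(5, "Compute instance startup or reuse", ("platform",)),
    Step(6, "Data file upload from cloud storage to master instance", ("platform",)),
    Step(7, "Calculation on compute instances", ("platform", "application")),
    Step(8, "Data file download from master instance to cloud storage", ("platform",)),
    Step(9, "Compute instance shutdown or reuse", ("platform",)),
    Step(10, "Data file download from cloud storage", ("user", "platform")),
    Step(11, "Billing", ("platform",)),
)


def steps_for(actor):
    """Return the numbers of the steps in which the given actor takes part."""
    return [step.number for step in LIFECYCLE if actor in step.actors]


if __name__ == "__main__":
    for actor in ("user", "platform", "application"):
        print(actor, steps_for(actor))
```

Read this way, the user is involved only in steps 1 to 4 and step 10; everything between job submission and result download is handled by the platform.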

(17)

Current Applications

Application   Domain                               Remarks

GAMESS        Quantum chemical calculations

BLAST         DNA and protein sequence alignment

AutoDock      Protein-ligand docking

Gromacs       Molecular dynamics simulations

X! Tandem     Mass spectrometry data matching

OpenFOAM      Computational fluid dynamics

Rosetta       Protein modeling                     Only with own license

???           Computational fluid dynamics         In preparation

???           Material science                     In preparation

???           DNA and protein sequence alignment   Requested

???           Protein modeling                     Requested

…             …

(18)

Application Requirements

•  Software characteristics

–  Scientific and technical applications, open source or commercial, independent of domain

–  Compute-intensive, not data-intensive

–  Batch-oriented, non-interactive, command line, running for hours or days

–  Installable on Linux

–  Single-threaded, multi-threaded or parallel / MPI

•  Deployment in the cloud

–  Installation shell script and software package

–  Configuration through the platform

–  Selection of pricing options

–  Validation and execution by the CloudBroker team

–  Several software versions possible

(19)

Integration into Third Party Tools

•  Provide all platform and cloud advantages within an environment known to the user

•  Public or private, generic or domain-specific clients, workflows, workbenches, portals, etc.

•  Utilize platform as cloud middleware in the background

•  KNIME

–  Konstanz Information Miner

–  http://www.knime.org

–  Workflow framework

•  SCI-BUS

–  SCIentific gateway Based User Support

–  http://www.sci-bus.eu

–  EU FP7 project

–  11 user communities from different domains

(20)

SCI-BUS Project

SCI-BUS is supported by the FP7 Capacities Programme under contract no. RI-283481

(21)

Example: Protein Modeling in the IBM Cloud

(22)

Scientific Background

•  R&D group: Institute of Molecular Systems Biology (IMSB) at ETH Zurich (http://www.imsb.ethz.ch)

•  Goal: Better understand the mechanisms of infectious diseases to fight antibiotic resistance

•  Example: Streptococcus bacterium

•  Method: Computationally model the 3D structures of important proteins from their 1D sequence

•  Software: Rosetta (http://www.rosettacommons.org)

•  Analysis: Find the important structural differences between less and more harmful bacterial strains

(23)
(24)

Infrastructure

•  Problem: Calculations would need several months on ETH Zurich’s compute cluster due to long queue waiting times and low job throughput

•  Calculations: Embarrassingly parallel and thus highly scalable, compute-intensive and not data-intensive, can be automated and outsourced

⇒ Perfect fit for cloud computing

•  Solution: Use the CloudBroker Platform to deploy the Rosetta software and manage data and calculations on IBM SmartCloud Enterprise cloud resources
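"Embarrassingly parallel" means that the individual calculations do not depend on each other, so they can be spread over any number of workers without communication between them. The Python sketch below shows the pattern on a small scale. It is an illustration under stated assumptions: `run_job` is a hypothetical placeholder computation, not Rosetta, and local threads stand in for the cloud compute instances used in the project.

```python
from concurrent.futures import ThreadPoolExecutor


def run_job(job_id):
    """Hypothetical stand-in for one independent, single-threaded calculation."""
    # A real job would run the application on its own input files; here a
    # deterministic number is computed from the job id alone.
    return job_id, sum(i * i for i in range(job_id + 1))


def run_all(job_ids, workers):
    """Run independent jobs on a pool of workers and collect results by job id."""
    # No job needs the result of another one, so execution order is irrelevant
    # and the outcome is the same for any number of workers.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return dict(pool.map(run_job, job_ids))


if __name__ == "__main__":
    results = run_all(range(8), workers=4)
    print(results)
```

Because the result does not depend on the number of workers, capacity can be added or removed while the job list is processed, which is what makes this kind of workload a good match for on-demand resources.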

(25)

Project Architecture

(26)

Showcase Results

•  249 Streptococcus target proteins modeled using special Rosetta client for automation

•  Up to 63 compute instances with 1’008 virtual CPUs in parallel provided by the IBM SmartCloud Enterprise

•  Number of instances in the cloud automatically adjusted to the workload by the CloudBroker Platform

•  Optimized data transfer between ETH Zurich file server and compute and storage instances in the cloud

•  About 36’000 single-threaded jobs created by the client, managed by the platform and computed in the cloud

•  Almost 250’000 CPU hours utilized for the production calculations

•  About 2.3 million 3D protein structure models created

•  Calculations finished within two weeks
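The figures above can be checked against each other with a short calculation. The Python sketch below uses only the numbers from the list, treated as rounded values. The derived quantities (virtual CPUs per instance, a lower bound on the wall-clock time, averages per job and per protein) are back-of-the-envelope estimates that assume full utilization of all virtual CPUs; they are not results reported by the project.

```python
# Figures taken from the showcase results, rounded as on the slide.
instances = 63           # maximum number of compute instances
virtual_cpus = 1_008     # virtual CPUs at that maximum
cpu_hours = 250_000      # "almost 250'000", so treated as an upper bound
jobs = 36_000            # "about 36'000" single-threaded jobs
models = 2_300_000       # about 2.3 million structure models
proteins = 249           # target proteins

vcpus_per_instance = virtual_cpus / instances
# Lower bound on the wall-clock time if all virtual CPUs were busy throughout.
min_wall_clock_days = cpu_hours / virtual_cpus / 24
cpu_hours_per_job = cpu_hours / jobs
models_per_job = models / jobs
jobs_per_protein = jobs / proteins

if __name__ == "__main__":
    print(f"vCPUs per instance:       {vcpus_per_instance:.0f}")
    print(f"Minimum wall-clock days:  {min_wall_clock_days:.1f}")
    print(f"CPU hours per job:        {cpu_hours_per_job:.1f}")
    print(f"Models per job:           {models_per_job:.0f}")
    print(f"Jobs per protein:         {jobs_per_protein:.0f}")
```

Each instance comes out at 16 virtual CPUs, and 250'000 CPU hours on 1'008 virtual CPUs need at least about 10 days. That is compatible with the reported two weeks, since 63 instances was the maximum rather than a constant.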

(27)
(28)

EuroCloud Swiss

•  Swiss association for cloud computing

•  Platform and lobbying for cloud computing in Switzerland

•  http://www.eurocloudswiss.ch

•  Representative of Switzerland in the EuroCloud Europe network

•  Collaboration with simsa

•  Swiss Cloud Conference: 21.03.2012, Technopark Zurich

•  Swiss Cloud Award 2012

•  Code of practice

•  Certification

•  …

(29)

Thank You

•  CloudBroker management team, in particular Nicola Fantini

•  CloudBroker development team

•  SystemsX.ch, in particular Dr. Peter Kunszt

•  ETH Zurich, in particular Dr. Lars Malmström

•  IBM, in particular Marcel Lautenschlager, Roland Reifler and Stefan Ruckstuhl

•  EuroCloud Swiss

(30)

For More Information

Contact

Dr. Wibke Sudholt

CloudBroker GmbH

Technoparkstrasse 1, CH-8005 Zurich, Switzerland

Phone: +41 44 633 79 34

Email: [email protected]

Web: http://www.cloudbroker.com
