• No results found

Integration of Cloud Computing and Cloud Storage

N/A
N/A
Protected

Academic year: 2021

Share "Integration of Cloud Computing and Cloud Storage"

Copied!
55
0
0

Loading.... (view fulltext now)

Full text

(1)

Integration of Cloud Computing

and Cloud Storage

Chief Scientist ,Renaissance Computing Institute Adjunct Prof. Computer Science, NC State University

IEEE Mass Storage Conference Tutorial May 3, 2010

(2)

Outline

• Introduction to Cloud Computing

• Building a cloud computing architecture

– Properties for a cloud – Types of clouds

– Building images in a cloud – Designing a cloud system

• Large data sets and analysis requirements • Cloud Computing Demo

(3)

Level of General Interest in Cloud

Computing

(4)
(5)

“The interesting thing about cloud computing is that we've redefined cloud computing to include everything that we already do…. I don't

understand what we would do differently in the light of cloud computing other than change the wording of some of our ads.”

Larry Ellison CEO Oracle

(6)

Some Important Characteristics in Building Cloud Computing Architectures

(7)

Start with Basic General Design Goals

For Any System

• Rapid efficient delivery of IT services • Reliable, secure, and fault-tolerant • Data and process aware services • Secure dropped-session recovery

• More efficient delivery to remote users

• System is cost-effective to operate and maintain

• On-demand or batch processing capabilities

• Removal of conflicts among supported software applications

– Incompatible versions

– Inconsistent upgrade patterns among different locations – User issues with obtaining and using the software

(8)

Add Basic General Design Goals

For A Cloud Computing System

• Ability for users to request specific HW platforms to build, save, modify, run virtual computing

environments and applications that are:

– Reusable – Sustainable – Scalable

– Customizable

• Timely inclusion of new software images • Root privileges (as required/authorized) • Time and place independent access

(9)

A List of Some Important Characteristics

For a Cloud Computing System

• An operational paradigm allowing the users to seamlessly and securely provision and/or combine

– Computer hardware – Operating systems – Application software – Storage

– Rich set of customizable services

• As a system that is scalable up, down, in, and out

– With resources accessible over a network – Based on a service-oriented architecture

• With users controlling the options to purchase, lease or reserve

each equipment and service capability on a mix-n-match component or unit basis

(10)

A List of Some Important Characteristics

For a Cloud Computing System

• An operational paradigm allowing the users to seamlessly

and securely provision and/or combine

– Computer hardware – Operating systems – Application software – Storage

– Rich set of customizable services

• As a system that is scalable up, down, in, and out

– With resources accessible over a network – Based on a service-oriented architecture

• With users controlling the options to purchase, lease or

reserve each equipment and service capability on a mix-n-match component or unit basis

(11)

Some Definitions -- Cloud Types

• Public Cloud is a system where hardware, software and/or application services are accessible to the general public

over the Internet with access usually purchased on some type of a pay-as-you-go per component usage basis

• Private Cloud is a system where computational hardware, software, and/or applications and software services are only accessible internally within an organization or

educational institution

• Hybrid Cloud is a private cloud that scales up service availability by externally provisioning resources from a

public cloud when there are rapid workload fluctuations or hardware failures.

(12)

Cloud Computing Images

-A Key Concept

• The combination of the operating system, applications and customizable software forms an “image”

• An “image” is a tangible abstraction of a software stack

• Ingredients needed to create an image

– Any base-line operating system

– If virtualization is needed for scalability, a hypervisor layer – Any desired middleware or application that runs on that

operating system

(13)

Some Metadata Describing an Image

– Identifier – Location – Name – Owner – Memory footprint

– Speed of access information – Licensing information

– Hardware requirements – Loading time

(14)

Some Image Properties

• Images can be associated into logical image groups

– Mapped onto a particular logical resource (“hardware”) groups

– Mapped onto individual computers

• Association of the meta-data with an image is

(15)

Types of Images

• Simple Virtual or Bare Metal Images

– Simple Virtual Image: image loaded into an operating system/application virtual environment of choice

– Bare-metal image: operating system and application stack loaded straight onto the hardware without any

other software layer between the image and that target hardware.

• Composite Images

– Composite images are aggregates of two or more images that are loaded synchronously (called

environments)

– Composite images can construct “virtual clouds” and cloud services

(16)

Traditional Versus Cloud Computing

Some Comparisons of Technical Characteristics

Traditional

• Control of resource use managed by the site

• Site-specific environment • Site defines modes of access • Site-driven prioritization

• Ease of deployment for provider, constraints on user

• Site controls the technology

Cloud

• Control of resource use managed by the user

• User defines site environment • User defines modes of access • Explicit user choices for service

level options & prioritization • More difficult deployment for

provider, ease of use for user • Users control the technology

(17)

Clouds Grouped by Services

• Hardware as a Service (HaaS) – On demand access to a specific equipment configuration possibly at a particular site

• Infrastructure as a service (IaaS) – On demand access to user specified hardware capabilities, performance and services which may run on a variety on hardware products

• Platform as a Service (PaaS) - On-demand access to user

specified combination of hypervisors, operating systems and middleware that enables applications and services

• Application as a Service (AaaS) - On-demand access to user specified application(s)

• Software as a Service (SaaS) - may encompass anything from PaaS through AaaS

• Cloud as a Service – On demand ability to construct a local cloud within an overall cloud service

• Security as a Service – On-demand use of cloud configuration for security of applications and systems

(18)

Cloud Services and

Cloud Architectures

• Cloud architecture abstracts resources at several levels

– Application and operating system level via images and hypervisors

– Hardware location level via compute manager and compute nodes

– Network level (via virtual networks, VLANs, VPNs)

• Each cloud service type needs an architecture that will optimize that type of service delivery

(19)

IaaS

PaaS

(20)

Managing server handles •User requests •Resource scheduling •Authorization •Security •Multi-site coordination •Performance monitoring •Virtual network mgt

•Software licensing etc.,

Database Web software Apache Provisioning engine Custom CC codes Utilities

Infrastructure as a Service (IaaS)

Hardware as a Service (HaaS)

(21)

X-Win RDP Client apps Workflow Operating System (Win, Linux) Virtual layer VMWare KVM Xen Applications Operating System (Win, Linux) Applications

IaaS/HaaS Logical

Architecture

Image Repository Database Web software Apache Provisioning engine Custom CC codes Utilities

(22)

Scheduler finds a server with the requested application or has management node load requested application on a server

Image Repositories Management nodes Operating system Image repositories Computation Storage Networking

(23)
(24)
(25)
(26)
(27)
(28)

High Performance Computing Option

(29)

HPC Configuration

• In the education and research space it is possible to design dynamic hardware

reconfiguration of a cloud computing system • Identify sporadic usage patterns and

re-purpose cloud hardware between distributed and HPC usage

(30)

Image Repositories HPC cluster

Login node

(31)

Image Repositories HPC cluster

Login node

(32)

Image Repositories HPC cluster

Login node

(33)
(34)

Periodic Sporadic Usage

(35)
(36)

Dynamic Blade – Server

Re-Configuration

• Insert multiple NICs in the blades or servers • Build multiple VLANs (non-routing IPs) to

– Control the out of band management network – Access/load images from the image repository

• Create either

– A public (routable) IP connection

– VLAN together a cluster of blades / servers

– Block reserve group of machines for HPC (depending on IB, Myrinet, or 10 GigE config)

(37)

Platform as a Service (PaaS)

Characteristics

• Computing platform and solution stack as a service

• Program the cloud, but at a relatively high level, such as Web Services and delivery of web-based applications

• development platforms with the development tool itself hosted in the cloud and accessed

(38)

Platform as a Service (PaaS)

Software as a Service (SaaS)

Application as a Service (AaaS)

(39)

Compute

Storage

Fabric

Platform as a Service (PaaS)

Architecture

(40)

Platform as a Service(PaaS)

Applications

User Data

(41)

Other Cloud Services

• Application as a Service and Software as a Service

– Both specific cloud services focused on particular software application(s)

– Extension of existing application stack already running on the local site (more toward business applications)

• Cloud as a Service – construction of entire cloud architecture within a larger cloud

(42)

Data Sets and Repositories

• Computational capability needed to process and analyze the data

• Transform data into useful information • Two options

– Push the data to the computational system

(43)

Large Data Sets and

Analysis Requirements

• Data set itself

– Location – Size

• Multiple dimensions to consider

– Technical

– Operational – Economic

(44)

Location

Balancing Storage and Computation

• Where should the data be located and where should the computations be performed?

• Two potential options

– Option 1 - traditional approach is to move all/part of the data to the computational system

– Option 2 –move the computation near the data repository

(45)

Size of the Data Sets

• Number and size of data repositories are expanding

– Rate of data collection increasing

– Aggregate expansion of the total size of the data

• Compare the network bandwidth growth to the growth in the size of data repositories

(46)

Aug 1990 100 GBy/m o Oct 1993 1 TBy/mo Jul 1998 10 TBy/mo Nov 2001 100 TBy/mo Apr 2006 1 PBy/m o

ESnet Traffic Increases by 10X Every 47 Months, on Average

Observation of Current and Historical ESnet Traffic Patterns

Ter ab yt es / mon th

Log Plot of ESnet Monthly Accepted Traffic, January 1990 – June 2009

Courtesy of William Johnston, LBL

Actual volume for Jun 2009: 4.3 Petabytes/month Projected volume for Jun 2010: 8.6 Petabytes/month

(47)

y = 2.3747e0.5714x y = 0.4511e0.5244x y = 0.1349e0.4119x y = 0.8699e0.6704x 0 1 10 100 1000 10000 100000 1000000 10000000 100000000 Ja n , 9 0 Ja n , 9 1 Ja n , 9 2 Ja n , 9 3 Ja n , 9 4 Ja n , 9 5 Ja n , 9 6 Ja n , 9 7 Ja n , 9 8 Ja n , 9 9 Ja n , 0 0 Ja n , 0 1 Ja n , 0 2 Ja n , 0 3 Ja n , 0 4 Ja n , 0 5 Ja n , 0 6 Ja n , 0 7 Ja n , 0 8 Ja n , 0 9 Ja n , 1 0 Ja n , 1 1 Ja n , 1 2 Ja n , 1 3 Ja n , 1 4 Ja n , 1 5 ESnet traffic HEP exp. data ESnet capacity

Climate modeling data Expon. (ESnet traffic) Expon. (HEP exp. data) Expon. (ESnet capacity)

Expon. (Climate modeling data)

2010 value xx

40 Pby xx 4 Pby

Network Traffic, Science Data, and Network Capacity Long-term trends Historical Projection All F our Da ta Series ar e Normal iz ed to “1” at Jan. 19 90 2010 value --40 PBy --4 PBy

(HEP data courtesy of Harvey Newman, Caltech, and Richard Mount, SLAC. Climate data courtesy Dean Williams, LLNL, and the Earth Systems Grid Development Team.)

(48)

Size of the Data Sets

• Best performance from the current ESnet traffic patterns is approximately 8.6 Pbytes/month (total throughput)

• Single user may get 20% sustained bandwidth or approximately 1.75 Pbytes/month

• Moving a full data repository of 100 Pbytes may take on the order of 57 months!

• Moving only 1% may take several weeks!

• Size of a fully constructed cloud computing image may be approximately several Gbytes and take

(49)

Instead of pushing the data to the cloud

Bring the cloud to the data

(50)

Desirable Cloud Computing and Storage

Technical Properties

• Portability – ability to move cloud computing job among different computational systems

• Interoperability – ability to run cloud job among different computational systems

• Scalability

– Up and down – ability to have cloud computing hob

access different generations of similar hardware

architecture

– In and out – ability to expand or contract the size of a cloud computing job on a specific hardware

(51)

Cloud Computing and Storage

Some Operational Issues/Observations

• Software licensing agreements at each cloud computing location

– Multi-institution

– Multi-site cloud installations

• Service Level Agreements

• Security - privacy and sensitive data • Data and application audits

• Terms and conditions (Master Service Agreement, click-wrap EULAs, distribution of risks)

(52)

Demo

• Cloud computing demonstration

– User interface

– Access to the image repository – Computation options

(53)

Some Closing Remarks

Cloud Computing and Storage

• Cloud computing (CC) is a disruptive paradigm in several dimensions

– Technical – Economic – Education – Research

• There is an intertwining among these dimensions that must be observed when constructing cloud computing

• CC is actually having the largest impact in the business and commercial sector

• The majority of cloud services developed are directed to business applications – not STEM projects and research

(54)

Some Closing Remarks

Cloud Computing and Storage

• Not everything is best optimized by migrating a cloud computing image to the data repository • What are the proper decision criteria when and

where to locate computation and storage

• Which leads to the second part of the tutorial Policy-based Data Management – Reagan Moore

(55)

References

Related documents

Findings of the study show that (1) Go-green managerial skills of Primary School principals significantly affect students’ environmental awareness with F =

• in public cloud is services are dynamically provided over the Internet by a third-party provider like Amazon • private cloud is a virtualized computing. infrastructure created

You will always make the most money by following the main trend of the market, Every Commodity makes a TOP or BOTTOM on some exact mathematical point in proportion to some

(Paperback) 3rd Edition Free PDF d0wnl0ad, audio books, books to read, good books to read, cheap books, good books, online books, books online, book reviews epub, read books

Google Sky Maps are trademarks of Google Inc.All other products and services names mentioned may belong to their respective trademark owners. Equippe d with A ndroid TM 2 .2

Motivations : 1) In existing studies [35], [38] one or two types of MR modalities (like T1, or FLAIR) were exploited for glioma classification. Since brain images from differ- ent

You can also program CVs that control momentum, 3 step and 128 step speed tables, switching speed, normal direction of travel, scalable speed stabilization and more to take

Since 2000, the Managed Funds Association (MFA), which is the leading trade group for the hedge fund industry, has annually issued its “Sound Practices for Hedge Fund Managers.” Th