
Exascale (Computing) in the Cloud


(1)

Apache-VCL

Exascale (Computing) in the Cloud

Mladen A. Vouk

Professor and Head of Computer Science, and Associate

Vice-Provost for Information Technology

North Carolina State University, Raleigh, NC 27695

vouk@csc.ncsu.edu

(2)

Exascale Computing in the Cloud


This talk discusses the architecture of "cloud computing" as it is beginning to emerge in the world today, how VCL, NC State's award-winning open-source Cloud Computing technology, fits into that space, and the directions cloud computing is moving in. Peta-byte, and soon exa-byte, data collections and streams are not a rarity anymore, especially in the "cloud". How and where one processes such large amounts of data in a meaningful way is less of a challenge in terms of physical storage, communication infrastructure and processing power than it is a challenge in terms of algorithms and analytics for knowledge extraction and decision support. In the domain of exa-scale (and higher), "winners" will be analytics engines, algorithms and tools that can support on-demand decision making based on "big" data sources, either in-situ (at the data sources) or via specialized post-processing analytics clouds. This problem spans a variety of application domains, from health care, to science, to security, to bio-technology and energy management.

(3)

~ 1 zettabyte on Internet; Challenge: collection, collation, finding, …?

We are being flooded with information, data ...

Giga 10^9, Tera 10^12, Peta 10^15, Exa 10^18, Zetta 10^21, Yotta 10^24

Volume: Exabytes of data move around in networks and/or are generated. In 2010 the codified information base was expected to double every 11 hours.

Variety: 80% of new data growth is unstructured content or content with hidden structures, e.g. email, blogs, web pages, white papers, images, video and audio. A lot is “content in the …

Cost: An average company with 1,000 employees spends $5.3 million a year to find information stored on its servers. 42% of managers say they use the wrong information …

(4)

[Figure: performance of Computers and Supercomputers; EFLOPS cca 2018]

(5)

Peta-X

• Fastest supercomputer today: 2.5+ PFLOPS (mostly in-memory calculations, Linpack benchmark)
• Fastest embarrassingly parallel system (loosely coupled) is probably Folding@home: 6.9 to 12.2 PFLOPS
• Data processed by Google on a daily basis – about 24 petabytes
• Large Hadron Collider (CERN) – expected to produce 15 petabytes per year

(6)

Exa-X

• Global internet traffic ran about 21 exabytes per month in March 2010.
• Mobile traffic will reach over 2 exabytes per month by 2013.
• There are at least 2,000,000,000 computational devices world-wide – collectively 2+ exa-(fl)ops.
• Digital content of the world is closing in on a zettabyte.
• First exabyte tape library – Oracle Corp, January 2011.
• Square Kilometer Array Radio Telescope is …
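To put the monthly traffic figure above into more familiar units, the following sketch converts exabytes per month into an average sustained bit rate. It assumes SI prefixes (exa = 10^18) and a 31-day month for March; the function name is illustrative and not from the talk.

```python
EXA = 10**18  # SI prefix: exa = 10^18

def average_bits_per_second(exabytes_per_month, days_in_month=31):
    """Average sustained bit rate implied by a monthly traffic volume."""
    total_bits = exabytes_per_month * EXA * 8      # bytes -> bits
    seconds = days_in_month * 24 * 3600            # length of the month
    return total_bits / seconds

if __name__ == "__main__":
    # About 21 exabytes per month (March 2010 figure from the slide).
    rate = average_bits_per_second(21)
    print(f"{rate / 1e12:.1f} Tbit/s on average")
```

Under these assumptions the result is on the order of tens of terabits per second, averaged over the whole month.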

(7)

Limitations

Storage (exa+ capable) – exa now
Networks (100 giga+ channels, exa+ capable)
Computers (tightly and loosely coupled, peta+, exa pending, cca 2018)
Input/Output, latency, power, cooling, …
Trust (hierarchical analytics, data, confidentiality, security, privacy, science, …)?
Humans (on average, absorb at most 20 bits per second of new info, no improvement expected)

(8)

Who Can Help? Watson?

- 21.6 Terabyte storage
- 90 servers, 2880 CPUs
- Total 80 TFLOPS
- 1 TB of memory

• Q&A tool
• Natural language processing
• Text processing
• Data mining
• Pattern matching
• Large scale data reduction - from terabytes to bytes

(9)

Analytics

• Analytics of very large amounts of data needs a number of enabling technologies.
• It will be distributed, continuous, and very adaptable to different domains; it will face variably granular and changing inputs, needs and conditions.
• Peta and Exa may often come from a large number of much smaller operations and actions.
• Analytics needs to be secure and privacy preserving, and it should be used to ensure security and privacy, as well as information integrity and veracity.
• In-situ (or on-the-fly) analytics, post-analytics, trusted hierarchical analytics, data-to-algorithm, algorithms-to-data.

(10)

Analytics Cloud “Outpost” myImage Repository

Outpost Cloud Manager #x

Image Repository

Data Limited

Bring computations to data

(11)

Analytics Cloud “Outpost” MyImage Repository

My Cloud Manager #x

MyCloud

Image Repository MyData

Movable Data

Bring data to computations
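The choice between bringing computations to the data and bringing data to the computations can be illustrated by comparing transfer times. The sketch below is only a rough illustration under stated assumptions: the function names, the idea of shipping an "image" of a given size, and the example sizes and link speed are hypothetical and are not taken from VCL.

```python
def transfer_seconds(size_bytes, link_bits_per_second):
    """Time to move size_bytes over a link, ignoring latency and protocol overhead."""
    return size_bytes * 8 / link_bits_per_second

def placement(data_bytes, image_bytes, link_bits_per_second):
    """Pick the cheaper direction: ship the analytics image or ship the data."""
    data_time = transfer_seconds(data_bytes, link_bits_per_second)
    image_time = transfer_seconds(image_bytes, link_bits_per_second)
    if image_time < data_time:
        return "bring computations to data"
    return "bring data to computations"

if __name__ == "__main__":
    # Hypothetical numbers: 1 PB of data, a 20 GB image, a 10 Gbit/s link.
    print(placement(1e15, 20e9, 10e9))
```

A real decision would also weigh data confidentiality, available compute at the data site, and whether the data are allowed to move at all.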

(12)

Cloud Computing Enables Analytics of “Big” Data

• Utility-level personalized, secure and on-demand/self-service based delivery of agile and mobile information technology services needs to accommodate a range of services, from desktop to HPC.
• This is essential for scientific discovery, education, and critical IT systems (e.g., health care, power grids, transportation systems, military logistics and tactical systems, intelligence gathering systems, air traffic control, security, financial …)
• Complex but trusted hierarchical and distributed …

(13)

Service Levels

Level 1 (millions): Users of Services (from naïve to sophisticated), Help-Desk
Level 2 (thousands): Advanced User – Services Integration & Provisioning (group reservations, image creation, image aggregates (clouds), etc.)
Level 3 (hundreds): Expert – Service Authors and Administrators, Base-line Images, Basic Installation
Level 4 (tens): Developer – Developers and Advanced Installers

(14)

Gotta Have It!

Network
API
Private vs. Public Clouds
Domain Services, Analytics, & Other
HaaS, IaaS, PaaS, AaaS, SaaS, SECaaS, LOCaaS, CaaS
(Bare-Metal & Virtualized) Resources & Services
Authentication, Authorization, Accounting & Other Attributes
Reliability & Fault Tolerance
Data, Data, Data: Import, Export, Exchange, Provenance, Meta-Data, Privacy, Security, Licenses, …
Where exactly are my data? Who is sharing hardware and O/S with me?
Client (End-User) Portal
Service Oriented (SOA) Analytics support
Help Desk, Training, Education

(15)

Services

The principal difference from traditional services is the level of control an end-user has. Full control is possible.

Hardware as a Service (HaaS) – On-demand access to an explicit (specific) computational, storage and networking product and/or equipment configuration, possibly at a particular site (Location as a Service - LaaS)

Infrastructure as a Service (IaaS) – On-demand access to user-specified hardware, interconnects, and storage capabilities, performance and services, which may run on a variety of hardware products

Platform as a Service (PaaS) - On-demand access to a user-specified combination of hypervisors, operating system, and middleware that enables user-required applications and services that are running on either HaaS and/or IaaS

Application as a Service (AaaS) - On-demand access to user-specified application(s)

Software as a Service (SaaS) - may encompass anything from PaaS through AaaS

Cloud as a Service, Security as a Service, Portability-as-a-Service, Storage & Location as a Service, e.g., where are my data stored? I wish my data to be stored within 100 miles of X – needs HaaS control …
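A location constraint such as "within 100 miles of X" can be checked with a great-circle distance calculation. The following is a minimal sketch, assuming storage sites are described by latitude and longitude; the site names, coordinates, and function names are hypothetical, and this is not how VCL itself implements location control.

```python
import math

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius

def great_circle_miles(lat1, lon1, lat2, lon2):
    """Haversine distance between two points given in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))

def allowed_sites(sites, anchor, max_miles=100.0):
    """Names of sites whose coordinates lie within max_miles of the anchor point."""
    return [name for name, (lat, lon) in sites.items()
            if great_circle_miles(anchor[0], anchor[1], lat, lon) <= max_miles]

if __name__ == "__main__":
    # Hypothetical candidate sites (approximate city coordinates, for illustration only).
    sites = {"site-near": (35.9940, -78.8986), "site-far": (33.7490, -84.3880)}
    x = (35.7796, -78.6382)  # the point "X" in the user's requirement
    print(allowed_sites(sites, x))
```

Enforcing such a rule requires knowing where the hardware physically is, which is why the slide ties it to HaaS-level control.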

(16)

“Analytics Cloud“ – under the hood

Workflow control plane: Knowledge creation & Integration; Social Networking, Provenance, Tracking & Meta-Data (DBs and Portals); Concept-driven Analytics; W/F Engine; Analytics W/F Generation Wizard; Synchronous & Asynchronous Services; Run-time Manager and Scheduler

Execution Plane: “Heavy duty” in-cloud Computations, Flows, Services; Analytics Enabled Resources and Images; Supercomputers, Clusters, Active Storage

(17)

Text Analytics Example

http://vcl.ncsu.edu

• NCSU College of Management researchers – Currently many Business Intelligence (BI) needs are not met (e.g., market analysis) - BI is often based on analysis of structured databases (which represents only 20% of available data) – “mash up” of structured and unstructured information is needed.
• Some technology support needs: Search key words in structured and unstructured data, User-defined thesaurus/dictionary, User-defined relationships, Meta files easily queried, Data easily exported, User-defined reports, Graphical representation of data, Avoid commercial search engines, User determines priority, Works on any data not just web data

(18)

Enabling Technologies

• IBM JStart
– LanguageWare: a natural language processor that allows applications to ’read’ natural languages; also a toolkit to model a domain via a dictionary, relationships and rules
– IBM Content Analytics: an analytics engine for analyzing and reporting unstructured data
– Big Sheets: a browser-based analytics solution for very large data sets
• NCSU VCL Cloud – Analytics sub-cloud
– 12 IBM BladeCenter HS22 blades, cca 1 TFLOPS, several Terabytes of storage and about …

(19)

Analytics Cloud

[Diagram labels: VCL Analytics Images (In Beta-Testing); iGold Steering Dashboard; Databases; Web; Clouds; VCL Data Banks]

Weeks of data collection per “run”.

(20)

Numerical Analytics Example

http://hpc.ncsu.edu

Modeling Gulf Currents

The Ocean Observing and Modeling Group, headed by Dr. Ruoying He at NCSU.

NOAA emergency response division has been using the SABGOM ocean current nowcast and forecast (along with 3 other ocean models) to generate an official oil trajectory prediction, used to guide responses of the local, state, and federal governments.

The graphic is generated using the South Atlantic Bight and Gulf of Mexico (so-called NC State SABGOM) model. This model (along with weather prediction) is run daily on a Myrinet-equipped subcluster of the NCSU VCL-HPC sub-cloud, predicting present and future (84 …

Enabling Technology: NCSU VCL-HPC

(21)

http://analytics.ncsu.edu

Mission: promote graduate education and research in the emerging field of analytics. Educate the citizens of North Carolina and beyond in the concepts, methods, software tools, and applications of analytics that have direct and practical relevance to industry.

Coverage: Includes data collection and integration, statistical methods, and complex processes for enterprise-wide decision making.

Output: MS in Advanced Analytics. As the use of analytics becomes more widespread, there is mounting demand for professionals with strong quantitative skills coupled with an understanding of how the …

(22)

Enabling Technologies

• Full suite of SAS Analytics Products used to train MS students
• NCSU VCL Cloud – IAA Analytics sub-cloud

(23)

At Exascale, Clouds

• Will need to frequently exchange information and algorithms with other clouds.
• Will need to have well-defined interfaces for data and algorithm exchange.
• Will engage in automatic summarization of information about data sets and data streams (what type of coding and annotation to use?).
• Must be secure, trustworthy and privacy aware

(24)

Exascale-Analytics Cloud

[Diagram labels: VCL Analytics Images; VCL E-Analytics Images; Workflow Support; iGold Dashboards; Analytics Aware VCL Data Banks; High-performance Clouds]

In-situ analytics: VCL Super-Centipede Crawls Data Sources and Returns only Processed Data

(25)

http://vcl.ncsu.edu

VCL Case-Study

Key Partner

(Virtual Computing Laboratory Technology)

(Very Flexible and Secure, Open Source, Self-Service and Image-based)

http://incubator.apache.org/projects/vcl.html

Current NC State University VCL installation – Private Cloud:

2000+ blades, 7000+ cores, maintenance support: 2 FTE
About 700+ in General mode, about the same in HPC mode, and several hundred in various test-beds
Open to 40,000+ NCSU students and faculty, different pilots and partner accounts; through Shibboleth all UNC System campuses have access to VCL (cca 250,000 students).
Delivers as many as 250,000 service reservations per year and over 10.5 million CPU hours (including HPC cycles).
Low cost: At NCSU between 3 and 30 cents per CPU hour

(26)

VCL was Cloud when Cloud was not (yet) Cool

[Chart: Google Trends (3/27/11), annotated "VCL Production-level Services Started Fall 2004" and "IBM & Google announce ‘Cloud’"]

(27)

VCL Home Page

VCL has it (almost) all – and it works.

Are you interested in a “taste” account? If so, please send [email protected] an email and we will give you access.

Self-service, Open, Modular, Flexible, Scalable, Upgradable, Secure, BM & VM, Distributed, Cost-effective, Reliable, Functional

(28)

VCL Services

Reservation Times: From 30 min to open-ended

Load Times: From a few sec to 20 min (service dependent, 80+% in less than 2 min)

Actual sole-use bare-metal, or virtual images

Stateless, Augmentation, and Persistent modes

HaaS, IaaS, PaaS, AaaS, [SaaS, CaaS]

Undifferentiated Resources: Single Seat (VCL-Desktop), Multiple Synced Seats (VCL-Class), Servers (VCL-Server), Aggregates (VCL-Cloudlets), HPC Clusters (VCL-HPC)

Differentiated Resources (VCL Agent): Supercomputers, e.g., System Z (mainframes), Labs, Other clouds (EC2, IBM), Storage, Other, …

(29)

Some Facts

Image baselines are typically Windows and Linux with a variety of applications – VDI can be one of the apps. Depending on how demanding an application is, it may be virtualized (e.g., VMWare, KVM, XEN, ...) or it may run on bare-metal.

External services (e.g., to EC2, IBM cloud, etc.)

Currently over 800 images, over 120 in use per semester.

About 100,000+ image reservations per semester.

Most of the “individual seat” requests are on-demand (“Now”) reservations: about 90% of requests

System availability: exceeds 99.9%, image reservation reliability > 99%

General, HPC and Service operation modes

NCSU (in 5 data centers, four on campus, one off campus at MCNC)

Numerous partners and pilots. A number of stand-alone facilities (including: Duke, ECU, GMU, RENCI, UNC-CH, NCCU, India, Old Dominion, Western Carolina, Kannapolis, GA, CA, VA, MD, SC, MA, LA, etc.)

(30)

General Reservation (VCL-Desktop, VCL-Server)

Short-term reservations (2-3 hours)

Long-term reservations: need to explicitly manage
- state persistence
- timeout
- backups

Frequently used images load very quickly

Exascale/SRCE/28-Mar11/v7a 30

(31)

Group Reservations (VCL-Class)

This type of reservation does not pay attention to topology, just to coordinated delivery of individual images.

(32)

Aggregate Environments – Sub-Clouds (VCL-Cloud)

Analytics Cloudlet

Parent and Children know about each other

[vouk@bn19-36 etc]$ more cluster_info
child= 152.46.19.36
parent= 152.46.19.5
child= 152.46.20.78
child= 152.46.20.86
[vouk@bn19-36 etc]$

[Diagram: Parent node with child nodes labeled Lin and Win]

This function allows construction of custom sub-clouds: Controller + any number of (hybrid) non-recursive children. Topology control depends on image construction – typically within one management range.
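The cluster_info listing above suggests a simple `key= value` layout with one `parent` entry and several `child` entries. A minimal parsing sketch, assuming exactly that layout (the real file may contain other keys or formatting not shown on the slide; the function name is hypothetical):

```python
def parse_cluster_info(text):
    """Parse 'parent= <ip>' and 'child= <ip>' lines into a small dictionary."""
    parent = None
    children = []
    for line in text.splitlines():
        line = line.strip()
        if not line or "=" not in line:
            continue  # skip blank or unrecognized lines
        key, value = (part.strip() for part in line.split("=", 1))
        if key == "parent":
            parent = value
        elif key == "child":
            children.append(value)
    return {"parent": parent, "children": children}

if __name__ == "__main__":
    sample = """child= 152.46.19.36
parent= 152.46.19.5
child= 152.46.20.78
child= 152.46.20.86"""
    print(parse_cluster_info(sample))
```

A script inside an image could use such a structure to discover its controller and siblings when the sub-cloud starts.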

(33)

HPC (VCL-HPC)

[Diagram labels: Internet, Login Node, HPC Scheduler, HPC Job, HPC Storage, Compute Nodes]

Full control of topology, storage, and communications. VLAN-ed separately. Migration of resources from …

(34)

A Look Inside


(35)

VCL Top Level Architecture

[Diagram labels: Internet; TM = Traffic Monitor; Authentication Service; VCL Manager & Scheduler; VCL Database; Node Manager #1, Node Manager #2, …, Node Manager #n, each with an Image Repository; Storage; Undifferentiated Resources, Virtual or Real; Differentiated Resources, Virtual or Real: z-Series, Tera-Grid, University Labs, Storage]

(36)

[Diagram labels: Scheduler, VCL DB, Management Node, Image Library, Server]

User selects desired application through web interface

(37)

[Diagram labels: Scheduler, VCL DB, Management Node, Image Library]

Scheduler finds available server with requested application or, if not loaded, has management node load image with requested application on an available server

(38)

[Diagram labels: Scheduler, VCL DB, Management Node, Image Library, Server]

User accesses desired application through OS provided method (RDP for Windows, ssh/X11 for Linux)
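The three steps on the preceding slides (select an application, find or load a server, connect) can be sketched as a small allocation routine. This is an illustrative model only: the `Server` class, the `reserve` function, and the in-memory list are hypothetical stand-ins for the VCL database, scheduler, and management node, whose actual interfaces are not shown in the talk.

```python
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class Server:
    name: str
    loaded_image: Optional[str] = None  # image currently on the machine, if any
    reserved: bool = False

def reserve(servers: List[Server], image: str) -> Optional[Server]:
    """Prefer a free server that already has the image; otherwise load it on a free one."""
    free = [s for s in servers if not s.reserved]
    for server in free:
        if server.loaded_image == image:
            server.reserved = True
            return server
    if free:
        server = free[0]
        server.loaded_image = image  # placeholder for the management node's image load
        server.reserved = True
        return server
    return None  # no capacity available right now

if __name__ == "__main__":
    pool = [Server("blade-1", "linux-base"), Server("blade-2", "matlab"), Server("blade-3")]
    print(reserve(pool, "matlab"))
```

Preferring a preloaded server is consistent with the observation elsewhere in the talk that frequently used images load very quickly.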

(39)

[Diagram labels: xCat; vmWare, Xen, KVM, …; EC2, IBM, ..; Federated Services; TMP-i; HSLT]

(40)

Security as a Service

• Variety of authentication options (LDAP, Shibboleth… other)
• High security and isolation (IP-lock, local firewalls, point-to-point VLANs and VPNs, one-time passwords, feedback confirmation, timeout, traffic monitoring …)
• Sophisticated resource access and mapping privilege tree.
• Real-time monitoring of reliability and security

[Diagram labels: Auth, One Time Passwd, IPLock, VPN, Timeout, Activity, Traffic Monitoring, Policy based Privilege Tree Maps, VLANs within VCL]

(41)

VCL Dashboard

(real time)


(42)

Provenance and Meta-Data

Provenance and performance statistics for any time period are available to a general user, including reliability information.

(43)

VCL Implementation Options

• Partnerships
– Guest in an existing installation (very quick)
– Local Resources – Remote Management (can solve latency and remote bandwidth issues)
– Small VCL broker and emergency backup with external cloud services
• Operate your own full Installation (v2.2) – requires Level 3 training (http://cwiki.apache.org/VCL/)
• Levels
– Virtual-only VCL – quick, hardware agnostic, e.g., VMware, KVM pool, VCL vm management
– VCL-in-the-box, and hardware appliance
– Training: VCL Sandbox
• Storage
– From “I own the data stores and locations” to “don’t care”

[Diagram labels: Web Interface, Manager, Scheduler, Schedule DB, Service Environment, Environment Library]

(44)

Small VCL Configuration

General Basic Configuration

Three Networks: Public, Private, Management

Intelligent Images, Security

1 BladeCenter E/H chassis
– 2 Ethernet Switch Modules (BNT Layer 2/3 copper)
– Power supplies 3&4 (for 7 or more blades)
– Chassis network module to connect management node to storage: Fiber Channel - Optical pass through; iSCSI - Copper pass through

2-14 HSxy Blades
– At least one blade configured to attach to external storage for Image Library (FC, iSCSI, …)
– Server for scheduler, database, and management node
– Server(s) to deliver VCL services

Storage for Images
– FC or iSCSI storage array (few TB)

[Diagram labels: ESM, ESM, MM, OPM, ESM]

(45)

Scaling VCL

General Multi-Chassis

Network switch
– Cisco 6509e (or equivalent in your favorite network vendor flavor)

3 separate networks + VLANs
– Public Network connected to Internet for user access
– Private Network connected to VCL management node (for loading and managing images)
– Private Management network (connecting BladeCenter Management Modules and VCL management node - controls power on/off, reboot, …)

VCL Management nodes
– One management node for every ~100 blades
– Physical connection to storage array - shared file system (GFS, GPFS) for multiple management nodes at one site

[Diagram labels: GigE Switches, Public Network, Private Management]

(46)

HPC Cluster in VCL

HPC Configuration

Network switch
– Add another private network for message passing traffic - use NIC that would be used for Public network user access

BladeCenter Chassis
– Configure two VLANs in one chassis switch module: one for public Internet access and one for private message passing interface

VCL management node
– configures blade VLAN based on image metadata

[Diagram labels: HPC Storage, Storage Servers, GigE Switches, Public Network, Message Passing Network, Private Network, Private Management Network]

(47)

The Resource “Knows”

• Images are VCL’s primary currency – they are software stacks (bare-metal or virtual). They “know” who can use them, how many licenses they are allowed to use, how to defend themselves, what storage to access, etc.
• Other resources (computers, schedules, user groups, etc.) are also user, role, and security conscious.
• This provides scalable, customizable and flexible resource security
• Above is typically coupled with system level security (VLANs, VPNs, traffic and load monitors, etc.)
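One way to picture an image that "knows" who can use it and how many licenses it may consume is as metadata carried with the image and checked at reservation time. The sketch below is a hypothetical model; the `ImageMeta` class, its fields, and the group names are illustrative assumptions and do not reflect VCL's actual schema.

```python
from dataclasses import dataclass, field
from typing import Set

@dataclass
class ImageMeta:
    name: str
    allowed_groups: Set[str] = field(default_factory=set)  # who may use the image
    max_licenses: int = 0                                   # concurrent license limit
    in_use: int = 0                                         # licenses currently held

    def can_reserve(self, user_groups: Set[str]) -> bool:
        """True if the user belongs to an allowed group and a license is still free."""
        authorized = bool(self.allowed_groups & user_groups)
        return authorized and self.in_use < self.max_licenses

    def reserve(self, user_groups: Set[str]) -> bool:
        """Take one license if the request is permitted; report whether it succeeded."""
        if not self.can_reserve(user_groups):
            return False
        self.in_use += 1
        return True

if __name__ == "__main__":
    image = ImageMeta("stats-suite", {"analytics-students"}, max_licenses=2)
    print(image.reserve({"analytics-students"}))  # permitted group, license available
    print(image.reserve({"guests"}))              # group not on the allowed list
```

Keeping these rules with the image rather than with the hardware is what lets the same checks apply whether the image is loaded on bare metal or in a virtual machine.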

(48)

Shades of Things to Come

• Do you want to own a power plant (well… maybe)?
– It is usually cheaper to “buy” power and have a backup generator, perhaps a transformer, and circuit breakers (and a few flashlights). The same should be true with Cloud services.
• VCL-Cloud-Broker – a lightweight local cloud that acts as a wide-area cloud resource broker. It also provides emergency backup resources, perhaps a data vault, first class easy-to-add interfaces to other on-demand cloud services, a monitoring and brokering “image”, control and monitoring “dashboard” for seamless service imports, master image repository, involuntary vendor-lock-in protection, etc.

(49)

How much?

• Capital costs
– depends on the size, e.g., cca $100k for a blade center that supports between 200 and 400 simultaneous users (data center and networking infrastructure assumed).
• Operational costs
– e.g., $200/blade per year + 2-3 FTEs per 5,000 units + licenses
• Skills needed (depends on the type of installation, level 1 through 4 for large installations)
• Today, for a relatively modest investment, an organization, a state, a nation can provide first class, full control, scalable, secure, and sustainable (cost-effective) information technology cloud (and sub-cloud) support for its daily operations, for education, and for its critical services and applications.
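The operational cost rule of thumb above ($200 per blade per year, plus 2-3 FTEs per 5,000 units, plus licenses) can be turned into a small estimator. In this sketch the cost of one FTE and the license total are assumptions supplied by the caller, since the talk does not give them; the function name and the default of 2.5 FTEs per 5,000 units (the midpoint of the quoted range) are also illustrative.

```python
def annual_operating_cost(blades, cost_per_fte, ftes_per_5000=2.5,
                          per_blade=200.0, licenses=0.0):
    """Estimate yearly operating cost from the per-blade and staffing rules of thumb."""
    staff = blades * ftes_per_5000 / 5000   # FTEs scaled to the installation size
    return blades * per_blade + staff * cost_per_fte + licenses

if __name__ == "__main__":
    # Hypothetical inputs: 2,000 blades and an assumed $100,000 per FTE per year.
    print(annual_operating_cost(2000, cost_per_fte=100_000))
```

The staffing term scales linearly here; a small installation would in practice still need some minimum staffing, which this sketch does not model.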

(50)

Use, Capacity & Operational Profile

> 800 images available cca 250,000 reservations per year

(51)

Green & Cost-Effective

[Chart: Average Number of Reservations by Time of Day (24 hr clock), November 2008; average daily active reservations]

[Chart: High-Performance Computing CPU Hours (over 12 months, Mar08-Mar09)]

Opportunity to save power or increase utilization

(52)

Cost Factors

Utilization (70-80%) – HPC + General mix, economy of scale.

Lab spaces (25:1) – currently cca 250,000 non-HPC reservations per year, cca 10+ million HPC CPU hrs.

Virtualization, as many as 20+ VMs per physical blade

Extended hardware life. Refresh cycle (yearly), resource lifetime (cca 5 years) – yearly down-migration of resources

Power savings (Blades)

Architectural savings: one BladeCenter chassis (cca 100k) can serve 200+ on-demand concurrent sessions (augmentation mode, VDI, HPC, etc.)

Reduced administration and maintenance costs (about 2 FTEs for about 2,000 blades, 6000+ cores). Distributed burden of image creation (800+ images)

Green

Image driven (security, license, topology, complex environments and workflows, …)

Price point from 3 to 27 cents per CPU hour (HPC plus General use, not counting Services from federated clouds). In K-12, annualized cost per user/child can be as low as a few dollars.

(53)

VCL Economics (annual cost)

Legend: total including personnel (in parentheses: servers and hardware only)

• Considering just VCL use in augmentation mode (cca 500,000 CPU-hrs per year) ...
– Per reservation: $2.20 ($0.72 for servers)
– Per CPU hour: $1.04 ($0.34 for servers)
– Per active user: $29.69 ($9.68 for servers)
– Per potential user: $14.84 ($4.84 for servers)
• With backfill (cca 10.5 million HPC CPU hours per year) ...
– Per CPU hour: $0.12 ($0.08 for servers)
• Additional Benefits
– User owned computers have more value
– Machine refresh cycle has been stretched out
– Distribution of workload (need only 2 FTEs to operate 2000+ physical units)

NCSU VCL: 40,000+ users, most in augmentation and/or HPC mode, 2000+ blades
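The augmentation-mode figures above can be cross-checked against each other. The sketch below assumes that every per-unit cost divides the same annual total, which the slide does not state explicitly; the derived counts are therefore implied estimates from rounded figures, not numbers reported in the talk.

```python
CPU_HOURS = 500_000          # cca CPU-hours per year in augmentation mode
COST_PER_CPU_HOUR = 1.04     # total including personnel
COST_PER_RESERVATION = 2.20
COST_PER_ACTIVE_USER = 29.69
COST_PER_POTENTIAL_USER = 14.84

def implied_counts():
    """Back out the annual total and the counts implied by each per-unit cost."""
    total = CPU_HOURS * COST_PER_CPU_HOUR
    return {
        "total_cost": total,
        "reservations": total / COST_PER_RESERVATION,
        "active_users": total / COST_PER_ACTIVE_USER,
        "potential_users": total / COST_PER_POTENTIAL_USER,
    }

if __name__ == "__main__":
    for name, value in implied_counts().items():
        print(f"{name}: {value:,.0f}")
```

Under this assumption the implied reservation count lands in the same range as the cca 250,000 reservations per year quoted elsewhere in the talk, which suggests the figures are mutually consistent.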

(54)

VCL Status and Future Directions

• Features in Production use
– Code open sourced through Apache, support through Apache and IBM (IBM GBS/UDS provides support for VCL educational sector in the US)
– Block and Recurring Reservations
– Long term reservations
– Load multiple images with single environment reservation
– Clusters on demand (can host ANY other cloud)
– Virtual Machines for light weight applications, bare-metal loads for more demanding apps.
– Open plug-in architecture for possible proprietary expansion
– Scalable, augmentable, and secure
• Features under development
– Interface to public clouds (Amazon and IBM) – in beta
– Automated storage provisioning
– Reduced image load time (in beta)

(55)

Summary

• Exa-scale analytics will work best in a very granular analytics-aware cloud environment.
• For the world (and VCL) - lots of plans for the future (e.g., cloud federation, preconfigured Analytics sub-clouds, …)
• VCL experience: clouds work if utilization and reliability are high, end-user access is easy, and functionality is diverse and easily modifiable and morphable to user needs.

http://vcl.ncsu.edu

• For a VCL “taste”-account, send email to
