Apache-VCL
Exascale (Computing) in the Cloud
Mladen A. Vouk
Professor and Head of Computer Science, and Associate
Vice-Provost for Information Technology
North Carolina State University, Raleigh, NC 27695
vouk@csc.ncsu.edu
Exascale Computing in the Cloud
This talk discusses the architecture of "cloud computing" as it is
beginning to emerge in the world today, how VCL, NC State's
award-winning open-source cloud computing technology, fits into that space,
and the directions cloud computing is moving in. Petabyte, and soon
exabyte, data collections and streams are not a rarity anymore,
especially in the "cloud". How and where one processes such large
amounts of data in a meaningful way is less of a challenge in terms of
physical storage, communication infrastructure and processing power
than it is a challenge in terms of algorithms and analytics for knowledge
extraction and decision support. In the domain of exascale (and
higher), the "winners" will be analytics engines, algorithms and tools that
can support on-demand decision making based on "big" data sources,
either in-situ (at the data sources) or via specialized post-processing
analytics clouds. This problem spans a variety of application domains,
from health care, to science, to security, to bio-technology and energy
management.
~ 1 zettabyte on Internet; Challenge: collection, collation, finding, …?
We are being flooded with information, data …
Giga 10^9, Tera 10^12, Peta 10^15, Exa 10^18, Zetta 10^21, Yotta 10^24
Volume
Exabytes of data move around in networks and/or are generated. In 2010 the codified information base was expected to double every 11 hours.
Variety
80% of new data growth is unstructured content or content with hidden structures, e.g., email, blogs, web pages, white papers, images, video and audio. A lot is "content in the …
Cost
An average company with 1,000 employees spends $5.3 million a year to find information stored on its servers. 42% of managers say they use the wrong information …
Computers and Supercomputers: EFLOPS cca 2018
Peta-X
• Fastest supercomputer today: 2.5+ PFLOPS (mostly in-memory calculations, Linpack benchmark)
• Fastest embarrassingly parallel system (loosely coupled) is probably Folding@home: 6.9 to 12.2 PFLOPS
• Data processed by Google on a daily basis – about 24 petabytes
• Large Hadron Collider (CERN) – expected to produce 15 petabytes per year
Exa-X
• Global internet traffic ran about 21 exabytes per month in March 2010.
• Mobile traffic will reach over 2 exabytes per month by 2013.
• There are at least 2,000,000,000 computational devices world-wide – collectively 2+ exa-(fl)ops.
• Digital content of the world is closing in on a zettabyte.
• First exabyte tape library – Oracle Corp, January 2011.
• Square Kilometer Array Radio Telescope is …
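To put the Exa-X figures on a common scale, the following back-of-the-envelope sketch converts two of them: the average byte rate implied by 21 exabytes per month, and the per-device rate implied by 2 exaflops spread over 2,000,000,000 devices. The 30-day month and the function names are assumptions made for illustration only.

```python
# SI prefixes as listed on the slide, as powers of ten.
PREFIX = {"giga": 1e9, "tera": 1e12, "peta": 1e15,
          "exa": 1e18, "zetta": 1e21, "yotta": 1e24}


def average_rate_bytes_per_s(exabytes_per_month, days_per_month=30):
    """Average byte rate implied by a monthly volume (30-day month assumed)."""
    seconds = days_per_month * 24 * 3600
    return exabytes_per_month * PREFIX["exa"] / seconds


def per_device_flops(total_exaflops, devices):
    """Average floating-point rate per device for a collective total."""
    return total_exaflops * PREFIX["exa"] / devices


internet_rate = average_rate_bytes_per_s(21)         # March 2010 traffic figure
device_flops = per_device_flops(2, 2_000_000_000)    # 2+ exaflops over 2e9 devices

print(f"{internet_rate / PREFIX['tera']:.1f} TB/s average")
print(f"{device_flops / PREFIX['giga']:.1f} GFLOPS per device")
```

In other words, the "2+ exa-(fl)ops" figure corresponds to roughly one gigaflop per device on average.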
Limitations
• Storage (exa+ capable) – exa now
• Networks (100 giga+ channels, exa+ capable)
• Computers (tightly and loosely coupled, peta+, exa pending, cca 2018)
– Input/Output, latency, power, cooling, …
• Trust (hierarchical analytics, data, confidentiality, security, privacy, science, …)?
• Humans (on average, absorb at most 20 bits per second of new info, no improvement expected)
Who Can Help? Watson?
- 21.6 Terabyte storage
- 90 servers, 2880 CPUs
- Total 80 TFLOPS
- 1 TB of memory
• Q&A tool
• Natural language processing
• Text processing
• Data mining
• Pattern matching
• Large scale data reduction - from terabytes to bytes
Analytics
• Analytics of very large amounts of data needs a number of enabling technologies.
• It will be distributed, continuous, and very adaptable to different domains; it will face variably granular and changing inputs, needs and conditions.
• Peta and Exa may often come from a large number of much smaller operations and actions.
• Analytics needs to be secure and privacy preserving, and it should be used to ensure security and privacy, as well as information integrity and veracity.
• In-situ (or on-the-fly) analytics, post-analytics, trusted hierarchical analytics, data-to-algorithm, algorithms-to-data.
Analytics Cloud "Outpost" (Outpost Cloud Manager #x, Image Repository) – Data Limited: bring computations to data
MyCloud (My Cloud Manager #x, Image Repository, MyData) – Movable Data: bring data to computations
Cloud Computing Enables Analytics of “Big” Data
• Utility-level personalized, secure and on-demand/self-service based delivery of agile and mobile information technology services needs to accommodate a range of services, from desktop to HPC.
• This is essential for scientific discovery, education, and critical IT systems (e.g., health care, power grids, transportation systems, military logistics and tactical systems, intelligence gathering systems, air traffic control, security, financial …)
• Complex but trusted hierarchical and distributed analytics …
Service Levels
Level 1 (millions): Users of Services (from naïve to sophisticated), Help-Desk
Level 2 (thousands): Advanced User – Services Integration & Provisioning (group reservations, image creation, image aggregates (clouds), etc.)
Level 3 (hundreds): Service Authors and Administrators, Base-line Images, Basic Installation
Level 4 (tens): Expert Developers and Advanced Installers
Gotta Have It!
Network
Private vs. Public Clouds
Domain Services, Analytics, & Other
API
HaaS, IaaS, PaaS, AaaS, SaaS, SECaaS, LOCaaS, CaaS, …
(Bare-Metal & Virtualized) Resources & Services
Authentication, Authorization, Accounting & Other Attributes
Reliability & Fault Tolerance
Data, Data, Data: Import, Export, Exchange, Provenance, Meta-Data, Privacy, Security, Licenses, …
Where exactly are my data? Who is sharing hardware and O/S with me?
Client (End-User) Portal, Service Oriented (SOA), Analytics support
Help Desk, Training, Education
Services
The principal difference from traditional services
is the level of control an end-user has. Full control is possible.
Hardware as a Service (HaaS) – On-demand access to an explicit (specific) computational, storage and networking product and/or equipment configuration, possibly at a particular site (Location as a Service - LaaS)
Infrastructure as a Service (IaaS) – On-demand access to user specified hardware, interconnects, and storage capabilities, performance and services, which may run on a variety of hardware products
Platform as a Service (PaaS) – On-demand access to a user specified combination of hypervisors, operating system, and middleware that enables user required applications and services running on HaaS and/or IaaS
Application as a Service (AaaS) – On-demand access to user specified application(s)
Software as a Service (SaaS) – may encompass anything from PaaS through AaaS
Cloud as a Service, Security as a Service, Portability as a Service, Storage & Location as a Service, e.g., where are my data stored? I wish my data to be stored within 100 miles of X – needs HaaS control …
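A location constraint such as "within 100 miles of X" can be checked with a great-circle distance. The sketch below is illustrative only and is not part of VCL: the site names, coordinates, and function names are hypothetical, and the haversine formula with a mean Earth radius is one common way to approximate the distance.

```python
import math

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius; an approximation


def great_circle_miles(lat1, lon1, lat2, lon2):
    """Haversine distance in miles between two (latitude, longitude) points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))


def sites_within(anchor, sites, max_miles=100.0):
    """Names of storage sites no farther than max_miles from the anchor point."""
    return [name for name, (lat, lon) in sites.items()
            if great_circle_miles(anchor[0], anchor[1], lat, lon) <= max_miles]


# Hypothetical anchor "X" and candidate storage sites (approximate coordinates).
X = (35.78, -78.64)
candidate_sites = {"site_a": (35.90, -78.86), "site_b": (40.71, -74.01)}

allowed = sites_within(X, candidate_sites)
print(allowed)
```

Enforcing such a rule presupposes that the service exposes where the hardware physically sits, which is why the slide ties it to HaaS-level control.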
“Analytics Cloud“ – under the hood
Knowledge creation & integration, social networking, provenance, tracking & meta-data (DBs and portals)
Workflow control plane: concept-driven analytics, W/F engine, analytics W/F generation wizard, synchronous & asynchronous services, run-time manager and scheduler
Execution plane: "heavy duty" in-cloud computations, flows, services
Analytics enabled resources and images: supercomputers, clusters, active storage
Text Analytics Example
http://vcl.ncsu.edu
• NCSU College of Management researchers – Currently many Business Intelligence (BI) needs are not met (e.g., market analysis). BI is often based on analysis of structured databases (which represent only 20% of available data); a "mash up" of structured and unstructured information is needed.
• Some technology support needs
• Search key words in structured and unstructured data, user-defined thesaurus/dictionary, user-defined relationships, meta files easily queried, data easily exported, user-defined reports, graphical representation of data, avoid commercial search engines, user determines priority, works on any data, not just web data
Enabling Technologies
• IBM JStart
– LanguageWare: a natural language processor that allows applications to 'read' natural languages; also a toolkit to model a domain via a dictionary, relationships and rules
– IBM Content Analytics: an analytics engine for analyzing and reporting unstructured data
– Big Sheets: a browser-based analytics solution for very large data sets
• NCSU VCL Cloud – Analytics sub-cloud
– 12 IBM BladeCenter HS22 blades, cca 1 TFLOPS, several Terabytes of storage and about …
Analytics Cloud
VCL Analytics Images (in beta-testing)
iGold Steering Dashboard
Databases, Clouds, Web
VCL Centipede Crawls …
VCL Data Banks
Weeks of data collection per "run".
Numerical Analytics Example
http://hpc.ncsu.edu
The Ocean Observing and Modeling Group, headed by Dr. Ruoying He at NCSU.
NOAA emergency response division has been using the SABGOM ocean current nowcast and forecast (along with 3 other ocean models) to generate an official oil trajectory prediction, used to guide responses of the local, state, and federal governments.
Modeling Gulf Currents
The graphic is generated using the South Atlantic Bight and Gulf of Mexico (so-called NC State SABGOM) model. This model (along with weather prediction) is run daily on a Myrinet-equipped subcluster of the NCSU VCL-HPC sub-cloud, predicting present and future (84 …
Enabling Technology: NCSU VCL-HPC
http://analytics.ncsu.edu
Mission: promote graduate education and research in the emerging field of analytics. Educate the citizens of North Carolina and beyond in the concepts, methods, software tools, and applications of analytics that have direct and practical relevance to industry.
Coverage: Includes data collection and integration, statistical methods, and complex processes for enterprise-wide decision making.
Output: MS in Advanced Analytics. As the use of analytics becomes more widespread, there is mounting demand for professionals with strong quantitative skills coupled with an understanding of how the …
Enabling Technologies
• Full suite of SAS Analytics Products used to train MS students
• NCSU VCL Cloud – IAA Analytics sub-cloud
At Exascale, Clouds
• Will need to frequently exchange information and algorithms with other clouds.
• Will need to have well defined interfaces for data and algorithm exchange.
• Will engage in automatic summarization of information about data sets and data streams (what type of coding and annotation to use?).
• Must be secure, trustworthy and privacy aware
Exascale-Analytics Cloud
VCL Analytics Images
VCL E-Analytics Images
Workflow Support
iGold Dashboards
Analytics Aware VCL Data Banks
High-performance Clouds
In-situ analytics: VCL Super-Centipede Crawls Data Sources and Returns only Processed Data
http://vcl.ncsu.edu
VCL Case-Study
Key Partner
(Virtual Computing Laboratory Technology)
(Very Flexible and Secure, Open Source, Self-Service and Image-based)
http://incubator.apache.org/projects/vcl.html
Current NC State University VCL installation – Private Cloud:
• 2000+ blades, 7000+ cores, maintenance support: 2 FTE
• About 700+ in General mode, about the same in HPC mode, and several hundred in various test-beds
• Open to 40,000+ NCSU students and faculty, different pilots and partner accounts; through Shibboleth all UNC System campuses have access to VCL (cca 250,000 students).
• Delivers as many as 250,000 service reservations per year and over 10.5 million CPU hours (including HPC cycles).
• Low cost: At NCSU between 3 and 30 cents per CPU hour
VCL was Cloud when Cloud was not (yet) Cool
Google Trends (3/27/11)
VCL Production-level Services Started Fall 2004
IBM & Google announce "Cloud"
VCL Home Page
VCL has it (almost) all – and it works.
Are you interested in a "taste" account?
If so, please send [email protected] an email and we will give you access.
Home Page
Self-service, Open, Modular, Flexible, Scalable, Upgradable, Secure, BM & VM, Distributed, Cost-effective, Reliable, Functional, …
VCL Services
Reservation Times: From 30 min to open-ended
Load Times: From a few sec to 20 min (service dependent, 80+% in less than 2 min)
Actual sole-use bare-metal, or virtual images
Stateless, Augmentation, and Persistent modes
HaaS, IaaS, PaaS, AaaS, [SaaS, CaaS]
Undifferentiated Resources: Single Seat (VCL-Desktop), Multiple Synced Seats (VCL-Class), Servers (VCL-Server), Aggregates (VCL-Cloudlets), HPC Clusters (VCL-HPC)
Differentiated Resources (VCL Agent): Supercomputers, e.g., System Z (mainframes), Labs, Other clouds (EC2, IBM), Storage, Other …
Some Facts
• Image baselines are typically Windows and Linux with a variety of applications – VDI can be one of the apps. Depending on how demanding an application is, it may be virtualized (e.g., VMWare, KVM, XEN, ...) or it may run on bare-metal.
• External services (e.g., to EC2, IBM cloud, etc.)
• Currently over 800 images, over 120 in use per semester.
• About 100,000+ image reservations per semester.
• Most of the "individual seat" requests are on-demand ("Now") reservations: about 90% of requests
• System availability: exceeds 99.9%, image reservation reliability > 99%
• General, HPC and Service operation modes
• NCSU (in 5 data centers, four on campus, one off campus at MCNC)
• Numerous partners and pilots. A number of stand-alone facilities (including: Duke, ECU, GMU, RENCI, UNC-CH, NCCU, India, Old Dominion, Western Carolina, Kannapolis, GA, CA, VA, MD, SC, MA, LA, etc.)
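The availability and reliability percentages above translate into concrete bounds. As a rough sketch (the function names are made up for illustration, and a 365-day year is assumed), availability above 99.9% means less than about 8.76 hours of downtime per year, and reservation reliability above 99% means fewer than about 1,000 failed reservations out of 100,000 per semester.

```python
HOURS_PER_YEAR = 365 * 24  # 8,760 hours, ignoring leap years


def max_downtime_hours(availability, period_hours=HOURS_PER_YEAR):
    """Upper bound on downtime for a given availability fraction."""
    return (1.0 - availability) * period_hours


def max_failed_reservations(reservations, reliability):
    """Upper bound on failed reservations for a given reliability fraction."""
    return reservations * (1.0 - reliability)


downtime = max_downtime_hours(0.999)              # availability exceeds 99.9%
failed = max_failed_reservations(100_000, 0.99)   # reliability > 99% per semester

print(f"downtime bound: {downtime:.2f} hours/year")
print(f"failed reservation bound: {failed:.0f} per semester")
```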
General Reservation
(VCL-Desktop, VCL-Server)
Short-term reservations (2-3 hours)
Long-term reservations: need to explicitly manage
- state persistence
- timeout
- backups
Exascale/SRCE/28-Mar11/v7a 30
Frequently used images load very quickly
Group Reservations
(VCL-Class)
This type of reservation does not pay attention to topology, just to coordinated delivery of individual images.
Aggregate Environments – Sub-Clouds (VCL-Cloud)
Analytics Cloudlet
Parent and children know about each other:
[vouk@bn19-36 etc]$ more cluster_info
child= 152.46.19.36
parent= 152.46.19.5
child= 152.46.20.78
child= 152.46.20.86
[vouk@bn19-36 etc]$
Diagram: a parent node with Linux (Lin) and Windows (Win) children.
This function allows construction of custom sub-clouds: controller + any number of (hybrid) non-recursive children. Topology control depends on image construction – typically within range of one management range.
WHAT DO WE DO WITH …
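A node inside a sub-cloud can discover its parent and siblings by reading the cluster_info listing shown above. The following is a minimal sketch of such a reader; the function name and the returned dictionary layout are assumptions for illustration, and only the `parent=` / `child=` line format is taken from the listing.

```python
def parse_cluster_info(text):
    """Parse 'parent= <ip>' and 'child= <ip>' lines into a small dictionary."""
    info = {"parent": None, "children": []}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or "=" not in line:
            continue  # skip blank or unrecognized lines
        key, _, value = line.partition("=")
        key, value = key.strip().lower(), value.strip()
        if key == "parent":
            info["parent"] = value
        elif key == "child":
            info["children"].append(value)
    return info


# Contents as shown in the listing above.
sample = """child= 152.46.19.36
parent= 152.46.19.5
child= 152.46.20.78
child= 152.46.20.86
"""

cluster = parse_cluster_info(sample)
print(cluster["parent"], cluster["children"])
```

With the addresses in hand, a controller image could, for example, distribute work to each child in turn.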
HPC (VCL-HPC)
Diagram: Internet, Login Node, HPC Scheduler, HPC Job, HPC Storage, Compute Nodes
Full control of topology, storage, and communications. VLAN-ed separately. Migration of resources from …
A Look Inside
VCL Top Level Architecture
(TM = Traffic Monitor)
Internet (TM), Authentication Service, VCL Manager & Scheduler, VCL Database
Node Manager #1, Node Manager #2, … Node Manager #n, each with an Image Repository and Storage
Undifferentiated Resources, Virtual or Real
Differentiated Resources, Virtual or Real: z-Series, Tera-Grid, University Labs, Storage
Scheduler
Components: Scheduler, VCL DB, Management Node, Image Library, Server
• User selects desired application through web interface
• Scheduler finds available server with requested application or, if not loaded, has management node load image with requested application on an available server
• User accesses desired application through OS provided method (RDP for Windows, ssh/X11 for Linux)
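The reservation flow described above (pick an available server that already has the image, otherwise have an image loaded onto a free one, then hand back the access method) can be sketched in a few lines. This is a toy model under stated assumptions, not the VCL scheduler: the `Server` class, `reserve` function, and server and image names are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Server:
    name: str
    loaded_image: Optional[str] = None
    reserved: bool = False


# Access methods named on the slide, keyed by the image's operating system.
ACCESS_METHOD = {"windows": "RDP", "linux": "ssh/X11"}


def reserve(servers, image, image_os):
    """Return (server name, access method) or None if nothing is available."""
    free = [s for s in servers if not s.reserved]
    if not free:
        return None
    # Prefer a free server that already has the requested image loaded.
    target = next((s for s in free if s.loaded_image == image), None)
    if target is None:
        # Stand-in for the management node loading the image on a free server.
        target = free[0]
        target.loaded_image = image
    target.reserved = True
    return target.name, ACCESS_METHOD[image_os]


servers = [Server("blade1", "matlab-linux"), Server("blade2", "office-win")]
first = reserve(servers, "office-win", "windows")    # preloaded on blade2
second = reserve(servers, "office-win", "windows")   # loaded onto blade1
third = reserve(servers, "office-win", "windows")    # no free servers left
print(first, second, third)
```

Preferring a preloaded server is what makes frequently used images load quickly, since the image transfer step is skipped.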
xCat
TMP-i HSLT
vmWare, Xen, KVM, …
EC2, IBM, … Federated Services
Security as a Service
• Variety of authentication options (LDAP, Shibboleth … other)
• High security and isolation (IP-lock, local firewalls, point-to-point VLANs and VPNs, one-time passwords, feedback confirmation, timeout, traffic monitoring …)
• Sophisticated resource access and mapping privilege tree.
• Real-time monitoring of reliability and security
Diagram labels: Auth, Traffic Monitoring, Policy based Privilege Tree Maps, Timeout, One Time Passwd, VPN, IPLock, Activity, VLANs within VCL
VCL Dashboard
(real time)
Provenance and Meta-Data
Provenance and performance statistics for any time period are available to a general user, including reliability information.
VCL Implementation Options
• Partnerships
– Guest in an existing installation (very quick)
– Local Resources – Remote Management (can solve latency and remote bandwidth issues)
– Small VCL broker and emergency backup with external cloud services
• Operate your own full Installation (v2.2) – requires Level 3 training (http://cwiki.apache.org/VCL/)
• Levels
– Virtual-only VCL – quick, hardware agnostic, e.g., VMware, KVM pool, VCL vm management
– VCL-in-the-box, and hardware appliance
– Training: VCL Sandbox
• Storage
– From "I own the data stores and locations" to "don't care"
Diagram labels: Web Interface, Manager, Scheduler, Schedule DB, Service Environment, Environment Library
Small VCL Configuration
General Basic Configuration
1 BladeCenter E/H chassis
– 2 Ethernet Switch Modules (BNT Layer 2/3 copper)
– Power supplies 3&4 (for 7 or more blades)
– Chassis network module to connect management node to storage
– Fiber Channel - Optical pass through
– iSCSI - Copper pass through
2-14 HSxy Blades
– At least one blade configured to attach to external storage for Image Library (FC, iSCSI, …)
– Server for scheduler, database, and management node
– Server(s) to deliver VCL services
Storage for Images
– FC or iSCSI storage array (few TB)
Three Networks: Public, Private, Management
Intelligent Images, Security
Diagram labels: ESM, ESM, MM, OPM
Scaling VCL
General Multi-Chassis
Network switch
– Cisco 6509e (or equivalent in your favorite network vendor flavor)
– 3 separate networks + VLANs
– Public Network connected to Internet for user access
– Private Network connected to VCL management node (for loading and managing images)
– Private Management network (connecting BladeCenter Management Modules and VCL management node - controls power on/off, reboot, …)
VCL Management nodes
– One management node for every ~100 blades
– Physical connection to storage array - shared file system (GFS, GPFS) for multiple management nodes at one site
Diagram labels: GigE Switch (Public Network), GigE Switch (Private Network), GigE Switch (Private Management)
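The scaling rules of thumb above lend themselves to a quick sizing estimate. The sketch below assumes up to 14 blades per chassis (from the basic configuration) and one management node per roughly 100 blades; the function name and the rounding up are illustrative choices, and a real design would also account for the blades used by the scheduler, database, and storage attachment.

```python
import math


def size_installation(blades, blades_per_chassis=14, blades_per_mgmt_node=100):
    """Estimate chassis and management node counts for a given blade count."""
    return {
        "chassis": math.ceil(blades / blades_per_chassis),
        "management_nodes": math.ceil(blades / blades_per_mgmt_node),
    }


plan = size_installation(2000)  # roughly the size of the NCSU installation
print(plan)
```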
HPC Cluster in VCL
HPC Configuration
Network switch
– Add another private network for message passing traffic - use NIC that would be used for Public network user access
BladeCenter Chassis
– Configure two VLANs in one chassis switch module: one for public Internet access and one for private message passing interface
VCL management node
– configures blade VLAN based on image metadata
Diagram labels: HPC Storage, Storage Servers, GigE Switches for the Public Network, Message Passing Network, Private Network, and Private Management Network
The Resource “Knows”
• Images are VCL's primary currency – they are software stacks (bare-metal or virtual). They "know" who can use them, how many licenses they are allowed to use, how to defend themselves, what storage to access, etc.
• Other resources (computers, schedules, user groups, etc.) are also user, role, and security conscious.
• This provides scalable, customizable and flexible resource security
• Above is typically coupled with system level security (VLANs, VPNs, traffic and load monitors, etc.)
Shades of Things to Come
• Do you want to own a power plant (well… maybe)?
– It is usually cheaper to "buy" power and have a backup generator, perhaps a transformer, and circuit breakers (and a few flashlights). The same should be true with Cloud services.
• VCL-Cloud-Broker – a lightweight local cloud that acts as a wide-area cloud resource broker. It also provides emergency backup resources, perhaps a data vault, first class easy-to-add interfaces to other on-demand cloud services, a monitoring and brokering "image", a control and monitoring "dashboard" for seamless service imports, a master image repository, involuntary vendor-lock-in protection, etc.
How much?
• Capital costs
• depends on the size, e.g., cca $100k for a blade center that supports between 200 and 400 simultaneous users (data center and networking infrastructure assumed).
• Operational costs
• e.g., $200/blade per year + 2-3 FTEs per 5,000 units + licenses
• Skills needed (depends on the type of installation, level 1 through 4 for large installations)
• Today, for a relatively modest investment, an organization, a state, a nation can provide first class, full control, scalable, secure, and sustainable (cost-effective) information technology cloud (and sub-cloud) support for its daily operations, for education, and for its critical services and applications.
Use, Capacity & Operational Profile
> 800 images available, cca 250,000 reservations per year
Green & Cost-Effective
[Chart: average number of reservations by time of day (24 hr clock), November 2008; average daily active reservations]
[Chart: High-Performance Computing CPU hours over 12 months (Mar08-Mar09)]
Opportunity to save power or increase utilization
Cost Factors
• Utilization (70-80%) – HPC + General mix, economy of scale.
• Lab spaces (25:1) – currently cca 250,000 non-HPC reservations per year, cca 10+ million HPC CPU hrs.
• Virtualization, as many as 20+ VMs per physical blade
• Extended hardware life. Refresh cycle (yearly), resource lifetime (cca 5 years) – yearly down-migration of resources
• Power savings (Blades)
• Architectural savings: one BladeCenter chassis (cca 100k) can serve 200+ on-demand concurrent sessions (augmentation mode, VDI, HPC, etc.)
• Reduced administration and maintenance costs (about 2 FTEs for about 2,000 blades, 6000+ cores). Distributed burden of image creation (800+ images)
• "Green"
• Image driven (security, license, topology, complex environments and workflows, …)
• Price point from 3 to 27 cents per CPU hour (HPC plus General use, not counting Services from federated clouds). In K-12, annualized cost per user/child can be as low as a few dollars.
VCL Economics (annual cost)
Total including Personnel (& Hardware only)
• Considering just VCL use in augmentation mode (cca 500,000 CPU-hrs per year) ...
– Per reservation: $2.20 ($0.72 for servers)
– Per CPU hour: $1.04 ($0.34 for servers)
– Per active user: $29.69 ($9.68 for servers)
– Per potential user: $14.84 ($4.84 for servers)
• With backfill (cca 10.5 million HPC CPU hours per year) ...
– Per CPU hour: $0.12 ($0.08 for servers)
NCSU VCL: 40,000+ users, most in augmentation and/or HPC mode, 2000+ blades
• Additional Benefits
– User owned computers have more value
– Machine refresh cycle has been stretched out
– Distribution of workload (need only 2 FTEs to operate 2000+ physical units)
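The augmentation-mode unit costs can be cross-checked against the usage figures quoted elsewhere in the talk. The sketch below assumes, purely for illustration, that all four unit costs divide the same annual total (CPU hours times cost per CPU hour); under that assumption the implied counts of reservations and users can be backed out. The variable names are made up for this example.

```python
# Augmentation-mode figures from the slide.
cpu_hours = 500_000
cost_per_cpu_hour = 1.04
cost_per_reservation = 2.20
cost_per_active_user = 29.69
cost_per_potential_user = 14.84

# Assumed: every unit cost is the same annual total over a different denominator.
total = cpu_hours * cost_per_cpu_hour

implied_reservations = total / cost_per_reservation
implied_active_users = total / cost_per_active_user
implied_potential_users = total / cost_per_potential_user

print(f"annual total: ${total:,.0f}")
print(f"implied reservations: {implied_reservations:,.0f}")
print(f"implied active users: {implied_active_users:,.0f}")
print(f"implied potential users: {implied_potential_users:,.0f}")
```

Under that assumption the implied reservation count comes out in the same range as the cca 250,000 reservations per year quoted in the talk, and the potential-user count is of the same order as the 40,000+ NCSU users.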
VCL Status and Future Directions
• Features in Production use
– Code open sourced through Apache, support through Apache and IBM (IBM GBS/UDS provides support for VCL educational sector in the US)
– Block and Recurring Reservations
– Long term reservations
– Load multiple images with single environment reservation