
The CLASS Cloud Access Pilot


16 May 2012

Kenneth S. Casey, NODC

On Behalf of the

The NOAA National Data Centers


 

– National Oceanographic Data Center

• Understanding our Oceans and Coasts

– National Geophysical Data Center

• Understanding our World

– National Climatic Data Center


The NOAA National Data Centers

Based on NODC’s Levels of Stewardship

Across the three Data Centers the words vary a little, but all focus on stewarding environmental

Comprehensive Large Array‐data Stewardship System (CLASS)

• Designed originally for large‐volume satellite data sets

• IT infrastructure supporting the lowest level of stewardship

• NESDIS has mandated its use across the three NOAA National Data Centers

• Even the lowest levels of stewardship require

Evolution of the NOAA Archive Architecture

[Figure: timeline across FY2012, FY2013, FY2014, FY2015 and FY2016, divided into Phase I, Phase II and Phase III. Labels recoverable from the chart: NODC, NCDC, NGDC, Metadata, Cloud Pilot, Access Path, Archive Path, IRODS Data Net, Data Center Migration, M2M, HPSS, Archive Storage, Stewardship, Access, Dissemination, Service, MOB, Concurrent CLASS Initiatives, On‐Hold Programs, NPP, GCOM‐W, Jason, JPSS, GOES‐R.]

Data Centers’ Data Migration Plan


• Approaches and milestones for integrating CLASS into NOAA Data Center operations by FY15

• Three Phases:

– Archival Storage Phase – use of CLASS for safe, secure, long‐term storage

– Access Services Phase – expands CLASS to include access capabilities expected by Consumers and functions needed for Data Center stewardship

– Operations Phase – comparison of levels of service and decommissioning of local Data Center services when appropriate

A metaphor, if you will…


• Working to integrate CLASS into our archive operations is a bit like our “day job”. The kind of thing you have to do. You get up, pull on your boots, and wade through the muck, making the best of it you can.

• But you are not really very happy in your day job, so you start exploring alternatives. Maybe you take some online classes at night, learn some new skills, invest in a startup… you do something to “change the game”, “live the dream”, or “expand your horizons” … the CLASS Cloud Access Pilot is just that sort of thing.

http://www.dreamstime.com/royalty‐free‐stock‐photo‐muddy‐boots‐image14440895

http://www.dreamstime.com/royalty‐free‐stock‐

CLASS Cloud Access Pilot


Why: To test the cost‐effectiveness, scalability, performance, and agility of a Cloud solution for access to archived data

Who: Three NOAA Data Centers and CLASS, reported on by CLASS Operations Working Group

What: At least one data collection from each

CLASS Cloud Access Pilot


When: This FY, might continue into next

Where: A commercial provider of Cloud IaaS will be selected. Government Clouds will also be examined.

How:

‐ Three parallel activities

• Populate Cloud storage with DC‐held data

• Load Data Center access services to the Cloud

• Populate Cloud storage with CLASS‐held data

‐ Test the Cloud‐based data access services with a
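The two "populate Cloud storage" activities imply knowing which files still have to be copied. The Python sketch below is one possible way to work that out, not the pilot's actual tooling: it compares SHA-256 checksums of local files against a manifest of what is already in the Cloud. The manifest format, the function names and the accession and file names are all assumptions made for illustration.

```python
from __future__ import annotations

import hashlib
import tempfile
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def files_needing_upload(local_root: Path, cloud_manifest: dict[str, str]) -> list[str]:
    """List relative paths that are missing from, or differ from, the Cloud copy.

    cloud_manifest is a hypothetical mapping of relative path -> SHA-256 digest
    describing what is already held in Cloud storage.
    """
    pending = []
    for path in sorted(local_root.rglob("*")):
        if path.is_file():
            relative = path.relative_to(local_root).as_posix()
            if cloud_manifest.get(relative) != sha256_of(path):
                pending.append(relative)
    return pending


def demo() -> list[str]:
    """Build a tiny fake holding with one copied and one uncopied file."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        accession = root / "accession_0001"  # hypothetical accession name
        accession.mkdir()
        (accession / "a.nc").write_bytes(b"already copied")
        (accession / "b.nc").write_bytes(b"not yet copied")
        manifest = {"accession_0001/a.nc": sha256_of(accession / "a.nc")}
        return files_needing_upload(root, manifest)


if __name__ == "__main__":
    print(demo())
```

Comparing checksums rather than timestamps means a file is copied again only when its content actually differs from the Cloud copy.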

CLASS Cloud Access Pilot


Common Storage Services using Cloud


CLASS Cloud Access Pilot

[Figure: comparison of service models. Labels recoverable from the diagram: Traditional Data Center, NOAA, Google, Google UMS, Managed by User, Managed by Vendor.]

CLASS Cloud Access Pilot

[Figure: architecture diagram. Labels recoverable from the diagram: Happy Users; Access (FTP, TDS, LAS, DAP), Data Virtually Organized in “logical” directories (e.g., symbolic links); Cloud IaaS, Data Physically Organized in Accessions; Find, Discovery; NODC, NGDC, NCDC; DC local holdings sent to CLASS; CLASS.]

1. The three arrows pointing into the cloud from CLASS and the DCs can develop at different rates.

2. Eventually, the DC to Cloud arrow could go away, and CLASS could manage the synchronization of data to the cloud access layer.

3. For now, discovery services like Geoportal could run locally at DCs, but could also eventually move into the cloud. Discovery services could point to the cloud access layer, the existing DC‐hosted access mechanisms, or both.
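The diagram separates data physically organized in accessions from data virtually organized in "logical" directories, with symbolic links as the example. A minimal Python sketch of that idea follows, assuming a POSIX file system. The directory layout, file names and the `build_logical_view` helper are hypothetical and only illustrate the technique.

```python
from __future__ import annotations

import os
import tempfile
from pathlib import Path


def build_logical_view(accession_root: Path, logical_root: Path,
                       layout: dict[str, str]) -> None:
    """Create symbolic links so each logical path points at a file in an accession.

    layout maps a logical relative path to a physical path relative to
    accession_root. The physical files are never moved or copied.
    """
    for logical, physical in layout.items():
        link = logical_root / logical
        link.parent.mkdir(parents=True, exist_ok=True)
        if link.is_symlink():
            link.unlink()  # replace a stale link from an earlier run
        os.symlink(accession_root / physical, link)


def demo() -> tuple[str, bool]:
    """Store one file in an accession and read it back through a logical path."""
    with tempfile.TemporaryDirectory() as tmp:
        accessions = Path(tmp) / "accessions"
        logical = Path(tmp) / "logical"
        target = accessions / "0001" / "data" / "sst_20120516.nc"  # hypothetical
        target.parent.mkdir(parents=True)
        target.write_text("sea surface temperature")

        build_logical_view(
            accessions, logical,
            {"sst/2012/sst_20120516.nc": "0001/data/sst_20120516.nc"},
        )
        link = logical / "sst" / "2012" / "sst_20120516.nc"
        return link.read_text(), link.resolve() == target.resolve()


if __name__ == "__main__":
    print(demo())
```

Because only links are created, access services such as FTP or TDS could be pointed at the logical tree while the accession layout stays untouched.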

Cloud Provider Evaluation Criteria


• Cost

• Performance – throughput, latency

• Uptime availability

• Reliability ‐ how many 9’s for data integrity

• Capacity ‐ current (200 TB) and future (2 PB) ‐ do we want to be the largest customer

• Security ‐ ITAR and HIPAA not considerations, NIST FISMA moderate, encrypted file transfers, user access controls

• Content Format ‐ universal formats, no wrappers, no limits on granularity

• Organizational Structure ‐ arbitrary DC‐selected structures

• Reporting and Metrics ‐ access logs, performance
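The "how many 9's" criterion can be turned into numbers. The Python sketch below applies the usual convention that N nines means a success fraction of 1 - 10^-N. It applies that to uptime (hours of downtime per year) and, as an assumption about how the slide's data-integrity nines might be read, to the expected number of objects lost per year. The function names and the 365-day year are illustrative choices.

```python
HOURS_PER_YEAR = 365 * 24  # non-leap year, an assumption for the estimate


def allowed_fraction(nines: int) -> float:
    """Fraction of time (or of objects) allowed to fail at the given number of nines."""
    if nines < 1:
        raise ValueError("nines must be at least 1")
    return 10.0 ** -nines


def downtime_hours_per_year(nines: int) -> float:
    """Hours per year a service may be unavailable at the given availability."""
    return allowed_fraction(nines) * HOURS_PER_YEAR


def expected_objects_lost(object_count: int, nines: int) -> float:
    """Expected objects lost per year if the nines describe annual durability."""
    return allowed_fraction(nines) * object_count


if __name__ == "__main__":
    for n in (2, 3, 4, 5):
        print(f"{n} nines: {downtime_hours_per_year(n):.2f} hours of downtime per year")
```

Three nines (99.9%), for example, allows about 8.76 hours of downtime in a 365-day year.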

Cloud Providers Evaluated


• Google Cloud

• Amazon Web Services

• BlueLock

• Terremark

• NetApp

• IBM Federal Cloud/SmartCloud

• Akamai

• Government Clouds (NESDIS, Census Bureau, NASA Nebula)

Example Vendor Costs

Option                             Initial (20 TB)   Continuing 200 TB (monthly)                              Suggested 90‐day capped cost
Option 1a: Amazon                  $2337             $15,473 (storage) + $3174 (capped @ 15% access)          $60,000
Option 1b: Google                  $3740             $18,115 (storage) + $3740 (capped @ 15% access)          $70,000
Option 2: CLASS new HW             $30,964           $3960 (support services)                                 $50,000
Option 3: CLASS repurposed HW      $18,700           $3960 (support services)                                 $35,000

Note: Commercial options provide redundant storage with copies spread across fault‐tolerant (isolated power grid + HVAC) availability zones; CLASS options have a single copy.

Consideration: These options could be used in any combination (i.e., start with option 3
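As a rough cross-check of the figures on this slide, the Python sketch below adds each option's initial cost to three months of the continuing cost. It assumes the full 200 TB monthly rate applies for all three months, and that the Amazon and Google monthly figures are the sum of the storage charge and the capped access charge. The slide does not say how the suggested 90-day caps were derived, so this is an illustration and not the authors' calculation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Option:
    name: str
    initial: int        # one-time cost for the initial 20 TB, in dollars
    monthly: int        # continuing monthly cost at 200 TB, in dollars
    capped_90_day: int  # suggested 90-day capped cost from the slide


# Figures as read from the slide; storage and capped access charges are summed.
OPTIONS = [
    Option("1a Amazon", 2337, 15473 + 3174, 60000),
    Option("1b Google", 3740, 18115 + 3740, 70000),
    Option("2 CLASS new HW", 30964, 3960, 50000),
    Option("3 CLASS repurposed HW", 18700, 3960, 35000),
]


def three_month_estimate(option: Option) -> int:
    """Initial cost plus three months at the full continuing rate (an assumption)."""
    return option.initial + 3 * option.monthly


if __name__ == "__main__":
    for option in OPTIONS:
        estimate = three_month_estimate(option)
        headroom = option.capped_90_day - estimate
        print(f"{option.name}: estimate ${estimate:,}, "
              f"cap ${option.capped_90_day:,}, headroom ${headroom:,}")
```

Under these assumptions every estimate falls below its suggested cap, with the Amazon and Google estimates landing within a few percent of it.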

Two Pronged Approach

Initiate Amazon S3 implementation, ~30 TB initially then scale up as pilot progresses

Simultaneously pursue internal government‐managed Cloud “sandbox” development environment

‐ Two prongs help account for the urgent need to develop more cost‐effective strategies in the face of numerous uncertainties

Prong 1: Amazon S3


Pros

• Known cost model

• Guaranteed redundancy

• Outsourced service provider model

• Separate costs for storage and access

• IaaS and PaaS models available

• No capital investment for hardware

• Built‐in security model

Cons

Prong 2: Govt. Managed Cloud


Pros

• Diminishing cost over time

• DDN hardware available

• Nebula Cumulus is S3 compatible

• Clear ownership of hardware and software

Cons

• Implementation time delayed

• Resource availability issues

• Up front cost for server

Current Status


• Costs Assessed ($200‐$250k total)

– Commercial Vendor

– Government sandbox

– Federal Labor and CLASS contract labor

• Contract Modifications under discussion (need

Way Forward


• Modify the contract!

• Procure the Amazon resources

• Stand up the Govt. sandbox cloud

• Load Data Center data to cloud(s)

• Load Data Center applications to cloud(s)

Backup Slides


Google

Pros

• Known cost model

• Guaranteed redundancy (3x plus tape)

• SLA provides 99.9% availability

• Outsourced service provider model

• Separate costs for storage and access

• Storage and web service models available

• No capital investment for hardware

• Built‐in security model

• Dynamic data caching

Cons

Other Cloud Providers

• NASA Nebula

– Platform and Infrastructure

– Expansion capabilities

– Did not respond to inquiries – project discontinued

• NESDIS Cloud

– Platform as a service only

– Windows platform only

• Oak Ridge

– Software as a service (SaaS) only

• Google Cloud

– Low costs for storage ‐ level of redundancy unknown

– Redundancy and fault tolerance as yet unknown
