(1)

SAN DIEGO SUPERCOMPUTER CENTER at the UNIVERSITY OF CALIFORNIA, SAN DIEGO

Utilizing the SDSC Cloud Storage Service

PASIG Conference

January 13, 2012

Richard L. Moore

[email protected]

San Diego Supercomputer Center

University of California San Diego

(2)

Traditional supercomputer center storage systems

Functional Systems

• Tape-based archival system
  • Built for capacity
  • We’ve extended the archive beyond HPC simulation data to experimental data and other digital assets, and as a node in geographically distributed digital preservation systems (e.g. Chronopolis)

• High-bandwidth parallel file system
  • Built for speed
  • Transient data, single-copy reliability

• Home directory system (e.g. NFS)
  • Built for robustness and reliability
  • Regular backups

Limitations

• Archival data is difficult to access: high latency, lower bandwidth, user interfaces

• Difficult for multiple users to share archival data

• All too often archived data, particularly HPC simulations, is “write-once-read-never”

• Not sustainable, and no incentives for users to retain only high-value data

(3)


Adapting to emerging requirements and changing technologies

• Exponential data growth, and analysis of that data, are increasingly important to the research enterprise
  • Requires ready access to data, w/ low latency & high bandwidth
  • Collaborative “team science” demands easy data sharing

• Consumer product development drives prices
  • Disk capacities increasing quickly
  • Flash memory becoming more affordable
  • ‘Gordon’ compute system just now being deployed with 0.25 PB of flash, to fill the “latency gap” between DRAM and spinning disk

• For HPC systems with historical “byte/flop” ratios, storage would be an increasingly significant fraction of total system cost
  • Can’t afford open-ended archival storage … must develop methods to place value on data, especially for long-term high-reliability storage

(4)

SDSC is deploying a new repertoire of storage systems

SDSC Cloud

• Purpose: Storage of Digital Data for Ubiquitous Access and High-Durability
• Access: Multi-platform web interface, S3 interfaces, backup SW

Data Oasis (PFS)

• Purpose: High-Performance Transient Parallel File System for HPC
• Access: Lustre on HPC Systems (Gordon, Trestles, Triton)

Project Storage

• Purpose: Typical Project / User File Server Storage Needs
• Access: NFS/CIFS, iSCSI

(5)


A Paradigm Shift for Long-Term Storage: Access, Sharing and Collaboration

SDSC Cloud

http://cloud.sdsc.edu

• Launched September 2011

• Largest, highest-performance known academic cloud

• 5.5 Petabytes (raw), 8 GB/sec

• System can upload 500 GB in ~1 min

• Automatic dual-copy and verification

• Capacity and performance scale linearly to 100’s of petabytes

• Open source platform based on NASA and RackSpace software
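
As a rough sanity check on the figures above, the quoted 8 GB/sec aggregate bandwidth implies roughly one minute to move 500 GB. The sketch below assumes decimal units and sustained peak bandwidth with no protocol overhead; the function name is illustrative and not part of any SDSC tool.

```python
def transfer_seconds(size_gb: float, bandwidth_gb_per_s: float) -> float:
    """Ideal transfer time in seconds, ignoring overhead and contention."""
    if bandwidth_gb_per_s <= 0:
        raise ValueError("bandwidth must be positive")
    return size_gb / bandwidth_gb_per_s


if __name__ == "__main__":
    # Figures taken from the slide: 500 GB at 8 GB/sec.
    seconds = transfer_seconds(500, 8)
    print(f"{seconds:.1f} s")  # 62.5 s
```

Real transfers from a single client would be limited by that client's network link, so this is an upper bound on system-wide throughput rather than a per-user expectation.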

(6)

Key Features of SDSC Cloud

• “Always-there” disk-based availability of data
  • Tape latency and multi-user issues addressed

• High reliability
  • Disk RAID; automatic dual-copy; continuous background checksum verification/restoration; offsite replication soon

• Simple user interfaces for data owners to manage data, control access and set permissions for sharing data

• Easy access to shared data for any users with permission under a range of mechanisms (http, APIs, portals, gateways …)

• Encryption readily incorporated – and addresses issues of storing HIPAA/proprietary data

• Transaction history is logged – track usage, assess utility, support provenance

• Scalable system in both capacity and bandwidth
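
The "dual-copy plus background checksum verification/restoration" idea listed above can be illustrated with a minimal sketch. This is not the service's actual mechanism: it assumes in-memory copies, a SHA-256 digest recorded at upload time, and hypothetical function names.

```python
import hashlib


def digest(data: bytes) -> str:
    """Hex SHA-256 digest of a byte string."""
    return hashlib.sha256(data).hexdigest()


def verify_and_restore(copies: list, expected: str) -> int:
    """Overwrite any copy whose digest differs from `expected`.

    `copies` is a list of bytearray replicas of one object.
    Returns the number of copies repaired; raises if no copy is intact.
    """
    good = next((bytes(c) for c in copies if digest(bytes(c)) == expected), None)
    if good is None:
        raise RuntimeError("no intact copy available")
    repaired = 0
    for c in copies:
        if digest(bytes(c)) != expected:
            c[:] = good  # restore the damaged replica in place
            repaired += 1
    return repaired


if __name__ == "__main__":
    original = b"simulation output"
    expected = digest(original)  # assumed to be stored at upload time
    replicas = [bytearray(original), bytearray(b"simulation 0utput")]
    print(verify_and_restore(replicas, expected))  # 1
```

With only two copies, a stored digest is what lets the checker decide which replica is the damaged one; comparing the copies to each other would detect a mismatch but not resolve it.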

(7)


Applications of SDSC Cloud

Shared/published/curated data collections

HPC simulation data storage and sharing

Web/portal applications and site hosting

Application integration using supported APIs

Serving images/videos

(8)

Why OpenStack Swift Cloud Software?

Industry Standard

More than 100 leading companies from over a dozen countries are participating in OpenStack, including Cisco, Citrix, Dell, Intel and Microsoft.

Proven Software

The OpenStack cloud operating system is the same software that powers many large public and private clouds, including RackSpace Cloud Storage.

Highly Compatible

Compatibility w/ public OpenStack clouds means it’s easy to migrate data and apps to public clouds when desired, based on security policies, economics, and other key business criteria.

Control & Flexibility

Open source platform means not locked to a proprietary vendor, and modular design can integrate with legacy or 3rd-party technologies. OpenStack project provided under Apache 2.0 license.

Evaluated Software

• OpenStack Swift
  • Open Source
  • Community Support
  • Highly Configurable

• Eucalyptus
  • Highly Flexible
  • Compute Focused

• Caringo Castor
  • Commercial Software

(9)

SDSC Cloud Interfaces

[Diagram: Data Owners and External Users reach the Swift Object Storage Cluster through Load Balanced Proxy Servers. Labels in the figure:]

• Backup Tools and Commercial Products: Commvault, Amanda, Crashplan
• Traditional Clients: GUI Applications, Command Line
• SDSC Web I/F
• Web Services API: Amazon S3, Rackspace CloudFiles / OpenStack API
• User-Developed Web Portals/Gateways
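
To make the Web Services API path concrete, here is a minimal sketch of how an object upload looks in the OpenStack Swift object API: an HTTP PUT to `<storage URL>/<container>/<object>` carrying an `X-Auth-Token` header. The storage URL, account, token and object names below are hypothetical placeholders, not SDSC endpoints, and the request is only constructed, never sent.

```python
from urllib.parse import quote
from urllib.request import Request


def build_swift_put(storage_url: str, token: str, container: str,
                    obj: str, data: bytes) -> Request:
    """Build (but do not send) a Swift-style object upload request.

    `storage_url` and `token` are assumed to come from the service's
    authentication step, which is not shown here.
    """
    url = f"{storage_url.rstrip('/')}/{quote(container)}/{quote(obj)}"
    return Request(url, data=data, method="PUT",
                   headers={"X-Auth-Token": token})


if __name__ == "__main__":
    req = build_swift_put(
        "https://swift.example.org/v1/AUTH_demo",  # hypothetical endpoint
        "hypothetical-token",
        "results",
        "run 01/output.dat",
        b"example payload",
    )
    print(req.get_method(), req.full_url)
```

In practice a client library or one of the GUI and command-line tools named in the diagram would handle authentication and the upload; the sketch only shows the shape of the underlying request.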

(10)
(11)


Rates and Funding Mechanisms

• See https://cloud.sdsc.edu/hp/pricing.php for current pricing; HW costs subject to market volatility; contact [email protected] if interested in service

• “On Demand” Cloud Storage
  • Pay monthly per GB used (water-mark)
  • U California users: $X/TB-year dual-copy + applicable indirect costs
  • + 50% premium for additional off-site copy (when available)
  • Users external to UC: 2*$X/TB-year dual-copy, 3*$X for dual-copy + 1 off-site copy

• “Condo” Cloud Storage
  • Recipient buys HW that is integrated into the storage service and pays annual operating costs for maintenance and system administration
  • Purchase condo HW at $Y market price (pre-configured head node and disk array - currently 2TB drives with 8.5 TB usable dual-copy; space will increase over time)
  • Annual operating cost: $Z/year/condo + applicable indirect costs & UC-external factors
  • User has right to use condo for 5 years; TCO/condo = $Y + 5*$Z over 5 years

*Encryption and HIPAA Compliant Storage is available with both options
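
The rate structure above can be written out as a short sketch. The slide gives only the symbols $X, $Y and $Z, so the numbers in the example are made-up placeholders, not actual prices; indirect costs and the UC-external factors on condo operating costs are not modeled, and the function names are illustrative.

```python
def on_demand_rate(x: float, external: bool = False,
                   offsite: bool = False) -> float:
    """On-demand rate in $/TB-year, before indirect costs.

    x        -- base UC dual-copy rate ($X on the slide)
    offsite  -- add the 50% premium for an additional off-site copy
    external -- users outside UC pay twice the UC rate
    """
    rate = x * (1.5 if offsite else 1.0)
    return rate * 2 if external else rate


def condo_tco(y: float, z: float, years: int = 5) -> float:
    """Condo total cost of ownership: $Y hardware + years * $Z operations."""
    return y + years * z


def condo_rate_per_tb_year(y: float, z: float, usable_tb: float = 8.5,
                           years: int = 5) -> float:
    """Effective condo cost per usable TB-year (8.5 TB usable per the slide)."""
    return condo_tco(y, z, years) / (usable_tb * years)


if __name__ == "__main__":
    X, Y, Z = 100.0, 1000.0, 100.0  # placeholder values only
    print(on_demand_rate(X, external=True, offsite=True))  # 300.0
    print(condo_tco(Y, Z))  # 1500.0
```

Comparing `on_demand_rate` with `condo_rate_per_tb_year` for a given set of current prices is one way to see where the condo option starts to pay off.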

(12)

Questions?

Get a trial account with an .edu email address – cloud.sdsc.edu (no charges first 30 days)
