
A Primer on Object Storage, Cloud Storage, and High Capacity File Systems

Presented by: Chris Robertson, Sr. Solution Architect, Cambridge Computer


A primer on Object Storage, Cloud Storage, and High Capacity File Systems

Merit 2012

About Your Lecturer: Chris Robertson

SA at Cambridge Computer
•  25% of my time I do what industry analysts do
•  75% of my time is client-facing, solving problems and reconciling to budgets

Cambridge Computer
•  Expertise in storage networking, data protection, and data life cycle management
•  Founded in 1991
•  Based in Boston with regional teams spread around the country
•  Unique business model with no costs or commitments to our clients (ask us how this is possible)
•  Clients of all shapes and sizes
–  Museums, K12, Defense Contractors, Banks, etc.
–  Everyone has data. No one wants to lose it!


A Unique Business Model: Combining the Best of All Worlds. . .


What is Cloud Storage?

“The Cloud” has the same challenges that any other enterprise has
•  The design challenges of cloud storage are relevant to private users with large private data collections.

Cloud storage has three major incarnations
•  Enterprise storage for applications that are hosted in the cloud
–  Dynamic provisioning of storage with careful attention to balancing capacity and performance.
•  Hosted backups
–  Granular, efficient backups with data automatically stored off site
•  Redundant object storage
–  Geographically dispersed, redundant storage for data that does not change much.


A Typical Cloud Service: 3 Copies of Each Object Stored Somewhere


The Cloud is Accessed Through SOAP/REST Software Interface

[Diagram labels: file or block interface; dedicated appliance and/or software app (“On-Ramp”); SOAP/REST interface]


Traditional Storage Models Don’t Scale

Data accumulates over time
•  If your primary storage capacity doubles, then BOTH the CAPACITY and the SPEED of your backup system must double.
•  Backups take too long. Restores take too long.

Storage devices become BRITTLE as they get bigger and bigger
•  The bigger they are, the harder they fall

Wholesale data migration between storage devices is impractical. Massive storage systems must allow for in-place upgrades.


Moving a PB is Heavy Lifting

Data Rate   | Example                                                         | Total Time (Approximate)
140 MB/sec  | LTO-5 tape drive at full tilt, without factoring in compression | 82.5 days
1 GB/sec    | A beefy Virtual Tape Library; a dedicated 10Gb Ethernet         | 11 days
1.5 Mb/sec  | A dedicated T-1                                                 | 176 years
156 Mb/sec  | An OC3                                                          | 640 days
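As a rough check on the figures above, transfer time is just data volume divided by sustained rate. The sketch below assumes a decimal petabyte (10^15 bytes), perfectly sustained throughput, and reads the T-1 and OC3 rates as megabits per second. The function name and constants are illustrative, and the slide's own approximations may rest on slightly different conventions, so the results will not match it exactly.

```python
SECONDS_PER_DAY = 86_400
PETABYTE_BYTES = 10**15  # assumption: decimal petabyte


def days_to_move(num_bytes: float, bytes_per_second: float) -> float:
    """Days needed to move num_bytes at a perfectly sustained rate."""
    return num_bytes / bytes_per_second / SECONDS_PER_DAY


# Rates from the table, converted to bytes per second.
rates = {
    "LTO-5 tape drive (140 MB/sec)": 140e6,
    "VTL or 10Gb Ethernet (1 GB/sec)": 1e9,
    "T-1 (1.5 Mb/sec)": 1.5e6 / 8,   # megabits -> bytes
    "OC3 (156 Mb/sec)": 156e6 / 8,   # megabits -> bytes
}

for name, rate in rates.items():
    print(f"{name}: {days_to_move(PETABYTE_BYTES, rate):,.0f} days")
```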


Bigger Hard Drives: Friend or Foe?

The Good News: As drives grow bigger we can achieve more capacity with fewer devices
•  Fewer devices = higher density, lower power consumption, fewer device failures

The Bad News
•  MTBF not growing as fast
•  Bandwidth into device not growing as fast
•  Consequences
–  Unreliability (per bit) growing
–  Accessibility of data (per bit) shrinking
–  Drive rebuild times are longer, which increases overall risk of data loss


RAID Rebuilds Take Too Long

RAID 5 rebuilds take too long
•  On the order of 36 hours per TB
•  A 4TB drive could take a week to rebuild

RAID 6 (double parity) offers some protection
•  But what happens when we have 8TB drives?

The more stuff you have, the higher the chance of failures.
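The rebuild figures above follow from simple multiplication. A minimal sketch, taking the slide's rule of thumb of roughly 36 hours per TB at face value and assuming rebuild time scales linearly with drive size (real rebuild times depend on the controller, the workload, and the RAID level):

```python
HOURS_PER_TB = 36  # rule of thumb from the slide, not a measured value


def rebuild_days(drive_tb: float, hours_per_tb: float = HOURS_PER_TB) -> float:
    """Estimated rebuild time in days, assuming linear scaling with capacity."""
    return drive_tb * hours_per_tb / 24


for size_tb in (1, 4, 8):
    print(f"{size_tb} TB drive: about {rebuild_days(size_tb):.1f} days")
```

Under these assumptions a 4TB drive takes about six days, consistent with the "week" quoted above, and an 8TB drive about twelve.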


Redundancy Between Cabinets: Can You Have Too Much Redundancy?

Is this really a good idea?

How long will it take to re-mirror a 14TB RAID 6 stripe?

Is there a better way to protect against a device failure?
•  Replication?
•  Backup?
•  Mirroring at a different level of abstraction?


How Big is the Building Block?

What Are You Building?           | What Size Building Block?
An outhouse?                     | Brick
The foundation for a new house?  | Cinder Block
A pyramid?                       | Boulder


Object Storage – More than Just the Cloud


Objects Represent a Different Way to Address Data

Block
Blocks are addressed by Device ID and sequential block number.

File
Files are addressed by UNC paths: \\MyServer\MyFolder\MyFile.doc

Object
Objects are addressed by an ID that is unique to the storage system.
- Sequentially assigned number
- Randomly assigned number
- A hash derived as a function of the object's content
- A combination of things
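The ID schemes listed above can be sketched in a few lines. This is an illustration only: the function names are hypothetical, and SHA-256 and version 4 UUIDs are assumed choices, not something the slides specify.

```python
import hashlib
import itertools
import uuid

_counter = itertools.count(1)


def sequential_id() -> str:
    """Next number in a system-wide sequence."""
    return str(next(_counter))


def random_id() -> str:
    """Randomly assigned identifier (a version 4 UUID here)."""
    return uuid.uuid4().hex


def content_id(data: bytes) -> str:
    """Hash derived from the object's content (SHA-256 assumed)."""
    return hashlib.sha256(data).hexdigest()


def combined_id(data: bytes) -> str:
    """One possible combination: a sequence number plus a content-hash prefix."""
    return f"{sequential_id()}-{content_id(data)[:16]}"


payload = b"quarterly report"
print(sequential_id(), random_id(), content_id(payload), combined_id(payload))
```

Only the content-derived ID is reproducible from the data alone; the sequential and random IDs have to be recorded somewhere when the object is written.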


What is an “Object”?

An object is a chunk of data that can be individually addressed and manipulated
•  A file is a chunk of data
–  A zip file containing many files is a chunk of data
•  A file can be made up of several chunks of data
•  A block is a chunk of data
•  A volume (a range of blocks) is made up of chunks of data
•  Pages, extents, chunks, chunklets are objects consisting of multiple blocks

Email?
•  An email message is a chunk of data
•  An email attachment is a chunk of data
•  An email message along with its attachments could be treated as a single chunk of data.

Often objects have associated metadata
•  Descriptive information or tags
•  Provenance


Content Addressing

Content addressing calculates a hash of the data that makes up the object and uses the hash as an address
•  Locality independence
–  An object can live in multiple locations for:
   •  Redundancy
   •  Parallelism
   •  Local processing affinity
•  Data integrity
–  The object can be compared against its hash for integrity checking
–  If the hash test fails, simply retrieve a copy of the object and repair the corrupt object
•  Deduplication
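The properties above can be demonstrated with a toy content-addressed store. This is a minimal sketch under stated assumptions: the class and method names are hypothetical, the "replicas" are in-memory dictionaries standing in for separate devices, and SHA-256 is an assumed hash.

```python
import hashlib


class ContentAddressedStore:
    """Toy content-addressed store with in-memory replicas (hypothetical)."""

    def __init__(self, replicas: int = 2):
        # Each replica maps address -> bytes and stands in for a separate device.
        self.replicas = [dict() for _ in range(replicas)]

    @staticmethod
    def address(data: bytes) -> str:
        return hashlib.sha256(data).hexdigest()

    def put(self, data: bytes) -> str:
        addr = self.address(data)
        for replica in self.replicas:
            # Identical content maps to the same address, so it is stored once
            # per replica no matter how often it is written (deduplication).
            replica.setdefault(addr, data)
        return addr

    def get(self, addr: str) -> bytes:
        good = None
        for replica in self.replicas:
            data = replica.get(addr)
            # Integrity check: the stored bytes must hash back to the address.
            if data is not None and self.address(data) == addr:
                good = data
                break
        if good is None:
            raise KeyError(f"no intact copy of {addr}")
        # Repair any missing or corrupt copies from the intact one.
        for replica in self.replicas:
            if replica.get(addr) != good:
                replica[addr] = good
        return good


store = ContentAddressedStore()
addr = store.put(b"hello")
store.put(b"hello")                          # duplicate write, no extra copy
store.replicas[0][addr] = b"hellp"           # simulate silent corruption
assert store.get(addr) == b"hello"           # served from the intact replica
assert store.replicas[0][addr] == b"hello"   # corrupt copy was repaired
```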


Self Healing and Data


Basic Object-Level Redundancy: An Alternative to RAID and Mirroring


Redundant Objects Propagate on Device Failure
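The behavior named in this title can be sketched with a toy simulation. Everything here is an assumption for illustration: the class name, the random placement policy, and the copy count of three (borrowed from the "3 Copies of Each Object" slide). It is not a description of any particular product.

```python
import random


class Cluster:
    """Toy cluster that keeps a fixed number of copies of every object."""

    def __init__(self, device_names, copies=3, seed=0):
        self.devices = {name: {} for name in device_names}
        self.copies = copies
        self.rng = random.Random(seed)

    def put(self, object_id, data):
        # Place each copy on a different, randomly chosen device.
        for name in self.rng.sample(sorted(self.devices), self.copies):
            self.devices[name][object_id] = data

    def holders(self, object_id):
        return [n for n, objs in self.devices.items() if object_id in objs]

    def fail_device(self, name):
        """Drop a device, then re-create the copies it held on the survivors."""
        lost = self.devices.pop(name)
        for object_id in lost:
            current = self.holders(object_id)
            if not current:
                continue  # every copy is gone; nothing left to copy from
            data = self.devices[current[0]][object_id]
            candidates = [n for n in sorted(self.devices) if n not in current]
            missing = self.copies - len(current)
            count = min(missing, len(candidates))
            for target in self.rng.sample(candidates, count):
                self.devices[target][object_id] = data


cluster = Cluster(["d1", "d2", "d3", "d4", "d5"])
for i in range(10):
    cluster.put(f"obj-{i}", f"payload {i}".encode())
cluster.fail_device("d2")
assert all(len(cluster.holders(f"obj-{i}")) == 3 for i in range(10))
```

In this sketch the replacement copies land on whichever surviving devices do not already hold the object, so no single spare drive has to absorb the whole rebuild.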


Erasure-Coded Data Protection: An Alternative to Parity-Based RAID


You Can Lose X% of Your Storage Without Losing Data
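The "X%" in this title depends on the coding parameters. As a hedged illustration, assume an erasure code that splits each object into k data fragments plus m coded fragments, any k of which are enough to rebuild the object, with each fragment on a different device. The function names and parameter values below are hypothetical and not taken from the slides.

```python
def loss_tolerance(k: int, m: int) -> float:
    """Fraction of fragments that can be lost while the object stays recoverable."""
    return m / (k + m)


def storage_overhead(k: int, m: int) -> float:
    """Raw capacity consumed per unit of user data."""
    return (k + m) / k


# Three full copies behave like k=1, m=2; the other schemes are hypothetical.
for label, k, m in [("3 copies", 1, 2), ("10+6", 10, 6), ("8+2", 8, 2)]:
    print(f"{label}: tolerate {loss_tolerance(k, m):.0%} loss, "
          f"{storage_overhead(k, m):.2f}x raw capacity")
```

The trade-off this exposes is that plain replication tolerates a large loss fraction only by paying a high capacity multiple, while a wider code can tolerate substantial loss at much lower overhead.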


Some Real-World Examples of Object-Based Storage


Splitting SAN I/O into a Block Stream and an Object Stream


Object-Based File System with Erasure Coding and Global Dedupe


SharePoint with External Blob Storage

Gateway of some sort


Shared File System Leveraging a Cloud-based Object Store


Object-Based Archive File System: Automatic Backup to Tape


The Mwah Hah Hah Plan to Conquer the World


Summary of What We Have Today

Application software that manages files on CIFS and NFS volumes for a single location
Out of band – respects ACLs and UIDs/GUIDs
Basic support for cloud stores (S3) and object stores
Key-value metadata
MySQL back end
Support for 500K to 1B files
Admin GUI
User GUI
REST-based API
Multi-threaded crawler
Policy-based multi-threaded data mover
Backup copies with versioning
