• No results found

Howard Marks Chief Scientist

N/A
N/A
Protected

Academic year: 2021

Share "Howard Marks Chief Scientist"

Copied!
23
0
0

Loading.... (view fulltext now)

Full text

(1)

DIY B ild Y

O

P i t

DIY: Build Your Own Private or 

Hybrid Storage Cloud

yb d Sto age C oud

Howard Marks Chief Scientist

(2)

Where I Come From

• 25 years as a consultant in the midmarket  F th h l f b th d th t • From the school of been there, done that,  broke that too • Reviewing products for magazines since ‘87 • Concentrating on storage/servers this century • Now running independent test lab/analyst  firm  • Twitter @DeepStoragenet [email protected]

(3)

As If You Didn’t Know

We’re drowning in data

We’re drowning in data

Standard file systems breaking downToo Small16TB‐100TB A f illi filA few million files255 byte path

(4)

Storage is Supposed to Be Getting 

Cheaper!

Cheaper!

• Disk cost is droppingDisk cost is dropping  Storage Is No It’s rapidly • $250 buys: Storage Is  Cheap! No It s  Not! $ y – 1994: 2 GB – 1999: 20 GB – 2004: 200 GB – 2009: 2000 GB • But enterprise storage  costs keep rising! 4

(5)

What is Cloud Storage

• Well it’s what public cloud providers sell I fi i l El i – Infinitely Elastic – Relatively low cost  / / • S3 ~15¢/GB/Mo – Typically object interface ibl – Internet accessible – Multi‐tenant – You don’t have to manage it!

(6)

So What’s a Private Storage Cloud

• Massively scalable N ll l i – Not really elastic • Step function for growth • Can’t shrink • Can t shrink  • Data protection by policy l l i i – Fault tolerance, copies, retention, etc. – Includes location  • Store 2 copies in location 1 plus 1 in each of 3 other  locations

(7)

What’s Cloud Storage Good For?

• Reduced TCO through reduced management I l d d d b k – Includes reduced backup • Large data stores • Low change rates – Especially of individual objectsp y j • Not latency sensitive

Archives rich data stores etcArchives, rich data stores, etc.

(8)

An Advertising Agency

• Digital Asset Management System H ld d ( i d id ) – Holds ads (print and video) • Accessed by creative types in NY, Chicago, LA,  SF • Metadata and index in each officeMedia on cloud storage

– Public cloudPublic cloud

(9)

Public or Private

• Public cloud more elastic and pay as you go Si lifi lti it • Simplifies multi‐site access • But: – Network adds latency – Security concerns • But you could encrypt data before it leaves – Control concerns – International issues like privacy/Patriot Act

(10)

Typical Cloud Storage Architecture

• Redundant Array of 

I d d N d (RAIN)

Independent Nodes (RAIN)

– All peers or

Access nodes and storage nodes – Access nodes and storage nodes

• Low cost x86 servers • Low cost SATA DAS • Low cost SATA DAS  • Object storage 

C ld i l d NAS

(11)

Which is Your Idea of Low Cost 

Storage?

Storage?

Backblaze Storage Pod CLARiiON Full of 1TB Drives Backblaze Storage Pod CLARiiON Full of 1TB Drives

(12)

The Difference is Philosophy

p y

• Enterprise systems have high subsystem  li bili reliability – Use redundant components to make sub‐systems  f l l fault tolerant – Subsystem failure creates a crisis • Cloud systems accept node failures – Reliability comes from software and redundant  data

(13)

File Systems and Object Stores

y

j

File System Object Store • Limits: – Disk Capacity (16‐100TB) – Path (255 char) • Store/retrieve file by  URI/URL

• Usually has extended

Path (255 char) – Number of files – Metadata • Usually has extended  metadata: – Retention • Syntax: – Open(file) – Lock(2343,100) – Protection policy  • No limits – Path depth ( , ) – Write(2343,”hello” – Close(file) – Path depth – Files/folder  – Total files

(14)

Object Access

j

• Object stores use HTTP derived syntax P bj – Put object to save – Get object to read • New object replaces old one • API usually based on SOAP or RESTContent Addressable Storage

– Special case of object store where URI=data hashSpecial case of object store where URI=data hash – Inherent single instance storage

(15)

Data Protection Methods

Data Protection Methods

• Conventional RAIDConventional RAID

• Object replication/dispersal Obj li i i h

Object replication with RAIDErasure and dispersal codes

(16)

Erasure Codes

Erasure Codes

• Beyond RAID for protection and integrityBeyond RAID for protection and integrity • Usually based on Reed‐Solomon math C id hi h i l • Can provide higher protection at lower  overhead – EG: Survive 4 drive failures w/25% overhead • Dispersal codes add location

(17)

How Erasure Codes Work

• Breaks data into blocks

Bl k t i d t d f d

• Blocks contain data and forward  correction codes

• System can return data from x of ySystem can return data from x of y  blocks • If 10 of 15 any 10 blocks can y reconstruct data • Store each block on different  disk/node • Store 5 blocks each in 3 locations

(18)

Private Cloud Storage Products

g

• EMC Atmos

• Caringo Castor

– SW only powers Dell Dx6000

• Hitachi Content Platform • NetApp Storage GridNetApp Storage Grid 

– Was Bycast

DDN Web Object StoreDDN Web Object Store

(19)

Erasure Code Based Systems

y

• OceanStore

– Academic project at BerkeleyAcademic project at Berkeley

• Cleversafe

– RAIN dispersal systemRAIN dispersal system

• Amplidata

– RAIN dispersal system w/Atom storage nodesp y / g

• NEC HYDRAstor

– Deduplicating grid for backup/archivep g g p – No dispersal

(20)

Hybrid Cloud Options

y

p

Any combination of on‐premise and internet  storage could be called hybrid cloud

storage could be called hybrid cloud • Models:

Cluster on premise replicates to public provider – Cluster on‐premise replicates to public provider

• Atmos to Atmos (AT&T Synaptic Cloud)

– On‐premise replicates to colocation clusterOn premise replicates to colocation cluster – Gateway/archiving system writes to both

• Also used for dedicated infrastructure byAlso used for dedicated infrastructure by  public provider

(21)

Clustered NAS

Cluster of NAS heads w/shared  storageg • Can have file/policy based  replication Etc. • Can scale to 10+TB • Low latency/higher performance E l • Examples: – IBM SONAS

– Symantec FileStoreSymantec FileStore – Gluster

(22)

Scale‐Out NAS

• NAS cluster with distributed storage dd d dd f d • Adding nodes adds performance and capacity • Scales, performs like clustered NASSimpler managementExamples:Examples: – Panasas EMC/Isilon – EMC/Isilon – HP IBRIX

(23)

And Now It’s Time to Play…

References

Related documents

6.17 Listen and repeat the irregular verbs in the present and the past, e.g. Cover Clthe PAST column. Practise saying the sentences in the past.. P R E S E

*Please attach a link to maps of treatment areas if possible?. Basin application maps included as standard in town annual reports -

The aims of this study is to provide a privacy enhancing API to developers, to make the development of privacy enhanced software faster, and ultimately allow data subjects to have

By proposing a bilevel 1458 local search procedure of choosing an appropriate reference 1459 point near an obtained NSGA-II solution and a suitable weight 1460 vector for finding

In the present work the stochastic ranking proposed by [3] for single objective constrained op- timization problems is extended to multiobjective ones using the concept of

One can interpret the licensing rules as part of the social planner’s patent policy, or as a set of licensing agreements arranged ex ante among the potential innovators; that is,

A retrospective audit was conducted on patients attending the clozapine clinic at Cork University Hospital and assessed patients’ (i) independent living, (ii) co-prescribed