• No results found

Introduction to OpenStack Swift CloudOpen Japan 2014

N/A
N/A
Protected

Academic year: 2021

Share "Introduction to OpenStack Swift CloudOpen Japan 2014"

Copied!
41
0
0

Loading.... (view fulltext now)

Full text

(1)

Yuji Hagiwara

[email protected]

Platform Engineer, NTT DATA Corp.

Introduction to OpenStack Swift

(2)

Agenda

1.What is Swift?

2.Swift’s Latest Information

3.Swift’s Future

(3)

Who am I

Yuji Hagiwara – Platform Engineer, NTT DATA Corp.

Since 2011 -

Using OpenStack

Since 2013 -

Developing Searching on Swift

(4)

2004

2007

2010

2013

2016

Amount of Unstructured Data

Background

Data Explosion on Enterprise – Amount of Unstructured Data has been growing.

We need storage with Scalability, Durability, Availability.

Examples of Unstructured Data

• Media (Images, Videos, Audios) • Web Contents

• Documents

• Backups/Archives

Where should we store these data?

One of the Solutions is

Swift

.

Growing exponentially

(5)

What is Swift?

Swift is...

• A storage system with Scalability, Durability, Availability.

• The REST-ful

Distributed Object Storage

likely Amazon S3.

• One of

OpenStack

Core Components.

• Implemented by

Python

.

• A

Open Source

Software.

① Block Storage (Cinder)

(6)

Usage

so simple.

$ curl -XPUT --data-binary ‘@mydoc.txt‘

http://swift.example.com:8080/v1/account/container/object

$ curl –XGET

http://swift.example.com:8080/v1/account/container/object

$ curl –XDELETE

(7)
(8)

Swift as a storage for a variety of applications

Swift

System

Backup

REST API

CMS

Cyber

Duck

FTP-like use

Distribution Digital Apps Web

(9)

OpenStack Swift deployments and use cases

Name of enterprise Product/ service Description

Rackspace(USA) Cloud Files Cloud file share service by Rackspace itself. They use same code as OSS except for features such as authentication, Accounting and CDN (<500PB)

Korean Telecom (Sourth

Korea) ucloud storage service Object storage service using OpenStack/Swift (16PB+ size)

Sina (Republic of China) Sina App Engine(SAE) Public storage service. They moved to OpenStack from another technology MongoDB in 2012.

San Diego Supercomputer

Center (USA) SDSC Cloud Storage Services Cloud storage service on SDSC. Users can select Amazon/S3 or Rackspace Swift.

SME Storage (USA) SMEStorage Open Cloud Platform Cloud storage service based on Rackspace Cloud File

SoftLayer (USA) SoftLayer Object Storage Public object storage service. Acquisition by IBM

SwiftStack(USA) Swift Stack Provide professional service and Operation and management product

HP(USA) HP Cloud Private cloud storage service uses OpenStack.

Wikimedia(USA) Wikimedia storage Media files store for Wikipedia.

(10)
(11)

Architecture: Nodes

Swift consist of 2-type Nodes: Proxy Node

and Storage Node.

Storage Node Storage Node Storage Node Storage Node

Proxy Node Proxy Node

HTTP Load balancer

Forward Data to node

Store data

Application

(12)

HTTP Load balancer

Architecture: The Ring

The Ring (static table for data allocation on storage node)

decide the optimal Storage Node by

Name

.

Storage Node Storage Node Storage Node Storage Node

Proxy Node Proxy Node Proxy Node

Ring Ring Ring

Application

(13)

HTTP Load balancer

Architecture: The role of Ring

If you requested to

Store

the data “A”, 3 Replica nodes store the data “A”.

Storage Node 1 Storage Node 2 Storage Node 3 Storage Node 4

Proxy Node Proxy Node Proxy Node

Ring Ring Ring

data

Data “B” Data “B” Data “B”

“A” must be located at “1”, “2”, “4”

Data “A”

Data “A” Data “A”

Application

(14)

HTTP Load balancer

Architecture: The role of Ring

If you requested to

Get

the data “A”, One of Nodes reply the data “A”.

Storage Node 1 Storage Node 2 Storage Node 3 Storage Node 4

Proxy Node Proxy Node Proxy Node

Ring Ring Ring

data

Data “B” Data “B” Data “B”

“A” must be located at “1”, “2”, “4”

Data “A”

Data “A” Data “A”

Application

Ring Ring Ring Ring

(15)

Scalability

Proxy

Storage Storage Storage Proxy

(expand)

Proxy

Storage Storage Storage Proxy

Storage

(Expand)

(1)

Expand proxy server

“Throughput”

(2)

Expand Storage servers

or disks “volume”

More Throughput

(16)

Many processes working together

Swift

object-replicator account-server object-auditor object-updater object-server object-expirer account-replicator account-auditor account-reaper container-server container-replicator container-auditor container-updater container-sync proxy-server

(17)

Replicator

Node 1 Node 2 Node 3 Node 4

Node 2 Node 3 Node 4

(1) Each nodes checks data in others

Node 1 Node 2 Node 3 Node 4

(5) Recover disk

(6) recover data to original node (4) Copy data to another node Nor mal Defe at Reco very

Node 1 Node 2 Node 3 Node 4 (2) Disk defeat

(3) Detect disk trouble

Node 1

Node 2 Node 3 Node 4 Node 1

Replicator Disk

(18)

Replicator

Node 1 Node 2 Node 3 Node 4

Node 2 Node 3 Node 4

(1) Each nodes checks data in others

Node 1 Node 2 Node 3 Node 4

(5) Recover disk

(6) recover data to original node (4) Copy data to another node Nor mal Defe at Reco very

Node 1 Node 2 Node 3 Node 4 (2) Disk defeat

(3) Detect disk trouble

Node 1

Node 2 Node 3 Node 4 Node 1

Replicator Disk

(19)

HTTP Load balancer

Normal state

Each Data has replicated.

Storage Node Storage Node Storage Node Storage Node

Proxy Node Proxy Node Proxy Node

Server

Server

Server

Server

Ring Ring Ring

Ring Ring Ring Ring

Data “A”

Data “A” Data “A”

Data “B” Data “B” Data “B”

Application Auditor Replicator Auditor Replicator Auditor Replicator

Auditor Replicator

(20)

Replicator

Node 1 Node 2 Node 3 Node 4

Node 2 Node 3 Node 4

(1) Each nodes checks data in others

Node 1 Node 2 Node 3 Node 4

(5) Recover disk

(6) recover data to original node (4) Copy data to another node Nor mal Defe at Reco very

Node 1 Node 2 Node 3 Node 4 (2) Disk defeat

(3) Detect disk trouble

Node 1

Node 2 Node 3 Node 4 Node 1

Replicator Disk

(21)

HTTP Load balancer

Defeat state

If a disk is broken...

Storage Node Storage Node Storage Node Storage Node

Proxy Node Proxy Node Proxy Node

Server

Server

Server

Server

Ring Ring Ring

Ring Ring Ring Ring

Data “A”

Data “A” Data “A”

Data “B” Data “B” Data “B”

Application Auditor Replicator Auditor Replicator Auditor Replicator

Auditor Replicator

Broken

(22)

HTTP Load balancer

Defeat state

Replicator detects the lost data and replicates the data to another node for

temporary.

Storage Node Storage Node Storage Node Storage Node

Proxy Node Proxy Node Proxy Node

Server

Server

Server

Server

Ring Ring Ring

Ring Ring Ring Ring

Data “A”

Data “A” Data “A”

Data “B” Data “B” Data “B”

Application Auditor Replicator Auditor Replicator Auditor Replicator

Auditor Replicator

Broken

(23)

Replicator

Node 1 Node 2 Node 3 Node 4

Node 2 Node 3 Node 4

(1) Each nodes checks data in others

Node 1 Node 2 Node 3 Node 4

(5) Recover disk

(6) recover data to original node (4) Copy data to another node Nor mal Defe at Reco very

Node 1 Node 2 Node 3 Node 4 (2) Disk defeat

(3) Detect disk trouble

Node 1

Node 2 Node 3 Node 4 Node 1

Replicator Disk

(24)

HTTP Load balancer

Recovery state

When the broken disk is replaced to a fresh disk...

Storage Node Storage Node Storage Node Storage Node

Proxy Node Proxy Node Proxy Node

Server

Server

Server

Server

Ring Ring Ring

Ring Ring Ring Ring

Data “A”

Data “A” Data “A”

Data “B” Data “B” Data “B”

Application Auditor Replicator Auditor Replicator Auditor Replicator

Auditor Replicator

(25)

HTTP Load balancer

Recovery state

Replicator replicates the data and removes the temporary data.

Storage Node Storage Node Storage Node Storage Node

Proxy Node Proxy Node Proxy Node

Server

Server

Server

Server

Ring Ring Ring

Ring Ring Ring Ring

Data “A”

Data “A” Data “A”

Data “B” Data “B”

Replicate the data to the correct node. Data “B” Application Auditor Replicator Auditor Replicator Auditor Replicator

Auditor Replicator

Removed

(26)

Replicator

Node 1 Node 2 Node 3 Node 4

Node 2 Node 3 Node 4

(1) Each nodes checks data in others

Node 1 Node 2 Node 3 Node 4

(5) Recover disk

(6) recover data to original node (4) Copy data to another node Nor mal Defe at Reco very

Node 1 Node 2 Node 3 Node 4 (2) Disk defeat

(3) Detect disk trouble

Node 1

Node 2 Node 3 Node 4 Node 1

Replicator Disk

(27)
(28)

History and Trend of Community

2010.6 Start OpenStack Project

2010.10 1st release "Austin" 2013.4 “Grizzly” 2013.10 “Havana” 2014.10 “Juno”

Fundamental

Global Cluster

2013.4 “Icehouse”

Developing

Supported

Timeline in each functions

Erasure Coding

Storage Policy

Hot Topics on Now

Development Trend in Swift

History of OpenStack

(29)

Replication

Erasure Coding

Size

3x original

2x original

Latest info: Erasure Coding

Data

Partial

Data

Partial

Data

Parity

1

Data

Data

Data

Data

distribute

Parity

2

(30)

Latest info: Storage Policy

Data Data Data

Data Data Data

Data Data Data

Before

Data Data Data

Data Data

After

Partial

Data

Partial

Data

Parity

1

Parity

2

Same Policy on cluster

Variety Policy on cluster

(31)

Future Direction of Swift

2 concepts:

Integrated Searchable Storage

(32)

Integrated Searchable Storage

Swift should be integrated with Searching.

It means to need searching as

Scalable, Durable, Available

as Swift.

Swift

Storage System

Users

Operators

Managers

Store

Get

Search

(33)

Use cases of Search

1.Content Search

2.Detection for de-duplication

3.Tiered storage

Data

Hash

Hot Content – More Modified

Data

Hash

Data

Hash

Data

Hash

Cold Content – Less Modified

Already Stored?

Cheap Storage

(34)

Future: Integrated Searchable Storage

How do we implement?

Internal

External

Swift

Swift

preexist

Search

Engine

Hook

Index

Store

Search

Search

Search

Store

Search

feature

Storage

feature

Storage

feature

Index

Internal External

Where do search Swift with search library

(such as Lucene)

Search Engine (such as Solr)

Redundancy High Depend on Search engine

Availability High Depend on Search engine

(35)

Our Implementation

Internal Approach

• Hack Swift to embed the search library.

container-server

Indexing

Searching

Proxy Node

Swift’s

Ordinary API

New Search

API

object-server

Lucene

SQLite

ContainerA DB ContainerB DB Container A Index Container B Index

account-server

Metadata

Data

Query

Application

Indexing

Search

(36)

Future: Intelligent Resource Management

Swift has more and more different functions.

Swift

object-replicator account-server object-auditor object-updater object-server object-expirer account-replicator account-auditor account-reaper container-server container-updater container-sync proxy-server ec-auditor ec-reconstructor ec-stripe-auditor

Erasure Coding

Storage Policy

Multi-ring support

Other arbitrary processes

Search...? Compression...?

(37)

Future: Intelligent Resource Management

Resources are drained! – IOPS, CPU, Network, Memory

Performance Priorities of these functions are

different by the Requirement.

Ex1) Store performance

VS

Search performance

Ex2) Service Level on Business Hour

VS

on Outside Hour

More Intelligent Resource Management

is necessary.

with

cgroups

0

12

6

18

Business Hour (High-prio to process requests) Outside (High-prio to check durability)

(38)

Summary

1.What is Swift?

Swift is a Great OSS, for storing unstructured data.

2.Swift’s Latest Information

Erasure Coding

Storage Policy

3.Swift’s Future

Integrated Searchable Storage

(39)

PR: Demonstration is Now Available!

We exhibit the

Demo Application(Contents delivery system)

built with Swift.

• On-demand Delivery a lot of contents(Pictures or movies) stored at Swift.

• Implemented Searching on Swift. (Our original implementation)

(40)

Thank you for your attention!

Please contact to

[email protected]

,

if you have any questions or comments.

Q&A: Do you have any question?

(41)

Challenges and Questions

How to integrate Swift with cgroups?

How to use cgroups?

What is the best toolset for cgroups?

VFS?

libcgroup?

systemd?

How to control multiple hosts with cgroups dynamically?

How to integrate Swift with search?

What is the best implementation way?

What is the best search middleware?

How to search Multilingual?

References

Related documents

 Staff reviews mitigation protocols every 30 days, and since the Governor relaxed requirements, changes to current protocols are: we removed some Table Games barriers;

Our principal finding is that increasing the minimum wage raises wages and prices with small adverse employment effects in Brazil.. This suggests a general wage-price

While the results did not determine whether Primavera was primarily used as a result of being a contract requirement, despite the responses received from the owners on the perception

Most Grade 10 learners at schools where the study was conducted, own smartphones they use for non-educative purposes in their day-to-day activities that could assist in

This thesis describes the ability for two Pseudomonas sp., a soil - isolate strain PAI-A and a clinical - isolate Pseudomonas aeruginosa strain PAO1, to degrade long

As stated by Stewart (1999:56): 'Intellectual capital has become so vital that it's fair to say that an enterprise that is not managing knowledge is not paying attention to

Œ As a complement to the NYISO planning process, the owners of the interconnected electricity transmission facilities in New York State initiated the State Transmission

CSU Apply Online allows you to submit your application for admission via the web and includes the facility to attach supporting documentation to your application. It is faster