(1)

Managing Storage Space

in a Flash and Disk

Hybrid Storage System

Xiaojian Wu, and A. L. Narasimha Reddy

Dept. of Electrical and Computer Engineering

Texas A&M University

IEEE International Symposium on Modeling, Analysis & Simulation of Computer and Telecommunication Systems (MASCOTS), 2009

(2)

Outline

• Introduction

• Related Work

• Proposed Scheme

• Evaluation

• Conclusion

(3)

Introduction (1/4)

• Building a large flash-only storage system is too expensive

– employ both flash and magnetic disk as a hybrid storage system

• Different characteristics

– writes to flash can take longer than to magnetic disk drives, while reads can finish faster

– flash has a limit on the number of times a block can be written

– magnetic disks typically perform better with larger file sizes

• Data placement, retrieval, scheduling, and buffer management algorithms need to be revisited for the hybrid storage system

(4)

Introduction (2/4)

• The disk drive is more efficient for larger reads and writes!

(5)

Introduction (3/4)

• Requests experience different performance at different devices based on the request type (read or write) and the request size

(6)

Introduction (4/4)

• Managing the space across the devices in a hybrid system should be adaptable to changing device characteristics

• Issues

– allocation

– data redistribution or migration

• Proposing a measurement-driven approach to migration to address these issues

– observe the access characteristics of individual blocks and consider migrating individual blocks

(7)

Related Work

• HP’s AutoRAID system considered data migration between a mirrored device and a RAID device

– migrate hot data to faster devices and cold data to slower devices

– improve the access times of hot data by keeping it local to faster devices

– when data sets are larger than the capacity of the faster devices in such systems, thrashing may occur

(8)

Proposed Scheme (1/5)

• Pool the storage space across flash and disk drives and make it appear like a single larger device to the file system

• Maintain an indirection map, containing mappings of logical to physical addresses, to allow blocks to be flexibly assigned to different devices

– when data is migrated, the indirection map needs to be updated

– to reduce this cost, consider migration at a unit (chunk) larger than a single block
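As a minimal sketch of what such an indirection map might look like (a Python model; the class layout and names are illustrative assumptions, not the authors' code):

```python
# Minimal model of an indirection map over a hybrid volume.
# Logical chunk numbers map to (device, physical chunk) pairs;
# migrating a chunk only requires updating its map entry.

CHUNK_SIZE = 64 * 1024  # migrate in 64KB chunks, not single blocks

class IndirectionMap:
    def __init__(self):
        self.table = {}  # logical chunk -> (device_id, physical_chunk)

    def lookup(self, logical_byte):
        # translate a logical byte address to (device, physical byte)
        chunk = logical_byte // CHUNK_SIZE
        device, phys = self.table[chunk]
        return device, phys * CHUNK_SIZE + logical_byte % CHUNK_SIZE

    def migrate(self, chunk, new_device, new_phys):
        # called after the chunk's data has been copied to new_device
        self.table[chunk] = (new_device, new_phys)

m = IndirectionMap()
m.table[0] = ("flash", 10)
assert m.lookup(4096) == ("flash", 10 * CHUNK_SIZE + 4096)
m.migrate(0, "disk", 3)
assert m.lookup(0) == ("disk", 3 * CHUNK_SIZE)
```

Mapping whole chunks rather than individual blocks keeps the table small, at the cost of coarser migration, which is the trade-off the slide describes.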

(9)

Proposed Scheme (2/5)

• Keep track of the access behavior of a block by maintaining two counters, for read and write accesses

– use 2 bytes per chunk (64KB or larger) to track read and write frequency separately

• about 32KB of counters per 1GB of storage

– a block can be considered for migration or relocation only after receiving a minimum number of accesses

• to ensure sufficient access history has been observed

– block access counters are initialized to zero on boot-up and after migration

• Every time a request is served by a device, keep track of the request response time at that device

– maintain read and write performance separately

– exponential average of the device performance:

average response time = 0.99 * previous average + 0.01 * current sample
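The exponential average above can be sketched as follows (the 0.99/0.01 weights come from the slides; the class layout is an illustrative assumption):

```python
# Per-device response-time tracking using the exponential average:
# new_avg = 0.99 * previous_avg + 0.01 * current_sample,
# kept separately for reads and writes.

class DeviceStats:
    def __init__(self):
        self.read_avg = 0.0   # exponential average of read response times
        self.write_avg = 0.0  # exponential average of write response times

    def record(self, is_read, response_time):
        if is_read:
            self.read_avg = 0.99 * self.read_avg + 0.01 * response_time
        else:
            self.write_avg = 0.99 * self.write_avg + 0.01 * response_time

stats = DeviceStats()
for _ in range(3):
    stats.record(True, 10.0)  # three read samples of 10.0 each
# starting from 0 the average moves: 0.1, 0.199, 0.29701
assert abs(stats.read_avg - 0.29701) < 1e-9
```

Starting the average at zero biases early estimates low, which fits the slides' point that blocks are considered for migration only after a minimum number of accesses has been observed.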

(10)
(11)

Proposed Scheme (4/5)

• For each device i, keep track of the read r_i and write w_i response times

• Determine whether to migrate

– given block j’s read/write access history through its access counters R_j and W_j, and the device response times

– current cost of accessing block j at its current device i:

C_ji = (R_j * r_i + W_j * w_i) / (R_j + W_j)

– compare with the cost for a block with similar access patterns at another device k, C_jk

– if C_ji > (1+δ)*C_jk, migrate block j to device k
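The migration test can be sketched directly from the formula (the device response times below are made-up numbers for illustration, not measurements from the paper):

```python
# The average cost of accessing block j on device i is weighted by its
# read/write mix; migration happens when the current cost exceeds the
# alternative by more than a factor of (1 + delta).

def access_cost(R_j, W_j, r_i, w_i):
    # C_ji = (R_j * r_i + W_j * w_i) / (R_j + W_j)
    return (R_j * r_i + W_j * w_i) / (R_j + W_j)

def should_migrate(R_j, W_j, current, other, delta=1.0):
    # current/other are (read_avg, write_avg) pairs for the two devices
    c_cur = access_cost(R_j, W_j, *current)
    c_alt = access_cost(R_j, W_j, *other)
    return c_cur > (1 + delta) * c_alt

# a write-heavy block (R=10, W=90) on a flash device with fast reads
# but slow writes is cheaper to serve from the magnetic disk:
flash = (0.2, 4.0)  # (r_i, w_i) in ms, illustrative
disk = (8.0, 0.5)
assert should_migrate(10, 90, current=flash, other=disk, delta=1.0)
```

The hysteresis factor (1+δ) prevents blocks from bouncing between devices when the two costs are close.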

(12)

Proposed Scheme (5/5)

• Employ a token scheme to control the rate of migration

• Potential cost of a block migrated from device i to device k:

r_i + w_k

– only consider blocks that are currently being read or written at the device, as part of normal I/O activity, to reduce the cost

• Strategy in choosing which block to migrate

– maintain a cache of recently accessed blocks

– whenever a migration token is generated, migrate a block from this cached list to benefit the most active blocks

• Migration is carried out in blocks or chunks of 64KB or larger

– a larger chunk size increases migration costs but reduces the size of the indirection map
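A token scheme like the one described can be sketched as a simple token bucket (rates, bucket size, and names are assumptions for illustration):

```python
# Tokens accrue at a fixed rate up to a bucket size; each migration
# consumes one token, bounding how much I/O migration can add.

class MigrationTokens:
    def __init__(self, tokens_per_sec, bucket_size):
        self.rate = tokens_per_sec
        self.bucket_size = bucket_size
        self.tokens = bucket_size  # start with a full bucket
        self.last = 0.0

    def try_take(self, now):
        # refill tokens for the elapsed time, capped at the bucket size
        self.tokens = min(self.bucket_size,
                          self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True   # a migration may proceed
        return False      # rate limit reached; skip this candidate

bucket = MigrationTokens(tokens_per_sec=2.0, bucket_size=3.0)
# a burst drains the bucket, then refill allows two more after one second
assert [bucket.try_take(0.0) for _ in range(4)] == [True, True, True, False]
assert [bucket.try_take(1.0) for _ in range(3)] == [True, True, False]
```

Pairing this with the cache of recently accessed blocks means each granted token is spent on a currently active block, matching the slide's strategy.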

(13)

Evaluation (1/7)

• NFS server

– Intel Pentium Dual Core 3.2 GHz processor

– 1GB main memory

– magnetic disk: one 7200RPM, 250GB SAMSUNG SATA disk (SP2504C)

– flash disk drives:

• a 16GB Transcend SSD (TS16GSSD25S-S)

• a 32GB MemoRight GT drive

– Fedora 9 with a 2.6.21 kernel

– Ext2 file system

• 3 workloads

– SPECsfs 3.0

• file system workloads, read/write ratio about 1:4

– Postmark

• typical access patterns in an email server

– IOzone

• creates controlled workloads at the storage system, varying the read/write ratio across 100%, 75%, 50%, 25%, and 0%

(14)

Evaluation (2/7)

• 4 policies

– FLASH-ONLY

– MAGNETIC-ONLY

– STRIPING: data is striped on both flash and magnetic disk

– STRIPING-MIGRATION: data is striped on and migrated across both disks

[Figure: per-policy throughput comparison]

(a) the benefit from data redistribution comes from matching the read/write characteristics of blocks to the device performance

(b) the scheme succeeds in redistributing write-intensive blocks to the magnetic disk

(15)

Evaluation (3/7)

Using δ = 1 and a chunk size of 64KB in all the following experiments

(migration condition: C_ji > (1+δ)*C_jk)

(16)

Evaluation (4/7)

2-HARDDISK STRIPING: data is striped on two HDDs and no migration is employed

Transcend 16GB (slower), MemoRight 32GB (faster)

(a) with the Transcend drive, 2-harddisk striping outperforms the hybrid drive on both saturation point and response time

(b) with the MemoRight drive, the hybrid drive achieves a nearly 50% higher throughput saturation point

(17)

Evaluation (5/7)

Using IOzone to create workloads ranging from 100% writes down to 75%, 50%, 25%, and 0% writes

2-HARDDISK STRIPING: data is striped on two HDDs and no migration is employed

STRIPING: data is striped on both flash and magnetic disk (Transcend-based hybrid drive)

STRIPING-MIGRATION: data is striped on and migrated across both disks

• The read/write characteristics of the workload have a critical impact on the hybrid system

(18)

Evaluation (6/7)

• Migration improves the transaction rate and read/write throughputs in both hybrid systems by about 10%

• The Transcend-based hybrid system cannot compete with the 2-HDD system

• The MemoRight-based hybrid system outperforms the 2-HDD system by roughly 10-17%

(19)

Evaluation (7/7)

Migration-1: consider only read/write characteristics

Migration-2: request size is also considered

• if < 64KB, place based on the read/write request pattern

• if > 64KB, allow the request to exploit the gain from striping data across both devices

 For MemoRight-Hybrid

• Migration-1 improves performance over striping by about 7%

• Migration-2 improves performance by about 20% on average

 For Transcend-Hybrid

• the improvement from both migration policies is not as large

• it cannot match the performance of the 2-HDD system

This shows that both read/write and request-size patterns can be exploited to improve performance
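The Migration-2 placement decision above can be sketched as follows (the threshold comes from the slides; the function shape and return values are illustrative assumptions):

```python
# Small requests are placed by read/write affinity; large requests stay
# striped so they can exploit the bandwidth of both devices.

SIZE_THRESHOLD = 64 * 1024  # 64KB, per the slides

def place_chunk(request_size, reads, writes):
    """Return a placement hint for a chunk given its access history."""
    if request_size >= SIZE_THRESHOLD:
        return "striped"   # large requests benefit from striping
    if reads >= writes:
        return "flash"     # read-heavy: flash serves reads faster
    return "disk"          # write-heavy: disk absorbs writes better

assert place_chunk(128 * 1024, reads=5, writes=5) == "striped"
assert place_chunk(4 * 1024, reads=90, writes=10) == "flash"
assert place_chunk(4 * 1024, reads=10, writes=90) == "disk"
```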

(20)

Conclusion

• Proposed a measurement-driven migration strategy for managing storage space in a hybrid system to exploit the performance asymmetry

• Extracts the read/write access patterns and request size patterns of different blocks and matches them to the read/write advantages of different devices

• The results indicate that the proposed approach can improve the performance of the system
