Managing Storage Space
in a Flash and Disk
Hybrid Storage System
Xiaojian Wu and A. L. Narasimha Reddy
Dept. of Electrical and Computer Engineering
Texas A&M University
IEEE International Symposium on
Modeling, Analysis & Simulation of Computer and Telecommunication Systems, 2009
Outline
• Introduction
• Related Work
• Proposed Scheme
• Evaluation
• Conclusion
Introduction (1/4)
• Building a large flash-only storage system is too expensive
– employ both flash and magnetic disks in a hybrid storage system
• Different characteristics
– writes to flash can take longer than on magnetic disk drives, while reads can finish faster
– flash has a limit on the number of times a block can be written
– magnetic disks typically perform better with larger request sizes
• Data placement, retrieval, scheduling, and buffer management algorithms need to be revisited for the hybrid storage system
Introduction (2/4)
• (figure) The disk drive is more efficient for larger reads and writes!
Introduction (3/4)
• Requests experience different performance at different devices based on the request type (read or write) and the device characteristics
Introduction (4/4)
• Managing the space across the devices in a hybrid
system should be adaptable to changing device
characteristics
• Issues:
– allocation
– data redistribution or migration
• Proposing a measurement-driven approach to migration to address these issues
– observe the access characteristics of individual blocks and consider migrating them individually
Related Work
• HP’s AutoRAID system considered data migration
between a mirrored device and a RAID device
– migrate hot data to faster devices and cold data to slower devices
– improve the access times of hot data by keeping it local to faster devices
– when data sets are larger than the capacity of the faster devices in such systems, thrashing may occur
Proposed Scheme (1/5)
• pool the storage space across flash and disk drives and make it
appear like a single larger device to the file system
• maintain an indirection map, containing mappings of logical to physical addresses, to allow blocks to be flexibly assigned to different devices
– when data is migrated, the indirection map needs to be updated
– to reduce this cost, consider migration at a unit (chunk) larger than a single block
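A minimal sketch of such an indirection map, assuming a simple chunk-granular table (the class and field names are illustrative, not the paper's code):

```python
# Maps a logical chunk number to (device id, physical chunk number).
# Migration only needs to update one table entry per chunk.
CHUNK_SIZE = 64 * 1024  # 64KB migration unit, per the slides

class IndirectionMap:
    def __init__(self):
        self.table = {}  # logical chunk -> (device, physical chunk)

    def lookup(self, logical_addr):
        # translate a logical byte address to (device, physical byte address)
        chunk = logical_addr // CHUNK_SIZE
        device, phys_chunk = self.table[chunk]
        return device, phys_chunk * CHUNK_SIZE + logical_addr % CHUNK_SIZE

    def remap(self, logical_chunk, device, phys_chunk):
        # called after a chunk is migrated to a new device
        self.table[logical_chunk] = (device, phys_chunk)
```

Mapping at chunk granularity (rather than per block) keeps the table small, which is the cost reduction the slide refers to.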
Proposed Scheme (2/5)
• keep track of the access behavior of each block by maintaining two counters, for read and write accesses
– use 2 bytes per chunk (64KB or larger) to track read and write frequencies separately
• about 32KB per 1GB of storage
– a block can be considered for migration or relocation only after receiving a minimum number of accesses
• for observing sufficient access history
– block access counters are initialized to zero on boot-up and after migration
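The per-chunk bookkeeping above can be sketched as follows; this is an assumed structure, and the eligibility threshold of 16 accesses is an illustrative value, not one given in the slides:

```python
MIN_ACCESSES = 16  # assumed minimum access history before migration is considered

class ChunkStats:
    """Two one-byte counters per chunk: 2 bytes total (about 32KB per 1GB
    of storage at a 64KB chunk size, since 1GB / 64KB = 16384 chunks)."""
    __slots__ = ("reads", "writes")

    def __init__(self):
        self.reads = 0
        self.writes = 0

    def record(self, is_write):
        # saturate at 255 so each counter fits in a single byte
        if is_write:
            self.writes = min(self.writes + 1, 255)
        else:
            self.reads = min(self.reads + 1, 255)

    def eligible(self):
        # a chunk is a migration candidate only after enough accesses
        return self.reads + self.writes >= MIN_ACCESSES

    def reset(self):
        # counters are cleared on boot-up and after a migration
        self.reads = self.writes = 0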
Proposed Scheme (3/5)
• Every time a request is served by a device, keep track of the request response time at that device
– maintain read and write performance separately
– exponential average of the device performance:
average response time = 0.99 * previous average + 0.01 * current sample
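The exponential average above is a standard exponentially weighted moving average; a one-line sketch:

```python
ALPHA = 0.01  # weight of the newest sample, matching the slide's 0.99/0.01 split

def update_avg(prev_avg, sample):
    # average response time = 0.99 * previous average + 0.01 * current sample
    return (1 - ALPHA) * prev_avg + ALPHA * sample
```

The small weight on the current sample makes the estimate stable while still tracking gradual changes in device performance.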
Proposed Scheme (4/5)
• For each device i, keep track of the read (r_i) and write (w_i) response times
• Determine whether to migrate:
– given a block j's read/write access history through its access counters R_j and W_j, and the device response times
– current cost of accessing block j on its current device i:
C_ji = (R_j * r_i + W_j * w_i) / (R_j + W_j)
– compare with the cost for a block with similar access patterns at another device k, C_jk
– if C_ji > (1+δ) * C_jk, migrate block j to device k
Proposed Scheme (5/5)
• Employ a token scheme to control the rate of migration
• potential cost of migrating a block from device i to device k: r_i + w_k
– to reduce this cost, only consider blocks that are currently being read or written at the device as part of normal I/O activity
• Strategy in choosing which block to migrate
– maintain a cache of recently accessed blocks
– whenever a migration token is generated, migrate a block from this cached list to benefit the most active blocks
• Migration is carried out in blocks or chunks of 64KB or larger
– a larger chunk size increases migration costs but reduces the size of the indirection map
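The token scheme plus the recently-accessed candidate list can be sketched as below; the token rate and cache size are assumed values for illustration, not taken from the paper:

```python
from collections import OrderedDict

MIGRATION_TOKEN_RATE = 1.0  # tokens per second (assumed; throttles migration)
CACHE_SIZE = 128            # recently accessed chunks kept as candidates

class MigrationThrottle:
    def __init__(self):
        self.tokens = 0.0
        self.recent = OrderedDict()  # LRU list of recently accessed chunks

    def on_access(self, chunk):
        # remember the chunk as a migration candidate (most recent last)
        self.recent.pop(chunk, None)
        self.recent[chunk] = True
        if len(self.recent) > CACHE_SIZE:
            self.recent.popitem(last=False)  # evict the least recent

    def add_tokens(self, elapsed_s):
        # tokens accrue with time, bounding the migration rate
        self.tokens += elapsed_s * MIGRATION_TOKEN_RATE

    def try_migrate(self):
        # each migration spends one token; choose the most recently
        # accessed chunk so the most active blocks benefit first
        if self.tokens >= 1.0 and self.recent:
            self.tokens -= 1.0
            chunk, _ = self.recent.popitem(last=True)
            return chunk
        return None
```

With no tokens available, `try_migrate` returns nothing and normal I/O proceeds unimpeded; this is how the scheme caps migration overhead.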
Evaluation (1/7)
• NFS server
– Intel Pentium Dual Core 3.2 GHz processor
– 1GB main memory
– magnetic disk: one 7200RPM, 250GB Samsung SATA disk (SP2504C)
– flash disk drives:
• a 16GB Transcend SSD (TS16GSSD25S-S)
• a 32GB MemoRight GT drive
– Fedora 9 with a 2.6.21 kernel
– Ext2 file system
• 3 Workloads
– SPECsfs 3.0
• file system workloads, read/write ratio about 1:4
– Postmark
• typical access patterns in an email server
– IOzone
• create controlled workloads at the storage system, varying the write ratio among 100%, 75%, 50%, 25%, and 0%
• 4 policies
– FLASH-ONLY
– MAGNETIC-ONLY
– STRIPING
• data is striped on both flash and magnetic disk
– STRIPING-MIGRATION
• data is striped on and migrated across both disks
(a) the benefit from data redistribution comes from matching the read/write characteristics of blocks to device performance
(b) migration succeeds in redistributing write-intensive blocks to the magnetic disk
Evaluation (2/7)
Evaluation (3/7)
Using δ = 1 and a chunk size of 64KB in all the following experiments
(migration condition: C_ji > (1+δ) * C_jk)
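The cost model and migration rule can be written out directly; a minimal sketch with δ = 1 as in the evaluation (function names are illustrative):

```python
DELTA = 1.0  # migration threshold δ; the evaluation uses δ = 1

def access_cost(reads, writes, r, w):
    # expected per-access cost of a block with R reads and W writes on a
    # device with read time r and write time w:
    #   C = (R * r + W * w) / (R + W)
    return (reads * r + writes * w) / (reads + writes)

def should_migrate(reads, writes, r_i, w_i, r_k, w_k, delta=DELTA):
    # migrate block j from device i to device k only if its current cost
    # exceeds the cost at k by more than a factor of (1 + δ)
    c_ji = access_cost(reads, writes, r_i, w_i)
    c_jk = access_cost(reads, writes, r_k, w_k)
    return c_ji > (1 + delta) * c_jk
```

For example, a purely write-bound block on a flash device with slow writes clears the threshold and moves to the disk, while a read-heavy block on flash stays put; the (1 + δ) margin prevents blocks from ping-ponging between devices on small cost differences.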
Evaluation (4/7)
2-HARDDISK STRIPING:
data is striped on two HDD and no migration is employed
(figures: Transcend 16GB hybrid, slower flash; MemoRight 32GB hybrid, faster flash)
(a) with the slower Transcend flash, 2-harddisk striping outperforms the hybrid drive in both saturation point and response time
(b) with the faster MemoRight flash, the hybrid drive achieves a nearly 50% higher throughput saturation point
Using IOzone to create workloads ranging from 100% writes through 75%, 50%, and 25% down to 0% writes
2-HARDDISK STRIPING: data is striped on two HDD and no migration is employed
STRIPING: data is striped on both flash and magnetic disk (Transcend-based hybrid drive)
STRIPING-MIGRATION: data is striped on and migrated across both disks
the read/write characteristics of the workload have a critical impact on the hybrid system
Evaluation (6/7)
• migration improves the transaction rate and read/write throughputs in both hybrid systems by about 10%
• the Transcend-based hybrid system cannot compete with the 2-HDD system
• the MemoRight-based hybrid system outperforms the 2-HDD system by roughly 10-17%
Evaluation (7/7)
Migration-1: considers only read/write characteristics
Migration-2: request size is also considered
• requests smaller than 64KB are placed based on the read/write request pattern
• requests of 64KB or larger are striped to exploit the gain from spreading data across both devices
For MemoRight-Hybrid
• Migration-1 improves performance over striping by about 7%
• Migration-2 improves by about 20% on average
For Transcend-Hybrid
• the improvement from both migration policies is smaller
• it cannot match the performance of the 2-HDD system
This shows that both read/write and request-size patterns can be exploited to improve performance