SWDs generally follow the geometry of regular HDDs, except that the tracks overlap. Like an HDD, each SWD may contain several platters, and physical data blocks are addressed by Cylinder-Head-Sector (CHS). Because outer tracks are larger than inner tracks, the SWD space is divided into multiple zones. Tracks in the same zone have the same size; tracks in outer zones are larger than those in inner zones and also perform better. Each zone can be further organized into bands if needed. A small portion (about 1% to 3%) of the total space is usually used as a random access zone (RAZ) for maintaining metadata [77, 78, 79].
I-SWDs and O-SWDs organize and use the bulk shingled access zone (SAZ) differently, as shown in Figure 4.1. Autonomous I-SWDs usually organize the tracks into small bands to achieve a good balance between space gain and performance, as discussed and evaluated in [76]. Figure 4.1a shows an example with 4 tracks per band. Host-managed I-SWDs, however, can use larger bands. For example, the shingled file system [77] sets the band size to 64 MB, or about 100 tracks depending on the track size.
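The trade-off between space gain and performance can be made concrete with a rough model. The sketch below is illustrative only and is not taken from [76] or [77]: it assumes each band ends with a safety gap one track wide, that updating a track forces a rewrite of every later track in the same band, and that updates hit tracks uniformly at random. All function names are hypothetical.

```python
# Rough, hypothetical model of the band-size trade-off in an in-place update SWD.
# Assumptions (not from the cited work): one guard track per band, and an
# update to track i forces a rewrite of all later tracks in the band.

def guard_overhead(tracks_per_band: int, guard_tracks: int = 1) -> float:
    """Fraction of the band's area spent on the safety gap."""
    return guard_tracks / (tracks_per_band + guard_tracks)


def mean_tracks_written(tracks_per_band: int) -> float:
    """Mean number of tracks written per single-track update.

    Updating track i (0-based) writes N - i tracks: the track itself plus
    every later track in the band. Averaged over a uniformly chosen i this
    is (N + 1) / 2.
    """
    n = tracks_per_band
    return sum(n - i for i in range(n)) / n


def implied_track_size_kib(band_mib: float, tracks_per_band: int) -> float:
    """Track size implied by a band size and a track count."""
    return band_mib * 1024 / tracks_per_band


if __name__ == "__main__":
    for n in (4, 100):
        print(n, guard_overhead(n), mean_tracks_written(n))
    print(implied_track_size_kib(64, 100))
```

Under these assumptions, a 4-track band spends a fifth of its area on the gap but rewrites 2.5 tracks per update on average, while a 100-track band spends about 1% on the gap but rewrites 50.5 tracks. A 64 MB band of 100 tracks implies tracks of roughly 655 KiB.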
Most existing work on O-SWDs divides the shingled access zone into an E-region and an I-region, as shown in Figure 4.1b. Sometimes multiple E-regions and I-regions are used. The E-region is organized as a circular buffer and is used for buffering incoming writes, while the I-region is used for permanent data storage and is organized into big bands. Writes to both the E-region and the I-region have to be done sequentially, and GC operations are required for both regions. The suggested E-region size is no more than 3% [78, 79, 80, 81].
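The E-region behavior described above can be sketched as a toy circular log. This is a minimal illustration under stated assumptions, not the design of any cited system: the class `ERegion`, its fields, and the one-slot-at-a-time on-demand cleaning policy are all hypothetical, and the I-region is reduced to a dictionary, so its own band-level GC is not modeled.

```python
class ERegion:
    """Toy circular write buffer; a hypothetical stand-in for an E-region."""

    def __init__(self, capacity: int):
        self.capacity = capacity
        self.slots = [None] * capacity  # each slot holds (lba, data) or None
        self.head = 0                   # next slot to write (sequential append)
        self.tail = 0                   # oldest slot not yet cleaned
        self.used = 0
        self.latest = {}                # lba -> slot index of its newest copy
        self.i_region = {}              # lba -> data; stand-in for the I-region

    def write(self, lba: int, data: str) -> None:
        """Append a write at the head, cleaning the tail first if full."""
        if self.used == self.capacity:
            self.clean_one()
        self.slots[self.head] = (lba, data)
        self.latest[lba] = self.head    # any older copy of lba is now invalid
        self.head = (self.head + 1) % self.capacity
        self.used += 1

    def clean_one(self) -> None:
        """GC the tail slot: migrate it if still valid, otherwise drop it."""
        lba, data = self.slots[self.tail]
        if self.latest.get(lba) == self.tail:
            self.i_region[lba] = data   # newest copy: move to permanent storage
            del self.latest[lba]
        self.slots[self.tail] = None
        self.tail = (self.tail + 1) % self.capacity
        self.used -= 1

    def read(self, lba: int):
        """Return the newest data for lba, preferring the buffer."""
        if lba in self.latest:
            return self.slots[self.latest[lba]][1]
        return self.i_region.get(lba)


if __name__ == "__main__":
    er = ERegion(capacity=3)
    for lba, data in [(1, "a"), (2, "b"), (1, "c"), (3, "d"), (4, "e")]:
        er.write(lba, data)
    print(er.read(1), er.read(2), er.i_region)
```

In the usage run, the stale copy of LBA 1 is dropped during cleaning without any migration, while the still-valid copy of LBA 2 is moved to the I-region stand-in. A larger share of stale entries at the tail therefore makes cleaning cheaper.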
4.3 Related Work
Several studies have addressed out-of-place update SWDs. For example, Cassuto et al. proposed two indirection systems in [80]. Both systems use two types of data regions, one for caching incoming write requests and the other for permanent data storage. Their second scheme introduces the S-block concept: all S-blocks have the same size, and each S-block consists of a predefined number of sequential regular blocks/sectors, such as the 2000 blocks used in [80]. GC operations have to be performed in both data regions in an on-demand way. Hall et al. proposed a background GC algorithm [81] that refreshes the tracks in the I-region while data is continuously written into the E-region buffer. The tracks in the I-region have to be sequentially refreshed at a very fast rate to ensure enough space in the E-region, which is quite expensive and creates performance and power consumption issues. Recently, Jin et al. proposed HiSMRfs [82], a host-managed solution. HiSMRfs pairs some amount of SSD with the SWD so that file metadata (hot data) can be stored in the SSD while file data (cold data) can be stored in the SWD. HiSMRfs uses file-based or band-based GC operations to reclaim the invalid space created by file deletions and file updates. However, the details of the GC operations are not discussed. Aghayev et al. designed a tool framework called Skylight [83] to reverse-engineer a Seagate autonomous SWD. Skylight infers important information such as drive type, persistent cache size and GC types by measuring the latency of controlled I/O operations.
Figure 4.1: SWD Layouts. (a) In-place Update SWD: a Random Access Zone followed by a Shingled Access Zone with Small Bands (Band 0, Band 1, …, Band M), spanning Zone 0 to Zone N. (b) Out-of-place Update SWD: a Random Access Zone and an E-region followed by a Shingled Access Zone with Big Bands (I-region; Band 0, Band 1, …, Band M), spanning Zone 0 to Zone N.
There are also some studies on in-place update SWDs. Wan et al. proposed two bold track and sector layouts to reduce space waste and write amplification overhead [84, 85]. The first is a wave-like shingling organization, which lays out the tracks with partial overlap in two opposite radial directions, like a wave, so that the space wasted on safety gaps can be reduced by about half compared to a traditional and practical shingling method. The second, called segment-based data layout, divides a region into segments in the radial direction so that the amount of data rewritten is limited to a segment instead of a whole region. The closest work to ours in this chapter is the shingled file system [77], a host-managed design for in-place update SWDs. The shingled file system works directly on SWD PBAs. The SWD main space is organized into small bands of size 64 MB. Files are written sequentially from head to tail in a selected band. When a file is updated, the impacted data in the subsequent tracks is first read out to a block cache and written back to its original locations afterwards. However, this work did not address the write amplification problem. Another drawback is that popular file systems (such as EXT4 and NTFS), as well as other data management software, have to be modified in order to use these SWDs. As a result, we do not make comparisons to this scheme. Our work mitigates the write amplification problem with novel address mapping schemes that let SWDs support general file systems in a drop-in manner.
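The read-to-cache, update, write-back sequence described for in-place updates can be sketched on a simulated band. This is an illustration under simplifying assumptions, not the implementation in [77]: a band is modeled as a Python list of track contents, a write to track i is assumed to damage only track i + 1, the last track in a band is assumed to be protected by a safety gap, and the helper names `shingled_write` and `update_track` are hypothetical.

```python
def shingled_write(band: list, index: int, data: str) -> None:
    """Write one track; in this model the write damages the next track."""
    band[index] = data
    if index + 1 < len(band):
        band[index + 1] = None  # overlapped neighbor is no longer readable


def update_track(band: list, index: int, new_data: str) -> int:
    """Update one track in place and return the number of track writes."""
    # Step 1: read every later track in the band into a block cache
    # before any of them is damaged.
    cache = band[index + 1:]

    # Step 2: write the updated track; this damages track index + 1.
    shingled_write(band, index, new_data)
    writes = 1

    # Step 3: write the cached tracks back to their original locations in
    # order. Each restore damages the following track, which is restored next.
    for offset, data in enumerate(cache, start=index + 1):
        shingled_write(band, offset, data)
        writes += 1
    return writes


if __name__ == "__main__":
    band = ["a", "b", "c", "d"]
    print(update_track(band, 1, "B"), band)
```

In the usage run, changing one track in a four-track band costs three track writes. That gap between the data changed and the data written is the write amplification discussed above.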