5.4 Evaluation
5.4.3 Pileus-Like Key-Value Store
[Figure 5.5: plot of average read latency (us) against value size (4KB, 8KB, 12KB, 16KB, 20KB) for four configurations: Primary, Local latest, Yogurt RMW, and Yogurt MR.]
Figure 5.5: Key-value store’s read latency and value size.
Pileus-KV uses a persistent hashtable over the Yogurt block address space to store variable-sized key-value pairs. We ran YCSB workload A, which is composed of 50% writes and 50% reads, on the key-value store, choosing keys according to a Zipf distribution, and measured the performance. A value can be partially updated when only part of it changes. In the experiment, the value size is varied from 4KB to 20KB, which is equivalent to 1 to 5 data blocks. To read or write a key-value pair, at least one additional metadata block must be accessed to locate the block that stores the key. We evaluate Yogurt’s capability to access stale data that spans multiple blocks using the GetVersionRange API with GetCost calls.
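The block arithmetic in this setup can be made concrete with a minimal sketch. It assumes a 4KB block size (implied by 4KB to 20KB mapping to 1 to 5 data blocks) and a value laid out contiguously from a block-aligned start; the function names are hypothetical and are not part of Pileus-KV or Yogurt.

```python
from typing import List

BLOCK_SIZE = 4 * 1024  # assumption: 4KB blocks, implied by the 1-to-5-block range
MIN_METADATA_BLOCKS = 1  # lower bound stated in the text ("at least one")


def data_blocks(value_size: int) -> int:
    """Number of data blocks needed to hold a value of value_size bytes."""
    return -(-value_size // BLOCK_SIZE)  # ceiling division


def min_blocks_touched(value_size: int) -> int:
    """Lower bound on blocks accessed to read or write a whole value."""
    return MIN_METADATA_BLOCKS + data_blocks(value_size)


def blocks_for_partial_update(offset: int, length: int) -> List[int]:
    """Indices of the data blocks covered by a partial update.

    Assumes the value starts at a block boundary (a hypothetical layout).
    """
    first = offset // BLOCK_SIZE
    last = (offset + length - 1) // BLOCK_SIZE
    return list(range(first, last + 1))


smallest = min_blocks_touched(4 * 1024)    # 1 data block + 1 metadata block
largest = min_blocks_touched(20 * 1024)    # 5 data blocks + 1 metadata block
straddling = blocks_for_partial_update(offset=4090, length=100)
```

Under these assumptions a full access touches between 2 and 6 blocks across the experiment, while a 100-byte update that straddles the first block boundary touches only data blocks 0 and 1.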
We used the same server configurations as for the Pileus-like block store and used the memory cache. There are 16 threads accessing the key-value store and a stream of incoming writes from the primary. Figure 5.5 shows the average read latency. As the value size grows to span multiple blocks, Yogurt can provide multiple options for selecting each block. The gap between accessing the latest block from the local storage and accessing older versions grows as the value size gets larger. The key-value store queries costs before every read, so the overall approach is a simple greedy selection. More sophisticated selection schemes could further improve performance, but the figure shows that for both read-my-writes and monotonic reads semantics, greedy selection already leads to better performance than the baselines.
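One plausible reading of this greedy selection is sketched below. The toy store, its `candidates` method, the snapshot window, and all costs are hypothetical stand-ins; they are not the real signatures of GetVersionRange or GetCost, only an assumed model in which each block version is valid over a range of snapshots and has an estimated read cost. For each block in turn, the sketch picks the cheapest acceptable version and then narrows the window so that later blocks stay consistent with the earlier choices.

```python
from typing import Dict, List, Tuple

# (first_snapshot, last_snapshot, cost): one stored version of a block, the
# inclusive snapshot range over which it is current, and a made-up read cost.
Version = Tuple[int, int, int]


class ToyVersionedStore:
    """Hypothetical in-memory stand-in for a multi-versioned block store."""

    def __init__(self, blocks: Dict[str, List[Version]]) -> None:
        self.blocks = blocks

    def candidates(self, block_id: str, lo: int, hi: int) -> List[Version]:
        """Versions of block_id that are valid somewhere in [lo, hi].

        Assumed to combine what a version-range query and a cost query
        would return; not the actual Yogurt API.
        """
        return [v for v in self.blocks[block_id] if v[0] <= hi and v[1] >= lo]


def greedy_read_plan(
    store: ToyVersionedStore, block_ids: List[str], lo: int, hi: int
) -> Tuple[List[Tuple[str, int, int]], Tuple[int, int]]:
    """Pick the cheapest acceptable version of each block, one block at a time."""
    plan = []
    for block_id in block_ids:
        options = store.candidates(block_id, lo, hi)
        if not options:
            raise LookupError(f"no version of {block_id} in window [{lo}, {hi}]")
        first, last, cost = min(options, key=lambda v: v[2])
        plan.append((block_id, first, cost))
        # Narrow the window to snapshots where every chosen version is current.
        lo, hi = max(lo, first), min(hi, last)
    return plan, (lo, hi)


store = ToyVersionedStore({
    "meta": [(0, 4, 50), (5, 9, 10)],
    "d0": [(0, 6, 5), (7, 9, 80)],
    "d1": [(0, 2, 5), (3, 9, 40)],
})
# Snapshots 3..9 are assumed acceptable under the client's consistency guarantee.
plan, window = greedy_read_plan(store, ["meta", "d0", "d1"], lo=3, hi=9)
total_cost = sum(cost for _, _, cost in plan)
```

In this toy run the cheap old version of `d1` is rejected because the earlier choices narrowed the window to snapshots 5 to 6. This is the kind of case where a selection scheme that looks across all blocks at once might do better than a per-block greedy choice.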
5.5 Summary
In this chapter, we repurposed a well-known distributed systems principle within the context of a single cloud storage server: storage systems should expose older versions to applications for better performance. This principle enables consistency control within a cloud storage server and provides different isolated views of the storage server to each client. This principle is increasingly relevant as we move toward a post-disk era of storage systems that are often internally multi-versioned and multi-device. Distributed storage services in the cloud can benefit from this principle by pushing relaxed consistency requirements (negotiated between the client and the service) down the stack to the storage subsystem on each server. In the future, we believe that new applications will emerge on a single storage server that can work with weaker consistency guarantees in exchange for better performance.
CHAPTER 6
RELATED WORK
In this dissertation, we explored how to support isolation with regard to performance, transactions, and consistency control in cloud storage systems. In this chapter, we discuss previous work related to our contributions, the techniques on which they build, and alternate or complementary approaches.
6.1 Performance Isolation and Logging
Log-structured storage has a long history starting with the original log-structured filesystem (LFS) [107]. Much of the work on LFS in the 1990s focused on its shortcomings related to garbage collection [115, 82, 116]. Other work, such as Zebra [67], extended LFS-like designs to distributed storage systems. Attempts to distribute logs focused entirely on striping logs over multiple drives, as opposed to the chained-logging design that we investigated in Chapter 3.
Log-structured designs have made a strong comeback in part because of the emergence of flash memory, which requires a log-structured design to minimize wear-out. Not only do individual SSDs layer an address space over a log, but filesystems designed to run over SSDs are often log-structured to minimize the stress on the SSD’s internal mapping mechanisms [27]. New log-structured designs have emerged as flash has entered the mainstream; for instance, CORFU [33] uses an off-path sequencer to implement a distributed, shared log over a flash cluster. Another reason for the return of log-structured designs is the increased prevalence of geo-distributed systems, where intrinsic ordering properties of logs provide consistency-related benefits [138].
In addition, performance isolation and contention in data centers have received increasing attention. Lithium [65] uses a single on-disk log structure to support multiple VMs, much as Gecko does (Chapter 3), but it layers this log conventionally over RAID and does not offer any new solutions to the problem of read-write contention. However, the authors of the Lithium paper make two relevant points: first, replicated workloads are even more likely to be write-dominated, and second, the inability of log-structured designs to efficiently service large, sequential reads is unlikely to matter in virtualized settings, where such reads are rare due to cross-VM interference. Parallax [85] supports large numbers of virtual disks over a shared block device but focuses on features such as frequent snapshots rather than performance under contention. PARDA [60] is a system that provides fair sharing of a storage array across multiple VMs but does not focus, as Gecko does, on improving aggregate throughput under contention.