
Future Work

In document Isolation in Cloud Storage (Page 168-172)

Isolation in storage systems has long been studied and continues to be explored in cloud environments. Gecko (Chapter 3), Isotope (Chapter 4), and Yogurt (Chapter 5) contribute to three aspects of isolation (performance, transaction, and consistency control), but many future directions and related problems require further research. In this chapter, we review and discuss future research directions.

7.1.1 Hardware Integration

Moving isolation features below the block layer requires further research. SSDs have embedded processors and run firmware that carries out complex functionality such as address translation, garbage collection, and caching [27]. Shingled drives [26], which need a data indirection layer, have designs similar to those of SSDs. Such modern hardware designs open new possibilities for pushing rich functionality down to the physical block device. Some possible future research directions include the following.

First, operations that are CPU intensive can be offloaded to the block device, thus simplifying the storage stack. As part of this approach, transactions can be pushed down to the physical block device. The block device can take over from the host machine both the caching of uncommitted blocks and the computations for detecting transaction conflicts. However, the additional coordination between block devices for deciding and committing transactions requires research. A similar approach, which pushes functionality down to a physical block device, can be found in Seagate's new key-value disk drive, which has an Ethernet port and supports key-value interfaces [19]. Seagate's key-value drive facilitates key-value store designs, offloads key-value search operations from the host machine, and enables bypassing several layers of the storage stack by using Ethernet-based accesses.
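To make the offloaded conflict computation concrete, the following is a minimal sketch of a generic optimistic conflict check of the kind a block device could run. It is an illustration under stated assumptions, not the mechanism of Isotope or of any real device: the names `Transaction`, `ConflictChecker`, `begin`, and `try_commit` are hypothetical, and the check assumes a single device-wide commit counter.

```python
from dataclasses import dataclass, field


@dataclass
class Transaction:
    # Commit counter observed when the transaction began (hypothetical field).
    start_version: int
    # Block addresses read and written by the transaction.
    read_set: set = field(default_factory=set)
    write_set: set = field(default_factory=set)


class ConflictChecker:
    """Hypothetical device-side checker that remembers, for each block,
    the commit version at which it was last written."""

    def __init__(self) -> None:
        self.commit_version = 0
        self.last_write = {}  # block address -> commit version

    def begin(self) -> Transaction:
        return Transaction(start_version=self.commit_version)

    def try_commit(self, tx: Transaction) -> bool:
        # Abort if any block the transaction read was overwritten by a
        # transaction that committed after this one began.
        for block in tx.read_set:
            if self.last_write.get(block, 0) > tx.start_version:
                return False
        # No conflict: publish the writes under a new commit version.
        self.commit_version += 1
        for block in tx.write_set:
            self.last_write[block] = self.commit_version
        return True
```

In this sketch the host only ships read and write sets to the device; the per-block version table and the comparison loop, which are the CPU- and memory-consuming parts, stay on the device side.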

Second, a hardware implementation in a physical block device can react quickly to requests. Support for StaleStore APIs is a good candidate for implementation in hardware, because the storage access cost estimation is time sensitive and needs knowledge of the physical block device. Gecko and Isotope rely on flash drives or SSDs to persist metadata and transaction records. Durability guarantees are best made by the hardware, since the characteristics of the storage medium determine how and when data is persisted.

7.1.2 Support for Distributed Storage Systems

The systems introduced in this dissertation reside inside a cloud storage server. Some of their principles apply directly or extend to distributed storage settings, but others are not immediately usable. For example, the transactional APIs that Isotope provides from the block layer may not scale in distributed settings. The core idea of handling block-level transactions can be applied to a distributed block store, but details such as deciding and aborting transactions would need to involve network communication among multiple nodes. Network communication can be an overhead in implementing strong isolation guarantees, and operations may need to roll back depending on the implementation. With a centralized controller, coordinating transactions can be easier, but scalability can remain a problem. On the other hand, distributed decision-making can scale well, but it can complicate the design and communication protocols. Large distributed storage systems tend to implement their own transactions, with transactional guarantees that are less general and tuned for system-specific needs [125, 29]. A future research direction is to extend the Isotope transactional API to support distributed transactions. The goal would be to enable different semantics under the same API, similar to how Pileus [131] supports different client-centric consistency semantics, but using data-centric consistency models.
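The centralized-controller option and the need to roll back can be illustrated with a textbook two-phase commit over in-memory nodes. This is a sketch of one well-known protocol chosen for illustration, not a proposed design for distributed Isotope: `BlockNode`, `run_transaction`, and the per-block lock set are hypothetical, method calls stand in for network messages, and failures and timeouts are ignored.

```python
class BlockNode:
    """Hypothetical storage node that owns a subset of the blocks."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.blocks = {}    # block address -> committed data
        self.locked = set() # blocks held by prepared transactions
        self.staged = {}    # tx_id -> {block address: data}

    def prepare(self, tx_id: str, writes: dict) -> bool:
        # Vote "no" if another prepared transaction holds any of the blocks.
        if any(block in self.locked for block in writes):
            return False
        self.locked.update(writes)
        self.staged[tx_id] = dict(writes)
        return True

    def commit(self, tx_id: str) -> None:
        writes = self.staged.pop(tx_id, {})
        self.blocks.update(writes)
        self.locked.difference_update(writes)

    def abort(self, tx_id: str) -> None:
        # Roll back: drop the staged writes and release the locks.
        writes = self.staged.pop(tx_id, {})
        self.locked.difference_update(writes)


def run_transaction(tx_id: str, writes_by_node: dict) -> bool:
    """Centralized coordinator. writes_by_node maps BlockNode -> writes."""
    prepared = []
    # Phase 1: collect votes; a single "no" aborts every prepared node.
    for node, writes in writes_by_node.items():
        if node.prepare(tx_id, writes):
            prepared.append(node)
        else:
            for p in prepared:
                p.abort(tx_id)
            return False
    # Phase 2: all nodes voted "yes", so make the writes visible.
    for node in prepared:
        node.commit(tx_id)
    return True
```

Every transaction here costs two rounds of messages through one coordinator, which is the simplicity-versus-scalability trade-off described above.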

One of the challenges that makes distributed transactions difficult is time synchronization: distributed nodes have different clocks, so deciding the order of transactions is difficult. There are two approaches to this problem. The first is to use logical clocks such as the Lamport clock and the vector clock [128]. Following this direction results in systems that are similar to many distributed systems now in common use. The second is to use synchronized physical clocks: a recently proposed datacenter time protocol [77] synchronizes physical timestamps to within tens of nanoseconds, with bounded error, using cheap hardware. As another research direction, Isotope could be combined with physically synchronized clocks. We expect such a system to make local decisions for a certain portion of transactions without consulting, or with less contact with, a centralized decision engine, while still supporting strong guarantees.
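The two approaches can be contrasted in a short sketch. The first class is the standard Lamport clock; the second part illustrates, under assumptions, how bounded clock error could allow local ordering decisions. `BoundedTimestamp`, `order_locally`, and the 50 ns error bound used in the example are hypothetical and are not taken from the time protocol in [77].

```python
from dataclasses import dataclass
from typing import Optional


class LamportClock:
    """Logical clock: orders events by a counter instead of physical time."""

    def __init__(self) -> None:
        self.counter = 0

    def send(self) -> int:
        # Advance the counter and attach it to an outgoing message.
        self.counter += 1
        return self.counter

    def receive(self, message_time: int) -> int:
        # Move past the sender's timestamp so the receive follows the send.
        self.counter = max(self.counter, message_time) + 1
        return self.counter


@dataclass(frozen=True)
class BoundedTimestamp:
    time_ns: int   # local physical clock reading
    error_ns: int  # assumed worst-case offset from true time

    @property
    def earliest(self) -> int:
        return self.time_ns - self.error_ns

    @property
    def latest(self) -> int:
        return self.time_ns + self.error_ns


def order_locally(a: BoundedTimestamp, b: BoundedTimestamp) -> Optional[int]:
    """Return -1 if a surely precedes b, 1 if b surely precedes a, and
    None if the uncertainty intervals overlap and the order is unknown."""
    if a.latest < b.earliest:
        return -1
    if b.latest < a.earliest:
        return 1
    return None


# Two transactions stamped 200 ns apart with a 50 ns bound can be ordered
# locally; two stamped 80 ns apart cannot.
first = BoundedTimestamp(time_ns=1000, error_ns=50)
later = BoundedTimestamp(time_ns=1200, error_ns=50)
close = BoundedTimestamp(time_ns=1080, error_ns=50)
print(order_locally(first, later), order_locally(first, close))
```

Only the `None` case would require contacting a centralized decision engine, so a tighter error bound means a larger share of transactions could be ordered locally.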

7.1.3 Towards Smarter Block Storage

Finally, a third research direction is to make block storage smarter. Block storage has been treated as very simple, but a growing number of features are being integrated into it, as in the work described in this dissertation. Smarter block storage enables bypassing software stacks, so it can be used to strip unnecessary layers from a heavily layered cloud storage system. For embedded devices that cannot afford heavy software stacks, a smart block store can keep the software stack simple and save power. Block devices are becoming more powerful due to advances in hardware technology, and the storage stack design needs to be rethought. In addition to logging and transactions, candidate new features include deduplication, encryption, and data placement for efficient data access and less frequent defragmentation.

Making the block layer smarter requires redesigning other layers, such as the filesystem, virtual filesystem, page cache, and even applications, at the same time. Gecko, Isotope, and Yogurt demonstrate how new features in the block storage can affect the layers above. The logging and caching approaches inside Gecko and Isotope can influence the caching policy of a page cache, and the consistency management of Isotope and Yogurt can affect how synchronization works in existing filesystems. The role of each layer in the storage stack will change accordingly, and a promising research direction is to investigate how the full storage stack will evolve in the future.
