2.2 Methodology
2.2.3 Public Workloads and Systems
To evaluate our systems and approaches, we use block I/O traces, benchmarks, storage systems, and development frameworks. In this subsection, we describe the ones that are publicly available and how we use them.
Block I/O Traces
Block I/O traces are often used to evaluate storage systems. Because the traces describe only the I/O access patterns and not the data contents, which can be sensitive, they are relatively easy to release into the public domain. Microsoft has released three sets of traces (Microsoft production server traces, Microsoft enterprise traces, and Microsoft Research Cambridge traces) in the SNIA repository [20], and they have been actively used in many research projects and papers. The traces were collected from SQL servers, Exchange mail servers, network filesystems, Live map service, print servers, source management servers, and so on, ranging from small systems to large distributed systems. They are valuable because they were collected from real deployments over extended periods of time, so they can be used to evaluate how a new system performs under the workload patterns found in real life. By running different I/O traces simultaneously, we emulate a cloud environment in which arbitrary workloads run together. In this dissertation, the block I/O traces are mostly used for evaluating performance isolation.
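As a rough illustration of running several traces together, the sketch below interleaves per-trace request streams into one stream ordered by relative time. It assumes a CSV layout of Timestamp, Hostname, DiskNumber, Type, Offset, Size, ResponseTime, as commonly documented for the MSR Cambridge traces. The column layout, the `Request` type, the choice to rebase each trace to time zero, and the function names are illustrative assumptions, not the replay tool used in this dissertation.

```python
import csv
import heapq
import io
from typing import Dict, Iterable, Iterator, NamedTuple


class Request(NamedTuple):
    time: int      # timestamp relative to the first request of its trace
    source: str    # label of the trace the request came from
    op: str        # request type as recorded in the trace, e.g. "Read"
    offset: int    # byte offset on the traced disk
    size: int      # request size in bytes


def parse_trace(lines: Iterable[str], source: str) -> Iterator[Request]:
    """Yield requests from one trace, rebasing timestamps to start at 0.

    Assumed column order: Timestamp, Hostname, DiskNumber, Type, Offset,
    Size, ResponseTime.
    """
    base = None
    for row in csv.reader(lines):
        if len(row) < 6:
            continue  # skip blank or malformed lines
        timestamp = int(row[0])
        if base is None:
            base = timestamp
        yield Request(timestamp - base, source, row[3], int(row[4]), int(row[5]))


def merge_traces(traces: Dict[str, Iterable[str]]) -> Iterator[Request]:
    """Interleave several time-sorted traces as if they ran together."""
    streams = [parse_trace(lines, name) for name, lines in traces.items()]
    return heapq.merge(*streams)  # tuples compare on the time field first


if __name__ == "__main__":
    web = io.StringIO("100,web,0,Read,4096,512,10\n300,web,0,Write,8192,4096,12\n")
    sql = io.StringIO("5000,sql,1,Write,0,8192,7\n5100,sql,1,Read,16384,512,9\n")
    for request in merge_traces({"web": web, "sql": sql}):
        print(request)
```

A replayer would then issue each merged request against the device under test at its relative time; that part is omitted here because it depends on the target system.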
Benchmarks
Using benchmark software is a standard way of evaluating a new system. Storage system benchmarks generate various I/O patterns and enable testing the system under different settings.
IOZone [9]: IOZone is a filesystem benchmark tool. It generates various I/O patterns in different phases, such as sequential reads, sequential writes, re-writes, backward reads, strided reads, random reads, random writes, asynchronous reads, and asynchronous writes. Although it is a filesystem benchmark, it can also test block devices when it is run directly on the logical block layer. To evaluate our device-mapper-based systems, we ran IOZone with the options that open files with O_DIRECT and O_SYNC, so that I/Os bypass the page cache and interact directly with the block layer. At the end of its execution, IOZone reports statistics including bandwidth and latency. Using multiple IOZone threads, we create a system environment with multiple concurrent users issuing heavy I/O.
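To make the effect of these two flags concrete, the following sketch opens a file with O_SYNC and, where the platform and filesystem allow it, O_DIRECT, then writes page-aligned blocks. It is not IOZone's implementation; the `BLOCK` size, the function names, and the fallback to O_SYNC alone are assumptions made for illustration.

```python
import errno
import mmap
import os
import tempfile

BLOCK = 4096  # assumed alignment; a real device may require a different size


def open_unbuffered(path: str) -> int:
    """Open path for writing with O_SYNC and, where supported, O_DIRECT."""
    flags = os.O_WRONLY | os.O_CREAT | os.O_SYNC
    direct = getattr(os, "O_DIRECT", 0)  # not defined on every platform
    try:
        return os.open(path, flags | direct, 0o600)
    except OSError as exc:
        if direct and exc.errno == errno.EINVAL:
            # Some filesystems reject O_DIRECT; fall back to O_SYNC only.
            return os.open(path, flags, 0o600)
        raise


def write_blocks(path: str, count: int) -> int:
    """Write count aligned blocks and return the number of bytes written."""
    buf = mmap.mmap(-1, BLOCK)  # anonymous mappings are page-aligned
    buf.write(b"x" * BLOCK)
    fd = open_unbuffered(path)
    written = 0
    try:
        for _ in range(count):
            written += os.write(fd, buf)  # each call returns after the data is stable
    finally:
        os.close(fd)
        buf.close()
    return written


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as directory:
        print(write_blocks(os.path.join(directory, "probe.bin"), 8))
```

O_DIRECT requires the buffer address, file offset, and transfer size to be suitably aligned, which is why the sketch writes from an anonymous memory mapping rather than from an ordinary bytes object.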
LevelDB benchmark [11]: LevelDB is a key-value store library developed by Google. The LevelDB benchmark is distributed as part of LevelDB to evaluate its performance, and it issues I/O requests through the LevelDB interface. Similar to IOZone, the benchmark generates synthetic I/O patterns, such as random and sequential reads and writes. We use this benchmark and LevelDB to evaluate our approach in Chapter 4. We build our own key-value store, which shares the LevelDB interface, on top of our system and compare its performance against LevelDB.
Yahoo! Cloud Serving Benchmark (YCSB) [46]: YCSB was developed by Yahoo to evaluate key-value stores and cloud serving stores. It mostly generates I/Os based on Zipfian distributions [41], in which accesses concentrate on a small set of popular data items and an item's access frequency falls off as a power of its popularity rank. There are six basic workload patterns: update heavy, read mostly, read only, read latest, short ranges, and read-modify-write. We use the update-heavy workload (YCSB workload-a), which issues 50% reads and 50% writes under a Zipfian distribution. Similar to the LevelDB benchmark, we run it against key-value stores built on top of our systems in Chapters 4 and 5 to evaluate their performance.
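A minimal sketch of such a workload generator is shown below. It draws item ranks with probability proportional to 1/(rank+1)^theta and mixes reads and updates evenly. The class name, the seed handling, and the default exponent of 0.99 are assumptions for illustration; this is not YCSB's own generator, which is written in Java and uses a more efficient sampling method.

```python
import bisect
import itertools
import random
from collections import Counter
from typing import Tuple


class ZipfianWorkload:
    """Generate (operation, item rank) pairs with Zipfian item popularity."""

    def __init__(self, num_items: int, theta: float = 0.99,
                 read_fraction: float = 0.5, seed: int = 0) -> None:
        # Rank 0 is the most popular item; weights decay as a power of rank.
        weights = [1.0 / (rank + 1) ** theta for rank in range(num_items)]
        self._cumulative = list(itertools.accumulate(weights))
        self._read_fraction = read_fraction
        self._rng = random.Random(seed)

    def next_rank(self) -> int:
        """Sample one item rank by inverting the cumulative weights."""
        point = self._rng.random() * self._cumulative[-1]
        index = bisect.bisect_right(self._cumulative, point)
        return min(index, len(self._cumulative) - 1)

    def next_op(self) -> Tuple[str, int]:
        """Return the next operation: a read or an update on a sampled item."""
        kind = "read" if self._rng.random() < self._read_fraction else "update"
        return kind, self.next_rank()


if __name__ == "__main__":
    workload = ZipfianWorkload(num_items=1000, seed=42)
    ranks = Counter(workload.next_rank() for _ in range(10000))
    print(ranks.most_common(5))
```

With these settings a handful of low-rank items receive a large share of the requests, which is the skew the update-heavy workload relies on.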
FUSE Framework
The Filesystem in Userspace (FUSE) framework [6] lets developers write user-space filesystems. It consists of a FUSE kernel module and the libfuse user-space library, which connects the user-space filesystem implementation to the kernel module. In-kernel filesystems are typically difficult to develop and test because of the inherent difficulty of kernel programming; FUSE reduces this burden. Although running a filesystem in user space incurs overhead from frequent context switches between kernel space and user space, FUSE is used for many purposes, such as prototyping and education. In Chapter 4, we build a transactional filesystem using FUSE to evaluate our system.
Baseline Systems
To support our claims and compare our approaches, we use baseline configurations or baseline systems created by others. The following systems and configurations are the publicly available baselines we used in our investigation.
Software RAID [13]: Linux software RAID is a device mapper module that configures multiple block devices into a RAID drive. It supports RAID 0, 1, 4, 5, and 6 [97]. We use a RAID-0 configuration to parallelize accesses across multiple block devices and run logging on top of it. We use this configuration to evaluate our system in Chapter 3.
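The way RAID-0 spreads a logical address space over several devices can be sketched as the following address mapping. It assumes equally sized devices and a fixed chunk size; the function name and parameters are illustrative and do not correspond to any Linux kernel interface.

```python
from typing import Tuple


def raid0_map(offset: int, num_devices: int, chunk_size: int) -> Tuple[int, int]:
    """Map a logical byte offset to (device index, byte offset on that device).

    Consecutive chunks are placed on consecutive devices in round-robin order,
    so a large sequential request is served by all devices in parallel.
    """
    if offset < 0 or num_devices < 1 or chunk_size < 1:
        raise ValueError("offset must be >= 0; devices and chunk size must be >= 1")
    chunk, within_chunk = divmod(offset, chunk_size)
    stripe, device = divmod(chunk, num_devices)
    return device, stripe * chunk_size + within_chunk


if __name__ == "__main__":
    # Four devices with 64 KiB chunks: the fifth chunk wraps back to device 0.
    for logical in (0, 65536, 4 * 65536 + 10):
        print(logical, raid0_map(logical, num_devices=4, chunk_size=65536))
```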
LevelDB [10]: LevelDB is a key-value store library optimized for range queries. It uses a log-structured merge (LSM) tree [95] to maintain its data. The LSM tree has several levels of logs and keeps the index in each level sorted as key-value pairs are inserted. This structure allows LevelDB to perform writes quickly and to execute range queries efficiently. We use LevelDB and the LevelDB benchmark to compare and evaluate the performance of our system in Chapter 4.
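The idea behind the LSM tree can be illustrated with a toy store that buffers writes in memory and periodically freezes them into sorted runs. This is a simplified sketch, not LevelDB's implementation: the class and its `memtable_limit` parameter are hypothetical, everything stays in memory, and the compaction step that merges runs into levels is omitted.

```python
import bisect
from typing import Dict, Iterator, List, Optional, Tuple


class TinyLSM:
    """Toy LSM-style store: an in-memory table plus immutable sorted runs."""

    def __init__(self, memtable_limit: int = 4) -> None:
        self._memtable: Dict[str, str] = {}
        self._runs: List[List[Tuple[str, str]]] = []  # newest run first
        self._limit = memtable_limit

    def put(self, key: str, value: str) -> None:
        self._memtable[key] = value  # a write only touches the in-memory table
        if len(self._memtable) >= self._limit:
            self._flush()

    def _flush(self) -> None:
        # Sort the buffered pairs once and freeze them as a new run.
        self._runs.insert(0, sorted(self._memtable.items()))
        self._memtable = {}

    def get(self, key: str) -> Optional[str]:
        if key in self._memtable:
            return self._memtable[key]
        for run in self._runs:  # search newest to oldest so fresh data wins
            i = bisect.bisect_left(run, (key,))
            if i < len(run) and run[i][0] == key:
                return run[i][1]
        return None

    def scan(self, start: str, end: str) -> Iterator[Tuple[str, str]]:
        """Yield pairs with start <= key < end in key order."""
        merged: Dict[str, str] = {}
        for run in reversed(self._runs):  # oldest first, newer values overwrite
            lo = bisect.bisect_left(run, (start,))
            hi = bisect.bisect_left(run, (end,))
            merged.update(run[lo:hi])  # each run is sorted, so this is a slice
        merged.update((k, v) for k, v in self._memtable.items() if start <= k < end)
        return iter(sorted(merged.items()))


if __name__ == "__main__":
    db = TinyLSM(memtable_limit=2)
    for key, value in [("b", "1"), ("a", "2"), ("c", "3"), ("a", "4")]:
        db.put(key, value)
    print(db.get("a"), list(db.scan("a", "c")))
```

Because every run is sorted, a range query reduces to two binary searches and a contiguous slice per run, which is the property that makes range queries efficient in this kind of structure.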
Ext2/Ext3 Filesystem [64]: Ext2 and ext3 are widely used filesystems in Linux. Ext2 is simple and fast but lacks fault tolerance, so ext3 was developed to overcome this problem by logging the changes to be made before applying them (i.e., journaling). We run ext2 and ext3 filesystems on top of our systems to evaluate the performance and overhead.
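The journaling idea can be sketched as a toy block store that records each intended change in a log, marks the transaction as committed, and only then applies it in place. The class and method names below are hypothetical, and the sketch does not reflect ext3's on-disk format or its journaling modes; it only shows why an update that was interrupted before its commit record leaves no trace after recovery.

```python
from typing import Dict, List, Tuple


class JournaledStore:
    """Toy block store that logs updates before applying them in place."""

    def __init__(self) -> None:
        self.blocks: Dict[int, bytes] = {}  # stands in for the in-place data
        # Journal records: (kind, transaction id, block number, data).
        self.journal: List[Tuple[str, int, int, bytes]] = []

    def log(self, tx: int, block: int, data: bytes) -> None:
        """Record an intended block update without touching the block."""
        self.journal.append(("data", tx, block, data))

    def commit(self, tx: int) -> None:
        """Mark every update logged under tx as safe to apply."""
        self.journal.append(("commit", tx, -1, b""))

    def replay(self) -> None:
        """Apply committed updates in log order, then empty the journal.

        Run at checkpoint time or after a crash. Updates whose transaction
        has no commit record are discarded, so the blocks never hold a
        partially applied transaction.
        """
        committed = {tx for kind, tx, _, _ in self.journal if kind == "commit"}
        for kind, tx, block, data in self.journal:
            if kind == "data" and tx in committed:
                self.blocks[block] = data
        self.journal.clear()


if __name__ == "__main__":
    store = JournaledStore()
    store.log(1, 10, b"new inode")
    store.log(1, 11, b"new bitmap")
    store.commit(1)
    store.log(2, 12, b"half-written")  # simulated crash before commit
    store.replay()
    print(sorted(store.blocks))
```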