Part II Scalable and Consistent Distributed Storage – CATS
9.5 Overhead of Consistent Quorums
until the server count doubled to six servers. After 20 minutes, we started to remove one server every 10 minutes until we were down to three servers again. We measured the average operation latency in one minute intervals throughout the experiment. The results, presented in Figure 9.5, show that CATS is able to reconfigure promptly, e.g., within a span of roughly one to two minutes. The duration of the reconfiguration depends mostly on the amount of data transferred and the bandwidth capacity of the network.
The average operation latency during reconfiguration does not exceed more than roughly double the average latency in steady state, i.e., before the reconfiguration is triggered. For example, with six servers in steady state, the system offers an average operation latency of approximately 2.5 ms, while during reconfiguration that latency grows to circa 5 ms.
Because CATS is linearly scalable, the latency approximately halves when the number of servers doubles from three to six: while during the first 10 minutes of the experiment, three servers offer an average latency of 5 ms, between minutes 30 and 50, six servers deliver a latency of indeed only 2.5 ms. As expected, an increase in latency occurs once nodes are removed after 50 minutes. As the CATS servers were running under load for more than one hour, the JVM had been constantly optimizing the hot code paths in the system. This explains the asymmetric latencies whereby instead of a completely mirrored graph, in the second half of the experiment we observe slightly better performance for the same system configuration.
9.5
Overhead of Consistent Quorums
Next, we evaluate the performance overhead of atomic consistency com- pared to eventual consistency. For a fair comparison, we implemented eventual consistency in CATS, enabled through a configuration parameter. Here, read and update operations are always performed in one phase, and read-impose is never performed. When a node n performs a read operation, it sends read requests to all replicas. Each replica replies with a timestamp and a value. After n receives replies from a majority of replicas, it returns the value with the highest timestamp as the result of the read operation. Similarly, when node n performs an update operation, it sends write re- quests to all replicas, using the current wall clock time as a timestamp.
16 64 256 1024 4096 2.5
3 3.5 4
Value size [bytes] (log)
Throughput [1000 ops/sec]
Eventual consistency CATS without CQs Atomic consistency
Figure 9.6. Overhead of atomic consistency and consistent quorums versus an implementation of eventual consistency, under a read-intensive workload.
Upon receiving a write request, a replica stores the value and timestamp only if the received timestamp is higher than the replica’s local timestamp for that particular data item. The replica then sends an acknowledgment to the writer m. Node m considers the write operation complete upon receiving acknowledgments from a majority of the replicas.
We also measured the overhead of consistent quorums. For these measurements, we modified CATS such that nodes did not send replication group views in read and write messages. Removing the replication group view from messages reduces their size, and thus requires less bandwidth. For these experiments, we varied the size of the stored data values, and we measured the throughput of a system with three servers. The measurements, averaged over five runs, are shown in Figures 9.6 and 9.7. The results show that as the value size increases, the throughput falls, meaning that the network becomes a bottleneck for larger value sizes. The same trend is observable in both workloads. As the value size increases, the cost of using consistent quorums becomes negligible. For instance, under both workloads, the throughput loss when using consistent quorums is less
9.5. OVERHEAD OF CONSISTENT QUORUMS 147 16 64 256 1024 4096 1.5 2 2.5 3 3.5 4
Value size [bytes] (log)
Throughput [1000 ops/sec]
Eventual consistency CATS without CQs Atomic consistency
Figure 9.7. Overhead of atomic consistency and consistent quorums vs. an eventual consistency implementation, under an update-intensive workload.
than 5% for 256 B values, 4% for 1 KB values, and 1% for 4 KB values. Figures 9.6 and 9.7 also show the cost of achieving atomic consistency by comparing the throughput of regular CATS with the throughput of our implementation of eventual consistency. The results show that the overhead of atomic consistency is negligible for a read-intensive workload and as high as 25% for an update-intensive workload. The reason for this difference between the two workloads is that for a read-intensive workload, read operations rarely need to perform the read-impose phase, since the number of concurrent writes to the same key is very low due to the large number of keys in the workload. For an update-intensive workload, due to many concurrent writes to the same key, read operations often require to impose the read value. Therefore, in comparison to an update-intensive workload, the overhead of achieving linearizability is very low – less than 5% loss in throughput for all value sizes – for a read-intensive workload. We believe that this is an important result. Applications that are read-intensive can opt for atomic consistency without a significant loss in performance, while avoiding the complexities of using eventual consistency.
16 64 256 1024 4096 2 4 6 8 10 12
Value size [bytes] (log)
Latency [ms]
Reads (Eventual consistency) Updates (Eventual consistency) Reads (Cassandra)
Updates (Cassandra)
Figure 9.8. Latency comparison between Cassandra and an implementation of eventual consistency in CATS, under a read-intensive workload.