Designing and implementing an interface that allows for extensibility while maintaining an easy-to-use set of functions was a particularly challenging task. Different key-value stores offer different functionality, and finding an interface that fits every system was not trivial. On one hand, programmers will get the best results from database drivers that implement the full FMKe interface, but this requires an additional implementation effort that we did not want to impose. Alternatively, a generic driver could be provided with a simple read/write interface, but this would negatively impact the performance of systems with more advanced APIs (e.g. AntidoteDB allows for the reading and writing of multiple keys within a single API call).
As we considered extending the support to a larger number of storage systems, we quickly realised that we would not be able to provide a single universal interface. Instead, we opted to provide both types of driver interfaces, which we detail in the following subsections.
4.2.1 Generic Driver
This driver interface is supposed to offer programmers an easy way to test their desired system if it is not yet supported, through the implementation of an Erlang module with a simple interface containing a small set of functions:
• start_driver
• stop_driver
• begin_transaction
• commit_transaction
• read
• write
There is an FMKe module that converts top-level application requests into a transaction that, depending on the operation, may include multiple read and write calls. If a system does not have transactional support, an empty response can simply be returned, and FMKe disregards the transactional context. This was the simplest interface we could design that supports all types of systems. Several drivers of this kind have already been implemented using this strategy, containing approximately 100 lines of code on average.
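To make the decomposition concrete, the following is a minimal sketch of the six-function interface and of an adapter that turns one application-level request into read and write calls. It is written in Python purely for illustration (the actual drivers are Erlang modules), and the `InMemoryDriver` class, the `add_item_to_owner` operation, the key names, and the argument lists are all hypothetical; only the six function names come from the list above.

```python
from abc import ABC, abstractmethod


class GenericDriver(ABC):
    """The six functions of the generic interface (signatures are assumed)."""

    @abstractmethod
    def start_driver(self): ...

    @abstractmethod
    def stop_driver(self): ...

    @abstractmethod
    def begin_transaction(self): ...

    @abstractmethod
    def commit_transaction(self, txn): ...

    @abstractmethod
    def read(self, key, txn): ...

    @abstractmethod
    def write(self, key, value, txn): ...


class InMemoryDriver(GenericDriver):
    """Hypothetical store without transactional support, backed by a dict."""

    def start_driver(self):
        self.data = {}

    def stop_driver(self):
        self.data = None

    def begin_transaction(self):
        return None  # no transactions: hand back an empty context

    def commit_transaction(self, txn):
        return None  # nothing to commit

    def read(self, key, txn):
        return self.data.get(key)

    def write(self, key, value, txn):
        self.data[key] = value


def add_item_to_owner(driver, owner_key, item_key, item):
    """Hypothetical application-level request, decomposed into reads and writes."""
    txn = driver.begin_transaction()
    owner = driver.read(owner_key, txn) or {"items": []}
    updated = {**owner, "items": owner["items"] + [item_key]}
    driver.write(item_key, item, txn)
    driver.write(owner_key, updated, txn)
    driver.commit_transaction(txn)
```

In this sketch a single request already costs one read and two writes, each a separate call into the driver, which is the kind of call chain the generic approach produces.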
We argue that this interface allows programmers to obtain performance measurements from their desired storage system very quickly, but we also acknowledge some limitations to this approach. It is not known whether the performance results obtained using a generic driver can be directly compared to the results of a storage system using a driver that implements the entire FMKe API. This is because the decomposition of application-level operations into read and write function calls yields a significantly larger and more complex chain of function calls. In the future we will implement both drivers that cover the entire FMKe API and generic drivers for a small set of storage systems, and then measure whether the performance overhead of the generic approach is significant.
4.2.2 Optimized Driver
As an alternative to generic drivers for database systems with more intricate client APIs, we also allow programmers to implement a driver that fully covers the FMKe API. This allows a more fine-grained and flexible approach to each operation, where even the data model can be changed. This type of driver can be arbitrarily complex, depending on the data model and the algorithms used to fulfill each operation. Since they are highly tailored to the storage system they target, we refer to these drivers as optimized drivers.
Since each operation described in Section 3.5 can be implemented with direct calls to the storage system's API, we expect this approach to have a performance advantage over the generic approach. Despite this potential advantage, these drivers are significantly more complex to implement, not only because every operation needs to be implemented, but also because programmers need to ensure that the code they write is not susceptible to the types of anomalies associated with the chosen data model. On average, these drivers are 5 times longer than generic drivers (approximately 500 lines of code).
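As a rough illustration of why direct use of a richer client API may help, the sketch below contrasts a generic-style and an optimized-style implementation of the same request against a store whose API accepts several keys per call (as noted earlier for AntidoteDB). It is a Python sketch under stated assumptions, not any real driver: `CountingStore`, `read_many`, `write_many`, and both operation functions are hypothetical names, and the call counter only stands in for round trips to the store.

```python
class CountingStore:
    """Hypothetical client whose API accepts several keys per call."""

    def __init__(self):
        self.data = {}
        self.calls = 0  # stands in for round trips to the store

    def read_many(self, keys):
        self.calls += 1
        return [self.data.get(k) for k in keys]

    def write_many(self, items):
        self.calls += 1
        self.data.update(items)


def generic_add_item(store, owner_key, item_key, item):
    """Generic style: every read or write is its own call to the store."""
    (owner,) = store.read_many([owner_key])
    owner = owner or {"items": []}
    store.write_many({item_key: item})
    store.write_many({owner_key: {"items": owner["items"] + [item_key]}})


def optimized_add_item(store, owner_key, item_key, item):
    """Optimized style: the operation batches both writes into one call."""
    (owner,) = store.read_many([owner_key])
    owner = owner or {"items": []}
    store.write_many({
        item_key: item,
        owner_key: {"items": owner["items"] + [item_key]},
    })
```

Both functions leave the store in the same state, but in this sketch the generic version issues three calls and the optimized version two. This only shows one plausible source of difference and says nothing about how large the gap is on a real system.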
Ideally, we would perform our comparative performance study with only optimized drivers, since they would yield the most reliable performance numbers. However, our main focus is getting FMKe accepted as a new standard for the performance evaluation of distributed key-value stores, and thus we need to provide programmers with an easier alternative that does not involve implementing the entire API.
4.2.3 Driver Interface Comparison
We implemented both the generic and the optimized driver for Riak in order to determine the performance trade-off that users incur when they implement generic drivers.
[Plot: Average Latency (ms) and Throughput (ops/s) for the RIAK-SIMPLE and RIAK-OPT drivers]
Figure 4.4: Performance comparison of simple and optimized drivers for the same database
Figure 4.4 presents our results, which show a clear performance benefit in implementing an optimized driver. We suspect that the overhead of the generic driver occurs because all calls going to a generic module need to go through an adapter that converts application-level operations into read and write operations to submit to the generic driver. While we are not certain that the overhead present in the Riak generic driver would be similar for other databases, we decided to implement all further FMKe drivers as optimized drivers.