6.1 Critical evaluation
Database systems are a key element in a large number of different applications. Thus, it is important for database systems to be reliable and scalable. To achieve these properties, replication is an important feature.
In this work, we have implemented a middleware system for database replication that follows a synchronous approach to avoid replica divergence. From the client's point of view, the system provides single-copy consistency, since the client executes as if only one copy of the database existed. Internally this is not the case: when the result of a commit is returned to a client, some replicas might not have been updated yet.
Clients use a custom JDBC driver, which we have implemented, to contact the middleware, allowing unmodified applications to use our system. The middleware is composed of a primary replica and several secondary replicas. Replicas use the database's JDBC driver to communicate with the local database system (PostgreSQL in our implementation).
The database is fully replicated in all the replicas. Each replica executes under snapshot isolation, the isolation level that we find best suited to our system. We think it increases the efficiency of read-only transactions. Besides, it avoids the same phenomena as the serializable isolation level.
The system is based on a primary-copy architecture in which the primary copy executes only update transactions, while the secondary replicas execute all the read-only transactions. The system architecture is similar to that of the Ganymed [5] system, with some modifications. First, we avoid the need to declare read-only transactions as read-only beforehand, while still executing them in a secondary replica. Second, we implement a speculative mechanism on the client to try to improve the system's performance.
The use of a primary copy is usually associated with two disadvantages: the primary copy is a bottleneck and a single point of failure. The bottleneck problem is negligible when the fraction of read-only transactions is substantially higher than that of read-write transactions. To prevent the primary copy from being a single point of failure, we have implemented a failure detection strategy with an election algorithm that elects an up-to-date secondary replica as the new primary replica. The system also handles failures of the secondary replicas. When a failed secondary replica returns online, it uses another replica (preferably a secondary replica) to obtain the updates it missed before becoming an active replica that can be used by the clients.
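The election step can be illustrated with a minimal Java sketch. The text does not specify how candidates are compared, so the `ReplicaInfo` class, its `appliedVersion` field, and both selection rules below are assumptions made for illustration and not the implemented algorithm. Failure detection is taken as given through the `active` flag.

```java
import java.util.Comparator;
import java.util.List;
import java.util.Optional;

// Hypothetical view of a secondary replica as seen by the election code.
final class ReplicaInfo {
    final String id;
    final boolean active;       // outcome of failure detection (assumed given)
    final long appliedVersion;  // last database version applied locally (assumed)

    ReplicaInfo(String id, boolean active, long appliedVersion) {
        this.id = id;
        this.active = active;
        this.appliedVersion = appliedVersion;
    }
}

final class PrimaryElection {
    // Simplest rule: take the first replica that is still active.
    static Optional<ReplicaInfo> firstActive(List<ReplicaInfo> secondaries) {
        return secondaries.stream().filter(r -> r.active).findFirst();
    }

    // Alternative rule: among active replicas, take the highest applied version.
    static Optional<ReplicaInfo> mostUpToDate(List<ReplicaInfo> secondaries) {
        return secondaries.stream()
                .filter(r -> r.active)
                .max(Comparator.comparingLong((ReplicaInfo r) -> r.appliedVersion));
    }
}
```

Both methods return an empty `Optional` when no secondary replica is active, in which case no new primary can be elected.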
Another contribution of this work relates to the extraction of write sets. We tested two alternatives and presented the results to justify our choice. We have also implemented a script that reads the database tables and automatically creates the triggers needed for the extraction.
The system also uses speculative execution for some of the client's remote calls to improve efficiency. Speculation could, in theory, also be used on the servers, but implementing it would have required additional time.
We intended the middleware to be efficient, but the tests were not conclusive enough to show that this was achieved. The system was benchmarked with a workload containing many more read-write transactions than read-only ones. Such a workload does not favor the system, which was designed for workloads dominated by read-only transactions. A more exhaustive evaluation is needed, using a workload with more read-only than read-write transactions. Testing the impact of speculation may also lead to additional conclusions; this test could use a different benchmark that includes processing other than database operations.
Nevertheless, the results obtained show one benefit of using replication: the percentage of committed transactions increases because the load on the primary replica decreases. The other benefits of replication also remain. If a server fails, other servers are ready to process the clients' requests. A secondary replica can assume the role of a failed primary replica, and the system stays online. Replication thus improves the availability of the system.
6.2 Future work
The implemented system has some limitations that can be addressed in the future, but it must first go through more testing.
The benefits of speculation must be evaluated using a benchmark that includes processing other than database operations. The system should also be evaluated with other benchmarks, such as TPC-W, to better assess its capabilities and efficiency.
The system can also be optimized in several respects. We present some ideas for future implementation.
As previously mentioned, we could tune the number of rows brought from the server when creating a ResultSet, instead of always bringing the entire table.
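One way to picture this idea is a client-side cursor that fetches rows in pages. The sketch below is only an illustration: `PagedRows`, the `fetchPage` callback (standing in for the remote call to the middleware), and the page-size parameter are all hypothetical names, not part of the implemented driver. Standard JDBC exposes a related hint through `Statement.setFetchSize`, which a custom driver could choose to honor.

```java
import java.util.Collections;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.function.BiFunction;

// Hypothetical client-side cursor that pulls rows from the server page by page.
final class PagedRows<T> implements Iterator<T> {
    // (offset, limit) -> rows; stands in for the remote call to the middleware.
    private final BiFunction<Integer, Integer, List<T>> fetchPage;
    private final int pageSize;
    private List<T> page = Collections.emptyList();
    private int posInPage = 0;
    private int offset = 0;
    private boolean exhausted = false;
    private int remoteCalls = 0;

    PagedRows(BiFunction<Integer, Integer, List<T>> fetchPage, int pageSize) {
        if (pageSize <= 0) throw new IllegalArgumentException("pageSize must be positive");
        this.fetchPage = fetchPage;
        this.pageSize = pageSize;
    }

    @Override
    public boolean hasNext() {
        if (posInPage < page.size()) return true;   // rows left in the current page
        if (exhausted) return false;                // server has no more rows
        page = fetchPage.apply(offset, pageSize);   // one remote round trip
        remoteCalls++;
        offset += page.size();
        posInPage = 0;
        exhausted = page.size() < pageSize;         // a short page is the last one
        return !page.isEmpty();
    }

    @Override
    public T next() {
        if (!hasNext()) throw new NoSuchElementException();
        return page.get(posInPage++);
    }

    // Number of round trips made so far, useful when tuning the page size.
    int remoteCalls() { return remoteCalls; }
}
```

A small page size keeps memory use and first-row latency low at the cost of more round trips, so the right value would have to be found experimentally.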
Replica failure recovery could be improved. The election algorithm, executed when the primary fails, chooses the first active replica as the new primary, but it could choose the replica with the most recent updates. Also, when a secondary replica fails and returns online, it must be updated to the latest version of the database. The log approach could be more efficient in terms of memory, since all logs are kept in main memory and no garbage collection is performed. We could implement a garbage collector that removes a log entry from all replicas once a defined time limit has elapsed, and only if every active replica holds a database version equal to or higher than that of the entry. A failed replica that returned online would then update itself from the logs if the updates it missed were still logged, or otherwise request the whole database from a secondary replica.
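The proposed log garbage collection and the recovery decision can be sketched as follows. This is a sketch under stated assumptions, not the existing implementation: `UpdateLog` and its methods are hypothetical, write sets are reduced to strings, database versions are assumed to be consecutive integers, and log timestamps are assumed to grow with the version.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.TreeMap;

// Hypothetical in-memory update log keyed by database version.
final class UpdateLog {
    private static final class LogEntry {
        final String writeSet;
        final long loggedAtMillis;
        LogEntry(String writeSet, long loggedAtMillis) {
            this.writeSet = writeSet;
            this.loggedAtMillis = loggedAtMillis;
        }
    }

    private final TreeMap<Long, LogEntry> entries = new TreeMap<>();
    private final long retentionMillis;
    private long lastVersion = 0;

    UpdateLog(long retentionMillis) { this.retentionMillis = retentionMillis; }

    synchronized void append(long version, String writeSet, long nowMillis) {
        entries.put(version, new LogEntry(writeSet, nowMillis));
        lastVersion = Math.max(lastVersion, version);
    }

    // Removes the oldest entries that are past the time limit AND already
    // applied by every active replica. Returns the number of entries removed.
    synchronized int collect(Collection<Long> activeReplicaVersions, long nowMillis) {
        if (activeReplicaVersions.isEmpty()) return 0;
        long minApplied = Collections.min(activeReplicaVersions);
        int removed = 0;
        Iterator<Map.Entry<Long, LogEntry>> it = entries.entrySet().iterator();
        while (it.hasNext()) {
            Map.Entry<Long, LogEntry> e = it.next();
            boolean appliedEverywhere = e.getKey() <= minApplied;
            boolean expired = nowMillis - e.getValue().loggedAtMillis >= retentionMillis;
            if (!appliedEverywhere || !expired) break; // stop early: keep the log contiguous
            it.remove();
            removed++;
        }
        return removed;
    }

    // Write sets a returning replica missed, in version order. An empty
    // Optional means the log no longer covers the gap, so the replica would
    // have to request the whole database instead.
    synchronized Optional<List<String>> missedSince(long replicaVersion) {
        if (replicaVersion >= lastVersion) return Optional.of(new ArrayList<>());
        if (!entries.containsKey(replicaVersion + 1)) return Optional.empty();
        List<String> missed = new ArrayList<>();
        for (LogEntry e : entries.tailMap(replicaVersion, false).values()) {
            missed.add(e.writeSet);
        }
        return Optional.of(missed);
    }
}
```

Because only active replicas constrain the collector, a replica that stays offline longer than the retention limit falls back to a full copy, which matches the two recovery paths described above.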
An interesting improvement relates to speculation. When a client uses speculation, it issues an operation to the server and continues executing, so it can immediately issue another operation. On the server side, the first operation starts to execute, and the next operation must wait its turn. If a third operation is issued, it must also wait, and so on. The server therefore sends the operations to the database one by one. Instead, we could group the operations that arrive at the server while a prior operation is executing and send them to the database as a single batch, rather than sending them one at a time.
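The batching idea can be sketched with a queue that the server drains whenever the database becomes free. The class `OperationBatcher`, its method names, and the `maxBatchSize` cap are hypothetical, and operations are reduced to SQL strings for illustration. For update statements, the resulting list could plausibly be sent with JDBC's `Statement.addBatch` and `executeBatch`; how queries that return result sets would be batched is left open here.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical server-side buffer for operations waiting for their turn.
final class OperationBatcher {
    private final BlockingQueue<String> pending = new LinkedBlockingQueue<>();
    private final int maxBatchSize;

    OperationBatcher(int maxBatchSize) {
        if (maxBatchSize <= 0) throw new IllegalArgumentException("maxBatchSize must be positive");
        this.maxBatchSize = maxBatchSize;
    }

    // Called when a speculative operation arrives from the client.
    void submit(String sql) {
        pending.add(sql);
    }

    // Called when the previous execution finishes: returns, in arrival order,
    // every operation that queued up in the meantime (up to maxBatchSize).
    // The list is empty when nothing is waiting.
    List<String> drainBatch() {
        List<String> batch = new ArrayList<>();
        pending.drainTo(batch, maxBatchSize);
        return batch;
    }
}
```

Draining in arrival order preserves the order in which the client issued its operations, which matters because later speculative operations may depend on earlier ones.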
Regarding speculation, other improvements may be studied, such as the use of speculation on the servers to communicate with the client or between servers. Another possible improvement is to apply speculation to commit operations on the client. This presents a problem, however: if we speculate on the commit of a transaction, the transaction can no longer abort, since its results can no longer be undone. We would need to use a system like Speculator [1], or one that offers the same functionality.
Finally, we could investigate different ways of distributing the read-only workload among the secondary replicas, instead of just choosing a random replica.
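Two simple alternatives to random selection are round-robin and least-loaded selection. The sketch below illustrates both; the class names are hypothetical, replicas are identified by plain strings, and the least-loaded variant assumes the middleware can count the read-only transactions currently open on each secondary replica.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.atomic.AtomicInteger;

// Cycles through the secondary replicas in a fixed order.
final class RoundRobinSelector {
    private final List<String> replicas;
    private final AtomicInteger next = new AtomicInteger();

    RoundRobinSelector(List<String> replicas) {
        if (replicas.isEmpty()) throw new IllegalArgumentException("no replicas");
        this.replicas = new ArrayList<>(replicas);
    }

    String select() {
        // floorMod keeps the index valid even after the counter overflows.
        return replicas.get(Math.floorMod(next.getAndIncrement(), replicas.size()));
    }
}

// Picks the replica with the fewest open read-only transactions.
final class LeastLoadedSelector {
    private final Map<String, Integer> open = new LinkedHashMap<>();

    LeastLoadedSelector(List<String> replicas) {
        if (replicas.isEmpty()) throw new IllegalArgumentException("no replicas");
        for (String r : replicas) open.put(r, 0);
    }

    // Chooses a replica and records one more open transaction on it.
    // Ties go to the replica listed first.
    synchronized String acquire() {
        String best = null;
        for (Map.Entry<String, Integer> e : open.entrySet()) {
            if (best == null || e.getValue() < open.get(best)) best = e.getKey();
        }
        open.merge(best, 1, Integer::sum);
        return best;
    }

    // Called when the transaction on the given replica commits or aborts.
    synchronized void release(String replica) {
        open.merge(replica, -1, Integer::sum);
    }
}
```

Round-robin needs no feedback from the replicas, whereas the least-loaded rule adapts to long-running read-only transactions at the cost of tracking when each transaction ends.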