We now describe how the decentralized model for basic snapshot isolation is extended to support serializable transaction execution using the cycle prevention and cycle detection approaches discussed in Section 2.1.
3.3.1 Implementation of the Cycle Prevention Approach
The cycle prevention approach aborts a transaction when an anti-dependency between two concurrent transactions is observed. This prevents a transaction from becoming a pivot. One way of doing this is to record, for each item version, the tids of the transactions that read that version and to track the read-write dependencies. However, this can be expensive, as we need to maintain a list of tids per item and detect anti-dependencies for all such transactions. To avoid this, we detect read-write conflicts using a locking approach. During the validation phase, a transaction acquires a read lock for each item in its read-set. Read locks on an item are acquired in shared mode. A transaction acquires (releases) a read lock by incrementing (decrementing) the value in a column named ‘rlock’ in the StorageTable.
The commit protocol for the cycle prevention approach is presented in Algorithm 3. An anti-dependency between two concurrent transactions can be detected either by the writer transaction or by the reader transaction. We first describe how a writer transaction detects a read-write conflict with a concurrent reader transaction. During the validation phase, when a writer transaction attempts to acquire a write lock on an item in its write-set, it checks whether the item is read locked. The transaction is aborted if the item is already read locked. Note that anti-dependencies only need to be detected among concurrent transactions. This raises an issue: a concurrent writer may miss a read-write conflict if it attempts to acquire a write lock after the conflicting reader transaction has committed and released its read lock. To avoid this problem, a reader transaction records its commit timestamp in a column named ‘read-ts’ in the StorageTable while releasing a read lock acquired on an item. A writer checks whether the timestamp value in the ‘read-ts’ column is greater than its snapshot timestamp, which indicates that the writer is concurrent with a committed reader transaction; if so, the writer aborts. A reader transaction, in turn, detects read-write conflicts by checking each item in its read-set for a write lock or a newer committed version. It aborts if either is present; otherwise, it acquires a read lock on the item.
Algorithm 3 Commit protocol for cycle prevention approach
Validation phase:
  for all item ∈ write-set of Ti do
    [ begin row-level transaction:
      read the ‘committed version’, ‘wlock’, ‘rlock’, and ‘read-ts’ columns for item
      if any committed newer version is present, then abort
      else if item is already locked in read or write mode, then abort
      else if ‘read-ts’ value is greater than $TS^i_s$, then abort
      else acquire write lock on item
    :end row-level transaction ]
  for all item ∈ read-set of Ti do
    [ begin row-level transaction:
      read the ‘committed version’ and ‘wlock’ columns for item
      if any committed newer version is created, then abort
      else if item is already locked in write mode, then abort
      else acquire read lock by incrementing ‘rlock’ column for item
    :end row-level transaction ]
execute commit-incomplete phase shown in Algorithm 2
for all item ∈ read-set of Ti do
  [ begin row-level transaction:
    release read lock on item by decrementing ‘rlock’ column
    if read-ts < $TS^i_c$ then
      read-ts ← $TS^i_c$
  :end row-level transaction ]
status ← commit-complete
notify completion and provide $TS^i_c$ to TimestampService to advance STS
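The validation checks of Algorithm 3 can be sketched in Python over an in-memory table. This is only an illustration under stated assumptions: the `Row` class, the `Abort` exception, and the `validate` function are hypothetical names, each loop iteration stands in for one row-level transaction, the read-set and write-set are assumed to be disjoint, and the release of locks on abort is omitted.

```python
from dataclasses import dataclass
from typing import Optional


class Abort(Exception):
    """Raised when validation fails and the transaction must abort."""


@dataclass
class Row:
    committed_ts: int = 0        # commit timestamp of the newest committed version
    wlock: Optional[str] = None  # tid holding the write lock, if any
    rlock: int = 0               # number of shared read locks currently held
    read_ts: int = 0             # largest commit timestamp among committed readers


def validate(tid, snapshot_ts, write_set, read_set, table):
    """Validation phase; each loop body models one row-level transaction."""
    for key in write_set:
        row = table[key]
        if row.committed_ts > snapshot_ts:
            raise Abort(f"{key}: newer committed version")
        if row.wlock is not None or row.rlock > 0:
            raise Abort(f"{key}: already locked in read or write mode")
        if row.read_ts > snapshot_ts:
            raise Abort(f"{key}: read by a concurrent committed transaction")
        row.wlock = tid  # acquire write lock
    for key in read_set:
        row = table[key]
        if row.committed_ts > snapshot_ts:
            raise Abort(f"{key}: newer committed version")
        if row.wlock is not None:
            raise Abort(f"{key}: locked in write mode")
        row.rlock += 1  # acquire shared read lock


table = {"x": Row(), "y": Row()}
validate("T1", 5, write_set=["x"], read_set=["y"], table=table)
```

In this sketch, a second transaction that tries to write `y` while T1 still holds its read lock is aborted by the second check in the write-set loop.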
During the commit-incomplete phase, Ti releases the acquired read locks and records its commit timestamp in the ‘read-ts’ column. Since there can be more than one reader transaction for a particular data item version, it is possible that some transaction has already recorded a value in the ‘read-ts’ column. In this case, Ti updates the currently recorded value only if it is less than $TS^i_c$. The rationale behind the logic for updating the ‘read-ts’ value is as follows. For committed transactions T1, T2, ..., Tn that have read a particular data item version, the ‘read-ts’ column value for that item version contains the commit timestamp of the transaction Tk (k ≤ n) that has the largest commit timestamp in this set. An uncommitted writer transaction Tj that is concurrent with any transaction in the set T1, T2, ..., Tn must also be concurrent with Tk, i.e. $TS^j_s < TS^k_c$, since Tk has the largest commit timestamp. Thus Tj will detect the read-write conflict by observing that the ‘read-ts’ value is larger than its snapshot timestamp.
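The update rule for ‘read-ts’ can be illustrated with a small Python sketch. The dictionary-based row and the function names `release_read_lock` and `writer_detects_conflict` are assumptions made for this example only; the timestamps are arbitrary.

```python
def release_read_lock(row, commit_ts):
    """Release one shared read lock and keep the largest reader commit timestamp."""
    row["rlock"] -= 1
    if row["read-ts"] < commit_ts:
        row["read-ts"] = commit_ts


def writer_detects_conflict(row, snapshot_ts):
    """A writer is concurrent with a committed reader if 'read-ts' exceeds its snapshot."""
    return row["read-ts"] > snapshot_ts


# Three readers of the same item version commit with timestamps 12, 20 and 15.
row = {"rlock": 3, "read-ts": 0}
for commit_ts in (12, 20, 15):
    release_read_lock(row, commit_ts)
```

After the three releases the row holds the largest of the three commit timestamps, so a writer whose snapshot timestamp is 14 (concurrent with the readers that committed at 15 and 20) detects the conflict, whereas a writer with snapshot timestamp 25 does not.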
3.3.2 Implementation of the Cycle Detection Approach
The cycle detection approach requires tracking all dependencies among transactions, i.e. anti-dependencies (both incoming and outgoing) among concurrent transactions, and write-read and write-write dependencies among non-concurrent transactions. We maintain this information in the storage system in the form of a dependency serialization graph (DSG) [20]. Since an active transaction may form dependencies with certain committed transactions, we need to retain information about such transactions in the DSG.
Detecting Dependencies
For detecting dependencies, we record in the StorageTable (in a column named ‘readers’), for each version of an item, a list of the tids of the transactions that have read that item version. Moreover, for each transaction Ti, we maintain its outgoing dependency edges as the list of tids of the transactions to which Ti has an outgoing dependency edge. This information is recorded in the ‘out-edges’ column in the TransactionTable, and it captures the DSG structure. The dependencies are detected and recorded as discussed below.
For detecting dependencies, we include an additional phase called DSGupdate in the transaction protocol, which is performed before the validation phase. In the DSGupdate phase, along with the basic write-write conflict check using the locking technique discussed in Section 3.2, a transaction also detects dependencies with other transactions based on its read and write sets. To find wr dependencies, the transaction finds, for every item version in its read-set, the tid of the transaction that wrote that version. Similarly, to derive outgoing rw anti-dependencies, it finds the tid of the transaction that either wrote the immediately following version of that data item or is holding a lock on that data item, if any. For every item in the write-set of the transaction, it finds the tids of the reader(s) and the writer of the immediately preceding version. This gives the incoming rw edge(s) and the ww edge. The dependency edges are then inserted in the TransactionTable. Since our purpose is to find directed cycles irrespective of the type of edges, we insert a single outgoing edge T1 → T2 if there is at least one edge of type rw, wr, or ww from T1 to T2.
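The edge derivation in the DSGupdate phase can be sketched as follows. This is a simplified in-memory model, not the storage layout itself: the `Version` class, the `dsg_update` function, and its parameters are hypothetical, the ‘readers’ lists are assumed to be already populated, and a set per transaction plays the role of the ‘out-edges’ column so that several dependencies between the same pair collapse into a single edge.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class Version:
    writer: str                                   # tid that wrote this version
    readers: list = field(default_factory=list)   # tids that read this version


def dsg_update(tid, read_set, write_set, versions, lock_holder, out_edges):
    """Record the dependency edges of transaction `tid`.

    read_set:    item -> index of the version that was read
    write_set:   iterable of items to be written
    versions:    item -> list of Version objects in commit order
    lock_holder: item -> tid currently holding the write lock (if any)
    out_edges:   tid -> set of tids (the DSG)
    """
    def add_edge(src, dst):
        if src != dst:  # ignore dependencies of a transaction on itself
            out_edges[src].add(dst)

    for item, idx in read_set.items():
        history = versions[item]
        add_edge(history[idx].writer, tid)           # wr: writer -> reader
        if idx + 1 < len(history):
            add_edge(tid, history[idx + 1].writer)   # rw: reader -> next writer
        elif lock_holder.get(item) is not None:
            add_edge(tid, lock_holder[item])         # rw: reader -> lock holder

    for item in write_set:
        prev = versions[item][-1]                    # immediately preceding version
        for reader in prev.readers:
            add_edge(reader, tid)                    # incoming rw: reader -> writer
        add_edge(prev.writer, tid)                   # ww: previous writer -> writer


versions = {
    "x": [Version("T0", readers=["T1"])],
    "y": [Version("T0"), Version("T2")],
}
out_edges = defaultdict(set)
# T3 read the first version of y and intends to write x.
dsg_update("T3", {"y": 0}, ["x"], versions, {}, out_edges)
```

In this example T3 obtains an outgoing rw edge to T2 (which overwrote the version of `y` that T3 read), while T0 and T1 each obtain an outgoing edge to T3.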
Checking for Dependency Cycles
In the validation phase, the transaction checks for a dependency cycle involving itself by traversing the outgoing edges in the DSG in a depth-first manner, starting from itself. In the search, it considers only the transactions that are in the validation, commit-incomplete, or commit-complete phase. Since transactions in the validation or later phases must have already inserted their dependency edges, there is no possibility that the search misses a dependency edge. The transaction ignores any transaction with a larger $TS_c$ encountered during the search.
If it detects a cycle and all the transactions involved in the cycle are already committed, then it aborts. If it detects a cycle with one or more transactions still in the validation phase, then it waits for their status to resolve. If it does not detect a cycle, or if any transaction involved in the cycle aborts, then it proceeds further. Cycle checking is performed concurrently and in a non-blocking manner by the transactions. Due to the ordering based on $TS_c$, when two concurrent transactions are involved in the same cycle, only one of them aborts.
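The cycle check can be sketched as a depth-first search over the ‘out-edges’ sets. The function names `find_cycle` and `decide`, the status strings, and the dictionaries are assumptions for illustration. The sketch also assumes that every transaction in the validation or a later phase already has a commit timestamp, and it treats the commit-incomplete and commit-complete phases as committed.

```python
CONSIDERED = {"validation", "commit-incomplete", "commit-complete"}


def find_cycle(start, out_edges, status, commit_ts):
    """Return the tids on a cycle through `start`, or None if there is none."""
    visited = {start}
    path = [start]

    def dfs(node):
        for nxt in out_edges.get(node, ()):
            if nxt == start:
                return True                       # closed a cycle back to start
            if nxt in visited or status.get(nxt) not in CONSIDERED:
                continue
            if commit_ts[nxt] > commit_ts[start]:
                continue                          # ignore larger commit timestamps
            visited.add(nxt)
            path.append(nxt)
            if dfs(nxt):
                return True
            path.pop()
        return False

    return list(path) if dfs(start) else None


def decide(start, out_edges, status, commit_ts):
    """Return 'proceed', 'wait' or 'abort' for the validating transaction."""
    cycle = find_cycle(start, out_edges, status, commit_ts)
    if cycle is None:
        return "proceed"
    if any(status[t] == "validation" for t in cycle[1:]):
        return "wait"                             # outcome of the others is unresolved
    return "abort"                                # all others are committed


out_edges = {"T1": {"T2"}, "T2": {"T1"}}
commit_ts = {"T1": 10, "T2": 8}
both_validating = {"T1": "validation", "T2": "validation"}
```

With both transactions in validation, `decide("T2", ...)` ignores T1 because of its larger commit timestamp and proceeds, while `decide("T1", ...)` waits for T2. Once T2 is committed, T1 finds the cycle and aborts, so only one of the two transactions is aborted.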
Pruning of DSG
A challenge in this approach is to keep the dependency graph as small as possible by frequently pruning it to remove those committed transactions that can never lead to a cycle in the future. To prune the dependency graph, we remove unnecessary transactions using the following rule: a committed transaction is removed if it is (1) not reachable from any currently active transaction and (2) not concurrent with any active transaction. Such an unreachable transaction cannot be part of any future cycle for the following reasons. Since the transaction is not concurrent with any active transaction, the only new dependencies that can arise for it in the future are of the outgoing ww and wr types. Since such a transaction does not have any incoming edge, and the only new dependency edges that can be formed are outgoing edges, it cannot become part of any future cycle. Thus, such a transaction can be safely removed from the DSG.
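The pruning rule can be sketched as a reachability pass followed by a timestamp filter. The `prune` function and its arguments are hypothetical. Following the concurrency condition used earlier, the sketch assumes that a committed transaction is concurrent with an active one when its commit timestamp is larger than the active transaction's snapshot timestamp.

```python
def prune(out_edges, commit_ts, active_snapshots):
    """Remove committed transactions that can no longer take part in a cycle.

    out_edges:        tid -> set of tids (the DSG)
    commit_ts:        tid -> commit timestamp, for committed transactions
    active_snapshots: tid -> snapshot timestamp, for active transactions
    """
    # (1) Collect every transaction reachable from some active transaction.
    reachable = set()
    stack = list(active_snapshots)
    while stack:
        node = stack.pop()
        for nxt in out_edges.get(node, ()):
            if nxt not in reachable:
                reachable.add(nxt)
                stack.append(nxt)

    # (2) A committed transaction is concurrent with an active one if it
    # committed after that transaction's snapshot was taken.
    oldest_snapshot = min(active_snapshots.values(), default=float("inf"))
    removable = {
        tid for tid, ts in commit_ts.items()
        if tid not in reachable and ts <= oldest_snapshot
    }

    for tid in removable:
        out_edges.pop(tid, None)
    for edges in out_edges.values():
        edges -= removable   # drop dangling edges to removed transactions
    return removable


out_edges = {"A": {"C1"}, "C1": set(), "C2": {"C1"}, "C3": set()}
commit_ts = {"C1": 3, "C2": 4, "C3": 9}
removed = prune(out_edges, commit_ts, active_snapshots={"A": 5})
```

Here C1 is kept because it is reachable from the active transaction A, C3 is kept because it committed after A's snapshot, and only C2 is removed.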