
3. Architecture of DivRep Middleware

3.1. DivRep – Replication with Diverse Database Servers

3.1.3. Distributed Deadlock Avoidance

To avoid distributed deadlocks, DRA relies on a deadlock prevention scheme based on a specific parameter, referred to as NOWAIT. When the underlying databases guarantee SI, the parameter ensures that an exception is raised as soon as concurrent transactions attempt to modify the same data item. Figure 3-8 illustrates the parameter with an example in which two transactions, T1 and T2, execute against a centralized database. Each transaction requires exclusive locks on two resources, A and B, but the two transactions acquire the locks in a different order. As soon as T2 attempts to acquire the conflicting lock on A, an exception is raised, since T1 already holds that lock, and T2 has to abort. Note that if the NOWAIT parameter were not enabled, the opposite order of lock acquisition would have led to a deadlock. The behaviour of the parameter differs from the first-committer-wins and first-updater-wins rules (Fekete, O'Neil et al. 2004), since no waiting for one of the transactions to end is necessary. The use of NOWAIT might lead to more transaction aborts, and corresponding restarts (Bernstein and Goodman 1981), than a deadlock detection scheme would, but it avoids the construction of potentially complex waits-for graphs, which is the principal overhead of deadlock detection schemes. The NOWAIT parameter is therefore attractive for its simplicity, especially for workloads with a low probability of deadlocks and aborts.

[Figure 3-8: timeline of T1 and T2. T1: Lock A - Granted, Lock B - Granted, Commit. T2: Lock B - Granted, Lock A - Denied (Resource A unavailable), Abort.]

Figure 3-8 Deadlock avoidance using the NOWAIT parameter on a non-replicated database, exemplified with two concurrent transactions, T1 and T2 (the corresponding begin operations are omitted). As soon as transaction T2 requests a conflicting lock for resource A, the database server will raise an exception and T2 will have to abort.
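The scenario of Figure 3-8 can be replayed with a toy lock table. The following Python sketch is only an illustration and not DRA code: the class and function names are hypothetical, and the interleaving of T1 and T2 (T1 locks A, T2 locks B, T2 then requests A) is an assumption consistent with the figure.

```python
class LockUnavailable(Exception):
    """Raised when a requested lock is held by another transaction (NOWAIT)."""


class NoWaitLockManager:
    """Toy exclusive-lock table for a single, non-replicated database."""

    def __init__(self):
        self.owner = {}  # resource name -> transaction currently holding it

    def acquire(self, txn, resource):
        holder = self.owner.get(resource)
        if holder is not None and holder != txn:
            # NOWAIT: fail immediately instead of queueing behind the holder.
            raise LockUnavailable(f"{txn}: resource {resource} held by {holder}")
        self.owner[resource] = txn

    def release_all(self, txn):
        # Called on commit or abort; frees every lock the transaction holds.
        self.owner = {r: t for r, t in self.owner.items() if t != txn}


def run_figure_3_8():
    """Replay the assumed interleaving and return the sequence of events."""
    locks = NoWaitLockManager()
    log = []
    locks.acquire("T1", "A")
    log.append("T1 lock A granted")
    locks.acquire("T2", "B")
    log.append("T2 lock B granted")
    try:
        locks.acquire("T2", "A")      # conflicts with T1
    except LockUnavailable:
        log.append("T2 lock A denied")
        locks.release_all("T2")       # T2 aborts and frees B
        log.append("T2 abort")
    locks.acquire("T1", "B")          # now succeeds without waiting
    log.append("T1 lock B granted")
    locks.release_all("T1")
    log.append("T1 commit")
    return log
```

Without the immediate exception, T2 would wait for A while T1 waited for B, which is the deadlock that the parameter prevents.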

A number of database servers offer the behaviour of the NOWAIT parameter, such as Oracle, PostgreSQL, InterBase and Firebird, although the respective implementations might differ. For example, on Firebird the NOWAIT parameter is specified for a database connection, while on PostgreSQL the behaviour is available through the NOWAIT option of the SELECT … FOR UPDATE operation, which locks the selected rows against concurrent updates.

In order to avoid distributed deadlocks in a replicated database, not all but exactly n-1 replicas should be configured with the NOWAIT parameter (where n is the number of replicas). If fewer than n-1 replicas enable NOWAIT, a distributed deadlock spanning the replicas that have not enabled the parameter could occur, compromising the liveness of the replicated database. We assume a DivRep deployment with two replicas, one of which has the NOWAIT parameter enabled. Even with NOWAIT enabled on one of them, other types of concurrency conflicts might be reported, since the replicas use two-phase locking for write operations. In particular, a replica that has not enabled NOWAIT may report a centralized deadlock (when concurrent transactions executing on the same replica try to acquire a set of locks in different orders). In this case DivRep follows the decision made by the replica and aborts the victim transaction. Similarly, on the replica that has enabled NOWAIT an exception due to the first-updater-wins rule could be observed, i.e. not every exception is raised as a result of the NOWAIT parameter.
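For the two-replica deployment assumed above, the effect of enabling NOWAIT on one replica can be checked with a small waits-for sketch. The Python code below is an illustration under simplifying assumptions, not part of DivRep: two transactions write a single item x on both replicas, an abort is applied on every replica at once, and the function names and the `arrival_order` representation are hypothetical. It does not attempt to model deployments with more than two replicas.

```python
def waits_for_edges(arrival_order, nowait):
    """Return (edges, aborted) for two transactions that both write item x.

    arrival_order: replica -> (first_writer, second_writer) on that replica
    nowait:        replica -> True if NOWAIT is enabled on that replica
    """
    aborted = set()
    edges = set()
    for replica, (first, second) in arrival_order.items():
        if nowait[replica]:
            aborted.add(second)           # exception raised immediately
        else:
            edges.add((second, first))    # second blocks behind the lock holder
    # An aborted transaction releases its locks everywhere and stops waiting.
    edges = {(a, b) for a, b in edges if a not in aborted and b not in aborted}
    return edges, aborted


def has_cycle(edges):
    """Detect a cycle in a small directed waits-for graph by depth-first search."""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)

    def reachable(start, target, seen):
        for nxt in graph.get(start, ()):
            if nxt == target:
                return True
            if nxt not in seen and reachable(nxt, target, seen | {nxt}):
                return True
        return False

    return any(reachable(node, node, {node}) for node in graph)


# Ti writes x first on RA, Tj writes x first on RB.
order = {"RA": ("Ti", "Tj"), "RB": ("Tj", "Ti")}
no_nowait = waits_for_edges(order, {"RA": False, "RB": False})
one_nowait = waits_for_edges(order, {"RA": True, "RB": False})
```

With NOWAIT disabled on both replicas, each transaction waits for the other on a different replica and the combined graph contains a cycle that neither replica can see locally. With NOWAIT enabled on RA, Tj is aborted and the cycle disappears.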

To what extent does the NOWAIT parameter affect the abort rate? We reason about the matter informally using Figure 3-9. Let us assume an FT-node, a system with two replicas, in which only one of the replicas, RA, has the NOWAIT parameter enabled (clearly an FT-node is a special case of a system with more than two replicas). Two overlapping transactions, Ti and Tj, both try to modify data item x. The moment transaction Tj tries to acquire the exclusive lock on x on RA, a NOWAIT exception will be raised and the middleware will abort the transaction on both replicas. Following the abort of Tj on RB, the first-committer-wins rule is enforced: the exclusive lock will be granted to Ti and the transaction will successfully commit on both replicas.

However, had the NOWAIT parameter been enabled on replica RB too, transaction Ti would also have been aborted, following the NOWAIT exception raised as part of Tj execution. Hence, when more than one replica has the NOWAIT parameter enabled, the deadlock avoidance scheme might result in unnecessary aborts. This is not possible in DivRep, however, where only one replica enables the parameter (in this way DivRep obeys the rule that n-1 replicas have NOWAIT enabled). Moreover, even with NOWAIT on both replicas, unnecessary aborts might be infrequent in practice: it is likely that the abort of Tj happens earlier, in global calendar time, than the request for the writing of x by Ti on RB. In this way Tj would release the locks, the NOWAIT exception would not be triggered by RB and Ti would commit.

[Figure 3-9: timelines of replicas RA and RB showing, for transactions Ti and Tj, the begins (bi, bj), the writes wi(x) and wj(x), the aborts of Tj (aj), the commits of Ti (ci), the point at which the NOWAIT exception is raised, and a blocked write.]

Figure 3-9 Deadlock avoidance using the NOWAIT parameter in a replicated database system with two replicas, RA and RB. Two transactions, Ti and Tj, execute concurrently and access the same data item x. Only transaction boundary operations (begins (b), aborts (a) and commits (c)) and the write of the common data item are shown. The interaction between the replicas and the middleware, and similarly between the clients and the middleware, is omitted for clarity.
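The informal argument about unnecessary aborts can be made concrete with a small timing model. The Python sketch below is an assumption-laden illustration rather than the DivRep implementation: the function name, the event representation and the constant `abort_delay` (the time an abort decided on one replica takes to release the victim's locks on the other) are all hypothetical, and only two transactions writing a single item x are modelled.

```python
import math


def aborted_transactions(write_requests, nowait, abort_delay):
    """Return the set of transactions aborted by NOWAIT exceptions.

    write_requests: list of (time, replica, txn) requests to write item x
    nowait:         replica -> True if NOWAIT is enabled on that replica
    abort_delay:    assumed time for an abort to release the victim's locks
    """
    held = {}        # replica -> (holder, time at which its lock is released)
    aborted_at = {}  # txn -> time at which its abort was decided
    for time, replica, txn in sorted(write_requests):
        if txn in aborted_at:
            continue  # no further work is issued for a victim
        holder, released = held.get(replica, (None, -math.inf))
        if holder is None or released <= time:
            held[replica] = (txn, math.inf)      # lock granted
        elif nowait[replica]:
            aborted_at[txn] = time               # exception raised at once
            for r, (h, _) in list(held.items()):
                if h == txn:
                    held[r] = (h, time + abort_delay)
        # Otherwise the request blocks until the holder ends; nobody aborts.
        # (The blocked writer is not tracked further in this two-transaction model.)
    return set(aborted_at)


# Ti writes x first on RA, Tj first on RB; the conflicting requests follow.
requests = [(1, "RA", "Ti"), (2, "RB", "Tj"), (3, "RA", "Tj"), (4, "RB", "Ti")]
only_ra = aborted_transactions(requests, {"RA": True, "RB": False}, abort_delay=2)
both_slow = aborted_transactions(requests, {"RA": True, "RB": True}, abort_delay=2)
both_fast = aborted_transactions(requests, {"RA": True, "RB": True}, abort_delay=0.5)
```

Under these assumptions only Tj is aborted when RA alone has NOWAIT enabled. With NOWAIT on both replicas, Ti is aborted as well when its request on RB arrives before the abort of Tj has released the lock there, and only Tj is aborted when the abort completes first.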

There is an exception to the claim that DivRep does not produce unnecessary aborts: such aborts are possible in specific situations where one of the database engines in an FT-node provides a deadlock detection mechanism, which might make decisions contrary to those of a replica with NOWAIT enabled. Consequently, it is possible that two replicas resolve conflicting transactions in a different order. An instance of this situation is depicted in Figure 3-10, which shows the executions of two transactions, Ti and Tj, on two replicas, RA and RB. Replica RA will report a concurrency conflict exception as soon as Tj tries to acquire the lock for data item x, and as a result the transaction will have to be aborted. Correspondingly, on replica RB (which has NOWAIT disabled) a deadlock detection scheme will decide that Ti should be aborted due to the centralised deadlock. Without imposing execution determinism in DivRep, both transactions will be aborted. However, this particular series of events is unlikely: the value of the deadlock timeout should, e.g. according to best practice guides for database administrators, exceed the typical transaction time, and in that way avoid the possibility that inconsistent decisions are made by different replicas (transaction Tj would be aborted before the deadlock detection mechanism is triggered on RB and, as a consequence, transaction Ti would commit).

FT-node uses two database servers, one of which has the NOWAIT parameter enabled. When selecting the pair of servers, a particular database engine might not offer the NOWAIT parameter and might therefore be ruled out as an inappropriate choice for FT-node. Nonetheless, the selection process should verify whether the functionality of NOWAIT could be simulated using other configuration parameters: many database servers offer a lock timeout parameter that prevents indefinitely long blocking, e.g. LOCK_TIMEOUT in Microsoft SQL Server or innodb_lock_wait_timeout in MySQL. Commonly, setting the values of these parameters to zero simulates the NOWAIT functionality; failing that, fine-grained timeout values (e.g. in milliseconds) could achieve the same.
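The idea of simulating NOWAIT with a zero lock timeout can be tried out with Python's standard sqlite3 module. SQLite is not one of the servers discussed in this section and is used here only as a stand-in because it runs without a server process; it also locks the whole database file rather than individual rows. The function name and the table layout are hypothetical.

```python
import os
import sqlite3
import tempfile


def try_conflicting_writes():
    """Two connections contend for the write lock; the second must not wait."""
    path = os.path.join(tempfile.mkdtemp(), "demo.db")
    # timeout=0: give up immediately instead of waiting for the lock.
    # isolation_level=None: transactions are started and ended explicitly.
    t1 = sqlite3.connect(path, timeout=0, isolation_level=None)
    t2 = sqlite3.connect(path, timeout=0, isolation_level=None)
    t1.execute("CREATE TABLE item (id INTEGER PRIMARY KEY, value INTEGER)")
    t1.execute("INSERT INTO item VALUES (1, 0)")

    t1.execute("BEGIN IMMEDIATE")          # T1 takes the write lock
    t1.execute("UPDATE item SET value = 1 WHERE id = 1")
    try:
        t2.execute("BEGIN IMMEDIATE")      # T2 requests the same lock
        outcome = "T2 granted"
    except sqlite3.OperationalError as exc:
        outcome = f"T2 aborted: {exc}"     # raised at once, no blocking
    t1.execute("COMMIT")
    value = t1.execute("SELECT value FROM item WHERE id = 1").fetchone()[0]
    t1.close()
    t2.close()
    return outcome, value
```

With a non-zero timeout the second connection would instead wait for up to that long before failing, which corresponds to the fine-grained timeout alternative mentioned above.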

[Figure 3-10: timelines of replicas RA (NOWAIT exception raised) and RB (NOWAIT disabled) showing, for transactions Ti and Tj, the begins (bi, bj), the writes wi(x), wi(y), wj(x) and wj(y), and the aborts ai and aj. On RB a local deadlock occurs and Ti is chosen as “victim”.]

Figure 3-10 An example of unnecessary aborts in DivRep. Two transactions, Ti and Tj, execute concurrently and both access the two data items, x and y, on the two replicas, RA and RB. Only transaction boundary operations (begins (b), aborts (a) and commits (c)) and the writes of the common data items are shown. The interaction between the replicas and the middleware, and similarly between the clients and the middleware, is omitted for clarity.
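The outcome shown in Figure 3-10 follows from combining the independent decisions of the two replicas. The short Python sketch below illustrates that combination under the assumption, taken from the description above, that a transaction aborted by any replica is aborted everywhere; the function name and the dictionary representation are hypothetical.

```python
def global_outcome(transactions, replica_victims):
    """Combine per-replica abort decisions into one outcome per transaction.

    replica_victims: replica -> transaction that replica aborted (or None)
    A transaction commits only if no replica chose it as a victim.
    """
    victims = {v for v in replica_victims.values() if v is not None}
    return {t: ("abort" if t in victims else "commit") for t in transactions}


# Figure 3-10: RA raises a NOWAIT exception for Tj, RB picks Ti as deadlock victim.
contrary = global_outcome(["Ti", "Tj"], {"RA": "Tj", "RB": "Ti"})
# Likely case: Tj is aborted before deadlock detection is triggered on RB.
usual = global_outcome(["Ti", "Tj"], {"RA": "Tj", "RB": None})
```

In the first call both transactions end up aborted, whereas in the second only Tj is aborted and Ti commits.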


3.2. Correctness of DRA