Complexity analysis - Theta replication method


3.4 Complexity analysis

In the Conflict Resolution algorithm implemented for the proposed approach, simple arrays of variables are sorted and compared. All of these operations rely on the fastest available algorithms [14]. The Quick-Sort algorithm is used for both TIDs Selection and Conflict Resolution.

The following sections present the time complexities of the selection of transaction identifiers and of the conflict resolution.

3.4.1 Selection of the transaction identifier

Let n be the length of the transaction queue (the number of transactions in the queue).

The first step of the selection of transaction identifiers is to sort all the elements of the input queue using the Quick-Sort method, whose average-case complexity is θ(n ∗ log(n)). Then the lowest consecutive elements of the queue are selected.

The complexity of this operation is equal to θ(n), and the overall time complexity of the TIDs Selection is given by the greater of these two complexities:

θ(n ∗ log(n)) + θ(n) = θ(n ∗ log(n)) (3.5)
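
The two steps above can be sketched as follows. This is a minimal illustration, not the method's actual implementation: the function name `select_tids`, the parameter `k` (how many identifiers to take) and the representation of the queue as a list of integers are assumptions made for the example. Python's built-in `sorted` uses Timsort rather than Quick-Sort, but it has the same O(n ∗ log(n)) bound, so the overall cost is still dominated by the sort.

```python
def select_tids(queue, k):
    """Return the k lowest transaction identifiers from the queue.

    queue: list of integer TIDs (hypothetical representation).
    k: number of identifiers to select (hypothetical parameter).
    """
    ordered = sorted(queue)   # sorting step: O(n log n) comparisons
    return ordered[:k]        # selection step: at most n elements copied


selected = select_tids([17, 4, 42, 8, 15], 3)
print(selected)  # [4, 8, 15]
```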

3.4.2 Conflict resolution

Let:

p be the number of transactions chosen from the queue of transactions,

n₁, n₂, . . . , n_p be the numbers of single operations/procedures per transaction (the numbers of rows in the lists of parameters related to the processed transactions).

In the first step of finding possible conflicts between the p transactions selected from the queue, the parameters related to these transactions are sorted. This requires p sorts (one sort per set of parameters) using the Quick-Sort algorithm. The time complexity of all these sorts is expressed below:

θ(n₁ ∗ log(n₁)) + θ(n₂ ∗ log(n₂)) + . . . + θ(n_p ∗ log(n_p)) (3.6)

For n = max{n₁, n₂, . . . , n_p}, expression (3.6) is bounded by:

p ∗ θ(n ∗ log(n)) = θ(n ∗ log(n)) (3.7)

Since p is treated as a constant factor, and constant factors can be ignored on the left side of equation (3.7), the time complexity of the sort operations for the lists of all parameters is equal to θ(n ∗ log(n)).

Once the parameters in each transaction's list are sorted, the lists are compared pairwise. Comparing two lists takes n_i steps, where n_i is the larger of the row counts of the two lists of parameters.

The time complexity of the comparison of two sorted lists is equal to θ(n_i). The time complexity of all comparisons is the sum of the complexities of the single comparisons. After ignoring constant factors and taking n = max{n₁, n₂, . . . , n_p}, it is equal to θ(n), as shown in equation (3.8).

(p − 1)θ(n) + (p − 2)θ(n) + . . . + (p − (p − 1))θ(n) = θ(n) (3.8)
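
The sort-then-compare procedure can be illustrated with a short sketch. It rests on assumptions that are not taken from the text: two transactions are treated as conflicting when their parameter lists share at least one value, parameters are plain comparable values, and the names `lists_overlap`, `find_conflicts` and `params_by_tid` are hypothetical. The two-pointer scan visits each element of both sorted lists at most once, so one comparison is linear in the list lengths, and the number of compared pairs is (p − 1) + (p − 2) + . . . + 1 = p(p − 1)/2, matching the sum in (3.8).

```python
from itertools import combinations


def lists_overlap(a, b):
    """Two-pointer scan over two sorted lists; True if they share a value."""
    i = j = 0
    while i < len(a) and j < len(b):
        if a[i] == b[j]:
            return True
        if a[i] < b[j]:
            i += 1
        else:
            j += 1
    return False


def find_conflicts(params_by_tid):
    """Return pairs of TIDs whose parameter lists overlap.

    params_by_tid: dict mapping TID -> list of parameter values
    (assumed layout, chosen only for this example).
    """
    # Step 1: p sorts, one per transaction.
    sorted_params = {tid: sorted(p) for tid, p in params_by_tid.items()}
    # Step 2: compare every pair of transactions, p(p - 1)/2 pairs in total.
    return [
        (t1, t2)
        for t1, t2 in combinations(sorted(sorted_params), 2)
        if lists_overlap(sorted_params[t1], sorted_params[t2])
    ]


conflicts = find_conflicts({1: [30, 10], 2: [20, 40], 3: [10, 50]})
print(conflicts)  # [(1, 3)]
```

In this example only transactions 1 and 3 share a parameter (10), so under the stated assumption they would not be processed in parallel, while transaction 2 conflicts with neither.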

The overall time complexity of all the operations required to perform the conflict resolution is equal to:

θ(n ∗ log(n)) + θ(n) = θ(n ∗ log(n)) (3.9)

3.5 Summary

The Theta approach is realized in a multi-tier architecture with distributed middleware, which provides mechanisms that ensure a high level of system scalability. Since the Theta replication method does not require any distributed locks and transactions are processed in parallel only if they are not in conflict, deadlock detection and resolution are realized using the same techniques as in a centralized database system.

Because locks between remote replicas are no longer necessary, distributed transactions are not required, and in consequence the overall performance of the data replication improves significantly.

The Conflict Prevention algorithm designed for the approach does not require any special order of the incoming transactions, which minimizes the number of messages exchanged between remote sites and considerably increases the overall performance of the replication process.

The clients’ transactions are processed concurrently in the database, which improves the performance of the replication process. A major advantage of the proposed approach is the use of the Conflict Prevention algorithm in conjunction with the execution of procedures stored in databases. The new approach uses an innovative mechanism for determining the order of transaction execution, which allows transactions to be applied in replicas concurrently. Another advantage of the approach is that stored procedures are executed in the database on behalf of the middleware. These stored procedures are precompiled in the database kernel, so they are more efficient than the execution of single SQL instructions.

Because system components and stored data in a system implementing Theta replication are duplicated among different sites, the overall resistance of the system to failures of a single component, or even of a whole site, fulfills the defined data safety requirements.

In Theta replication, stored procedures are executed in databases using native drivers in the middleware layer for communication with each particular database, which eliminates the need to modify the application code to meet the requirements of a particular database. Since procedures use identical names and exactly the same list of parameters in each replica, it is possible to use different platforms and various databases. Moreover, changes in the application logic caused by business requirements do not require modifications to the code of the Theta replication core system. As with applications working with a single-instance database, only the client’s application and database objects have to be modified. In the middleware, it is only required to redefine the relations between new and existing stored procedures.

To summarize, the Theta approach uses both adapted components that appear in the literature or in solutions implemented in practice (for instance, global transaction identifiers and a distributed middleware architecture) and new elements (the Conflict Resolution algorithm and the way of communication between processes inside the middleware and remote sites). Thanks to its features, the new approach offers a highly scalable, efficient and failure-resistant data replication technique for distributed, multi-node systems. The new method is suitable for environments working on various hardware platforms, with different operating systems and databases.

In document New method for data replication in distributed heterogeneous database systems (Page 79-83)