Reconciliation Algorithms - A Distributed Storage and Query Subsystem for Collaborative Data Sharing

Definition 8 describes in a declarative fashion what a general reconciliation algorithm must do. When participant p_i decides to reconcile, it must retrieve the newly published, trusted transactions (the relevant transactions). It must compute their transaction and update extensions, and then choose a subset of those to apply, based on integrity constraints and user preferences. It must then update the participant's instance to reflect the transactions it has chosen to accept, and record the decisions it made. In this section, we describe concrete algorithms that satisfy this definition.

The computation described above can either be centralized or distributed. If the work is centralized on the reconciling participant, we call it client-centric reconciliation, since it is typically the reconciling participant that retrieves all of the relevant transactions and decides which to apply. An alternative is network-centric reconciliation, in which computation is distributed across the entire network of participants. While the network-centric approach would place less load on the reconciling participant by distributing almost all of the work across the network, the client-centric approach generates less network traffic, and it allows for a considerably simpler reconciliation algorithm. It may also allow potentially sensitive information, such as the trust conditions, to be kept private from other participants.

Figure 3.4: Comparison of reconciliation algorithms and update stores

Central store, client-centric reconciliation. Pros: low communication, high reliability. Cons: needs reliable central server, reconciliation work all at one peer.

Central store, network-centric reconciliation. Pros: distributes reconciliation work across many peers, high reliability. Cons: high communication, needs reliable central server.

Distributed store, client-centric reconciliation. Pros: no central store, medium communication. Cons: needs stable base of connected peers, reconciliation work all at one peer.

Distributed store, network-centric reconciliation. Pros: no central store, distributed reconciliation work. Cons: highest communication, needs stable base of connected peers.

In this thesis, we only consider client-centric reconciliation. Experimental evaluation, given in Section 3.5, showed that the cost of retrieving updates from storage (either local or distributed) was the dominating cost in reconciliation, so we did not consider it necessary to develop a distributed implementation of the reconciliation procedure.

The reconciliation algorithm needs to access several different kinds of data to perform the operations outlined above. It must access the log of published transactions and the instance of the reconciling participant. It also needs to read and modify the sets of applied, rejected, and deferred transactions for the reconciling participant. We define an update store module to provide a general interface to much of the aforementioned state. We have explored using both a centralized server and a distributed store in which the participants themselves store the state.

Each combination of reconciliation algorithm and update store implementation has its own unique benefits, as shown in Figure 3.4. Our initial implementation uses client-centric reconciliation, which is considerably simpler both to understand and to implement; we couple that with either central or distributed storage. As future work we propose to implement network-centric reconciliation over distributed storage using the distributed storage layer described in Chapter 4.

In order to implement an algorithm for the general reconciliation problem given in Definition 8, we introduce several new concepts:

Dirty values are key values that are touched (i.e., read or written) by a deferred transaction. As mentioned previously, any transaction that reads or writes a value whose key is in the dirty value set must be deferred, in order to ensure that a previously deferred transaction can always be accepted later. Dirty values are used to enforce condition 2c of the definition without performing many pairwise compatibility checks; it suffices to maintain a set of dirty keys and test membership in it.

Conflict groups are groups of conflicts of the same type that involve the same key value; the reconciliation algorithm arranges the conflicts found in each reconciliation into such groups.

Options are groups of transactions within a conflict group that make the same modification to the key value. At most one option can be accepted for each conflict group when conflicts are resolved; the transactions from the other options are rejected.

In the common case, each option within a conflict group will have only one transaction in it. Consider our running example of the F(organism, protein, function) relation. Suppose we had three transactions

t1 {+F(mouse, KIF4, spindle stabilization), +F(mouse, KIF5, spindle stabilization)}

t2 {+F(mouse, KIF4, chromosomal positioning)}

t3 {+F(mouse, KIF5, chromosomal positioning)}

Here there would be two conflict groups, one for ⟨mouse, KIF4⟩ with options {{t1}, {t2}}, and one for ⟨mouse, KIF5⟩ with options {{t1}, {t3}}. These list the sets of compatible transactions for each key, and the user should choose one (or none) of those sets as valid; the others will be rejected in the conflict resolution procedure outlined in Section 3.2.

In a more complicated scenario, there may be multiple transactions within each option. If we slightly alter the above example to

t2 {+F(mouse, KIF4, chromosomal positioning)}

t3 {+F(mouse, KIF4, chromosomal positioning)}

then there would be one conflict group with two options, {{t1}, {t2, t3}}.
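The grouping in the two examples above can be illustrated with a minimal Python sketch. It assumes insert-only transactions over F and treats (organism, protein) as the key, as in the ⟨mouse, KIF4⟩ groups. The function name `conflict_groups` and the dictionary layout are illustrative assumptions, not part of the thesis.

```python
from collections import defaultdict


def conflict_groups(txns):
    """Return {key: options} for every key written with more than one value.

    txns maps a transaction ID to a list of (organism, protein, function)
    insertions. Each option is the sorted list of transaction IDs that
    insert the same function for that key.
    """
    by_key = defaultdict(lambda: defaultdict(set))
    for txn_id, inserts in txns.items():
        for organism, protein, function in inserts:
            by_key[(organism, protein)][function].add(txn_id)

    groups = {}
    for key, by_value in by_key.items():
        if len(by_value) > 1:  # a single value means there is no conflict
            groups[key] = sorted(sorted(ids) for ids in by_value.values())
    return groups


# First example: two conflict groups with singleton options.
first = {
    "t1": [("mouse", "KIF4", "spindle stabilization"),
           ("mouse", "KIF5", "spindle stabilization")],
    "t2": [("mouse", "KIF4", "chromosomal positioning")],
    "t3": [("mouse", "KIF5", "chromosomal positioning")],
}

# Altered example: t2 and t3 make the same modification to the KIF4 key.
second = dict(first, t3=[("mouse", "KIF4", "chromosomal positioning")])

print(conflict_groups(first))
print(conflict_groups(second))
```

For the first input the sketch reports the options {{t1}, {t2}} for ⟨mouse, KIF4⟩ and {{t1}, {t3}} for ⟨mouse, KIF5⟩; for the altered input it reports a single group for ⟨mouse, KIF4⟩ with options {{t1}, {t2, t3}}.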

We now present the client-side algorithm for reconciliation, the core of which is the ReconcileUpdates procedure given in Algorithm 3.1. It determines which updates the participant can apply or reject during a particular reconciliation, and which it must defer, in a manner satisfying the requirements of Definition 8. It also assigns the deferred transactions into conflict groups, as described above, to explain to users which transactions were deferred and why. As described in Section 3.2, this algorithm is also run again after decisions for deferred transactions have been supplied by the user. Then, after recording the transactions a user has decided to reject, it reexamines the remaining deferred transactions to discover which, if any, can now be accepted.

Algorithm 3.1 ReconcileUpdates(recno)

Input: recno (reconciliation number to perform)

Helper functions are given as Algorithms 3.2, 3.3, 3.4, and 3.5.

 1: txns ← IDs of the undecided trusted transactions new for recno for this participant
 2: prio ← mapping from index in txns to priority
 3: prios ← set of all transaction priorities
 4: Sort prios in decreasing order
 5: for t ∈ txns do
 6:     upEx[t] ← the flattened update extension of t
 7:     decision[t] ← CheckState(recno, upEx[t])
 8: end for
 9: conflicts ← FindConflicts(txns, upEx)
10: for txnPrio ∈ prios do
11:     decision ← DoGroup(txnPrio, conflicts, prio, decision)
12: end for
13: Record decision at recno
14: for t ∈ txns do
15:     if decision[t] = ACCEPT then
16:         Apply upEx[t]
17:     end if
18: end for
19: deferred ← {txn | decision[txn] = DEFER}
20: UpdateConflicts(recno, deferred)

The ReconcileUpdates procedure is given as Algorithm 3.1, and the various helper functions appear as Algorithms 3.2, 3.3, 3.4, and 3.5. ReconcileUpdates begins by computing the flattened update extension of each trusted transaction. The call to CheckState at line 7 determines which transactions must be rejected or deferred because of the reconciling participant's dirty value set or materialized state. The call to FindConflicts at line 9 discovers conflicts between the flattened update extensions of trusted transactions. The algorithm then calls DoGroup at line 11 to consider each group of transactions with the same priority, in decreasing order of priority; the decreasing order allows the algorithm to proceed greedily and consider each group only once. Within each group, transactions that conflict with higher-priority accepted transactions are rejected, and those that conflict with higher-priority deferred transactions are themselves deferred; if conflicts are found between two non-rejected transactions within a group, both are deferred. Once all priority groups have been considered, ReconcileUpdates has made decisions for all trusted transactions. Line 13 records which transactions the participant has decided to accept or reject. Lines 14-18 update the state of the local database; it is necessary to recompute the update extension since the antecedents of the trusted transactions may overlap. Line 20 updates the participant's dirty value set and list of conflicts for the current reconciliation.
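The greedy pass over priority groups described above can be sketched in Python as follows. This is an illustrative reading of the prose, not the thesis implementation: the names `do_group` and `decide`, the string constants, and the representation of conflicts as a symmetric map of sets are all assumptions. Where a transaction conflicts with both an accepted and a deferred higher-priority transaction, the sketch lets rejection take precedence.

```python
ACCEPT, REJECT, DEFER = "ACCEPT", "REJECT", "DEFER"


def do_group(txn_prio, conflicts, prio, decision):
    """Decide all transactions whose priority equals txn_prio."""
    group = {t for t, p in prio.items()
             if p == txn_prio and decision[t] != REJECT}
    higher = {t for t, p in prio.items() if p > txn_prio}

    # Compare each transaction against already-decided higher priorities.
    for t in sorted(group):
        rivals = [decision[c] for c in conflicts.get(t, set()) & higher]
        if ACCEPT in rivals:
            decision[t] = REJECT
            group.discard(t)
        elif DEFER in rivals:
            decision[t] = DEFER

    # Conflicts between non-rejected members of the group defer both sides.
    for t in group:
        for other in conflicts.get(t, set()) & group:
            decision[t] = decision[other] = DEFER
    return decision


def decide(prio, conflicts, decision):
    """Visit priority groups in decreasing order, each exactly once."""
    for p in sorted(set(prio.values()), reverse=True):
        decision = do_group(p, conflicts, prio, decision)
    return decision


# Hypothetical input: t1 outranks the others; t3 and t4 conflict as equals.
prio = {"t1": 2, "t2": 1, "t3": 1, "t4": 1, "t5": 1}
conflicts = {"t1": {"t2"}, "t2": {"t1"}, "t3": {"t4"}, "t4": {"t3"}}
result = decide(prio, conflicts, {t: ACCEPT for t in prio})
print(result)
```

In this hypothetical run t2 is rejected because it conflicts with the accepted higher-priority t1, t3 and t4 are deferred because they conflict at equal priority, and t1 and t5 are accepted.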

Algorithm 3.2 CheckState(recno, upEx)

Input: recno (reconciliation number), upEx (update extension of transaction)

Output: decision for input transaction

1: if upEx contains a value dirty at recno then
2:     return DEFER
3: else if upEx contains a rejected transaction then
4:     return REJECT
5: else if upEx is incompatible with the instance at recno then
6:     return REJECT
7: else
8:     return ACCEPT
9: end if
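The order of the checks in Algorithm 3.2 can be sketched in Python under a deliberately simplified data model. The `UpdateExtension` class, its fields, and the insert-only notion of incompatibility (an existing key holding a different value) are assumptions made for illustration; the sketch also tracks only written keys, whereas dirty values in the text cover reads as well.

```python
from dataclasses import dataclass

ACCEPT, REJECT, DEFER = "ACCEPT", "REJECT", "DEFER"


@dataclass
class UpdateExtension:
    txn_ids: set   # the transaction and its undecided antecedents
    writes: dict   # key -> value inserted by the flattened extension


def check_state(up_ex, dirty_keys, rejected, instance):
    """Apply the three state checks in order; accept if none fires."""
    if any(key in dirty_keys for key in up_ex.writes):
        return DEFER    # touches a value dirtied by a deferred transaction
    if up_ex.txn_ids & rejected:
        return REJECT   # depends on a previously rejected transaction
    if any(key in instance and instance[key] != value
           for key, value in up_ex.writes.items()):
        return REJECT   # incompatible with the participant's instance
    return ACCEPT


# Hypothetical participant state.
instance = {("mouse", "KIF4"): "spindle stabilization"}
dirty_keys = {("mouse", "KIF5")}
rejected = {"t9"}

dirty = UpdateExtension({"t3"}, {("mouse", "KIF5"): "chromosomal positioning"})
tainted = UpdateExtension({"t7", "t9"}, {("rat", "KIF4"): "spindle stabilization"})
clash = UpdateExtension({"t2"}, {("mouse", "KIF4"): "chromosomal positioning"})
fresh = UpdateExtension({"t6"}, {("rat", "KIF5"): "spindle stabilization"})

for ext in (dirty, tainted, clash, fresh):
    print(sorted(ext.txn_ids), check_state(ext, dirty_keys, rejected, instance))
```

Checking the dirty set first mirrors the text: a transaction that touches a dirty value is deferred regardless of the other conditions, so that the earlier deferred transaction can still be accepted later.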

Suppose that during a particular reconciliation there are t relevant transactions, each of which has at most a undecided antecedents. Further suppose that each transaction contains at most u component updates. In this case, computing the flattened update extensions will take time O(tua), since that much time is needed even to read through the updates for the relevant transactions. Checking for pairwise conflicts between the update extensions will take time at most O(t^2 + tua), if a hash table-based conflict detection algorithm is used. This conflict detection step asymptotically dominates all other work done afterwards by the ReconcileUpdates procedure, giving a combined running time of O(t^2 + tua).

Algorithm 3.3 FindConflicts(txns, upEx)

Input: txns (set of transaction IDs), upEx (map from transaction ID to flattened update extensions)

Output: map from transaction ID to set of conflicting transaction IDs

 1: conflicts ← ∅
 2: for t, t′ ∈ txns do
 3:     if upEx[t] conflicts with upEx[t′] then
 4:         if neither t nor t′ subsumes the other then
 5:             conflicts[t] ← conflicts[t] ∪ {t′}
 6:             conflicts[t′] ← conflicts[t′] ∪ {t}
 7:         end if
 8:     end if
 9: end for
10: return conflicts

Algorithm 3.4 DoGroup(txnPrio, conflicts, prio, decision)

Input: txnPrio (value of priority to make decisions for), conflicts (map from transaction ID to IDs of conflicting transactions), prio (map from transaction ID to priority), decision (map from transaction ID to decision for already decided transactions)

Output: map from transaction ID to decision

 1: prioGrp ← transactions that prio maps to txnPrio
 2: higher ← transactions that prio maps to a priority > txnPrio
 3: Remove rejected transactions from prioGrp
 4: for t ∈ prioGrp do
 5:     for c ∈ (conflicts[t] ∩ higher) do
 6:         if decision[c] = ACCEPT then
 7:             decision[t] ← REJECT
 8:             prioGrp ← prioGrp − {t}
 9:         else if decision[c] = DEFER then
10:             decision[t] ← DEFER
11:         end if
12:     end for
13: end for
14: for t, t′ ∈ prioGrp do
15:     if t conflicts with t′ then
16:         decision[t] ← DEFER
17:         decision[t′] ← DEFER
18:     end if
19: end for
20: return decision

Algorithm 3.5 UpdateConflicts(recno, deferred)

Input: recno (reconciliation number), deferred (IDs of deferred transactions)

Clear all conflict state (conflict groups, dirty values) from reconciliation recno
for t ∈ deferred do
    upEx[t] ← the flattened update extension of t
    Remove from upEx[t] all clean updates inapplicable at recno
    Mark upEx[t] dirty at recno
end for
conflicts ← FindConflicts(deferred, upEx)
conflictGroups ← ∅
for t ∈ deferred, t′ ∈ conflicts[t] do
    for conflict ⟨type, value⟩ between t and t′ do
        Add {t, t′} to conflictGroups[⟨type, value⟩]
    end for
end for
for ⟨type, value⟩ ∈ conflictGroups.keys do
    Combine compatible txns for ⟨type, value⟩ into same option
end for
Record conflictGroups as conflict set for recno
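The running-time analysis assumes a hash table-based conflict detection step, which the thesis does not spell out here. One possible shape for it, sketched in Python under stated assumptions: each flattened update extension is modeled as a dictionary from key to written value, two extensions conflict when they write different values for the same key, and the subsumption test of Algorithm 3.3 is omitted. The sketch builds the index in a single pass over the updates but makes no claim to match the stated bound in its pair-enumeration step.

```python
from collections import defaultdict
from itertools import combinations


def find_conflicts(txns, up_ex):
    """Map each transaction ID to the set of IDs it conflicts with."""
    # One pass over every update: key -> value -> transactions writing it.
    index = defaultdict(lambda: defaultdict(set))
    for t in txns:
        for key, value in up_ex[t].items():
            index[key][value].add(t)

    conflicts = {t: set() for t in txns}
    for by_value in index.values():
        # Transactions that write different values for one key conflict.
        for (_, left), (_, right) in combinations(by_value.items(), 2):
            for t in left:
                for other in right:
                    conflicts[t].add(other)
                    conflicts[other].add(t)
    return conflicts


# The three transactions of the running example, as key -> value maps.
up_ex = {
    "t1": {("mouse", "KIF4"): "spindle stabilization",
           ("mouse", "KIF5"): "spindle stabilization"},
    "t2": {("mouse", "KIF4"): "chromosomal positioning"},
    "t3": {("mouse", "KIF5"): "chromosomal positioning"},
}
found = find_conflicts(["t1", "t2", "t3"], up_ex)
print(found)
```

Indexing by key means only transactions that share a key are ever compared, which avoids comparing every pair of update extensions update by update.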

Proposition 2 (Correctness of ReconcileUpdates). The ReconcileUpdates procedure given in Algorithm 3.1 satisfies the conditions given in Definition 8.

Proof. Condition 1 of Definition 8 requires that decisions from a previous reconciliation are never overruled. It is satisfied for rejections, since transaction extensions that contain rejected transactions are rejected at line 4 of the CheckState method. It is satisfied otherwise because ReconcileUpdates only considers newly arrived transactions and their transaction extensions, which by definition contain only transactions that have not yet been accepted.

Conditions 2a and 2b are enforced by line 2 of CheckState for transactions deferred in a previous reconciliation, since any conflict with or dependency on a deferred transaction will cause the update extension under consideration to touch a dirty value. Conflicts with transactions deferred during the current reconciliation (which will also catch dependent transactions, since they must modify an overlapping set of keys) are caught at line 10 of the DoGroup method. Condition 2c is caught at line 16 of DoGroup.

Condition 3a is enforced at line 4 of CheckState. Condition 3b is checked at line 7 of DoGroup. By considering the trusted transactions in decreasing order by priority, ReconcileUpdates greedily ensures that condition 3b is satisfied; since lower priority transactions can never affect whether higher priority transactions are accepted, the lower priority ones can be considered independently in subsequent iterations. Condition 3c is caught at line 6 of CheckState if the instance was not compatible with the transaction before any transactions were accepted during this reconciliation. If the instance was compatible at the start of the reconciliation but was made incompatible during the reconciliation, then the transaction must also conflict with some transaction of higher priority that was already accepted, and it was therefore rejected for violating condition 3b above.

The remaining conditions are trivially satisfied. The algorithm does not reject or defer transactions that it could accept, no transaction is given more than one decision, and the instance is updated. Lines 14 to 18 of ReconcileUpdates omit the complexity of applying overlapping update extensions correctly, but this is straightforward to implement by recomputing upEx[t] immediately before applying it.
