Centralized Plan Generation - Declarative Networking

This section describes the steps required to generate execution plans for a centralized Datalog program using the semi-naïve (SN) fixpoint evaluation mechanism [Balbin and Ramamohanarao, 1987]. SN is the standard method for evaluating Datalog programs correctly without redundant computation. The Shortest-Path program (Figure 2.5 in Chapter 2) is used here as an example of how SN is achieved in the DN engine.

4.1.1 SEMI-NAÏVE EVALUATION

The first step in SN is the semi-naïve rewrite, where each Datalog rule is rewritten to generate a number of delta rules to be evaluated. Consider the following rule:

$$p \mathrel{:-} p_1, p_2, \ldots, p_n, b_1, b_2, \ldots, b_m. \tag{4.1}$$

$p_1, \ldots, p_n$ are derived predicates and $b_1, \ldots, b_m$ are base predicates. Derived predicates refer to intensional relations that are derived during rule execution. Base predicates refer to extensional (stored) relations whose values are not changed during rule execution. The SN rewrite generates $n$ delta rules, one for each derived predicate, where the $k$th delta rule has the form [1]:

$$\Delta p^{new} \mathrel{:-} p_1^{old}, \ldots, p_{k-1}^{old}, \Delta p_k^{old}, p_{k+1}, \ldots, p_n, b_1, b_2, \ldots, b_m. \tag{4.2}$$

[1] These delta rules are logically equivalent to rules of the form $\Delta p_j^{new} \mathrel{:-} p_1, p_2, \ldots, p_{k-1}, \Delta p_k^{old}, p_{k+1}, \ldots, p_n, b_1, b_2, \ldots, b_m$.

In each delta rule, $\Delta p_k^{old}$ is the delta predicate, and refers to $p_k$ tuples generated for the first time in the previous iteration. Here, $p_k^{old}$ refers to all $p_k$ tuples generated before the previous iteration.

For example, the following rule r2-1 is the delta rule for the recursive rule r2 from the Datalog program shown in Figure 2.1 from Chapter 2:

$$\Delta \mathit{path}^{new}(S, D, Z, C) \mathrel{:-} \mathit{link}(S, Z, C_1),\ \Delta \mathit{path}^{old}(Z, D, Z_2, C_2),\ C = C_1 + C_2. \tag{4.3}$$
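The rewrite in (4.2) is mechanical, so it can be sketched in a few lines of Python. This is only an illustration, not the DN planner: the function name `delta_rules`, the string representation of rules, and the `d_` prefix standing in for the delta marker are all assumptions made for this example.

```python
def delta_rules(head, derived, base):
    """Return one delta rule (as a string) per derived body predicate.

    head    -- name of the head predicate, e.g. "p"
    derived -- names of the derived body predicates p1..pn, in order
    base    -- names of the base body predicates b1..bm, in order
    The "d_" prefix is a hypothetical spelling of the delta marker.
    """
    rules = []
    for k, pk in enumerate(derived):
        body = (
            [f"{p}_old" for p in derived[:k]]  # p1^old .. p(k-1)^old
            + [f"d_{pk}_old"]                  # the delta predicate for pk
            + list(derived[k + 1:])            # p(k+1) .. pn, unchanged
            + list(base)                       # b1 .. bm, unchanged
        )
        rules.append(f"d_{head}_new :- " + ", ".join(body) + ".")
    return rules


for rule in delta_rules("p", ["p1", "p2", "p3"], ["b1"]):
    print(rule)
# d_p_new :- d_p1_old, p2, p3, b1.
# d_p_new :- p1_old, d_p2_old, p3, b1.
# d_p_new :- p1_old, p2_old, d_p3_old, b1.
```

Calling `delta_rules("path", ["path"], ["link"])` yields a single rule, matching the observation that r2 has only one derived predicate. The sketch lists derived predicates before base predicates, whereas r2-1 writes link first; the order of body predicates does not change the rule's meaning.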

The only derived predicate in rule r2 is path, and hence one delta rule is generated. All the delta rules generated from the rewrite are then executed in synchronous rounds (or iterations) of computation, where input tuples computed in the previous iteration of a recursive rule execution are used as input in the current iteration to compute new tuples. Any new tuples that are generated for the first time in the current iteration are then used as input to the next iteration. This is repeated until a fixpoint is achieved (i.e., no new tuples are produced).

Algorithm 4.1 summarizes the basic semi-naïve evaluation used to execute these rules in the DN engine. In this algorithm, DN maintains a buffer for each delta rule, denoted by $B_k$. This buffer is used to store $p_k$ tuples generated in the previous iteration ($\Delta p_k^{old}$). Initially, $p_k$, $p_k^{old}$, $\Delta p_k^{old}$ and $\Delta p_k^{new}$ are empty. As a base case, all rules are executed to generate the initial $p_k$ tuples, which are inserted into the corresponding $B_k$ buffers. Each iteration of the while loop consists of flushing all existing $\Delta p_k^{old}$ tuples from $B_k$ and executing all the delta rules to generate $\Delta p_j^{new}$ tuples, which are used to update $p_j^{old}$, $B_j$ and $p_j$ accordingly. Note that only new $p_j$ tuples generated in the current iteration are inserted into $B_j$ for use in the next iteration. A fixpoint is reached when all buffers are empty.

Algorithm 4.1 Semi-naïve Evaluation in DN

  execute all rules
  for each derived predicate p_k
    B_k ← p_k
  end
  while ∃ B_k.size > 0
    ∀ B_k where B_k.size > 0, Δp_k^old ← B_k.flush()
    execute all delta rules
    for each derived predicate p_j
      p_j^old ← p_j^old ∪ Δp_j^old
      B_j ← Δp_j^new − p_j^old
      p_j ← p_j^old ∪ B_j
      Δp_j^new ← ∅
    end
  end
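As a concrete illustration, the loop of Algorithm 4.1 can be sketched in Python for a single derived predicate, path, with r2-1 as its only delta rule. Several things here are assumptions rather than part of the text: the base rule is taken to be path(S,D,D,C) :- link(S,D,C), the function name `seminaive_paths` and the sample links are hypothetical, and the link graph is acyclic so that a fixpoint is reached without any aggregation over costs.

```python
def seminaive_paths(links):
    """Semi-naive evaluation of path over a set of (S, D, C) link tuples.

    Returns the set of path tuples (S, D, Z, C) and the number of
    while-loop iterations executed. Assumes an acyclic link graph.
    """
    # Index link on its second field, the join attribute Z of r2.
    links_by_z = {}
    for (s, z, c1) in links:
        links_by_z.setdefault(z, []).append((s, c1))

    path_old = set()                              # p^old
    # Base case: execute all rules. Only the assumed base rule fires,
    # because path is still empty when r2 is evaluated.
    path = {(s, d, d, c) for (s, d, c) in links}  # p
    buffer = set(path)                            # B <- p
    rounds = 0

    while buffer:
        delta_old, buffer = buffer, set()         # delta p^old <- B.flush()
        # Execute the delta rule r2-1.
        delta_new = set()
        for (z, d, _z2, c2) in delta_old:
            for (s, c1) in links_by_z.get(z, ()):
                delta_new.add((s, d, z, c1 + c2))
        path_old |= delta_old                     # p^old <- p^old U delta p^old
        buffer = delta_new - path_old             # B <- delta p^new - p^old
        path = path_old | buffer                  # p <- p^old U B
        rounds += 1
    return path, rounds


links = {("a", "b", 1), ("b", "c", 2), ("a", "c", 5)}
paths, rounds = seminaive_paths(links)
print(sorted(paths))
print(rounds)  # 2
```

On this input the base case produces the three one-hop paths, the first iteration derives the two-hop path ("a", "c", "b", 3), and the second iteration derives nothing new, which empties the buffer and ends the loop.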

4.1.2 DATAFLOW GENERATION

Algorithm 4.1 requires executing the delta rules at every iteration. These delta rules are each compiled into an execution plan, in the form of a DN dataflow strand, using the conventions of the DN dataflow framework described in Chapter 3. Each dataflow strand implements a delta rule via a chain of relational operators. In the rest of this chapter, the dataflow strand for each delta rule is referred to as a rule strand.

Each rule strand takes as input the delta predicate of its delta rule (the predicate prepended with Δ). The strand passes this input through a sequence of elements that implement relational equijoins. Since tables are implemented as main-memory data structures with local indices over them, tuples from the stream are pushed into an equijoin element, and all matches in the table are found via an index lookup.
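The index-lookup join described above can be sketched as follows. This is a simplified illustration, not DN's element implementation: `IndexedTable`, `equijoin`, and the sample tuples are hypothetical names and data, and a Python dictionary stands in for the local index.

```python
from collections import defaultdict


class IndexedTable:
    """Main-memory table with a hash index on one column (hypothetical)."""

    def __init__(self, key_pos):
        self.key_pos = key_pos
        self.index = defaultdict(list)

    def insert(self, tup):
        self.index[tup[self.key_pos]].append(tup)

    def lookup(self, key):
        # .get avoids creating empty index entries for missing keys.
        return self.index.get(key, [])


def equijoin(stream, table, stream_key_pos):
    """Push each streamed tuple into the join and yield its table matches."""
    for tup in stream:
        for match in table.lookup(tup[stream_key_pos]):
            yield tup, match


# link(S, Z, C1) is indexed on Z (position 1); path tuples (Z, D, Z2, C2)
# arrive on the stream and are matched on their first field.
link = IndexedTable(key_pos=1)
for t in [("a", "b", 1), ("b", "c", 2), ("d", "b", 4)]:
    link.insert(t)

delta_path_old = [("b", "c", "c", 2)]
joined = list(equijoin(delta_path_old, link, stream_key_pos=0))
print(joined)
# [(('b', 'c', 'c', 2), ('a', 'b', 1)), (('b', 'c', 'c', 2), ('d', 'b', 4))]
```

Each streamed tuple costs one index probe rather than a scan of the link table, which is the point of keeping a local index over the stored relation.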

After the translation of the equijoins in a rule, the planner creates elements for any selection filters, which evaluate the selection predicate over each tuple, dropping those for which the result is false. In some cases, the dataflow can be optimized to push a selection upstream of an equijoin, to limit the state and work in the equijoin, following traditional database rules on the commutativity of join and selection.
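The commutativity of join and selection can be checked on a small example. The sketch below is an assumption-laden illustration: the functions `select` and `join`, the cost threshold, and the sample tuples are hypothetical, and the selection predicate refers only to a field of the streamed path tuple, which is the case in which it can safely be moved upstream of the join.

```python
def select(stream, predicate):
    """Selection element: drop tuples for which the predicate is false."""
    return (t for t in stream if predicate(t))


def join(stream, links):
    """Equijoin path(Z, D, Z2, C2) tuples with link(S, Z, C1) on Z."""
    for (z, d, z2, c2) in stream:
        for (s, link_z, c1) in links:
            if link_z == z:
                yield (s, z, c1, d, z2, c2)


links = [("a", "b", 1), ("c", "b", 2)]
paths = [("b", "d", "d", 3), ("b", "e", "e", 40)]


def is_cheap(c2):
    """Hypothetical selection predicate on the path cost C2."""
    return c2 < 10


# Selection applied after the join: every path tuple probes the links.
after = list(select(join(paths, links), lambda t: is_cheap(t[5])))
# Selection pushed upstream: the expensive path never reaches the join.
before = list(join(select(paths, lambda t: is_cheap(t[3])), links))

print(after == before)  # True
print(before)
# [('a', 'b', 1, 'd', 'd', 3), ('c', 'b', 2, 'd', 'd', 3)]
```

Both plans produce the same tuples, but the second one sends only one of the two path tuples into the join, which is the reduction in work that the optimization is after.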

Aggregate operations like MIN or COUNT are translated after equijoins and selections, since they operate on fields in the rule head. Aggregate elements generally hold internal state, and when a new tuple arrives, compute the aggregate incrementally. The final part of translating each rule is the addition of a “projection” element that constructs a tuple matching the head of the rule.
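A minimal sketch of these last two element types follows, assuming a MIN aggregate grouped by source and destination and the head of rule r2. The names `MinAggregate`, `push`, and `project_head` are hypothetical and do not come from the DN code base.

```python
class MinAggregate:
    """Incremental MIN: keeps the smallest value seen so far per group."""

    def __init__(self):
        self.best = {}  # internal state: group key -> current minimum

    def push(self, key, value):
        """Process one tuple; return the new minimum if it changed, else None."""
        if key not in self.best or value < self.best[key]:
            self.best[key] = value
            return value
        return None


def project_head(link_tuple, path_tuple):
    """Projection element for r2: build the head tuple path(S, D, Z, C)."""
    s, z, c1 = link_tuple
    _z, d, _z2, c2 = path_tuple
    return (s, d, z, c1 + c2)


min_cost = MinAggregate()
changes = [min_cost.push(("a", "c"), cost) for cost in (5, 3, 4)]
print(changes)  # [5, 3, None]

print(project_head(("a", "b", 1), ("b", "c", "c", 2)))  # ('a', 'c', 'b', 3)
```

The aggregate never rescans earlier tuples: each arrival is compared against the stored minimum for its group, and only a change in that minimum needs to be reported downstream.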

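[Figure 4.1: Rule strand for delta rule r2-1 in DN. The rule shown above the dataflow is $\Delta \mathit{path}^{new}(S,D,Z,C) \mathrel{:-} \mathit{link}(S,Z,C_1),\ \Delta \mathit{path}^{old}(Z,D,Z_2,C_2),\ C = C_1 + C_2$. Elements labeled in the figure: input paths ($\Delta \mathit{path}^{old}$), a Join on Z with link, a Project producing $\Delta \mathit{path}^{new}$, a Buffer, the path table, and output paths.]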

Figure 4.1 shows the dataflow realization for delta rule r2-1. The rule is repeated above the dataflow for convenience. The example rule strand receives new $\Delta \mathit{path}^{old}$ tuples generated in the previous iteration to generate new paths ($\Delta \mathit{path}^{new}$), which are then "wrapped-around" and inserted into the path table (with duplicate elimination) for further processing in the next iteration. In effect, semi-naïve evaluation achieves the computation of paths in synchronous rounds of increasing hop counts, where paths derived in the previous round are used to generate new paths in the next iteration.
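The round-by-round behavior can be made visible with a small sketch of the strand as a generator. The names `rule_strand` and `links_by_z`, the assumed base rule path(S,D,D,C) :- link(S,D,C), and the sample chain of links are all hypothetical. Every link is given cost 1 so that the cost of a path equals its hop count, and duplicate elimination is simplified to a set difference against the path table.

```python
def rule_strand(delta_path_old, links_by_z):
    """Strand for r2-1: join on Z, then project to the head (S, D, Z, C)."""
    for (z, d, _z2, c2) in delta_path_old:         # input delta tuples
        for (s, _z, c1) in links_by_z.get(z, ()):  # Join: index lookup on Z
            yield (s, d, z, c1 + c2)               # Project, with C = C1 + C2


# A chain a -> b -> c -> d with unit costs, so cost equals hop count.
links = [("a", "b", 1), ("b", "c", 1), ("c", "d", 1)]
links_by_z = {}
for link in links:
    links_by_z.setdefault(link[1], []).append(link)

path = {(s, d, d, c) for (s, d, c) in links}  # assumed base rule
delta = set(path)
rounds = [sorted(delta)]
while delta:
    new = set(rule_strand(delta, links_by_z)) - path  # duplicate elimination
    path |= new                                       # wrap-around into path
    delta = new
    if new:
        rounds.append(sorted(new))

for hops, tuples in enumerate(rounds, start=1):
    print(hops, tuples)
# 1 [('a', 'b', 'b', 1), ('b', 'c', 'c', 1), ('c', 'd', 'd', 1)]
# 2 [('a', 'c', 'b', 2), ('b', 'd', 'c', 2)]
# 3 [('a', 'd', 'b', 3)]
```

Each round consumes only the tuples that were new in the round before it, so the paths found in round n are exactly those with n hops.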
