
Event Filtering Algorithm

As an extension to the conjunctive counting approach (see Section 2.3.3 on page 40), our general Boolean filtering approach applies a three-step filtering process (predicate matching, candidate subscription matching, and final subscription matching). We illustrate an overview of this process in Figure 4.4.

4.3.1 Predicate Matching

In the predicate matching step, the filtering algorithm determines all predicates that are fulfilled by an incoming message. This task is performed by consulting the one-dimensional predicate indexes created in the pre-processing step (see Section 4.2.3): the filtering algorithm evaluates the one-dimensional indexes for all filter functions that are applicable to the attributes of the incoming message. The state of fulfillment of predicates is then recorded in a fulfilled predicate vector. Predicate matching is illustrated in the top part of Figure 4.4.

Figure 4.4: Overview of predicate matching, candidate subscription matching, and final subscription matching in the Boolean filtering algorithm. (Top: the event message is evaluated against the one-dimensional predicate indexes, yielding the fulfilled predicate vector. Middle: the predicate-subscription association table accumulates the hit vector, which is compared to the minimum predicate count vector by a greater-or-equal test. Bottom: the subscription location table gives access to the subscription trees of the candidates, which are then evaluated.)
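The predicate matching step can be sketched as follows. This is a minimal illustration, not the index implementation of Section 4.2.3: the classes `EqualityIndex` and `LessThanIndex`, the function `match_predicates`, and the use of integer predicate identifiers as vector positions are assumptions made for the example.

```python
from bisect import bisect_right


class EqualityIndex:
    """Hypothetical one-dimensional index for predicates `attribute == value`."""

    def __init__(self):
        self._by_value = {}

    def add(self, value, pred_id):
        self._by_value.setdefault(value, []).append(pred_id)

    def lookup(self, value):
        return self._by_value.get(value, [])


class LessThanIndex:
    """Hypothetical one-dimensional index for predicates `attribute < bound`."""

    def __init__(self):
        self._bounds = []  # sorted bounds
        self._pids = []    # predicate ids, parallel to _bounds

    def add(self, bound, pred_id):
        pos = bisect_right(self._bounds, bound)
        self._bounds.insert(pos, bound)
        self._pids.insert(pos, pred_id)

    def lookup(self, value):
        # `value < bound` holds for every bound strictly greater than value.
        return self._pids[bisect_right(self._bounds, value):]


def match_predicates(indexes, message, num_predicates):
    """Return the fulfilled predicate vector for one message."""
    fulfilled = [False] * num_predicates
    for attribute, value in message.items():
        # Only indexes for filter functions applicable to this attribute.
        for index in indexes.get(attribute, []):
            for pred_id in index.lookup(value):
                fulfilled[pred_id] = True
    return fulfilled


# Toy data (assumed, not taken from the running example).
fmt = EqualityIndex()
fmt.add("softcover", 0)
fmt.add("hardcover", 1)
price = LessThanIndex()
price.add(10.0, 2)
price.add(15.0, 3)

indexes = {"Format": [fmt], "Price": [price]}
message = {"Format": "softcover", "Price": 11.0}
fulfilled = match_predicates(indexes, message, num_predicates=4)
```

Each index is consulted once per attribute of the message, so the cost of this step depends on the number of attributes and the index lookups rather than on the number of registered subscriptions.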

4.3.2 Candidate Subscription Matching

The next step, candidate subscription matching, restricts the set of registered subscriptions to a set of candidate subscriptions that are potentially fulfilled by the incoming message. The determination of candidate subscriptions is based on the approach taken in the conjunctive counting algorithm (see Section 2.3.3).

At this point, the algorithm has determined the set of predicates that is fulfilled by the incoming message. Whether a predicate is fulfilled has been recorded in the fulfilled predicate vector (see Section 4.3.1). Based on this information and the populated predicate-subscription association table (see Figure 4.4 for an illustration and Section 4.2.3 for a description), BoP determines the number of fulfilled predicates per subscription. This task is performed by incrementing a counter in a hit vector, which contains one 1-byte integer value per subscription. Having processed all fulfilled predicates and evaluated their entries in the predicate-subscription association table, the hit vector states the total number of fulfilled predicates per subscription.

Based on this information, BoP then determines all candidate subscriptions: for each subscription, it compares the value in the hit vector to the value in the minimum predicate count vector (see Figure 4.4 for an illustration and Section 4.2.3 for a description). If the entry in the hit vector is greater than or equal to the entry in the minimum predicate count vector, a candidate subscription is found. The middle part of Figure 4.4 illustrates this candidate subscription matching process.
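The counting and comparison described above can be sketched as follows. The function name `candidate_subscriptions` and the list-based layout of the association table are assumptions for illustration; a `bytearray` stands in for the hit vector with one 1-byte counter per subscription.

```python
def candidate_subscriptions(fulfilled, association, min_count):
    """Return the ids of all candidate subscriptions.

    fulfilled   -- fulfilled predicate vector (list of bool, indexed by predicate id)
    association -- association[pred_id] lists the subscriptions containing that predicate
    min_count   -- minimum predicate count vector (indexed by subscription id)
    """
    # One 1-byte counter per subscription; a bytearray raises ValueError
    # on overflow past 255, which this sketch does not handle.
    hits = bytearray(len(min_count))
    for pred_id, is_fulfilled in enumerate(fulfilled):
        if is_fulfilled:
            for sub_id in association[pred_id]:
                hits[sub_id] += 1
    # Greater-or-equal test against the minimum predicate count vector.
    return [sub_id for sub_id, h in enumerate(hits) if h >= min_count[sub_id]]


# Toy data (assumed): predicate 1 is shared by subscriptions 0 and 1.
association = [[0], [0, 1], [1], [2]]
fulfilled = [True, True, False, False]
min_count = [2, 2, 1]
candidates = candidate_subscriptions(fulfilled, association, min_count)
# Hits are 2, 1, 0, so only subscription 0 reaches its minimum.
```

Only fulfilled predicates are visited, so subscriptions without any fulfilled predicate cost nothing beyond the final comparison.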

4.3.3 Final Subscription Matching

Having found the set of candidate subscriptions, the final part of the matching process evaluates this set against the incoming message. Using the subscription location table (see Figure 4.4 for an illustration and Section 4.2.3 for a description), BoP accesses the encoded subscription tree of a candidate. Then, the Boolean structure of the tree is evaluated against the message.

If the filter expression evaluates to true, the candidate is a matching subscription. For this evaluation, the filtering algorithm only needs to process the Boolean tree structure of a subscription but not its predicates: the value of the leaf nodes (i.e., the state of fulfillment of predicates) is already known and stored in the fulfilled predicate vector.
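A minimal sketch of this evaluation is given below. The tree encoding is an assumption made for the example: inner nodes are tuples `("and", ...)` or `("or", ...)`, leaves are integer predicate identifiers, and a dictionary stands in for the subscription location table. The actual encoding of subscription trees is not reproduced here, and negation is left out for brevity.

```python
def evaluate(tree, fulfilled):
    """Evaluate a subscription tree using only the fulfilled predicate vector."""
    if isinstance(tree, int):
        # Leaf node: the predicate's state is already known, no re-evaluation.
        return fulfilled[tree]
    operator, *children = tree
    if operator == "and":
        return all(evaluate(child, fulfilled) for child in children)
    if operator == "or":
        return any(evaluate(child, fulfilled) for child in children)
    raise ValueError(f"unknown operator: {operator}")


def final_matches(candidates, location_table, fulfilled):
    """Return the candidates whose subscription tree evaluates to true."""
    return [
        sub_id
        for sub_id in candidates
        if evaluate(location_table[sub_id], fulfilled)
    ]


# Toy data (assumed): two candidates, only the first one matches.
location_table = {
    0: ("and", 0, ("or", 1, 2)),
    1: ("and", 2, 3),
}
fulfilled = [True, False, True, False]
matches = final_matches([0, 1], location_table, fulfilled)
```

Because `all` and `any` short-circuit, a subtree is abandoned as soon as its result is determined, and no predicate is ever evaluated against the message a second time.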

We illustrate final subscription matching in the bottom part of Figure 4.4. The following example illustrates the overall filtering process:

Example 4.6 (Filtering of an event message) Let us consider event message e1, defined in Example 4.2 (page 97), and the registration of the three subscriptions s1 to s3 we have given in Section 3.3 (page 79). In the following, we refer to the predicates of these subscriptions by $p^i_j$, denoting predicate $p_j$ of subscription $s_i$.

The predicate matching step uses the one-dimensional predicate indexes to determine all fulfilled predicates. For the attribute-value pairs of message e1, these predicates are as follows:

• ∅ for (Category, Fantasy)
• $\{p^2_4\}$ for (Format, softcover)
• ∅ for (Special Attribute, none)
• $\{p^1_6, p^2_8, p^2_{12}\}$ for (Condition, used)
• ∅ for (Buy It Now, no)
• $\{p^1_4, p^1_5, p^2_6, p^2_7, p^2_{10}\}$ for (Price, 11.00)
• $\{p^1_2, p^2_2\}$ for (Ending Within, 6 hours)
• $\{p^3_6\}$ for (Bids, 0)
• $\{p^1_1, p^2_1\}$ for (Title, “Harry Potter and the Goblet of Fire”)
• $\{p^3_2\}$ for (Author, “JK Rowling”)

We now proceed to the candidate subscription matching step: summing up the number of fulfilled predicates for the three subscriptions results in five hits for s1, eight hits for s2, and two hits for s3. The hit vector (using a set notation) is thus {5, 8, 2}.

In our example, every predicate occurs in only one subscription. This is because we did not identify common predicates when introducing these classes. The filtering algorithm easily retrieves the information about the occurrence of predicates from the predicate-subscription association table. For example, $p^1_3$ and $p^2_5$ are internally assigned the same identifier.

For our three subscriptions, the minimal number of fulfilled predicates is as follows: pmin(s1) = 4, pmin(s2) = 5, and pmin(s3) = 3 (see Example 4.5 on page 106 for a calculation example). The minimum predicate count vector is thus (again using a set notation) {4, 5, 3}.

The last step of candidate subscription matching identifies candidate subscriptions by comparing the hit vector and the minimum predicate count vector. These candidates are s1 (5 ≥ 4) and s2 (8 ≥ 5), but s3 is not a candidate subscription (2 < 3).

Final subscription matching then evaluates the subscription trees of all candidate subscriptions (s1 and s2):

Subscription s1 evaluates to true and thus is both a candidate subscription and a fulfilled subscription. This is because $p^1_1$ and $p^1_2$, as well as $p^1_5$ and $p^1_6$, are fulfilled, leading to a subscription tree that evaluates to true.

However, the other candidate, subscription s2, evaluates to false. It is thus a candidate subscription but not a fulfilled subscription. Although $p^2_1$, $p^2_2$, and $p^2_4$ are fulfilled, neither both $p^2_9$ and $p^2_{10}$ nor both $p^2_{11}$ and $p^2_{12}$ are fulfilled. Hence, the subscription tree of s2 evaluates to false.
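The counting part of Example 4.6 can be reproduced with a few lines of code. The sketch below encodes each fulfilled predicate as a pair (i, j), meaning predicate j of subscription i, as read from the list in the example; the variable names are assumptions. The subscription trees of s1 and s2 are not restated in this section, so the final evaluation step is not included.

```python
# Fulfilled predicates per attribute of message e1, as pairs
# (subscription index i, predicate index j). Attributes without
# fulfilled predicates are omitted.
fulfilled_by_attribute = {
    "Format": [(2, 4)],
    "Condition": [(1, 6), (2, 8), (2, 12)],
    "Price": [(1, 4), (1, 5), (2, 6), (2, 7), (2, 10)],
    "Ending Within": [(1, 2), (2, 2)],
    "Bids": [(3, 6)],
    "Title": [(1, 1), (2, 1)],
    "Author": [(3, 2)],
}

# Minimum predicate counts pmin(s1), pmin(s2), pmin(s3) from the example.
min_count = {1: 4, 2: 5, 3: 3}

# Accumulate the hit vector: one counter per subscription.
hits = {sub: 0 for sub in min_count}
for predicates in fulfilled_by_attribute.values():
    for sub, _pred in predicates:
        hits[sub] += 1

# Greater-or-equal test yields the candidate subscriptions.
candidates = [sub for sub in sorted(hits) if hits[sub] >= min_count[sub]]

print(hits)        # {1: 5, 2: 8, 3: 2}
print(candidates)  # [1, 2]
```

The result agrees with the hit vector {5, 8, 2} and with s1 and s2 being the only candidates.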