8.7 Experimentation with Dynamic Root Merging

Experimentation with the dynamic root merging algorithm presented previously is similar to the add and add/remove scenarios presented in Chapter 4. Figure 8.6 illustrates the benchmark. First, a number of filters (up to 500) are generated using the workload generator. Each filter has a unique interface, which corresponds to the local or hierarchical scenario. The filters are added to a forest extended with the root merging algorithm. The merging algorithm maintains a merged root set when filters are added to and removed from the structure.

[Figure 8.6 (diagram): the workload generator supplies filters and notifications to a poset or forest and to a forest extended with the root merger. Each structure goes through (1) the add or add/remove scenario and (2) the matching test, followed by correctness testing.]

Figure 8.6: Add and add/remove scenario with the root merger algorithm.

Correctness of operation is ensured by testing that the merged set does not result in false negatives and positives, and ensuring that the internal data structures used by the merging algorithm are correct.
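The false negative and false positive check described above can be sketched as follows. This is a minimal illustration, not the test harness used in the experiments: it assumes filters are conjunctions of closed intervals stored as dictionaries, and the names `matches`, `matches_any`, and `check_merged_set` are hypothetical.

```python
def matches(filt, notification):
    """A filter maps attribute -> (lo, hi); every constraint must hold (assumed model)."""
    return all(
        attr in notification and lo <= notification[attr] <= hi
        for attr, (lo, hi) in filt.items()
    )


def matches_any(filters, notification):
    """True if at least one filter in the set matches the notification."""
    return any(matches(f, notification) for f in filters)


def check_merged_set(original, merged, notifications):
    """Return (false_negatives, false_positives) of `merged` relative to `original`."""
    false_negatives = [
        n for n in notifications
        if matches_any(original, n) and not matches_any(merged, n)
    ]
    false_positives = [
        n for n in notifications
        if matches_any(merged, n) and not matches_any(original, n)
    ]
    return false_negatives, false_positives


# Hypothetical data: two overlapping interval filters on one attribute.
original = [{"x": (0, 5)}, {"x": (3, 10)}]
notifications = [{"x": v} for v in range(-2, 13)]

perfect = check_merged_set(original, [{"x": (0, 10)}], notifications)
too_wide = check_merged_set(original, [{"x": (0, 12)}], notifications)
too_narrow = check_merged_set(original, [{"x": (0, 8)}], notifications)
```

In this sketch a perfect merger yields two empty lists, a merger that is too general reports false positives, and one that is too narrow reports false negatives.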

Each measurement was replicated 20 times and the predicates were randomly selected from the set of predicates {<, >, ≤, ≥, =, ≠, [a, b]} using a uniform distribution. The two benchmarked filter schemas were: a variable number of attribute filters (1-3), and a static number of attribute filters (2 and 3). The number of filters to be added and removed in the add/remove microbenchmark was 100.
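A workload of this shape could be generated roughly as shown below. This is only a sketch under stated assumptions: the attribute names (`x`, `y`, `z`), the integer value domain 0 to 100, and the function names are hypothetical and not taken from the actual workload generator; only the predicate set, the uniform choice, the 1-3 attribute filters, and the 20 replications come from the text.

```python
import random

# Predicate set from the text; the interval predicate [a, b] is modelled as "range".
PREDICATES = ["<", ">", "<=", ">=", "=", "!=", "range"]
ATTRIBUTES = ["x", "y", "z"]  # hypothetical attribute names


def make_attribute_filter(rng, name, lo=0, hi=100):
    """Build one attribute filter with a uniformly chosen predicate (value domain assumed)."""
    op = rng.choice(PREDICATES)
    if op == "range":
        a, b = sorted(rng.randint(lo, hi) for _ in range(2))
        return (name, op, (a, b))
    return (name, op, rng.randint(lo, hi))


def make_filter(rng, min_attrs=1, max_attrs=3):
    """Build a filter as a tuple of attribute filters on distinct attributes."""
    count = rng.randint(min_attrs, max_attrs)
    names = sorted(rng.sample(ATTRIBUTES, count))
    return tuple(make_attribute_filter(rng, name) for name in names)


def make_workload(seed, n_filters, min_attrs=1, max_attrs=3):
    """Generate a reproducible list of filters for one replication."""
    rng = random.Random(seed)
    return [make_filter(rng, min_attrs, max_attrs) for _ in range(n_filters)]


REPLICATIONS = 20
# Variable schema: 1-3 attribute filters per filter, one workload per replication.
workloads = [make_workload(seed, 500) for seed in range(REPLICATIONS)]
# Static schema with exactly two attribute filters per filter.
static_two = make_workload(0, 500, min_attrs=2, max_attrs=2)
```

Seeding each replication separately keeps the runs reproducible while still varying the workload between replications.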

Figure 8.7 presents the results for the variable number of attribute filters case. In this scenario, filter merging may be performed without significant overhead, because the root set is small due to covering and root filters have only a few attribute filters. Merging is very useful in this case and a constant or near-constant root set size is achieved, whereas the non-merged root set size is linear in the number of input filters.

[Figure 8.7 panels, each plotting Forest and Merged forest against the number of filters (100 to 500): add scenario time (ms); add scenario root set size (filters); add/remove scenario total covering operations; add/remove scenario total time (ms).]

Figure 8.7: Add and add/remove scenario for a variable number of attribute filters.

Figure 8.8 presents the results for two static attribute filters. Filter merging is also beneficial in this case, but has more overhead than in the previous microbenchmark. The merged filter set again has a near-constant size. The root set is larger than in the previous case, and filter merging cannot be performed as often due to the static number of attribute filters.

Figure 8.9 shows the results for three static attribute filters. In this case, the merging overhead is substantial and the root set size is not reduced. This demonstrates the effects of a non-mergeable workload.

8.8 Summary

The results show that covering and merging are very useful and give significant reduction of the filter set, especially with a variable number of attribute filters, because those filters with fewer attribute filters may cover other filters with more attribute filters. We also experimented with a static scenario, where the number of attribute filters per filter is fixed. The static scenario also gives good results for covering, but perfect merging does not perform well when the number of attribute filters grows.
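The observation that a filter with fewer attribute filters may cover one with more can be illustrated with a small sketch. It assumes conjunctive filters over closed intervals stored as dictionaries, which is a simplification of the filter model; `covers` and `root_set` are hypothetical names, and the incremental loop only stands in for the forest data structure.

```python
def covers(f, g):
    """f covers g if every constraint of f appears in g with an equal or narrower interval."""
    return all(
        attr in g and lo <= g[attr][0] and g[attr][1] <= hi
        for attr, (lo, hi) in f.items()
    )


def root_set(filters):
    """Return the filters that are not covered by any other filter (the minimal cover)."""
    roots = []
    for f in filters:
        if any(covers(r, f) for r in roots):
            continue  # f is already covered by an existing root
        # f is a new root; drop any existing roots that f covers
        roots = [r for r in roots if not covers(f, r)]
        roots.append(f)
    return roots


# Hypothetical filters: the single-attribute filter on x covers two more specific ones.
filters = [
    {"x": (0, 10), "y": (0, 5)},
    {"x": (0, 20)},
    {"x": (5, 15), "z": (1, 2)},
    {"y": (0, 3)},
]
roots = root_set(filters)
```

Here only `{"x": (0, 20)}` and `{"y": (0, 3)}` remain in the root set, since the filter on `x` alone covers both filters that add further constraints.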

When the number of filters per schema grows, the whole subscription space becomes covered, which we call subscription saturation. This motivates high-precision filters for a small number of filters, and more general filters when the subscription space becomes saturated.

[Figure 8.8 panels, each plotting Forest and Merged forest against the number of filters (100 to 500): add scenario time (ms); add scenario root set size (filters); add/remove scenario total covering operations; add/remove scenario total time (ms).]

Figure 8.8: Add and add/remove scenario for a static number of attribute filters (2).

[Figure 8.9 panels, each plotting Forest and Merged forest against the number of filters (100 to 500): add scenario time (ms); add scenario root set size (filters); add/remove scenario total covering operations; add/remove scenario total time (ms).]

Figure 8.9: Add and add/remove scenario for a static number of attribute filters (3).

The results indicate that filter merging is feasible and beneficial when used in conjunction with a data structure that returns the minimal covering set, and the filter set contains elements with a few attribute filters that cover more complex filters. The cost of unsubscription, or removing filters from the system, can be minimized by merging only elements in the minimal cover or root set. When a part of a merger is removed, the merger needs to be re-evaluated. Any delete or add operation outside the minimal cover does not require merging. For complex filter merging algorithms it is also possible to use a lazy strategy for deletions. In lazy operation, the system delays re-merging and counts the number of false positives. When a threshold is reached the minimal cover set or parts of it are merged and sent to relevant neighbours.
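The lazy deletion strategy can be sketched as follows. This is an illustrative model only: it treats every stored filter as part of the root set, uses one-dimensional closed intervals with a simple union as the merging function, merges additions eagerly, and counts updates instead of sending them to neighbours. The class and function names are hypothetical.

```python
def merge_intervals(intervals):
    """Perfectly merge overlapping closed intervals (stand-in for a merging algorithm)."""
    merged = []
    for lo, hi in sorted(intervals):
        if merged and lo <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], hi))
        else:
            merged.append((lo, hi))
    return merged


def in_interval(interval, value):
    lo, hi = interval
    return lo <= value <= hi


class LazyMerger:
    """Delays re-merging after deletions until enough false positives are observed."""

    def __init__(self, merge, matches, threshold):
        self.merge = merge
        self.matches = matches
        self.threshold = threshold
        self.components = []      # filters currently subscribed
        self.published = []       # merged set last sent to neighbours
        self.false_positives = 0
        self.updates_sent = 0

    def add(self, filt):
        self.components.append(filt)
        self._republish()         # assumption: additions are merged eagerly

    def remove(self, filt):
        self.components.remove(filt)  # lazy: the published set is left unchanged

    def on_notification(self, value):
        """Return True if the notification is delivered to a subscriber."""
        if not any(self.matches(f, value) for f in self.published):
            return False
        if any(self.matches(f, value) for f in self.components):
            return True
        self.false_positives += 1     # matched only the stale published set
        if self.false_positives >= self.threshold:
            self._republish()
        return False

    def _republish(self):
        self.published = self.merge(self.components)
        self.false_positives = 0
        self.updates_sent += 1
```

With a threshold of 2, removing a filter leaves the published merger in place until two notifications match it without matching any remaining filter; only then is the set re-merged and counted as a new update.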

Part V

Applications

Chapter 9 Collection and Object Synchronization Based on Context Information

We present a novel mechanism for collection and object synchronization based on context information. The mechanism is based on a distributed event system and uses event filters to represent context and realize context queries. The central operations of the system are storing and retrieving objects by their context. The new feature of the system is context-based synchronization, which allows synchronizing collections of objects continuously based on the given context. The system may also be used for context-based service provisioning. We present mechanisms for both collection and object synchronization. The former uses the publish/subscribe paradigm and the latter builds on an XML-aware file synchronizer. We focus on the first mechanism and present a context-aware photo library as a sample application.

9.1 Introduction

A number of core technologies are needed in order to realize the intelligent and adaptive services of tomorrow. Efficient and intelligent data synchronization is a basic property of current and future applications, especially in mobile and ubiquitous environments. Mobile phones, laptops, and PDAs have become commonplace. We are faced with the question of how to locate important data items and keep them synchronized on different devices. Furthermore, the rules for synchronization are dependent not only on the device, but also on the past, current, and future operating context. In this chapter, we present a middleware system and an API for creating and tracking object collections based on context queries, and then synchronizing the objects using a file synchronizer.

Context-awareness is considered to be an important property of future mobile applications [1]. In this chapter, context is represented by a set of dimensions that take either discrete or interval values. We focus on how context information may be used to synchronize objects and do not consider how the actual context information is acquired.

The proposed mechanism is based on three basic middleware services included in the Fuego middleware service set: the messaging service based on synchronous and asynchronous SOAP [71], the event service [129] that facilitates distributed publish/subscribe (pub/sub), and the XML-aware file synchronizer [82]. The messaging service is responsible for transporting information between known entities using, for example, explicit addresses or queue-names. The event service is responsible for decoupled, anonymous many-to-many information dissemination [51]. The XML-aware file synchronizer provides facilities for the synchronization of files and directories using XML directory trees and a tree reconciliation mechanism.

The contributions of this chapter are:

1. Using techniques from publish/subscribe systems to match and compare context information. This allows both point and subspace matching in the context/content space.

2. An API for tracking and synchronizing collections using context queries.

3. Using the distributed pub/sub to synchronize collections.

4. Using an XML-aware synchronizer to synchronize files.

The system may also be used for context-based service provisioning and context-based personalization of applications.
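Point and subspace matching over context dimensions with discrete or interval values can be sketched as below. This is a hypothetical illustration, not the API of the proposed system: the `Interval` type, the function names, and the `location` and `hour` dimensions are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Interval:
    lo: float
    hi: float


def value_within(constraint, value):
    """True if `value` (a discrete value or an Interval) lies inside `constraint`."""
    if isinstance(constraint, Interval):
        if isinstance(value, Interval):
            # Subspace case: the whole value interval must fit in the constraint.
            return constraint.lo <= value.lo and value.hi <= constraint.hi
        if isinstance(value, (int, float)):
            # Point case: a single numeric value.
            return constraint.lo <= value <= constraint.hi
        return False
    # Discrete constraint: only an equal discrete value matches.
    return constraint == value


def matches_context(query, context):
    """A query matches if every dimension it constrains is satisfied by the context."""
    return all(
        dim in context and value_within(constraint, context[dim])
        for dim, constraint in query.items()
    )


# Hypothetical context query: objects relevant at the office during working hours.
query = {"location": "office", "hour": Interval(9, 17)}

point = {"location": "office", "hour": 10}
subspace = {"location": "office", "hour": Interval(12, 14)}
overlapping = {"location": "office", "hour": Interval(16, 19)}
elsewhere = {"location": "home", "hour": 10}
```

In this sketch the point context and the enclosed subspace match the query, while the partially overlapping subspace and the context with a different discrete value do not.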

This chapter is structured as follows: in Section 9.2 we discuss how context information may be represented using event filters. Section 9.3 presents the proposed synchronization system and gives examples of context-based synchronization. In Section 9.4 we examine the sample application and Section 9.5 discusses related work. Finally, Section 9.6 presents the summary.
