3.3 Poset-derived Forests
3.3.2 Poset-derived Forest with Multiple Interfaces
The basic poset-derived forest does not take into account the interface processing present in the Siena filters poset. Each input node has one interface, but within the data structure a node may have several interfaces associated with it. The number of interfaces is bounded by the number n of nodes in the forest. Let [n] denote the set of the n smallest natural numbers, i.e. [n] = {0, ..., n − 1}.
Algorithm 3 Add and del procedures for the forest.
Let (F , ) be a poset-derived forest. It is assumed that there is an efficient way to find a node in F based on its identifier. In what follows, “larger” and “smaller” are to be taken with respect to the relation ⊒. We define the following algorithms, which take F and a filter x as input and output a poset-derived forest:
add(F, x): This algorithm maintains a current node during its execution. First, set the current node to be the imaginary root of F.
1. If x is already in the forest, return without changes.
2. Else if x is incomparable with all children of the current node, add x as a new child of the current node.
3. Else if x is larger than some child of the current node, move all children of the current node that are smaller than x to be children of x and make x a new child of the current node.
4. Else pick a child of the current node that is larger than x, set the current node to this picked child and repeat this procedure from step 2.
del(F, x): Let C be the set of children of x and r be the parent of x. Remove x, then run add for each element of C, starting from step 2 with r as the current node. Each element of C carries the whole subtree rooted at it when it is re-added. To preserve sibling-purity, any siblings of a relocated node that are smaller than that node must be relocated deeper into the tree using add.
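The four steps of add can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the implementation used in this work: filters are modelled as closed integer intervals, `covers` is a hypothetical stand-in for the covering relation, and `Node`, `contains`, and `add` are names chosen for the example. The del procedure and the sibling-purity relocation are omitted.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

Filter = Tuple[int, int]  # assumption: a toy filter is a closed integer interval


def covers(a: Filter, b: Filter) -> bool:
    """Return True if interval a covers interval b (a is "larger" or equal)."""
    return a[0] <= b[0] and b[1] <= a[1]


@dataclass
class Node:
    filt: Optional[Filter]  # None marks the imaginary root
    children: List["Node"] = field(default_factory=list)


def contains(node: Node, x: Filter) -> bool:
    """Linear search; the text assumes a more efficient lookup by identifier."""
    return any(c.filt == x or contains(c, x) for c in node.children)


def add(root: Node, x: Filter) -> None:
    if contains(root, x):  # step 1: x is already in the forest
        return
    cur = root
    while True:
        smaller = [c for c in cur.children if covers(x, c.filt)]
        larger = [c for c in cur.children if covers(c.filt, x)]
        if not smaller and not larger:  # step 2: incomparable with all children
            cur.children.append(Node(x))
            return
        if smaller:  # step 3: x adopts every smaller child
            rest = [c for c in cur.children if not covers(x, c.filt)]
            cur.children = rest + [Node(x, smaller)]
            return
        cur = larger[0]  # step 4: descend and repeat from step 2
```

Inserting (0, 10), (2, 3), (20, 30), and then (0, 100) into an empty forest leaves (0, 100) as the only child of the imaginary root, with (0, 10) and (20, 30) below it and (2, 3) below (0, 10).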
Definition 3.10 A triple (F , , G) is a poset-derived forest with multiple interfaces, if
1. (F , ) is a poset-derived forest.
2. G is a function that associates a subset of [n] with every filter, and G(x) ≠ ∅ if and only if x ∈ F .
3. If x ∈ F and an interface k ∈ G(x), then k ∉ G(y) holds for all descendants y of x in the relation .
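Property 3 of the definition can be checked mechanically. The sketch below assumes a hypothetical representation in which `children` maps a node identifier to its list of children and `G` maps each node to its set of interfaces; the function name is chosen for this example only.

```python
def property3_violations(children, G, roots):
    """Return (node, interface) pairs where an interface already
    appears on one of the node's ancestors."""
    violations = []

    def walk(node, seen_above):
        # Any interface shared with an ancestor breaks property 3.
        for k in sorted(G[node] & seen_above):
            violations.append((node, k))
        for child in children.get(node, ()):
            walk(child, seen_above | G[node])

    for r in roots:
        walk(r, frozenset())
    return violations


# Hypothetical forest: a -> b -> d and a -> c
children = {"a": ["b", "c"], "b": ["d"]}
G = {"a": {0}, "b": {1}, "c": {2}, "d": {0}}
print(property3_violations(children, G, ["a"]))
```

In this example node d reuses interface 0, which already belongs to its ancestor a, so the check reports the pair ("d", 0).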
To satisfy Property 3.2 we extend the add and del operations accordingly. A node is not inserted if a covering node with the same interface is already present. If a node is inserted, nodes that are covered by the new node and have the same interface are removed. A forest is either redundant or non-redundant. A redundant forest may contain a redundant filter, whereas a non-redundant forest may not contain such a filter. A filter (F1, i) is redundant if there exists another filter (F2, i) with the same interface i for which F2 ⊒ F1.
The process of removing redundant filters is called interface pruning or interface elimination and it involves scanning the data structure for filters that are covered by the input filter and have been received from the same interface.
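The pruning rule can be illustrated on a flat list of (filter, interface) pairs. This is only a sketch: the forest structure is ignored, filters are assumed to be closed integer intervals, and `covers` and `add_with_pruning` are hypothetical names.

```python
def covers(a, b):
    """Assumption: filter a covers filter b if interval a contains interval b."""
    return a[0] <= b[0] and b[1] <= a[1]


def add_with_pruning(entries, new):
    """Return the entry list after adding `new` with interface pruning."""
    f, i = new
    # Do not insert if a covering filter from the same interface exists.
    if any(j == i and covers(g, f) for g, j in entries):
        return list(entries)
    # Otherwise drop filters from the same interface that the new one covers.
    kept = [(g, j) for g, j in entries if not (j == i and covers(f, g))]
    kept.append(new)
    return kept


entries = [((0, 5), 1), ((2, 3), 2)]
entries = add_with_pruning(entries, ((0, 10), 1))
print(entries)
```

Here the filter (0, 5) from interface 1 is eliminated because the new filter (0, 10) arrives from the same interface and covers it, while (2, 3) from interface 2 is kept.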
Operations
The add operation inserts a new filter x into the forest and, if the interface c of the input filter is also new, inserts c into the set of interfaces. More formally, the add operation creates a new forest (F′, ′, G′), where F′ = F ∪ {x}, G′(x) = G(x) ∪ {c}, and G′(y) = G(y) for y ≠ x. Similarly, the del operation results in a new structure without the deleted filter.
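The set-level effect of add on F and G can be written out directly. The sketch below assumes that F is a frozenset of filter identifiers and G a dictionary from identifiers to frozensets of interfaces; it deliberately ignores the update of the forest relation, and `add_filter` is a hypothetical name.

```python
def add_filter(F, G, x, c):
    """Return (F', G') with F' = F ∪ {x}, G'(x) = G(x) ∪ {c},
    and G'(y) = G(y) for every y other than x."""
    F_new = F | {x}
    G_new = dict(G)  # copy, so the input structure is left unchanged
    G_new[x] = G.get(x, frozenset()) | {c}
    return F_new, G_new


F = frozenset({"a"})
G = {"a": frozenset({0})}
F2, G2 = add_filter(F, G, "b", 1)    # new filter b from interface 1
F3, G3 = add_filter(F2, G2, "a", 2)  # existing filter a seen on interface 2
```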
The elimination of redundant filters may be implemented in add, del, or both. We distinguish two interesting cases. First, for filters from local clients no elimination is necessary. Second, for hierarchical and peer-to-peer routing the structure should be non-redundant.
Interface-based Balancing
Interface-based balancing is a technique for optimizing interface-specific operations in the forest. Each node maintains a set of interfaces used by its descendants. This set acts as an index that may be used to cluster filters from the same interface near each other. The add operation sorts the set representing the current level of the forest using the interface of the node to be inserted. Nodes that have descendants of the same interface are processed first. This makes it possible to quickly find filters from a specific interface in the forest. A forest that implements the interface index is called a balanced forest (BF).
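The ordering of a level by interface can be sketched with a stable sort. The representation is an assumption made for this example: `level` is a list of node identifiers and `index` maps each node to the set of interfaces used by its descendants.

```python
def order_level(level, index, interface):
    """Place nodes whose descendants use `interface` first.

    sorted() is stable, so the original relative order is kept within
    each of the two groups (False sorts before True).
    """
    return sorted(level, key=lambda node: interface not in index.get(node, set()))


level = ["a", "b", "c"]
index = {"a": {1}, "b": {2}, "c": {2, 3}}
print(order_level(level, index, 2))
```

For interface 2 the nodes b and c are visited before a, because only their subtrees contain filters from that interface.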
We use the index presented in Definition 3.11 to optimize the performance of the data structure. For each filter in the structure the index keeps a record of the interfaces of its descendants. The index is updated for every addition and deletion. The add operation uses the index in deciding which subtree to traverse. An interface index entry is not necessarily needed for leaf nodes. The index requires at most n − 1 entries, where n is the number of nodes in the forest, and n bits per entry if each node has a unique interface. The total number of bits required by the index in the worst case is therefore n(n − 1). The index may be implemented using a bit vector of at most k bits for representing the interfaces, where k is the maximum number of interfaces in the forest.
Definition 3.11 Interface-index(x): the input is node x and the output is the set of interfaces used by x’s descendants.
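Definition 3.11 can be computed by a plain traversal, and the result packed into a bit vector. The sketch assumes the hypothetical representation of a `children` map and an interface map `G` with interfaces drawn from [n]; a Python integer stands in for the bit vector.

```python
def interface_index(children, G, x):
    """Return the set of interfaces used by the descendants of x
    (x itself is excluded)."""
    result = set()
    stack = list(children.get(x, ()))
    while stack:
        node = stack.pop()
        result |= G[node]
        stack.extend(children.get(node, ()))
    return result


def to_bits(interfaces):
    """Pack a set of interface numbers into an integer bit vector."""
    bits = 0
    for k in interfaces:
        bits |= 1 << k
    return bits


children = {"a": ["b", "c"], "b": ["d"]}
G = {"a": {0}, "b": {1}, "c": {2}, "d": {3}}
print(interface_index(children, G, "a"))            # interfaces below a
print(bin(to_bits(interface_index(children, G, "b"))))
```

A leaf such as d has an empty index, which is consistent with the observation that leaf nodes do not necessarily need an entry.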
The maintenance of the index incurs both memory and processing overhead. During add and del operations the index is updated from the inserted node up to the root, so the depth of insertion is important. The index update is simple to implement for add and del: the index is updated for the input node and for all of its predecessors.
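The upward update can be sketched as follows, assuming nodes with parent pointers. The class and function names are hypothetical, and recomputing each predecessor's entry from its children after a deletion is one possible strategy rather than the one prescribed here. The cost of both updates is proportional to the depth of the affected node.

```python
class Node:
    def __init__(self, name, interfaces, parent=None):
        self.name = name
        self.interfaces = set(interfaces)
        self.parent = parent
        self.children = []
        self.index = set()  # interfaces used by descendants
        if parent is not None:
            parent.children.append(self)


def update_after_add(node):
    """Propagate the interfaces at and below `node` to all predecessors."""
    carried = node.interfaces | node.index
    anc = node.parent
    while anc is not None:
        anc.index |= carried
        anc = anc.parent


def update_after_del(parent):
    """Recompute the index of `parent` and of every predecessor above it.

    A recomputation is used because another descendant may still
    use an interface of the deleted node.
    """
    anc = parent
    while anc is not None:
        fresh = set()
        for child in anc.children:
            fresh |= child.interfaces | child.index
        anc.index = fresh
        anc = anc.parent
```

After a node is detached from its parent's child list, calling `update_after_del` on that parent restores the index along the path to the root.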