On the necessity of many leaves - Binary search trees, rectangles and patterns

In this section we study condition (i) of Theorem 3.4. This condition can be interpreted as saying that the number of leaves in the after-tree $Q$ or the number of zigzags in the search path $P$ must be proportional to $|P|$.

Figure 3.4: Illustration of the proof of Theorem 3.20. Tree after accessing node $i-1$, before accessing $i$. Wing partition shown with shaded ellipses.

The fact that condition (ii) of Theorem 3.4 alone is not sufficient for an algorithm to satisfy the access lemma follows from the easy observation that Move-to-root satisfies condition (ii), but not the access lemma. Informally, we can say that an algorithm must “do something else” besides being local (= monotone). It would be desirable to show that this “something” must be exactly condition (i) of Theorem 3.4. A conclusive statement in this direction would say that “if a sufficiently high fraction of the transformations done by $\mathcal{A}$ do not satisfy condition (i), then the access lemma cannot hold”. Perhaps most insight would be gained from the description of a global adversary strategy that would force any algorithm that consistently violates (i) to have high total cost.

At this point, we are unable to prove such a statement. Instead, we relate condition (i) to a different (reasonable) measure of efficiency: the sequential access condition. (Recall that $\mathcal{A}$ satisfies the sequential access condition if, from every initial tree over $[n]$, it can serve the sequence $(1, \dots, n)$ with cost $O(n)$.) We show the following theorem.

Theorem 3.20. If for all after-trees $Q$ created by algorithm $\mathcal{A}$ it holds that (i) $Q$ can be decomposed into $O(1)$ monotone sets, and (ii) the number of leaves of $Q$ is at most $n^{o(1)}$, then $\mathcal{A}$ does not satisfy the sequential access condition.

Observe that the value $n$ in Theorem 3.20 is the global number of nodes (not just the number of nodes $|Q|$ on the search path). Before proving Theorem 3.20, we state the open question of whether the result can, in some way, be improved.

Problem 35. Can Theorem 3.20 be strengthened in any of the following ways?

1. Involving in the statement (instead of the sequential access condition) the balance condition, the access lemma, or some other measure of efficiency.

2. Involving in the statement the quantity z (number of zigzags).

3. Relaxing the condition that every transformation must create only few leaves.

4. Relaxing the dependence on the monotonicity condition.

5. Relaxing the bound $n^{o(1)}$ to, say, $o(n)$ or $o(|Q|)$.

Despite the shortcomings of Theorem 3.20, there are also some apparent strengths of this result (which can be seen as another partial converse of Theorem 3.4). First, the statement refers to the sequential access property, without relying on the sum-of-logs potential, or on any other proof technique. Second, as the quantity $n^{o(1)}$ involves $n$, and not $|Q|$, the result holds regardless of what the algorithm does when $|Q| \le n^{o(1)}$, i.e. when the search path is very short. The remainder of the section is devoted to the proof of Theorem 3.20.

Let $R$ be a BST over $[n]$. We call a maximal left-leaning path of $R$ a wing of $R$. More precisely, a wing of $R$ is a set $\{x_1, \dots, x_k\} \subseteq [n]$, with $x_1 < \dots < x_k$, such that $x_1$ has no left child, $x_k$ is either the root of $R$ or the right child of its parent, and $x_i$ is the left child of $x_{i+1}$ for all $1 \le i < k$. A wing may consist of a single element. Observe that the wings of $R$ partition $[n]$ in a unique way, and we call the set of wings of $R$ the wing partition of $R$, denoted as $\mathrm{wp}(R)$. We define a potential function $\Psi$ over a BST $R$ as follows:

$$\Psi(R) = \sum_{w \in \mathrm{wp}(R)} |w| \cdot \log |w|.$$
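The two definitions above can be made concrete with a minimal sketch. It assumes a plain pointer-based BST and base-2 logarithms; the names `Node`, `wing_partition`, `potential` and `left_path` are illustrative and do not come from the text. The sketch uses the fact that the topmost node of every wing is either the root or a right child.

```python
import math
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Node:
    """Illustrative BST node (hypothetical helper type)."""
    key: int
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def wing_partition(root: Optional[Node]) -> List[List[int]]:
    """Return the wings of the tree, each as an increasing list of keys."""
    wings = []
    # Tops of wings are the root and every right child.
    tops = [root] if root is not None else []
    while tops:
        node = tops.pop()
        wing = []
        # Follow left children until a node without a left child is reached.
        while node is not None:
            wing.append(node.key)
            if node.right is not None:
                tops.append(node.right)  # a right child starts a new wing
            node = node.left
        wings.append(wing[::-1])  # collected top-down, so reverse to x_1 < ... < x_k
    return wings


def potential(root: Optional[Node]) -> float:
    """Sum of |w| * log|w| over all wings w (base-2 logarithm assumed)."""
    return sum(len(w) * math.log2(len(w)) for w in wing_partition(root))


def left_path(n: int) -> Optional[Node]:
    """Left-leaning path over 1..n: n is the root and 1 is the leaf."""
    root = None
    for key in range(1, n + 1):
        root = Node(key, left=root)
    return root
```

Under these assumptions, a left-leaning path over $[n]$ forms a single wing with potential $n \log n$, while a right-leaning path consists of $n$ singleton wings and has potential 0.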

Let $T_0$ be a left-leaning path over $[n]$ (i.e. $n$ is the root and 1 is the leaf). Consider a strict online BST algorithm $\mathcal{A}$ with the access-to-root property. Suppose that $\mathcal{A}$ accesses the elements of $[n]$ in sequential order, starting with $T_0$ as initial tree. Let $T_i$ denote the BST after accessing element $i$. Then $T_i$ has $i$ as the root, and the elements yet to be accessed (i.e. $\{i+1, \dots, n\}$) form the right subtree of the root, denoted $R_i$. To avoid treating $T_0$ separately, we augment it with a “virtual root” 0. This node plays no role in subsequent accesses, and it only adds a constant one to the overall access cost.

Using the previously defined potential function, we denote $\Psi_i = \Psi(R_i)$. We make the following easy observations: $\Psi_0 = n \log n$, and $\Psi_n = 0$.

Next, we look at the change in potential due to the restructuring after accessing element $i$. Let $P_i = (x_1, x_2, \dots, x_{n_i})$ be the search path when accessing $i$ in $T_{i-1}$, and let $n_i$ denote its length, i.e. $x_1 = i-1$ and $x_{n_i} = i$. Observe that the set $P'_i = P_i \setminus \{x_1\}$ is a wing of $T_{i-1}$.

Let $Q_i$ be the after-tree resulting from the re-arranging of the path $P_i$. Observe that the root of $Q_i$ is $i$, and the left child of $i$ in $Q_i$ is $i-1$. We denote the tree $Q_i \setminus \{i-1\}$ as $Q'_i$, and the tree $Q'_i \setminus \{i\}$, i.e. the right subtree of $i$ in $Q_i$, as $Q''_i$.

The crucial observation of the proof is that for an arbitrary wing $w \in \mathrm{wp}(T_i)$, the following holds: (i) either $w$ was not changed when accessing $i$, i.e. $w \in \mathrm{wp}(T_{i-1})$, or (ii) $w$ contains a portion of $P'_i$, possibly concatenated with an earlier wing, i.e. there exists some $w' \in \mathrm{wp}(Q'_i)$ such that $w' \subseteq w$. In this case, we denote as $\mathrm{ext}(w')$ the extension of $w'$ to a wing of $\mathrm{wp}(T_i)$, i.e. $\mathrm{ext}(w') = w \setminus w'$, and either $\mathrm{ext}(w') = \emptyset$, or $\mathrm{ext}(w') \in \mathrm{wp}(T_{i-1})$.

Now we bound the change in potential $\Psi_i - \Psi_{i-1}$. Wings that did not change during the restructuring (i.e. those of type (i)) do not contribute to the potential difference. Also note that $i$ contributes to $\Psi_{i-1}$, but not to $\Psi_i$. Thus, we have for $1 \le i \le n$, assuming that $0 \log 0 = 0$, and denoting $f(x) = x \log x$:

$$\Psi_i - \Psi_{i-1} = \sum_{w' \in \mathrm{wp}(Q''_i)} \Big( f\big(|w'| + |\mathrm{ext}(w')|\big) - f\big(|\mathrm{ext}(w')|\big) \Big) - f(n_i - 1).$$

By simple manipulation, for $1 \le i \le n$:

$$\Psi_i - \Psi_{i-1} \ge \sum_{w' \in \mathrm{wp}(Q''_i)} f\big(|w'|\big) - f(n_i - 1).$$
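The "simple manipulation" is presumably the superadditivity of $f$; the following short derivation is a sketch of that step, not part of the original text. For $a, b \ge 0$, with the convention $0 \log 0 = 0$ and using that $\log$ is increasing:

```latex
\begin{align*}
f(a+b) &= (a+b)\log(a+b) \\
       &= a\log(a+b) + b\log(a+b) \\
       &\ge a\log a + b\log b \;=\; f(a) + f(b).
\end{align*}
% Applying this with a = |w'| and b = |ext(w')| gives, for every wing w':
\[
f\big(|w'| + |\mathrm{ext}(w')|\big) - f\big(|\mathrm{ext}(w')|\big) \;\ge\; f\big(|w'|\big).
\]
```

Summing this inequality over all $w' \in \mathrm{wp}(Q''_i)$ yields the second bound from the first.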

By convexity of $f$, and observing that $|Q''_i| = n_i - 2$, we have

$$\Psi_i - \Psi_{i-1} \ge \big|\mathrm{wp}(Q''_i)\big| \cdot f\!\left(\frac{n_i - 2}{\big|\mathrm{wp}(Q''_i)\big|}\right) - f(n_i - 1) = (n_i - 2) \cdot \log \frac{n_i - 2}{\big|\mathrm{wp}(Q''_i)\big|} - f(n_i - 1).$$

Lemma 3.21. If $R$ has right-depth $m$ and $k$ leaves, then $|\mathrm{wp}(R)| \le mk$.

Proof. For a wing $w$, let $\ell(w)$ be any leaf in the subtree rooted at the node of maximum depth in the wing. Clearly, for any leaf $\ell$ there can be at most $m$ wings $w$ with $\ell(w) = \ell$. The claim follows.

Thus, $\big|\mathrm{wp}(Q''_i)\big| \le n^{o(1)}$. Summing the potential differences over $i$, we get

$$\Psi_n - \Psi_0 = -n \log n \ge -\sum_{i=1}^{n} n_i \log\big(n^{o(1)}\big) - O(n).$$

Denoting the total cost of algorithm $\mathcal{A}$ on the access sequence $(1, \dots, n)$ as $C$, we obtain $C = \sum_{i=1}^{n} n_i = n \cdot \omega(1)$. This shows that $\mathcal{A}$ does not satisfy the sequential access property.
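The final bound on $C$ can be spelled out as follows. This is a sketch of the omitted arithmetic, not part of the original text; it only uses the summed inequality above together with $\log\big(n^{o(1)}\big) = o(\log n)$.

```latex
\begin{align*}
-n\log n \;&\ge\; -\sum_{i=1}^{n} n_i \log\big(n^{o(1)}\big) - O(n)
          \;=\; -C \cdot o(\log n) - O(n), \\
C \cdot o(\log n) \;&\ge\; n\log n - O(n), \\
C \;&\ge\; \frac{n\log n - O(n)}{o(\log n)} \;=\; n \cdot \omega(1).
\end{align*}
```

Since the sequential access condition requires cost $O(n)$, a total cost of $n \cdot \omega(1)$ rules it out.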
