Direct Sum for Richness - Lower Bound Techniques for Data Structures

We now return to our original goal of understanding the complexity of LSD. As with indexing, we begin by considering the upper bounds. Armed with a good understanding of the upper bound, the lower bound will become very intuitive.

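Figure 5-2: [Plot of the trade-off curve between $A$ (Alice's communication; marked values $1$, $\Theta(n)$, $\Theta(n \lg \frac{u}{n})$) and $B$ (Bob's communication; marked values $1$, $\Theta(n)$, $\Theta(u)$).]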

Our protocol is a simple generalization of the protocol for Indexing, in which Alice sent as many high-order bits of her value as she could afford. Formally, let $k \ge n$ be a parameter. We break the universe $[u]$ into $k$ blocks of size $\Theta(u/k)$. The protocol proceeds as follows:

1. Alice sends the set of blocks in which her elements lie. This takes $A = O\big(\lg \binom{k}{n}\big) = O\big(n \lg \frac{k}{n}\big)$ bits.

2. For every block containing one of Alice's elements, Bob sends a vector of $\Theta(u/k)$ bits, indicating which elements are in $T$. This takes $B = n \cdot \frac{u}{k}$ bits.

3. Alice replies with one more bit giving the answer.
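The three steps above can be sketched as a small simulation. This is only an illustration: the function name `lsd_protocol` and its parameters are hypothetical, the sets are assumed to be subsets of `range(u)` with `len(S) <= k <= u`, and Alice's first message is charged as the cost of naming an $n$-subset of the $k$ blocks.

```python
from math import ceil, comb, log2

def lsd_protocol(S, T, u, k):
    """Simulate the three steps; return (disjoint, alice_bits, bob_bits).

    Hypothetical helper: S and T are sets of integers in range(u),
    and len(S) <= k <= u is assumed.
    """
    n = len(S)
    block_size = ceil(u / k)                  # blocks of size about u/k

    # Step 1: Alice names the blocks that contain her elements.
    # Assumption: charged as the cost of naming an n-subset of the k blocks.
    blocks = sorted({x // block_size for x in S})
    alice_bits = ceil(log2(comb(k, n)))

    # Step 2: Bob answers with one membership bit vector per named block.
    vectors = {
        b: [(b * block_size + j) in T for j in range(block_size)]
        for b in blocks
    }
    bob_bits = len(blocks) * block_size

    # Step 3: Alice checks her elements against the vectors and sends one bit.
    disjoint = not any(vectors[x // block_size][x % block_size] for x in S)
    alice_bits += 1
    return disjoint, alice_bits, bob_bits


print(lsd_protocol({1, 9}, {2, 3, 9}, u=16, k=4))   # (False, 4, 8)
```

In this run the element 9 lies in both sets, so the answer is "not disjoint"; Bob pays for two blocks of four bits each.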

To compute the trade-off, eliminate the parameter $k$ between $A$ and $B$: we have $k = n \cdot 2^{O(A/n)}$ and thus $B = u / 2^{O(A/n)}$. Values of $A = o(n)$ are ineffective: Bob has to send $\Theta(u)$ bits, just as in the trivial protocol in which he describes his entire set. Similarly, to achieve any $B = o(n)$, Alice has to send $\Theta(n \lg \frac{u}{n})$ bits, which allows her to describe her set entirely. The trade-off curve is plotted symbolically in Figure 5-2.
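To see the two extremes of the curve numerically, one can tabulate the cost of the block protocol for a few values of $k$. The helper `protocol_cost` and the sample values of `n` and `u` are assumptions made for this illustration, and the constants hidden in the $O$-notation are ignored.

```python
from math import ceil, comb, log2

def protocol_cost(n, u, k):
    """Bits sent by Alice and Bob in the block protocol with k blocks (n <= k <= u)."""
    alice = ceil(log2(comb(k, n))) + 1    # which blocks, plus the answer bit
    bob = n * ceil(u / k)                 # one bit vector per named block
    return alice, bob

n, u = 16, 2**16
ks = (n, 4 * n, 64 * n, u)
costs = [protocol_cost(n, u, k) for k in ks]
for k, (alice, bob) in zip(ks, costs):
    print(f"k = {k:6d}   A = {alice:4d}   B = {bob:6d}")
```

With $k = n$ Alice sends almost nothing and Bob sends about $u$ bits; with $k = u$ Bob sends only $n$ bits while Alice essentially describes her whole set.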

5.3.1 A Direct Sum of Indexing Problems

Figure 5-3: [Diagram of the sets $S$ (Alice) and $T$ (Bob).]

We now aim to prove a lower bound matching the trade-off from above. This can be done by the elementary richness argument we used for Indexing, but the proof requires some careful bounding of binomial coefficients that describe the rectangle size. Instead, we choose to analyze LSD in an indirect but clean way, which also allows us to introduce an interesting topic in communication complexity: direct sum problems.

Thinking back to our LSD protocol, the difficult case is when the values in Alice's set fall into different blocks (if two values are in the same block, Bob saves some communication because he only needs to describe one block instead of two). This suggests constructing a hard instance by breaking the universe into $n$ blocks, and placing one of Alice's values in each block (Figure 5-3). Intuitively, this LSD problem consists of $n$ independent copies of Indexing: each of Alice's values indexes into a vector of $u/n$ bits, describing a block of Bob's set. The sets $S$ and $T$ are disjoint if and only if all of the $n$ indices hit elements outside Bob's set (which we can indicate by a "one" in the vector being indexed). Thus, in our family of instances, the LSD query has become the logical AND of $n$ Indexing queries.
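The correspondence between this family of LSD instances and $n$ Indexing queries can be checked exhaustively on a tiny example. The sketch below is illustrative only; the helper names `indexing` and `build_lsd_instance` are hypothetical, and a bit value of 1 is taken to mean "outside Bob's set", as in the text.

```python
from itertools import product

def indexing(x, y):
    """Indexing query: the bit of Bob's vector y at Alice's index x."""
    return y[x]

def build_lsd_instance(xs, ys):
    """Place Alice's i-th value in block i; a 0 bit in ys[i] marks an element of T."""
    m = len(ys[0])                                    # block size u/n
    S = {i * m + x for i, x in enumerate(xs)}
    T = {i * m + j
         for i, y in enumerate(ys)
         for j, bit in enumerate(y) if bit == 0}
    return S, T

# Exhaustive check for n = 2 blocks of size m = 3.
n, m = 2, 3
vectors = list(product((0, 1), repeat=m))
for xs in product(range(m), repeat=n):
    for ys in product(vectors, repeat=n):
        S, T = build_lsd_instance(xs, ys)
        # Disjointness holds exactly when every Indexing query returns 1.
        assert S.isdisjoint(T) == all(indexing(x, y) for x, y in zip(xs, ys))
```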

Definition 5.5. Given a communication problem $f : X \times Y \to \{0, 1\}$, let the communication problem $\bigwedge^n f : X^n \times Y^n \to \{0, 1\}$ be defined by $\big(\bigwedge^n f\big)(\vec{x}, \vec{y}) = \prod_i f(x_i, y_i)$.
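As a minimal sketch of this definition, the $n$-wise AND can be written as a higher-order function. The name `and_n` and the toy base problem `differ` are assumptions chosen for illustration; they do not come from the text.

```python
def and_n(f):
    """Return the n-wise AND of f, acting on equal-length input vectors."""
    def combined(xs, ys):
        if len(xs) != len(ys):
            raise ValueError("Alice and Bob must hold the same number of inputs")
        result = 1
        for x, y in zip(xs, ys):
            result *= f(x, y)             # product of 0/1 values = logical AND
        return result
    return combined

# Toy base problem (an assumption for illustration): f(x, y) = 1 iff x != y.
differ = lambda x, y: int(x != y)
and_differ = and_n(differ)

print(and_differ((1, 2, 3), (3, 1, 2)))   # 1: every pair differs
print(and_differ((1, 2, 3), (3, 2, 1)))   # 0: the middle pair collides
```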

This is an example of a direct sum problem, in which Alice and Bob each receive $n$ independent inputs; the players want to output an aggregate (in this case, the logical AND) of the function $f$ applied to each pair of corresponding inputs. As we have explained above, $\bigwedge^n$ Indexing is a special case of LSD.

Intuitively, an $n$-wise direct sum problem should have a lower bound that is $n$ times larger than the original. While it can be shown that this property is not always true, we can prove that any richness lower bound gets amplified $n$-fold, in the following sense:

Theorem 5.6. Let $f : X \times Y \to \{0, 1\}$ be $[\rho|X|, v]$-rich, and assume $\bigwedge^n f$ has a communication protocol in which Alice sends $A = n \cdot a$ bits and Bob sends $B = n \cdot b$ bits. Then $f$ has a 1-rectangle of size $\rho^{O(1)} |X| / 2^{O(a)} \times v / 2^{O(a+b)}$.

Before we prove the theorem, let us see that it implies an optimal lower bound for LSD. We showed that Indexing is $\big[\frac{|X|}{2}, \frac{|Y|}{2}\big]$-rich. Then, if LSD has a protocol in which Alice sends $A$ bits and Bob sends $B$ bits, the theorem finds a 1-rectangle of Indexing of size $|X| / 2^{O(A/n)}$ by $|Y| / 2^{O((A+B)/n)}$. But we showed that any 1-rectangle $\mathcal{X} \times \mathcal{Y}$ must have $|\mathcal{Y}| \le |Y| / 2^{|\mathcal{X}|}$. Comparing the exponents gives $O\big(\frac{A+B}{n}\big) \ge |\mathcal{X}| = |X| / 2^{O(A/n)}$. The instances of Indexing are on blocks of size $u/n$, thus $|X| = \frac{u}{n}$, and we obtain the trade-off:

Theorem 5.7. Fix $\delta > 0$. In a deterministic protocol for LSD, either Alice sends $\Omega\big(n \lg \frac{u}{n}\big)$ bits, or Bob sends $n \cdot \big(\frac{u}{n}\big)^{1-\delta}$ bits.
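The calculation behind this trade-off can be spelled out as follows. This is a sketch only: $c_1$ and $c_2$ are names introduced here for the constants hidden in the $O$-notation of the rectangle bound, $u$ is the universe size, and constant factors in the final bound are not tracked.

```latex
% c_1, c_2 > 0 are assumed names for the constants hidden in O(A/n) and O((A+B)/n).
\begin{align*}
|\mathcal{X}| &\ge \frac{|X|}{2^{c_1 A/n}} = \frac{u/n}{2^{c_1 A/n}},
\qquad
|\mathcal{Y}| \ge \frac{|Y|}{2^{c_2 (A+B)/n}}, \\
|\mathcal{Y}| &\le \frac{|Y|}{2^{|\mathcal{X}|}}
\;\Longrightarrow\;
c_2 \, \frac{A+B}{n} \ge |\mathcal{X}| \ge \frac{u/n}{2^{c_1 A/n}}, \\
A &\le \frac{\delta}{c_1} \, n \lg \frac{u}{n}
\;\Longrightarrow\;
2^{c_1 A/n} \le \Big(\frac{u}{n}\Big)^{\delta}
\;\Longrightarrow\;
A + B \ge \frac{n}{c_2} \Big(\frac{u}{n}\Big)^{1-\delta}.
\end{align*}
```

So if Alice sends fewer than $\frac{\delta}{c_1} n \lg \frac{u}{n}$ bits, the sum $A + B$ is polynomially large in $u/n$; since $A$ itself is only logarithmic in $u/n$ in this case, almost all of that communication must come from Bob once $u/n$ is large enough.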

5.3.2 Proof of Theorem 5.6

We begin by restricting $Y$ to the $v$ columns with at least $\rho|X|$ one-entries, so that from now on $|Y| = v$. This maintains richness, and doesn't affect anything about the protocol.

Claim 5.8. $\bigwedge^n f$ is $[(\rho|X|)^n, v^n]$-rich.

Proof. Since $\bigwedge^n f$ only has $v^n$ columns, we want to show that all columns contain enough ones. Let $\vec{y} \in Y^n$ be arbitrary. The set of $\vec{x} \in X^n$ with $\bigwedge^n f(\vec{x}, \vec{y}) = 1$ is just the $n$-wise Cartesian product of the sets $\{x \in X \mid f(x, y_i) = 1\}$. But each set in the product has at least $\rho|X|$ elements by richness of $f$.
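The Cartesian-product step can be confirmed by brute force on a small case. In the sketch below, Indexing on 3-bit vectors serves as the base problem $f$, and the helper names are hypothetical: the number of ones in a column $\vec{y}$ of the direct sum should equal the product of the per-coordinate counts.

```python
from itertools import product
from math import prod

def ones_in_column(f, X, y):
    """Number of x in X with f(x, y) = 1."""
    return sum(1 for x in X if f(x, y) == 1)

def ones_in_column_of_and(f, X, ys):
    """Number of vectors xs in X^n with f(x_i, y_i) = 1 for every i (brute force)."""
    return sum(
        1 for xs in product(X, repeat=len(ys))
        if all(f(x, y) == 1 for x, y in zip(xs, ys))
    )

def indexing(x, y):
    """Base problem used for illustration: the bit of y at index x."""
    return y[x]

X = range(3)
Y = list(product((0, 1), repeat=3))
for ys in product(Y, repeat=2):            # every column of the 2-wise AND
    expected = prod(ones_in_column(indexing, X, y) for y in ys)
    assert ones_in_column_of_and(indexing, X, ys) == expected
```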

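Now we apply Lemma 5.4 to find a 1-rectangle of $\bigwedge^n f$ of size $(\rho|X|)^n / 2^{A} \times v^n / 2^{A+B}$, which can be rewritten as $\big(\frac{\rho}{2^{a}} |X|\big)^n \times \big(\frac{1}{2^{a+b}} |Y|\big)^n$. Then, we complete the proof of the theorem by applying the following claim: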

Claim 5.9. If $\bigwedge^n f$ contains a 1-rectangle of dimensions $(\alpha|X|)^n \times (\beta|Y|)^n$, then $f$ contains a 1-rectangle of dimensions $\alpha^3 |X| \times \beta^3 |Y|$.

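Proof. Let $\mathcal{X} \times \mathcal{Y}$ be the 1-rectangle of $\bigwedge^n f$. Also let $\mathcal{X}_i$ and $\mathcal{Y}_i$ be the projections of $\mathcal{X}$ and $\mathcal{Y}$ on the $i$-th coordinate, i.e. $\mathcal{X}_i = \{x_i \mid \vec{x} \in \mathcal{X}\}$. Note that for all $i$, $\mathcal{X}_i \times \mathcal{Y}_i$ is a 1-rectangle for $f$. Indeed, for any $(x, y) \in \mathcal{X}_i \times \mathcal{Y}_i$, there must exist some $(\vec{x}, \vec{y}) \in \mathcal{X} \times \mathcal{Y}$ with $\vec{x}[i] = x$ and $\vec{y}[i] = y$. But $\bigwedge^n f(\vec{x}, \vec{y}) = \prod_j f(x_j, y_j) = 1$ by assumption, so $f(x, y) = 1$.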
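The projection step can be illustrated on a small hand-built rectangle. In this sketch the base problem is Indexing on 3-bit vectors, and `cal_X`, `cal_Y` and the helper names are assumptions chosen for the example. Note that `cal_Y` is not itself a Cartesian product of its projections, yet each coordinate projection still forms a 1-rectangle of $f$.

```python
def project(side, i):
    """Projection of a set of vectors onto coordinate i."""
    return {vec[i] for vec in side}

def is_one_rectangle(f, xs, ys):
    """True if f evaluates to 1 on every pair in xs x ys."""
    return all(f(x, y) == 1 for x in xs for y in ys)

def indexing(x, y):
    """Base problem used for illustration: the bit of y at index x."""
    return y[x]

def and_indexing(x_vec, y_vec):
    """The 2-wise AND of Indexing on paired coordinates."""
    return int(all(indexing(x, y) for x, y in zip(x_vec, y_vec)))

# A 1-rectangle of the 2-wise AND (hypothetical example data).
cal_X = {(0, 1), (0, 2)}
cal_Y = {((1, 0, 0), (0, 1, 1)), ((1, 1, 0), (1, 1, 1))}

assert is_one_rectangle(and_indexing, cal_X, cal_Y)
for i in range(2):
    # Each coordinate projection is a 1-rectangle of the base problem.
    assert is_one_rectangle(indexing, project(cal_X, i), project(cal_Y, i))
```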

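Now note that there must be at least $\frac{2}{3}n$ dimensions with $|\mathcal{X}_i| \ge \alpha^3 |X|$. Otherwise, more than $n/3$ dimensions would have $|\mathcal{X}_i| < \alpha^3 |X|$, and we would have $|\mathcal{X}| \le \prod_i |\mathcal{X}_i| < (\alpha^3 |X|)^{n/3} \cdot |X|^{2n/3} = (\alpha|X|)^n = |\mathcal{X}|$, contradiction. Similarly, there must be at least $\frac{2}{3}n$ dimensions with $|\mathcal{Y}_i| \ge \beta^3 |Y|$. Since $\frac{2}{3}n + \frac{2}{3}n > n$, these two sets of good dimensions must overlap, and any dimension $i$ in the overlap gives a 1-rectangle $\mathcal{X}_i \times \mathcal{Y}_i$ satisfying the statement of the claim. This completes the proof of Theorem 5.6.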
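For completeness, here is how the parameters line up with the statement of Theorem 5.6. This is a sketch that applies Claim 5.9 with $\alpha = \rho / 2^{a}$ and $\beta = 1 / 2^{a+b}$, and uses $|Y| = v$ from the initial restriction to the rich columns.

```latex
\begin{align*}
\alpha^{3} |X| &= \Big(\frac{\rho}{2^{a}}\Big)^{3} |X|
  = \frac{\rho^{3} \, |X|}{2^{3a}}
  = \frac{\rho^{O(1)} \, |X|}{2^{O(a)}}, \\
\beta^{3} |Y| &= \Big(\frac{1}{2^{a+b}}\Big)^{3} v
  = \frac{v}{2^{3(a+b)}}
  = \frac{v}{2^{O(a+b)}}.
\end{align*}
```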
