Index and Selection - Practical Verified Computation with Streaming Interactive Proofs

In this section, we present an online scheme for the selection problem. Our definition of the Selection problem assumes all frequencies $f_i := \sum_{(j_k,\delta_k) : j_k = i} \delta_k$ are non-negative, and so this definition is only valid for the strict turnstile update model.

Definition 3.3.1. The Selection problem is defined in terms of the quantity $N_0 = \sum_{i \in [n]} f_i$, the sum of all the frequencies. Given a desired rank $\rho \in [N_0]$, output an item $j$ from the stream $x = \langle (j_1, \delta_1), \ldots, (j_N, \delta_N) \rangle$, such that $\sum_{(j_k,\delta_k) : j_k < j} \delta_k < \rho$ and $\sum_{(j_k,\delta_k) : j_k > j} \delta_k \le N_0 - \rho$.

An easy prescient $(\log n, \log n)$-scheme is for the helper to give a claimed answer $s$ as annotation at the start of the stream. The verifier need only count how many items in the stream are (a) smaller than $s$ and (b) greater than $s$. The verifier then outputs $s$ if the rank of $s$ satisfies the necessary conditions, and outputs $\perp$ otherwise.
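The verifier's side of this prescient scheme can be sketched in a few lines. This is only an illustration under stated assumptions: the function name `verify_selection` is hypothetical, `None` stands in for $\perp$, and the acceptance test assumed here is that fewer than $\rho$ stream items lie below $s$ and at most $N_0 - \rho$ lie above it.

```python
def verify_selection(stream, s, rho):
    """One pass over (item, delta) updates; s is the helper's claimed answer.

    Returns s if it passes the rank test, else None (standing in for ⊥).
    Only three counters are kept, matching the small verifier space.
    """
    smaller = greater = total = 0
    for j, delta in stream:
        total += delta              # running value of N_0
        if j < s:
            smaller += delta        # mass strictly below s
        elif j > s:
            greater += delta        # mass strictly above s
    accepted = smaller < rho and greater <= total - rho
    return s if accepted else None
```

If both inequalities hold, then the mass below and above $s$ sums to at most $N_0 - 1$, so $s$ must itself occur in the stream and no separate membership check is needed.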

However, our goal is to present (almost) matching upper and lower bounds when only online annotation is allowed. To do this, we first consider the online MA complexity of the communication problem of index: Alice holds a string $x \in \{0,1\}^N$, Bob holds an integer $i \in [N]$, and the goal is for Bob to output $\mathrm{index}(x, i) := x_i$. The lower bound for selection will follow from the lower bound for index, and a key idea for the selection upper bound is taken from the communication protocol for index seen in the proof of the following theorem.

Theorem 3.3.2 (Online MA complexity of index). Let $c_a > 1$ and $c_v$ be integers such that $c_a \cdot c_v \ge N$. There is an online MA protocol $\mathcal{Q}$ for index, with $\mathrm{hcost}(\mathcal{Q}) \le c_a$ and $\mathrm{vcost}(\mathcal{Q}) = O(c_v \log c_a)$. Furthermore, any online MA protocol $\mathcal{Q}$ for index must have $\mathrm{hcost}(\mathcal{Q}) \, \mathrm{vcost}(\mathcal{Q}) = \Omega(N)$. Thus, in particular, $\mathrm{MA}^{\rightarrow}(\mathrm{index}) = \tilde{\Theta}(\sqrt{N})$.

Proof. For the lower bound, we use an online MA protocol $\mathcal{Q}$ to build a (Merlin-less) randomized one-way index protocol $\mathcal{Q}'$. Here, a one-way protocol is one in which Alice sends a message to Bob, with no communication from Bob to Alice.

We first consider the case where Merlin does not send any message to Alice at all and then explain how to modify the proof to cover the case where Merlin sends a message to Alice (possibly based on Merlin's internal randomness $r_M$) that does not depend on Bob's input. Let $c_a = \mathrm{hcost}(\mathcal{Q})$. Let $B(n, p)$ denote the binomial distribution with parameters $n$ and $p$, and let $k$ be the smallest integer such that $X \sim B(k, 1/3) \Rightarrow \Pr[X > k/2] \le 2^{-c_a}/3$. A standard Chernoff bound gives $k = \Theta(c_a)$. Let $a(x, R_A)$ denote the message that Alice sends in $\mathcal{Q}$ when her random string is $R_A$ (notice $a(x, R_A)$ does not depend on any help message $h_1(x, r_M)$ from Merlin, since we have assumed no such help message is sent), and let $b(a, i, h_2)$ be the bit Bob outputs in $\mathcal{Q}$ on input $i$ upon receiving message $a$ from Alice and $h_2$ from Merlin. In the protocol $\mathcal{Q}'$, Alice chooses $k$ independent random strings $R_1, \ldots, R_k$ and sends Bob $a(x, R_1), \ldots, a(x, R_k)$. Bob then outputs 1 iff there exists a $c_a$-bit string $h_2$ such that $\mathrm{majority}(b(a(x, R_1), i, h_2), \ldots, b(a(x, R_k), i, h_2)) = 1$. Let $C$ be the number of bits communicated in this protocol. Clearly, $C \le k \cdot \mathrm{vcost}(\mathcal{Q}) = O(\mathrm{hcost}(\mathcal{Q}) \, \mathrm{vcost}(\mathcal{Q}))$. We claim that $\mathcal{Q}'$ is a $\frac{1}{3}$-error protocol for index whence, by a standard lower bound (see, e.g., Ablayev [4]), $C = \Omega(N)$.
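To get a feel for the repetition count $k$ used in this amplification step, the following sketch computes the binomial tail exactly and searches for the first $k$ that meets the target $2^{-c_a}/3$. The function names are hypothetical, and the brute-force search is an illustration rather than part of the proof, which only needs the Chernoff estimate.

```python
from fractions import Fraction
from math import comb

def majority_fail_prob(k, p=Fraction(1, 3)):
    """Exact Pr[X > k/2] for X ~ B(k, p): a majority of k runs err."""
    return sum(comb(k, t) * p**t * (1 - p)**(k - t)
               for t in range(k // 2 + 1, k + 1))

def smallest_k(c_a):
    """First k with Pr[X > k/2] <= 2^(-c_a) / 3, found by linear search."""
    target = Fraction(1, 3 * 2**c_a)
    k = 1
    while majority_fail_prob(k) > target:
        k += 1
    return k
```

Exact rational arithmetic is used so that the comparison against the target is not affected by floating-point rounding.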

To prove the claim, consider the case when $x_i = 1$. By the correctness of $\mathcal{Q}$ there exists a suitable help message $h_2$ from Merlin that causes $\Pr[b(a(x, R_A), i, h_2) = 0] \le \frac{1}{3}$. Thus, by construction and our choice of $k$, the probability that Bob outputs 0 in $\mathcal{Q}'$ is at most $2^{-c_a}/3$. Now suppose $x_i = 0$. Then, every possible message $h_2$ from Merlin satisfies $\Pr[b(a(x, R_A), i, h_2) = 1] \le \frac{1}{3}$. Arguing as before, and using a union bound over all $2^{c_a}$ possible messages $h_2$, we see that Bob outputs 1 with probability at most $2^{c_a} \cdot 2^{-c_a}/3 = \frac{1}{3}$.

Now consider the case in which Merlin sends a message to Alice (possibly based on Merlin's internal randomness $r_M$) that does not depend on Bob's input. Assume that the protocol is 1/13-complete, i.e., an honest Merlin fails to convince Bob with probability at most 1/13 (this can be achieved by repeating the whole protocol $O(1)$ times and taking the majority vote, which increases the costs by only constant factors). In this case, we construct a one-way randomized (Merlin-less) communication protocol for index as follows. Alice chooses a random string $r_M$ herself. Since Merlin's message to Alice, $h_1(x, r_M)$, does not depend on Bob's input $i$, Alice can compute $h_1(x, r_M)$ herself. Alice sends to Bob the messages $a(x, R_1, h_1(x, r_M)), \ldots, a(x, R_k, h_1(x, r_M))$ that she would have sent in the online MA protocol given Merlin's message $h_1(x, r_M)$, and Bob outputs 1 if and only if there exists a $c_a$-bit string $h_2$ that would have caused him to accept on a majority of Alice's messages.

Consider the case when $x_i = 1$. By the correctness of $\mathcal{Q}$, with probability at least 3/4 over the choice of $r_M$, there exists a suitable help message $h_2$ from Merlin that causes $\Pr[b(a(x, R_A, h_1(x, r_M)), i, h_2) = 0] \le \frac{1}{3}$ (otherwise, with probability at least $1/4 \cdot 1/3 = 1/12$ over the choice of both $r_M$ and $R_A$, Merlin will fail to convince Bob to output 1, contradicting the fact that the protocol is 1/13-complete). Call such a choice of $r_M$ "good". By construction and our choice of $k$, if $r_M$ is good then the probability that Bob outputs 0 in $\mathcal{Q}'$ is at most $2^{-c_a}/3$. Thus, in the case $x_i = 1$, our one-way randomized communication protocol outputs 1 with probability at least $3/4 - 2^{-c_a}/3 > 2/3$.

In the case $x_i = 0$, the argument that our one-way randomized communication protocol outputs 0 with probability at least 2/3 proceeds exactly as in the case where Merlin did not send any message to Alice, since it holds that for every message $h_1$ to Alice and every possible message $h_2$ to Bob, the protocol satisfies $\Pr[b(a(x, R_A, h_1), i, h_2) = 1] \le \frac{1}{3}$.

The upper bound follows as a special case of the two-party set-disjointness protocol in [3, Theorem 7.4], since the protocol there is actually online. We give a more direct protocol, which establishes intuition for our selection result. Write Alice's input string $x$ as $x = y^{(1)} \cdots y^{(c_v)}$, where each $y^{(j)}$ is a string of at most $c_a$ bits, and fix a prime $q$ with $3c_a < q < 6c_a$. Let $y^{(k)}$ be the substring that contains the desired bit $x_i$. Merlin sends Bob a string $z$ of length at most $c_a$, claiming that it equals $y^{(k)}$. Alice picks a random $\alpha \in \mathbb{F}_q$ and sends Bob $\alpha$ and the strings $g_\alpha(y^{(1)}), \ldots, g_\alpha(y^{(c_v)})$, where $g_\alpha$ is defined as in Lemma 3.2.1. This requires communicating $O(c_v \log q) = O(c_v \log c_a)$ bits. Bob checks if $g_\alpha(z) = g_\alpha(y^{(k)})$, outputting $\perp$ if not. If the check passes, Bob assumes that $z = y^{(k)}$, and outputs $x_i$ from $z$ under this assumption. By Lemma 3.2.1, the error probability is at most $c_a/q < 1/3$.

It is worth making the following two remarks on the above proof.

1. The above lower bound argument in fact shows that an online MA protocol $\mathcal{Q}$ for an arbitrary two-party communication problem $F$ satisfies $\mathrm{hcost}(\mathcal{Q}) \, \mathrm{vcost}(\mathcal{Q}) = \Omega(R^{\rightarrow}(F))$, where $R^{\rightarrow}(F)$ is the one-way, randomized communication complexity of $F$. Thus, $\mathrm{MA}^{\rightarrow}(F) = \Omega(\sqrt{R^{\rightarrow}(F)})$. A similar result was proved by Aaronson [2].

2. The upper bound for index presented above works more or less unchanged when Alice's string is in $\Sigma^N$, for an arbitrary finite alphabet $\Sigma$. In view of Lemma 3.2.1, one simply needs to choose the prime $q$ such that $3|\Sigma| c_a < q < 6|\Sigma| c_a$ to bound the error probability below 1/3. This leads to a protocol $\mathcal{P}$ with $\mathrm{hcost}(\mathcal{P}) \le c_a \log |\Sigma|$ and $\mathrm{vcost}(\mathcal{P}) = O(c_v (\log |\Sigma| + \log c_a))$. Henceforth, we shall refer to this generalized protocol simply as "the index protocol"; the alphabet $\Sigma$ will usually be clear from the context.
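The direct index protocol from the proof of Theorem 3.3.2 (binary alphabet) can be sketched as follows. Lemma 3.2.1 is not reproduced in this section, so `fingerprint` below is a hypothetical stand-in for $g_\alpha$: it evaluates the block as a polynomial at $\alpha$ over $\mathbb{F}_q$. All function names are assumptions, indices are 0-based, and `None` stands in for $\perp$.

```python
import random
from math import isqrt

def fingerprint(block, alpha, q):
    """Assumed form of g_alpha: sum_j block[j] * alpha**j mod q (Horner's rule)."""
    acc = 0
    for symbol in reversed(block):
        acc = (acc * alpha + symbol) % q
    return acc

def pick_prime(c_a):
    """Some prime q with 3*c_a < q < 6*c_a (one exists by Bertrand's postulate)."""
    def is_prime(m):
        return m >= 2 and all(m % d for d in range(2, isqrt(m) + 1))
    return next(m for m in range(3 * c_a + 1, 6 * c_a) if is_prime(m))

def alice_message(x, c_a, q, rng):
    """Alice sends a random alpha and one fingerprint per block of c_a bits."""
    blocks = [x[start:start + c_a] for start in range(0, len(x), c_a)]
    alpha = rng.randrange(q)
    return alpha, [fingerprint(block, alpha, q) for block in blocks]

def bob_output(i, z, alpha, prints, n, c_a, q):
    """Bob checks Merlin's claimed block z against Alice's fingerprint for block k."""
    k, offset = divmod(i, c_a)
    expected_len = min(c_a, n - k * c_a)    # the last block may be shorter
    if len(z) != expected_len or fingerprint(z, alpha, q) != prints[k]:
        return None                         # reject: output ⊥
    return z[offset]
```

With this choice of fingerprint, two distinct blocks of the same length differ in a nonzero polynomial of degree below $c_a$, so they collide for fewer than $c_a$ of the $q$ possible values of $\alpha$.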

Theorem 3.3.3. For all $c_a, c_v$ such that $c_a \cdot c_v \ge n$, there is an online $(c_a \log n, c_v \log n)$-scheme for selection. Furthermore, any online $(c_a, c_v)$-scheme for selection must have $c_a \cdot c_v = \Omega(n)$.

Proof. Conceptually, the verifier builds a vector $r = (r_1, \ldots, r_n) \in \mathbb{Z}_+^n$ where $r_k = \sum_{j<k} f_j$. This is done by inducing a new stream $x'$ from the input stream $x$: each tuple $(j_k, \delta_k)$ in $x$ causes virtual tokens $(j_k + 1, \delta_k), (j_k + 2, \delta_k), \ldots, (n, \delta_k)$ to be inserted into $x'$. Then $r = f(x')$, the frequency vector of $x'$; note that $\|r\|_1 = O(nN)$. We apply the index protocol to this vector, with $q = \Theta(m^2)$, to retrieve the ranks of elements surrounding the claimed answer $s$. This information is sufficient to check that $s$ has the claimed rank.
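The derived stream and the final rank check can be illustrated as below. This is a sketch under stated assumptions: the vector $r$ is materialized in full purely for illustration (in the scheme the verifier never stores it and instead retrieves the needed entries through the index protocol), the function names are hypothetical, and the verifier is assumed to track $N_0$ with a running counter.

```python
def rank_vector(stream, n):
    """r[k] = sum of f_j over j < k, for k = 1..n (index 0 is unused)."""
    r = [0] * (n + 1)
    for j, delta in stream:
        # virtual tokens (j+1, delta), (j+2, delta), ..., (n, delta)
        for t in range(j + 1, n + 1):
            r[t] += delta
    return r

def check_rank(r, s, rho, n, total):
    """Accept s iff fewer than rho items lie below it and at most total - rho above."""
    below = r[s]                                # mass of items < s
    at_most_s = r[s + 1] if s < n else total    # mass of items <= s
    above = total - at_most_s
    return below < rho and above <= total - rho
```

Only the two entries $r_s$ and $r_{s+1}$ are consulted, which is why retrieving the ranks of the elements surrounding $s$ suffices.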

For the lower bound, we use a standard reduction from the index problem. Given the string $x \in \{0,1\}^N$, Alice transforms it into the stream over universe $[2N]$ whose $j$th tuple is $(2j - x_j, 1)$, for each $j$. Given the index $i \in [N]$, Bob transforms it into a stream consisting of $i$ copies of $(2N, 1)$ and $N - i$ copies of $(1, 1)$. Consequently, the median of the combined length-$(2N)$ stream is $2i - x_i$, from which the value of $x_i$ can be recovered. To complete the proof, observe that any online scheme to compute this median would imply an online MA protocol for index with the same cost; and that all players can perform this reduction online without extra space or annotation.
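The reduction is easy to check by brute force on small inputs. In the sketch below the function names are hypothetical, the index $i$ is 1-based as in the text, and the median (the rank-$N$ item of the $2N$-item stream) is found by sorting rather than by any streaming scheme.

```python
def alice_stream(x):
    """The jth tuple is (2j - x_j, 1), with j running from 1 to N."""
    return [(2 * j - bit, 1) for j, bit in enumerate(x, start=1)]

def bob_stream(i, n_bits):
    """i copies of (2N, 1) followed by N - i copies of (1, 1)."""
    return [(2 * n_bits, 1)] * i + [(1, 1)] * (n_bits - i)

def median_item(stream):
    """Rank-N item of a length-2N stream of unit updates (brute force)."""
    items = sorted(j for j, _ in stream)
    return items[len(items) // 2 - 1]

def recover_bit(x, i):
    """Recover x_i from the median m = 2i - x_i of the combined stream."""
    m = median_item(alice_stream(x) + bob_stream(i, len(x)))
    return 2 * i - m
```

Bob's $N - i$ copies of 1 push Alice's items up by $N - i$ positions in sorted order, so the rank-$N$ item is Alice's $i$th item, $2i - x_i$.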

Notice that in the above scheme the information computed by the verifier is independent of ρ, the rank of the desired element. Therefore these algorithms work even when ρ is revealed at the end of the stream.
