For simplicity, we will define the large sparse set problem for a linked ring. A linked ring can be trivially obtained from a linked list as follows. Set the first element to be the successor of the last element. Specifically, next(last) := first, where first is the index of the first element and last is the index of the last element.
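The list-to-ring step can be sketched in a few lines of Python. This is only an illustration: the helper name `list_to_ring`, the 0-based indices, and the use of `None` for the missing successor of the last element are assumptions made here, not part of the text.

```python
def list_to_ring(next_ptr, first, last):
    """Return a copy of next_ptr in which the last element points to the first."""
    ring = list(next_ptr)
    ring[last] = first  # next(last) := first
    return ring


# A 4-element list visited in the order 2 -> 0 -> 3 -> 1 (0-based array indices).
# None marks the missing successor of the last element (an assumption of this sketch).
nxt = [3, None, 0, 1]
ring = list_to_ring(nxt, first=2, last=1)
```

After the call, every index appears exactly once in `ring`, which matches the requirement that each element is the successor of exactly one element.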
The large sparse set problem
Input: A linked ring of n elements. The elements are stored in an array A of size n. Each element has a pointer to its successor in the ring, and is the successor of exactly one element in the ring. Formally, element i, 1 ≤ i ≤ n, has a pointer field next(i) which contains the array index of its successor.
The problem is to find a subset S of A such that: (1) |S| > bn, where b is some constant fraction, 0 < b ≤ 1/2; and (2) if some element A(i) is in S then next(i) is not in S. Formally, the output is given in an array S = [S(1), . . . , S(n)]. For i, 1 ≤ i ≤ n, if S(i) = 0 then i is in the sparse set S, and if S(i) = 1 then i is not.
The algorithm below has a randomized step. An independent coin is flipped for each element i, 1 ≤ i ≤ n, resulting in HEAD or TAIL with equal probabilities. The HEAD or TAIL result is written into R(i), 1 ≤ i ≤ n. Now, all elements whose result is HEAD and whose successor's result is TAIL are selected into S.
Large sparse set
for i, 1 ≤ i ≤ n pardo
  - With equal probabilities write HEAD or TAIL into R(i)
  - if R(i) = HEAD and R(next(i)) = TAIL
  - then S(i) := 0
  - else S(i) := 1
Complexity The algorithm runs in O(1) time and O(n) work.
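The routine above can be emulated sequentially to see the selection rule in action. The following is a minimal sketch, assuming 0-based indices and a ring given as a Python list of successor indices. The names `large_sparse_set` and `is_sparse` and the optional `rng` parameter are illustrative choices, and the two list comprehensions stand in for the two steps of the parallel loop.

```python
import random

HEAD, TAIL = "HEAD", "TAIL"


def large_sparse_set(next_ptr, rng=random):
    """Sequential emulation of the parallel routine; S[i] == 0 means i is selected."""
    n = len(next_ptr)
    # Step 1: one independent fair coin per element, written into R.
    R = [rng.choice((HEAD, TAIL)) for _ in range(n)]
    # Step 2: select i exactly when R(i) = HEAD and R(next(i)) = TAIL.
    return [0 if R[i] == HEAD and R[next_ptr[i]] == TAIL else 1 for i in range(n)]


def is_sparse(next_ptr, S):
    """Check that no selected element has a selected successor."""
    return all(not (S[i] == 0 and S[next_ptr[i]] == 0) for i in range(len(S)))


ring = [(i + 1) % 8 for i in range(8)]  # ring 0 -> 1 -> ... -> 7 -> 0
S = large_sparse_set(ring, random.Random(0))
print(S, is_sparse(ring, S))
```

All coins are flipped before any selection test is made, so each test reads a value of R(next(i)) that has already been written.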
The set S is indeed a large sparse set, as explained below. First, it satisfies uniform sparsity (item (2)), since if an element is selected its successor cannot be selected. To see that, note that an element i is selected only if R(next(i)) = TAIL, which implies that its successor, next(i), cannot be selected, since selection requires HEAD. The number of elements in S is a random variable, and the following Lemma establishes that with high probability S is large enough.
Lemma Assume that n is even.
(1) The expected size of S is n/4.
(2) The probability that |S| ≤ n/16 is exponentially small; formally, it is O(c^{Ω(n)}), where c is a constant, 0 < c < 1.
Extensions to odd values of n here and in the corollaries of the Lemma make an easy exercise.
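Item (1) of the Lemma can be checked exhaustively on a small ring by enumerating all 2^n coin outcomes. The sketch below does this with exact rational arithmetic. The function name `expected_size_by_enumeration` and the 0/1 encoding of TAIL/HEAD are assumptions of this illustration, and it is only practical for small n.

```python
from fractions import Fraction
from itertools import product


def expected_size_by_enumeration(next_ptr):
    """Exact expected |S| over all 2^n equally likely coin outcomes."""
    n = len(next_ptr)
    total = 0
    for coins in product((0, 1), repeat=n):  # 1 = HEAD, 0 = TAIL
        # Count elements with HEAD whose successor has TAIL.
        total += sum(1 for i in range(n) if coins[i] == 1 and coins[next_ptr[i]] == 0)
    return Fraction(total, 2 ** n)


ring = [(i + 1) % 8 for i in range(8)]
print(expected_size_by_enumeration(ring))  # compare with n/4 for n = 8
```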
Proof of Lemma (1) Each element gets HEAD with probability 1/2; independently, its successor gets TAIL with probability 1/2. So, the probability that an element is in S is 1/4 and item (1) of the Lemma follows.
(2) For the proof we quote Corollary 6.7 from page 125 in [CLR90], for bounding the right tail of a binomial distribution (henceforth called the Right Tail theorem):
Consider a sequence of n Bernoulli trials, where in each trial success occurs with probability p and failure occurs with probability q = 1 − p. Let X be the (random variable counting the) number of successes. Then for r > 0
Pr(X − np ≥ r) ≤ (npq/r)^r
Consider a set B comprising some element i in the ring as well as all the other n/2 − 1 elements of the ring whose distance from i is an even number. We will derive item (2) in the Lemma by restricting our probabilistic analysis to the elements of A that are also in B. We saw above that each element of B is not selected into S with probability 3/4. These probabilities are pairwise independent. (To be on the safe side, we do not make any assumptions about the elements in A − B.) We have a sequence of n/2 Bernoulli trials, where in each trial success occurs with probability p = 3/4. By the Right Tail theorem, the probability that at least 7/8 of the n/2 elements of B are not selected (and therefore r = (7/8)(n/2) = 7n/16) is at most
((n/2)(3/4)(1/4) / (7n/16))^{7n/16} = (3/14)^{7n/16}
Item 2 of the Lemma follows.
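For readers who want the arithmetic behind the base 3/14 spelled out, the simplification of the ratio npq/r with the values used above (n/2 trials, p = 3/4, q = 1/4, r = 7n/16) can be written as follows. This is only a restatement of the substitution in the proof, not an additional result.

```latex
\[
\frac{\frac{n}{2}\cdot\frac{3}{4}\cdot\frac{1}{4}}{\frac{7n}{16}}
  = \frac{3n}{32}\cdot\frac{16}{7n}
  = \frac{3}{14},
\qquad\text{so}\qquad
\left(\frac{\frac{n}{2}\cdot\frac{3}{4}\cdot\frac{1}{4}}{\frac{7n}{16}}\right)^{7n/16}
  = \left(\frac{3}{14}\right)^{7n/16}.
\]
```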
Each iteration of the recursive work-optimal list ranking algorithm of the previous section can employ routine Large sparse set. The corollary below allows us to wrap up the complexity analysis for the work-optimal fast list ranking algorithms.
Corollary Suppose the recursive work-optimal algorithm is applied until a list of size ≤ n/log n is reached. Consider the probability that a list of this size is reached in such a way that in each iteration along the way |S| ≤ 15|A|/16. Then, the probability that this does not happen is exponentially small. Formally, there is some integer N such that for every n ≥ N, that probability is O(α^{Ω(n/log n)}), where α is a constant, 0 < α < 1.
Proof of corollary The size of A in the last application of the recursive work-optimal algorithm is at least n/log n. The probability that |S| ≤ 15|A|/16 in the last application is at least
1 − (3/14)^{7n/(16 log n)}
The probability for |S| ≤ 15|A|/16 in each of the preceding applications is at least this much. Also, in the case where |S| ≤ 15|A|/16 in all applications there will be at most log_{16/15} log n applications. Therefore, the probability for having |S| ≤ 15|A|/16 in all applications before a list of size n/log n is reached is at least
(1 − (3/14)^{7n/(16 log n)})^{log_{16/15} log n}
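The count of applications can be illustrated numerically. The sketch below reads the condition as "each application leaves at most 15/16 of the elements" and counts how many such rounds are needed to go from n down to n/log n, then compares the count with log_{16/15} log n. The function names and the choice of base-2 logarithms are assumptions of this illustration.

```python
import math


def iterations_until_small(n):
    """Rounds of shrinking by a factor 15/16 until the size is at most n / log2(n)."""
    target = n / math.log2(n)
    size, rounds = float(n), 0
    while size > target:
        size *= 15 / 16
        rounds += 1
    return rounds


def application_bound(n):
    """The quantity log_{16/15}(log2 n) from the proof."""
    return math.log(math.log2(n), 16 / 15)


n = 2 ** 16
print(iterations_until_small(n), application_bound(n))
```

Since the loop counts whole rounds, its result is the bound rounded up to an integer.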
Deriving the corollary is now straightforward. We conclude,
Theorem 9.2: Using the randomized large sparse set routine, the work-optimal fast list ranking algorithm runs in O(log n log log n) time and O(n) work with probability of 1 − x(n), where x(n) is decreasing exponentially as a function of n/log n.
It is also possible to argue that using the randomized large sparse set routine, the average running time of the recursive work-optimal algorithm is O(log^2 n) and the average work is O(n).
Exercise 25: (List ranking wrap-up). Describe the full randomized fast work-optimal list ranking algorithm in a “parallel program”, similar to the pseudo-code that we usually give. Review briefly the explanation of why it runs in O(log n log log n) time and O(n) work with high probability.
9.5. Deterministic Symmetry Breaking