8.1
Maximal Independent Sets
For a graph G = (V, E), an independent set is a set S ⊂ V which contains no edges of G, i.e., for all (u, v)∈E either u6∈S and/orv6∈S. The independent setS is amaximal independent set if for allv∈V, eitherv∈S or N(v)∩S 6=∅ whereN(v) denotes the neighbors ofv.
It’s easy to find a maximal independent set. For example, the following algorithm works:
1. I=∅, V0=V.
2. While (V06=∅) do
(a) Choose anyv∈V0. (b) SetI=I∪v.
(c) SetV0=V0\(v∪N(v)).
3. OutputI.
Our focus is finding an independent set using a parallel algorithm. The idea is that in every round we find a setS which is an independent set. Then we addSto our current independent setI, and we removeS∪N(S) from the current graph V0. If S∪N(S) is a constant fraction of |V0|, then we will only needO(log|V|) rounds. We will instead ensure that by removingS∪N(S) from the graph, we remove a constant fraction of the edges.
To choose S in parallel, each vertex v independently adds themselves to S with a well chosen probability p(v). We want to avoid adding adjacent vertices toS. Hence, we will prefer to add low degree vertices. But, if for some edge (u, v), both endpoints were added toS, then we keep the higher degree vertex.
Here’s the algorithm:
The Algorithm
Problem: Given a graph findamaximal independent set. 1. I=∅, G0 =G.
2. While (G0 is not the empty graph) do
(a) Choose a random set of verticesS∈G0 by selecting each vertex v independently with probability 1/(2d(v)).
(b) For every edge (u, v)∈E(G0) if both endpoints are in S then remove the vertex of lower degree fromS (Break ties arbitrarily). Call this new setS0.
(c) I =IS
S0. G0 =G0\(S0S
N(S0)), i.e.,G0 is the induced subgraph onV0\(S0S
N(S0)) where V0 is the previous vertex set.
3. output I
Fig 1: The algorithm
Correctness: We see that at each stage the setS0that is added is an independent set. Moreover since we re-move, at each stage,S0SN(S0) the setIremains an independent set. Also note that all the vertices removed
fromG0 at a particular stage are either vertices inIor neighbours of some vertex inI. So the algorithm al-ways outputs a maximal independent set. We also note that it can be easily parallelized on a CREW PRAM.
8.2
Expected Running Time
In this section we bound the expected running time of the algorithm and in the next section we derandomize it. LetGj = (Vj, Ej) denote the graph after stagej.
Main Lemma: For somec <1,
E (|Ej| | Ej−1)< c|Ej−1|.
Hence, in expectation, onlyO(logm) rounds will be required, wherem=|E0|.
We say vertexv is BAD if more than 2/3 of the neighbors ofv are of higher degree thanv. We say an edge is BAD if both of its endpoints are bad; otherwise the edge is GOOD.
The key claims are that at least half the edges are GOOD, and each GOOD edge is deleted with a constant probability. The main lemma then follows immediately.
Lemma 8.1 At least half the edges are GOOD.
Proof: Denote the set of bad edges by EB. We will definef : EB → E2
so that for all e1 6=e2 ∈ EB, f(e1)∩f(e2) =∅. This proves |EB| ≤ |E|/2, and we’re done.
The functionf is defined as follows. For each (u, v)∈E, direct it to the higher degree vertex. Break ties as in the algorithm. Now, suppose (u, v)∈EB, and is directed towardsv. Since v is BAD, it has at least twice as many edges out as in. Hence we can pair two edges out ofv with every edge intov. This gives our mapping.
Lemma 8.2 If v is GOOD then Pr (N(v)∩S6=∅)≥2α, whereα:= 12(1−e−1/6).
By definition,|L(v)| ≥ d(3v) ifv is a GOOD vertex.
Pr (N(v)∩S6=∅) = 1−Pr (N(v)∩S=∅) = 1− Y
w∈N(v)
Pr (w6∈S) using full independence
≥ 1− Y w∈L(v) Pr (w6∈S) = 1− Y w∈L(v) 1− 1 2d(w) ≥ 1− Y w∈L(v) 1− 1 2d(v) ≥ 1−exp(−|L(v)|/2d(v)) ≥ 1−exp(−1/6),
Note, the above lemma is using full independence in its proof.
Lemma 8.3 Pr (w /∈S0 | w∈S)≤1/2. Proof: LetH(w) =N(w)\L(w) ={z∈N(w) :d(z)> d(w)}. Pr (w /∈S0 | w∈S) = Pr (H(w)∩S6=∅ | w∈S ) ≤ X z∈H(w) Pr (z∈S | w∈S) ≤ X z∈H(w) Pr (z∈S, w∈S ) Pr (w∈S) = X z∈H(w) Pr (z∈S) Pr (w∈S)
Pr (w∈S) using pairwise independence
= X z∈H(w) Pr (z∈S ) = X z∈H(w) 1 2d(z) ≤ X z∈H(w) 1 2d(v) ≤ 1 2.
Proof: LetVG denote the GOOD vertices. We have Pr (v∈N(S0) | v∈VG ) = Pr (N(v)∩S06=∅ | v∈VG) = Pr (N(v)∩S06=∅ | N(v)∩S6=∅, v∈VG) Pr (N(v)∩S6=∅ | v∈VG) ≥ Pr (w∈S0 | w∈N(v)∩S, v∈VG ) Pr (N(v)∩S6=∅ | v∈VG) ≥ (1/2)(2α) = α
Corollary 8.5 If v is GOOD then the probability that v gets deleted is at leastα.
Corollary 8.6 If an edgeeis GOOD then the probability that it gets deleted is at leastα. Proof:
Pr (e= (u, v)∈Ej−1\Ej )≥Pr (v gets deleted ).
We now re-state the main lemma :
Main Lemma: E (|Ej| | Ej−1)≤ |Ej−1|(1−α/2). Proof: E (|Ej| | Ej−1) = X e∈Ej−1 1−Pr (egets deleted ) ≤ |Ej−1| −α|GOODedges| ≤ |Ej−1|(1−α/2).
The constantαis approximately 0.07676.
Thus,
E (|Ej|)≤ |E0|(1−
α 2)
j≤mexp(−jα/2)<1,
forj > α2logm. Therefore, the expected number of rounds required is≤4m=O(logm).
8.3
Derandomizing MIS
The only step where we use full independence is in Lemma 8.2 for lower bounding the probability that a GOODvertex gets picked.The argument we used was essentially the following :
Lemma 8.7 Let Xi, 1 ≤ i ≤ n be {0,1} random variables and pi := Pr (Xi= 1 ). If the Xi are fully independent then Pr n X 1 Xi>0 ! ≥1− n Y 1 (1−pi)
Here is the corresponding bound if the variables are pairwise independent
Lemma 8.8 Let Xi, 1≤i≤n be{0,1} random variables and pi := Pr (Xi= 1 ). If the Xi are pairwise independent then Pr n X 1 Xi>0 ! ≥ 1 2min{ 1 2, X i pi} Proof: SupposeP
ipi ≤1/2.Then we have the following, (the condition
P
ipi ≤1/2 will only come into some algebra at the end)
Pr X i Xi>0 ! ≥ X i Pr (Xi= 1 )− 1 2 X i6=j Pr (Xi= 1, Xj= 1 ) = X i pi−1 2 X i6=j pipj ≥ X i pi− 1 2 X i pi !2 = X i pi 1−1 2 X i pi ! ≥ 1 2 X i pi when X i pi≤1. If P
ipi > 1/2, then we restrict our index of summation to a set S ⊆ [n] such that 1/2 ≤
P
i∈Spi ≤1, and the same basic argument works. Note, ifP
ipi >1, there always must exist a subset S ⊆ [n] where 1/2≤P
i∈Spi≤1
Using the construction described earlier we can now derandomize to get an algorithm that runs inO(mn/α) time (this is asymptotically as good as the sequential algorithm). The advantage of this method is that it can be easily parallelized to give anN C2 algorithm (usingO(m) processors).
8.4
History and Open Questions
Thek−wise independence derandomization approach was developed in [CG], [ABI] and [L85]. The maximal independence problem was first shown to be in NC in [KW].They showed that MIS in isN C4. Subsequently, improvements on this were found by [ABI] and [L85]. The version described in these notes is the latter.
The question of whether MIS is inN C1is still open.
References
[1] N Alon, L. Babai, A. Itai, “A fast and simple randomized parallel algorithm for the Maximal Independent Set Problem”,Journal of Algorithms, Vol. 7, 1986, pp. 567-583.
[2] B. Chor, O. Goldreich, “On the power of two-point sampling”, Journal of Complexity, Vol. 5, 1989, pp.96-106.
[3] R.M. Karp, A. Wigderson, A fast parallel algorithm for the maximal independent set problem, Proc. 16th ACM Symposium on Theory of Computing, 1984, pp. 266-272.
[4] M. Luby,A simple parallel algorithm for the maximal independent set problem, Proc, 17th ACM Sym-posium on Theory of Computing, 1985, pp. 1-10.