2.7 MPC Implementation Details
2.7.4 Concluding Proofs
Lines 5, 7 and 8 can be implemented as described in Sections 2.7.2 and 2.7.3. Lines 9 and 10 do not need an actual implementation, as by that point all the vertices that are not marked as discarded constitute $V'$, and all the edges incident to $V \setminus V'$ will be marked as discarded. Similarly, all the matched edges will be marked as matched by the implementation of LocalPhase. All the edges and vertices that are marked as discarded will be ignored in further processing.
After all the rounds are over, the matching consists of the edges marked as matched.
Let $\Delta^\star$ be the value of $\Delta$ at Line 12, and hence the value of $\Delta$ at the end of the last while loop iteration. Let $\Delta'$ be the value of $\Delta$ just before the last iteration, i.e., $\Delta^\star = \Delta'/2^{\tau}$, for the corresponding $\tau$. Now consider the last call of LocalPhase at Line 8. The last invocation has $\Delta'/2^{\tau-1}$ as a parameter. On the other hand, by Claim 2.13 and Claim 2.14 we know that after the last invocation of LocalPhase, with high probability there is no vertex that has degree greater than $\frac{3}{4}\Delta'/2^{\tau-1} < 2\Delta^\star$. Therefore, with high probability there is no vertex that should be removed at Line 12, and hence we do not implement that line either.
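The comparison with $2\Delta^\star$ used above is a one-line calculation. The following spells it out, reading the degree bound as $\frac{3}{4}\Delta'/2^{\tau-1}$ and using only the relation $\Delta^\star = \Delta'/2^{\tau}$ from this paragraph:

```latex
\frac{3}{4}\cdot\frac{\Delta'}{2^{\tau-1}}
  \;=\; \frac{3}{4}\cdot 2\cdot\frac{\Delta'}{2^{\tau}}
  \;=\; \frac{3}{2}\,\Delta^\star
  \;<\; 2\,\Delta^\star .
```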
An implementation of Line 13 is described in Section 2.7.1. Finally, we can state the following result.
Lemma 2.21. There exists an implementation of ParallelAlg in the MPC model that with high probability executes $O\left((\log\log n)^2 + \max\left(\log\frac{n}{S}, 0\right)\right)$ rounds.
Proof. In the proof we analyze the case $S \le n$. Otherwise, for the case $S > n$, we think of each machine as being split into $\lfloor S/n \rfloor$ "smaller" machines, each of the smaller machines having space $n$.
We will analyze the number of iterations that the while loop of ParallelAlg performs. Let $\Delta_i$ and $\tau_i$ be the values of $\Delta$ and $\tau$ at the end of iteration $i$, respectively. Then, from Line 3 and Line 4 we have
To obtain the number of iterations the while loop of ParallelAlg performs, we derive for which $i \ge 1$ the condition at Line 2 does not hold.
Unraveling $\Delta_{i-1}$ further from (2.6) gives
$\Delta_i \le n^{(1-\gamma)^i} \left(\frac{n}{S}\right)^{1-(1-\gamma)^i}$. (2.7)
For $S \le n$ and as $\gamma < 1/2$ we have
$\left(\frac{n}{S}\right)^{1-(1-\gamma)^i} \le \frac{n}{S}$. (2.8)
On the other hand, for $i^\star = \frac{\log\log n}{\gamma} \le c(\log\log n)^2$ we have
$n^{(1-\gamma)^{i^\star}} < \log n$. (2.9)
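Inequality (2.9) can be checked directly. The sketch below rests on assumptions that the text above does not state: $\log$ is read as the natural logarithm, rounding of $i^\star$ to an integer is ignored, and the standard bound $1-\gamma \le e^{-\gamma}$ is used.

```latex
(1-\gamma)^{i^\star} \;\le\; e^{-\gamma i^\star} \;=\; e^{-\ln\ln n} \;=\; \frac{1}{\ln n},
\qquad\text{hence}\qquad
n^{(1-\gamma)^{i^\star}} \;\le\; n^{1/\ln n} \;=\; e \;<\; \ln n
\quad\text{for } n > e^{e}.
```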
Now putting together (2.7), (2.8), and (2.9) we conclude
$\Delta_{i^\star} < \frac{n}{S}\ln n$,
and hence the number of iterations the while loop of ParallelAlg performs is $O\left((\log\log n)^2\right)$.
Total round complexity. Every iteration of the while loop can be executed in $O(1)$ MPC rounds with probability at least $1 - 1/n^3$. Since there are $O\left((\log\log n)^2\right)$ iterations of the while loop, all the iterations of the loop can be performed in $O\left((\log\log n)^2\right)$ rounds with probability at least $1 - 1/n^2$.
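The step from a per-iteration guarantee of $1 - 1/n^3$ to an overall guarantee of $1 - 1/n^2$ is a union bound. In the sketch below, $k$ is a symbol introduced here for the number of iterations, and $k \le n$ is assumed, which holds for sufficiently large $n$ since $k = O\left((\log\log n)^2\right)$.

```latex
\Pr[\text{some iteration fails}]
  \;\le\; \sum_{j=1}^{k} \frac{1}{n^{3}}
  \;=\; \frac{k}{n^{3}}
  \;\le\; \frac{n}{n^{3}}
  \;=\; \frac{1}{n^{2}} .
```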
On the other hand, by Lemma 2.20 and the condition at Line 2 of ParallelAlg, the computation of Line 13 of ParallelAlg can be performed in $O\left(\log\left(\frac{n}{S}(\ln n)^{32}\right)\right)$ rounds. Putting both bounds together, we conclude that the round complexity of ParallelAlg is $O\left((\log\log n)^2 + \log\frac{n}{S}\right)$ for the case $S \le n$. For the case $S > n$ (recall that in this regime we assume that each machine is divided into machines of space $n$) the round complexity is $O\left((\log\log n)^2\right)$.
Computation
This chapter is based on joint work with Mohsen Ghaffari, Themis Gouleakis, Christian Konrad, and Ronitt Rubinfeld. It has been accepted to the ACM Symposium on Principles of Distributed Computing (PODC) 2018 [GGK+18] under the title
Improved Massively Parallel Computation Algorithms for MIS, Matching, and Vertex Cover.
3.1 Introduction
In this chapter, we study one of the most fundamental problems in algorithmic graph theory: maximal independent set (MIS). The study of this problem in models of parallel computation dates back to PRAM algorithms. A seminal work of Luby [Lub86] gives a simple randomized algorithm for constructing MIS in O(log n) PRAM rounds. Similar results, also in the context of PRAM algorithms, were obtained in [ABI86, II86, IS86]. Since then, MIS has been studied quite extensively in various models of computation. We design a simple randomized algorithm that constructs MIS in the MPC and the CONGESTED-CLIQUE model.
3.1.1 Model
We consider two closely related models: Massively Parallel Computation (MPC), and the CONGESTED-CLIQUE model of computing. We refer the reader to Section 2.1.1 for the definition of the MPC model.
CONGESTED-CLIQUE
The CONGESTED-CLIQUE model was introduced by Lotker, Pavlov, Patt-Shamir, and Peleg [LPPSP03] and has been studied extensively since then, see e.g., [PST11, DLP12, BHP12, Len13, DKO14, Nan14, HPS14, HP14, CHKK+15, HPP+15, BFARR15, Gha16, GP16, Kor16, HKN16, CHPS17, Gha17, JN18]. In this model, we have n players which can communicate in synchronous rounds. In each round, every player can send O(log n) bits to every other player.
Besides this communication restriction, the model does not limit the players, e.g., they can use large space and arbitrary computations; though, in our algorithms, both of these will be small. Furthermore, in studying graph problems in this model, the standard setting is that we have an n-vertex graph G = (V,E), and each player is associated with one vertex of this graph.
Initially, each player knows only the edges incident on its own vertex. At the end, each player should know the part of the output related to its own vertex, e.g., whether its vertex is in the computed maximal independent set or not, or whether any of its edges is in the matching or not.
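To make the communication rule concrete, here is a minimal sketch of one synchronous CONGESTED-CLIQUE round in which every player broadcasts its degree, so that afterwards every player knows the maximum degree. The function names `deliver_round` and `learn_max_degree` are hypothetical, and the constant hidden in the O(log n) bit budget is assumed to be 1; neither comes from the text.

```python
def deliver_round(n, outgoing, bit_budget):
    """Deliver one synchronous round.

    outgoing[u] maps a receiver v to a non-negative integer payload.
    Each payload must fit in bit_budget bits (the per-pair limit).
    """
    inboxes = [{} for _ in range(n)]
    for sender in range(n):
        for receiver, payload in outgoing[sender].items():
            if payload.bit_length() > bit_budget:
                raise ValueError("message exceeds the per-pair bit budget")
            inboxes[receiver][sender] = payload
    return inboxes


def learn_max_degree(adj):
    """adj[u] lists the neighbors of player u (its only initial knowledge)."""
    n = len(adj)
    # Assumed budget: enough bits to encode any degree in 0..n-1.
    bit_budget = max(1, (n - 1).bit_length())
    outgoing = [{v: len(adj[u]) for v in range(n) if v != u} for u in range(n)]
    inboxes = deliver_round(n, outgoing, bit_budget)
    # Each player combines its own degree with the degrees it received.
    return [max([len(adj[u])] + list(inboxes[u].values())) for u in range(n)]


path = [[1], [0, 2], [1, 3], [2]]
print(learn_max_degree(path))  # prints [2, 2, 2, 2]
```

The sketch uses a single round because a degree is at most n − 1 and therefore fits in one O(log n)-bit message per pair of players.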
We emphasize that CONGESTED-CLIQUE provides an all-to-all communication model. It is worth contrasting this with the more classical models of distributed computing. For instance, the LOCAL model, first introduced by Linial [Lin87], allows the players to communicate only along the edges of the input graph G (with unbounded-size messages).
3.1.2 Our Results
In Section 3.4 we present an algorithm for constructing MIS in the MPC and the CONGESTED-CLIQUE model.
Theorem 3.1. For any graph with maximum degree $\Delta$, there is an algorithm that with high probability computes an MIS in $O(\log\log \Delta)$ rounds of the MPC model, with $\Theta(n)$ space per machine. Moreover, the same algorithm can be adapted to compute an MIS in $O(\log\log \Delta)$ rounds of the CONGESTED-CLIQUE model.
3.1.3 Related Work
Maximal independent set has been central in the study of graph algorithms in both the parallel and the distributed models. The seminal works of Luby [Lub86] and Alon, Babai, and Itai [ABI86] provide O(log n)-round parallel and distributed algorithms for constructing MIS. The distributed complexity in the LOCAL model was first improved by Barenboim et al. [BEPS12] and subsequently by Ghaffari [Gha16], which led to the current best round complexity of $O(\log \Delta) + 2^{O(\sqrt{\log\log n})}$. In the CONGESTED-CLIQUE model of distributed computing, Ghaffari [Gha17] gave another algorithm which computes an MIS in $\tilde{O}(\sqrt{\log \Delta})$ rounds. A deterministic $O(\log n \log \Delta)$-round CONGESTED-CLIQUE algorithm was given by Censor-Hillel et al. [CHPS17].
It is also worth referring to the literature on one particular MIS algorithm, known as the randomized greedy MIS, which is relevant to what we do for MIS. In this algorithm, we permute the vertices uniformly at random and then add them to the MIS greedily. Blelloch et al. [BFS12] showed that one can implement this algorithm in $O(\log^2 n)$ parallel/distributed rounds, and recently Fischer and Noever [FN18] improved that to a tight bound of $\Theta(\log n)$. We will show an $O(\log\log \Delta)$-round simulation of the randomized greedy MIS algorithm in the MPC and the CONGESTED-CLIQUE model.
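The round bounds cited above concern parallel executions of randomized greedy MIS. One natural round-based formulation, sketched here as an illustration and not as the exact procedure of [BFS12] or [FN18], lets every remaining vertex whose rank is smaller than that of all its remaining neighbors join the MIS simultaneously. The function names and the dictionary-of-sets graph representation are assumptions made for this example, and ranks are assumed to be distinct.

```python
def sequential_greedy(adj, rank):
    """Greedy MIS: scan vertices by increasing rank, keep those still available."""
    remaining, mis = set(adj), set()
    for v in sorted(adj, key=rank.get):
        if v in remaining:
            mis.add(v)
            remaining -= adj[v] | {v}
    return mis


def parallel_greedy(adj, rank):
    """Round-based variant: all local rank minima join the MIS in the same round."""
    remaining, mis, rounds = set(adj), set(), 0
    while remaining:
        winners = {v for v in remaining
                   if all(rank[v] < rank[u] for u in adj[v] & remaining)}
        mis |= winners
        for v in winners:
            remaining -= adj[v] | {v}
        rounds += 1
    return mis, rounds


# Path 0-1-2-3-4 with rank(v) = v: only one vertex can join per round.
path = {i: {j for j in (i - 1, i + 1) if 0 <= j < 5} for i in range(5)}
identity = {i: i for i in range(5)}
print(parallel_greedy(path, identity))  # prints ({0, 2, 4}, 3)
```

For a fixed ranking both functions return the same set; the number of rounds of the second one depends on the ranking, which is why a random permutation matters.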
3.2 Preliminaries
For a graph $G = (V, E)$ and a set $V' \subseteq V$, $G[V']$ denotes the subgraph of $G$ induced on the set $V'$, i.e., $G[V'] = (V', E \cap (V' \times V'))$. We use $N(v)$ to refer to the neighborhood of $v$ in $G$. Throughout the chapter, we use $n \stackrel{\text{def}}{=} |V|$ to denote the number of vertices in the input graph.
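As a small illustration of the induced-subgraph notation, the sketch below computes G[V'] for a graph stored as a dictionary from each vertex to its set of neighbors. This representation and the name `induced_subgraph` are assumptions made for the example.

```python
def induced_subgraph(adj, keep):
    """Return G[V']: keep only the vertices in `keep` and the edges between them."""
    keep = set(keep)
    return {v: adj[v] & keep for v in keep}


# Triangle 1-2-3 with a pendant vertex 4 attached to 3.
graph = {1: {2, 3}, 2: {1, 3}, 3: {1, 2, 4}, 4: {3}}
print(induced_subgraph(graph, {1, 2, 4}))
```

Here vertex 4 ends up isolated in the induced subgraph, because its only neighbor, vertex 3, is not in V'.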
3.3 Overview and Organization
Our MIS algorithm is based on the randomized greedy MIS algorithm (RandGreedyMIS). This algorithm ranks/permutes vertices 1 to n randomly and then greedily adds vertices to the MIS, while walking through this permutation. We provide this algorithm below.
Algorithm 6: RandGreedyMIS(G)
Randomized greedy MIS algorithm
Input: Graph G = (V, E)
Output: An MIS in G
1 V' ← V
2 Choose a permutation π : [n] → [n] uniformly at random.
3 while V' ≠ ∅ do
4     Select the vertex v which according to π has the smallest rank in V'.
5     Add v to the MIS.
6     Remove v and N(v) from V'.
7 return the constructed MIS
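The pseudocode above translates almost line by line into a short sequential program. The following is a minimal sketch, assuming the graph is given as a dictionary from each vertex to its set of neighbors; the helper `is_mis` and the optional `rng` parameter are additions made here for testing and are not part of the algorithm as stated.

```python
import random


def rand_greedy_mis(adj, rng=None):
    """Sequential randomized greedy MIS on an undirected graph without self-loops."""
    if rng is None:
        rng = random.Random()
    order = list(adj)
    rng.shuffle(order)        # Line 2: uniformly random permutation
    remaining = set(adj)      # Line 1: V' <- V
    mis = []
    # Scanning in rank order and skipping removed vertices is equivalent to
    # repeatedly selecting the smallest-rank vertex of V' (Lines 3-4).
    for v in order:
        if v in remaining:
            mis.append(v)                 # Line 5
            remaining.discard(v)          # Line 6: remove v ...
            remaining -= adj[v]           # ... and N(v)
    return mis


def is_mis(adj, mis):
    """Check independence and maximality of a candidate set."""
    chosen = set(mis)
    independent = all(not (adj[v] & chosen) for v in chosen)
    maximal = all(v in chosen or bool(adj[v] & chosen) for v in adj)
    return independent and maximal


cycle = {i: {(i - 1) % 6, (i + 1) % 6} for i in range(6)}
result = rand_greedy_mis(cycle, random.Random(0))
print(result, is_mis(cycle, result))
```

The returned set depends on the random permutation, so only the MIS property, and not the particular vertices, should be relied upon.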
This algorithm has been studied before in the literature of parallel algorithms [FN18, BFS12]. One of the features of RandGreedyMIS is that, informally speaking, high-degree vertices get removed quickly and hence the maximum degree of the graph decreases at a high rate. In the rest of this chapter, we show how to efficiently implement a variant of this algorithm in only $O(\log\log \Delta)$ MPC and CONGESTED-CLIQUE rounds.