15.859-Z Algorithmic Superpower Randomization February 15, 2019
Lecture 5
Lecturer: Bernhard Haeupler Scribe: Sahil Hasan
1 Overview
In this lecture, we derive a parallel algorithm for the Lovász Local Lemma (LLL) from the Moser-Tardos algorithm. We first revisit material from prior lectures on the LLL criterion, as well as useful proofs from last week's homework.
2 Review
2.1 The Moser-Tardos Algorithm
Previously, we learned the Moser-Tardos algorithm, which provides a constructive proof of the Lovász Local Lemma. Here, we restate some technical details that are used in the analysis of the algorithm. The setting is the same as for the version of the Lovász Local Lemma we learned previously. We have $n$ binary random variables $X_1, \dots, X_n$ and $m$ events $A_1, \dots, A_m$, which we view as a collection of bad events. For an event $B$ in the probability space defined by $X_1, \dots, X_n$, we write $\mathrm{vbl}(B)$ for the smallest subset of variables that determines $B$. From here, we can draw the corresponding dependency graph, in which two events are adjacent when their variable sets intersect. Let $\Delta$ denote the maximum degree of the dependency graph. By the Lovász Local Lemma, we know that there exists an assignment of $X_1, \dots, X_n$ such that none of $A_1, \dots, A_m$ occurs. That is to say, with non-zero probability none of the events $A_1, \dots, A_m$ occurs.
Using these definitions, the basic LLL criterion is the following claim, as proven in previous classes.
Claim 1 For each event $A$, there is an $x_A \in (0,1)$ such that $\Pr[A] \le x_A \prod_{B \in \Gamma(A)} (1 - x_B)$.
Further proofs in previous classes went on to show that the number of resamplings for Moser-Tardos satisfies the following bound.
Claim 2 The expected number of resamplings for Moser-Tardos is
\[ \sum_A \mathbb{E}[\text{number of consistent trees with root } A] \le \sum_A \frac{x_A}{1 - x_A}, \qquad \text{where } \frac{x_A}{1 - x_A} \sim \Pr[A]. \]
This is intuitively true, reusing the concept of witness trees introduced in the prior lecture. The expected number of resamplings for Moser-Tardos is the sum, over all events $A$ among $A_1, \dots, A_m$, of the expected number of consistent witness trees rooted at $A$. This sum is at most $\sum_A \frac{x_A}{1 - x_A}$. Further explanation of this claim can be found in the previous scribe notes.
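As a reminder of the procedure being analyzed, here is a minimal Python sketch of sequential Moser-Tardos, specialized to CNF clauses as the bad events. The clause encoding (a list of `(variable index, wanted value)` pairs) and the names `violated` and `moser_tardos` are assumptions made for this illustration, not notation from the lecture.

```python
import random

def violated(clause, assignment):
    """A clause (bad event) occurs when every one of its literals is false."""
    # Each literal is a pair (variable index, value that would satisfy it).
    return all(assignment[v] != want for v, want in clause)

def moser_tardos(n_vars, clauses, rng):
    """Resample vbl(A) of one violated event A at a time until none is violated."""
    assignment = [rng.random() < 0.5 for _ in range(n_vars)]
    resamplings = 0
    while True:
        bad = [c for c in clauses if violated(c, assignment)]
        if not bad:
            return assignment, resamplings
        # Pick one violated event and resample exactly the variables it depends on.
        for v, _ in bad[0]:
            assignment[v] = rng.random() < 0.5
        resamplings += 1
```

Calling `moser_tardos(6, clauses, random.Random(0))` on a small satisfiable instance returns a satisfying assignment together with the number of resamplings, which is the quantity Claim 2 bounds in expectation.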
2.2 Adding in ε-Slack
Continuing on from where we left off, we want to consider a stricter condition on how events intersect. This is captured by the concept of ε-slack, where we assume the stronger restriction that each event intersects at most $2^{(1-\varepsilon)k}/\varepsilon$ other events. Thus, we consider a slight modification of the LLL criterion, where $\Pr[A] \le (1 - \varepsilon)\, x_A \prod_{B \in \Gamma(A)} (1 - x_B)$. From here we see that each event gets resampled at most $O(\log n)$ times with high probability, due to the nature of witness trees. By extension, with high probability no tree of size $\Omega(\frac{1}{\varepsilon} \log n)$ is consistent with the random table $R$. Without slack, the expected number of such trees is only bounded by
\[ \mathbb{E}\big[\text{number of trees of size } \Omega(\tfrac{1}{\varepsilon} \log n) \text{ with root } A\big] \le \frac{x_A}{1 - x_A}. \]
When we add in slack, the expected value picks up an extra factor of $(1 - \varepsilon)^{\Omega(\frac{1}{\varepsilon} \log n)} \le n^{-\Omega(1)}$.
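The final inequality is a one-line calculation. As a sketch, write the tree size as $\frac{c}{\varepsilon} \ln n$ for a constant $c > 0$ (the constant $c$ is our notation, standing in for the $\Omega(\cdot)$) and use $1 - \varepsilon \le e^{-\varepsilon}$:

```latex
\[
(1-\varepsilon)^{\frac{c}{\varepsilon}\ln n}
\;\le\; \left(e^{-\varepsilon}\right)^{\frac{c}{\varepsilon}\ln n}
\;=\; e^{-c\ln n}
\;=\; n^{-c}.
\]
```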
2.3 Dealing with exponentially many bad events
There are some common applications of the LLL in which the number of bad events is exponential (or larger) in the number of variables. The next section gives homework solutions for a case study of one such application, acyclic edge coloring. Ideally, our goal is still to have an algorithm that runs in time polynomial in the number of input variables. It turns out that this is in fact possible.
We first define the quantity
\[ \delta = \min_A \; x_A \prod_{B \in \Gamma(A)} (1 - x_B). \]
If we plug this into the ε-slack condition defined above, we see that the expected number of resamplings is governed by $\sum_A x_A \le n \log \frac{1}{\delta}$. This takes advantage of the fact that the dependency graph $G$ is dense, together with the LLL conditions, which force highly connected events in $G$ to have small probabilities. Specifically, $G$ can be written as the union of at most $n$ cliques: all events that depend on a fixed variable $X_i$ pairwise share that variable and therefore form a clique, and every edge of $G$ arises in this way. The contribution of each clique to the number of resamplings is bounded (by about $\log \frac{1}{\delta}$, which is a constant when $\delta$ is), and since at most $n$ cliques cover the entire graph $G$, this leads to at most $O(n)$ resamplings in expectation. Thus, while there are exponentially many events, the expected number of resamplings is still $O(n)$.
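A sketch of why a single clique contributes only about $\log \frac{1}{\delta}$. We assume (our reading, made explicit here) that $K$ is the set of events depending on one fixed variable, so that any two events in $K$ are neighbors, and that $A \in K$ is arbitrary. Using $x_A \le 1$ and $1 - x \le e^{-x}$:

```latex
\[
\delta \;\le\; x_A \prod_{B\in\Gamma(A)}(1-x_B)
\;\le\; \prod_{B\in K\setminus\{A\}}(1-x_B)
\;\le\; \exp\Big(-\sum_{B\in K\setminus\{A\}} x_B\Big)
\quad\Longrightarrow\quad
\sum_{B\in K} x_B \;\le\; 1+\ln\frac{1}{\delta}.
\]
```

Summing over the at most $n$ cliques gives $\sum_A x_A \le n\,(1 + \ln \frac{1}{\delta})$, which agrees with the stated $n \log \frac{1}{\delta}$ bound up to the additive $n$ term.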
2.4 Edge Coloring (HW4)
2.4.1 Acyclic edge coloring
Following the hints, we index the cycles in $G$ by $1, \dots, m$ and define bad events $A_1, \dots, A_m$, where $A_i$ is the event that cycle $i$ is colored with at most 2 colors. The probability space colors each edge of $G$ independently and uniformly at random from $C$ colors. In the dependency graph, $A_i$ and $A_j$ are neighbors if their cycles share at least one common edge in $G$.
We first bound $P_i \equiv \Pr[A_i]$. The event $A_i$ is contained in the following event: for some pair of colors out of the $C$ colors, every edge of cycle $i$ receives one of the two colors in the pair. Letting $l_i$ be the length of cycle $i$, a union bound over the pairs gives
\[ P_i < \binom{C}{2} \left(\frac{2}{C}\right)^{l_i} < 2 \left(\frac{2}{C}\right)^{l_i - 2}. \]
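The second inequality in this bound only uses $\binom{C}{2} = \frac{C(C-1)}{2}$; spelled out:

```latex
\[
\binom{C}{2}\Big(\frac{2}{C}\Big)^{l_i}
= \frac{C(C-1)}{2}\cdot\frac{4}{C^{2}}\Big(\frac{2}{C}\Big)^{l_i-2}
= \frac{2(C-1)}{C}\Big(\frac{2}{C}\Big)^{l_i-2}
< 2\Big(\frac{2}{C}\Big)^{l_i-2}.
\]
```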
As the hint says, the symmetric version of the LLL does not work for the events and dependency graph defined above: there can be an arbitrarily long cycle in which every edge also lies on a cycle of length 3, so the sum of the neighbors' probabilities cannot be bounded. We therefore consider the asymmetric version of the LLL. That is, we want to define $x_i$ such that for every $i \in \{1, \dots, m\}$:
\[ P_i < x_i \prod_{j \in \Gamma(i)} (1 - x_j). \]
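Using the inequality $1 - x \ge e^{-2x}$ (valid for $0 \le x \le \frac{1}{2}$), it suffices to show
\[ 2 \left(\frac{2}{C}\right)^{l_i - 2} \le x_i \left(\frac{1}{e}\right)^{\sum_{j \in \Gamma(i)} 2 x_j}. \]
To bound the sum in the exponent, we want to bound the number of neighbors of $A_i$; more precisely, the number of neighbors whose cycle has length exactly $l$. We assign the same value $x_l$ to each of these neighbors (that is, $x_j = x_l$ if cycle $j$ has length $l$).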
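Imagine growing the neighboring cycle from any vertex of cycle $i$: at each step there are at most $\Delta - 1$ choices, except for the last step, where $\Delta$ now denotes the maximum degree of $G$. Since the neighboring cycle shares at least one edge with cycle $i$, the number of such cycles is bounded by $(\Delta - 1)^{l - 2}$. As there are $l_i$ vertices to grow from, we have
\[ x_i \left(\frac{1}{e}\right)^{\sum_{j \in \Gamma(i)} 2 x_j} \ge x_{l_i} \left(\frac{1}{e}\right)^{l_i \sum_l 2 x_l (\Delta - 1)^{l - 2}} > x_{l_i} \left(\frac{1}{e}\right)^{l_i \sum_l 2 x_l \Delta^{l - 2}}. \]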
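To get the desired inequality, we want $x_l$ to cancel $\Delta^{l - 2}$ and also to include a factor that decays geometrically in $l$, so that the sum becomes a geometric series. We set $x_l = \frac{1}{(2\Delta)^{l - 2}}$ and get
\[ x_{l_i} \left(\frac{1}{e}\right)^{l_i \sum_l 2 x_l \Delta^{l - 2}} \ge \frac{1}{(2\Delta)^{l_i - 2}} \left(\frac{1}{e}\right)^{2 l_i}. \]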
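The geometric-series step, and the final comparison with the bound on $P_i$, can be sanity-checked numerically. The sketch below is our own illustration: the function names are hypothetical, cycle lengths are assumed to start at 3, and the constant 4000 used for $k$ in $C = k\Delta$ is merely one value that happens to suffice, not a value from the homework.

```python
import math

def geometric_sum(delta, max_len=200):
    """Sum over cycle lengths l >= 3 of 2 * x_l * delta**(l-2), with x_l = (2*delta)**-(l-2)."""
    total = 0.0
    for l in range(3, max_len):
        # x_l * delta**(l-2) = (delta / (2*delta))**(l-2), computed in log-space.
        total += 2 * math.exp((l - 2) * (math.log(delta) - math.log(2 * delta)))
    return total

def lll_condition_holds(k, length):
    """Check 2*(2/C)**(l-2) <= (2*Delta)**-(l-2) * e**(-2*l) for C = k*Delta.

    Taking logarithms, Delta cancels and the condition becomes
    ln 2 + (l-2) * ln(4/k) <= -2*l.
    """
    lhs = math.log(2) + (length - 2) * math.log(4 / k)
    rhs = -2 * length
    return lhs <= rhs
```

For example, `geometric_sum(10)` is numerically 2, matching the exponent $2 l_i$, and `lll_condition_holds(4000, l)` holds for every length $l \ge 3$ we tried, whereas a small constant such as $k = 100$ already fails at $l = 3$.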
Letting $C = k\Delta$, where $k$ is some sufficiently large constant to be set, we get the desired inequality.
2.4.2 Nonrepetitive coloring
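We similarly define the bad event $A_i$ to be the event that the $i$th even-length path has a repetitive coloring. Letting $l_i = 2k$ be the length of the $i$th path, we have
\[ P_i \le \frac{1}{C^k}. \]
The right-hand side of the LLL condition can be bounded similarly to the previous problem:
\[ x_i \left(\frac{1}{e}\right)^{\sum_{j \in \Gamma(i)} 2 x_j} \ge x_k \left(\frac{1}{e}\right)^{2k \sum_{k'} 2 x_{k'} \Delta^{2k'}}. \]
To cancel the $\Delta^{2k'}$, we now set $x_k = \left(\frac{1}{2\Delta^2}\right)^k$. In the end, we want
\[ \left(\frac{\Delta^2}{C}\right)^k \le \left(\frac{1}{e}\right)^{2ka} \]
for some constant $a$. This can be achieved by setting $C = b\Delta^2$ for some constant $b$, and we are done (by the LLL).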
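One way to make the constant explicit, as a sketch: we assume that the LLL right-hand side carries the prefactor $x_k$ and that the half-lengths $k'$ of neighboring paths range over $k' \ge 1$ (both are our reading of the bound above, and $k'$ is our name for the summation index). With $x_{k'} = (2\Delta^2)^{-k'}$ the sum in the exponent is geometric:

```latex
\[
\sum_{k'\ge 1} 2\,x_{k'}\,\Delta^{2k'} \;=\; \sum_{k'\ge 1} 2\cdot 2^{-k'} \;=\; 2,
\qquad\text{so it suffices that}\qquad
\frac{1}{C^{k}} \;\le\; \frac{1}{(2\Delta^{2})^{k}}\,e^{-4k}
\;\Longleftrightarrow\; C \;\ge\; 2e^{4}\Delta^{2}.
\]
```

Under these assumptions, any constant $b \ge 2e^4 \approx 109.2$ would do.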
3 Parallel Algorithm for the Lovász Local Lemma
In this lecture, we also give a parallel algorithm for the Lovász Local Lemma. The original Moser-Tardos algorithm picks one violated event and resamples the variables that the event depends on. We cannot simply resample all violated events in parallel, since events that share variables would interfere with one another. Instead, we select a maximal independent set $S$ in the subgraph of the dependency graph induced by the nodes corresponding to all the violated events. Since no event in $S$ shares a variable with any other event in $S$, we can resample all the variables that the events in $S$ depend on in parallel. Note that in theory a maximum independent set would be ideal, but computing one is NP-hard, so it is impractical to do so. Moreover, we can actually avoid computing a full maximal independent set by performing only a single iteration of the maximal independent set algorithm, but the proof of correctness is more involved. Consider the parallel algorithm below, which uses maximal independent sets.
Input: $n$ variables $X_1, \dots, X_n$ and $m$ events $A_1, \dots, A_m$, where $\mathrm{vbl}(A_i)$ denotes the set of random variables that $A_i$ depends on, for each $i$
Start with a random assignment
While ∃ a violated event $A$:
    Take a maximal independent set $S$ in the subgraph of the dependency graph induced by all the violated events
    Resample all the variables in $\bigcup_{A \in S} \mathrm{vbl}(A)$
Return the current assignment
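The pseudocode above can be sketched in Python as follows, again with CNF clauses as the bad events. This is a sequential simulation only: the greedy routine `greedy_mis` is our stand-in for a genuinely parallel MIS algorithm, and all function names and the clause encoding are assumptions made for this illustration.

```python
import random

def violated(clause, assignment):
    """A clause (bad event) occurs when every one of its literals is false."""
    return all(assignment[v] != want for v, want in clause)

def shares_variable(c1, c2):
    """Two events are adjacent in the dependency graph iff their vbl sets intersect."""
    return not {v for v, _ in c1}.isdisjoint(v for v, _ in c2)

def greedy_mis(indices, clauses):
    """Greedy maximal independent set among the given (violated) events."""
    chosen = []
    for i in indices:
        if all(not shares_variable(clauses[i], clauses[j]) for j in chosen):
            chosen.append(i)
    return chosen

def parallel_moser_tardos(n_vars, clauses, rng):
    """Each round resamples vbl(A) for every A in an MIS of the violated events."""
    assignment = [rng.random() < 0.5 for _ in range(n_vars)]
    rounds = []  # rounds[t] is the independent set chosen in iteration t + 1
    while True:
        bad = [i for i, c in enumerate(clauses) if violated(c, assignment)]
        if not bad:
            return assignment, rounds
        s = greedy_mis(bad, clauses)
        # Events in s share no variables, so these resamplings could run in parallel.
        for v in sorted({v for i in s for v, _ in clauses[i]}):
            assignment[v] = rng.random() < 0.5
        rounds.append(s)
```

The returned `rounds` list records the sets $S_1, S_2, \dots$ chosen in successive iterations, so its length is the number of parallel rounds used.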
This algorithm runs in $O(\log^2 n)$ time: with high probability it needs $O(\log n)$ iterations of resampling, and each iteration takes $O(\log n)$ time to generate a maximal independent set. In practice we can make this faster by computing only an approximate maximal independent set, but this requires a relatively involved proof that we skipped during lecture. Going back to the derivations from lecture, suppose that the above parallel algorithm terminates after $\ell$ iterations. Let $S_1, \dots, S_\ell$ denote the respective maximal independent sets, and let $A$ be an event that is chosen in the $i$th iteration. The next lemma shows that the witness tree rooted at $A$ has depth exactly $i$.
Lemma 3 Let $A$ be an event included in the maximal independent set $S_i$ chosen in the $i$th iteration. Then the witness tree rooted at $A$ has depth exactly $i$.
Proof Since $S_i$ is an independent set, no other node in $S_i$ can be attached to $A$. Suppose that there is no node in $S_{i-1}$ that is attached to the witness tree. Then $\mathrm{vbl}(B) \cap \mathrm{vbl}(A) = \emptyset$ for all $B \in S_{i-1}$, so $\{A\} \cup S_{i-1}$ is an independent set. Moreover, when we resample all the variables in $\bigcup_{B \in S_{i-1}} \mathrm{vbl}(B)$, the values of the variables that $A$ depends on do not change at all. That means $A$ was also violated in the $(i-1)$th iteration, and thus $\{A\} \cup S_{i-1}$ is an independent set of violated events in the $(i-1)$th iteration. However, this contradicts the assumption that $S_{i-1}$ is maximal. Therefore, there is a node $A_{i-1}$ in $S_{i-1}$ attached to $A$. Again, notice that the other nodes in $S_{i-1}$ cannot be attached to $A_{i-1}$; they can only be attached to $A$. Hence the depth of the witness tree before attaching nodes in $S_{i-2}$ is exactly 2. Likewise, there is a node $A_{i-2}$ in $S_{i-2}$ attached to $A_{i-1}$, and the depth increases by exactly 1 after looking at $S_{i-2}$. Repeating this argument down to $S_1$ proves the lemma.
In fact, we can use Lemma 3 to prove that the parallel algorithm runs in $O(\log^2 n)$ time with high probability.
Theorem 4 The running time of the parallel algorithm for the Lovász Local Lemma is $O(\log^2 n)$ with high probability.
Proof Note that a witness tree of depth $k$ contains at least $k$ nodes. By the previous lemma, an event resampled in round $i$ has a witness tree of depth $i$, as long as the tree comes from resampling an MIS in every round. From our previous claim regarding ε-slack, we know that the probability that a tree of size $\Omega(\frac{1}{\varepsilon} \log n)$ is consistent is at most $(1 - \varepsilon)^{\Omega(\frac{1}{\varepsilon} \log n)} \le n^{-\Omega(1)}$. This tells us that, with high probability, $O(\frac{1}{\varepsilon} \log n)$ parallel rounds of MIS resampling suffice. So the resampling itself takes $O(\log n)$ rounds (as in the Moser-Tardos analysis), which leads to an overall running time of $O(\log^2 n)$ once we account for the $O(\log n)$ time it takes to compute each MIS.
Acknowledgments. These notes are based on previous writeups by Dabeen Lee, as well as the writeup of HW4 (Edge Coloring, Section 2.4), which was provided by Hanjun Li.
References
[1] K. Chandrasekaran, N. Goyal, and B. Haeupler, Deterministic Algorithms for the Lovász Local Lemma, SIAM Journal on Computing 42 (2013) 2132-2155.
[2] R. Moser and G. Tardos, A Constructive Proof of the General Lovász Local Lemma, Journal of the ACM 57 (2010) 11:1-11:15.