Max-Cut in High Dimensions - Approximation Techniques for Facility Location and Their Applicati

moved point p ∈ P . Since the minimum pairwise distance from distinct points in P is 1 and Inequality (8.2) is true for at least (1 − σ/2) · n² pairs of points (p, q) ∈ P × P with probability at least 1 − δ/3, we have that

and slack σ/2 with probability at least 1 − δ/3. Furthermore, each point lies on a grid with cell size ε/(18 · √(d(ε′, σ′, δ′))) and the maximum pairwise distance of points is O(√

On the obtained point set, we run our construction from Sections 7.2 and 8.2 with a precision parameter ε″ := ε/3, a slack parameter σ″ := σ/2, and an error probability parameter δ″ := δ/3. Then, with a total error probability of δ, the resulting point set P′ embeds P with distortion (1 + ε/3) · (1 + ε/3) ≤ (1 + ε) and slack σ. It follows from the above and Theorem 7 that we also have P′ ⊂ {1, . . . , ∆′}^{d′} with spread ∆′ ∈ O(√d · n∆/(ε² · √(σδ))) and dimension d′ ∈ O(1/(ε²σδ)).

As explained before in the proof of Theorem 9, we have to ensure that σ″ > 2^{2d}/n (confer Theorem 7) since we use the construction given in Section 7.2. However, this is implicitly required by the fact that the space requirement of a streaming algorithm has to be sublinear in n and the space requirement of our streaming algorithm is ω(1/σ).

Finally, we analyze the complexity of our construction. Due to Theorem 10, each point in P can be embedded into the low-dimensional space ℝ^{d(ε′, σ′, δ′)} in O(d · log²(d)/(ε²σδ)) time using O(log(d)/(ε²σδ)) space. Due to Theorems 7 and 9, the construction from Sections 7.2 and 8.2 applied on a set of points with dimension O(1/(ε²σδ)) and spread O(√d · n∆/(ε² · √(σδ))) has both an update time and space requirement of

The size of the set of representatives follows from Lemma 7.2.7.

8.3 Max-Cut in High Dimensions

In this section, we show how to embed a set of high-dimensional Euclidean points into a low-dimensional Euclidean space such that the sum of the pairwise distances is well preserved. Afterwards, we use this result to design a streaming algorithm that implicitly computes a (1 ± ε)-approximation of the max-cut problem for a dynamic data stream of high-dimensional Euclidean points.

Let ϕ : P → ℝ^{d(ε,δ)} be the Johnson-Lindenstrauss embedding where each point is mapped into a Euclidean space with dimension d(ε, δ) ∈ Θ(1/(ε²δ²)). Then, we will show that, for a pair of points (p, q) ∈ P × P, the expected value of |D(ϕ(p), ϕ(q)) − D(p, q)| is δε · D(p, q) and |D(ϕ(p), ϕ(q)) − D(p, q)| is sharply concentrated around its expected value with probability 1 − δ. This leads to the following lemma:

Lemma 8.3.1. Let ε, 0 < ε < 1, be a precision parameter, let δ, 0 < δ < 1, be an error variable Y_i(p) as explained in the proof of Theorem 10. We define the embedding ϕ for the point p by

\[
\varphi(p) := \frac{1}{\sqrt{d(\varepsilon, \delta)}} \cdot \bigl(Y_1(p), \ldots, Y_{d(\varepsilon,\delta)}(p)\bigr)^T .
\]

Following the construction in the proof of Theorem 10, each point can be embedded using O(log(d)/(ε²δ²)) space and by performing O(d/(ε²δ²)) arithmetic and finite field operations on elements of O(log(d)) bits. Furthermore, since ϕ is a linear function, we have ϕ(p − q) = ϕ(p) − ϕ(q) for all pairs (p, q) ∈ ℝ^d × ℝ^d.
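The two properties used here, linearity of ϕ and E[‖ϕ(ν)‖²] = ‖ν‖², can be illustrated with a minimal sketch. It assumes that each Y_i(p) is the inner product of p with a random ±1 vector and draws fully independent signs, whereas the proof of Theorem 10 (not reproduced in this section) generates its variables with finite field operations. The names make_embedding, dist, and total_error are hypothetical.

```python
import math
import random


def make_embedding(d, k, seed=0):
    """Return a linear map phi: R^d -> R^k built from k random sign vectors.

    Assumption: fully independent +/-1 entries stand in for the random
    variables Y_i(p) of the text, whose exact construction is not shown here.
    """
    rng = random.Random(seed)
    signs = [[rng.choice((-1.0, 1.0)) for _ in range(d)] for _ in range(k)]
    scale = 1.0 / math.sqrt(k)

    def phi(p):
        # Y_i(p) = <s_i, p>; phi(p) = (Y_1(p), ..., Y_k(p))^T / sqrt(k)
        return [scale * sum(s * x for s, x in zip(row, p)) for row in signs]

    return phi


def dist(p, q):
    """Euclidean distance D(p, q)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def total_error(points, phi):
    """Return (sum of |D(phi(p), phi(q)) - D(p, q)|, sum of D(p, q)),
    both taken over all ordered pairs (p, q) in P x P."""
    images = [phi(p) for p in points]
    err = total = 0.0
    for i, p in enumerate(points):
        for j, q in enumerate(points):
            d_orig = dist(p, q)
            err += abs(dist(images[i], images[j]) - d_orig)
            total += d_orig
    return err, total
```

Calling total_error on a point set returns the two sums that the lemma compares, so the ratio err / total can be inspected for different target dimensions k.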

Now, let p and q be any two points in ℝ^d. We define ν := p − q and Y(ν) := ‖ϕ(ν)‖² to be the random variable for the squared length of ϕ(ν). Then, as explained in the proof of Theorem 10, the expected value of Y(ν) is E[Y(ν)] = ‖ν‖², and we can upper bound the

The expected value of err(p, q) is given by

E [err(p, q)]


It follows that, in order to upper bound the expected value of err(p, q), we have to upper bound the probability that err(p, q) > εδ/5 · 2^i · ‖ν‖ for each i ∈ ℕ₀. Let ℓ be any fixed

By Chebyshev’s inequality, we can upper bound this probability by

Now, the expected value of err(p, q) can be upper bounded by

Due to Markov’s inequality, it follows that

Pr Due to linearity of expectation, we have

Given any Euclidean point set P, the embedding described above is useful for all geometric problems that satisfy the following four properties:

(i) The cost of an optimal solution for P is a function whose set of input parameters is a subset of all pairwise distances of P .

(ii) The cost of an optimal solution for P is at least Σ_{p∈P} Σ_{q∈P} 1/c · D(p, q), where c ≥ 1 is any small constant.

(iii) If the distance D(p, q) between any two points p, q ∈ P is increased or decreased by any value α > 0, the cost of an optimal solution for P is increased or decreased by at most O(α).

(iv) The complexity of all known (1 ± ε)-approximation algorithms depends exponentially on the dimension of P .


To handle these problems, we first embed the input points and afterwards apply any efficient (1 ± ε)-approximation algorithm on the embedded points.

One suitable problem is the max-cut problem in the dynamic geometric data stream model.

Definition 8.3.2 (Euclidean Max-Cut Problem). For a set P ⊂ ℝ^d, the Euclidean max-cut problem is to find a partition of P into two subsets C₁ and C₂ such that the sum

\[
\mathrm{Cut}(P, C_1, C_2) := \sum_{(p,q) \in C_1 \times C_2} D(p, q)
\]

of inter-cluster distances is maximized.
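To make the definition concrete, the following sketch evaluates Cut(P, C₁, C₂) and finds a maximum cut by exhaustive search. The exhaustive search is only an illustration for very small inputs and is not the algorithm used in this section; the names cut_value and brute_force_max_cut are hypothetical.

```python
import math
from itertools import product


def dist(p, q):
    """Euclidean distance D(p, q)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def cut_value(c1, c2):
    """Cut(P, C1, C2): sum of D(p, q) over all (p, q) in C1 x C2."""
    return sum(dist(p, q) for p in c1 for q in c2)


def brute_force_max_cut(points):
    """Try all 2^(n-1) partitions; feasible only for very small n."""
    n = len(points)
    if n < 2:
        return 0.0, (list(points), [])
    best_value, best_sides = 0.0, (list(points), [])
    # The first point is fixed on side 0 to skip mirrored partitions.
    for rest in product((0, 1), repeat=n - 1):
        labels = (0,) + rest
        c1 = [p for p, s in zip(points, labels) if s == 0]
        c2 = [p for p, s in zip(points, labels) if s == 1]
        value = cut_value(c1, c2)
        if value > best_value:
            best_value, best_sides = value, (c1, c2)
    return best_value, best_sides
```

For the four collinear points (0, 0), (1, 0), (2, 0), (3, 0), the best partition separates the two left points from the two right points and has value 2 + 3 + 1 + 2 = 8.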

Obviously, the max-cut problem satisfies Properties (i) and (iii). Furthermore, it is shown in [44] that Property (ii) is satisfied for c = 4. Concerning Property (iv), the authors of [44] gave an efficient (1 ± ε)-approximation for the max-cut problem in low dimensions that has the following properties:
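Property (ii) with c = 4 can be checked numerically on small instances. One standard way to see the bound, which may differ from the argument in [44], is that a uniformly random partition separates each pair with probability 1/2, so its expected value is half the sum over unordered pairs, that is, a quarter of the sum over ordered pairs (p, q) ∈ P × P. The sketch below compares a brute-force maximum cut against that quantity; all function names are hypothetical.

```python
import math
import random


def dist(p, q):
    """Euclidean distance D(p, q)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def ordered_pair_sum(points):
    """Sum of D(p, q) over all ordered pairs (p, q) in P x P."""
    return sum(dist(p, q) for p in points for q in points)


def max_cut_value(points):
    """Exact maximum cut value by exhaustive search (tiny inputs only)."""
    n = len(points)
    best = 0.0
    for mask in range(2 ** max(n - 1, 0)):
        # Bit i of mask decides the side of point i + 1; point 0 stays on side 0.
        side = [0] + [(mask >> i) & 1 for i in range(n - 1)]
        value = sum(dist(points[i], points[j])
                    for i in range(n) for j in range(n)
                    if side[i] == 0 and side[j] == 1)
        best = max(best, value)
    return best


def check_property_ii(points, c=4):
    """True if MaxCut(P) >= (1/c) * sum over P x P of D(p, q)."""
    return max_cut_value(points) >= ordered_pair_sum(points) / c - 1e-9
```

A call such as check_property_ii([[0, 0], [3, 4]]) compares the maximum cut value 5 against 10/4.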

Lemma 8.3.3 ([44]). Let ε, 0 < ε < 1, be a precision parameter. Given a stream of m Insert and Delete operations of points from a discrete Euclidean space {1, . . . , ∆}^d, where d is a constant, there exists a streaming algorithm that computes with probability at least 2/3, for the current point set P with cardinality n, a data structure of size O(log³(∆m) · log⁴(∆)/ε^{2d+4}) from which an implicit (1 ± ε)-approximate solution for the max-cut problem can be extracted in poly(exp(1/ε)^{O(1)}, (1/ε)^d, log(∆), log(n), log(m)) time.

An update can be processed in O(log²(∆) · log(∆m)) time.

By combining the embedding given in Lemma 8.3.1 with the approximation algorithm presented in [44], we can implicitly compute a (1 ± ε)-approximation for the max-cut problem on dynamic geometric data streams of high-dimensional points.

Theorem 12. Let ε, 0 < ε < 1, be a precision parameter. Given a stream of m Insert and Delete operations of points from a discrete high-dimensional Euclidean space {1, . . . , ∆}^d, there is a randomized streaming algorithm that has a space requirement of O(log⁷(d∆mn)/ε^{O(1/ε²)}) and computes with probability at least 5/8, for the current point set P of size n, a data structure from which an implicit (1 ± ε)-approximation for the max-cut problem can be extracted in poly(exp(1/ε)^{O(1)}, (1/ε)^{1/ε²}, log(d), log(∆), log(n), log(m)) time. An update requires O(d · log²(d)/ε² + log³(d∆nm/ε)) time.

Proof. We proceed in a similar way as in the proof of Theorem 11. At first, we embed the discrete high-dimensional Euclidean point set P into a low-dimensional Euclidean space. This embedding induces a small multiplicative error on the cost of a maximum cut. Then, we apply the snap-rounding technique, i.e., we impose an appropriately fine grid on the target space and move each embedded point to its nearest grid point. This movement of the points induces an additive error, which can be charged against a lower bound on the cost of a maximum cut for P to get a small multiplicative error. Finally, by applying the techniques described in [44] on the embedded and moved points, we obtain the results stated in the theorem. Next, we explain our construction in more detail.

In the first step, we apply the embedding ϕ : P → P′ given in Lemma 8.3.1 with precision parameter ε′ := ε/16 and error probability parameter δ′ := 1/24 on P. Then, we have that

\[
\sum_{(p,q) \in P \times P} |D(\varphi(p), \varphi(q)) - D(p, q)| \le \varepsilon' \cdot \sum_{(p,q) \in P \times P} D(p, q)
\]

is true with probability at least 1 − δ′. Since Property (ii) (on page 160) is satisfied for c = 4 [44], we have MaxCut(P) ≥ 1/4 · Σ_{(p,q)∈P×P} D(p, q). Due to the fact that each cut

maximum cut of P′. It follows from the above that


In the second step, we apply the snap-rounding technique. We impose a square grid on the target space ℝ^{d(ε′, δ′)} with d(ε′, δ′) ∈ Θ(1/(ε²δ²)), where each cell has side length ε/(16 · √(d(ε′, δ′))), and move each point in P′ to its nearest grid point. Let P″ be the set of points that we obtain after moving the points in P′. Each point is moved by a distance of at most half the diagonal of a grid cell, that is, at most √(d(ε′, δ′))/2 · ε/(16 · √(d(ε′, δ′))) = ε/32, so each pairwise distance changes by at most ε/16.

Thus, the movement of the points induces an additive error of at most εn²/16 on the sum of the pairwise distances. Since Property (ii) (on page 160) is satisfied for c = 4 [44] and the minimum pairwise distance of P is 1, a lower bound on the cost of a maximum cut for P is n²/4. Hence, we have εn²/16 ≤ ε/4 · MaxCut(P). Due to Inequality (8.3), we get

with probability at least 1 − 1/24. Besides, we can upper bound the diameter of P″ as follows. Since the maximum pairwise distance of P is √d · ∆, the value n² · √d · ∆ is an upper bound on the cost of a maximum cut for P. Since the diameter of a point set is a lower bound on the cost of a maximum cut of the point set, we get

diam(P′) ≤ MaxCut(P′) ≤

where the second inequality follows from Inequality (8.3). As a result, the diameter of P″ is O(√d · ∆n²). Furthermore, each point in P″ lies on a grid with cell size ε/(16 · √(d(ε′, δ′))).

Thus, by scaling the point space by 16 · √(d(ε′, δ′))/ε, we get a set of points from a discrete low-dimensional space {1, . . . , ∆′}^{d′} with ∆′ ∈ O(√d · ∆n²/ε²) and d′ ∈ O(1/ε²).
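The snap-rounding and scaling steps can be sketched as follows, with k denoting the target dimension d(ε′, δ′). With side length ε/(16 · √k), half the diagonal of a cell is √k/2 · ε/(16 · √k) = ε/32, which bounds how far any point moves. The function name snap_and_scale is hypothetical, and the final shift that makes every coordinate start at 1 is an assumption of this sketch; it is a translation and leaves all pairwise distances unchanged.

```python
import math


def snap_and_scale(points, eps):
    """Snap points to a grid of side eps / (16 * sqrt(k)) and also return
    the corresponding integer coordinates after scaling by 16 * sqrt(k) / eps."""
    k = len(points[0])
    side = eps / (16.0 * math.sqrt(k))
    # Index of the nearest grid point in every coordinate.
    indices = [[round(x / side) for x in p] for p in points]
    snapped = [[i * side for i in idx] for idx in indices]
    # Assumption: translate so that the smallest index per coordinate is 1.
    lows = [min(idx[j] for idx in indices) for j in range(k)]
    discrete = [[idx[j] - lows[j] + 1 for j in range(k)] for idx in indices]
    return snapped, discrete
```

The list discrete plays the role of the point set in {1, . . . , ∆′}^{d′}; multiplying a distance between two of its points by the side length gives the distance between the corresponding snapped points.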

On the scaled point set, we run the approximation algorithm of [44] with precision parameter ε″ := ε/3. Due to Lemma 8.3.3 and our calculations above, with probability at least 23/24 − 1/3 = 5/8, we can compute a point set P‴ such that

construction computes an implicit (1 ± ε)-approximate solution for the max-cut problem with probability at least 5/8.
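The success probability is a union bound over the two sources of failure: the embedding fails with probability at most δ′ = 1/24, and the algorithm of Lemma 8.3.3 fails with probability at most 1/3. A short exact-arithmetic check, with hypothetical variable names:

```python
from fractions import Fraction

embedding_failure = Fraction(1, 24)  # delta' chosen in the first step
algorithm_failure = Fraction(1, 3)   # Lemma 8.3.3 succeeds with probability >= 2/3

# Union bound: both steps succeed with probability at least 1 - (1/24 + 1/3).
success_lower_bound = 1 - (embedding_failure + algorithm_failure)
print(success_lower_bound)  # 5/8
```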

Note that our construction works in the streaming model, where the first two steps are used to transform a stream of high-dimensional points into a stream of low-dimensional points. Due to Lemma 8.3.1, the transformation of one high-dimensional input point requires O(log(d)/ε²) space and O(d · log²(d)/ε²) time. Finally, since we apply the approximation algorithm of [44] on a stream of points with dimension O(1/ε²) and spread O(√d · ∆n²/ε²), the complexity of our construction is as claimed in the theorem.
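The three steps of the proof can be put together in a small offline sketch. It is not the streaming construction itself: a random sign projection stands in for the embedding of Lemma 8.3.1, and exhaustive search stands in for the algorithm of [44], so it only runs on tiny inputs. All function names and the parameter k (the target dimension) are hypothetical.

```python
import math
import random
from itertools import product


def dist(p, q):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def embed(points, k, rng):
    # Assumption: random sign projection in place of the embedding of Lemma 8.3.1.
    d = len(points[0])
    signs = [[rng.choice((-1.0, 1.0)) for _ in range(d)] for _ in range(k)]
    scale = 1.0 / math.sqrt(k)
    return [[scale * sum(s * x for s, x in zip(row, p)) for row in signs]
            for p in points]


def snap(points, eps):
    # Move every point to the nearest point of a grid with side eps / (16 * sqrt(k)).
    side = eps / (16.0 * math.sqrt(len(points[0])))
    return [[round(x / side) * side for x in p] for p in points]


def cut_of_labels(points, labels):
    # Cut value of the partition given by labels (0 = C1, 1 = C2).
    n = len(points)
    return sum(dist(points[i], points[j])
               for i in range(n) for j in range(n)
               if labels[i] == 0 and labels[j] == 1)


def best_labels(points):
    # Assumption: exhaustive search in place of the algorithm of [44].
    n = len(points)
    candidates = ((0,) + rest for rest in product((0, 1), repeat=n - 1))
    return max(candidates, key=lambda labels: cut_of_labels(points, labels))


def pipeline(points, eps, k, seed=0):
    """Embed, snap, solve in low dimension, then evaluate the partition on P."""
    rng = random.Random(seed)
    low_dim = snap(embed(points, k, rng), eps)
    labels = best_labels(low_dim)
    achieved = cut_of_labels(points, labels)
    optimum = cut_of_labels(points, best_labels(points))
    return achieved, optimum
```

The pair returned by pipeline lets one compare the cut found through the low-dimensional point set with the true maximum cut of P; how close the two are depends on the choice of k and on the random projection.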

In document Approximation Techniques for Facility Location and Their Applications in Metric Embeddings (Page 171-178)