A Quickhull Algorithm - Lecture Notes for Algorithm Analysis and Design

Let S be a set of n points whose convex hull has to be constructed. We compute the convex hull of S by constructing the upper and the lower hull of S. Let pl and pr be the extreme points of S in the x direction. Let Sa ( Sb ) be the subset of S which lie above ( below ) of the line through pl and pr. As we had noted previously, Sa∪{pl, pr} and Sb∪{pl, pr} determine the upper and the lower convex hulls. We will describe the algorithm Quickhull. to determine the upper hull using Sa∪ {p^l, pr}. As the name suggests, this algorithm shares many features with its namesake quicksort.

The slope of the line joining p and q is denoted by slope(pq). The predicate left-turn(x, y, z) is true if the sequence x, y, z has a counter-clockwise orientation, or equivalently the area of the triangle has a positive sign.

Algorithm Quickhull(Sa, pl, pr)

Input: Given Sa={p1, p2, . . . , pn} and the leftmost extreme point pl and the rightmost extreme point pr. All points of Sa lie above the line plpr. Output: Extreme points of the upper hull of Sa∪ {p^l, pr} in clockwise order.

Step 1. If Sa={p}, then return the extreme point {p}.

Step 2. Select randomly a pair {p2i−1, p2i} from the the pairs {p^2j−1, p2j}, j = 1, 2, . . . , ⌊ⁿ₂⌋.

Step 3. Select the point pm of Sa which supports a line with slope(p2i−1p2i).

(If there are two or more points on this line then choose a pm that is distinct from pl and pr). Assign Sa(l) = Sa(r) =∅).

Step 4. For each pair {p2j−1, p2j}, j = 1, 2, . . . , ⌊ⁿ₂⌋ do the following ( assuming x[p2j−1] < x[p2j] )

Case 1: x[p2j] < x[pm]

if left-turn (pm, p2j, p2j−1) then Sa(l) = Sa(l)∪ {p^2j−1, p2j} else Sa(l) = Sa(l)∪ {p2j−1}.

Case 2: x[pm] < x[p2j−1]

if left-turn (pm, p2j−1, p2j) then Sa(r) = Sa(r)∪ {p^2j} else Sa(r) = Sb(r)∪ {p2j−1, p2j}.

Case 3: x[p_2j−1] < x[p_m] < x[p_2j] Sa(l) = Sa(l)∪ {p^2j−1};

Sa(r) = Sa(r)∪ {p2j}.

Step 5. (i) Eliminate points from S_a(l) which lie below the line joining p_l and p_m. (ii) Eliminate points from Sa(r) which lie below the line joining pm and pr. Step 6. If Sa(l)6= ∅ then Quickhull(Sa(l), pl, pm).

Output p_m.

If Sa(r)6= ∅ then Quickhull(Sa(r), pm, pr).

Exercise 6.7 In step 3, show that if the pair {p2i−1, p2i} satisfies the property that the line containing p_2i−1p_2i does not intersect the line segment p_lp_r, then it guarantees that p2i−1 or p2i does not lie inside the triangle △p^lp2ipr or △p^lp2i−1pr respectively.

This could improve the algorithm in practice.

6.5.1 Analysis

To get a feel for the convergence of the algorithm Quickhull we must argue that in each recursive call, some progress is achieved. This is complicated by the possibility that one of the end-points can be repeatedly chosen as p_m. However, if p_m is p_l, then at least one point is eliminated from the pairs whose slopes are larger than the

p p p

r L

p 2j 2j-1

Figure 6.1: Left-turn(pm, p2j−1, p2j) is true but slope(p2j−1p2j) is less than the median slope given by L.

supporting line L through p_l. If L has the largest slope, then there are no other points on the line supporting pm (Step 3 of algorithm). Then for the pair (p2j−1, p2j), whose slope equals that of L, left-turn (pm, p2j, p2j−1) is true, so p2j−1 will be eliminated.

Hence it follows that the number of recursive calls is O(n + h), ,since at each call, either with an output vertex or it leads to elimination of at least one point.

Let N represent the set of slopes(p2j−1p2j), j = 1, 2, . . .⌊ⁿ₂⌋. Let k be the rank of the slope(p_2i−1p_2i), selected uniformly at random from N. Let n_l and n_r be the sizes of the subproblems determined by the extreme point supporting the line with slope(p2i−1, p2i). We can show that

Observation 6.3 max(n_l, n_r)≤ n − min(⌊ⁿ₂⌋ − k, k).

Without loss of generality, let us bound the size of the right sub-problem. There are

⌊ⁿ₂⌋ − k pairs with slopes greater than or equal to slope(p2i−1p2i). At most one point out of every such pair can be an output point to the right of p_m.

If we choose the median slope, i.e., k = ⁿ₄, then nl, nr ≤ ³₄n. Let h be the number of extreme points of the convex hull and hl(hr) be the extreme points of the left (right) subproblem. We can write the following recurrence for the running time.

T (n, h)≤ T (nl, hl) + T (nr, hr) + O(n) where nl+ nr ≤ n, h^l+ hr ≤ h − 1.

Exercise 6.8 Show that T (n, h) is O(n log h).

Therefore this achieves the right balance between Jarvis march and Graham scan as it scales with the output size at least as well as Jarvis march and is O(n log n) in the worst case.

6.5.2 Expected running time

^∗

Let T (n, h) be the expected running time of the algorithm randomized Quickhull to compute h extreme upper hull vertices of a set of n points, given the extreme points p_l and p_r. So the h points are in addition to p_l and p_r, which can be identified using

2·n comparisons initially. Let p(n^l, nr) be the probability that the algorithm recurses on two smaller size problems of sizes nl and nr containing hl and hr extreme vertices respectively. Therefore we can write

T (n, h)≤ X

∀nl,nr≥0

p(nl, nr)(T (nl, hl) + T (nr, hr)) + bn (6.5.1)

where n_l, n_r ≤ n − 1 and nl + n_r ≤ n, and hl, h_r ≤ h − 1 and hl+ h_r ≤ h and b > 0 is a constant. Here we are assuming that the extreme point pm is not pl or pr. Although, in the Quickhull algorithm, we have not explicitly used any safeguards against such a possibility, we can analyze the algorithm without any loss of efficiency.

Lemma 6.1 T (n, h)∈ O(n log h).

Proof: We will use the inductive hypothesis that for h^′ < h and for all n^′, there is a fixed constant c, such that T (n^′, h^′)≤ cn^′log h^′. For the case that pm is not pl or pr, from Eq. 6.5.1 we get

T (n, h)≤P

∀nl,nr≥0p(nl, nr)(cnllog hl+ cnrlog hr) + bn.

Since nl+ nr ≤ n and h^l, hr ≤ h − 1,

nllog hl+ nrlog hr ≤ n log(h − 1) (6.5.2) LetE denote the event that max(nl, n_r)≤ ⁷₈n and p denote the probability ofE. Note that p≥ ¹₂.

From the law of conditional expectation, we have

T (n, h)≤ p · [T (nl, h_l|E) + T (nr, h_r)|E] + (1 − p) · [T (nl, h_l+| ¯E) + T (nr, h_r| ¯E)] + bn where ¯E represents the complement of E.

When max(nl, nr)≤ ⁷₈n, and hl≥ hr, nllog hl+ nrlog hr ≤ 7

8n log hl+1

8n log hr (6.5.3) The right hand side of 6.5.3 is maximized when h_l = ⁷₈(h− 1) and hr = ¹₈(h− 1).

Therefore,

nllog hl+ nrlog hr ≤ n log(h − 1) − tn

where t = log 8− ⁷₈log 7 ≥ 0.55. We get the same bounds when max(n^l, nr) ≤ ⁷₈n and hr ≥ hl. Therefore

T (n, h) ≤ p(cn log(h − 1) − tcn) + (1 − p)cn log(h − 1) + bn

= pcn log(h− 1) − ptcn + (1 − p)cn log(h − 1) + bn

≤ cn log h − ptcn + bn

Therefore from induction, T (n, h)≤ cn log h for c ≥ _tp^b.

In case pm is an extreme point (say pl), then we cannot apply Eq. 6.5.1 directly, but some points will still be eliminated according to Observation 6.3. This can happen a number of times, say r≥ 1, at which point, Eq. 6.5.1 can be applied. We will show that this is actually a better situation, that is, the expected time bound will be less and hence the previous case dominates the solution of the recurrence.

The rank k of slope(p_2i−1p2i) is uniformly distributed in [1,ⁿ₂], so the number of points eliminated is also uniformly distributed in the range [1,ⁿ₂] from Observation 6.3. (We are ignoring the floor in ⁿ₂ to avoid special cases for odd values of n - the same bounds can be derived even without this simplification). Let n1, n2. . . nrbe the r random variables that represent the sizes of subproblems in the r consecutive times that pm is an extreme point. It can be verified by induction, that E[Pr

i=1ni] ≤ 4n and E[nr] ≤ (3/4)^rn where E[·] represents the expectation of a random variable.

Note that Pr

i=1b· nⁱ is the expected work done in the r divide steps. Since cn log h≥ 4nb + c(3/4)^r· n log h for r ≥ 1 (and log h ≥ 4), the previous case dominates. ✷

In document Lecture Notes for Algorithm Analysis and Design (Page 65-69)