The staircase graph and the worst-case example

4.4 Application: Stochastic Voronoi Diagram in R 1

4.4.1 The staircase graph and the worst-case example

In this section, we first show a useful tool called the Staircase Graph to best illustrate the stochastic Voronoi Diagram.

68 We denote the m + 1 cells created by the m = n₂

pairwise bisectors by c0, c1, c2,

. . . , cm from left to right. As observed above, L(xi, q) will be the same for any q in

a fixed cell. Therefore, with a slight abuse of notation, we define L(xi, cj) to be the

likelihood of xito be generator of cell cj, where 1≤ i ≤ n and 0 ≤ j ≤ m. Now, for each

site xi, the m + 1 likelihood values, L(xi, c0), . . . , L(xi, cm), form a stair-series. There

are n sites, which in total form n different stair-series in the plane, and it is obvious that argmax L(xi, cj), 1≤ i ≤ n (the site corresponding to the topmost curve) will be

the most likely generator for cell cj. We call this collection of stair-series a Staircase

Graph. The upper envelope of the staircase graph corresponds to the stochastic Voronoi Diagram.

A B C D

c0 c1 c2 c3 c4 c5 c6

xAB xAC xBC xAD xBD xCD

Figure 4.6: An example for illustrating the staircase graph

As a simple example, in Figure 4.6, there are four points, A, B, C and D, on the x-axis from left to right. Let us assume their existence probabilities are pA = 0.6,

pB= 0.8, pC = 0.3, and pD = 0.7. All six mid-points are marked by vertical bars, and

all seven cells, c0, . . . , c6 are also marked. For any given query point q ∈ c0, A is the

nearest site with respect to it. The second nearest site is B, and the third and the fourth will be C and D, respectively. We can record this relative order for A, B, C and D in c0 by a four-element sequence (A, B, C, D). As soon as q crosses the mid-point xAB,

the four-element sequence changes to (B, A, C, D), i.e., the order between A and B is swapped. This pattern holds true when q crosses any mid-point, say xαβ, i.e., before

crossing, α and β must be consecutive elements in the sequence, and after crossing, their order will be swapped. Based on this important observation, we can compute the sequences for each of the ci, and consequently the likelihood for each site for each ci

can be calculated efficiently based on these sequences. Figure 4.7a shows, for each cell, the corresponding 4-element sequence and the likelihood for each site to be the nearest site in that interval. We also plot the four likelihood series (the last four columns of the table) to get the staircase graph shown in Figure 4.7b. From this figure, we know A is the most likely nearest site for cell c0, B is the most likely nearest site for cells c1, c2, c3,

and c4, and D is the most likely nearest site for cells c5 and c6. Note, in this case, site

C is not a generator for any cell, not even of the cell c3 which contains it! One

Cell Sequence L(A,_{·) L(B, ·) L(C, ·) L(D, ·)} c0 (A, B, C, D) 0.6000 0.3200 0.0240 0.0392 c1 (B, A, C, D) 0.1200 0.8000 0.0240 0.0392 c2 (B, C, A, D) 0.0840 0.8000 0.0600 0.0392 c3 (C, B, A, D) 0.0840 0.5600 0.3000 0.0392 c4 (C, B, D, A) 0.0252 0.5600 0.3000 0.0980 c5 (C, D, B, A) 0.0252 0.1680 0.3000 0.4900 c6 (D, C, B, A) 0.0252 0.1680 0.0900 0.7000

(a) 4-element sequences for all the cells and the likelihood values for all the sites 0 1 2 3 4 5 6 7 8 0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1 A B C D

(b) The staircase graph. Note that the range of x-axis varies from 1 to 7, which corresponds to c0,

c1, . . . ,c6. (Note that some vertical segments from

different stair-series are overlapping.)

Figure 4.7: The statistics table and the corresponding staircase graph

may observe that each stair-series in the staircase graph is a unimodal function. So the question becomes whether it is possible for these n stair-series to interlace one another in order to make the upper envelope sufficiently complicated (i.e., of size Θ(n2_{)). The}

answer is yes, and in the rest of this subsection, we give a concrete example.

A worst-case example: We generate the position and the existence probability of the i-th point as follows, where we consider n > 3 points, and > 0 is a sufficiently

70 small real number, say < n−4_.

pi=      1 if i = n, pi+1 1 + pi+1− if i < n. xi =    0 if i = 1, xi−1+ 10−i+1 if i > 1.

In general, the pattern of the xi sequence is 0, 0.1, 0.11, 0.111, 0.1111 . . . By such con-

struction, intuitively, if we focus on points x1 and x2, x3, . . . , xn, points x2, x3, . . . , xn

will cluster together, and x1 will be far away from them, i.e., the midpoint of x1 and

any xi (i > 1) must lie in the interval (0, 0.1). Indeed, this property recursively ap-

plies to any suffix of the input, i.e., xi, xi+1, . . . , xn. In other words, if we focus on

xi, xi+1, . . . , xn, points xi+1, xi+2, . . . , xn will cluster, and point xi will be far away

from them. Consequently, the midpoint of xi and any xj (j > i) must lie in the

interval (xi, xi+1), and thus all the midpoints from left to right will be ordered as

x12, x13, . . . , x1n, x23, x24, . . . , x2n, . . . , xn−1n, where xij denotes the midpoint of xi and

xj. (Please refer to Figure 4.8 for a four-point example.)

x1 x12, x13, x14 x2, x3, x4

x2 x23, x24 x3, x4

x3 x34 x4

Figure 4.8: A recursive view of the worst case example where n = 4

Based on the special positions of the midpoints, one may observe that the n-element sequence in each cell from left to right varies exactly as the behavior of bubble sort. Formally, at the beginning of each “bubble” stage, the leftmost element xi will be

selected to move to position n+1_{−i in the sequence by swapping with its right neighbor.} We call each xi the moving element of each stage, and we have the following important

lemma.

Lemma 4.4. The winner (the one with the largest likelihood) in any cell is the left neighbor of xi in the corresponding n-element sequence, where xi is assumed to be the

moving element. (Note that, if xi is the first element of the sequence, then the winner

will be the last element. This situation happens only once at the very beginning of the entire process when the sequence is (x1, x2, . . . , xn), and x1 is the moving element.)

(See Section 4.6.5 for a proof.)

Table 4.2 also shows a 4-point example.

c0 : 1 2 3 4 c1 : 2 1 3 4 c2 : 2 3 1 4 c3 : 2 3 4 1 c4 : 3 2 4 1 c5 : 3 4 2 1 c6 : 4 3 2 1

Table 4.2: A four-point example where each line corresponds to an n-element sequence of the corresponding cell. There are 7 cells in total due to 4₂

midpoints. The moving element of each cell is in the box, and the winner is marked by underscore.

One can easily verify that, by the above construction, the most likely nearest site changes after crossing every midpoint, and thus the upper envelope of the staircase graph corresponding to the above dataset has size n₂

= Θ(n2_{). In other words, the}

O(n2_{) space bound for the 1D stochastic Voronoi Diagram is tight, as shown in this}

example.

To illustrate the worst case, we generate a dataset using the definition above for n = 8. Figure 4.9a lists the detailed input data where we choose = 0.01, and Figure 4.9b shows the corresponding staircase graph, in a pattern that has changes which scale as Θ(n2_{) if we were to increase n. We can even increase the complexity of the upper}

envelope by reducing the value of , and it turns out when gets sufficiently small, the upper envelope can have (n_{− 1) + (n − 2) + · · · + 1 =} n₂

changes, which illustrates the fact that the generators of every consecutive pair of cells may be different in some stochastic Voronoi Diagram. Figure 4.9c indicates the case when we set = 0.0001.

1 2 3 4 5 6 7 8

x’s -1.0000 0 2.0000 2.1000 2.3000 2.3100 2.3300 2.3310 p’s 0.0921 0.1137 0.1412 0.1782 0.2318 0.3189 0.4900 1.0000

(a) Position and existence probability for each point

0 5 10 15 20 25 30

10−2

10−1 100

(b) The staircase graph when = 0.01.

0 5 10 15 20 25 30

10−2

10−1 100

Figure 4.9: Worst case data set. Note that the y-axis is in log scale.

In document Efficient geometric algorithms for preference top-k queries, stochastic line arrangements, and proximity problems (Page 81-86)