
Algorithm for Multiple Query Scenario

Privacy loss across multiple queries with time interval constraints stems from the fixed pre-partitioning. Fixed pre-partitioning suits the single query scenario, or the first query, but not the subsequent queries in the multiple query scenario. To prevent this privacy loss during a subsequent query, instead of keeping the initial fixed partitioning, we first create a new area, hereafter called the selection area, by expanding the box generated at the previous query to its neighboring cells. If $n$ is the number of time periods of length $T$ elapsed between two subsequent queries, then a neighboring cell is any cell that is no more than $n$ rows or columns away from the current box. Formally,

$$n = \left\lceil \frac{t_k - t_{k-1}}{T} \right\rceil,$$

where $t_{k-1}$ and $t_k$ are the times of issuing queries $k-1$ and $k$, respectively.

Next, the box for the new query is determined by selecting a random box of size $b \times b$ from within the selection area that also includes the user's cell.
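The computation of $n$ from the two query timestamps can be sketched as follows. This is a minimal illustration rather than the authors' implementation; the function name `elapsed_periods` and its parameter names are hypothetical, and timestamps are assumed to be expressed in the same unit as $T$.

```python
import math


def elapsed_periods(t_prev: float, t_curr: float, period: float) -> int:
    """Return n = ceil((t_curr - t_prev) / period).

    t_prev and t_curr are the times of the previous and current query,
    and period is the length T of one time period (all in the same unit).
    """
    if period <= 0:
        raise ValueError("period must be positive")
    if t_curr < t_prev:
        raise ValueError("current query cannot precede the previous one")
    return math.ceil((t_curr - t_prev) / period)
```

For example, with $T = 10$, queries issued at times 0 and 11 give $n = 2$, since a partially elapsed period is rounded up.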

7.2.1 Selection area

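The selection area, $S_k$, represents all possible boxes the user could be in for query $q_k$, given as

$$S_k = \Big\{\, b \times b \text{ boxes in } \bigcup_{c \in B_{k-1}} \mathrm{nbr}(c, n) \,\Big\},$$

$$\mathrm{nbr}(c, n) = \{\, c' \mid c' \text{ can be reached from } c \text{ in } n \text{ steps} \,\}.$$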

If the user does not hit the border of the grid $G$ while moving $n$ steps from any cell in box $B_{k-1}$, then the selection area $S_k$ for the query $q_k$ forms a square of size $(b+2n) \times (b+2n)$.
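The extent of the selection area can be sketched in a few lines. This is an illustrative sketch only: it assumes 0-indexed cells, identifies a box by its top-left cell, and treats a step as a move to any of the eight surrounding cells, which is what yields a square of side $b + 2n$. The function names are hypothetical.

```python
def selection_area(box_row: int, box_col: int, b: int, n: int, Z: int):
    """Return inclusive bounds (top, left, bottom, right) of the selection area.

    (box_row, box_col) is the top-left cell of the previous b x b box and
    Z is the side length of the grid. Bounds are clipped at the grid border.
    """
    top = max(0, box_row - n)
    left = max(0, box_col - n)
    bottom = min(Z - 1, box_row + b - 1 + n)
    right = min(Z - 1, box_col + b - 1 + n)
    return top, left, bottom, right


def area_shape(bounds):
    """Return (rows, cols) of an area given its inclusive bounds."""
    top, left, bottom, right = bounds
    return bottom - top + 1, right - left + 1
```

Away from the border, `area_shape(selection_area(10, 10, 4, 2, 100))` is `(8, 8)`, i.e. $(b+2n) \times (b+2n)$. For a box touching the top border, `selection_area(0, 10, 4, 2, 100)` is clipped to 6 rows by 8 columns.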

If the border of $G$ is reached before the $n$ steps, $S_k$ can be a rectangle. However, if $Z$ is large enough, the chance of having a rectangular $S_k$ is slim. Figures 7.2a and 7.2b show $S_k$ where $n = 1$ and $n = 2$, respectively. The numbers inside each cell $c_i$ signify the number of $b \times b$ boxes inside the selection area that contain the cell, i.e., the number of possible $b \times b$ boxes from which the algorithm could choose for the next query if the user is located in $c_i$. We call this number the weight $w_i$ of a cell. For instance, in Figure 7.2b the algorithm can choose one out of sixteen possible boxes when the user is located in a cell of weight 16.

[Figure 7.2: Selection area for a box of size $4 \times 4$. (a) $n = 1$. (b) $n = 2$.]

Let the cells of $S_k$ be numbered $c_1, c_2, c_3, \ldots, c_{(b+2n)^2}$, starting at the top left corner and continuing row by row. As we advance by one cell to the right along the horizontal direction, from $c_1$ to $c_2$, the number of possible boxes that contain the cell increases by one as well. If $2n \geq b$, this increment continues until the cell $c_b$ is reached. This is illustrated in Figure 7.2b, where $b = 4$ and $n = 2$. When $2n < b$, the increment stops at the cell $c_{2n+1}$, as shown in Figure 7.2a, where $b = 4$ and $n = 1$. If $2n \geq b$, the number of possible boxes for the following cells in the first row remains $b$ until we reach the cell occurring $b$ cells before the top right corner. If $2n < b$, it instead remains at $2n + 1$, the value reached at $c_{2n+1}$. Thereafter, the number of possible boxes decreases until it becomes one again at the cell in the top right corner.

Coming to the second row, the first cell has two possible boxes: one created from the cell above and one created at the current cell. Similarly, the next cell can generate four possible boxes, two created at the cell above and two created at the current cell, and so on. The number of possible boxes for the rest of the cells can be determined by the same process. Note that the same logic applies to a traversal along the vertical direction.

A weights matrix with the number of possible boxes for each cell can be formulated as follows. Create a row vector $\bar{x}$ of the weights of the first row and a column vector $\bar{y}$ of the weights of the first column. The product $\bar{y} \times \bar{x}$ (a column vector times a row vector) gives the weights matrix for the entire $S_k$.
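The weights construction can be checked with a short script. The sketch below is illustrative and assumes a square selection area that does not touch the grid border; the closed form used for the first row, $\min(i,\, b,\, 2n+1,\, b+2n-i+1)$ for the $i$-th cell, is our restatement of the row traversal described above, and all function names are hypothetical. A brute-force count over every box position is included for comparison.

```python
def first_row_weights(b: int, n: int) -> list:
    """Weights of the first row of a (b+2n) x (b+2n) selection area."""
    side = b + 2 * n
    # Rises by one per cell, plateaus at min(b, 2n+1), then falls back to one.
    return [min(i, b, 2 * n + 1, side - i + 1) for i in range(1, side + 1)]


def weights_matrix(b: int, n: int) -> list:
    """Outer product of the first-column and first-row weight vectors."""
    x = first_row_weights(b, n)  # row vector
    y = x                        # first column equals first row by symmetry
    return [[yi * xj for xj in x] for yi in y]


def brute_force_weights(b: int, n: int) -> list:
    """Count, for every cell, how many b x b boxes in the area contain it."""
    side = b + 2 * n
    w = [[0] * side for _ in range(side)]
    for r in range(side - b + 1):          # top-left row of a candidate box
        for c in range(side - b + 1):      # top-left column of a candidate box
            for dr in range(b):
                for dc in range(b):
                    w[r + dr][c + dc] += 1
    return w
```

With $b = 4$, `first_row_weights(4, 1)` gives `[1, 2, 3, 3, 2, 1]` and `first_row_weights(4, 2)` gives `[1, 2, 3, 4, 4, 3, 2, 1]`, so the largest weight for $n = 2$ is $4 \times 4 = 16$, matching the sixteen possible boxes mentioned for Figure 7.2b.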

7.2.2 Choosing a box

Our algorithm selects the $b \times b$ box (for interest set generation) based on whether the issued query is the first one or one of the subsequent ones. Algorithm 7.1 uses pseudo functions whose objectives are discussed next.

Algorithm 7.1 Box selection for interest set generation.

Global Initialization: Time of previous query t_p = 0; Previous box B_p = null
Input: Current time t; Box size b; User cell c_u
Output: Box B

    function BoxForCurrentQuery(t, b, c_u)
        G ← Z × Z grid                             ▷ Initial grid.
        if first query then
            B ← G.FixedBox(b, c_u)                 ▷ (b × b) box from pre-partitioned grid.
        else
            n ← ⌈(t − t_p) / T⌉
            S ← G.GetSelectionAreaBoxes(B_p, n)    ▷ (b × b) boxes in selection area.
            B ← RandomSampling(S, c_u)             ▷ Random box from S that includes c_u.
        end if
        t_p ← t
        B_p ← B
        return B
    end function

For the first query, as described in Section 6.1, the grid $G$ is pre-partitioned into fixed non-overlapping boxes of size $b \times b$. The algorithm simply chooses the box that contains the user's cell.

Let us assume that the algorithm is generating the interest set for query $q_k$, and let $n$ be the maximum number of cells the user could have moved since the previous query. For the second and subsequent queries, the algorithm first calculates $n$ from the query timestamps and determines the selection area $S_k$. The box $B_k$ for the current query is selected by picking, uniformly at random, one of the $b \times b$ boxes in $S_k$ that contain the user's cell.

The algorithm uses the same techniques presented in the previous chapter for the single query scenario to efficiently generate the interest set $I_k$.
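The box selection procedure can be made concrete with a small runnable sketch. It is one possible reading of Algorithm 7.1, not the authors' code. The class and method names are hypothetical, cells are 0-indexed `(row, col)` pairs, a box is identified by its top-left cell, $Z$ is assumed to be a multiple of $b$ so that the fixed partition tiles the grid, and the candidate boxes are enumerated directly instead of first materializing every box in the selection area.

```python
import math
import random


class BoxSelector:
    """Keeps the previous query time and box between calls (global state)."""

    def __init__(self, Z, b, T, rng=None):
        self.Z, self.b, self.T = Z, b, T
        self.rng = rng or random.Random()
        self.t_prev = None
        self.box_prev = None  # top-left cell of the previous box

    def _fixed_box(self, cell):
        # Box of the fixed, non-overlapping pre-partitioning containing cell.
        r, c = cell
        return (r // self.b) * self.b, (c // self.b) * self.b

    def _candidate_boxes(self, cell, n):
        # All b x b boxes inside the selection area that contain cell.
        b, Z = self.b, self.Z
        pr, pc = self.box_prev
        top, left = max(0, pr - n), max(0, pc - n)
        bottom = min(Z - 1, pr + b - 1 + n)
        right = min(Z - 1, pc + b - 1 + n)
        r, c = cell
        if not (top <= r <= bottom and left <= c <= right):
            raise ValueError("user cell lies outside the selection area")
        rows = range(max(top, r - b + 1), min(r, bottom - b + 1) + 1)
        cols = range(max(left, c - b + 1), min(c, right - b + 1) + 1)
        return [(br, bc) for br in rows for bc in cols]

    def box_for_current_query(self, t, cell):
        if self.box_prev is None:                      # first query
            box = self._fixed_box(cell)
        else:                                          # subsequent query
            n = math.ceil((t - self.t_prev) / self.T)
            box = self.rng.choice(self._candidate_boxes(cell, n))
        self.t_prev, self.box_prev = t, box
        return box
```

With `BoxSelector(Z=100, b=4, T=10)`, a first query from cell `(10, 10)` returns the fixed box `(8, 8)`. A second query 15 time units later gives $n = 2$, and the returned box is drawn uniformly from the candidates that contain the user's new cell. The number of candidates for a cell equals its weight $w_i$.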

Since the client can cache earlier results, it only requests details for POIs that are new to this box, i.e., $I_{retrieve} = I_k - I$, where $I$ is the cache (set) of all POIs retrieved earlier and $I_{retrieve}$ is the set of POIs to be retrieved by the current query. This continues until the current user session ends. When the time interval is large enough for the user to reach the farthest cell in $G$, the algorithm starts a new session with the fixed pre-partitioning step.
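The caching step amounts to a set difference followed by a cache update. The sketch below is illustrative only: it assumes POIs are represented by hashable identifiers, and the function name `pois_to_retrieve` is hypothetical.

```python
def pois_to_retrieve(interest_set, cache):
    """Return I_retrieve = I_k - I and add the new POIs to the cache I.

    interest_set is the interest set I_k of the current query and cache is
    the mutable set I of POIs retrieved earlier in the session.
    """
    to_fetch = set(interest_set) - cache
    cache |= to_fetch  # the fetched POIs are cached for later queries
    return to_fetch
```

For example, with a cache holding `{"p1", "p2"}` and a current interest set `{"p2", "p3"}`, only `"p3"` is requested, and repeating the same query afterwards requests nothing.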

7.2.3 Obfuscation

Since the user cannot move any farther in the given number of time units ($n$), the chance of the user being in a cell outside the selection area is zero. Refer to Figure 7.2 for the cases where $n = 1$ and $n = 2$. Further, each cell in the selection area (any numbered cell in the figure) has a non-zero probability of containing the user after $n$ time units. Algorithm 7.1 randomly selects a box from this selection area when issuing query $q_k$, and because there is a non-zero probability of the user being in any cell of the box selected for the new query, obfuscation is preserved.