
The proposed framework takes two ranking functions f1 and f2, a set of PoIs, and a set of M queries, and determines which of the ranking functions produces better rankings for the given set of queries. As in the PointRank algorithm, our framework employs pairwise relevance questions. The complete framework is shown in Figure 8. It is composed of M local evaluations, each of which corresponds to an input query, and a global evaluation that decides on the better ranking function based on the results of the local evaluations.

A local evaluation consists of three elements. The first element is the ranking process. It computes the top-k rankings l1 and l2 for the input query from the input set of PoIs by applying the functions f1 and f2, respectively.
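The ranking process can be sketched as follows. This is a minimal illustration, not the implementation from the paper: the helper names `top_k` and `make_ranker`, the toy PoIs, and the weighted-sum scoring form are all assumptions made for the example, and the toy scorer ignores the query text.

```python
import heapq

def top_k(pois, query, score, k):
    """Return the k PoIs with the highest score for the query, best first."""
    return heapq.nlargest(k, pois, key=lambda p: score(p, query))

def make_ranker(alpha):
    # Assumed scoring form: weighted sum of text relevance and spatial proximity.
    def score(poi, query):
        _, dist, text = poi
        return alpha * text + (1 - alpha) * (1.0 / (1.0 + dist))
    return score

# Hypothetical PoIs: (name, distance in km, text relevance in [0, 1]).
pois = [("cafe", 0.2, 0.9), ("bar", 0.1, 0.4),
        ("diner", 1.5, 0.8), ("pub", 0.7, 0.6)]

f1, f2 = make_ranker(0.9), make_ranker(0.1)  # two weighting parameters
l1 = [p[0] for p in top_k(pois, "coffee", f1, k=3)]
l2 = [p[0] for p in top_k(pois, "coffee", f2, k=3)]
```

With these toy values, f1 favors text relevance and f2 favors proximity, so the two top-k lists differ both in membership and in order.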

The second element is the matrix-based question model. Its main task is to decrease the number of pairwise relevance questions and to determine the most important questions in order to comply with the budget constraint. We employ a learn-to-rank method to form a top-k ranking l12 that covers the important features of both rankings. This element requires a set of training queries. We first form two rankings, corresponding to f1 and f2, for each training query. Then, a score for each object included in these lists is computed with respect to its rank in both lists. The objects together with their scores form the training data. After training a ranking function, we use it to generate scores for the PoIs included in l1 ∪ l2, and the first k elements of this set by score constitute l12. Before generating the questions, the matrix-based question model eliminates the questions on whose answer both l1 and l2 agree. We utilize an entropy definition to decide whether we should ask a question to the crowdsourcing workers. The entropy of an object is defined with respect to the representativeness of the keywords contained in its documents. If the entropy difference between a pair of objects is less than a threshold, a question regarding this pair is added to the set of generated questions. The algorithm that generates questions stops when the size of this set reaches the given budget constraint. The algorithm first checks the pairs of objects included in l12. If budget remains, lc = l1 ∩ l2 is considered. If there is still room for more questions, we check the remaining objects in the set lr = (l1 ∪ l2) \ lc.
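The question-generation loop described above can be sketched as follows. This is a simplified illustration under stated assumptions: the entropy values are supplied as a precomputed dictionary, a pair counts as "agreed" only when both rankings contain both objects and order them identically, and the function names are hypothetical.

```python
from itertools import combinations

def agree(a, b, l1, l2):
    """True if both rankings contain a and b and order them the same way."""
    if not all(x in l for l in (l1, l2) for x in (a, b)):
        return False
    return (l1.index(a) < l1.index(b)) == (l2.index(a) < l2.index(b))

def generate_questions(l1, l2, l12, entropy, threshold, budget):
    lc = [p for p in l1 if p in l2]            # l1 ∩ l2
    lr = [p for p in l1 + l2 if p not in lc]   # (l1 ∪ l2) \ lc
    questions, seen = [], set()
    for pool in (l12, lc, lr):                 # priority order from the text
        for a, b in combinations(pool, 2):
            if len(questions) >= budget:       # budget exhausted
                return questions
            key = frozenset((a, b))
            if key in seen or agree(a, b, l1, l2):
                continue
            seen.add(key)
            # Ask only when the entropy difference is below the threshold.
            if abs(entropy[a] - entropy[b]) < threshold:
                questions.append((a, b))
    return questions

# Hypothetical toy input.
l1, l2, l12 = ["a", "b", "c"], ["b", "a", "d"], ["a", "b", "c"]
entropy = {"a": 0.5, "b": 0.6, "c": 0.9, "d": 0.55}
qs = generate_questions(l1, l2, l12, entropy, threshold=0.2, budget=2)
```

Here only the pair (a, b) survives: the two rankings disagree on its order and its entropy difference falls below the threshold.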

Fig. 8: System Framework [13]

Crowdsourcing-based evaluation is the third element of the local evaluation. This element first collects answers for the generated questions. We utilize three different methods to determine the answer to a pairwise relevance question. The first method is majority voting: the answer given by the majority of the workers is selected as the answer to the question. The second method is voting based on constant confidence, in which the confidence value of a worker is the same for all questions. The third method is voting based on dynamic confidence, in which a worker can specify a confidence value for each specific question. We store the answers in an n×n matrix, where n is the number of objects. However, due to the budget constraint, the matrix is only partially filled after the answers are collected from the crowdsourcing platform. We therefore apply non-negative matrix factorization to fill in the missing values of the matrix. The last step of this element is forming the final ranking for the input query. We utilize Borda count [5] to form the ranking.
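The three voting methods can be sketched with a single weighted-vote helper. This sketch assumes that confidence values act as vote weights, which the text does not state explicitly; the answer format, the function names, and the tie-breaking rule (first choice encountered wins) are likewise assumptions.

```python
from collections import defaultdict

def weighted_vote(answers, weight):
    """Return the choice with the largest total weight."""
    totals = defaultdict(float)
    for ans in answers:
        totals[ans["choice"]] += weight(ans)
    return max(totals, key=totals.get)

def majority_vote(answers):
    return weighted_vote(answers, lambda a: 1.0)

def constant_confidence_vote(answers, worker_conf):
    # One confidence value per worker, reused for every question.
    return weighted_vote(answers, lambda a: worker_conf[a["worker"]])

def dynamic_confidence_vote(answers):
    # Confidence value supplied by the worker for this specific question.
    return weighted_vote(answers, lambda a: a["confidence"])

# Hypothetical answers to the question "which of p1, p2 is more relevant?"
answers = [
    {"worker": "w1", "choice": "p1", "confidence": 0.9},
    {"worker": "w2", "choice": "p2", "confidence": 0.3},
    {"worker": "w3", "choice": "p2", "confidence": 0.4},
]
```

In this toy input, majority voting selects p2, whereas dynamic confidence selects p1 because the single confident worker outweighs the two hesitant ones.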

Example 4.1 (Ranking with Borda-Counts)

Let O = {p1, p2, p3, p4, p5} and let the pairwise relevances be {p1 ≺ p2, p1 ≺ p3, p1 ≺ p4, p1 ≺ p5, p2 ≺ p4, p3 ≺ p2, p3 ≺ p4, p3 ≺ p5, p5 ≺ p2, p5 ≺ p4}. The Borda counts for the PoIs are as follows: g(p1) = 0, g(p2) = 3, g(p3) = 1, g(p4) = 4, g(p5) = 2. So the final ranking is ⟨p4, p2, p5, p3, p1⟩.
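The computation in Example 4.1 can be reproduced with a few lines of code. In this sketch each pair (lo, hi) encodes lo ≺ hi, and the Borda count of an object is taken to be the number of pairwise comparisons it wins; the function name is an assumption.

```python
def borda_ranking(objects, prefs):
    """prefs holds pairs (lo, hi) meaning lo ≺ hi, i.e. hi is more relevant."""
    score = {o: 0 for o in objects}
    for lo, hi in prefs:
        score[hi] += 1                      # one point per pairwise win
    ranking = sorted(objects, key=lambda o: -score[o])
    return score, ranking

objects = ["p1", "p2", "p3", "p4", "p5"]
prefs = [("p1", "p2"), ("p1", "p3"), ("p1", "p4"), ("p1", "p5"),
         ("p2", "p4"), ("p3", "p2"), ("p3", "p4"), ("p3", "p5"),
         ("p5", "p2"), ("p5", "p4")]
score, ranking = borda_ranking(objects, prefs)
# score   -> {'p1': 0, 'p2': 3, 'p3': 1, 'p4': 4, 'p5': 2}
# ranking -> ['p4', 'p2', 'p5', 'p3', 'p1']
```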

Global evaluation takes the rankings produced by the ranking functions f1 and f2 (l1 and l2) and the final ranking (lf) for each query qi. Then, the global evaluation component compares l1 and l2 with lf using the Kendall tau distance [17] for qi. If l1 has a lower Kendall tau distance to lf than l2, this is counted as a vote for f1. Otherwise, it is a vote for f2. The global evaluation component concludes that the ranking function with the majority of the votes is the better ranking function.
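The voting scheme of the global evaluation can be sketched as follows. Two simplifications are assumed here: the Kendall tau distance is computed only over objects that appear in both rankings, and an equal number of votes is reported as no winner, a case the text does not address. The function names are hypothetical.

```python
from itertools import combinations

def kendall_tau_distance(r1, r2):
    """Number of object pairs that r1 and r2 order differently."""
    pos1 = {o: i for i, o in enumerate(r1)}
    pos2 = {o: i for i, o in enumerate(r2)}
    common = [o for o in r1 if o in pos2]   # assumption: shared objects only
    return sum(1 for a, b in combinations(common, 2)
               if (pos1[a] - pos1[b]) * (pos2[a] - pos2[b]) < 0)

def global_evaluation(results):
    """results holds one (l1, l2, lf) triple per query."""
    votes_f1 = sum(1 for l1, l2, lf in results
                   if kendall_tau_distance(l1, lf) < kendall_tau_distance(l2, lf))
    votes_f2 = len(results) - votes_f1      # "otherwise" counts for f2
    if votes_f1 == votes_f2:
        return None, votes_f1, votes_f2
    return ("f1" if votes_f1 > votes_f2 else "f2"), votes_f1, votes_f2

# Hypothetical results for three queries.
results = [
    (["p4", "p2", "p5", "p3", "p1"], ["p1", "p3", "p5", "p2", "p4"],
     ["p4", "p2", "p5", "p3", "p1"]),
    (["a", "b", "c"], ["a", "c", "b"], ["a", "c", "b"]),
    (["a", "b"], ["b", "a"], ["a", "b"]),
]
winner, v1, v2 = global_evaluation(results)
```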

4.3 Discussion

In order to evaluate the proposed framework, we focus on two ranking functions. These functions (f1 and f2) are obtained by changing the weighting parameter in a ranking function widely used in the literature for spatial keyword querying. We first obtain a reference vector containing the number of votes for f1 for a set of queries using the maximum budget. We then compare the number of votes for f1 obtained by our framework under different parameter settings with this reference vector using cosine similarity.
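The comparison against the maximum-budget vector relies on standard cosine similarity, which can be sketched as follows. The vote vectors below are hypothetical values chosen for illustration, not results from the paper.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0                      # convention for zero vectors
    return dot / (norm_u * norm_v)

# Hypothetical numbers of votes for f1 over four query sets.
reference = [7, 5, 8, 6]        # obtained with the maximum budget
reduced_budget = [6, 5, 7, 4]   # obtained with a smaller budget
similarity = cosine_similarity(reference, reduced_budget)
```

A value close to 1 indicates that the reduced-budget setting reproduces the vote pattern of the maximum-budget setting.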

Figures 9a and 9b show how our framework is affected by the budget constraint and by matrix factorization. Figure 9a illustrates that our algorithm is sensitive to the budget constraint: the framework provides better results as the available budget increases. Figure 9b shows the effect of matrix factorization on our framework. The results provide clear evidence that utilizing matrix factorization yields better results. The experimental evaluation also suggests that voting based on dynamic confidence performs better than majority voting and voting based on constant confidence. This is expected, since workers might not have the same confidence for every question they answer. For instance, a worker might be more confident when answering a question about a pair of PoIs that they regularly visit than about a pair of PoIs they have only heard of.

Fig. 9: Effects of Budget and Matrix Factorization [13]. Panel (b): CS versus the number of binary questions, with MF and without MF.

5 GPS-based Ranking Synthesis

This section gives an overview of Paper D [23].

5.1 Problem Motivation and Statement

Paper A and Paper C focus on constructing rankings for spatial keyword queries by means of crowdsourcing. However, crowdsourcing-based methods have some drawbacks. First, they are quite costly, since one has to pay for each task completed by a crowdsourcing worker. Second, they require a very long time, since one has to wait until all of the tasks are completed by the crowdsourcing workers. Third, to answer the pairwise questions generated by the algorithms we propose, workers need to be knowledgeable about the geographical region that the questions cover, since we focus on spatial relevance. Moreover, even if workers are familiar with the region, they might not have information about the PoIs included in a question. For these reasons, Paper D focuses on utilizing GPS data instead of crowdsourcing.

Paper D utilizes GPS records collected from vehicles. The following definition is reproduced from [23].

Definition 0.1. A GPS record G is a four-tuple ⟨u, t, loc, im⟩, where u is the ID of a user, t is a timestamp, loc is a pair of Euclidean coordinates representing the location, and im is the vehicle ignition mode.
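The four-tuple in the definition maps naturally onto a small record type. In this sketch the field types, the timestamp unit, and the "on"/"off" encoding of the ignition mode are assumptions, and `records_by_user` is a hypothetical helper that is not taken from Paper D.

```python
from collections import defaultdict
from typing import NamedTuple, Tuple

class GPSRecord(NamedTuple):
    u: str                     # user ID
    t: float                   # timestamp (assumed: seconds since epoch)
    loc: Tuple[float, float]   # Euclidean coordinates (x, y)
    im: str                    # ignition mode (assumed encoding: "on"/"off")

def records_by_user(records):
    """Group records per user, each group ordered by timestamp."""
    grouped = defaultdict(list)
    for r in sorted(records, key=lambda r: r.t):
        grouped[r.u].append(r)
    return dict(grouped)

# Hypothetical records.
records = [
    GPSRecord("u1", 20.0, (3.0, 4.0), "off"),
    GPSRecord("u2", 5.0, (1.0, 1.0), "on"),
    GPSRecord("u1", 10.0, (0.0, 0.0), "on"),
]
traces = records_by_user(records)
```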

Paper D addresses the problem of synthesizing a ranking for a given top-k spatial keyword query using a set of GPS records SG and a set of PoIs SP.