In this section, we study the effect of the parameters of PointRank presented in Section 3.2. We generated a total of 200 rankings: 100 total rankings and 100 partial rankings (where half of the pairs are incomparable). For each parameter that we vary, we fix the other parameters to their default values.
For this set of experiments, the default values of ina, pt, pvalue, minni, and maxni are 5, 0.6, 0.2, 2, and 15, respectively. Worker reliability is set to 0.7.
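The one-parameter-at-a-time protocol described above can be sketched as follows. This is a minimal illustration, not the authors' experiment harness: the `Params` class and the `one_at_a_time` helper are hypothetical names, and the swept values are illustrative. Only the default values are taken from the text.

```python
from dataclasses import dataclass, replace
from typing import Iterable, Iterator


@dataclass(frozen=True)
class Params:
    """Hypothetical container for the defaults listed in the text."""
    ina: int = 5               # initial number of assignments
    pt: float = 0.6            # probability threshold
    pvalue: float = 0.2        # p-value
    minni: int = 2             # minimum number of iterations
    maxni: int = 15            # maximum number of iterations
    reliability: float = 0.7   # worker reliability


def one_at_a_time(name: str, values: Iterable) -> Iterator[Params]:
    """Yield one configuration per value of `name`; all other fields keep their defaults."""
    defaults = Params()
    for value in values:
        yield replace(defaults, **{name: value})


# Illustrative sweep over the probability threshold only.
configs = list(one_at_a_time("pt", [0.5, 0.6, 0.7]))
```

Each configuration would then be run on both the total and the partial rankings, so that any change in the results can be attributed to the single parameter being varied.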
Initial Number of Assignments
In this set of experiments, we analyze the effect of changing ina, the initial number of assignments.
As can be seen from Figure A.6, the Kendall tau distance decreases as the initial number of assignments increases. When the algorithm creates fewer assignments in the initial iterations, the chance of reaching an early, and possibly incorrect, consensus is quite high. In addition, the distance values differ between total and partial rankings. Our method uses the transitivity rule to infer new pairwise relevances from the questions already asked, and this affects the performance for partial rankings: because of the early consensus described above and the transitivity rule, the algorithm may decide that two objects are comparable even though they are incomparable. As the figure shows, the gap between partial and total rankings narrows as the initial number of assignments increases.
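To give a feel for why few initial assignments make a wrong early consensus likely, the sketch below computes the probability that a strict majority of independent workers is wrong. Simple majority voting is used here only as a stand-in: it is an assumption of this example and not PointRank's actual consensus test. The function name `majority_error` is hypothetical.

```python
from math import comb


def majority_error(n: int, r: float) -> float:
    """Probability that a strict majority of n independent workers is wrong,
    when each worker answers correctly with probability r.
    Ties (possible for even n) are not counted as errors."""
    # The majority is wrong when the number of correct answers k is below n / 2.
    return sum(
        comb(n, k) * r**k * (1 - r) ** (n - k)
        for k in range(n + 1)
        if 2 * k < n
    )


few = majority_error(5, 0.7)    # default ina and worker reliability
many = majority_error(25, 0.7)  # larger initial batch
```

Under these assumptions the error probability is roughly 0.16 for five workers and smaller for 25 workers, which is consistent with the downward trend of the distance reported above.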
Paper A.
Fig. A.6: Effect of Initial Number of Assignments (series: KTD (Partial), KTD (Total), NA (Partial), NA (Total); KTD = Kendall tau distance, NA = number of assignments)
As expected, Figure A.6 shows that the total number of assignments increases when the initial number of assignments increases. There is a gap between the assignment counts of total rankings and partial rankings: when the ground truth is a total ranking, the algorithm uses transitivity to decrease the number of questions, which in turn results in fewer assignments.
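The saving from transitivity can be illustrated with a small sketch. It assumes that decided relevances are stored as ordered pairs `(a, b)` meaning "a is ranked before b"; this representation and the function names are hypothetical and are not taken from the paper.

```python
from __future__ import annotations

from itertools import combinations


def transitive_closure(prefs: set[tuple[str, str]]) -> set[tuple[str, str]]:
    """Return prefs plus every pair (a, d) implied by (a, b) and (b, d)."""
    closure = set(prefs)
    changed = True
    while changed:
        changed = False
        for a, b in list(closure):
            for c, d in list(closure):
                if b == c and (a, d) not in closure:
                    closure.add((a, d))
                    changed = True
    return closure


def open_questions(objects: list[str], prefs: set[tuple[str, str]]) -> list[tuple[str, str]]:
    """Pairs that still have to be asked after transitive inference."""
    known = transitive_closure(prefs)
    return [
        (x, y)
        for x, y in combinations(objects, 2)
        if (x, y) not in known and (y, x) not in known
    ]


decided = {("a", "b"), ("b", "c")}
remaining = open_questions(["a", "b", "c", "d"], decided)
```

Here the pair (a, c) is inferred and never posted to the crowd. In a total ranking every pair lies on some chain, so such inferences are frequent; with many incomparable pairs there are fewer chains and therefore fewer questions that can be skipped.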
Figure A.6 also demonstrates that our algorithm produces a pairwise relevance relation that is close to the ground truth with a reasonable number of assignments for both total and partial rankings. To illustrate, for ina = 25, the total number of assignments is between 2000 and 3000, which corresponds to 45 to 65 assignments for each pairwise relevance question.
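As a reminder of what "close to the ground truth" measures, the sketch below computes a normalized Kendall tau distance between two pairwise relations. The exact definition used for partial rankings is not restated in this section, so the variant here is an assumption: each unordered pair has one of three states (first before second, second before first, incomparable), and the distance is the fraction of pairs whose state differs. All names are hypothetical.

```python
from itertools import combinations


def status(rel, x, y):
    """State of the pair {x, y} in rel: 1 if x before y, -1 if y before x, 0 if incomparable."""
    if (x, y) in rel:
        return 1
    if (y, x) in rel:
        return -1
    return 0


def kendall_tau_distance(objects, rel_a, rel_b):
    """Fraction of unordered pairs on which the two relations disagree."""
    pairs = list(combinations(objects, 2))
    if not pairs:
        return 0.0  # fewer than two objects: nothing to compare
    disagreements = sum(status(rel_a, x, y) != status(rel_b, x, y) for x, y in pairs)
    return disagreements / len(pairs)


objects = ["a", "b", "c"]
truth = {("a", "b")}                # c is incomparable to both a and b
result = {("a", "b"), ("a", "c")}   # the pair (a, c) was wrongly made comparable
distance = kendall_tau_distance(objects, truth, result)
```

In this example one of the three pairs differs, which is exactly the failure mode described for partial rankings: an incomparable pair that the algorithm declares comparable.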
The experiments also show that unless the initial number of assignments is set to a very low value, the algorithm does not produce any inconsistencies (not shown in the figure). Even for the lowest value of this parameter, the average number of inconsistencies per execution is below 0.2.
Probability Threshold
In this set of experiments, we analyze the effect of changing pt, the probability threshold.
Figure A.7 shows that the Kendall tau distance increases as the probability threshold increases since the algorithm cannot decide on the relevances if the probability threshold is too high.
Figure A.7 also shows that the probability threshold parameter does not significantly influence the total number of assignments. This is as expected since the algorithm stops creating new assignments for a pairwise relevance question when consensus is reached, and the probability threshold is not used in this stopping criterion. Nevertheless, the number of assignments for total rankings slightly increases when the probability threshold increases. This is because the transitivity property can be used less effectively when, due to a high probability threshold, fewer relevances are set based on the answers from the crowd.
4. Experimental Evaluation
Fig. A.7: Effect of Probability Threshold (x-axis: Probability Threshold, 0.5 to 0.95; y-axes: Kendall Tau Distance, 0 to 1.2, and Number of Assignments, 0 to 600; series: KTD (Partial), KTD (Total), NA (Partial), NA (Total))
If the probability threshold is greater than or equal to 0.6, the algorithm does not produce any inconsistencies (not shown in the figure).
P-value
In this set of experiments, we analyze the effect of changing the pvalue parameter.
Figure A.8 shows that the p-value has little effect on the Kendall tau distance. However, the distance decreases slightly as the p-value increases, since increasing the p-value decreases the chance of an early consensus, which slightly improves the quality of the result. For the same reason, when the p-value increases, the total number of assignments also increases. For high p-values, the algorithm has to create a large number of assignments to achieve consensus. The graphs also show that the difference between the number of assignments for total rankings and partial rankings is negligible.
Finally, the experimental results show that the choice of p-value does not have a noticeable influence on the number of inconsistencies.
Fig. A.8: Effect of P-value (series: KTD (Partial), KTD (Total), NA (Partial), NA (Total))
Minimum Number of Iterations
In this set of experiments, we analyze the effect of changing minni, the minimum number of iterations.
Fig. A.9: Effect of Minimum Number of Iterations (series: KTD (Partial), KTD (Total), NA (Partial), NA (Total))
Figure A.9 shows that when the minimum number of iterations increases, the Kendall tau distance decreases. This is because a high minimum number of iterations prevents the algorithm from deciding on relevances incorrectly due to a premature consensus. Also, as expected, when the minimum number of iterations increases, the total number of assignments increases as well.
The experiments also show that if the minimum number of iterations is set to 3 or more, there are no inconsistencies, as the algorithm effectively avoids early consensuses leading to incorrect relevances and, thus, inconsistencies.
Maximum Number of Iterations
Our experiments show that changing the maximum number of iterations does not have a significant effect on the Kendall tau distance, the total number of assignments, or the number of inconsistencies. As the worker reliability is set to a relatively high value of 0.7, the algorithm usually reaches a consensus before reaching the maximum number of iterations.