• No results found

The Cascade Click Model (CCM) assumes that a user scans the ranked retrieved document list from top to bottom until she/he finds one relevant document and click it (Craswell et al., 2008). However, the user may be required to browse more than one document from the retrieved list until she/he is satisfied. Thus, (Guo et al., 2009) proposed a more sophisticated click model which is an extension for CCM. This model is called Dependent Click Model (DCM). The DCM is CCM with multiple clicks until the user is satisfied with the browsed relevant clicked documents from the ranked retrieved list. The DCM considers the position of the clicked documents (instances), the average of user clicks and the document examination parameters in its calculation. Assumeµi is the position- dependent parameter that is used to reflect the probability that the user would like to see more documents after a click at position i. If the user does not click the document, the next document will be examined with probability one. The parameters µ are a set of shared behaviour parameters for a user over multiple query sessions. The Click and the Examination probabilities in DCM can be demonstrated in the following recursive process

(1≤i≤M, where M is the number of documents in the search result):

Ed1,1 = 1,

Cdi,i =Edi,i ·rdi,

Edi+1,i+1 =µi ·Cdi,i + (Edi,i −Cdi,i). (8.2.1)

where Edi,i is the probability of examination of document di, while Cdi,i is the click

probability for documentdi,rdi is the relevance degree of documentdiandµiis the user

behaviour parameter for browsing (examining) documenti.

The DCM model can be explained as follows. The click probability of the document in the search result is determined by its relevance, after the user examination. Then, after clicking on a document, the probability for next document examination is determined byµi. From equation8.2.1, the examination and click formula for documentdi,i are as follows: Edi,i = i−1 Y j=1 (1−rdj+µj ·rdj), Cdi,i =rdi · i−1 Y j=1 (1−rdj+µj ·rdj). (8.2.2)

If the actual click event use{C1, C2, ..., CM}in query session, whileM is the number of documents in the search result (ranked retrieved list) and therdi is the document rel-

evance of documentdi. The log-likelihood for DCM clicks in each query session before positionlis given by (Guo et al.,2009):

LDCM =

l−1 X

i=1

Ci∗(log(rdi) +log(µi)) + (1−Ci)∗log(1−rdi)

+Cl∗log(rdl) + (1 +Cl)∗log(1−rdl) +log 1−µl+µl∗ n Y j=l+1 (1−rdj) ,(8.2.3)

LDCM ≥ l X i=1 Ci∗log(rdi) + (1 +Ci)∗log(1−rdi) + l−1 X i=1 Ci∗log(µi) +log(1−µl) (8.2.4)

If the query session does not receive any clicks, then LDCM is a particular case with

l=M, Cl =µl = 0. A learning approach is carried for DCM learning by maximising the lower bound ofLDCM in equation8.2.4. The user behaviour parameter is estimated by:

µi = 1−

N o. of query sessions when last clicked position is i

N o. of query sessions when position i is clicked (8.2.5)

For1≤i≤M −1, the equation8.2.5gives the empirical probability of positionibeing a non-last-clicked position for all query sessions in the training set. Finally, the sampling procedure of DCM model for examination variablesEi and click variablesCi, while the users examine the search result list one-by-one from the top position until the end is given by: 1: E1 ←1; 2: ifEi == 0,then 3: Ci ←0, 4: Ei+1 ←0, 5: else 6: Ci ≈Bernoulli(rdi), 7: Ei+1 ≈Bernoulli(1−Ci+µi∗Ci) 8: end if

where Bernoulli() is the Bernoulli probabilistic distribution (Teugels, 1990). For the most state-of-the-art click models, (Chuklin et al.,2015) surveyed their details.

Table 8.1: Average of the Training Data Results of Nine LETOR Datasets

Average Training Data Results of the Datasets

Evaluation Metric Approach MQ2008 MQ2007 HP2004 TD2004 NP2004 HP2003 TD2003 NP2003 Ohsumed MAP ES-Click 0.49 0.467 0.822 0.247 0.23 0.817 0.336 0.729 0.457 DCM Model 0.309 0.329 0.112 0.027 0.044 0.044 0.055 0.232 0.327 NDCG@10 ES-Click 0.512 0.442 0.833 0.357 0.366 0.823 0.402 0.773 0.474 DCM Model 0.34 0.266 0.122 0.0313 0.057 0.048 0.062 0.251 0.252 P@10 ES-Click 0.277 0.382 0.101 0.284 0.277 0.111 0.201 0.095 0.537 DCM Model 0.211 0.266 0.019 0.028 0.048 0.008 0.034 0.035 0.327 RR@10 ES-Click 0.55 0.575 0.819 0.628 0.681 0.806 0.672 0.744 0.735 DCM Model 0.345 0.39 0.108 0.075 0.12 0.042 0.094 0.226 0.594 Err@10 ES-Click 0.023 0.024 0 0 0 0 0 0.00 0.036 DCM Model 0.058 0.061 0.00 0.001 0.007 0.003 0.007 0.014 0.113

Table 8.2: Average of the Test Data Results of Nine LETOR Datasets Average Test Data Results of the datasets

Evaluation Metric Approach MQ2008 MQ2007 HP2004 TD2004 NP2004 HP2003 TD2003 NP2003 Ohsumed

MAP ES-Click 0.473 0.457 0.712 0.223 0.197 0.736 0.256 0.66 0.446 DCM Model 0.298 0.326 0.117 0.025 0.048 0.045 0.055 0.21 0.332 NDCG@10 ES-Click 0.493 0.43 0.691 0.292 0.312 0.78 0.286 0.714 0.452 DCM Model 0.325 0.262 0.131 0.033 0.059 0.049 0.079 0.233 0.262 P@10 ES-Click 0.271 0.38 0.09 0.249 0.215 0.102 0.156 0.085 0.496 DCM Model 0.207 0.268 0.021 0.029 0.045 0.009 0.05 0.034 0.343 RR@10 ES-Click 0.526 0.552 0.699 0.452 0.535 0.761 0.353 0.669 0.628 DCM model 0.329 0.381 0.114 0.082 0.124 0.046 0.167 0.206 0.623 Err@10 ES-Click 0.025 0.026 0.00 0.00 0.00 0.00 0.003 0.00 0.056 DCM Model 0.054 0.06 0.00 0.00 0.01 0.003 0.014 0.013 0.118

8.3

Implementation and Experimental Results

In this section, the summary of the results obtained from applying this approach on nine dataset benchmarks were presented. Lerot and then ES-Rank library packages (Schuth et al.,2013) were used for the implementation of this experiments. The Lerot package is an online LTR package that is used for producing ranking models for the LTR datasets. The experimental settings of ES-Rank and Lerot are the default of these packages. The number of evolving iteration is 1300 iteration in ES-Rank, while configuration setting in Lerot is the default one provided with the package. The ES-Rank takes the ranking model produced by Lerot package using DCM model as initial parent chromosome. Then, it evolves a better ranking model for each training dataset using various evaluation fitness metrics. The p-value of the null hypothesises of paired t-test is10−15 for these results.

This indicates the high degree of confidence in the result improvements for the various experimental settings between ESClick and DCM.

Figure 8.1: Illustrating the Table8.1performance for on the LETOR datasets.

From Table 8.1 and 8.2, the ES-Rank can improve the DCM ranking model when applied on the training and test data of nine well-known LETOR dataset benchmarks. This approach used five fitness evaluation metrics to show the optimisation capability of ES-Rank. The MAP, NDCG@10, P@10, RR@10 and Err@10 are the evaluation fitness metrics used in this experiment. The highest values of each DCM and ES-Click in the MAP, NDCG@10, P@10 and RR@10 results are the best accuracy results, while the lowest value of each DCM and ES-Click on Err@10 is the best accuracy results. For ex- ample, the training ES-Click result value in MQ2008 is 0.4896, while the corresponding DCM value on the same training set (MQ2008) is 0.3094. Thus, ES-Click outperforms DCM model on MQ2008. Similarly, the other datasets have similar accuracy on the rest of benchmarks on the training and test data values. Thus, ES-Click in general outperformed DCM model. The LETOR dataset benchmarks used on this experiment are MQ2008, MQ2007, HP2004, TD2004, NP2004, HP2003, TD2003, NP2003 and Ohsumed bench- marks. The details of these results on the five folds in the nine benchmarks are shown on the AppendixCin TablesC.1toC.18.

Figures8.1 and8.2 illustrates the bar chart for each fitness evaluation metric results reported in the Tables 8.1 and 8.2. In the figures, higher values correspond to better performance, while in the last figure lower values correspond to better performance. Figure 8.1 represents the performance on the training sets, while Figure 8.2 represents the predictive performance on testing sets. From these figures it can be observed that the ES-Click technique exhibits the overall best performance among all techniques.

Figure 8.2: Illustrating the Table8.2performance for on the LETOR datasets.

8.4

Chapter Summary and Conclusion

To sum up, the ES-Rank can improve the performance of DCM ranking model using five evaluation fitness metrics. This approach is called ES-Click (Evolutionary Strategy De- pendent Click Model). It started with learning DCM ranking model from the benchmarks and then the ES-Rank used the DCM ranking models to evolve better ranking models. In general, the ES-Click ranking model outperformed the DCM model in all cases. The Lerot and ES-Rank packages are used in this chapter for producing DCM ranking model and improving it using ES-Rank. Nine LETOR datasets are used in this experiment which show the powerful of ES-Rank to optimise a DCM model. Thus, Evolutionary Strategy can improve the online LTR models as it showed its capability to improve the offline LTR models.

Conclusions and Future Work

9.1

Introduction

This chapter provides the answers to the research questions presented in Chapter1. It also presents the conclusion and the summary of contributions of the whole thesis. Moreover, it discusses the possible extensions for the future related research work in the field.