
4.4 Result Analyses

4.4.3 Interaction across Augmented Elements

A highlight of the above analysis is the difference in click behavior for video results compared to news and image results. Multiple cases were observed where the frequency pattern of videos differed from that of images and news. Videos tended to have a low click frequency even when the position was high or the source orientation was high. On the other hand, participants tended to click video results more frequently than the other sources when the position was middle or the orientation was middle. Implications of this finding are discussed in the next section.

4.5 Discussion

The work presented in this chapter was motivated by the greater complexity of aggregated search interfaces compared to the conventional single-source design. Despite this complexity, only a limited number of studies have looked at the effects of factors such as the positions of augmented elements, source types, and the search task’s source orientation on people’s search behavior.

As mentioned above, the intention was not to decide which type of aggregation is more effective than the other. Instead, the interest was in understanding the characteristics of both designs so that we can leverage their advantages depending on the context of system use. The following discusses the main findings of this study and their implications on the design of aggregated search interfaces.

The first finding is that the factors that affect participants’ click-through behavior differ between the vertical and tiled designs. This may sound obvious, but it should not be underestimated, since it suggests that the way in which results from different sources are presented does matter. For example, participants’ click-through behavior was significantly affected by the position of augmented elements in the vertical design, echoing the findings of previous studies Joachims et al. [2005]; Agichtein and Zheng [2006]; Guan and Cutrell [2007]; Keane et al. [2008], yet not in the tiled design. This suggests that a careful estimation of the position of the augmented elements with respect to the base elements is needed when the vertical design is employed. Such information can be useful when choosing a suitable aggregation approach. For instance, when it is not feasible to estimate the position of augmented elements, a tiled-like approach might be more appropriate, since participants’ click-through behavior was not affected by position in this type of aggregation. Such a situation may arise in digital libraries and elsewhere. These results address the first research question [R1] of this study.

The second finding is that videos resulted in a different click-through pattern from news and images. This trend was common to both the vertical and the tiled designs. It suggests that, when video results are included in an aggregated page, click-through behavior different from that for other sources may be observed.

While it is not entirely clear why videos are different, we can speculate on two potential factors behind the trend. First, videos are multimodal media Halvey et al. [2009], combining text, images, and audio in a dynamic way. The dynamism and multimodality of the information source might cause a user to give a different priority to videos during a search task, which in turn may lead to a click-through pattern that differs from that of news and images. Second, the difference may be due to the type of surrogate used to represent video results, which is less informative about the contents of the video than the equivalent image or text representations. The title of a news article and the thumbnail of an image can provide a good indication of the content of the respective documents. In contrast, although basic metadata of the videos were presented, these may not have been as informative about the contents. This difficulty in previewing videos might cause the different click-through pattern Song and Marchionini [2007]. It should be noted that the task time limit of 2 minutes did not seem to discourage users from viewing video results.

Although it was not clear why the difference in click behavior across results from different sources was observed, the overall message of this finding is that, depending on the type of source aggregated on the result page, differences in users’ searching behavior may be observed. More generally, this study suggests that participants’ click-through behavior can differ across source types. These observations address the third research question [R3].

The third finding was that a search task’s orientation towards a particular source could affect participants’ click-through behavior. This trend was common to both the vertical and tiled designs. Traditional information retrieval research has focused on modeling the thematic (or topical) relevance of documents. However, research on XML document retrieval Lalmas and Tombros [2007] and geographic information retrieval (GIR) Purves and Jones [2004] has demonstrated that relevance can be multidimensional: there is a structural relevance in XML retrieval and a geographic relevance in GIR, each of which can be considered apart from thematic relevance. In a similar way, the experimental design of this study controlled the level of orientation towards a particular source (i.e., news, images, and videos). The significant effect of the source orientation observed in our experiments suggests that the task’s source orientation is an important factor to investigate in research on aggregated search. On the design level, it suggests that devising a means of capturing a searcher’s intent about the source is an important problem to tackle. These observations address the second research question [R2].

4.5.1 Limitations

The study described in this chapter has some limitations. First, the results to be displayed on the interfaces were generated a priori. Although this made the investigation fair, the implications of these results are limited to this particular set of search results. Second, to reduce the complexity and duration of the experiment, only three sources were tested: image, video, and news. Finally, results from only a single search engine were used throughout the experiments. Therefore, the findings may or may not apply to different environments. Further studies should be carried out to deepen our understanding of aggregated search interfaces. In addition, no attempt was made to ensure that the documents retrieved by the search engine were relevant to the tasks, so that the presented results remained representative of real-world search engine performance.

However, as the ratio of bookmarked documents over all clicked documents suggests, participants did perceive some of the retrieved documents as relevant in order to complete the search tasks. Furthermore, careful attention was paid to dealing with any technical problems during the experiment, and no cases were observed where participants deliberately bookmarked clearly irrelevant documents. Therefore, it can be asserted that the backend engine did retrieve relevant documents and that participants were able to find some of them, although no quality control of retrieval was performed.

4.6 Summary

In this chapter, a study testing two aggregated search interfaces was presented. Two separate experiments were carried out during the study: one where results from the different sources were blended into a single list, and another where results from each source were presented in a separate panel. A total of 1,296 search sessions performed by 48 participants were analyzed.

The study reported in this chapter led to three main findings. First, the position of search results was only significant when the results were aggregated in a vertically ranked list. Second, participants’ click-through behavior on videos was different compared to other sources. Finally, a task’s orientation towards particular sources is an important factor to capture when considering the use of an aggregated search design.

These findings provide further insight into aggregated interfaces and their result presentation issues. Results from the study provide initial guidelines for the designers of aggregated interfaces. Furthermore, the study opens up many potential research directions to be explored in the aggregated search paradigm.

To conclude, aggregated search, as already pushed forward by major search companies, is a useful paradigm. Producing an aggregated result page in response to an informational query improves information access for users and hence makes task completion quicker. Furthermore, the results suggest that the designers of aggregated search interfaces need to concentrate on different aspects depending on the aggregation style. In a vertical aggregation, an accurate estimation of the best position of augmented elements is needed, while the relevance of the source is a key element in both forms of aggregation. The outcomes of the above studies not only provide insight into some initial issues of result presentation associated with aggregated interfaces, but also suggest many future directions for research in aggregated search.

In the next part, methods to identify the source orientation behind a user’s query are described in Chapter 5. Furthermore, in Chapter 6, large-scale log data is analyzed to uncover the dynamics of users’ searching behavior when their information need is oriented towards multiple sources.

Part III

Source-Orientation in Aggregated Search

So far in this thesis, different aspects related to result presentation in an aggregated page were discussed. The effectiveness of an aggregated interface and the factors affecting click behavior in an aggregated interface were investigated. Motivated by the results from the previous studies, the work was further extended to gain an insight into users’ searching behavior with respect to information needs oriented towards different sources. In this part, the dynamics of user behavior with respect to different sources are analyzed using large-scale log data from Microsoft. First, in Chapter 5, effective methods to identify the source orientation behind a user query are presented. Once the orientations are correctly identified, they are further exploited in Chapter 6 to uncover the patterns and behavioral aspects of users for such information needs.

Chapter 5

Identifying Source-Orientation

5.1 Introduction

In the previous chapter (Chapter 4), results from the analysis of click behavior on aggregated interfaces suggested that source orientation is a key element in an aggregated interface. In this chapter, ways to identify the source orientation behind users’ queries are presented. Two methods are tested for this purpose: Rule Based and Combination (Rule Based + Machine Learning). For the first method (described in Section 5.5), a simple rule-based technique is applied to the clicked URLs of the log entries. For the second method (described in Section 5.6), a combination of rule-based and machine learning techniques is applied to the submitted query, the clicked URL, and the title of the clicked documents.

In this work, Microsoft 2006 RFP search click data were analyzed to understand the source-orientation behind the queries. More specifically, six sources were looked at, namely, image, video, map, news, blog and Wikipedia. All other sources were viewed as standard “web”, i.e. the typical web search result.

These six sources were chosen based on a survey (footnote 5-1), which showed that images, news, and videos were the three most frequently accessed sources. Map and Wikipedia were chosen because results of these types are now frequently included within the top ten result list by major search engines.
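
To make the idea of a rule-based technique on clicked URLs concrete, the following is a minimal sketch in Python. It is not the rule set used in this work (that is described in Section 5.5); the URL patterns, the `RULES` table, and the function name `classify_clicked_url` are hypothetical and chosen only for illustration. The sketch maps a clicked URL to one of the six sources, and falls back to standard "web" when no rule matches.

```python
from urllib.parse import urlparse

# Hypothetical rules: (source label, substrings searched for in host + path).
# These patterns are illustrative assumptions, not the rules of Section 5.5.
RULES = [
    ("wikipedia", ("wikipedia.org",)),
    ("image", ("images.", "/images", "flickr.com")),
    ("video", ("video.", "/video", "youtube.com")),
    ("map", ("maps.", "/maps", "mapquest.com")),
    ("news", ("news.", "/news")),
    ("blog", ("blog.", "/blog", "blogspot.com")),
]


def classify_clicked_url(url: str) -> str:
    """Return the source label of a clicked URL, or "web" if no rule fires."""
    parsed = urlparse(url.lower())
    # Match against host and path only; the query string is ignored.
    target = parsed.netloc + parsed.path
    for source, patterns in RULES:
        if any(pattern in target for pattern in patterns):
            return source  # the first matching rule wins
    return "web"


if __name__ == "__main__":
    for example in (
        "http://en.wikipedia.org/wiki/Edinburgh",
        "http://www.youtube.com/watch?v=abc",
        "http://www.example.com/about",
    ):
        print(example, "->", classify_clicked_url(example))
```

Because the first matching rule wins, the order of the entries in `RULES` acts as a priority list. Substring rules of this kind are cheap to apply to large logs but can misfire on URLs that merely contain a keyword, which is one motivation for combining rules with machine learning on the query and document title.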