The research presented in this thesis was performed and subsequently pub- lished jointly with Professor Thorsten Joachims and partly jointly with Professor Robert Kleinberg at Cornell University. While Chapters 1, 2 and 3 present back- ground material, the key contributions of this thesis have been published as follows. Chapter 4, which presents and evaluates the concept of query chains was published in (Radlinski & Joachims, 2005b). Chapter 5, which presents an algorithm to combat presentation bias, was published in (Radlinski & Joachims, 2006). Chapter 6, which describes an approach to avoid evaluation bias, was published in (Radlinski & Joachims, 2007). Finally, Chapter 7, which addresses user diversity, was published in (Radlinski, Kleinberg and Joachims, 2008b).
During my degree, I have also contributed other research not presented as a chapter in this thesis. In particular, I have published research on personalized Web search (Radlinski & Dumais, 2006), identifying related documents using im- plicit feedback (Pohl, Radlinski and Joachims, 2007), machine learning algorithms
for optimizing ranking performance metrics (Yue, Finley, Radlinski and Joachims, 2007), ranking online advertisements (Radlinski, Broder, Ciccolo, Gabrilovich, Josifovski and Riedel, 2008a), studying the tolerance of learning from implicit feedback to random and malicious noise (Radlinski & Joachims, 2005a; Radlinski, 2007) and studying how high school students prepare for high stakes exams (Loken, Radlinski, Crespi, Cushing and Millet, 2004; Loken, Radlinski, Crespi and Millet, 2005). I also contributed to (Joachims, Granka, Pan, Hembrooke, Radlinski and Gay, 2007). Finally, a less technical overview describing a number of the key ideas in this thesis was published in (Joachims & Radlinski, 2007).
CHAPTER 2
UNDERSTANDING AND INTERPRETING USERS’ DECISIONS
This chapter presents an overview of previous research that considers users’ decision making processes, and how user actions can be interpreted. In particular, we will see how users’ decisions are affected by the way users are presented with choices to make. We start by considering the decisions people make in the offline world, from a marketing and economics perspective, as well as when applied to document relevance judgments. Following this, the majority of this chapter will consider user behavior in a Web search setting, asking a number of questions that will determine how implicit feedback can be interpreted.
2.1
Offline Decision Making
The process by which people make decisions offline has been studied extensively, particularly motivated by marketing applications. For instance, Eliashberg (1980) presented approaches for estimating consumer utility functions and Currim and Sarin (1984) studied job preferences using a related utility formulation. Numer- ous related behavioral models have been proposed, including bounded ratio- nality (Herbert, 1957), aspiration adaptation theory (Selten, 1998) and prospect theory (Tversky & Kahneman, 1981).
2.1.1
What influences user decisions?
Of more direct relevance to this thesis, a number of recent studies have explored factors that impact the reliability of preference choices made by people. Mantell and Kardes (1999) describe how the order in which choices are presented affects the preferences shown. In particular, they found that when comparisons are made between products based on specific attributes, order effects play a larger role than when comparisons are made in terms of overall (attitude-based) eval- uations. Moreover, when comparing pairs of products, Moore (1999) showed that if a superficially attractive option is presented first, consumers are willing to pay more for both options than when the superficially less attractive option is presented first. In addition, Coupey et al. (1998) and Zhang and Markman (2001) studied the effect of familiarity and motivation on preference decisions respectively. They found that the preferences people make are influenced by these factors, suggesting that consumer decision making is indeed a complex process.
In our context of interactive information ranking systems, these studies should be taken as indicative of the complexity of the behavior we will ob- serve from online users. It is unlikely that simple behavioral models will suffice to explain why people express the preferences they express. Rather the studies motivate careful analysis of online user behavior to identify complicating factors, to allow us to find ways to avoid them when collecting implicit feedback. For instance, these studies suggest that the order users are presented with online search results will affect their judgments of the relevance of the results.
2.1.2
Are decisions absolute or relative?
In the context of document relevance judgments, a particularly important ques- tion to ask in the offline world regards which sort of judgments people are able to make most reliably. In particular, when human experts are trained to make judgments about the relevance of documents to queries, what sort of judgments can we expect? Recently, Carterette et al. (2008) studied how relevance judg- ment quality is affected by the specific question asked of expert human judges. The standard approach to obtain relevance judgments from expert judges is for judges to provide a relevance score for each (query, document) pair on a two to six point scale. These relevance scores can then be used to construct a set of relative statements. Carterette et al.compared the relative judgments obtained in this way to those obtained by asking the experts for relative judgments directly. They found that relevance judges tend to agree more with each other when asked to directly provide relative judgments about pairs of documents. Moreover, it took the judges substantially less time to make a relative judgment than it took them to make an absolute judgment on a five point scale. This shows us that people (at least those trained to make relevance judgments) appear to have an easier time providing reliable and consistent relative judgments than absolute judgments.