
In this chapter, we have taken a significant step in exploring the crucial incentive issues that have the potential to derail the effectiveness of prediction markets for various forecasting tasks. We have introduced a new formal model for studying the incentives for and the impact of manipulation in prediction markets whose participants can affect the outcome by taking actions external to the market but there is some uncertainty about the market participation of some outcome-deciders. We have characterized the equilibria of the induced game, discussed their properties, and outlined important extensions. Interesting avenues for future work include generalizing our results to markets with other price-setting mechanisms, richer signal structures, outcome functions other than the mean vote (such as non-linear and/or noisy functions of the agents’ second-stage actions), and agents who also strategically pick the time-points at which they trade.

Chapter 4

Aggregating censored signals from non-strategic noisy agents

4.1 Introduction

In this chapter, we abstract away from the task of providing truth-telling incentives and concentrate on methods for combining signals obtained from agents as responses to queries designed by the principal, with the aim of getting as close to an unknown “ground truth” as possible. Our agent model assumes that agents do not behave strategically and consists only of the signal structure (the sampling distribution of agents’ signals given the ground truth), which accounts for the subjectivity in their inputs to the aggregator.

There exist many problem domains where the learner’s goal is to locate a certain target, given access only to a sequence of (potentially different) oracles, each of which provides a noisy binary response to the question of whether the target belongs to a sub-space (chosen by the learner) of its range of variation. Examples explored in the literature include dynamic pricing of goods and services (Harrison et al., 2012), object localization in images (Sznitman et al., 2013), and drug dosage discovery in Phase I clinical trials (Cheung and Elkind, 2010b).

Although the material in this chapter has general applicability, let us stay true to the spirit of information aggregation within a (prediction) market context presented in Chapters 2 and 3, and try to motivate the discussion by recounting the automated market maker for financial markets developed by Das and Magdon-Ismail (2009), henceforth referred to as MM. Recall that a market maker is a trading agent that places both buy and sell orders within the same market (unlike buyers and sellers who respectively have demand and supply only), and readjusts its prices after every trade with some financial objective in mind, e.g. expected long-term profit maximization.

In the market model of Das and Magdon-Ismail (2009), the asset being traded attains a “true” (unknown) value at market inception that remains unchanged henceforth, and each trading agent acquires a noisy version of this value. At each time-step or episode, the MM publicly quotes an ask price and a (lower) bid price defined as the price at which it is willing to sell and buy one unit of the asset respectively in the episode. Exactly one agent interacts with the MM per episode and buys one unit, sells one unit, or does nothing depending on whether her valuation is higher than the ask price, lower than the bid price, or in between these quotes. Agents following such simple trading rules that are not “immediately” irrational but lack sophisticated optimization or learning components are often called zero-intelligence traders (Gode and Sunder, 1993), and have been used to illustrate the emergence of interesting aggregate-level properties (e.g. market efficiency) from individual properties (e.g. bounded rationality).
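The trading rule described above can be sketched in a few lines of Python. This is a minimal illustration and not code from Das and Magdon-Ismail (2009): the function names, the action labels, and the choice of additive zero-mean Gaussian noise for the trader's valuation are assumptions made here for concreteness.

```python
import random


def trader_action(valuation, bid, ask):
    """Action of a zero-intelligence trader facing the MM's quotes.

    Buys if her valuation is above the ask, sells if it is below the bid,
    and otherwise does nothing. The labels are hypothetical names.
    """
    if bid > ask:
        raise ValueError("bid price must not exceed ask price")
    if valuation > ask:
        return "buy"
    if valuation < bid:
        return "sell"
    return "hold"


def simulate_episode(true_value, noise_sd, bid, ask, rng):
    """One episode: a single trader draws a noisy valuation and acts on it.

    ASSUMPTION: valuation = true value + zero-mean Gaussian noise.
    """
    valuation = true_value + rng.gauss(0.0, noise_sd)
    return trader_action(valuation, bid, ask)


if __name__ == "__main__":
    rng = random.Random(0)
    actions = [simulate_episode(100.0, 5.0, 99.0, 101.0, rng) for _ in range(10)]
    print(actions)
```

Each returned action tells the MM only which side of its quotes the valuation fell on, not the valuation itself.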

Since the MM sees which of the three above actions the agent took, the ask and bid prices, in addition to determining revenues, also serve as thresholds defining three (mutually exclusive and exhaustive) sub-intervals such that the MM knows which of these sub-intervals the latest trader’s valuation lies in. It can use this knowledge to adjust its quotes for the next episode with the ultimate aim of converging on the true asset value so that it stands to profit (in the long run) from the imperfectly informed traders. Das and Magdon-Ismail (2009) accomplish this task within a reinforcement learning setting by having the MM maintain a Gaussian belief state over the unknown value (which serves as the unchanging state of the world), use a moment-matching approximation to its Bayesian posterior after every agent interaction, and set bid-ask quotes as an action in its updated belief state; they show experimentally that this methodology has impressive price discovery properties. If we consider the single-threshold variant of this problem (where bid and ask price always coincide), the MM can be viewed as a learner or principal performing an aggregation of stochastic binary (thresholded) signals with its mean belief acting as the aggregate, albeit with a potential long-term profit-making aim unlike in Chapters 2 and 3. We will concern ourselves only with what the principal wishes to learn, and not why it wants to learn it.

We present an algorithm that starts with a Gaussian prior belief on a real-valued target, maintains a Gaussian belief at all times (after an initial transient phase; see below for details) by applying a moment-matching approximation to the true (complicated and non-Gaussian) posterior, and sets its threshold for querying each agent in a sequence at the mean of its current belief distribution. We show that it unconditionally converges to the target with high probability, and that the asymptotic rate of convergence is near-optimal with respect to many problem parameters, optimality being defined with respect to an exact Bayesian inferential procedure that observes agents’ real-valued (unthresholded) signals.
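To make the moment-matching step concrete, the following Python sketch shows one possible form of the update under explicitly assumed conditions. It assumes each agent's signal is the target plus independent zero-mean Gaussian noise of known variance, and that the learner only sees whether the signal exceeds a threshold placed at the current belief mean. Under these assumptions, the exact posterior mean and variance follow from standard truncated-Gaussian moments, and the sketch replaces the posterior with a Gaussian having those moments. The function names are hypothetical, and the initial transient phase mentioned above is not modeled.

```python
import math
import random


def moment_matched_update(mean, var, noise_var, above):
    """One Gaussian belief update after a threshold query at the belief mean.

    ASSUMPTION: signal = target + N(0, noise_var), and `above` is True when
    the signal exceeded the threshold (which equals `mean`).
    Returns the mean and variance of the exact posterior; the Gaussian
    approximation simply adopts these two moments.
    """
    sign = 1.0 if above else -1.0
    total_sd = math.sqrt(var + noise_var)
    new_mean = mean + sign * math.sqrt(2.0 / math.pi) * var / total_sd
    new_var = var * (1.0 - (2.0 / math.pi) * var / (var + noise_var))
    return new_mean, new_var


def run(target, prior_mean, prior_var, noise_sd, n_agents, seed=0):
    """Query a sequence of simulated agents, thresholding at the belief mean."""
    rng = random.Random(seed)
    mean, var = prior_mean, prior_var
    for _ in range(n_agents):
        signal = target + rng.gauss(0.0, noise_sd)  # never shown to the learner
        mean, var = moment_matched_update(mean, var, noise_sd ** 2, signal > mean)
    return mean, var


if __name__ == "__main__":
    print(run(target=1.0, prior_mean=0.0, prior_var=1.0, noise_sd=1.0, n_agents=2000))
```

In this sketch the variance update does not depend on the observed response, so the belief variance shrinks deterministically while the mean moves up or down by a step proportional to the current variance.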

4.2 Related work

The literature on learning with thresholded signals or binarized observations is scattered across various lines of academic research. For example, in online dynamic pricing, a seller wishes to determine the demand curve. She sets a price for a good and observes whether or not the arriving buyer chooses to purchase at that price (Harrison et al., 2010). In drug dosage discovery, the goal is typically to estimate the maximum dosage level that causes toxicity with less than some target probability (this is typically the focus of Phase I clinical trials) (Cheung and Elkind, 2010a). Threshold queries are also used in image or face localization, where classifiers are used as subroutines to determine whether or not a face or letter or character appears in the query region of some image (Sznitman and Jedynak, 2010). Most contributions in this vein have focused on noise of a particular form: Nature generates the correct answer, but it is then sent through a noisy transmission channel (Jedynak et al., 2011). Thus, the probability of seeing the wrong signal is constant, independent of the point of measurement (the particular threshold set by the learner). Several papers have focused on proving the asymptotic optimality of policies that measure either at or around the median (Horstein, 1963; Burnashev and Zigangirov, 1974; Castro and Nowak, 2008). More recent work shows that measuring at the median is sequentially optimal for entropy reduction in the case of symmetric noise (Waeber et al., 2011). In a different vein, Karp and Kleinberg (2007) consider noisy binary search: in this problem, a finite sequence of biased coins, ordered in increasing probability of a “heads” outcome, has to be searched for the last element with a probability of heads lower than a specified target value.

The bisection problem itself can also be thought of as a version of the classic problem of stochastic root finding (Robbins and Monro, 1951), where the learner is trying to learn the root of a real-valued, decreasing function f. The model is that the learner sequentially queries at points θ_1, θ_2, ..., θ_n, and receives observations of f(θ_1), f(θ_2), ..., f(θ_n) after addition of noise (e.g. zero-mean Gaussian noise). A natural extension to binary signals is to assume that the learner observes only whether the corrupted signal is above or below zero. This directly corresponds to a noisy binary signal indicating whether the threshold is smaller than or larger than the root. In this case, the noise model is heavily dependent on how close the threshold is set to the target root. When the threshold is near the target, the probability of seeing the wrong signal is significantly higher and no longer bounded away from 1/2.
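The dependence of the error probability on the distance between threshold and root can be illustrated with a short Python sketch. It assumes, purely for illustration, the linear decreasing function f(θ) = target − θ and additive zero-mean Gaussian noise with standard deviation `noise_sd`; under these assumptions the sign of the corrupted observation is wrong with probability Φ(−|θ − target| / noise_sd), where Φ is the standard normal CDF. The function names are hypothetical.

```python
import math


def normal_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def wrong_signal_probability(threshold, target, noise_sd):
    """Probability that the binary signal points to the wrong side of the root.

    ASSUMPTION: the learner sees the sign of (target - threshold) + noise,
    with noise ~ N(0, noise_sd^2).
    """
    return normal_cdf(-abs(threshold - target) / noise_sd)


if __name__ == "__main__":
    for distance in (2.0, 1.0, 0.5, 0.1, 0.0):
        p = wrong_signal_probability(distance, 0.0, 1.0)
        print(f"distance {distance:4.1f}: P(wrong signal) = {p:.4f}")
```

As the threshold approaches the root, the computed probability rises toward 1/2, in contrast to the constant-error channel model discussed earlier.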