
The first query selection problem I consider in this chapter is the myopic k-response query selection problem:

$$
q_k = \arg\max_{q \in Q_k} \mathrm{EVOI}(q; \psi), \tag{4.1}
$$

for some fixed k ≥ 2. That is, what should the agent ask when the only restriction is that its query must have exactly k responses? Before tackling that problem, consider what the agent should ask when there is no restriction at all on what it can ask, i.e., when even k is unrestricted.

Prior to asking its query, the agent’s Bayes-optimal decision $u_\psi$ may be suboptimal with respect to the true realization of the uncertain parameter, inducing expected loss

$$
\mathbb{E}_{\omega \sim \psi}\left[ V_\omega - V_\omega^{u_\psi} \right].
$$

Note that this upper-bounds the maximum EVOI a query could have, and that any query allowing the agent to behave optimally thereafter in response (i.e., allowing the agent to adopt $\max_{u} V_\omega^{u}$ under any ω ∈ Ω as a function of the query response) achieves this EVOI.

Thus, the query that asks “Which decision in U has highest value under the true realization of the uncertain parameter?” must have the highest EVOI of any possible query.

This result implies that in the extreme case where k = |U| (recall that U corresponds to the agent’s decision set), one solution to Equation 4.1 is the query that asks “What is the optimal decision?” which, in fact, is a member of the set of |U|-response decision queries.
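The claim that the "What is the optimal decision?" query attains the expected-loss upper bound can be checked numerically on a small example. The sketch below is only illustrative: the three-state, three-decision problem, the names `weighted_value` and `evoi`, and the restriction to deterministic (noise-free) query responses are all assumptions made here, not part of the text.

```python
from fractions import Fraction as F

# Hypothetical toy problem (not from the text): 3 states, 3 decisions.
psi = [F(1, 2), F(3, 10), F(1, 5)]   # prior over the uncertain parameter
V = [[4, 0, 1],                      # V[u][w]: value of decision u in state w
     [1, 3, 0],
     [2, 2, 2]]
decisions = range(len(V))

def weighted_value(u, weights):
    """Sum over states of weights[w] * V[u][w] (an unnormalised expectation)."""
    return sum(p * V[u][w] for w, p in enumerate(weights))

def evoi(query, k):
    """EVOI of a deterministic query; query[w] is the response given in state w."""
    prior_best = max(weighted_value(u, psi) for u in decisions)
    posterior_total = 0
    for j in range(k):
        # Keep only the prior mass of the states that produce response j.
        cell = [p if query[w] == j else 0 for w, p in enumerate(psi)]
        posterior_total += max(weighted_value(u, cell) for u in decisions)
    return posterior_total - prior_best

# Bayes-optimal decision under the prior, and the expected loss it induces.
u_psi = max(decisions, key=lambda u: weighted_value(u, psi))
expected_loss = sum(p * (max(V[u][w] for u in decisions) - V[u_psi][w])
                    for w, p in enumerate(psi))

# The |U|-response decision query: the response names an optimal decision.
full_decision_query = [max(decisions, key=lambda u: V[u][w])
                       for w in range(len(psi))]
evoi_full = evoi(full_decision_query, k=len(V))

print(expected_loss, evoi_full)  # the two quantities coincide on this example
```

Exact rational arithmetic (`fractions.Fraction`) is used so that the equality can be tested without floating-point tolerance.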

Returning to the k-response query selection problem (Equation 4.1), the agent will, in general, no longer be able to completely resolve its uncertainty about which decision it should execute via a single query (typically, $k \ll |U|$). But does the agent need to consider all k-response queries? A desirable property of a k-response query set Q would be that, for all finite decision problems and for all uncertainty ψ over them, Q always contains a k-response query with EVOI as high as any other k-response query, so that there would be no benefit in considering any k-response queries beyond those contained in Q.

More formally, I will say that a query set Q is k-response EVOI-sufficient (hereafter simply EVOI-sufficient, with the constraint k on the number of responses left implicit) if Q satisfies

$$
\sup_{q \in Q} \mathrm{EVOI}(q, \psi) = \sup_{q' \in Q_k} \mathrm{EVOI}(q', \psi).
$$
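On a small finite problem, the EVOI-sufficiency condition can be checked by brute force. The following sketch makes several assumptions that are not in the text: queries are modelled as deterministic maps from states to responses (maps that leave a response unused are included for simplicity), a decision query breaks ties in favour of the first listed decision, and the toy numbers and function names are hypothetical.

```python
from fractions import Fraction as F
from itertools import product

# Hypothetical toy problem (not from the text): 4 states, 3 decisions, k = 2.
psi = [F(4, 10), F(3, 10), F(2, 10), F(1, 10)]
V = [[5, 0, 1, 2],     # V[u][w]: value of decision u in state w
     [0, 4, 3, 0],
     [2, 2, 2, 6]]
k = 2
states, decisions = range(len(psi)), range(len(V))

def weighted_value(u, weights):
    """Sum over states of weights[w] * V[u][w] (an unnormalised expectation)."""
    return sum(p * V[u][w] for w, p in enumerate(weights))

def evoi(query):
    """EVOI of a deterministic query; query[w] is the response given in state w."""
    prior_best = max(weighted_value(u, psi) for u in decisions)
    posterior_total = 0
    for j in range(k):
        cell = [p if query[w] == j else 0 for w, p in enumerate(psi)]
        posterior_total += max(weighted_value(u, cell) for u in decisions)
    return posterior_total - prior_best

def decision_query(asked):
    """Query asking which decision in `asked` is best; ties go to the first."""
    return tuple(max(range(len(asked)), key=lambda j: V[asked[j]][w])
                 for w in states)

# Every deterministic map from states to k responses (a stand-in for Q_k).
all_queries = list(product(range(k), repeat=len(psi)))
# Every k-response decision query (a stand-in for D_k).
decision_queries = [decision_query(asked)
                    for asked in product(decisions, repeat=k)]

best_over_all = max(evoi(q) for q in all_queries)
best_over_decision_queries = max(evoi(q) for q in decision_queries)
print(best_over_all, best_over_decision_queries)
```

Enumerating all queries is exponential in the number of states, so this is only a sanity check for tiny instances, not a selection algorithm.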

Next I derive the first and main result of this chapter: Theorem 4.2, which states that the set of k-response decision queries is EVOI-sufficient. Intuitively, this means that the above result, that the |U|-response decision query has the highest possible EVOI of any unrestricted-response query, generalizes to the $D_k$-EVOI-optimal query having the highest possible EVOI of any k-response query (where k is fixed). This reduces the (myopic) k-response query selection problem to the (myopic) k-response decision query selection problem, and I discuss algorithms for the latter in Section 4.5.

To prove that the set of k-response decision queries is EVOI-sufficient, I will begin by proving the following lemma, and then I will show how it directly leads to Theorem 4.2. (Theorem 4.2 is actually a special case of Lemma 4.1, and in Chapter 6 I will use Lemma 4.1 again to prove another result.)

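Lemma 4.1. Consider an arbitrary k-response query q, and suppose the agent will execute decision $u_j$ upon receiving response j to q for j = 1, 2, . . . , k. Let $d_q$ be the k-response decision query over $u_1, u_2, \ldots, u_k$. Then the expected value of asking $d_q$ and executing the decision it identifies as best is at least the expected value of asking q and executing $u_j$ upon response j.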

The last step follows since the following two quantities are equivalent: (1) the expected highest value under the uncertain parameter out of the k posterior-executed decisions; and (2) the expected value associated with asking which of the k posterior-executed decisions has highest value under the uncertain parameter and then executing that decision.
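The chain of steps behind Lemma 4.1 can be sketched as follows. This is a reconstruction from the surrounding prose under assumed notation: $\Pr(q = j)$ for the prior probability of response $j$ to $q$, and $\psi \mid q = j$ for the corresponding posterior.

```latex
\begin{aligned}
\sum_{j=1}^{k} \Pr(q = j)\,
  \mathbb{E}_{\omega \sim \psi \mid q = j}\!\left[ V_\omega^{u_j} \right]
&\le \sum_{j=1}^{k} \Pr(q = j)\,
  \mathbb{E}_{\omega \sim \psi \mid q = j}\!\left[ \max_{i \in \{1,\dots,k\}} V_\omega^{u_i} \right] \\
&= \mathbb{E}_{\omega \sim \psi}\!\left[ \max_{i \in \{1,\dots,k\}} V_\omega^{u_i} \right] \\
&= \sum_{j=1}^{k} \Pr(d_q = j)\,
  \mathbb{E}_{\omega \sim \psi \mid d_q = j}\!\left[ V_\omega^{u_j} \right].
\end{aligned}
```

The first step bounds each $V_\omega^{u_j}$ pointwise by the maximum over the k decisions, the second is the law of total expectation, and the third is the equivalence between quantities (1) and (2) described above.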

Intuitively, Lemma 4.1 can be interpreted as follows. Consider the expected value associated with asking a k-response query q and then executing some arbitrary decision $u_j$ ∈ U upon observing response j to q (note that these decisions may not be Bayes-optimal decisions with respect to the corresponding posteriors). Lemma 4.1 implies that this value can only be improved by replacing q with the k-response decision query asking about those decisions $u_1, u_2, \ldots, u_k$.

However, typically the agent will ask a query in order to improve the expected value of its decision. That is, upon asking a k-response query q and observing response j, it executes the posterior Bayes-optimal decision as opposed to some arbitrary decision. Since Lemma 4.1 is true for any arbitrary set of decisions executed for the posterior distributions, it applies to this case where the decisions are posterior Bayes-optimal as well. Intuitively, this implies that if the agent will use the response to a k-response query q to choose the best of decisions $\{u_j\}_{j=1}^{k}$ (which may contain duplicates), the agent can only do better by getting straight to the point and asking which of those decisions is best directly instead of asking q. This gives rise to the query improvement procedure I discuss next.

Query improvement procedure. Consider the procedure that replaces an arbitrary k-response query q with the k-response decision query $d_q$, which asks which of q’s posterior Bayes-optimal decisions is best given full knowledge of the uncertain parameter. Theorem 4.2 shows that the set of k-response decision queries is EVOI-sufficient by showing that this procedure can only improve the query, in that $d_q$ must have EVOI at least as high as the original query q:

Theorem 4.2. (The set of k-response decision queries is EVOI-sufficient.)

For all finite decision problems and for all uncertainty ψ over them, the EVOI-optimal k-response decision query has EVOI equal to that of the EVOI-optimal k-response query:

$$
\sup_{q \in Q_k} \{\mathrm{EVOI}(q, \psi)\} = \max_{q' \in D_k} \{\mathrm{EVOI}(q', \psi)\}.
$$

Proof. Consider an arbitrary k-response query q and recall that the decision $u_{\psi|q=j}$ is the Bayes-optimal decision for the posterior distribution ψ|q = j (the posterior induced by the jth response to query q). Let $d_q$ denote the k-response decision query that asks for the optimal decision in the set $\{u_{\psi|q=j}\}_{j=1}^{k}$. Then, applying Lemma 4.1 with $u_j = u_{\psi|q=j}$, the expected value of asking $d_q$ is at least the expected value of asking q and executing the posterior Bayes-optimal decisions, implying that EVOI(q, ψ) ≤ EVOI($d_q$, ψ). Since this is true for any k-response query q, it is also true for the $Q_k$-EVOI-optimal query, so the $D_k$-EVOI-optimal query must be $Q_k$-EVOI-optimal.

Thus, I have shown that the set of k-response decision queries is EVOI-sufficient. This means that when restricted to asking k-response queries, there is no loss in considering only k-response decision queries.
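A single application of the query improvement procedure can be illustrated on a toy problem. This is a minimal sketch under assumptions not taken from the text: responses are deterministic functions of the state, ties are broken in favour of the lowest index, and the numbers and the names `posterior_decisions`, `decision_query`, and `improve` are hypothetical.

```python
from fractions import Fraction as F

# Hypothetical toy problem (not from the text): 4 states, 3 decisions, k = 2.
psi = [F(4, 10), F(3, 10), F(2, 10), F(1, 10)]
V = [[5, 0, 1, 2],     # V[u][w]: value of decision u in state w
     [0, 4, 3, 0],
     [2, 2, 2, 6]]
k = 2
states, decisions = range(len(psi)), range(len(V))

def weighted_value(u, weights):
    """Sum over states of weights[w] * V[u][w] (an unnormalised expectation)."""
    return sum(p * V[u][w] for w, p in enumerate(weights))

def cell(query, j):
    """Prior mass restricted to the states in which `query` returns response j."""
    return [p if query[w] == j else 0 for w, p in enumerate(psi)]

def evoi(query):
    """EVOI of a deterministic query; query[w] is the response given in state w."""
    prior_best = max(weighted_value(u, psi) for u in decisions)
    posterior = sum(max(weighted_value(u, cell(query, j)) for u in decisions)
                    for j in range(k))
    return posterior - prior_best

def posterior_decisions(query):
    """Bayes-optimal decision for each response (lowest index wins ties)."""
    return [max(decisions, key=lambda u: weighted_value(u, cell(query, j)))
            for j in range(k)]

def decision_query(asked):
    """Query asking which decision in `asked` is best; ties go to the first."""
    return tuple(max(range(len(asked)), key=lambda j: V[asked[j]][w])
                 for w in states)

def improve(query):
    """Replace `query` by the decision query over its posterior-optimal decisions."""
    return decision_query(posterior_decisions(query))

q = (0, 0, 1, 1)       # an arbitrary 2-response query
d_q = improve(q)
print(q, evoi(q))
print(d_q, evoi(d_q))  # EVOI does not decrease
```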

Recursive query improvement. As discussed above, a query q can be improved (in terms of EVOI) to a decision query $d_q$ that asks which of q’s posterior Bayes-optimal decisions has highest value under the uncertain parameter. However, note that the jth decision asked about by $d_q$ might not be posterior Bayes-optimal when j is the response to $d_q$. For example, the jth decision $u_j$ asked about by $d_q$ might have lower value than another decision $u'_j$ for all ω ∈ Ω that prescribe response j to $d_q$. As a result, applying this query improvement procedure to $d_q$ may yield another query $d'_q$ that has higher EVOI than $d_q$. In general, repeating this query improvement procedure recursively would converge to a k-response decision query $q_k$ with the property that when j is the response, the jth decision in the set queried by $q_k$ is Bayes-optimal. I will refer to k-response decision queries that have this property as locally optimal k-response decision queries, and will revisit this point in Section 4.5 where I discuss algorithms for k-response decision query selection.
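The recursion described above can be sketched as a fixed-point loop: repeat the improvement step until the queried decisions are themselves posterior Bayes-optimal. As before, this is an illustrative sketch rather than the algorithms of Section 4.5. Deterministic responses, lowest-index tie-breaking, the `max_rounds` safeguard, the toy numbers, and all function names are assumptions introduced here.

```python
from fractions import Fraction as F

# Hypothetical toy problem (not from the text): 4 states, 3 decisions, k = 2.
psi = [F(4, 10), F(3, 10), F(2, 10), F(1, 10)]
V = [[5, 0, 1, 2],     # V[u][w]: value of decision u in state w
     [0, 4, 3, 0],
     [2, 2, 2, 6]]
k = 2
states, decisions = range(len(psi)), range(len(V))

def weighted_value(u, weights):
    """Sum over states of weights[w] * V[u][w] (an unnormalised expectation)."""
    return sum(p * V[u][w] for w, p in enumerate(weights))

def cell(query, j):
    """Prior mass restricted to the states in which `query` returns response j."""
    return [p if query[w] == j else 0 for w, p in enumerate(psi)]

def evoi(query):
    """EVOI of a deterministic query; query[w] is the response given in state w."""
    prior_best = max(weighted_value(u, psi) for u in decisions)
    posterior = sum(max(weighted_value(u, cell(query, j)) for u in decisions)
                    for j in range(k))
    return posterior - prior_best

def posterior_decisions(query):
    """Bayes-optimal decision for each response (lowest index wins ties)."""
    return [max(decisions, key=lambda u: weighted_value(u, cell(query, j)))
            for j in range(k)]

def decision_query(asked):
    """Query asking which decision in `asked` is best; ties go to the first."""
    return tuple(max(range(len(asked)), key=lambda j: V[asked[j]][w])
                 for w in states)

def locally_optimal(query, max_rounds=100):
    """Iterate the improvement step until the queried decisions stop changing."""
    asked = posterior_decisions(query)
    for _ in range(max_rounds):          # safeguard against non-termination
        d_q = decision_query(asked)
        improved = posterior_decisions(d_q)
        if improved == asked:            # each queried decision is posterior-optimal
            return asked, d_q
        asked = improved
    return asked, decision_query(asked)

q = (0, 0, 1, 1)
asked, d_q = locally_optimal(q)
print(asked, d_q, evoi(q), evoi(d_q))
```

On this example the first improvement step yields a decision query whose second queried decision is not posterior Bayes-optimal, so a second step is needed before the fixed point is reached.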