In this appendix I provide additional analysis pertaining to the askable decision query setting. Although my primary motivation for studying this setting in the context of this dissertation was as an empirical testing ground for strengthening intuitions regarding what factors are important to consider when determining the applicability of WQP algorithms compared to alternatives such as MEDER, the setting is interesting in its own right.
Specifically, I provide additional theoretical analysis regarding the EVOI performance of AskEnum-AskEval compared to VBDR, a third WQP algorithm. In addition, I discuss in detail the potential merit of another algorithm for askable decision query selection, which can be viewed as an extension of the greedy decision query construction procedure from the standard decision query selection setting.
6.4.1 Value-based Decision Replacement
AskEnum-AskEval is related to a value-based WQP algorithm. Namely, consider a third projection algorithm which applies only to the askable decision query setting: Value-Based Decision Replacement (VBDR), which selects a query from $Q_A$ as follows. First, $d^*$ is computed. Then, each decision $u_j$ queried by $d^*$ is replaced by a decision $a_j$ so as to minimize $\max_{\omega \in \Omega} |V^{u_j}_{\omega} - V^{a_j}_{\omega}|$, producing query $\hat{q}_A$. I show next that VBDR and AskEnum-AskEval are closely related in that a performance guarantee similar to the one stated in Theorem 6.2 applies to VBDR; the result is stated as Theorem 6.4 below. To this end, I first prove Lemma 6.3, which states that any query about decisions in D can be replaced by a query about decisions in A while incurring an EVOI loss of at most $\Delta(D, A)$:
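Lemma 6.3. For any $q \in Q_D$, there exists some $q' \in Q_A$ such that
\[
|\mathrm{EVOI}(q; \psi) - \mathrm{EVOI}(q'; \psi)| \le \Delta(D, A).
\]
Proof. The proof is by construction. Consider some $q \in Q_D$, and for $j = 1, \dots, k$ let $u_j$ denote the Bayes-optimal decision under the posterior induced by the $j$th possible response to $q$; i.e., $u_j \triangleq \arg\max_{u \in D} V^{u}_{\psi|q=j}$. Construct query $q'$ to query $\{a_1, a_2, \dots, a_k\}$, where $a_j = \arg\min_{a \in A} \max_{\omega \in \Omega} |V^{u_j}_{\omega} - V^{a}_{\omega}|$. Then,
\[
\begin{aligned}
\mathrm{EVOI}(q; \psi) - V^{*}_{\psi}
&= \sum_{j=1}^{k} \Pr(q = j)\, V^{*}_{\psi|q=j} \\
&= \sum_{j=1}^{k} \Pr(q = j)\, V^{u_j}_{\psi|q=j} \\
&\le \sum_{j=1}^{k} \Pr(q = j) \left( V^{a_j}_{\psi|q=j} + \Delta(D, A) \right) \\
&= \sum_{j=1}^{k} \Pr(q = j)\, V^{a_j}_{\psi|q=j} + \Delta(D, A) \\
&\le \sum_{j=1}^{k} \Pr(q' = j)\, V^{a_j}_{\psi|q'=j} + \Delta(D, A) \quad \text{(by Lemma 4.1).}
\end{aligned}
\]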
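Combining Theorem 4.2 with Lemma 6.3 above yields the EVOI-loss bound for VBDR:
Theorem 6.4. Let $q^*$ denote the $Q_A$-EVOI-optimal query, and let $\hat{q}$ denote the query selected by VBDR. Then for any decision problem and uncertainty $\psi$ over them,
\[
\mathrm{EVOI}(q^*; \psi) - \mathrm{EVOI}(\hat{q}; \psi) \le \Delta(D, A).
\]
Proof. Let $d^*$ denote the $Q_D$-EVOI-optimal query, and recall that $\hat{q}$ is constructed by replacing each decision $u_j$ queried by $d^*$ with a decision $a_j$ chosen to minimize $\max_{\omega \in \Omega} |V^{u_j}_{\omega} - V^{a_j}_{\omega}|$, a quantity that is at most $\Delta(D, A)$. Then,
\[
\begin{aligned}
\mathrm{EVOI}(q^*; \psi) - \mathrm{EVOI}(\hat{q}; \psi)
&\le \mathrm{EVOI}(d^*; \psi) - \mathrm{EVOI}(\hat{q}; \psi) && \text{(by Theorem 4.2)} \\
&\le \Delta(D, A) && \text{(by Lemma 6.3).}
\end{aligned}
\]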
Theorem 6.4 shows that, like AskEnum-AskEval, VBDR is guaranteed to work well when the underlying askable and doable decision problems are similar in terms of the exact value magnitudes prescribed by the parameter space to the available decisions in each set.
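The replacement step that defines VBDR can be sketched in a few lines of Python. This is a minimal illustration rather than the dissertation's implementation: it assumes a finite parameter space stored as a nested dictionary, it takes the decisions queried by the doable-optimal query as given input instead of computing them, and the names `worst_case_gap`, `vbdr_replace`, and the example values are all hypothetical.

```python
from typing import Dict, List

# values[omega][decision] holds the value of a decision under parameter omega.
# A finite, explicitly enumerated parameter space is an illustrative assumption.
Values = Dict[str, Dict[str, float]]


def worst_case_gap(values: Values, u: str, a: str) -> float:
    """Largest absolute value difference between u and a over all omega."""
    return max(abs(v[u] - v[a]) for v in values.values())


def vbdr_replace(values: Values, queried: List[str], askable: List[str]) -> List[str]:
    """Replace each queried doable decision by its closest askable decision."""
    return [min(askable, key=lambda a: worst_case_gap(values, u, a)) for u in queried]


# Hypothetical two-parameter example: u1, u2 are doable; a1, a2, a3 are askable.
EXAMPLE_VALUES: Values = {
    "w1": {"u1": 1.0, "u2": 0.0, "a1": 0.9, "a2": 0.2, "a3": 0.5},
    "w2": {"u1": 0.0, "u2": 1.0, "a1": 0.1, "a2": 0.7, "a3": 0.5},
}
EXAMPLE_QUERY = vbdr_replace(EXAMPLE_VALUES, ["u1", "u2"], ["a1", "a2", "a3"])
```

In this example `u1` is matched to `a1` and `u2` to `a2`, because those pairs have the smallest worst-case value gap. Note that the criterion looks only at value magnitudes, which is the property the surrounding discussion identifies as a weakness.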
Intuitively, VBDR and AskEnum-AskEval share a key weakness: they both can fail when the best askable decision queries ask about askable decisions with very low values relative to other askable decisions, since VBDR attempts to match the value magnitudes of the decisions composing the best doable decision query (which are biased to be high), and AskEnum-AskEval is biased towards using askable decisions with high value magnitudes.
In contrast, DEER and DMVM do not share this weakness because their projection criteria consider relative decision values instead of magnitudes.
Next, I conclude my theoretical discussion of askable decision query selection by discussing another specialized approach for the setting: greedily approximating AskEnum-DoEval.
6.4.2 Greedy AskEnum-DoEval
Recall that the $Q_A$-EVOI-optimal decision query can be computed through the AskEnum-DoEval algorithm described above. However, AskEnum-DoEval's approach of evaluating every combination of $k$ askable decisions requires evaluating $O(|A|^k)$ queries, which would be infeasible in many settings. It is natural to ask, then, whether adapting the greedy construction procedure of Viappiani and Boutilier (2010) to construct the query from decisions in A while evaluating them with respect to D would yield computational benefits and approximation guarantees similar to those in the standard decision query selection setting⁷ (refer to Section 4.5 of Chapter 4 for details of the greedy algorithm and notation, both of which are referred to below). I refer to this greedy approximation to AskEnum-DoEval as Askable Greedy construction Doable Evaluation (AskGreedy-DoEval).
Computationally, the algorithm must perform $O(k|A|)$ EVOI computations, each of which has complexity $O(k|D|B)$ (recall that $B$ represents the complexity of Bayesian inference, such as the number of particles used in a particle filtering approach). Although AskGreedy-DoEval evaluates fewer queries than AskEnum-DoEval, these EVOI computations cannot be replaced by EVOR computations as they can in the standard decision query selection setting (recall that the EVOR of a decision query is defined as the expected value of executing the decision associated with the response chosen by the user when asked the query), because the agent cannot necessarily execute the decisions in the set being constructed. Thus, one of the major computational advantages of the greedy construction procedure over exhaustive search in the standard decision query setting does not apply to the askable decision query setting: recall that EVOR computations are $O(kB)$, in contrast to EVOI computations, which here are $O(k|D|B)$.
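Multiplying out the counts stated above makes the comparison concrete. This is only a back-of-the-envelope sketch: it assumes that each EVOI computation in AskEnum-DoEval has the same $O(k|D|B)$ cost as in the greedy procedure, and the sizes plugged in afterwards are illustrative choices, not values taken from the experiments.
\[
\begin{aligned}
\text{AskGreedy-DoEval:} \quad & O(k|A|) \cdot O(k|D|B) = O(k^{2}\,|A|\,|D|\,B), \\
\text{AskEnum-DoEval:} \quad & O(|A|^{k}) \cdot O(k|D|B) = O(k\,|A|^{k}\,|D|\,B).
\end{aligned}
\]
For instance, with $|A| = 16$ and $k = 2$, the greedy procedure performs on the order of $k|A| = 32$ EVOI computations, against $|A|^{k} = 256$ under the exhaustive bound. The saving comes entirely from the number of queries evaluated; the per-query factor $O(k|D|B)$ is common to both.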
⁷ How to choose the first decision to add to the set is unclear. Here, assume that the algorithm begins by adding the askable decision that has the highest expected value among the decisions in A.
The approximation guarantee that applies to the greedy construction procedure is lost in the askable decision query selection setting as well. To understand why, consider the following argument. As discussed in Section 4.5, the approximation guarantee proven by Viappiani and Boutilier (2010) is derived by proving and combining the following two results: 1) EVOR is monotonic and submodular (this implies that EVOR can be maximized greedily while offering the guarantee derived from the more general result of Nemhauser et al. (1978)); and 2) the decision query with highest EVOI also has highest EVOR. However, in the askable decision query setting, the askable decision query with highest EVOR may not have the highest EVOI, since applied to askable decision queries EVOR would consider how the query would induce improvement in selecting from A instead of D. Thus, one would need to prove that EVOI, instead of EVOR, is monotonic and submodular for queries constructed from decisions in A. However, unlike EVOR, EVOI is nonmonotonic in general for decision queries, and so cannot be guaranteed to be monotonic for queries constructed from decisions in A. To see this, consider Example 6.1 above and the $Q_A$-EVOI-optimal binary-response query asking about $\{u_1, u_2\}$. Recall that even though $u_1$ and $u_2$ have low value compared to $u_3$ and $u_4$, the information provided by distinguishing between $u_1$ and $u_2$ is useful for distinguishing between $u_5$ and $u_6$. Now consider adding $u_3$ to the set so the query becomes $\{u_1, u_2, u_3\}$. The answer will always be $u_3$, which provides no information to help distinguish between $u_5$ and $u_6$, and so the EVOI of the expanded query is zero. Hence, EVOI is nonmonotonic and so the same guarantee cannot directly apply in this setting.
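The greedy procedure and the nonmonotonicity argument can both be illustrated with a short Python sketch. It rests on several assumptions that are not taken from the dissertation: a finite parameter space, a noiseless response model in which the user names the queried decision with the highest value under the true parameter, and EVOI computed as the expected posterior-optimal value over D minus the prior-optimal value over D. The function names and the toy instance are hypothetical; the instance is built in the spirit of the argument above and is not Example 6.1 itself.

```python
from itertools import combinations
from typing import Dict, List

Values = Dict[str, Dict[str, float]]  # values[omega][decision]
Belief = Dict[str, float]             # belief[omega] = probability of omega


def best_value(values: Values, belief: Belief, decisions: List[str]) -> float:
    """Expected value of the Bayes-optimal decision among `decisions`."""
    return max(sum(p * values[w][d] for w, p in belief.items()) for d in decisions)


def evoi(values: Values, prior: Belief, query: List[str], doable: List[str]) -> float:
    """EVOI of a decision query, evaluated with respect to the doable set."""
    # Assumed response model: the user names the queried decision with the
    # highest value under the true omega (ties go to the earliest listed).
    mass_by_response: Dict[str, Belief] = {}
    for w, p in prior.items():
        if p > 0:
            answer = max(query, key=lambda d: values[w][d])
            mass_by_response.setdefault(answer, {})[w] = p
    posterior_value = 0.0
    for mass in mass_by_response.values():
        pr_j = sum(mass.values())
        posterior = {w: p / pr_j for w, p in mass.items()}
        posterior_value += pr_j * best_value(values, posterior, doable)
    return posterior_value - best_value(values, prior, doable)


def ask_greedy_do_eval(values: Values, prior: Belief, askable: List[str],
                       doable: List[str], k: int) -> List[str]:
    """Grow a k-decision askable query greedily (assumes k <= len(askable))."""
    # Starting rule from the footnote: highest prior expected value in A.
    query = [max(askable, key=lambda a: best_value(values, prior, [a]))]
    while len(query) < k:
        rest = [a for a in askable if a not in query]
        query.append(max(rest, key=lambda a: evoi(values, prior, query + [a], doable)))
    return query


def ask_enum_do_eval(values: Values, prior: Belief, askable: List[str],
                     doable: List[str], k: int) -> List[str]:
    """Exhaustive baseline: score every k-subset of the askable set."""
    return max((list(c) for c in combinations(askable, k)),
               key=lambda q: evoi(values, prior, q, doable))


# Hypothetical toy instance: a1, a2 have low value but reveal omega; a3 is
# always the highest-valued askable decision; d1, d2 are the doable decisions.
VALUES: Values = {
    "w1": {"a1": 0.2, "a2": 0.1, "a3": 1.0, "d1": 1.0, "d2": 0.0},
    "w2": {"a1": 0.1, "a2": 0.2, "a3": 1.0, "d1": 0.0, "d2": 1.0},
}
PRIOR: Belief = {"w1": 0.5, "w2": 0.5}
ASKABLE, DOABLE = ["a1", "a2", "a3"], ["d1", "d2"]
```

On this instance the query `["a1", "a2"]` has EVOI 0.5, while the superset `["a1", "a2", "a3"]` has EVOI 0 because the response is always `a3`. With `k = 2`, the greedy sketch starts from `a3` and can only reach a zero-EVOI query, whereas exhaustive enumeration returns `["a1", "a2"]`.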
This concludes my discussion of AskGreedy-DoEval; I leave empirical analysis and further theoretical analysis of AskGreedy-DoEval for future work.
[Plots for Figure 6.12. Panels a) and b) are both titled "Random Decision Subsets Exp: Size of Askable Decision Set vs EVOI-loss"; x-axis: size of askable decision set; y-axis: EVOI-loss of selected query; series: DEER, MEDER, AskEnum-AskEval.]
Figure 6.12: Supplemental results for Experiments 6.1 and 6.2. Average EVOI-loss of the selected query as a function of the size of the askable decision set, where the size of the doable decision set is fixed at 15 and the doable decision set is uniformly sampled from the complete Boston housing decision set, and where a) the askable decision set is uniformly sampled from the complete Boston housing decision set (Experiment 6.1); or b) the askable decision set is sampled uniformly from the doable decision set (Experiment 6.2).
CHAPTER 7
Conclusions
A variety of research communities are concerned with the problem of how an agent should select which query to ask its human user from an available set in order to improve future decision-making under uncertainty, both from a theoretical perspective (e.g., preference elicitation) and from a practical perspective (e.g., human/agent interaction). Among these communities, an oft-used criterion (and the one used in this dissertation) for measuring the value of asking a query is its associated Expected Value of Information (EVOI), which measures the expected impact of a query’s response on the agent’s policy and hence value.
EVOI-based query selection presents significant challenges in settings where optimal planning under uncertainty is computationally expensive, since EVOI measures the expected impact of the query on the agent's policy. These challenges are especially acute when the set of queries the agent considers is large. In such cases, any benefits brought about by considering queries to ask may be outweighed by the additional computational expense required to reason over them, limiting the scope of potential practical applications of querying in agents. The work in this dissertation makes theoretical and practical contributions to overcoming these challenges in EVOI-based query selection, focusing on settings where incorporating query responses to update the agent's uncertainty is much less computationally demanding than optimal planning computations.