Recall that this chapter is focused on studying the k-response query selection problem.
In Section 4.3 I showed that the k-response query selection problem can be reduced to k-response decision query selection, since the set of k-response decision queries is EVOI-sufficient (Theorem 4.2). Next I discuss algorithms for myopic k-response decision query selection.
4.5.1 Myopic k-Response Query Selection
The problem of selecting the EVOI-optimal k-response decision query has been studied by Viappiani and Boutilier (2010) (the decision queries of this chapter correspond to their “noiseless choice queries”). In particular, they show that greedily constructing a k-response decision query closely approximates EVOI-optimal k-response decision query selection. At each iteration, the next decision to add to the set asked about is chosen by maximizing EVOR, an efficiently computable lower bound on EVOI defined below. Next I summarize their results, which connect with Theorem 4.2.
To begin, define the Expected Value of Recommendation (EVOR) of a k-response decision query d as follows, letting $u^d_j$ denote the decision associated with the jth response to d:
\[ \mathrm{EVOR}(d, \psi) = \mathbb{E}_{\omega \sim \psi}\Big[ \max_j V^{u^d_j}_{\omega} \Big] - V^*_{\psi}. \]
EVOR is the same as EVOI applied to decision queries, except that when the response is j, EVOR considers the expected value of executing decision $u^d_j$, whereas EVOI considers the expected value of executing the posterior Bayes-optimal decision $u^*_{\psi \mid d=j}$. Since the Bayes-optimal posterior decision when $u_j$ is the response to a decision query may not be $u_j$, EVOR is a lower bound for EVOI, i.e., for any k-response decision query $d \in D_k$,
\[ \mathrm{EVOR}(d, \psi) \le \mathrm{EVOI}(d, \psi). \tag{4.2} \]
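As a minimal numerical sketch of inequality (4.2), the Python snippet below computes EVOR and EVOI for a small discrete problem. The value table `V`, the belief `psi`, all helper names, and the tie-breaking rule for responses are illustrative assumptions, not part of the chapter. Responses are modelled as noiseless: the user names the queried decision with the highest value in the true world.

```python
# Illustrative sketch; V, psi, and every helper name here are assumptions.
# V[u][w] is the value of decision u in world w; psi[w] is the belief over worlds.

def expected_value(V, psi, u, worlds=None):
    """Belief-weighted (unnormalized) value of decision u over the given worlds."""
    worlds = range(len(psi)) if worlds is None else worlds
    return sum(psi[w] * V[u][w] for w in worlds)

def prior_value(V, psi):
    """Expected value of the prior Bayes-optimal decision."""
    return max(expected_value(V, psi, u) for u in range(len(V)))

def response(V, query, w):
    """Noiseless response in world w: index of the best queried decision (ties -> lowest index)."""
    return max(range(len(query)), key=lambda j: V[query[j]][w])

def evor(V, psi, query):
    """Value of executing the decision named in the response, minus the prior value."""
    recommended = sum(psi[w] * max(V[u][w] for u in query) for w in range(len(psi)))
    return recommended - prior_value(V, psi)

def evoi(V, psi, query):
    """Value of executing the posterior Bayes-optimal decision, minus the prior value."""
    total = 0.0
    for j in range(len(query)):
        # Worlds consistent with response j; summing psi-weighted values over them
        # equals P(response j) times the posterior expected value.
        worlds = [w for w in range(len(psi)) if response(V, query, w) == j]
        total += max(expected_value(V, psi, u, worlds) for u in range(len(V)))
    return total - prior_value(V, psi)

V = [[4, 0, 0], [0, 4, 0], [0, 0, 4]]   # decision u pays off only in world u
psi = [0.25, 0.25, 0.5]

print(evor(V, psi, (0, 1)), evoi(V, psi, (0, 1)))   # 0.0 1.0
print(evor(V, psi, (0, 2)), evoi(V, psi, (0, 2)))   # 1.0 1.0
```

In this toy instance the query about decisions 0 and 1 has EVOR 0 but EVOI 1, because response 0 leaves worlds 0 and 2 possible and decision 2 is then Bayes-optimal. For the query about decisions 0 and 2, each response names a posterior Bayes-optimal decision, so the two quantities coincide.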
Recall from Section 4.3 that starting with an arbitrary k-response query and recursively applying the query improvement procedure (presented below the proof of Theorem 4.2) eventually converges to a locally optimal k-response decision query, which has the property that when decision $u_j$ is the response, $u_j$ is the new Bayes-optimal decision. Letting $D_k^*$ denote the set of locally optimal k-response decision queries, this implies that
\[ \max_{d \in D_k^*} \mathrm{EVOI}(d, \psi) = \max_{d' \in D_k} \mathrm{EVOI}(d', \psi), \tag{4.3} \]
and that for any locally optimal k-response decision query d (i.e., $d \in D_k^*$),
\[ \mathrm{EVOR}(d, \psi) = \mathrm{EVOI}(d, \psi). \tag{4.4} \]
Combining these three facts, we have that
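\begin{align*}
\arg\max_{d \in D_k} \mathrm{EVOI}(d, \psi) &= \arg\max_{d \in D_k^*} \mathrm{EVOI}(d, \psi) \\
&= \arg\max_{d \in D_k^*} \mathrm{EVOR}(d, \psi) \\
&= \arg\max_{d \in D_k} \mathrm{EVOR}(d, \psi),
\end{align*}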
i.e., the EVOI-optimal k-response decision query can be computed by maximizing EVOR instead of EVOI. Furthermore, EVOR is monotonic and submodular in the set of decisions asked about, implying that EVOR maximization can be closely approximated by greedily constructing the query on the basis of EVOR (Nemhauser et al., 1978), and Equation 4.4 implies that the same procedure closely approximates EVOI maximization as well. Next I summarize the properties of the aforementioned greedy construction procedure compared to an exhaustive approach.
Let the computational complexity of executing a single Bayes update be O(B), where B is a measure of the size of the problem (which I leave undefined here because I will simply count how many such updates are performed by the different algorithms).
Exhaustive k-Response Decision Query Selection Algorithm. Exhaustively evaluate each possible k-response decision query and select the best one. This has computational complexity $O(|U|^k \, k \, B)$.
Greedy k-Response Decision Query Selection Algorithm. Approximate the EVOI-optimal k-response decision query by greedily constructing the set of k decisions it asks about as follows. Begin with the empty set, and on the first iteration, add the prior Bayes-optimal decision $u^*_\psi$ to the set. Then on each subsequent ith iteration, expand the set of i − 1 decisions added so far to include the decision that maximizes the EVOR of the i-response decision query that asks about those decisions. This algorithm enjoys the guarantee that the EVOI of the k-response decision query constructed is within a factor of $1 - \left(\frac{k-1}{k}\right)^k$ (at worst 63%) of EVOI-optimal (due to monotonicity and submodularity of EVOR, as explained above), and has computational complexity $O(k^2 |U| B)$.
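The two selection procedures above can be sketched in Python as follows. This is a toy illustration under stated assumptions, not the implementation of Viappiani and Boutilier (2010): the value table `V`, the belief `psi`, and the function names are hypothetical, worlds are finite, and the exhaustive search enumerates unordered sets of distinct decisions and scores them by EVOR (which, by the argument above, has the same maximizers as EVOI).

```python
from itertools import combinations

# Illustrative sketch; V[u][w] is the value of decision u in world w,
# psi[w] is the belief over worlds. All names are assumptions.

def expected_value(V, psi, u):
    return sum(p * V[u][w] for w, p in enumerate(psi))

def evor(V, psi, query):
    """EVOR of the decision query asking about the decisions in `query`."""
    prior_best = max(expected_value(V, psi, u) for u in range(len(V)))
    recommended = sum(p * max(V[u][w] for u in query) for w, p in enumerate(psi))
    return recommended - prior_best

def greedy_query(V, psi, k):
    """Start from the prior Bayes-optimal decision, then add the EVOR-maximizing decision k-1 times."""
    decisions = range(len(V))
    query = [max(decisions, key=lambda u: expected_value(V, psi, u))]
    while len(query) < k:
        candidates = [u for u in decisions if u not in query]
        query.append(max(candidates, key=lambda u: evor(V, psi, query + [u])))
    return tuple(query)

def exhaustive_query(V, psi, k):
    """Score every size-k set of decisions and keep the best one."""
    return max(combinations(range(len(V)), k), key=lambda q: evor(V, psi, q))

V = [[4, 0, 0], [0, 4, 0], [0, 0, 4]]
psi = [0.25, 0.25, 0.5]

g = greedy_query(V, psi, 2)
e = exhaustive_query(V, psi, 2)
print(g, evor(V, psi, g))
print(e, evor(V, psi, e))
```

Each greedy iteration scores at most |U| candidate queries, whereas the exhaustive search scores a number of queries that grows exponentially in k, which is the gap the complexity bounds above describe.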
4.5.2 Nonmyopic k-Response Query Selection
Although Viappiani and Boutilier (2010) do not discuss algorithms for nonmyopic query selection, the same algorithms as described above can be applied in the nonmyopic setting by exploiting the theoretical results provided by this chapter.
Namely, combining Theorem 4.2 with Theorem 4.6 implies that the EVOI-optimal depth-n k-response query tree can be computed in two steps: (1) compute the EVOI-optimal $k^n$-response decision query $q^*$; (2) construct a depth-n k-response decision-set query tree $\mu^*$ yielding the same EVOI as $q^*$.
Working backwards, step (2) can be implemented by the procedure described in the proof of Lemma 4.4, which involves computing any size-k partition of the $k^n$ decisions queried by $q^*$ for all $k^n$ nodes of the tree. Since each of these $k^n$ computations is $O(k^n)$, step (2) has complexity $O(k^{2n})$. Implementing step (1) by either the exhaustive or the greedy algorithm above, then, yields the two algorithms below for depth-n k-response decision-set query tree selection.
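One way to picture step (2) is the recursive partitioning sketched below in Python. It is only an illustration under assumptions: the nested-dictionary tree representation, the function names, and the choice of consecutive blocks as the partition are hypothetical, and the procedure in the proof of Lemma 4.4 may differ in its details. Each internal node splits the decisions it still considers into k blocks, one per response, and the child reached by a response keeps only that block.

```python
# Illustrative sketch; the tree representation and names are assumptions.

def build_tree(decisions, k):
    """Recursively split `decisions` into k consecutive blocks, one child per block."""
    decisions = list(decisions)
    if len(decisions) <= 1:
        return {"decisions": decisions, "children": []}
    block = -(-len(decisions) // k)  # ceiling division: size of each block
    children = [build_tree(decisions[i:i + block], k)
                for i in range(0, len(decisions), block)]
    return {"decisions": decisions, "children": children}

def depth(node):
    """Number of queries asked along the longest root-to-leaf path."""
    if not node["children"]:
        return 0
    return 1 + max(depth(child) for child in node["children"])

def leaves(node):
    """Decision sets remaining at the leaves, from left to right."""
    if not node["children"]:
        return [node["decisions"]]
    return [leaf for child in node["children"] for leaf in leaves(child)]

k, n = 2, 3
tree = build_tree(range(k ** n), k)   # stand-in for the k^n decisions of q*
print(depth(tree))
print(leaves(tree))
```

With k = 2 and n = 3, the eight decisions are separated after three binary responses, so every leaf holds exactly one decision, mirroring how a depth-n tree of k-response queries can distinguish $k^n$ decisions.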
Exhaustive Depth-n k-Response Decision-set Query Tree Selection Algorithm. This algorithm implements step (1) using the exhaustive k-response decision query selection algorithm above, and so its computational complexity is $O(|U|^{k^n} \, k^n \, B)$.
Greedy Depth-n k-Response Decision-set Query Tree Selection Algorithm. This algorithm approximates step (1) using the greedy k-response decision query selection algorithm described above, and so it has computational complexity $O(k^{2n} |U| B)$ while offering the guarantee that the EVOI of the query tree computed is within a factor of $1 - \left(\frac{k^n - 1}{k^n}\right)^{k^n}$ (again, at worst 63%) of EVOI-optimal.
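To make the "at worst 63%" figure concrete, the short Python sketch below evaluates the guarantee for a few tree shapes. The function name and the sampled (k, n) pairs are arbitrary illustrative choices; the only fact used is that the factor decreases toward 1 − 1/e as the number of responses grows.

```python
import math

def greedy_guarantee(m):
    """Worst-case fraction of optimal EVOI for a greedily built m-response query."""
    return 1 - ((m - 1) / m) ** m

limit = 1 - 1 / math.e  # value the guarantee approaches as m grows

for k, n in [(2, 1), (2, 2), (3, 2), (4, 3)]:
    m = k ** n  # number of responses of the equivalent decision query
    print(f"k={k}, n={n}, responses={m}, guarantee={greedy_guarantee(m):.4f}")
print(f"limit: {limit:.4f}")
```

The guarantee is 0.75 for a 2-response query and shrinks as $k^n$ grows, but it never drops below roughly 0.632.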
Thus, the computational problem of selecting an EVOI-optimal depth-n k-response query tree can be reduced to selecting an EVOI-optimal $k^n$-response decision query.