6.2 Experiments
6.2.4 Computational Performance
A good query selection algorithm should not only find queries with high EVOI, but should do so at a reasonable computational cost. In Chapter 5 and throughout this dissertation, I have set the computational goal that a query selection algorithm should select a query without requiring computation time that scales with the product of optimal planning time and the size of the query set. DEER and DMVM, along with their greedy versions, accomplish this goal from a theoretical perspective; however, their first step of solving for d∗, which depends on optimal planning complexity, may incur significant cost, and they still must enumerate the query set during the projection step (in the absence of exploitable structure). In this section I determine how their empirical computational costs compare to those of alternative algorithms for askable decision query selection in the Boston Housing dataset.
I compare the algorithms discussed in this chapter according to how much time they empirically take to select a query, as a function of (a) the size of the askable query set and (b) the size of the underlying optimal planning problem. In the askable query selection problem studied in this chapter, I quantify these dimensions as the size of the askable decision set and the size of the doable decision set, respectively; the results are shown in Figures 6.11a and 6.11b. Note that in both cases the decision sets are sampled uniformly at random, as in Section 6.2.2.1.
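The timing protocol described above can be sketched as a small benchmark harness. This is only an illustrative sketch, not the code used for the experiments: the names `time_query_selection` and `first_pair_selector`, the trial count, and the integer stand-ins for decisions are all assumptions introduced here. It varies the askable set size while holding the doable set size fixed, resampling both sets uniformly at random on every trial.

```python
import random
import time
from statistics import mean


def time_query_selection(select_query, decision_pool, askable_sizes,
                         doable_size, trials=10, seed=0):
    """Return {askable set size: mean seconds per call to select_query}.

    select_query is any callable taking (askable, doable); it stands in
    for an algorithm such as DEER or DMVM (hypothetical interface).
    """
    rng = random.Random(seed)
    results = {}
    for n_ask in askable_sizes:
        elapsed = []
        for _ in range(trials):
            # Resample both decision sets uniformly at random for each trial.
            askable = rng.sample(decision_pool, n_ask)
            doable = rng.sample(decision_pool, doable_size)
            start = time.perf_counter()
            select_query(askable, doable)
            elapsed.append(time.perf_counter() - start)
        results[n_ask] = mean(elapsed)
    return results


def first_pair_selector(askable, doable):
    """Placeholder selector (assumption): returns the first two askable decisions."""
    return (askable[0], askable[1])


pool = list(range(100))  # integers standing in for decisions
timings = time_query_selection(first_pair_selector, pool,
                               askable_sizes=[2, 4, 8],
                               doable_size=20, trials=5)
```

Swapping the roles of the two sizes (fixing the askable set and varying `doable_size`) would give the second dimension of the comparison.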
First, consider how DEER and DMVM compare in computation time. DEER and DMVM scale the same asymptotically, since they share the same first step of solving for d∗ and both apply their projection criteria by enumerating the askable query set. However, Figure 6.11a shows that the overhead associated with DEER’s expected posterior d∗ response-entropy criterion is much higher than that of DMVM’s mistake region criterion, which is implemented here with a decision-boundary angle criterion (see Section 6.1.1 for details). In fact, in Figure 6.11a, DMVM’s computation time grows only negligibly as the askable decision set (and accordingly the askable decision query set) grows, because for DMVM the optimal-planning-related step of solving for d∗ dominates the projection step. Hence, for DMVM many more askable decision queries could conceivably be added to the set without significantly changing the computation time required to select a query.

[Figure 6.7 plots omitted. Panel titles: (a) "Random Decision Sets Experiment: Size of Askable Decision Set vs EVOI"; (b) "Random Decision Sets Experiment: Size of Askable Decision Set vs d* Entropy". Legend entries recovered: AskEnum-DoEval, DEER, DMVM, Random.]
Figure 6.7: Results for Experiment 6.1: (a) average EVOI of the query selected and (b) average posterior response entropy of d∗ induced by the query selected, as a function of the size of the askable decision set; the size of the doable decision set is fixed at 15. For each trial, both the askable and doable decision sets are uniformly sampled from the complete Boston housing decision set.

[Figure 6.8 plots omitted. Panel (a): "Random Decision Subsets Experiment: Size of Askable Decision Set vs EVOI"; x-axis: size of askable decision set; y-axis: EVOI of selected query; legend: AskEnum-DoEval, DEER, DMVM, DEER-Greedy, DMVM-Greedy, Random. Panel (b): "Random Decision Subsets Exp: Size of Askable Decision Set vs d* Entropy"; x-axis: size of askable decision set; y-axis: expected posterior d* response entropy; legend: AskEnum-DoEval, DEER, DMVM, Random.]
Figure 6.8: Results for Experiment 6.2: (a) average EVOI of the query selected and (b) average posterior response entropy of d∗ induced by the query selected, as a function of the size of the askable decision set; the size of the doable decision set is fixed at 15. For each trial the doable decision set is uniformly sampled from the complete Boston housing decision set, and the askable decision set is then sampled from the doable decision set.

[Figure 6.9 plots omitted. Panel titles: (a) "Risky Decisions Experiment: d* Askability vs EVOI"; (b) "Risky Decisions: d* Askability vs d* Posterior Entropy". Legend entries recovered: DoEnum-DoEval, AskEnum-DoEval, DMVM, DEER, Random.]
Figure 6.9: Results for Experiment 6.3: (a) average EVOI of the query selected and (b) average posterior response entropy of d∗ induced by the query selected, as a function of ρ, the scaling factor used for the chimerical features (the last two features) of each decision in the askable decision set. The doable decision set is obtained by starting with two specially constructed “risky decisions” and then uniformly sampling additional decisions from the complete Boston housing decision set, and the askable decision set is obtained by uniformly sampling from the complete Boston housing dataset and then scaling the chimerical features (the last two features of each decision) by ρ.

[Figure 6.10 plots omitted. Panel titles: (a) "Risky Decisions Experiment: Risk vs EVOI"; (b) "Risky Decisions Experiment: Risk vs d* Entropy"; x-axis: scaling of negative risky decision features. Legend entries recovered: DoEnum-DoEval, AskEnum-DoEval, DMVM, DEER, Random.]
Figure 6.10: Results for Experiment 6.4: (a) average EVOI of the query selected and (b) average posterior response entropy of d∗ induced by the query selected, as a function of τ, the scaling factor used for the negative decision features of the two risky decisions present in the doable decision set. The doable decision set is obtained by starting with two specially constructed “risky decisions”, scaling their negative features by τ, and then uniformly sampling additional decisions from the complete Boston housing decision set. The askable decision set is obtained by uniformly sampling from the complete Boston housing dataset and then scaling the chimerical features (the last two features of each decision) by ρ = 0.1.
The same relationship holds between DEER-Greedy and DMVM-Greedy, and both benefit from the reduced computational burden of the first step through its greedy construction approximation. The computational savings of this approximation are particularly salient in Figure 6.11b: the computation time of the greedy variants is nearly flat as a function of the size of the doable decision set, whereas that of the exact versions grows quadratically. In fact, DMVM-Greedy requires only between 10 and 50 milliseconds to select a query at all points shown.
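To make the two-step cost structure discussed above concrete, the following is a minimal sketch of an entropy-based projection under strong simplifying assumptions that are not taken from this chapter: a finite set of reward hypotheses with a prior, linear utilities over decision feature vectors, and noise-free responses that name the preferred decision in a query. All function names are hypothetical, and the sketch is not the implementation evaluated in the experiments. Step 1 solves for d∗ once per hypothesis (the planning-dependent part); step 2 enumerates the askable queries and scores each by the expected posterior entropy of d∗ given the response.

```python
import math
from collections import defaultdict


def utility(decision, w):
    """Linear utility of a decision's feature vector under weights w (assumption)."""
    return sum(x * y for x, y in zip(decision, w))


def best_index(decisions, w):
    """Index of the utility-maximizing decision under w (first one on ties)."""
    return max(range(len(decisions)), key=lambda i: utility(decisions[i], w))


def entropy(probs):
    """Shannon entropy in bits."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


def expected_posterior_entropy(query, d_star, hypotheses, prior):
    """Expected entropy of d* after observing the noise-free response to query."""
    joint = defaultdict(lambda: defaultdict(float))  # joint[response][d*]
    for w, p, d in zip(hypotheses, prior, d_star):
        joint[best_index(query, w)][d] += p
    total = 0.0
    for row in joint.values():
        p_r = sum(row.values())
        total += p_r * entropy([p / p_r for p in row.values()])
    return total


def select_query(queries, doable, hypotheses, prior):
    # Step 1 (planning-dependent): solve for d* once per hypothesis.
    d_star = [best_index(doable, w) for w in hypotheses]
    # Step 2 (projection): enumerate the askable queries and score each one.
    return min(queries,
               key=lambda q: expected_posterior_entropy(q, d_star, hypotheses, prior))


hypotheses = [(1.0, 0.0), (0.0, 1.0)]
prior = [0.5, 0.5]
doable = [(1.0, 0.0), (0.0, 1.0)]
informative = ((1.0, 0.0), (0.0, 1.0))    # response reveals which hypothesis holds
uninformative = ((1.0, 1.0), (0.0, 0.0))  # both hypotheses give the same response
chosen = select_query([uninformative, informative], doable, hypotheses, prior)
```

In this sketch step 1 is paid once regardless of how many queries are scored, which is why the cost of the projection criterion, rather than planning, determines how computation grows with the askable query set.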
When the askable decision set is small, AskEnum-AskEval’s approach of ignoring the doable decision set results in significant computational savings compared to AskEnum-DoEval, but this same trait causes its computation to compare unfavorably to the other algorithms, even the exact AskEnum-DoEval, once the askable decision set grows large enough. MEDER, which was implemented exactly, requires computation similar to that of AskEnum-DoEval, because computing decision entropy in the askable decision query setting requires roughly the same computation as optimal planning. In practice, this computation could be approximated heuristically, for example by considering only a sampled subset of the decisions when computing decision entropy (which would make its computation similar to that of DEER, depending on the size of the sampled set), or other uncertainty-based methods could be used. In this chapter, the purpose of MEDER is to serve as an example of an uncertainty-based method whose effectiveness in maximizing EVOI can be compared to that of DEER, without focusing on its efficient implementation.
In summary, this section provided the following straightforward computational results. While DEER had high computational costs when implemented exactly, approximating its first step greedily (DEER-Greedy) substantially improved its performance as a function of the size of the underlying decision problem. Approximating its second step through mistake region minimization (DMVM), whose implementation exploited structure in the askable decision query setting studied in this chapter, substantially improved its performance as a function of the size of the askable query set as well. Combining the two approximations (DMVM-Greedy) reduced the empirical time required to select a query by several orders of magnitude compared to DEER and AskEnum-DoEval.
6.3 Discussion
In this chapter I began by defining the askable decision query selection setting: an extension of the standard decision query selection setting in which the decisions the agent can ask about and the decisions the agent can execute can be different. Here I studied the EVOI-maximizing abilities and computational properties of AskEnum-AskEval (an algorithm that utilizes structure in this setting), MEDER, and the Wishful Query Projection (WQP) algorithms (DEER and DMVM). I then presented an empirical study that focused on comparing DEER’s performance in terms of query EVOI to that of MEDER, AskEnum-AskEval, and baselines, both to provide empirical evidence for the merit of using the WQP approach for query selection and to better understand what factors should be considered when determining whether WQP would be effective for query selection.

[Figure 6.11 plots omitted. y-axis: computation time (seconds) to select query; x-axis: (a) size of askable decision set, (b) size of doable decision set.]
Figure 6.11: Average computation time in seconds to select a query as a function of the size of the askable (a) and doable (b) decision set, where in (a) the size of the doable set is fixed at 20, and in (b) the size of the askable set is fixed at 20.
Confirming intuition from Chapter 5, the empirical study showed that, while the relative performance of the algorithms depended on myriad factors, practitioners should consider at least two key factors when assessing the viability of using DEER as a WQP algorithm for their query selection problem. First, the quality of queries selected by DEER can suffer when the askable query set is imbalanced in terms of obtainable information about decisions, so DEER is most applicable when the askable query set contains a wide range of queries. Second, the quality of queries selected by DEER can suffer when the askable query set is impoverished in terms of its highest contained query EVOI compared to the EVOI-optimal decision query, so DEER is most applicable when the askable query set consistently contains queries with close to the highest possible k-response query EVOI.
Practitioners should treat these only as guidelines, not strict guarantees, as they directly correlate with DEER’s performance only in the sense of upper bounds on the extent to which things can go wrong.