

15.4. Objective Functions

Proposition 15.1. Let $F$ be a set of features for a TNF planning task Π with operators $O$ and variables $V$. Let $\{\sigma_o\}_{o \in O}$ be an indexed collection of orderings on $V$. The set of admissible and consistent potential heuristics over features $F$ can be characterized by a set of $O(|O||V|d^{w+1})$ linear constraints over $O(|F| + |O||V|d^{w})$ variables, where $d$ bounds the size of the variable domains, and $w$ is the maximal induced width of $G(\Pi, F, o)$ along $\sigma_o$ for any $o \in O$.

Since the optimal induced width of a graph is its treewidth, we can conclude the following fixed-parameter tractability result.

Corollary 15.3. Constructing linear constraints that characterize the set of admissible and consistent potential heuristics for a set of features F is fixed-parameter tractable with the parameter max(d, w), where d bounds the size of the variable domains, and w is the maximal treewidth of the context-dependency graph for an operator.

If we restrict the set of features in a way that the treewidth of context-dependency graphs is limited by a constant, there is always a polynomial-sized set of linear constraints characterizing admissible and consistent potential heuristics. However, finding this set depends on using a good variable order in the bucket elimination algorithms, and finding a variable order that minimizes the induced width of a graph is an NP-hard problem in itself (Dechter, 2003).
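To make the notion of induced width along an ordering concrete, the following is a minimal sketch of the standard computation: process the vertices from last to first and connect all earlier neighbors of each processed vertex. The function name `induced_width` and the edge-list representation are assumptions made for this illustration and do not come from the text.

```python
def induced_width(edges, order):
    """Return the induced width of an undirected graph along `order`.

    `edges` is an iterable of vertex pairs and `order` lists every vertex once.
    Both the name and the representation are hypothetical.
    """
    position = {v: i for i, v in enumerate(order)}
    neighbors = {v: set() for v in order}
    for u, v in edges:
        neighbors[u].add(v)
        neighbors[v].add(u)

    width = 0
    # Process vertices from last to first in the ordering.
    for v in reversed(order):
        # Parents are the neighbors that come earlier in the ordering.
        parents = {u for u in neighbors[v] if position[u] < position[v]}
        width = max(width, len(parents))
        # Connect all parents with each other (the induced edges).
        for a in parents:
            for b in parents:
                if a != b:
                    neighbors[a].add(b)
    return width


path = [("a", "b"), ("b", "c")]
print(induced_width(path, ["a", "b", "c"]))  # good order for the path
print(induced_width(path, ["a", "c", "b"]))  # worse order for the same path
print(induced_width([], ["a", "b", "c"]))    # graph without edges
```

The two calls on the same path graph show that the induced width depends on the chosen order, which is why a good variable order matters. A graph without edges has induced width 0 along every order.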

Let us apply Proposition 15.1 when $F$ is a set of binary features. In this case, $G(\Pi, F, o)$ has no edges for all $o \in O$ and thus its induced width along any order is 0. Proposition 15.1 asserts that the number of constraints is $O(|O||V|d^{w+1}) = O(|O||V|d)$, while their size is $O(|F| + |V|d^{w}) = O(|F| + |V|)$, agreeing with Corollary 15.2.

15.4. Objective Functions

For certain sets of features $F$, we can characterize admissible and consistent potential functions over $F$ with a compact set of linear constraints $C_F$. Given any objective function obj, the weights that maximize obj subject to $C_F$ describe a “best” heuristic function according to obj. If obj is a linear combination of feature weights, we can use an LP solver to come up with such weights. But how do we define “best”?

We now explore different ways to measure heuristic quality. Most of them rely on the observation that the heuristic value of a state $s$ is a linear combination of feature weights:

\[
h(s) = \sum_{f \in F} w(f)\,[s \models f]
\]
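The sum above can be evaluated directly. The following is a minimal sketch, assuming a feature is represented as a tuple of (variable, value) pairs and a state as a dictionary from variables to values. The names `holds` and `potential_heuristic` and the example weights are hypothetical and chosen only for illustration.

```python
def holds(feature, state):
    """Return True if the state agrees with every (variable, value) pair of the feature."""
    return all(state[var] == val for var, val in feature)


def potential_heuristic(weights, state):
    """Sum the weights of all features that are true in the state."""
    return sum(w for feature, w in weights.items() if holds(feature, state))


# Hypothetical weights for two atomic features and one binary feature.
weights = {
    (("X", 0),): 2,
    (("Y", 1),): 3,
    (("X", 0), ("Y", 1)): -1,
}

print(potential_heuristic(weights, {"X": 0, "Y": 1}))  # all three features hold
print(potential_heuristic(weights, {"X": 0, "Y": 0}))  # only the feature X=0 holds
```

Because the heuristic value is linear in the weights, the coefficient of each weight in h(s) is simply 1 if the feature holds in s and 0 otherwise.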

For admissible heuristics, higher heuristic values are generally better. We can easily find a potential heuristic with the highest possible value for the initial state $s_I$ by maximizing $h(s_I)$ subject to $C_F$. The resulting heuristic, which we call $h^{\mathrm{pot}}_{F, s_I}$, has some interesting theoretical properties, which we discuss in Chapter 16.

One obvious disadvantage of maximizing the heuristic value of only one state is that there is no incentive to optimize potentials of features that do not occur in the state. We therefore introduce alternative objective functions that consider more than one state.

Instead of maximizing the heuristic value of a single state, we can maximize the average heuristic value of multiple states. In general, the average heuristic value for any set of states S is a weighted sum over potentials:

\[
\frac{1}{|S|} \sum_{s \in S} h(s) = \frac{1}{|S|} \sum_{s \in S} \sum_{f \in F} w(f)\,[s \models f]
\]

Note that we can generally drop any positive linear transformation of the objective function, such as the constant factor $1/|S|$, since we are not interested in the optimal objective value itself. It makes no difference if we optimize the average heuristic value, or the sum of heuristic values, in $S$. For example, if $S$ is a reasonably small set of sampled states, the coefficient of $w(f)$ can be set to the number of states in which $f$ is true, so that all coefficients remain integers. If we consider the set of all states $S$, the number of states in which a feature is true can become very large. To avoid numeric problems, we can use the proportion of states in which the feature is true instead. If $\mathit{dom}(\mathit{vars}(f))$ is the product of $\mathit{dom}(V)$ for all $V \in \mathit{vars}(f)$, then for every partial variable assignment of variables $V \setminus \mathit{vars}(f)$, there is one state in which $f$ is true and $|\mathit{dom}(\mathit{vars}(f))| - 1$ states in which it is not. The proportion of states where $f$ is true is thus $\frac{1}{|\mathit{dom}(\mathit{vars}(f))|}$ and we get:

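\begin{align*}
\frac{1}{|S|} \sum_{s \in S} h(s)
  &= \frac{1}{|S|} \sum_{s \in S} \sum_{f \in F} w(f)\,[s \models f] \\
  &= \sum_{f \in F} \frac{\sum_{s \in S} [s \models f]}{|S|}\, w(f) \\
  &= \sum_{f \in F} \frac{|\{s \in S \mid s \models f\}|}{|S|}\, w(f) \\
  &= \sum_{f \in F} \frac{1}{|\mathit{dom}(\mathit{vars}(f))|}\, w(f)
\end{align*}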
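The two choices of objective coefficients described above (integer counts for a sample, proportions for the set of all states) can be sketched as follows. The feature representation (a tuple of (variable, value) pairs), the function names, and the example domains are assumptions for illustration only. The last lines check the closed form against explicit enumeration on a small example.

```python
from fractions import Fraction
from itertools import product


def holds(feature, state):
    """Return True if the state agrees with every (variable, value) pair of the feature."""
    return all(state[var] == val for var, val in feature)


def sample_coefficients(features, samples):
    """Integer coefficient of w(f): number of sampled states in which f is true."""
    return [sum(1 for s in samples if holds(f, s)) for f in features]


def all_states_coefficients(features, domains):
    """Coefficient of w(f) over all states: 1 / |dom(vars(f))|."""
    coefficients = []
    for f in features:
        size = 1
        for var, _ in f:
            size *= len(domains[var])
        coefficients.append(Fraction(1, size))
    return coefficients


# Hypothetical example task with two variables.
domains = {"X": [0, 1], "Y": [0, 1, 2]}
features = [(("X", 0),), (("X", 1), ("Y", 2))]

# Enumerate all states and compare the measured proportions to the closed form.
variables = sorted(domains)
all_states = [dict(zip(variables, values))
              for values in product(*(domains[v] for v in variables))]
proportions = [Fraction(count, len(all_states))
               for count in sample_coefficients(features, all_states)]
print(proportions == all_states_coefficients(features, domains))
```

Exact fractions are used here only to make the comparison exact. An LP solver would typically receive these coefficients as floating-point numbers.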

Maximizing this function subject to $C_F$ yields an admissible potential heuristic over $F$ with the highest possible average heuristic value, but there are two problems in practice.

First, if $S$ contains dead ends, heuristic values can become arbitrarily large and the linear program can become unbounded. This can even happen if all dead ends are unreachable. When the linear program is unbounded, we usually cannot extract a useful heuristic function. Unfortunately, the LP is not always unbounded if $S$ contains a dead end, so we cannot use this to test for dead ends in a set of states. Figure 15.1 shows an example of a task where the LP is bounded even though there are dead ends. In this task the sum of heuristic values is two times the weight of each atomic feature since every atom occurs in exactly two states. On the other hand, the heuristic values of $s_{0,0}$ and $s_{1,1}$ also add up to the sum over all feature weights, and those states must have a finite value if the heuristic is admissible. Thus, the optimal objective value must be bounded by

\[
\sum_{s \in S} h_w(s) = 2 \sum_{f \in F} w(f) = 2\,(h_w(s_{0,0}) + h_w(s_{1,1})) < \infty.
\]

[Figure 15.1: state space diagram with the states $s_{0,0}$, $s_{0,1}$, $s_{1,0}$, and $s_{1,1}$.]

Figure 15.1: Example for a task where the set of all states $S$ contains dead ends, but the LP maximizing the average heuristic value over all atomic features is bounded. There are two variables $X$ and $Y$, each with the domain $\{0, 1\}$. A state $s_{x,y}$ assigns $x$ to $X$ and $y$ to $Y$. Independently of the dead ends’ reachability, the average heuristic value is bounded by finite heuristic values.
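The identity used in this argument can be checked numerically for the task with two binary variables X and Y and atomic features. The sketch below assumes arbitrary example weights and a hypothetical helper `check_identity`. It does not model the transitions of the task, only the counting argument.

```python
from itertools import product


def check_identity(weights):
    """Return the three quantities that the argument claims are equal.

    `weights` maps each atom (variable, value) to its weight.
    """
    def h(x, y):
        # Heuristic value of state s_{x,y} under atomic features.
        return weights[("X", x)] + weights[("Y", y)]

    sum_over_states = sum(h(x, y) for x, y in product((0, 1), repeat=2))
    twice_sum_of_weights = 2 * sum(weights.values())
    twice_diagonal = 2 * (h(0, 0) + h(1, 1))
    return sum_over_states, twice_sum_of_weights, twice_diagonal


# Arbitrary example weights (an assumption for illustration).
example_weights = {("X", 0): 3, ("X", 1): -1, ("Y", 0): 1, ("Y", 1): 2}
print(check_identity(example_weights))
```

All three returned values coincide for any choice of weights because every atom is true in exactly two of the four states, and the states s0,0 and s1,1 together contain every atom exactly once.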

Second, the heuristic values of unreachable states influence the solution. This is problematic since unreachable states are never encountered during the search. Thus it is pointless to optimize their heuristic value. Moreover, they can be fundamentally different from reachable states. For example, an invariant analysis can detect atoms that can never occur together in a reachable state, i.e. they are mutex. The set of all states S also includes states that violate such mutex information, and maximizing its average heuristic value requires considering them. We would like to only consider reachable states, but we cannot characterize this set of states concisely. (If this could be done efficiently, it would also present an efficient test for plan existence as we could use it to check if the goal is reachable.) While we could exclude all states that violate a single mutex, excluding multiple (potentially overlapping) mutexes is more complicated.

The negative influence of dead ends can be handled to some extent by introducing an upper bound M on each potential. If w(f) ≤ M for all f ∈ F, then the heuristic value of each state is also limited by |F|M. When optimizing for an individual state s, unlimited weights may identify s as a dead end (if the LP is unbounded). This is no longer the case when weights are limited, but limiting them has the benefit that we can maximize the average heuristic value of any set of states and are always able to extract a heuristic function. If M is large enough, “recognized” dead ends have heuristic values higher than h∗(sI) and are never expanded.
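Written out, the bounded optimization problem for a set of states $S$ might look as follows. This is a sketch in which the symbol $c_f$ for the objective coefficient of $w(f)$ is an assumption introduced here and does not appear in the text.

```latex
\begin{align*}
\text{maximize}   \quad & \sum_{f \in F} c_f \, w(f)
    && \text{with } c_f = |\{ s \in S \mid s \models f \}| \\
\text{subject to} \quad & C_F, \\
                        & w(f) \le M && \text{for all } f \in F.
\end{align*}
% Since every c_f is non-negative and every w(f) is at most M,
% the objective value is at most M \sum_{f \in F} c_f, so the LP is bounded.
% For each state s: h(s) = \sum_{f \in F} w(f) [s \models f] \le |F| M.
```

Because all coefficients $c_f$ are non-negative, the additional constraints bound the objective from above, which is why a heuristic function can always be extracted.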

15. Admissible and Consistent Potential Heuristics

Ignoring unreachable states is a tougher problem. Ideally, we would like to maximize the heuristic values of reachable states and completely ignore all unreachable states, but even detecting whether a state is unreachable is as hard as planning itself. As an approximation, it is possible to randomly sample states from the reachable part of the state space and maximize the average heuristic value of these samples. If the sampled states accurately represent the reachable state space, a potential function that maximizes their average heuristic value can generalize to other states.
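One simple way to obtain such samples is a random walk from the initial state, since every state visited this way is reachable by construction. The text does not prescribe a sampling scheme, so the following sketch is only one possible choice. The function name, the fixed walk length, and the toy transition table are all hypothetical.

```python
import random


def sample_reachable_states(initial, successors, num_samples, walk_length, rng):
    """Sample states by random walks from `initial`.

    `successors(state)` returns a list of successor states. Each walk stops
    early if it reaches a state without successors.
    """
    samples = []
    for _ in range(num_samples):
        state = initial
        for _ in range(walk_length):
            options = successors(state)
            if not options:
                break
            state = rng.choice(options)
        samples.append(state)
    return samples


# Hypothetical toy transition system over states (x, y); (1, 1) is unreachable.
transitions = {
    (0, 0): [(0, 1)],
    (0, 1): [(0, 0), (1, 0)],
    (1, 0): [],
}
samples = sample_reachable_states(
    (0, 0), lambda s: transitions.get(s, []), 20, 5, random.Random(0))
print(len(samples), set(samples) <= set(transitions))
```

The sampled states can then be turned into integer objective coefficients by counting, for each feature, the number of samples in which it is true.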

Seipp, Pommerening, and Helmert (2015) also experiment with functions that approximate search effort. Assume there is a function effort(h) that correctly predicts the number of nodes that have to be expanded with an A∗ search using heuristic h. Minimizing effort(h) subject to $C_F$ would yield the admissible potential heuristic over $F$ that is best suited to the particular search space. The problem, of course, is that effort(h) is not known in practice and is hard to approximate accurately. There are approximations for IDA∗ search (Korf, Reid, and Edelkamp, 2001) but they are based on assumptions that do not necessarily hold for A∗ search. A reasonably good approximation of effort(h) is also not likely to be a linear function of the feature weights of h. Seipp, Pommerening, and Helmert (2015) consider an approximation that is quadratic in the weights and find weights for atomic potential heuristics with quadratic programming. We will later see that simpler methods already achieve a heuristic quality very close to the theoretical limit for atomic potential heuristics. The additional time spent on solving a quadratic program thus does not pay off. However, on a theoretical level this remains an interesting question for future research.