Summary - Improving cost and probability estimates using interaction

This chapter presented a review of several key concepts in classical planning, planning-based goal recognition, and probabilistic planning that are used in subsequent chapters. In particular, it offered an in-depth review of the planning problem, planning by heuristic search, and state-of-the-art heuristic estimators, because of their significant influence on the current work. It then introduced the problem of goal recognition, a new active area in automated planning that is also addressed in this dissertation. Finally, it reviewed the probabilistic planning problem and mentioned the most notable techniques used to solve it.

Cost Estimates in a Plan Graph using Interaction

This chapter starts by describing the classical cost propagation process in a plan graph and its use for computing cost estimates. It then presents a novel technique for propagating cost through a plan graph that computes more accurate cost estimates, and proposes a family of heuristic functions based on this technique. Finally, it evaluates the accuracy of the resulting family of heuristics and their application in classical planning.

3.1 Classical cost propagation in a plan graph

Simple propagation of cost estimates in a plan graph is a technique that has been used in a number of planning systems (Bonet et al., 1997; Nguyen et al., 2002; Do and Kambhampati, 2002). The computation of cost estimates starts from the initial conditions and works progressively forward through each successive layer of the plan graph. For level 0 it is assumed that the cost of the propositions at this level is zero. With this assumption, the propagation starts by computing the cost of the actions at level zero.

In general, the cost of performing an action a at level l with a set of preconditions P_a is equal to the cost of achieving its preconditions. This may be computed in two different ways:

1. Maximization: the cost of an action is equal to the cost of reaching its costliest precondition:

\[ \mathit{cost}(a) = \max_{x_i \in P_a} \mathit{cost}(x_i) \tag{3.1} \]

2. Summation: the cost of an action is equal to the cost of reaching all its preconditions:

\[ \mathit{cost}(a) = \sum_{x_i \in P_a} \mathit{cost}(x_i) \tag{3.2} \]

The first method allows for dependence among the preconditions of an action. It yields an admissible estimate, since it never overestimates the cost of an action. The second method, on the other hand, assumes independence among all preconditions of an action. Although the resulting heuristic is not admissible, it is typically more informative in practice. In this thesis, we use the Summation technique of Equation 3.2 to estimate the cost of an action.
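The two ways of computing the cost of an action can be sketched in a few lines of Python. This is a minimal illustration rather than the implementation used in this thesis: the function names and the dictionary representation of proposition costs are assumptions, and the sample values are the ones that appear for (scan pkg trk) in the Logistics example of this section.

```python
from typing import Dict, Iterable


def action_cost_max(preconditions: Iterable[str], prop_cost: Dict[str, float]) -> float:
    """Equation 3.1: cost of the costliest precondition (0 if there are none)."""
    return max((prop_cost[p] for p in preconditions), default=0)


def action_cost_sum(preconditions: Iterable[str], prop_cost: Dict[str, float]) -> float:
    """Equation 3.2: sum of the costs of all preconditions."""
    return sum(prop_cost[p] for p in preconditions)


# Proposition costs taken from the Logistics example in this section.
prop_cost = {"in trk pkg": 1, "verified pkg trk": 2}
scan_preconditions = ["in trk pkg", "verified pkg trk"]

print(action_cost_max(scan_preconditions, prop_cost))  # 2
print(action_cost_sum(scan_preconditions, prop_cost))  # 3
```

The two estimates differ whenever an action has more than one precondition with non-zero cost.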

The cost of achieving a proposition x at level l, achieved by the actions A_x at the preceding level, is the minimum cost among all a ∈ A_x. It is defined as:

\[ \mathit{cost}(x) = \min_{a \in A_x} \{ \mathit{cost}(a) + \mathit{Cost}(a) \} \tag{3.3} \]

where Cost(a) is the cost of applying action a, and cost(a) is given by Equation 3.1 or Equation 3.2.
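As a small sketch of Equation 3.3, the function below takes the minimum over the achievers of a proposition. The names `proposition_cost`, `reach_cost` (for cost(a)) and `exec_cost` (for Cost(a)) are hypothetical, as are the two achievers a1 and a2 and their values, which are chosen only to show that the cheapest achiever need not be the one with the cheapest preconditions or the cheapest execution cost alone.

```python
from typing import Dict, Iterable


def proposition_cost(
    achievers: Iterable[str],
    reach_cost: Dict[str, float],
    exec_cost: Dict[str, float],
) -> float:
    """Equation 3.3: cheapest way to obtain a proposition through one of its achievers."""
    return min(reach_cost[a] + exec_cost[a] for a in achievers)


# Hypothetical proposition with two achievers, a1 and a2.
reach_cost = {"a1": 3, "a2": 1}  # cost(a): cost of reaching the preconditions of a
exec_cost = {"a1": 1, "a2": 2}   # Cost(a): cost of applying a

print(proposition_cost(["a1", "a2"], reach_cost, exec_cost))  # min(3 + 1, 1 + 2) = 3
```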

Figure 3.1 shows the first layers of the plan graph for the simple Logistics problem shown in Figure 2.1. The numbers above propositions and actions are the costs associated with each one computed during the cost propagation process. The highlighted costs are the ones computed below as an example. In particular, at level 0 the cost for action (load pkg a trk) is zero since its preconditions are true in the initial state, so the cost of its effect (in trk pkg) is one at level 1:

cost(load pkg a trk) = cost(at a pkg) + cost(at a trk) = 0 + 0 = 0

cost(in trk pkg) = cost(load pkg a trk) + Cost(load pkg a trk) = 0 + 1 = 1

In the next action layer, the cost for (verify pkg a trk) is 1, the sum of the costs of its preconditions, so its effect (verified pkg trk) has a cost of 2:

cost(verify pkg a trk) = cost(at a trk) + cost(in trk pkg) = 0 + 1 = 1

cost(verified pkg trk) = cost(verify pkg a trk) + Cost(verify pkg a trk) = 1 + 1 = 2

P0: (at a pkg) 0, (at a trk) 0
A0: (load pkg a trk) 0
P1: (at a pkg) 0, (at a trk) 0, (in trk pkg) 1
A1: (load pkg a trk) 0, (verify pkg a trk) 1
P2: (at a trk) 0, (at a pkg) 0, (in trk pkg) 1, (verified pkg trk) 2

Figure 3.1: Example of a classical cost propagation in a plan graph.

Figure 3.2 shows the continuation of the plan graph in Figure 3.1 for the simple Logistics problem. At level 2 we know by intuition that the cost of actions (scan pkg trk) and (drive trk a b) is two. However, the sum of each action’s preconditions gives a cost of three:

cost(scan pkg trk) = cost(in trk pkg) + cost(verified pkg trk) = 1 + 2 = 3

cost(drive trk a b) = cost(at a trk) + cost(in trk pkg) + cost(verified pkg trk) = 0 + 1 + 2 = 3

P2: (at a trk) 0, (at a pkg) 0, (in trk pkg) 1, (verified pkg trk) 2
A2: (load pkg a trk) 0, (verify pkg a trk) 1, (scan pkg trk) 3, (drive trk a b) 3
P3: (at a trk) 0, (at a pkg) 0, (in trk pkg) 1, (verified pkg trk) 2, (scanned pkg trk) 4, (at b trk) 4
A3: (load pkg a trk) 0, (verify pkg a trk) 1, (scan pkg trk) 3, (drive trk a b) 3, (unload pkg trk b) 8
P4: (at a trk) 0, (at a pkg) 0, (in trk pkg) 1, (verified pkg trk) 2, (scanned pkg trk) 4, (at b trk) 4, (at b pkg) 9

Figure 3.2: Example of a classical cost propagation in a plan graph (continued).

The problem here is that the proposition (verified pkg trk) is not independent of the proposition (in trk pkg), since the action (verify pkg a trk) has (in trk pkg) as a precondition. Therefore, there is synergy between the two propositions that is not considered in simple cost propagation using the Summation heuristic.
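The gap between the intuitive cost of two and the summed cost of three can be checked with a small exhaustive search. The sketch below is only an illustration under stated assumptions: the action definitions are reconstructed from the preconditions and effects mentioned in this section, delete effects are omitted, every action has cost 1, and the helper `joint_cost` is a hypothetical name. It uses uniform-cost search over sets of propositions to find the cheapest way of making a whole set of propositions true at once.

```python
import heapq
from itertools import count

# Assumed action definitions: name -> (preconditions, effects, Cost(a)).
ACTIONS = {
    "load pkg a trk": ({"at a pkg", "at a trk"}, {"in trk pkg"}, 1),
    "verify pkg a trk": ({"at a trk", "in trk pkg"}, {"verified pkg trk"}, 1),
}
INITIAL = frozenset({"at a pkg", "at a trk"})


def joint_cost(goal, initial=INITIAL, actions=ACTIONS):
    """Cheapest total action cost that makes every proposition in `goal` true."""
    goal = frozenset(goal)
    tie = count()  # tie-breaker so heap entries are ordered by cost, then insertion
    frontier = [(0, next(tie), initial)]
    best = {initial: 0}
    while frontier:
        cost, _, state = heapq.heappop(frontier)
        if goal <= state:
            return cost
        if cost > best.get(state, float("inf")):
            continue  # stale heap entry
        for pre, eff, c in actions.values():
            if pre <= state:
                nxt = state | eff  # delete effects are ignored (assumption)
                if cost + c < best.get(nxt, float("inf")):
                    best[nxt] = cost + c
                    heapq.heappush(frontier, (cost + c, next(tie), nxt))
    return float("inf")


separate = joint_cost({"in trk pkg"}) + joint_cost({"verified pkg trk"})
together = joint_cost({"in trk pkg", "verified pkg trk"})
print(separate)  # 3, the value used by the Summation heuristic
print(together)  # 2, because (load pkg a trk) is shared by both propositions
```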

Continuing with the simple cost propagation for the current example, the cost of propositions (scanned pkg trk) and (at b trk) at level 3 is:

cost(scanned pkg trk) = cost(scan pkg trk) + Cost(scan pkg trk) = 3 + 1 = 4

cost(at b trk) = cost(drive trk a b) + Cost(drive trk a b) = 3 + 1 = 4

This propagation results in (unload pkg trk b) having the cost:

cost(unload pkg trk b) = cost(scanned pkg trk) + cost(at b trk) = 4 + 4 = 8

Taking the above calculations into consideration, a plan graph is built in the same way that an ordinary plan graph is created. The construction process finishes when two consecutive proposition layers are identical and there is quiescence in cost. Quiescence is reached when the cost of each proposition and action in the plan graph no longer changes. On completion, each possible goal proposition has an estimated cost of being achieved.
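The whole propagation process for the Logistics example can be sketched as follows. This is a simplified illustration, not the thesis implementation: it omits mutexes and delete effects, assumes that every action has cost 1, and reconstructs the action definitions from the preconditions and effects used in the calculations above (the preconditions of (unload pkg trk b) in particular are inferred from the values in Figure 3.2). The function name `propagate` and its `aggregate` parameter are hypothetical.

```python
from math import inf

# Assumed action definitions: name -> (preconditions, effects, Cost(a)).
ACTIONS = {
    "load pkg a trk": (["at a pkg", "at a trk"], ["in trk pkg"], 1),
    "verify pkg a trk": (["at a trk", "in trk pkg"], ["verified pkg trk"], 1),
    "scan pkg trk": (["in trk pkg", "verified pkg trk"], ["scanned pkg trk"], 1),
    "drive trk a b": (["at a trk", "in trk pkg", "verified pkg trk"], ["at b trk"], 1),
    "unload pkg trk b": (["scanned pkg trk", "at b trk"], ["at b pkg"], 1),
}


def propagate(initial, actions, aggregate=sum):
    """Propagate costs layer by layer until no proposition cost changes."""
    prop_cost = {p: 0 for p in initial}  # level 0: initial propositions cost zero
    action_cost = {}
    while True:
        # Propositions persist to the next layer at their current cost.
        next_cost = dict(prop_cost)
        for name, (pre, eff, c) in actions.items():
            if all(p in prop_cost for p in pre):  # action is applicable at this level
                action_cost[name] = aggregate(prop_cost[p] for p in pre)
                for e in eff:
                    # Equation 3.3: keep the cheapest achiever.
                    next_cost[e] = min(next_cost.get(e, inf), action_cost[name] + c)
        if next_cost == prop_cost:  # identical layers and quiescence in cost
            return prop_cost, action_cost
        prop_cost = next_cost


prop_cost, action_cost = propagate(["at a pkg", "at a trk"], ACTIONS)
print(action_cost["unload pkg trk b"])  # 8
print(prop_cost["at b pkg"])            # 9
```

Under the same assumptions, passing `aggregate=max` applies Equation 3.1 instead of Equation 3.2.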

In Section 2.1, we showed that the optimal cost for the current example is 5. Using the simple cost propagation approach, the goal proposition (at b pkg) has a cost of 9 when the propagation finishes. That is:

cost(at b pkg) = cost(unload pkg trk b) + Cost(unload pkg trk b) = 8 + 1 = 9

This cost overestimates the optimal cost because of the assumption of independence among all the preconditions of an action. In the next section, we present a method that estimates the degree of dependence between pairs of propositions and pairs of actions in a plan graph, and thus computes more accurate estimates of cost.
