Ordered Landmarks in Planning


Jörg Hoffmann [email protected]
Max-Planck-Institut für Informatik,
Saarbrücken, Germany

Julie Porteous [email protected]
Department of Computer and Information Sciences,
The University of Strathclyde,
Glasgow, UK

Laura Sebastia [email protected]
Dpto. Sist. Informáticos y Computación,
Universidad Politécnica de Valencia,
Valencia, Spain

Abstract

Many known planning tasks have inherent constraints concerning the best order in which to achieve the goals. A number of research efforts have been made to detect such constraints and to use them for guiding search, in the hope of speeding up the planning process.

We go beyond the previous approaches by considering ordering constraints not only over the (top level) goals, but also over the sub-goals that will necessarily arise during planning. Landmarks are facts that must be true at some point in every valid solution plan. We extend Koehler and Hoffmann's definition of reasonable orders between top level goals to the more general case of landmarks. We show how landmarks can be found, how their reasonable orders can be approximated, and how this information can be used to decompose a given planning task into several smaller sub-tasks. Our methodology is completely domain- and planner-independent. The implementation demonstrates that the approach can yield significant runtime performance improvements when used as a control loop around state-of-the-art sub-optimal planning systems, as exemplified by FF and LPG.

1. Introduction

Given the inherent complexity of the general planning problem, it is clearly important to develop good heuristic strategies for both managing and navigating the search space involved in solving a particular planning instance. One way in which search can be informed is by providing hints concerning the order in which planning goals should be addressed. This can make a significant difference to search efficiency by helping to focus the planner on a progressive path towards a solution. Work in this area includes that of Koehler and Hoffmann (2000). They introduce the notion of reasonable orders, which states that a pair of goals A and B can be ordered so that B is achieved before A if it isn't possible to reach a state in which A and B are both true, from a state in which just A is true, without having to temporarily destroy A. In such a situation it is reasonable to achieve B before A to avoid unnecessary effort.


The main idea behind the work discussed in this paper is to extend those previous ideas on orders by not only ordering the (top level) goals, but also the sub-goals that will necessarily arise during planning, i.e., by also taking into account what we call the landmarks. The key feature of a landmark is that it must be true at some point on any solution path to the given planning task. Consider the Blocksworld task shown in Figure 1, which will be our illustrative example throughout the paper.

[Figure: initial state and goal of a Blocksworld task over the blocks A, B, C, and D]

Figure 1: Example Blocksworld task.

For the reader who is weary of seeing toy examples like the one in Figure 1 in the literature, we remark that our techniques are not primarily motivated by this example. Our techniques are useful in much more complex situations. We use the depicted toy example only for easy demonstration of some of the important points. In the example, clear(C) is a landmark because it will need to be achieved in any solution plan. Immediately stacking B on D from the initial state will achieve one of the top level goals of the task, but it will result in wasted effort if clear(C) is not achieved first. To order clear(C) before on(B D) is, however, not reasonable in terms of Koehler and Hoffmann's definition. First, clear(C) is not a top level goal, so it is not considered by Koehler and Hoffmann's techniques. Second, there are states where B is on D and from which clear(C) can be achieved without unstacking B from D again (compare the definition of reasonable orders given above). But reaching such a state requires unstacking D from C, and thus achieving clear(C), in the first place. This, together with the fact that clear(C) must be made true at some point, makes it sensible to order clear(C) before on(B D).

We propose a natural extension of Koehler and Hoffmann's definitions to the more general case of landmarks (trivially, all top level goals are landmarks, too). We also revise parts of the original definition to better capture the intuitive meaning of a goal ordering. The extended and revised definitions capture, in particular, situations of the kind demonstrated with clear(C) and on(B D) in the toy example above. We also introduce a new kind of ordering that often occurs between landmarks: A can be ordered before B if all valid solution plans make A true before they make B true. We call such orders necessary. Typically, a fact is a landmark because it is necessarily ordered before some other landmark. For example, clear(C) is necessarily ordered before holding(C), and holding(C) is necessarily ordered before the top level goal on(C A), in the above Blocksworld example.

Deciding if a fact is a landmark, and deciding about our ordering relations, is PSPACE-complete. We describe pre-processing techniques that extract landmarks, and that approximate necessary orders between them. We introduce sufficient criteria for the existence of reasonable orders between landmarks. The criteria are based on necessary orders, and inconsistencies between facts.¹ Using an inconsistency approximation technique from the literature, we approximate reasonable orders based on our sufficient criteria. After these pre-processes have terminated, what we get is a directed graph where the nodes are the found landmarks, and the edges are the orders found between them. We call this graph the landmark generation graph, short LGG. This graph may contain cycles because for some of our orders there is no guarantee that there is a plan, or even an action sequence, that obeys them.² Our method for structuring the search for a plan cannot handle cycles in the LGG, so we remove cycles by removing edges incident upon them. We end up with a polytree structure.³

Once turned into a polytree, the LGG can be used to decompose the planning task into small chunks. We propose a method that does not depend on any particular planning framework. The landmarks provide a search control loop that can be used around any base planner that is capable of dealing with STRIPS input. The search control does not preserve optimality, so there is not much point in using it around optimal planners such as Graphplan (Blum & Furst, 1997) and its relatives. Optimal planners are generally outperformed by sub-optimal planners anyway. It does make sense, however, to use the control in order to further improve the runtime performance of sub-optimal approaches to planning. To demonstrate this, we used the technique for control of two versions of FF (Hoffmann, 2000; Hoffmann & Nebel, 2001), and for control of LPG (Gerevini, Saetti, & Serina, 2003). We evaluated these planners across a range of 8 domains. We consistently obtain, sometimes dramatic, runtime improvements for the FF versions. We obtain runtime improvements for LPG in around half of the domains. The runtime improvement is, for all the planners, usually bought at the cost of slightly longer plans. But there are also some cases where the plans become shorter when using landmarks control.

The paper is organised as follows. Section 2 gives the basic notations. Section 3 defines what landmarks are, and in what relations between them we are interested. Exact computation of the relevant pieces of information is shown to be PSPACE-complete. Section 4 explains our approximation techniques, and Section 5 explains how we use landmarks to structure the search of an arbitrary base planner. Section 6 provides our empirical results in a range of domains. Section 7 closes the paper with a discussion of related work, of our contributions, and of future work. Most proofs are moved into Appendix A, and replaced in the text by proof sketches, to improve readability. Appendix B provides runtime distribution graphs as supplementary material to the tables provided in Section 6. Appendix C discusses some details regarding our experimental implementation of landmarks control around LPG.

² Also, none of our ordering relations is transitive. We stick to the word "order" only because it is the most intuitive word for constraints on the relative points in time at which planning facts can or should be achieved.

³ Removing edges incident on cycles might, of course, throw away useful ordering information. Coming up with other methods to treat cycles, or with methods that can exploit the information contained in

2. Notations

We consider sequential planning in the propositional STRIPS (Fikes & Nilsson, 1971) framework. In the following, all sets are assumed to be finite. A state $s$ is a set of logical facts (atoms). An action $a$ is a triple $a = (\mathit{pre}(a), \mathit{add}(a), \mathit{del}(a))$ where $\mathit{pre}(a)$ are the action's preconditions, $\mathit{add}(a)$ is its add list, and $\mathit{del}(a)$ is its delete list, each a set of facts. The result of applying (the action sequence consisting of) a single action $a$ to a state $s$ is:

$$
\mathit{Result}(s, \langle a \rangle) =
\begin{cases}
(s \cup \mathit{add}(a)) \setminus \mathit{del}(a) & \text{if } \mathit{pre}(a) \subseteq s \\
\text{undefined} & \text{otherwise}
\end{cases}
$$

The result of applying a sequence of more than one action to a state is recursively defined as $\mathit{Result}(s, \langle a_1, \ldots, a_n \rangle) = \mathit{Result}(\mathit{Result}(s, \langle a_1, \ldots, a_{n-1} \rangle), \langle a_n \rangle)$. Applying an empty action sequence changes nothing, i.e., $\mathit{Result}(s, \langle \rangle) = s$. A planning task $(A, I, G)$ is a triple where $A$ is a set of actions, and $I$ (the initial state) and $G$ (the goals) are sets of facts (we use the word "task" rather than "problem" in order to avoid confusion with the complexity-theoretic notion of decision problems). A plan, or solution, for a task $(A, I, G)$ is an action sequence $P \in A^*$ such that $G \subseteq \mathit{Result}(I, P)$.
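The notation above translates almost directly into code. The following is a minimal sketch, not part of the original implementation: the names `Action`, `result`, `is_plan`, and the one-action task `move` are illustrative assumptions, and `None` stands in for "undefined".

```python
from typing import FrozenSet, NamedTuple, Optional, Sequence

State = FrozenSet[str]


class Action(NamedTuple):
    """A STRIPS action a = (pre(a), add(a), del(a)); field names are illustrative."""
    pre: State
    add: State
    delete: State  # "del" is a Python keyword, so the delete list is named "delete"


def result(state: State, actions: Sequence[Action]) -> Optional[State]:
    """Return Result(state, actions); None plays the role of "undefined"."""
    for a in actions:
        if not a.pre <= state:  # pre(a) is not a subset of the current state
            return None
        state = (state | a.add) - a.delete
    return state  # the empty sequence leaves the state unchanged


def is_plan(init: State, goal: State, actions: Sequence[Action]) -> bool:
    """A plan is an action sequence P with G a subset of Result(I, P)."""
    final = result(init, actions)
    return final is not None and goal <= final


# Hypothetical one-action task: "move" consumes p and produces q.
move = Action(pre=frozenset({"p"}), add=frozenset({"q"}), delete=frozenset({"p"}))
```

Applying `move` twice is undefined in this sketch, because the first application deletes its own precondition.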

3. Ordered Landmarks: What They Are

In this section we introduce our framework. We define what landmarks are, and in what relations between them we are interested. We show that all the corresponding decision problems are PSPACE-complete. Section 3.1 introduces landmarks and necessary orders, Section 3.2 introduces reasonable orders, and Section 3.3 introduces obedient reasonable orders (orders that are reasonable if one has already committed to obey a given a-priori set of reasonable ordering constraints).

3.1 Landmarks, and Necessary Orders

Landmarks are facts that must be true at some point during the execution of any solution plan.

Definition 1 Given a planning task $(A, I, G)$. A fact $L$ is a landmark if for all $P = \langle a_1, \ldots, a_n \rangle \in A^*$, $G \subseteq \mathit{Result}(I, P)$: $L \in \mathit{Result}(I, \langle a_1, \ldots, a_i \rangle)$ for some $0 \leq i \leq n$.

Note that in an unsolvable task all facts are landmarks (by universal quantification over the empty set of solution plans in the above definition). The definition thus only makes sense if the task at hand is solvable. Indeed, while our landmark techniques can help a planning algorithm to find a solution plan faster (as we will see later), they are not useful for proving unsolvability. The reasonable orders we will introduce are based on heuristic notions that make sense intuitively, but that are not mandatory in the sense that every solution plan obeys them, or even in the sense that there exists a solution plan that obeys them. Details on this topic are given with the individual concepts below. We remark that we make these observations only to clarify the meaning of our definitions. Given the way we use the landmarks information for planning, for our purposes it is not essential whether or not an ordering constraint is mandatory. Our search control loop only suggests to the planner what might be good to achieve next; it does not force the planner to do so (see Section 5).

Initial and goal facts are trivially landmarks: set $i$ to $0$ or $n$, respectively, in Definition 1. In general, it is PSPACE-complete to decide whether a fact is a landmark or not.

Theorem 1 Let LANDMARK denote the following problem: given a planning task $(A, I, G)$, and a fact $L$; is $L$ a landmark?
Deciding LANDMARK is PSPACE-complete.


Proof Sketch: PSPACE-hardness follows by a straightforward reduction of the complement of PLANSAT, the decision problem of whether there exists a solution plan to a given arbitrary STRIPS task (Bylander, 1994), to the problem of deciding LANDMARK. PSPACE-membership follows vice versa. □
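Although the problem is PSPACE-complete in general, Definition 1 can be checked exhaustively on very small tasks: $L$ is a landmark exactly if no goal state can be reached from $I$ through states that never contain $L$. The sketch below illustrates this reading; it is not the extraction technique of Section 4, and the function name, the tuple encoding of actions, and the four-action task are assumptions made for illustration.

```python
from collections import deque
from typing import FrozenSet, Iterable, Tuple

State = FrozenSet[str]
Action = Tuple[State, State, State]  # (pre, add, del)


def fs(*facts: str) -> State:
    return frozenset(facts)


def is_landmark(actions: Iterable[Action], init: State, goal: State, fact: str) -> bool:
    """Exhaustive check of Definition 1; only feasible for tiny state spaces."""
    if fact in init:
        return True  # i = 0 in Definition 1
    actions = list(actions)
    seen = {init}
    queue = deque([init])
    while queue:
        s = queue.popleft()
        if goal <= s:
            return False  # found a plan whose states never contain `fact`
        for pre, add, dele in actions:
            if pre <= s:
                t = (s | add) - dele
                if fact not in t and t not in seen:
                    seen.add(t)
                    queue.append(t)
    return True  # no plan avoids `fact` (also the case if the task is unsolvable)


# Hypothetical task: g can be reached from p either via q or via r.
TOY_ACTIONS = [
    (fs("p"), fs("q"), fs()),
    (fs("q"), fs("g"), fs()),
    (fs("p"), fs("r"), fs()),
    (fs("r"), fs("g"), fs()),
]
```

In this toy task, `p` and `g` are landmarks, while `q` and `r` are not, because each can be bypassed through the other route.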

Full proofs are in Appendix A. One of the most elementary ordering relations between a pair $L$ and $L'$ of landmarks is the following. In any action sequence that makes $L'$ true in some state, $L$ is true in the immediate preceding state. Typically, a fact $L$ is a landmark because it is ordered in this way before some other landmark $L'$. The reason is typically that $L$ is a necessary prerequisite (a shared precondition) for achieving $L'$. We will exploit this for our approximation techniques in Section 4.

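Definition 2 Given a planning task $(A, I, G)$, and two facts $L$ and $L'$. There is a necessary order between $L$ and $L'$, written $L \rightarrow_n L'$, if $L' \notin I$, and for all $P = \langle a_1, \ldots, a_n \rangle \in A^*$: if $L' \in \mathit{Result}(I, \langle a_1, \ldots, a_n \rangle)$ then $L \in \mathit{Result}(I, \langle a_1, \ldots, a_{n-1} \rangle)$.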

The definition allows for arbitrary facts, but the case that we will be interested in is the case where $L$ and $L'$ are landmarks. Note that if $L' \in \mathit{Result}(I, \langle a_1, \ldots, a_n \rangle)$ then $n \geq 1$ as $L' \notin I$. The intention behind a necessary order $L \rightarrow_n L'$ is that one must have $L$ true before one can have $L'$ true. So it does not make sense to allow such orders for initial facts $L'$. It is important that $L$ is postulated to be true directly before $L'$: this way, if two facts $L$ and $L''$ are necessarily ordered before the same fact $L'$, one can conclude that $L$ and $L''$ must be true together at some point. We make use of this observation in our approximation of reasonable orders (see Section 4.2).

We denote necessary orders, and all the other ordering relations we will introduce, as directed graph edges "$\rightarrow$" rather than with the more usual "$<$" symbol. We do this to avoid confusion about the meaning of our relations. As said earlier, none of the ordering relations we introduce is transitive. (Note that $\rightarrow_n$ would be transitive if $L$ was only postulated to hold sometime before $L'$, not directly before it.)

Necessary orders are mandatory. We say that an action sequence $\langle a_1, \ldots, a_n \rangle$ obeys an order $L \rightarrow L'$ if the sequence makes $L$ true the first time before it makes $L'$ true the first time. Precisely, $\langle a_1, \ldots, a_n \rangle$ obeys $L \rightarrow L'$ if either $L \in I$, or $\min\{i \mid L \in \mathit{add}(a_i)\} < \min\{i \mid L' \in \mathit{add}(a_i)\}$, where the minimum over an empty set is $\infty$. That is, either $L$ is true initially, or $L'$ is not added at all, or $L$ is added before $L'$. By definition, any action sequence obeys necessary orders. So one does not lose solutions if one forces a planner to obey necessary orders, i.e., if one disallows plans violating the orders. (We reiterate that this is a purely theoretical observation; as said above, our search control does not enforce the found ordering constraints.)
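The notion of obeying an order can be sketched as a small predicate over the add lists of an action sequence. This is an illustrative sketch only; the function names are hypothetical. One caveat: the strict comparison in the formula is false when neither fact is ever added, so the sketch follows the prose reading ("$L'$ is not added at all") and treats that case as obeying. That choice is an assumption of the sketch.

```python
import math
from typing import FrozenSet, Sequence, Set, Union

Facts = Union[Set[str], FrozenSet[str]]


def first_added(add_lists: Sequence[Facts], fact: str) -> float:
    """Smallest 1-based index i with fact in add(a_i); infinity if there is none."""
    for i, add in enumerate(add_lists, start=1):
        if fact in add:
            return i
    return math.inf


def obeys(init: Facts, add_lists: Sequence[Facts], l: str, l_prime: str) -> bool:
    """True if the sequence obeys the order l -> l_prime.

    Three cases, following the prose: l is true initially, or l_prime is
    never added, or l is first added strictly before l_prime.
    """
    if l in init:
        return True
    t_l = first_added(add_lists, l)
    t_lp = first_added(add_lists, l_prime)
    return t_lp == math.inf or t_l < t_lp
```

For instance, a sequence whose add lists are `[{"L"}, {"Lp"}]` obeys the order from `"L"` to `"Lp"` but not the reverse one.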

Theorem 2 Let NECESSARY-ORD denote the following problem: given a planning task $(A, I, G)$, and two facts $L$ and $L'$; does $L \rightarrow_n L'$ hold?
Deciding NECESSARY-ORD is PSPACE-complete.

Proof Sketch: PSPACE-hardness follows by reducing the complement of PLANSAT to NECESSARY-ORD. PSPACE-membership follows with a non-deterministic algorithm that


Another interesting relation is that of greedy necessary orders, a slightly weaker version of the necessary orders above. We postulate not that $L$ is true prior to $L'$ in all action sequences, but only in those action sequences where $L'$ is achieved for the first time. These are the orders that we actually approximate and use in our implementation (see Section 4).

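Definition 3 Given a planning task $(A, I, G)$, and two facts $L$ and $L'$. There is a greedy necessary order between $L$ and $L'$, written $L \rightarrow_{gn} L'$, if $L' \notin I$, and for all $P = \langle a_1, \ldots, a_n \rangle \in A^*$: if $L' \in \mathit{Result}(I, \langle a_1, \ldots, a_n \rangle)$ and $L' \notin \mathit{Result}(I, \langle a_1, \ldots, a_i \rangle)$ for $0 \leq i < n$, then $L \in \mathit{Result}(I, \langle a_1, \ldots, a_{n-1} \rangle)$.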

Like above with the necessary orders, the action sequence achieving $L'$ must contain at least one step as $L' \notin I$. Obviously, $\rightarrow_n$ is stronger than $\rightarrow_{gn}$, that is, with $L \rightarrow_n L'$ for two facts $L$ and $L'$, $L \rightarrow_{gn} L'$ follows. Greedy necessary orders are still mandatory in the sense that every action sequence obeys them.

The definition of greedy necessary orders captures the fact that, really, what we are interested in is what happens when we directly achieve $L'$ from the initial state, rather than in some remote part of the state space. The consideration of these more remote parts of the state space, which is inherent in the definition of the non-greedy necessary orders, can make us lose useful information. Consider the Blocksworld example in Figure 1. There is a greedy necessary order between clear(D) and clear(C), clear(D) $\rightarrow_{gn}$ clear(C), but not a necessary order, clear(D) $\not\rightarrow_n$ clear(C). If we make clear(C) true the first time in an action sequence from the initial state, then the action achieving clear(C) will always be unstack(D C), which requires clear(D) to be true. On the other hand, there can of course be action sequences which achieve clear(C) by different actions (unstack(A C), for example). But reaching a state where clear(C) can be achieved by such an action involves unstacking D from C, and thus achieving clear(C), in the first place. We will see later (Section 4.2) that the order clear(D) $\rightarrow_{gn}$ clear(C) can be used to make the important inference that clear(C) is reasonably ordered before on(B D).
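The difference between Definitions 2 and 3 can be made concrete by checking both exhaustively on a tiny task. The sketch below is purely illustrative and is not the approximation used in Section 4. The four-action task is hypothetical: it mimics the Blocksworld situation in that `g` can first be achieved only from a state containing `a`, while a later re-achievement of `g` does not need `a`.

```python
from typing import FrozenSet, List, Optional, Set, Tuple

State = FrozenSet[str]
Action = Tuple[State, State, State]  # (pre, add, del)


def fs(*facts: str) -> State:
    return frozenset(facts)


def reachable(actions: List[Action], init: State, avoid: Optional[str] = None) -> Set[State]:
    """States reachable from init; if `avoid` is given, only via states without it."""
    seen, stack = {init}, [init]
    while stack:
        s = stack.pop()
        for pre, add, dele in actions:
            if pre <= s:
                t = (s | add) - dele
                if (avoid is None or avoid not in t) and t not in seen:
                    seen.add(t)
                    stack.append(t)
    return seen


def holds_before(actions: List[Action], states: Set[State], l: str, lp: str) -> bool:
    """l is true in every given state from which one action leads to a state with lp."""
    return all(l in s for s in states
               for pre, add, dele in actions
               if pre <= s and lp in (s | add) - dele)


def necessary(actions: List[Action], init: State, l: str, lp: str) -> bool:
    """Definition 2: quantify over all reachable predecessor states."""
    return lp not in init and holds_before(actions, reachable(actions, init), l, lp)


def greedy_necessary(actions: List[Action], init: State, l: str, lp: str) -> bool:
    """Definition 3: only predecessor states reached before lp was ever true."""
    return lp not in init and holds_before(actions, reachable(actions, init, avoid=lp), l, lp)


# Hypothetical task (not from the paper).
TOY = [
    (fs("i"), fs("a"), fs()),          # achieve a
    (fs("a"), fs("g"), fs()),          # first way to g, needs a
    (fs("g"), fs("b"), fs("a", "g")),  # destroys a and g, yields b
    (fs("b"), fs("g"), fs()),          # second way to g, needs b, which needs g first
]
```

Here the greedy necessary order from `a` to `g` holds, while the necessary order does not, because `g` can be re-achieved from the state containing only `i` and `b`.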

More generally, the definition of greedy necessary orders is made from the perspective that we are interested in ordering the first occurrence of the facts $L$ in our desired solution plan. All definitions and algorithms in the rest of this paper are designed from this same perspective. Since a fact might (have to) be made true several times in a solution plan, one could just as well focus on ordering the fact's last occurrence, or any occurrence, or several occurrences of it. We chose to focus on the first occurrences of facts mainly in order to keep things simple. It seems very hard to say anything useful a priori about exactly how often and when some fact will need to become true in a plan. The "greedy assumption" that our approach thus makes is that all the landmarks need to be achieved only once, and that it is best to achieve them as early as possible. Of course this assumption is not always justified, and may lead to difficulties, such as cycles in the generated LGG (see also Sections 4.4 and 6.8). Generalising our approach to take account of several occurrences of the same fact is an open research topic.

Theorem 3 Let GREEDY-NECESSARY-ORD denote the following problem: given a planning task $(A, I, G)$, and two facts $L$ and $L'$; does $L \rightarrow_{gn} L'$ hold?
Deciding GREEDY-NECESSARY-ORD is PSPACE-complete.


Proof Sketch: By a minor modification of the proof to Theorem 2. □

3.2 Reasonable Orders

Reasonable orders were first introduced by Koehler and Hoffmann (2000), for top level goals. We extend their definition, in a slightly revised way, to landmarks.

Let us first reiterate what the idea of reasonable orders was originally. The idea introduced by Koehler and Hoffmann is this. If the planner is in a state $s$ where one goal $L'$ has just been achieved, but another goal $L$ is still false, and $L'$ must be destroyed in order to achieve $L$, then it might have been better to achieve $L$ first: to get to a goal state from $s$, the planner will have to delete and re-achieve $L'$. If the same situation arises in all states $s$ where $L'$ has just been achieved but $L$ is false, then it seems reasonable to generally introduce an ordering constraint $L \rightarrow L'$, indicating that $L$ should be achieved prior to $L'$.

The classical example of two facts with a reasonable ordering constraint is given by the on relations in Blocksworld, where on(B, C) is reasonably ordered before on(A, B) whenever the goal is to have both facts true in the goal state. Obviously, if one achieves on(A, B) first then one has to unstack A again in order to achieve on(B, C).

Think about an unmodified application of Koehler and Hoffmann's definition to the case of landmarks. Consider a state $s$ where we have a landmark $L'$, but not another landmark $L$, and achieving $L$ involves deleting $L'$. Does it matter? It might be that we do not need to achieve $L$ from $s$ anyway. It might also be that we do not need $L'$ anymore once we have achieved $L$. In both cases, there is no need to delete and re-achieve $L'$, and it does not appear reasonable to introduce the constraint $L \rightarrow L'$. The question is, under which circumstances is it reasonable? The answer is given by the two mentioned counter-examples. The situation matters if 1. we need to achieve $L$ from $s$, and 2. we must re-achieve $L'$ again afterwards. Both conditions are trivially fulfilled when $L$ and $L'$ are top level goals. Our definition below makes sure they hold for the landmarks $L$ and $L'$ in question.

We say that there is a reasonable ordering constraint between two landmarks $L$ and $L'$ if, starting from any state where $L'$ was achieved before $L$: $L'$ must be true at some point later than the achievement of $L$; and one must delete $L'$ on the way to $L$. Formally, we first define the "set of states where $L'$ was achieved before $L$", then we define what it means that "$L'$ must be true at some point later than the achievement of $L$", then based on that we define what reasonable orders are.

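Definition 4 Given a planning task $(A, I, G)$, and two landmarks $L$ and $L'$.

1. By $S_{(L', \neg L)}$, we denote the set of states $s$ such that there exists $P = \langle a_1, \ldots, a_n \rangle \in A^*$, $s = \mathit{Result}(I, P)$, $L' \in \mathit{add}(a_n)$, and $L \notin \mathit{Result}(I, \langle a_1, \ldots, a_i \rangle)$ for $0 \leq i \leq n$.

2. $L'$ is in the aftermath of $L$ if, for all states $s \in S_{(L', \neg L)}$, and all solution plans $P = \langle a_1, \ldots, a_n \rangle \in A^*$ from $s$, $G \subseteq \mathit{Result}(s, P)$, there are $1 \leq i \leq j \leq n$ such that $L \in \mathit{Result}(s, \langle a_1, \ldots, a_i \rangle)$ and $L' \in \mathit{Result}(s, \langle a_1, \ldots, a_j \rangle)$.

3. There is a reasonable order between $L$ and $L'$, written $L \rightarrow_r L'$, if $L'$ is in the aftermath of $L$, and

$$
\forall s \in S_{(L', \neg L)} : \forall P \in A^* : L \in \mathit{Result}(s, P) \Rightarrow \exists a \in P : L' \in \mathit{del}(a)
$$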


Let us explain this definition, and how it differs from Koehler and Hoffmann's original one.

1. $S_{(L', \neg L)}$ contains the states where $L'$ was just added, but $L$ was not true yet. These are the states we consider: we are interested to know if, from every state $s \in S_{(L', \neg L)}$, we will have to delete and re-achieve $L'$. In Koehler and Hoffmann's original definition, $S_{(L', \neg L)}$ contained more states, namely all those states $s$ where $L'$ was just added but $L \notin s$. This definition allowed cases where $L$ was achieved already but was deleted again. Our revised definition captures better the intuition that we want to consider all states where $L'$ was achieved before $L$. The revised definition also makes sure that, for a landmark $L$, any solution plan starting from $s \in S_{(L', \neg L)}$ must achieve $L$ at some point.

2. The definition of the aftermath relation just says that, in a solution plan starting from $s \in S_{(L', \neg L)}$, $L'$ must be true simultaneously with $L$, or at some later time point. Koehler and Hoffmann didn't need such a definition since this condition is trivially fulfilled for top level goals.

3. The definition of $L \rightarrow_r L'$ then says that, from every $s \in S_{(L', \neg L)}$, every action sequence achieving $L$ deletes $L'$ at some point. With the additional postulation that $L'$ is in the aftermath of $L$, this implies that from every $s \in S_{(L', \neg L)}$ one needs to delete and re-achieve $L'$. Koehler and Hoffmann's definition here is identical except that they do not need to postulate the aftermath relation.

Because in their definition $S_{(L', \neg L)}$ contains more states, and top level goals are trivially in the aftermath of each other, Koehler and Hoffmann's $\rightarrow_r$ definition is stronger than ours, i.e., $L \rightarrow_r L'$ in the Koehler and Hoffmann sense implies $L \rightarrow_r L'$ as defined above (we give an example below where our, but not the Koehler and Hoffmann, $L \rightarrow_r L'$ relation holds).⁴

It is important to note that reasonable orders are not mandatory. An order $L \rightarrow_r L'$ only says that, if we achieve $L'$ before $L$, we will need to delete and re-achieve $L'$. This might mean that achieving $L'$ before $L$ is wasted effort. But there are cases where, in the process of achieving some landmark $L$, one has no other choice but to achieve, delete, and re-achieve a landmark $L'$. In the Towers of Hanoi domain, for example, this is the case for nearly all pairs of top level goals, namely for all those pairs of goals that say that ($L'$) disc $i$ must be located on disc $i+1$, and ($L$) disc $i+1$ must be located on disc $i+2$. In such a situation, forcing a planner to obey the order $L \rightarrow_r L'$ cuts out all solution paths.

⁴ Note that an order $L \rightarrow_r L'$ intends to tell us that we should not achieve $L'$ before $L$. This leaves open the option to achieve $L$ and $L'$ simultaneously. In that sense, our definition (given above in Section 3.1) of what it means to obey an order $L \rightarrow L'$, namely to add $L$ strictly before $L'$, is a bit too restrictive. In our experience, the restriction is irrelevant in practice. In none of the many benchmarks we tried did we observe facts that were reasonably ordered (ordered at all, in fact) relative to each other and that could be achieved with the same action (remember that we consider the sequential planning setting). We remark that one can easily adapt our framework to take account of simultaneous achievement of $L$ and $L'$. No changes are needed except in the approximation of obedient reasonable orders, which would

One can also easily construct cases where $L \rightarrow_r L'$ and $L' \rightarrow_r L$ hold for goals $L$ and $L'$ (that cannot be achieved simultaneously). Consider the following example. There are the seven facts $L$, $L'$, $P_1$, $P_2$, $P'_2$, $P_3$, and $P'_3$. Initially only $P_1$ is true, and the goal is to have $L$ and $L'$. The actions are:

name      (pre, add, del)
opL_1   = ({P_1},  {L, P_2},    {P_1})
opL'_1  = ({P_1},  {L', P'_2},  {P_1})
opL_2   = ({P'_2}, {L, P_3},    {L', P'_2})
opL'_2  = ({P_2},  {L', P'_3},  {L, P_2})
opL_3   = ({P'_3}, {L},         {P'_3})
opL'_3  = ({P_3},  {L'},        {P_3})

Figure 2 shows the state space of the example. There are exactly two solution paths, $\langle opL_1, opL'_2, opL_3 \rangle$ and $\langle opL'_1, opL_2, opL'_3 \rangle$. The first of these paths achieves, deletes, and re-achieves $L$, the second one does the same with $L'$. $S_{(L', \neg L)}$ contains the single state that results from applying $opL'_1$ to the initial state. From that state, one has to apply $opL_2$ in order to achieve $L$, deleting $L'$, so $L \rightarrow_r L'$ holds. Similarly, it can be seen that $L' \rightarrow_r L$ holds. Note that either solution path disobeys one of the two reasonable orders.

[Figure: state-space graph of the example]

Figure 2: State space of the example.
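The claims about this example can be replayed mechanically. The following sketch encodes the six actions from the table above, enumerates the solution paths, and computes $S_{(L', \neg L)}$ by exhaustive search. The helper names and the string encoding of facts (for instance `"P'2"` for $P'_2$) are illustrative assumptions, and the enumeration is only practical because the state space is tiny.

```python
import math
from typing import Dict, FrozenSet, Iterator, Set, Tuple

State = FrozenSet[str]
Action = Tuple[State, State, State]  # (pre, add, del)


def fs(*facts: str) -> State:
    return frozenset(facts)


ACTIONS: Dict[str, Action] = {
    "opL1":  (fs("P1"),  fs("L", "P2"),   fs("P1")),
    "opL'1": (fs("P1"),  fs("L'", "P'2"), fs("P1")),
    "opL2":  (fs("P'2"), fs("L", "P3"),   fs("L'", "P'2")),
    "opL'2": (fs("P2"),  fs("L'", "P'3"), fs("L", "P2")),
    "opL3":  (fs("P'3"), fs("L"),         fs("P'3")),
    "opL'3": (fs("P3"),  fs("L'"),        fs("P3")),
}
INIT, GOAL = fs("P1"), fs("L", "L'")


def solution_paths(state: State = INIT, prefix: Tuple[str, ...] = (),
                   on_path: FrozenSet[State] = frozenset()) -> Iterator[Tuple[str, ...]]:
    """Yield loop-free action sequences that stop at the first goal state."""
    if GOAL <= state:
        yield prefix
        return
    on_path = on_path | {state}
    for name, (pre, add, dele) in ACTIONS.items():
        if pre <= state:
            nxt = (state | add) - dele
            if nxt not in on_path:
                yield from solution_paths(nxt, prefix + (name,), on_path)


def first_added(path: Tuple[str, ...], fact: str) -> float:
    for i, name in enumerate(path, start=1):
        if fact in ACTIONS[name][1]:
            return i
    return math.inf


def obeys(path: Tuple[str, ...], l: str, lp: str) -> bool:
    """Obeying l -> lp, as defined in Section 3.1."""
    t_lp = first_added(path, lp)
    return l in INIT or t_lp == math.inf or first_added(path, l) < t_lp


def s_set(lp: str, l: str) -> Set[State]:
    """S_(lp, not l): states where lp was just added and l has never been true."""
    found: Set[State] = set()
    if l in INIT:
        return found
    seen, stack = {INIT}, [INIT]
    while stack:
        s = stack.pop()
        for pre, add, dele in ACTIONS.values():
            if pre <= s:
                t = (s | add) - dele
                if l in t:
                    continue  # l became true, so this branch no longer qualifies
                if lp in add:
                    found.add(t)
                if t not in seen:
                    seen.add(t)
                    stack.append(t)
    return found
```

With this encoding, `set(solution_paths())` contains exactly the two paths named in the text, each of which obeys exactly one of the two orders, and `s_set("L'", "L")` contains only the state reached by `opL'1`.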

We reiterate that the above are purely theoretical observations made to clarify the meaning of our definitions. Our search control does not enforce the found ordering constraints, it only suggests them to the planner.

While reasonable orders $L \rightarrow_r L'$ are not mandatory, they can help to reduce search effort in those cases where achieving $L'$ before $L$ does imply wasted effort. Our Blocksworld example from Figure 1 constitutes such a case. In the example, it makes no sense to stack B onto D while D is still located on C, because C has to end up on top of A. By Definition 4, clear(C) $\rightarrow_r$ on(B D) holds: $S_{(\mathit{on}(B\ D), \neg \mathit{clear}(C))}$ contains only states where B has been stacked onto D, but D is still on top of C. From these states, one must delete on(B D) in order to achieve clear(C). Further, on(B D) is a top level goal so it is in the aftermath of clear(C), and clear(C) $\rightarrow_r$ on(B D) follows. The order does not hold in terms of Koehler and Hoffmann's definition, because there the $S_{(\mathit{on}(B\ D), \neg \mathit{clear}(C))}$ state set also contains states where D was already removed from C.

Like the previous decision problems, those related to the aftermath relation and to reasonable orders are PSPACE-complete.

Theorem 4 Let AFTERMATH denote the following problem: given a planning task $(A, I, G)$, and two facts $L$ and $L'$; is $L'$ in the aftermath of $L$?
Deciding AFTERMATH is PSPACE-complete.


AFTERMATH. PSPACE-membership follows by a non-deterministic algorithm that guesses counter examples. □

Theorem 5 Let REASONABLE-ORD denote the following problem: given a planning task $(A, I, G)$, and two facts $L$ and $L'$ such that $L'$ is in the aftermath of $L$; does $L \rightarrow_r L'$ hold?
Deciding REASONABLE-ORD is PSPACE-complete.

REASONABLE-ORD, with the same construction as used by Koehler and Hoffmann (2000) for the original definition of reasonable orders. PSPACE-membership follows by a non-deterministic algorithm that guesses counter examples. □

3.3 Obedient Reasonable Orders

Say we already have a set $O$ of reasonable ordering constraints $L \rightarrow_r L'$. The question we focus on in the section at hand is, if a planner commits to obey all the constraints in $O$, do other reasonable orders arise? The answer is, yes, there might.

Consider the following situation. Say we have landmarks $L$ and $L'$, such that we must delete $L'$ in order to achieve $L$. Also, there is a third landmark $L''$ such that $L' \rightarrow_n L''$ and $L \rightarrow_r L''$. Now, if the order $L \rightarrow L''$ was necessary, $L \rightarrow_n L''$, then we would have a reasonable order $L \rightarrow_r L'$: $L$ and $L'$ would need to be true together immediately prior to the achievement of $L''$, so $L'$ would be in the aftermath of $L$. However, the ordering constraint $L \rightarrow L''$ is "only" reasonable so there is no guarantee that a solution plan will obey it. A plan can choose to achieve $L'$ before $L''$ before $L$, and thereby avoid deletion and re-achievement of $L'$. But if we enforce the ordering constraint $L \rightarrow_r L''$, disallowing plans that do not obey it, then achieving $L'$ before $L$ leads to deletion and re-achievement of $L'$ and is thus not reasonable.

With the above, the idea we pursue now is to define a weaker form of reasonable orders, which are obedient in the sense that they only arise if one commits to a given set $O$ of (previously computed) reasonable ordering constraints. In our experiments, using (an approximation of) such obedient reasonable orders, on top of the reasonable orders themselves, resulted in significantly better planner performance in a few domains (such as the Blocksworld), and made no difference in the other domains. Summarised, what we do is, we start from the set $O$ of reasonable orders already computed by our approximations, and then insert new orders that are reasonable given one commits to obey the constraints in $O$. We do this just once, i.e., we do not compute a fixpoint. The details are in Section 4.3. Right now, we define what obedient reasonable orders are.

The definition of obedient reasonable orders is almost the same as that of reasonable orders. The only difference lies in that we consider only action sequences that are obedient in the sense that they obey all ordering constraints in the given set $O$. The definition of when an action sequence $\langle a_1, \ldots, a_n \rangle$ obeys an order $L \rightarrow L'$ was already given above: if either $L \in I$, or $\min\{i \mid L \in \mathit{add}(a_i)\} < \min\{i \mid L' \in \mathit{add}(a_i)\}$, where the minimum over an empty set is $\infty$.


Definition 5 Given a planning task $(A, I, G)$, a set $O$ of reasonable ordering constraints, and two facts $L$ and $L'$.

1. By $S^O_{(L', \neg L)}$, we denote the set of states $s$ such that there exists an obedient action sequence $P = \langle a_1, \ldots, a_n \rangle \in A^*$, with $s = \mathit{Result}(I, P)$, $L' \in \mathit{add}(a_n)$, and $L \notin \mathit{Result}(I, \langle a_1, \ldots, a_i \rangle)$ for $0 \leq i \leq n$.

2. $L'$ is in the obedient aftermath of $L$ if, for all states $s \in S^O_{(L', \neg L)}$, and all obedient solution plans $P = \langle a_1, \ldots, a_n \rangle \in A^*$, $G \subseteq \mathit{Result}(I, P)$, where $s = \mathit{Result}(I, \langle a_1, \ldots, a_k \rangle)$, there are $k \leq i \leq j \leq n$ such that $L \in \mathit{Result}(I, \langle a_1, \ldots, a_i \rangle)$ and $L' \in \mathit{Result}(I, \langle a_1, \ldots, a_j \rangle)$.

3. There is an obedient reasonable order between $L$ and $L'$, written $L \rightarrow^O_r L'$, if and only if $L'$ is in the obedient aftermath of $L$, and

$$
\forall s \in S^O_{(L', \neg L)} : \forall P \in A^* : L \in \mathit{Result}(s, P) \Rightarrow \exists a \in P : L' \in \mathit{del}(a)
$$

This definition is very similar to Definition 4 and thus should be self-explanatory in its formal aspects. The definition of the aftermath relation looks a little more complicated because the solution plan $P$ starts from the initial state, not from $s$ as in Definition 4, and reaches $s$ with action $a_k$. This is just a minor technical device to cover the case where, for some of the $L_1 \rightarrow_r L_2$ constraints in $O$, $L_1$ is contained in $s$ already (and thus does not need to be added after $s$ in order to obey $L_1 \rightarrow_r L_2$). Note that, in part 3 of the definition, the action sequences $P$ achieving $L$ are not required to be obedient. While it would make sense to impose this requirement, our approximation techniques (that will be introduced in Section 4.3) only take account of $O$ in the computation of the aftermath relation anyway. It is an open question how our other approximation techniques could be made to take account of $O$.

We remark that the modified definitions do not change the computational complexity of the corresponding decision problems.⁵ As a quick illustration of the new definitions, reconsider the situation described above. There, $L'$ is not in the aftermath of $L$, but in the obedient aftermath of $L$ because all action sequences that obey the constraint $L \rightarrow_r L''$ make $L'$ true at a point simultaneously with or after $L$ (namely immediately prior to $L''$, assuming that there is no action that adds both $L$ and $L''$). As $L'$ must be deleted in order to achieve $L$, we obtain the ordering $L \rightarrow^{\{L \rightarrow_r L''\}}_r L'$. That is, if the planner obeys the constraint $L \rightarrow_r L''$ then it is reasonable to also order $L$ before $L'$.

Just like the reasonable orders, the obedient reasonable orders are not mandatory. Enforcing an obedient reasonable order can cut out all solution paths. The reason is the same as for the reasonable orders. An order $L \to^O_r L'$ only says that, given we want to obey $O$, achieving $L'$ before $L$ implies deletion and re-achievement of $L'$. The order tells us nothing about whether this really means that achieving $L'$ before $L$ is wasted effort. Consider the following example. There are the ten facts $L$, $L'$, $L''$, $P$, $A_1$, $A_2$, $A_3$, $B_1$, $B_2$, and $B_3$.

5. For the obedient aftermath relation, minor modifications of the proof to Theorem 4 suffice. PSPACE-hardness follows by using the empty set of ordering constraints. PSPACE-membership follows by extending the non-deterministic decision algorithm with flags that check if the ordering constraints are [...]

Initially only $P$ is true, and the goal is to have $L$, $L'$, and $L''$. The construction is made so that $L \to_r L''$, and $L \not\to_r L'$ but $L \to^{\{L \to_r L''\}}_r L'$. Enforcing $L \to^{\{L \to_r L''\}}_r L'$ renders the task unsolvable. The actions, written as (precondition, add list, delete list), are:

opA   = ({P},   {A_1},           {P})
opB   = ({P},   {B_1},           {P})
opA_1 = ({A_1}, {L', L'', A_2},  {A_1})
opA_2 = ({A_2}, {L, A_3},        {L'', A_2})
opA_3 = ({A_3}, {L''},           {A_3})
opB_1 = ({B_1}, {L', B_2},       {B_1})
opB_2 = ({B_2}, {L, B_3},        {L', B_2})
opB_3 = ({B_3}, {L', L''},       {B_3})

Figure 3 shows the state space of the example. One has to choose one out of two options. First, one applies opA to the initial state and then proceeds with opA_1, opA_2, and opA_3. Second, one applies opB to the initial state and proceeds with opB_1, opB_2, and opB_3. The first option is the only one where $L''$ becomes true before $L$. One has to delete $L''$ with opA_2, and re-achieve it with opA_3. For this reason, the order $L \to_r L''$ holds. The order $L \to_r L'$ does not hold because if one chooses the first option then $L'$ becomes true prior to $L$, and is never deleted. However, committing to the order $L \to_r L''$ means excluding the first option. In the second option, $L'$ becomes true before $L$, and must then be deleted and re-achieved, so we get the order $L \to^{\{L \to_r L''\}}_r L'$. But there is no solution plan that obeys this order because there is no way to make $L$ true before (or, even, simultaneously with) $L'$.

{P} --opA--> {A_1} --opA_1--> {L', L'', A_2} --opA_2--> {L, L', A_3} --opA_3--> {L, L', L''}
{P} --opB--> {B_1} --opB_1--> {L', B_2} --opB_2--> {L, B_3} --opB_3--> {L, L', L''}

Figure 3: State space of the example.
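The example can be checked mechanically. The following Python sketch is an illustration only, not part of the original implementation: it encodes the eight actions as (precondition, add, delete) triples, enumerates all loop-free solution plans, and records when each goal fact first becomes true. The helper names (`apply`, `solution_plans`, `first_true`) are hypothetical.

```python
from collections import deque

# Actions as (preconditions, add list, delete list), copied from the example.
ACTIONS = {
    "opA":  ({"P"},  {"A1"},              {"P"}),
    "opB":  ({"P"},  {"B1"},              {"P"}),
    "opA1": ({"A1"}, {"L'", "L''", "A2"}, {"A1"}),
    "opA2": ({"A2"}, {"L", "A3"},         {"L''", "A2"}),
    "opA3": ({"A3"}, {"L''"},             {"A3"}),
    "opB1": ({"B1"}, {"L'", "B2"},        {"B1"}),
    "opB2": ({"B2"}, {"L", "B3"},         {"L'", "B2"}),
    "opB3": ({"B3"}, {"L'", "L''"},       {"B3"}),
}
INIT = frozenset({"P"})
GOAL = {"L", "L'", "L''"}


def apply(state, name):
    """Return the successor state, or None if the action is not applicable."""
    pre, add, dele = ACTIONS[name]
    return frozenset((state - dele) | add) if pre <= state else None


def solution_plans():
    """Enumerate all loop-free action sequences from INIT to a goal state."""
    plans, queue = [], deque([(INIT, [], {INIT})])
    while queue:
        state, plan, seen = queue.popleft()
        if GOAL <= state:
            plans.append(plan)
            continue
        for name in ACTIONS:
            nxt = apply(state, name)
            if nxt is not None and nxt not in seen:
                queue.append((nxt, plan + [name], seen | {nxt}))
    return plans


def first_true(plan, fact):
    """Index of the first state along the plan (0 = initial) containing fact."""
    state = INIT
    if fact in state:
        return 0
    for i, name in enumerate(plan, 1):
        state = apply(state, name)
        if fact in state:
            return i
    return None


plans = solution_plans()
# Plans that obey L ->r L'' are those where L is achieved before L''.
obedient = [p for p in plans if first_true(p, "L") < first_true(p, "L''")]
for plan in plans:
    print(plan, {f: first_true(plan, f) for f in ("L", "L'", "L''")})
```

In both plans $L'$ first becomes true strictly before $L$, so no plan can obey an order that puts $L$ before $L'$; only the opB plan obeys $L \to_r L''$.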

4. How to Find Ordered Landmarks

We now describe our methods to find landmarks in a given planning task, and to approximate their inherent ordering constraints. The result of the process is a directed graph in the obvious way, the landmarks generation graph (LGG). Section 4.1 describes how we find landmarks, and how we approximate greedy necessary orders between them. Section 4.2 gives a sufficient criterion for reasonable orders, based on greedy necessary orders and fact inconsistencies, and describes how we use the criterion for approximating reasonable orders. [...] handling of cycles in the LGG, and Section 4.5 describes a preliminary form of "lookahead" orders that we have also implemented and used.

4.1 Finding Landmarks and Approximating Greedy Necessary Orders

We find (a subset of the) landmarks in a planning task, and approximate the greedy necessary orders between them, both in one process. The process is split into two parts:

1. Compute an LGG of landmark candidates together with approximated greedy necessary orders between them. This is done with a backchaining process. The goals form the first landmark candidates. Then, for any candidate $L'$, the "earliest" actions that can be used to achieve $L'$ are considered. Here, "early" is a greedy approximation of reachability from the initial state. The actions are analysed to see if they have shared precondition facts $L$, that is, facts that must be true before executing any of the actions. These facts $L$ become new candidates if they have not already been processed, and the orders $L \to_{gn} L'$ are introduced. The process is iterated until there are no new candidates. (Due to the greedy selection of actions, $L$ is not proved to be a landmark, and the order $L \to_{gn} L'$ is not proved to be a greedy necessary order.)

2. Remove from the LGG the candidates (and their incident edges) that can not be proved to be landmarks. This is done by evaluating a sufficient condition on each candidate $L$ in the LGG. The condition ignores all actions that add $L$, and asks if a relaxed version of the task is still solvable. If not, $L$ is proved to be a landmark. (Any relaxation can be used in principle; we use the relaxation that ignores delete lists as in McDermott, 1999 and Bonet & Geffner, 2001.)

The next two subsections focus on these two steps in turn.

4.1.1 Landmark Candidates

We give pseudo-code for our approximation algorithm below. As said, we make the algorithm greedy by using an approximation of reachability from the initial state. The approximation we use is a relaxed planning graph (Hoffmann & Nebel, 2001), short RPG. Let us explain this data structure first. An RPG is built just like a planning graph (Blum & Furst, 1997), except that the delete lists of all actions are ignored; as a result, there are no mutex relations in the graph. The RPG thus is a sequence $P_0, A_0, P_1, A_1, \ldots, P_{m-1}, A_{m-1}, P_m$ of proposition sets (layers) $P_i$ and action sets (layers) $A_i$. $P_0$ contains the facts that are true in the initial state, $A_0$ contains those actions whose preconditions are reached (contained) in $P_0$, $P_1$ contains $P_0$ plus the add effects of the actions in $A_0$, and so on. We have $P_i \subseteq P_{i+1}$ and $A_i \subseteq A_{i+1}$ for all $i$. If the relaxed task (without delete lists) is unsolvable, then the RPG reaches a fixpoint before reaching the goal facts, thereby proving unsolvability. If the relaxed task is solvable, then eventually a layer $P_m$ containing the goal facts will be reached (see footnote 6).
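The layered construction can be sketched in a few lines of Python. This is an illustrative sketch, not the implementation used in the experiments: actions are given without delete lists as (precondition, add) pairs, and the function name `build_rpg` as well as the toy facts `p`, `q`, `r`, `g` are hypothetical. The graph is only built until the goals are first reached or a fixpoint occurs.

```python
def build_rpg(actions, init, goal):
    """Build a relaxed planning graph (delete lists ignored).

    actions: dict mapping an action name to (pre, add), both sets of facts.
    Returns (fact_level, action_level, reached), where the level of a
    fact/action is the index of the first layer containing it and
    `reached` tells whether a layer containing all goal facts exists.
    """
    fact_level = {fact: 0 for fact in init}
    action_level = {}
    layer = 0
    while True:
        current = set(fact_level)            # proposition layer P_layer
        if goal <= current:
            return fact_level, action_level, True
        new_facts = set()
        for name, (pre, add) in actions.items():
            if pre <= current:               # action belongs to A_layer
                action_level.setdefault(name, layer)
                new_facts |= add - current
        if not new_facts:                    # fixpoint before the goals
            return fact_level, action_level, False
        layer += 1
        for fact in new_facts:
            fact_level[fact] = layer


# Toy task (hypothetical facts and actions, for illustration only).
ACTIONS = {
    "a": ({"p"}, {"q"}),
    "b": ({"q"}, {"r"}),
    "c": ({"p", "r"}, {"g"}),
}
INIT, GOAL = {"p"}, {"g"}

fact_level, action_level, reached = build_rpg(ACTIONS, INIT, GOAL)
print(fact_level, action_level, reached)
```

Because layers only grow, the loop performs at most one iteration per fact, which reflects the polynomial-time solvability test for the relaxed task mentioned in the text.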

6. Note that the RPG thus decides solvability of the relaxed planning task. Indeed, building an RPG is a variation of the algorithm given by Bylander (1994) to prove that plan existence is polynomial in the [...]

An RPG encodes an over-approximation of reachability in the planning task. We define the level of a fact/action to be the index of the first proposition/action layer that contains the fact/action. Then, if the level of a fact/action is $l$, one must apply at least $l$ parallel action steps from the initial state before the fact becomes true/the action becomes applicable. (The fact/action level corresponds to the "$h^1$" heuristic defined by Haslum & Geffner, 2000.) We use this over-approximation of reachability to insert some greediness into our approximation of "greedy" necessary orders (more below). The approximation process proceeds as shown in Figure 4.

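initialise the LGG to $(G, \emptyset)$, and set $C := G$
while $C \ne \emptyset$ do
    set $C' := \emptyset$
    for all $L' \in C$, $\mathit{level}(L') \ne 0$ do
        let $A$ be the set of all actions $a$ such that $L' \in \mathit{add}(a)$, and $\mathit{level}(a) = \mathit{level}(L') - 1$   (*)
        for all facts $L$ such that $\forall a \in A : L \in \mathit{pre}(a)$ do
            if $L$ is not yet a node in the LGG, set $C' := C' \cup \{L\}$
            if $L$ is not yet a node in the LGG, then insert that node
            if $L \to_{gn} L'$ is not yet an edge in the LGG, then insert that edge
        endfor
    endfor
    set $C := C'$
endwhile

Figure 4: Landmark candidate generation.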

The set of landmark candidates is initialised to comprise the goal facts. Each iteration of the while-loop processes all "open" candidates $L'$, that is, those $L'$ in $C$. Candidates $L'$ with level 0, i.e., initial facts, are not used to produce greedy necessary orders and new landmark candidates, because such $L'$ are already true. For the other open candidates $L'$, the set $A$ comprises all those actions at the level below $L'$ that can be used to achieve $L'$. Note that these are the earliest possible achievers of $L'$ in the RPG, or else the level of $L'$ would be lower. We take as the new landmark candidates those facts $L$ that every action in $A$ requires as a precondition, and update the LGG and the set of open candidates accordingly. Independently of the (*) step, the algorithm terminates because there are only finitely many facts. Because we use the RPG level test in step (*), the levels of the new candidates $L$ are strictly lower than the level of $L'$, and the while-loop terminates after at most $m$ iterations where $m$ is the index of the topmost proposition layer in the RPG.

If we skipped the test for the RPG level at the point in the algorithm marked (*), then the new candidates $L$ would be proved landmarks, and the generated orders would be proved to be necessary and thus also greedy necessary. Obviously, if all actions that can achieve a landmark $L'$ require $L$ to be true, then $L$ is a landmark that must be true immediately prior to achieving $L'$. When the choice of $L'$ achievers is restricted with the RPG level test, the found landmarks and orders may be unsound. Consider the following example, where we want to move from city A to city D on the road map shown in Figure 5, using a standard [...]

Figure 5: An example road map.

The above algorithm will come up with the following LGG: ({at(A), at(E), at(D)}, {at(A) $\to_{gn}$ at(E), at(E) $\to_{gn}$ at(D)}). The RPG is only built until the goals are reached the first time, which happens in this example before move(C D) comes in. However, the action sequence $\langle$move(A B), move(B C), move(C D)$\rangle$ achieves at(D) without making at(E) true. Therefore, at(E) is not really a landmark, and at(E) $\to_{gn}$ at(D) is not really a greedy necessary order.
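The candidate generation of Figure 4 can be replayed on this example with a short Python sketch. It is an illustration under stated assumptions, not the original implementation: the road map is assumed to consist of the bidirectional roads A-B, B-C, C-D, A-E, and E-D (consistent with the moves named in the text, but the drawing itself is not reproduced here), move actions are modelled without delete lists, and all function names are hypothetical.

```python
def rpg_levels(actions, init, goal):
    """Fact and action levels of an RPG built until the goals are reached."""
    fact_level, action_level, layer = {f: 0 for f in init}, {}, 0
    while not goal <= set(fact_level):
        current, new = set(fact_level), set()
        for name, (pre, add) in actions.items():
            if pre <= current:
                action_level.setdefault(name, layer)
                new |= add - current
        if not new:
            raise ValueError("relaxed task is unsolvable")
        layer += 1
        fact_level.update({f: layer for f in new})
    return fact_level, action_level


def landmark_candidates(actions, init, goal):
    """Backchaining from the goals as in Figure 4; returns (nodes, gn_edges)."""
    fact_level, action_level = rpg_levels(actions, init, goal)
    nodes, edges, open_ = set(goal), set(), set(goal)
    while open_:
        next_open = set()
        for target in open_:
            if fact_level[target] == 0:
                continue                      # initial facts are not expanded
            # Step (*): only achievers one level below the target are used.
            achiever_pres = [
                pre for name, (pre, add) in actions.items()
                if target in add
                and action_level.get(name) == fact_level[target] - 1
            ]
            for fact in set.intersection(*achiever_pres):
                if fact not in nodes:
                    nodes.add(fact)
                    next_open.add(fact)
                edges.add((fact, target))
        open_ = next_open
    return nodes, edges


# Assumed road map: bidirectional roads between these city pairs.
ROADS = [("A", "B"), ("B", "C"), ("C", "D"), ("A", "E"), ("E", "D")]
ACTIONS = {}
for x, y in ROADS:
    for src, dst in ((x, y), (y, x)):
        ACTIONS[f"move({src} {dst})"] = ({f"at({src})"}, {f"at({dst})"})

nodes, edges = landmark_candidates(ACTIONS, {"at(A)"}, {"at(D)"})
print(sorted(nodes), sorted(edges))
```

Under these assumptions the sketch yields the candidates at(A), at(E), at(D) with the edges at(A) $\to_{gn}$ at(E) and at(E) $\to_{gn}$ at(D): move(C D) has no level when the RPG stops, so it is never considered as an achiever of at(D).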

By restricting our choice of $L'$ achievers with the RPG level test at step (*) in Figure 4, as said we intend to insert greediness into our approximation of greedy necessary orders. The generated orders $L \to_{gn} L'$ are only guaranteed to be sound if, in the RPG, the set of earliest achievers of $L'$ contains all actions that can be used to make $L'$ true for the first time from the initial state. Of course, it is hard to exactly compute that latter set of actions, and it is also highly non-trivial (if possible at all) to find general conditions on when the earliest achievers in the RPG contain all these actions. In the road map example above, the actions that can achieve at(D) for the first time are move(C D) and move(E D), but the only earliest achiever in the RPG is move(E D). This leads to the unsound at(E) $\to_{gn}$ at(D) order. In the following example taken from the well-known Logistics domain, the earliest achievers of $L'$ do contain all actions that can make $L'$ true for the first time. Say $L'$ = at(P A) requires package P to be at the airport A of its origin city, and P is not at this airport initially. The actions that can achieve $L'$ are to unload P from the local truck T, or to unload it from any airplane. The only earliest achiever in the RPG is the unload from T, and indeed that is the only action that can achieve $L'$ for the first time: in order to get the package into an airplane, the package has to arrive at the airport in the first place. Our approximation process correctly generates the new landmark candidate in(P T) as well as the greedy necessary order in(P T) $\to_{gn}$ at(P A). Note that in(P T) $\not\to_n$ at(P A).

We show below in Section 4.1.2 how we re-establish the soundness of the landmark candidates, removing candidates (and their associated orders) that are not provably landmarks. We did not find a way to provably re-establish the soundness of the generated greedy necessary orders, and unsound orders may stay in the LGG, potentially also causing the inference of unsound reasonable/obedient reasonable orders (see the sections below). We did observe such unsoundness in a few domains during our experiments (individual discussions are in Section 6). We remark the following.

1. While unsound approximated $L \to_{gn} L'$ orders are not valid with respect to Definition 3, they still make some sense intuitively. They are generated because $L$ is in the preconditions of all actions that are the first ones in the RPG to achieve $L'$. This means that going to $L'$ via $L$ is probably a good option, in terms of distance from the initial state.

2. Unless $L$ is a landmark for some other reason (than for the unsound order $L \to_{gn} L'$), landmark verification will remove $L$, and in particular the order $L \to_{gn} L'$ [...] LGG (see the discussion of the Figure 5 example below in the section about landmark verification).

3. As said before, our search control does not enforce the orders in the LGG, it only suggests them to the planner. So even if there is no plan that obeys an order in the LGG, this does not mean that our search control will make the planner fail.

4. If we were to extract only provably necessary orders, by not using the RPG level test, we would miss the information that lies in those $\to_{gn}$ orders that are not $\to_n$ orders.

For these reasons, in particular for the last one, we concentrated on the potentially unsound RPG-based approximation in our experiments. We also ran some comparative tests to the "safe" strategy without the RPG level test, in domains where the RPG produced unsound orders. See the details in Section 6.

One case where an $\to_{gn}$ order that is not an $\to_n$ order contains potentially useful information is the Logistics example given above. Another case is the aforementioned order clear(D) $\to_{gn}$ clear(C) in our running Blocksworld example from Figure 1. To conclude this subsection, let us have a look at what our approximation algorithm from Figure 4 does in that example. The RPG for the example is summarised in Figure 6.

P0           A0            P1          A1           P2          A2          P3
on-table(A)  pick-up(A)    holding(A)  stack(B A)   on(B A)     stack(C A)  on(C A)
on-table(B)  pick-up(B)    holding(B)  stack(B D)   on(B D)     stack(C B)  on(C B)
on-table(C)  unstack(D C)  holding(D)  stack(B C)   on(B C)     stack(C D)  on(C D)
on(D C)                    clear(C)    put-down(B)  ...         ...         ...
clear(A)                               ...
clear(B)                               pick-up(C)   holding(C)
clear(D)                               ...          ...
arm-empty()

Figure 6: Summarised RPG for the illustrative Blocksworld example from Figure 1.

As we explained above, the extraction process starts by considering the goals on(C A) and on(B D) as landmark candidates. The RPG level of on(C A) is 3, the level of on(B D) is 2. There is only one action with level 2 that achieves on(C A): stack(C A). So, holding(C) (level 2) and clear(A) (level 0) are new candidates. The new LGG is: ({on(C A), on(B D), holding(C), clear(A)}, {holding(C) $\to_{gn}$ on(C A), clear(A) $\to_{gn}$ on(C A)}). Processing on(B D), we find that its only earliest achiever is stack(B D), and we generate the new candidates holding(B) (level 1) and clear(D) (level 0) with the respective edges. In the next iteration, holding(C) (level 2) produces the new candidates clear(C) (level 1), on-table(C) (level 0), and arm-empty() (level 0) by the achiever pick-up(C); and holding(B) (level 1) produces the new candidates on-table(B) (level 0) and clear(B) (level 0) by the achiever pick-up(B). In the third and final iteration of the algorithm, clear(C) (level 1) produces the new candidate on(D C) (level 0) by the achiever unstack(D C). The process ends up with the LGG as shown in Figure 7 (the edges in the depicted graph are all directed from bottom to top). Fact sets of which our LGG suggests that they have to be true together at some point, because they are either top level goals, or $\to_{gn}$ [...] fact, are grouped together in boxes. As said before, this information is important for the approximation of reasonable orders described below.

Nodes: on(C A), on(B D), holding(C), holding(B), clear(A), clear(B), clear(C), clear(D), on-table(B), on-table(C), on(D C), arm-empty()

Figure 7: LGG for the illustrative Blocksworld task, containing the found landmarks and $\to_{gn}$ orders.

4.1.2 Landmark Verification

As said before, we verify landmark candidates by evaluating a sufficient condition on them, and throwing away those candidates where the condition fails. The condition we use is the following.

Proposition 1 Given a planning task $(A, I, G)$, and a fact $L$. Define a modified action set $A_L$ as follows.

$A_L := \{(\mathit{pre}(a), \mathit{add}(a), \emptyset) \mid (\mathit{pre}(a), \mathit{add}(a), \mathit{del}(a)) \in A, L \notin \mathit{add}(a)\}$

If $(A_L, I, G)$ is unsolvable, then $L$ is a landmark in $(A, I, G)$.

Note that the inverse direction of the proposition does not hold (that is, if $L$ is a landmark in $(A, I, G)$ then $(A_L, I, G)$ is not necessarily unsolvable) because ignoring the delete lists simplifies the achievement of the goals. As mentioned earlier, deciding about solvability of planning tasks with empty delete lists can be done in polynomial time by building the RPG. The task is unsolvable iff the RPG can't reach the goals. So our landmark verification process looks at all landmark candidates in turn. Candidates that are top level goals or initial facts are trivially landmarks, so they need not be verified. For each of the other candidates $L$, the RPG corresponding to $(A_L, I, G)$ is built, and if that RPG reaches the goals, then $L$ and its incident edges are removed from the LGG.

Reconsider the road map example depicted in Figure 5. The LGG built will be ({at(A), at(E), at(D)}, {at(A) $\to_{gn}$ at(E), at(E) $\to_{gn}$ at(D)}). But at(E) is not really a landmark [...] verifying at(E), we detect this. In the RPG, when ignoring all actions that achieve at(E), move(A,B), move(B,C), and move(C,D) stay in and so the goal remains reachable. Thus at(E) and its edges (in particular, the invalid edge at(E) $\to_{gn}$ at(D)) are removed, yielding the final (trivial) LGG with node set {at(A), at(D)} and empty edge set. Note that, if at(E) was a landmark for some other reason than reaching D (for example, if one had to pick up some object at E), then at(E) would not be removed by landmark verification and the invalid order at(E) $\to_{gn}$ at(D) would stay in.
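The verification step of Proposition 1 amounts to a relaxed reachability test with the achievers of the candidate removed. The Python sketch below is illustrative only and not the original implementation; the function names are hypothetical, and the road map is again assumed to consist of the bidirectional roads A-B, B-C, C-D, A-E, and E-D.

```python
def relaxed_solvable(actions, init, goal):
    """Reachability with delete lists ignored (fixpoint over add effects)."""
    reached, changed = set(init), True
    while changed and not goal <= reached:
        changed = False
        for pre, add in actions:
            if pre <= reached and not add <= reached:
                reached |= add
                changed = True
    return goal <= reached


def is_verified_landmark(fact, actions, init, goal):
    """Sufficient test of Proposition 1: True means `fact` is proved a landmark.

    False only means the test failed to prove it, not that `fact` is no landmark.
    """
    if fact in init or fact in goal:
        return True                           # trivially landmarks
    without_achievers = [(pre, add) for pre, add in actions if fact not in add]
    return not relaxed_solvable(without_achievers, init, goal)


def road_actions(roads):
    """Relaxed move actions (pre, add) for bidirectional roads."""
    return [({f"at({s})"}, {f"at({d})"})
            for x, y in roads for s, d in ((x, y), (y, x))]


ALL_ROADS = [("A", "B"), ("B", "C"), ("C", "D"), ("A", "E"), ("E", "D")]
NO_C_D = [road for road in ALL_ROADS if road != ("C", "D")]

print(is_verified_landmark("at(E)", road_actions(ALL_ROADS), {"at(A)"}, {"at(D)"}))
print(is_verified_landmark("at(E)", road_actions(NO_C_D), {"at(A)"}, {"at(D)"}))
```

With the full map the goal stays reachable via B and C, so at(E) is discarded; if the road C-D is removed (a hypothetical variant), every route to D passes through E and the test proves at(E) to be a landmark.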

In the Blocksworld example from Figure 1, landmark verification does not remove any candidates, and the LGG remains unchanged as depicted in Figure 7.

4.2 Approximating Reasonable Orders

Our process to approximate reasonable orders starts from the LGG as computed by the methods described above, and enriches the LGG with new edges corresponding to the approximated reasonable orders. The process has two main aspects:

1. We approximate the aftermath relation based on the LGG. This is done by evaluating a sufficient condition that covers certain cases when greedy necessary orders imply the aftermath relation.

2. We combine the aftermath relation with interference information to approximate reasonable orders. For each pair of landmarks $L'$ and $L$ such that $L'$ is in the aftermath of $L$ according to the previous approximations, a sufficient condition is evaluated. The condition covers certain cases when $L$ interferes with $L'$, i.e., when achieving $L$ (from a state in $S_{(L', \neg L)}$) involves deleting $L'$. If the condition holds, a reasonable order $L \to_r L'$ is introduced.

The next two subsections focus on these two aspects in turn. In our implementation, the computation of the aftermath relation is interleaved with its combination with interference information. Pseudo-code for the overall algorithm is given in the second subsection, Figure 8.

4.2.1 Aftermath Relation

The sufficient condition that we use to approximate the aftermath relation is the following.

Lemma 1 Given a planning task $(A, I, G)$, and two landmarks $L$ and $L'$. If either

1. $L' \in G$, or

2. there are landmarks $L = L_1, \ldots, L_{n+1}$, $n \ge 1$, $L_n \ne L'$, such that $L_i \to_{gn} L_{i+1}$ for $1 \le i \le n$, and $L' \to_{gn} L_{n+1}$,

then $L'$ is in the aftermath of $L$.

Proof Sketch: If $L' \in G$ then $L'$ is trivially in the aftermath of $L$. Otherwise, under the given circumstances, $L'$ and $L_n$ must be true together at some point in any action sequence achieving the goal from a state in $S_{(L', \neg L)}$, namely directly prior to achievement of $L$ [...] As $L$ has a path of $\to_{gn}$ orders to $L_n$, it has to be true prior to (or simultaneously with, if $n = 1$) $L'$. □

Note that this lemma just captures the property we mentioned before, when we can tell from the LGG that several facts must be true together at some point. In the second case of the lemma, these facts are $L'$ and $L_n$. $L'$ and $L_n$ are both ordered $\to_{gn}$ before $L_{n+1}$ and so must be true together before achieving that fact. The first case of the lemma can be understood this way as implicitly assuming $L_n$ as some other top-level goal that $L$ has a path of $\to_{gn}$ orders to. (In our implementation, $L$ must have such a path in the LGG or it would not have been generated as a landmark candidate.) If $L'$ and $L_n$ must be true together, and we additionally know that $L$ must be true sometime before $L_n$, then we know that $L'$ is in the aftermath of $L$ (see footnote 7).

The most straightforward idea to make use of Lemma 1 would be to simply enumerate all pairs of nodes (landmarks) in the LGG, and evaluate the lemma, collecting the pairs $L$ and $L'$ of landmarks where the lemma condition holds. While this would probably not be prohibitively runtime-costly, one can do better by having a closer look at the lemma condition. Consider each node $L'$ in the LGG in turn. If $L'$ is a top level goal, then $L'$ is in the aftermath of all other nodes $L$. If $L'$ is not a top level goal, then consider all nodes $L_n \ne L'$ such that $L'$ and $L_n$ both have a $\to_{gn}$ order before some other node $L_{n+1}$. The nodes $L$ in the LGG that have a (possibly empty) outgoing $\to_{gn}$ path to such an $L_n$ are exactly those for which $L'$ is in the aftermath of $L$ according to Lemma 1. As said, pseudo-code for our overall approximation of reasonable orders is given below in Figure 8.

Note that the inputs to Lemma 1 are $\to_{gn}$ orders, while in practice we evaluate the lemma on the edges in the LGG as generated by the processes described above in Section 4.1. As we discussed above, the edges in the LGG may be unsound, i.e. they do not provably correspond to $\to_{gn}$ orders. In effect, neither can we guarantee that our approximation to the aftermath relation is sound.

4.2.2 Reasonable Orders

We approximate reasonable orders by considering all pairs $L$ and $L'$ where $L'$ is in the aftermath of $L$ according to the above approximation. We test if $L$ interferes with $L'$ according to the definition directly below. If the test succeeds, we introduce the order $L \to_r L'$.

Definition 6 $L$ interferes with $L'$ if one of the following conditions holds:

1. $L$ and $L'$ are inconsistent;

2. there is a fact $x \in \bigcap_{a \in A, L \in \mathit{add}(a)} \mathit{add}(a)$, $x \ne L$, such that $x$ is inconsistent with $L'$;

3. $L' \in \bigcap_{a \in A, L \in \mathit{add}(a)} \mathit{del}(a)$;

4. or there is a landmark $x$ inconsistent with $L'$ such that $x \to_{gn} L$.

7. In theory, one could also allow $L' = L_n \ne L$ in Lemma 1. In this case, $L$ has a path of $\to_{gn}$ orders to $L'$, which trivially implies that $L'$ is in the aftermath of $L$. But in fact, it is then impossible to achieve $L'$ before $L$ so an order $L \to_r L'$ [...]

As said before, our (standard) definition of inconsistency is that facts $x$ and $y$ are inconsistent in a planning task if there is no reachable state in the task that contains both $x$ and $y$ (see footnote 8). Note that the conditions 1 to 4 of Definition 6, while they may look closely related at first sight (and presumably are related in many practical examples), indeed cover different cases of when achieving $L$ involves deleting $L'$. More formally expressed, for each condition $i$ there are cases where $i$ holds but no condition $j \ne i$ holds. For example, consider condition 2. In the following example, there is a reasonable order $L \to_r L'$, and $L$ interferes with $L'$ due to condition 2 only. There are the six facts $L$, $L'$, $x$, $P_1$, $P_2$, and $P'$. Initially only $P'$ is true, and the goal is to have $L$ and $L'$. The actions, written as (precondition, add list, delete list), are:

opL'  = ({P'},  {L'},        {x})
opP_1 = ({P'},  {P_1},       {L', P'})
opP_2 = ({P'},  {P_2},       {L', P'})
opL_1 = ({P_1}, {L, x, P'},  {P_1})
opL_2 = ({P_2}, {L, x, P'},  {P_2})

In this example, the only action sequences that are possible are of the form (opL' | opP_1 ∘ opL_1 | opP_2 ∘ opL_2)*, in BNF-style notation. In effect, $L \to_r L'$ because if we achieve $L'$ first, we have to apply one of opP_1 and opP_2, which both delete $L'$. Condition 2 holds: $x$ is inconsistent with $L'$ and added by both opL_1 and opL_2. As for condition 1, $L$ and $L'$ are not inconsistent because one can apply opL' after, e.g., opL_1. Condition 3 is obviously not fulfilled, and condition 4 is not fulfilled because there are two options to achieve $L$ so no fact has a $\to_{gn}$ order before $L$.
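The claims about conditions 1 to 3 can be checked by brute force on this small task. The Python sketch below is an illustration, not the technique used in the paper: it computes exact inconsistency by enumerating all reachable states (feasible only for tiny tasks, whereas the paper relies on an incomplete approximation), and it leaves out condition 4, which would need $\to_{gn}$ orders. All function names are hypothetical.

```python
from collections import deque

# Actions as (preconditions, add list, delete list), copied from the example.
ACTIONS = {
    "opL'": ({"P'"}, {"L'"},           {"x"}),
    "opP1": ({"P'"}, {"P1"},           {"L'", "P'"}),
    "opP2": ({"P'"}, {"P2"},           {"L'", "P'"}),
    "opL1": ({"P1"}, {"L", "x", "P'"}, {"P1"}),
    "opL2": ({"P2"}, {"L", "x", "P'"}, {"P2"}),
}
INIT = frozenset({"P'"})


def reachable_states():
    """Breadth-first enumeration of every state reachable from INIT."""
    seen, queue = {INIT}, deque([INIT])
    while queue:
        state = queue.popleft()
        for pre, add, dele in ACTIONS.values():
            if pre <= state:
                nxt = frozenset((state - dele) | add)
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append(nxt)
    return seen


STATES = reachable_states()


def inconsistent(a, b):
    """Exact inconsistency: no reachable state contains both facts."""
    return not any(a in s and b in s for s in STATES)


def interference_conditions(l, lp):
    """Evaluate conditions 1 to 3 of Definition 6 for 'l interferes with lp'."""
    achievers = [(add, dele) for _, add, dele in ACTIONS.values() if l in add]
    shared_adds = set.intersection(*(add for add, _ in achievers)) - {l}
    shared_dels = set.intersection(*(dele for _, dele in achievers))
    return {
        1: inconsistent(l, lp),
        2: any(inconsistent(x, lp) for x in shared_adds),
        3: lp in shared_dels,
    }


conditions = interference_conditions("L", "L'")
print(conditions)
```

Only condition 2 evaluates to true for the pair $L$, $L'$, in line with the discussion above.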

Interference together with the aftermath relation implies reasonable orders between landmarks.

Theorem 6 Given a planning task $(A, I, G)$, and two landmarks $L$ and $L'$. If $L$ interferes with $L'$, and either

1. $L' \in G$, or

2. there are landmarks $L = L_1, \ldots, L_{n+1}$, $n \ge 1$, $L_n \ne L'$, such that $L_i \to_{gn} L_{i+1}$ for $1 \le i \le n$, and $L' \to_{gn} L_{n+1}$,

then there is a reasonable order between $L$ and $L'$, $L \to_r L'$.

Proof Sketch: By Lemma 1, $L'$ is in the aftermath of $L$. Let us look at the four possible reasons for interference. If $L$ is inconsistent with $L'$ then obviously achieving $L$ involves deleting $L'$. If all actions that achieve $L$ add a fact that is inconsistent with $L'$, the same argument applies. The case where all actions that achieve $L$ delete $L'$ is obvious. As for

8. Deciding about inconsistency is obviously PSPACE-hard. Just imagine a task where we insert one of the facts into the initial state, and the other fact such that it can only be made true once the original goal has been achieved. We approximate inconsistency with a sound but incomplete technique developed by [...]

the last case, say we are in a state $s \in S_{(L', \neg L)}$. Then $x$ is not in $s$ (because $L'$ is). Due to $x \to_{gn} L$, $x$ must be achieved directly prior to $L$, and thus $L'$ will be deleted. □

Overall, our method for approximating reasonable orders based on the LGG works as specified in Figure 8. With what was said above, the algorithm should be self-explanatory except for the interference tests. When doing these tests, we need information about fact inconsistencies, and, for condition 4 of Definition 6, about $\to_{gn}$ orders. Our approximation to the latter piece of information is, as before, the set of (approximate) $\to_{gn}$ edges in the LGG. Our approximation to the former piece of information is a technique from the literature (Fox & Long, 1998), the TIM API. This provides a function TIMinconsistent(x,y) that, for facts $x$ and $y$, returns TRUE only if $x$ and $y$ are inconsistent. The function is incomplete, i.e., it can return FALSE even if $x$ and $y$ are inconsistent.

for all nodes $L'$ in the LGG do
    if $L' \in G$ then
        for all nodes $L \ne L'$ in the LGG do
            if $L$ interferes with $L'$, then insert the edge $L \to_r L'$ into the LGG
        endfor
    else
        for all nodes $L_n \ne L'$ in the LGG
                s.t. there are a node $L_{n+1}$ and edges $L' \to_{gn} L_{n+1}$, $L_n \to_{gn} L_{n+1}$ in the LGG do
            for all nodes $L$ in the LGG
                    s.t. $L$ has a (possibly empty) outgoing path of $\to_{gn}$ edges to $L_n$ do
                if $L$ interferes with $L'$, then insert the edge $L \to_r L'$ into the LGG
            endfor
        endfor
    endif
endfor

Figure 8: Approximating reasonable orders based on the LGG.
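The procedure of Figure 8 can be sketched in Python as follows. This is an illustration under several assumptions, not the original implementation: the $\to_{gn}$ edges are read off the Blocksworld walkthrough in Section 4.1.1, the interference test covers only conditions 1 and 4 of Definition 6, the inconsistency information is a single hand-supplied pair standing in for the TIM-based test, and pairs with $L = L'$ are skipped. All names are hypothetical.

```python
def approximate_reasonable_orders(nodes, gn_edges, goals, interferes):
    """Return the (L, L') pairs for which an edge L ->r L' is inserted."""
    succ = {n: set() for n in nodes}
    pred = {n: set() for n in nodes}
    for a, b in gn_edges:
        succ[a].add(b)
        pred[b].add(a)

    def with_path_to(node):
        """All nodes with a (possibly empty) path of ->gn edges to `node`."""
        found, stack = {node}, [node]
        while stack:
            for p in pred[stack.pop()]:
                if p not in found:
                    found.add(p)
                    stack.append(p)
        return found

    orders = set()
    for lp in nodes:
        if lp in goals:
            sources = set(nodes)
        else:
            sources = set()
            for ln1 in succ[lp]:               # candidates for L_{n+1}
                for ln in pred[ln1] - {lp}:    # candidates for L_n != L'
                    sources |= with_path_to(ln)
        # Assumption: a landmark is never ordered against itself.
        orders |= {(l, lp) for l in sources - {lp} if interferes(l, lp)}
    return orders


# ->gn edges assumed from the Blocksworld walkthrough (preconditions of
# stack(C A), stack(B D), pick-up(C), pick-up(B), and unstack(D C)).
GN_EDGES = {
    ("holding(C)", "on(C A)"), ("clear(A)", "on(C A)"),
    ("holding(B)", "on(B D)"), ("clear(D)", "on(B D)"),
    ("clear(C)", "holding(C)"), ("on-table(C)", "holding(C)"),
    ("arm-empty()", "holding(C)"),
    ("on-table(B)", "holding(B)"), ("clear(B)", "holding(B)"),
    ("arm-empty()", "holding(B)"),
    ("on(D C)", "clear(C)"), ("clear(D)", "clear(C)"),
    ("arm-empty()", "clear(C)"),
}
NODES = {n for edge in GN_EDGES for n in edge}
GOALS = {"on(C A)", "on(B D)"}
# Hand-supplied inconsistency information (assumption, deliberately partial).
INCONSISTENT = {frozenset(("clear(D)", "on(B D)"))}


def interferes(l, lp):
    """Conditions 1 and 4 of Definition 6 only (2 and 3 need action lists)."""
    if frozenset((l, lp)) in INCONSISTENT:
        return True
    return any(b == l and frozenset((x, lp)) in INCONSISTENT
               for x, b in GN_EDGES)


orders = approximate_reasonable_orders(NODES, GN_EDGES, GOALS, interferes)
print(sorted(orders))
```

With this partial inconsistency information the sketch produces clear(C) $\to_r$ on(B D) via condition 4, and additionally clear(D) $\to_r$ on(B D) via condition 1; the latter duplicates an existing $\to_{gn}$ edge and is one of the superfluous orders that the special-case check on short $\to_{gn}$ paths is meant to avoid.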

Note that the algorithm from Figure 8 might generate orders $L \to_r L'$ in cases where $L$ already has a path of $\to_{gn}$ edges to $L'$. As noted earlier, in this case $L'$ can not be achieved before $L$ so the order $L \to_r L'$ is meaningless. One could avoid such meaningless orders by an additional check to see, for every generated pair $L$ and $L'$, if $L$ has an outgoing $\to_{gn}$ path to $L'$. We do this in our implementation only for the easy-to-check special cases where the length of the $\to_{gn}$ path from $L$ to $L'$ is 1 or 2. Note that the superfluous $\to_r$ orders don't hurt anyway; in fact they don't change our search process (Section 5.1) at all. The only purpose of our special case test is to avoid some unnecessary evaluations of Definition 6.

Because the inputs to our approximation algorithm are $\to_{gn}$ edges in the LGG, and as discussed before these edges are not provably sound, the resulting $\to_r$ orders are not provably sound (which they otherwise would be by Theorem 6).

Let us finish off our running Blocksworld example, by showing how the order clear(C) $\to_r$ on(B D), our motivating example from the introduction, is found. Have a look at the LGG in Figure 7. Say the process depicted in Figure 8 considers, in its outermost for-loop, the LGG node $L'$ = on(B D). $L'$ is a top level goal so all other nodes $L$ in the LGG, in [...] on(B D) because of condition 4 in Definition 6: clear(D) is inconsistent with on(B D), and has an edge clear(D) $\to_{gn}$ clear(C) in the LGG. Consequently the order clear(C) $\to_r$ on(B D) is inferred and introduced into the LGG. Note that, to make this inference, we need the edge clear(D) $\to_{gn}$ clear(C) which is not a $\to_n$ order.

4.3 Approximating Obedient Reasonable Orders

The process that approximates obedient reasonable orders starts from the LGG already containing the approximated reasonable orders, and inserts new orders that are reasonable given one commits to the $\to_r$ orders already present in the LGG. The technology is very similar to the technology we use to approximate reasonable orders. Largely, we do the same thing as before and just treat the $\to_r$ edges as if they were additional $\to_{gn}$ edges. Formally, the difference lies in the sufficient criterion for the, now obedient, aftermath relation.

Lemma 2 Given a planning task $(A, I, G)$, a set $O$ of reasonable ordering constraints, and two landmarks $L$ and $L'$. If either

1. $L' \in G$, or

2. there are landmarks $L = L_1, \ldots, L_{n+1}$, $n \ge 1$, $L_n \ne L'$, such that $L_i \to_{gn} L_{i+1}$ or $L_i \to_r L_{i+1} \in O$ for $1 \le i \le n$, and $L' \to_{gn} L_{n+1}$,

then $L'$ is in the obedient aftermath of $L$.

Proof Sketch: By a simple modification of the proof to Lemma 1. The first case is obvious; in the second case $L'$ must be true one step before $L_{n+1}$ becomes true, and $L$ must be true sometime before that. □

Note that the proved property does not hold if there is only a reasonable order between $L'$ and $L_{n+1}$, $L' \to_r L_{n+1}$ instead of $L' \to_{gn} L_{n+1}$, even if we have committed to obey $L' \to_r L_{n+1}$. It is essential that $L'$ must be true directly before $L_{n+1}$ (see footnote 9).

The parts of our technology that do not depend on the aftermath relation remain unchanged. Interference is defined exactly as before. Together with the obedient aftermath relation, it implies obedient reasonable orders between landmarks.

Theorem 7 Given a planning task $(A, I, G)$, a set $O$ of reasonable ordering constraints, and two landmarks $L$ and $L'$. If $L$ interferes with $L'$, and either

1. $L' \in G$, or

2. there are landmarks $L = L_1, \ldots, L_{n+1}$, $n \ge 1$, $L_n \ne L'$, such that $L_i \to_{gn} L_{i+1}$ or $L_i \to_r L_{i+1} \in O$ for $1 \le i \le n$, and $L' \to_{gn} L_{n+1}$,

then there is an obedient reasonable order between $L$ and $L'$, $L \to^O_r L'$.

9. If obeying an order $L_1 \to_r L_2$ is defined to include the case where $L_1$ and $L_2$ are achieved simultaneously, the lemma does not hold. The facts $L_1, \ldots, L_{n+1}$ could then all be achieved with a single action, given the orders between them are all (only) taken from the set $O$. One can "repair" the lemma by requiring that, for at least one of the $i$ where $L_i \not\to_{gn} L_{i+1}$ but $L_i \to_r L_{i+1} \in O$, there is no action that has both $L_i$ and $L_{i+1}$