Ordered Landmarks in Planning
Jörg Hoffmann [email protected]
Max-Planck-Institut für Informatik,
Saarbrücken, Germany
Julie Porteous [email protected]
Department of Computer and Information Sciences,
The University of Strathclyde,
Glasgow, UK
Laura Sebastia [email protected]
Dpto. Sist. Informáticos y Computación,
Universidad Politécnica de Valencia,
Valencia, Spain
Abstract
Many known planning tasks have inherent constraints concerning the best order in
which to achieve the goals. A number of research efforts have been made to detect such
constraints and to use them for guiding search, in the hope of speeding up the planning
process.
We go beyond the previous approaches by considering ordering constraints not only
over the (top-level) goals, but also over the sub-goals that will necessarily arise during
planning. Landmarks are facts that must be true at some point in every valid solution
plan. We extend Koehler and Hoffmann's definition of reasonable orders between top level
goals to the more general case of landmarks. We show how landmarks can be found,
how their reasonable orders can be approximated, and how this information can be used
to decompose a given planning task into several smaller sub-tasks. Our methodology is
completely domain- and planner-independent. The implementation demonstrates that the
approach can yield significant runtime performance improvements when used as a control
loop around state-of-the-art sub-optimal planning systems, as exemplified by FF and LPG.
1. Introduction
Given the inherent complexity of the general planning problem it is clearly important to
develop good heuristic strategies for both managing and navigating the search space involved
in solving a particular planning instance. One way in which search can be informed is by
providing hints concerning the order in which planning goals should be addressed. This
can make a significant difference to search efficiency by helping to focus the planner on
a progressive path towards a solution. Work in this area includes that of Koehler and
Hoffmann (2000). They introduce the notion of reasonable orders which states that a pair
of goals A and B can be ordered so that B is achieved before A if it isn't possible to reach a
state in which A and B are both true, from a state in which just A is true, without having
to temporarily destroy A. In such a situation it is reasonable to achieve B before A to avoid
The main idea behind the work discussed in this paper is to extend those previous
ideas on orders by not only ordering the (top-level) goals, but also the sub-goals that
will necessarily arise during planning, i.e., by also taking into account what we call the
landmarks. The key feature of a landmark is that it must be true at some point on any
solution path to the given planning task. Consider the Blocksworld task shown in Figure 1,
which will be our illustrative example throughout the paper.
[Figure: two Blocksworld configurations over the blocks A, B, C, and D, labelled "initial state" and "goal"]
Figure 1: Example Blocksworld task.
For the reader who is weary of seeing toy examples like the one in Figure 1 in the
literature, we remark that our techniques are not primarily motivated by this example. Our
techniques are useful in much more complex situations. We use the depicted toy example
only for easy demonstration of some of the important points. In the example, clear(C) is
a landmark because it will need to be achieved in any solution plan. Immediately stacking
B on D from the initial state will achieve one of the top level goals of the task but it will
result in wasted effort if clear(C) is not achieved first. To order clear(C) before on(B D) is,
however, not reasonable in terms of Koehler and Hoffmann's definition. First, clear(C) is not
a top level goal so it is not considered by Koehler and Hoffmann's techniques. Second, there
are states where B is on D and from which clear(C) can be achieved without unstacking B
from D again (compare the definition of reasonable orders given above). But reaching such
a state requires unstacking D from C, and thus achieving clear(C), in the first place. This,
together with the fact that clear(C) must be made true at some point, makes it sensible to
order clear(C) before on(B D).
We propose a natural extension of Koehler and Hoffmann's definitions to the more
general case of landmarks (trivially, all top level goals are landmarks, too). We also revise
parts of the original definition to better capture the intuitive meaning of a goal ordering. The
extended and revised definitions capture, in particular, situations of the kind demonstrated
with clear(C) and on(B D) in the toy example above. We also introduce a new kind of ordering
that often occurs between landmarks: A can be ordered before B if all valid solution plans
make A true before they make B true. We call such orders necessary. Typically, a fact
is a landmark because it is necessarily ordered before some other landmark. For example,
clear(C) is necessarily ordered before holding(C), and holding(C) is necessarily ordered
before the top level goal on(C A), in the above Blocksworld example.
Deciding if a fact is a landmark, and deciding about our ordering relations, is
PSPACE-complete. We describe pre-processing techniques that extract landmarks, and that
approximate necessary orders between them. We introduce sufficient criteria for the existence
of reasonable orders between landmarks. The criteria are based on necessary orders, and
inconsistencies between facts.[1] Using an inconsistency approximation technique from the
literature, we approximate reasonable orders based on our sufficient criteria. After these
pre-processes have terminated, what we get is a directed graph where the nodes are the
found landmarks, and the edges are the orders found between them. We call this graph the
landmark generation graph, short LGG. This graph may contain cycles because for some of
our orders there is no guarantee that there is a plan, or even an action sequence, that obeys
them.[2] Our method for structuring the search for a plan cannot handle cycles in the LGG,
so we remove cycles by removing edges incident upon them. We end up with a polytree
structure.[3]
Once turned into a polytree, the LGG can be used to decompose the planning task
into small chunks. We propose a method that does not depend on any particular planning
framework. The landmarks provide a search control loop that can be used around any base
planner that is capable of dealing with STRIPS input. The search control does not preserve
optimality so there is not much point in using it around optimal planners such as Graphplan
(Blum & Furst, 1997) and its relatives. Optimal planners are generally outperformed by
sub-optimal planners anyway. It does make sense, however, to use the control in order
to further improve the runtime performance of sub-optimal approaches to planning. To
demonstrate this, we used the technique for control of two versions of FF (Hoffmann, 2000;
Hoffmann & Nebel, 2001), and for control of LPG (Gerevini, Saetti, & Serina, 2003). We
evaluated these planners across a range of 8 domains. We consistently obtain, sometimes
dramatic, runtime improvements for the FF versions. We obtain runtime improvements
for LPG in around half of the domains. The runtime improvement is, for all the planners,
usually bought at the cost of slightly longer plans. But there are also some cases where the
plans become shorter when using landmarks control.
The paper is organised as follows. Section 2 gives the basic notations. Section 3 defines
what landmarks are, and in what relations between them we are interested. Exact
computation of the relevant pieces of information is shown to be PSPACE-complete. Section 4
explains our approximation techniques, and Section 5 explains how we use landmarks to
structure the search of an arbitrary base planner. Section 6 provides our empirical results in
a range of domains. Section 7 closes the paper with a discussion of related work, of our
contributions, and of future work. Most proofs are moved into Appendix A, and replaced in the
text by proof sketches, to improve readability. Appendix B provides runtime distribution
graphs as supplementary material to the tables provided in Section 6. Appendix C discusses
some details regarding our experimental implementation of landmarks control around LPG.
2. Also, none of our ordering relations is transitive. We stick to the word "order" only because it is the
most intuitive word for constraints on the relative points in time at which planning facts can or should
be achieved.
3. Removing edges incident on cycles might, of course, throw away useful ordering information. Coming
up with other methods to treat cycles, or with methods that can exploit the information contained in

2. Notations
We consider sequential planning in the propositional STRIPS (Fikes & Nilsson, 1971)
framework. In the following, all sets are assumed to be finite. A state $s$ is a set of logical facts
(atoms). An action $a$ is a triple $a = (pre(a), add(a), del(a))$ where $pre(a)$ are the action's
preconditions, $add(a)$ is its add list, and $del(a)$ is its delete list, each a set of facts. The
result of applying (the action sequence consisting of) a single action $a$ to a state $s$ is:

$$Result(s, \langle a \rangle) = \begin{cases} (s \cup add(a)) \setminus del(a) & \text{if } pre(a) \subseteq s \\ \text{undefined} & \text{otherwise} \end{cases}$$

The result of applying a sequence of more than one action to a state is recursively defined as
$Result(s, \langle a_1, \ldots, a_n \rangle) = Result(Result(s, \langle a_1, \ldots, a_{n-1} \rangle), \langle a_n \rangle)$. Applying an empty action
sequence changes nothing, i.e., $Result(s, \langle \rangle) = s$. A planning task $(A, I, G)$ is a triple where
$A$ is a set of actions, and $I$ (the initial state) and $G$ (the goals) are sets of facts (we use
the word "task" rather than "problem" in order to avoid confusion with the
complexity-theoretic notion of decision problems). A plan, or solution, for a task $(A, I, G)$ is an action
sequence $P \in A^*$ such that $G \subseteq Result(I, P)$.
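The notation of this section can be made concrete with a minimal Python sketch. The names `Action`, `result`, `is_plan`, and `act`, as well as the two-action example task, are illustrative assumptions and not part of the paper; `None` stands in for "undefined".

```python
from typing import FrozenSet, Iterable, NamedTuple, Optional, Sequence


class Action(NamedTuple):
    """A STRIPS action a = (pre(a), add(a), del(a)); the name is only a label."""
    name: str
    pre: FrozenSet[str]
    add: FrozenSet[str]
    delete: FrozenSet[str]


def act(name: str, pre: Iterable[str], add: Iterable[str], delete: Iterable[str]) -> Action:
    """Convenience constructor that freezes the three fact sets."""
    return Action(name, frozenset(pre), frozenset(add), frozenset(delete))


def result(state: Iterable[str], actions: Sequence[Action]) -> Optional[FrozenSet[str]]:
    """Result(s, <a_1, ..., a_n>); returns None where the result is undefined."""
    current = frozenset(state)
    for a in actions:
        if not a.pre <= current:          # pre(a) is not a subset of s
            return None
        current = (current | a.add) - a.delete
    return current                        # the empty sequence changes nothing


def is_plan(actions: Sequence[Action], init: Iterable[str], goal: Iterable[str]) -> bool:
    """P is a plan for (A, I, G) iff G is a subset of Result(I, P)."""
    final = result(init, actions)
    return final is not None and frozenset(goal) <= final


# Hypothetical two-action task, used for illustration only.
pick = act("pick", {"hand-empty"}, {"holding"}, {"hand-empty"})
put = act("put", {"holding"}, {"placed", "hand-empty"}, {"holding"})
```

With these definitions, `is_plan([pick, put], {"hand-empty"}, {"placed"})` holds, whereas `result({"hand-empty"}, [put])` is undefined because the precondition of `put` is not satisfied.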
3. Ordered Landmarks: What They Are
In this section we introduce our framework. We define what landmarks are, and in what
relations between them we are interested. We show that all the corresponding decision
problems are PSPACE-complete. Section 3.1 introduces landmarks and necessary orders,
Section 3.2 introduces reasonable orders, and Section 3.3 introduces obedient reasonable
orders (orders that are reasonable if one has already committed to obey a given a-priori set
of reasonable ordering constraints).
3.1 Landmarks, and Necessary Orders
Landmarks are facts that must be true at some point during the execution of any solution
plan.
Definition 1 Given a planning task $(A, I, G)$. A fact $L$ is a landmark if for all
$P = \langle a_1, \ldots, a_n \rangle \in A^*$, $G \subseteq Result(I, P)$: $L \in Result(I, \langle a_1, \ldots, a_i \rangle)$ for some $0 \le i \le n$.
Note that in an unsolvable task all facts are landmarks (by universal quantification over
the empty set of solution plans in the above definition). The definition thus only makes
sense if the task at hand is solvable. Indeed, while our landmark techniques can help a
planning algorithm to find a solution plan faster (as we will see later), they are not useful
for proving unsolvability. The reasonable orders we will introduce are based on heuristic
notions that make sense intuitively, but that are not mandatory in the sense that every
solution plan obeys them, or even in the sense that there exists a solution plan that obeys
them. Details on this topic are given with the individual concepts below. We remark that
we make these observations only to clarify the meaning of our definitions. Given the way we
use the landmarks information for planning, for our purposes it is not essential whether or
not an ordering constraint is mandatory. Our search control loop only suggests to the planner
what might be good to achieve next, it does not force the planner to do so (see Section 5).
Initial and goal facts are trivially landmarks: set $i$ to 0 or $n$, respectively, in Definition 1.
In general, it is PSPACE-complete to decide whether a fact is a landmark or not.
Theorem 1 Let LANDMARK denote the following problem: given a planning task $(A, I, G)$,
and a fact $L$; is $L$ a landmark?
Deciding LANDMARK is PSPACE-complete.
Proof Sketch: PSPACE-hardness follows by a straightforward reduction of the
complement of PLANSAT, the decision problem of whether there exists a solution plan to a
given arbitrary STRIPS task (Bylander, 1994), to the problem of deciding LANDMARK.
PSPACE-membership follows vice versa. □
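For very small tasks, Definition 1 can be checked directly by exhaustive search. The following Python sketch is only an illustration of the definition, not one of the approximation techniques of Section 4, and it takes time exponential in the number of facts in general. It uses the observation that a fact is a landmark exactly when no goal state can be reached from the initial state through states that all lack the fact. The function names and the small "door" task are hypothetical.

```python
from collections import deque


def act(pre, add, dele):
    """An action as a triple (pre, add, del) of frozen fact sets."""
    return (frozenset(pre), frozenset(add), frozenset(dele))


def is_landmark(fact, actions, init, goal):
    """True iff every solution plan visits a state containing `fact`.

    We search for a counter example: a goal state reachable from `init`
    along states that never contain `fact`. If the task is unsolvable,
    no counter example exists and every fact counts as a landmark.
    """
    init, goal = frozenset(init), frozenset(goal)
    if fact in init:
        return True                      # i = 0 in Definition 1
    seen, queue = {init}, deque([init])
    while queue:
        state = queue.popleft()
        if goal <= state:
            return False                 # a plan that never makes `fact` true
        for pre, add, dele in actions:
            if pre <= state:
                succ = (state | add) - dele
                if fact not in succ and succ not in seen:
                    seen.add(succ)
                    queue.append(succ)
    return True


# Hypothetical task: get a key (in one of two ways), unlock the door, enter.
ACTIONS = [
    act({"outside"}, {"has_key"}, set()),                      # take own key
    act({"outside"}, {"has_key", "owes_favour"}, set()),       # borrow a key
    act({"has_key"}, {"door_open"}, set()),                    # unlock
    act({"door_open", "outside"}, {"inside"}, {"outside"}),    # enter
]
INIT, GOAL = {"outside"}, {"inside"}
```

In this task `has_key` and `door_open` are landmarks, while `owes_favour` is not, since the key can be obtained without borrowing it.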
Full proofs are in Appendix A. One of the most elementary ordering relations between
a pair $L$ and $L'$ of landmarks is the following. In any action sequence that makes $L'$ true in
some state, $L$ is true in the immediately preceding state. Typically, a fact $L$ is a landmark
because it is ordered in this way before some other landmark $L'$. The reason is typically
that $L$ is a necessary prerequisite (a shared precondition) for achieving $L'$. We will exploit
this for our approximation techniques in Section 4.
Definition 2 Given a planning task $(A, I, G)$, and two facts $L$ and $L'$. There is a necessary
order between $L$ and $L'$, written $L \rightarrow_n L'$, if $L' \notin I$, and for all $P = \langle a_1, \ldots, a_n \rangle \in A^*$: if
$L' \in Result(I, \langle a_1, \ldots, a_n \rangle)$ then $L \in Result(I, \langle a_1, \ldots, a_{n-1} \rangle)$.
The definition allows for arbitrary facts, but the case that we will be interested in is the
case where $L$ and $L'$ are landmarks. Note that if $L' \in Result(I, \langle a_1, \ldots, a_n \rangle)$ then $n \ge 1$
as $L' \notin I$. The intention behind a necessary order $L \rightarrow_n L'$ is that one must have $L$ true
before one can have $L'$ true. So it does not make sense to allow such orders for initial facts
$L'$. It is important that $L$ is postulated to be true directly before $L'$: this way, if two facts
$L$ and $L''$ are necessarily ordered before the same fact $L'$, one can conclude that $L$ and $L''$
must be true together at some point. We make use of this observation in our approximation
of reasonable orders (see Section 4.2).
We denote necessary orders, and all the other ordering relations we will introduce, as
directed graph edges "$\rightarrow$" rather than with the more usual "<" symbol. We do this to avoid
confusion about the meaning of our relations. As said earlier, none of the ordering relations
we introduce is transitive. (Note that $\rightarrow_n$ would be transitive if $L$ was only postulated to
hold sometime before $L'$, not directly before it.)
Necessary orders are mandatory. We say that an action sequence $\langle a_1, \ldots, a_n \rangle$ obeys an
order $L \rightarrow L'$ if the sequence makes $L$ true the first time before it makes $L'$ true the first
time. Precisely, $\langle a_1, \ldots, a_n \rangle$ obeys $L \rightarrow L'$ if either $L \in I$, or
$\min\{i \mid L \in add(a_i)\} < \min\{i \mid L' \in add(a_i)\}$ where the minimum over an empty set is $\infty$. That is, either $L$ is
true initially, or $L'$ is not added at all, or $L$ is added before $L'$. By definition, any action
sequence obeys necessary orders. So one does not lose solutions if one forces a planner to
obey necessary orders, i.e. if one disallows plans violating the orders. (We reiterate that
this is a purely theoretical observation; as said above, our search control does not enforce
the found ordering constraints.)
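The formal condition for obeying an order translates almost literally into code. The sketch below is illustrative only: the function names and the three example actions are assumptions, and actions are represented as (pre, add, del) triples of fact sets. It follows the formal condition as stated, using a strict comparison of first-add indices with infinity for facts that are never added.

```python
import math


def first_add_index(fact, plan):
    """Smallest 1-based i with fact in add(a_i); infinity if never added."""
    for i, (_pre, add, _dele) in enumerate(plan, start=1):
        if fact in add:
            return i
    return math.inf


def obeys(plan, init, before, after):
    """Does `plan` obey the order before -> after, given initial state `init`?"""
    if before in init:
        return True
    return first_add_index(before, plan) < first_add_index(after, plan)


# Hypothetical three-step action sequence: adds x, then y, then x again.
a1 = (frozenset(), frozenset({"x"}), frozenset())
a2 = (frozenset({"x"}), frozenset({"y"}), frozenset({"x"}))
a3 = (frozenset({"y"}), frozenset({"x"}), frozenset())
PLAN = [a1, a2, a3]
```

Only the first time a fact is added matters: `PLAN` obeys x → y but not y → x, even though x is re-achieved after y.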
Theorem 2 Let NECESSARY-ORD denote the following problem: given a planning task
$(A, I, G)$, and two facts $L$ and $L'$; does $L \rightarrow_n L'$ hold?
Deciding NECESSARY-ORD is PSPACE-complete.
Proof Sketch: PSPACE-hardness follows by reducing the complement of PLANSAT to
NECESSARY-ORD. PSPACE-membership follows with a non-deterministic algorithm that
Another interesting relation is that of greedy necessary orders, a slightly weaker version of the
necessary orders above. We postulate not that $L$ is true prior to $L'$ in all action sequences,
but only in those action sequences where $L'$ is achieved for the first time. These are the
orders that we actually approximate and use in our implementation (see Section 4).
Definition 3 Given a planning task $(A, I, G)$, and two facts $L$ and $L'$. There is a greedy
necessary order between $L$ and $L'$, written $L \rightarrow_{gn} L'$, if $L' \notin I$, and for all
$P = \langle a_1, \ldots, a_n \rangle \in A^*$: if $L' \in Result(I, \langle a_1, \ldots, a_n \rangle)$ and $L' \notin Result(I, \langle a_1, \ldots, a_i \rangle)$ for $0 \le i < n$, then
$L \in Result(I, \langle a_1, \ldots, a_{n-1} \rangle)$.
Like above with the necessary orders, the action sequence achieving $L'$ must contain at
least one step as $L' \notin I$. Obviously, $\rightarrow_n$ is stronger than $\rightarrow_{gn}$, that is, with $L \rightarrow_n L'$ for
two facts $L$ and $L'$, $L \rightarrow_{gn} L'$ follows. Greedy necessary orders are still mandatory in the
sense that every action sequence obeys them.
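The difference between Definitions 2 and 3 can be illustrated by an exhaustive check on a tiny task. The Python sketch below is an assumption-laden illustration (function names and the three-action task are hypothetical) and is not the approximation used in the paper. For a necessary order it inspects every reachable state; for a greedy necessary order it inspects only states reached without ever making $L'$ true.

```python
from collections import deque


def act(pre, add, dele):
    return (frozenset(pre), frozenset(add), frozenset(dele))


def reachable(init, actions, avoid=None):
    """States reachable from init; if `avoid` is given, only via states lacking it."""
    init = frozenset(init)
    seen, queue = {init}, deque([init])
    while queue:
        state = queue.popleft()
        for pre, add, dele in actions:
            if pre <= state:
                succ = (state | add) - dele
                if succ not in seen and (avoid is None or avoid not in succ):
                    seen.add(succ)
                    queue.append(succ)
    return seen


def holds_order(l, l2, actions, init, greedy=False):
    """Brute-force test of l ->_n l2 (or l ->_gn l2 when greedy=True)."""
    init = frozenset(init)
    if l2 in init:
        return False
    # Predecessor states: all reachable ones, or only those before l2 first holds.
    sources = reachable(init, actions, avoid=l2 if greedy else None)
    for state in sources:
        for pre, add, dele in actions:
            if pre <= state:
                succ = (state | add) - dele
                if l2 in succ and l not in state:
                    return False         # l2 holds, but l did not hold directly before
    return True


# Hypothetical task: c is first achieved via an action needing d, later without d.
ACTIONS = [
    act({"d"}, {"c", "e"}, {"d"}),       # first way to achieve c, requires d
    act({"e", "c"}, {"f"}, {"c"}),       # destroys c again
    act({"f"}, {"c"}, {"f"}),            # re-achieves c without d
]
INIT = {"d"}
```

Here d is greedily necessarily ordered before c, but not necessarily ordered before c, because c can be re-achieved later from a state in which d no longer holds. This mirrors the clear(D) and clear(C) situation discussed for the Blocksworld example.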
The definition of greedy necessary orders captures the fact that, really, what we are
interested in is what happens when we directly achieve $L'$ from the initial state, rather than
in some remote part of the state space. The consideration of these more remote parts of
the state space, which is inherent in the definition of the non-greedy necessary orders, can
make us lose useful information. Consider the Blocksworld example in Figure 1. There is
a greedy necessary order between clear(D) and clear(C), clear(D) $\rightarrow_{gn}$ clear(C), but not a
necessary order, clear(D) $\not\rightarrow_n$ clear(C). If we make clear(C) true the first time in an action
sequence from the initial state, then the action achieving clear(C) will always be unstack(D
C), which requires clear(D) to be true. On the other hand, there can of course be action
sequences which achieve clear(C) by different actions (unstack(A C), for example). But
reaching a state where clear(C) can be achieved by such an action involves unstacking D
from C, and thus achieving clear(C), in the first place. We will see later (Section 4.2) that
the order clear(D) $\rightarrow_{gn}$ clear(C) can be used to make the important inference that clear(C)
is reasonably ordered before on(B D).
More generally, the definition of greedy necessary orders is made from the perspective
that we are interested in ordering the first occurrence of the facts $L$ in our desired solution
plan. All definitions and algorithms in the rest of this paper are designed from this same
perspective. Since a fact might (have to) be made true several times in a solution plan, one
could just as well focus on ordering the fact's last occurrence, or any occurrence, or several
occurrences of it. We chose to focus on the first occurrences of facts mainly in order to keep
things simple. It seems very hard to say anything useful a priori about exactly how often
and when some fact will need to become true in a plan. The "greedy assumption" that our
approach thus makes is that all the landmarks need to be achieved only once, and that it is
best to achieve them as early as possible. Of course this assumption is not always justified,
and may lead to difficulties, such as cycles in the generated LGG (see also Sections 4.4
and 6.8). Generalising our approach to take account of several occurrences of the same fact
is an open research topic.
Theorem 3 Let GREEDY-NECESSARY-ORD denote the following problem: given a
planning task $(A, I, G)$, and two facts $L$ and $L'$; does $L \rightarrow_{gn} L'$ hold?
Deciding GREEDY-NECESSARY-ORD is PSPACE-complete.
Proof Sketch: By a minor modification of the proof to Theorem 2. □
3.2 Reasonable Orders
Reasonable orders were first introduced by Koehler and Hoffmann (2000), for top level
goals. We extend their definition, in a slightly revised way, to landmarks.
Let us first reiterate what the idea of reasonable orders was originally. The idea
introduced by Koehler and Hoffmann is this. If the planner is in a state $s$ where one goal
$L'$ has just been achieved, but another goal $L$ is still false, and $L'$ must be destroyed in
order to achieve $L$, then it might have been better to achieve $L$ first: to get to a goal state
from $s$, the planner will have to delete and re-achieve $L'$. If the same situation arises in all
states $s$ where $L'$ has just been achieved but $L$ is false, then it seems reasonable to generally
introduce an ordering constraint $L \rightarrow L'$, indicating that $L$ should be achieved prior to $L'$.
The classical example for two facts with a reasonable ordering constraint are on relations
in Blocksworld, where on(B, C) is reasonably ordered before on(A, B) whenever the goal is
to have both facts true in the goal state. Obviously, if one achieves on(A, B) first then one
has to unstack A again in order to achieve on(B, C).
Think about an unmodified application of Koehler and Hoffmann's definition to the case
of landmarks. Consider a state $s$ where we have a landmark $L'$, but not another landmark
$L$, and achieving $L$ involves deleting $L'$. Does it matter? It might be that we do not need
to achieve $L$ from $s$ anyway. It might also be that we do not need $L'$ anymore once we
have achieved $L$. In both cases, there is no need to delete and re-achieve $L'$, and it does
not appear reasonable to introduce the constraint $L \rightarrow L'$. The question is, under which
circumstances is it reasonable? The answer is given by the two mentioned counter-examples.
The situation matters if 1. we need to achieve $L$ from $s$, and 2. we must re-achieve $L'$ again
afterwards. Both conditions are trivially fulfilled when $L$ and $L'$ are top level goals. Our
definition below makes sure they hold for the landmarks $L$ and $L'$ in question.
We say that there is a reasonable ordering constraint between two landmarks $L$ and $L'$
if, starting from any state where $L'$ was achieved before $L$: $L'$ must be true at some point
later than the achievement of $L$; and one must delete $L'$ on the way to $L$. Formally, we first
define the "set of states where $L'$ was achieved before $L$", then we define what it means
that "$L'$ must be true at some point later than the achievement of $L$", then based on that
we define what reasonable orders are.
Definition 4 Given a planning task $(A, I, G)$, and two facts $L$ and $L'$.
1. By $S_{(L', \neg L)}$, we denote the set of states $s$ such that there exists $P = \langle a_1, \ldots, a_n \rangle \in A^*$,
$s = Result(I, P)$, $L' \in add(a_n)$, and $L \notin Result(I, \langle a_1, \ldots, a_i \rangle)$ for $0 \le i \le n$.
2. $L'$ is in the aftermath of $L$ if, for all states $s \in S_{(L', \neg L)}$, and all solution plans
$P = \langle a_1, \ldots, a_n \rangle \in A^*$ from $s$, $G \subseteq Result(s, P)$, there are $1 \le i \le j \le n$ such that
$L \in Result(s, \langle a_1, \ldots, a_i \rangle)$ and $L' \in Result(s, \langle a_1, \ldots, a_j \rangle)$.
3. There is a reasonable order between $L$ and $L'$, written $L \rightarrow_r L'$, if $L'$ is in the
aftermath of $L$, and

$$\forall s \in S_{(L', \neg L)} : \forall P \in A^* : L \in Result(s, P) \Rightarrow \exists a \in P : L' \in del(a)$$

Let us explain this definition, and how it differs from Koehler and Hoffmann's original one.
1. $S_{(L', \neg L)}$ contains the states where $L'$ was just added, but $L$ was not true yet. These are
the states we consider: we are interested to know if, from every state $s \in S_{(L', \neg L)}$, we
will have to delete and re-achieve $L'$. In Koehler and Hoffmann's original definition,
$S_{(L', \neg L)}$ contained more states, namely all those states $s$ where $L'$ was just added but
$L \notin s$. This definition allowed cases where $L$ was achieved already but was deleted
again. Our revised definition captures better the intuition that we want to consider
all states where $L'$ was achieved before $L$. The revised definition also makes sure that,
for a landmark $L$, any solution plan starting from $s \in S_{(L', \neg L)}$ must achieve $L$ at some
point.
2. The definition of the aftermath relation just says that, in a solution plan starting
from $s \in S_{(L', \neg L)}$, $L'$ must be true simultaneously with $L$, or at some later time point.
Koehler and Hoffmann didn't need such a definition since this condition is trivially
fulfilled for top level goals.
3. The definition of $L \rightarrow_r L'$ then says that, from every $s \in S_{(L', \neg L)}$, every action
sequence achieving $L$ deletes $L'$ at some point. With the additional postulation that
$L'$ is in the aftermath of $L$, this implies that from every $s \in S_{(L', \neg L)}$ one needs to
delete and re-achieve $L'$. Koehler and Hoffmann's definition here is identical except
that they do not need to postulate the aftermath relation.
Because in their definition $S_{(L', \neg L)}$ contains more states, and top level goals are trivially
in the aftermath of each other, Koehler and Hoffmann's $\rightarrow_r$ definition is stronger than ours,
i.e. $L \rightarrow_r L'$ in the Koehler and Hoffmann sense implies $L \rightarrow_r L'$ as defined above (we give
an example below where our, but not the Koehler and Hoffmann $L \rightarrow_r L'$ relation holds).[4]
4. Note that an order $L \rightarrow_r L'$ intends to tell us that we should not achieve $L'$ before $L$. This leaves open
the option to achieve $L$ and $L'$ simultaneously. In that sense, our definition (given above in Section 3.1)
of what it means to obey an order $L \rightarrow L'$, namely to add $L$ strictly before $L'$, is a bit too restrictive.
In our experience, the restriction is irrelevant in practice. In none of the many benchmarks we tried did
we observe facts that were reasonably ordered (ordered at all, in fact) relative to each other and that
could be achieved with the same action (remember that we consider the sequential planning setting).
We remark that one can easily adapt our framework to take account of simultaneous achievement of $L$
and $L'$. No changes are needed except in the approximation of obedient reasonable orders, which would

It is important to note that reasonable orders are not mandatory. An order $L \rightarrow_r L'$
only says that, if we achieve $L'$ before $L$, we will need to delete and re-achieve $L'$. This
might mean that achieving $L'$ before $L$ is wasted effort. But there are cases where, in the
process of achieving some landmark $L$, one has no other choice but to achieve, delete, and
re-achieve a landmark $L'$. In the Towers of Hanoi domain, for example, this is the case for
nearly all pairs of top level goals, namely for all those pairs of goals that say that ($L'$)
disc $i$ must be located on disc $i+1$, and ($L$) disc $i+1$ must be located on disc $i+2$. In
such a situation, forcing a planner to obey the order $L \rightarrow_r L'$ cuts out all solution paths.
One can also easily construct cases where $L \rightarrow_r L'$ and $L' \rightarrow_r L$ hold for goals $L$ and $L'$
(that cannot be achieved simultaneously). Consider the following example. There are the
seven facts $L$, $L'$, $P_1$, $P_2$, $P'_2$, $P_3$, and $P'_3$. Initially only $P_1$ is true, and the goal is to have
$L$ and $L'$. The actions are:
name       (pre,      add,           del)
opL_1   =  ({P_1},    {L, P_2},      {P_1})
opL'_1  =  ({P_1},    {L', P'_2},    {P_1})
opL_2   =  ({P'_2},   {L, P_3},      {L', P'_2})
opL'_2  =  ({P_2},    {L', P'_3},    {L, P_2})
opL_3   =  ({P'_3},   {L},           {P'_3})
opL'_3  =  ({P_3},    {L'},          {P_3})
Figure 2 shows the state space of the example. There are exactly two solution paths,
$\langle opL_1, opL'_2, opL_3 \rangle$ and $\langle opL'_1, opL_2, opL'_3 \rangle$. The first of these paths achieves, deletes,
and re-achieves $L$, the second one does the same with $L'$. $S_{(L', \neg L)}$ contains the single state
that results from applying $opL'_1$ to the initial state. From that state, one has to apply $opL_2$
in order to achieve $L$, deleting $L'$, so $L \rightarrow_r L'$ holds. Similarly, it can be seen that $L' \rightarrow_r L$
holds. Note that either solution path disobeys one of the two reasonable orders.
[Figure: state space of the example, with the states {P_1}, {L, P_2}, {L', P'_2}, {L, P_3}, {L', P'_3}, and {L, L'} connected by the actions opL_1, opL'_1, opL_2, opL'_2, opL_3, and opL'_3]
Figure 2: State space of the example.
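Since the example task is tiny, Definition 4 can be checked on it by exhaustive search. The following Python sketch is an illustration under our own assumptions (the function names are hypothetical, and facts are encoded as strings such as "P2'" for $P'_2$); it enumerates states explicitly and is not the approximation technique of Section 4.

```python
from collections import deque


def act(pre, add, dele):
    return (frozenset(pre), frozenset(add), frozenset(dele))


def step(state, action):
    """Result(s, <a>), or None where the result is undefined."""
    pre, add, dele = action
    return (state | add) - dele if pre <= state else None


def s_set(l, l2, actions, init):
    """S_(L', not L): states where l2 was just added and l was never true."""
    init = frozenset(init)
    if l in init:
        return set()
    seen, queue, found = {init}, deque([init]), set()
    while queue:
        state = queue.popleft()
        for a in actions:
            succ = step(state, a)
            if succ is None or l in succ:
                continue
            if l2 in a[1]:
                found.add(succ)
            if succ not in seen:
                seen.add(succ)
                queue.append(succ)
    return found


def in_aftermath(l, l2, actions, init, goal):
    """No solution plan from any s in S avoids 'l, then (or together with) l2'."""
    goal = frozenset(goal)
    for s in s_set(l, l2, actions, init):
        start = (s, False)               # flag: has l been true after s?
        seen, queue = {start}, deque([start])
        while queue:
            state, got_l = queue.popleft()
            if goal <= state:
                return False             # counter example: a violating solution plan
            for a in actions:
                succ = step(state, a)
                if succ is None:
                    continue
                now_l = got_l or l in succ
                if now_l and l2 in succ:
                    continue             # every extension satisfies the condition
                node = (succ, now_l)
                if node not in seen:
                    seen.add(node)
                    queue.append(node)
    return True


def must_delete(l, l2, actions, init):
    """From every s in S, l cannot be reached without an action deleting l2."""
    keep = [a for a in actions if l2 not in a[2]]
    for s in s_set(l, l2, actions, init):
        seen, queue = {s}, deque([s])
        while queue:
            state = queue.popleft()
            if l in state:
                return False
            for a in keep:
                succ = step(state, a)
                if succ is not None and succ not in seen:
                    seen.add(succ)
                    queue.append(succ)
    return True


def reasonable_order(l, l2, actions, init, goal):
    return in_aftermath(l, l2, actions, init, goal) and must_delete(l, l2, actions, init)


# The six actions of the example, in the order opL_1, opL'_1, opL_2, opL'_2, opL_3, opL'_3.
ACTIONS = [
    act({"P1"}, {"L", "P2"}, {"P1"}),
    act({"P1"}, {"L'", "P2'"}, {"P1"}),
    act({"P2'"}, {"L", "P3"}, {"L'", "P2'"}),
    act({"P2"}, {"L'", "P3'"}, {"L", "P2"}),
    act({"P3'"}, {"L"}, {"P3'"}),
    act({"P3"}, {"L'"}, {"P3"}),
]
INIT, GOAL = {"P1"}, {"L", "L'"}
```

On this task, `reasonable_order("L", "L'", ACTIONS, INIT, GOAL)` and `reasonable_order("L'", "L", ACTIONS, INIT, GOAL)` both hold, in line with the discussion of the example.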
We reiterate that the above are purely theoretical observations made to clarify the
meaning of our definitions. Our search control does not enforce the found ordering constraints,
it only suggests them to the planner.
While reasonable orders $L \rightarrow_r L'$ are not mandatory, they can help to reduce search
effort in those cases where achieving $L'$ before $L$ does imply wasted effort. Our Blocksworld
example from Figure 1 constitutes such a case. In the example, it makes no sense to
stack B onto D while D is still located on C, because C has to end up on top of A. By
Definition 4, clear(C) $\rightarrow_r$ on(B D) holds: $S_{(on(B\ D), \neg clear(C))}$ contains only states where B
has been stacked onto D, but D is still on top of C. From these states, one must delete
on(B D) in order to achieve clear(C). Further, on(B D) is a top-level goal so it is in the
aftermath of clear(C), and clear(C) $\rightarrow_r$ on(B D) follows. The order does not hold in terms
of Koehler and Hoffmann's definition, because there the $S_{(on(B\ D), \neg clear(C))}$ state set also
contains states where D was already removed from C.
Like the previous decision problems, those related to the aftermath relation and to
reasonable orders are PSPACE-complete.
Theorem 4 Let AFTERMATH denote the following problem: given a planning task $(A, I, G)$,
and two facts $L$ and $L'$; is $L'$ in the aftermath of $L$?
Deciding AFTERMATH is PSPACE-complete.
Proof Sketch: PSPACE-hardness follows by reducing the complement of PLANSAT to
AFTERMATH. PSPACE-membership follows by a non-deterministic algorithm that guesses
counter examples. □
Theorem 5 Let REASONABLE-ORD denote the following problem: given a planning task
$(A, I, G)$, and two facts $L$ and $L'$ such that $L'$ is in the aftermath of $L$; does $L \rightarrow_r L'$ hold?
Deciding REASONABLE-ORD is PSPACE-complete.
Proof Sketch: PSPACE-hardness follows by reducing the complement of PLANSAT to
REASONABLE-ORD, with the same construction as used by Koehler and Hoffmann (2000)
for the original definition of reasonable orders. PSPACE-membership follows by a
non-deterministic algorithm that guesses counter examples. □
3.3 Obedient Reasonable Orders
Say we already have a set $O$ of reasonable ordering constraints $L \rightarrow_r L'$. The question we
focus on in the section at hand is, if a planner commits to obey all the constraints in $O$, do
other reasonable orders arise? The answer is, yes, there might.
Consider the following situation. Say we got landmarks $L$ and $L'$, such that we must
delete $L'$ in order to achieve $L$. Also, there is a third landmark $L''$ such that $L' \rightarrow_n L''$
and $L \rightarrow_r L''$. Now, if the order $L \rightarrow L''$ was necessary, $L \rightarrow_n L''$, then we would have
a reasonable order $L \rightarrow_r L'$: $L$ and $L'$ would need to be true together immediately prior
to the achievement of $L''$, so $L'$ would be in the aftermath of $L$. However, the ordering
constraint $L \rightarrow L''$ is "only" reasonable so there is no guarantee that a solution plan will
obey it. A plan can choose to achieve $L'$ before $L''$ before $L$, and thereby avoid deletion
and re-achievement of $L'$. But if we enforce the ordering constraint $L \rightarrow_r L''$, disallowing
plans that do not obey it, then achieving $L'$ before $L$ leads to deletion and re-achievement
of $L'$ and is thus not reasonable.
With the above, the idea we pursue now is to define a weaker form of reasonable
orders, which are obedient in the sense that they only arise if one commits to a given set $O$
of (previously computed) reasonable ordering constraints. In our experiments, using (an
approximation of) such obedient reasonable orders, on top of the reasonable orders
themselves, resulted in significantly better planner performance in a few domains (such as the
Blocksworld), and made no difference in the other domains. Summarised, what we do is,
we start from the set $O$ of reasonable orders already computed by our approximations, and
then insert new orders that are reasonable given one commits to obey the constraints in
$O$. We do this just once, i.e. we do not compute a fixpoint. The details are in Section 4.3.
Right now, we define what obedient reasonable orders are.
The definition of obedient reasonable orders is almost the same as that of reasonable
orders. The only difference lies in that we consider only action sequences that are obedient
in the sense that they obey all ordering constraints in the given set $O$. The definition of
when an action sequence $\langle a_1, \ldots, a_n \rangle$ obeys an order $L \rightarrow L'$ was already given above: if
either $L \in I$, or $\min\{i \mid L \in add(a_i)\} < \min\{i \mid L' \in add(a_i)\}$ where the minimum over an
empty set is $\infty$.
Definition 5 Given a planning task $(A, I, G)$, a set $O$ of reasonable ordering constraints,
and two facts $L$ and $L'$.
1. By $S^O_{(L', \neg L)}$, we denote the set of states $s$ such that there exists an obedient action
sequence $P = \langle a_1, \ldots, a_n \rangle \in A^*$, with $s = Result(I, P)$, $L' \in add(a_n)$, and
$L \notin Result(I, \langle a_1, \ldots, a_i \rangle)$ for $0 \le i \le n$.
2. $L'$ is in the obedient aftermath of $L$ if, for all states $s \in S^O_{(L', \neg L)}$, and all obedient
solution plans $P = \langle a_1, \ldots, a_n \rangle \in A^*$, $G \subseteq Result(I, P)$, where $s = Result(I, \langle a_1, \ldots, a_k \rangle)$,
there are $k \le i \le j \le n$ such that $L \in Result(I, \langle a_1, \ldots, a_i \rangle)$ and
$L' \in Result(I, \langle a_1, \ldots, a_j \rangle)$.
3. There is an obedient reasonable order between $L$ and $L'$, written $L \rightarrow_r^O L'$, if and
only if $L'$ is in the obedient aftermath of $L$, and

$$\forall s \in S^O_{(L', \neg L)} : \forall P \in A^* : L \in Result(s, P) \Rightarrow \exists a \in P : L' \in del(a)$$

This definition is very similar to Definition 4 and thus should be self-explanatory, in
its formal aspects. The definition of the aftermath relation looks a little more complicated
because the solution plan $P$ starts from the initial state, not from $s$ as in Definition 4, and
reaches $s$ with action $a_k$. This is just a minor technical device to cover the case where, for
some of the $L_1 \rightarrow_r L_2$ constraints in $O$, $L_1$ is contained in $s$ already (and thus does not
need to be added after $s$ in order to obey $L_1 \rightarrow_r L_2$). Note that, in part 3 of the definition,
the action sequences $P$ achieving $L$ are not required to be obedient. While it would make
sense to impose this requirement, our approximation techniques (that will be introduced in
Section 4.3) only take account of $O$ in the computation of the aftermath relation anyway. It
is an open question how our other approximation techniques could be made to take account
of $O$.
We remark that the modified definitions do not change the computational complexity
of the corresponding decision problems.[5] As a quick illustration of the new definitions,
reconsider the situation described above. There, $L'$ is not in the aftermath of $L$, but in
the obedient aftermath of $L$ because all action sequences that obey the constraint $L \rightarrow_r L''$
make $L'$ true at a point simultaneously with or after $L$ (namely immediately prior to $L''$,
assuming that there is no action that adds both $L$ and $L''$). As $L'$ must be deleted in order
to achieve $L$, we obtain the ordering $L \rightarrow_r^{\{L \rightarrow_r L''\}} L'$. That is, if the planner obeys the
constraint $L \rightarrow_r L''$ then it is reasonable to also order $L$ before $L'$.
Just likethe reasonableorders,the obedient reasonableorders arenotmandatory.
En-forcinganobedientreasonableordercan cutoutallsolutionpaths. Thereasonisthesame
as for the reasonable orders. An order L ! O
r L
0
only says that, given we want to obey
O, achieving L 0
before L implies deletion and re-achievement of L 0
. If this really means
that achievingL 0
beforeL is wasted eort, theorder tells usnothing about. Considerthe
following example. There are the ten facts L, L 0
, L 00
, P, A
1 , A 2 , A 3 , B 1 , B 2
, and B
3 .
5. For the obedient aftermath relation, minor modifications of the proof to Theorem 4 suffice.
PSPACE-hardness follows by using the empty set of ordering constraints. PSPACE-membership follows by
extending the non-deterministic decision algorithm with flags that check if the ordering constraints are
Initially only $P$ is true, and the goal is to have $L$, $L'$, and $L''$. The construction is made
so that $L \to_r L''$, and $L \not\to_r L'$ but $L \to_r^{\{L \to_r L''\}} L'$. Enforcing
$L \to_r^{\{L \to_r L''\}} L'$ renders the task unsolvable. The actions are:
name      (pre, add, del)
$opA$   = $(\{P\},   \{A_1\},          \{P\})$
$opB$   = $(\{P\},   \{B_1\},          \{P\})$
$opA_1$ = $(\{A_1\}, \{L', L'', A_2\}, \{A_1\})$
$opA_2$ = $(\{A_2\}, \{L, A_3\},       \{L'', A_2\})$
$opA_3$ = $(\{A_3\}, \{L''\},          \{A_3\})$
$opB_1$ = $(\{B_1\}, \{L', B_2\},      \{B_1\})$
$opB_2$ = $(\{B_2\}, \{L, B_3\},       \{L', B_2\})$
$opB_3$ = $(\{B_3\}, \{L', L''\},      \{B_3\})$
Figure 3 shows the state space of the example. One has to choose one out of two options.
First, one applies $opA$ to the initial state and then proceeds with $opA_1$, $opA_2$, and $opA_3$.
Second, one applies $opB$ to the initial state and proceeds with $opB_1$, $opB_2$, and $opB_3$.
The first option is the only one where $L''$ becomes true before $L$. One has to delete $L''$ with
$opA_2$, and re-achieve it with $opA_3$. For this reason, the order $L \to_r L''$ holds. The order
$L \to_r L'$ does not hold because if one chooses the first option then $L'$ becomes true prior to
$L$, and is never deleted. However, committing to the order $L \to_r L''$ means excluding the
first option. In the second option, $L'$ becomes true before $L$, and must then be deleted and
re-achieved, so we get the order $L \to_r^{\{L \to_r L''\}} L'$. But there is no solution plan that obeys
this order because there is no way to make $L$ true before (or, even, simultaneously with)
$L'$.
Figure 3: State space of the example.
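The claims made about this example can be checked mechanically. The following is a minimal sketch, not part of the original implementation: it encodes the eight actions as (pre, add, del) triples and enumerates all loop-free solution plans by depth-first search. The helper names (`act`, `apply_action`, `solution_plans`, `trace`, `first_true`) are hypothetical and chosen for illustration only.

```python
from typing import Dict, FrozenSet, List, Tuple

Action = Tuple[FrozenSet[str], FrozenSet[str], FrozenSet[str]]  # (pre, add, del)


def act(pre, add, dele) -> Action:
    return frozenset(pre), frozenset(add), frozenset(dele)


# The eight actions of the example, transcribed from the action table.
ACTIONS: Dict[str, Action] = {
    "opA":  act({"P"},  {"A1"},              {"P"}),
    "opA1": act({"A1"}, {"L'", "L''", "A2"}, {"A1"}),
    "opA2": act({"A2"}, {"L", "A3"},         {"L''", "A2"}),
    "opA3": act({"A3"}, {"L''"},             {"A3"}),
    "opB":  act({"P"},  {"B1"},              {"P"}),
    "opB1": act({"B1"}, {"L'", "B2"},        {"B1"}),
    "opB2": act({"B2"}, {"L", "B3"},         {"L'", "B2"}),
    "opB3": act({"B3"}, {"L'", "L''"},       {"B3"}),
}
INIT: FrozenSet[str] = frozenset({"P"})
GOAL: FrozenSet[str] = frozenset({"L", "L'", "L''"})


def apply_action(state: FrozenSet[str], action: Action) -> FrozenSet[str]:
    _, add, dele = action
    return (state - dele) | add


def solution_plans(state=INIT, prefix=(), on_path=frozenset()) -> List[List[str]]:
    """Enumerate all loop-free action sequences from `state` that reach GOAL."""
    if GOAL <= state:
        return [list(prefix)]
    plans: List[List[str]] = []
    for name, action in ACTIONS.items():
        if action[0] <= state:  # preconditions hold
            successor = apply_action(state, action)
            if successor != state and successor not in on_path:
                plans += solution_plans(successor, prefix + (name,), on_path | {state})
    return plans


def trace(plan: List[str]) -> List[FrozenSet[str]]:
    """States visited by `plan`, starting with the initial state."""
    states = [INIT]
    for name in plan:
        states.append(apply_action(states[-1], ACTIONS[name]))
    return states


def first_true(plan: List[str], fact: str) -> int:
    """Index of the first state along `plan` in which `fact` holds."""
    return next(i for i, s in enumerate(trace(plan)) if fact in s)


PLANS = solution_plans()
for plan in PLANS:
    print(plan, {f: first_true(plan, f) for f in sorted(GOAL)})
```

Under these assumptions the enumeration finds exactly the two options discussed in the text. Only the opA branch makes $L''$ true before $L$, and only the opB branch has to delete and re-achieve $L'$, so neither branch makes $L$ true before $L'$.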
4. How to Find Ordered Landmarks
We now describe our methods to find landmarks in a given planning task, and to
approximate their inherent ordering constraints. The result of the process is a directed graph in
the obvious way, the landmarks generation graph (LGG). Section 4.1 describes how we find
landmarks, and how we approximate greedy necessary orders between them. Section 4.2
gives a sufficient criterion for reasonable orders, based on greedy necessary orders and fact
inconsistencies, and describes how we use the criterion for approximating reasonable orders.
Section 4.3 describes how we approximate obedient reasonable orders, Section 4.4 describes our
handling of cycles in the LGG, and Section 4.5 describes a preliminary form of "lookahead"
orders that we have also implemented and used.
4.1 Finding Landmarks and Approximating Greedy Necessary Orders
We find (a subset of the) landmarks in a planning task, and approximate the greedy
necessary orders between them, both in one process. The process is split into two parts:
1. Compute an LGG of landmark candidates together with approximated
   greedy necessary orders between them. This is done with a backchaining
   process. The goals form the first landmark candidates. Then, for any candidate $L'$, the
   "earliest" actions that can be used to achieve $L'$ are considered. Here, "early" is a
   greedy approximation of reachability from the initial state. The actions are analysed
   to see if they have shared precondition facts $L$, that is, facts that must be true before
   executing any of the actions. These facts $L$ become new candidates if they have not already
   been processed, and the orders $L \to_{gn} L'$ are introduced. The process is iterated until
   there are no new candidates. (Due to the greedy selection of actions, $L$/the order
   $L \to_{gn} L'$ is not proved to be a landmark/a greedy necessary order.)
2. Remove from the LGG the candidates (and their incident edges) that can
   not be proved to be landmarks. This is done by evaluating a sufficient condition
   on each candidate $L$ in the LGG. The condition ignores all actions that add $L$, and asks
   if a relaxed version of the task is still solvable. If not, $L$ is proved to be a landmark.
   (Any relaxation can be used in principle; we use the relaxation that ignores delete
   lists as in McDermott, 1999 and Bonet & Geffner, 2001.)
The next two subsections focus on these two steps in turn.
4.1.1 Landmark Candidates
We give pseudo-code for our approximation algorithm below. As said, we make the
algorithm greedy by using an approximation of reachability from the initial state. The
approximation we use is a relaxed planning graph (Hoffmann & Nebel, 2001), short RPG. Let us
explain this data structure first. An RPG is built just like a planning graph (Blum & Furst,
1997), except that the delete lists of all actions are ignored; as a result, there are no mutex
relations in the graph. The RPG thus is a sequence $P_0, A_0, P_1, A_1, \ldots, P_{m-1}, A_{m-1}, P_m$ of
proposition sets (layers) $P_i$ and action sets (layers) $A_i$. $P_0$ contains the facts that are true in
the initial state, $A_0$ contains those actions whose preconditions are reached (contained) in
$P_0$, $P_1$ contains $P_0$ plus the add effects of the actions in $A_0$, and so on. We have
$P_i \subseteq P_{i+1}$ and $A_i \subseteq A_{i+1}$ for all $i$. If the relaxed task (without delete lists) is
unsolvable, then the RPG reaches a fixpoint before reaching the goal facts, thereby proving
unsolvability. If the relaxed task is solvable, then eventually a layer $P_m$ containing the goal
facts will be reached.^6
6. Note that the RPG thus decides solvability of the relaxed planning task. Indeed, building an RPG is a
variation of the algorithm given by Bylander (1994) to prove that plan existence is polynomial in the
An RPG encodes an over-approximation of reachability in the planning task. We define
the level of a fact/action to be the index of the first proposition/action layer that contains the
fact/action. Then, if the level of a fact/action is $l$, one must apply at least $l$ parallel action
steps from the initial state before the fact becomes true/the action becomes applicable. (The
fact/action level corresponds to the "$h^1$" heuristic defined by Haslum & Geffner, 2000.) We
use this over-approximation of reachability to insert some greediness into our approximation
of "greedy" necessary orders (more below). The approximation process proceeds as shown
in Figure 4.
initialise the LGG to $(G, \emptyset)$, and set $C := G$
while $C \neq \emptyset$ do
    set $C' := \emptyset$
    for all $L' \in C$, $level(L') \neq 0$ do
        let $A$ be the set of all actions $a$ such that $L' \in add(a)$, and $level(a) = level(L') - 1$   (*)
        for all facts $L$ such that $\forall a \in A: L \in pre(a)$ do
            if $L$ is not yet a node in the LGG, set $C' := C' \cup \{L\}$
            if $L$ is not yet a node in the LGG, then insert that node
            if $L \to_{gn} L'$ is not yet an edge in the LGG, then insert that edge
        endfor
    endfor
    set $C := C'$
endwhile
Figure 4: Landmark candidate generation.
The set of landmark candidates is initialised to comprise the goal facts. Each iteration
of the while-loop processes all "open" candidates $L'$, that is, those $L'$ in $C$. Candidates $L'$ with
level 0, i.e., initial facts, are not used to produce greedy necessary orders and new landmark
candidates, because after all such $L'$ are already true. For the other open candidates $L'$,
the set $A$ comprises all those actions at the level below $L'$ that can be used to achieve
$L'$. Note that these are the earliest possible achievers of $L'$ in the RPG, or else the level
of $L'$ would be lower. We take as the new landmark candidates those facts $L$ that every
action in $A$ requires as a precondition, and update the LGG and the set of open candidates
accordingly. Independently of the (*) step, the algorithm terminates because there are only
finitely many facts. Because we use the RPG level test in step (*), the levels of the new
candidates $L$ are strictly lower than the level of $L'$, and the while-loop terminates after at
most $m$ iterations where $m$ is the index of the topmost proposition layer in the RPG.
If we skipped the test for the RPG level at the point in the algorithm marked (*), then
the new candidates $L$ would be proved landmarks, and the generated orders would be proved
to be necessary and thus also greedy necessary. Obviously, if all actions that can achieve
a landmark $L'$ require $L$ to be true, then $L$ is a landmark that must be true immediately
prior to achieving $L'$. Restricting the choice of $L'$ achievers with the RPG level test, the
found landmarks and orders may be unsound. Consider the following example, where we
want to move from city A to city D on the road map shown in Figure 5, using a standard
Figure 5: An example road map.
The above algorithm will come up with the following LGG: ({at(A), at(E), at(D)}, {at(A)
$\to_{gn}$ at(E), at(E) $\to_{gn}$ at(D)}). The reason is that the RPG is only built until the goals are
reached the first time, which happens in this example before move(C D) comes in. However, the
action sequence ⟨move(A B), move(B C), move(C D)⟩ achieves at(D) without making at(E) true.
Therefore, at(E) is not really a landmark, and at(E) $\to_{gn}$ at(D) is not really a greedy
necessary order.
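The behaviour described for the road map can be reproduced with a short sketch of the RPG level computation and the candidate generation of Figure 4. This is an illustration under stated assumptions, not the authors' implementation: the function names `rpg_levels` and `landmark_candidates` are hypothetical, the RPG is built to its fixpoint rather than stopped at the first goal layer (the levels are the same), and the road set is an assumption reconstructed from the moves named in the text (A to B, B to C, C to D, A to E, E to D), since the figure itself is not reproduced here.

```python
from typing import Dict, FrozenSet, Set, Tuple

# Relaxed action: (pre, add). Delete lists are ignored in the RPG.
Action = Tuple[FrozenSet[str], FrozenSet[str]]


def rpg_levels(actions: Dict[str, Action], init: Set[str]):
    """Return (fact_level, action_level): index of the first layer containing each item."""
    fact_level = {f: 0 for f in init}
    action_level: Dict[str, int] = {}
    layer = 0
    while True:
        # Actions entering layer A_layer: all preconditions are in P_layer.
        new_actions = [n for n, (pre, _) in actions.items()
                       if n not in action_level and all(p in fact_level for p in pre)]
        if not new_actions:  # fixpoint reached
            return fact_level, action_level
        for n in new_actions:
            action_level[n] = layer
        layer += 1
        for n in new_actions:
            for f in actions[n][1]:
                fact_level.setdefault(f, layer)  # keep the earliest level


def landmark_candidates(actions, init, goals):
    """Figure 4: backchain from the goals over shared preconditions of earliest achievers.

    Assumes every goal is reachable in the RPG.
    """
    fact_level, action_level = rpg_levels(actions, init)
    nodes: Set[str] = set(goals)
    edges: Set[Tuple[str, str]] = set()  # (L, L') stands for the candidate order L ->gn L'
    open_candidates = set(goals)
    while open_candidates:
        new_open: Set[str] = set()
        for target in sorted(open_candidates):
            if fact_level[target] == 0:
                continue  # initial facts are already true
            # Step (*): only the earliest achievers, one level below the target.
            achievers = [n for n, (_, add) in actions.items()
                         if target in add and action_level.get(n) == fact_level[target] - 1]
            shared = set.intersection(*[set(actions[n][0]) for n in achievers])
            for fact in shared:
                if fact not in nodes:
                    new_open.add(fact)
                    nodes.add(fact)
                edges.add((fact, target))
        open_candidates = new_open
    return nodes, edges


# Assumed road map: directed roads inferred from the moves mentioned in the text.
ROADS = [("A", "B"), ("B", "C"), ("C", "D"), ("A", "E"), ("E", "D")]
ACTIONS = {f"move({a} {b})": (frozenset({f"at({a})"}), frozenset({f"at({b})"}))
           for a, b in ROADS}
NODES, EDGES = landmark_candidates(ACTIONS, {"at(A)"}, {"at(D)"})
print(sorted(NODES), sorted(EDGES))
```

With this road set, move(E D) is the only achiever of at(D) at level 1, so the sketch produces the candidate at(E) and the two edges named in the text, including the unsound one.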
By restricting our choice of $L'$ achievers with the RPG level test at step (*) in Figure 4,
as said we intend to insert greediness into our approximation of greedy necessary orders.
The generated orders $L \to_{gn} L'$ are only guaranteed to be sound if, in the RPG, the set
of earliest achievers of $L'$ contains all actions that can be used to make $L'$ true for the
first time from the initial state. Of course, it is hard to exactly compute that latter set of
actions, and also it is highly non-trivial, if possible at all, to find general conditions on
when the earliest achievers in the RPG contain all these actions. In the road map example
above, the actions that can achieve at(D) for the first time are move(C D) and move(E D),
but the only earliest achiever in the RPG is move(E D). This leads to the unsound at(E)
$\to_{gn}$ at(D) order. In the following example taken from the well-known Logistics domain,
the earliest achievers of $L'$ do contain all actions that can make $L'$ true for the first time.
Say $L'$ = at(P A) requires package P to be at the airport A of its origin city, and P is not at
this airport initially. The actions that can achieve $L'$ are to unload P from the local truck
T, or to unload it from any airplane. The only earliest achiever in the RPG is the unload
from T, and indeed that is the only action that can achieve $L'$ for the first time: in order to
get the package into an airplane, the package has to arrive at the airport in the first place.
Our approximation process correctly generates the new landmark candidate in(P T) as well
as the greedy necessary order in(P T) $\to_{gn}$ at(P A). Note that in(P T) $\not\to_n$ at(P A).
We show below in Section 4.1.2 how we re-establish the soundness of the landmark
candidates, removing candidates (and their associated orders) that are not provably landmarks.
We did not find a way to provably re-establish the soundness of the generated greedy
necessary orders, and unsound orders may stay in the LGG, potentially also causing the inference
of unsound reasonable/obedient reasonable orders (see the sections below). We did observe
such unsoundness in a few domains during our experiments (individual discussions are in
Section 6). We remark the following.
1. While unsound approximated $L \to_{gn} L'$ orders are not valid with respect to
   Definition 3, they still make some sense intuitively. They are generated because $L$ is in the
   preconditions of all actions that are the first ones in the RPG to achieve $L'$. This
   means that going to $L'$ via $L$ is probably a good option, in terms of distance from the
   initial state.
2. Unless $L$ is a landmark for some other reason (than for the unsound order $L \to_{gn} L'$),
   landmark verification will remove $L$, and in particular the order $L \to_{gn} L'$, from the
   LGG (see the discussion of the Figure 5 example below in the section about landmark
   verification).
3. As said before, our search control does not enforce the orders in the LGG, it only
   suggests them to the planner. So even if there is no plan that obeys an order in the
   LGG, this does not mean that our search control will make the planner fail.
4. If we were to extract only provably necessary orders, by not using the RPG level test,
   we would miss the information that lies in those $\to_{gn}$ orders that are not $\to_n$ orders.
For these reasons, in particular for the last one, we concentrated on the potentially unsound
RPG-based approximation in our experiments. We also ran some comparative tests to the
"safe" strategy without the RPG level test, in domains where the RPG produced unsound
orders. See the details in Section 6.
One case where an $\to_{gn}$ order, that is not an $\to_n$ order, contains potentially useful
information, is the Logistics example given above. Another case is the aforementioned order
clear(D) $\to_{gn}$ clear(C) in our running Blocksworld example from Figure 1. To conclude this
subsection, let us have a look at what our approximation algorithm from Figure 4 does in
that example. The RPG for the example is summarised in Figure 6.
P0            A0             P1           A1            P2           A2           P3
on-table(A)   pick-up(A)     holding(A)   stack(B A)    on(B A)      stack(C A)   on(C A)
on-table(B)   pick-up(B)     holding(B)   stack(B D)    on(B D)      stack(C B)   on(C B)
on-table(C)   unstack(D C)   holding(D)   stack(B C)    on(B C)      stack(C D)   on(C D)
on(D C)                      clear(C)     put-down(B)   ...          ...          ...
clear(A)                                  ...
clear(B)                                  pick-up(C)    holding(C)
clear(D)                                  ...           ...
arm-empty()
Figure 6: Summarised RPG for the illustrative Blocksworld example from Figure 1.
As we explained above, the extraction process starts by considering the goals on(C A)
and on(B D) as landmark candidates. The RPG level of on(C A) is 3, the level of on(B D) is
2. There is only one action with level 2 that achieves on(C A): stack(C A). So, holding(C)
(level 2) and clear(A) (level 0) are new candidates. The new LGG is: ({on(C A), on(B
D), holding(C), clear(A)}, {holding(C) $\to_{gn}$ on(C A), clear(A) $\to_{gn}$ on(C A)}). Processing
on(B D), we find that its only earliest achiever is stack(B D), and we generate the new
candidates holding(B) (level 1) and clear(D) (level 0) with the respective edges. In the next
iteration, holding(C) (level 2) produces the new candidates clear(C) (level 1), on-table(C)
(level 0), and arm-empty() (level 0) by the achiever pick-up(C); and holding(B) (level 1)
produces the new candidates on-table(B) (level 0) and clear(B) (level 0) by the achiever
pick-up(B). In the third and final iteration of the algorithm, clear(C) (level 1) produces
the new candidate on(D C) (level 0) by the achiever unstack(D C). The process ends up
with the LGG as shown in Figure 7 (the edges in the depicted graph are all directed from
bottom to top). Fact sets of which our LGG suggests that they have to be true together
at some point, because they are either top level goals or ordered $\to_{gn}$ before the same
fact, are grouped together in boxes. As said before, this information is important for the
approximation of reasonable orders described below.
Figure 7: LGG for the illustrative Blocksworld task, containing the found landmarks and
$\to_{gn}$ orders.
4.1.2 Landmark Verification
As said before, we verify landmark candidates by evaluating a sufficient condition on them,
and throwing away those candidates where the condition fails. The condition we use is the
following.
Proposition 1 Given a planning task $(A, I, G)$, and a fact $L$. Define a modified action set
$A_L$ as follows.
$$A_L := \{ (pre(a), add(a), \emptyset) \mid (pre(a), add(a), del(a)) \in A, \; L \notin add(a) \}$$
If $(A_L, I, G)$ is unsolvable, then $L$ is a landmark in $(A, I, G)$.
Note that the inverse direction of the proposition does not hold (that is, if $L$ is a
landmark in $(A, I, G)$ then $(A_L, I, G)$ is not necessarily unsolvable) because ignoring the
delete lists simplifies the achievement of the goals. As mentioned earlier, deciding about
solvability of planning tasks with empty delete lists can be done in polynomial time by
building the RPG. The task is unsolvable iff the RPG can't reach the goals. So our landmark
verification process looks at all landmark candidates in turn. Candidates that are top level
goals or initial facts are trivially landmarks, so they need not be verified. For each of the
other candidates $L$, the RPG corresponding to $(A_L, I, G)$ is built, and if that RPG reaches
the goals, then $L$ and its incident edges are removed from the LGG.
Reconsider the road map example depicted in Figure 5. The LGG built will be ({at(A),
at(E), at(D)}, {at(A) $\to_{gn}$ at(E), at(E) $\to_{gn}$ at(D)}). But at(E) is not really a landmark.
Verifying at(E), we detect this. In the RPG, when ignoring all actions that achieve at(E),
move(A B), move(B C), and move(C D) stay in and so the goal remains reachable. Thus
at(E) and its edges (in particular, the invalid edge at(E) $\to_{gn}$ at(D)) are removed, yielding
the final (trivial) LGG with node set {at(A), at(D)} and empty edge set. Note that, if at(E)
was a landmark for some other reason than reaching D (like, if one had to pick up some
object at E), then at(E) would not be removed by landmark verification and the invalid
order at(E) $\to_{gn}$ at(D) would stay in.
In the Blocksworld example from Figure 1, landmark verification does not remove any
candidates, and the LGG remains unchanged as depicted in Figure 7.
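The verification test of Proposition 1 is easy to sketch in code. The following is a minimal illustration, assuming relaxed actions are given as (pre, add) pairs with delete lists already dropped; the names `relaxed_solvable` and `is_proved_landmark` are hypothetical, and the road set is the same assumed reconstruction of Figure 5 (A to B, B to C, C to D, A to E, E to D) rather than data taken from the figure.

```python
from typing import FrozenSet, Iterable, List, Set, Tuple

# Relaxed action: (pre, add). Delete lists are ignored, as in Proposition 1.
Action = Tuple[FrozenSet[str], FrozenSet[str]]


def relaxed_solvable(actions: Iterable[Action], init: Set[str], goals: Set[str]) -> bool:
    """Fixpoint reachability without delete lists (what the RPG decides)."""
    actions = list(actions)
    reached = set(init)
    changed = True
    while changed:
        changed = False
        for pre, add in actions:
            if pre <= reached and not add <= reached:
                reached |= add
                changed = True
    return set(goals) <= reached


def is_proved_landmark(fact: str, actions: List[Action],
                       init: Set[str], goals: Set[str]) -> bool:
    """Sufficient (not necessary) landmark test: remove all achievers of `fact`."""
    if fact in init or fact in goals:
        return True  # trivially a landmark, no verification needed
    restricted = [(pre, add) for pre, add in actions if fact not in add]
    return not relaxed_solvable(restricted, init, goals)


def moves(roads) -> List[Action]:
    return [(frozenset({f"at({a})"}), frozenset({f"at({b})"})) for a, b in roads]


# Assumed road map for Figure 5.
ROADS = [("A", "B"), ("B", "C"), ("C", "D"), ("A", "E"), ("E", "D")]
INIT, GOALS = {"at(A)"}, {"at(D)"}

# at(E) is rejected: the route via B and C still reaches D.
E_IS_LANDMARK = is_proved_landmark("at(E)", moves(ROADS), INIT, GOALS)

# Hypothetical variant without the road C to D: now every route passes through E.
ROADS_NO_CD = [r for r in ROADS if r != ("C", "D")]
E_IS_LANDMARK_NO_CD = is_proved_landmark("at(E)", moves(ROADS_NO_CD), INIT, GOALS)

print(E_IS_LANDMARK, E_IS_LANDMARK_NO_CD)
```

The variant without the road from C to D is not from the source; it is included only to show a case where the sufficient condition succeeds.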
4.2 Approximating Reasonable Orders
Our process to approximate reasonable orders starts from the LGG as computed by the
methods described above, and enriches the LGG with new edges corresponding to the
approximated reasonable orders. The process has two main aspects:
1. We approximate the aftermath relation based on the LGG. This is done
   by evaluating a sufficient condition that covers certain cases when greedy necessary
   orders imply the aftermath relation.
2. We combine the aftermath relation with interference information to
   approximate reasonable orders. For each pair of landmarks $L'$ and $L$ such that $L'$ is
   in the aftermath of $L$ according to the previous approximations, a sufficient condition
   is evaluated. The condition covers certain cases when $L$ interferes with $L'$, i.e., when
   achieving $L$ (from a state in $S_{(L', \neg L)}$) involves deleting $L'$. If the condition holds, a
   reasonable order $L \to_r L'$ is introduced.
The next two subsections focus on these two aspects in turn. In our implementation, the
computation of the aftermath relation is interleaved with its combination with
interference information. Pseudo-code for the overall algorithm is given in the second subsection,
Figure 8.
4.2.1 Aftermath Relation
The sufficient condition that we use to approximate the aftermath relation is the following.
Lemma 1 Given a planning task $(A, I, G)$, and two landmarks $L$ and $L'$. If either
1. $L' \in G$, or
2. there are landmarks $L = L_1, \ldots, L_{n+1}$, $n \geq 1$, $L_n \neq L'$, such that
   $L_i \to_{gn} L_{i+1}$ for $1 \leq i \leq n$, and $L' \to_{gn} L_{n+1}$,
then $L'$ is in the aftermath of $L$.
Proof Sketch: If $L' \in G$ then $L'$ is trivially in the aftermath of $L$. Otherwise, under the
given circumstances, $L'$ and $L_n$ must be true together at some point in any action sequence
achieving the goal from a state in $S_{(L', \neg L)}$, namely directly prior to achievement of $L_{n+1}$.
As $L$ has a path of $\to_{gn}$ orders to $L_n$, it has to be true prior to (or simultaneously with, if
$n = 1$) $L'$. $\Box$
Note that this lemma just captures the property we mentioned before, when we can tell
from the LGG that several facts must be true together at some point. In the second case
of the lemma, these facts are $L'$ and $L_n$. $L'$ and $L_n$ are both ordered $\to_{gn}$ before $L_{n+1}$
and so must be true together before achieving that fact. The first case of the lemma can
be understood this way as implicitly assuming $L_n$ as some other top-level goal that $L$ has
a path of $\to_{gn}$ orders to. (In our implementation, $L$ must have such a path in the LGG
or it would not have been generated as a landmark candidate.) If $L'$ and $L_n$ must be true
together, and we additionally know that $L$ must be true sometime before $L_n$, then we know
that $L'$ is in the aftermath of $L$.^7
The most straightforward idea to make use of Lemma 1 would be to simply enumerate
all pairs of nodes (landmarks) in the LGG, and evaluate the lemma, collecting the pairs
$L$ and $L'$ of landmarks where the lemma condition holds. While this would probably not
be prohibitively runtime-costly, one can do better by having a closer look at the lemma
condition. Consider each node $L'$ in the LGG in turn. If $L'$ is a top level goal, then $L'$
is in the aftermath of all other nodes $L$. If $L'$ is not a top level goal, then consider all
nodes $L_n \neq L'$ such that $L'$ and $L_n$ both have a $\to_{gn}$ order before some other node $L_{n+1}$.
The nodes $L$ in the LGG that have a (possibly empty) outgoing $\to_{gn}$ path to such an $L_n$
are exactly those for which $L'$ is in the aftermath of $L$ according to Lemma 1. As said,
pseudo-code for our overall approximation of reasonable orders is given below in Figure 8.
Note that the inputs to Lemma 1 are $\to_{gn}$ orders, while in practice we evaluate the lemma
on the edges in the LGG as generated by the processes described above in Section 4.1. As
we discussed above, the edges in the LGG may be unsound, i.e. they do not provably
correspond to $\to_{gn}$ orders. In effect, we cannot guarantee that our approximation to
the aftermath relation is sound either.
4.2.2 Reasonable Orders
We approximate reasonable orders by considering all pairs $L$ and $L'$ where $L'$ is in the
aftermath of $L$ according to the above approximation. We test if $L$ interferes with $L'$
according to the definition directly below. If the test succeeds, we introduce the order
$L \to_r L'$.
Definition 6 Given a planning task $(A, I, G)$, and two facts $L$ and $L'$. $L$ interferes with
$L'$ if one of the following conditions holds:
1. $L$ and $L'$ are inconsistent;
2. there is a fact $x \in \bigcap_{a \in A, L \in add(a)} add(a)$, $x \neq L$, such that $x$ is inconsistent with $L'$;
3. $L' \in \bigcap_{a \in A, L \in add(a)} del(a)$;
4. or there is a landmark $x$ inconsistent with $L'$ such that $x \to_{gn} L$.
7. In theory, one could also allow $L' = L_n \neq L$ in Lemma 1. In this case, $L$ has a path of $\to_{gn}$ orders to
$L'$, which trivially implies that $L'$ is in the aftermath of $L$. But in fact, it is then impossible to achieve
$L'$ before $L$ so an order $L \to_r L'$
As said before, our (standard) definition of inconsistency is that facts $x$ and $y$ are
inconsistent in a planning task if there is no reachable state in the task that contains both
$x$ and $y$.^8 Note that the conditions 1 to 4 of Definition 6, while they may look closely
related at first sight (and presumably are related in many practical examples), indeed cover
different cases of when achieving $L$ involves deleting $L'$. More formally expressed, for each
condition $i$ there are cases where $i$ holds but no condition $j \neq i$ holds. For example, consider
condition 2. In the following example, there is a reasonable order $L \to_r L'$, and $L$ interferes
with $L'$ due to condition 2 only. There are the six facts $L$, $L'$, $x$, $P_1$, $P_2$, and $P'$. Initially
only $P'$ is true, and the goal is to have $L$ and $L'$. The actions are:
name      (pre, add, del)
$opL'$  = $(\{P'\},  \{L'\},        \{x\})$
$opP_1$ = $(\{P'\},  \{P_1\},       \{L', P'\})$
$opP_2$ = $(\{P'\},  \{P_2\},       \{L', P'\})$
$opL_1$ = $(\{P_1\}, \{L, x, P'\},  \{P_1\})$
$opL_2$ = $(\{P_2\}, \{L, x, P'\},  \{P_2\})$
In this example, the only action sequences that are possible are of the form
$(opL' \mid opP_1 \circ opL_1 \mid opP_2 \circ opL_2)^*$, in BNF-style notation. In effect, $L \to_r L'$
because if we achieve $L'$ first, we have to apply one of $opP_1$ and $opP_2$, which both delete $L'$.
Condition 2 holds: $x$ is inconsistent with $L'$ and added by both $opL_1$ and $opL_2$. As for
condition 1, $L$ and $L'$ are not inconsistent because one can apply $opL'$ after, e.g., $opL_1$.
Condition 3 is obviously not fulfilled, and condition 4 is not fulfilled because there are two
options to achieve $L$ so no fact has a $\to_{gn}$ order before $L$.
Interference together with the aftermath relation implies reasonable orders between
landmarks.
Theorem 6 Given a planning task $(A, I, G)$, and two landmarks $L$ and $L'$. If $L$ interferes
with $L'$, and either
1. $L' \in G$, or
2. there are landmarks $L = L_1, \ldots, L_{n+1}$, $n \geq 1$, $L_n \neq L'$, such that
   $L_i \to_{gn} L_{i+1}$ for $1 \leq i \leq n$, and $L' \to_{gn} L_{n+1}$,
then there is a reasonable order between $L$ and $L'$, $L \to_r L'$.
Proof Sketch: By Lemma 1, $L'$ is in the aftermath of $L$. Let us look at the four possible
reasons for interference. If $L$ is inconsistent with $L'$ then obviously achieving $L$ involves
deleting $L'$. If all actions that achieve $L$ add a fact that is inconsistent with $L'$, the same
argument applies. The case where all actions that achieve $L$ delete $L'$ is obvious. As for
the last case, say we are in a state $s \in S_{(L', \neg L)}$. Then $x$ is not in $s$ (because $L'$ is). Due to
$x \to_{gn} L$, $x$ must be achieved directly prior to $L$, and thus $L'$ will be deleted. $\Box$
8. Deciding about inconsistency is obviously PSPACE-hard. Just imagine a task where we insert one of the
facts into the initial state, and the other fact such that it can only be made true once the original goal
has been achieved. We approximate inconsistency with a sound but incomplete technique developed by
Overall, our method for approximating reasonable orders based on the LGG works as
specified in Figure 8. With what was said above, the algorithm should be self-explanatory
except for the interference tests. When doing these tests, we need information about fact
inconsistencies, and, for condition 4 of Definition 6, about $\to_{gn}$ orders. Our approximation
to the latter piece of information is, as before, the (approximate) $\to_{gn}$ edges in the LGG.
Our approximation to the former piece of information is a technique from the literature
(Fox & Long, 1998), the TIM API. This provides a function TIMinconsistent(x, y) that, for
facts $x$ and $y$, returns TRUE only if $x$ and $y$ are inconsistent. The function is incomplete,
i.e., it can return FALSE even if $x$ and $y$ are inconsistent.
for all nodes $L'$ in the LGG do
    if $L' \in G$ then
        for all nodes $L \neq L'$ in the LGG do
            if $L$ interferes with $L'$, then insert the edge $L \to_r L'$ into the LGG
        endfor
    else
        for all nodes $L_n \neq L'$ in the LGG
                s.t. there are a node $L_{n+1}$ and edges $L' \to_{gn} L_{n+1}$, $L_n \to_{gn} L_{n+1}$ in the LGG do
            for all nodes $L$ in the LGG
                    s.t. $L$ has a (possibly empty) outgoing path of $\to_{gn}$ edges to $L_n$ do
                if $L$ interferes with $L'$, then insert the edge $L \to_r L'$ into the LGG
            endfor
        endfor
    endif
endfor
Figure 8: Approximating reasonable orders based on the LGG.
Note that the algorithm from Figure 8 might generate orders $L \to_r L'$ in cases where $L$
already has a path of $\to_{gn}$ edges to $L'$. As noted earlier, in this case $L'$ cannot be achieved
before $L$ so the order $L \to_r L'$ is meaningless. One could avoid such meaningless orders by
an additional check to see, for every generated pair $L$ and $L'$, if $L$ has an outgoing $\to_{gn}$ path
to $L'$. We do this in our implementation only for the easy-to-check special cases where the
length of the $\to_{gn}$ path from $L$ to $L'$ is 1 or 2. Note that the superfluous $\to_r$ orders don't
hurt anyway; in fact they don't change our search process (Section 5.1) at all. The only
purpose of our special case test is to avoid some unnecessary evaluations of Definition 6.
Because the inputs to our approximation algorithm are $\to_{gn}$ edges in the LGG, and
as discussed before these edges are not provably sound, the resulting $\to_r$ orders are not
provably sound (which they otherwise would be by Theorem 6).
Let us finish off our running Blocksworld example, by showing how the order clear(C)
$\to_r$ on(B D), our motivating example from the introduction, is found. Have a look at the
LGG in Figure 7. Say the process depicted in Figure 8 considers, in its outermost for-loop,
the LGG node $L'$ = on(B D). $L'$ is a top level goal so all other nodes $L$ in the LGG, in
particular $L$ = clear(C), are tested for interference with $L'$. clear(C) interferes with
on(B D) because of condition 4 in Definition 6: clear(D) is inconsistent with on(B D), and
has an edge clear(D) $\to_{gn}$ clear(C) in the LGG. Consequently the order clear(C) $\to_r$ on(B
D) is inferred and introduced into the LGG. Note that, to make this inference, we need the
edge clear(D) $\to_{gn}$ clear(C) which is not a $\to_n$ order.
4.3 Approximating Obedient Reasonable Orders
The process that approximates obedient reasonable orders starts from the LGG already
containing the approximated reasonable orders, and inserts new orders that are reasonable
given one commits to the $\to_r$ orders already present in the LGG. The technology is very
similar to the technology we use to approximate reasonable orders. Largely, we do the same
thing as before and just treat the $\to_r$ edges as if they were additional $\to_{gn}$ edges. Formally,
the difference lies in the sufficient criterion for the, now obedient, aftermath relation.
Lemma 2 Given a planning task $(A, I, G)$, a set $O$ of reasonable ordering constraints, and
two landmarks $L$ and $L'$. If either
1. $L' \in G$, or
2. there are landmarks $L = L_1, \ldots, L_{n+1}$, $n \geq 1$, $L_n \neq L'$, such that
   $L_i \to_{gn} L_{i+1}$ or $L_i \to_r L_{i+1} \in O$ for $1 \leq i \leq n$, and $L' \to_{gn} L_{n+1}$,
then $L'$ is in the obedient aftermath of $L$.
Proof Sketch: By a simple modification of the proof to Lemma 1. The first case is obvious,
in the second case $L'$ must be true one step before $L_{n+1}$ becomes true, and $L$ must be true
sometime before that. $\Box$
Note that the proved property does not hold if there is only a reasonable order between
$L'$ and $L_{n+1}$, $L' \to_r L_{n+1}$ instead of $L' \to_{gn} L_{n+1}$, even if we have committed to obey
$L' \to_r L_{n+1}$. It is essential that $L'$ must be true directly before $L_{n+1}$.^9
The parts of our technology that do not depend on the aftermath relation remain
unchanged. Interference is defined exactly as before. Together with the obedient aftermath
relation, it implies obedient reasonable orders between landmarks.
Theorem 7 Given a planning task $(A, I, G)$, a set $O$ of reasonable ordering constraints,
and two landmarks $L$ and $L'$. If $L$ interferes with $L'$, and either
1. $L' \in G$, or
2. there are landmarks $L = L_1, \ldots, L_{n+1}$, $n \geq 1$, $L_n \neq L'$, such that
   $L_i \to_{gn} L_{i+1}$ or $L_i \to_r L_{i+1} \in O$ for $1 \leq i \leq n$, and $L' \to_{gn} L_{n+1}$,
then there is an obedient reasonable order between $L$ and $L'$, $L \to_r^O L'$.
9. If obeying an order $L_1 \to_r L_2$ is defined to include the case where $L_1$ and $L_2$ are achieved simultaneously,
the lemma does not hold. The facts $L_1, \ldots, L_{n+1}$ could then all be achieved with a single action, given
the orders between them are all (only) taken from the set $O$. One can "repair" the lemma by requiring
that, for at least one of the $i$ where $L_i \not\to_{gn} L_{i+1}$ but $L_i \to_r L_{i+1} \in O$, there is no action that has both
$L_i$ and $L_{i+1}$