5.4 Semantically Equivalent Execution
5.4.2 The Planning Task
The composition of services can be formulated as an AI Planning problem, an approach pioneered in particular by [McD02, MS02]. As a result of significant advancements in terms of theoretical foundations, scalability, and available tools, and because actions are well suited for modeling operations of services, AI Planning has become a primary approach to (semi-)automatic composition of (Web) services, as evidenced by a series of surveys [RS04, KSKR05, Pee05, ACMS08, Klu08, MP09, GTSS11, SVV11]. Planning approaches cover functional, non-functional, and process-level composition, combinations of these, and composition under varying assumptions on the environment. Yet the prevalent model of planning, not only in the context of service composition, is that of a discrete state space that is to be searched for solutions.
The Basic State Space Model to Planning
We briefly introduce the state space model, which follows mostly [GNT04, Gef11]. In its basic form the model contains:
• a finite set of states S, called the state space,
• a set of actions A where A(s) ⊆ A denotes the set of actions applicable (executable) in state s ∈ S, and
• a transition function F : A × S → S such that a ∉ A(s) implies F(a, s) = s; in words, F associates to each current state s and action a a successor state s′ that represents the result of applying a in s if a is applicable in s.
⁸ Examples where this is the case are the broader, narrower, and related properties (roles) in the Simple Knowledge Organization System (SKOS) [MB09], which are not transitive (and neither defined as reflexive nor irreflexive).
A planning domain is correspondingly represented by the 3-tuple PD = (S, A, F).
One can equally conceive this model as a directed graph in which a node is a state and an edge, labeled with an action, represents a transition between a state and its successor state resulting from the application of the action. The variety of this model lies in the actual definition of what a state and an action are, what the criterion is for an action to be applicable in a state, and how the transition function modifies a state.
Given a known initial state s_0 ∈ S and a goal state s_g ∈ S, a sequence of actions a_0, ..., a_n such that
    s_{i+1} = F(a_i, s_i),  0 ≤ i ≤ n,
is a solution or plan in this model if
    s_g = F(a_n, s_n).
Planning is therefore the ability of a software agent to automatically synthesize a plan without being explicitly told the necessary steps that need to be performed to reach a goal from an initial situation (state). The 3-tuple
    PP = (PD, s_0, s_g)
denotes a planning problem (or planning instance) in the planning domain PD. Clearly, a path from s_0 to s_g in the graph forms a plan, that is, a sequence of actions. The process of planning can hence be reduced to a (heuristic) search in the graph for a path leading from s_0 to s_g. An optimal plan has minimum execution cost among all plans for a planning instance. Under the assumption that actions have uniform execution costs, a plan is optimal if there is no other plan that is shorter (i.e., the length of a plan represents its total execution costs).
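To make the reduction to graph search concrete, the following minimal sketch runs a breadth-first search over a state space given by an applicability function and a transition function. The toy domain (integer states with the actions inc and double) and all function names are illustrative assumptions and not part of the model in [GNT04, Gef11]. Breadth-first search is used here only because it returns a shortest plan under uniform action costs; practical planners rely on heuristic search instead.

```python
from collections import deque


def bfs_plan(s0, sg, applicable, F):
    """Return a shortest action sequence from s0 to sg, or None if none exists."""
    frontier = deque([s0])
    parent = {s0: None}  # state -> (predecessor state, action); None marks s0
    while frontier:
        s = frontier.popleft()
        if s == sg:
            plan = []
            while parent[s] is not None:
                s, a = parent[s]
                plan.append(a)
            return plan[::-1]
        for a in applicable(s):
            t = F(a, s)
            if t not in parent:  # visit each state at most once
                parent[t] = (s, a)
                frontier.append(t)
    return None


# Hypothetical toy domain: S = {0, ..., 5}, A = {"inc", "double"}.
def applicable(s):
    actions = []
    if s + 1 <= 5:
        actions.append("inc")
    if 0 < 2 * s <= 5:
        actions.append("double")
    return actions


def F(a, s):
    if a not in applicable(s):
        return s  # inapplicable actions leave the state unchanged
    return s + 1 if a == "inc" else 2 * s


plan = bfs_plan(0, 4, applicable, F)
print(plan)
```

Because every state is expanded at most once, the search terminates on any finite state space, and it returns None exactly when s_g is unreachable from s_0.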
A common assumption is that there is at least one action applicable in every state (∀s ∈ S: A(s) ≠ ∅). While this assumption is necessary for liveness, it is obviously not sufficient to guarantee that a plan exists for a planning problem. More relevant, in fact, are two properties concerning the dependability of planning algorithms: soundness and completeness. A planning algorithm is sound if it generates plans that are correct, meaning that execution of the plan transforms s_0 into s_g. A planning algorithm is complete if it is guaranteed to (eventually) find a plan if one exists.
The main reasoning task, which is performed by virtually all state space search planners either explicitly or implicitly when navigating through the space, is plan checking: given a planning problem PP, is a given plan a solution for PP? The second prominent reasoning task is plan existence: given a planning problem PP, is there a solution for PP? Decidability of the latter is, however, not of utmost importance. Most planning tools assume the existence of a plan anyway and try to find one, instead of proving that none exists; see also the discussion in [Hel02]. More relevant is therefore the efficiency of finding plans while not exhausting available resources such as time or memory.
The basic state space model captures restricted environments only. First, the fact that the initial state is known assumes full observability of the environment, which essentially means that one has complete knowledge about the initial situation. Second, the fact that the transition function maps to a single successor state implies that actions are deterministic. Third, the system is static, meaning that it stays in a state unless an action is applied. Finally, time is implicit (i.e., abstracted away), which implies that actions are thought to be instantaneous and have no duration. Planning under these assumptions is commonly referred to as classical planning. The seminal and still common framework for encoding classical planning problems is the Stanford Research Institute Problem Solver (STRIPS) [FN71], which is introduced next.
The STRIPS Framework for Encoding Classical Planning Problems
In short, a STRIPS planning problem is formulated as a 4-tuple (P, O, I, Γ) where:
• P is a finite set of propositional variables (Boolean variables), called the conditions (a.k.a. fluents as their truth value can change from state to state);
• O is a finite set of operators (actions) of the form (pre, add, del) where pre, add, del are each a subset of P, called the precondition, add, and delete sets, respectively;
• I ⊆ P is the initial state; and
• Γ ⊆ P are the goals.
One remark on actions versus operators is in order here. Contrary to the definition above, it is customary to understand an operator as a parametrized action; that is, pre, add, del contain atoms (predicates) of the form p(x_1, ..., x_n) where each x_i is a variable that is implicitly existentially quantified. An operator thereby represents all actions that can be obtained by instantiating each variable from a finite set of given logical constants, which we denote with C. These constants represent objects existing in the domain that are the subjects of planning (individuals in the case of DLs). It is assumed that different constants denote different objects, that every object that exists is represented by a constant, and that the interpretation of constants does not change between states (i.e., standard names assumption together with a fixed interpretation). Notice that a ground atom therefore resembles a propositional variable. Formally, let Var(o) be the set of variables occurring in pre, add, del of an operator o. The set of actions that can be obtained by instantiating o based on C, denoted with o[C], is
    o[C] = { o[θ] | θ : Var(o) → C }
where o[θ] denotes an action obtained by applying a substitution θ to o.
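The instantiation o[C] can be sketched as an enumeration of all substitutions θ : Var(o) → C. The tuple encoding of atoms and operators, the function names, and the ambulance-style operator below are hypothetical choices made for illustration only; the sketch is untyped, so it produces |C|^|Var(o)| ground actions.

```python
from itertools import product


def substitute(atoms, theta):
    """Apply substitution theta to atoms encoded as (predicate, arg_1, ..., arg_n)."""
    return frozenset((p, *(theta.get(x, x) for x in args)) for p, *args in atoms)


def ground(operator, constants):
    """Enumerate o[C]: one ground action per substitution of variables by constants."""
    name, variables, pre, add, delete = operator
    actions = []
    for values in product(sorted(constants), repeat=len(variables)):
        theta = dict(zip(variables, values))
        actions.append((
            (name, *values),
            substitute(pre, theta),
            substitute(add, theta),
            substitute(delete, theta),
        ))
    return actions


# Hypothetical operator assign(?a, ?m) over three constants.
assign = (
    "assign",
    ("?a", "?m"),
    {("idle", "?a")},                           # pre
    {("busy", "?a"), ("assigned", "?a", "?m")}, # add
    {("idle", "?a")},                           # del
)
actions = ground(assign, {"A1", "A2", "M1"})
print(len(actions))
```

Since every variable ranges over all of C, the sketch also yields intuitively meaningless actions such as assign(M1, M1); typed variables, as offered by planning languages, are one common way to prune these.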
A STRIPS instance encodes the planning domain as follows. Every state s is described as a subset of P (s ⊆ P). States are interpreted under CWA: ϕ ∈ P is true in s if ϕ ∈ s; otherwise ϕ is false in s. Hence, states are complete descriptions of the current situation. A state s_g is a goal state if Γ ⊆ s_g. The set of actions applicable in a state s, denoted with A(s), consists of those whose preconditions are a subset of s, formally
    A(s) = { a | a ∈ ⋃_{o ∈ O} o[C] and pre(a) ⊆ s } .    (5.6)
Finally, given an action a and a state s, the successor state is
    s′ = F(a, s) = (s \ del(a)) ∪ add(a) .    (5.7)
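Equations (5.6) and (5.7) translate almost literally into set operations. The following sketch assumes ground actions encoded as dictionaries of frozensets and uses a hypothetical single-action ambulance domain; it also shows plan checking as a simple forward simulation.

```python
def applicable(action, state):
    """a is applicable in s iff pre(a) is a subset of s (cf. Eq. 5.6)."""
    return action["pre"] <= state


def successor(action, state):
    """Remove del(a) from s, then add add(a) (cf. Eq. 5.7); s is unchanged if a is inapplicable."""
    if not applicable(action, state):
        return state
    return (state - action["del"]) | action["add"]


def is_solution(plan, initial, goals):
    """Plan checking: simulate the plan from I and test that the goals hold at the end."""
    state = frozenset(initial)
    for action in plan:
        if not applicable(action, state):
            return False
        state = successor(action, state)
    return goals <= state


# Hypothetical ground action: assign a mission to ambulance A1.
assign_a1 = {
    "pre": frozenset({"idle(A1)"}),
    "add": frozenset({"busy(A1)"}),
    "del": frozenset({"idle(A1)"}),
}
initial = frozenset({"idle(A1)"})
goals = frozenset({"busy(A1)"})

print(successor(assign_a1, initial))
print(is_solution([assign_a1], initial, goals))
```

Rejecting plans that contain an inapplicable action is a design choice of this sketch; under the basic model, where F(a, s) = s for inapplicable actions, one could alternatively treat such steps as having no effect.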
Plan existence in the basic (propositional) STRIPS framework is intractable: it has been shown to be PSpace-complete [Byl94]. The reason is that there can, in general, be plans of exponential length in the size of the planning problem. Complexity drops to NP for plans bounded to polynomial length. Exceedingly long plans are more of a theoretical matter, as the intuition, especially in the area of service composition, is that composite services are rather short.
As a result of intractability, a major part of planning research since then has focused on (i) encoding planning problems in a way that keeps the search space as small as possible and (ii) devising search strategies that scale up to possibly huge state spaces while being sound and complete, and ideally also optimal. A remarkable achievement in this respect is [HCZ10], which describes a translation of STRIPS planning problems into the framework of planning with multi-valued state variables, a special case of Functional STRIPS [Gef00], resulting in considerably smaller state spaces.
The Planning Domain Definition Language (PDDL) [MGH+98] is the de facto standard machine-parsable format for representing STRIPS planning instances and instances expressed in successor languages of STRIPS. Newer versions of PDDL [FL03, GL06] have been extended in several ways; amongst others, to support expressing extended goals, which will be outlined next.
Extensions to Cover Practical Domains
Since the basic state space model is not expressive enough to capture many real-world environments, it is extended in several ways. Major dimensions along which extensions are made are summarized in Table 5.1, which we will go through briefly one by one.
Goals. Extensions regarding goals concern scenarios in which one wants to express more complex objectives than the specification of a final state to be reached. This includes two classes. First, constraints that describe states that must be traversed or, conversely, states that must be avoided by a plan (e.g., achieving a subgoal, avoiding a critical situation). Second, constraints that must be optimized or met at some, any, or all times during plan execution. As both classes relate to a state trajectory in time⁹, such goals are referred to as temporally extended goals.
⁹ In short, a state trajectory is a sequence of pairs ⟨(s_0, t_0), (s_1, t_1), ..., (s_n, t_n)⟩ where each s_i is a state and t_i is a timestamp.

Table 5.1: Different dimensions of planning domains.

Dimension        Short Explanation
Goals            Whether a goal represents a state to be reached (no matter how) versus a desired evolution in the domain.
                 Whether goals are regarded mandatory versus optional (desired).
Observability    Whether states of the domain are partially or fully observable.∗
Controllability  Whether actions/operators are deterministic versus nondeterministic.
Dynamics         Whether the state of the domain changes only through actions or whether there can be other events that also effect state changes.
Agility          Whether plan generation and plan execution are separated versus interleaved, meaning that execution can take place while planning is ongoing.
Processing       Whether a plan is a strict versus a partial ordering, is conditional, iterative, or a mixture thereof.

∗ Partial observability includes the special case of no observability at all.

Another line of research is concerned with whether goals are regarded as mandatory or not. In so-called over-subscription planning [Smi04], a.k.a. partial satisfaction planning [BNDK04], goals are no longer mandatory but desired, meaning that one is satisfied with plans that achieve a subset of the goals, as opposed to classical planning, which terminates in failure unless all goals can be achieved. Such goals are also referred to as soft goals as opposed to hard goals. They are mainly motivated by planning under limited resources (e.g., time) or the presence of mutually exclusive goals. In order to assist a planner in choosing which goals to achieve, each goal has a utility value associated with it (i.e., one prioritizes goals). Alternatively, one can also penalize the violation of a goal, which is the approach considered in PDDL3 [GL06]. In addition, actions no longer have uniform execution costs. The objective of a planner is then to maximize the utility while minimizing costs. Hence, planning involves a (combinatorial) optimization problem, also referred to as a net-benefit problem [HDR08, KG09]. Soft goals, in fact, describe a simple model of preferences. More interestingly, soft goals¹⁰ do not increase expressive power since they can be compiled into a STRIPS planning problem with action costs and hard goals [KG09], for which conventional cost-based STRIPS planning machinery can then be used.
¹⁰ To be precise, a soft goal here is either a single fluent or a conjunctive or disjunctive formula over different fluents.
Observability. In various real-world domains it is not practical or even feasible (for technical reasons) to have complete information about states; that is, states are only partially observable. Planning under incomplete information about states is modeled by extending the basic model such that there is a set of initial states. A planner must then account for the fact that the system might be in any of these states, which is correspondingly referred to as conformant planning, as a plan must work for all possible initial states. Extending STRIPS to allow for negative atoms, conditional effects, and adoption of OWA is one way of modeling partial observability. Not surprisingly, computational complexity increases: plan existence is ExpSpace-complete for plans exponential in length [Rin04] and Σ^P_2-complete for plans bounded to polynomial length [Tur02]; plan checking is NP-hard. Similarly, a KB interpreted under OWA models partial observability since it may be satisfied by many interpretations, each representing a possible state of the domain.
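The requirement that a conformant plan must work for all possible initial states can be illustrated by progressing a set of candidate states (often called a belief state) through the plan. The sketch below is a simplification under stated assumptions: it uses plain STRIPS actions encoded as (pre, add, del) triples without negative atoms or conditional effects, it demands that each action be applicable in every candidate state, and the door domain is hypothetical.

```python
def progress(belief, action):
    """Apply action to every candidate state; None if it is inapplicable in any of them."""
    pre, add, delete = action
    if any(not pre <= s for s in belief):
        return None
    return {(s - delete) | add for s in belief}


def is_conformant(plan, initial_states, goals):
    """Check that plan reaches the goals from every possible initial state."""
    belief = {frozenset(s) for s in initial_states}
    for action in plan:
        belief = progress(belief, action)
        if belief is None:
            return False
    return all(goals <= s for s in belief)


# Hypothetical domain: it is unknown whether a door is open or closed.
close = (frozenset(), frozenset({"closed"}), frozenset({"open"}))
lock = (frozenset({"closed"}), frozenset({"locked"}), frozenset())
initial_states = [{"open"}, {"closed"}]
goals = frozenset({"locked"})

print(is_conformant([close, lock], initial_states, goals))
print(is_conformant([lock], initial_states, goals))
```

Closing the door first collapses both candidate states into one, after which locking is safe; locking alone fails because it is not applicable in the state where the door is open.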
Controllability. Apart from partial observability there can be one more source of uncertainty: actions that behave nondeterministically. Controlling the evolution of the system in this case necessitates taking all possible outcomes of actions into account, which is correspondingly referred to as contingency planning. Extending the transition function F to map to a set of successor states is one way of representing nondeterminism of actions, which renders the model conceptually close to a Kripke structure [Kri63], used mainly in the field of Model Checking [CGP01]. Another way, leading to the theory of Markov Decision Processes [Put94], is to associate transitions with probabilities to capture the stochastic character of the domain. The general assumption, however, is that nondeterminism of actions is tractable, meaning that the number of different outcomes that an action may have is finite and all possible outcomes are known in advance.
Dynamics. Thus far planning domains were characterized by the absence of additional events transforming the system into a new state.¹¹ Extending the model to accommodate such events can be done by (i) introducing a finite set of events E and (ii) formulating the transition function as F : A × E × S → S. As there can be transitions solely caused by an action or an event, a neutral event and, symmetrically, a no-op action are introduced in addition. Events can also be used to model the concurrent execution of multiple plans (by different engines) in the domain.
Agility. An operational aspect not related to the underlying model concerns the way plan synthesis and plan execution are integrated. There are basically two possibilities. Under the paradigm of static planning (a.k.a. offline planning) both are strictly separated, meaning that the plan is not executed until it has been completely generated. Conversely, under the paradigm of dynamic planning (a.k.a. online planning) both can be interleaved. In the most dynamic case planning and execution are interleaved in a step-by-step way: each new action added to a plan is executed immediately, followed by planning for the next action, which is repeated until the goal state is reached. Dynamic planning becomes relevant for (i) nondeterministic actions in order to react to their actual outcome and (ii) dynamic domains in which it is important to take events into account.
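The step-by-step interleaving of planning and execution described above can be sketched as a simple loop. All names are hypothetical, the "planner" merely picks the next action for the observed state, and the nondeterministic outcome of the action is scripted so that the example is repeatable.

```python
def run_online(s0, is_goal, next_action, execute, max_steps=100):
    """Interleave planning and execution one action at a time."""
    state, executed = s0, []
    for _ in range(max_steps):
        if is_goal(state):
            break
        action = next_action(state)     # plan only the next step
        if action is None:              # the planner finds no applicable action
            break
        state = execute(action, state)  # execute and observe the actual outcome
        executed.append(action)
    return (executed, state) if is_goal(state) else None


# Hypothetical nondeterministic domain: "inc" sometimes has no effect.
outcomes = iter([0, 1, 1, 1])           # scripted increments, first attempt fails
result = run_online(
    s0=0,
    is_goal=lambda s: s == 3,
    next_action=lambda s: "inc" if s < 3 else None,
    execute=lambda a, s: s + next(outcomes),
)
print(result)
```

Because the next action is chosen only after the previous outcome has been observed, the failed first attempt is compensated by simply planning the same action again, which an offline plan of fixed length could not do.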
Similarly, a common technique to make planning under incomplete information practical is to execute information-providing actions¹² directly at planning time while execution of world-altering actions is simulated. However, this involves the so-called Invocation and Reasonable Persistence Assumption (IRP) [MS02]. Intuitively, IRP states that (i) information-providing actions can be executed at planning time (i.e., preconditions are satisfied) and that (ii) information persists once gathered until execution of world-altering actions, which includes that world-altering actions must not change gathered information even if the change is only simulated; see Example 5.1.
¹¹ No matter whether these events are exogenous versus endogenous.
¹² Also known as sensing or callback actions.
Example 5.1
Imagine an information-providing action a_1 that identifies an idle ambulance; say its execution returned the ambulance named A1. Imagine a world-altering action a_2 that assigns a mission to A1 by changing its state to busy. If a_1 is executed again after simulating execution of a_2, then it would still report A1 as idle because the real world state is behind the simulated world state (i.e., re-executing a_1 later during planning provides “outdated” information). Consequently, the simulated world state is incorrectly overwritten¹³, which might lead to incorrectly reassigning A1.
Conversely, if the state of A1 changes in the real world between simulation and execution time of world-altering actions (due to dynamics such as concurrent executions), then the generated plan might become outdated relative to the real world state, which can lead to a runtime execution fault of a_2.
Processing. Plans need not necessarily be sequences of actions. Whenever two actions are not mutually exclusive (i.e., there is no causal dependency between them), they can be left unordered with respect to each other; thus, their execution can be linearized in either order or even performed concurrently. A partial-order plan consequently specifies only those orderings among actions that are necessary to achieve the goal, which is also referred to as the least commitment strategy [Wel94]. Conditional plans bring about another type of structure. They are mainly considered to cope with nondeterministic actions, choosing the next action depending on the actual situation after a nondeterministic action has been executed. Similarly, iterative plans are a concise way of representing repeated execution of an action until a desired situation occurs. Execution of all these types of plans forms a process within the process model introduced in Section 4.3.
Linking Planning and Service Composition
The apparent connection between action planning and service composition is that a parametrized action corresponds to an operation or to an atomic service. A sequential plan corresponds to a composite service having a sequential control flow; analogously, a partially ordered plan corresponds to a control flow with parallel flows. Details of the correspondences vary depending on what types of semantics are represented by the action and service model (cf. Figure 4.1) and how they are precisely formalized.