7.3 Belief-Based Programs and Probabilistic Projection Tests
7.3.1 Belief-Based Programs
The belief-based bGOLOG plans we have considered so far were only pseudo-belief-based plans, that is, belief-based plans whose tests and conditionals appeal only to the robot’s beliefs concerning the directly observable fluent reg. We restricted ourselves to pseudo-belief-based plans because reg is directly observable, so the robot’s beliefs regarding reg can be determined without considering the epistemic fluent p. Note that, as reg is directly observable, the robot’s belief that a register holds a given value is always either 0 or 100%. Now that we have formally specified how the epistemic fluents p and pll evolve, we can consider unrestricted belief-based plans that appeal to arbitrary real-valued beliefs.
As an example, let us go back to the ship/reject domain (cf. Chapter 6) and assume that we want to specify that the robot is to correctly process the widget with probability 99%. For simplicity, we do not require the widget to be painted. This can be achieved by the following belief-based bGOLOG plan:
proc(ΠloopInsp,
[while(¬(Bel(FL) ≥ 0.99 ∨ Bel(¬FL) ≥ 0.99),
[send(inspect, nil), send(fork, inspect), Bel(reg(inspect) ≠ nil) = 1?]),
if(Bel(FL) = 1, send(fork, reject), send(fork, ship)), Bel(reg(processed) ≠ nil) = 1?]).
The above plan specifies that the robot is to activate the inspect process until it is sufficiently confident about whether the widget is flawed or not. Note that the send(inspect, nil) action in the body of the while-loop is necessary to guarantee that the test Bel(reg(inspect) ≠ nil) = 1? blocks the plan until inspect has completed execution even if inspect has already been activated before. Once the robot is sufficiently confident about the value of FL, the widget is shipped or rejected, depending on the robot’s beliefs. Let us now consider some example on-line execution traces of ΠloopInsp. First, let us consider the following situation:
S1 = do([send(inspect, nil), send(fork, inspect),
reply(fork, nil), ccUpdate(0.25), ..., ccUpdate(10.0), reply(inspect, OK),
send(fork, reject),
reply(fork, nil), ccUpdate(10.25), ..., ccUpdate(20.0), reply(processed, ⊤)], S0).
Let Γ be the set of axioms AXBelUp (cf. Proposition 25) together with the axioms from Section 6.3.2 modeling the ship/reject domain. Then it is not difficult to see that S1 is a legal (completed) on-line execution trace of ΠloopInsp with respect to Γ. Initially, ΠloopInsp can cause two transitions, involving the execution of send(inspect, nil) and send(fork, inspect). Note that initially the while-loop is not Final because the robot’s initial belief in FL is 0.3. Thereafter, the belief-based plan becomes blocked, waiting for reg(inspect) to get a non-nil
value. The next actions in S1 are exogenous actions. The reply(inspect, OK) action causes the robot’s belief in FL to immediately rise to 100% (cf. Section 7.1.4). As a result, the belief-based plan becomes unblocked. The while-loop is Final because in the actual situation Bel(FL) is 1, so the belief-based plan proceeds to the conditional, which results in the execution of send(fork, reject). Thereafter, it waits for the reject process to finish execution, which is signaled by the exogenous reply(processed, ⊤) action, and finishes execution. Thus S1 corresponds to a completed on-line execution.
The on-line execution of ΠloopInsp can also result in other execution traces, for example:
S2 = do([send(inspect, nil), send(fork, inspect),
reply(fork, nil), ccUpdate(0.25), ..., ccUpdate(10.0), reply(inspect, OK),
send(inspect, nil), send(fork, inspect),
reply(fork, nil), ccUpdate(10.25), ..., ccUpdate(20.0), reply(inspect, OK),
send(fork, reject),
reply(fork, nil), ccUpdate(20.25), ..., ccUpdate(30.0), reply(processed, ⊤)], S0).
As discussed in Section 7.1.4, the observation of one OK answer causes the robot’s belief in ¬FL to rise to 0.7/(0.7 + 0.3 ∗ 0.1) = 70/73. Similarly, the observation of two OKs causes the robot’s belief in ¬FL to rise to 0.7/(0.7 + 0.3 ∗ 0.1 ∗ 0.1) = 700/703, which is more than 0.99. Thus, after the second OK answer the while-loop becomes Final and ΠloopInsp executes the conditional.
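The two belief values above can be checked with exact arithmetic. The following is only a sketch: the sensor model (an OK reply has probability 0.1 for a flawed widget and probability 1 for an unflawed one) is an assumption read off the update formula quoted above, and the function and constant names are hypothetical.

```python
from fractions import Fraction

# Assumed sensor model, read off the update formula in the text:
# P(OK | FL) = 0.1 and P(OK | not FL) = 1.
P_OK_GIVEN_FL = Fraction(1, 10)
P_OK_GIVEN_NOT_FL = Fraction(1)


def belief_not_fl_after_oks(prior_fl, n_oks):
    """Return Bel(not FL) after observing n_oks consecutive OK replies."""
    weight_fl = prior_fl * P_OK_GIVEN_FL ** n_oks
    weight_not_fl = (1 - prior_fl) * P_OK_GIVEN_NOT_FL ** n_oks
    return weight_not_fl / (weight_fl + weight_not_fl)


prior = Fraction(3, 10)                      # initial belief in FL
one_ok = belief_not_fl_after_oks(prior, 1)   # 70/73, still below 0.99
two_oks = belief_not_fl_after_oks(prior, 2)  # 700/703, above 0.99
```

Under these assumptions, one OK leaves the belief in ¬FL below the 0.99 threshold, while the second OK pushes it above.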
As another example, the following belief-based plan specifies that the robot is to activate the inspect process until it is sufficiently confident about whether the widget is flawed or not, then it is to activate the paint process until its belief in the widget being painted rises to 99%, and finally it is to process the widget:
proc(ΠloopInsp&Paint,
[while(¬(Bel(FL) = 1 ∨ Bel(¬FL) ≥ 0.99),
[send(inspect, nil), send(fork, inspect), Bel(reg(inspect) ≠ nil) = 1?]),
[while(Bel(PA) ≤ 0.99,
[send(painted, nil), send(fork, paint), Bel(reg(painted) ≠ nil) = 1?]),
if(Bel(FL) = 1, send(fork, reject), send(fork, ship)), Bel(reg(processed) ≠ nil) = 1?]).
As the examples illustrate, belief-based programs allow the programmer to provide domain-dependent procedural knowledge in a natural way. We remark that our framework not only allows the on-line execution of belief-based plans like ΠloopInsp, but also supports probabilistic projection of belief-based plans. In particular, from the set of axioms Γ it is possible to deduce:
PBel(PR ∧ ¬ER, S0, ΠloopInsp, kernelBHL) = 99.7%, and
PBel(PA ∧ PR ∧ ¬ER, S0, ΠloopInsp&Paint, kernelBHL) = 99.45075%.
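The first of these figures can be reproduced informally by enumerating the possible inspection outcomes of ΠloopInsp. The sketch below is not the projection mechanism of the formalism. It assumes a sensor model in which inspect replies OK with probability 0.1 for a flawed widget and with probability 1 otherwise, assumes that ship and reject always succeed, and counts an execution as correct when the widget is rejected exactly if it is flawed. All names are hypothetical.

```python
from fractions import Fraction

# Assumptions (not spelled out in this section): inspect replies OK with
# probability 0.1 if the widget is flawed and with probability 1 otherwise;
# ship and reject always succeed; "correct" means rejected iff flawed.
P_OK = {True: Fraction(1, 10), False: Fraction(1)}
THRESHOLD = Fraction(99, 100)


def update(bel_fl, ok):
    """Bayes update of Bel(FL) after an OK (ok=True) or non-OK reply."""
    like_fl = P_OK[True] if ok else 1 - P_OK[True]
    like_not = P_OK[False] if ok else 1 - P_OK[False]
    return bel_fl * like_fl / (bel_fl * like_fl + (1 - bel_fl) * like_not)


def prob_correct(flawed, bel_fl):
    """Probability that the loop-then-branch plan processes the widget correctly."""
    if bel_fl >= THRESHOLD or 1 - bel_fl >= THRESHOLD:
        rejected = (bel_fl == 1)           # the plan's conditional
        return Fraction(int(rejected == flawed))
    total = Fraction(0)
    for ok in (True, False):
        p_reply = P_OK[flawed] if ok else 1 - P_OK[flawed]
        if p_reply > 0:                    # skip replies that cannot occur
            total += p_reply * prob_correct(flawed, update(bel_fl, ok))
    return total


prior_fl = Fraction(3, 10)
p_success = (prior_fl * prob_correct(True, prior_fl)
             + (1 - prior_fl) * prob_correct(False, prior_fl))
```

Under these assumptions the only failing branch is a flawed widget that yields two OK replies in a row, which has probability 0.3 ∗ 0.1 ∗ 0.1 = 0.003, so p_success evaluates to 997/1000.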
Functional Fluents in Belief-Based Programs
In Section 6.2.4, we required that a bGOLOG plan may not refer to functional fluents as arguments of primitive actions or procedure calls. Intuitively, the reason for this assumption is that bGOLOG plans may only appeal to the robot’s beliefs but not to the actual value of fluents. In particular, this means that a bGOLOG plan may not appeal to the value of functional fluents. Thus, the following pGOLOG program is not a legal high-level plan with respect to our formalization of BHL’s 1-dimensional robot example:
[say(“My position is:”), say(position)].
While it is clear that the robot cannot refer to the value of position because it is uncertain about it, intuitively nothing prevents the robot from announcing the estimate of the actual position, namely the value of reg(posEstimate) provided by the low-level process noisySensePos. Note that unlike in the case of position, the robot is certain about the value of reg(posEstimate), meaning that it has a 100% belief.
To allow the robot to refer to functional fluents about which it is certain, we introduce the epistemic functional fluent Kwhich(f). Kwhich takes as argument a functional fluent f. Intuitively, the value of Kwhich(f) is v if the robot has a 100% belief that the value of f is v; otherwise Kwhich(f) = nil. The following axiom makes this precise:
Kwhich(f, s) = v ≡ Bel(f = v, s) = 1 ∨ (¬∃v′. Bel(f = v′, s) = 1) ∧ v = nil.
Thus, Kwhich allows the robot to refer to the value of functional fluents about which it has 100% beliefs. Using Kwhich, it would be possible to specify that the robot is to announce the estimate provided by noisySensePos, for example using the following instructions:
[say(“My position is:”), say(Kwhich(reg(posEstimate)))].
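The behaviour of Kwhich can be illustrated with a small sketch. The representation of the robot’s beliefs about a functional fluent as a dictionary from candidate values to probabilities is a hypothetical simplification, None stands in for nil, and the numeric values are illustrative only.

```python
from fractions import Fraction


def kwhich(belief):
    """Return the value believed with probability 1, or None (standing in for nil).

    `belief` maps each candidate value of a functional fluent to the robot's
    degree of belief in it (a hypothetical representation for illustration).
    """
    for value, prob in belief.items():
        if prob == 1:
            return value
    return None


# reg(posEstimate) is directly observable, so the robot is certain about it.
pos_estimate_belief = {2: Fraction(1)}
# position itself is uncertain (illustrative numbers, not taken from the text).
position_belief = {1: Fraction(1, 4), 2: Fraction(1, 2), 3: Fraction(1, 4)}

certain = kwhich(pos_estimate_belief)   # 2
uncertain = kwhich(position_belief)     # None, i.e. nil
```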
As another application of Kwhich, suppose the 1-dimensional robot wants to get to position 0. Then, taking advantage of Kwhich, one could specify the following plan, telling the robot to first activate noisySensePos, then wait until it provides an estimate d, and finally activate noisyAdv, telling it to move back the robot by d units:
Πkwhich = [send(fork, sense), Bel(reg(posEstimate) = nil) < 1?,
send(fork, advance(−1 ∗ Kwhich(reg(posEstimate)))), Bel(position = 0) > 0?].
Using probabilistic projection, one can deduce that this plan has a reasonable probability of resulting in the robot being at position 0. Let Γ be the set of axioms AXBelUp together with the definition of PBel, the axioms (7.21) and (7.22) from Section 7.2.1 specifying the robot’s initial epistemic state, the successor state axiom for position, the definition of getAdvArg, and the action precondition axiom Poss(exactAdv(x)) ≡ True. Then, it is possible to show:
Γ ⊨ PBel(position = 0, S0, Πkwhich, kernelBHL) = 3/8.
That is, the plan has a probability of 37.5% of moving the robot to position 0. We remark that our formalism also entails that the probabilities of ending up at positions 1 and −1 are 25% each, and that the probabilities of ending up at positions 2 and −2 are 6.25% each. In total, there are 45 possible execution traces.
Aside – waitFor in Belief-Based Programs
In Section 6.2.4 we also required that a bGOLOG plan may not include waitFor actions. Intuitively, the reason for this assumption is that while bGOLOG plans may only appeal to the robot’s beliefs, waitFor actions directly appeal to the value of continuous fluents, for example clock, battLevel or robotLoc. In the remainder of this subsection, we sketch some considerations as to how this restriction can be overcome.
The idea is that although a bGOLOG plan may not wait for the value of a continuous fluent to fulfill certain conditions, it may very well wait for the robot’s continuously changing beliefs about the value of continuous fluents to fulfill a condition. To get a feel for this idea, let us reconsider the 1-dimensional robot example from Section 4.1.2, where a mobile robot is moving along a straight line (this is not the example from Bacchus, Halpern and Levesque, which does not account for continuous change). The robot’s location is represented by the fluent robotLoc1d. For simplicity, we will not consider any actions that affect the robot’s position, nor a model of the low-level processes.
Let us assume that initially the robot knows that it is moving with velocity 1, but is unsure about its position at the beginning of S0: there is a 30% chance that it is at position 9, and a 70% chance that it is at position 10. The following axiom makes this precise, specifying that initially the robot considers two situations possible, s1 and s2, with degrees of likelihood 0.3 and 0.7, respectively. s1 represents the situation where at the beginning of S0 the robot is at position 9, and s2 the one where it starts at position 10. Note that in both situations, the value of robotLoc1d is a continuous linear function of time (cf. Section 4.1.3 on page 58).
∃s1, s2 ∀s. s ≠ s1 ∧ s ≠ s2 ⊃ p(s, S0) = 0 ∧
p(s1, S0) = 0.3 ∧ p(s2, S0) = 0.7 ∧
robotLoc1d(s1) = linear(9, 1, start(s1)) ∧
robotLoc1d(s2) = linear(10, 1, start(s2))
Next, let us assume that the robot’s task is to execute the action say(“I am at 20”) as soon as it reaches position 20. Intuitively, the robot cannot execute waitFor(robotLoc1d ≥ 20) because it has only probabilistic beliefs about its position. However, it can wait until its belief that it has reached position 20 exceeds a certain threshold. For example, it can wait for its belief in being at a position ≥ 20 to exceed 50%. Intuitively, this should be the case after 10 seconds. Or, it can wait for its belief in being at a position ≥ 20 to exceed 99%, which should cause it to wait for 11 seconds.
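The two waiting times can be checked with a small sketch of the two-situation epistemic state. This is an informal illustration rather than part of the formalism: the list-of-pairs representation and the function names are hypothetical, and time is measured in seconds from the start of S0.

```python
from fractions import Fraction

# Initial epistemic state from the axiom above: (probability, start position).
# The velocity is 1 in both situations.
SITUATIONS = [(Fraction(3, 10), 9), (Fraction(7, 10), 10)]
VELOCITY = 1


def bel_at_least(target, t):
    """Bel(robotLoc1d >= target) at t seconds after the start of S0."""
    return sum(p for p, start in SITUATIONS if start + VELOCITY * t >= target)


def first_time_exceeding(target, threshold):
    """Earliest time at which the belief exceeds `threshold`, or None.

    The belief only changes when some situation reaches `target`, so it
    suffices to check those arrival times in increasing order.
    """
    arrivals = sorted(Fraction(target - start, VELOCITY) for _, start in SITUATIONS)
    for t in arrivals:
        if bel_at_least(target, t) > threshold:
            return t
    return None


wait_50 = first_time_exceeding(20, Fraction(1, 2))     # 10
wait_99 = first_time_exceeding(20, Fraction(99, 100))  # 11
```

After 10 seconds only s2 has reached position 20, giving a belief of 0.7. After 11 seconds both situations have, giving a belief of 1.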
The above example suggests that one possibility to relax the condition that no waitFor actions may occur in bGOLOG plans would be to allow the occurrence of waitFor actions appealing to epistemic t-forms. Here, an epistemic t-form is an expression of the form Bel(τ) op p, where τ is an ordinary t-form, op ∈ {≥, =, ≤}, and p is a probability. An example is Bel(robotLoc1d ≥ 20) ≥ 0.5. As with ordinary t-forms, one would evaluate an epistemic t-form at a situation s and time t. [Bel(τ) op p][s, t] would then be defined as Bel(τ[s, t]) op p, where τ[s, t] is the expression used to evaluate ordinary t-forms (cf. Section 4.1.4 on page 58). For example, [Bel(robotLoc1d ≥ 20) ≥ 0.5][s, t] would become Bel(val(robotLoc1d(s), t) ≥ 20) ≥ 0.5.
Using this approach, it would be possible to specify and project bGOLOG plans like the following:
[waitFor(Bel(robotLoc1d≥ 20) ≥ 0.5), say(“I am at 20”)].
However, these ideas can only be regarded as preliminary considerations toward a more general framework for dealing with probabilistic uncertainty and continuous change in pGOLOG. There are several reasons why the considerations presented above cannot be considered complete. For one, so far we assume that during on-line execution the high-level controller is provided with estimates of the value of all continuous fluents by means of ccUpdate actions. This means that during on-line execution the continuous fluents are directly observable. However, in general it seems desirable to consider both directly observable and non-observable continuous fluents. For example, unlike the voltage level of the robot’s batteries, which arguably can be considered directly observable during on-line execution, the (continuous) position of a ball kicked by the robot can hardly be considered directly observable.
For another, our successor state axiom for pll specifies that a configuration c considered possible is only removed from the epistemic state (without replacement by a successor configuration) if a reply action occurs that is not compatible with the program component of c. However, when dealing with continuous change it is appealing to also make use of ccUpdate actions to sharpen the robot’s epistemic state. For example, in the mobile robot example considered in Chapters 4 and 5 one could imagine removing a configuration c from the robot’s epistemic state if the approximation of the trajectory yielded by c and the estimates provided by the ccUpdates differ significantly. We leave this to future work.