5.3 Learning Relational Affordances in a Continuous Domain Setting
5.3.2 Learning a Linear Conditional Gaussian BN
Just like in the single object case, we start by learning a BN from the data (corresponding O, A, E values) obtained from a behavioural babbling stage. This constitutes Step 1a of our pipeline. The behavioural babbling stage is performed with the right arm only. Pairs of objects are placed in front of the robot at various positions. For choosing these babbling settings we use an approach similar to that for one-arm actions presented in Chapter 4. We place one object in the centre of the action space for the respective action, and we place the second object at varying x- and y-coordinates around this first object such that the distance between the objects is smaller than the hand motion distance. The robot executes one of its actions A on one object (the main object, O_Main). O_Main may interact with the other object (the secondary object, O_Sec), causing it to also move. Figure 5.3 shows such a setting. The robot executed 300 such exploratory actions, obtaining 300 sets of O, A, E values. One such set, for the action shown in Figure 5.3, is shown in Table 5.1.
Table 5.1: Collected O, A, E data for the tap action in Figure 5.3
Object Properties                   Action   Effects
shape_{O_Main}: cube                tap      displX_{O_Main}: 0.40 cm
shape_{O_Sec}: cube                          displY_{O_Main}: 7.23 cm
distX_{O_Main,O_Sec}: 7.00 cm                displX_{O_Sec}: 0.39 cm
distY_{O_Main,O_Sec}: 17.00 cm               displY_{O_Sec}: 4.38 cm
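The placement constraint used during babbling (the second object lies closer to the first than the hand motion distance) can be illustrated with a small rejection sampler. This is only a sketch under our own assumptions: the text does not say how the positions were varied, so the uniform sampling, the function name `sample_secondary_offset`, and the numeric distances in the usage lines are hypothetical.

```python
import math
import random


def sample_secondary_offset(max_dist_cm, min_dist_cm=0.0, rng=random):
    """Sample an (x, y) offset of the secondary object from the main object.

    Offsets are drawn uniformly from a square and rejected until the
    centre-to-centre distance d satisfies min_dist_cm <= d < max_dist_cm.
    Uniform sampling is an assumption, not the procedure used on the robot.
    """
    if not 0.0 <= min_dist_cm < max_dist_cm:
        raise ValueError("need 0 <= min_dist_cm < max_dist_cm")
    while True:
        dx = rng.uniform(-max_dist_cm, max_dist_cm)
        dy = rng.uniform(-max_dist_cm, max_dist_cm)
        if min_dist_cm <= math.hypot(dx, dy) < max_dist_cm:
            return dx, dy


# Hypothetical values: objects at least 15 cm apart, hand motion of 25 cm.
rng = random.Random(0)
offsets = [sample_secondary_offset(25.0, 15.0, rng) for _ in range(300)]
```

Each offset would then be paired with an action and the measured displacements to form one O, A, E sample like the one in Table 5.1.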
Once we have collected the data, we learn a Linear Conditional Gaussian (LCG) Bayesian Network (BN) [Kjærulff and Madsen, 2005]. This LCG BN models a single-action step, and it specifically models right-arm actions. We will show later how, with the help of PPL modelling, we can extend the model to two arms,
86 TWO-ARM ROBOTS MODELS
and to two-action steps. In our setting, displX, displY, distX and distY can be approximated by conditional Gaussian distributions over the short distances over which objects interact. We show later how to enforce these distances by adding logical rules. Later experiments for two-arm scenarios will show that this approximation is better than the discretisation presented in the previous chapter.
The LCG model of our setting is shown in Figure 5.4, where discrete random variables are represented by a single ellipse, and continuous ones by a double ellipse. displX_{O_Main} and displY_{O_Main} only depend on A and the object shape, since the hand is moved over a preprogrammed distance (with a given tolerance). displX_{O_Sec} and displY_{O_Sec} depend on both the relative distance between O_Sec and O_Main and the shapes of both objects.
Figure 5.4: LCG BN model for two-object interaction
The LCG parameters are learnt from the collected babbling data (e.g., Table 5.1) using maximum likelihood parameter estimation as implemented in the Bayes Net Toolbox (BNT) for Matlab [Murphy et al., 2001]. For example, for our tap action on two interacting cubes (as in Figure 5.3), the displacement of O_Sec on the y-axis is distributed as (in cm):
\[ \mathcal{N}\left(19.92 - 0.05 \cdot \mathit{distX}_{O_{Main},O_{Sec}} - 0.86 \cdot \mathit{distY}_{O_{Main},O_{Sec}},\; 0.17\right) \tag{5.1} \]
Intuitively this makes sense: the second cube is moved along by the first, and every additional centimetre of initial separation is a centimetre the first cube travels before the second starts moving. We therefore expect the learnt coefficient of distY to be close to −1, with a small dependence on distX when the objects are not aligned, as in Figure 5.3. Also intuitively, the coefficients of the mean generally depend on the widths of the shapes (15 cm for cubes), as well as on the preprogrammed action distance.
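As a worked check, Equation 5.1 can be evaluated for the sample in Table 5.1, and the kind of fit that produces such parameters can be sketched. For a single linear Gaussian node with complete data, maximum likelihood estimation reduces to ordinary least squares with the mean squared residual as variance estimate. The snippet below is a NumPy illustration of that idea, not the BNT code used in this work. It runs on synthetic data generated from Equation 5.1, and it assumes that the second parameter, 0.17, is a variance; the function names are ours.

```python
import numpy as np

# Parameters quoted in Equation 5.1 (tap action, two cubes, values in cm).
INTERCEPT, COEF_DX, COEF_DY, SPREAD = 19.92, -0.05, -0.86, 0.17


def mean_displ_y_sec(dist_x, dist_y):
    """Mean of displY for O_Sec according to Equation 5.1."""
    return INTERCEPT + COEF_DX * dist_x + COEF_DY * dist_y


def fit_linear_gaussian(parents, child):
    """ML fit of child ~ N(b0 + b . parents, var) via least squares.

    Returns the coefficients [b0, b1, ...] and the ML variance estimate
    (mean squared residual, i.e. divided by n rather than n - 1).
    """
    design = np.column_stack([np.ones(len(parents)), parents])
    beta, *_ = np.linalg.lstsq(design, child, rcond=None)
    residuals = child - design @ beta
    return beta, float(np.mean(residuals ** 2))


# Mean predicted for the Table 5.1 sample (distX = 7.00, distY = 17.00):
# 19.92 - 0.05 * 7.00 - 0.86 * 17.00 = 4.95 cm.
predicted = mean_displ_y_sec(7.00, 17.00)

# Synthetic stand-in for the 300 babbling samples (NOT the robot's data).
rng = np.random.default_rng(0)
parents = np.column_stack([rng.uniform(-15, 15, 300), rng.uniform(15, 25, 300)])
noise = rng.normal(0.0, np.sqrt(SPREAD), 300)  # assumes 0.17 is a variance
child = mean_displ_y_sec(parents[:, 0], parents[:, 1]) + noise

beta, var = fit_linear_gaussian(parents, child)
```

The predicted mean of 4.95 cm is close to the 4.38 cm observed for displY_{O_Sec} in Table 5.1, and `beta` and `var` recover the generating parameters up to sampling noise.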
5.3.3 PPL Modelling
We now continue with Step 1b of our pipeline, modelling relational affordances in a PPL. In the previous chapter relational affordances were modelled using the PPL ProbLog. Here, since we deal with continuous random variables modelled by normal distributions, we use the PPL Distributional Clauses (DCs) [Gutmann et al., 2011b], a continuous extension of ProbLog.
For modelling affordances in a PPL for the task in this chapter, the main predicates we use are presented in Table 5.2.
Table 5.2: Predicates used for affordance modelling
Predicate                          Meaning
shape(Obj, Shape)                  The shape of object Obj is Shape.
distX(Obj1, Obj2, T)               Distribution of the relative x-axis distance between objects Obj1 and Obj2 at time-step T.
distY(Obj1, Obj2, T)               Distribution of the relative y-axis distance between objects Obj1 and Obj2 at time-step T.
displX(Obj, Arm, T)                Distribution of the x-axis displacement of object Obj due to an action with arm Arm at time-step T. Arm is one of left or right.
displY(Obj, Arm, T)                Distribution of the y-axis displacement of object Obj due to an action with arm Arm at time-step T.
dX(Obj, D, T)                      The overall x-axis displacement of object Obj due to all actions at time-step T is D.
dY(Obj, D, T)                      The overall y-axis displacement of object Obj due to all actions at time-step T is D.
action(Type, Arm, Obj)             The type of the action on object Obj with arm Arm is Type. Type can be push or tap.
approx_ok(A, ObjM, ObjS, DX, DY)   True if the Gaussian approximation for the action effect holds for the action A on main object ObjM and with secondary object ObjS when the x-axis distance between the objects is DX and the y-axis distance between the objects is DY.
twoArmA(AL, AR, OL, OR)            The robot action for the left arm is of type AL on object OL, and for the right arm is of type AR on object OR.
actCheck(OL, OR)                   True if two-arm action with left arm on object OL
We can now proceed to model the LCG using DCs and generalise it to a relational affordance model. We generalise over the number of objects as before, by introducing variables for objects (e.g., displX(ObjMain) for displX_{O_Main} in the LCG), and so build a general multiple-object PPL model from the two-object LCG BN. We illustrate the modelling with examples.
We first model the shape of an object being randomly chosen from our set of 4 shapes, each with 25% probability:
shape(Obj) ∼ finite([1/4: cube, 1/4: prism, 1/4: bar, 1/4: cyl]) ← obj(Obj).
where the variable Obj is universally quantified over the set of all objects.
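Read generatively, the shape clause says that every object independently draws its shape from a uniform distribution over the four shapes. A minimal Python sketch of that reading follows; it is not DC code, and the names `SHAPES`, `sample_shape` and `sample_shapes` are ours.

```python
import random

# The four shapes of the clause, each with probability 1/4.
SHAPES = ("cube", "prism", "bar", "cyl")


def sample_shape(rng=random):
    """Draw one shape uniformly at random."""
    return rng.choice(SHAPES)


def sample_shapes(objects, rng=random):
    """Draw an independent shape for every object, mirroring the
    universal quantification over Obj in the clause."""
    return {obj: sample_shape(rng) for obj in objects}


rng = random.Random(1)
world = sample_shapes(["o1", "o2"], rng)
```

Here `world` maps each of the two hypothetical objects o1 and o2 to one sampled shape.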
Now we can model the LCG from Figure 5.4 with the learnt parameters. For example, to express Equation 5.1 as a DC, one writes:
displY(ObjSec) ∼ gaussian(Mu, 0.17) ←
    action(tap, ObjMain),
    ≃(shape(ObjMain)) = cube, ≃(shape(ObjSec)) = cube,
    ≃(distX(ObjMain, ObjSec)) = DX, ≃(distY(ObjMain, ObjSec)) = DY,
    Mu is 19.92 − 0.05 ∗ DX − 0.86 ∗ DY.
meaning that, for a tap action, if both shapes are cubes, displY of ObjSec is distributed according to a Gaussian with mean Mu.
We can use definite clauses to model that the above Gaussian approximation holds only over small distances (10 cm on the motion axis and while there is overlap on the orthogonal axis), while over large distances there is no effect on ObjSec. The distances distX and distY will later be given as evidence. For our running example with two cubes:
approx_ok(tap, cube, cube, DX, DY) ← DY > 15, DY < 25, DX > −15, DX < 15.
where 15 cm is the smallest centre-to-centre distance between two cubes. We then just need to add approx_ok(tap, cube, cube, DX, DY) to the body of the DC defining displY above. Similar rules can be added to enforce the action space.
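The effect of guarding the Gaussian clause with approx_ok can be illustrated by forward sampling in Python. This sketch is not the DC inference engine. It makes two assumptions of its own: "no effect" is read as a displacement of exactly 0 cm, and 0.17 is treated as a variance. All function names are hypothetical.

```python
import math
import random


def approx_ok_tap_cubes(dx, dy):
    """Guard for tap on two cubes: 15 < DY < 25 and -15 < DX < 15 (cm)."""
    return 15 < dy < 25 and -15 < dx < 15


def sample_displ_y_sec(dx, dy, rng=random):
    """Sample the y-axis displacement of the secondary cube (cm)."""
    if not approx_ok_tap_cubes(dx, dy):
        return 0.0  # assumption: outside the guard the cube does not move
    mu = 19.92 - 0.05 * dx - 0.86 * dy
    return rng.gauss(mu, math.sqrt(0.17))  # assumption: 0.17 is a variance


def prob_displ_exceeds(threshold, dx, dy, n=10000, rng=random):
    """Monte Carlo estimate of P(displY > threshold) for given DX, DY."""
    hits = sum(sample_displ_y_sec(dx, dy, rng) > threshold for _ in range(n))
    return hits / n


rng = random.Random(0)
near = prob_displ_exceeds(3.0, 7.0, 17.0, rng=rng)  # inside the guard
far = prob_displ_exceeds(3.0, 7.0, 30.0, rng=rng)   # outside the guard
```

With the distances of Table 5.1 the mean displacement is 4.95 cm, so almost all samples exceed 3 cm, whereas at a y-distance of 30 cm the guard fails and no sample does.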
At this point we have all the tools to fully model the relational affordance model with the parameters learnt as in Section 5.3.2. Once the program is defined, the sampling-based inference algorithm of [Gutmann et al., 2011b] or [Nitti et al., 2013] is used to compute the probability of a user's query. For example, assuming two cubes o1 and o2, one can ask for the probability of the y-axis displacement of o2 being greater than 3 cm given some distance between o1 and o2. For this, we need to compute:
MODELLING AFFORDANCES FOR TWO-ARM ROBOTS 89