Fusion Method - Experiments and Results - Robot skill learning through human demonstration and


4.3.2 Fusion Method

The feature-level fusion method, shown in Fig. 4.6-(a), consists of feature selection, HMM-based classification, and Bayesian filtering. In general, different sets of features characterize different manipulative actions. For example, finger forces vary during actions such as “screwing” and “fixing”, while the skeleton joint angles capture the motion of actions like “hammering” and “wrenching”. However, some of the features may be irrelevant and have to be removed, since irrelevant features increase the computational cost and may also reduce the classification accuracy. Algorithms used for selecting features fall into two categories: wrapper methods and filter methods [104]. A wrapper method uses cross-validation to predict the benefit of adding a feature to, or removing it from, the feature subset in use, whereas a filter method does not rely on any knowledge of the algorithm to be used [105]. The wrapper method is computationally intensive, since each new subset is used to train a model, which is then tested on a hold-out set; the number of mistakes made on that hold-out set gives the score for that subset. Therefore, we choose the filter method, which scores a feature subset with a proxy measure instead of the error rate. This measure is chosen to be fast to compute while still capturing the usefulness of the feature set. Our feature selection approach is inspired by the variance preservation

[Figure 4.6 block labels. (a): finger motion features, arm motion features, force features; feature selection; Hidden Markov Models; Bayesian model. (b): finger motion features, arm motion features, force features; three Hidden Markov Models; decision fusion; Bayesian model.]

Figure 4.6: Fusion approaches. (a) feature-level fusion method. (b) decision-level fusion method.

algorithm [106], which selects a subset of features that preserves the variance contained in the data. Since the assembly actions are all non-stationary, a feature sequence that characterizes an action should show a substantial amount of variance. The variance of each feature in the training set is calculated for each action, with the range of the feature data normalized to the same scale. The variance ratio $\rho$ is used as the selection measure

$$\rho_{ij} = v_{ij} \,/\, \ldots(v_{ij}) \tag{4.3}$$

where $v_{ij}$ is the variance of the $i$th feature in action $j$'s training data set. As long as $\rho_{ij}$ is over a predefined threshold, which indicates that it is a significant feature for the $j$th action, the $i$th feature is kept. The recognition models are then trained using the selected features.
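The selection step can be sketched in Python as follows. This is only an illustration under stated assumptions: the normalizing term of (4.3) is not legible here, so the sketch assumes each variance is divided by the summed variance of all features for the same action, and it assumes min-max scaling over the whole training set. The function names and the toy data are hypothetical.

```python
from statistics import pvariance


def variance_ratios(samples_by_action):
    """samples_by_action: {action: [feature_vector, ...]}.

    Returns {action: [rho_0, rho_1, ...]}, one ratio per feature.
    """
    all_rows = [row for rows in samples_by_action.values() for row in rows]
    n_features = len(all_rows[0])

    # Rescale every feature to [0, 1] using its range over the whole
    # training set, so variances of different features are comparable.
    lows = [min(r[i] for r in all_rows) for i in range(n_features)]
    highs = [max(r[i] for r in all_rows) for i in range(n_features)]

    def scale(value, i):
        span = highs[i] - lows[i]
        return 0.0 if span == 0 else (value - lows[i]) / span

    ratios = {}
    for action, rows in samples_by_action.items():
        variances = [
            pvariance([scale(r[i], i) for r in rows]) for i in range(n_features)
        ]
        total = sum(variances)
        # ASSUMPTION: normalize by the summed variance for this action.
        ratios[action] = [v / total if total > 0 else 0.0 for v in variances]
    return ratios


def select_features(samples_by_action, threshold):
    """Keep, per action, the indices of features whose ratio exceeds threshold."""
    ratios = variance_ratios(samples_by_action)
    return {
        action: [i for i, rho in enumerate(r) if rho > threshold]
        for action, r in ratios.items()
    }


# Toy data: feature 0 moves during "hammering", feature 1 during "screwing",
# and feature 2 is constant everywhere.
demo = {
    "hammering": [[0.0, 5.0, 1.0], [10.0, 5.1, 1.0], [0.0, 5.0, 1.0], [10.0, 5.1, 1.0]],
    "screwing": [[5.0, 0.0, 1.0], [5.0, 10.0, 1.0], [5.0, 0.0, 1.0], [5.0, 10.0, 1.0]],
}
selected = select_features(demo, threshold=0.2)
```

With this toy data the constant feature is dropped for both actions, and each action keeps only the feature that actually varies during it.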

On the other hand, in the decision-level fusion method shown in Fig. 4.6-(b), features are divided into different groups, and a separate classifier is created for each group of features. To fuse the decisions from the classifiers, we design a measure that indicates the confidence of each decision. The confidence index is defined as

$$\zeta_j = \max_i P(\psi_a \mid \lambda_{ij}) \,/\, \ldots \tag{4.4}$$

where $\lambda_{ij}$ is the HMM for action $i$ using feature group $j$, and $\zeta_j$ is the confidence index of the decision from classifier $j$. The decision of the classifier with the strongest confidence index, $\zeta_s$, is chosen:

$$\arg\max_i P(\psi_a \mid \lambda_{is}) \tag{4.5}$$

where $\lambda_{is}$ denotes the type-$s$ HMM (the one with the strongest confidence) for action $i$.

[Figure 4.7 node labels: selected part $P_s$, action $A_j$, selected tool $T_t$; observations Obs p, Obs a, Obs t.]

Figure 4.7: The Bayesian Model for action recognition.
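The confidence-based selection in (4.4) and (4.5) can be illustrated with a short Python sketch. The denominator of (4.4) is not legible here, so the sketch assumes the top likelihood is divided by the sum of likelihoods over all actions for that classifier; this is an assumption, not necessarily the measure used in the thesis. The function name, the group names, and the numbers are hypothetical.

```python
def fuse_decisions(likelihoods):
    """likelihoods: {feature_group: {action: P(psi_a | lambda_ij)}}.

    Returns (chosen_action, chosen_group, confidence_per_group).
    """
    confidences = {}
    for group, per_action in likelihoods.items():
        total = sum(per_action.values())
        # ASSUMPTION: confidence = best likelihood / sum over all actions.
        confidences[group] = max(per_action.values()) / total if total > 0 else 0.0

    # Pick the classifier with the strongest confidence index ...
    best_group = max(confidences, key=confidences.get)
    # ... and take its most likely action, as in (4.5).
    best_action = max(likelihoods[best_group], key=likelihoods[best_group].get)
    return best_action, best_group, confidences


# Made-up likelihoods from three per-group HMM classifiers.
example = {
    "finger": {"screwing": 0.6, "hammering": 0.3, "wrenching": 0.1},
    "arm": {"screwing": 0.2, "hammering": 0.25, "wrenching": 0.05},
    "force": {"screwing": 0.3, "hammering": 0.3, "wrenching": 0.4},
}
action, group, conf = fuse_decisions(example)
```

Here the "finger" classifier is the most decisive, so its top action is returned even though the "arm" classifier prefers a different one. In practice HMM likelihoods are usually handled in the log domain, and the ratio would need to be computed accordingly.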

4.3.3 Bayesian Network for Modeling Object/action Dependency

There are two kinds of objects involved in the assembly task: tools and parts. Both are correlated with the action, and this dependency can be modeled using a Bayesian model, shown in Fig. 4.7. Here $A_j$ denotes a manipulative action, $\psi_p$ is the decision of the part recognition, and $\psi_t$ is the decision of the tool recognition, where $\psi_p \in \Psi_p$, $\psi_t \in \Psi_t$, and $\psi_a \in \Psi_a$. $\Psi_p$ and $\Psi_t$ are the sets of part and tool types, respectively.

At this level, the goal is to find the maximum a posteriori (MAP) estimate, $\arg\max_j P(A_j \mid \psi_a, \psi_p, \psi_t)$. According to Bayes' rule,

$$P(A_j \mid \psi_a, \psi_p, \psi_t) \propto P(\psi_a \mid A_j, \psi_p, \psi_t) \cdot P(A_j \mid \psi_p, \psi_t) \tag{4.6}$$
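For reference, the proportionality in (4.6) comes from Bayes' rule applied conditionally on $\psi_p$ and $\psi_t$. Written out in full, the denominator does not depend on $j$, so it can be dropped when maximizing over actions:

```latex
P(A_j \mid \psi_a, \psi_p, \psi_t)
  = \frac{P(\psi_a \mid A_j, \psi_p, \psi_t)\, P(A_j \mid \psi_p, \psi_t)}
         {P(\psi_a \mid \psi_p, \psi_t)}
```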

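As shown in the Bayesian model, we assume that $\psi_a$ is independent of $\psi_p$ and $\psi_t$ given $A_j$. Therefore, $P(\psi_a \mid A_j, \psi_p, \psi_t) = P(\psi_a \mid A_j)$. $P(\psi_a \mid A_j)$ can be interpreted as $P(\psi_a \mid \lambda_j)$, which is the output of the low-level HMMs, where $\lambda_j$ is the HMM for action $A_j$. Applying the law of total probability,

$$P(A_j \mid \psi_p, \psi_t) = \sum_m \sum_n P(A_j \mid P_m, T_n, \psi_p, \psi_t) \cdot P(P_m, T_n \mid \psi_p, \psi_t) \tag{4.7}$$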

where $P_m$ and $T_n$ are part $m$ and tool $n$. $A_j$ is independent of $\psi_p$ and $\psi_t$ given $P_m$ and $T_n$. Therefore,

$$P(A_j \mid P_m, T_n, \psi_p, \psi_t) = P(A_j \mid P_m, T_n) \tag{4.8}$$

$P(A_j \mid P_m, T_n)$ is the prior probability characterizing the action that occurs on part $P_m$ using tool $T_n$, which can be calculated based on the occurrences of $A_j$ given $P_m$ and $T_n$ in the training set. On the other hand, we have
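As a rough illustration of how (4.6) to (4.8) fit together, the Python sketch below estimates $P(A_j \mid P_m, T_n)$ by relative frequency in a training set and combines it with the HMM output. The expression for $P(P_m, T_n \mid \psi_p, \psi_t)$ is not available in this excerpt, so the sketch simply takes it as an input table. All function names, part and tool names, and numbers are hypothetical.

```python
def action_prior_from_counts(demonstrations):
    """demonstrations: iterable of (action, part, tool) training labels.

    Returns {(part, tool): {action: P(action | part, tool)}} estimated
    by relative frequency.
    """
    counts = {}
    for action, part, tool in demonstrations:
        per_pair = counts.setdefault((part, tool), {})
        per_pair[action] = per_pair.get(action, 0) + 1
    return {
        pair: {a: c / sum(per_pair.values()) for a, c in per_pair.items()}
        for pair, per_pair in counts.items()
    }


def action_posterior(hmm_likelihood, prior, object_belief):
    """hmm_likelihood: {action: P(psi_a | A_j)} from the low-level HMMs.
    prior:          {(part, tool): {action: P(A_j | P_m, T_n)}}.
    object_belief:  {(part, tool): P(P_m, T_n | psi_p, psi_t)}
                    (ASSUMPTION: supplied by the object recognizers).

    Returns the normalized posterior over actions.
    """
    scores = {}
    for action, likelihood in hmm_likelihood.items():
        # (4.7) with (4.8): marginalize over part/tool pairs.
        context = sum(
            prior.get(pair, {}).get(action, 0.0) * weight
            for pair, weight in object_belief.items()
        )
        # (4.6): likelihood times object-conditioned action probability.
        scores[action] = likelihood * context
    total = sum(scores.values())
    return {a: s / total for a, s in scores.items()} if total > 0 else scores


training = [
    ("hammering", "nail", "hammer"), ("hammering", "nail", "hammer"),
    ("screwing", "screw", "screwdriver"), ("screwing", "screw", "screwdriver"),
    ("wrenching", "nut", "wrench"), ("hammering", "nut", "wrench"),
]
prior = action_prior_from_counts(training)
posterior = action_posterior(
    hmm_likelihood={"hammering": 0.5, "screwing": 0.4, "wrenching": 0.1},
    prior=prior,
    object_belief={("screw", "screwdriver"): 0.8, ("nail", "hammer"): 0.2},
)
map_action = max(posterior, key=posterior.get)
```

In this made-up case the HMM alone slightly prefers "hammering", but the recognized part and tool shift the MAP estimate to "screwing".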

4.4 Experiments and Results

In this section, we first introduce the experiment setup. Then, we analyze the experimental results.

In document Robot skill learning through human demonstration and interaction (Page 71-74)