5. Learning by Imitation
Toyoaki Nishida Kyoto University
Copyright © 2016, Toyoaki Nishida, Atsushi Nakazawa, Yoshimasa Ohmoto, Yasser Mohammad, At ,Inc. All Rights Reserved.
Conversational Informatics, May 18th, 2016
Learning by imitation—Generic framework
[Figure: Measurement, Corpus, Generalization, Dialogue patterns]
[Nishida et al 2014]
Endow robots with the ability to autonomously imitate human behaviors.
Interactions from observation—General framework
[Figure: time series a_1, a_2, a_3 plotted over time t, linked by "Causes" relations]
[Nishida et al 2014]
Problem formulation:
Find approximately repeated subsequences in a longer time series.
(1) Motif Discovery – Finding Patterns of Interaction
[Nishida et al 2014]
K-Motif(n, R): Given a time series T, a subsequence length n and a range R, the most significant motif in T (called 1-Motif(n, R)) is the subsequence $C_1$ that has the highest count of non-trivial matches. The K-th most significant motif in T (called K-Motif(n, R)) is the subsequence $C_K$ that has the highest count of non-trivial matches and satisfies $D(C_K, C_i) > 2R$ for all $1 \le i < K$.
K-Motif(n, R, d): The K-th most d-significant motif in T is the subsequence $C_K$ that has the highest count of non-trivial matches and satisfies $D(C_K, C_i) > 2R$ for all $1 \le i < K$, where d (possibly non-contiguous) datapoints can be ignored when calculating the distance between $C_K$ and $C_i$. In general we have $d < n$, and typically $d \ll n$.
[Chiu 2003]
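The 1-Motif(n, R) definition can be illustrated with a brute-force search. This is only a minimal sketch, not the probabilistic algorithm of [Chiu 2003]: the function name `one_motif` is hypothetical, D is assumed to be the Euclidean distance, and a match is assumed to be non-trivial simply when it does not overlap the candidate window.

```python
import numpy as np

def one_motif(T, n, R):
    """Return (start, count) of the length-n window with the most non-trivial matches."""
    T = np.asarray(T, dtype=float)
    windows = np.lib.stride_tricks.sliding_window_view(T, n)
    best_start, best_count = -1, -1
    for i, candidate in enumerate(windows):
        # D(C_i, C_j) for every window j (assumption: Euclidean distance)
        dist = np.linalg.norm(windows - candidate, axis=1)
        count, j = 0, 0
        while j < len(windows):
            # Assumption: a match is non-trivial if it does not overlap the candidate.
            if abs(j - i) >= n and dist[j] <= R:
                count += 1
                j += n  # skip shifted copies of the same occurrence
            else:
                j += 1
        if count > best_count:
            best_start, best_count = i, count
    return best_start, best_count

series = [10, 1, 3, 1, 20, 35, 1, 3, 1, 50, 70, 1, 3, 1, 95]
print(one_motif(series, n=3, R=0.5))  # prints (1, 2)
```

The search costs a quadratic number of distance computations, which is the cost that the discovery algorithms in the rest of this section try to avoid.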
The pattern-driven method [Pavesi 2001]
Given a set of k unaligned sequences, a distance measure d, and a threshold value for d, find all patterns that occur in at least q of the k sequences within the threshold distance from the sequence.
# of (AA, A*, *A)’s
Planted (l, d)-motif problem. You are given t strings of length n, initially generated at random. Each string is planted with exactly one approximate occurrence of an unknown motif y of length l, with exactly d substitutions (or defects). Find the unknown motif y.
E.g., Planted (15, 4)-motif problem on t=20 sequences of length 600
n=600
t=20 Motif of length l=15
Exactly d=4 substitutions
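A test instance of the planted (l, d)-motif problem can be generated as follows. This is a minimal sketch: the function names, the DNA alphabet and the fixed random seed are assumptions made for illustration, and the defaults mirror the (15, 4) example with t=20 and n=600.

```python
import random

def hamming(a, b):
    """Number of positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))

def plant_motif(t=20, n=600, l=15, d=4, alphabet="ACGT", seed=0):
    """Return (motif, sequences, starts) for a planted (l, d)-motif instance."""
    rng = random.Random(seed)
    motif = "".join(rng.choice(alphabet) for _ in range(l))
    sequences, starts = [], []
    for _ in range(t):
        seq = [rng.choice(alphabet) for _ in range(n)]  # random background
        instance = list(motif)
        for pos in rng.sample(range(l), d):             # exactly d substitutions
            instance[pos] = rng.choice([a for a in alphabet if a != motif[pos]])
        start = rng.randrange(n - l + 1)
        seq[start:start + l] = instance
        sequences.append("".join(seq))
        starts.append(start)
    return motif, sequences, starts

motif, sequences, starts = plant_motif()
print(motif, [hamming(s[p:p + 15], motif) for s, p in zip(sequences, starts)][:5])
```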
[Chiu 2003]
The sequence-driven method
The WINNOWER algorithm [Pevzner 2000]
[Figure: planted instances are segments of length l; each differs from the motif y in d or fewer positions, so any two instances differ in 2d or fewer positions]
(1) Build a graph. Vertices correspond to l-mers (e.g., 15-mers) present in the t (e.g., 20) input sequences. An edge connects two vertices if and only if the corresponding l-mers differ in at most 2d positions and do not both come from the same input sequence. (This schema illustrates the case where t=4.)
(2) Look for a clique of size t in this graph.
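The two steps can be sketched on a toy instance. This is not the WINNOWER procedure itself, which prunes spurious edges before searching; the sketch below assumes a naive backtracking search for a clique with one vertex per input sequence, and the function names are hypothetical.

```python
def hamming(a, b):
    """Number of positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))

def find_clique(sequences, l, d):
    """Return one vertex (sequence index, start, l-mer) per sequence such that
    all chosen l-mers pairwise differ in at most 2d positions, or None."""
    t = len(sequences)
    # Vertices: every l-mer of every input sequence.
    vertices = [
        [(i, s, seq[s:s + l]) for s in range(len(seq) - l + 1)]
        for i, seq in enumerate(sequences)
    ]

    def extend(chosen, i):
        if i == t:
            return chosen
        for v in vertices[i]:
            # Edge test: at most 2d mismatches; vertices come from different sequences.
            if all(hamming(v[2], u[2]) <= 2 * d for u in chosen):
                result = extend(chosen + [v], i + 1)
                if result is not None:
                    return result
        return None

    return extend([], 0)

# Toy data: motif ACGTACGT planted with d=1 substitution in t=4 sequences.
toy = ["GGACGTACGACC", "TCGTACGTGGCA", "CAACCTACGTTG", "TTGACGTAGGTA"]
print(find_clique(toy, l=8, d=1))
```

The exhaustive search is exponential in the worst case, so it only serves to make the graph formulation concrete.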
The SP-STAR algorithm
(1) Find a median string W from samples S:
$$\arg\min_{W \in S} D(W, S)$$
where
$$D(W, S) = \sum_{1 \le i < j \le t} d(W_i, W_j)$$
and $W_1, \ldots, W_t$ are the best instances of W in the sample sequences $S = \{s_1, \ldots, s_t\}$.
(2) Repeat improvements.
- Compute the best instances $W_1, \ldots, W_t$ of W in the sample sequences $S = \{s_1, \ldots, s_t\}$.
- Replace W with the majority string, whose i-th letter is the most frequent i-th letter in $W_1, \ldots, W_t$.
[Pevzner 2000]
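The SP-STAR steps can be sketched in a few lines. This is a minimal illustration under stated assumptions: d is taken to be the Hamming distance, every l-mer in the sample is tried as the initial W, and the function names and the stopping rule (stop when the sum-of-pairs score no longer decreases) are choices made for this sketch.

```python
from collections import Counter
from itertools import combinations

def hamming(a, b):
    return sum(x != y for x, y in zip(a, b))

def best_instance(W, seq):
    """The l-mer of seq closest to W."""
    l = len(W)
    return min((seq[s:s + l] for s in range(len(seq) - l + 1)),
               key=lambda x: hamming(W, x))

def sp_score(instances):
    """Sum-of-pairs score D(W, S) over the best instances W_1..W_t."""
    return sum(hamming(a, b) for a, b in combinations(instances, 2))

def majority_string(instances):
    """String whose i-th letter is the most frequent i-th letter of the instances."""
    return "".join(Counter(col).most_common(1)[0][0] for col in zip(*instances))

def sp_star(sequences, l, max_iter=20):
    # Step (1): try every l-mer in the sample as W and keep the best score.
    best = None
    for seq in sequences:
        for s in range(len(seq) - l + 1):
            W = seq[s:s + l]
            inst = [best_instance(W, x) for x in sequences]
            score = sp_score(inst)
            if best is None or score < best[2]:
                best = (W, inst, score)
    W, inst, score = best
    # Step (2): repeat improvements with the majority string.
    for _ in range(max_iter):
        W_new = majority_string(inst)
        inst_new = [best_instance(W_new, x) for x in sequences]
        score_new = sp_score(inst_new)
        if score_new >= score:
            break
        W, inst, score = W_new, inst_new, score_new
    return W, inst, score

print(sp_star(["TTGATTACATT", "CCCGATTACAC", "GATTACAGGGG"], l=7))
```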
The PROJECTION algorithm [Buhler 2002]
Locality-sensitive hashing: each l-mer s is hashed on k of its l positions,
$$f(s) = \langle s[i_1], s[i_2], \ldots, s[i_k] \rangle$$
- The $t(n - l + 1)$ samples (l-mers) are projected into $|\Sigma|^{k}$ buckets.
- Repeat m times.
[Figure: samples projected into buckets; labels: most enriched bucket, inferred planted motif (substring of inferred motif)]
Locality-sensitive hashing [Buhler 2002]
Consider two strings $s_1$ and $s_2$ of common length d over an alphabet $\Sigma$.
Fix r<d; we call $s_1$ and $s_2$ similar if they differ by at most r single-character substitutions.
To detect similarity between $s_1$ and $s_2$, we construct a randomized filter
$$f(s) = \langle s[i_1], s[i_2], \ldots, s[i_k] \rangle$$
It concatenates characters from at most k distinct positions of s to form a length-k string called an LSH value. Buhler's filter accepts the pair $(s_1, s_2)$ if and only if $f(s_1) = f(s_2)$.
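One projection trial can be sketched as follows. This is a minimal illustration, assuming that the k positions are drawn uniformly at random and that buckets are kept in a dictionary; the function name is hypothetical, and the refinement of enriched buckets performed by the full algorithm is left out.

```python
import random
from collections import defaultdict

def projection_buckets(sequences, l, k, rng):
    """Hash every l-mer on k randomly chosen positions.
    Returns (positions, buckets); buckets maps an LSH value to (sequence, start) pairs."""
    positions = sorted(rng.sample(range(l), k))
    buckets = defaultdict(list)
    for i, seq in enumerate(sequences):
        for s in range(len(seq) - l + 1):
            lmer = seq[s:s + l]
            key = "".join(lmer[p] for p in positions)  # f(s) = s[i_1] ... s[i_k]
            buckets[key].append((i, s))
    return positions, buckets

rng = random.Random(0)
positions, buckets = projection_buckets(["ACGTACGTAC", "TTACGTACGT"], l=4, k=2, rng=rng)
largest = max(buckets.values(), key=len)
print(positions, largest)  # the most enriched bucket of this trial
```

Two l-mers that agree on all sampled positions always land in the same bucket, which is why planted instances tend to accumulate in one bucket when the sampled positions avoid their substitutions.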
Parameter setting [Buhler 2002]

The projection size k?
$$k \ge \log_{|\Sigma|} \frac{t(n - l + 1)}{E}$$
E: the allowed expected number of random sequences that hash into the same bucket.

The number of independent trials m?
$$p(l, d, k) = \binom{l-d}{k} \Big/ \binom{l}{k}$$
The probability that a given planted motif instance hashes to the planted bucket.
$$\left[ B_{t,\,p(l,d,k)}(s) \right]^{m} \le 1 - q \quad\Longrightarrow\quad m \ge \frac{\log(1 - q)}{\log B_{t,\,p(l,d,k)}(s)}$$
Here $B_{t,p}(s)$ is the probability of fewer than s successes in t independent Bernoulli trials with success probability p. The probability of failure, in which fewer than s planted instances hash to the planted bucket in every one of the m trials, should be less than 1-q.

The bucket threshold s?
Twice the average bucket size?
$$s = 2\,\frac{t(n - l + 1)}{|\Sigma|^{k}}\ ?$$
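The parameter formulas can be evaluated directly. This is a minimal sketch with hypothetical function names, assuming a DNA alphabet ($|\Sigma| = 4$), E = 1, q = 0.95 and a bucket threshold s = 4 for the (15, 4) example; the threshold value is an assumption for illustration, not a value given on the slide.

```python
import math

def projection_size(t, n, l, sigma=4, E=1.0):
    """Smallest k with at most E random l-mers expected per bucket."""
    return math.ceil(math.log(t * (n - l + 1) / E, sigma))

def hit_probability(l, d, k):
    """p(l, d, k): probability that a planted instance hashes to the planted bucket."""
    return math.comb(l - d, k) / math.comb(l, k)

def fewer_than(s, t, p):
    """B_{t,p}(s): probability of fewer than s successes in t Bernoulli(p) trials."""
    return sum(math.comb(t, i) * p**i * (1 - p)**(t - i) for i in range(s))

def num_trials(t, l, d, k, s, q=0.95):
    """Smallest m such that the failure probability B^m is at most 1 - q."""
    B = fewer_than(s, t, hit_probability(l, d, k))
    return math.ceil(math.log(1 - q) / math.log(B))

k = projection_size(t=20, n=600, l=15)
print(k, hit_probability(15, 4, k), num_trials(t=20, l=15, d=4, k=k, s=4))
```

For this example the projection size comes out as k = 7, and the hit probability is 330/6435, about 0.051, so a planted instance lands in the planted bucket in roughly one trial out of twenty.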
Catalano's algorithm [Catalano 2006]
[Figure: a candidate window and a comparison window taken from the signal, and a noise window, each divided into sub-windows; labels: signal, motif, candidates of motifs, rejection threshold]
- Make k best matches with the smallest distances;
- Remove pairs whose distance is greater than the rejection threshold;
- Refine.
Constrained motif discovery [Mohammad 2009]
Problem formulation
• Given a time series X(t), find recurring patterns of length between $L_1$ and $L_2$ using distance function D subject to the constraint P(t), where P(t) is an estimate of the probability that a motif occurrence exists near time step t.
[Figure: P(t), obtained by change point discovery, ranging from unlikely to likely]
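DGCMD (Distance Graph Constrained Motif Discovery) [Mohammad 2009]
[Figure: signal and motif]
Distance matrix:
$$\begin{pmatrix} 0 & 33.4 & 14.5 & 34.5 \\ 33.4 & 0 & 22.43 & 2.31 \\ 14.5 & 22.43 & 0 & 17.43 \\ 34.5 & 2.31 & 17.43 & 0 \end{pmatrix}$$
After applying the constraint:
$$\begin{pmatrix} 0 & \infty & \infty & \infty \\ \infty & 0 & \infty & 2.31 \\ \infty & \infty & 0 & \infty \\ \infty & 2.31 & \infty & 0 \end{pmatrix}$$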
Singular Spectrum Transformation [Ide 2005]
Parameterization: representative patterns are calculated from n subsequences on the past side and m subsequences on the future side of a time point t.
[Figure: Hankel matrices H (past) and G (future) built around time t; singular value decomposition of each gives representative patterns, and the angle between them measures the change]
Singular Spectrum Transformation [Ide 2005]
Given time series:
$$T = \{\ldots, x(-2), x(-1), x(0), x(1), x(2), \ldots\}$$

(1) Extraction of past patterns
A consecutive subsequence with length w:
$$s(t-1) = [x(t-w), \ldots, x(t-2), x(t-1)]^{T}$$
Hankel matrix for the past:
$$H(t) = [s(t-n), \ldots, s(t-2), s(t-1)]$$
Calculate a representative pattern u at t:
$$u = c \sum_{i=1}^{n} v_i\, s(t-i) = c\,H(t)\,v, \qquad u^{T}u = 1$$
where
$$v = \arg\max_{\tilde v} \|H(t)\tilde v\|^{2}, \qquad \tilde v^{T}\tilde v = 1$$
Solve:
$$\frac{\partial}{\partial \tilde v}\left[\tilde v^{T} H(t)^{T} H(t)\,\tilde v - \lambda\,\tilde v^{T}\tilde v\right] = 0$$
where $\lambda$ is a Lagrange multiplier.
Solution:
$$H(t)^{T} H(t)\,v = \lambda v, \quad \text{i.e., solve}\quad H(t) H(t)^{T} u = \lambda u$$
Note: the l significant eigenvectors $u_1, u_2, \ldots, u_l$ give a hyperplane of l representative patterns:
$$U_l = [u_1, u_2, \ldots, u_l]$$

(2) Extraction of the current pattern
A column vector of length w:
$$r(t+g) = [x(t+g), \ldots, x(t+g+w-1)]^{T}$$
Hankel matrix for the current:
$$G(t) = [r(t+g), r(t+g+1), \ldots, r(t+g+m-1)]$$
Compute the normalized largest eigenvector $\beta(t)$ for:
$$G(t) G(t)^{T} u = \mu u$$

(3) Compute the change-point score
$$\alpha(t) = \frac{U_l U_l^{T} \beta(t)}{\|U_l U_l^{T} \beta(t)\|}, \qquad z(t) = 1 - \alpha(t)^{T} \beta(t)$$

(4) Singular spectrum transformation
$$T = \{\ldots, x(-2), x(-1), x(0), x(1), x(2), \ldots\} \;\longrightarrow\; T_C = \{\ldots, z(-2), z(-1), z(0), z(1), z(2), \ldots\}$$
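Steps (1) to (3) can be sketched with NumPy. This is a minimal illustration, not the implementation of [Ide 2005]: the function name `sst_score` and its defaults (w=5, n=3, m=3, g=0, l=2) are assumptions, the eigenvectors are obtained as left singular vectors, and the score is taken as 1 minus the inner product between the top future pattern and its normalized projection onto the past hyperplane.

```python
import numpy as np

def sst_score(x, t, w=5, n=3, m=3, g=0, l=2):
    """Change-point score z(t) for the series x at index t."""
    x = np.asarray(x, dtype=float)
    if t - n - w + 1 < 0 or t + g + m + w - 1 > len(x):
        raise ValueError("not enough samples around t")
    # Past Hankel matrix H(t) = [s(t-n), ..., s(t-1)], each column of length w.
    H = np.column_stack([x[t - j - w + 1 : t - j + 1] for j in range(n, 0, -1)])
    # Future Hankel matrix G(t) = [r(t+g), ..., r(t+g+m-1)].
    G = np.column_stack([x[t + g + i : t + g + i + w] for i in range(m)])
    # Left singular vectors are the eigenvectors of H H^T and G G^T.
    U_l = np.linalg.svd(H, full_matrices=False)[0][:, :l]
    beta = np.linalg.svd(G, full_matrices=False)[0][:, 0]
    proj = U_l @ (U_l.T @ beta)          # projection onto the past hyperplane
    norm = np.linalg.norm(proj)
    if norm == 0.0:
        return 1.0                       # future pattern orthogonal to the past
    alpha = proj / norm
    return 1.0 - float(alpha @ beta)

past = [1, 0, 1, 0, 1, 0, 1]
print(sst_score(past + [0, 1, 0, 1, 0, 1, 0], t=7))  # same alternation continues
print(sst_score(past + [0, 2, 0, 4, 0, 8, 0], t=7))  # growing amplitude
```

Under these assumptions the first call gives a score of essentially 0, and the second gives $1 - 3/\sqrt{10} \approx 0.051$. The sign ambiguity of singular vectors does not matter, because the score depends only on the length of the projection.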
Singular Spectrum Transform – Example (1)
For the past:
$$X_P = \{\ldots, x(-7)=1,\ x(-6)=0,\ x(-5)=1,\ x(-4)=0,\ x(-3)=1,\ x(-2)=0,\ x(-1)=1\}$$
Hankel matrix:
$$H(0) = \begin{pmatrix} 1 & 0 & 1 \\ 0 & 1 & 0 \\ 1 & 0 & 1 \\ 0 & 1 & 0 \\ 1 & 0 & 1 \end{pmatrix}$$
$H(0)H(0)^{T}$ has two significant eigenvectors.
Representative patterns:
$$u_1 = \begin{pmatrix} 1/\sqrt{3} \\ 0 \\ 1/\sqrt{3} \\ 0 \\ 1/\sqrt{3} \end{pmatrix}, \qquad u_2 = \begin{pmatrix} 0 \\ 1/\sqrt{2} \\ 0 \\ 1/\sqrt{2} \\ 0 \end{pmatrix}, \qquad U_2 = \begin{pmatrix} 1/\sqrt{3} & 0 \\ 0 & 1/\sqrt{2} \\ 1/\sqrt{3} & 0 \\ 0 & 1/\sqrt{2} \\ 1/\sqrt{3} & 0 \end{pmatrix}$$
Singular Spectrum Transform – Example (1A)
Past: $X_P = \{\ldots, x(-7)=1,\ x(-6)=0,\ x(-5)=1,\ x(-4)=0,\ x(-3)=1,\ x(-2)=0,\ x(-1)=1\}$
For the future A:
$$X_{FA} = \{x(0)=0,\ x(1)=1,\ x(2)=0,\ x(3)=1,\ \ldots\}$$
Hankel matrix:
$$G(0) = \begin{pmatrix} 0 & 1 & 0 \\ 1 & 0 & 1 \\ 0 & 1 & 0 \\ 1 & 0 & 1 \\ 0 & 1 & 0 \end{pmatrix}$$
$G(0)G(0)^{T}$ has the most significant eigenvector:
$$\beta(0) = [0,\ 1/\sqrt{2},\ 0,\ 1/\sqrt{2},\ 0]^{T}$$
Computing change-point score:
$$U_2 U_2^{T} \beta(0) = [0,\ 1/\sqrt{2},\ 0,\ 1/\sqrt{2},\ 0]^{T}, \qquad \alpha(0) = [0,\ 1/\sqrt{2},\ 0,\ 1/\sqrt{2},\ 0]^{T}$$
$$z(0) = 1 - \alpha(0)^{T} \beta(0) = 0$$
Singular Spectrum Transform – Example (1B)
Past: $X_P = \{\ldots, x(-7)=1,\ x(-6)=0,\ x(-5)=1,\ x(-4)=0,\ x(-3)=1,\ x(-2)=0,\ x(-1)=1\}$
For the future B:
$$X_{FB} = \{x(0)=1,\ x(1)=2,\ x(2)=3,\ x(3)=5,\ \ldots\}$$
Hankel matrix:
$$G(0) = \begin{pmatrix} 1 & 2 & 3 \\ 2 & 3 & 5 \\ 3 & 5 & 8 \\ 5 & 8 & 13 \\ 8 & 13 & 21 \end{pmatrix}$$
$G(0)G(0)^{T}$ has the most significant eigenvector:
$$\beta(0) = [0.11,\ 0.19,\ 0.30,\ 0.49,\ 0.79]^{T}$$
Computing change-point score:
$$U_2 U_2^{T} \beta(0) = [0.20,\ 0.15,\ 0.20,\ 0.15,\ 0.20]^{T}, \qquad \alpha(0) = [0.49,\ 0.37,\ 0.49,\ 0.37,\ 0.49]^{T}$$
$$z(0) = 1 - \alpha(0)^{T} \beta(0) = 0.59$$
Singular Spectrum Transform – Example (1C)
Past: $X_P = \{\ldots, x(-7)=1,\ x(-6)=0,\ x(-5)=1,\ x(-4)=0,\ x(-3)=1,\ x(-2)=0,\ x(-1)=1\}$
For the future C:
$$X_{FC} = \{x(0)=1/2,\ x(1)=1/4,\ x(2)=3/8,\ x(3)=5/16,\ \ldots\}$$
Hankel matrix:
$$G(0) = \begin{pmatrix} 1/2 & 1/4 & 3/8 \\ 1/4 & 3/8 & 5/16 \\ 3/8 & 5/16 & 11/32 \\ 5/16 & 11/32 & 21/64 \\ 11/32 & 21/64 & 43/128 \end{pmatrix}$$
Most significant eigenvector of $G(0)G(0)^{T}$:
$$\beta(0) = [0.50,\ 0.41,\ 0.45,\ 0.43,\ 0.44]^{T}$$
Computing change-point score:
$$U_2 U_2^{T} \beta(0) = [0.16,\ 0.012,\ 0.16,\ 0.012,\ 0.16]^{T}, \qquad \alpha(0) = [0.58,\ 0.041,\ 0.58,\ 0.041,\ 0.58]^{T}$$
$$z(0) = 1 - \alpha(0)^{T} \beta(0) = 0.72$$
Singular Spectrum Transform – Example (1D)
Past: $X_P = \{\ldots, x(-7)=1,\ x(-6)=0,\ x(-5)=1,\ x(-4)=0,\ x(-3)=1,\ x(-2)=0,\ x(-1)=1\}$
For the future D:
$$X_{FD} = \{x(0)=1/2,\ x(1)=0,\ x(2)=1/2,\ x(3)=1,\ \ldots\}$$
Hankel matrix:
$$G(0) = \begin{pmatrix} 1/2 & 0 & 1/2 \\ 0 & 1/2 & 1 \\ 1/2 & 1 & 1/2 \\ 1 & 1/2 & 0 \\ 1/2 & 0 & 1/2 \end{pmatrix}$$
Most significant eigenvector of $G(0)G(0)^{T}$:
$$\beta(0) = [0,\ 0.5,\ 0.71,\ 0.5,\ 0]^{T}$$
Computing change-point score:
$$U_2 U_2^{T} \beta(0) = [0.24,\ 0,\ 0.24,\ 0,\ 0.24]^{T}, \qquad \alpha(0) = [0.58,\ 0,\ 0.58,\ 0,\ 0.58]^{T}$$
$$z(0) = 1 - \alpha(0)^{T} \beta(0) = 0.59$$
Singular Spectrum Transform – Example (1E)
Past: $X_P = \{\ldots, x(-7)=1,\ x(-6)=0,\ x(-5)=1,\ x(-4)=0,\ x(-3)=1,\ x(-2)=0,\ x(-1)=1\}$
For the future E:
$$X_{FE} = \{x(0)=0,\ x(1)=2,\ x(2)=0,\ x(3)=4,\ \ldots\}$$
Hankel matrix:
$$G(0) = \begin{pmatrix} 0 & 2 & 0 \\ 2 & 0 & 4 \\ 0 & 4 & 0 \\ 4 & 0 & 8 \\ 0 & 8 & 0 \end{pmatrix}$$
Most significant eigenvector of $G(0)G(0)^{T}$:
$$\beta(0) = [0,\ 0.45,\ 0,\ 0.89,\ 0]^{T}$$
Computing change-point score:
$$U_2 U_2^{T} \beta(0) = [0,\ 0.67,\ 0,\ 0.67,\ 0]^{T}, \qquad \alpha(0) = [0,\ 1/\sqrt{2},\ 0,\ 1/\sqrt{2},\ 0]^{T}$$
$$z(0) = 1 - \alpha(0)^{T} \beta(0) = 0.051$$
(2) Association
Overview [Mohammad 2009]
After discovering basic motifs in both actions and commands and detecting their occurrences in all time series:
[Figure: occurrences of Command 1 and Action 1 over time]
- Use the natural delay between commands and actions calculated during the discovery phase.
- For every command-action pair, calculate their joint activation as the number of occurrences of the action within the natural delay interval of the command.
- Use the joint-activation values to induce a Bayesian network describing the relation between actions and commands.
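The joint-activation count can be sketched as follows. This is a minimal illustration: occurrences are assumed to be given as lists of time stamps, the natural delay interval is assumed to be a closed range `[delay_min, delay_max]` after a command, and the function name is hypothetical.

```python
def joint_activation(command_times, action_times, delay_min, delay_max):
    """Count action occurrences that fall within the natural delay
    interval [delay_min, delay_max] after some occurrence of the command."""
    return sum(
        any(delay_min <= a - c <= delay_max for c in command_times)
        for a in action_times
    )

commands = [0, 10, 20]
actions = [2, 13, 30]
print(joint_activation(commands, actions, delay_min=1, delay_max=4))  # prints 2
```

A table of such counts over all command-action pairs is the kind of input from which the network structure could then be induced.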
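Finding the delay with Granger causality [Mohammad 2009]
Use Granger causality to find the delay between $\hat{C}^{G}$ (gestures) and $\hat{C}^{A}$ (actions):
- Regress actions using actions and gestures:
$$\hat{C}^{A}(t) = \sum_{i=1}^{\tau} a_i\, \hat{C}^{A}(t-i) + \sum_{i=1}^{\tau} b_i\, \hat{C}^{G}(t-i) + u(t)$$
- Regress actions using actions only:
$$\hat{C}^{A}(t) = \sum_{i=1}^{\tau} c_i\, \hat{C}^{A}(t-i) + e(t)$$
- Compute residues:
$$SSR_1 = \sum_{t=1}^{T} u(t)^{2}, \qquad SSR_0 = \sum_{t=1}^{T} e(t)^{2}$$
- Calculate the g-causality statistic:
$$S_\tau = \frac{(SSR_0 - SSR_1)/\tau}{SSR_1/(T - 2\tau - 1)}$$
- Find the delay that maximizes g-causality:
$$\tau_{op} = \arg\max_{\tau} S_\tau$$
[Figure: example showing the estimated delay]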
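The Granger-causality steps can be sketched with ordinary least squares. This is a minimal illustration under stated assumptions: both regressions include a constant term, the candidate delay is taken to be the number of lags in the regression, and the function names are hypothetical.

```python
import numpy as np

def lagged(x, p):
    """Matrix whose row for time t holds x[t-1], ..., x[t-p] (t = p, ..., len(x)-1)."""
    return np.column_stack([x[p - i : len(x) - i] for i in range(1, p + 1)])

def ssr(X, y):
    """Sum of squared residuals of the least-squares fit y ~ X."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    r = y - X @ coef
    return float(r @ r)

def granger_statistic(action, gesture, p):
    """g-causality statistic for 'gesture causes action' with p lags."""
    action = np.asarray(action, dtype=float)
    gesture = np.asarray(gesture, dtype=float)
    y = action[p:]
    const = np.ones((len(y), 1))                 # assumption: constant term included
    X0 = np.hstack([const, lagged(action, p)])   # actions only
    X1 = np.hstack([X0, lagged(gesture, p)])     # actions and gestures
    ssr0, ssr1 = ssr(X0, y), ssr(X1, y)
    T = len(y)
    return ((ssr0 - ssr1) / p) / (ssr1 / (T - 2 * p - 1))

def best_delay(action, gesture, max_lag):
    """Delay (number of lags) that maximizes the statistic."""
    return max(range(1, max_lag + 1),
               key=lambda p: granger_statistic(action, gesture, p))

rng = np.random.default_rng(0)
gesture = rng.normal(size=500)
action = 0.1 * rng.normal(size=500)
action[3:] += gesture[:-3]          # synthetic data: action follows gesture after 3 steps
print(best_delay(action, gesture, max_lag=6))
```

On this synthetic data the statistic stays small for one and two lags and jumps once the third lag is included, so the estimated delay is 3.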
Result of the association phase: [Mohammad 2009]
Augmented Bayesian Network (ABN) representing the relation between gestures and actions (interaction protocol)
Bayesian network with two values added to each link:
- Mean of the delay between the activation of the cause and result nodes: $\mu_{n_1 n_2}$
- Variance of the delay: $\sigma^{2}_{n_1 n_2}$
(3) Controller generation Overview
(a) Motor bubbling:
Takes an action stream state as input and produces increment and decrement functions, $F_i^{+}$ and $F_i^{-}$, that produce a motor command from a given action stream.
(b) Piecewise linear controller generation:
Generates controllers from the piecewise mean approximation of the action nodes of the given ABN, $F_i^{+}$, and $F_i^{-}$.
(c) Producing the closed loop controller:
The difference between actual action stream as perceived and the piecewise mean approximation is calculated to correct for the error.
[Mohammad 2010]
(4) Accumulation Overview
Combine two already learned ABNs: $AN_1$ and $AN_2$.
- Generalizing user modeling into task modeling.
- The main assumption: the action nodes will be more similar in the two ABNs than gesture nodes.
[Mohammad 2010]
The main steps:
(1) Create links between action nodes in $AN_1$ and $AN_2$ if the nodes are similar to each other. Dynamic Time Warping (DTW) is used to measure the distance between the means of the two nodes.
(2) Repeat the same processing for the gesture nodes in $AN_1$ and $AN_2$.
(3) Remove all the conflicts in the two link lists so that at most one node in the second ABN is connected to any node in the first ABN.
(4) After resolving all conflicts, the nodes in the two ABNs that are still linked are combined, and their HMM and motif mean are re-generated from the full set of motif occurrences used when creating the two ABNs.
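The DTW distance used in step (1) can be sketched with the standard dynamic program. This is a minimal illustration for one-dimensional motif means, assuming an absolute-difference local cost and no warping-window constraint; the function name is hypothetical.

```python
import numpy as np

def dtw_distance(a, b):
    """Dynamic Time Warping distance between two 1-D sequences."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    D = np.full((len(a) + 1, len(b) + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            cost = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three admissible warping steps.
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return float(D[-1, -1])

print(dtw_distance([1, 2, 3], [1, 2, 2, 3]))  # prints 0.0
print(dtw_distance([0, 0], [1, 1]))           # prints 2.0
```

Because the alignment may repeat samples, two motif means that differ only in local timing, such as the first pair, are treated as identical.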
[Figure: nodes a, b, c, d, e, f of one ABN and nodes 1, 2, 3, 4, 5, 6 of the other are combined into the nodes a1, b2, c3, d4, f5, e6]
[Mohammad 2010]
ABN Combination [Mohammad 2010]
Main assumption
Action nodes are more compatible than gesture nodes
Algorithm
1. Associate action nodes with similar stored pattern
Set of action node association links: $la^{12}_{ij}$, $v(la^{12}_{ij})$
2. Associate gesture nodes with similar stored pattern
Set of gesture node association links: $lg^{12}_{ij}$, $v(lg^{12}_{ij})$
3. Calculate Link Competence Index for association links
Set of LCIs for gestures and actions: $LCI(la^{12}_{ij})$, $LCI(lg^{12}_{ij})$
4. Resolve association link conflicts using LCIs
Final ABN
Associating action/gesture nodes [Mohammad 2010]
- Compile $AN_1$ and $AN_2$ lists {every action node}
- Calculate $d_{DTW}(m^{1}_{i}, m^{2}_{k})$ for all nodes and order them
- Calculate $d^{1}_{i} = \min_{k} d_{DTW}(m^{1}_{i}, m^{2}_{k})$
- Create a link $la^{12}_{ij}$ iff the node-specific conditions on $d_{DTW}(m^{1}_{i}, m^{2}_{j})$ and on $d_{DTW}(m^{1}_{i}, m^{2}_{j}) - d_{DTW}(m^{1}_{i}, m^{2}_{k})$ hold for any $m^{2}_{k}$ for which a link exists
- Set $v(la^{12}_{ij}) = d_{DTW}(m^{1}_{i}, m^{2}_{j})$
Gesture association links are calculated the same way
(4) Accumulation
[Mohammad 2010]
LCI Calculation
(4) Accumulation
[Mohammad 2010]
Effect of accumulation -- example
Imitation, Simulation and Conversation
[Nishida et al 2014]
Meltzoff’s Active Intermodal Mapping (AIM) model
[Figure: agents, each with Intention, Behavior, Reasoning and Simulation, coupled through Interaction]
[Nishida et al 2014]
Fluid imitation: Imitation in Social Context
The simplest fluid imitation system
[Nishida et al 2014]
Fluid imitation: Imitation in Social Context
An interaction oriented fluid imitation
[Nishida et al 2014]
Fluid Imitation Engine (FIE) [Mohammad 2016]
[Figure: FIE components (Perspective taking, Significance Estimator, Self-Imitation Engine, Imitation Engine) and signals (perceived behavior and environment state, salience and relevance, segmented demonstrations, learned behaviors)]
- Perspective taking: Converts the perceived environment state and model behavior into the model's frame of reference, then maps the result into the learner's frame of reference.
- Significance Estimator: Calculates the significance of perceived motions/behaviors based on their top-down relevance to the learner's current set of goals and the bottom-up saliency of the behavior or of the environment context and objects.
- Self-Imitation Engine: Receives the transformed environmental and behavior signals, uses them to segment the action stream or perception stream, and combines the segmented motions into sets of segmented demonstrations that are fed to the imitation engine.
- Imitation Engine: Receives segmented action demonstrations and learns the corresponding action models (SAXImitate).
Why should we imitate robots?
[Mohammad 2015]
(H1) Back and mutual imitation both increase human’s perception of robot’s imitative skill over the control situation in which no back or mutual imitation is performed.
(H2) The increase in perceived imitative skill and overall imitative skill will only appear in first person subjective evaluations and will disappear in third person evaluations.
(H3) Back and mutual imitation conditions will increase the participant’s subjective evaluation of the robot’s imitative skill as well as her/his intention of future interaction with the robot.
[Figure: fraction of subjects who preferred each robot according to each of the measured variables in the follow-up study]
Summary
1. Autonomous learning is necessary to realize social artifacts that can exhibit natural and sophisticated interaction behaviors.
2. I have shown a framework of autonomous learning consisting of learning by watching and learning by doing.
3. The key algorithm for learning by watching is motif discovery.
4. Change point discovery is effective in improving both the efficiency and the robustness of the motif discovery algorithm.
5. Causality analysis is used to find plausible latency of reactions.
References
[Buhler 2002] J. Buhler and M. Tompa. Finding motifs using random projections. Journal of Computational Biology 9(2): 225–242, 2002.
[Catalano 2006] Joe Catalano, Tom Armstrong, and Tim Oates. Discovering patterns in real-valued time series. In Knowledge Discovery in Databases: PKDD 2006, pages 462–469, 2006.
[Chiu 2003] B. Chiu, E. Keogh, and S. Lonardi, “Probabilistic discovery of time series motifs,” in KDD ’03: Proceedings of the ninth ACM SIGKDD international conference on Knowledge discovery and data mining. New York, NY, USA: ACM, 2003, pp. 493–498.
[Furaoa 2006] Shen Furaoa, Osamu Hasegawa. An incremental network for on-line unsupervised classification and topology learning, Neural Networks 19 (2006) 90–106
[Ide 2005] T. Ide and K. Inoue, “Knowledge discovery from heterogeneous dynamic systems using change-point correlations,” in Proc. SIAM Intl. Conf. Data Mining, 2005.
[Mohammad 2009] Yasser Mohammad, Toyoaki Nishida, Shogo Okada, "Unsupervised Simultaneous Learning of Gestures, Actions and their Associations for Human-Robot Interaction," Intelligent Robots and Systems, 2009. IROS 2009. IEEE/RSJ International Conference on, pp. 2537-2544, 11-15 Oct. 2009
[Mohammad 2015] Yasser Mohammad, Toyoaki Nishida. Why should we imitate robots? -- Effect of back imitation on judgment of imitative skill. International Journal of Social Robotics, (forthcoming)
[Mohammad 2009 PhDThesis] Yasser Mohammad, Autonomous Development of Natural Interactive Behavior for Robots and Embodied Agents, PhD Thesis, Kyoto University, September 2009
[Mohammad 2010] Yasser Mohammad, Toyoaki Nishida, Learning Interaction Protocols using Augmented Bayesian Networks Applied to Guided Navigation, Taipei, Taiwan, IROS 2010.
[Mohammad 2016] Yasser Mohammad and Toyoaki Nishida. Data Mining for Social Robotics: Toward Autonomously Social Robots, Springer, 2016 http://www.springer.com/us/book/9783319252308
[Nishida et al 2014] Toyoaki Nishida, Atsushi Nakazawa, Yoshimasa Ohmoto and Yasser Mohammad, Conversational Informatics, Springer 2014.
[Okada 2009] Shogo Okada and Toyoaki Nishida. Incremental clustering of gesture patterns based on a self organizing incremental neural network, in Proceedings of International Joint Conference on Neural Networks, Atlanta, Georgia, USA, June 14-19, pp. 2316-2322, 2009.
[Pevzner 2000] Pevzner, P. A. & Sze, S. H. (2000). Combinatorial approaches to finding subtle signals in DNA sequences. In proceedings of the 8th International Conference on Intelligent Systems for Molecular Biology. La Jolla, CA, Aug 19-23. pp 269-278.
[Vahdatpour 2009] Alireza Vahdatpour, Navid Amini and Majid Sarrafzadeh. Toward unsupervised activity discovery using multi-dimensional motif detection in time series, in: IJCAI'09 Proceedings of the 21st international joint conference on Artificial intelligence, Pages 1261-1266, 2009.
[Xu 2009] Yong Xu, Kazuhiro Ueda, Takanori Komatsu, Takeshi Okadome, Takashi Hattori, Yasuyuki Sumi and Toyoaki Nishida, WOZ Experiments for Understanding Mutual Adaptation, AI&Society, Vol. 23, No. 2, Page 201-212, 2009.