Coupled Behavior Analysis with Applications
Professor Longbing Cao (操龙兵)
Director, Advanced Analytics Institute
University of Technology Sydney, Australia
Agenda
• Why coupled behaviors?
• What is behavior?
• What are coupled behaviors?
• What is coupled behavior analysis (CBA)?
• Combined mining for high-impact behavior analysis
• Coupled Hidden Markov Model-based abnormal behavior analysis
Why Coupled Behaviors?
An example
• Why does this stock's price swing so wildly?
• Short-term manipulation behaviors as the cause
[Figure: a behavior's exterior presentation and its possible interior driver]
Behaviors of associated accounts as the driver of the price movement
Group behavior interior-driven price movement
[Figure: group behaviors and the resulting price movement]
• What makes multiple behaviors different?
Key factors:
• Multiple actors
• Multiple behaviors
• Multiple properties
• Coupling relationships
• Organizational factors
How are coupled behaviors (CB) handled by existing techniques?
Behavior exterior analysis:
• Time series analysis
• Multiple time series analysis
Behavior interior analysis:
• Frequent pattern mining
• Sequence analysis
• Coupled sequence analysis
Coupled behaviors are ubiquitous
• Relevant projects in the UTS Advanced Analytics Institute:
– Public service business analytics
– Insurance business analytics
– Financial business analytics
– Banking business analytics
– Education student analytics
– Investment business analytics
What is Behavior?
Longbing Cao. In-depth Behavior Understanding and Use: the Behavior Informatics Approach. Information Sciences, 180(17): 3067-3085, 2010. www.behaviorinformatics.org
An abstract behavior model
• Demographics and circumstances of behavioral subjects and objects
• Associates of a behavior may form into certain behavior sequences or networks
• A social behavioral network consists of sequences of behaviors that are organized in terms of certain social relationships or norms
• Impact, costs, risk and trust of a behavior or behavior network
Behavior Visual Descriptor
• Vector-oriented behavior pattern analysis
– Behavior performer: subject (s), action (a), time (t), place (w)
– Social information: object (o), context (e), constraints (c), associations (m)
– Intentional information: subject's goal (g), belief (b), plan (l)
– Behavior performance: impact (f), status (u)
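The behavior vector above can be written down as a plain record type. The following is a minimal sketch only: the class name `BehaviorVector`, the field names, the field types and the example values are assumptions made for illustration; only the single-letter symbols come from the slide.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional


@dataclass
class BehaviorVector:
    """One behavior instance; field names are illustrative, symbols follow the slide."""
    # Behavior performer
    subject: str                                            # s
    action: str                                             # a
    time: float                                             # t
    place: Optional[str] = None                             # w
    # Social information
    obj: Optional[str] = None                               # o
    context: Dict[str, Any] = field(default_factory=dict)   # e
    constraints: List[str] = field(default_factory=list)    # c
    associations: List[str] = field(default_factory=list)   # m
    # Intentional information
    goal: Optional[str] = None                              # g
    belief: Optional[str] = None                            # b
    plan: Optional[str] = None                              # l
    # Behavior performance
    impact: Optional[float] = None                          # f
    status: Optional[str] = None                            # u

    def as_symbols(self) -> Dict[str, Any]:
        """Return the vector keyed by the single-letter symbols."""
        return {
            "s": self.subject, "a": self.action, "t": self.time, "w": self.place,
            "o": self.obj, "e": self.context, "c": self.constraints,
            "m": self.associations, "g": self.goal, "b": self.belief,
            "l": self.plan, "f": self.impact, "u": self.status,
        }


# Hypothetical example: one buy order placed by one account.
order = BehaviorVector(subject="account_17", action="buy", time=36000.0,
                       place="exchange_X", obj="stock_ABC", status="placed")
```

Fields that a data source does not record (for example goal or belief) simply stay at their defaults, so the same type can hold sparse transactional data.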
New methods for vector-based behavior
Behavioral data
• Behavioral elements are hidden or dispersed in transactional data
• Behavioral feature space
• Behavioral data modeling
• Mapping from transactional to behavioral data
• Behavioral data processing
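A mapping from transactional records to a behavioral feature space might look like the sketch below. It is an illustration under stated assumptions, not the mapping used in the talk: the record keys (`account`, `time`, `action`, `volume`), the two thresholds and the function name `to_behavior_items` are all hypothetical. The category labels imitate the style of the pattern items that appear elsewhere in this deck (for example `volume_small`, `long_interval`).

```python
from typing import Dict, List, Tuple

SMALL_VOLUME = 1000     # assumed threshold (shares); not from the source
LONG_INTERVAL = 60.0    # assumed threshold (seconds); not from the source

BehaviorItem = Tuple[str, str, str]  # (action, volume label, interval label)


def to_behavior_items(transactions: List[Dict]) -> Dict[str, List[BehaviorItem]]:
    """Group raw order records by account and discretize each into a behavior item."""
    sequences: Dict[str, List[BehaviorItem]] = {}
    last_time: Dict[str, float] = {}
    for tx in sorted(transactions, key=lambda r: r["time"]):
        acct = tx["account"]
        volume = "volume_small" if tx["volume"] < SMALL_VOLUME else "volume_other"
        # Interval since the same account's previous record, if there is one.
        gap = tx["time"] - last_time[acct] if acct in last_time else None
        long_gap = gap is not None and gap >= LONG_INTERVAL
        interval = "long_interval" if long_gap else "interval_other"
        sequences.setdefault(acct, []).append((tx["action"], volume, interval))
        last_time[acct] = tx["time"]
    return sequences


# Hypothetical records.
records = [
    {"account": "A1", "time": 0.0, "action": "buy", "volume": 500},
    {"account": "A2", "time": 10.0, "action": "sell", "volume": 5000},
    {"account": "A1", "time": 120.0, "action": "sell", "volume": 200},
]
behavior_sequences = to_behavior_items(records)
```

Each account ends up with its own ordered sequence of discrete behavior items, which is the form that sequence and pattern mining methods usually expect.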
Behavior informatics – Concept Map
[Figure: concept map of behavior informatics. Legible panel titles: Behavior Representation & Reasoning; Behavior Learning & Mining]
What is Coupled Behavior?
Longbing Cao. In-depth Behavior Understanding and Use: the Behavior Informatics Approach. Information Sciences, 180(17): 3067-3085, 2010. www.behaviorinformatics.org
Coupling relationships
• From temporal aspect
• From inferential aspect
• From combinational aspect
Basic Behavior Patterns
Tracing: different actions occur in sequential order: $\{a_1, a_2, \dots, a_n\}$
Consequence: different actions are causally related in their occurrence: $\{a_i \rightarrow a_j\}$
Synchronization: different actions occur at the same time: $\{a_1 \leftrightarrow a_2 \leftrightarrow \dots \leftrightarrow a_n\}$
Exclusion: different actions occur mutually exclusively: $\{a_1 \oplus a_2 \oplus \dots \oplus a_n\}$
Precedence: different actions have a required precedence: $\{a_i \Rightarrow a_j\}$
And more to be explored…
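Several of these basic patterns can be checked directly on a timestamped action trace. The sketch below is one possible reading of the informal definitions, not the author's formalization: the `(action, timestamp)` event encoding, the function names, the tolerance parameter and the action names are assumptions. Consequence (causality) is left out because it cannot be decided from a single trace.

```python
from typing import List, Sequence, Tuple

Event = Tuple[str, float]  # (action, timestamp); an assumed encoding


def traces(events: List[Event], pattern: Sequence[str]) -> bool:
    """Tracing: the actions in `pattern` appear in this order (gaps allowed)."""
    ordered = iter(a for a, _ in sorted(events, key=lambda e: e[1]))
    # `p in ordered` consumes the iterator, so matches must occur in order.
    return all(p in ordered for p in pattern)


def synchronized(events: List[Event], actions: Sequence[str], tol: float = 0.0) -> bool:
    """Synchronization: some occurrence of the first action has an occurrence
    of every other action within `tol` time units of it."""
    times = {a: [t for act, t in events if act == a] for a in actions}
    first, *rest = actions
    return any(
        all(any(abs(t - u) <= tol for u in times[a]) for a in rest)
        for t in times[first]
    )


def exclusive(events: List[Event], actions: Sequence[str]) -> bool:
    """Exclusion: at most one of `actions` occurs anywhere in the trace."""
    return len({a for a, _ in events if a in actions}) <= 1


def precedes(events: List[Event], first: str, second: str) -> bool:
    """Precedence: every `second` is preceded by at least one earlier `first`."""
    seen_first = False
    for action, _ in sorted(events, key=lambda e: e[1]):
        if action == first:
            seen_first = True
        elif action == second and not seen_first:
            return False
    return True


# Hypothetical trace for one account.
trace = [("place_buy", 1.0), ("place_sell", 1.0), ("withdraw_buy", 2.5), ("trade", 3.0)]
```

For this trace, `traces(trace, ["place_buy", "withdraw_buy", "trade"])`, `synchronized(trace, ["place_buy", "place_sell"])` and `precedes(trace, "place_buy", "withdraw_buy")` all hold, while `precedes(trace, "trade", "place_buy")` does not.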
Sequential Combination
Parallel Combination
Nested Combination
Fuzzy or probabilistic Combination
A × B × C × ⋯
What is the Coupled Behavior Analysis (CBA) problem?
Longbing Cao, Yuming Ou, Philip S. Yu. Coupled Behavior Analysis with Applications. IEEE Trans. on Knowledge and Data Engineering.
Longbing Cao. In-depth Behavior Understanding and Use: the Behavior Informatics Approach. Information Sciences, 180(17): 3067-3085, 2010. www.behaviorinformatics.org
Customer behaviors
• Customer $a_i$'s $N$ behaviors $B_i$: $\{b_{i1}, b_{i2}, \dots, b_{in}\}$
• $M$ customers' behaviors:
$B_1$: $\{b_{11}, b_{12}, \dots, b_{1n}\}$
$B_2$: $\{b_{21}, b_{22}, \dots, b_{2n}\}$
……
$B_m$: $\{b_{m1}, b_{m2}, \dots, b_{mn}\}$
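Stacking the per-customer sequences row by row gives a behavior matrix with one row per customer. A minimal sketch follows, assuming that sequences may have different lengths and that shorter rows are padded with `None`; the padding convention and the function name `behavior_matrix` are illustrative choices, not part of the original formulation.

```python
from typing import Dict, List, Optional, Tuple


def behavior_matrix(
    sequences: Dict[str, List[str]], pad: Optional[str] = None
) -> Tuple[List[str], List[List[Optional[str]]]]:
    """Return (row labels, rows) with every row padded to the longest sequence."""
    actors = sorted(sequences)
    width = max((len(sequences[a]) for a in actors), default=0)
    rows = [sequences[a] + [pad] * (width - len(sequences[a])) for a in actors]
    return actors, rows


# Four short customer sequences of buy/sell actions.
actors, rows = behavior_matrix({
    "B1": ["sell", "buy"],
    "B2": ["buy", "sell", "sell"],
    "B3": ["buy"],
    "B4": ["buy"],
})

# Reading along a row follows one customer over time; reading down a
# column compares different customers at the same position.
first_column = [row[0] for row in rows]
```

Here `rows[1]` is the full sequence of customer B2, and `first_column` collects the first behavior of every customer.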
An Example of Stock Market Transactional Data
[Figure: behavior feature matrix. Legible labels: B1, B2, B3, B4, B5, B6, B7, B8]
Existing approaches
• $M$ customers' behaviors:
$B_1$: {sell, buy}
$B_2$: {buy, sell, sell}
$B_3$: {buy}
$B_4$: {buy}
• Complex behavior pattern analysis (pattern: count, support)
– (sell,sell_price,volume_small,long_interval,Non-frequent): 333, 40.76%
– (sell,sell_price,volume_small,long_interval,Non-frequent) (action_other,price_other,volume_other,interval_other,Non-frequent): 99, 12.12%
– (buy_withdraw,price_other,withdraw_part,short_withdraw interval,Non-frequent): 24, 2.94%
– (action_other,price_other,volume_other,interval_other,Non-frequent): 322, 39.41%
– (action_other,price_other,volume_other,interval_other,Non-frequent) (action_other,price_other,volume_other,interval_other,Non-frequent): 122, 14.93%
– (buy_withdraw,price_other,withdraw_part,long_withdraw interval,Non-frequent): 164, 20.07%
– (buy_withdraw,price_other,withdraw_part,long_withdraw interval,Non-frequent) (buy_withdraw,price_other,withdraw_part,long_withdraw interval,Non-frequent): 20, 2.45%
– (buy_withdraw,price_other,withdraw_part,long_withdraw interval,Non-frequent)
– (buy_withdraw,price_other,withdraw_part,long_withdraw interval,Non-frequent) (action_other,price_other,volume_other,interval_other,Non-frequent): 45, 5.51%
– (buy_withdraw,price_other,withdraw_part,long_withdraw interval,Non-frequent) (action_other,price_other,volume_other,interval_other,Non-frequent) (action_other,price_other,volume_other,interval_other,Non-frequent): 23, 2.82%
– (sell_withdraw,price_other,withdraw_part,short_withdraw interval,Non-frequent): 21, 2.57%
– (buy,buy_price_last or buy_price_limit or buy_price_sell,volume_small,long_interval,Non-frequent): 130, 15.91%
– (buy,buy_price_last or buy_price_limit or buy_price_sell,volume_small,long_interval,Non-frequent) (action_other,price_other,volume_other,interval_other,Non-frequent): 85, 10.40%
– (sell_withdraw,price_other,withdraw_part,long_withdraw interval,Non-frequent): 116, 14.20%
– (sell_withdraw,price_other,withdraw_part,long_withdraw interval,Non-frequent) (sell,sell_price,volume_small,long_interval,Non-frequent): 23, 2.82%
– (sell_withdraw,price_other,withdraw_part,long_withdraw interval,Non-frequent) (sell_withdraw,price_other,withdraw_part,long_withdraw interval,Non-frequent): 21, 2.57%
– (sell_withdraw,price_other,withdraw_part,long_withdraw interval,Non-frequent) (action_other,price_other,volume_other,interval_other,Non-frequent): 43, 5.26%
– (sell_withdraw,price_other,withdraw_part,long_withdraw interval,Non-frequent) (action_other,price_other,volume_other,interval_other,Non-frequent) (action_other,price_other,volume_other,interval_other,Non-frequent): 21, 2.57%
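The counts and percentages in this pattern list are consistent with support being the count divided by a total of 817 sequences (for example, 333 / 817 ≈ 40.76%). The sketch below shows how such a support figure can be computed. It assumes that a pattern matches when its items occur as a contiguous run inside a sequence; the source does not say whether gaps are allowed, so this matching rule, like the function names, is an assumption.

```python
from typing import List, Sequence, Tuple


def contains(sequence: Sequence[str], pattern: Sequence[str]) -> bool:
    """True if `pattern` occurs as a contiguous run inside `sequence`."""
    n, k = len(sequence), len(pattern)
    return any(tuple(sequence[i:i + k]) == tuple(pattern) for i in range(n - k + 1))


def support(sequences: List[Sequence[str]], pattern: Sequence[str]) -> Tuple[int, float]:
    """Return (number of sequences containing the pattern, support in percent)."""
    count = sum(contains(s, pattern) for s in sequences)
    return count, 100.0 * count / len(sequences)


# Toy data: four short buy/sell sequences.
toy = [["sell", "buy"], ["buy", "sell", "sell"], ["buy"], ["buy"]]
single = support(toy, ["buy"])
pair = support(toy, ["sell", "sell"])
```

Counting each sequence at most once, rather than counting every occurrence, keeps the support between 0% and 100%.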
Combined Pattern Mining for High Impact Behavior Analysis
Longbing Cao, Huaifeng Zhang, Yanchang Zhao, Dan Luo, Chengqi Zhang. Combined Mining: Discovering Informative Knowledge in Complex Data, accepted by IEEE Trans. SMC Part B.
Longbing Cao, Yanchang Zhao, Chengqi Zhang. Mining Impact-Targeted Activity Patterns in Imbalanced Data. IEEE Trans. on Knowledge and Data Engineering, 20(8): 1053-1066, 2008.
Combined Pattern Pairs
• A combined rule pair is composed of two contrasting rules.
• For customers with the same characteristics $U$, different policies/campaigns, $V_1$ and $V_2$, can result in different outcomes, $T_1$ and $T_2$.
Combined Pattern Clusters
• Based on a combined rule pair, related combined rules can be organized into a cluster that supplements the rule pair with more information.
• The rules in a cluster $C$ have the same $U$ but different $V$, which associates them with various results $T$.
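The pair and cluster structure can be sketched as a grouping step over rules of the form U + V → T. This is an illustration only: the `Rule` type, the function names and the example rules are hypothetical, and the sketch skips how the rules are mined and how their interestingness is scored.

```python
from collections import defaultdict
from itertools import combinations
from typing import Dict, FrozenSet, List, NamedTuple, Tuple


class Rule(NamedTuple):
    u: FrozenSet[str]  # shared customer characteristics U
    v: FrozenSet[str]  # policy or campaign V
    t: str             # outcome T


def rule_clusters(rules: List[Rule]) -> Dict[FrozenSet[str], List[Rule]]:
    """Group rules by U and keep groups that contain at least two different V."""
    by_u: Dict[FrozenSet[str], List[Rule]] = defaultdict(list)
    for rule in rules:
        by_u[rule.u].append(rule)
    return {u: rs for u, rs in by_u.items() if len({r.v for r in rs}) > 1}


def contrasting_pairs(cluster: List[Rule]) -> List[Tuple[Rule, Rule]]:
    """Rule pairs with the same U but different V and different T."""
    return [(r1, r2) for r1, r2 in combinations(cluster, 2)
            if r1.v != r2.v and r1.t != r2.t]


# Hypothetical rules: same demographics, different campaigns, different outcomes.
u1 = frozenset({"age:30-40"})
rules = [
    Rule(u1, frozenset({"sms_reminder"}), "repaid"),
    Rule(u1, frozenset({"no_contact"}), "overdue"),
    Rule(u1, frozenset({"phone_call"}), "repaid"),
    Rule(frozenset({"age:50-60"}), frozenset({"sms_reminder"}), "repaid"),
]
clusters = rule_clusters(rules)
pairs = contrasting_pairs(clusters[u1])
```

In this toy data the only cluster is the one for `u1`, and it yields two contrasting pairs, each setting the `no_contact` rule against a rule that ends in `repaid`.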
Interestingness of Rule Pair/Cluster
• dist(): the dissimilarity between the descendants of $R_1$ and $R_2$
• The interestingness of a combined rule pair/cluster is decided by both the interestingness of the rules and the most contrasting rules within the pair/cluster.
• A cluster made of contrasting confident rules is interesting, because it explains why different results occur and what can be done to produce an expected result or avoid an undesirable consequence.
Combined Demographics + Behavior Analysis
• Longbing Cao, Huaifeng Zhang, Yanchang Zhao, Dan Luo, Chengqi Zhang. Combined Mining: Discovering Informative Knowledge in Complex Data. IEEE Trans. SMC Part B.
• Longbing Cao, Yanchang Zhao, Chengqi Zhang. Mining Impact-Targeted Activity Patterns in Imbalanced Data. IEEE Trans. on Knowledge and Data Engineering, 20(8): 1053-1066, 2008.
• Yanchang Zhao, Huaifeng Zhang, Longbing Cao, Chengqi Zhang. Combined Pattern Mining: from Learned Rules to
Combined Pattern Mining
• Type A: Demographics differentiated combined pattern
– Customers with the same actions but different demographics
Combined Pattern Mining
• Type B: Action differentiated combined pattern
– Customers with the same demographics but taking different actions
• There were 7,711 association rules before removing the redundancy of combined rules.
• After removing the redundancy of combined rules, 2,601 rules were left, which formed 734 combined rule clusters.
• After removing the redundancy of combined rule clusters, 98 rule clusters with 235 rules remained, a number small enough for a person to read.
[Figure: legible labels: Behavior 1, Behavior 2, Demographic 1, Low value, High value]
Coupled Hidden Markov Model-based Abnormal Coupled Behavior Analysis
Longbing Cao, Yuming Ou, Philip S. Yu. Coupled Behavior Analysis with Applications. IEEE Trans. on Knowledge and Data Engineering.
Cao, L., Ou, Y., Yu, P. S., Wei, G. Detecting Abnormal Coupled Sequences and Sequence Changes in Group-based
Construct behavior sequences
CHMM-Based Coupled Sequence Modeling
• Coupled behavior sequences
– Multiple sequences
– Coupling relationship
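To make the coupling idea concrete, the sketch below evaluates the likelihood of two observation sequences under a two-chain coupled HMM, in which each chain's next hidden state depends on the previous hidden states of both chains. It is a textbook-style forward recursion over the joint state space, not the model from the talk: the talk's CHMM couples buy, sell and trade sequences, while this example uses two chains for brevity, and all parameter values are made up.

```python
from itertools import product


def hmm_likelihood(obs, init, trans, emit) -> float:
    """Forward algorithm for a single HMM: P(obs)."""
    n = len(init)
    alpha = [init[i] * emit[i][obs[0]] for i in range(n)]
    for o in obs[1:]:
        alpha = [emit[k][o] * sum(alpha[i] * trans[i][k] for i in range(n))
                 for k in range(n)]
    return sum(alpha)


def chmm_likelihood(obs_a, obs_b, init_a, init_b,
                    trans_a, trans_b, emit_a, emit_b) -> float:
    """Forward algorithm over joint states (i, j) of two coupled chains.

    trans_a[i][j][k] = P(a_t = k | a_{t-1} = i, b_{t-1} = j)
    trans_b[i][j][l] = P(b_t = l | a_{t-1} = i, b_{t-1} = j)
    """
    states = list(product(range(len(init_a)), range(len(init_b))))
    alpha = {(i, j): init_a[i] * init_b[j]
             * emit_a[i][obs_a[0]] * emit_b[j][obs_b[0]] for i, j in states}
    for oa, ob in zip(obs_a[1:], obs_b[1:]):
        alpha = {
            (k, l): emit_a[k][oa] * emit_b[l][ob] * sum(
                alpha[i, j] * trans_a[i][j][k] * trans_b[i][j][l]
                for i, j in states)
            for k, l in states
        }
    return sum(alpha.values())


# Toy parameters (all numbers are made up for illustration).
init = [0.6, 0.4]
trans = [[0.7, 0.3], [0.2, 0.8]]
emit = [[0.9, 0.1], [0.3, 0.7]]

# Uncoupled special case: each chain ignores the other chain's previous state.
uncoupled_a = [[trans[i] for _j in range(2)] for i in range(2)]
uncoupled_b = [[trans[j] for j in range(2)] for _i in range(2)]

# Coupled case: chain A's next state also depends on chain B's previous state.
coupled_a = [[[0.9, 0.1], [0.5, 0.5]],
             [[0.4, 0.6], [0.1, 0.9]]]

buy_obs, sell_obs = [0, 1, 1], [0, 0, 1]
p_independent = hmm_likelihood(buy_obs, init, trans, emit) * \
    hmm_likelihood(sell_obs, init, trans, emit)
p_uncoupled = chmm_likelihood(buy_obs, sell_obs, init, init,
                              uncoupled_a, uncoupled_b, emit, emit)
p_coupled = chmm_likelihood(buy_obs, sell_obs, init, init,
                            coupled_a, uncoupled_b, emit, emit)
```

When the transition tables ignore the other chain, the joint likelihood factorizes into the product of two independent HMM likelihoods, so `p_uncoupled` matches `p_independent` up to rounding. One common way to use such a model for abnormality detection is to flag sequences whose likelihood under a model trained on normal data is unusually low; whether the talk's method works exactly this way is not stated here.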
Adaptive CHMM for Detecting Sequence Changes
• Benchmark Models
– HMM-B: Buy-based HMM
– HMM-S: Sell-based HMM
– HMM-T: Trade-based HMM
– IHMM: HMM-B + HMM-S + HMM-T
– CHMM: CHMM(buy, sell, trade)
Evaluation
• Technical performance
• Computational cost
Prospects
[Figure: prospects diagram. Legible labels: sequence analysis; coupled behavior analysis; frequent pattern mining; event detection; group behavior pattern mining; community discovery; impact-oriented (positive, negative, multi-level, mixed)]
Novel Behavior Pattern Mining
• Semi-supervised coupled behavior analysis
1: Coupling relationship analysis
Behavior Informatics-SIG:
http://www.behaviorinformatics.org/
Cao, L: BI at DDDM2008 Joint with ICDM2008