Human behavior analysis from videos using optical flow

(1)

• Laboratoire Informatique Fondamentale de Lille

UNIVERSITE DES SCIENCES ET TECHNOLOGIES DE LILLE LIFL – UMR 8022 – Bât. M3 – 59655 Villeneuve d’Ascq cedex Tél. : (33) 3 28 77 85 41 – Fax : (33) 3 28 77 85 39 – e-mail : … @lifl.fr

Yassine Benabbas

Thesis advisor: Chabane Djeraba

Multitel Workshop 2011

(2)

Outline

• Introduction

• State of the Art

• Global approach

– Recognition of human actions

– Crowd event detection

– Motion pattern extraction

• Conclusion

(3)

Introduction

• Automatic behavior analysis is a very active field in research and industry

• It consists of extracting information from videos using computer vision algorithms

• The extracted information is used to:

– Assist surveillance operators

– Provide statistics for marketing agents

– Perform video retrieval

– Allow more natural and immersive human-machine interactions

– etc.

(4)

State of the art

• Many approaches have been proposed for behavior analysis

– Human activity recognition [Le et al., CVPR 2011]

– Crowd event detection [Adam et al., TPAMI 2008]

– Motion pattern extraction [Rodriguez et al., ICCV 2009]

• However, they focus on a single aspect of behavior analysis or are very complex

– Example: dynamic textures [Ma and Cisar, CVPR 2009]

• Privacy issues are not addressed

• Intelligent cameras that contain embedded software require fast and reusable algorithms

(5)

Our approach

• We propose a generic approach for behavior analysis

• It is based on three levels of features

– Easier understanding

• Each level can be designed separately

– More control

• Each level can be reused for other purposes

– Saves processing power

• The lowest level relies on motion information

– Preserves privacy ‘out of the box’

(6)

General Approach

[Diagram: Video stream → Low-level features → Mid-level descriptors → High-level information]

Applications

• Human action recognition

• Crowd event detection

• Motion pattern extraction

(7)

LOW LEVEL FEATURES

(8)

Interest point detection

• Identification of ‘good’ points that can be tracked easily and efficiently.

• We used the “good features to track” algorithm

– Fast and efficient OpenCV implementation

– Jianbo Shi and C. Tomasi, "Good features to track," Proceedings of the 1994 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR '94), pp. 593-600, 21-23 Jun 1994, doi: 10.1109/CVPR.1994.323794
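The "good features to track" criterion scores a point by the smaller eigenvalue of the gradient structure tensor computed over a window around it. The slides rely on the OpenCV implementation; the plain-Python sketch below only illustrates the scoring idea. The function name `min_eigenvalue_score`, the window half-size, and the synthetic test images are assumptions made for illustration.

```python
import math


def min_eigenvalue_score(img, x, y, half=2):
    """Smaller eigenvalue of the 2x2 structure tensor around (x, y).

    img is a list of rows of grey levels; (x, y) must lie at least
    half + 1 pixels away from the image border.
    """
    sxx = sxy = syy = 0.0
    for j in range(y - half, y + half + 1):
        for i in range(x - half, x + half + 1):
            # Central-difference image gradients.
            gx = (img[j][i + 1] - img[j][i - 1]) / 2.0
            gy = (img[j + 1][i] - img[j - 1][i]) / 2.0
            sxx += gx * gx
            sxy += gx * gy
            syy += gy * gy
    # Closed-form smaller eigenvalue of [[sxx, sxy], [sxy, syy]].
    mean = (sxx + syy) / 2.0
    diff = (sxx - syy) / 2.0
    return mean - math.sqrt(diff * diff + sxy * sxy)


# Synthetic 12x12 images (illustrative): a square corner, a vertical edge, a flat patch.
corner = [[255.0 if (i >= 6 and j >= 6) else 0.0 for i in range(12)] for j in range(12)]
edge = [[255.0 if i >= 6 else 0.0 for i in range(12)] for j in range(12)]
flat = [[50.0] * 12 for _ in range(12)]

corner_score = min_eigenvalue_score(corner, 6, 6)
edge_score = min_eigenvalue_score(edge, 6, 6)
flat_score = min_eigenvalue_score(flat, 6, 6)
```

Only the corner gets a clearly positive score: along an edge the gradient varies in a single direction, so one eigenvalue stays near zero and the point cannot be tracked reliably along the edge.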

(9)

Optical flow computation

• Estimate the motion of interest points

• We use Bouguet's implementation of the pyramidal Lucas-Kanade tracker

[Figure: frame t with its interest points + frame t+1 = optical flow vectors]
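The core of a Lucas-Kanade tracker is a small least-squares problem solved around each interest point. The sketch below shows a single-level, single-iteration step in plain Python; it is not Bouguet's code, which adds image pyramids and iterative refinement. The helper names `lucas_kanade_step`, `texture`, and `make_frame`, the window size, and the synthetic frames are assumptions made for illustration.

```python
import math


def lucas_kanade_step(prev, nxt, x, y, half=4):
    """Estimate the flow (u, v) at pixel (x, y) from two grey-level frames.

    Solves the 2x2 least-squares system of the brightness-constancy
    equation Ix*u + Iy*v + It = 0 over a (2*half+1)^2 window.
    """
    sxx = sxy = syy = sxt = syt = 0.0
    for j in range(y - half, y + half + 1):
        for i in range(x - half, x + half + 1):
            gx = (prev[j][i + 1] - prev[j][i - 1]) / 2.0  # spatial gradients
            gy = (prev[j + 1][i] - prev[j - 1][i]) / 2.0
            gt = nxt[j][i] - prev[j][i]                   # temporal difference
            sxx += gx * gx
            sxy += gx * gy
            syy += gy * gy
            sxt += gx * gt
            syt += gy * gt
    det = sxx * syy - sxy * sxy
    if abs(det) < 1e-12:
        return None  # untextured window: the flow cannot be recovered here
    u = (-syy * sxt + sxy * syt) / det
    v = (sxy * sxt - sxx * syt) / det
    return u, v


def texture(x, y):
    # Smooth synthetic pattern with structure along both axes.
    return math.sin(0.4 * x) + math.cos(0.3 * y)


def make_frame(dx=0.0, dy=0.0, size=24):
    # A frame whose content is shifted by (dx, dy) pixels.
    return [[texture(i - dx, j - dy) for i in range(size)] for j in range(size)]


frame_t = make_frame()
frame_t1 = make_frame(dx=0.2, dy=0.1)
flow = lucas_kanade_step(frame_t, frame_t1, 12, 12)
```

On this synthetic pair the estimate lands close to the true shift of (0.2, 0.1) pixels. The linearization only holds for small displacements, which is why practical implementations work coarse-to-fine on an image pyramid.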

(11)

MID-LEVEL FEATURES: DIRECTION MODEL AND MAGNITUDE MODEL

(12)

Vector allocation to blocks

• Each vector is allocated to a block depending on its origin

• Discard vectors with a very small or a very large magnitude

[Figure: optical flow vectors allocated to a grid of 8x4 blocks]
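The allocation step can be sketched in a few lines of Python. The function name `allocate_to_blocks`, the magnitude thresholds, and the reading of "8x4" as 8 columns by 4 rows are assumptions; the slides do not give these details.

```python
import math


def allocate_to_blocks(vectors, frame_w, frame_h, cols=8, rows=4,
                       min_mag=1.0, max_mag=30.0):
    """Group flow vectors by the block that contains their origin.

    vectors: iterable of (x, y, dx, dy).
    Returns {(row, col): [(angle, magnitude), ...]}.
    """
    blocks = {}
    for x, y, dx, dy in vectors:
        mag = math.hypot(dx, dy)
        if not (min_mag <= mag <= max_mag):
            continue  # drop near-static noise and implausibly large jumps
        col = min(int(x * cols / frame_w), cols - 1)
        row = min(int(y * rows / frame_h), rows - 1)
        angle = math.atan2(dy, dx) % (2 * math.pi)
        blocks.setdefault((row, col), []).append((angle, mag))
    return blocks


vectors = [
    (10, 10, 3.0, 0.0),    # kept, top-left block
    (300, 200, 0.0, 4.0),  # kept, bottom-right block
    (15, 12, 0.1, 0.0),    # too small, discarded
    (20, 20, 50.0, 0.0),   # too large, discarded
]
blocks = allocate_to_blocks(vectors, frame_w=320, frame_h=240)
```

Here `blocks` holds two entries, `(0, 0)` and `(3, 7)`. Storing each vector as an (angle, magnitude) pair makes the per-block data directly usable by the direction and magnitude models.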

(13)

Direction model

• The orientations of optical flow vectors are clustered in each block

• The circular data is clustered using von Mises distributions
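For reference, a mixture of von Mises distributions over an orientation θ has the standard form below. The symbols K (number of components), w_k (weights), μ_k (mean directions), and κ_k (concentrations) are notation chosen here, not taken from the slides.

```latex
p(\theta) = \sum_{k=1}^{K} w_k \,
  \frac{\exp\!\big(\kappa_k \cos(\theta - \mu_k)\big)}{2\pi I_0(\kappa_k)},
\qquad \sum_{k=1}^{K} w_k = 1, \quad w_k \ge 0
```

Here I_0 is the modified Bessel function of the first kind of order zero. Each component plays the role of a Gaussian on the circle: μ_k is a dominant orientation in the block and a larger κ_k means the vectors are more tightly aligned around it.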

(15)

Direction model (2)

• The direction model is updated at each new frame throughout the video clip

[Figure: optical flow and direction model at t=0]

[Figure: optical flow and direction model at t=40, block size 20x20]

[Figure: optical flow and direction model at t=115, block size 20x20]

[Figure: optical flow and direction model at t=160, block size 20x20]

(19)

Magnitude model

• The magnitude model is estimated following the same steps as the direction model

• We estimate a Gaussian mixture for each block
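In the usual notation, the Gaussian mixture over the magnitude m of the vectors in one block reads as follows. The symbols K, w_k, μ_k, and σ_k are notation assumed here for illustration.

```latex
p(m) = \sum_{k=1}^{K} w_k \,
  \frac{1}{\sqrt{2\pi\sigma_k^2}}
  \exp\!\left(-\frac{(m - \mu_k)^2}{2\sigma_k^2}\right),
\qquad \sum_{k=1}^{K} w_k = 1, \quad w_k \ge 0
```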

(20)

APPLICATIONS

(22)

Human Action Recognition

• Different terminologies (action, activity, event)

• In this presentation, action recognition consists of identifying simple everyday actions (e.g., walking, running)

• Our input is a video (query video) captured from a monocular camera

[Figure: example actions: answering the phone, boxing]

(23)

Model associated with a video sequence

Model of a video = (direction model, magnitude model)

(24)

Distance metric

[Figure: the query model is compared with each template model (running, jogging, handwaving, handclapping, boxing, walking, …) using a distance metric; the closest template gives the detected event]

(27)

Distance metric

Distance between two direction models

Distance between two magnitude models
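The matching step amounts to a nearest-template search. The sketch below shows that search with a deliberately simplified stand-in for the distance: each model is reduced to a hypothetical (direction summary, magnitude summary) pair of numbers, whereas the actual models are per-block mixtures and the actual distance formulas are not reproduced here. The names `classify` and `toy_distance` and the template values are illustrative assumptions.

```python
def classify(query, templates, distance):
    """Return the label of the template model closest to the query model.

    templates: list of (label, model) pairs.
    distance: callable taking two models and returning a non-negative number.
    """
    best_label, _ = min(templates, key=lambda item: distance(query, item[1]))
    return best_label


def toy_distance(a, b):
    # Hypothetical stand-in: one direction term plus one magnitude term.
    (dir_a, mag_a), (dir_b, mag_b) = a, b
    return abs(dir_a - dir_b) + abs(mag_a - mag_b)


templates = [
    ("walking", (0.0, 1.0)),
    ("running", (0.0, 4.0)),
    ("handwaving", (1.5, 2.0)),
]
label = classify((0.1, 3.6), templates, toy_distance)
```

With these toy values the query is closest to the "running" template. Passing the distance as a parameter keeps the search independent of how the direction and magnitude terms are defined or combined.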

(28)

Result comparison

ADL dataset

KTH dataset

[BALD11] Yassine Benabbas, Samir Amir, Adel Lablack, and Chabane Djeraba. Human action recognition using direction and magnitude models of motion. In International Conference on Computer Vision Theory and Applications (VISAPP), 2011.

(30)

Crowd Event Detection

• Objective:

– Detect interesting events or situations that occur in a crowd scene

• The targeted events are:

– Running

– Splitting

– Local dispersion

– Evacuation

– Merging

• These events are defined in the PETS’2009 workshop.

(31)

Compute the instantaneous direction model

• Compute the direction model for the current frame

• Keep only the main orientation for each block of the direction model

(32)

Group Clustering and Tracking

• Cluster neighboring blocks that have a similar direction into a group.

• Define an orientation and a centroid for each group.

• Each group is tracked over the following frames.
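One way to realize the grouping step is region growing over the block grid. The sketch below labels 4-connected blocks whose main orientations are close; the function names, the 4-neighborhood, and the angular threshold are assumptions, since the slides do not specify them.

```python
import math
from collections import deque


def angle_diff(a, b):
    """Smallest absolute difference between two angles, in radians."""
    d = abs(a - b) % (2 * math.pi)
    return min(d, 2 * math.pi - d)


def group_blocks(orient, max_diff=math.pi / 8):
    """Label 4-connected blocks whose main orientations differ by < max_diff.

    orient: 2D list of angles in radians, or None for blocks without motion.
    Returns a 2D list of group ids (None where there is no motion).
    """
    rows, cols = len(orient), len(orient[0])
    labels = [[None] * cols for _ in range(rows)]
    next_id = 0
    for r in range(rows):
        for c in range(cols):
            if orient[r][c] is None or labels[r][c] is not None:
                continue
            labels[r][c] = next_id
            queue = deque([(r, c)])
            while queue:  # breadth-first region growing
                cr, cc = queue.popleft()
                for nr, nc in ((cr + 1, cc), (cr - 1, cc), (cr, cc + 1), (cr, cc - 1)):
                    if (0 <= nr < rows and 0 <= nc < cols
                            and orient[nr][nc] is not None
                            and labels[nr][nc] is None
                            and angle_diff(orient[cr][cc], orient[nr][nc]) < max_diff):
                        labels[nr][nc] = next_id
                        queue.append((nr, nc))
            next_id += 1
    return labels


EAST, WEST = 0.0, math.pi
grid = [[EAST, EAST, None, WEST],
        [EAST, EAST, None, WEST]]
labels = group_blocks(grid)
```

The example yields two groups, one moving east and one moving west. Because neighbors are compared pairwise, the orientation may drift gradually across a large group; comparing each candidate with the group's mean orientation instead is a stricter alternative.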

(36)

Event detection

• We use two classifiers:

– One for running and walking events using the mean motion speed as a feature

– One for local dispersion, split, merge and evacuation events using as features:

• Number of groups

• Mean orientation

• The circular variance

• Mean motion speed

• The mean distance between groups

– Using two classifiers allows to detect
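Two of the listed features, the mean orientation and the circular variance, follow standard circular-statistics definitions and can be computed as sketched below. The function name `circular_stats` is an assumption, and how the slides aggregate orientations (per group or per block) is not specified.

```python
import math


def circular_stats(angles):
    """Return (mean orientation, circular variance) for angles in radians."""
    n = len(angles)
    c = sum(math.cos(a) for a in angles) / n
    s = sum(math.sin(a) for a in angles) / n
    mean = math.atan2(s, c) % (2 * math.pi)
    # Resultant length is 1 when all angles agree and 0 when they cancel out.
    variance = 1.0 - math.hypot(c, s)
    return mean, variance


aligned_mean, aligned_var = circular_stats([0.1, 0.1, 0.1])
_, dispersed_var = circular_stats([0.0, math.pi / 2, math.pi, 3 * math.pi / 2])
```

A variance near 0 indicates a crowd moving in one direction, while a value near 1 indicates motion spread over all directions, which is the kind of signal a dispersion or evacuation detector can exploit.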

(37)

Comparison

[BID11] Yassine Benabbas, Nacim Ihaddadene, and Chabane Djeraba. Motion pattern extraction and event detection for automatic visual surveillance. EURASIP Journal on Image and Video Processing, 2011:15, 2011.

(39)

Motion Pattern Extraction

• It consists of extracting usual (or repetitive) patterns (or trends) of motion

• It can be seen as a summary of the motion behavior in a video

(40)

Motion Pattern Extraction

• Motion patterns learned from a given scene can be used to model the usual behaviors of subjects and have many applications:

– They provide relevant information about subjects’ behavior.

– They can improve tracking results.

– They can help to detect events.

• Learning motion patterns in unstructured crowd scenes is a difficult task:

– In some locations in the scene, the motion has several orientations (for example, at a zebra crossing)

(41)

Clustering similar regions

• Assign at most k major orientations to each cell.

– They are obtained from the cell’s mixture model.

• A direction model is obtained

[Figure: representation of the learned direction model]

(42)

Clustering similar regions

• Cluster similar blocks depending on their major orientations

– Two blocks are similar if they are neighbors (the neighborhood window is one block)

– and the cosine similarity between two of their major orientations is less than a predefined threshold.

• A block can belong to a maximum of k clusters
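The orientation part of the similarity test can be sketched as a small predicate over the major orientations of two blocks. The slide's condition is read here as a cosine distance (1 minus the cosine similarity) below a threshold, so that nearly parallel orientations count as similar; that reading, the function name `blocks_similar`, and the threshold value are assumptions. The neighborhood check is left to the caller.

```python
import math


def blocks_similar(orients_a, orients_b, min_cos=0.9):
    """True if any pair of major orientations (radians) is nearly parallel.

    Assumption: 'similar' means cosine similarity >= min_cos, i.e. the
    cosine distance 1 - cos(angle difference) is at most 1 - min_cos.
    """
    return any(math.cos(a - b) >= min_cos
               for a in orients_a for b in orients_b)


shares_direction = blocks_similar([0.0, math.pi / 2], [math.pi / 2 + 0.1])
opposite = blocks_similar([0.0], [math.pi])
```

Testing every pair of orientations is what lets one block join several clusters: a block at a crossing with two major orientations can match a horizontal pattern through one of them and a vertical pattern through the other.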

[Figure: direction model and the extracted patterns 1, 2, and 3]

(43)

Experiments

• Car traffic video from the AVSS dataset

• The orientations of optical flow vectors are represented

(44)

Detected patterns

(45)

Putting it all together

(46)

Escalator

(47)

Comparison

[BID11] Yassine Benabbas, Nacim Ihaddadene, and Chabane Djeraba. Motion pattern extraction and event detection for automatic visual surveillance. EURASIP Journal on Image and Video Processing, 2011:15, 2011.

(48)

Conclusion and future work

• Conclusions

– General approach for video analysis

– Based on motion, which preserves privacy

– Very promising results

– Can be easily improved and extended to other applications

• Future work

– Open source behavior analysis toolbox

– Apply approaches in real environments

– Scale-independent features

– In event detection: apply weights to direction and magnitude models

– Refine group analysis (detect walking and running persons inside a group)

(49)

QUESTIONS?

Thank you for your attention
