Detection of Event Using Trajectory Hyper-graphs Method

2017 International Conference on Electronic and Information Technology (ICEIT 2017) ISBN: 978-1-60595-526-1

Jia KE$^{1,2}$ and Xiao-jun CHEN$^{2,3,a,*}$

$^1$ School of Management, Jiangsu University

$^2$ School of Computer Science and Communication Engineering, Jiangsu University

$^3$ Affiliated Hospital of Jiangsu University, Jiangsu University

*Corresponding author [email protected]

Keywords: Detection of event, Hyper-graph, Trajectory hyper-graph, Diagonal matrix, Hausdorff distance

Abstract. The main function of machine vision is to improve the flexibility and automation of production, yet extracting semantic information from video data remains difficult in this field. Video event detection involves capturing surveillance video, detecting events, and converting them into data for processing and analysis. This paper presents trajectory hyper-graph theory for recognizing events and sub-events in video, which strengthens the ability of event classification. In experiments on several monitoring video (MV) datasets from different scenes, we find that the numbers of vertices and clusters of the trajectory and multi-label hyper-graph fusion method (TG-MLG) are larger than those of the other two methods, and that it has better description performance. Using trajectory hyper-graph theory for event detection, high-level semantic information can be used for video classification, searching and forecasting.

Introduction

An appropriate representation of the relationships between video objects is crucial for video event detection. In the literature, numerous works using graphs or hyper-graphs have been proposed. For instance, Hakeem proposed a clustering method based on image segmentation for complex event detection. Based on the temporal relationships between simple events, this method clusters relevant events by graph cutting. Huang employed video hyper-graph partitioning to detect moving objects [1]. In many real-world problems, a traditional graph based on a single similarity function is insufficient for representing the relations among a set of objects [2]. In general, different pairwise graphs can be built based on affinity functions computed from different features. A weighted similarity measure using all the features can then be established to combine these representations [3]. However, simply taking their weighted sum as the new affinity function may lose information that is crucial to the clustering task [4]. On the other hand, one may consider the relationship among three or more data points to determine whether they belong to the same cluster. For example, we may compute the probability that an object and its neighbors belong to the same category. Such a representation for data sets with higher-order relationships is termed a hyper-graph, which is defined on a set of vertices and a set of weighted hyper-edges [5].

of target is constructed. Meanwhile, multi-label semantic hyper-graphs are established to represent semantic concepts in video. These two hyper-graphs are segmented by spectral clustering method to yield clusters of trajectory and label. Finally, they are integrated to reflect the relevance between their vertices. Experiments are performed to detect complex events in video and the results confirm the effectiveness of our method.

Hyper-graph Theory

Hyper-graph is based on graph and set theory. Objects with common features belong to a set. Different levels of abstraction can be attributed to the set of sets. In such a way, a structure based on inclusion relation of sets can be established. Hyper-graph has emerged as a useful tool to describe such structure.
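As a minimal sketch of this structure, a hyper-graph can be stored as a set of vertices plus named hyperedges, each hyperedge being a set of vertices. The toy vertices, the hyperedges and the helper name `incidence_matrix` below are illustrative assumptions, not taken from the paper.

```python
# Hypothetical toy data: four vertices and three hyperedges.
vertices = ["v1", "v2", "v3", "v4"]
hyperedges = {
    "e1": {"v1", "v2", "v3"},  # a hyperedge may join more than two vertices
    "e2": {"v2", "v4"},
    "e3": {"v1", "v3", "v4"},
}


def incidence_matrix(vertices, hyperedges):
    """Return H with H[i][j] = 1 if vertex i belongs to hyperedge j, else 0."""
    names = sorted(hyperedges)
    return [[1 if v in hyperedges[e] else 0 for e in names] for v in vertices]


H = incidence_matrix(vertices, hyperedges)
```

Each row of `H` lists the hyperedges a vertex belongs to, which is the inclusion relation the paragraph above describes.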

Fusing Trajectory and Multi-label Hyper-graphs for Event Detection

According to the spatial-temporal characteristics of trajectory as well as the co-occurrence of semantic tags, this paper uses spectral segmentation to partition trajectory hyper-graph and multi-label hyper-graph, thus producing segmentation results of video events. From the fusion results of temporal dependencies between events, we can get the classification of complex events, thus achieving complex event detection.

Definition of Trajectory

Target motion can be represented as a series of spatial changes over time in consecutive video frames. Therefore, in this paper a trajectory is defined as a form exhibiting simultaneous spatial and temporal variation. We improved the Gaussian mixture model in [6] to identify moving targets and used invariant contour moments to describe multiple moving targets in consecutive frames. Taking the center-of-mass coordinate of the moving target as the spatial position $(x_i, y_i)$ of object$_i$, we can extract the trajectory from the variation of consecutive spatial positions of moving targets.

Definition 1. In a video, let $o_i$ stand for the $i$-th moving target, with trajectory $T_{\mathrm{object}_i}$ expressed as a triple $(x_i, y_i, t_i)$. It means that at time point $t_i$ the spatial position of the moving target $o_i$ is $(x_i, y_i)$, where $x_i$ and $y_i$ denote the horizontal and vertical coordinates, respectively. Let $R^2$ and $R$ denote the spatial and temporal domains of target motion, respectively; then the temporal-spatial domain can be expressed as $R^2 \times R$.

Definition 2. The velocity vector of trajectory TS at time point $t$ ($t_i \le t \le t_{i+1}$) is

$$(p_x, p_y) = \left( \frac{x_{i+1} - x_i}{t_{i+1} - t_i}, \; \frac{y_{i+1} - y_i}{t_{i+1} - t_i} \right).$$

The velocity of the target between $(x_i, y_i)$ and $(x_{i+1}, y_{i+1})$ can be expressed as

$$\frac{\sqrt{(x_{i+1} - x_i)^2 + (y_{i+1} - y_i)^2}}{t_{i+1} - t_i}.$$
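The two expressions in Definition 2 can be illustrated with a short sketch. It assumes a trajectory is available as a time-ordered list of $(x, y, t)$ triples; the function name `velocities` and the sample track are hypothetical.

```python
import math


def velocities(points):
    """points: list of (x, y, t) triples ordered by time.

    Returns one ((p_x, p_y), speed) pair per consecutive pair of points.
    """
    result = []
    for (x0, y0, t0), (x1, y1, t1) in zip(points, points[1:]):
        dt = t1 - t0
        if dt <= 0:
            raise ValueError("time stamps must be strictly increasing")
        px, py = (x1 - x0) / dt, (y1 - y0) / dt
        # The speed is the length of the velocity vector.
        result.append(((px, py), math.hypot(px, py)))
    return result


# Hypothetical track: 5 units of distance in 1 time unit, then 4 units in 2.
track = [(0.0, 0.0, 0.0), (3.0, 4.0, 1.0), (3.0, 8.0, 3.0)]
segments = velocities(track)
```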

Similarity between Trajectories

trajectory is also a main characteristic of moving target. Therefore, we combine the space distance and motion feature to measure the similarity of trajectories, wherein the motion feature is represented by the velocity direction of the motion point on the trajectory.

In the set of trajectories, two trajectories $t_i$ and $t_j$ are selected. The spatial distance between them, $d_{sp}(t_i, t_j)$, is expressed as:

$$d_{sp}(t_i, t_j) = \min \{ d(t_i, t_j), \; d(t_j, t_i) \} \qquad (1)$$

where $d(t_i, t_j) = \max_{b \in t_j} \min_{a \in t_i} \| a - b \|$. The more similar $t_i$ and $t_j$ are, the smaller the spatial distance between them is, and vice versa.
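A minimal sketch of formula (1), assuming each trajectory is a list of 2-D positions and that the directed term takes, for every point $b$ on $t_j$, the distance to its nearest point $a$ on $t_i$. The function names and sample trajectories are illustrative, not from the paper.

```python
import math


def directed_distance(t_i, t_j):
    """d(t_i, t_j): for each point b on t_j take the distance to its nearest
    point a on t_i, then return the largest of these values."""
    return max(min(math.dist(a, b) for a in t_i) for b in t_j)


def spatial_distance(t_i, t_j):
    """d_sp from formula (1): the smaller of the two directed distances."""
    return min(directed_distance(t_i, t_j), directed_distance(t_j, t_i))


# Hypothetical trajectories: two roughly parallel tracks one unit apart.
t1 = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
t2 = [(0.0, 1.0), (1.0, 1.0)]
d_sp = spatial_distance(t1, t2)
```

Because formula (1) takes the minimum of the two directed terms, the result is symmetric in its arguments regardless of how the directed term is oriented.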

In addition to spatial distance, the velocity direction can also be used as a measure of similarity between trajectories. Let $a_p$ be any point on $t_i$ and $(p_x^{a_p}, p_y^{a_p})$ be the corresponding velocity vector. Similarly, let $b_q$ be any point on $t_j$ and $(p_x^{b_q}, p_y^{b_q})$ be the corresponding velocity vector. Then, the velocity direction can be defined as

$$\cos\theta = \frac{(p_x^{a_p}, p_y^{a_p}) \cdot (p_x^{b_q}, p_y^{b_q})}{\| (p_x^{a_p}, p_y^{a_p}) \| \, \| (p_x^{b_q}, p_y^{b_q}) \|} \qquad (2)$$

In terms of velocity direction, the distance between trajectories is given by

$$d_{ve}(t_i, t_j) = 1 - \cos\theta \qquad (3)$$

Combining the trajectory spatial distance and the velocity direction distance, the similarity between trajectories is given by

$$D(t_i, t_j) = \frac{1}{n} \sum_{p \in t_i} \left[ k \, d_{sp}(t_i, t_j) + (1 - k) \, d_{ve}(t_i, t_j) \right] \qquad (4)$$

where $d_{ve}(t_i, t_j)$ is given by formula (3), $n$ is the number of points on $t_i$, and $k$ is a trade-off coefficient used to balance the influence of the spatial distance and the velocity direction distance. In this paper, we simply set the value of $k$ to 0.5, meaning that the two kinds of distance have equal influence on the similarity between trajectories. Then, the hyper-graph is adopted to perform clustering analysis.
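The following sketch shows one way formulas (2) to (4) could be combined. The paper does not state how a point on $t_i$ is paired with a point on $t_j$; this example assumes each sample is compared with the spatially nearest sample on the other trajectory, and it treats a zero-length velocity vector as having cosine 0. Both choices, and all function names, are assumptions made for illustration.

```python
import math


def cos_theta(u, v):
    """Cosine of the angle between two velocity vectors, as in formula (2)."""
    norm = math.hypot(*u) * math.hypot(*v)
    # Assumption: a zero-length vector has no direction, so return 0.
    return 0.0 if norm == 0 else (u[0] * v[0] + u[1] * v[1]) / norm


def trajectory_distance(t_i, t_j, d_sp, k=0.5):
    """t_i, t_j: lists of ((x, y), (p_x, p_y)) samples.
    d_sp: spatial distance between the two trajectories, formula (1).
    Returns D(t_i, t_j) as in formula (4)."""
    total = 0.0
    for pos, vel in t_i:
        # Assumption: pair each sample with the nearest sample on t_j.
        _, vel_q = min(t_j, key=lambda s: math.dist(pos, s[0]))
        d_ve = 1.0 - cos_theta(vel, vel_q)  # formula (3)
        total += k * d_sp + (1.0 - k) * d_ve
    return total / len(t_i)


# Hypothetical samples: positions with their velocity vectors.
ti = [((0.0, 0.0), (1.0, 0.0)), ((1.0, 0.0), (1.0, 0.0))]
tj = [((0.0, 1.0), (0.0, 1.0)), ((1.0, 1.0), (1.0, 0.0))]
D_ij = trajectory_distance(ti, tj, d_sp=1.0)
```

With $k = 0.5$ the first sample contributes 1.0 (perpendicular velocities) and the second 0.5 (parallel velocities), so the average is 0.75.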

Establishing Trajectory Hyper-graph

Based on the hyper-graph theory discussed in Section 2, we construct a trajectory hyper-graph to represent the trajectories of multiple moving targets. Each vertex corresponds to a trajectory in a video. We consider a set of $n$ trajectories $\{t_1, \dots, t_i, \dots, t_n\}$, where $t_i$ and $t_j$ are two trajectories. To obtain the vertex set $S_V = \{v_1, v_2, \dots, v_n\}$ ($|S_V| = n$) of the trajectory hyper-graph, we first get the similarity measurement from the spatial and temporal feature vectors of each trajectory, and then calculate the affinity matrix $M_F$ of size $|S_V| \times |S_V|$ of the vertex set $S_V$. Finally, we take each vertex as the centroid and form a hyperedge $e$ containing this vertex and its $m-1$ closest vertices (in the sense of the affinity matrix $M_F$).

In summary, we can build the hyper-graph $G_T$ through the following steps:

Input: The set of trajectories $\{t_1, \dots, t_i, \dots, t_n\}$ extracted from a video

Step 1. Determine the vertices of the trajectory hyper-graph $G_T$: each trajectory is viewed as a vertex, and the vertex set of $G_T$ is $S_V^t = \{v_1, v_2, \dots, v_n\}$.

Step 2. Determine the hyperedge set of $G_T$, $S_E^t = \{e_1, e_2, \dots, e_m\}$:

Step 2.1. Let $t_i$ and $t_j$ be denoted by $v_i$ and $v_j$, respectively. The affinity between them is $(M_F)_{ij} = \exp\left( -\frac{D(t_i, t_j)}{\sigma} \right)$, where $\sigma$ is the standard deviation calculated from $D(t_i, t_j)$.

Step 2.2. According to the $|S_V| \times |S_V|$ affinity matrix $M_F$ of the vertex set $S_V^t$, construct the hyperedges: view each vertex as the centroid and form a hyperedge $e$ containing this vertex and its $m-1$ closest vertices (in the sense of the affinity matrix $M_F$). The resulting hyperedge set of the trajectory hyper-graph $G_T$ is $S_E^t = \{e_1, e_2, \dots, e_m\}$.

Output: The trajectory hyper-graph $G_T$
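The construction steps can be sketched as follows, starting from a precomputed matrix of pairwise trajectory distances $D(t_i, t_j)$. The sketch assumes that $\sigma$ is the population standard deviation of the off-diagonal distances and that "closest" means highest affinity; the function names and the sample matrix are hypothetical.

```python
import math
import statistics


def affinity_matrix(D):
    """D: symmetric n x n list of trajectory distances D(t_i, t_j).
    Returns M_F with M_F[i][j] = exp(-D[i][j] / sigma)."""
    n = len(D)
    # Assumption: sigma is the population standard deviation of the
    # off-diagonal distances.
    off_diagonal = [D[i][j] for i in range(n) for j in range(n) if i != j]
    sigma = statistics.pstdev(off_diagonal)
    if sigma == 0:
        raise ValueError("all distances are equal, so sigma is zero")
    return [[math.exp(-D[i][j] / sigma) for j in range(n)] for i in range(n)]


def build_hyperedges(M_F, m):
    """One hyperedge per vertex: the vertex itself plus the m-1 other
    vertices with the highest affinity to it."""
    edges = []
    for i, row in enumerate(M_F):
        others = sorted((j for j in range(len(row)) if j != i),
                        key=lambda j: row[j], reverse=True)
        edges.append({i, *others[:m - 1]})
    return edges


# Hypothetical distances between three trajectories.
D = [[0.0, 1.0, 4.0],
     [1.0, 0.0, 3.0],
     [4.0, 3.0, 0.0]]
M_F = affinity_matrix(D)
edges = build_hyperedges(M_F, m=2)
```

Since every vertex acts as a centroid once, this sketch yields one hyperedge per trajectory.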

The above algorithm is the procedure for constructing the trajectory hyper-graph $G_T$. As we can see, the similarity of the multiple moving targets in both spatial distance and velocity direction distance influences the affinity matrix $M_F$ of the trajectory hyper-graph $G_T$. Based on this, we can calculate the correlation matrix $M_H^t$ of $G_T$, the diagonal matrix $M_W$ of hyperedge weights, where $w(e_i) = \sum_{v_j \in e_i} (M_F)_{ij}$, the diagonal matrix $D_v$ of vertex degrees, and the diagonal matrix $D_e$ of hyperedge degrees.
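A small sketch of how these matrices could be computed from the affinity matrix and the hyperedges. The paper does not define the vertex and hyperedge degrees explicitly; the sketch assumes the usual hyper-graph definitions (vertex degree = sum of the weights of the hyperedges containing the vertex, hyperedge degree = number of vertices in the hyperedge). The function name and sample data are hypothetical, and the diagonal matrices are returned as plain lists of their diagonal entries.

```python
def hypergraph_matrices(M_F, edges):
    """M_F: n x n affinity matrix; edges[i] is the hyperedge centred on vertex i.
    Returns (H, w, d_v, d_e)."""
    n, m = len(M_F), len(edges)
    # Correlation (incidence) matrix: H[v][e] = 1 if vertex v lies in hyperedge e.
    H = [[1 if v in edges[e] else 0 for e in range(m)] for v in range(n)]
    # Hyperedge weight: sum of affinities between centroid i and every member j.
    w = [sum(M_F[i][j] for j in edges[i]) for i in range(m)]
    # Assumed definitions: d(v) sums the weights of the hyperedges containing v,
    # and delta(e) counts the vertices in e.
    d_v = [sum(w[e] * H[v][e] for e in range(m)) for v in range(n)]
    d_e = [len(edges[e]) for e in range(m)]
    return H, w, d_v, d_e


# Hypothetical affinity matrix and hyperedges for three trajectories.
M_F = [[1.0, 0.5, 0.1],
       [0.5, 1.0, 0.2],
       [0.1, 0.2, 1.0]]
edges = [{0, 1}, {0, 1}, {1, 2}]
H, w, d_v, d_e = hypergraph_matrices(M_F, edges)
```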

We use the hyper-graph spectral segmentation algorithm 1 to partition the hyper-graph $G_T$ into sub-hyper-graphs recursively until the value of $c(S)$ and the average weight of the hyperedges in each sub-hyper-graph are not less than a given threshold. In this way, we obtain the trajectory clusters of the hyper-graph, denoted as $TC = \{TC_1, \dots, TC_l, \dots, TC_m\}$. It follows from the characteristics of spectral partitioning that the correlation between trajectories from the same trajectory cluster is large, while it is small between trajectories from different trajectory clusters.

Experimental Analysis

Experiment Scene

We selected several monitoring video (MV) datasets from different scenes for event detection experiments, including road traffic (RTMV), expressway traffic (ETMV), canal traffic (CTMV), dock (DMV), parking lot (PLMV) and residence (RMV). Some typical frames from these videos are shown in Table 1.

Table 1 shows detailed information about the experimental videos. In the experiments, our method analyzes and processes the following events: normal and abnormal traffic events in RTMV and ETMV, multi-ship sailing in a channel in CTMV, tanker loading (unloading) at the dock in DMV, car parking and vehicle access events in PLMV, and access events of residents, visitors and other personnel in RMV. The experimental environment is an Intel(R) Core(TM) i5 CPU, 8 GB RAM and a 5400 RPM IDE hard drive; the operating system is 32-bit Windows 2010 Server.

Comparing the Numbers of Vertices and Clusters

In the experiments, we compare the proposed trajectory and multi-label hyper-graph fusion method (TG-MLG) with two related methods, CGC and HG-LGC. CGC detects events based on an ordinary graph, while HG-LGC introduces a trajectory hyper-graph with Hausdorff similarity measurement into the hyper-graph model established by multi-label semi-supervised learning. Figure 1 shows the number of vertices in the hyper-graphs that these methods generate on each dataset, where the vertices of TG-MLG and HG-LGC consist of those produced by the trajectory and multi-label hyper-graphs. Figure 2 compares the numbers of clusters generated, where the clusters of TG-MLG and HG-LGC comprise those produced by the trajectory and multi-label hyper-graphs. The numbers of vertices and clusters of TG-MLG are larger than those of the other two methods.

[Figure 1 plot. x-axis: dataset (RTMV, ETMV, CTMV, DMV, PLMV, RMV); y-axis: number of vertices in hyper-graph (0 to 300); series: HG-LGC, TG-MLG, CGC]

Figure 1. Number of vertices in hyper-graph with three different methods.

[Figure 2 plot. x-axis: dataset (RTMV, ETMV, CTMV, DMV, PLMV, RMV); y-axis: number of clusters in hyper-graph (0 to 45); series: HG-LGC, TG-MLG, CGC]

Figure 2. Number of clusters in hyper-graph with three different methods.

Conclusion

semantic concept model of events. Based on hyper-graph theory, this paper establishes event detection from the temporal and spatial characteristics of the extracted trajectories of moving targets. At the same time, we propose constructing multi-label hyper-graphs to represent the semantic concepts of video. By fusing the trajectory and multi-label hyper-graphs pairwise through the mapping relationship between them, complex events can be detected. The experimental results show that our method is better than the other methods in precision and recall rate. In the future, we will apply this method to detect other types of events in video.

Acknowledgements

This research has partially been supported by National Natural Science Foundation of China under Grant No. 61773184, 61502206 and 61502208, College Natural Science Research of Jiangsu Province under Grant No. 14KJB520008, Senior Technical Personnel of Scientific Research Fund of Jiangsu University under Grant No. 13JDG126, Research Innovation Program for College Graduates of Jiangsu Province under Grant No. KYLX15_1078, New Technologies and Projects of Affiliated Hospital of Jiangsu University under Grant No. xjs2016035, Medical Research Project of Jiangsu Provincial Health and Family Planning Commission under Grant No. X2017003, Research on hospital management innovation of Jiangsu Hospital Association under Grant No. JSYJY-3-2017-216.

References

[1] Mezaris V, Scherp A, Jain R, Kankanhalli M S. Real-life events in multimedia: detection, representation, retrieval, and applications [J]. Multimedia Tools and Applications, 2014, 70: 1-6.

[2] Hongeng S, Nevatia R, Bremond F. Video-based event recognition: activity representation and probabilistic recognition methods [J]. Computer Vision and Image Understanding, 2004, 96: 129-162.

[3] Hakeem A, Shah M. Learning, detection and representation of multi-agent events in videos [J]. Artificial Intelligence, 2007, 171 (8-9): 586-605.

[4] Ruocco M, Ramampiaro H. A scalable algorithm for extraction and clustering of event-related pictures [J]. Multimedia Tools and Applications, 2014, 70: 55-88.

[5] Jiang Y, She Q Q, Li M, et al. A transductive multilabel text categorization approach [J]. Journal of Computer Research and Development, 2008, 45 (11): 1817-1822.