Pedestrian Tracking under Dense Crowd


2018 3rd International Conference on Computational Modeling, Simulation and Applied Mathematics (CMSAM 2018) ISBN: 978-1-60595-035-8


Ge YANG^{1,2,3,*}, Si-ping CHEN^{1,2,3}, Jing HUANG^{1,3} and Hui HE^{1,3}

^1 Key Laboratory of Intelligent Multimedia Technology, Beijing Normal University (Zhuhai Campus), Zhuhai 519087, China

^2 Engineering Lab on Intelligent Perception for Internet of Things (ELIP), Shenzhen Graduate School, Peking University, Shenzhen 518055, China

^3 College of Information Technology, Beijing Normal University (Zhuhai Campus), Zhuhai 519087, China

*Corresponding author

Keywords: HOG and HSV features, SVM classifier, Particle filter, Pedestrian tracking, Robustness.

Abstract. In dense scenes, the large number of individuals causes serious problems for tracking, such as blurred views and chaotic scenes. To address these problems, a tracking algorithm based on the human head-shoulder model is proposed. A support vector machine (SVM) is used to train the classifier. The average accuracy of pedestrian tracking in high-density scenes is about 95%.

Introduction

Literature [1] proposes a fusion design method for a pedestrian detection and target tracking system based on a HOG-PCA RBFNN pattern classifier. The algorithm adopts an FCM-based RBFNN pattern classifier, which is composed of condition, conclusion and reasoning parts. Literature [2] proposes a general framework for detection, recognition and tracking. Literature [3] proposes a motion cue encoding algorithm for pedestrian path prediction in dense crowd scenes. Literature [4] combines the semantic features of visual attention with color semantic features, combines the static (spatial-domain) and time-domain visual attention models, and detects pedestrians through the choice of the visual attention focus. Literature [5] proposes an algorithm that uses the head as the target model. The algorithm consists of two parts: the front part and the back part.

FHH-PF Pedestrian Tracking Algorithm

In densely populated scenes, pedestrian tracking algorithms face severe mutual occlusion between pedestrians. Like pedestrian detection, pedestrian tracking is also affected by illumination change and environmental factors. The FHH-PF pedestrian tracking algorithm is mainly divided into two parts. The first adds HOG and HSV histogram features to the particle filter framework to complete the particle tracking. The second is the processing flow for a briefly lost target. Transient loss of the target refers to situations in which the target is occluded, its state (speed, position) changes abruptly, the target leaves the video frame, and so on. These situations must be handled; otherwise the tracking result of the algorithm is not reliable. The two parts of the algorithm flow are described in detail below.

Algorithm input: video sequence.

Algorithm output: video sequence with a tracking box.

Step1: Input the video frame sequence and denoise the video frames.

Step2: Manually annotate the tracking target. In the system, the initial tracking target box is selected with a rectangle through mouse interaction.


window is the extension of the manually marked rectangle: it keeps the original position as its center and enlarges the length and width by a factor of 2. If a pedestrian head-shoulder is detected, the position of the tracking target is corrected. Otherwise, the tracking target is framed again.
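The enlargement of the marked rectangle can be illustrated with a short sketch. The `Rect` type, the `search_window` function and the clipping to the frame borders are hypothetical choices made for this illustration; the paper only states that the window keeps the original center and doubles the length and width.

```python
from typing import NamedTuple, Tuple


class Rect(NamedTuple):
    """Axis-aligned box given by its center (cx, cy) and its size (w, h)."""
    cx: float
    cy: float
    w: float
    h: float


def search_window(box: Rect, scale: float = 2.0,
                  frame_w: float = float("inf"),
                  frame_h: float = float("inf")) -> Tuple[float, float, float, float]:
    """Return (left, top, right, bottom) of the enlarged window.

    The window keeps the center of `box` and multiplies its width and
    height by `scale`. Clipping to the frame is an added assumption.
    """
    half_w = box.w * scale / 2.0
    half_h = box.h * scale / 2.0
    left = max(0.0, box.cx - half_w)
    top = max(0.0, box.cy - half_h)
    right = min(frame_w, box.cx + half_w)
    bottom = min(frame_h, box.cy + half_h)
    return (left, top, right, bottom)


# Example: a 40x60 box centered at (100, 80) in a 640x480 frame.
window = search_window(Rect(cx=100, cy=80, w=40, h=60), frame_w=640, frame_h=480)
```

A head-shoulder detector would then be run only inside `window`, which keeps the correction step cheap compared with scanning the whole frame.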

Step4: Initialize the particles. First, the state parameters of the particles are set up. The state of a particle at time k is represented by $S_k$, which is expressed as

$$S_k = \{x_k,\ y_k,\ \dot{x}_k,\ \dot{y}_k,\ w_k,\ h_k,\ a_{w,k},\ a_{h,k}\}.$$

Here $(x_k, y_k)$ is the center of the tracking rectangle, and $w_k$ and $h_k$ are the width and height of the rectangle. $(\dot{x}_k, \dot{y}_k)$ is the motion vector of the target, obtained by formula (1). $a_{w,k}$ and $a_{h,k}$ are the ratios of target scale change, obtained by formula (2).

$$(\dot{x}_k,\ \dot{y}_k) = (x_k - x_{k-1},\ y_k - y_{k-1}) \tag{1}$$

$$a_{w,k} = \frac{w_k}{w_{k-1}}, \qquad a_{h,k} = \frac{h_k}{h_{k-1}} \tag{2}$$

The state estimate is expressed as the weighted expectation over the particles, as in formula (3).

$$E(S_k) = \sum_{i=1}^{N} s_k^i\, w_k^i \tag{3}$$

Here N is the total number of particles, $s_k^i$ is the state of the i-th particle and $w_k^i$ is its weight. All particles are initialized to the initial state parameters, with the particle motion vector set to 0 and the scale change rate set to 1. The initial weight of every particle is set to 1/N. The number of particles in this paper is 100.
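The initialization and the expectation of formula (3) can be sketched as follows. The `State` dataclass and the function names are hypothetical; only the eight state components, the initial values (motion vector 0, scale ratio 1, weight 1/N) and N = 100 come from the text.

```python
from dataclasses import dataclass, astuple
from typing import List, Tuple


@dataclass
class State:
    x: float    # box center, horizontal
    y: float    # box center, vertical
    dx: float   # motion vector, horizontal component (formula (1))
    dy: float   # motion vector, vertical component (formula (1))
    w: float    # box width
    h: float    # box height
    aw: float   # width scale ratio (formula (2))
    ah: float   # height scale ratio (formula (2))


def init_particles(x: float, y: float, w: float, h: float,
                   n: int = 100) -> Tuple[List[State], List[float]]:
    """All particles start at the annotated box with motion 0 and scale ratio 1."""
    states = [State(x, y, 0.0, 0.0, w, h, 1.0, 1.0) for _ in range(n)]
    weights = [1.0 / n] * n
    return states, weights


def expected_state(states: List[State], weights: List[float]) -> State:
    """Formula (3): weighted sum of the particle states, component by component."""
    sums = [0.0] * 8
    for s, wt in zip(states, weights):
        for j, value in enumerate(astuple(s)):
            sums[j] += value * wt
    return State(*sums)


states, weights = init_particles(120.0, 90.0, 40.0, 60.0)
mean = expected_state(states, weights)
```

Right after initialization all particles coincide, so `mean` simply reproduces the annotated box.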

Step5: Particle state transfer. Since the motion of the target is continuous, the algorithm uses a second-order autoregressive model to predict the current state of the particles. The second-order autoregressive model is given by formula (4), in which $\omega_{k-1}$ is the system state noise and A, B, C are constants. This paper sets A = 2, B = -1, C = 1.

$$S_k = A S_{k-1} + B S_{k-2} + C\,\omega_{k-1} \tag{4}$$
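A minimal sketch of the second-order autoregressive prediction is given below, assuming the model is applied independently to each state component and that the system state noise is zero-mean Gaussian. The noise distribution and the `propagate` helper are assumptions of this sketch; the paper only gives the constants A = 2, B = -1, C = 1.

```python
import random
from typing import List, Optional

A, B, C = 2.0, -1.0, 1.0  # constants given in the text


def propagate(s_prev1: List[float], s_prev2: List[float],
              noise_std: List[float],
              rng: Optional[random.Random] = None) -> List[float]:
    """Predict S_k = A*S_{k-1} + B*S_{k-2} + C*noise for each component.

    s_prev1 is the state at k-1, s_prev2 the state at k-2. The noise is
    drawn from a zero-mean Gaussian (an assumption of this sketch).
    """
    rng = rng or random.Random()
    return [A * p1 + B * p2 + C * rng.gauss(0.0, sd)
            for p1, p2, sd in zip(s_prev1, s_prev2, noise_std)]


# Noise-free check: a target that moved 3 px in x keeps moving 3 px.
pred = propagate([13.0, 50.0], [10.0, 50.0], noise_std=[0.0, 0.0])
```

With A = 2 and B = -1 the deterministic part equals $S_{k-1} + (S_{k-1} - S_{k-2})$, that is, a constant-velocity extrapolation of the last two states.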

Step6: Extract the HOG and HSV features of the target rectangle box and fuse the features with the PCA method. If the current frame is the initial frame, the fused feature is used as the target template.

Step7: The Bhattacharyya distance between feature histograms is calculated to describe the similarity between the predicted target position and the real position. Here, the Bhattacharyya distance represents the similarity between the feature histogram of the current particle and that of the predicted particles. Assuming that the two distributions are P(u) and Q(u), the similarity measures between P(u) and Q(u) are given by formulas (5), (6), (7) and (8).

$$\rho = \int \sqrt{P(u)\,Q(u)}\;du \tag{5}$$

Then the similarity between two histograms $P = \{p_u\}_{u=1,\dots,m}$ and $Q = \{q_u\}_{u=1,\dots,m}$ is deduced to be formula (6).

$$\rho(P, Q) = \sum_{u=1}^{m} \sqrt{p_u\, q_u} \tag{6}$$

Bhattacharyya distance:

$$d_{Bhattacharyya} = \sqrt{1 - \rho(P, Q)} \tag{7}$$
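The histogram comparison can be sketched in a few lines. The sketch computes the Bhattacharyya coefficient of formula (6) and assumes the usual definition of the Bhattacharyya distance, $d = \sqrt{1 - \rho}$; the function names and the example histograms are hypothetical.

```python
import math
from typing import List, Sequence


def normalize(hist: Sequence[float]) -> List[float]:
    """Scale a histogram so that its bins sum to 1."""
    total = float(sum(hist))
    if total <= 0.0:
        raise ValueError("histogram must have positive mass")
    return [v / total for v in hist]


def bhattacharyya_coefficient(p: Sequence[float], q: Sequence[float]) -> float:
    """Formula (6): sum over bins of sqrt(p_u * q_u)."""
    return sum(math.sqrt(pu * qu) for pu, qu in zip(p, q))


def bhattacharyya_distance(p: Sequence[float], q: Sequence[float]) -> float:
    """sqrt(1 - rho); the clamp guards against small rounding errors."""
    rho = bhattacharyya_coefficient(p, q)
    return math.sqrt(max(0.0, 1.0 - rho))


template = normalize([4, 3, 2, 1])
same = bhattacharyya_distance(template, template)
disjoint = bhattacharyya_distance(normalize([1, 0]), normalize([0, 1]))
```

Identical histograms give a distance close to 0 and histograms with no overlapping bins give a distance of 1, so a smaller distance means a better match with the template.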

Let T be the preset threshold, and let $d_{Bhattacharyya}$ be the Bhattacharyya distance between the current feature histogram and the target template.

$$\begin{cases} \text{Update}, & d_{Bhattacharyya} \le T \\ \text{No Update}, & d_{Bhattacharyya} > T \end{cases} \tag{8}$$

The threshold T represents the maximum allowed Bhattacharyya distance. When $d_{Bhattacharyya} > T$, the current target is judged to be distorted: it may be occluded, its speed or behavior may have changed abruptly, or it may have left the frame. In this situation, the target needs to be relocated. When $d_{Bhattacharyya} \le T$, the current target is judged to be the real target. The target template is updated and the updated particle weight is calculated.

Step8: The weight of the updated particle is calculated. The Bhattacharyya distance corresponding to the feature histogram is $d_{FHH}$. The weight of the particle is given by formula (9), in which $\sigma_{FHH}$ represents the noise corresponding to the feature histogram and $\lambda$ is the weight coefficient.

$$w = \lambda \exp\left(-\frac{d_{FHH}^2}{2\sigma_{FHH}^2}\right) \tag{9}$$
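The weight update can be sketched as below, assuming that formula (9) has the Gaussian form $w = \lambda \exp(-d_{FHH}^2 / (2\sigma_{FHH}^2))$. The value `sigma_fhh=0.2`, the default coefficient of 1 and the final normalization step are hypothetical choices for this illustration and are not taken from the paper.

```python
import math
from typing import List


def particle_weight(d_fhh: float, sigma_fhh: float, coeff: float = 1.0) -> float:
    """w = coeff * exp(-d^2 / (2 * sigma^2)); smaller distance, larger weight."""
    if sigma_fhh <= 0.0:
        raise ValueError("sigma_fhh must be positive")
    return coeff * math.exp(-(d_fhh ** 2) / (2.0 * sigma_fhh ** 2))


def normalize_weights(raw: List[float]) -> List[float]:
    """Rescale the weights so that they sum to 1 (an added convenience step)."""
    total = sum(raw)
    if total <= 0.0:
        raise ValueError("at least one weight must be positive")
    return [w / total for w in raw]


distances = [0.1, 0.3, 0.6]  # hypothetical Bhattacharyya distances of three particles
raw = [particle_weight(d, sigma_fhh=0.2) for d in distances]
weights = normalize_weights(raw)
```

A small $\sigma_{FHH}$ concentrates the weight on the best-matching particles, while a large one spreads it more evenly.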

Step9: Target location estimation. Based on the particle weights calculated in the previous step, the true position of the target is estimated. In this paper, the robust mean method is used for the estimation: the particles within a certain range of the particle with the largest weight are selected, and the weighted average of these particles is taken as the estimated target position, as expressed in formula (10).

$$\hat{x}_k = E_{p(x_k \mid z_k)}(x_k) \approx \sum_{i=1}^{K} x_k^i\, w_k^i, \quad \text{if } \left\| x_k^i - x_k^m \right\| \le \varepsilon \tag{10}$$

Here $x_k^m$ is the particle with the greatest weight at moment k, and $\varepsilon$ is the maximum threshold allowed.
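The robust mean estimate can be sketched as follows. Two details are assumptions of this sketch rather than statements from the paper: the distance between particles is taken as the Euclidean distance between box centers, and the weights of the selected particles are renormalized so that the result is a proper average.

```python
import math
from typing import List, Tuple


def robust_mean(positions: List[Tuple[float, float]],
                weights: List[float],
                radius: float) -> Tuple[float, float]:
    """Weighted mean of the particles lying within `radius` of the best particle."""
    best = max(range(len(weights)), key=weights.__getitem__)
    bx, by = positions[best]
    sx = sy = sw = 0.0
    for (x, y), w in zip(positions, weights):
        if math.hypot(x - bx, y - by) <= radius:
            sx += x * w
            sy += y * w
            sw += w
    if sw <= 0.0:
        raise ValueError("the selected particles carry no weight")
    return (sx / sw, sy / sw)


# Two particles agree near (100, 50); the third is a far-away outlier.
positions = [(100.0, 50.0), (102.0, 50.0), (300.0, 400.0)]
weights = [0.5, 0.3, 0.2]
estimate = robust_mean(positions, weights, radius=10.0)
```

Compared with a plain weighted mean over all particles, the outlier at (300, 400) no longer pulls the estimate away from the cluster around the best particle.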

Step10: The target location of Step9 is output as the tracking result, and the resampling operation is carried out. N particles $\{X_k^i\}_{i=1,2,\dots,N}$ are resampled from the particle set $\{\hat{X}_k^i\}_{i=1,2,\dots,N}$ according to the importance weights, so that the following formula is satisfied:

$$\hat{w}_k^i = w_k^i = \frac{1}{N}, \quad i = 1, 2, \dots, N.$$

Then determine whether the current frame is the last frame of the video sequence. If so, end the algorithm. Otherwise, import the next video frame and return to Step5.
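The resampling step can be sketched as below. The paper does not name a resampling scheme, so systematic resampling is used here purely as an assumption; the only property taken from the text is that particles are drawn according to their importance weights and then all carry weight 1/N.

```python
import random
from typing import List, Optional, Sequence, Tuple, TypeVar

T = TypeVar("T")


def systematic_resample(particles: Sequence[T], weights: Sequence[float],
                        rng: Optional[random.Random] = None) -> Tuple[List[T], List[float]]:
    """Draw N particles in proportion to their weights; return them with weight 1/N."""
    rng = rng or random.Random()
    n = len(particles)
    step = sum(weights) / n
    start = rng.random() * step          # single random offset in [0, step)
    resampled: List[T] = []
    cumulative = weights[0]
    i = 0
    for j in range(n):
        pointer = start + j * step       # evenly spaced pointers
        while pointer >= cumulative and i < n - 1:
            i += 1
            cumulative += weights[i]
        resampled.append(particles[i])
    return resampled, [1.0 / n] * n


# All weight sits on particle "c", so every resampled particle is a copy of it.
particles = ["a", "b", "c", "d"]
weights = [0.0, 0.0, 1.0, 0.0]
new_particles, new_weights = systematic_resample(particles, weights, random.Random(0))
```

Particles with negligible weight are dropped and heavy particles are duplicated, which counters the weight degeneracy that otherwise builds up over the frames.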

Tracking Experiment and Analysis

Development Environment and Platform of Experimental System

This system is implemented on a personal notebook computer, and the development language is C++. The crowd density of a video scene is defined by formula (11). PD (population density) is expressed as a percentage, and the density levels are defined by formula (12).

$$PD = \frac{\text{Area of the video scene occupied by pedestrians}}{\text{Total area of the video scene}} \tag{11}$$

$$\text{Density level} = \begin{cases} \text{Low density}, & PD \text{ from } 0 \text{ to } 40\% \\ \text{Medium density}, & PD \text{ from } 40\% \text{ to } 65\% \\ \text{High density}, & PD \text{ from } 65\% \text{ to } 100\% \end{cases} \tag{12}$$
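The density definition can be illustrated with a short sketch. The function names and the example areas are hypothetical, and the choice of which class the boundary values 40% and 65% belong to is an assumption, since the text only lists the ranges.

```python
def population_density(pedestrian_area: float, scene_area: float) -> float:
    """Formula (11), returned as a percentage of the scene area."""
    if scene_area <= 0:
        raise ValueError("scene_area must be positive")
    return 100.0 * pedestrian_area / scene_area


def density_level(pd_percent: float) -> str:
    """Formula (12); assigning 40% to 'medium' and 65% to 'high' is an assumption."""
    if pd_percent < 40.0:
        return "low"
    if pd_percent < 65.0:
        return "medium"
    return "high"


# Hypothetical example: pedestrians cover 215,000 of the 307,200 pixels of a 640x480 frame.
pd_value = population_density(pedestrian_area=215_000, scene_area=640 * 480)
level = density_level(pd_value)
```

In this example about 70% of the frame is covered, so the scene falls into the high-density class.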

Experiment of FHH-PF Pedestrian Tracking Algorithm

In this section, pedestrian tracking experiments are carried out under occlusion between pedestrians, occlusion between pedestrians and background objects, human posture change, color similarity with background objects, scale change, abrupt changes of movement speed and direction, and so on. The algorithms of literature [6] and [7] are compared with the FHH-PF algorithm of this paper in experiments on 4 videos with different characteristics. Since the tracking target is only the head and shoulders of a pedestrian, the rectangle box should be marked according to the ratio of head-shoulder to whole body when selecting the target. In the pedestrian tracking experiments, 100 particles are used for initialization. The tracking rate of a single pedestrian in a video sequence is expressed by formula (13).

$$\text{Tracking rate} = \frac{\text{Number of successfully tracked frames}}{\text{Total number of frames of pedestrian } i \text{ in the video}} \tag{13}$$

The tracking performance of the algorithm is evaluated by horizontal position offset, vertical position offset and center position offset.

Horizontal offset: the distance between the center of the target and the center of the tracking result in the horizontal direction.

Vertical offset: the distance between the center of the target and the center of the tracking result in the vertical direction.


Center offset: the distance between the center of the target and the center of the tracking result.

There are 2 videos used in the experiment, as listed in Table 1.

Table 1. The characteristics of the test video sequence.

Video sequence number    Main features
1                        Unstructured scenes, severe occlusion, and color similarity interference.
2                        Unstructured scenes, occlusion, pedestrian posture change, fast motion, sudden change of movement direction.
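The evaluation measures defined above, the tracking rate of formula (13) and the horizontal, vertical and center offsets, can be computed as in the following sketch. How a frame is judged to be successfully tracked is not specified in the text, so the per-frame success flags are assumed to be given; the center offset is taken as the Euclidean distance, which is also an assumption.

```python
import math
from typing import Sequence, Tuple


def tracking_rate(success_flags: Sequence[bool]) -> float:
    """Formula (13): successfully tracked frames / frames containing the pedestrian."""
    if not success_flags:
        raise ValueError("at least one frame is required")
    return sum(1 for ok in success_flags if ok) / len(success_flags)


def offsets(truth_center: Tuple[float, float],
            tracked_center: Tuple[float, float]) -> Tuple[float, float, float]:
    """Return (horizontal, vertical, center) offsets in pixels."""
    dx = abs(tracked_center[0] - truth_center[0])
    dy = abs(tracked_center[1] - truth_center[1])
    return dx, dy, math.hypot(dx, dy)


rate = tracking_rate([True, True, False, True])
horizontal, vertical, center = offsets((100.0, 50.0), (103.0, 54.0))
```

Averaging the center offset over all frames of a video gives a per-video figure of the kind used to compare the algorithms.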

The FHH-PF algorithm proposed in this paper and the algorithms of literature [6] and literature [7] are used in tracking experiments on video 1 to video 4. The tracking results are compared in complex scenes involving serious occlusion, interference from objects of similar color, change of pedestrian posture, change of motion direction, scale change, illumination change and so on. The tracking results are analyzed to demonstrate the superiority of the FHH-PF algorithm in tracking under complex scenes. In literature [6], the HSV color feature is used as the target template of the particle filter, and the tracking algorithm corrects the tracking position of the target in order to achieve tracking. In literature [7], the two features of target color and edge are fused as the target template, and the Mean Shift framework is used to achieve pedestrian tracking. When the target is lost, it is retrieved using information around the target, motion information and other information.

[Figure 1: panels show the 781st, 793rd, 808th, 824th and 836th frames of video 1.]

As shown in Figure 1, video 1 is a high-density crowd scene. The target in the scene is partially occluded and its direction of motion changes suddenly. From the 781st frame to the 808th frame, the direction of the target movement does not change, and the three algorithms track the target very well. When the target turns in the 808th frame, serious drift occurs in the algorithm of literature [7]; judged by the drift offset, the algorithm of literature [7] fails to track. From the 808th frame to the 836th frame, the algorithm of literature [7] tries to retrieve the target, which greatly reduces the drift offset from the target. The algorithm of literature [6] and the algorithm in this paper track the pedestrian very well when the direction of the target movement changes abruptly.

[Figure 2: panels show the 121st, 153rd, 174th, 200th and 218th frames of video 2.]

Figure 2. Tracking results of three algorithms in video 2.

As shown in Figure 2, video 2 is a scene of an emergency in a public place. From the 121st frame to the 153rd frame, the target is running quickly, with a slight change of posture during the running. In this interval, the three algorithms track the pedestrian well without obvious drift. From the 153rd frame to the 174th frame, the pedestrian changes from an upright stance to a squatting posture. At this point, serious drift appears in the tracking results of literature [6] and [7], which is identified as tracking failure, while the algorithm in this paper still tracks the target well. From the 174th frame to the 200th frame, the target resumes upright running from the squat. At this time, the algorithm of literature [6] wrongly follows another pedestrian. The algorithm of literature [7] can track the target, but with a large offset. The algorithm in this paper still tracks accurately. In the 218th frame, the algorithm of literature [6] loses the target, and the tracking result of literature [7] still has a large offset. Our algorithm still tracks the target accurately.

Table 2. The comparison of the average tracking offset of the three algorithms in the above 4 videos.

Position offset (pixel)    Literature [6] algorithm    Literature [7] algorithm    FHH-PF algorithm (this paper)
Video 1                    1.75                        10.75                       0.85
Video 2                    49.25                       15.5                        1.75
Tracking rate              78%                         82%                         95%

Conclusion

In this paper, the DHS-FHH pedestrian detection algorithm and the FHH-PF pedestrian tracking algorithm are proposed. Experiments show that the detection performance based on the fused feature is better than that of a single feature, and the average accuracy is improved by 4%-5%. The algorithm proposed in this paper is less dependent on the speed and direction of the target motion, so it can quickly relocate the target when the speed or direction of the target changes.

Acknowledgment


Research Project of Shenzhen (JCYJ20160428153620486, JCYJ20170303140803747); Key Laboratory of Intelligent Multimedia Technology (201762005); Teaching Reform Project of Beijing Normal University (Zhuhai Campus) (201706).

References

[1] Jeon P H, Park C J, Kim J Y, et al. Design of Pedestrian Detection and Tracking System Using HOG-PCA and Object Tracking Algorithm [J]. Transactions of the Korean Institute of Electrical Engineers, 2017, 66.

[2] Nguyen V D, Nguyen H V, Tran D T, et al. Learning Framework for Robust Obstacle Detection, Recognition, and Tracking [J]. IEEE Transactions on Intelligent Transportation Systems, 2017, 18(6): 1633-1646.

[3] Li Y, Mekhalfi M L, Rahhal M M A, et al. Encoding Motion Cues for Pedestrian Path Prediction in Dense Crowd Scenarios [J]. IEEE Access, 2017, pp(99): 1-1.

[4] Li Ning, Gong Yuan, Xu Ling Ling, et al. Pedestrian detection combined with semantic features under visual attention mechanism [J]. Chinese Journal of Image and Graphics, 2016, 21(6): 723-733.

[5] Cao Rui, Wang Min, Duan Xiaoxiao. Pedestrian detection and tracking in densely populated scenes [J]. Computer and Modernization, 2017(1): 75-78.

[6] Liu Mengfei, Fu Xiaoyan, Shang Yuan Yuan, et al. Pedestrian tracking based on HSV color characteristics and contribution degree [J]. Laser and Optoelectronics Progress, 2017(9): 137-147.