Xiaojing Zhang
Department of Software Engineering, School of Software, Harbin University of Science and Technology, Harbin, China Email: [email protected]
Yajie Yue, and Chenming Sha
School of Software, Harbin University of Science and Technology, Harbin, China Email:[email protected], [email protected]
Abstract—Object tracking has always been a hotspot in the field of computer vision, which has a range of applications in real world. The object tracking is a critical task in many vision applications. The main steps in video analysis are: detection of interesting moving objects and tracking of such objects from frame to frame. Most of tracking algorithms use pre-defined methods to process. In this paper, we introduce the Mean shift tracking algorithm, which is a kind of important no parameters estimation method, then we evaluate the tracking performance of Mean shift algorithm on different video sequences. Experimental results show that the Mean shift tracker is effective and robust tracking method.
Index Terms—Object tracking, computer vision, Mean shift, no parameters estimation
I. INTRODUCTION
Visual tracking in a video sequence on the target of interest the effective tracking has always been a typical problem in the field of computer vision. These two issues are reflected in the performance of the tracking algorithm is real-time and robustness. With the continuous development of computer vision, pattern recognition technology and digital video technology, many scholars its in-depth research, a lot of new ideas and methods. Of which, how to take into account real-time and soundness of the system is always the forefront of research. Combined with the current situation, how to further improve the robustness of the system on the basis of real- time tracking algorithm is the main purpose of this paper. Object tracking has always been a hotspot in the field of computer vision. Its application includes video monitoring, human computer interaction, vehicle tracking and so on [1-5]
Recently, to select these right features, which play an important role in the tracking object in a video sequences [12-13]. Feature selection is closely similarly to the object and target representation. We always analysis the object motion to focuses on simple characteristics such as color, texture, shape, geometry and so on. Generally, most of tracking algorithms always use a combination of these features to track an object. In general, these features are selected manually by the user, which depend on the many fields and application domains. Yet, the problem of automatic feature selection has brought more and more attention in computer vision and pattern recognition, namely the detection of objects and then to track object in video sequences. As we all known, the object tracking is usual the first step in activities analysis system, including of interactions and relationships between objects of interest. Most of tracking algorithms have been proposed and improved. Object tracking is an estimation question or the trajectory analysis for an object when moving through a sequence of images or video. These usual tracking methods consist of block-matching, KLT, the Kalman filter [14-15], Mean shift [16-17], Camshift [18- 19] and so on.
. Many of these applications which require reliable object tracking techniques should meet in real- time constraints. To track the object in a video sequence should be defined the dynamic entities which constantly change under few of several influence factors. The challenges in a robust tracking algorithm are caused by the presence of noise, occlusion, background clutter, varying viewpoints and illumination changes. Recently, a few of algorithms are proposed to overcome these difficulties [6-9]. In a tracking video sequence, an
unknown target should be defined as anything which is interesting to analysis. For example, one people walking on a road, cars in the road, planes in the air, hand or a face in motion and so on. Recently, the form and appearance representations are classified into three families as the representation by points, representation by bounding boxes and the object silhouettes and contour [10-11].
Object tracking algorithm based on color histogram for Mean shift which has obtained a wide range of applications, because of the method is simple and good real-time, it can deal with target deformation and some shelter. In these algorithms, a lot of things used statistical model for target tracking with neighbor pixel domain expression, often use a reference model through the nonlinear estimation for the parameters of the moving target model [20-21]. But in actual problems, we can't get the current image related to probability density distribution of information, and the lack of probability density distribution of prior knowledge, these parameters
not easy to determine the limit and estimate methods in the application of target tracking. Therefore, no parameters estimation method is used in the estimation of target best position information, also is in the continuous point probability density value available near the point of observation samples probability estimation. Mean shift algorithm is a kind of important no parameters estimation method, which has been successfully applied in target tracking and related computer vision field [11-12].
The aim of this paper is concerned with object tracking in video streams, object tracking algorithm based on Mean-shift is introduced in this paper. Mean-shift tracker which is based on color histogram and it is a kind of important no parameters estimation method, then we evaluate the tracking performance of Mean shift algorithm on different video sequences. Experimental results show that the Mean shift tracker is effective and robust tracking method.
II. MATERIALS
In the visual tracking, a method based on gradient matching algorithm is adopted to find the image pattern corresponding to the target image in the current frame efficiently. This kind of method is the key to obtain the similarity function in the current frame of probability density distribution once get the probability density distribution, we can accord to its gradient change direction matching search for the best path, in order to improve the efficiency of the pattern matching, and meet the requirement of real-time tracking algorithm. Because we can’t obtain probability density distribution information of the current frame and it is lack of the prior knowledge of the overall probability distribution. So we only use non-parameter estimation method to get the gradient of probability density. Non-parameter density estimation is based on the following thought that continuous point’s probability density value can be estimated by using points of observation of the area nearby samples set.
A. Kernel Density Gradient Estimation
Suppose x x1, 2, are N independent distribution of xN n dimension vector space. The kernel density estimate is proposed as following [25]: 1 1 ˆ ( ) ( ) ( ) N n i N i x x p x Nh K h − = − =
∑
(1) where K(x) is a scale function, it must meet:sup ( ) n y R K y ∈ < ∞ (2) n ( ) R K y dy< ∞
∫
(3) lim n ( ) 0 y→∞ y K y = (4) ( ) 1 n R K y dy=∫
(5) The parameter must meet:lim ( ) 0
N→∞h N = (6)
To ensure estimated progressive unbiasedness. Minimum-variance estimation continuity by type guarantee:
lim n( )
N→∞Nh N = ∞ (7) Probability meaning of uniform continuity by the following equation:
2
lim n( )
N→∞Nh N = ∞ (8) We use density gradient estimates as a probability density estimation gradient:
1 1 ˆ ( ) ( n) N ( i) x N x i x x p x Nh K h − = − ∇ ≡
∑
∇ (9) 1 1 1 1 1 ˆ ( ) ( ) ( ) ( ) ( ) N n i x N x i N n i i x x p x Nh K h x x Nh K h − = + − = − ∇ ≡ ∇ − = ∇∑
∑
(10) where: 1 2 ( ) ( ) ( ) ( ) ( , ,... )T n K y K y K y K y y y y ∂ ∂ ∂ ∇ ≡ ∂ ∂ ∂ (11) (1) Progressive biasIf function K(x) satisfies the formula from (2) to (4) and also to satisfy the equation (6), so we can have:
( )
n
R g x dx< ∞
∫
(12) The sequence of functions as followings:1
( ) ( ) n ( ( ) ) ( )
n
N R
g x ≡h− N
∫
K h− N y g x−y dy(13) There are all points from the function g(x) will be convergence as:lim N( ) ( ) n ( ) R
N→∞g x =g x
∫
K y dy (14) According to the above analysis, the function will be progressive and unbiased estimates meet the criteria below:lim ( ) 0
x→∞p x = (15) In the practical application of the probability density function is basically satisfy this condition. Therefore, we can obtain as followings:
ˆ
lim { x N( )} x ( )
N→∞E ∇ p x = ∇ p x (16) (2) Continuity
if the second moment of the continuous point ∇xp x( ) is continuous for the function ˆ∇xpN( )x as:
2 ˆˆ
lim { x N( ) x N( ) } 0
N→∞E ∇ p x − ∇ p x = (17)
To satisfy the following criterions [26]: lim ( ) 0 N→∞h N = (18) 2 lim n( ) N→∞Nh N = ∞ (19) sup ( ) n i y R K y ∈ ′ < ∞ (20) ( ) n i R K y dy′ < ∞
∫
(21) © 2013 ACADEMY PUBLISHERlim n i( ) 0 y→∞ y K y ′ = (22) Where: ( ) ( ) i i k y K y y ∂ ′ ≡ ∂ (23) (3) Uniform continuity For ε > , if : 0 ˆ lim Pr su p ( ) ( ) 0 n x N x N R p x p x ε →∞ ∇ − ∇ > = (24)
So the gradient estimation ∇ˆxpN( )x is called as uniformly continuous. If the function∇ˆxpN( )x meets the following equations, the gradient estimation will be uniformly continuous. lim ( ) 0 N→∞h N = (25) 2 2 lim n ( ) N Nh N + →∞ = ∞ (26) ( ) exp( ) ( ) n T x R G W =
∫
jW X ∇ K x dX (27) So ∇ˆxp x( ) is uniformly continuous.B. Mean Shift Theory
To give the kernel function ( )K x and the kernel function bandwidth h , probability density estimation of the point x is represented as following:
1 1 ˆ ( ) ( ) n i d i x x f x K h nh = − =
∑
(28) Generally, Epanechnikov kernel is a common kernel function: 2 1 1 ( 2)(1 ), 1 ( ) 2 0, 1 d E c d x if x K x if x − + − < = ≥ (29) where cd 2 / 2 1 ( ) (2 ) exp( ) 2 d N K x = π − − xis the sphere volume.Another is often used kernel function is Gaussian kernel:
(30) To define profile function of kernel function K to meet as followings :[0, )k ∞ → . So the corresponding k (x) R for Epanechnikov kernel function is described as:
1 1 ( 2)(1 ), 1 ( ) 2 0, 1 d E c d x if x k x if x − + − < = ≥ (31)
The equation (3) corresponding k(x) is:
/ 2 1 ( ) (2 ) exp( ) 2 d N k x = π − ⋅ − x (32) Fig. 1 shows the profile of Gaussian kernel.
Using the above definition, density estimation formula can rewrite as following:
2 1 1 ˆ ( ) ( ) n i d i x x f x k h nh = − =
∑
(33) To define: ( ) ( ) g x = −k x′ (34) where the first order derivative of k(x) is existed in the interval except few points by hypothesis. Define the kernel function:Fig.1 Gaussian profile 2
( ) ( )
G x =Cg x (35) where C is the normalized constant. Then we use th bins. Fig.5, Fig.6, Fig.7, Fig.8 and Fig.9 show the tracking results on different image sequences by Mean shift tracker. From Fig.5 we can see that the mean shift based tracer proved to be robust tracking results.
From Fig.6we can obtain that the mean shift tracking algorithm will be robust under these conditions with varying viewpoints and illumination changes. In the David image sequence, the illumination is varying from weak to be strong, as well as the head has a great angle rotation. So udder such conditions, the tracker still has a good tracking performance and it is very robust.
#330 #335 #340 #345 #350 #355 #360 #365 #370 #375 #380
#200 #210 #220
#230 #240 #250
#300 #310 #3200
Fig.6 Results of Mean shift tracking on David image sequence
#10 #20 #30 #120 #130 #140 #290 #300
Fig.7 Results of Mean shift tracking on Face image sequence
#1 #5 #10
#15 #20 #25
#30 #35 #40
Fig.8 Results of Mean shift tracking on the ball sequence
#001
#020
#060
#160
Fig.9 tracking result of girl sequence
To demonstrate the efficiency of the mean shift tracking algorithm, Fig.8 presents the tracking results under the object in a fast motion condition. On the ball sequence, the ball has a greater speed and do back and forth movement up and down. The Mean shift tracking
algorithm can obtain a better tracking performance and it can be robust to track the ball with a fast movement.
V.CONCLUSION
Visual tracking is a hot research topic in the field of computer vision, have important applications in the field of intelligent video surveillance, robot navigation, intelligent transportation, and defense security. In order to meet the actual needs, the current visual tracking methods need to deal with a lot of practical problems: how to establish an effective appearance model, while reasonably describe the target can better distinguish background; how to select a fast and efficient tracking inference algorithm to meet the real-time requirements; multi-target tracking correctly associated target. In order to solve these problems, visual tracking involves a number of academic disciplines, including: image processing, pattern recognition, probability theory and mathematical statistics, optimization theory and control theory. Visual tracking is a both theoretical and practical significance of the research theme
The Mean shift tracking algorithm basing on the Mean shift use the color histogram to track object, because of its simpleness perfect real-time performance, the ability to process the target deformation and the block situation and other different situation, and also the accuracy, so the Mean shift video tracking algorithm has been widely used. Experimental results show that the Mean shift tracking algorithm is effective, robust and can be used for tracking in different scenes.
REFERENCES
[1] Xue Mei, Haibin Ling, “Robust visual tracking using L1 minimization”, 2009 IEEE 12th International Conference on Computer Vision, 2009, pp. 1436-1443.
[2] Zhou H, Yuan Y, Shi C. Object tracking using SIFT features and mean shift. Computer Vision and Image Understanding, 2009, 113(3): 345-352.
[3] Krainin M, Henry P, Ren X, et al. Manipulator and object tracking for in-hand 3d object modeling. The International Journal of Robotics Research, 2011, 30(11): 1311-1327. [4] Grabner H, Matas J, Van Gool L, et al. Tracking the
invisible: Learning where the object might be//Computer Vision and Pattern Recognition (CVPR), 2010 IEEE Conference on. IEEE, 2010: 1285-1292.
[5] Ning J, Zhang L, Zhang D, et al. Robust object tracking using joint color-texture histogram. International Journal of Pattern Recognition and Artificial Intelligence, 2009, 23(07): 1245-1263.
[6] Babenko B, Yang M H, Belongie S. Robust object tracking with online multiple instance learning. Pattern Analysis and Machine Intelligence, IEEE Transactions on, 2011, 33(8): 1619-1632.
[7] A. Yimaz, O. Javed and M. Shah, “Object tracking: A survey,” 2006, Vol. 4, pp. 13-20.
[8] Saravanakumar, S. Vadivel and A. Saneem Ahmed, “Multiple human object tracking using background subtraction and shadow removal techniques”, The International Conference on Signal and Image Processing ,2010, pp. 15-17
[9] Meng G, Jiang G, “Real-time illumination robust maneuvering target tracking based on color invariance”,
Proceedings of the 2nd International Congress on Image and Signal, 2009, pp. 1-5
[10] Gammeter S, Gassmann A, Bossard L, et al. Server-side object recognition and client-side object tracking for mobile augmented reality//Computer Vision and Pattern Recognition Workshops (CVPRW), 2010 IEEE Computer Society Conference on. IEEE, 2010: 1-8.
[11] N. A Gmez. “A Probabilistic Integrated Object Recognition and Tracking Framework for Video Sequences”, Universitat Rovira I Virgili, PHD thesis, Espagne, 2009.
[12] Collins,R. T, Yanxi Liu and Leordeanu, M, “Online selection of discriminative tracking features”, IEEE Transactions on Pattern Analysis and Machine Intelligence , 2010, Vol. 10, pp. 1631-1643.
[13] Ying-Jia Yeh, Chiou-Ting Hsu, “Online Selection of Tracking Features Using AdaBoost”, IEEE Transactions on Circuits and Systems for Video Technology, 2009, Vol. 3, pp. 442-446.
[14] X. Sun, H. Yao and S. Zhang, “A Refined Particle Fil ter Method for Contour Tracking”, Visual Communications and Image Processing, 2010, Vol. 8, pp. 1-8.
[15] D. Comaniciu, V. Ramesh and P. Meer, Kernel-Based Object Tracking”, IEEE Transactions on Pattern Analysis and Machine Intelligence, 2003, Vol. 14, pp. 564-577 [16] D. Chen, F. Bai, B. Li and T. Wang, “A Real-time
Tracking Method for Scout Robot”, IEEE/RSJ International Conference on Intelligent Robots and Systems, 2009, pp. 1001-1006.
[17] L. Ido, L. Michael and R. Ehud, “Mean Shift tracking with multiple reference color histograms”, Computer Vision and Image Understanding, 2010, pp. 400-408
[18] Wang Z, Yang X, Xu Y, et al. Camshift guided particle filter for visual tracking. Pattern Recognition Letters, 2009, 30(4): 407-413.
[19] Yin M, Zhang J, Sun H, et al. Multi-cue-based CamShift guided particle filter tracking. Expert Systems with Applications, 2011, 38(5): 6313-6318.
[20] Ess A, Schindler K, Leibe B, et al. Object detection and tracking for autonomous navigation in dynamic environments. The International Journal of Robotics Research, 2010, 29(14): 1707-1725.
[21] Saleemi I, Hartung L, Shah M. Scene understanding by statistical modeling of motion patterns//Computer Vision and Pattern Recognition (CVPR), 2010 IEEE Conference on. IEEE, 2010: 2069-2076.
[22] Chu D M, Smeulders A W M. Color invariant surf in discriminative object tracking//Trends and Topics in Computer Vision. Springer Berlin Heidelberg, 2012: 62-75. [23] Wu Z, Thangali A, Sclaroff S, et al. Coupling detection and data association for multiple object tracking//Computer Vision and Pattern Recognition (CVPR), 2012 IEEE Conference on. IEEE, 2012: 1948-1955.
[24] Zulkifley M A, Moran B. Robust hierarchical multiple hypothesis tracker for multiple-object tracking. Expert Systems with Applications, 2012.
[25] Cacoullos T., Estimation of a multivariate density, Ann. Inst. Statist. Math., Vol.18, pp. 179-189.
[26] Fukunaga K., Introduction to statistical pattern recognition, Second Ed., Academic Press, New York, 1990.
Xiaojing Zhang received the B.S. degree in Communication
Engineering from Northeast Dianli University in 2002. And she received the M.S. degree in Software Engineering from Yunnan University in 2005. Her primary research area focused on
Embedded System. Currently she is a Lecturer of the School of Software, Harbin University of Science and Technology, Harbin, China.
Yajie Yue received the B.S. degree in automation from Harbin
University of Science and Technology in 2004. And she received the M.S. degree in Control Theory and Control Engineeringfrom Harbin University of Science and Technology in 2007. She has been working in Harbin University of Science
and Technology from 2007. Currently her primary research area focused on IC design and Computer Architecture.
Chenming Sha was born in Heilongjiang Province, China, in
1981. She received the B.S. degree in Computer Science and Technology from Beijing Normal University, China in 2004, and M.S. degree in Computer application technology from Beijing University of Technology, China in 2007. Now she is working at Harbin University of Science and Technology, Harbin, China as a Lecturer.