Background Modeling - A survey on object detection and tracking algorithms

Figure 3.9: Output after background substraction using Julio cezar method The basic principle of background subtraction is to compare a static background frame with the current frame of the video scene pixel by pixel. This technique builds a model of the background any frame can be compared with the model to detect zones where a significant difference occurs. Some steps for the background subtraction, i.e. first developement of a background model of the scene is done, then background subtraction to detect foreground object.

Julio cezar et al. [3] has proposed a background model, and incorporates a novel technique for shadow detection in gray scale video sequences. In the ﬁrst stage, a

20-40 seconds) to distinguish moving pixels from stationary pixels.

In the second stage, only those stationary pixels are processed to construct the initial background model. Let V be an array containing N consecutive images,Vk_{(i, j)}

be the intensity of a pixel (i,j) in the K-th images of V, σ(i, j) and λ(i, j) be the standard deviation and median value of intensities at pixel (i,j) in all images in V, respectively. The initial background for a pixel (i,j) is performed by a three-dimensional vector: the minimum m(i,j) and maximum n(i,j) intensity values and the maximum intensity diﬀerence d(i,j) between the consecutive frames observed during this training period. Then the background model B(i,j)=[ m(i,j), n(i,j), d(i,j)] is obtained As follows:      m(i, j) n(i, j) d(i, j)     =      minzVz(i, j) maxzVz(i, j)

maxz|Vz(i, j)− Vz−1(i, j)|

     Where ’z’ are the frames satisfying|Vz_{(i, j)λ(i, j)}_{| ≤ 2σ(i, j)}

After the training period, an initial background model B(i,j) is then obtained. Then each input image It(i, j) of the video sequence is compared to B(i,j), and a

pixel (i,j) is classiﬁed as a foreground pixel if It(i, j) > (m(i, j)− kµ) and It(i, j) < (n(i, j) + kµ)

After the background subtraction by using the above method, some shadow region pixels which is wrongly classiﬁed as foreground objects.so they have employed a shadow region detection method to remove the shadow pixels from detected foreground pixels.

3.12.1 Analysis

Here analysis of different detection algorithm is done. In the experiment of taking different video, it is shown that, Julio cezar method is the best object detection method. After getting the silhouette of the background objects, the contour of the foreground objects after some post processing of the resulting silhouette like region filling is performed. Then extraction of the feature from the contour is done.But the

feature extraction part is not discussed here.

3.13 Object Tracking

After the object detection is achieved, the problem of establishing a correspondence between object masks in consecutive frames should arise. Obtaining the correct track information is crucial for subsequent actions, such as object identiﬁcation and activity recognition. For this situation, block matching technique is used.

3.13.1 Block Matching Method

The entire process of tracking the moving object is illustrated in the following Fig 2.6. The block matching method is well described in [4], which we have applied here. Block matching is a technique for tracking the interest moving object among the moving objects emerging in the scene. In this article, the blocks are defined by dividing the image frame into non-overlapping square parts. The blocks are made based on peripheral increment sign correlation (PISC) image Satoh et al., 2001; Sugandi et al., 2007 that considers the brightness change in all the pixelsof the blocks relative to the considered pixel. Fig. 9 shows the block in PISC image with block size is 55 pixels. Therefore, one block consists of 25 pixels. The blocks of the PISC image in the previous frame are defined as shown in Eq. (2.3). Similarly, the blocks of the PISC image in the current frame are defined in Eq. (2.4). To determine the matching criteria of the blocks in two successive frames, evaluation is done using correlation value that expresses in Eq. (2.5). This equation calculates the correlation value between block in the previous frame and the current one for all pixels in the block. The high correlation value shows that the blocks are matched each other. The interest moving object is determined when the number of matching blocks in the previous and current frame are higher than the certain threshold value. The

threshold value is obtained experimentally. bnp =    1, if fnp≥f(i,j) o, otherwise (3.22) b′_np =    1, if fnp≥f(i,j) o, otherwise (3.23) corrn ₌ N ∑ P =0 bnp∗ b ′ np+ N ∑ P =0 (1− bnp)∗ (1 − b ′ np) (3.24)

where p & p′ are the block in the previous and current frame, n is the number of block and N is the number of pixels of block.

3.13.2 Tracking Method

The tracking method used in this article can be described as following. The matching process is illustrated in Fig.2.6. Firstly, blocks and the tracking area are made only in the area of moving object to reduce the processing time. The previous frame is devided into block size (block A) with 9x9 pixels in the previous frame. It is assume that the object coming ﬁrstly will be tracked as the interest moving object. The block A will search the matching block in each block of the current frame by using correlation value as expresses in Eq.(2.5). In the current frame, the interest moving object is tracked when the object has maximum number of matching blocks. When that matching criteria is not satisﬁed, the matching process is repeated by enlarging the tracking area (the rectangle with dash line).The blocks still are made inside the area of moving object. When the interest moving object still cannot be tracked, then the moving object is categorized as not interest moving object or another object and the tracking process is begun again from the begin.

3.13.3 Feature Extraction

The feature of objects extracted inthe spatial domain is the position of the tracked object. The spatial information combined with the features in time domain represents

the trajectory of the tracked object, so the movement and speed of the moving objects can be estimated that needs to tracked. Therefore, the features of spatial domain are very important toobject identiﬁcation. The bounding box deﬁned in Eq. (2.4) is used as spatial information of moving objects.

After getting the interest moving object,then extraction of interest moving object by using a bounding box. The bounding box can be determined by computing the maximum and minimum value of x and y coordinates of the interest moving object according to the following equation:

B_mini ={(xi_min, y_mini )|x, y ∈ Oi } (3.25)

B_maxi ={(xi_max, y_maxi )|x, y ∈ Oi } (3.26) where Oi _{denotes set of coordinate of points in the interest moving object i,B}i

min

is the left top corner cordinates of the interest moving object i, and B_maxi is the right bottom corner cordinates of the interesting moving object i.In the chapter 4 shows the bounding box of the object tracking.

Figure 3.10: Video tracking results

In document A survey on object detection and tracking algorithms (Page 63-67)