Fast Relative Speed Moving Object Detection Algorithm

7.5.1 Predictive Displacement Estimation

Currently, the region for template matching is limited to 16x16 pixels around the

averaged amplitude and direction of MVs of the identified region which may contain

relatively fast moving objects. Since the target object is moving, the average amplitude

and direction of the MVs should be compensated by the predicted speed change (or

acceleration) of the object. The change in speed of the detected object can be predicted

by the amplitude of MVs of previous frames. With a basic kinematic model for the

movement of the moving object, the prediction can be refined by the use of Kalman

filter.

7.5.2 The Use of Bi-Prediction Motion Vector

In the current implementation, the MVs from P-frames were used to evaluate the

movements of objects in successive images. As discussed in Chapter 2, MVs from

H.264/AVC can be erroneous as they were used primarily for video compression rather

than precision object motion estimation. One of the very important pieces of

information in B-frames is bi-prediction motion vectors of some macroblocks. The

bi-prediction MVs can be used to refine the accuracy of MVs in P-frames. By taking

into account the spatial and temporal consistency of object movements in successive

frames, MV outliers can be detected and eliminated. In this connection, the false

positive rate can further be reduced.

One point to note is that MVs are estimated using the current frame as the coordinate

reference. The coordinates in the current frame are always in integer form, aligning to

the macroblock boundaries. For a bi-predictive macroblock in a B-frame, each pixel in

173

to the two bi-predictive MVs. The bi-predictive MVs can be of sub-pixel units of up to

1/4 precision rather than integer. Also, they are not limited to be aligned to the

macroblock boundaries.

Therefore, to utilise the bi-predictive MVs in B-frames to improve the reliability of

MVs in P-frames, the coordinate reference of the MVs in the B-frame has to be

changed to use the current P-frame as reference. This involves the construction of a

MV map by rounding or truncating the sub-pixel MVs to integer pixels. After that, the

MV map can be used to compare with the MVs obtained from the current P-frame.

MVs in the current P-frame can be discarded if the discrepancies are larger than certain

thresholds, indicating they are temporally inconsistent to the corresponding MVs found

in the previous B-frame.

7.5.3 The Algorithm for Template Matching

Currently, the search algorithm for template matching is performed by simple

exhaustive spiral search about the estimated displacement from the position in the

previous frame. The search can be computationally inefficient especially when a match

template cannot be found inside the defined search range. The speed of the search

algorithm can be improved by employing smarter search algorithm. Similar fast search

algorithms for H.264/AVC motion estimation can be considered, such as the Diamond

Search algorithm (Tham and S. et al., 1998), UMHexagonS algorithm (Chen and Xu et

al., 2006), and Image Edge Assisted Search algorithm (Chan and Siu, 2001). According

to the test results of these authors, all these fast search algorithms can reduce the

computation cost by at least 80%.

Also, the decision on successful template matching relies on the evaluation of sum of

174

cost for SAD evaluation can be reduced by using integral image (Schweitzer and Bell

et al., 2002, Viola and Jones, 2001). The properties of Integral Image allow rapid and

efficient computation. Figure 7-4 illustrates how the Integral Image can facilitate rapid

computation of sums and differences of rectangular regions. Figure 7-4(a) shows a

rectangle at location from Po(0,0) to P(x,y). The sum of pixel values above and on the

left of P(x,y) is Px,y as expressed in Equation (7.5).

, ' , ' ( ', ') x y x x y y P i x y ≤ ≤ =

∑

(7.5)

where i x y

(

', '

)

is the pixel value at point ( ', ')x y . The relationship of area A, B, C

and D to the point P1, P2, P3 and P4 as shown in the Integral image in Figure 7-4(b)

can be expressed as Equation (7.6).

1 , 2 , 3 , 4

P =A P = +A B P = +A C P = + + +A B C D (7.6)

By solving these equations, D can be expressed as Equation (7.7).

1 4 2 3

D=P+P −P −P (7.7)

So, the area D can be evaluated very quickly by knowing the sum of pixel values at the

four corners of D.

Figure 7-4: Integral image. (a) sum of pixel values above and left of P(x,y), (b) sum of pixel values above and left of P4(x,y) = A+B+C+D

175

In addition to the use of search algorithm, the moving direction of the identified

template can also be estimated from the MVs from the H.264/AVC encoder. During

the HV mode, the template identified from HG mode can be evaluated for the moving

direction of blocks inside the template by referring to the MVs from the H.264/AVC

encoder. Since the MVs from H.264/AVC encoder are referring to the movement from

the current frame to the previous frame, a reverse MV map needs to be constructed so

that the movement of the blocks in the template from the previous frame to the current

frame can be evaluated. After that, if the resultant movement of the whole template is

consistent to the estimated movement, the HV can be regarded as successful. Similarly,

tracking can perform in a similar way as in the HV mode.

7.5.4 Reducing False-Positive Detection

As mentioned in Chapter 4.4.4, there were cases of false-positive HG and HV for MV

based moving object detection. The false-positive hypothesis can pass the HV stage

because the identified displacement in the HV stage is within the allowable limit of the

estimated displacement evaluated in the HG stage, even though the estimated

displacement found in HG is erroneous.

The false positive detection can be reduced by evaluating the direction of movement of

the template in HV stage. If the movement of the template from the HG frame to the

HV frame is moving away from the ego vehicle, the HV can be treated as failed. And if

the direction of movement of the template is pointing to the FOE, and the amplitude of

the displacement is the same as the displacement due to ego motion, the HV can also

176

In document Moving object detection for automobiles by the shared use of H 264/AVC motion vectors : innovation report (Page 189-193)