
3.7 Analysing Motion Estimation Experiments

3.7.1 Keypoint detection

Figures 3.15 and 3.16 show the results of different keypoint detection algorithms applied across three separate scenes. The objective of this experiment was to explore the effectiveness of motion estimation and how the results vary between the algorithms. Each frame was selected because motion estimation had difficulty eliminating noise in it. Each video sequence contains several frames where the image stitching is good enough that no extraneous noise is found. For a comparison, however, it was desirable to use frames in which every variant of the algorithm exhibits noise to some extent.

The helicopter frame shows some disparity between the methods. Both SIFT and BRISK show the noise from the road markings on the right more clearly than SURF and ORB. BRISK eliminates the road markings on the left completely, as does ORB. Whilst the SURF algorithm detects both verge lines, its detection of this line is fainter than that of all the others (a fainter detection means a smaller shift in alignment). ORB is the most noise free in this scenario.

The panning scenario, which should not produce any detections because it contains no moving objects, is fairly consistent across all four techniques, each exhibiting small detections of background. SIFT and BRISK both detect some line noise in the bottom right of this frame. ORB has small detections in this region, and the SURF algorithm is the least noisy, with no noise detected in the bottom right.

The z-axis motion gives the noisiest result, with each algorithm producing many detections. This highlights a weakness in the motion estimation approach: it is susceptible to noise and cannot detect moving objects when the camera motion is in scale space. Notice how the cars in the z-axis frame do not register even as noise. In this scenario, both SURF and ORB are slightly better at discriminating background noise.

In terms of computational performance, both BRISK and ORB perform keypoint detection faster than SIFT or SURF, with SIFT the slowest and ORB the quickest. Despite SIFT being the slowest, SURF detects the most keypoints, followed by BRISK and then ORB. In terms of scaling with image size, BRISK scales the worst despite being the second fastest, with its times increasing by orders of magnitude between video sequences. ORB maintains log-linear scalability with image size.
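The link between alignment error and the strength of a noise detection can be illustrated with a toy example. The sketch below is not the implementation used in these experiments: it assumes a pure horizontal translation in place of a full homography, and the frame contents and the helper names (`make_frame`, `shift_right`, `abs_difference`, `peak`) are hypothetical. It aligns a previous frame to the current one and takes the absolute difference, which is zero for a static background only when the alignment is exact.

```python
def make_frame(centre, width=12, height=4):
    """Dark road surface (20) with a soft-edged bright marking centred on column `centre`."""
    profile = {-1: 120, 0: 220, 1: 120}
    return [[profile.get(x - centre, 20) for x in range(width)] for _ in range(height)]


def shift_right(frame, dx):
    """Shift every row right by dx pixels, repeating the edge pixel to fill the gap."""
    width = len(frame[0])
    return [[row[min(max(x - dx, 0), width - 1)] for x in range(width)] for row in frame]


def abs_difference(a, b):
    """Per-pixel absolute difference of two equally sized frames."""
    return [[abs(p - q) for p, q in zip(row_a, row_b)] for row_a, row_b in zip(a, b)]


def peak(image):
    """Largest pixel value in the image."""
    return max(max(row) for row in image)


previous = make_frame(centre=4)
current = make_frame(centre=6)  # the camera motion moved the scene 2 px to the right

exact = abs_difference(shift_right(previous, 2), current)       # correct alignment
off_by_one = abs_difference(shift_right(previous, 1), current)  # 1 px alignment error
unaligned = abs_difference(previous, current)                   # 2 px alignment error

print(peak(exact), peak(off_by_one), peak(unaligned))
```

In this toy scene the static marking leaves no residue when the alignment is exact, a weak residue for a 1 px error and a stronger one for a 2 px error, which matches the reading of a fainter detection as a smaller shift in alignment.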

3.7.2 Keypoint matching

Figure 3.17 compares the matching algorithms used in this experiment. For consistency, the SURF algorithm is used as the baseline keypoint detector. It was chosen because its output provides a large number of keypoints, which lends itself to filtering the matches in the next stage. If there are too few matches, a homography matrix cannot be generated. In pure matching, the FLANN algorithm provides the same number of matches as the brute force algorithm. Consistent with this, the resulting output videos are the same. The brute force method is computationally faster than FLANN in both scenarios and scales better, increasing by only 11 ms for a more complex frame compared with a 17 ms increase for FLANN.
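As an illustration of what brute force matching does, the following is a minimal sketch in plain Python rather than the matcher used in the experiments. It assumes floating-point descriptors compared by Euclidean distance (as is usual for SURF) and uses made-up three-dimensional descriptors; the names `brute_force_match` and `can_estimate_homography` are hypothetical. The minimum of four matches reflects the fact that a homography has eight degrees of freedom and each point correspondence supplies two constraints.

```python
import math

MIN_MATCHES_FOR_HOMOGRAPHY = 4  # 8 unknowns, 2 equations per correspondence


def brute_force_match(query, train):
    """For each query descriptor, exhaustively find the closest train descriptor.

    Returns a list of (query_index, train_index, distance) tuples.
    """
    matches = []
    for qi, q in enumerate(query):
        ti = min(range(len(train)), key=lambda i: math.dist(q, train[i]))
        matches.append((qi, ti, math.dist(q, train[ti])))
    return matches


def can_estimate_homography(matches):
    """A homography needs at least four point correspondences."""
    return len(matches) >= MIN_MATCHES_FOR_HOMOGRAPHY


# Toy descriptors; real SURF descriptors have 64 or 128 dimensions.
query = [(0.0, 0.0, 1.0), (1.0, 0.0, 0.0), (0.5, 0.5, 0.0)]
train = [(1.0, 0.1, 0.0), (0.0, 0.0, 0.9), (0.4, 0.6, 0.0)]

matches = brute_force_match(query, train)
for qi, ti, dist in matches:
    print(f"query {qi} -> train {ti} (distance {dist:.3f})")
print("enough matches for a homography:", can_estimate_homography(matches))
```

An approximate matcher such as FLANN replaces the exhaustive inner search with an index lookup, so it returns the same matches whenever the index happens to find the true nearest neighbour for every descriptor.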

3.7.3 Keypoint filtering

Figure 3.18 compares the filtering algorithms applied to the keypoint matches. The objective is to remove inaccurate keypoint matches that would skew the homography generation. SURF and the brute force matcher are used in these examples. The KNN and radius filters are both cross check filters, and the outlier filter is a simple distance measure filter.

In the helicopter video the radius filter has slightly better noise reduction than the KNN filter. The verge line on the right is less pronounced with KNN filtering than with radius filtering. In the panning scenario, there is little difference between the two cross check filters. The outlier filter reduces noise even further on the helicopter video, with much less noise on the left than with KNN or radius cross check filtering. The result suggests that the frames were warped together slightly differently, because the verge line on the right is more pronounced towards the bottom of the frame, compared with a detection of the verge higher up. The panning sequence is barely affected by the filtering.

The computational performance of the two cross check filters is comparable, with the radius filter being 4 ms slower than KNN in the helicopter video. The filters are almost identical in terms of performance in the panning sequence. The outlier filter adds very little performance overhead to the matching process, and removes a similar number of matches to KNN and radius matching.
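The two filtering ideas can be sketched in a few lines of plain Python. This is an illustration under stated assumptions, not the filters used in the experiments: the cross check below uses a simple nearest-neighbour search rather than the KNN or radius variants, and the outlier rule (keep matches within `factor` times the smallest distance, with a lower bound `floor`) is one common distance heuristic chosen for the example. The descriptors are made-up two-dimensional points and all function names are hypothetical.

```python
import math


def nearest(descriptor, candidates):
    """Index of the candidate closest to `descriptor` (Euclidean distance)."""
    return min(range(len(candidates)), key=lambda i: math.dist(descriptor, candidates[i]))


def cross_check(query, train):
    """Keep a match only if each descriptor is the other's nearest neighbour."""
    kept = []
    for qi, q in enumerate(query):
        ti = nearest(q, train)
        if nearest(train[ti], query) == qi:
            kept.append((qi, ti, math.dist(q, train[ti])))
    return kept


def drop_outliers(matches, factor=3.0, floor=0.05):
    """Discard matches whose distance is large relative to the best match.

    `factor` and `floor` are assumed values for this sketch.
    """
    if not matches:
        return []
    limit = max(factor * min(dist for _, _, dist in matches), floor)
    return [m for m in matches if m[2] <= limit]


query = [(0, 0), (1, 0), (1.2, 0), (5, 5)]
train = [(0, 0.1), (1.05, 0), (9, 9)]

checked = cross_check(query, train)  # query 2 loses train 1 to query 1
filtered = drop_outliers(checked)    # the distant (5, 5) -> (9, 9) pair is removed

print([(q, t) for q, t, _ in checked])
print([(q, t) for q, t, _ in filtered])
```

The cross check removes one-sided matches, while the distance filter removes mutual matches that are still far apart in descriptor space; in this toy data each step discards one match.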

3.7.4 Homography Interpolation

The interpolation results are shown in figure 3.19. In the simple translational movement of the helicopter video, the centroid interpolation errors are difficult to spot because the scene moves in a single direction; the cubic performance is fractionally better than the linear interpolation. In the video scene with the camera panning from a fixed spot, two tests were conducted: one with no moving objects and a second with an extreme panning motion. The linear interpolation appears to perform fractionally better than the cubic interpolation in the helicopter scene, with fewer distortions, but in the panning scene the cubic interpolation performs the best. It is difficult to draw conclusions directly from these results and they do

In document Autonomous real time object detection and identification (Page 83-85)