The response function threshold Rt is the minimal value of the response function (LoG, DoG, Hessian, etc.) that a detected feature must reach. Its main purpose is to filter out spurious local features that arise from stochastic processes in image acquisition rather than from structures observed in the scene. Because it rejects features with low contrast, it directly affects the number of detected features.
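The effect of the threshold on the feature count can be illustrated with a minimal sketch. The function name `filter_by_response` and the response values are hypothetical, and comparing the absolute value of the response is an assumption made here so that both positive and negative extrema are handled; individual detector implementations may apply the threshold differently.

```python
def filter_by_response(responses, threshold):
    """Return indices of features whose response magnitude reaches the threshold.

    Assumption: the magnitude |r| is compared, so that both bright and dark
    extrema are kept; this is an illustration, not a specific implementation.
    """
    return [i for i, r in enumerate(responses) if abs(r) >= threshold]


# Hypothetical response values of detected extrema (not measured data).
responses = [0.002, 0.015, -0.030, 0.007, 0.050, -0.0005]

# Raising the threshold rejects low-contrast features first.
for threshold in (0.001, 0.005, 0.02):
    kept = filter_by_response(responses, threshold)
    print(f"threshold={threshold}: {len(kept)} features kept")
```

Sweeping the threshold in this way is what produces the different feature counts against which detector performance is reported in the experiments of this section.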
Most authors who have tried to measure the performance of feature detectors have struggled with the response function threshold and with other parameters that affect the number of detected features. The number of detected features depends on the input image and also on the detector implementation. Therefore authors ([27], [2]) usually choose to use the default parameters of the detectors.
In this section we examine the relation of the response function threshold to detector performance. The DTU Robot 3D benchmark is used to measure the geometric precision of the detector, and the retrieval benchmark is used to measure the distinctiveness of the extracted features in an image retrieval task. To obtain more general results, we performed the tests with several implementations of the local image feature detectors. Where the implementations allow, we also include both the similarity-invariant and the affine-invariant variants of the detected features in order to see whether the behaviour is affected by the affine iteration.
In the CMP implementation of the detectors, the response function threshold is set directly as the expected standard deviation of the noise in image brightness values, with the exception of the DoG approximation, where it is set directly as the minimum difference-of-Gaussians value. In the other implementations the threshold is compared directly against the computed response values and does not account for the different properties of the feature responses. In the case of the MSER detector [26], the parameter ∆min, the minimal margin of intensity values for a stable region, is varied.
Epipolar geometry. In this experiment we test the geometric precision of the detected local features. It is performed with the DTU Robot 3D benchmark (described in and ). We test the detectors with varying response thresholds (or minimal margin for the MSER detector) and measure the repeatability on two selected camera viewpoints against the reference image. The reference image is captured with the camera 0.5 m away from the scene (Key frame in Figure ). The first viewpoint, referred to as Linear path, has the camera 0.8 m away from the scene at the same bearing and is used to measure detector scale invariance. In the second viewpoint, referred to as Arc 2, the camera is on a circular path 0.65 m away from the scene and captures the scene at an angle of 25°. The particular camera viewpoints tested against
Figure 5.6: Repeatability of the CMP Hessian detector with the scale space built with Algorithm 4: (a) repeatability, (b) number of correspondences. Repeatability is plotted per scale on the increasingly blurred bikes 1 image. Local features detected in the reference image are divided by scale into 5 intervals such that each interval contains the same number of features. Repeatability is then computed per group.
Figure 5.7: First 10 scenes used for repeatability calculation and visualisation of the selected viewpoints (Arc 2 and Linear path).
the reference frame are visualised in Figure 5.7. The repeatability has been averaged over the first 10 scenes, which include a variety of materials and structures and are also shown in Figure 5.7.
To compare the detectors fairly, the repeatability is plotted as a function of the number of features, which changes with the response function threshold. For some algorithms, such as RANSAC in wide-baseline stereo, the fraction of inliers determines the time to converge, because the stopping criterion depends only on it. Therefore, if we increase the number of features but the repeatability remains the same, RANSAC does not benefit and the processing time increases (although a larger absolute number of inliers is sometimes desirable, as it allows a more precise model to be estimated).
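The dependence of RANSAC on the inlier fraction can be made concrete with the standard stopping criterion N = log(1 - p) / log(1 - w^s), where w is the inlier fraction, s the minimal sample size and p the required confidence. The sketch below is only an illustration under stated assumptions: the function name `ransac_iterations`, the confidence of 0.99 and the sample size of 7 (as for a seven-point fundamental matrix solver) are choices made here, not settings taken from the experiments.

```python
import math


def ransac_iterations(inlier_ratio, sample_size, confidence=0.99):
    """Number of RANSAC samples needed to draw at least one all-inlier
    sample with the given confidence (standard stopping criterion).

    The result depends only on the inlier ratio, not on the absolute
    number of tentative correspondences.
    """
    if not 0.0 < inlier_ratio <= 1.0:
        raise ValueError("inlier_ratio must be in (0, 1]")
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must be in (0, 1)")
    p_all_inliers = inlier_ratio ** sample_size
    if p_all_inliers >= 1.0:
        return 1  # every sample is outlier-free
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - p_all_inliers))


# Assumed sample size of 7; the inlier ratios are hypothetical.
for w in (0.3, 0.4, 0.5):
    print(f"inlier ratio {w}: {ransac_iterations(w, sample_size=7)} samples")
```

Because the number of features does not appear in the formula, adding features at a constant inlier fraction leaves the required number of samples unchanged while each model verification becomes more expensive.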
For the Arc 2 viewpoint the results are shown in Figure 5.8; the measured values and detector thresholds are listed in Table C.2. For all detectors the repeatability increases with an increasing number of features, with the exception of the MSER detector, where the repeatability remains the same. Comparing the feature response functions, the best results are obtained with the Hessian-based feature detectors.
For the Linear path viewpoint the results are shown in Figure 5.9; the measured values and detector thresholds are listed in Table C.1. In this case the response function threshold has a small effect on detector performance; only for the DoG and MSER features does the repeatability decrease with a smaller response function threshold. In the case of MSER, with a small margin the stability function is more affected by noise, which causes a less stable estimate of the region extent. With DoG features the repeatability decreases more than for the LoG features that they approximate, as with small thresholds the approximation error may have a bigger influence.
[Plots: Avg. Repeatability (0.3 to 0.5) versus Avg. Num. of features (0 to 5000). Panels: (a) Approx+MSER, (b) Harris-based, (c) Hessian-based, (d) Legend: VLF DoG, MSER, CMP LoG, CMP DoG, VGG HarLapAff, VLF HarLap, VLF HarLapAff, VGG HesLapAff, VLF Hessian, VLF HessLap, VLF HessAff, CMP Hessian, CMP HessAff.]
Figure 5.8: Detector repeatability in the DTU Robot 3D benchmark as a function of the average number of detected features per image when varying the response function threshold (or min. margin for MSER). Repeatability was computed over the first 10 scenes in the Arc 2 dataset for the 25° viewpoint. Detector thresholds are shown in Table C.2.
in [1] and is caused by the rather small affine transformations between the tested images and by the additional degrees of freedom that need to be estimated by the detector. From the implementation point of view, the VGG detectors, used in the detector comparison [1], obtained the worst performance.
Image retrieval. The results of the image retrieval experiment (mAP, defined in 4.6) as a function of the average number of features per image are shown in Figure 5.10; the measured values and detector thresholds are listed in Table . They are computed with the same detectors as in the previous experiment, but accompanied by the SIFT descriptor. The descriptor algorithm was set to use measurement scale ν = 3 and no orientation assignment. The tests were performed with the Oxford Buildings dataset [33].
The results show that for all detectors there is a minimal number of features required to successfully cover object instances, which explains the low mAP for small numbers of features. When the number of features increases beyond 2000, the newly added features do not improve the results significantly, with the exception of DoG features, where mAP decreases at the lowest response function thresholds, similarly to the epipolar geometry test.
From the invariance point of view, affine invariance seems to improve the results slightly only for hybrid detectors. In the case of Hessian detectors, affine shape adaptation seems only to decrease the number of features but does not improve the performance.
Contrary to the epipolar geometry results, the VGG detectors seem to give better results than the other implementations.
[Plots: Avg. Repeatability (0.35 to 0.5) versus Avg. Num. of features (0 to 5000). Panels: (a) Approx+MSER, (b) Harris-based, (c) Hessian-based, (d) Legend: VLF DoG, MSER, CMP LoG, CMP DoG, VGG HarLapAff, VLF HarLap, VLF HarLapAff, VGG HesLapAff, VLF Hessian, VLF HessLap, VLF HessAff, CMP Hessian, CMP HessAff.]
Figure 5.9: Detector repeatability in the DTU Robot 3D benchmark as a function of the average number of detected features per image when varying the response function threshold (or min. margin for MSER). Repeatability was computed over the first 10 scenes in the Linear path dataset, where the camera moves 0.3 m away from the scene. Detector thresholds are shown in Table C.2.
Conclusions. Our experiments show that for tasks where the precise localisation of detected features is at stake, detecting more features with a lower threshold can increase detector performance. However, for DoG and MSER features there is a limit beyond which their scale invariance deteriorates. In image retrieval there is a limit beyond which new regions do not improve the performance of the retrieval system, as the detected features become less distinctive.