Chapter 4 Interest Point Additions and Testing
4.5 Adjustable BRISK Detector
In response to experiments with the dynamic detector, I created the Adjustable BRISK detector. This interest point detector has detection times comparable to BRISK, while producing a more consistent number of features. It leverages the large amount of visual consistency between sequential frames in a video sequence to use previous detection results to inform the selection of future BRISK parameters.
Figure 4.7: Figure 4.7a shows the detection time per frame for a BRISK detector with t = 15, o = 1. Figure 4.7b shows the detection time per frame for a dynamic BRISK detector with feature boundaries 600-800.
The Adjustable BRISK detector is a modified version of the BRISK-AGAST detector. It selects interest points in the same manner, but has additional functions that allow for tuning of the threshold value to keep the number of features detected relatively constant.
Usage is similar to that of the BRISK detector. The user specifies six parameters in the initialization of the detector:
• init thresh - The starting threshold value to use in the first detection.
• nOctaves - The number of octaves to use in the detection of feature points. This parameter is exactly the same as in the BRISK-AGAST detector.
• min features - The lower bound on the acceptable number of features.
• max features - The upper bound on the acceptable number of features.
• min thresh - The lowest acceptable value that the threshold can take.
• max thresh - The greatest acceptable value that the threshold can take.
The detect function is used to extract a list of interest points from an image. Immediately after a detect call, the updateParams function is called with the number of interest points detected, n, as a parameter. This function adjusts the detector parameters to produce more points if too few interest points were detected, and fewer if too many were detected. The threshold update logic for updateParams is given in Equation (4.17), where t is the previous threshold and \hat{t} is the new threshold value.
\[
\hat{t} =
\begin{cases}
0.9\,t & \text{min features} > n \\
1.1\,t & \text{max features} < n
\end{cases}
\tag{4.17}
\]
Although the number of octaves o and the threshold value t are both capable of changing the number of features, only t is adjusted in Equation (4.17). This is because t can take a much broader range of values, and those values give finer control of the feature count than o.
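The update rule in Equation (4.17) can be sketched in a few lines of Python. This is a minimal illustration, not the detector's actual code: the class name `AdjustableThreshold`, the snake_case field names, and the numeric values for `min_thresh` and `max_thresh` are hypothetical. Clamping the result to the threshold bounds is also an assumption, since the text names the bounds without stating how they are enforced.

```python
from dataclasses import dataclass


@dataclass
class AdjustableThreshold:
    """Hypothetical holder for the threshold state of the detector."""
    thresh: float        # current threshold t (starts at init thresh)
    min_features: int    # lower bound on the acceptable feature count
    max_features: int    # upper bound on the acceptable feature count
    min_thresh: float    # lowest value the threshold may take
    max_thresh: float    # greatest value the threshold may take

    def update_params(self, n: int) -> float:
        """Apply Equation (4.17) given n, the number of points just detected."""
        if n < self.min_features:
            # Too few points: lower the threshold so more corners pass.
            self.thresh *= 0.9
        elif n > self.max_features:
            # Too many points: raise the threshold so fewer corners pass.
            self.thresh *= 1.1
        # Assumption: keep t inside [min_thresh, max_thresh] by clamping.
        self.thresh = min(max(self.thresh, self.min_thresh), self.max_thresh)
        return self.thresh


# Illustrative values only: t = 15 and the 600-800 band echo Figure 4.7,
# while the threshold bounds 5 and 100 are made up for this example.
state = AdjustableThreshold(thresh=15.0, min_features=600, max_features=800,
                            min_thresh=5.0, max_thresh=100.0)
t_after_few = state.update_params(450)    # below the band: t is scaled by 0.9
t_after_ok = state.update_params(700)     # inside the band: t is unchanged
t_after_many = state.update_params(950)   # above the band: t is scaled by 1.1
```

In a video pipeline, `update_params` would be called once per frame, immediately after detection, so that the next frame is processed with the adjusted threshold.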
Figure 4.9 demonstrates this, showing feature detection counts for ranges of t and o on the famous “Lena” image (Figure 4.8). In Figure 4.9b, the feature count is shown for the full range of possible o values, 0 to 9. As discussed in Section 4.3, increasing o from 0 to 1 results in a sharp decrease in the number of features detected, from 1435 to 299 in this case. Each subsequent increase in o reduces the feature count by only 1-2 features.
Figure 4.9a shows the large set of values t can cover and the wide range of feature counts that it can produce. With a properly selected threshold value, almost any range of feature counts can be achieved.
Note the detection efficiency in Figures 4.9e and 4.9f, given in µs per feature. For threshold values less than 80, an approach altering only the threshold is able to produce features more efficiently than one changing the number of octaves. It should also be mentioned that a change in the value of o used to detect features would necessitate the reinitialization of the descriptor extractor, so as to generate a new look-up table including the appropriate number of scale levels. The generation of the look-up table is a time-intensive process which requires several hundred milliseconds.
Concerns regarding the stability of the detected feature points also played a role in the choice to use only the threshold value to control the number of features. We saw in Section 4.3 that the location of an interest point is constant with respect to t. For o, this property extends only to the lower levels of the scale pyramid. Interest points located at the top level will almost surely change location when o is increased, since the localization technique for those points expands to include scale refinement and interpolation between scale levels.
The change in the location of the features is an issue in the RGB-D SLAM System. When the algorithm processes a new frame, it performs visual odometry by comparing features between the new frame and existing key frames and then examining the rigid transformation between the reconciled 3-D locations of the matching pairs in the two frames.
Figure 4.8: Lena.
The RGB-D SLAM System uses feature matches with previously detected frames to estimate the pose of the camera. If a change in o results in a change in the location of interest points, their descriptors might be altered enough to prevent matches from being established, resulting in a loss of tracking. Even if enough matches are found, the change in position of the interest points may alter their relative positions, resulting in an incorrect motion estimate that introduces error into the estimated trajectory.
A multiplicative threshold adjustment in Equation (4.17) is chosen over the more typical constant adjustment because of the shape of the feature count versus threshold plot in Figure 4.9a. For large values of t, large changes in t are required to elicit a significant change in the feature count, while at small values of t only very small changes in t can be made without causing gross changes in the feature count.
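The difference between the two adjustment styles can be made concrete by comparing the absolute step each one takes at different thresholds. The sketch below is illustrative only: the function names and the constant step size of 2 are hypothetical, while the factor 1.1 comes from Equation (4.17).

```python
def multiplicative_step(t: float, factor: float = 1.1) -> float:
    """Absolute change in t produced by one multiplicative increase."""
    return t * factor - t


def constant_step(t: float, delta: float = 2.0) -> float:
    """Absolute change in t produced by one additive increase.

    The value of delta is made up for this comparison.
    """
    return delta


# Compare the step sizes at a small, a medium, and a large threshold.
for t in (10.0, 50.0, 100.0):
    print(f"t={t:6.1f}  multiplicative step={multiplicative_step(t):6.2f}  "
          f"constant step={constant_step(t):6.2f}")
```

The multiplicative step grows in proportion to t, so it is small where the feature count is sensitive to the threshold and large where it is not. A constant step would have to be tuned for one end of the range at the expense of the other.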