SFOP detector. - Contributions to the Completeness and Complementarity of Local Image Features

 1: for each integration scale $\sigma_I$ do
 2:   for each pixel location $x$ do
 3:     Compute gradient $\nabla_{\sigma_I/3} L(x)$.
 4:     Compute $\lambda_1$, the smallest eigenvalue of the following structure tensor matrix:
          $\mu(x, \alpha, \sigma_I, \sigma_D) = G(\sigma_I) * \left(R_\alpha \nabla_{\sigma_D} L(x)\, \nabla_{\sigma_D} L(x)^T R_\alpha^T\right)$.
 5:     for each angle $\alpha \in \{0°, 30°, 60°\}$ do
 6:       Compute $\Omega(x, \alpha, \sigma_I)$.
 7:     end for
 8:     Determine $\alpha_0 = \arg\min_{\alpha \in \{0°, 30°, 60°\}} \Omega(x, \alpha, \sigma_I)$.
 9:     Compute precision $w(x, \alpha, \sigma_I)$.
10:   end for
11: end for
12: Detect local maxima in a 26-neighborhood of $w$.
13: Select keypoints $x$ with $\lambda_1 > T_\lambda$.
14: Perform non-maxima suppression.
15: Interpolate $w$.
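Step 12 of the algorithm searches for local maxima of w in a 26-neighborhood, that is, among the 8 spatial neighbors at the same scale plus the 9 positions at each of the two adjacent scales. The following is a minimal Python sketch of that test only. It assumes that w is stored as a nested list indexed as w[scale][row][column]; the function name and the storage layout are illustrative choices and are not taken from the SFOP implementation.

```python
def local_maxima_26(w):
    """Return (scale, row, col) triples where w is strictly larger than
    all 26 neighbors in the 3x3x3 block around it (borders are skipped)."""
    n_scales, height, width = len(w), len(w[0]), len(w[0][0])
    # All offsets of the 3x3x3 block except the center itself: 26 neighbors.
    offsets = [(ds, dy, dx)
               for ds in (-1, 0, 1)
               for dy in (-1, 0, 1)
               for dx in (-1, 0, 1)
               if (ds, dy, dx) != (0, 0, 0)]
    maxima = []
    for s in range(1, n_scales - 1):
        for y in range(1, height - 1):
            for x in range(1, width - 1):
                value = w[s][y][x]
                if all(value > w[s + ds][y + dy][x + dx]
                       for ds, dy, dx in offsets):
                    maxima.append((s, y, x))
    return maxima


# Toy volume: 3 scales of 5x5 pixels with a single peak in the middle.
volume = [[[0.0] * 5 for _ in range(5)] for _ in range(3)]
volume[1][2][2] = 1.0
peaks = local_maxima_26(volume)
```

The strict inequality means that plateaus produce no detection, which is one possible convention; the eigenvalue test of step 13 and the interpolation of step 15 are not covered here.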

SFOP features are displayed in Fig. 2.4. Sets of SFOP features usually exhibit a low density, yet they tend to provide good coverage of the most informative image content. Besides retrieving complementary and interpretable features, the SFOP detector also has an accurate response.

2.2.1.11 SUSAN

SUSAN (Smith, 1992, 1996; Smith & Brady, 1997), which stands for Smallest Univalue Segment Assimilating Nucleus, is a morphological operator proposed for both edge detection and corner detection.

For each pixel x in the image, a circular mask M, whose nucleus is x, is considered. Then, for every pixel y ∈ M \ {x}, its intensity value is compared to that of x using the following function:

$$c(y, x) = \exp\left(-\left(\frac{I(x) - I(y)}{t}\right)^{6}\right),$$

where t is the brightness threshold that determines how far I(y) may deviate from I(x) while still being counted as similar (the sixth power is the form given by Smith & Brady, 1997). The intensity comparison allows us to obtain the Univalue Segment Assimilating Nucleus (USAN) area for x, which is given by

$$n(x) = \sum_{y \in M} c(y, x). \qquad (2.20)$$

Figure 2.4: SFOP features.

The USAN area succinctly describes the structure in the neighborhood of x: n(x) is maximal when x lies in a flat region; on edges, the area is about half of its maximum; for corners, n(x) is even lower. The corner measure used by the SUSAN algorithm is based on these observations:

$$c(x) = \begin{cases} \dfrac{n_{\max}}{2} & \text{if } n(x) < \dfrac{n_{\max}}{2}, \\ 0 & \text{otherwise,} \end{cases} \qquad (2.21)$$

where $n_{\max}$ is the maximum area of the USAN.
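The behavior of the USAN area on flat regions, edges and corners can be illustrated with a minimal Python sketch. It assumes a grayscale image stored as a nested list img[row][column], a circular mask of radius 3 pixels, a brightness threshold t = 20, and the sixth-power similarity function of Smith & Brady (1997); the nucleus itself is left out of the sum, so the maximum area equals the number of mask pixels without the nucleus. All function names, the radius and the threshold value are illustrative assumptions.

```python
import math


def mask_offsets(radius):
    """Offsets (dx, dy) of a circular mask, nucleus excluded."""
    return [(dx, dy)
            for dy in range(-radius, radius + 1)
            for dx in range(-radius, radius + 1)
            if (dx, dy) != (0, 0) and dx * dx + dy * dy <= radius * radius]


def usan_area(img, x, y, radius=3, t=20.0):
    """Sum of the similarity c(y, x) over the mask centered on (x, y)."""
    height, width = len(img), len(img[0])
    area = 0.0
    for dx, dy in mask_offsets(radius):
        xx, yy = x + dx, y + dy
        if 0 <= xx < width and 0 <= yy < height:
            diff = (img[y][x] - img[yy][xx]) / t
            area += math.exp(-diff ** 6)
    return area


def is_susan_corner(img, x, y, radius=3, t=20.0):
    """Thresholding condition of the corner measure: n(x) < n_max / 2."""
    n_max = len(mask_offsets(radius))
    return usan_area(img, x, y, radius, t) < n_max / 2


# Toy image: a bright square (200) on a dark background (0).
image = [[200 if (r >= 7 and c >= 7) else 0 for c in range(15)]
         for r in range(15)]
corner = is_susan_corner(image, 7, 7)     # corner of the square
edge = is_susan_corner(image, 7, 11)      # on the left edge of the square
flat = is_susan_corner(image, 11, 11)     # inside the square
```

In this toy image only the corner of the square satisfies the condition: its USAN covers roughly a quarter of the mask, whereas the edge pixel keeps slightly more than half and the interior pixel keeps the whole mask.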

Figure 2.5 depicts examples of SUSAN feature points.

2.2.1.12 FAST

FAST, which stands for Features from Accelerated Segment Test (Rosten & Drummond, 2006), is an algorithm based on the SUSAN criterion which uses machine learning to provide extremely efficient feature extraction. Pixels are compared on a Bresenham circle of 16 pixels around the keypoint/corner candidate. The idea is to classify each pixel on the circle into one of three categories with respect to the center: brighter, darker, or similar. A given pixel is a corner if there are 12 contiguous pixels on the circle that are all brighter or all darker than the center by more than a threshold. The ID3 (Iterative Dichotomizer 3) algorithm (Quinlan, 1986) is used to build a decision tree that selects, at each node, the circle pixel which yields the most information about whether the candidate pixel is a keypoint/corner, measured by the entropy of the corner classification responses. The resulting decision tree is converted into a long sequence of nested conditional statements written in C. This source code constitutes the final detector.

Figure 2.5: SUSAN keypoints.
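The segment test itself, without the learned decision tree, can be sketched in a few lines of Python. The sketch assumes a grayscale image stored as a nested list img[row][column], the usual 16-pixel circle of radius 3, and an intensity threshold t; the function name and the default threshold are illustrative and are not taken from the FAST source code.

```python
# Offsets (dx, dy) of the 16-pixel circle of radius 3, listed in circular order.
CIRCLE = [(0, -3), (1, -3), (2, -2), (3, -1), (3, 0), (3, 1), (2, 2), (1, 3),
          (0, 3), (-1, 3), (-2, 2), (-3, 1), (-3, 0), (-3, -1), (-2, -2), (-1, -3)]


def is_fast_corner(img, x, y, t=20, n=12):
    """Segment test: True if n contiguous circle pixels are all brighter
    than center + t or all darker than center - t."""
    center = img[y][x]
    labels = []
    for dx, dy in CIRCLE:
        value = img[y + dy][x + dx]
        if value > center + t:
            labels.append(1)       # brighter
        elif value < center - t:
            labels.append(-1)      # darker
        else:
            labels.append(0)       # similar
    for target in (1, -1):
        run = 0
        # Traversing the circle twice lets a run wrap around its end.
        for label in labels + labels:
            run = run + 1 if label == target else 0
            if run >= n:
                return True
    return False


# Toy example: a dark pixel surrounded by a uniformly brighter neighborhood.
patch = [[100] * 7 for _ in range(7)]
patch[3][3] = 10
dark_spot_is_corner = is_fast_corner(patch, 3, 3)
```

The candidate must lie at least 3 pixels away from the image border, since the sketch performs no bounds checking. The decision tree described above reaches the same decision while examining only a few of the 16 pixels for most candidates.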

Examples of keypoints extracted by the FAST algorithm are displayed in Fig. 2.6.

FAST-ER (Features from Accelerated Segment Test - Enhanced Repeatability) (Rosten et al., 2010) is an improved version of FAST that takes the repeatability of features into account in order to retrieve points with a high repeatability rate.

Figure 2.6: FAST keypoints.

2.2.1.13 Laplacian of Gaussian (LoG) detector

Keypoints are representatives of visually salient image parts. In some cases, the conspicuous regions around keypoints correspond to blobs. As mentioned earlier, a blob is an image part that is brighter or darker than its surroundings. Blob detection is usually performed in a scale-space representation of the image so that the scale of each blob can be determined. Lindeberg (1998) proposes a scale-covariant blob detector that searches for scale-space extrema of the (scale-)normalized Laplacian of Gaussian (LoG):

$$\sigma^2 \nabla^2 L(x, \sigma) = \sigma^2 \left(L_{xx}(x, \sigma) + L_{yy}(x, \sigma)\right). \qquad (2.22)$$

This operator has a maximal response at the center of circular blob structures (see Fig. 2.7).
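As a worked example of the scale selection performed by this operator, consider a bright Gaussian blob I(x) = exp(−|x|²/(2 s0²)), where s0 is a hypothetical blob size introduced only for this illustration. Smoothing it with a unit-integral Gaussian of scale σ gives another Gaussian with variance s0² + σ² and amplitude s0²/(s0² + σ²), so that the normalized LoG at the blob center is −2σ²s0²/(s0² + σ²)². Its magnitude is largest at σ = s0, where it equals 1/2. The Python sketch below evaluates this closed form on a geometric set of scales, whose starting value, ratio and length are arbitrary choices.

```python
def normalized_log_at_center(sigma, s0):
    """Scale-normalized LoG at the center of a Gaussian blob of size s0,
    using the closed form -2 * sigma^2 * s0^2 / (s0^2 + sigma^2)^2."""
    variance = s0 ** 2 + sigma ** 2
    return -2.0 * sigma ** 2 * s0 ** 2 / variance ** 2


blob_size = 3.0
scales = [0.5 * 1.2 ** i for i in range(20)]
# The response of a bright blob is negative, so extrema are sought in magnitude.
best_scale = max(scales,
                 key=lambda s: abs(normalized_log_at_center(s, blob_size)))
```

The selected scale is the sampled scale closest to the blob size, which illustrates why the extremum of the normalized LoG over scales can serve as an estimate of the blob scale.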

2.2.1.14 Difference of Gaussians (DoG) detector

The Difference of Gaussians (DoG) operator is an approximation of the Laplacian of Gaussian operator. In a scale-space, the difference between images at neighboring scales approximates the derivative with respect to scale and, by the diffusion equation ∂L/∂σ = σ∇²L, this derivative is proportional to the Laplacian. Therefore, the Laplacian of Gaussian operator can be approximated by the difference between two Gaussian-smoothed images whose scales are separated by a factor of k (Grauman & Leibe, 2011):

$$D(x, \sigma) = \left(G(k\sigma) - G(\sigma)\right) * I(x). \qquad (2.23)$$

Figure 2.7: Feature extraction using the Laplacian of Gaussian.

The DoG operator is the basis of the popular Scale-Invariant Feature Transform (SIFT) (Lowe, 1999, 2004). Keypoints are first detected in a scale-space: a keypoint is a location at which the DoG attains a local extremum. To characterize the neighborhood of each keypoint, a descriptor is then constructed. It consists of 16 gradient orientation histograms with 8 bins each, producing a vector with 128 elements. This descriptor is rotation and scale invariant.
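The approximation behind Eq. (2.23) can be checked numerically on the kernels themselves, since convolution is linear. For a unit-integral 2D Gaussian, ∂G/∂σ = σ∇²G, hence G(kσ) − G(σ) ≈ (k − 1)σ²∇²G when k is close to 1. The following Python sketch compares both sides at a given distance r from the kernel center; the function names and the values of σ and k are illustrative assumptions.

```python
import math


def gaussian_2d(r, sigma):
    """Unit-integral isotropic 2D Gaussian evaluated at radius r."""
    return math.exp(-r * r / (2.0 * sigma ** 2)) / (2.0 * math.pi * sigma ** 2)


def laplacian_of_gaussian_2d(r, sigma):
    """Laplacian of the 2D Gaussian at radius r."""
    return gaussian_2d(r, sigma) * (r * r / sigma ** 4 - 2.0 / sigma ** 2)


def dog_kernel(r, sigma, k):
    """Difference of Gaussians G(k * sigma) - G(sigma) at radius r."""
    return gaussian_2d(r, k * sigma) - gaussian_2d(r, sigma)


def scaled_log_kernel(r, sigma, k):
    """First-order approximation (k - 1) * sigma^2 * Laplacian of G."""
    return (k - 1.0) * sigma ** 2 * laplacian_of_gaussian_2d(r, sigma)


sigma, k = 2.0, 1.01
exact = dog_kernel(0.0, sigma, k)
approx = scaled_log_kernel(0.0, sigma, k)
relative_error = abs(exact - approx) / abs(approx)
```

For these values the two kernels agree to within about 1.5% at the center. The factor (k − 1) is constant over scales when k is fixed, so it does not affect the location of the extrema.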

2.2.1.15 Harris-Laplace

The Harris-Laplace detector (Mikolajczyk & Schmid, 2001, 2002, 2004) is a scale-covariant detector that results from combining the popular Harris-Stephens keypoint detector (Harris & Stephens, 1988) with a Gaussian scale-space representation. It starts with a multi-scale Harris-Stephens keypoint extraction followed by an automatic scale selection (Lindeberg, 1998) defined by a normalized Laplacian operator. In this case, the characteristic scale for a given structure corresponds to the scale where the Laplacian attains a maximum, which is independent of the image resolution, thereby yielding a scale-covariant response.

The algorithm starts by building a scale-space representation for the Harris-Stephens measure using n pre-selected scales $\sigma_i = \xi^{i-1}\sigma_0$, with $\sigma_0 \in \mathbb{R}^+$, $\xi > 1$, and $i = 1, \ldots, n$. At each level (scale) $\sigma_i$, keypoints are found by computing the local maxima that are above a given positive threshold $T_{HS}$:

$$\begin{cases} x^{\star} = \operatorname{argmaxlocal}_{x}\; f_{HS}(x, \sigma_I) \\ f_{HS}(x^{\star}, \sigma_I) > T_{HS} \end{cases}. \qquad (2.24)$$

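The next step in the algorithm is to determine the scale of the keypoints, which is done by finding the local extrema of the normalized Laplacian (of Gaussian) over a range of scales and keeping those above a given positive threshold $T_{LoG}$:

$$\begin{cases} \sigma^{\star} = \operatorname{argmaxlocal}_{\sigma}\; \sigma^2 \left(L_{xx}(x^{\star}, \sigma) + L_{yy}(x^{\star}, \sigma)\right) \\ \sigma^{\star 2} \left(L_{xx}(x^{\star}, \sigma^{\star}) + L_{yy}(x^{\star}, \sigma^{\star})\right) > T_{LoG} \end{cases}. \qquad (2.25)$$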
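The two ingredients of the scale selection, the geometric sequence of scales and the search for thresholded local maxima across scales, can be sketched in Python as follows. The sketch assumes that the normalized Laplacian responses at a fixed keypoint location have already been computed, one value per scale level; the function names and the example values are hypothetical and serve only as an illustration.

```python
def scale_levels(sigma0, xi, n):
    """Pre-selected scales sigma_i = xi^(i-1) * sigma0 for i = 1, ..., n."""
    return [sigma0 * xi ** (i - 1) for i in range(1, n + 1)]


def characteristic_scales(scales, responses, t_log):
    """Scales whose response exceeds t_log and is larger than the
    responses at both neighboring scale levels."""
    selected = []
    for i in range(1, len(scales) - 1):
        r = responses[i]
        if r > t_log and r > responses[i - 1] and r > responses[i + 1]:
            selected.append(scales[i])
    return selected


levels = scale_levels(1.0, 2.0, 4)            # [1.0, 2.0, 4.0, 8.0]
laplacian_responses = [0.1, 0.5, 0.3, 0.2]    # made-up values, one per level
chosen = characteristic_scales(levels, laplacian_responses, t_log=0.4)
```

With these made-up responses the only level that is both a local maximum and above the threshold is σ = 2. The first and last levels are never selected in this sketch because they lack one of the two neighbors needed for the comparison.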

Algorithm 3 outlines the main steps for the detection of Harris-Laplace regions.
