Dynamic Threshold SIFT for Image Matching


2017 2nd International Conference on Computer, Mechatronics and Electronic Engineering (CMEE 2017) ISBN: 978-1-60595-532-2


Bao-guo WEI¹,²,*, Xing-jian HE², Hai-xi ZHANG² and Gao-feng WANG¹

¹ Guizhou Yu Peng Technology Co. Ltd., Guiyang, China

² School of Electronics and Information, Northwestern Polytechnical University, Xi’an, China

*Corresponding author

Keywords: SIFT (Scale Invariant Feature Transform), Dynamic threshold, GLCM.

Abstract. Despite its good results in many applications, the performance of SIFT declines under insufficient illumination and blurred imaging. The main reason is that SIFT keypoints are discarded when the DoG response is less than a fixed preset threshold. We propose a dynamic threshold assignment method based on both global and local information. First, an initial threshold is obtained from the contrast of the GLCM, making it adaptive to insufficient illumination and blur. Second, to control the number of feature points, the threshold is adjusted again according to the distribution of feature points, so that every feature point is assigned its own threshold adapted to its surroundings. Finally, the mismatch removal algorithm is also improved by incorporating the global distribution context. Experimental results show that the improved SIFT algorithm not only adapts well to low light and image blur, but can also adjust the number of feature points and reduce clustering effects.

Introduction

SIFT (Scale Invariant Feature Transform) has been widely used in image matching, registration and stitching because it is invariant to image scale and rotation [1, 2]. However, SIFT still has drawbacks, such as high computational cost, weak performance under affine transformation, and insufficient matching pairs under weak illumination and blur. Many improvements to SIFT have been made. SURF (Speeded-Up Robust Features) [3] and PCA-SIFT [4] largely reduce the computational cost and improve the efficiency of SIFT. A-SIFT [5] performs better than SIFT under affine transformation. So far, most improvements to SIFT emphasize the representation and dimension reduction of the feature vector, and few works consider the selection of features. It is shown in [6] that the fixed threshold used in classic SIFT to exclude unstable keypoints cannot fit all kinds of images. Under insufficient illumination or blurred imaging, it is difficult to obtain enough feature points, so the number of matching pairs is reduced and matching performance degrades. To overcome this, You ZHAI and Luan ZENG proposed an adaptive threshold based on image entropy [7], which performs well under insufficient illumination but loses its effect on blurred images.

To extract appropriate SIFT features under insufficient illumination and image blur, we propose a twofold adaptive threshold SIFT. First, we obtain an initial threshold from the image contrast. Then, according to the feature distribution around each keypoint, we revise the threshold once more to adjust the number of features. The proposed algorithm can improve the feature distribution and suppress feature clustering to some extent. Moreover, we exclude mismatches according to the spatial distribution of features. Experimental results show that the matching effect is improved under insufficient illumination and image blur, and that clustering effects are suppressed as well.


SIFT, proposed in 1999 [1] and refined in 2004 [2] by Lowe, consists of four major stages: (1) scale-space extrema detection; (2) keypoint localization; (3) orientation assignment; (4) keypoint descriptor.

In the second stage, the standard SIFT first fits a 3D quadratic function to the local sample points to determine the interpolated location of the extremum, thereby achieving sub-pixel accuracy. The function value at the extremum is then used to reject unstable extrema with low contrast: all extrema whose function value has a magnitude below a preset threshold, say 0.03, are discarded.
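The contrast test of the second stage can be sketched as follows. The sketch assumes the formulation in Lowe [2], where the offset of the extremum is x̂ = −H⁻¹∇D and the interpolated value is D(x̂) = D + ½ ∇Dᵀx̂; the function names, the argument layout (x, y, scale) and the sample numbers are illustrative assumptions, not taken from this paper.

```python
import numpy as np

def interpolated_contrast(d0, grad, hessian):
    """Return (offset, value) of the fitted 3D quadratic at its extremum.

    d0      : DoG value at the sampled extremum
    grad    : gradient of the DoG at that sample, length 3 (x, y, scale)
    hessian : 3x3 Hessian of the DoG at that sample
    """
    grad = np.asarray(grad, dtype=float)
    hessian = np.asarray(hessian, dtype=float)
    # Offset of the interpolated extremum: x_hat = -H^{-1} * grad
    offset = -np.linalg.solve(hessian, grad)
    # Interpolated function value: D(x_hat) = D + 0.5 * grad^T * x_hat
    value = d0 + 0.5 * float(grad @ offset)
    return offset, value

def passes_contrast_test(d0, grad, hessian, threshold=0.03):
    """Keep the extremum only if |D(x_hat)| reaches the threshold."""
    _, value = interpolated_contrast(d0, grad, hessian)
    return abs(value) >= threshold
```

With a fixed threshold of 0.03, a weak extremum from a dark or blurred image is rejected, whereas the same extremum survives if the threshold is lowered, which is the situation the dynamic threshold is meant to address.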

When SIFT is used for matching, Lowe proposed using the Euclidean distance to express the difference between feature vectors. However, because of similar regions in images and the lack of global information, mismatches occur. To deal with mismatches, Fischler and Bolles proposed the RANSAC algorithm [8]. Peng et al. introduced global information into the feature vectors, rather than improving the matching method, to raise matching accuracy [9].
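A minimal sketch of Euclidean-distance matching is given below. The nearest/second-nearest ratio test and its value 0.8 come from Lowe [2] and are not described in this paper; the function name and the brute-force distance computation are illustrative assumptions.

```python
import numpy as np

def match_descriptors(desc1, desc2, ratio=0.8):
    """Match each descriptor in desc1 to its nearest neighbour in desc2.

    desc1, desc2 : arrays of shape (n1, d) and (n2, d), with n2 >= 2
    ratio        : nearest / second-nearest distance ratio (assumed 0.8)
    Returns a list of (index_in_desc1, index_in_desc2) pairs.
    """
    desc1 = np.asarray(desc1, dtype=float)
    desc2 = np.asarray(desc2, dtype=float)
    # Pairwise Euclidean distances, shape (n1, n2)
    dists = np.linalg.norm(desc1[:, None, :] - desc2[None, :, :], axis=2)
    matches = []
    for i, row in enumerate(dists):
        order = np.argsort(row)
        best, second = order[0], order[1]
        # Accept only matches that are clearly better than the runner-up
        if row[best] < ratio * row[second]:
            matches.append((i, int(best)))
    return matches
```

Because the decision uses only local descriptors, two similar image regions can still produce a wrong pair, which is why a separate mismatch removal step is needed.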

The Improved SIFT

Analysis of the Standard SIFT

In the second stage, Lowe suggested a preset threshold of 0.03 to discard low-contrast extrema. However, a fixed threshold cannot suit all situations. Under insufficient or uneven illumination, or with a blurred image, the pixel values are distributed in a narrow range, so the number of keypoints retained by the standard SIFT can be too small for good matching. The reason is that the threshold used to discard unstable keypoints is relatively high for an image with a narrow intensity distribution.


Figure 1 shows the keypoint localization results of the standard SIFT for the same scene by day and at night, and for the same scene with clear and blurred imaging. Under common conditions the standard SIFT retains enough keypoints (Figure 1 (a) and (c)), while under insufficient illumination and blur the number of retained keypoints is greatly reduced (Figure 1 (b) and (d)). The subsequent matching process is affected accordingly.

Figure 1. Keypoint localization results under different conditions. (a), (c) common conditions; (b) the same scene as (a) under low illumination; (d) the same scene as (c) with blurred imaging.

The Improved SIFT

The approach in [7] revises the threshold according to the entropy of the whole image and obtains good results under illumination change, but loses its effect on blurred images.

We propose to adjust the threshold for discarding unstable interest points adaptively, according to image contrast and feature distribution, that is, the global and local information of the image. The algorithm enhances the matching effect under insufficient illumination and blurred imaging, and keeps the performance of the standard SIFT on common images. For an image I with M rows, N columns and $N_g$ gray levels, a GLCM (gray-level co-occurrence matrix) for a given gray-level configuration may be described by a matrix of relative frequencies $P_{\varphi,d}(a,b)$, describing how frequently two pixels with gray levels a and b appear in the image separated by a distance d in direction φ. For a given distance and angle of the pixel pair, the GLCM can be defined as follows:

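$$P_{\varphi,d}(a,b) = \#\{\,((x_1,y_1),(x_2,y_2)) \mid I(x_1,y_1)=a,\ I(x_2,y_2)=b\,\} \tag{1}$$

where the pixel pairs $(x_1,y_1)$, $(x_2,y_2)$ are those separated by a distance d in direction φ, and #{x} denotes the number of elements in x. P is a matrix with $N_g$ rows and $N_g$ columns. The contrast measure of the GLCM, which reflects local image variations, is defined as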

$$CON = \sum_{a,b} |a-b|^{k}\, P_{\varphi,d}^{\lambda}(a,b) \tag{2}$$

Typically we set k=2 and λ=1.
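As an illustration, the GLCM contrast for d = 1 and φ = 0 (horizontally adjacent pixels) might be computed as in the sketch below. It assumes an integer-valued grayscale image with at least two columns and normalizes the co-occurrence counts to relative frequencies; the function name and the `levels` parameter are hypothetical.

```python
import numpy as np

def glcm_contrast(img, levels=256, k=2, lam=1):
    """GLCM contrast for distance d=1 and direction phi=0.

    img    : 2D integer array with values in [0, levels)
    levels : number of gray levels (Ng in the text)
    k, lam : exponents of the contrast measure (k=2, lambda=1 in the text)
    """
    img = np.asarray(img)
    a = img[:, :-1].ravel()   # left pixel of each horizontal pair
    b = img[:, 1:].ravel()    # right pixel of each horizontal pair
    counts = np.zeros((levels, levels), dtype=np.float64)
    np.add.at(counts, (a, b), 1)          # accumulate co-occurrence counts
    P = counts / counts.sum()             # relative frequencies
    ia, ib = np.indices((levels, levels))
    return float(np.sum(np.abs(ia - ib) ** k * P ** lam))
```

With λ = 1 the result does not depend on whether the pairs (a, b) and (b, a) are counted together, because |a − b| is symmetric. A flat image gives a contrast of 0, and sharper or better lit images give larger values.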

For simplicity, we consider only the GLCM with d set to 1 and φ set to 0. In our experiments, this is enough to determine the threshold. The initial threshold, threshold0, is calculated from the CON of the GLCM as

$$\mathrm{threshold}_0 = \frac{CON}{CON+0.01} \times 0.01 + 0.02 \tag{3}$$

For a common image, the value of CON is usually greater than 1, so CON/(CON+0.01) is about 1 and the initial threshold is close to 0.03. For images with insufficient illumination or blur, the initial threshold varies in the range of 0.02 to 0.03 according to the contrast.
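The mapping from contrast to initial threshold is simple enough to check numerically. The short sketch below evaluates it for a few contrast values; the function name is an illustrative assumption.

```python
def initial_threshold(con):
    """Initial threshold from the GLCM contrast CON, following Eq. (3)."""
    return (con / (con + 0.01)) * 0.01 + 0.02

# A few contrast values, from a flat image to a high-contrast one
samples = {con: initial_threshold(con) for con in (0.0, 0.01, 1.0, 50.0)}
```

A contrast of 0 gives the lower bound 0.02, a contrast of 0.01 gives 0.025, and contrasts of 1 or more give values just below 0.03, which matches the range stated in the text.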

The initial threshold alone cannot adjust the distribution of features or reduce the clustering effect. We therefore adjust the threshold once more, according to the distribution of feature points. For an image of size M×N and for each feature point, we examine the feature points within a certain neighborhood around it, say M/20×N/20, and then calculate a modification to threshold0 according to r_ij, the distance between the current feature point i and each of its surrounding feature points j. The modification value corresponding to r_ij is:

$$T_{ij} = \frac{M+N-40\,r_{ij}}{M+N} \tag{4}$$

Summing up the modification values, we obtain the distribution metric of feature point i:

$$T_i = \sum_{j \in Neighbor(i)} T_{ij} = \sum_{j \in Neighbor(i)} \frac{M+N-40\,r_{ij}}{M+N} \tag{5}$$

According to T_i, we adjust the threshold used in SIFT a second time. The final local threshold within a certain range of a feature point is

$$\mathrm{threshold}_i = \mathrm{threshold}_0 + \log_2\!\left(\sum_{j \in Neighbor(i)} \frac{M+N-40\,r_{ij}}{2(M+N)}\right) \tag{6}$$

Here we adopt the logarithm to restrain the range of the threshold: $\log_2(T_i/2) < 0$ if $T_i/2 < 1$, and vice versa. So the threshold can be lowered or raised based on the value of T_i.

In addition, it is necessary to restrain the threshold within a certain range in case it becomes too large or too small:

$$\mathrm{threshold}_i = \begin{cases} \max(\mathrm{threshold}_0/2,\ 0.01), & \mathrm{threshold}_i < \max(\mathrm{threshold}_0/2,\ 0.01) \\ \mathrm{threshold}_i, & \max(\mathrm{threshold}_0/2,\ 0.01) \le \mathrm{threshold}_i < 2\,\mathrm{threshold}_0 \\ 2\,\mathrm{threshold}_0, & \mathrm{threshold}_i \ge 2\,\mathrm{threshold}_0 \end{cases} \tag{7}$$

Thus the threshold lies within the interval [max(threshold0/2, 0.01), 2*threshold0].
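The second adjustment can be sketched as below, under several assumptions that the text does not state explicitly: the M/20×N/20 neighborhood is taken as a window centered on the feature point, the logarithm is taken to base 2, and a point with no neighbors (where the logarithm is undefined) receives the lower bound. The function name and argument layout are hypothetical.

```python
import numpy as np

def local_thresholds(points, image_shape, threshold0):
    """Per-keypoint threshold from the neighbor distribution, Eqs. (4)-(7).

    points      : array of (row, col) keypoint positions
    image_shape : (M, N), image rows and columns
    threshold0  : initial threshold obtained from the GLCM contrast
    """
    M, N = image_shape
    pts = np.asarray(points, dtype=float)
    lo = max(threshold0 / 2, 0.01)   # lower bound of Eq. (7)
    hi = 2 * threshold0              # upper bound of Eq. (7)
    result = np.empty(len(pts))
    for i in range(len(pts)):
        delta = np.abs(pts - pts[i])
        # Assumed: M/20 x N/20 window centered on keypoint i
        near = (delta[:, 0] <= M / 40) & (delta[:, 1] <= N / 40)
        near[i] = False              # a point is not its own neighbor
        r = np.hypot(delta[near, 0], delta[near, 1])
        t_i = np.sum((M + N - 40 * r) / (M + N))      # Eq. (5)
        if t_i > 0:
            t = threshold0 + np.log2(t_i / 2)         # Eq. (6), base 2 assumed
        else:
            t = lo                   # assumed handling of isolated points
        result[i] = min(max(t, lo), hi)               # clamp, Eq. (7)
    return result
```

In this sketch isolated keypoints receive a lower threshold and are more likely to be retained, while keypoints in dense clusters receive a higher one. With threshold0 near 0.03 the logarithmic term is usually much larger in magnitude than threshold0, so the clamp of Eq. (7) determines most final values.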

Mismatching Discard Method Based on Global Information


pair is large. We revised the method proposed in [9] to make threshold selection easier. The procedure is as follows:

First, SIFT feature descriptors are calculated in the two images using the method proposed by Lowe, with the aforementioned threshold.

For each pair of initially matched points, a new coordinate system is created in each image, based on the dominant orientation of the corresponding point. The image is divided into eight segments in steps of 45 degrees, and each segment is assigned a number between 1 and 8.

The other matched points are used to form the global feature vectors according to their segment indices in the new coordinate system. The dimension n of the vector is the same as the number of initial matched pairs.

For each of the initial matched pairs, calculate the absolute error vector $r = (r_1, r_2, \dots, r_n)$ by subtracting the two global feature vectors. If $r_i > 5$, let $r_i = 8 - r_i$. Then compute the total relative location error of r as

$$D = \sum_{i=1}^{n} |r_i|^2$$

If D is less than a preset threshold, say k×n (2<k<3), the initial matched pair is taken as a correct match; otherwise it is discarded.

Repeat the process for the other initial matched pairs until all of them are processed.
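The procedure above can be sketched as follows. This is a sketch under stated assumptions, not the authors' implementation: segments are numbered counter-clockwise starting from the dominant orientation, the entry of a pair for itself is set to zero error, D is taken as the sum of squared wrapped errors, and k = 2.5 is picked from the suggested range. All function and parameter names are hypothetical.

```python
import numpy as np

def segment_indices(points, center, orientation):
    """Segment number (1..8) of every point, seen from `center` in a frame
    rotated by the dominant orientation (radians) of `center`."""
    d = points - center
    angle = np.mod(np.arctan2(d[:, 1], d[:, 0]) - orientation, 2 * np.pi)
    # 45-degree segments; the modulo guards against angle == 2*pi
    return np.floor(angle / (np.pi / 4)).astype(int) % 8 + 1

def filter_matches(pts1, pts2, ori1, ori2, k=2.5):
    """Return a boolean mask marking the initial matches that are kept.

    pts1, pts2 : (n, 2) positions of the matched points in each image
    ori1, ori2 : dominant orientations (radians) of the matched points
    k          : threshold factor, 2 < k < 3 in the text
    """
    pts1 = np.asarray(pts1, dtype=float)
    pts2 = np.asarray(pts2, dtype=float)
    n = len(pts1)
    keep = np.zeros(n, dtype=bool)
    for i in range(n):
        g1 = segment_indices(pts1, pts1[i], ori1[i])   # global vector, image 1
        g2 = segment_indices(pts2, pts2[i], ori2[i])   # global vector, image 2
        r = np.abs(g1 - g2)
        r = np.where(r > 5, 8 - r, r)   # wrap-around rule as stated in the text
        r[i] = 0                        # assumed: no error for the pair itself
        D = np.sum(r ** 2)              # assumed: sum of squared errors
        keep[i] = D < k * n
    return keep
```

A correct pair sees the other matched points in roughly the same segments in both images, so its D stays small. A mismatched pair sees them in different segments and is discarded.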

Experiment Result and Discussion

We evaluated our algorithm on hundreds of images, comparing it with the standard SIFT and the method proposed in [7]. The experimental environment was an Intel Core 2 Duo CPU at 2.4 GHz with 2 GB of memory, and the development tool was Matlab R2010b.

Extracted SIFT under the Condition of Insufficient Illumination and Blur Image


Figure 2 and Figure 3 show typical experimental results of the algorithm proposed in [7] and of ours. We compare the SIFT points extracted under insufficient illumination and under blur.

Figure 2. The feature points extracted under insufficient illumination by the algorithm in [7] and ours.


Figure 3. The feature points extracted in a blurred image by the algorithm in [7] and ours.

The algorithm proposed in [7] performs worse on blurred images, as shown in Figure 3 (a). Compared with Figure 1 (d), the number of extracted points is almost the same, indicating that the thresholds are nearly the same. This is because the entropy changes little from common images to blurred images, and so do the thresholds. By using the contrast of the GLCM to reflect changes in the illumination and sharpness of an image, our method adapts to both conditions and adjusts the threshold properly, as shown in Figure 3 (b). Figure 4 shows the matching results of the three compared algorithms. Table 1 gives the matching results in numbers.


Table 1. Comparison of matching results.

Method          Measure                  Image1   Image2   Image3   Image4
Standard SIFT   Initial matching pairs   128      397      251      221
                Correct matches          101      279      238      172
Method in [7]   Initial matching pairs   139      428      425      245
                Correct matches          104      284      402      192
                Gain                     3.0%     1.8%     68.9%    11.6%
Our method      Initial matching pairs   136      322      405      318
                Correct matches          107      242      385      284
                Gain                     5.9%     -13.3%   61.7%    65.1%
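The gain rows are not defined explicitly in the text. The sketch below assumes that gain is the relative change in the number of correct matches with respect to the standard SIFT, an interpretation inferred from the table values; the variable and function names are illustrative.

```python
def gain_percent(correct, baseline_correct):
    """Relative change in correct matches versus the baseline, in percent."""
    return 100.0 * (correct - baseline_correct) / baseline_correct

# Correct-match counts for Image1..Image4, taken from Table 1
baseline = [101, 279, 238, 172]    # standard SIFT
method_7 = [104, 284, 402, 192]    # method in [7]
ours = [107, 242, 385, 284]        # proposed method

gains_7 = [gain_percent(c, b) for c, b in zip(method_7, baseline)]
gains_ours = [gain_percent(c, b) for c, b in zip(ours, baseline)]
```

Under this assumption the computed values agree with the reported gains to within about 0.1 percentage points, including the negative gain of the proposed method on Image2.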


Table 1 shows the number of matches obtained by the three methods for four typical images. For common images, our method gets results similar to those of the standard SIFT. For uncommon images, our method obtains a proper number of matching points by adjusting the threshold. Figure 5 shows the change of the feature points under the twice-adjusted threshold. After the two adjustments, the threshold varies with each feature point and its surrounding distribution.

Figure 5. Change of special feature points.

Conclusion

The performance of the standard SIFT algorithm declines under low illumination and blurred imaging. Based on the contrast of the GLCM and feature point distribution information, we propose a twice-adjusted threshold SIFT, and filter out mismatches using the distribution context of feature points. The experimental results show that the proposed method can adjust the number of feature points properly, and that its matching performance exceeds that of the standard SIFT and the algorithm in [7]. Our future research will focus on accelerating the algorithm to achieve real-time image matching.

Acknowledgement

This research was financially supported by the Key Science and Technology Program of Guizhou Province (2017GZ60903), China and the Science and Technology Research Program of Xi’an (2017086CG/RC049).

References

[1] David Lowe, Object recognition from local scale-invariant features, A. Proceedings of ICCV. 2 (1999) 1150-1157.

[2] David Lowe, Distinctive image features from scale-invariant keypoints, J. International Journal of Computer Vision. 60 (2004) 91-110.

[4] Ke Y., Sukthankar R., PCA-SIFT: A more distinctive representation for local image descriptors, C. Proceedings of IEEE International Conference on Computer Vision and Pattern Recognition, Washington, DC, USA, 2004, pp. 511-517.

[5] Morel J.M., Yu G.S., A-SIFT: A new framework for fully affine invariant image comparison, J. Society for Industrial and Applied Mathematics Journal on Imaging Sciences. 2 (2009) 438-469.

[6] R. Song, J. Szymanski, Well-distributed SIFT features, J. Electronics Letters. 45 (2009) 308-310.

[7] You Zhai, Luan Zeng, A SIFT matching algorithm based on adaptive contrast threshold, C. International Conference on Consumer Electronics, Communications and Networks, XiaNing, China, 2011, pp. 1934-1937.

[8] Fischler M., Bolles R., Random sample consensus: A paradigm for model fitting with applications to image analysis and automated cartography, J. Communications of the ACM. 24 (1981) 381-395.