International Journal of Emerging Technology and Advanced Engineering
Website: www.ijetae.com (ISSN 2250-2459,ISO 9001:2008 Certified Journal, Volume 4, Issue 6, June 2014)
Hybrid Feature Extraction Method for Partial Face Recognition
K. Marina Singha¹, Daizy Deb², Dr. Sudipta Roy³
¹PG Scholar, ²Assistant Professor, ³Associate Professor & Head, Department of IT, Assam University, India
Abstract— Face recognition remains one of the challenging tasks in computer vision and pattern recognition, and accurate recognition requires a robust system. Facial feature extraction is an important stage of the face recognition process because it captures the information on which recognition depends; a good feature extraction algorithm extracts the relevant information needed for accurate recognition. This paper discusses the SIFT (Scale Invariant Feature Transform) algorithm, one of the successful algorithms for local feature extraction, and 2DPCA (Two Dimensional Principal Component Analysis), an improved version of PCA (Principal Component Analysis), which is holistic in nature. We propose the use of SIFT and 2DPCA together as a facial feature extraction method.
Keywords— Facial feature extraction, partial face images, SIFT, 2DPCA
I. INTRODUCTION
Facial feature extraction is an important phase of the face recognition process. It uses various algorithms to extract the important facial features from a face image and to represent them compactly, so that less storage space is required in a database. The stored information must nevertheless be relevant enough for accurate face recognition. A robust face recognition system works well even under constraints such as illumination variation, orientation, pose variation and facial expression. The feature extraction algorithm plays an important role in such a system, since distinct and well-chosen features help in better face representation and recognition. There are two types of facial feature extraction methods: local and global (holistic). Local feature extraction methods extract local features such as the eyes, nose and mouth, while holistic methods consider the whole face image when extracting features. Both kinds of algorithm have been useful in face recognition because of the information they extract. Holistic feature extraction methods include Principal Component Analysis (PCA), Linear Discriminant Analysis (LDA) and Independent Component Analysis (ICA). PCA is considered one of the most successful algorithms used for face recognition to date.
Among the local feature extraction methods, the most successful algorithms include the Scale Invariant Feature Transform (SIFT) and the Gabor wavelet transform. Gabor features appear to be robust against local distortions caused by variations in illumination, expression and pose, and hence have been successfully applied to face recognition. The SIFT method provides features that are robust to occlusion, rotation, partial illumination change and noise. This paper discusses the SIFT algorithm for local feature extraction and 2DPCA, an extension of PCA, for global feature extraction. Partial face images are face images that are not holistic in nature. They include non-frontal face images and faces covered by a scarf, hat or sunglasses, or occluded by other objects. Various algorithms have been successful in face recognition, but they may not produce accurate results when it comes to partial faces. It is therefore necessary to choose algorithms that work well even under occlusion and other variations. Here we discuss the SIFT, PCA and 2DPCA algorithms and their importance, and look forward to using them to extract features for partial face recognition.
II. RELATED WORKS ON SIFT AND 2DPCA
SIFT
David G. Lowe introduced the SIFT algorithm to extract distinctive invariant features from images that could be used for reliable matching between different views of an object or scene. The method proved to robustly identify objects among clutter and occlusion [1].
Janez Križaj, Vitomir Štruc and Nikola Pavešić used the SIFT algorithm in their work on robust face recognition. They proposed a technique in which SIFT descriptors are computed at fixed locations learned during the training stage, which eliminates the need for keypoint detection on the test image. They evaluated their method on the Extended Yale B (EYB) database and compared its performance with popular algorithms such as PCA and LDA, concluding that their method performed better than both [2].
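Han Yanbin, Yin Jianqin and Li Jinping applied SIFT to human face feature extraction and recognition. Their experiments on the ORL face database showed that their proposed method performed much better than other techniques such as PCA, ICA, Fisherfaces and 2DPCA, with a recognition rate of 96.3% [3].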
Ajmal S. Mian, Mohammed Bennamoun and Robyn Owens presented a feature-based algorithm for robust textured 3D face recognition. The algorithm achieved a recognition rate of 96.6% when tested on the FRGC v2 data with different models of 2D and 3D faces [4].
Cong Geng and Xudong Jiang proposed a method of face recognition based on the multi-scale local structures of the face image. Their work included keypoint detection, a partial descriptor and insignificant keypoint removal. The proposed method performed better than SIFT and than holistic methods [5].
Mohamed Aly used the SIFT algorithm for face recognition and compared its performance with that of Eigenfaces and Fisherfaces. Experiments showed that only 30% of the features were required, saving 91% of the time needed to match all the extracted features. It was also observed that SIFT features provided better performance for up to a 50% reduction in resolution [6].
Patrik Kamencay, Martina Zachariasova, Robert Hudec, Roman Jarina, Miroslav Benco and Jan Hlubik introduced a novel method of face recognition using a hybrid of SIFT-PCA and KNN. They used graph-based segmentation to improve the precision of keypoints. Their experiments on the ESSEX database showed a performance rate of 96% [7].
2DPCA
Jian Yang, David Zhang, Alejandro F. Frangi and Jingyu Yang introduced 2DPCA for image representation in 2004. They evaluated the method on the ORL, AR and Yale databases under various conditions such as pose variation, sample size and facial expression. The experiments showed that 2DPCA was 20 times faster than PCA, and its recognition performance was also much better than that of PCA [8].
Liwei Wang, Xiao Wang, Xuerong Zhang and Jufu Feng showed that 2DPCA is equivalent to block-based PCA and used it for face recognition. Experiments performed on the FERET database showed better performance than block-based PCA [9].
K. Shilpa et al. used 2DPCA for image representation and recognition. Their experiments showed that the 2DPCA method for feature extraction was more efficient than PCA [10].
Benouis Mohamed et al. combined 2DPCA with other feature extraction methods and showed the advantages of using 2DPCA in their experiments [11].
Tae Young Kim et al. used 2DPCA for face recognition under occlusion. Experiments were performed on the AR database and comparisons were made with LDA and PCA; 2DPCA showed better results than both [12].
III. SCALE INVARIANT FEATURE TRANSFORM
The SIFT method was introduced by David G. Lowe for extracting features from images. The features are invariant to image scale and rotation, partially invariant to illumination change and 3D projective transformation, and are shown to provide robust matching across a substantial range of affine distortion, change in 3D viewpoint, addition of noise and change in illumination. SIFT features are also largely unaffected by occlusion, clutter and unwanted noise in the image. As they are highly distinctive, a single feature can be matched correctly against a large database of features. The set of image features is generated by the following four major stages of computation:
A. Scale-space Extrema Detection
This step identifies image locations and scales that can be repeatably identified from different views of the same object. Scale space and Difference of Gaussian (DoG) functions are used to detect stable keypoints: keypoints are located at scale-space extrema of the DoG function, which is computed as the difference between two Gaussian-smoothed images whose scales differ by a constant factor k. To detect local maxima and minima, each sample point is compared with its 8 neighbors at the same scale and with its 9 neighbors in each of the scales above and below. If its value is the minimum or maximum of all these points, the point is an extremum.
The scale space of an image is defined as a function, $L(x, y, \sigma)$, that is produced from the convolution of a variable-scale Gaussian, $G(x, y, \sigma)$, with an input image, $I(x, y)$:
$L(x, y, \sigma) = G(x, y, \sigma) * I(x, y)$ (1)
where $*$ is the convolution operation in $x$ and $y$, and
$G(x, y, \sigma) = \frac{1}{2\pi\sigma^2} e^{-(x^2 + y^2)/(2\sigma^2)}$ (2)
is a variable-scale Gaussian. The result of convolving an image with the difference-of-Gaussian filter $G(x, y, k\sigma) - G(x, y, \sigma)$ is given by
$D(x, y, \sigma) = L(x, y, k\sigma) - L(x, y, \sigma)$ (3)
Figure 1 [1]. The initial image is repeatedly convolved with Gaussians to produce the set of scale-space images on the left, and adjacent Gaussian images are subtracted to produce the difference-of-Gaussian images on the right. After each octave, the Gaussian image is down-sampled by a factor of 2.
Figure 2 [3]. Maxima and minima of the difference of Gaussian
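The extrema detection step can be illustrated with a minimal NumPy sketch. It is only an illustration under stated assumptions: the function names, the base scale sigma = 1.6, the factor k = sqrt(2) and the number of levels are illustrative choices, a single octave is processed without down-sampling, and sub-pixel refinement is omitted.

```python
import numpy as np

def gaussian_kernel(sigma):
    """1D Gaussian kernel truncated at about 3 sigma and normalised to sum to 1."""
    radius = max(1, int(np.ceil(3 * sigma)))
    x = np.arange(-radius, radius + 1)
    kernel = np.exp(-(x ** 2) / (2 * sigma ** 2))
    return kernel / kernel.sum()

def gaussian_blur(image, sigma):
    """Separable Gaussian smoothing, L = G * I, using edge padding."""
    kernel = gaussian_kernel(sigma)
    r = len(kernel) // 2
    padded = np.pad(np.asarray(image, dtype=float), r, mode="edge")
    # Filter along rows, then along columns; "valid" restores the input size.
    rows = np.apply_along_axis(lambda v: np.convolve(v, kernel, mode="valid"), 1, padded)
    return np.apply_along_axis(lambda v: np.convolve(v, kernel, mode="valid"), 0, rows)

def dog_stack(image, sigma=1.6, k=2 ** 0.5, levels=5):
    """Difference-of-Gaussian images D = L(k * sigma) - L(sigma) for one octave."""
    blurred = [gaussian_blur(image, sigma * k ** i) for i in range(levels)]
    return np.stack([blurred[i + 1] - blurred[i] for i in range(levels - 1)])

def local_extrema(dog, threshold=0.0):
    """Return (scale, row, col) of strict maxima or minima among the 26 neighbours."""
    found = []
    n_scales, height, width = dog.shape
    for s in range(1, n_scales - 1):
        for r in range(1, height - 1):
            for c in range(1, width - 1):
                value = dog[s, r, c]
                if abs(value) <= threshold:
                    continue
                # 3 x 3 x 3 cube: 8 neighbours at this scale, 9 above, 9 below.
                cube = dog[s - 1:s + 2, r - 1:r + 2, c - 1:c + 2]
                is_max = (cube >= value).sum() == 1
                is_min = (cube <= value).sum() == 1
                if is_max or is_min:
                    found.append((s, r, c))
    return found
```

Calling `local_extrema(dog_stack(image))` on a grayscale image array returns candidate keypoints as (scale index, row, column) triples; the later stages of SIFT would then refine and filter these candidates.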
B. Keypoint Localization in Laplacian Space
Here the detected keypoints are localized more precisely and low-contrast keypoints are discarded. After the location of an extremum is computed, the point is excluded if the value of the difference-of-Gaussian function at that location is below a threshold. Extrema that are poorly localized along an edge are also eliminated: these have a large principal curvature across the edge but a small curvature in the perpendicular direction in the difference-of-Gaussian function.
C. Assignment of Orientation
Each keypoint is assigned an orientation based on local image characteristics. An orientation histogram is formed from the gradient orientations of sample points within a region around the keypoint. The keypoint descriptor computed in the next step is represented relative to this orientation, which makes it invariant to image rotation.
D. Keypoint Descriptor
Here the feature descriptors are computed so that they tolerate local shape distortion and changes in illumination. The local image gradients are measured at the selected scale in the region around each keypoint.
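The orientation histogram used in the orientation assignment step can be sketched as follows. This is a simplified illustration, not the full procedure: the function name and the choice of 36 bins are assumptions made here, each sample is weighted only by its gradient magnitude, and the Gaussian window and peak interpolation of the complete algorithm are left out.

```python
import numpy as np

def dominant_orientation(patch, bins=36):
    """Return (dominant orientation in degrees, histogram) for a square patch.

    The orientation returned is the centre of the histogram bin with the
    largest magnitude-weighted vote.
    """
    patch = np.asarray(patch, dtype=float)
    dy, dx = np.gradient(patch)            # derivatives along rows and columns
    magnitude = np.hypot(dx, dy)
    angle = np.degrees(np.arctan2(dy, dx)) % 360.0
    hist, edges = np.histogram(angle, bins=bins, range=(0.0, 360.0),
                               weights=magnitude)
    peak = int(np.argmax(hist))
    return 0.5 * (edges[peak] + edges[peak + 1]), hist
```

For a patch taken around a keypoint, the returned angle would serve as the reference orientation relative to which the descriptor is expressed.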
Figure 3 [2]. SIFT descriptor
SIFT Features
Figure 4 [13]. SIFT features detected
Figure 4 shows the keypoints detected by the SIFT algorithm.
The important properties of SIFT features are:
SIFT generates a large number of features that densely cover the image over the full range of scales and locations.
Good-quality features are required for object recognition. With SIFT, as few as 3 correctly matched features are sufficient for reliable identification.
The SIFT features of a new image are compared with the features stored in the database using a nearest-neighbor algorithm based on Euclidean distance.
The keypoints are distinctive, which allows a single feature to find its correct match with good probability in a large database of features.
The filtering steps of the algorithm refine and select the most distinctive features, which remain the same even after changes in the size, orientation and illumination of the image. These invariant and distinctive nature of the SIFT
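The nearest-neighbor comparison of descriptors by Euclidean distance, together with the rule of at least 3 correct matches, can be sketched as below. The function names and the optional `max_distance` cut-off are assumptions introduced for illustration; descriptors are taken to be rows of a NumPy array.

```python
import numpy as np

def match_descriptors(query, database, max_distance=np.inf):
    """Match each query descriptor to its nearest database descriptor.

    Returns a list of (query index, database index, Euclidean distance).
    """
    query = np.asarray(query, dtype=float)
    database = np.asarray(database, dtype=float)
    # Pairwise Euclidean distances, shape (num_query, num_database).
    diff = query[:, None, :] - database[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=2))
    nearest = dist.argmin(axis=1)
    best = dist[np.arange(len(query)), nearest]
    return [(int(i), int(j), float(d))
            for i, (j, d) in enumerate(zip(nearest, best))
            if d <= max_distance]

def is_reliable(matches, min_matches=3):
    """Accept an identification only if enough features were matched."""
    return len(matches) >= min_matches
```

A brute-force distance matrix is adequate for small galleries; a large database would normally call for an approximate nearest-neighbor index, which is outside the scope of this sketch.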
IV. TWO DIMENSIONAL PRINCIPAL COMPONENT ANALYSIS
Two Dimensional Principal Component Analysis (2DPCA) is similar to conventional Principal Component Analysis in that both produce eigenvectors and eigenvalues that are used during recognition. The difference lies in the calculation of the covariance matrix. In PCA, the images are first converted into column vectors and the covariance matrix is then calculated from these vectors. In 2DPCA, the covariance matrix is calculated directly from the image matrices, and hence it is much smaller than the one calculated in PCA.
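To give a sense of the difference in size, the short sketch below compares the two covariance matrices for an image of m rows and n columns. The helper name and the 112 × 92 example size are illustrative assumptions, not values taken from the text above.

```python
def covariance_sizes(m, n):
    """Shapes of the covariance matrices for m x n images.

    PCA vectorises each image to length m * n, so its covariance matrix is
    (m * n) x (m * n). 2DPCA works on the image matrices directly and its
    image covariance matrix is n x n.
    """
    pca_shape = (m * n, m * n)
    twodpca_shape = (n, n)
    return pca_shape, twodpca_shape

pca_shape, twodpca_shape = covariance_sizes(112, 92)
print(pca_shape, twodpca_shape)
```

For this assumed image size, PCA would require a 10304 × 10304 matrix, whereas 2DPCA needs only a 92 × 92 matrix.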
The 2DPCA method can be described as follows:
Let $X$ be an n-dimensional unitary column vector. An image $A$ of size $m \times n$ is projected onto $X$ by the following linear transformation:
$Y = AX$ (4)
Thus, we obtain an m-dimensional projected vector Y, which is called the projected feature vector of image A. The total scatter of the projected samples can be introduced to measure the discriminatory power of the projection vector X. The total scatter of the projected samples can be characterized by the trace of the covariance matrix of the projected feature vectors.
The covariance matrix of the projected feature vectors of the training samples can be denoted by:
$S_x = E[(Y - EY)(Y - EY)^T]$
$= E[(AX - E(AX))(AX - E(AX))^T]$
$= E[((A - EA)X)((A - EA)X)^T]$ (5)
Therefore,
$\mathrm{tr}(S_x) = X^T E[(A - EA)^T (A - EA)] X$ (6)
Now, let
$G_t = E[(A - EA)^T (A - EA)]$ (7)
$G_t$ can be calculated directly from the training image samples. Suppose that there are $M$ training sample images in total, the j-th training image is denoted by an $m \times n$ matrix $A_j$ ($j = 1, 2, \ldots, M$), and the average image of all training samples is denoted by $\bar{A}$. Then $G_t$ can be evaluated by
$G_t = \frac{1}{M} \sum_{j=1}^{M} (A_j - \bar{A})^T (A_j - \bar{A})$ (8)
So, equation (6) can be expressed as
$\mathrm{tr}(S_x) = X^T G_t X$ (9)
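The step from equation (5) to equation (6) may be made explicit as follows. The derivation uses only the linearity of the trace and expectation and the identity $\mathrm{tr}(vv^T) = v^T v$ for a column vector $v$; the shorthand $v = (A - EA)X$ is introduced here for convenience.

```latex
\begin{align*}
\mathrm{tr}(S_x)
  &= \mathrm{tr}\, E\big[ v v^T \big], \qquad v = (A - EA)X \\
  &= E\big[ \mathrm{tr}( v v^T ) \big] \\
  &= E\big[ v^T v \big] \\
  &= E\big[ X^T (A - EA)^T (A - EA) X \big] \\
  &= X^T E\big[ (A - EA)^T (A - EA) \big] X .
\end{align*}
```

The last line follows because $X$ is a fixed (non-random) vector and can be taken outside the expectation.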
The unit column vector $X$ that maximizes equation (9) is called the optimal projection axis. This means that the total scatter of the projected samples is maximized after the projection of an image matrix onto $X$, so that the discriminative power of the projection vector $X$ is also maximized. The optimal projection axis $X_{opt}$ is the unitary vector that maximizes equation (9), i.e., the eigenvector of $G_t$ corresponding to the largest eigenvalue. Since a single optimal projection axis is generally not enough, a set of projection axes $X_1, X_2, \ldots, X_d$ needs to be selected that satisfies the following criterion:
$\{X_1, X_2, \ldots, X_d\} = \arg\max \mathrm{tr}(S_x)$, subject to $X_i^T X_j = 0$ for $i \neq j$ (10)
The optimal projection axes $X_1, X_2, \ldots, X_d$ are the orthonormal eigenvectors of $G_t$ corresponding to the first $d$ largest eigenvalues. These vectors are used for feature extraction.
For a given image $A$, let
$Y_k = A X_k$, $k = 1, 2, \ldots, d$
Then the family of projected feature vectors $Y_1, \ldots, Y_d$ are called the principal components (vectors) of the sample image $A$.
The principal component vectors obtained are used to form an $m \times d$ matrix $B = [Y_1, \ldots, Y_d]$, which is called the feature matrix or feature image of the image $A$.
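The computation of the projection axes and of the feature matrix described above can be sketched in NumPy as follows. The function names are illustrative assumptions, and the training images are assumed to be stacked in an array of shape (M, m, n).

```python
import numpy as np

def fit_2dpca(images, d):
    """Estimate the mean image and the d optimal projection axes.

    images: array of shape (M, m, n). Returns (mean of shape (m, n),
    axes of shape (n, d)) with one projection axis X_k per column.
    """
    A = np.asarray(images, dtype=float)
    mean = A.mean(axis=0)
    centred = A - mean
    # G_t = (1/M) * sum_j (A_j - mean)^T (A_j - mean), an n x n matrix.
    G_t = np.einsum("jmk,jml->kl", centred, centred) / A.shape[0]
    # eigh handles symmetric matrices and returns eigenvalues in ascending order.
    eigvals, eigvecs = np.linalg.eigh(G_t)
    order = np.argsort(eigvals)[::-1][:d]
    return mean, eigvecs[:, order]

def project_2dpca(image, axes):
    """Feature matrix B = [Y_1, ..., Y_d] with Y_k = A X_k, shape (m, d)."""
    return np.asarray(image, dtype=float) @ axes
```

Because $G_t$ is only n × n, the eigen-decomposition stays cheap even for large images, which is the practical benefit of working with image matrices directly.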
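A nearest neighbor classifier is used for classification. The distance between two arbitrary feature matrices, $B_i = [Y_1^{(i)}, \ldots, Y_d^{(i)}]$ and $B_j = [Y_1^{(j)}, \ldots, Y_d^{(j)}]$, is defined by
$d(B_i, B_j) = \sum_{k=1}^{d} \| Y_k^{(i)} - Y_k^{(j)} \|_2$
where $\| Y_k^{(i)} - Y_k^{(j)} \|_2$ is the Euclidean distance between the two principal component vectors $Y_k^{(i)}$ and $Y_k^{(j)}$.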
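A minimal sketch of this distance and of the nearest neighbor decision is given below. The function names, and the representation of the gallery as a list of feature matrices with a parallel list of labels, are assumptions made for illustration.

```python
import numpy as np

def feature_distance(B_i, B_j):
    """Sum over the d columns of the Euclidean distance between Y_k(i) and Y_k(j)."""
    difference = np.asarray(B_i, dtype=float) - np.asarray(B_j, dtype=float)
    return float(np.linalg.norm(difference, axis=0).sum())

def nearest_neighbour(B_query, gallery, labels):
    """Return the label of the gallery feature matrix closest to B_query."""
    distances = [feature_distance(B_query, B) for B in gallery]
    return labels[int(np.argmin(distances))]
```

Note that the distance sums column-wise Euclidean norms, so it generally differs from the Frobenius norm of the difference between the two feature matrices.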
A. Advantages of 2DPCA over PCA
2DPCA is a matrix-based feature extraction method: it works directly on face image matrices.
In PCA, the face image matrix must first be transformed into a 1D vector, whereas in 2DPCA the images need not be converted to 1D vectors, hence structural information is not lost.
The image covariance matrix in 2DPCA has much lower dimensionality than the covariance matrix in PCA, so it is more efficient to work with.
2DPCA can evaluate the covariance matrix directly from the image matrices and is computationally more efficient than PCA.
V. CONCLUSION
From the related works and the advantages of both SIFT and 2DPCA, it can be concluded that a combination of the two algorithms is a good approach to feature extraction for partial face recognition. Partial faces do not contain the data of a full face, so recognition is difficult when the complete face is not available. Since the SIFT algorithm extracts distinctive features that are invariant to scale, orientation and illumination, these features remain useful for recognition even when global features are not available. 2DPCA is helpful for global feature extraction as well as for dimension reduction. As it has been reported to perform better than PCA and to be successful in face recognition under occlusion, it seems promising to use SIFT and 2DPCA together as a hybrid method to extract features for partial face recognition.
REFERENCES
[1] Lowe, David G. "Distinctive image features from scale-invariant keypoints." International journal of computer vision 60.2 (2004): 91-110.
[2] Križaj, Janez, Vitomir Štruc, and Nikola Pavešić. "Adaptation of SIFT features for robust face recognition." Image Analysis and Recognition. Springer Berlin Heidelberg, 2010. 394-404.
[3] Yanbin, Han, Yin Jianqin, and Li Jinping. "Human face feature extraction and recognition base on SIFT." Computer Science and Computational Technology, 2008. ISCSCT'08. International Symposium on. Vol. 1. IEEE, 2008.
[4] Mian, Ajmal S., Mohammed Bennamoun, and Robyn Owens. "Keypoint detection and local feature matching for textured 3D face recognition." International Journal of Computer Vision 79.1 (2008): 1-12.
[5] Geng, Cong, and Xudong Jiang. "Face recognition based on the multi-scale local image structures." Pattern Recognition 44.10 (2011): 2565-2575.
[6] Aly, Mohamed. "Face recognition using SIFT features." CNS/Bi/EE report 186 (2006).
[7] Kamencay, Patrik, et al. "A Novel Approach to Face Recognition using Image Segmentation Based on SPCA-KNN Method." Radioengineering 22.1 (2013).
[8] Yang, Jian, et al. "Two-dimensional PCA: a new approach to appearance-based face representation and recognition." Pattern Analysis and Machine Intelligence, IEEE Transactions on 26.1 (2004): 131-137.
[9] Wang, Liwei, et al. "The equivalence of two-dimensional PCA to line-based PCA." Pattern Recognition Letters 26.1 (2005): 57-60.
[10] Shilpa, K., Syed Musthak Ahmed, and A. Venkata Ramana. "Face Representation And Recognition Using Two-Dimensional PCA." International Journal of Computer Technology & Applications 3.1 (2012).
[11] Mohamed, Benouis, and Senouci Mohamed. "Face recognition approach based on two-dimensional subspace analysis and PNN." Programming and Systems (ISPS), 2013 11th International Symposium on. IEEE, 2013.
[12] Kim, Tae Young, et al. "Occlusion invariant face recognition using two-dimensional pca." Advances in Computer Graphics and Computer Vision. Springer Berlin Heidelberg, 2007. 305-315.
[13] Da, Bangyou, and Nong Sang. "Local binary pattern based face