A Review on Subspace Methods for Face Image Recognition

(1)

International Journal of Emerging Technology and Advanced Engineering

Website: www.ijetae.com (ISSN 2250-2459,ISO 9001:2008 Certified Journal, Volume 4, Issue 2, February 2014)

816

A Review on Subspace Methods for Face Image Recognition

Fousiya K K

1

, Kumary R Soumya

2

Asst. Professor, Department of Computer Science & Engineering, Jyothi Engineering College Thrissur, Kerala

Abstract— Pattern recognition systems have recently attained significant attention in the field of image processing. Image recognition becomes a challenging issue in real world applications because of the high dimensional images. Curse of dimensionality is a major issue in pattern recognition systems. This higher dimensionality causes a degradation of the classification accuracy of the classifier. By providing better dimensionality reduction as well as higher class discrimination prior to the classification process, a classifier can provide higher classification accuracy. The aim of this paper is to provide an overview of the various pattern recognition methods and its relevance to improve the classification accuracy of the existing face recognition systems.

Keywords—PCA, LDA, SVM, Nearest Neighbour, Neural Network

I. INTRODUCTION

Pattern recognition is a rapidly growing area in the field of image processing. Because of the higher dimensionality of images like digital photographs, satellite images, medical images etc, the problem of degradation performance, lower efficiency of the recognition system etc occurred. For achieving the efficient processing, we used different dimensionality reduction techniques without losing the necessary information. Dimensionality reduction is the process of transforming high dimensional data into a lower dimensional representation.

Because of the high dimensional face images, face recognition systems have to address the dimensionality reduction problem. Normally, a face image having m*m pixels is represented by a vector in Rm*m . This m*m dimensional space is too large for recognition process. Analyzing the data set having a large number of variables

increases the computational complexity of

the

recognition

process. When dealing with machine learning and data mining algorithms, this high dimensionality causes degradation in the efficiency of the classifier system. Thus, for solving the curse of dimensionality, various dimensionality reduction methods have been applied as a pre-processing step to classification. The remainder of this paper is organized as follows. In section II, we describe about pattern recognition techniques. Section III presents related works and conclusion is described in section IV.

II. PATTERN RECOGNITION TECHNIQUES

Pattern recognition can be defined as the process of assigning a label to a set of values or the process of assigning each input value to one of a given set of groups known as classes. The applications of pattern recognition includes text classification, document classification, Optical Character Recognition (OCR), image analysis etc. The commonly used methods for pattern recognition include template matching, statistical pattern recognition, syntactic methods and artiﬁcial neural networks. Normally, a pattern recognition system goes through the different stages as shown in Figure.1.

[image:1.612.352.521.350.507.2]

Figure. 1. Pattern Recognition Stages

Image acquisition is the process of retrieving an image

from any hardware based source like highresolution digital

camera. The pre-processing step includes a series of operations performed on the input image. It actually enhances the image, making it suitable for the next phase of operation. The various operations performed on the image in the preprocessing step include binarization, noise removing etc. Binarization process converts a gray scale or colour image into a binary image using global (Otsu‘s

method) or local thresholding technique. Image

enhancement is done through the noise removal operation. Feature extraction is a type of dimensionality reduction technique that effectively extracts the relevant information called feature vectors from the input data. Classification is the important phase of any recognition system.

Preprocessing

Recognition/Classification

Image Acquisition

(2)

International Journal of Emerging Technology and Advanced Engineering

817 It is the process of assigning a label to a set of given input patterns. In general, the performance of a recognition system is determined by how exactly the feature vectors are extracted and classify them into a group accurately.

Principal Component Analysis(PCA) [1] is a statistical method processed in the linear domain. The purpose of PCA is to reduce the high dimensional input data in to smaller dimensionality of feature space, which is needed for the recognition process. For this purpose, the Eigenface[2] algorithm is used to find the feature vectors which are the most amount of face images information within the entire image space. The subspace which represents these feature vectors forms a face space. All faces in the training set are projected onto the face space to find a set of weights that describes the contribution of each vector in the face space. To identify a test image, it requires the projection of the test image onto the face space to obtain the corresponding set of weights. By comparing the weights of the test image with the set of weights of the faces in the training set, the face in the test image can be identified.

Linear Discriminant Analysis (LDA) [1] is also a statistical method processed in the linear domain. The Fisherface approach is used for dimensionality reduction in this technique. The aim of LDA is to maximize the discrimination between different classes, while minimizing the within class distance. In classification systems, LDA is superior to PCA because, it provides higher class discrimination by using the class information and so, LDA

is widely used in face recognition systems. LDA algorithm

selects features that are most effective for class separation

while PCA selects features important for class

representation.

In Support Vector Machine (SVM) [5] based classification systems, the classification is done by the construction of a new hyper plane that provides maximum separation between the data item. That is, the margin of the hyper plane should be maximum. The vectors which are lying near to the hyper plane are called support vectors. Every point in the input space is non-linearly mapped into a high dimensional feature space. This mapping is done using kernel functions. In [9], the kernel functions used for experiments include linear, polynomial and radial basis function kernels.

Neural Network (NN) classifiers show the better classification results compared to the conventional classifier like Nearest Neighbour Classifier. Neural Network classifiers are one of the most efficient classifiers because of its simplicity, and good learning ability of the NN.

The Back propagation feed forward neural network is the one of the best NN classifier used for better classification. The Back Propagation Neural Network consists of input layers, hidden layers and output layers. The number of neurons to the input layer is depends on the size of the features extracted from the input. The number of neurons to the output layer depends on the total number of classification classes. The complexity of the system increases as the number of hidden layers increases. Training can be done by entering the inputs to the back propagation neural network, based on that, a back propagation feedback error spread over the network and updates the weights for the correct classification.

III. RELATED WORK

Recently a significant number of works have been done on face recognition. In security systems, face recognition systems have wide spread application. Face recognition methods are classified as feature based methods and subspace methods. Feature-based methods extract the local features such as the position of the ear, eyes, nose, mouth

etc. Subspace method tries to reduce the dimension of the

data, while retaining the maximum separation among the distinct classes.

‗Eigenface[2][11]‘ and ‗Fisherface‘ [2] are the two commonly used subspace methods for face recognition. The ‗Eigenface‘ approach uses the linear dimensionality reduction method Principal Component Analysis (PCA) for subspace generation, whereas the ‗Fisherface‘ approach uses the linear supervised dimensionality reduction method Linear Discriminant Analysis (LDA).

PCA finds a projection on a lower dimensional representation, where along the principal component most of the data variation occurs in an unsupervised manner. PCA provides the accurate representation of the data with minimum reconstruction error and also finds the best axis for projection [4]. The main aim of LDA is to maximize the discrimination between different classes, while minimizing the within class distance. In classification systems, LDA is superior to PCA because, it provides higher class discrimination by using the class information [3][4] and so, LDA is widely used in face recognition systems. Class discrimination capability of PCA is very less compared to LDA [3].

(3)

International Journal of Emerging Technology and Advanced Engineering

818 The main aim of LDA is to maximize the discrimination between different classes, while minimizing the within class distance. In classification systems, LDA is superior to PCA because, it provides higher class discrimination by using the class information and so, LDA is widely used in face recognition systems. Class discrimination capability of PCA is very less compared to LDA. LLE and Isomap are the non linear dimensionality reduction techniques [3] [4][10]. LLE is suitable for processing the high dimensional data that contains non linear structures. LLE has been applied for face recognition, but the discrimination capability is less compared to LDA [10]. LLE has wide spread application in many areas of science including data analysis and data visualization. In order to recover the true dimensionality and geometric structure of the high dimensional data, Isomap is preferred [6]. Isomap

retainsthe global structure but, computationally expensive

for large datasets. Furthermore, many work has been done in the literature by combining different dimensionality techniques for face recognition [6][8]. However, the classification accuracy of such combinations is not so high.

Mahesh Pal et al. [5] discussed about the need of dimensionality reduction as a preprocessing step to classification. Support Vector Machine (SVM) is a widely used method for classification. SVM constructs a hyper plane in which the margin between different classes is high. Therefore, it provides better classification accuracy. Experiments are conducted for classification of hyper spectral data using SVM. Experimental results show that the classification accuracy of SVM can be increased by reducing the dimensionality of the data [5]. Thus, the results proved that, dimensionality reduction is an essential pre-processing stage for classification by SVM. By the addition of features, the classification accuracy of SVM. However, even for large training data set, dimensionality reduction has an important role. Experimental results show that there is a strong dependence between dimensionality reduction and classification accuracy of SVM. When the feature dimensions considered were 55 and 65, the classification accuracy was 92.24 and 91.76%.

In [6], Jianke Li et al. introduced a method for face recognition system based on the combination of Principal Component Analysis (PCA) and Linear Discriminant Analysis (LDA) for feature extraction. After performing the PCA transformation, the resulting components are uncorrelated and have less reconstruction error. LDA increases the class separation and extracts the LDA features, which is better for classification. So, for providing higher discrimination of the samples, LDA is used after PCA transformation.

Thus PCA forms a PCA subspace and the combination of PCA and LDA form the LDA subspace having high

discrimination. For classification purpose, Nearest

Neighbor Classifier (NNC) is used. Experiments are conducted on the ORL face database containing 400 images of 40 individuals. Classification accuracy for different training sample size and for various feature dimensions was calculated. An increase in training sample size causes an increase in the classification accuracy of both PCA method and combined PCA and LDA method. Also, it can be concluded that the PCA and LDA combination method provides better classification accuracy than the PCA alone method for face recognition [6].

In [7], Md. Omar Faruqe et al. proposed a face recognition system using Principal Component Analysis (PCA) and support vector machine (SVM). Since the major components of a face recognition system are feature extraction and classification, its performance strongly depends on the factors including the way in which the features are extracted, how accurately the features are classified into a group and the selection of a classifier. Therefore, the selection of the feature extractor and the classifier is very important in a face recognition system. In this work, PCA for feature extraction and SVM for classification are used.

In SVM based classification systems, the classification is done by the construction of a new hyper plane that provides maximum separation between the data item that is, the margin of the hyper plane should be maximum. The vectors which are lying near to the hyper plane are called support vectors. Every point in the input space is non-linearly mapped into a high dimensional feature space. This mapping is done using kernel functions. In [9], the kernel functions used for experiments include linear, polynomial and radial basis function kernels. Experiments are conducted on the ORL face database containing 400 images of 40 individuals. Randomly selected 200 samples serves as the training set. This sample set is used for constructing Eigen faces and also for training the SVM. A change in feature vector size causes a change in input vector size of the classifier because; the output after PCA transformation is given directly to the classifier. From the experiments conducted it can be concluded that, PCA and SVM combination provides better classification accuracy for face recognition.

(4)

International Journal of Emerging Technology and Advanced Engineering

819 Firstly, the image dimensionalities are reduced using PCA. After that, highly class discriminant features are extracted using LDA.

In order to improve the accuracy of the method for face classification, projected face images are given to a neural network using feed forward back propagation technique. This neural network consists of three layers, the input layer, the hidden layer and the output layer. Tangent sigmoid function is used as activation function in the input and hidden layers and pure linear function is applied in the output layer and simple back-propagation learning rule is used for training and updating the weights. Output of the neural network is a vector such that each element of that vector represents similarity of the input image to one of available classes. For experimentation, a three layer perceptron neural network with 40 neurons in the input layer, 20 neurons in the hidden layer and 10 neurons in the output layer, for classification of the input images is used. The experiments are conducted on YALE face datasets [8].

A classification system always provides higher classification accuracy only if it is preceded by a better

dimensionality reduction stage and better class

discrimination stage. Table I summarizes the different dimensionality reduction methods used for face recognition in the literature survey.

TABLEI

CLASSIFICATION ACCURACY VS FEATURE DIMENSION

Feature Extraction

Method

Classifier Feature Dimension

Classification Accuracy

PCA NNC 40 85%

PCA+ LDA NNC 40 93%

PCA +

LDA

Neural

Network

40 96.5%

PCA SVM 40 97%

The literature survey shows that whatever be the classifier used for classification, dimensionality reduction is an essential step prior to the classification stage [5]. When the Nearest Neighbor Classifier (NNC) is used for classification of face images in ORL database, PCA alone method provided a classification accuracy rate of 85%, which is satisfactory.

But, this rate is significantly improved into 93% by using PCA and LDA combination. It is clear that,

combination of PCA and LDA is very much better than

PCA alone for improving classification accuracy of SVM. By using combination of PCA and LDA for feature extraction and Neural Network for classification, there is an improvement in the classification accuracy as 96.5%. From the experiments conducted on the ORL face database using SVM as a classifier, it is clear that PCA alone method provides better classification accuracy of 97% for the same feature dimension [7].

IV. CONCLUSION

Curse of dimensionality is a major problem in recognition systems. It can be overcome by using an effective dimensionality reduction technique as a

pre-processing step. Studies show that, the feature

dimensionality has an adverse impact on the classification accuracy of any classifier. Therefore, it is better to perform

a dimensionality reduction method prior to the

classification stage. PCA is a widely used dimensionality

reduction method. Providing better dimensionality

reduction and higher class discrimination prior to the classification process leads to higher classification accuracy. PCA is an extensively used dimensionality reduction method. But, the class discrimination capability is limited. It demands a dimensionality reduction method with high class discrimination. LDA gives higher class discrimination among the different classes. By the induction of LDA between PCA and SVM improves the classification accuracy. It proposes the combination of PCA and LDA prior to SVM improves an improvement in the classification accuracy.

REFERENCES

[1] Aleix M. MartoAnez and Avinash C. Kak, ―PCA versus LDA‖,

IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 23, no. 2, pp. 228-233, February 2001.

[2] Peter N. Belhumeur, Joao P. Hespanha, and David J. Kriegman,

―Eigenfaces vs. Fisherfaces: Recognition Using Class Specific Linear Projection", IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 19, NO. 7, July 1997

[3] S.T. Roweis and L. Saul, ―Nonlinear dimensionality reduction by

locally linear embedding", Science, vol 290 22, No. 5500, December 2000, pp. 2323-2326.

[4] Joshua B. Tenenbaum, Vin de Silva and John C. Langford, ―A

Global Geometric Framework for Nonlinear Dimensionality Reduction", science, VOL 290 22, December 2000.

[5] Mahesh Pal and Giles M. Foody, ―Feature Selection for

(5)

International Journal of Emerging Technology and Advanced Engineering

820

[6] Jianke Li, Baojun Zhao and Hui Zhang, ―Face Recognition Based

on PCA and LDA Combination Feature Extraction", The 1st International Conference on Information Science and Engineering, 2009

[7] Md. Omar Faruqe and Md. Al Mehedi Hasan, ―Face Recognition

Using PCA and SVM", The 3rd International Conference on Antiounterfeiting, Security, and Identi_cation in Communication, 2009

[8] A. Hossein Sahoolizadeh, B. Zargham Heidari, and C. Hamid

Dehghani ―A New Face Recognition Method using PCA,LDA and Neural network‖, World Academy of Science, Engineering and Technology 41 2008

[9] A.Zagouras, A.Macedonas, G.Economou and S.Fotopoulos, ―An

application study of manifold learning-ranking techniques in face recognition", Multimedia Signal Processing, 2007, Pages: 445 – 448

[10] Junping Zhang, Huanxing Shen and Zhi-Hua Zhou, ―Unified Locally

Linear Embedding and Linear Discriminant Analysis Algorithm for Face Recognition", Sinobiometrics, LNCS 3338, pp. 269-304, Springer 2004

[11] M.Truk and A. Pentland,‖ Eigen faces for recognition‖, Journal of