A FAST ALGORITHM FOR RECOGNITION OF BASIC FACIAL EXPRESSIONS WITH GABOR FILTER BANK
Hadi Seyedarabi
Faculty of Electrical Engineering, University of Tabriz, Tabriz, Iran
Saeid Fazli
Electrical Engineering Department, Zanjan University, Zanjan, Iran
Reza Afrouzian
Faculty of Electrical Engineering, University of Tabriz, Tabriz, Iran
Abstract:
Automatic recognition of facial expressions is a useful tool for human-computer interaction (HCI) systems. Feature extraction plays an important role in the accuracy of facial expression recognition (FER) systems. This paper uses a Gabor filter bank for feature extraction and linear discriminant analysis (LDA) for feature reduction. The Gabor filter bank is one of the most useful tools for feature extraction and can represent local features in both the spatial and frequency domains, but it imposes a heavy computational burden on facial expression systems. To reduce this burden and improve speed, this paper uses only half of the face for processing and feature extraction. Experimental results show that, while improving computational speed, the proposed method achieves accuracy close to that of classical methods that use the whole face for recognition of the six basic emotions.
Keywords: Facial expression recognition (FER); Gabor filter bank; half of the face image; linear discriminant analysis (LDA).
1. Introduction
In recent years, automatic analysis of facial expressions has become one of the important challenges in the field of computer vision. This relatively recent research area has focused on sensing, detecting and interpreting human affective states, and on devising appropriate means for handling this affective information in order to enhance current human-machine interface designs [Delac and Grgic, (2007)].
Applications such as computer-based advisors, virtual information desks, automatic recognition of sign language, driver fatigue detection, video compression and lip reading have emerged as a result of research in this field.
al, (1997)] who used fisherfaces (LDA algorithm) and some researchers [Pantic and Rothkrantz, (2000) & Deng et al, (2005)] used Gabor wavelets to extract facial features for facial expression recognition. The Gabor filter is one of the appearance-based methods used in facial analysis in recent years. It removes most of the variability in an image due to variation in lighting and contrast, while remaining robust against small shifts and deformations [Lades et al, (1992)]. Lades et al. applied Gabor wavelets to object recognition [Lades et al, (1992)], and Liu and Wechsler used Gabor wavelets for face recognition, employing various methods including LDA [Liu and Wechsler, (2002)] and ICA [Liu and Wechsler, (2003)] for feature reduction. Lyons et al. used Gabor wavelets for recognition of the six basic emotions [Lyons et al, (1999)]. Bartlett et al. combined Gabor filters with boosting methods for facial expression recognition [Bartlett et al, (2003)].
This paper also uses a Gabor filter bank with five scales and eight orientations for feature extraction. Because the Gabor filter bank is computationally expensive and the face exhibits symmetry, this paper uses half of the face image instead of the whole face. Experimental results show that the proposed method has a recognition rate close to that of classical methods, while its time consumption is much smaller.
The organization of this paper is as follows: Section 2 gives a general overview of the Gabor filter and LDA, while Section 3 describes the proposed method. The experimental results are presented in Section 4, and the conclusions form the last section.
2. General Tools
2.1. Gabor Filter
The Gabor filter is a useful tool for extracting local features in both the spatial and frequency domains and appears to be a good approximation to the sensitivity profiles of neurons found in the visual cortex of higher vertebrates [Lee, (1996)]. In the spatial domain, a Gabor wavelet is defined as a complex exponential modulated by a Gaussian function, given by the following equation:
$$\psi(x, y, \omega, \theta) = \frac{1}{2\pi\delta^{2}} \, e^{-\frac{x'^{2} + y'^{2}}{2\delta^{2}}} \left[ e^{i\omega x'} - e^{-\frac{\omega^{2}\delta^{2}}{2}} \right] \qquad (1)$$
where (x, y) is the pixel position in the spatial domain,
ω is the central frequency of a sinusoidal plane wave, θ is the orientation of the Gabor filter and δ is the standard deviation of the Gaussian along both the x and y directions. The second term of the Gabor wavelet, $e^{-\omega^{2}\delta^{2}/2}$, compensates for the DC value, because the cosine component has nonzero mean (DC response) while the sine component has zero mean [Deng et al, (2005)]. The parameters x′ and y′ are given by the following equations:
$$x' = x\cos\theta + y\sin\theta, \qquad y' = -x\sin\theta + y\cos\theta \qquad (2)$$
We set δ = π/ω to define the relationship between δ and ω in our experiments [Deng et al, (2005)]. Gabor filter banks with different frequencies and orientations have been used to extract features from face images. In most cases a Gabor filter bank with five frequencies and eight orientations is used. Figure 1 shows the Gabor filter bank with five different scales and eight different orientations and Figure 2 shows its frequency responses.
The following equations give the five frequencies (m=1,2,...,5) and eight orientations (n=1,2,…,8) of the Gabor filter bank:
$$\omega_m = \frac{\pi}{2} \times \sqrt{2}^{\,-(m-1)}, \qquad \theta_n = \frac{\pi}{8} \times (n-1) \qquad (3)$$
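As an illustration of Eqs. (1) to (3), the following is a minimal sketch in Python (standard library only) that generates the 40 parameter pairs of the bank and samples one kernel on an integer grid. It assumes the relation δ = π/ω and reads the frequency step in Eq. (3) as a factor of √2; the function names and the `half_size` truncation radius are hypothetical and are not taken from the original MATLAB implementation.

```python
import cmath
import math


def bank_parameters(num_scales=5, num_orientations=8):
    """Return the (omega, theta) pairs of the filter bank, following Eq. (3)."""
    params = []
    for m in range(1, num_scales + 1):
        omega = (math.pi / 2) * math.sqrt(2) ** (-(m - 1))
        for n in range(1, num_orientations + 1):
            theta = (math.pi / 8) * (n - 1)
            params.append((omega, theta))
    return params


def gabor_kernel(omega, theta, half_size=None):
    """Sample Eq. (1) on an integer grid centred at the origin.

    half_size is a hypothetical truncation radius; by default the window
    extends four standard deviations on each side.
    """
    delta = math.pi / omega  # assumed relation between delta and omega
    if half_size is None:
        half_size = math.ceil(4 * delta)
    dc = math.exp(-(omega ** 2) * (delta ** 2) / 2)  # DC-compensation term
    norm = 1.0 / (2 * math.pi * delta ** 2)
    kernel = []
    for y in range(-half_size, half_size + 1):
        row = []
        for x in range(-half_size, half_size + 1):
            # Rotated coordinates, Eq. (2)
            xr = x * math.cos(theta) + y * math.sin(theta)
            yr = -x * math.sin(theta) + y * math.cos(theta)
            envelope = math.exp(-(xr ** 2 + yr ** 2) / (2 * delta ** 2))
            row.append(norm * envelope * (cmath.exp(1j * omega * xr) - dc))
        kernel.append(row)
    return kernel


params = bank_parameters()
kernel = gabor_kernel(*params[0], half_size=16)
print(len(params), len(kernel), len(kernel[0]))
```

With a sufficiently wide window the samples of each kernel sum to approximately zero, which is the effect of the DC-compensation term in Eq. (1).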
The input image I(x, y) is convolved with the Gabor filter bank $\psi(x, y, \omega_m, \theta_n)$ to obtain the Gabor feature representation $Q_{m,n}(x, y)$:
$$Q_{m,n}(x, y) = I(x, y) * \psi(x, y, \omega_m, \theta_n) \qquad (4)$$
The phase of $Q_{m,n}(x, y)$ changes linearly under small displacement in the direction of the sinusoid, but the
Fig. 1. The Gabor filter bank with five frequencies and eight orientations
Fig. 2. Frequency response of classic Gabor filter bank (five scales and eight orientations)
2.2. LDA
Linear Discriminant Analysis (LDA) searches for those vectors in the underlying space that best discriminate among classes [Martinez and Kak, (2001)]. Mathematically speaking, for all the samples of all classes, we define two measures, the within-class scatter matrix ($S_\omega$) and the between-class scatter matrix ($S_B$), as:
$$S_\omega = \sum_{j=1}^{c} \sum_{i=1}^{N_j} (x_i^j - u_j)(x_i^j - u_j)^T \qquad (5)$$
$$S_B = \sum_{j=1}^{c} (u_j - u)(u_j - u)^T \qquad (6)$$
where $x_i^j$ is the ith sample of class j, $u_j$ is the mean of class j, c is the number of classes, $N_j$ is the number of samples in class j and u represents the mean of all classes.
The goal is to maximize the between-class measure while minimizing the within-class measure. One way to do this is to maximize the ratio
$$J(\omega_L) = \frac{\left| \omega_L^T S_B \, \omega_L \right|}{\left| \omega_L^T S_\omega \, \omega_L \right|}$$
If $S_\omega$ is a nonsingular matrix, then this ratio is maximized when the column vectors of the projection matrix, $\omega_L$, are the eigenvectors of $S_\omega^{-1} S_B$. It can be proven that there are at most c-1 nonzero generalized eigenvectors, and we require at least t+c samples to guarantee that $S_\omega$ does not become singular (t is the dimension of the samples). Since the dimension of the Gabor feature vector is large (N < t+c, where N is the number of samples in all classes), a PCA space is used as an intermediate space. Thus, the original t-dimensional space is projected onto an intermediate g-dimensional space using PCA and then onto a final f-dimensional space using LDA. It is given by:
$$y_i = \omega_L^T \, \omega_P^T \, x_i$$
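To make the scatter matrices of Eqs. (5) and (6) concrete, the following is a minimal sketch in Python for a toy problem with two classes in two dimensions. It is not the PCA plus LDA pipeline used in this paper: for two classes $S_B$ has rank one, so the single discriminant direction can be taken proportional to $S_\omega^{-1}(u_1 - u_2)$ without an eigen-decomposition. The overall mean u is assumed here to be the mean of the class means, and all function names and data are hypothetical.

```python
def mean(vectors):
    """Component-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(v[k] for v in vectors) / n for k in range(len(vectors[0]))]


def outer(a, b):
    """Outer product a b^T as a nested list."""
    return [[ai * bj for bj in b] for ai in a]


def add(A, B):
    """Element-wise sum of two matrices."""
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]


def scatter_matrices(classes):
    """Return (S_w, S_B, class_means) following Eqs. (5) and (6)."""
    dim = len(classes[0][0])
    S_w = [[0.0] * dim for _ in range(dim)]
    S_B = [[0.0] * dim for _ in range(dim)]
    class_means = [mean(samples) for samples in classes]
    overall = mean(class_means)  # assumption: mean of the class means
    for samples, u_j in zip(classes, class_means):
        for x in samples:
            d = [xi - ui for xi, ui in zip(x, u_j)]
            S_w = add(S_w, outer(d, d))
        d = [ui - u for ui, u in zip(u_j, overall)]
        S_B = add(S_B, outer(d, d))
    return S_w, S_B, class_means


def fisher_direction_2d(S_w, u1, u2):
    """Two-class special case: S_w^{-1} (u1 - u2) via the 2x2 inverse."""
    (a, b), (c, d) = S_w
    det = a * d - b * c
    diff = [u1[0] - u2[0], u1[1] - u2[1]]
    return [(d * diff[0] - b * diff[1]) / det,
            (-c * diff[0] + a * diff[1]) / det]


# Toy data: two square clusters separated along the first axis.
class_a = [[0, 0], [2, 0], [0, 2], [2, 2]]
class_b = [[4, 0], [6, 0], [4, 2], [6, 2]]
S_w, S_B, (u_a, u_b) = scatter_matrices([class_a, class_b])
w = fisher_direction_2d(S_w, u_a, u_b)
print(S_w, S_B, w)
```

In this toy case the discriminant direction lies along the first axis, which is the only axis along which the two class means differ.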
3. The Proposed Method
This work uses a Gabor filter bank (five scales and eight orientations) for feature extraction. First, the face region is located in the input image and separated from the background using a rectangle based on the face model described in [Shih and Chuang, (2004)], and then rescaled to 128×96 pixels before being passed to the Gabor filter bank. Figure 3 shows some of the results at this stage. This preprocessing reduces the amount of computation and improves the performance of the proposed algorithm.
Fig. 3. Some images of the Cohn-Kanade database after preprocessing
In previous work, we used the whole face image for feature extraction [Fazli et al, (2009a) & Fazli et al, (2009b)]. Because the face exhibits symmetry, this work uses half of the face image for feature extraction. After preprocessing, the image is divided equally into two parts of 128×48 pixels each; the first part is selected and the Gabor filter bank is applied to it. Figure 4 shows half of the face image before applying the Gabor filters. The output of the Gabor filter bank is 40 feature images (five frequencies and eight orientations) of 6144 pixels each (128×48), each of which is down-sampled into a vector of size 96 using a scale of 1:64 and normalized to zero mean and unit variance. The vectors are then concatenated, resulting in a feature vector of size 3840 (40×96) for each input image.
Fig. 4. One sample of a half face image before applying the Gabor filter bank
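The down-sampling, normalization and concatenation steps can be sketched as follows. This is a minimal illustration in Python, assuming that the 1:64 scale is realized by keeping one pixel from every 8×8 block; the paper does not state how the down-sampling is performed, and the function names and synthetic inputs are hypothetical. The sketch mainly serves to check the dimension arithmetic: 128×48 = 6144 pixels per response, 6144/64 = 96 values per filter, and 40×96 = 3840 values per image.

```python
import math

ROWS, COLS = 128, 48   # half-face image size in pixels
STEP = 8               # assumed 8 x 8 grid: one sample kept per 64 pixels
NUM_FILTERS = 40       # five frequencies x eight orientations


def downsample(image, step=STEP):
    """Keep every `step`-th pixel along both axes and flatten row by row."""
    return [image[r][c]
            for r in range(0, len(image), step)
            for c in range(0, len(image[0]), step)]


def normalize(vector):
    """Shift to zero mean and scale to unit variance."""
    n = len(vector)
    mu = sum(vector) / n
    var = sum((v - mu) ** 2 for v in vector) / n
    std = math.sqrt(var) or 1.0  # guard against a constant response
    return [(v - mu) / std for v in vector]


def feature_vector(responses):
    """Concatenate the normalized, down-sampled responses of all filters."""
    features = []
    for response in responses:
        features.extend(normalize(downsample(response)))
    return features


# Synthetic stand-ins for the 40 filter responses (not real Gabor outputs).
responses = [[[(31 * r + 17 * c + k) % 23 for c in range(COLS)]
              for r in range(ROWS)] for k in range(NUM_FILTERS)]
features = feature_vector(responses)
print(len(downsample(responses[0])), len(features))  # 96 3840
```

Applying the same steps to a whole-face image of 128×96 pixels gives 192 values per filter and 7680 values per image, twice the half-face dimension.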
The Gabor feature vector has a large dimension, and applying the classifier directly to the output of the Gabor filter bank leads to misclassification and high computational complexity. Hence the proposed method uses LDA to further reduce the dimensionality of the feature data. The images are categorized into the six basic emotions, so the size of the feature vectors is reduced to five (c-1) using LDA. Finally, the K-nearest neighbor (KNN) classifier, a classical classifier, is used to categorize the outputs of LDA into the six basic emotions: happiness, sadness, anger, surprise, fear and disgust.
4. Experimental Results
In total, 198 static images (33 images for each class) from 70 subjects are used for training and testing. We randomly divide them into three sets with an equal number of images for each class. One set is used for testing and the other two for training (66 images for testing and 132 images for training). This is performed three times, with a new set chosen for testing each time. The whole procedure is repeated 10 times, and the average of the resulting 30 results is taken as a single estimate.
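The evaluation protocol described above (three class-balanced sets, each used once for testing, repeated 10 times) can be sketched together with a nearest-neighbor classifier as follows. This is a minimal Python illustration on synthetic five-dimensional data, not the original MATLAB code; the value k = 1, the Euclidean distance, the random seed and all function names are assumptions, since the paper does not specify them.

```python
import random


def knn_predict(train, query, k=1):
    """Majority vote among the k nearest training samples (Euclidean)."""
    nearest = sorted(
        train,
        key=lambda item: sum((a - b) ** 2 for a, b in zip(item[0], query)),
    )[:k]
    votes = {}
    for _, label in nearest:
        votes[label] = votes.get(label, 0) + 1
    return max(votes, key=votes.get)


def stratified_folds(samples, num_folds, rng):
    """Split (vector, label) pairs into folds with equal class counts."""
    by_class = {}
    for sample in samples:
        by_class.setdefault(sample[1], []).append(sample)
    folds = [[] for _ in range(num_folds)]
    for items in by_class.values():
        items = items[:]
        rng.shuffle(items)
        for i, item in enumerate(items):
            folds[i % num_folds].append(item)
    return folds


def repeated_cross_validation(samples, num_folds=3, repeats=10, k=1, seed=0):
    """Return one accuracy per (repeat, fold) pair."""
    rng = random.Random(seed)
    accuracies = []
    for _ in range(repeats):
        folds = stratified_folds(samples, num_folds, rng)
        for i, test in enumerate(folds):
            train = [s for j, fold in enumerate(folds) if j != i for s in fold]
            correct = sum(knn_predict(train, x, k) == y for x, y in test)
            accuracies.append(correct / len(test))
    return accuracies


# Synthetic stand-in for LDA outputs: 6 classes x 33 samples, 5 dimensions,
# with well-separated class centres (not real facial expression features).
data_rng = random.Random(1)
samples = [
    ([10.0 * c + data_rng.uniform(-1, 1)]
     + [data_rng.uniform(-1, 1) for _ in range(4)], c)
    for c in range(6) for _ in range(33)
]
accuracies = repeated_cross_validation(samples)
print(len(accuracies), sum(accuracies) / len(accuracies))
```

Each of the 30 runs tests on 66 samples and trains on the remaining 132, matching the split sizes given above; the reported figure is the mean of the 30 accuracies.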
The proposed algorithm is simulated using MATLAB R2008a on a PC with a Pentium(R) Dual-Core 2.50 GHz CPU and 1.99 GB of RAM.
Table 1 shows the percentage of correct recognition and confusion for each of the six basic emotions. The average performance of the system is about 86.7%.
Table 1. Confusion matrix of half face image for six basic emotions (%)
            Anger    Disgust  Fear     Happy    Sad      Surprise
Anger       69.72    11.51    3.33     0.60     14.84    0
Disgust     2.12     94.56    0.60     0.60     2.12     0
Fear        3.03     1.51     81.54    10.9     2.72     0.30
Happy       1.21     0.30     3.03     94.56    0.90     0
Sad         15.15    0.60     2.12     0.63     81.5     0
Surprise    0.60     0        0        0        1.21     98.19
The next experiment uses the whole face image, with a size of 128×96 pixels, as input to the Gabor filter bank in order to compare the results with the proposed method. In this experiment, the down-sampling and concatenation stages are also applied to the outputs of the Gabor filter bank (with five frequencies and eight orientations), and the Gabor feature vector has a size of 7680. Table 2 shows the results of this experiment. According to the table, the average recognition rate for the whole face image is 88.3%. Comparing the two tables shows that the proposed method has a recognition rate similar to the classical one (86.7% versus 88.3%). The degree of similarity differs across the six basic emotions, however. Surprise, happy, fear and disgust have recognition rates close to the classical ones. The largest differences between the two experiments occur for anger (69.72% versus 73.66%) and sad (81.5% versus 85.17%). In other words, when half of the face image is used for feature extraction, most of the additional misclassifications occur for the expressions of anger and sadness.
Table 2. Confusion matrix of total face image for six basic emotions (%)
            Anger    Disgust  Fear     Happy    Sad      Surprise
Anger       73.66    16.96    0.90     0        8.48     0
Disgust     4.84     94.26    0        0.30     0.60     0
Fear        5.15     2.12     82.74    9.39     0.60     0
Happy       1.21     0        3.33     95.46    0        0
Sad         12.42    1.81     0.30     0.30     85.17    0
Surprise    0.90     0        0        0        0.30     98.8
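The averages and per-class differences discussed in this section can be checked directly from the diagonal entries of the two confusion matrices. The following short Python sketch does so; it assumes the half-face Fear entry reads 81.54, the value for which that row sums to 100, and the variable names are hypothetical.

```python
EMOTIONS = ["Anger", "Disgust", "Fear", "Happy", "Sad", "Surprise"]

# Per-class recognition rates (%): diagonal entries of Tables 1 and 2.
half_face = [69.72, 94.56, 81.54, 94.56, 81.50, 98.19]
total_face = [73.66, 94.26, 82.74, 95.46, 85.17, 98.80]

avg_half = sum(half_face) / len(half_face)
avg_total = sum(total_face) / len(total_face)

# Drop in recognition rate when only half of the face is used.
drops = {e: round(t - h, 2)
         for e, h, t in zip(EMOTIONS, half_face, total_face)}
largest = sorted(drops, key=drops.get, reverse=True)[:2]

print(round(avg_half, 1), round(avg_total, 1))  # 86.7 88.3
print(largest)  # ['Anger', 'Sad']
```

The two largest drops, for anger and sad, are each slightly below four percentage points; the remaining four expressions differ by about one point or less.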
Table 3 shows the recognition rates and time consumption of the two experiments. As can be seen, the recognition rate of the proposed method is close to that of the algorithm that uses the whole face image (the classical one), while the time consumed by feature extraction is much smaller.
Table 3. Time consumption and average recognition rate for total face and half face image
              Recognition rate    Time for feature extraction (s)    Dimension of Gabor feature vector
Total face    88.3                2.412                              7680
5. Conclusion
This paper presents a person-independent facial expression recognition system based on the Gabor filter and LDA. A Gabor filter bank is used for feature extraction and the LDA algorithm for feature reduction. Because the Gabor filter bank imposes a large computational burden and the face exhibits symmetry, the proposed method uses half of the face image. Experimental results show that four of the six basic emotions (happy, surprise, disgust and fear) have recognition rates approximately equal to the classical ones. The largest differences in recognition rate occur for two basic expressions, angry and sad. The experimental results also show that the proposed method maintains a high recognition rate for the six basic emotions while requiring a much smaller computational burden than the classical ones.
References
[1] Bartlett, M.S.; Littlewort, G.; Fasel, I.; Movellan, J.R. (2003): Real Time Face Detection and Facial Expression Recognition: Development and Applications to Human Computer Interaction, Workshop on Computer Vision and Pattern Recognition for Human-Computer Interaction, pp. 1295-1302.
[2] Belhumeur, P.N.; Hespanha, J.P.; Kriegman, D.J. (1997): Eigenfaces vs. Fisherfaces: Recognition Using Class Specific Linear Projection, IEEE Trans. Pattern Analysis and Machine Intelligence, Vol. 19, pp. 711-720.
[3] Delac, K.; Grgic, M. (2007): Face Recognition, 1st ed., I-Tech Education and Publishing, Vienna, Austria.
[4] Deng, H.B.; Jin, L.W.; Zhen, L.X.; Huang, J.C. (2005): A New Facial Expression Recognition Method Based on Local Gabor Filter Bank and PCA plus LDA, International Journal of Information Technology, Vol. 11, No. 11.
[5] Essa, I.; Pentland, A. (1997): Coding, analysis, interpretation, and recognition of facial expressions, IEEE Trans. Pattern Analysis and Machine Intelligence, Vol. 19, No. 7, pp. 757-763.
[6] Fazli, S.; Afrouzian, R.; Seyedarabi, H. (2009a): A Combined KPCA and SVM Method for Basic Emotional Expressions Recognition, Second International Conference on Machine Vision, Dubai.
[7] Fazli, S.; Afrouzian, R.; Seyedarabi, H. (2009b): High-Performance Facial Expression Recognition Using Gabor Filter and Probabilistic Neural Network, IEEE International Conf. on Intelligent Computing and Intelligent Systems, Shanghai, China.
[8] Gokturk, S.B.; Bouguet, J.Y.; Tomasi, C.; Girod, B. (2002): Model-based face tracking for view independent facial expression recognition, Proc. IEEE Int'l Conf. Face and Gesture Recognition, pp. 272-278.
[9] Kanade, T.; Cohn, J.F.; Tian, Y. (2000): Comprehensive database for facial expression analysis, Proceedings of the Fourth IEEE International Conference on Automatic Face and Gesture Recognition, Grenoble, France, pp. 46-53.
[10] Lades, M.; Vorbruggen, J.C.; Buhmann, J.; Lange, J.; von der Malsburg, C.; Wurtz, R.P.; Konen, W. (1992): Distortion Invariant Object Recognition in the Dynamic Link Architecture, IEEE Transactions on Computers, Vol. 42, No. 3, pp. 300-311.
[11] Lee, T.S. (1996): Image Representation Using 2D Gabor Wavelets, IEEE Trans. Pattern Analysis and Machine Intelligence, Vol. 18, pp. 959-971.
[12] Liu, C.; Wechsler, H. (2002): Gabor feature based classification using the enhanced fisher linear discriminant model for face recognition, IEEE Trans. Image Processing, Vol. 11, No. 4, pp. 467-476.
[13] Liu, C.; Wechsler, H. (2003): Independent Component Analysis of Gabor Features for Face Recognition, IEEE Trans. Neural Networks, Vol. 14, pp. 919-928.
[14] Lyons, M.J.; Budynek, J.; Akamatsu, S. (1999): Automatic Classification of Single Facial Images, IEEE Trans. Pattern Analysis and Machine Intelligence, Vol. 21, No. 12.
[15] Martinez, A.M.; Kak, A.C. (2001): PCA versus LDA, IEEE Trans. Pattern Analysis and Machine Intelligence, Vol. 23, pp. 228-233.
[16] Pantic, M.; Patras, I. (2006): Dynamics of facial expression: Recognition of facial actions and their temporal segments from face profile image sequences, IEEE Trans. on Systems, Man and Cybernetics - Part B, Vol. 36, No. 2, pp. 433-449.
[17] Pantic, M.; Patras, I. (2005): Detecting facial actions and their temporal segments in nearly frontal-view face image sequences, Proc. IEEE Int'l Conf. on Systems, Man and Cybernetics, pp. 3358-3363.
[18] Pantic, M.; Rothkrantz, L.J.M. (2000): Automatic Analysis of Facial Expressions: The State of the Art, IEEE Trans. Pattern Analysis and Machine Intelligence, Vol. 22, pp. 1424-1444.
[19] Shih, F.Y.; Chuang, C. (2004): Automatic extraction of head and face boundaries and facial features, Information Sciences, Vol. 158, pp. 117-130.