A FAST ALGORITHM FOR RECOGNITION OF BASIC FACIAL EXPRESSIONS WITH GABOR FILTER BANK


Hadi Seyedarabi

Faculty of Electrical Engineering, University of Tabriz, Tabriz, Iran

[email protected]

Saeid Fazli

Electrical Engineering Department, Zanjan University, Zanjan, Iran

[email protected]

Reza Afrouzian

Faculty of Electrical Engineering, University of Tabriz, Tabriz, Iran

[email protected]

Abstract:

Automatic recognition of facial expressions can be a useful tool for human-computer interaction (HCI) systems. Feature extraction plays an important role in the accuracy of facial expression recognition (FER) systems. This paper uses a Gabor filter bank for feature extraction and linear discriminant analysis (LDA) for feature reduction. The Gabor filter bank is one of the most useful tools for feature extraction and can represent local features in both the spatial and frequency domains, but it imposes heavy computational complexity on facial expression recognition systems. Hence, to reduce computational complexity and improve computational speed, this paper uses half of the face for processing and feature extraction. Experimental results show that, while improving computational speed, the proposed method has almost the same accuracy as classical methods that use the whole face for recognition of the six basic emotions.

Keywords: Facial expression recognition (FER); Gabor filter bank; half of the face image; linear discriminant analysis (LDA).

1. Introduction

In recent years, automatic analysis of facial expressions has become one of the important challenges in computer vision. This relatively recent research area focuses on sensing, detecting and interpreting human affective states, and on devising appropriate means of handling this affective information in order to enhance current human-machine interface designs [Delac and Grgic, (2007)].

Applications such as computer-based advisors, virtual information desks, automatic recognition of sign language, driver fatigue detection, video compression and lip reading have emerged as a result of research in this field.


al, (1997)] who used fisherfaces (LDA algorithm), and some researchers [Pantic and Rothkrantz, (2000) & Deng et al, (2005)] used Gabor wavelets to extract facial features for facial expression recognition. The Gabor filter is one of the appearance-based methods used in facial analysis in recent years. It removes most of the variability in an image due to variation in lighting and contrast, while being robust against small shifts and deformations [Lades et al, (1992)]. Lades et al. applied Gabor wavelets to object recognition [Lades et al, (1992)], and Liu and Wechsler used Gabor wavelets for face recognition and employed various methods for feature reduction, including LDA [Liu and Wechsler, (2002)] and ICA [Liu and Wechsler, (2003)]. Lyons et al. used Gabor wavelets for recognition of the 6 basic emotions [Lyons et al, (1999)]. Bartlett et al. combined Gabor filters with boosting methods for facial expression recognition [Bartlett et al, (2003)].

This paper also uses a Gabor filter bank with five scales and eight orientations for feature extraction. Because the Gabor filter bank imposes computational complexity on facial expression recognition systems, and because the face is symmetric, this paper uses half of the face image instead of the whole face. Experimental results show that the proposed method has a recognition rate almost similar to that of classical ones, while its time consumption is much smaller.

The organization of this paper is as follows: Section 2 gives a general explanation of the Gabor filter and LDA, and Section 3 describes the proposed method. The experimental results are shown in Section 4, and the conclusions form the last section.

2. General Tools

2.1. Gabor Filter

The Gabor filter is a useful tool for extracting local features in both the spatial and frequency domains, and it is a good approximation to the sensitivity profiles of neurons found in the visual cortex of higher vertebrates [Lee, (1996)]. In the spatial domain, a Gabor wavelet is defined as the product of a complex exponential and a Gaussian function, by the following equation:

$$\psi(x, y, \omega, \theta) = \frac{1}{2\pi\delta^{2}} \, e^{-\frac{x'^{2} + y'^{2}}{2\delta^{2}}} \left[ e^{i\omega x'} - e^{-\frac{\omega^{2}\delta^{2}}{2}} \right] \tag{1}$$

where $(x, y)$ is the pixel position in the spatial domain, $\omega$ is the central frequency of a sinusoidal plane wave, $\theta$ is the orientation of the Gabor filter and $\delta$ is the standard deviation along both the $x$ and $y$ directions. The second term of the Gabor wavelet, $e^{-\omega^{2}\delta^{2}/2}$, compensates for the DC value because the cosine component has nonzero mean (DC response) while the sine component has zero mean [Deng et al, (2005)]. The parameters $x'$ and $y'$ are given by the following equations:

$$x' = x\cos\theta + y\sin\theta, \qquad y' = -x\sin\theta + y\cos\theta \tag{2}$$

We set $\delta = \pi/\omega$ to define the relationship between $\delta$ and $\omega$ in our experiments [Deng et al, (2005)]. Gabor filter banks with different frequencies and orientations have been used to extract features from face images. In most cases a Gabor filter bank with five frequencies and eight orientations is used. Figure 1 shows the Gabor filter bank with five different scales and eight different orientations, and Figure 2 shows its frequency responses.
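As an illustration of Eqs. (1) and (2), the following is a minimal NumPy sketch of one Gabor kernel with the standard deviation tied to the frequency by δ = π/ω. The function name `gabor_kernel` and the kernel support of about three standard deviations on each side are assumptions made for this example; the paper does not state the kernel size it used.

```python
import numpy as np

def gabor_kernel(omega, theta, size=None):
    """Sample the Gabor wavelet of Eq. (1) on a square grid, with delta = pi / omega."""
    delta = np.pi / omega
    if size is None:
        # Assumed support: about three standard deviations each side, odd width.
        size = 2 * int(np.ceil(3 * delta)) + 1
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1].astype(float)
    # Rotated coordinates, Eq. (2).
    xr = x * np.cos(theta) + y * np.sin(theta)
    yr = -x * np.sin(theta) + y * np.cos(theta)
    # Gaussian envelope with its normalisation factor.
    envelope = np.exp(-(xr**2 + yr**2) / (2 * delta**2)) / (2 * np.pi * delta**2)
    # Complex carrier minus the DC-compensation term.
    carrier = np.exp(1j * omega * xr) - np.exp(-(omega**2) * delta**2 / 2)
    return envelope * carrier

kernel = gabor_kernel(omega=np.pi / 2, theta=0.0)
print(kernel.shape, abs(kernel.sum()))
```

Because of the DC-compensation term, the sum of the kernel coefficients stays close to zero, so a uniformly bright region produces almost no response.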

The following equations give five frequencies (m=1,2,...,5), and eight orientations (n=1,2,…,8) for Gabor filter bank:

$$\omega_m = \frac{\pi}{2} \times \sqrt{2}^{\,-(m-1)}, \qquad \theta_n = \frac{\pi}{8} \times (n-1) \tag{3}$$
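The parameter grid of the filter bank can be listed with a few lines of Python. This sketch assumes Eq. (3) reads ω_m = (π/2)·(√2)^−(m−1) and θ_n = (π/8)·(n−1), as in [Deng et al, (2005)]; the variable names are illustrative only.

```python
import math

# Five centre frequencies (m = 1..5) and eight orientations (n = 1..8), Eq. (3).
frequencies = [(math.pi / 2) * math.sqrt(2) ** -(m - 1) for m in range(1, 6)]
orientations = [(math.pi / 8) * (n - 1) for n in range(1, 9)]

# One (omega, theta) pair per filter in the bank.
bank = [(omega, theta) for omega in frequencies for theta in orientations]
print(len(bank))
```

Under this reading the frequencies run from π/2 down to π/8 in half-octave steps, and the orientations cover 0 to 7π/8.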

The input image $I(x, y)$ is convolved with the Gabor filter bank $\psi(x, y, \omega_m, \theta_n)$ to obtain the Gabor feature representation $Q_{m,n}(x, y)$:

$$Q_{m,n}(x, y) = I(x, y) * \psi(x, y, \omega_m, \theta_n) \tag{4}$$
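The convolution of Eq. (4) can be sketched with NumPy's FFT routines. The helper name `convolve_same` and the choice to crop the result to the input size are assumptions for this example; the paper does not describe how image borders were handled.

```python
import numpy as np

def convolve_same(image, kernel):
    """Linear 2-D convolution via FFT, cropped to the size of `image`."""
    ih, iw = image.shape
    kh, kw = kernel.shape
    fh, fw = ih + kh - 1, iw + kw - 1  # size of the full linear convolution
    spectrum = np.fft.fft2(image, s=(fh, fw)) * np.fft.fft2(kernel, s=(fh, fw))
    full = np.fft.ifft2(spectrum)
    top, left = (kh - 1) // 2, (kw - 1) // 2
    return full[top:top + ih, left:left + iw]

# Sanity check: a kernel with a single 1 at its centre returns the image itself.
image = np.arange(12.0).reshape(3, 4)
identity = np.zeros((3, 3))
identity[1, 1] = 1.0
response = convolve_same(image, identity)
```

With a complex Gabor kernel in place of `identity`, the returned array is complex, so its magnitude and phase can be read off with `np.abs` and `np.angle`.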

The phase of $Q_{m,n}(x, y)$ changes linearly under small displacement in the direction of the sinusoid, but the


Fig. 1. The Gabor filter bank with five frequencies and eight orientations

Fig. 2. Frequency response of classic Gabor filter bank (five scales and eight orientations)

2.2. LDA

Linear Discriminant Analysis (LDA) searches for those vectors in the underlying space that best discriminate among classes [Martinez and Kak, (2001)]. Mathematically speaking, for all the samples of all classes, we define two measures, the within-class scatter matrix ($S_\omega$) and the between-class scatter matrix ($S_B$), as:

$$S_\omega = \sum_{j=1}^{c} \sum_{i=1}^{N_j} \left( x_i^j - u_j \right) \left( x_i^j - u_j \right)^{T} \tag{5}$$

$$S_B = \sum_{j=1}^{c} \left( u_j - u \right) \left( u_j - u \right)^{T} \tag{6}$$

where $x_i^j$ is the $i$th sample of class $j$, $u_j$ is the mean of class $j$, $c$ is the number of classes, $N_j$ is the number of samples in class $j$ and $u$ represents the mean of all classes.
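The two scatter matrices of Eqs. (5) and (6) can be computed directly from labelled samples. The sketch below is illustrative: the function name `scatter_matrices` is hypothetical, and the overall mean is taken over all samples, which coincides with the mean of the class means when the classes have equal size (as in the experiments of Section 4).

```python
import numpy as np

def scatter_matrices(samples, labels):
    """Return (within-class, between-class) scatter; one row of `samples` per sample."""
    samples = np.asarray(samples, dtype=float)
    labels = np.asarray(labels)
    overall_mean = samples.mean(axis=0)  # assumption: mean over all samples
    t = samples.shape[1]
    s_w = np.zeros((t, t))
    s_b = np.zeros((t, t))
    for label in np.unique(labels):
        members = samples[labels == label]
        class_mean = members.mean(axis=0)
        centred = members - class_mean
        s_w += centred.T @ centred           # Eq. (5): sum of outer products
        diff = (class_mean - overall_mean)[:, None]
        s_b += diff @ diff.T                 # Eq. (6)
    return s_w, s_b

s_w, s_b = scatter_matrices([[0, 0], [2, 0], [10, 0], [12, 0]], [0, 0, 1, 1])
```

In this toy example the two classes differ only along the first axis, so both matrices are zero everywhere except their top-left entry.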

The goal is to maximize the between-class measure while minimizing the within-class measure. One way to do this is to maximize the ratio $J(\omega_L) = \frac{\left| \omega_L^{T} S_B \, \omega_L \right|}{\left| \omega_L^{T} S_\omega \, \omega_L \right|}$. If $S_\omega$ is a nonsingular matrix, this ratio is maximized when the column vectors of the projection matrix $\omega_L$ are the eigenvectors of $S_\omega^{-1} S_B$. It can be proven that there are at most $c-1$ nonzero generalized eigenvectors, and at least $t+c$ samples are required to guarantee that $S_\omega$ does not become singular ($t$ is the dimension of the samples). Since the dimension of the Gabor feature vector is large ($N \ll t + c$, where $N$ is the number of samples in all classes), PCA is used to produce an intermediate space. Thus, the original $t$-dimensional space is projected onto an intermediate $g$-dimensional space using PCA and then onto a final $f$-dimensional space using LDA. It is given by:

$$y_i = \omega_L^{T} \, \omega_P^{T} \, x_i \tag{7}$$
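The two-stage projection (PCA to an intermediate space, then LDA) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: PCA is done by SVD of the mean-centred data, the sample mean is subtracted before projecting, and the names `fit_pca_lda` and `project` are hypothetical.

```python
import numpy as np

def fit_pca_lda(samples, labels, g):
    """Fit a PCA projection to g dimensions followed by LDA to c-1 dimensions."""
    samples = np.asarray(samples, dtype=float)
    labels = np.asarray(labels)
    mean = samples.mean(axis=0)
    centred = samples - mean
    # PCA: rows of vt are the principal directions, strongest first.
    _, _, vt = np.linalg.svd(centred, full_matrices=False)
    w_p = vt[:g].T                        # t x g
    reduced = centred @ w_p               # N x g, zero mean
    classes = np.unique(labels)
    s_w = np.zeros((g, g))
    s_b = np.zeros((g, g))
    for label in classes:
        members = reduced[labels == label]
        mu = members.mean(axis=0)
        s_w += (members - mu).T @ (members - mu)
        s_b += np.outer(mu, mu)           # overall mean is zero after centring
    # Eigenvectors of inv(S_w) S_B; keep the c-1 largest eigenvalues.
    eigvals, eigvecs = np.linalg.eig(np.linalg.solve(s_w, s_b))
    order = np.argsort(eigvals.real)[::-1][:len(classes) - 1]
    w_l = eigvecs[:, order].real          # g x (c-1)
    return mean, w_p, w_l

def project(x, mean, w_p, w_l):
    """Apply both projections to one sample or to a matrix of row samples."""
    return (np.asarray(x, dtype=float) - mean) @ w_p @ w_l

rng = np.random.default_rng(0)
centres = 5.0 * rng.normal(size=(3, 20))
data = np.vstack([centres[k] + rng.normal(size=(10, 20)) for k in range(3)])
classes = np.repeat(np.arange(3), 10)
model = fit_pca_lda(data, classes, g=5)
projected = project(data, *model)
```

The intermediate dimension g has to be small enough (at most N minus c) for the within-class scatter in the PCA space to remain invertible.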


3. The Proposed Method

This work uses a Gabor filter bank (five scales and eight orientations) for feature extraction. First, the face region is located in the input image and separated from the background using a rectangle based on the face model described in [Shih and Chuang, (2004)], and then rescaled to 128×96 pixels before the Gabor filter bank is applied. Figure 3 shows some of the results at this stage. This is done to reduce the amount of computation and improve the performance of the proposed algorithm.

Fig. 3. Some images of Cohn-Kanade database after preprocessing

In our previous research, we used the whole face image for feature extraction [Fazli et al, (2009a) & Fazli et al, (2009b)]. Because the face is symmetric, this work uses half of the face image for feature extraction. After preprocessing, the image is divided equally into two parts of 128×48 pixels each; the first part is selected and the Gabor filter bank is applied to it. Figure 4 shows half of the face image before applying the Gabor filters. The output of the Gabor filter bank is 40 feature images (five frequencies and eight orientations), each of size 6144 (128×48), which are down-sampled into vectors of size 96 using a scale of 1:64 and normalized to zero mean and unit variance. They are then concatenated, resulting in a feature vector of size 3840 (40×96) for each input image.
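The sizes in this paragraph follow from simple arithmetic: a 128×48 half image has 6144 pixels, down-sampling by 1:64 leaves 6144 / 64 = 96 values per feature image, and 40 feature images give 40 × 96 = 3840 features. The sketch below illustrates the steps under assumptions not stated in the paper: the 1:64 scale is implemented by keeping every 8th pixel in each direction, the magnitude of the filter response is used, and random arrays stand in for the real Gabor responses.

```python
import numpy as np

def feature_vector(responses, step=8):
    """Down-sample, normalise and concatenate a list of filter responses."""
    parts = []
    for response in responses:
        # Keep every `step`-th pixel in each direction (1:64 for step = 8).
        sampled = np.abs(response)[::step, ::step].ravel()
        # Normalise each down-sampled response to zero mean and unit variance.
        sampled = (sampled - sampled.mean()) / sampled.std()
        parts.append(sampled)
    return np.concatenate(parts)

rng = np.random.default_rng(0)
face = rng.random((128, 96))        # stand-in for a preprocessed face image
half = face[:, :48]                 # first half: 128 x 48 pixels
responses = [rng.random(half.shape) for _ in range(40)]  # stand-ins for 40 outputs
vector = feature_vector(responses)
print(half.size, vector.size)
```

The same function applied to 40 responses of the whole 128×96 image yields 40 × 192 = 7680 features, which matches the dimension reported for the whole-face experiment.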

Fig. 4. One sample of half face image before applying Gabor filter bank

The Gabor feature vector has a large dimension, and applying the classifier directly to the output of the Gabor filter bank leads to misclassification and computational complexity. Hence the proposed method uses LDA for further reduction of the feature dimensionality. The images are categorized into six basic emotions; therefore the size of the feature vectors is reduced to five ($c-1$) using LDA. Finally, the K-nearest neighbor (KNN) classifier, a classical classifier, is used to categorize the outputs of LDA into the 6 basic emotions: happiness, sadness, anger, surprise, fear and disgust.
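A K-nearest-neighbor classifier on the reduced feature vectors can be sketched in a few lines. The value of K and the distance measure are not given in the paper, so the Euclidean distance and k = 3 below are assumptions, and `knn_predict` is a hypothetical name.

```python
from collections import Counter

import numpy as np

def knn_predict(train_x, train_y, query, k=3):
    """Label `query` by majority vote among its k nearest training samples."""
    train_x = np.asarray(train_x, dtype=float)
    # Euclidean distance from the query to every training sample.
    distances = np.linalg.norm(train_x - np.asarray(query, dtype=float), axis=1)
    nearest = np.argsort(distances)[:k]
    votes = Counter(train_y[i] for i in nearest)
    return votes.most_common(1)[0][0]

train_x = [[0, 0], [0, 1], [1, 0], [5, 5], [5, 6], [6, 5]]
train_y = ["happy", "happy", "happy", "sad", "sad", "sad"]
print(knn_predict(train_x, train_y, [0.2, 0.1]))
```

In the proposed method the training samples would be the five-dimensional LDA outputs rather than the two-dimensional toy points used here.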

4. Experimental Results


In total, 198 static images (33 images for each class) from 70 subjects are used for training and testing. We randomly divide them into three sets with an equal number of images for each class. One set is used for testing and the rest for training (66 images for testing and 132 images for training). This is performed three times, each time with a different set chosen for testing. The whole procedure is repeated 10 times, and the average of these 30 results is taken as a single estimate.
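The splitting scheme described above can be sketched with the Python standard library. The function name `stratified_folds` and the use of a seed are illustrative assumptions; the sketch only builds the index sets and does not train or test a classifier.

```python
import random

def stratified_folds(labels, n_folds=3, seed=0):
    """Split sample indices into folds with an equal number of samples per class."""
    rng = random.Random(seed)
    by_class = {}
    for index, label in enumerate(labels):
        by_class.setdefault(label, []).append(index)
    folds = [[] for _ in range(n_folds)]
    for indices in by_class.values():
        rng.shuffle(indices)
        # Deal the shuffled indices of each class round-robin over the folds.
        for position, index in enumerate(indices):
            folds[position % n_folds].append(index)
    return folds

labels = [emotion for emotion in range(6) for _ in range(33)]  # 6 classes x 33 images
folds = stratified_folds(labels)
test_set = folds[0]
train_set = folds[1] + folds[2]
print(len(test_set), len(train_set))
```

Using each of the three folds once as the test set, and repeating with ten different seeds, gives the 30 results that are averaged.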

The proposed algorithm is simulated using MATLAB R2008a on a PC with a Pentium(R) Dual-Core 2.50 GHz CPU and 1.99 GB of RAM.

Table 1 shows the percentages of correct recognition and confusion for each of the 6 basic emotions. The average recognition rate of the system is about 86.7%.

Table 1. Confusion matrix of half face image for six basic emotions (values in %)

            Anger    Disgust  Fear     Happy    Sad      Surprise
Anger       69.72    11.51    3.33     0.60     14.84    0
Disgust     2.12     94.56    0.60     0.60     2.12     0
Fear        3.03     1.51     81.54    10.9     2.72     0.30
Happy       1.21     0.30     3.03     94.56    0.90     0
Sad         15.15    0.60     2.12     0.63     81.5     0
Surprise    0.60     0        0        0        1.21     98.19

The next experiment uses the whole face image, with a size of 128×96 pixels, as input to the Gabor filter bank in order to compare the results with the proposed method. In this experiment, the down-sampling and concatenation stages are also applied to the outputs of the Gabor filter bank (with five frequencies and eight orientations), and the Gabor feature vector has a size of 7680. Table 2 shows the results of this experiment. According to the table, the average recognition rate for the whole face image is 88.3%. Comparing the two tables shows that the proposed method has a recognition rate similar to the classical one, although the degree of similarity differs among the six basic emotions. Surprise, happiness, fear and disgust have recognition rates close to the classical ones (within about 1.2 percentage points). The largest differences between the two experiments occur for anger and sadness: when half of the face image is used for feature extraction, the recognition rate drops from 73.66% to 69.72% for anger and from 85.17% to 81.5% for sadness.

Table 2. Confusion matrix of total face image for six basic emotions (values in %)

            Anger    Disgust  Fear     Happy    Sad      Surprise
Anger       73.66    16.96    0.90     0        8.48     0
Disgust     4.84     94.26    0        0.30     0.60     0
Fear        5.15     2.12     82.74    9.39     0.60     0
Happy       1.21     0        3.33     95.46    0        0
Sad         12.42    1.81     0.30     0.30     85.17    0
Surprise    0.90     0        0        0        0.30     98.8
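The reported averages can be checked from the correct-recognition entries of the two confusion matrices. The values below are copied from Tables 1 and 2, reading the split Fear entry of Table 1 as 81.54; the variable names are illustrative.

```python
emotions = ["anger", "disgust", "fear", "happy", "sad", "surprise"]

# Correct-recognition rates (%) for each emotion, from Tables 1 and 2.
half_face = [69.72, 94.56, 81.54, 94.56, 81.5, 98.19]
whole_face = [73.66, 94.26, 82.74, 95.46, 85.17, 98.8]

mean_half = sum(half_face) / len(half_face)      # about 86.7
mean_whole = sum(whole_face) / len(whole_face)   # about 88.3

# Drop in recognition rate when only half of the face is used.
drop = {e: round(w - h, 2) for e, w, h in zip(emotions, whole_face, half_face)}
largest = sorted(drop, key=drop.get, reverse=True)[:2]
print(round(mean_half, 1), round(mean_whole, 1), largest)
```

The two largest drops are for anger (3.94 points) and sadness (3.67 points), while the other four emotions change by about 1.2 points or less.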

Table 3 shows the recognition rates and time consumption of the two experiments. As can be seen, the recognition rate of the proposed method is close to that of the algorithm that uses the whole face image (the classical one), while the time needed for feature extraction is much smaller.

Table 3. Time consumption and average recognition rate for total face and half face image

             Recognition rate (%)   Time consumed for feature extraction (s)   Dimension of Gabor feature vector
Total face   88.3                   2.412                                      7680


5. Conclusion

This paper presents a person-independent facial expression recognition system based on the Gabor filter and LDA. The Gabor filter bank is used for feature extraction and the LDA algorithm is used for feature reduction. Because the Gabor filter bank imposes a large computational burden, and because the face is symmetric, the proposed method uses half of the face image. Experimental results show that four of the six basic emotions (happiness, surprise, disgust and fear) have recognition rates approximately equal to the classical ones. The largest differences in recognition rate occur for two basic expressions, anger and sadness. The experimental results also show that, while still giving a high recognition rate for the six basic emotions, the proposed method has a much smaller computational burden than classical ones.

References

[1] Bartlett, M S.; Littlewort, G.; Fasel, I.; Movellan, J R. (2003): Real Time Face Detection and Facial Expression Recognition: Development and Applications to Human Computer Interaction, in Workshop on Computer Vision and Pattern Recognition for Human-Computer Interaction, pp. 1295-1302.

[2] Belhumeur, P.N.; Hespanha, J.P.; Kriegman, D.J. (1997): Eigenfaces vs. Fisherfaces: Recognition Using Class Specific Linear Projection, IEEE Trans. Pattern Analysis and Machine Intelligence, Vol. 19, pp. 711-720.

[3] Delac, K.; Grgic, M. (2007): Face Recognition, 1st ed., I-Tech Education and Publishing, Vienna, Austria.

[4] Deng, H.B.; Jin, L.W.; Zhen, L.X.; Huang, J.C. (2005): A New Facial Expression Recognition Method Based on Local Gabor Filter Bank and PCA plus LDA, International Journal of Information Technology Vol. 11 No. 11.

[5] Essa, I.; Pentland, A. (1997): Coding, analysis, interpretation, and recognition of facial expressions, IEEE Trans. Pattern Analysis and Machine Intelligence, Vol. 19, No. 7, pp. 757-763.

[6] Fazli, S.; Afrouzian, R.; Seyedarabi, H. (2009a): A Combined KPCA and SVM Method for Basic Emotional Expressions Recognition, Second International Conference on Machine Vision, Dubai.

[7] Fazli, S.; Afrouzian, R.; Seyedarabi, H. (2009b): High- Performance Facial Expression Recognition Using Gabor Filter and Probabilistic neural network, IEEE international conf on Intelligent Computing and Intelligent Systems, Shanghai, China.

[8] Gokturk, S.B.; Bouguet, J.Y.; Tomasi, C.; Girod, B. (2002): Model-based face tracking for view independent facial expression recognition, Proc. IEEE Int’l Conf. Face and Gesture Recognition, pp. 272-278.

[9] Kanade, T.; Cohn, J. F.; and Tian, Y. (2000): Comprehensive database for facial expression analysis, Proceedings of the Fourth IEEE International Conference on Automatic Face and Gesture Recognition, Grenoble, France, pp. 46-53.

[10] Lades, M.; Vorbruggen, J.C.; Buhmann, J.; Lange, J.; von der Malsburg, C.; Wurtz, R.P.; Konen, W. (1992): Distortion Invariant Object Recognition in the Dynamic Link Architecture, IEEE Transactions on Computers, 42, 3, 300-311.

[11] Lee, T S. (1996): Image Representation Using 2D Gabor Wavelets, IEEE Trans. Pattern Analysis and Machine Intelligence, Vol. 18, pp.959-971.

[12] Liu, C.; Wechsler, H. (2002): Gabor feature based classification using the enhanced fisher linear discriminant model for face recognition, IEEE Trans. Image Processing. 11 (4), 467–476.

[13] Liu, C.; Wechsler, H. (2003): Independent Component Analysis of Gabor Features for Face Recognition, IEEE Trans. Neural Networks, Vol. 14, pp. 919-928.

[14] Lyons, M.J.; Budynek, J.; Akamatsu, S. (1999): Automatic Classification of Single Facial Images, IEEE Trans. Pattern Analysis and Machine Intelligence, Vol. 21, No. 12.

[15] Martinez, A.M.; Kak, A.C. (2001): PCA versus LDA, IEEE Trans. Pattern Analysis and Machine Intelligence, Vol. 23, pp. 228-233.

[16] Pantic, M.; Patras, I. (2006): Dynamics of facial expression: Recognition of facial actions and their temporal segments from face profile image sequences, IEEE Trans. on Systems, Man and Cybernetics - Part B, Vol. 36, No. 2, pp. 433-449.

[17] Pantic, M.; Patras, I. (2005): Detecting facial actions and their temporal segments in nearly frontal-view face image sequences, Proc. IEEE Int'l Conf. on Systems, Man and Cybernetics, pp. 3358-3363.

[18] Pantic, M.; Rothkrantz, L.J.M. (2000): Automatic Analysis of Facial Expressions: The State of the Art, IEEE Trans. Pattern Analysis and Machine Intelligence, Vol. 22, pp. 1424-1444.

[19] Shih, F. Y.; Chuang, C., (2004): Automatic extraction of head and face boundaries and facial features, Information Sciences, Vol. 158, pp.117-130