New Advance of Feature Extraction Algorithm for FER


2017 3rd International Conference on Computer Science and Mechanical Automation (CSMA 2017) ISBN: 978-1-60595-506-3


Ting ZHANG

School of Information Engineering, Minzu University of China, Beijing, China, 100081

Keywords: Facial expression recognition, Feature extraction, Convolutional neural network, Computer vision

Abstract. As a challenging interdisciplinary topic in biometrics and affective computing, facial expression recognition (FER) has become a research hotspot in pattern recognition, computer vision and artificial intelligence, both at home and abroad. The extraction and selection of facial expression features is one of the most important steps in FER, and the validity of the extracted features directly affects recognition performance. This paper analyzes the current state of research on recent facial expression feature extraction algorithms for two-dimensional and three-dimensional data, compares the various methods in theory, and surveys facial expression recognition based on convolutional neural networks. Finally, the research challenges are summarized and possible trends are outlined.

Introduction

A study by the American psychologist Mehrabian shows that emotional expression = 7% language + 38% voice + 55% facial expression. Facial expression is therefore the most important carrier of human emotion and an important channel for exchanging emotional information between people. As a challenging interdisciplinary subject in biometrics and affective computing, facial expression recognition has attracted widespread attention, because it has both theoretical significance and broad application prospects.

Facial expression recognition is generally divided into four stages: image pre-processing, face detection, feature extraction and facial expression classification. Feature extraction is one of the most difficult of these tasks and is the key to accurate and practical expression recognition. Because of their different research backgrounds, researchers define the basic emotions differently. Among these definitions, Ekman's theory of emotion has had the greatest impact; it names six basic emotions: anger, disgust, fear, happiness, sadness and surprise. The basic task of feature extraction is to find the most effective representation among the extracted features, so that the difference between expressions of different categories is large and the difference between expressions of the same category is small.
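The separability goal stated above (large differences between categories, small differences within a category) can be illustrated with a Fisher-style ratio for a single scalar feature. This is only a sketch of one common criterion, not a measure taken from the surveyed papers; the function name `fisher_ratio` and the toy values are assumptions made for illustration.

```python
from statistics import mean

def fisher_ratio(samples_by_class):
    """Between-class scatter divided by within-class scatter for one scalar feature.

    samples_by_class maps an expression label to a list of feature values
    (hypothetical toy data; every class needs at least one value).
    """
    all_values = [v for values in samples_by_class.values() for v in values]
    overall_mean = mean(all_values)
    between = 0.0
    within = 0.0
    for values in samples_by_class.values():
        class_mean = mean(values)
        # Spread of the class means around the overall mean.
        between += len(values) * (class_mean - overall_mean) ** 2
        # Spread of the samples around their own class mean.
        within += sum((v - class_mean) ** 2 for v in values)
    return between / within if within > 0 else float("inf")

# A feature that separates two expressions well, and one that does not.
good = {"happiness": [0.9, 1.0, 1.1], "sadness": [0.1, 0.0, -0.1]}
poor = {"happiness": [0.0, 1.0, 0.5], "sadness": [0.1, 0.9, 0.5]}
print(fisher_ratio(good), fisher_ratio(poor))
```

A higher ratio indicates a feature whose values cluster tightly within each expression category while the category means lie far apart.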

Feature Extraction Algorithms for 2D FER

Facial expression is reflected in the movements of facial features such as the eyebrows, eyes and mouth. Extracting these features effectively is the key to accurate and practical expression recognition. At present there are many feature extraction methods, which can be divided into those based on geometric shape, those based on texture features, and those that combine the two.


transform. As a local operator, LBP (local binary pattern) can be used for image texture analysis, face recognition and facial expression recognition. Building on the idea of LBP, the LDP (local directional pattern) operator was proposed to extract facial texture features. Gabor features have been used to describe expressions in combination with elastic graph matching and linear analysis. However, the high dimensionality of Gabor features is an obstacle to fast facial expression recognition.
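To make the LBP operator concrete, the following minimal sketch computes the basic 3x3 LBP code of a pixel and a 256-bin histogram over an image given as a list of rows. The neighbour ordering and the use of `>=` for thresholding are conventions assumed here, and the function names are hypothetical; the surveyed papers may use other LBP variants.

```python
def lbp_code(image, row, col):
    """8-bit LBP code of the pixel at (row, col) from its 3x3 neighbourhood."""
    center = image[row][col]
    # Neighbours visited clockwise, starting at the top-left pixel (assumed ordering).
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]
    code = 0
    for bit, (dr, dc) in enumerate(offsets):
        # A neighbour at least as bright as the centre sets the corresponding bit.
        if image[row + dr][col + dc] >= center:
            code |= 1 << bit
    return code

def lbp_histogram(image):
    """256-bin histogram of LBP codes over all interior pixels."""
    hist = [0] * 256
    for r in range(1, len(image) - 1):
        for c in range(1, len(image[0]) - 1):
            hist[lbp_code(image, r, c)] += 1
    return hist

# Toy 3x3 grey-level patch with a single interior pixel.
patch = [
    [6, 5, 2],
    [7, 6, 1],
    [9, 8, 7],
]
print(lbp_code(patch, 1, 1))
```

In texture-based FER the face is usually split into regions and the per-region histograms are concatenated into one descriptor; the sketch above shows only the per-pixel step.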


In addition, multi-feature fusion is an effective way to represent facial expressions, and many algorithms combine different features to describe them. Table 1 lists the main feature extraction methods for 2D FER.

Table 1. Main feature extraction methods for expression recognition.

Literature  Feature         Expression categories  Dataset      Recognition rate
[3]         KSOM            6                      MMI          90.2%
[4]         ASM             7                      JAFFE        86.96%
[5]         Candide         4                      Cohn-Kanade  85%
[6]         FACS            7                      Cohn-Kanade  96.7%
[7]         LBP-TOP         6                      Cohn-Kanade  96.26%
[8]         LBP             7                      JAFFE        89.65%
[9]         Gabor           7                      JAFFE        89.67%
[9]         Gabor           7                      Cohn-Kanade  91.51%
[10]        Log-Gabor       7                      JAFFE        82.3%
[10]        Log-Gabor       6                      Cohn-Kanade  97.7%
[11]        Gauss-Laguerre  6                      Cohn-Kanade  90.37%
[11]        Gauss-Laguerre  7                      MMI          85.97%
[12]        LDP             7                      Cohn-Kanade  93.4%
[12]        LDP             7                      JAFFE        85.4%
[13]        PHOG+SIFT       6                      Cohn-Kanade  96.33%
[13]        PHOG+SIFT       7                      JAFFE        96.2%
[14]        LBP+Gabor       7                      Cohn-Kanade  99.2%
[14]        LBP+Gabor       6                      MMI          94.1%
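Several entries in Table 1 (for example PHOG+SIFT and LBP+Gabor) fuse two descriptors. One simple form of feature-level fusion is to normalize each descriptor and concatenate the results. The sketch below assumes L2 normalization and uses hypothetical function names and toy vectors; it is not the specific fusion scheme of any cited paper.

```python
import math

def l2_normalize(vector):
    """Scale a feature vector to unit Euclidean length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector] if norm > 0 else list(vector)

def fuse_features(*descriptors):
    """Feature-level fusion: normalize each descriptor, then concatenate them."""
    fused = []
    for descriptor in descriptors:
        fused.extend(l2_normalize(descriptor))
    return fused

texture = [3.0, 4.0]          # hypothetical texture descriptor (e.g. a histogram)
geometry = [0.0, 5.0, 12.0]   # hypothetical geometric descriptor (e.g. distances)
fused = fuse_features(texture, geometry)
print(fused)
```

Normalizing before concatenation keeps a descriptor with large raw values from dominating the fused vector.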

Feature Extraction Algorithms for 3D FER

3D facial expression features can resolve much of the dilemma faced by 2D images and improve facial expression recognition, because their acquisition is not affected by changes in the surrounding environment and they are robust to changes in face pose. The BU-3DFE database [15] is the main resource for research on 3D FER; each of its 3D face models is manually annotated with 83 key points. Geometric features can be extracted from these key points and used to characterize the significant changes relevant to FER. Figure 1 shows the 83 key points provided in the BU-3DFE database, together with distance and curvature features based on them.


Figure 1. (a) 83 facial feature points marked on the 3D facial expression model displayed in the texture mode. (b) 24 manually devised features defined by the normalized Euclidean distances between certain facial feature points on the 3D facial expression model displayed in the shade mode. (c) Distance and curvature features [16].

Methods Based on Geometric Features

Using the 83 key points provided by the BU-3DFE database, Soyel et al. selected 11 points and derived the 6 most representative distance features to characterize facial expression changes, obtaining a recognition rate of 91.3% [17]. Following the FAPUs (Facial Animation Parameter Units), Tang et al. [18] extracted the corresponding distance and slope features and achieved a recognition rate of 95.1%. Methods based on distance features of the 83 key points also appear in [19-21], with recognition rates on BU-3DFE samples of 93.7% [19, 20] and 88.2% [21], respectively. Srivastava et al. [22] took the offsets in magnitude and direction of all coordinate vectors as features, represented them as a two-dimensional matrix, and achieved an average recognition rate of 91.7%.
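The distance features used by these geometric methods can be sketched as Euclidean distances between selected 3D key points, divided by a reference distance so that the features do not depend on face size. The landmark names, coordinates and choice of reference pair below are hypothetical; they are not the BU-3DFE annotation scheme or the feature set of any cited paper.

```python
import math

def distance_features(landmarks, pairs, norm_pair):
    """Distances between landmark pairs, each divided by a reference distance."""
    reference = math.dist(landmarks[norm_pair[0]], landmarks[norm_pair[1]])
    return [math.dist(landmarks[a], landmarks[b]) / reference for a, b in pairs]

# Hypothetical 3D key points (x, y, z) for one face scan.
landmarks = {
    "left_eye_outer": (-4.0, 0.0, 0.0),
    "right_eye_outer": (4.0, 0.0, 0.0),
    "mouth_left": (-2.0, -6.0, 0.0),
    "mouth_right": (2.0, -6.0, 0.0),
    "upper_lip": (0.0, -5.0, 1.0),
    "lower_lip": (0.0, -7.0, 1.0),
}
pairs = [("mouth_left", "mouth_right"), ("upper_lip", "lower_lip")]
features = distance_features(landmarks, pairs, ("left_eye_outer", "right_eye_outer"))
print(features)  # mouth width and mouth opening relative to the outer eye distance
```

A feature vector of this kind would then be passed to a classifier; the choice of pairs is where the cited methods differ.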

Methods Based on Local Patch

Feature extraction algorithms based on local patches capture local facial expression information effectively and, compared with geometric features, reduce the loss of expression information to some degree. Using 64 labeled key points, Wang et al. [23] divided the face into 7 local patches and used the distribution histogram of surface features in each patch to represent facial expression changes, achieving a recognition rate of 83.6%.

Using the 83 key points of each 3D face in the BU-3DFE database, Maalej et al. [24, 25] used level curves to describe the deformation of the local patches around these key points, with recognition rates on BU-3DFE samples of 96.1% [25] and 98.8% [24], respectively. Based on the attributes of the patches around selected key points, Lemaire et al. fitted the SFAM (Statistical Facial feAture Model) and obtained an average recognition rate of 75.8%.

Methods Based on Probabilistic Model


FER Algorithms Based on CNN Features

Deep learning builds nonlinear neural networks with multiple hidden layers to simulate the way the human brain analyzes and learns. At present, research on facial expression recognition based on deep learning focuses mainly on the CNN (convolutional neural network). One CNN-based FER system trains a multi-task deep neural network that both predicts the locations of facial feature points and recognizes the facial expression. A novel approach combining optical flow with a deep neural network, the stacked sparse autoencoder, analyzes video image sequences effectively and reduces the influence of individual differences in appearance on facial expression recognition [31]. By combining regions of interest (ROI) with K-nearest neighbors (KNN), a fast and simple improved method for facial expression classification called ROI-KNN was proposed; it mitigates the poor generalization that deep neural networks suffer when data are scarce and lowers the test error rate [32]. Another architecture uses two hard-coded feature extractors, a convolutional autoencoder (CAE) and a standard CNN, which significantly boosts accuracy and reduces the overall training time [33]. A 3D convolutional neural network architecture consisting of 3D Inception-ResNet layers followed by an LSTM unit extracts both the spatial relations within facial images and the temporal relations between different frames of a video [34]. An attention model, a deep architecture that uses convolutional neural networks to learn the location of emotional expressions in a cluttered scene, greatly improved expression recognition [35].
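The building blocks shared by the CNN-based methods above can be sketched as a single convolution, ReLU and max-pooling stage. The example below uses plain Python lists and a hand-picked edge kernel purely for illustration; a real CNN learns its kernels from data and stacks many such stages, and none of the cited architectures is reproduced here.

```python
def conv2d(image, kernel):
    """Valid 2D cross-correlation, the operation CNN layers call convolution."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j] for i in range(kh) for j in range(kw))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]

def relu(feature_map):
    """Element-wise rectified linear unit."""
    return [[max(0.0, v) for v in row] for row in feature_map]

def max_pool(feature_map, size=2):
    """Non-overlapping max pooling; rows or columns that do not fill a window are dropped."""
    rows = len(feature_map) // size
    cols = len(feature_map[0]) // size
    return [
        [
            max(feature_map[r * size + i][c * size + j] for i in range(size) for j in range(size))
            for c in range(cols)
        ]
        for r in range(rows)
    ]

# Toy 5x5 image with a dark-to-bright vertical edge between columns 1 and 2.
image = [[0, 0, 1, 1, 1] for _ in range(5)]
vertical_edge = [[-1, 1], [-1, 1]]  # hand-picked kernel; a trained CNN learns its own
features = max_pool(relu(conv2d(image, vertical_edge)))
print(features)
```

The pooled map responds only where the edge lies, which is the sense in which convolutional features localize facial structures such as the mouth or eyebrow contours.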

Current Problems and Future Prospects

With the rapid improvement of computer hardware and image processing technology, research on facial expression recognition has produced many encouraging results. However, many theoretical and technical problems remain that seriously limit its development and adoption, mainly the following:

(1) Establishing a standard expression database. There is no uniform standard: different researchers use different expression databases, and some build their own. This makes it difficult to compare recognition algorithms. Many researchers therefore look forward to a public, standard facial expression database containing a large number of images.

(2) Extracting more robust features. At present, expression samples contain little variation in facial pose and do not take into account factors such as face occlusion. The effect of these factors on feature extraction algorithms needs further study so that more robust expression features can be obtained.

(3) Facial expression recognition based on video sequences. A video sequence shows how a facial expression changes over time and, compared with a static image, is closer to real conditions. Recognition based on video sequences will therefore become a focus of future research.

(4) Reducing the computational complexity of FER algorithms.

(5) Recognition of subtle or mixed expressions. Most studies still address the six basic facial expressions, but in reality people often show subtle changes in expression or mixed expressions. How to recognize subtle or mixed expressions therefore remains an important problem.

References


[2] I. Kotsia, I. Pitas. Facial Expression Recognition in Image Sequences Using Geometric Deformation Features and Support Vector Machines, IEEE Transactions on Image Processing, 16, 1, 172-187, 2007.

[3] A. Majumder, L. Behera, V. K. Subramanian. Emotion recognition from geometric facial features using self-organizing map, Pattern Recognition, 47, 3, 1282-1293, 2014.

[4] K. C. Huang, Y. H. Kuo, M. F. Horng. Emotion recognition by a novel triangular facial feature extraction method, International Journal of Innovative Computing Information and Control, 8, 11, 7729-7746, 2012.

[5] R. A. Patil, V. Sahula, A. S. Mandal. Features classification using geometrical deformation feature vector of support vector machine and active appearance algorithm for automatic facial expression recognition, Machine Vision and Applications, 25, 3, 747-761, 2014.

[6] Y. I. Tian, T. Kanade, J. F. Cohn. Recognizing action units for facial expression analysis, IEEE Transactions on Pattern Analysis and Machine Intelligence, 23, 2, 97-115, 2001.

[7] G. Zhao, M. Pietikäinen. Dynamic texture recognition using local binary patterns with an application to facial expressions, IEEE Transactions on Pattern Analysis and Machine Intelligence, 29, 6, 915-928, 2007.

[8] D. T. Lin, D. C. Pan. Integrating a mixed-feature model and multiclass support vector machine for facial expression recognition, Integrated Computer-Aided Engineering, 16, 1, 61-74, 2009.

[9] W. F. Gu, C. Xiang, Y. V. Venkatesh, D. Huang, H. Lin. Facial expression recognition using radial encoding of local Gabor features and classifier synthesis, Pattern Recognition, 45, 1, 80-91, 2012.

[10] S. M. Lajevardi, Z. M. Hussain. Automatic facial expression recognition: feature extraction and selection, Signal Image and Video Processing, 6, 1, 159-169, 2012.

[11] A. Poursaberi, H. A. Noubari, M. Gavrilova, S. N. Yanushkevich. Gauss-Laguerre wavelet textural feature fusion with geometrical information for facial expression identification, EURASIP Journal on Image and Video Processing, 17, 1, 1-13, 2012.

[12] T. Jabid, M. H. Kabir, O. Chae. Robust Facial Expression Recognition Based on Local Directional Pattern, ETRI Journal, 32, 5, 784-794, 2010.

[13] Z. Li, J. Imai, M. Kaneko. Facial-component-based Bag of Words and PHOG Descriptor for Facial Expression Recognition, Proceedings of the 2009 IEEE International Conference on Systems, Man, and Cybernetics, San Antonio, TX, USA, 64, 2, 1353-1358, 2009.

[14] T. H. H. Zavaschi, A. S. Britto, L. E. S. Oliveira, A. L. Koerich. Fusion of feature sets and classifiers for facial expression recognition, Expert Systems with Applications, 40, 2, 646-655, 2013.

[15] L. Yin, X. Wei, Y. Sun, J. Wang, M. J. Rosato. A 3D Facial Expression Database For Facial Behavior Research, 7th Conference on Automatic Face & Gesture Recognition, 211-216, 2006.

[16] T. Sha, M. Song, J. Bu, C. Chen, D. Tao. Feature level analysis for 3D facial expression recognition, Neurocomputing, 74, 12, 2135-2141, 2011.

[17] H. Soyel, H. Demirel. Facial Expression Recognition Using 3D Facial Feature Distances, Springer Berlin Heidelberg, 4633, 831-838, 2007.


[19] H. Soyel, H. Demirel. Optimal feature selection for 3D facial expression recognition with geometrically localized facial features, International Conference on Soft Computing, 1-4, 2009.

[20] H. Soyel, H. Demirel. Optimal feature selection for 3D facial expression recognition using coarse-to-fine classification, Turkish Journal of Electrical Engineering & Computer Sciences, 18, 6, 1031-1040, 2010.

[21] U. Tekguc, H. Soyel, H. Demirel. Feature selection for person-independent 3D facial expression recognition using NSGA-II, International Symposium on Computer & Information Sciences, 35-38, 2009.

[22] R. Srivastava, S. Roy. 3D facial expression recognition using residues, Tencon IEEE Region 10 Conference, 1-5, 2010.

[23] J. Wang, L. Yin, X. Wei, Y. Sun. 3D Facial Expression Recognition Based on Primitive Surface Feature Distribution, IEEE Computer Society Conference on Computer Vision & Pattern Recognition, 2, 1399-1406, 2006.

[24] A. Maalej, B. B. Amor, M. Daoudi, A. Srivastava, S. Berretti. Shape analysis of local facial patches for 3D facial expression recognition, Pattern Recognition, 44, 8, 1581-1589, 2011.

[25] A. Maalej, B. B. Amor, M. Daoudi, A. Srivastava, S. Berretti. Local 3D Shape Analysis for Facial Expression Recognition, International Conference on Pattern Recognition, 4129-4132, 2010.

[26] S. Ramanathan, A. Kassim, Y. V. Venkatesh, S. W. Wu. Human Facial Expression Recognition using a 3D Morphable Model, IEEE International Conference on Image Processing, 661-664, 2007.

[27] I. Mpiperis, S. Malassiotis, M. G. Strintzis. Bilinear Models for 3-D Face and Facial Expression Recognition, IEEE Transactions on Information Forensics & Security, 3, 3, 498-511, 2008.

[28] I. Mpiperis, S. Malassiotis, V. Petridis, M. G. Strintzis. 3D Facial Expression Recognition using Swarm Intelligence, IEEE International Conference on Acoustics, 2133-2136, 2008.

[29] I. Mpiperis, S. Malassiotis, M. G. Strintzis. Bilinear elastically deformable models with application to 3D face and facial expression recognition, 8th Conference on Automatic Face & Gesture Recognition, 32, 2, 1-8, 2008.

[30] I. Mpiperis, S. Malassiotis, M. G. Strintzis. Bilinear Decomposition of 3-D face images: An application to facial expression recognition, Workshop on Image Analysis for Multimedia Interactive Services, 1-4, 2009.

[31] Y. Liu, X. Hou, J. Chen, et al. Facial expression recognition and generation using sparse autoencoder, Proceedings of IEEE International Conference on Smart Computing (SMARTCOMP), Hong Kong, China, 125-130, 2014.

[32] X. Sun, T. Pan, F.-J. Ren. Facial expression recognition using ROI-KNN deep convolutional neural networks, Acta Automatica Sinica, 42, 6, 883-891, 2016.

[33] D. Hamester, P. Barros, S. Wermter. Face Expression Recognition with a 2-Channel Convolutional Neural Network, Proceedings of International Joint Conference on Neural Networks (IJCNN), 1787-1794, 2015.