2016 International Conference on Electronic Information Technology and Intellectualization (ICEITI 2016) ISBN: 978-1-60595-364-9
Robust Face Recognition Based on Covariance Matrix
Jianhong Ma, Han Zhang and Baofeng He
ABSTRACT
To address the robustness of face recognition based on depth image sets, we treat multiple Kinect images as an image set and use the captured depth data to automatically estimate poses and crop the face area. First, each image set is divided into c subsets according to pose, and the images in all subsets are divided into 4×4 image blocks. Each subset is then modeled through its image blocks and represented by a covariance matrix, so that the image subsets are modeled on a Riemannian manifold. For classification, SVM models are learnt separately for each image subset on the Lie group of the Riemannian manifold, and a fusion strategy combines the results from all image subsets. We have verified the effectiveness of the proposed method on the three largest public Kinect face datasets: Curtin Faces, Biwi Kinect and UWA Kinect. Compared to other advanced methods, the recognition rate is higher and the standard deviation is kept low, and the method is robust to the number of image sets, the number of image subsets and the spatial resolution.
INTRODUCTION
Existing face recognition methods based on 2D image sets can be divided into parametric model methods [1-3] and nonparametric model methods [4]. Parametric model methods fit a statistical distribution model to each image set and then measure the similarity of two image sets through a divergence between the distributions. If there is no strong statistical relationship between the test set and the training set, such methods cannot estimate the right parameters. Nonparametric image set representation methods, in contrast, make no assumptions about the statistical distribution of the data [5]. Reference [6] uses low-resolution RGB-D images obtained from the device for face recognition based on a single image, and proposes a method based on estimating frontal-view and side-view scans. However, it requires manual tip detection and uses the iterative closest point (ICP) algorithm to align the facial scan with a reference scan, which hampers its effectiveness in many practical applications. Reference [7] models the samples of an image set by an affine hull or convex hull, and uses principal angles between linear subspaces in a Banach space to represent the distance between image sets. The principal angles between two subspaces satisfy 0 ≤ θ1 ≤ θ2 ≤ ... ≤ θd ≤ π/2, where the first angle θ1 is defined as the minimum angle between any vector in the first subspace and any vector in the second subspace.
Jianhong Ma, Han Zhang, Software Technology School, Zheng Zhou University, Henan, China, 450002
A method from the references is used to estimate the 3D pose of the face. According to the estimated pose, each image set is divided into image clusters, each cluster is taken as an image subset, and each image subset is modeled on a Riemannian manifold through a block-based covariance matrix representation. For classification, a Riemannian kernel function embeds the points of the Riemannian manifold into a Reproducing Kernel Hilbert Space (RKHS), so that a separate Support Vector Machine (SVM) model is learnt for each image subset. The main contributions are summarized as follows:
(1) the captured depth images are used to divide each image set into subsets, which addresses pose variation;
(2) a block-based covariance matrix representation of image sets is introduced; unlike existing image set representations, its dimension is very low, which gives it a clear computational advantage;
(3) support vector machines (SVM) are trained on the Riemannian manifold, and the results from the independent subsets are fused.
AUTOMATIC PREPROCESSING
learning an SVM model for each subset separately; the classification result for the query image set is obtained by fusing the classification results of its c image subsets.
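The fusion rule is not spelled out in the text above, so the following is only a minimal sketch under an assumed rule: the per-class scores produced by each subset model are summed and the class with the largest total wins. The function name `fuse_subset_scores` and the score layout (one row per subset, one column per class) are hypothetical choices made for illustration.

```python
import numpy as np

def fuse_subset_scores(scores_per_subset):
    """Fuse per-subset class scores by summation (an assumed fusion rule).

    scores_per_subset: array-like of shape (num_subsets, num_classes),
    where each row holds the scores of one subset model for one query set.
    Returns the index of the winning class and the summed scores.
    """
    total = np.sum(np.asarray(scores_per_subset, dtype=float), axis=0)
    return int(np.argmax(total)), total

# Three subset models voting over three identities (illustrative numbers).
label, total = fuse_subset_scores([[0.2, 0.7, 0.1],
                                   [0.6, 0.3, 0.1],
                                   [0.1, 0.8, 0.1]])
```

Other rules, such as majority voting over the per-subset decisions, fit the same interface.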
PROPOSED METHOD
Image Set to Image Subset
An image set is divided into c image subsets. First, a clustering algorithm is applied to the pose information available for all training data to compute the centers of c pose clusters. Using these cluster centers, each image in a set is assigned to one of the c clusters: it goes to the cluster whose center has the shortest Euclidean distance to the pose vector of the image. To improve the recognition ability, the pose of each image is represented by its rotation matrix. The pose of each image is transformed from the Euler angles $(\phi, \theta, \psi)$ to the rotation matrix $R_M \in \mathbb{R}^{3 \times 3}$ by the following formula:

$$
R_M = \begin{bmatrix}
\cos\theta \cos\psi & -\cos\phi \sin\psi + \sin\phi \sin\theta \cos\psi & \sin\phi \sin\psi + \cos\phi \sin\theta \cos\psi \\
\cos\theta \sin\psi & \cos\phi \cos\psi + \sin\phi \sin\theta \sin\psi & -\sin\phi \cos\psi + \cos\phi \sin\theta \sin\psi \\
-\sin\theta & \sin\phi \cos\theta & \cos\phi \cos\theta
\end{bmatrix} \tag{1}
$$

In formula (1), $R_M$ is then flattened into a single vector of 9 elements, which is used to represent the pose of the face in an image.
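The pose step can be illustrated with a short sketch. It assumes the common Z-Y-X Euler convention with angles named `phi`, `theta` and `psi` in radians (the angle names and the convention are assumptions, since the symbols are not legible in the source), and it assumes the cluster centers have already been computed by some clustering algorithm. The function names are hypothetical.

```python
import numpy as np

def euler_to_rotation(phi, theta, psi):
    """3x3 rotation matrix for Euler angles in radians (Z-Y-X convention assumed)."""
    cf, sf = np.cos(phi), np.sin(phi)
    ct, st = np.cos(theta), np.sin(theta)
    cp, sp = np.cos(psi), np.sin(psi)
    return np.array([
        [ct * cp, sf * st * cp - cf * sp, cf * st * cp + sf * sp],
        [ct * sp, sf * st * sp + cf * cp, cf * st * sp - sf * cp],
        [-st,     sf * ct,                cf * ct],
    ])

def pose_vector(phi, theta, psi):
    """Flatten the rotation matrix into a single 9-element pose vector."""
    return euler_to_rotation(phi, theta, psi).reshape(9)

def assign_to_clusters(pose_vectors, centers):
    """Index of the nearest center (Euclidean distance) for each pose vector.

    pose_vectors: array of shape (n, 9); centers: array of shape (c, 9).
    """
    diffs = pose_vectors[:, None, :] - centers[None, :, :]
    return np.argmin(np.linalg.norm(diffs, axis=2), axis=1)

# Two hypothetical cluster centers and two query poses.
centers = np.stack([pose_vector(0.0, 0.0, 0.0), pose_vector(0.0, 0.0, 1.0)])
poses = np.stack([pose_vector(0.0, 0.0, 0.1), pose_vector(0.0, 0.0, 0.9)])
subset_index = assign_to_clusters(poses, centers)
```

Each image then joins the subset given by `subset_index`, so a set with varied head poses is split into up to c pose-specific subsets.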
Image Set Representation
Consider an image set with n images, $X = \{x^i\}_{i=1}^{n}$. Each image of the set is first divided into 4×4 equally spaced, non-overlapping blocks. These blocks are flattened into a single row of 16 entries, and the process is repeated for all images in the set. The image set is then represented by the covariance matrix of the image blocks, $C \in \mathbb{R}^{16 \times 16}$, defined as follows:

$$
C_{p,q} = \frac{1}{n} \sum_{i=1}^{n} \left( x_p^i - \bar{x}_p \right) \left( x_q^i - \bar{x}_q \right) \tag{2}
$$

where $x_p^i$ is the p-th block entry of the i-th image and $\bar{x}_p$ is its mean over the n images.
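A minimal sketch of the block-based covariance representation follows. The text does not state which scalar summarizes each block, so the sketch assumes the mean intensity of the block; that choice, the function names and the `grid` parameter are assumptions for illustration only. The normalization by n follows formula (2).

```python
import numpy as np

def block_features(image, grid=4):
    """Mean intensity of each cell in a grid x grid partition of a 2D image.

    The per-block mean is an assumed block feature; rows and columns that do
    not fit evenly into the grid are trimmed.
    """
    h, w = image.shape
    bh, bw = h // grid, w // grid
    trimmed = image[:bh * grid, :bw * grid]
    blocks = trimmed.reshape(grid, bh, grid, bw)
    return blocks.mean(axis=(1, 3)).reshape(grid * grid)

def set_covariance(images, grid=4):
    """Covariance matrix of block features over an image set, normalized by n."""
    feats = np.stack([block_features(img, grid) for img in images])  # (n, 16)
    centered = feats - feats.mean(axis=0)
    return centered.T @ centered / len(images)

# A synthetic image set of 30 images of size 32x32.
rng = np.random.default_rng(0)
images = rng.random((30, 32, 32))
C = set_covariance(images)  # shape (16, 16), independent of the image size
```

Whatever the number or resolution of the images, the set is summarized by a 16×16 matrix, which is the low-dimensional property the representation relies on. When a set holds fewer images than blocks, the matrix is singular, and a small multiple of the identity is commonly added before any matrix logarithm is taken.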
Support Vector Machine of Covariance Matrix
A kernel function is used to map the points on the Riemannian manifold into a reproducing kernel Hilbert space (RKHS). In this way, an SVM classifier can be used on the Riemannian manifold of symmetric positive definite matrices.
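The kernel itself is not named in this section. As one possible instance, the sketch below uses a Gaussian kernel on the log-Euclidean distance, k(A, B) = exp(-gamma * ||log A - log B||_F^2), which is a known positive definite kernel on symmetric positive definite matrices. The choice of kernel, the `gamma` value, the `eps` regularizer and the function names are all assumptions.

```python
import numpy as np

def spd_log(C, eps=1e-6):
    """Matrix logarithm of a symmetric positive definite matrix.

    Uses an eigendecomposition; eps * I is added first so that nearly
    singular covariance matrices still have strictly positive eigenvalues.
    """
    w, V = np.linalg.eigh(C + eps * np.eye(C.shape[0]))
    return (V * np.log(w)) @ V.T

def log_euclidean_gram(mats_a, mats_b, gamma=0.1):
    """Gram matrix of k(A, B) = exp(-gamma * ||log A - log B||_F^2)."""
    la = np.stack([spd_log(m).ravel() for m in mats_a])
    lb = np.stack([spd_log(m).ravel() for m in mats_b])
    sq = ((la[:, None, :] - lb[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-gamma * sq)

# Five synthetic 16x16 covariance-like matrices.
rng = np.random.default_rng(1)
mats = []
for _ in range(5):
    A = rng.random((16, 20))
    mats.append(A @ A.T / 20)
K = log_euclidean_gram(mats, mats)
```

A Gram matrix of this kind can be handed to any SVM implementation that accepts precomputed kernels, for example scikit-learn's `SVC(kernel="precomputed")`, with one such model trained per image subset.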
RESULTS AND ANALYSIS
Face Dataset
Curtin Faces, Biwi Kinect and UWA Kinect are three of the largest publicly available Kinect face datasets, with 52 subjects (5000 RGB-D images), 20 subjects (15,000 RGB-D images) and 48 subjects (15,000 RGB-D images), respectively. The images in these datasets vary in illumination conditions, head pose, facial expression, sunglasses worn as an accessory, and occlusion by a hand.
Curtin Faces dataset: this dataset contains 52 subjects and more than 5000 RGB-D images captured with the Microsoft Kinect sensor. Each subject is captured under different poses, illumination conditions and facial expressions, and with sunglasses. Sample images of one subject in the database are shown in Figure 1. The database contains images with sunglasses, images without sunglasses and images in which the face is covered by a hand.
Figure 1. Some sample images of Curtin Faces dataset.
Figure 2. Some sample images of Biwi Kinect dataset.
UWA Kinect dataset: this dataset was captured in a lab environment at the University of Western Australia and contains more than 15,000 low-resolution RGB-D images of 48 subjects, obtained with a Kinect sensor. The number of images per person ranges from 289 to 500. Some images of one person from the UWA Kinect dataset are shown in Figure 3. Each subject in the database exhibits changes in facial expression and large head rotations.
Figure 3. Some sample images of UWA Kinect dataset.
Influence of Image Set Quantity
We combine the three datasets into one large dataset containing 120 subjects and over 35,000 RGB-D images. The images of each subject are randomly shuffled and divided into k image sets; one image set per subject is used for training and the remaining k-1 image sets are used for testing. The same image is never repeated in different sets. In total, there are 120 training image sets and 120 × (k-1) test image sets.
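The splitting protocol can be sketched as follows. This is an illustration under assumptions rather than the exact procedure used in the experiments: images are referred to by index, the first of the k sets is arbitrarily taken as the training set, and the names `split_into_sets` and `make_protocol` are hypothetical.

```python
import numpy as np

def split_into_sets(num_images, k, rng):
    """Randomly partition image indices 0..num_images-1 into k disjoint sets."""
    order = rng.permutation(num_images)
    return np.array_split(order, k)

def make_protocol(images_per_subject, k, seed=0):
    """Map each subject to (training set, list of k-1 test sets) of image indices."""
    rng = np.random.default_rng(seed)
    protocol = {}
    for subject, count in images_per_subject.items():
        sets = split_into_sets(count, k, rng)
        protocol[subject] = (sets[0], sets[1:])
    return protocol

# Two hypothetical subjects with 100 and 57 images, split with k = 5.
proto = make_protocol({"s1": 100, "s2": 57}, k=5)
train, tests = proto["s2"]
```

Because the sets come from a single permutation, no image index appears in more than one set, and each subject contributes one training set and k-1 test sets.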
The value of k is then varied from 5 to 10. Increasing k makes the experiment more challenging, because a larger k reduces the number of images in each set and, while the number of training sets stays the same, increases the total number of test sets. To obtain consistent results, 5 experiments are carried out, each with a different random assignment of images to the k sets. The average recognition rate and the standard deviation are shown in Table I. The average recognition rates of RGB images, D images and RGB-D data fusion were 91.18%, 93.82% and 94.73%, respectively.
As the value of k increases, the average recognition performance tends to decrease, because the images are divided into more image sets and the number of images in each set is reduced. Fewer images in a set means less diversity and less discriminative information among the images. However, the difference in recognition performance between k = 5 and k = 10 is small, which indicates that the method is robust to variation in the number of images.
TABLE I. RELATIONSHIP BETWEEN K VALUE AND AVERAGE RECOGNITION RATE (%).
K value    5      6      7      8      9      10
RGB        92.1   92.0   92.3   90.8   91.0   91.2
D          94.8   94.1   94.0   93.8   93.0   94.2
RGB-D      96.2   95.1   95.2   95.6   94.1   95.1
CONCLUSION
Using low-quality depth images obtained from low-cost 3D sensors, this paper formulates face recognition from such data as an RGB-D image set classification problem, and introduces a block-based covariance matrix representation to model image sets on a Riemannian manifold. RKHS theory is used to adapt SVM classification to the manifold. The method is evaluated on a large database that contains simultaneous variations in pose, facial expression, illumination conditions, sunglasses and occlusion by a hand. The experimental results show that the method is an effective classification framework for RGB-D images and is practical for face recognition.
ACKNOWLEDGEMENTS
We thank all the anonymous reviewers for their valuable comments. The project is supported by the National Natural Science Foundation of Henan talent training joint fund (U1304107), the Science and Technology Projects of Henan Province (142102210500, 122102210518), and the Key Scientific Research Projects of the Henan Provincial Department of Education (15A520029).
REFERENCES
2. Cevikalp H., Triggs B. Face recognition based on image sets [C]//Proc of IEEE Conference on Computer Vision & Pattern Recognition. Washington DC: IEEE Computer Society, 2010: 2567-2573.
3. Li Yong-zhou, Luo Da-yong, Liu Shao-qiang. Face Recognition Using Orthogonal Discriminant Linear Local Tangent Space Alignment [J]. Journal of Image and Graphics. 2009, 14(11): 2311-2315.
4. Wang Li-yan, Li Wei-sheng. MMDA with blocking clustering for face recognition [J]. Application Research of Computers. 2014, 31(9): 2853-2855.
5. Liang Shu-fen, Liu Yin-hua, Li Li-chen. Face recognition under unconstrained based on LBP and deep learning [J]. Journal on Communications. 2014, 35(6): 154-160.
6. Li B.Y.L., Mian A., Liu W., et al. Using Kinect for face recognition under varying poses, expressions, illumination and disguise [C]//Proc of IEEE Workshop on Applications of Computer Vision. Washington DC: IEEE Computer Society, 2013: 186-192.