A Spiking Neural Networks Based Face Recognition Algorithm


2017 2nd International Conference on Artificial Intelligence: Techniques and Applications (AITA 2017) ISBN: 978-1-60595-491-2


Jin-qing LIU¹, Yin LIU², Li-chun YU¹ and Xiao-yun DENG³

¹ Department of Information Technology, Fuzhou University of International Studies and Trade, Fuzhou, Fujian, China, 350202

² The Department of Communication Engineering, Xiamen University, Xiamen, Fujian, China, 361005

³ Fujian Provincial Key Laboratory for Photonics Technology, Fujian Normal University, Fuzhou, Fujian, China, 350007

Keywords: Spiking neural network, Nearest neighbor classifier, Feature recognition, Complexity and best threshold.

Abstract. In this paper, a complexity-based algorithm is used to locate the human eyes, and the best threshold method is then used to refine the eye location. This method is more accurate than the gray projection method, and it is faster and less affected by light and noise. Spiking neural networks, which inherit the parallel mechanism of biological systems, are used to extract the face features. The spiking neural networks can remember key features of a visual image through the synapse strength distribution and recall the visual image by triggering a specific neuron. Based on the key features, the nearest neighbor classifier is used for matching faces. Experimental results show that the proposed eye location algorithm works well for multiple positions and complex backgrounds, and that face feature extraction based on spiking neural networks achieves a high recognition rate and reduces the run time. Furthermore, the algorithm can be ported to a GPU platform, where it is sped up dramatically.

Introduction

In this paper, face recognition is divided into three steps: face localization, face feature extraction, and face recognition. Face localization is the foundation of the whole face recognition technology. This paper uses a method based on complexity and the best threshold value [1] for human eye localization, which can locate the face quickly and accurately.

A spiking neural network [2] is introduced in this paper to perform the discrete cosine transform for visual images. Simulation results show that the spiking neural network is able to perform the discrete cosine transform for visual images, and that a small number of the network's coefficients is sufficient to reconstruct the original image. Compared to single-thread serial computation, spiking neural networks, which inherit the parallel mechanism of biological systems, are more effective for image processing.

The rest of the paper is organized as follows. The face alignment method is described in Section II, and feature extraction using spiking neural networks is introduced in Section III. Section IV presents face recognition using the nearest neighbor classifier. Experimental results are described in Section V, and Section VI concludes the paper.

Face Alignment and Normalization


characteristic points on the basis of the curves [4]. The second one is based on statistics. Wu An used gray level information and BP neural networks to build a pupil filter for locating the eyes. This method can improve the accuracy, but it needs a large number of samples to train the classifier. The third one is based on rules, such as those in [5]. The generality of this method is restricted by the amount of prior knowledge it requires.

Feature Extraction Based on Spiking Neural Networks

Architecture of the Neural Networks

Since the 1990s, scientists have put forward a body of spiking neural network theory based on the neuron model of the Nobel Prize winner Hodgkin [6], which is closer to biological neural networks. Combining biological visual information processing mechanisms with spiking neural networks to study image information processing has become an interdisciplinary topic spanning computer vision, neuroscience and intelligent science [7].

The human brain can remember key features of a visual image at a glance. The brain's visual information processing capacity far exceeds that of any synthetic information processing system. The human visual system contains complex circuits of neurons that extract salient information from visual inputs. Signals from photoreceptors are processed by retinal interneurons, integrated by retinal ganglion cells and sent to the brain along the axons of the retinal ganglion cells through different pathways [8].

In digital image processing, each pixel can be regarded as a stimulus of a particular strength applied to a neuron. The stimulus drives the neuron's membrane potential toward a threshold value, and when the potential reaches the threshold the spiking neuron releases a spike sequence [9]. The frequency of the spike sequence thus reflects changes in the strength of the input signal. The response of spiking neurons [10] to the strength of the input visual signal is shown in Figure 1.
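The relation between stimulus strength and spike frequency can be illustrated with a minimal sketch. It assumes a leaky integrate-and-fire neuron, which is a simplification and not necessarily the neuron model used in this paper; the function names, the time constant, the threshold and the pixel-to-current scaling are all hypothetical values chosen for illustration.

```python
def lif_firing_rate(input_current, t_sim=1.0, dt=1e-4,
                    tau=0.02, v_rest=0.0, v_thresh=1.0, resistance=1.0):
    """Simulate a leaky integrate-and-fire neuron; return its spike rate in Hz.

    All parameter values are illustrative assumptions, not taken from the paper.
    """
    v = v_rest
    spikes = 0
    steps = int(round(t_sim / dt))
    for _ in range(steps):
        # Euler step of: tau * dv/dt = -(v - v_rest) + R * I
        v += dt / tau * (-(v - v_rest) + resistance * input_current)
        if v >= v_thresh:
            # Threshold reached: emit a spike and reset the membrane potential.
            spikes += 1
            v = v_rest
    return spikes / t_sim


def pixel_to_current(pixel, max_current=5.0):
    """Map an 8-bit gray value to an input current (hypothetical linear scaling)."""
    return max_current * pixel / 255.0


# Brighter pixels inject more current, so the neuron fires more often.
rates = [lif_firing_rate(pixel_to_current(p)) for p in (0, 64, 128, 255)]
print(rates)
```

In this sketch a pixel whose current never lifts the potential to the threshold produces no spikes at all, and above that point the spike rate grows with the gray value.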


Figure 1. The response to the input signal strength of spiking neurons.

Figure 2. Spiking neural network for feature extraction and image reconstruction.

Inspired by biological information processing mechanism above, a spiking neuron network model is proposed to remember key features of the visual image by inference. The spiking neural network for feature extraction and image reconstruction is shown in Figure 2. The model of spiking neural network can be used to explain how a spiking neuron-based system can store the key features of visual image. The visual image can be recalled by injecting a stimulus current to the specific neurons in this model.

The network is composed of three layers. The first layer consists of light receptors: each pixel corresponds to one receptor, so that the pixel values can be transformed into spiking signals.


neurons in the input array through excitatory synapses and from a specific neuron $KF_i$ in the key feature neuron layer through a synapse. Their synapse strength distributions are $W_{N(m,n)\to ON(p,q)}$ and $W_{KF_i\to ON(p,q)}$ for the ON neurons and, similarly, $W_{N(m,n)\to OFF(p,q)}$ and $W_{KF_i\to OFF(p,q)}$ for the OFF neurons.

The reconstruction neuron layer is used to demonstrate reconstruction of the visual image. Using the value distributions of $W_{KF_i\to ON(p,q)}$ and $W_{KF_i\to OFF(p,q)}$ and the stimulus current $I_s$ of the key feature neuron $KF_i$, a recalled image can be obtained from the reconstruction neuron array even if no original image is presented to the input neuron array. This means that the distributions of $W_{KF_i\to ON(p,q)}$ and $W_{KF_i\to OFF(p,q)}$ have stored the key features of the image.

Extraction Based On Spiking Neural Networks

Feature extraction is a very important issue for face recognition. In extraction we try to find the most discriminatory and robust feature representation so as to recognize faces accurately and rapidly when face images are under different illumination, expression or point of view[11].

The Discrete Cosine Transform (DCT) is an efficient approach for key feature extraction in the image processing domain. Like other transforms, the DCT attempts to decorrelate the image data. After decorrelation, each transform coefficient can be encoded independently without losing compression efficiency. DCT has been used as a feature extraction step in various studies on face recognition, where it results in a significant reduction of computational complexity and better recognition rates. The spiking neural network proposed in the previous section is based on the principle of the DCT for visual images.

The 2-D DCT is a direct extension of the 1-D case and is represented as follows:

$$C(u,v) = \frac{2}{\sqrt{MN}}\,\alpha(u)\,\alpha(v) \sum_{x=0}^{M-1}\sum_{y=0}^{N-1} f(x,y)\cos\left[\frac{(2x+1)u\pi}{2M}\right]\cos\left[\frac{(2y+1)v\pi}{2N}\right] \qquad (1)$$

where $0 \le u \le M-1$ and $0 \le v \le N-1$,

$$\alpha(u) = \begin{cases} \sqrt{1/M}, & u = 0 \\ \sqrt{2/M}, & 1 \le u \le M-1 \end{cases} \qquad (2)$$

$$\alpha(v) = \begin{cases} \sqrt{1/N}, & v = 0 \\ \sqrt{2/N}, & 1 \le v \le N-1 \end{cases} \qquad (3)$$

The inverse transform is as follows:

$$f(x,y) = \frac{2}{\sqrt{MN}} \sum_{u=0}^{M-1}\sum_{v=0}^{N-1} \alpha(u)\,\alpha(v)\,C(u,v)\cos\left[\frac{(2x+1)u\pi}{2M}\right]\cos\left[\frac{(2y+1)v\pi}{2N}\right] \qquad (4)$$

where $x = 0, 1, 2, \ldots, M-1$ and $y = 0, 1, 2, \ldots, N-1$.

The number of DCT coefficients is equal to the number of pixels in the image. The key issue is how many DCT coefficients to keep so that the face is represented effectively.

A face image of 128×128 pixels is shown in Figure 3.b. The 8×8 subset of the coefficient distribution, which represents the low and medium frequencies of the face image, is shown in Figure 3.a. Most of the information in the original image is clearly concentrated in very few coefficients of the transform domain. The largest coefficient reaches 22000, while the smallest coefficients are less than 1. Within the first 64 coefficients, the magnitude drops by a factor of up to 1000. The majority of the coefficients are less than 1, and these coefficients carry little information about the image.

We therefore discard a large number of coefficients and retain the few that carry most of the information of the image. This paper selects the DCT coefficients in the top-left corner along the path shown in Figure 3.c.
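The coefficient selection described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: it uses the conventional DCT rather than the spiking neural network, it adopts the orthonormal DCT-II scaling (which may differ from the scaling constants in Eqs. (1)-(4)), and it assumes that the path in Figure 3.c is the usual zigzag scan from the top-left corner. The helper names `dct_matrix`, `zigzag_indices`, `extract_features` and `reconstruct` are hypothetical.

```python
import numpy as np


def dct_matrix(n):
    """Orthonormal DCT-II basis: row u holds alpha(u) * cos((2x+1) u pi / (2n))."""
    x = np.arange(n)
    u = x.reshape(-1, 1)
    basis = np.cos((2 * x + 1) * u * np.pi / (2 * n))
    basis[0, :] *= np.sqrt(1.0 / n)
    basis[1:, :] *= np.sqrt(2.0 / n)
    return basis


def dct2(image):
    """2-D DCT as a separable product: transform the rows, then the columns."""
    m, n = image.shape
    return dct_matrix(m) @ image @ dct_matrix(n).T


def idct2(coeffs):
    """Inverse 2-D DCT; the basis is orthonormal, so the inverse is the transpose."""
    m, n = coeffs.shape
    return dct_matrix(m).T @ coeffs @ dct_matrix(n)


def zigzag_indices(m, n):
    """(u, v) pairs ordered along anti-diagonals, alternating direction (assumed path)."""
    pairs = ((u, v) for u in range(m) for v in range(n))
    return sorted(pairs, key=lambda p: (p[0] + p[1],
                                        p[0] if (p[0] + p[1]) % 2 else p[1]))


def extract_features(image, k):
    """Return the first k DCT coefficients along the zigzag path as a feature vector."""
    coeffs = dct2(image)
    return np.array([coeffs[u, v] for u, v in zigzag_indices(*coeffs.shape)[:k]])


def reconstruct(image, k):
    """Rebuild the image from only the first k zigzag coefficients."""
    coeffs = dct2(image)
    kept = np.zeros_like(coeffs)
    for u, v in zigzag_indices(*coeffs.shape)[:k]:
        kept[u, v] = coeffs[u, v]
    return idct2(kept)


# Synthetic 16x16 test image (a smooth ramp plus a bright square); not face data.
image = np.add.outer(np.arange(16.0), np.arange(16.0))
image[4:8, 4:8] += 50.0

# Reconstruction error shrinks as more low-frequency coefficients are kept.
errors = [np.linalg.norm(image - reconstruct(image, k)) for k in (4, 16, 64, 256)]
print(errors)
```

Because the assumed basis is orthonormal, the reconstruction error equals the norm of the discarded coefficients, so keeping more coefficients can never increase it.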

When a face image is presented to the spiking neural network, a set of stable synapse strength distributions is obtained after a period of training. These strength distributions are then transformed into the key features of the face image according to the method described above.

Image Reconstruction of Spiking Neural Networks

Comparing the results of the spiking neural network with those of the traditional DCT shows that the spiking neural network inherits the advantages of the traditional DCT. The results are shown in Figure 4.

Figure 3. The distribution of the key features: (a) 8×8 DCT coefficient values; (b) 128×128 face image; (c) the path of coefficient extraction.

When the input in the receptive field is a square of smooth gray values, the spiking neural network produces a low-frequency spike sequence; when the input is a square of gradient gray values, such as an edge in an image, the spiking neuron responds with high-frequency spikes [9]. The spiking neural network therefore retains more image edge information than the traditional DCT method.

Figure 4. Different results of different methods (axes: x, y): (a) image reconstruction of the traditional DCT; (b) original image minus the traditional DCT reconstructed image; (c) image reconstruction of the spiking neural network; (d) original image minus the spiking neural network reconstructed image.

Figure 5. Influence of the number of eigenvalues on the reconstructed image: (a) original image; (b) to (i) reconstructed images of 2×2, 3×3, 4×4, 5×5, 6×6, 7×7, 8×8 and 9×9.

Face Recognition by NN

Assume the dimension of the image's feature vector is M. Denote the features of the query image by x, and the features of the i-th image in the face database by $x_i$. These features are classified by the nearest neighbor rule based on the Euclidean distance:

$$l = \lVert x - x_i \rVert = \left[ (x_0 - x_{i,0})^2 + (x_1 - x_{i,1})^2 + \cdots + (x_{M-1} - x_{i,M-1})^2 \right]^{1/2} \qquad (5)$$

$$x = [x_0, x_1, \ldots, x_{M-1}]^T \qquad (6)$$

$$x_i = [x_{i,0}, x_{i,1}, \ldots, x_{i,M-1}]^T \qquad (7)$$

Recognizing a face thus reduces to finding the database image $x_i$ with the minimum value of l. The recognition rates are then calculated statistically.
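The nearest neighbor rule picks the database feature vector with the smallest Euclidean distance to the query. A minimal sketch follows; the function name `nearest_neighbor` and the toy two-dimensional gallery are illustrative assumptions, not data from the paper.

```python
import numpy as np


def nearest_neighbor(query, gallery, labels):
    """Return the label of the gallery vector closest to the query, and that distance.

    gallery has shape (num_images, M); query has shape (M,).
    """
    distances = np.linalg.norm(gallery - query, axis=1)  # Euclidean distance per row
    best = int(np.argmin(distances))                      # index of the smallest distance
    return labels[best], float(distances[best])


# Toy gallery with three "subjects" and M = 2 features each (made-up values).
gallery = np.array([[0.0, 0.0], [3.0, 4.0], [10.0, 10.0]])
labels = ["a", "b", "c"]

label, distance = nearest_neighbor(np.array([2.0, 3.0]), gallery, labels)
print(label, distance)
```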

Simulation and Analysis of Result


To evaluate the proposed face feature extraction based on the spiking neural network, we systematically compare it with the traditional DCT algorithm on the ORL and CMU face databases. The ORL database contains 400 frontal images of 40 subjects, 10 images per subject, with different facial expressions, illumination conditions and hairstyles, with or without glasses. Each sample is a 92×112 gray image, with tolerance for some tilting and rotation of up to 20°. The CMU database contains 123 individuals, with several facial expressions for each subject. Some of the samples are shown in Figure 6. The nearest neighbor classifier is used for matching faces.

Figure 6. Some samples in the ORL and CMU face databases: (a) some samples in the ORL database; (b) some samples in the CMU database.

For both the ORL and CMU databases, half of the samples are randomly selected as the training set and the remaining samples are used as the testing set. The results of the three algorithms (spiking neural networks, DCT and PCA [12]) on the ORL and CMU databases are shown in Table 1, which indicates that the spiking neural network performs slightly better than DCT and clearly better than PCA.
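The evaluation protocol can be sketched as follows. The text states only that half of the samples are chosen at random, so splitting half of each subject's images is an assumption made here; the features are synthetic clusters rather than ORL or CMU data, and the helper names `split_half` and `recognition_rate` are hypothetical.

```python
import numpy as np


def split_half(labels, rng):
    """Randomly assign half of each subject's samples to training, the rest to testing."""
    labels = np.asarray(labels)
    train, test = [], []
    for subject in np.unique(labels):
        idx = rng.permutation(np.flatnonzero(labels == subject))
        half = len(idx) // 2
        train.extend(idx[:half])
        test.extend(idx[half:])
    return np.array(train), np.array(test)


def recognition_rate(features, labels, train_idx, test_idx):
    """Fraction of test samples whose nearest training sample has the same label."""
    labels = np.asarray(labels)
    correct = 0
    for i in test_idx:
        d = np.linalg.norm(features[train_idx] - features[i], axis=1)
        predicted = labels[train_idx][np.argmin(d)]
        correct += int(predicted == labels[i])
    return correct / len(test_idx)


# Synthetic stand-in shaped like ORL: 40 subjects, 10 samples each, 16 features.
rng = np.random.default_rng(0)
labels = np.repeat(np.arange(40), 10)
centers = rng.normal(scale=10.0, size=(40, 16))
features = centers[labels] + rng.normal(scale=0.1, size=(400, 16))

train_idx, test_idx = split_half(labels, rng)
rate = recognition_rate(features, labels, train_idx, test_idx)
print(len(train_idx), len(test_idx), rate)
```

The synthetic clusters are far apart, so this toy run says nothing about the rates in Table 1; it only shows how a split and a rate of this kind can be computed.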

Table 1. Face recognition rates (%) on the ORL and CMU databases.

Method    Rate on ORL    Rate on CMU    Average rate
SNN       89.7           88.9           89.3
DCT       89.6           88.8           89.2
PCA       86.4           84.6           85.5

To examine the relationship between the sampling rate and the recognition rate, tests were conducted for three different sampling rates: 0.8, 0.5 and 0.3. Table 2 shows the recognition rates. Within this range the recognition rate falls as the sampling rate decreases, from 89.3% on average at a sampling rate of 0.8 to 87.5% at 0.3.

Table 2. Recognition rates (%) at different sampling rates.

Sampling rate    Rate on ORL    Rate on CMU    Average rate
0.8              89.7           88.9           89.3
0.5              89.2           88.4           88.8
0.3              88.1           86.9           87.5

The recognition rates of the spiking neural network and the traditional DCT method for different numbers of features on the ORL face database are shown in Figure 7. The spiking neural network and the DCT have similar performance. The best recognition rate achieved by the spiking neural network was 89.7%, obtained with 68 feature dimensions.

Figure 7. Recognition rate (%) versus the number of eigenvalues of feature, for spiking neural networks and DCT.

In this paper we use spiking neural networks, which inherit the parallel mechanism of biological systems, to extract the facial features. The algorithm has been implemented on a GPU platform. Experiments show that face feature extraction runs about 30 times faster on the GPU, taking about 0.5 s.

Conclusion

In this paper, an algorithm based on complexity and the best threshold is used to locate the eyes. Spiking neural networks, which inherit the parallel mechanism of biological systems, are introduced to extract the features of the face images. The network can perform facial feature extraction in a way similar to the discrete cosine transform. The experimental results show that the eye location method works well for multiple positions and complex backgrounds, and that the feature extraction algorithm achieves a high recognition rate and reduces the run time. The speed of the algorithm can be greatly increased by porting it to a GPU platform. If a hardware implementation of the network can reach the same speed as a biological neural network, the key feature extraction can be completed in 400 ms, and the network can be applied to spiking neuron based artificial intelligence systems to support the processing of visual images.

Acknowledgements

The authors gratefully acknowledge the support of the National Natural Science Foundation of China (61179011), the Natural Science Foundation of Fujian Province (2010J01327), and the Fujian Provincial Key Laboratory of Photonics Technology, Fujian Normal University. The authors also thank the editor and the colleagues who provided technical support.

References

[1] S. Anith, D. Vaithiyanathan, R. Seshasayanan, “Face recognition system based on feature extraction,” International Conference on Information Communication & Embedded System, pp. 660-664, 2013.

[2] Q. X. Wu, T. M. McGinnity, L. Maguire, J. Harkin, R. Cai, “Remembering Key Feature of Visual Images based on Spike Timing Dependent Plasticity of Spiking Neurons,” Proceeding of IEEE, pp. 2168-2173, 2009, ISBN: 0018-9219.

[3] S. Yadav, N. Nain, “Fast Face Detection Based on Skin Segmentation and Facial Features,” International Conference on Signal-Image Technology & Internet-Based Systems, pp. 663-668, 2015.

[4] G. F. Xu, S. Q. Ding, “Eye location using hierarchical classifier,” Journal of Chinese Computer Systems, vol. 29, n. 6, pp. 1159-1162, 2008.

[5] L. M. Zhang, P. Lenders, “Knowledge-based eye detection for human face recognition,” Proc. of Knowledge-Based Intelligent Engineering Systems and Allied Technologies, pp. 117-120, 2000.

[6] A. Hodgkin, A. Huxley, “A quantitative description of membrane current and its application to conduction and excitation in nerve,” J. Physiol., vol. 117, n. 4, pp. 500-544, 1952.

[7] M. Huang, “Iris location based on edge extraction of spiking neural networks,” Electronic Measurement Technology, China, vol. 25, n. 8, pp. 49-52, 2012.

[8] E. R. Kandel, J. H. Schwartz, Principles of Neural Science. Edward Arnold (Publishers) Ltd., 1981.


[10] P. Dayan, L. Abbott. "Theoretical neuroscience: computational and mathematical modeling of neural systems." Philosophical Psychology vol. 2. 154-155, 2002.

[11] Q. Zhang, Y. Z. Cai, “Novel face recognition method by combining spatial domain and selected complex wavelet features,” Journal of Donghua University. China, vol. 28, n. 3, pp. 285-290, 2011.