2017 International Conference on Computer Science and Application Engineering (CSAE 2017) ISBN: 978-1-60595-505-6
Detecting Driver Drowsiness Using Eye State
Xuefei Liu* and Guanghua Tan
College of Computer Science and Electronic Engineering, Hunan University, 410082 Changsha, China
ABSTRACT
Various studies have suggested that 20%-50% road accidents are fatigue-related. Nevertheless, on certain roads, the maximum value is out of scope. This paper aims at developing a vision-based drowsiness detecting system which is noninvasive and accurate to help alert drowsy driver in advance. Eye state is used to evaluate drowsiness. First, face region is detected through AdaBoost with Haar feature. Then instead of applying old fashioned eye location methods which many drowsiness detect systems still possess, our system obtain eye region through a customized state of art face alignment algorithm based on Regressing Local Feature. Finally, an SVM is trained to distinguish eye states. Results demonstrate that our system can achieve a high detection rate in various lighting circumstances.
INTRODUCTION
Fatigue is one of main causes for traffic accidents. Since fatigue does not happen in a sudden but as a process, developing systems to sense the onset of drowsiness and alert drowsy driver in advance becomes crucial. A system like this is called DDDS (Detect Driver Drowsiness System). Computer vision techniques are widely used in DDDS because of its non-obstructive and considerately effective features. And among the visual behaviors, eye state stands as the most important cue for drowsiness measurement. However nowadays a large amount of DDDS still rely on old-fashioned eye detection techniques to locate eyes which result in low efficiency and accuracy. Considering the fast development of face alignment, we make use of and customize state of the art algorithms to build a DDDS based on eye state analysis. It functions effectively well without cumbersome prerequisites. Our work mainly contributes on customizing and integrating new algorithms into useful DDDS which can highly boost its efficiency. The rest of the paper is arranged as follows. In section 2 factors evolved in drowsiness detection are introduced. Section 3 explains the architecture of system while Section 4 demonstrates its details. At last, we display abundant experiment results, and get our conclusion.
BACKGROUND
Physiological measurements like heart rate, brain activity, skin conductance, muscle activity can reflect one’s fatigue level, especially EEG (Electroencephalography), which is considered as the authentic standard for evaluating other detectors. But due to additional sensors and peripheral equipment are often required in those methods, they are not commonly used in practice.
When people are experiencing fatigue, various obvious visual behaviors can be observed from their heads, faces, and eyes, such as frequent nodding, smaller degree of eye opening (or even closed), eye blink, yawning and so on [1]. Using computer vision technique, which is nonintrusive and efficient in extracting visual characteristics, we can infer a driver’s vigilance only by analyzing the images taken in front of him. To be specific, head posture can be used to measure drowsiness, e.g. Murphy-Chutorian, E. et al. [2] estimate the driver's head’s orientation and apply the head pose information on their driver assistance system. Eye state can be used to measure driver drowsiness generally in ways like calculating values such as the percentage of eye closure (PERCLOS)[3] the frequency of eye closure (FEC), and eye closure duration (ECD)[4]. Methods based on yawning usually locate driver’s mouth first and then classify the mouth state as normal or yawning like Saradadevi, M. et al. did in [5]. And methods based on facial expressions mainly combine yawning, blinking, and eyebrow rising or other facial cues together, e.g. Jimenez-Pinto, J. et al. analyze the salient points on face to determine fatigue level [6]. Among all these features, eye status shows superior accuracy. Therefore, we choose eye closure duration (ECD) as the key measurement in our system. Two main technologies are essential in obtaining ECD—one is eye localization to find where the eye region is, the other is eye state classification. Eye localization is a big topic and has been studied extensively, whose details will be discussed later in the following part of this section.
Indirect vehicle behaviors like steering wheel, lateral position can also characterize driver’s vigilance, e.g. an abrupt correction after a short time’s vacancy can be the sign of vigilance loss. For example, Y Takei et al.[7] use chaos theory to analyze steering motion to deduce the driver’s fatigue level.
Moreover, since sleepy drivers cannot manipulate the steering wheel precisely, he or she is likely to drive the vehicle out of the center of the lane. Therefore by monitoring the lane and the deviation of driving path, a driver’s state of vigilance can also be measured.
ARCHITECTURE
FACE DETECTION FACE ALIGNMENT & EYE DETECTION EYE STATE CLASSIFICATION DROWSINESS DECISION
Figure 1. Architecture of system.
ALGORITHM IMPLEMENTATION Face and Eye Detection
At the first stage of our algorithm, we use AdaBoost algorithm with Haar-like features to detect the face region in the captured webcam image. A satisfying performance on face detection purpose is provided. However, when it comes to the second stage, eye detection, due to eye region is much smaller and does not have as many features as face like symmetric property or so on, Adaboost + Haar detection technique could not work on eyes as well as on face.
Therefore, we apply the shape regression approach using LBF, Local Binary Feature for face alignment [9] to locate eyes. Facial shape S is predicted in a cascaded manner by shape regression. An initial shape is viewed as S0. In each stage, a shape
increment ΔS is estimated to progressively refine S. Generally, ∆St, shape increment at stage t, can be regressed as:
t
t t t-1
(I,S )
S W
(1)
where I is the input image, Φt is a feature mapping function, and Wt is a linear
regression matrix, St−1stands for the shape from the previous stage. Assume there are
L landmarks, each φlt is learnt by independently regressing the lth landmark, constraining it to its own corresponding local region. Then we obtain Φt simply by
concatenating these L local mapping function φlt together, which can be written as Φt = {φ1t, φ2t, … , φLt}.
In specific, a standard regression random forest is used to learn each φlt, the target function is: 2 2 , i 1 1 ¦ ( , ˆ min ) t l t t t
l i l
t t
l Ii i
S S
(2)ˆt i S
stands for the ground truth shape increment. Operator lextracts two elements
(2l − 1, 2l) from the vector Sˆit , so ˆ t
l Si
is the ground truth 2D-offset of lth
landmark in ith training sample.
t l
is the regression output which will be discarded
later. i iterate over all training samples.
We get the global feature mapping function t by concatenating binary feature φlt which are obtained through the local random forest learning, then a global linear projection Wtis learnt by minimizing the following objective function:
1 2 2
2 2
1
( ,
n ˆ )
mi
Wt n
t t t t t
i i
i I
l I S
S W W
The first term explains the regression target, the second term is a L2 regularization on Wt, and its strength is controlled by λ.
This method takes advantage of the local binary features and integrates them on global linear regression which leads to a good performance. What’s more important, it’s quite efficient to meet our real-time acquirement.
In the original version of the method they used 68 face landmarks for training. While in our application, we focus on eye locations only, so we reduced the landmarks from 68 to the 20, 10 are around eyes. This procedure can greatly boost the computational speed but also lead to a little loss on shape constraint.
Inspired by [10], we apply “between eyes” template matching on our system. “Between eyes” is the region between two eyes and has the property of invariant to both open and closed eyes circumstances. Instead of using the circle-frequency filter to obtain that region, we simply store the template of “between eyes” when two eyes are stably confirmed. Though template matching is a primitive way to detect eye, it actually works well as compensation to the main algorithm.
Eye State Classification
Once the eye patches are obtained, feature extraction method PCA (principal components analysis) is used to reduce the dimensionality of the feature space. It tries to find a linear transformation orthonormal matrix to map the original n-dimensional feature space into m-dimensional feature subspaces, when n is much bigger than m. In our experiment, a 24*24 eye patch, which can be reshaped to a 576*1 vector is then reduced to 15*1. 15 is chosen here cause the sum of the 15 largest eigenvalue occupies more than 90% of the total sum of all eigenvalues, meaning the 15 eigenvectors with the largest eigenvalues are representative enough. Next, the reduced vector is used as the input of the SVM with RBF kernel to determine the eye state, whether it is open or closed. The SVM in our system is trained with 1000+ open and closed eye images, while cross-validation is conducted to find out the optimal parameters.
Drowsiness Decision
00111111101111010111110000000011010011001
Figure 2. Drowsiness decision.
EXPERIMENT RESULTS
Our method distinguishes itself from traditional drowsiness detection methods. It’s built with the state of art face alignment algorithm using LBF instead of old fashioned eye detection counterparts which often suffer from high error rates and low efficiency. Experiment is conducted on a single laptop with i5 CPU of 2.50GHz, 4GB RAM.
[image:5.612.93.504.467.597.2]Our system is evaluated through two steps, one is the accuracy of obtaining eye region, the other is the accuracy of the whole system. Four subjects took part in our experiment. We place a USB camera in front of them. In the first experiment, they were required to take hundreds of pictures with open eyes, partially open eyes, closed eyes respectively. Face alignment with LBF is applied to obtain eye region on these pictures. Accuracy is shown in TABLE I. In the second experiment, as shown in TABLE II, accuracy stands for recognizing the state as what it truly is. Namely, Accuracy = 1 - the chance of misrecognizing a normal state as drowsy state - the chance of misrecognizing a drowsy state as normal state. In real on-board situations, many drivers wear glasses while driving, our method is based on the robust face alignment algorithm with LBF, which is able to locate eyes under tricky circumstances like face with glasses, partially occluded , or even deformed by expression or orientation. Once eye region is obtained, it is used as the input of the SVM as normal since we have included eyes with glasses in the training set. The problem lies in under circumstances where there are light spots on the glasses due to reflection, it can hardly tell whether the eye is open or closed. This situation doesn’t happen all the time but needed to be considered for further improvement. In both experiments, subjects could move face like in real situation, so every possible orientation can be captured. Therefore different facial rotations as well as varying facial expression are taken into consideration. Results show our system surpasses two representative state of art systems based on traditional eye detection methods. The algorithm inherits the fast speed advantage of LBF, as shown in TABLEIII, we hold a fast processing time of 14 fps which is very competitive.
TABLE I. RESULTS OBTAINED AFTER THE FACE ALIGNMENT TO OBTAIN EYE REGION.
Open eyes Eye partially open Closed eyes
TP FP FN TP FP FN TP FP FN
Person1(%) 96.8 1.3 1.9 84 10 6 98.6 0.7 0.7
Person2(%) 96.7 2.2 1.1 84.6 6.1 9.3 98.3 0 1.7
Person3(%) 96.7 2.4 0.9 86.3 6.9 6.9 97.5 0 2.5
Person4(%) 96.6 1.7 1.7 76.8 15.2 9.0 98.5 0 1.5
TABLE II. COMPARISON WITH DIFFERENT METHODS.
Methods Accuracy
Our method 98.3
Bogusław al.’s method[11] 97.0 Jo et al.’s method, 2014[12] 95.0
TABLE III. AVERAGE PROCESSING TIME OF PROPOSED SYSTEM.
Process Average Processing time (ms/frame)
Face detection 21.2
Face Alignment via LBF 48.5
Eye state classification Feature extraction(PCA) 0.08
SVM 0.03
Drowsiness decision 0.01
Total Time 69.82(≈14fps)
CONCLUSIONS
This paper has developed an efficient drowsiness detection system based on computer vision. The system takes advantage of AdaBoost with Haar features, modified face alignment algorithm through local binary features, template matching of “between
21.2
48.5
0.08 0.03 0.01
69.82
0 10 20 30 40 50 60 70 80
[image:6.612.89.502.79.624.2]eye” region and SVM classifier to detect a driver’s eyes and infer the fatigue state. Our algorithm only needs a simple webcam and can achieve quite satisfying detection result in various illuminate conditions. In the future, it is worth exploring to make use of the local binary features of the eye instead of using pixel values directly on PCA to deduce eye state. Performance of SVM should also be improved to be able to calculate amplitude for PERCLOS instead of simply using open or closed states only since in our current version a binary classification still works better than trying to relate the result with the opening degree of eye.
REFERENCES
1. L. M. Bergasa, J. Nuevo, M. A. Sotelo, R. Barea, and M. E. Lopez. 2006. “Real-Time System for Monitoring Driver Vigilance_IEEE TRANSACTION,” IEEE Trans. Intell. Transp. Syst., 7(1), pp. 63–77.
2. E. Murphy-Chutorian, A. Doshi, and M. M. Trivedi. 2007. “Head Pose Estimation for Driver Assistance Systems: A Robust Algorithm and Experimental Evaluation,” 2007 IEEE Intell. Transp. Syst. Conf., pp. 709–714.
3. W. W. Wierwille, L. A. Ellsworth, S. S. Wreggit, R. J. Fairbanks, and C. L. Kirn. 1994. “Research on vehicle-based driver status/performance monitoring; development, validation, and refinement of algorithms for detection of driver drowsiness,” Rep. no. DOT HS 808 247, NTIS, Springfield, VA 2216, pp. 219.
4. T. D’Orazio, M. Leo, C. Guaragnella, and A. Distante. 2007. “A visual approach for driver inattention detection,” Pattern Recognit., 40(8), pp. 2341-2355.
5. M. Saradadevi. 2008. “Driver Fatigue Detection Using Mouth and Yawning Analysis,” Int. J. Comput. Sci. Netw. Security, 8(6), pp. 183–188,.
6. J. Jiménez-Pinto and M. Torres-Torriti. 2009. “Driver alert state and fatigue detection by salient points analysis,” Conf. Proc. - IEEE Int. Conf. Syst. Man Cybern., October, pp. 455-461.
7. Y. Takei and Y. Furukawa. 2005. “Estimate of driver’s fatigue through steering motion,” 2005 IEEE Int. Conf. Syst. Man Cybern., 2(1), pp. 1–6.
8. J. Jo, S. J. Lee, H. G. Jung, K. R. Park, and J. Kim. 2011. “Vision-based method for detecting driver drowsiness and distraction in driver monitoring system,” Opt. Eng., 50(January), pp. 127202. 9. S. Ren, X. Cao, Y. Wei, and J. Sun. 2014. “Face Alignment at 3000 FPS via Regressing Local Binary
Features,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 1(1), pp. 1-8.
10. S. Kawato and J. Ohya. 2000. “Real-time detection of nodding and head-shaking by directly detecting and tracking the ‘between-eyes,’” Proc. - 4th IEEE Int. Conf. Autom. Face Gesture Recognition, FG 2000, pp. 40-45.
11. E. B. V All, I. Road, E. Union, E. Union, and D. A. Systems. 2014. “Neurocomputing Hybrid computer vision system for drivers eye recognition and fatigue monitoring Bogus ław Cyganek n , S ł awomir Gruszczyński,” Recent trends Intell. Data Anal. — Sel. Pap. 6th Int. Conf. Hybrid Artif. Intell. Syst., 126, pp. 78–94.