Chapter 3 System design and initial experiments
3.3 Non-intrusive remote camera-based system
To make the video-based system non-intrusive to the user, the face and eye feature-detection algorithms in this project were developed around video data acquired from a remote camera placed at a fixed distance from the user. A remote-camera-based system allows free head rotation and translation, at least to a certain extent, without the face being lost from the camera’s field of view. Figure 3-2 shows an image of a subject acquired with a remote camera placed approximately 60 cm in front of the subject. In this image the subject’s face is centred in the camera’s field of view, with one face width of background visible on either side of the face to capture lateral head movement. Unlike a head-mounted system, the remote-camera-based system can be installed discreetly out of the user’s line of sight, without impairing their vision or distracting them from their task.
Figure 3-2. An example of an image acquired from the remote camera placed approximately 60 cm from the subject, with one face width of space on either side of the face to allow head movement.
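As a rough illustration of the framing described above, the horizontal pixel budget can be estimated by assuming that the face and the two one-face-width margins together span the full image width. The function name and the default width of 640 pixels (the VGA width of the camera used in this project) are illustrative choices for this sketch and are not part of the original system.

```python
def face_width_pixels(image_width_px: int = 640,
                      margin_face_widths: float = 1.0) -> float:
    """Approximate face width in pixels for a centred face.

    Assumption: the face plus an equal margin on each side (expressed in
    face widths) exactly fills the image width.
    """
    total_face_widths = 1.0 + 2.0 * margin_face_widths
    return image_width_px / total_face_widths


face_px = face_width_pixels()
print(f"Approximate face width: {face_px:.0f} px of 640 px")
```

Under these assumptions the face occupies roughly one third of the image width, which is the horizontal resolution available to the face and eye feature-detection algorithms.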
A Logitech QuickCam 4000 camera, designed for webcam applications, was used for digital video data acquisition in this project. The camera uses a glass lens and a Sharp LZ24BP CCD sensor, which acquires digital images with 24-bit RGB colour at 640 × 480 pixel (VGA) resolution. The camera streams the digital images at 30 fps through a USB 2.0 interface.
The LZ24BP CCD sensor is a ¼-type progressive-scan sensor with RGB primary colour mosaic filters and is sensitive to both the visible and near-infrared (NIR) spectrum. Since the camera is intended to operate in the visible spectrum, it has an optical highpass filter in front of its lens to block the IR spectrum. This IR-blocking filter was removed so that the camera could operate under NIR illumination, for the reasons explained in section 3.4.
The glass lens in the QuickCam 4000 camera allows better-quality images to be acquired than the plastic lenses common in webcams. A built-in automatic white-balancing feature of the QuickCam 4000 compensates for varying ambient illumination levels by controlling the electronic exposure rate, which reduces variation in image intensity and makes the camera more robust under varying illumination. The electronic exposure rate can also be adjusted manually if required.
For the video-based alertness monitoring system to be useful, the camera must sample images fast enough to capture eyelid closure during a spontaneous blink while a person is alert. During a typical spontaneous blink, the eyelids are more than 75% closed for approximately 66 ms, corresponding to a bandwidth of approximately 15 Hz (Evinger et al., 1991). To meet the Nyquist sampling requirement, the camera must therefore sample images at a rate of 30 Hz or greater to detect closed eyelids during a typical spontaneous blink.
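The arithmetic behind this requirement can be sketched as follows. The sketch assumes that the bandwidth is taken as the reciprocal of the closure duration, which is the relation implied by the figures above, and the function and variable names are hypothetical.

```python
def blink_sampling_requirements(closure_ms: float, fps: float) -> dict:
    """Relate eyelid-closure duration to the required camera frame rate.

    Assumption: bandwidth is approximated as 1 / closure duration.
    """
    bandwidth_hz = 1000.0 / closure_ms        # reciprocal of closure duration
    nyquist_rate_hz = 2.0 * bandwidth_hz      # minimum sampling rate
    frame_period_ms = 1000.0 / fps            # time between successive frames
    # Minimum number of frames guaranteed to fall inside the closure
    # interval, whatever the phase between the blink and the frame clock.
    min_frames_in_closure = int(closure_ms // frame_period_ms)
    return {
        "bandwidth_hz": bandwidth_hz,
        "nyquist_rate_hz": nyquist_rate_hz,
        "frame_period_ms": frame_period_ms,
        "min_frames_in_closure": min_frames_in_closure,
    }


result = blink_sampling_requirements(closure_ms=66.0, fps=30.0)
for name, value in result.items():
    print(f"{name}: {value:.2f}")
```

With a 66 ms closure, the unrounded values are about 15.2 Hz and 30.3 Hz, which round to the 15 Hz and 30 Hz quoted above. At 30 fps the frame period is about 33 ms, so at least one frame falls within the closure interval.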
The 30 fps sampling rate of the QuickCam 4000 through the USB 2.0 interface was considered adequate for operating the video-based alertness monitoring system in real time, because the eyelid movements observed during drowsy periods are much slower than spontaneous blinks while alert. The bandwidth for data transfer from the camera to a computer is another factor that must be considered for real-time operation. Although the 480 Mbit/s bandwidth of the USB 2.0 interface is sufficient for real-time video data transfer, the host-centric nature of the USB interface requires handshaking overhead between the host computer and the webcam, during which video frames can be lost. The peer-to-peer nature of the FireWire interface does not require handshaking and is better suited than the USB 2.0 interface to video data transfer in real-time applications.
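The bandwidth comparison can be checked with a short calculation. This is a minimal sketch that assumes the frames are transferred as uncompressed 24-bit RGB and ignores protocol overhead; a webcam may in practice use compression or a different pixel format, so the figure is an upper-bound estimate rather than a measured rate. The function name is hypothetical.

```python
USB2_BANDWIDTH_MBPS = 480.0  # nominal USB 2.0 signalling rate, as quoted above


def raw_video_bitrate_mbps(width: int, height: int,
                           bits_per_pixel: int, fps: float) -> float:
    """Raw (uncompressed) video bit rate in Mbit/s, excluding bus overhead."""
    return width * height * bits_per_pixel * fps / 1e6


rate = raw_video_bitrate_mbps(width=640, height=480, bits_per_pixel=24, fps=30)
fraction = rate / USB2_BANDWIDTH_MBPS
print(f"Raw video rate: {rate:.3f} Mbit/s ({fraction:.0%} of USB 2.0 bandwidth)")
```

Under these assumptions the raw stream requires about 221 Mbit/s, a little under half of the nominal 480 Mbit/s, which is consistent with the statement that the limiting factor is handshaking overhead rather than raw bandwidth.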
Although real-time operation is one of the requirements for the video-based system to be useful, the development of robust and accurate algorithms was given higher priority during this project. Hence, the algorithms were developed by post-processing images from recorded videos. For this reason, high-frame-rate video data acquisition through the FireWire interface was not critical during development. However, a camera with a higher frame rate and a FireWire interface should be considered for operating the system in real time in the future.
Webcams are designed for general-purpose video streaming applications, such as video conferencing over the internet, and are not ideal for machine vision applications. Cameras designed specifically for machine vision, with higher image quality and operating specifications than webcams, are also available on the market. However, the image quality of the high-end but low-cost QuickCam 4000 camera was considered sufficient for initial development of the computer vision facial feature-detection algorithms. In addition, the performance of the face and eye feature-detection algorithms developed on images acquired from the low-cost webcam is likely to improve when they are applied to higher-quality images acquired from a camera designed specifically for machine vision.