DESIGN AND DEVELOPMENT OF DSP PROCESSOR
BASED GESTURE RECOGNITION SYSTEM FOR REAL
TIME APPLICATIONS
S. Nikhil1, Saima Mohan2, B. Ramya3
1- (Engg.) Student, 2- Senior Lecturer (BME), 3- Senior Lecturer (ESD), Embedded System Design Centre
M. S. Ramaiah School of Advanced Studies, Bangalore
Abstract
Human-Computer Interaction (HCI) makes computers more usable and receptive to the user's needs by improving the interaction between humans and computers. Current HCIs use keyboards, mice, joysticks and touch-screens as user interfaces. However, such mechanical devices are inconvenient for natural and direct interaction, whereas human gestures allow users to communicate with machines in a natural way. Among the gestures available to a Gesture Recognition System (GRS), hand gestures are the most powerful means of communication, and they find application in educational institutions and entertainment.
In the current paper, a GRS has been designed and developed for a file browser application. The GRS mainly consists of a transmitter module and a receiver module. The transmitter module consists of a camera, a DSP processor and a wireless RF transmitter. The receiver module consists of a wireless RF receiver, an embedded processor and a Linux machine. The image acquired through the camera is processed by the DSP processor to recognize the captured gesture. The processing at the transmitter includes skin color segmentation, palm extraction and gesture recognition. Skin color segmentation extracts skin colored objects using image subtraction in conjunction with varied threshold combinations of color spaces. The segmented image is then processed further with a novel method of extracting the palm region and adaptively building a bounding box around it by using the horizontal and vertical profilers. A distance profile is obtained by drawing radial lines from the centroid of the extracted palm to the edges of the contour. Gestures are analyzed and recognized based on the angle of the peak point obtained from the distance profile. The recognized gesture is transmitted to the embedded processor of the receiver module through wireless RF transceivers. The embedded processor, assisted by a UART device driver on the Linux machine, controls the file browser application at the receiver end.
The functionality of the integrated GRS is tested for the wireless file browser application over a distance of 50 m. The development of an effective static GRS devoid of training data, multiple cameras, markers, bulky computing devices and gloves constitutes the original contribution of the current research. Overall, the developed GRS exhibits the desirable features of robustness, portability and cost effectiveness. The experimental results for the four chosen types of hand gesture show a recognition rate of more than 90% for each gesture.
Key Words: Human-Computer Interactions, GRS, Wireless File Browser, Skin Color Segmentation
Abbreviations
Ain Analog Input Pin
DAB DMA Access Bus
DEB DMA External Bus
DMA Direct Memory Access
DSP Digital Signal Processor
EBIU External Bus Interface Unit
GRS Gesture Recognition System
HCI Human Computer Interaction
PCA Principal Component Analysis
PPI Parallel Peripheral Interface
RF Radio Frequency
UART Universal Asynchronous Receiver/Transmitter
1. INTRODUCTION
Gestures involve physical movements of fingers, hands, arms, head, face, or body with the intent of: 1) conveying meaningful information or 2) interacting with the environment. Gestures can be static and/or dynamic. Gestures are often language and culture specific. They can be broadly classified into hand and arm gestures, head and face gestures and body gestures. Hand gestures were one of the first means of communication, established long before speech and language developed. Among the available gestures, hand gestures are the most powerful means of communication.
Current HCIs use keyboards, mice, joysticks and touch-screens as user interfaces. However, such mechanical devices are inconvenient and unsuitable for natural and direct interaction [1, 2]. In the present-day framework of interactive, intelligent computing, an efficient HCI is assuming utmost importance in our daily lives. GRS finds application in educational institutions and entertainment, such as product reviews, digital libraries, slide changers, advertising hoardings, animations and gaming. The main approaches for analyzing and classifying hand gestures for HCI applications include glove-based techniques and vision-based techniques [1, 2]. Glove-based techniques are efficient in terms of gesture recognition but require cumbersome devices like gloves, cameras and sensors, hindering the naturalness of the user's interaction at high cost.
Vision-based techniques improve the interaction between humans and computers at low cost, requiring only a camera. Their main drawback is that segmentation is critical and computationally heavy.
Image segmentation and gesture recognition are the most important phases of GRS. Segmentation serves as the key step, as it plays a vital role in the success of image analysis. The segmentation techniques widely used in the literature for hand recognition include skin color segmentation [3], Gaussian modelling [4] and image subtraction [5]. The natural representation of the real-world environment is better analysed through color image processing. Tan Tian Swee, et al. [3] propose skin color segmentation as a technique to block the non-skin objects from the captured image. Reference [4] claims Gaussian modelling to be an efficient and robust segmentation technique that copes with changes in illumination and with complex backgrounds. Such a model-based system requires training data. The evident drawback of skin color segmentation and Gaussian modelling is the inability to filter skin colored background objects from the image. The image subtraction technique proposed by [5] eliminates shadows and the irrelevant skin colored objects of the background. A combination of image subtraction with either skin color segmentation or Gaussian modelling has been proposed by various authors to achieve better segmentation results.
A thorough processing of the segmented image results in the identification of gestures. The most extensively used techniques to identify the gestures include feature extraction [6], PCA [7] and contour analysis [8]. Feature extraction and PCA techniques extract features and eigen vectors respectively from the training data and compare the corresponding features/vectors of input image to identify the gestures. Though the precision of recognition of gesture is significantly high, the main drawback of these methods is the requirement of the system to be trained. The contour based analysis on the other hand does not require the system to be trained and is computationally fast.
In this paper, a DSP processor based GRS has been designed and developed for a file browser application. Vision based techniques have been used for gesture recognition in this research. The GRS consists of a transmitter module, a wireless module and a receiver module. The transmitter module consists of a camera and a Blackfin BF533 board. The image captured by the camera is segmented by the DSP processor to extract skin colored objects from the input image. Segmentation involves a combination of skin color segmentation and image subtraction.
The segmented image is processed to build a bounding box around the palm. Palm is extracted using horizontal and vertical profiles. The extracted palm is further processed to recognize gesture by drawing radial lines from the centroid of the extracted palm to the edges of the palm contour. Fingers form a characteristic peak in the distance profile when stretched. Gestures are recognized based on the distance and orientation of edges of the contour to the centroid of extracted palm.
The recognized gesture is wirelessly transmitted to the receiver through RF transceivers. The receiver consists of an embedded processor that receives the recognized gesture command wirelessly and converts the analog signal to its digital equivalent. The valid digital values are transmitted from the embedded processor to the Linux machine. At the Linux machine, a UART device driver has been designed to enable serial communication and to control the file browser application. Functionality of the file browser application depends on the gesture recognized by the DSP processor.
The existing GRS has limitations pertaining to the use of cumbersome equipment like markers and gloves, owing to weak segmentation techniques which are effective only for uniform backgrounds. Most of the gesture analysis techniques in GRS require training data, leading to extensive use of memory and computation time. The conventional gesture recognition algorithms also impose restrictions like
Signer will have to wear full sleeved shirts
Only relevant regions of the gesture will be part of the frame
Distance between signer and camera is fixed
The above assumptions limit the quality of natural interaction between humans and computers. The existing GRS use desktops/laptops as computing devices, restraining the portability of the system and making the product bulky.
Fig. 1 Gestures chosen for the file browser application
An efficient palm extraction technique makes the system adaptive to distance variations between the camera and the signer. This research also undertakes the implementation of segmentation, palm extraction and gesture recognition algorithms on a DSP processor rather than the currently prevalent practice of using a desktop/laptop. The DSP processor based reconfigurable GRS has a balanced mix of advanced technologies and a potential for enhanced end-user product appreciation. The developed GRS exhibits the desirable features of robustness, portability and cost effectiveness. The development of an effective segmentation and gesture recognition technique eliminating the usage of markers, gloves, training data, bulky computing devices and multiple cameras constitutes the principal objective of this research.
Varied test cases were designed and computed to record the behaviour of the GRS. The GRS was found to be working on expected lines.
2. SYSTEM DEVELOPMENT
2.1 Gestures Chosen
The main aim of the project was to make a prototype for generalized applications like advertising hoardings, slide changers and digital library. There are no standardized methods for selecting gestures except for applications involving sign language. Figure 1 indicates the gestures identified and their respective signatures for the file browser application.
2.2 System Design
The GRS consists of a camera, DSP processor, wireless transceivers, an embedded processor, a Linux machine and the display unit as indicated in Figure 2. The input from camera is processed at the transmitter to recognize the gesture. The recognized gesture command is wirelessly transmitted to the embedded processor from the DSP processor through the RF transceivers. The embedded processor assisted by UART device driver on Linux machine controls the file browser application at the display unit.
Fig. 2 Block diagram of GRS
2.3 Transmitter Module
The input from camera is processed at the transmitter to recognize the gesture. The processing at the transmitter includes skin color segmentation, palm extraction and gesture recognition as illustrated in Figure 3. The input image is segmented to extract skin colored objects. The segmented image is then processed with a novel method of extracting palm region and bounding box is built around it. Distance profile is obtained by drawing radial lines from the centroid of the extracted palm to the edges of the palm contour. Gestures are analyzed and recognized based on the angle of the peak point in the distance profile. The recognized gesture command is transmitted from the DSP processor wirelessly to the receiver. File browser application at the receiver is controlled by the input gestures.
2.4 Skin Color Segmentation
Skin color segmentation and image subtraction are the segmentation techniques adopted in this paper. Skin color segmentation is used to block the non-skin colored objects from the image at low computation time in the absence of training data. Skin color segmentation involves establishing thresholds in a particular color space to recognize skin colored objects within the range and setting all the pixel values out of the range to zero, thereby eliminating all the non-skin colored objects from the frame. The YCbCr thresholds from literature resulted in efficient outcomes for uniform backgrounds, but failed for complex backgrounds. Hence new RGB, YCbCr, HSV and YIQ thresholds were derived using an iterative method. RGB and YCbCr thresholds worked better than thresholds from all the other color spaces for complex backgrounds. The main drawback of skin color segmentation is the inability to remove skin colored background objects. This drawback is tackled using image subtraction, where the image representing the features of the gesture is subtracted from the image constituting the uniform/complex background. RGB image subtraction results in efficient segmentation for most of the images, but shadow removal is not effective. YCbCr image subtraction, in contrast, removes shadows because it decouples the luminance and chrominance components. Thus, image subtraction is implemented on both the RGB and YCbCr color spaces. A combination of image subtraction and RGB, YCbCr thresholds derived from an iterative process resulted in an efficient segmentation when integrated with the YCbCr thresholds from literature.
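The combination of chrominance thresholding and background subtraction described above can be sketched as follows. This is only an illustrative sketch, not the authors' implementation: the Cb/Cr ranges and the difference threshold are assumed values chosen for the example (the derived thresholds are not listed in this paper), and the function names are hypothetical.

```python
# Hypothetical sketch: YCbCr skin thresholding combined with background subtraction.
# All numeric thresholds below are illustrative assumptions, not the paper's values.

CB_RANGE = (77, 127)    # assumed Cb range for skin
CR_RANGE = (133, 173)   # assumed Cr range for skin
DIFF_THRESHOLD = 30     # assumed minimum chrominance change versus the background


def rgb_to_ycbcr(r, g, b):
    """Convert one 8-bit RGB pixel to YCbCr (ITU-R BT.601, full range)."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 128 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128 + 0.5 * r - 0.418688 * g - 0.081312 * b
    return y, cb, cr


def segment_skin(frame, background):
    """Return a binary mask (1 = skin-colored foreground pixel).

    frame and background are equally sized 2-D lists of (r, g, b) tuples.
    """
    mask = []
    for frame_row, bg_row in zip(frame, background):
        mask_row = []
        for pixel, bg_pixel in zip(frame_row, bg_row):
            _, cb, cr = rgb_to_ycbcr(*pixel)
            _, bg_cb, bg_cr = rgb_to_ycbcr(*bg_pixel)
            is_skin = (CB_RANGE[0] <= cb <= CB_RANGE[1]
                       and CR_RANGE[0] <= cr <= CR_RANGE[1])
            # Compare chrominance only, so a shadow (mostly a luminance change)
            # does not count as a change against the background.
            changed = abs(cb - bg_cb) + abs(cr - bg_cr) > DIFF_THRESHOLD
            mask_row.append(1 if is_skin and changed else 0)
        mask.append(mask_row)
    return mask
```

In this sketch a skin colored object that is already part of the background produces no chrominance change and is therefore rejected, which is the role image subtraction plays in the text.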
Fig. 3 Block diagram of transmitter module
2.5 Palm Extraction
The segmented image is processed to extract palm and build a bounding box around it. Vertical and horizontal profilers for the segmented image were computed and the corner points (points where there is a shift of magnitude from zero to a certain positive value or vice versa) of the profilers were found. Using these points, the bounding box is drawn around the segmented image.
The main advantages of the palm extraction technique include
The size of the bounding box is adaptive. (The size of the bounding box need not be set by the user prior to the start of the system.)
The image is scanned only in a particular row/column direction. (The computation time for scanning the whole image is avoided.)
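A minimal sketch of the profile-based bounding box is given below, assuming the segmented image is a binary mask stored as a 2-D list. The function name is hypothetical, and the sketch simply takes the first and last non-zero entries of each profile as the corner points, which is a simplification of the transition search described above.

```python
def bounding_box(mask):
    """Return (top, bottom, left, right) of the non-zero region, or None if empty."""
    # Horizontal profile: number of foreground pixels in each row.
    horizontal_profile = [sum(row) for row in mask]
    # Vertical profile: number of foreground pixels in each column.
    vertical_profile = [sum(col) for col in zip(*mask)]

    rows = [i for i, value in enumerate(horizontal_profile) if value > 0]
    cols = [j for j, value in enumerate(vertical_profile) if value > 0]
    if not rows or not cols:
        return None
    # The first and last non-zero entries mark where each profile leaves zero.
    return rows[0], rows[-1], cols[0], cols[-1]
```

Because only the two 1-D profiles are searched for corner points, the box adapts to the size of the segmented region without a preset size.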
2.6 Radial Lines and Gesture Recognition
The palm extracted image is further processed to determine the centroid. Here the centroid is determined based on the density of 1’s within the palm contour.
Radial lines are drawn from the determined centroid to the edges of the extracted contour of the palm. The main reason for drawing the radial lines is to establish the relationship between the distance and the corresponding orientation of the radial line from the centroid to the contour of the extracted palm. While a radial line is being drawn, it is terminated as soon as a non-skin colored pixel is encountered. If skin colored pixels are located outside the bounding box, then the distance for that particular radial line is set to zero, resulting in the removal of irrelevant parts of the arm.
After the distance profile is determined, it is further scanned to remove the remaining irrelevant part of the gesture by assuming that a gesture cannot be made within +45° and -45° of the arm. If there are more than ten successive zeros in the distance profile, then the entries within +45° and -45° around that region of the distance profile are set to zero. Hence, this technique results in effective removal of irrelevant parts of the palm. Depending on the angle of the peak in the distance profile, the gesture is recognized and transmitted wirelessly from the transmitter to the receiver.
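The centroid, radial lines and peak angle can be illustrated with the sketch below. It assumes a binary palm mask stored as a 2-D list and an angular step of 5°, which is an assumption (the step is not stated in the paper); the function names are hypothetical, and the ±45° arm-removal pass is omitted for brevity.

```python
import math


def centroid(mask):
    """Centroid (row, col) of the 1-pixels: the mean of their coordinates."""
    points = [(r, c) for r, row in enumerate(mask) for c, v in enumerate(row) if v]
    count = len(points)
    return sum(p[0] for p in points) / count, sum(p[1] for p in points) / count


def distance_profile(mask, step_deg=5):
    """Map each angle to the length of the radial line drawn from the centroid.

    A line stops at the first 0-pixel or at the image border.
    """
    rows, cols = len(mask), len(mask[0])
    center_r, center_c = centroid(mask)
    profile = {}
    for angle in range(0, 360, step_deg):
        theta = math.radians(angle)
        dist = 0
        while True:
            # Image rows grow downward, so subtract the sine to keep
            # angles counter-clockwise (90 degrees points up).
            r = int(round(center_r - (dist + 1) * math.sin(theta)))
            c = int(round(center_c + (dist + 1) * math.cos(theta)))
            if not (0 <= r < rows and 0 <= c < cols) or not mask[r][c]:
                break
            dist += 1
        profile[angle] = dist
    return profile


def peak_angle(profile):
    """Angle in degrees of the longest radial line."""
    return max(profile, key=profile.get)
```

A stretched finger shows up as the longest radial line, so the angle returned by `peak_angle` is the quantity a classifier of this kind would compare against per-gesture angle ranges.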
2.7 Image Resizing
The sophisticated real time processing involved in the GRS and the creation of multiple image matrices result in exhaustive usage of memory, in turn demanding image resizing. Image resizing is based on the principles of Gaussian and Laplacian image pyramid level reduction, involving low-pass filtering of the input image. Filtering involves convolution of the input with symmetric weighting functions as represented in equation 1. Each output is computed as a weighted average of the input within a 5x5 window. The weights are symmetric as signified in equation 2
g_l(i, j) = \sum_{m=-2}^{2} \sum_{n=-2}^{2} w(m, n) \, g_{l-1}(2i + m, 2j + n) ---- (1)
\sum_{m=-2}^{2} \sum_{n=-2}^{2} w(m, n) = 1 --- (2)
where g_l(i, j) is the resized image, g_{l-1}(i, j) is the input image and w(m, n) are the weights of the low pass filter
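The reduction step can be sketched as below. The 5-tap weights [1, 4, 6, 4, 1]/16, applied separably, are a common choice for this kind of pyramid and are an assumption here, since the paper does not list the weights; clamping coordinates at the image border is likewise an assumption, and the function name is hypothetical.

```python
# Assumed separable 5-tap weights; they are symmetric and sum to 1.
KERNEL_1D = [1 / 16, 4 / 16, 6 / 16, 4 / 16, 1 / 16]


def pyramid_reduce(image):
    """Halve a 2-D grayscale image with a 5x5 weighted average.

    Each output pixel (i, j) is the weighted average of the input pixels
    (2i + m, 2j + n) for m, n in -2..2.
    """
    rows, cols = len(image), len(image[0])
    reduced = []
    for i in range(rows // 2):
        reduced_row = []
        for j in range(cols // 2):
            acc = 0.0
            for m in range(-2, 3):
                for n in range(-2, 3):
                    # Clamp coordinates at the border (assumed border handling).
                    r = min(max(2 * i + m, 0), rows - 1)
                    c = min(max(2 * j + n, 0), cols - 1)
                    acc += KERNEL_1D[m + 2] * KERNEL_1D[n + 2] * image[r][c]
            reduced_row.append(acc)
        reduced.append(reduced_row)
    return reduced
```

Halving each dimension reduces the pixel count to a quarter, which is the source of the memory and computation savings sought here.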
2.8 C Implementation of GRS
The gesture recognition system which was designed and simulated in MATLAB is converted to C code. The converted code is simulated in VC++ compiler and ported onto the Blackfin BF533 board. The gesture recognition code is implemented on the Blackfin BF533 by reading the image file from the hard disk. For real time application, the images were captured from camera interfaced with Blackfin BF533 board and processed to identify the gestures.
2.9 Image Processing Features of Blackfin BF533
Fig. 4 shows the block diagram of the Blackfin BF533. The external bus interface unit (EBIU) is mainly used to access the off-chip memory system. The DMA is connected to peripherals like the PPI and SPI through the DMA access bus (DAB) and to the EBIU through the DMA external bus (DEB). The DAB and DEB buses provide a means for DMA-capable peripherals to gain access to on-chip and off-chip memory with little or no degradation in core bandwidth to memory. The processor has multiple, independent DMA controllers that support automated data transfers with minimal overhead for the core. DMA transfers can occur between the internal memories and any of its DMA-capable peripherals. The DMA for the PPI has the highest precedence among all the DMA functions.
The board supports video input and output applications. The ADV7171 video encoder provides up to three output channels of analog video, while the ADV7183 video decoder provides up to three input channels of analog video. Both the encoder and the decoder connect to the PPI of the Blackfin BF533 processor. The input from the camera is stored in the external memory (SDRAM) of the Blackfin BF533 through the ADV7183 decoder. The output from the Blackfin BF533 is displayed on the monitor through the ADV7171 encoder.
Fig. 4 Block diagram of blackfin BF533
Image acquisition is done using a PAL camera. The captured YCbCr images of size 720x576 are stored in the memory of Blackfin BF533. The captured image can be displayed on the screen by using a utility called “image viewer” in VDSP++, but it takes ten minutes to display an image of size 720x576. Hence it would be difficult to set the position of camera for real time capture.
C code to display the captured image on the monitor was developed so that the camera position can be set in real time and to make the system more interactive and entertaining.
The image acquisition, image display and the gesture recognition codes were integrated. Owing to memory constraints of Blackfin BF533, only ten images of size 720x576 and type ‘unsigned char’ could be accommodated in the processor memory. Hence the gesture recognition code was optimized. As a part of optimization, image resizing phase and a part of skin color segmentation phase were eliminated. The integration of the final code could only be done till the skin color segmentation. Bounding box could not be built around the palm for segmented image owing to problems with acquisition of image from camera due to poor lighting conditions, complex background and unstable camera position.
2.10 Wireless Transceivers
The recognized gesture commands are transmitted from the Blackfin BF533 to the ARM processor through wireless transceivers, as illustrated in Fig. 5. RF transceivers operating at a frequency of 434 MHz were chosen for this purpose.
Fig. 5 Block diagram of wireless module
The Blackfin BF533 transmits parallel data to the HT12E encoder. The HT12E encoder converts the parallel data to serial data and passes it to the RF transmitter. The RF transmitter communicates wirelessly with the HT12D decoder through the RF receiver. The HT12D decoder converts the serial data back to parallel data and passes it to the ADC pins of the ARM LPC2129 processor. The two ends of the wireless module thus communicate with each other through serial data transfer.
The length of the antenna at the transmitter end and the receiver end is calculated using the following equations
Length of the antenna is ¼th of the wavelength of the signal to be transmitted
Wavelength = velocity / frequency = (3 × 10^8 m/s) / (433 × 10^6 Hz) ≈ 0.69 m
Length of antenna = wavelength / 4 ≈ 17.3 cm
The distance of communication specified is 100 m.
2.11 Receiver Module
The gesture commands transmitted from the Blackfin BF533 are received by the ARM LPC2129 at the receiver. The analog signals from the wireless module are converted to digital values at the ARM LPC2129 and transmitted to the Linux machine through UART communication as shown in Figure 6. The Linux machine is responsible for controlling the file browser application at the display unit depending on the recognized gesture.
2.12 ARM Interface
The data from the RF module is received by the embedded processor at the analog pins. The analog signal is converted to digital data by the ADC unit of ARM LPC2129. The valid digital data is transmitted serially to the Linux system through UART communication.
Fig. 6 Block diagram of receiver module
The function of the ADC unit is to convert the analog signal to digital form and then send the valid digital value to the Linux machine. The LPC2129 has a built-in analog to digital converter. For the analog to digital converter to work properly, two registers, namely the A/D Control Register (ADCR) and the A/D Data Register (ADDR), have to be configured appropriately. The ADCR register has to be configured to select AIN0, AIN1, AIN2 and AIN3 as the input pins. For the proper functioning of the ADC, the clock frequency has to be less than 4.5 MHz. Analog to digital conversion has to be done continuously. ADDR has to be monitored to check if the conversion is error free. For every successful conversion, the bits 15:6 in the ADDR register contain a binary fraction representing the voltage on the Ain pin. Zero in the field indicates that the voltage on the Ain pin was less than, equal to, or close to that on VSSA, while 0x3FF indicates that the voltage on Ain was close to, equal to, or greater than that on V3A. The output obtained from bits 15:6 is used to recognize the received gesture, which is then transmitted to the Linux machine through UART communication.
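The handling of the ADDR value can be illustrated with the sketch below, written in Python for readability rather than as firmware. It assumes the DONE flag sits in bit 31 of ADDR and a 3.3 V analog reference; the 1.5 V decision level is taken from the receiver test cases in Table 3. The function names are hypothetical.

```python
ADDR_DONE_BIT = 1 << 31      # assumed position of the DONE flag in ADDR
VREF_VOLTS = 3.3             # assumed analog reference voltage (V3A)
LOGIC_THRESHOLD_VOLTS = 1.5  # decision level used in the receiver test cases


def adc_result(addr_value):
    """Return the 10-bit result held in bits 15:6, or None if not finished."""
    if not addr_value & ADDR_DONE_BIT:
        return None
    return (addr_value >> 6) & 0x3FF


def to_volts(result):
    """Scale a 10-bit result (0..0x3FF) to an approximate voltage."""
    return result * VREF_VOLTS / 0x3FF


def to_logic_level(result):
    """Map a conversion result to the 0/1 digital output."""
    return 1 if to_volts(result) > LOGIC_THRESHOLD_VOLTS else 0
```

Applying `to_logic_level` to each of the four channels AIN0 to AIN3 would yield the four data bits delivered by the decoder, from which the gesture command can be identified.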
Fig. 7 Mapping of chosen gesture commands for file browser application
LPC2129 has two UARTs for serial communication namely UART0 and UART1. UART0 has been used in this project by configuring the PINSEL register. UART registers are configured to set the data length, stop bit, parity bits and baud rate. The UART Transmitter Holding Register is assigned the value to be transmitted to the LINUX machine. The top byte is the newest character in the Tx FIFO and can be written via the bus interface. The LSB represents the first bit to transmit.
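As a worked illustration of the baud rate configuration, the sketch below computes the divisor latch values from the relation baud = PCLK / (16 × (256 × DLM + DLL)). The peripheral clock of 15 MHz and the 9600 baud target are assumed example values, not settings reported in this paper, and the function names are hypothetical.

```python
def divisor_latches(pclk_hz, baud):
    """Return (DLM, DLL) such that baud is close to pclk / (16 * (256 * DLM + DLL))."""
    divisor = round(pclk_hz / (16 * baud))
    if not 1 <= divisor <= 0xFFFF:
        raise ValueError("baud rate not reachable with this peripheral clock")
    return divisor >> 8, divisor & 0xFF


def actual_baud(pclk_hz, dlm, dll):
    """Baud rate actually produced by the given divisor latch values."""
    return pclk_hz / (16 * (256 * dlm + dll))


# Example with assumed values: 15 MHz peripheral clock, 9600 baud target.
dlm, dll = divisor_latches(15_000_000, 9600)
```

Because the divisor must be an integer, the achieved rate differs slightly from the target; the receiver tolerates a small mismatch, which is one reason lower rates tend to be more forgiving.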
Fig. 8 Diverse input images used for image segmentation
2.13 Linux Interface
A UART device driver is designed for the Linux system to receive data from the ARM processor. For the device driver to be functional, the UART registers are initialized and the device id (major and minor numbers) is registered. When new data is received in the UART receive register, copy_to_user() is used to copy the data from the kernel buffer to the user buffer. The file browser application is controlled depending on the gesture command received at the UART device driver in Linux. ioctl routines are designed to control the communication between kernel space and user space. Figure 7 shows the mapping of the gesture command received to the action taken at the file browser application.
3. RESULTS AND VALIDATION
3.1 Skin Color Segmentation
Diverse images shown in Figure 8 are used for testing the functionality of the system. The images are selected based on varying skin tones among people from different races, lighting conditions, sizes of images and backgrounds. The segmentation techniques worked satisfactorily for all the test cases.
Fig. 9 Results of image segmentation
Figure 9 shows the results of skin color segmentation. RGB image subtraction resulted in an efficient segmentation but could not remove shadows. YCbCr image subtraction resulted in effective removal of shadows, because in YCbCr color space the luminance and chrominance components are decoupled. A combination of RGB and YCbCr image subtraction resulted in better segmentation, which was improved further with the usage of YCbCr thresholds from literature. The main problem with this approach is the prospect of securing noisy or distorted background images due to unstable system or moving people. Hence solitary use of YCbCr thresholds from literature might not be enough for complex images. This resulted in derivation of new RGB and YCbCr thresholds from an iterative process. An integration of image subtraction and a combination of various thresholds from different color spaces resulted in an efficient and robust segmentation technique. The results show that the image subtraction and skin color segmentation techniques complement each other.
3.2 Palm Extraction
Figure 10 shows the result of palm extraction which has been used in this project. The distinct advantage of this approach involves adaptive reconstruction of the bounding box around the principal region of palm at low computation time.
Fig. 10 Results of palm extraction
3.3 Gesture Recognition
To recognize the gesture, the centroid is found depending on the density of 1’s in the extracted palm. This technique yields better results when compared to fixing the centroid at the centre of the bounding box.
Fig. 11 Results of gesture recognition
The radial lines are drawn from the centroid to the edges of the extracted palm contour, and their lengths constitute the distance profile. The raw distance profile contains a few peaks close to each other, leading to incorrect gesture recognition. This issue is resolved by scanning the distance profile to eliminate the irrelevant parts of the gesture, as illustrated in Figure 11.
Depending on the angle of the peak detected, gesture is recognized and transmitted wirelessly to the receiver. The time taken for the execution of the gesture recognition code is 8.411 sec. To reduce the computation time, the image is resized using the low pass filtering techniques. The resized images consumed 1.314 sec for recognizing the gesture; thereby reducing the computation time.
Fig. 12 Test images for gesture recognition
Table 1. Test cases considered for palm extraction and gesture recognition
Sl. No | Test case | Success | Partial | Fail
1 | Effective segmentation | | |
2 | More than one finger stretched | | |
3 | Changing distances between signer and camera | | |
4 | Multiple skin colored objects detected | | |
5 | Varied sizes of image captured | | |
Effective palm extraction and gesture recognition techniques have resulted in the system being liberated from:
Assumptions like the signer wearing full sleeved shirts or the presence of relevant parts of the gesture only in the image frame
Gloves, markers, training data, bulky computing devices and multiple cameras
The various images used for palm extraction and gesture recognition are shown in Fig. 12. Table 1 shows various test cases used to test the functionality of palm extraction and gesture recognition. The result of segmentation greatly influences the extraction of palm and subsequently in the recognition of gestures.
3.4 Blackfin Implementation
The results of reading a file from hard disk into Blackfin BF533 memory and stages of processing for gesture recognition are as shown in the Fig. 13.
Fig. 13 Result of file read implementation on BF533
It can be seen that the original image is reduced to half the size using the image resizing techniques. The resized image is used for further processing. The image is segmented using image subtraction and a combination of thresholds from various color spaces. The segmented image is processed for extracting palm and to build a bounding box around it. Centroid was determined for the extracted palm and radial lines were drawn from the centroid to the edges of the contour, hence resulting in plotting of the distance profile. The implementation of the designed GRS on Blackfin BF533 has resulted in accurate recognition of gestures. The outcome of the implementation in Blackfin BF533 was validated with respect to the results obtained in MATLAB and a one to one correspondence was drawn between them.
Fig. 14 Image Capture in Blackfin BF533
3.5 Image Capture
Image was captured using a CMOS/CCD camera and the values were written to Blackfin memory. The result of image capture is shown in Fig. 14. The size of the captured image was 720x576x3 and type YCbCr. The YCbCr image was converted to RGB and viewed in the “image viewer” of VDSP++.
Fig. 15 Output of Skin Color Segmentation on Blackfin BF533
The displayed image was validated with the output obtained from the “image viewer” and the results were comparable to each other.
The image stored in Blackfin memory was displayed on the monitor using the PPI interface.
The integration of image capture, image display and the gesture recognition codes was successful for segmentation phase as indicated in Fig. 15. Due to the presence of several specks in the segmented image, palm extraction could not be performed.
Table 2. Test cases for wireless module
Sl no | Test case | Results
1 | For a distance of 50m | Successful communication
2 | Length of antenna between 10cm to 15cm | Successful communication
3 | Varied frequencies at transmitter and receiver | Failure
4 | Distance greater than 100m | Failure
3.6 Wireless Module
The functionality of the wireless module was tested for various test cases shown in Table 2. The working of the wireless module was validated with results from the wired connection. The set-up consisting of wired connection interface was established for this purpose. The results of both the modules matched for all the test cases considered.
Table 3. Test cases for receiver module
Sl no | Test case | Results
Test case for ADC in ARM LPC2129
1 | Analog Voltages <1.5V | 0-digital output
2 | Analog Voltages >1.5V | 1-digital output
3 | Analog Voltages >5V | Chip damaged
Test case for UART at ARM LPC2129 and Linux
1 | Transmission at high data speeds | Partially successful
2 | Transmission at low data rates | Successful
Test case for application program at Linux
1 | Varied files in a particular folder | Could be opened
2 | Opening new file without closing the previous file | Successful
3 | Files and folders together in a particular folder | Successful
3.7 Receiver
The output received at the UART device driver in Linux was validated with respect to output obtained at Minicom in Linux. There was one to one mapping between the values received at both the terminals. Table 3 shows the various test cases used to test the functionality of the receiver and the results obtained.
4. CONCLUSION
The designed and developed GRS includes a camera, DSP processor, wireless transceivers, embedded processor, Linux machine and a display monitor. Processing at the transmitter end involves skin color segmentation, palm extraction and gesture recognition. RGB and YCbCr thresholds combined with image subtraction techniques resulted in efficient skin color segmentation. Palm extraction using the horizontal and vertical profilers has made the bounding box adaptive and reduced computation time. Gesture recognition involves drawing radial lines from the centroid of the extracted palm to the edges of the contour. This approach does not require the system to be trained, hence reducing the computation time and memory requirement. Implementation of the gesture recognition algorithm on a DSP processor makes the system robust and portable.
For real time implementation of gesture recognition algorithm, suitable camera position, ideal lighting conditions and adequate memory for processing have to be considered.
The improved segmentation, palm extraction and gesture recognition techniques directly resulted in the elimination of gloves, markers, training data, multiple cameras and bulky computing devices like laptops and desktops.
The developed GRS transmits the recognized gesture to the receiver wirelessly. The receiver module involving ARM processor transmits the gesture command to the LINUX system through UART device driver; which in-turn controls the application to be displayed on the monitor. The functionality of the system is tested for file browser application efficiently.
5. REFERENCES
[1] Yuan, Y., Liu, Y. and Barner, K. (2005) Tactile Gesture Recognition for people with disabilities, IEEE International Conference on Acoustics, Speech, and Signal Processing, vol. 5, pp. 461-464.
[2] Mitra, S. and Acharya, T. (May, 2007) Gesture Recognition: A Survey, IEEE Transactions on Systems, Man, and Cybernetics, vol. 37, pp. 311-324.
[3] Swee, T.T., Salleh, S.H., Ariff, A.K., Ting, C., Seng, S.K. and Huat, L.S. (2007) Malay Sign Language Gesture Recognition system, International Conference on Intelligent and Advanced Systems, pp. 982-985.
[4] Lee, L.K., Kim, S., Choi, Y.K. and Lee, M.H. (2000) Recognition of hand gesture to human-computer interaction, IEEE Transactions on Industrial Electronics, vol. 3, pp. 2117-2122.
[5] Imai, A., Shimada, N. and Shirai, Y. (2004) 3-D hand posture recognition by training contour variation, Sixth IEEE International Conference on Automatic Face and Gesture Recognition, pp. 895-900.
[6] Rao, V.S. and Mahanta, C. (2006) Gesture Based Robot Control, Fourth International Conference on Intelligent Sensing and Information Processing, pp. 145-148.
[7] Kao, Y., Gu, H. and Yuan, S. (2008) Integration of Face and Hand Gesture Recognition, Third International Conference on Convergence and Hybrid Information Technology, vol. 1, pp. 330-335.
[8] Zaletelj, J., Perhavc, J. and Tasic, J.F. (2007) Vision-based human-computer interface using hand gestures, Eighth International Workshop on Image Analysis for Multimedia Interactive Services, pp. 41