2018 3rd International Conference on Information Technology and Industrial Automation (ICITIA 2018) ISBN: 978-1-60595-607-7
A Novel Gesture Control Scheme of Mobile Robots Based on ROS
Zhenwei Li, Xiaoli Yang, Lei Zhao, Yue Zhu and Mengli Jia
ABSTRACT
Gesture control is a form of human-computer interaction in which a camera captures a gesture image and the image is parsed into instructions that control the movement of a mobile robot. As an intuitive, convenient and flexible interaction method, gesture control can serve as an auxiliary control mode for warehousing and logistics robots, intelligent wheelchairs and similar platforms. In this paper, a gesture control scheme for mobile robots was designed and implemented using MATLAB and the Robot Operating System (ROS). First, a hand gesture image was captured with an RGB-D camera and preprocessed in MATLAB. Then, the gesture contour was extracted through morphological processing and the gesture was recognized from pixel value changes. Finally, a MATLAB-ROS interface was developed to convert the recognized gestures into commands for the basic movements of a mobile robot (forward, backward, turn left, turn right and stop). The scheme was tested on a TurtleBot2, a mobile robot running ROS. The results show that the proposed scheme recognizes five commonly used static gestures, controls five basic movements of the TurtleBot2, and achieves a recognition and control accuracy of over 90%.
INTRODUCTION
Gesture is a common means of expression alongside spoken language and facial expression. Gestures can carry multiple meanings: the same gesture can mean different things in different environments, and different gestures can express the same meaning[1]. Gesture is therefore an important research object in the field of intelligent recognition. The principle of gesture-based human-computer interaction is to let the machine "understand" and "respond" to specific gestures. Operators can use gestures to manipulate tools, perform certain tasks, or communicate.
Gesture recognition is increasingly used in human-computer interaction, and a variety of recognition methods have emerged. Wang Bing[2] adopted a fingertip detection algorithm based on pixel classification and used hidden Markov models to train and recognize each gesture. Gao Chen[3] proposed an algorithm combining convex hull and curvature detection of fingertips and used a support vector machine (SVM) for gesture recognition, reaching a recognition rate of 97.1%. Ding Yi[4] represented gesture images with HOG features and used a histogram intersection kernel SVM for gesture recognition, reaching a recognition rate of 93.33%.
In this paper, a hand gesture control scheme was designed and implemented in the MATLAB and ROS environments. The hand gesture image was acquired with a Kinect camera, and the skin color region was extracted by combining the RGB and HSV color spaces. After image smoothing and morphological processing, skin-like regions were removed using a gesture region criterion, the rectangular gesture region was extracted, and finally the gesture outline was obtained. Hand gestures were recognized from pixel value changes together with a threshold on the distance between two centers, and the recognition result was stored. The result was then read in the ROS environment and converted into the corresponding command to control the movement of a mobile robot.
HAND GESTURE RECOGNITION
Skin Color Extraction
Skin color-based segmentation separates the skin color region from the rest of the image by exploiting the clustering of skin color in a color space, and uses this skin color information to segment the hand gesture[5]. This segmentation method is intuitive, efficient and accurate. This paper applied threshold segmentation in the RGB color space, combined it with the clustering of the skin color distribution in HSV space, and performed an AND operation between the two results to extract the skin color regions. The image was converted from RGB space to HSV space using the following formulas.
$$V = \max(R, G, B) \tag{2}$$
$$S = 1 - \frac{\min(R, G, B)}{V} \tag{3}$$
In these formulas, R, G and B are the red, green and blue components of each pixel of the image in the RGB color space.
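As a small worked example of equations (2) and (3), the following Python sketch computes V and S for a single pixel. The function name `value_and_saturation` and the sample pixel are illustrative assumptions, not taken from the paper's MATLAB implementation, and the zero-brightness guard is added here only to avoid a division by zero.

```python
def value_and_saturation(r, g, b):
    """Return (V, S) for one RGB pixel, following equations (2) and (3)."""
    v = max(r, g, b)               # equation (2): V is the largest channel
    if v == 0:                     # black pixel: saturation is taken as 0 (assumption)
        return 0, 0.0
    s = 1 - min(r, g, b) / v       # equation (3): S = 1 - min / V
    return v, s


# Hypothetical skin-toned pixel used only for illustration.
v, s = value_and_saturation(200, 120, 80)
print(v, round(s, 2))  # 200 0.6
```

Here S lies in [0, 1]; rescaling it to the 0-255 range would be needed before comparing it with 8-bit thresholds.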
Experiments show that human skin color occupies the following range in HSV space[6]: H in (2, 28) and S in (50, 200). Using this range, skin color can be extracted in a simple manner, which helps the real-time performance of the system. In addition, skin color in RGB space satisfies R>G>B, so this condition was also used in the skin color extraction.
Based on the skin color information above, the acquired gesture image was binarized to obtain the skin color region. The pixel value of the binary map at (x, y) was determined from the H, S and R, G, B values of the original image at (x, y):
$$f(x, y) = \begin{cases} 255, & 2 < H < 28 \ \text{and}\ 50 < S < 200 \ \text{and}\ R > G > B \\ 0, & \text{else} \end{cases} \tag{4}$$
In formula 4, H and S are the HSV components and R, G and B are the RGB components of the pixel at (x, y) in the original image. The image is traversed pixel by pixel, checking whether the components at (x, y) satisfy the condition. If they do, the pixel at (x, y) of an initially all-zero matrix is set to the foreground value (255 in formula 4, equivalently 1 in a logical mask); otherwise it remains 0. The result of the skin color extraction is shown in Fig. 1 (a).
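The binarization rule can be sketched in Python as follows. This is a minimal illustration, not the paper's MATLAB code: the function name `skin_mask` is hypothetical, the image is a plain list of rows of (R, G, B) tuples, and the thresholds are assumed to refer to an 8-bit HSV scale with H in 0-180 and S in 0-255, which the paper does not state explicitly.

```python
import colorsys


def skin_mask(image):
    """Binarize an RGB image (rows of (R, G, B) tuples, 0..255) into a 0/255 mask."""
    mask = []
    for row in image:
        out = []
        for r, g, b in row:
            h, s, _ = colorsys.rgb_to_hsv(r / 255, g / 255, b / 255)
            # Assumed scale: H in 0..180 and S in 0..255.
            h8, s8 = h * 180, s * 255
            is_skin = 2 < h8 < 28 and 50 < s8 < 200 and r > g > b
            out.append(255 if is_skin else 0)
        mask.append(out)
    return mask


# Tiny 2x2 test image: one skin-toned pixel, then blue, white and black.
image = [[(200, 120, 80), (80, 120, 200)],
         [(255, 255, 255), (0, 0, 0)]]
print(skin_mask(image))  # [[255, 0], [0, 0]]
```

Only the first pixel passes all three conditions; the blue pixel fails R>G>B, and the white and black pixels fail the saturation and hue ranges.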
Figure 1. (a) Binary image; (b) image after median filtering; (c) gesture region; (d) image after morphological processing.
Hand Gesture Segmentation
Hand gesture images may be distorted due to interference from different noises in the process of generation and transformation. Therefore, it is necessary to process the image, filter out unnecessary information, and extract effective gesture contour. The processing steps are as follows:
Step 1: Median filter the skin region image. As shown in Fig.1 (b), the median filter used in this paper not only removes noise, but also preserves the edge characteristics of the image without causing significant blurring.
Step 2: Remove the skin area outside the gesture and the skin-like areas in the background (Fig. 1 (c)). Any binarized skin or skin-like region whose area is less than 0.06 of the entire image area is not treated as a gesture region and is removed.
Step 3: Remove the burrs on the edge and the internal voids of the binarized gesture image by morphological processing such as dilation and erosion. Experiments showed that this process effectively smooths the outline and fills the voids, as shown in Fig. 1 (d).
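The three processing steps can be sketched in Python on a 0/1 image stored as a list of rows. This is an illustration under assumptions the paper does not specify: a 3x3 median window, 4-connected regions for the area test, and a 3x3 structuring element with dilation followed by erosion (a morphological closing). All function names are hypothetical.

```python
from collections import deque


def median3(img):
    """Step 1: 3x3 median filter on a 0/1 image; border pixels are copied unchanged."""
    h, w = len(img), len(img[0])
    out = [row[:] for row in img]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            window = sorted(img[y + dy][x + dx]
                            for dy in (-1, 0, 1) for dx in (-1, 0, 1))
            out[y][x] = window[4]  # middle of the 9 sorted values
    return out


def drop_small_regions(img, min_ratio=0.06):
    """Step 2: zero every 4-connected region whose area ratio is below min_ratio."""
    h, w = len(img), len(img[0])
    out = [row[:] for row in img]
    seen = [[False] * w for _ in range(h)]
    for sy in range(h):
        for sx in range(w):
            if img[sy][sx] == 0 or seen[sy][sx]:
                continue
            region, queue = [], deque([(sy, sx)])
            seen[sy][sx] = True
            while queue:  # breadth-first flood fill of one region
                y, x = queue.popleft()
                region.append((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if 0 <= ny < h and 0 <= nx < w and img[ny][nx] and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            if len(region) / (h * w) < min_ratio:
                for y, x in region:
                    out[y][x] = 0
    return out


def _morph(img, pick):
    """Apply pick (max = dilation, min = erosion) over each clipped 3x3 neighbourhood."""
    h, w = len(img), len(img[0])
    return [[pick(img[ny][nx]
                  for ny in range(max(0, y - 1), min(h, y + 2))
                  for nx in range(max(0, x - 1), min(w, x + 2)))
             for x in range(w)] for y in range(h)]


def close3(img):
    """Step 3: dilation followed by erosion, which fills small voids."""
    return _morph(_morph(img, max), min)


# Usage: a 5x5 blob with a one-pixel void plus an isolated noise pixel.
img = [[0] * 10 for _ in range(10)]
for y in range(2, 7):
    for x in range(2, 7):
        img[y][x] = 1
img[4][4] = 0  # void inside the blob
img[8][8] = 1  # isolated noise pixel
cleaned = close3(drop_small_regions(median3(img)))
```

In this toy image the noise pixel is removed and the void is filled, while the blob, which covers well over 6% of the image, is kept.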
Figure 2. Predefined gestures 1-5: (a) gesture 1; (b) gesture 2; (c) gesture 3; (d) gesture 4; (e) gesture 5.
Figure 3. (a) Image traversal; (b) rectangular gesture region; (c) gesture contour.
Gesture Recognition
The number of fingers is the most obvious feature of the gesture. This paper recognizes the specified five static gestures (as shown in Fig.2) as gestures 1-5 by calculating the number of fingers.
To reduce the amount of computation in the subsequent recognition, a rectangular gesture region was extracted from the image before recognition. The rectangle is obtained as follows: the binary gesture image is traversed row by row, from left to right and from top to bottom. The first position (i, j) whose pixel value is 1 gives the top boundary of the rectangle (row i). Likewise, the bottom boundary is obtained by scanning from left to right, from the bottom up. The left and right boundaries are obtained in the same way, and a margin of 5 pixels is reserved on each side. The algorithm is illustrated in Fig. 3 (a). Two feature points were also acquired, the center of the rectangular gesture region and the gesture center, as shown in Fig. 3 (b); Fig. 3 (c) shows the final extracted gesture contour.
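A minimal Python sketch of the rectangle extraction is given below. The function names are hypothetical, the mask is a list of rows of 0/1 values, and the "gesture center" is assumed here to be the centroid of the foreground pixels, since the paper does not define it further.

```python
def gesture_rectangle(mask, margin=5):
    """Return (top, bottom, left, right) of the foreground, padded and clipped."""
    h, w = len(mask), len(mask[0])
    rows = [y for y in range(h) if any(mask[y])]
    cols = [x for x in range(w) if any(mask[y][x] for y in range(h))]
    if not rows:
        return None  # no foreground pixel found
    top, bottom = max(rows[0] - margin, 0), min(rows[-1] + margin, h - 1)
    left, right = max(cols[0] - margin, 0), min(cols[-1] + margin, w - 1)
    return top, bottom, left, right


def centers(mask, rect):
    """Return the rectangle center and the foreground centroid as (row, column)."""
    top, bottom, left, right = rect
    rect_center = ((top + bottom) / 2, (left + right) / 2)
    pts = [(y, x) for y in range(top, bottom + 1)
           for x in range(left, right + 1) if mask[y][x]]
    cy = sum(p[0] for p in pts) / len(pts)
    cx = sum(p[1] for p in pts) / len(pts)
    return rect_center, (cy, cx)


# Usage: a 5x5 block of foreground pixels in a 30x30 mask.
mask = [[0] * 30 for _ in range(30)]
for y in range(10, 15):
    for x in range(8, 13):
        mask[y][x] = 1
rect = gesture_rectangle(mask)
print(rect)                 # (5, 19, 3, 17)
print(centers(mask, rect))  # ((12.0, 10.0), (12.0, 10.0))
```

For a symmetric block the two centers coincide; for a hand with extended fingers the centroid would be expected to shift toward the palm.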
After the rectangular gesture area was obtained, the number of fingers was calculated from the pixel value changes at the finger edges. Scanning across the finger part of the gesture area, the pixel value changes twice for each finger (from background to finger and back). For gesture 2, for example, the pixel value changes four times, so the number of fingers is 4 / 2 = 2, and so on. The predefined gestures 1-4 can be identified directly by this finger counting method. Recognizing gesture 5 additionally requires the distance between the center of the rectangular gesture area and the center of the gesture: if the distance is less than 16, the result is set to gesture 5; otherwise, the initial recognition result is left unchanged. Since the fingers usually appear in the upper 1/4 to 1/3 of the rectangular gesture area, only the upper 7/24 portion of the rectangle was examined during recognition to improve the detection accuracy. Finally, the gesture recognition result was stored in a resul.mat file.
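The counting rule can be sketched in Python as follows. Several details are assumptions made only for illustration: each row in the upper 7/24 of the rectangle is scanned, the largest number of pixel value changes found in any row is used, and the distance is measured in pixels. The function names are hypothetical, and the distance rule is applied exactly as worded in the text.

```python
import math


def count_fingers(mask, rect, num=7, den=24):
    """Count fingers as (maximum pixel value changes in a row) // 2 in the upper band."""
    top, bottom, left, right = rect
    band = (bottom - top + 1) * num // den  # height of the upper 7/24 band
    best = 0
    for y in range(top, top + band + 1):
        row = mask[y][left:right + 1]
        changes = sum(1 for a, b in zip(row, row[1:]) if a != b)
        best = max(best, changes)
    return best // 2  # two changes (rising and falling edge) per finger


def classify(mask, rect, rect_center, gesture_center, threshold=16):
    """Return the gesture number; a center distance below threshold means gesture 5."""
    fingers = count_fingers(mask, rect)
    distance = math.hypot(rect_center[0] - gesture_center[0],
                          rect_center[1] - gesture_center[1])
    return 5 if distance < threshold else fingers


# Usage: two vertical "fingers" above a palm in a 48x20 mask.
mask = [[0] * 20 for _ in range(48)]
for y in range(0, 24):
    for x in list(range(3, 6)) + list(range(10, 13)):
        mask[y][x] = 1
for y in range(24, 48):
    for x in range(2, 15):
        mask[y][x] = 1
rect = (0, 47, 0, 19)
print(count_fingers(mask, rect))  # 2
```

The margin reserved around the rectangle matters here: a finger touching the rectangle border would produce only one change in its row and would be undercounted.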
GESTURE CONTROL IMPLEMENTATION
corresponding Twist message upon receiving a control command to make TurtleBot2 move.
TESTS AND RESULTS
In this paper, 16 samples of each of the gestures 1 to 5 were taken as test samples, giving a total of 80 test samples for the gesture recognition scheme. The test results are listed in Table I. As the data in Table I show, the accuracy of the proposed gesture recognition is 91.4%, which is high, while the rejection rate is 11.2%, which is also relatively high. Because of the limitations of the recognition algorithm used in this paper, the recognition accuracy for gesture 1 and gesture 5 is below the average level.
In the ROS environment, gesture control was demonstrated in simulation with the Rviz tool. As shown in Fig. 4, the Listener node subscribes to the chatter topic and replies "I heard forward". The corresponding Twist message is then published to make the TurtleBot2 mobile robot advance. The "Stop" command makes it stop moving.
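The mapping from a received command string to velocity values can be sketched in plain Python as below. This is not the paper's ROS node: the `Velocity` class merely stands in for the `linear.x` and `angular.z` fields of a `geometry_msgs/Twist` message, the speed values are arbitrary assumptions, and unknown commands are mapped to "stop" as a cautious default chosen here. The sign of the angular velocity follows the usual ROS convention that a positive z rotation is counterclockwise, that is, a left turn.

```python
from dataclasses import dataclass


@dataclass
class Velocity:
    """Stand-in for the two Twist fields a differential-drive base uses."""
    linear_x: float = 0.0   # m/s, would go into Twist.linear.x
    angular_z: float = 0.0  # rad/s, would go into Twist.angular.z


# Hypothetical speeds; the paper does not report the values it used.
COMMANDS = {
    "forward": Velocity(linear_x=0.2),
    "backward": Velocity(linear_x=-0.2),
    "turn left": Velocity(angular_z=0.5),
    "turn right": Velocity(angular_z=-0.5),
    "stop": Velocity(),
}


def velocity_for(command):
    """Look up the velocity for a command string; unknown commands stop the robot."""
    return COMMANDS.get(command.strip().lower(), COMMANDS["stop"])


print(velocity_for("forward"))  # Velocity(linear_x=0.2, angular_z=0.0)
print(velocity_for("Stop"))     # Velocity(linear_x=0.0, angular_z=0.0)
```

In an actual ROS node, the subscriber callback would copy these two values into a Twist message and publish it on the robot's velocity topic.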
TABLE I. TEST RESULTS.
CONCLUSIONS
In this paper, a gesture control scheme for mobile robots was designed and implemented using MATLAB and ROS, and gesture recognition was applied effectively to the movement control of a mobile robot. The research mainly covers static gesture recognition based on Kinect vision and mobile robot motion control. We focused on the implementation of gesture segmentation, the design of the gesture recognition algorithm and the implementation of motion control.
The results show that the gesture control program can capture gesture images through the Kinect camera, recognize the five kinds of common static gestures input from the camera, and then convert them into corresponding control commands to control TurtleBot2 to move forward, backward, turn left, turn right and stop.
REFERENCES
1. Yun Liu, Lifeng Zhang, Shujun Zhang. A Hand Gesture Recognition Method Based on Multi-Feature Fusion and Template Matching. Procedia Engineering, 2012, 29(4):1678-1684.
2. Bing Wang, Hongwei Dong, Mingmin Zhang, et al. Dynamic Hand Gesture Recognition Based on Kinect. Transducer and Microsystem Technologies, 2018, 37(2):143-146.
3. Chen Gao, Yajun Zhang. Fingertip Detection and Hand Gesture Recognition Based on Kinect Depth Image. Computer Systems & Applications, 2017, 26(4):192-197.
4. Yi Ding, Jiangtao Cao, Ping Li, et al. Research on Hand Gesture Recognition in Complex Backgrounds. Techniques of Automation and Applications, 2016, 35(8):113-116.
5. Dhruva N, Rupanagudi S, Sachin S, et al. Novel Segmentation Algorithm for Hand Gesture Recognition. IEEE International Multi Conference on Automation Computing, Control, Communication and Compressed Sensing, Kottayam, India, 2013:383-388.