Vision Based Gesture Recognition from RGB Video frames Using Morphological Image Processing Techniques

Vol. 28, No. 13, (2019), pp. 321-332

Fahmid Al Farid, Noramiza Hashim, Junaidi Abdullah

Faculty of Computing and Informatics, Multimedia University, Cyberjaya, Malaysia

Abstract

With the world's large and growing population, novel human-computer interaction systems and techniques can help improve our way of life. Vision-based gesture recognition technology can help support the safety and needs of people with disabilities as well as others. Gesture recognition from video frames is challenging because the features of each gesture vary considerably from person to person. In this work, we propose a vision-based hand gesture recognition algorithm whose input frames come from RGB video data. Gesture-based systems are more natural, spontaneous, and straightforward. Previous works have attempted to recognize hand gestures in different kinds of scenarios. According to our survey, a gesture recognition system can be based either on wearable sensors or on vision; our proposed method is applied to a vision-based system. In the proposed system, image acquisition starts from RGB videos captured using a Kinect sensor. We blur the frames of each video one after another to remove background noise, and then convert them to the HSV color space. After that, we apply dilation, erosion, filtering, and thresholding to the images. We use these morphological image processing techniques to convert the images to black-and-white format. Finally, using the well-known SVM classification algorithm, we recognize the hand gestures with an average accuracy of 91.01 percent, which is higher than the state of the art. In conclusion, the proposed algorithm aims to provide a better vision-based hand gesture recognition system with a unique solution in this domain.

Keywords: Vision-based, Gesture Recognition, RGB Video Data, SVM, Morphological Image Processing

1. Introduction

Hand gesture recognition is at the forefront of human-computer interaction (HCI). In HCI in general, vision-based hand gesture recognition technology is crucial. In the past, the mouse and keyboard were the main means of human-computer interaction. Gesture recognition is also an important part of human activity recognition, where the main goal is to identify actions from a sequence of observations. Vision-based gesture recognition is the basis of many applications, including healthcare, HCI and video surveillance [1]. Voice recognition and gesture recognition have drawn great attention from researchers in the field of human-computer interaction.

In the gesture recognition field, hand segmentation plays a very important role. To improve segmentation accuracy, the hand gesture segmentation phase needs to reduce segmentation time and lessen the effect of illumination conditions on the segmentation process. It must also be able to segment hand gestures under different illumination conditions. Hand gesture segmentation holds a vital position in the hand gesture recognition domain [2, 3].

In computer vision research, the analysis and interpretation of human behavior from visual input is a current trend. It is motivated by growing needs and by exciting scientific challenges in applications such as virtual environments, monitoring systems, security, entertainment, and health support. The work in [4, 5] focuses on the analysis and recognition of gestures based on hand cues. Hands are the most effective of the human body parts and, moreover, natural interaction tools in HCI-driven applications. Research on hand gesture recognition has existed for a long time and has attracted many researchers from the field of computer vision. The interest in virtual and augmented reality applications has created a particular need for increasingly accurate hand gesture recognition, since it enables users to play and interact with a virtual world in the most natural way [4, 5]. Another article presents a novel gesture recognition system that can be implemented end-to-end on event-based hardware using the TrueNorth neurosynaptic processor. It detects hand movements in real time at low power from the events streamed by a Dynamic Vision Sensor (DVS) [12].

In our proposed algorithm, the system is divided into three parts: image acquisition, hand identification, and gesture recognition. Our main focus is on hand segmentation. Image acquisition starts from RGB plus depth videos captured using a Kinect sensor. A face detection algorithm based on Haar-like features is used for human detection. Hand segmentation is performed using an improved version of the k-means clustering algorithm in which pixel coordinates and color are used. Some standard algorithms have also been implemented for tracking and pose estimation.

The remainder of the paper is organized as follows. Section 2 discusses related works. Section 3 describes the proposed feature extraction algorithm for hand gesture recognition in detail. Section 4 describes the experimental methodology, Section 5 presents the experimental results of our approach, and Section 6 concludes the paper.

The main contributions of this paper are given below:

(1) To the best of our knowledge, this is a unique solution for gesture recognition.

(2) Our proposed method provides an efficient way of extracting features.

2. RELATED WORKS

Hand gesture recognition has become a mature field of research, and a lot of work has been done in this area. Hand segmentation is a difficult task because hands vary in shape and skin color. A hand mostly looks very different from different viewpoints, can be open or closed, can be partially occluded, and can have different finger positions. Skin color is a very noticeable cue [2, 7].

One related paper presents a new approach to hand gesture recognition based on the depth map captured by an RGB-D Kinect camera. Even though this camera offers two types of information, the "RGB image" and the "depth map", the authors use only the depth information to evaluate and recognize hand gestures, whereas our method uses only the RGB image. They propose a new method based on edge detection to remove noise and segment the hand. Furthermore, new descriptors are introduced to represent the hand gesture; these features are invariant to scale, rotation and translation. Their approach is applied to the French Sign Language alphabet, which they use to show its usefulness and to evaluate the robustness of the proposed descriptors [13]. However, other objects may also have a color similar to skin.

In [3], the camera is static and the hands are segmented based on their movement. Other methods use a single-color background [9-11] or depend on depth information obtained by an RGB-D camera [6, 8]. Under general conditions, these approaches are unable to provide accurate masks. The method we propose is based on a unique way of extracting features and efficient classification with an SVM. Our first work on feature extraction was not mature enough. Because we used 3-fold cross-validation in our experiment, the accuracy depended entirely on the training data of each fold. When the third fold's data were used for training, the accuracy was not very high. We removed the noise from the dark images in that dataset. By using morphological processing, we achieved considerable accuracy in each fold, and the average accuracy also became very high [15].

Human activity recognition (HAR) is a widely studied computer vision problem. Applications of HAR include video surveillance, healthcare, and human-computer interaction. As imaging techniques progress and camera devices are upgraded, novel approaches to HAR constantly emerge. Human activities have an inherent hierarchical structure that indicates their different levels, which can be considered a three-level categorization. First, at the bottom level, there are atomic elements, and these action primitives constitute more complex human activities. After the action primitive level, the action/activity comes as the second level. Finally, complex interactions form the top level, which refers to human activities that involve multiple people and objects. Atomic actions are performed by a specific part of the human body, such as the hands, arms, or upper body. Actions and activities are used interchangeably in that article, referring to whole-body movements composed of several action primitives in temporal sequential order and performed by a single person with no other persons or additional objects. Specifically, the authors use the term human activities for all movements of the three layers and activities/actions for the middle level of human activities. Human activities such as walking, running, and waving hands are categorized at the action/activity level [16].

Gestures are a natural form of human communication. When accompanying speech, gestures convey information about the intentions, interests, feelings and ideas of the speaker. Gestures are even more important in noisy environments, at a distance, and for people with hearing impairments. In these scenarios, gestures replace speech as the primary means of communication, becoming both more common and more structured. Automated gesture recognition is therefore an important area of computer vision research, with applications in human-computer interfaces (HCI). Not surprisingly, a large literature has developed on gesture recognition. A good way to measure progress in this crowded field is to look at the ChaLearn challenges, which began in 2011 and continued through 2017. The current ChaLearn IsoGD dataset is one of the largest and most varied gesture datasets available, with 249 gestures from a variety of domains including mudras (Hindu/Buddhist hand gestures), Chinese numbers, and flying signals. The ChaLearn 2017 challenge attracted competitors from across the world, and its results can reasonably be interpreted as reflecting the current state-of-the-art approaches [17].

Images contain measurement information of key interest for a variety of research and application areas. At the same time, computers have become an inseparable part of our society, influencing many aspects of our daily lives in terms of communication and interaction. The main aim is to develop a system that can simplify the way humans interact with computers. The system is designed using Canny edge detection for edge detection, the Histogram of Gradients for feature extraction, and a Support Vector Machine (SVM) classifier, which is widely used for classification and regression analysis. An SVM training algorithm builds a model that predicts whether a new example falls into one class or the other. The classifier learns from the data points in examples that are labeled as belonging to their respective categories. With the continued growth of communication and multimedia systems, research and evaluation have seen a steady increase in the emphasis on image-based data. Digital image processing deals with the manipulation of digital images by a digital computer. It is a subfield of signals and systems but focuses particularly on images. Digital image processing concentrates on developing a computer system that can perform processing on image data: the input of the system is a digital image, the system processes that image using efficient algorithms, and it gives an image as output.

A gesture is a form of non-verbal communication in which visible bodily actions communicate particular messages, either in place of speech or together and in parallel with spoken words. Gestures can be static (a posture or certain pose), which requires less computational complexity, or dynamic (a sequence of postures), which is more complex but suitable for real-time environments. Different methods have been proposed for acquiring the information necessary for a gesture recognition system. Some methods use additional hardware devices, such as data glove devices and color markers, to easily extract a comprehensive description of gesture features. Other methods are based on the appearance of the hand, using skin color to segment the hand and extract the necessary features; these methods are considered easy, natural and less costly compared with the methods mentioned before. Gestures include movement of the hands, face, or other parts of the body. Gesture recognition can be seen as a way for computers to begin to understand human body language, thus building a richer bridge between machines and humans than primitive text user interfaces or even graphical user interfaces (GUIs), which still limit the majority of input to keyboard and mouse. It enables humans to interface with the machine (HMI) and interact naturally without any mechanical devices. The primary motivation for developing such a system lies in the fact that gesture recognition has applications ranging from motion analysis to machine intelligence. It also serves many applications, from augmented reality to sign language recognition systems.

A great deal of research has been carried out over the last two to three decades in the area of gesture recognition. The approaches can be broadly divided into two classes, namely vision-based gesture recognition and glove-based gesture recognition. Glove-based hand gesture recognition hinders natural interaction because cumbersome devices must be worn. A vision-based approach uses features extracted from the visual appearance of the input image to model the hand, comparing these modeled features with features extracted from the input camera(s) or video input [18].

Successful efforts in hand gesture recognition research within the last two decades paved the way for natural human-computer interaction systems. Unresolved challenges, such as reliable identification of the gesturing phase, sensitivity to size, shape, and speed variations, and issues due to occlusion, keep hand gesture recognition research very active. Methods using RGB and RGB-D cameras are reviewed with quantitative and qualitative comparisons of algorithms. The quantitative comparison of algorithms is done using a set of 13 measures chosen from different attributes of the algorithms and the experimental methodology adopted in their evaluation. The authors point out the need to consider these measures together with the recognition accuracy of an algorithm to predict its success in real-world applications. Nonverbal communication, which includes communication through hand gestures, body postures, and facial expressions, makes up about two-thirds of all communication among humans. Hand gestures are one of the most common categories of body language used for communication and interaction. While the rest of the body indicates a more general emotional state, hand gestures can carry specific linguistic content. Because of their speed and expressiveness in interaction, hand gestures are widely used in sign languages and human-computer interaction systems. One ongoing goal in human-machine interface design is to enable effective and engaging interaction. For example, vision-based hand gesture recognition (HGR) systems can enable contactless interaction in sterile environments, such as hospital surgery rooms, or simply provide engaging controls for entertainment and gaming applications. However, HGR is not as robust as standard keyboard- and mouse-based interaction. Issues such as sensitivity to size and speed variations, poor performance against complex backgrounds and varying lighting conditions, and the reliable detection of the gesturing phase have limited the use of hand gestures as a dependable modality in interface design [19].

In [20], the authors present results of automatic gesture recognition systems using different types of cameras in order to compare them with respect to their segmentation performance. The acquired image segments provide the data for further analysis. The images of a single-camera system are mostly used as input data in the research area of gesture recognition. In comparison, the analysis results of a stereo color camera system and a thermal camera system are used to determine the advantages and disadvantages of these camera systems. On this basis, a real-time gesture recognition system is proposed to classify alphabets (A-Z) and numbers (0-9) with an average recognition rate of 98% using Hidden Markov Models (HMM) [20].

In [21], the authors present a short survey of the important related datasets. A background on the varieties of datasets is presented before the categorization. The terms activity, action and gesture overlap somewhat in researchers' usage, and therefore some datasets have overlapping classes. The datasets are grouped into three categories: activity, gesture and action. The paper identifies key datasets and points out areas that need to be explored in the future. Some datasets are explained, and less-known datasets are only mentioned so that researchers can explore them if required. Different aspects of the varieties of datasets are also illustrated. It is evident that a number of challenges in the field of activity and gesture analysis remain unsolved; hence, extensive datasets in varied situations are required [21].

Gesture recognition systems are considered among the recent applications that call for a distinctive interaction between human and computer that simulates human-human interaction as much as possible. Building a gesture recognition system requires specific steps, which can be summarized as: detecting the hand gesture, determining essential reference points to derive the key features of the hand segment, extracting general features that can span the gesture space, and finally recognizing the gesture. The authors of [22] implemented the earlier phases of the gesture system in their previous work, in which the hand gesture is detected and tracked using a Genetic Algorithm (GA) with an increasing number of chromosomes and variable-length individuals, and the hand centroid is determined using a GA with a limited number of chromosomes. Geometric reference points are used to extract two kinds of features, which are used in the finger identification and gesture recognition stages. In [22], they consider the classification stage: a Gaussian model is used as the classifier to compute each finger's probability after the extracted features are applied. Experimental results demonstrate the robustness and efficiency of the proposed system through the identified fingers and the recognized gestures [22].

The paper in [23] presents a comparative study of current hand gesture recognition systems and proposes a new approach to gesture recognition that is simple, cheaper, and an alternative to input devices such as the mouse, using static and dynamic hand gestures for interactive computer applications. Despite the increased attention to such systems, there are still certain limitations in the literature. Most applications require different constraints, such as particular lighting conditions, the use of a specific camera, making the user wear a multi-colored glove, or a large amount of training data. The use of hand gestures provides an attractive alternative to cumbersome interface devices for human-computer interaction (HCI). The interface is simple enough to be run using an ordinary webcam and requires little training [23].

In [24], the authors present a novel real-time method for hand gesture recognition. In their framework, the hand region is extracted from the background with the background subtraction method. Then, the palm and fingers are segmented so as to detect and recognize the fingers. Finally, a rule classifier is applied to predict the labels of hand gestures. Experiments on a dataset of 1300 images show that their method performs well and is highly efficient. Moreover, the method shows better performance than a state-of-the-art method on another dataset of hand gestures [24].

In [25], the authors propose a multimodal gesture recognition method based on a ResC3D network. One key idea is to find a compact and effective representation of video sequences. Therefore, video enhancement techniques such as Retinex and median filtering are applied to eliminate illumination variation and noise in the input video, and a weighted frame unification strategy is used to sample key frames. On these representations, a ResC3D network, which leverages the advantages of both residual and C3D models, is developed to extract features, together with a canonical correlation analysis based fusion scheme for blending features. The performance of their method is evaluated in the ChaLearn LAP isolated gesture recognition challenge, where it achieves 67.71% accuracy and ranks first [25].

Hand gestures are the most common and natural non-verbal communication medium when using a computer, and related research efforts have recently attracted considerable interest. In addition, the data provided by current commercial low-cost depth cameras can be exploited in various gesture recognition based systems. The area of hand gesture analysis covers hand pose estimation and gesture recognition. Hand pose estimation is considered more challenging than the estimation of other human body parts because of the small size of the hand, its greater complexity and its significant self-occlusions. In addition, the development of an accurate hand gesture recognition system is also challenging because of high dissimilarities between gestures derived from ad hoc, cultural and/or individual factors of users. The work in [26] proposes an original framework to represent hand gestures using hand shape and motion descriptors computed on 3D hand skeletal features. The authors use a temporal pyramid to model the dynamics of gestures and a linear SVM to perform the classification. Moreover, they create the Dynamic Hand Gesture dataset, containing 2800 sequences of 14 gesture types. Evaluation results demonstrate the promise of using hand skeletal data to perform hand gesture recognition. Experiments are carried out on three hand gesture datasets containing a set of fine and coarse heterogeneous gestures. Furthermore, results of the approach in terms of latency demonstrate improvements for low-latency hand gesture recognition systems, where an early classification is needed. The authors then extend the study of hand gesture analysis to online recognition. Using a deep learning approach, they apply a transfer learning strategy to learn hand pose and shape features from a depth image dataset originally created for hand pose estimation. Second, they model the temporal variations of the hand poses and shapes using a recurrent deep learning technique. Finally, both sources of information are merged to perform accurate early detection and recognition of hand gestures. Experiments on two datasets show that the proposed approach is capable of detecting an occurring gesture and recognizing its type well before its end [26].

In [27], the authors distinguish two classes of hand gestures, namely static and dynamic. A static gesture is a particular hand configuration and pose represented by a single image. A dynamic gesture is a moving gesture, represented by a sequence of images. Sign language is a good test case for gesture classification, since each gesture is assigned a specific meaning. The paper focuses on the development of a system that recognizes static hand images against complex backgrounds based on South African Sign Language (SASL). A Support Vector Machine is used to classify hand postures as gestures, owing to its high generalization performance without the need to add a priori knowledge, even when the dimension of the input space is very high [27].

The thesis in [28] presents a novel depth-camera-based real-time hand gesture recognition system for training a human-like robot hand to interact with humans through sign language. The authors developed a modular real-time Hand Gesture Recognition (HGR) system, which uses a multiclass Support Vector Machine (SVM) for training and recognition of static hand postures and N-Dimensional Dynamic Time Warping (ND-DTW) for dynamic hand gesture recognition. A 3D hand gesture training/testing dataset was recorded using a depth camera, tailored to accommodate the kinematic constraints of the human-like robotic hand. Experimental results show that the multiclass SVM method has an overall 98.34% recognition rate in the HRI (Human-Robot Interaction) mode and a 99.94% recognition rate in the RRI (Robot-Robot Interaction) mode, as well as the lowest average run time compared to the k-NN (k-Nearest Neighbor) and ANBC (Adaptive Naïve Bayes Classifier) approaches. In dynamic gesture recognition, the ND-DTW classifier shows better performance than DHMM (Discrete Hidden Markov Model), with a 97% recognition rate and a significantly shorter run time [28].

3. PROPOSED ALGORITHM FOR FEATURE EXTRACTION

After obtaining the video frames, we blur the images to better extract the hand and remove background noise. We first blur the frames with a 3x3 kernel. The images are then converted to the HSV color space, and we keep only the pixels whose colors fall within specific color ranges. We then blur again with a 7x7 kernel and apply an elliptical kernel of size 5x5.
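As an illustration of the color-range step, the following minimal sketch builds a binary mask from an RGB frame by converting each pixel to HSV and keeping only the pixels inside a given range. It is written in plain Python with the standard colorsys module so that it is self-contained. The function name hsv_mask and the bounds SKIN_LOWER and SKIN_UPPER are hypothetical: the paper reports neither its color ranges nor the software it uses, and the blurring steps are omitted here.

```python
import colorsys


def hsv_mask(frame, lower, upper):
    """Return a binary mask (1 = pixel inside the HSV range) for an RGB frame.

    frame: list of rows, each row a list of (r, g, b) tuples with values 0-255.
    lower, upper: (h, s, v) bounds with every component in [0, 1],
    which is the convention used by colorsys.
    """
    mask = []
    for row in frame:
        mask_row = []
        for r, g, b in row:
            h, s, v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
            inside = (lower[0] <= h <= upper[0]
                      and lower[1] <= s <= upper[1]
                      and lower[2] <= v <= upper[2])
            mask_row.append(1 if inside else 0)
        mask.append(mask_row)
    return mask


# Hypothetical skin-like range; these numbers are NOT taken from the paper.
SKIN_LOWER = (0.00, 0.20, 0.35)
SKIN_UPPER = (0.14, 0.70, 1.00)

# Tiny 2x2 test frame: two skin-like pixels, one blue pixel, one green pixel.
frame = [
    [(224, 172, 105), (10, 10, 200)],
    [(0, 128, 0), (198, 134, 66)],
]
mask = hsv_mask(frame, SKIN_LOWER, SKIN_UPPER)
print(mask)
```

In a practical system, a vectorized image library would likely replace the per-pixel loop; the loop is kept here only to make the range test explicit.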

We use well-known morphological image processing techniques. We dilate the objects in the image and then erode them, after which the objects become thinner. We blur the image once more with a 21x21 kernel, then threshold it to convert it to black-and-white format. Next, we find the contours of the objects and select the contour with the maximum area. Finally, we save the masked images together with the center of the maximum contour.
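The dilation, erosion and thresholding operations can be sketched on a small binary grid as follows. This is a minimal illustration under stated assumptions, not the implementation used in the experiments: it uses a square k x k structuring element instead of an elliptical one, treats pixels outside the image as background during erosion, and the names dilate, erode and threshold as well as the threshold value 127 are chosen here for illustration only.

```python
def dilate(mask, k=3):
    """Binary dilation with a k x k square structuring element."""
    h, w = len(mask), len(mask[0])
    r = k // 2
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            # A pixel is set if any in-bounds neighbour in the window is set.
            out[y][x] = int(any(
                mask[yy][xx]
                for yy in range(max(0, y - r), min(h, y + r + 1))
                for xx in range(max(0, x - r), min(w, x + r + 1))
            ))
    return out


def erode(mask, k=3):
    """Binary erosion; pixels outside the image count as background."""
    h, w = len(mask), len(mask[0])
    r = k // 2
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            # A pixel survives only if the whole window is foreground.
            out[y][x] = int(all(
                0 <= yy < h and 0 <= xx < w and mask[yy][xx]
                for yy in range(y - r, y + r + 1)
                for xx in range(x - r, x + r + 1)
            ))
    return out


def threshold(gray, t=127):
    """Map a grayscale image to black (0) and white (255)."""
    return [[255 if p > t else 0 for p in row] for row in gray]


# A 3x3 blob with a one-pixel hole in its centre.
mask = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 0, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0],
]
closed = erode(dilate(mask))  # dilation followed by erosion fills the hole
for row in closed:
    print(row)
```

Dilating first and eroding afterwards, in the order described above, closes small holes inside the hand region while roughly restoring its outline.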

We obtain the trajectory from the maximum-contour centers of all video frames. This trajectory is the feature. The number of feature points equals the number of video frames, which depends on the length of the video. The overall process is depicted in Fig. 1.
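To make the trajectory feature concrete, the sketch below computes one center per frame and collects the centers into a list. As a simplifying assumption, the largest 4-connected foreground region and its mean pixel coordinate stand in for the maximum-area contour and its center; the function names largest_component_centroid and trajectory are hypothetical, and frames without any foreground are simply skipped.

```python
from collections import deque


def largest_component_centroid(mask):
    """Return (cx, cy) of the largest 4-connected foreground region, or None."""
    h, w = len(mask), len(mask[0])
    seen = [[False] * w for _ in range(h)]
    best = []
    for y in range(h):
        for x in range(w):
            if not mask[y][x] or seen[y][x]:
                continue
            # Breadth-first flood fill collects one connected region.
            region, queue = [], deque([(x, y)])
            seen[y][x] = True
            while queue:
                px, py = queue.popleft()
                region.append((px, py))
                for nx, ny in ((px + 1, py), (px - 1, py),
                               (px, py + 1), (px, py - 1)):
                    if (0 <= nx < w and 0 <= ny < h
                            and mask[ny][nx] and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((nx, ny))
            if len(region) > len(best):
                best = region
    if not best:
        return None
    cx = sum(p[0] for p in best) / len(best)
    cy = sum(p[1] for p in best) / len(best)
    return (cx, cy)


def trajectory(masks):
    """One centre per frame; frames with no foreground are skipped."""
    centres = (largest_component_centroid(m) for m in masks)
    return [c for c in centres if c is not None]


# Two tiny frames in which the large blob moves two pixels to the right.
frame_1 = [[1, 1, 0, 0],
           [1, 1, 0, 1],
           [0, 0, 0, 0]]
frame_2 = [[0, 0, 1, 1],
           [0, 0, 1, 1],
           [1, 0, 0, 0]]
print(trajectory([frame_1, frame_2]))
```

Because the trajectory has one point per frame, its length varies with the video length; how the features are brought to a common length before classification is not specified in the text and is not addressed by this sketch.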

Fig.1. Proposed algorithm for feature extraction

4. METHODOLOGY

In our experiment, we use the well-known SKIG dataset, which consists of 1080 RGB videos. We randomly divided the whole dataset into 3 folds. The dataset contains 10 classes of gestures: circle (clockwise), triangle (anti-clockwise), up-down, right-left, wave, "Z", cross, come here, turn around, and pat. In the collection process, all ten categories are performed with three hand postures: fist, index and flat [14]. We used two thirds of the dataset for training and the remaining third for testing. We applied our proposed algorithm to extract hand-crafted features from each video stream and fed them into the SVM for classification (Fig. 2). The training, testing and classification accuracies are depicted in Fig. 3.
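The random three-fold protocol can be sketched as follows. This is only an illustration of the splitting scheme: the function name three_fold_splits and the fixed seed are assumptions, the videos are represented by their indices, the split is not stratified by gesture class (the text does not say whether the original split was), and the SVM training step itself is omitted.

```python
import random


def three_fold_splits(n_items, n_folds=3, seed=0):
    """Randomly partition item indices into folds and build train/test splits.

    Returns a list of (train_indices, test_indices) pairs, one per fold,
    where each fold serves once as the test set.
    """
    indices = list(range(n_items))
    random.Random(seed).shuffle(indices)
    # Deal the shuffled indices round-robin into the folds.
    folds = [indices[i::n_folds] for i in range(n_folds)]
    splits = []
    for i in range(n_folds):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        splits.append((train, test))
    return splits


splits = three_fold_splits(1080)
for fold_number, (train, test) in enumerate(splits, start=1):
    # With 1080 videos, each split has 720 training and 360 test videos.
    print(fold_number, len(train), len(test))
```

Each video is used for testing exactly once, so the three test accuracies can be averaged into a single figure for the whole dataset.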

Fig. 2. Classification with SVM

Fig. 3. Training, testing and classification accuracy

5. RESULTS

We achieved the same average accuracy, 98.61%, on both fold 3 and fold 2. When fold 3 is used for testing, folds 1 and 2 are used for training; when fold 2 is used for testing, folds 1 and 3 are used for training. The average accuracy is much lower on fold 1, at 75.83%. However, the overall average accuracy is 91.01%. These results are all shown in Table I. The average accuracy is better in this enhanced version of our proposed gesture recognition algorithm and, more importantly, the method is unique compared to other hand-crafted methods.
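The overall figure can be checked directly from the three per-fold accuracies quoted above. The short calculation below is only an arithmetic check; the variable names are chosen for illustration.

```python
# Per-fold test accuracies (%) as quoted in the text.
fold_accuracy = {"fold 1": 75.83, "fold 2": 98.61, "fold 3": 98.61}

# Unweighted mean over the three folds: 273.05 / 3 = 91.0166...
mean_accuracy = sum(fold_accuracy.values()) / len(fold_accuracy)

# Truncating to two decimals gives 91.01; rounding would give 91.02.
truncated = int(mean_accuracy * 100) / 100
rounded = round(mean_accuracy, 2)

print(mean_accuracy, truncated, rounded)
```

The reported 91.01 therefore corresponds to the mean truncated, rather than rounded, to two decimal places.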

Table II compares the classification accuracies (%) of the RGB channel on the SKIG dataset. The feature representations are shown in Fig. 3.

Fig. 3. Feature representations

TABLE I. ACCURACIES IN DIFFERENT FOLDS

TABLE II. COMPARISON OF CLASSIFICATION ACCURACIES (%) OF THE RGB CHANNEL ON SKIG DATASET

6. CONCLUSIONS AND FUTURE WORKS

In our proposed algorithm, image acquisition is done from RGB video data. We blur the video frames to remove background noise and then convert the images to the HSV color space. Some standard algorithms have been implemented for tracking and pose estimation. Finally, the hand gestures are recognized using the classification algorithm. Our algorithm aims to provide a better vision-based hand gesture recognition system that offers a unique solution to the problem. We have proposed a unique morphological algorithm for feature extraction, whose output is fed into an SVM to achieve higher accuracy. We plan to improve the overall gesture recognition accuracy by fusing RGB and depth information in our proposed method. We have also created our own dataset, which we aim to validate using our proposed algorithm, and we aim to use internet videos of gestures to benefit the computer vision community, since most videos are still in RGB.

Acknowledgments

This work was fully supported by Multimedia University, Cyberjaya, Malaysia.

References

[1] Mukti, F. A., Eswaran, C., Hashim, N., Ching, H. C., & Ayoobkhan, M. U. A. (2018). An Automated Grading System for Diabetic Retinopathy using Curvelet Transform and Hierarchical Classification. International Journal of Engineering & Technology, 7(2.15), 154-157.

[2] A.D. Bagdanov, A. Del Bimbo, L. Seidenari, and L. Usai, “Real-time hand status recognition from RGB-D imagery,” in Proceedings of the 21st International Conference on Pattern Recognition (ICPR ’12), pp. 2456–2459, November 2012.

[3] Ayoobkhan, M. U. A., Chikkannan, E., & Ramakrishnan, K. (2018). Feed-forward neural network-based predictive image coding for medical image compression. Arabian Journal for Science and Engineering, 43(8), 4239-4247.

[4] De Smedt, Quentin. "Dynamic hand gesture recognition-From traditional handcrafted to recent deep learning approaches." PhD diss., Université de Lille 1, Sciences et Technologies; CRIStAL UMR 9189, 2017.

[5] M. R. Malgireddy, J. J. Corso, S. Setlur, V. Govindaraju, and D. Mandalapu, “A framework for hand gesture recognition and spotting using sub-gesture modeling,” in Proceedings of the 20th International Conference on Pattern Recognition (ICPR ’10), pp. 3780–3783, August 2010.

[6] P. Suryanarayan, A. Subramanian, and D. Mandalapu, “Dynamic hand pose recognition using depth data,” in Proceedings of the 20th International Conference on Pattern Recognition (ICPR ’10), pp. 3105–3108, August 2010.

[7] S. Park, S. Yu, J. Kim, S. Kim, and S. Lee, “3D hand tracking using Kalman filter in depth space,” Eurasip Journal on Advances in Signal Processing, vol. 2012, no. 1, article 36, 2012.

[8] J. L. Raheja, A. Chaudhary, and K. Singal, “Tracking of fingertips and centers of palm using KINECT,” in Proceedings of the 2nd International Conference on Computational Intelligence, Modelling and Simulation (CIMSim ’11), pp. 248–252, September 2011.

[9] Y. Wang, C. Yang, X. Wu, S. Xu, and H. Li, “Kinect based dynamic hand gesture recognition algorithm research,” in Proceedings of the 4th International Conference on Intelligent Human-Machine Systems and Cybernetics (IHMSC ’12), pp. 274–279, August 2012.

[10] M. Panwar, “Hand gesture recognition based on shape parameters,” in Proceedings of the International Conference on Computing, Communication and Applications (ICCCA ’12), pp. 1–6, February 2012.

[11] Z. Y. Meng, J.-S. Pan, K.-K. Tseng, and W. Zheng, “Dominant points based hand finger counting for recognition under skin color extraction in hand gesture control system,” in Proceedings of the 6th International Conference on Genetic and Evolutionary Computing (ICGEC ’12), pp. 364–367, 2012.

[12] Amir A, Taba B, Berg DJ, Melano T, McKinstry JL, Di Nolfo C, Nayak TK, Andreopoulos A, Garreau G, Mendoza M, Kusnitz J. A Low Power, Fully Event-Based Gesture Recognition System. In CVPR 2017 Jul 1 (pp. 7388-7397).

[13] Ben Jmaa, Ahmed, et al. "A new approach for hand gestures recognition based on depth map captured by rgb-d camera." Computación y Sistemas 20.4 (2016): 709-721.

[14] Liu L, Shao L. “Learning Discriminative Representations from RGB-D Video Data.” In IJCAI 2013 Aug 3 (Vol. 1, p. 3).

[15] Fahmid F, Noramiza H, Junaidi A, “Vision-based Hand Gesture Recognition from RGB Video Data Using SVM” IWAIT IFMIA 2019 Jan 6

[16] Zhang, S., Wei, Z., Nie, J., Huang, L., Wang, S. and Li, Z., “A review on human activity recognition using vision-based method”, Journal of Healthcare Engineering, 2017.

[17] Pradyumna Narayana, Ross Beveridge, Bruce A. Draper; “The IEEE Conference on Computer Vision and Pattern Recognition” (CVPR), 2018, pp. 5235-5244

[18] Nagashree, R.N., Michahial, S., Aishwarya, G.N., Azeez, B.H., Jayalakshmi, M.R. and Rani, R.K., “Hand gesture recognition using support vector machine”, June 2015.

[19] Pisharady, P.K. and Saerbeck, M., “Recent methods and databases in vision-based hand gesture recognition: A review”, Computer Vision and Image Understanding, 141, pp.152-165, 2015.

[20] Appenrodt, J., Al-Hamadi, A. and Michaelis, B., “Data gathering for gesture recognition systems based on single color-, stereo color-and thermal cameras”, International Journal of Signal Processing, Image Processing and Pattern Recognition, 3(1), pp.37-50, 2010.

[21] Ahad, M.A.R., 2014, September. Datasets for Action, Gesture and Activity Analysis. In The 2nd International Conference on Intelligent Systems and Image Processing 2014 (ICISIP2014).

[22] Ibraheem, N.A., 2016. Finger Identification and Gesture Recognition Using Gaussian Classifier Model. International Journal of Applied Engineering Research, 11(10), pp.6924-6931.

[23] Narendra V. Jagtap1 Prof. R. K. Somani2 Prof. Pankaj Singh Parihar3

[24] Chen, Z.H., Kim, J.T., Liang, J., Zhang, J. and Yuan, Y.B., “Real-time hand gesture recognition using finger segmentation”, The Scientific World Journal, 2014.

[25] Miao, Q., Li, Y., Ouyang, W., Ma, Z., Xu, X., Shi, W., Cao, X., Liu, Z., Chai, X. and Liu, Z., “Multimodal Gesture Recognition Based on the ResC3D Network”, In ICCV Workshops (pp. 3047-3055), 2017 October.

[26] De Smedt, Q., 2017. Dynamic hand gesture recognition-From traditional handcrafted to recent deep learning approaches (Doctoral dissertation, Université de Lille 1, Sciences et Technologies; CRIStAL UMR 9189).

[27] Naidoo, S., Omlin, C.W. and Glaser, M., “Vision-based static hand gesture recognition using support vector machines”, University of Western Cape, Bellville, 1998.

[28] Zhi, D., 2018. “Depth Camera-Based Hand Gesture Recognition for Training a Robot to Perform Sign Language” (Doctoral dissertation, Université d'Ottawa/University of Ottawa).