
Recognizing & Interpreting Indian Sign Language Gesture for Human Robot Interaction


Academic year: 2020


Abstract- This paper describes a novel approach to recognizing Indian Sign Language (ISL) gestures for Humanoid Robot Interaction (HRI). An extensive approach is introduced for classifying ISL gestures, which provides an elegant way of interaction between the humanoid robot HOAP-2 and a human being. ISL gestures are used here explicitly as the communication medium for the humanoid robot. The approach involves several image processing techniques followed by a generic algorithm for feature extraction. Classification is based on the Euclidean distance metric. A concrete HRI system has been established for an initiation based learning mechanism. The real-time robotics simulation software WEBOTS has been adopted to simulate the classified ISL gestures on the HOAP-2 robot. JAVA based software has been developed to handle the entire HRI process.

Keywords- Indian Sign Language; Gesture; Real Time; Euclidean distance; WEBOTS simulation software; HOAP-2 humanoid robot.

I. INTRODUCTION

Gestures are used for communication in daily life. Sign language is used to exchange ideas, messages, and thoughts among deaf and speech-impaired people. The sign language used in India is called Indian Sign Language (ISL), just as American Sign Language (ASL) and British Sign Language (BSL) are tied to their own communities [1, 2]. ISL includes both static and dynamic gestures, and recognizing them poses several challenges: gestures may be two handed, both hands may move at times, one hand may move fast while the other moves slowly, hand shapes differ, and the hands may contact the body. Most of the ISL gestures considered here are dynamic, which makes humanoid robot interaction more complex. Unlike ASL gestures, ISL gestures are predominantly formed with both hands [1, 2].

Numerous approaches have been adopted for gesture based communication and for teaching by demonstration to humanoid robots. Programming by demonstration has emerged as a learning architecture in which a human demonstrator teaches a humanoid robot, and the robot generalizes the task to different contexts through probabilistic estimation [3]. It allows a humanoid robot to acquire new skills and goals by imitation learning and observation [5]. A mathematical framework has been proposed for reproducing gestures on a humanoid robot, in which the joint angle trajectories of hand motion are encapsulated in Hidden Markov Models (HMM) [4]. A probabilistic representation model based on Principal Component Analysis (PCA) and the Gaussian Mixture Model (GMM) has been used for incremental learning on a humanoid robot with a kinesthetic teaching process [6]. A learning algorithm based on Gaussian Mixture Regression (GMR) has been used to let a humanoid robot learn several tasks effectively from a set of kinesthetic demonstrations [7]. An imitation game based on the HMM has been used to teach a humanoid robot to recognize gestures [8]. Numerous gesture classification and recognition techniques have thus been reported that could be adapted for teaching by demonstration to a humanoid robot.

One prominent approach is vision based recognition [9], which captures visual information in the form of a feature vector. Several techniques have been developed for pattern classification of dynamic gestures in real time [10, 11]. Classifying moving gestures in Indian Sign Language while retaining all the information about the hand motions [10] is the most complicated task, since a dynamic gesture is a sequence of images. One approach relies predominantly on the 2-D locations of fingertips and palms [11]. Another elegant way of classifying gestures is based on hand trajectory, hand shape, and hand motion [12], with the Hidden Markov Model (HMM) [13] considered a robust classifier.

In this paper we propose a practical framework for ISL gesture based human robot interaction. A gesture based learning mechanism plays a significant role in letting a humanoid robot acquire new skills efficiently. ISL video is used to compose several hand gestures that train the HOAP-2 robot in real time. An orientation histogram feature is extracted for real-time classification using the Euclidean distance metric. Commands to the humanoid robot are given as joint angle values. A look-up table maps each human gesture to a humanoid robot gesture, which avoids the overhead of solving inverse kinematics in real time.

Anup Nandy, Soumik Mondal, Jay Shankar Prasad, Pavan Chakraborty and G. C. Nandi
Robotics & Artificial Intelligence Lab, Indian Institute of Information Technology Allahabad, India
{ nandy.anup , mondal.soumik }@gmail.com, { jsp , pavan , gcnandi }@iiita.ac.in

The rest of the paper is organized as follows.

Section II describes ISL gesture acquisition together with the image processing steps applied: background uniformity, image enhancement using histogram equalization, and noise elimination using Gaussian filtering. Section III deals with the ISL gesture classification approach, including feature selection and evaluation; it describes a statistical technique that computes the Euclidean distance between a new test ISL gesture and every gesture in the training set. Section IV introduces the real-time simulation software WEBOTS and the specification of HOAP-2. Section V illustrates how the humanoid robot HOAP-2 learns from the classified ISL gestures. Section VI presents the simulation results generated by HOAP-2 on the WEBOTS platform. Finally, conclusions and future work are given in Section VII, and references are included at the end of the paper.

II. GESTURE COLLECTION AND PREPROCESSING

ISL video has been captured in real time with a webcam for several dynamic gestures (i.e. sequences of frames), as shown in Fig. 1. The gestures were chosen arbitrarily from a standard ISL dictionary. An elementary image processing requirement is background uniformity: a dark background is chosen so that gray scale images can be handled effectively [1]. The dynamic ISL gestures were recorded under different illumination conditions, so histogram equalization is then applied for normalization. Histogram equalization is based on the CDF (Cumulative Distribution Function) shown in equation (1).

$$ s = T(r) = \int_{0}^{r} p_r(w)\,dw \qquad (1) $$

where $s$ is to be obtained in the form of contrast stretching and $p_r$ is the histogram of the probability. The discrete form of the transformation function for histogram equalization is given by equation (2).

$$ s_k = T(r_k) = \sum_{j=0}^{k} \frac{n_j}{n} \qquad (2) $$

where $0 \le r_k \le 1$ and $k = 0, 1, \ldots, L-1$, with $r_k$ = the normalized intensity value (gray level value), $L$ = number of gray levels in the image, $n_j$ = total number of pixels with gray level $r_j$, and $n$ = total number of pixels.

The input image is transformed to the output image using the transformation function $T(r_k)$. When the transformation function equals the CDF of the image, i.e. the normalized sum of the histogram, the output is the histogram-equalized version of the input image.
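As an illustration of the discrete transformation described above, the following minimal Python sketch maps every gray level through the normalized cumulative histogram. The function name `equalize_histogram`, the list-of-rows image format, and the rescaling of the result back to the range 0 to levels-1 are assumptions made for this example; they are not taken from the authors' JAVA software.

```python
def equalize_histogram(image, levels=256):
    """Equalize a grayscale image given as a list of rows of ints in [0, levels - 1]."""
    total = sum(len(row) for row in image)

    # Count the number of pixels at each gray level.
    hist = [0] * levels
    for row in image:
        for value in row:
            hist[value] += 1

    # The running sum of counts divided by the pixel total is the discrete CDF.
    # Assumption: the CDF value is rescaled to an integer gray level for output.
    mapping = []
    running = 0
    for count in hist:
        running += count
        mapping.append(round((levels - 1) * running / total))

    return [[mapping[value] for value in row] for row in image]


# A dark 2x4 test image with 8 gray levels.
dark = [[0, 0, 1, 1],
        [1, 2, 2, 3]]
equalized = equalize_histogram(dark, levels=8)
print(equalized)
```

In this toy input the gray levels occupy only the lower half of the range; after the mapping they are spread over most of it, which is the contrast stretching effect that the normalization step relies on.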

Next section deals with feature selection and classification.

Fig 1. ISL video gestures under different illumination conditions. For each gesture (Shoot, River, Recover, Quiet Down, Miss, Meet, Heap, Go) the figure shows the initial position and the final position.

III. ISL RECOGNITION APPROACH

Learning through extensive interaction with sign language gestures is a very complicated task for the humanoid robot HOAP-2. The robot must first understand an ISL gesture and then perform a specified task. Feature selection plays an important role in pattern classification [13]. The orientation of edges in the images is used as the feature vector of the entire ISL video. Direction histograms are computed with 18 and with 36 bins. The feature vector used for pattern classification is evaluated with the algorithm discussed in [17, 18].

3.1 Feature Extraction Algorithm

Step 1: Divide each image frame into grid cells of 60x40 pixels, and let P(x, y) be a point in a grid cell.

Step 2: Find the motion vector between frames I and I+1 for each grid cell of Step 1.

Step 3: Generate a direction histogram from these vectors using the gradient direction

$$ \theta = \tan^{-1}\left(\frac{dy}{dx}\right) \qquad (3) $$

where $dx$ and $dy$ are the components of a motion vector.

Step 5: Compute the mean feature F.

Step 6: Find the mean of the feature vector, F'.

3.2 Classification algorithm

Find the Euclidean distance.

Step 1: Apply all the above steps to the test gesture up to the feature vector computation. Let G be the resulting feature vector.

Step 2: Find the minimum distance between vector G and F' using

$$ d(G, F') = \sqrt{\sum_{i} \left(G_i - F'_i\right)^2} $$

Step 3: Identify the matched class from the result of Step 2.
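The direction histogram at the core of the feature extraction steps can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the motion vectors are taken as given (how they are estimated between frames is not specified here), directions are assumed to cover the full 0 to 360 degree range in equal-width bins, and the names `direction_histogram` and `mean_feature` are hypothetical.

```python
import math


def direction_histogram(vectors, bins=18):
    """Count motion vectors (dx, dy) per direction bin of equal angular width."""
    width = 360.0 / bins
    hist = [0] * bins
    for dx, dy in vectors:
        if dx == 0 and dy == 0:
            continue  # a grid cell without motion has no direction
        # Assumption: atan2 is used so that all four quadrants are distinguished.
        angle = math.degrees(math.atan2(dy, dx)) % 360.0
        hist[int(angle // width) % bins] += 1
    return hist


def mean_feature(histograms):
    """Average several histograms bin by bin into a single feature vector."""
    count = len(histograms)
    return [sum(column) / count for column in zip(*histograms)]


# One motion vector per grid cell for two consecutive frame pairs (toy values).
pair_1 = [(1, 0), (0, 1), (-1, 0), (0, -1), (1, 1)]
pair_2 = [(1, 0), (1, 0), (0, 0), (0, 1), (0, 1)]
feature = mean_feature([direction_histogram(pair_1), direction_histogram(pair_2)])
print(feature)
```

Passing `bins=36` instead of the default gives the finer of the two histogram resolutions mentioned in this section.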

We trained our classifier on twenty ISL gestures, some of which are shown in Fig. 1. Computing the direction of edges of an ISL gesture is fast enough for real-time use. The geometrical representation of the training gestures and a test gesture in feature space is shown in Fig. 2. This provides the real-time performance needed to train the humanoid robot HOAP-2.
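The minimum-distance decision can be illustrated with a short Python sketch. It assumes that each training class is summarized by one mean feature vector and that the test gesture is assigned to the class at the smallest Euclidean distance; the class labels and the three-bin vectors below are toy values chosen for the example, not data from the experiments.

```python
import math


def euclidean_distance(g, f):
    """Euclidean distance between two feature vectors of equal length."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(g, f)))


def classify(test_feature, class_means):
    """Return (label, distance) for the class mean closest to the test feature."""
    return min(
        ((label, euclidean_distance(test_feature, mean))
         for label, mean in class_means.items()),
        key=lambda pair: pair[1],
    )


# Toy class means with three histogram bins each (hypothetical values).
class_means = {"GO": [4.0, 1.0, 0.0], "MEET": [0.0, 3.0, 2.0]}
label, distance = classify([3.0, 1.0, 1.0], class_means)
print(label, distance)
```

Because only one distance per class is computed, the cost of a decision grows linearly with the number of trained gestures, which is consistent with the real-time use described above.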

Fig 2. Representation of the 2D feature space for ISL test and training gestures, where D3 is the minimum Euclidean distance, so the test gesture is classified as gesture A.


Fig 3. Flowchart of ISL gesture recognition for the simulation of HOAP-2

Fig. 3 shows the entire classification process together with the simulation of HOAP-2.

IV. WEBOTS & HOAP-2 SPECIFICATION

WEBOTS is robotics simulation software [15] that provides comprehensive facilities for modeling, programming (with C, C++ and Java) and simulating any kind of robot. It was developed for working with different algorithms in order to build different types of robots in different environments. Several libraries are available for controller programs, which are required to transfer the programs to the real robot. The controller program reads a Comma Separated Value (CSV) file, which is required for the various jobs executed by the HOAP-2 robot. Creating the CSV file programmatically poses some challenges. Several CSV files have been constructed for the humanoid robot. Fig. 4 shows the WEBOTS simulation platform, which retains all the modalities of HOAP-2 performance in real time.

HOAP-2 (Humanoid Open Architecture Platform) is a well known humanoid robot with two arms, two legs and other important body joints. Many applications can be built on HOAP-2 by controlling the movements of its different joints. Its mechanical architecture is well organized for effective coordination between all the body joints. Any type of motion is generated through a controller program, which reads a CSV file holding the joint angle of every joint of the robot at each time step. Every joint has a specific name and designation, and a unique column is reserved for it in the CSV file. Each pattern generated by the HOAP-2 controller carries useful information. Every CSV file has twenty-seven columns and serves a controller with twenty-five joints: two arms with five degrees of freedom each, two legs with six degrees of freedom each, a head with two degrees of freedom, and one body joint. HOAP-2 runs on a Real Time Linux platform and has been used in challenging applications such as imitation learning [16], human portrait drawing and teaching by human demonstration. The next section explores the learning technique on the WEBOTS platform using real-time JAVA based software.
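Generating such a motion file programmatically might look like the Python sketch below. It is a sketch under stated assumptions rather than the actual HOAP-2 file layout: the joint names are hypothetical labels built from the degree-of-freedom breakdown above, only the twenty-five joint columns are written because the two remaining columns are not described in the text, and the values are placeholders.

```python
import csv
import io

# Hypothetical joint names derived from the degree-of-freedom breakdown.
LIMBS = {"R_ARM": 5, "L_ARM": 5, "R_LEG": 6, "L_LEG": 6, "HEAD": 2, "BODY": 1}
JOINTS = [f"{limb}_{i}" for limb, dof in LIMBS.items() for i in range(1, dof + 1)]


def write_motion_csv(stream, frames):
    """Write a header row, then one row per time step; unset joints default to 0."""
    writer = csv.writer(stream, lineterminator="\n")
    writer.writerow(JOINTS)
    for frame in frames:
        writer.writerow([frame.get(joint, 0) for joint in JOINTS])


# Two time steps: a rest pose, then one arm joint and one head joint moved.
buffer = io.StringIO()
write_motion_csv(buffer, [{"R_ARM_1": 0}, {"R_ARM_1": 500, "HEAD_1": -120}])
rows = buffer.getvalue().splitlines()
print(len(JOINTS), rows[2])
```

Keeping one fixed column per joint means a gesture is fully described by its rows, so the controller only has to step through the file in order.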


Fig 4. Demonstration of HOAP-2 on WEBOTS platform.

V. LEARNING OF ISL GESTURE

challenging aspect incorporates the human robot interaction in any environment. The DFD (Data Flow Diagram) in Fig. 5 describes the learning process of the humanoid robot in detail. The JAVA based software is integrated with HOAP-2 in real time. Fig. 6 and Fig. 7 show the step by step process of learning ISL gestures. The software can capture any type of ISL gesture using a webcam. Data acquisition is followed by teaching the ISL gesture to the humanoid robot HOAP-2. The software robustly calculates the feature vector, performs classification, and runs the simulation on HOAP-2. The learning mechanism depends on correct classification of the ISL gesture, which plays a significant role throughout the simulation of HOAP-2.

Fig 5. Data Flow Diagram of Java based HRI software. Processes: Video Capture, Feature Extraction, Recognition of the Test Gesture, Simulation of Humanoid Robot. Data stores: Test Gesture Direction Histogram, Recognized Test Gesture. Data flows: User Command, Direction Histogram, Classification Result of the Test Gesture, Simulated Output (returned to the User).


Fig 6. Real time ISL video (gesture) capturing tool

For feature evaluation, the software represents the orientation of edges of an ISL gesture as histogram bins, using 18 and 36 bins of edge orientation respectively. The software provides a generic architecture for the learning mechanism of HOAP-2, which performs a specific gesture according to the newly classified ISL gesture. The details of the learned gestures are explored in the following sections.

Human robot interaction with HOAP-2 takes place through the following gestures performed by the robot:

a) BYE BYE. b) WALKING. c) MAY I PLEASE. d) GREETING. e) DRINKING. f) TRAFFIC. g) MAKING PHONE CALL. h) MARTIAL ART.

Each of these gestures is performed by the humanoid robot in association with a classified ISL gesture. This learning process marks an intelligent behavior of HOAP-2, which sustains its learning capability in any type of environment. The learning process is handled by the HOAP-2 robot controller, which reads a CSV file in order to perform the above gestures in real time. Every predefined gesture carries information about all the joints of the upper and lower body of the humanoid robot. The above gesture applications have been developed on the HOAP-2 robot by considering each joint angle precisely. Fig. 8(a) illustrates the gesture MAY I PLEASE on the WEBOTS platform using HOAP-2.
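The association between a classified ISL gesture and a predefined robot gesture can be pictured as a plain look-up table, as in the sketch below. The pairings and file names are purely hypothetical, since the text does not state which ISL class triggers which robot gesture; the point is only that a dictionary look-up replaces any inverse kinematics computation at run time.

```python
# Hypothetical pairing of recognized ISL classes with robot gesture CSV files.
GESTURE_TABLE = {
    "GO": "walking.csv",
    "QUIET DOWN": "may_i_please.csv",
}


def motion_file_for(label, table=GESTURE_TABLE):
    """Return the pre-computed joint-angle file registered for a classified gesture."""
    try:
        return table[label]
    except KeyError:
        # An unknown class is reported instead of sending an arbitrary motion.
        raise ValueError(f"no robot gesture registered for ISL class {label!r}") from None


print(motion_file_for("GO"))
```

Adding a new robot gesture then only requires a new CSV file and one more table entry.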

Fig 7. Java based software for HRI process

Fig 8. (a) MAY I PLEASE gesture and (b) TRAFFIC gesture

keeping all the pulses in order to execute the movement of the humanoid robot. The robot stays within the allowable joint range while performing the above gestures. The next section discusses the significance of the patterns generated by the HOAP-2 robot along with the simulation results.

VI. RESULT ANALYSIS

The classified ISL gestures, recognized with an average accuracy of 90%, are mapped to specific gesture based applications on the humanoid robot. Each gesture generates a pattern comprising movement of a body joint or of other parts of the body. Different classified ISL gesture patterns are presented in Fig. 8 and Fig. 9. The patterns defined above (Fig 10 & 11) correspond to the ISL gestures GO and QUIET DOWN, which are included among our predefined classes. Each pattern carries its own signature, which can be used as a learning entity for the humanoid robot HOAP-2. Every pattern is built from the orientation histogram feature vector with 18 bins and 36 bins respectively. The Y axis of a gesture pattern shows the distribution of the edge orientation values, whereas the X axis indicates the total number of samples of each histogram bin. The learning pattern of the humanoid robot is obtained by classifying the ISL gesture in real time, and it is constructed using CSV files as shown in Fig. 10 and Fig. 11. The gesture is performed by the humanoid robot HOAP-2 using the joint angle values stored in the CSV file. The joint angle values and the position command of the robot are related by the mathematical expression shown below.

$$ P = C \cdot \theta $$

where $P$ indicates the position command of the humanoid robot, expressed in pulses, $\theta$ is the joint angle in degrees, and $C$ is the conversion coefficient in pulse/deg. A HOAP-2 joint can move in both the clockwise and the anticlockwise direction, so the CSV controller needs to know which joint moves in which direction, with values changing in the +ve as well as the -ve direction. Each joint has a unique device ID and a unique column name in the CSV controller files. To move a particular joint, the robot needs to rotate a certain motor.
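The conversion from a joint angle to a position command can be written as a one-line helper. In this sketch the function name `position_command`, the explicit `direction` argument for the sign convention, and the coefficient of 100 pulse/deg are illustrative assumptions; the actual per-joint coefficients are not given in the text.

```python
def position_command(angle_deg, pulses_per_degree, direction=1):
    """Convert a joint angle in degrees to a position command in pulses.

    direction is +1 or -1 and encodes which way the joint motor turns.
    """
    if direction not in (1, -1):
        raise ValueError("direction must be +1 or -1")
    return round(direction * angle_deg * pulses_per_degree)


# Placeholder coefficient of 100 pulse/deg (not a documented HOAP-2 value).
forward = position_command(45, 100.0)
backward = position_command(45, 100.0, direction=-1)
print(forward, backward)
```

Each value produced this way would fill the cell of the corresponding joint column for one time step of the CSV file.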

Fig 8. Pattern for GO gesture

Fig 9. Pattern for QUIET DOWN gesture

Fig 10. MAY I PLEASE gesture for HOAP-2

Fig 11. TRAFFIC gesture for HOAP-2


directions. The Y axis of each pattern indicates the position values in counts (pulses), whereas the X axis indicates the time sequence, which increases monotonically along the trajectory. Fig. 12 illustrates the estimated manipulation of all the gestures.

Fig 12. Estimated manipulations of ISL gestures in 3-D

Fig. 12 depicts the mean of the estimated feature vector, the total number of gestures accepted, and the values of the average features computed. The 3-D plot shows that each gesture has its own values of the average edge orientation feature vector. These values are used for classification by computing the minimum distance from the mean of all the training gestures.

VII. CONCLUSION AND FUTURE WORK

Learning ISL gestures is a daunting task for HOAP-2, and generating the trajectory for a particular gesture was challenging. A comprehensive software architecture was used to perform all the image processing and classification computations. The correct trajectory generation and execution indicate that the classification rate over the training set of ISL gestures was satisfactory, with an average accuracy of 90%. The controller program for HOAP-2 used CSV files to generate those patterns. The JAVA based software associated each predefined ISL gesture with the trajectory generated by the humanoid robot HOAP-2.

Future work is to deploy the software directly for deaf and speech-impaired persons, so that it interprets ISL gestures and translates them into sentences. It will provide an efficient way of learning to cope with different circumstances. Real-time social interaction with persons with speech and hearing impairments would raise several issues in human computer interaction.

REFERENCES

[1] Tirthankar Dasgupta, Sambit Shukla, Sandeep Kumar, Synny Diwakar, Anupam Basu, "A Multilingual Multimedia Indian Sign Language Dictionary Tool", The 6th Workshop on Asian Language Resources, pp. 57-64, 2008.

[2] M.K. Bhuyan, D. Ghosh, P.K. Bora, "A Framework for Hand Gesture Recognition with Applications to Sign Language", India Conference, 2006 Annual IEEE, pp. 1-6, Sept. 2006.

[3] Sylvain Calinon, Florent Guenter and Aude Billard, "On Learning, Representing and Generalizing a Task in a Humanoid Robot", IEEE Trans. on Systems, Man and Cybernetics, Part B, Vol. 37, No. 2, pp. 286-298, 2007.

[4] Sylvain Calinon and Aude Billard, "Stochastic Gesture Production and Recognition Model for a Humanoid Robot", In Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Vol. 3, pp. 2769-2774, 2004.

[5] Sylvain Calinon, Florent Guenter and Aude Billard, "Goal-Directed Imitation in a Humanoid Robot", In Proceedings of the International Conference on Robotics and Automation (ICRA), pp. 299-304, 2005.

[6] M. Lopes, J. Santos-Victor, "Visual learning by imitation with motor representations", Systems, Man, and Cybernetics, Part B: Cybernetics, IEEE Transactions on, vol. 35, no. 3, pp. 438-449, June 2005.

[7] M. Hersch, F. Guenter, S. Calinon, A.G. Billard, "Learning Dynamical System Modulation for Constrained Reaching Tasks", 6th IEEE-RAS International Conference on Humanoid Robots, pp. 444-449, 4-6 Dec. 2006.

[8] M. Pantic, L.J.M. Rothkrantz, "Toward an affect-sensitive multimodal human-computer interaction", Proceedings of the IEEE, vol. 91, no. 9, pp. 1370-1390, Sept. 2003.

[9] D. Kelly, Reilly Delannoy, J. Mc Donald, and Markham, "A framework for continuous multimodal sign language recognition", In Proceedings of the 2009 International Conference on Multimodal Interfaces (Cambridge, Massachusetts, USA), pp. 351-358, November 02-04, 2009.

[10] Isaac Garcia Incertis, Jaime Gomez Garcia-Bermejo, Eduardo Zalama Casanova, "Hand Gesture Recognition for Deaf People Interfacing", 18th International Conference on Pattern Recognition (ICPR'06), Vol. 2, pp. 100-103, 2006.

[11] Thomas Coogan, George Awad, Junwei Han and Alistair Sutherland, "Real time hand gesture recognition including hand segmentation and tracking", ISVC 2006 - 2nd International Symposium on Visual Computing, 6-8 November 2006, Lake Tahoe, NV, USA. ISBN 978-3-540-48628-2.

[12] S. Kettebekov, M. Yeasin, R. Sharma, "Prosody based audiovisual coanalysis for coverbal gesture recognition", Multimedia, IEEE Transactions on, vol. 7, no. 2, pp. 234-242, April 2005.

[13] J S Prasad, G.C. Nandi, "Clustering Method Evaluation for Hidden Markov Model Based Real-Time Gesture Recognition", Advances in Recent Technologies in Communication and Computing, ARTCom '09, pp. 419-423, 27-28 Oct. 2009.

[14] J. Alon, V. Athitsos, Quan Yuan, S. Sclaroff, "A Unified Framework for Gesture Recognition and Spatiotemporal Gesture Segmentation", Pattern Analysis and Machine Intelligence, IEEE Transactions on, vol. 31, no. 9, pp. 1685-1699, Sept. 2009.

[15] Webots software, http://www.cyberbotics.com/products/webots/.

[16] M. A. Wood, J. J. Bryson, "Skill Acquisition Through Program-Level Imitation in a Real-Time Domain", Systems, Man and Cybernetics, Part B: Cybernetics, IEEE Transactions on, vol. 37, no. 2, pp. 272-285, April 2007.

[17] Anup Nandy, Jay Shankar Prasad, Pavan Chakraborty, G. C. Nandi, Soumik Mondal, "Classification of Indian Sign Language In Real Time", In the proceedings of International Journal on Computer Engineering and Information Technology (IJCEIT), Vol. 10, No. 15, pp. 52-57, Feb. 2010.

[18] Anup Nandy, Jay Shankar Prasad, Soumik Mondal, Pavan
