4.4 Applications of hand gesture recognition
4.4.3 Gesture driven interfaces
The development of hand-sensing devices has led to the consideration of the hand as a natural input medium for many tasks. Sturman (1992) provides an overview of the possibilities for the use of the hand as an input to interface systems, ranging from controlling multiple MIDI musical instruments to determining the actions of a computer-animated puppet. Sturman distinguishes between interfaces in which features of the hand are mapped continuously onto one or more features of the system being controlled, and those in which discrete positions or motions of the hand are recognised and used as input commands or 'tokens'. The features recognised by the latter may be static hand positions, or more complex gestures, and hence it is this category of interface which will be discussed in this section.
Vaananen and Bohm (1993) identify a number of advantages to be derived from the incorporation of gestures into control system interfaces. The most important is the natural mapping of some input tasks onto user gestures, which makes gesture-based interfaces easy to learn and use. Although a natural correspondence between gesture and task does not always exist, making use of such mappings when possible allows the construction of interfaces which combine gesture with other input modes to provide the user with more expressive power. The MIT Media Laboratory has developed a range of systems which demonstrate the capabilities of interfaces combining input paradigms such as speech, gesture and eye-tracking (Bolt 1980, Herranz 1992, Thorisson et al 1992).
Use of gestures in virtual reality systems
One of the areas where hand gestures are already widely used is in VR systems. 'Finger flying' has become a standard means of moving around in glove-based VR systems, whereby the user points in the direction they wish to move and uses the position of their thumb to control the speed of motion. Similarly simple hand configurations are often used as controls for manipulating virtual objects. To pick up an object the user moves their hand until the virtual image of their hand is inside the object, and then forms a fist. The object becomes attached to the virtual hand and will move with it until the user opens their hand (Fisher et al 1986). The GIVEN system (Vaananen and Bohm 1993) uses similar hand gestures to allow the user to control actions such as forward and backward motion, and returning to a starting location in the virtual environment. These examples serve to illustrate the manner in which gestures can be used to produce intuitive interfaces to VR systems.
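The two interactions described above can be sketched in a few lines of Python. This is an illustrative sketch only, not the implementation used by any of the cited systems: the names fly_velocity, Sphere and Grabber are hypothetical, the thumb reading is assumed to be normalised to the range 0 to 1, the linear mapping from thumb position to speed is an assumption, and virtual objects are simplified to spheres.

```python
import math
from dataclasses import dataclass


def fly_velocity(direction, thumb_bend, max_speed=1.0):
    """'Finger flying': move along the pointing direction.

    direction  -- (x, y, z) pointing direction of the hand, any length
    thumb_bend -- assumed sensor value, 0.0 (extended) to 1.0 (fully bent)
    The linear speed mapping below is an assumption, not taken from the text.
    """
    norm = math.sqrt(sum(c * c for c in direction))
    if norm == 0.0:
        return (0.0, 0.0, 0.0)
    bend = min(max(thumb_bend, 0.0), 1.0)   # clamp noisy sensor values
    speed = max_speed * (1.0 - bend)        # extended thumb gives full speed
    return tuple(speed * c / norm for c in direction)


@dataclass
class Sphere:
    """Hypothetical virtual object, simplified to a sphere."""
    centre: tuple
    radius: float

    def contains(self, point):
        return math.dist(self.centre, point) <= self.radius


class Grabber:
    """'Pick up': a fist formed inside an object attaches it to the hand."""

    def __init__(self):
        self.held = None
        self.offset = (0.0, 0.0, 0.0)
        self.was_fist = False

    def update(self, hand_pos, is_fist, objects):
        if is_fist and not self.was_fist and self.held is None:
            # The fist has just been formed: attach the first object
            # that contains the virtual hand, remembering its offset.
            for obj in objects:
                if obj.contains(hand_pos):
                    self.held = obj
                    self.offset = tuple(
                        c - h for c, h in zip(obj.centre, hand_pos))
                    break
        elif not is_fist:
            # Opening the hand releases whatever was held.
            self.held = None
        if self.held is not None:
            # A held object moves with the hand.
            self.held.centre = tuple(
                h + o for h, o in zip(hand_pos, self.offset))
        self.was_fist = is_fist
```

In this sketch update would be called once per glove sample. Attaching only on the transition from open hand to fist is a design choice made here so that a hand which is already closed does not pick up objects it happens to pass through.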
The 'finger flying' and 'pick up' gestures are extremely simple, consisting only of static hand configurations. Gesture recognition systems such as SLARTI offer the potential to incorporate more complex, temporal hand gestures into VR systems, thereby providing the user with a richer interface. One aspect of VR where gestures would be particularly well suited is in allowing the user to rapidly manipulate and modify the virtual objects in the system. VR pioneer Jaron Lanier has stated his desire to create 'user-malleable' systems in which the user can create new virtual objects and environments in a few seconds (Barlow 1990b). The desire for speed requires an expressive interface, which must also be easy to learn and use so as to impose minimal cognitive overheads on the user. A multi-modal interface consisting of voice and gesture would appear a suitable means of implementing such a system. Gestures form a natural means of describing some spatial properties of objects such as their shape, size and motion whilst voice can be used to define other attributes such as colour which are less naturally represented by hand gestures.
Use of gestures in robotic control
A further area in which gestural interfaces have been explored is the field of robotic control. Remote control of robots, either by a user with direct visual contact with the robot or via a teleoperator system, has potential for application to a wide range of tasks which cannot be accomplished by either a human agent or an autonomous robot. Examples include geographical exploration of the surfaces of other planets (McGreevy 1992), and manipulation of objects in a microscopic environment (Robinett 1992).
A major issue to be overcome in developing these systems is the need for an interface which provides a natural means for controlling the multiple degrees of freedom possessed by most robotic devices. Sturman (1992) reports that conventional devices such as dials and sliders are often not suitable for controlling more than a small number of parameters, and so more sophisticated interfaces may be required. The use of the hand as an input device (and particularly the use of hand gestures) can provide a more easily learnt and utilised interface. Several studies have been undertaken to examine the manner in which gestures can be used to control robots, based either on actual physical devices or on simulations of such robots.
Papper and Gigante (1993) explored the use of gestures to control a simulated robotic arm designed to mimic the structure of the human hand and arm. Hand gestures were used to provide various controls over the robot arm which could not be obtained by a direct mapping from the user's hand to the robot manipulator. For example, parts of the robot could be locked in place, the robot's wrist could be continuously rotated in a manner not possible with a human wrist, and a fine control mode could be entered to allow more precise control over the robot. It was found necessary to include an explicit 'clutch' gesture which toggled the activation of the gesture recognition component of the system, so as to avoid interpretation of unintentional gestures, and to allow the user to reposition their hand.
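The role of the 'clutch' gesture can be illustrated with a short Python sketch. This is not Papper and Gigante's implementation: the class name ClutchedRecogniser, the gesture labels and the choice to start with recognition active are all assumptions made for illustration.

```python
class ClutchedRecogniser:
    """Pass recognised gestures through only while the clutch is engaged."""

    def __init__(self, clutch_gesture="clutch", active=True):
        self.clutch_gesture = clutch_gesture  # hypothetical gesture label
        self.active = active                  # assumed initial state

    def feed(self, gesture):
        """Return the gesture to act on, or None if it should be ignored."""
        if gesture == self.clutch_gesture:
            # The clutch itself only toggles recognition on or off.
            self.active = not self.active
            return None
        # While disengaged, hand movements (for example repositioning
        # the hand) are not interpreted as commands.
        return gesture if self.active else None
```

Feeding the sequence 'lock', 'clutch', 'lock', 'clutch', 'fine' to such an object would act on the first 'lock' and on 'fine', and ignore the 'lock' made while the clutch is disengaged.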
Singh (1993) compared several different mappings for controlling a small robotic arm via a CyberGlove, and reported that a gesture-based control system was easier to learn and operate than systems based on direct mappings from the hand joints to robot joints. This agrees with results reported by Sturman (1992) from experiments based on controlling a simulated crane using a variety of different interfaces.
5 Spatial neural networks
It is assumed that the majority of readers of this thesis will already be familiar with at least the basic concepts involved in neural networks, and therefore do not require an introduction to the field of connectionism. However, it is also envisioned that some readers may lack background knowledge of this area, and so this chapter provides a limited introduction to the field, discussing the basic concepts underlying neural networks with a focus on the style of networks used in this research. For a more comprehensive review of neural networks see any of the many textbooks available on the topic – for example Wasserman (1989) or Hertz et al (1991). Section 5.1 provides a high-level overview of neural networks. Section 5.2 will be of most relevance to the reader already knowledgeable about neural networks as it describes the style of network and the experimental methodologies used in developing SLARTI. Section 5.3 discusses some of the properties of neural networks, particularly those which were of most relevance to the creation of the SLARTI system.
This chapter restricts its discussion to the use of neural networks for the recognition of spatial patterns rather than the classification of time-series. The latter issue is covered in Chapter 6.