Posture and gesture recognition, as presented in the last section, are mainly used for discrete (event-based) interaction. The user performs a certain gesture, the system recognizes it and triggers the system action related to the gesture. Nevertheless, the input data of full body interaction can also be used in a continuous way. For example, the hand motion can be used continuously to control a cursor in a GUI. All current operating systems are based on GUIs, usually controlled by mouse and keyboard or by finger touch. When porting them to full body interaction without handheld devices, there are several differences the interaction designer needs to be aware of. Although discrete full body interaction should be preferred for interaction in virtual environments (cf. Section 3.1) and most application scenarios investigated in this dissertation apply discrete interaction, there are cases in which it is worthwhile to implement continuous full body interaction, e.g. to control a GUI similarly to mouse and keyboard (cf. Section 5.3).
A basic GUI includes a pointer or cursor that the user controls, as well as graphical items that are displayed on the screen. The user can select an item by moving the pointer to it and confirming the selection, which is known as the point-and-click paradigm. Depending on the interaction modality, the mapping of the user input to cursor movements and the item selection differ. With conventional mouse interaction, the two-dimensional mouse movements are mapped to the screen according to the settings of the operating system. An item selection is confirmed by pressing a button on the mouse. With touch interaction, cursor movement and item selection can happen in a single step, when the finger or stylus touches the surface. For full body interaction without handheld devices, the cursor is usually controlled with one hand in the air (freehand interaction). However, there are various options for realizing the cursor mapping and the item selection. I describe several such options in the following sections.
2.4.1 Cursor Control
An intuitive way to realize cursor movement when interacting from a distance without handheld devices is pointing at interface items with one hand. To give feedback on the pointing position, the cursor usually has a graphical representation on the screen. In contrast to the conventional arrow representation in mouse interaction, it is common in freehand interaction to display a hand icon on the screen. There are different ways of mapping the hand position to a screen position, e.g. Vogel and Balakrishnan [182] distinguish between absolute ray-casting and relative pointing, which are illustrated in Figure 2.16. For ray-casting, the mapping is defined by the point at which a ray extended from the hand in pointing direction intersects with the screen (cf. Figure 2.16a). Users therefore point directly at the objects on screen, which is potentially more intuitive. Relative pointing applies an indirect mapping, in which hand positions relative to the user’s body are mapped to screen positions without taking the actual placement and dimensions of the screen into account (cf. Figure 2.16b).
Figure 2.16: Absolute ray-casting (a) and relative pointing (b) on a screen (light blue colored rectangle)
An indirect mapping therefore provides higher accuracy when standing farther away or pointing at a smaller screen, and allows for a more comfortable (lower) hand position when standing closer or pointing at a larger screen. For example, Vogel and Balakrishnan [182] measured an error rate of 22.5% for ray-casting in comparison to 3.5% for an indirect mapping in the task of pointing at small targets (16 mm) from a distance of 4 m. The higher hand position of the ray-casting technique represents a common problem of midair interaction, sometimes referred to as the “gorilla-arm effect” [67]. Because of the corresponding arm fatigue, the interaction gets less precise over time, and the user has to take more breaks or stop the interaction prematurely.
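The two mappings can be contrasted in a minimal sketch. It is not the implementation of Vogel and Balakrishnan [182]; the function names, the screen plane at z = 0, the body reference point and the `reach` parameter are assumptions made for illustration only.

```python
from typing import Optional, Tuple

Vec3 = Tuple[float, float, float]


def ray_cast_to_screen(hand: Vec3, direction: Vec3,
                       screen_z: float = 0.0) -> Optional[Tuple[float, float]]:
    """Absolute ray-casting: intersect the pointing ray with the plane z = screen_z.

    Returns the (x, y) hit point in world units, or None if the ray never
    reaches the screen plane. The placement of the screen is an assumption.
    """
    hx, hy, hz = hand
    dx, dy, dz = direction
    if abs(dz) < 1e-9:
        return None  # ray runs parallel to the screen plane
    t = (screen_z - hz) / dz
    if t <= 0:
        return None  # user points away from the screen
    return (hx + t * dx, hy + t * dy)


def _clamp01(value: float) -> float:
    return max(0.0, min(1.0, value))


def relative_pointing(hand: Vec3, body_ref: Vec3,
                      reach: float = 0.4) -> Tuple[float, float]:
    """Relative pointing: map the hand offset from a body reference point
    (e.g. the shoulder) to normalized screen coordinates in [0, 1].

    The screen placement and size are deliberately ignored; `reach` is a
    hypothetical half-width of the comfortable interaction zone in meters.
    """
    u = (hand[0] - body_ref[0]) / (2 * reach) + 0.5
    v = (hand[1] - body_ref[1]) / (2 * reach) + 0.5
    return (_clamp01(u), _clamp01(v))
```

In the ray-casting variant, small angular jitter of the hand is amplified by the distance to the screen, whereas the relative variant only depends on the size of the interaction zone in front of the body, which is consistent with the accuracy difference reported above.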
For small targets distributed on a larger screen, researchers further developed approaches in which the cursor does not target a single point on the screen but a (possibly adaptive) activation area [59, 168]. In this way, users do not have to point at the target precisely; a certain displacement is tolerated, as long as it does not favor another nearby target.
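A simple form of such an activation area can be sketched as follows. This is an illustrative assumption, not the technique of [59] or [168]: the cursor carries a fixed radius, and among all targets inside that radius the nearest one wins, so that a displacement never favors a neighboring target. The function name `pick_target` and the pixel coordinates are hypothetical.

```python
import math
from typing import Dict, Optional, Tuple


def pick_target(cursor: Tuple[float, float],
                targets: Dict[str, Tuple[float, float]],
                radius: float) -> Optional[str]:
    """Return the name of the nearest target center within `radius` of the
    cursor, or None if no target lies inside the activation area."""
    best_name: Optional[str] = None
    best_dist = float("inf")
    for name, (tx, ty) in targets.items():
        dist = math.hypot(tx - cursor[0], ty - cursor[1])
        if dist <= radius and dist < best_dist:
            best_name, best_dist = name, dist
    return best_name
```

An adaptive variant could shrink or grow `radius` depending on the local target density, which is left out of this sketch.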
2.4.2 Item Selection
With a working freehand pointing mechanism, the user is able to hover a cursor over GUI items. However, we still need a way to determine when the pointing actually indicates the selection of an item, which is called the Midas touch problem in the literature. In the case of mouse interaction, a selection can simply be triggered by a mouse click. For other interaction techniques, such as gaze, touch or freehand interaction, the mouse click has to be replaced by alternatives. Several solutions exist in the literature that can also be used for freehand interaction.
An easy way to take over the function of a traditional mouse click is an automatic selection after a certain dwell time on an indicated item. However, adapting the dwell time to a particular situation and an individual user is a great challenge. On the one hand, the dwell time needs to be long enough to avoid false alarms. On the other hand, it should be rather short in order not to slow down the interaction. For gaze-based GUI interaction, dwell times between 0.3 and 1 seconds are typically chosen and sometimes also adapted for expert users [115]. For full body interaction, longer dwell times are needed, as the interaction is slower as well. For example, Microsoft provides a dwell-based method for interaction in the Xbox Kinect GUI, in which the dwell time is set to about 1.3 seconds (Xbox Firmware 2.0.14719.0). Because of this fixed duration, a dwell-based approach has a clear limit in performance and offers little room for improvement through training. Nevertheless, dwell-based selection is considered to be easy to use for novice users and to offer a constantly low error rate.
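The dwell mechanism described above can be sketched as a small state machine. The class and method names are hypothetical, and the default of 1.3 seconds merely mirrors the Xbox Kinect value mentioned above; this is a sketch under these assumptions, not the implementation of any particular system.

```python
from typing import Optional


class DwellSelector:
    """Fires a selection once the cursor has hovered over the same item
    for at least `dwell_time` seconds."""

    def __init__(self, dwell_time: float = 1.3) -> None:
        self.dwell_time = dwell_time
        self._item: Optional[str] = None
        self._since = 0.0
        self._fired = False

    def update(self, hovered: Optional[str], now: float) -> Optional[str]:
        """Feed the currently hovered item (or None) and a timestamp in
        seconds. Returns the item on the frame it gets selected, else None."""
        if hovered != self._item:
            # Cursor moved to another item (or off all items): restart timer.
            self._item, self._since, self._fired = hovered, now, False
            return None
        if (hovered is not None and not self._fired
                and now - self._since >= self.dwell_time):
            self._fired = True  # select only once per continuous hover
            return hovered
        return None
```

The sketch makes the trade-off visible: every selection costs at least `dwell_time` seconds, regardless of how experienced the user is.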
Another solution to the Midas touch problem is the definition of a specific selection area on the screen. This solution is commonly known from text input systems with custom virtual keyboards. While Quikwriting [143] and Cirrin [116] require the user to move the cursor back to the center area of a virtual keyboard after indicating the character(s) with the cursor, Huckauf and Urbina [73] presented a writing system in which a character is selected by moving the gaze to the text input field after looking at the character. Similar to the dwell-based method, this approach does not require the user to perform a second task apart from moving the cursor on screen. In addition, it reduces the Midas touch problem, as a separate cursor movement is needed for the selection.
The last option for item selection is to require the user to perform a secondary action apart from the cursor movement. This should further reduce the Midas touch problem. Various researchers suggest adding a second input modality, e.g. Shoemaker et al. [161] propose a system in which the motion of the Nintendo Wii Remote controls the cursor movement, while pressing a button results in the selection. This is convenient, as the Wii Remote already contains the necessary button, but it would be quite awkward in a freehand system in which no handheld device is present. Markussen et al. [117] use hand motions for the cursor movement, but track markers on a glove worn by the users to detect taps with the index finger. While this seems to be a natural way of selection in freehand interaction, using gloves in spontaneous interactions with public displays would be undesirable, and a robust implementation using Kinect tracking without gloves is not feasible at the moment. Another solution is given by Ren et al. [148], who use a separate hand gesture for the selection. This secondary action consists of reaching with the hand for the onscreen item while still pointing at it.
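A reaching or pushing gesture of this kind could be detected from the tracked hand depth alone. The following sketch is an assumption for illustration and not the method of Ren et al. [148]: it fires when the hand moves toward the screen by at least `push_depth` meters within a sliding time window. The class name and both thresholds are hypothetical.

```python
from collections import deque
from typing import Deque, Tuple


class PushDetector:
    """Detects a quick forward movement of the hand toward the screen."""

    def __init__(self, push_depth: float = 0.15, window: float = 0.5) -> None:
        self.push_depth = push_depth  # required forward travel in meters
        self.window = window          # time window in seconds
        self._samples: Deque[Tuple[float, float]] = deque()

    def update(self, distance: float, now: float) -> bool:
        """Feed the current hand-to-screen distance (meters) and a timestamp
        (seconds). Returns True on the frame a push is recognized."""
        self._samples.append((now, distance))
        # Drop samples that are older than the sliding window.
        while now - self._samples[0][0] > self.window:
            self._samples.popleft()
        farthest = max(d for _, d in self._samples)
        if farthest - distance >= self.push_depth:
            self._samples.clear()  # avoid firing repeatedly for one push
            return True
        return False
```

Restricting the movement to a short time window is what separates a deliberate push from slow drift of the arm; combining the detector with the current cursor position yields the selected item.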
Dwell-based selection is already established in commercial applications; however, selection areas can provide a faster solution to the Midas touch problem with some input modalities. Secondary actions can achieve even faster selection but often rely on additional devices or special gear. The pushing gestures investigated by Ren et al. [148] promise similar advantages for freehand interaction while avoiding both.