5.2 The $3 Gesture Recognizer: Simple Gesture Recognition for Devices with 3D Accelerometers
5.2.2 Implementation
Extending the work by (Wobbrock and Wilson, 2007), we present a gesture recognizer that can recognize gestures from 3D acceleration data as input. To test our algorithm we used acceleration samples obtained from a Nintendo Wii Remote (WiiMote) (Lee, 2008). The WiiMote features an ADXL 330 accelerometer (Analog Devices Inc., 2012), and the acceleration data can be sent, as in our case, via a Bluetooth connection to a PC. Our algorithm is by no means limited to the WiiMote. It can be used on any acceleration-enabled device, for instance modern smartphones such as the Nokia N95 or Apple’s iPhone.
5.2.2.1 Gesture Trace
In contrast to (Schlömer et al., 2008), we do not modify or pre-process the raw acceleration data in any way (filtering, smoothing, etc.). To determine the current change in acceleration, we subtract the current acceleration value reported by the WiiMote from the previous one. We thus obtain an acceleration delta.
By summation of the acceleration deltas, we obtain a gesture trace T, which can be projected onto a 2D plane and plotted to obtain a graphical representation of the gesture (Kallio et al., 2006).
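The construction of the trace can be illustrated with a minimal sketch. It assumes that "summation" means a running (cumulative) sum of the deltas, and it follows the wording of the text for the sign of each delta (previous sample minus current sample). The function names and the sample values are hypothetical and chosen only for illustration.

```python
from itertools import accumulate

def acceleration_deltas(samples):
    """Delta between consecutive samples: previous minus current, as worded in the text."""
    return [tuple(p - c for p, c in zip(prev, cur))
            for prev, cur in zip(samples, samples[1:])]

def gesture_trace(samples):
    """Running sum of the deltas (assumed reading of 'summation of the deltas')."""
    def add(a, b):
        return tuple(x + y for x, y in zip(a, b))
    return list(accumulate(acceleration_deltas(samples), add))

# Three made-up (x, y, z) accelerometer readings.
samples = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0), (1.0, 2.0, 1.0)]
trace = gesture_trace(samples)
```

With three samples the sketch yields two deltas and therefore a trace of two points.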
Figure 5.4: UML object diagram showing the relationship between the gesture class library, gesture classes and gesture traces. Both associations are one-to-many (1 to *).
5.2.2.2 Gesture Class Library
The gesture class library L contains a predefined number of gesture traces for each gesture class G. We also refer to these traces as training gestures.
Figure 5.4 depicts a graphical representation of these relationships.
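One possible way to mirror these relationships in code is sketched below. The class and method names (GestureClass, GestureClassLibrary, add_training_gesture, all_templates) are hypothetical and not taken from the original implementation; the sketch only assumes that a library maps to many gesture classes and each class to many training traces.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterator, List, Tuple

Point = Tuple[float, float, float]
Trace = List[Point]

@dataclass
class GestureClass:
    """A named gesture class G with its training gestures."""
    name: str
    training_traces: List[Trace] = field(default_factory=list)

@dataclass
class GestureClassLibrary:
    """The library L: one entry per gesture class."""
    classes: Dict[str, GestureClass] = field(default_factory=dict)

    def add_training_gesture(self, class_name: str, trace: Trace) -> None:
        # Create the class on first use, then store the trace under it.
        gesture_class = self.classes.setdefault(class_name, GestureClass(class_name))
        gesture_class.training_traces.append(trace)

    def all_templates(self) -> Iterator[Tuple[str, Trace]]:
        # Yield (class name, trace) pairs for every training gesture in L.
        for gesture_class in self.classes.values():
            for trace in gesture_class.training_traces:
                yield gesture_class.name, trace
```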
5.2.2.3 Gesture Recognition Problem
The basic task of our algorithm is to find the best matching gesture class G from a gesture class library L, for a given input gesture I. An example set of gesture classes is given in Figure 5.3.
To find a matching gesture class, we compare the trace t_i of I to the traces of all training gestures t_Gk ∈ L and generate a score table that lists the comparison score of t_i and each t_Gk. A heuristic is then applied to the score table to determine whether a gesture has been recognized.
The following sections describe the steps of the $3 Gesture Recognizer in detail.
Algorithm 1 Gesture Trace Resampling:
Input: gesture trace t, desired number N of sample points in the normalized trace
Output: length-normalized gesture trace t_N
Calculate the total length L of the gesture trace, then calculate the increment length I = L/(N−1)
n = 0
while n < N do
    start a new segment s_n
    while the current segment’s length l_sn < I do
        – add points of the original trace to the segment s_n
    end while
    – backtrack along the excessive distance l_sn − I along the original trace using the unit vector obtained from the difference of the last two points added to s_n
    – append s_n to t_N
    – n = n + 1
end while
Our resampling algorithm uses an approach slightly different from Wobbrock’s version. In this way, all points from the original gesture trace t are consumed, and we obtain a resampled trace t_N, which consists of N equidistant points.
5.2.2.4 Resampling
In order for our trace to be classifiable by the gesture recognition algorithm, it needs to be resampled so that the gesture trace has a fixed number N of equidistant sample points. This is because the gesture input duration and movement speeds can vary between users, even for the same intended gesture. In our case N = 150, which is slightly above the average number of acceleration deltas received while users enter a gesture with the WiiMote. Setting N to a lower value decreases the gesture recognition precision, while choosing a higher N just increases the computation time for gesture recognition, without a significant gain in accuracy.
Algorithm 1 shows a pseudocode representation of the resampling algorithm we developed.
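The idea of resampling to N equidistant points can be sketched as follows. This is a standard arc-length resampling by linear interpolation, not a line-by-line transcription of Algorithm 1; the function name resample and the example traces are assumptions made for illustration.

```python
import math

def resample(trace, n):
    """Return n points spaced equally along the polyline `trace` (arc-length resampling)."""
    if n < 2 or len(trace) < 2:
        raise ValueError("need at least two input points and n >= 2")
    seg_lengths = [math.dist(a, b) for a, b in zip(trace, trace[1:])]
    total = sum(seg_lengths)
    if total == 0:
        # Degenerate trace: every point coincides.
        return [tuple(trace[0])] * n
    step = total / (n - 1)          # increment length I = L / (N - 1)
    resampled = [tuple(trace[0])]
    seg = 0                         # index of the segment containing the target
    seg_start = 0.0                 # arc length at the start of that segment
    for k in range(1, n - 1):
        target = k * step
        # Advance to the segment that contains the target arc length.
        while seg < len(seg_lengths) - 1 and seg_start + seg_lengths[seg] < target:
            seg_start += seg_lengths[seg]
            seg += 1
        a, b = trace[seg], trace[seg + 1]
        t = (target - seg_start) / seg_lengths[seg] if seg_lengths[seg] > 0 else 0.0
        resampled.append(tuple(p + t * (q - p) for p, q in zip(a, b)))
    resampled.append(tuple(trace[-1]))
    return resampled

straight = resample([(0.0, 0.0, 0.0), (10.0, 0.0, 0.0)], 5)
corner = resample([(0.0, 0.0, 0.0), (2.0, 0.0, 0.0), (2.0, 2.0, 0.0)], 3)
```

In the chapter's setting the call would use n = 150; the small values here only keep the example readable.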
5.2.2.5 Rotation to “Indicative Angle” and Rescaling
To correct for rotational errors during gesture entry, the resampled gesture trace T_N is rotated once along the gesture’s indicative angle. Like Wobbrock, we define the indicative angle as the angle between the gesture’s first point p_0 and its centroid c = (x̄, ȳ, z̄). The angle is determined by taking the arccosine of the normalized scalar product of p_0 and c:
\[ \theta = \arccos\left( \frac{p_0 \cdot c}{\lVert p_0 \rVert \, \lVert c \rVert} \right) \tag{5.23} \]
The rotation along the indicative angle is then performed using the unit vector of the vector orthogonal to p_0 and c. The orthogonal vector is obtained using the cross product of p_0 and c:
\[ v_{\mathrm{axis}} = \frac{p_0 \times c}{\lVert p_0 \times c \rVert} \tag{5.24} \]
Using v_axis as axis and θ as angle, we generate a rotation matrix for rotation around an arbitrary axis to rotate all points of T_N to obtain T_Nθ.
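Equations 5.23 and 5.24 and the subsequent rotation can be sketched as below. The rotation about an arbitrary axis is written with Rodrigues' rotation formula, which is equivalent to applying the corresponding rotation matrix. The function names are hypothetical, and the sign convention (a right-handed rotation by +θ) is an assumption, since the text does not state the direction of rotation.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def norm(a):
    return math.sqrt(dot(a, a))

def centroid(points):
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(3))

def indicative_angle_and_axis(points):
    """Return (theta, v_axis) per Equations 5.23 and 5.24; axis is None if undefined."""
    p0, c = points[0], centroid(points)
    denom = norm(p0) * norm(c)
    if denom == 0:
        return 0.0, None
    # Clamp to [-1, 1] to guard against rounding before acos.
    theta = math.acos(max(-1.0, min(1.0, dot(p0, c) / denom)))
    n = cross(p0, c)
    length = norm(n)
    if length == 0:                 # p0 and c are parallel: no unique axis
        return theta, None
    return theta, tuple(x / length for x in n)

def rotate_about_axis(v, axis, theta):
    """Rodrigues' formula: rotate v by theta around the unit vector axis."""
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    k_cross_v = cross(axis, v)
    k_dot_v = dot(axis, v)
    return tuple(v[i] * cos_t + k_cross_v[i] * sin_t + axis[i] * k_dot_v * (1 - cos_t)
                 for i in range(3))

def rotate_trace(points):
    theta, axis = indicative_angle_and_axis(points)
    if axis is None:
        return [tuple(p) for p in points]
    return [rotate_about_axis(p, axis, theta) for p in points]
```

The degenerate cases (a zero-length p_0 or c, or p_0 parallel to c) are handled here by leaving the trace unrotated, which is a choice of this sketch rather than something the text prescribes.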
After rotation, T_Nθ is scaled to fit in a normalized cube of 100³ units, to compensate for scaling differences between gestures. The algorithm has now finished pre-processing the original user input and has obtained a gesture T_M, which is ready for matching with candidate gestures from the gesture class library.
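A minimal sketch of the rescaling step follows. The text does not say whether the scaling is uniform or applied per axis, nor where the cube is anchored; this sketch assumes uniform scaling by the largest bounding-box extent and moves the bounding box's minimum corner to the origin. The function name scale_to_cube is hypothetical.

```python
def scale_to_cube(points, side=100.0):
    """Uniformly scale points so their bounding box fits a cube of the given side length."""
    mins = [min(p[i] for p in points) for i in range(3)]
    maxs = [max(p[i] for p in points) for i in range(3)]
    extent = max(hi - lo for lo, hi in zip(mins, maxs))
    if extent == 0:
        # All points coincide; nothing to scale.
        return [(0.0, 0.0, 0.0) for _ in points]
    factor = side / extent
    return [tuple((p[i] - mins[i]) * factor for i in range(3)) for p in points]

scaled = scale_to_cube([(0.0, 0.0, 0.0), (2.0, 1.0, 0.0)])
```

Uniform scaling preserves the proportions of the gesture and avoids dividing by a zero extent when a gesture is flat along one axis.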
5.2.2.6 Golden Section Search for Minimum Distance at Best Angle
Like Wobbrock, we use the average MSE (Mean Square Error) to calculate the path distance d between T_M and a candidate gesture from the gesture class library. We convert the path distance to a [0, 1] scale using a version of Wobbrock’s scoring equation adapted to three dimensions, where d signifies the path distance and l the side length of the cube that T_M was scaled to in the rescaling step.
\[ \mathrm{Score} = 1 - \frac{d}{0.5\,\sqrt{3\,l^{2}}} \tag{5.25} \]
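The scoring equation can be sketched in a few lines. The denominator of Equation 5.25 is half the space diagonal of the cube. Note one assumption: the text names the mean square error as the distance measure, whereas this sketch uses the mean Euclidean distance between corresponding points, as in the $1 recognizer's path distance, because that is a length and therefore comparable to the half diagonal. The function names are hypothetical.

```python
import math

def path_distance(a, b):
    """Mean Euclidean distance between corresponding points (assumed distance measure)."""
    if len(a) != len(b) or not a:
        raise ValueError("traces must be non-empty and of equal length")
    return sum(math.dist(p, q) for p, q in zip(a, b)) / len(a)

def score(d, side):
    """Equation 5.25: map a path distance d to a score using the cube side length."""
    return 1.0 - d / (0.5 * math.sqrt(3.0 * side ** 2))

a = [(0.0, 0.0, 0.0), (0.0, 0.0, 0.0)]
b = [(3.0, 4.0, 0.0), (0.0, 0.0, 0.0)]
d = path_distance(a, b)      # point distances are 5 and 0, so d = 2.5
s = score(d, 100.0)
```

Identical traces give d = 0 and therefore a score of 1.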
Following Wobbrock’s discussion of rotation invariance of path distances, we have adapted a Golden Section Search (GSS) using the Golden Ratio φ = 0.5(−1+√5) to approximate the local minimum path distance within an angular range of [−180° ... 180°], for rotation around the three axes of the coordinate system, signified by the angles α, β and γ. We define a minimum cutoff angle for GSS of 2°, in order
to guarantee that the approximate minimum is found after exactly 11 iterations of GSS. We chose a rather large angular search range as we could not confirm that Wobbrock’s observation that the minimum distance consistently occurs within ±45° from the indicative angle applies for gestures in 3D space.
It seems very likely that this does not hold true for gestures input in 3D⁴. Figure 5.5 shows the presence of a very distinct local minimum in the vicinity of the indicative angle. Notice the steep drop-off in distance in the vicinity of α = β = 180°, which represents the center of the search space.
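The relationship between the 360° search range, the 2° cutoff and the 11 iterations can be checked with a one-dimensional sketch of Golden Section Search. Each iteration shrinks the bracket by the factor φ ≈ 0.618, and 360 · φ^11 ≈ 1.81 is the first bracket width at or below 2°. The sketch searches a single angle only; how the three angles α, β and γ are combined is not specified here, and the toy distance function is an assumption for illustration.

```python
import math

PHI = 0.5 * (math.sqrt(5.0) - 1.0)   # golden ratio conjugate, about 0.618

def golden_section_search(f, lo, hi, cutoff):
    """Shrink [lo, hi] around a minimum of f until it is no wider than cutoff."""
    x1 = PHI * lo + (1.0 - PHI) * hi
    x2 = (1.0 - PHI) * lo + PHI * hi
    f1, f2 = f(x1), f(x2)
    iterations = 0
    while hi - lo > cutoff:
        if f1 < f2:
            # Minimum lies in [lo, x2]; the old x1 becomes the new x2.
            hi, x2, f2 = x2, x1, f1
            x1 = PHI * lo + (1.0 - PHI) * hi
            f1 = f(x1)
        else:
            # Minimum lies in [x1, hi]; the old x2 becomes the new x1.
            lo, x1, f1 = x1, x2, f2
            x2 = (1.0 - PHI) * lo + PHI * hi
            f2 = f(x2)
        iterations += 1
    return 0.5 * (lo + hi), iterations

# Toy unimodal "distance" with its minimum at 30 degrees.
best_angle, iterations = golden_section_search(lambda a: (a - 30.0) ** 2, -180.0, 180.0, 2.0)
```

For any unimodal function on this range the loop runs the same number of times, because the bracket width depends only on the iteration count.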
The GSS-based minimum distance approximation is repeated for each trace for every gesture class in the gesture class library. We thus obtain a table of scores with classes of likely matches.
Figure 5.5: A 3D height map of the classification distance depending on the rotation of the gesture trace. The vertical axis represents distance, whereas the two horizontal axes represent the rotation angles α and β. In this projection, γ is fixed at 0°. Section 5.3 contains a closed-form solution to the problem of finding the optimal rotation.
5.2.2.7 Scoring Heuristic
Wobbrock’s original algorithm did not feature a heuristic to reduce the occurrence of false positives, which is a common problem for simple gesture recognition algorithms operating on large gesture vocabularies (Wobbrock and Wilson, 2007).
The matches obtained from gestures entered as 3D acceleration data are not as precise as strokes entered on a touch screen. To compensate for the weaker matches, we have developed our own scoring heuristic, which processes the score table described in the previous section.
⁴After the $3GR, we developed Protractor3D, which solves the matter of finding the correct rotation for 3D gestures by applying a closed-form solution to the problem; see Section 5.3 for more details.
Using this heuristic, we achieved a considerable reduction of false positive recognitions compared to Wobbrock’s original strategy of selecting the gesture candidate with the highest matching score to determine the recognized gesture. After sorting the score table by maximum score, our heuristic determines the recognized gesture with the following rules:
Algorithm 2 Scoring Heuristic:
Input: score table S containing the comparison scores between the input gesture and the templates in L
Output: the ID of the recognized gesture, or “gesture not recognized”
– ε is defined as the threshold score.
– if the highest-scoring candidate in the score table has a score > 1.1ε, then return this candidate’s gesture ID.
– else if, within the top three candidates in the score table, two candidates exist that belong to the same gesture class and each have a score > 0.95ε, then return the gesture ID belonging to these two candidates.
– else, return “gesture not recognized”.
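The rules of Algorithm 2 can be sketched as follows. The representation of the score table as a list of (gesture ID, score) pairs, the function name recognize, the use of None for "gesture not recognized" and the threshold value in the example are all assumptions made for illustration.

```python
from collections import Counter

def recognize(score_table, epsilon):
    """Apply the scoring heuristic; return a gesture ID or None if nothing is recognized."""
    ranked = sorted(score_table, key=lambda entry: entry[1], reverse=True)
    if not ranked:
        return None
    best_id, best_score = ranked[0]
    # Rule 1: a single clearly strong candidate.
    if best_score > 1.1 * epsilon:
        return best_id
    # Rule 2: two of the top three candidates share a class and both exceed 0.95 * epsilon.
    strong = Counter(gid for gid, s in ranked[:3] if s > 0.95 * epsilon)
    for gid, count in strong.items():
        if count >= 2:
            return gid
    # Rule 3: no rule matched.
    return None

epsilon = 0.5   # hypothetical threshold score
clear_winner = recognize([("circle", 0.60), ("line", 0.40)], epsilon)
agreement = recognize([("circle", 0.50), ("line", 0.49), ("line", 0.48)], epsilon)
rejected = recognize([("circle", 0.50), ("line", 0.49), ("square", 0.48)], epsilon)
```

In the second call the top candidate does not pass the 1.1ε test, but the two "line" templates agree, so the second rule decides the result.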