Category 4: System Response (Output)
4.4 Experiment
4.4.3 Variables
Independent Variables
• Between-Groups - Error Rates: 0%, 10%, 20%, 30%, 40%. Errors were distributed at pre-determined intervals throughout each block. Each error rate was based on the number of errors present in each set of tasks within a block and was held constant for each session.
• Interaction Context: The keyboard was placed in one of two locations to create the two interaction conditions: In front of the monitor for the desktop condition and at the far right side of the desk for the ubiquitous condition.
• Task criticality: The critical task condition presented a timer on the screen and required the participants to complete the tasks as quickly as possible, while the non-critical condition did not.
• Within-Groups - Tasks: We used 4 separate blocks of trials, each consisting of several tasks to work on for the duration of the session (see Table 4.1).
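The placement of errors at pre-determined intervals can be sketched as follows. This is only an illustration: the number of gesture trials per block, the even-spacing rule and the function name `error_positions` are assumptions, since the text does not specify the actual schedule used in the study.

```python
def error_positions(n_trials: int, error_rate_percent: int) -> list[int]:
    """Return 0-based trial indices at which a recognition error is injected.

    Assumption: errors are spread as evenly as possible across the block;
    the study's real interval scheme is not given in the text.
    """
    if n_trials < 0 or not 0 <= error_rate_percent <= 100:
        raise ValueError("n_trials must be >= 0 and the rate within 0..100")
    n_errors = round(n_trials * error_rate_percent / 100)
    # Centre each error inside its own equal-width slice of the block.
    return [((2 * i + 1) * n_trials) // (2 * n_errors) for i in range(n_errors)]


# Hypothetical block of 10 gesture trials at each error rate used in the study.
for rate in (0, 10, 20, 30, 40):
    print(rate, error_positions(10, rate))
```

Fixing the positions in advance, rather than sampling them at random, keeps the error rate identical for every participant in a group, which is consistent with the rate being held constant for each session.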
Dependent Variable
Tolerance: We calculated user tolerance as the percentage of gestures used to advance the slides relative to keys pressed during the trials, so that the more often a participant chose the keyboard, the lower the tolerance level for errors in gesture recognition. Tolerance was measured for each of the four tasks used in the blocks.
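As a minimal sketch, the tolerance measure could be computed as below. It assumes that tolerance is the share of gestures among all slide-advance actions (gestures plus key presses) for one task; the exact denominator is not stated in the text, and the function name `tolerance_percent` is hypothetical.

```python
def tolerance_percent(gesture_count: int, key_press_count: int) -> float:
    """Percentage of slide advances made by gesture rather than by key press.

    Assumption: the denominator is gestures + key presses for a single task.
    """
    total = gesture_count + key_press_count
    if gesture_count < 0 or key_press_count < 0 or total == 0:
        raise ValueError("counts must be non-negative and not both zero")
    return 100.0 * gesture_count / total


# Hypothetical participant: 18 gesture advances and 2 key presses in one task.
print(tolerance_percent(18, 2))
```

Under this reading, a participant who never touches the keyboard scores 100, and every key press lowers the score.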
Subjective and qualitative data. We recorded participant responses from the post-experiment questionnaires to obtain the following subjective results:
• User satisfaction or frustration with the gestures
• User confidence in the gestures: Rated from low to high
• Overall impression of the system: Rated from terrible to wonderful
• Confidence in the gesture system: Rated from low to high
• Perceived accuracy of the system: Rated from low to high.
We observed participants during the trials and conducted interviews after the sessions to obtain qualitative data. We discuss the subjective ratings in the discussion section, along with observations from the pilot study and results from the experiment sessions.
4.4.4 Results
We planned to use the complete set of error rates (0-40%) in both the desktop and ubiquitous computing conditions. However, we discovered that once users began to experience any errors in the desktop condition, tolerance levels were so low that most preferred to use the keyboard over the gestures. Thus, we did not complete the trials in the desktop condition for error rates over 10%. We ran a repeated-measures analysis of variance (ANOVA) on all of the data as an incomplete factorial analysis. Results are discussed next.
4.4.4.1 Within-Group effects
Figure 4.8: The profile plot shows the tolerance levels for the different tasks (within-group) in the two keyboard locations.
Tolerance. Differences in tolerance levels between the four tasks were significant (F(1,14)=8.995, p<.01), showing a trend of decreasing tolerance with each successive task (means: task 1 = 97.53, task 2 = 88.46, task 3 = 87.23, task 4 = 84.20).
Tolerance levels for the four tasks showed a within-group interaction effect with task characteristics, with tolerance increasing in the fourth task in the critical condition but not in the non-critical condition (F(1,14)=5.024, p<.05). Tolerance also showed an interaction effect with interaction context: in the ubiquitous condition, tolerance increased slightly in task 3 and then decreased in task 4 (F(1,14)=6.011, p<.05), as shown in Figure 4.8. Tolerance levels also appear to interact with error rate and interaction context together (F(1,14)=2.916, p<.05), showing that in the ubiquitous condition the increased tolerance in task 3 occurs in the 10% error condition.
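For readers unfamiliar with the within-group test, the following sketch computes the F statistic of a one-way repeated-measures ANOVA for a single within-subject factor (task). It is a simplified illustration, not the analysis reported here, which also involved between-group factors in an incomplete factorial design. The function name `rm_anova_oneway` and the scores are invented for illustration and are not data from the study.

```python
def rm_anova_oneway(scores: list[list[float]]) -> tuple[float, int, int]:
    """One-way repeated-measures ANOVA.

    scores[i][j] is the score of subject i under condition j.
    Returns (F, df_condition, df_error). Fails with ZeroDivisionError if the
    residual variation is exactly zero.
    """
    n, k = len(scores), len(scores[0])
    if n < 2 or k < 2 or any(len(row) != k for row in scores):
        raise ValueError("need >= 2 subjects, >= 2 conditions, equal row lengths")
    grand_mean = sum(sum(row) for row in scores) / (n * k)
    cond_means = [sum(row[j] for row in scores) / n for j in range(k)]
    subj_means = [sum(row) / k for row in scores]
    ss_total = sum((x - grand_mean) ** 2 for row in scores for x in row)
    ss_cond = n * sum((m - grand_mean) ** 2 for m in cond_means)
    ss_subj = k * sum((m - grand_mean) ** 2 for m in subj_means)
    # Removing between-subject variation is what makes this "repeated measures".
    ss_error = ss_total - ss_cond - ss_subj
    df_cond, df_error = k - 1, (k - 1) * (n - 1)
    f_stat = (ss_cond / df_cond) / (ss_error / df_error)
    return f_stat, df_cond, df_error


# Invented tolerance scores: 4 hypothetical participants x 4 tasks.
made_up = [
    [100.0, 90.0, 88.0, 85.0],
    [96.0, 88.0, 90.0, 82.0],
    [98.0, 85.0, 84.0, 86.0],
    [97.0, 91.0, 87.0, 83.0],
]
f_stat, df1, df2 = rm_anova_oneway(made_up)
print(f"F({df1},{df2}) = {f_stat:.3f}")
```

In practice a statistics package would be used instead, since it also reports p-values and handles mixed designs.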
4.4.4.2 Between-Group Effects
We found several interesting results in the between-group analysis, shown in Table 4.2. First, significant results were found for error rates, keyboard location and timing. To investigate the interaction effects, we conducted a second ANOVA to explore the two interaction contexts.
Table 4.2: The table presents the results of the between-group ANOVA.
Tolerance. Results from this ANOVA show that in the ubiquitous condition, there were no significant differences in the tolerance levels for any of the independent variables of error rate or task characteristics, suggesting that users are extremely tolerant of recognition errors in the ubiquitous condition. However, this does not occur in the desktop condition.
The desktop condition reveals significant differences in tolerance for error rates (F(1,4)=30.993, p<.05), such that tolerance had decreased significantly by the 10% error condition. Timing is also significant (F(1,4)=19.835, p<.05), showing a lower tolerance in the timed condition (mean=89.43) than in the non-critical condition (mean=97.03). We also found an interaction effect for error rate and timing (F(1,4)=16.857, p<.05), where tolerance levels appear to converge at the 0% error rate but diverge at the 10% error rate, with lower tolerance levels in the critical condition.
4.4.4.3 Subjective Results
Results suggest that there is a significant correlation between users' overall tolerance level and both their confidence in the gestures (.578 at the .01 level) and their overall impression of the system (.503 at the .05 level). Error rates showed a negative correlation with user confidence (-.405 at the .05 level) and a positive one with perceived accuracy (.617 at the .01 level). However, no correlation was noted between error rates and satisfaction, supporting our hypothesis that error tolerance depends more on other factors than on error rates alone. Mean values for satisfaction, overall impression and confidence are presented in Table 4.3.
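The text does not state which correlation coefficient was used. As a sketch under the assumption that it was Pearson's r, the coefficient between two per-participant measures could be computed as follows; the function name `pearson_r` and the sample values are invented for illustration.

```python
import math


def pearson_r(xs: list[float], ys: list[float]) -> float:
    """Pearson product-moment correlation between two equal-length samples."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two samples of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return cov / math.sqrt(var_x * var_y)


# Invented values: overall tolerance (%) and a confidence rating per participant.
tolerance = [98.0, 85.0, 92.0, 70.0, 88.0]
confidence = [8.0, 6.0, 7.0, 4.0, 7.0]
print(round(pearson_r(tolerance, confidence), 3))
```

If the ratings are treated as ordinal, a rank-based coefficient such as Spearman's would be the more usual choice.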
Table 4.3: The table shows the mean values of the subjective results for all participants.
Figure 4.9: The graph shows error bars for user satisfaction ratings according to error
Table 4.4: The table shows mean subjective results for the participants in the 0 and 10% error conditions.
Satisfaction. On average, participants rated their satisfaction with gestures higher in the ubiquitous condition than in the desktop condition; however, there was little difference between error rates. The critical condition showed a slightly lower satisfaction rating than the non-critical condition (mean timed: 6.07, mean not-timed: 6.14), as shown in Figure 4.9. Results suggest that a more satisfying interaction experience can be achieved provided that the time and effort taken to perform the gestures do not outweigh the benefits.
Impressions. Overall impressions were rated slightly higher in the desktop scenario than in the ubiquitous condition; however, this may be due to the additional error rates seen by participants in the ubiquitous condition. When we compare ratings for only the 0 and 10% error rates, the mean overall impression for the ubiquitous scenario (mean=7.0) is higher than for the desktop scenario (mean=6.0), as shown in Table 4.4.
4.4.5 Discussion
Interaction context. We considered two interaction contexts based on the location of an alternative input mode: desktop computing, where a direct controller is directly in front of the interaction space, and ubiquitous computing, where the keyboard is located away from the primary task. In this experiment, participants seemed to instinctively choose the keyboard over the gestures when it was close at hand. In this case, gestures not only provided less control than a direct input device, but also took more time to execute than a key press. We extend these results and suggest that gestures are most appropriate for situations in which the user sees a distinct benefit in having access to distance interactions or for extending the flexibility of desktop interactions.
Error rates. While 100% recognition accuracy of computer vision technology is not yet possible in everyday computing technology, this study suggests that error rates can potentially reach 40% before user tolerance levels fall off in the ubiquitous computing [...] computing input technology, gestures could still provide benefits and satisfying secondary task interactions for ubiquitous computing scenarios.
Task characteristics. While we considered several tasks in this experiment, the main characteristic investigated was the level of criticality of a task. Although we found only a slight increase in the tolerance for errors in the non-critical tasks, results suggest that gestures may not be an appropriate control for tasks that require a greater degree of precision or accuracy, for two reasons. First, the lack of precision in gesture recognition may not provide appropriate support for tasks that require a high level of accuracy. Second, due to the additional delays of processing perceptual input, a task could take longer to perform than when using a direct input device. Finally, although we did not specifically investigate the different task characteristics in this experiment, our results suggest that task characteristics may affect the tolerance users have for errors in gesture recognition; further investigation would be required to understand these differences.
4.5 Qualitative Analysis
This section discusses the qualitative results obtained through observations during the pilot study, the experiment, and the post-experiment interviews with the participants. These are organised according to the interaction model.
Interaction context. In this study, we considered two interaction contexts: desktop and ubiquitous computing. We note that in the pilot study, we tested error rates up to 60% in the ubiquitous condition before participants exhibited frustration with the gestures and chose the keyboard. In addition, we found that the gesture recognition responses could take over 4 seconds before participants would be forced to use the keyboard, and many of them continued to use the gestures in spite of this long delay. Again, this suggests that interaction context is a significant factor to consider when trying to determine whether gestures are an appropriate interaction technique.
Alternative interaction modes. Whatever the context in which gestures are considered as an interaction technique, most devices come equipped with remote controllers, buttons or other direct-input controls so that there is an override system in case of failure. This was reflected in our use of the keyboard as an alternative input mode. Thus, we note that providing users with an override to the gestures should be considered an essential feature of gesture interactions.