Quantitative User Study Experiment - Scalable natural user interfaces for data-intensive explor

Using the VR system, visualization designs, and motion data described above, we conducted an exploratory user study. Participants in the study were tasked with analyzing spatial and temporal relationships in the synthetic dataset of moving bumpy disc forms. Accuracy, speed, and confidence were assessed for each of the 8 fundamental visualization designs described earlier.

Hypotheses. To guide the exploratory study, we formulated a series of 5 core testable hypotheses. The first 4 test the design choices that we believe will be best for speed and for accuracy across both dimensions of the design space. The final hypothesis tests users’ ratings of confidence in the design.

• H1: Users will be faster to complete space-time analysis tasks for motion visualizations designed with the Animated Space strategy as compared to (a) Static Space and (b) Interactive Space strategies.

• H2: Users will make fewer errors in space-time analysis tasks for motion visualizations designed with the Interactive Space strategy as compared to (a) Static Space and (b) Animated Space strategies.

The rationale for these two hypotheses about the design choice for space is that, in previous experiments performed in other contexts, animated displays (i.e., using automatic camera rocking) have outperformed other designs in terms of the speed of analysis [95]. Although our hypothesis is that this finding will also hold in our situation, our study differs in several ways that are interesting to test (e.g., using data that fundamentally represent motions, using VR technology). With respect to accuracy, the closest related work did not find a significant difference between the different display conditions; however, we intuitively believe that users should be most accurate in analyzing this type of motion data when they have complete interactive control over the visualization.

• H3: Users will be faster to complete space-time analysis tasks for motion visualizations designed with the Static Time strategy as compared to (a) Interactive Time and (b) Animated Time strategies.

• H4: Users will make fewer errors in space-time analysis tasks for motion visualizations designed with the Interactive Time strategy as compared to (a) Static Time and (b) Animated Time strategies.

The rationale for these two hypotheses about the design choice for time is as follows. First, following a similar rationale as for space, we intuitively believe that users should be most accurate in analyzing these data when they have complete interactive control over the visualization. However, while we believe that Interactive Time is likely to produce the most accurate results, it will be interesting to see whether Static Time and Animated Time are nearly as accurate. Second, we hypothesize that users will be fastest analyzing Static Time designs because visualizations that depict time statically (i.e., simultaneously showing more than one instant in time) have outperformed animated visualizations in other contexts [98]. In those trend visualizations, changes over time were read quickly and accurately by users when a full trace of the trajectory over time was displayed, but trends were more difficult to identify when animation was used, despite the compelling aesthetic of the animation. We believe a similar effect will hold for the complex 3D scientific motions that motivate this study.

• H5: Users will be most confident in space-time analysis tasks for motion visualizations designed with the Interactive strategies (both Interactive Time and Interactive Space) as compared to the non-interactive strategies.

This final hypothesis tests the common understanding that the feeling of “being in control” provided by explicit user interfaces will lead users to be most confident in motion analyses using these designs.

Design. The experiment followed a 3x3 within-subjects design. The independent variables were (1) design choice for time (Interactive Time, Animated Time, Static Time) and (2) design choice for space (Interactive Space, Animated Space, Static Space).

Task. Participants were asked to perform a task that requires both spatial understanding (judging distances and collisions between two organic 3D geometries) and temporal understanding (finding the first occurrence of multiple similar events). While viewing visualizations of the motion of the two bumpy disc forms, participants were required first to detect collisions of two highlighted features (one shown in red on the top disc, and one in green on the bottom disc) with the opposite discs and second to indicate which feature was the first to collide with the other disc by touching the appropriate button (colored red or green) on the touch table. A collision was defined as occurring whenever any part of the highlighted feature intersected any part of the opposite disc.

Training. The experiment began with an initial training period consisting of 18 tasks, 6 with each of the Interactive Space/Interactive Time, Animated Space/Static Time, and Static Space/Animated Time designs. These were chosen strategically to introduce each participant to the full range of design choices encountered in the experiment. During training, visual feedback was provided to indicate correct and incorrect answers.

Experiment Sessions. During the data collection portion of the experiment, tasks were presented in blocks of 26 motions, one block for each visualization design. Before each block, on-screen instructions informed the user of the design choices used for both space and time and the user interface employed in the upcoming trials. Users were also informed that the first two trials in each block were to be considered practice trials. After each block, participants were asked to rate their confidence in the most recent visualization design using a 7-point Likert scale (1 = not confident at all, 7 = very confident). The rating was indicated by pressing a numbered button on the table surface. In addition to the confidence rating, the other dependent variables recorded for each subject were time taken and accuracy for each trial. The order of presentation for the 8 visualization designs was randomized across subjects using a balanced Latin square. The order of presentation for the motion sequences (not including the 2 practice trials) was also randomized within each block.
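To illustrate the counterbalancing described above, the following Python sketch builds a balanced Latin square over the 8 designs. The construction shown (a standard one for an even number of conditions), the function names, and the rule mapping participants to rows are assumptions made for illustration; the text does not state how the square used in the study was generated.

```python
# Design labels follow the abbreviations used in Table 4.1.
DESIGNS = ["IS/IT", "IS/AT", "IS/ST", "AS/IT", "AS/AT", "AS/ST", "SS/IT", "SS/AT"]


def balanced_latin_square(n):
    """Return n presentation orders over conditions 0..n-1 (n must be even).

    Each condition appears once in every position, and every ordered pair
    of distinct conditions appears exactly once as immediate neighbours.
    """
    if n % 2 != 0:
        raise ValueError("this construction needs an even number of conditions")
    rows = []
    for r in range(n):
        row = []
        for c in range(n):
            # Offsets follow the pattern 0, +1, -1, +2, -2, ...
            offset = (c + 1) // 2 if c % 2 == 1 else -(c // 2)
            row.append((r + offset) % n)
        rows.append(row)
    return rows


def order_for_participant(participant_index):
    """Hypothetical assignment rule: participant p uses row p mod 8."""
    square = balanced_latin_square(len(DESIGNS))
    row = square[participant_index % len(DESIGNS)]
    return [DESIGNS[i] for i in row]
```

Under this assumed assignment rule, a group of 16 participants would use each of the 8 rows exactly twice.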

Design    Errors          Time Taken (s)    Confidence
          Mean    SD      Mean     SD       Mean        SD
IS, IT    2.13    1.58    16.99    4.83     5.25^D      1.13
IS, AT    2.60    1.49    12.88    4.85     4.19        1.47
IS, ST    3.10    1.48    16.74    6.67     4.13^C      1.08
AS, IT    2.40    2.09    14.75    4.11     5.19^ABC    0.98
AS, AT    3.06    1.58    12.38    6.59     4.19^A      1.28
AS, ST    3.10    1.36    16.91    7.01     4.25        1.13
SS, IT    2.42    1.74    18.72    7.94     4.63        1.09
SS, AT    2.98    1.52    15.00    7.56     3.69^BD     1.35

Table 4.1: Experimental results for each visualization design. For the confidence measure, significantly different pairwise results are indicated by corresponding superscripts.

[...] population (mainly from computer science) participated in the study and were compensated for their time. The mean age of the participants was 24.56, with a standard deviation of 2.56; one was left-handed. Of the participants, 8 reported playing video games regularly, 6 sometimes, and 2 never; 13 reported little to no prior experience using virtual reality, and 3 reported using virtual reality more than 20 times. Time to complete the study, including practice sessions, ranged from 1 to 2 hours.

4.5 Quantitative Study Results

Each participant produced raw performance data from which we calculated the average time taken for each visualization design and the total number of errors for each visualization design. The raw data, organized by visualization design, are reported in Table 4.1. We analyzed the data using standard error plots and mixed-model univariate analysis of variance (ANOVA). We modeled our experiment as a repeated-measures design that treated the observer as a random variable and the design choice for space, design choice for time, and time between collisions as fixed. The confidence measures reported by participants were analyzed separately.
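The reduction from raw trials to per-design measures can be sketched as follows. This is a minimal illustration, not the analysis code used in the study: the record layout (the keys `participant`, `design`, `seconds`, `correct`) and the function name are assumptions, and practice trials are assumed to have been removed already.

```python
from collections import defaultdict
from statistics import mean


def summarize(trials):
    """Group scored trials by (participant, design) and reduce each group
    to the mean time taken and the total number of errors."""
    grouped = defaultdict(list)
    for trial in trials:
        grouped[(trial["participant"], trial["design"])].append(trial)

    summary = {}
    for key, rows in grouped.items():
        summary[key] = {
            "mean_time": mean(r["seconds"] for r in rows),
            "errors": sum(1 for r in rows if not r["correct"]),
        }
    return summary


# Hypothetical records, for illustration only (not data from the study).
example_trials = [
    {"participant": 1, "design": "IS/IT", "seconds": 15.0, "correct": True},
    {"participant": 1, "design": "IS/IT", "seconds": 19.0, "correct": False},
    {"participant": 1, "design": "AS/AT", "seconds": 12.0, "correct": True},
]
example_summary = summarize(example_trials)
```

The per-participant values produced this way are the quantities that would then be averaged across participants to obtain means and standard deviations of the kind reported in Table 4.1.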

[Figure: plots of Number of Errors and Time Taken (Seconds) by Design Choice for Time (Interactive Time, Animated Time, Static Time) and by Design Choice for Space (Interactive Space, Animated Space, Static Space).]

Figure 4.6: Experimental results for number of errors and time taken. Significantly different pairwise measures are indicated via dashed lines. Error bars are +/- 2 SE.

The main effect of the design choice for time was significant for both the number of errors (F(2, 345) = 12.560, p < 0.05) and the time taken (F(2, 345) = 30.643, p < 0.05). The main effect of the design choice for space was not significant for the number of errors (F(2, 345) = 1.660, p = 0.192) but was significant for the time taken (F(2, 345) = 11.948, p < 0.05). No significant cross-interactions were found. Follow-on pairwise comparisons were performed with the Bonferroni correction applied. The pairwise comparisons are summarized graphically in Figure 4.6.
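As a reminder of what the Bonferroni correction does, the sketch below multiplies each raw p-value by the number of comparisons in the family (capping at 1) before comparing it with the significance level. The function name and the example p-values are hypothetical; the raw pairwise p-values are not reported in the text.

```python
def bonferroni(p_values, alpha=0.05):
    """Return Bonferroni-adjusted p-values and the reject decisions.

    Each raw p-value is multiplied by the number of comparisons m
    and capped at 1.0; a comparison is significant if the adjusted
    value is below alpha.
    """
    m = len(p_values)
    adjusted = [min(1.0, p * m) for p in p_values]
    reject = [p_adj < alpha for p_adj in adjusted]
    return adjusted, reject


# Three pairwise comparisons among the three levels of one factor,
# with made-up raw p-values (not values from the study).
adjusted, reject = bonferroni([0.004, 0.03, 0.2])
```

With three levels per factor (Interactive, Animated, Static), each family contains three pairwise comparisons, so a raw p-value must fall below roughly 0.0167 to remain significant at the 0.05 level.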

Interactive Time designs produced the fewest errors. The difference in the number of errors was found to be significant when comparing Interactive Time with both Static Time (Part a of H4, 95% CI = [1.207, 0.376], p < 0.05) and Animated Time (Part b of H4, 95% CI = [0.941, 0.198], p < 0.05); thus, we accept H4. For the design choice for space, no significant differences in the number of errors were found; thus, we fail to accept H2.

Participants took the least amount of time to analyze the Animated Time designs as compared to both Static and Interactive Time. This does not agree with H3, in which we thought that users would be fastest analyzing Static Time designs, and thus we fail to accept H3. Although not one of our original hypotheses, we note that the time taken was significantly less for Animated Time than for both Interactive Time (95% CI = [4.685, 2.116], p < 0.05) and Static Time (95% CI = [4.842, 1.970], p < 0.05). The difference between Interactive Time and Static Time was not significant. A significant difference in time taken was also found among the design choices for space. Animated Space designs were significantly faster than Static Space (Part a of H1, 95% CI = [3.620, 0.748], p < 0.05); the mean time taken for Interactive Space designs was just slightly more than for Animated Space, but the difference was not significant. Thus, we accept hypothesis H1a, but we fail to accept hypothesis H1b.

Participants’ ratings of confidence are summarized in Table 4.1. Judging by mean responses, participants were most confident in the Interactive Space/Interactive Time design, closely followed by the Animated Space/Interactive Time design. (Both of these designs received mean ratings above 5.0.) Table 4.1 indicates significant differences in these ratings based on a Friedman test comparing the 8 visualization display conditions (χ²(7, N = 16) = 38.229, p < 0.05) and follow-on pairwise Wilcoxon tests. Animated Space/Interactive Time was significantly different relative to three other designs: Animated Space/Animated Time (z = 2.906, p < 0.05), Static Space/Animated Time (z = 3.688, p < 0.05), and Interactive Space/Static Time (z = 2.844, p < 0.05). Additionally, ratings for Interactive Space/Interactive Time were significantly different from those for Static Space/Animated Time (z = 3.406, p < 0.05). Revisiting H5, the data partially support this hypothesis. Users were more confident with Interactive Time designs than with Animated Time and Static Time designs, and with Interactive Space designs than with Static Space designs, but not more than with Animated Space designs.
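For readers unfamiliar with the Friedman test, the sketch below computes its statistic from a table of ratings with one row per participant and one column per design. It is a minimal illustration using the textbook formula with average ranks and the usual tie correction (ties are common with Likert ratings); the function names are assumptions, and this is not the software used in the study.

```python
from collections import Counter


def average_ranks(values):
    """Rank values from 1..k, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for idx in order[i:j + 1]:
            ranks[idx] = avg
        i = j + 1
    return ranks


def friedman_statistic(ratings):
    """Return (chi-square statistic, degrees of freedom) for n rows x k columns."""
    n, k = len(ratings), len(ratings[0])
    rank_sums = [0.0] * k
    tie_term = 0.0
    for row in ratings:
        for j, r in enumerate(average_ranks(row)):
            rank_sums[j] += r
        # Accumulate t^3 - t over each group of t tied values in this row.
        for count in Counter(row).values():
            tie_term += count ** 3 - count
    chi2 = 12.0 / (n * k * (k + 1)) * sum(r * r for r in rank_sums) - 3.0 * n * (k + 1)
    correction = 1.0 - tie_term / (n * k * (k * k - 1))
    if correction == 0:
        raise ValueError("all ratings are tied within every row")
    return chi2 / correction, k - 1
```

For 8 designs the degrees of freedom are k - 1 = 7, which matches the χ²(7, N = 16) form of the result reported above.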
