Comparative Evaluation of Initial and Revised Prototypes

Chapter 3: Design and Evaluation of a World-in-Miniature

3.3 Addressing Scalability and Awareness

3.3.3 Comparative Evaluation of Initial and Revised Prototypes

We compared the scalability and awareness enhancements of our revised prototype, SEAPort, to our previous prototype, ARIS. Since awareness is a complex, multidimensional concept, this study focuses on an interface’s ability to aid recall of which applications are on which screens in the workspace. Future work will test other awareness dimensions. Our improvements were evaluated as a whole, as they are very interdependent and would likely be replicated as a whole, not piecewise, in other interfaces.

3.3.3.1 Experimental Design and Configuration

The experiment used a repeated-measures design with Interface (ARIS and SEAPort) and Clutter (low and high) as within-subject factors and Input Device (stylus and mouse) as a between-subjects factor. The presentation order of Interface and the assignment of Input Device were counter-balanced.
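One way to realize such a counter-balanced assignment is sketched below. The text does not say how users were actually assigned, so the scheme here (fully crossing interface order with input device and cycling users through the resulting cells) and the name `assign_conditions` are assumptions made for illustration only.

```python
from itertools import product

# Levels taken from the design described above.
INTERFACE_ORDERS = [("ARIS", "SEAPort"), ("SEAPort", "ARIS")]
INPUT_DEVICES = ["stylus", "mouse"]


def assign_conditions(n_users):
    """Cycle users through every (interface order, input device) cell.

    Hypothetical helper: assumes a fully crossed assignment, which the
    source does not confirm.
    """
    cells = list(product(INTERFACE_ORDERS, INPUT_DEVICES))
    if n_users % len(cells) != 0:
        raise ValueError("n_users must be a multiple of the number of cells")
    assignments = []
    for i in range(n_users):
        order, device = cells[i % len(cells)]
        assignments.append(
            {"user": i + 1, "interface_order": order, "input_device": device}
        )
    return assignments


assignments = assign_conditions(8)
for row in assignments:
    print(row)
```

With eight users, this assumed scheme places two users in each of the four cells, so each interface order and each input device is used by four users.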

Our workspace reflected those shown in [73, 116]. It consisted of two 61" plasma displays, a 20" LCD screen, and an 18" graphics tablet. The two smaller displays were placed 2’ apart on a table in the center of the room. The large screens were placed 1’ apart against the wall in front of the table.

3.3.3.2 Users and Experimental Activities

Eight users (4 female), all students in CS or Psychology, participated in our study. Ages ranged from 18 to 29. To make the experiment as engaging as possible, we used a context that we felt would be familiar and of interest to most users. Users were asked to review and organize content for a multimedia presentation about our school’s basketball team.

The first activity evaluated the effectiveness of our scalability enhancements. For this activity, which consisted of four relocation tasks, users used each interface to organize content among several screens in the workspace. The user was given printed instructions, e.g., relocate the document containing rebound statistics to the leftmost large screen. Users performed similar activities in two conditions, one with only a few applications (low clutter) and another with many (high clutter). The latter caused at least half of the representations to have at least some occlusion, but the number of applications open was within practical limits.

Our second activity evaluated how well SEAPort enhanced recall of the state of the workspace. For this activity, applications were configured among the screens, a user reviewed the configuration, and then recalled as much of it as possible, without and then with the aid of the interface. This part of our evaluation was inspired by the Situation Awareness Global Assessment Technique (SAGAT), a commonly used technique for measuring awareness where information screens are blanked out and users are asked to recall task-related information [45, 46].

3.3.3.3 Procedure and Measurements

When a user arrived, the activities were explained, the first interface was demonstrated, and the user practiced using it. The user performed the organizing activity in the low clutter condition (4 applications, 1 per screen) and then performed a similar activity in the high clutter condition (11 applications, 2-4 per screen). The user was asked to perform the activities as quickly as possible. Upon completion, the user filled out a questionnaire.

This process was repeated for the second interface, using a counter-balanced order. Screen capture software was used to record a user’s on-screen interaction.

For the awareness activity, the experimenter configured a set of applications among the screens and the user reviewed it for 20 seconds. With screens turned off, the user labeled a printed map of the workspace with the location and content (one descriptive word) of the applications. The screen closest to the user was then turned on showing the interface maximized. Using it as a memory aid (no interaction), the user modified their map as desired. The user performed this activity only in the high clutter condition and the number of applications (11) was well above short-term memory limits [92]. This process was repeated for the second interface.

For the organizational activity, we measured:

• Completion time. This was the time from the first cursor movement to the completion of the final task.

• Error. We defined an error as any interaction step that did not move a user closer to completing the activity.

• Subjective feedback. Users rated an interface across various dimensions, including ease of use, learnability, and overall satisfaction. Questions were taken from [85].

• Visual scans. This is when the user shifted visual attention to a screen other than the local screen, an important metric for evaluating interfaces for MDEs [19]. Visual scans were identified by reviewing an over-the-shoulder video of the user.

For the awareness activity, we measured the number of applications correctly labeled on the map in both the free recall and interface-assisted conditions. For an application to be marked as correct, both its location and content had to be correct.
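The scoring rule for the awareness activity can be illustrated with a minimal sketch. The data layout (a list of screen and content-word pairs), the case-insensitive comparison, the function name `score_recall`, and the example labels are all assumptions for illustration; they are not the study's materials or data.

```python
def score_recall(configuration, labels):
    """Count labels whose screen AND content word both match the configuration.

    Both arguments are lists of (screen, content_word) pairs. Hypothetical
    representation: content words are compared case-insensitively.
    """
    truth = {(screen, word.lower()) for screen, word in configuration}
    answers = {(screen, word.lower()) for screen, word in labels}
    return len(truth & answers)


# Made-up example, not study data.
configuration = [
    ("left plasma", "rebounds"),
    ("right plasma", "schedule"),
    ("LCD", "roster"),
]
labels = [
    ("left plasma", "Rebounds"),  # correct: location and content match
    ("LCD", "schedule"),          # incorrect: right content, wrong screen
    ("LCD", "roster"),            # correct
]
score = score_recall(configuration, labels)
print(score)
```

In this made-up example two of the three labels count as correct, because a label with the right content on the wrong screen earns no credit.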

3.3.3.4 Results

There were no differences in errors, and Input Device had no effect, so neither is discussed further. An ANOVA showed that Clutter had a main effect on completion time (F(1,7)=19.07, p<0.003). Completion time was longer for high clutter (µ=42.3s) than for low clutter (µ=31.9s), likely due to the increase in choices (Hick’s Law). Interface also had a main effect on completion time (F(1,7)=13.24, p<0.008), with activities performed faster with SEAPort (µ=31.5s) than with ARIS (µ=42.7s), a 26% improvement. There were no interaction effects.

Clutter had a main effect on visual scans (F(1,7)=9.03, p<0.020). There were more scans during high clutter (µ=4.6) than low clutter (µ=3.0), likely due to the larger number of applications. Interface also had a main effect (F(1,7)=77.54, p<0.001), with SEAPort (µ=2.3) causing fewer visual scans than ARIS (µ=5.3), a 56% improvement.
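The improvement percentages for completion time and visual scans are consistent with a relative reduction computed against the ARIS mean. The sketch below shows that arithmetic using the means reported above; the function name `relative_reduction` and the choice of ARIS as the baseline are assumptions, since the text does not state the formula.

```python
def relative_reduction(baseline, revised):
    """Percentage by which `revised` is lower than `baseline`."""
    if baseline == 0:
        raise ValueError("baseline must be non-zero")
    return (baseline - revised) / baseline * 100.0


# Means reported in the text; ARIS is assumed to be the baseline.
time_reduction = relative_reduction(42.7, 31.5)  # completion time in seconds
scan_reduction = relative_reduction(5.3, 2.3)    # mean number of visual scans

print(f"completion time: {time_reduction:.1f}% lower with SEAPort")
print(f"visual scans: {scan_reduction:.1f}% lower with SEAPort")
```

Under this assumption the values come out at roughly 26.2% and 56.6%, in line with the reported 26% and 56%.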

For subjective feedback, SEAPort was rated higher on five of seven dimensions (t(7)≥2.376, p<0.049). Users found the new interface 11% easier to use (µSEA=6.50, µARIS=5.88), 21% more comfortable (µSEA=6.50, µARIS=5.38), 38% better for finding information (µSEA=5.88, µARIS=4.25), and 25% more satisfying (µSEA=6.25, µARIS=5.00) for the tasks.

For the recall data, there was no difference between the baselines (µSEA=7.9, µARIS=7.3), consistent with short-term memory limits. However, users were able to recall much more of the configuration with SEAPort (µ=10.8) than with ARIS (µ=7.8; t(7)=5.3, p<0.001), a 28% improvement. The revised visual representation in SEAPort thus enables users to better extract which applications are on which screens, an important component of awareness.
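The t(7) statistics reported here are consistent with a paired comparison across the eight users (degrees of freedom n - 1 = 7), although the text does not name the test. A minimal sketch of a paired t statistic follows; the function name `paired_t` and the input scores are placeholders for illustration, not study data.

```python
from math import sqrt
from statistics import mean, stdev


def paired_t(first, second):
    """Return (t, degrees of freedom) for paired samples.

    Assumes each position in `first` and `second` belongs to the same user.
    Raises ZeroDivisionError if all paired differences are identical.
    """
    if len(first) != len(second) or len(first) < 2:
        raise ValueError("need two equal-length samples with at least 2 pairs")
    diffs = [a - b for a, b in zip(first, second)]
    n = len(diffs)
    t = mean(diffs) / (stdev(diffs) / sqrt(n))
    return t, n - 1


# Placeholder scores, NOT the recall data from the study.
t_value, dof = paired_t([10, 11, 12], [9, 9, 9])
print(f"t({dof}) = {t_value:.3f}")
```

With eight paired scores the same function returns 7 degrees of freedom, matching the form of the statistics reported above.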

Overall, the empirical results confirm that our new interaction techniques have made significant strides towards meeting our goals of improving scalability and enhancing awareness. These techniques advance the use of the WIM metaphor for managing applications in MDEs. Also, the interaction techniques and revised visual representations can be leveraged in other world-in-miniature user interfaces, e.g., [76].

In document An Interaction Framework for Managing Applications and Input in Multiple Display Environments (Page 76-80)