
6.2 Study #1: Feasibility and Quantitative Measures

6.2.5 Qualitative results

These quantitative results indicated an acceptable level of usability and learnability for programmers, but one that was still insufficient for non-programmers. To identify participants’ subjective perceptions of problems and how they could be resolved, qualitative results were obtained by triangulating observation of participants completing the tasks with their post-study interview feedback. In general, given that the first three study tasks were relatively instructive with regard to the blocks required, the majority of barriers encountered were related to the direct manipulation of the interface, and not to participants’ problem-solving capabilities with Jeeves. The salient issues are described in detail below.

Figure 6.5: Menu system in Study #1, where different components are obscured in menus

6.2.5.1 Issue 1 - Supporting Exploration

The exploratory approach of non-programmers appeared to be disrupted by visibility issues, an example of which is shown in Figure 6.5, where blocks are accessed from hidden menus. Figure 6.5A shows the initial menu view, consisting of two “abstract” triggers, namely time-based and sensor-based triggers. Clicking on these exposes their underlying “concrete” triggers, shown in Figure 6.5B and C respectively. For example, the “Interval”, “Random” and “Set Times” triggers are accessible from the time-based abstract trigger. Similarly, concrete actions and expressions were also grouped into menus, initially hidden until their abstract parent was clicked. Participants who had enough time to generate a partial solution to the final task tended to employ a “bricolage” approach to programming, as described by Turkle and Papert [164]. In this approach, participants build spontaneously from the ground up, deleting erroneous components as necessary, similar to participants in an analysis of Scratch user habits by Meerbaum-Salant et al. [182]. Notable exceptions were the three participants who considered themselves to be advanced programmers, who appeared to have a pre-conceived design and constructed their solutions in a linear process.

136 CHAPTER 6. USABILITY ANALYSIS

[...] were made visible by clicking on their abstract parent block, such as those labelled “time-based triggers” or “prompting actions”. Although this menu structure was intended to reduce the time needed to find relevant blocks, it appeared to have the opposite effect, due to participants’ tendency to search through menus at random. This difficulty was expressed by participants in their post-study feedback:

“...everything is very intuitive the building but the categories? It’s hard, I dunno. But maybe it’s just me” (P18, programmer)

“I was slightly confused by the classifications of things like app setting actions...just like ‘set’ together with ‘wait for’ and the same time there’s like the ‘sensor action’ things” (P14, programmer)

Quantitative results showed that programmers were significantly faster than non-programmers, but made a similar number of errors. Before the screen recordings were reviewed again, it was hypothesised that this was due to programmers having greater recall of the required components (each sequence of incorrect menu navigations was counted as one navigation error, so this suggested that programmers’ sequences of wrong menus were shorter). The recordings were reviewed and the number of incorrect navigations (entering a menu then exiting immediately) was counted. This figure was divided by the time taken to give a rate of incorrect clicks. Surprisingly, programmers’ average rate of incorrect clicks was 1.06/minute and non-programmers’ average rate was 0.95/minute, with no statistically significant difference between the two. From observation, it was notable that non-programmers would often pause before deciding whether to click on a menu button, whereas programmers were inclined to explore without hesitation.
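The rate calculation described above can be sketched as follows. This is a minimal illustration of the arithmetic, not the analysis script used in the study: the function name `incorrect_navigation_rate` and the example counts and durations are hypothetical and are not data from Study #1.

```python
from statistics import mean


def incorrect_navigation_rate(incorrect_navigations: int, minutes_taken: float) -> float:
    """Return incorrect menu navigations per minute for one participant."""
    if minutes_taken <= 0:
        raise ValueError("minutes_taken must be positive")
    return incorrect_navigations / minutes_taken


# Hypothetical (incorrect navigations, minutes taken) pairs; NOT data from Study #1.
example_group = [(12, 10.0), (9, 12.0), (15, 15.0)]

# Per-participant rates, then the group's average rate.
rates = [incorrect_navigation_rate(n, t) for n, t in example_group]
group_mean_rate = mean(rates)
```

Normalising by time in this way allows participants who worked for different durations to be compared, which is why a faster group can show a similar rate of incorrect navigations despite a similar raw error count.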

This issue suggested that menus were unnecessary, particularly if superfluous blocks were removed. For example, Moody’s Principle of Perceptual Discriminability, “Different Symbols Should Be Clearly Distinguishable from Each Other”, is supported, because the text and widgets incorporated into each trigger and action are used as distinguishing features. However, as shown in Figure 6.5, “Hardware Sensor Triggers” and “Software Sensor Triggers” are almost identical, and participants frequently used the wrong type, which aggravated confusion.

Resolution - In order to resolve this issue, the two sensor-based triggers were incorporated into one “Sensor Trigger” block. Further, the abstract block buttons were removed altogether and replaced with palettes from which blocks could be dragged directly, as shown in Figure 6.6. For example, all actions can be accessed from the palette labelled “Actions”. As such, the Cognitive Dimension of visibility was expected to be improved.
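The structural change can be illustrated with a small sketch. The dictionaries below are assumptions made for illustration and do not represent the actual data model of Jeeves: the time-based entries are taken from the text, the placement of the two sensor triggers under the sensor-based parent is inferred from Figure 6.5, the palette label “Triggers” is hypothetical, and the step lists simply encode the click-then-drag versus drag-only interaction.

```python
# Illustrative model of the two block-selection designs (not Jeeves's real data model).

# Study #1 design: concrete triggers are hidden behind abstract parent buttons.
# Time-based entries come from the text; the sensor entries are inferred from Figure 6.5.
NESTED_MENUS = {
    "time-based triggers": ["Interval", "Random", "Set Times"],
    "sensor-based triggers": ["Hardware Sensor Trigger", "Software Sensor Trigger"],
}

# Revised design: a flat, always-visible palette with the sensor triggers merged.
# The palette label "Triggers" is an assumption.
FLAT_PALETTES = {
    "Triggers": ["Interval", "Random", "Set Times", "Sensor Trigger"],
}


def steps_with_menus(block: str) -> list[str]:
    """Interaction steps in the old design: open the parent menu, then drag."""
    for parent, children in NESTED_MENUS.items():
        if block in children:
            return [f"click '{parent}'", f"drag '{block}'"]
    raise KeyError(block)


def steps_with_palettes(block: str) -> list[str]:
    """Interaction steps in the revised design: drag straight from the palette."""
    for blocks in FLAT_PALETTES.values():
        if block in blocks:
            return [f"drag '{block}'"]
    raise KeyError(block)
```

Under these assumptions, every block in the revised design is reachable in a single drag, and a participant no longer has to guess which hidden menu contains it.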


Figure 6.6: The single menu of abstract blocks was replaced with separate labelled menus containing trigger/action/condition blocks

Figure 6.7: The previous click-click-drag process caused issues for participants

6.2.5.2 Issue 2 - Improving Control

The most prominent issue that participants had was with direct manipulation of the blocks themselves. The majority of participants explicitly mentioned having difficulties with the drag-and-drop process, and all were observed to experience issues, again aggravated by the menu design. In this design, participants had to click the ‘abstract block’ to open a menu, and then click and drag the ‘concrete block’ from the menu onto the canvas. This sequence, as illustrated in Figure 6.7, was a notable issue for programmers and non-programmers alike:

“sometimes you’d be trying to click something and you’d actually move something and then...just little errors might be magnified by the fact that it’s very click-based” (P17, non-programmer)

“...you need to click and then drag. I always try to just click and click and then I’m expecting something to happen...because you’re kind of primed oh you click and then you click and then, oh then actually you have to drag” (P16, programmer)


Figure 6.8: The sensor configuration tab, which was removed in later iterations of Jeeves

[...] mouse input in the same way as the concrete blocks, and instead acted as buttons for accessing menus of these concrete blocks. This representation was intended to support the Cognitive Dimension of “abstraction”. Further, designing these buttons to resemble the concrete blocks contained in their respective menus was intended to support role-expressiveness while also adhering to Moody’s Principle of Semantic Transparency: “Use Visual Representations Whose Appearance Suggests Their Meaning”. For example, the appearance of the “time based triggers” button was intended to suggest its role as a means to display different concrete time triggers. However, this design decision had the inverse effect, as the appearance of the button suggested that its role was actually as a draggable block.

Resolution - As described in the resolution of Issue 1, abstract blocks were removed, such that all concrete blocks are now directly draggable from their respective palettes. It is also possible that this issue would be resolved by having menu buttons actually resemble buttons, rather than draggable blocks.

6.2.5.3 Issue 3 - Too Much Information

Multiple participants, particularly non-programmers, mentioned how they initially struggled with the number of new concepts. For example, one non-programmer suggested that: “at first it’s maybe a wee bit, a lot to take in. The different vocabulary of it all, and what different things mean” (P7, non-programmer). Those who did not struggle expected that others with less programming experience would face such issues: “I just think that if I gave that to my mum or dad I don’t think they’d be able to start with it” (P4, programmer).

Moody’s Principle of Semiotic Clarity, “There Should Be a 1:1 Correspondence between Semantic Constructs and Graphical Symbols”, as well as the “closeness of mapping” Cognitive Dimension, both influenced the original design, but apparently to detrimental effect. Because the design attempted to capture all semantic constructs, including loops, arithmetic expressions, and various smartphone system actions, participants expressed concern that too many concepts were introduced at once.

From this feedback, the decision was taken to minimise the number of constructs that the end-user would have to comprehend, in line with Moody’s Principle of Graphic Economy: “The Number of Different Graphical Symbols Should Be Cognitively Manageable”. For example, the encoding of sensors was significantly simplified. Figure 6.8 shows the content of the sensor configuration tab previously accessible in Jeeves. The purpose of this pane was to allow end-users to enable and disable specific phone sensors, and to set the frequency and granularity of sampling. However, one participant highlighted this as a potential example of “hidden dependencies”:

“Only thing that confused me a little bit I was looking for like, sensors right? Okay this was something actually, to be fair I should have been prompted to enable this kind of stuff right?” (P18, programmer)

Resolution - The sensor configuration tab was removed, as well as constructs considered to be superfluous. For example, the constructs representing loops were removed, as these were observed to cause the most confusion for non-programmer participants, to whom these concepts had not previously been introduced. Furthermore, in designing tasks based on ESM studies, it was unclear whether loops would serve a useful purpose.

6.2.5.4 Positive Feedback - Learning curve

Regardless of whether they completed the tasks, or of the salient difficulties they faced in doing so, the majority of participants expressed that they felt Jeeves had a shallow learning curve, and non-programmers consequently felt that they had improved by the end of the study:

“...cause I’m not really a computer person, I think it was initially a bit overwhelming but once I read the instructions and familiarised myself it was really good” (P11, non-programmer)

“maths and like, concepts of y’know, if this then this happens and this, like box and sort of diagrams like that, are a bit confusing, but I felt that this got better a bit at the end” (P8, non-programmer)

Given that participants had only 30 minutes using Jeeves, such feedback was particularly positive. It is important that end-users with no prior programming experience are able to learn the basics of Jeeves quickly and efficiently, if it is to be adopted in practice. When asked how the learnability of Jeeves could be improved, participants were uncertain, and suggested different features, including built-in tutorial features, and different means of selecting blocks.


Figure 6.9: Old drag-and-drop highlighting caused ambiguity in Study #1

Figure 6.10: Updated highlighting improved clarity in later studies

6.2.5.5 Positive Feedback - Visual metaphor

The blocks-based paradigm received almost unanimously positive feedback, with comments made by programmers and non-programmers alike on the intuitive means of fitting blocks together. Indeed, only one participant, an experienced programmer, expressed dislike of visual programming, explaining: “I don’t like having to click and then click and drag. It doesn’t work for me...I’m a programmer. I’d rather just write a loop myself” (P2, programmer).

P7, with no programming experience, appreciated the visibility and ‘flow’ of the paradigm: “...rather than just simply words and numbers and things like that you feel like you can sort of see it and see the flow of ideas and things and the sort of different steps” (P7, non-programmer).

Having confirmed the feasibility of the blocks-based notation through this encouraging feedback, very little action was taken with regard to the paradigm itself. However, it was observed that participants frequently struggled to drag and drop action blocks into the triggers or nested if-conditions. Figure 6.9 shows the previous visual feedback when an action block is dragged over an action receiver, which was observed to cause confusion. To rectify this, a clearer notation was used to highlight the relevant block, shown in Figure 6.10. This was expected to reduce the Cognitive Dimension of “error-proneness” of the previous approach, by clarifying where a dragged action would be positioned, an issue expressed by one participant:

“Sometimes you have to get this thing right in the right place or else it doesn’t, like if you kind of like drag it and drop it like here it won’t register” (P12, non-programmer)


6.2.6 Discussion

The results of Study #1 demonstrated the potential of Jeeves as an environment usable by those with no prior programming experience. As a feasibility study for developing Jeeves further, it confirmed that the blocks-based programming paradigm was approachable by non-programmers. Through both direct observation of participants’ usage of Jeeves and analysis of their post-study feedback, actionable results were obtained that directed the second iteration of Jeeves. An interesting finding of this study was that design decisions guided by the Physics of Notations and Cognitive Dimensions were not always well received. This was due partly to the acknowledged trade-offs between dimensions; for example, the Principle of Semiotic Clarity conflicts with that of Graphic Economy. Similarly, the abstraction dimension conflicts with the visibility dimension. (See Section 5.3 for a discussion of guideline trade-offs.) However, design principles related to semantic transparency and role-expressiveness were found to be self-conflicting in Jeeves. For example, while the appearance of the “time based triggers” button implied its role for showing concrete time triggers, it also incorrectly implied its role as a concrete block itself.

6.2.6.1 Limitations

A key limitation of this study (and indeed all three of the usability studies in this chapter) is that participants were students rather than clinicians and psychology researchers, which reduces the study’s external validity. The primary reason for recruiting students was that they were easily accessible. Given that this was an initial proof-of-concept and usability study, the ability to acquire larger numbers of participants from more diverse subject backgrounds, and with more varied exposure to programming, was considered to be of high importance. It allowed a variety of feedback to be collected, ranging from those with no programming experience and limited computer use, to HCI students who had studied interface usability as part of their undergraduate or postgraduate work. As such, a larger number of usability issues were identified which, if unresolved, could significantly reduce the utility of Jeeves as perceived by its target end-users. However, as previously discussed, this was a small number of participants for the purposes of testing for statistical significance.

It could also be argued that the students had no knowledge of experience sampling, so asking them to create experience sampling apps would not represent the real-world use of Jeeves. To mitigate this, Jeeves was introduced to study participants as an application to create self-monitoring and automation apps. Only one participant explicitly stated that she would not use it in her daily life, and indeed four explicitly stated, without being asked, that they would like to use Jeeves for creating their own apps. Given the usability issues encountered, the enthusiasm for real-world usage by participants was surprising, and suggested an application of Jeeves as a tool for automating smartphone functions, similar to popular apps such as Atooma and Tasker [183]. While such an application of Jeeves was not pursued in this thesis, it is considered to be potential future work.