6.5 Pilot study 3: Testing with target audience

Finally, the workshop was tested with its intended audience using carefully selected quotations from participants in the Exploratory Study to prompt discussion. The purpose was to establish that the discussion procedure worked with its target audience of programmers: did the quotation cards evoke familiar scenarios and promote discussion?

6.5.1 Participants

A revised workshop format incorporating the procedural refinements from pilot study 2 was tested at the PyCon UK 2016 conference in Cardiff.

6.5.2 Materials: choosing quotation cards

Quotations were chosen for the quotation cards by reviewing the interview transcripts for resonant phrases to illustrate the common themes about peer behaviour that had emerged. Since the interview discussions had been guided by participants’ decisions in the interview card sorting task, themes were initially shortlisted for inclusion by examining the results of that task and applying the following criteria:

• Select the majority consensus set: cards which were sorted into a “noticeable impact” category by at least 50% of interviewees. (21 cards, of which 6 had bad impact and 15 good)

• Exclude contentious topics. Although 18 interviewees placed card 53 (Uses code comments in ways that aid understanding) in the “Good — Noticeable impact” category there was a lack of consensus in how they talked about it.

Although at a high level it qualifies as a common theme, drilling down into the reasons given for the impact showed considerable disagreement about what is good and bad. It is inconsistent with the aims of the workshop to present material on which there is such disagreement; the focus should be on the impact of a behaviour rather than discussion of what constitutes good or bad behaviour. (Candidate set reduced to 20 cards)

• Make the materials language-neutral. Participants had noted that card 18 (Is rigorous about deallocating allocated resources) is language specific. A possible future avenue is to tailor a selection of cards (perhaps with a core common set) to help address issues of particular local concern, such as language-specific ones. The workshop process is independent of the card topics. (Candidate set reduced to 19 cards)

Consideration was then given to potentially interesting features of cards not in the initial shortlist. Three criteria were used:

• Examine any cards which the majority placed in the “neutral” category. Participants’ commentaries sometimes indicated an “it depends” response to cards placed in this category. However, only card 16 (Follows formal methods to the letter) met this criterion. It was ill-worded, intending to convey a dogmatic approach to software methodologies but often taken to refer to mathematical techniques for software verification. These techniques are not widely used throughout the industry and so were irrelevant to most participants’ daily experiences. The card was not included. The kind of dogmatism it intended to convey is captured more generally by card 43 (Espouses “one true way” of doing things), one of the majority consensus set. (No change to candidate set)

• Examine any cards which show indications of dissent. Most have consensus on positive or negative valence, or at least neutrality. Card 26 (Logs it in the issue-tracking system when knowingly making a sub-optimal change) is the only one with more than one or two participants dissenting (4 “Bad — Slight impact”; 5 neutral; 15 “Good — Slight impact”; 4 “Good — Noticeable impact”). Some participants spoke in the interviews about the right approach being not to make a sub-optimal change in the first place. This is quite a context-dependent card; what may be appropriate in an out-of-hours support call to provide a short-term solution for a customer’s urgent problem, for example, might not be a desirable approach in other circumstances. Since the dissent can be accounted for and few participants reported a noticeable impact of any kind, the card was not included. (No change to candidate set)

• Consider any other common themes not captured by the preceding criteria. Card 22 (Includes code features that are not currently needed) was placed in the “Bad — Noticeable impact” category by 8 interviewees, while 15 placed it in only a “Slight impact” category, but the topic was nonetheless a common theme. It was frequently expressed in terms of the XP (eXtreme Programming) principle of YAGNI: “You Ain’t Gonna Need It”. “New technologies without good reason” was an additional theme not directly addressed by the interview cards, although in some ways it is the flip side of card 40 (Finds out whether functionality is already available before writing their own implementation). Both themes were included. (Final size of set: 21 cards.)
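The selection steps above can be restated as a small sketch. This is only an illustration of the bookkeeping, not tooling used in the study: the function names, the category keys and the placeholder card ids are assumptions introduced here. The card 26 tally and the set sizes (21, 19, 21) are taken from the text; the full membership of the consensus set is not listed in this section, so only cards 53, 18 and 43 are named and the rest are placeholders.

```python
# Illustrative sketch only: names, category keys and placeholder ids are
# assumptions made for this example, not part of the study's materials.

NOTICEABLE = {"bad_noticeable", "good_noticeable"}


def noticeable_share(tally):
    """Fraction of interviewees who sorted a card into a noticeable-impact category."""
    total = sum(tally.values())
    hits = sum(count for category, count in tally.items() if category in NOTICEABLE)
    return hits / total


def in_majority_consensus(tally, threshold=0.5):
    """First criterion: at least 50% of interviewees reported a noticeable impact."""
    return noticeable_share(tally) >= threshold


# Card 26 tally as reported in the text (4 + 5 + 15 + 4 placements).
card_26 = {"bad_slight": 4, "neutral": 5, "good_slight": 15, "good_noticeable": 4}

# Candidate-set bookkeeping. Only cards 53, 18 and 43 are identified in the
# text as members of the consensus set; the other 18 ids are placeholders.
consensus = {"card 53", "card 18", "card 43"} | {f"placeholder {i}" for i in range(1, 19)}

# Exclude the contentious card (53) and the language-specific card (18).
shortlist = consensus - {"card 53", "card 18"}

# Add the two common themes not captured by the earlier criteria.
final_set = shortlist | {"card 22", "new technologies without good reason"}

if __name__ == "__main__":
    print(in_majority_consensus(card_26))  # False: 4 of 28 placements are noticeable
    print(len(consensus), len(shortlist), len(final_set))  # 21 19 21
```

Using sets makes each step a single difference or union, which mirrors the running totals given in parentheses after each criterion.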

Quotations were chosen from among those coded to the selected themes for their ability to clearly represent the theme, a characteristic checked by asking an experienced software developer to indicate whether they recognised the behaviour being talked about in each quotation. In a few cases this criterion was not met and with the help of the developer an alternative was selected from among the candidates. See Appendix H for the final set of quotations used. These were printed on A4 cards, referred to hereafter as “quotation cards”. Both sides of each card showed a speech bubble containing the same quotation from an interview participant (Figure 11).

[Figure 11 image; text on example card: “Chooses identifiers which are not succinct, meaningful and distinct”, numbered 20]

Figure 11: Format of quotation cards for Evaluation Study workshops

6.5.3 Procedure

The pilot took place within the context of a conference session so a short presentation about the research was given beforehand. Delegates were then organised into three groups of three or four and instructions were given to the room as a whole. Each group was given an identical set of cards to look through, with instructions for everyone to choose one that resonated with them for its impact on their work and then explain why to the group. Since this session involved multiple groups it was not possible to facilitate the process of each group individually. Instead the researcher circulated among the groups, answering questions about the process, listening to the conversation and prompting people to talk about the personal impact when the tenor of discussion had become more about good practice (what people “should” do) than the impact of such actions. At the end participants were invited to complete an online questionnaire to give their feedback. Three complete responses were received (seven online sessions were started; one answered only the first four questions and three contained no answers).

6.5.4 Discussion

There were practical difficulties with conducting this pilot exercise because it was scheduled across two conference sessions, meaning that some people left half way through to attend a presentation in another room, while others joined half way through. Nonetheless it was an instructive exercise that informed subsequent refinements to the workshop.

It was evident, from both the number of questions participants asked about what to do and from feedback in the post-workshop questionnaire, that the instructions given needed to be clearer. They were refined to say less about the research behind the workshop. These revisions continued throughout the full study until the final version of the instructions (Appendix I) said as little about the previous work as possible while still conveying that the quotes came from a wide range of experienced programmers — the participants’ peers. Based on parting comments, it appeared that including too much of the research background in the instructions sometimes encouraged the mistaken impression that the goal was to learn participants’ views for research purposes, rather than the actual goal of attempting to create a useful discussion.

One change was made to the content of the quotation cards as a result of this pilot because a participant found “Code is a shared responsibility for everyone. If you don’t like others to work on code you wrote, well, get over yourself!” somewhat offensive. Other quotations might also be considered quite forthright in their language, but the way they are phrased reflects the feelings and vocabulary of real developers in a way that underlines their authenticity. No attempt was made to sanitise them to pre-empt possible future offence, but the one which had actually been deemed offensive was replaced with a less truculent alternative, “Anyone who thinks it’s ‘their’ code is missing the point.” The full set of quotations used in all subsequent workshops is listed in Appendix H.

6.5.5 Conclusion

The workshop achieved its goal of facilitating discussion through the quotation cards. Observing the overall “buzz” in the room and listening in to the conversations showed that animated discussions were taking place. Questionnaire feedback was also positive. One participant responded to the question “Would you do the workshop again?” with an encouraging “Yes, I’d love to discuss these issues with my team and the cards would be a good starting point.” All three respondents reported that they would recommend the workshop to others, one believing that it would help to open up discussions that would otherwise be difficult to have. Together, the observations and the questionnaire responses encouraged pursuing the plan to deliver the workshop in a company setting.

In document Helping developers to help each other: a technique to facilitate understanding among professional software developers. (Page 103-107)