Preliminary Study: Evaluation of the Prototype


5.1.3 Preliminary Study: Evaluation of the Prototype

We conducted a preliminary study to investigate the concept of GROUPGARDEN. For this, we compared the prototype with a baseline without supportive feedback. The main research questions of this study were whether GROUPGARDEN facilitates participants’ ability to self-regulate and whether brainstorming rules are successfully supported.

Method

The experiment was run as a laboratory study using a repeated measures design. This means that all groups completed two brainstorming sessions, one in each condition. One condition served as a baseline, in which groups brainstormed without any additional support. In the other condition, feedback was provided with GROUPGARDEN. We used two different topics. Both conditions and both topics were counterbalanced using a Latin square design.
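The counterbalancing described above can be illustrated with a minimal sketch. The thesis does not state how the ten groups were assigned to orders, so the cycling scheme, the function names, and the short topic labels below are assumptions made only for illustration. Because ten groups cannot be split evenly across the four order combinations, this sketch balances the condition order exactly and the topic order only approximately.

```python
CONDITIONS = ("baseline", "group mirror")
TOPICS = ("tablet computer", "caffeinated soft drink")  # short labels, assumed


def latin_square(items):
    """Return a cyclic Latin square: row i is `items` rotated by i positions."""
    n = len(items)
    return [tuple(items[(row + col) % n] for col in range(n)) for row in range(n)]


def assign_orders(n_groups):
    """Cycle groups through the rows of both Latin squares (hypothetical scheme)."""
    condition_rows = latin_square(CONDITIONS)
    topic_rows = latin_square(TOPICS)
    schedule = []
    for g in range(n_groups):
        # The condition order alternates with every group.
        condition_order = condition_rows[g % len(condition_rows)]
        # The topic order switches after every full pass over the condition rows.
        topic_order = topic_rows[(g // len(condition_rows)) % len(topic_rows)]
        # Session i pairs the i-th condition with the i-th topic.
        schedule.append(list(zip(condition_order, topic_order)))
    return schedule


schedule = assign_orders(10)
for group, sessions in enumerate(schedule, start=1):
    print(f"G{group}: {sessions}")
```

With ten groups, five start with the baseline and five with the group mirror, while the topic order is split six to four.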

Setup and Procedure

The study took place in a quiet room. Ten groups of three participants each took part. The room was equipped with three revolving chairs and a projector that was used to project the visualization onto a white wall (see Figure 5.4, left). Participants could choose how to position themselves in front of the wall (more side-by-side or more face-to-face; see Figure 5.4, right).

Before the sessions, the experimenter gave an introduction to the procedure of the study. The brainstorming rules were explained to the groups, and groups were asked to try to follow these rules. They were not explicitly asked to strive for balanced participation. Before the condition with group mirror, its functioning was explained. Furthermore, the topics were given to the groups. We chose two topics suited for brainstorming that did not require any special prior knowledge: (1) What could a commercial for a new tablet computer look like? and (2) What could a commercial for a new caffeinated soft drink look like?

Figure 5.4: Study setup. Left: A group of three participants in front of GROUPGARDEN (picture re-staged). Top right: A group sitting more face-to-face. Bottom right: A group sitting more side-by-side. The visualization is projected on the wall to the left of the participants (not visible in these pictures).

After the general introduction, groups brainstormed twice for 15 minutes. The brainstorming was conducted purely verbally, without taking notes or using other stationery such as sticky notes. The study was designed as a Wizard of Oz experiment (Kelley, 1983). In such a study, participants believe that they are interacting with an autonomous system, while the system is actually operated, at least in part, by humans. In our case, participants were told that the experimenter was only taking notes on a laptop, while he was actually also operating the control interface, that is, increasing the idea counter or triggering warnings.

The experimenter had to listen to the discussion carefully to judge which contributions should be counted as an idea. To standardize this procedure, we defined what counts as an idea. As described in Section 3.1, definitions of creativity often include the aspects of novelty and usefulness. The definition used for this study focuses on the aspect of novelty: Each contribution that is on-topic and novel in the context of this brainstorming session (i.e., was not stated before) is counted as an idea. Additionally, ideas that build on the ideas of others need to include a novel facet to be counted as an idea. We did not include the aspect of usefulness, as we assume that it is too difficult for the experimenter to estimate the usefulness of an idea in real time during the discussion.

To estimate the reliability of this real-time coding of ideas, two persons (the experimenter and another person) coded the two brainstorming sessions of the first group using the video recordings. The coders were not allowed to pause or replay the video, so that the situation was as similar as possible to the real-time coding scenario during the study. Cohen’s kappa showed substantial agreement between the two coders (κ = .80).
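For readers unfamiliar with the measure, the following minimal sketch shows how Cohen’s kappa corrects the raw agreement of two coders for agreement expected by chance. The thesis does not state the unit of analysis, so the sketch assumes that each coder marks fixed segments of the recording as containing a new idea (1) or not (0). The function name and the labels are invented and are not the study’s data.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two coders who labeled the same units."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("both coders must label the same non-empty set of units")
    n = len(labels_a)
    # Observed agreement: share of units on which both coders agree.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement: product of the coders' marginal label frequencies.
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    categories = freq_a.keys() | freq_b.keys()
    expected = sum(freq_a[c] * freq_b[c] for c in categories) / n**2
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (observed - expected) / (1 - expected)


# Invented labels for ten segments (1 = new idea, 0 = no new idea).
coder_a = [1, 1, 0, 0, 1, 0, 1, 1, 0, 0]
coder_b = [1, 1, 0, 0, 1, 0, 1, 0, 1, 0]
kappa = cohens_kappa(coder_a, coder_b)
print(f"kappa = {kappa:.2f}")
```

In this toy example the coders agree on eight of ten segments, while chance alone would yield agreement on half of them, so kappa is lower than the raw agreement rate.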

After each condition, participants filled out pen-and-paper questionnaires with 5-point Likert scales (1 = strongly disagree, 5 = strongly agree). At the end, participants additionally filled in a final questionnaire asking about perceived differences between the two conditions, and about preferences and demographic information. A short semi-structured interview with the whole group was held and all participants were debriefed.

All sessions were audio and video recorded. Videos were taken so that all group members were visible from the front. A screen capture of the control interface was recorded and synchronized with the videos afterwards.

Participants

In the experiment, 30 volunteers took part in groups of three (12 female; average age: 24, range: 18 to 32 years). Of these, 22 were students, 6 were research assistants, and 2 stated other professions. In eight groups, participants already knew each other before the study. Participants could choose whether to receive a 10 € voucher from a well-known online store or to participate in the study as part of an obligation in their study program.

Results

The study was evaluated using the questionnaires, interviews, video recordings, logs, and the notes of the experimenter. A dependent t-test was used to evaluate quantitative data and a Wilcoxon signed-rank test to evaluate the results from the questionnaires. A 5% level of significance was applied for all tests. We used Excel to calculate the t-tests and the statistical software SPSS to calculate the Wilcoxon signed-rank tests.
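As a rough illustration of the dependent t-test named above, the following sketch computes the test statistic from paired observations. The analysis in the study was done in Excel, so this is only an equivalent formulation under that assumption. The function name and the idea counts are invented. The p-value would additionally require the t distribution with the returned degrees of freedom, which a statistics package provides.

```python
import math
from statistics import mean, stdev


def dependent_t(first, second):
    """Paired-samples t statistic and degrees of freedom."""
    if len(first) != len(second) or len(first) < 2:
        raise ValueError("need at least two paired observations")
    # The test operates on the per-pair differences.
    diffs = [a - b for a, b in zip(first, second)]
    # Standard error of the mean difference.
    se = stdev(diffs) / math.sqrt(len(diffs))
    if se == 0:
        raise ValueError("t is undefined when all differences are identical")
    return mean(diffs) / se, len(diffs) - 1


# Invented idea counts for five groups (not the study's data).
baseline = [10, 12, 9, 14, 11]
group_mirror = [12, 15, 9, 17, 13]
t, df = dependent_t(baseline, group_mirror)
print(f"t({df}) = {t:.2f}")
```

The sign of t depends on the order of the arguments; here a negative value means that the second sample (group mirror) is larger on average.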

Performance The mean number of ideas per group was 38.2 in the baseline (SE = 3.2) and slightly higher, 40.9 ideas, in the group mirror condition (SE = 1.3). A t-test did not show a significant difference. The concepts that aimed to increase the number of ideas were, as described in Section 5.1.2, the individual representations in the form of flowers, the group representation in the form of a tree, and the pictures shown in the clouds.

To better understand the influence of these factors on the results, we evaluated how often and for which reasons images were displayed. In three groups, there was no need to show any images. In one group, an image appeared once, in another group twice. In three groups, three images appeared, and in two groups, four images appeared during the brainstorming. However, the reason for showing the images was rarely that the idea flow had come to a standstill; mostly it was to encourage groups to think in other directions and to include more wild ideas. This indicates that, in our study, the images might have had only a minor impact on the number of ideas.

Despite the small difference in the number of ideas between the two conditions, participants still perceived themselves and their group members as more productive with support of the group mirror. In the final questionnaires, 91% of the participants stated that they were more motivated in the group mirror condition than in the baseline condition. Comparing the questionnaires that were handed out after both conditions revealed that participants rated the effectiveness of the brainstorming session higher with group mirror than without (z = -2.37, p < .05, r = -.53) (see Figure 5.5, diagram at the top). They also rated the effort of others to participate in the brainstorming higher in the group mirror condition than in the baseline (z = -2.67, p < .05, r = -.6) (see Figure 5.5, diagram in the middle). For instance, one participant stated: “I found that the second time [baseline] we somehow again and again discussed the same topics. I think the first time [group mirror condition] we have more productively addressed new ideas because you wanted to get bigger [flowers].” (G6, P3).

Figure 5.5: Results of the questionnaires. Questionnaires handed out after both conditions (with group mirror and baseline). Numbers indicate the percentage of participants who answered with that score on the 5-point Likert scale. Statements shown, from top to bottom: “The group could effectively conduct the brainstorming.”, “The other group members attempted to actively take part in the brainstorming.”, and “Group members were often criticized because of their contributions.”

Balance of Participation To assess how balanced the number of ideas was among group members, we categorized participants as below average or above average. This categorization was done after the study; thus, group members did not learn of their categorization. To do so, we took the baseline as the basis and divided the mean number of ideas per group by three to get the mean number of ideas per participant (M = 12.7). All participants with more than this number of ideas (i.e., at least 13 ideas) were categorized as above average, the others as below average. This resulted in 17 above average and 13 below average participants. We calculated a dependent t-test for each of these two groups. Results show that above average participants contributed significantly fewer ideas in the group mirror condition (M = 13.76, SE = .48) compared to the baseline (M = 16, SE = .74), t(16) = 3.27; [...]. Below average participants, in contrast, contributed more ideas with group mirror (M = 13.46, SE = .51) compared to the baseline (M = 8.46, SE = .69), t(12) = -5.36, p < .0001, r = .84. Figure 5.6 shows this effect.

Figure 5.6: Results on the number of ideas. Below average participants contributed more with group mirror compared to the baseline, above average participants less, leading to a more balanced brainstorming session. Error bars represent the standard error.
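The categorization step described above can be sketched as follows. The split uses the mean number of baseline ideas per participant as the threshold, as in the text. The function name, the participant identifiers, and the idea counts are invented for illustration and are not the study’s data.

```python
from statistics import mean


def split_by_baseline_mean(baseline_ideas):
    """Split participants at the mean number of baseline ideas per participant."""
    threshold = mean(baseline_ideas.values())
    # Strictly more ideas than the mean counts as "above average".
    above = sorted(p for p, n in baseline_ideas.items() if n > threshold)
    below = sorted(p for p, n in baseline_ideas.items() if n <= threshold)
    return threshold, above, below


# Invented baseline idea counts for two groups of three participants.
baseline_ideas = {
    "G1-P1": 16, "G1-P2": 8, "G1-P3": 14,
    "G2-P1": 19, "G2-P2": 9, "G2-P3": 10,
}
threshold, above, below = split_by_baseline_mean(baseline_ideas)
print(f"threshold = {threshold:.1f}")
print("above average:", above)
print("below average:", below)
```

Each of the two resulting subsets would then be compared across conditions with its own dependent t-test.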

These results are also supported by the answers from the questionnaires and interviews. 73% of the participants perceived participation levels as more balanced with support of the group mirror compared to the condition without that support. In the interviews, both below average and above average participants stated that they altered their behavior when supported with GROUPGARDEN: “I didn’t want to be the one with the ugliest flower and bugger up the growth of the tree.” (G1, P1) and “You restrain yourself more if you see that your flower is already bigger. I stopped talking then and thought: ‘let the others talk’” (G4, P2).

At the same time, participants did not feel obliged by the group mirror to balance their participation levels at any cost. Participants stated in the interviews that “the system would not restrain me from saying something, if I had a really good idea” (G1, P1) and that “if the others don’t come up with ideas at that moment I would still go on talking because then again you inspire the others.” (G1, P2).

Interruptions As stated before, we were interested in whether individual warnings (which were provided in case of interruptions or judgments of ideas) had an effect on groups. Hence, we measured the number of interruptions and judgments. All interruptions or cases of simultaneous speech that we observed were part of the natural conversation. For example, when two persons started to speak at the same time, we did not count this as an avoidable interruption.

Judgments Judgments of ideas, which trigger an individual warning, were also observed rarely, but significantly more often in the baseline than in the group mirror condition. With group mirror, three occurrences of judgments were noticed, and a warning was displayed in each case (M = .3, SE = .21). In the baseline, group members judged ideas of others 15 times (M = 1.5, SE = .34), t(9) = 4.81, p < .001, r = .85. This was also observed by the participants, who answered in the questionnaires that more occasions of judgments occurred in the baseline condition compared to the group mirror condition (z = -2.62, p < .01, r = -.59) (see Figure 5.5, diagram at the bottom). These results are also reflected in this statement from the interviews: “With the feedback system I took care to completely leave out any criticism.” (G4, P1).

Figure 5.7: Results of the questionnaires. Questionnaires were handed out after the group mirror condition. Numbers indicate the percentage of participants who answered with that score on the 5-point Likert scale. Numbers are rounded and thus might not add up to 100% exactly. Statements shown: “I think that the group mirror disrupted the group work.” and “The group mirror created a pressure to succeed.”
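The effect sizes r reported alongside the test statistics in this section are consistent with two common conversions: r = sqrt(t² / (t² + df)) for t-tests and r = z / sqrt(N) for the Wilcoxon signed-rank test. The thesis does not state which formulas or which N were used. The sketch below is therefore an assumption, and N = 20 in particular is chosen only because it reproduces the reported values.

```python
import math


def r_from_t(t, df):
    """Effect size r from a t statistic and its degrees of freedom."""
    return math.sqrt(t**2 / (t**2 + df))


def r_from_z(z, n):
    """Effect size r from a z score and the number of observations n."""
    return z / math.sqrt(n)


ASSUMED_N = 20  # assumption: not stated in the text

# t statistics and degrees of freedom as reported in this section.
for t, df in [(-5.36, 12), (4.81, 9)]:
    print(f"t({df}) = {t}: r = {r_from_t(t, df):.2f}")

# z scores as reported in this section.
for z in (-2.37, -2.67, -2.62):
    print(f"z = {z}: r = {r_from_z(z, ASSUMED_N):.2f}")
```

Note that the t-based conversion always yields a non-negative r, whereas the z-based conversion keeps the sign of z, which matches the signs of the reported values.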

Deviation from the Topic Group warnings in the form of a lightning bolt were also rarely needed. In the group mirror condition, four groups were warned once each about digressing from the topic (M = .4, SE = .16); in the baseline, deviations from the topic were recorded seven times (three of them in one group). A comparison between both conditions did not reveal a significant difference.

Distraction and Pressure In the questionnaires, we asked participants whether GROUPGARDEN distracted them and whether they felt under pressure. Results indicate that the group mirror was hardly distracting but created a pressure to succeed (see Figure 5.7).

In the final questionnaires, one participant wrote: “It is difficult for me to say, if the negative effect of the pressure the feedback produces outweighs the advantages that come from this pressure, to generate ideas by all means.” (G1, P2). Another participant stated: “I perceive it as rather disrupting to constantly see the progress of the brainstorming, because the feeling emerges that some kind of pressure is produced.” (G4, P1).

Seating Arrangement Groups were free to choose their seating arrangement. Six groups positioned themselves in the form of a triangle in front of the visualization. In this arrangement, all group members can see each other, but for two of them, the visualization is in the periphery of their field of vision. The remaining four groups sat in a row, as in a cinema, which made eye contact more difficult but gave a good view of the group mirror. Both seating arrangements were perceived as not ideal. Participants remarked: “I found the seating arrangement a bit difficult. Now [in the baseline] I found it much more pleasant that we could sit in a circle and look at each other.” (G2, P2). “It would be cool if we could have the visualization more centered, or on all walls.” (G9, P2).


Preferences and Design When asked which brainstorming session they liked more, 74% of the participants answered in favor of the session supported by the group mirror. Furthermore, 70% were more satisfied with the results of that brainstorming session. As already stated above, 91% felt more motivated; two participants explicitly mentioned that the “playfulness” of GROUPGARDEN was the main reason. One participant stated: “This is like a task for all of us, like a game” (G6, P2).

Most participants liked the visual design of GROUPGARDEN. When asked in the questionnaires whether they liked the visual representation, 93% either fully agreed or agreed, and 7% stated that they did not have an opinion. The visualization was rated as intuitive and simple: “You know immediately what it means.” (G1, P2). However, it was also perceived as “(...) still designed childlike.” (G4, P2). It was remarked that whether the design is appropriate depends on the usage scenario: “(...) I wouldn’t give it to a businessman, but for us it was actually pleasing.” (G5, P3). One aspect that was confusing and not well received was the images that were displayed in the clouds.

Summary and Discussion

With GROUPGARDEN, we presented a group mirror that displays feedback about behavior during a brainstorming session to a group. It combines individual and group feedback to increase the number of ideas and to balance participation.

However, the number of ideas did not differ significantly between the group mirror condition and the baseline. A possible explanation for this is that phases of silence occurred rarely: groups did not come to a point where they ran out of ideas. This indicates that in phases of constant idea flow, as in our study, people do not create more ideas with support by a group mirror than without that support. At the same time, this indicates that the time spent on explaining individual ideas does not seem to decrease with group mirror. However, we assume that without a time limit (in our study, each brainstorming session was limited to 15 minutes), the group mirror might have a positive impact on the quantity of ideas. When groups run out of ideas toward the end of a brainstorming session, the group mirror might encourage them to continue and to think in other directions. A general discussion of the difficulties of achieving significant results in the studies presented in this thesis is provided in Section 9.2.5.

GROUPGARDEN successfully helped to balance participation of group members. However, this means at the same time that very productive group members decreased their number of contributions. As the overall number of ideas did not decrease, this can be seen as a positive effect, as it shows that free riding is less likely and that group members who tend not to contribute are motivated to participate more actively.

Besides increasing the quantity of ideas and balancing participation, GROUPGARDEN aims at minimizing occurrences of interruptions, judgments, and deviations from the topic. Results showed that ideas of others were judged significantly more often in the baseline condition. However, all of these issues were noticed rarely. This might be due to the artificial situation of the laboratory study: participants were aware that they were observed and recorded. In a more natural setting, these issues might occur more often, and the group mirror might therefore have a bigger impact on them.

We observed that the usage scenario of such a feedback system is an important factor. The design of GROUPGARDEN was considered more appropriate for informal use cases than for professional contexts. Creating different visual designs for different use cases can solve this issue. After having conducted this study and observed groups using the system, we assume that GROUPGARDEN is more appropriate for learning how to brainstorm than for use as a constant support. This would mean that groups use the group mirror until they have internalized the rules of brainstorming. More unobtrusive designs that demand less attention might be more appropriate as a constant support during group work.

Finally, we observed that displaying feedback on a wall led to several problems. Groups positioned themselves differently in front of the wall. When sitting more side-by-side, eye contact and a natural conversation were more difficult; when facing each other, the visualization was in the periphery of the field of vision for several group members. Therefore, we decided to study the effects of the display setting in more detail. The adaptation of the initial prototype and the study comparing two different display settings are explained in the next sections.

In document Tausch, Sarah (2016): The influence of computer-mediated feedback on collaboration. Dissertation, LMU München: Fakultät für Mathematik, Informatik und Statistik (Page 97-104)