Inevitably, developing and evaluating such a complex tool introduces multiple threats to validity. This section discusses those threats, grouped by whether they pertain to the CreativeTeams tool or to the related study.
7.3.1 The CreativeTeams Tool
There are multiple threats to the validity of the CreativeTeams tool.
1. External threat: Generalisation. Does CreativeTeams actually measure creativity? This question is impossible to answer with any certainty, because there is no agreed definition of creativity and no alternative measure of team creativity is available for comparison. Taking these points into consideration, the CreativeTeams tool is best understood as currently providing a repeatable means of assessing novel and divergent thinking in teams.
2. External threat: Replication. Is the adaptation of the Torrance Tests of Creative Thinking (TTCT) [8] successful? That is, does the CreativeTeams adaptation actually provide a measure of creativity comparable to the original TTCT? Unfortunately, this question cannot be answered directly. Torrance based his work on the creativity of individuals, so his data cannot be compared with the team data gathered herein. I would argue, however, that the tool imitates the original paper-based tool as closely as possible. Working on iPads provides users with a simple shared canvas, and practice time is provided to ensure that participants understand how to use the tool. On these grounds, I consider the CreativeTeams tool a fair adaptation.
3. External threat: Replication. Not all of the Torrance Tests of Creative Thinking are included in the CreativeTeams tool. Whilst the tool provides a good adaptation of the TTCT Figural activities, it neglects most of the verbal part of the tests. The difficulty with a full adaptation lies in the nature of the verbal tests, which ask individuals to list aloud responses to a series of prompts. The CreativeTeams tool includes an adaptation of one of these activities, the Alternative Uses activity. However, I was unable to adapt the other two verbal activities because the original visual prompts that Torrance [155] based his work on were not available. That is, the images that Torrance gave to his participants as prompts were included neither in the Norms-Technical manual nor in the copies of the scoring guides that I was able to access. Future work should therefore seek access to these activities and include them within the CreativeTeams suite, to provide a fully digital, collaborative form of the Torrance Tests of Creative Thinking.
4. Internal threat: History effect. Does the age of Torrance’s [8] scoring guide affect the validity of the data? The problem is that the guide is based on the common responses produced by individual participants some 40 years ago. This complicates scoring because many teams produce responses that refer to contemporary popular culture. For example, many teams draw Harry Potter in response to one of the Picture Completion starting shapes, which resembles a lightning bolt. In the current context this would not be considered an original response, and so it deserves a low originality score. However, Torrance’s [8] scoring rubric does not include Harry Potter, and the response is therefore given the maximum originality score. The risk is that contemporary responses that are not original skew the data and lead to a false interpretation. To investigate the possible impact, I repeated the Mann-Whitney U analysis with a data set that excluded all responses given high scores simply because they were not on Torrance’s list (see table 6.3). The results indicate that there remains no difference between co-located and virtual team originality. The use of the older scoring rubric therefore seems unlikely to have a major impact on the analysis of Originality scores. However, future versions of the test should incorporate a new rubric to increase the accuracy of scoring.
5. Internal threat: Activity design. The Design Challenge and Design Questions activities were meant to supplement the adapted TTCT activities. On reflection, both activities appear to have been flawed, despite a series of prototyping stages. They were meant to challenge teams to demonstrate scenario-based creativity, and in particular creative dialogue. However, during the study teams interpreted the instructions too broadly, producing responses so varied as to be impossible to compare. Teams were asked to generate novel chair designs for use at music festivals and for hiking, and they produced a wide variety of designs, from regular folding chairs through to futuristic flying chairs. The main problem is that teams recorded their designs very differently (see examples in appendix A). Most used the drawing environment to aid their conversations. Some teams therefore produced meaningful static drawings that can be analysed, whilst others used the tool only as an aid to dialogue, drawing very little and instead expressing their creativity through conversation. Such variation in how teams approached the problem is fascinating, but it is highly problematic for analysis. As such, the data gathered from this activity was excluded from further analysis in this thesis. Future versions of these activities will need more specific instructions so that teams approach them consistently.
6. Internal threat: Maturation effect. There is a risk that teams could be affected by either learning or fatigue effects. If all teams completed the tests in the same order, they might become more familiar with the type of activity or with the tool and consequently perform better in later activities. Alternatively, teams might become fatigued by the length of the test and perform worse in later activities. Either would skew the results. This risk is mitigated by presenting teams with the tests in a random order; no two teams in the sample of 37 completed the activities in the same order.
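The randomised ordering described for the maturation effect can be sketched as follows. This is a minimal illustration, not the CreativeTeams implementation: the text does not say how the orders were generated, whether all six activities were randomised together, or whether uniqueness was enforced rather than arising by chance. The function name `assign_unique_orders` and the seed are hypothetical.

```python
import itertools
import random

# Activity names are taken from the text. Treating all six as one
# randomised set is an assumption made for illustration.
ACTIVITIES = [
    "Picture Construction",
    "Picture Completion",
    "Parallel Lines",
    "Alternative Uses",
    "Design Challenge",
    "Design Questions",
]


def assign_unique_orders(n_teams, activities, seed=None):
    """Return one distinct activity ordering per team (hypothetical helper)."""
    rng = random.Random(seed)
    # Enumerate every possible ordering, then draw without replacement
    # so that no two teams can receive the same sequence.
    all_orders = list(itertools.permutations(activities))
    if n_teams > len(all_orders):
        raise ValueError("more teams than distinct orderings")
    return rng.sample(all_orders, n_teams)


# 37 teams, as in the study sample; the seed only makes the sketch repeatable.
orders = assign_unique_orders(37, ACTIVITIES, seed=1)
```

Sampling without replacement from the full set of permutations guarantees distinct orders by construction. Shuffling independently per team would also be likely, but not certain, to give 37 distinct orders out of the 720 possible.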
7.3.2 The Study
The study itself introduces a number of possible threats to validity, with several pertaining to participant selection and team formation, and others relating to the scoring of outputs.
1. External threat: Generalisability. The decision to use teams of three participants limits the generalisability of these findings. However, such small teams had to be used for pragmatic reasons, namely to increase the likelihood of recruiting enough participants to produce a sufficiently large sample of teams. Ideally, future studies would use a variety of team sizes to explore the relationship between creativity and team size.
2. External threat: Selection bias. Participants were recruited from the student population. Student participants are far less experienced at working in teams and therefore may not be representative of the majority of established professional teams. For this reason I decided to concentrate on newly formed teams, deliberately forming teams in which participants did not know each other. This has the added benefit of enabling this research to focus entirely on the creative processes of teams without established Transactive Memory Systems [174].
3. External threat: Selection bias. Limitation of participant experience. Recruiting from the student population introduced problems during prototyping: student teams were unable to complete more realistic scenario-based activities. I would hypothesise that this is because students generally have little work experience. This factor motivated the move away from the search for a realistic scenario-driven creativity activity and towards the more abstract Torrance Tests of Creative Thinking [8]. This in turn has the benefit of making the CreativeTeams tool suitable for studying almost any group of interest, regardless of background or experience.
4. Internal threat: Selection bias. The Torrance Tests of Creative Thinking [8] are heavily reliant on participant drawing. It is therefore possible, and even likely, that participants with a greater propensity for drawing produce more artistically accurate responses, potentially skewing their team’s scores. To take this into account, participants were asked to evaluate their own artistic experience on a scale. These values were then compared with their team’s scores to see whether artistic experience introduced any bias to the scoring. The results in section 6.4 suggest not.
5. Internal threat: Experimenter bias. The scoring of the Picture Construction, Picture Completion, Parallel Lines and Alternative Uses activities is also problematic. In addition to the age of the test, highlighted previously, there is a risk that markers without Torrance’s [8] extensive experience do not score outputs consistently. Multiple markers were therefore used to mitigate the risk of inconsistent marking. The markers demonstrated a high level of inter-rater agreement when using Torrance’s [8] scoring guides.
6. Internal threat: Confounding. The majority of the analysis indicates no difference in the performance of co-located and virtual teams. However, a difference was observed during the Parallel Lines activity, in the elaboration scores and the related drawing metadata. This finding runs contrary to the trends identified across the wider range of activities and data. To explore this threat further, the dialogue of the teams was analysed and contrasted with that of teams completing the most similar activity, Picture Completion. No significant differences were found in the quantity of dialogue that teams used in either activity, nor was there any difference in the level of verbal interaction between the co-located and virtual teams. As it stands, this finding must be treated as an anomaly. It is possible that the Parallel Lines activity is not suitable for completion by teams. Further research is needed to understand this difference.
7. Internal threat: Compensation rivalry. There is a risk that the virtual teams actively worked harder because they felt they were at a disadvantage in comparison to their co-located counterparts. However, I did not witness this during the study. Furthermore, there is no difference in the number of utterances used by co-located and virtual teams during either the Parallel Lines or the Picture Completion activity. As such, it seems unlikely that the virtual teams actively worked harder.
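A comparison of utterance counts between co-located and virtual teams, as in the compensation rivalry item, could be made with a rank-based test. The text does not state which test was applied to the utterances; as an assumption, the sketch below computes the Mann-Whitney U statistic, the test named in section 7.3.1 for the originality scores. The function name and the utterance counts are hypothetical and are not study data.

```python
def mann_whitney_u(sample_a, sample_b):
    """Return (U_a, U_b) for two independent samples, using midranks for ties."""
    # Pool the observations, tagging each with its group (0 = a, 1 = b).
    pooled = sorted(
        [(value, 0) for value in sample_a] + [(value, 1) for value in sample_b],
        key=lambda pair: pair[0],
    )
    rank_sum_a = 0.0
    i = 0
    while i < len(pooled):
        # Find the run of tied values starting at position i.
        j = i
        while j < len(pooled) and pooled[j][0] == pooled[i][0]:
            j += 1
        # Tied observations share the average of ranks i+1 .. j.
        midrank = (i + 1 + j) / 2
        rank_sum_a += midrank * sum(1 for k in range(i, j) if pooled[k][1] == 0)
        i = j
    n_a, n_b = len(sample_a), len(sample_b)
    u_a = rank_sum_a - n_a * (n_a + 1) / 2
    u_b = n_a * n_b - u_a
    return u_a, u_b


# Invented utterance counts per team, for illustration only.
co_located = [112, 98, 130, 105, 121]
virtual = [108, 101, 127, 99, 118]
u_co_located, u_virtual = mann_whitney_u(co_located, virtual)
```

The two statistics always sum to the product of the sample sizes, and values of U close to half that product are consistent with no difference between groups. A significance decision would additionally need exact tables or a normal approximation, which this sketch omits.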