Chapter 5: Class Study with vLeader
5.3 Results
5.3.2 Performance Data
The investigation of the validity of vLeader performance data allowed for one direct measurement of the effects of variation and led to three different observations related to students’ experiences within the study.
In contrast to the rest of the scenarios, the way variation was introduced to Scenario 5, together with the data that vLeader collected, made it possible to directly measure the effects of introducing variation on student behaviour through the game scoring mechanism. To do that, the variance between the different components of the business score was compared across the two groups using a statistical significance test. Group membership (experimental or control) was used as the independent variable and the variance as the dependent variable. An unpaired one-tail t-test with the assumption of equal variances found no statistically significant difference (t(df=50) = -0.0322, p = 0.48). The dataset for this test is available in Appendix B. This result indicates that the activity sheet for Scenario 5, intended to introduce variation for the purpose of improving understanding, did not lead students to exhibit wide variance in their performance. Such variance would have been a sign that students varied their scores in response to being asked to explore variation in the assignment.
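As a rough illustration of the comparison described above, the following Python sketch computes a variance per play across the three business score components and then an unpaired t statistic with pooled (equal) variances. Several things here are assumptions rather than the procedure actually used: that the variance was taken per play across the financial, customer and employee components, the function names `component_variance` and `pooled_t`, and all of the scores, which are invented and are not the Appendix B dataset.

```python
import math
from statistics import mean, variance


def component_variance(financial, customer, employee):
    """Sample variance across the three business score components of one play."""
    return variance([financial, customer, employee])


def pooled_t(a, b):
    """Unpaired Student's t statistic assuming equal variances; returns (t, df)."""
    n1, n2 = len(a), len(b)
    df = n1 + n2 - 2
    # Pooled variance weights each group's sample variance by its degrees of freedom.
    pooled = ((n1 - 1) * variance(a) + (n2 - 1) * variance(b)) / df
    se = math.sqrt(pooled * (1 / n1 + 1 / n2))
    return (mean(a) - mean(b)) / se, df


# Hypothetical (financial, customer, employee) scores, one tuple per play.
experimental = [(70, 65, 80), (55, 60, 58), (90, 72, 64), (61, 61, 70)]
control = [(68, 75, 71), (50, 66, 59), (82, 80, 77), (64, 58, 73)]

exp_var = [component_variance(*play) for play in experimental]
ctl_var = [component_variance(*play) for play in control]

t, df = pooled_t(exp_var, ctl_var)
print(f"t(df={df}) = {t:.4f}")
```

A t value close to zero, as reported for Scenario 5, means the two groups' mean component variances were nearly identical relative to their spread.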
Three observations could be made, concerning students’ perceptions of the practice and advance modes provided in vLeader, the interference of exploratory behaviour with performance data, and the different tactics students employed in order to conceal their scores. The first of these observations involved students’ perception of the different modes of play (practice and advance). In line with Malheiros and colleagues’ (2011) findings about user expectations, some students wrongly assumed that while they were playing in practice mode, their performance data was not being recorded for later review. This was not suggested in the consent form, and I communicated the misunderstanding to all students once I became aware of it.
Plays        Experimental         Control              t-test
             Mean    Variance     Mean    Variance     t(df=38)   p
Scenario 1   10.2     78.59        6.1     20.58        1.86      0.0364
Scenario 2    3.6      8.89        3.4      5.52        0.18      0.4304
Scenario 3    4.0     11.00        3.9      9.04        0.05      0.4802
Scenario 4    1.8      5.12        2.4      5.62       -0.82      0.2090
Scenario 5    1.2      2.66        1.5      4.37       -0.51      0.3079
Total        20.7    282.45       17.2    121.54        0.77      0.2241

Table 9: Comparison between the control and experimental groups in terms of number of plays for each scenario and in total for all scenarios. An unpaired one-tail t-test was used. The number of plays differed only in Scenario 1, as indicated in Figure 10b.

The recorded performance data was explored to verify this phenomenon and the results are aggregated in Table 10. To ensure validity, only those data entries were considered that belonged to students who had played a given scenario in both practice and advance modes. These entries represented 32% of all collected game data and 93% of students that played at all. Although it was considered that the actual student and scenario number might have a mediating or moderating role, this was ignored and a paired t-test was conducted with mode of play as the independent variable and the score components as dependent variables. A statistically significant difference emerged for several sub-component scores and for all aggregated scores. Confidence was higher for the business score components. Students had clear insight into how business scores were calculated. As a consequence, it could be speculated that they were better able to affect these when trying to reach a high score in advance mode. The statistical analysis shows that students’ intention to perform better when playing in advance mode had the least effect on the tension score. They were least able to purposefully demonstrate their ability to achieve a good balance of tension when they attempted to do so.
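The paired comparison between practice and advance plays can be sketched as follows. This is a minimal illustration under stated assumptions, not the analysis script used in the study: the function name `paired_t` is hypothetical, the scores are invented, and the ordering of the difference (practice minus advance, so that a negative t corresponds to higher advance scores) is an assumption about the sign convention.

```python
import math
from statistics import mean, stdev


def paired_t(first, second):
    """Paired t statistic for matched samples; returns (t, df)."""
    if len(first) != len(second):
        raise ValueError("paired samples must have equal length")
    # One difference per matched (student, scenario) pair.
    diffs = [x - y for x, y in zip(first, second)]
    n = len(diffs)
    t = mean(diffs) / (stdev(diffs) / math.sqrt(n))
    return t, n - 1


# Hypothetical total scores for the same (student, scenario) pairs.
practice = [62, 70, 55, 81, 66]
advance = [68, 74, 61, 80, 73]

t, df = paired_t(practice, advance)
print(f"t(df={df}) = {t:.2f}")
```

Because each pair shares a student and a scenario, the test works on the within-pair differences, which is why only entries played in both modes could be included.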
The second observation had to do with the fact that, in order to encourage exploration and variation during class discussions, students were also asked to experiment with being passive in the game or with taking different sides in an argument (for example, see Scenarios 3 and 4). As a consequence of this, and sometimes driven by their own intention to explore alternatives, in certain plays students would have reached scores that do not reflect the maximum of their potential. Although for most students it could be argued that they did this exploration in practice mode, the fact that several students played only in advance mode raises questions about the validity of these results as a measure of overall proficiency. As a reminder, a student who does not play in advance mode cannot progress to Scenario 2 and beyond.
Score Component   Aggregated in   t(df=121)   p
Power             Leadership      -2.78       0.0031
Tension           Leadership      -0.60       0.2764
Ideas             Leadership      -2.21       0.0146
Financial         Business        -3.66       0.0002
Customer          Business        -3.76       0.0001
Employee          Business        -2.34       0.0106
Leadership        Total           -3.35       0.0005
Business          Total           -3.70       0.0002
Total                             -3.86       0.0001
Table 10: Probability thresholds for differences in vLeader score components, according to mode of play – practice or advance. The figure was calculated with a one-tail paired t-test. Compare the score hierarchy with Figure 8.
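To make the relation between the t and p columns concrete, the sketch below derives a one-tail p-value from a t statistic and its degrees of freedom by numerically integrating the Student's t density. It is an illustrative reconstruction under stated assumptions, not the tool used to produce the table: the function names `t_pdf` and `one_tail_p` and the Simpson's rule step count are my own choices, and because the tabulated t values are rounded, the results only approximate the tabulated p-values.

```python
import math


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_norm = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_norm - (df + 1) / 2 * math.log1p(x * x / df))


def one_tail_p(t, df, steps=2000):
    """One-tail p-value P(T >= |t|); steps must be even for Simpson's rule."""
    upper = abs(t)
    if upper == 0:
        return 0.5
    h = upper / steps
    total = t_pdf(0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    # Half the probability mass lies above zero; subtract the part below |t|.
    return 0.5 - total * h / 3


# Rounded t values taken from the Tension and Total rows of Table 10.
for label, t in [("Tension", -0.60), ("Total", -3.86)]:
    print(label, round(one_tail_p(t, 121), 4))
```

The one-tail value is half of the corresponding two-tail value, which is appropriate here because the direction of the expected difference was stated in advance.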
The third and last observation that undermined the validity of the collected performance data was based on reports by some students that they had figured out ways to submit their results selectively. One student in particular reported that shortly before finishing a scenario, he would determine whether he was satisfied with his performance, and if not, he would quit the game before the final scores were calculated and submitted. In this way he gained control over which results were submitted and which were not. Two other students reported playing the game while disconnected from the Internet, thus preventing it from submitting results. While such uses of vLeader were envisioned when Simulearn Inc. developed the game, the contingency mechanism for later submission works only when users willingly collaborate and submit the locally stored data manually. Yet the collected data suggests that these were exceptions that should have had only minimal impact on the overall performance of the groups.