6 Experiment 2 An Investigation of the Effects of the Mappings and Harmonicity of the Audio
6.6 Experiment-2 Data Analysis
6.6.2 Question-2: results
This section presents the results for the second question of the experiment, where expert and non-expert subjects in the supervised and unsupervised conditions reported their confidence in their decision (i.e. the image they chose as corresponding to each audio stimulus). A repeated-measures ANOVA using a general linear model was computed with the confidence ratings as the within-subjects factor and the corpus and the mapping as between-subjects factors. The analysis tested the null hypothesis of no association between the subjects’ confidence ratings and either the mapping or the audio corpus.
Analysis of the subjects’ confidence ratings revealed no significant differences due to the corpus, except for the experts in the unsupervised condition (as reported in Table 18). For the mapping factor, significant differences in the data means were found for both groups in the unsupervised condition, whereas no significant differences were found for either group in the supervised condition. No interactions between the mapping and audio corpus factors were revealed. The results of the analyses are shown in Table 18. Figure 59 and Figure 60 show the data means of all participants’ confidence ratings for the corpus and the mapping factors respectively. An additional ANOVA model was computed to examine the relationship between reported confidence and correct detections, which was found to be significant for all groups except the expert group in the supervised condition. The results of this analysis are shown in Table 19.
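The relationship between confidence and correct detection can be illustrated with a minimal sketch of a one-way ANOVA F statistic. It assumes only two levels (correct and incorrect responses) and uses hypothetical ratings, so it is a simplification and not the general linear model reported in Table 19. The function name `one_way_anova_f` and the data are illustrative assumptions.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA over lists of ratings."""
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)
    k = len(groups)        # number of groups (levels of the factor)
    n = len(all_values)    # total number of observations

    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of the observations around their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)

    df_between = k - 1
    df_within = n - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical confidence ratings (0-100); NOT the data of this experiment.
confidence_correct = [80, 75, 90, 85, 70]
confidence_incorrect = [60, 65, 70, 55, 75]

f_stat, df_b, df_w = one_way_anova_f([confidence_correct, confidence_incorrect])
print(f"F({df_b}, {df_w}) = {f_stat:.2f}")
```

A larger F value means that the difference between the mean confidence for correct and incorrect responses is large relative to the spread of ratings within each group. The corresponding P value would be read from the F distribution with the given degrees of freedom.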
Table 18. General ANOVA model computed to investigate the effect of the A/V associations, corpus and mapping on the perceived similarity reported by the subjects. An asterisk (*) indicates statistical significance.

                   Corpus           Mapping          Interactions
Group              F      P         F       P        F      P        Total
Non-experts (s)    2.25   .1        .9      .4       1.89   .1       223
Non-experts (u)    .28    .8        14.14*  <.001    1.93   .1       215
Experts (s)        .70    .5        2.26    .1       0.75   .5       127
Experts (u)        4.74*  <.005     15.51*  <.001    1.7    .1       311
df                 3                1                1
s = supervised / u = unsupervised

Table 19. General ANOVA model computed to investigate the effect of correct detection on the reported confidence level. An asterisk (*) indicates statistical significance.

                   Correct detection
Group              F       P        Total
Non-experts (s)    7.53*   <.005    223
Non-experts (u)    6.15*   <.05     215
Experts (s)        .34     .5       127
Experts (u)        16.73*  <.001    311
df                 3
s = supervised / u = unsupervised
Figure 59. Data means and confidence intervals of participants’ confidence ratings for the corpus factor, for all groups. Where intervals do not overlap, the corresponding means differ significantly.
Figure 60. Data means and confidence intervals of participants’ confidence ratings for the mapping factor, for all groups. Where intervals do not overlap, the corresponding means differ significantly.
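The reading rule used in the two figure captions (non-overlapping intervals indicate a significant difference) can be sketched as follows. This is a minimal illustration that assumes a normal-approximation 95% interval (z = 1.96). How the intervals in the figures were actually computed is not stated, and the ratings below are hypothetical. The names `mean_ci` and `intervals_overlap` are illustrative.

```python
from math import sqrt
from statistics import mean, stdev


def mean_ci(values, z=1.96):
    """Return (lower, upper) of a normal-approximation 95% interval for the mean.

    Assumption: z = 1.96; a t-based interval would be wider for small samples.
    """
    centre = mean(values)
    half_width = z * stdev(values) / sqrt(len(values))
    return centre - half_width, centre + half_width


def intervals_overlap(a, b):
    """True if the closed intervals a = (lo, hi) and b = (lo, hi) share any point."""
    return a[0] <= b[1] and b[0] <= a[1]


# Hypothetical confidence ratings for two factor levels; NOT the experiment's data.
ratings_a = [80, 82, 78, 81, 79]
ratings_b = [50, 52, 48, 51, 49]

ci_a = mean_ci(ratings_a)
ci_b = mean_ci(ratings_b)
print("intervals overlap:", intervals_overlap(ci_a, ci_b))
```

Note that the rule is one-directional: non-overlapping intervals imply a significant difference, but overlapping intervals do not by themselves show that the difference is non-significant.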
6.7 Discussion
The present experiment tested the effects of the corpus and the mapping on the subjects’ ability to discriminate audio-visual stimuli. The subjects’ discrimination ability served as an indicator of the comprehensibility and effectiveness of the mappings. The primary purpose of the mappings tested in the present experiment is to enable interaction with corpus-based concatenative synthesis for creative applications. Musical and sound training factors were also measured. Overall, the subjects’ responses are consistent across the groups and conditions. The consistency between the two subject groups (i.e. sound practitioners and non-sound practitioners) suggests that musical/sound training was not a significant factor. Furthermore, the consistency between the two experimental conditions (controlled environment and online) suggests that both approaches to gathering data are equally well suited to this type of experiment.
In agreement with my hypothesis, the experimental results revealed that participants' success rate in detecting the correct image did not vary significantly as a result of the mapping. However, a general trend in the dataset is that both the chromatic and the achromatic mapping enabled participants to detect images well above chance level (33.3%, given that there were three candidate images for each audio stimulus). Overall, the chromatic mapping enabled participants to correctly detect more images than the achromatic mapping, although this difference was not statistically significant.
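Whether a given detection rate exceeds the 33.3% chance level can be checked with an exact one-sided binomial test. The sketch below assumes independent trials and uses hypothetical counts; the function name `binomial_tail` is illustrative, and this is not necessarily the test used in the analysis.

```python
from math import comb


def binomial_tail(successes, trials, p_chance=1 / 3):
    """Probability of at least `successes` correct answers in `trials` by guessing.

    p_chance = 1/3 reflects three candidate images per audio stimulus.
    """
    return sum(
        comb(trials, k) * p_chance ** k * (1 - p_chance) ** (trials - k)
        for k in range(successes, trials + 1)
    )


# Hypothetical counts; NOT the data of this experiment.
n_trials = 100
n_correct = 50

p_value = binomial_tail(n_correct, n_trials)
print(f"P(at least {n_correct}/{n_trials} correct by chance) = {p_value:.5f}")
```

A small probability indicates that the observed success rate is unlikely to arise from guessing among the three images.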
In agreement with my hypothesis, the harmonicity of the audio corpus used to synthesise the audio stimuli from the images had a significant effect on the subjects' ability to detect the correct image. Furthermore, as predicted, the detection success rate was highest when the string corpus was used, followed by the birds and the impact corpus, while the lowest success rate was observed when the wind corpus was used. The analysis shows that the corpora with the strongest effect on the subjects’ successful detection rate were the string and the wind. Moreover, it is worth noting that the non-sound practitioners’ correct detection rate in the supervised condition when the wind corpus was used was 27.6%, below the chance level of 33.3%. For the remaining groups and conditions, the success rate with this corpus was not far above chance level.
Contrary to the results of the first study, in the second study the harmonicity of the source audio that makes up the corpus appears to be important. A first interpretation of the effect of the harmonicity of the audio corpus on the subjects' ability to detect the correct image is that when the sound corpus is not harmonic and continuous, the resulting sounds can be noisy and lack clarity. This could reduce the effectiveness of the mapping by lowering the salience of the audio-visual associations. This in turn weakens the ability of the participants to attend to the causal relationship between the image and the sound. However, further research will be necessary to support this claim.
Another interpretation of the divergence in the results of the two studies regarding the influence of the harmonicity of the audio corpus is that in the first study the task was easier and less cognitively demanding. In the second task, multiple audiovisual parameters were manipulated simultaneously and the images were very similar, so the decision that subjects were asked to make was far more complex than in the first task. The second study places a greater demand on detecting subtle differences, forcing the participant to actively seek cues to determine which image is the correct one. As a result, the clarity of the sounds that make up the corpus became an important factor. My conclusion is that, in the context of corpus-based synthesis, the salience and efficacy of the cross-modal associations involved in a multidimensional mapping depend to a degree on the typological features of the source audio that makes up the corpus. Hence, the effectiveness of mappings that link user sensorimotor actions to audio parameters for the control of sound and music is subject to the qualitative characteristics of the sound used for testing the mapping. However, further research will be necessary to assess the degree of this effect.
Contrary to my hypothesis, non-sound practitioner subjects performed better overall than expert subjects in both the supervised and the unsupervised experiment; however, the difference between the two groups was not significant. As in the previous experiment, my interpretation of the absence of significant differences between the expert and the non-expert group is that the cross-modal correspondences tested in this study do not depend on the subjects' level of music/sound training. These findings agree with those of Lipscomb and Kim (2004) and oppose those of Küssner and Leech-Wilkinson (2013), Küssner (2014) and Walker (1987).
Finally, the participants’ confidence ratings revealed no statistically significant correlation between confidence levels and correct detection, although an overall trend can be observed: confidence levels are on average higher when participants responded correctly than when they responded incorrectly. However, the confidence levels reported for incorrect responses are very high (in most cases well above 50%), which gives no indication that participants were aware that their responses were incorrect. Overall, participants felt more confident when the chromatic mapping and the string corpus were used. Furthermore, the confidence ratings show that participants felt least confident when the wind corpus was used; for the other three corpora, however, the results did not follow a strong pattern.