An Evaluation Framework for Musification
7.1 Evaluating Sonification
Surveys of sonification papers report that few projects have carried out statistical evaluations of functional sonifications (Vogt, 2011; Degara, 2013; Ibrahim et al., 2011). While there are no established evaluation methods for sonification, ‘end-user testing with at least a working prototype’ (Ibrahim et al., 2011, p. 79) is a common approach. Examples of tasks used include matching, comparison, classification, ordering, association, prediction, finding, memorisation, navigation and identification tasks (Ibrahim et al., 2011, pp. 79-81).
Evaluating sonification presents a range of unique challenges. It can be difficult to find enough participants to obtain statistically relevant results. It is also rare to find sufficient participants who are willing and knowledgeable enough ‘to assess the sonification for both its sonic and domain science value’ (Vogt, 2011).
Furthermore, participants often complete tests on different equipment and in potentially noisy environments (Degara et al., 2013, p. 170). Tests in a laboratory environment need to be combined with testing in the implementation environment, as ambient noise can distort results collected in a neutral environment (Bonebright and Flowers, 2011, pp. 111-13). A final challenge worth mentioning is the importance of aesthetics in auditory display evaluation. While aesthetics is an important consideration in sonification design, it remains a subjective perception and must be weighted and combined with more objective criteria (Degara et al., 2013, p. 70).
7.1.1 Sonification Evaluation Criteria
Existing sonification evaluation criteria are surveyed briefly to determine their usefulness for musification. The discussion will inform the proposed set of criteria for the evaluation of musification.
Hermann
The first question in a sonification evaluation might ask whether the auditory display is indeed a sonification. Hermann proposes that the following criteria must be fulfilled (2008, p. 2):
‘(C1) The sound reflects objective properties or relations in the input data.
(C2) The transformation is systematic. This means there is a precise definition provided of how the data (and optional interaction) cause the sound to change.
(C3) The sonification is reproducible: given the same data and identical interactions (or triggers) the resulting sound has to be structurally identical.
(C4) The system can intentionally be used with different data, and also be used in repetition with the same data.’
This framework tests the definition of a sonification but not its qualitative characteristics.
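Criteria (C2) to (C4) can be illustrated with a minimal Python sketch. It assumes a simple linear parameter mapping from data values to MIDI pitch numbers; the function name `map_to_pitches`, the pitch range and the sample data are hypothetical and are not taken from Hermann. The sketch only shows that a precisely defined mapping, applied twice to the same data, yields a structurally identical result, and that the same mapping can be reused with different data.

```python
def map_to_pitches(data, low=48, high=84):
    """Linearly map data values onto MIDI pitch numbers in [low, high].

    Hypothetical mapping for illustration only; not Hermann's method.
    """
    if not data:
        return []
    lo, hi = min(data), max(data)
    if hi == lo:
        # Constant data: map every value to the middle of the pitch range.
        return [(low + high) // 2 for _ in data]
    span = hi - lo
    # (C2) Systematic: each pitch is fully determined by the data value.
    return [round(low + (x - lo) / span * (high - low)) for x in data]


# Invented example data sets.
temperatures = [12.0, 15.5, 19.0, 15.5, 12.0]
rainfall = [0.0, 3.0, 6.0]

first_run = map_to_pitches(temperatures)
second_run = map_to_pitches(temperatures)

# (C3) Same data and same mapping give a structurally identical result.
reproducible = first_run == second_run

# (C4) The same system can be applied to a different data set.
other = map_to_pitches(rainfall)
```

A mapping that drew on random numbers or on undocumented manual adjustments would fail the equality check, and so would not satisfy (C3) under this reading.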
Vogt
Vogt proposes a multi-criteria decision aid for evaluating sonification which takes ‘different stakeholders’ into account (in this case, domain scientists, sonification experts, and media artists). This allows the sonification designer to evaluate the sonification design objectively and draw conclusions on which kind of sonification is appropriate for the end user (Vogt, 2011). The proposed criteria are shown in Table 7.1.
Tab. 7.1: Sonification evaluation criteria proposed by Vogt (Vogt, 2011).
Criterion | Description
Gain | How much is gained by the sonification, e.g., in comparison to other displays or classical methods?
(Gestalt) Clarity | How clearly can differences and interesting structures be perceived in the sonification?
Learning effort | How long does it take to comprehend the sonification and to be able to make use of it?
(Sound) Amenity | How aesthetically pleasing (as opposed to annoying) is the sound?
‘Contextability’ | Is the sonification applicable in its context (e.g., scientific exploration, public outreach, work environment, etc.) [?]
(Task & data) Complexity | How complex (or ‘non-trivial’) did you think the task or underlying data were (not the sonification or sound!)?
Technical effort | How much technical effort did the sonification require?
The criterion of ‘gain’ can be understood as questioning a musification’s reason for being. If a listener hears a musification but does not know that it is one, what do they gain from hearing it? What does the piece gain from having been written through the sonification process? The question of ‘clarity’ is complex because musification does not have the transmission of information as its sole purpose; clarity might therefore be weighted less heavily than for a purely functional sonification. One might, however, ask how the data has been translated musically: whether there is any correlation between data and sound which produces an interesting musical structure or event. The ‘learning effort’ required to understand a sonification and the information it transmits can be understood through the information transmission process described in 6.7. This criterion necessarily influences the clarity of the sonification, particularly if the listener only hears the musification once, for example at a concert.
Sound ‘amenity’ is treated in a simplistic manner in Vogt’s criteria, as it is reduced to aesthetically pleasing (desirable) or not (undesirable). While ease of listening is important for many applications, a disturbing effect might be desirable for an alarm sound. In music, pleasing aesthetics are only a part of the creative opportunities. We have seen that aesthetics does not equal pleasantness, and that music is not necessarily pleasant even if aesthetic. Therefore, aesthetics is a more appropriate term than amenity when discussing musification. The neologism ‘contextability’ refers to the context in which a sonification is used, or the setting in which a musification is played. It is a useful criterion as it asks whether the musification is appropriate for its intended purpose through its musical language, structure, etc.
The two final criteria, complexity and technical effort, will not be considered as they are less relevant in the context of a musification than that of a functional sonification. Complexity refers to the complexity of the underlying task and data of a sonification; this criterion can be important when evaluating the ease of handling a complex task through aural display. The technical effort criterion evaluates how much technical effort was required to produce the sonification. Again, this might be important when used in a functional environment. In a musification, we would argue that the aesthetic result takes precedence over the technical effort required to achieve it.
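The weighting discussed above can be sketched as a simple weighted mean. The following Python example is an illustration under stated assumptions, not Vogt's decision aid: the 0 to 10 rating scale, the weights, the ratings and the function name `weighted_score` are all hypothetical. It keeps the five criteria retained for musification (with amenity broadened to aesthetics), weights clarity less heavily, and omits complexity and technical effort.

```python
# Hypothetical weights; clarity is weighted less heavily than the others.
WEIGHTS = {
    "gain": 1.0,
    "clarity": 0.5,
    "learning_effort": 1.0,
    "aesthetics": 1.0,
    "contextability": 1.0,
}


def weighted_score(ratings, weights=WEIGHTS):
    """Return the weighted mean of ratings (0-10, higher is better).

    For learning effort, a high rating means little effort was needed.
    """
    missing = set(weights) - set(ratings)
    if missing:
        raise ValueError(f"missing ratings for: {sorted(missing)}")
    total_weight = sum(weights.values())
    return sum(weights[c] * ratings[c] for c in weights) / total_weight


# Invented ratings for the three stakeholder groups named by Vogt.
stakeholder_ratings = {
    "domain scientist": {
        "gain": 6, "clarity": 8, "learning_effort": 5,
        "aesthetics": 4, "contextability": 7,
    },
    "sonification expert": {
        "gain": 7, "clarity": 6, "learning_effort": 6,
        "aesthetics": 6, "contextability": 8,
    },
    "media artist": {
        "gain": 8, "clarity": 4, "learning_effort": 7,
        "aesthetics": 9, "contextability": 8,
    },
}

# One score per stakeholder group, then an unweighted mean across groups.
scores = {who: weighted_score(r) for who, r in stakeholder_ratings.items()}
overall = sum(scores.values()) / len(scores)
```

Keeping one score per stakeholder group, rather than only the overall mean, preserves disagreements between groups, which a single aggregate figure would hide.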
Williams
Williams defines musification as a sonification which uses musical constraints and structures to facilitate the understanding of data (Williams, 2016). Although they draw on musical aesthetics, his musifications are designed for functional purposes in a biomedical context. His criteria are nevertheless of interest because they take aesthetics into account. They are shown in Table 7.2.
Tab. 7.2: Sonification evaluation criteria proposed by Williams (Williams, 2016).
Criterion | Description
Amenity | Was the musification audibly “pleasant”?
Immersion | Were listeners able to give the data their full attention?
Intuitivity | How readily analysable was the resulting musification with no specific training?
Efficiency | How quickly could listeners identify a change in flagellar movement using the musification?
Congruency | How aesthetically appropriate the musification was when presented synchronously with the visual representation?
We have previously discussed the importance of aesthetics in sonifications, as they can heavily influence the experience of the user. Amenity is often a desirable quality in an aural display, particularly as it has been shown to improve user satisfaction (Leplâtre and McGregor, 2004). However, a less pleasant sonification is at times necessary, for example when used as an alarm. If the criterion of amenity, where pleasantness is considered a positive, is somewhat lacking even for functional uses, sonification for artistic uses needs a more complex and encompassing aesthetic criterion.