4.3 Concurrent auditory displays

4.3.2 Concurrent auditory icons

Brazil & Fernström (2006) and Brazil et al. (2009) proposed the use of concurrent auditory icons and tested identification rates with an onset asynchrony of 300 ms. In these experiments, upwards of three concurrent ‘everyday sounds’ were presented diotically to users, who were asked to identify as many of them as possible using free-text responses. Results indicated that much higher correct identification rates were achieved with concurrent auditory icons than had been the case for concurrent earcons (Brazil & Fernström, 2006). Performance improved when the sounds were selected so that no two were produced by the same object or action (or method of excitation) (Brazil & Fernström, 2006; Brazil et al., 2009). This is logical from both informational and energetic masking perspectives, as differences in excitation method and material would produce significant spectro-temporal differences and so reduce spectral overlap and timbral similarity.
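As an illustration of this kind of stimulus, the following minimal sketch mixes several mono sounds with staggered onsets and duplicates the result on both channels (diotic presentation). It is not the procedure used by Brazil & Fernström: the sine tones stand in for recorded everyday sounds, the 300 ms asynchrony is assumed to apply between successive onsets, and the names `tone`, `mix_diotic` and `SAMPLE_RATE` are hypothetical.

```python
import math

SAMPLE_RATE = 1000  # assumed (low) sample rate in Hz, chosen only to keep the sketch small


def tone(freq_hz, dur_s, sr=SAMPLE_RATE):
    """Stand-in for a recorded everyday sound: a plain sine tone."""
    n = int(round(dur_s * sr))
    return [math.sin(2 * math.pi * freq_hz * i / sr) for i in range(n)]


def mix_diotic(sounds, soa_s=0.3, sr=SAMPLE_RATE):
    """Mix mono sounds whose onsets are staggered by soa_s seconds.

    Returns (left, right); both channels carry the same signal,
    which is what makes the presentation diotic.
    """
    offset = int(round(soa_s * sr))  # onset asynchrony in samples
    total = max(i * offset + len(s) for i, s in enumerate(sounds))
    mono = [0.0] * total
    for i, s in enumerate(sounds):
        start = i * offset
        for j, x in enumerate(s):
            mono[start + j] += x
    # Scale by the number of sources so the sum cannot exceed +/-1.
    mono = [x / len(sounds) for x in mono]
    return mono, list(mono)


sounds = [tone(50, 1.0), tone(80, 1.0), tone(120, 1.0)]
left, right = mix_diotic(sounds)
```

With three one-second sounds, the third onset falls 600 ms after the first, so the mixture lasts 1.6 s in total.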

Though the identification rates reported by Brazil & Fernström (2006) and Brazil et al. (2009) were high, it is questionable whether a comparison with the results for concurrent earcons (i.e., McGookin & Brewster, 2004a) is justifiable, as the two sets of studies required participants to perform slightly different tasks. In the auditory icon experiments, the user simply had to identify the nature of each sound source. In the earcon experiments, however, participants had to recognise the nature of the sources and also recall the associated content. This difference introduced additional complexity in the latter experiments, which would be expected to negatively affect performance.

Several displays have made use of concurrent auditory icons in different ways. Gaver et al. (1991) created a display to be used with a simulated factory. Up to 14 auditory icons were presented concurrently alongside a visual display, which could only display a limited section of the factory. The auditory icons were designed to have distinctive spectro-temporal characteristics to reduce masking and maximise discriminability. The display represented the machines using repeating loops of everyday sounds and used additional sounds to represent issues occurring within the factory (i.e., breaking glass and liquid spilling). Although recognition was not quantitatively measured, observations of participants working collaboratively in pairs on the simulation showed that they communicated more to solve problems when the auditory cues were presented, and responded reliably to error noises, but often failed to notice when a looped sound stopped. Similarly, in one version of the Audio Aura display, proposed by Mynatt et al. (1998) to act as a peripheral display for the office environment, auditory icons were combined to create a soundscape of a seaside environment with different sonic elements representing different ideas.

Whilst the systems of both Gaver et al. (1991) and Mynatt et al. (1998) served primarily as displays indicating states, Putz (2004) and Frauenberger et al. (2004) proposed an interface which exploited spatial separation with concurrently presented auditory icons to represent hierarchical menus. The auditory icons were presented in the frontal hemisphere on the azimuthal plane in a virtual room. The users were able to zoom in or out of the display, which controlled the number of concurrently presented items using a Gaussian window. At the most zoomed out, the user was able to hear all items (although those at the sides were more attenuated) and at the most zoomed in only one item was audible (Putz, 2004). The user navigated through the menus by turning their head towards the target item to select it and then used a keyboard to perform the required actions. The auditory icons were looped but, in order to allow better localisation, tones of different frequencies (referred to as pedestal tones) were added to each of the auditory icons using a small amount of amplitude modulation. Putz (2004) reported that many of the participants kept the display set to maximum zoom or used a hint button repeatedly until they found the item that they were looking for. These findings suggest that users struggled with the concurrent presentation of the auditory icons. As the users were able to change the zoom, it is unclear whether they would have become accustomed to this mode if fewer items had been presented concurrently or if they had been given more time to get used to the concurrent presentation.
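The zoom mechanism can be sketched as a Gaussian gain applied over azimuth, centred on the listener's head direction, with the zoom level setting the width of the window. This is only an illustrative sketch and not Putz's implementation: the mapping from zoom to window width, the width limits, the audibility threshold and the function names `gaussian_gains` and `audible_items` are all assumptions.

```python
import math


def gaussian_gains(azimuths_deg, head_azimuth_deg, zoom,
                   min_sigma_deg=5.0, max_sigma_deg=90.0):
    """Return one gain per item from a Gaussian window over azimuth.

    zoom is in [0, 1]: 0 = fully zoomed out (wide window),
    1 = fully zoomed in (narrow window). The linear mapping from
    zoom to sigma and both sigma limits are assumed values.
    """
    zoom = min(max(zoom, 0.0), 1.0)
    sigma = max_sigma_deg + zoom * (min_sigma_deg - max_sigma_deg)
    return [math.exp(-0.5 * ((az - head_azimuth_deg) / sigma) ** 2)
            for az in azimuths_deg]


def audible_items(gains, threshold=0.05):
    """Indices of items whose gain reaches an assumed audibility threshold."""
    return [i for i, g in enumerate(gains) if g >= threshold]


# Five menu items spread across the frontal hemisphere, head facing forward.
azimuths = [-90.0, -45.0, 0.0, 45.0, 90.0]
zoomed_out = gaussian_gains(azimuths, 0.0, zoom=0.0)
zoomed_in = gaussian_gains(azimuths, 0.0, zoom=1.0)
```

Under these assumed values, every item remains above the threshold when fully zoomed out, with those at the sides attenuated, whereas only the item directly ahead remains when fully zoomed in.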

It would seem, therefore, that there have been some contradictory findings in terms of the number of concurrent auditory icons that can be used reliably. It is unlikely, however, that many more than three would be sensible in most task-orientated scenarios (e.g., menu navigation). A much larger number of concurrent stimuli is possible in ambient state monitoring displays, where the use case involves monitoring the states of continuous processes, as in Gaver et al.’s (1991) ARKola simulation. This is probably due to different listening techniques being required by the two scenarios. In ambient state monitoring displays, the sounds are present for an extended period, during which the user is able to switch attention between concurrent streams and gain familiarity with the stimuli. In task-driven presentations, however, it is likely that the stimuli will be considerably shorter and only briefly displayed to the user, giving them little time to familiarise themselves with the available options or switch attention. In the task-orientated scenarios, the user must focus on each item individually, make a decision about the nature of the source and infer the object with which it has been associated. The state-monitoring display, by contrast, requires the user to listen to the timbre of the mixture rather than to each individual sound. Then, when a change occurs due to the addition or removal of a source, the user has to determine the nature of the item which has changed.