

2.2 Human auditory system

2.2.3 Auditory perception

In our everyday life, we are constantly exposed to sounds, which are processed by the brain whether we listen to them or not. When listening, we are interested in two major elements of sound: meaning (what we hear) and location (where it is coming from). Hearing sensitivity, auditory processing and localisation are briefly explained below. More details can be found in [Moo82, PISI00, Tsi07, RS08].

Spatial hearing and localisation

Both sound and light propagate in all directions, but only our auditory system can receive information from every direction around us; our visual system cannot. In addition, the HAS has mechanisms to filter certain sounds and to localise them in space. The former is referred to as auditory masking or the cocktail party effect, and it allows us to pick out and listen to a single sound in a noisy environment [Moo82]. Fragments of these sounds can sometimes be missing without any perceivable effect on the listener, which is possible due to the continuity illusion [KT02]. Two important sound characteristics influence auditory localisation: spectral bandwidth and intensity [RS08]. The broader the bandwidth and the louder the sound, the better our ability to localise it. The main factors that affect sound localisation are binaural and monaural cues, reverberation and inter-sensory interaction.

Interaural intensity difference (IID) is one of the binaural cues for sound localisation. Depending on the azimuth of a sound source, each ear receives the sound at a different intensity level, except for sounds originating directly ahead of or behind us (see Figure 2.14). This cue is stronger at higher frequencies: the wavelengths of low frequencies are longer than the diameter of the head, which therefore does not impede those waves.
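The wavelength argument above can be checked with a few lines of arithmetic. The following is a minimal sketch, not a model of the head shadow: the speed of sound (343 m/s) and the head diameter (0.18 m) are assumed illustrative values that do not come from the text, and the hard threshold stands in for what is in reality a gradual transition.

```python
# Assumed illustrative constants (not taken from the text).
SPEED_OF_SOUND_M_S = 343.0  # air at roughly 20 degrees Celsius
HEAD_DIAMETER_M = 0.18      # rough adult head diameter


def wavelength_m(frequency_hz: float) -> float:
    """Return the wavelength in air, lambda = c / f."""
    if frequency_hz <= 0:
        raise ValueError("frequency must be positive")
    return SPEED_OF_SOUND_M_S / frequency_hz


def head_casts_shadow(frequency_hz: float) -> bool:
    """Crude test: True when the wavelength is shorter than the head diameter."""
    return wavelength_m(frequency_hz) < HEAD_DIAMETER_M


if __name__ == "__main__":
    for f in (200.0, 1000.0, 4000.0, 8000.0):
        print(f"{f:7.0f} Hz: wavelength {wavelength_m(f):.3f} m, "
              f"shadowed by head: {head_casts_shadow(f)}")
```

Under these assumptions the crossover lies a little below 2 kHz, which is consistent with the IID being the stronger cue at higher frequencies.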

Interaural time difference (ITD) is a cue similar to the IID, in which one ear receives the sound slightly earlier than the other (see Figure 2.14). As with the IID, the largest difference, around 700 µs, occurs when the sound originates from the side of the head.
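To give a feel for the magnitude of the ITD, the sketch below uses Woodworth's spherical-head approximation, ITD = (a / c)(θ + sin θ). This formula is not taken from the text; it is a common textbook simplification, and the head radius (0.0875 m), the speed of sound (343 m/s), the function name and the sign convention (positive for sources on the right) are all assumptions made for illustration.

```python
import math


def woodworth_itd_s(azimuth_deg: float,
                    head_radius_m: float = 0.0875,
                    speed_of_sound_m_s: float = 343.0) -> float:
    """Approximate ITD in seconds for a distant source (spherical-head model).

    Azimuth is measured clockwise from straight ahead, so 90 is directly
    to the right and 180 is directly behind. Positive values mean the
    sound reaches the right ear first (assumed convention).
    """
    azimuth = math.radians(azimuth_deg % 360.0)
    # Fold front and back onto the same lateral angle in [-pi/2, pi/2]:
    # in this model a source at 30 degrees and one at 150 degrees are identical.
    lateral = math.asin(math.sin(azimuth))
    return (head_radius_m / speed_of_sound_m_s) * (lateral + math.sin(lateral))


if __name__ == "__main__":
    for az in (0, 30, 90, 150, 180, 270):
        print(f"azimuth {az:3d} deg: ITD {woodworth_itd_s(az) * 1e6:8.1f} microseconds")
```

With these assumed values the maximum, at 90 degrees, comes out between 600 and 700 µs, in the same range as the figure quoted above. Sources directly ahead and directly behind both give an ITD of essentially zero, so the ITD alone cannot tell them apart.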

Although they are a powerful tool for sound localisation, binaural cues do not provide sufficient information about the elevation of the sound source. Monaural cues, however, can provide that information through head-related transfer functions (HRTFs). As the sound travels, it reflects off the head, body and pinna. During these reflections some of the energy is lost, which shapes the sound spectrum in a way that can be used for localisation. In certain ambiguous positions, such as directly ahead of or behind the head, where the IID and ITD are the same, head movement breaks the symmetry and resolves the confusion.

Another important element of sound localisation is distance perception. This ability evolved because we had to know whether prey or a predator was nearby or far away.

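Figure 2.14: Binaural cues: Interaural Intensity Difference (IID) and Interaural Time Difference (ITD). [Figure: intensity plotted against time t for the right (R) and left (L) ear, with the ITD marked as the time offset between the two ear signals and IID = A1 - A2 as the difference between their amplitudes A1 and A2.]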

When listening to a sound indoors, we rely on reverberation. However, this cue is missing in outdoor environments, where it is substituted by sound intensity and movement of the sound source. Although these cues can be useful in sound localisation, they perform rather poorly for unfamiliar sounds.

Despite these localisation mechanisms, spatial auditory resolution is very limited. According to Perrott and Saberi, the minimum audible angle is 0.97° for horizontally displaced sources (no change in elevation) and 3.65° for vertically displaced sources (no change in azimuth) [PS90]. This makes hearing substantially weaker than vision in spatially related tasks. However, the temporal resolution of the HAS is rather high compared to that of the HVS; according to Fujisaki et al. it is 89.3 Hz [FN05].

Temporal auditory processing

Temporal auditory processing can be divided into temporal integration and temporal resolution. Temporal integration is the ability to integrate the acoustic features of a particular sound over time. For humans, this integration time has been reported to vary between 50 and 200 ms [RS08]. One example of temporal integration is the forward masking paradigm. If the observer is presented with two sequential sounds, their ability to detect the second stimulus (target) depends on the duration of the first one (masker). This happens because of adaptation to the energy at the frequencies present in the masker. Additional parameters that influence the amount of masking are: the intensity and duration of the masker, the duration of the interstimulus interval, the duration of the target, the onset interval between the masker and the target (duration of the masker plus the interstimulus interval), and the interstimulus interval plus the duration of the target.
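The timing parameters listed above are related by simple sums, which the following sketch makes explicit. It is only a bookkeeping aid under stated assumptions: the class name, the field names and the example values are hypothetical and do not come from the cited studies.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ForwardMaskingTrial:
    """Timing of one masker-then-target trial (hypothetical helper)."""
    masker_level_db: float  # intensity of the masker
    masker_ms: float        # duration of the masker
    isi_ms: float           # interstimulus interval (silence between sounds)
    target_ms: float        # duration of the target

    def __post_init__(self) -> None:
        # Durations cannot be negative.
        for name in ("masker_ms", "isi_ms", "target_ms"):
            if getattr(self, name) < 0:
                raise ValueError(f"{name} must be non-negative")

    @property
    def onset_interval_ms(self) -> float:
        """Masker onset to target onset: masker duration plus ISI."""
        return self.masker_ms + self.isi_ms

    @property
    def isi_plus_target_ms(self) -> float:
        """Masker offset to target offset: ISI plus target duration."""
        return self.isi_ms + self.target_ms


if __name__ == "__main__":
    # Example values chosen only for illustration.
    trial = ForwardMaskingTrial(masker_level_db=70.0, masker_ms=200.0,
                                isi_ms=20.0, target_ms=10.0)
    print("onset interval (ms):", trial.onset_interval_ms)
    print("ISI plus target (ms):", trial.isi_plus_target_ms)
```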

In contrast to temporal integration, temporal resolution is the ability to resolve rapid changes in a sound over time. Gap-detection experiments have shown that humans can detect a silent period lasting only a few milliseconds [RS08].
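A gap-detection stimulus of the kind described above can be sketched as a burst of noise with a short silent interval in it. The code below is an illustration of the stimulus only, under assumed parameters (16 kHz sample rate, 200 ms burst, 5 ms gap); the function names are hypothetical, and the measurement function simply inspects the samples and says nothing about what a listener would perceive.

```python
import random


def noise_with_gap(duration_ms: float, gap_start_ms: float, gap_ms: float,
                   sample_rate_hz: int = 16000, seed: int = 0) -> list:
    """Return noise samples with a silent gap (all zeros) inserted."""
    rng = random.Random(seed)
    n_samples = int(duration_ms * sample_rate_hz / 1000)
    gap_begin = int(gap_start_ms * sample_rate_hz / 1000)
    gap_end = gap_begin + int(gap_ms * sample_rate_hz / 1000)
    samples = []
    for i in range(n_samples):
        if gap_begin <= i < gap_end:
            samples.append(0.0)
        else:
            # Magnitude is kept at or above 0.1 so noise is never mistaken for silence.
            samples.append(rng.choice((-1.0, 1.0)) * rng.uniform(0.1, 1.0))
    return samples


def longest_silence_ms(samples: list, sample_rate_hz: int = 16000,
                       threshold: float = 1e-3) -> float:
    """Return the duration of the longest run of samples below the threshold."""
    longest = current = 0
    for sample in samples:
        if abs(sample) < threshold:
            current += 1
            longest = max(longest, current)
        else:
            current = 0
    return 1000.0 * longest / sample_rate_hz


if __name__ == "__main__":
    stimulus = noise_with_gap(duration_ms=200, gap_start_ms=100, gap_ms=5)
    print("longest silent run (ms):", longest_silence_ms(stimulus))
```

In an actual experiment the gap duration would be varied from trial to trial to find the shortest gap a listener can reliably report.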