Chapter 2. Background

2.5. Acoustic models of CI function

2.5.1. Validity of acoustic models

A signal that has been processed using the same, or similar, signal processing techniques as those used in CI speech processors, and that is presented acoustically to elicit a response in NH listeners, can be termed an “acoustic model” (AM) of CI processing (Throckmorton and Collins, 2002). Because current CI signal processing techniques are very similar to channel vocoders, AMs are also sometimes referred to as “vocoded” signals (Faulkner et al., 2000; Loizou, 2006). The aim of developing an AM is to reproduce the information content of the implant output in an acoustic form, rather than necessarily to reproduce the subjective auditory sensation experienced by the implant user, as the term “simulation” might imply. Therefore, the term “AM” is preferred here.

AMs have a number of potential benefits to research. The most important for this thesis is that they can help to distinguish between the effects of CI processing per se and the electrical/neural interface factors contributing to CI user performance. AMs also allow the researcher to develop and refine hypotheses, so that CI performance experiments can be designed to make the most efficient use of CI user time. It can also be argued that studies using AMs of CI processing are of intrinsic interest even without direct reference to CI research, as they provide evidence about normal speech perception under conditions of reduced acoustic information (Shannon et al., 1995).

A typical AM was described by Loizou et al. (2000a). First, the signal was passed through a pre-emphasis filter and then band-pass filtered into N frequency bands using sixth-order Butterworth filters. To create the AM, sine waves or narrow bands of noise were generated for the corresponding electrode channels, with frequencies equal to the centre frequencies of the band-pass filters and amplitudes equal to the RMS energy of the envelopes. The sine waves or noise bands were recombined to generate the final waveform, and the RMS value of that waveform was then adjusted to equal that of the original signal. The difference between generating a CI signal and generating an AM lies in the final output stage: in the first case, level variations within each channel are used to vary current level among the corresponding electrode channels, while in the second case they serve to vary amplitude among a set of carrier stimuli, which are recombined to generate an acoustic waveform. It is also worth noting that this approach, like that of the majority of AM studies, is based on a bank of IIR filters rather than on FFT analysis.
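As an illustration only, the acoustic output stage described above can be sketched in Python. The sketch assumes that per-channel envelope levels have already been extracted by the band-pass filterbank (that stage is not shown), and it uses sine carriers only. The function names rms and synthesise_am, and all parameter names, are hypothetical and are not taken from Loizou et al. (2000a).

```python
import math


def rms(signal):
    """Root-mean-square value of a sequence of samples."""
    return math.sqrt(sum(v * v for v in signal) / len(signal))


def synthesise_am(envelopes, centre_freqs, fs, target_rms):
    """Sum sine carriers, each scaled by its channel envelope, then match RMS.

    envelopes    -- one list of envelope samples per channel (assumed to be
                    the output of an analysis filterbank, not shown here)
    centre_freqs -- carrier frequency in Hz for each channel
    fs           -- sampling rate in Hz
    target_rms   -- RMS value of the original, unprocessed signal
    """
    if len(envelopes) != len(centre_freqs):
        raise ValueError("need one centre frequency per channel")
    n = len(envelopes[0])
    if any(len(env) != n for env in envelopes):
        raise ValueError("all channel envelopes must have the same length")

    # Recombine the amplitude-modulated carriers into a single waveform.
    out = [0.0] * n
    for env, fc in zip(envelopes, centre_freqs):
        for i in range(n):
            out[i] += env[i] * math.sin(2.0 * math.pi * fc * i / fs)

    # Adjust the overall level so that the output RMS equals the target.
    current = rms(out)
    if current == 0.0:
        return out  # silent input: nothing to scale
    gain = target_rms / current
    return [v * gain for v in out]
```

For a noise-band AM, the sine term would be replaced by a band-limited noise carrier for each channel. In the CI case, the same per-channel levels would instead set the current level on the corresponding electrodes.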

The validity of a CI AM, that is, its ability to predict and model CI user performance, is determined by a number of factors. A key question is the degree of similarity between the signal perceived by the CI user and the signal perceived by a NH listener presented with an equivalent AM. There are two aspects to this: first, whether or not identical signal processing methods have been used for AM listeners and for the equivalent CI subjects and, second, the degree to which processing in the normal auditory system transforms the signal. While the signal received by the CI user has been processed by the CI itself, the signal perceived by the NH listener has been processed not only by the AM but also by the external, middle and inner ear of the listener. The external ear can be characterised by a frequency response which includes both pinna and external ear canal components. For the purposes of the present study, an insert earphone was used to minimise the amplification characteristics of the pinna. The question of processing is dealt with in the present study by ensuring that the same signal processing techniques apply to both CI users and NH subjects listening to the AM stimuli (see 3.2).

Auditory acclimatisation is another factor that may affect the validity of AMs. A CI user will normally have had a good deal of auditory experience with the CI signal by the time of testing, whereas a NH listener listening to an AM may have had only a few minutes' acclimatisation. Faulkner et al. (2006) showed that considerable time was needed to acclimatise to the model. Their study used running speech with a conversational discourse tracking technique in which word rate was measured. The authors found that many hours of acclimatisation were needed to optimise performance with pitch-shifted speech materials. However, Davis et al. (2003) suggested that initial acclimatisation to AM stimuli occurs within a few minutes, so long as the listener is given the original unaltered stimulus for comparison. It appears that NH listeners are able to acclimatise relatively quickly to AM stimuli without significant spectral shifts, but that considerably longer is needed to achieve optimal performance with pitch-shifted stimuli (see Rosen et al., 1999). In the current study, it was proposed to include a degree of pitch shift in the AMs which would reflect the degree of upward frequency transposition associated with a normal insertion of the Nucleus 24 electrode array. As noted in 3.3.2, this degree of pitch shift was somewhat less than that reported as causing significant acclimatisation problems in Rosen et al. (1999) and Faulkner et al. (2006). Therefore, in order to determine whether rapid acclimatisation to this more modest degree of pitch shift was possible, a pilot study was undertaken to see if a minimal acclimatisation procedure could yield valid results (see 3.1.2).
