J. Daniel carried out a number of experiments to investigate different first and second order ambisonic decoding schemes [Daniel et al, 1998]. The decoders were assessed using both HRTF measurements and binaural simulations, and listening tests with four, five and six channel loudspeaker arrays. Under ideal conditions with a single, centrally positioned listener, a dual-band decoding scheme was found to be preferable in terms of localization accuracy for the first order system. Similar results were achieved with the second order system using a single-band decode which optimized rV. The results indicated that the decoders which perform well in non-ideal conditions (such as an off-centre listener position) are not the optimal low frequency solutions (rV = 1) designed for ideal conditions, but rather the max rE and in-phase decoders. The tests also indicated that a homogeneous sound field can only be created if there are sufficient loudspeakers for the particular ambisonic order, as discussed in Section 4.4.
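The relationship between the three decoder types and the rV and rE criteria can be illustrated numerically. The following is a minimal sketch rather than the method used in the cited tests. It assumes a horizontal-only (2D) system, a regular loudspeaker array, and the commonly quoted per-order gains for the basic, max rE and in-phase decoders. The function names and the normalisation of the loudspeaker gains are illustrative choices and do not come from the text.

```python
import math

def order_gains(order, scheme):
    """Per-order gains g_0..g_M for a 2D decoder (assumed standard forms)."""
    if scheme == "basic":      # velocity decode, gives rV = 1
        return [1.0] * (order + 1)
    if scheme == "max_re":     # maximises the energy vector magnitude rE
        return [math.cos(m * math.pi / (2 * order + 2)) for m in range(order + 1)]
    if scheme == "in_phase":   # no anti-phase loudspeaker signals
        f = math.factorial
        return [f(order) ** 2 / (f(order + m) * f(order - m)) for m in range(order + 1)]
    raise ValueError("unknown scheme: " + scheme)

def speaker_gains(order, scheme, n_speakers, source_az):
    """Gain of each loudspeaker in a regular array for a source at source_az (radians)."""
    g = order_gains(order, scheme)
    gains = []
    for i in range(n_speakers):
        phi = 2 * math.pi * i / n_speakers
        total = g[0] + 2 * sum(g[m] * math.cos(m * (phi - source_az))
                               for m in range(1, order + 1))
        gains.append(total / n_speakers)
    return gains

def vector_magnitudes(gains):
    """Return (rV, rE): magnitudes of the velocity and energy vectors."""
    n = len(gains)
    angles = [2 * math.pi * i / n for i in range(n)]

    def magnitude(weights):
        x = sum(w * math.cos(a) for w, a in zip(weights, angles))
        y = sum(w * math.sin(a) for w, a in zip(weights, angles))
        return math.hypot(x, y) / sum(weights)

    # rV weights the loudspeaker directions by gain, rE by gain squared
    return magnitude(gains), magnitude([g * g for g in gains])

if __name__ == "__main__":
    for scheme in ("basic", "max_re", "in_phase"):
        rv, re = vector_magnitudes(speaker_gains(1, scheme, 6, 0.0))
        print(scheme, round(rv, 3), round(re, 3))
```

Under these assumptions, a first order decode to a regular hexagon gives rV = 1 and rE = 2/3 for the basic decoder, rV = rE ≈ 0.707 for the max rE decoder, and rV = 0.5 with rE = 2/3 for the in-phase decoder. The basic decoder also produces negative (anti-phase) gains in the loudspeakers opposite the source, whereas the in-phase gains never fall below zero.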
Benjamin conducted a number of listening tests to examine the differences between different ambisonic decoders and loudspeaker layouts [Benjamin et al, 2006]. First order Ambisonics material, consisting of Soundfield recordings and various encoded test signals, was played back using square, rectangular and hexagonal arrays in a medium-sized, acoustically treated listening room. Interestingly, the initial tests were carried out in an ordinary untreated room but were abandoned as extremely poor localization was achieved. Four different decoding schemes were examined: a full-band velocity decode, a full-band energy decode, a full-band in-phase decode, and a dual-band energy/velocity decode based on Gerzon’s original design with a transition frequency of 400 Hz. Subjects were free to move around within the array and were asked to listen for attributes such as directional accuracy, perspective, timbral changes, artefacts and loudspeaker proximity effects. Overall, the majority of test subjects preferred the hexagonal array with dual-band decoding. The square layout was least preferred due to poor lateral imaging and spectral changes for different source directions. The rectangular layout was found to work well for frontal sources with rear ambience. In terms of the decoding scheme, the velocity and in-phase decoders were least preferred, for opposing reasons. The velocity decoder produced uncomfortable in-head imaging and comb filtering, probably due to the high frequency anti-phase components which would be present in a full-band velocity decode. The in-phase decoder, on the other hand, was judged to be much too diffuse and reverberant, although comb filtering and artefacts due to listener movement were eliminated. The full-band energy decoder was judged to provide a balance between these two extremes and was found to work well at off-centre listener positions. However, the dual-band (shelf filter) decoder produced more defined sources, as it appeared to pull the various spectral components of the signal to the same perceived direction. An interesting general finding was that the loudspeaker layout is significantly more important than the choice of decoder. The results of the initial failed test would suggest that the acoustics of the listening room are also highly important.
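The principle of the dual-band decode can be sketched in simplified form. The example below is an illustration only and is not Benjamin's or Gerzon's actual filter design. It assumes a regular hexagonal array, an encoding convention in which W carries the unscaled source signal (no 1/√2 factor), and a hard switch between a velocity decode below 400 Hz and an energy (max rE) decode above it. A practical decoder would use phase-matched shelf or crossover filters instead. All function names are hypothetical.

```python
import math

TRANSITION_HZ = 400.0  # transition frequency reported for the dual-band decoder

def band_g1(freq_hz, transition_hz=TRANSITION_HZ):
    """First order gain: velocity decode below the transition, energy decode above."""
    if freq_hz < transition_hz:
        return 1.0                  # velocity decode (rV = 1)
    return math.cos(math.pi / 4)    # 2D first order max rE gain

def decode_first_order(w, x, y, g1, n_speakers=6):
    """Decode one horizontal B-format sample to a regular loudspeaker array."""
    feeds = []
    for i in range(n_speakers):
        phi = 2 * math.pi * i / n_speakers
        feeds.append((w + 2 * g1 * (x * math.cos(phi) + y * math.sin(phi))) / n_speakers)
    return feeds

def decode_tone(freq_hz, source_az, amplitude=1.0):
    """Loudspeaker gains for a single frequency component arriving from source_az."""
    # Assumed encoding convention: W = s, X = s*cos(az), Y = s*sin(az)
    w = amplitude
    x = amplitude * math.cos(source_az)
    y = amplitude * math.sin(source_az)
    return decode_first_order(w, x, y, band_g1(freq_hz))

if __name__ == "__main__":
    print([round(v, 3) for v in decode_tone(100.0, 0.0)])   # low band
    print([round(v, 3) for v in decode_tone(2000.0, 0.0)])  # high band
```

In this sketch the loudspeaker directly opposite the source receives a smaller anti-phase signal in the high band than in the low band. This is consistent with the observation that a full-band velocity decode carries substantial high frequency anti-phase components.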
Guastavino conducted a number of listening tests which compared 1D, 2D and 3D ambisonic presentations [Guastavino et al, 2004]. The tests took place in an acoustically treated room containing six symmetrically arranged loudspeakers in a hexagonal formation, with two further sets of three loudspeakers arranged above and below. The twenty-seven expert listeners were first asked to rate various ambient recordings made with a Soundfield microphone and decoded using a full-band in-phase decoding scheme. The test results show a strong preference for the 2D, hexagonal layout in terms of
were described as sounding further away, indistinct and less enveloping while the 1D scheme was found to be the most stable with listener movement. In a second
experiment, a more directive decoding scheme (similar to a max rE scheme) was used
as this provided a better balance between localization accuracy and sensitivity to listener position [Guastavino et al, 2004]. Similar results were achieved in both tests and an analysis of the results suggests that the preferred layout, at least in terms of naturalness, is dependent on the source material (see Figure 6.2 [Guastavino et al, 2004]). The 3D layout appeared to be preferred for indoor environments, while the 2D layout was preferred for outdoor scenes and the 1D scheme for frontal music scenes.
Fig. 6.2 Naturalness responses reported by Guastavino
Kratschmer conducted informal listening tests with a number of ambisonic decoding schemes and a forty-eight loudspeaker array [Kratschmer et al, 2009]. The results suggest optimal performance in terms of localization accuracy is achieved when the number of loudspeakers is matched to the Ambisonics order, using the formula given earlier in this section. The results of this test and others [Bertet et al, 2009] suggest therefore that the performance of an Ambisonics system decreases significantly when the number of loudspeakers greatly exceeds the minimum number required for that particular order.
David Malham published an informal paper outlining his experience of working with large area Ambisonic systems for theatre and music performances [Malham, 1992]. He notes that positioning an audience within a periphonic array can be problematic due to acoustic screening of the lower loudspeakers by other audience members. He also suggests that the decoding scheme and loudspeaker layout often need to be manually adjusted to compensate for the effect of the room acoustic.
6.3.1 Discussion
The results of these tests confirm Gerzon’s original proposal that a dual-band decoder which optimizes the velocity and energy vectors is preferred when there is a single listener. However, when off-centre listener positions are taken into account, decoders which optimize rV are least preferred due to the significant anti-phase components which are required to maximize the velocity component. The in-phase decoding scheme eliminates these anti-phase components entirely and so is very stable across a wide listening area, but is also very diffuse. The max-rE decoder represents a good compromise between these two extremes, particularly at higher orders. As with stereophony, it appears that Ambisonics requires a minimum of six loudspeakers for optimum performance.
Fig. 6.3 Decoder criteria related to the size of the listening area
Daniel proposes that for a given order and distance from the centre of the array, the max-rE decoding scheme is most suitable [Daniel, 2000]. If the listening
area extends beyond this distance, or to the loudspeaker periphery, then the in-phase scheme is preferred (see Figure 6.3 [Daniel, 2000]). He proposes a tri-band decoding
scheme which applies the basic, max-rE and in-phase decoders in three consecutive
frequency bands, based upon the size of the listening area.
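The band selection logic of such a tri-band scheme can be sketched as follows. This is a hedged illustration, not Daniel's implementation. It assumes a 2D system, that the three bands run from low to high frequency in the order basic, max-rE, in-phase, and that the two crossover frequencies are supplied by the caller. The values used in the example are hypothetical and are not derived from any particular listening area. The function names are likewise illustrative.

```python
import math

def order_gains(order, scheme):
    """Per-order gains g_0..g_M for a 2D decoder (assumed standard forms)."""
    if scheme == "basic":
        return [1.0] * (order + 1)
    if scheme == "max_re":
        return [math.cos(m * math.pi / (2 * order + 2)) for m in range(order + 1)]
    if scheme == "in_phase":
        f = math.factorial
        return [f(order) ** 2 / (f(order + m) * f(order - m)) for m in range(order + 1)]
    raise ValueError("unknown scheme: " + scheme)

def tri_band_scheme(freq_hz, f_low_hz, f_high_hz):
    """Pick the decoder type for a frequency, given two crossover frequencies."""
    if freq_hz < f_low_hz:
        return "basic"       # lowest band: basic (velocity) decode
    if freq_hz < f_high_hz:
        return "max_re"      # middle band: max-rE decode
    return "in_phase"        # highest band: in-phase decode

def tri_band_gains(order, freq_hz, f_low_hz, f_high_hz):
    """Per-order gains to apply at freq_hz under the tri-band scheme."""
    return order_gains(order, tri_band_scheme(freq_hz, f_low_hz, f_high_hz))

if __name__ == "__main__":
    # Hypothetical crossover frequencies, chosen only for illustration
    for freq in (100.0, 800.0, 4000.0):
        print(freq, tri_band_gains(2, freq, 300.0, 1500.0))
```

In a scheme of this kind the crossover frequencies would be set according to the size of the listening area, with a larger area moving both crossovers downwards so that the more robust decoders cover more of the spectrum.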