We have described a precomputed direct-to-indirect transfer approach to solving the acoustic rendering equation in the frequency domain for diffuse reflections. We have demonstrated that our approach is able to efficiently simulate diffuse reflections for a moving sources and listeners in static scenes. In comparison with existing methods, our approach offers a significant performance advantage when handling moving sources.
Limitations Since this approach is a precomputation-based algorithm, it cannot be used for scenes with dynamic objects. In such situations, ray-tracing-based algo- rithms are the best available choice. However, in many applications, including games and virtual environments, scenes are mostly static, with relatively few moving parts, hence our algorithm can be used to model reflections within the static portions of the scene.
Our algorithm performs matrix-vector multiplications on large matrices at run- time. The matrix size depends on the surface area of the scene and the number of geometric primitives used to represent the scene. Therefore, our method is useful mainly for scenes of low to medium complexity.
Another limitation arises from the approach we use [68] to reconstruct the en- ergy response from the subsampled Fourier coefficients. The replication of Fourier coefficients leads to comb-filter artifacts in the final audio, and is an inherent limi- tation of the reconstruction approach. An alternative would be to treat the Fourier coefficients as defining the envelope of a noise process [72]. Both these approaches are prone to errors; further study is needed to determine the suitability of one over the other based on empirical and perceptual error.
Finally, the transfer matrix is computed using the acoustic rendering equation [67], which has its own limitations, in that it is an energy-based approach (and hence cannot easily model interference) and is based on geometric approximations to the acoustic wave equation (and hence cannot accurately model low-frequency wave effects such as diffraction).
Future Work In most complex scenes, each surface sample may influence only a few other samples, due to occlusions. We could subdivide the scene into cells separated by portals, compute transfer operators for each cell independently, and
model the interchange of sound energy at the portal boundaries. Cells and portals have been previously used to model late reverberation [72], and would be a promising research direction for acoustic radiance transfer.
The acoustic response is typically a smooth function over the surfaces of the scene. Therefore, it would be beneficial to exploit spatial coherence by projecting the transfer operator into basis functions (such as wavelets) defined over the surfaces of the scene.
Some radiance transfer algorithms in visual rendering [71, 31] can model glossy reflections by using a directional basis such as spherical harmonics (SH) at each surface sample. Such a strategy can also be applied to model glossy reflections and diffraction of sound, however, the memory requirements for such an approach might be prohibitive.
Fractional delays [41] may also be used to generate a more accurate impulse response when propagation path delays do not lie exactly on time samples.
8.2
Compact Acoustic Transfer Operators
We have presented an efficient algorithm for computing sound propagation for inter- active applications. The algorithm is a two-pass hybrid of ray tracing and radiance transfer algorithms. We have demonstrated that the algorithm can model high or- ders of reflection (specular as well as diffuse) and edge diffraction at near-interactive rates with low memory overhead.
Limitations Our approach is based on geometric sound propagation using path tracing. As a result, all the limitations of geometric acoustics apply to our method. In particular, our approach cannot accurately model low-frequency reflections and edge
diffraction. Since the acoustic rendering equation is an energy-based formulation of sound propagation [67], phase-related effects, such as some cases of interference, cannot be modeled.
The run-time ray-tracing pass only computes early reflections (2–3 orders). Cou- pled with the fact that the radiance transfer pass assumes all surface samples are diffuse emitters, this implies that we cannot model higher-order purely specular re- flections (such as flutter echoes) or higher-order purely diffracted paths (such as diffraction around complex curved surfaces).
Our handling of dynamic objects is restricted to modeling direct “shadows” cast by a moving source due to moving objects. Since the transfer operators are com- puted over static surfaces only, we cannot model “indirect shadows”, i.e., occlusion of reflected sound by moving objects. As a result, we cannot completely handle situations such as moving doors between static rooms.
Like other precomputation algorithms [83, 63], our approach performs significant compression of the precomputed data in order to run on commodity hardware. This compression is lossy, and results in a reduction in the accuracy of the simulation results. As a result, our algorithm is not practical for detailed room acoustical anal- ysis. It is designed for games and other interactive applications where approximate directional cues, echoes and reverberation can be dynamically updated to generate a plausible, immersive audio experience.
Future Work Our algorithm can serve as the basis for much future research geared towards providing immersive audio in games and interactive applications. Firstly, it would be useful to investigate the possibility of using spherical harmonics to model the directional variation of acoustic radiance, while keeping memory overheads low. Another strategy for reducing memory requirements might be based on the structure
of the acoustic transfer operator: in most game environments, occlusion would lead to clustering within the transfer matrix. These clusters would roughly correspond to cells in a cells-and-portals subdivision of the scene. Therefore, it might be useful to consider computing a per-cell acoustic transfer operator and modeling sound propa- gation between cells via the portals. The clusters may also be useful in optimizing the distribution of surface samples. In general, developing techniques for automati- cally choosing surface sample pairs for computing example echograms would be an interesting avenue for future research.
Since nearest-neighbor mapping is used for the hit-points of gather rays, there may be temporal error in interpolating echograms. This may lead to errors in the final echogram. It would be useful to develop delay-aware interpolation schemes to address these issues.
It would be very interesting to extend our basic precomputation framework to a pressure-based formulation by computing the sample-to-sample transfer operators using a numerical solver for the acoustic wave equation. This would essentially amount to a precomputation-based Monte Carlo solution of the Kirchoff integral formulation of the wave equation, and would result in increased accuracy in modeling wave phenomena such as diffraction. A related formulation of transfer operators have been used for numerically solving the Helmholtz equation using the Equivalent Source Method [52].
Finally, it would be very useful to integrate our approach with a precomputation- based sound synthesis algorithm such as Precomputed Acoustic Transfer (PAT) [34] to develop a unified approach for performing efficient propagation of synthesized sound in interactive applications.