Spatialized Audio Rendering for Immersive Virtual Environments

(1)

Martin Naef, Markus Gross

Computer Graphics Laboratory, ETH Zurich

Oliver Staadt

(2)

Context: The blue-c

• Collaborative Immersive Virtual Reality Environment

• Provide remote collaboration features

– Shared, synchronized virtual world

– Render partners using 3D video streams

– Concurrent rendering and acquisition

(3)

Audio for VR

• Increase the sense of presence

• Guide the interest of the user

• Provide cues for orientation

• Requires 3D sound rendering

(4)

Overview

• Available systems and technology

• System overview

• Audio rendering pipeline

– Sound sources

– Simulation of physical effects

– 3D positioning and mixdown

• API integration

• Experiments

(5)

Rendering Options

• Using headphones

– Model head/pinnae using HRTF

– Head-tracking for each user required

– Calibration for individual users required

• Using multiple speakers

– Multi-channel hardware required

– Speaker placement is critical

(6)

Available Systems

• High-end systems

– Offline calculation of impulse responses

• E.g. CATT-Acoustics

– Convolution processors

• E.g. Lake

(7)

Available Systems

• Low-end systems

– PC sound cards

• Direct Sound, EAX, OpenAL

– Speaker placement

(8)

Design Goals

• Good sound quality at moderate cost

• Believable results

– Not necessarily physically correct

• Support for networked sound

– Needed for remote collaboration

• Flexible speaker placement

• Efficient implementation on standard hardware

(9)

System Overview

• Part of the blue-c software core

[Figure: system block diagram. Application and Scene feed the Sound System (Source, Localization Pipeline, Reverb, Mix Bus) and the Graphics System (Graphics Pipeline), linked by sync.]

(10)

Audio Rendering Pipeline

[Figure: audio rendering pipeline stages: Source, Distance Delay, Air Absorption, Distance Gain, 3D Positioning, Projection, Room-EQ, Speakers; LPF, Sub, Reverb, Head Tracking]

(11)

Audio Sources

• Recorded audio

– Mono samples or loops for effects

– Multi-channel files for background music

• Live input

– Microphones

– External synthesizer or sampler

• Networked input

– Remote microphones for collaboration

(12)

Audio Sources

• Keep all state information

– 3D position

– Reference distance

– Gain

– Temporary rendering data

• Filter coefficients and state

• Delay lines

• Mixdown matrix

(13)

Distance Delay

• Simulate propagation speed of sound (approx. 300 m/s; closer to 343 m/s in air at 20 °C)

• Store and delay samples in a memory buffer

• Keep independent write and read pointers

• Read pointer is moved according to distance

– Linear interpolation of time and samples

– Results in a frequency shift (Doppler effect)

[Figure: delay line with write data, read data, and pointer positions]
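The delay line described on this slide can be sketched as follows. This is a minimal illustration, not the blue-c implementation: the class and function names, the buffer sizing, and the per-sample ramp of the distance across a block are assumptions, and the speed of sound is left as a parameter.

```python
import math


class DelayLine:
    """Circular buffer with an integer write pointer and a fractional read position."""

    def __init__(self, max_delay_samples):
        # Two spare slots so that interpolation never touches the slot being written.
        self.buf = [0.0] * (max_delay_samples + 2)
        self.write_pos = 0

    def write(self, sample):
        self.buf[self.write_pos] = sample
        self.write_pos = (self.write_pos + 1) % len(self.buf)

    def read(self, delay_samples):
        """Return the signal `delay_samples` in the past (fractional delays allowed)."""
        n = len(self.buf)
        # The newest sample sits one slot behind the write pointer.
        pos = (self.write_pos - 1 - delay_samples) % n
        i0 = int(math.floor(pos)) % n
        frac = pos - math.floor(pos)
        i1 = (i0 + 1) % n
        # Linear interpolation between the two neighbouring stored samples.
        return (1.0 - frac) * self.buf[i0] + frac * self.buf[i1]


def render_block(line, block, dist_last, dist_cur,
                 sample_rate=44100.0, speed_of_sound=343.0):
    """Delay one block while the source moves from dist_last to dist_cur (metres)."""
    out = []
    n = len(block)
    for k, x in enumerate(block):
        line.write(x)
        # Ramp the distance across the block; a changing delay shifts the pitch (Doppler).
        dist = dist_last + (dist_cur - dist_last) * (k + 1) / n
        out.append(line.read(dist / speed_of_sound * sample_rate))
    return out
```

A source that approaches the listener shortens the delay during the block, so samples are read faster than they were written and the pitch rises; a receding source has the opposite effect.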

(14)

Air Absorption

• Frequency-dependent power loss

– Higher frequencies are attenuated more

– Only perceivable for large distances (approx. -4 dB per 1000 m at 1 kHz)

• Simplified model

– High-shelving filter (bi-quad)
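One way to realize such a high-shelving bi-quad is with the widely used Audio EQ Cookbook (R. Bristow-Johnson) high-shelf formulas. The sketch below is an illustration only: the mapping from distance to shelf gain (the slide's -4 dB per 1000 m figure applied linearly), the 1 kHz corner, and all names are assumptions rather than the authors' parameters.

```python
import math


def high_shelf_coeffs(gain_db, corner_hz, sample_rate=44100.0, slope=1.0):
    """Audio EQ Cookbook high-shelf biquad, normalized so that a0 == 1."""
    a = 10.0 ** (gain_db / 40.0)
    w0 = 2.0 * math.pi * corner_hz / sample_rate
    cos_w0 = math.cos(w0)
    alpha = math.sin(w0) / 2.0 * math.sqrt((a + 1.0 / a) * (1.0 / slope - 1.0) + 2.0)
    k = 2.0 * math.sqrt(a) * alpha
    b0 = a * ((a + 1.0) + (a - 1.0) * cos_w0 + k)
    b1 = -2.0 * a * ((a - 1.0) + (a + 1.0) * cos_w0)
    b2 = a * ((a + 1.0) + (a - 1.0) * cos_w0 - k)
    a0 = (a + 1.0) - (a - 1.0) * cos_w0 + k
    a1 = 2.0 * ((a - 1.0) - (a + 1.0) * cos_w0)
    a2 = (a + 1.0) - (a - 1.0) * cos_w0 - k
    return (b0 / a0, b1 / a0, b2 / a0), (a1 / a0, a2 / a0)


def air_absorption_coeffs(distance_m, db_per_km=-4.0, corner_hz=1000.0,
                          sample_rate=44100.0):
    """Assumed mapping: shelf gain grows linearly with distance."""
    return high_shelf_coeffs(db_per_km * distance_m / 1000.0, corner_hz, sample_rate)


class Biquad:
    """Direct form I biquad that keeps its own filter state."""

    def __init__(self, b, a):
        self.b, self.a = b, a
        self.x1 = self.x2 = self.y1 = self.y2 = 0.0

    def process(self, x):
        b0, b1, b2 = self.b
        a1, a2 = self.a
        y = b0 * x + b1 * self.x1 + b2 * self.x2 - a1 * self.y1 - a2 * self.y2
        self.x2, self.x1 = self.x1, x
        self.y2, self.y1 = self.y1, y
        return y
```

The shelf leaves low frequencies untouched and reaches its full attenuation only well above the corner frequency, so this is a coarse stand-in for real atmospheric absorption.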


(15)

Distance Gain

• Power loss according to distance

• Uses reference distance

$L_d = \frac{D_{ref}}{D_s}$
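A minimal sketch of a reference-distance gain, assuming the common inverse-distance model in which the gain is 1 at the reference distance and halves with every doubling of distance. The clamp for sources closer than the reference distance and the function name are assumptions.

```python
def distance_gain(source_distance, ref_distance=1.0, max_gain=1.0):
    """Inverse-distance amplitude gain, equal to 1 at the reference distance."""
    if source_distance <= 0.0:
        # Assumed behaviour: a source at the listener position plays at full gain.
        return max_gain
    # Assumed clamp: sources inside the reference distance are not boosted.
    return min(max_gain, ref_distance / source_distance)
```

With a reference distance of 1 m, a source at 2 m plays at gain 0.5, which is roughly 6 dB quieter.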

(16)

3D Positioning

• Simulate source direction using a discrete, small number of speakers

• Distribute mono-stream onto multiple speaker channels


(17)

3D Positioning

• Calculate channel gains with dot-product

• Open up "active angle" to avoid differences in perceived spread

• Normalize gain factors

[Equation: channel gain $L_{chn}$ as a clamped maximum over the offset dot product $v_s \cdot v_{spk}$ of the source and speaker directions]
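The three steps on this slide (dot product, widened active angle, normalization) could look roughly like the sketch below. The offset value of 0.1, the constant-power normalization, and the function names are assumptions made for illustration; the slide does not fix them.

```python
import math


def normalize(v):
    """Scale a 3D vector to unit length."""
    length = math.sqrt(sum(c * c for c in v))
    return tuple(c / length for c in v)


def channel_gains(source_dir, speaker_dirs, offset=0.1):
    """Per-speaker gains for one source direction, as seen from the listener."""
    s = normalize(source_dir)
    raw = []
    for spk in speaker_dirs:
        d = normalize(spk)
        dot = sum(a * b for a, b in zip(s, d))
        # The positive offset widens the active angle slightly beyond 90 degrees;
        # speakers facing away from the source are clamped to zero.
        raw.append(max(0.0, dot + offset))
    # Assumed normalization: constant total power across all speakers.
    norm = math.sqrt(sum(g * g for g in raw))
    if norm == 0.0:
        return [0.0] * len(raw)
    return [g / norm for g in raw]
```

For four speakers in the corners of a square and a source straight ahead, the two front speakers receive equal gain and the two rear speakers stay silent.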

(18)

Loudness Projection

• Correct the individual channel gains to move the "sweet spot"

• Use head-tracking information

• Allows irregular distances between the listener and individual speakers

$L_{spk} = D_{spk}$
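A rough sketch of such a correction, assuming the per-speaker gain grows linearly with the distance between the tracked head and the speaker, which compensates the 1/d amplitude falloff. Dividing by the mean distance so that a centred listener gets unity gains is an added assumption, as are the names.

```python
import math


def projection_gains(head_pos, speaker_positions):
    """Per-speaker correction gains for a tracked head position (metres)."""
    dists = [math.dist(head_pos, p) for p in speaker_positions]
    # Assumed normalization: a listener equidistant from all speakers gets gain 1.
    mean = sum(dists) / len(dists)
    return [d / mean for d in dists]
```

When the listener walks toward one speaker, that speaker is turned down and the more distant ones are turned up, which keeps the perceived balance centred on the listener.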

(19)

Room Simulation

• Simulate room echo

• Provide a sense of the size and material of the acoustic space

• Two fundamental approaches

– Simulated impulse response (large FIR filters)

– Parameterized reverberation algorithms

(20)

Room Simulation

• Separate send channel to studio effect processor (t.c. M-ONE XL)

– Provides smooth, pleasing reverb

– Intuitive parameterization

– Mix reverb output onto the mix bus

• Use effect send gain as additional distance cue

– High direct-sound to room echo ratio for close sounds

[Equation: effect send gain $L$ as a function of the source distance $D$ and the reference distance]

(21)

Room EQ and LF Management

• Parametric equalizer on the mix bus adapts the output to the acoustic environment

– Attenuate resonant frequencies

– Account for uneven (non-flat) speaker frequency response

• Low-frequency management

– Low-pass filter a sum signal to drive the subwoofer
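The low-frequency management step can be illustrated with a simple first-order low-pass over the summed speaker channels. The 80 Hz cutoff, the first-order filter, and the function name are assumptions; a production crossover would typically use a steeper filter.

```python
import math


def subwoofer_feed(channels, cutoff_hz=80.0, sample_rate=44100.0):
    """Sum equal-length channel sample lists and low-pass the sum for the subwoofer."""
    # One-pole low-pass: y[n] = (1 - a) * x[n] + a * y[n - 1], unity gain at DC.
    a = math.exp(-2.0 * math.pi * cutoff_hz / sample_rate)
    y = 0.0
    out = []
    for frame in zip(*channels):
        x = sum(frame)
        y = (1.0 - a) * x + a * y
        out.append(y)
    return out
```

Slow content in the sum passes through almost unchanged, while fast alternations are strongly attenuated before they reach the subwoofer.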

(22)

Fused Pipeline

• Mix signal onto main bus using a single mixdown-matrix

– Source gain

– Distance gain

– Position and projection gain for each speaker channel

• Steps are reduced into a single vector-matrix multiplication

[Figure: fused pipeline stages: Source, Distance Delay, Air Absorption, Mix Matrix (3D Positioning), Room-EQ, LPF, Speakers, Sub, Rev.]

(23)

Fused Pipeline

• Mixdown matrix

– Calculated at audio block boundaries

– Linear interpolation between last and current matrix (every 32 samples)

– Provides smooth transition between different positions
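The fused mixdown can be sketched as two small functions: one folds the per-source gains into a single gain per speaker channel (one column of the mixdown matrix), the other mixes a mono block onto the bus while ramping from the previous gains to the current ones. The names are hypothetical, and reading "every 32 samples" as the interval at which the interpolated gains are refreshed is an assumption.

```python
def mix_vector(source_gain, dist_gain, position_gains, projection_gains):
    """Fold all per-source gains into one gain per speaker channel."""
    common = source_gain * dist_gain
    return [common * p * q for p, q in zip(position_gains, projection_gains)]


def mix_block(block, last_vec, cur_vec, bus, step=32):
    """Add a mono block to the multi-channel bus, ramping last_vec toward cur_vec."""
    n = len(block)
    for start in range(0, n, step):
        end = min(start + step, n)
        # Interpolation factor at the end of this sub-block; reaches 1 at the block end.
        t = end / n
        gains = [a + (b - a) * t for a, b in zip(last_vec, cur_vec)]
        for ch, g in enumerate(gains):
            for k in range(start, end):
                bus[ch][k] += g * block[k]
    return bus
```

Because the gains change in small steps rather than jumping at block boundaries, a moving source does not produce audible clicks.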

(24)

API Integration

• Sound service in the blue-c API core

– Control sound sources and system

• Audio nodes in the scene graph

– Sound as object attribute

– Support transformation nodes

– Provide translation between virtual (scene) and real coordinate systems (physical setup)

(25)

Benchmarks

• Single MIPS R12000 CPU, 400 MHz

• 44.1 kHz sampling rate, 20 ms latency, 8 channel ADAT input/output

• Delay-line is expensive

• Latency has little influence

Source    Mono         Stereo       Localized
Preload   78 sources   33 sources   37 sources
Live      54 sources   25 sources   30 sources
Stream    65 sources   31 sources   33 sources

(26)

Applications

• Used for several applications

– Landscape (ship seeking test)

– Infoticles

– "Fashion show" blue-c feature demo

– Collaborative chess

(27)

Conclusions

• High quality sound system

• Based on standard components

• Moderate cost

– ~ US$5000 for audio system

(28)

Future Work

• Integration into area management

– Culling of sound sources

– Portal effects

– Assign reverberation parameters to areas

• Linux port

(29)

http://blue-c.ethz.ch

(30)

Related Work - Acoustics

• [Begault:94] Overview

• [Gardner:92] Virtual Acoustics / Reverb

• [Krockstadt:68] Ray-tracing

• [Funkhouser:99] Beam-tracing

• [Gardner:94] HRTF

• [Pulkki:99] VBAP

(31)

Related Work - VR

• [Takala:92] Sound Rendering

• [Tsingos:97] Soundtracks for animation

• [Eckel:99] Cyberstage Sound Server

• [Jot:99] IRCAM Spatialisateur

• [Huopaniemi:99] DIVA

(32)

Implementation Notes

• Rendering runs in its own process

– Sound sources can be added and modified at any time

– Parameter updates only at block boundaries

• Runs on

– SGI Onyx 3200 (MIPS R12000, 400 MHz)

– I/O through 8 channel ADAT

– Inexpensive studio hardware and speakers

– ~ US$5000 for audio system

[Figure: class diagram: SoundService, SoundSource, PreloadSource, LiveInputSource, StreamSource, PreloadData, LiveInput, 3DPositioning, ReverbControl]

(33)

Speaker Placement

• More speakers means better localization

– 6 speakers provide good results

– 8 speakers give an almost "equal power" distribution