Active Audition and Sensorimotor Integration for Sound Source Localization


Mathieu Bernard


Introduction

CIFRE thesis, co-supervised by:

Patrick Pirim, Brain Vision Systems

Bruno Gas, ISIR, UPMC

Alain de Cheveigné, LPP, Paris Descartes

Outline

Artificial auditory system for sound localization, Audio-tactile model for texture recognition, Active audition and sensorimotor integration.


Autonomous robotics


Artificial auditory system

Bioinspired sound localization

Binaural localization cues: ITD and ILD.

Auditory model: an outer ear and an inner ear on each side.

Implementation: real-time C++ core library, used by the robot, a robotic simulation and standalone programs. Inputs: sound capture, simulated sound sources, WAV files.


Outer ear model

[Figure: outer ear assembly. Labels: pinna, auditory fovea, microphone (oriented toward the fovea), support, servomotor.]

Outer ear = Pinna, microphone and software capture. Spectral and directional cues.


Inner ear model (1/2)

Inner ear = cochlear model and pulse train generation.

Cochlear model

Gammatone filterbank

Usually 30 channels from 300 Hz to 8 kHz, at a 20 kHz sampling rate
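A minimal sketch of how such a filterbank could be laid out. The slide gives only the channel count, the frequency range and the 20 kHz figure, so the rest is assumed here: centre frequencies equally spaced on the ERB-rate scale of Glasberg and Moore, a fourth-order gammatone impulse response, and a bandwidth of 1.019 ERB. The function names are hypothetical and are not those of the C++ core library.

```python
import math

def erb_rate(f_hz):
    """Frequency in Hz -> ERB-rate number (Glasberg & Moore scale, assumed here)."""
    return 21.4 * math.log10(1.0 + 0.00437 * f_hz)

def inverse_erb_rate(e):
    """ERB-rate number -> frequency in Hz."""
    return (10.0 ** (e / 21.4) - 1.0) / 0.00437

def center_frequencies(n_channels=30, f_low=300.0, f_high=8000.0):
    """Centre frequencies equally spaced on the ERB-rate scale (assumed spacing)."""
    lo, hi = erb_rate(f_low), erb_rate(f_high)
    step = (hi - lo) / (n_channels - 1)
    return [inverse_erb_rate(lo + k * step) for k in range(n_channels)]

def erb_bandwidth(f_hz):
    """Equivalent rectangular bandwidth in Hz at frequency f_hz."""
    return 24.7 * (0.00437 * f_hz + 1.0)

def gammatone_ir(fc, fs=20000.0, duration=0.025, order=4):
    """Peak-normalized gammatone impulse response sampled at fs.

    g(t) = t^(order-1) * exp(-2*pi*b*t) * cos(2*pi*fc*t), with b = 1.019 * ERB(fc).
    """
    b = 1.019 * erb_bandwidth(fc)
    n = int(round(duration * fs))
    ir = []
    for k in range(n):
        t = k / fs
        envelope = t ** (order - 1) * math.exp(-2.0 * math.pi * b * t)
        ir.append(envelope * math.cos(2.0 * math.pi * fc * t))
    peak = max(abs(v) for v in ir)
    return [v / peak for v in ir]

cfs = center_frequencies()
ir_1k = gammatone_ir(1000.0)
```

Each channel output would then be the convolution of the input signal with the impulse response for that centre frequency.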


Inner ear model (2/2)

Inner ear = cochlear model and pulse train generation.

Pulse train generation

Cochlear channel output ⇒ pulse train

Discrete representation, noise suppression.


ILD computation (1/2)

Energy

For each pulse p(t), the energy is:

$$E(t) = \sum_{u=t-T}^{t} p(u)^2 \qquad (1)$$
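Equation (1) is a sliding-window sum of squares, which can be updated incrementally. The sketch below assumes that T is expressed in samples and that p(u) = 0 before the start of the signal; the function name is hypothetical.

```python
def sliding_energy(pulses, window):
    """E(t) = sum of p(u)^2 for u in [t - window, t], with p(u) = 0 for u < 0 (assumed)."""
    energy = []
    acc = 0.0
    for t, p in enumerate(pulses):
        acc += p * p                      # sample entering the window
        oldest = t - window - 1           # sample that just left the window
        if oldest >= 0:
            acc -= pulses[oldest] * pulses[oldest]
        energy.append(acc)
    return energy

example = sliding_energy([1, 0, 2, 0, 0, 3], window=2)
# example == [1.0, 1.0, 5.0, 4.0, 4.0, 9.0]
```

The running sum costs one addition and one subtraction per sample, which suits a real-time setting. On long floating-point signals the accumulator can drift slightly, so a periodic recomputation may be worthwhile.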


ILD computation (2/2)

ILD = Interaural Level Difference

For each channel $i$: $\mathrm{ILD}_i(t) = 2E_{\mathrm{left},i}(t)\,[\ldots]$


ITD computation (1/3)

ITD = Onset extraction and delay lines.

Onset extraction from a pulse train

Comparison with a dynamic threshold, 2 parameters.
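The delay-line stage can be illustrated with a coincidence count over candidate delays, in the spirit of a Jeffress-style model. This is only a sketch under assumptions: the onset trains are binary sequences on a common sample grid, the ITD is taken as the delay with the most coincidences, and ties go to the smallest absolute delay. The function name and the sign convention (a positive delay means the right ear lags the left) are hypothetical and are not taken from the slides.

```python
def itd_by_delay_lines(left_onsets, right_onsets, max_delay):
    """Return the delay d (in samples) that maximizes coincidences
    between left_onsets[t] and right_onsets[t + d]."""
    n = min(len(left_onsets), len(right_onsets))
    best_delay, best_count = 0, -1
    for d in range(-max_delay, max_delay + 1):
        count = 0
        for t in range(n):
            u = t + d
            if 0 <= u < n and left_onsets[t] and right_onsets[u]:
                count += 1
        # Keep the delay with the most coincidences; prefer small |d| on ties.
        if count > best_count or (count == best_count and abs(d) < abs(best_delay)):
            best_delay, best_count = d, count
    return best_delay

left = [0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 0, 0]
right = [0, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0]   # same onsets, two samples later
itd = itd_by_delay_lines(left, right, max_delay=3)
# itd == 2
```

Dividing the returned delay by the sampling rate gives the ITD in seconds for one cochlear channel.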


ITD computation (2/3)


ITD computation (3/3)


Auditory and tactile model for texture perception (1/3)

Similar model for transduction and processing

Audition and touch support fine texture discrimination skills. Rat vibrissae and cochlea transduction based on resonance. Strong interaction for auditory/tactile spectral processing.


Auditory and tactile model for texture perception (2/3)

Whiskers filterbank adapted from gammatone cochlear model :

Feature extraction = Instantaneous Mean Power :


Auditory and tactile model for texture perception (3/3)

Texture classification: 8 textures, 3-layer perceptron

Response to a pure tone (whisker at 630 Hz)


Auditory Evoked Behaviors (1/2)

[Diagram: auditory model (left and right outer ear ⇒ inner ear ⇒ energy ⇒ ILD) driving the motor control of neck and body.]

Motor control

Neck ⇒ ILD minimization

ILD < 0 ⇒ turn right, ILD > 0 ⇒ turn left. (Video)

Wheels ⇒ follow neck orientation

Constant speed, smooth rotations. (Video)
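The two rules above can be sketched as one control step each. Everything beyond the sign rule is an assumption made for illustration: angles are in degrees in a common frame with positive values to the left, the neck uses a proportional gain with a small dead band, and "smooth rotations" is modelled as a limit on the body turn per step. The function names and parameter values are hypothetical. The constant forward speed is not modelled.

```python
def neck_step(neck_angle, ild, gain=10.0, dead_band=0.02):
    """Turn left when ILD > 0 and right when ILD < 0 (positive angles = left, assumed)."""
    if abs(ild) <= dead_band:
        return neck_angle                 # ILD already close to zero: hold position
    return neck_angle + gain * ild

def wheel_step(heading, neck_angle, max_turn=5.0):
    """Rotate the body heading toward the neck orientation, at most max_turn per step."""
    error = neck_angle - heading
    turn = max(-max_turn, min(max_turn, error))
    return heading + turn

neck = neck_step(0.0, ild=0.5)            # ILD > 0: neck turns left
body = wheel_step(0.0, neck)              # body follows, rate limited
```

Calling both functions in a loop, with the ILD measured again after each step, would drive the ILD toward zero while the body heading converges smoothly on the neck orientation.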


Auditory Evoked Behaviors (2/2)


Sensorimotor Approach (1/3)

Environment state e ∈ E and motor state m ∈ M.

Sensory state s ∈ S: we have s = Φ(m, e), where Φ is called a sensorimotor law.


Sensorimotor Approach (2/3)

When we say [...] that we localize such an object at such a point in space [...] this simply means that we represent to ourselves the movements that must be made to reach that object. [...] We represent to ourselves the muscular sensations that accompany [these movements] and that have no geometrical character, and which therefore in no way imply the pre-existence of the notion of space.

H. Poincaré, L'espace et la géométrie, 1895.

Proposed formalization for localization

Find the motor state $\tilde{m}$ such that [...]


Sensorimotor Approach (3/3)

Sound source localization

Two ways to estimate $\tilde{m}$

Orienting behavior

After completion we have $\tilde{m} = m_{\mathrm{end}}$:

$$s_{\mathrm{end}} = \Phi(m_{\mathrm{end}}, e) = \Phi(m_0 + \delta m,\; e_0 + \delta e)$$
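The orienting route to the motor estimate can be illustrated with a toy closed loop. The sensorimotor law below, Φ(m, e) = sin(e − m), is a stand-in chosen for illustration and is not the auditory model of this work. Here m is the head azimuth and e the source azimuth, both in radians, and the source is assumed static (δe = 0). The function names, gain and tolerance are hypothetical.

```python
import math

def phi(m, e):
    """Toy sensorimotor law: an ILD-like sensory state for head azimuth m and source azimuth e."""
    return math.sin(e - m)

def orient(m0, e, gain=0.5, tol=1e-6, max_steps=1000):
    """Move the head until the sensory state is (almost) zero, then return m_end."""
    m = m0
    for _ in range(max_steps):
        s = phi(m, e)
        if abs(s) < tol:
            break                         # orienting behavior completed
        m += gain * s                     # motor increment driven by the sensation only
    return m

m_end = orient(m0=0.0, e=1.0)             # the estimate is the final motor state
```

The loop never reads e directly: the source position is recovered only as the motor state reached when the behavior completes, which is the sense in which localization is expressed in motor terms.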

Manifold learning

The sensory space S lies on a low-dimensional manifold R,

Dimension reduction technique : S → R,

Same topology as the embodying space (at least locally).


Manifold learning (1/3)


Manifold learning (2/3)

CAMIL database, ITD vectors: d = 690 → d = 3


Manifold learning (3/3)

Two outer ear models: HRTF and directional filters


Sensorimotor integration (1/2)

Learning algorithm

Iterative learning of R, Self-supervised.


Sensorimotor integration (2/2)


Intrinsic Dimension Estimation

Simulation (ILD, azimuth 180° & 360°)

CAMIL dataset (ILD & ITD, azimuth 360°, elevation [-30°, 30°])
