II. Systems Incorporating Tangible Auditory Interfaces
9.4. AudioDB
9.4.2. Technology
Hardware
As a basis for AudioDB, we use the tDesk (described in Section 5.5.1), a tabletop system for Tangible Interfaces. Its dimensions suit the multi-person set-up required by AudioDB: the desk serves people standing around it equally and gives each of them direct access to the surface. With a surface of 80 × 80 cm, an adult can easily reach every position on it.
[Figure 9.34: block diagram with the components Vision (approx. 20 objects, 2 class IDs: node | cluster), SETOServer, AudioDB_Engine, SpatialNodeSon (Node Sonification), SpatialClusterSon (Cluster Sonification), Sound Data Base and Multichannel Audio.]
Figure 9.34.: Overview of the AudioDB software and its interdependencies.
camera prevented otherwise prominently appearing visual occlusions by the users’ arms or other body parts.
The camera’s image was processed by a blob tracking algorithm implemented by Christof Elbrechter. It detects the number, colour and position of the grains’ undersides and additionally assigns each grain a unique ID according to the algorithm by Cox et al. [CH96]. Our in-house implementation of this algorithm can process up to 50 objects in real time on a recent computer system while capturing and processing the image from the FireWire camera at 20 fps. This is sufficient for smooth interaction with AudioDB.
The system’s Auditory Display was rendered to an 8- or 16-channel audio system arranged as a ring of equidistant loudspeakers surrounding the tDesk. This allowed for a natural auditory interface, directly coupled to the users’ actions on the tabletop.
Software
Based on the described hardware, a system was implemented in SuperCollider [McC02] [WCC09].
As shown in Figure 9.34, it consists of a controlling part and a sound synthesis part. The objects’ motion is tracked by the system, sent to SuperCollider, administered there by a SETOServer,¹⁰ and used as input to the data model called AudioDBEngine. In this object, each grain’s positional information is linked to a corresponding data item. The mode, motion, speed and position of the object then determine the state of the Auditory Display, as described in Section 9.4.1.
Shifting between modes
One aim of the system is to support users in sound classification. Users achieve this by establishing clusters, adding sounds to them and removing sounds from them. The system therefore implements rules to shift between the two abstraction modes as follows:
Node → Cluster
Turning an object from Node Mode into Cluster Mode triggers the system to collect the sounds from all nearby objects; these sounds form the sound set of the newly established Cluster Mode object. The affected objects are assigned new sounds if they are in Node Mode, or are left empty if they are in Cluster Mode. To decide which objects are affected by the restructuring, the system invokes a hierarchical clustering process that builds a dendrogram of the positions of all objects on the surface. This dendrogram contains information about the distances between each object and the next cluster of objects nearby. Based on this information,
¹⁰ See Section 10.2 for details.
9. Applications
[Figure 9.35, panels:
(a) Initial configuration: all objects are associated with exactly one sound (Node Objects, green).
(b) The user turned a Node Object into a Cluster Object (blue).
(c) Sounds of Node Objects loosely coupled to the newly instantiated Cluster Object (light green) are supplied to it. The Node Objects then get new sounds from scratch.]
Figure 9.35.: Example layout for the transition from Node Mode to Cluster Mode.
AudioDB merges all objects in the sub-tree of the dendrogram that includes the turned object and is separated from the rest of the objects by a given threshold. Figure 9.35 displays a step-by-step illustration of the transition process focusing on the clustering, whereas Figure 9.36 exemplifies the transition process from the view of the flipped object, focusing on the actual algorithmic rules for sound distribution and collection.
Cluster → Node
Turning an object from Cluster Mode into Node Mode distributes the contained sounds to the surrounding Node Mode objects.
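The grouping step of the Node → Cluster transition can be sketched in SuperCollider as follows. This is a minimal sketch, not the AudioDB implementation: the text does not name the linkage criterion, so single linkage is assumed here. Under that assumption, cutting the dendrogram at a threshold yields the same group as collecting every object that can be reached from the flipped object through hops shorter than the threshold. The names `~affected` and `~positions`, the example coordinates and the threshold value are hypothetical.

```supercollider
(
// Hypothetical helper (assumes single linkage): returns the indices of all
// objects that fall into the same group as `flipped` when every pair of
// objects closer than `threshold` is linked.
~affected = { |positions, flipped, threshold|
    var group = Set[flipped];
    var frontier = [flipped];
    while { frontier.notEmpty } {
        var current = frontier.pop;
        positions.do { |pos, i|
            if (group.includes(i).not and: { pos.dist(positions[current]) < threshold }) {
                group.add(i);
                frontier = frontier.add(i);
            };
        };
    };
    group.remove(flipped);   // the flipped object itself is not "affected"
    group.asArray.sort;
};

// Hypothetical layout: three objects close together, one far away
~positions = [Point(0.1, 0.1), Point(0.2, 0.1), Point(0.3, 0.15), Point(0.8, 0.8)];
~affected.(~positions, 0, 0.15).postln;
)
```

With this layout, flipping object 0 affects objects 1 and 2: object 2 is farther than the threshold from object 0 but is chained in through object 1, while object 3 stays untouched.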
Sound synthesis
The feedback of information to the user is realised by spatial granular re-synthesis based on the corresponding data item and its auditory representation. Each rendered audio grain is either a part of the sound’s onset multiplied by a curve with a sharp attack, or a longer part multiplied by a smoother envelope. Transient and decaying parts in the granular sound stream are chosen to be uniformly distributed over time. Information on the attack and decay of the underlying sound is therefore kept in the resulting steady sound stream. To closely link the Auditory Display to its corresponding physical object, we render the sound so that it originates from the direction in which the object is located with respect to the tDesk’s centre.
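The mapping from an object's position to a panning direction can be illustrated with the `\orient` and `\width` expressions that appear in the Cluster Mode listing later in this section. The following sketch only evaluates those two expressions on the language side. It assumes that `pos` is a `Point` given relative to the desk centre and scaled so that the corners lie at (±1, ±1); the helper name `~spatialParams` is hypothetical.

```supercollider
(
// Hypothetical helper; `pos` is assumed to be relative to the desk centre
// and scaled so that the corners of the surface lie at (±1, ±1).
~spatialParams = { |pos|
    (
        // angle of the object, rescaled to PanAz's cyclic position range of 2.0
        orient: (pos.theta + 0.5pi) / pi,
        // panning width: 4 at the centre, narrowing to about 1 at the corners
        width: 4 - (3 * pos.rho / 2.sqrt)
    )
};

[Point(0, 0), Point(1, 0), Point(0, 1), Point(1, 1)].do { |pos|
    var params = ~spatialParams.(pos);
    "pos: %  orient: %  width: %".format(pos, params[\orient], params[\width]).postln;
};
)
```

Under these assumptions, an object in the centre is spread widely over the loudspeaker ring, while an object near the edge is rendered from a narrower direction.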
As explained in the introduction to AudioDB, the duration and attack of a single sound grain depend on the speed at which the physical grain is moved. As shown in Figure 9.33(a), these parameters are coupled with each other. The envelope’s duration therefore determines its attack, i.e. how much of the transient part of the original sound is audible. For each grain in Node Mode, one synth is created according to the following SynthDef. The bufnum argument links to the sound file that is associated with the actual grain.
    SynthDef(synthName, { |out = 0, bufnum = 0, dur = 0.1,
            amp = 0.05, orient = 0, width = 2|
        // play the sound file associated with the grain (mono)
        var player = PlayBuf.ar(1, bufnum);
        // envelope: rise over 80 % and fall over 20 % of dur;
        // the curvature of both segments depends on dur
        var env = EnvGen.ar(
            Env([0, 1, 0], [0.8, 0.2] * dur, [-1, 1] * ((dur * 5).reciprocal - 1)),
            // (listing line 6 is missing from the source)
            doneAction: 2  // free the synth once the envelope has finished
        );

        // pan the enveloped signal onto the loudspeaker ring
        Out.ar(
            out,
            PanAz.ar(numChans, player * env, orient, width: width)
        );
    }).send(server);

[Figure 9.36: diagram with the labels "flip", "Node Mode", "Cluster Mode", "Associated Node Mode Objects", "Associated Cluster Mode Objects", "get all sounds" and "distribute sounds".]
Figure 9.36.: The transition from Node Mode to Cluster Mode viewed from the object that flips. Depending on the state, it either collects all sounds from objects nearby, or distributes its sounds to the surrounding node-mode objects.
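To make the coupling between duration and attack concrete, the envelope from the SynthDef can be built on the language side for a few durations. This sketch reuses the envelope expression from the listing unchanged; the helper name `~grainEnv` and the chosen durations are only illustrative.

```supercollider
(
// Same levels, segment times and curvature expression as in the SynthDef
~grainEnv = { |dur|
    Env([0, 1, 0], [0.8, 0.2] * dur, [-1, 1] * ((dur * 5).reciprocal - 1));
};

[0.05, 0.2, 1].do { |dur|
    var env = ~grainEnv.(dur);
    "dur: %  times: %  curves: %".format(dur, env.times, env.curves).postln;
};
)
```

For durations below 0.2 s the rising segment has a negative curvature, which in SuperCollider means a fast initial rise, i.e. a sharp attack. At 0.2 s both segments are linear, and for longer durations the signs flip, so the grain fades in more gradually. `~grainEnv.(0.05).plot` shows the shape if a GUI is available.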
Cluster Mode uses the same synth definition as Node Mode, instantiated once for each associated sound rather than for one single sound. The resulting grains are spread in time and space:
    buffers.do { |buffer, i|
        // stagger the grains: 5 ms per sound plus up to 10 ms of random jitter
        server.makeBundle(server.latency + ((i * 0.005) + 0.01.rand), {
            Synth.grain(synthName, [
                \bufnum, buffer,
                // grain duration: short grains are chosen more often
                \dur, #[
                    0.05, 0.01, 0.02, 0.04, 0.08, 0.16,
                    0.2, 0.32, 0.4, 0.64, 0.8, 1
                ].wchoose(#[
                    12, 11, 10, 9, 8, 7,
                    6, 5, 4, 4, 4, 4
                ].normalizeSum),
                // direction of the object as seen from the desk centre
                \orient, ((pos.theta + 0.5pi) * pi.reciprocal),
                // panning width shrinks with the distance from the centre
                \width, 4 - (3 * pos.rho * 1/(2.sqrt)),
                \amp, 0.25 * numBuffers.reciprocal * speed
            ], target: server)
        })
    }
This results in an asynchronous grain cloud as described in Section 6.4.1.
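The distribution of grain durations implied by the weights in the listing can be inspected without a running server. The sketch below copies the duration and weight tables from the listing; the variable names and the helper `~offsetRange` are hypothetical and serve only this illustration.

```supercollider
(
// Duration and weight tables as in the Cluster Mode listing
~durs    = #[0.05, 0.01, 0.02, 0.04, 0.08, 0.16, 0.2, 0.32, 0.4, 0.64, 0.8, 1];
~weights = #[12, 11, 10, 9, 8, 7, 6, 5, 4, 4, 4, 4].normalizeSum;

// Probability of each duration and the expected grain duration
~durs.do { |dur, i|
    "dur: %  p: %".format(dur, ~weights[i].round(0.001)).postln;
};
~expectedDur = (~durs * ~weights).sum;
"expected duration: % s".format(~expectedDur.round(0.001)).postln;

// Earliest and latest scheduling offset (relative to server.latency)
// of the grain for the i-th sound of a cluster
~offsetRange = { |i| [i * 0.005, (i * 0.005) + 0.01] };
~offsetRange.(3).postln;
)
```

With these tables, roughly 60 % of the grains last 80 ms or less and the expected grain duration is about 0.2 s, while the onsets of consecutive sounds are staggered by 5 ms plus a random jitter of up to 10 ms.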