2.7 Software tools for spectral modelling and

2.7.1 Specialised systems

A wide variety of specialised tools for spectral processing of musical sounds can be found in the literature, generally focusing on analysis, transformation and synthesis using a particular sinusoidal modelling algorithm. Here we survey some currently available software packages, their implementation details and the sound transformations that they provide.

SNDAN

SNDAN [8] is a free suite of C programs distributed as source code, allowing for the analysis, modification and synthesis of (primarily monophonic) musical sounds. Tones can be analysed using either phase vocoder analysis or an implementation of MQ sinusoidal modelling. Analysis data is then synthesised using additive synthesis. Programs for viewing plots of analysis data are also included in SNDAN. A number of transformations can be applied to the analysis data, including several different sinusoidal amplitude and frequency modifications, smoothing sinusoidal amplitudes and frequencies over time, smoothing sinusoidal amplitudes versus frequency, and time-stretching the input waveform. Analysis data can also be saved to a SNDAN file format and then reloaded for synthesis, so it is possible for other applications to apply modifications.
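The additive synthesis step mentioned above can be illustrated with a minimal Python sketch of an oscillator bank. This is not SNDAN's code: the function name `additive_synth`, the per-frame (amplitude, frequency) track layout and the linear interpolation between analysis frames are all assumptions made for illustration.

```python
import math


def additive_synth(partials, sample_rate, hop_size):
    """Sum a bank of sinusoidal oscillators (hypothetical helper).

    partials: list of tracks; each track holds one (amplitude, frequency_hz)
              pair per analysis frame, and all tracks have the same length.
    Amplitude and frequency are linearly interpolated across each hop and
    the phase of every oscillator is accumulated sample by sample.
    """
    if not partials:
        return []
    num_frames = len(partials[0])
    out = [0.0] * ((num_frames - 1) * hop_size)
    for track in partials:
        phase = 0.0
        for frame in range(num_frames - 1):
            a0, f0 = track[frame]
            a1, f1 = track[frame + 1]
            for n in range(hop_size):
                t = n / hop_size
                amp = a0 + (a1 - a0) * t
                freq = f0 + (f1 - f0) * t
                out[frame * hop_size + n] += amp * math.sin(phase)
                phase += 2.0 * math.pi * freq / sample_rate
    return out
```

Because the synthesis reads only amplitude and frequency tracks, any modification of those tracks (smoothing, scaling, time-stretching by resampling the frames) can be applied before this stage.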

ATS

ATS [94] is an open source library for spectral analysis, transformation and synthesis of sound. It is written in Common Lisp and is designed to work in conjunction with Common Lisp Music [106]. The ATS analysis process is similar to the SMS algorithm. Spectral components are identified by peak detection and partial tracking algorithms and then used to synthesise a deterministic signal. This harmonic component is then subtracted from the original sound and the residual signal is used to create a stochastic component. However, the ATS partial tracking process also allows peaks to be ignored if they are considered to be inaudible due to psychoacoustic masking effects within critical bands. Analysis data is stored in Lisp abstractions called sounds, and can be synthesised using a variety of techniques including additive synthesis, subtractive synthesis and granular synthesis. Available transformations include scaling partial amplitudes and frequencies by constant values or dynamic envelopes, transposition (with or without maintaining formant shapes) and time-stretching. A graphical user interface (GUI) is also provided that enables the real-time control of synthesis parameters.
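Scaling partial amplitudes and frequencies by constant values or dynamic envelopes can be sketched as follows. ATS itself is written in Common Lisp; the Python below is only an illustration, and the names `envelope_value` and `scale_partial`, as well as the breakpoint-list envelope format, are assumptions rather than part of ATS.

```python
def envelope_value(breakpoints, t):
    """Linearly interpolate a list of (time, value) breakpoints sorted by time.

    Values are held constant before the first and after the last breakpoint.
    """
    if t <= breakpoints[0][0]:
        return breakpoints[0][1]
    for (t0, v0), (t1, v1) in zip(breakpoints, breakpoints[1:]):
        if t <= t1:
            return v0 + (v1 - v0) * (t - t0) / (t1 - t0)
    return breakpoints[-1][1]


def scale_partial(track, frame_times, amp_scale=1.0, freq_scale=1.0):
    """Scale one partial track of (amplitude, frequency_hz) pairs.

    Each scale is either a constant number or a breakpoint list that is
    evaluated at the time of every analysis frame.
    """
    def factor(scale, t):
        return envelope_value(scale, t) if isinstance(scale, list) else scale

    return [
        (amp * factor(amp_scale, t), freq * factor(freq_scale, t))
        for (amp, freq), t in zip(track, frame_times)
    ]
```

For example, passing `amp_scale=[(0.0, 0.0), (1.0, 1.0)]` fades a partial in over one second, while `freq_scale=2.0` transposes it up an octave.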

SPEAR

SPEAR [59] is a cross-platform graphical spectral analysis, editing and synthesis tool. Sinusoidal analysis is based on an extension of the MQ method. As in SMS, peak frequency estimates are improved by using parabolic interpolation. The partial tracking algorithm uses linear prediction to estimate the future trajectory of each partial, and peaks are then selected based on how closely their parameters match the estimates [62]. Synthesis of analysis data is performed using either the inverse FFT method or additive synthesis with banks of sinusoidal oscillators. As one of the design goals of SPEAR was to enable integration with other sinusoidal modelling implementations, a wide variety of analysis data file types can be imported and exported. SDIF [125] and plain text files can both be imported and exported, and SPEAR can additionally import both SNDAN and ATS analysis files. Using the graphical editor, sinusoidal components can be cut, copied and pasted, as well as being shifted in frequency and stretched or compressed in time.
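The parabolic interpolation used to refine peak frequency estimates can be written in a few lines. The sketch below uses the standard three-point formula applied to log-magnitude spectrum values; it is not taken from SPEAR or SMS, and the function names `parabolic_peak` and `bin_to_hz` are hypothetical.

```python
def parabolic_peak(mags_db, k):
    """Refine a spectral peak at bin k (with 1 <= k <= len(mags_db) - 2).

    Fits a parabola through the log-magnitude values at bins k-1, k and k+1
    and returns (fractional_bin, interpolated_magnitude_db).
    """
    alpha, beta, gamma = mags_db[k - 1], mags_db[k], mags_db[k + 1]
    denom = alpha - 2.0 * beta + gamma
    if denom == 0.0:
        # The three points are collinear, so there is no vertex to find.
        return float(k), beta
    offset = 0.5 * (alpha - gamma) / denom
    return k + offset, beta - 0.25 * (alpha - gamma) * offset


def bin_to_hz(bin_index, sample_rate, fft_size):
    """Convert a (possibly fractional) FFT bin index to a frequency in Hz."""
    return bin_index * sample_rate / fft_size
```

When bin k is a local maximum, the offset lies between -0.5 and 0.5 bins, so the refined estimate stays within half a bin of the detected peak.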

Libsms

Libsms is an open source library that provides an implementation of SMS, derived from Serra’s original SMS code [32]. It is written in C, uses SWIG [9] to provide bindings for Python, and is also available as a set of external objects for Pure Data. Available SMS transformations include pitch-shifting (with and without the preservation of the original spectral envelope), time-stretching and independent control of the volumes of deterministic and stochastic components. Analysis data can also be imported from and exported to custom SMS analysis files.

Loris

Loris is an open source C++ implementation of the bandwidth-enhanced sinusoidal modelling system. Python bindings are provided using SWIG, and Loris can also be built as a Csound opcode (plugin). Time-scaling and pitch-shifting modifications can be performed on the analysis data, but sound morphing is of particular interest to the developers so a range of functions are provided for performing morphs between two sound sources. Loris also supports both importing and exporting analysis data via SDIF files.

AudioSculpt

AudioSculpt [14] is a commercial software package for spectral analysis and processing of sound files that is developed by IRCAM. It has a GUI that displays multiple representations of a source sound (a waveform view, a spectrum view and a sonagram), which are used to apply audio transformations to specific time or frequency regions. AudioSculpt relies on two different signal processing kernels in order to manipulate sounds. The first is an extended version of the phase vocoder called SuperVP [27]. It can be used to perform several different analyses including the computation of the standard and reassigned spectrograms, estimation of the spectral envelope by linear predictive coding and the true envelope, transient detection and fundamental frequency estimation. The second sound processing kernel, called Pm2, uses a proprietary sinusoidal modelling implementation. The sinusoidal partials that are estimated by the model can be exported to SDIF files.

AudioSculpt allows multiple transformations to be performed on sound files, including manipulating the gain of certain frequency regions that are selected using the GUI, transposing sounds (with or without time-correction), time-stretching, a spectral “freeze” effect, and a non-linear dynamic range stretching effect that is known as “clipping”. A sequencer is also provided so that transformations can be applied at specific time points in a sound file. Sound files can be processed and played back in real-time, allowing the results to be heard before saving them to a new sound file. However, AudioSculpt does not support the real-time processing of live audio streams.

It is possible to use the SuperVP sound processing kernel to manipulate audio streams in real-time by using the set of SuperVP external objects for Max/MSP [52]. The supervp.trans~ object can be used to perform transposition, spectral envelope manipulation and decomposition of a signal into a combination of sinusoids, noise and transient components. Spectral envelopes and transients can also be preserved after modification. The supervp.sourcefilter~ object enables the spectral envelope of one signal to be “imprinted” onto another signal. An alternative cross-synthesis object called supervp.cross~ can be used to mix the amplitudes and phases of two signals in the frequency domain.
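The general idea of mixing the amplitudes and phases of two signals in the frequency domain can be sketched for a single frame as shown below. This is not how supervp.cross~ is implemented: the function `cross_frame`, its `amp_mix` and `phase_mix` parameters and the shortest-arc phase interpolation are assumptions chosen for illustration, and a naive DFT is used so that the example needs only the Python standard library.

```python
import cmath
import math


def dft(x):
    """Naive discrete Fourier transform of a real or complex sequence."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]


def idft(spectrum):
    """Naive inverse DFT, returning only the real part of each sample."""
    n = len(spectrum)
    return [sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n)
                for k in range(n)).real / n
            for t in range(n)]


def cross_frame(a, b, amp_mix, phase_mix):
    """Mix two equal-length frames bin by bin (hypothetical helper).

    A mix value of 0.0 takes the amplitude (or phase) from `a`, and 1.0
    takes it from `b`. Phases are interpolated along the shortest arc.
    """
    mixed = []
    for xa, xb in zip(dft(a), dft(b)):
        mag = (1.0 - amp_mix) * abs(xa) + amp_mix * abs(xb)
        pa, pb = cmath.phase(xa), cmath.phase(xb)
        # Wrap the phase difference into [-pi, pi) before interpolating.
        diff = (pb - pa + math.pi) % (2.0 * math.pi) - math.pi
        mixed.append(cmath.rect(mag, pa + phase_mix * diff))
    return idft(mixed)
```

A practical system would apply this to overlapping windowed frames and resynthesise by overlap-add; setting `amp_mix=1.0` and `phase_mix=0.0` combines the magnitude spectrum of one sound with the phases of the other.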

In document Sinusoids, noise and transients: spectral analysis, feature detection and real-time transformations of audio signals for musical applications (Page 71-75)