1.5 Publications by the Author
2.1.2 Statistical Methods
Independent Component Analysis (ICA) was first clearly defined in 1994 by P. Comon [8], following initial work with Jutten and Herault [32]. It is a statistical technique used to expose hidden components underlying a set of observed measurements, and it can be seen as an extension of Principal Component Analysis (PCA). In its simplest form, ICA was proposed to solve the instantaneous mixture model. In real environments, however, ICA has to deal with convolutive mixtures, and its computational cost grows as the FIR filter length increases. To overcome this problem, the ICA algorithm is implemented in the frequency domain [31]. ICA for the instantaneous mixture model forms the basis for dealing with convolutive mixtures: the frequency-domain approach is essentially the parallel execution of an instantaneous ICA at each frequency bin. For ICA to perform unmixing, some assumptions need to be made [33]. Firstly, the source components are assumed to be statistically independent. Mathematically, this implies that the joint Probability Density Function (PDF) factorises as [34]:
p(s_1, s_2, \ldots, s_N) = \prod_{i=1}^{N} p(s_i). (2.9)
For example, two random variables, s_x and s_y, are said to be independent if there is no
mutual information between them, i.e. knowledge of s_x gives no information
about s_y, and vice versa. Although the source signals are independent, the mixture signals,
2.1. Related Work 13
applying ICA on mixture signals. ICA performs source separation by exploiting higher order statistics of the signals. The higher order cumulants of the Gaussian distribution are zero, implying that independent components of each source should ideally have non-Gaussian distributions.
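The role of higher order statistics can be illustrated numerically. The following is a minimal sketch, assuming NumPy and synthetic data; the helper `excess_kurtosis`, the sample size and the choice of Laplacian and uniform sources are illustrative assumptions, not taken from the cited works. It estimates the normalised fourth-order cumulant (excess kurtosis), which vanishes for a Gaussian signal and is non-zero for the non-Gaussian ones.

```python
import numpy as np

def excess_kurtosis(x):
    """Fourth-order cumulant normalised by the squared variance (zero for a Gaussian)."""
    x = x - x.mean()
    return np.mean(x**4) / np.mean(x**2) ** 2 - 3.0

rng = np.random.default_rng(0)
n = 200_000  # hypothetical sample size

gaussian = rng.standard_normal(n)
laplacian = rng.laplace(size=n)          # super-Gaussian (heavy tails)
uniform = rng.uniform(-1.0, 1.0, n)      # sub-Gaussian (flat)

k_gauss = excess_kurtosis(gaussian)
k_lap = excess_kurtosis(laplacian)
k_uni = excess_kurtosis(uniform)

# A mixture (sum) of two independent uniform sources
mix = rng.uniform(-1.0, 1.0, n) + rng.uniform(-1.0, 1.0, n)
k_mix = excess_kurtosis(mix)

print(k_gauss, k_lap, k_uni, k_mix)
```

In theory the excess kurtosis is 0 for the Gaussian, 3 for the Laplacian, -1.2 for the uniform source and -0.6 for the sum of two independent uniform sources, so the mixture lies closer to the Gaussian value than either of its components; the sample estimates should be close to these values.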
Another assumption made for ICA is that the mixing matrix is invertible. As a result, the number of sources must be equal to or less than the number of microphones. For M microphones and N sources, the case M > N is termed overdetermined, M < N underdetermined and M = N determined. The time-frequency content of the sources can be exploited to solve the underdetermined case as well [13]; however, this requires additional processing steps.
Considering the noise-free version of equation (2.3), the ICA solution consists of finding an unmixing matrix W such that:
S(t) = W X(t). (2.10)
The time index will be dropped in the following representations for simplicity. The coefficients of W could be found using prior measurements of the transfer function H; this, however, is not a practical solution. ICA instead finds the coefficients of B such that:
B H = \breve{I} \breve{D}, (2.11)
where \breve{D} is a diagonal scaling matrix and \breve{I} is an (N × N) permutation matrix. Hence, the separated statistically independent sources can be found as:
Y = BX. (2.12)
Here Y represents the statistically independent components of the separated sources, obtained by processing the mixture signals X. There are several techniques for solving the instantaneous ICA problem.
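Equations (2.11) and (2.12) can be checked on a toy example. This is a minimal sketch, assuming NumPy, a known mixing matrix and hand-picked permutation and scaling matrices; all numerical values and the names `P` and `D` are hypothetical. In practice H is unknown and B is estimated, so the construction of B below is for illustration only.

```python
import numpy as np

rng = np.random.default_rng(1)
N, T = 3, 1000
S = rng.laplace(size=(N, T))             # hypothetical independent sources

# Hypothetical invertible mixing matrix (determined case, M = N)
H = np.array([[1.0, 0.5, 0.2],
              [0.3, 1.0, 0.4],
              [0.2, 0.6, 1.0]])
X = H @ S                                # noise-free instantaneous mixtures

P = np.eye(N)[[2, 0, 1]]                 # permutation matrix
D = np.diag([2.0, -0.5, 1.5])            # diagonal scaling matrix

# Any B with B H = P D is a valid separating matrix in the sense of (2.11)
B = P @ D @ np.linalg.inv(H)
Y = B @ X                                # equation (2.12)

# Each output is one source, but reordered and rescaled
print(np.allclose(B @ H, P @ D))
print(np.allclose(Y[0], 1.5 * S[2]))
```

Each row of Y equals exactly one source, yet neither the order nor the amplitude of the sources is preserved.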
2.1.2.1 PCA Preprocessing
To perform ICA, the sources are assumed to be independent, so that their joint PDF can be written in the form of equation (2.9). PCA preprocessing can be used to decorrelate the mixtures X prior to performing ICA [8]. Independence implies decorrelation, but not vice versa. PCA preprocessing facilitates ICA by performing decorrelation over the lower order statistics. It results in a transformation of X such that:
E[\acute{x} \acute{x}^T] = \hat{I}, (2.13)
where \acute{x} represents the transformation of X, i.e. the whitened data, whose correlation matrix \hat{I} is the identity matrix. This deals only with the second order statistics of the signals. To achieve complete independence, the signals must be processed at all orders using ICA. PCA is therefore used as a preprocessing stage for ICA to decorrelate the data, reduce noise, reduce computation and achieve quicker convergence [35].
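The whitening condition of equation (2.13) can be sketched as follows, assuming NumPy, zero-mean real-valued data and an eigendecomposition of the sample covariance matrix. The function name `whiten` and the test signals are hypothetical; this is one common way of computing a whitening transform, not necessarily the one used in [8] or [35].

```python
import numpy as np

def whiten(X):
    """PCA-whiten X of shape (channels, samples); returns whitened data and the transform."""
    Xc = X - X.mean(axis=1, keepdims=True)       # remove the mean of each channel
    cov = Xc @ Xc.T / Xc.shape[1]                # sample covariance matrix
    eigvals, eigvecs = np.linalg.eigh(cov)       # cov = E diag(eigvals) E^T
    V = np.diag(eigvals ** -0.5) @ eigvecs.T     # whitening matrix
    return V @ Xc, V

rng = np.random.default_rng(2)
S = rng.laplace(size=(2, 10_000))                # hypothetical sources
H = np.array([[1.0, 0.6],
              [0.4, 1.0]])                       # hypothetical mixing matrix
X = H @ S

Z, V = whiten(X)
cov_Z = Z @ Z.T / Z.shape[1]
print(np.round(cov_Z, 6))                        # close to the identity matrix
```

The whitened channels are uncorrelated with unit variance, but they are still mixtures: the remaining unknown is a rotation, which is left to ICA.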
2.1.2.2 ICA-based Techniques
ICA has developed as a general concept representing a family of techniques which aim to extract independent source signals. From the central limit theorem (CLT) it is known that the sum of independent signals tends to be more Gaussian than any of the individual signals considered alone [36]. Hence the independent components are expected to be less Gaussian than their mixtures, and in ICA an unmixing matrix (W) is found such that the non-Gaussianity of the individual source signals is maximised. ICA removes dependence in the higher orders (after PCA preprocessing), and hence ICA is often described as the rotation which makes decorrelated data independent [26]. To achieve independence, higher order statistics such as kurtosis are used. Iterative methods are utilised to either maximise or minimise the chosen cost function, such as one based on kurtosis [37] or on mutual information [8]. All the cost functions relate to non-Gaussianity to some extent. As such, ICA has been formulated with several implementation approaches:
1. Maximum Likelihood (ML) [38][39][40]
2. Maximisation of information transfer [41][42][43]
3. Negentropy [44]
4. Higher order moments and cumulants [8][45]
5. Non-linear PCA [46][47][48]
These approaches are based on cost functions that exploit the non-Gaussianity of the source signals. The choice between cost functions is governed by computational cost or sensitivity to outliers. Several optimisation methods are available for using these cost functions to derive the unmixing matrix W. These include Newton's method [35], steepest gradient descent and stochastic gradient descent [49]. The natural gradient descent method is one of the most commonly used methods for performing ICA [50].
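A natural gradient update for the instantaneous model can be sketched as below. This is a minimal batch version under stated assumptions: NumPy, real-valued super-Gaussian (Laplacian) sources, a tanh score function, and a hypothetical step size and iteration count. It illustrates the general update W ← W + η (I − φ(y) yᵀ) W and is not a reproduction of the algorithm in [50].

```python
import numpy as np

def natural_gradient_ica(X, lr=0.1, n_iter=500):
    """Batch natural gradient ICA with a tanh nonlinearity (assumes super-Gaussian sources)."""
    n, T = X.shape
    W = np.eye(n)                                # initial unmixing matrix
    for _ in range(n_iter):
        Y = W @ X                                # current source estimates
        phi = np.tanh(Y)                         # score function (an assumption)
        W += lr * (np.eye(n) - phi @ Y.T / T) @ W
    return W

rng = np.random.default_rng(3)
S = rng.laplace(scale=1 / np.sqrt(2), size=(2, 20_000))   # unit-variance sources
H = np.array([[1.0, 0.6],
              [0.4, 1.0]])                       # hypothetical mixing matrix
X = H @ S

W = natural_gradient_ica(X)
G = W @ H                                        # global system, ideally permutation x scaling
print(np.round(G, 3))
```

If the separation succeeds, each row of G has a single dominant entry, in agreement with equation (2.11). The tanh nonlinearity suits super-Gaussian sources such as speech; sub-Gaussian sources would need a different score function.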
ICA algorithms for the instantaneous mixture model can be applied at each frequency bin in the frequency domain to deal with convolutive mixtures. Commonly, a Short Time Fourier Transform (STFT) is used to frame the time data and transform each time frame into the frequency domain. The signals are then represented as x (f, t), h (f, t) and s (f, t), which depend on both time and frequency [51].
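The bin-wise structure can be sketched as follows, assuming NumPy, a Hann window and hypothetical frame and hop lengths. The helper `stft` is written out by hand for self-containment, and the unmixing matrices are identity placeholders that are held constant over time frames (a simplifying assumption); in a real system each frequency bin would receive its own matrix from an instantaneous ICA.

```python
import numpy as np

def stft(x, frame_len=256, hop=128):
    """Hann-windowed STFT of a 1-D signal; returns an array of shape (freq bins, frames)."""
    window = np.hanning(frame_len)
    n_frames = 1 + (len(x) - frame_len) // hop
    frames = np.stack([x[i * hop:i * hop + frame_len] * window
                       for i in range(n_frames)])
    return np.fft.rfft(frames, axis=1).T

rng = np.random.default_rng(4)
M, T = 2, 4096                                   # hypothetical: 2 microphones, 4096 samples
x_time = rng.standard_normal((M, T))             # stand-in for recorded mixtures

X = np.stack([stft(ch) for ch in x_time])        # shape (M, F, L)
F, L = X.shape[1], X.shape[2]

# One (M x M) unmixing matrix per frequency bin; identity used as a placeholder
W = np.tile(np.eye(M, dtype=complex), (F, 1, 1)) # shape (F, M, M)

# Apply y(f, l) = W(f) x(f, l) independently at every bin f and frame l
Y = np.einsum('fnm,mfl->nfl', W, X)
print(X.shape, Y.shape)
```

Because every bin is processed independently, the order and scale of the outputs can differ from bin to bin, which is why the alignment of components across frequency has to be handled separately.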
2.1.2.3 Limitations of ICA
In ICA there are ambiguities with regard to the order and scaling of the separated source signal components. The order ambiguity is due to the permutation matrix in equation (2.11). The problem extends to each frequency bin when dealing with convolutive mixtures. The permutation ambiguity is described as the mixing up of estimated independent sound source components due to a lack of proper alignment [13]. It arises because both S and H (equation 2.1) are unknown, so the order of the components can be exchanged even after the independent source components have been determined [52].
Although many post processing algorithms have been proposed to assign the separated independent components to their corresponding sources, the permutation problem has no closed form solution [53]. The proposed solutions either utilise time-frequency source models, exploit the room impulse response or use the array geometry information [31][54][55][56][57][52]. The post processing required to solve the permutation problem adds to the computational cost and, as no closed form solution exists, it is still being widely investigated [58][53].
Another ambiguity faced by ICA is the scaling ambiguity [52]. It is less severe than the permutation ambiguity and arises because the energy of the independent components need not relate to the energy of the source signals. Since both S and H are unknown, any scaling of a source signal S_i by a factor α can be cancelled by dividing the corresponding column H_i of H by α:
X = \sum_i \left( \frac{1}{\alpha} H_i \right) (\alpha S_i). (2.14)
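Equation (2.14) can be verified numerically. The sketch below assumes NumPy and uses hypothetical sources, mixing matrix and scale factors; it allows a different factor for each source, which is a slight generalisation of the single α written above.

```python
import numpy as np

rng = np.random.default_rng(5)
S = rng.laplace(size=(2, 1000))                  # hypothetical sources
H = np.array([[1.0, 0.6],
              [0.4, 1.0]])                       # hypothetical mixing matrix
alpha = np.array([4.0, -0.25])                   # arbitrary non-zero scale factors

H_scaled = H / alpha                             # column i of H divided by alpha_i
S_scaled = alpha[:, None] * S                    # source i multiplied by alpha_i

X = H @ S
X_alt = H_scaled @ S_scaled                      # same observations, different factorisation
print(np.allclose(X, X_alt))
```

Both factorisations produce identical mixtures, so the observations alone cannot fix the amplitude (or sign) of each source.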
This problem is commonly resolved by normalising the unmixing matrix W [59]. Another issue with ICA is the assumption of stationary linear mixing conditions, i.e. the mixing matrix is assumed to be constant, independent of the time-variant source signals [60]. This assumption degrades the effectiveness of ICA in real life scenarios. For example, during an electrocardiogram (ECG) recording the mixing conditions change over time as the person inhales or exhales [61]. Similarly, in acoustic applications, moving sources change the mixing conditions [62], which ICA inherently assumes to be stationary. This assumption therefore creates problems when ICA is used in more practical scenarios.
At the final stage, the estimated unmixing matrix W is used to perform separation in the time-frequency domain using the STFT as:
y (f, l) = W (f, l) x (f, l) , (2.15)
where y (f, l) = [y_1(f, l), y_2(f, l), \ldots, y_N(f, l)]^T contains the source estimates, W (f, l)
is the time-frequency representation of the unmixing matrix and x (f, l) = [x_1(f, l), x_2(f, l) ...