BACKGROUND INTRODUCTION TO
Section 3.5. Solving the permutation problem
dA^ k) = - diag[WT(u;)E(i*, k)W(u)] (3.4.14) where diag(-) denotes the diagonahzation operator which zeros the off- diagonal elements of the matrix. The optimal unmixing matrix W (cj) and the noise autocorrelation matrix Av(cj, k) can then be obtained by a gradient descent algorithm using the gradients formulated in (3.4.12) and (3.4.14), and the autocorrelation matrix of the source signals can be obtained by setting the gradient in (3.4.13) to zero. A more detailed introduction for this approach can be seen in [9].
Although it has been shown that frequency domain methods are more efficient and have a better convergence property, the permutation problem in the frequency domain is more serious as compared with that in the time domain, since the blind estimated unmixing matrix at one frequency bin can at best be obtained up to a scale and per mutation. Therefore, at each frequency bin the separated signal s<(a;) can be recovered at an arbitrary output channel. Consequently, the recovered source signal is not necessarily a consistent estimate of the real source over all frequencies. So it is necessary to solve the permu tation problem and align the unmixing matrix to an appropriate order, so that the source signals can be recovered correctly. This problem will be discussed in the next section. The scale ambiguity problem at each frequency bin is mitigated simply by normalizing the unmixing matrix at each frequency bin [10].
3.5 Solving the permutation problem
Typical methods for solving the permutation problem are introduced in this section. These methods can be divided into four classes:
Section 3.5. Solving the permutation problem 52
1. Exploiting unmixing matrix spectral continuity.
2. Exploiting the similarity of the signal envelope structure at dif ferent frequency bins from the same source signal.
3. Utilizing geometrical constraints, i.e., exploiting the difference of the positions of the source signals.
4. Combined methods, which use the advantages of different ap proaches, whilst avoid their disadvantages.
3.5.1 Exploiting unmixing matrix spectral continuity
Parra & Spence [9] suggest to solve the permutation problem by impos ing a smoothness constraint on the unmixing filters, i.e., a constraint on the filter length Q < T, where T is the FFT window length. This finite length constraint in the time domain forces the solution to be smooth or continuous in the frequency domain, thus promoting convergence to a global minimum.
Smaragdis [34] [35] worked wholly with this problem in the fre quency domain by using ICA algorithms such as the natural gradient algorithm for each frequency bin. To solve the permutation problem, he proposed an adaptive scheme to exploit frequency coupling between neighboring frequency bins. The adaptation of unmixing matrics is formulated as follows:,
AW) = AWf + kAWf - U 0 < k < l (3.5.1) where Wf is the unmixing matrix W at frequency bin / .
Obviously, both methods have an implicit assumption of spectral continuity, i.e., the neighboring unmixing matrix are similar to each
Section 3.5. Solving the permutation problem 53
other.
3.5.2 Exploiting signal envelope structure
Murata et al. [38] solve the permutation problem by exploiting the similarity of the signal envelope structure at different frequency bins from the same source signal. This method can be formulated as follows: l.Sort u; in order of the weakness of correlation between the sepa rated signals 2/<(w,n), i.e., is chosen as the frequency bin in which the separated signals have the weakest correlation.
2.For o>i, fix the order of separated signals.
3.For Uk find the permutation of the order of separated signals which maximizes the correlation between the envelope of the separated signals
Viiojk, n) and the aggregated envelope of the separated signals from
yi(uu n) through yi(uk- i , n ) .
4.Assign the appropriate permutation to all the frequency bins. It is clear to see that this approach implicitly assumes that at dif ferent frequency bins, the envelope structure from the same signal is similar.
3.5.3 Utilizing geometrical constraints
All BSS methods make no assumption about the positions of the sources in the 3-dimensional space. As shown in [41] and [40], the source sig nals generally originate from different spatial locations in practice, and this geometrical information can be utilized to solve the permutation problem. This approach is motivated by the beamforming technique, which estimates the direction of arrival (DOA) of signals in order to steer the beam of an array of sensors to focus on a specific source [42].
Section 3.5. Solving the permutation problem 54
A beamformer consists of an array of sensors in a particular config uration. The output of each sensor is properly filtered and the filtered outputs of all the sensors are added up. Typically, a beamformer lin early combines the spatially sampled waveform from each sensor in the same way as an FIR filter which linearly combines temporally sam pled data. In the beamforming technique, the beamformer response is defined as the amplitude and phase presented to a plain wave as a func tion of location and frequency. Consider the signal is a plain wave with DOA 0 and frequency u>, and for convenience let the phase be zero at the first sensor, the sensor array response vector can be generally expressed as
where r, is the time delay of the source signal at the ith sensor, j =
y/^l. The beamformer response can then be formulated as
where w is the beamformer filter vector. It is straightforward to see in (3.5.3) that the angle between w and d(0, uj) determines the response
r(Q,uj). The ability to discriminate sources at different locations and
frequency is obtained by utilizing the difference of angles of their array response vectors.
It is clear to see that the model used in the beamformer is very similar to that of BSS, in which the row vector of the unmixing matrix can be deemed as a beamformer, and one source from one direction is extracted. The equivalence between frequency-domain blind source separation and frequency-domain adaptive beamforming for convolu-
(3.5.2)
Section 3.5. Solving the permutation problem 55
tive mixtures is discussed in detail in [43]. By constraining the beam- former response to be a constant, the permutation of the unmixing matrix at different frequency bins can be removed [41] and [40]. Simi lar approaches can be seen in [44] [45] and [46].
3.5.4 Combined methods
A combined approach for solving the permutation problem is proposed in [47], which uses both the signal envelope structure and the geomet rical information of source signals. A more robust and precise method for the permutation problem is obtained by utilizing the good proper ties of both single approaches, whilst avoiding their disadvantages. The advantage of the geometrical information based method is the robust ness since a misalignment at one frequency bin does not affect other frequencies. However, DOA cannot be well estimated at some frequen cies, especially at low frequencies where the phase difference caused by the sensor spacing is very small, and also at high frequencies where spatial aliasing might occur. In essence, the DOA approach is not pre cise since the evaluation is based on an approximation of the mixing system. The correlation approach is not robust since a misalignment at one frequency bin may cause consecutive misalignments. The corre lation approach is precise as long as signals are well separated by ICA since the measurement is based on separated signals.
The combined approach integrates the two approaches to solve the permutation problem in the two following steps:
1) Fix the permutations at some frequencies where the confidence of the DOA approach is sufficiently high;