
AVERAGING TECHNIQUES FOR ENVELOPE ESTIMATION

In document Audio signal processing (Page 119-122)

Proper estimation of both the noisy signal envelope and noise envelope is paramount to the performance of any noise reduction technique. Improper estimation of either envelope will result in an unacceptably high level of audible processing artifacts, such as musical noise. Up to this point, instantaneous envelope quantities have been used in the gain formulae of the noise reduction methods discussed. To combat musical noise, however, averaged, or smoothed, envelopes are used. In the next few sections a variety of commonly used time averaging techniques are reviewed and discussed in the context of noise reduction.

Consider first the arithmetic average of the noise envelope, (4.28), where M is the number of samples used in the average. The over-bar notation in (4.28) shall be used throughout to denote any averaged magnitude quantity, regardless of the averaging technique actually used. The reader should note that, in the computation of (4.28), the noise envelope is simply the noisy signal envelope under the assumption that speech is absent.

Because of statistical variation in the noisy signal, it is also advantageous to use a smoothed version of the noisy signal envelope in place of the instantaneous envelope in the gain formulae. Though the amount of smoothing necessary depends upon several aspects of the implementation, such as the subband filter bank structure, the noisy signal envelope is generally smoothed much less than the noise envelope. This is because variations in the two averaged envelopes induce different artifacts in the signal estimate. Referring to (4.26), positive fluctuations in the noise envelope estimate can cause the difference in (4.26) to be negative, requiring rectification. Consequently, it is beneficial to average the noise magnitude over long intervals (assuming stationary noise). Similarly, positive fluctuations in the noisy signal envelope resulting from the statistical variability of the noise component reduce the effectiveness of noise reduction because the gain is larger for a larger noisy signal envelope. Thus, some smoothing of the noisy speech envelope is also beneficial. Excessive smoothing of the noisy speech envelope, however, degrades the speech quality of the signal estimate because speech is not stationary. Excessive smoothing of the noisy signal envelope, and therefore of the gain, disperses the gain to the point that it is no longer well matched to the speech component of the noisy observation.

Early on, Boll [11] described the use of arithmetic averaging to reduce the presence of artifacts. For the STFT filter bank implementation used, Boll applied the same sized average to both the noise and noisy speech envelopes (about 38 ms). McAulay and Malpass [13] also discuss arithmetic averaging.
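As a minimal sketch of the arithmetic average described above, the following computes the mean of the M most recent magnitude samples for a single subband. The function name, the use of a plain list of magnitudes, and the start-up behaviour (averaging over however many samples have arrived so far) are assumptions made for illustration, not details taken from (4.28).

```python
from collections import deque


def moving_average_envelope(magnitudes, M):
    """Arithmetic average of the M most recent magnitude samples.

    Until M samples have arrived, the average is taken over the samples
    seen so far (a start-up convention assumed for this sketch).
    """
    if M < 1:
        raise ValueError("M must be at least 1")
    window = deque(maxlen=M)   # the M-length history the average requires
    running_sum = 0.0
    averaged = []
    for x in magnitudes:
        if len(window) == M:
            running_sum -= window[0]   # oldest sample is about to drop out
        window.append(x)
        running_sum += x
        averaged.append(running_sum / len(window))
    return averaged
```

With M = 2, the input 1, 2, 3, 4 yields 1.0, 1.5, 2.5, 3.5. A longer window would suit the noise envelope and a shorter one the noisy signal envelope, in line with the discussion above.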

4.2 SINGLE-POLE RECURSION

The arithmetic average requires an M-length history of the data. Further, each sample in the average receives the same weight, although it is trivial to include a tapered weighting window in (4.28) if desired. An alternative to arithmetic averaging is recursive averaging. Using a single-pole recursive average, the noise envelope estimate is given by (4.29), in which a single coefficient sets the amount of smoothing. Equation (4.29) defines a first-order lowpass filter, and so the variance of the averaged envelope is less than the variance of the instantaneous envelope itself.
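A single-pole recursive average can be sketched as follows. The unity-DC-gain form used here, the name `beta` for the smoothing coefficient, and the zero initial state are assumptions for illustration; the parameterization of (4.29) may differ.

```python
def single_pole_average(magnitudes, beta, initial=0.0):
    """First-order recursive average of a magnitude sequence.

    Assumed form: avg[m] = beta * avg[m-1] + (1 - beta) * x[m],
    with 0 <= beta < 1. Larger beta means heavier smoothing.
    """
    if not 0.0 <= beta < 1.0:
        raise ValueError("beta must satisfy 0 <= beta < 1")
    avg = initial              # the only state the recursion needs
    averaged = []
    for x in magnitudes:
        avg = beta * avg + (1.0 - beta) * x
        averaged.append(avg)
    return averaged
```

For a constant input of 1.0 and beta = 0.5, the output is 0.5, 0.75, 0.875, converging toward the input level.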

Recursive averaging has been by far the most popular method of averaging used in the spectral noise reduction methods. This is due to its simplicity and efficiency, requiring only a single memory location for state variable storage. Also, because the impulse response corresponding to (4.29) decays geometrically, the recursive average weights the recent past more heavily than the distant past. This characteristic has been found beneficial to noise reduction processing.

Indeed, the first investigations into the use of averaging techniques employed recursive averaging. Schroeder’s method (Fig. 4.2) incorporates an analog version of (4.29) in the signal path common to both the noise and noisy signal envelope estimators. Sondhi et al. [4] experimented with variations on the recursive average for both the power subtraction and magnitude subtraction methods. For computing the averaged noisy signal envelope, cutoff frequencies of between 10 and 30 Hz, or greater, were found to be effective [4]. For estimating the noise envelope, cutoff frequencies of between 1 and 10 Hz were sufficient. Other proponents of the recursive average include McAulay and Malpass [13], Ephraim and Malah [12], and Cappé [16].
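To relate the cutoff frequencies quoted above to a smoothing coefficient, the sketch below solves for the coefficient that places the -3 dB point of a discrete-time single-pole recursion at a chosen frequency. It assumes the recursion avg[m] = beta * avg[m-1] + (1 - beta) * x[m]; the rate at which the envelope is sampled (`envelope_rate_hz`) is a hypothetical parameter that depends on the filter bank and is not given in the text.

```python
import math


def beta_from_cutoff(cutoff_hz, envelope_rate_hz):
    """Coefficient beta whose recursion has its -3 dB point at cutoff_hz.

    Assumes avg[m] = beta * avg[m-1] + (1 - beta) * x[m] and an envelope
    sampled at envelope_rate_hz (an assumed, implementation-specific rate).
    """
    if not 0.0 < cutoff_hz < envelope_rate_hz / 2.0:
        raise ValueError("cutoff must lie between 0 and half the envelope rate")
    c = math.cos(2.0 * math.pi * cutoff_hz / envelope_rate_hz)
    # Setting the squared magnitude response to 1/2 gives
    # beta^2 - 2*(2 - c)*beta + 1 = 0; take the root that lies in (0, 1).
    return (2.0 - c) - math.sqrt((2.0 - c) ** 2 - 1.0)
```

A lower cutoff gives a coefficient closer to 1 and therefore heavier smoothing, which matches the use of lower cutoffs for the noise envelope than for the noisy signal envelope.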

4.3 TWO-SIDED SINGLE-POLE RECURSION

An alternative to the classic single-pole recursive filter involves choosing the smoothing coefficient in (4.29) based upon the magnitude of the current input relative to the current average. Consider the so-called two-sided single-pole recursion, in which the coefficient in (4.29) is given by (4.30), which selects either an “attack” coefficient or a “decay” coefficient. The two-sided recursive average thus employs two different filter response times, depending on whether the input is increasing or decreasing in magnitude relative to the current average. This property can be advantageous.

Consider, first, the computation of the noise envelope estimate. Although it is desirable to update this estimate only when speech is absent, it is not always possible to determine when speech is present and when it is not. If speech or other transient phenomena are present in the input and (4.29) is updated, the noise estimate will become corrupt. This problem can be reduced by a suitable choice of the attack and decay coefficients. In this case increases in the input change the estimate much less than decreases in the input, and therefore the estimate is less [...] utterance when speech energy decays. This characteristic improves the response of (4.26) to the onset of speech.
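The two-sided recursion can be sketched as a single-pole average whose coefficient is switched by comparing each input with the current average. The names `beta_attack` and `beta_decay`, the strict comparison, and the convention that the coefficient multiplies the previous average are assumptions for illustration; (4.30) may define the coefficients differently.

```python
def two_sided_average(magnitudes, beta_attack, beta_decay, initial=0.0):
    """Single-pole average with separate attack and decay coefficients.

    Assumed form: avg[m] = beta * avg[m-1] + (1 - beta) * x[m], where
    beta = beta_attack when x[m] exceeds the current average and
    beta = beta_decay otherwise. Both coefficients lie in [0, 1).
    """
    for beta in (beta_attack, beta_decay):
        if not 0.0 <= beta < 1.0:
            raise ValueError("coefficients must satisfy 0 <= beta < 1")
    avg = initial
    averaged = []
    for x in magnitudes:
        beta = beta_attack if x > avg else beta_decay
        avg = beta * avg + (1.0 - beta) * x
        averaged.append(avg)
    return averaged
```

Under this convention, a noise envelope estimate would use `beta_attack` close to 1 and a smaller `beta_decay`, so that the average rises slowly when the input grows and falls more quickly when it drops.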

Etter and Moschytz [17] and Diethorn [19] used the two-sided single-pole recursion in the context of noise reduction; it is also used in the implementation discussed in Section 5. The technique has its origins in speakerphone technology, where it is used for voice activity detection; for example, see [26].

As a variation on (4.29)–(4.30), Etter and Moschytz [17] also proposed a so-called two-slope limitation filter. In subjective listening tests, this filter reportedly performs better than (4.29)–(4.30) for some material [17].

4.4 NONLINEAR DATA PROCESSING

To improve the stability of the noise estimate further, Sondhi et al. [4] also experimented with a scheme to post-process the noise envelope based upon a short-term histogram of its past values. This early rank ordering technique provided a means to prune wild points from the noise envelope estimate. Median filtering and other rank-order statistical filtering can be used to post-process the noise and noisy signal envelope estimates following any of the averaging techniques described above; see [10] for an early reference on such methods. More recently, Plante et al. [25] have described a noise reduction method using reassignment methods to replace envelope estimates that are deemed erroneous.

In general, nonlinear data processing techniques can provide improved noise reduction performance, although the behavior of such methods is sometimes difficult to analyze.
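As one illustration of rank-order post-processing, the sketch below applies a causal running median to an already averaged envelope. It is a generic median filter and not the histogram scheme of [4] or a method from [10]; the function name, the window length, and the shortened window at start-up are assumptions.

```python
from statistics import median


def median_postprocess(envelope, length=5):
    """Causal running median over the current and previous length-1 values.

    `envelope` is a list of envelope estimates for one subband. At
    start-up the median is taken over the values available so far.
    """
    if length < 1:
        raise ValueError("length must be at least 1")
    smoothed = []
    for m in range(len(envelope)):
        start = max(0, m - length + 1)
        smoothed.append(median(envelope[start:m + 1]))
    return smoothed
```

With a length of 3, an isolated wild point such as the 9.0 in 1.0, 1.0, 9.0, 1.0, 1.0 is removed and the output stays at 1.0 throughout. A linear average of the same length would instead spread the outlier over neighbouring samples.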
