APPLICATION OF WAVELET TRANSFORM FOR SPEECH PROCESSING


SUBHRA DEBDAS

Department of Electrical Engineering, NIMS University, Jaipur, Raipur, C.G., 492007, India

[email protected]

VAISHALI JAGRIT

Department of Electronics & Telecom Engineering, CSVTU Bhilai, Raipur, C.G., 492007, India

[email protected]

CHINMAY CHANDRAKAR

Department of Electronics & Telecom Engineering, CSVTU Bhilai, Bhilai, C.G., 493441, India

[email protected]

M. F. QUERESHI

Department of Electrical Engineering, Govt. Polytechnic College Janjgir, C.G., 495668, India

[email protected]

Abstract:

Wavelet transforms have distinctive features that make them well suited to speech signals. Speech signal processing raises problems of synthesis, analysis, compression, and classification. This paper evaluates a method that uses wavelets for speech analysis and synthesis, and discusses distinguishing between voiced and unvoiced speech, determining pitch, and choosing optimum wavelets for speech compression. Comparative perceptual results, obtained by listening to speech synthesized from both scalar-quantized and vector-quantized wavelet parameters, are reported.

Keywords: signal processing; wavelet transform; decomposition.

1. Introduction

thresholds are computed from features extracted from the input signal. Finally, the dynamic thresholds are used to classify the uncertain parts. The performance of the algorithm has been evaluated using a large speech database, and the algorithm is shown to perform well on both clean and noise-degraded speech.

Many speech signal processing applications are used in the real world [1]. The performance of speech coding and recognition systems that operate in noisy environments decreases when ambient noise levels are high. Speech enhancement has therefore become an active research topic for improving the performance of computer-based speech recognition, coding, and communication applications [2, 3]. Existing methods such as spectral subtraction [4, 5], Wiener filtering [5, 6], and Ephraim-Malah filtering [7] are well known. More recently, wavelet shrinkage has emerged as a powerful tool for removing noise from signals [8–11]. It is a simple denoising technique based on thresholding the wavelet coefficients (WCs). Donoho and Johnstone first proposed a universal threshold for removing additive white Gaussian noise [8, 9]. In addition, a level-dependent threshold was proposed for removing colored noise [12]. Bahoura and Rouat proposed adapting the threshold in the time domain using the Teager energy operator (TEO) [13]; the TEO improves the discriminability of a speech frame. Chen et al. presented an improved wavelet-based speech enhancement method using perceptual wavelet packet decomposition and the TEO. Lu and Wang proposed a method in which the background noise can be almost completely removed by adjusting the wavelet coefficient threshold (WCT) according to the SNR [14]. Since then, adaptive wavelet-based methods for speech enhancement have been widely presented; they adapt the WCT to improve enhancement performance.

For noisy speech, the energies of the unvoiced segments are comparable to those of the noise. Most techniques that use wavelet thresholding for speech enhancement therefore suppress not only the additive noise but also some speech components, such as the unvoiced ones. Consequently, detecting the voiced/unvoiced segments of the speech signal is a central problem in wavelet-based methods.
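The universal-threshold shrinkage mentioned above can be sketched in a few lines. This is a minimal illustration and not the authors' implementation: the function names are hypothetical, the noise standard deviation `sigma` is assumed to be known, and soft thresholding is assumed as the shrinkage rule.

```python
import math


def universal_threshold(sigma, n):
    """Universal threshold sigma * sqrt(2 * ln(n)) for n coefficients.

    sigma is the noise standard deviation (assumed known here).
    """
    return sigma * math.sqrt(2.0 * math.log(n))


def soft_threshold(coeffs, thr):
    """Shrink each coefficient toward zero by thr.

    Coefficients with magnitude at most thr become zero; larger ones
    keep their sign and lose thr from their magnitude.
    """
    return [math.copysign(max(abs(c) - thr, 0.0), c) for c in coeffs]


# Illustrative wavelet coefficients (made-up values, not measured data).
noisy_coeffs = [3.0, -0.5, 0.2, -4.0]
shrunk = soft_threshold(noisy_coeffs, 1.0)
```

Small coefficients, which are assumed to be dominated by noise, are set to zero, while large coefficients are kept with reduced magnitude. This also shows why low-energy unvoiced segments are at risk: their coefficients can fall below the threshold together with the noise.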

2. Wavelet Transform

The wavelet transform can analyze different speech quality problems simultaneously in the time and frequency domains, and it is useful for detecting and extracting disturbance features from various types of speech signal. Like Fourier analysis, wavelet analysis expands functions in terms of a set of basis functions. Unlike the Fourier transform, which provides only frequency information, the wavelet transform provides both time and frequency information about the signal.

The Fourier transform is a frequency-domain approach that converts a continuous-time signal $x(t)$ into its frequency-domain representation $X(f)$, calculated by the Fourier transform integral

$$X(f) = \int_{-\infty}^{\infty} x(t)\, e^{-j 2\pi f t}\, dt$$

The disadvantage of the frequency-domain approach is that a significant amount of time-domain information may be lost in the transformation. This information cannot be retrieved unless a permanent record of the raw signal has been kept. The Short-Time Fourier Transform (STFT) overcomes this problem to some extent: it multiplies the time series by a short time window and takes the Fourier transform of each windowed segment. For a signal $x(t)$ and a window $w(t)$, it is written as

$$\mathrm{STFT}(\tau, f) = \int_{-\infty}^{\infty} x(t)\, w(t-\tau)\, e^{-j 2\pi f t}\, dt$$

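For discrete signals, this transform is known as the Short-Time Discrete Fourier Transform (STDFT). For a signal $x[n]$ and a window $w[n]$ of length $N$, it is expressed as

$$\mathrm{STFT}(m, k) = \sum_{n=-\infty}^{\infty} x[n]\, w[n-m]\, e^{-j 2\pi k n / N}$$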
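A direct evaluation of the discrete short-time transform can be sketched as follows. The sketch assumes a window of length N that is non-zero only for indices 0 to N-1, so that frame m uses samples m to m+N-1, and it evaluates the sum over n of x[n] w[n-m] exp(-j 2 pi k n / N). The function names, the hop size, and the Hann window helper are illustrative assumptions and are not taken from the paper.

```python
import cmath
import math


def hann(length):
    """Symmetric Hann window (one common window choice, assumed here)."""
    return [0.5 - 0.5 * math.cos(2.0 * math.pi * n / (length - 1))
            for n in range(length)]


def stdft(x, window, hop):
    """Short-time DFT by direct summation.

    Returns frames[i][k] for frame start m = i * hop and bin k, where
    frames[i][k] = sum_n x[n] * w[n - m] * exp(-2j*pi*k*n / N).
    """
    n_win = len(window)
    frames = []
    for m in range(0, len(x) - n_win + 1, hop):
        frame = []
        for k in range(n_win):
            acc = 0j
            for i in range(n_win):
                n = m + i  # absolute sample index
                acc += x[n] * window[i] * cmath.exp(-2j * math.pi * k * n / n_win)
            frame.append(acc)
        frames.append(frame)
    return frames


# Test tone: two cycles per 8 samples, analysed with a rectangular window.
tone = [math.cos(2.0 * math.pi * 2 * n / 8) for n in range(16)]
frames = stdft(tone, [1.0] * 8, hop=8)
```

The direct double loop costs O(N^2) per frame and is written for clarity only; a practical implementation would use an FFT for each windowed segment.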


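sum the result to produce a single value. The continuous wavelet transform (CWT) is defined as the convolution between the original signal $s(t)$ and a wavelet $\psi_{a,b}(t)$:

$$\mathrm{CWT}(a, b) = \int_{-\infty}^{\infty} s(t)\, \frac{1}{\sqrt{a}}\, \psi^{*}\!\left(\frac{t-b}{a}\right) dt$$

where $s(t)$ is the input signal, $a$ is the scaling factor, $b$ is the translation parameter, $\psi(t)$ is called the mother wavelet, and $*$ denotes complex conjugation (which has no effect for a real-valued wavelet). The wavelet function is given by

$$\psi_{a,b}(t) = \frac{1}{\sqrt{a}}\, \psi\!\left(\frac{t-b}{a}\right)$$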
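As a numerical illustration of the CWT definition, the integral can be approximated by a Riemann sum over uniformly spaced samples. The Mexican-hat mother wavelet used below is an assumption made only for this sketch (the paper itself uses a Daubechies wavelet later on), and the function names are hypothetical. Because the wavelet is real-valued, the complex conjugate is omitted.

```python
import math


def mexican_hat(t):
    """Unit-energy Mexican-hat mother wavelet (illustrative choice only)."""
    c = 2.0 / (math.sqrt(3.0) * math.pi ** 0.25)
    return c * (1.0 - t * t) * math.exp(-0.5 * t * t)


def cwt_coefficient(s, times, a, b, psi=mexican_hat):
    """Approximate (1/sqrt(a)) * integral of s(t) * psi((t - b)/a) dt.

    s and times are equal-length sequences; times must be uniformly spaced.
    """
    dt = times[1] - times[0]
    total = sum(x * psi((t - b) / a) for x, t in zip(s, times))
    return total * dt / math.sqrt(a)


# Use the wavelet itself as the test signal: the coefficient is largest
# where the scaled, shifted wavelet lines up with the signal.
times = [-8.0 + 0.01 * i for i in range(1601)]
signal = [mexican_hat(t) for t in times]
c_match = cwt_coefficient(signal, times, a=1.0, b=0.0)
c_shift = cwt_coefficient(signal, times, a=1.0, b=3.0)
```

Scanning a and b over a grid and storing each coefficient gives the time-scale map that the CWT provides; the dyadic grid of the DWT is one particular sampling of that map.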

The Discrete Wavelet Transform (DWT) coefficients are usually sampled from the CWT on a dyadic grid, with translation $b = n 2^m$ and scale $a = 2^m$, and the DWT is defined as

$$\mathrm{DWT}(m, n) = \frac{1}{\sqrt{2^m}} \sum_{k} s(k)\, \psi\!\left(\frac{k - n 2^m}{2^m}\right)$$

3. Circuit operation

In the proposed data acquisition system for speech separation, voice signals from speakers of different ages are recorded in MATLAB by connecting a wave (audio) device to a wave file (Fig. 1). The wavelet density function is then used to detect and extract the voice signal and to determine the speaker's age. All recorded signals are stored in a data file.

Fig. 1. Line-in port to MATLAB

All MATLAB files are then processed with the wavelet density estimation technique. The Daubechies wavelet is taken as the mother wavelet for further processing. Daubechies coefficient analysis:

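$$\phi(t) = h(0)\sqrt{2}\,\phi(2t) + h(1)\sqrt{2}\,\phi(2t-1) + h(2)\sqrt{2}\,\phi(2t-2) + h(3)\sqrt{2}\,\phi(2t-3)$$

where the scaling filter coefficients are

$$h(0) = \frac{1+\sqrt{3}}{4\sqrt{2}},\quad h(1) = \frac{3+\sqrt{3}}{4\sqrt{2}},\quad h(2) = \frac{3-\sqrt{3}}{4\sqrt{2}},\quad h(3) = \frac{1-\sqrt{3}}{4\sqrt{2}}$$

and the wavelet filter coefficients, $g(k) = (-1)^k h(3-k)$, are

$$g(0) = \frac{1-\sqrt{3}}{4\sqrt{2}},\quad g(1) = -\frac{3-\sqrt{3}}{4\sqrt{2}},\quad g(2) = \frac{3+\sqrt{3}}{4\sqrt{2}},\quad g(3) = -\frac{1+\sqrt{3}}{4\sqrt{2}}$$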
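The four-tap Daubechies filter values can be checked numerically. The sketch below uses the standard four-tap values, in which h(2) carries the factor (3 - √3) and h(3) the factor (1 - √3), and builds g from h by the alternating-flip rule g(k) = (-1)^k h(3-k). The function and dictionary names are hypothetical.

```python
import math


def daubechies4_filters():
    """Return (h, g): four-tap Daubechies scaling and wavelet filters."""
    r3 = math.sqrt(3.0)
    norm = 4.0 * math.sqrt(2.0)
    h = [(1 + r3) / norm, (3 + r3) / norm, (3 - r3) / norm, (1 - r3) / norm]
    g = [(-1) ** k * h[3 - k] for k in range(4)]  # alternating flip of h
    return h, g


h, g = daubechies4_filters()

# Properties an orthonormal four-tap filter pair is expected to satisfy.
checks = {
    "sum_h": sum(h),                                     # sqrt(2)
    "energy_h": sum(c * c for c in h),                   # 1
    "shift_orthogonality": h[0] * h[2] + h[1] * h[3],    # 0
    "sum_g": sum(g),                                     # 0
}
```

The shift-orthogonality check is the one that depends on the order of h(2) and h(3): swapping those two values leaves the sum and the energy unchanged but makes h(0)h(2) + h(1)h(3) non-zero.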

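$$S_j(k) = \sum_{m=2k}^{2k+N-1} h(m-2k)\, S_{j+1}(m)$$

For the Daubechies wavelet $N = 4$, so for $k = 0, 1, 2, 3$:

$$\begin{aligned}
S_j(0) &= h(0)S_{j+1}(0) + h(1)S_{j+1}(1) + h(2)S_{j+1}(2) + h(3)S_{j+1}(3) \\
S_j(1) &= h(0)S_{j+1}(2) + h(1)S_{j+1}(3) + h(2)S_{j+1}(4) + h(3)S_{j+1}(5) \\
S_j(2) &= h(0)S_{j+1}(4) + h(1)S_{j+1}(5) + h(2)S_{j+1}(6) + h(3)S_{j+1}(7) \\
S_j(3) &= h(0)S_{j+1}(6) + h(1)S_{j+1}(7) + h(2)S_{j+1}(8) + h(3)S_{j+1}(9)
\end{aligned}$$

The detail coefficients are obtained in the same way with $g$ in place of $h$:

$$\begin{aligned}
d_j(0) &= g(0)S_{j+1}(0) + g(1)S_{j+1}(1) + g(2)S_{j+1}(2) + g(3)S_{j+1}(3) \\
d_j(1) &= g(0)S_{j+1}(2) + g(1)S_{j+1}(3) + g(2)S_{j+1}(4) + g(3)S_{j+1}(5) \\
d_j(2) &= g(0)S_{j+1}(4) + g(1)S_{j+1}(5) + g(2)S_{j+1}(6) + g(3)S_{j+1}(7) \\
d_j(3) &= g(0)S_{j+1}(6) + g(1)S_{j+1}(7) + g(2)S_{j+1}(8) + g(3)S_{j+1}(9)
\end{aligned}$$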
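One level of this decomposition can be sketched as a pair of filter-and-downsample sums, each output k using input samples 2k to 2k+N-1. The sketch is an illustration under stated assumptions: the function name is hypothetical, no boundary extension is applied (only positions where the whole filter fits are computed), and the standard four-tap Daubechies values are used, with g(k) = (-1)^k h(3-k).

```python
import math


def analysis_step(s_next, h, g):
    """One decomposition level: S_j(k) and d_j(k) from S_{j+1}(m).

    S_j(k) = sum over m = 2k .. 2k+N-1 of h(m - 2k) * S_{j+1}(m),
    and d_j(k) is the same sum with g in place of h.
    """
    n_taps = len(h)
    n_out = (len(s_next) - n_taps) // 2 + 1  # positions where the filter fits
    approx, detail = [], []
    for k in range(n_out):
        window = s_next[2 * k: 2 * k + n_taps]
        approx.append(sum(hc * x for hc, x in zip(h, window)))
        detail.append(sum(gc * x for gc, x in zip(g, window)))
    return approx, detail


# Standard four-tap Daubechies filters.
r3, norm = math.sqrt(3.0), 4.0 * math.sqrt(2.0)
h = [(1 + r3) / norm, (3 + r3) / norm, (3 - r3) / norm, (1 - r3) / norm]
g = [(-1) ** k * h[3 - k] for k in range(4)]

# Ten input samples S_{j+1}(0..9) give outputs for k = 0, 1, 2, 3.
ramp = [float(m) for m in range(10)]
approx, detail = analysis_step(ramp, h, g)
flat_approx, flat_detail = analysis_step([1.0] * 10, h, g)
```

With ten input samples the function returns four approximation and four detail coefficients, matching the index ranges written out above. For a constant or linearly increasing input the detail coefficients vanish up to rounding, so the detail channel responds only to faster variation in the signal.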

Some of the density coefficients are shown in Figs. 2-5.

Fig. 2. Male voice signal

Fig. 3. Synthesized voice signal

Fig. 5. Selected coefficient

4. Conclusion

A Daubechies wavelet density estimation technique has been applied to voice synthesis. All voice data were processed with the MATLAB Wavelet Toolbox. The method is simple, and the proposed implementation of the scheme is explained with block diagrams. The scheme was verified through simulation and gives satisfactory performance.

References

[1] J. Deller, J. Proakis, and J. Hansen, Discrete-Time Processing of Speech Signals, Prentice-Hall, Englewood Cliffs, NJ, USA, 1993.

[2] B. H. Juang, “Recent developments in speech recognition under adverse conditions,” in Proceedings of the International Conference on Spoken Language Process (ICSLP ’90), pp. 1113–1116, 1990.

[3] J.-H. Chen and A. Gersho, “Adaptive post filtering for quality enhancement of coded speech,” IEEE Transactions on Speech and Audio Processing, vol. 3, no. 1, pp. 59–71, 1995.

[4] S. F. Boll, “Suppression of acoustic noise in speech using spectral subtraction,” IEEE Transactions on Acoustics, Speech, and Signal Processing, vol. 27, no. 2, pp. 113–120, 1979.

[5] J. R. Deller, J. H. L. Hansen, and J. G. Proakis, Discrete-Time Processing of Speech Signals, IEEE Press, New York, NY, USA, 2nd edition, 2000.

[6] S. Haykin, Adaptive Filter Theory, Prentice-Hall, Upper Saddle River, NJ, USA, 3rd edition, 1996.

[7] Y. Ephraim and D. Malah, “Speech enhancement using a minimum mean-square error short-time spectral amplitude estimator,” IEEE Transactions on Acoustics, Speech, and Signal Processing, vol. 32, no. 6, pp. 1109–1121, 1984.

[8] D. L. Donoho, “De-noising by soft-thresholding,” IEEE Transactions on Information Theory, vol. 41, no. 3, pp. 613–627, 1995.

[9] D. L. Donoho and I. M. Johnstone, “Ideal spatial adaptation by wavelet shrinkage,” Biometrika, vol. 81, no. 3, pp. 425–455, 1994.

[10] S.-H. Chen and J.-F. Wang, “Speech enhancement using perceptual wavelet packet decomposition and Teager energy operator,” Journal of VLSI Signal Processing Systems for Signal, Image, and Video Technology, vol. 36, no. 2-3, pp. 125–139, 2004.

[11] S.-F. Lei and Y.-K. Tung, “Speech enhancement for nonstationary noises by wavelet packet transform and adaptive noise estimation,” in Proceedings of the International Symposium on Intelligent Signal Processing and Communication Systems (ISPACS ’05), pp. 41–44, Hong Kong, December 2005.

[12] I. M. Johnstone and B. W. Silverman, “Wavelet threshold estimators for data with correlated noise,” Journal of the Royal Statistical Society. Series B, vol. 59, no. 2, pp. 319–351, 1997.

[13] M. Bahoura and J. Rouat, “Wavelet speech enhancement based on the Teager energy operator,” IEEE Signal Processing Letters, vol. 8, no. 1, pp. 10–12, 2001.

[14] C.-T. Lu and H.-C. Wang, “Enhancement of single channel speech based on masking property and wavelet transform,” Speech