Design and Implementation of Comb Filter using Field Programmable Gate Array for Improving Speech

(1)

Abstract— Hearing loss affects approximately about 10% of the population due to different reasons across the world.Sensorineural Hearing loss caused due to poor cochlear hair cell function and damages cochlear nerve. Sensorineural hearing impairment is the responsible for widening of auditory filter which leads to spectral masking and degrades speech perception. Different researches are incorporated to minimize this effect but from previous studies it is observed that Splitting of signal into two different parts and providing them dichotically to both the ears reduces it to effectively. Here, several technique are discussed to improve speech perception. Design of filter is carried out using structural realization of filter as it provides better performance compared to filter design using equation or convolution. In this project different filters structure realization is compared on basis of resources and time required for processing given data. Proposed scheme reduce the resources utilization and optimizes processing block effectively.

Index Terms— Binaural hearing aid, Dichotic presentation, FPGA implementation and Sensorineural hearing impairment.

Hearing is complex process , the reason it has been called complex hearing includes capacity of both ear for detection of sound to the and brains potential for interpretation of sound [19]. Hearing impairment is classified based on the location of defect in an ear. There are different types of hearing loss classified based on location of defect in auditory system [5]. Conduction hearing loss occurs due to an abnormality in the middle ear leading to poor transmission of the sound to the inner ear.

Sensorineural loss is caused by pathology in the cochlea and/or due to degeneration of the auditory nerves. Mixed hearing loss is combination of Conductive and Sensorineural hearing loss Central loss occurs due to inability of the brain in decoding the neural firings into meaningful linguistic information.

In general, most commonly found hearing loss is sensorineural loss and it progressively gets worse with time.

Sensorineural hearing loss is widening of the auditory filter

bandwidths thus the filter slope becomes glib and results in overlap of adjacent spectral bands called spectral masking. In Fig. 1 indicate, peaks and valleys of the speech spectrum are broadened affecting the perception of speech because of masking. This leads to a decrease in frequency resolving [9] results in overlap of adjacent spectral bands called spectral masking. The peaks and valleys of the speech spectrum are broadened affecting the perception of speech because of masking. This leads to a decrease in frequency resolving capacity of the auditory system of the ears. Sensorineural loss is associated with elevated hearing thresholds, decreased dynamic range and intolerable loudness, and increased temporal and spectral masking, resulting in degraded speech perception [3].

Spectral contrasts, reduces it results in broadening of auditory filters. Increased temporal masking results in the increase of forward and backward masking of weak acoustic segments by strong ones, which also affects speech intelligibility.

Hearing aids generally provide frequency-selective amplification to compensate for the elevated hearing thresholds. Automatic volume control and multichannel dynamic range compression are used to partially address the problems associated with the reduced dynamic range and loudness recruitment [3]. If there is increased in temporal masking , it results in poor detection of acoustic events. Increased spectral masking caused by widening of auditory filters results in poor discrimination of spectral features. The increased masking makes speech perception very difficult in the presence of noise. It also results in poor speech perception due to increased intra speech masking. Improvement of consonant-to-vowel ratio (CVR) has been investigated for reducing the effects of increased temporal masking [12],. Techniques based on spectral contrast enhancement [15] and multiband frequency compression [14] have been used for reducing the effects of increased spectral masking.

II. SPEECH PROCESSING TECHNIQUES

REVIEW

Researcher across the world employed speech processing methods for improving speech perception. A scheme proposed for speech discrimination in noise due to reduced frequency selectivity. Sum and difference with delayed analog signal [7] are employed in this scheme. The delay was adjusted for changing the bandwidth to 200Hz,

Design and Implementation of Comb Filter using Field Programmable Gate Array for Improving Speech

Perception

Sayli Pradhan, D. S. Chaudhari

(2)

improvement observed with the scheme. Later some of the researcher designed an 8-channel digital filter bank [6], realized using complementary interpolated linear phase filters with constant bandwidth of 700Hz, for spectral splitting. The filter gains were complementary on a linear scale and the magnitude responses had much better separation of pass and stop bands as compared to the filters used earlier by Lyregaad (1982)[7]. Dichotic presentation resulted in increase of 2 dB in signal-to-noise-ratio

.

Chaudhari and Pandey investigated a scheme for splitting the speech into two complementary spectra each with nine bands for binaural dichotic presentation using psychophysical tuning curves[13]. The FIR comb filters were designed by frequency sampling method using linear optimization techniques. The investigated design was able to improve recognition scores, response time, speech quality and transmission of consonantal features as observed in the listening tests. Twelve Different English Nonsense syllable formed using /p, b, t, d, k, g, m, n, s, z, f, v/ in vowel-consonant-vowel (VCV) and consonant-vowel (CV) context with the vowel /a/ were used as testing sound and incorporated using antialiasing filter for removing noise components with cut-off frequency 4.8 kHz and 16-bit ADC at a rate of 10 k Samples/s. The process of filtering speech was done offline and real time. The input and processed speech signal with the said scheme was presented to two ears through DAC, antialiasing filter and power amplifier. Hearing test conducted on ten subject of age group 18-58 years. The subject had mild-to-very severe bilateral sensorineural hearing loss. The sound to be tested was presented binaurally at the individual subject’s most comfortable listening level. Listening tests were conducted with ten subjects in VCV and CV context. All the subjects have shown significant improvement in recognition scores and significant decrease in response time[2].

.

A study of integration of information in two different spectral regions in monaural presentation of speech signal by normal listeners [11], indicated that one of the two bands could be attenuated by up to 40 dB before intelligibility was affected. The result shows a ripple in the perceived loudness of spectral components during binaural dichotic presentation may not adversely affect speech intelligibility. In the other study[17], the effect of dichotic presentation on source localization, by dividing the speech spectrum into a lower and an upper band, for two inter-band crossovers at 800Hz and 1.6 kHz. Under diotic presentation(same sound presented to both ear). These studies indicate the need for assessing the effect of binaural dichotic presentation using the proposed set of comb filters on source localization.

Some of the researcher designed the comb filters with 256 coefficients for improving band separation and reducing the variation [16] in the loudness of spectral components with frequency. The filter magnitude responses had pass-band ripple less than 1 dB, stop-band attenuation of greater than 30 dB, transition bandwidth of 78 – 117 Hz, and gain of -4 to - 6

subjects showed an improvement of 7 – 20% in consonant recognition. The improvement was mainly observed in the place feature, indicating that the processing reduced the effect of increased spectral masking. Chaudhari et al. proposed a scheme in which eighteen critical bands corresponding to auditory filters based on psychophysical tuning curves were used. In binaural dichotic presentation scheme, splitting of speech signal in real time [4] into two signals with complementary short time spectra using filters with magnitude based response on two complementary auditory filter banks with linear phase was implemented and evaluated. Filter banks corresponding to eighteen critical bands over 5kHz frequency range were used. Listening tests were conducted on subjects with mild to severe very severeǁ bilateral sensorineural hearing loss. The improvement possible with the scheme was better reception of spectral characteristics was evident as the results indicated improvement in speech quality, response time decrease, enhancement in recognition scores. Comb filters were designed having adjustable magnitude response at transition crossovers for minimizing any change in perception intensity. It also promoted pass band and increase attenuation in stop band. Listening tests were conducted on normal subjects with simulated hearing loss.

The designed filter indicated better speech recognition scores and relative information transmission than the earlier filter. Therefore the designed comb filter can be used for binaural aids for persons with bilateral sensorineural hearing impairment, for reducing the effect of spectral masking [2].

The speech perception gets degraded due to increased spectral and temporal masking in persons with sensorineural hearing loss.

Abed and Neurkar reported comb half-band FIR-FIR structure and conventional comb- FIR-FIR decimation filter for same specifications were implemented for comparative study [18]. The designed decimation filter of half-comb-band architecture contributed to a hardware saving of 69% as compared to the comb-FIR-FIR architecture; in addition, it reduced the power consumption by 83%, respectively. The resulting architecture was hardware efficient and consumed less power compared to conventional decimation filters. The resulting architecture was hardware efficient and consumed less power compared to conventional decimation filters.

A scheme proposed the use of three different types of bandwidths in the comb filter magnitude response: constant bandwidth filters with number of bands varying from 2 to 20, critical band based comb filters and 1/3 octave bandwidth filters [8]. In case of constant bandwidth filters, comb filters designed with up to 14 bands resulted in laterization of sounds presented. Constant bandwidth filters with 16 or more bands, critical band based filters, and 1/3 octave bandwidth based filters did not show this problem. In the presence of noise, all the three types of filters improved speech intelligibility. Improvements because of Critical Band based and 1/3 octave band based filters were similar for the same SNR, and better than constant bandwidth filters.

FPGA-based implementation of 513-coefficient filters using MatLab based offline processing [3] with sampling

(3)

frequency of 10 k Sample/s was carried out for use in binaural hearing aids. Implementation using a 16- bit CODEC and 15-bit integer filter coefficients used 47, 34, and 53% of combinational functions, logic registers, and logic elements, respectively, available on FPGA kit.

Binaural presentation through the headphones of stimuli used for listening tests did not show any distortion for the processed sound. The resulting magnitude responses have a close match to the offline floating-point implementation.

III. IMPLEMENTATION

Modeling of Sensorineural Hearing Aid , scheme of binaural dichotic presentation was used to reduce the effect of spectral

and temporal masking simultaneously for person with bilateral sensorineural hearing loss. Pair of time varying comb filter was used to split the speech signal into two for binaural dichotic presentation. The proposed method suggests the implementation of comb filter using field programmable logic array(FPGA). Implementation of the comb filters using different architectures was investigated to compare the filter characteristics and the resource requirements in order to evaluate the feasibility for use in binaural hearing aids. Implementation of digital filter was carried out using delays, multiplier and adder connected in specific structure to realize comb filter. As shown in Fig. 1, Personnel computer has data in audio form this data was given to serial conversion kit through which it becomes input to FPGA kit and FPGA kit provides two output for different ear which can observed on Personnel computer. In this scheme, Filter bank is designed using FPGA, FPGA implementation of filter was beneficial for fast prototype designing of Application Specific Integrated Circuit (ASIC) to reduce power and area constrains.

Fig. 1 Modeling of Hearing Aid for Sensorineural Hearing Loss

A scheme of binaural dichotic presentation is used to reduce the effect of spectral and temporal masking simultaneously for person with bilateral sensorineural hearing loss. Pair of time varying comb filter is used to split the speech signal into two for binaural dichotic presentation.

The proposed method suggests the implementation of comb filter using field programmable logic array. Implementation of the comb filters using different architectures is investigated to compare the filter characteristics and the resource requirements in order to evaluate the feasibility for use in binaural hearing aids. Implementation of digital filter is carried out using delays, multiplier and adder connected in specific structure to realize comb filter.

Filters can be implemented in different ways of realization filter structure offer different trade-offs between the number of processing blocks, tolerance to coefficient quantization errors, and dynamic range requirements for intermediate values.

Fig. 2 Comb filter realization using FIR Direct form I

Filter equation can be formed based on type of realization is used from Fig. 2 direct-form FIR filter realization, with input x(n),filter coefficients m and output y(n) given as

) 1 ( 1) - N - p(n h + ..

+ 1) - p(n h + p(n) h

= )

(n ₀ ₁ _N_-₁

q

Fig. 3 FIR Filter Realization using Transposed Structure[3]

A transposed-form structure as shown in Fig. 3 (for odd N) can be used for linear-phase FIR filters having symmetric impulse response, with the output as the following:

Exploiting the symmetry in coefficients, it reduces the number of multipliers to half as compared to that in the direct-form realization.

(2) 1)/2) - (N - [p(n h

+ ...

+ 2))]

- (N - p(n

+ 1) - [p(n h + 1))]

- (N - p(n + [p(n) h

= q(n)

1)/2) - ((N

1 0

The processing architecture as shown in Fig. 4 consists of a multiplier, an adder, N registers (Reg-I, 0 < I < N−1) for intermediate results, 2 registers (Reg-B, Reg-C) as buffers, a multiplexer (MUX-H) for tap weights, and a multiplexer (MUX-R) for intermediate results. It also has a sequence controller (SEQ. CNTRL.) which controls the sequence operation. MUX-sel is changed sequentially in phase with the processing clock. In each processing cycle, the input Personnel

Computer

Serial Communication

kit

FPGA Kit Altera Cyclone

DE 2 Board

(4)

added with the correspondingly selected register content.

Fig. 4 Comb Filter Design using Sequential Multiply Accumulate[3]

After the first processing cycle of a sampling interval, the Ro output is taken as the output y(n), given as the following:

N) , (n R

= )

( n

₀

q

N) 1, - (n R + p(n) h

= )

( n

₀ ₁

q

N) 2, - (n R + 1) - p(n h + p(n) h

= )

( n

₀ ₁ ₂

q

N) 3, - (n R + 2) - p(n h + 1) - p(n h + p(n) h

= )

( n

₀ ₁ ₂ ₃

q

…

) 3 ( 1) - N - p(n h + ..

+ 1) - p(n h + p(n) h

= )

( n

₀ ₁ _N_-₁

q

This method called as sequential as it does multiplication sequentially and uses only one multiplier and adder. In design a comb filter one by one delays are going to be added and multiplied so it can be effectively done with sequential multiply accumulate

Sequential Multiply Accumulate overcomes the drawback in parallel multiply accumulate operation .Sequential multiplier is used for optimize processing block .In parallel multiply accumulate clock cycle is idle for most of time duration sampling interval , sequential clock frequency more and less fanout which ultimately reduces the area and compilation time.

The direct form structure with sequential multiply accumulate has been used in designing of proposed filter as this approach gives a better performance than common structures in terms of speed of operation, cost and power consumption. The concept of pipelining has been incorporated that results in reducing the delay of the FIR filter, thereby enhancing the speed and reducing the power dissipation as compared to the non-pipelined techniques.

The choice between structures depends on a number of factors and trade-offs which include ease of implementation that is the implied hardware or software complexity, area

factors Direct Form Parallel Multiply Accumulate (Direct PMA), Transposed Structure Parallel Multiply Accumulate (Transposed PMA), and Direct Form Sequential Multiply Accumulate (Direct SMA) filter structures are compared.

In Fig. 6 three filter structures are compared based on number of slices and LUTs. It is observed that for proposed method that is direct form using sequential multiply accumulate resources required are reduced compared to remaining methods. Fig. 5 provides the complete information about logic utilization based on available and utilized devices for sequential multiply accumulate algorithm.

Fig. 5 Comparison of Filter Structures Device Utilization Based on Slice Flipflop and LUT’s

Fig. 6 Comparison of Filter Structures Based on Fanout When different filter structure compared on the basis of fanout. It is observed in Fig. 6 fanout is reduced in direct form filter structure using sequential multiply accumulate clock frequency more and less fanout which ultimately reduces the area and compilation time.

When this different filter structure compared based on time to process intermediate signal like offset before clock, offset after clock and delay. From Fig. 7 it is observed that

Direct

PMA Transposed

PMA

Direct SMA

Fanou

(5)

mentioned parameters are reduced in direct form sequential multiply accumulate.

Fig. 7: Comparison of Filter Structure Based on Intermediate Time to Process

When this different filter structure compared based on time to process intermediate signal like offset before clock, offset after clock and delay. From Fig. 7 it is observed that mentioned parameters are reduced in direct form sequential multiply accumulate.

This indicate total real time to xst completion time given by different filter structure it is found that for direct for filter using sequential multiply accumulate completion time is reduced significantly shown in Fig. 8

Fig.8 : Comparison of filter based on time of completion IV. RESULTS

Slice flip flop are resources on the FPGA that can perform logic functions. Logic resources are grouped in slices to create configurable logic blocks. A slice contains a set number of LUTs, flip-flop and multiplexers. In FIR filter structure using direct form parallel multiply accumulate number of slice flip-flop used are 57, filter structure using transposed parallel multiply accumulate number of slice flip-flop used are 80. In proposed method, Direct form structure using sequential multiply accumulate number of slice flip-flop used are 16 only hence reduced by 28%.

A LUT’s is a collection of logic gates hard-wired on the FPGA. LUTs store a predefined list of outputs for every combination of inputs and provide a fast way to retrieve the output of a logic operation. In FIR filter structure using direct form parallel Multiply Accumulate number of LUTs used are 328 , filter structure using transposed parallel Multiply Accumulate number of LUTs used are 322. In proposed method, direct form structure using sequential multiply accumulate number of slice flip-flop used are 32 only hence reduced significantly compared to other methods.

Time of completion using direct form parallel Multiply Accumulate time of completion is 8.66 seconds , filter structure using transposed parallel Multiply Accumulate time of completion is 8 seconds. In proposed method, Direct form structure using Sequential Multiply Accumulate time of completion is 5 seconds only hence reduced significantly compared to other methods.

V.CONCLUSION

The FIR filters are extensively used in digital signal processing and can be implemented using programmable digital processors. With the advancement in Very Large Scale Integration (VLSI) technology as the DSP has become increasingly popular over the years, the high speed realization of FIR filters with less power consumption has become much more demanding. Since the complexity of implementation grows with the filter order and the precision of computation, real-time realization of these filters with desired level of accuracy is now becoming a challenging task. So, the implementation of FIR filters on FPGAs is the need of the day because FPGAs can give enhanced speed and allows reconfigurable architectures for realization of FIR filter.This is due to the fact that the hardware implementation of a lot of multipliers can be done on FPGA which are limited in case of programmable digital processors.

The direct form structure may used in designing of this filter as this approach gives a better performance than common structures in terms of speed of operation, cost and power consumption. In direct form structure, N+2 shift registers, a adder and a multiplier used to realize the N order low pass filter. The concept of pipelining has been incorporated that power dissipation. The filter architecture using sequential multiply accumulate operations may found to be more efficient in resource utilization, and has shown the scope for implementing other processing blocks of a hearing aid on the same chip. Comb filter design itself, reducing the resource requirement and the delay in the signal processing path.

Acknowledgment

I would like to take the opportunity to express gratitude for doing everything from the inception of the project idea to giving invaluable suggestions at every step and giving me the opportunity to work on this project to my guide Dr. D. S.

Chaudhari in making this paper successful.

(6)

[1] J. Chandwani and D. Chaudhari, “Review on Improving Speech Perception to Sensorineural Hearing Impairment using Filters", Int. J.

Advanced Research in Computer Science and Software Engineering, vol. 3, pp. 207-209, 2013.

[2] D. Chaudhari and P. Pandey, “Dichotic Presentation of Speech Signal using Critical Filter Bank for Bilateral Sensorineural Hearing Impairment", Proc. 16th ICA, J. Acoust. Soc. Am., vol. 1, pp. 213–214, 2008.

[3] S. Kambalimath, P. Pandey, P. Kulkarni, S. Shetti, and S. Hiremath,

“FPGA Based Implementation of Comb Filters for Use in Binaural Hearing Aids for Reducing Intraspeech Spectral Masking", Int. Conf.

on Signal Processing, Communication and Networking IIT Bombay, pp. 1–5, 2013.

[4] D. Chaudhari, “Binaural Dichotic Presentation of Speech Signal for Improving its Perception to Sensorineural Hearing-Impaired using Auditory Filter", IETE Technical Review, 2008.

[5] M. Steel, “Binaural Fusion and Listening Effort in Children Who Use Bilateral Cochlear Implants", PubMed Central, /, January 15, 2016.

www.ncbi.nlm.nih.gov/pmc/articles/PMC4323344/

[6] Y. Lunnner, S. Arlinger and J. Hellgren, “8-channel Digital Filter Bank for Hearing Aid Use: Preliminary Results in Monaural, Diotic and Dichotic Modes", Scand. Audiol. Suppl., vol. 38, pp. 75–81, 1993.

[7] P. Lyregaard, “Frequency Selectivity and Speech Intelligibility in Noise", Scand. Audiol. Suppl., vol. 15, pp. 113–121, 1982.

[8] P. Kulkarni, P. Pandey, and D. Jangamashetti, “Binaural Dichotic Presentation to Reduce the Effects of Spectral Masking in Bilateral Sensorineural Hearing Loss", International Journal Audiology, vol.51, pp. 334–344, 2012.

[9] P. Kulkarni and P. Pandey, “Optimizing the Comb Filters for Spectral Splitting of Speech to Reduce the Effect of Spectral Masking", Int.

Conf. on Signal processing, Communications and Networking Madras Institute of Technology, pp. 75–79, 2008.

[10] R.. Warren, J. Bashford Jr. and P. Lenz, “Intelligibility of Dual Rectangular Speech Bands: Implications of Observations Concerning Amplitude Mismatch and Asynchrony", Speech Commun., vol. 40 , pp.

551–558, 2004.

[11] D. Jangamashetti, P. Pandey, and A. Cheeran, “Time Varying Comb Filters to Reduce Spectral and Temporal Masking in Sensorineural Hearing Impairment", Int. Conf. on Biomedical Engineering, pp.

445–451, 2001.

[12] E. Kennedy, H. Levitt, A. Neuman, and M. Weiss, “Consonant-Vowel Intensity Ratios for Maximizing Consonant Recognition by Hearing Impaired Listeners", J. Acoust. Soc. Am., vol. 103, pp. 1098–1114, 1998.

[13]E. Zwicker, “Subdivision of audible frequency range into critical bands", J. Acoust. Soc. Am., vol. 33, pp. 248–250, 1961.

[14] R. Gonzalez and R.Woods,”Digital Image processing", 2nd ed., Salt Lake City, Prentice Hall, 2001.

[15] P. Kulkarni, P. Pandey, and D. Jangamashetti, “Multiband Frequency Compression for Improving Speech Perception by Listeners with Moderate Sensorineural Hearing Loss", Speech Commun., vol. 54, pp.

341–350, 2012.

[16] T. Baer, B. Moore, and S. Gatehouse, “Spectral Contrast Enhancement of Speech in Noise for Listeners with Sensorineural Hearing Impairment: Effects on Intelligibility, Quality, and Response Times", J.

Rehabilit. Research and Development, vol. 30, pp. 49–72, 1993.

[17] A. Cheeran and P. Pandey, “Speech processing for hearing aids for moderate bilateral sensorineural hearing loss", Int. Conf. on Acoustics, Speech and Signal Processing, Montreal, Quebec, vol. 4, pp. 17–20, 2004.

Sayli Pradhan, received the B.Tech degree in Electronics and Communication engineering from JNTU university, Hyderabad, Telangana India in 2013.

She is currently persuing M.tech in Electronics system and communication engineering in Government college of engineering, Amravati, India