4.4.3 Nelder-Mead Simplex Optimisation
The Nelder-Mead (NM) method, developed by Nelder and Mead (1965), optimises a scalar-valued nonlinear function of a number of real variables using only function values, without any derivative information. The method is based on the iterative update of a simplex.
As the function J(δ) that we are maximising with respect to the parameters δ may be non-convex and need not be differentiable, we use the direct NM simplex search method. The NM simplex search method is designed for solving the unconstrained optimisation problem, that is, for finding a local optimum of a function of several variables (Lagarias et al., 1998; Nelder and Mead, 1965). Since maximising J is equivalent to minimising −J, the algorithm is stated below in its standard minimisation convention.
Let J be our objective function of m dimensions. A simplex is a geometric shape in m dimensions that is the convex hull of m + 1 vertices. The simplex with vertices δ1, δ2, · · · , δm+1 is denoted by ∆. The NM method iteratively yields a sequence of simplexes to approximate an optimal solution δ̃ of Equation 4.16. In each iteration k, the vertices {δi, i = 1, · · · , m + 1} of the simplex are sorted by their objective function values as:

J(δ1) ≤ J(δ2) ≤ · · · ≤ J(δm+1), (4.17)

where δ1 denotes the best vertex, and δm+1 denotes the worst.
There are four alternative operations used in the NM method: reflection, expansion, contraction, and shrink, each of which is associated with a scalar parameter: ρ1 (reflection), ρ2 (expansion), ρ3 (contraction), and ρ4 (shrink). These parameters satisfy the following conditions:

ρ1 > 0, ρ2 > 1, 0 < ρ3 < 1, and 0 < ρ4 < 1.
According to the standard implementation of the NM method, the parameters are defined to be ρ1 = 1, ρ2 = 2, ρ3 = 1/2, and ρ4 = 1/2.
Furthermore, let δ̄ denote the centroid of the m best vertices, defined as

δ̄ = (1/m) ∑_{i=1}^{m} δi.
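The centroid computation above can be illustrated with a short snippet; the array layout and variable names here are ours, not from the thesis:

```python
import numpy as np

# Simplex for m = 2: rows are the vertices, already ordered best to worst.
simplex = np.array([[0.0, 0.0],   # d1 (best)
                    [1.0, 0.0],   # d2
                    [0.0, 1.0]])  # d3 = d_{m+1} (worst)

# Centroid of the m best vertices: (1/m) * sum_{i=1}^{m} d_i
centroid = simplex[:-1].mean(axis=0)
# centroid -> array([0.5, 0.0])
```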
A summary of one iteration of the Nelder-Mead (NM) algorithm, as described by Lagarias et al. (1998), is presented as follows:
Summary of one iteration of the Nelder-Mead (NM) algorithm

1. Order. Evaluate J at the m + 1 vertices of ∆ and order the vertices so that Equation 4.17 holds.

2. Reflection.
(a) Calculate the reflection point δr as follows:
δr = δ̄ + ρ1(δ̄ − δm+1).
(b) Evaluate Jr = J(δr). If J1 ≤ Jr < Jm, replace δm+1 with δr.

3. Expansion.
(a) If Jr < J1, calculate the expansion point δe as follows:
δe = δ̄ + ρ2(δr − δ̄).
(b) Evaluate Je = J(δe). If Je < Jr, replace δm+1 with δe; otherwise replace δm+1 with δr.

4. Outside Contraction.
(a) If Jm ≤ Jr < Jm+1, calculate the outside contraction point δoc as
δoc = δ̄ + ρ3(δr − δ̄).
(b) Evaluate Joc = J(δoc). If Joc ≤ Jr, replace δm+1 with δoc; if the condition does not apply, go to step 6.

5. Inside Contraction.
(a) If Jr ≥ Jm+1, calculate the inside contraction point δic as follows:
δic = δ̄ − ρ3(δr − δ̄).
(b) Evaluate Jic = J(δic). If Jic < Jm+1, replace δm+1 with δic; if the condition does not apply, go to step 6.

6. Shrink. For 2 ≤ i ≤ m + 1, define δi = δ1 + ρ4(δi − δ1), and evaluate J at the new vertices.
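The iteration summarised above can be sketched in code. This is a minimal illustration in the standard minimisation convention (maximising J is handled by minimising −J); the function name, initial-simplex construction, and stopping rule are our assumptions, not part of the thesis:

```python
import numpy as np

def nelder_mead(f, x0, step=0.5, max_iter=500, tol=1e-8,
                rho1=1.0, rho2=2.0, rho3=0.5, rho4=0.5):
    """Minimise f with the Nelder-Mead simplex method (Lagarias et al., 1998)."""
    m = len(x0)
    # Initial simplex: x0 plus m vertices perturbed along the coordinate axes.
    simplex = np.vstack([x0] + [x0 + step * e for e in np.eye(m)])
    fvals = np.array([f(v) for v in simplex])
    for _ in range(max_iter):
        # 1. Order: sort so that f(d1) <= ... <= f(d_{m+1}) (Eq. 4.17).
        order = np.argsort(fvals)
        simplex, fvals = simplex[order], fvals[order]
        if fvals[-1] - fvals[0] < tol:
            break
        centroid = simplex[:-1].mean(axis=0)  # centroid of the m best vertices
        # 2. Reflection.
        xr = centroid + rho1 * (centroid - simplex[-1])
        fr = f(xr)
        if fvals[0] <= fr < fvals[-2]:
            simplex[-1], fvals[-1] = xr, fr
        elif fr < fvals[0]:
            # 3. Expansion.
            xe = centroid + rho2 * (xr - centroid)
            fe = f(xe)
            if fe < fr:
                simplex[-1], fvals[-1] = xe, fe
            else:
                simplex[-1], fvals[-1] = xr, fr
        elif fr < fvals[-1]:
            # 4. Outside contraction.
            xoc = centroid + rho3 * (xr - centroid)
            foc = f(xoc)
            if foc <= fr:
                simplex[-1], fvals[-1] = xoc, foc
            else:
                # 6. Shrink towards the best vertex.
                simplex[1:] = simplex[0] + rho4 * (simplex[1:] - simplex[0])
                fvals[1:] = [f(v) for v in simplex[1:]]
        else:
            # 5. Inside contraction.
            xic = centroid - rho3 * (xr - centroid)
            fic = f(xic)
            if fic < fvals[-1]:
                simplex[-1], fvals[-1] = xic, fic
            else:
                # 6. Shrink towards the best vertex.
                simplex[1:] = simplex[0] + rho4 * (simplex[1:] - simplex[0])
                fvals[1:] = [f(v) for v in simplex[1:]]
    return simplex[0]

# Maximise J(d) = -(d1 - 1)^2 - (d2 + 2)^2 by minimising -J; optimum at (1, -2).
opt = nelder_mead(lambda d: (d[0] - 1)**2 + (d[1] + 2)**2, np.array([0.0, 0.0]))
```

Note that the branch structure mirrors the case analysis of steps 2-5: exactly one of reflection, expansion, outside contraction, or inside contraction is attempted per iteration, and shrink acts as the fallback when a contraction fails.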
4.5 Chapter Summary
This chapter is a core chapter of this thesis, and has been devoted to a closed-loop optimisation based near-end intelligibility enhancement framework by means of gammatone filterbank analysis-resynthesis. First, it introduced basic notions of speech analysis and provided an in-depth explanation of the gammatone analysis system. Next, we examined the possible classes of speech manipulation with the aim of optimising intelligibility. An efficient way to resynthesise speech was then presented using the gammatone filterbank and a peak-alignment method. The final section of this chapter dealt with solving the optimisation problem by employing an unconstrained optimisation method known as the Nelder-Mead simplex direct search method.
In the following three chapters, we will see how this framework can be implemented by selecting an intelligibility model and a modification strategy. Starting with Chapter 5, we will use a measure of energetic masking and a spectral modification strategy.
Chapter 5

Spectral Modification Based on Glimpse Proportion Measure
5.1 Introduction
Pre-enhancement generally works by making the speech harder to mask. In Lombard speech this equates to increasing the intensity, increasing f0, lengthening vowel duration, and reducing spectral tilt to boost the high frequencies (Junqua, 1993; Lu and Cooke, 2008; Van Summers et al., 1988); readers are referred to Section 3.2 for more information about Lombard speech. Noise-adaptive algorithms in particular make similar changes, although generally a fixed-intensity constraint is applied because it is undesirable to maintain intelligibility by simply boosting the signal energy; readers are referred to Section 3.3.2 for a review of noise-adaptive algorithms.
As mentioned in the previous chapter, the challenge for pre-enhancement systems is to optimally adjust the parameters of the speech modification algorithm. Typically the parameters may be tuned using knowledge of the background noise and an objective intelligibility model, i.e., they are adjusted so as to maximise the predicted intelligibility.
This automated closed-loop approach allows the adoption of highly flexible near-end enhancement algorithms that can finely control the acoustic features of the speech. We thus defined, in the previous chapter, a general framework for a closed-loop near-end intelligibility enhancement system. In this chapter, we implement this framework using an intelligibility model and a speech modification strategy.
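The closed-loop tuning described above can be sketched as follows. This is a hedged illustration only: `predicted_intelligibility` and `modify_speech` are placeholder stand-ins for the intelligibility model and modification strategy developed in this chapter, and the fixed-intensity constraint is enforced here by simple energy renormalisation:

```python
import numpy as np
from scipy.optimize import minimize

def modify_speech(speech, params):
    # Placeholder modification strategy: per-band gains (illustrative only).
    return speech * params

def predicted_intelligibility(modified, noise):
    # Placeholder objective: a band-SNR-like score (illustrative only).
    return float(np.sum(modified**2 / (noise**2 + 1e-9)))

speech = np.array([1.0, 0.5, 0.2])  # toy per-band speech energies
noise = np.array([0.1, 0.5, 1.0])   # toy per-band noise energies

def objective(params):
    modified = modify_speech(speech, params)
    # Fixed-intensity constraint: renormalise to the original signal energy.
    modified *= np.linalg.norm(speech) / (np.linalg.norm(modified) + 1e-9)
    # Maximise predicted intelligibility by minimising its negative.
    return -predicted_intelligibility(modified, noise)

result = minimize(objective, x0=np.ones(3), method='Nelder-Mead')
```

The key design point is the loop itself: the optimiser never inspects the speech directly, only the scalar score returned by the intelligibility model, which is what makes derivative-free NM search a natural fit.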
The starting point for implementing the closed-loop near-end intelligibility enhancement framework is to select a model of speech intelligibility. The choice here is made based on an auditory masking model used in the missing data theory (e.g., (Cooke et al.,