4.4.3 Nelder-Mead Simplex Optimisation
The Nelder-Mead (NM) method, developed by Nelder and Mead (1965), optimises a scalar-valued nonlinear function of a number of real variables using only function values, without any derivative information. The method is based on the iterative update of a simplex.
As the function J(δ) that we are maximising with respect to the parameters δ may be non-convex and need not be differentiable, we use the direct NM simplex search method. The NM simplex search method is designed for solving the unconstrained optimisation problem, that is, for finding a local optimum of a function of several variables (Lagarias et al., 1998; Nelder and Mead, 1965). Since maximising J is equivalent to minimising −J, the algorithm is stated below in its standard minimisation convention.
Let J be our objective function of m dimensions. A simplex is a geometric shape in m dimensions that is the convex hull of m + 1 vertices. The simplex with vertices δ1, δ2, · · · , δm+1 is denoted by ∆. The NM method iteratively yields a sequence of simplexes to approximate an optimal solution δ̃ of Equation 4.16. In each iteration k, the vertices {δi, i = 1, · · · , m + 1} of the simplex are sorted by their objective function values as:

J(δ1) ≤ J(δ2) ≤ · · · ≤ J(δm+1), (4.17)

where δ1 denotes the best vertex, and δm+1 denotes the worst.
There are four alternative operations used in the NM method: reflection, expansion, contraction, and shrink, each of which is associated with a scalar parameter: ρ1 (reflection), ρ2 (expansion), ρ3 (contraction), and ρ4 (shrink). These parameters satisfy the following conditions:

ρ1 > 0, ρ2 > 1, 0 < ρ3 < 1, and 0 < ρ4 < 1.
According to the standard implementation of the NM method, the parameters are defined to be ρ1 = 1, ρ2 = 2, ρ3 = 1/2, and ρ4 = 1/2.
Furthermore, let δ̄ denote the centroid of the m best vertices, defined as

δ̄ = (1/m) ∑_{i=1}^{m} δi.
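The centroid computation above can be illustrated with a short snippet; the array layout and variable names here are ours, not from the thesis:

```python
import numpy as np

# Simplex for m = 2: rows are the vertices, already ordered best to worst.
simplex = np.array([[0.0, 0.0],   # d1 (best)
                    [1.0, 0.0],   # d2
                    [0.0, 1.0]])  # d3 = d_{m+1} (worst)

# Centroid of the m best vertices: (1/m) * sum_{i=1}^{m} d_i
centroid = simplex[:-1].mean(axis=0)
# centroid -> array([0.5, 0.0])
```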
A summary of one iteration of the Nelder-Mead (NM) algorithm, as described by Lagarias et al. (1998), is presented as follows:
Summary of one iteration of the Nelder-Mead (NM) algorithm

1. Order. Evaluate J at the m + 1 vertices of ∆ and order the vertices so that Equation 4.17 holds.

2. Reflection.
(a) Calculate the reflection point δr as follows:
δr = δ̄ + ρ1(δ̄ − δm+1).
(b) Evaluate Jr = J(δr). If J1 ≤ Jr < Jm, replace δm+1 with δr.

3. Expansion.
(a) If Jr < J1, calculate the expansion point δe as follows:
δe = δ̄ + ρ2(δr − δ̄).
(b) Evaluate Je = J(δe). If Je < Jr, replace δm+1 with δe; otherwise replace δm+1 with δr.

4. Outside Contraction.
(a) If Jm ≤ Jr < Jm+1, calculate the outside contraction point δoc as
δoc = δ̄ + ρ3(δr − δ̄).
(b) Evaluate Joc = J(δoc). If Joc ≤ Jr, replace δm+1 with δoc; if the condition does not apply, go to step 6.

5. Inside Contraction.
(a) If Jr ≥ Jm+1, calculate the inside contraction point δic as follows:
δic = δ̄ − ρ3(δr − δ̄).
(b) Evaluate Jic = J(δic). If Jic < Jm+1, replace δm+1 with δic; if the condition does not apply, go to step 6.

6. Shrink. For 2 ≤ i ≤ m + 1, define δi = δ1 + ρ4(δi − δ1), and evaluate J at the new vertices.
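The iteration summarised above can be sketched in code. This is a minimal illustration in the standard minimisation convention (maximising J is handled by minimising −J); the function name, initial-simplex construction, and stopping rule are our assumptions, not part of the thesis:

```python
import numpy as np

def nelder_mead(f, x0, step=0.5, max_iter=500, tol=1e-8,
                rho1=1.0, rho2=2.0, rho3=0.5, rho4=0.5):
    """Minimise f with the Nelder-Mead simplex method (Lagarias et al., 1998)."""
    m = len(x0)
    # Initial simplex: x0 plus m vertices perturbed along the coordinate axes.
    simplex = np.vstack([x0] + [x0 + step * e for e in np.eye(m)])
    fvals = np.array([f(v) for v in simplex])
    for _ in range(max_iter):
        # 1. Order: sort so that f(d1) <= ... <= f(d_{m+1}) (Eq. 4.17).
        order = np.argsort(fvals)
        simplex, fvals = simplex[order], fvals[order]
        if fvals[-1] - fvals[0] < tol:
            break
        centroid = simplex[:-1].mean(axis=0)  # centroid of the m best vertices
        # 2. Reflection.
        xr = centroid + rho1 * (centroid - simplex[-1])
        fr = f(xr)
        if fvals[0] <= fr < fvals[-2]:
            simplex[-1], fvals[-1] = xr, fr
        elif fr < fvals[0]:
            # 3. Expansion.
            xe = centroid + rho2 * (xr - centroid)
            fe = f(xe)
            if fe < fr:
                simplex[-1], fvals[-1] = xe, fe
            else:
                simplex[-1], fvals[-1] = xr, fr
        elif fr < fvals[-1]:
            # 4. Outside contraction.
            xoc = centroid + rho3 * (xr - centroid)
            foc = f(xoc)
            if foc <= fr:
                simplex[-1], fvals[-1] = xoc, foc
            else:
                # 6. Shrink towards the best vertex.
                simplex[1:] = simplex[0] + rho4 * (simplex[1:] - simplex[0])
                fvals[1:] = [f(v) for v in simplex[1:]]
        else:
            # 5. Inside contraction.
            xic = centroid - rho3 * (xr - centroid)
            fic = f(xic)
            if fic < fvals[-1]:
                simplex[-1], fvals[-1] = xic, fic
            else:
                # 6. Shrink towards the best vertex.
                simplex[1:] = simplex[0] + rho4 * (simplex[1:] - simplex[0])
                fvals[1:] = [f(v) for v in simplex[1:]]
    return simplex[0]

# Maximise J(d) = -(d1 - 1)^2 - (d2 + 2)^2 by minimising -J; optimum at (1, -2).
opt = nelder_mead(lambda d: (d[0] - 1)**2 + (d[1] + 2)**2, np.array([0.0, 0.0]))
```

Note that the branch structure mirrors the case analysis of steps 2-5: exactly one of reflection, expansion, outside contraction, or inside contraction is attempted per iteration, and shrink acts as the fallback when a contraction fails.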
4.5 Chapter Summary
This chapter is a core chapter of this thesis, and has been devoted to a closed-loop optimisation based near-end intelligibility enhancement framework by means of gammatone filterbank analysis-resynthesis. First, it introduced basic notions of speech analysis and provided an in-depth explanation of the gammatone analysis system. Next, we examined the possible classes of speech manipulation with the aim of optimising intelligibility. An efficient way to resynthesise speech was then presented using the gammatone filterbank and a peak-alignment method. The final section of this chapter dealt with solving the optimisation problem by employing an unconstrained optimisation method known as the Nelder-Mead simplex direct search method.
In the following three chapters, we will see how this framework can be implemented by selecting an intelligibility model and a modification strategy. Starting with Chapter 5, we will use a measure of energetic masking and a spectral modification strategy.
Chapter 5

Spectral Modification Based on Glimpse Proportion Measure
5.1 Introduction
Pre-enhancement generally works by making the speech harder to mask. In Lombard speech this equates to increasing the intensity, increasing f0, lengthening vowel duration, and reducing spectral tilt to boost the high frequencies (Junqua, 1993; Lu and Cooke, 2008; Van Summers et al., 1988); readers are referred to Section 3.2 for more information about Lombard speech. Noise-adaptive algorithms in particular make similar changes, although generally a fixed-intensity constraint is applied because it is undesirable to maintain intelligibility by simply boosting the signal energy; readers are referred to Section 3.3.2 for a review of noise-adaptive algorithms.
As mentioned in the previous chapter, the challenge for pre-enhancement systems is to optimally adjust the parameters of the speech modification algorithm. Typically the parameters may be tuned using knowledge of the background noise and an objective intelligibility model, i.e., they are adjusted so as to maximise the predicted intelligibility.
This automated closed-loop approach allows the adoption of highly flexible near-end enhancement algorithms that can finely control the acoustic features of the speech. We thus defined, in the previous chapter, a general framework for a closed-loop near-end intelligibility enhancement system. In this chapter, we implement this framework using an intelligibility model and a speech modification strategy.
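The closed-loop tuning described above can be sketched as follows. This is a hedged illustration only: `predicted_intelligibility` and `modify_speech` are placeholder stand-ins for the intelligibility model and modification strategy developed in this chapter, and the fixed-intensity constraint is enforced here by simple energy renormalisation:

```python
import numpy as np
from scipy.optimize import minimize

def modify_speech(speech, params):
    # Placeholder modification strategy: per-band gains (illustrative only).
    return speech * params

def predicted_intelligibility(modified, noise):
    # Placeholder objective: a band-SNR-like score (illustrative only).
    return float(np.sum(modified**2 / (noise**2 + 1e-9)))

speech = np.array([1.0, 0.5, 0.2])  # toy per-band speech energies
noise = np.array([0.1, 0.5, 1.0])   # toy per-band noise energies

def objective(params):
    modified = modify_speech(speech, params)
    # Fixed-intensity constraint: renormalise to the original signal energy.
    modified *= np.linalg.norm(speech) / (np.linalg.norm(modified) + 1e-9)
    # Maximise predicted intelligibility by minimising its negative.
    return -predicted_intelligibility(modified, noise)

result = minimize(objective, x0=np.ones(3), method='Nelder-Mead')
```

The key design point is the loop itself: the optimiser never inspects the speech directly, only the scalar score returned by the intelligibility model, which is what makes derivative-free NM search a natural fit.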
The starting point for implementing the closed-loop near-end intelligibility enhancement framework is to select a model of speech intelligibility. The choice here is made based on an auditory masking model used in the missing data theory (e.g., (Cooke et al.,