
4.2 Dispersion optimisation of SF57 HFs with the simplex method

4.2.1 The Downhill Simplex method

The Downhill Simplex method (also called the Nelder-Mead method, after its original authors [188]) is one of the simplest multidimensional optimisation methods, arguably providing, of all algorithms, the most functionality for the least amount of code. In optimisation problems where the analytical calculation of partial derivatives with respect to the design parameters is impossible, as in the current problem (where the function being searched is obtained from numerical calculations), the simplex method can provide an efficient solution. The Nelder-Mead algorithm evaluates the multidimensional function to be optimised at a constellation of points, randomly chosen at the beginning of the process, and uses an iterative procedure, based on a few simple rules, to find the point at which the function is maximised or minimised.

A simplex is a geometrical figure that has one more vertex than dimensions: a triangle in 2D, a tetrahedron in 3D, etc. The algorithm starts by choosing an initial simplex (for example by adding D unit vectors to the starting point chosen by the user) and evaluating the function being searched at the D+1 vertices of the simplex. Then, at each step an iterative procedure attempts to improve the vertex with the highest value of the function (assuming the goal is minimisation; for maximisation the vertex with the smallest value is updated). The possible moves are summarised in Figure 4.1.
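The construction of the initial simplex described above can be sketched as follows. This is a minimal illustration in Python, not the code used in this work; the names `initial_simplex` and `worst_vertex_index` and the `step` parameter are assumptions introduced here for the example.

```python
def initial_simplex(start, step=1.0):
    """Return D+1 vertices: the start point plus one vertex offset along each axis."""
    dim = len(start)
    simplex = [list(start)]
    for i in range(dim):
        vertex = list(start)
        vertex[i] += step  # add `step` times the i-th unit vector
        simplex.append(vertex)
    return simplex


def worst_vertex_index(simplex, func):
    """Index of the vertex with the highest function value (the one to improve
    when minimising)."""
    values = [func(v) for v in simplex]
    return max(range(len(simplex)), key=lambda i: values[i])
```

For a two-parameter problem, `initial_simplex([1.0, 2.0])` gives the triangle with vertices (1, 2), (2, 2) and (1, 3). In practice `step` would be scaled to the typical size of each design parameter.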

[Figure 4.1 panels: initial simplex (labels: highest point, mean); reflect; reflect and grow; reflect and shrink; shrink; shrink towards minimum]

Figure 4.1: Downhill Simplex method in 2D: update moves.

The first step is to calculate the centre of the face of the simplex defined by all of the vertices other than the one we are going to improve (from here on we will assume that we want to minimise a function). Since the other vertices have a better function value, it is a reasonable guess that they indicate a good direction in which to move. The next step is therefore to reflect the point across the face (reflect). If the function value at the new vertex has improved, the move is clearly a good one, so it is worth checking whether doubling the size of the step is even better (reflect and grow). If growing gives a better function value than reflecting alone we keep the move; otherwise we go back to the point found by reflecting alone. If growing succeeds it is possible to try moving the point further still in the same direction, but this is not normally done because it would result in long and asymmetric simplex shapes. For the simplex to be most effective, its size in each direction should be similar; after growing it is therefore better to go back and improve the new worst point.

If the point obtained after reflection is still the worst point, we have probably overshot the minimum of the function. Therefore, instead of reflecting and growing, we can try to reflect and shrink. If this results in a better point we keep the move and go on to update the new worst point; if it still results in a worse point we can try just to shrink. If after shrinking we still have not found a point providing an improvement, we conclude that the moves we are making are too big to find the minimum, so we shrink all the vertices towards the best one (shrink towards the minimum).
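The update rules above can be sketched in a few lines of Python. This is an illustrative reading of the rules, not the implementation used in this work. Several details are assumptions made for the example: the factor 0.5 used for the shrinking moves, the acceptance test "the new point is no longer the worst vertex", the stopping rule in `minimise`, and all function names. Function values are recomputed freely to keep the sketch short; with an expensive solver they would be cached.

```python
def centroid(points):
    """Centre of the face defined by `points`."""
    dim = len(points[0])
    return [sum(p[i] for p in points) / len(points) for i in range(dim)]


def move_along(centre, worst, factor):
    """Point centre + factor * (centre - worst)."""
    return [c + factor * (c - w) for c, w in zip(centre, worst)]


def simplex_step(simplex, func):
    """Apply one update move (minimisation) and return the new simplex."""
    ordered = sorted(simplex, key=func)         # best first, worst last
    best, worst = ordered[0], ordered[-1]
    second_worst_value = func(ordered[-2])
    centre = centroid(ordered[:-1])             # face opposite the worst vertex

    reflected = move_along(centre, worst, 1.0)  # reflect
    if func(reflected) < second_worst_value:
        grown = move_along(centre, worst, 2.0)  # reflect and grow
        new_point = grown if func(grown) < func(reflected) else reflected
        return ordered[:-1] + [new_point]

    # The reflected point would still be the worst: try shorter moves.
    for factor in (0.5, -0.5):                  # reflect and shrink, then shrink
        candidate = move_along(centre, worst, factor)
        if func(candidate) < second_worst_value:
            return ordered[:-1] + [candidate]

    # Nothing helped: move every vertex halfway towards the best one.
    return [[b + 0.5 * (v - b) for b, v in zip(best, vertex)]
            for vertex in ordered]


def minimise(func, simplex, tol=1e-12, max_steps=500):
    """Repeat update moves until the spread of function values is below `tol`."""
    for _ in range(max_steps):
        values = [func(v) for v in simplex]
        if max(values) - min(values) < tol:
            break
        simplex = simplex_step(simplex, func)
    return min(simplex, key=func)
```

As a quick check, minimising the test function (x - 1)^2 + (y + 2)^2 from the triangle (0, 0), (1, 0), (0, 1) begins with a reflect-and-grow move that replaces the vertex (0, 1) with (1.5, -2).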

This set of simple rules automatically shrinks and enlarges the simplex to find the most advantageous length scale for searching in each direction; the only required inputs are the definition of the function to maximise or minimise, a starting point and a stopping criterion. A possible drawback of the algorithm, in its simplest form, is that it easily gets caught in a local minimum. This behaviour can be overcome by forcing the search to sometimes make moves that are not locally the best choice. This can be done, for example, by adding a momentum to the search, so that a fraction of the previous move is added to the new update move calculated by the algorithm. Of course, including a momentum does not guarantee that local minima will be avoided: this depends on the amount of momentum, on the initial condition and on the shape of the function. Moreover, adding a momentum will generally slow down the search for simpler functions.
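One possible reading of the momentum idea is sketched below, purely as an illustration. The function name `with_momentum`, the fraction `beta` and the choice to carry the blended move forward as the next "previous move" are all assumptions; the text does not prescribe a particular form.

```python
def with_momentum(current, proposed, previous_move, beta=0.3):
    """Add a fraction `beta` of the previous move to the newly proposed move.

    current       -- vertex being updated
    proposed      -- position suggested by the simplex rules
    previous_move -- displacement applied at the previous update
    Returns the new vertex and the move actually applied.
    """
    move = [p - c for p, c in zip(proposed, current)]
    blended = [m + beta * pm for m, pm in zip(move, previous_move)]
    new_point = [c + b for c, b in zip(current, blended)]
    return new_point, blended
```

With `beta = 0` the update reduces to the plain simplex move; larger values push the search further along its recent direction, which is what may carry it out of a shallow local minimum.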

Local minima were indeed found in the optimisations reported in this section. However, due to the small number of free parameters considered (only 2 or 3), the implementation of the additional momentum was not considered necessary, and the local minima were avoided simply by starting each optimisation from a number of different initial conditions. As will be presented in Section 4.3, when a larger number of free parameters is taken into account, thus increasing the complexity of the function to optimise and presumably the number of local minima, a different optimisation strategy based on a Genetic Algorithm will be applied.
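The multi-start strategy can be illustrated with the short Python sketch below. It is not the code used in this work. The names `multi_start` and `pattern_search`, the random choice of starting points within `bounds` and the number of starts are assumptions. To keep the example self-contained, `pattern_search` is a simple axis-wise local search standing in for the simplex routine; any local minimiser with the same call signature could be passed instead.

```python
import random


def multi_start(local_minimise, func, bounds, n_starts=10, seed=0):
    """Run a local optimiser from several starting points and keep the best result."""
    rng = random.Random(seed)
    best_point, best_value = None, float("inf")
    for _ in range(n_starts):
        start = [rng.uniform(lo, hi) for lo, hi in bounds]
        point = local_minimise(func, start)
        value = func(point)
        if value < best_value:
            best_point, best_value = point, value
    return best_point, best_value


def pattern_search(func, start, step=0.5, tol=1e-6):
    """Stand-in local optimiser (NOT the simplex method): try +/- step along
    each axis and halve the step whenever no trial improves the point."""
    point = list(start)
    while step > tol:
        improved = False
        for i in range(len(point)):
            for delta in (step, -step):
                trial = list(point)
                trial[i] += delta
                if func(trial) < func(point):
                    point, improved = trial, True
        if not improved:
            step *= 0.5
    return point


def double_well(p):
    """Hypothetical 1D test function with a local minimum near x = 0.96 and a
    deeper one near x = -1.04."""
    return (p[0] ** 2 - 1.0) ** 2 + 0.3 * p[0]


point, value = multi_start(pattern_search, double_well, [(-2.0, 2.0)])
```

A single run started at x = 1.5 settles in the shallower minimum on the positive side, whereas the multi-start run returns the deeper minimum on the negative side.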

For the studies reported in this section, the simplex method has been integrated with our FEM modal solver. A target objective function, to be minimised by the simplex method, is defined, and its value is evaluated with the FEM solver for each new structure. The objective function is defined as:

\[
\mathrm{Obj} = w\,|D - D_t| + (1 - w)\,|D_{\mathrm{slope}} - D_{\mathrm{slope}_t}| \tag{4.1}
\]

where $D_t$ and $D_{\mathrm{slope}_t}$ are the target values of dispersion and dispersion slope, respectively, while $w$ is a weight coefficient chosen in such a way as to balance out the relative contribution of the two terms on the right-hand side of the equation, and typically between 0.01 and 0.001. Keeping in mind the fibre fabrication tolerances, the algorithm was set to stop once the solution converged to within 1% of the optimum value of Λ and

In document Direct and inverse design of microstructured optical fibres (Page 117-120)