Environmental Models
2.3 GRADIENT-FREE OPTIMIZATION METHODS (DIRECT SEARCH)
2.3.3 Genetic Algorithm
2.3.3.3 Similarity Templates
As the population evolves under the above operations, some gene patterns may prove more successful than others, and should therefore be preserved, or even privileged, in their propagation. These patterns are called similarity templates or schemata, and are defined, for example, as
$$H = [1\ 0\ 1\ \ast\ 1\ \ast\ 1\ \ast] \tag{2.49}$$

where the symbol ‘*’ indicates a ‘don’t care’ gene, meaning that it may take any value, whereas the other genes are to be preserved, and therefore should not be involved in crossover and mutation. The length of a schema is computed as the difference between the positions of the last and the first fixed gene. Hence, in the example of Equation 2.49, the length is $\delta(H) = 7 - 1 = 6$. Longer schemata are more difficult to preserve because they are more likely to be disrupted by crossover. On the other hand, the length is a measure of the schema importance.

Goldberg (1989) and Holland (1992) have demonstrated how schemata evolve through successive generations. Let $m(H, k)$ be the number of chromosomes with a given template $H$ at the $k$th generation. Then, its number at the next generation will be
$$m(H, k+1) = m(H, k)\,\frac{f(H)}{\bar{f}} \quad \text{where} \quad \bar{f} = \frac{1}{N}\sum_{i=1}^{N} f_i \tag{2.50}$$

and $\bar{f}$ is the mean fitness of the population at the $k$th generation, while $f(H)$ is the fitness of the schema $H$. Equation 2.50 shows that schemata with fitness above the average will expand in the next generation, whereas schemata with fitness below the average will produce fewer and fewer offspring in successive generations. The propagation Equation 2.50 can be rewritten, considering a schema $H$ that is above the average by an amount $c\,\bar{f}$, as
$$m(H, k+1) = m(H, k)\,\frac{\bar{f} + c\,\bar{f}}{\bar{f}} = (1 + c)\,m(H, k) \tag{2.51}$$

Considering the evolution of the population from the start ($k = 0$) yields

$$m(H, k) = m(H, 0)\,(1 + c)^k \tag{2.52}$$

which means that above-average schemata ($c > 0$) increase exponentially over the generations, whereas underperformers ($c < 0$) decay exponentially. The problem with long schemata is their possible breakdown due to crossover, thus losing the distinctive features embedded in their pattern. For this reason, attention is focused on short templates, or building blocks, which are less likely to be disrupted by crossover. The previous fundamental theorem can be viewed in terms of building blocks, meaning that a schema is a collection of many short similarity templates that propagate independently and exponentially over generations, depending on their fitness.
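The definitions of Equations 2.49 to 2.52 can be illustrated with a minimal Python sketch. The function names, the initial count of 10 chromosomes and the values c = ±0.2 are illustrative assumptions, not part of the original text.

```python
def defining_length(schema):
    """Distance between the last and the first fixed (non-'*') gene."""
    fixed = [i for i, gene in enumerate(schema) if gene != "*"]
    return fixed[-1] - fixed[0] if fixed else 0


def matches(chromosome, schema):
    """True if the chromosome agrees with the schema on every fixed gene."""
    return all(s == "*" or s == c for c, s in zip(chromosome, schema))


def expected_count(m0, c, k):
    """Equation 2.52: m(H, k) = m(H, 0) * (1 + c)**k."""
    return m0 * (1.0 + c) ** k


H = "101*1*1*"                     # schema of Equation 2.49
print(defining_length(H))          # 6, as in the text
print(matches("10101011", H))      # True: all fixed genes agree
print(matches("00101011", H))      # False: the first gene differs

# Hypothetical values: 10 matching chromosomes, 20% above or below average
for k in range(5):
    print(k, expected_count(10, 0.2, k), expected_count(10, -0.2, k))
```

The loop prints the geometric growth of an above-average schema next to the geometric decay of a below-average one, under selection alone.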
A schema may be disrupted by crossover, and its robustness depends on its length, with long schemata being more likely to succumb to crossover. The probability that a schema of length $\delta$ in a chromosome of $l$ genes is cut across is given by

$$p_\delta = \frac{\delta(H)}{l - 1} \tag{2.53}$$
so its survival will be

$$p_s = 1 - p_\delta \tag{2.54}$$

If a random crossover occurs inside the schema with probability $p_c$, then the survival probability becomes

$$p_s = 1 - p_c\,p_\delta \tag{2.55}$$

Assuming that selection and crossover are independent random processes, the probability that a schema survives in the next generation is given by the combination of the two terms (2.50) and (2.55). If the effect of mutation is also added, assuming that it can change only one gene at any one time with probability $p_m \ll 1$, the global dynamics of the schema is given by

$$m(H, k+1) \ge m(H, k)\,\frac{f(H)}{\bar{f}}\left[1 - p_c\,p_\delta - o(H)\,p_m\right] \tag{2.56}$$

where $o(H)$ is the order of the schema, that is, the number of its fixed genes.
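As a rough check of the disruption argument, the following Python sketch counts the single-point cut sites that fall inside the span of a schema, and combines the result with selection and mutation. The combined bound is written in the standard form given by Goldberg (1989), which is assumed here. The function names and the numerical values (10 chromosomes, fitness ratio 1.2, crossover probability 0.8, mutation probability 0.01) are hypothetical.

```python
def disruption_probability(schema):
    """Fraction of single-point cut sites that split the schema.

    A chromosome of l genes has l - 1 cut sites (between adjacent genes);
    a cut splits the schema when it lies between its first and last fixed gene.
    """
    l = len(schema)
    fixed = [i for i, gene in enumerate(schema) if gene != "*"]
    first, last = fixed[0], fixed[-1]
    # cut site j (0-based) lies between gene j and gene j + 1
    inside = sum(1 for j in range(l - 1) if first <= j < last)
    return inside / (l - 1)


def schema_bound(m, fitness_ratio, schema, pc, pm):
    """Assumed standard bound: m * f(H)/f_mean * [1 - pc*p_delta - o(H)*pm]."""
    order = sum(1 for gene in schema if gene != "*")   # number of fixed genes
    survival = 1.0 - pc * disruption_probability(schema) - order * pm
    return m * fitness_ratio * survival


long_h, short_h = "101*1*1*", "11******"
print(disruption_probability(long_h))    # 6 of the 7 cut sites are inside the span
print(disruption_probability(short_h))   # 1 of the 7 cut sites is inside the span
print(schema_bound(10, 1.2, long_h, pc=0.8, pm=0.01))
print(schema_bound(10, 1.2, short_h, pc=0.8, pm=0.01))
```

With the same fitness advantage, the short schema keeps a much larger expected count than the long one, which is the building-block argument in numerical form.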
The MATLAB Optimization Toolbox provides an implementation of the GA with the solver named ga.
Its syntax is similar to the syntax of the other optimization functions implemented in the toolbox, and its complete form is
[P,fval,exitflag] = ga(errorfcn,np,A,b,Aeq,beq,LB,UB,@nonlcon,options)    (2.57)

The solver ga returns the np-dimensional solution P, together with the value fval of the error functional E(P) at the solution P, and the flag exitflag, which indicates the exit conditions. As to the input parameters, the first one is the error functional, defined as either a MATLAB function file or an anonymous function, using the function handle ‘@’. In the latter case, please refer to the MATLAB help for its definition. The second input np is the dimension of the parameter vector P. If an unconstrained optimization is performed, all the other input parameters should be set as empty quantities ‘[]’. In the case of constrained optimization, the parameters A and b define the linear inequalities A × P ≤ b, whereas Aeq and beq define the linear equalities Aeq × P = beq, with the vectors LB and UB representing respectively the lower and upper bounds for the solution, LB ≤ P ≤ UB, and @nonlcon being the handle of a function that evaluates the nonlinear constraints.
The last input, options, is a structure that includes all the solver settings, such as the maximum number of generations (‘Generations’) or the maximum duration of the optimization process (‘TimeLimit’). The creation of the initial population can also be defined here (CreationFcn), together with the size and the type of the population (PopulationSize and PopulationType), the selection of the fittest individuals (SelectionFcn) and the method used for the mutations (MutationFcn).
The CreationFcn option generates the initial population. Setting it to ‘uniform’ creates an initial random population with uniform distribution, whereas ‘feasible population’ defines an initial well-distributed population that satisfies all the constraints. Choosing ‘custom’, a customized generation method can be defined with the data type indicated in PopulationType.
The PopulationType option sets the data type of P, which can be ‘doubleVector’ for real variables, ‘bitString’ if they are bit strings, or ‘custom’ if they are user defined, in which case the user must specify the creation, mutation and crossover functions. If the PopulationType option is set to ‘bitString’ or ‘custom’, linear and nonlinear constraints cannot be specified.
The PopulationSize option defines the number of chromosomes in the population. A larger population explores the domain more thoroughly and locates the solution more accurately, but slows down the computation. For each option, the toolbox offers a set of pre-defined methods, but it is also possible to include user-defined methods. All these specifications can be created with the gaoptimset function prior to the actual optimization. More details of the ga solver can be found in the MATLAB Help.
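To make the role of these settings concrete, here is a minimal real-coded GA written in plain Python. It is not the MATLAB ga solver and does not reproduce its algorithms. The function run_ga, its arguments (population_size, generations, mutation_rate, seed) and the test functional are all hypothetical, chosen only to mirror the population size, generation limit and bound constraints discussed above.

```python
import random


def run_ga(error_fn, lb, ub, population_size=30, generations=40,
           mutation_rate=0.2, seed=0):
    """Minimal GA sketch: tournament selection, blend crossover,
    Gaussian mutation, bounds enforced by clipping, one elite kept."""
    rng = random.Random(seed)
    n = len(lb)

    def clip(p):
        return [min(max(v, lo), hi) for v, lo, hi in zip(p, lb, ub)]

    def tournament(pop, scores):
        i, j = rng.randrange(len(pop)), rng.randrange(len(pop))
        return pop[i] if scores[i] <= scores[j] else pop[j]

    # uniform random initial population inside the bounds
    pop = [[rng.uniform(lo, hi) for lo, hi in zip(lb, ub)]
           for _ in range(population_size)]
    history = []                                   # best error per generation
    for _ in range(generations):
        scores = [error_fn(p) for p in pop]
        best = min(range(population_size), key=scores.__getitem__)
        history.append(scores[best])
        children = [pop[best][:]]                  # elitism: keep the best as is
        while len(children) < population_size:
            a, b = tournament(pop, scores), tournament(pop, scores)
            w = rng.random()
            child = [w * x + (1.0 - w) * y for x, y in zip(a, b)]
            for g in range(n):
                if rng.random() < mutation_rate:
                    child[g] += rng.gauss(0.0, 0.1 * (ub[g] - lb[g]))
            children.append(clip(child))
        pop = children
    scores = [error_fn(p) for p in pop]
    best = min(range(population_size), key=scores.__getitem__)
    history.append(scores[best])
    return pop[best], scores[best], history


def sphere(p):
    """Hypothetical error functional with its minimum at (1, -2)."""
    return (p[0] - 1.0) ** 2 + (p[1] + 2.0) ** 2


P, fval, history = run_ga(sphere, lb=[-5.0, -5.0], ub=[5.0, 5.0])
print(P, fval)
```

Because one elite chromosome is copied unchanged into each new generation, the best error in history can never increase. Raising population_size widens the exploration at the price of more evaluations of the error functional per generation.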
2.3.4 Performance Assessment of the Combined Estimation Methods Applied to the Monod Kinetics
The combined performance of the above search algorithms is now tested on the Monod kinetics, assumed as our benchmark. Although we have already studied the shape of the error functional and ruled out the existence of local minima, at least in the region of interest, starting the simplex search from a good point would still expedite the procedure. A good test for any estimation algorithm is to generate data by simulating the model with known parameters and to use them as experimental data. If the algorithm is correct, the estimated parameters should coincide, within inevitable round-off errors, with the original parameters used to generate the data.
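The synthetic-data test can be sketched in Python as follows. The Monod model is assumed here in its basic batch form (substrate S consumed by biomass X with a constant yield), integrated with a fixed-step fourth-order Runge-Kutta scheme. The parameter values, initial conditions, yield, time grid and the 1/N scaling of the error functional are hypothetical and are not taken from the text.

```python
def monod_rhs(s, x, mu_max, ks, y):
    """Assumed batch Monod kinetics: returns (dS/dt, dX/dt)."""
    mu = mu_max * s / (ks + s)
    return -mu * x / y, mu * x


def simulate(params, s0=10.0, x0=1.0, y=0.5, t_end=100.0, n_steps=200):
    """Integrate the model with fixed-step RK4; returns a list of (S, X)."""
    mu_max, ks = params
    h = t_end / n_steps
    s, x = s0, x0
    out = [(s, x)]
    for _ in range(n_steps):
        ds1, dx1 = monod_rhs(s, x, mu_max, ks, y)
        ds2, dx2 = monod_rhs(s + 0.5 * h * ds1, x + 0.5 * h * dx1, mu_max, ks, y)
        ds3, dx3 = monod_rhs(s + 0.5 * h * ds2, x + 0.5 * h * dx2, mu_max, ks, y)
        ds4, dx4 = monod_rhs(s + h * ds3, x + h * dx3, mu_max, ks, y)
        s += h * (ds1 + 2.0 * ds2 + 2.0 * ds3 + ds4) / 6.0
        x += h * (dx1 + 2.0 * dx2 + 2.0 * dx3 + dx4) / 6.0
        out.append((s, x))
    return out


def error_functional(params, data):
    """Mean of the squared substrate and biomass errors (assumed 1/N scaling)."""
    model = simulate(params)
    return sum((s - se) ** 2 + (x - xe) ** 2
               for (s, x), (se, xe) in zip(model, data)) / len(data)


TRUE_PARAMS = (0.5, 30.0)                 # hypothetical (mu_max, Ks)
data = simulate(TRUE_PARAMS)              # noise-free synthetic "experiment"
print(error_functional(TRUE_PARAMS, data))     # zero at the generating parameters
print(error_functional((0.4, 25.0), data))     # positive away from them
```

Any estimation algorithm applied to this data set should drive the error functional towards zero and return parameters close to TRUE_PARAMS.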
We start the estimation by generating a population of parameters, and use a GA to explore the domain. After a certain number of generations, the GA will indicate the location of the minimum with a precision depending on the size of the population (N) and the number of generations. At this stage, we do not need to set very stringent requirements, as the main objective is to avoid pitfalls into which the simplex may fall, such as local minima. Consequently, when the GA obtains a rough estimate of the minimum, we can use this to initialize the simplex, which will provide more precision.
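The two-phase strategy can be sketched in Python under simplifying assumptions. The first phase below is a crude evolutionary stage (keep the better half, refill by Gaussian mutation), used as a stand-in for the GA. The second phase is a basic Nelder-Mead simplex without the optimized expansion of Section 2.3.2.1. The elongated-valley functional, the bounds and all the names are hypothetical.

```python
import random


def coarse_search(f, lb, ub, size=30, generations=20, seed=0):
    """Stand-in for the GA phase: selection of the better half plus mutation."""
    rng = random.Random(seed)
    pop = [[rng.uniform(lo, hi) for lo, hi in zip(lb, ub)] for _ in range(size)]
    for _ in range(generations):
        pop.sort(key=f)
        parents = pop[: size // 2]
        children = [[min(max(v + rng.gauss(0.0, 0.05 * (hi - lo)), lo), hi)
                     for v, lo, hi in zip(p, lb, ub)] for p in parents]
        pop = parents + children
    return min(pop, key=f)


def nelder_mead(f, start, step=0.1, tol=1e-12, max_iter=500):
    """Basic simplex search: reflection, expansion, contraction, reduction."""
    n = len(start)
    simplex = [list(start)]
    for i in range(n):
        vertex = list(start)
        vertex[i] += step
        simplex.append(vertex)
    for _ in range(max_iter):
        simplex.sort(key=f)
        best, worst = simplex[0], simplex[-1]
        if abs(f(worst) - f(best)) < tol:
            break
        centroid = [sum(v[i] for v in simplex[:-1]) / n for i in range(n)]
        reflect = [c + (c - w) for c, w in zip(centroid, worst)]
        if f(reflect) < f(best):
            expand = [c + 2.0 * (c - w) for c, w in zip(centroid, worst)]
            simplex[-1] = expand if f(expand) < f(reflect) else reflect
        elif f(reflect) < f(simplex[-2]):
            simplex[-1] = reflect
        else:
            contract = [c + 0.5 * (w - c) for c, w in zip(centroid, worst)]
            if f(contract) < f(worst):
                simplex[-1] = contract
            else:   # reduction: shrink every vertex towards the best one
                simplex = [best] + [[b + 0.5 * (v - b) for b, v in zip(best, vert)]
                                    for vert in simplex[1:]]
    return min(simplex, key=f)


def valley(p):
    """Hypothetical error functional with a long, flat valley along Ks."""
    return 100.0 * (p[0] - 0.5) ** 2 + 0.01 * (p[1] - 30.0) ** 2


rough = coarse_search(valley, lb=[0.2, 15.0], ub=[1.2, 45.0])
refined = nelder_mead(valley, rough)
print(rough, valley(rough))
print(refined, valley(refined))
```

The first phase only needs to land in the right region; the simplex, which never replaces its best vertex with a worse point, then refines that estimate.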
Figure 2.25 shows the two phases of the estimation of the two main (and most troublesome) parameters in the Monod kinetics. In the first phase, illustrated in Figure 2.25a, a GA with a population of 30 chromosomes selects the region around the minimum, and its best chromosome after 20 iterations is (μmax = 0.4587, Ks = 26.8417). In Figure 2.25b, the simplex algorithm, initialized with the GA result, reaches the exact parameter values. Notice that the search is predominantly concentrated along the valley bottom.
[Figure 2.25, panels (a) and (b). Plot labels: initial population, final population, best GA result, search along the valley bottom, estimated minimum.]
FIGURE 2.25 Estimation of the parameter couple (μmax,Ks). In (a), the GA explores the domain with a population of 30 chromosomes. This preliminary phase ends after the prescribed number of generations. At that time most of the points (solid dots) are closer to the minimum than the initial population (hollow circles).
In (b), the best chromosome is selected as the starter for the simplex, which reaches the true parameters by exploring predominantly along the valley bottom.
A full estimation (parameters plus initial conditions), shown in Figure 2.26, requires considerably more iterations, and the decrease of the error functional (b) is marked by several plateaus, where the algorithm seems to make little progress. These are the phases when a new search direction is tested by the optimized expansion sub-algorithm of Section 2.3.2.1.
In the case of noisy data, shown in Figure 2.27a, the error functional is raised by an amount equal to the total measurement variance, provided that there are no modelling errors, as shown in Figure 2.21. As Figure 2.27b shows, the minimum of the error functional (Emin) equals the total noise variance on the data, while the functional retains its smoothness thanks to its averaging properties. Thus, $E_{min} = \sigma_S^2 + \sigma_X^2$.
However, numerical inaccuracies may creep into the computation of the error functional, whose surface may become ‘ripply’, as shown in Figure 2.28. The simplex may be trapped inside its many tiny local minima, if its size is too small as a result of many consecutive contractions and/or reductions.
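The noise offset described for Figure 2.27 can be checked numerically with a short Python sketch. At the true parameters, and with no model error, each residual equals the measurement noise, so a functional defined as the mean of the squared errors reduces to the sample variance of the noise. The 1/N scaling, the Gaussian noise model, the sample size and the function name are assumptions. The two variances are the ones quoted in Figure 2.27.

```python
import random


def noise_floor(var_s, var_x, n=20000, seed=1):
    """Error functional at the true parameters: the mean of the squared noise.

    With no model error, model minus data equals minus the noise, so the
    model itself cancels out and only the noise samples are needed.
    """
    rng = random.Random(seed)
    noise_s = [rng.gauss(0.0, var_s ** 0.5) for _ in range(n)]
    noise_x = [rng.gauss(0.0, var_x ** 0.5) for _ in range(n)]
    return sum(a * a + b * b for a, b in zip(noise_s, noise_x)) / n


var_s, var_x = 1.8236, 1.9272             # variances quoted in Figure 2.27
print(noise_floor(var_s, var_x), var_s + var_x)
```

The two printed numbers agree up to the sampling error of the finite noise sequence, which shrinks as the number of samples grows.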
Figure 2.29b shows the more elaborate search pattern of the simplex in finding the minimum. The exploration ends near the true parameters in spite of the noise affecting the data (Figure 2.29c), which is more evident in the inset (d).
[Figure 2.26, panel (b): error functional E, built from the squared errors $(S_k - S_k^{exp})^2 + (X_k - X_k^{exp})^2$ summed over $k = 1, \ldots, N$, versus iterations. Simplex search summary: 110 reflections, 59 expansions, 27 optimized expansions, 874 contractions, 0 reductions.]
FIGURE 2.26 Full estimation of the Monod kinetics, including the initial conditions, with noise-free data (a). The decrease of the error functional (b) is uneven and very slow in the final part. In (c), the search progress is shown in the (µmax, Ks) subspace, terminating almost at the true values (hollow diamond). Notice the privileged search direction along the valley bottom.
2.3.5 MATLAB Software Organization
We now turn our attention to the software implementation of the estimation algorithm. Figure 2.30 shows how the various modules are organized. In addition to the main script, there are two functions (simplex and error functions) and the Simulink® model. The main script specifies the nature of the problem (experimental data, model definition, simplex options, etc.). It also defines the global variables that will be made visible to the error function, and then passed on to the Simulink model.
In fact, global variables, defined in a script, are normally available to the called Simulink model
[Figure 2.27, panel (a): concentration (mg/l) versus time (h) for substrate, biomass, noisy substrate and noisy biomass, with σ²S = 1.8236 and σ²X = 1.9272. Panel (b): error functional E over the (μmax, Ks) plane, with Emin = σ²S + σ²X = 3.7350.]
FIGURE 2.27 When noisy measurements are used, as in (a), the minimum (Emin) of the error functional E equals the total measurement variance, as shown in (b), provided that there is no inherent model error. Due to the averaging property of E, the functional retains its smoothness, regardless of the noisy measurements. This result is in agreement with Figures 2.12 and 2.22b.
[Figure 2.28: error functional E over the (μmax, Ks) plane.]
FIGURE 2.28 Numerical inaccuracies in the computation of the error functional may produce a ripply surface with many small local minima, in which the search algorithm can be trapped.
through the common memory area called Workspace, but in this case the model is called by a function, which does not have the prerogative to publish its internal variables to the Workspace.
Hence, the variables created in the error function would not be visible to the Simulink model, and thus they need to be defined as global. All the MATLAB modules involved (main script and functions) should have the same global declaration, with the same variable names, before any executable statement.
As Figure 2.30 shows, the main script loads the experimental data, defines the model and the simplex options, and then invokes the simplex function that supervises the optimization and releases control back to the main script after the search is terminated, because either the minimum has been reached (within the prescribed precision) or the maximum number of iterations has been exceeded.
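The control flow just described can be mimicked in Python as a rough analogue, not as a translation of the MATLAB modules. A compass search stands in for the simplex function, a straight-line fit stands in for the Simulink model, and a closure over the data plays the role that the global declarations play in MATLAB. All names, data values and option values are hypothetical.

```python
def compass_search(error_fn, start, step=1.0, tol=1e-6, max_iter=1000):
    """Stand-in for the simplex function: returns (point, value, exitflag)."""
    point, value = list(start), error_fn(start)
    for _ in range(max_iter):
        if step < tol:
            return point, value, 1          # prescribed precision reached
        improved = False
        for i in range(len(point)):
            for delta in (step, -step):
                trial = point[:]
                trial[i] += delta
                trial_value = error_fn(trial)
                if trial_value < value:
                    point, value, improved = trial, trial_value, True
        if not improved:
            step *= 0.5                     # no better neighbour: refine the mesh
    return point, value, 0                  # maximum number of iterations exceeded


def main():
    # "experimental" data: hypothetical (t, y) pairs
    data = [(0.0, 1.0), (1.0, 3.1), (2.0, 4.9), (3.0, 7.2)]

    def error_fn(p):
        # the closure sees `data` directly, so no global declaration is needed
        a, b = p
        return sum((a + b * t - y) ** 2 for t, y in data)

    options = {"start": [0.0, 0.0], "tol": 1e-6, "max_iter": 1000}
    estimate, error, exitflag = compass_search(
        error_fn, options["start"], tol=options["tol"],
        max_iter=options["max_iter"])
    # downstream processing: display the results
    print("estimate:", estimate, "error:", error, "exitflag:", exitflag)
    return estimate, error, exitflag, error_fn(options["start"])


estimate, error, exitflag, initial_error = main()
```

The search routine regains nothing from the caller except the error function handle, and it hands control back with a flag that tells the main routine which of the two termination conditions occurred.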
There may be a downstream processing section, where results are displayed and the estimates are validated, organized as shown in Box 2.1.