Optimization with Conjugate Gradients - Modeling, Reaction Schemes and Kinetic Parameter Estima

uniformly focuses on all modes of the catalytic converter operation, while, according to requirement 4, it is preferable to focus on regions of moderate conversion efficiency. To incorporate this feature into the above performance measure, we define the maximum possible error between computation and measurement as:

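$$e_{\max}(t_k) = \max\left\{\hat{E}(t_k),\; 1 - \hat{E}(t_k)\right\} \tag{4.15}$$

and then modify f as shown below. We do not need to change the definition of F.

$$f_2(t_k) = \frac{|e(t_k)|}{e_{\max}(t_k)}, \qquad F_2 \equiv \mu_{f_2} = \frac{1}{N}\sum_{k=0}^{N} f_2(t_k) \tag{4.16}$$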

When the measured conversion efficiency tends to its minimum or maximum value, i.e. when Ê → 0 or Ê → 1, then e_max → 1 and f_2 → f_1. When the measured conversion efficiency is moderate, i.e. when Ê → 0.5, then e_max → 0.5 and thus f_2 → 2f_1. Consequently, f_2/f_1 = 1/e_max increases hyperbolically from 1 to 2 as e_max decreases from 1 to 0.5 (Figure 4.2). Thus, for the same error value, f_2 increases for moderate catalytic activity, which is desired according to requirement 4.
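The weighting introduced by (4.15) and (4.16) can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, the error is assumed to be e(t_k) = E(t_k) − Ê(t_k) (computed minus measured efficiency, whose definition lies outside this passage), and the mean is taken over the number of samples supplied.

```python
def e_max(E_meas):
    """Largest possible error for a measured efficiency in [0, 1], eq. (4.15)."""
    return max(E_meas, 1.0 - E_meas)


def performance_measure(E_comp, E_meas):
    """Mean of f2 = |e| / e_max over all samples, in the spirit of eq. (4.16).

    Assumption: e(t_k) = E_comp(t_k) - E_meas(t_k).
    """
    f2 = [abs(c - m) / e_max(m) for c, m in zip(E_comp, E_meas)]
    return sum(f2) / len(f2)


# The same absolute error of 0.1 is penalized twice as hard
# at moderate efficiency (0.5) as at full conversion (1.0).
print(performance_measure([0.6], [0.5]))
print(performance_measure([0.9], [1.0]))
```

Since e_max never falls below 0.5, the division is always safe for efficiencies in [0, 1].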

Essentially, f_1 indicates the error, while f_2 indicates the error normalized to the maximum possible error. This makes f_2 more suitable for judging model accuracy when several measurements are available. Regardless of the measurement, f_2 may take any value in the range [0, 1]. If f_2 = 0, the computation is exact; if f_2 = 1, the computation is as far from the measurement as possible. In contrast, f_1 ∈ [0, e_max], which means that its maximum value depends on Ê. Thus, f_1 depends indirectly on the measurement, which is not preferred (requirement 3).

4.3.6 Performance measure definition for multi-response measurements

Catalytic converter tests usually involve measurements of multiple species concentrations (multiple responses). According to paragraph 4.3.2, in a typical three-way catalytic converter test, we have to exploit the measured responses of CO, HC and NO_x concentrations at the converter inlet and outlet.

In this case, the performance measure defined by eq. (4.16) is applied to each response, and the total performance measure F is then obtained as the average of the individual performance measures. When CO, HC and NO_x concentration responses are available, the performance measure becomes:

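$$F = \frac{F_{CO} + F_{HC} + F_{NO_x}}{3}, \tag{4.17}$$

where

$$F_i \equiv \mu_{f_i} = \frac{1}{N}\sum_{k=0}^{N} \frac{|e(t_k)|}{e_{\max}(t_k)}, \qquad i = \mathrm{CO},\ \mathrm{HC},\ \mathrm{NO}_x. \tag{4.18}$$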
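As a rough illustration of (4.17) and (4.18), the per-species measures can be averaged as follows. The dictionary layout, the function names and the sample efficiencies are all hypothetical, and the error is again assumed to be computed minus measured efficiency.

```python
def species_measure(E_comp, E_meas):
    """F_i of eq. (4.18) for one species (mean taken over the samples given)."""
    terms = [abs(c - m) / max(m, 1.0 - m) for c, m in zip(E_comp, E_meas)]
    return sum(terms) / len(terms)


def total_measure(computed, measured):
    """F of eq. (4.17): plain average of the per-species measures."""
    per_species = {s: species_measure(computed[s], measured[s]) for s in measured}
    return sum(per_species.values()) / len(per_species), per_species


# Made-up efficiencies, two time samples per species.
measured = {"CO": [0.5, 1.0], "HC": [0.5, 0.5], "NOx": [1.0, 0.0]}
computed = {"CO": [0.6, 0.9], "HC": [0.5, 0.5], "NOx": [0.0, 1.0]}
F, parts = total_measure(computed, measured)
print(F, parts)
```

Because every F_i lies in [0, 1], the average F does too, so values from different test sets remain comparable.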

4.4 Optimization with Conjugate Gradients

The conjugate gradient (CG) minimization algorithm is a conventional calculus-based method. It was initially a method for minimizing quadratic functions, which was later extended to arbitrary functions. It proceeds from point to point in the parameter space, using information from the neighbourhood of each point, gradually converging to the minimum.

The CG method works well only when the merit function is unimodal over the parameter space (that is, it has exactly one minimum). Given a starting point, the method moves ‘downhill’ until it reaches a minimum, but it has no way to tell whether this is a local or the global minimum. Hence the point of convergence depends on the starting point given to the algorithm and, when the function is multimodal, the method will most probably be trapped in a local minimum.
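The dependence on the starting point is easy to reproduce on a toy problem. The sketch below uses plain fixed-step gradient descent rather than CG, purely to keep it short, and a made-up one-dimensional function with two minima; none of it comes from the original work.

```python
def F(x):
    # Hypothetical bimodal merit function (two minima), for illustration only.
    return x**4 - 3.0 * x**2 + x


def dF(x):
    # Analytical derivative of the toy function above.
    return 4.0 * x**3 - 6.0 * x + 1.0


def descend(x, step=0.01, iters=5000):
    """Fixed-step gradient descent: it only ever moves downhill."""
    for _ in range(iters):
        x -= step * dF(x)
    return x


x_left = descend(-2.0)   # start on the left slope
x_right = descend(2.0)   # start on the right slope
print(x_left, F(x_left))
print(x_right, F(x_right))
```

The two runs settle in different minima (roughly x ≈ −1.30 and x ≈ 1.13), and only the left one is the global minimum; the algorithm itself has no information that would reveal this.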

Thus, CG is not the method of choice for multimodal functions. It is not the most efficient way to minimize unimodal functions either; quasi-Newton methods are generally expected to converge faster [6]. Nevertheless, we chose to start with this method for three reasons. First, at the beginning of this work, we had no way to guess whether the parameter space was unimodal or multimodal. In other words, we could not know a priori whether a calculus-based method would be sufficient or a global search method would be needed, so we took the simple step first. Second, this method was simple to implement and acceptably efficient. Finally, there was evidence in the literature [2] that it could give useful results, albeit in simple cases.

The CG method belongs to the category of Direction Set methods. In order to minimize a function F(θ) (which, in the context of the conjugate gradients method, is usually referred to as the merit function), all Direction Set methods start at a point θ in the n-dimensional space and proceed by (a) choosing a vector direction h, (b) minimizing along this direction and (c) choosing a new direction and repeating the procedure. The line minimization along the vector direction h is performed with an appropriate one-dimensional minimization routine.

Direction Set methods differ from each other only in the way they choose the next direction h_{i+1} to minimize along, after a line minimization along h_i is completed. The category of Direction Set methods includes the steepest descent method and the quasi-Newton methods. For a detailed treatment of all Direction Set methods (and calculus-based optimization in general) consult the textbook of Luenberger [6]; for a computer-oriented approach, consult the “Numerical Recipes” of Press et al. [7].

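In detail, the conjugate gradients method performs successive line minimizations along the directions h_i, which are set to be:

$$h_{i+1} = g_{i+1} + \gamma_i h_i, \tag{4.19}$$

where, with the gradient evaluated at the current point θ_i,

$$g_i = -\nabla F(\theta_i), \tag{4.20}$$

$$\gamma_i = \frac{g_{i+1} \cdot g_{i+1}}{g_i \cdot g_i}, \tag{4.21}$$

$$h_0 = g_0. \tag{4.22}$$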

It can be proved [7] that the vectors h_i (along which we perform the minimizations) satisfy the conjugacy condition:

$$h_i \cdot H \cdot h_j = 0, \qquad j < i, \tag{4.23}$$

where H is the Hessian matrix of the function F at point P:

$$[H]_{ij} = \left.\frac{\partial^2 F}{\partial\theta_i\,\partial\theta_j}\right|_{P}. \tag{4.24}$$

Minimizing along a direction h_{i+1} that is conjugate to the previous direction h_i means that the gradient remains perpendicular to h_i as we move along h_{i+1}. Thus, minimization along h_{i+1} will not spoil the previous minimization along h_i. If the merit function F is an exact quadratic form, n line minimizations (where n is the number of components of the vector θ) along the conjugate directions h_i will lead to the minimum. In practice, our merit function is not an exact quadratic form, so the cycle of n line minimizations must be repeated in order to converge to the minimum.
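The reason conjugacy protects the earlier minimization can be written out in three lines. This is a sketch under the assumption that F is well approximated by its second-order expansion around the current point θ, with Hessian H; the step length λ is a symbol introduced here for illustration only.

```latex
\begin{align*}
  % after the line minimization along h_i, the gradient has no component along h_i
  h_i \cdot \nabla F(\theta) &= 0 \\
  % a step of length \lambda along h_{i+1} changes the gradient by roughly \lambda H h_{i+1}
  \nabla F(\theta + \lambda h_{i+1}) &\approx \nabla F(\theta) + \lambda\, H \cdot h_{i+1} \\
  % combining the two and using the conjugacy condition (4.23)
  h_i \cdot \nabla F(\theta + \lambda h_{i+1}) &\approx \lambda\, h_i \cdot H \cdot h_{i+1} = 0
\end{align*}
```

For an exact quadratic form the approximations become equalities, which is why n conjugate line minimizations suffice in that case.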

In this work, we used the FORTRAN-77 implementation of the conjugate gradients algorithm that is given in the book of Press et al. [7]. The derivatives of the merit function were computed numerically, since analytical expressions for them are not obtainable. The algorithm was slightly modified according to Luenberger [6] to include algorithm restarts.
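The overall procedure can be sketched in Python as follows. This is not the FORTRAN-77 routine of [7]: it is a minimal illustration that follows (4.19)–(4.22), uses central differences for the gradient, substitutes a simple golden-section search for the one-dimensional routine of the original code, and restarts from the steepest-descent direction every n iterations. The restart period, tolerances and all function names are assumptions.

```python
import math


def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))


def neg_gradient(F, theta, step=1e-6):
    """g = -grad F at theta, estimated by central differences (eq. 4.20)."""
    g = []
    for i in range(len(theta)):
        up, down = list(theta), list(theta)
        up[i] += step
        down[i] -= step
        g.append(-(F(up) - F(down)) / (2.0 * step))
    return g


def line_minimize(F, theta, h, tol=1e-9):
    """Approximate the lambda >= 0 that minimizes F(theta + lambda * h)."""
    def phi(lam):
        return F([t + lam * hi for t, hi in zip(theta, h)])

    # Bracket the minimum: double the step until F stops decreasing.
    a, b = 0.0, 1e-3
    while phi(2.0 * b) < phi(b) and b < 1e6:
        b *= 2.0
    b *= 2.0
    # Golden-section search on [a, b].
    r = (math.sqrt(5.0) - 1.0) / 2.0
    c, e = b - r * (b - a), a + r * (b - a)
    while (b - a) > tol * (1.0 + b):
        if phi(c) < phi(e):
            b, e = e, c
            c = b - r * (b - a)
        else:
            a, c = c, e
            e = a + r * (b - a)
    return 0.5 * (a + b)


def conjugate_gradients(F, theta0, max_iter=200, gtol=1e-6):
    n = len(theta0)
    theta = list(theta0)
    g = neg_gradient(F, theta)
    h = list(g)                                   # h_0 = g_0, eq. (4.22)
    for i in range(max_iter):
        if math.sqrt(dot(g, g)) < gtol:           # gradient small enough: stop
            break
        lam = line_minimize(F, theta, h)
        theta = [t + lam * hi for t, hi in zip(theta, h)]
        g_new = neg_gradient(F, theta)
        if (i + 1) % n == 0:
            h = list(g_new)                       # restart (assumed every n steps)
        else:
            gamma = dot(g_new, g_new) / dot(g, g)                  # eq. (4.21)
            h = [gn + gamma * hi for gn, hi in zip(g_new, h)]      # eq. (4.19)
        g = g_new
    return theta


# Usage on a made-up quadratic with its minimum at (1, -2).
quadratic = lambda t: (t[0] - 1.0) ** 2 + 10.0 * (t[1] + 2.0) ** 2
print(conjugate_gradients(quadratic, [0.0, 0.0]))
```

The golden-section search evaluates the merit function many times per line minimization, which would be costly when each evaluation is a full converter simulation; a production code would use a more economical one-dimensional routine.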

4.4.1 Constraints

The tunable parameters of the model are not usually free to vary without limits. More or less complicated constraints are set in order to restrict their values to regions that make sense scientifically. In this work, very simple constraints were required: each parameter θ had to be restricted between a minimum and a maximum value (θmin and θmax respectively).

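To handle the constraints, we follow Bates and Watts [8] and apply a logistic transformation to the parameter θ, of the form:

$$\theta = \theta_{\min} + \frac{\theta_{\max} - \theta_{\min}}{1 + e^{-\phi}} \quad\text{or equivalently}\quad \phi = \ln\left(\frac{\theta - \theta_{\min}}{\theta_{\max} - \theta}\right). \tag{4.25}$$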

It may be noticed that ϕ ∈ (−∞, +∞) while θ ∈ (θ_min, θ_max), the bounds being approached only in the limits ϕ → −∞ and ϕ → +∞. Hence, we may now express the estimation of the parameter vector θ as the following problem of unconstrained function minimization:

$$\text{minimize } F(\phi), \qquad F : \mathbb{R}^n \to \mathbb{R}^1$$

where the elements of the transformed parameter vector ϕ are defined by the transformation (4.25), given the constraint vectors θ_min and θ_max. As the merit function F, we used the performance measure defined previously in (4.17).

It must be stressed that, using (4.25), we optimize with respect to the reparametrized parameter vector ϕ, which may take any real value. Thus we have transformed a constrained minimization problem into an unconstrained one. The conjugate gradients method is an unconstrained minimization technique. Therefore, it may only be applied to the transformed problem (4.4.1) and cannot be used directly to minimize with respect to θ.
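A minimal sketch of this reparametrization, assuming the form θ = θ_min + (θ_max − θ_min)/(1 + e^(−ϕ)) that inverts the logarithmic expression in (4.25). The function names are hypothetical, and the logistic term is evaluated in a branch-wise form only to avoid floating-point overflow for large |ϕ|.

```python
import math


def to_theta(phi, theta_min, theta_max):
    """Map an unconstrained phi to theta between theta_min and theta_max."""
    if phi >= 0.0:
        s = 1.0 / (1.0 + math.exp(-phi))
    else:
        z = math.exp(phi)          # same logistic value, safe for very negative phi
        s = z / (1.0 + z)
    return theta_min + (theta_max - theta_min) * s


def to_phi(theta, theta_min, theta_max):
    """Inverse map; theta must lie strictly between the bounds."""
    return math.log((theta - theta_min) / (theta_max - theta))


def unconstrained(F_theta, theta_min, theta_max):
    """Wrap a merit function of theta as a function of the vector phi."""
    def F_phi(phi):
        theta = [to_theta(p, lo, hi)
                 for p, lo, hi in zip(phi, theta_min, theta_max)]
        return F_theta(theta)
    return F_phi


# Made-up bounds: phi = 0 corresponds to the midpoint of the interval.
print(to_theta(0.0, 1.0, 5.0))
print(to_phi(3.0, 1.0, 5.0))
```

Any unconstrained minimizer can then be run on the wrapped function, with the result mapped back through `to_theta`. A starting guess placed exactly on a bound has no finite ϕ, so it would need to be moved slightly inside the interval first.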

The CG method was initially tested by tuning the diesel oxidation catalyst (DOC) model. The DOC model was chosen as a first step because it involves fewer tunable parameters and is in general simpler than its three-way catalytic converter (3WCC) counterpart. Indeed, the results were encouraging [9]. When we proceeded to the 3WCC kinetics tuning, however, the method failed, because the parameter space was multimodal and the CG algorithm was trapped in local minima.

In order to tune the 3WCC model efficiently, it was necessary to ‘guide’ the CG algorithm by manually fixing some parameters according to our experience and personal judgement of the problem at hand. Only then was the outcome successful [10] and some important conclusions could be drawn:

1. The semi-automatic procedure resulted in far more accurate tuning than the completely manual one.

2. It became clear that the potential of the tunable-parameter model was greater than we had expected.

3. It seemed that even greater accuracy could be attained with the aid of a global-search procedure, which would also require less manual intervention.

The above points motivated the implementation of a global search method for the optimization procedure, specifically a Genetic Algorithm. The main concepts of the Genetic Algorithm are summarized in the next section.

In document Modeling, Reaction Schemes and Kinetic Parameter Estimation in Automotive Catalytic Converters and Diesel Particulate Filters (Page 121-124)