3.5 Solving the GFEM problem
3.5.1 Review of previous methods and advantages of employing modern minimization techniques
Following the ‘chemical potential’ approach, many numerical methods have been proposed to solve the Gibbs free energy minimization problem. All are based on standard methods of computational mathematics. For example, those by Storey and van Zeggeren (1964), Eriksson (1975), Saxena (1982), Wood and Holloway (1984) and Bina and Wood (1987) make use of the steepest descent type of algorithm, while those by Sundman et al. (1985), Ghiorso (1985), De Capitani and Brown (1987), Harvie et al. (1987) and Ghiorso and Sack (1995) make use of quadratic approximation methods (e.g. conjugate gradients).
Many of these GFEM algorithms have been developed with particular applications in mind, and have been shown to perform well in those cases. However, in some cases the existing algorithms are not well suited to the particular problems to be solved. For example, even though most make use of Lagrange multiplier techniques to impose linear equality constraints, they often resort to quite ‘ad hoc’ methods for imposing non-negativity constraints, usually by introducing non-linear (exponential) transformations of variables (e.g. Storey and van Zeggeren, 1964; Wood and Holloway, 1984). This can create numerical problems when solution values approach zero (for a discussion see Greenberg, Weare and Harvie, 1985). Other algorithms, like that of De Capitani and Brown (1987), transform their particular GFEM problem into a coupled set of linear and non-linear programming steps.
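The difficulty with exponential substitutions near zero can be illustrated with a one-variable toy problem. The sketch below is not taken from any of the studies cited above: the objective f(n) = (n + 1)^2, the step size and the function names are all assumptions chosen for illustration. The constrained minimum on n >= 0 lies at n = 0, but after the substitution n = exp(y) the gradient with respect to y is f'(n) n, which vanishes as n approaches zero, so a descent method in y stalls without ever reaching the bound.

```python
import math

def f_prime(n):
    # Derivative of the toy objective f(n) = (n + 1)**2 (an assumption for
    # illustration); on n >= 0 its minimum is at the bound n = 0.
    return 2.0 * (n + 1.0)

def descend_in_y(y0, step, iterations):
    """Plain gradient descent in the transformed variable y, with n = exp(y)."""
    y = y0
    history = []
    for _ in range(iterations):
        n = math.exp(y)
        # Chain rule: df/dy = f'(n) * dn/dy = f'(n) * n
        grad_y = f_prime(n) * n
        history.append((n, grad_y))
        y -= step * grad_y
    return history

history = descend_in_y(y0=0.0, step=0.1, iterations=1000)
n_first, grad_first = history[0]
n_last, grad_last = history[-1]
print("n after first and last iteration:", n_first, n_last)
print("gradient in y, first and last:", grad_first, grad_last)
```

In this sketch n decreases monotonically but stays strictly positive, and the gradient in y shrinks with it, which is the kind of behaviour that motivates handling non-negativity as an explicit constraint instead.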
Numerical techniques for constrained optimization constitute an active field of research, in which major advances have occurred in the past twenty years. The numerical methods adopted in former studies show that the development of Gibbs free energy algorithms has, to a large extent, taken place independently of the advances in constrained optimization. Hence, it appears that Earth scientists have not been able to take advantage of modern optimization methods developed in the computational sciences. Modern methods can, in fact, tackle the general minimization problem and avoid any transformation of variables. They can also find initial guess solutions that satisfy the constraints (a problem which has itself been the subject of independent study by Earth scientists; Asimov and Ghiorso, 1998). Modern methods also converge rapidly, are usually quite robust to numerical instabilities, and can handle more complex non-linear equality and inequality constraints, which may be useful if one had reason to impose such constraints on a solution.
3.5.2 Solution by a Feasible Iterate Sequential Quadratic programming (FSQP) algorithm
The GFEM problem falls within the more general class of constrained optimization problems usually called non-linear programming (see Gill et al., 1981, for a discussion). The problem is non-linear because in general, for every phase φ, the molar Gibbs free energy Gφ depends non-linearly on the site occupancies Xikφ. In addition, as shown in sections 3.3 and 3.4, in a general case the minimization of Gsystem has to be performed under several different types of constraints, both linear and non-linear.
Given the complexity of the general problem, the minimization program ‘Gib’, written for this study, has been built using the Feasible Iterate Sequential Quadratic programming (FSQP) algorithm of Panier and Tits (1993) and Zhou, Tits and Lawrence (1998). The FSQP method is a modern optimization technique that can handle minimization/optimization problems in general form, which means including all forms of constraints (i.e. general linear and non-linear inequalities, as well as non-linear equality constraints). It can also deal with zero, one or multiple objective functions simultaneously. In the case where no objective function is supplied (i.e. the Gibbs function given by Eq. 3.3 is absent) the task becomes one of finding a single ‘feasible’ solution, which is one of many that merely satisfy the constraints (i.e. Eqs. 3.12-3.16, in a general case). The Gibbs free energy problem is then to find a feasible solution that is also a (global) minimum of the non-linear Gibbs function in Eq. (3.3).
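For orientation, the ‘general form’ referred to here can be written in the generic notation common in the optimization literature. The symbols below (f, g_i, h_j, l, u and the vector x) are assumptions introduced for illustration and are not the notation of Eqs. 3.3 and 3.12-3.16; in the GFEM setting f would play the role of the Gibbs function and x would collect the site occupancies and the number of moles of each phase.

```latex
\begin{aligned}
\min_{x \in \mathbb{R}^n} \quad & f(x) \\
\text{subject to} \quad & g_i(x) \le 0, \quad i = 1, \dots, m, \\
& h_j(x) = 0, \quad j = 1, \dots, p, \\
& l \le x \le u .
\end{aligned}
```

With no objective function supplied, the task reduces to finding any x that satisfies the three constraint lines, which is the feasibility problem described above.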
The complete FSQP algorithm makes use of a series of techniques described and analyzed in Panier and Tits (1993), Bonnans et al. (1992), Zhou and Tits (1993), and Schittkowski (1986). To solve the minimization problem, the user must supply an initial guess solution for the site occupancies and number of moles per phase. If these values do not satisfy all the constraints, then they are used as the starting point for a ‘step 1’ optimization problem where the objective is simply to satisfy the constraints. This is accomplished by solving a series of sub-minimization problems, where at each stage the objective function is built from the constraint which is most violated. The solution from step 1 is then a feasible solution used as the starting point for the constrained minimization problem.
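The idea of driving an infeasible guess towards feasibility by repeatedly attacking the most violated constraint can be sketched for the simple case of linear constraints. The code below is not the FSQP sub-problem itself: it swaps in a plain projection scheme, in which the current point is projected onto the boundary of whichever constraint is most violated. The two-phase system, the constraint list and all names are hypothetical, chosen only so that the example is self-contained.

```python
import math

# Each constraint is (a, b), meaning a[0]*x[0] + a[1]*x[1] <= b.
# Hypothetical two-phase system with mole numbers n1, n2 (illustration only).
CONSTRAINTS = [
    ((-1.0, 0.0), 0.0),    # n1 >= 0
    ((0.0, -1.0), 0.0),    # n2 >= 0
    ((1.0, 1.0), 1.0),     # n1 + n2 <= 1  (with the next line: n1 + n2 = 1)
    ((-1.0, -1.0), -1.0),  # n1 + n2 >= 1
]

def violation(constraint, x):
    """Signed distance from x to the constraint boundary (positive = violated)."""
    a, b = constraint
    residual = sum(ai * xi for ai, xi in zip(a, x)) - b
    return residual / math.sqrt(sum(ai * ai for ai in a))

def find_feasible(x, tol=1e-9, max_iter=1000):
    """Project repeatedly onto the most violated constraint until all hold within tol."""
    x = list(x)
    for iteration in range(max_iter):
        worst = max(CONSTRAINTS, key=lambda c: violation(c, x))
        v = violation(worst, x)
        if v <= tol:
            return x, iteration
        a, _ = worst
        norm = math.sqrt(sum(ai * ai for ai in a))
        # Move a distance v along -a/|a|: orthogonal projection onto the boundary.
        x = [xi - v * ai / norm for xi, ai in zip(x, a)]
    raise RuntimeError("no feasible point found within max_iter")

x_feasible, steps = find_feasible([-0.5, 2.0])
print("feasible point:", x_feasible, "after", steps, "projections")
```

Starting from an infeasible guess, the iterates alternate between the non-negativity and mass-balance constraints and end at a point that satisfies all of them to within the tolerance; such a point could then serve as the starting point for the constrained minimization proper.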
A highly flexible (FORTRAN) computer program which implements the general algorithm is available from the authors of FSQP (Zhou, Tits and Lawrence, 1998). In the implementation for this work this code has been used as the basis of a Gibbs free energy minimization solver. The algorithm has ‘super-linear’ convergence, which makes it practical for routine use. The FSQP code used in this work is part of a library of similar software which can be obtained through the Optimization Technology Centre (OTC) of the Argonne National Laboratory.
3.5.3 Automatic differentiation
Another very important feature of the GFEM algorithm adopted in this study is the use of automatic differentiation tools. All the previous GFEM methods cited in this chapter (with the exception of Bina, 1998) require derivatives of Gsystem to be evaluated and chemical potential expressions to be explicitly written down. This means that analytical expressions for derivatives must be determined beforehand and coded into the minimization algorithm. If these expressions are changed then the process must be repeated. On the other hand, in this work, thanks to the technique known as automatic differentiation, there is no need to determine analytical derivatives of Gsystem.
The mathematical background and the theory of automatic differentiation are described in detail in Griewank (1989), Griewank and Corliss (1991), and Berz et al. (1996). The automatic differentiation package used in this work is TAF (Transformation of Algorithms in Fortran), which is a source-to-source automatic differentiation tool for Fortran-77 and Fortran-95 programs (see Giering and Kaminski, 2003). The main idea behind automatic differentiation is that any mathematical function that can be evaluated using a computer code (in this case the Gibbs function in Eq. 3.1 and the expressions for constraints given by Eqs. 3.12-3.16) can be broken down into a combination of simpler elementary functions, like x^2, exp, log, etc. This is self-evident since there are only a finite number of possible commands in a programming language, e.g. FORTRAN or C. The derivatives of each of these elementary functions are known and so the derivatives of any combination of them can be found using the chain rule of differentiation (see Stephenson, 1973). These are the steps that one would normally take in writing down analytical expressions for derivatives of complicated functions. Automatic differentiation performs the same task by analyzing the computer code used to evaluate the function and in effect ‘writing a similar code in the same language’ for the derivatives.
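The chain-rule bookkeeping described above can be sketched in a few lines. Note that this example does not reproduce TAF, which generates derivative source code from Fortran source; it swaps in the simpler operator-overloading (forward-mode, ‘dual number’) variant of automatic differentiation. The class `Dual`, the helper `log` and the ideal binary mixing function `g_mix` are hypothetical names, and the mixing expression is a textbook toy model, not the Gibbs function of Eq. 3.1.

```python
import math

class Dual:
    """A value carried together with its derivative with respect to one input."""
    def __init__(self, value, deriv=0.0):
        self.value = value
        self.deriv = deriv

    @staticmethod
    def lift(x):
        # Plain numbers are constants: derivative zero.
        return x if isinstance(x, Dual) else Dual(float(x))

    def __add__(self, other):
        other = Dual.lift(other)
        return Dual(self.value + other.value, self.deriv + other.deriv)
    __radd__ = __add__

    def __sub__(self, other):
        other = Dual.lift(other)
        return Dual(self.value - other.value, self.deriv - other.deriv)

    def __rsub__(self, other):
        return Dual.lift(other) - self

    def __mul__(self, other):
        other = Dual.lift(other)
        # Product rule
        return Dual(self.value * other.value,
                    self.deriv * other.value + self.value * other.deriv)
    __rmul__ = __mul__

def log(x):
    """Elementary function with its known derivative, d(log u) = du / u."""
    x = Dual.lift(x)
    return Dual(math.log(x.value), x.deriv / x.value)

R = 8.314  # gas constant, J/(mol K)

def g_mix(x, temperature=1000.0):
    """Ideal molar Gibbs energy of mixing of a binary solution (toy model)."""
    return R * temperature * (x * log(x) + (1.0 - x) * log(1.0 - x))

x0 = 0.3
result = g_mix(Dual(x0, 1.0))  # seed dx/dx = 1
analytic = R * 1000.0 * math.log(x0 / (1.0 - x0))  # hand-derived dG/dx
print("automatic:", result.deriv, "analytic:", analytic)
```

The function `g_mix` is written once, as an ordinary expression; its derivative is obtained by evaluating it on a `Dual` argument, and it agrees with the hand-derived expression to rounding error. Changing the body of `g_mix` would change the derivative automatically, with no separate derivative code to maintain.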
From the mathematical point of view, a first advantage of automatic differentiation is that it combines the complete flexibility of purely numerical derivatives, e.g. finite differences (Press et al., 1992), with the accuracy of analytical derivative expressions.
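The accuracy limitation of finite differences that this comparison refers to can be seen in a small experiment. The sketch below uses a hypothetical test function, x log(x), chosen because terms of this form appear in ideal mixing expressions; the step sizes and names are likewise assumptions. A forward difference is compared with the exact derivative, log(x) + 1, for three step sizes.

```python
import math

def f(x):
    # Hypothetical test function (illustration only)
    return x * math.log(x)

def f_exact(x):
    # Analytical derivative of x*log(x)
    return math.log(x) + 1.0

def forward_difference(func, x, h):
    """First-order forward finite-difference approximation of func'(x)."""
    return (func(x + h) - func(x)) / h

x0 = 0.3
errors = {h: abs(forward_difference(f, x0, h) - f_exact(x0))
          for h in (1e-2, 1e-8, 1e-14)}
for h, err in errors.items():
    print(f"h = {h:g}: absolute error = {err:.3e}")
```

A large step suffers from truncation error and a very small step from round-off, so the error is smallest at an intermediate step size and never reaches machine precision. A derivative obtained by applying the chain rule exactly has no step size to tune.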
From the point of view of thermodynamic modeling, an important aspect of automatic differentiation tools is that they allow considerable flexibility in the form of Gsystem, and hence in the range of problems that can be addressed. It could, for instance, be assumed that the model presented in chapter 2 is incorrect, or it could be decided to add other phases to the system. In either case, in the GFEM algorithm, one would only have to modify the Gsystem expression, with no need to determine its analytical derivatives. This could be done at any time, as it would require just a few minor changes to the computer code. Finally, another significant advantage of automatic differentiation is that it removes the possibility of ‘coding errors’, which often occur when translating complex derivative expressions into a computer language.