Chapter 2 Trajectory Optimization Using Nonlinear Programming and Forward-Backward
2.1 Nonlinear Programming
An NLP is an optimization problem in which an objective function J(x) is minimized (or maximized) by varying a set of real variables x, subject to a set of equalities and inequalities commonly referred to as constraints. Examples of numerical NLP solvers include the Interior Point OPTimizer (IPOPT) [132] and the Sparse Nonlinear OPTimizer (SNOPT) [133], the latter of which was used for this work. These algorithms are designed to solve mathematical programming problems of the following form:
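$$
(NLP) = \left\{
\begin{aligned}
&\min_{x} \; f(x), \qquad f : \mathbb{R}^n \mapsto \mathbb{R} \\
&\text{s.t.} \quad lb \le \begin{pmatrix} x \\ c(x) \end{pmatrix} \le ub
\end{aligned}
\right.
\qquad x \in \Omega \subseteq \mathbb{R}^n, \quad c \in \mathbb{R}^m, \quad lb,\, ub \in \mathbb{R}^{n+m}
\tag{2.1}
$$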
where f(x) is a smooth scalar function, c(x) is a vector of smooth nonlinear and linear constraint functions, {lb, ub} are vectors of constant lower and upper bounds on x and c, and Ω is the feasible region for (NLP).
A feasible point xf for the problem posed in Equation (2.1) is one that satisfies all of the problem constraints [118].
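The feasibility definition above can be illustrated with a short sketch. The helper `is_feasible`, the tolerance `tol`, and the two-variable example problem are hypothetical and are not taken from SNOPT or from this work; the sketch only assumes that the bounds are stacked as in Eq. (2.1), with the n bounds on x followed by the m bounds on c(x).

```python
def is_feasible(x, c, lb, ub, tol=1e-8):
    """Return True if lb <= [x, c(x)] <= ub holds elementwise within tol."""
    stacked = list(x) + list(c(x))  # n decision variables, then m constraint values
    if not (len(stacked) == len(lb) == len(ub)):
        raise ValueError("lb and ub must each have n + m entries")
    return all(lo - tol <= v <= hi + tol for v, lo, hi in zip(stacked, lb, ub))


# Hypothetical example: n = 2 variables, m = 1 constraint c(x) = x0 + x1
def c_example(x):
    return [x[0] + x[1]]


lb = [0.0, 0.0, 1.0]            # bounds on x0, x1, then on c(x)
ub = [2.0, 2.0, float("inf")]   # c(x) >= 1 with no upper limit

feasible_point = is_feasible([0.5, 0.5], c_example, lb, ub)
infeasible_point = is_feasible([0.2, 0.2], c_example, lb, ub)
```

The first point satisfies every bound, while the second violates the lower bound on c(x) and is therefore not feasible.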
The core algorithm implemented in SNOPT is a sparse SQP solver, which, like many NLP solution methods, relies on the Karush-Kuhn-Tucker (KKT) first-order necessary conditions [134, 135] to determine whether or not xf is in fact the optimal solution x∗ to the problem described in Eq. (2.1):
1. $\lambda_i^* \ge 0, \quad i = 1, 2, \ldots, m$ (dual feasibility)
2. $\lambda_i^* c_i(x^*) = 0, \quad i = 1, 2, \ldots, m$ (complementary slackness)
3. $\nabla f(x^*) + \sum_{i=1}^{m} \lambda_i^* \nabla c_i(x^*) = 0$ (stationarity)
Evaluating these conditions requires the partial derivatives of the objective function f and the constraint vector c with respect to the problem decision vector x. The matrix containing the partial derivatives of the constraints with respect to the decision vector is known as the Jacobian. Jacobian accuracy degrades when approximation techniques such as finite differences are used in lieu of analytically obtained values for the dense entries. This loss of accuracy can increase the number of iterations required to solve the problem, or can cause the solver to encounter numerical difficulties that lead to more severe exit conditions (e.g. singular entries in, or rank deficiency of, the Jacobian).
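As an illustration of the three KKT conditions listed above, the following sketch evaluates them at a candidate point of a small problem. The function `kkt_residuals` and the example problem are hypothetical. The sketch assumes every constraint is written in the form c_i(x) ≤ 0, which is one convention consistent with the sign of the stationarity condition above; the general two-sided bounds of Eq. (2.1) are not handled.

```python
def kkt_residuals(grad_f, c_vals, grad_c, lam):
    """Return (min multiplier, max |lambda_i * c_i|, max |stationarity component|)."""
    n = len(grad_f)
    dual = min(lam)                                        # dual feasibility: want >= 0
    comp = max(abs(l * ci) for l, ci in zip(lam, c_vals))  # complementary slackness
    stat = [grad_f[k] + sum(l * g[k] for l, g in zip(lam, grad_c)) for k in range(n)]
    return dual, comp, max(abs(s) for s in stat)


# Hypothetical problem: minimize x0^2 + x1^2 subject to c(x) = 1 - x0 - x1 <= 0
x_star = [0.5, 0.5]
lam_star = [1.0]

grad_f = [2.0 * x_star[0], 2.0 * x_star[1]]  # gradient of the objective at x_star
c_vals = [1.0 - x_star[0] - x_star[1]]       # the constraint is active (value 0)
grad_c = [[-1.0, -1.0]]                      # one gradient row per constraint

dual, comp, stat = kkt_residuals(grad_f, c_vals, grad_c, lam_star)
```

At this point the multiplier is non-negative and both residuals vanish, so the candidate satisfies the first-order necessary conditions. Note that the stationarity residual depends directly on the constraint gradients, which is why Jacobian accuracy matters to the solver.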
2.1.1 Jacobian Calculation Techniques
Nonlinear programming solvers typically include a mode of operation in which the software computes the Jacobian matrix via a finite-difference approximation. Many finite-difference algorithms exist; the forward and central differencing methods are shown in Eqs. (2.2) and (2.3). These algorithms exhibit first- and second-order Taylor series truncation error, respectively, although higher-order methods are also possible [136].
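$$
\frac{\partial c_j(x)}{\partial x_i} = \frac{c_j(x_i + h) - c_j(x_i)}{h} + O(h) \tag{2.2}
$$

$$
\frac{\partial c_j(x)}{\partial x_i} = \frac{c_j(x_i + h) - c_j(x_i - h)}{2h} + O(h^2) \tag{2.3}
$$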
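The two difference formulas can be sketched as a dense Jacobian routine that perturbs one decision variable at a time. The function `fd_jacobian`, the constraint vector `c_example`, and the step size are hypothetical choices made for illustration; they do not represent the finite-difference mode of any particular NLP solver.

```python
import math


def fd_jacobian(c, x, h=1e-6, scheme="forward"):
    """Approximate J[j][i] = d c_j / d x_i by perturbing one variable at a time."""
    c0 = c(x)
    jac = [[0.0] * len(x) for _ in c0]
    for i in range(len(x)):
        xp = list(x)
        xp[i] += h
        cp = c(xp)
        if scheme == "forward":
            col = [(p - b) / h for p, b in zip(cp, c0)]             # Eq. (2.2)
        else:
            xm = list(x)
            xm[i] -= h
            col = [(p - m) / (2.0 * h) for p, m in zip(cp, c(xm))]  # Eq. (2.3)
        for j, v in enumerate(col):
            jac[j][i] = v
    return jac


# Hypothetical constraint vector with a known analytic Jacobian
def c_example(x):
    return [x[0] * x[1], math.sin(x[0]) + x[1] ** 2]


x0 = [1.0, 2.0]
exact = [[x0[1], x0[0]], [math.cos(x0[0]), 2.0 * x0[1]]]


def max_error(jac):
    return max(abs(a - b) for ra, rb in zip(jac, exact) for a, b in zip(ra, rb))


err_forward = max_error(fd_jacobian(c_example, x0, h=1e-4, scheme="forward"))
err_central = max_error(fd_jacobian(c_example, x0, h=1e-4, scheme="central"))
```

For the same step size, the central scheme should produce a noticeably smaller error than the forward scheme, at the cost of one additional function evaluation per decision variable.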
While divided differences are straightforward to implement, the accuracy limits of this technique make its use inside an NLP solver undesirable when the problem constraints are complex functions, for several reasons. First, finite differencing requires multiple calls to the NLP cost function in order to compute a single entry in the Jacobian. When the problem cost function is computationally expensive to evaluate, this can seriously hinder the progress of the optimizer from a pure runtime point of view. The more serious problem with employing divided differences is the inherent inaccuracy of the method. As Eqs. (2.2) and (2.3) indicate, reducing the size of the perturbation step h reduces the truncation error of the method. Doing so beyond a certain point, however, begins to increase the floating-point round-off error when the algorithm is implemented on a digital computer. For these reasons, finite differencing is best avoided unless the use of other techniques is not possible, such as when using black-box software packages.
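The competition between truncation error and round-off error can be observed with a simple step-size sweep. The sketch below applies the forward difference of Eq. (2.2) to a hypothetical scalar test function, sin(x), whose derivative is known exactly; the evaluation point and the range of step sizes are arbitrary choices.

```python
import math


def forward_diff(f, x, h):
    """Forward difference approximation of df/dx, Eq. (2.2)."""
    return (f(x + h) - f(x)) / h


# Hypothetical scalar test function with known derivative cos(x)
x0 = 1.0
exact = math.cos(x0)

errors = {}
for k in range(1, 16):
    h = 10.0 ** (-k)
    errors[h] = abs(forward_diff(math.sin, x0, h) - exact)

best_h = min(errors, key=errors.get)  # step with the smallest observed error

for h in sorted(errors, reverse=True):
    print(f"h = {h:.0e}   error = {errors[h]:.3e}")
```

In double precision the error typically falls as h shrinks, reaches a minimum at an intermediate step size, and then grows again as subtractive cancellation dominates the difference in the numerator.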
The concept of generating derivatives using complex numbers was introduced by Lyness and Moler [137, 138]. Perhaps the most advanced derivative computation technique in the category of complex step methods is the work by Lantoine et al. [139] and Pellegrini and Russell [136] with their introduction of multicomplex step differentiation. Multicomplex numbers are an extension of the complex numbers, and this method uses minute perturbations along the appropriate multicomplex direction to obtain partial derivatives of arbitrary order for any holomorphic function. While this technique has been shown to be both accurate and robust, its implementation is non-trivial and typically requires the development of auxiliary software that implements multicomplex mathematical floating-point operations.
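A minimal sketch of the simplest member of this family, the first-derivative complex-step approximation f'(x) ≈ Im[f(x + ih)]/h, is given below. It is not the multicomplex method of [136, 139], which requires dedicated multicomplex arithmetic. The test function and the step size h = 1e-30 are hypothetical choices, and the function must be written so that it accepts complex input.

```python
import cmath
import math


def complex_step_derivative(f, x, h=1e-30):
    """First derivative of a real-valued analytic f via Im[f(x + ih)] / h."""
    return f(complex(x, h)).imag / h


# Hypothetical test function, written with cmath so that it accepts complex input
def f(z):
    return cmath.exp(z) * cmath.sin(z)


x0 = 1.0
exact = math.exp(x0) * (math.sin(x0) + math.cos(x0))
approx = complex_step_derivative(f, x0)
rel_error = abs(approx - exact) / abs(exact)
```

Because the formula contains no subtraction of nearly equal quantities, the step can be made extremely small without incurring the cancellation error that limits Eqs. (2.2) and (2.3).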
Yet another class of methods for numerically computing partial derivatives is automatic differentiation. Automatic differentiation, also known as algorithmic differentiation or computational differentiation, encompasses a class of techniques that compute the derivatives of a function expressed in a computer program to machine precision, and it can be realized using a wide variety of implementation strategies. In general, Automatic Differentiation (AD) techniques rely on the fact that the chain rule can be successively applied to the mathematical expressions contained in a computer program in order to obtain derivatives of those expressions to arbitrary order. Automatic differentiation was first introduced in 1964, with the pioneering work on forward mode derivative accumulation by Wengert [140]. Reverse mode accumulation was introduced shortly after by Linnainmaa [141, 142] and is the discrete analogue to continuous adjoints of differential equations. Automatic differentiation implementations can be roughly partitioned into two main operational families: techniques that use forward mode derivative accumulation, and those that use reverse mode accumulation. Additional details regarding forward mode AD will be discussed in section 3.6.1.
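Forward mode accumulation can be sketched with dual numbers, where each value carries its derivative with respect to one seeded input and every operation applies the chain rule. The `Dual` class, the `sin` overload, and `c_example` below are hypothetical and support only addition, multiplication, and sine; this is an illustration of the idea and not the AD implementation discussed later in this work.

```python
import math


class Dual:
    """A value together with its derivative with respect to one seeded input."""

    def __init__(self, val, dot=0.0):
        self.val = val
        self.dot = dot

    @staticmethod
    def lift(other):
        # Treat plain numbers as constants with zero derivative
        return other if isinstance(other, Dual) else Dual(float(other))

    def __add__(self, other):
        other = Dual.lift(other)
        return Dual(self.val + other.val, self.dot + other.dot)

    __radd__ = __add__

    def __mul__(self, other):
        other = Dual.lift(other)
        # Product rule
        return Dual(self.val * other.val, self.dot * other.val + self.val * other.dot)

    __rmul__ = __mul__


def sin(u):
    # Chain rule for an elementary function
    return Dual(math.sin(u.val), math.cos(u.val) * u.dot)


# Hypothetical constraint function built only from supported operations
def c_example(x0, x1):
    return x0 * x1 + sin(x0) + 3.0 * x1


# Seed dot = 1 on the variable being differentiated, 0 on the other
d_dx0 = c_example(Dual(1.0, 1.0), Dual(2.0, 0.0)).dot
d_dx1 = c_example(Dual(1.0, 0.0), Dual(2.0, 1.0)).dot
```

One forward sweep yields the derivative with respect to a single seeded input, so a function of n variables requires n sweeps to fill one Jacobian row with this approach.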
Often the most computationally efficient means for computing the Jacobian is to provide analytic expressions for the dense entries. Depending on the complexity of the NLP problem being solved, this can be a tedious and error-prone process. Despite this, a large portion of the contributions described in this work relate to the analytic calculation of partial derivatives for trajectory transcriptions. Chapter 3 will describe algorithms for computing the Jacobian for two bounded-impulse trajectory models and chapter 4 will extend these methods to a time-regularized bounded-impulse model. Chapter 5 will discuss techniques for computing the Jacobian of a finite-burn trajectory model that employs numeric integration.
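Because hand-derived partials are error-prone, a common safeguard is to compare an analytic Jacobian against a central-difference estimate at a test point. The sketch below assumes a hypothetical two-constraint function, its hand-coded Jacobian, and a second version with a deliberately wrong entry; none of these come from the trajectory models described in the later chapters.

```python
import math


# Hypothetical constraint vector and its hand-derived Jacobian
def c_example(x):
    return [x[0] ** 2 * x[1], math.exp(x[0]) - x[1]]


def jac_analytic(x):
    return [[2.0 * x[0] * x[1], x[0] ** 2],
            [math.exp(x[0]), -1.0]]


def jac_with_sign_error(x):
    jac = jac_analytic(x)
    jac[1][1] = 1.0  # deliberate mistake: the correct entry is -1
    return jac


def max_mismatch(c, jac, x, h=1e-6):
    """Largest absolute gap between jac(x) and a central-difference estimate."""
    worst = 0.0
    analytic = jac(x)
    for i in range(len(x)):
        xp, xm = list(x), list(x)
        xp[i] += h
        xm[i] -= h
        cp, cm = c(xp), c(xm)
        for j in range(len(cp)):
            fd = (cp[j] - cm[j]) / (2.0 * h)
            worst = max(worst, abs(fd - analytic[j][i]))
    return worst


x0 = [0.7, -1.3]
ok = max_mismatch(c_example, jac_analytic, x0)
bad = max_mismatch(c_example, jac_with_sign_error, x0)
```

A mismatch near the finite-difference noise floor suggests the analytic entries are consistent with the function, whereas a mismatch of order one points to a derivation or coding error in a specific entry.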