Nonlinear Equations and Numerical Optimization Problems
6.1 Nonlinear Algebraic Equations
6.1.1 Graphical method for solving nonlinear equations
Implicit functions of one or two variables can be drawn easily with the MATLAB function ezplot(). With ezplot(), nonlinear equations can be shown graphically, and their real solutions can be obtained by reading off the coordinates of the intersections of the curves. The graphical method is restricted to nonlinear equations with one or two variables. Nonlinear equations with more than two variables have to be solved numerically or, in some special cases, symbolically.
Graphically solving nonlinear equations of single variable
The function ezplot() can be used to draw the curve of the function f(x). The real solutions to f(x) = 0 can be identified from the intersections of the curve with the line y = 0.
Example 6.1 Solve the equation $e^{-3t}\sin(4t+2) + 4e^{-0.5t}\cos 2t = 0.5$ using the graphical method and examine the accuracy of the solutions.
Solution The function ezplot() can be used to draw the curve of the function as shown in Figure 6.1 (a). The intersections with the horizontal axis are the solutions to the original nonlinear equation.
>> ezplot('exp(-3*t)*sin(4*t+2)+4*exp(-0.5*t)*cos(2*t)-0.5',[0 5])
line([0,5],[0,0]) % draw the horizontal axis as well
From the curve it can be observed that there are three real solutions, p1, p2, p3, over the interval t ∈ (0, 5). One may zoom in on the area around a particular solution repeatedly until all the tick labels on the horizontal axis read the same value. That value is then taken as the solution. An example of the zoomed curve is shown in Figure 6.1 (b), where the solution t = 0.6738 can be read. Substituting this reading back into the equation
>> t=0.6738; exp(-3*t)*sin(4*t+2)+4*exp(-0.5*t)*cos(2*t)-0.5
the error can be found as $-2.9852\times10^{-4}$. So, for this example, the accuracy achieved by the graphical method is not very high. Similar methods can be used to find and validate the other solutions.
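The graphical readings can also be refined numerically. As a minimal sketch (not part of the original example), the statements below sample the function on a grid over (0, 5), locate the sign changes, and pass each bracketing interval to the standard MATLAB function fzero(). The grid of 501 points and the variable names f, tt, k, r are assumptions chosen for illustration.

```matlab
% Function from Example 6.1, written with element-wise operators
f = @(t) exp(-3*t).*sin(4*t+2) + 4*exp(-0.5*t).*cos(2*t) - 0.5;

tt = linspace(0, 5, 501);                   % sampling grid (assumed resolution)
ft = f(tt);
k  = find(ft(1:end-1).*ft(2:end) < 0);      % indices where the sign changes

r = zeros(size(k));
for i = 1:numel(k)
    % each sign change brackets one real root
    r(i) = fzero(f, [tt(k(i)), tt(k(i)+1)]);
end
disp(r)
```

Each entry of r can then be compared with the corresponding reading from the zoomed plot. A grid that is too coarse may miss pairs of closely spaced roots, so the plot remains a useful check.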
FIGURE 6.1: Graphical solutions to an equation with single variable. (a) the curve for implicit function over t ∈ (0, 5), with the solutions p1, p2, p3 marked; (b) zoomed curve and the solution
Graphically solving nonlinear equations of two variables
Nonlinear equations with two variables can also be solved easily using the graphical method. Use ezplot() function to draw the solutions to the first equation. Then use hold on command to hold the graphics window such that the plot from ezplot() for the second nonlinear equation is superimposed to the first one. The intersections of the two sets of curves are then the solutions to the original nonlinear equations. The solutions can be read out graphically using the zooming method illustrated earlier.
Example 6.2 Solve graphically the equations
$$\begin{cases} x^2 e^{-xy^2/2} + e^{-x/2}\sin(xy) = 0 \\ y^2\cos(y+x^2) + x^2 e^{x+y} = 0. \end{cases}$$
Solution The graphical method can be used to solve the above nonlinear simultaneous equations. The first equation can be displayed with the direct use of the implicit function drawing command ezplot(), as shown in Figure 6.2 (a)
>> ezplot('x^2*exp(-x*y^2/2)+exp(-x/2)*sin(x*y)') % the first equation
Use the command hold on to ensure the curves will not be removed. Then, ezplot() draws the solutions to the second equation. The curves will then be superimposed on the curves obtained earlier, as shown in Figure 6.2 (b).
>> hold on; ezplot('y^2 *cos(y+x^2) +x^2*exp(x+y)')
The intersections are then the solutions to the set of nonlinear equations. In this way, all the real solutions to the given simultaneous equations within the plotted region can be displayed. To get the coordinates of a certain point, for instance point B in Figure 6.2 (b), one may zoom in on the region of interest repeatedly until all the tick labels on each of the x- and y-axes read the same value, as shown in Figure 6.3 (a). Thus the solution at point B is x = −0.7327, y = 1.5619.
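A graphical reading such as point B can serve as the initial point of a numerical search. The following is a minimal sketch, assuming the Optimization Toolbox function fsolve() is available; the residual function F, the tightened tolerance and the variable names are illustrative choices, and the two equations are written exactly as they are passed to ezplot() above.

```matlab
% Residuals of the two plotted equations; p = [x; y]
F = @(p) [p(1)^2*exp(-p(1)*p(2)^2/2) + exp(-p(1)/2)*sin(p(1)*p(2)); ...
          p(2)^2*cos(p(2) + p(1)^2) + p(1)^2*exp(p(1) + p(2))];

p0  = [-0.7327; 1.5619];                          % graphical reading of point B
opt = optimset('Display', 'off', 'TolFun', 1e-12);
[p, Fval, flag] = fsolve(F, p0, opt);

disp(p.')            % refined coordinates of B
disp(norm(Fval))     % residual norm at the refined point
```

A positive flag together with a small residual norm indicates that the reading has been refined to a genuine solution rather than to a point that only looks like an intersection at the plot resolution.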
From Figure 6.2 (b) it is also found that most of the solutions are located in the fourth quadrant. Thus a larger area there can be chosen. For instance, the rectangular region with corners (0, 0) and (8, −15) can be selected, and the solutions in the new area can be displayed as shown in Figure 6.3 (b).
FIGURE 6.2: Graphical solutions to the nonlinear equations. (a) solutions to the first equation; (b) superimposed curves, with the intersection B marked
FIGURE 6.3: Graphical solutions to the nonlinear equations. (a) zooming for a solution (axes x and y); (b) more solutions (axes x and y)
>> ezplot('x^2*exp(-x*y^2/2)+exp(-x/2)*sin(x*y)',[0,8,-15,0])
hold on, ezplot('y^2 *cos(y+x^2) +x^2*exp(x+y)',[0,8,-15,0])
6.1.2 Quasi-analytical solutions to polynomial-type equations
Before illustrating solutions to polynomial equations, let us consider an example.
Example 6.3 Solve the equations
$$\begin{cases} x^2 + y^2 - 1 = 0 \\ 0.75x^3 - y + 0.9 = 0 \end{cases}$$
using the graphical method.
Solution Using the graphical method, the two curves for the two equations can be displayed easily using the following statements, as shown in Figure 6.4. The intersections are the solutions to the original equations.
>> ezplot('x^2+y^2-1'); hold on % solutions to the first equation
ezplot('0.75*x^3-y+0.9') % the second equation
FIGURE 6.4: Solutions using graphical method
It can be seen from the curves in Figure 6.4 that there are two intersections. However, it cannot be simply concluded that the original equations have only two solutions. Solving the second equation for y gives $y = 0.75x^3 + 0.9$, a cubic polynomial in x. Substituting it into the first equation yields a polynomial equation in x of degree 6, which must have 6 roots. The graphical method can only be used to find the real solutions; it gives no information on complex roots. Thus graphical methods are sometimes not adequate for solving polynomial equations.
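The substitution argument can be checked with a few lines of base MATLAB. This is a sketch added for illustration, not the method used in the rest of the section: expanding $(0.75x^3+0.9)^2 + x^2 - 1 = 0$ gives $0.5625x^6 + 1.35x^3 + x^2 - 0.19 = 0$, whose coefficients are passed to roots(). The names c, xr, yr, res and nreal are assumptions.

```matlab
% Coefficients of 0.5625*x^6 + 1.35*x^3 + x^2 - 0.19, highest power first
c  = [0.5625 0 0 1.35 1 0 -0.19];

xr = roots(c);                        % all six roots, real and complex
yr = 0.75*xr.^3 + 0.9;                % matching y from the second equation

res   = abs(xr.^2 + yr.^2 - 1);       % residual of the first equation
nreal = sum(abs(imag(xr)) < 1e-8);    % number of real roots

disp([xr, yr]); disp(nreal)
```

Only the roots counted by nreal correspond to intersections visible in the plot; the remaining roots are complex and appear in conjugate pairs.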
The function solve() provided in the Symbolic Math Toolbox of MATLAB is quite effective in finding the solutions to polynomial-type equations. The function can be used in finding all the solutions to the simultaneous equations which can be converted to polynomial equations. The syntaxes of the function are
S=solve(eqn1,eqn2,· · · ,eqnn) % the simplest syntax
[x,y,· · · ]=solve(eqn1,eqn2,· · · ,eqnn) % direct solutions
[x,y,· · · ]=solve(eqn1,eqn2,· · · ,eqnn,'x,y,· · · ') % variables specified
where eqni is the symbolic representation of the ith equation to be solved. In this way, simultaneous equations can easily be represented. In the first statement, a structure variable S is returned, and the solutions are members of S, for instance, S.x and S.y.
Example 6.4 Solve again the equations in Example 6.3.
Solution The solve() function can be used in solving the equations such that
>> syms x y; [x,y]=solve('x^2+y^2-1=0','75*x^3/100-y+9/10=0')
and the solutions are found as
$$x=\begin{bmatrix}
0.35696997189122287798839037801365\\
0.8663180988361181101678980941865 + \mathrm{j}1.21537126646714278013183785444\\
-0.553951760568345600779844138827 + \mathrm{j}0.354719764650807934568637899349\\
-0.98170264842676789676449828873194\\
-0.553951760568345600779844138827 - \mathrm{j}0.354719764650807934568637899349\\
0.8663180988361181101678980941865 - \mathrm{j}1.21537126646714278013183785444
\end{bmatrix},$$
$$y=\begin{bmatrix}
0.93411585960628007548796029415446\\
-1.4916064075658223174787216959 + \mathrm{j}0.705882007214022677539188271388\\
0.929338302266743628529852766772 + \mathrm{j}0.2114382218589592361562338176221\\
0.19042035099187730240977756415289\\
0.929338302266743628529852766772 - \mathrm{j}0.2114382218589592361562338176221\\
-1.4916064075658223174787216959 - \mathrm{j}0.705882007214022677539188271388
\end{bmatrix}.$$
For polynomial equations of degree five or higher, according to the well-known Abel-Ruffini theorem, there is no general analytical solution in radicals. The Symbolic Math Toolbox of MATLAB can instead be used to obtain high-precision numerical solutions, which are referred to as quasi-analytical solutions. It can be seen that apart from the two real solutions, there are two pairs of complex conjugate solutions to the original nonlinear equations. The complex solutions cannot be obtained using the graphical method or real-valued search algorithms.
The following statements can be given to verify the accuracies of the solutions
>> [eval('x.^2+y.^2-1') eval('75*x.^3/100-y+9/10')]'
and the error to each equation can be obtained as
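$$\begin{bmatrix}
-0.1\times10^{-31} & 0.5\times10^{-30}+\mathrm{j}0.1\times10^{-30} & 0-\mathrm{j}0 & 0 & 0-\mathrm{j}0 & 0.5\times10^{-30}-\mathrm{j}0.1\times10^{-30}\\
0 & 0-\mathrm{j}0 & 0-\mathrm{j}0 & 0 & 0-\mathrm{j}0 & 0-\mathrm{j}0
\end{bmatrix}$$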
where each column corresponds to a pair of $(x_i, y_i)$ solutions. It can be seen that the results thus obtained are extremely accurate, with an accuracy impossible to achieve with double-precision arithmetic.
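The listings in this section pass the equations to solve() as quoted strings. In recent releases of the Symbolic Math Toolbox, equations are normally written as symbolic expressions with the == operator instead. The sketch below is an assumption about how Example 6.4 might look in that style; the structure name S, the 32-digit setting and the variable names are illustrative, and the form of the returned result may differ between releases.

```matlab
syms x y
% Equations given as symbolic expressions rather than strings
S = solve([x^2 + y^2 - 1 == 0, 75*x^3/100 - y + 9/10 == 0], [x, y]);

xs = vpa(S.x, 32);      % high-precision numerical values of the roots
ys = vpa(S.y, 32);

% Substitute back into both equations and measure the error
err = double([xs.^2 + ys.^2 - 1, 75*xs.^3/100 - ys + 9/10]);
disp(norm(err))
```

Using the structure output keeps the symbolic variables x and y intact, so they can be reused in later statements without calling syms again.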
Example 6.5 Polynomial-type equations with more variables can also be solved using the solve() function. Find the solutions to the following equations
$$\begin{cases} x + 3y^3 + 2z^2 = 1/2 \\ x^2 + 3y + z^3 = 2 \\ x^3 + 2z + 2y^2 = 2/4. \end{cases}$$
Solution The given equations involve three variables x, y, z. Since they contain only polynomial terms, they can theoretically be converted into a polynomial equation of a single variable. The following statements can be used to find the quasi-analytical solutions to the given equations.
>> [x,y,z]=solve('x+3*y^3+2*z^2=1/2',...
'x^2+3*y+z^3=2','x^3+2*z+2*y^2=2/4')
In fact, the original equations can be converted into a single polynomial equation of a single variable with a degree of 27 (the product 3 × 3 × 3 of the degrees of the three equations). Thus the quasi-analytical solutions can be obtained using the above statements. Substituting the solutions back into the original equations with the following statements, it can be found that the norm of the error could be as small as $6.9146\times10^{-26}$. So, the equations can be solved accurately using the quasi-analytical approach.
>> err=[x+3*y.^3+2*z.^2-1/2, x.^2+3*y+z.^3-2, x.^3+2*z+2*y.^2-2/4];
norm(double(eval(err)))
In fact, the equations may also contain terms that are products of powers of different variables. For instance, if the last equation is changed to $x^3 + 2zy^2 = 2/4$, which contains the product term $zy^2$, the solutions to the new equations can also be found by the direct use of the solve() function. The statements can be changed to
>> [x,y,z]=solve('x+3*y^3+2*z^2=1/2','x^2+3*y+z^3=2','x^3+2*z*y^2=2/4')
err=[x+3*y.^3+2*z.^2-1/2, x.^2+3*y+z.^3-2, x.^3+2*z.*y.^2-2/4];
norm(double(eval(err))) % norm of the error
and the quasi-analytical solutions can be found. The norm of the error for the new equations can be as small as $6.4156\times10^{-26}$.
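Example 6.6 Solve the following equations where the reciprocals of the variables are involved
$$\begin{cases} \dfrac{1}{2}x^2 + x + \dfrac{3}{2} + 2\dfrac{1}{y} + \dfrac{5}{2y^2} + 3\dfrac{1}{x^3} = 0 \\[2mm] \dfrac{y}{2} + \dfrac{3}{2x} + \dfrac{1}{x^4} + 5y^4 = 0. \end{cases}$$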
Solution It is not likely possible to solve this kind of complicated equation without the help of powerful computer mathematics languages. However, with the following statements the quasi-analytical solutions can be obtained easily.
>> syms x y;
f1=x^2/2+x+3/2+2/y+5/(2*y^2)+3/x^3; f2=y/2+3/(2*x)+1/x^4+5*y^4;
[x0,y0]=solve(f1,f2)
and it can be seen that there are 26 pairs of solutions. Substituting all the solutions back into the original equations, one can immediately find that the norm of the error is $6.3172\times10^{-30}$, which means that the solutions are very accurate.
>> err=[subs(f1,{x,y},{x0,y0}) subs(f2,{x,y},{x0,y0})];
norm(double(err))
Example 6.7 Solve the equations with constants
$$\begin{cases} x^2 + ax^2 + 6b + 3y^2 = 0 \\ y = a + x + 3. \end{cases}$$
Solution The solve() function can be used directly to solve the equations, even if they contain extra symbolic constants. The solutions to the problem can be obtained by the direct use of the function call such that
>> syms a b x y; [x,y]=solve('x^2+a*x^2+6*b+3*y^2=0','y=a+(x+3)','x,y')
and the solutions can be written as
$$x=\frac{-6a-18\pm2\sqrt{-21a^2-45a-27-24b-6ab-3a^3}}{2(4+a)},$$
$$y=a+\frac{-6a-18\pm2\sqrt{-21a^2-45a-27-24b-6ab-3a^3}}{2(4+a)}+3.$$
In fact, the method may apply to third- or fourth-degree equations as well.
However, the solutions are usually too complicated to display.
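The closed form in Example 6.7 can be checked by hand. The following worked derivation is added as a verification and uses only the two equations of that example; it assumes $a \neq -4$ so that the equation remains quadratic. Substituting $y = x + a + 3$ into the first equation gives

```latex
\begin{aligned}
(1+a)x^2 + 6b + 3(x+a+3)^2 &= 0\\
(4+a)x^2 + 6(a+3)x + 3(a+3)^2 + 6b &= 0,\\[1mm]
\Delta = 36(a+3)^2 - 4(4+a)\bigl[3(a+3)^2 + 6b\bigr]
       &= 4\bigl(-3a^3 - 21a^2 - 45a - 27 - 24b - 6ab\bigr),\\[1mm]
x = \frac{-6(a+3) \pm \sqrt{\Delta}}{2(4+a)}
  &= \frac{-6a - 18 \pm 2\sqrt{-21a^2 - 45a - 27 - 24b - 6ab - 3a^3}}{2(4+a)}.
\end{aligned}
```

The last line agrees with the expression for x returned by solve(), and y follows from $y = a + x + 3$.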
It should be noted that the analytical or quasi-analytical solution methods introduced in the previous subsections are not general-purpose. They can only be used in dealing with problems convertible to high-degree polynomial equations with a single variable. Furthermore, for most nonlinear equations, we cannot expect to find all the possible solutions.
6.1.3 Numerical solutions to general nonlinear equations
A numerical solution function fsolve() provided in the Optimization Toolbox of MATLAB can be used to search for a real solution to given nonlinear equations. The syntaxes of the function are
x=fsolve(fun,x0) % simple syntax
[x,f,flag,out]=fsolve(fun,x0,opt,p1,p2,· · · ) % formal full syntax
where fun can be an M-function, an anonymous function or an inline function describing the equations to be solved. The variable x0 is the initial search point for the solution. Starting from x0, a real solution to the equations is searched for using numerical algorithms. If a solution is successfully found, the returned flag is greater than 0; otherwise the search is not successful.
For more complicated problems, the solution control option opt can be used to select methods and control accuracies in searching the solution. The opt variable is defined as a structured variable, with the commonly used members explained in Table 6.1. The following syntaxes can be used in modifying the contents in the control options
opt=optimset; % get default control template
opt.TolX=1e-10; or opt=optimset(opt,'TolX',1e-10) % set control parameters
where some of the members, such as MaxFunEvals, are problem dependent; MaxFunEvals is usually set to 100 to 200 times the number of variables. The user may change the options using the above-mentioned function calls.
TABLE 6.1: Control options for equation solutions and optimizations
member name | explanation of the option
Display | To control whether the intermediate results are displayed, with the values 'off' for no display, 'iter' for display in each iteration, 'notify' for an alert on non-convergence, and 'final' for final results display only
GradObj | To indicate whether the gradient information is used in optimization. The options are 'off' and 'on', with 'off' the default
LargeScale | To indicate whether large-scale algorithms are used, with options 'on' and 'off'. For problems with only a few variables, it should be set to 'off'
MaxIter | The maximum allowed number of iterations for equation solution and optimization. This value can be increased for problems which fail to converge within the current control options
MaxFunEvals | The maximum allowed number of objective function calls
TolFun | The error tolerance of the objective function
TolX | The error tolerance of the solutions
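As a small illustration of how the options and the returned flag fit together, the sketch below solves the scalar equation $x^2 - 2 = 0$. The test equation, the tolerance values and the variable names are hypothetical choices made only for this illustration; optimset() and fsolve() are used with the syntaxes described above.

```matlab
% Set several options in one optimset() call
opt = optimset('Display', 'off', 'TolX', 1e-12, 'TolFun', 1e-12);

g = @(x) x^2 - 2;                     % illustrative equation g(x) = 0
[x, gval, flag] = fsolve(g, 1, opt);  % search from the initial point x0 = 1

if flag > 0
    fprintf('converged: x = %.10f, residual = %.2e\n', x, gval);
else
    disp('search not successful');
end
```

Checking flag before using x avoids mistaking the last iterate of a failed search for a solution.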
Example 6.8 Solve the equations in Example 6.3 using numerical algorithms.
Solution Before solving such equations, the unknown variables to be solved should be assigned to a vector. Selecting the variables $p_1 = x$, $p_2 = y$, the original equations can be represented by an anonymous function, and then one may select the initial values at $p_0 = [1, 2]^T$. The function fsolve() can be used directly to solve the original equations and find a solution.
>> f=@(p)[p(1)*p(1)+p(2)*p(2)-1; 0.75*p(1)^3-p(2)+0.9];
OPT=optimset; OPT.LargeScale='off';
[x,Y,c,d]=fsolve(f,[1; 2],OPT)
The solution found is $x = [0.35696997, 0.93411586]^T$, with the error $Y = [0.1215\times10^{-9}, 0.0964\times10^{-9}]$. It can also be found by examining the d argument that 21 function calls are made. Thus the algorithm is quite effective.
Similarly, the original equations can also be described by the inline function or by M-file. With the inline functions and anonymous functions, there is no need to create a separate M-file for each problem, which makes the file management more tidy and convenient.
If the initial values are changed to $p_0 = [-1, 0]^T$, then by using
>> [x,Y,c,d]=fsolve(f,[-1,0]’,OPT); x, Y, kk=d.funcCount
another solution is found at $x = [-0.981703, 0.1904204]^T$, and this time 15 function calls are made and the norm of the error vector is $0.5618\times10^{-10}$. In this example, it can be seen that different initial values may lead to different solutions.
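Because the solution found depends on the initial point, one simple strategy is to start the search from several points and keep the distinct solutions. The following is a sketch under stated assumptions: the 5 × 5 grid of initial points, the tolerances and the names sols and isNew are illustrative, and the procedure can only find real solutions that some initial point happens to converge to.

```matlab
f   = @(p) [p(1)^2 + p(2)^2 - 1; 0.75*p(1)^3 - p(2) + 0.9];
opt = optimset('Display', 'off', 'TolFun', 1e-12);

sols = zeros(0, 2);                       % one row per distinct solution
for a = -2:2
    for b = -2:2
        [p, fval, flag] = fsolve(f, [a; b], opt);
        if flag > 0 && norm(fval) < 1e-8  % accept only genuine solutions
            isNew = true;
            for k = 1:size(sols, 1)
                if norm(sols(k, :) - p.') < 1e-4, isNew = false; end
            end
            if isNew, sols(end+1, :) = p.'; end
        end
    end
end
disp(sols)
```

For the equations of Example 6.3 the collected rows should correspond to the two real solutions reported above; the complex solutions are out of reach of this real-valued search.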
Example 6.9 The Lambert function is a special function defined as w = lam(x), where w is the solution to the equation $we^{w} = x$ for a given variable x. For different values of x, solve the Lambert equation and then draw the relationship between w and x.
Solution The problem can be solved by looping over various values of x and solving the equation for each of them, using the previous solution as the initial value. The following statements can be given, and the Lambert function curve shown in Figure 6.5 can be obtained.
>> y=[]; xx=0:.05:10; x0=0; h=optimset; h.Display='off';
for x=xx
f=@(w)w.*exp(w)-x; y1=fsolve(f,x0,h); x0=y1; y=[y,y1];
end
plot(xx,y)
A MATLAB function lambertw() is provided in the Symbolic Math Toolbox which can be used to evaluate the Lambert equations directly. The following statements can be used instead to draw the Lambert curves and the results are exactly the same as the ones obtained in Figure 6.5.
>> y0=lambertw(xx); plot(xx,y0)
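When neither fsolve() nor lambertw() is available, the same curve can be approximated with a hand-written Newton iteration. This is a swapped-in technique shown only as a sketch: the update $w \leftarrow w - (we^{w} - x)/\bigl(e^{w}(w+1)\bigr)$ follows from differentiating $we^{w}$, and the iteration limit of 50, the step tolerance and the variable names are assumptions.

```matlab
xx = 0:0.05:10;
w  = zeros(size(xx));
wk = 0;                                   % warm start from the previous solution
for i = 1:numel(xx)
    for it = 1:50
        % Newton step for w*exp(w) - x = 0
        step = (wk*exp(wk) - xx(i)) / (exp(wk)*(wk + 1));
        wk = wk - step;
        if abs(step) < 1e-14, break; end
    end
    w(i) = wk;
end
res = max(abs(w.*exp(w) - xx));           % largest residual over the grid
disp(res)
```

The vector w can be drawn with plot(xx,w) and compared with the curves from the two approaches above. The denominator stays positive for $w \geq 0$, which is the range covered by this grid.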
Example 6.10 Consider again the equation $e^{-3t}\sin(4t+2) + 4e^{-0.5t}\cos 2t = 0.5$ defined in Example 6.1. Find the solutions using numerical methods for a better accuracy.
FIGURE 6.5: Solution of the Lambert function
Solution Using the solve() function
>> syms t x; solve(exp(-3*t)*sin(4*t+2)+4*exp(-0.5*t)*cos(2*t)-0.5)
it can be found that the solution is t = 0.67374570500134756702960220427474. The nonlinear equation is transcendental and has no analytical solution, and solve() returns a single numerical solution. The graphical method shown in Example 6.1 can be used to locate the other solutions, although the accuracy it achieves is not very high. Starting from the approximate solution t = 3.5203 obtained by the graphical approach, better results can be obtained by directly using the fsolve() function.
By combining the graphical and numerical methods, it can be seen that a better solution can be found
>> y=@(t)exp(-3*t).*sin(4*t+2)+4*exp(-0.5*t).*cos(2*t)-0.5;
ff=optimset; [t,f]=fsolve(y,3.5203,ff)
such that t = 3.52026389294877 and $f = -6.06378\times10^{-10}$. The solution found is much more accurate than the one from the graphical method. To get even better approximations, one can further modify the control options with the following statements
>> ff=optimset; ff.TolX=1e-16; ff.TolFun=1e-30;
[t,f]=fsolve(y,3.5203,ff)
and the new solution is t = 3.52026389244155 with f = 0.
6.1.4 Nonlinear matrix equations
In Section 4.4.4, a special form of nonlinear matrix equation, the algebraic Riccati equation, is discussed. However, the solution is based on a very specialized algorithm, which cannot be extended to other forms of nonlinear matrix equations. For instance, if the equation is changed to
$$AX + XD - XBX + C = 0 \tag{6.1}$$
or even a trickier form
$$AX + XD - XBX^{T} + C = 0 \tag{6.2}$$
the are() function is no longer applicable. So here, a nonlinear matrix equation solution method is given for solving general nonlinear matrix equations.
Example 6.11 Consider again the Riccati equation in Example 4.45, where the matrices are given below
$$A=\begin{bmatrix}-2&1&-3\\-1&0&-2\\0&-1&-2\end{bmatrix},\quad B=\begin{bmatrix}2&2&-2\\-1&5&-2\\-1&1&2\end{bmatrix},\quad C=\begin{bmatrix}5&-4&4\\1&0&4\\1&-1&5\end{bmatrix}.$$
Solution The unknown variable X in the Riccati equation is a matrix; however, the fsolve() function can only deal with unknown vectors. Thus the first thing to do is to transform the matrix X into a vector x. The simplest way is x = X(:), i.e., rearranging the matrix columnwise into a vector. On the other hand, when describing the Riccati equation, the best way is to keep using the matrix X. Thus within the function, one should restore the matrix X using the reshape() function. The function describing the errors in the Riccati equation can be written as
function y=new_are(x,A,B,C)
X=reshape(x,size(A)); y1=A'*X+X*A-X*B*X+C; y=y1(:);
where A, B, and C can be regarded as additional arguments. From the equation function, the Riccati equation can be solved with another function written as
function X=solve_are(A,B,C,x0)
if nargin==3, x0=rand(size(A)); end
x=fsolve(@new_are,x0(:),[],A,B,C); X=reshape(x,size(A));
It can be seen from the new solver that a random initial matrix x0 is generated when no initial value is supplied.
The following statements can be used to solve the Riccati equation
>> A=[-2,1,-3; -1,0,-2; 0,-1,-2]; B=[2,2,-2; -1 5 -2; -1 1 2];
C=[5 -4 4; 1 0 4; 1 -1 5]; X=solve_are(A,B,C), norm(A'*X+X*A-X*B*X+C)
It is surprising that, apart from the solution in Example 4.45, another solution may also be found, and it is verified that the error norm is $1.0406\times10^{-15}$
$$X=\begin{bmatrix}-0.1538&0.10866&0.46226\\2.0277&-1.7437&1.3475\\1.9003&-1.7513&0.50571\end{bmatrix}.$$
Since the Riccati equation is quadratic in X, it may have more than one solution, just as a scalar quadratic equation does. However, the new solution cannot be found using the are() function.
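The extra arguments A, B, C can also be supplied through anonymous functions that capture them, which avoids passing them through fsolve(). The following sketch is an alternative formulation, not the solver used in this example: it starts from the five-digit solution printed above in order to refine and verify it, and the names R, r, X0 and the tolerance values are assumptions.

```matlab
A = [-2,1,-3; -1,0,-2; 0,-1,-2];
B = [2,2,-2; -1,5,-2; -1,1,2];
C = [5,-4,4; 1,0,4; 1,-1,5];
n = size(A, 1);

R = @(X) A'*X + X*A - X*B*X + C;               % matrix residual of the equation
r = @(x) reshape(R(reshape(x, n, n)), [], 1);  % vector form required by fsolve

% Printed five-digit solution used as the initial point
X0 = [-0.1538 0.10866 0.46226; 2.0277 -1.7437 1.3475; 1.9003 -1.7513 0.50571];

opt = optimset('Display', 'off', 'TolFun', 1e-14, 'TolX', 1e-14);
x = fsolve(r, X0(:), opt);
X = reshape(x, n, n);
disp(norm(R(X)))
```

Replacing X0 by rand(n) reproduces the random-start behaviour of solve_are(), in which case repeated runs may return different solutions.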
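Example 6.12 Now consider the new Riccati-like equation given in (6.2), where
$$A=\begin{bmatrix}2&1&9\\9&7&9\\6&5&3\end{bmatrix},\quad B=\begin{bmatrix}0&3&6\\8&2&0\\8&2&8\end{bmatrix},\quad C=\begin{bmatrix}7&0&3\\5&6&4\\1&4&4\end{bmatrix},\quad D=\begin{bmatrix}3&9&5\\1&2&9\\3&3&0\end{bmatrix}.$$
Find and verify all the possible solutions.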
Solution To date, there are no existing algorithms for solving such types of equations. Similar to the above example, an M-function can be written to describe the equation
function y=new_are1(x,A,B,C,D)
X=reshape(x,size(A)); y1=A*X+X*D-X*B*X.'+C; y=y1(:);
where A, B, C and D are additional arguments. From the equation function, the Riccati-like equation can be solved with the following function:
function X=solve_are1(A,B,C,D,x0)
if nargin==4, x0=rand(size(A)); end
x=fsolve(@new_are1,x0(:),[],A,B,C,D); X=reshape(x,size(A));
Through repeated runs of the following statements
>> A=[2,1,9; 9,7,9; 6,5,3]; B=[0,3,6; 8,2,0; 8,2,8];
C=[7,0,3; 5,6,4; 1,4,4]; D=[3,9,5; 1,2,9; 3,3,0];
X=solve_are1(A,B,C,D), norm(A*X+X*D-X*B*X.'+C)
three solutions can be found. Note that $X_2$ is more difficult to find.
$$X_1=\begin{bmatrix}1.7539&1.2408&-0.00023348\\2.2114&3.3662&-0.72222\\0.86565&1.8109&-0.26194\end{bmatrix},\quad X_2=\begin{bmatrix}6.74&-0.36997&0.8394\\-1.7679&-0.25863&1.4835\\1.7761&-0.39974&1.0043\end{bmatrix},$$
$$X_3=\begin{bmatrix}-0.43386&0.40803&0.14075\\1.3621&-2.5373&1.4561\\-1.0243&0.97048&-1.0438\end{bmatrix}.$$