MATH2070: LAB 3: Roots of Equations, Python version

(1)

MATH2070: LAB 3: Roots of Equations, Python version

1 Introduction

The root finding problem is the task of finding one or more values for a variable x so that an equation f (x) = 0

is satisfied. Denote the desired solution as x0. As a computational problem, we are interested in effective and

efficient ways of finding a value x that is “close enough” to x0. There are two common ways to measuring

“close enough.”

1. The “residual error,” |f (x)|, is small, or

2. The “approximation error” or “true error,” |x − x0| is small.

Of course, the true error is not known, so it must be approximated in some way. In this lab we will look at some nonlinear equations and some simple methods of finding an approximate solution. We will use an estimate of the true error when it is readily available and will use the residual error the rest of the time.

You will see in this lab that error estimates can be misleading and there are many ways to adjust and improve them. We will not examine more sophisticated error estimation in this lab.

2 Programming style

The Python coding style used in these labs has an underlying consistency and is one of the easiest to read, understand and debug. Some of the style rules followed in these labs includes:

• One statement per line, with minor exceptions. • Significant variables are named with long names.

• Loop counters and other insignificant variables have short names, such as k, m or n. • The interior blocks of loops and if-tests must be indented.

• Function files always have comments following the initial declaration.

Clarity is important for debugging because if code is harder to read then it is harder to debug. Debugging is one of the most difficult and least rewarding activities you will engage in, so anything that simplifies debugging pays off. Clarity is also important when others read your code. Since I need to read your code in order to give you a grade, please format your code similarly to that in the labs.

3 A Sample Problem

Suppose we want to know if there is a solution to

cos x = x ,

whether the solution is unique, and what its value is. This is not an algebraic equation, and there is little hope of forming an explicit expression for the solution.

Since we can’t solve this equation exactly, it’s worth knowing whether there is anything we can say about it. In fact, if there is a solution x0 to the equation, we can probably find a computer representable number

x which approximately satisfies the equation (has low residual error) and which is close to x0 (has a low

(2)

Exercise 1: The following steps show how to plot the functions y = cos x and y = x together in the same graph from −π to π.

(a) Import numpy and call it np; (b) Import matplotlib and call it plt;

(c) Define a vector of x values with the appropriate range; (d) Define vectors y1 and y2 by evaluating functions at x; (e) Use the plot() command twice, for (x,y1) and (x,y2). (f) Use the show() command to make the plot appear; (g) Save a copy of the graph using the savefig() command. A basic version of your Python script might look like this: 1 import m a t p l o t l i b . p y p l o t a s p l t 2 import numpy a s np 3 4 x = np . l i n s p a c e ( −np . p i , np . p i , 100 ) 5 y1 = x # We w i l l i g n o r e a s u b t l e Python e r r o r h e r e ! 6 y2 = np . c o s ( x ) 7 8 p l t . p l o t ( x , y1 ) 9 p l t . p l o t ( x , y2 ) 10 p l t . show ( ) 11 12 p l t . s a v e f i g ( ’ e x e r c i s e 1 . j p g ’ )

You might save your script as exercise1.py. On Ubuntu, I can execute it by the command python3 exercise1.py. When you see the plot, notice that the two lines intersect. You can even estimate the value of x where this occurs.

As a report for this exercise, state your estimated value of the root, and include a copy of your plot. Let’s reorganize our investigation so that we can think of it as seeking a root x of a function f () for which we have f (x) = 0.

Exercise 2:

We begin by packaging our function as an actual Python function. Recall the structure of a Python function:

(a) A declaration statement of the form: def function name ( list of arguments ): (b) indented computational statements

(c) indended a return statement that reports the function value; Your Python function might be stored as cosmx.py, and have the form: 1 def cosmx ( x ) :

2 import numpy a s np 3 v a l u e = np . c o s ( x ) − x 4 return v a l u e

To test out your function, evaluate it at x = 0.5. Here’s how I do it: 1 python3

2 >>from cosmx import cosmx 3 >>cosmx ( 0 . 5 )

(3)

Your result should be about 0.37758.

Now let’s use graphics again, but this time, our two functionw will be y1 = cosmx(x) and y2 = 0. (a) From your previous script exercise1.py, make a fresh copy exercise2.py;

(b) To access your function, you will need to insert the statement from cosmx import cosmx

(c) To create the y2 data, you can’t simply write y2=0, which creates a single zero value. Instead, use the command y2 = np.zeros ( 100 ) to create a zero vector that is the same length as x. (d) After your plot statements, you should add the statement np.grid ( True ), so that we will see

helpful grid lines.

(e) Make sure you modify your savefig() statement too!

Does the value of x at the point where the curve crosses the x-axis agree with the value you saw in the previous exercise?

Send me the plot file as exercise2.jpg.

4 The Bisection Idea

The idea of the bisection method is very simple. We assume that we are given two values x = a and x = b (a < b), and that the function f (x) is positive at one of these values (say at a) and negative at the other. When this occurs, we say that [a, b] is a “change-of-sign interval” for the function. Assuming f is continuous, we know that there must be at least one root in the interval.

Intuitively speaking, if you divide a change-of-sign interval in half, one or the other half must end up being a change-of-sign interval, and it is only half as long. Keep dividing in half until the change-of-sign interval is so small that any point in it is within the specified tolerance.

More precisely, consider the point x = (a + b)/2. If f (x) = 0 we are done. (This is pretty unlikely.) Otherwise, depending on the sign of f (x), we know that the root lies in [a, x] or [x, b]. In any case, our change-of-sign interval is now half as large as before. Repeat this process with the new change of sign interval until the interval is sufficiently small and declare victory.

We are guaranteed to converge. We can even compute the maximum number of steps this could take, because if the original change-of-sign interval has length ` then after one bisection the current change-of-sign interval is of length `/2, etc. We know in advance how well we will approximate the root x0. These are very

powerful facts, which make bisection a robust algorithm—that is, it is very hard to defeat it. Exercise 3:

(a) If we know the start points a and b and the interval size tolerance , we can predict beforehand the number of steps required to reach the specified accuracy. The bisection method will always find the root in that number or fewer steps. What is the formula for that number?

(b) Give an example of a continuous function that has only one root in the interior of the interval [-2, 1], but for which bisection could not be used.

5 Programming Bisection

Before we look at a sample bisection program, let’s discuss some programming issues. If you haven’t done much programming before, this is a good time to try to understand the logic behind how we choose variables to set up, what names to give them, and how to control the logic.

(4)

Another reasonable recommendation for using functions is that by breaking up a calculation into in-dependent portions, we can hide the algorithm details and temporary variables. The names and values of variables used within the function will disappear once the function is complete, except for results reported by the return statement. This reduces the chance that we will accidentally use the same name in two different parts of a single huge program, which can cause confusion and error.

The bisection algorithm can be expressed in the following way. (Note: the product f (a) · f (b) is negative if and only if f (a) and f (b) are of opposite sign and neither is zero.)

Given a function f : R → R and a, b ∈ R so that f (a) · f (b) < 0, then sequences a(k) _{and b}(k) _{for k = 0, . . .}

with f (a(k)_{) · f (b}(k)_{) < 0 for each k can be constructed by starting out with a}(1)_{= a, b}(1) _{= b then, for each}

k > 0,

1. Set x(k)_{= (a}(k)_{+ b}(k)_)/2.

2. If |xk_{− b}(k)_{| is small enough, or if f (x}(k)_{) = 0 exit the algorithm.}

3. If f (x(k)_{) · f (a}(k)_{) < 0, then set a}(k+1)_{= a}(k) _{and b}(k+1)_{= x}(k)_.

4. If f (x(k)) · f (b(k)) ≤ 0, then set a(k+1)= x(k)and b(k+1)= b(k). 5. Return to step 1.

The bisection algorithm can be described in a manner more appropriate for computer implementation: Algorithm Bisect(f ,a,b,x,)

1. Set x:=(a + b)/2.

2. If |b − a| ≤ , or if f (x) happens to be exactly zero, then accept x as the approximate root, and exit. 3. If sign(f (b)) * sign(f (x)) < 0, then a:=x; otherwise, b:=x.

4. Return to step 1.

The second version of the algorithm is expressed in a form that is easily translated into a loop, with an explicit exit condition. To check the sign condition, the algorithm employs a sign() function rather than looking at the value of f (a) ∗ f (b). This is a better strategy because of roundoff errors if very small quantities are involved. In Python, there is a smallest positive real number, which we might nickname as “tiny”. Its value can be returned by the Python value numpy.finfo(float).tiny. If f (a) > 0 and f (b) > 0 are each less than√tiny, then their product is zero, not positive.

The language of this description, an example of “pseudocode,” is based on a computer language called Algol but is intended merely to be a clear means of specifying algorithms in books and papers. The term “pseudocode” is used for any combination of a computer coding language with natural language. One objective of this lab is to show how to convert pseudocode into a workable Python program. As a starting point, let’s fix f (x) to be the function cosmx that you just wrote. Later you will learn how to add it to the calling sequence. Let’s also fix = 10−10.

Here is a function that carries out the bisection algorithm for our cosmx() function. 1 def b i s e c t c o s m x ( a , b ) : 2 3 ””” b i s e c t c o s m x ( a , b ) u s e s b i s e c t i o n f o r a r o o t o f cosmx ( x ) = 0 . 4 5 I n p u t : 6 7 a , b : l e f t and r i g h t e n d s o f a c h a n g e o f s i g n i n t e r v a l . 8 9 Output : 10 11 x : t h e e s t i m a t e d r o o t . 12 13 i t C o u n t : t h e number o f b i s e c t i o n s . 14 ”””

(5)

17 # 18 # I n i t i a l i z a t i o n . 19 # 20 EPSILON = 1 . 0 e −10; 21 f a = cosmx ( a ) 22 f b = cosmx ( b ) 23 24 i t C o u n t = 0 25 # 26 # Loop 27 # 28 while ( True ) : 29 30 x = ( a + b ) / 2 31 f x = cosmx ( x ) 32 33 print ( i t C o u n t , ’ : ’ , \ 34 ’ f ( ’ , a , ’ ) = ’ , f a , \ 35 ’ f ( ’ , x , ’ ) = ’ , f x , \ 36 ’ f ( ’ , b , ’ ) = ’ , f b ) 37 38 i f ( f x == 0 ) : 39 return x , i t C o u n t 40 e l i f ( abs ( b − a ) < EPSILON ) : 41 return x , i t C o u n t 42 43 i t C o u n t = i t C o u n t + 1 44 45 i f ( np . s i g n ( f a ) ∗ np . s i g n ( f x ) <= 0 ) : 46 b = x 47 f b = f x 48 e l s e : 49 a = x 50 f a = f x

There are some significant differences between the pseudocode and the Python implementation. The algorithm generates a sequence of endpoints {ai} and {bi} and midpoints {xi}. The code keeps track of only

the most recent endpoint and midpoint values. This is a typical strategy in writing code—use variables to represent only the most recent values of a sequence when the full sequence does not need to be retained.

Note that in the expression fx == 0 a double equals sign is used. In Python, a single equal sign indicates assignment, which might better have been written as <=. It transfers a value. Whenever we are asking whether two items are equal, we need to use the double equal sign instead. This is a tricky distinction, and if you make the wrong choice, sometimes the compiler will complain, but sometimes it will simply run and compute nonsense.

Finally, note the text at the beginning of the function that is delimited by triple quotes. In Python, this is known as a docstring. Because the function has a docstring, a user can ask for information about it by typing help ( bisect cosmx ). You are encouraged to write a docstring for your Python functions. However, Python is a little fussy about the format of a docstring:

• It must be the first item following the function statement; • It must begin and end with triple double quotes;

• The first line should describe the purpose of the function;

• If there is more than one line, the second should be blank, and then you should describe the input and output variables, and any other information a user might find helpful.

Exercise 4:

(6)

(a) In your own words, what does the np.sign(x) function do? What if x is 0?

(b) The print() command is used only to monitor progress. Note the use of \ as the continuation character, which allows the print() statement to extend beyond a single line.

(c) What would the final value of itCount be if a and b are already closer together than the tolerance when the function starts?

(d) Try the command:

[x,itCount] = bisect_cosmx ( 0, 3 )

How many iterations were required? Is this value smaller than the value from your formula in Exercise 3?

(e) Why did I use abs(b-a) in the convergence test instead of simply writing b-a? (f) The program may seem to be working, but try this command:

[x,itCount] = bisect_cosmx ( 0, 0.5 )

What do you notice about the result? Apparently, in this case, the input values are not be suitable for the bisection method. Rather than carry out a worthless computation, the function should check the input and refuse to process it if it is illegal. Such a check can be made by code like this:

if ( condition ):

raise Exception ( "The input is illegal!" )

Replace condition by the appropriate computational check. Perhaps you should also replace the error message by a more understandable message. Rerun your program and verify that it will correctly process the input (0, 3) and reject (0, 0.5)

6 Functions as Input

Now we have written the file bisect cosmx.py that can find a root of f(x)=cos(x)-x, but it can’t handle any other function. Just as we are allowed to vary the endpoints a and b as input quantities, we’d like to be able to be able to specify as an input quantity the function whose root we are seeking.

Exercise 5:

Copy the file bisect cosmx.py to a new file named bisect.py.

(a) In the declaration line for bisect(), add one more input argument. We don’t know what the user calls the function, but we have to make up a name for it in the argument list, so we’ll just call it f.

(b) If you used a docstring to list your input arguments, include a statement about your new input quantity f.

(c) The import cosmx statement is no longer appropriate. Remove it.

(d) There are three places where we evaluate the function with a statement like fa = cosmx ( a ) In each such place, replace cosmx() by f().

Our first test of the new code simply repeats the analysis of cosmx(). However, now, we have to import that function during the Python session, and then include it on the argument list to bisect(). On my system, this might look like:

1 python3

2 >>from cosmx import cosmx 3 >>from b i s e c t import b i s e c t

(7)

For our second test, we need a new function. Using the model of cosmx.py, write a Python function called poly.py that evaluates the function: f (x) = (x − 5)(32x − 7)(x + 1). Call bisect() using the initial change of sign interval [0, 1]. We have already seen that we can predict the maximum number of steps necessary to shrink the interval below the limit of EPSILON. However, for this example, the program may stop early, and successfully. Can you explain?

Exercise 6:

The following table contains formulas and search intervals for several functions. Write function files, f1.py through f5.py, and use bisect() to find a root in the given interval.

Name Formula Interval approxRoot No. Steps

f1 x^2-9 [0,5] _________ _________

f2 x^5-x-1 [1,2] _________ _________

f3 x*exp(-x) [-1,2] _________ _________

f4 2*cos(3*x)-exp(x) [0,6] _________ _________

f5 (x-1)^5 [0,3] _________ _________

7 The secant method

The bisection algorithm allows the user to predict in advance the number of iterations necessary to reach the interval tolerance. For root-finding algorithms that do not simply shrink the interval, it can be harder to determine whether the computation should be continued. One common, but not always reliable, measure is the magnitude of the residual, which we might compare to a tolerance as in |f (x)| ≤ ). Our next root finder algorithm, the secant method, will use a test like this for convergence.

Just like the bisection method, the secant method starts out with two points a and b. It does not require that these points constitute a change of sign interval. The algorithm simply evaluates the function at those two values, constructs the straight line though (a,f(a)) and (b,f(b)) and sets c to the argument where the straight line crosses the axis. The formula is simply:

c = f (b)a − f (a)b f (b) − f (a)

If f () was actually a linear function, this would be the root. In general, it’s not going to be the root. We can’t even be sure that f (c) is smaller than f (a) or f (b). Nonetheless, we now discard a. We use the pair of values b and c to construct a new linear function, and we find its zero at d.

We can continue this process indefinitely, but of course we hope that the sequence of values is converging to a root, and we want to know whether we are getting close, or whether we need to cancel the process because we have taken too many steps, or the function values are blowing up.

The secant method can be described in the following manner. Algorithm Secant(f ,a,b,x,)

1. Set

x = f (b) ∗ a − f (a) ∗ b f (b) − f (a) 2. If |f (x)| ≤ , return x as the approximate root.

3. Replace a with b and b with x. 4. Go back to step 1.

(8)

Exercise 7:

Write a Python function named secant.py. You might start from the text of bisect.py. The function should begin as follows:

1 def s e c a n t ( a , b , f ) : 2 ””” s e c a n t ( a , b , f ) a p p l i e s t h e s e c a n t method f o r a r o o t o f f ( x ) 3 . . . 4 ””” 5 # I n i t i a l i z e 6 7 ? ? ? S e t f a , f b ? ? ? 8 9 EPSILON = 1 . 0 e −10 10 ITMAX = 1000 11 12 # Loop 13 14 f o r i t C o u n t in range ( 0 , ITMAX ) : 15 16 ? ? ? D e t e r m i n e x , e v a l u a t e f ( x ) 17 18 i f ( c o n d i t i o n ) 19 return x , i t C o u n t 20 21 ? ? ? a <− b , f a <− f b , b <− x , f b <− f x 22 23 # E x i t e d from l o o p w i t h o u t c o n v e r g e n c e 24

25 r a i s e E x c e p t i o n ( ”No c o n v e r g e n c e a f t e r maximum number o f i t e r a t i o n s . ” )

You will need to replace the sections marked ??? by the appropriate Python statements. Notice that our loop section now uses a Python for statement, set up to run no more than 1000 iterations. If we don’t leave the loop with convergence, then we trigger a final error message.

Compare the bisection and secant methods by filling in the following table.

Bisection Bisection Secant Secant

Name Formula Interval approxRoot No. Steps approxRoot No. Steps

f1 x^2-9 [0,5] _________ _________ _________ _________ f2 x^5-x-1 [1,2] _________ _________ _________ _________ f3 x*exp(-x) [-1,2] _________ _________ _________ _________ f3 x*exp(-x) [-1.5,1] _________ _________ _________ _________ f4 2*cos(3*x)-exp(x) [0,6] _________ _________ _________ _________ f5 (x-1)^5 [0,3] _________ _________ _________ _________

In the above table, you can see that the secant method can be either faster or slower than bisection. You may also observe convergence failures: either convergence to a value that is not near a root or convergence to a value substantially less accurate than expected. Regarding the bisection roots as accurate, are there any examples of convergence to a value that is not near a root in the table? If so, which? Are there any examples of inaccurate roots? If so, which?

8 The Regula Falsi method

(9)

than the secant method; however it is possible to find functions for which regula falsi does worse than either of the other two methods.

Assuming we begin with a change of sign interval [a, b], the algorithm can be described as follows: Algorithm Regula(f ,a,b,x,)

1. Set

x = f (b) ∗ a − f (a) ∗ b f (b) − f (a) 2. If |f (x)| ≤ , accept x as the approximate root, and return.

3. If f (x) has the same sign as f (a), replace a by x, otherwise replace b by x. 4. Return to step 1.

Exercise 8:

Starting from your bisect.py file, write a Python file named regula.py to carry out the regula falsi algorithm.

Fill in the following table, to compare Bisection, Secant, and Regula Falsi.

Bisection Secant Regula Falsi

Name Formula Interval approxRoot No. Steps No. Steps No. Steps

f1 x^2-9 [0,5] _________ _________ _________ _________ f2 x^5-x-1 [1,2] _________ _________ _________ _________ f3 x*exp(-x) [-1,2] _________ _________ _________ _________ f3 x*exp(-x) [-1.5,1] _________ _________ _________ _________ f4 2*cos(3*x)-exp(x) [0,6] _________ _________ _________ _________ f5 (x-1)^5 [0,3] _________ _________ _________ _________

You should observe both faster and slower convergence, compared with bisection and secant. You should not observe lack of convergence or convergence to an incorrect solution, except for function f5().

Function f5, (x-1)^5 turns out to be very difficult for regula falsi! Loosen your convergence criterion to a tolerance of EP SILON = 10−6 and increase the maximum allowable number of iterations to IT M AX = 500, 000 and try again.

SECOND CHANCE:

Regula

Name Formula Interval approxRoot No. Steps

f5 (x-1)^5 [0,3] _________ _________

You should observe convergence, but it takes a very large number of iterations.