Software Eng 3X03 Final Exam Notes Chaput


Ch 3 – Interpolation

Intro

Given $(x_0, y_0), (x_1, y_1), \dots, (x_n, y_n)$ with $x_0 < x_1 < \dots < x_n$, construct a function $f$ such that $f(x_i) = y_i$, $i = 0, 1, \dots, n$.

Desirable properties of f:

- Smooth: analytic and $|f''(x)|$ not too large (the first and second derivatives are continuous)
- Simple: polynomial of minimum degree, easy to evaluate

Polynomial Interpolation

Advantages: Easy to evaluate and differentiate

Weierstrass Approximation Theorem:

“If f is any continuous function on the finite closed interval [a,b], then for every ε > 0 there exists a polynomial $p_n(x)$ of degree n = n(ε) such that $\max_{x \in [a,b]} |f(x) - p_n(x)| < \varepsilon$.”

Impractical (degree often too high).

Straightforward approach

A polynomial of degree n is determined by its n+1 coefficients. Given $(x_0, y_0), \dots, (x_n, y_n)$ to be interpolated, we construct the Vandermonde system:

$$\begin{bmatrix} 1 & x_0 & \cdots & x_0^n \\ 1 & x_1 & \cdots & x_1^n \\ \vdots & \vdots & & \vdots \\ 1 & x_n & \cdots & x_n^n \end{bmatrix} \begin{bmatrix} a_0 \\ a_1 \\ \vdots \\ a_n \end{bmatrix} = \begin{bmatrix} y_0 \\ y_1 \\ \vdots \\ y_n \end{bmatrix}$$

Then solve for the coefficients of the polynomial

$$p_n(x) = a_0 + a_1 x + \cdots + a_n x^n$$

When $x_0, \dots, x_n$ are distinct, the Vandermonde matrix is non-singular. The system has a unique solution, which gives the coefficients of the interpolating polynomial.

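Example: Given two points (28, 0.4695) and (30, 0.5000), we have the system

$$\begin{bmatrix} 1 & 28 \\ 1 & 30 \end{bmatrix} \begin{bmatrix} a_0 \\ a_1 \end{bmatrix} = \begin{bmatrix} 0.4695 \\ 0.5000 \end{bmatrix}$$

and the solution

$$\begin{bmatrix} a_0 \\ a_1 \end{bmatrix} = \frac{1}{2} \begin{bmatrix} 30 & -28 \\ -1 & 1 \end{bmatrix} \begin{bmatrix} 0.4695 \\ 0.5000 \end{bmatrix} = \begin{bmatrix} 0.04250 \\ 0.01525 \end{bmatrix}$$

The Vandermonde matrix is often ill-conditioned (very high condition number, defined in the glossary on page 9), leading to computational inaccuracies.

Lagrange Form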

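Basis of polynomials: $\{l_j(x)\}$ $(j = 0, 1, \dots, n)$ of degree n such that

$$l_j(x_i) = \begin{cases} 1, & \text{if } i = j \\ 0, & \text{otherwise} \end{cases}$$

Construct

$$l_j(x) = \prod_{i \ne j} \frac{x - x_i}{x_j - x_i}$$

Thus

$$p_n(x) = \sum_{j=0}^{n} l_j(x)\, y_j$$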

Example: Given three points: (28, 0.4695), (30, 0.5000), (32, 0.5299), construct a second degree polynomial in the Lagrange form:

$$p_2(x) = \frac{(x - 30)(x - 32)}{(28 - 30)(28 - 32)}\, 0.4695 + \frac{(x - 28)(x - 32)}{(30 - 28)(30 - 32)}\, 0.5000 + \frac{(x - 28)(x - 30)}{(32 - 28)(32 - 30)}\, 0.5299$$

$p_2(31) = 0.5150 \approx \sin(31°)$

Hard to evaluate.
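The three-point example can be checked numerically. The following is a minimal Python sketch, not part of the original notes; the function name `lagrange_eval` is chosen here for illustration. It evaluates the Lagrange form directly from the definition of the basis polynomials.

```python
def lagrange_eval(xs, ys, x):
    """Evaluate the Lagrange-form interpolating polynomial at x."""
    total = 0.0
    for j, (xj, yj) in enumerate(zip(xs, ys)):
        # Build l_j(x) = prod_{i != j} (x - x_i) / (x_j - x_i).
        lj = 1.0
        for i, xi in enumerate(xs):
            if i != j:
                lj *= (x - xi) / (xj - xi)
        total += lj * yj
    return total


# Data from the example: sine values at 28, 30 and 32 degrees.
xs = [28.0, 30.0, 32.0]
ys = [0.4695, 0.5000, 0.5299]
p31 = lagrange_eval(xs, ys, 31.0)
print(p31)
```

Each evaluation costs O(n^2) operations in this form, which is one way to read the remark that the Lagrange form is hard to evaluate.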

Horner’s rule

Evaluating $a_0 x^3 + a_1 x^2 + a_2 x + a_3$

Horner’s form: $((a_0 x + a_1)x + a_2)x + a_3$

v = a(0);
for (i = 1:n)
  v = v*x + a(i);
end
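The same loop as a small runnable Python sketch (an illustration, not from the original notes). It assumes the coefficients are stored highest degree first, as in the pseudocode above; the name `horner` is hypothetical.

```python
def horner(coeffs, x):
    """Evaluate a0*x**n + a1*x**(n-1) + ... + an, with coeffs = [a0, ..., an]."""
    v = coeffs[0]
    for a in coeffs[1:]:
        v = v * x + a      # one multiplication and one addition per coefficient
    return v


# 2x^3 - 3x^2 + 0x + 5 evaluated at x = 2
value = horner([2.0, -3.0, 0.0, 5.0], 2.0)
print(value)
```

A degree-n polynomial needs n multiplications and n additions this way.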

The optimal way of evaluating $a_0 x^n + \dots + a_n$: most efficient (fewest operations) and accurate.

Danger of polynomial interpolation:

It is often best not to use global polynomial interpolation.

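Piecewise Polynomial Interpolation

Given the partition

$$\alpha = x_1 < x_2 < \cdots < x_n = \beta,$$

interpolate on each $[x_i, x_{i+1}]$ with a low degree polynomial.

Linear

$$L_i(z) = a_i + b_i (z - x_i), \quad z \in [x_i, x_{i+1}]$$

$$a_i = y_i, \quad b_i = \frac{y_{i+1} - y_i}{x_{i+1} - x_i}, \quad 1 \le i \le n - 1$$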

Given vectors x and y with interpolating points, this function returns the piecewise linear interpolation coefficients in vectors a and b.

function [a,b] = pwL(x,y)
  n = length(x);
  a = y(1:n-1);
  b = diff(y)./diff(x);

Evaluation

First, locate $[x_i, x_{i+1}]$ such that $z \in [x_i, x_{i+1}]$. Then, evaluate L(z) using $L_i(z)$.

Search by binary search (xi are sorted).

Algorithm: Locate

Use previous subinterval as guess, if not, binary search. Included on page 9.

Given piecewise linear interpolation coefficient vectors a, b from pwL and its breakpoints in x, this function returns the values of the interpolant evaluated at the points in z.

function v = pwLEval(a,b,x,z)
  m = length(z);
  v = zeros(m,1);
  g = 1;
  for j=1:m
    i = Locate(x,z(j),g);
    v(j) = a(i) + b(i)*(z(j) - x(i));
    g = i;
  end
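For comparison, a Python sketch of the same two steps. This is an illustration under assumptions: the names `pw_linear_coeffs` and `pw_linear_eval` are hypothetical, and the standard-library `bisect_right` replaces the notes' `Locate` routine, so every lookup is a plain binary search without the previous-subinterval guess.

```python
from bisect import bisect_right


def pw_linear_coeffs(x, y):
    """Return a_i = y_i and b_i = (y_{i+1} - y_i) / (x_{i+1} - x_i)."""
    a = list(y[:-1])
    b = [(y[i + 1] - y[i]) / (x[i + 1] - x[i]) for i in range(len(x) - 1)]
    return a, b


def pw_linear_eval(a, b, x, z):
    """Evaluate the piecewise linear interpolant at every point of z."""
    values = []
    for zj in z:
        i = bisect_right(x, zj) - 1          # binary search on the sorted breakpoints
        i = min(max(i, 0), len(a) - 1)       # clamp so both end points map to a valid piece
        values.append(a[i] + b[i] * (zj - x[i]))
    return values


x = [0.0, 1.0, 3.0]
y = [0.0, 2.0, 0.0]
a, b = pw_linear_coeffs(x, y)
vals = pw_linear_eval(a, b, x, [0.5, 2.0, 3.0])
print(vals)
```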

Piecewise Cubic Interpolation (Cubic Spline)

Properties

- In each subinterval $[x_i, x_{i+1}]$, s(x) is cubic
- $s(x_i) = y_i$, $i = 1, \dots, n$
- $s'(x)$ and $s''(x)$ are continuous at the interior points $x_2, \dots, x_{n-1}$
- $s''(x_1) = s''(x_n) = 0$, meaning s(x) is linear at the end points.

Conrad’s note: Second derivative commonly set to zero at the endpoints, provides a boundary condition that completes the system of equations. Not the only choice possible, other boundary conditions could be used instead.

Straightforward Approach

Suppose $s(x) = a_i + b_i x + c_i x^2 + d_i x^3$ on $[x_i, x_{i+1}]$, $i = 1, \dots, n-1$. There are $4(n-1)$ unknowns to be determined.

Interpolation:

$$a_i + b_i x_i + c_i x_i^2 + d_i x_i^3 = y_i, \quad i = 1, \dots, n-1$$

$$a_i + b_i x_{i+1} + c_i x_{i+1}^2 + d_i x_{i+1}^3 = y_{i+1}, \quad i = 1, \dots, n-1$$

Continuous first derivative (consider $[x_{i-1}, x_i]$ and $[x_i, x_{i+1}]$):

$$b_{i-1} + 2c_{i-1}x_i + 3d_{i-1}x_i^2 = b_i + 2c_i x_i + 3d_i x_i^2, \quad i = 2, \dots, n-1$$

Continuous second derivative:

$$2c_{i-1} + 6d_{i-1}x_i = 2c_i + 6d_i x_i, \quad i = 2, \dots, n-1$$

Two end conditions:

$$2c_1 + 6d_1 x_1 = 0 \quad \text{and} \quad 2c_{n-1} + 6d_{n-1}x_n = 0$$

Total of $4(n-1)$ equations, a dense system.

Clever Approach

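In the subinterval $[x_i, x_{i+1}]$, let $h_i = x_{i+1} - x_i$ and introduce new variables:

$$w = (x - x_i)/h_i, \quad \bar{w} = 1 - w$$

Note: $w(x_i) = 0$, $w(x_{i+1}) = 1$, and $\bar{w}(x_i) = 1$, $\bar{w}(x_{i+1}) = 0$ (linear Lagrange polynomials).

Thus $w y_{i+1} + \bar{w} y_i$ is the (linear) Lagrange interpolation on $[x_i, x_{i+1}]$ (it intrinsically satisfies the $s(x_i) = y_i$ condition).

Construct

$$s(x) = w y_{i+1} + \bar{w} y_i + h_i^2 \left[ (w^3 - w)\sigma_{i+1} + (\bar{w}^3 - \bar{w})\sigma_i \right]$$

where $\sigma_i$ is to be determined so that the properties are satisfied. Since

$$w' = 1/h_i, \quad \bar{w}' = -1/h_i,$$

we can verify:

1. $s(x_i) = y_i$, $i = 1, \dots, n$: s(x) interpolates $(x_i, y_i)$
2. $s''(x) = 6w\sigma_{i+1} + 6\bar{w}\sigma_i$: linear Lagrange interpolation at the points $(x_i, 6\sigma_i)$ and $(x_{i+1}, 6\sigma_{i+1})$

The first derivative is

$$s'(x) = \frac{y_{i+1} - y_i}{h_i} + h_i \left[ (3w^2 - 1)\sigma_{i+1} - (3\bar{w}^2 - 1)\sigma_i \right]$$

Let $\Delta_i = (y_{i+1} - y_i)/h_i$.

On $[x_i, x_{i+1}]$, $w(x_i) = 0$, $\bar{w}(x_i) = 1$, so

$$s'_+(x_i) = \Delta_i + h_i(-\sigma_{i+1} - 2\sigma_i)$$

On $[x_{i-1}, x_i]$,

$$s'(x) = \frac{y_i - y_{i-1}}{h_{i-1}} + h_{i-1} \left[ (3w^2 - 1)\sigma_i - (3\bar{w}^2 - 1)\sigma_{i-1} \right]$$

and $w(x_i) = 1$, $\bar{w}(x_i) = 0$. Thus

$$s'_-(x_i) = \Delta_{i-1} + h_{i-1}(2\sigma_i + \sigma_{i-1})$$

Setting $s'_+(x_i) = s'_-(x_i)$, $i = 2, 3, \dots, n-1$, we get $n - 2$ equations:

$$h_{i-1}\sigma_{i-1} + 2(h_{i-1} + h_i)\sigma_i + h_i\sigma_{i+1} = \Delta_i - \Delta_{i-1}, \quad i = 2, 3, \dots, n-1.$$

Solve for $\sigma_2, \dots, \sigma_{n-1}$, recalling that $\sigma_1 = \sigma_n = 0$ (natural cubic spline).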

Matrix form

Diagonal: $[2(h_1 + h_2), \dots, 2(h_{n-2} + h_{n-1})]$
Super/subdiagonal: $[h_2, \dots, h_{n-2}]$
Unknowns: $[\sigma_2, \dots, \sigma_{n-1}]^T$
Right-hand side: $[\Delta_2 - \Delta_1, \dots, \Delta_{n-1} - \Delta_{n-2}]^T$

The matrix is

- Symmetric
- Tridiagonal
- Diagonally dominant ($|a_{i,i}| > \sum_{j \ne i} |a_{i,j}|$), when $x_1 < x_2 < \cdots < x_n$

Can apply Gaussian elimination without pivoting, working on three vectors (two, by symmetry) with O(n) operations.

Note: Had we gone with the straightforward approach to determining the coefficients, we would have ended up with a large $4(n-1) \times 4(n-1)$ dense system requiring $O(n^3)$ operations.

Conrad to Shawn (Tron Facebook Group): This method is pretty unique, but it's clever. Using his ugly equations, he's basically summarized a, b, c and d from the previous slide into this equation as well as sigma.

See the first terms, the wy + wbar y, they're actually a Lagrange polynomial. This term naturally interpolates the original $(x_i, y_i)$ set. The next half of the equation, the uglier looking portion, is constructed so that the second derivative (which you can see on slide 24) has a Lagrange interpolation of the same form as the original wy + wbar y, which means s''(x) is continuous.

The remaining properties are that $s'(x)$ is continuous and $s''(x)$ is zero at the endpoints. He solves for the equations that make s'(x) continuous on slide 25. This is basically some algebra to specify the derivatives on the right side of the point are the same as those on the left (Meaning it's continuous, no "jumps" where it's like 5 on the left side of a point but 7 on the right side). This provides us with n-2 equations (we can't say anything about the endpoints being continuous, there's only one spline piece there, we can't set $s'_+ = s'_-$).

Next property, $s''(x)$ is zero at endpoints. Why does $s''(x)$ need to be zero at the end points? It's actually arbitrary. We need another two equations to solve the system (need n equations to solve for each of the n sigmas) and saying "It must be linear at the end points!" is a decent enough way to do it and is convention for a "natural" cubic spline. Anyway, since we know s''(x_i) = 6*sigma_i, setting sigma_1 and sigma_n to zero solves the above problem. By integrating the properties into the equation, you can set up the equations as a tridiagonal, sparse matrix and not the massive 4(n-1)*4(n-1) dense system that the straightforward approach requires.

If $s(x)$ is evaluated many times, arrange $s(x)$ so that

$$s(x) = y_i + b_i(x - x_i) + c_i(x - x_i)^2 + d_i(x - x_i)^3$$

for $x_i \le x \le x_{i+1}$, rearrange it in Horner’s form, and calculate and store $b_i, c_i, d_i$ (instead of $\sigma_i$):

$$b_i = \frac{y_{i+1} - y_i}{h_i} - h_i(\sigma_{i+1} + 2\sigma_i)$$

$$c_i = 3\sigma_i$$

$$d_i = \frac{\sigma_{i+1} - \sigma_i}{h_i}$$

for $i = 1, 2, \dots, n-1$.

Algorithm. Natural cubic spline Ncspline (page 9)

The algorithm computes the coefficients b, c, d of natural spline interpolation, given a vector x with breakpoints and a vector y with function values.

1. Compute $h_i$ and $\Delta_i$;
2. Form the tridiagonal matrix (two arrays) and the right-hand side;
3. Solve for $\sigma_i$;
4. Compute the coefficients b, c, and d.
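The four steps can be sketched in Python as follows. This is an illustrative sketch, not the Ncspline routine from page 9: the names `natural_cubic_spline` and `spline_eval` are hypothetical, indices are 0-based, and the interval lookup in `spline_eval` is a simple linear scan rather than a binary search.

```python
def natural_cubic_spline(x, y):
    """Return (b, c, d) with s(t) = y[i] + b[i]*u + c[i]*u**2 + d[i]*u**3,
    u = t - x[i], on [x[i], x[i+1]]."""
    n = len(x)
    # Step 1: interval lengths h_i and divided differences Delta_i.
    h = [x[i + 1] - x[i] for i in range(n - 1)]
    delta = [(y[i + 1] - y[i]) / h[i] for i in range(n - 1)]
    sigma = [0.0] * n                  # the two end values stay zero (natural spline)
    m = n - 2                          # number of interior unknowns
    if m > 0:
        # Step 2: tridiagonal matrix stored as two arrays, plus the right-hand side.
        diag = [2.0 * (h[i] + h[i + 1]) for i in range(m)]
        off = [h[i + 1] for i in range(m - 1)]
        rhs = [delta[i + 1] - delta[i] for i in range(m)]
        # Step 3: Gaussian elimination without pivoting, then back substitution.
        for i in range(1, m):
            factor = off[i - 1] / diag[i - 1]
            diag[i] -= factor * off[i - 1]
            rhs[i] -= factor * rhs[i - 1]
        sol = [0.0] * m
        sol[m - 1] = rhs[m - 1] / diag[m - 1]
        for i in range(m - 2, -1, -1):
            sol[i] = (rhs[i] - off[i] * sol[i + 1]) / diag[i]
        sigma[1:n - 1] = sol
    # Step 4: coefficients b, c, d from sigma.
    b = [delta[i] - h[i] * (sigma[i + 1] + 2.0 * sigma[i]) for i in range(n - 1)]
    c = [3.0 * sigma[i] for i in range(n - 1)]
    d = [(sigma[i + 1] - sigma[i]) / h[i] for i in range(n - 1)]
    return b, c, d


def spline_eval(x, y, b, c, d, t):
    """Evaluate the spline at t in Horner form (linear scan for the interval)."""
    i = 0
    while i < len(x) - 2 and t > x[i + 1]:
        i += 1
    u = t - x[i]
    return y[i] + u * (b[i] + u * (c[i] + u * d[i]))
```

For data lying on a straight line all $\sigma_i$ come out as zero, so the spline reduces to that line; this makes a convenient sanity check.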

Summary

- Polynomial interpolation: General idea and methods, Lagrange interpolation

- Piecewise polynomial interpolation: Construction of piecewise polynomials (linear and cubic), evaluation of a piecewise function, ncspline, seval

Ch 4 – Numerical integration

Intro

Better term: Quadrature

Given a finite number of function values $f(x_i)$, $x_i \in [a,b]$, or a function $f(x)$ that can be evaluated for any $x \in [a,b]$, calculate

$$I(f) = \int_a^b f(x)\,dx$$

Partition $a = x_1 < x_2 < \cdots < x_{n+1} = b$, and denote $h_i = x_{i+1} - x_i$. Then

$$I(f) = \sum_{i=1}^{n} I_i, \quad I_i = \int_{x_i}^{x_{i+1}} f(x)\,dx$$

Quadrature rule: Approximation of $I_i$

Composite quadrature rule: Approximation of $I(f)$ as a sum of approximations of $I_i$

Rectangle rule

We use a piecewise constant (degree zero polynomial) to approximate $f(x)$. In each interval $[x_i, x_{i+1}]$, $f(x)$ is evaluated at the midpoint.

$$y_i = \frac{x_i + x_{i+1}}{2}, \quad i = 1, \dots, n$$

$$I_i \approx h_i f(y_i)$$

Trapezoid rule

We use piecewise linear interpolation (degree one polynomial) to approximate $f(x)$. In each interval $[x_i, x_{i+1}]$, $f(x)$ is evaluated at the endpoints.

$$I_i \approx h_i \frac{f(x_i) + f(x_{i+1})}{2}$$

Composite trapezoid rule:

$$T(f) = \sum_{i=1}^{n} h_i \frac{f(x_i) + f(x_{i+1})}{2} = \frac{h_1}{2} f(x_1) + \frac{h_1 + h_2}{2} f(x_2) + \cdots + \frac{h_{n-1} + h_n}{2} f(x_n) + \frac{h_n}{2} f(x_{n+1})$$

A weighted sum of function values.

The trapezoid rule needs $n + 1$ function evaluations, at the endpoints $x_i$.

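Error

Rectangle rule

Taylor expansion of $f(x)$ about the midpoint $y_i = (x_i + x_{i+1})/2$:

$$f(x) = f(y_i) + \sum_{p=1}^{\infty} \frac{(x - y_i)^p}{p!} f^{(p)}(y_i)$$

Integrate both sides and note that

$$\int_{x_i}^{x_{i+1}} (x - y_i)^p\,dx = \begin{cases} \dfrac{h_i^{p+1}}{(p+1)2^p}, & p \text{ even} \\ 0, & p \text{ odd} \end{cases}$$

Then

$$\int_{x_i}^{x_{i+1}} f(x)\,dx = h_i f(y_i) + \frac{1}{24} h_i^3 f''(y_i) + \frac{1}{1920} h_i^5 f^{(4)}(y_i) + \cdots$$

When $h_i$ is small, the error

$$I(f) - R(f) \approx \frac{1}{24} \sum_{i=1}^{n} h_i^3 f''(y_i) + \frac{1}{1920} \sum_{i=1}^{n} h_i^5 f^{(4)}(y_i)$$

For equal spacing, $h_i = h$, we have

$$I(f) - R(f) \approx \frac{h^3}{24} \sum_{i=1}^{n} f''(y_i) + \frac{h^5}{1920} \sum_{i=1}^{n} f^{(4)}(y_i)$$

Trapezoid rule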

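In order to make the error in the trapezoid rule comparable with that in the rectangle rule, we expand $f(x)$ at the midpoint $y_i$. Substituting $x = x_i$ and $x = x_{i+1}$ gives

$$f(x_i) = f(y_i) + \sum_{p=1}^{\infty} (-1)^p \frac{h_i^p}{2^p\, p!} f^{(p)}(y_i)$$

$$f(x_{i+1}) = f(y_i) + \sum_{p=1}^{\infty} \frac{h_i^p}{2^p\, p!} f^{(p)}(y_i)$$

Thus

$$\frac{f(x_i) + f(x_{i+1})}{2} = f(y_i) + \frac{1}{8} h_i^2 f''(y_i) + \frac{1}{384} h_i^4 f^{(4)}(y_i) + \cdots$$

Recall that in the case of the rectangle rule, we had

$$\int_{x_i}^{x_{i+1}} f(x)\,dx = h_i f(y_i) + \frac{1}{24} h_i^3 f''(y_i) + \frac{1}{1920} h_i^5 f^{(4)}(y_i) + \cdots$$

Combining the last two equations, we have

$$\int_{x_i}^{x_{i+1}} f(x)\,dx = h_i \frac{f(x_i) + f(x_{i+1})}{2} - \frac{1}{12} h_i^3 f''(y_i) - \frac{1}{480} h_i^5 f^{(4)}(y_i) + \cdots$$

Then the error is

$$I(f) - T(f) \approx -\frac{1}{12} \sum_{i=1}^{n} h_i^3 f''(y_i) - \frac{1}{480} \sum_{i=1}^{n} h_i^5 f^{(4)}(y_i) + \cdots$$

Remarks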

Usually the rectangle rule (degree zero approximation) is more accurate than the trapezoid rule (degree one approximation): its leading error term is half the size.

Using $I(f) - R(f) \approx \frac{1}{3}(T(f) - R(f))$, we can estimate the error in $R(f)$ using $T(f)$ and $R(f)$. Similarly, $I(f) - T(f) \approx \frac{2}{3}(R(f) - T(f))$ can be used to estimate the error in $T(f)$ (but these are approximations; it is possible that $R(f) - T(f) = 0$ while $I(f) - R(f) \ne 0$).

When each $h_i$ is cut in half, $I(f) - R_{1/2}(f) \approx \frac{1}{4}(I(f) - R(f))$. Doubling the number of panels in either the rectangle rule or the trapezoid rule can be expected to roughly quadruple the accuracy.

Why? Conrad’s Note: The error in each panel is bounded by $E_i \le \frac{h^3}{24} |f''(\xi)|$ where $\xi \in [a,b]$ (local error is of order $O(h^3)$). Summing this over n panels amounts to a total bounded error of $E \le n \frac{h^3}{24} |f''(\xi)|$. As we know $n \cdot h = (b - a)$, the global error is bounded by $E \le (b - a) \frac{h^2}{24} |f''(\xi)|$, of order $O(h^2)$. Thus, half the interval size results in $(1/2)^2$ the error, or 1/4 the error, as Qiao suggests.

This can be used to estimate the error as well as improve the accuracy. How?

Conrad’s Note: $I(f) - R_{1/2}(f) \approx \frac{1}{4}(I(f) - R(f))$ can be rearranged to show $I(f) - R_{1/2}(f) \approx \frac{1}{3}(R_{1/2}(f) - R(f))$. Computing both $R_{1/2}$ and $R$ allows us to then estimate the error, using the derived equation.

Simpson’s rule

Recall the rectangle rule

$$R(f) = I(f) - \frac{1}{24} \sum_{i=1}^{n} h_i^3 f''(y_i) - \frac{1}{1920} \sum_{i=1}^{n} h_i^5 f^{(4)}(y_i) + \cdots$$

and the trapezoid rule

$$T(f) = I(f) + \frac{1}{12} \sum_{i=1}^{n} h_i^3 f''(y_i) + \frac{1}{480} \sum_{i=1}^{n} h_i^5 f^{(4)}(y_i) + \cdots$$

Combining the above two equations (canceling the $O(h_i^3)$ term), we get a more accurate method (Simpson’s rule):

$$S(f) = \frac{2}{3} R(f) + \frac{1}{3} T(f) = I(f) + \frac{1}{2880} \sum_{i=1}^{n} h_i^5 f^{(4)}(y_i) + \cdots$$

Simpson’s rule (a weighted average of the rectangle and trapezoid rules):

$$I_i \approx \frac{2}{3} h_i f\left(\frac{x_i + x_{i+1}}{2}\right) + \frac{1}{3} h_i \frac{f(x_i) + f(x_{i+1})}{2}$$

Composite Simpson’s rule:

$$S(f) = \sum_{i=1}^{n} \frac{1}{6} h_i \left[ f(x_i) + 4 f\left(\frac{x_i + x_{i+1}}{2}\right) + f(x_{i+1}) \right]$$

Function evaluations: $2n + 1$

Error

$$I(f) - S(f) = -\frac{1}{2880} \sum_{i=1}^{n} h_i^5 f^{(4)}(y_i) + \cdots$$

Remarks

- Simpson’s rule can also be derived by using piecewise quadratic (degree two) approximation.
- Simpson’s rule is exact for cubic functions (one extra order of accuracy) since the error term involves fourth derivatives.
- Doubling the number of panels in Simpson’s rule can be expected to reduce the error to roughly 1/16 of its size (Simpson’s rule has a global order of accuracy of $h^4$)
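The relations between the three composite rules can be tried out on the integral of $1/(1+x^2)$ over [0, 1], whose exact value is $\pi/4$. The Python sketch below is an illustration only: it assumes equal panel widths, and `composite_rules` is a hypothetical name.

```python
import math


def composite_rules(f, a, b, n):
    """Return (R, T, S) for n equal panels on [a, b]."""
    h = (b - a) / n
    rect = 0.0
    trap = 0.0
    for i in range(n):
        left = a + i * h
        right = left + h
        rect += h * f(0.5 * (left + right))        # midpoint (rectangle) rule
        trap += h * 0.5 * (f(left) + f(right))     # trapezoid rule
    simpson = (2.0 * rect + trap) / 3.0            # S = (2/3) R + (1/3) T
    return rect, trap, simpson


f = lambda x: 1.0 / (1.0 + x * x)
exact = math.pi / 4.0
R, T, S = composite_rules(f, 0.0, 1.0, 8)
R2, T2, S2 = composite_rules(f, 0.0, 1.0, 16)

print("rectangle error:", exact - R, " estimate (T - R)/3:", (T - R) / 3.0)
print("trapezoid error:", exact - T, " estimate 2(R - T)/3:", 2.0 * (R - T) / 3.0)
print("rectangle error ratio after halving h:", (exact - R) / (exact - R2))
print("Simpson error:", exact - S)
```

Note that this sketch re-evaluates shared endpoints for simplicity; a careful implementation would reuse them to reach the $n + 1$ and $2n + 1$ evaluation counts quoted above.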

A general technique: Richardson’s extrapolation

Idea: Combine two approximations (e.g., $R(f)$ and $T(f)$) which have similar error terms to achieve a more accurate approximation (e.g., $S(f)$).

Example. Combining $S(f)$ and $S_{1/2}(f)$ to obtain an approximation which has error of order $h_i^7$. This gives the Romberg quadrature

$$\frac{16}{15} S_{1/2}(f) - \frac{1}{15} S(f)$$

Adaptive Quadrature

Given a predetermined tolerance $\epsilon$, the algorithm automatically determines the panel sizes so that the computed approximation $Q$ satisfies

$$\left| Q - \int_a^b f(x)\,dx \right| < \epsilon$$

Software interface: quad(fname, a, b, tol)

The algorithm uses large panel sizes for smooth parts and small panel sizes for the parts where the function changes rapidly. Thus the prescribed accuracy is attained at as small a cost in computing time as possible (measured by the number of function evaluations).

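Basic idea: Compute two approximations (Simpson’s rule):

One-panel formula

$$P_i = \frac{h_i}{6} \left[ f(x_i) + 4 f\left(x_i + \frac{h_i}{2}\right) + f(x_i + h_i) \right]$$

Two-panel formula

$$Q_i = \frac{h_i}{12} \left[ f(x_i) + 4 f\left(x_i + \frac{h_i}{4}\right) + 2 f\left(x_i + \frac{h_i}{2}\right) + 4 f\left(x_i + \frac{3h_i}{4}\right) + f(x_i + h_i) \right]$$

Note:

- From $P_i$ to $Q_i$, we need only two more function evaluations, $f(x_i + h_i/4)$ and $f(x_i + 3h_i/4)$
- $Q_i$ can be viewed as the sum of two $P$’s from two subintervals of length $h_i/2$

Error estimation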

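Compare $P_i$ and $Q_i$ to obtain an estimate of their accuracy.

$$I_i - P_i = c h_i^5 f^{(4)}\left(x_i + \frac{h_i}{2}\right) + \cdots$$

$$I_i - Q_i = c \left(\frac{h_i}{2}\right)^5 \left[ f^{(4)}\left(x_i + \frac{h_i}{4}\right) + f^{(4)}\left(x_i + \frac{3h_i}{4}\right) \right] + \cdots$$

Using the approximation

$$f^{(4)}\left(x_i + \frac{h_i}{4}\right) + f^{(4)}\left(x_i + \frac{3h_i}{4}\right) \approx 2 f^{(4)}\left(x_i + \frac{h_i}{2}\right),$$

we have

$$I_i - Q_i \approx 2c \left(\frac{h_i}{2}\right)^5 f^{(4)}\left(x_i + \frac{h_i}{2}\right) + \cdots$$

Thus we have a relation between the errors in $Q_i$ and $P_i$:

$$I_i - Q_i \approx \frac{1}{2^4} (I_i - P_i) + \cdots$$

Reformulate the above:

$$I_i - Q_i \approx \frac{1}{2^4 - 1} (Q_i - P_i) + \cdots$$

Now the accuracy of $Q_i$ is expressed in terms of $Q_i - P_i$.

Scheme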

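Bisect each subinterval until

$$\frac{1}{2^4 - 1} |Q_i - P_i| \le \frac{h_i}{b - a} \epsilon$$

Then

$$\left| \int_a^b f(x)\,dx - \sum_{i=1}^{n} Q_i \right| \le \frac{1}{2^4 - 1} \sum_{i=1}^{n} |Q_i - P_i| \le \frac{\epsilon}{b - a} \sum_{i=1}^{n} h_i = \epsilon$$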

if maxLev==0
  too many levels of recursion, quit;
compute one-panel quadrature R1;
compute two-panel quadrature R2;
use R1 and R2 to estimate error in R2;
if the estimated error < tol
  return R2 and estimated error;
else
  [I1, err1] = AdaptQuad(fname,a,mid,tol/2,maxLev-1);
  [I2, err2] = AdaptQuad(fname,mid,b,tol/2,maxLev-1);
  I = I1 + I2;
  err = err1 + err2;

Example: Compute

$$\frac{\pi}{4} = \int_0^1 \frac{1}{1 + x^2}\,dx$$

using the adaptive rectangle rule.

AdaptQRec('datan',0,1,0.0001,10): 0.785396
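A recursive scheme of this shape can be sketched in Python as below. This is an assumption-laden illustration, not the AdaptQuad or AdaptQRec routine of the notes: it uses Simpson's rule for the one-panel and two-panel values (hence the factor 1/15), the names `simpson` and `adapt_quad` are hypothetical, and when the recursion limit is reached it returns the current estimate instead of quitting.

```python
import math


def simpson(f, a, b):
    """One-panel Simpson's rule on [a, b]."""
    m = 0.5 * (a + b)
    return (b - a) / 6.0 * (f(a) + 4.0 * f(m) + f(b))


def adapt_quad(f, a, b, tol, max_lev):
    """Return (integral estimate, error estimate) on [a, b]."""
    mid = 0.5 * (a + b)
    p = simpson(f, a, b)                             # one-panel value P
    q = simpson(f, a, mid) + simpson(f, mid, b)      # two-panel value Q
    err = abs(q - p) / 15.0                          # |I - Q| ~ |Q - P| / (2**4 - 1)
    if err < tol or max_lev == 0:
        return q, err
    # Split the tolerance between the two halves and recurse.
    i1, e1 = adapt_quad(f, a, mid, tol / 2.0, max_lev - 1)
    i2, e2 = adapt_quad(f, mid, b, tol / 2.0, max_lev - 1)
    return i1 + i2, e1 + e2


approx, err_est = adapt_quad(lambda x: 1.0 / (1.0 + x * x), 0.0, 1.0, 1e-4, 10)
print(approx, err_est, math.pi / 4.0)
```

For brevity the sketch recomputes function values that the parent call already has; passing them down would bring the cost to two new evaluations per refinement, as noted above.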


2D Quadrature

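Consider a 2D integral

$$I = \int_a^b \int_c^d f(x,y)\,dy\,dx$$

and let

$$g(x) = \int_c^d f(x,y)\,dy$$

Applying the composite trapezoid rule to

$$\int_a^b g(x)\,dx$$

we get the numerical integration

$$\sum_{i=1}^{m-1} \frac{g(x_i) + g(x_{i+1})}{2} h_x$$

Written in vector form

$$h_x w_x^T \begin{bmatrix} g(x_1) \\ \vdots \\ g(x_m) \end{bmatrix}$$

where $w_x^T = [1/2, 1, \dots, 1, 1/2]$.

Again, applying the composite trapezoid rule to each $g(x_i)$, we get

$$g(x_i) = \int_c^d f(x_i, y)\,dy \approx \sum_{j=1}^{n-1} \frac{f(x_i, y_j) + f(x_i, y_{j+1})}{2} h_y$$

In vector form

$$g(x_i) \approx h_y [f(x_i, y_1), \dots, f(x_i, y_n)]\, w_y$$

where $w_y = [1/2, 1, \dots, 1, 1/2]^T$.

Finally, we have the numerical integration, in matrix-vector form:

$$Q = h_x h_y w_x^T F w_y$$

where

$$F = \begin{bmatrix} f(x_1, y_1) & \cdots & f(x_1, y_n) \\ \vdots & \ddots & \vdots \\ f(x_m, y_1) & \cdots & f(x_m, y_n) \end{bmatrix}$$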
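The matrix-vector form translates almost literally into code. The Python sketch below is illustrative only: it assumes equally spaced grids with m points in x and n points in y, and `trapezoid_2d` is a hypothetical name.

```python
def trapezoid_2d(f, a, b, c, d, m, n):
    """Approximate the integral of f over [a, b] x [c, d] as hx * hy * wx^T F wy."""
    hx = (b - a) / (m - 1)
    hy = (d - c) / (n - 1)
    wx = [0.5] + [1.0] * (m - 2) + [0.5]     # trapezoid weights in x
    wy = [0.5] + [1.0] * (n - 2) + [0.5]     # trapezoid weights in y
    total = 0.0
    for i in range(m):
        xi = a + i * hx
        for j in range(n):
            # wx[i] * F[i][j] * wy[j], with F[i][j] = f(x_i, y_j)
            total += wx[i] * f(xi, c + j * hy) * wy[j]
    return hx * hy * total


q = trapezoid_2d(lambda x, y: x + y, 0.0, 1.0, 0.0, 1.0, 5, 7)
print(q)
```

The rule integrates functions that are linear in each variable exactly, which gives an easy check.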

Summary

- Composite quadrature rules: Rectangle rule, trapezoid rule, Simpson’s rule

- Richardson’s extrapolation technique: Combining two quadrature rules with similar error terms to achieve a more accurate quadrature rule by canceling the leading error term; Combining one-panel and two-panel results to estimate errors

- Adaptive quadrature: By using error estimates, determine the panel sizes so that the computed approximation satisfies a predetermined tolerance

- 2D quadrature: Formulation of the problem

Ch 5 – Solving Differential Equations

Problem setting

Initial Value Problem (first order)

Find $y(t)$ such that

$$y' = f(y, t)$$

Initial value $y(t_0)$; usually assume $t_0 = 0$.

Generalization 1: system of first order ODEs: $y$ is a vector and $f$ is a vector function.

Example:

$$\begin{cases} y_1' = f_1(y_1, y_2, t) \\ y_2' = f_2(y_1, y_2, t) \end{cases}$$

Or in vector notation:

$$\mathbf{y}' = \mathbf{f}(\mathbf{y}, t)$$

Generalization 2: high order equation

$$u'' = g(u, u', t).$$

Let

$$y_1 = u, \quad y_2 = u'$$

and transform the above into the following system of first order ODEs:

$$\begin{cases} y_1' = y_2 \\ y_2' = g(y_1, y_2, t) \end{cases}$$

Solution family

A differential equation has a family of solutions, each corresponding to an initial value. For example,

$$y' = -y, \quad \text{solution family } y = Ce^{-t}$$

Euler’s method

We consider the initial value problem

$$y' = f(y, t), \quad y(t_0) = y_0$$

Numerical solution: find approximations

$$y_n \approx y(t_n), \quad n = 1, 2, \dots$$

Note: $y_0 = y(t_0)$ (initial value)

A $k$-step method: Compute $y_{n+1}$ using $y_n, y_{n-1}, \dots, y_{n-k+1}$

A single-step method: Euler’s method.

$$f(y_0, t_0) = y'(t_0) \approx \frac{y(t_1) - y(t_0)}{h_0},$$

where $h_0 = t_1 - t_0$. The first step:

$$y_1 = y_0 + h_0 f(y_0, t_0)$$

Euler’s method

$$y_{n+1} = y_n + h_n f(y_n, t_n)$$

Produces: $y_0 = y(t_0)$, $y_1 \approx y(t_1)$, $y_2 \approx y(t_2)$, ...

Example

$y' = -y$, $y(0) = 1.0$ (solution $y = e^{-t}$), $h = 0.4$

Step 1:

$$y_1 = y_0 - h y_0 = 1.0 - 0.4 \times 1.0 = 0.6 \quad (\approx y(0.4) = e^{-0.4} \approx 0.6703)$$

$u_1(t) = 0.6\, e^{-t+0.4} \approx 0.8951 e^{-t}$ in the solution family.

$u_1' = -u_1$, $u_1(0) \approx 0.8951$ $(u_1(0.4) = 0.6)$

Step 2:

$$y_2 = y_1 - h y_1 = 0.6 - 0.4 \times 0.6 = 0.36$$

$u_2(t) = 0.36\, e^{-t+0.8} \approx 0.8012 e^{-t}$ in the solution family.

$u_2' = -u_2$, $u_2(0) \approx 0.8012$ $(u_2(0.8) = 0.36)$

In general

$u_n' = f(u_n(t), t)$, in the solution family

$u_n(t_n) = y_n$, passing through $(t_n, y_n)$


Starting with 𝑡0 and 𝑦0=𝑦(𝑡0), as we proceed, we jump from one solution in the family to another.
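The two Euler steps of the example can be reproduced with a few lines of Python. This is a minimal sketch with a fixed step size; the name `euler` is hypothetical and the argument order f(y, t) follows the notes.

```python
import math


def euler(f, y0, t0, h, steps):
    """Return [y_0, y_1, ..., y_steps] from Euler's method with fixed step h."""
    ys = [y0]
    y, t = y0, t0
    for _ in range(steps):
        y = y + h * f(y, t)      # y_{n+1} = y_n + h f(y_n, t_n)
        t = t + h
        ys.append(y)
    return ys


# y' = -y, y(0) = 1, h = 0.4, two steps
ys = euler(lambda y, t: -y, 1.0, 0.0, 0.4, 2)
for n, yn in enumerate(ys):
    print(n, yn, math.exp(-0.4 * n))   # Euler value next to the exact solution
```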

Errors

Two sources of errors: discretization error and roundoff error.

- Discretization error: caused by the method used, independent of the computer used and the program implementing the method
- Two types of discretization error:
  - Global error: $e_n = y_n - y(t_n)$
  - Local error: the error in one step

Local error

Consider $t_n$ as the starting point and the approximation $y_n$ at $t_n$ as the initial value. If $u_n(t)$ is the solution of

$$u_n' = f(u_n, t), \quad u_n(t_n) = y_n$$

then the local error is

$$d_n = y_{n+1} - u_n(t_{n+1})$$

Example:

Step 1

Local error $d_0 = y_1 - y(t_1) = 0.6 - e^{-0.4} \approx -0.0703$

Global error $e_1$ same as $d_0$

Step 2

Local error $d_1 = y_2 - u_1(t_2) = 0.36 - u_1(0.8) \approx -0.0422$.

Global error $e_2 = y_2 - y(t_2) = 0.36 - e^{-0.8} \approx -0.0893$.

Stability

Relation between global error $e_n$ and local error $d_n$.

If the differential equation is unstable,

$$|e_N| > \sum_{n=0}^{N-1} |d_n|$$

If the differential equation is stable,

$$|e_N| \le \sum_{n=0}^{N-1} |d_n|$$

In this case, $\sum_{n=0}^{N-1} |d_n|$ is an upper bound for the global error $|e_N|$.

In the previous example:

Local errors $|d_0| = 0.0703$ and $|d_1| = 0.0422$

Global error $|e_2| = 0.0893$

$|e_2| < |d_0| + |d_1|$

More generally, $y' = \alpha y$, solution family $y = Ce^{\alpha t}$.

Stable when $\alpha < 0$.

Accuracy

A measurement for the accuracy of a method

An order $p$ method:

$$|d_n| \le C h_n^{p+1} \quad \left(\text{or } O(h_n^{p+1})\right)$$

$C$: independent of $n$ and $h_n$.

Example:

Local solution $u_n(t)$

$$u_n'(t) = f(u_n(t), t), \quad u_n(t_n) = y_n$$

Taylor expansion at $t_n$:

$$u_n(t) = u_n(t_n) + (t - t_n) u_n'(t_n) + O((t - t_n)^2)$$

Since $y_n = u_n(t_n)$ and $u_n'(t_n) = f(y_n, t_n)$, we get

$$u_n(t_{n+1}) = y_n + h_n f(y_n, t_n) + O(h_n^2)$$

Local error

$$d_n = y_{n+1} - u_n(t_{n+1}) = O(h_n^2)$$

Euler’s method is a first order method ($p = 1$).

Consider the interval $[t_0, t_N]$ and partition $t_0, t_1, \dots, t_N$. The global error at the final point $t_N$ is roughly

$$|e_N| \approx \sum_{n=0}^{N-1} |d_n| \approx N \cdot O(h^{p+1}) \approx (t_N - t_0) \cdot O(h^p),$$

that is, roughly $O(h^p)$ for a method of order $p$.

For a $p$th order method, if the subintervals $h_n$ are cut in half, then the average local error is reduced by a factor of $2^{p+1}$ and the global error is reduced by a factor of $2^p$. (But this doubles the number of steps, i.e., more work.)

Roundoff Error

$$y_{n+1} = y_n + h_n f(y_n, t_n) + \epsilon, \quad |\epsilon| = O(u)$$

Total rounding error: $N\epsilon = b\epsilon/h$ ($b = t_N - t_0$, fixed step size $h$)

$$\text{total error} \approx b\left(Ch + \frac{\epsilon}{h}\right)$$

- If $h$ is too small, the roundoff error is large
- If $h$ is too large, the discretization error is large

The total error is minimized by

$$h_{opt} \approx \sqrt{\frac{u}{C}}$$

recalling that $u$ is the unit of roundoff.

Runge-Kutta methods

Idea: Sample 𝑓 at several spots to achieve high order.

Cost: More function evaluations

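Example. A second-order Runge-Kutta method

Suppose

$$y_{n+1} = y_n + \gamma_1 k_0 + \gamma_2 k_1$$

where $\gamma_1$ and $\gamma_2$ are to be determined and

$$k_0 = h_n f(y_n, t_n)$$

$$k_1 = h_n f(y_n + \beta k_0, t_n + \alpha h_n)$$

with $\alpha$ and $\beta$ to be determined.

Taylor series (two variables), writing $f_n = f(y_n, t_n)$:

$$k_1 = h_n \left( f_n + \beta k_0 f_y'(y_n, t_n) + \alpha h_n f_t'(y_n, t_n) + \cdots \right)$$

Thus, since $k_0 = h_n f_n$,

$$y_{n+1} = y_n + (\gamma_1 + \gamma_2) h_n f_n + \gamma_2 \beta h_n^2 f_n f_y'(y_n, t_n) + \gamma_2 \alpha h_n^2 f_t'(y_n, t_n) + \cdots$$

The Taylor expansion of the true local solution:

$$u_n(t_{n+1}) = u_n(t_n) + h_n u_n'(t_n) + \frac{h_n^2}{2} u_n''(t_n) + \cdots = y_n + h_n f_n + \frac{h_n^2}{2} \left( f_y'(y_n, t_n) f_n + f_t'(y_n, t_n) \right) + \cdots$$

Comparing the two expressions, we set

$$\begin{cases} \gamma_1 + \gamma_2 = 1 \\ \gamma_2 \beta = 1/2 \\ \gamma_2 \alpha = 1/2 \end{cases}$$

Then the local error

$$d_n = y_{n+1} - u_n(t_{n+1}) = O(h_n^3)$$

and the global error is $O(h^2)$.

Let

$$\gamma_1 = 1 - \frac{1}{2\alpha}, \quad \gamma_2 = \frac{1}{2\alpha}, \quad \beta = \alpha$$

Second-order RK method

$$y_{n+1} = y_n + \left(1 - \frac{1}{2\alpha}\right) k_0 + \frac{1}{2\alpha} k_1$$

where

$$k_0 = h_n f(y_n, t_n)$$

$$k_1 = h_n f(y_n + \alpha k_0, t_n + \alpha h_n)$$

When $\alpha = 1/2$, related to the rectangle rule

When $\alpha = 1$, related to the trapezoid rule

Classical fourth-order Runge-Kutta Method

$$y_{n+1} = y_n + \frac{1}{6}(k_0 + 2k_1 + 2k_2 + k_3)$$

where

$$k_0 = h f(y_n, t_n)$$

$$k_1 = h f\left(y_n + \tfrac{1}{2} k_0, t_n + \tfrac{h}{2}\right)$$

$$k_2 = h f\left(y_n + \tfrac{1}{2} k_1, t_n + \tfrac{h}{2}\right)$$

$$k_3 = h f(y_n + k_2, t_n + h)$$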
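As a rough illustration of the classical fourth-order method, the Python sketch below applies it to $y' = -y$, $y(0) = 1$ and compares the error at $t = 0.8$ for two step sizes. The names `rk4_step` and `rk4_solve` are hypothetical, and the step size is assumed fixed.

```python
import math


def rk4_step(f, y, t, h):
    """One step of the classical fourth-order Runge-Kutta method."""
    k0 = h * f(y, t)
    k1 = h * f(y + 0.5 * k0, t + 0.5 * h)
    k2 = h * f(y + 0.5 * k1, t + 0.5 * h)
    k3 = h * f(y + k2, t + h)
    return y + (k0 + 2.0 * k1 + 2.0 * k2 + k3) / 6.0


def rk4_solve(f, y0, t0, h, steps):
    """Advance from (t0, y0) by the given number of fixed steps."""
    y, t = y0, t0
    for _ in range(steps):
        y = rk4_step(f, y, t, h)
        t += h
    return y


f = lambda y, t: -y
err_coarse = abs(rk4_solve(f, 1.0, 0.0, 0.4, 2) - math.exp(-0.8))
err_fine = abs(rk4_solve(f, 1.0, 0.0, 0.2, 4) - math.exp(-0.8))
print(err_coarse, err_fine, err_coarse / err_fine)
```

Halving h should shrink the global error by a factor of roughly $2^4 = 16$ for a fourth-order method; at step sizes this large the observed ratio is only approximately that.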

Multistep Methods

Compute $y_{n+1}$ using $y_n, y_{n-1}, \dots$ and $f_n, f_{n-1}, \dots$, possibly $f_{n+1}$ (where $f_i = f(y_i, t_i)$).

General linear k-step method:

$$y_{n+1} = \sum_{i=1}^{k} \alpha_i y_{n-i+1} + h \sum_{i=0}^{k} \beta_i f_{n-i+1}$$

- $\beta_0 = 0$ (no $f_{n+1}$), explicit method
- $\beta_0 \ne 0$, implicit method
- Explicit and implicit defined in glossary, page 9

Conrad’s Note: Runge-Kutta methods use an initial step and some intermediate steps to obtain higher order, but then discard all information before taking the next step. Multistep methods attempt to gain efficiency by keeping and using the information from previous steps rather than discarding it. They refer to several previous points and derivative values.

Example:

Adams-Bashforth methods.

$$y_{n+1} = y_n + \int_{t_n}^{t_{n+1}} p_{k-1}(t)\,dt$$

where $p_{k-1}(t)$ is a polynomial of degree $k - 1$ which interpolates $f(y, t)$ at $(y_{n-j}, t_{n-j})$, $j = 0, \dots, k-1$.

For example,

$p_0(t) = f_n$, Euler’s method

$$p_1(t) = f_{n-1} + \frac{f_n - f_{n-1}}{h_{n-1}} (t - t_{n-1})$$

Adams-Bashforth Family

$y_{n+1} = y_n + h f_n$; local error $\frac{h^2}{2} y^{(2)}(\eta)$, order 1

$y_{n+1} = y_n + \frac{h}{2}(3f_n - f_{n-1})$; local error $\frac{5h^3}{12} y^{(3)}(\eta)$, order 2

$y_{n+1} = y_n + \frac{h}{12}(23f_n - 16f_{n-1} + 5f_{n-2})$; local error $\frac{3h^4}{8} y^{(4)}(\eta)$, order 3

$y_{n+1} = y_n + \frac{h}{24}(55f_n - 59f_{n-1} + 37f_{n-2} - 9f_{n-3})$; local error $\frac{251h^5}{720} y^{(5)}(\eta)$, order 4

There exists a “start-up” issue in multistep methods: how to get the $k - 1$ start values $f_j = f(y_j, t_j)$, $j = 1, \dots, k-1$?

Use a single step method to get the start values, then switch to the multistep method. Note: Be careful about accuracy consistency.

Example:

Model: The motion of two bodies under mutual gravitational attraction.

A coordinate system: origin: position of one body

$x(t), y(t)$: position of the other body

Differential equations derived from Newton’s laws of motion:

$$x''(t) = \frac{-\alpha^2 x(t)}{r(t)}, \quad y''(t) = \frac{-\alpha^2 y(t)}{r(t)}$$

where $r(t) = [x(t)^2 + y(t)^2]^{3/2}$ and $\alpha$ is a constant involving the gravitational constant, the masses of the two bodies and the units of measurement.

If the initial conditions are chosen as

$$x(0) = 1 - e, \quad x'(0) = 0, \quad y(0) = 0, \quad y'(0) = \alpha \left(\frac{1 + e}{1 - e}\right)^{1/2}$$

for some $e$ with $0 \le e < 1$, then the solution is periodic with period $2\pi/\alpha$. The orbit is an ellipse with eccentricity $e$ and with one focus at the origin.

To write the two second-order differential equations as four first order differential equations, we introduce

$$y_1 = x, \quad y_2 = y, \quad y_3 = x', \quad y_4 = y'$$

We have a system of first-order equations. With

$$s = \frac{(y_1^2 + y_2^2)^{3/2}}{\alpha^2},$$

$$\begin{bmatrix} y_1 \\ y_2 \\ y_3 \\ y_4 \end{bmatrix}' = \begin{bmatrix} y_3 \\ y_4 \\ -y_1/s \\ -y_2/s \end{bmatrix} = \begin{bmatrix} 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \\ -1/s & 0 & 0 & 0 \\ 0 & -1/s & 0 & 0 \end{bmatrix} \begin{bmatrix} y_1 \\ y_2 \\ y_3 \\ y_4 \end{bmatrix}$$

with the initial condition

$$\begin{bmatrix} y_1(0) \\ y_2(0) \\ y_3(0) \\ y_4(0) \end{bmatrix} = \begin{bmatrix} 1 - e \\ 0 \\ 0 \\ \alpha \left(\frac{1 + e}{1 - e}\right)^{1/2} \end{bmatrix}$$

The function defining the system of equations.

function yp = orbit(y, t)
  global a
  global e
  yp = zeros(size(y));
  r = y(1)*y(1) + y(2)*y(2);
  r = r*sqrt(r)/(a*a);
  yp(1) = y(3);
  yp(2) = y(4);
  yp(3) = -y(1)/r;
  yp(4) = -y(2)/r;
endfunction

A script file TwoBody.m (Octave)

global a
global e
a = input('a = ');
e = input('e = ');
# initial value
x0 = [1-e; 0.0; 0.0; a*sqrt((1+e)/(1-e))];
# time span
t = [0.0:0.1:(2*pi/a)]';
# solve ode
[x, state, msg] = lsode('orbit', x0, t);

Matlab

% ode45 calls the function as f(t, y), so the arguments of orbit must be swapped for Matlab
[t, x] = ode45('orbit', [0.0 2*pi/a], x0);

Output x: matrix of four columns

x(:,1): x(t)
x(:,2): y(t)
x(:,3): x'(t)
x(:,4): y'(t)

plot(x(:,1), x(:,2))

$e = 1/4$, $\alpha = \pi/4$

Implicit methods

Example:

$y' = -20y$, $y(0) = 1$ (solution $y = e^{-20t}$)

Euler’s method, $h = 0.1$: the iterates $y_{n+1} = y_n - 2y_n = -y_n$ oscillate between 1 and −1 instead of decaying.

Backward Euler’s method

$y' = -20y$, $y(0) = 1$ (solution $y = e^{-20t}$), backward Euler’s method, $h = 0.1$:

$$y_{n+1} = y_n - 20 \times 0.1\, y_{n+1} = y_n - 2y_{n+1}$$

Taylor expansion of the solution $y(t)$ about $t = t_{n+1}$ (instead of $t = t_n$):

$$y(t_{n+1} + h) \approx y(t_{n+1}) + y'(t_{n+1}) h = y(t_{n+1}) + f(y(t_{n+1}), t_{n+1}) h$$

Setting $h = -h_n = t_n - t_{n+1}$, we get

$$y(t_n) \approx y(t_{n+1}) - h_n f(y(t_{n+1}), t_{n+1})$$

Substituting $y_n$ for $y(t_n)$ and $y_{n+1}$ for $y(t_{n+1})$, we have

Backward Euler’s method

$$y_{n+1} = y_n + h_n f(y_{n+1}, t_{n+1})$$

Implicit methods tend to be more stable than their explicit counterparts. But there is a price: $y_{n+1}$ is a zero of a usually nonlinear function.

Example:

A system of two differential equations.

$$\begin{cases} u' = 998u + 1998v \\ v' = -999u - 1999v \end{cases}$$

If the initial values are $u(0) = v(0) = 1$, then the exact solution is

$$\begin{cases} u = 4e^{-t} - 3e^{-1000t} \\ v = -2e^{-t} + 3e^{-1000t} \end{cases}$$

Suppose we use forward Euler’s method

$$u_{n+1} = u_n + h(998u_n + 1998v_n)$$

$$v_{n+1} = v_n + h(-999u_n - 1999v_n)$$

With $u_0 = v_0 = 1.0$, we get

      h = 0.01    h = 0.001
u1    1.0         1.0
u2    30.96       3.996
u3    -239.1      3.992
u5    52420       31.92

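If we use the backward Euler method

$$u_{n+1} = u_n + h(998u_{n+1} + 1998v_{n+1})$$

$$v_{n+1} = v_n + h(-999u_{n+1} - 1999v_{n+1})$$

Solving the linear system

$$\begin{bmatrix} 1 - 998h & -1998h \\ 999h & 1 + 1999h \end{bmatrix} \begin{bmatrix} u_{n+1} \\ v_{n+1} \end{bmatrix} = \begin{bmatrix} u_n \\ v_n \end{bmatrix}$$

With $u_0 = v_0 = 1.0$, we get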

      h = 0.01    h = 0.001
u1    1.0         1.0
u2    3.688       2.496
u3    3.896       3.242
u4    3.880       3.613
u5    3.844       3.797

For comparison, the solution values

      h = 0.01    h = 0.001
u1    1.0         1.0
u2    3.960       2.892
u3    3.921       3.586
u4    3.882       3.839
u5    3.843       3.929
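The first step of each method on this stiff system can be reproduced with the Python sketch below. It is an illustration only: the function names are hypothetical, and the 2-by-2 backward Euler system is solved by Cramer's rule rather than by a general linear solver.

```python
def forward_euler_step(u, v, h):
    """One forward Euler step for u' = 998u + 1998v, v' = -999u - 1999v."""
    return (u + h * (998.0 * u + 1998.0 * v),
            v + h * (-999.0 * u - 1999.0 * v))


def backward_euler_step(u, v, h):
    """One backward Euler step: solve the 2x2 linear system by Cramer's rule."""
    a11, a12 = 1.0 - 998.0 * h, -1998.0 * h
    a21, a22 = 999.0 * h, 1.0 + 1999.0 * h
    det = a11 * a22 - a12 * a21
    return ((u * a22 - a12 * v) / det,
            (a11 * v - a21 * u) / det)


for h in (0.01, 0.001):
    fu, fv = forward_euler_step(1.0, 1.0, h)
    bu, bv = backward_euler_step(1.0, 1.0, h)
    print("h =", h, " forward u:", fu, " backward u:", bu)
```

With h = 0.01 the forward step already jumps far from the solution while the backward step stays close to it, which is the stability difference the tables show.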

Hybrid methods

One of the difficulties in implicit methods is that they usually involve solving nonlinear systems.

Combining both explicit and implicit: Use an explicit method for the predictor and an implicit method for the corrector

Example: Fourth-order Adams method

Predictor (fourth-order, explicit):

$$y_{n+1} = y_n + \frac{h}{24}(55f_n - 59f_{n-1} + 37f_{n-2} - 9f_{n-3})$$

Corrector (fourth-order, implicit):

$$y_{n+1} = y_n + \frac{h}{24}(9f_{n+1} + 19f_n - 5f_{n-1} + f_{n-2})$$

PECE methods

Algorithm

1. Prediction: use the predictor to calculate $y_{n+1}^{(0)}$
2. Evaluation: $f_{n+1}^{(i)} = f(y_{n+1}^{(i)}, t_{n+1})$
3. Correction: use the corrector to calculate $y_{n+1}^{(i+1)}$

Steps 2 and 3 are repeated until $|y_{n+1}^{(i+1)} - y_{n+1}^{(i)}| \le tol$. Then set $y_{n+1} = y_{n+1}^{(i+1)}$.

Usually, PECE: one iteration and a final evaluation for the next step.

Summary

- Family of solutions of a differential equation, initial value problem

- Transform a high order ODE into a system of first order ODEs

- Errors: Discretization errors (global, local), stability of a differential equation (mathematical stability), roundoff error, total error

- Accuracy: order of a method

- Euler’s method, Explicit, single step, first order

- Runge-Kutta methods: Explicit, single step
- Multistep methods: Adams-Bashforth family
- Implicit methods: Backward Euler’s method
- Prediction-Correction scheme: Combination of explicit and implicit methods

Ch 6 – Nonlinear Equations and Continuous Optimization

Intro

Find roots

𝑓(𝑥) = 0

Often, methods are iterative (roots cannot be found in a finite number of steps).

Example:

Compute square roots

$$x^2 - A = 0$$

Find the side of the square whose area is $A$

Start with a rectangle one of whose sides is $x_c$; the other side is then $A/x_c$, so that its area is $A$. Make the rectangle “more square” by setting the new side to the average of the two sides:

$$x_+ = \frac{1}{2}\left(x_c + \frac{A}{x_c}\right)$$

Then set $x_c = x_+$ and iterate. A better form:

$$x_+ = x_c - \frac{1}{2}\left(x_c - \frac{A}{x_c}\right)$$

Three issues to be addressed

- Initialization ($x_0$)
- Convergence ($x_k \to x_*$?) and rate (how fast?)
- Termination

Initialization

Why do we initialize like this?

Conrad to Drew (Tron Facebook Group): The only explanation I can gather that makes sense to me is that he constricts all possible answers to a small range, namely 0.25 to 1, rather than 0 to infinity. Having the final answer being a factor of base 2 is semi-useful as well, I suppose it makes it easier for the computer to deal with it. His choice of base may have been just a choice of convenience, given that it results in 2, but doing that base tomfoolery was smart as well.

Normalizing his values to this range allows him to better constrict his range and better define the convergence rates and such of the system. Square root is nearly linear on this scale, his linear approximation is a decent method to find a solid initial guess. If he normalized his interval size to like 0.1 to 1000 rather than 0.25 to 1, well, his linear approximation would have been much worse and I'm assuming his initial error bound would have been a lot larger.

That said, without doing some base shenanigans, he couldn't have even done any sort of linear approximation and would have had to find some other way to get an initial guess.

Honestly though, all that shit I just mentioned could be serendipitous side effects of his ulterior motive of world domination. I know nothing.

Write $A$ in base 4:

$$A = m \times 4^e, \quad 0.25 \le m < 1$$

Then $\sqrt{A} = \sqrt{m} \times 2^e$

Now we can assume $4^{-1} \le A < 1$.

Linear interpolation of $f(A) = \sqrt{A}$ at $A = 0.25, 1.0$, i.e. through the points (0.25, 0.5), (1, 1):

$$p(A) = (1 + 2A)/3.$$

Initial error bound: Differentiating

$$\sqrt{A} - \frac{1 + 2A}{3}$$

with respect to A and then setting the derivative to zero to find the maximum, it can be shown that

$$\left| \sqrt{A} - (1 + 2A)/3 \right| \le 0.05$$

Initial value: $x_0 = (1 + 2A)/3$

Initial error: $e_0 \le 0.05$

Convergence

Denote the error $e_k = |x_k - \sqrt{A}|$, then the relation between $e_{k+1}$ and $e_k$:

$$e_{k+1} = |x_{k+1} - \sqrt{A}| = \frac{1}{2} \left| \frac{(x_k - \sqrt{A})^2}{x_k} \right| = \frac{1}{2|x_k|} e_k^2$$

Conrad’s Note: The above equation is found by plugging in the original equation for $x_+$ in place of $x_{k+1}$ and rearranging until it resembles the equation for $e_k$.

It can be shown that $0.5 \le x_k \le 1.0$. Since the initial error $e_0 \le 0.05$,

$$e_k \le e_{k-1}^2 \le \cdots \le e_0^{2^k} \le (0.05)^{2^k}$$

We have shown the convergence ($e_k \to 0$ as $k \to \infty$).

How fast?

Rate: Quadratic, $e_{k+1} \le c e_k^2$; each iteration roughly doubles the number of accurate digits.

Termination

Recall: $e_k \le (0.05)^{2^k} < 10^{-2^k}$

When $k = 3$, $e_k < 10^{-8}$. When $k = 4$, $e_k < 10^{-16}$.

Three iterations are enough for IEEE single precision ($2^{-24}$).

Four iterations are enough for IEEE double precision ($2^{-53}$).

Example

Compute $\sqrt{3}$

Scale: $3 = 0.75 \times 4$

Initial: $x_0 = (1 + 2 \times 0.75)/3 = 2.5/3$

Iterate: $x_{n+1} = x_n - (x_n - 0.75/x_n)/2$

n    x_n         error
0    0.833…      3.3 × 10^-2
1    0.8667…     6.4 × 10^-4
2    0.8660…     2.4 × 10^-7
3    0.8660…     3.2 × 10^-14
4    0.8660…     < 10^-16
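Putting initialization, iteration and termination together gives a short routine. The Python sketch below is illustrative: `scaled_sqrt` is a hypothetical name, the base-4 scaling is done with a simple loop rather than by manipulating the floating-point exponent, and a fixed count of four iterations is used as the termination rule for double precision.

```python
import math


def scaled_sqrt(a, iterations=4):
    """Square root of a > 0 via base-4 scaling and the averaging iteration."""
    m, e = a, 0
    while m >= 1.0:               # scale so that 0.25 <= m < 1 and a = m * 4**e
        m /= 4.0
        e += 1
    while m < 0.25:
        m *= 4.0
        e -= 1
    x = (1.0 + 2.0 * m) / 3.0     # initial guess from linear interpolation
    for _ in range(iterations):
        x = x - 0.5 * (x - m / x)
    return x * 2.0 ** e           # sqrt(a) = sqrt(m) * 2**e


root3 = scaled_sqrt(3.0)
print(root3, math.sqrt(3.0))
```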


Bisection Method

Generic algorithm

If $f(a) \cdot f(b) \le 0$ and $f(x)$ is continuous on $[a,b]$, then $f(x)$ has a root in $[a,b]$.

while (b-a)>tol
  m = (a+b)/2;
  if f(a)*f(m)<=0
    b = m;
  else
    a = m;
  end;
end;
r = (a + b)/2;

Two problems in the generic algorithm:

- The while-loop may not terminate: when a and b are two neighbouring floating-point numbers and (b-a)>tol, (a+b)/2 is rounded to either a or b.
- Redundant function evaluations

Improved algorithm

fa = f(a);
while (b-a) > tol + eps*max(abs(a),abs(b))
  m = (a + b)/2;
  fm = f(m);
  if fa*fm<=0
    b = m;
  else
    a = m;
    fa = fm;
  end;
end;
r = (a + b)/2;

Note: eps*max(|a|,|b|) is about the distance between two consecutive floating-point numbers near max(|a|,|b|) (one ulp).
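A runnable Python version of the improved algorithm might look as follows. This is a sketch under assumptions: `bisect_root` is a hypothetical name, a < b is assumed, and `sys.float_info.epsilon` plays the role of eps.

```python
import math
import sys


def bisect_root(f, a, b, tol):
    """Bisection on [a, b] with a < b and f(a) * f(b) <= 0."""
    fa = f(a)
    if fa * f(b) > 0:
        raise ValueError("f(a) and f(b) must not have the same sign")
    eps = sys.float_info.epsilon
    # The eps term guarantees termination even if tol is below the float spacing.
    while (b - a) > tol + eps * max(abs(a), abs(b)):
        m = 0.5 * (a + b)
        fm = f(m)                 # only one new function evaluation per iteration
        if fa * fm <= 0:
            b = m
        else:
            a, fa = m, fm
    return 0.5 * (a + b)


r = bisect_root(lambda x: x * x - 2.0, 1.0, 2.0, 1e-12)
print(r, math.sqrt(2.0))
```

Calling it with `tol = 0.0` still terminates, which is the point of the extra eps term.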

Convergence

Since $b_k - a_k \le (b - a)/2^k$, $x_* \in [a_k, b_k]$, and $x_k = (a_k + b_k)/2$, we have

$$e_k = |x_k - x_*| \le \frac{b_k - a_k}{2} \le \frac{b - a}{2^{k+1}} \to 0$$

In this case, $e_{k+1} \le 0.5 e_k$.

Accuracy improves by 1 bit per iteration, or 1 decimal digit for every three or so iterations.

In general, linear convergence rate:

$$e_{k+1} \le c e_k$$

for some constant $c < 1$.

Difficulty: Locate the interval $[a,b]$

Newton’s Method

Idea

The tangent line of $f(x)$ at $x_c$:

$$y = f(x_c) + (x - x_c) f'(x_c)$$

Set $y = 0$ and solve for $x$.

Newton’s Method

$$x_+ = x_c - \frac{f(x_c)}{f'(x_c)}$$

Example.

Square root problem revisited: find a zero of

$$f(x) = x^2 - A$$

$$x_+ = x_c - \frac{x_c^2 - A}{2x_c} = x_c - \frac{1}{2}\left(x_c - \frac{A}{x_c}\right)$$

Complex case

$f(x) = x^2 + x + 1$ (zeros $(-1 \pm i\sqrt{3})/2$)

i    x_i                   error
0    i                     5.2*10^-1
1    -0.40000+0.80000i     1.2*10^-1
2    -0.50769+0.86154i     8.0*10^-3
3    -0.49996+0.86600i     4.6*10^-5
4    -0.50000+0.86603i     1.2*10^-9
5    -0.50000+0.86603i     converged

$x_6 = x_5$

Convergence

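No guarantee of convergence (unlike bisection).

For example, $f(x) = \operatorname{atan}(x)$, $x_+ = x_c - (1 + x_c^2)\operatorname{atan}(x_c)$

Starting from $x_0 = 1.5$ ($> 1.3917$), the iterates grow in magnitude:

$x_0 = 1.5$, $x_1 = -1.6941$, $x_2 = 2.3211$, $x_3 = -5.1141$

Starting from $x_0 = -1.3$ ($|-1.3| < 1.3917$), the iterates shrink in magnitude:

$x_0 = -1.3$, $x_1 = 1.1616$, $x_2 = -0.8589$, $x_3 = \ldots$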

Conditions for convergence (qualitative):

- $x_0$ close enough to $x^*$
- $f'(x)$ does not change sign near $x^*$
- $f(x)$ is not too nonlinear near $x^*$

Newton’s method is a local method.

Difficulty: Finding $x_0$
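The sensitivity to the starting point is easy to observe numerically. The sketch below is illustrative only: `newton` and `atan_prime` are hypothetical helper names, and the number of steps is fixed rather than driven by a termination test. Because Python numbers can be complex, the same helper also runs the complex example $f(x) = x^2 + x + 1$ from $x_0 = i$.

```python
import math

def newton(f, fprime, x0, steps):
    """Return [x0, x1, ..., x_steps] from x+ = xc - f(xc)/f'(xc)."""
    xs = [x0]
    for _ in range(steps):
        xc = xs[-1]
        xs.append(xc - f(xc) / fprime(xc))
    return xs

def atan_prime(x):
    return 1.0 / (1.0 + x * x)

diverging = newton(math.atan, atan_prime, 1.5, 3)     # |x0| > 1.3917
converging = newton(math.atan, atan_prime, -1.3, 8)   # |x0| < 1.3917

# The same helper works for complex iterates.
complex_run = newton(lambda x: x * x + x + 1, lambda x: 2 * x + 1, 1j, 6)
```

The first run reproduces the growing iterates listed above, while the second settles on the root $x^* = 0$.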

Hybrid methods

Combining bisection and Newton’s methods

Bracketing interval $[a, b]$, and $x_c = a$ or $b$

    if x+ = xc - f(xc)/f'(xc) is in [a, b]
        bracketing interval [a, x+] or [x+, b]
    else
        m = (a + b)/2;
        bracketing interval [a, m] or [m, b]

Termination criteria: Any one of

- $(b_k - a_k) < \delta$
- $|f(x_c)| < \delta$
- Too many function evaluations

Avoiding derivatives

Approximation

$$f'(x_c) \approx \frac{f(x_c + \delta_c) - f(x_c)}{\delta_c}$$

Choice of $\delta_c$

Example. Secant method ($\delta_c = x_- - x_c$)

$$f'(x_c) \approx \frac{f(x_c) - f(x_-)}{x_c - x_-}$$

$$x_+ = x_c - \frac{x_c - x_-}{f(x_c) - f(x_-)}\, f(x_c)$$

Usually, the convergence rate (if it converges) is $(1 + \sqrt{5})/2 \approx 1.6$:

$e_{k+1} \le c\,e_k^{1.6}$, superlinear, between linear and quadratic.
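A minimal sketch of the secant update follows. The names `secant`, `tol` and `max_iter` and the stopping test on successive iterates are assumptions made for illustration; the notes do not prescribe a termination rule for this method.

```python
def secant(f, x_prev, x_cur, tol=1e-12, max_iter=50):
    """Secant iteration; x_prev and x_cur are two starting guesses."""
    f_prev, f_cur = f(x_prev), f(x_cur)
    for _ in range(max_iter):
        if f_cur == f_prev:            # flat secant line: cannot divide
            break
        x_next = x_cur - (x_cur - x_prev) / (f_cur - f_prev) * f_cur
        x_prev, f_prev = x_cur, f_cur
        x_cur, f_cur = x_next, f(x_next)   # one new evaluation per step
        if abs(x_cur - x_prev) < tol:
            break
    return x_cur

# Square root of 0.75 as the zero of x^2 - 0.75, no derivative needed.
root = secant(lambda x: x * x - 0.75, 0.5, 1.0)
```

Each step costs one function evaluation, compared with one evaluation of $f$ and one of $f'$ for Newton’s method.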

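Zeros of a polynomial

Finding the zeros of a polynomial

$$p = x^n + c_{n-1}x^{n-1} + \cdots + c_1 x + c_0$$

Many methods have been proposed.

The zeros of $p$ are the eigenvalues of its companion matrix

$$C(p) = \begin{bmatrix} 0 & 0 & \cdots & 0 & -c_0 \\ 1 & 0 & \cdots & 0 & -c_1 \\ 0 & 1 & \cdots & 0 & -c_2 \\ \vdots & \vdots & \ddots & \vdots & \vdots \\ 0 & 0 & \cdots & 1 & -c_{n-1} \end{bmatrix}, \qquad \det(xI - C(p)) = p$$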

Example

The zeros of the polynomial $x^3 - 1$ are the eigenvalues of

$$\begin{bmatrix} 0 & 0 & 1 \\ 1 & 0 & 0 \\ 0 & 1 & 0 \end{bmatrix}$$

One real and two complex conjugate eigenvalues.

Note:

How to compute the eigenvalues of a matrix?

Finding zeros of a polynomial used to be the way of finding the eigenvalues of a matrix A.

Textbook method: The eigenvalues of a matrix A are the zeros of its characteristic polynomial $\det(\lambda I - A)$.

Now, we have efficient and reliable methods for computing eigenvalues of a matrix: the QR method, John G. F. Francis and Vera N. Kublanovskaya, late 1950s.

We find the zeros of a polynomial by computing the eigenvalues of its companion matrix.
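The link between zeros and eigenvalues can be checked without an eigenvalue solver. The sketch below builds the companion matrix of $x^3 - 1$ and verifies that each cube root of unity $r$ satisfies $v^T C = r\,v^T$ with $v = (1, r, r^2)$, which makes $r$ an eigenvalue. The helper names `companion` and `left_residual` are hypothetical; in practice a library eigenvalue routine would be applied to the matrix instead.

```python
import cmath

def companion(c):
    """Companion matrix of x^n + c[n-1] x^(n-1) + ... + c[1] x + c[0]."""
    n = len(c)
    C = [[0.0] * n for _ in range(n)]
    for i in range(1, n):
        C[i][i - 1] = 1.0              # ones on the subdiagonal
    for i in range(n):
        C[i][n - 1] = -c[i]            # last column holds -c_0, ..., -c_{n-1}
    return C

def left_residual(C, r):
    """Largest entry of |v^T C - r v^T| for v = (1, r, ..., r^(n-1))."""
    n = len(C)
    v = [r ** i for i in range(n)]
    return max(abs(sum(v[i] * C[i][j] for i in range(n)) - r * v[j])
               for j in range(n))

C = companion([-1.0, 0.0, 0.0])        # p(x) = x^3 - 1
roots = [cmath.exp(2j * cmath.pi * k / 3) for k in range(3)]
residuals = [left_residual(C, r) for r in roots]
```

All three residuals are at rounding-error level, consistent with one real and two complex conjugate eigenvalues.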

Systems of Nonlinear Equations

Problem setting

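$$\begin{aligned} f_1(x_1, \ldots, x_n) &= 0 \\ f_2(x_1, \ldots, x_n) &= 0 \\ &\;\;\vdots \\ f_n(x_1, \ldots, x_n) &= 0 \end{aligned}$$

Denote

$$\mathbf{f}(\mathbf{x}) = \mathbf{0}$$

f: vector-valued function

x: vector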

Newton’s method

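$\mathbf{x}_+ = \mathbf{x}_c + \mathbf{s}_c$, where $\mathbf{s}_c$ is the solution of

$$\mathbf{f}(\mathbf{x}_c) + J(\mathbf{x}_c)\,\mathbf{s}_c = \mathbf{0}$$

i.e. $\mathbf{s}_c = -J^{-1}(\mathbf{x}_c)\,\mathbf{f}(\mathbf{x}_c)$, where $J(\mathbf{x}_c)$ is the Jacobian matrix of $\mathbf{f}$ at $\mathbf{x}_c$:

$$J(\mathbf{x}) = \left[\frac{\partial f_i}{\partial x_j}\right] = \begin{bmatrix} \dfrac{\partial f_1}{\partial x_1} & \cdots & \dfrac{\partial f_1}{\partial x_n} \\ \vdots & \ddots & \vdots \\ \dfrac{\partial f_n}{\partial x_1} & \cdots & \dfrac{\partial f_n}{\partial x_n} \end{bmatrix}$$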

Example:

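A system of nonlinear equations

$$\begin{cases} x_1^2 - x_2^2 = 0 \\ 2x_1x_2 = 1 \end{cases}$$

With starting point

$$\mathbf{x}_0 = \begin{bmatrix} 0 \\ 1 \end{bmatrix}$$

Solution: $x_1 = x_2 = 1/\sqrt{2}$

$$\mathbf{f}(\mathbf{x}) = \begin{bmatrix} f_1 \\ f_2 \end{bmatrix} = \begin{bmatrix} x_1^2 - x_2^2 \\ 2x_1x_2 - 1 \end{bmatrix}$$

The Jacobian is

$$J(\mathbf{x}) = \begin{bmatrix} 2x_1 & -2x_2 \\ 2x_2 & 2x_1 \end{bmatrix}$$

And

$$J(\mathbf{x}_0) = \begin{bmatrix} 0 & -2 \\ 2 & 0 \end{bmatrix}, \qquad \mathbf{f}(\mathbf{x}_0) = \begin{bmatrix} -1 \\ -1 \end{bmatrix}$$

Step 1:

$$\mathbf{x}_1 = \mathbf{x}_0 - J^{-1}(\mathbf{x}_0)\,\mathbf{f}(\mathbf{x}_0)$$

Solving for $\mathbf{d}_0$ in $J(\mathbf{x}_0)\,\mathbf{d}_0 = \mathbf{f}(\mathbf{x}_0)$, we have

$$\mathbf{d}_0 = \begin{bmatrix} -0.5 \\ 0.5 \end{bmatrix}$$

Thus

$$\mathbf{x}_1 = \begin{bmatrix} 0 \\ 1 \end{bmatrix} - \begin{bmatrix} -0.5 \\ 0.5 \end{bmatrix} = \begin{bmatrix} 0.5 \\ 0.5 \end{bmatrix}$$

Step 2:

$$\mathbf{x}_2 = \mathbf{x}_1 - J^{-1}(\mathbf{x}_1)\,\mathbf{f}(\mathbf{x}_1)$$

$$J(\mathbf{x}_1) = \begin{bmatrix} 1 & -1 \\ 1 & 1 \end{bmatrix}, \qquad \mathbf{f}(\mathbf{x}_1) = \begin{bmatrix} 0 \\ -0.5 \end{bmatrix}$$

Solving for $\mathbf{d}_1$ in $J(\mathbf{x}_1)\,\mathbf{d}_1 = \mathbf{f}(\mathbf{x}_1)$, we have

$$\mathbf{d}_1 = \begin{bmatrix} -0.25 \\ -0.25 \end{bmatrix}$$

Thus

$$\mathbf{x}_2 = \begin{bmatrix} 0.5 \\ 0.5 \end{bmatrix} - \begin{bmatrix} -0.25 \\ -0.25 \end{bmatrix} = \begin{bmatrix} 0.75 \\ 0.75 \end{bmatrix}$$

Avoiding derivatives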

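The jth column of $J(\mathbf{x})$,

$$\begin{bmatrix} \dfrac{\partial f_1}{\partial x_j} \\ \vdots \\ \dfrac{\partial f_n}{\partial x_j} \end{bmatrix} = \frac{\partial \mathbf{f}}{\partial x_j},$$

can be approximated by the difference

$$\frac{\mathbf{f}(x_1, \ldots, x_j + \delta, x_{j+1}, \ldots, x_n) - \mathbf{f}(x_1, \ldots, x_n)}{\delta}$$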
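The worked example and the finite-difference Jacobian can be combined in a short sketch. Everything here is illustrative: `fd_jacobian`, `solve2` and `newton_step` are hypothetical helpers, the step `delta = 1e-7` is an arbitrary choice, and the linear solve is hard-coded for the 2-by-2 case via Cramer's rule.

```python
def fd_jacobian(f, x, delta=1e-7):
    """Forward differences: column j is (f(x + delta*e_j) - f(x)) / delta."""
    fx = f(x)
    n = len(x)
    J = [[0.0] * n for _ in range(len(fx))]
    for j in range(n):
        xp = list(x)
        xp[j] += delta
        fp = f(xp)
        for i in range(len(fx)):
            J[i][j] = (fp[i] - fx[i]) / delta
    return J

def solve2(J, b):
    """Solve the 2x2 system J d = b by Cramer's rule."""
    det = J[0][0] * J[1][1] - J[0][1] * J[1][0]
    return [(b[0] * J[1][1] - J[0][1] * b[1]) / det,
            (J[0][0] * b[1] - b[0] * J[1][0]) / det]

def newton_step(f, x, jac):
    d = solve2(jac(x), f(x))           # J(x) d = f(x)
    return [x[0] - d[0], x[1] - d[1]]  # x+ = x - d

def f(x):
    return [x[0] ** 2 - x[1] ** 2, 2 * x[0] * x[1] - 1]

def exact_jac(x):
    return [[2 * x[0], -2 * x[1]], [2 * x[1], 2 * x[0]]]

x1 = newton_step(f, [0.0, 1.0], exact_jac)   # Step 1 of the example
x2 = newton_step(f, x1, exact_jac)           # Step 2 of the example

# Same iteration with the Jacobian replaced by finite differences.
x = [0.0, 1.0]
for _ in range(8):
    x = newton_step(f, x, lambda v: fd_jacobian(f, v))
```

The exact-Jacobian steps reproduce the hand computation, and the derivative-free run approaches $x_1 = x_2 = 1/\sqrt{2}$.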

Continuous Optimization

Problem setting

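$$\min_{\mathbf{x} \in S} f(\mathbf{x}) \quad \text{or} \quad \max_{\mathbf{x} \in S} f(\mathbf{x})$$

x: vector of variables

f(x): objective function, real valued

S: support (the set over which x ranges)

Find a zero of the gradient

$$\nabla f(\mathbf{x}) = \begin{bmatrix} \dfrac{\partial f(\mathbf{x})}{\partial x_1} \\ \vdots \\ \dfrac{\partial f(\mathbf{x})}{\partial x_n} \end{bmatrix}$$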

Newton’s method

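View the gradient $\nabla f(\mathbf{x})$ as a vector-valued function and apply Newton’s method for solving nonlinear systems.

At the current $\mathbf{x}_c$, find the correction $\mathbf{s}_c$:

$$\mathbf{x}_+ = \mathbf{x}_c + \mathbf{s}_c$$

where $\mathbf{s}_c$ is the solution of

$$\nabla f(\mathbf{x}_c) + H(\mathbf{x}_c)\,\mathbf{s}_c = \mathbf{0}$$

The matrix $H(\mathbf{x}_c)$ (the Jacobian of the gradient at $\mathbf{x}_c$) is called the Hessian of $f$ at $\mathbf{x}_c$, written $\nabla^2 f(\mathbf{x}_c)$:

$$H_{i,j} = \frac{\partial^2 f}{\partial x_i\,\partial x_j}$$

Example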

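Minimizing $f : \mathbb{R}^2 \to \mathbb{R}$:

$$f(\mathbf{x}) = \frac{1}{3}x_1^3 - x_1x_2^2 + x_2$$

Perform one iteration of Newton’s method for minimizing $f$ using the starting point

$$\mathbf{x}_0 = \begin{bmatrix} 0 \\ 1 \end{bmatrix}$$

Apply Newton’s method for finding a zero of the gradient

$$\nabla f(\mathbf{x}) = \begin{bmatrix} x_1^2 - x_2^2 \\ -2x_1x_2 + 1 \end{bmatrix}$$

The Hessian

$$H(\mathbf{x}) = \nabla^2 f(\mathbf{x}) = \begin{bmatrix} 2x_1 & -2x_2 \\ -2x_2 & -2x_1 \end{bmatrix}$$

Step 1:

$$\mathbf{x}_1 = \mathbf{x}_0 - \nabla^2 f(\mathbf{x}_0)^{-1}\,\nabla f(\mathbf{x}_0) = \begin{bmatrix} 0 \\ 1 \end{bmatrix} - \begin{bmatrix} 0 & -1/2 \\ -1/2 & 0 \end{bmatrix} \begin{bmatrix} -1 \\ 1 \end{bmatrix} = \begin{bmatrix} 1/2 \\ 1/2 \end{bmatrix}$$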
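The single step above can be checked numerically. This sketch hard-codes the gradient and Hessian of the example function; the helper names are illustrative, and the 2-by-2 system $H\,\mathbf{s} = -\nabla f$ is solved by Cramer's rule rather than by forming the inverse.

```python
def gradient(x):
    """Gradient of f(x) = x1^3/3 - x1*x2^2 + x2."""
    return [x[0] ** 2 - x[1] ** 2, -2 * x[0] * x[1] + 1]

def hessian(x):
    return [[2 * x[0], -2 * x[1]], [-2 * x[1], -2 * x[0]]]

def newton_min_step(x):
    """One step x+ = x + s, where H(x) s = -grad f(x)."""
    g, H = gradient(x), hessian(x)
    det = H[0][0] * H[1][1] - H[0][1] * H[1][0]
    s = [(-g[0] * H[1][1] + H[0][1] * g[1]) / det,
         (-H[0][0] * g[1] + g[0] * H[1][0]) / det]
    return [x[0] + s[0], x[1] + s[1]]

x1 = newton_min_step([0.0, 1.0])
```

The result agrees with the hand computation, $\mathbf{x}_1 = (1/2, 1/2)$.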

Summary

- Issues in an iterative method: Initialization, convergence and rate of convergence, termination. The example of computing the square root

- Bisection method: Numerical termination problem

- Newton’s method: Initial value, convergence problems

- Newton’s method for systems of nonlinear equations, Jacobian matrix

- Newton’s method for minimization, gradient and Hessian

Appendix A – Glossary

Condition number: The condition number associated with the linear equation Ax = b gives a bound on how inaccurate the solution x will be after approximation. Conditioning is a property of the matrix, not of the algorithm or the floating-point accuracy of the computer used to solve the corresponding system. In particular, one should think of the condition number as being (very roughly) the rate at which the solution, x, will change with respect to a change in b. Thus, if the condition number is large, even a small error in b may cause a large error in x. On the other hand, if the condition number is small, then the error in x will not be much bigger than the error in b.

Explicit methods: Explicit methods calculate the state of the system at a later time from the state of the system at the current time, without solving algebraic equations.

Implicit method: Contrary to explicit, finds the solution by solving an equation involving the current state of the system and the later one. Slower but more stable on stiff equations.

Stiff equation: A differential equation for which certain numerical methods for solving the equation are numerically unstable unless the step size chosen is extremely small. Formulating a precise definition is difficult (not my words); the main idea is that the equation includes some terms that can lead to rapid variation in the solution.

Appendix B – Things you really should remember considering you can do a Fourier series in your sleep but apparently you don’t remember

$$y - y_1 = \frac{y_2 - y_1}{x_2 - x_1}(x - x_1)$$

Appendix C – Code

Ncspline

function [b,c,d] = ncspline(x,y)
% Usage: [b,c,d] = ncspline(x,y)
%
% natural cubic spline interpolation of (x,y)
%
% Input:
%   x,y     coordinates to be interpolated
% Output:
%   b,c,d   coefficients of cubic polynomials
%           s(x) = yi + bi(x-xi) + ci(x-xi)^2 + di(x-xi)^3
%
% Dependencies:
%   decompt.m   tridiagonal decomposition
%   solvet.m    solver for tridiagonal systems

n = length(x);
h = x(2:n) - x(1:n-1);                  % dx
delta = (y(2:n) - y(1:n-1))./h;         % dy/dx
%
% form the tridiagonal matrix
u = h(2:n-2);                           % subdiagonal, symmetric
d = 2*(h(1:n-2) + h(2:n-1));            % main diagonal
%
% decompose the matrix
[u,d,l,rcond] = decompt(u,d,u);
if (rcond == 0.0)
    error('Exact singularity is detected.');
end
%
% form the right-hand side
r(1:n-2) = delta(2:n-1) - delta(1:n-2);
%
% solve for sigma
sigma = zeros(size(x));
sigma(2:n-1) = solvet(u,d,l,r);
% set end points, natural spline
sigma(1) = 0; sigma(n) = 0;
%
% compute the coefficients
b = delta - h.*(sigma(2:n) + 2*sigma(1:n-1));
c = 3*sigma(1:n-1);
d = (sigma(2:n) - sigma(1:n-1))./h;

locate

function i = locate(x, z, g)
% Usage: i = locate(x, z, g)
%
% This function finds i such that z is in [x(i), x(i+1)]
%
% input:
%   x   coordinates, x(1)<x(2)<...<x(length(x))
%   z   a point in [x(1), x(length(x))]
%   g   guess for the location, between 1 and length(x)-1
%       optional third argument
% output:
%   i   index such that z is in [x(i), x(i+1)]

n = length(x);
%
if nargin==3
    % try the initial guess
    if ((x(g)<=z) & (z<=x(g+1)))
        i = g; return
    % try a neighboring subinterval
    elseif ((g<n-1) & (x(g+1)<=z) & (z<=x(g+2)))
        i = g+1; return
    end
end
%
if z==x(n)
    i = n-1;
else
    % binary search
    left = 1; right = n;
    while (right > left+1)
        % x(left) <= z <= x(right)
        mid = floor((left+right)/2);
        if (z < x(mid))
            right = mid;
        else
            left = mid;
        end
    end
    i = left;
end

seval

function v = seval(z,x,y,b,c,d)
% Usage: v = seval(z,x,y,b,c,d)
%
% Evaluate natural cubic spline at z
%
% input:
%   z       points at which the spline is evaluated
%           they are in the range [x(1), x(length(x))]
%   x,y     coordinates to be interpolated
%   b,c,d   coefficients returned by ncspline.m
% output:
%   v       values of the spline at z
% dependency:
%   locate.m    locator

n = length(z);
%
g = 1;      % initial guess of the location of z(1)
for j=1:n
    i = locate(x,z(j),g);       % locate z(j)
    % evaluate at z(j) in [x(i), x(i+1)]
    tmp = z(j) - x(i);
    v(j) = y(i)+tmp*(b(i)+tmp*(c(i)+tmp*d(i)));
    g = i;                      % guess for the next location
end;

QUADRm

function [numI, esterr, fcnt, minl] = QUADRm(fname, a, b, tol, maxlev)
% Usage: [numI, esterr, fcnt, minl] = QUADRm(fname, a, b, tol, maxlev)
%
% Modified adaptive quadrature using rectangle rule.
%
% Inputs
%   fname   name of the function to be integrated
%   a,b     interval [a,b]
%   tol     tolerance
%   maxlev  max level of recursion, set to 10
%           if not provided or maxlev>10 or maxlev<1,
%           to prevent infinite recursion.
% Outputs
%   numI    numerical integration
%   esterr  estimated error
%   fcnt    the total number of function evaluations
%   minl    the length of the smallest panel
% Dependency
%   quadrrm.m

if (nargin<5)
    maxlev = 10;
elseif ((maxlev>10) | (maxlev<1))
    maxlev = 10;
end
% initialization
mid = (a+b)/2;
fm = feval(fname, mid);
[numI, esterr, fcnt, minl] = quadrrm(fname, a, b, tol, maxlev, fm);
fcnt = fcnt + 1;

quadrrm

function [numI, esterr, fcnt, minl] = quadrrm(fname, a, b, tol, maxlev, fm)
% Usage: [numI, esterr, fcnt, minl] = quadrrm(fname, a, b, tol, maxlev, fm)
%
% Modified recursive function for adaptive quadrature using rectangle rule.
%
% Inputs
%   fm      function value at the middle point
% Outputs
%   fcnt    the total number of function evaluations

h = b - a;                      % interval length
mid = a + h/2;                  % mid point
fm1 = feval(fname, mid-h/4);    % for the first sub-interval
fm2 = feval(fname, mid+h/4);    % for the second sub-interval
fcnt = 2;                       % the # of feval for this iter
minl = h/2;
%
R1 = fm*h;                      % one-panel
R2 = (fm1 + fm2)*h/2;           % two-panel
% use R1 and R2 to estimate the error in R2
esterr = abs(R1 - R2)/3;
%
if ((esterr <= tol) | (maxlev <= 1))
    % return if error is small enough or max level of recursion
    % has been reached
    numI = R2;
else
    % bisect the interval and apply the quadrature to
    % the two subintervals
    [num1, err1, fcnt1, minl1] = quadrrm(fname,a,mid,tol/2,maxlev-1,fm1);
    [num2, err2, fcnt2, minl2] = quadrrm(fname,mid,b,tol/2,maxlev-1,fm2);
    numI = num1 + num2;
    esterr = err1 + err2;
    fcnt = fcnt + fcnt1 + fcnt2;
    % update minl
    minl = min(minl1, minl2);
end

QUADS

function [numI, esterr, fcnt, minl] = QUADS(fname, a, b, tol, maxlev)
% Usage: [numI, esterr, fcnt, minl] = QUADS(fname, a, b, tol, maxlev)
%
% Adaptive quadrature using Simpson's rule.
%
% Inputs
% Outputs
% Dependency
%   quadsr.m

if (nargin<5)
    maxlev = 10;
elseif ((maxlev>10) | (maxlev<1))
    maxlev = 10;
end
h = b - a;          % interval length
mid = a + h/2;      % mid point
minl = h/2;         % the length of the smallest panel
%
fa = feval(fname, a);
fb = feval(fname, b);
fm = feval(fname, mid);
[numI, esterr, fcnt, minl] = quadsr(fname,a,b,tol,maxlev,fa,fb,fm);
fcnt = fcnt + 3;

quadsr

function [numI, esterr, fcnt, minl] = quadsr(fname, a, b, tol, maxlev, fa, fb, fm)
% Usage: [numI, esterr, fcnt, minl] = quadsr(fname, a, b, tol, maxlev, fa, fb, fm)
%
% Recursive function for adaptive quadrature using Simpson's rule.
%
% Inputs
%   fa  evaluated function at the beginning point
%   fb  evaluated function at the end point
%   fm  evaluated function at the middle point
% Outputs

h = b - a;          % interval length
mid = a + h/2;      % mid point
%
f1 = feval(fname, mid - h/4);
f2 = feval(fname, mid + h/4);
fcnt = 2;
minl = h/2;
S1 = (fa + 4*fm + fb)*h/6;                  % one-panel
S2 = (fa + 4*f1 + 2*fm + 4*f2 + fb)*h/12;   % two-panel
% use S1 and S2 to estimate the error in S2
esterr = abs(S1 - S2)/15;
%
if ((esterr <= tol) | (maxlev <= 1))
    % return if error is small enough or max level of recursion
    % has been reached
    numI = S2;
else
    % bisect the interval and apply the quadrature to
    % the two subintervals
    [num1, err1, fcnt1, minl1] = quadsr(fname,a,mid,tol/2,maxlev-1,fa,fm,f1);
    [num2, err2, fcnt2, minl2] = quadsr(fname,mid,b,tol/2,maxlev-1,fm,fb,f2);
    % sum up numerical integrations and errors
    numI = num1 + num2;
    esterr = err1 + err2;
    % sum up the numbers of function evaluations
    fcnt = fcnt + fcnt1 + fcnt2;
    % update minl
    minl = min(minl1, minl2);
end

InvPend

function yprime = InvPend(t,y)
% Usage: yprime = InvPend(t,y)
%
% This function defines the inverted pendulum differential equation
% transformed from the second order ode.
%
% Variables:
%   t   time, scalar
%   y   a 2-vector, where
%
%       y(1) = theta(t) and y(2) = (d/dt)theta(t)
%
%       where theta(t) is the angle in the inverted pendulum problem.

% parameters
global L;
global A;
global w;
% constant, gravitation
global g;

yprime(1,1) = y(2);
yprime(2,1) = 3*(g - A*w*w*sin(w*t))*sin(y(1))/(2*L);

pendulum

% script file pendulum.m
%
% solve and plot the inverted pendulum problem defined in InvPend.m
%
% Dependency
%   InvPend.m

global g;
g = 386.09;             % constant g
global L;
global A;
global w;
L = 10;                 % length
A = input('Amplitude A: ');
w = input('Angle speed w: ');
tInit = 0;              % initial t
tFinal = 2.0;           % final t
yInit = zeros(2,1);     % [y(0); (d/dt)y(0)]
yInit(1,1) = input('Initial angle: ');
%
% use default tolerance RelTol = 1e-3 and AbsTol = 1e-6
[t,y] = ode45('InvPend', [tInit tFinal], yInit);
%
figure                  % start a new figure window