Iterative methods with known gradient at the solution


2.1.3 Iterative methods with known gradient at the solution

Suppose that we can measure the gradient $g(\bar{x})$ of the function $f(x)$ at the solution $\bar{x}$ (i.e. $g(\bar{x})$ can be measured). Can we use this additional information to improve the efficiency of root-finding algorithms? For example, using $g(\bar{x})$ instead of $g(x)$ in the standard Newton method leads to an algorithm which has the same order of convergence while being more efficient. For this reason I will call it the Efficient Newton method. The Efficient Newton method has been applied in vision-based robot control by [Espiau 92] and for image registration by [Baker 04]. I will show that we can use the information on $g(\bar{x})$ differently to design an algorithm called ESM (see [Malis 04a], [Benhimane 04]) which has the same order of convergence as the Halley method while being more efficient, since it has the same computational complexity per iteration as the standard Newton method.

2.1.3.1 The efficient Newton method

The efficient Newton method can be obtained by considering the Taylor series of $f(x)$ about $\bar{x}$:

$$f(\bar{x} + x) = f(\bar{x}) + g(\bar{x})\,x + \frac{1}{2}\,h(x^{*})\,x^{2} \tag{2.41}$$

Since $f(\bar{x}) = 0$, keeping terms up to first order we obtain:

$$f(\bar{x} + x) \approx g(\bar{x})\,x \tag{2.42}$$

Evaluating the equation at $x = -\tilde{x} = \hat{x} - \bar{x}$ gives:

$$f(\hat{x}) \approx -g(\bar{x})\,\tilde{x}$$

We obtain an efficient Newton method by computing:

$$\tilde{x} = -\frac{f(\hat{x})}{g(\bar{x})}$$

It is efficient since the inverse of the gradient is computed once and for all. For the efficient Newton method $k(x) = 1/g(\bar{x})$ and the function $\varphi(x)$ is defined as:

$$\varphi(x) = x - \frac{f(x)}{g(\bar{x})} \tag{2.43}$$
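As an illustration, the iteration $x \leftarrow \varphi(x)$ with the constant gain $1/g(\bar{x})$ can be sketched in a few lines of Python. This is only a minimal sketch: the function name `efficient_newton`, the stopping rule and the test function $f(x) = e^{x} - 2$ (root $\bar{x} = \ln 2$, gradient at the solution $g(\bar{x}) = 2$) are assumptions chosen for the example and do not come from the text.

```python
import math


def efficient_newton(f, g_bar, x0, tol=1e-12, max_iter=50):
    """Iterate x <- x - f(x) / g_bar, with g_bar the gradient at the solution.

    Hypothetical helper: the name, tolerance and stopping rule are assumptions.
    """
    if g_bar == 0:
        raise ValueError("the gradient at the solution must be non-zero")
    k = 1.0 / g_bar  # inverse of the gradient, computed once and for all
    x = x0
    for i in range(max_iter):
        step = -k * f(x)  # efficient Newton increment
        x += step
        if abs(step) < tol:
            return x, i + 1
    return x, max_iter


# Assumed test problem: f(x) = exp(x) - 2, so x_bar = ln 2 and g(x_bar) = 2.
f = lambda x: math.exp(x) - 2.0
root, n_iter = efficient_newton(f, g_bar=2.0, x0=1.0)
print(root, n_iter)
```

Note that no gradient is evaluated inside the loop: only $f$ is evaluated at each iteration, which is where the saving over the standard Newton method comes from.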

2.1.3.1.1 Convergence domain of the Efficient Newton method

The convergence domain of the Efficient Newton method could be bigger than the convergence domain of the Newton and Halley methods. Indeed, since I supposed $g(\bar{x}) \neq 0$, the Efficient Newton iteration can always be computed.

Theorem 3 If $0 < g(x) \le g(\bar{x})$ then the monotone convergence domain of the Newton-Raphson method is included in the monotone convergence domain of the Efficient Newton method. This means that if the Newton-Raphson method converges then the Efficient Newton method also converges.

Proof: The monotone convergence domain of the Newton-Raphson method is defined by:

$$|\varphi_{NRM}(x) - \bar{x}| < \lambda\,|x - \bar{x}| \tag{2.44}$$

where:

$$\varphi_{NRM}(x) = x + \tilde{x}_{NRM} = x - \frac{f(x)}{g(x)}$$

while the convergence domain of the Efficient Newton method is defined by:

$$|\varphi_{ENM}(x) - \bar{x}| < \lambda\,|x - \bar{x}| \tag{2.45}$$

where:

$$\varphi_{ENM}(x) = x + \tilde{x}_{ENM} = x - \frac{f(x)}{g(\bar{x})}$$

The Efficient Newton increment can be rewritten as a function of the Newton increment:

$$\tilde{x}_{ENM} = -\frac{g(x)}{g(\bar{x})}\,\frac{f(x)}{g(x)} = \gamma(x)\,\tilde{x}_{NRM}$$

where:

$$\gamma(x) = \frac{g(x)}{g(\bar{x})}$$

If $0 < g(x) \le g(\bar{x})$ then $0 < \gamma(x) \le 1$ and

$$|\varphi_{ENM}(x) - \bar{x}| \le |\varphi_{NRM}(x) - \bar{x}| < \lambda\,|x - \bar{x}| \tag{2.46}$$

2.1.3.1.2 Order of convergence of the Efficient Newton method

The following theorem shows that the efficient Newton method has the same order of convergence as the standard Newton-Raphson method.

Theorem 4 The efficient Newton method has at least quadratic order of convergence.

Proof: Simply apply Theorem 2 to the function $\varphi(x)$ defined in equation (2.43). The derivative of the function is:

$$\varphi'(x) = 1 - \frac{g(x)}{g(\bar{x})} \tag{2.47}$$

When computed at $\bar{x}$ we obtain:

$$\varphi'(\bar{x}) = 0 \tag{2.48}$$
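The quadratic order can also be observed numerically. The sketch below estimates the order from three consecutive errors $e_{n-1}, e_n, e_{n+1}$ as $\log(e_{n+1}/e_n) / \log(e_n/e_{n-1})$. The helper names, the round-off threshold and the test function $f(x) = e^{x} - 2$ are assumptions made for illustration only.

```python
import math


def iterate(phi, x0, n):
    """Return [x0, phi(x0), phi(phi(x0)), ...] with n applications of phi."""
    xs = [x0]
    for _ in range(n):
        xs.append(phi(xs[-1]))
    return xs


def estimated_orders(xs, x_bar, floor=1e-14):
    """Estimate the convergence order from consecutive error triples."""
    errs = [abs(x - x_bar) for x in xs]
    orders = []
    for e0, e1, e2 in zip(errs, errs[1:], errs[2:]):
        if min(e0, e1, e2) > floor:  # skip errors dominated by round-off
            orders.append(math.log(e2 / e1) / math.log(e1 / e0))
    return orders


# Assumed test problem: f(x) = exp(x) - 2, x_bar = ln 2, g(x_bar) = 2.
x_bar = math.log(2.0)
g_bar = 2.0
phi_enm = lambda x: x - (math.exp(x) - 2.0) / g_bar  # same form as (2.43)

orders = estimated_orders(iterate(phi_enm, 1.0, 5), x_bar)
print(orders)  # the estimates should approach 2
```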

2.1.3.2 The Efficient Second-order approximation Method

Like the Efficient Newton method, the Efficient Second-order approximation Method (ESM) assumes that $g(\bar{x})$ can be measured. Like the Newton-Raphson method, it is a root-finding method. However, like the Halley method, it uses the first two terms of the Taylor series.

The ESM uses the Taylor series of $g(x)$ about $\hat{x}$, which can be written:

$$g(\hat{x} + x) = g(\hat{x}) + h(\hat{x})\,x + \frac{1}{2}\,q(x^{*})\,x^{2} \tag{2.49}$$

where the last term is a second-order Lagrange remainder. Then, we can compute:

$$h(\hat{x})\,x = g(\hat{x} + x) - g(\hat{x}) - \frac{1}{2}\,q(x^{*})\,x^{2} \tag{2.50}$$

Plugging equation (2.50) into equation (2.2) we obtain an expression of the function $f(x)$ without the second-order terms:

$$f(\hat{x} + x) = f(\hat{x}) + \frac{1}{2}\bigl(g(\hat{x}) + g(\hat{x} + x)\bigr)\,x - \frac{1}{12}\,q(x^{*})\,x^{3} \tag{2.51}$$
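The substitution can be written out step by step. The derivation below is a sketch that assumes equation (2.2), which is not reproduced in this section, is the Taylor series of $f$ about $\hat{x}$ with a third-order Lagrange remainder $\frac{1}{6} q(x^{*}) x^{3}$, and that treats the two intermediate points $x^{*}$ as the same symbol, as the text does.

```latex
\begin{align*}
f(\hat{x}+x)
  &= f(\hat{x}) + g(\hat{x})\,x
     + \tfrac{1}{2}\bigl[h(\hat{x})\,x\bigr]\,x
     + \tfrac{1}{6}\,q(x^{*})\,x^{3} \\
  &= f(\hat{x}) + g(\hat{x})\,x
     + \tfrac{1}{2}\bigl[g(\hat{x}+x) - g(\hat{x})
     - \tfrac{1}{2}\,q(x^{*})\,x^{2}\bigr]\,x
     + \tfrac{1}{6}\,q(x^{*})\,x^{3} \\
  &= f(\hat{x}) + \tfrac{1}{2}\bigl(g(\hat{x}) + g(\hat{x}+x)\bigr)\,x
     + \bigl(\tfrac{1}{6} - \tfrac{1}{4}\bigr)\,q(x^{*})\,x^{3} \\
  &= f(\hat{x}) + \tfrac{1}{2}\bigl(g(\hat{x}) + g(\hat{x}+x)\bigr)\,x
     - \tfrac{1}{12}\,q(x^{*})\,x^{3}
\end{align*}
```

The second-order term in $h(\hat{x})$ disappears entirely, and only a third-order remainder with coefficient $-\frac{1}{12}$ is left.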

Keeping the terms of this equation only up to second order, we obtain an efficient second-order approximation of the function:

$$f(\hat{x} + x) \approx f(\hat{x}) + \frac{1}{2}\bigl(g(\hat{x}) + g(\hat{x} + x)\bigr)\,x \tag{2.52}$$

When compared to the second-order approximation (2.31) used in the Halley method, the second-order approximation (2.52) is clearly more efficient since it is obtained without computing the second derivatives of the function. Thus, the ESM can also be viewed as an efficient version of the Halley method where, instead of plugging the Newton-Raphson iteration into the second-order approximation (2.31), we plug in equation (2.50). Setting $x = \tilde{x}$ and $f(\hat{x} + \tilde{x}) = 0$ (so that $\hat{x} + \tilde{x} = \bar{x}$ and $g(\hat{x} + \tilde{x}) = g(\bar{x})$), one can solve the following linear equation:

$$f(\hat{x}) + \frac{1}{2}\bigl(g(\hat{x}) + g(\bar{x})\bigr)\,\tilde{x} = 0 \tag{2.53}$$

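Supposing that $g(\hat{x}) + g(\bar{x}) \neq 0$, we find that the ESM iteration is:

$$\tilde{x} = -\frac{2\,f(\hat{x})}{g(\hat{x}) + g(\bar{x})} \tag{2.54}$$

For the ESM $k(x) = 2/(g(x) + g(\bar{x}))$ and the function $\varphi(x)$ is defined as:

$$\varphi(x) = x - \frac{2\,f(x)}{g(x) + g(\bar{x})} \tag{2.55}$$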
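A minimal Python sketch of the ESM iteration follows. It averages the current gradient $g(\hat{x})$ with the gradient at the solution $g(\bar{x})$ and never evaluates a second derivative. The function name `esm`, the stopping rule and the test function $f(x) = e^{x} - 2$ are assumptions made for this example.

```python
import math


def esm(f, g, g_bar, x0, tol=1e-12, max_iter=50):
    """Iterate x <- x - 2 f(x) / (g(x) + g_bar), with g_bar = g(x_bar).

    Hypothetical helper: the name, tolerance and stopping rule are assumptions.
    """
    x = x0
    for i in range(max_iter):
        denom = g(x) + g_bar
        if denom == 0.0:
            # The increment is undefined when g(x) = -g(x_bar).
            raise ZeroDivisionError("g(x) + g(x_bar) vanished")
        step = -2.0 * f(x) / denom  # ESM increment
        x += step
        if abs(step) < tol:
            return x, i + 1
    return x, max_iter


# Assumed test problem: f(x) = exp(x) - 2, g(x) = exp(x), g(x_bar) = 2.
f = lambda x: math.exp(x) - 2.0
g = math.exp
root, n_iter = esm(f, g, g_bar=2.0, x0=1.0)
print(root, n_iter)
```

Each iteration costs one evaluation of $f$ and one of $g$, the same as a standard Newton step.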

2.1.3.2.1 Convergence domain of the ESM

The following theorem shows that the bounds on the convergence domain of the ESM are wider than the bounds on the convergence domains of the Newton-Raphson and Halley methods.

Theorem 5 The bounds of the convergence domains of the Newton-Raphson and Halley methods are included in the bounds of the convergence domain of the ESM.

Proof: The bounds for the Newton-Raphson and Halley methods are defined by $g(x) = 0$. On the other hand, the ESM iteration cannot be computed when $g(x) = -g(\bar{x})$, which determines the bounds of its convergence domain. This means that the sign of $g(x)$ must be opposite to the sign of $g(\bar{x})$. Since $g(x)$ is smooth, it must become null before changing its sign. This means that if $g(x_1) = 0$ and $g(x_2) = -g(\bar{x})$ then

$$|x_1 - \bar{x}| < |x_2 - \bar{x}|$$

which proves that the bounds of the convergence domain of the ESM are wider.

Theorem 6 If $0 < g(x) \le g(\bar{x})$ then the monotone convergence domain of the Newton-Raphson method is included in the monotone convergence domain of the ESM. This means that if the Newton-Raphson method converges then the ESM also converges.

Proof: The monotone convergence domain of the Newton-Raphson method is defined by:

$$|\varphi_{NRM}(x) - \bar{x}| < \lambda\,|x - \bar{x}| \tag{2.56}$$

where:

$$\varphi_{NRM}(x) = x + \tilde{x}_{NRM} = x - \frac{f(x)}{g(x)}$$

while the convergence domain of the ESM is defined by:

$$|\varphi_{ESM}(x) - \bar{x}| < \lambda\,|x - \bar{x}| \tag{2.57}$$

where:

$$\varphi_{ESM}(x) = x + \tilde{x}_{ESM} = x - \frac{2\,f(x)}{g(x) + g(\bar{x})}$$

The ESM increment can be rewritten as a function of the Newton increment:

$$\tilde{x}_{ESM} = -\frac{2\,g(x)}{g(x) + g(\bar{x})}\,\frac{f(x)}{g(x)} = \gamma(x)\,\tilde{x}_{NRM}$$

where:

$$\gamma(x) = \frac{2\,g(x)}{g(x) + g(\bar{x})}$$

If $0 < g(x) \le g(\bar{x})$ then $0 < \gamma(x) \le 1$ and

$$|\varphi_{ESM}(x) - \bar{x}| \le |\varphi_{NRM}(x) - \bar{x}| < \lambda\,|x - \bar{x}| \tag{2.58}$$
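The scaling factors used in the proofs of Theorems 3 and 6 can be tabulated on a concrete function. The sketch below assumes the test function $f(x) = e^{x} - 2$, for which $g(x) = e^{x}$ satisfies $0 < g(x) \le g(\bar{x})$ whenever $x \le \bar{x} = \ln 2$. The names `gamma_enm` and `gamma_esm` are hypothetical labels for the two factors $\gamma(x)$.

```python
import math

g = math.exp              # gradient of the assumed f(x) = exp(x) - 2
x_bar = math.log(2.0)     # solution of f(x) = 0
g_bar = g(x_bar)          # gradient at the solution


def gamma_enm(x):
    """Efficient Newton increment divided by the Newton increment."""
    return g(x) / g_bar


def gamma_esm(x):
    """ESM increment divided by the Newton increment."""
    return 2.0 * g(x) / (g(x) + g_bar)


# Sample points on the side where 0 < g(x) <= g(x_bar) holds.
samples = [x_bar - d for d in (0.0, 0.1, 0.5, 1.0, 2.0)]
rows = [(x, gamma_enm(x), gamma_esm(x)) for x in samples]
for x, ge, gs in rows:
    print(f"x = {x:+.4f}   gamma_ENM = {ge:.4f}   gamma_ESM = {gs:.4f}")
```

Under the same hypothesis, $g(x)/g(\bar{x}) \le 2g(x)/(g(x) + g(\bar{x}))$, so on these samples the ESM factor lies between the Efficient Newton factor and 1.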

2.1.3.2.2 Order of convergence of the ESM

The following theorem shows that the ESM has the same order of convergence as the Halley method, which is higher than that of the Newton method.

Theorem 7 The Efficient Second-order approximation Method has at least cubic order of convergence.

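Proof: Simply apply Theorem 2 to the ESM function $\varphi(x) = x - 2f(x)/(g(x) + g(\bar{x}))$. The derivatives of the function are:

$$\varphi'(x) = \frac{g(\bar{x}) - g(x)}{g(x) + g(\bar{x})} + \frac{2\,h(x)}{(g(x) + g(\bar{x}))^{2}}\,f(x) \tag{2.59}$$

$$\varphi''(x) = 2\,h(x)\,\frac{g(x) - g(\bar{x})}{(g(x) + g(\bar{x}))^{2}} + \frac{2\,q(x)\,(g(x) + g(\bar{x})) - 4\,h(x)^{2}}{(g(x) + g(\bar{x}))^{3}}\,f(x) \tag{2.60}$$

When computed at $\bar{x}$, where $f(\bar{x}) = 0$ and $g(x) = g(\bar{x})$, we obtain:

$$\varphi'(\bar{x}) = 0 \tag{2.61}$$

$$\varphi''(\bar{x}) = 0 \tag{2.62}$$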
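The difference in order between the Newton-Raphson method and the ESM can be checked numerically by estimating the order from consecutive errors. This is a sketch under stated assumptions: the test function $f(x) = e^{x} - 2$, the starting point and the helper names are chosen for illustration and are not taken from the text.

```python
import math


def last_order(phi, x0, x_bar, n=8, floor=1e-14):
    """Return the last usable estimate of the convergence order of x <- phi(x)."""
    xs = [x0]
    for _ in range(n):
        xs.append(phi(xs[-1]))
    errs = [abs(x - x_bar) for x in xs]
    orders = []
    for e0, e1, e2 in zip(errs, errs[1:], errs[2:]):
        if min(e0, e1, e2) > floor:  # skip errors dominated by round-off
            orders.append(math.log(e2 / e1) / math.log(e1 / e0))
    return orders[-1]


# Assumed test problem: f(x) = exp(x) - 2, g(x) = exp(x), x_bar = ln 2.
x_bar = math.log(2.0)
g_bar = 2.0
f = lambda x: math.exp(x) - 2.0
g = math.exp

phi_nrm = lambda x: x - f(x) / g(x)                  # Newton-Raphson
phi_esm = lambda x: x - 2.0 * f(x) / (g(x) + g_bar)  # ESM

order_nrm = last_order(phi_nrm, 2.0, x_bar)
order_esm = last_order(phi_esm, 2.0, x_bar)
print(order_nrm, order_esm)  # expected to be close to 2 and 3 respectively
```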

This theorem is important since the computational cost of the ESM is almost the same as that of the Newton method and much lower than that of the Halley method. This is even more true for multidimensional systems, where the inverse of the gradient becomes the inverse of a matrix.

In document Vision-based estimation and robot control (Page 32-36)