INEXACT NEWTON METHODS FOR SINGULAR PROBLEMS

C. T. KELLEY and Z. Q. XUE

North Carolina State University, Center for Research in Scientific Computation and Department of Mathematics, Box 8205, Raleigh, N. C. 27695-8205, USA.

In this paper we describe the effects of an inexact implementation of Newton's method on the behavior of the iteration for certain nonlinear equations in Banach space for which the Fréchet derivative is singular at the solution. We give a termination criterion for the inner iteration that preserves not only the q-linear convergence of the Newton iterates but also the fine structure required for an acceleration method.

KEY WORDS: inexact Newton method, singular nonlinear equation, simple fold, acceleration of convergence

1 INTRODUCTION

In this paper we describe the effects of an inexact [12] implementation of Newton's method on the behavior of the iteration for certain nonlinear equations in Banach space for which the Fréchet derivative is singular at the solution. As a particular example we consider the Newton-GMRES iteration [2]. We give a termination criterion for the inner iteration that preserves not only the q-linear convergence of the Newton iterates but also the fine structure required for a generalization of the acceleration method given in [22].

Let $F$ be a map from a Banach space $E$ into itself. Assume that $F(x^*) = 0$ for some $x^* \in E$ and that $F$ is twice Lipschitz continuously differentiable in a neighborhood of $x^*$. We consider a class of problems where the standard assumption in nonlinear equations that $F'(x^*)$ is nonsingular fails to hold. In several cases [3], [6], [7], [5], [24], [25], [26], [15], [14], [17], [18], the performance of Newton's method is different from that in the standard situation in two respects. First of all, the set of initial iterates from which the Newton iterates will converge is not a ball about $x^*$ but rather a restricted region that avoids the set on which $F'$ is singular. Also, the convergence is not quadratic, but q-linear with a q-factor that depends on the nature of the singularity. This behavior is shared by problems that are near such singular problems [11], [16], and the structure of the singularity affects the performance of various Newton-like methods such as the Shamanskii method [21], the chord method [9], and Broyden's method [10].

This research was supported by Air Force Office of Scientific Research grant #AFOSR-FQ8671-9101094 and National Science Foundation grant #DMS-9024622. Computing activity was also partially supported by an allocation of time from the North Carolina Supercomputing Center.

In this paper we consider the class of singular problems with simple quadratic folds [19] at the root. Our methods extend to more general situations including the higher order singularities with higher dimensional null spaces considered in [22], but the case of simple quadratic folds is sufficient to completely expose the ideas. We make the following assumptions on the singularity.

Assumption 1.1. $F'(x^*)$ has a one dimensional null space $N$ spanned by $\phi \in E$ and closed range $X$ such that $E = N \oplus X$. For any projection $P_N$ onto $N$ parallel to $X$ we have
\[ P_N F''(x^*)(\phi,\phi) \ne 0. \]

Assumption 1.1 can be modestly weakened by replacing $X$ with any complement of $N$ in $E$ and assuming that the range of $F'(x^*)$ has codimension one.

Following the notation in much of the literature on singular problems we let $\tilde x = x - x^*$ for $x \in E$ and $P_X = I - P_N$. For singularities satisfying Assumption 1.1 one region of admissible initial iterates for Newton's method is
\[ W(\rho,\theta) = \{x \in E \mid 0 < \|\tilde x\| < \rho,\ \|P_X\tilde x\| \le \theta\|P_N\tilde x\|\}, \]
for $\rho$ and $\theta$ sufficiently small. Other, larger, regions have been described in the literature [14], but iterates after the first lie in $W(\rho,\theta)$. The basic convergence and structure result is the following [6], [25].

Theorem 1.1. Let $F$ be twice Lipschitz continuously differentiable in a neighborhood of $x^*$ and let Assumption 1.1 hold. Then if $\rho$ and $\theta$ are sufficiently small and $x_0 \in W(\rho,\theta)$ the Newton sequence
\[ x_{n+1} = x_n - F'(x_n)^{-1}F(x_n) \]
exists (i.e. $F'(x_n)$ is nonsingular for all $n$), remains in $W(\rho,\theta)$, and converges to $x^*$ with q-factor $1/2$,
\[ \lim_{n\to\infty} \frac{\|\tilde x_{n+1}\|}{\|\tilde x_n\|} = \frac12. \]

If $E = R$ then quadratic convergence can be recovered by the simple artifice of doubling the Newton step at each iterate. If $E$ has dimension larger than one this trick will not work because the modified iterates may leave $W(\rho,\theta)$ and even diverge. Several methods have been proposed [8], [15], [5], [22] to recover superlinear convergence. The most generally applicable of these is the method from [22] which is described in the following theorem.
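The scalar case can be checked numerically. The following is a minimal sketch, assuming the toy function $f(x) = x^2 + x^3$ (chosen here for illustration, not taken from the paper), which has a singular root at $x^* = 0$. It prints the last error ratio of plain Newton steps, which should approach the q-factor $1/2$, and the iterates obtained when each Newton step is doubled.

```python
def f(x):
    # Toy scalar problem (an assumption for illustration): f(0) = 0 and f'(0) = 0.
    return x * x + x ** 3


def fprime(x):
    return 2.0 * x + 3.0 * x * x


def newton(x, steps, scale=1.0):
    """Return the iterates of x <- x - scale * f(x) / f'(x)."""
    xs = [x]
    for _ in range(steps):
        x = x - scale * f(x) / fprime(x)
        xs.append(x)
    return xs


plain = newton(0.1, 20)               # ordinary Newton steps
doubled = newton(0.1, 4, scale=2.0)   # each Newton step doubled

# Since x* = 0, the iterates are also the errors.
ratios = [plain[k + 1] / plain[k] for k in range(len(plain) - 1)]
print("last error ratio, plain Newton:", ratios[-1])
print("doubled-step iterates:", doubled)
```

For this particular function the doubled step maps $x$ to $x^2/(2+3x)$, so the error is squared at every step, while the plain step maps $x$ to $x(1+2x)/(2+3x)$.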

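Theorem 1.2. Let $F$ be twice Lipschitz continuously differentiable in a neighborhood of $x^*$ and let Assumption 1.1 hold. Let $\alpha \in (0,1)$ and real $C \ne 0$ be given. Then if $\rho$ and $\theta$ are sufficiently small, $x_0 \in W(\rho,\theta)$, and $y_n$, $s_n$, and $x_{n+1}$ are defined for $n \ge 0$ by
\[ y_n = x_n - F'(x_n)^{-1}F(x_n), \quad s_n = -F'(y_n)^{-1}F(y_n), \quad x_{n+1} = y_n + (2 - C\|s_n\|^\alpha)s_n, \tag{1} \]
then $\{x_n\}$ exists, remains in $W(\rho,\theta)$ and converges q-superlinearly to $x^*$ with q-order $1+\alpha$.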
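The iteration (1) can be tried on a small example. The sketch below assumes a hypothetical two dimensional test problem $F(u,v) = (u,\ v^2 + uv)$, for which $F'(0)$ has null space spanned by $(0,1)$ and range spanned by $(1,0)$; it is not from the paper, and its Jacobian is triangular so the Newton step can be written in closed form. The values $C = 1$ and $\alpha = .9$ are used only as defaults.

```python
import math


def F(x):
    # Hypothetical test problem with a simple fold at the root (0, 0).
    u, v = x
    return (u, v * v + u * v)


def newton_step(x):
    """Exact Newton step s = -F'(x)^{-1} F(x); F'(x) = [[1, 0], [v, 2v + u]]."""
    u, v = x
    f1, f2 = F(x)
    s1 = -f1
    s2 = -(f2 + v * s1) / (2.0 * v + u)
    return (s1, s2)


def norm(x):
    return math.hypot(x[0], x[1])


def accelerated_sweep(x, C=1.0, alpha=0.9):
    """One sweep of (1): Newton step to y, then a modified step from y."""
    s = newton_step(x)
    y = (x[0] + s[0], x[1] + s[1])
    sy = newton_step(y)
    factor = 2.0 - C * norm(sy) ** alpha
    return (y[0] + factor * sy[0], y[1] + factor * sy[1])


x = (0.02, 0.1)
errors = [norm(x)]          # the root is the origin, so the norm is the error
for _ in range(4):
    x = accelerated_sweep(x)
    errors.append(norm(x))
print(errors)
```

Only four sweeps are taken because the error reaches roundoff level after that, at which point $2 - C\|s_n\|^\alpha$ is no longer distinguishable from 2 in double precision.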

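FIGURE 1: $W(\rho,\theta)$. [Plot with $X$ on the horizontal axis and $N$ on the vertical axis, marking $x^*$, $x_c$, $y$, and $x_+$.]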

In Figure 1 we plot $X$ on the horizontal axis and $N$ on the vertical. The figure shows one quadrant of the convergence cone $W(\rho,\theta)$ and $x_c$ is located near the boundary. We plot the locations of the iterations generated by one sweep of the iteration in (1) beginning with $x_c \in W(\rho,\theta)$. We plot the intermediate iterate $y$ and the modified Newton iterate $x_+$. The dotted curve is the boundary of the set
\[ Q = \{x \in W(\rho,\theta) \mid \|P_X\tilde x\| \le \|P_N\tilde x\|^2\}. \]

It is known, see Lemma 2.1 in § 2, that the Newton iterates described by Theorem 1.1 satisfy $x_n \in Q$ for $n \ge 1$ and $\rho$ and $\theta$ sufficiently small. Hence the intermediate iterate $y$ satisfies $y \in Q$ and $\|\tilde y\| \approx \|\tilde x_c\|/2$. The purpose of the intermediate iterate is to locate $y$ deeply enough in $W(\rho,\theta)$ so that the modified iteration remains in $W(\rho,\theta)$. For this purpose the sign of $C$ does not matter, $|C|$ need only be large enough to keep $x_+$ in $W(\rho,\theta)$ so that the iteration can continue. Smaller values of $|C|$ require smaller values of $\rho$ and $\theta$ in order that $x_+$ [...] must be decreased as well. $C = 1$ and $\alpha = .9$ was a reasonable choice in the numerical experiments reported in [22]. In the inexact form of (1) we make different decisions and discuss those in § 2 and § 3.

If the method of Theorem 1.2 is applied to a problem for which $F'(x^*)$ is nonsingular, the intermediate iterate $y$ will reduce the error quadratically, since it is a true Newton iterate. However the modified iterate $x_+$ will increase the error in $y$ by a factor of roughly 2. The overall rate, therefore, will be two-step quadratic, the modified step being wasted. One might try to detect singularity by monitoring the progress of an unmodified Newton iteration and turning on the acceleration procedure when the ratios of successive values of $\|F(x_n)\|$ seem to be converging to $1/4$. Similar ideas might be applicable to the inexact situation considered here.

Inexact Newton methods solve the equation for the Newton step at a current iterate $x_c$,
\[ F'(x_c)s = -F(x_c), \]
approximately via some method that returns a step that satisfies
\[ \|F'(x_c)s + F(x_c)\| \le \eta_c\|F(x_c)\|. \tag{2} \]
The tolerance $\eta_c$ can vary as the iteration progresses and the results in [12] give bounds on $\eta_c$ that are sufficient to maintain various convergence rates. We state the following special case of the results in [12] to have a basis for comparison with the results in this paper.
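Condition (2) is a relative residual test that any inner solver can apply to a candidate step. The sketch below checks it for a small dense system in the Euclidean norm; the function name `satisfies_condition_2` and the 2 by 2 data are illustrative assumptions, not part of the paper.

```python
import math


def matvec(A, s):
    return [sum(a * b for a, b in zip(row, s)) for row in A]


def norm2(v):
    return math.sqrt(sum(c * c for c in v))


def satisfies_condition_2(J, Fx, s, eta):
    """True if ||J s + F(x)|| <= eta * ||F(x)||, i.e. the test in (2)."""
    residual = [js + f for js, f in zip(matvec(J, s), Fx)]
    return norm2(residual) <= eta * norm2(Fx)


# Illustrative data: J stands for F'(x_c), Fx for F(x_c).
J = [[2.0, 0.0], [0.0, 4.0]]
Fx = [1.0, 1.0]
exact = [-0.5, -0.25]   # exact Newton step, zero residual
rough = [-0.4, -0.25]   # residual is about (0.2, 0), relative size about 0.14

print(satisfies_condition_2(J, Fx, exact, 0.0))
print(satisfies_condition_2(J, Fx, rough, 0.25))
print(satisfies_condition_2(J, Fx, rough, 0.1))
```

The same rough step is accepted with $\eta_c = .25$ and rejected with $\eta_c = .1$, which is how a smaller tolerance forces more inner iterations.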

Theorem 1.3. Let $F$ be Lipschitz continuously Fréchet differentiable near a root $x^*$ and let $F'(x^*)$ be nonsingular. Let $x_{n+1} = x_n + s_n$ where
\[ \|F'(x_n)s_n + F(x_n)\| \le \eta_n\|F(x_n)\|. \]
Then for all $\bar\eta \in [0,1)$, if $x_0$ is sufficiently near $x^*$ and $\eta_n \le \bar\eta$ then $\{x_n\}$ converges to $x^*$ q-linearly in the norm $\|\cdot\|_*$ given by
\[ \|w\|_* = \|F'(x^*)w\|. \]
Moreover if $\eta_n \to 0$ then $\{x_n\}$ converges to $x^*$ q-superlinearly, and if $\eta_n = O(\|F(x_n)\|^p)$ as $n \to \infty$ for some $p \in (0,1]$ then $\{x_n\}$ converges to $x^*$ q-superlinearly with q-order at least $1+p$.

The purpose of this paper is to describe the behavior of inexact Newton methods for the class of singular problems described by Assumption 1.1. We have recently reported on success with a nested iteration multilevel version of this algorithm [13] in the context of neutron transport theory and in this paper we focus on the theoretical aspects. In § 2 we describe conditions on the sequence $\{\eta_n\}$ [...]

2 CONVERGENCE RESULTS

2.1 The Structural Lemma

As was the case in [22] we require a critical lemma on the structure of $F$ and the Newton iterates. This lemma was stated in roughly the form that we use it here in [22], and follows from earlier estimates in [6], [25], and [26].

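Lemma 2.1. Let $F$ be twice Lipschitz continuously differentiable in a neighborhood of $x^*$ and let Assumption 1.1 hold. Then there are $\rho > 0$, $\theta > 0$, and $K > 0$ such that for all $x \in W(\rho,\theta)$, $F'(x)$ is nonsingular,
\[ \|F(x)\| \le K\|\tilde x\|(\|\tilde x\| + \theta), \quad \|P_N F'(x)^{-1}\| \le K\|\tilde x\|^{-1}, \quad\text{and}\quad \|P_X F'(x)^{-1}\| \le K. \tag{3} \]
Moreover
\[ y = x - F'(x)^{-1}F(x) \]
satisfies
\[ \|P_N\tilde y - \tfrac12 P_N\tilde x\| \le K(\|P_X\tilde x\| + \|P_N\tilde x\|^2) \quad\text{and}\quad \|P_X\tilde y\| \le K(\|P_X\tilde x\|\,\|P_N\tilde x\| + \|P_N\tilde x\|^3). \tag{4} \]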

We will require a more detailed inexact form of this lemma.

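Lemma 2.2. Let $F$ be twice Lipschitz continuously differentiable in a neighborhood of $x^*$ and let Assumption 1.1 hold. Then there are $K_I > 0$, $\rho > 0$, $\theta > 0$, and $\bar\eta > 0$ such that if $\eta_c \le \bar\eta$, $x \in W(\rho,\theta)$, and $s$ satisfies
\[ \|F'(x)s + F(x)\| \le \eta_c\|F(x)\| \tag{5} \]
then the inexact Newton iterate
\[ x_+ = x + s \tag{6} \]
satisfies
\[ \|P_N\tilde x_+ - \tfrac12 P_N\tilde x\| \le K_I(\|P_X\tilde x\| + \|P_N\tilde x\|^2 + \eta_c\rho_c + \eta_c\theta_c) \]
and
\[ \|P_X\tilde x_+\| \le K_I(\|P_X\tilde x\|\,\|P_N\tilde x\| + \|P_N\tilde x\|^3 + \eta_c\rho_c^2 + \eta_c\theta_c\rho_c), \tag{7} \]
where
\[ \rho_c = \|\tilde x\| \quad\text{and}\quad \|P_X\tilde x\| = \theta_c\|P_N\tilde x\|. \tag{8} \]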

Proof. Let $\rho$ and $\theta < 1$ be small enough so that the conclusions of Lemma 2.1 hold. Keep in mind that $\rho_c \le \rho$ and $\theta_c \le \theta$. Let $\xi = F'(x)s + F(x)$ and let the Newton iterate from $x$ be
\[ x_+^N = x - F'(x)^{-1}F(x). \]
Since $x_+ - x_+^N = F'(x)^{-1}\xi$ we have by (3) that
\[ \|P_N(x_+ - x_+^N)\| \le K^2\eta_c(\rho_c + \theta_c) \quad\text{and}\quad \|P_X(x_+ - x_+^N)\| \le K^2\eta_c\rho_c(\rho_c + \theta_c). \tag{9} \]
Together with (4), this proves (7) with $K_I = \max(K, K^2)$.

From Lemma 2.2 we can make estimates on $\rho_+ = \|\tilde x_+\|$ and $\theta_+ = \|P_X\tilde x_+\|/\|P_N\tilde x_+\|$. The inexactness adds complexity that is not present in the exact case (cf. [25]) in that the first linear iteration needs to be done particularly accurately as described in (10).

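Lemma 2.3. Let $F$ be twice Lipschitz continuously differentiable in a neighborhood of $x^*$ and let Assumption 1.1 hold. Let $r \in (1/2,1)$. Then there are $K$, $M$, $\rho > 0$, $\theta > 0$, and $\bar\eta > 0$ such that if $\eta_c \le \bar\eta$, $x \in W(\rho,\theta)$, $x_+$ is given by (5) and (6), (8) holds, and
\[ \eta_c\theta_c \le M\rho_c \tag{10} \]
then $x_+ \in W(\rho_+,\theta_+) \subset W(\rho,\theta)$ with
\[ (1-r)\rho_c \le \rho_c/2 - K((\rho_c+\theta_c+\eta_c)\rho_c + \eta_c\theta_c) \le \rho_+ \le \rho_c/2 + K((\rho_c+\theta_c+\eta_c)\rho_c + \eta_c\theta_c) \le r\rho_c, \tag{11} \]
\[ \theta_+ \le K\rho_c, \tag{12} \]
and
\[ \eta_+\theta_+ \le M\rho_+ \quad\text{for all } \eta_+ \le \bar\eta. \tag{13} \]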

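Proof. We begin by letting $\rho$, $\theta$, and $\bar\eta$ be such that the conclusions of Lemma 2.2 hold. We will reduce $\rho$, $\theta$, and $\bar\eta$ as the proof progresses. By (8)
\[ (1+\theta_c)^{-1}\rho_c \le \|P_N\tilde x\| \le (1-\theta_c)^{-1}\rho_c \]
and hence, decreasing $\theta$ if needed so that $\theta_c \le 1/2$,
\[ \|P_N\tilde x\| \le (1+2\theta_c)\rho_c \le 2\rho_c, \quad \|P_X\tilde x\| \le 2\theta_c\rho_c. \]
Hence
\[ \|P_N\tilde x_+ - \tfrac12 P_N\tilde x\| \le \|P_N\tilde x_+^N - \tfrac12 P_N\tilde x\| + \|P_N\tilde x_+^N - P_N\tilde x_+\| = \|P_N\tilde x_+^N - \tfrac12 P_N\tilde x\| + \|P_N(x_+^N - x_+)\| \]
\[ \le K(2\theta_c\rho_c + 4\rho_c^2) + K^2\eta_c(\rho_c+\theta_c) \le 4K_I(\rho_c+\eta_c)(\rho_c+\theta_c), \tag{14} \]
and
\[ \|P_N\tilde x_+\| \le \tfrac12\|P_N\tilde x\| + 4K_I(\rho_c+\eta_c)(\rho_c+\theta_c) \le \rho_c/2 + \theta_c\rho_c + 4K_I(\rho_c+\eta_c)(\rho_c+\theta_c). \]
Similarly
\[ \|P_N\tilde x_+\| \ge \rho_c/2 - \theta_c\rho_c - 4K_I(\rho_c+\eta_c)(\rho_c+\theta_c). \]
Set $K_I' = 1 + 8K_I$. We have
\[ \rho_c/2 - K_I'(\rho_c+\eta_c)(\rho_c+\theta_c) \le \|P_N\tilde x_+\| \le \rho_c/2 + K_I'(\rho_c+\eta_c)(\rho_c+\theta_c). \tag{15} \]
In the same way we have
\[ \|P_X\tilde x_+\| \le \|P_X\tilde x_+ - P_X\tilde x_+^N\| + \|P_X\tilde x_+^N\| \le K^2\eta_c\rho_c(\rho_c+\theta_c) + K(4\theta_c + 8\rho_c)\rho_c^2. \]
Therefore,
\[ \|P_X\tilde x_+\| \le K_I'\rho_c(\rho_c+\eta_c)(\rho_c+\theta_c). \tag{16} \]
Combining (15) and (16) with
\[ \|P_N\tilde x_+\| - \|P_X\tilde x_+\| \le \rho_+ \le \|P_N\tilde x_+\| + \|P_X\tilde x_+\| \]
yields
\[ \rho_c/2 - K_I'(1+\rho)(\rho_c+\eta_c)(\rho_c+\theta_c) \le \rho_+ \le \rho_c/2 + K_I'(1+\rho)(\rho_c+\eta_c)(\rho_c+\theta_c). \tag{17} \]
Since $(\rho_c+\eta_c)(\rho_c+\theta_c) = (\rho_c+\theta_c+\eta_c)\rho_c + \eta_c\theta_c$, this implies all but the first and last inequalities in (11) with any choice of $K$ that satisfies
\[ K \ge K_I'(1+\rho). \]

Now let $r \in (1/2,1)$ be given. If $\rho$, $\theta$, and $\bar\eta$ are small enough so that
\[ K(\rho+\theta+\bar\eta) < r/2 - 1/4 \quad\text{and}\quad \eta_c\theta_c \le \rho_c(r/2 - 1/4)/K \tag{18} \]
then
\[ K(\rho_c+\eta_c)(\rho_c+\theta_c) < (r - 1/2)\rho_c \]
which yields the last inequality in (11) with any choice of $M$ that satisfies
\[ M \le (r/2 - 1/4)/K. \]
To obtain (12) we note that (16) and (15) imply
\[ \theta_+ = \frac{\|P_X\tilde x_+\|}{\|P_N\tilde x_+\|} \le \frac{K_I'\rho_c((\rho_c+\theta_c+\eta_c)\rho_c + \eta_c\theta_c)}{\rho_c/2 - K_I'((\rho_c+\theta_c+\eta_c)\rho_c + \eta_c\theta_c)} \le \rho_c\,\frac{K_I'(\rho_c+\theta_c+\eta_c+M)}{1/2 - K_I'(\rho_c+\theta_c+\eta_c+M)}. \]
Reduce $M$ if needed so that $M \le 1/(8K_I')$. Reducing $\rho$, $\theta$, and $\bar\eta$ if necessary so that $K_I'(\rho_c+\theta_c+\eta_c) \le 1/8$ proves $\theta_+ \le \rho_c$ and therefore (12) holds with any choice of $K$ that satisfies $K \ge 1$. Setting $K = \max(1, K_I'(1+\rho))$ gives (11) and (12).

It remains to prove (13). Note that
\[ \theta_+ \le K\rho_c \le \frac{K}{1-r}\rho_+. \]
Hence if we reduce $\bar\eta$ so that $\bar\eta \le M(1-r)/K$ then the proof is complete.

Note that enforcement of the condition $\eta_c\theta_c \le M\rho_c$, perhaps by reduction of $\eta_c$, is required only on the first iterate and is satisfied automatically on all following iterates. This requirement that the first inexact Newton step be computed particularly accurately never needed to be enforced in our computations, but is important in our convergence results.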

2.2 Linear Convergence

The linear convergence result on inexact Newton methods parallels Theorem 1.3.

Theorem 2.4. Let $F$ be twice Lipschitz continuously differentiable in a neighborhood of $x^*$ and let Assumption 1.1 hold. Let $M$ be from Lemma 2.3. Then there are $\rho$, $\theta$, and $\bar\eta$, such that if $x_0 \in W(\rho,\theta)$,
\[ \eta_0\|P_X\tilde x_0\| \le M\|P_N\tilde x_0\|\,\|\tilde x_0\|, \tag{19} \]
and $\{\eta_n\}$ is such that $\eta_n \le \bar\eta$ for all $n$ then the sequence given by
\[ x_{n+1} = x_n + s_n, \]
where $s_n$ satisfies
\[ \|F'(x_n)s_n + F(x_n)\| \le \eta_n\|F(x_n)\|, \]
remains in $W(\rho,\theta)$ and converges q-linearly to $x^*$. Moreover if $\eta_n \to 0$ as $n \to \infty$, then $x_n \to x^*$ with q-factor $1/2$.

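Proof. Let $x_0 \in W(\rho_0,\theta_0)$ with $\|\tilde x_0\| = \rho_0 < \rho$ and $\|P_X\tilde x_0\|/\|P_N\tilde x_0\| = \theta_0 \le \theta$. Let $\rho$, $\theta$, and $\bar\eta$ be such that the conclusions of Lemma 2.3 hold for some $r \in (1/2,1)$. We may do this because (19) implies that (10) holds.

By Lemma 2.3 we have
\[ \rho_1 \le r\rho_0 \quad\text{and}\quad \theta_1 \le K\rho_0. \]
Hence if $K\rho_0 \le \theta$, $x_1 \in W(\rho,\theta)$. Since Lemma 2.3 also states that $\eta_1\theta_1 \le M\rho_1$, we may proceed with the iteration to find that $x_n \in W(\rho,\theta)$ for all $n$ and that
\[ \rho_{n+1} = \|\tilde x_{n+1}\| \le r\rho_n. \]
Hence the convergence is q-linear.

To get a more detailed estimate we use Lemma 2.3 to obtain
\[ \theta_{n+1} \le K\rho_n \to 0 \]
as $n \to \infty$. Hence
\[ \rho_{n+1} = \rho_n/2 + O(\eta_n\rho_n) + o(\rho_n), \]
and so if $\eta_n \to 0$ then $\rho_{n+1}/\rho_n \to 1/2$.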
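The effect of the forcing terms in Theorem 2.4 can be seen on a scalar singular problem. The sketch below again assumes the toy function $f(x) = x^2 + x^3$ and models the inner solver by a shortened Newton step $s = -(1-\eta_n)f(x_n)/f'(x_n)$, a hypothetical stand-in whose residual $f'(x_n)s + f(x_n)$ equals $\eta_n f(x_n)$, so that the inexact Newton condition holds with equality. With $\eta_n = .25$ the error ratio settles near $.625$ for this model; with $\eta_n = 2^{-n-2}$ it tends to $1/2$.

```python
def f(x):
    # Toy scalar problem (an assumption for illustration) with a singular root at 0.
    return x * x + x ** 3


def fprime(x):
    return 2.0 * x + 3.0 * x * x


def inexact_newton(x, etas):
    """Take one shortened Newton step per forcing term; return the error ratios."""
    ratios = []
    for eta in etas:
        # The residual f'(x) s + f(x) of this step is eta * f(x).
        s = -(1.0 - eta) * f(x) / fprime(x)
        x_new = x + s
        ratios.append(x_new / x)   # the root is 0, so x is the error
        x = x_new
    return ratios


constant = inexact_newton(0.1, [0.25] * 30)
decaying = inexact_newton(0.1, [2.0 ** (-n - 2) for n in range(30)])
print("eta_n = .25      :", constant[-1])
print("eta_n = 2^(-n-2) :", decaying[-1])
```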

2.3 Acceleration of Convergence

We consider a modification of the algorithm given in [22]. For $\sigma_c$ given (perhaps depending on $x_c$ and $\eta_c$) and for $x_c \in W(\rho,\theta)$ and $\eta_c \le \bar\eta$, where $\rho$, $\theta$, $\bar\eta$ are such that the hypotheses of Theorem 2.4 hold we compute $x_+$ through the following algorithm.

Algorithm 2.1.

1. Compute $s_x$ such that $\|F'(x_c)s_x + F(x_c)\| \le \eta_c\|F(x_c)\|$.

2. Set $y = x_c + s_x$.

3. Compute $s_y$ such that
\[ \|F'(y)s_y + F(y)\| \le \eta_c\|F(y)\|. \tag{20} \]

4. Set $x_+ = y + (2 + \sigma_c)s_y$.
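The four steps of Algorithm 2.1 can be sketched for a scalar problem as follows. Everything beyond the four steps is an assumption made for illustration: the toy function $f(x) = x^2 + x^3$, the shortened Newton step standing in for an inner iteration that meets the residual tests in steps 1 and 3, and the rule `sigma_choice` for the step 4 parameter, which is a hypothetical choice and not a formula quoted from the text.

```python
def algorithm_2_1(x_c, eta_c, inexact_solve, sigma_of):
    """One sweep x_c -> x_+ of Algorithm 2.1 for a scalar problem."""
    s_x = inexact_solve(x_c, eta_c)       # step 1: inexact Newton step at x_c
    y = x_c + s_x                         # step 2: intermediate iterate
    s_y = inexact_solve(y, eta_c)         # step 3: inexact Newton step at y
    sigma_c = sigma_of(eta_c, s_y)
    return y + (2.0 + sigma_c) * s_y      # step 4: modified step from y


def f(x):
    # Toy scalar problem (assumption) with a singular root at 0.
    return x * x + x ** 3


def fprime(x):
    return 2.0 * x + 3.0 * x * x


def shortened_newton(x, eta):
    # Hypothetical stand-in for an inner solver: residual equals eta * f(x).
    return -(1.0 - eta) * f(x) / fprime(x)


def sigma_choice(eta, s_y, C=1.0, alpha=0.5):
    # Assumed rule for the step 4 parameter, used only for this demonstration.
    return C * (eta + abs(s_y)) ** alpha


x = 0.1
history = [x]
for n in range(6):
    x = algorithm_2_1(x, 2.0 ** (-n - 2), shortened_newton, sigma_choice)
    history.append(x)
print(history)
```

Passing the inner solver and the parameter rule as arguments keeps the four steps themselves free of any particular choice, so a Krylov solver or a different rule could be substituted without touching `algorithm_2_1`.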

We use Lemma 2.2, Lemma 2.3, and Theorem 2.4 to obtain the refined form of Lemma 2.3 that we will need for an acceleration result.

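Lemma 2.5. Let $F$ be twice Lipschitz continuously differentiable in a neighborhood of $x^*$ and let Assumption 1.1 hold. Then there are $K_\Delta > 0$, $\rho$, $\theta$, and $\bar\eta$ such that if $x_c \in W(\rho,\theta)$, $\eta_c \le \bar\eta$,
\[ \|\tilde x_c\| = \rho_c \quad\text{and}\quad \|P_X\tilde x_c\| = \theta_c\|P_N\tilde x_c\|, \tag{21} \]
(10) holds, and $s_y$ is given by (20) in step 3 of Algorithm 2.1 then
\[ s_y = -\frac{\tilde y}{2} + \Delta_1, \]
where
\[ \|P_N\Delta_1\| \le K_\Delta\rho_c(\rho_c+\eta_c) \quad\text{and}\quad \|P_X\Delta_1\| \le K_\Delta\rho_c^2. \tag{22} \]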

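Proof. Let $\rho$, $\theta$, and $\bar\eta$ be such that the conclusions of Lemma 2.2, Lemma 2.3, and Theorem 2.4 hold for some $r \in (1/2,1)$. As in the proof of Lemma 2.3 we assume that $\theta \le 1/2$. Since $\theta \le 1/2$ we have for all $w \in W(\rho,\theta)$ that
\[ \|\tilde w\|/2 \le \|P_N\tilde w\| \le 2\|\tilde w\|. \]
If $y$ is given by step 2 of Algorithm 2.1 then (21) and Lemma 2.2 imply that $y \in W(\rho,\theta)$ and
\[ \|P_X\tilde y\| \le K_I(\|P_X\tilde x_c\|\,\|P_N\tilde x_c\| + \|P_N\tilde x_c\|^3 + \eta_c\rho_c^2 + \eta_c\theta_c\rho_c) \le K_I(4\theta_c\rho_c^2 + 8\rho_c^3 + \eta_c\rho_c^2 + \eta_c\theta_c\rho_c) \]
\[ \le K_I(4\theta_c + 8\rho_c + \eta_c + M)\rho_c^2 \le K_A\rho_c^2. \tag{23} \]
In (23), $K_A = K_I(4\theta + 8\rho + \bar\eta + M)$.

Recall from (12) that
\[ \theta_y = \frac{\|P_X\tilde y\|}{\|P_N\tilde y\|} \le K\rho_c. \tag{24} \]
Let $z = y + s_y$. As pointed out above, Theorem 2.4 implies that $y \in W(\rho,\theta)$. Hence, we may apply Lemma 2.2 to $z$ and conclude that
\[ \tilde z = \tfrac12 P_N\tilde y + \delta_1, \]
where, since $\|\tilde y\| \le r\rho_c$,
\[ \|P_N\delta_1\| \le K_I(\|P_X\tilde y\| + \|P_N\tilde y\|^2 + \eta_c\|\tilde y\| + \eta_c\theta_y) \le K_I(K_A\rho_c^2 + 4r^2\rho_c^2 + \eta_c r\rho_c + K\eta_c\rho_c) \le K_B\rho_c(\rho_c+\eta_c), \tag{25} \]
where $K_B = K_I(K_A + 4r^2 + r + K)$. Similarly there is $K_C > 0$ such that
\[ \|P_X\delta_1\| \le K_I(\|P_X\tilde y\|\,\|P_N\tilde y\| + \|P_N\tilde y\|^3 + \eta_c\|\tilde y\|^2 + \eta_c\theta_y\|\tilde y\|) \le K_C\rho_c^2(\rho_c+\eta_c). \]
We may estimate $\Delta_1$ in terms of $\delta_1$ since
\[ \Delta_1 = s_y + \frac{\tilde y}{2} = (\tilde z - \tilde y) + \frac{\tilde y}{2} = \tilde z - \frac{\tilde y}{2} = \tilde z - \tfrac12 P_N\tilde y - \tfrac12 P_X\tilde y = \delta_1 - \tfrac12 P_X\tilde y. \]
Hence
\[ \|P_X\Delta_1\| \le \tfrac12\|P_X\tilde y\| + \|P_X\delta_1\| \le \frac{K_A}{2}\rho_c^2 + K_C\rho_c^2(\rho_c+\eta_c) \]
and $P_N\Delta_1 = P_N\delta_1$. This completes the proof with $K_\Delta = \max(K_B, K_A/2 + (\rho+\bar\eta)K_C)$.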

If we select $\sigma_c$ properly we can apply Lemma 2.5 to Algorithm 2.1 to obtain superlinear convergence. We state the main result in terms of the transition from $x_c$ to $x_+$ with a fairly general choice of $\sigma_c$ and then as a corollary discuss possible implementations. Note that we require a nonzero lower bound for $|\sigma_c|$ in (26). The parameters $C$ and $\alpha$ play the same role in the inexact algorithm as they do in the algorithm described in Theorem 1.2.

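Theorem 2.6. Let $F$ be twice Lipschitz continuously differentiable in a neighborhood of $x^*$ and let Assumption 1.1 and (10) hold. Let $r \in (1/2,1)$, $C \ne 0$, $C \in R$, and $\alpha \in (0,1)$ be given. Then there are $\rho$, $\theta$, $\bar\eta$, $K_+$, and $K_-$ such that if $x_c \in W(\rho,\theta)$, $\eta_c \le \bar\eta$, (21) holds,
\[ C(\rho_c+\eta_c)^\alpha \le |\sigma_c| \ [\ldots] \tag{26} \]
and $x_+$ is given by Algorithm 2.1 then $x_+ \in W(\rho,\theta)$,
\[ \|\tilde x_+\| \le r\rho_c, \quad\text{and}\quad K_-|\sigma_c|\rho_c \le \|\tilde x_+\| \le K_+|\sigma_c|\rho_c. \tag{27} \]
Moreover, there is $K_\eta$ such that if
\[ \eta_+ \le K_\eta|\sigma_c|(\rho_c+\eta_c)^\alpha \tag{28} \]
then
\[ \eta_+\theta_+ \le M\rho_+. \tag{29} \]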

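Proof. Let $\rho$, $\theta$, and $\bar\eta$ be small enough so that the conclusions of Lemma 2.2, Lemma 2.3, Theorem 2.4, and Lemma 2.5 hold for the $r \in (1/2,1)$ given in the statement of the theorem.

Let $\rho_+ = \|\tilde x_+\|$ and $\theta_+ = \|P_X\tilde x_+\|/\|P_N\tilde x_+\|$. Note that
\[ \tilde x_+ = \tilde y + (2+\sigma_c)s_y = \tilde y + (2+\sigma_c)(-\tilde y/2 + \Delta_1) = -\frac{\sigma_c\tilde y}{2} + (2+\sigma_c)\Delta_1. \tag{30} \]
Theorem 2.4 implies that
\[ (1-r)\rho_c \le \|\tilde y\| \le r\rho_c. \]
Therefore, since $|\sigma_c| \le 1$, Lemma 2.5 implies that
\[ \frac{(1-r)|\sigma_c|\rho_c}{2} - 6K_\Delta(\rho_c+\eta_c)\rho_c \le \rho_+ \le \frac{r|\sigma_c|\rho_c}{2} + 6K_\Delta(\rho_c+\eta_c)\rho_c. \tag{31} \]
Reducing $\rho$, $\theta$, and $\bar\eta$, if needed so that
\[ r/2 + 6K_\Delta(\rho+\bar\eta) \le r \]
proves the first inequality in (27). To prove the second, note that (26) implies that
\[ 6K_\Delta(\rho_c+\eta_c) = 6K_\Delta(\rho_c+\eta_c)^{1-\alpha}(\rho_c+\eta_c)^\alpha \le 6K_\Delta(\rho_c+\eta_c)^{1-\alpha}C^{-1}|\sigma_c|. \]
Hence
\[ |\sigma_c|\rho_c\left((1-r)/2 - 6K_\Delta C^{-1}(\rho+\bar\eta)^{1-\alpha}\right) \le \rho_+ \le |\sigma_c|\rho_c\left(r/2 + 6K_\Delta C^{-1}(\rho+\bar\eta)^{1-\alpha}\right). \]
Reducing $\rho$ and $\bar\eta$ if needed so that
\[ r/2 + 6K_\Delta C^{-1}(\rho+\bar\eta)^{1-\alpha} < 1/2 \]
proves the second inequality in (27) with
\[ K_- = (1-r)/2 - 6K_\Delta C^{-1}(\rho+\bar\eta)^{1-\alpha} \quad\text{and}\quad K_+ = r/2 + 6K_\Delta C^{-1}(\rho+\bar\eta)^{1-\alpha}. \]

It remains to show that $x_+ \in W(\rho,\theta)$ and that (29) holds. Decrease $\rho$ and $\bar\eta$ if needed so that
\[ C_N = \frac{(1-r)C}{4} - 6K_\Delta(\rho+\bar\eta)^{1-\alpha} > 0, \tag{32} \]
where $C$ and $r$ are given as in the statement of the theorem. By (11)
\[ \|P_N\tilde y\| \ge (1-r)\rho_c/2. \]
If we apply $P_N$ to both sides of (30), take norms, and use Lemma 2.5 we have
\[ \|P_N\tilde x_+\| \ge \frac{|\sigma_c|(1-r)\rho_c}{4} - 6K_\Delta(\rho_c+\eta_c)\rho_c \ge C_N(\rho_c+\eta_c)^\alpha\rho_c. \]
If we apply $P_X$ to both sides of (30) and use (22) and (23) we have
\[ \|P_X\tilde x_+\| \le \|P_X\tilde y\|/2 + 3K_\Delta\rho_c^2 \le (K_A/2 + 3K_\Delta)\rho_c^2. \]
Setting $C_X = K_A/2 + 3K_\Delta$ we obtain
\[ \|P_X\tilde x_+\| \le C_X\rho_c^2. \]
Hence
\[ \theta_+ \le \frac{C_X\rho_c^2}{C_N(\rho_c+\eta_c)^\alpha\rho_c} \le \frac{C_X\rho_c}{C_N(\rho_c+\eta_c)^\alpha} \le \frac{C_X}{C_N}\rho_c^{1-\alpha}. \tag{33} \]
Reduce $\rho$ if needed so that
\[ \frac{C_X}{C_N}\rho^{1-\alpha} \le \theta \]
to obtain, using the final inequality in (33),
\[ \theta_+ \le \frac{C_X}{C_N}\rho_c^{1-\alpha} \le \frac{C_X}{C_N}\rho^{1-\alpha} \le \theta. \]
This shows that $x_+ \in W(\rho_+,\theta_+) \subset W(\rho,\theta)$.

We complete the proof by verifying (29). We use the first inequality in (33),
\[ \eta_+\theta_+ \le \eta_+\frac{C_X\rho_c}{C_N(\rho_c+\eta_c)^\alpha}, \]
and (27) to see that (29) holds if
\[ \eta_+\frac{C_X\rho_c}{C_N(\rho_c+\eta_c)^\alpha} \le MK_-|\sigma_c|\rho_c \le M\rho_+, \]
which holds if
\[ \eta_+ \le \frac{MK_-C_N}{C_X}(\rho_c+\eta_c)^\alpha|\sigma_c|. \]
Setting $K_\eta = MK_-C_N/C_X$ completes the proof.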

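Verification of (28) is simple in the case
\[ \sigma_c \ge C(\rho_c+\eta_c)^\alpha \]
for some $C$. In view of (11) we may approximate $\rho_c$ by $\|s_y\|$ [...] will be made for the remainder of this paper. By Lemma 2.2 there are $c_-, c_+ > 0$, independent of $C$ and $\alpha$ so that the choice of $\sigma_c$ given in (34) satisfies
\[ Cc_-(\rho_c+\eta_c)^\alpha \le \sigma_c \le Cc_+(\rho_c+\eta_c)^\alpha. \]
Hence if
\[ \eta_+ \le K_\eta Cc_-\eta_c^{2\alpha} \tag{35} \]
then
\[ \eta_+ \le K_\eta Cc_-(\rho_c+\eta_c)^{2\alpha} \le K_\eta(\rho_c+\eta_c)^\alpha|\sigma_c| \]
and therefore (28) follows from (35). If we set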

$\bar C = K_\eta c_-$ and if $\eta_n = \eta_0\beta^n$, for some $\beta \in [0,1]$, and $\alpha \in [0,1]$, then (35) will follow from

C

n(2?1) C

2 0

:

Hence if $C$ is large enough and $\alpha \in [0,1/2]$, (28) will hold for any given $\eta_0$ and $\beta \in [0,1)$. If $\beta = 1$ and $C$ is large enough then (28) will hold for any given $\eta_0$ and $\alpha \in [0,1)$. This control of $\eta_+$ is, in fact, one of the roles of $C$. As we will see from the numerical experiments an aggressive choice of $C$ (i. e. a small value) is usually sufficient to keep the iteration in $W(\rho,\theta)$ and the aggressive choice makes the convergence more rapid.

We make a particular choice for definiteness in the statement of our next theorem, which is a trivial consequence of Theorem 2.6.

Theorem 2.7. Let $F$ be twice Lipschitz continuously differentiable in a neighborhood of $x^*$ and let Assumption 1.1 hold. Let $\beta \in [0,1]$ be given. Then there are $\rho$, $\theta$, $\bar\eta$ such that if $x_0 \in W(\rho,\theta)$, and $\eta_n = \eta_0\beta^n$, with $\eta_0 \le \bar\eta$, $\{x_n\}$ and $\{y_n\}$ are given by Algorithm 2.1, and $\{\sigma_n\}$ is given by (34) with $\alpha \in [0,1/2]$, and $C$ sufficiently large, then $x_n \to x^*$ q-superlinearly.

Another approach is to set $\beta = 1$, $\eta_n = \eta_0 = \bar\eta$ at a small value, and apply Algorithm 2.1 with a goal of obtaining q-linear convergence with a small q-factor. This approach would be especially appropriate in the context of Newton-iterative methods where the goal would be to reduce the number of inner iterations. Performance of this type of algorithm is described in the final theorem of this section.

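Theorem 2.8. Let $F$ be twice Lipschitz continuously differentiable in a neighborhood of $x^*$ and let Assumption 1.1 hold. Let $\alpha \in [0,1)$ and $C > 0$. Then there are $\rho$, $\theta$, $\bar\eta$ such that if $x_0 \in W(\rho,\theta)$, $C$ is sufficiently large, $\{x_n\}$ and $\{y_n\}$ are given by Algorithm 2.1, and
\[ \sigma_n = C(\bar\eta + \|s_{y_n}\|)^\alpha, \tag{36} \]
then $x_n \to x^*$ q-linearly with q-factor at most $K_+$ [...]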

3 NUMERICAL RESULTS

As an example we consider the H-equation of Chandrasekhar [4],
\[ F(H)(\mu) = H(\mu) - \left(1 - \frac12\int_0^1 \frac{\mu H(\nu)}{\mu+\nu}\,d\nu\right)^{-1} = 0. \tag{37} \]

It is known [23], [20] that the assumptions of the theorems in § 2 hold for this equation in the space $E = L^2([0,1])$ and for discretized formulations, provided the quadrature rule integrates constant functions exactly.
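A discretized residual for the H-equation can be sketched as follows. This is a minimal illustration that swaps in the composite midpoint rule with 100 nodes, an assumption made here for brevity in place of any particular Gaussian rule; the midpoint rule has equal weights summing to one, so it integrates constant functions exactly. The names `h_residual` and `l2_norm` are hypothetical.

```python
import math


def h_residual(H, nodes, weights):
    """Discrete residual F(H)(mu_i) = H_i - 1 / (1 - 0.5 * sum_j w_j mu_i H_j / (mu_i + mu_j))."""
    out = []
    for mu_i, H_i in zip(nodes, H):
        integral = sum(w * mu_i * H_j / (mu_i + mu_j)
                       for w, mu_j, H_j in zip(weights, nodes, H))
        out.append(H_i - 1.0 / (1.0 - 0.5 * integral))
    return out


def l2_norm(values, weights):
    # Discrete L^2([0,1]) norm built from the same quadrature weights.
    return math.sqrt(sum(w * v * v for w, v in zip(weights, values)))


N = 100                                      # number of midpoint nodes (assumption)
nodes = [(j + 0.5) / N for j in range(N)]
weights = [1.0 / N] * N

F0 = h_residual([1.0] * N, nodes, weights)   # residual at the function identically one
print(l2_norm(F0, weights))
```

Evaluating the residual at the function identically one gives a discrete $L^2$ norm that can be compared with the first row of the tables in this section.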

We report on computations using a composite 20 point Gaussian quadrature with five subintervals to approximate all integrals. The $L^2$ inner product was also approximated using this quadrature rule. In all computations the initial iterate was the function identically one. GMRES [27] was used as the iterative method with the function identically zero used as the initial iterate for the inner iteration. We use a modification of the Brown-Hindmarsh GMRES code [1] with changes made in the inner product to allow the approximate $L^2$ inner product to be used instead of the $R^N$ inner product. Products of $F'(x)$ with vectors were approximated with a forward difference with a difference step of $10^{-7}$. The computations reported in the tables were done on a SUN SPARC 1+ running SUN OS version 4.1.1 and FORTRAN compiler f77 version 1.3.1.
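A forward difference approximation of a Jacobian-vector product takes one extra function evaluation per product. The sketch below uses the plain quotient $(F(x + hv) - F(x))/h$ with $h = 10^{-7}$; how the difference step was scaled in the actual code is not stated, so the unscaled form and the small test map are assumptions.

```python
def fd_jacobian_vector(F, x, v, h=1.0e-7):
    """Approximate F'(x) v by the forward difference (F(x + h v) - F(x)) / h."""
    Fx = F(x)
    Fxh = F([xi + h * vi for xi, vi in zip(x, v)])
    return [(a - b) / h for a, b in zip(Fxh, Fx)]


def F(x):
    # Small illustrative map (assumption) with a Jacobian that is easy to write down.
    return [x[0] * x[0] + x[1], x[0] * x[1]]


x = [1.0, 2.0]
v = [1.0, -1.0]
approx = fd_jacobian_vector(F, x, v)
# Exact product: F'(x) = [[2 x0, 1], [x1, x0]].
exact = [2.0 * x[0] * v[0] + v[1], x[1] * v[0] + x[0] * v[1]]
print(approx, exact)
```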

In each of the tables that follow we tabulate the iteration counter $n$, the number of inner iterates $i_g$ required to satisfy the termination criterion for the inner iteration, the norm of the function at the current point $\|F(x_n)\|$, and for $n \ge 1$, the ratio of successive function norms. For the unmodified Newton iteration, one would expect this ratio to tend to $1/4$. We tabulate the inner iteration counter $i_g$ in order to compare the various methods in terms of function evaluations. Each Newton-GMRES iterate requires a function evaluation for the difference approximation of the action of $F'$ on a vector. In all cases the iteration was terminated when $\|F(x_n)\| < 10^{-12}$. All norms in the tables are $L^2$ norms.

Our first table illustrates the conclusions of Theorem 2.4. In the results reported in the left columns of Table 1 we set $\eta_n = .25$ for all $n$ and in the right columns $\eta_n = 2^{-n-2}$. Note that while the number of Newton iterates is larger for the $\eta_n = .25$ case, the number of function evaluations is slightly smaller because fewer GMRES iterates are required for each Newton step. A total of 58 GMRES iterates were required for termination of the iteration in the first computation and 74 for the second. This easily offsets the additional outer iterations. The slightly irregular behavior of the iteration in the first computation is also worth noting.

To illustrate the modified method of Algorithm 2.1, Theorems 2.7 and 2.6 we set $\alpha = .25$, $\eta_n = 2^{-n-2}$ and compute $\sigma_n$ by (34) with $C = .01$. This is an aggressive (small) choice of $C$. While such a choice was not appropriate for the exact algorithm described in [22], the inexactness had the effect of keeping $\sigma_n$ large enough to remain in $W(\rho,\theta)$ without the need to keep $C$ large.

TABLE 1: Newton-GMRES Iteration

Left columns: $\eta_n = .25$

  n   i_g   ||F(x_n)||   ||F(x_n)||/||F(x_{n-1})||
  0         0.37D+00
  1    1    0.12D+00     0.31D+00
  2    1    0.28D-01     0.24D+00
  3    2    0.11D-01     0.40D+00
  4    2    0.28D-02     0.25D+00
  5    2    0.68D-03     0.24D+00
  6    2    0.20D-03     0.29D+00
  7    2    0.47D-04     0.24D+00
  8    2    0.12D-04     0.26D+00
  9    3    0.60D-05     0.48D+00
 10    3    0.15D-05     0.25D+00
 11    3    0.37D-06     0.25D+00
 12    3    0.92D-07     0.25D+00
 13    3    0.23D-07     0.25D+00
 14    3    0.56D-08     0.24D+00
 15    4    0.15D-08     0.27D+00
 16    3    0.36D-09     0.24D+00
 17    4    0.11D-09     0.32D+00
 18    3    0.30D-10     0.27D+00
 19    4    0.11D-10     0.37D+00
 20    4    0.27D-11     0.24D+00
 21    4    0.65D-12     0.24D+00

Right columns: $\eta_n = 2^{-n-2}$

  n   i_g   ||F(x_n)||   ||F(x_n)||/||F(x_{n-1})||
  0         0.37D+00
  [...]

TABLE 2: Modified Method

Left columns: $\eta_n = 2^{-n-2}$, $\alpha = .25$, $C = .01$

  n   i_g   ||F(x_n)||   ||F(x_n)||/||F(x_{n-1})||
  0         0.37D+00
  1    2    0.98D-01     0.26D+00
  2    3    0.14D-01     0.14D+00
  3    4    0.26D-04     0.19D-02
  4    2    0.42D-06     0.16D-01
  5    6    0.27D-08     0.64D-02
  6    7    0.15D-12     0.55D-04

Right columns: $\eta_n = .25$, $\alpha = .9$, $C = .01$

  n   i_g   ||F(x_n)||   ||F(x_n)||/||F(x_{n-1})||
  0         0.37D+00
  1    2    0.99D-01     0.26D+00
  2    3    0.14D-01     0.14D+00
  3    4    0.24D-04     0.18D-02
  4    2    0.42D-06     0.17D-01
  5    5    0.35D-08     0.84D-02
  6    2    0.34D-09     0.97D-01
  7    2    0.23D-11     0.68D-02
  8    2    0.12D-12     0.51D-01

In the left columns of Table 2 we do not tabulate information on $F(y_n)$ but do add the GMRES iterations required for computation of $y_n$ to the total in $i_g$ [...] more GMRES iterations are required for each outer iterate $x_n$. The accelerated iteration is more efficient than the unaccelerated versions, requiring a total of 24 GMRES iterations. The superlinear convergence can be clearly seen in the tables. Small changes in the parameters $\alpha$ and $C$ had modest effects. For example, with $C = .01$ and $\alpha = .5$, 26 GMRES iterations were required and with $C = .1$ and $\alpha = .25$, 27 GMRES iterations were needed. The number of outer iterations, six, was the same as in the tabulated results.

Finally, in the right columns of Table 2 we illustrate Theorem 2.8. Here we set $\eta_n = \bar\eta = .25$ for all $n$, $\alpha = .9$, and $C = .01$. 22 GMRES iterates were required. Small changes in $C$ and $\alpha$ produced decreases in performance of at most 10% in our observations. Note that the number of function evaluations required for 8 outer iterations and 22 inner iterations is the same as that required for 6 outer iterations and 24 inner iterations, so the costs of the two computations reported in Table 2 are the same.

REFERENCES

1. P. N. Brown and A. C. Hindmarsh, Reduced storage matrix methods in stiff ODE systems, Tech. Report UCRL-95088, Lawrence Livermore National Laboratory, 1987.

2. P. N. Brown and Y. Saad, Hybrid Krylov methods for nonlinear systems of equations, SIAM J. Sci. Stat. Comp., 11 (1990), pp. 450-481.

3. R. Cavanaugh, Difference Equations and Iterative Processes, PhD thesis, University of Maryland, 1970.

4. S. Chandrasekhar, Radiative Transfer, Dover, New York, 1960.

5. D. W. Decker, H. B. Keller, and C. T. Kelley, Convergence rates for Newton's method at singular points, SIAM J. Numer. Anal., 20 (1983), pp. 296-314.

6. D. W. Decker and C. T. Kelley, Newton's method at singular points I, SIAM J. Numer. Anal., 17 (1980), pp. 66-70.

7. ---, Newton's method at singular points II, SIAM J. Numer. Anal., 17 (1980), pp. 465-471.

9. ---, Sublinear convergence of the chord method at singular points, Numer. Math., 42 (1983), pp. 147-154.

10. ---, Broyden's method for a class of problems having singular Jacobian at the root, SIAM J. Numer. Anal., 22 (1985), pp. 566-574.

11. ---, Expanded convergence domains for Newton's method at nearly singular roots, SIAM J. Sci. Stat. Comp., 6 (1985), pp. 951-966.

12. R. Dembo, S. Eisenstat, and T. Steihaug, Inexact Newton methods, SIAM J. Numer. Anal., 19 (1982), pp. 400-408.

13. B. D. Ganapol, C. T. Kelley, and G. C. Pomraning, Asymptotically exact boundary conditions for the P-N equations, Nuclear Science and Engineering, to appear.

14. A. Griewank, Starlike domains of convergence for Newton's method at singularities, Numer. Math., 35 (1980), pp. 95-111.

15. ---, Analysis and modification of Newton's method at singularities, PhD thesis, Australian National University, 1981.

16. ---, On solving nonlinear equations with simple singularities or nearly singular solutions, SIAM Review, 27 (1985), pp. 537-572.

17. A. Griewank and M. R. Osborne, Newton's method for singular problems when the dimension of the null space is > 1, SIAM J. Numer. Anal., 18 (1981), pp. 179-189.

18. ---, Analysis of Newton's method at irregular singular points, SIAM J. Numer. Anal., 20 (1983), pp. 747-773.

19. H. B. Keller, Lectures on Numerical Methods in Bifurcation Theory, Tata Institute of Fundamental Research, Lectures on Mathematics and Physics, Springer-Verlag, New York, 1987.

20. C. T. Kelley, Solution of the Chandrasekhar H-equation by Newton's method, J. Math. Phys., 21 (1980), pp. 1625-1628.

21. ---, A Shamanskii-like acceleration scheme for nonlinear equations at singular roots, Math. Comp., 47 (1986), pp. 609-623.

22. C. T. Kelley and R. Suresh, A new acceleration method for Newton's method at singular points, SIAM J. Numer. Anal., 20 (1983), pp. 1001-1009.

23. T. W. Mullikin, Some probability distributions for neutron transport in a half space, J. Appl. Prob., 5 (1968), pp. 357-374.

24. L. B. Rall, Convergence of the Newton process to multiple solutions, Numer. Math., 9 (1966), pp. 23-37.

25. G. W. Reddien, On Newton's method for singular problems, SIAM J. Numer. Anal., 15 (1978), pp. 993-996.

26. ---, Newton's method and high order singularities, Comput. Math. Appl., 5 (1980), pp. 79-86.