
Sisser [Sis-82a] has described some modifications of Newton's method in which, given an initial estimate $x^{(0)} \in R^n$ of an unconstrained strong local minimizer $x^*$ of the objective function $f : R^n \to R^1$, the sequence $(x^{(k)})$ is generated from

$$x^{(k+1)} = x^{(k)} - (G^{(k)} + \mu^{(k)} I)^{-1} g^{(k)} \quad (k \ge 0), \tag{3.2.1}$$

where the $\mu^{(k)}$ are determined so as to satisfy

+ 3.2.2

and

> ^ p ® V « , 3.2.3

where $f^{(k)} = f(x^{(k)})$, $g^{(k)} = g(x^{(k)}) = \nabla f(x^{(k)})$, $G^{(k)} = G(x^{(k)}) = \nabla^2 f(x^{(k)})$, and

$$p^{(k)} = -(G^{(k)} + \mu^{(k)} I)^{-1} g^{(k)}. \tag{3.2.4}$$

The $\mu^{(k)}$ are determined by expressing $G^{(k)}$ as a sum of dyads and then using the Sherman-Morrison-Woodbury (SMW) formula [SheM-49] [Woo-50], as explained subsequently.
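The iteration 3.2.1 can be sketched for $n = 2$ as follows. This is only an illustration: the function name `modified_newton_step`, the quadratic test function $f(x) = x_1^2 + 2x_2^2$ and the values of $\mu$ are assumptions made here, and the linear system is solved directly by Cramer's rule rather than by the dyadic inversion described in this section.

```python
def modified_newton_step(x, g, G, mu):
    """One step of 3.2.1 for n = 2: returns x - (G + mu*I)^{-1} g."""
    a, b = G[0][0] + mu, G[0][1]
    c, d = G[1][0], G[1][1] + mu
    det = a * d - b * c
    if det == 0.0:
        raise ZeroDivisionError("G + mu*I is singular")
    # Solve (G + mu*I) p = -g by Cramer's rule.
    p = [(-g[0] * d + b * g[1]) / det, (-a * g[1] + c * g[0]) / det]
    return [x[0] + p[0], x[1] + p[1]]


# Hypothetical objective f(x) = x1^2 + 2*x2^2 evaluated at x = (1, 1).
x = [1.0, 1.0]
g = [2.0, 4.0]                    # gradient at x
G = [[2.0, 0.0], [0.0, 4.0]]      # Hessian (constant for a quadratic)

full_step = modified_newton_step(x, g, G, 0.0)    # pure Newton step
damped_step = modified_newton_step(x, g, G, 2.0)  # shifted step, mu = 2
```

With $\mu = 0$ the step reaches the minimizer of the quadratic at once; a positive $\mu$ shortens the step.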

Theorem 3.2.1

Let $Q \in R^{n \times n}$ be nonsingular and let $u, v \in R^n$. Then $(Q + uv^T)^{-1}$ exists if and only if $v^T Q^{-1} u + 1 \ne 0$. Furthermore, if $v^T Q^{-1} u + 1 \ne 0$ then

$$(Q + uv^T)^{-1} = Q^{-1} - \{1/(1 + v^T Q^{-1} u)\}\, Q^{-1} u v^T Q^{-1}. \tag{3.2.5}$$

Proof

Suppose that $v^T Q^{-1} u + 1 = 0$. Then $(Q + uv^T) Q^{-1} u = 0$. Now $u \ne 0$ because $v^T Q^{-1} u + 1 = 0$, so $Q^{-1} u \ne 0$, whence $(Q + uv^T) w = 0$ where $w = Q^{-1} u \ne 0$. So $Q + uv^T$ has a zero eigenvalue, whence $Q + uv^T$ is singular. Conversely, if $v^T Q^{-1} u + 1 \ne 0$ then

$$(Q + uv^T)[Q^{-1} - \{1/(1 + v^T Q^{-1} u)\}\, Q^{-1} u v^T Q^{-1}] = I,$$

whence 3.2.5 holds. □
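Formula 3.2.5 can be checked numerically with a short sketch. The helper names and the $2 \times 2$ data below are assumptions chosen for illustration; they are not taken from [Sis-82a].

```python
def mat_vec(A, x):
    """Matrix times column vector."""
    return [sum(a * b for a, b in zip(row, x)) for row in A]


def vec_mat(x, A):
    """Row vector times matrix: (x^T A)_j = sum_i x_i A_ij."""
    return [sum(x[i] * A[i][j] for i in range(len(x))) for j in range(len(A[0]))]


def sherman_morrison(Q_inv, u, v):
    """Return (Q + u v^T)^{-1} from Q^{-1} by formula 3.2.5."""
    Qu = mat_vec(Q_inv, u)                                # Q^{-1} u
    vQ = vec_mat(v, Q_inv)                                # v^T Q^{-1}
    denom = 1.0 + sum(vi * wi for vi, wi in zip(v, Qu))   # 1 + v^T Q^{-1} u
    if denom == 0.0:
        raise ZeroDivisionError("Q + u v^T is singular")
    n = len(u)
    return [[Q_inv[i][j] - Qu[i] * vQ[j] / denom for j in range(n)]
            for i in range(n)]


# Q = diag(2, 4), so Q^{-1} = diag(0.5, 0.25).
Q_inv = [[0.5, 0.0], [0.0, 0.25]]
u = [1.0, 2.0]
v = [3.0, 1.0]
updated_inv = sherman_morrison(Q_inv, u, v)

# Q + u v^T written out explicitly, to verify the result.
A = [[5.0, 1.0], [6.0, 6.0]]
product = [[sum(A[i][k] * updated_inv[k][j] for k in range(2)) for j in range(2)]
           for i in range(2)]
```

Here `product` should equal the unit matrix to rounding error, since $1 + v^T Q^{-1} u = 3 \ne 0$.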

If, in 3.2.5, $u = sq$ and $v = q$ then

$$(Q + s q q^T)^{-1} = Q^{-1} - \{s/(1 + s q^T Q^{-1} q)\}\, Q^{-1} q q^T Q^{-1}, \tag{3.2.6}$$

which is the form of the SMW formula which is required in Sisser's modifications of Newton's method. The Hessian $G = G^{(k)}$ is expressed in the form

$$G = D + \sum_{i=1}^{r} s_i q_i q_i^T, \tag{3.2.7}$$

where $D = \eta I$ $(\eta > 0)$ as explained in §2.3, and $\eta$ is determined experimentally as explained in §3.12. The dyads $s_i q_i q_i^T$ are ordered so that for $i = 1, \ldots, r - 1$,

3.2.8

The Hessian $G$ is inverted by using 3.2.6. Initially $Q = D$ so $Q$ is positive definite, and the $r$ dyads $s_i q_i q_i^T$ are added, one at a time, by using 3.2.6 recursively, in such a way as to preserve positive definiteness. The following results are needed in order to understand how this can be done.

Theorem 3.2.2

If $u, v \in R^n$ then

$$\det(I + uv^T) = 1 + v^T u.$$

Proof

Let $U \in R^{n \times n}$ be defined by

Then

$$U^{-1} u v^T U = \mathrm{diag}(v^T u, 0, \ldots, 0),$$

so that the eigenvalues of $uv^T$ are $v^T u$ and $0$, the latter with multiplicity $n - 1$. Therefore the eigenvalues of $I + uv^T$ are $1 + v^T u$ and $1$, the latter with multiplicity $n - 1$. Therefore $\det(I + uv^T) = 1 + v^T u$. □
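As a quick check of Theorem 3.2.2, take $n = 2$ with vectors chosen here purely for illustration:

```latex
u = \begin{pmatrix} 1 \\ 2 \end{pmatrix}, \qquad
v = \begin{pmatrix} 3 \\ 1 \end{pmatrix}, \qquad
I + uv^T = \begin{pmatrix} 4 & 1 \\ 6 & 3 \end{pmatrix},
\qquad
\det(I + uv^T) = 4 \cdot 3 - 1 \cdot 6 = 6 = 1 + (3 \cdot 1 + 1 \cdot 2) = 1 + v^T u .
```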


Theorem 3.2.3

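Let $Q \in R^{n \times n}$ be symmetric with eigenvalues $\lambda_1 \ge \ldots \ge \lambda_n$. Let $u \in R^n$, and let the eigenvalues of $Q + uu^T$ be $\mu_1 \ge \ldots \ge \mu_n$. Then $\mu_1 \ge \lambda_1 \ge \mu_2 \ge \ldots \ge \mu_n \ge \lambda_n$. Furthermore, if the eigenvalues of $Q - uu^T$ are $\nu_1 \ge \ldots \ge \nu_n$ then $\lambda_1 \ge \nu_1 \ge \ldots \ge \lambda_n \ge \nu_n$.

Proof

See [Wol-78] Appendix I.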

Theorem 3.2.4

If $Q \in R^{n \times n}$ is symmetric positive definite, $s \in R$, and $q \in R^n$, then $Q + s q q^T$ is positive definite if and only if $1 + s q^T Q^{-1} q > 0$.

Proof

By Theorem 3.2.2,

$$\begin{aligned}
\det(Q + s q q^T) &= \det(Q(I + s Q^{-1} q q^T)) \\
&= \det(Q)\det(I + (Q^{-1} q)(s q)^T) \\
&= \det(Q)(1 + s q^T Q^{-1} q).
\end{aligned} \tag{3.2.9}$$

Suppose that $1 + s q^T Q^{-1} q > 0$. Then by 3.2.9,

$$\det(Q + s q q^T) > 0, \tag{3.2.10}$$

since $\det(Q) > 0$ because $Q$ is positive definite. Now if the eigenvalues of $Q$ are $\lambda_1 \ge \ldots \ge \lambda_n$ and the eigenvalues of $Q + |s| q q^T$ are $\mu_1 \ge \ldots \ge \mu_n$ then by Theorem 3.2.3 with $u = |s|^{1/2} q$,

$$\mu_1 \ge \lambda_1 \ge \ldots \ge \mu_n \ge \lambda_n,$$

and if the eigenvalues of $Q - |s| q q^T$ are $\nu_1 \ge \ldots \ge \nu_n$ then

$$\lambda_1 \ge \nu_1 \ge \ldots \ge \lambda_n \ge \nu_n.$$

Also

$$\det(Q + |s| q q^T) = \prod_{i=1}^{n} \mu_i \quad \text{and} \quad \det(Q - |s| q q^T) = \prod_{i=1}^{n} \nu_i.$$

If $Q$ is positive definite then $\lambda_n > 0$ so $\mu_n > 0$ and $\nu_{n-1} > 0$. Only $\nu_n$ can be negative. So at most one eigenvalue of $Q + s q q^T$ can be negative. So if $\det(Q + s q q^T) > 0$ then every eigenvalue of $Q + s q q^T$ is positive, whence $Q + s q q^T$ is positive definite.

Conversely, if $1 + s q^T Q^{-1} q \le 0$ then by 3.2.9 $\det(Q + s q q^T) \le 0$, whence $Q + s q q^T$ has a nonpositive eigenvalue. Therefore $Q + s q q^T$ is not positive definite. □
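The test in Theorem 3.2.4 can be compared with a direct check of positive definiteness on a small example. This is a sketch only: the function names and the $2 \times 2$ data are assumptions made for illustration, and the direct check uses leading principal minors, which is practical only for tiny matrices.

```python
def dyad_keeps_positive_definite(Q_inv, s, q):
    """Test of Theorem 3.2.4: is 1 + s q^T Q^{-1} q > 0 ?"""
    Qinv_q = [sum(a * b for a, b in zip(row, q)) for row in Q_inv]
    return 1.0 + s * sum(qi * wi for qi, wi in zip(q, Qinv_q)) > 0.0


def is_positive_definite_2x2(A):
    """Direct check for a symmetric 2 x 2 matrix via leading principal minors."""
    return A[0][0] > 0.0 and A[0][0] * A[1][1] - A[0][1] * A[1][0] > 0.0


# Hypothetical data: Q = diag(2, 4) is positive definite, q = (1, 1).
Q = [[2.0, 0.0], [0.0, 4.0]]
Q_inv = [[0.5, 0.0], [0.0, 0.25]]
q = [1.0, 1.0]

results = []
for s in (-1.0, -2.0):
    updated = [[Q[i][j] + s * q[i] * q[j] for j in range(2)] for i in range(2)]
    results.append((s,
                    dyad_keeps_positive_definite(Q_inv, s, q),
                    is_positive_definite_2x2(updated)))
```

For $s = -1$ both checks report that positive definiteness is kept ($1 + s q^T Q^{-1} q = 0.25$); for $s = -2$ both report that it is lost ($1 + s q^T Q^{-1} q = -0.5$).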

Let

$$H^{(0)} = D^{-1} = (1/\eta) I, \tag{3.2.11}$$

where $D$ is as in 3.2.7, and let the sequence $(H^{(j)})$ be generated from 3.2.6 according to

$$H^{(j)} = H^{(j-1)} - \{s_j/(1 + s_j q_j^T H^{(j-1)} q_j)\}\, H^{(j-1)} q_j q_j^T H^{(j-1)} \quad (1 \le j \le r), \tag{3.2.12}$$

so that

$$H^{(j)} = \Bigl(D + \sum_{i=1}^{j} s_i q_i q_i^T\Bigr)^{-1} \quad (1 \le j \le r), \tag{3.2.13}$$

provided that $H^{(j)}$ exists. Suppose that for some $j \ge 1$, $D + \sum_{i=1}^{j-1} s_i q_i q_i^T$ is positive definite. Then $H^{(j-1)}$ exists and is positive definite. Therefore by Theorem 3.2.4, if $s_j > 0$ then $D + \sum_{i=1}^{j} s_i q_i q_i^T$ is positive definite because $1 + s_j q_j^T H^{(j-1)} q_j > 0$. Therefore $H^{(j)}$ exists and is positive definite.

Now by 3.2.8 the dyads in 3.2.7 are ordered so that those with positive coefficients come first. Therefore for some $k \le 2m$, where $m$ is as in §2.3, $s_i > 0$ $(i = 1, \ldots, k)$ and $D + \sum_{i=1}^{k} s_i q_i q_i^T$ is positive definite. If, after all the dyads have been added, positive definiteness has been retained, so that $H^{(j)}$ exists and is positive definite for $j = 0, \ldots, r$, then $G^{-1} = H^{(r)}$ and $x^{(k+1)}$ is computed from 3.2.1 with $\mu^{(k)} = 0$. If, however, it is found that dyads can no longer be added without causing $D + \sum_{i=1}^{l} s_i q_i q_i^T$ to lose positive definiteness for some $l < r$, then a lower bound on the smallest eigenvalue of $G$ is computed and is used to determine $\mu^{(k)}$ in 3.2.1. The following results are needed in order to understand how this can be done.
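The recursive inversion just described can be sketched as a loop that starts from $D^{-1}$, applies the symmetric rank-one update 3.2.6 for each dyad, and stops at the first dyad that fails the test of Theorem 3.2.4. The function name `invert_by_dyads`, the return convention and the data are assumptions made for illustration; this is not Sisser's published algorithm.

```python
def invert_by_dyads(eta, dyads):
    """Add dyads s*q*q^T to D = eta*I one at a time, updating the inverse H.

    Returns (H, l), where l is the number of dyads absorbed before the
    positive definiteness test 1 + s q^T H q > 0 first failed.
    """
    n = len(dyads[0][1])
    H = [[1.0 / eta if i == j else 0.0 for j in range(n)] for i in range(n)]
    for added, (s, q) in enumerate(dyads):
        Hq = [sum(H[i][j] * q[j] for j in range(n)) for i in range(n)]
        denom = 1.0 + s * sum(q[i] * Hq[i] for i in range(n))
        if denom <= 0.0:
            return H, added          # adding this dyad would lose definiteness
        # H is symmetric, so H q q^T H = (Hq)(Hq)^T.
        H = [[H[i][j] - s * Hq[i] * Hq[j] / denom for j in range(n)]
             for i in range(n)]
    return H, len(dyads)


# Hypothetical dyads, positive coefficient first; D = I (eta = 1).
dyads = [(2.0, [1.0, 0.0]), (-0.5, [0.0, 1.0]), (-2.0, [1.0, 1.0])]
H, l = invert_by_dyads(1.0, dyads)
```

In this example the first two dyads are absorbed, giving the inverse of $\mathrm{diag}(3, 0.5)$, and the third is rejected, so `l` is 2.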

Theorem 3.2.5

Let $P, Q \in R^{n \times n}$ be symmetric with eigenvalues $\lambda_1 \ge \ldots \ge \lambda_n$ and $\mu_1 \ge \ldots \ge \mu_n$ respectively. Let $R = P + Q$ and suppose that $R$ has eigenvalues $\nu_1 \ge \ldots \ge \nu_n$. Then

$$\nu_i \ge \lambda_i + \mu_n \quad (i = 1, \ldots, n),$$

and

$$\nu_i \le \lambda_i + \mu_1 \quad (i = 1, \ldots, n).$$

Proof

Theorem 3.2.6

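Let $P \in R^{n \times n}$ be symmetric, let $q \in R^n$, and let $s < 0$. If $\lambda_n$ is the smallest eigenvalue of $P$ and $\nu_n$ is the smallest eigenvalue of $\bar{P}$ where

$$\bar{P} = P + s q q^T,$$

then

$$\nu_n \ge \lambda_n + s q^T q.$$

Proof

By Theorem 3.2.5 with $Q = s q q^T$, $\mu_n = s q^T q$, $\mu_i = 0$ $(i = 1, \ldots, n - 1)$,

$$\nu_i \ge \lambda_i + s q^T q \quad (i = 1, \ldots, n),$$

where $\lambda_1 \ge \ldots \ge \lambda_n$ and $\nu_1 \ge \ldots \ge \nu_n$ are the eigenvalues of $P$ and $\bar{P}$ respectively. □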
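A small worked instance of this bound, with data chosen here purely for illustration, takes $P = \mathrm{diag}(3, 1)$, $q = (1, 1)^T$ and $s = -1$:

```latex
P + s q q^T = \begin{pmatrix} 2 & -1 \\ -1 & 0 \end{pmatrix},
\qquad
\det\begin{pmatrix} 2 - \nu & -1 \\ -1 & -\nu \end{pmatrix}
  = \nu^2 - 2\nu - 1 = 0
\;\Longrightarrow\; \nu = 1 \pm \sqrt{2},
\qquad
\nu_2 = 1 - \sqrt{2} \approx -0.414 \;\ge\; \lambda_2 + s\, q^T q = 1 - 2 = -1 .
```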

Theorem 3.2.7

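Let $P \in R^{n \times n}$ be symmetric positive definite, let $q_i \in R^n$ $(i = 1, \ldots, r)$ and let $c_i < 0$ $(i = 1, \ldots, r)$. If

$$R = P + \sum_{i=1}^{r} c_i q_i q_i^T$$

has eigenvalues $\nu_1 \ge \ldots \ge \nu_n$ then

$$\nu_n > \sum_{i=1}^{r} c_i q_i^T q_i.$$

Proof

Let

$$R^{(t)} = P + \sum_{i=1}^{t} c_i q_i q_i^T \quad (t = 1, \ldots, r),$$

and let the eigenvalues of $R^{(t)}$ be $\nu_1^{(t)} \ge \ldots \ge \nu_n^{(t)}$. Then by Theorem 3.2.6,

$$\nu_n^{(1)} \ge \lambda_n + c_1 q_1^T q_1,$$

where $\lambda_n$ is the smallest eigenvalue of $P$. But $P$ is positive definite so $\lambda_n > 0$, whence $\nu_n^{(1)} > c_1 q_1^T q_1$. Suppose that for some $t \ge 1$,

$$\nu_n^{(t)} > \sum_{i=1}^{t} c_i q_i^T q_i.$$

Now

$$R^{(t+1)} = R^{(t)} + c_{t+1} q_{t+1} q_{t+1}^T,$$

so by Theorem 3.2.6,

$$\nu_n^{(t+1)} \ge \nu_n^{(t)} + c_{t+1} q_{t+1}^T q_{t+1} > \sum_{i=1}^{t+1} c_i q_i^T q_i.$$

So by finite induction on $t$,

$$\nu_n^{(r)} > \sum_{i=1}^{r} c_i q_i^T q_i. \quad \Box$$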

Suppose that, in 3.2.7, the first $l$ dyads have been added to $D$, one at a time, and that the test for positive definiteness in Theorem 3.2.4 has been passed for each dyad. Then for $j \le l$, $H^{(j)}$ is positive definite, where $H^{(j)}$ is defined by 3.2.13. Let

$$G = D + \sum_{i=1}^{k} s_i q_i q_i^T + \sum_{i=k+1}^{l} s_i q_i q_i^T + \sum_{i=l+1}^{r} s_i q_i q_i^T, \tag{3.2.14}$$

where $s_i > 0$ $(i = 1, \ldots, k)$ and $s_i < 0$ $(i = k + 1, \ldots, r)$. Suppose that

$$1 + s_{l+1} q_{l+1}^T H^{(l)} q_{l+1} \le 0. \tag{3.2.15}$$

Then by Theorem 3.2.7, $\nu_n$, the smallest eigenvalue of $G$, is bounded below according to

$$\nu_n > \sum_{i=l+1}^{r} s_i q_i^T q_i,$$

so if

$$\mu^{(k)} = \sum_{i=l+1}^{r} |s_i| q_i^T q_i \tag{3.2.16}$$

then $G^{(k)} + \mu^{(k)} I$ is positive definite and can be used in 3.2.1 to compute $x^{(k+1)}$.
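The shift 3.2.16 is cheap to evaluate, as the following sketch suggests. The function name and the data are assumptions made for illustration ($D = I$, two dyads already absorbed and one rejected), and the final check of positive definiteness uses leading principal minors only because the example is $2 \times 2$.

```python
def shift_from_remaining_dyads(remaining):
    """mu of 3.2.16: sum of |s_i| * q_i^T q_i over the dyads not yet added."""
    return sum(abs(s) * sum(c * c for c in q) for s, q in remaining)


def is_positive_definite_2x2(A):
    """Leading principal minors test for a symmetric 2 x 2 matrix."""
    return A[0][0] > 0.0 and A[0][0] * A[1][1] - A[0][1] * A[1][0] > 0.0


# Hypothetical data: D = I, dyads 1 and 2 absorbed, dyad 3 rejected by 3.2.15.
absorbed = [(2.0, [1.0, 0.0]), (-0.5, [0.0, 1.0])]
remaining = [(-2.0, [1.0, 1.0])]

# Assemble G = D + sum of all dyads, only in order to verify the bound.
G = [[1.0, 0.0], [0.0, 1.0]]
for s, q in absorbed + remaining:
    for i in range(2):
        for j in range(2):
            G[i][j] += s * q[i] * q[j]

mu = shift_from_remaining_dyads(remaining)
shifted = [[G[i][j] + (mu if i == j else 0.0) for j in range(2)] for i in range(2)]

g_is_pd = is_positive_definite_2x2(G)
shifted_is_pd = is_positive_definite_2x2(shifted)
```

Here $G$ itself is indefinite, while $G + \mu I$ with $\mu = 4$ is positive definite, as the bound predicts.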

The use of 3.2.16 is an inexpensive way of determining a suitable value for $\mu^{(k)}$, but as pointed out by Sisser [Sis-82a] a better value of $\mu^{(k)}$ could be obtained by splitting the $(l + 1)$th and all subsequent dyads and trying to absorb parts of those dyads without losing positive definiteness. Let

$$\rho_{l+1} = (\delta - 1)/(s_{l+1} q_{l+1}^T H^{(l)} q_{l+1}), \tag{3.2.17}$$

where $\delta > 0$ is determined experimentally as explained in §3.12 and $H^{(l)}$ is given by 3.2.13. Then

$$1 + \rho_{l+1} s_{l+1} q_{l+1}^T H^{(l)} q_{l+1} = \delta,$$

so by Theorem 3.2.4, the dyad $\rho_{l+1} s_{l+1} q_{l+1} q_{l+1}^T$ can be absorbed without losing positive definiteness to obtain $H^{(l+1)}$, where $H^{(l+1)}$ is given by 3.2.12 with $j = l + 1$ and $\rho_{l+1} s_{l+1}$ replacing $s_{l+1}$, and by 3.2.13

$$H^{(l+1)} = \Bigl(D + \sum_{i=1}^{l} s_i q_i q_i^T + \rho_{l+1} s_{l+1} q_{l+1} q_{l+1}^T\Bigr)^{-1}.$$
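The choice of the fraction in 3.2.17 can be illustrated with a short sketch. The function name, the value of $\delta$ and the matrix $H$ below are assumptions made for illustration; in the thesis $\delta$ is determined experimentally.

```python
def absorbable_fraction(H, s, q, delta):
    """rho of 3.2.17, chosen so that 1 + rho * s * q^T H q equals delta."""
    n = len(q)
    qHq = sum(q[i] * H[i][j] * q[j] for i in range(n) for j in range(n))
    return (delta - 1.0) / (s * qHq), qHq


# Hypothetical state: H is the inverse built from the dyads absorbed so far,
# and the next dyad s*q*q^T fails the test of Theorem 3.2.4.
H = [[1.0 / 3.0, 0.0], [0.0, 2.0]]
s, q = -2.0, [1.0, 1.0]
delta = 0.1

rho, qHq = absorbable_fraction(H, s, q, delta)
test_value = 1.0 + rho * s * qHq   # should reproduce delta
```

Because `test_value` equals $\delta > 0$, the scaled dyad $\rho s q q^T$ passes the test of Theorem 3.2.4 even though the full dyad does not.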

The procedure is repeated for $i = l + 2, \ldots, r$, with

in which

E

y=i j=i+i

An improved value of $\mu^{(k)}$ for use in 3.2.1 is then given by

$$\mu^{(k)} = \sum_{i=l+1}^{r} (1 - \rho_i) |s_i| q_i^T q_i. \tag{3.2.19}$$

The matrix $G^{(k)} + \mu^{(k)} I$ is inverted by using 3.2.12 as follows. There are two cases, namely (i) $l < n$; (ii) $l \ge n$.

and use 3.2.12 to obtain $(G^{(k)} + \mu^{(k)} I)^{-1} = H^{(r)}$. There is no need to test for positive definiteness when adding a dyad, because $\mu^{(k)}$ is large enough to ensure that $G^{(k)} + \mu^{(k)} I$ is positive definite.

(ii) $l \ge n$. In this case,

$$H^{(r)} = \Bigl(D + \sum_{i=1}^{l} s_i q_i q_i^T + \sum_{i=l+1}^{r} \rho_i s_i q_i q_i^T\Bigr)^{-1}$$

has already been computed. The SMW formula is used to add the dyads corresponding to $\mu^{(k)} I$, namely $\mu^{(k)} e_i e_i^T$ $(i = 1, \ldots, n)$ where $e_i$ is column $i$ of the $n \times n$ unit matrix, and then the SMW formula is used to add the $r - l$ dyads $(1 - \rho_i) s_i q_i q_i^T$ $(i = l + 1, \ldots, r)$ where $\rho_i$ is given by 3.2.18. Again there is no need to test for positive definiteness.

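If, in 3.2.1, $\|g^{(k)}\|_2 < \varepsilon$ for some $k$, where $\varepsilon \in R$ is given and is such that $0 < \varepsilon \ll 1$, and $G^{(k)}$ is positive definite so that $\mu^{(k)} = 0$, then $x^{(k)}$ is accepted as the final estimate of $x^*$. If however $\|g^{(k)}\|_2 < \varepsilon$ and $G^{(k)}$ is not at least positive semi-definite then $x^{(k)}$ is a saddle point. In this case, $x^{(k+1)}$ is obtained by adding to $x^{(k)}$ a small perturbation of random direction such that $\|g^{(k+1)}\|_2 > \varepsilon$.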

Sisser's method is expressed as an algorithm in [Sis-82a], and has been implemented in S-algol [ColM-82a] on a VAX-11/785 computer using ALGLIB [SheW-85] to express the Hessian as a sum of dyads. Sisser's algorithm is referred to in this thesis as Algorithm S (see Figure 3.10.1).

Sisser [Sis-82a] has described 3 modifications of S which are referred to in this thesis as S1, S2, and S3. In S1 (see Figure 3.10.2), $\mu^{(k)}$ is determined from 3.2.16. In S2 (see Figure 3.10.3), if, for some $i \in \{1, \ldots, r\}$,

$$1 + s_i q_i^T H^{(i-1)} q_i \le 0, \tag{3.2.20}$$

so that by Theorem 3.2.4, positive definiteness would be lost by adding the $i$th dyad, then in the dyads $i, \ldots, r$, $s_j$ is replaced with $|s_j|$ and $H^{(j)}$ $(j = i, \ldots, r)$ is computed from 3.2.12 to give a positive definite matrix $H^{(r)}$ which is regarded as an approximation to $(G^{(k)})^{-1}$.

In S3 (see Figure 3.10.4), if for some $i \in \{1, \ldots, r\}$ 3.2.20 holds then the inverse matrix in 3.2.1 is replaced with the $n \times n$ unit matrix $I$, so that a steepest descent step is taken.
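The coefficient replacement used in S2 can be sketched as follows. This is an illustration under stated assumptions, not the algorithm of Figure 3.10.3: the function name `s2_inverse` and the data are hypothetical, and the sketch covers only the rule that once a dyad fails the test, that dyad and every later one is added with $|s_j|$ in place of $s_j$.

```python
def s2_inverse(eta, dyads):
    """Build a positive definite approximate inverse in the manner of S2."""
    n = len(dyads[0][1])
    H = [[1.0 / eta if i == j else 0.0 for j in range(n)] for i in range(n)]
    use_abs = False
    for s, q in dyads:
        Hq = [sum(H[i][j] * q[j] for j in range(n)) for i in range(n)]
        qHq = sum(q[i] * Hq[i] for i in range(n))
        if not use_abs and 1.0 + s * qHq <= 0.0:
            use_abs = True           # the test fails: switch to |s_j| from here on
        if use_abs:
            s = abs(s)
        denom = 1.0 + s * qHq        # positive: s >= 0 after the switch
        H = [[H[i][j] - s * Hq[i] * Hq[j] / denom for j in range(n)]
             for i in range(n)]
    return H


# Hypothetical dyads with D = I; the third dyad fails the test.
dyads = [(2.0, [1.0, 0.0]), (-0.5, [0.0, 1.0]), (-2.0, [1.0, 1.0])]
H = s2_inverse(1.0, dyads)
```

In this example the result is the inverse of $\mathrm{diag}(3, 0.5) + 2qq^T$ with $q = (1, 1)^T$, which is positive definite. Under S3 the same failure would instead lead to the unit matrix being used.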

3.3 The Modifications M1 and M2 of Sisser's Method

Let $G \in R^{n \times n}$ be given by 3.2.7 and suppose that the dyads have not been ordered so that 3.2.8 holds. Let $H^{(0)}$ be given by 3.2.11 and let the sequence $(H^{(j)})$ be generated from 3.2.12. Suppose that for some $i \in \{2, \ldots, 2m\}$,

$$1 + s_j q_j^T H^{(j-1)} q_j > 0 \quad (j = 1, \ldots, i - 1),$$

but that

$$1 + s_i q_i^T H^{(i-1)} q_i \le 0. \tag{3.3.1}$$

Then for $j = 1, \ldots, i - 1$, $H^{(j)}$ is positive definite, but $H^{(i)}$ would not be positive definite. This suggests that the dyad $s_i q_i q_i^T$ should not be added. Suppose that for $i = 1, \ldots, 2m$, all dyads $s_i q_i q_i^T$ such that 3.3.1 holds are not added. Then the resulting matrix is positive definite. If at least one of the dyads $s_i q_i q_i^T$ $(i = 1, \ldots, 2m)$ has been added, and then the dyads $-\eta e_i e_i^T$ $(i = 1, \ldots, n)$ are added, then the resulting matrix $H^{(r)}$ is a positive definite approximation to $G^{-1}$ which is exact if all of the dyads $s_i q_i q_i^T$ $(i = 1, \ldots, 2m)$ are added.

If 3.3.1 holds for $i = 1, \ldots, 2m$ then the inverse matrix in 3.2.1 is replaced with the $n \times n$ unit matrix $I$ and a steepest descent step is taken.
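The skipping rule described above can be sketched as follows. The function name `skip_failing_dyads` and the data are assumptions made for illustration; the sketch covers only the decision to leave out dyads for which 3.3.1 holds and the fallback to the unit matrix, and omits the final addition of the dyads built from the unit vectors $e_i$.

```python
def skip_failing_dyads(eta, dyads):
    """Add unordered dyads to D = eta*I, leaving out those that fail 3.3.1.

    Returns (H, added): the inverse of D plus the accepted dyads, and the
    number of dyads accepted. If none is accepted, the unit matrix is
    returned so that a steepest descent step results.
    """
    n = len(dyads[0][1])
    H = [[1.0 / eta if i == j else 0.0 for j in range(n)] for i in range(n)]
    added = 0
    for s, q in dyads:
        Hq = [sum(H[i][j] * q[j] for j in range(n)) for i in range(n)]
        denom = 1.0 + s * sum(q[i] * Hq[i] for i in range(n))
        if denom <= 0.0:
            continue                 # 3.3.1 holds: do not add this dyad
        H = [[H[i][j] - s * Hq[i] * Hq[j] / denom for j in range(n)]
             for i in range(n)]
        added += 1
    if added == 0:
        H = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    return H, added


# Hypothetical unordered dyads with D = I; the first one fails the test.
dyads = [(-2.0, [1.0, 1.0]), (2.0, [1.0, 0.0]), (-0.5, [0.0, 1.0])]
H, added = skip_failing_dyads(1.0, dyads)
```

Here the first dyad is skipped and the other two are accepted, so `H` is the inverse of $\mathrm{diag}(3, 0.5)$.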

The preceding ideas are essentially modification M1 of Sisser's method (see Figure 3.10.5). A considerable saving in computational labour can be made by omitting to add the dyads $s_i q_i q_i^T$ for which

3.3.2

where $\varepsilon_M$ is the smallest machine number such that $1 + \varepsilon_M > 1$. This is essentially modification M2 (see Figure 3.10.6).
