A SUPERLINEARLY CONVERGENT SEQUENTIAL QUADRATICALLY CONSTRAINED QUADRATIC PROGRAMMING ALGORITHM FOR DEGENERATE NONLINEAR PROGRAMMING MIHAI ANITESCU


Abstract. We present an algorithm that achieves superlinear convergence for nonlinear programs satisfying the Mangasarian-Fromovitz constraint qualification and the quadratic growth condition. This convergence result is obtained despite the potential lack of a locally convex augmented Lagrangian. The algorithm solves a succession of subproblems that have quadratic objective and quadratic constraints, both possibly nonconvex. By the use of a trust-region constraint we guarantee that any stationary point of the subproblem induces superlinear convergence, which avoids the problem of computing a global minimum.

1. Introduction.

Recently, there has been renewed interest in analyzing and modifying the algorithms for constrained nonlinear optimization for cases where the traditional regularity conditions do not hold [5, 12, 11, 20, 24, 23]. This research has been motivated by the fact that large-scale nonlinear programming problems tend to be almost degenerate (have large condition numbers for the Jacobian of the active constraints). It is therefore important to define algorithms that depend as little as possible on the ill-conditioning of the constraints. In this work, we term as degenerate those nonlinear programs (NLPs) for which the gradients of the active constraints are linearly dependent. In this case there may be several feasible Lagrange multipliers.

Many of the previous analyses and rate-of-convergence results for degenerate NLP [5, 12, 11, 20, 24, 23] are based on the validity of some second-order conditions. These are essentially equivalent to the condition in unconstrained optimization that, for a critical point of a function $f(x)$ to be a local minimum, $f_{xx} \succeq 0$ is a necessary condition and $f_{xx} \succ 0$ is a sufficient condition. Here $\succeq$ is the positive semidefinite ordering. The place of $f_{xx}$ in constrained optimization is taken for these conditions by $\mathcal{L}_{xx}$, the Hessian of the Lagrangian, which is now required to be positive definite on the critical cone for one or all of the Lagrange multipliers [7, 21].

This work differs from previous approaches in that we assume only that

1. At a local solution $x^*$ of the constrained nonlinear program, the first-order Mangasarian-Fromovitz [18, 17] constraint qualification holds.

2. The quadratic growth condition (QG) [6, 15] is satisfied:

$$f(x) \ge f(x^*) + \sigma \|x - x^*\|^2 \tag{1.1}$$

for some $\sigma > 0$ and all $x$ feasible in a neighborhood of $x^*$.

3. The data of the problem are twice continuously differentiable.

These assumptions are equivalent to a weaker form of the second-order sufficient conditions [14, 6], which does not require the positive semidefiniteness of the Hessian of the Lagrangian on the entire critical cone. In a recent paper [2] it has been shown that these conditions guarantee that $x^*$ is an isolated stationary point and that a steepest-descent-like algorithm induces linear convergence to $x^*$. The framework used here accommodates even problems for which no locally convex augmented Lagrangian exists [2], which do not satisfy the assumptions of most other convergence results [5, 12, 11, 20, 24].

Thackeray 301, Department of Mathematics, University of Pittsburgh, Pittsburgh, PA 15213 ([email protected]). Part of this work was completed while the author was the Wilkinson Fellow at the Mathematics and Computer Science Division, Argonne National Laboratory. This work was supported by the Mathematical, Information, and Computational Sciences Division subprogram of the Office of Advanced Scientific Computing, U.S. Department of Energy, under Contract W-31-109-Eng-38. This work was also supported by award DMS-9973071 of the National Science Foundation.

In this paper we define an algorithm that is superlinearly convergent even in the very general conditions outlined above. The trade-off is that the subproblems to be solved are more complex than a quadratic program. The algorithm can be justified by a particular perspective on Newton's method for unconstrained optimization. If $f(x)$ is the function to be minimized without constraints, then, sufficiently close to a solution $x^*$, Newton's direction $d$ is a solution of the quadratic minimization problem

$$\min_{d \in \mathbb{R}^n} \; f(x) + \nabla_x f(x)^T d + \tfrac{1}{2} d^T \nabla^2_{xx} f(x)\, d.$$
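As a minimal illustration of this perspective, the following Python sketch applies Newton's method in one dimension, taking each step as the minimizer of the local quadratic model. The test function $f(x) = e^x - 2x$ and the names `newton_direction` and `newton_minimize` are assumptions chosen for illustration; they do not come from the paper.

```python
import math

def newton_direction(grad, hess):
    """Minimizer d of the model grad*d + 0.5*hess*d**2 (requires hess > 0)."""
    if hess <= 0.0:
        raise ValueError("quadratic model is not strictly convex")
    return -grad / hess

def newton_minimize(df, d2f, x0, iters=6):
    """Run a fixed number of Newton steps x <- x + d and return all iterates."""
    xs = [x0]
    for _ in range(iters):
        x = xs[-1]
        xs.append(x + newton_direction(df(x), d2f(x)))
    return xs

# Hypothetical test function f(x) = exp(x) - 2x, minimized at x* = ln 2.
iterates = newton_minimize(lambda x: math.exp(x) - 2.0,
                           lambda x: math.exp(x),
                           x0=1.0)
errors = [abs(x - math.log(2.0)) for x in iterates]
```

In this example the error roughly squares from one iterate to the next until it reaches rounding level, which is the fast local behavior that the constrained algorithm is designed to retain.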

The term $f(x)$ is constant for this minimization problem, but we include it to emphasize that we can regard $d$ as a solution of the second-order approximation to the problem. If we have an inequality constrained nonlinear program,

$$\min_x f(x) \quad \text{subject to} \quad g_i(x) \le 0, \quad i = 1, 2, \ldots, m,$$

its second-order approximation at $x$ is the following problem:

$$\begin{array}{ll} \min_{d \in \mathbb{R}^n} & f(x) + \nabla_x f(x)^T d + \tfrac{1}{2} d^T \nabla^2_{xx} f(x)\, d \\ \text{subject to} & g_i(x) + \nabla_x g_i(x)^T d + \tfrac{1}{2} d^T \nabla^2_{xx} g_i(x)\, d \le 0, \quad i = 1, 2, \ldots, m. \end{array}$$
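To make the construction concrete, the sketch below evaluates the second-order models in plain Python for a small degenerate problem. The toy problem ($f(x) = x_1 + x_2^2$, $g_1(x) = -x_1$, $g_2(x) = x_2^2 - x_1$) and all helper names are illustrative assumptions, not taken from the paper. Because every function in the toy problem is quadratic, each model reproduces the shifted function exactly, which gives a simple sanity check.

```python
def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

def matvec(M, v):
    return [dot(row, v) for row in M]

def quad_model(value, grad, hess, d):
    """Second-order model: value + grad^T d + 0.5 * d^T hess d."""
    return value + dot(grad, d) + 0.5 * dot(d, matvec(hess, d))

# Toy problem (an assumption for illustration); both constraints are active at x* = 0.
def f(x):  return x[0] + x[1] ** 2
def g1(x): return -x[0]
def g2(x): return x[1] ** 2 - x[0]

def derivatives(x):
    """Return (value, gradient, Hessian) triples for f, g1, g2 at x."""
    return {
        "f":  (f(x),  [1.0, 2.0 * x[1]],  [[0.0, 0.0], [0.0, 2.0]]),
        "g1": (g1(x), [-1.0, 0.0],        [[0.0, 0.0], [0.0, 0.0]]),
        "g2": (g2(x), [-1.0, 2.0 * x[1]], [[0.0, 0.0], [0.0, 2.0]]),
    }

x = [0.3, -0.2]
d = [-0.1, 0.25]
models = {name: quad_model(v, g, H, d) for name, (v, g, H) in derivatives(x).items()}
shifted = [x[0] + d[0], x[1] + d[1]]
```

For functions that are not quadratic the models agree with the shifted functions only up to terms of higher than second order in $d$.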

We call such a problem a quadratically constrained quadratic program (QCQP). To ensure that the problem is bounded even for $x$ far from the solution $x^*$, we add to the problem a trust-region constraint, which is also quadratic: $d^T d \le \gamma^2$.

The problem is generally not convex, and thus finding the global optimum may be a difficult problem. Also, the trust-region constraint may interfere with the order of convergence. However, we show that for $x$ close to $x^*$ and for sufficiently small but fixed $\gamma$:

1. The trust-region constraint is inactive at any stationary point of the QCQP.

2. Any stationary point $d$ of the QCQP used as a progress direction induces superlinear convergence.

Therefore, finding a local solution to the QCQP is sufficient to induce superlinear convergence of the iterates, which considerably reduces the conceptual complexity of a sequential QCQP (SQCQP) algorithm. Note that the QCQP subproblem is identical to the one used in [16], although the analysis conditions in this work are more general.

The paper is structured as follows. In Subsection 1.1 we discuss the different conditions defining a stationary point of a nonlinear program and the quadratic growth condition. Section 2 characterizes stationary points of the second-order approximation (QCQP) of the nonlinear program at $x^*$. We show that, if the trust-region constraint defines a sufficiently small region, then the Mangasarian-Fromovitz constraint qualification is satisfied at any feasible point and $d = 0$ is the unique stationary point of the QCQP. As a result, in Section 3 we prove that, for $x$ sufficiently close to $x^*$, the trust-region constraint is inactive at any stationary point of the QCQP, and we prove the superlinear convergence of the SQCQP algorithm. We conclude with Section 4, where we briefly discuss possible approaches to solving the QCQP subproblem.

1.1. Previous Work, Framework, and Notations.

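We deal with the NLP problem

$$\min_x f(x) \quad \text{subject to} \quad g(x) \le 0, \tag{1.2}$$

where $f : \mathbb{R}^n \to \mathbb{R}$ and $g : \mathbb{R}^n \to \mathbb{R}^m$ are twice continuously differentiable.

We call $x$ a stationary point if the Fritz John conditions hold: there exist $\lambda \in \mathbb{R}^m$, $\lambda_0 \in \mathbb{R}$ with $(\lambda, \lambda_0) \ne 0$ such that

$$\nabla_x \mathcal{L}(x, \lambda_0, \lambda) = 0; \quad \lambda_0 \ge 0, \ \lambda \ge 0; \quad g(x) \le 0; \quad \lambda^T g(x) = 0. \tag{1.3}$$

Here $\mathcal{L}$ is the Lagrangian function

$$\mathcal{L}(x, \lambda_0, \lambda) = \lambda_0 f(x) + \lambda^T g(x). \tag{1.4}$$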

A local solution $x^*$ of (1.2) is a stationary point [19]. If certain regularity conditions hold at $x^*$ (discussed below), then there exists $\lambda^* \ge 0$ such that $x^*$ with $\lambda^*$ and $\lambda_0 = 1$ satisfy (1.3). In that case (1.3) are referred to as the KKT (Karush-Kuhn-Tucker) conditions [3, 4, 8] and $\lambda^*$ are referred to as the Lagrange multipliers. For that case, which is the one that appears most often in this work, we define the Lagrangian as

$$\mathcal{L}(x, \lambda) = f(x) + \lambda^T g(x), \tag{1.5}$$

and the Karush-Kuhn-Tucker conditions become

$$\nabla_x \mathcal{L}(x, \lambda) = 0; \quad \lambda \ge 0; \quad g(x) \le 0; \quad \lambda^T g(x) = 0. \tag{1.6}$$

Since our analysis is limited to a neighborhood of a point $x^*$ that is a strict local minimum, we assume that all constraints are active at $x^*$, or $g(x^*) = 0$. Such a situation can be obtained by choosing a sufficiently small trust region and simply dropping the constraints $i$ for which $g_i(x^*) < 0$, since this relationship holds in an entire neighborhood of $x^*$. This does not reduce the generality of our results, but it simplifies the notation because now we do not have to refer separately to the active set.

The regularity condition, or constraint qualification, ensures that a linear approximation of the feasible set in the neighborhood of $x^*$ captures the geometry of the feasible set. Often in local convergence analysis of constrained optimization algorithms, it is assumed that the constraint gradients $\nabla_x g_i(x^*)$, $i = 1, 2, \ldots, m$, are linearly independent, so that the Lagrange multiplier in (1.6) is unique. We assume instead the Mangasarian-Fromovitz constraint qualification (MFCQ) [18, 17]:

$$\nabla_x g_i(x^*)^T p \le -\zeta', \quad i = 1, 2, \ldots, m, \quad \text{for some } \zeta' > 0, \ p \in \mathbb{R}^n, \ \|p\| = 1. \tag{1.7}$$
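A short Python sketch can make the difference between linear independence and MFCQ tangible. For a given candidate direction it returns the largest margin by which the inequalities in (1.7) hold. The function name `mfcq_margin` and the gradient data are hypothetical, chosen only for illustration.

```python
import math

def mfcq_margin(gradients, p):
    """Largest zeta with grad_i^T p <= -zeta for all i, after normalizing p.

    A positive return value certifies MFCQ for these active constraint gradients.
    """
    norm = math.sqrt(sum(pj * pj for pj in p))
    if norm == 0.0:
        raise ValueError("p must be nonzero")
    unit = [pj / norm for pj in p]
    return -max(sum(gj * pj for gj, pj in zip(g, unit)) for g in gradients)

# Hypothetical active gradients at x*: parallel vectors, hence linearly dependent,
# yet the direction p = (1, 0) satisfies (1.7).
degenerate = [[-1.0, 0.0], [-2.0, 0.0]]
margin_ok = mfcq_margin(degenerate, [1.0, 0.0])

# Opposed gradients (for example an equality split into two inequalities).
# Only three trial directions are checked here; this is not a proof that MFCQ fails.
opposed = [[-1.0, 0.0], [1.0, 0.0]]
margin_bad = max(mfcq_margin(opposed, p)
                 for p in ([1.0, 0.0], [-1.0, 0.0], [0.0, 1.0]))
```

The first data set is degenerate in the sense used in this paper but still satisfies MFCQ; in the second, no trial direction gives a positive margin.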

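It is well known [9] that MFCQ is equivalent to boundedness of the set $\mathcal{M}(x^*)$ of Lagrange multipliers that satisfy (1.6), that is,

$$\mathcal{M}(x^*) = \{ \lambda \ge 0 \mid (x^*, \lambda) \text{ satisfy (1.6)} \}. \tag{1.8}$$

Note that $\mathcal{M}(x^*)$ is certainly polyhedral in any case. Another condition equivalent to MFCQ (1.7) is [10]

$$0 \ne \sum_{i=1}^m \lambda_i \nabla_x g_i(x^*), \quad \forall \lambda_i \ge 0, \ i = 1, 2, \ldots, m, \ \text{such that } \sum_{i=1}^m \lambda_i > 0. \tag{1.9}$$

The critical cone at $x^*$ is [7, 22]

$$\mathcal{C} = \left\{ u \in \mathbb{R}^n \mid \nabla_x g_i(x^*)^T u \le 0, \ i = 1, 2, \ldots, m; \ \nabla_x f(x^*)^T u = 0 \right\}. \tag{1.10}$$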

We briefly review some of the second-order conditions in the literature. In the framework of [7], the second-order sufficient conditions for $x^*$ to be an isolated local solution of (1.2) are [7, 8]:

$$\exists \lambda^* \in \mathcal{M}(x^*), \ \exists \sigma > 0 \ \text{such that} \ v^T \mathcal{L}_{xx}(x^*, \lambda^*) v \ge \sigma \|v\|_2^2, \quad \forall v \in \mathcal{C}. \tag{1.11}$$

If these conditions hold at $x^*$ for some $\lambda^*$, then the quadratic growth condition is satisfied, irrespective of the validity of the first-order constraint qualification [7, 8]. An important consequence of the condition (1.11) is that $x^*$ is a local minimum of the augmented Lagrangian

$$\mathcal{L}_c(x, \lambda^*) = \mathcal{L}(x, \lambda^*) + c \|g(x)\|^2$$

for a sufficiently large constant $c$.

A refinement of the second-order conditions was introduced in [14]. In the presence of MFCQ, those conditions require that

$$\forall u \in \mathcal{C}, \ u \ne 0, \quad \exists \lambda^* \in \mathcal{M}(x^*) \ \text{such that} \ u^T \nabla^2_{xx} \mathcal{L}(x^*, \lambda^*) u > 0. \tag{1.12}$$

Further analysis shows that, in the presence of MFCQ, these conditions are necessary and sufficient for the quadratic growth condition to hold [6, 14, 15, 22].

If the condition (1.12) holds, but (1.11) does not, then there may be no augmented Lagrangian with a positive semidefinite Hessian, as is shown with an example in [2]. This is an interesting aspect since it invalidates the usual working assumption of Lagrange multiplier methods [4]. It also shows that the analysis in this paper is done without assuming the existence of an augmented Lagrangian that has $x^*$ as an unconstrained minimum.

In our analysis we use the $L_\infty$ nondifferentiable exact penalty function:

$$P(x) = \max\{0, g_1(x), \ldots, g_m(x)\}. \tag{1.13}$$

If the MFCQ (1.7) conditions hold at $x^*$, then the quadratic growth condition (1.1) and the second-order conditions (1.12) are each equivalent to the following condition [6]:

$$\max\{ f(x) - f(x^*), \ P(x) \} \ge \sigma \|x - x^*\|^2 \tag{1.14}$$

for some $\sigma > 0$ and all $x$ in a neighborhood of $x^*$.
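The sketch below evaluates the penalty function (1.13) and probes condition (1.14) numerically, reading (1.14) as a lower bound on the larger of the objective gap $f(x) - f(x^*)$ and the penalty $P(x)$. The toy problem ($f(x) = x_1 + x_2^2$, $g_1(x) = -x_1$, $g_2(x) = x_2^2 - x_1$, $x^* = 0$), the sampling grid, and the function names are assumptions made for illustration; a grid check suggests a growth constant but does not prove one.

```python
def penalty(g_values):
    """L-infinity exact penalty: the largest constraint violation, or 0 if feasible."""
    return max(0.0, *g_values)

# Toy problem (hypothetical); both constraint gradients at x* = 0 equal (-1, 0).
def f(x): return x[0] + x[1] ** 2
def g(x): return [-x[0], x[1] ** 2 - x[0]]

def growth_ratio(x, x_star=(0.0, 0.0)):
    """max{f(x) - f(x*), P(x)} / ||x - x*||^2 for x != x*."""
    dist2 = sum((xi - si) ** 2 for xi, si in zip(x, x_star))
    return max(f(x) - f(x_star), penalty(g(x))) / dist2

# Sample a grid on [-0.5, 0.5]^2, feasible and infeasible points alike.
grid = [i / 10.0 for i in range(-5, 6)]
ratios = [growth_ratio((u, v)) for u in grid for v in grid if (u, v) != (0.0, 0.0)]
sigma_estimate = min(ratios)
```

On this grid the smallest ratio is attained along the axis $x_1 = 0$, where the objective gap and the squared distance coincide.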

For some function $h : \mathbb{R}^n \to \mathbb{R}^k$ we denote by $c_{1h}$, $c_{2h}$ bounds depending on the first and second derivatives of $h$. The positive and negative parts of $h(x)$ are $h^+(x) = \max\{h(x), 0\}$ and, respectively, $h^-(x) = \max\{-h(x), 0\}$, both taken componentwise. With this notation $h(x) = h^+(x) - h^-(x)$. Also, in our notation, $\nabla_x g_i(x)$ and $\nabla_x g(x)$ are column vectors.


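In this work we need to estimate distances to sets described by linear constraints:

$$\mathcal{P} = \{ d \in \mathbb{R}^n \mid M_{eq} d + q_{eq} = 0, \ M_{in} d + q_{in} \le 0 \}, \tag{1.15}$$

where $M_{eq}$ and $M_{in}$ are $n_{eq} \times n$ and, respectively, $n_{in} \times n$ matrices and $q_{eq}$ and $q_{in}$ are $n_{eq}$- and, respectively, $n_{in}$-dimensional vectors. By Hoffman's Lemma [13], if $\mathcal{P} \ne \emptyset$, there exists $c_{\mathcal{P}} > 0$ such that

$$\forall \tilde d \in \mathbb{R}^n, \quad D(\tilde d, \mathcal{P}) \le c_{\mathcal{P}} \max\left\{ \| M_{eq} \tilde d + q_{eq} \|_\infty, \ \| (M_{in} \tilde d + q_{in})^+ \|_\infty \right\}, \tag{1.16}$$

where by $D(\tilde d, \mathcal{P})$ we denote the distance from $\tilde d$ to the set $\mathcal{P}$. This result allows us to bound the distance from a point $\tilde d$ to a polyhedral set in terms of the infeasibility of $\tilde d$ in the representation (1.15).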
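For one simple polyhedron the bound (1.16) can be checked by hand, and the Python sketch below does so numerically. It assumes the nonnegative orthant, written in the form (1.15) with no equality constraints, $M_{in} = -I$ and $q_{in} = 0$, for which the Euclidean distance is known in closed form and the constant $c_{\mathcal{P}} = \sqrt{n}$ suffices. The sample points and function names are illustrative assumptions.

```python
import math

def positive_part(v):
    """Componentwise positive part v^+ = max(v, 0)."""
    return [max(vi, 0.0) for vi in v]

def inf_norm(v):
    return max(abs(vi) for vi in v)

def distance_to_orthant(d):
    """Euclidean distance from d to {d : d >= 0}, attained at the projection max(d, 0)."""
    return math.sqrt(sum(min(di, 0.0) ** 2 for di in d))

def hoffman_bound(d):
    """Right-hand side of (1.16) with M_in = -I, q_in = 0 and c_P = sqrt(n)."""
    residual = positive_part([-di for di in d])   # (M_in d + q_in)^+
    return math.sqrt(len(d)) * inf_norm(residual)

points = [[1.0, 2.0, 3.0], [-1.0, 2.0, -3.0], [-0.5, -0.5, -0.5], [0.0, -4.0, 1.0]]
checks = [(distance_to_orthant(d), hoffman_bound(d)) for d in points]
```

For a general polyhedron the constant $c_{\mathcal{P}}$ depends on the matrices in (1.15) and is usually not available in closed form; the lemma only asserts that it exists.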

2. Stationary Points of Quadratically Constrained Quadratic Programs.

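In this section we investigate the stationary points of the quadratically constrained quadratic program

$$\begin{array}{ll} \min_{d \in \mathbb{R}^n} & a^T d + \tfrac{1}{2} d^T A d \\ \text{subject to} & b_i^T d + \tfrac{1}{2} d^T B_i d \le 0, \quad i = 1, 2, \ldots, m, \\ & d^T d \le \gamma^2, \end{array} \tag{2.1}$$

where $\gamma > 0$ defines a trust-region constraint, $A$, $B_i$, $i = 1, 2, \ldots, m$, are $n \times n$ symmetric matrices and $a \in \mathbb{R}^n$, $b_i \in \mathbb{R}^n$, $i = 1, 2, \ldots, m$. We denote this program by TRQCQP($\gamma$). Our assumptions concerning (2.1) are:

1. At $d = 0$, MFCQ (1.7) holds:

$$b_i^T p \le -\zeta' \quad \text{for all } i \in \{1, 2, \ldots, m\} \text{ and some } \zeta' > 0, \ p \in \mathbb{R}^n, \ \|p\| = 1. \tag{2.2}$$

2. The quadratic growth condition is satisfied near $d = 0$: there exist $\gamma_1' > 0$ and $\sigma_1 > 0$ such that

$$a^T d + \tfrac{1}{2} d^T A d \ge \sigma_1 \|d\|^2 \quad \text{whenever} \quad b_i^T d + \tfrac{1}{2} d^T B_i d \le 0, \ i = 1, 2, \ldots, m, \ \text{and} \ d^T d \le (\gamma_1')^2. \tag{2.3}$$

A local solution of (2.1) is clearly $d = 0$.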

The aim of this section is to show that under assumptions (2.3) and (2.2), there exists $\gamma_5 > 0$ such that $d = 0$ is the only stationary point of TRQCQP($\gamma$) (2.1), for any $0 < \gamma \le \gamma_5$. As a consequence, any algorithm that reaches a stationary point of TRQCQP($\gamma$) (2.1) finds its global optimum. The results from [2] ensure that $d = 0$ is an isolated stationary point of TRQCQP($\gamma$) (2.1). However, the developments of this section are necessary to ensure that additional stationary points are not introduced by the trust-region constraint.

The proof has the following steps, each stated for sufficiently small $\gamma$.

- Lemma 2.4 proves that MFCQ (1.7) is satisfied at all stationary points $\tilde d$ of (2.1). Therefore, at any stationary point there exist Lagrange multipliers that satisfy (1.6).

- Lemma 2.3 ultimately implies that for any Lagrange multiplier $\lambda$ at a stationary point $\tilde d$ of (2.1) there exists a sufficiently close Lagrange multiplier $\lambda^*$ at $d = 0$ whose active subset is included in the active subset of $\lambda$. This leads to the identity $(\lambda_i + \lambda_i^*)(b_i^T \tilde d + \tfrac{1}{2} \tilde d^T B_i \tilde d) = 0$, which helps bound above the variations in the objective function of (2.1) in the proof of Theorem 2.7.

- Lemma 2.5 proves that the multiplier of the trust-region constraint is bounded above. This in turn implies Lemma 2.6: the Lagrange multipliers of all potential stationary points are uniformly bounded.

- Theorem 2.7, the main result of this section, proves that $\tilde d = 0$ is the unique stationary point of (2.1).

Subsection 2.1 contains additional results implied by Hoffman's Lemma (1.16), which are used in Section 3.

2.1. Sensitivity results for Lagrange Multipliers.

An immediate consequence of MFCQ (2.2) is that the set of Lagrange multipliers of TRQCQP($\gamma$) (2.1) at $d = 0$,

$$\mathcal{M} = \left\{ \lambda \in \mathbb{R}^m \ \Big| \ a + \sum_{i=1}^m \lambda_i b_i = 0, \ \lambda \ge 0 \right\}, \tag{2.4}$$

is nonempty and bounded.
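For a degenerate problem the set (2.4) contains more than one point. The Python sketch below uses hypothetical data, $a = (1, 0)$ and $b_1 = b_2 = (-1, 0)$, for which the set is the segment of nonnegative multipliers with $\lambda_1 + \lambda_2 = 1$: bounded, but not a singleton. The helper names are assumptions made for illustration.

```python
def multiplier_residual(a, b_list, lam):
    """Infinity norm of a + sum_i lam_i * b_i, the stationarity residual in (2.4)."""
    n = len(a)
    r = [a[j] + sum(l * b[j] for l, b in zip(lam, b_list)) for j in range(n)]
    return max(abs(rj) for rj in r)

def in_multiplier_set(a, b_list, lam, tol=1e-12):
    """Membership test: nonnegative multipliers with (numerically) zero residual."""
    return all(l >= 0.0 for l in lam) and multiplier_residual(a, b_list, lam) <= tol

# Hypothetical degenerate data: b_1 and b_2 are linearly dependent.
a = [1.0, 0.0]
b_list = [[-1.0, 0.0], [-1.0, 0.0]]

# Five points on the segment lam_1 + lam_2 = 1, lam >= 0.
samples = [[t / 4.0, 1.0 - t / 4.0] for t in range(5)]
members = [in_multiplier_set(a, b_list, lam) for lam in samples]

# Zero residual but a negative component, so not a Lagrange multiplier.
outside = in_multiplier_set(a, b_list, [2.0, -1.0])
```

The lemmas that follow quantify how multipliers of a perturbed system relate to points of this set.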

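Lemma 2.1. There exists $c_{\mathcal{M}} > 0$ such that, for any $w \in \mathbb{R}^n$ and for any $\lambda \in \mathbb{R}^m$ satisfying

$$a + \sum_{i=1}^m \lambda_i b_i = w, \quad \lambda \ge 0,$$

there exists a $\lambda^* \in \mathcal{M}$ such that $\|\lambda - \lambda^*\| \le c_{\mathcal{M}} \|w\|$.

Proof. Follows by direct application of Hoffman's Lemma (1.16), after using that $\|w\|_\infty \le \|w\|$.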

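Lemma 2.2. There exists $\epsilon > 0$ such that for all $w \in \mathbb{R}^n$ with $\|w\| \le \epsilon$ and any $\lambda$ satisfying

$$a + \sum_{i=1}^m \lambda_i b_i = w, \quad \lambda \ge 0,$$

there exists $\lambda^* \in \mathcal{M}$ such that $\lambda_i = 0 \Rightarrow \lambda_i^* = 0$.

Proof. Assume the contrary: for any $k \in \mathbb{N}$, there exists $w^k \in \mathbb{R}^n$ such that $\|w^k\| \le \frac{1}{k}$, and there exists $\lambda^k$ satisfying

$$a + \sum_{i=1}^m \lambda_i^k b_i = w^k, \quad \lambda^k \ge 0,$$

and an index set $I_k \subseteq \{1, 2, \ldots, m\}$, such that $\lambda^k_{I_k} = 0$ but $\lambda^*_{I_k} \ne 0$, $\forall \lambda^* \in \mathcal{M}$. From Lemma 2.1, $D(\lambda^k, \mathcal{M}) \le c_{\mathcal{M}} \|w^k\| \le c_{\mathcal{M}} \frac{1}{k} \to 0$ as $k \to \infty$. Since $\mathcal{M}$ is a compact set, and the set of subsets of $\{1, 2, \ldots, m\}$ is finite, there exist a subsequence $k_q$, an $I \subseteq \{1, 2, \ldots, m\}$ and a $\bar\lambda \in \mathcal{M}$ such that $I_{k_q} = I$, $\forall q \in \mathbb{N}$, and $\lambda^{k_q} \to \bar\lambda$. From our assumptions $\lambda^*_I \ne 0$, $\forall \lambda^* \in \mathcal{M}$. On the other hand, since $\lambda^{k_q}_I = \lambda^{k_q}_{I_{k_q}} = 0$ and $\lambda^{k_q} \to \bar\lambda$, we must have $\bar\lambda_I = 0$, which is a contradiction. The proof is complete.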

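Lemma 2.3. There exist $c_M > 0$ and $\epsilon > 0$ such that for any $w \in \mathbb{R}^n$ with $\|w\| \le \epsilon$ and any $\lambda$ satisfying

$$a + \sum_{i=1}^m \lambda_i b_i = w, \quad \lambda \ge 0, \tag{2.5}$$

there exists $\lambda^* \in \mathcal{M}$ with $\|\lambda - \lambda^*\| \le c_M \|w\|$ and such that $\lambda_i = 0 \Rightarrow \lambda_i^* = 0$, $\forall i \in \{1, 2, \ldots, m\}$.

Proof. Let $\epsilon$ be the quantity defined by Lemma 2.2. Let $I \subseteq \{1, 2, \ldots, m\}$ be such that there exists a $\lambda$ satisfying (2.5) and $\lambda_I = 0$. Lemma 2.2 implies that there exists $\lambda^* \in \mathcal{M}$ such that $\lambda^*_I = 0$. Let $\mathcal{M}_I$ be the set

$$\mathcal{M}_I = \left\{ \lambda \in \mathbb{R}^m \ \Big| \ a + \sum_{i=1}^m \lambda_i b_i = 0, \ \lambda \ge 0, \ \lambda_I = 0 \right\}. \tag{2.6}$$

From Lemma 2.2, $\mathcal{M}_I$ is not empty. From Hoffman's Lemma (1.16), there exists $c_{\mathcal{M}_I} > 0$ such that, for all $\lambda \in \mathbb{R}^m$, we have

$$D(\lambda, \mathcal{M}_I) \le c_{\mathcal{M}_I} \max\left\{ \Big\| a + \sum_{i=1}^m \lambda_i b_i \Big\|_\infty, \ \|\lambda_I\|_\infty, \ \|\lambda^-\|_\infty \right\}. \tag{2.7}$$

From Lemma 2.1 choose $\hat\lambda \in \mathcal{M}$ such that

$$\|\lambda - \hat\lambda\| \le c_{\mathcal{M}} \|w\|. \tag{2.8}$$

From the definition of $\mathcal{M}$ (2.4) we have that

$$\Big\| a + \sum_{i=1}^m \hat\lambda_i b_i \Big\|_\infty = 0, \quad \|\hat\lambda^-\|_\infty = 0.$$

Thus, from (2.7) we must have

$$D(\hat\lambda, \mathcal{M}_I) \le c_{\mathcal{M}_I} \|\hat\lambda_I\|_\infty. \tag{2.9}$$

We also have from our choice of $\hat\lambda$ (2.8) that

$$\|\hat\lambda_I - \lambda_I\|_\infty \le \|\hat\lambda - \lambda\| \le c_{\mathcal{M}} \|w\|.$$

Since $\lambda_I = 0$, we thus have $\|\hat\lambda_I\|_\infty = \|\hat\lambda_I - \lambda_I\|_\infty$, which, in conjunction with the preceding inequality and (2.9), implies that

$$D(\hat\lambda, \mathcal{M}_I) \le c_{\mathcal{M}_I} c_{\mathcal{M}} \|w\|.$$

Hence from (2.8) and the preceding inequality we have that

$$D(\lambda, \mathcal{M}_I) \le \|\lambda - \hat\lambda\| + D(\hat\lambda, \mathcal{M}_I) \le c_{\mathcal{M}} \|w\| + c_{\mathcal{M}} c_{\mathcal{M}_I} \|w\| = c_{\mathcal{M}} (1 + c_{\mathcal{M}_I}) \|w\|.$$

The conclusion now follows after taking

$$c_M = \max_{I \subseteq \{1, 2, \ldots, m\}, \ \exists \lambda^* \in \mathcal{M}, \ \lambda^*_I = 0} c_{\mathcal{M}} (1 + c_{\mathcal{M}_I}).$$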

2.2. Stationary Points of Quadratically Constrained Quadratic Programs.

In this section we analyze the stationary points of TRQCQP($\gamma$) (2.1) for sufficiently small values of the parameter $\gamma$. We choose $\gamma_1''$ such that

$$\|d\| \le \gamma_1'' \ \Rightarrow \ (b_i + B_i d)^T p \le -\frac{\zeta'}{2}, \quad \forall i \in \{1, 2, \ldots, m\}, \tag{2.10}$$

where $\zeta'$, $p$ are the quantities appearing in MFCQ (2.2) with $\|p\| = 1$. We choose

$$\gamma_1 = \min\{\gamma_1', \gamma_1''\} > 0, \tag{2.11}$$

which guarantees that whenever $\|d\| \le \gamma_1$, both (2.10) and the quadratic growth condition (2.3) hold.

Lemma 2.4. There exists $\gamma_2 > 0$ such that TRQCQP($\gamma$) (2.1) satisfies MFCQ (1.7) at all its stationary points $d$, for any $\gamma$ such that $0 < \gamma \le \gamma_2$.

The important consequence of this lemma is that Lagrange multipliers exist at any stationary point of TRQCQP($\gamma$) (2.1).

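Proof. Take the quadratically constrained quadratic program

$$\min_{d \in \mathbb{R}^n} d^T d \quad \text{subject to} \quad b_i^T d + \tfrac{1}{2} d^T B_i d \le 0, \quad i = 1, 2, \ldots, m, \tag{2.12}$$

with global solution $d = 0$. At $d = 0$ (2.12) satisfies MFCQ (2.2) as well as the quadratic growth condition (1.1). From [2], $d = 0$ is an isolated stationary point of (2.12). Therefore there exists a $\gamma_2' > 0$ such that the only stationary point $d$ of (2.12) that satisfies $d^T d \le (\gamma_2')^2$ is $d = 0$.

Take now $\gamma_2 = \min\{\gamma_1, \gamma_2'\}$. Assume that there exists $\gamma$, $0 < \gamma \le \gamma_2$, such that MFCQ (1.7) is not satisfied at some stationary point $d$ of TRQCQP($\gamma$) (2.1). From (1.9) and (1.3) it follows that there exist $\lambda \ge 0$ and $\lambda_0 \ge 0$, not both equal to 0, such that

$$\begin{aligned} \lambda_0 d + \sum_{i=1}^m \lambda_i (b_i + B_i d) &= 0, \\ b_i^T d + \tfrac{1}{2} d^T B_i d &\le 0, \quad i = 1, 2, \ldots, m, \\ \lambda_i (b_i^T d + \tfrac{1}{2} d^T B_i d) &= 0, \quad i = 1, 2, \ldots, m, \\ \lambda_0 (d^T d - \gamma^2) &= 0. \end{aligned} \tag{2.13}$$

If $\lambda_0 = 0$, this would imply

$$\sum_{i=1}^m \lambda_i (b_i + B_i d) = 0$$

or, after multiplying with $p$ from (2.10), we get

$$-\left( \sum_{i=1}^m \lambda_i \right) \frac{\zeta'}{2} \ge \sum_{i=1}^m \lambda_i (b_i + B_i d)^T p = 0,$$

which implies $\lambda = 0$, a contradiction with the assumption that not both $\lambda_0$ and $\lambda$ are 0. Therefore $\lambda_0 > 0$ and from (2.13) we get $d^T d = \gamma^2 > 0$ and, after dividing by $\lambda_0$,

$$\begin{aligned} d + \sum_{i=1}^m \frac{\lambda_i}{\lambda_0} (b_i + B_i d) &= 0, \\ b_i^T d + \tfrac{1}{2} d^T B_i d &\le 0, \quad i = 1, 2, \ldots, m, \\ \frac{\lambda_i}{\lambda_0} (b_i^T d + \tfrac{1}{2} d^T B_i d) &= 0, \quad i = 1, 2, \ldots, m. \end{aligned} \tag{2.14}$$

But this means that $d \ne 0$ is a stationary point of (2.12) with Lagrange multipliers proportional to $\lambda / \lambda_0$, which contradicts the properties of our choice of $\gamma_2$. The proof is complete.

Lemma 2.5. Consider the following quadratically constrained quadratic program:

$$\begin{array}{ll} \min_{d \in \mathbb{R}^n} & \Phi(d) = a^T d + \tfrac{1}{2} d^T A d + \tfrac{1}{2} c_1 d^T d \\ \text{subject to} & \Gamma_i(d) = b_i^T d + \tfrac{1}{2} d^T B_i d \le 0, \quad i = 1, 2, \ldots, m. \end{array} \tag{2.15}$$

Then there exist $\gamma_3 > 0$ and $c_\gamma \ge 0$ such that whenever $c_1 \ge c_\gamma$ the only stationary point of (2.15) that satisfies $\|d\| \le \gamma_3$ is $d = 0$.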

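Proof. Choose $\gamma_3' = \min\{\gamma_1, \gamma_2\}$. From (2.11) this implies that for all $d$ with $\|d\| \le \gamma_3'$ the quadratic growth condition (2.3) and (2.10) hold. Also, from Lemma 2.4, MFCQ (1.7) holds at any stationary point of (2.15).

Take $\tilde d \ne 0$ a feasible point of (2.15). We now estimate the variation of the constraints and objective function in a specific direction from $\tilde d$, in order to decide under what conditions $\tilde d \ne 0$ can be a stationary point of (2.15). Let the active set at $\tilde d$ be

$$\mathcal{B}_{\tilde d} = \left\{ i = 1, 2, \ldots, m \mid \Gamma_i(\tilde d) = 0 \right\}. \tag{2.16}$$

We estimate the first-order behavior of $\Gamma_i(d)$ in the direction $-\tilde d + \tau p$, where $p$ is the vector from (2.10) and $\tau \ge 0$. For $i \in \mathcal{B}_{\tilde d}$ we get

$$\begin{aligned} \nabla_d \Gamma_i(\tilde d)^T (-\tilde d + \tau p) &= (b_i + B_i \tilde d)^T (-\tilde d + \tau p) \\ &= -b_i^T \tilde d - \tilde d^T B_i \tilde d + \tau (b_i + B_i \tilde d)^T p \\ &= -b_i^T \tilde d - \tfrac{1}{2} \tilde d^T B_i \tilde d - \tfrac{1}{2} \tilde d^T B_i \tilde d + \tau (b_i + B_i \tilde d)^T p \\ &\le -\tau \frac{\zeta'}{2} - \tfrac{1}{2} \tilde d^T B_i \tilde d, \end{aligned} \tag{2.17}$$

where we used (2.10) and that, from (2.16), if $i \in \mathcal{B}_{\tilde d}$ then $-b_i^T \tilde d - \tfrac{1}{2} \tilde d^T B_i \tilde d = 0$.

For the objective function we have that

$$\begin{aligned} \nabla_d \Phi(\tilde d)^T (-\tilde d + \tau p) &= (a + A \tilde d + c_1 \tilde d)^T (-\tilde d + \tau p) \\ &= -a^T \tilde d - \tilde d^T A \tilde d - c_1 \tilde d^T \tilde d + \tau a^T p + \tau \tilde d^T A p + \tau c_1 \tilde d^T p \\ &\le -\sigma_1 \tilde d^T \tilde d - c_1 \tilde d^T \tilde d - \tfrac{1}{2} \tilde d^T A \tilde d + \tau a^T p + \tau \tilde d^T A p + \tau c_1 \tilde d^T p, \end{aligned} \tag{2.18}$$

where we used the quadratic growth condition (2.3). Choose now

$$c_\tau = \frac{2}{\zeta'} \left( \max_{i = 1, 2, \ldots, m} \|B_i\| + 1 \right), \tag{2.19}$$

$$\tau = c_\tau \|\tilde d\|^2, \tag{2.20}$$

$$\gamma_3'' = \min\left\{ \frac{1}{2 c_\tau}, \ \frac{1}{\|A\| c_\tau}, \ \gamma_3' \right\}, \tag{2.21}$$

$$c_\gamma = \|A\| + 2 c_\tau \|a\| + 2. \tag{2.22}$$

Assume that $\tilde d \ne 0$, $\|\tilde d\| \le \gamma_3''$, $c_1 \ge c_\gamma$. Using that $\|p\| = 1$ we obtain

$$\begin{aligned} -\tfrac{1}{2} \tilde d^T A \tilde d + \tau a^T p + \tau \tilde d^T A p + \tau c_1 \tilde d^T p &\le \|\tilde d\|^2 \left( \tfrac{1}{2} \|A\| + c_\tau \|a\| + c_\tau \|\tilde d\| \|A\| + c_1 c_\tau \|\tilde d\| \right) \\ &\le \|\tilde d\|^2 \left( \tfrac{1}{2} \|A\| + c_\tau \|a\| + 1 + \tfrac{1}{2} c_1 \right) \\ &\le \tfrac{1}{2} c_1 \|\tilde d\|^2 + \tfrac{1}{2} c_1 \|\tilde d\|^2 = c_1 \|\tilde d\|^2, \end{aligned} \tag{2.23}$$

where we used that from our choice of $\gamma_3''$ (2.21) and since $\|\tilde d\| \le \gamma_3''$ we have $c_\tau \|\tilde d\| \le \tfrac{1}{2}$ and $c_\tau \|\tilde d\| \|A\| \le 1$. We also used the definition of $c_\gamma$ (2.22) and that $c_1 \ge c_\gamma$. Using (2.23) in (2.18) we get

$$\nabla_d \Phi(\tilde d)^T (-\tilde d + \tau p) \le -\sigma_1 \|\tilde d\|^2 - c_1 \|\tilde d\|^2 + c_1 \|\tilde d\|^2 < 0. \tag{2.24}$$

Using (2.19) and (2.20) in (2.17) we get for all $i \in \mathcal{B}_{\tilde d}$

$$\nabla_d \Gamma_i(\tilde d)^T (-\tilde d + \tau p) \le -\frac{2}{\zeta'} \left( \max_{i = 1, 2, \ldots, m} \|B_i\| + 1 \right) \|\tilde d\|^2 \frac{\zeta'}{2} + \tfrac{1}{2} \|\tilde d\|^2 \|B_i\| \le -\tfrac{1}{2} \|B_i\| \|\tilde d\|^2 - \|\tilde d\|^2 < 0. \tag{2.25}$$

From (2.25) and (2.24) we get that if $\tilde d \ne 0$ is feasible for (2.15), if $c_1 \ge c_\gamma$ (2.22) and $\|\tilde d\| \le \gamma_3''$ (2.21), then there exists a direction

$$\tilde\eta = -\tilde d + \tau p$$

that produces strict decreases in the objective function and the active constraints. Therefore $\tilde d$ cannot be a stationary point of (2.15). Otherwise (1.3) implies that there exist multipliers $\lambda_0 \ge 0$, $\lambda \ge 0$, $\lambda \in \mathbb{R}^m$, not all of them 0, such that

$$\lambda_0 \nabla_d \Phi(\tilde d) + \sum_{i \in \mathcal{B}_{\tilde d}} \lambda_i \nabla_d \Gamma_i(\tilde d) = 0.$$

From (2.25) and (2.24) we get, after multiplying with $\tilde\eta$, that

$$0 > \lambda_0 \nabla_d \Phi(\tilde d)^T \tilde\eta + \sum_{i \in \mathcal{B}_{\tilde d}} \lambda_i \nabla_d \Gamma_i(\tilde d)^T \tilde\eta = 0,$$

which is a contradiction that proves the lemma with $c_\gamma$ defined in (2.22) and $\gamma_3 = \gamma_3''$ (2.21).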

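Lemma 2.6. There exist $\Lambda_1 > 0$ and $\gamma_4 > 0$ such that, if $\tilde d$ with $\|\tilde d\| \le \gamma_4$ is a stationary point of TRQCQP($\gamma$) (2.1) with Lagrange multipliers $\lambda \in \mathbb{R}^m$ and $c_1 \in \mathbb{R}$, where $0 < \gamma \le \gamma_4$, then $\|\lambda\|_\infty \le \Lambda_1$.

Proof. We take

$$\gamma_4 = \min\{\gamma_1, \gamma_2, \gamma_3\}, \tag{2.26}$$

where $\gamma_1$ is defined in (2.11), $\gamma_2$ is the quantity from Lemma 2.4 and $\gamma_3$ is the quantity from Lemma 2.5. Lemma 2.4 ensures that the Lagrange multipliers exist at any stationary point of TRQCQP($\gamma$) (2.1).

Assume the contrary of the conclusion of the lemma: for any $k \in \mathbb{N}$, there exists $\tilde d^k$, a stationary point of TRQCQP($\gamma^k$) (2.1) with $0 < \gamma^k \le \gamma_4$ and with Lagrange multipliers $\lambda^k \ge 0$, $c_1^k \ge 0$ satisfying $\|\lambda^k\|_\infty \ge k$ and (1.6). In particular,

$$a + A \tilde d^k + \sum_{i=1}^m \lambda_i^k (b_i + B_i \tilde d^k) + c_1^k \tilde d^k = 0. \tag{2.27}$$

By Lemma 2.5, since $\|\tilde d^k\| \le \gamma_3$, we must have $c_1^k \le c_\gamma$. Since $\left\| \lambda^k / \|\lambda^k\|_\infty \right\|_\infty = 1$, we can choose $\lambda^*$ such that for a subsequence $k_q$, $q \to \infty$, we have $\lim_{q \to \infty} \lambda^{k_q} / \|\lambda^{k_q}\|_\infty = \lambda^*$ with $\|\lambda^*\|_\infty = 1$ and $\lim_{q \to \infty} \tilde d^{k_q} = \tilde d^*$, where $\|\tilde d^*\| \le \gamma_4$.

We can now divide through (2.27) by $\|\lambda^{k_q}\|_\infty$ and take the limit as $q \to \infty$ and $\|\lambda^{k_q}\|_\infty \to \infty$. We obtain

$$\sum_{i=1}^m \lambda_i^* (b_i + B_i \tilde d^*) = 0.$$

Since $\|\tilde d^*\| \le \gamma_4 \le \gamma_1$, we can multiply with $p$ and use (2.10) and the fact that $\|\lambda^*\|_\infty = 1$ to get

$$-\frac{\zeta'}{2} \ge \sum_{i=1}^m \lambda_i^* p^T (b_i + B_i \tilde d^*) = 0,$$

which is a contradiction. This proves the lemma.

Theorem 2.7. There exists $\gamma_5 > 0$ such that, for any $\gamma$ such that $0 < \gamma \le \gamma_5$, TRQCQP($\gamma$) (2.1) has the unique stationary point $d = 0$.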

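Proof. Choose

$$c_\epsilon = m \Lambda_1 \max_{i = 1, 2, \ldots, m} \|B_i\| + \|A\| + c_\gamma, \tag{2.28}$$

$$\gamma_5' = \min\left\{ \gamma_1, \gamma_2, \gamma_3, \gamma_4, \frac{\epsilon}{c_\epsilon} \right\}, \tag{2.29}$$

where $\epsilon$ is the quantity from Lemma 2.3, $c_\gamma$ is the quantity from Lemma 2.5, $\Lambda_1$ is the quantity from Lemma 2.6 and $\gamma_j$, $j = 1, 2, 3, 4$, are the bounds on the trust regions that ensure that all preceding results hold.

Let $\tilde d \ne 0$ be a stationary point of TRQCQP($\gamma$) (2.1) with $0 < \gamma \le \gamma_5'$. By Lemma 2.4, TRQCQP($\gamma$) (2.1) satisfies MFCQ (1.7) at $\tilde d$. Therefore there exist Lagrange multipliers $\lambda \ge 0$, $c_1 \ge 0$ which, together with $\tilde d$, satisfy (1.6), or

$$\begin{aligned} a + A \tilde d + \sum_{i=1}^m \lambda_i (b_i + B_i \tilde d) + c_1 \tilde d &= 0, \\ b_i^T \tilde d + \tfrac{1}{2} \tilde d^T B_i \tilde d &\le 0, \quad i = 1, 2, \ldots, m, \\ \tilde d^T \tilde d &\le \gamma^2, \\ \lambda_i (b_i^T \tilde d + \tfrac{1}{2} \tilde d^T B_i \tilde d) &= 0, \quad i = 1, 2, \ldots, m, \\ c_1 (\tilde d^T \tilde d - \gamma^2) &= 0. \end{aligned} \tag{2.30}$$

Since $\|\tilde d\| \le \gamma_5' \le \gamma_3$, Lemma 2.5 applies to give that $c_1 \le c_\gamma$. Since $\|\tilde d\| \le \gamma_5' \le \gamma_4$, we have that $\|\lambda\|_\infty \le \Lambda_1$ from Lemma 2.6. We define

$$-w = A \tilde d + \sum_{i=1}^m \lambda_i B_i \tilde d + c_1 \tilde d. \tag{2.31}$$

After applying the triangle inequality and using (2.28) we have that

$$\|w\| \le \|A \tilde d\| + \Big\| \sum_{i=1}^m \lambda_i B_i \tilde d \Big\| + c_1 \|\tilde d\| \le \|\tilde d\| \left( \|A\| + m \Lambda_1 \max_{i = 1, 2, \ldots, m} \|B_i\| + c_\gamma \right) = c_\epsilon \|\tilde d\|. \tag{2.32}$$

From the last inequality, $\gamma_5' \le \epsilon / c_\epsilon$ and $\|\tilde d\| \le \gamma \le \gamma_5'$, we obtain $\|w\| \le \epsilon$. From (2.30) and (2.31) we have that

$$a + \sum_{i=1}^m \lambda_i b_i = w.$$

This implies, from Lemma 2.3, that there exists $\lambda^* \in \mathcal{M}$ (a Lagrange multiplier for TRQCQP($\gamma$) (2.1) at $d = 0$) such that

$$\|\lambda - \lambda^*\| \le c_M \|w\| \quad \text{and} \quad \lambda_i = 0 \Rightarrow \lambda_i^* = 0, \ \forall i \in \{1, 2, \ldots, m\}. \tag{2.33}$$

Since $\lambda^* \in \mathcal{M}$ it satisfies

$$a + \sum_{i=1}^m \lambda_i^* b_i = 0.$$

Adding the last equality to the first equation in (2.30), dividing by 2 and multiplying with $\tilde d^T$ we obtain

$$a^T \tilde d + \tfrac{1}{2} \tilde d^T A \tilde d + \tfrac{1}{2} \sum_{i=1}^m \left[ \lambda_i^* b_i^T \tilde d + \lambda_i (b_i^T \tilde d + \tilde d^T B_i \tilde d) \right] + \tfrac{1}{2} c_1 \|\tilde d\|^2 = 0.$$

We now use the identity $u_1 v_1 + u_2 v_2 = \tfrac{1}{2} (u_1 + u_2)(v_1 + v_2) + \tfrac{1}{2} (u_1 - u_2)(v_1 - v_2)$ as well as the fact that $(\lambda_i^* + \lambda_i)(b_i^T \tilde d + \tfrac{1}{2} \tilde d^T B_i \tilde d) = 0$, for $i = 1, 2, \ldots, m$, which follows from (2.33) and (2.30), to obtain

$$\begin{aligned} 0 &= a^T \tilde d + \tfrac{1}{2} \tilde d^T A \tilde d + \tfrac{1}{2} \sum_{i=1}^m \left[ \tfrac{1}{2} (\lambda_i^* + \lambda_i)(2 b_i^T \tilde d + \tilde d^T B_i \tilde d) - \tfrac{1}{2} (\lambda_i^* - \lambda_i)(\tilde d^T B_i \tilde d) \right] + \tfrac{1}{2} c_1 \|\tilde d\|^2 \\ &= a^T \tilde d + \tfrac{1}{2} \tilde d^T A \tilde d - \tfrac{1}{4} \sum_{i=1}^m (\lambda_i^* - \lambda_i)(\tilde d^T B_i \tilde d) + \tfrac{1}{2} c_1 \|\tilde d\|^2, \end{aligned}$$

which results in

$$a^T \tilde d + \tfrac{1}{2} \tilde d^T A \tilde d + \tfrac{1}{2} c_1 \|\tilde d\|^2 = \tfrac{1}{4} \sum_{i=1}^m (\lambda_i^* - \lambda_i)(\tilde d^T B_i \tilde d). \tag{2.34}$$

Since $\tilde d$ is feasible for TRQCQP($\gamma$) (2.1) and since $\|\tilde d\| \le \gamma_5' \le \gamma_1$, the quadratic growth condition (2.3) holds to give that $a^T \tilde d + \tfrac{1}{2} \tilde d^T A \tilde d \ge \sigma_1 \|\tilde d\|^2$. Define

$$c_B = \max_{i = 1, 2, \ldots, m} \|B_i\|.$$

From (2.33), (2.31) and (2.32) we have $\|\lambda^* - \lambda\| \le c_M c_\epsilon \|\tilde d\|$. Using all these bounds in (2.34), together with the arithmetic-quadratic mean inequality, we get

$$\sigma_1 \|\tilde d\|^2 \le a^T \tilde d + \tfrac{1}{2} \tilde d^T A \tilde d + \tfrac{1}{2} c_1 \|\tilde d\|^2 = \tfrac{1}{4} \sum_{i=1}^m (\lambda_i^* - \lambda_i)(\tilde d^T B_i \tilde d) \le \tfrac{1}{4} \sqrt{m}\, c_B \|\tilde d\|^2 \|\lambda^* - \lambda\| \le \tfrac{1}{4} \sqrt{m}\, c_B \|\tilde d\|^2 c_M c_\epsilon \|\tilde d\|.$$

Since $\tilde d \ne 0$, from our assumption, we obtain, after dividing through the previous inequality by $\|\tilde d\|^2$, that

$$\|\tilde d\| \ge \frac{4 \sigma_1}{\sqrt{m}\, c_B c_M c_\epsilon}. \tag{2.35}$$

Choose now

$$\gamma_5 = \min\left\{ \gamma_5', \ \frac{2 \sigma_1}{\sqrt{m}\, c_B c_M c_\epsilon} \right\}.$$

From (2.35) it follows that the unique stationary point of TRQCQP($\gamma$) (2.1) with $0 < \gamma \le \gamma_5$ is $\tilde d = 0$. The proof is complete.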


3. Sequential Quadratically Constrained Quadratic Programming.

In this section, we introduce the sequential quadratically constrained quadratic programming algorithm. We prove that under the conditions set forth in the introduction, the algorithm induces superlinear convergence. Since our main interest is the rate of convergence of the algorithm, we do not address global convergence issues.

We consider the following form of the algorithm:

1. Choose a starting point $x^k$, $k = 0$.

2. Let $x = x^k$ and determine $d^k$, a stationary point of

$$\begin{array}{ll} \min_{d \in \mathbb{R}^n} & \nabla_x f(x)^T d + \tfrac{1}{2} d^T \nabla^2_{xx} f(x)\, d \\ \text{subject to} & g_i(x) + \nabla_x g_i(x)^T d + \tfrac{1}{2} d^T \nabla^2_{xx} g_i(x)\, d = \Gamma_i(x, d) \le 0, \quad i = 1, 2, \ldots, m, \\ & d^T d \le \gamma^2. \end{array} \tag{3.1}$$

3. Take $x^{k+1} = x^k + d^k$ and $k = k + 1$ and restart.

At every step, the algorithm solves a problem with quadratic constraints and a quadratic objective, none of which are assumed to be convex. We name the above algorithm sequential quadratically constrained quadratic programming or SQCQP.
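The iteration can be tried on a one-dimensional degenerate problem, where the QCQP subproblem is solvable by enumerating a handful of candidate points. The Python sketch below is an illustration under stated assumptions, not the paper's implementation: the toy problem (minimize $e^x$ subject to $e^{-x} - 1 \le 0$ and $e^{-2x} - 1 \le 0$, with solution $x^* = 0$ and both constraints active there), the trust-region radius `gamma = 1.0`, and all function names are hypothetical. The subproblem solver returns a global minimizer, which is one particular stationary point; the analysis in this paper allows any stationary point.

```python
import math

def quad_roots(c, b, B):
    """Real roots of c + b*d + 0.5*B*d**2 = 0 (empty list if none)."""
    if B == 0.0:
        return [] if b == 0.0 else [-c / b]
    disc = b * b - 2.0 * B * c
    if disc < 0.0:
        return []
    s = math.sqrt(disc)
    return [(-b - s) / B, (-b + s) / B]

def solve_qcqp_1d(a, A, cons, gamma, tol=1e-10):
    """Global minimizer of a*d + 0.5*A*d**2 subject to
    c + b*d + 0.5*B*d**2 <= 0 for (c, b, B) in cons, and d**2 <= gamma**2.

    In one dimension the minimizer is either an unconstrained stationary point
    of the objective or an endpoint of the feasible set, so it suffices to
    enumerate those candidates and keep the best feasible one.
    """
    candidates = [-gamma, gamma]
    if A != 0.0:
        candidates.append(-a / A)
    for c, b, B in cons:
        candidates.extend(quad_roots(c, b, B))
    feasible = [d for d in candidates
                if abs(d) <= gamma + tol
                and all(c + b * d + 0.5 * B * d * d <= tol for c, b, B in cons)]
    if not feasible:
        raise ValueError("QCQP subproblem is infeasible")
    return min(feasible, key=lambda d: a * d + 0.5 * A * d * d)

def sqcqp_1d(x0, gamma=1.0, iters=5):
    """SQCQP iterates for the hypothetical toy problem described above."""
    xs = [x0]
    for _ in range(iters):
        x = xs[-1]
        a = A = math.exp(x)                       # f'(x) and f''(x)
        e1, e2 = math.exp(-x), math.exp(-2.0 * x)
        cons = [(e1 - 1.0, -e1, e1),              # g1, g1', g1''
                (e2 - 1.0, -2.0 * e2, 4.0 * e2)]  # g2, g2', g2''
        xs.append(x + solve_qcqp_1d(a, A, cons, gamma))
    return xs

iterates = sqcqp_1d(0.5)
errors = [abs(x) for x in iterates]
```

In this toy run the error drops much faster than linearly from one iterate to the next, even though the two constraint gradients at the solution are linearly dependent and the multipliers are not unique. The iterates need not stay feasible.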

As outlined in Subsection 1.1, we assume without loss of generality that $g_i(x^*) = 0$, $\forall i = 1, 2, \ldots, m$, after possibly considering a sufficiently small trust region, and that the quadratic growth condition (1.1) and MFCQ (1.7) hold at a local solution $x^*$ of the nonlinear program (1.2). From [14, 6] these conditions are equivalent to MFCQ (1.7) and (1.12), which are expressed only in terms of the derivatives of the data up to the second order. We show that (3.1) is feasible for fixed $\gamma$ and $x$ in some neighborhood of $x^*$. Since it is also bounded, a stationary point must exist.

Due to the fact that it captures the entire information up to second order for (1.2) at $x^*$, the quadratically constrained quadratic program

$$\begin{array}{ll} \min_{d \in \mathbb{R}^n} & \nabla_x f(x^*)^T d + \tfrac{1}{2} d^T \nabla^2_{xx} f(x^*)\, d \\ \text{subject to} & \nabla_x g_i(x^*)^T d + \tfrac{1}{2} d^T \nabla^2_{xx} g_i(x^*)\, d \le 0, \quad i = 1, 2, \ldots, m, \\ & d^T d \le \gamma^2 \end{array} \tag{3.2}$$

satisfies MFCQ (1.7) and (1.12) at $d = 0$. As a result of [14, 6] it follows that (3.2) satisfies MFCQ (2.2) and the quadratic growth condition (2.3). Therefore, all the results from Section 2 apply for (3.2). We follow a line of proof similar to the one in Section 2.

- Theorem 3.1 proves that MFCQ (1.7) is satisfied by (3.1) in a neighborhood of $x^*$ and that the trust-region constraint is inactive at any stationary point $d$ of (3.1). Corollary 3.2 further ensures that in a neighborhood of $x^*$, the Lagrange multipliers of (3.1) are uniformly bounded.

- Lemma 3.3 ultimately implies that for any Lagrange multiplier $\lambda$ at a stationary point $d$ of (3.1) at $x$ there exists a sufficiently close Lagrange multiplier $\lambda^*$ at $x = x^*$ whose active subset is included in the active subset of $\lambda$. This in turn leads to the conclusions of Lemma 3.4 that $(\lambda_i + \lambda_i^*) g_i(x + d) = o(\|d\|^2)$ and that $P(x + d) = o(\|d\|^2)$, where $d$ is a stationary point of (3.1). This helps bound above the variations in the objective function of (3.1) in the proof of Theorem 3.5.

- Theorems 3.5 and 3.6 prove the superlinear convergence of a sequence $x^{k+1} = x^k + d^k$ initiated sufficiently close to $x^*$, where $d^k$ is any stationary point of (3.1).

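Theorem 3.1. There exist $\gamma_6 > 0$ and a neighborhood $\mathcal{N}_6(x^*)$ such that, for any $\gamma$ with $0 < \gamma \le \gamma_6$, there exists a neighborhood $\mathcal{N}^\gamma(x^*)$ of $x^*$ such that

(i) The QCQP (3.1) is feasible for any $x \in \mathcal{N}^\gamma(x^*)$.

(ii) For any $x \in \mathcal{N}_6(x^*)$ and any $d$ with $\|d\| \le \gamma_6$ we have

$$(\nabla_x g_i(x) + \nabla^2_{xx} g_i(x)\, d)^T p \le -\frac{\zeta'}{2},$$

where $\zeta'$ and $p$ are the quantities entering the definition of MFCQ (1.7).

(iii) For any sequence $x^k \in \mathcal{N}^\gamma(x^*)$, $k = 1, 2, \ldots$, with $x^k \to x^*$ when $k \to \infty$, and with $\tilde d^k$ a stationary point of (3.1) at $x = x^k$, we must have $\tilde d^k \to 0$ as $k \to \infty$.

(iv) The constraint $d^T d \le \gamma^2$ is inactive for any $x \in \mathcal{N}^\gamma(x^*)$ and $d$ stationary point of (3.1).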

Proof. Since (3.2) satisfies MFCQ (2.2) and the quadratic growth condition (2.3) at $d = 0$, from Theorem 2.7 there exists $\gamma_6'$ such that, for any $0 < \gamma \le \gamma_6'$, $\tilde d = 0$ is the only stationary point of (3.2). Choose now $\gamma$ such that $0 < \gamma \le \gamma_6'$. Since (3.2) satisfies MFCQ (2.2), then, from [21], for any sufficiently small perturbation of (3.2) we still obtain a feasible nonlinear program. We regard (3.1) as a perturbation of (3.2) and we therefore have, from the fact that $f$, $g$ are twice continuously differentiable, that there exists a neighborhood $\mathcal{N}_2^\gamma(x^*)$ of $x^*$ such that (3.1) is feasible for any $x \in \mathcal{N}_2^\gamma(x^*)$, which proves part (i) as long as $\mathcal{N}^\gamma(x^*) \subseteq \mathcal{N}_2^\gamma(x^*)$, which will be established later.

We also have that, for all $i = 1, 2, \ldots, m$,

$$\begin{aligned} (\nabla_x g_i(x) + \nabla^2_{xx} g_i(x)\, d)^T p &= \nabla_x g_i(x^*)^T p + \left( \nabla_x g_i(x) - \nabla_x g_i(x^*) + \nabla^2_{xx} g_i(x)\, d \right)^T p \\ &\le \nabla_x g_i(x^*)^T p + c_{2g} \|x - x^*\| + c_{2g} \|d\| \\ &\le -\zeta' + c_{2g} \|x - x^*\| + c_{2g} \|d\|, \end{aligned}$$

since from MFCQ (1.7) $\|p\| = 1$, where $c_{2g}$ is a bound on the second derivatives of $g_i(x)$, $i = 1, 2, \ldots, m$. If we choose $\gamma_6'' = \frac{\zeta'}{4 c_{2g}}$, $d$ with $\|d\| \le \gamma_6''$ and $\mathcal{N}_6(x^*) = B\!\left(x^*, \frac{\zeta'}{4 c_{2g}}\right)$, we get from the previous bound that, since now $c_{2g} \|x - x^*\| \le \frac{\zeta'}{4}$ and $c_{2g} \|d\| \le \frac{\zeta'}{4}$,

$$(\nabla_x g_i(x) + \nabla^2_{xx} g_i(x)\, d)^T p \le -\frac{\zeta'}{2},$$

which shows part (ii), after defining $\gamma_6 = \min\{\gamma_6', \gamma_6''\}$. We now choose $\mathcal{N}_3^\gamma(x^*) = \mathcal{N}_6(x^*) \cap \mathcal{N}_2^\gamma(x^*)$. For $0 < \gamma \le \gamma_6$, both the conclusions of (i) and (ii) hold. In particular, for any $\gamma \in (0, \gamma_6]$, $x \in \mathcal{N}_3^\gamma(x^*)$, (3.1) must have a stationary point since it is feasible and bounded.

Assume now that the conclusion (iii) does not hold: there exist $\gamma > 0$, with $\gamma \le \gamma_6$, and a sequence $x^k \to x^*$, $x^k \in \mathcal{N}_3^\gamma(x^*)$, for $k \in \mathbb{N}$, such that the corresponding stationary points $d^k$ of (3.1) are bounded below, $\|d^k\| \ge c_f > 0$, for all $k$ sufficiently large. Since $d^k$ is a stationary point of (3.1) at $x = x^k$, it must satisfy the first-order necessary conditions (1.3) for some multipliers $\lambda_i^k \ge 0$, $i = 0, 1, \ldots, m + 1$, with $\sum_{i=0}^{m+1} \lambda_i^k = 1$:

$$\begin{aligned} & \lambda_0^k \left( \nabla_x f(x^k) + \nabla^2_{xx} f(x^k)\, d^k \right) + \sum_{i=1}^m \lambda_i^k \left( \nabla_x g_i(x^k) + \nabla^2_{xx} g_i(x^k)\, d^k \right) + \lambda_{m+1}^k d^k = 0, \\ & \text{for } i = 1, 2, \ldots, m: \quad \Gamma_i(x^k, d^k) \le 0, \quad \Gamma_i(x^k, d^k) \lambda_i^k = 0, \\ & (d^k)^T d^k \le \gamma^2, \quad \left( (d^k)^T d^k - \gamma^2 \right) \lambda_{m+1}^k = 0. \end{aligned} \tag{3.3}$$

Since the multipliers $\lambda^k = (\lambda_0^k, \lambda_1^k, \ldots, \lambda_{m+1}^k)$ satisfy $\|\lambda^k\|_1 = 1$, and the direction $d^k$ satisfies $c_f \le \|d^k\| \le \gamma$, we can extract a subsequence $k_q$ such that $x^{k_q} \to x^*$, $\lambda^{k_q} \to \lambda^*$, $d^{k_q} \to d^* \ne 0$ as $q \to \infty$. Taking the limit as $q \to \infty$ in (3.3) we obtain, from the continuity of all data involved in terms of $(x, d)$, that $d^*$ is a stationary point of (3.2). Since $d^* \ne 0$, this contradicts the outcome of Theorem 2.7, which is valid due to our choice of $\gamma_6$. This proves (iii).

Assume now that (iv) does not hold. It then follows that there exists a sequence $x^k \to x^*$ with $d^k$ a stationary point and such that $\|d^k\| = \gamma$. But this contradicts the conclusion of (iii), and thus there exists a neighborhood $\mathcal{N}^\gamma(x^*) \subseteq \mathcal{N}_3^\gamma(x^*)$ such that for $x \in \mathcal{N}^\gamma(x^*)$ any stationary point of (3.1) satisfies $d^T d < \gamma^2$ and for which the conclusions of parts (i), (ii) and (iii) hold. The proof is complete.

Corollary 3.2. Any stationary point of (3.1) satisfies the Kuhn-Tucker conditions (1.6), for any $0 < \gamma \le \gamma_6$, $x \in \mathcal{N}^\gamma(x^*)$ and for some $\lambda \in \mathbb{R}^m$, $\lambda \ge 0$. There exists $\Lambda_1$ such that, for any $x \in \mathcal{N}^\gamma(x^*)$, any stationary point $d$ of (3.1) and any Lagrange multipliers $\lambda$ satisfying the Kuhn-Tucker conditions we have $\|\lambda\|_\infty \le \Lambda_1$.

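Proof. Since $0 < \gamma \le \gamma_6$, then, by Theorem 3.1(iv), for any $x \in N^{\gamma}(x^*)$ and any stationary point $d$ we must have $\|d\| < \gamma$. Therefore only the quadratic constraints of (3.1), $g_i(x) + \nabla_x g_i(x)^T d + \tfrac12 d^T \nabla^2_{xx} g_i(x) d \le 0$, $i = 1,2,\ldots,m$, can be active at a stationary point $d$. Then by Theorem 3.1(ii), MFCQ (1.7) is satisfied at $d$ and thus there exist multipliers $\mu \ge 0$ satisfying the Kuhn-Tucker conditions and in particular

$$\nabla_x f(x) + \nabla^2_{xx} f(x) d + \sum_{i=1}^m \mu_i\left(\nabla_x g_i(x) + \nabla^2_{xx} g_i(x) d\right) = 0.$$

Multiplying through with $p$ we get

$$
\begin{aligned}
0 &= \left(\nabla_x f(x) + \nabla^2_{xx} f(x) d\right)^T p + \sum_{i=1}^m \mu_i\left(\nabla_x g_i(x) + \nabla^2_{xx} g_i(x) d\right)^T p \\
&\le \left(\nabla_x f(x) + \nabla^2_{xx} f(x) d\right)^T p - \|\mu\|_1 \frac{\zeta_0}{2}.
\end{aligned}
$$

Since $\|p\| = 1$, $\|d\| < \gamma$ and after using the usual norm inequalities we get

$$\|\mu\|_1 \le \frac{2}{\zeta_0}\left(\|\nabla_x f(x)\| + \gamma\|\nabla^2_{xx} f(x)\|\right).$$

Since on $N^{\gamma}(x^*)$ the expression on the right-hand side is bounded above, there exists $\kappa_1$ for which the conclusion of this corollary holds.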

Lemma 3.3. There exist $\gamma_7 > 0$ and a constant $c_\mu > 0$ such that for any $\gamma$ with $0 < \gamma \le \gamma_7$ there exists a neighborhood $N_1^{\gamma}(x^*)$ such that, whenever $x \in N_1^{\gamma}(x^*)$, and for any stationary point $d$ of (3.1) with Lagrange multipliers $\mu$, there exist Lagrange multipliers at $x = x^*$, $\mu^* \in M(x^*)$, such that $\|\mu - \mu^*\| \le c_\mu\left(\|x - x^*\| + \|d\|\right)$ and $\mu_i = 0 \Rightarrow \mu_i^* = 0$, $\forall i = 1,2,\ldots,m$.

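Proof. Take $\gamma$ such that $0 < \gamma \le \gamma_6$ and $x \in N^{\gamma}(x^*)$. Let $d$ be a stationary point of (3.1) with Lagrange multipliers $\mu \ge 0$ (which exist from Corollary 3.2). From the Kuhn-Tucker conditions we obtain

$$\nabla_x f(x) + \nabla^2_{xx} f(x) d + \sum_{i=1}^m \mu_i\left(\nabla_x g_i(x) + \nabla^2_{xx} g_i(x) d\right) = 0, \tag{3.4}$$

and thus

$$
\begin{aligned}
\nabla_x f(x^*) + \sum_{i=1}^m \mu_i \nabla_x g_i(x^*) = {}& \nabla_x f(x^*) - \nabla_x f(x) + \sum_{i=1}^m \mu_i\left(\nabla_x g_i(x^*) - \nabla_x g_i(x)\right) \\
& - \nabla^2_{xx} f(x) d - \sum_{i=1}^m \mu_i \nabla^2_{xx} g_i(x) d.
\end{aligned}
\tag{3.5}
$$

Using that $\|\nabla_x f(x) - \nabla_x f(x^*)\| \le c_{2f}\|x - x^*\|$ and that $\|\nabla_x g_i(x) - \nabla_x g_i(x^*)\| \le c_{2g}\|x - x^*\|$, where $c_{2f}$ and $c_{2g}$ are bounds on the second derivatives of $f$ and $g$, we get from (3.5) and Corollary 3.2 that

$$
\begin{aligned}
\left\|\nabla_x f(x^*) + \sum_{i=1}^m \mu_i \nabla_x g_i(x^*)\right\| &\le c_{2f}\|x - x^*\| + m c_{2g}\kappa_1\|x - x^*\| + c_{2f}\|d\| + m c_{2g}\kappa_1\|d\| \\
&= \left(c_{2f} + m c_{2g}\kappa_1\right)\left(\|x - x^*\| + \|d\|\right).
\end{aligned}
\tag{3.6}
$$

We choose $\delta = \dfrac{\epsilon}{2(c_{2f} + m\kappa_1 c_{2g})}$, where $\epsilon$ is the quantity from Lemma 2.2. From (3.6) it follows that, for any $\gamma \le \min\{\delta, \gamma_6\}$ and $x \in N_1^{\gamma}(x^*) = N^{\gamma}(x^*) \cap B(x^*, \delta)$, since $\|d\| \le \gamma \le \delta$ and $\|x - x^*\| \le \delta$,

$$\left\|\nabla_x f(x^*) + \sum_{i=1}^m \mu_i \nabla_x g_i(x^*)\right\| \le \frac{\epsilon}{2} + \frac{\epsilon}{2} = \epsilon.$$

We can therefore apply Lemma 2.2 and (3.6) to get that there exists $\mu^* \in M(x^*)$ with the properties required, after taking $\gamma_7 = \min\{\delta, \gamma_6\}$ and $c_\mu = c_M(c_{2f} + m c_{2g}\kappa_1)$, where $c_M$ is the constant from Lemma 2.3.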

Lemma 3.4. Let $x \in N_1^{\gamma}(x^*)$, where $0 < \gamma \le \gamma_7$ and $N_1^{\gamma}(x^*)$ is the neighborhood obtained in Lemma 3.3. Let $\mu$ be a Lagrange multiplier associated with a stationary point $d$ of (3.1) at $x$. Let $\mu^* \in M(x^*)$ be such that $\mu_i = 0 \Rightarrow \mu_i^* = 0$ and such that $\|\mu^*\|_1 \le \kappa_1$. Then

(i) $P(x + d) \le \psi_P(d)\|d\|^2$,

(ii) $|(\mu_i + \mu_i^*) g_i(x + d)| \le 2\kappa_1 \psi_P(d)\|d\|^2$, $\forall i = 1,2,\ldots,m$,

where $\psi_P(d)$ is a continuous function that satisfies $\psi_P(0) = 0$.

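Proof. Using the first-order Taylor remainder formula [1] for $g_i(y)$ around $y = x$ for $y = x + w$, and the fact that $g_i(x)$ is twice continuously differentiable for $i = 1,2,\ldots,m$, we obtain

$$
\begin{aligned}
g_i(x + w) &= g_i(x) + \nabla_x g_i(x)^T w + \tfrac12 w^T \nabla^2_{xx} g_i(x) w + \int_0^1 w^T\left[\nabla^2_{xx} g_i(x + tw) - \nabla^2_{xx} g_i(x)\right] w\,(1 - t)\,dt \\
&\le g_i(x) + \nabla_x g_i(x)^T w + \tfrac12 w^T \nabla^2_{xx} g_i(x) w + \|w\|^2 \max_{t \in [0,1]} \left\|\nabla^2_{xx} g_i(x + tw) - \nabla^2_{xx} g_i(x)\right\|.
\end{aligned}
\tag{3.7}
$$

Since $\nabla^2_{xx} g_i(x)$ is a continuous function, it follows that

$$\psi_i(w) = \max_{x \in N_1^{\gamma}(x^*)}\ \max_{t \in [0,1]} \left\|\nabla^2_{xx} g_i(x + tw) - \nabla^2_{xx} g_i(x)\right\|$$

is a continuous function on $\|w\| \le \gamma_7$ with the property that $\psi_i(0) = 0$. Since $d$ is a stationary point of (3.1), it satisfies $g_i(x) + \nabla_x g_i(x)^T d + \tfrac12 d^T \nabla^2_{xx} g_i(x) d \le 0$. Replacing $w$ with $d$ in (3.7) we obtain

$$g_i(x + d) \le \psi_i(d)\|d\|^2, \quad i = 1,2,\ldots,m.$$

Define now

$$\psi_P(d) = \max_{i = 1,2,\ldots,m} \psi_i(d). \tag{3.8}$$

From the definition of $\psi_i(d)$ we have that $\psi_P$ is continuous and that $\psi_P(0) = 0$. From the definition of $P(x)$ (1.13) and the bound above for all $i = 1,2,\ldots,m$, we get that

$$P(x + d) \le \max_{i = 1,2,\ldots,m} \psi_i(d)\|d\|^2 = \psi_P(d)\|d\|^2.$$

This proves point (i). Since $\mu^*$ is such that $\mu_i = 0 \Rightarrow \mu_i^* = 0$, from our hypothesis, and since $d$ is a stationary point of (3.1) and thus satisfies the complementarity condition

$$\mu_i\left(g_i(x) + \nabla_x g_i(x)^T d + \tfrac12 d^T \nabla^2_{xx} g_i(x) d\right) = 0,$$

this implies that, for $i = 1,2,\ldots,m$,

$$(\mu_i + \mu_i^*)\left(g_i(x) + \nabla_x g_i(x)^T d + \tfrac12 d^T \nabla^2_{xx} g_i(x) d\right) = 0$$

or, by using (3.7),

$$(\mu_i + \mu_i^*)\left(g_i(x + d) - \int_0^1 d^T\left[\nabla^2_{xx} g_i(x + td) - \nabla^2_{xx} g_i(x)\right] d\,(1 - t)\,dt\right) = 0$$

and thus

$$
\begin{aligned}
|(\mu_i + \mu_i^*) g_i(x + d)| &= (\mu_i + \mu_i^*)\left|\int_0^1 d^T\left[\nabla^2_{xx} g_i(x + td) - \nabla^2_{xx} g_i(x)\right] d\,(1 - t)\,dt\right| \\
&\le 2\kappa_1 \psi_i(d)\|d\|^2 \le 2\kappa_1 \psi_P(d)\|d\|^2,
\end{aligned}
$$

which completes the proof of (ii) and of the Lemma.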
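As a small numerical illustration of part (i), the following sketch uses a hypothetical one-dimensional constraint $g(x) = e^{-x} - 1 \le 0$ with $x^* = 0$ (this toy problem and all function names are assumptions made for illustration, not taken from the paper). The step $d$ is taken as the smaller root of the quadratic model of $g$ at $x$, which is the point where the model constraint is active. The ratio $P(x+d)/\|d\|^2$ should then shrink as $x$ approaches $x^*$.

```python
import math

def g(x):
    """Toy constraint g(x) = exp(-x) - 1 <= 0 (feasible set x >= 0, x* = 0)."""
    return math.expm1(-x)

def model_root(x):
    """Smaller root d of the quadratic model g + g' d + 0.5 g'' d^2 = 0.

    With g' = -exp(-x) and g'' = exp(-x) the model equation reduces to
    d^2/2 - d + 1 - exp(x) = 0, hence d = 1 - sqrt(2 exp(x) - 1).
    """
    return 1.0 - math.sqrt(2.0 * math.exp(x) - 1.0)

def violation_ratio(x):
    """P(x + d) / d^2, where P is the positive part of g (single constraint)."""
    d = model_root(x)
    return max(g(x + d), 0.0) / (d * d)

points = [0.4, 0.2, 0.1, 0.05]
ratios = [violation_ratio(x) for x in points]
for x, r in zip(points, ratios):
    print(f"x = {x:5.2f}   P(x+d)/d^2 = {r:.5f}")
```

In this sketch the ratio plays the role of $\psi_P(d)$: it is not zero, but it tends to zero together with $d$, which is all the lemma requires.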

From here on we use extensively that, for $h$ twice continuously differentiable, we have

$$\left|h(x) - h(x^*) - \tfrac12\left(\nabla_x h(x) + \nabla_x h(x^*)\right)^T (x - x^*)\right| \le \psi_{3h}(\|x - x^*\|)\,\|x - x^*\|^2, \tag{3.9}$$

where $\psi_{3h}(z) : \mathbb{R} \to \mathbb{R}$ is a continuous function with $\psi_{3h}(0) = 0$. Indeed, by Taylor's theorem there exist continuous functions $\psi^1_{3h}, \psi'_{3h}, \psi^2_{3h} : \mathbb{R} \to \mathbb{R}$ which satisfy $\psi^1_{3h}(0) = \psi'_{3h}(0) = \psi^2_{3h}(0) = 0$ such that

$$\left|h(x) - h(x^*) - \nabla_x h(x^*)^T (x - x^*) - \tfrac12 (x - x^*)^T \nabla^2_{xx} h(x) (x - x^*)\right| \le \psi^1_{3h}(\|x - x^*\|)\,\|x - x^*\|^2, \tag{3.10}$$

and

$$\left\|\nabla_x h(x) - \nabla_x h(x^*) - \nabla^2_{xx} h(x)(x - x^*)\right\| \le \psi'_{3h}(\|x - x^*\|)\,\|x - x^*\|, \tag{3.11}$$

which in turn implies, after choosing $\psi^2_{3h} = \tfrac12 \psi'_{3h}$ and using the Cauchy-Schwarz inequality, that

$$\left|\tfrac12\left(\nabla_x h(x) + \nabla_x h(x^*)\right)^T (x - x^*) - \tfrac12\left(\nabla_x h(x^*) + \nabla_x h(x^*)\right)^T (x - x^*) - \tfrac12 (x - x^*)^T \nabla^2_{xx} h(x)(x - x^*)\right| \le \psi^2_{3h}(\|x - x^*\|)\,\|x - x^*\|^2. \tag{3.12}$$

The relation (3.9) now follows by comparing (3.9), (3.10) and (3.12) and taking $\psi_{3h}(z) = \psi^1_{3h}(z) + \psi^2_{3h}(z)$. If $h$ were three times continuously differentiable, then $\psi_{3h}$ would be related to the third derivative of $h$, from the error formula of trapezoidal integration [1], which is the origin of our subscript notation.
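The trapezoidal-type estimate (3.9) can be checked numerically. The sketch below is only an illustration under stated assumptions: the function $h(x) = e^{x_1} + x_1 x_2^2$, the base point and the direction are arbitrary choices, not data from the paper. It evaluates the left-hand side of (3.9) divided by $\|x - x^*\|^2$ along a ray toward $x^*$; for a smooth $h$ this quotient should decay roughly in proportion to the distance.

```python
import math

def h(x):
    """Hypothetical smooth test function h(x) = exp(x1) + x1 * x2^2."""
    return math.exp(x[0]) + x[0] * x[1] ** 2

def grad_h(x):
    """Gradient of the test function above."""
    return [math.exp(x[0]) + x[1] ** 2, 2.0 * x[0] * x[1]]

def trapezoid_defect(x, x_star):
    """|h(x) - h(x*) - 0.5 (grad h(x) + grad h(x*))^T (x - x*)| / ||x - x*||^2."""
    diff = [a - b for a, b in zip(x, x_star)]
    gx, gs = grad_h(x), grad_h(x_star)
    avg_slope = 0.5 * sum((p + q) * v for p, q, v in zip(gx, gs, diff))
    dist_sq = sum(v * v for v in diff)
    return abs(h(x) - h(x_star) - avg_slope) / dist_sq

x_star = [0.3, -0.2]       # arbitrary base point (assumption)
direction = [0.6, 0.8]     # unit direction (assumption)
steps = [1e-1, 1e-2, 1e-3]
defects = [
    trapezoid_defect(
        [x_star[0] + s * direction[0], x_star[1] + s * direction[1]], x_star
    )
    for s in steps
]
for s, q in zip(steps, defects):
    print(f"||x - x*|| = {s:.0e}   defect / ||x - x*||^2 = {q:.3e}")
```

Because this test function is three times continuously differentiable, the quotient behaves like a multiple of $\|x - x^*\|$, in line with the trapezoidal error formula mentioned above.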

Theorem 3.5. Let $(x^k)_{k \in \mathbb{N}}$ be a sequence such that $x^k \to x^*$, $x^k \ne x^*$. Let $d^k$ be a stationary point of (3.1) for $x = x^k$ and $0 < \gamma \le \gamma_7$, where $\gamma_7$ is the quantity from Lemma 3.3. Then

$$\lim_{k \to \infty} \frac{\|x^k + d^k - x^*\|}{\|x^k - x^*\|} = 0.$$

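Proof. Since $x^k \to x^*$, the sequence $x^k$ eventually reaches $N_1^{\gamma}(x^*)$. Since $0 < \gamma \le \gamma_7$, this means that Lemmas 3.3 and 3.4, as well as all preceding results, apply for sufficiently large $k$. Using (3.9) we get that

$$
\begin{aligned}
f(x^k + d^k) - f(x^*) \le{}& \tfrac12\left(\nabla_x f(x^k + d^k) + \nabla_x f(x^*)\right)^T (x^k + d^k - x^*) + \psi_{3f}(\|x^k + d^k - x^*\|)\,\|x^k + d^k - x^*\|^2 \\
\le{}& \tfrac12\left(\nabla_x f(x^k) + \nabla^2_{xx} f(x^k) d^k + \nabla_x f(x^*)\right)^T (x^k + d^k - x^*) \\
& + \psi'_{3f}(\|d^k\|)\,\|d^k\|\,\|x^k + d^k - x^*\| + \psi_{3f}(\|x^k + d^k - x^*\|)\,\|x^k + d^k - x^*\|^2,
\end{aligned}
\tag{3.13}
$$

where $\psi'_{3f}(\|d^k\|)\,\|d^k\|$ is a bound obtained by using (3.11) for $f(x)$ between $x^k + d^k$ and $x^k$. Here $\psi'_{3f}$ is a continuous function satisfying $\psi'_{3f}(0) = 0$.

From Corollary 3.2, there exists a Lagrange multiplier $\mu^k$ which, together with $d^k$, satisfies the Kuhn-Tucker conditions (1.6) for (3.1) at $x = x^k$. From Lemma 3.3, there exists a $\mu^{*,k} \in M(x^*)$ such that

$$\|\mu^k - \mu^{*,k}\| \le c_\mu\left(\|x^k - x^*\| + \|d^k\|\right) \quad \text{and} \quad \mu_i^k = 0 \Rightarrow \mu_i^{*,k} = 0. \tag{3.14}$$

Using the Kuhn-Tucker conditions (1.6) to replace $\nabla_x f(x^k) + \nabla^2_{xx} f(x^k) d^k$ and $\nabla_x f(x^*)$ in terms of $g$ and the Lagrange multipliers, and using the bounds $\|\mu^k\|_1 \le \kappa_1$, $\|\mu^{*,k}\|_1 \le \kappa_1$ that follow from Corollary 3.2, we get from (3.13)

$$
\begin{aligned}
f(x^k + d^k) - f(x^*) \le{}& \tfrac12\left(-\sum_{i=1}^m \mu_i^k\left(\nabla_x g_i(x^k) + \nabla^2_{xx} g_i(x^k) d^k\right) - \sum_{i=1}^m \mu_i^{*,k} \nabla_x g_i(x^*)\right)^T (x^k + d^k - x^*) \\
& + \psi'_{3f}(\|d^k\|)\,\|d^k\|\,\|x^k + d^k - x^*\| + \psi_{3f}(\|x^k + d^k - x^*\|)\,\|x^k + d^k - x^*\|^2 \\
\le{}& \tfrac12\left(-\sum_{i=1}^m \mu_i^k \nabla_x g_i(x^k + d^k) - \sum_{i=1}^m \mu_i^{*,k} \nabla_x g_i(x^*)\right)^T (x^k + d^k - x^*) \\
& + m\kappa_1 \psi'_{3g}(\|d^k\|)\,\|d^k\|\,\|x^k + d^k - x^*\| + \psi'_{3f}(\|d^k\|)\,\|d^k\|\,\|x^k + d^k - x^*\| \\
& + \psi_{3f}(\|x^k + d^k - x^*\|)\,\|x^k + d^k - x^*\|^2,
\end{aligned}
\tag{3.15}
$$

where $\psi'_{3g}(\|d^k\|)\,\|d^k\|$ is a bound obtained from applying (3.11) to $g_i(x)$, $i = 1,2,\ldots,m$, between the points $x^k + d^k$ and $x^k$, and taking the maximum among the resulting bounds. Here $\psi'_{3g}$ is a continuous function satisfying $\psi'_{3g}(0) = 0$. We now make use of the identity $ab + cd = \tfrac12\left[(a + c)(b + d) + (a - c)(b - d)\right]$ for the terms $\mu_i^k\left(\nabla_x g_i(x^k + d^k)^T (x^k + d^k - x^*)\right) + \mu_i^{*,k}\left(\nabla_x g_i(x^*)^T (x^k + d^k - x^*)\right)$, $i = 1,2,\ldots,m$. Continuing the bounding in (3.15) we get

$$
\begin{aligned}
f(x^k + d^k) - f(x^*) \le{}& -\tfrac14\left(\sum_{i=1}^m (\mu_i^k + \mu_i^{*,k})\left(\nabla_x g_i(x^k + d^k) + \nabla_x g_i(x^*)\right) + \sum_{i=1}^m (\mu_i^k - \mu_i^{*,k})\left(\nabla_x g_i(x^k + d^k) - \nabla_x g_i(x^*)\right)\right)^T (x^k + d^k - x^*) \\
& + \left(m\kappa_1 \psi'_{3g}(\|d^k\|) + \psi'_{3f}(\|d^k\|)\right)\|d^k\|\,\|x^k + d^k - x^*\| + \psi_{3f}(\|x^k + d^k - x^*\|)\,\|x^k + d^k - x^*\|^2.
\end{aligned}
\tag{3.16}
$$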

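We now bound all terms involving $\mu^k$ and $\mu^{*,k}$. Using that $\|\mu^k - \mu^{*,k}\| \le c_\mu\left(\|x^k - x^*\| + \|d^k\|\right)$ from (3.14), and that $g$ is twice continuously differentiable and thus

$$\left\|\nabla_x g_i(x^k + d^k) - \nabla_x g_i(x^*)\right\| \le c_{2g}\|x^k + d^k - x^*\|, \quad i = 1,2,\ldots,m,$$

we get

$$-\tfrac14 \sum_{i=1}^m (\mu_i^k - \mu_i^{*,k})\left(\nabla_x g_i(x^k + d^k) - \nabla_x g_i(x^*)\right)^T (x^k + d^k - x^*) \le m c_\mu c_{2g}\left(\|x^k - x^*\| + \|d^k\|\right)\|x^k + d^k - x^*\|^2. \tag{3.17}$$

Using that $\|\mu\|_1 \le \kappa_1$ from Corollary 3.2, (3.9) for $g_i(x)$ and that $g_i(x^*) = 0$, $i = 1,2,\ldots,m$, as well as Lemma 3.4 (ii), we obtain that

$$
\begin{aligned}
-\tfrac14 \sum_{i=1}^m (\mu_i^k + \mu_i^{*,k})\left(\nabla_x g_i(x^k + d^k) + \nabla_x g_i(x^*)\right)^T (x^k + d^k - x^*)
&\le -\tfrac12 \sum_{i=1}^m (\mu_i^k + \mu_i^{*,k}) g_i(x^k + d^k) + m\kappa_1 \psi_{3g}(\|x^k + d^k - x^*\|)\,\|x^k + d^k - x^*\|^2 \\
&\le m\kappa_1 \psi_P(d^k)\|d^k\|^2 + m\kappa_1 \psi_{3g}(\|x^k + d^k - x^*\|)\,\|x^k + d^k - x^*\|^2.
\end{aligned}
\tag{3.18}
$$

Putting together the bounds from (3.16), (3.17) and (3.18) we obtain

$$
\begin{aligned}
f(x^k + d^k) - f(x^*) \le{}& \left(m c_\mu c_{2g}\left(\|x^k - x^*\| + \|d^k\|\right) + m\kappa_1 \psi_{3g}(\|x^k + d^k - x^*\|) + \psi_{3f}(\|x^k + d^k - x^*\|)\right)\|x^k + d^k - x^*\|^2 \\
& + \left(m\kappa_1 \psi'_{3g}(\|d^k\|) + \psi'_{3f}(\|d^k\|)\right)\|d^k\|\,\|x^k + d^k - x^*\| + m\kappa_1 \psi_P(d^k)\|d^k\|^2.
\end{aligned}
\tag{3.19}
$$

Since the bound on the right-hand side is nonnegative, we can use Lemma 3.4 (i) and the quadratic growth condition (1.14) to get that

$$
\begin{aligned}
\sigma\|x^k + d^k - x^*\|^2 &\le \max\left\{f(x^k + d^k) - f(x^*),\ P(x^k + d^k)\right\} \\
&\le \psi_1(x^k - x^*, d^k)\,\|x^k + d^k - x^*\|^2 + \psi_2(d^k)\,\|d^k\|\,\|x^k + d^k - x^*\| + m\kappa_1 \psi_P(d^k)\|d^k\|^2 + \psi_P(d^k)\|d^k\|^2,
\end{aligned}
\tag{3.20}
$$

where

$$
\begin{aligned}
\psi_1(x^k - x^*, d^k) &= m c_\mu c_{2g}\left(\|x^k - x^*\| + \|d^k\|\right) + m\kappa_1 \psi_{3g}(\|x^k + d^k - x^*\|) + \psi_{3f}(\|x^k + d^k - x^*\|), \\
\psi_2(d^k) &= m\kappa_1 \psi'_{3g}(\|d^k\|) + \psi'_{3f}(\|d^k\|).
\end{aligned}
$$

$\psi_1$ and $\psi_2$ are continuous functions of their arguments that satisfy $\psi_1(0, 0) = 0$ and $\psi_2(0) = 0$. We now use that $ab \le \tfrac12(a^2 + b^2)$ to get from (3.20) that

$$\sigma\|x^k + d^k - x^*\|^2 \le \left(\psi_1(x^k - x^*, d^k) + \tfrac12 \psi_2(d^k)\right)\|x^k + d^k - x^*\|^2 + \left(\tfrac12 \psi_2(d^k) + (m\kappa_1 + 1)\psi_P(d^k)\right)\|d^k\|^2. \tag{3.21}$$

Since $x^k \to x^*$ and $d^k \to 0$, from Theorem 3.1, there exists $K_1$ such that $\forall k \ge K_1$ we have

$$\psi_1(x^k - x^*, d^k) + \tfrac12 \psi_2(d^k) \le \frac{\sigma}{2}.$$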

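Moving the corresponding term to the left-hand side, we get that, $\forall k \ge K_1$,

$$\frac{\sigma}{2}\|x^k + d^k - x^*\|^2 \le \left(\tfrac12 \psi_2(d^k) + (m\kappa_1 + 1)\psi_P(d^k)\right)\|d^k\|^2.$$

Now using the continuity of $\psi_2$ and $\psi_P$, and that, from Theorem 3.1 (iii), $d^k \to 0$, we get that

$$\lim_{k \to \infty} \frac{\|x^k + d^k - x^*\|^2}{\|d^k\|^2} \le \frac{2}{\sigma} \lim_{k \to \infty}\left(\tfrac12 \psi_2(d^k) + (m\kappa_1 + 1)\psi_P(d^k)\right) = 0,$$

or that

$$\lim_{k \to \infty} \frac{\|x^k + d^k - x^*\|}{\|d^k\|} = 0. \tag{3.22}$$

Using now the consequence of the triangle inequality

$$\left|\,\|x^k - x^*\| - \|d^k\|\,\right| \le \|x^k - x^* + d^k\|,$$

dividing the relation by $\|d^k\|$ and taking the limit, this implies that

$$\lim_{k \to \infty} \left|\frac{\|x^k - x^*\|}{\|d^k\|} - 1\right| \le \lim_{k \to \infty} \frac{\|x^k - x^* + d^k\|}{\|d^k\|} = 0$$

and thus

$$\lim_{k \to \infty} \frac{\|x^k - x^*\|}{\|d^k\|} = 1.$$

Dividing (3.22) by the last limit we get that

$$\lim_{k \to \infty} \frac{\|x^k + d^k - x^*\|}{\|x^k - x^*\|} = 0,$$

which proves the claim of the Theorem.

Theorem 3.6. Let $\gamma$ be such that $0 < \gamma \le \gamma_7$, where $\gamma_7$ is the quantity from Lemma 3.3. There exists a radius $r$ such that for any $x \in B(x^*, r)$, $x \ne x^*$, if $d$ is a stationary point of (3.1), then

$$\frac{\|x + d - x^*\|}{\|x - x^*\|} \le \frac12.$$

Whenever started inside $B(x^*, r)$, the SQCQP algorithm produces a sequence $x^k \to x^*$ that is superlinearly convergent,

$$\lim_{k \to \infty} \frac{\|x^{k+1} - x^*\|}{\|x^k - x^*\|} = 0.$$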

Proof. Assume the contrary: for any $q \in \mathbb{N}$, there exist $x^q \ne x^*$ such that $\|x^q - x^*\| \le \frac{1}{q}$ and $d^q$ a stationary point of (3.1) such that

$$\frac{\|x^q + d^q - x^*\|}{\|x^q - x^*\|} > \frac12. \tag{3.23}$$

Therefore $x^q \to x^*$, and by Theorem 3.5

$$\lim_{q \to \infty} \frac{\|x^q + d^q - x^*\|}{\|x^q - x^*\|} = 0,$$

which contradicts (3.23). As a result there exists $r$ with the properties required by the Theorem. When started with $x^0 \in B(x^*, r)$, the SQCQP algorithm produces a sequence $x^{k+1} = x^k + d^k$ such that

$$\frac{\|x^1 - x^*\|}{\|x^0 - x^*\|} \le \frac12,$$

which implies $x^1 \in B(x^*, r)$ and thus, by induction, $x^k \in B(x^*, r)$, $\forall k \in \mathbb{N}$, and $x^k \to x^*$ as $k \to \infty$. We can now use Theorem 3.5 to claim that

$$\lim_{k \to \infty} \frac{\|x^{k+1} - x^*\|}{\|x^k - x^*\|} = \lim_{k \to \infty} \frac{\|x^k + d^k - x^*\|}{\|x^k - x^*\|} = 0,$$

which proves the superlinear convergence of $x^k$ to $x^*$. The proof is complete.
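To make the iteration $x^{k+1} = x^k + d^k$ concrete, here is a minimal sketch on a hypothetical one-dimensional degenerate problem (an assumption chosen for illustration, not an example from the paper): minimize $f(x) = x$ subject to $g_1(x) = e^{-x} - 1 \le 0$ and $g_2(x) = 2(e^{-x} - 1) \le 0$. Both constraints are active at $x^* = 0$ with linearly dependent gradients, so the multipliers are not unique, while MFCQ holds with $p = 1$. The two quadratic models describe the same feasible interval, so the code keeps one of them, and the subproblem is solved in closed form. The trust-region radius `GAMMA = 0.9` is an arbitrary choice; the sketch does not verify that it is small enough in the sense of the theorem.

```python
import math

GAMMA = 0.9  # trust-region radius (assumed value for this toy problem)

def sqcqp_step(x, gamma=GAMMA):
    """Stationary point of the toy subproblem: minimize d subject to the
    quadratic model of g at x being <= 0 and |d| <= gamma.

    The model constraint reads d^2/2 - d + 1 - exp(x) <= 0, an interval whose
    left endpoint is 1 - sqrt(2 exp(x) - 1). Minimizing d selects that
    endpoint unless the trust region cuts it off.
    """
    left = 1.0 - math.sqrt(2.0 * math.exp(x) - 1.0)
    if left > gamma:
        raise ValueError("subproblem infeasible for this x and gamma")
    return max(left, -gamma)

def run_sqcqp(x0, iterations=3):
    """Apply x_{k+1} = x_k + d_k a fixed number of times."""
    xs = [x0]
    for _ in range(iterations):
        xs.append(xs[-1] + sqcqp_step(xs[-1]))
    return xs

iterates = run_sqcqp(0.5)
errors = [abs(x) for x in iterates]  # distance to x* = 0
ratios = [errors[k + 1] / errors[k]
          for k in range(len(errors) - 1) if errors[k] > 0.0]
for k, (x, e) in enumerate(zip(iterates, errors)):
    print(f"k = {k}   x_k = {x: .3e}   |x_k - x*| = {e:.3e}")
print("successive error ratios:", ratios)
```

In this sketch the successive error ratios drop toward zero, which is the behavior the theorem describes. The toy problem is smooth, so the observed decay is faster than what the general result guarantees.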

Note. If the data of the problems are three times continuously differentiable, then the functions $\psi_3$ and $\psi'_3$ are Lipschitzian in their respective arguments, which considerably simplifies the notation for the proof of Theorem 3.5. For instance, $\psi'_{3g}(\|d^k\|) = O(\|d^k\|)$ in this case. Using essentially the same proof, the conclusion can then be strengthened to show that the order of convergence is at least $\frac32$.

4. Conclusions.

We present an algorithm that achieves superlinear convergence of the iterates to a local minimum of the nonlinear program (1.2) at which MFCQ (1.7) and the quadratic growth condition (1.1) are satisfied. The conditions we impose allow even situations for which no locally convex augmented Lagrangian exists, a case not accommodated by most previous results in the literature.

At each step we solve a subproblem generated by approximating the objective function and the constraints by their second-order Taylor series at the current iterate. We also add a trust-region constraint, which ensures that the subproblem is bounded. The algorithm therefore solves at each step a quadratically constrained quadratic program (QCQP), and we thus call it sequential quadratically constrained quadratic programming (SQCQP). The subproblem to be solved is not necessarily convex. However, we prove that for a suitable, fixed size of the trust region, the associated constraint is inactive at any stationary point of the QCQP. As a result, any stationary point of the QCQP induces superlinear convergence of the iterates, which obviates the need for finding the global optimum of the subproblem.

A subproblem that has quadratic constraints is more difficult to solve than a subproblem with linear constraints, the latter being the case for Sequential Quadratic Programming (SQP) algorithms [19]. One could of course solve the QCQP with a nonlinear programming technique. The algorithm in [2] achieves at least linear convergence on the subproblem under the conditions considered here. Since in this work a more accurate model of the constraints is considered, compared to SQP, it would be expected that a smaller number of exterior iterations, and thus of function evaluations, is needed before completion. However, given the complexity of the subproblem, this will not necessarily result in a shorter run time. Nevertheless, algorithms can be derived to deal directly with quadratically constrained problems via semidefinite relaxation [16]. Devising methods that specifically accommodate quadratic constraints will be the subject of future research.


REFERENCES

[1] K. Atkinson, An Introduction to Numerical Analysis, John Wiley & Sons, 1988.

[2] M. Anitescu, Degenerate Nonlinear Programming with a Quadratic Growth Constraint, to appear in SIAM Journal on Optimization.

[3] D. P. Bertsekas, Constrained Optimization and Lagrange Multiplier Methods, Academic Press, New York, 1982.

[4] D. P. Bertsekas, Nonlinear Programming, Athena Scientific, Belmont, Massachusetts, 1995.

[5] J. F. Bonnans, Local Analysis of Newton-Type Methods for Variational Inequalities and Nonlinear Programming, Applied Mathematics and Optimization, 29 (1994), pp. 161-186.

[6] J. F. Bonnans and A. Ioffe, Second-order sufficiency and quadratic growth for nonisolated minima, Mathematics of Operations Research, 20:4 (1995), pp. 801-819.

[7] A. V. Fiacco, Introduction to Sensitivity and Stability Analysis in Nonlinear Programming, Academic Press, New York, 1983.

[8] R. Fletcher, Practical Methods of Optimization, John Wiley & Sons, Chichester, 1987.

[9] J. Gauvin, A necessary and sufficient regularity condition to have bounded multipliers in nonconvex programming, Mathematical Programming, 12 (1977), pp. 136-138.

[10] J. Gauvin and J. W. Tolle, Differential stability in nonlinear programming, SIAM Journal on Control and Optimization, 15 (1977), pp. 294-311.

[11] W. W. Hager, Stabilized sequential quadratic programming, to appear in Computational Optimization and Applications.

[12] W. W. Hager and M. S. Gowda, Stability in the presence of degeneracy and error estimation, Technical Report, Department of Mathematics, University of Florida, 1997.

[13] A. J. Hoffman, On approximate solutions of systems of linear inequalities, Journal of Research of the National Bureau of Standards, 49 (1952), pp. 263-265.

[14] A. Ioffe, Necessary and sufficient conditions for a local minimum. 3: Second order conditions and augmented duality, SIAM Journal on Control and Optimization, 17:2 (1979), pp. 266-288.

[15] A. Ioffe, On sensitivity analysis of nonlinear programs in Banach spaces: the approach via composite unconstrained optimization, SIAM Journal on Optimization, 4:1 (1994), pp. 1-43.

[16] S. Kruk and H. Wolkowicz, SQQP, Sequential Quadratic Constrained Quadratic Programming, Research Report CORR 97-01, University of Waterloo, Waterloo, 1997.

[17] O. L. Mangasarian, Nonlinear Programming, McGraw-Hill, New York, 1969.

[18] O. L. Mangasarian and S. Fromovitz, The Fritz John necessary optimality conditions in the presence of equality constraints, J. Math. Anal. and Appl., 17 (1967), pp. 34-47.

[19] E. Polak, Optimization, Springer, New York, 1997.

[20] D. Ralph and S. J. Wright, Superlinear convergence of an interior-point method despite dependent constraints, Preprint ANL/MCS-P622-1196, Mathematics and Computer Science Division, Argonne National Laboratory, Argonne, Ill., 1996.

[21] S. M. Robinson, Generalized equations and their solutions, Part II: Applications to nonlinear programming, Mathematical Programming Study, 19 (1980), pp. 200-221.

[22] A. Shapiro, Sensitivity analysis of nonlinear programs and differentiability properties of metric projections, SIAM Journal on Control and Optimization, 26:3 (1988), pp. 628-645.

[23] S. J. Wright, Superlinear convergence of a stabilized SQP method to a degenerate solution, Computational Optimization and Applications, 11 (1998), pp. 253-275.

[24] S. J. Wright, Modifying SQP for degenerate problems, Preprint ANL/MCS-P699-1097, Mathematics and Computer Science Division, Argonne National Laboratory, Argonne, Ill., 1997.