A penalty-function-free line search SQP method for nonlinear programming

(1)

Contents lists available atScienceDirect

Journal of Computational and Applied

Mathematics

journal homepage:www.elsevier.com/locate/cam

A penalty-function-free line search SQP method for

nonlinear programming

I

Wenjuan Xue

a,c,∗

, Chungen Shen

b,c

, Dingguo Pu

c a_{Department of Mathematics and Physics, Shanghai University of Electric Power, 200090, China} b_{Department of Applied Mathematics, Shanghai Finance University, 201209, China}

c_{Department of Mathematics, Tongji University, 200092, China}

a r t i c l e i n f o

Article history:

Received 13 November 2005 Received in revised form 13 September 2008 MSC: 90C30 65K10 Keywords: Non-monotonicity Line search SQP Global convergence Local convergence SOC a b s t r a c t

We propose a penalty-function-free non-monotone line search method for nonlinear optimization problems with equality and inequality constraints. This method yields global convergence without using a penalty function or a filter. Each step is required to satisfy a decrease condition for the constraint violation, as well as that for the objective function under some reasonable conditions. The proposed mechanism for accepting steps also combines the non-monotone technique on the decrease condition for the constraint violation, which leads to flexibility and an acceptance behavior comparable with filter based methods. Furthermore, it is shown that the proposed method can avoid the Maratos effect if the search directions are improved by second-order corrections (SOC). So locally superlinear convergence is achieved. We also present some numerical results which confirm the robustness and efficiency of our approach.

1. Introduction

In this paper, we consider the constrained optimization problem:

(P)

(

_min _f

(

_x

)

s

.

t

.

ci

(

x

) =

0

,

i

∈

E

,

ci

(

x

) ≤

0

,

i

∈

I

,

(1)

with twice continuously differentiable functions f

(

x

) :

_Rn

_→

R

,

c

(

x

) = (

c1

(

x

), . . . ,

cm

(

x

)) :

Rn

→

Rm.E

= {

1

, . . . ,

me

}

,

I

= {

me

+

1

, . . . ,

m

}

. The Lagrangian function associated with problem (P) is L

(

x

, λ) =

f

(

x

) + λ

Tc

(

x

)

, where

λ ∈

Rmis the corresponding Lagrangian multiplier.

The sequential quadratic programming (SQP) method has been widely used for solving problem (P) and has been investigated by many researchers [12,13,16,17]. However, traditional SQP methods may encounter the trouble of choosing problematic penalty parameters. Fletcher and Leyffer [6] proposed the filter method which is an alternative to traditional SQP methods to solve problem (P). The key point of the concept is that the trial point generated by solving a trust region SQP problem is accepted if there is a sufficient decrease of the objective function or the constraint violation. No penalty parameter needs to be chosen. So, we call it a penalty-function-free method. In addition, the computational results of the

I _{This research is supported by National Science Foundation of China (No. 10771162) and Shanghai Excellent Young Teacher Foundation.} ∗_{Corresponding author at: Department of Mathematics, Tongji University, 200092, China.}

E-mail address:[email protected](W. Xue).

(2)

algorithm in [6] are also very encouraging and subsequently the global convergence of the filter SQP methods was proved in [7,8]. There are also various methods using the filter strategy (see [1,4,5,15,22–24]).

Another penalty-function-free method is the non-monotone trust region SQP method proposed by Ulbrich and Ulbrich [21]. This algorithm is the only trust region algorithm without a penalty function or a filter. Therefore, it also does not require any penalty parameter. However, only equality constrained optimization problems are discussed in their algorithm. In this paper, we propose a non-monotone line search SQP method, which interprets the optimization problem with general constraints as a bi-objective optimization problem. Our method uses neither a penalty function nor an augmented Lagrange function to test the acceptability of trial steps. Thus, it avoids the difficulty of choosing a proper penalty parameter. Since the filter acceptant criteria in the filter methods mentioned above require that the trial point compare with all the points in the filter, it implies that all the information about the points in the filter is needed. Storing the filter points results in a lot of memory being required in the optimization calculation. Therefore, the method that we investigate here does not use the concept of filters, either. Rather, the non-monotone line search technique is applied in our method. This strategy is comparable to filter methods with respect to the flexibility of accepting trial steps.

We also discuss the local convergence property of our method. The SQP method may suffer from the well-known Maratos effect [14]. As a remedy, Fletcher and Leyffer [6] proposed to improve the search direction by means of the second order correction (SOC) step which aims to further reduce the infeasibility, if the full step is rejected. In this paper, we show that this modification is indeed able to avoid the Maratos effect in our method.

Here, we make an extensive use of the symbols o

(·)

andO

(·)

. Denote

{

η

_k

}

and

{

ν

_k

}

as two vanishing sequences, where

η

k

, ν

k

∈

R

,

k is a positive integer. If there exists a constant C

>

0, such that

|

η

k

| ≤

C

|

ν

k

|

for all k sufficiently large, we write

η

k

=

O

(ν

k

)

. If the sequence of ratios

{

η

k

/ν

k

}

approaches zero when k

→ +∞

, we write

η

k

=

o

(ν

k

)

.

This paper is organized as follows. In Section2the algorithm is developed. We describe the decrease conditions for the objective function and the non-monotone decrease conditions for the constraint violation, which are the key ingredients of our algorithm. The global convergence of the algorithm is established in Section3. In Section4, we show that under some reasonable conditions the algorithm converges to a local minimum of problem (P) superlinearly. Some numerical results are reported in the final section.

2. Algorithm

Let xkbe the current iterate; the search direction is computed by solving the quadratic programming (QP) subproblem

as follows: QP

(

xk

)











min g_kTd

+

1 2d T_B kd s

.

t

.

Ai

(

xk

)

Td

+

ci

(

xk

) =

0

,

i

∈

E

,

Ai

(

xk

)

Td

+

ci

(

xk

) ≤

0

,

i

∈

I

,

(2)

where gk

=

g

(

xk

) = ∇

f

(

xk

)

, A

(

xk

) = [

A1

(

xk

), . . . ,

Am

(

xk

)] = ∇

c

(

xk

)

, and Bk

∈

Rn×nis symmetric positive definite. There is a common difficulty in solving the QP subproblems. That is, the linearization of the nonlinear constraints may give rise to infeasibility of the QP subproblem. It means that QP

(

xk

)

may be inconsistent. To overcome the difficulty, we define

a relaxed feasible QP subproblem. We use the technique in [10] to deal with the infeasible problem (P) and the infeasible QP subproblem. Since QP

(

xk

)

may be infeasible or the corresponding Lagrangian multipliers may be too large, we consider

solving the following auxiliary problem:

P

(γ )











min f

(

x

) + γ

eT

(v + w)

s

.

t

.

ci

(

x

) − v

i

+

w

i

=

0

,

i

∈

E

,

ci

(

x

) − v

i

+

w

i

≤

0

,

i

∈

I

,

v ≥

0

,

w ≥

0

,

(3)

where

γ

is a non-negative penalty parameter and

v

T

₌

_(v

1

, . . . , v

m

)

,

w

T

=

(w

1

, . . . , w

m

)

, eT

=

(

1

, . . . ,

1

)

with

corresponding dimension. If problem (P) has a feasible solution and

γ

is sufficiently large, then the solutions to problem (P) and problem P

(γ )

are identical. Otherwise problem P

(γ )

tends to determine a ‘‘good’’ infeasible solution if

γ

is large enough. The choice of

γ

requires heuristics. We use the value

γ =

100

k∇

f

(ˆ

x0

)k

, wherex

ˆ

0is the first iterate at which the inconsistent linearized constraints are detected. We do not stop increasing

γ

with the increasing factor 10 until

v =

0

, w =

0 or

γ =

1010

k∇

f

(ˆ

x0

)k

.

For simplicity, we do not introduce the quadratic model of problem P

(γ )

in the following part of this paper, and just assume that QP

(

xk

)

is always consistent.

Let dkbe the solution of QP

(

xk

)

. Then the KKT conditions of QP

(

xk

)

are as follows:











gk

+

Bkdk

= −

A

(

xk

)λ

k

,

ci

(

xk

) +

Ai

(

xk

)

Tdk

=

0

,

i

∈

E

,

λ

k,i ci

(

xk

) +

Ai

(

xk

)

Tdk

=

0

,

i

∈

I

,

ci

(

xk

) +

Ai

(

xk

)

Tdk

≤

0

,

λ

k,i

≥

0

,

i

∈

I

,

(4)

(3)

After the search direction dkis computed, a step size

α

k

∈

(

0

,

1

]

is determined in order to obtain the next iterate

xk+1

:=

xk

+

α

kdk. We propose a backtracking line search procedure, where a decreasing sequence of step size

α

is

tried until the acceptance criteria are satisfied. The basic idea in our approach is to interpret problem (P) as a bi-objective optimization problem with two goals, i.e., minimizing the objective function f

(

x

)

and the constraint violation h

(

c

(

x

)) :=

P

i∈E

|

ci

(

x

)| + P

i∈Imax

{

0

,

ci

(

x

)}

.

First of all, we consider the decrease condition for the objective function. Define1fk

=

f

(

xk

) −

f

(

xk

+

α

dk

)

and

1lk

= −

gkTdk. For simplicity, let h

(

ck

) =

h

(

c

(

xk

))

. If

g_kTdk

≤ −

ξ

dTkBkdk and h

(

ck

) ≤ ζ

1

k

dk

k

ζ2

,

(5)

with constants

ξ ∈ (

0

,

1₂

)

,

ζ

1

>

0,

ζ

2

∈

(

2

,

3

)

, then the sufficient decrease condition

1fk

≥

σ α

1lk (6)

is required be satisfied, where

σ ∈ (

0

,

1₂

)

is a constant.

Now we consider the non-monotone decrease condition for the constraint violation h

(

c

(

x

))

, we use the technique inspired by [21]. The well-known KKT conditions for problem (P) are











g

(

x

) +

A

(

x

)λ =

0

,

ci

(

x

) =

0

,

i

∈

E

,

λ

ici

(

x

) =

0

,

i

∈

I

,

ci

(

x

) ≤

0

,

λ

i

≥

0

,

i

∈

I

,

(7)

where

λ ∈

Rnis the Lagrangian multiplier. Denote Nfk

= k

Ak

λ

k

+

gk

k

. If Nfk

→

0 and h

(

ck

) →

0, then the KKT conditions

are satisfied at the limit points of

{

xk

}

if some reasonable conditions are satisfied. In order to reduce h

(

ck

)

and Nfkevenly,

we need a slack variable Tk.

The choice of Tkis an important issue in the design of the method. It is done in such a way that the feasibility requirement

is relaxed if the feasibility is much better than the stationarity, i.e., if h

(

ck

)

Nfk, then a value larger than h

(

ck

)

is chosen.

In order to keep minimum control over the constraint violation, we choose Tkto be not larger than some upper bound bjk.

Hereby,

{

bjk

}

is a slowly decreasing sequence tending to zero and only when Tkyields the maximum in the first term ofRk,

jkincreases, i.e., jk+1

=

jk

+

1. Let

{

bj

}

be a sequence with

b0

>

0

,

bj

=

b0 j

+

1

(

j

≥

1

),

bj

→

0

(

j

→ +∞

),

and 1 2

≤

bj+1 bj

<

1

.

With a fixed positive integer l

>

1, let

Ml,k

=

max

k−l+1≤i≤k−1h

(

ci

),

(8)

and

Rk

=

max

{

Tk

,

Ml,k

}

.

(9)

For the non-monotone decrease of the constraint violation, the trial step xk

+

α

dkis accepted as a new iterate xk+1if it satisfies Rk

−

h

(

ck+1

) ≥ αη

Rk

, η ∈

0

,

1 2

.

(10)

Now, we give a complete statement of our algorithm. Algorithm A (Update T_k). Initiate parameters: 0

< η

1

, η

2

<

1₂.

B If h

(

ck

) <

min

{

η

1bjk

, η

2Nfk

}

,

set Tk

=

min

{

bjk

,

Nfk

}

,

•

If Tk

≥

Ml,k, set jk+1

=

jk

+

1;

•

otherwise set jk+1

=

jk

.

B else set Tk

=

h

(

ck

),

jk+1

=

jk.

Algorithm B. Initiate parameters: x0

∈

Rn,

σ ∈ (

0

,

12

), η ∈ (

0

,

1 2

), ξ ∈ (

0

,

1 2

),

B0

∈

R n×n

_,

_t

_∈

₍

₀

_,

₁

_{), ζ}

1

>

0

, ζ

2

∈

(

2

,

3

),

k

:=

1.

Step 1 Compute the search direction.

Compute the search direction dkand the corresponding Lagrangian multiplier

λ

kfrom QP

(

xk

)

.

Set

α =

1.

Step 2 Evaluate functions at xk.

Compute f

(

xk

),

c

(

xk

),

g

(

xk

),

A

(

xk

)

. Step 3 Check for termination.

(4)

Step 4 Update T_k.

Update Tkby callingAlgorithm A. Step 5 Backtracking line search. If(5)holds, go to 5.1, else go to 5.2.

5.1 IfRk

−

h

(

c

(

xk

+

α

dk

)) < αη

Rkor1fk

< σ α

1lk,

Case 1:

α 6=

1, then set

α =

t

α

, and go to Step 5. Case 2:

α =

1, go to 5.3,

Otherwise set

α

k

=

α

, xk+1

=

xk

+

α

kdk, go to Step 6.

5.2 IfRk

−

h

(

c

(

xk

+

α

dk

)) ≥ αη

Rk, then set

α

k

=

α

, xk+1

=

xk

+

α

kdk, go to Step 6. 5.3 Compute the SOC step.

Solve the subproblem

f

QP

(

x_k

)

to obtain the SOC stepd

˜

_kand definex

˜

_k+1

=

xk

+

dk

+ ˜

dk.

f

QP

(

xk

)











min g

(

xk

)

T

(

dk

+

d

) +

1 2

(

dk

+

d

)

T_B k

(

dk

+

d

)

s

.

t

.

Ai

(

xk

)

Td

+

ci

(

xk

+

dk

) =

0

,

i

∈

E Ai

(

xk

)

Td

+

ci

(

xk

+

dk

) ≤

0

,

i

∈

I (11)

5.3.1 IfRk

−

h

(

c

(

xk

+

dk

+ ˜

dk

)) ≥ αη

Rk, go to 5.3.2, else set

α =

t

α

, go to Step 5.

5.3.2 If1f

˜

k

≥

σ

1lk, where1f

˜

k

=

f

(

xk

) −

f

(˜

xk+1

)

, then set xk+1

= ˜

xk+1, go to Step 6, else set

α =

t

α

, go to Step 1.

Step 6 Compute Bk+1, set k

:=

k

+

1

,

go to Step 1.

Remark 1. In step 6, update Bk+1by the modified BFGS method [16], or the modified Broyden methods [18,19]. We call Step 5 the inner loop, and call Step 1–Step 6 the outer loop.

3. Global convergence

In this section, we prove the global convergence ofAlgorithm B. Firstly, we give some assumptions: A1

{

xk

}

,

{

xk

+

dk

+ ˜

dk

}

and

{

xk

+

α

dk

}

are contained in a compact and convex set S of Rnfor all

α ∈ (

0

,

1

]

. A2 The functions f

(

x

),

c

(

x

)

are twice continuously differentiable on S.

A3 The matrix Bkis bounded and uniformly positive definite for all k. And the Lagrange multiplier

λ

kis also bounded for

all k.

Remark 2. A consequence of assumption A3 is that there exist two scalars

δ >

0 and M

>

0, independent of k, such that

δk

y

k

2

_≤

_yT_B

ky

≤

M

k

y

k

2for all y

∈

Rn. By assumptions A1–A3, without loss of generality, we may also assume that

k

λ

_k

k

∞

≤

M,

k∇

2ci

(

x

)k ≤

M, i

∈

E

∪

I,

k∇

2f

(

x

)k ≤

M, x

∈

S. Lemma 1. Suppose assumptions A1–A3 hold, then for any x_k

∈

S,

h

(

c

(

xk

+

α

dk

)) ≤ (

1

−

α)

h

(

c

(

xk

)) +

1 2

α

2_mM

_k

_d k

k

2

,

(12) and

|

1fk

−

α

1lk

| ≤

1 2

α

2_M

_k

_d k

k

2

.

(13)

Proof. From the Taylor Expansion Theorem,(4)and Assumption A3, we have that for all i

∈

E

∪

I,

ci

(

xk

+

α

dk

) = (

1

−

α)

ci

(

xk

) + α

ci

(

xk

) + α∇

ci

(

xk

)

Tdk

+

1 2

α

2_dT k

∇

2_c i

(

yi

)

dk

≤

(

1

−

α)

ci

(

xk

) +

1 2

α

2_M

_k

_d k

k

2

,

where yidenotes some point on the line segment from xkto xk

+

α

dk. This, together with the definition of h

(

c

(

x

))

implies (12). Similarly, we can also prove that(13)is true.

Theorem 1. Suppose assumptions A1–A3 hold, then the inner loop terminates in a finite number of iterations.

Proof. If xkis a KKT point of problem (P), then dk

=

0 solves QP

(

xk

)

andAlgorithm Bterminates without going to the inner

loop. Otherwise, we suppose dk

6=

0. FromAlgorithm A, we have that h

(

ck

) ≤

Tk. It follows with(12)and(9)that

h

(

c

(

xk

+

α

dk

)) ≤ (

1

−

α)

h

(

ck

) +

1 2

α

2_mM

_k

_d k

k

2

≤

(

1

−

α)

Rk

+

1 2

α

2_mM

_k

_d k

k

2

.

(14)

(5)

Thus,(10)holds when

α ≤

2(1−η)Rk

mMkdkk2. If(5)does not hold, then the inner loop terminates byAlgorithm B. Otherwise, if(5)

holds, two situations need to be considered.

Case 1. If

4

fk

< σ α4

lkand

α =

1, then SOC stepx

˜

k+1is taken. Ifx

˜

k+1is accepted as the next iterate xk+1, then the inner loop terminates.

Case 2. If

α 6=

1, we obtain fromRemark 2and(5)that 1lk

≥

ξδk

dk

k

2

.

This together with(13)implies

1fk

−

α

1lk

α

1lk

≤

α2_M_k_d kk2 2

αξδk

dk

k

2

=

α

M 2

ξδ

.

(15) Therefore,1fk

≥

σ α

1lkwhen

α ≤

2δξ(1 −σ ) M . Define C0

=

2

δξ(

1

−

σ)

M

,

Ck,2

=

2

(

1

−

η)

Rk mM

k

dk

k

2

,

Ck

=

min

C0

,

Ck,2

.

(16)

By(15)and(16), the inner loop terminates whenever

α ≤

Ck.

Therefore, the inner iteration terminates finitely in all cases.

Lemma 2. InAlgorithm A, if the index k satisfies jk+1

=

jk

+

1, then for all k0

≥

k

,

h

(

ck0

) ≤

bjk

.

Furthermore, for any k, jk

≥

1,

h

(

ck

) ≤

2bjk

.

(17)

Proof. If jk+1

=

jk

+

1, it follows fromAlgorithm Aand(8)that h

(

ck

) <

min

{

η

1bjk

, η

2Nfk

}

, and bjk

≥

Tk

≥

Ml,k

.

Thus for k0

₌

_k

₋

_l

₊

₁

_{, . . . ,}

_{k, h}

₍

_c

k0

) ≤

bjk

.

We use the inductive method to prove that for all k0

_≥

_k

₋

_l

₊

₁

_,

_h

₍

_c

k0

) ≤

bjk.

For k0

₌

_k

₋

_l

₊

₁

_{, . . . ,}

_{k, we have already proved that the claim is true. Now we assume that h}

₍

_c

k0

) ≤

bjk holds for

k0

₌

_s

₋

_l

₊

₁

_{, . . . ,}

_s

_(≥

_k

₎

_{. Next, we only have to prove that for k}0

₌

_s

₊

_{1, it still holds.}

FromAlgorithm A, the definition of bjk, for s

≥

k,

Ts

≤

max

{

h

(

cs

),

bjs

} ≤

max

{

h

(

cs

),

bjk

} =

bjk

,

(18)

where the last equality holds because of the induction hypothesis. ByTheorem 1, there exists a step size

α >

0, such that

Rs

−

h

(

cs+1

) ≥ αη

Rs

>

0

.

This together with the induction hypothesis and the definition ofMl,kimplies

h

(

cs+1

) ≤

Rs

=

max

{

Ts

,

Ml,s

} ≤

bjk

.

Therefore, for any k0

≥

k, where jk+1

=

jk

+

1, h

(

ck0

) ≤

bjk.

As to(17), if for some k, jk+1

=

jk

+

1, then it follows fromAlgorithm Athat(17)holds.

Now, consider any k0with jk0

=

jk+1

,

k0

≥

k

+

1. According to the definition of bjkand the previous result, we have that

h

(

ck0

) ≤

bjk

=

bjk0−1

≤

2bjk0

,

jk0

≥

1

.

Therefore, for any k

(

jk0

≥

1

)

,(17)holds.

Lemma 3. IfRk

6=

max

{

h

(

ck

),

Ml,k

}

holds for infinite iterations, then jk

→ +∞

, and h

(

ck

) →

0

.

Proof. From the assumption of the lemma andAlgorithm A, there exists an infinite sequence

{

k0

_}

_{, such that}

T_k0

6=

h

(

c_k0

)

and T_k0

>

Ml,k0

.

Then

T_k0

=

min

{

bj_k0

,

Nfk0

}

>

h

(

ck0

)

and Tk0

≥

Ml,k0

.

ByAlgorithm Awe have that for any k0, jk0+1

=

jk0

+

1. Therefore, jk

→ +∞

, and bjk

=

b0

jk+1

→

0

.

This combining with(17)

implies h

(

ck

) →

0

.

(6)

Proof. For the purpose of contradiction, we assume that h

(

ck

)

9 0. ByLemma 3, there exists a sufficiently large integer

j1

>

0, such that for k

≥

j1,

Rk

=

max

{

h

(

ck

),

Ml,k

}

.

(19)

It implies with(8)that

Rk

=

max

{

h

(

ck

),

Ml,k

} =

max

k−l+1≤i≤kh

(

ci

).

(20)

According toAlgorithm B,(10)and(20), we have that











h

(

ck+1

) < (

1

−

α

k

η)

max k−l+1≤i≤kh

(

ci

)

h

(

ck+2

) < (

1

−

α

k+1

η)

max k−l+2≤i≤k+1 h

(

ci

)

· · ·

h

(

ck+l

) < (

1

−

α

k+l−1

η)

max k≤i≤k+l−1h

(

ci

)

(21) and for k

≥

j1, max k−l+1≤i≤kh

(

ci

) ≥

k−l+max2≤i≤k+1h

(

ci

) ≥ · · · ≥

k≤maxi≤k+l−1h

(

ci

).

(22) Since h

(

ck

)

9 0, it follows that there exists a positive constant

1

>

0 and an infinite subsequence

{

h

(

cki

)}

, such that

h

(

cki

) ≥

1for all ki

>

j1. Then, for any k

>

j1, there exists a positive integer i0, such that ki0

>

k. So it follows with(22)that max

k≤i≤k+l−1h

(

ci

) ≥

k_i0≤maxi≤k_i0+l−1h

(

ci

) ≥

1

.

(23) According to(16)and(20), we have that there exists a step size

α

min

>

0, such that

α

k

> α

minfor all k

>

j1. Thus, it follows with(21)and(22)that

max k≤i≤k+l−1h

(

ci

) ≤ (

1

−

α

_min

η)

b k−_k1 l c _max k1−l+1≤i≤k1 h

(

ci

)

(24)

where

b

a

c

denotes the maximal integer less than a. So we have that lim k→+∞k≤imax≤k+l−1h

(

ci

) =

0

.

It implies that lim i→+∞ h

(

c_k_i

) =

0

,

which contradicts the assumption that h

(

cki

) ≥

1for all ki

>

j1. The conclusion follows.

Lemma 5. Suppose assumptions A1–A3 hold and(5)is satisfied. Then

α

k

=

1 if Ck

≥

1 or xk+1

=

xk

+

dk

+ ˜

dk

;

≥

tCk if Ck

<

1

,

(25)

where t and Ckare fromAlgorithm BandTheorem 1.

Proof. If xk+1

=

xk

+

dk

+ ˜

dk, we easily obtain

α

k

=

1. From the proof ofTheorem 1, the inner loop terminates whenever

α ≤

Ck. Since(5)holds, it follows with Step 5 ofAlgorithm Bthat

α

k

=

1 if Ck

≥

1 and that

α

k

≥

tCkif Ck

<

1. Lemma 6. Suppose assumptions A1–A3 hold, then

k

dk

k =

O

(

Nfk

).

Proof. It follows from(4)that

dk

= −

B

−1

k

(

gk

+

Ak

λ

k

).

(26)

Since

{

Bk

}

is uniformly positive definite and uniformly bounded, we obtain that

{

B

−1

k

}

is also positive definite and bounded

for all k. Therefore

k

dk

k =

O

(k

gk

+

Ak

λ

k

) =

O

(

Nfk

)

.

Theorem 2. Suppose assumptions A1–A3 hold. Then one of the following two situations occurs:

(

i

)

Algorithm Bterminates at a KKT point of problem

(

P

)

.

(7)

Proof. We only need to consider the situation (ii). Since the inner loop is finite, we only need to consider that the outer loop is infinite.

Let K0

= {

k

|

gkTdk

> −ξ

dTkBkdkor h

(

ck

) > ζ

1

k

dk

k

ζ2

}

. There are two cases, depending on whether K0is finite or not. (1) K0is an infinite set.

Since

{

xk

}

k∈K0

⊂

S is bounded, there exists an accumulation point denoted by x, i.e., lim

k∈K1,k→+∞

xk

=

x

,

where K1

⊂

K0is an infinite index set. It follows withLemma 4that h

(

ck

) →

0

,

k

→ +∞

. Thus x is a feasible point of

problem (P).

If there exists a subset K1

⊂

K1, such that limk∈K1,k→+∞

k

dk

k =

0, then x is a KKT point. Otherwise there exists a constant

>

0, such that

k

dk

k

>

for all k

∈

K1. ByLemma 4, there exists a positive integer j1, such that for all k

>

j1

,

k

∈

K1,

h

(

ck

) ≤

min

(

1

−

ξ)

2

_δ

M

, ζ

1

k

dk

k

ζ2

.

(27) It implies that h

(

ck

) ≤

(

1

−

ξ)δk

dk

k

2 M

≤

(

1

−

ξ)

dT kBkdk M

.

(28)

According to(4),(27)and(28), we have that for all k

∈

K1

,

k

>

j1

,

g_kTdk

= −

dTAk

λ

k

−

dTkBkdk

=

λ

T_kck

−

dTkBkdk

≤ k

λ

_k

k

∞h

(

ck

) −

dTkBkdk

≤

Mh

(

ck

) −

dTkBkdk

≤ −

ξ

dT_kBkdk and h

(

ck

) ≤ ζ

1

k

dk

k

ζ2

.

That is a contradiction with the definition of K0. So x is a KKT point. (2) K0is a finite set.

It implies that there exists a positive integer j2

>

0, such that(5)holds for any k

>

j2. It follows withTheorem 1that for any k

>

j2, there exists an iterate xk+1

=

xk

+

α

kdkor xk+1

=

xk

+

dk

+ ˜

dk, such that

f

(

xk

) −

f

(

xk+1

) ≥ σα

k1l

≥

ξσα

kdTkBkdk

≥

ξσα

k

δk

dk

k

2

.

(29) Denote K2

= {

k

|

α

k

=

1

,

k

>

j2

}

,

K3,1

= {

k

|

Ck

<

1 and C0

<

Ck,2

,

k

>

j2

}

,

K3,2

= {

k

|

Ck

<

1 and C 0

_≥

Ck,2

,

k

>

j2

}

.

According to(9)and(29)andLemma 5, we have that

X

k∈K2

(

f

(

xk

) −

f

(

xk+1

)) ≥

X

k∈K2

ξσδk

dk

k

2

,

(30)

X

k∈K3,1

(

f

(

xk

) −

f

(

xk+1

)) ≥

X

k∈K3,1 tC0

ξσ δk

dk

k

2

,

(31)

X

k∈K3,2

(

f

(

xk

) −

f

(

xk+1

)) ≥

X

k∈K3,2 tCk,2

ξσ δk

dk

k

2

=

X

k∈K3,2 t

ξσ δk

dk

k

2 2

(

1

−

η)

Rk mM

k

dk

k

2

,

≥

X

k∈K3,2 2

(

1

−

η)

t

ξσδ

Tk mM

,

≥

X

k∈K3,2 2

(

1

−

η)

t

ξσδ

min

{

bjk

,

Nfk

}

mM

.

(32)

(8)

By assumptions A1–A3, we have

+∞

>

+∞

X

k=j2+1

(

f

(

xk

) −

f

(

xk+1

)) =

X

k∈K2

(

f

(

xk

) −

f

(

xk+1

)) +

X

k∈K3,1

(

f

(

xk

) −

f

(

xk+1

)) +

X

k∈K3,2

(

f

(

xk

) −

f

(

xk+1

)).

(33) Since

P

+∞

k=1bjk

= +∞

, it follows with(30),(31),(32)andLemma 6that

X

k→+∞

k

dk

k

2

< +∞.

It implies that

k

dk

k →

0

,

k

→ +∞

.

The above two cases imply that the theorem holds. 4. Local convergence

In this section, we prove the local convergence ofAlgorithm B. Let x∗_{be a local minimizer. We also need the following}

assumptions:

A4 The sequence

{

xk

}

converges to x∗. And the sequence

{

Bk

}

converges to B∗. There exist constantsL

¯

>

0 and 1

≤

γ <

2,

such that

k

dk

k

γ

≤ ¯

Lbjk.

A5 The point x∗ _{is the KKT point of problem (P). The vectors}

_{∇

_c

i

(

x∗

),

i

∈

J∗

}

are linearly independent. The strict

complementarity condition holds. Denote the matrix Ac

(

x∗

) = {∇

ci

(

x∗

),

i

∈

J∗

}

, where J∗

=

E

S{

i

∈

I

:

ci

(

x∗

) =

0

}

. A6 The Hessian matrix

∇

_xxL

(

x∗

_{, λ}

∗

₎

_{is positive definite on the null space of A}c

₍

_x∗

₎

T_{, i.e., there is a constant}

_{% >}

_{0, such} that

dT

∇

_xxL

(

x∗

, λ

∗

)

d

≥

%k

d

k

2

for any d

6=

0 with Ac

₍

_x∗

₎

T_d

₌

_{0, where}

_λ

∗_{is the multiplier associated with x}∗_.

A7 Let Pk

=

I

−

Ac

(

xk

)[

Ac

(

xk

)

TAc

(

xk

)]

−1Ac

(

xk

)

Tand lim k→+∞

k

Pk

(

Bk

− ∇

xxL

(

x∗

, λ

∗

))

dk

k

dk

k

=

0

,

where I is an identity matrix with appropriate size.

Remark 3. The assumption

k

dk

k

γ

≤ ¯

Lbjkis important to the proof of the local convergence ofAlgorithm B. From the global

convergence analysis ofAlgorithm B, we know that the sequence

{

xk

}

converges to an optimal point. So dkand Nfkwith the

same order eventually converge to zero, which is proved byLemmas 6and7. FromAlgorithm B, we know that the role of the sequence

{

bjk

}

is to relax the infeasibility. Even if bjkconverges to zero, it will not decrease to zero too fast.

Theoretical speaking, since dkand Nfkconverge to zero with the same order simultaneously as xkconverges to an optimal

point, and either NfkorMl,krelaxes the infeasibility after sufficiently large k, the sequence

{

bjk

}

may not decrease after some

iterations from the mechanism ofAlgorithm A.

Furthermore, practical speaking, when xksufficiently approaches to an optimal point, the sequence

{

bjk

}

never decreases,

which is confirmed by the good numerical results in Section5. Thus, assumption A4 is reasonable. Lemma 7. Suppose assumptions A1–A7 hold, then dk

→

0

,

k

→ +∞

.

Proof. It follows from(4)and assumptions A1–A7 that gk

→

g

(

x∗

)

, Ak

→

A

(

x∗

)

,

λ

k

→

λ

∗, Bk

→

B∗

,

k

→ +∞

. Without

loss of generality, let dk

→

d∗, k

→ +∞

. assumptions A3–A4 imply that B∗is positive definite. Since x∗is the KKT point of

problem (P), it follows that g

(

x∗

_{) +}

_A

₍

_x∗

_)λ

∗

₌

_{0, B}∗_d∗

₌

_{0. Thus, d}∗

₌

_{0, i.e., d}

k

→

0

,

k

→ +∞

. Lemma 8. Suppose assumptions A1–A7 hold. Then there exists a neighborhood U1of x∗, such that for xk

∈

U1,

˜

dk

=

O

(k

dk

k

2

),

(34)

c_i

(

x_k

+

d_k

+ ˜

d_k

) =

o

(k

d_k

k

2

),

i

∈

J∗

.

(35) Proof. It follows from [20, Proposition 3.6] that d

˜

k

=

O

(k

dk

k

2

)

. According to assumptions A1–A7, there exists a

neighborhood U1of x∗, such that for all xk

∈

U1, QP

(

xk

)

is equivalent to the following subproblem

EQP

(

xk

)

(

min g

(

xk

)

Td

+

1 2d T_B kd s

.

t

.

∇

ci

(

xk

)

Td

+

ci

(

xk

) =

0

,

i

∈

J ∗

_.

(9)

From the Taylor Expansion Theorem and the feasibility ofQP

f

(

x_k

)

, we have

ci

(

xk

+

dk

+ ˜

dk

) =

ci

(

xk

+

dk

) + ∇

cTi

(

xk

+

dk

) ˜

dk

+

O

(k ˜

dk

k

2

)

=

o

(k

dk

k

2

),

for all i

∈

J∗.

In order to prove the local convergence ofAlgorithm B, we need two conclusions in [3], which are applications of the SOC steps on the exact penalty function. We introduce a penalty function

Φρ

(

x

) =

f

(

x

) + ρ

h

(

c

(

x

))

(36)

and its quadratic approximation

q_ρ

(

xk

,

d

) =

f

(

xk

) +

gkTd

+

1 2d T_B kd

+

ρ

"

X

i∈E

|

ci

(

xk

) + ∇

ci

(

xk

)

Td

| +

X

i∈I max

{

ci

(

xk

) + ∇

ci

(

xk

)

Td

,

0

}

#

(37) just for proof. The following two lemmas from [3] are important to our proof.

Lemma 9 ([3, Theorem 15.3.7]). Suppose assumptions A1–A7 hold. The exact penalty functionΦ_ρand q_ρare described by(36) and(37), where

ρ > kλ

∗

_k

∞, then lim k→∞ Φρ

(

xk

) −

Φρ

(

xk

+

dk

+ ˜

dk

)

q_ρ

(

xk

,

0

) −

qρ

(

xk

,

dk

)

=

1

.

(38)

Lemma 10 ([3, Theorem 15.3.2]). Suppose assumptions A1–A7 hold. Let dkbe the solution of QP

(

xk

)

,

λ

kis the associated

multiplier, and

ρ > kλ

∗

_k

∞, then

q_ρ

(

xk

,

0

) −

qρ

(

xk

,

dk

) ≥

0

.

(39) Lemma 11. Suppose assumptions A1–A7 hold, then there exists a neighborhood U₂

(⊂

U1

)

of x∗, such that for xk

∈

U2, 1fk

≥

σ

1lkor1f

˜

k

≥

σ

1lkwhenever(5)holds.

Proof. If1fk

≥

σ

1lk, then the theorem holds. Otherwise the SOC step is computed. We only need to prove that1f

˜

k

≥

σα

1lkholds. Since(5)holds, it follows

h

(

ck

) =

o

(k

dk

k

2

).

(40)

ByLemmas 8and9, for all sufficiently large k, Φρ

(

xk

) −

Φρ

(

xk

+

dk

+ ˜

dk

) ≥

1 2

+

σ

(

q_ρ

(

xk

,

0

) −

qρ

(

xk

,

dk

)),

(41)

where

ρ > kλ

k

∞, independent of k. Then

f

(

xk

) −

f

(

xk

+

dk

+ ˜

dk

) =

Φρ

(

xk

) −

Φρ

(

xk

+

dk

+ ˜

dk

) − ρ(

h

(

c

(

xk

)) −

h

(

c

(

xk

+

dk

+ ˜

dk

)))

(by(36))

≥

1 2

+

σ

(

q_ρ

(

xk

,

0

) −

qρ

(

xk

,

dk

)) +

o

(k

dk

k

2

)

(by(35),(40)and(41))

= −

1 2

+

σ

(

g_kTdk

+

1 2d T kBkdk

) +

o

(k

dk

k

2

)

(by(35)and(37)) (42) and f

(

xk

) + σ

gkTdk

−

f

(

xk

+

dk

+ ˜

dk

) ≥ −

1 2g T kdk

−

1 4

+

σ

2

dT_kBkdk

+

o

(k

dk

k

2

)

(by(42))

=

1 2

(

d T kBkdk

−

c

(

xk

)

T

λ

k

) −

1 4

+

σ

2

dT_kBkdk

+

o

(k

dk

k

2

)

(by(4))

≥

1 4

−

σ

2

dT_kBkdk

− k

λ

k

∞h

(

c

(

xk

)) +

o

(k

dk

k

2

)

(by(4))

=

1 4

−

σ

2

dT_kBkdk

+

o

(k

dk

k

2

),

(by the boundedness of

k

λ

k

∞)

.

(43)

(10)

Table 1 Numerical results.

Problem n m Algorithm B LANCELOT

NIT NF NG NC NA NIT NF/NC NG/NA

HS002 2 1 13 24 14 21 14 4 4 5 HS004 2 2 1 2 2 2 2 1 1 2 HS008 2 2 5 6 6 6 6 8 8 8 HS010 2 1 11 12 12 12 12 27 27 22 HS022 2 2 1 2 2 2 2 12 12 13 HS026 3 1 31 46 32 46 32 32 32 32 HS029 3 1 13 18 14 18 14 43 43 33 HS035 3 4 5 8 6 7 6 9 9 10 HS038 4 8 63 101 81 101 81 45 45 42 HS047 5 3 21 23 22 23 22 23 23 23 HS049 5 2 17 24 18 21 18 23 23 24 HS065 3 7 8 10 9 10 9 43 43 39 HS067 3 20 22 23 23 23 23 69 69 64 HS072 4 10 25 26 26 26 26 110 110 111 HS088 2 1 21 24 22 24 22 93 93 78 HS090 4 1 25 27 26 27 26 97 97 79 HS092 6 1 32 34 33 34 33 102 102 86 HS100 7 4 18 44 19 44 19 31 31 30 HS101 7 20 53 73 54 77 5 – – – HS113 10 8 14 22 15 22 15 48 48 43

Theorem 3. Suppose assumptions A1–A7 hold, then for all sufficiently large k, either xk

+

dkor xk

+

dk

+ ˜

dkis accepted and the

sequence

{

xk

}

converges to x∗superlinearly.

Proof. If x_kis not a KKT point of problem (P), then dk

6=

0. By the mechanism ofAlgorithm A, two cases may occur.

(1) If h

(

c

(

xk

)) <

min

{

η

1

,

bjk

η

2Nfk

}

, Tk

=

min

{

bjk

,

Nfk

} ≥

min

{

η

1bjk

, η

2Nfk

}

.

(2) If h

(

c

(

xk

)) ≥

min

{

η

1bjk

, η

2Nfk

}

, Tk

=

h

(

ck

) ≥

min

{

η

1bjk

, η

2Nfk

}

.

Therefore, Tk

≥

min

{

η

1bjk

, η

2Nfk

}

,

(44)

holds in both cases. From A4 andLemma 6, we have

k

dk

k

γ

≤ ¯

Lbjk

,

Nfk

=

O

(k

dk

k

),

1

≤

γ <

2

.

(45)

Combining this with(35), we have h

(

c

(

xk

+

dk

+ ˜

dk

)) ≤ (

1

−

η)

Tk

≤

(

1

−

η)

Rkfor all k

>

k1, where k1

>

0 is some sufficiently large integer. Now for k

>

k1, if(5)does not hold, then either xk

+

dkor xk

+

dk

+ ˜

dkis accepted as a new iterate

byAlgorithm B. If(5)holds, byLemma 11, it follows1fk

≥

σα

1lkor1f

˜

k

≥

σ α

1lk. So xk

+

dk

+ ˜

dkis accepted as a new

iterate. Therefore, by [9, Theorem 5.2], the sequence

{

xk

}

converges to x∗superlinearly.

5. Numerical results

In this section, we give some numerical results for a set of problems from Hock and Schittkowski’s problems [11]. The results are obtained by a preliminary Matlab implementation ofAlgorithm B. At each iteration, we use matlab command

quadprog

to solve the QP subproblem. Details about the implementation are described as follows. (a) Termination criteria.Algorithm Bstops if h

(

ck

) ≤ ε

√

m and Nfk

≤

ε

√

n.

(b) Update Bk. For each test problems, we chose B0

=

I, where I denotes the identity matrix with an appropriate size, as the initial guess of the Lagrangian Hessian. At each step, the matrix Bkwas updated by the BFGS formula from Powell’s

modifications [16]. Specifically, we set

Bk+1

=

Bk

−

BksksTkBk sT kBksk

+

yky T k sT kyk

,

where yk

=

˜

yk

,

y

˜

Tksk

≥

0

.

2sTkBksk

,

θ

k

_˜

_y k

+

(

1

−

θ

k

)

Bksk

,

otherwise

,

(11)

Table 2 Numerical results.

Problem n m Algorithm B SNOPT

HS001 2 1 38 56 39 48 39 71 49 48 HS002 2 1 13 24 14 21 14 18 15 14 HS003 2 1 9 10 10 10 10 2 2 2 HS004 2 2 1 2 2 2 2 2 3 2 HS005 2 4 7 12 8 10 8 8 9 8 HS006 2 1 9 13 10 13 10 5 8 7 HS007 2 1 11 12 12 12 12 18 31 30 HS008 2 2 5 6 6 6 6 2 7 6 HS009 2 1 6 7 7 7 7 9 9 8 HS010 2 1 11 12 12 12 12 20 32 31 HS011 2 1 7 8 8 8 8 11 18 17 HS012 2 1 9 10 10 10 10 12 12 11 HS013 2 3 7 8 8 8 8 1 7 6 HS014 2 2 5 6 6 6 6 2 10 9 HS015 2 3 2 3 3 3 3 5 11 10 HS016 2 5 4 5 5 5 5 1 5 4 HS017 2 5 6 9 7 8 7 13 19 18 HS018 2 6 8 9 9 9 9 19 31 30 HS019 2 6 5 6 6 6 6 2 9 8 HS020 2 5 3 4 4 4 4 1 5 4 HS021 2 5 1 4 2 3 2 1 1 1 HS022 2 2 1 2 2 2 2 2 6 5 HS023 2 9 5 6 6 6 6 1 7 6 HS024 2 5 4 5 5 5 5 4 6 5 HS025 3 6 0 1 1 1 1 0 2 1 HS026 3 1 31 46 32 46 32 25 25 24 HS027 3 1 18 26 19 28 19 22 24 23 HS028 3 1 3 6 4 5 4 4 4 4 HS029 3 1 13 18 14 18 14 17 19 18 HS030 3 7 11 12 12 12 12 7 5 4 HS031 3 7 9 17 10 17 10 9 11 10 HS032 3 5 2 3 3 3 3 5 5 4 HS033 3 6 3 4 4 4 4 3 9 8 HS034 3 8 7 8 8 8 8 5 8 7 HS035 3 4 5 8 6 7 6 5 5 5 HS036 3 7 1 2 2 2 2 10 9 8 HS037 3 8 9 17 10 15 10 12 10 11 HS038 4 8 63 101 81 101 81 160 119 118 HS039 4 2 12 13 13 13 13 20 31 30 HS040 4 3 6 7 7 7 7 7 8 7 HS041 4 9 7 8 8 8 8 12 9 8 HS042 4 2 8 11 9 11 9 8 9 8 HS043 4 3 11 15 12 15 12 14 10 9 HS044 4 10 5 6 6 6 6 12 12 12

HS045 5 10 7 8 8 8 8 Fail Fail Fail

HS046 5 2 34 40 35 40 35 28 27 26

HS047 5 3 21 23 22 23 22 24 32 31

HS048 5 2 3 10 4 7 4 6 6 6

HS049 5 2 17 24 18 21 18 Fail Fail Fail

HS050 5 3 11 14 12 13 12 33 22 21 HS051 5 3 2 7 3 5 3 6 6 6 HS052 5 3 6 10 7 10 7 5 5 5 HS053 5 13 8 10 9 10 9 2 2 2 HS054 6 13 1 2 2 2 2 5 5 5 HS055 6 14 1 2 2 2 2 3 3 3 HS056 7 4 16 22 17 22 17 13 15 14 HS057 2 3 15 23 16 22 16 5 6 5 HS059 2 7 16 22 17 20 17 20 20 19 HS060 3 7 7 11 8 11 8 11 13 12 HS061 3 2 9 16 10 16 10 17 24 23 HS062 3 7 9 18 10 14 10 13 16 15 HS063 3 5 8 9 9 10 9 13 14 13 HS064 3 4 42 64 43 65 43 33 26 25 HS065 3 7 8 10 9 10 9 15 11 10 HS066 3 8 7 8 8 8 8 7 6 5 HS067 3 20 22 23 23 23 23 38 28 27

HS068 4 10 37 50 38 50 38 Fail Fail Fail

HS069 4 10 12 18 13 18 13 Fail Fail Fail

(12)

Table 2 (continued)

Problem n m Algorithm B SNOPT

HS070 4 9 34 37 35 36 35 14 12 11 HS071 4 10 5 6 6 6 6 9 8 7 HS072 4 10 25 26 26 26 26 36 41 40 HS073 4 7 4 5 5 5 5 11 8 7 HS074 4 13 11 12 12 12 12 12 15 14 HS075 4 13 8 9 9 9 9 5 12 11 HS076 4 7 6 7 7 7 7 4 4 4 HS077 5 2 14 19 15 19 15 15 15 14 HS078 5 3 8 9 9 9 9 9 8 7 HS079 5 3 9 12 10 12 10 14 15 14 HS080 5 13 6 7 7 7 7 14 22 21 HS081 5 13 7 8 8 8 8 10 10 9 HS083 5 16 4 5 5 5 5 9 8 7 HS084 5 16 8 9 9 9 9 11 10 9 HS085 5 48 35 50 36 50 36 17 17 16 HS086 5 15 5 8 6 7 6 23 23 23 HS088 2 1 21 24 22 24 22 32 55 54 HS089 3 1 27 33 28 33 28 41 47 46

HS090 4 1 25 27 26 27 26 Fail Fail Fail

HS091 5 1 36 42 37 45 37 37 62 61 HS092 6 1 32 34 33 34 33 37 54 53 HS093 6 8 14 20 15 19 15 41 34 33 HS095 6 16 1 2 2 2 2 1 3 2 HS096 6 16 1 2 2 2 2 1 2 3 HS097 6 16 6 7 7 7 7 13 37 36 HS098 6 16 6 7 7 7 7 13 37 36 HS099 7 16 15 35 16 38 16 32 23 22 HS100 7 4 18 44 19 44 19 21 17 16 HS101 7 20 53 73 54 77 5 165 463 462 HS102 7 20 46 59 47 63 47 117 323 322 HS103 7 20 31 40 32 43 32 84 179 178 HS104 8 22 17 18 18 18 18 38 25 24 HS105 8 17 49 50 50 50 50 127 107 106 HS106 8 22 35 37 36 37 36 32 14 13 HS107 9 14 7 10 8 10 8 22 14 13 HS108 9 14 12 13 13 13 13 36 14 13 HS109 9 26 27 28 28 28 28 39 29 28 HS110 10 20 7 12 8 10 8 23 8 7 HS111 10 23 77 81 78 81 78 60 78 77 HS112 10 13 46 87 47 72 47 52 30 29 HS113 10 8 14 22 15 22 15 37 19 18 HS114 10 31 29 30 30 30 30 59 42 41 HS116 13 41 44 45 45 45 45 91 28 27 HS117 15 20 17 18 18 18 18 49 19 18 HS118 15 59 18 19 19 19 19 13 13 13 HS119 16 40 7 8 8 8 8 63 17 16 and







sk

=

xk+1

−

xk

,

˜

yk

= ∇

f

(

xk+1

) − ∇

f

(

xk

) + (

Ak+1

−

Ak

) λ

k

,

θ

k

₌

₀

_.

_8sT kBksk

/(

sTkBksk

−

sTky

˜

k

).

(c) Compute Tk. Let b0

=

min

{

0

.

1 max

(

1

,

h

(

c0

)),

Nf0

+

h

(

c0

)},

bj

=

_j+b01

,

j

≥

1. If h

(

ck

)

Nfk, then set Tk

=

min

{

bjk

,

Nfk

}

,

else set Tk

=

h

(

ck

)

.

(d) The parameters are chosen as follows:

σ =

0

.

1

,

l

=

5

,

η =

0

.

1

,

ζ

1

=

1

,

ζ

2

=

2

.

2

, ε =

10−6

,

t

=

0

.

6

,

η

1

=

η

2

=

0

.

2

.

In the tables :

Problem: the problem number given in [11],

n: the number of variables, m: the number of constraints,

NIT

=

the number of iterations,

NF

=

the number of evaluations for f

(

x

)

, NG

=

the number of evaluations for

∇

f

(

x

)

,

(13)

NC

=

the number of evaluations for c

(

x

)

, NA

=

the number of evaluations for

∇

c

(

x

)

.

‘–’ denotes the iteration number is greater than 1000.

‘Fail’ denotes some errors occur when the solver solves the problem.

Since we use a quasi-Newton method to update Bk inAlgorithm B, the second order information of the Lagrangian

function is not used. Therefore, it is not proper for us to compareAlgorithm Bwith the algorithm in [6]. Therefore, for comparison, we include the corresponding results obtained by the well-known optimization solvers LANCELOT [2] (column ‘‘LANCELOT’’ inTable 1). In LANCELOT solver, (1) The exact Cauchy step is computed; (2) Accurate solution of quadratic problems with bound constraints (BQP) is computed; (3) Bandsolver preconditioned Conjugate Gradient (CG) is used (semi-bandwidth

=

5); (4) Infinity-norm trust region is used; (6) SR.1 approximation to second derivatives is used.

FromTable 1, we can see that our algorithm is better than LANCELOT in these problems.

Furthermore, for more comparison, we also compareAlgorithm Bwith SNOPT [10] (column ‘‘SNOPT’’ inTable 2). We run on a NEOS Server [25] with default options, where feasibility tolerance and optimality tolerance are 10−6_{. A total of 114} problems are selected from [11]. HS087 is the only problem left out. Since it is a discontinuous optimization problem, it is not suitable for our algorithm.

The numerical results on these problems are summarized inTable 2. The initial point of HS025 is a stationary point, then our algorithm terminates at the first iteration which is an undesired result because of its non-minimal property. FromTable 2, we find thatAlgorithm Bsucceeds in solving all the test problems, and for all these problems the number of iterations is small. FromTable 2,Algorithm Bis more efficient for 89 problems in terms of NIT, 74 problems in terms of NF and 86 problems in terms of NG.

Therefore, all the computational results illustrate that our algorithm is competitive with those in [2,10].

As expected from our local convergence theory, we also observed a transition to fast local convergence for all problems tested, i.e., unit steps are accepted in last several iterations for all problems tested here. The results indicate that our new non-monotone line search method without a penalty function is an interesting and competitive alternative to algorithms with penalty functions and deserves further consideration. Considering the lower memory requirement, it is also competitive with filter methods. So, numerical tests confirm the robustness and efficiency of our approach.

References

[1] C. Audet, J.E. Dennis, A pattern search filter method for nonlinear programming without derivatives, SIAM J. Optim. 14 (2004) 980–1010.

[2] A.R. Conn, N.I.M. Gould, Ph.L. Toint, LANCELOT: A fortran package for large-scale nonlinear optimization, in: Numerical Optimization (Release A), in: Springer Ser. Comput. Math., Springer-Verlag, Berlin, 1992.

[3] A.R. Conn, N.I.M. Gould, Ph.L. Toint, Trust Region Methods, SIAM, Philadelphia, PA, USA, 2000.

[4] H.Y. Benson, D.F. Shanno, R. Vanderbei, Interion-point methods for nonconvex nonlinear programming jamming and numercical test, Math. Program. 99 (2004) 35–48.

[5] R. Fletcher, S. Leyffer, A bundle filter method for nonsmooth nonlinear optimization, Technical Report NA/195, Department of Mathematics, University of Dundee, Scotland, December 1999.

[6] R. Fletcher, S. Leyffer, Nonlinear programming without a penalty function, Math. Program. 91 (2002) 239–269.

[7] R. Fletcher, N.I.M. Gould, S. Leyffer, Ph.L. Toint, A. Wächter, Global convergence of a trust region SQP-filter algorithms for general nonlinear programming, SIAM J. Optim. 13 (2002) 635–659.

[8] R. Fletcher, S. Leyffer, Ph.L. Toint, On the global convergence of a filter-SQP algorithm, SIAM J. Optim. 13 (2002) 44–59.

[9] F. Facchinei, S. Lucidi, Quadratically and superlinearly convergent algorithm for the solution of inequality constrained minimization problems, J. Optim. Theory Appl. 85 (1995) 265–289.

[10] Ph.E. Gill, W. Murray, M.A. Saunders, SNOPT: An SQP algorithm for large-scale constrained optimization, SIAM Rev. 47 (2005) 99–131. [11] W. Hock, K. Schittkowski, Test Examples for Nonlinear Programming Codes, Springer-Verlag, 1981.

[12] S.P. Han, Superlinearly convergent variable metric algorithms for general nonlinear programming problems, Math. Program. 11 (1976) 263–282. [13] S.P. Han, A globally convergent method for nonlinear programming, J. Optim. Theory Appl. 22 (1977) 297–309.

[14] N. Maratos, Exact penalty function algorithms for finite dimensional and control optimization problems, Ph.D. Thesis, University of London, London, UK, 1978.

[15] P.Y. Nie, A filter method for solving nonlinear complementarity problems, Appl. Math. Comput. 167 (2005) 677–694.

[16] M.J.D. Powell, A fast algorithm for nonlinearly constrained optimization calculations, in: Numerical Analysis, Proceedings, Biennial conference, Dundee 1977, in: G.A. Waston (Ed.), Lecture Notes in Math., vol. 630, Springer-Verlag, Berlin, New York, 1978, pp. 144–157.

[17] M.J.D. Powell, Variable metric methods for constrained optimization, in: A. Bachem, M. Grotschel, B. Korte (Eds.), Math. Programming: The State of Art, Bonn, 1982.

[18] D. Pu, W. Tian, A class of modified Broyden algorithms, J. Comput. Math. 8 (1994) 366–379.

[19] D. Pu, W. Tian, A class of modified Broyden algorithms without exact line search, Appl. Math. - A J. Chinese Univ. 10 (1995) 313–322.

[20] E.R. Panier, A.L. Tits, Avoiding the Maratos effect by means of a nonmonotone line search I: General constrained problems, SIAM J. Numer. Anal. 28 (1991) 1183–1195.

[21] M. Ulbrich, S. Ulbrich, Nonmonotone trust-region methods for nonlinear equality constrained optimization without a penalty function, Math. Program. 95 (2003) 103–135.

[22] M. Ulbrich, S. Ulbrich, L.N. Vicente, A globally convergent primal-dual interior filter method for nonconvex nonlinear programming, Math. Program. 100 (2004) 379–410.

[23] A. Wächter, L.T. Biegler, Line search filter methods for nonlinear programming: Motivation and global convergence, SIAM J. Optim. 16 (2005) 1–31. [24] A. Wächter, L.T. Biegler, Line search filter methods for nonlinear programming: Local convergence, SIAM J. Optim. 16 (2005) 32–48.