Direct optimization algorithm user guide

(1)

DanielE. Finkel

Center for Researchin Scientic Computation

North Carolina StateUniversity

Raleigh, NC27695-8205

[email protected]

March 2,2003

Abstract

ThepurposeofthisbriefuserguideistointroducethereadertotheDIRECT

optimiza-tion algorithm, describe thetype ofproblems it solves, howto use the accompanying

MATLABprogram, direct.m,andprovideasynopis ofhowitsearchesfortheglobal

minium. An example of DIRECT being used on a test problem is provided, and the

motiviation for the algorithm isalso discussed. The Appendix providesformulas and

dataforthe7testfunctionsthat wereusedin [3].

1 Introduction

TheDIRECToptimizationalgorithmwasrstintroducedin[3 ],motivatedbyamodication

to Lipschitzian optimization. It was created in order to solve diÆcult global optimization

problemswithboundconstraintsandareal-valuedobjectivefunction. Moreformallystated,

DIRECTattemptsto solve:

Problem 1.1 (P) Let a;b 2 R N

, = n

x2R N

:a

i x

i b

i o

, and f : ! R be

Lipschitz continuouswith constant . Find x

opt

2 such that

f

opt =f(x

opt )f

+; (1)

where is a given small positiveconstant.

DIRECT is a sampling algorithm. That is, it requires no knowledge of the objective

function gradient. Instead, the algorithm samples points in the domain, and uses the

information it has obtained to decide where to search next. A global search algorithm

like DIRECT can be very useful when the objective function is a "black box" function or

simulation. An example of DIRECTbeing usedto try and solve a large industrialproblem

can be foundin[1].

TheDIRECTalgorithmwillgloballyconverge totheminimalvalueoftheobjective

func-tion [3 ]. Unfortunately, this global convergence may come at the expense of a large and

exhaustive search over the domain. The name DIRECT comes from the shortening of the

phrase"DIviding RECTangles", which describestheway thealgorithmmovestowards the

(2)

its class[3 ]. The strengths of DIRECTliein thebalanced eortit givesto local and global

searches, andthe fewparameters itrequires to run.

The accompanyingMATLABcode forthisdocumentcan be foundat:

http://www4.ncsu.edu/ definkel/research/index.html:

Any comments, questions, or suggestions should be directed to Dan Finkel at the email

addressabove.

Section 2 of this guide is a reference for using the MATLAB function direct.m. An

exampleof DIRECTbeingusedtosolveatestproblem isprovided. Section3introducesthe

readerto theDIRECTalgorithm,and describeshow itmoves towards theglobaloptimum.

2 How to use direct.m

Inthissection,wediscusshowtoimplementtheMATLABcodedirect.m,andalsoprovide

anexampleof howto rundirect.m.

2.1 The program direct.m

Thefunctiondirect.mis a MATLAB programthat can becalled bythesequence:

[minval,xatmin,hist] = Direct('myfcn',bounds,opts);:

Without theoptional arguments, thefunctioncan becalled

minval = Direct('myfcn',bounds);:

Theinputarguments are:

'myfcn' - Afunctionhandletheobjective functionthatone wishestobeminimized.

bounds - Ann2 vector which describesthe domainofthe problem.

bounds(i,1)x

i

bounds(i,2):

opts- An optional vector argument to customize theoptionsfortheprogram.

{ opts(1) - Jones factor. SeeDenition3.2. The defaultvalue is.0001.

{ opts(2) - Maximum numberof functionevaluations. Thedefaultis 20.

{ opts(3) - Maximum numberof iterations. The defaultvalueis 10.

{ opts(4) - Maximum numberof rectangle divisions. The defaultvalueis 100.

{ opts(5) - This parameter should be set to 0 if the global minimum is known,

and 1 otherwise. The defaultvalueis 1.

{ opts(6) -Theglobalminimum,ifknown. Thisparameterisignoredifopts(5)

is set to 1.

(3)

hist-Anoptionalargument,histreturnsanarrayofiterationhistorywhichisuseful

for plotsand tables. The three columnsreturned are iteration,functionevaluations,

and minimalvaluefound.

The program termination criteria diers depending on the value of the opts(5). If

the value is set to 1, DIRECT will terminate as soon asit exceeds its budget of iterations,

rectangle divisions,or functionevaluations. The functionevaluation budget is a tentative

value, since DIRECT may exceed this value by a slight amount if it is in the middle of an

iterationwhen thebudgethas beenexhausted.

If opts(5)issetto0,thenDIRECTwillterminatewhentheminimumvalueithasfound

iswithin0.01% of thevalueprovided inopts(6).

2.2 Example of DIRECT

Hereweprovideanexampleof DIRECTbeingusedontheGoldstein-Price(GP)testfunction

[2 ]. Thefunction isgiven bytheequation:

f(x

1 ;x

2

) = [1+(x

1 +x

2 +1)

2

(19 14x

1 +3x

2

1 14x

2 +6x

1 x

2 +3x

2

2 )]

[30+(2x

1 3x

2 )

2

(18 32x

1 +12x

2

1 +48x

2 36x

1 x

2 +27x

2

)] (2)

ThedomainoftheGPfunctionis 2x

i

2,fori2f1;2g,andithasaglobalminimum

value off

min

=3. Figure 1 shows thefunctionplottedoverits domain.

−2

−1

0

1

2 −2

−1

0

1

2

0

2

4

6

8

10

12 x 10

5 x

1 x

₂

GP Function

Figure 1: The Goldstein-Pricetest function.

A samplecallto DIRECTis providedbelow. Thefollowingcommandsweremade inthe

MATLAB CommandWindow.

>> opts = [1e-4 50 500 100 0 3];

>> bounds = [-2 2; -2 2];

(4)

iterations. Insteaditterminatesonceithascomewithin0.01%ofthevaluegiveninopts(6).

Theiteration historywouldlooklike:

>> hist

hist =

1.0000 5.0000 200.5487

2.0000 7.0000 200.5487

3.0000 13.0000 200.5487

4.0000 21.0000 8.9248

5.0000 27.0000 8.9248

6.0000 37.0000 3.6474

7.0000 49.0000 3.6474

8.0000 61.0000 3.0650

9.0000 79.0000 3.0650

10.0000 101.0000 3.0074

11.0000 123.0000 3.0074

12.0000 145.0000 3.0008

13.0000 163.0000 3.0008

14.0000 191.0000 3.0001

AusefulplotforevaluatingoptimizationalgorithmsisshowninFigure2. Thex-axisisthe

functioncount,and they-axis isthesmallestvalueDIRECThad found.

>> plot(hist(:,2),hist(:,3),'-*')

0

20

40

60

80

100

120

140

160

180

200

0

50

100

150

200

250 Function Count

f

min

(5)

Inthissection,weintroducethereaderto someofthetheorybehindtheDIRECTalgorithm.

For a more complete description, the reader is recommended to [3 ]. DIRECT was designed

toovercomesomeof theproblemsthatLipschitzianOptimizationencounters. We beginby

discussingLipschitzOptimizationmethods, andwhere these problemslie.

3.1 Lipschitz Optimization

RecallthedenitionofLipschitzcontinuity onR 1

:

Denition3.1 LipschitzLetM R 1

andf :M !R . Thefunctionf iscalledLipschitz

continuous on M withLipschitz constant if

jf(x) f(x 0

)jjx x 0

j 8x;x 0

2M: (3)

Ifthefunctionf isLipschitz continuouswith constant, thenwe can usethisinformation

to construct an iterative algorithm to seek the minimum of f. The Shubert algorithm is

one ofthemore straightforward applicationsofthisidea. [9 ].

Forthemoment,letusassumethatM =[a;b]R 1

. FromEquation 3,itiseasy to see

thatf mustsatisfythetwo inequalities

f(x) f(a) (x a) (4)

f(x) f(b)+(x b) (5)

foranyx 2M. The lines thatcorrespond to these two inequalitiesform aV-shape below

f,as Figure 3 shows. The point of intersection for thetwo lines is easy to calculate, and

a

x

₁

b

slope = −

α

slope = +

α

f(x)

Figure 3: Aninitial Lowerboundforf usingtheLipschitz Constant.

providesthe1stestimateoftheminimumoff. Shubert'salgorithmcontinuesbyperforming

thesame operation on the regions [a;x

1

] and [x

1

;b], dividing next the one with thelower

(6)

a

x

b

1 x

2 x

(a)

a x

1 x

2 b

x

3 x

(b)

Figure 4: TheShubertAlgorithm.

There are several problems with this type of algorithm. Since the idea of endpoints

does not translate well into higher dimensions, the algorithm does not have an intuitive

generalizationforN >1. Second,theLipschitzconstant frequentlycannotbedetermined

or reasonable estimated. Many simulationswith industrial applications may not even be

Lipschitz continous throughout theirdomains. Even if theLipschitz constant can be

esti-mated,apoorchoicecan leadtopoorresults. Iftheestimateistoolow, theresultmaynot

beaminimumoff,and ifthechoice istoolarge,theconvergenceof theShubertalgorithm

willbe slow.

The DIRECT algorithm was motivated by these two shortcomings of Lipschitzian

Op-timization. DIRECT samples at the midpoints of the search spaces, thereby removing any

confusion forhigher dimensions. DIRECTalso requires no knowledge of the Lipschitz

con-stant or for the objective function to even be Lipschitz continuous. It instead uses all

possible values to determine if a region of the domain should be broken into sub-regions

duringthe current iteration[3 ].

3.2 Initialization of DIRECT

DIRECTbegins the optimization by transforming the domain of the problem into theunit

hyper-cube. That is,

= n

x2R N

:0x

i 1

o

Thealgorithmworksinthisnormalizedspace,referringto theoriginalspaceonlywhen

makingfunctioncalls. The center ofthisspace isc

1

,and webeginbyndingf(c

1 ).

Ournextstepistodividethishyper-cube. Wedothisbyevaluating thefunctionat the

pointsc

1 Æe

i

,i=1;:::;n, whereÆ isone-thirdtheside-lengthofthehyper-cube,ande

i is

(7)

thereforewedene

w

i

=min(f(c

1 +Æe

i );f(c

1 Æe

i

)); 1iN

and dividethedimension withthe smallestw

i

into thirds, sothat c

1 Æe

i

arethe centers

of the new hyper-rectangles. This pattern is repeated for all dimensions on the "center

hyper-rectangle",choosingthenext dimensionbydeterminingthenext smallestw

i

. Figure

5showsthisprocess beingperformed onthe GPfunction.

3542.4

600.4

358.2 67207

200.6

Figure 5: DomainSpacefor GPfunctionafter initialization.

The algorithm now begins its loop of identifyingpotentially optimal hyper-rectangles,

dividingthese rectanglesappropriately,and samplingat theircenters.

3.3 PotentiallyOptimal Hyper-rectangles

Inthissection,wedescribethemethodthatDIRECTusesto determinewhichrectanglesare

potentially optimal, and should be divided in this iteration. DIRECT searches locally and

globallybydividingall hyper-rectanglesthat meet thecriteriaof Denition3.2.

Denition3.2 Let >0 be a positive constant and let f

min

be the current best function

value. A hyperrectangle j is said to be potentially optimal if there exists some ^

K >0 such

that

f(c

j )

^

Kd

j

f(c

i )

^

Kd

i

; 8i; and

f(c

j )

^

Kd

j

f

min jf

min j

Inthis denition,c

j

is the centerof hyper-rectangle j, and d

j

isa measure forthis

hyper-rectangle. Jones et. al. [3 ]chose to usethedistance fromc

j

to its vertices asthemeasure.

Others have modied DIRECT to use dierent measures for the size of the rectangles [6].

The parameter is used so that f(c

j

) exceeds our current best solution by a non-trivial

amount. Experimental data hasshown that, provided110 2

110 7

,the value

forhasa negligibleeect on thecalculations[3 ]. Agoodvalue foris110 4

.

A fewobservationsmaybemade from thisdenition:

If hyper-rectangleiispotentially optimal,then f(c

i

)f(c

j

) forall hyper-rectangles

that areof thesame sizeasi(i.e. d

i =d

(8)

i k i j that d i =d j

,then hyper-rectangleiispotentially optimal.

If d

i d

k

forall k hyper-rectangles,and iispotentially optimal,then f(c

i )=f

min .

An eÆcient way of implementing Denition 3.2 can be done by utilizingthe following

lemma:

Lemma3.3 Let>0beapositiveconstantandletf

min

bethe currentbestfunctionvalue.

Let I bethe set of all indices of all intervalsexisting. Let

I

1

= fi2I :d

i <d j g I 2

= fi2I :d

i >d j g I 3

= fi2I :d

i =d

j g:

Interval j2I ispotentially optimal if

f(c

j

)f(c

i

); 8i2I

3

; (6)

thereexists ^

K >0 such that

max

i2I

1 f(c

j

) f(c

i ) d j d i ^

K min

i2I

2 f(c

i

) f(c

j ) d i d j ; (7) and f min f(c j ) jf min j + d j jf min j min i2I 2 f(c i ) f(c

j ) d i d j ; f min

6=0; (8)

or

f(c

j )d

j min

i2I2 f(c

i ) f(c

j ) d i d j ; f min

=0: (9)

Theproofof thislemmacan be foundin[5]

Fortheexamplewe areexamining,onlyone rectangle ispotentiallyoptimalintherst

iteration. The shaded regionof Figure 6 identiesit. In general, there may be more than

onepotentiallyoptimalrectanglefoundduringaniteration. Oncethese potentiallyoptimal

hyper-rectangleshave beenidentied,wecomplete theiterationbydividingthem.

3.4 Dividing Potentially Optimal Hyper-Rectangles

Onceahyper-rectanglehasbeenidentiedaspotentiallyoptimal,DIRECTdividesthis

hyper-rectangleintosmallerhyper-rectangles. Thedivisionsarerestrictedtoonlybeingdonealong

thelongestdimension(s)ofthehyper-rectangle. Thisrestrictionensuresthattherectangles

willshrink on every dimension. If the hyper-rectangle is a hyper-cube, then the divisions

willbe donealong allsides,aswas thecasewith theinitialstep.

The hierarchy fordividing potentially optimal rectangle i is determined byevaluating

thefunctionat thepointsc

i Æ

i e

j

, wheree

j

isthejth unit vector, andÆ

i

is one-thirdthe

lengthof themaximumsideofhyper-rectanglei. The variablej takesonall dimensionsof

the maximallength for that hyper-rectangle. As was the case in theinitialization phase,

we dene

w

j

=minf f(c

i +Æ

i e

j );f(c

i Æ

i e

j

(9)

3542.4

600.4

358.2 67207

200.6

Figure6: Potentially OptimalRectangle in1st Iteration.

Intheabove denition,I istheset ofalldimensionsofmaximallengthforhyper-rectangle

i. Our rst divisionis done in thedimension with the smallest w

j

, say w

^

j

. DIRECTsplits

the hyper-rectangle into 3 hyper-rectangles along dimension ^

j, so that c

i , c

i +Æe

^

j , and

c

i

Æ are the centers of the new, smaller hyper-rectangles. This process is doneagain in

the dimension of the 2nd smallest w

j

on the new hyper-rectangle that has center c

i , and

repeatedfor alldimensionsinI.

Figure 7 shows several iterations of theDIRECT algorithm. Each row representsa new

iteration. The transition from the rst column to the second represents the identifying

process of thepotentiallyoptimalhyperrectangles. The shaded rectanglesin column2 are

thepotentially optimal hyper-rectangles asidentiedby DIRECT. The thirdcolumn shows

thedomainafter these potentiallyoptimalrectangleshave beendivided.

3.5 The Algorithm

We nowformally state theDIRECTalgorithm.

AlgorithmDIRECT('myfcn',bounds,opts)

1: Normalize thedomainto be theunit hyper-cubewithcenter c

1

2: Findf(c

1 ),f

min =f(c

1

),i=0 ,m=1

3: Evaluatef(c

1 Æe

i

,1in, and dividehyper-cube

4: while imaxits andmmaxevalsdo

5: Identifytheset S of all pot. optimalrectangles/cubes

6: forall j2S

7: Identifythelongestside(s)of rectangle j

8: Evaluatemyfcnat centers of new rectangles, anddividej intosmaller rectangles

9: Updatef

min

,xatmin,and m

10: end for

11: i=i+1

12: end while

(10)

(a)

(b)

(c)

Figure 7: SeveralIterations of DIRECT.

The termination occured when DIRECT was within 0.01% of the global minimum value.

DIRECTused191 functionevaluationsinthiscalculation.

0

0.1

0.2

0.3

0.4

0.5

0.6

0.7

0.8

0.9

1

0

0.1

0.2

0.3

0.4

0.5

0.6

0.7

0.8

0.9

1 x

₁

x

2

(11)

i a T

i

c

i

1 4.0 4.0 4.0 4.0 .1

2 1.0 1.0 1.0 1.0 .2

3 8.0 8.0 8.0 8.0 .2

4 6.0 6.0 6.0 6.0 .4

5 3.0 7.0 3.0 7.0 .4

6 2.0 9.0 2.0 9.0 .6

7 5.0 5.0 3.0 3.0 .3

8 8.0 1.0 8.0 1.0 .7

9 6.0 2.0 6.0 2.0 .5

10 7.0 3.6 7.0 3.6 .5

4 Acknowledgements

This research was supported by National Science Foundation grants DMS-0070641 and

DMS-0112542.

In addition, the author would like to thank Tim Kelley of North Carolina State

Uni-versity, andJoergGablonksyofthe theBoeingCorporationfortheir guidanceand helpin

preparingthisdocument and theaccompanyingcode.

A Appendix

Herewepresenttheadditionaltestproblemsthatwereusedin[3]toanalyzetheeectiveness

oftheDIRECTalgorithm. Thesefunctions areavailableat theweb-page listedinSection1.

If x

, the location of the global minimizer, is not explicitlystated, it can be found in the

commentsof theMATLAB code.

A.1 The Shekel Family

f(x)= m

X

i=1

1

(x a

i )

T

(x a

i )+c

i

; x2R N

:

Thereare three instances of theShekelfunction, named S5, S7 and S10, with N =4,and

m = 5;7;10. The values of a

i and c

i

are given in Table 1. The domain of all the Shekel

functionsis

= n

x2R 4

:0x

i

10; 1i4 o

:

All three Shekel functions obtain their global minimum at (4;4;4;4) T

, and have optimal

values:

S5

= 10:1532

S7

= 10:4029

S10

(12)

i a i c i p i

1 3 10 30 1 0.3689 0.1170 0.2673

2 .1 10 35 1.2 0.4699 0.4387 0.7470

3 3 10 30 1 0.1091 0.8732 0.5547

4 .1 10 35 3.2 0.0382 0.5743 0.8828

Table3: Second case: N =6;m=4.

i a

i

c

i

1 10 3 17 3.5 1.7 8 1

2 0.05 10 17 0.1 8 14 1.2

3 3 3.5 1.7 10 17 8 3

4 17 8 0.05 10 0.1 14 3.2

i p

i

1 0.1312 0.1696 0.5569 0.0124 0.8283 0.5886

2 0.2329 0.4135 0.8307 0.3736 0.1004 0.9991

3 0.2348 0.1451 0.3522 0.2883 0.3047 0.6650

4 0.4047 0.8828 0.8732 0.5743 0.1091 0.0381

A.2 Hartman's Family

f(x)= m X i=1 c i exp 0 @ N X j=1 a ij (x j p ij ) 2 1 A

; x2R N

:

There are two instances of the Hartman function, named H3 and H6. The values of the

parameters are given in Table 2. In H3, N = 3, while in the H6 function, N = 6. The

domainforbothof these problemsis

= n

x 2R N

:0x

i

1; 1iN o

:

The H3 function has a global optimal value H3

= 3:8628 which is located at x

=

(0:1;0:5559;0:852 2) T

. The H6 has a global optimal at H6 = 3:3224 located at x

=

(0:2017;0:15;0:47 69 ;0 :27 53 ;0 :31 17 ;0 :65 73 ) T

.

A.3 Six-hump camelback function

f(x

1 ;x

2

)=(4 2:1x 2 1 +x 4 1 =3)x 2 1 +x 1 x 2

+( 4+4x 2 2 )x 2 2 :

Thedomainof thisfunctionis

= n

x2R 2

: 3x

i

2; 1i2 o

Thesix-humpcamelbackfunctionhas two globalminimizerswithvalues 1:032.

A.4 Branin Function

(13)

= n

x2R 2

: 5x

1

10; 0x

2

15; 8i o

:

Theglobal minimumvalue is0:3978, and thefunctionhas3 globalminimumpoints.

A.5 The two-dimensional Shubert function

f(x

1 ;x

2 )=

0

@ 5

X

j=1

jcos [(j+1)x

1 +j]

1

A 0

@ 5

X

j=1

jcos [(j+1)x

2 +j]

1

A

:

The Shubert functionhas 18 global minimums,and 760 local minima. The domain of the

Shubert functionis:

= n

x2R 2

: 10x

i

10; 8i2[1;2] o

:

(14)

[1] R.G.Carter,J.M.Gablonsky,A.Patrick,C.T.Kelley,andO.J.Eslinger. Algorithmsfor

noisyproblemsingastransmissionpipelineoptimization.OptimizationandEngineering,

2:139{157, 2002.

[2] L.C.W. Dixon and G.P. Szego. Towards Global Optimisation 2. North-Holland, New

York,NY, rstedition, 1978.

[3] C.D. Perttunen D.R. Jones and B.E. Stuckman. Lipschitzian optimization without

thelipschitzconstant. Journal of OptimizationTheory andApplication,79(1):157{181,

October1993.

[4] J.M.Gablonsky.Directversion2.0userguide.TechnicalReport,CRSC-TR01-08,Center

for Research in Scientic Computation, North Carolina State University, April2001.

[5] J.M. Gablonsky. Modications of the Direct Algorithm. PhD Thesis. North Carolina

State University,Raleigh,NorthCarolina,2001.

[6] J.M.GablonskyandC.T.Kelley. Alocally-biasedformofthedirectalgorithm. Journal

of Global Optimization,21:27{37, 2001.

[7] C. T. Kelley. Iterative Methods for Linear and Nonlinear Equations, volume 16 of

Frontiers in Applied Mathematics. SIAM, Philadelphia,PA, rst edition,1995.

[8] C. T. Kelley. Iterative Methods for Optimization. Frontiers in Applied Mathematics.

SIAM,Philadelphia,PA, rstedition, 1999.

[9] B. Shubert. A sequential method seeking theglobal maximumof a function. SIAM J.