The pattern search method

5.1 Description of the PS algorithm

The PS algorithm is an iterative process that aims to generate a sequence of iterates $\{x_k\}$ in $\mathbb{R}^n$ with non-increasing objective function values. At each iteration, a finite number of points on a mesh are evaluated in order to find an improved point. The exploration of the mesh is carried out in one or two phases, namely the SEARCH and POLL steps. To better understand these phases, we first formally define the concepts of positive combination and positive span [14] and of mesh generation [45].

Definition 5.1: Positive combination, Positive Span

1. A positive combination of vectors $\{v_i\}_{i=1}^{p}$ is a linear combination $\sum_{i=1}^{p} \lambda_i v_i$, where $\lambda_i \ge 0$ for all $i \in \{1, 2, \ldots, p\}$ and $n + 1 \le p \le 2n$.

2. A positive span for a subspace $B \subset \mathbb{R}^n$ is a set of vectors $\{v_i\}_{i=1}^{p}$ such that every $x \in B$ can be expressed as a positive combination of the vectors $\{v_i\}_{i=1}^{p}$. The matrix defined by $V = [v_1, \ldots, v_p]$ is said to be a positive spanning matrix.

3. Let the subspace $B \subset \mathbb{R}^n$ be of dimension $m$ and $V \in \mathbb{R}^{n \times p}$ be a positive spanning matrix for $B$. If $p = n + 1$, then $V$ is said to be a minimal positive spanning matrix for $B$.

If, for example, $B = \mathbb{R}^2$, then $V = [e_1, e_2, -e_1, -e_2]$, where $e_1$ and $e_2$ are the unit coordinate vectors, is a positive spanning matrix. However, $V = [e_1, e_2, -(e_1 + e_2)]$ is a minimal positive spanning matrix for $B$.
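To see why the three columns $e_1$, $e_2$ and $-(e_1 + e_2)$ suffice in the plane, the following minimal sketch constructs non-negative coefficients for an arbitrary point. The function names `positive_coefficients` and `combine` are hypothetical and are introduced here only for illustration.

```python
def positive_coefficients(x):
    """Write x in R^2 as a*e1 + b*e2 + c*(-(e1 + e2)) with a, b, c >= 0."""
    x1, x2 = x
    # The third direction lowers both coordinates at once, so c must cover
    # the most negative coordinate of x (and is zero if there is none).
    c = max(0.0, -x1, -x2)
    a = x1 + c
    b = x2 + c
    return a, b, c


def combine(coeffs):
    """Evaluate a*e1 + b*e2 + c*(-(e1 + e2)) as a point in R^2."""
    a, b, c = coeffs
    return (a - c, b - c)


# Points from three different quadrants are all reachable with
# non-negative coefficients.
for point in [(2.0, 3.0), (-1.0, 2.0), (-4.0, -1.5)]:
    coeffs = positive_coefficients(point)
    assert min(coeffs) >= 0.0
    assert combine(coeffs) == point
```

For instance, the point $(-1, 2)$ is obtained with the coefficients $(0, 3, 1)$.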

Definition 5.2: Base Direction Matrix

Let $B$ be the set of all matrices whose columns positively span $\mathbb{R}^n$. Then the base direction matrix $D$ is any positive spanning matrix satisfying

$$D \in \mathbb{Q}^{n \times p} \cap B. \tag{5.1}$$

Since $\mathbb{Q}^{n \times p}$ is the set of rational $n \times p$ matrices, condition (5.1) ensures that $D$ has only rational entries, which makes it easy to establish the minimal distance between distinct mesh points [45].

Definition 5.3: Mesh

$$M(x_k, \Delta_k) = \{x_k + \Delta_k D m : m \in \mathbb{N}^p\} \tag{5.2}$$

where $x_k$ is the current iterate and $\Delta_k \in \mathbb{R}_{+}$ is the mesh size parameter. We note that the mesh is not explicitly constructed; it is a conceptual entity.
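Although the mesh is never built in practice, listing a small part of it can help fix ideas. The sketch below enumerates the mesh points $x_k + \Delta_k D m$ for integer vectors $m$ whose entries lie between 0 and a bound. The bound `max_coeff` and the function name `mesh_points` are assumptions made only to keep the illustration finite; the mesh in (5.2) itself is unbounded.

```python
from itertools import product


def mesh_points(x, delta, directions, max_coeff):
    """Return the mesh points x + delta * D m with 0 <= m_i <= max_coeff.

    `directions` holds the columns of D as tuples. Truncating m with
    `max_coeff` is an assumption made purely for illustration.
    """
    points = set()
    for m in product(range(max_coeff + 1), repeat=len(directions)):
        point = tuple(
            xi + delta * sum(mj * d[i] for mj, d in zip(m, directions))
            for i, xi in enumerate(x)
        )
        points.add(point)
    return points


D = [(1, 0), (0, 1), (-1, 0), (0, -1)]  # columns e1, e2, -e1, -e2
local_mesh = mesh_points((1.0, 1.0), 0.5, D, max_coeff=1)
print(sorted(local_mesh))
```

With these inputs the result is the 3 by 3 grid of points with spacing 0.5 centered at $(1, 1)$.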

SEARCH STEP

A finite subset of mesh points, possibly empty, is selected and evaluated in order to find an improving point. If any of these points improves on the current iterate, then $x_k$ is replaced by the improving point. If the search fails to find an improving point, the next step, i.e. the POLL step, is invoked. Any strategy, such as a heuristic rule, may be used to select the candidate mesh points. Because the selection rule needs no mathematical foundation, the SEARCH step does not contribute to the convergence properties of the PS method, and some researchers consider it a liability [7, 8]. Most implementations of the PS algorithm do not use this step.

POLL STEP

The POLL step consists of evaluating the function on the set of mesh points neighboring the current iterate $x_k$. These neighboring points are referred to as the poll set and denoted as follows:

$$P_k = \{x_k + \Delta_k d_i : d_i \in D, \ i = 1, \ldots, p\}. \tag{5.3}$$

The points in the poll set are evaluated one at a time until an improved mesh point is found. If this step is successful, the iterate is updated to the new improved mesh point.
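The poll set in (5.3) can be computed directly from the columns of $D$. The following is a minimal sketch; the function name `poll_set` and the representation of $D$ as a list of tuples are assumptions made for illustration.

```python
def poll_set(x, delta, directions):
    """Return the poll points x + delta * d for each column d of D, in order."""
    return [tuple(xi + delta * di for xi, di in zip(x, d)) for d in directions]


D = [(1, 0), (0, 1), (-1, 0), (0, -1)]  # columns e1, e2, -e1, -e2
print(poll_set((1, 1), 1, D))  # [(2, 1), (1, 2), (0, 1), (1, 0)]
```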

MESH UPDATE

At each iteration, either the SEARCH or the POLL step yields an improved mesh point, or both fail. This gives two possible outcomes. If an iteration fails, one can conclude that the current point is locally optimal for the current mesh. Hence the mesh is refined using the following rule:

$$\Delta_{k+1} = \theta_k \Delta_k \tag{5.4}$$

with $0 < \theta_k < 1$. If, however, the algorithm succeeds in finding an improved mesh point, the mesh is either kept the same or coarsened via the following rule:

$$\Delta_{k+1} = \theta_k \Delta_k \tag{5.5}$$

with $\theta_k \ge 1$, where $\theta_k = 1$ keeps the mesh unchanged.

Typical values for the mesh parameter update are $\Delta_{k+1} = \tfrac{1}{2}\Delta_k$ when the mesh needs to be refined and $\Delta_{k+1} = 2\Delta_k$ when the mesh needs to be coarsened [8]. Both of these processes are implicit: only the mesh size parameter changes, and the mesh itself is never constructed. The PS algorithm based on the POLL step is given below.
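The two update rules can be sketched as a single function. The name `update_mesh_size` and its parameters `shrink` and `expand` are hypothetical; the default `expand=1.0` corresponds to keeping the mesh unchanged after a successful iteration, which is one of the two options described above.

```python
def update_mesh_size(delta, success, shrink=0.5, expand=1.0):
    """Return the next mesh size parameter.

    On failure the mesh is refined with a factor 0 < shrink < 1.
    On success it is kept (expand == 1) or coarsened (expand > 1).
    """
    if not 0.0 < shrink < 1.0:
        raise ValueError("shrink must lie strictly between 0 and 1")
    if expand < 1.0:
        raise ValueError("expand must be at least 1")
    return delta * expand if success else delta * shrink


print(update_mesh_size(1.0, success=False))             # 0.5
print(update_mesh_size(1.0, success=True))              # 1.0
print(update_mesh_size(1.0, success=True, expand=2.0))  # 2.0
```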

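Algorithm 3 The PS algorithm

1. Initialization: choose an initial point $x_1$, an initial mesh size parameter $\Delta_1 > 0$ and a tolerance $\Delta_{tol} > 0$. Set $k = 1$.

2. POLL step: evaluate $f$ at the points $x_k^i$ of the poll set $P_k$ given by (5.3).

• IF $f(x_k^i) < f(x_k)$ for some $x_k^i \in P_k$ THEN

* $x_{k+1} = x_k^i$

* Either increase the mesh size parameter $\Delta_k$ or keep it the same using (5.5) and then go to step 3.

• IF $f(x_k^i) \ge f(x_k)$ for all $x_k^i \in P_k$ THEN

* $x_{k+1} = x_k$

* Decrease the mesh size parameter $\Delta_k$ using (5.4) and then go to step 3.

3. IF $\Delta_k < \Delta_{tol}$ THEN STOP, ELSE $k = k + 1$ and go to step 2.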
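The poll-only loop described in this section can be sketched in a few lines. This is an illustrative sketch under stated assumptions, not a reference implementation: the function name `pattern_search`, the choice of the coordinate directions $\pm e_i$ as the columns of $D$, the default parameter values and the `max_iter` safeguard are all assumptions introduced here.

```python
def pattern_search(f, x0, delta0=1.0, delta_tol=1e-6, shrink=0.5, expand=1.0,
                   max_iter=10_000):
    """Poll-only pattern search with the directions e_1..e_n, -e_1..-e_n."""
    n = len(x0)
    # Assumed base direction matrix D = [I, -I], stored column by column.
    directions = [
        tuple(sign if j == i else 0.0 for j in range(n))
        for sign in (1.0, -1.0)
        for i in range(n)
    ]
    x, fx, delta = tuple(x0), f(x0), delta0
    for _ in range(max_iter):
        improved = False
        # POLL step: stop at the first poll point that improves on f(x).
        for d in directions:
            trial = tuple(xi + delta * di for xi, di in zip(x, d))
            f_trial = f(trial)
            if f_trial < fx:
                x, fx, improved = trial, f_trial, True
                break
        # MESH UPDATE: keep or coarsen on success, refine on failure.
        delta = delta * expand if improved else delta * shrink
        if delta < delta_tol:
            break
    return x, fx, delta


def sphere(x):
    """Hypothetical test function with minimizer (1, -2)."""
    return (x[0] - 1.0) ** 2 + (x[1] + 2.0) ** 2


x_best, f_best, delta_final = pattern_search(sphere, (0.0, 0.0))
print(x_best, f_best)  # (1.0, -2.0) 0.0
```

Starting from the origin with unit mesh size, three successful polls reach the minimizer exactly; every later poll fails, so the mesh size is halved until it drops below the tolerance.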

For illustrative purposes we present a hypothetical example using only the POLL step in Figure 5.1. The current iterate is indicated by a shaded circle, an unsuccessful trial point by an unshaded circle and a successful trial point by a semi-shaded circle. Trial points are written in round brackets, e.g. $x_1 = (x_1^1, x_1^2)$, and the corresponding function values in square brackets, e.g. $[f(x_1)]$.

Figure 5.1: Example of PS

In Figure 5.1 A), $x_1 = (1, 1)$ is the current iterate, with a function value of 10, and we let $\Delta_1 = 1$. If we poll around $x_1$ using the spanning matrix $D = [e_1, e_2, -e_1, -e_2]$, our first trial point is $x_1 + \Delta_1 e_1 = (1, 1) + (1, 0) = (2, 1)$, where the function value is 13. This trial point does not provide an improvement, so we proceed to the next trial point, $(1, 2)$. We are likewise unsuccessful at $(1, 2)$ and at $(0, 1)$, where the function values are 17 and 14 respectively. The last trial point, $(1, 0)$, however, has a function value of 7, which is lower than that of $x_1$. Therefore we let $x_2 = (1, 0)$ be our new iterate and poll center. The poll step is successful and so the mesh is kept the same, i.e. $\Delta_2 = 1$.
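The first iteration of this example can be replayed with the function values quoted in the text. In the sketch below the dictionary `values` stands in for the objective function, and the helper name `first_improving_point` is hypothetical.

```python
# Function values quoted in the text for panel A) of Figure 5.1.
values = {(1, 1): 10, (2, 1): 13, (1, 2): 17, (0, 1): 14, (1, 0): 7}


def first_improving_point(center, delta, directions, values):
    """Poll around `center` and return the first point with a lower value."""
    for d in directions:
        trial = (center[0] + delta * d[0], center[1] + delta * d[1])
        if values[trial] < values[center]:
            return trial
    return None  # the poll step failed


D = [(1, 0), (0, 1), (-1, 0), (0, -1)]  # columns e1, e2, -e1, -e2
x2 = first_improving_point((1, 1), 1, D, values)
print(x2)  # (1, 0)
```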

The order in which the trial points are generated does not matter.

In Figure 5.1 B) we once again poll around $x_2$; however, none of the trial points provides a decrease in the objective function value. Hence for the next iteration the poll center is kept the same and the mesh is refined, giving $\Delta_3 = \tfrac{1}{2}$.

Figure 5.1 C) shows how the process is repeated with $x_3 = (1, 0)$ as our current iterate and $\Delta_3 = \tfrac{1}{2}$. The second trial point, $(1, 0.5)$, provides an improvement and is set as the new iterate $x_4$. The POLL process is then restarted using $x_4 = (1, 0.5)$ as the new iterate. This process continues until $\Delta_k < \Delta_{tol}$.
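The arithmetic behind the three panels can be collected in one place. The block below only restates the values given in the text, assuming the refinement factor $\tfrac{1}{2}$ and the trial order $e_1, e_2, -e_1, -e_2$ used in the example.

```latex
\begin{align*}
\Delta_2 &= \Delta_1 = 1
  && \text{(iteration 1 succeeds, mesh kept the same)} \\
x_3 &= x_2 = (1, 0), \quad \Delta_3 = \tfrac{1}{2}\,\Delta_2 = \tfrac{1}{2}
  && \text{(iteration 2 fails, mesh refined)} \\
x_3 + \Delta_3 e_1 &= (1, 0) + \tfrac{1}{2}(1, 0) = (1.5, 0)
  && \text{(first trial point, no improvement)} \\
x_4 &= x_3 + \Delta_3 e_2 = (1, 0) + \tfrac{1}{2}(0, 1) = (1, 0.5)
  && \text{(second trial point, improvement)}
\end{align*}
```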

In document Differential evolution algorithms for constrained global optimization (Page 46-50)