global optimisation algorithm in a finite domain is identical to the distributions for the corresponding range and averaged range processes.

3.2. The number of iterations before convergence for the averaged range process


Proof. The range process is defined as the image of the domain process under $f$. Consequently, the number of iterations to convergence for this process is stochastically equivalent to the number of iterations to convergence in the original domain algorithm. RP serves as a stepping stone to the new Markov process in the range, ARP.

Let $F_d$ be the cumulative distribution function of the number of iterations to convergence for the domain process, and $F_a$ be the cumulative distribution function of the number of iterations to convergence for the averaged range process. Then if $N_d$ is the number of iterations to convergence for the domain process,

$$F_d(n) = P(N_d \le n) = P(X_n = x_1) = \delta_n(1),$$

the first component of the vector $\delta_n$. Similarly $F_a(n) = \pi_n(1)$.

Now since there is only one state at the optimal level, the probability of the domain process being in the optimal state is the same as the probability of the range process being in the optimal level. Theorem 3.2.1 then shows that this probability is the same as the probability of the averaged range process occupying the optimal level.

Thus

$$F_d(n) = \delta_n(1) = \pi_n(1) = F_a(n),$$

so $F_d = F_a$ and the proof is complete.

∎

Thus the first stage in the approximation is exactly equivalent to the original optimisation algorithm, in terms of the distribution of the number of iterations before convergence. This result corrects [49], where the number of iterations before convergence in the averaged range process is said to differ from the number of iterations before convergence in the domain process in moments higher than the first. The equality of means has been illustrated in the example of Subsection 2.2.1.

Chapter 3. The range, averaged range and asymptotic averaged range processes

The ensuing section moves on to the next stage of approximation, showing that the asymptotic averaged range process provides a well defined approximation to any Markovian global optimisation algorithm.

3.3 Existence of the asymptotic averaged range process

This section demonstrates the existence of the asymptotic averaged range process approximation to any averaged range process. Then, since the averaged range process has been defined for all domain processes, the whole approximation framework is defined up to this stage. The proof is straightforward for most algorithms and problems, but care is required in making sure that the asymptotic averaged range process is well defined in each possible case.

To prove that the AARP exists is to prove that, as $n$ tends to infinity, $R_n$ tends to a constant matrix, or possibly to oscillation between a finite number of constant matrices. As defined in Equation (2.1), Subsection 2.2.1, each entry in $R_n$ is constructed as a sum of terms of the form

$$\frac{P(X_n = x_k)\, P(X_{n+1} = x_p \mid X_n = x_k)}{P(X_n \in f^{-1}(f(x_k)))} \tag{3.3}$$

over various values of $x_p$ and $x_k$ when $P(X_n \in f^{-1}(f(x_k))) > 0$.

The conditional part of this is simply a one-step transition, determined from the transition matrix $P$, which is assumed constant.
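As an illustration of how the entries of $R_n$ are assembled from terms of the form (3.3), the following Python sketch builds $R_n$ for a small hypothetical domain process and checks numerically that the level distribution of the averaged range process agrees with that of the domain process at every iteration, in particular at the optimal level. The four-state transition matrix, the level assignment, the initial distribution and the helper name `averaged_range_matrix` are assumptions made for this example only. The update $\pi_{n+1} = \pi_n R_n$ is likewise assumed here as the evolution of the averaged range process, since Equation (2.1) is not reproduced in this section. Indices are 0-based, so the optimal state $x_1$ is index 0.

```python
import numpy as np

# Hypothetical domain process on four states; state 0 is the unique optimum
# and is absorbing. All numbers here are illustrative assumptions.
P = np.array([
    [1.0, 0.0, 0.0, 0.0],
    [0.3, 0.2, 0.3, 0.2],
    [0.1, 0.4, 0.1, 0.4],
    [0.2, 0.3, 0.3, 0.2],
])

# Objective function levels: level 0 = {state 0}, level 1 = {states 1, 2},
# level 2 = {state 3}. A[k, a] = 1 when state k lies in level a.
level = np.array([0, 1, 1, 2])
A = np.zeros((P.shape[0], level.max() + 1))
A[np.arange(P.shape[0]), level] = 1.0


def averaged_range_matrix(delta, P, A):
    """Assemble R_n from the domain distribution delta at iteration n."""
    n_levels = A.shape[1]
    mass = delta @ A                      # P(X_n in each level)
    R = np.eye(n_levels)                  # placeholder rows for empty levels
    for a in range(n_levels):
        if mass[a] > 0:
            # P(X_n = x_k) / P(X_n in level a), for x_k in level a
            weights = delta * A[:, a] / mass[a]
            # sum over x_k in level a and x_p in each target level
            R[a] = weights @ P @ A
    return R


delta = np.array([0.0, 0.5, 0.0, 0.5])    # assumed initial domain distribution
pi = delta @ A                            # matching initial level distribution
F_d, F_a = [delta[0]], [pi[0]]

for n in range(20):
    R = averaged_range_matrix(delta, P, A)
    pi = pi @ R                           # assumed update: pi_{n+1} = pi_n R_n
    delta = delta @ P                     # domain update: delta_{n+1} = delta_n P
    F_d.append(delta[0])
    F_a.append(pi[0])

# The level distributions agree, so in particular F_d(n) = F_a(n).
assert np.allclose(pi, delta @ A)
assert np.allclose(F_d, F_a)
```

Rows of $R_n$ for levels that carry zero probability at iteration $n$ are left as identity rows in this sketch; that choice is arbitrary and has no effect on $\pi_n$.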

It is possible to relax this assumption somewhat; the conditional part has a well defined limit as $n$ tends to infinity when strong ergodicity obtains in the normalised transient state transition matrix [44, Definition 4.5], and the existence of a quasi-stationary vector implied by this condition means that in fact the whole expression has a limit (this is discussed in greater detail below). Since the main point is that in the limit a quasi-stationary vector should exist, the conditions under which the asymptotic averaged range process is well defined may be weakened even further; however, as stated earlier, only algorithms with constant transition matrix $P$ are considered here.

Taking this conditional part as constant, then, it must still be shown that the remaining part of [...] transition matrices, and then in Subsection 3.3.3 a general theorem establishes that in fact for any domain process the expression in (3.3) tends to oscillation within a finite set of limiting values. The existence of an asymptotic averaged range process approximation to any time-homogeneous Markovian domain process can then be demonstrated.

Before these results can be proved, though, some knowledge of Markov chain theory is required. An introduction to the important points is presented in the following subsection.

3.3.1 Markov chains

The subsection following this one proves the existence of an asymptotic averaged range process approximation to any domain process where the transient state transition matrix is primitive.

A matrix is primitive if and only if it is irreducible and acyclic. Define $p_{ij}^{(n)} = P(X_n = x_j \mid X_0 = x_i)$; then a matrix is irreducible if for each pair of states $x_i$ and $x_j$ there exists an $n$ such that $p_{ij}^{(n)} > 0$. If the set $\{n : p_{ii}^{(n)} > 0\}$ has a greatest common divisor $d > 1$ for some $i$ then the state $x_i$ is cyclic with period $d$; otherwise the state is acyclic.

All states in a set of states for which the transition matrix is irreducible have the same period $d \geq 1$ [44, Lemma 1.2] (if the "period" is 1 then the states are acyclic). States in this set are partitioned amongst exactly $d$ non-empty cyclic subclasses (this follows from [44, Theorem 1.3]). The Markov chain moves around these subclasses in order, sampling one state from the current subclass at each iteration.

Note that a transition matrix for more than one state can never be irreducible if one of the states is absorbing, since the probability of a transition from an absorbing state to any other state in $n$ steps is zero for all $n$. The transition matrix formed by exclusion of all absorbing states, however, may be irreducible. If this transient state transition matrix is also acyclic then it is primitive.
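The definitions above can be checked mechanically for small matrices. The sketch below, in Python with NumPy, tests irreducibility by accumulating the zero patterns of successive matrix powers, and computes the period of a state as the greatest common divisor of the return times found up to a finite bound. The function names, the scan bound of $3m$ steps for an $m$-state matrix (which should suffice when the matrix is irreducible) and the example matrices are assumptions chosen for illustration, not part of the source. States are indexed from 0.

```python
from math import gcd

import numpy as np


def is_irreducible(T):
    """True if every state reaches every state in some n >= 1 steps."""
    m = T.shape[0]
    B = (T > 0).astype(int)               # zero pattern of T
    power = np.eye(m, dtype=int)
    reach = np.zeros((m, m), dtype=int)
    for _ in range(m):                    # path lengths 1..m are enough
        power = ((power @ B) > 0).astype(int)
        reach |= power
    return bool(reach.all())


def period(T, i):
    """gcd of the return times n <= 3m with (T^n)[i, i] > 0 (0 if none)."""
    m = T.shape[0]
    B = (T > 0).astype(int)
    power = np.eye(m, dtype=int)
    d = 0
    for n in range(1, 3 * m + 1):
        power = ((power @ B) > 0).astype(int)
        if power[i, i]:
            d = gcd(d, n)
    return d


def is_primitive(T):
    """Irreducible and acyclic; all states share one period when irreducible."""
    return is_irreducible(T) and period(T, 0) == 1


# Hypothetical chain with absorbing state 0: the full matrix is not irreducible.
P_abs = np.array([
    [1.0, 0.0, 0.0],
    [0.2, 0.3, 0.5],
    [0.1, 0.6, 0.3],
])
# Excluding the absorbing state leaves the transient state transition matrix.
Q = P_abs[1:, 1:]

# A two-state matrix that alternates deterministically: irreducible, period 2.
C = np.array([[0.0, 1.0],
              [1.0, 0.0]])

print(is_irreducible(P_abs), is_primitive(Q), period(C, 0))
```

Here the full matrix `P_abs` fails the irreducibility test because nothing is reachable from the absorbing state, while the transient submatrix `Q` is both irreducible and acyclic.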

A Markov chain with primitive transition matrix $P$ has a stationary distribution $v$ such that $vP = v$. This stationary distribution is called the left Perron-Frobenius eigenvector of $P$ [44]. Theorem 4.6 [44], which is repeated in the lemma below, generalises this idea to consider the so-called quasi-stationary distribution of transient states when the transition matrix of transient states only is primitive. Note that $I$ denotes the set of transient states (called inessential in the lemma below).
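To make the idea of a quasi-stationary distribution concrete, the following Python sketch computes the left Perron-Frobenius eigenvector of a primitive transient state transition matrix and compares it with the distribution over transient states conditioned on not yet having been absorbed. It assumes the standard form of the result, namely that this conditional distribution converges to the normalised left eigenvector; the $2 \times 2$ matrix, the starting distribution and the variable names are hypothetical and serve only as an example.

```python
import numpy as np

# Hypothetical primitive transient state transition matrix. Rows sum to less
# than one; the missing mass is the probability of absorption at each step.
Q = np.array([[0.3, 0.5],
              [0.6, 0.3]])

# Left eigenvectors of Q are right eigenvectors of Q transposed. The
# Perron-Frobenius eigenvalue is the eigenvalue of largest modulus.
eigvals, eigvecs = np.linalg.eig(Q.T)
k = int(np.argmax(np.abs(eigvals)))
v = np.real(eigvecs[:, k])
v = v / v.sum()                 # normalise to a probability vector

# Distribution over transient states, conditioned on remaining transient.
mu = np.array([1.0, 0.0])       # assumed start in the first transient state
for _ in range(200):
    mu = mu @ Q                 # one step, restricted to transient states
    mu = mu / mu.sum()          # condition on not having been absorbed

# The conditional distribution settles on the normalised left eigenvector.
assert np.allclose(mu, v)
assert np.allclose(v @ Q, np.real(eigvals[k]) * v)
```

Renormalising at every step keeps the iteration numerically stable; the same limit is obtained by normalising the row vector $\mu Q^k$ once at the end.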


Lemma 3.3.1 Let the submatrix of $P$ corresponding to transitions between the inessential states of the Markov chain corresponding to $P$ be primitive, and let there be a positive probability of $\{X_k\}$ beginning in some $i \in I$. Then for $j \in I$, as $k \to \infty$,

In document Convergence rates of stochastic global optimisation algorithms with backtracking : a thesis presented in partial fulfilment of the requirements for the degree of Doctor of Philosophy in Statistics at Massey University (Page 44-48)
