From (4.1.28) and (4.1.31) by the local Lipschitz continuity of it follows that
Remark 4.2.2 Results in Sections 4.1 and 4.2 are proved for the case where the two-sided randomized differences are
used where and are given by (4.1.3) and (4.1.4), respectively. However, all results presented in Sections 4.1 and 4.2 are also valid for the case where the one-sided randomized differences
are used, where and are given by (4.1.3) and (4.1.6), respectively.
In this case, in (4.1.27), (4.1.28) and in the expression of should be replaced by 1, and (4.1.29)–(4.1.32) disappear. Accordingly, (4.1.36) changes to
Theorems 4.1.1-4.1.4 and 4.2.1 remain unchanged. The conclusion of Theorem 4.2.2 remains valid too, if in Condition iv)
changes to
4.3. Global Optimization
As pointed out at the beginning of the chapter, the KW algorithm may lead to a local minimizer of L. Before the 1980s, random search, alone or combined with a local search method, was the main stochastic approach to finding the global minimum when the values of L can be observed exactly, without noise. When the structural property of L is used for local search, a rather rapid convergence rate can be derived, but it is hard to escape a local attraction domain. Random search has a chance to fall into any attraction domain, but its convergence rate decreases exponentially as the dimension of the problem increases.
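The dimension effect mentioned above can be illustrated with a small calculation. The sketch below assumes a simplified model that is not taken from the text: the search region is the unit cube, the target is a sub-cube of side 0.1, and each trial is one uniform draw. The function names are hypothetical.

```python
def hit_probability(dim, side=0.1):
    """Chance that one uniform draw from [0, 1]^dim lands in a target
    sub-cube with the given side length (illustrative model only)."""
    return side ** dim


def expected_draws(dim, side=0.1):
    """Mean number of independent draws until the first hit (geometric law)."""
    return 1.0 / hit_probability(dim, side)


for d in (1, 2, 5, 10):
    print(d, hit_probability(d), expected_draws(d))
```

Under this model the expected number of draws grows by a factor of 10 with every added dimension, which is the exponential slowdown referred to above.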
Simulated annealing is an attractive method for global optimization, but it provides only convergence in probability rather than path-wise convergence. Moreover, simulation shows that for functions with a few local minima, simulated annealing is not efficient. This motivates one to combine a KW-type method with random search. However, a simple combination of stochastic approximation (SA) and random search does not work: in order to reach the global minimum one has to reduce the noise effect as time goes on.
A hybrid algorithm composed of a search method and the KW algorithm is presented in the sequel, with the main effort devoted to designing easily realizable switching rules and to providing an effective noise-reducing method.
We define a global optimization algorithm, which consists of three parts: search, selection, and optimization. For definiteness, let us discuss the global minimization problem. In the search part, we choose an initial value and carry out the local search by use of the KW algorithm with randomized differences and expanding truncations described in Section 4.1 to approach the bottom of the local attraction domain. At the same time, the average of the observations for L is used to serve as an estimate of the local minimum of L in this attraction domain. In the selection part, the estimates obtained for the local minima of L are compared with each other, and the smallest one among them, together with the corresponding minimizer given by the KW algorithm, is selected. Then, the optimization part takes place, where again the local search is carried out, i.e., the KW algorithm without any truncations is applied to improve the estimate for the minimizer. At the same time, the corresponding minimum of L is reestimated by averaging the noisy observations. After this, the algorithm goes back to the search part again.
For the local search, we use observations (4.1.3) and (4.1.4), or (4.1.5) and (4.1.6). For definiteness, let us use (4.1.5) and (4.1.6).
In the sequel, by KW algorithm with expanding truncations we mean the algorithm defined by (4.1.11) and (4.1.12) with
where and are given by (4.1.5) and (4.1.6), respectively. Similar to (4.1.9) and (4.1.10) we have
where
By KW algorithm we mean
with defined by (4.3.2).
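As an illustration of a KW recursion driven by one-sided randomized differences, here is a minimal sketch. It is not a transcription of (4.3.1) and (4.3.2): the symmetric ±1 perturbations, the step sizes a/k and c/k^(1/4), the absence of truncations, and all function names are assumptions made for the example.

```python
import random


def one_sided_difference(L_noisy, x, c, rng):
    """Gradient estimate from two noisy observations of L: one at the
    randomly perturbed point x + c*delta and one at x itself."""
    delta = [rng.choice((-1.0, 1.0)) for _ in x]  # assumed +/-1 perturbations
    y_plus = L_noisy([xi + c * di for xi, di in zip(x, delta)])
    y_zero = L_noisy(x)
    # Divide the scalar difference by c and, componentwise, by delta.
    return [(y_plus - y_zero) / (c * di) for di in delta]


def kw_minimize(L_noisy, x0, steps, rng, a=0.5, c=0.5):
    """Plain KW iteration (no truncations) with assumed step sizes."""
    x = list(x0)
    for k in range(1, steps + 1):
        a_k = a / k            # assumed gain sequence
        c_k = c / k ** 0.25    # assumed difference width
        g = one_sided_difference(L_noisy, x, c_k, rng)
        x = [xi - a_k * gi for xi, gi in zip(x, g)]
    return x


rng = random.Random(1)


def noisy_quadratic(x):
    # Hypothetical test function with minimizer (1, 1), observed with noise.
    return sum((xi - 1.0) ** 2 for xi in x) + rng.gauss(0.0, 0.1)


print(kw_minimize(noisy_quadratic, [3.0, -2.0], 5000, rng))
```

Only two observations are needed per iteration regardless of the dimension, which is the practical appeal of randomized differences over coordinate-wise differences.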
It is worth noting that unlike (4.1.8), is used in (4.3.1). Roughly speaking, this is because in the neighborhood of a minimizer of is increasing, and in (4.1.11) should be an observation on
In order to define switching rules, we have to introduce integer-valued and increasing functions and such that
and
Define
In the sequel, by the search period we mean the part of algorithm starting from the test of selecting the initial value up to the next selection of initial value. At the end of the search period, we are given and being the estimates for the global minimizer and the minimum of L, respectively. Variables such as
and etc. in the search period are equipped with the superscript etc.
The global optimization algorithm is defined by the following five steps.
(GO1) Starting from at the search period, the initial value
is chosen according to a given rule (deterministic or random), and then is calculated by the KW algorithm with expanding truncations (4.1.11) and (4.1.12) with defined by (4.3.1), for which , step sizes and and used for truncation are defined as follows:
where c > 0 and are fixed constants, and are two sequences of positive real numbers increasingly diverging to infinity.
(GO2) Set the initial estimate
for
and update the estimate for by
where is the noise when observing After steps, is obtained.
(GO3) Let be a given sequence of real numbers such that
and as Set For if
as
then set Otherwise, keep unchanged.
(GO4) Improve to by the KW algorithm with expanding
truncations (4.1.11) and (4.1.12) with defined by (4.3.1), for which
where in (4.1.11) and (4.1.12) may be an arbitrary sequence of numbers increasingly diverging to infinity, and
At the same time, update the estimate for by
where is the noise when observing At the end of this step, and are derived.
(GO5) Go back to (GO1) for the search period.
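The control flow of steps (GO1) to (GO5) can be sketched as follows. This is a simplified illustration under stated assumptions, not the algorithm as specified above: the double-well test function, the Gaussian noise, the step sizes with a period-dependent offset in the denominator, the fixed period lengths, the shrinking switching margin 0.1/i, and the clamp that stands in for expanding truncations are all hypothetical choices.

```python
import random


def L(x):
    # Hypothetical double-well function: global minimum near x = -1.04,
    # local minimum near x = 0.96.
    return (x * x - 1.0) ** 2 + 0.3 * x


def observe(x, rng, sigma=0.1):
    """Noisy observation of L (assumed Gaussian noise)."""
    return L(x) + rng.gauss(0.0, sigma)


def clamp(x, bound=3.0):
    # Crude stand-in for the expanding truncations of the text.
    return max(-bound, min(bound, x))


def kw_run(x, steps, offset, rng, a=0.2, c=0.5):
    """KW steps with one-sided randomized differences. Returns the final
    iterate and the average of the last quarter of the noisy observations
    of L, used as an estimate of the local minimum value."""
    tail = []
    for k in range(1, steps + 1):
        a_k = a / (k + offset)           # offset grows with the period index
        c_k = c / (k + offset) ** 0.25
        delta = rng.choice((-1.0, 1.0))
        y_zero = observe(x, rng)
        y_plus = observe(x + c_k * delta, rng)
        if k > 3 * steps // 4:
            tail.append(y_zero)
        x = clamp(x - a_k * (y_plus - y_zero) / (c_k * delta))
    return x, sum(tail) / len(tail)


def global_search(periods, rng, search_steps=200, improve_steps=200):
    best_x, best_val = None, float("inf")
    for i in range(1, periods + 1):
        # (GO1)-(GO2): local search from a fresh initial value, with averaging.
        x0 = rng.uniform(-2.0, 2.0)
        cand_x, cand_val = kw_run(x0, search_steps, offset=i, rng=rng)
        # (GO3): switch only if the candidate is better by a shrinking margin.
        if cand_val < best_val - 0.1 / i:
            best_x, best_val = cand_x, cand_val
        # (GO4): refine the incumbent and re-estimate its minimum value.
        best_x, best_val = kw_run(best_x, improve_steps,
                                  offset=i + search_steps, rng=rng)
        # (GO5): continue with the next search period.
    return best_x, best_val


print(global_search(20, random.Random(0)))
```

The offset added to the denominators plays the role described in the text: it shrinks the step sizes in later search periods so that the effect of the observation noise on the estimates decreases as the period index grows.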
We note that for the search period is added to and (see (4.3.7) and (4.3.8)). The purpose of this is to diminish the effect of the observation noise as increases. Therefore, and both tend to zero, not only as but also as The following example shows that adding an increasing to the denominators of
and is necessary.
Example 4.3.1 Let
It is clear that the global minimizer is and are two local minima. Furthermore, and are attraction domains for –1 and +1, respectively.
Since is linear, for local search we apply the ordinary KW algorithm without truncation
Here, no randomized differences are introduced, because this is a one-dimensional problem.
Assume
where
and and are mutually independent and both are sequences of iid random variables with
Let us start from (GO1) and take
(not tending to infinity),
If then, by noticing one of and
must belong to Elementary calculation shows that
Paying attention to (4.3.13), we see
and
i.e.,
This means that is located in one of the attraction domains and Furthermore, by (4.3.12) and (4.3.13), the observations carried out at these domains are free of noise. Let us consider the further development of the algorithm, once has fallen into the interval or For definiteness, let us assume
For we have
or which implies
If say, then since
It suffices to consider the case where i.e., because for the case we again have (4.3.14) and
Simple computation shows that starting from the observations are free of noise, and the algorithm becomes
As a result of computation, we have
Then, starting from the algorithm will be iterated according to (4.3.14), and hence
For the case it can similarly be shown that
Therefore, whatever initial value is chosen, will never converge to the global minimizer if in (GO1) does not diverge to infinity.
Let us introduce conditions to be used.
Since we are seeking global minima of Condition A4.1.2’ should be modified.
A4.3.1 is locally Lipschitz continuous,
and L(J) is nowhere dense, where the set of
extremes of L.
Note that for seeking minima of the corresponding part in A4.1.2’ should be modified as follows: used in (4.1.11) is such that
A4.3.2
A4.3.3 For any convergent subsequence of
where denotes given by (4.3.3) with replaced by denotes used for the ¢ search period, and
A4.3.4 For any convergent subsequence
where is given by (1.3.2).
It is worth emphasizing that each in the sequence is used only once when we form and
We now give sufficient conditions for A4.1.2, A4.3.3, and A4.3.4. For this, we first need to define generated by estimates and derived up to the current time. Precisely, for running in the search period of Step (GO1) define