CHAPTER 4: WELL PLACEMENT OPTIMIZATION
4.6 Benchmark Tests and Computational Complexity
In addition to being easy to implement, biologically inspired algorithms are popular because they are efficient at finding approximate solutions to optimization problems. However, the performance of these algorithms varies across optimization problems of different complexities, in accordance with the no free lunch theorem. Thus, the effectiveness or otherwise of HPSDE relative to DE and PSO in the well placement problems considered in this chapter is not a sufficient basis for far-reaching conclusions. In this section, all three metaheuristic algorithms are subjected to benchmark problems of different complexities. The complexity of a test problem depends on the number and distribution of its local optima as well as on the number of variables. In this regard, the algorithms are tested on six benchmark problems, namely: the Ackley Problem, De Jong's F1 Function, the Griewangk Problem, the Rastrigin Problem, the Rosenbrock Problem and Schaffer's F6 Function.
Besides the fact that these six test functions exhibit different degrees of complexity, we justify their selection on the grounds that they reflect opposite sides of fundamental complexity factors such as modality, separability and scalability. Modality refers to the number of local optima a problem possesses. Most real-life applications are multimodal (i.e. they contain more than one local optimum), as against unimodal or convex problems, which have only one optimum. A function of p variables is said to be separable if it can be expressed as a sum of p functions of one variable each. In other words, separability means that the optimization problem can be partitioned into sub-problems of lower dimensionality and is therefore considerably easier to solve. Scalability refers to the ability to apply the algorithm suitably and efficiently to problems of larger dimensionality; it is thus the property of the problem that determines its behavior in different dimensions.
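As a brief illustration of the separability notion described above, a separable objective can be minimized one variable at a time. The component functions $f_i$ are notation assumed here for illustration; they do not appear in the original text.

```latex
\[
f(x_1, \ldots, x_p) = \sum_{i=1}^{p} f_i(x_i)
\quad \Longrightarrow \quad
\min_{x} f(x) = \sum_{i=1}^{p} \min_{x_i} f_i(x_i).
\]
```

In its standard form, the Rastrigin function fits this pattern with $f_i(x_i) = 10 + x_i^2 - 10\cos(2\pi x_i)$, whereas each term of the Rosenbrock function, $100(x_{i+1} - x_i^2)^2 + (x_i - 1)^2$, couples two neighboring variables and therefore cannot be split in this way.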
With respect to the benchmark problems considered here, De Jong's F1 Function is unimodal, while Griewangk's Problem is multimodal; Rastrigin's Problem is separable, while the Rosenbrock Problem is non-separable; and Ackley's Problem is scalable, while Schaffer's F6 Function is non-scalable. Note that these benchmark problems are often characterized by more than one complexity factor. For example, apart from the Sphere Function (De Jong's F1), the remaining five benchmark problems are multimodal; again, apart from Rastrigin's Problem, all other problems are non-separable. A general description of the benchmark problems employed in this complexity analysis is presented below.
1. De Jong's F1 Function (also known as the Sphere Function) is a smooth, symmetrical, unimodal and strongly convex function. It is not a complex problem because it has only one optimum; its usefulness stems from the fact that it is a simple test function that allows one to check that the algorithms are in good working condition and that there are no coding mistakes. It is given by
$$\min_{x} f(x) = \sum_{i=1}^{n} x_i^2$$
and the test area is often restricted to the hypercube $-5.12 \le x_i \le 5.12$, $i = 1, 2, \ldots, n$.
2. The Rosenbrock Problem is characterized by a very narrow and sharp ridge which runs around a parabola; hence, it is considered a difficult problem. Algorithms with weak exploration capabilities (i.e. those not able to search new and better regions of the search space) are severely limited when deployed on this problem. Mathematically, the problem is given by
$$\min_{x} f(x) = \sum_{i=1}^{n-1} \left[ 100 \left( x_{i+1} - x_i^2 \right)^2 + \left( x_i - 1 \right)^2 \right]$$
and the test area is usually restricted to $-2.048 \le x_i \le 2.048$, $i = 1, 2, \ldots, n$.
3. With a large search space and a high number of local minima, Rastrigin's function is multimodal and fairly complex. It has a linearithmic complexity, and the surface of the function is determined by the amplitude ($A$) and the modulation frequency ($\omega$). With $A = 10$ (as is the case in this work), the selected domain is dominated by the modulation, and the local minima are located on a rectangular grid of size 1. The fitness values of the local minima increase with increasing distance from the global minimum. The function is represented as
$$\min_{x} f(x) = 10n + \sum_{i=1}^{n} \left[ x_i^2 - 10 \cos(2\pi x_i) \right]$$
and, like the Sphere Function, the test area is restricted to $-5.12 \le x_i \le 5.12$, $i = 1, 2, \ldots, n$.
4. Like Rastrigin's problem, the Griewangk problem is a multimodal function which has a linearithmic complexity of $O(n \ln(n))$, where $n$ is the number of the function's parameters. The function is given by
$$\min_{x} f(x) = 1 + \sum_{i=1}^{n} \frac{x_i^2}{4000} - \prod_{i=1}^{n} \cos\left( \frac{x_i}{\sqrt{i}} \right)$$
and its summation term produces a parabolic solution space, while the local optima (created by the cosine function) lie above the parabola. The dimensions of the search range increase on the basis of the product, which results in a decrease of the local minima. It is also noted that the function gets flatter the more the search range is increased. Thus, most algorithms have difficulty converging close to the minima of this function, because the probability of making progress decreases rapidly as the minima are approached (Digalakis and Margaritis, 2000).
5. Ackley's Problem is a widely used multimodal test function which is defined by
$$\min_{x} f(x) = -20 \exp\left( -0.2 \sqrt{\frac{1}{n} \sum_{i=1}^{n} x_i^2} \right) - \exp\left( \frac{1}{n} \sum_{i=1}^{n} \cos(2\pi x_i) \right) + 20 + e.$$
The presence of an exponential term in this function creates numerous local optima that cover its surface; it is noted in Domingo et al. (2005) that algorithms with strong exploratory and exploitative properties yield good results when tested on Ackley's Problem. The test area of this benchmark test is the hypercube $-30 \le x_i \le 30$, $i = 1, 2, \ldots, n$.
6. Schaffer's F6 Function is given as
$$\min_{x, y} f(x, y) = 0.5 + \frac{\sin^2\left( \sqrt{x^2 + y^2} \right) - 0.5}{\left( 1.0 + 0.001 (x^2 + y^2) \right)^2},$$
and its test area is often restricted to the hypercube $-100 \le x_i \le 100$, $i = 1, 2, \ldots, n$. The main difficulty of Schaffer's F6 test function is that the size of the potential maxima that need to be overcome to get to a minimum increases the closer one gets to the global minimum (Pohlheim, 2006).
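The six benchmark functions listed above can be sketched in Python as follows. This is a minimal sketch of the standard textbook forms, not the code used in this work; the function names and the `BOUNDS` dictionary are illustrative assumptions, and whether shifted or scaled variants were used in the experiments is not stated in the text.

```python
import math


def sphere(x):
    """De Jong's F1 (Sphere): sum of squared components."""
    return sum(xi * xi for xi in x)


def rosenbrock(x):
    """Rosenbrock: each term couples neighbouring components x[i], x[i+1]."""
    return sum(100.0 * (x[i + 1] - x[i] ** 2) ** 2 + (x[i] - 1.0) ** 2
               for i in range(len(x) - 1))


def rastrigin(x, amplitude=10.0):
    """Rastrigin with amplitude A = 10 and modulation 2*pi."""
    return amplitude * len(x) + sum(
        xi * xi - amplitude * math.cos(2.0 * math.pi * xi) for xi in x)


def griewangk(x):
    """Griewangk: parabolic sum term minus a product of cosines."""
    total = sum(xi * xi for xi in x) / 4000.0
    product = math.prod(math.cos(xi / math.sqrt(i))
                        for i, xi in enumerate(x, start=1))
    return 1.0 + total - product


def ackley(x):
    """Ackley: two exponential terms plus the constants 20 and e."""
    n = len(x)
    root_mean_square = math.sqrt(sum(xi * xi for xi in x) / n)
    mean_cosine = sum(math.cos(2.0 * math.pi * xi) for xi in x) / n
    return (-20.0 * math.exp(-0.2 * root_mean_square)
            - math.exp(mean_cosine) + 20.0 + math.e)


def schaffer_f6(x):
    """Schaffer's F6: defined on exactly two variables (x[0], x[1])."""
    r2 = x[0] ** 2 + x[1] ** 2
    return 0.5 + (math.sin(math.sqrt(r2)) ** 2 - 0.5) / (1.0 + 0.001 * r2) ** 2


# Test areas quoted in the text; no range is quoted for Griewangk,
# so it is deliberately left out here.
BOUNDS = {
    "sphere": (-5.12, 5.12),
    "rosenbrock": (-2.048, 2.048),
    "rastrigin": (-5.12, 5.12),
    "ackley": (-30.0, 30.0),
    "schaffer_f6": (-100.0, 100.0),
}
```

In these standard forms, each function attains its global minimum of zero at the origin, except the Rosenbrock function, whose minimum of zero lies at $x_i = 1$ for all $i$. Evaluating each function at its known optimum is a quick check against coding mistakes.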
Having described the benchmark test functions and the reasons behind their selection, we carry out benchmark tests with a view to ascertaining the comparative performance of the algorithms. In our analyses of the efficiency of the algorithms, we use well-established statistical quality indicators (or indices), namely the best value, the mean value and the standard deviation of the results obtained. To this end, 25 independent runs were performed with randomly initialized populations for all three algorithms, and a common termination criterion of 5000 function evaluations was set. This termination criterion (i.e. the same number of function evaluations) serves as a level playing field for all three algorithms, since it fixes the window within which our inferences are made; the choice of 25 runs is based on an established rule of thumb, as highlighted in Mersmann et al. (2010).
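The protocol described above (25 independent runs, a budget of 5000 function evaluations per run, and best, mean and standard deviation as indicators) can be sketched as follows. This is an illustrative harness only: `random_search` is a placeholder optimizer standing in for DE, PSO or HPSDE, the two-dimensional Sphere problem is an arbitrary choice, and the use of the sample standard deviation is an assumption, since the text does not say which estimator was used.

```python
import random
import statistics


def sphere(x):
    """De Jong's F1 (Sphere) function, used here as the example objective."""
    return sum(xi * xi for xi in x)


def random_search(objective, bounds, dim, max_evals, rng):
    """Placeholder optimizer (NOT DE, PSO or HPSDE): uniform random sampling.

    Returns the best objective value found within max_evals evaluations.
    """
    low, high = bounds
    best = float("inf")
    for _ in range(max_evals):
        candidate = [rng.uniform(low, high) for _ in range(dim)]
        best = min(best, objective(candidate))
    return best


def summarise_runs(optimizer, objective, bounds, dim,
                   runs=25, max_evals=5000, seed=0):
    """Run the optimizer `runs` times with the same evaluation budget and
    report the best value, mean value and sample standard deviation."""
    results = []
    for run in range(runs):
        rng = random.Random(seed + run)  # independent, reproducible runs
        results.append(optimizer(objective, bounds, dim, max_evals, rng))
    return {
        "best": min(results),
        "mean": statistics.mean(results),
        "std": statistics.stdev(results),
    }


summary = summarise_runs(random_search, sphere, (-5.12, 5.12), dim=2)
print(summary)
```

Any optimizer with the same call signature can be passed in place of `random_search`, so that all algorithms are compared under an identical evaluation budget.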
The best and mean function values from the experiments are summarized in Tables 4.6 and 4.7 respectively, whereas Table 4.8 shows the standard deviation, a statistical indicator of the extent of dispersion of the data points about the arithmetic mean.
| Algorithm | De Jong's F1 Function | Rosenbrock's Problem | Rastrigin's Problem | Griewangk's Problem | Ackley's Problem | Schaffer's F6 Function |
|---|---|---|---|---|---|---|
| DE | 0.193E+00 | -2.899E+01 | -3.114E+01 | -2.750E+01 | 3.66E+02 | 1.40E+01 |
| PSO | 0.190E+00 | -2.705E+01 | -4.192E+01 | -2.767E+01 | 1.03E+02 | 1.31E+01 |
| HPSDE | 0.043E+00 | -3.131E+01 | -3.511E+01 | -2.899E+01 | 1.71E+02 | 1.22E+01 |

Table 4.6: Best values of 25 runs after 5000 function evaluations of algorithms on benchmark problems
| Algorithm | De Jong's F1 Function | Rosenbrock's Problem | Rastrigin's Problem | Griewangk's Problem | Ackley's Problem | Schaffer's F6 Function |
|---|---|---|---|---|---|---|
| DE | 2.852E+00 | -2.578E+01 | -2.834E+01 | -2.020E+01 | 5.78E+02 | 1.46E+01 |
| PSO | 2.069E+00 | -2.523E+01 | -3.125E+01 | -2.441E+01 | 1.09E+02 | 1.45E+01 |
| HPSDE | 1.813E+00 | -2.678E+01 | -3.381E+01 | -2.583E+01 | 1.93E+02 | 1.31E+01 |

Table 4.7: Mean values of 25 runs after 5000 function evaluations of algorithms on benchmark problems
| Algorithm | De Jong's F1 Function | Rosenbrock's Problem | Rastrigin's Problem | Griewangk's Problem | Ackley's Problem | Schaffer's F6 Function |
|---|---|---|---|---|---|---|
| DE | 2.254E+02 | 2.416E-01 | 1.910E-02 | 1.211E+01 | 2.506E-02 | 2.76E+01 |
| PSO | 3.734E+01 | 2.030E-01 | 1.393E-02 | 1.541E+01 | 1.753E-02 | 6.10E+01 |
| HPSDE | 2.223E+01 | 1.383E-01 | 1.093E-02 | 1.150E+01 | 1.024E-02 | 1.15E+01 |

Table 4.8: Standard deviation after 5000 function evaluations of the algorithms on benchmark problems
On the strength of the results given in Tables 4.6 and 4.7, we can infer that, on average, the HPSDE algorithm outperformed both the DE and PSO algorithms on all of the benchmark test functions except Ackley's Problem, on which the PSO algorithm yielded better results for both the best and the mean function values. It is also noted that, although the PSO algorithm yielded the best function value on Rastrigin's Problem, the mean value attained by HPSDE over the 25 optimization runs was better than those of the other algorithms on the same benchmark test. Interestingly, the lowest standard deviations on all six benchmark functions were those associated with the experimental results from the HPSDE algorithm. In statistics, a low standard deviation indicates that the data points tend to lie close to the arithmetic mean of the distribution, whereas a high standard deviation indicates that the data points are spread out over a large range of values. Consequently, it is reasonable to infer that, for a small number of optimization runs, the probability of attaining near-average results is higher when the HPSDE algorithm is employed than when either of the other two algorithms is used. This corroborates the trends in the standard deviation resulting from the statistical analyses of the NPVs accruing from the fluid profiles generated from 30 optimization runs of the algorithms in applications 2 and 3 (sub-sections 4.5.2 and 4.5.3); see Tables 4.3 and 4.4 respectively.
Furthermore, we compare the computational complexity of the three algorithms. Computational complexity analysis offers a valuable qualitative insight into algorithmic efficiency; its objective is to determine the feasibility of an algorithm by estimating an upper bound on its resource usage. It also provides a basis for the relative comparison of algorithms in order to decide which is suitable for a given problem. In this regard, algorithmic efficiency or computational complexity is measured in terms of time complexity, a measure of the amount of time required to execute the algorithm, and space complexity, a measure of the number of memory cells or nodes required for its execution. The selection and deployment of algorithms for a given problem often involve some kind of time-space tradeoff, because most computational problems cannot be solved with both short computing time and low memory usage (Ziegler, 2002).
Generally speaking, the better the time complexity of an algorithm, the faster the algorithm is in practice; and the better the space complexity, the lower the risk of running out of memory. Over the years, the big O notation has been a convenient way of expressing the computational complexity of problem-solving algorithms. This notation provides a simple, qualitative insight into how the performance of an algorithm changes as its input size N grows larger. In other words, it provides a window for understanding how the performance of an algorithm responds to changes in problem input size.
In Zielinski et al. (2006), it was demonstrated that the control parameters of the DE algorithm (i.e. population size $N_p$, crossover rate CR and mutation factor F) have a direct bearing on the computational complexity of the algorithm. Since each iteration of the algorithm involves a loop over the $N_p$ population members nested with a loop over the $D$ components of each vector, and the mutation and crossover operations are performed at the component level for each DE vector, it follows that the number of fundamental operations in the algorithm (DE/rand/1/bin) is proportional to the total number of loops conducted until the maximum number of iterations $K$ is reached. Thus, the runtime complexity of the algorithm is given by $O(N_p \cdot D \cdot K)$. For this algorithm, the space requirement is in the order of $O(N_p \cdot K) = O(E)$, where $E$ is the number of fitness evaluations required by the algorithm in a given problem. The space complexity of DE is low when compared with other popular metaheuristic algorithms; perhaps this explains why DE has been extensively employed in large-scale optimization problems across a broad range of disciplines (Das and Suganthan, 2011).
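The nested-loop structure behind the $O(N_p \cdot D \cdot K)$ runtime estimate can be made concrete with a small sketch of DE/rand/1/bin that counts component-level operations. This is a generic textbook-style sketch under stated assumptions, not the implementation used in this work: the default parameter values, the clipping of trial components to the bounds, and the `component_ops` counter are all illustrative choices.

```python
import random


def de_rand_1_bin(objective, bounds, dim, pop_size=20, max_iters=50,
                  f=0.5, cr=0.9, seed=0):
    """Minimal DE/rand/1/bin sketch (minimization; requires pop_size >= 4).

    Returns (best objective value, number of component-level operations).
    """
    rng = random.Random(seed)
    low, high = bounds
    pop = [[rng.uniform(low, high) for _ in range(dim)]
           for _ in range(pop_size)]
    fitness = [objective(ind) for ind in pop]
    component_ops = 0

    for _ in range(max_iters):                      # K iterations
        new_pop = [ind[:] for ind in pop]
        new_fitness = fitness[:]
        for i in range(pop_size):                   # Np target vectors
            # Three distinct donors, all different from the target index i.
            picks = rng.sample(range(pop_size - 1), 3)
            a, b, c = (p + 1 if p >= i else p for p in picks)
            j_rand = rng.randrange(dim)             # forces one mutated component
            trial = []
            for j in range(dim):                    # D components
                component_ops += 1
                if rng.random() < cr or j == j_rand:
                    value = pop[a][j] + f * (pop[b][j] - pop[c][j])
                    value = min(max(value, low), high)  # assumed bound handling
                else:
                    value = pop[i][j]
                trial.append(value)
            trial_fitness = objective(trial)
            if trial_fitness <= fitness[i]:         # greedy selection
                new_pop[i], new_fitness[i] = trial, trial_fitness
        pop, fitness = new_pop, new_fitness

    return min(fitness), component_ops


def sphere(x):
    return sum(xi * xi for xi in x)


best, ops = de_rand_1_bin(sphere, (-5.12, 5.12), dim=3,
                          pop_size=10, max_iters=20)
print(best, ops)
```

With these settings the counter equals the product of population size, dimension and iteration count, which is the quantity that the runtime estimate tracks.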
For the PSO algorithm, the runtime complexity is given by $O(N_p \cdot M \cdot K)$, where $N_p$, $M$ and $K$ are the population size, the number of neighborhoods and the maximum number of iterations respectively. According to Liu et al. (2011), the worst-case scenario in this algorithm occurs when the number of sub-swarms (or neighborhoods) remains unchanged and the number of iterations reaches the designated maximum. In that situation, the runtime complexity is given by $O(N_p \cdot M \cdot K)$; however, if the number of sub-swarms is reduced after some iterations, the runtime complexity reduces to
$$O\left( \sum_{i=1}^{K} L \cdot N_p \right), \quad \text{where } 1 \le L \le M.$$
The space complexity of the PSO algorithm is in the order of $O(N_p \cdot K)$, and this complexity increases to the order of $O(Q \cdot N_p \cdot K)$ when $Q$ particles ($Q \le N_p$) overlap on the same node, as pointed out in Gheitanchi et al. (2008).
The time and space complexities of the HPSDE algorithm are in the order of $O(N_p \cdot D \cdot M \cdot K)$ and $O(N_p \cdot K)$ respectively. The flowchart of the algorithm depicted in Figure 4.5 shows that, in every iteration, there is an extra function evaluation that is absent in both the DE and PSO algorithms. The resources required to carry out this additional function evaluation in each iteration constitute an increased computational burden relative to the resources required in each iteration of DE and PSO. In other words, although the use of hybrid algorithms may be desirable from a performance point of view, it degrades computational efficiency because it increases the runtime complexity and, to some extent, the space complexity. Therefore, it is necessary to develop simple adaptation rules for algorithmic control parameters so as to improve performance without imposing a considerable computational burden when hybrid algorithms such as HPSDE are deployed.
Finally, we end this section by analyzing the ratio of the function evaluations required by each of the algorithms to those required for full enumeration of the search space in each of the well placement optimization scenarios considered in this chapter. Exhaustive enumeration of the search space is often undesirable and can be computationally prohibitive, particularly in large-scale engineering problems, where it can easily become intractable regardless of the computational resources available.
Using brute-force or exhaustive search as a baseline, we compared the number of function evaluations performed by the DE, PSO and HPSDE algorithms, with a view to understanding the relative advantage of one algorithm over the others in this problem domain. In the first application considered in this chapter, the problem involved the optimal placement of a single producer in a 2-D reservoir model with $45 \times 45 \times 1$ grid-blocks. Theoretically, there are 2025 (or $^{2025}C_{1}$) different possible placements of the single well, which means that a full enumeration of the search space (i.e. sampling the entire search space) would require 2025 function evaluations. In this problem, however, the DE, PSO and HPSDE algorithms began to converge to near-optimum solutions after 740, 591 and 772 iterations respectively. Note that there is an extra function evaluation in each iteration of the HPSDE algorithm; thus, the minimum numbers of function evaluations required in this problem scenario (Application 1) are 740, 591 and 1544 for the DE, PSO and HPSDE algorithms respectively. Therefore, using the theoretical maximum number of function evaluations as a baseline, the ratios of the actual objective function evaluations for these algorithms are 0.3654, 0.2919 and 0.7625 for DE, PSO and HPSDE respectively.
In the second application considered, the problem was to optimize the placement of two wells in a 2-D reservoir model of $50 \times 50 \times 1$ grid-blocks. Theoretically, there are 3123750 (or $^{2500}C_{2}$) possible placements for both wells; in other words, an exhaustive enumeration of the search space would entail over three million function evaluations. However, the DE, PSO and HPSDE algorithms attained near-optimal solutions after 2632, 1459 and 5726 objective function evaluations respectively. These numbers are far fewer than the 3123750 objective function evaluations required for full enumeration of the search space; in fact, they represent ratios (with respect to the brute-force baseline) of $8.4258 \times 10^{-4}$, $4.6707 \times 10^{-4}$ and $1.8331 \times 10^{-3}$ for the DE, PSO and HPSDE algorithms respectively.
For the nine-well placement problem considered in application 3, a full enumeration of the search space requires $1.036182146 \times 10^{25}$ (or $^{2500}C_{9}$) function evaluations, whereas the DE, PSO and HPSDE algorithms converged to near-optimal solutions after 7010, 5097 and 17126 objective function evaluations respectively. This application clearly underscores the desirability of metaheuristic algorithms in this problem domain; it highlights their ability to yield approximate solutions without exhaustive enumeration of the search space. It further highlights the fact that full enumeration of the search space in the well placement optimization problem becomes intractable as the number of decision variables (number of wells) increases; thus, brute-force algorithms would suffer severe performance limitations in this domain. For this problem, the ratios of the actual number of objective function evaluations needed to reach near-optimum solutions to the number required for full enumeration of the search space are vanishingly small: $6.7652 \times 10^{-22}$, $4.9190 \times 10^{-22}$ and $1.6528 \times 10^{-21}$ for the DE, PSO and HPSDE algorithms respectively. The computed ratios of the number of actual objective function evaluations required to reach near-optimum solutions to the theoretical maximum number of objective function evaluations for all three problem scenarios are presented in Table 4.9 below.
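The search-space sizes and ratios quoted for the three applications can be reproduced with a few lines of Python. This is a small arithmetic sketch; the dictionary layout and the names `SCENARIOS` and `enumeration_ratio` are illustrative, and the grid size of 2500 cells for application 3 is inferred from the quoted binomial coefficient rather than stated explicitly in the text.

```python
from math import comb

# Grid cells, number of wells, and function evaluations to near-convergence,
# as quoted in the text for each application.
SCENARIOS = {
    "Application 1": {"cells": 45 * 45 * 1, "wells": 1,
                      "evals": {"DE": 740, "PSO": 591, "HPSDE": 1544}},
    "Application 2": {"cells": 50 * 50 * 1, "wells": 2,
                      "evals": {"DE": 2632, "PSO": 1459, "HPSDE": 5726}},
    # 2500 cells is inferred from the quoted coefficient C(2500, 9).
    "Application 3": {"cells": 2500, "wells": 9,
                      "evals": {"DE": 7010, "PSO": 5097, "HPSDE": 17126}},
}


def enumeration_ratio(cells, wells, evaluations):
    """Ratio of actual evaluations to full enumeration, C(cells, wells)."""
    return evaluations / comb(cells, wells)


for name, scenario in SCENARIOS.items():
    total = comb(scenario["cells"], scenario["wells"])
    print(f"{name}: full enumeration = {total:.9e}")
    for algorithm, evaluations in scenario["evals"].items():
        ratio = enumeration_ratio(scenario["cells"], scenario["wells"],
                                  evaluations)
        print(f"  {algorithm}: {ratio:.4e}")
```

Python integers are arbitrary precision, so the binomial coefficient for nine wells is computed exactly before the division.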
As expected, the algorithms attained near-optimal solutions with far fewer function evaluations than the prohibitive number required for full enumeration of the search space; and the comparative advantage of the algorithms, in terms of the ratio of the number of actual objective function evaluations required to reach near-optimum solutions to the theoretical maximum number of objective function evaluations, becomes much more pronounced as the number of decision variables of the underlying optimization problem increases. Interestingly, however, of the three metaheuristic algorithms, PSO consistently converged to near-optimum solutions with the fewest objective function evaluations in all the applications. This evidence is also reflected in the fact that the computed ratio of the number