Amount of Processing Required to Solve a Problem
8.1 Effect of Number of Generations
As previously mentioned, the population size M and the maximum number G of generations to be run on any one run are the primary control parameters for genetic programming (as well as for the conventional genetic algorithm).
For a fixed population size M, the cumulative probability P(M, i) of satisfying the success predicate of a problem inevitably increases (or, at least, does not decrease) if a particular run is continued for additional generations. In principle, any point in the space of possible outcomes can eventually be reached by any genetic method if mutation is available and the run continues for a sufficiently large number of generations. However, there is a point after which the cost of extending a given run exceeds the benefit obtained from the increase in the cumulative probability of success P(M, i).
Figure 8.3
Cumulative probability of success P(M, i) for the 6-multiplexer problem with a population size M = 500 for generations 0 through 200.
Table 8.1 Total number of individuals that must be processed by generations 25, 50, 100, 150, and 200 of the 6-multiplexer problem with a population size M = 500.
| Generation number | Cumulative probability of success P(M, i) | Number of independent runs R(z) required | Total number of individuals that must be processed I(M, i, z) |
|---|---|---|---|
| 25 | 3% | 171 | 2,223,000 |
| 50 | 28% | 15 | 382,500 |
| 100 | 59% | 6 | 303,000 |
| 150 | 73% | 4 | 302,000 |
| 200 | 76% | 4 | 402,000 |
Figure 8.3 shows, for the 6-multiplexer problem (subsection 7.4.3), a graph between generations 0 and 200 of the cumulative probability of success P(M, i) that at least one S-expression in the population yields a success (i.e., the correct Boolean output for all 64 fitness cases). The graph is based on 150 runs of the problem for a population size of M = 500. The function set is F1 = {AND, OR, IF, NOT}.
Table 8.1 shows the total number of individuals that must be processed in order to yield a solution to this problem with 99% probability by generation 25, 50, 100, 150, or 200. The table illustrates the point after which the cost of extending a given run exceeds the benefit obtained from the increase in the cumulative probability of success P(M, i).
Specifically, if this particular problem is run from generation 0 through generation 25 (i.e., a total of 26 generations) with a population size M = 500, the cumulative probability of success P(M, i) is found by measurement to be about 3% (as shown in column 2 of row 1). Column 3 of row 1 shows that yielding a probability of z = 99% for solving this problem by generation 25 requires making R(z) = 171 independent runs (as shown in figure 8.2). Column 4 of row 1 shows that these 171 runs require processing of 2,223,000 individuals (i.e., 500 x 171 runs x 26 generations). Note that the number in column 4 is somewhat overstated because it is possible that more than one run may yield a solution and it is also possible that a solution may appear before generation 25. Nonetheless, processing 2,223,000 individuals will yield a solution with 99% probability by generation 25.
If this particular problem is run from generation 0 through generation 50, the cumulative probability of success P(M, i) is found by measurement to be 28% (as shown in column 2 of row 2). Column 3 shows that yielding a probability of z = 99% for solving this problem requires making R(z) = 15 independent runs. Column 4 shows that these 15 runs require processing of 382,500 individuals (i.e., 500 x 15 runs x 51 generations).
Rows 3, 4, and 5 show that if the run is extended out to generation 100, 150, or 200, the cumulative probability of success P(M, i) increases to 59%, 73%, or 76%, respectively. These higher values of P(M, i) mean that only 6, 4, or 4 independent runs are sufficient to solve this problem with 99% probability. However, extending the run to generation 100, 150, and 200 requires processing 303,000, 302,000, and 402,000 individuals, respectively.
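The arithmetic behind table 8.1 can be sketched in a few lines of Python. The sketch assumes the usual independent-runs relation, R(z) = ceil(log(1 - z) / log(1 - P(M, i))), and counts i + 1 generations per run (generations 0 through i). The function names are illustrative and do not come from the text. Row 1 is left out of the loop because the printed 3% is rounded: 3% exactly would give 152 runs, whereas 171 runs is what, for example, 4 successes in 150 runs (about 2.67%) would give. That count is an inference, not a figure stated in the text.

```python
import math

def runs_required(p_success, z=0.99):
    """R(z): independent runs needed so that at least one succeeds with
    probability z, assuming each run succeeds independently with p_success."""
    if not 0.0 < p_success < 1.0:
        raise ValueError("p_success must be strictly between 0 and 1")
    # Smallest R with 1 - (1 - p)^R >= z.
    return math.ceil(math.log(1.0 - z) / math.log(1.0 - p_success))

def individuals_to_process(pop_size, generation, p_success, z=0.99):
    """I(M, i, z) = M * (i + 1) * R(z); generations 0..i are all counted."""
    return pop_size * (generation + 1) * runs_required(p_success, z)

# Rows 2-5 of table 8.1 (M = 500, z = 99%), using the probabilities as printed.
for gen, p in [(50, 0.28), (100, 0.59), (150, 0.73), (200, 0.76)]:
    print(gen, runs_required(p), individuals_to_process(500, gen, p))
```

Rounding R(z) up to a whole number of runs is what makes the totals move in steps rather than smoothly.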
As can be seen, the cumulative probability of success is highest at generation 200; however, the computational effort required to yield a solution to this problem with 99% probability is higher at generation 200 than at three earlier generations in the table (i.e., 50, 100, and 150), all of which have lower values of P(M, i).
Figure 8.4 contains two overlaid graphs which together show, by generation, the relationship between the choice of the number of generations to be run and the total number of individuals that need be processed, I(M, i, z), in order to yield a solution to the 6-multiplexer problem with 99% probability for a population of size 500. The horizontal axis applies to both of these overlaid graphs and runs between 0 and 200 generations. The rising curve is the cumulative probability P(M, i) of success and is scaled by the left vertical axis running between 0% and 100%. This rising curve is the same graph as in figure 8.3 and is based on the same 150 runs. The falling curve shows, by generation, the total number of individuals I(M, i, z) that must be processed in order to solve the problem with z = 99% probability, and is scaled by the right vertical axis running between 0 and 6,000,000 individuals.
Figure 8.4
Performance curves for the 6-multiplexer problem with a population size M = 500 for generations 0 through 200.
Until a nonzero cumulative probability P(M, i) is achieved (as is the case at generation 12), the total number of individuals I(M, i, z) that must be processed is undefined. If P(M, i) had been measured over a sufficiently large number of runs or with a sufficiently large population, there would have been a small nonzero probability of solving the problem for every generation, including even generation 0 (representing the probability of solving the problem by blind random search).
Only one of 150 runs was successful at solving the 6-multiplexer problem by generation 12, so the cumulative probability of success P(M, i) was a mere 0.0067. With this low cumulative probability of success, R(z) = 689 independent runs are required to yield a solution to this problem by generation 12 with a 99% probability. This requires processing 4,478,500 individuals (500 x 13 generations x 689 runs).
Between generations 12 and 69 the P(M, i) curve has a rather steep slope. The curve rises rapidly from generation to generation, causing the required number of independent runs R(z) to drop rapidly from generation to generation. Meanwhile, the product of M x i increases only linearly from generation to generation. Thus, between generations 12 and 69 the total number of individuals that must be processed I(M, i, z) drops steadily until it reaches a minimum. The minimum occurs at generation 69.
At generation 69 the cumulative probability of success is 49%, so the number of independent runs R(z) is 7. Thus, processing only 245,000 individuals (i.e., 500 x 70 generations x 7 runs) is sufficient to yield a solution of this problem with a 99% probability. Generation 69 is highlighted with a light vertical line on figure 8.4. Both the generation number (i.e., 69) and the number of individuals that need to be processed (i.e., 245,000) are shown in the oval in the figure.
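Locating the minimum of the I(M, i, z) curve can be sketched as follows. This is a minimal illustration that uses only the (generation, probability) pairs quoted in the text rather than the full measured curve, and it again assumes the independent-runs relation R(z) = ceil(log(1 - z) / log(1 - P(M, i))). The helper names are hypothetical. Generations where P(M, i) is zero are skipped, since I(M, i, z) is undefined there.

```python
import math

def runs_required(p_success, z=0.99):
    """R(z) under the assumption of independent runs."""
    return math.ceil(math.log(1.0 - z) / math.log(1.0 - p_success))

def effort_curve(pop_size, p_by_generation, z=0.99):
    """Map generation -> I(M, i, z), skipping generations where P(M, i) = 0."""
    curve = {}
    for gen, p in p_by_generation.items():
        if p <= 0.0:
            continue  # I(M, i, z) is undefined until P(M, i) is nonzero
        runs = 1 if p >= 1.0 else runs_required(p, z)
        curve[gen] = pop_size * (gen + 1) * runs
    return curve

# Only the points quoted in the text (M = 500); generation 0 had no successes.
quoted_points = {0: 0.0, 12: 1 / 150, 50: 0.28, 69: 0.49,
                 93: 0.54, 100: 0.59, 150: 0.73, 200: 0.76}
curve = effort_curve(500, quoted_points)
best_gen = min(curve, key=curve.get)
print(best_gen, curve[best_gen])
```

Among these quoted points the minimum falls at generation 69 with 245,000 individuals, which agrees with the values in the oval of figure 8.4.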
After generation 69, the increase in the cumulative probability of success P(M, i) from 49% is slower from generation to generation.
Consequently, the decrease in R(z) occurs very slowly. It takes so many additional generations to increase P(M, i) so that R(z) can be reduced that there is a net increase, after generation 69, in the total number of individuals that must be processed in order to solve the problem with 99% probability. Between generations 69 and 92, the total amount of computation relentlessly increases by 500 individuals per run (3,500 individuals across the 7 runs) for each additional generation, because R(z) remains at 7. It is not until generation 93 that the cumulative probability of success P(M, i) reaches 54%, thereby reducing the required number of independent runs R(z) from 7 to 6. At generation 93, the number of individuals that must be processed I(M, i, z) is 282,000 (i.e., 500 x 94 generations x 6 runs). This 282,000 is greater than the 245,000 individuals required by generation 69.
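The stepwise behavior of I(M, i, z) after generation 69 can be illustrated with a short sketch. It takes the values of R(z) stated in the text as given (7 for generations 69 through 92, then 6 at generation 93) and does not attempt to reconstruct the underlying P(M, i) curve. The names are illustrative only. While R(z) is fixed, each extra generation costs M = 500 individuals in every one of the runs.

```python
POP_SIZE = 500

def individuals(generation, runs, pop_size=POP_SIZE):
    """I(M, i, z) = M * (i + 1) * R(z) for a given number of runs."""
    return pop_size * (generation + 1) * runs

# R(z) = 7 for generations 69-92, dropping to 6 at generation 93 (from the text).
effort = {gen: individuals(gen, 7) for gen in range(69, 93)}
effort[93] = individuals(93, 6)

step = effort[70] - effort[69]  # growth per generation while R(z) = 7
print(step, effort[69], effort[92], effort[93])
```

The total climbs steadily while R(z) is unchanged and drops only when R(z) falls by one run, yet the value at generation 93 still sits above the 245,000 reached at generation 69.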
By generation 200, the probability of success P(M, i) has reached 76% and the number of independent runs R(z) has dropped to 4, so the number of individuals that must be processed is 402,000 (i.e., 500 x 201 generations x 4 runs). This 402,000 is considerably greater than the 245,000 individuals required by generation 69.
Note that increasing the number of generations beyond 69 definitely does increase the cumulative probability of success; however, the cost of this increased probability, as measured by the total amount of computation, outweighs the benefit. It is not that a particular run of genetic programming is incapable of solving the problem if it is continued for a sufficiently large number of generations. The point is that it is inefficient to continue a particular run for a large number of generations. The cost of solving the problem in genetic programming is minimized by making numerous shorter runs, rather than one long run.
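The trade-off between several shorter runs and fewer longer runs can be checked directly. The sketch below assumes independent runs, so that the probability of at least one success in R runs is 1 - (1 - P(M, i))^R, and compares the two strategies discussed in the text. The function names are hypothetical.

```python
def prob_at_least_one_success(p_success, runs):
    """Probability that at least one of `runs` independent runs succeeds."""
    return 1.0 - (1.0 - p_success) ** runs

def total_individuals(pop_size, generation, runs):
    """Individuals processed when every run goes from generation 0 to `generation`."""
    return pop_size * (generation + 1) * runs

# (P(M, i), generation i, number of runs) as quoted in the text for M = 500.
strategies = {
    "7 runs to generation 69": (0.49, 69, 7),
    "4 runs to generation 200": (0.76, 200, 4),
}
for name, (p, gen, runs) in strategies.items():
    print(name,
          round(prob_at_least_one_success(p, runs), 4),
          total_individuals(500, gen, runs))
```

Both strategies clear the 99% threshold, but the seven shorter runs process 245,000 individuals against 402,000 for the four longer runs.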
Forty-two performance curves similar to figure 8.4 will appear throughout this book. Each such figure will contain two overlaid graphs showing, by generation, the probability of success, P(M, i), and the number of individuals that must be processed I(M, i, z). Each such figure will also contain an oval containing two numbers: the minimum number of individuals that must be processed to solve the problem with z = 99% probability for the stated choice of population size M and the generation number where the minimum is achieved. The minimum number of individuals that must be processed is an indication of the difficulty of the problem for the particular choice of population size M. Note that the sawtooth in the I(M, i, z) curve peaking at generation 21 is an anomaly created because of the approximate nature of the values of the P(M, i) curve.