6.4 Initial experimental results
6.4.1 GGAPack experimental setups
Initial experiments were set up to investigate the results of GGAPack. Because GA outputs are stochastic, GGAPack was executed one hundred times and the results were collected for statistical analysis. The targeted FPGA architecture was the same as that used in Section 5.5 for VPack and
[Figure 6.7 flowchart. Box labels: Start; Randomly initialise population (individual #: N); Evaluate population; Evolution (iteration): MO selection, Survived individuals (individual #: N/2), Crossover, Mutation, New population (individual #: N), Evaluate population; Terminate? (Yes / No); MO selection + Find the best individual, and translate to a netlist; End.]
Figure 6.7: The flow of GGAPack. The population is initialised randomly: each BLE is placed in a CLB with a random CLB index. Individuals are assigned multiple fitness values by multiple fitness functions. MO selection uses the non-dominated sort and crowding distance (the NSGA-II method) to form the new population. GGAPack, the GA, iterates for a fixed number of generations and then stops. The best individual is filtered and translated into a netlist.
RVPack, where each CLB has I = 18 inputs, N = 8 BLEs and one clock, and each BLE contains a 4-input LUT and a reconfigurable FF. The experiments are based on the MCNC-20 benchmarks (Yang, 1991). For the initial experiments, the solution-quality outputs of GGAPack, namely the number of CLBs and the number of CLB interconnects, are compared directly with those of other circuit clustering algorithms. The detailed testing flow is shown in Figure 6.8. Because the pattern match would otherwise be repeated in every run, each GGAPack program starts from the pattern-matched netlist, i.e. the BLEs. These tests were carried out on the same high-performance computing cluster described in Section 5.5, where the execution time of each program is the time for which it occupies a cluster processor.
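The MO selection step named in Figure 6.7 can be illustrated with a minimal sketch of an NSGA-II style survivor selection. It assumes that every individual carries a tuple of fitness values to be minimised, for example (CLB number, CLB interconnect number); the function names and the simple quadratic-time sort are illustrative choices, not the GGAPack implementation.

```python
from typing import Dict, List, Sequence, Tuple

Fitness = Tuple[float, ...]  # assumption: every objective is minimised


def dominates(a: Fitness, b: Fitness) -> bool:
    """True if a is no worse than b in every objective and better in one."""
    return (all(x <= y for x, y in zip(a, b))
            and any(x < y for x, y in zip(a, b)))


def non_dominated_sort(fits: Sequence[Fitness]) -> List[List[int]]:
    """Split individual indices into Pareto fronts (front 0 is the best)."""
    remaining = set(range(len(fits)))
    fronts: List[List[int]] = []
    while remaining:
        front = sorted(
            i for i in remaining
            if not any(dominates(fits[j], fits[i])
                       for j in remaining if j != i))
        fronts.append(front)
        remaining -= set(front)
    return fronts


def crowding_distance(fits: Sequence[Fitness],
                      front: List[int]) -> Dict[int, float]:
    """Larger distance means a less crowded region of the front."""
    dist = {i: 0.0 for i in front}
    for m in range(len(fits[front[0]])):
        ordered = sorted(front, key=lambda i: fits[i][m])
        lo, hi = fits[ordered[0]][m], fits[ordered[-1]][m]
        # Boundary individuals are always kept.
        dist[ordered[0]] = dist[ordered[-1]] = float("inf")
        if hi == lo:
            continue
        for prev, cur, nxt in zip(ordered, ordered[1:], ordered[2:]):
            dist[cur] += (fits[nxt][m] - fits[prev][m]) / (hi - lo)
    return dist


def mo_select(fits: Sequence[Fitness], n_survivors: int) -> List[int]:
    """Fill the survivor list front by front; break ties by crowding."""
    survivors: List[int] = []
    for front in non_dominated_sort(fits):
        if len(survivors) + len(front) <= n_survivors:
            survivors.extend(front)
        else:
            dist = crowding_distance(fits, front)
            ranked = sorted(front, key=lambda i: dist[i], reverse=True)
            survivors.extend(ranked[: n_survivors - len(survivors)])
            break
    return survivors
```

With a population of N individuals, calling `mo_select(fits, N // 2)` would return the indices of the N/2 survivors shown in Figure 6.7.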
Before GGAPack is submitted to the computing cluster, a number of GA parameters have to be chosen in order to produce useful results. First, the generation number is estimated from the optimal CLB number of the MCNC-20 benchmarks, using the largest benchmark, “clma”. It has 8,383 BLEs; when it is clustered into 8-BLE CLBs, the optimal CLB number is 1,048 (8,383 divided by 8, rounded up). In GGAPack,
[Figure 6.8 flowchart. Box labels: Start; Synthesised MCNC-20; Pattern Match the Netlist; GGAPack, GGAPack, GGAPack, ..., GGAPack (independent executions); End.]
Figure 6.8: GGAPack execution and testing flow. Before the synthesised MCNC-20 netlist (LUTs + FFs) is forwarded to GGAPack, the pattern match, which would otherwise be duplicated in every run, is performed once, so that GGAPack deals with the BLEs directly.
mutation is the primary operation that creates the CLBs, and in each generation two CLBs are eliminated. If this operation were carried out on all CLBs (the initial individual has one BLE per CLB), the smallest mutation number would be at least 4,192 (8,383 divided by 2, rounded up). However, that is an ideal situation, because the mutation does not act only on the CLBs that still need clustering: in practice, the random-CLB-eliminating mutation can also hit well-clustered CLBs. In order to pack all BLEs into an optimal, or near-optimal, number of CLBs, the generation number is set to ten times the minimum requirement, i.e. ten times the smallest mutation number (41,920). To simplify the setup, the GA generation number for every benchmark is rounded to 40,000, which also ensures that smaller benchmarks have enough generations to evolve. Figure A.2 in the Appendices shows GGAPack GA convergence under different generation numbers; the tests indicate that 40,000 generations are enough.
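The arithmetic behind the generation estimate can be reproduced with a short sketch. The function and parameter names are hypothetical; the values (8,383 BLEs, 8 BLEs per CLB, two CLBs eliminated per generation, a factor of ten) are the ones stated in the text.

```python
import math


def estimate_generations(n_bles: int,
                         clb_size: int = 8,
                         clbs_removed_per_generation: int = 2,
                         safety_factor: int = 10):
    """Return (optimal CLB number, minimum mutation number, generation budget)."""
    # Lower bound on CLBs if every CLB is completely filled.
    optimal_clbs = math.ceil(n_bles / clb_size)
    # Ideal case: every generation removes two CLBs that needed removing.
    min_mutations = math.ceil(n_bles / clbs_removed_per_generation)
    # Margin for mutations that land on already well-clustered CLBs.
    budget = safety_factor * min_mutations
    return optimal_clbs, min_mutations, budget


optimal, minimum, budget = estimate_generations(8383)  # "clma"
print(optimal, minimum, budget)  # 1048 4192 41920
```

The budget of 41,920 is what the text rounds to 40,000 generations for all benchmarks.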
Second, the crossover rate of the GA is adjusted. To find an efficient crossover rate, rates from 0.2 to 0.8 were tested, a range based on the empirical GA settings introduced in Chapter 4. The tests show that if the rate is set to 0.6, as shown in Figure 6.9
Figure 6.9: Box plot of final CLB numbers for different crossover rates of GGAPack executions. The test is based on “clma”, the largest benchmark in MCNC-20. For each crossover rate, GGAPack is executed 100 times; the final CLB number is the value reached after an execution has run for 40,000 generations. When the crossover rate is 0.6, the variation in the GA results is small, and the CLB number is small as well.
(which tests GGAPack performance at different crossover rates on the largest benchmark, “clma”), the GA can achieve a small number of CLBs in a short time, and its results also show little variation. The figure only shows the results for “clma”; however, the crossover rate is not tied to a particular benchmark, and the GA remains efficient when the benchmark is changed. To speed up the GA, the population size is set to 10, because experiments with GGAPack show that the population size does not significantly affect the quality of the GA results. This is shown by the graph of GA convergence for different population sizes in Figure A.3 in the Appendices.
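The selection criterion described above (a small final CLB number together with a small variation across runs) can be sketched as a summary over a crossover-rate sweep. This is an illustration only: the function names are hypothetical, the median and interquartile range are assumed as the measures of level and spread because they are what a box plot displays, and the input is whatever list of final CLB numbers the repeated runs produced.

```python
import statistics
from typing import Dict, List, Tuple


def summarise_sweep(
        runs: Dict[float, List[int]]) -> Dict[float, Tuple[float, float]]:
    """Map each crossover rate to (median, interquartile range)."""
    summary = {}
    for rate, clb_counts in runs.items():
        q1, _, q3 = statistics.quantiles(clb_counts, n=4, method="inclusive")
        summary[rate] = (statistics.median(clb_counts), q3 - q1)
    return summary


def pick_rate(summary: Dict[float, Tuple[float, float]]) -> float:
    """Prefer the lowest median; break ties by the smallest spread."""
    return min(summary, key=lambda rate: summary[rate])
```

Here `runs` would hold, for each tested rate, the final CLB numbers of the repeated GGAPack executions; ordering by median first and spread second is one possible reading of the criterion, not the only one.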
The GA parameters for GGAPack are summarised as follows:
1) Population size: 10
2) Crossover rate: 0.6
[Figure 6.10 bar charts, two panels. x-axis: MCNC-20 benchmarks (first panel: alu4, apex2, apex4, bigkey, clma, des, diffeq, dsip, elliptic, ex1010; second panel: ex5p, frisc, misex3, pdc, s298, s38417, s38584.1, seq, spla, tseng). y-axis: CLB number. Series: GGAPack Worst, GGAPack Best, VPack, RVPack.]
Figure 6.10: CLB numbers produced by GGAPack for the MCNC-20 benchmarks, compared with VPack and RVPack; lower is better. A box plot and the detailed data are provided in the Appendices, in Figure A.12 and Table A.10.
4) Generation number: 40,000