4.3 Architectural Design of the Prototype
4.3.4 Pattern Evaluation Phase
Once all chromosomes have been assigned fitness values, the evaluation process begins by selecting a pair of chromosomes from the GA GENOME array to produce a new offspring. The ga run() function executes this process, and the ANN process (the fitness computation module) is invoked to compute the fitness value of the offspring. The GA then compares all chromosomes in the array to identify the least fit chromosome, which is replaced by the new chromosome, and sorts the chromosomes in rank order from the best fit to the least fit. The comparison is performed by the ga getworst() function and the sorting by the ga getbest() function; both are subfunctions invoked from within ga run(). Figure 4.10 presents the system flowchart of the module, and the coding statements for this module can be found in Figure A.4 in Appendix A.
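The overall flow of one evaluation step can be sketched as follows. This is a minimal illustration in Python, not the prototype's code (which is listed in Figure A.4): the name steady_state_step, the (fitness, genes) tuple layout, and the three callback parameters are assumptions introduced here, and the fitness callback merely stands in for the ANN fitness computation module.

```python
from typing import Callable, List, Tuple

Chromosome = List[float]
Scored = Tuple[float, Chromosome]  # (fitness, genes); higher fitness is better


def steady_state_step(
    population: List[Scored],
    select: Callable[[List[Scored]], Chromosome],
    reproduce: Callable[[Chromosome, Chromosome], Chromosome],
    fitness: Callable[[Chromosome], float],
) -> List[Scored]:
    """One evaluation step: breed one offspring, replace the worst, re-rank."""
    parent_a = select(population)
    parent_b = select(population)
    child = reproduce(parent_a, parent_b)
    child_score = fitness(child)  # hypothetical stand-in for the ANN module

    # Role of ga getworst(): locate the least fit chromosome and overwrite it.
    worst_index = min(range(len(population)), key=lambda i: population[i][0])
    updated = list(population)
    updated[worst_index] = (child_score, child)

    # Role of ga getbest(): rank from best fit to least fit.
    updated.sort(key=lambda item: item[0], reverse=True)
    return updated
```

Passing selection, reproduction and fitness in as callbacks keeps the sketch independent of how each of those steps is actually implemented in the prototype.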
4.3.4.1 ga run() function
The ga run() function plays a key role in evolving chromosomes: it represents the entire GA evolution process and also stops the prototype from over-learning. In the ga run() function (see Figure 4.10a), the GA randomly selects a chromosome from the GA GENOME array, then selects another chromosome from the array, and compares the fitness of the two. The chromosome with the higher fitness value is chosen as parent chromosome a. This comparison process is known as tournament selection in GA. The same process is performed to find parent chromosome b. Once the GA has identified both parent chromosomes, they are crossed over to produce a new offspring. This is known as reproduction in the context of GA.
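A binary tournament of this kind can be sketched as follows. The function name tournament_pick and the (fitness, genes) tuple layout are assumptions made for illustration. The text does not say whether the two candidates must be distinct; this sketch assumes they are and draws them without replacement.

```python
import random
from typing import List, Optional, Tuple

Scored = Tuple[float, List[float]]  # (fitness, genes); higher fitness is better


def tournament_pick(
    population: List[Scored], rng: Optional[random.Random] = None
) -> Scored:
    """Draw two chromosomes at random and return the fitter one."""
    if rng is None:
        rng = random.Random()
    # Assumption: the two candidates are distinct members of the population.
    first, second = rng.sample(population, 2)
    return first if first[0] >= second[0] else second
```

Calling tournament_pick twice yields parent chromosomes a and b. Nothing in this sketch prevents the same chromosome from winning both tournaments.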
In this module, two types of reproduction process are performed: network weight optimisation and the conventional GA process (i.e. feature subset optimisation). For network weight optimisation, the weight of the first input node from both parents is crossed over at a random crossover point (i.e. cut point) in the range [1, network size - 2] to obtain the new input weight for the first input node in the new network. This new weight is then mutated with the Gaussian range [0, 0.5], and the mutated weight is saved in the GA GENOME array. The process is repeated until the weights for all nodes in both parents have been optimised. A similar process is used to optimise the feature subset in the new offspring: the crossover point of the features is selected randomly in the range [1, Input nodes - 2], and the mutation point is selected randomly in the range [-z, +z], where z is the total number of features in the data set divided by 100. For example, for a data set containing 7000 features and the network structure 10-5-2, the crossover point range for the new network is [1, 65], and the crossover and mutation point ranges for the new feature subset are [1, 8] and [-70, 70], respectively.
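The ranges in this example can be reproduced with a short sketch. All function names below are hypothetical, and several readings are assumptions rather than statements from the text: the network size of 67 is taken to be the 60 connection weights of a 10-5-2 network plus 7 bias terms (which is consistent with the stated range [1, 65]); the Gaussian range [0, 0.5] is read as a mean of 0 and a standard deviation of 0.5; and feature mutation is read as a random offset added to a feature index. The per-node weight crossover is simplified to a single one-point crossover on a flat list.

```python
import random
from typing import List, Tuple


def crossover_range(length: int) -> Tuple[int, int]:
    """Inclusive cut-point range [1, length - 2], as stated in the text."""
    return (1, length - 2)


def mutation_bound(total_features: int) -> int:
    """z = total number of features in the data set divided by 100."""
    return total_features // 100


def one_point_crossover(a: List[float], b: List[float], rng: random.Random) -> List[float]:
    """Take genes before the cut from parent a and the rest from parent b."""
    low, high = crossover_range(len(a))
    cut = rng.randint(low, high)  # randint is inclusive at both ends
    return a[:cut] + b[cut:]


def mutate_weight(weight: float, rng: random.Random, sigma: float = 0.5) -> float:
    """Assumed reading of 'Gaussian range [0, 0.5]': add N(0, 0.5) noise."""
    return weight + rng.gauss(0.0, sigma)


def mutate_feature(index: int, total_features: int, rng: random.Random) -> int:
    """Assumed reading: shift a feature index by an offset in [-z, +z]."""
    z = mutation_bound(total_features)
    shifted = index + rng.randint(-z, z)
    return min(max(shifted, 0), total_features - 1)  # keep the index valid


# Worked example from the text: 7000 features, network structure 10-5-2.
network_size = 10 * 5 + 5 * 2 + 5 + 2  # assumption: weights plus biases = 67
print(crossover_range(network_size))   # (1, 65)
print(crossover_range(10))             # (1, 8), using the 10 input nodes
print(mutation_bound(7000))            # 70, i.e. mutation range [-70, 70]
```

Clamping the mutated feature index to the valid range is a design choice of this sketch; the text does not state how out-of-range mutation points are handled.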
(a) The ga run() function.
Figure 4.10: Pattern Evaluation Phase: A low level flowchart on ga run() function.
The fitness of the new offspring is calculated by repeating the processes in the fitness computation module. The comparison between chromosomes is then carried out by the ga getworst() function, and the feature ranking is performed by the ga getbest() function. The processing steps in these two functions are presented in Figure 4.10b.
(b) The ga run() function.
Figure 4.10: – Continued
• ga getworst() In this function, all chromosomes in the GA GENOME array are copied to the temporary matrix M1. The comparison between chromosomes then begins by comparing the first two [...] chromosome is removed from the matrix M1. The second chromosome is retrieved from the matrix M1 and compared with the chromosome in the matrix M2. The process is repeated until no chromosomes are left in the matrix M1. The chromosome in the matrix M2 is then considered the worst chromosome in the population and is removed from the GA GENOME array. The new offspring takes the position of the removed chromosome in the array. The contents of the matrices M1 and M2 are emptied so that these matrices can be reused by other functions.
• ga getbest() When the new offspring has been introduced into the GA GENOME array, the ga getbest() function is invoked by the ga run() function. In this function, all chromosomes in the GA GENOME array are copied to the temporary matrix M1. A chromosome y, randomly selected from the matrix M1, is copied into the first pointer (i.e. the fittest position) in the temporary matrix M2 for comparison. The second chromosome retrieved from the matrix M1 is compared with the chromosome y in the matrix M2. If this chromosome has a higher fitness score than chromosome y, chromosome y is moved one step below the first pointer in the matrix and the new chromosome is placed at the first pointer. If it has a lower fitness score than chromosome y, it is added at the second pointer, after chromosome y. This sorting process is repeated until no chromosomes are left in the matrix M1, and the ranking of each feature in the chromosomes is stored in the HIST ENTRY array.
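The two subfunctions can be approximated by the sketch below. It is an illustration under assumptions, not the code of Figure A.4: the names replace_worst and rank_best_first are hypothetical, chromosomes are represented as (fitness, genes) tuples, the temporary matrices M1 and M2 are replaced by a running index and a result list, and the recording of feature rankings in the HIST ENTRY array is not modelled.

```python
from typing import List, Tuple

Scored = Tuple[float, List[float]]  # (fitness, genes); higher fitness is better


def replace_worst(population: List[Scored], offspring: Scored) -> int:
    """Scan for the least fit chromosome, overwrite it, return its index."""
    worst_index = 0  # plays the role of the single candidate held in M2
    for i in range(1, len(population)):
        if population[i][0] < population[worst_index][0]:
            worst_index = i
    population[worst_index] = offspring  # offspring takes the vacated slot
    return worst_index


def rank_best_first(population: List[Scored]) -> List[Scored]:
    """Insert each chromosome into a new list so the fittest comes first."""
    ranked: List[Scored] = []  # plays the role of M2
    for candidate in population:  # iterating plays the role of draining M1
        position = 0
        # Walk past every chromosome that is at least as fit as the candidate.
        while position < len(ranked) and ranked[position][0] >= candidate[0]:
            position += 1
        ranked.insert(position, candidate)
    return ranked
```

For example, with fitness values 0.4, 0.1 and 0.8 in the population and an offspring of fitness 0.6, replace_worst overwrites the chromosome at index 1, and rank_best_first then orders the fitness values as 0.8, 0.6, 0.4.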