Chapter 3: Two stage modelling using a Genetic Algorithm-Neural Network (GA-ANN) and an
3.6 Second stage process
As described in chapter 2, one of the key capabilities of the GA is in optimization. The optimization of the GA can be used to optimize any kind of problems, such as the optimization of the network architecture; or optimization of a dataset for better performance. Furthermore, due to the operation of the GA, which is based on genetics and natural selection, it also offers an effective search technique which gives better performance in optimization (Muhd Khairulzaman Abdul Kadir, 2012). Therefore, because of this capability, an optimization of the ANN architecture has been applied. Here the population evaluation will be the same as in previous stages, except that the population size will not be the same as the input variables for this stage, which will be explained in section 3.6.1.
73 START
Randomly initialize the all the inputs
Generate initial random population
Train NN
Calculate network output and the network error using mean square error
(mse) GA parameter and operation: Selection – stochastic uniform Crossover – crossover scqattered Mutation - uniform
30 hidden neurons and using Levernberg - Marquardt
(LM) learning method
Is the generation of GA end?
End
Size of Chromosome based on number of inputs and
output
No
Yes Randomly arrange all
the data
Rearrange the number of inputs accordingly depending the fitness value of the network
Figure 3.6: First stage process flow
The main objective of this stage is to optimize the weight and threshold of the ANN itself. After selecting the best inputs for the ANN in the first stage of the TSH model, the GA will be used again in deciding the best weight and threshold values for the ANN; this is to ensure that no generalization problem occurs. The overall process flow for this second stage is shown in Figure 3.7.
74
START
Randomly initialize the WIJ,
WKL and BN
Generate initial random population
Train NN
Calculate network output and the network error using mean square
error (mse) GA parameter and operation: Selection – stochastic uniform Crossover – crossover scqattered Mutation - uniform
30 hidden neurons and using Levernberg -
Marquardt (LM) learning method
Is the generation of GA end?
End
Size of Chromosome based on number of inputs and
output
No
Yes
Figure 3.7: Second stage process flow
Generally, an ANN will provide random initialization of the weights and thresholds of the network, and due to this feature; it produces local extremes and sometimes makes the convergence slow. In addition, when the ANN cannot generalize, the dataset involved in the model can be easily subject to over-fitting or under-fitting especially when it is divided into training, validation and testing sections (Muhd Khairulzaman Abdul Kadir, 2012, Hajir Karimi, 2011, Rajendra et al., 2009, Zhang and Wang, 2008, T. Hasangholi, 2010). In (Hajir Karimi, 2011), the optimization of weights and thresholds also indicates that this gives the ANN benefits
75
by minimizing the mean square error (MSE) between the output of the network and the actual output.
3.6.1 Variable presentation
In the second stage of the TSH model, the variables use decimal representation, which is different from that use in the first stage model. Another difference is that the chromosomes are based on the total number of inputs, the total number of layers of the ANN and the total number of hidden neurons being used. The total number of chromosomes is shown in (3.7) (Muhd Khairulzaman Abdul Kadir, 2012).
(3.7)
Where, WIJ = weight between input layer and hidden layer, WJK = weight between hidden layer and another hidden layer, WKL = weight between hidden layer and output layer, BI = input layer threshold, BK = hidden layer threshold, BL = output layer threshold. All of these interconnections are shown in Figure 3.8
76 1 2 3 2 1 3 1 3 2 L I1 I2 I3 IN Input signals Output signals Input Layer First Hidden Layer Second Hidden
Layer Output Layer
J K . . . . . . . . . WIJ WJK WKL Y1 Y2 Y3 YN BI BK BL
Figure 3.8: Interconnection of weights and thresholds for ANN (Muhd Khairulzaman Abdul Kadir, 2012)
For this model, as for the previous stage, three layers are used: one input layer, one hidden layer and one output layer. This is to ensure that the GA will optimize the same types of network as previous stages, giving better accuracy in terms of the model performance. Therefore, for equation (3.7), WJK can be removed from the total number of chromosome in the GA. The number of hidden neurons will also be the same as for the previous stage model, as describe in the previous section in equation (3.6).
The ANN has a very complex architecture, especially when it involves multiple layers and multiple variables, inputs and outputs. The techniques performed in this stage have the same GA fitness function as in the first stage – the ANN–MLP architecture, and use the same GA parameters for optimizing the input variables. The
77
reason for maintaining the same parameters is to ensure that the performance of the first stage will be maintained in the next stage process.
3.7 Remodelling ANN
In remodelling the MLP-ANN, all of the parameters of the second stage of the TSH model is maintained ensuring that the performance optimized network made by the GA is also maintained. This remodelling will affect the entire MLP-ANN architecture in terms of its accuracy, and give faster performance. When compared with the original MLP-ANN, it always gives random results and random performance because of the random weights and thresholds initialization by the MLP network itself, which produce under-fitting and over-fitting problems in the training, testing and validation process.