CHAPTER 3 Artificial Neural Networks
3.6 Model Implementation
The implementation phase of model development consists of training (learning) and validation. Reed and Marks (1999) define training as the process by which the ANN adapts to learn the relationship, or mapping, between inputs and outputs. Learning may be supervised, unsupervised, or reinforcement-based, and its success is typically measured by some performance metric. Validation is the testing of the model with input data that was not used to train it, in order to assess its ability to generalize the relationship between input and output data.
3.6.1 Supervised Learning
In supervised learning, the network is provided with correct answers to the problem for every input pattern. The connection weights of the network are adjusted to allow the network to produce answers as close as possible to target (teacher) answers.
With supervised learning, the ANN must be trained before it becomes useful. Babovic and Bojkov (2001) indicated that training consists of presenting input and output data to the network. This data is often referred to as the training data set; that is, for each input provided to the network, the corresponding desired output is provided as well. Training is considered complete when the neural network reaches a user-defined performance level. This level indicates that the network has achieved the desired statistical accuracy, in that it produces the required outputs for a given sequence of inputs. When no further learning is necessary, the weights are typically frozen for the application.
In supervised learning, there is an output or target specified for every input used in the training process. Input-output pairs, or samples, are used during training. The input consists of a vector of real numbers, with each element of the vector corresponding to an explanatory variable (Rojas, 1996). For example, in a site profiling modeling application, the elements of an input vector could be precipitation, groundwater elevation, and streamflow. Each input is propagated through the ANN and the model output is compared to the target data. The target data is also a vector of real numbers that gives the values of the variables being modeled by the ANN. Unless the model is perfectly trained, there will be differences between the target data and the ANN output. The goal of the training process is to minimize the differences between the ANN output and the target data values by adjusting, or updating, the weights between nodes.
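The weight-update idea described above can be illustrated with a minimal sketch. It assumes a single linear node trained with the delta (least-mean-squares) rule, which is a simplification and not necessarily the training algorithm used in this research. The input values, the rule used to generate the targets, the learning rate, and the function names are all hypothetical.

```python
def predict(weights, bias, x):
    """Output of a single linear node: weighted sum of the inputs plus a bias."""
    return sum(w * xi for w, xi in zip(weights, x)) + bias


def sum_squared_error(weights, bias, samples):
    """Total squared difference between targets and model outputs."""
    return sum((target - predict(weights, bias, x)) ** 2 for x, target in samples)


def train(samples, learning_rate=0.2, epochs=2000):
    """Delta rule: after each sample, move the weights to reduce that sample's error."""
    weights = [0.0] * len(samples[0][0])
    bias = 0.0
    for _ in range(epochs):
        for x, target in samples:
            error = target - predict(weights, bias, x)
            weights = [w + learning_rate * error * xi for w, xi in zip(weights, x)]
            bias += learning_rate * error
    return weights, bias


# Hypothetical inputs scaled to [0, 1]:
# (precipitation, groundwater elevation, streamflow).
inputs = [(0.1, 0.9, 0.3), (0.8, 0.2, 0.5), (0.4, 0.4, 0.9),
          (0.9, 0.7, 0.1), (0.3, 0.1, 0.6), (0.6, 0.5, 0.2)]
# Made-up targets generated from a known linear rule, so a perfect fit exists.
samples = [(x, 0.5 * x[0] - 0.2 * x[1] + 0.1 * x[2] + 0.3) for x in inputs]

error_before = sum_squared_error([0.0, 0.0, 0.0], 0.0, samples)
weights, bias = train(samples)
error_after = sum_squared_error(weights, bias, samples)
print(error_before, error_after)
```

Each pass compares the model output with the target and nudges the weights in the direction that shrinks the difference, which is the essence of the supervised procedure regardless of the network size.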
3.6.2 Unsupervised Learning
In unsupervised learning, during the training process no sample outputs are provided to the network against which it can measure its predictive performance for a given vector of inputs (Rojas, 1996). The network internally monitors its performance. It looks for regularities or trends in the input data set, and makes adaptations accordingly. Even without being told whether it is right or wrong, the network still must have some information about how to organize itself. This information is built into the network topology and learning rules.
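As an illustration of a learning rule that organizes a network without target outputs, the following sketch uses simple competitive (winner-take-all) learning. This rule is chosen here only as a well-known example; the text above does not name a specific unsupervised algorithm. The data points, starting prototypes, and parameter values are hypothetical.

```python
def nearest(prototypes, x):
    """Index of the prototype (node weight vector) closest to input x."""
    distances = [sum((p - xi) ** 2 for p, xi in zip(proto, x)) for proto in prototypes]
    return distances.index(min(distances))


def competitive_learning(inputs, prototypes, learning_rate=0.1, epochs=50):
    """Move only the winning prototype a small step toward each input."""
    prototypes = [list(p) for p in prototypes]
    for _ in range(epochs):
        for x in inputs:
            winner = nearest(prototypes, x)
            prototypes[winner] = [p + learning_rate * (xi - p)
                                  for p, xi in zip(prototypes[winner], x)]
    return prototypes


# Hypothetical inputs forming two loose groups; no target outputs are given.
inputs = [(0.1, 0.2), (0.2, 0.1), (0.15, 0.15),
          (0.9, 0.8), (0.8, 0.9), (0.85, 0.85)]
trained = competitive_learning(inputs, prototypes=[(0.4, 0.4), (0.6, 0.6)])
print(trained)
```

No correct answers are supplied; the regularity in the inputs (two groups) is discovered because the learning rule itself rewards the node that already responds best to each input.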
3.6.3 Reinforcement Learning
The third type of learning is reinforcement learning. This is a special case of supervised learning in which the network is provided only with a critique on the goodness of network outputs for a given input pattern rather than true answers.
3.6.4 Training of a Network
The training of a network begins by:
1. Making an initial choice of the suitable neural network structure (or architecture),
2. Assigning initial random small values for the connection weights to calculate the output, and
3. Finally, selecting a learning rate, which can appropriately control the adjustment rate of the connection weights.
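The three starting steps can be sketched as follows. The layer sizes follow the ten-input, three-output network of Figure 3.1, while the number of hidden nodes, the weight range, the learning-rate value, and all function names are assumptions made for illustration. A sigmoidal activation is assumed for every node.

```python
import math
import random


def init_network(n_inputs, n_hidden, n_outputs, scale=0.1, seed=0):
    """Steps 1 and 2: fix the architecture and assign small random weights.

    Each node is a row of weights; the last entry of a row is its bias.
    """
    rng = random.Random(seed)

    def layer(n_in, n_out):
        return [[rng.uniform(-scale, scale) for _ in range(n_in + 1)]
                for _ in range(n_out)]

    return {"hidden": layer(n_inputs, n_hidden), "output": layer(n_hidden, n_outputs)}


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def forward(network, x):
    """Calculate the network output for one input vector."""
    hidden = [sigmoid(sum(w * v for w, v in zip(row, list(x) + [1.0])))
              for row in network["hidden"]]
    return [sigmoid(sum(w * v for w, v in zip(row, hidden + [1.0])))
            for row in network["output"]]


def apply_update(weight, correction, learning_rate):
    """Step 3: the learning rate scales how far a weight moves per adjustment."""
    return weight + learning_rate * correction


network = init_network(n_inputs=10, n_hidden=4, n_outputs=3)  # 4 hidden nodes is arbitrary
outputs = forward(network, [0.5] * 10)
print(outputs)
```

With small initial weights every sigmoidal output starts close to 0.5, so no node is saturated before training begins, which is one common reason for preferring small initial values.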
The training procedure is repeated until the actual and calculated outputs agree within some pre-determined tolerance. In other words, the network stops learning when weight adjustment produces no improvement in the output values. Training is performed in order to determine the best possible values of connection weights for further use as a prediction tool (Najjar, 1999; Najjar et al., 2000). In this research, the process of training and on-line testing was repeated thousands of times for networks with different numbers of hidden nodes. Hundreds of networks were developed and then compared in order to select the one with optimum performance.
Neural networks can reach a least-error structure by training, using examples related to the problem under consideration. A least-error structure is the one responsible for producing outputs very close or equal to the real desired values (Jain et al., 1996). Reasonable training input and output vectors should cover a wide range of the sampling domain. Deriving an appropriate and representative mapping between input and output vectors reflects the effectiveness of neural networks. For proper modeling, a network should pass through at least two stages, namely training and testing. Selected data with their input and output values are introduced to a network (having a certain number of hidden nodes and layers) so that the network trains itself to produce output values that are as close to the real values as possible. The training is achieved by modifying the values of the connection weights. The network stops learning when adjusting the weights produces no improvement in the output values. The same network should then be tested on data that was never used in training in order to verify the network’s generalization capabilities. The procedures of training and testing should be repeated for networks having different numbers of hidden layers and/or hidden nodes. Changing the input parameters and the number of outputs also affects the performance of a network. This is why, at this stage, there are hundreds or thousands of networks to compare, from which the one with the optimum performance is selected.
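The train, test, and compare cycle described above can be sketched on a toy problem. The sketch assumes a single hidden layer, one sigmoidal output, and plain on-line backpropagation; the data, the candidate hidden-node counts, the learning rate, and the number of epochs are hypothetical and far smaller than the hundreds of networks compared in this research.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def make_network(n_inputs, n_hidden, rng):
    """Small random weights; the last entry of each row is the bias."""
    hidden = [[rng.uniform(-0.5, 0.5) for _ in range(n_inputs + 1)]
              for _ in range(n_hidden)]
    output = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden + 1)]
    return hidden, output


def forward(network, x):
    hidden, output = network
    h = [sigmoid(sum(w * v for w, v in zip(row, x + [1.0]))) for row in hidden]
    y = sigmoid(sum(w * v for w, v in zip(output, h + [1.0])))
    return h, y


def train(network, samples, learning_rate=0.5, epochs=500):
    """On-line backpropagation: weights are adjusted in place after every sample."""
    hidden, output = network
    for _ in range(epochs):
        for x, target in samples:
            h, y = forward(network, x)
            delta_out = (target - y) * y * (1.0 - y)
            # Hidden deltas are computed with the output weights before they change.
            delta_hid = [delta_out * output[j] * h[j] * (1.0 - h[j])
                         for j in range(len(h))]
            for j, v in enumerate(h + [1.0]):
                output[j] += learning_rate * delta_out * v
            for j, row in enumerate(hidden):
                for i, v in enumerate(x + [1.0]):
                    row[i] += learning_rate * delta_hid[j] * v


def average_squared_error(network, samples):
    return sum((t - forward(network, x)[1]) ** 2 for x, t in samples) / len(samples)


def select_best(train_set, test_set, hidden_sizes, seed=0):
    """Train one network per hidden-node count and keep the lowest testing error."""
    results = {}
    for n_hidden in hidden_sizes:
        network = make_network(len(train_set[0][0]), n_hidden, random.Random(seed))
        train(network, train_set)
        results[n_hidden] = average_squared_error(network, test_set)
    return min(results, key=results.get), results


def target(a, b):
    return 0.2 + 0.6 * a * b  # made-up relationship to be learned


train_set = [([a, b], target(a, b)) for a in (0.0, 0.5, 1.0) for b in (0.0, 0.5, 1.0)]
test_set = [([a, b], target(a, b)) for a in (0.25, 0.75) for b in (0.25, 0.75)]  # unseen
best, results = select_best(train_set, test_set, hidden_sizes=[1, 2, 4])
print(best, results)
```

The testing points are never shown to the network during training, so the error used for selection reflects generalization rather than memorization of the training samples.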
3.7 Accuracy Measures
Generated networks are compared by their performance (i.e., accuracy) parameters. These parameters are the Averaged Square Error (ASE), the coefficient of determination, known as R-squared (R²), and the Mean Absolute Relative Error (MARE %). The ASE value can be calculated by the formula:
$$\mathrm{ASE} = \frac{\sum (y - y')^{2}}{\text{\# of data sets}} \qquad (3.4)$$
y' being the output generated by the network and y being the real value of the parameter. The MARE value is calculated using the formula:
$$\mathrm{MARE}\,\% = \frac{\sum \left| \dfrac{y - y'}{y} \right| \times 100}{(\text{\# of outputs}) \times (\text{\# of sets})} \qquad (3.5)$$
Generally, we search for the network that produces the minimum values of ASE and MARE% and the highest R². Testing performance parameters should be considered when selecting the best-performing network. Training performance measures may be used in special cases, if needed.
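A minimal sketch of the three accuracy measures follows. ASE is taken as the sum of squared errors divided by the number of data sets (Equation 3.4), and MARE% as the summed absolute relative error, times 100, divided by the number of outputs times the number of sets (Equation 3.5). The text gives no formula for R², so the common definition of one minus the ratio of residual to total sum of squares is assumed here. The function names and sample values are hypothetical.

```python
def ase(actual, predicted):
    """Averaged Square Error: squared errors summed, divided by the number of data sets."""
    squared = sum((y - yp) ** 2
                  for ys, yps in zip(actual, predicted)
                  for y, yp in zip(ys, yps))
    return squared / len(actual)


def mare_percent(actual, predicted):
    """Mean Absolute Relative Error in percent; real values must be non-zero."""
    n_sets, n_outputs = len(actual), len(actual[0])
    total = sum(abs((y - yp) / y)
                for ys, yps in zip(actual, predicted)
                for y, yp in zip(ys, yps))
    return 100.0 * total / (n_outputs * n_sets)


def r_squared(actual, predicted):
    """Assumed definition: 1 - SS_residual / SS_total over all output values."""
    ys = [y for row in actual for y in row]
    yps = [yp for row in predicted for yp in row]
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - yp) ** 2 for y, yp in zip(ys, yps))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot


# Hypothetical values: three data sets, one output each.
actual = [[2.0], [4.0], [5.0]]       # y, the real values
predicted = [[2.2], [3.8], [5.0]]    # y', the network outputs

print(ase(actual, predicted), mare_percent(actual, predicted), r_squared(actual, predicted))
```

Because MARE divides by the real value y, it is undefined when any real value is zero; such cases would need separate handling in practice.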
Figure 3.1. A typical multi-layer ANN showing the input layer for ten different inputs, the middle or hidden layer(s), and the output layer having three outputs
Figure 3.2. The basic building block of the network system, the neuron
Figure 3.4. McCulloch-Pitts neuron model
Figure 3.7. Sigmoidal function