Normalization and ANN Training - ARTIFICIAL NEURAL NETWORKS

3.2. ARTIFICIAL NEURAL NETWORKS

3.2.4. Normalization and ANN Training

Normalization is the process of scaling the numbers in a data set into a specified range in order to improve the accuracy of subsequent numeric computations. Normalization of the input and output data is critical. The values in external data files are not always suited for copying directly to network activities, because the activities in the network are normally in the range [-1, 1]; if some activities differ significantly from this range, training performance is often degraded. Without normalization, signals or values of large magnitude tend to dominate. If the input and output variables are not of the same order of magnitude, some variables may appear to have more significance than they actually do. The training algorithm then has to compensate for order-of-magnitude differences by adjusting the network weights, which many training algorithms, such as the back propagation algorithm, do not do very effectively. For example, if one input variable has values in the thousands and another has values in the tens, the weight assigned to the second variable entering a node of hidden layer 1 must be much greater than that for the first. In addition, typical transfer functions such as the sigmoid or hyperbolic tangent function cannot distinguish between two input values when both are very large, because both yield practically identical saturated output values of 1.
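The saturation effect described above can be checked numerically. The following is a minimal sketch using only Python's standard `math` module; the input values 50 and 5000 are arbitrary illustrative choices, not data from this study, and the helper name `sigmoid` is introduced here for illustration.

```python
import math

def sigmoid(x):
    """Logistic sigmoid, 1 / (1 + exp(-x))."""
    return 1.0 / (1.0 + math.exp(-x))

# Two raw inputs that differ by two orders of magnitude (illustrative values).
small, large = 50.0, 5000.0

# Both transfer functions saturate: in double precision the two outputs
# are indistinguishable from each other.
tanh_outputs = (math.tanh(small), math.tanh(large))
sigmoid_outputs = (sigmoid(small), sigmoid(large))

# After dividing by the larger magnitude, both inputs lie in [-1, 1]
# and the transfer function can tell them apart again.
scaled_outputs = (math.tanh(small / large), math.tanh(large / large))

print(tanh_outputs, sigmoid_outputs, scaled_outputs)
```

Dividing by the largest magnitude is only the simplest possible scaling; it is used here to isolate the saturation effect.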

In general, proper normalization, particularly of the input data, makes training algorithms numerically robust and leads to faster convergence. A common normalization method is to scale each pattern so that the minimum value is mapped to -1 and the maximum value is mapped to +1. In this report, the input data patterns are normalized between -1 and +1.
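A linear mapping of this kind can be sketched as follows. This is an illustration under stated assumptions rather than the procedure used in this study: the function names `normalize` and `denormalize`, the handling of a constant variable, and the sample values are all hypothetical.

```python
def normalize(values, new_min=-1.0, new_max=1.0):
    """Linearly map values so that min(values) -> new_min and max(values) -> new_max."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Assumption: a constant variable is mapped to the middle of the range.
        return [(new_min + new_max) / 2.0 for _ in values]
    return [new_min + (v - lo) * (new_max - new_min) / (hi - lo) for v in values]

def denormalize(scaled, lo, hi, new_min=-1.0, new_max=1.0):
    """Invert normalize(), given the original minimum (lo) and maximum (hi)."""
    return [lo + (s - new_min) * (hi - lo) / (new_max - new_min) for s in scaled]

# Illustrative values: one variable in the thousands, one in the tens.
input_a = [1200.0, 3400.0, 5600.0]
input_b = [10.0, 25.0, 40.0]

scaled_a = normalize(input_a)
scaled_b = normalize(input_b)
restored_a = denormalize(scaled_a, min(input_a), max(input_a))

print(scaled_a, scaled_b, restored_a)
```

After scaling, both variables span the same range, so neither dominates merely because of its units. The original minimum and maximum must be stored if network outputs are later to be converted back to physical values.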

Training (learning) is the process of modifying the weights of a neural network in order to produce a network that performs some function. The goal of learning is to create a model that correctly maps the input to the output using historical data, so that the model can then be used to produce the output when the desired output is unknown. This is achieved by adjusting all the connection weights and biases of the network so that the calculated outputs approximate the desired values.

Basically, the development of an ANN involves two phases: a training (or learning) phase and a testing phase. Neural networks develop information processing capabilities by learning from examples, called the training set (a collection of input-output patterns used to train the network). The network learns through a process that modifies the connection weights between neurons and layers. Once the network has learnt the problem, it is tested with new, unknown patterns and its efficiency is checked.
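One way to reserve unseen patterns for the testing phase is to hold back part of the available data before training. The sketch below assumes a random 80/20 split; the function name `split_patterns`, the split ratio, the seed, and the toy patterns are all assumptions for illustration, since the text does not state how the two sets were formed.

```python
import random

def split_patterns(patterns, test_fraction=0.2, seed=0):
    """Shuffle the patterns and split them into (training set, testing set)."""
    shuffled = list(patterns)
    random.Random(seed).shuffle(shuffled)  # seeded so the split is reproducible
    n_test = int(round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]

# Ten toy input-output patterns (illustrative only).
patterns = [([float(i)], [float(i) * 2.0]) for i in range(10)]

train_set, test_set = split_patterns(patterns)
print(len(train_set), len(test_set))
```

The testing patterns are never used for weight updates, so the error measured on them indicates how well the network handles cases it has not seen.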

As mentioned previously, learning techniques can be either supervised or unsupervised. Supervised learning requires a set of examples for which the desired network response is known. The learning process then consists of adapting the network so that it produces the correct response for the set of examples. The resulting network should then be able to generalize (give a good response) when presented with cases not found in the set of examples. In unsupervised learning, by contrast, the neural network is autonomous: it processes the data it is presented with, finds out about some of the properties of the data set, and learns to reflect these properties in its output. Exactly which properties the network can learn to recognize depends on the particular network model and learning method.

The supervised learning type has been utilized in this study. Hence, all explanations following this will be based on this category.

The objective of the different supervised learning algorithms is the iterative optimization of a so-called error function, which represents a measure of the performance of the network. For obvious reasons, the training process requires a proper set of data, i.e. input and target output, and a proper mapping of the input to the output. The error function usually used for the training process is defined as the mean of the summed squared differences between the values of the output units of the network and the desired target values, calculated over the whole input pattern set. More specifically, the error for a pattern p is given by Equation (3.3):

$$E_p = \sum_{j=1}^{N_O} (d_{pj} - a_{pj})^2 \tag{3.3}$$

where d_pj and a_pj are the target and the actual response values of the jth output neuron corresponding to the pattern p.

Thus, the total mean square error and the root mean square error (a typical performance function used for training feedforward neural networks) are given by Equations (3.4) and (3.5), respectively.

$$E = \frac{1}{P} \sum_{p=1}^{P} E_p = \frac{1}{P} \sum_{p=1}^{P} \sum_{j=1}^{N_O} (d_{pj} - a_{pj})^2 \tag{3.4}$$

$$RMSE = \sqrt{\frac{E}{P \cdot N_O}} \tag{3.5}$$

where P is the number of training patterns and N_O is the number of outputs.
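The per-pattern error of Equation (3.3) and its average over all patterns in Equation (3.4) can be sketched in a few lines. The function names and the sample targets and outputs below are hypothetical and serve only to make the arithmetic concrete.

```python
def pattern_error(targets, outputs):
    """Equation (3.3): sum of squared differences over the output neurons of one pattern."""
    return sum((d - a) ** 2 for d, a in zip(targets, outputs))

def mean_square_error(all_targets, all_outputs):
    """Equation (3.4): average of the per-pattern errors over all P patterns."""
    errors = [pattern_error(d, a) for d, a in zip(all_targets, all_outputs)]
    return sum(errors) / len(errors)

# Illustrative data: P = 3 patterns, each with 2 output neurons.
targets = [[1.0, -1.0], [-1.0, 1.0], [1.0, 1.0]]
outputs = [[0.5, -1.0], [-1.0, 0.0], [1.0, 1.0]]

per_pattern = [pattern_error(d, a) for d, a in zip(targets, outputs)]
E = mean_square_error(targets, outputs)
print(per_pattern, E)
```

Here the three per-pattern errors are 0.25, 1.0 and 0.0, so the second pattern contributes most of the total error even though only one of its outputs is wrong.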

During the training process, a set of pattern examples is used, each example consisting of a pair of an input and its corresponding target output. The patterns are presented to the network sequentially, in an iterative manner, and the appropriate weight corrections are performed during the process to adapt the network to the desired behavior [Demuth and Beale, 2002]. This iteration continues until the connection weight values allow the network to perform the required mapping. Each presentation of the whole pattern set is called an epoch. The iteration process is governed by different types of training algorithms, which will be discussed in the following subsection.
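The epoch-based procedure can be illustrated with a deliberately small example. The sketch below trains a single tanh neuron with a plain gradient-descent (delta rule) update after each pattern; this update rule, the learning rate, the stopping tolerance, and the synthetic patterns are assumptions chosen for illustration and are not the training algorithm used in this study.

```python
import math
import random

def train(patterns, learning_rate=0.1, max_epochs=1000, tolerance=1e-3, seed=0):
    """Train one tanh neuron by presenting the patterns sequentially.

    Returns (weights, bias, epochs_used, final_error).
    """
    rng = random.Random(seed)
    n_inputs = len(patterns[0][0])
    weights = [rng.uniform(-0.5, 0.5) for _ in range(n_inputs)]
    bias = 0.0
    error = float("inf")
    for epoch in range(1, max_epochs + 1):
        total = 0.0
        for inputs, target in patterns:  # one pattern at a time
            net = sum(w * x for w, x in zip(weights, inputs)) + bias
            output = math.tanh(net)
            diff = target - output
            total += diff ** 2
            # Weight correction: gradient descent on the squared error,
            # using d tanh(net) / d net = 1 - output^2.
            delta = diff * (1.0 - output ** 2)
            weights = [w + learning_rate * delta * x for w, x in zip(weights, inputs)]
            bias += learning_rate * delta
        error = total / len(patterns)  # mean square error accumulated over this epoch
        if error < tolerance:  # the mapping is considered learnt
            break
    return weights, bias, epoch, error

# Synthetic, already normalized patterns generated by a known tanh neuron,
# so an exact mapping exists (illustrative values only).
true_w, true_b = [0.8, -0.5], 0.1
grid = [-1.0, 0.0, 1.0]
patterns = [([x1, x2], math.tanh(true_w[0] * x1 + true_w[1] * x2 + true_b))
            for x1 in grid for x2 in grid]

weights, bias, epochs_used, final_error = train(patterns)
print(epochs_used, final_error)
```

Each pass of the inner loop over all nine patterns is one epoch. Training stops either when the mean square error falls below the tolerance or when the epoch limit is reached.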

In document Self-Adaptive Autoreclosing Scheme using Artificial Neural Network and Taguchi's Methodology in Extra High Voltage Transmission Systems (Page 74-77)