Basics of Artificial Neural Networks
output 1 and an initial
for more complex neural network architectures.

Let us consider two layers $F_1$ and $F_2$ with $M$ and $N$ processing units, respectively. By providing connections between the $j$th unit in the $F_2$ layer and all the units in the $F_1$ layer, as shown in Figures 1.7a and 1.7b, we get two network structures, instar and outstar, which have fan-in and fan-out geometries, respectively [Grossberg, 1982]. During learning, the normalised weight vector $\mathbf{w}_j = (w_{j1}, w_{j2}, \ldots, w_{jM})^T$ in instar approaches the normalised input vector, when an input vector $\mathbf{a} = (a_1, a_2, \ldots, a_M)^T$ is presented at the $F_1$ layer. Thus the activation $\mathbf{w}_j^T \mathbf{a} = \sum_{i=1}^{M} w_{ji} a_i$ of the $j$th unit in the $F_2$ layer will approach its maximum value during learning. Whenever the input is given to $F_1$, the $j$th unit of $F_2$ will be activated to the maximum extent. Thus the operation of an instar can be viewed as content addressing the memory.

In the case of an outstar, during learning, the weight vector for the connections from the $j$th unit in $F_2$ approaches the activity pattern in $F_1$ when an input vector $\mathbf{a}$ is presented at $F_1$. During recall, whenever the unit $j$ is activated, the signal pattern $(s_j w_{1j}, s_j w_{2j}, \ldots, s_j w_{Mj})^T$ will be transmitted to $F_1$, where $s_j$ is the output of the $j$th unit. This signal pattern then produces the original activity pattern corresponding to the input vector $\mathbf{a}$, although the input is absent. Thus the operation of an outstar can be viewed as memory addressing the contents.

When all the connections from the units in $F_1$ to $F_2$ are made as in Figure 1.7c, we obtain a heteroassociation network. This network can be viewed as a group of instars, if the flow is from $F_1$ to $F_2$. On the other hand, if the flow is from $F_2$ to $F_1$, then the network can be viewed as a group of outstars (Figure 1.7d).

When the flow is bidirectional, we get a bidirectional associative memory (Figure 1.7e), where either of the layers can be used as input or output. If the two layers $F_1$ and $F_2$ coincide and the weights are symmetric, $w_{ij} = w_{ji}$, $i \neq j$, then we obtain an autoassociative memory in which each unit is connected to every other unit and to itself (Figure 1.7f).

Figure 1.7 Some basic structures of artificial neural networks: (a) instar, (b) outstar, (c) group of instars, (d) group of outstars, (e) bidirectional associative memory, (f) autoassociative memory.
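The instar behaviour described above (the activation $\mathbf{w}^T \mathbf{a}$ growing towards its maximum as the normalised weight vector approaches the normalised input) can be illustrated with a minimal sketch. The update rule used here, moving the weights a fraction `eta` towards the input and renormalising, is an assumption made for illustration; the text does not specify the instar learning law at this point. The function names and the example vectors are likewise hypothetical.

```python
import math

def normalise(v):
    """Scale a vector to unit Euclidean length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def activation(w, a):
    """Instar activation: the inner product w^T a."""
    return sum(wi * ai for wi, ai in zip(w, a))

def instar_step(w, a, eta=0.2):
    """Assumed update: move w a fraction eta towards a, then renormalise."""
    moved = [wi + eta * (ai - wi) for wi, ai in zip(w, a)]
    return normalise(moved)

a = normalise([3.0, 4.0, 0.0])   # normalised input vector (illustrative values)
w = normalise([0.0, 1.0, 1.0])   # arbitrary initial weight vector

history = [activation(w, a)]     # activation before and after each step
for _ in range(50):
    w = instar_step(w, a)
    history.append(activation(w, a))
```

Since both vectors have unit length, the activation equals the cosine of the angle between them, so it cannot exceed 1, and it approaches that bound as the weight vector aligns with the input.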
1.6 Basic Learning Laws
The operation of a neural network is governed by neuronal dynamics. Neuronal dynamics consists of two parts: one corresponding to the dynamics of the activation state and the other corresponding to the dynamics of the synaptic weights. The Short Term Memory in neural networks is modelled by the activation state of the network. The Long Term Memory corresponds to the encoded pattern information in the synaptic weights due to learning. We will discuss models of neuronal dynamics in Chapter 2. In this section we discuss some basic learning laws [Zurada, 1992, 2.5; Hassoun, 1995, Ch.]. Learning laws are merely implementation models of synaptic dynamics. Typically, a model of synaptic dynamics is described in terms of expressions for the first derivative of the weights. They are called learning equations.
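Learning laws describe the weight vector for the $i$th processing unit at time instant $(t + 1)$ in terms of the weight vector at time instant $(t)$ as follows:

$$\mathbf{w}_i(t + 1) = \mathbf{w}_i(t) + \Delta\mathbf{w}_i(t)$$

where $\Delta\mathbf{w}_i(t)$ is the change in the weight vector.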
There are different methods for implementing the learning feature of a neural network, leading to several learning laws. Some basic learning laws are discussed below. All these learning laws use only local information for adjusting the weight of the connection between two units.
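1.6.1 Hebb's Law
Here the change in the weight vector is given by

$$\Delta\mathbf{w}_i = \eta\, f(\mathbf{w}_i^T \mathbf{a})\, \mathbf{a}$$

Therefore, the $j$th component of $\Delta\mathbf{w}_i$ is given by

$$\Delta w_{ij} = \eta\, f(\mathbf{w}_i^T \mathbf{a})\, a_j = \eta\, s_i\, a_j, \quad \text{for } j = 1, 2, \ldots, M$$

where $\eta$ is the learning rate parameter and $s_i$ is the output signal of the $i$th unit. The law states that the weight increment is proportional to the product of the input data and the resulting output signal of the unit. This law requires weight initialization to small random values around $w_{ij} = 0$ prior to learning. This law represents unsupervised learning.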
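A minimal sketch of the Hebbian update follows. The output function `tanh`, the learning rate, the input vector and the function names are assumptions chosen for illustration; the text only requires that the increment be proportional to the product of the input and the output signal of the unit, and that the weights start as small random values around zero.

```python
import math
import random

def output(x):
    """Assumed output function f(x) = tanh(x); any output function could be used."""
    return math.tanh(x)

def hebb_update(w, a, eta=0.1):
    """Return w + delta_w, where delta_w_j = eta * s * a_j and s = f(w^T a)."""
    s = output(sum(wj * aj for wj, aj in zip(w, a)))
    return [wj + eta * s * aj for wj, aj in zip(w, a)]

random.seed(0)
a = [1.0, -1.0, 0.5]                            # illustrative input vector
w = [random.uniform(-0.01, 0.01) for _ in a]    # small random weights around 0

# No desired output is used anywhere: the law is unsupervised.
for _ in range(20):
    w = hebb_update(w, a)
```

Each increment is a scalar multiple of the input vector, so repeated presentation of one pattern only scales the weight change along that pattern's direction.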
1.6.2 Perceptron Learning Law
Here the change in the weight vector is given by

$$\Delta\mathbf{w}_i = \eta\, [b_i - \mathrm{sgn}(\mathbf{w}_i^T \mathbf{a})]\, \mathbf{a}$$

where $b_i$ is the desired output of the $i$th unit and $\mathrm{sgn}(x)$ is the sign of $x$. Therefore, we have

$$\Delta w_{ij} = \eta\, [b_i - \mathrm{sgn}(\mathbf{w}_i^T \mathbf{a})]\, a_j = \eta\, (b_i - s_i)\, a_j, \quad \text{for } j = 1, 2, \ldots, M$$

This law is applicable only for bipolar output functions $f(\cdot)$. This is also called the discrete perceptron learning law. The expression for $\Delta w_{ij}$ shows that the weights are adjusted only if the actual output $s_i$ is incorrect, since the term in the square brackets is zero for the correct output. This is a supervised learning law, as the law requires a desired output for each input. In implementation, the weights can be initialized to any random initial values, as they are not critical. The weights converge to the final values eventually by repeated use of the input-output pattern pairs, provided the pattern pairs are representable by the system. These issues will be discussed in Chapter 4.

1.6.3 Delta Learning Law
Here the change in the weight vector is given by

$$\Delta\mathbf{w}_i = \eta\, [b_i - f(\mathbf{w}_i^T \mathbf{a})]\, f'(\mathbf{w}_i^T \mathbf{a})\, \mathbf{a}$$

where $f'(x)$ is the derivative of $f(x)$ with respect to $x$. Hence,

$$\Delta w_{ij} = \eta\, [b_i - f(\mathbf{w}_i^T \mathbf{a})]\, f'(\mathbf{w}_i^T \mathbf{a})\, a_j, \quad \text{for } j = 1, 2, \ldots, M$$

This law is valid only for a differentiable output function, as it depends on the derivative of the output function $f(\cdot)$. It is a supervised learning law since the change in the weight is based on the error between the desired and the actual output values for a given input. The delta learning law can also be viewed as a continuous perceptron learning law.

In implementation, the weights can be initialized to any random values as the values are not very critical. The weights converge to the final values eventually by repeated use of the input-output pattern pairs. The convergence can be more or less guaranteed by using more layers of processing units in between the input and output layers. The delta learning law can be generalized to the case of multiple layers of a network. We will discuss the generalized delta rule or the error backpropagation learning law in Chapter 4.
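The discrete perceptron law and the delta law differ only in how the error term is formed: the first uses the hard-limited output, the second a differentiable output and its derivative. The sketch below places the two updates side by side. The choice of `tanh` as the differentiable output function, the learning rate, the function names, and the small bipolar training set (with a constant third input acting as a bias) are all assumptions made for illustration.

```python
import math

def sgn(x):
    """Bipolar hard-limiting output: +1 for x >= 0, otherwise -1."""
    return 1.0 if x >= 0 else -1.0

def dot(w, a):
    """Activation value w^T a."""
    return sum(wj * aj for wj, aj in zip(w, a))

def perceptron_update(w, a, b, eta=0.1):
    """Discrete perceptron law: delta_w = eta * (b - sgn(w^T a)) * a."""
    err = b - sgn(dot(w, a))
    return [wj + eta * err * aj for wj, aj in zip(w, a)]

def delta_update(w, a, b, eta=0.1):
    """Delta law with assumed f = tanh: delta_w = eta * (b - f(x)) * f'(x) * a."""
    x = dot(w, a)
    s = math.tanh(x)
    fprime = 1.0 - s * s          # derivative of tanh evaluated at x
    return [wj + eta * (b - s) * fprime * aj for wj, aj in zip(w, a)]

# Hypothetical, linearly separable bipolar pattern pairs (input, desired output).
# The third input component is a constant 1.0 that plays the role of a bias.
pairs = [
    ([1.0, 1.0, 1.0], 1.0),
    ([1.0, -1.0, 1.0], -1.0),
    ([-1.0, 1.0, 1.0], -1.0),
    ([-1.0, -1.0, 1.0], -1.0),
]

w_p = [0.0, 0.0, 0.0]   # weights trained with the perceptron law
w_d = [0.0, 0.0, 0.0]   # weights trained with the delta law
for _ in range(100):    # repeated use of the input-output pattern pairs
    for a, b in pairs:
        w_p = perceptron_update(w_p, a, b)
        w_d = delta_update(w_d, a, b)
```

Once every pattern is classified correctly, `perceptron_update` leaves the weights unchanged, whereas `delta_update` keeps making small adjustments because the continuous output never exactly equals the bipolar target.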
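1.6.4 Widrow and Hoff LMS Learning Law
Here the change in the weight vector is given by

$$\Delta\mathbf{w}_i = \eta\, [b_i - \mathbf{w}_i^T \mathbf{a}]\, \mathbf{a}$$

Hence

$$\Delta w_{ij} = \eta\, [b_i - \mathbf{w}_i^T \mathbf{a}]\, a_j, \quad \text{for } j = 1, 2, \ldots, M \qquad (1.9)$$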
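As a rough illustration of the update in Eq. (1.9), the sketch below applies it repeatedly to a small training set. The data, the learning rate and the function names are hypothetical; the desired outputs were generated from an exactly linear relation so that the weights can settle, which need not happen for arbitrary data.

```python
def lms_update(w, a, b, eta=0.05):
    """Widrow-Hoff law: delta_w = eta * (b - w^T a) * a, with a linear output."""
    err = b - sum(wj * aj for wj, aj in zip(w, a))
    return [wj + eta * err * aj for wj, aj in zip(w, a)]

def total_squared_error(w, pairs):
    """Sum of squared errors between desired outputs and activation values."""
    return sum((b - sum(wj * aj for wj, aj in zip(w, a))) ** 2 for a, b in pairs)

# Hypothetical pattern pairs generated by b = 2*a1 - a2 (illustration only).
pairs = [
    ([1.0, 0.0], 2.0),
    ([0.0, 1.0], -1.0),
    ([1.0, 1.0], 1.0),
    ([2.0, -1.0], 5.0),
]

w = [0.0, 0.0]                       # any initial values may be used
initial_error = total_squared_error(w, pairs)
for _ in range(200):                 # the pattern pairs are applied several times
    for a, b in pairs:
        w = lms_update(w, a, b)
final_error = total_squared_error(w, pairs)
```

In this constructed case the weights move towards the generating coefficients and the squared error shrinks with each pass over the data.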
This is a supervised learning law and is a special case of the delta learning law, where the output function is assumed linear, i.e., $f(x_i) = x_i$. In this case the change in the weight is made proportional to the negative gradient of the squared error between the desired output and the continuous activation value, which is also the continuous output signal due to linearity of the output function. Hence, this is also called the Least Mean Squared (LMS) error learning law. This is the same as the learning law used in the Adaline model of neuron. In implementation, the weights may be initialized to any values. The input-output pattern pairs are applied several times to achieve convergence of the weights for a given set of training data. The convergence is not guaranteed for any arbitrary training data set.

1.6.5 Correlation Learning Law