Synaptic Dynamics Models - Activation and Synaptic Dynamics

Activation and Synaptic Dynamics

2.3 Synaptic Dynamics Models

2.3.1 Learning

Synaptic dynamics is attributed to learning in a biological neural network. The synaptic weights are adjusted to learn the pattern information in the input samples. Typically, learning is a slow process, and the samples containing a pattern may have to be presented to the network several times before the pattern information is captured by the weights of the network. A large number of samples are normally needed for the network to learn the pattern implicit in the samples. Pattern information is distributed across all the weights, and it is difficult to relate the weights directly to the training samples. The only way to demonstrate the evidence of learning pattern information is that, given another sample from the same pattern source, the network would classify the new sample into the pattern class of the earlier trained samples. Another interesting feature of learning is that the pattern information is slowly acquired by the network from the training samples, and the training samples themselves are never stored in the network. That is why we say that we learn from examples, not store the examples themselves.

The adjustment of the synaptic weights is represented by a set of learning equations, which describe the synaptic dynamics of the network. The learning equation describing a synaptic dynamics model is given as an expression for the first derivative of the synaptic weight $w_{ij}$ connecting the unit $j$ to the unit $i$. The set of equations for all the weights in the network determines the trajectory of the weight states in the weight space from a given initial weight state.

Learning laws refer to the specific manners in which the learning equations are implemented. Depending on the synaptic dynamics model and the manner of implementation, several learning laws have been proposed in the literature. The following are some of the requirements of the learning laws for effective implementation:


(a) The learning law should lead to convergence of weights.

(b) The learning or training time for capturing the pattern information from the samples should be as small as possible.

(c) On-line learning is preferable to off-line learning. That is, the weights should be adjusted on presentation of each sample containing the pattern information.

(d) Learning should use only local information as far as possible. That is, the change in the weight on a connecting link between two units should depend on the states of these two units only. In such a case, it is possible to implement the learning law in parallel for all the weights, thus speeding up the learning process.

(e) Learning should be able to capture complex nonlinear mappings between input-output pattern pairs, as well as between adjacent patterns in a temporal sequence of patterns.

(f) Learning should be able to capture as many patterns as possible into the network. That is, the pattern information storage capacity should be as large as possible for a given network.
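The locality requirement above can be illustrated with a minimal sketch. It assumes a simple correlation-type (Hebbian-style) update, chosen here only for illustration; the function name `local_update` and the learning-rate parameter `eta` are hypothetical. Each weight change uses only the states of the two units that the weight connects, so every entry could in principle be computed independently and in parallel.

```python
def local_update(weights, s_post, s_pre, eta=0.1):
    """Return new weights after one local, correlation-type update.

    weights[i][j] connects unit j (pre) to unit i (post). The change in
    weights[i][j] depends only on s_post[i] and s_pre[j] (assumed rule).
    """
    return [
        [w_ij + eta * s_post[i] * s_pre[j] for j, w_ij in enumerate(row)]
        for i, row in enumerate(weights)
    ]


weights = [[0.0, 0.0], [0.0, 0.0]]
new_weights = local_update(weights, s_post=[1.0, -1.0], s_pre=[1.0, 0.5])
print(new_weights)
```

Because no entry of the result depends on any other weight, the nested comprehension could be replaced by a parallel or vectorized computation without changing the outcome.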

Categories of learning: Learning can be viewed as searching through the weight space in a systematic manner to determine the weights that lead to an optimum (minimum or maximum) value of an objective function. The search depends on the criterion used for learning. There are several criteria, which include minimization of mean squared error, relative entropy, maximum likelihood, gradient descent, etc. [1995]. There are several learning laws in use, and new laws are being proposed to suit a given application and architecture. Some of these will be discussed at appropriate places throughout the book, but there are some general categories that these laws fall into, based on the characteristics they are expected to possess. In the first place, the learning or weight adjustment could be supervised or unsupervised. In supervised learning the weight adjustment is determined based on the deviation of the desired output from the actual output. Supervised learning may be used for structural learning or for temporal learning. Structural learning is concerned with capturing in the weights the relationship between the given input-output pattern pairs. Temporal learning is concerned with capturing in the weights the relationship between neighbouring patterns in a sequence of patterns.
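As a minimal sketch of supervised weight adjustment, the following assumes a single linear unit trained with an error-correcting (delta-rule style) update. This particular rule, the function name `supervised_step`, and the learning rate `eta` are assumptions made for illustration; the text above only states that the adjustment depends on the deviation of the desired output from the actual output.

```python
def supervised_step(w, x, desired, eta=0.1):
    """One error-correcting update for a single linear unit (assumed rule)."""
    actual = sum(wi * xi for wi, xi in zip(w, x))
    error = desired - actual  # deviation of the desired from the actual output
    new_w = [wi + eta * error * xi for wi, xi in zip(w, x)]
    return new_w, error


w = [0.0, 0.0]
x, desired = [1.0, 2.0], 1.0  # hypothetical input-output pattern pair
errors = []
for _ in range(20):
    w, err = supervised_step(w, x, desired)
    errors.append(abs(err))
print(errors[0], errors[-1])
```

Repeated presentation of the same pair shrinks the error, which reflects the earlier remark that samples may have to be presented several times before the pattern information is captured by the weights.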

Unsupervised learning discovers features in a given set of patterns, and organizes the patterns accordingly. There is no externally specified desired output in this case. Unsupervised learning uses mostly local information to update the weights. The local information consists of signal or activation values of the units at either end of the connection for which the weight update is being made.

Learning methods may be off-line or on-line. In off-line learning, all the given patterns are used together to determine the weights. In on-line learning, on the other hand, the information in each new pattern is incorporated into the network by incrementally adjusting the weights. Thus on-line learning allows the neural network to update the information continuously. However, off-line learning generally provides better solutions than on-line learning, since in the off-line case the information is extracted using all the training samples together.
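The contrast between the two modes can be sketched for the simplest possible case, a single weight $w$ in the model $y = wx$. The data, the least-squares criterion, and the names `offline_fit` and `online_fit` are all hypothetical choices for illustration. The off-line estimate uses every sample at once, while the on-line estimate adjusts the weight after each sample is presented.

```python
samples = [(1.0, 2.1), (2.0, 3.9), (3.0, 6.2), (4.0, 7.8)]  # hypothetical (x, y) pairs


def offline_fit(data):
    """Use all samples together: least-squares slope through the origin."""
    return sum(x * y for x, y in data) / sum(x * x for x, _ in data)


def online_fit(data, eta=0.02, epochs=50, w=0.0):
    """Adjust the weight incrementally on presentation of each sample."""
    for _ in range(epochs):
        for x, y in data:
            w += eta * (y - w * x) * x
    return w


w_off = offline_fit(samples)
w_on = online_fit(samples)
print(w_off, w_on)
```

With a fixed step size the on-line estimate settles near the off-line solution but need not coincide with it, since each update follows the most recent sample rather than the whole set.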

In practice, the training patterns can be considered as samples of random processes. Accordingly, the activation and output states could also be considered as samples of random processes. Randomness in the output state could also result if the output function is implemented in a probabilistic manner rather than in a deterministic manner. These input, activation and output variables may also be viewed as fuzzy quantities instead of crisp quantities. Thus we can view the learning process as deterministic or stochastic or fuzzy or a combination of these characteristics.
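The difference between a deterministic and a probabilistic output function can be sketched as follows. The logistic firing probability, the zero threshold, and the function names are assumptions for illustration only; the text does not prescribe a particular form.

```python
import math
import random


def deterministic_output(x):
    """Crisp, deterministic output: threshold the activation at zero."""
    return 1 if x > 0 else 0


def stochastic_output(x, rng):
    """Probabilistic output: fire with a logistic probability (assumed form)."""
    p_fire = 1.0 / (1.0 + math.exp(-x))
    return 1 if rng.random() < p_fire else 0


rng = random.Random(0)  # fixed seed so the sketch is repeatable
x = 0.5
draws = [stochastic_output(x, rng) for _ in range(10000)]
firing_rate = sum(draws) / len(draws)
print(deterministic_output(x), firing_rate)
```

For the same activation value the deterministic unit always gives the same output, whereas the stochastic unit gives an output that is itself a sample of a random process.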

Finally, in the implementation of the learning methods the variables may be discrete or continuous. Likewise the update of weight values may be in discrete steps or in continuous time. All these factors influence not only the convergence of weights, but also the ability of the network to learn from the training samples.

2.3.2 Distinction between Activation and Synaptic Dynamics Models

In order to appreciate the issues in evolving and implementing learning, it is necessary to clearly understand the distinction between the functions of the activation and synaptic dynamics models. This is discussed in this section. Both activation dynamics and synaptic dynamics models are expressed as first derivatives: of the activation value of each unit, and of the strength of the connection between the ith unit and the jth unit, respectively. However, the purpose of invoking the activation dynamics model is to determine the equilibrium state that the network would reach for a given input. In this case, the input to the network is fixed throughout the dynamics. The dynamics model may have terms corresponding to passive decay, excitatory input (external and feedback) and inhibitory input (external and feedback). The passive decay term contributes to transients, which may eventually die, leaving only the steady state part. The transient part is due to the components representing the capacitance and resistance of the cell membrane. The steady state activation equations can be obtained by setting $\dot{x}_i(t) = 0$, $i = 1, 2, \ldots, N$. This results in a set of $N$ coupled nonlinear equations, the solution of which will give the steady activation state as a function of time. This assumes that the transients decay faster than the signals coming from feedback, and that the feedback signals do not produce any transients. It is the movement of the steady activation state that we are interested in when studying activation dynamics. Note that even a single unit network without feedback may have transient and steady parts, and the steady part in this case describes the stable state also. But in a network with feedback from other units, the steady activation states may eventually reach an equilibrium or a stable state, provided the conditions for the existence of stable states are satisfied by the parameters (especially the weights) in the activation dynamics model.

Thus, in these cases we are not interested in the transient part of the solutions. We are only interested in the equilibrium stable states reached by the steady state activation values for a given input. The equilibrium states correspond to the locations of the minima of the Lyapunov energy function $V(\mathbf{x})$ and are given by $\partial V(\mathbf{x})/\partial \mathbf{x} = 0$, whereas the steady states are given by $\dot{\mathbf{x}}(t) = 0$, where $\mathbf{x}(t)$ is the activation vector with components $x_i(t)$, $i = 1, 2, \ldots, N$. The equilibrium behaviour of the activation state of a neural network will be discussed in detail in Section 2.5.
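The settling of the activation state under a fixed input can be sketched numerically. The additive model used here, $\dot{x}_i = -x_i + \sum_j w_{ij}\tanh(x_j) + I_i$, is an assumption chosen for illustration (a passive decay term, a feedback term and an external input term); the weights, inputs and function names are likewise hypothetical. The state is integrated with forward Euler until the derivative is negligibly small.

```python
import math


def x_dot(x, weights, ext_input):
    """Assumed additive model: passive decay + feedback + external input."""
    n = len(x)
    return [
        -x[i] + sum(weights[i][j] * math.tanh(x[j]) for j in range(n)) + ext_input[i]
        for i in range(n)
    ]


def settle(x, weights, ext_input, dt=0.01, steps=3000):
    """Integrate with forward Euler while the external input is held fixed."""
    for _ in range(steps):
        dx = x_dot(x, weights, ext_input)
        x = [xi + dt * dxi for xi, dxi in zip(x, dx)]
    return x


weights = [[0.0, 0.5], [0.5, 0.0]]  # hypothetical symmetric feedback weights
ext_input = [1.0, -0.5]             # external input, fixed throughout
x_final = settle([0.0, 0.0], weights, ext_input)
residual = max(abs(d) for d in x_dot(x_final, weights, ext_input))
print(x_final, residual)
```

The weights do not change at any point in this computation; only the activation state moves, which is the essential difference from the synaptic dynamics discussed next.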

The case of the synaptic dynamics model is different from that of the activation dynamics model. The objective in synaptic dynamics is to capture the pattern information in the examples by incrementally adjusting the weights. Here the weights change due to the input. If there is no input, the weights do not change. Note that providing the same input at another instant again causes the weights to change, as it can be viewed as a sample given for further reinforcement of the weights. If the model contains a passive decay term in addition to the terms due to the varying external input, the network not only learns continuously, but also forgets what it had learnt initially. In a discrete implementation, which determines the weight change at each discrete time step, suitable assumptions are made regarding the contribution of the initial weight state and also the contributions due to the samples given in the past. As an example, let us consider the following synaptic dynamics model, consisting of a passive decay term and a correlation term:

$$\dot{w}_{ij}(t) = -w_{ij}(t) + s_i(t)\,s_j(t). \qquad (2.22)$$

The solution to the equation is given by

$$w_{ij}(t) = w_{ij}(0)\,e^{-t} + \int_0^t s_i(\tau)\,s_j(\tau)\,e^{-(t-\tau)}\,d\tau, \qquad (2.23)$$

where $w_{ij}(0)$ is a constant initial value of the weight. The above solution shows that the weight accumulates the correlation of the output signals, $s_i(t)\,s_j(t)$. Note that the values $s_i(t)$ and $s_j(t)$ depend on the external input given in the form of samples, continuous in time in this case. This is because the activation dynamics depends on the external input, besides the network parameters like membrane capacitance and the connection topology like feedback. The activation values considered here are steady and stable, since it is assumed that the transients due to membrane parameters like capacitances have decayed, and the steady activation state of the network has reached the stable state. This assumption is reasonable, since the adjustment of synaptic weights takes place at a much slower rate compared to the changes in the activation states.

The initial weight $w_{ij}(0)$ can be viewed as a priori knowledge. The term $w_{ij}(0)\,e^{-t}$ can be considered as a forgetting term. As $t \to \infty$, the contribution of this term to the weight will be zero, and the system would not remember the knowledge in the network at $t = 0$. The second term reflects the recency effect. It shows the accumulation of the correlation term with time. There is an exponential weightage to this accumulation, which shows that recent correlation values are given more weight than the correlation values in the past. As mentioned above, these correlations depend on the input samples. The weights are expected to capture the patterns in the input samples as determined by the synaptic dynamics model.
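The forgetting behaviour can be checked numerically with a minimal sketch. It assumes the model $\dot{w} = -w + c(t)$ with unit decay rate, where $c(t)$ stands for the correlation of the two output signals; the constant correlation used here and the function names are hypothetical. For a constant correlation $c$, the solution reduces to $w(t) = w(0)\,e^{-t} + c\,(1 - e^{-t})$, which the forward-Euler integration should approximate.

```python
import math


def integrate_weight(w0, corr, t_end, dt=0.001):
    """Forward-Euler integration of dw/dt = -w + corr(t) (assumed model)."""
    w, t = w0, 0.0
    for _ in range(int(round(t_end / dt))):
        w += dt * (-w + corr(t))
        t += dt
    return w


def closed_form_constant(w0, c, t):
    """Solution for constant correlation c: forgetting term plus accumulation."""
    return w0 * math.exp(-t) + c * (1.0 - math.exp(-t))


def corr(t):
    """Hypothetical constant correlation of the two output signals."""
    return 1.0


w_a = integrate_weight(w0=5.0, corr=corr, t_end=10.0)
w_b = integrate_weight(w0=-5.0, corr=corr, t_end=10.0)
w_exact = closed_form_constant(5.0, 1.0, 10.0)
print(w_a, w_exact, w_b)
```

Two very different initial weights end up at nearly the same value, because the contribution of the initial weight decays exponentially while the correlation term keeps accumulating.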

Most of the time the learning laws ignore the passive decay term. Then the initial weight receives importance, as can be seen below from the solution of the equation without the passive decay term. Let

$$\dot{w}_{ij}(t) = s_i(t)\,s_j(t). \qquad (2.24)$$

In document ANN by B.Yegnanarayana.pdf (Page 68-72)