The FCBP Generalization Schemes - Modelling continuous sequential behaviour to enhance training

Generalization in FCBP uses a suitable interpolation method to approximate continuous functions, taking the sequence of trained goal weight states as parameters. Any standard interpolation technique can be considered and explored for weight interpolation. Two main factors need to be considered in choosing one: the computation time needed to work out the approximated weight state, and the accuracy of the approximation. For example, if the aim is to spend as little time as possible in the approximation of a goal weight state, a neighbourhood approach may be applied. Here the goal weight state associated with the last trained position is used to approximate the goal weight states of all the untrained positions until another trained position is met. This approach spends no time in computing the approximated goal weight state, but it may lack sufficient accuracy in some cases.

For higher accuracy, standard Fourier interpolation may be a more complete but more time-consuming approach compared with the neighbourhood approach. It approximates a goal weight state based on Fourier analysis of the sequence of learnt weight states: taking the n goal weight states associated with the n trained time slices in the goal weight path, Fourier analysis is used to generate the approximated goal weight states for the untrained time slices.
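The two alternatives just described can be illustrated with a minimal Python sketch. It is not taken from cbptool: the function names, the dictionary of trained positions, and the treatment of each weight component as an equally spaced, periodic sequence (needed for the trigonometric interpolant) are all assumptions made for illustration.

```python
import cmath
import math


def neighbourhood_weights(trained, position):
    """Neighbourhood approach: reuse the weight state of the last trained
    position at or before `position` (no computation beyond a lookup).

    `trained` is a hypothetical mapping {time position: list of weights}.
    """
    earlier = [p for p in trained if p <= position]
    if not earlier:
        raise ValueError("no trained position at or before the query")
    return list(trained[max(earlier)])


def fourier_weight(samples, t):
    """Fourier interpolation of ONE weight component.

    `samples` holds the component's values at the n trained time slices,
    assumed equally spaced and treated as one period of a periodic signal.
    `t` is a (possibly fractional) index in slice units, e.g. 1.5 lies
    halfway between trained slices 1 and 2.
    """
    n = len(samples)
    total = 0.0
    for k in range(n):
        # k-th discrete Fourier coefficient of the trained sequence
        coeff = sum(x * cmath.exp(-2j * math.pi * j * k / n)
                    for j, x in enumerate(samples))
        if 2 * k == n:
            # Nyquist term (even n only): kept real by using a cosine
            total += (coeff * math.cos(math.pi * t)).real
        else:
            # Map the upper half of the spectrum to negative frequencies
            freq = k if 2 * k < n else k - n
            total += (coeff * cmath.exp(2j * math.pi * freq * t / n)).real
    return total / n
```

The neighbourhood function is a lookup only, whereas this direct Fourier evaluation costs on the order of n squared operations per component and query, which reflects the time and accuracy trade-off discussed above. At the trained slices themselves the Fourier interpolant reproduces the learnt values.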

In the thesis, a simple and standard linear interpolation technique (LIT), which combines a relatively high degree of accuracy with limited computation, has been explored.

The implementation of the LIT regime has been carried out in the simulator cbptool. Experiments using the LIT regime are shown in Chapter 6.

Regime Linear Interpolation Technique (LIT) approach

This generalization regime uses a conventional linear interpolation technique to approximate the weight state of an untrained pattern. The approximation is based on a linear interpolation of two learnt weight states associated with two trained time slices. The trained time slices are the neighbouring trained slices associated with the time slice of the untrained pattern. For an untrained I/O pattern amongst k such patterns regularly spaced between two neighbouring trained time slices, its weight state can be calculated through LIT using the two learnt weight states W1 and W2 associated with the two time slices. Each component weight is a linear interpolation of the corresponding component values of the weight states W1 and W2.
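As an illustration of the LIT regime, the following Python sketch interpolates each component weight between the two learnt weight states. It assumes that the k untrained patterns divide the interval between the two trained time slices into k + 1 equal steps, so that the i-th untrained pattern (i = 1, ..., k) receives the component value w = w1 + (i / (k + 1)) (w2 - w1). This spacing convention, the index i, and the function name are assumptions, not the notation of cbptool.

```python
def lit_weight_state(w1, w2, i, k):
    """Linearly interpolate the weight state of the i-th of k untrained
    patterns lying between two neighbouring trained time slices.

    w1, w2: learnt weight states (equal-length lists of floats) of the
            earlier and later trained time slice.
    i:      index of the untrained pattern, 1 <= i <= k (assumed convention).
    k:      number of regularly spaced untrained patterns between the slices.
    """
    if len(w1) != len(w2):
        raise ValueError("weight states must have the same number of components")
    if not 1 <= i <= k:
        raise ValueError("i must satisfy 1 <= i <= k")
    fraction = i / (k + 1)  # relative distance from the earlier trained slice
    return [a + fraction * (b - a) for a, b in zip(w1, w2)]


# Usage: three untrained patterns between the two trained slices
w_first = lit_weight_state([0.0, 2.0], [4.0, -2.0], i=1, k=3)
```

Each interpolated weight state costs one multiplication and two additions per component, which is consistent with the limited computation claimed for LIT.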

4.6 Conclusions

In this chapter, the FCBP model has been introduced, and the reasons why FCBP may give better training and generalization results than SBP have been analysed. It is clear that FCBP evolved from conventional BP but is a very different approach in both concept and methodology.

In general, FCBP is designed for the investigation of temporal associations and sequential signal processing in a parallel processing system. FCBP extends the feasibility of the back-propagation approach to training and offers better approximation in generalization. FCBP is able to use temporal structure to allow variable approximation within a small, fixed-size network. The major features of FCBP itself have been presented in §4.3.3. It is concluded that:

(1) FCBP is a neural system with a way of modelling the effect of time which is appropriate for dealing with the temporal and sequential signal problems.

(2) More than one sequential path can be trained and generalized within a single system as long as two constraints are obeyed: each path has the same number of training positions; and, at each position, if inputs chosen from different paths have the same values, they are associated with the same output values.

(3) Training is feasible for arbitrarily close approximation of I/O mappings where there is an underlying continuity. The more I/O training data chosen along paths, the closer the approximation of the I/O mappings along the paths that a network will provide. In FCBP this can be achieved with a fixed-size network with the help of dynamic interpolated memories.

(4) The Untrained Output problem is less severe in FCBP than in SBP. The interpolation of weight states provides additional generalization power through a theoretical and feasible basis for resolving the Untrained Output problem for finite sets of I/O paths. Also, generalization can be achieved with a variety of approximation methods. Compared to the zero-order weight interpolation in SBP, a much higher order approximation can be employed by FCBP.
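The two constraints in conclusion (2) can be expressed as a simple consistency check over a set of training paths. The sketch below is illustrative only: the representation of a path as a list of (input, output) pairs, one per training position, and the function name are assumptions.

```python
def paths_are_compatible(paths):
    """Check the two constraints for training several paths in one system.

    `paths` is a hypothetical list of paths; each path is a list of
    (input, output) pairs, one per training position. Inputs and outputs
    are tuples so that they can be compared and used as dictionary keys.
    """
    # Constraint 1: every path has the same number of training positions.
    if len({len(path) for path in paths}) > 1:
        return False
    # Constraint 2: at each position, equal inputs from different paths
    # must be associated with equal outputs.
    for pairs_at_position in zip(*paths):
        seen = {}
        for inp, out in pairs_at_position:
            if seen.setdefault(inp, out) != out:
                return False
    return True
```

Note that the second constraint is checked per position only: the same input may map to different outputs at different positions, which is what allows the temporal structure to disambiguate the mapping.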

CHAPTER 5

RECURRENT CONTINUOUS BACKPROPAGATION MODEL

5.1 Introduction

Another new approach, called recurrent continuous back-propagation (RCBP), is presented in this chapter. Like FCBP, RCBP is based on the path-based back-propagation (PBP) framework; it uses recurrent links to embody temporal and sequential capacity in neural networks. It provides a neural dynamic system for I/O associations by generating sequences of internal activity states and weight states.

This chapter discusses what RCBP is, how its features compare with those of other approaches, and how to define internal states in terms of the activities. In order to give a clear picture of RCBP, detailed features are given with emphasis on how internal states can be embodied in PBP at each position.

In §5.2 the desired features of RCBP are presented and the role of the activity states of neurons in recurrent nets is reviewed. The notion of activation sequences is introduced to see if the additional activity dynamic brings significant additional power for the path-based approach. Then in §5.3 and §5.4 respectively, the training and generalization schemes are given. Finally a conclusion is presented in §5.5.

5.2 The RCBP model
