A path-based framework - Modelling continuous sequential behaviour to enhance training and generalization in neural networks

To deal with the above weaknesses directly, it would seem possible to give the trained weight state equal status to the internal activity state, that is, to allow weights to vary over time. Simpson's definition of neural convergence allows this as a possibility: “if the mapping converges to a fixed value, or to some fixed set, then the learning procedure is properly capturing the mapping” (Simpson, 1989). This equality of status leads to a new framework for backpropagation. To resolve the problems mentioned above, a new framework based on dynamic extension in time is therefore proposed. It will be seen later that such extension in time leads to extension in space as well.

3.5.1 AN ABSTRACT MACHINE ANALOGY

For present purposes, the aim is not only to increase the number of data points that can be captured but also to interpolate between the data points to various degrees.

An abstract machine analogy is a kind of Interpolating Turing Machine (ITM). The extra feature relative to a Turing Machine (TM) is that more than one symbol is associated with each tape square. Suppose each tape square has a data point number at the centre of the square. Other numbers can then be associated with other points along the centre line of the tape through interpolation (Fig. 3.1). Hence there is infinite interpolation as well as infinite discrete extension.

Fig. 3.1 An Interpolated Turing Machine tape
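The tape analogy can be illustrated with a minimal sketch. The class name `InterpolatedTape`, the choice of integer positions for square centres, and the use of linear interpolation are all assumptions made for illustration; the text does not specify how values between centres are obtained.

```python
import numpy as np


class InterpolatedTape:
    """Toy tape: each square centre holds a data point; other positions are interpolated."""

    def __init__(self, centre_values):
        self.values = np.asarray(centre_values, dtype=float)

    def read(self, position):
        # Square k has its centre at position k (assumption).
        centres = np.arange(len(self.values), dtype=float)
        # Linear interpolation between neighbouring centres (assumption).
        return float(np.interp(position, centres, self.values))

    def extend(self, value):
        # Discrete extension: append one more square to the tape.
        self.values = np.append(self.values, float(value))


tape = InterpolatedTape([0.2, 0.6, 0.4])
at_centre = tape.read(1.0)   # value stored at the centre of square 1
between = tape.read(1.5)     # value interpolated halfway between squares 1 and 2
tape.extend(0.8)             # the tape can always grow by another square
```

Reading at a square centre returns the stored data point, while reading between centres returns an interpolated value, which corresponds to the two kinds of extension described above.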

This thesis investigates a neural dynamic approach that uses dynamic interpolated memories as the analogue of an infinite interpolated TM tape and embodies time sequences in the system.

Important points to note about this new approach are: (1) In SBP, a finite representation concentrated into a single moment of random access time is forced to correspond to an infinite mapping over a period of sequential time. (2) A dynamic interpolated memory approach provides, by contrast, one-to-one correspondence with an infinite sequential mapping. Such a correspondence provides an analogue of the Interpolated Turing Machine tape, which evolves the FSM to cope with problems of infinite extension in time such as those mentioned earlier. (3) The more natural correspondence with time will be shown to produce significant computational benefits with respect to the problems outlined in the previous section.

The specific goals of the new approach are: (1) to explore conceptually how to resolve the problems that occur in training potentially infinite analogue I/O associations where there are underlying continuous functions; (2) to explore the neural realisation of these relationships empirically.

3.5.2 THE ROLE OF GOAL WEIGHT STATES IN NEURAL CONVERGENCE

In general, the aim of training a neural network to realise I/O associations is to search for a goal as a condition for neural convergence. A machine is thereby obtained which provides the desired I/O mappings in performance. This aim does not imply that the goal of training must be a single weight state. If the goal of training is to find more than one weight state, or a single weight path, and during performance the goal weight state at each moment can be accurately associated with a particular I/O mapping, this also satisfies the above aim.

The goal of neural convergence in the SBP framework is a single goal weight state. The benefits of this approach have been briefly reviewed in §1.3. Its infeasibility is discussed in §3.2 and §3.4. The question asked here is whether any other goal of neural convergence can serve the same aim without the training feasibility problem.

The new approach begins by appreciating that sequential access is suited to many types of TDSP. This is because, at each next step in time, a correct response is required only to the next input value along one path and not to any of its predecessors or successors.

The framework explored in the thesis defines the goal condition to be a sequence of weight states, a weight path, rather than a single weight state. This trades off access to desired output in performance against the number of hidden units needed for training in such cases.

The conclusion of this section is that a weight path approach is proposed as an alternative goal of neural convergence for approximating the production of I/O associations.

3.5.3 MAJOR SPECIFIC FEATURES OF THE PATH-BASED APPROACH

In the path-based backpropagation framework (PBP), the goal of training is not a single weight state that is the simultaneous solution for all the I/O training patterns. Instead, a sequence of weight states, a weight path, is found and used. In general, a weight path rather than a weight state is the goal of training.

Each weight state in the goal path provides random access to just those individual I/O patterns occurring at the same evolved fractional distance in time along each of a number of training I/O paths. The fractional distance will be said to constitute a position in the I/O paths' state sequences. Travel along the weight path in time during performance allows sequential access to all the desired values at each position along the I/O paths.

More technically, suppose we wish to train $m$ sequential I/O paths with $n$ positions. Let the $i$th sequence be $S_1^i, S_2^i, \ldots, S_n^i$, where $S_n^i$ denotes the I/O pair at position $n$ along the $i$th path. Then it is desired to find a sequence of goal weight states $W_1, W_2, \ldots, W_n$, where $W_j$ realises the set $\{S_j^1, S_j^2, \ldots, S_j^i, \ldots, S_j^m\}$ at sequential position $j$. Fig. 3.2 shows the relationship between I/O and $W$ for a 2-orbits problem. The orbit starting at the North position has a binary target output of 0. The other, starting at the West position, has a binary target output of 1. The two sets of end points of the straight lines are shown in Fig. 3.2; each line constitutes a training position and is associated with a goal weight state.

In practice, this framework trades the storage needed for the goal weight path, together with restricted access to desired outputs in performance, against the variable number of hidden units needed for training.

Fig. 3.2 The two training positions along 2-orbits
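The idea of one goal weight state per position can be sketched in code. This is a toy construction under stated assumptions, not the thesis's implementation: the orbits are taken to be clockwise full turns of the unit circle, each weight state is a single logistic unit trained by plain gradient descent, and each state is warm-started from its predecessor. The names `orbit`, `train_state`, `train_weight_path` and `perform` are hypothetical.

```python
import numpy as np


def orbit(start_angle, n_positions):
    # Points on the unit circle, one per position, moving clockwise (assumption).
    angles = start_angle - 2.0 * np.pi * np.arange(n_positions) / n_positions
    return np.stack([np.cos(angles), np.sin(angles)], axis=1)


def forward(w, x):
    # Single logistic unit with w = [w_x, w_y, bias] (assumption: no hidden units).
    return 1.0 / (1.0 + np.exp(-(x @ w[:2] + w[2])))


def train_state(w, inputs, targets, lr=1.0, steps=500):
    # Gradient descent on cross-entropy for the patterns at one position only.
    w = w.copy()
    for _ in range(steps):
        err = forward(w, inputs) - targets
        w -= lr * np.concatenate([inputs.T @ err, [err.sum()]])
    return w


def train_weight_path(paths, targets):
    # Find W_1..W_n, where W_j realises the m patterns at position j.
    w, weight_path = np.zeros(3), []
    for j in range(len(paths[0])):
        inputs_j = np.stack([p[j] for p in paths])
        w = train_state(w, inputs_j, targets)  # warm start from W_{j-1} (assumption)
        weight_path.append(w)
    return weight_path


def perform(weight_path, input_path):
    # Sequential access: step along the weight path in time with the input path.
    return np.array([forward(w, x) for w, x in zip(weight_path, input_path)])


n = 8
paths = [orbit(np.pi / 2, n), orbit(np.pi, n)]  # North start, West start
targets = np.array([0.0, 1.0])                  # binary target for each orbit
weight_path = train_weight_path(paths, targets)
out_north = perform(weight_path, paths[0])
out_west = perform(weight_path, paths[1])
```

In this toy construction the two orbits pass through the same points at different positions, so a weight state found for one position does not in general give the desired output at another. The sketch stores `n` weight states and requires the input to be presented in step with the path, which mirrors the trade-off described above.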

The following points should be borne in mind and pursued throughout the design of the new framework: (1) Representation capability will be independent, as far as possible, of the number of data points used sequentially; for example, the representation capability should not change whatever the length of a chosen curve represented in coordinates; (2) The network topology will be finite and relatively smaller than for SBP; (3) Accuracy in generalization is controllable, and the Untrained Output problem can be resolved to a significant extent; (4) A single network can be trained to do I/O associations for multiple I/O paths.

During performance, I/O associations should be able to switch from one path to another.

The computational benefits of this approach lie in both the feasibility of training and the provision of a trained weight path for use in generalization. These features will be further analysed and discussed together with two models based on PBP.

The two PBP models are implemented specifically to investigate approximations of sequences of I/O associations chosen from continuous functions or complex analogue-binary mappings. The first is for feedforward networks and is called Feedforward Continuous Back-Propagation (FCBP). FCBP is a first step within PBP. It solves at least some problems associated with time-dependent signal processing. However, the dynamic capacity is still limited in FCBP. To explore dynamic systems of the path-based sort further, another model called Recurrent Continuous Back-Propagation (RCBP), based on both the weight path and a kind of internal state path approach on recurrent networks, has also been preliminarily investigated. For more details about PBP and the models FCBP and RCBP, please refer to Chapters 4 and 5 respectively.
