Implementing Back Propagation - Programming Neural Networks in Java, Jeff Heaton

I will now show you how the JOONE neural network implements back propagation training. The training process uses many of the same methods as the recognition process that we just evaluated. In fact, back propagation works by first running a recognition against the training data and then adjusting the weights and biases to reduce the error. You can see this process by examining the Layer.run method, which is shown in Listing 5.1. We already examined the first part of the Layer.run method in the previous section. It is the second half of the method that is responsible for training the neural network, and it can be thought of as the main loop of the training process. This section of code is run against each item in the training set, and the training data is run repeatedly until the error of the neural network falls within an acceptable level. The Layer.run method first checks to see if the neural network is in training mode.

if ( step != -1 )
    // Checks if the next step is a learning step
    m_learning = monitor.isLearningCicle(step);
else
    // Stops the net
    running = false;

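To determine whether we are learning, we examine the step variable. A step of -1 means there is no current step, so the layer stops running. Otherwise the monitor is asked whether the current step is a learning cycle.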

if ( (m_learning) && (running) ) { // Learning

If we are in fact learning, then we must calculate the gradient inputs. The concept of the gradient was discussed earlier in this chapter. For now we simply allocate an array large enough to hold the gradient values.

gradientInps = new double[dimO];

Next we call the fireRevGet method.

fireRevGet();

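The fireRevGet method is called to gather the gradients handed back by the synapses on the output side of this layer and sum them into gradientInps. The layer's backward method then turns these into gradientOuts, which are wrapped in a Pattern and passed on by fireRevPut.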

backward(gradientInps);
m_pattern = new Pattern(gradientOuts);
m_pattern.setCount(step);
fireRevPut(m_pattern);
}
} // END while (running = false)

The code that we just examined implements back propagation learning from a high level. Next we will examine the individual methods that were called by the Layer.run method to see how the learning actually takes place.
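To make the shape of this main loop concrete before we look at those methods, the following is a minimal sketch and not JOONE code. It assumes a hypothetical TrainingLoopSketch class in which a single linear weight stands in for the whole network; the names train, learningRate, acceptableError and maxEpochs are illustrative only. Each pass runs a recognition, measures the error, adjusts the weight, and the training data is presented repeatedly until the error is acceptable.

```java
// Hypothetical sketch, not JOONE code: one linear weight stands in for a network.
class TrainingLoopSketch {

    /** Trains w so that w * x approximates y and returns the final weight. */
    static double train(double[] xs, double[] ys, double learningRate,
                        double acceptableError, int maxEpochs) {
        double w = 0.0;
        for (int epoch = 0; epoch < maxEpochs; epoch++) {
            double sumSquaredError = 0.0;
            for (int i = 0; i < xs.length; i++) {
                double actual = w * xs[i];          // recognition pass
                double error = ys[i] - actual;      // compare with the expected output
                sumSquaredError += error * error;
                w += learningRate * error * xs[i];  // adjust the weight to reduce the error
            }
            if (sumSquaredError < acceptableError) {
                break;                              // error is within the acceptable level
            }
        }
        return w;
    }

    public static void main(String[] args) {
        double[] xs = {1.0, 2.0, 3.0};
        double[] ys = {2.0, 4.0, 6.0};              // target relationship: y = 2x
        double w = train(xs, ys, 0.1, 1e-6, 1000);
        System.out.println("learned weight = " + w);
    }
}
```

With these illustrative values the learned weight settles very close to 2. A real network repeats the same cycle, but the adjustment step is spread across every layer and synapse, which is what the methods below do.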

Listing 5.5: The Layer.fireRevGet Method

protected void fireRevGet() {
    if ( aOutputPatternListener == null )
        return;
    double[] patt;
    int currentSize = aOutputPatternListener.size();
    OutputPatternListener tempListener = null;
    for ( int index = 0; index < currentSize; index++ ) {
        tempListener = (OutputPatternListener)aOutputPatternListener.elementAt(index);
        if ( tempListener != null ) {
            m_pattern = tempListener.revGet();
            if ( m_pattern != null ) {
                patt = m_pattern.getArray();
                if ( patt.length != gradientInps.length )
                    gradientInps = new double[patt.length];
                sumBackInput(patt);
            }
        }
    }
}

The Layer.fireRevGet method is very similar to the fireFwdGet method in that both are used to sum the patterns obtained from multiple layers into one. In most cases there will be only one layer that you are summing. This is the case with the XOR example. This summation process is assisted by the Synapse.sumBackInput method that is shown in Listing 5.6.

Listing 5.6: The Synapse.sumBackInput Method

protected void sumBackInput(double[] pattern) {
    int x;
    int n = getRows();
    for ( x=0; x < n; ++x )
        gradientInps[x] += pattern[x];
}

As you can see, the Synapse.sumBackInput method essentially just adds every element of each pattern that it is passed to the corresponding element of gradientInps. The effect is cumulative, as the sumBackInput method is called repeatedly.
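The cumulative behaviour can be tried in isolation. The following is a minimal sketch, assuming a hypothetical GradientSumSketch class that keeps only the gradientInps array; it is not the JOONE class, and the two patterns are made-up values standing in for gradients returned by two output synapses.

```java
import java.util.Arrays;

// Hypothetical sketch, not JOONE code: isolates the summing step of Listing 5.6.
class GradientSumSketch {
    private final double[] gradientInps;

    GradientSumSketch(int rows) {
        gradientInps = new double[rows];   // starts at all zeros
    }

    /** Adds each element of the pattern to the running total. */
    void sumBackInput(double[] pattern) {
        for (int x = 0; x < gradientInps.length; ++x) {
            gradientInps[x] += pattern[x];
        }
    }

    double[] getGradientInps() {
        return gradientInps.clone();
    }

    public static void main(String[] args) {
        GradientSumSketch layer = new GradientSumSketch(2);
        layer.sumBackInput(new double[] {0.5, -0.25});   // gradient from a first synapse
        layer.sumBackInput(new double[] {0.25, 0.75});   // gradient from a second synapse
        System.out.println(Arrays.toString(layer.getGradientInps())); // prints [0.75, 0.5]
    }
}
```

When only one synapse is attached, as in the XOR example, the sum is simply that synapse's pattern.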

Once the fireRevGet method completes, control returns to the Layer.run method. The next method that the Layer.run method calls is the SigmoidLayer.backward method that is shown in Listing 5.7.

Listing 5.7: The SigmoidLayer.backward Method

public void backward(double[] pattern) {
    super.backward(pattern);
    double dw, absv;
    int x;
    int n = getRows();
    for ( x = 0; x < n; ++x ) {
        gradientOuts[x] = pattern[x] * outs[x] * (1 - outs[x]);
        // Adjust the bias
        if ( getMomentum() < 0 ) {
            if ( gradientOuts[x] < 0 )
                absv = -gradientOuts[x];
            else
                absv = gradientOuts[x];
            dw = getLearningRate() * gradientOuts[x] + absv * bias.delta[x][0];
        } else
            dw = getLearningRate() * gradientOuts[x] + getMomentum() * bias.delta[x][0];
        bias.value[x][0] += dw;
        bias.delta[x][0] = dw;
    }
}
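The arithmetic performed for each neuron in this listing can be worked through on its own. The following is a minimal sketch, assuming a hypothetical SigmoidBackwardSketch class with two static helpers; the names gradientOut and biasDelta and all of the numbers are illustrative and are not part of JOONE. The factor out * (1 - out) is the derivative of the sigmoid function written in terms of the neuron's output.

```java
// Hypothetical sketch, not JOONE code: the per-neuron arithmetic of the listing above.
class SigmoidBackwardSketch {

    /** Scales the incoming gradient by the sigmoid derivative, out * (1 - out). */
    static double gradientOut(double gradientIn, double out) {
        return gradientIn * out * (1 - out);
    }

    /** Bias change: a learning-rate term plus a momentum term based on the last change. */
    static double biasDelta(double learningRate, double momentum,
                            double gradientOut, double previousDelta) {
        // As in the listing, a negative momentum means "use the gradient's magnitude instead".
        double m = (momentum < 0) ? Math.abs(gradientOut) : momentum;
        return learningRate * gradientOut + m * previousDelta;
    }

    public static void main(String[] args) {
        double g = gradientOut(0.4, 0.5);        // 0.4 * 0.5 * 0.5, about 0.1
        double dw = biasDelta(0.5, 0.5, g, 0.2); // 0.5 * 0.1 + 0.5 * 0.2, about 0.15
        double bias = 1.0 + dw;                  // about 1.15
        System.out.println("gradientOut=" + g + " dw=" + dw + " bias=" + bias);
    }
}
```

Note that the derivative is largest when the output is 0.5 and shrinks toward zero as the output approaches 0 or 1, so a neuron that is already saturated receives only a small correction.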

This method is where much of the training actually takes place. It is here that the gradient outputs and the new bias values are calculated. The weights will be adjusted later. Next, the other layers must be given a chance to update and train. To do this the layer must now pass control to the synapse. It is here, in the synapse, that the connection weights will be updated. To pass control to the synapse, the Layer.fireRevPut method is called. The fireRevPut method will call all synapses that are connected to this layer. This method can be seen in Listing 5.8.

Listing 5.8: The Layer.fireRevPut Method

protected void fireRevPut(Pattern pattern) {
    if ( aInputPatternListener == null ) {
        return;
    }
    int currentSize = aInputPatternListener.size();
    InputPatternListener tempListener = null;
    for ( int index = 0; index < currentSize; index++ ) {
        tempListener = (InputPatternListener)aInputPatternListener.elementAt(index);
        if ( tempListener != null ) {
            tempListener.revPut((Pattern)pattern.clone());
        }
    }
}

As you can see from the above listing, the fireRevPut method loops through each input synapse that is connected. You may notice that we are passing data to our own input synapses. It may seem backward to pass data to your inputs, but that is exactly the pattern that back propagation follows. The revPut method that is called in each of the input synapses is shown in Listing 5.9.

Listing 5.9: The Synapse.revPut Method

public synchronized void revPut(Pattern pattern) {
    if ( isEnabled() ) {
        count = pattern.getCount();
        while ( bitems > 0 ) {
            try {
                wait();
            } catch ( InterruptedException e ) {
                e.printStackTrace();
                return;
            }
        }
        m_pattern = pattern;
        backward(pattern.getArray());
        ++bitems;
        notifyAll();
    }
}

As you can see, the Synapse.revPut method is designed to be synchronized. This allows training to take advantage of a multi-processor computer or a distributed environment, as the program can operate concurrently.
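The wait and notifyAll calls in Listing 5.9 form a one-item handoff between threads. The following is a minimal sketch of that idea, assuming a hypothetical OneSlotHandoff class; it is not the JOONE Synapse. The put method follows the shape of revPut, while the take method is an assumption about what the matching consumer side would look like, since that side is not shown in the listing.

```java
// Hypothetical sketch, not JOONE code: a one-slot handoff built on wait/notifyAll.
class OneSlotHandoff {
    private double[] slot;
    private int items = 0;                 // plays the role of bitems in Listing 5.9

    /** Blocks while the slot is full, then stores the pattern and wakes any waiters. */
    synchronized void put(double[] pattern) throws InterruptedException {
        while (items > 0) {
            wait();
        }
        slot = pattern;
        ++items;
        notifyAll();
    }

    /** Assumed consumer side: blocks while the slot is empty, then removes the pattern. */
    synchronized double[] take() throws InterruptedException {
        while (items == 0) {
            wait();
        }
        double[] result = slot;
        slot = null;
        --items;
        notifyAll();
        return result;
    }

    /** Passes three one-element patterns between two threads and returns their sum. */
    static double demo() {
        OneSlotHandoff handoff = new OneSlotHandoff();
        Thread producer = new Thread(() -> {
            try {
                for (int i = 1; i <= 3; i++) {
                    handoff.put(new double[] {i});
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        producer.start();
        try {
            double sum = 0;
            for (int i = 0; i < 3; i++) {
                sum += handoff.take()[0];
            }
            producer.join();
            return sum;
        } catch (InterruptedException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints 6.0
    }
}
```

Both methods must be synchronized, because wait and notifyAll may only be called by a thread that holds the object's monitor. The while loops, rather than simple if tests, guard against a thread waking up when the condition it was waiting for no longer holds.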

The Synapse.revPut method then makes any adjustments needed to the connection weights by calling the Synapse.backward method. Finally, once all this is done, the revPut method calls notifyAll() to inform any thread that might be waiting on data that data is available. You will now be shown how the Synapse.backward method works.

Listing 5.10: The Synapse.backward Method

protected void backward(double[] pattern) {
    int x;
    int y;
    double s, dw;
    int m_rows = getInputDimension();
    int m_cols = getOutputDimension();
    // Weights adjustment
    for ( x=0; x < m_rows; ++x ) {
        double absv;
        s = 0;
        for ( y=0; y < m_cols; ++y ) {
            s += pattern[y] * array.value[x][y];
            if ( getMomentum() < 0 ) {
                if ( pattern[y] < 0 )
                    absv = -pattern[y];
                else
                    absv = pattern[y];
                dw = getLearningRate() * pattern[y] * inps[x] + absv * array.delta[x][y];
            } else
                dw = getLearningRate() * pattern[y] * inps[x] + getMomentum() * array.delta[x][y];
            array.value[x][y] += dw;
            array.delta[x][y] = dw;
        }
        bouts[x] = s;
    }
}

The Synapse.backward method very closely parallels the SigmoidLayer.backward method. Both methods adjust values according to the back propagation training algorithm. Only this time we are modifying the connection weights, not the biases of the neurons.
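The weight adjustment in Listing 5.10 can also be run on a small example. The following is a minimal sketch, assuming a hypothetical SynapseBackwardSketch class that takes the weights, previous deltas, forward inputs and incoming gradient as plain arrays; it is not the JOONE class, and the numbers are made up. As in the listing, the gradient handed back to the input side is computed from each weight before that weight is changed.

```java
import java.util.Arrays;

// Hypothetical sketch, not JOONE code: the weight update of Listing 5.10 on plain arrays.
class SynapseBackwardSketch {

    /**
     * Adjusts weights and deltas in place (both indexed [input][output]) and
     * returns the gradient to pass back to the input side.
     */
    static double[] backward(double[][] weights, double[][] deltas, double[] inps,
                             double[] pattern, double learningRate, double momentum) {
        double[] bouts = new double[weights.length];
        for (int x = 0; x < weights.length; ++x) {
            double s = 0;
            for (int y = 0; y < pattern.length; ++y) {
                s += pattern[y] * weights[x][y];   // uses the weight before it is changed
                double m = (momentum < 0) ? Math.abs(pattern[y]) : momentum;
                double dw = learningRate * pattern[y] * inps[x] + m * deltas[x][y];
                weights[x][y] += dw;
                deltas[x][y] = dw;                 // remembered for the next momentum term
            }
            bouts[x] = s;
        }
        return bouts;
    }

    public static void main(String[] args) {
        double[][] weights = {{0.5}, {-0.5}};      // two inputs feeding one output
        double[][] deltas = {{0.0}, {0.0}};
        double[] inps = {1.0, 0.5};                // values seen on the forward pass
        double[] pattern = {0.2};                  // gradient arriving from the output layer
        double[] bouts = backward(weights, deltas, inps, pattern, 0.5, 0.0);
        System.out.println("back-propagated gradient: " + Arrays.toString(bouts));
        System.out.println("updated weights: " + Arrays.deepToString(weights));
    }
}
```

With these values the weights move to roughly 0.6 and -0.45, and the gradient passed back is roughly 0.1 and -0.1. The input that was larger on the forward pass receives the larger correction, because the change is proportional to inps[x].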

Chapter 5: Understanding Back Propagation

Article Title: Chapter 5: Understanding Back Propagation Category: Artificial Intelligence Most Popular

From Series: Programming Neural Networks in Java

Posted: Wednesday, November 16, 2005 05:15 PM Author: JeffHeaton

Page: 6/6

Summary

In this chapter you learned how a feed forward back propagation neural network functions. You saw how the JOONE neural network implemented such a neural network. The feed forward back propagation neural network is actually composed of two neural network algorithms. It is not always necessary to use "feed forward" and "back propagation" together, but this is usually the case. The term "feed forward" refers to a method by which a neural network recognizes a pattern, whereas the term "back propagation" describes a process by which the neural network will be trained.

A feed forward neural network is a network where neurons are only connected to the next layer. There are no connections to neurons in previous layers or between neurons and themselves. Additionally, neurons are not connected to neurons beyond the next layer. As a pattern is processed by a feed forward network, the biases and connection weights are applied.
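The way biases and connection weights are applied on the forward pass can be sketched for a single layer. The following is a minimal illustration, assuming a hypothetical FeedForwardSketch class with a sigmoid activation; it is not JOONE code, and the weights are chosen only so that the result is easy to check by hand.

```java
// Hypothetical sketch, not JOONE code: one feed forward step into a sigmoid layer.
class FeedForwardSketch {

    static double sigmoid(double v) {
        return 1.0 / (1.0 + Math.exp(-v));
    }

    /** weights are indexed [input][output]; there is one bias per output neuron. */
    static double[] forward(double[] inputs, double[][] weights, double[] biases) {
        double[] outs = new double[biases.length];
        for (int y = 0; y < outs.length; ++y) {
            double net = biases[y];                 // start from the neuron's bias
            for (int x = 0; x < inputs.length; ++x) {
                net += inputs[x] * weights[x][y];   // add each weighted input
            }
            outs[y] = sigmoid(net);                 // squash the sum into the range 0..1
        }
        return outs;
    }

    public static void main(String[] args) {
        double[] inputs = {2.0, 2.0};
        double[][] weights = {{1.0}, {-1.0}};       // the two inputs cancel each other
        double[] biases = {0.0};
        System.out.println(forward(inputs, weights, biases)[0]); // prints 0.5
    }
}
```

Each layer's outputs become the inputs of the next layer only, which is what makes the network feed forward.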

Neural networks can be trained using backpropagation. Backpropagation is a form of supervised training. The neural network is presented with the training data, and the results from the neural network are compared with the expected results. The difference between the actual results and the expected results produces an error. Backpropagation is a method whereby the weights and biases of the neural network are altered in a way that causes this error to be reduced.
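The comparison between actual and expected results can be reduced to a single number. The following is a minimal sketch, assuming a hypothetical ErrorSketch class that uses the root mean square of the differences; this is one common choice of error measure, shown only as an illustration and not taken from the listings above.

```java
// Hypothetical sketch, not JOONE code: root mean square error for one pattern.
class ErrorSketch {

    /** Root mean square of the differences between expected and actual outputs. */
    static double rmsError(double[] expected, double[] actual) {
        double sum = 0;
        for (int i = 0; i < expected.length; ++i) {
            double diff = expected[i] - actual[i];
            sum += diff * diff;
        }
        return Math.sqrt(sum / expected.length);
    }

    public static void main(String[] args) {
        double[] expected = {1.0, 0.0};
        double[] actual = {0.5, 0.5};                    // an undecided network
        System.out.println(rmsError(expected, actual));  // prints 0.5
    }
}
```

Training continues until a figure such as this falls below whatever level is considered acceptable for the problem.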

The feed forward back propagation neural network is a very common network architecture. This neural network architecture can be applied to many cases. There are other neural network architectures that may be used. In the next chapter we will examine the Kohonen neural network. The most significant difference between the Kohonen neural network and the feed forward back propagation neural network that we just examined is the training method. The back propagation method uses a supervised training method. In the next chapter we will see how an unsupervised training method is used.

Chapter 6: Understanding the Kohonen Neural Network

Introduction

In the previous chapter you learned about the feed forward back propagation neural network. While feed forward neural networks are very common, they are not the only architecture for neural networks. In this chapter we will examine another very common architecture for neural networks.

The Kohonen neural network contains no hidden layer. This network architecture is named after its creator, Teuvo Kohonen. The Kohonen neural network differs from the feed forward back propagation neural network in several important ways. In this chapter we will examine the Kohonen neural network and see how it is implemented. Chapter 7 will continue by showing a practical application of the Kohonen neural network, optical character recognition.
