Choosing an Artificial Neural Network Structure

CHAPTER 8: USING ARTIFICIAL NEURAL NETWORKS TO MODEL THE COMPLIANT DEVICE

8.2. Choosing an Artificial Neural Network Structure

The purpose of the artificial neural network in this thesis is to model mathematically the relationship between the forces acting on the probe and the position of the probe relative to the manipulator arm. Such a model must be solved every 28 ms because the control loop on a PUMA 560 has a clock rate of about 37 Hz. In many force control applications, even faster clock rates are used. With more traditional methods such as Finite Element Analysis (FEA), such a model would typically take several minutes to solve on a PC. In addition, the highly coupled relationship between the forces acting on the probe and the position of the probe relative to the manipulator arm is a non-linear mapping problem, and as such is well suited to a neural network solution [Irwin 95].

It is worth looking at some of the key artificial neural network structures and their primary uses. A useful definition of neural computing is that it is... “a study of networks of adaptable nodes which, through a process of learning from task examples, store experiential knowledge and make it available for use” [Aleksander 89]. The reality falls far short of artificial intelligence, but these tools make it possible to tackle many previously intractable tasks such as non-linear modelling or pattern recognition [Irwin 95] [Lippman].

There are many types of artificial neural networks that have been developed for a variety of different purposes. Some of the key types are shown in table 8.1.

Type of Network            Primary purpose/application                  Reference
Perceptron (single layer)  simple pattern recognition problems          [Rosenblatt 61]
Linear                     signal processing, control and prediction    [Widrow 85]
Back Propagation           function approximation                       [Rumelhart 86]
Radial Basis Function      function approximation                       [Chen 91]
Associative Learning       local function approximation                 [Hebb 49]
Self Organising            categorisation                               [Kohonen 87]

TABLE 8.1 - PRINCIPAL ARTIFICIAL NEURAL NETWORK TYPES

For our purpose of non-linear function approximation, back propagation or radial basis function networks are the most suitable. The radial basis function network trains faster, but because each node represents only a local area, it can leave gaps within its learnt space. It does not necessarily outperform the back propagation network [Tan]. In addition, at the time of starting this work, it was not available in commercial toolboxes, and so back propagation was chosen.
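The "gaps" mentioned above can be illustrated with a minimal sketch. It assumes Gaussian basis functions, which the text does not specify, and the node centres and width are hypothetical values chosen only to make the effect visible. Each node responds strongly near its own centre and hardly at all elsewhere, so an input that falls between widely spaced centres excites almost nothing.

```python
import math

def gaussian_rbf(x, centre, width):
    # Response of one radial basis node: 1 at its centre, decaying with distance.
    # The Gaussian form is an assumption made for this illustration.
    return math.exp(-((x - centre) / width) ** 2)

# Hypothetical node centres; note the uncovered region between 1.0 and 4.0.
centres = [0.0, 1.0, 4.0]
width = 0.5  # hypothetical common width for all nodes

def total_activation(x):
    # Sum of all node responses at input x; a value near zero marks a gap.
    return sum(gaussian_rbf(x, c, width) for c in centres)

for x in (0.0, 1.0, 2.5, 4.0):
    print(f"x = {x:3.1f}  total activation = {total_activation(x):.6f}")
```

At x = 2.5 the total activation is close to zero, so whatever output weights are learnt, the network has almost nothing to work with in that region.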

8.2.2. Back Propagating Artificial Neural Networks

The chosen artificial neural network is a back propagating network with one hidden layer. The back propagating network has been one of the most widely used architectures and, with only a single hidden layer, is able to approximate any continuous function [Cybenko 89].

[Figure: the input nodes P of the input layer feed the ‘n’ hidden layer nodes (tan-sigmoid functions), which in turn feed the ‘m’ output layer nodes (linear functions).]

FIG. 8.1 - SINGLE HIDDEN LAYER BACK PROPAGATING NEURAL NETWORK

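The weight matrices and bias vectors may be defined as follows, for $l$ input nodes, $n$ hidden layer nodes and $m$ output layer nodes:

$$\mathbf{W1} = \begin{bmatrix} w1_{1,1} & \cdots & w1_{1,l} \\ \vdots & \ddots & \vdots \\ w1_{n,1} & \cdots & w1_{n,l} \end{bmatrix}, \qquad \mathbf{B1} = \begin{bmatrix} b1_{1} \\ \vdots \\ b1_{n} \end{bmatrix} \tag{8.1}$$

$$\mathbf{W2} = \begin{bmatrix} w2_{1,1} & \cdots & w2_{1,n} \\ \vdots & \ddots & \vdots \\ w2_{m,1} & \cdots & w2_{m,n} \end{bmatrix}, \qquad \mathbf{B2} = \begin{bmatrix} b2_{1} \\ \vdots \\ b2_{m} \end{bmatrix} \tag{8.2}$$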

In this thesis, the input nodes are always the six forces and torques recorded by the force sensor. The output nodes are always the six vector terms describing the position and orientation of the PUMA relative to a neutral (centred) probe position. Section 8.3.2 describes how this data was gathered.

During training, the neural network functions as follows:

2. Each input node is connected to each of the hidden layer nodes and weighted by a weighting matrix W1. At each node of the hidden layer, the weighted inputs are summed and added to a bias vector B1, before being applied to the tan-sigmoid function, giving an intermediate output vector R such that:

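$$\mathbf{R} = \operatorname{tansig}\left[(\mathbf{W1} \cdot \mathbf{P}) + \mathbf{B1}\right] \tag{8.3}$$

$$\text{where} \quad \operatorname{tansig}(x) = \frac{e^{x} - e^{-x}}{e^{x} + e^{-x}} \tag{8.4}$$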

3. Each output from the hidden layer is connected to each of the output layer nodes where, as before, they are weighted by the matrix W2, summed, and added to a bias vector B2 before the sum is applied to a linear function giving the output vector O such that:

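$$\mathbf{O} = \operatorname{purelin}\left[(\mathbf{W2} \cdot \mathbf{R}) + \mathbf{B2}\right] \tag{8.5}$$

$$\text{where} \quad \operatorname{purelin}(x) = x \tag{8.6}$$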

4. The output vector, O, is then compared with its target output vector, T, and an appropriate sum square error function is given by:

$$E = \sum_{i=1} (\mathbf{T}_i - \mathbf{O}_i)^{\mathsf{T}} (\mathbf{T}_i - \mathbf{O}_i) \tag{8.7}$$

5. This error is minimised by adjusting the output layer weights and biases. This error is then linearly backpropagated to the previous layer (the hidden layer) where the weights and biases are adjusted to again minimise the error. This can be continued for as many layers as exist; in our case only one.

6. The process is then repeated with the data re-input at the input layer until either the maximum number of training cycles (epochs) is reached or the target sum square error is achieved.
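The training steps above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions, not the toolbox routine used in this work: the weight update is simple per-sample gradient descent with a fixed learning rate, the names `train`, `lr` and `goal` are hypothetical, and the toy data set (a one-input, one-output quadratic) stands in for the six-input, six-output force and position data.

```python
import math
import random

def forward(P, W1, B1, W2, B2):
    # Equation (8.3): R = tansig(W1 . P + B1); tansig is numerically tanh.
    R = [math.tanh(sum(w * p for w, p in zip(row, P)) + b) for row, b in zip(W1, B1)]
    # Equation (8.5): O = purelin(W2 . R + B2), where purelin(x) = x.
    O = [sum(w * r for w, r in zip(row, R)) + b for row, b in zip(W2, B2)]
    return R, O

def sum_square_error(data, W1, B1, W2, B2):
    # Equation (8.7): sum over training pairs of (T - O)^T (T - O).
    total = 0.0
    for P, T in data:
        _, O = forward(P, W1, B1, W2, B2)
        total += sum((t - o) ** 2 for t, o in zip(T, O))
    return total

def train(data, n_hidden, epochs=500, goal=1e-3, lr=0.05, seed=0):
    # Hypothetical helper: small random initial weights, then gradient descent.
    rng = random.Random(seed)
    n_in, n_out = len(data[0][0]), len(data[0][1])
    W1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_hidden)]
    B1 = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
    W2 = [[rng.uniform(-0.5, 0.5) for _ in range(n_hidden)] for _ in range(n_out)]
    B2 = [rng.uniform(-0.5, 0.5) for _ in range(n_out)]
    for _ in range(epochs):
        for P, T in data:
            R, O = forward(P, W1, B1, W2, B2)
            # Output layer error term; the derivative of purelin is 1.
            d_out = [t - o for t, o in zip(T, O)]
            # Error propagated back through W2; the derivative of tanh is 1 - R^2.
            d_hid = [(1.0 - r * r) * sum(W2[k][j] * d_out[k] for k in range(n_out))
                     for j, r in enumerate(R)]
            # Adjust output layer weights and biases.
            for k in range(n_out):
                for j in range(n_hidden):
                    W2[k][j] += lr * d_out[k] * R[j]
                B2[k] += lr * d_out[k]
            # Adjust hidden layer weights and biases.
            for j in range(n_hidden):
                for i in range(n_in):
                    W1[j][i] += lr * d_hid[j] * P[i]
                B1[j] += lr * d_hid[j]
        # Stop early once the sum square error goal is met.
        if sum_square_error(data, W1, B1, W2, B2) <= goal:
            break
    return W1, B1, W2, B2

# Toy data: y = x^2 sampled on [-1, 1] (illustrative only).
data = [([x / 4.0], [(x / 4.0) ** 2]) for x in range(-4, 5)]
untrained = train(data, n_hidden=4, epochs=0)   # initial weights only
trained = train(data, n_hidden=4, epochs=500)
print("SSE before training:", sum_square_error(data, *untrained))
print("SSE after training: ", sum_square_error(data, *trained))
```

Calling `train` with `epochs=0` returns the initial random weights, which gives a baseline against which the trained error can be compared.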

After training, the neural network is used in the feed forward direction only. The input vector P is applied to the input nodes and an output vector O is the result. Practical problems arising in the training of artificial neural networks, such as how to choose the number of input nodes, are discussed in section 8.4.
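As a rough illustration of why feed forward use is cheap, the sketch below evaluates a six-input, six-output network once and reports the elapsed time. The weights are random placeholders rather than trained values, the hidden layer size of 20 and the input reading are hypothetical, and the timing depends entirely on the machine, so the figure printed should be taken only as an order-of-magnitude indication against the 28 ms control period.

```python
import math
import random
import time

# Six inputs and six outputs as in the text; 20 hidden nodes is a hypothetical choice.
N_IN, N_HIDDEN, N_OUT = 6, 20, 6

rng = random.Random(1)
# Random placeholders standing in for trained weights and biases.
W1 = [[rng.uniform(-1, 1) for _ in range(N_IN)] for _ in range(N_HIDDEN)]
B1 = [rng.uniform(-1, 1) for _ in range(N_HIDDEN)]
W2 = [[rng.uniform(-1, 1) for _ in range(N_HIDDEN)] for _ in range(N_OUT)]
B2 = [rng.uniform(-1, 1) for _ in range(N_OUT)]

def feed_forward(P):
    # Hidden layer: tan-sigmoid of the weighted, biased inputs (equation 8.3).
    R = [math.tanh(sum(w * p for w, p in zip(row, P)) + b) for row, b in zip(W1, B1)]
    # Output layer: linear function of the weighted, biased hidden outputs (equation 8.5).
    return [sum(w * r for w, r in zip(row, R)) + b for row, b in zip(W2, B2)]

P = [1.0, -2.0, 0.5, 0.1, -0.3, 0.2]  # hypothetical force/torque reading
start = time.perf_counter()
O = feed_forward(P)
elapsed_ms = (time.perf_counter() - start) * 1000.0
print(f"{len(O)} outputs in {elapsed_ms:.3f} ms (control period: 28 ms)")
```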

8.3. Experimental Procedure For Data Gathering

In document Compliant force control for automated sub-sea inspection (Page 128-132)