4.2.3 Higher-Dimensional Classification

Classification in higher-dimensional problems, that is, problems in which the input pattern x has more than two dimensions, is done in the same way as in two-dimensional problems. The main difference is that you can no longer illustrate the result with a single plot of the input space. Instead, you can view the data in various two-dimensional projections, and you can look at how the data is distributed among the classes. Both can be done with commands in the Neural Networks package, as illustrated next.

Load the Neural Networks package and the data.

In[1]:= <<NeuralNetworks`

In[2]:= << threeclasses3D.dat;

The input patterns are placed in the matrix x and the output in y.

Check the dimensions of the data.

In[3]:= Dimensions[x]
Dimensions[y]

Out[3]= {40, 3}

Out[4]= {40, 3}

There are 40 data samples. The input matrix x has three columns, which means that the data is in a three-dimensional space. The output y also consists of three columns, which means that the perceptron should have three outputs, one for each class.

By analogy to the data in Section 3.4.1, Classification Problem Example, this data could correspond to (scaled values of) the age, weight, and height of children from three different groups.

The main difference compared to two-dimensional problems is that you cannot look at the data in the same way. It is, however, possible to look at projections of the data. To do that, you need a projection matrix of dimensions #inputs × 2.
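As a minimal sketch of what such a projection does, the following uses only built-in Mathematica functions. The names xDemo and proj13 and the numbers in them are hypothetical stand-ins, not the data loaded from the package. A projection matrix with a single 1 in each column simply selects two of the input coordinates.

```mathematica
(* Hypothetical stand-in data: 4 samples in a three-dimensional input space *)
xDemo = {{1, 2, 3}, {4, 5, 6}, {7, 8, 9}, {10, 11, 12}};

(* Projection matrix of dimensions #inputs x 2; this one keeps coordinates 1 and 3 *)
proj13 = {{1, 0}, {0, 0}, {0, 1}};

xDemo . proj13
(* {{1, 3}, {4, 6}, {7, 9}, {10, 12}} *)

Dimensions[xDemo . proj13]
(* {4, 2} *)
```

Other projection matrices, for example ones with non-integer entries, give oblique views of the same data.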

Look at a projection of the data.

In[5]:= NetClassificationPlot[x . {{1,0},{0,1},{0,0}}, y, SymbolStyle → {Hue[.5], Hue[.7], Hue[.9]}]


There are 20 data samples of class two and ten each of classes one and three.
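The class sizes can also be counted directly from the output matrix. This is a minimal sketch, assuming that each row of the output contains a 1 in the column of its class and 0 elsewhere. The matrix yDemo is a hypothetical stand-in, not the loaded data.

```mathematica
(* Hypothetical stand-in outputs: one row per sample, one column per class *)
yDemo = {{1, 0, 0}, {0, 1, 0}, {0, 1, 0}, {0, 0, 1}};

(* Column sums give the number of samples in each class *)
Total[yDemo]
(* {1, 2, 1} *)
```

Under the same assumption, evaluating Total[y] on the loaded data should reproduce the class sizes stated above.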

You can now train a perceptron with this data set.

Train the perceptron.

In[6]:= {per, fitrecord} = PerceptronFit[x, y];

[Plot: SSE versus Iterations, for iterations 0 to 8.]

Success of the training depends on the initial weights of the perceptron. If you repeat the command, you will likely obtain slightly different results.
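The dependence on the initial weights can be reproduced with a small experiment. The sketch below implements a generic textbook batch perceptron rule with built-in functions only; it is not the algorithm used inside PerceptronFit, and all names (xToy, yToy, perceptronOut, perceptronStep, trainToy), the step size 0.1, and the toy data are hypothetical. It also assumes a Mathematica version that provides RandomReal and SeedRandom.

```mathematica
(* Hypothetical toy data: 4 samples in 3-D, two classes, one output column per class *)
xToy = {{0., 0., 0.}, {0., 1., 0.}, {2., 2., 1.}, {2., 3., 1.}};
yToy = {{1, 0}, {1, 0}, {0, 1}, {0, 1}};

(* Step-activation outputs for all samples; b is added to every row of x . w *)
perceptronOut[x_, w_, b_] := UnitStep[(# + b) & /@ (x . w)];

(* One batch update of the textbook perceptron rule with step size eta *)
perceptronStep[{w_, b_}, x_, y_, eta_] :=
  Module[{err = y - perceptronOut[x, w, b]},
    {w + eta Transpose[x] . err, b + eta Total[err]}];

(* Train from random initial weights; the seed fixes the starting point *)
trainToy[seed_, iters_: 50] :=
  Module[{w, b},
    SeedRandom[seed];
    w = RandomReal[{-1, 1}, {3, 2}];
    b = RandomReal[{-1, 1}, 2];
    {w, b} = Nest[perceptronStep[#, xToy, yToy, 0.1] &, {w, b}, iters];
    (* Sum of squared errors after the last iteration *)
    Total[(yToy - perceptronOut[xToy, w, b])^2, 2]
  ];

(* Final SSE for three different initializations *)
trainToy /@ {1, 2, 3}
```

Each seed gives a different starting point, so the path taken by the training, and possibly the final error, can differ between runs.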

You can use the training record to investigate the training process.

Plot the number of correctly and incorrectly classified data vectors of each class.

In[7]:= NetPlot[fitrecord, x, y, Intervals → 3]

[Plot: Correctly/incorrectly classified data. Three panels (Class: 3, Class: 2, Class: 1), each showing Samples versus Iterations.]

You can also illustrate the classification during training with a bar chart. The result is a graphics array.

Check the evolution of the classifier during the training.

In[8]:= NetPlot[fitrecord, x, y, Intervals → 5, DataFormat → BarChart]

[Plot: graphics array of bar charts; the last is titled "Classification after Iteration: 8".]

To view the result as an animation instead, change the command to Apply[ShowAnimation, NetPlot[fitrecord, x, y, Intervals → 1, DataFormat → BarChart, DisplayFunction → Identity]].

If you are interested only in the end result, submit the perceptron instead of the training record.

Look at only the end result.

In[9]:= NetPlot[per,x,y]

If the classification is perfect, then all samples should be on the diagonal of the three-dimensional bar chart.

The size of the off-diagonal bars corresponds to the number of misclassified samples.
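The numbers behind such a chart can be sketched as a table of counts with one row per true class and one column per predicted class. The example below uses only built-in functions and assumes that both the true outputs and the model outputs contain a single 1 per row; yData and yModel are hypothetical stand-ins, not output of the package.

```mathematica
(* Hypothetical stand-ins: true classes and model classes for 4 samples *)
yData  = {{1, 0, 0}, {0, 1, 0}, {0, 1, 0}, {0, 0, 1}};
yModel = {{1, 0, 0}, {0, 1, 0}, {0, 0, 1}, {0, 0, 1}};

(* Entry (i, j) counts the samples of true class i assigned to class j by the model *)
confusion = Transpose[yData] . yModel
(* {{1, 0, 0}, {0, 1, 1}, {0, 0, 1}} *)

(* Diagonal entries are correct classifications; everything else is misclassified *)
Tr[confusion]
(* 3 *)

Total[confusion, 2] - Tr[confusion]
(* 1 *)
```

Here one sample of class two is assigned to class three, which appears as the single off-diagonal count.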

If you cannot see all the bars properly, you can repeat the command and change the viewpoint. This is most easily done by using the menu command 3D ViewPoint Selector.

Change the viewpoint.

In[10]:= NetPlot[per,x,y,ViewPoint→{2.354, -4.532, 6.530}]

[Plot: three-dimensional bar chart with axes Data (classes 1 to 3), Model (classes 1 to 3), and Samples.]

If the output y is not supplied, the distribution between the classes according to the perceptron model is given.

Illustrate the classification without any output.

In[11]:= NetPlot[per, x, DataFormat → BarChart]

[Plot: bar chart of Samples versus Class (1 to 3).]

However, if you do not supply any output data, the graph cannot indicate which data samples are correctly and incorrectly classified. Instead, you can see only the distribution over the classes according to the perceptron.

4.3 Further Reading

The perceptron is covered in most books on neural networks, especially the following:

S. Haykin, Neural Networks: A Comprehensive Foundation, 2nd ed., New York, Macmillan, 1999.

J. Hertz, A. Krogh, R. G. Palmer, Introduction to the Theory of Neural Computation, Reading, MA, Addison-Wesley, 1991.

This chapter describes FF neural networks, also known as backpropagation networks and multilayer perceptrons. Definitions, commands, and options are discussed in Section 5.1, Feedforward Network Functions and Options, and examples may be found in Section 5.2, Examples. A short tutorial introducing FF networks can be found in Section 2.5.1, Feedforward Neural Networks. Chapter 13, Changing the Neural Network Structure, describes how you can use the options and other ways to define more advanced network structures.

FF networks have a lot in common with those in Chapter 6, The Radial Basis Function Network. They are used for the same types of problems, and they use the same training algorithms (see Section 2.5.3, Training Feedforward and Radial Basis Function Networks).

The Neural Networks package supports the use of FF networks in three special types of problems, as follows:

• Function approximation
• Classification
• Modeling of dynamic systems and time series

This section illustrates the first two applications. Dynamic neural network models are described in Chapter 8, Dynamic Neural Networks. However, because the dynamic neural network models are based on FF networks, they will also be examined here.

The Neural Networks package offers several important features for FF networks, most of which are uncommon in other neural network software products. These features are listed here with links to places where more detailed descriptions are given.

Initialization: There are special initialization algorithms that give well-initialized neural networks. You can obtain an initialization with better performance from these than from one derived from a linear model. After initialization the performance is improved by the training.

Fixed parameters: You do not have to train all parameters. By keeping some of them fixed to values of your choice, you can obtain special model structures that are, for example, linear in some parameters. This is described in Section 13.2, Fixed Parameters.

Different neuron activation function: You can specify any nonlinear activation function for the neuron. This is described in Section 13.3, Select Your Own Neuron Function.

Regularization and stopped search: These techniques help you to obtain models that generalize better on new data. This is covered in Section 7.5, Regularization and Stopped Search.

Linear models: You can obtain linear models by specifying an FF network without hidden layers. Section 2.5.1, Feedforward Neural Networks, discusses why this might be a good choice.

Linear model in parallel to the network: You can choose to have a linear model in parallel to the neural network by setting the option LinearPart to True when you initialize the FF network with InitializeFeedForwardNet.

Several of these features make use of a combination of numeric and symbolic capabilities of Mathematica.
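To make the idea of a linear model in parallel to the network concrete, the following sketch evaluates a small one-hidden-layer network with and without an added linear term. It uses built-in functions only and does not call the package. The parameter values, the names (w1, b1, w2, b2, lin, sigmoid, ffOut, ffOutWithLinear), and the assumption that the parallel part is simply added to the network output are all illustrative.

```mathematica
(* Hypothetical parameters: 2 inputs, 3 hidden neurons, 1 output *)
w1 = {{0.5, -1., 0.2}, {1., 0.3, -0.7}};   (* input-to-hidden weights, 2 x 3 *)
b1 = {0.1, 0., -0.2};                      (* hidden biases *)
w2 = {{1.}, {-0.5}, {2.}};                 (* hidden-to-output weights, 3 x 1 *)
b2 = {0.3};                                (* output bias *)
lin = {{0.4}, {-0.1}};                     (* parallel linear part, 2 x 1 *)

(* Sigmoid activation, applied elementwise to a list *)
sigmoid[v_] := 1/(1 + Exp[-v]);

(* Network output for a single input vector x *)
ffOut[x_] := sigmoid[x . w1 + b1] . w2 + b2;

(* Same network with a linear function of the inputs added in parallel *)
ffOutWithLinear[x_] := ffOut[x] + x . lin;

(* The two outputs differ by exactly the linear term x . lin *)
ffOutWithLinear[{1., 2.}] - ffOut[{1., 2.}]
```

With the hidden layer removed, only the linear term and a bias would remain, which corresponds to the linear models mentioned in the list above.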