Mathcad - neural

Neural Nets: An Introduction for Mathcad Users

by Eric Edelstein
MathSoft, Inc.

In this article, we will consider modeling a feed forward network (a special type of weighted directed graph) after the way a brain operates and begin looking at algorithms that teach the network how to learn.

One of the things that makes humans efficient is our ability to change. This manifests itself in many ways. The first is that our brains need not be completely redesigned just to change our lunch order when we find they're out of octopus sukiyaki. This is non-trivial. Consider the advantages our flexibility has over, for example, a microchip. The electronic circuits may be able to perform many different operations, but the number is finite and the abilities don't change over time. If you have an OR gate, it must be taken apart and rebuilt if you want an AND gate. If you want it to add numbers, you have to compile quite a few components.

A human, however, can add to his/her stockpile of abilities without adding brain cells. How does this happen? No one knows exactly, but certain ideas in the theory of learning are getting clearer, and some can be modeled on a computer. We are now in the age where computers can be taught to learn new tricks. That is, a program representing a neural net can be made to learn infinitely many different routines (one at a time). That makes it extremely flexible, and hence, powerful. Neural nets have been created that mimic and anticipate human behavior, run machinery in automated factories, read books aloud, make complex financial decisions, and perform a host of other impressive tasks.

One of the most common tasks required of a neural net is the recognition of patterns and reaction to them in some manner. This will be demonstrated in this article. The reason for this particular emphasis is that once a neural net can find a pattern, it can start predicting. The art of prediction is an old one. There are many and varied statistical techniques to approximate predictions. However, neural nets have been shown to be more accurate on some occasions. Also, unlike a standard statistical program which allows for one set of analyses, the same neural net can learn to do different analyses on different kinds of data. It will just need to be retrained. However, the most fundamental difference is one of action. A neural net will not only predict, but will also act in accordance with this prediction as we shall see later on.

Now, consider the cellular makeup of a brain: neurons. There are millions of neurons interconnected along axons. The center of a neuron receives stimuli and decides, somehow, whether or not to send a signal to neighboring neurons. If it decides to send out a signal, an electrical burst, it does so through the axons. This is how the brain makes its own predictions and actions. Given this description, the brain can be thought of as a graph with the main neuron body represented by a node or vertex and the edges representing axons.

These graphs (also called networks) contain points, called nodes or vertices; the line segments connecting these nodes are called edges. The endpoints of an edge are its vertices. An orientation of the edges is a choice of starting and ending vertex for the edge. Usually, we draw an arrow on an oriented edge pointing from the initial to the final node. If each edge of the graph has an orientation, the graph is called a directed graph (or digraph, for short).

A graph with this association and the inherent implications that brings is called a neural network (or a neural net, for short). We shall restrict ourselves to the study of neural nets of a certain form: we assume our neural nets are layered and feed-forward. These are weighted, directed graphs with nodes that can be broken up into discrete vertical layers (that is, the nodes lie on vertical slices through the graph). The orientations given to the edges are the same throughout the graph, either left to right or right to left. In this article we will use the convention of left to right. Such a digraph looks like:

With the edge orientation of left to right, the leftmost layer is called the input layer, the rightmost, the output layer, and all those between, the hidden layers. Nodes are often called units, making the leftmost ones, input units, the rightmost, output units, and those in between, hidden units.

As mentioned earlier, we are concerned not only with the choices of edges and nodes, but also with the weighting of them. To determine what the weighting should be, let's return to the brain. If a neuron receives a very small stimulus, it does not fire. Once, however, it does receive a significant enough stimulus, it fires a complete burst. It follows the all-or-none principle. The cut-off value for stimuli is called a threshold. It is the amount of stimulus for a particular neuron below which no reaction signal will be sent.

In modeling graphs after brains, we associate to each node, $\nu$, a threshold value, $\tau_\nu$, rather like a transistor has in a logic gate.

The axon connections may be very strong or weak. That is, the signal sent from one neuron to another via a particular axon may be completely passed on, or it may be impeded. This can be thought of as the strength of the connection. The degree of connection between those two neurons will reflect how interdependent they are. This strength between them is used to define the weights on the edges in the neural net. If the weight is close to zero on an edge between two nodes, then we can think of these two units as having little effect on each other. If, on the other hand, the weights are high in absolute value, then the units' effect on each other is strong. The weight on the edge from vertex $\mu$ to vertex $\nu$ is denoted $w_{\mu\nu}$.

At this point we've completed the fundamental association between a simplified brain and a graph.

It remains only to show how signals are passed along. Let's say that we're in the middle of a neural net at a vertex, $\nu$. It would look something like:

Where $x_1$ through $x_4$ represent the strengths of the impulses that have been sent to this node, $\nu$. The effect of $x_1$ on $\nu$ will be determined by the strength of the connection $W_{1\nu}$. So by defining our weights correctly the effect of $x_1$ on $\nu$ will be the product $W_{1\nu}\,x_1$. Taking the other incoming impulses into consideration, the total impulse arriving at $\nu$ is

\[ \sum_{i=1}^{4} x_i\, W_{i\nu} \]

The reaction at $\nu$ to this impulse must be determined. First we must see if the incoming signal passes the threshold test. To do this, subtract the threshold from the impulse and determine if the result is positive or negative. Then, a response function of some kind, called the activation function, will act on the impulse, provided it is above the threshold level. We perform these two steps as one by assuming some structure on the function. We will assume that the activation function will treat positive numbers and negative numbers differently. That is, the function values for a positive input will correspond to the neuron firing. The function values for negative input values will correspond to non-firing. With this we find the response at $\nu$ to the stimulus is:

\[ f\left( \sum_i x_i\, W_{i\nu} - \tau_\nu \right) \]
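The response formula above translates almost word for word into code. The following is only a sketch in Python: the function names, the sample impulses and weights, and the particular all-or-none activation `step` are our own illustrative assumptions, not part of the worksheet.

```python
def node_response(inputs, weights, threshold, activation):
    """Response of one unit: activation(sum_i x_i * W_i - threshold)."""
    stimulus = sum(x * w for x, w in zip(inputs, weights))
    return activation(stimulus - threshold)


def step(x):
    """All-or-none activation (an assumed choice): 1 for positive input, else 0."""
    return 1 if x > 0 else 0


# Four incoming impulses x1..x4 and their edge weights; the numbers are made up.
x = [1.0, 0.5, 2.0, 0.0]
W = [0.5, 1.0, 0.25, 3.0]

# Total stimulus is 1.5, so the unit fires for threshold 1.0 but not for 2.0.
print(node_response(x, W, threshold=1.0, activation=step))
print(node_response(x, W, threshold=2.0, activation=step))
```

Passing the activation in as an argument keeps the weighted sum and the threshold test separate from the choice of response function, which is the split the text describes.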

A typical activation function might be

\[ f(x) := (x > 0) \]

where the Boolean expression evaluates to 1 when it is true and to 0 when it is false.

[Plot: $f(x)$ against $x$ for $x := -5, -4.995 .. 5$]

This is an example of an all or none response. Note what it would look like when applied in a neural net with a threshold value 3:

\[ \tau_\nu := 3 \qquad f(x) := (x - \tau_\nu > 0) \]

[Plot: $f(x)$ against $x$ with the threshold applied]

To get an idea of what's going on geometrically, let's consider two impulses going to a unit with the same threshold of 3. Let's say one edge has a weight of a half, and the other a quarter.

\[ w_1 := .5 \qquad w_2 := .25 \qquad f(x_1, x_2) := (x_1 w_1 + x_2 w_2 - \tau_\nu > 0) \]

\[ i := -5 .. 10 \qquad j := -5 .. 10 \qquad M_{(i+5),(j+5)} := f(i, j) \]

[Surface plot of $M$]

The $z$-axis describes the node's reaction output to the two stimuli $x_1$ and $x_2$, plotted in the $x$-$y$ plane.
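For readers without the worksheet at hand, the surface can be approximated as a grid of 0's and 1's. This is a rough Python sketch; the weights, the threshold and the index ranges come from the text, while the strict comparison and the names `W1`, `W2`, `TAU` are assumptions of ours.

```python
W1, W2, TAU = 0.5, 0.25, 3.0   # weights and threshold from the text


def f(x1, x2):
    """1 if the weighted stimulus exceeds the threshold, else 0 (strict '>' is assumed)."""
    return 1 if x1 * W1 + x2 * W2 - TAU > 0 else 0


# M[i + 5][j + 5] = f(i, j) for i, j = -5..10, mirroring the matrix M of the worksheet.
M = [[f(i, j) for j in range(-5, 11)] for i in range(-5, 11)]

# One row of digits per value of i; the 1's mark where the unit fires.
for row in M:
    print("".join(str(v) for v in row))
```

The 1's fill the half-plane where 0.5 x1 + 0.25 x2 is greater than 3, so the boundary between firing and not firing is a straight line in the plane of the two stimuli.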

The neural net to the left of the node

Now that we know how a single node reacts to stimuli, we can determine the outputs of the output units for a choice of input units. We consider a very simple neural net:

There are two input nodes, 1 and 2, three hidden units, 3, 4, and 5, and one output node 6.

Let's assign some weights to the edges.

\[ (w_{13}\ \ w_{14}\ \ w_{24}\ \ w_{25}\ \ w_{36}\ \ w_{46}\ \ w_{56}) := (1\ \ 1\ \ 1\ \ 1\ \ 1\ \ -2\ \ 1) \]

We must decide upon an activation function. Let's choose:

Pick threshold values:

\[ (\tau_3\ \ \tau_4\ \ \tau_5\ \ \tau_6) := (0\ \ 1.5\ \ 0\ \ .5) \]

Now pick the input values:

\[ (y_1\ \ y_2) := (1\ \ 1) \]

For the input layer we assume the thresholds are zero and the activation function is the identity, so that the signal put into 1 is the same as the signal coming out from 1.

\[
\begin{aligned}
y_3 &:= f(y_1 w_{13} - \tau_3) \\
y_4 &:= f(y_1 w_{14} + y_2 w_{24} - \tau_4) \\
y_5 &:= f(y_2 w_{25} - \tau_5) \\
y_6 &:= f(y_3 w_{36} + y_4 w_{46} + y_5 w_{56} - \tau_6)
\end{aligned}
\]

The output unit for the corresponding input pattern is $y_6 = 0$.

Do you recognize this binary function? (Hint: It's one of the standard logical operations.)
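If you would like to check your guess, the whole net can be evaluated for all four input patterns. The Python sketch below rests on two assumptions of ours: the activation is the all-or-none function that returns 1 for positive input and 0 otherwise, and the weight on the edge from unit 4 to unit 6 is -2, which is the reading consistent with the stated output of 0 for the input (1, 1).

```python
def f(x):
    """Assumed all-or-none activation: 1 for positive input, else 0."""
    return 1 if x > 0 else 0


# Edge weights and thresholds; w46 = -2 is an assumed reading (see the lead-in).
w13, w14, w24, w25, w36, w46, w56 = 1, 1, 1, 1, 1, -2, 1
t3, t4, t5, t6 = 0, 1.5, 0, 0.5


def net(y1, y2):
    """Feed the inputs through the hidden units 3, 4, 5 to the output unit 6."""
    y3 = f(y1 * w13 - t3)
    y4 = f(y1 * w14 + y2 * w24 - t4)
    y5 = f(y2 * w25 - t5)
    return f(y3 * w36 + y4 * w46 + y5 * w56 - t6)


# Print the full table of outputs for binary inputs.
for y1 in (0, 1):
    for y2 in (0, 1):
        print(y1, y2, "->", net(y1, y2))
```

Unit 4 only fires when both inputs are on, and its negative weight then cancels the contributions of units 3 and 5 at the output.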

Let's now consider how to change the net. Thinking of the graph as a brain, it seems clear that as learning goes on, the vertices (neurons) aren't going to go wandering all about. That is, as we learn, the cellular structure of the brain can't move around very much. It was found that as we learn, the chemical structure of the brain does change in small local ways. When we learn to do something, or not to do something else, various connections between the neurons are either strengthened or weakened. This corresponds to a change in the edge weights of our network. We start with the simplest type of neural net, a two layered, feed-forward net. We will show how the weight changes take place. Since there are only two layers, and in every feed forward net there is both an input and output layer, there can be no hidden units.

We say that a layered graph is fully connected if every node in each layer is connected to every node in the next layer to the right. It generally looks like:

Note that nodes in one layer aren't connected to any other nodes in the same layer. This is always the case in layered neural nets.
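One way to make the definition concrete is to list the edges of a fully connected layered graph. In this Python sketch the function name and the node labels are hypothetical; only the rule "every node is joined to every node in the next layer to the right, and to nothing else" comes from the text.

```python
from itertools import product


def fully_connected_edges(layers):
    """Return every edge (u, v) with u in one layer and v in the next layer to the right."""
    edges = []
    for left, right in zip(layers, layers[1:]):
        edges.extend(product(left, right))
    return edges


# Hypothetical node labels: two input units, three hidden units, one output unit.
layers = [[1, 2], [3, 4, 5], [6]]
edges = fully_connected_edges(layers)
print(len(edges), edges)
```

Because edges are only ever built between neighboring layers, no two nodes in the same layer are joined, and no edge skips a layer.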

There is a routine that we can carry out so that the neural net can figure out what the weights on the edges should be to realize a certain set of fixed reactions. We feed the net specific inputs with known desired outputs. We compare the network's output with the desired output and change weights accordingly. This routine is then repeated until all outputs are correct for all inputs.

Essentially, this can be thought of as a pattern recognition problem. Let's say that we have a 2 layer neural net with two input nodes, and one output. We might want to teach the net to produce the result 1 AND 2 for the output 3, using the following logic table:

    node 1   node 2   node 3 (1 AND 2)
      0        0        0
      0        1        0
      1        0        0
      1        1        1

The net must be trained to recognize the pattern (1,1) as 1 and the other three as 0, in the same way as you apply a name to a face.

For problems of this type it is often convenient to talk about input and output patterns. We've already mentioned that the input can be thought of as a pattern. The output can be thought of as one as well. Consider a big neural net with one input node, some hidden units, and 64 output nodes arranged as an 8 by 8 square. We could train the net so that, given an input of 0, it sends 1's to the outermost units of the square, and 0's to all others. We could, in addition, teach it that, given an input of 1, it should produce outputs of 1's in the fourth and fifth columns of the square of output units, and 0's in the others. It would look like:

The 0's have been left out for clarity. The ellipse in the middle represents the hidden units. The square represents the output units in an 8 by 8 square. As you can see, the output now represents a pattern in the visual sense. The output looks like the numeral for the input (well, sort of).
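The two target patterns are easy to write down explicitly. This short Python sketch builds them from the description above; the function name `output_pattern` is our own, and the code only produces the desired outputs, not a trained net.

```python
def output_pattern(value):
    """Desired 8x8 output grid for input 0 or 1, as described in the text."""
    n = 8
    if value == 0:
        # 1's on the outermost units of the square, 0's elsewhere.
        return [[1 if r in (0, n - 1) or c in (0, n - 1) else 0 for c in range(n)]
                for r in range(n)]
    if value == 1:
        # 1's in the fourth and fifth columns (indices 3 and 4), 0's elsewhere.
        return [[1 if c in (3, 4) else 0 for c in range(n)] for r in range(n)]
    raise ValueError("only the inputs 0 and 1 have a target pattern")


for value in (0, 1):
    for row in output_pattern(value):
        # Print the 1's and leave the 0's blank, as in the figure.
        print("".join("1" if v else " " for v in row))
    print()
```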

As far as the computer is concerned, the neural net is a function,

\[ f: \mathbb{R} \to \mathbb{R}^{64} \]

with the following property: $f(0)$ is the 8 by 8 pattern with 1's on the outermost units, and $f(1)$ is the 8 by 8 pattern with 1's in the fourth and fifth columns.

In this way we realize that pattern recognition and learning the action of a fixed function are the same in principle.

With this in mind, there is a learning algorithm which teaches the two layered feed-forward neural net to recognize patterns. It works as follows:

Note: By "binary" in this section, we mean the set {-1,1} (we use -1 instead of the usual 0).

Assume we start with the edges having random weights assigned to them. Then, given an input pattern $I$ (some sequence of -1's and 1's), there is an output pattern $Z$ (a number, though in general, not the correct one) and a corresponding desired output pattern $O$ (also a number). The weights going out from the $v$th input unit must be changed by adding:

\[ \Delta w_v = \varepsilon\, O\, I_v\, [1 - (O = Z)] \]

so that the new weight is

\[ w_v + \varepsilon\, O\, I_v\, [1 - (O = Z)] \]

Here $(O = Z)$ is a Boolean expression, equal to 1 when the two outputs agree and 0 when they do not, and $\varepsilon$ is a small increment. We find the direction for the change from the $O\, I_v\, (1 - (O = Z))$ part. The step size is given by $\varepsilon$. Note that if the net's output is the ideal desired output (i.e., it has learned to identify that pattern or function correctly), then $O = Z$. In this case $\Delta w = 0$ for the net, so no changes will take place. This follows the "if it ain't broke, don't fix it" principle of higher computer science.

Since a function usually consists of correctly identifying several patterns (one pattern for each point in its domain), we would like to see this net learn several different patterns concurrently. This is one of the real advantages of the neural net model. It can learn several different things without changing its basic structure. You can have a neural net learn the AND function, and then with a change of weights learn the OR function. No new circuitry is needed.
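The update rule is short enough to state as a single function. This is a minimal Python sketch of the rule as written above; the function name, the argument names and the sample numbers are illustrative assumptions.

```python
def weight_change(epsilon, desired, actual, pattern):
    """Delta w_v = epsilon * O * I_v * [1 - (O = Z)] for every input unit v."""
    mismatch = 0 if desired == actual else 1   # this is 1 - (O = Z)
    return [epsilon * desired * i_v * mismatch for i_v in pattern]


# Wrong output (Z = 1 while O = -1): every weight is nudged by epsilon.
print(weight_change(0.3, desired=-1, actual=1, pattern=[1, -1, 1]))
# Correct output (Z = O): nothing changes.
print(weight_change(0.3, desired=1, actual=1, pattern=[1, 1, 1]))
```

Each weight moves by exactly epsilon in absolute value whenever the output is wrong, since both $O$ and $I_v$ are -1 or 1; only the sign of the step depends on the pattern.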

The more patterns we try to make the net learn, the more likely it will incorrectly remember a previously learned pattern. Luckily, the weights won't have changed much (with $\varepsilon$ small), so we keep training and retraining. In certain cases, it has been proven that this method must converge to successful recognition of several patterns in a finite number of steps. This problem is very much like the tent peg problem. It's easy to nail in one peg, but while nailing the second peg, you've loosened the first, which then has to get rehammered.

One final improvement before continuing. Since we want to be able to change the thresholds as the network learns, we treat them as weights for new edges. To do this we add a new node for each different threshold in the net. When we give the net its input patterns, we make sure the value of 1 goes to the nodes providing threshold values. The weight on an edge connecting such a vertex to the next layer will work as a threshold.
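To see why this trick works, compare a unit with an explicit threshold against the same unit with an extra input fixed at 1. The Python sketch below assumes the subtract-the-threshold convention used earlier, so the new edge carries the weight $-\tau$; the function names and the all-or-none activation are our own choices for illustration.

```python
def step(x):
    """Assumed all-or-none activation: 1 for positive input, else 0."""
    return 1 if x > 0 else 0


def with_threshold(inputs, weights, tau):
    """Unit with an explicit threshold: f(sum_i x_i * w_i - tau)."""
    return step(sum(x * w for x, w in zip(inputs, weights)) - tau)


def with_threshold_node(inputs, weights, tau):
    """Same unit, with the threshold folded into an extra edge of weight -tau
    coming from an added node whose value is always 1."""
    extended_inputs = [1] + list(inputs)
    extended_weights = [-tau] + list(weights)
    return step(sum(x * w for x, w in zip(extended_inputs, extended_weights)))


# Both forms agree on every binary input pair.
for a in (0, 1):
    for b in (0, 1):
        print(a, b, with_threshold([a, b], [1, 1], 1.5),
              with_threshold_node([a, b], [1, 1], 1.5))
```

Once the threshold is just another weight, the same update rule that adjusts the edges can adjust it too, with no special case.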

Let's try an example: Say we want the computer to come up with a neural net that will produce an AND function. We start with a net that has two input nodes, one output node, and no hidden units. This is only a guess. In general it is a difficult problem to know how many units are needed to solve your problem, and if it's solvable by these methods at all. Let's assume that all thresholds will be the same through the learning. In this case it is sufficient to add only one input node (which will always get an input value of 1). The network looks like this:

With a little foresight and a hunch based on our choice of the binary system as {-1,1}, we chose the activation function accordingly:

\[ f(x) := \frac{x}{|x| + (x = 0)} \]

The Boolean term $(x = 0)$ is 1 only at $x = 0$, so the denominator never vanishes and $f$ returns the sign of $x$:

\[ f(0) = 0 \qquad f(5) = 1 \qquad f(-5) = -1 \]

[Plot: $f(x)$ against $x$]

\[ k := 0 .. 2 \]

We start with the weights set randomly. Let's try:

\[ w_0 := 1 \qquad w_1 := 0 \qquad w_2 := 2 \qquad \varepsilon := .3 \]

For this network, the output, $Z$, is given by:

\[ Z(\nu_0, \nu_1, \nu_2) := f(\nu_0 w_0 + \nu_1 w_1 + \nu_2 w_2) \]
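For readers following along outside Mathcad, the activation function and the output $Z$ can be sketched in Python. We assume here that the activation is the sign function with $f(0) = 0$, which agrees with the evaluations $f(5) = 1$ and $f(-5) = -1$; the default argument `w` is our own device for carrying the starting weights.

```python
def f(x):
    """Sign-type activation: -1 for negative input, 0 at zero, +1 for positive input."""
    return (x > 0) - (x < 0)


def Z(v0, v1, v2, w=(1, 0, 2)):
    """Output of the two-layer net; w defaults to the starting weights w0, w1, w2."""
    return f(v0 * w[0] + v1 * w[1] + v2 * w[2])


print(f(5), f(-5), f(0))
print(Z(1, -1, -1))
```

The first argument of `Z` is the value fed to the threshold node, so it is always 1 in what follows.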

The first pattern (1,-1,-1). This has an ideal output of -1.

\[ I := (1\ \ -1\ \ -1)^T \qquad O := -1 \]

The actual output is:

\[ Z_1 := Z(1, -1, -1) \qquad Z_1 = -1 \]

The change of weights:

\[ \varepsilon\, O\, I_k\, [1 - (O = Z_1)] = (0\ \ 0\ \ 0)^T \]

Change the weights:

\[ w_k := w_k + \varepsilon\, O\, I_k\, [1 - (O = Z_1)] \]

New weights:

\[ w_0 = 1 \qquad w_1 = 0 \qquad w_2 = 2 \]

The second pattern (1,1,-1). This has an ideal output of -1.

\[ I := (1\ \ 1\ \ -1)^T \qquad O := -1 \]

The actual output is:

\[ Z_2 := Z(1, 1, -1) \qquad Z_2 = -1 \]

The change of weights:

\[ \varepsilon\, O\, I_k\, [1 - (O = Z_2)] = (0\ \ 0\ \ 0)^T \]

Change the weights:

\[ w_k := w_k + \varepsilon\, O\, I_k\, [1 - (O = Z_2)] \]

New weights:

\[ w_0 = 1 \qquad w_1 = 0 \qquad w_2 = 2 \]

The third pattern (1,-1,1). This has an ideal output of -1.

\[ I := (1\ \ -1\ \ 1)^T \qquad O := -1 \]

The actual output is:

\[ Z_3 := Z(1, -1, 1) \qquad Z_3 = 1 \]

The change of weights:

\[ \varepsilon\, O\, I_k\, [1 - (O = Z_3)] = (-0.3\ \ 0.3\ \ -0.3)^T \]

Change the weights:

\[ w_k := w_k + \varepsilon\, O\, I_k\, [1 - (O = Z_3)] \]

New weights:

\[ w_0 = 0.7 \qquad w_1 = 0.3 \qquad w_2 = 1.7 \]
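Since this is the first pattern for which the net's answer is wrong, it may help to spell the arithmetic out. Taking $\varepsilon = 0.3$, $O = -1$, $I = (1, -1, 1)^T$ and the weights $(1, 0, 2)$ from before, the output is $f(1 \cdot 1 + (-1) \cdot 0 + 1 \cdot 2) = f(3) = 1$, which differs from $O$, so the Boolean $(O = Z_3)$ is 0:

```latex
\[
\Delta w = \varepsilon\, O\, I\, [1 - (O = Z_3)]
         = 0.3 \cdot (-1) \cdot (1,\ -1,\ 1)^T \cdot [1 - 0]
         = (-0.3,\ 0.3,\ -0.3)^T
\]
\[
w + \Delta w = (1,\ 0,\ 2)^T + (-0.3,\ 0.3,\ -0.3)^T = (0.7,\ 0.3,\ 1.7)^T
\]
```

With the new weights the stimulus for this pattern drops from 3 to $0.7 - 0.3 + 1.7 = 2.1$, a step in the right direction but not yet below zero, which is why more passes are needed.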

The fourth pattern (1,1,1). This has an ideal output of +1.

\[ I := (1\ \ 1\ \ 1)^T \qquad O := 1 \]

The actual output is:

\[ Z_4 := Z(1, 1, 1) \qquad Z_4 = 1 \]

The change of weights:

\[ \varepsilon\, O\, I_k\, [1 - (O = Z_4)] = (0\ \ 0\ \ 0)^T \]

Change the weights:

\[ w_k := w_k + \varepsilon\, O\, I_k\, [1 - (O = Z_4)] \]

New weights:

\[ w_0 = 0.7 \qquad w_1 = 0.3 \qquad w_2 = 1.7 \]

At this point we've made a pass through each pattern exactly once. We repeat this procedure several times, until the weights stabilize. To do this, change the initial assignments of the weights to the edges (where the big red arrow is). Then page down to see what the new weights should be.

Eventually, you will see that the matrices of weight changes are zero. At this point the weights stop changing, and the output will be the correctly predicted and desired output for each pattern. This should take six complete passes starting with $w_0=1$, $w_1=0$, and $w_2=2$.
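Without the worksheet, the repeated passes can be automated. The Python sketch below is one possible rendering of the procedure, not the worksheet itself: it assumes the sign-type activation, the four AND patterns in the {-1,1} convention with a leading 1 for the threshold node, and the update rule given earlier. The names `PATTERNS` and `train` are our own.

```python
def f(x):
    """Sign-type activation: -1, 0 or +1."""
    return (x > 0) - (x < 0)


# AND in the {-1, 1} convention; the leading 1 feeds the threshold node.
PATTERNS = [((1, -1, -1), -1), ((1, 1, -1), -1), ((1, -1, 1), -1), ((1, 1, 1), 1)]


def train(weights, epsilon=0.3, max_passes=100):
    """Repeat passes over all patterns until one pass changes no weight.
    Returns (final weights, number of passes made, counting the unchanged one)."""
    w = list(weights)
    for n in range(1, max_passes + 1):
        changed = False
        for pattern, desired in PATTERNS:
            actual = f(sum(i * wk for i, wk in zip(pattern, w)))
            if actual != desired:
                # w_k := w_k + epsilon * O * I_k, applied only when O differs from Z.
                w = [wk + epsilon * desired * i for wk, i in zip(w, pattern)]
                changed = True
        if not changed:
            return w, n
    raise RuntimeError("weights did not stabilize")


weights, passes = train([1, 0, 2])
print(passes, [round(wk, 2) for wk in weights])
```

Starting from the weights 1, 0 and 2, this sketch makes six passes, the sixth being the one in which nothing changes, and settles on weights of about -0.5, 0.9 and 1.1.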

In Future Issues:

BIG, Multilayered neural nets, Gradient Descent Learning, and Back Propagation

References

1. Drew Van Camp, "Neurons for Computers," Scientific American, Sept. 1992, pp. 170-172.

2. R. C. Lacher, Artificial Neural Networks, An Introduction to the Theory and Practice. Lecture Notes, Version 1, October 19, 1991.

3. Patrick Shea and Vincent Lin, "Detection of Explosives in Checked Airline Baggage Using an Artificial Neural System," Science Applications International Corporation, Santa Clara, CA.