The BAM and the Memory

Exercise 4.4: Construct an autoassociative BAM using the following training vectors:

and -1, -1,

Determine the output using an initial vector that is a Hamming distance of two from each training vector. Also try a vector that is the complement of one of the training vectors. Experiment with this network in accordance with the instructions in Exercise 4.3. In addition, try setting the diagonal elements of the weight matrix equal to zero. Does doing so have any effect on the operation of the BAM?

4.2.4 BAM Energy Function

In the previous two chapters, we discussed an iterative process for finding weight values that are appropriate for a particular application. During those discussions, each point in weight space had associated with it a certain error value. The learning process was an iterative attempt to find the weights which minimized the error. To gain an understanding of the process, we examined simple cases having two weights so that each weight vector corresponded to a point on an error surface in three dimensions. The height of the surface at each point determined the error associated with that weight vector. To minimize the error, we began at some given starting point and moved along the surface until we reached the deepest valley on the surface. This minimum point corresponded to the weights that resulted in the smallest error value. Once these weights were found, no further changes were permitted and training was complete.

During the training process, the weights form a dynamical system. That is, the weights change as a function of time, and those changes can be represented as a set of coupled differential equations.

For the BAM that we have been discussing in the last few sections, a slightly different situation occurs. The weights are calculated in advance, and are not part of a dynamical system. On the other hand, an unknown pattern presented to the BAM may require several passes before the network stabilizes on a final result. In this situation, the x and y vectors change as a function of time, and they form a dynamical system.
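The distinction can be made concrete with a minimal sketch of BAM recall. It assumes the usual outer-product weights, with rows of the matrix indexing y units and columns indexing x units, and a threshold rule in which a unit keeps its previous value when its net input is zero. The exemplar pairs, `make_weights`, `threshold`, and `recall` are hypothetical names and values chosen for illustration; they are not the vectors used in the text.

```python
# Hypothetical bipolar exemplar pairs (x, y); NOT the vectors used in the text.
PAIRS = [
    ([1, -1, 1, -1], [1, -1, 1]),
    ([1, 1, -1, -1], [1, 1, -1]),
]

def make_weights(pairs):
    """Fixed weights: sum of outer products y x^t, computed once in advance."""
    m, n = len(pairs[0][1]), len(pairs[0][0])
    return [[sum(y[i] * x[j] for x, y in pairs) for j in range(n)]
            for i in range(m)]

def threshold(net, old):
    """Bipolar threshold; a unit keeps its old value when its net input is 0."""
    return [1 if s > 0 else -1 if s < 0 else o for s, o in zip(net, old)]

def recall(w, x, y, max_passes=100):
    """Propagate x -> y -> x repeatedly until neither vector changes."""
    m, n = len(w), len(w[0])
    for passes in range(1, max_passes + 1):
        net_y = [sum(w[i][j] * x[j] for j in range(n)) for i in range(m)]
        new_y = threshold(net_y, y)
        net_x = [sum(w[i][j] * new_y[i] for i in range(m)) for j in range(n)]
        new_x = threshold(net_x, x)
        if new_x == x and new_y == y:
            return x, y, passes          # stable: the state no longer evolves
        x, y = new_x, new_y
    raise RuntimeError("no stable state within max_passes")

w = make_weights(PAIRS)                  # the weights never change after this
x_final, y_final, passes = recall(w, [-1, -1, 1, -1], [1, 1, -1])
print(x_final, y_final, passes)
```

Here the weights are computed once, and only the state vectors evolve from pass to pass; the starting x differs from the first stored x in one component and is paired with the wrong y.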

In both of the dynamical systems described, we are interested in several aspects of system behavior: Does a solution exist? If it does, will the system converge to it in a finite time? What is the solution? Up to now we have been primarily concerned with the last of those three questions. We shall now look at the first two.

For the simple examples discussed so far, the question of the existence of a solution is academic. We found solutions; therefore, they must exist. Nevertheless, we may have been simply lucky in our choice of problems. It is still a valid question to ask whether a BAM, or for that matter, any other network, will always converge to a stable solution. The technique discussed here is fairly easy to apply to the BAM. Unfortunately, many network architectures do not have convergence proofs. The lack of such a proof does not mean that the network will not function properly, but there is no guarantee that it will converge for any given problem.

In the theory of dynamical systems, a theorem can be proved concerning the existence of stable states that uses the concept of a function called a Lyapunov function, or energy function. We shall present a nonrigorous version here, which is useful for our purposes. If a bounded function of the state variables of a dynamical system can be found, such that all state changes result in a decrease in the value of the function, then the system has a stable solution.

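This function is called a Lyapunov function, or energy function. In the case of the BAM, such a function exists. We shall call it the BAM energy function; it has the form

$$E(\mathbf{x}, \mathbf{y}) = -\mathbf{y}^t \mathbf{w} \mathbf{x} \tag{4.13}$$

or, in terms of components,

$$E = -\sum_{i}\sum_{j} y_i w_{ij} x_j$$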
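As a minimal sketch, the component form of the BAM energy is a double sum of the products y_i w_ij x_j, negated. The function name `bam_energy` and the 3-by-4 weight matrix below are assumptions made for illustration (rows index y units, columns index x units); the matrix is not the one from the text's examples.

```python
def bam_energy(w, x, y):
    """E = -sum_i sum_j y_i * w_ij * x_j (rows of w index y, columns index x)."""
    return -sum(y[i] * w[i][j] * x[j]
                for i in range(len(y)) for j in range(len(x)))

# Hypothetical weight matrix built from two illustrative exemplar pairs.
W = [[2, 0, 0, -2],
     [0, 2, -2, 0],
     [0, -2, 2, 0]]

e_stored = bam_energy(W, [1, -1, 1, -1], [1, -1, 1])     # a stored pair
e_mismatch = bam_energy(W, [-1, -1, 1, -1], [1, 1, -1])  # a mismatched start
print(e_stored, e_mismatch)  # → -12 8
```

The stored pair sits at a much lower energy than the mismatched starting state, which is the behavior the energy picture is meant to capture.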

We shall now state an important theorem about the BAM energy function that will help to answer our questions about the existence of stable solutions of the BAM processing equations. The theorem has three parts:

1. Any change in x or y during BAM processing results in a decrease in E.

2. E is bounded below by $E_{\min} = -\sum_{i,j} |w_{ij}|$.

3. When E changes, it must change by a finite amount.

Items 1 and 2 prove that E is a Lyapunov function, and that the dynamical system has a stable state. In particular, item 2 shows that E can decrease only to a certain value; it can't continue down to negative infinity, so that eventually the x and y vectors must stop changing. Item 3 prevents the possibility that changes in E might be small, resulting in an infinite amount of time spent before the minimum E is reached.
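For a network small enough to enumerate, the three parts of the theorem can be checked by brute force. The sketch below is an illustration, not a proof: it assumes a hypothetical 3-by-4 integer weight matrix and a threshold rule in which a unit keeps its old value when its net input is zero. It visits every bipolar state, applies one update of the y layer and one of the x layer, and records the smallest energy drop seen.

```python
from itertools import product

# Hypothetical weights for two illustrative exemplar pairs (not from the text).
W = [[2, 0, 0, -2],
     [0, 2, -2, 0],
     [0, -2, 2, 0]]
M, N = len(W), len(W[0])
E_MIN = -sum(abs(v) for row in W for v in row)   # lower bound from item 2

def energy(x, y):
    return -sum(y[i] * W[i][j] * x[j] for i in range(M) for j in range(N))

def threshold(net, old):
    """Bipolar threshold; a unit keeps its old value when its net input is 0."""
    return tuple(1 if s > 0 else -1 if s < 0 else o for s, o in zip(net, old))

def update_y(x, y):
    return threshold([sum(W[i][j] * x[j] for j in range(N)) for i in range(M)], y)

def update_x(x, y):
    return threshold([sum(W[i][j] * y[i] for i in range(M)) for j in range(N)], x)

smallest_drop = None
for x in product((-1, 1), repeat=N):
    for y in product((-1, 1), repeat=M):
        e0 = energy(x, y)
        assert e0 >= E_MIN                       # item 2: bounded below
        # Try one y-layer update and one x-layer update from this state.
        for new_x, new_y in ((x, update_y(x, y)), (update_x(x, y), y)):
            if (new_x, new_y) != (x, y):
                e1 = energy(new_x, new_y)
                assert e1 < e0                   # item 1: any change lowers E
                drop = e0 - e1
                if smallest_drop is None or drop < smallest_drop:
                    smallest_drop = drop

print(E_MIN, smallest_drop)
```

Because the weights are integers and the states are bipolar, every drop is at least some fixed positive amount, which is the sense of item 3 for this example.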

In essence, the weight matrix determines the contour of a surface, or landscape, with hills and valleys, much like the ones we have discussed in previous chapters. Figure 4.3 illustrates a cross-sectional view of such a surface. The analogy of the E function as an energy function results from an analysis of how the BAM operates. The initial state of the BAM is determined by the choice of the starting vectors, (x and y). As the BAM processes the data, x and y change, resulting in movement of the energy over the landscape, which is guaranteed by the BAM energy theorem to be downward.

Figure 4.3 This figure shows a cross-section of a BAM energy landscape in two dimensions. The particular topography results from the choice of exemplar vectors that go into making up the weight matrix. During processing, the BAM energy value will move from its starting point down the energy hill to the nearest minimum, while the BAM outputs move from state a to state b. Notice that the minima reached need not be the global, or lowest, minima on the landscape.

Initially, the changes in the calculated values of E(x, y) are large. As the x and y vectors reach their stable state, the value of E changes by smaller amounts, and eventually stops changing when the minimum point is reached. This situation corresponds to a physical system such as a ball rolling down a hill into a valley, but with enough friction that, by the time the ball reaches the bottom, it has no more energy and therefore it stops. Thus, the BAM resembles a dissipative dynamic system in which the E function corresponds to the energy of the physical system. Remember that the weight matrix determines the contour of this energy landscape; that is, it determines how many energy valleys there are, how far apart they are, how deep they are, and whether there are any unexpected valleys (i.e., spurious states).

We need to clarify one point. We have been illustrating these concepts using a two-dimensional cross-section of an energy landscape, and using the familiar term valley to refer to the locations of the minima. A more precise term would be basin. In fact, the literature on dynamical systems refers to these locations as basins of attraction.
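For a very small network, the basins can be mapped directly by starting from every possible state and recording where each one settles. The following is a sketch under stated assumptions: the weight matrix is a hypothetical 3-by-4 example (not the one in the text), a unit keeps its old value on zero net input, and `settle` is an illustrative helper name.

```python
from itertools import product

# Hypothetical weights built from the illustrative pairs
# x1 = (1,-1,1,-1), y1 = (1,-1,1) and x2 = (1,1,-1,-1), y2 = (1,1,-1).
W = [[2, 0, 0, -2],
     [0, 2, -2, 0],
     [0, -2, 2, 0]]
M, N = len(W), len(W[0])

def energy(x, y):
    return -sum(y[i] * W[i][j] * x[j] for i in range(M) for j in range(N))

def threshold(net, old):
    """Bipolar threshold; a unit keeps its old value when its net input is 0."""
    return tuple(1 if s > 0 else -1 if s < 0 else o for s, o in zip(net, old))

def settle(x, y):
    """Run x -> y -> x passes until the state stops changing."""
    while True:
        new_y = threshold([sum(W[i][j] * x[j] for j in range(N))
                           for i in range(M)], y)
        new_x = threshold([sum(W[i][j] * new_y[i] for i in range(M))
                           for j in range(N)], x)
        if (new_x, new_y) == (x, y):
            return x, y
        x, y = new_x, new_y

# Map each stable state (attractor) to the number of starts that reach it.
basins = {}
for x in product((-1, 1), repeat=N):
    for y in product((-1, 1), repeat=M):
        attractor = settle(x, y)
        basins[attractor] = basins.get(attractor, 0) + 1

for (x, y), size in sorted(basins.items(), key=lambda kv: energy(*kv[0])):
    print(energy(x, y), x, y, size)
```

With these hypothetical weights, the listing includes the two stored pairs and their complements at the lowest energy, along with shallower stable states that were never stored, which is what the term spurious state refers to.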

To solidify the concept of BAM energy, we return to the examples of the previous section. First, notice that according to part two of the BAM energy theorem, the minimum value of E is found by summing the negatives of all the magnitudes of the components of the matrix. A calculation of E for each of the two training vectors shows that both pairs sit at the bottom of basins having this same value of E. For our first pair of trial vectors, the energy of the system is E = -8. The first propagation results in a new y vector and a new energy value, E = -24. Propagation back to the x layer resulted in a new x vector, and the energy is now E = -64. At this point, no further passes through the system are necessary, since -64 is the lowest possible energy. According to the theorem, any further change in x or y would have to lower the energy; since the energy is already at its minimum, no such changes are possible.
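The same bookkeeping can be sketched on a smaller network. The exemplar pairs and starting vectors below are hypothetical stand-ins, not the vectors of the text's example, and the threshold rule (a unit keeps its old value on zero net input) is an assumption. The sketch records the energy after each layer update that changes the state.

```python
# Hypothetical exemplar pairs (x, y); NOT the vectors used in the text.
PAIRS = [
    ([1, -1, 1, -1], [1, -1, 1]),
    ([1, 1, -1, -1], [1, 1, -1]),
]
M, N = 3, 4
W = [[sum(y[i] * x[j] for x, y in PAIRS) for j in range(N)] for i in range(M)]
E_MIN = -sum(abs(v) for row in W for v in row)   # lowest possible energy

def energy(x, y):
    return -sum(y[i] * W[i][j] * x[j] for i in range(M) for j in range(N))

def threshold(net, old):
    """Bipolar threshold; a unit keeps its old value when its net input is 0."""
    return [1 if s > 0 else -1 if s < 0 else o for s, o in zip(net, old)]

# Start from a corrupted x paired with the wrong y.
x, y = [-1, -1, 1, -1], [1, 1, -1]
trace = [energy(x, y)]
while True:
    new_y = threshold([sum(W[i][j] * x[j] for j in range(N))
                       for i in range(M)], y)
    if new_y != y:
        trace.append(energy(x, new_y))           # after the x -> y propagation
    new_x = threshold([sum(W[i][j] * new_y[i] for i in range(M))
                       for j in range(N)], x)
    if new_x != x:
        trace.append(energy(new_x, new_y))       # after the y -> x propagation
    if new_x == x and new_y == y:
        break
    x, y = new_x, new_y

print(trace, E_MIN)  # → [8, -8, -12] -12
```

In this small case the energy falls at each propagation and stops at the bound given by the sum of the weight magnitudes, mirroring the sequence of values in the example above.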

Exercise 4.5: Perform the BAM energy calculation on the second example from

In document Neural Networks Algorithms, Applications,and Programming Techniques James A Freeman pdf (Page 149-152)