2.2 Hopfield model: an example of Content Addressable Memory
2.2.5 The energy function
The main feature of the Hopfield model is the convergence of the state-space flow algorithm to stable states. Any symmetric matrix $T$ with zero diagonal elements (i.e. $T_{ij} = T_{ji}$ and $T_{ii} = 0$) will produce such a flow. The proof of this property is based on the construction of an appropriate energy function that is monotonically decreasing and is a candidate Lyapunov function. If this holds, the Hopfield net is Lyapunov stable$^{15}$, a condition stronger than simple network stability.
Definition 2.2.6 (Lyapunov stability). Consider a discrete-time system
$$x(t+1) = f(x(t))$$
where $f : D \to \mathbb{R}^N$ is continuous on $D$, with $D \subset \mathbb{R}^N$ an open set containing the origin and $x(t) \in D$. Suppose $f$ has an equilibrium point $x^*$, i.e. $f(x^*) = x^*$. Then this equilibrium is said to be
• Lyapunov stable if, for every $\varepsilon > 0$, there exists a $\delta = \delta(\varepsilon) > 0$ such that
$$\|x(0) - x^*\| < \delta \implies \|x(t) - x^*\| < \varepsilon \quad \forall\, t \ge 0$$
• asymptotically Lyapunov stable if it is Lyapunov stable and there exists $\delta > 0$ such that
$$\|x(0) - x^*\| < \delta \implies \lim_{t \to \infty} \|x(t) - x^*\| = 0$$
We assume asynchronous updating. Then the state of an asynchronous Hopfield network can be characterized by an energy function, defined as follows
$$E = -\frac{1}{2}\sum_{i=1}^{N}\sum_{j=1}^{N} T_{ij}V_iV_j - \sum_{i=1}^{N} I_iV_i + \sum_{i=1}^{N} U_iV_i \tag{2.42}$$
We assume $U_i = 0$ and $I_i = 0$, thus $E$ simplifies to
$$E = -\frac{1}{2}\sum_{i}\sum_{j} T_{ij}V_iV_j \tag{2.43}$$
To simplify the notation, we also assume that the stored patterns form an orthogonal basis and are uncorrelated; the result we will present (Prop. 2.2.9) also holds in the general case.
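As a minimal sketch of how (2.43) can be evaluated, the following Python fragment builds a symmetric, zero-diagonal matrix $T$ from two orthogonal patterns and computes $E$. The storage rule $T_{ij} = \frac{1}{N}\sum_\mu \xi_i^\mu \xi_j^\mu$ for $i \ne j$ is assumed here; the function names `hebbian_weights` and `energy` and the two toy patterns with $N = 8$ are illustrative choices, not part of the model.

```python
# Sketch only: names and toy patterns are assumptions made for illustration.

def hebbian_weights(patterns):
    """T_ij = (1/N) * sum_mu xi_i^mu * xi_j^mu for i != j, and T_ii = 0."""
    n = len(patterns[0])
    T = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                T[i][j] = sum(xi[i] * xi[j] for xi in patterns) / n
    return T


def energy(T, V):
    """E = -1/2 * sum_i sum_j T_ij V_i V_j, i.e. Eq. (2.43) with U_i = I_i = 0."""
    n = len(V)
    return -0.5 * sum(T[i][j] * V[i] * V[j] for i in range(n) for j in range(n))


# Two mutually orthogonal patterns with N = 8 components (toy choice).
xi1 = [1, 1, 1, 1, 1, 1, 1, 1]
xi2 = [1, 1, 1, 1, -1, -1, -1, -1]
T = hebbian_weights([xi1, xi2])

print(energy(T, xi1))  # energy of the first stored pattern
print(energy(T, xi2))  # energy of the second stored pattern
```

With these two patterns both calls print `-3.0`.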
We recall the classical definition of Lyapunov function and introduce the definition of Lyapunov function in the abstract theory of dynamical systems, as reported in [1].
Definition 2.2.7 (Lyapunov function). A function $f(x)$, $x \in \mathbb{R}^N$, is said to be a Lyapunov function if there exists an equilibrium state $x^*$ such that the following three conditions are satisfied:
1. $f(x)$ is continuous with respect to all components of $x$.
2. $f(x)$ is positive definite, that is $f(x^*) = 0$ and $f(x) > 0$ for $x \ne x^*$.
3. The time derivative $\dot{f}(x)$ is negative semidefinite, i.e. the function does not increase in time.
If it is defined only on an open neighborhood of the equilibrium point, $f$ is said to be a local Lyapunov function ([23]).
Definition 2.2.8 (Lyapunov function-2). Let $(X, d)$ be a metric space. A function $h : X \to \mathbb{R}$ is said to be a Lyapunov function for the flow $\psi = \{\psi_t\}_{t \in \mathbb{R}}$ on $X$ if
$$h \circ \psi_t \le h, \quad \forall\, t \ge 0 \tag{2.44}$$
and is said to be a first integral if
$$h \circ \psi_t = h, \quad \forall\, t \in \mathbb{R} \tag{2.45}$$
Proposition 2.2.9. The energy function (2.43) is a Lyapunov function$^{16}$.
Proof. We check in turn the three conditions of the definition of a Lyapunov function (Def. 2.2.7) for the function (2.43):
1. $\frac{\partial}{\partial V_i} E(V) = -\sum_j T_{ij}V_j$, $\forall\, V_i$, implying$^{17}$ that $E(V)$ is continuous with respect to all components of $V$.
$^{16}$ In this proof we will follow [6], pp. 30-31, except for the second point, where we will prove that $E$ is also positive definite and not only bounded.
$^{17}$ If $f : A \to \mathbb{R}^N$ is differentiable at $c \in A$, then there exist strictly positive numbers $\delta$, $K$ such that
2. Equation (2.43) does not satisfy Condition (2) of the definition of a Lyapunov function. In general this condition is replaced by the boundedness of the function, which can be easily proved and which highlights the existence of stopping points for the dynamics of the network, at which the function attains its lower bound. This corresponds to the definition of a Lyapunov function in the abstract theory of dynamical systems, as given in Def. 2.2.8. As presented in [6], a rough bound$^{18}$ for $E$ can be found in this way:
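$$E \ge -\frac{1}{2}\sum_i\sum_j |T_{ij}V_iV_j| = -\frac{1}{2}\sum_i\sum_j |T_{ij}| \ge -\frac{1}{2}N^2\frac{p}{N} = -\frac{1}{2}pN \tag{2.46}$$
where in the last inequality we used that $|T_{ij}| \le \frac{1}{N}\sum_\beta |\xi_i^\beta \xi_j^\beta| = \frac{p}{N}$ and that the absolute values of all the components $V_i$ of the vector $V$ are equal to one.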
Statement. The bound given in Eq. (2.46) is never realized and so the inequality is strict.
Proof. By contradiction, suppose that for a pattern $V$ the inequality (2.46) holds with equality:
$$E(V) = -\frac{1}{2}\sum_i\sum_j T_{ij}V_iV_j = -\frac{1}{2}pN$$
We can bound each term $T_{ij}V_iV_j$ of this double sum by considering the best case that can occur. First, we have that
$$T_{ij} = \frac{1}{N}\sum_{\mu=1}^{p} \xi_i^\mu \xi_j^\mu \le \frac{p}{N}$$
because $\xi_i^\mu \xi_j^\mu \le 1$ for every $\mu$ and $i, j$. Equality is obtained when, for every $\mu$ and for $i, j$ fixed,
$$\xi_i^\mu \xi_j^\mu = 1 \implies \xi_i^\mu = 1 = \xi_j^\mu \ \text{ or } \ \xi_i^\mu = -1 = \xi_j^\mu$$
Now we consider $T_{ij}V_iV_j$ and we have
$$T_{ij}V_iV_j \le \frac{p}{N}$$
where we used $T_{ij} \le \frac{p}{N}$ and $V_iV_j \le 1$ for every $i, j$. From these considerations we have that, for $i, j$ fixed,
$$T_{ij}V_iV_j = \frac{p}{N} \iff T_{ij} = \frac{p}{N} \ \text{ and } \ V_iV_j = 1$$
Now we sum these quantities over $i$ and over $j$. To maximize this double sum, the conditions seen before have to hold for every $i$ and $j$:
(a) $V_i = 1 = V_j$ or $V_i = -1 = V_j$ for every $i, j$.
(b) $\xi_i^\mu = 1 = \xi_j^\mu$ or $\xi_i^\mu = -1 = \xi_j^\mu$ for every $\mu, i, j$.
and in this way we obtain
$$-\frac{1}{2}\sum_i\sum_j T_{ij}V_iV_j = -\frac{1}{2}pN$$
$^{18}$ The boundedness of $E$ is very important, because it tells us that $E$ cannot converge to $-\infty$, but is
• The first condition (a) implies that $V$ is equal to one of the two $N$-vectors $V_1 = (1, \dots, 1)^T$ or $V_2 = (-1, \dots, -1)^T$.
• The second condition (b) means that the only stored patterns we can have are $V_1$ and $V_2$. This is absurd for the model we are describing, where the number of memories is quite high.
We have thus obtained that (2.46) can be rewritten as
$$E > -\frac{1}{2}pN \tag{2.47}$$
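The strict bound (2.47) can be checked by brute force on a small network. The sketch below enumerates all $2^N$ states for a toy example with $N = 8$ and $p = 2$ orthogonal patterns; the storage rule with the $1/N$ factor and zero diagonal is assumed, and the helper names are hypothetical.

```python
from itertools import product

# Sketch only: helper names and toy patterns are assumptions for illustration.

def hebbian_weights(patterns):
    """T_ij = (1/N) * sum_mu xi_i^mu * xi_j^mu for i != j, and T_ii = 0."""
    n = len(patterns[0])
    return [[0.0 if i == j else sum(xi[i] * xi[j] for xi in patterns) / n
             for j in range(n)] for i in range(n)]


def energy(T, V):
    """E = -1/2 * sum_i sum_j T_ij V_i V_j, as in Eq. (2.43)."""
    n = len(V)
    return -0.5 * sum(T[i][j] * V[i] * V[j] for i in range(n) for j in range(n))


patterns = [[1, 1, 1, 1, 1, 1, 1, 1],
            [1, 1, 1, 1, -1, -1, -1, -1]]
N, p = len(patterns[0]), len(patterns)
T = hebbian_weights(patterns)

rough_bound = -0.5 * p * N  # right-hand side of (2.46)
lowest = min(energy(T, V) for V in product((-1, 1), repeat=N))

print(lowest, rough_bound, lowest > rough_bound)
```

In this example the lowest energy over all states stays well above $-\frac{1}{2}pN$, which suggests that the rough bound is far from tight.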
We can now derive a better lower bound by identifying the minimum points of the energy function.
First, we compute the time derivative$^{19}$, which in this case corresponds to the change of energy $\Delta E$ due to the change $\Delta V_i$ in the state of the Hopfield network,
$$\Delta E = -\sum_i\sum_j T_{ij}V_j\,\Delta V_i = -\sum_i \Delta V_i \sum_j T_{ij}V_j = -\Delta V^T (TV) \tag{2.48}$$
where
$$\Delta E = E(t+1) - E(t), \qquad \Delta V_i = V_i(t+1) - V_i(t)$$
Then we identify the points at which this derivative vanishes: $\Delta E = 0$ if and only if one of the following conditions is satisfied:
(a) $\Delta V_i = 0 \ \forall\, i$: The vectors $V$ satisfying this property correspond to fixed points of the net, i.e. points whose value does not change under further updating, $V_i(t+1) = V_i(t)$, and thus to the stored patterns. Using (2.21), the value of $E$ at such points is equal to
$$E(\xi^\mu) = -\frac{1}{2}\sum_i\sum_j T_{ij}\xi_j^\mu \xi_i^\mu = -\frac{1}{2}\sum_i \left[\frac{N-p}{N}(\xi_i^\mu)^2\right] = -\frac{N(N-p)}{2N} = -\frac{N-p}{2} < 0 \tag{2.49}$$
(b) $\sum_j T_{ij}V_j = 0 \ \forall\, i$: This condition implies that $V$ is in the kernel of the connection matrix $T$,
$$\ker(T) = \Big\{V \ \Big|\ (TV)_i = \sum_j T_{ij}V_j = 0, \ \forall\, i\Big\}$$
and evaluating $E$ at these patterns we obtain
$$E(V^{\ker(T)}) = -\frac{1}{2}\sum_i\sum_j T_{ij}V_j^{\ker(T)}V_i^{\ker(T)} = 0 \tag{2.50}$$
(c) $\sum_j\sum_i T_{ij}V_j\,\Delta V_i = 0$: In this case we are requiring that $TV \perp \Delta V$, and thus we obtain a null value for $E$ also for these points.
Statement. The first vectors analyzed represent minimum points for E and thus the only minimum points for E are represented by the stored patterns.
Proof. We consider the last two types of points at which the derivative of $E$ vanishes.
• The vectors of the second type are in $\ker(T)$, so they satisfy $\sum_j T_{ij}V_j = 0$. Since we have assumed that the stored patterns form an orthogonal basis, and we have seen previously in (2.24) how to modify the storage rule in order to memorize a new vector as a fundamental memory, a vector of this type is exactly a new stored pattern. Therefore these are also minimum points for the energy function$^{20}$.
• In the last situation we have $TV \perp \Delta V$ with $\Delta V \ne 0$ and $TV \ne 0$; otherwise we are in one of the previous cases. For vectors of this type we have $E(V) = 0$, which differs from the value attained at the stored patterns, and they are not fixed points because $\Delta V \ne 0$. So they are only critical points and not minimum points.
Thus, we have proved that the only minimum points$^{21}$ for $E$ are the stored patterns $\xi^\mu$.
Remark. We note that the minimum points of E are global minimum points and thus we have obtained that the value of E in all the minimum points is the same.
Therefore, we can improve the lower bound of E:
$$E(V) > E(\xi^\mu) = -\frac{1}{2}(N-p), \quad \forall\, V \ne \xi^\mu. \tag{2.51}$$
From Eq. (2.51), we can immediately see that the energy function (2.43) is not the correct candidate to be a Lyapunov function: by (2.49), its value $E(\xi^\mu)$ at a fundamental memory $\xi^\mu$ is not equal to zero. In order to obtain the Lyapunov characterization we have to replace (2.43) with
$$E = -\frac{1}{2}\sum_i\sum_j T_{ij}V_iV_j + \frac{N-p}{2} \tag{2.52}$$
For this function we have $E(\xi^\mu) = 0$, and the property of positive definiteness is achieved:
$$E(V) > 0, \quad \forall\, V \ne \xi^\mu. \tag{2.53}$$
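As an illustration of the shift in (2.52), the following sketch evaluates the shifted energy on every state of a toy network with $N = 8$ and $p = 2$ orthogonal patterns and collects the states where it vanishes. The storage rule and the helper names are assumptions. In this toy example the sign-reversed patterns $-\xi^\mu$ reach the same value as $\xi^\mu$, since (2.43) is even in $V$, so the strict inequality is checked only away from $\pm\xi^\mu$.

```python
from itertools import product

# Sketch only: helper names and toy patterns are assumptions for illustration.

def hebbian_weights(patterns):
    """T_ij = (1/N) * sum_mu xi_i^mu * xi_j^mu for i != j, and T_ii = 0."""
    n = len(patterns[0])
    return [[0.0 if i == j else sum(xi[i] * xi[j] for xi in patterns) / n
             for j in range(n)] for i in range(n)]


def shifted_energy(T, V, p):
    """Eq. (2.52): the energy (2.43) plus the constant (N - p) / 2."""
    n = len(V)
    raw = -0.5 * sum(T[i][j] * V[i] * V[j] for i in range(n) for j in range(n))
    return raw + (n - p) / 2


patterns = [[1, 1, 1, 1, 1, 1, 1, 1],
            [1, 1, 1, 1, -1, -1, -1, -1]]
N, p = len(patterns[0]), len(patterns)
T = hebbian_weights(patterns)

# Shifted energy of every one of the 2^N states.
values = {V: shifted_energy(T, V, p) for V in product((-1, 1), repeat=N)}
zeros = sorted(V for V, e in values.items() if abs(e) < 1e-12)

print(min(values.values()))  # smallest value of the shifted energy
for V in zeros:              # states at which the shifted energy vanishes
    print(V)
```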
3. Condition (3) of the definition of a Lyapunov function is satisfied if we can prove that the energy does not increase when we update one or more components, i.e. that any change in $V_i$ results in a decrease of $E$. As seen before, we have that
$$\Delta E = -\sum_{ij} T_{ij}V_i(t)\,\big(V_j(t+1) - V_j(t)\big) = -\sum_j\sum_i T_{ji}V_i(t)\,\big(V_j(t+1) - V_j(t)\big) \tag{2.54}$$
$^{20}$ In the general case, this deduction is not correct. For these patterns the network stability condition is satisfied if and only if $V_i = \operatorname{sgn}((TV)_i) \ \forall\, i$, and under our assumptions this is true. Furthermore they are also minimum points, because under the updating algorithm they change in this way: $V_i(t+1) = \operatorname{sgn}((TV)_i) = V_i(t) \ \forall\, i$. Thus in this case we also have to take into account these points with null energy and null eigenvalue, which in section 2.2.6 we will call spurious states.
$^{21}$ In the general case the points with null eigenvalue are also minimum points for $E$, but this is not a problem. For the deduction of the better bound (2.51) the presence of these points does not affect the result, since the value of $E$ at these points is larger than that at the stored patterns, where $E(\xi^\mu) = -\frac{N-1}{2} - \frac{1}{2N}\sum_{\beta \ne \mu}\Big[\sum_i \xi_i^\beta \xi_i^\mu\Big]^2 < 0$.
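where $E(t+1)$ is the energy associated with the updated pattern $V(t+1)$. By using rule (2.6) we have
$$\begin{cases} \sum_i T_{ji}V_i(t) \ge 0 \implies V_j(t+1) - V_j(t) = 1 - V_j(t) \ge 0 \\ \sum_i T_{ji}V_i(t) < 0 \implies V_j(t+1) - V_j(t) = -1 - V_j(t) \le 0 \end{cases} \tag{2.55}$$
In both cases the two factors of each term in (2.54) have the same sign, or the second factor vanishes. From this we can conclude that $E(t+1) - E(t) \le 0$, with equality when we are considering a stored pattern $\xi^\mu$, which is a fixed point.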
Therefore we have proved that the energy$^{22}$ (2.52) is a Lyapunov function characterized by a matrix of weights with zero diagonal.
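The monotone decrease of the energy can be observed numerically. The sketch below performs one asynchronous sweep and records the energy after each single-neuron update, together with the value $-\Delta V_k \sum_i T_{ki}V_i$ predicted for that update when $T$ is symmetric with zero diagonal. The update rule used here, $V_k = +1$ if the local field is non-negative and $-1$ otherwise, is an assumed form of rule (2.6); the helper names, the toy patterns and the starting state are hypothetical.

```python
# Sketch only: update rule, names and starting state are assumptions.

def hebbian_weights(patterns):
    """T_ij = (1/N) * sum_mu xi_i^mu * xi_j^mu for i != j, and T_ii = 0."""
    n = len(patterns[0])
    return [[0.0 if i == j else sum(xi[i] * xi[j] for xi in patterns) / n
             for j in range(n)] for i in range(n)]


def energy(T, V):
    """E = -1/2 * sum_i sum_j T_ij V_i V_j, as in Eq. (2.43)."""
    n = len(V)
    return -0.5 * sum(T[i][j] * V[i] * V[j] for i in range(n) for j in range(n))


def local_field(T, V, k):
    """Input to neuron k: sum_i T_ki V_i."""
    return sum(T[k][i] * V[i] for i in range(len(V)))


def async_sweep(T, V):
    """Update the neurons one at a time, in index order.

    Returns the final state, the energy before the sweep and after each
    update, and the predicted energy change of each update.
    """
    V = list(V)
    trace = [energy(T, V)]
    predicted = []
    for k in range(len(V)):
        h = local_field(T, V, k)
        new_value = 1 if h >= 0 else -1           # assumed form of rule (2.6)
        predicted.append(-(new_value - V[k]) * h)  # -Delta V_k * field
        V[k] = new_value
        trace.append(energy(T, V))
    return V, trace, predicted


patterns = [[1, 1, 1, 1, 1, 1, 1, 1],
            [1, 1, 1, 1, -1, -1, -1, -1]]
T = hebbian_weights(patterns)
start = [1, -1, 1, -1, -1, 1, -1, 1]  # arbitrary starting state

final_state, trace, predicted = async_sweep(T, start)
print(trace)        # energies along the sweep, never increasing
print(final_state)
```

Comparing consecutive entries of `trace` with `predicted` gives a direct check of the single-update form of $\Delta E$.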
We now recall the Lyapunov stability criterion:
Theorem 2.2.10 (Lyapunov stability criterion). The equilibrium state$^{23}$ $x^*$ is Lyapunov stable if in a neighborhood of $x^*$ there exists a positive definite function $L(x)$, whose time derivative is negative semidefinite in that neighborhood and such that $L(x^*) = 0$. Thus $L$ is a Lyapunov function. If the time derivative of $L$ is negative definite, $x^*$ is said to be asymptotically Lyapunov stable.
We have seen$^{24}$ that in the Hopfield net the equilibrium points are the stored patterns, and thus from the Lyapunov theorem they are asymptotically Lyapunov stable. For this reason, starting from a probe pattern, the iterations of the updating algorithm converge to a Lyapunov stable state, which does not change further with time.
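The retrieval of a stored pattern from a probe can be sketched as follows. The function `recall` (a hypothetical name) repeats asynchronous sweeps until no component changes; the sign rule with ties resolved to $+1$, the index-order update schedule and the toy patterns are assumptions made for this example.

```python
# Sketch only: update rule, schedule, names and toy patterns are assumptions.

def hebbian_weights(patterns):
    """T_ij = (1/N) * sum_mu xi_i^mu * xi_j^mu for i != j, and T_ii = 0."""
    n = len(patterns[0])
    return [[0.0 if i == j else sum(xi[i] * xi[j] for xi in patterns) / n
             for j in range(n)] for i in range(n)]


def recall(T, probe, max_sweeps=100):
    """Run asynchronous sweeps in index order until a fixed point is reached."""
    V = list(probe)
    for _ in range(max_sweeps):
        changed = False
        for k in range(len(V)):
            h = sum(T[k][i] * V[i] for i in range(len(V)))
            new_value = 1 if h >= 0 else -1  # assumed form of the update rule
            if new_value != V[k]:
                V[k] = new_value
                changed = True
        if not changed:  # no component changed: V is a fixed point
            break
    return V


xi1 = [1, 1, 1, 1, 1, 1, 1, 1]
xi2 = [1, 1, 1, 1, -1, -1, -1, -1]
T = hebbian_weights([xi1, xi2])

probe = [-1, 1, 1, 1, -1, -1, -1, -1]  # xi2 with its first component flipped
print(recall(T, probe))
```

For this probe the iteration stops at `xi2`, the stored pattern closest to it under the chosen update order.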