Artificial neural networks –
Outline
Introduction
Hebbian learning
Generalised Hebbian learning algorithm
Competitive learning
Self-organising computational map: Kohonen network
The main property of a neural network is an ability to learn from its environment, and to improve its performance through learning. So far we have considered supervised or active learning: learning with an external “teacher” or a supervisor who presents a training set to the network. But another type of learning also exists: unsupervised learning.
In contrast to supervised learning, unsupervised or self-organised learning does not require an external teacher. During the training session, the neural network receives a number of different input patterns, discovers significant features in these patterns and learns how to classify input data into appropriate categories. Unsupervised learning tends to follow the neuro-biological organisation of the brain.
Unsupervised learning algorithms aim to learn rapidly and can be used in real-time.
In 1949, Donald Hebb proposed one of the key ideas in biological learning, commonly known as Hebb’s Law. Hebb’s Law states that if neuron $i$ is near enough to excite neuron $j$ and repeatedly participates in its activation, the synaptic connection between these two neurons is strengthened and neuron $j$ becomes more sensitive to stimuli from neuron $i$.
Hebb’s Law can be represented in the form of two rules:
1. If two neurons on either side of a connection are activated synchronously, then the weight of that connection is increased.
2. If two neurons on either side of a connection are activated asynchronously, then the weight of that connection is decreased.
Hebb’s Law provides the basis for learning without a teacher. Learning here is a local phenomenon occurring without feedback from the environment.
[Figure: Hebbian learning in a neural network, showing neurons i and j between the input signals and the output signals.]
Using Hebb’s Law we can express the adjustment applied to the weight $w_{ij}$ at iteration $p$ in the following form:

$$\Delta w_{ij}(p) = F[\,y_j(p),\, x_i(p)\,]$$

As a special case, we can represent Hebb’s Law as follows:

$$\Delta w_{ij}(p) = \alpha\, y_j(p)\, x_i(p)$$

where $\alpha$ is the learning rate parameter. This equation is referred to as the activity product rule.
Hebbian learning implies that weights can only increase. To resolve this problem, we might impose a limit on the growth of synaptic weights. It can be done by introducing a non-linear forgetting factor into Hebb’s Law:

$$\Delta w_{ij}(p) = \alpha\, y_j(p)\, x_i(p) - \varphi\, y_j(p)\, w_{ij}(p)$$

where $\varphi$ is the forgetting factor.
The forgetting factor usually falls in the interval between 0 and 1, typically between 0.01 and 0.1, to allow only a little “forgetting” while limiting the weight growth.
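The effect of the forgetting factor can be checked numerically. The sketch below is a minimal illustration, assuming a single connection whose input and output are both held at 1 on every iteration; the function name `run_hebb` and the values α = 0.1 and φ = 0.02 are illustrative assumptions, not part of the lecture.

```python
def run_hebb(alpha, phi, steps, x=1.0, y=1.0, w=0.0):
    """Iterate dw = alpha*y*x - phi*y*w for a single weight.

    x and y are held constant here (an assumption made purely for
    illustration); phi = 0 gives the plain activity product rule.
    """
    for _ in range(steps):
        w += alpha * y * x - phi * y * w
    return w


w_plain = run_hebb(alpha=0.1, phi=0.0, steps=1000)    # keeps growing
w_forget = run_hebb(alpha=0.1, phi=0.02, steps=1000)  # settles near alpha/phi
print(w_plain, w_forget)
```

With these values the plain rule grows linearly with the number of iterations, whereas the rule with forgetting approaches α/φ = 5, the weight at which the two terms cancel.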
Hebbian learning algorithm

Step 1: Initialisation.
Set initial synaptic weights and thresholds to small random values, say in an interval [0, 1].

Step 2: Activation.
Compute the neuron output at iteration $p$

$$y_j(p) = \sum_{i=1}^{n} x_i(p)\, w_{ij}(p) - \theta_j$$

where $n$ is the number of neuron inputs, and $\theta_j$ is the threshold value of neuron $j$.
Step 3: Learning.
Update the weights in the network:

$$w_{ij}(p+1) = w_{ij}(p) + \Delta w_{ij}(p)$$

where $\Delta w_{ij}(p)$ is the weight correction at iteration $p$. The weight correction is determined by the generalised activity product rule:

$$\Delta w_{ij}(p) = \varphi\, y_j(p)\,[\lambda\, x_i(p) - w_{ij}(p)]$$

with $\lambda = \alpha / \varphi$, which follows from factoring $\varphi\, y_j(p)$ out of the rule with the forgetting factor.

Step 4: Iteration.
Increase iteration $p$ by one, go back to Step 2.
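Steps 2 and 3 can be sketched for one input vector as follows. This is a minimal sketch under stated assumptions: the function name `hebbian_step`, the storage layout `W[i, j]` (input i to neuron j), the default values of `alpha` and `phi`, and the 0/1 hard-limit activation applied to the Step 2 sum are all choices made for illustration.

```python
import numpy as np


def hebbian_step(W, theta, x, alpha=0.1, phi=0.02):
    """Apply Steps 2 and 3 once for a single input vector x.

    W[i, j] holds the weight from input i to neuron j, theta[j] the
    threshold of neuron j. The 0/1 hard-limit activation is an assumption.
    """
    net = x @ W - theta                 # Step 2: sum_i x_i * w_ij - theta_j
    y = (net >= 0).astype(float)        # assumed hard-limit activation
    lam = alpha / phi
    # Step 3: delta_w_ij = phi * y_j * (lam * x_i - w_ij)
    delta = phi * y[np.newaxis, :] * (lam * x[:, np.newaxis] - W)
    return W + delta, y


W0 = np.eye(2)
theta0 = np.array([0.5, 0.5])
x0 = np.array([1.0, 0.0])
W1, y1 = hebbian_step(W0, theta0, x0)
print(y1)
print(W1)
```

Only the weights feeding an active neuron change, because the correction is multiplied by $y_j(p)$. Step 4 amounts to calling the function repeatedly over the training vectors.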
To illustrate Hebbian learning, consider a fully connected feedforward network with a single layer of five computation neurons. Each neuron is represented by a McCulloch and Pitts model with the sign activation function. The network is trained on the following set of input vectors:
[Figure: Hebbian learning example. Network diagrams with input layer $x_1$ to $x_5$ and output layer $y_1$ to $y_5$.]

[Figure: Weight matrices between the input layer and the output layer, before and after training.]

$$W_{\text{initial}} = \begin{bmatrix} 1 & 0 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 & 1 \end{bmatrix} \qquad W_{\text{trained}} = \begin{bmatrix} 0 & 0 & 0 & 0 & 0 \\ 0 & 2.0204 & 0 & 0 & 2.0204 \\ 0 & 0 & 1.0200 & 0 & 0 \\ 0 & 0 & 0 & 0.9996 & 0 \\ 0 & 2.0204 & 0 & 0 & 2.0204 \end{bmatrix}$$
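When the probe $X = [1\ 0\ 0\ 0\ 1]^T$ is presented to the network, we obtain:

$$Y = \mathrm{sign}\left\{ \begin{bmatrix} 0 & 0 & 0 & 0 & 0 \\ 0 & 2.0204 & 0 & 0 & 2.0204 \\ 0 & 0 & 1.0200 & 0 & 0 \\ 0 & 0 & 0 & 0.9996 & 0 \\ 0 & 2.0204 & 0 & 0 & 2.0204 \end{bmatrix} \begin{bmatrix} 1 \\ 0 \\ 0 \\ 0 \\ 1 \end{bmatrix} - \begin{bmatrix} 0.4940 \\ 0.2661 \\ 0.0907 \\ 0.9478 \\ 0.0737 \end{bmatrix} \right\} = \begin{bmatrix} 0 \\ 1 \\ 0 \\ 0 \\ 1 \end{bmatrix}$$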
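The probe calculation can be reproduced in a few lines. This is a sketch rather than the lecture's own code: it uses the weight matrix, thresholds and probe from the example, and assumes that a non-negative net input maps to 1 and a negative one to 0, which is the form in which the outputs are reported.

```python
import numpy as np

# Trained weights, thresholds and probe as they appear in the example.
W = np.array([
    [0.0, 0.0,    0.0,    0.0,    0.0],
    [0.0, 2.0204, 0.0,    0.0,    2.0204],
    [0.0, 0.0,    1.0200, 0.0,    0.0],
    [0.0, 0.0,    0.0,    0.9996, 0.0],
    [0.0, 2.0204, 0.0,    0.0,    2.0204],
])
theta = np.array([0.4940, 0.2661, 0.0907, 0.9478, 0.0737])
X = np.array([1.0, 0.0, 0.0, 0.0, 1.0])

net = W @ X - theta
# Assumption: outputs are reported as 0/1, so a non-negative net input maps to 1.
Y = (net >= 0).astype(int)
print(net)
print(Y)
```

Only rows 2 and 5 of the weight matrix have a non-zero entry in column 5, so only neurons 2 and 5 receive a net input above their thresholds.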
In competitive learning, neurons compete among themselves to be activated.
While in Hebbian learning, several output neurons can be activated simultaneously, in competitive learning, only a single output neuron is active at any time.
The output neuron that wins the “competition” is called the winner-takes-all neuron.
The basic idea of competitive learning was introduced in the early 1970s.
In the late 1980s, Teuvo Kohonen introduced a special class of artificial neural networks called self-organising feature maps. These maps are based on competitive learning.
Our brain is dominated by the cerebral cortex, a very complex structure of billions of neurons and hundreds of billions of synapses. The cortex includes areas that are responsible for different human activities (motor, visual, auditory, somatosensory, etc.), and associated with different sensory inputs. We can say that each sensory input is mapped into a corresponding area of the cerebral cortex. The cortex is a self-organising computational map in the human brain.
[Figure: Feature-mapping Kohonen model, showing an input layer connected to a Kohonen layer.]
The Kohonen model provides a topological mapping. It places a fixed number of input patterns from the input layer into a higher-dimensional output or Kohonen layer.
Training in the Kohonen network begins with the winner’s neighbourhood of a fairly large size. Then, as training proceeds, the neighbourhood size gradually decreases.
[Figure: Architecture of the Kohonen Network. Input signals $x_1$, $x_2$ enter the input layer; the output layer produces the output signals $y_1$, $y_2$.]
The lateral connections are used to create a competition between neurons. The neuron with the largest activation level among all neurons in the output layer becomes the winner. This neuron is the only neuron that produces an output signal. The activity of all other neurons is suppressed in the competition.
The lateral feedback connections produce excitatory or inhibitory effects, depending on the distance from the winning neuron. This is achieved by the use of a Mexican hat function which describes synaptic weights between neurons in the Kohonen layer.
[Figure: The Mexican hat function of lateral connection. Connection strength plotted against distance: an excitatory effect around the centre, flanked by an inhibitory effect on each side.]
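The shape of such a lateral-connection profile can be sketched in code. The lecture gives no formula, so the function below uses a Ricker-style curve as an assumed stand-in; the name `mexican_hat` and the width parameter `sigma` are hypothetical.

```python
import numpy as np


def mexican_hat(distance, sigma=1.0):
    """Ricker-style 'Mexican hat' profile (an assumed stand-in, not the lecture's formula).

    Positive (excitatory) for |distance| < sigma, negative (inhibitory)
    beyond it, and decaying towards zero at large distances.
    """
    r2 = (np.asarray(distance, dtype=float) / sigma) ** 2
    return (1.0 - r2) * np.exp(-r2 / 2.0)


for d in (0.0, 0.5, 1.0, 2.0, 4.0):
    print(d, float(mexican_hat(d)))
```

Under this assumption, neurons close to the winner receive positive connection strengths, neurons somewhat further away receive negative ones, and distant neurons are almost unaffected.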
In the Kohonen network, a neuron learns by shifting its weights from inactive connections to active ones. Only the winning neuron and its neighbourhood are allowed to learn. If a neuron does not respond to a given input pattern, then learning cannot occur in that particular neuron.
The competitive learning rule defines the change $\Delta w_{ij}$ applied to synaptic weight $w_{ij}$ as

$$\Delta w_{ij} = \begin{cases} \alpha\,(x_i - w_{ij}), & \text{if neuron } j \text{ wins the competition} \\ 0, & \text{if neuron } j \text{ loses the competition} \end{cases}$$

where $x_i$ is the input signal and $\alpha$ is the learning rate parameter.
The overall effect of the competitive learning rule resides in moving the synaptic weight vector $W_j$ of the winning neuron $j$ towards the input pattern $X$. The matching criterion is equivalent to the minimum Euclidean distance between vectors.
The Euclidean distance between a pair of $n$-by-1 vectors $X$ and $W_j$ is defined by

$$d = \lVert X - W_j \rVert = \left[ \sum_{i=1}^{n} (x_i - w_{ij})^2 \right]^{1/2}$$

where $x_i$ and $w_{ij}$ are the $i$th elements of the vectors $X$ and $W_j$, respectively.
To identify the winning neuron, $j_X$, that best matches the input vector $X$, we may apply the following condition:

$$\lVert X - W_{j_X} \rVert = \min_j \lVert X - W_j \rVert, \qquad j = 1, 2, \ldots, m$$

where $m$ is the number of neurons in the Kohonen layer.
Suppose, for instance, that the 2-dimensional input vector $X$ is presented to the three-neuron Kohonen network,

$$X = \begin{bmatrix} 0.52 \\ 0.12 \end{bmatrix}$$

The initial weight vectors, $W_j$, are given by

$$W_1 = \begin{bmatrix} 0.27 \\ 0.81 \end{bmatrix} \qquad W_2 = \begin{bmatrix} 0.42 \\ 0.70 \end{bmatrix} \qquad W_3 = \begin{bmatrix} 0.43 \\ 0.21 \end{bmatrix}$$
We find the winning (best-matching) neuron $j_X$ using the minimum-distance Euclidean criterion:

$$d_1 = \sqrt{(0.52 - 0.27)^2 + (0.12 - 0.81)^2} = 0.73$$
$$d_2 = \sqrt{(0.52 - 0.42)^2 + (0.12 - 0.70)^2} = 0.59$$
$$d_3 = \sqrt{(0.52 - 0.43)^2 + (0.12 - 0.21)^2} = 0.13$$
Neuron 3 is the winner and its weight vector $W_3$ is updated according to the competitive learning rule.
The updated weight vector $W_3$ at iteration $(p + 1)$ is determined as:

$$W_3(p+1) = W_3(p) + \Delta W_3(p) = \begin{bmatrix} 0.43 \\ 0.21 \end{bmatrix} + \begin{bmatrix} 0.01 \\ -0.01 \end{bmatrix} = \begin{bmatrix} 0.44 \\ 0.20 \end{bmatrix}$$

The weight vector $W_3$ of the winning neuron 3 becomes closer to the input vector $X$ with each iteration.
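The arithmetic of this example can be checked with a short sketch. It uses $W_3 = [0.43, 0.21]$ and takes the learning rate as α = 0.1; the input vector and the other two weight vectors are taken as $X = [0.52, 0.12]$, $W_1 = [0.27, 0.81]$ and $W_2 = [0.42, 0.70]$, which should be treated as assumed values if they differ from your copy of the example.

```python
import numpy as np

alpha = 0.1
X = np.array([0.52, 0.12])            # input vector (assumed example values)
W = np.array([[0.27, 0.81],           # W1
              [0.42, 0.70],           # W2
              [0.43, 0.21]])          # W3

d = np.linalg.norm(X - W, axis=1)     # Euclidean distance to each W_j
winner = int(np.argmin(d))            # 0-based index of the best match
W[winner] += alpha * (X - W[winner])  # competitive learning rule, winner only

print(np.round(d, 2))
print(winner + 1, np.round(W[winner], 2))
```

The unrounded correction is 0.009 in magnitude for each component, which is shown as 0.01 when rounded to two decimal places.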
Step 1: Initialisation.
Set initial synaptic weights to small random values, say in an interval [0, 1], and assign a small positive value to the learning rate parameter $\alpha$.

Step 2: Activation and Similarity Matching.
Activate the Kohonen network by applying the input vector $X$, and find the winner-takes-all (best matching) neuron $j_X$ at iteration $p$, using the minimum-distance Euclidean criterion

$$j_X(p) = \min_j \lVert X - W_j(p) \rVert = \left\{ \sum_{i=1}^{n} [x_i - w_{ij}(p)]^2 \right\}^{1/2}, \qquad j = 1, 2, \ldots, m$$

where $n$ is the number of neurons in the input layer, and $m$ is the number of neurons in the Kohonen layer.
Step 3: Learning.
Update the synaptic weights

$$w_{ij}(p+1) = w_{ij}(p) + \Delta w_{ij}(p)$$

where $\Delta w_{ij}(p)$ is the weight correction at iteration $p$. The weight correction is determined by the competitive learning rule:

$$\Delta w_{ij}(p) = \begin{cases} \alpha\,[x_i - w_{ij}(p)], & j \in \Lambda_j(p) \\ 0, & j \notin \Lambda_j(p) \end{cases}$$

where $\alpha$ is the learning rate parameter, and $\Lambda_j(p)$ is the neighbourhood function centred around the winner-takes-all neuron $j_X$ at iteration $p$.
Step 4: Iteration.
Increase iteration $p$ by one, go back to Step 2 and continue until the minimum-distance Euclidean criterion is satisfied, or no noticeable changes occur in the feature map.
To illustrate competitive learning, consider the Kohonen network with 100 neurons arranged in the form of a two-dimensional lattice with 10 rows and 10 columns. The network is required to classify two-dimensional input vectors: each neuron in the network should respond only to the input vectors occurring in its region.
The network is trained with 1000 two-dimensional input vectors generated randomly in a square region in the interval between –1 and +1. The learning rate parameter $\alpha$ is equal to 0.1.
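The training procedure for a lattice of this kind might be sketched as below. This is a minimal sketch under stated assumptions, not the lecture's implementation: the function name `train_som`, the square neighbourhood, its linear shrink schedule (`radius0` down to 0), the random choice of one training vector per iteration, and the fixed iteration count used as a stopping rule are all illustrative choices.

```python
import numpy as np


def train_som(data, rows=10, cols=10, alpha=0.1, radius0=5.0, iterations=1000, seed=0):
    """Sketch of Steps 1-4 of competitive learning on a rows x cols lattice.

    The square neighbourhood and its linear shrink schedule are assumptions;
    the lecture only states that the neighbourhood gradually decreases.
    """
    rng = np.random.default_rng(seed)
    n = data.shape[1]
    W = rng.uniform(0.0, 1.0, size=(rows * cols, n))           # Step 1
    # Lattice coordinates of every neuron, used to measure neighbourhoods.
    grid = np.array([(r, c) for r in range(rows) for c in range(cols)])
    for p in range(iterations):
        x = data[rng.integers(len(data))]
        winner = np.argmin(np.linalg.norm(x - W, axis=1))      # Step 2
        radius = radius0 * (1.0 - p / iterations)
        # Neurons within `radius` lattice steps (Chebyshev distance) of the winner.
        near = np.max(np.abs(grid - grid[winner]), axis=1) <= radius
        W[near] += alpha * (x - W[near])                       # Step 3
    return W


rng = np.random.default_rng(1)
samples = rng.uniform(-1.0, 1.0, size=(1000, 2))
weights = train_som(samples)
print(weights.shape)
```

Each row of `weights` is the weight vector of one lattice neuron, so plotting the second column against the first gives the kind of weight-space picture used to follow the map as it unfolds.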
[Figure: Competitive learning in the Kohonen network. Four panels plot W(2,j) against W(1,j), with both axes running from -1 to 1: initial random weights, and the network after 100, 1000 and 10,000 iterations.]