A consideration on the labels of Self-Organizing Map after refresh learning


Masahiro MITIHATA, Tsutomu MIYOSHI, Hiroshi MASUYAMA

Department of Intelligence and Knowledge Engineering, Faculty of Engineering,

Tottori University

Mailing address : Tottori-shi Koyama-chou Minami 4-101, Tottori, Japan

tel : +81-857-31-5221

fax : +81-857-31-0879

e-mail : [email protected]

Key Words : Evaluation, Learning, Neural Network

Abstract - Kohonen's Self-Organizing Map (SOM) is a kind of neural network that learns the features of multi-dimensional data without supervision. Although it is difficult to apply SOM to data whose features change over time, the authors have proposed "automatic refresh learning", a method that relearns from input data whose features have changed from the learning data. However, automatic refresh learning has the drawback that the users have to put labels on the map every time after refresh learning. In this paper, we propose "semi-automatic labeling" on the SOM feature map, which detects the label positions and territories of the new feature map from the labels on the former map.

1. Introduction

In recent years, methods that automatically extract rules from sets of input-output data examples have been studied in AI research in order to classify data that have multiple properties. One such method is the Self-Organizing Map (SOM), which is a kind of neural network technique. The SOM algorithm, which is based on unsupervised competitive learning, is a mapping from a high-dimensional space onto a two-dimensional plane. Although it is difficult to apply SOM to data whose features change over time, the authors have proposed "automatic refresh learning", a method that relearns from input data whose features have changed from the learning data[1]. However, automatic refresh learning has the drawback that the users have to put labels on the map every time after refresh learning. The labels fixed on the units of the original map no longer correspond to the new map and need to be replaced, because the map after refresh learning is different from the original map.

In this paper, we propose "semi-automatic labeling" on the SOM feature map, which detects the label positions and territories of the new feature map from the labels on the former map, in order to solve the above-mentioned problem.

2. Self-Organizing Map (SOM)

A Self-Organizing Map (SOM) is a kind of neural network technique whose algorithm is based on unsupervised competitive learning. It provides a topology-preserving mapping from a high-dimensional space to map units. Map units, or neurons, usually form a two-dimensional grid, and thus the SOM algorithm is a mapping from a high-dimensional space onto a two-dimensional plane. SOM groups similar input data vectors on neurons, i.e. points that are near each other in the input space are mapped to nearby map units in the SOM. The SOM can thus serve as a clustering tool as well as a tool for visualizing high-dimensional data.

The process of creating SOM requires two layers of units: the first is an input layer containing units for each element in the input vector, and the second is an output layer or grid of units that is fully connected with those at the input layer. (Fig.2.1)

When an input pattern, or input vector, is presented to the input layer, the units in the output layer compete with each other for the right to be declared the winner. The winner will be the output unit whose incoming connection weights are the closest to the input pattern in terms of Euclidean distance. The connection weights of the winner are then adjusted, i.e. moved in the direction of the input pattern.

Fig.2.1 : SOM (input layer and output layer)

After learning, each unit represents a group of individuals with similar features. Individuals with similar features correspond to the same unit or to neighboring units. That is, SOM configures the output units into a topological representation of the original data through a process called self-organization.

2.1 Learning Algorithm

The SOM learning algorithm is summarized as follows:

(1) Search for the output unit whose incoming connection weights are the closest to the input pattern in terms of Euclidean distance:

$$d_j(t) = \sum_{i=0}^{n-1} \left( x_i(t) - w_{ij}(t) \right)^2$$

(2) Adjust the connection weights of the winning unit and of the output units in the neighborhood of the winner so that they move closer to the input pattern:

$$w_{ij}(t+1) = w_{ij}(t) + \eta(t) \left[ x_i(t) - w_{ij}(t) \right]$$

(3) Repeat (1)-(2) for the given number of learning times.

(4) As learning progresses, the size of the neighborhood around the winning unit decreases. Initially, a large number of output units are updated, and as learning proceeds smaller and smaller numbers are updated, until at the end of learning only the winning unit is adjusted. Similarly, the learning rate decreases as learning progresses.

Here $t$ stands for the learning time, $d_j(t)$ stands for the distance between the input vector and the weight vector of each output unit, $x_i(t)$ stands for an input value, $w_{ij}(t)$ stands for a connection weight value, and $\eta(t)$ stands for the learning rate.
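Steps (1)-(4) above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function names `winner` and `train_som`, the linear decay of the learning rate and neighborhood radius, the square neighborhood, and the random initial weights are all assumptions that the text does not specify.

```python
import random


def winner(weights, x):
    """Step (1): index j of the unit minimizing d_j = sum_i (x_i - w_ij)^2."""
    def dist2(w):
        return sum((xi - wi) ** 2 for xi, wi in zip(x, w))
    return min(range(len(weights)), key=lambda j: dist2(weights[j]))


def train_som(data, rows=10, cols=10, steps=1000, eta0=0.5, seed=0):
    """Train a rows x cols map; returns one weight vector per unit."""
    rng = random.Random(seed)
    dim = len(data[0])
    # Assumption: weights start as uniform random values in [0, 1).
    weights = [[rng.random() for _ in range(dim)] for _ in range(rows * cols)]
    radius0 = max(rows, cols) / 2.0
    for t in range(steps):                      # step (3): repeat
        frac = 1.0 - t / steps                  # assumed linear decay schedule
        eta = eta0 * frac                       # learning rate eta(t)
        radius = radius0 * frac                 # step (4): shrinking neighborhood
        x = rng.choice(data)
        r0, c0 = divmod(winner(weights, x), cols)
        for j, w in enumerate(weights):
            r, c = divmod(j, cols)
            # Assumed square neighborhood; near the end only the winner is left.
            if max(abs(r - r0), abs(c - c0)) <= radius:
                for i in range(dim):            # step (2): move towards x
                    w[i] += eta * (x[i] - w[i])
    return weights
```

With `rows=10, cols=10, steps=1000` the sketch matches the map size and initial learning time used in the experiment section; the initial learning rate `eta0` is a placeholder.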

2.2 Labeling

After learning, we must put labels on the SOM map so that one can see immediately what the different neurons in the array mean. The procedure is as follows.

(1) Put a label on the node closest to the typical data of each class. These are the typical nodes of the classes.

(2) For every node except the typical nodes, investigate how close it is to the typical data of each label, and put the closest label on it.

(3) At that time, set a threshold so that a node which belongs to no class is given no label.
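The labeling procedure above can be sketched as follows. The function name `label_map`, the dictionary of typical data, and the use of `None` for an unlabeled node are illustrative assumptions; the text only describes the three steps.

```python
import math


def label_map(weights, typical, threshold):
    """Label every unit of a trained map.

    weights   : list of weight vectors, one per unit
    typical   : dict mapping a class label to its typical data vector
    threshold : maximum distance for a unit to receive a label
    Returns a list with one label per unit (None means no label).
    """
    labels = [None] * len(weights)
    # Steps (2) and (3): each unit takes the label of the closest typical
    # data, unless even the closest one is farther than the threshold.
    for j, w in enumerate(weights):
        best = min(typical, key=lambda lab: math.dist(w, typical[lab]))
        if math.dist(w, typical[best]) <= threshold:
            labels[j] = best
    # Step (1): the unit closest to each typical data is the typical node
    # of that class and always carries its label.
    for lab, v in typical.items():
        j = min(range(len(weights)), key=lambda k: math.dist(weights[k], v))
        labels[j] = lab
    return labels
```

For example, `label_map(weights, {"A": [0.0, 0.0, 2.0]}, 0.5)` labels the units near that point with "A" and leaves distant units unlabeled.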

3. Refresh learning

After learning, data that resemble the learning data can be classified suitably by SOM. However, it is difficult to apply SOM to data whose features have changed from the learning data. When the features of the input data have changed from the learning data, the following abnormalities appear in the output results.

(1) The input data are abnormally distant from the winner.

$$\frac{\sum_{t=t_1}^{t_2} \sigma\left( d_{j^*}(t) - Th_1 \right)}{t_2 - t_1} > Th_2$$

(2) There exist some units that win with too high a frequency.

$$\max_j \left( \frac{cf(j, t_1, t_2)}{t_2 - t_1} \right) > Th_3$$

(3) There are too many units which do not win.

$$\frac{M - \sum_{j=0}^{M} \sigma\left( cf(j, t_1, t_2) \right)}{M} > Th_4$$

Here $t$ stands for the timing of data input, $j^*$ stands for the winning node, $cf(j, t_1, t_2)$ stands for the winning frequency of node $j$ from $t_1$ to $t_2$, $M$ stands for the number of output units, and $Th_1$ through $Th_4$ are thresholds.

When the features of the input data have changed slowly from the learning data, we have been able to apply automatic refresh learning[1], which detects the changes of the features and automatically relearns based on the changed data when the above-mentioned abnormalities are detected.
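The three abnormality conditions can be checked over a window of recent inputs as in the sketch below. The text does not define σ; the sketch assumes it is a unit step function (1 for a positive argument, 0 otherwise), and it takes the number of inputs in the window as $t_2 - t_1$. The function names and the argument layout are hypothetical.

```python
from collections import Counter


def step(u):
    """Assumed form of sigma: 1 if u > 0, else 0."""
    return 1 if u > 0 else 0


def abnormalities(winner_dists, winner_ids, m, th1, th2, th3, th4):
    """Evaluate conditions (1)-(3) over one window of inputs.

    winner_dists : distance d_{j*}(t) from each input to its winner
    winner_ids   : index j* of the winner for each input
    m            : number of output units M
    Returns a tuple of three booleans, one per condition.
    """
    n = len(winner_ids)                 # taken as t2 - t1 (assumption)
    cf = Counter(winner_ids)            # winning frequency cf(j, t1, t2)
    too_far = sum(step(d - th1) for d in winner_dists) / n > th2
    too_frequent = max(cf.values()) / n > th3
    too_idle = (m - sum(step(cf[j]) for j in range(m))) / m > th4
    return too_far, too_frequent, too_idle
```

Refresh learning would be started when any of the three returned flags is true.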

4. Labeling after refresh learning

Because of competitive learning and neighborhood learning, which are characteristics of SOM, the map after refresh learning is different from the original map. So the labels fixed on the units of the original map no longer correspond to the new map and need to be replaced. This is a lot of trouble for the users.

In order to save such trouble, we propose "semi-automatic labeling" on the SOM feature map, which detects the label positions and territories of the new feature map from the labels on the former map.

4.1 Semi-automatic Labeling

To solve the above-mentioned labeling problem, we propose the following system. For unchanged data, it automatically detects the positions of the labels on the new map by using the data that were labeled on the former map. For changed data, i.e. data that were newly learned by refresh learning, it detects that the data have changed. The algorithm is as follows.

(1) Count the number of wins of each node after the learning rate falls below the threshold during refresh learning, because we consider that learning is nearly finished by then. We regard the nodes whose winning frequency is less than the criterion as the border lines of the classes. (refer to Fig.5.4)

(2) After refresh learning, put each label on the units which are the closest, in terms of Euclidean distance, to the representative data of that label on the former map. However, if the distance is over the criterion, put no label on that unit, because we regard the class as one whose data have changed. (refer to Fig.5.6)

(3) As a result of (1) and (2), the classes whose data have changed appear as blank territory that has no labels, and the border of each class is also shown. The users then need to put new labels on that territory.

Thus, for unchanged data, the positions and territories on the new map are detected and labeled automatically, and for changed data, the positions and territories on the new map are detected and shown as places where new labels are necessary. So we can recognize the groups that have changed and reduce the trouble of labeling. After refresh learning, the users can finish labeling quickly because they only need to label the classes whose data have changed.
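The semi-automatic labeling steps can be sketched as below. Several details are assumptions made for illustration only: the function name `semi_auto_label`, representing the former labels as (data, label) pairs, letting a border mark take priority over a label, and resolving several labels on one unit by majority vote. The symbols "#" and "-" follow the notation of the result figures.

```python
import math
from collections import Counter, defaultdict


def semi_auto_label(weights, win_counts, labeled_data, min_wins, max_dist):
    """Label a refreshed map from data that were labeled on the former map.

    weights      : weight vectors of the refreshed map, one per unit
    win_counts   : wins per unit, counted after the learning rate fell
                   below its threshold during refresh learning
    labeled_data : list of (data vector, label on the former map)
    min_wins     : criterion of winning frequency for a border unit
    max_dist     : criterion of distance between closest unit and data
    """
    votes = defaultdict(Counter)
    for x, lab in labeled_data:
        # Step (2): closest unit on the refreshed map.
        j = min(range(len(weights)), key=lambda k: math.dist(weights[k], x))
        # Data farther than the criterion are treated as changed: no label.
        if math.dist(weights[j], x) <= max_dist:
            votes[j][lab] += 1
    result = []
    for j in range(len(weights)):
        if win_counts[j] < min_wins:
            result.append("#")      # step (1): border unit (assumed priority)
        elif votes[j]:
            result.append(votes[j].most_common(1)[0][0])
        else:
            result.append("-")      # step (3): blank territory for the user
    return result
```

Units left as "-" are the territory on which the users put new labels.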

5. Experiment

We examined whether the above-mentioned method can detect the positions of labels on an SOM that performs automatic refresh learning.

The size of the map is 10x10, i.e. the number of processing units at the output layer, M, is set to 100. The initial number of learning times is set to 1000, and the number of learning times at refresh learning is set to 100.

At refresh learning, Th1 is set to 0.5, Th2 to 5.0, Th3 to 0.05, and Th4 to 0.3.

At semi-automatic labeling after refresh learning, the criterion of the distance between the closest node and the learning data is set to 0.5, the learning rate threshold is set to 0.1, and the criterion of winning frequency is set to 60 times.

We set these criteria to appropriate values based on experience and on consideration of the size of the map, the learning time, the update ratio of the learning rate, and the number and dispersion of the data used.

The data used are sets of points scattered at and around a center in 3-dimensional space. The scattered points are generated from normally distributed random numbers.

$$x_i(t) = \sqrt{-2 \sigma^2 \log_e Rand_1} \, \sin\left( 2 \pi Rand_2 \right) + \mu_i$$

Here $\sigma^2$ stands for the dispersion and is set to 0.1 in this case, $Rand_1$ and $Rand_2$ stand for random numbers, and $\mu_i$ stands for the center.
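The data generation formula can be sketched as follows. The function names `sample_point` and `make_class` and the use of a seeded generator are assumptions; drawing $Rand_1$ from (0, 1] rather than [0, 1) is a small precaution to avoid taking the logarithm of zero.

```python
import math
import random


def sample_point(center, variance, rng):
    """One point: x_i = sqrt(-2 s^2 ln Rand1) * sin(2 pi Rand2) + mu_i."""
    point = []
    for mu in center:
        rand1 = 1.0 - rng.random()      # uniform in (0, 1], so log is defined
        rand2 = rng.random()            # uniform in [0, 1)
        radius = math.sqrt(-2.0 * variance * math.log(rand1))
        point.append(radius * math.sin(2.0 * math.pi * rand2) + mu)
    return point


def make_class(center, n, variance=0.1, seed=0):
    """n points scattered around center with the given dispersion."""
    rng = random.Random(seed)
    return [sample_point(center, variance, rng) for _ in range(n)]
```

For example, `make_class((2.0, 2.0, 2.0), 50)` produces 50 learning points around the center (2.0, 2.0, 2.0) with dispersion 0.1.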

5.1 Case 1 Data

The initial data are classified into 3 classes. There were 50 points in each class, 150 points in total, as learning data, and 500 points in each class, 1500 points in total, as evaluation data.

Table 5.1 and Fig.5.1 show the change of the center of each class in case 1 data. Only the data of class "C" were divided into 2 types at the first step, and their centers changed at every step.

Table 5.1 : the center of case 1 data

step     class C, first type    class C, second type
step 0   (2.0, 2.0, 2.0)        (not yet divided)
step 1   (1.5, 2.0, 2.0)        (2.0, 2.0, 1.5)
step 2   (1.0, 2.0, 2.0)        (2.0, 2.0, 1.0)
step 3   (0.5, 2.0, 2.0)        (2.0, 2.0, 0.5)
step 4   (0.0, 2.0, 2.0)        (2.0, 2.0, 0.0)

The centers of classes A and B are fixed at every step: (0.0, 0.0, 2.0) and (2.0, 0.0, 0.0).

Fig.5.1 : the center of case 1 data

5.2 Case 1 Results

In Fig.5.2 through Fig.5.5, the x-axis and y-axis correspond to the feature map, and the z-axis represents the winning frequency.

Fig.5.2 through Fig.5.4 show the results of automatic refresh learning. Fig.5.2 shows the winning frequency when the input data (step 0) are presented to the initial map. It is a normal result because the outputs of every class are almost uniform. Fig.5.3 shows the winning frequency when the input data (step 4) are presented to the initial map. It shows inappropriate clustering because the output frequency is concentrated on particular units. Fig.5.4 shows the winning frequency when the input data (step 4) are presented to the refreshed map. It is a normal result because the outputs of every class are almost uniform. So we recognized that automatic refresh learning had worked suitably.

Fig.5.5 shows the winning frequency counted after the learning rate falls below the threshold during refresh learning. It is used to distinguish the territories when labeling after refresh learning. The main difference between Fig.5.4 and Fig.5.5 is the input data. In Fig.5.4 we use testing data. In Fig.5.5 we use learning data, because only learning data are available during the process of semi-automatic labeling, and this saves the trouble of obtaining suitable testing data.

Fig.5.2 : the winning frequency that is the case of input data (step 0) onto the initial map

Fig.5.3 : the winning frequency that is the case of input data (step 4) onto the initial map

Fig.5.4 : the winning frequency that is the case of input data (step 4) onto the refreshed map

Fig.5.5 : the winning frequency since the learning rate becomes under the threshold

B B B B B # # -B -B -B -B -B -B # # A A B B B B B - # A A A - B B B - # # A A A # # - # # # - A A A # # # - # # # A A A # - C C C C # A A A - C C C C C - # A A C C C C C C # # A A C C C C C C C # - A

Fig.5.6 : the labels on initial feature map

# B B B B B B B -- # B B B B B B B B - - # B B B B B B B - - - # B B B B B # - - - - # # # # # A # - # # A A - A - A # # - # A A A A A A - - - # A A A A - A - - - - A A A - - A - - - - # A A A A A

Fig.5.7 : the labels on refreshed feature map

Fig.5.6 shows the labels which are put on the initial feature map. Fig.5.7 shows the labels which are put automatically on the refreshed feature map. "-" stands for a blank space which has no label. "#" stands for the border line, whose winning frequency is too low. The classes whose data do not change ("A" and "B") were labeled automatically. So the users only need to put labels on the blank space for the classes whose data change. Since we knew that class "C" had been divided into 2 types, we recognized that it is necessary to put labels on two classes.

5.3 Case 2 Data

In the same way, Table 5.2 and Fig.5.8 show the change of the center of each class in case 2 data. The data of class "c" and the data of class "d" changed so as to become one group at the last step.

Table 5.2 : the center of case 2 data

step     classes c and d
step 0   (0.0, 2.0, 2.0)        (2.0, 2.0, 0.0)
step 1   (0.5, 2.0, 2.0)        (2.0, 2.0, 0.5)
step 2   (1.0, 2.0, 2.0)        (2.0, 2.0, 1.0)
step 3   (1.5, 2.0, 2.0)        (2.0, 2.0, 1.5)
step 4   (2.0, 2.0, 2.0)        (merged into one group)

The centers of classes a and b are fixed at every step: (0.0, 0.0, 2.0) and (2.0, 0.0, 0.0).

Fig.5.8 : the center of case 2 data

5.4 Case 2 Results

Fig.5.9 shows the labels which are put on the initial feature map. Fig.5.10 shows the labels which are put automatically on the refreshed feature map. The symbols are the same as above. The classes whose data do not change ("a" and "b") were labeled automatically. So the users only need to put labels on the blank space for the classes whose data change. Since we knew that the 2 classes "c" and "d" had become 1 group, we recognized that it is necessary to put a label on only one class.

b b b b b b - d - d b b b b - # d d d d b b b b b d d d d d b b b - b d d d d d # # # # # d - d # d a a a a a # # # # -a -a -a -a - c - c c c - a a a a c c c c c a a a - # c - c c c a a a a # c - c c c

Fig.5.9 : the labels on initial feature map

b b b b b # b b b b b b # b b b b b # b b b b b # # # a # a a a a # a a a a # # a a a a # # a a a a a # a a a a a a -

Fig.5.10 : the labels on refreshed feature map

6. Conclusion

The proposed "semi-automatic labeling" on the SOM map after refresh learning is motivated by the need to save the users the trouble of putting labels on the map every time after refresh learning. It detected the positions and territories on the new map for the labels of unchanged data, and showed, by blank territory, the necessity of putting new labels for changed data. So we can recognize the groups that have changed and reduce the trouble of labeling.

At present, we set the minimum winning frequency, which defines the border line of a class, to an appropriate value based on experience. However, it is important to set the border line in the blank space when the number of classes changes, so in the future we need to experiment further, consider this sufficiently, and adjust these values in more detail.

References

[1] Masahiro Mitihata, Tsutomu Miyoshi, Hiroshi Masuyama : "A Consideration on Adaptive Automatic Feature Learning of Self-Organizing Map," 15th Fuzzy System Symposium, Osaka, pp.309-310 (1999) (Japanese).

[2] T.Kohonen : "Self-Organizing Maps", Springer (1997).

[3] R.Hecht-Nielsen : "Neurocomputing," Addison-Wesley Pub. Co. (1992).

[4] G.Deboeck, T.Kohonen : "Visual Explorations in Finance with Self-Organizing Maps," Springer (1999).

[5] Stephen T.Welstead : "NEURAL NETWORK AND FUZZY LOGIC APPLICATIONS IN C/C++" John Wiley & Sons, Inc. (1994).

[6] Tatsuya Nomura, Tsutomu Miyoshi : "An Adaptive Fuzzy Rule Extraction Using Hybrid Model of the Fuzzy Self-Organizing Map and the Genetic Algorithm with Numerical Chromosomes," Journal of Intelligent and Fuzzy Systems, No.6, pp.39-52 (1998).