Organizing Map
6.2.1 Self-Organizing Map
The SOM was proposed by Kohonen [125] as a specific type of neural network. Its concept originates from the functions of the cerebral cortex of the brain. The cerebral cortex is divided into different areas for processing signals such as sight, hearing and tactile sensation [126]. On receiving these signals, the cortex first classifies them and then maps them to the corresponding areas for processing. In each area of the cortex, neurons with similar functionality are closely related, leading to fast and accurate processing of the signals. This form of classifying and mapping signals to the corresponding processing area is called topographic mapping, which is also the fundamental concept of the SOM [125].
The self-organizing map is able to discover nonlinear latent features in high-dimensional data. These low-dimensional features are presented as a layer of topologically ordered neurons on a 2D map. A typical two-dimensional SOM is shown in Figure 6-2.
Figure 6-2: Basic SOM structures
Training of the SOM consists mainly of three phases: competition, cooperation and adaptation [125]. In the competition phase, the neurons compete with each other, and the neuron whose weight vector is closest to the input signal vector is declared the winner neuron or Best Matching Unit (BMU). The input signal vector is represented by $\mathbf{I} = [I_1, I_2, I_3, \ldots, I_n]^T$ and the weight vector by $\mathbf{W} = [W_1, W_2, W_3, \ldots, W_n]^T$. Mathematically, the difference between the weight vector and the input signal vector is computed as the Euclidean distance between them:
$$E = \lVert \mathbf{I} - \mathbf{W} \rVert = \sqrt{\sum_{i=1}^{n} (I_i - W_i)^2} \tag{6.1}$$
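The competition phase can be illustrated with a minimal sketch, assuming each neuron's weight vector is stored as a plain Python list. The function names `euclidean_distance` and `find_bmu` and the example values are hypothetical and serve only to illustrate Equation (6.1).

```python
import math

def euclidean_distance(signal, weight):
    """Equation (6.1): E = ||I - W||."""
    return math.sqrt(sum((i - w) ** 2 for i, w in zip(signal, weight)))

def find_bmu(signal, weights):
    """Return the index of the neuron whose weight vector is closest to the signal."""
    distances = [euclidean_distance(signal, w) for w in weights]
    return min(range(len(weights)), key=distances.__getitem__)

# Hypothetical example: three neurons with two-dimensional weight vectors.
weights = [[0.0, 0.0], [1.0, 1.0], [3.0, 4.0]]
signal = [0.9, 1.2]
bmu = find_bmu(signal, weights)
print("BMU index:", bmu)
```

A linear scan over all neurons is used here for clarity; it is one simple way to find the minimum of E, not necessarily how a production SOM library would do it.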
The neuron with the smallest E is the BMU. Next, in the cooperation phase, the direct neighbourhood neurons of the BMU are identified. Finally, in the adaptation phase, these neurons are selectively tuned to form a specific pattern on the lattice. This pattern corresponds to a specific feature of the input signal vector. The tuning function is expressed as:
$$\mathbf{W}(t+1) = \mathbf{W}(t) + \theta(t)\,\alpha(t)\,[\mathbf{I}(t) - \mathbf{W}(t)] \tag{6.2}$$
where $\alpha(t)$ is the tuning rate and $\theta(t)$ is the exponential neighbourhood function. $\alpha(t)$ decreases exponentially over the iterations, resulting in more refined tuning towards the end of the training process:
$$\alpha(t) = \alpha_0 \, e^{-t/\lambda} \tag{6.3}$$
where $\alpha_0$ is the initial learning rate and $\lambda$ is the time constant, which is determined as:
$$\lambda = \frac{N}{\sigma_0} \tag{6.4}$$
where $N$ is the total number of training samples and $\sigma_0$ is the radius of the map, computed as the Euclidean distance between the coordinates of the outmost neuron and the centre neuron:
$$\sigma_0 = \lVert \mathbf{T}_{outmost} - \mathbf{T}_{centre} \rVert \tag{6.5}$$
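The decay schedule can be sketched as follows. This is only an illustration under stated assumptions: neurons sit on integer row and column positions of a rectangular lattice, the centre is taken as the geometric centre of that lattice, and the time constant is taken as λ = N/σ0, which is an assumed reading of Equation (6.4). The function names and the numerical settings are hypothetical.

```python
import math

def map_radius(rows, cols):
    """Equation (6.5): distance from the lattice centre to the outmost neuron."""
    # Assumption: neurons sit at integer (row, col) positions; for even sizes
    # the geometric centre does not coincide with a neuron.
    centre = ((rows - 1) / 2.0, (cols - 1) / 2.0)
    return max(
        math.hypot(r - centre[0], c - centre[1])
        for r in range(rows)
        for c in range(cols)
    )

def time_constant(n_samples, sigma0):
    """Assumed reading of Equation (6.4): lambda = N / sigma0."""
    return n_samples / sigma0

def tuning_rate(t, alpha0, lam):
    """Equation (6.3): alpha(t) = alpha0 * exp(-t / lambda)."""
    return alpha0 * math.exp(-t / lam)

# Hypothetical settings: 5 x 5 lattice, 1000 training samples, alpha0 = 0.5.
sigma0 = map_radius(5, 5)
lam = time_constant(1000, sigma0)
for t in (0, 250, 500, 1000):
    print(t, tuning_rate(t, 0.5, lam))
```

With these settings the tuning rate starts at α0 and falls to α0/e after λ iterations, which is the sense in which tuning becomes more refined late in training.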
Note that on a 2D map the coordinates of each neuron are expressed as $\mathbf{T}_j = [t_{j1} \; t_{j2}]$. $\mathbf{T}_{outmost}$ denotes the coordinates of the outmost neuron while $\mathbf{T}_{centre}$ represents the coordinates of the central neuron. The neighbourhood function $\theta(t)$, on the other hand, is maximized at the BMU and decays exponentially with the squared distance from the BMU:
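$$\theta(t) = \exp\left(-\frac{\lVert \mathbf{T}_j - \mathbf{T}_{BMU} \rVert^2}{2\sigma^2(t)}\right) \tag{6.6}$$
$$\sigma(t) = \sigma_0 \exp\left(-\frac{t}{\lambda}\right) \tag{6.7}$$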
where $\mathbf{T}_{BMU}$ is the coordinate of the best matching unit and $\sigma(t)$ is the radius of the neighbourhood. This means that neurons farther away from the BMU are updated at a much lower rate. In addition, neurons outside the neighbourhood radius are skipped completely. As a result, the weight vectors of the BMU and its neighbours gradually become more similar to the input data samples. Conversely, the similarity between the weight vectors of the neurons farther away and the input data sample decreases over time. This type of differential tuning leads to similarity mapping of the data samples and topological ordering of the neurons. The training process continues until
the maximum number of training iterations is reached. After training, the topologically ordered neurons form a 2D pattern which corresponds to the low-dimensional latent features of the training data samples, such as the example shown in Figure 6-3. In this respect, the SOM can also be used as a data classification tool with which data samples with similar features are mapped into a single cluster.
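The three training phases can be put together in a short sketch. It is not the author's implementation: the function name `train_som`, the random weight initialisation in [0, 1), the random presentation order of samples, the value `alpha0 = 0.5`, the synthetic two-cluster data and the reading λ = N/σ0 of Equation (6.4) are all assumptions made for illustration.

```python
import math
import random

def distance(a, b):
    """Euclidean distance, as in Equation (6.1)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def train_som(samples, rows, cols, n_iterations, alpha0=0.5, seed=0):
    """Train a rows x cols SOM (more than one neuron); return (coords, weights)."""
    rng = random.Random(seed)
    dim = len(samples[0])
    coords = [(r, c) for r in range(rows) for c in range(cols)]
    # Assumption: weights start as uniform random values in [0, 1).
    weights = [[rng.random() for _ in range(dim)] for _ in coords]

    centre = ((rows - 1) / 2.0, (cols - 1) / 2.0)
    sigma0 = max(distance(coord, centre) for coord in coords)  # Equation (6.5)
    lam = len(samples) / sigma0  # assumed reading of Equation (6.4)

    for t in range(n_iterations):
        signal = rng.choice(samples)  # assumption: random presentation order
        # Competition: the neuron with the smallest E is the BMU.
        bmu = min(range(len(weights)), key=lambda j: distance(signal, weights[j]))
        alpha = alpha0 * math.exp(-t / lam)  # Equation (6.3)
        sigma = sigma0 * math.exp(-t / lam)  # Equation (6.7)
        # Cooperation and adaptation: tune the BMU and its neighbours.
        for j, coord in enumerate(coords):
            d = distance(coord, coords[bmu])
            if d > sigma:
                continue  # neurons outside the neighbourhood radius are skipped
            # Equation (6.6); the BMU itself (d == 0) gets theta = 1.
            theta = 1.0 if d == 0 else math.exp(-(d * d) / (2.0 * sigma * sigma))
            # Equation (6.2): move the weight vector towards the input signal.
            weights[j] = [w + alpha * theta * (x - w)
                          for w, x in zip(weights[j], signal)]
    return coords, weights

# Hypothetical data: two well-separated groups of 2-D samples.
data_rng = random.Random(1)
samples = [[data_rng.uniform(0.0, 0.2), data_rng.uniform(0.0, 0.2)] for _ in range(100)]
samples += [[data_rng.uniform(0.8, 1.0), data_rng.uniform(0.8, 1.0)] for _ in range(100)]
coords, weights = train_som(samples, rows=4, cols=4, n_iterations=200)
```

The loop stops after a fixed number of iterations, matching the stopping rule described above. Grouping the trained neurons into clusters is a separate step that this sketch does not cover.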
When the SOM is applied to a complex system, the different operating states of the system are mapped as clusters on a two-dimensional map. During online monitoring, each sample data vector is compared with the weight vectors of the neurons within each cluster and the BMU is computed. The operating state of the system is identified from the location of the BMU. Connecting the BMUs of all sample data forms a trajectory that shows the dynamic behaviour of the system. When the system is operating under normal conditions, the process data samples are similar to each other and are mapped into a single cluster. As a result, the trajectory is confined to the cluster representing normal operation. Under a fault condition, the system is subjected to abnormal variations, which generate data samples with very distinctive features. These faulty data samples are mapped into a different cluster. In addition, as the fault condition deteriorates, the generated data samples become more dissimilar and are mapped into a cluster farther away from the normal cluster. Consequently, the trajectory connecting the neurons onto which the online process data samples are mapped deviates from the normal operating cluster and moves into the cluster representing one particular fault state.
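The online monitoring step can be sketched as below, assuming a map has already been trained and each neuron has already been assigned to an operating-state cluster. The 2 x 2 map, its weight vectors, the `neuron_state` labels and the online samples are hypothetical values chosen only to show how a BMU trajectory moves from the normal cluster into a fault cluster.

```python
import math

def find_bmu(signal, weights):
    """Index of the neuron whose weight vector is closest to the signal."""
    return min(
        range(len(weights)),
        key=lambda j: math.sqrt(sum((x - w) ** 2 for x, w in zip(signal, weights[j]))),
    )

def monitor(online_samples, weights, coords, neuron_state):
    """Return the BMU trajectory on the lattice and the state of each sample."""
    trajectory, states = [], []
    for signal in online_samples:
        bmu = find_bmu(signal, weights)
        trajectory.append(coords[bmu])    # position of the BMU on the 2D map
        states.append(neuron_state[bmu])  # cluster the BMU belongs to
    return trajectory, states

# Hypothetical trained 2 x 2 map with cluster labels per neuron.
coords = [(0, 0), (0, 1), (1, 0), (1, 1)]
weights = [[0.0, 0.0], [0.1, 0.1], [0.9, 0.9], [1.0, 1.0]]
neuron_state = ["normal", "normal", "fault", "fault"]

# Hypothetical online samples drifting away from normal operation.
online_samples = [[0.02, 0.01], [0.12, 0.08], [0.85, 0.95], [0.98, 1.03]]
trajectory, states = monitor(online_samples, weights, coords, neuron_state)
print(trajectory)
print(states)
```

In this toy case the first two samples stay in the neurons labelled normal and the last two land in the neurons labelled fault, so the trajectory leaves the normal cluster as the samples drift.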