The Kohonen Self-Organizing Map (KSOM) is one of the unsupervised learning algorithms in the neural network family. The algorithm is used to solve problems in various areas, especially in clustering complex data sets. Despite its advantages, the KSOM algorithm has a few drawbacks, such as overlapped clusters and non-linearly separable problems. Therefore, this paper proposes a modified KSOM inspired by the pheromone approach of Ant Colony Optimization. The modification focuses on the distance calculation among objects. The proposed algorithm has been tested on four real categorical data sets obtained from the UCI machine learning repository: Iris, Seeds, Glass and the Wisconsin Breast Cancer Database. The results show that the modified KSOM produces accurate clustering results and that all clusters can be clearly identified.
The Kohonen Self-Organizing Map, invented by Teuvo Kohonen, a professor of the Academy of Finland, assumes a topological structure among the cluster units. This property is observed in the brain, but it is not found in other artificial neural networks. There are m cluster units, arranged in a one- or two-dimensional array; the input signals are n-tuples. The weight vector for a cluster unit serves as an exemplar of the input patterns associated with that cluster. During the self-organizing process, the cluster unit whose weight vector matches the input pattern most closely, typically in terms of the minimum squared Euclidean distance, is chosen as the winner. The winning unit and its neighboring units, in terms of the topology of the cluster units, update their weights, so that the weight vectors of the winner and its neighbors move closer to the input pattern. Fig. 3 below shows a sample SOM network architecture. It consists of 16 cluster units and 2 input units. Each input unit is connected to all cluster units, so each cluster unit has 2 connection weights.
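The winner selection and neighborhood update described above can be sketched as follows. The 4x4 grid matches the 16-unit, 2-input example; the learning rate and neighborhood radius are illustrative assumptions, not values from the text.

```python
import numpy as np

rng = np.random.default_rng(0)

# 16 cluster units on a 4x4 grid, 2 input units -> one weight vector of
# length 2 per cluster unit, matching the sample architecture above.
grid = np.array([(r, c) for r in range(4) for c in range(4)], dtype=float)
weights = rng.random((16, 2))

x = np.array([0.3, 0.7])          # one input pattern (a 2-tuple)

# Winner: the cluster unit whose weight vector minimizes the squared
# Euclidean distance to the input pattern.
d2 = ((weights - x) ** 2).sum(axis=1)
winner = int(np.argmin(d2))

# Update the winner and its topological neighbors (radius measured on the
# grid); alpha and radius are illustrative choices.
alpha, radius = 0.5, 1.0
neighbors = np.linalg.norm(grid - grid[winner], axis=1) <= radius
weights[neighbors] += alpha * (x - weights[neighbors])
```

After the update, the winner's weight vector (and those of its grid neighbors) has moved toward the presented input pattern.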
Since it was first proposed, it is remarkable how well the K-Means algorithm has survived over the years. It remains one of the best-known algorithms for data clustering in the field of data mining. New clustering algorithms appear constantly, but few match the speed, accuracy and simplicity of K-Means. In spite of these strengths, K-Means suffers from problems of its own: the exact number of clusters is not known prior to clustering, and the algorithm is quite sensitive to the initial centroids. Moreover, K-Means fails to give optimal results when clustering high-dimensional data sets, because its complexity grows as more dimensions are added; in data mining this problem is known as the "curse of dimensionality". In this paper we propose a new modified K-Means algorithm that overcomes these problems of the standard K-Means algorithm. We propose the use of the Kohonen Self-Organizing Map (KSOM) to visualize the exact number of clusters before clustering, while a genetic algorithm is applied for initialization. The KSOM with the modified K-Means algorithm is tested on the Iris data set, and its performance, compared with other clustering algorithms, is found to be more accurate, with fewer classification and quantization errors, and applicable even to high-dimensional data sets.
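The sensitivity to initial centroids mentioned above can be illustrated with a minimal k-means sketch; the synthetic blobs, seed placement and iteration count are assumptions for demonstration, not the paper's actual setup.

```python
import numpy as np

def kmeans(X, k, init, iters=20):
    """Plain k-means: assign each point to its nearest centroid, then
    recompute each centroid as the mean of its assigned points."""
    centroids = init.copy()
    for _ in range(iters):
        d = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        labels = d.argmin(axis=1)
        for j in range(k):
            if (labels == j).any():          # guard against empty clusters
                centroids[j] = X[labels == j].mean(axis=0)
    sse = ((X - centroids[labels]) ** 2).sum()   # within-cluster error
    return centroids, sse

rng = np.random.default_rng(1)
# Three well-separated synthetic blobs (illustrative data, not Iris).
X = np.vstack([rng.normal(c, 0.1, size=(30, 2)) for c in [(0, 0), (3, 0), (0, 3)]])

good_init = np.array([[0.0, 0.0], [3.0, 0.0], [0.0, 3.0]])  # one seed per blob
bad_init = X[:3].copy()             # all three seeds fall in the same blob
_, sse_good = kmeans(X, 3, good_init)
_, sse_bad = kmeans(X, 3, bad_init)
```

With well-separated blobs, the well-placed seeds reach the globally optimal partition, while seeds drawn from a single blob can converge to a worse local optimum, so `sse_bad` is never below `sse_good`.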
The main objective of a Kohonen Self-Organizing Map (SOM) (1997) is to determine a mapping of a set of n-dimensional input signals onto a two-dimensional grid. This mapping occurs in an adaptive and topologically ordered fashion. The SOM architecture consists of two layers: the input layer of dimensionality n (equal to the dimensionality of the input set) and the output layer, characterized as a two-dimensional grid of neurons. Each input neuron is fully connected to the two-dimensional grid, and each connection carries an associated synaptic weight. The weight vector of a grid unit consists of the weight values of all its connections.
Abstract -- Clustering gathers similar objects. A character can also be treated as an object and recognized in an image through its visual features. In this work, characters of the Urdu script are clustered on the basis of 18 different visual features. A Kohonen Self-Organizing Map is used for clustering with four different topologies of sizes 6x5, 8x7, 9x8, and 10x10. Each topology is tested for 75, 100, 150 and 200 epochs. The 30 Urdu characters take 106 different shapes due to the four different positions a character can occupy in a word. These 106 shapes are then classified into 53 general classes based on graphical similarity, each class described by the features of its shape. Considering only 18 features of each shape, the 53 general classes are then grouped into clusters using the Kohonen Self-Organizing Map (K-SOM). The above-mentioned work has been implemented in MATLAB.
An important basic principle is that the features must be independent of class membership because, by definition, at the feature extraction phase the membership in the classes is not yet known. This implies that any learning method used for feature extraction should be unsupervised, in the sense that the target class of each object is unknown. One such approach is the Kohonen Self-Organizing Map (KSOM), which uses competitive learning and in turn results in data clustering. The KSOM belongs to the class of unsupervised neural networks based on competitive learning, in which only one output neuron, or one per local group of neurons, gives the active response to the current input signal at a time. The level of activity indicates the similarity between the input signal vector and the respective weight vector; a standard way of expressing this similarity is the Euclidean distance between the vectors. The neuron whose weight vector has the minimal distance to the input data vector among all neurons in the network, together with a predefined set of neighboring neurons, has its weights automatically updated by the learning algorithm. The neighborhood of each neuron may be defined according to the geometrical form over which the neurons are arranged (Figure 3).
There are several studies related to the Self-Organizing Map (SOM) algorithm in [15-19]. In one of them, the self-organizing map, an architecture suggested for artificial neural networks, is explained through simulation experiments and practical applications; the self-organizing map has the property of effectively creating spatially organized internal representations of various features of input signals and their abstractions. In another, the Kohonen Self-Organizing Map (SOM) is described as one of the best-known neural networks with unsupervised learning rules; it performs a topology-preserving projection of the data space onto a regular two-dimensional space. Its effectiveness has been demonstrated in various areas, but the approach is not yet widely known and used by ecologists; that work describes how the SOM can be used for the study of ecological communities.
D_j = Σ_i (x_i − W_{i,j})²   (2)

where x_i is the i-th input, W_{i,j} is the weight connecting input i to output neuron j, and D_j is the summed squared Euclidean distance between the input sample and the weight vector of the j-th output neuron, which is called a map unit. There are different applications of SOM neural networks in WSN routing protocols. These applications can be divided into three general groups: deciding the optimal route, selection of cluster heads, and clustering of nodes. The authors in  used Kohonen SOM neural networks for clustering, and analyzed the clusters to study unpredictable behaviors of network parameters and applications. Clustering of sensor nodes using the Kohonen Self-Organizing Map (KSOM) is computed for various numbers of nodes by taking different parameters of each sensor node, such as direction, position, number of hops, energy level, sensitivity and latency. Cordina and Debono  proposed a new LEACH-like routing protocol in which the election of cluster heads is done with SOM neural networks, where the SOM inputs are the parameters intended for cluster heads. LEA2C applies connectionist learning by minimizing the distance between the input samples and the map units.
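Equation (2) amounts to the following computation over all map units; the array sizes and random values here are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(2)

# Weight matrix W[i, j]: connection from input i to map unit j
# (4 inputs, 6 map units; values are illustrative).
W = rng.random((4, 6))
x = rng.random(4)                 # one input sample

# Equation (2): D_j = sum_i (x_i - W_{i,j})^2 for every map unit j.
D = ((x[:, None] - W) ** 2).sum(axis=0)
winner = int(np.argmin(D))        # the map unit with the smallest D_j
```

The unit minimizing D_j plays the role of the winning map unit in the clustering and cluster-head-election schemes described above.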
This research uses Kohonen's Self-Organizing Map (SOM) to cluster software metrics (the CK metrics suite). The clustering of CK metrics was based on metrics threshold values proposed in the literature. We showed that SOM can be applied to cluster software metrics in order to visualize the relationship between software metrics and reusability level (High Reusable, Medium Reusable and Low Reusable). SOM was used in this research because of its powerful ability to cluster data vectors and its property of spatial autocorrelation, which helps in discovering software-metrics patterns and their relationship with the reusability category. The clustering validity was based on the highest average silhouette value, obtained after applying many grid sizes and different numbers of epochs. Initially, we applied SOM to the whole CK metrics suite in order to cluster classes into their suitable reusability categories, but we found that the NOC and DIT metrics dominated the clustering results because of their poor distribution, so the clustering was unhelpful. The solution was to eliminate the NOC and DIT metrics from the clustering process. The experimental results show that the clustering then becomes more homogeneous and meaningful.
An image compression algorithm is needed that reduces the amount of image data to be transmitted, stored and analyzed without losing the information content. This paper presents a neural network based technique that may be applied to image compression. Conventional techniques for data compression include Huffman coding, the Shannon-Fano method, the LZ method, run-length encoding and LZ-77. A traditional approach to reducing the large amount of data is to discard some data redundancy, introducing some noise after reconstruction. We present a neural network based self-organizing Kohonen map technique that may be a reliable and efficient way to achieve vector quantization, a typical application of which is image compression. Moreover, Kohonen networks realize a mapping between an input and an output space that preserves topology. This feature can be used to build new compression schemes which obtain a better compression rate than classical methods such as JPEG without reducing the image quality.
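The vector-quantization step behind such a scheme can be sketched as follows, assuming a toy 8x8 image and a hand-picked 4-entry codebook in place of a trained Kohonen map.

```python
import numpy as np

rng = np.random.default_rng(3)
img = rng.integers(0, 256, size=(8, 8)).astype(float)   # toy 8x8 "image"

# Split the image into 2x2 blocks -> 16 vectors of dimension 4.
blocks = img.reshape(4, 2, 4, 2).swapaxes(1, 2).reshape(16, 4)

# A trained Kohonen map would supply the codebook; here four existing
# blocks stand in as codewords to show the quantization step itself.
codebook = blocks[rng.choice(16, 4, replace=False)]

# Encode: each block is replaced by the index of its nearest codeword,
# so 16 small indices are stored instead of 64 pixel values.
d = ((blocks[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
indices = d.argmin(axis=1)

# Decode: look the codewords back up to reconstruct an approximate image.
approx_blocks = codebook[indices]
```

Only the codebook and the index stream need to be transmitted, which is where the compression gain over storing raw pixels comes from.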
The training starts once the dataset has been initialised and input patterns have been selected. The learning phase of the SOM algorithm repeatedly presents numerous patterns to the network. The learning rule of the classifier allows these training cases to organize into a two-dimensional feature map, in which patterns that resemble each other are mapped onto the same cluster. During the training phase, the class for a randomly selected input is determined by labeling the output node that is most similar to it (the best-matching unit) compared with the other nodes in the Kohonen mapping structure. The output of the training is the resulting map, which contains the winning neurons and their associated weight vectors. Subsequently, these weight vectors are optimised by PSO. The classification accuracy is calculated to investigate the behavior of the network on the training data.
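A minimal sketch of such a training loop, before any PSO refinement; the map size, decay factor and zero-radius neighborhood are simplifying assumptions, not the paper's configuration.

```python
import numpy as np

rng = np.random.default_rng(4)
X = rng.random((50, 3))              # 50 training patterns, 3 features
weights = rng.random((2, 2, 3))      # a small 2x2 Kohonen map (illustrative)

alpha = 0.5
for epoch in range(30):              # repeatedly present the patterns
    for x in X[rng.permutation(len(X))]:
        # Best-matching unit: the node whose weights are most similar to x.
        d = ((weights - x) ** 2).sum(axis=2)
        bmu = np.unravel_index(d.argmin(), d.shape)
        # Only the BMU is updated here (neighborhood radius of zero),
        # which reduces SOM learning to plain competitive learning.
        weights[bmu] += alpha * (x - weights[bmu])
    alpha *= 0.9                     # the learning rate decays each epoch

# Each pattern can now be labeled with its winning node on the map.
labels = [np.unravel_index((((weights - x) ** 2).sum(axis=2)).argmin(), (2, 2))
          for x in X]
```

The resulting `weights` array is the trained map, and `labels` records the best-matching unit for every training pattern.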
On the other hand, in an unsupervised technique, inherent features extracted from the image are used for the segmentation. Unsupervised segmentation based on clustering includes K-means, Fuzzy C-Means (FCM) and ANN. The K-means algorithm is a hard segmentation method because it either assigns a pixel to a class or it does not. FCM uses a membership function, so that a pixel can belong to several clusters with different degrees of membership. One important problem of these two clustering methods is that the number of clusters must be known beforehand. ANNs can change their responses according to environmental conditions and learn from experience. The Self-Organizing Map (SOM) [9, 10], or Kohonen's map, is an unsupervised ANN that uses a competitive learning algorithm. The SOM's features are very useful in data analysis and data visualization, which makes it an important tool in image segmentation. Although the use of SOM in image segmentation is well reported in the literature [9, 11], its application under noisy conditions is not widely known.
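The contrast between the hard K-means assignment and the FCM membership function can be sketched as follows; the pixel value, cluster centres and fuzzifier m = 2 are illustrative assumptions.

```python
import numpy as np

# One pixel intensity and two cluster centres (illustrative values).
x = 0.4
centres = np.array([0.0, 1.0])
m = 2.0                                   # the usual FCM fuzzifier choice

d = np.abs(x - centres)                   # distances to each centre
# Standard FCM membership: u_k = 1 / sum_j (d_k / d_j)^(2/(m-1)),
# so the pixel belongs to every cluster with some degree.
u = 1.0 / ((d[:, None] / d[None, :]) ** (2.0 / (m - 1.0))).sum(axis=1)

# K-means, by contrast, makes an all-or-nothing (hard) assignment.
hard = np.zeros_like(u)
hard[d.argmin()] = 1.0
```

The memberships in `u` sum to one and favour the nearer centre, while `hard` puts the full weight on a single cluster.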
Researchers who have studied landslides have applied models for predicting landslide hazard, some of them probabilistic. The various models used include logistic regression , neuro-fuzzy , fuzzy logic , geographic information systems , self-organizing maps  and artificial neural network models . Artificial neural networks (ANN) have broad applicability to real-world business problems; in fact, they have been successfully applied in many industries . Since neural networks are best at identifying patterns or trends in data, they are well suited for prediction or forecasting, including sales forecasting, industrial process control, customer research, data validation, risk management  and target marketing . The Kohonen self-organizing map (SOM), as part of the unsupervised learning algorithms of ANN, has been applied as a clustering algorithm for high-dimensional data, as well as an alternative tool to classical multivariate statistical techniques . The SOM algorithm has also been employed as a tool for eco-morphological investigation of the life history of fish , as a model for post-fire hydrologic and geomorphic hazards , as a tool for forecasting reservoir inflow during typhoon periods , and for visualizing the topical structure of the medical sciences . On the other hand, the K-means clustering algorithm has been presented for image segmentation based on an adaptive approach , landslide detection  and spatial prediction of landslide hazard . As evidenced by the above references, modeling using SOM and K-Means has recently been applied to a broad range of geo-environmental fields.
While the two-pathway biology of BC is generally accepted, many tumors have aspects of both low- and high-grade biology. For example, FGFR3 mutations are not found in CIS (carcinoma in situ), but they coexist with TP53 mutations in 10-20% of invasive BCs, as do deletions of both chromosome 9 (typical of low-grade disease) and 17p (locus of TP53) in 15-74% of BCs [4, 8]. Clinical phenotypes, therefore, reflect either the timing or impact of genetic events combined with patient factors (such as type of and continued exposure to carcinogens) and treatment effectiveness (such as timing, appropriateness and quality of treatment). A current challenge for translational researchers is to integrate distinct and potentially competing molecular events into single-phenotype predictions. In BC, this means the ability to discriminate future tumor behavior using molecular alterations typical of low- and high-grade tumor development. Nonstatistical methods are appealing in this role as they do not rely upon data distribution, can handle large datasets automatically without supervision or prior assumptions, and do not assume that statistical proximity equates to molecular association . Various structures of artificial intelligence have been developed, of which Artificial Neural Networks (ANNs) are perhaps the best evaluated (reviewed in Ref. ). Here, we report the use of a self-organizing map (SOM) to integrate molecular parameters in BC. SOMs are a type of unsupervised ANN well suited to low-density data visualisation . We selected molecular events that characterize the high- and low-grade BC pathways and used progression to more advanced disease as our primary outcome.
The Kohonen network, also known as Kohonen's Self-Organizing Map (SOM) or Kohonen's Self-Organizing (Feature) Map (SO(F)M), is one of the most popular network architectures. SOM provides data visualization techniques which help to understand high-dimensional data by reducing the dimensions of the data to a map. The main function of SOM networks is to represent multidimensional data in a much lower-dimensional space - usually one or two dimensions. This process of reducing the dimensionality of vectors is essentially a data compression technique known as vector quantization. In addition, the Kohonen technique creates a network that stores information in such a way that the topological relationships within the training set are maintained. SOM represents a clustering concept by grouping similar data together; it can therefore be said that SOM both reduces data dimensions and displays similarities among data. SOFMs are competitive neural networks in which the feature space is represented by organizing the neurons in a two-dimensional grid (in the simplest case). According to the learning rule, vectors that are similar to each other in the multidimensional space will also be similar in the two-dimensional space. SOFMs are often used simply to visualize an n-dimensional space, but their main application is data classification.
ABSTRACT: This paper presents a new approach to Tamil character recognition based on the Kohonen neural network Self-Organizing Map (SOM) algorithm, which provides much higher performance than a traditional neural network. Approach: Step 1: A classification approach is used to recognize handwritten Tamil characters. The aim of the pre-classification is to reduce the number of possible candidates for an unknown character to a subset of the total character set; this subset is known as a cluster, so the algorithm tries to group similar characters together. Step 2: Members of the pre-classified group are further analyzed using a statistical classifier for final recognition. A recognition rate of around 79.9% was achieved for the first choice and more than 98.5% for the top three choices. The results show that the proposed Kohonen SOM algorithm yields promising output and is feasible compared with other existing techniques.
IJSRR, 7(4) Oct. - Dec., 2018. ...the best matching unit's neighbourhood. These nodes can have their weight vectors altered in the next step. A singular feature of the Kohonen learning algorithm is that the area of the neighbourhood shrinks over time, eventually to the size of just one node. Once the radius is known, iterations are carried out through all the nodes in the lattice to determine whether they lie within the radius or not. If a node is found to be within the neighborhood, its weight vector is adjusted: each node within the best matching unit's neighborhood (including the best matching unit itself) has its weight vector adjusted. The SOFM design depicted below in Figure 1 demonstrates the mapping structure.
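The shrinking neighbourhood can be sketched as follows, assuming the common exponential decay sigma(t) = sigma0 * exp(-t / lambda) on a 5x5 lattice; the starting radius and decay constant are illustrative values.

```python
import numpy as np

sigma0, lam = 3.0, 10.0                   # illustrative starting radius / decay
grid = np.array([(r, c) for r in range(5) for c in range(5)], dtype=float)
bmu = grid[12]                            # the centre node acts as the BMU

def neighbourhood_size(t):
    """Count the lattice nodes inside the shrinking radius at step t."""
    sigma = sigma0 * np.exp(-t / lam)
    return int((np.linalg.norm(grid - bmu, axis=1) <= sigma).sum())

early, late = neighbourhood_size(0), neighbourhood_size(60)
```

Early in training the radius covers the whole lattice, while late in training only the best matching unit itself remains inside it, matching the behaviour described above.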
In this paper, a new algorithm named self-organizing map for clustering social networks (SOMSN) is proposed for detecting such groups. SOMSN is based on the self-organizing map neural network. By adopting a new weight-updating method, SOMSN divides a social network into different clusters according to the topological connections of each node. These clusters are the communities, mentioned above, in social networks. To show the effectiveness of the presented approach, SOMSN has been applied to several classic social networks with a known number of communities and a defined structure. The results of these experiments show that the clustering accuracy of SOMSN is superior to that of traditional algorithms.
Abstract. In this paper, text mining and visualization by the self-organizing map (SOM) are investigated. First, textual information must be converted into numerical form, and the results of text mining and visualization depend on this conversion. Therefore, the influence of some control factors (the common-word list and the use of a stemming algorithm) on text mining results, when a document dictionary is created, is investigated. A self-organizing map is used for text clustering and graphical representation (visualization). A comparative analysis is made on a dataset consisting of scientific papers about optimization, based on Pareto, simplex, and genetic algorithms. Two new measures are also proposed to estimate the SOM quality when the classified data are analyzed: the distance between SOM cells corresponding to data items assigned to the same class, and the distance between the centers of SOM cells corresponding to different classes. The quantization error is also measured to estimate the SOM quality.
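The quantization error mentioned at the end is typically computed as the mean distance from each data item to its best-matching cell; in this sketch the data and the 3x3 map weights are random placeholders for a trained SOM.

```python
import numpy as np

rng = np.random.default_rng(5)
X = rng.random((40, 4))                   # data items (illustrative)
weights = rng.random((3, 3, 4))           # a trained 3x3 SOM would go here

# Quantization error: for each item, the distance to its best-matching
# cell, averaged over all items.
d = np.linalg.norm(X[:, None, None, :] - weights[None, :, :, :], axis=3)
qe = d.reshape(len(X), -1).min(axis=1).mean()
```

A lower `qe` indicates that the map's weight vectors represent the dataset more faithfully, which is why it serves as a quality estimate.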
Abstract. In this paper, two combinations (consecutive and integrated) of vector quantization methods (the self-organizing map and neural gas) with multidimensional scaling (MDS) have been investigated and compared. Vector quantization is used to reduce the number of dataset items. The dataset with the smaller number of items is then analyzed by multidimensional scaling in order to reduce the number of data features (the dimensionality of the space) and to map the items onto the plane, i.e., to visualize them. Several ways of initializing the two-dimensional vectors in MDS (at random, on a line, by principal components and by variances) have been investigated. Two ways of assigning two-dimensional vectors in the integrated combinations of MDS and vector quantization methods have also been examined.
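A consecutive combination of the two steps can be sketched as follows; as assumptions, plain k-means centroids stand in for the SOM/neural-gas codebook, and classical (Torgerson) MDS stands in for the MDS variant used in the paper.

```python
import numpy as np

rng = np.random.default_rng(6)
X = rng.random((200, 5))                  # original high-dimensional data

# Step 1 (vector quantization): replace the 200 items by 10 codebook
# vectors, reducing the number of dataset items.
codes = X[rng.choice(200, 10, replace=False)].copy()
for _ in range(15):
    labels = ((X[:, None, :] - codes[None, :, :]) ** 2).sum(axis=2).argmin(axis=1)
    for j in range(10):
        if (labels == j).any():
            codes[j] = X[labels == j].mean(axis=0)

# Step 2 (classical MDS): map the 10 codebook vectors onto the plane.
D2 = ((codes[:, None, :] - codes[None, :, :]) ** 2).sum(axis=2)  # squared distances
J = np.eye(10) - np.ones((10, 10)) / 10                          # centering matrix
B = -0.5 * J @ D2 @ J                                            # Gram matrix
evals, evecs = np.linalg.eigh(B)
Y = evecs[:, -2:] * np.sqrt(np.maximum(evals[-2:], 0))           # 2-D embedding
```

`Y` is the planar visualization of the quantized dataset; MDS now runs on 10 points instead of 200, which is the point of applying vector quantization first.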