In this paper, two different techniques for selecting principal components in **PCA** have been proposed. Both methods produce satisfactory results and improve the **clustering** accuracy of the SOM model. It can therefore be concluded that the proposed models are better than the existing models. The proposed models are also able to cluster data at a small lattice size and to reduce the dimension of the data, so it can also be concluded that the proposed methods are computationally efficient. In spite of these satisfactory results, there is scope for further improvement in the proposed models. Instead of the K-means **clustering** algorithm, other **clustering** techniques can be used to cluster the eigenvalues and eigenvectors of the covariance matrix and to produce the final clusters from the SOM **model**. Other pre-processing techniques may also prove effective in **clustering** with the SOM **model** and give better output.

RBF is used to improve the accuracy of the SOM **model** [5]. On the other hand, **PCA** is also merged with SOM to improve the power of the SOM **model**. The present paper aims to improve the **PCA** **based** SOM **model**. In this paper, RBF is used as a pre-processing tool for the **PCA** **based** **model**, and their combined effect is merged with SOM so that the accuracy of the SOM **model** is further improved. The proposed technique not only eliminates the non-linearity of the input data, it also reduces the dimension of the data more than the application of **PCA** alone. In this paper, RBF is used as a dimensionality reduction tool in addition to its normal role. After applying RBF and **PCA** in that order, the SOM algorithm is used to cluster the selected principal components. Finally, the desired number of clusters is obtained by further **clustering** of the SOM prototypes using the K-means algorithm.
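
The pre-processing order described here (RBF first, then **PCA**) can be illustrated with a minimal sketch. The excerpt does not specify the RBF layer, so the Gaussian kernel, the width `gamma`, the number of centers, and the choice of centers as random samples are all assumptions, as are the helper names `rbf_features` and `pca_reduce`. The SOM and K-means stages that would follow are omitted.

```python
import numpy as np

def rbf_features(X, centers, gamma):
    """Map each row of X to Gaussian RBF activations, one per center (assumed kernel)."""
    # Squared Euclidean distance between every sample and every center.
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-gamma * d2)

def pca_reduce(Z, n_components):
    """Project Z onto its leading principal components, computed via SVD."""
    Zc = Z - Z.mean(axis=0)
    # Rows of Vt are the principal directions, ordered by decreasing variance.
    _, s, Vt = np.linalg.svd(Zc, full_matrices=False)
    explained = (s ** 2) / (s ** 2).sum()
    return Zc @ Vt[:n_components].T, explained[:n_components]

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 10))                           # toy data: 200 samples, 10 features
centers = X[rng.choice(len(X), size=6, replace=False)]   # hypothetical RBF centers
Z = rbf_features(X, centers, gamma=0.1)                  # 200 x 6: fewer columns than X
Y, ratio = pca_reduce(Z, n_components=2)                 # 200 x 2: would be the SOM input
```

Using fewer RBF centers than input features is one way the RBF stage could itself reduce dimension before **PCA** is applied; whether the paper does it this way is not stated in the excerpt.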

We have proposed a new **model** for big data sentiment classification using a **Self**-**Organizing** **Map** algorithm (SOM), an unsupervised machine learning method, to classify the sentiments (positive, negative, or neutral) of all the documents of our testing data set according to all the documents of our training data set in English. We run the SOM only once to identify the sentiment classification of all the documents of the testing data set. The SOM operates on multi-dimensional vectors of both the testing data set and the training data set. The multi-dimensional vectors are **based** on the sentiment lexicons of our basis English sentiment dictionary (bESD): each document corresponds to one multi-dimensional vector according to the sentiment lexicons. After running the SOM once, a **Map** is used to present the results of the SOM. The results of **clustering** all documents of the testing data set into either the positive polarity or the negative polarity are shown on the **Map**, from which the sentiment classification of every document of the testing data set can be read. We use only multi-dimensional vectors **based** on the sentiment lexicons of the bESD. The new **model** has been tested first in a sequential **system** and then in a parallel network environment. An accuracy of 88.72% has been achieved on the testing data set. The results of this new **model** can be used widely in many different fields.
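
The mapping from one document to one multi-dimensional vector can be sketched as follows. This is only an illustration: the four-term `LEXICON` is a hypothetical stand-in for the bESD, and weighting each term count by its valence is an assumed encoding, not one the excerpt specifies.

```python
import re
from collections import Counter

# Hypothetical miniature lexicon (term -> valence); a stand-in for the bESD.
LEXICON = {"good": 1.0, "great": 2.0, "bad": -1.0, "terrible": -2.0}
TERMS = sorted(LEXICON)  # fixed term order, so every vector has the same layout

def document_vector(text):
    """Return one vector per document: term count times valence, per lexicon term."""
    counts = Counter(re.findall(r"[a-z']+", text.lower()))
    return [counts[term] * LEXICON[term] for term in TERMS]

vec = document_vector("Great plot, good cast, but a bad ending. Really great!")
```

Vectors built this way for the training and testing documents would then be the inputs presented to the SOM.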

line dynamic **model** using **clustering** algorithm and periodic user transaction for mining user behavior prediction for Web personalization **system**. A complete framework for mining evolving user profiles in dynamic Websites is proposed in [11]. They also described how to enrich the discovered user profiles with explicit information need that is inferred from search queries extracted from Web log data. In [12], Jiyang Chen et al. have proposed a visualization tool to visualize Web graphs, representations of Web structure overlaid with information and pattern. They also proposed Web graph algebra to manipulate and combine Web graphs and their layers in order to discover new patterns in an ad hoc manner. In [13], Esin Saka et al. have proposed a hybrid approach which combines the strengths of Spherical K-Means algorithm for fast **clustering** of high dimensional datasets in the original feature domain and the flock-**based** algorithm which iteratively adjusts the position and speed of dynamic flock’s agents on a visualization plane. The hybrid algorithm decreases the complexity of FClust from quadratic to linear with further improvements in the cluster quality. In [14], S Park et al. have investigated the use of fuzzy ART neural network, to enhance

Researchers who have studied landslides have applied models for predicting landslide hazard. Some of them use probabilistic models. The models used include logistic regression [15], neuro-fuzzy [8], fuzzy logic [9], geographic information **system** [9][10], **self**-**organizing** **map** [12] and artificial neural network models [15]. Artificial neural networks (ANN) have broad applicability to real-world business problems. In fact, they have been successfully applied in many industries [14]. Since neural networks are best at identifying patterns or trends in data, they are well suited for prediction or forecasting, including sales forecasting, industrial process control, customer research, data validation, risk management [15][16] and target marketing [17]. The Kohonen **self**-**organizing** map (SOM), an unsupervised learning algorithm of ANN, has been applied as a **clustering** algorithm for high-dimensional data, as well as an alternative tool to classical multivariate statistical techniques [18]. The SOM algorithm is also employed as a tool for eco-morphological investigation concerning the life history of fish [19], as a **model** for post-fire hydrologic and geomorphic hazards [20], as a tool for forecasting the reservoir inflow during typhoon periods [16], and for visualizing the topical structure of the medical sciences [21]. On the other hand, the K-means **clustering** algorithm has been presented for image segmentation **based** on an adaptive approach [22], landslide detection [7] and spatial prediction for landslide hazard [13]. As evidenced by the above references, modeling with SOM and K-means has recently been applied to a broad range of geo-environmental fields.

Nowadays, multi-agent systems are frequently used to control large components of IT systems. Since these systems are large and both use and produce big data, having people control multi-agent systems is no longer practical. **Self**-organization, in which agents organize themselves, has therefore been proposed as a solution. In a **self**-**organizing** multi-agent **system**, the desired behavior of the **system** emerges **based** on the local behavior of the agents. However, emergence does not always turn out as desired and can lead the **system** to behave in ways we do not want. On the other hand, **self**-adaptation is a centralized, top-down process in which a central control loop can monitor, analyze, plan and execute decisions on a controlled element. Therefore, a combination of **self**-organization and **self**-adaptation can help the **system** stay **self**-organized while the bad results of emergence are controlled by **self**-adaptation features. Recently, combining **self**-organization and **self**-adaptation features has become a trend. In this paper, we propose an organization **model** for **self**-**organizing** multi-agent **systems** **based** on **self**-adaptation features to control emergence.

The **self**-organized **clustering** procedure is performed by constructing the SOM clusters using both product properties and required operations. The data (properties and operations) are encoded as described in section 1.1 which results in a set of 58 attributes describing each product. In the next step, the SOM algorithm is applied to construct a mapping from this 58-dimensional space into a 2-dimensional grid representing the initial layout. The result of this stage is the initial cellular layout (Fig. 1) with products distributed in cells according to the SOM **clustering** algorithm.

The GEO3DSOM, on the other hand, outperforms the standard SOM in providing a spatially coherent grouping of the data. Analysis of both the artificial and the real-life data sets showed that the GEO3DSOM is capable of a more detailed grouping of both regularly and irregularly distributed spatial data, compared to the standard SOM with geographical coordinates included in the data set. As is to be expected, the information about the data set at hand obtained through the component planes and the U-matrices by both versions of the SOM is very similar. The pseudo-coloring applied to both variants of the SOM algorithm, however, shows some clear differences between the two techniques. The explicit incorporation of the geographic coordinates in the GEO3DSOM algorithm results in greater differences between groups in the U-matrix. This results in an increased resolution in the pseudo-coloring of the units. In the GEO3DSOM both the geology-related subdivision and the vertical subdivision are apparent from the coloring, while in the standard SOM the coloring is dominated only by the vertical subdivision **based** on oxygen, iron and manganese. Within the samples having elevated oxygen and nitrate concentrations, a subtle differentiation between Brussels and Diest samples can be seen, while this differentiation is completely absent in the group of samples with low oxygen and nitrate concentrations. Both coloring schemes do, however, identify the presence of outliers in the Brussels sands aquifer. In conclusion, it can be stated that both techniques succeed very well in providing more insight into the quality data set, highlighting the main differences and pointing out anomalous wells. Incorporating the spatial correlation by including the geographic coordinates in the BMU-selection procedure of GEO3DSOM, however, provides the advantage of an increased resolution, while still maintaining a generalization of the data set.

Kohonen **Self** **Organizing** Feature Maps, or SOMs, provide a way of representing multidimensional data in much lower dimensional spaces, usually one or two dimensions. This process of reducing the dimensionality of vectors is essentially a data compression technique known as vector quantization. In addition, the Kohonen technique creates a network that stores information in such a way that any topological relationships within the training set are maintained. One of the most interesting aspects of SOMs is that they learn to classify data without any external supervision whatsoever. A SOM consists of neurons or **map** units, each having a location in a continuous multi-dimensional measurement space as well as in a discrete two-dimensional output space. The data collection is repeatedly presented to the SOM until a topology-preserving mapping from the multi-dimensional measurement space into the two-dimensional output space is obtained. This dimensionality reduction property of the SOM makes it especially suitable for data visualization.
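
The training loop described above can be sketched in a few lines of NumPy. This is a minimal illustration, not any particular paper's implementation: the linear decay of the learning rate and neighbourhood width, the Gaussian neighbourhood function, and the names `train_som` and `best_matching_unit` are all assumptions made for the example.

```python
import numpy as np

def train_som(data, rows, cols, n_iter=2000, lr0=0.5, sigma0=None, seed=0):
    """Train a rows x cols SOM; returns (prototype weights, grid coordinates)."""
    rng = np.random.default_rng(seed)
    sigma0 = sigma0 if sigma0 is not None else max(rows, cols) / 2.0
    # Location of each map unit in the discrete two-dimensional output space.
    grid = np.array([(r, c) for r in range(rows) for c in range(cols)], dtype=float)
    # Location of each unit in the measurement space, initialised from random samples.
    weights = data[rng.integers(0, len(data), size=rows * cols)].astype(float)
    for t in range(n_iter):
        frac = t / n_iter
        lr = lr0 * (1.0 - frac)                      # learning rate decays towards 0
        sigma = sigma0 * (1.0 - frac) + 0.5 * frac   # neighbourhood shrinks towards 0.5
        x = data[rng.integers(0, len(data))]         # present one random sample
        bmu = np.argmin(((weights - x) ** 2).sum(axis=1))   # best matching unit
        grid_d2 = ((grid - grid[bmu]) ** 2).sum(axis=1)     # grid distance to the BMU
        h = np.exp(-grid_d2 / (2.0 * sigma ** 2))           # Gaussian neighbourhood
        weights += lr * h[:, None] * (x - weights)          # pull units towards x
    return weights, grid

def best_matching_unit(weights, x):
    """Index of the map unit whose prototype is closest to x."""
    return int(np.argmin(((weights - x) ** 2).sum(axis=1)))

# Toy data: two well-separated blobs in a 5-dimensional measurement space.
rng = np.random.default_rng(1)
blob_a = rng.normal(0.0, 0.3, size=(100, 5))
blob_b = rng.normal(5.0, 0.3, size=(100, 5))
data = np.vstack([blob_a, blob_b])

weights, grid = train_som(data, rows=4, cols=4)
bmu_a = best_matching_unit(weights, blob_a.mean(axis=0))
bmu_b = best_matching_unit(weights, blob_b.mean(axis=0))
```

Replacing each sample by the index (or grid position) of its best matching unit is the vector quantization step; plotting `grid[bmu]` for each sample gives the two-dimensional view used for visualization.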

The KSOM algorithm has succeeded in **clustering** and solving complex problems in many areas, especially when they involve high-dimensional data. However, the KSOM algorithm is incapable of handling the feature similarity problem efficiently, which leads to a scattered distribution of data in the **clustering** results. Modifying the distance measurement in the KSOM algorithm using the pheromone approach from Ant Colony Optimization helps to cluster the datasets efficiently. All data with similar features are closely grouped and located in one cluster, while dissimilar data are placed separately in another cluster. However, even though all datasets were clustered correctly, there are a few overlapping clusters in the results and the separation boundaries between clusters are still very close. Furthermore, this modified algorithm will be fine-tuned to improve the **clustering**

Figure 9 shows this information computed by the social interaction (a) and SOM (b). As shown in Figure 9(a), three classes can be seen on the **map** produced by the social interaction. On the other hand, with the SOM, as in Figure 9(b), the boundaries between the three classes were not always clear. On the lower left-hand side of the maps produced by the social interaction and SOM, neurons with the highest information on input neurons appeared. This part corresponded to the year 2011, when mostly mini-cars were produced. This indicates that the year 2011 showed the most explicit characteristic of all periods: the number of mini-cars produced was much larger than that of any other type of car.

Artificial Neural Networks have been applied to many problems [3][11], and have demonstrated their superiority over classical methods when dealing with noisy or incomplete data. One such application is for data compression. Neural networks seem to be well suited to this particular function, as they have an ability to preprocess input patterns to produce simpler patterns with fewer components [1]. This compressed information (stored in a hidden layer) preserves the full information obtained from the external environment. The compressed features may then exit the network into the external environment in their original uncompressed form. The main algorithms that shall be discussed in ensuing sections are the Back propagation algorithm and the Kohonen **self**-**organizing** maps.

Here is an alternative method for the **self**-**organizing** **system** range. A cluster is made up of a collection of nodes, and the characteristics of these nodes vary with the number of nodes present in each cluster. In this scenario, the nodes are the vehicles participating in a particular cluster. A cluster identity can be used to distinguish between different clusters. **Clustering** is usually used in comparatively dense regions. The proposed **system** consists of a Road Side Unit (RSU) with its communication range. The communication range of this RSU is divided into multiple clusters. **Clustering** is done according to vehicle density. Any one of the vehicles in the cluster acts as the cluster head. The **clustering** decision is made by the road side unit. Each vehicle in a cluster communicates with nearby vehicles and also with the RSU. The RSU is capable of communicating with nearby RSUs.
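
The RSU-side decision can be illustrated with a small sketch. The excerpt does not give the actual clustering rule, so everything specific here is an assumption: vehicles are reduced to one-dimensional road positions, a cluster is capped at `max_size` vehicles (so denser stretches yield more clusters), and the middle member is picked as head simply because any member may serve. `Cluster` and `rsu_cluster` are hypothetical names.

```python
from dataclasses import dataclass

@dataclass
class Cluster:
    cluster_id: int   # identity used to distinguish clusters
    members: list     # vehicle ids, ordered by road position
    head: str         # vehicle id acting as cluster head

def rsu_cluster(positions, rsu_pos, rsu_range, max_size):
    """Group the vehicles inside the RSU's range into clusters of at most max_size."""
    # Keep only vehicles within communication range, ordered along the road.
    in_range = sorted(
        (pos, vid) for vid, pos in positions.items()
        if abs(pos - rsu_pos) <= rsu_range
    )
    clusters = []
    for start in range(0, len(in_range), max_size):
        members = [vid for _, vid in in_range[start:start + max_size]]
        head = members[len(members) // 2]   # arbitrary choice: the middle member
        clusters.append(Cluster(len(clusters), members, head))
    return clusters

# Hypothetical vehicle positions in metres along the road; the RSU sits at 0.
positions = {"v1": 10.0, "v2": 12.0, "v3": 15.0, "v4": 40.0, "v5": 300.0}
clusters = rsu_cluster(positions, rsu_pos=0.0, rsu_range=100.0, max_size=2)
```

In this toy run `v5` lies outside the assumed 100 m range and is left unclustered; it would be handled by a neighbouring RSU.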

Many researchers have put forth their views by publishing papers on Big Data and SOM. The speed and the volume with which data is generated in today's world make it hard for traditional methods to analyze and organize this data in a structured way. The author in [1] therefore discusses various Big Data challenges and solutions, addressing the problems at hand through the **Map** Reduce framework over the Hadoop Distributed File **System** (HDFS). It also explains various Big Data opportunities and the detailed architecture of **Map** Reduce. In [2] the author describes the analysis of the distribution of data as well as the scheduling of tasks for execution and effective communication. He describes **Map** Reduce as a programming **model** for data-intensive applications and has also proposed a **model** for scheduling divisible loads. In [3], Kohonen **Self** **Organizing** **Map** Neural Networks (KSOM-NN) are used to study cluster formation, and simulations on Wireless Sensor Networks (WSNs) are carried out to determine the performance with respect to given application parameters and requirements. Simulations are carried out over a number of sensor network parameters to understand the chaotic environment with respect to the parameters of WSNs. Electronic Health Records (EHRs) have come up in today's world because digitization has become very necessary. Hence the author

Security in mobile ad hoc networks can be provided either using a single authority domain or through full **self**-organization. In [8], security for vehicular networks was introduced within a game theoretic framework using input centrality measures **based** on single authority domain. In [9], a random network **model** with the neighboring nodes possessing primary security association was introduced to improve the throughput **based** on **self** organized public key scheme. A secure high throughput multicast mechanism for wireless mesh network was introduced in [10] using a measurement and accusation **based** technique.

Students' interactions with e-learning vary according to their behaviours, which in turn have different effects on their academic performance. Some students participate in all online activities while others participate only partially, depending on their learning behaviours. It is therefore important for lecturers to know the behaviours of their students. But this cannot be done manually because of the unstructured raw data in the students' log files, and understanding each individual student's learning behaviour is tedious. To solve the problem, a data mining approach is required to extract valuable information from the huge volume of raw data. This research investigated the performance of the **Self**-**organizing** **Map** (SOM) in analyzing students' e-learning activities, with the aim of identifying clusters of students who use the e-learning environment in similar ways, taking the log files of their actions as input. A study of Meaningful Learning Characteristics and their significance for students' learning behaviours was carried out using multiple regression analysis. The SOM **clustering** technique was then used to group the students into three clusters, where each cluster contains students who interact with the e-learning in similar ways. The behaviours of students in each cluster were analyzed and their effects on learning success were identified. The analysis shows that students in Cluster1 have the highest number of interactions with the e-learning (Very Active) and the highest mean final score, 91.12%. Students in Cluster2 have fewer interactions than those in Cluster1 and a mean final score of 75.65%. Finally, students in Cluster3 have the fewest interactions of all the clusters, with a mean final score of 36.57%. The research shows that students who participate more in Forum activities emerged as the most successful learners overall, while students with the lowest records of interaction have the lowest performance.
The research can be used for early identification of low-performing learners in order to improve their mode of interaction with e-learning.

A wireless sensor network consists of a large number of sensor nodes deployed in a remote region to sense events inside the phenomenon or very close to it. These sensor nodes are generally equipped with sensing, data processing and communication components for gathering data from a field [1]. A wireless sensor network typically consists of a base station and a group of sensor nodes (see Figure 1). When sensors are deployed, they **self** organize to form a network and then start sensing the surroundings in order to transmit the gathered data to the base station. Since sensor nodes have limited energy, the **self** organization protocols must focus primarily on the energy/power conservation of large-scale WSNs. These networks are used for the systematic collection of information related to the environment, such as intrusion detection, weather prediction and detection of environmental conditions. In recent years, much effort has been devoted to maximizing network lifetime, as it is impractical to change or replace the exhausted batteries of sensor nodes [2]. Such a constraint must be taken into account in the design of wireless sensor networks to minimize energy consumption while allowing the exchange of large amounts of data between nodes and the base station. These two competing objectives reveal the importance of efficient **self** organization in WSNs. Therefore, many algorithms have been proposed for efficient energy management in order to maximize the lifetime of wireless sensor networks.

In this paper, a novel Game theory-**based** data **clustering** algorithm is proposed by combining a new initialization method, Game theory, and the SOM algorithm. The performance of the proposed NGTSOM is evaluated using several synthetic and real datasets, and the results show a significant accuracy improvement for the proposed data **clustering** **model**. This is due to the more competitive game provided by the proposed strategies. It resolves the major problem of existing **clustering** techniques, in which the weight vectors of non-winning neurons are far from the input patterns and have no chance to contribute to the learning phase. The proposed NGTSOM was compared with K-means, NG, SOM and SOM **clustering** algorithm. The comparison results demonstrate the improved **clustering** quality of the proposed NGTSOM.

In this paper, a new algorithm named **self**-**organizing** **map** for **clustering** social networks (SOMSN) is proposed for detecting such groups. SOMSN is **based** on the **self**-**organizing** **map** neural network. In SOMSN, by adopting a new weight-updating method, a social network is divided into different clusters according to the topological connections of each node. These clusters are the communities mentioned above in social networks. To show the effectiveness of the presented approach, SOMSN has been applied to several classic social networks with a known number of communities and a defined structure. The results of these experiments show that the **clustering** accuracy of SOMSN is superior to that of the traditional algorithms.
