Abstract— This paper proposes a **weight** **initialization** strategy for a discrete-time recurrent neural network model. It is based on analyzing the recurrent network as a nonlinear system, and choosing its initial weights to put this system in the boundaries between different dynamics, i.e., its bifurcations. The relationship between the change in dynamics and training error evolution is studied. Two simple examples of the application of this strategy are shown: the detection of a 2-pulse temporal pattern and the detection of a physiological signal, a feature of a visual evoked potential brain signal.

In this paper, an efficient **weight** **initialization** method is proposed using Cauchy’s inequality based on sensitivity analysis to improve the convergence speed in single hidden layer feedforward neural networks. The proposed method ensures that the outputs of hidden neurons are in the active region, which increases the rate of convergence. The weights are learned by minimizing the sum of squared errors and are obtained by solving a linear system of equations. The proposed method is simulated on various problems. In all the problems, the number of epochs and the time required by the proposed method are found to be the lowest compared with other **weight** **initialization** methods.

In the third phase, we start the algorithm with an **initialization** process to determine the number of tags to be used in the given area. The process continues by placing the tags at random in the same area. A mobile RFID reader then detects the tags three times, each time from a different location (point) in the given area, and this detection step is repeated until the tags have been detected from all three reader positions. We then calculate the positions of the tags using the equations discussed in Section 3. These equations are based on the received signal strength indication (RSSI) obtained during the detection process.
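The equations of Section 3 are not reproduced in this excerpt, so the following is only a minimal sketch of one common way to turn three RSSI readings into a tag position. It assumes a log-distance path-loss model and plain 2-D trilateration. The function names and the parameters `rssi_at_1m` and `path_loss_exponent` are hypothetical and are not taken from the paper.

```python
def rssi_to_distance(rssi, rssi_at_1m=-40.0, path_loss_exponent=2.0):
    """Estimate distance in metres from an RSSI value in dBm.

    Assumption: log-distance path-loss model with made-up default constants.
    """
    return 10 ** ((rssi_at_1m - rssi) / (10 * path_loss_exponent))


def trilaterate(p1, d1, p2, d2, p3, d3):
    """Solve for (x, y) given three reader positions and estimated distances.

    Subtracting the first circle equation from the other two gives a
    2x2 linear system, which is solved here with Cramer's rule.
    """
    (x1, y1), (x2, y2), (x3, y3) = p1, p2, p3
    a11, a12 = 2 * (x2 - x1), 2 * (y2 - y1)
    a21, a22 = 2 * (x3 - x1), 2 * (y3 - y1)
    b1 = d1**2 - d2**2 + x2**2 - x1**2 + y2**2 - y1**2
    b2 = d1**2 - d3**2 + x3**2 - x1**2 + y3**2 - y1**2
    det = a11 * a22 - a12 * a21
    if abs(det) < 1e-12:
        raise ValueError("reader positions are collinear")
    x = (b1 * a22 - a12 * b2) / det
    y = (a11 * b2 - a21 * b1) / det
    return x, y
```

With noisy RSSI the three circles rarely meet in a single point, so a least-squares fit over more than three readings is usually preferable in practice.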

two reference elements, where the leftmost element is considered the first reference element and the rightmost element the second. The actual direction-of-arrival covariance matrix is constructed using the auto-correlation and cross-correlation of the impinging signals. Eigenvalue decomposition is used to extract the actual steering vector based on the MVDR linear constraint. The interference null constraint is added so that the beamformer gives more attention to cancelling the interference directions. Inspired by the null-steering beamforming **weight**, a robust steering vector that accounts for both the Signal of Interest (SOI) and the interference directions is constructed. The proposed robust beamforming **weight** is then calculated to maximize the radiated power in the SOI look direction and to null the interference directions with deeper negative power. Thus, the output beamforming radiation pattern orients the main lobe, with unity gain, accurately in the direction of interest and blocks the interference directions with zero power.
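For orientation, the textbook MVDR and linearly constrained (null-steering) weights that the passage builds on can be written as below. This is the standard formulation, not the proposed robust weight, which is not given in this excerpt. The symbols are assumptions: $\mathbf{R}$ is the array covariance matrix, $\mathbf{a}(\theta_0)$ the SOI steering vector, and $\mathbf{a}(\theta_1), \dots, \mathbf{a}(\theta_K)$ the interference steering vectors.

```latex
% MVDR: minimize output power subject to unity gain in the SOI direction
\mathbf{w}_{\mathrm{MVDR}}
  = \frac{\mathbf{R}^{-1}\mathbf{a}(\theta_0)}
         {\mathbf{a}^{H}(\theta_0)\,\mathbf{R}^{-1}\,\mathbf{a}(\theta_0)},
\qquad
\mathbf{w}_{\mathrm{MVDR}}^{H}\,\mathbf{a}(\theta_0) = 1 .

% Adding explicit null constraints on the K interference directions (LCMV form)
\mathbf{C} = \bigl[\,\mathbf{a}(\theta_0)\;\; \mathbf{a}(\theta_1)\;\cdots\;\mathbf{a}(\theta_K)\bigr],
\qquad
\mathbf{f} = [\,1\;\;0\;\cdots\;0\,]^{T},
\qquad
\mathbf{w}_{\mathrm{LCMV}}
  = \mathbf{R}^{-1}\mathbf{C}\,\bigl(\mathbf{C}^{H}\mathbf{R}^{-1}\mathbf{C}\bigr)^{-1}\mathbf{f}.
```

The constraint $\mathbf{C}^{H}\mathbf{w} = \mathbf{f}$ is what yields unity gain toward the SOI and zero response toward each interferer.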

This work has presented a basic WFA algorithm for CVRP, which differs from basic meta-heuristic algorithms in its dynamic and self-adaptive solution population size and parameter tuning during the optimization process. It can overcome the drawbacks of both single-solution and multiple-solution based algorithms. The WFA-CVRP uses random steps for **initialization** and applies three neighborhood search strategies for splitting and moving (the move, swap and 2-opt operators), which are selected randomly to generate a number of neighbors as sub-flows from the original solution. The experimental results demonstrate that WFA-CVRP is competitive with other algorithms in the literature in terms of solution quality. The WFA for CVRP has much room for further improvement, especially in exploration when deciding on splitting and in exploitation after flow merging. For future work, we suggest using the nearest neighbor heuristic to generate the initial solution, so that the algorithm starts from a good-quality solution. Furthermore, WFA can be hybridized with other local search algorithms to balance intensification and diversification of the search process.
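The three neighborhood operators named above can be sketched as follows. This is a minimal illustration on a single route stored as a list of customer ids; it is not the authors' implementation, and the function names and the `random_neighbors` helper are hypothetical. Capacity checks and cost evaluation are omitted.

```python
import random


def move(route, i, j):
    """Remove the customer at position i and reinsert it at position j."""
    new_route = route[:]
    customer = new_route.pop(i)
    new_route.insert(j, customer)
    return new_route


def swap(route, i, j):
    """Exchange the customers at positions i and j."""
    new_route = route[:]
    new_route[i], new_route[j] = new_route[j], new_route[i]
    return new_route


def two_opt(route, i, j):
    """Reverse the segment between positions i and j (inclusive)."""
    return route[:i] + route[i:j + 1][::-1] + route[j + 1:]


def random_neighbors(route, count, rng):
    """Generate `count` neighbors, picking one of the three operators at random each time."""
    operators = (move, swap, two_opt)
    neighbors = []
    for _ in range(count):
        i, j = sorted(rng.sample(range(len(route)), 2))
        neighbors.append(rng.choice(operators)(route, i, j))
    return neighbors
```

A call such as `random_neighbors([1, 2, 3, 4, 5], 4, random.Random(0))` returns four permutations of the original route, which would play the role of the sub-flows.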

The performance of DSR protocol under black hole attack was analyzed and a solution called Enhanced Dynamic Source Routing (EDSR) protocol to detect it was suggested by Mohanapriya and Krishnamurthi [19]. This is a new ACK based detection technique capable of detecting when false data packets reached a destination thereby detecting black hole attacks. Experiments showed that the new protocol achieved routing security with 16% increase in packet delivery ratio and 31% reduction in packet loss rate compared to standard DSR under black hole attack. The new technique is light **weight** as it did not involve high computational complexity. It is scalable as it achieved better packet delivery ratio than standard DSR in a network of 200 nodes.

This paper develops an NN algorithm that uses CG and combines the merits of PSO and GA for **weight** **initialization**; the algorithm is called the conjugate gradient-neural network-PSO-genetic algorithm (CN-PSOGA). The number of Juanda flight passengers (DATA 1) and the amount of rainfall (DATA 2) are used as data for the CN-PSOGA simulation. The aim is to assess how well the CN-PSOGA algorithm finds solutions with minimum error, and CN-PSOGA is then compared with other algorithms.

Business process discovery is a research field assembling techniques that allow representation of a business process, taking as input an event log where process data are stored. Several advances have been made in process discovery, but as data volume starts to **weight** considerably, improvement of discovery methods is crucial to follow up. In this paper, we discuss our new method, inspired from image processing techniques. Adapted to voluminous data logs, our method relies on generation of a Petri net using a matrix representation of data. The principal idea behind our approach consists of using several concepts: partial & feature blocks, filters as well as the adaptation of combinatory logic concepts to process mining in the perspective of extracting a business process model from a big event log.

In order to assess the effectiveness of the proposed method we designed and conducted three different suites of experiments. The first suite deals with the comparison of the performance of the proposed method against six different **weight** **initialization** methods which are based on random selection of initial weights from predefined intervals. The benchmarks used for this first suite mainly concern classification problems, while one of them deals with regression and a second with prediction of a highly nonlinear phenomenon. Moreover, a number of experiments were executed on function approximation and they are presented in a separate subsection. The second suite constitutes a thorough comparison of the proposed LIT-Approach with the well known **initialization** procedure proposed by Nguyen & Widrow (1990).

Scholars use the BP neural network combined with the EMD method to make predictions. However, the BP neural network suffers from local minima, a slow convergence rate and poor generalization ability. The AdaBoost algorithm can improve the prediction accuracy of a set of weak predictors and solves many problems that a single weak predictor does not predict well. Therefore, in order to make up for the limitations of BP neural network **weight** **initialization** and sample data, and to improve the prediction accuracy of the BP neural network and the EMD method, this paper proposes a BP_AdaBoost time-series prediction method based on the EMD method and applies the model to crude oil. The empirical results show that the prediction accuracy of the model is better than that of the ARIMA model, the BP neural network and the EMD-BP combined model.

A hidden Markov model (HMM) can be used to disambiguate tags of individual tokens by maximizing corpus likelihood using the expectation maximization (EM) algorithm. Our approach is motivated by a suite of oracle experiments that demonstrate the effect of **initialization** on the final tagging accuracy of an EM-trained HMM tagger. We show that initializing EM with accurate transition model parameters is sufficient to guide learning toward a high-accuracy final model.

ABSTRACT: Mahalanobis distance has been an appreciated choice for clustering in mixed Gaussian distribution field. With elliptical cluster formation, it has only one area of improvement that is, choosing proper **initialization**. We deal with this problem by proposing a deterministic **initialization** for k means to be used with Mahalanobis distance for fast yet effective results. We further compare our proposal with Melynkov and Melynkov’s algorithm. Also, evaluation of the proposal with benchmark datasets is done.

deviation of the accuracy is an alarming 8.3%. In light of this sensitivity to **initialization**, it is compelling to consider unsupervised models with concave log-likelihood functions, which may provide stable, data-supported initializers for more complex models. In this paper, we explore the issues involved with such an expedition and elucidate the limitations of such models for unsupervised NLP. We then present simple concave models for dependency grammar induction that are easy to implement and offer efficient optimization. We also show how linguistic knowledge can be encoded without sacrificing concavity. Using our models to initialize the DMV, we find that they lead to an improvement in average accuracy across 18 languages.

The MSEs corresponding to different SNRs are plotted in Figure 10, which shows that the joint correction provides a very small MSE only when the SNR is above −12 dB. As mentioned in the last section, in the joint minimum entropy correction, coarse motion estimation for **initialization** is essential, as only good **initialization** can prevent the coordinate descent algorithm from being trapped in a local minimum of the entropy optimization. In these experiments, five iterations are used in the **initialization** estimation and order determination according to the procedure in Figure 4; the CPU time consumed by coarse motion estimation at the different SNRs is 1.13, 1.16, 1.86, and 2.17 s, respectively. For clarity, the estimated radial ranges from the coarse motion estimation under different SNRs are presented in Figure 11. From Figure 11, one can note that effective coarse motion estimation is achievable even when the SNR decreases to −10 dB, and the estimation accuracy degrades slightly as the noise becomes stronger. For this dataset, the initialized estimates give high accuracy until the SNR decreases below −12 dB, which also leads to subsequent failure of the coordinate descent algorithm for fine translational motion correction, as we mentioned before, because the relationship between

At the re-**initialization** point, excessive pheromone values are decreased subject to the following requirements: (i) the difference between individual pheromone values and the pheromone arithmetic mean is reduced, (ii) the reduction is directly proportional to the size of that difference, and (iii) the overall pheromone value is reduced. In this case a non-linear transformation is used (14).

To validate the effectiveness and efficiency of the EM and demonstrate the influence of the SMHT on the metaheuristic algorithm, computational experiments were conducted on 30 benchmark datasets available from the OR-library (http://people.brunel.ac.uk/~mastjjb/jeb/orlib/jobshopinfo.html), including FT, LA, ORB, ABZ, SWV, and YN, and a benchmark dataset presented in [27] called Willem. These experiments were programmed and executed in Matlab9 on an Intel Core2 Duo P8600 2.4 GHz processor with 4 GB of RAM. The parameters of the SMHT and EM were tuned by running multiple simulations on some datasets. The size of the initial population produced for each benchmark was twice the number of jobs in that benchmark. Random **initialization** and the SMHT were run 10 times on the datasets and the best initial population

During training we report the out-of-sample NMI, calculated by holding the word proportions φ fixed, running five sweeps of collapsed Gibbs sampling on the test set, and computing the topic for each document as the topic assigned to the most tokens in that document. Two Gibbs sweeps have been shown to yield good performance in practice (Yao et al., 2009); we increase the number of sweeps to five after inspecting the stability on our dataset. The variance of the particle filter is often large, so for each experiment we perform 30 runs and plot the mean NMI inside bands spanning one sample standard deviation in either direction. Fixed **Initialization**. Our first set of experiments has a similar parameterization to the exper-

The calculation flow of the improved algorithm is as follows. ① Initialize the population, including the population size and the **initialization** of each **weight** (generated according to the method a neural network uses to generate its initial **weight**), and encode it; ② Calculate the selection probability of each individual and sort the individuals; ③ Select good individuals to enter the next-generation population according to the roulette wheel selection strategy; ④ In the new-generation population, select individuals to undergo crossover and mutation according to the adaptive crossover probability and mutation probability, generating new individuals; ⑤ Insert the new individuals into the population and calculate their fitness; ⑥ Apply the immigration operator: judge whether the “prematurity phenomenon” is present and, if so, adopt the immigration strategy and return to the second step; ⑦ If a satisfactory individual is found, stop; otherwise, return to the second step.
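Steps ② and ③ above can be sketched as follows. This is a minimal illustration of standard fitness-proportionate (roulette wheel) selection, assuming non-negative fitness values where larger is better. The function names are hypothetical, and the adaptive crossover, mutation and immigration steps are not shown.

```python
import random


def selection_probabilities(fitness):
    """Step 2: selection probability of each individual, proportional to fitness."""
    total = sum(fitness)
    if total <= 0:
        raise ValueError("fitness values must be non-negative with a positive sum")
    return [f / total for f in fitness]


def roulette_select(population, fitness, count, rng):
    """Step 3: draw `count` individuals by spinning the roulette wheel."""
    probabilities = selection_probabilities(fitness)
    chosen = []
    for _ in range(count):
        spin = rng.random()          # uniform in [0, 1)
        cumulative = 0.0
        for individual, p in zip(population, probabilities):
            cumulative += p
            if spin < cumulative:
                chosen.append(individual)
                break
        else:
            # Guard against floating-point rounding leaving the sum just below 1.
            chosen.append(population[-1])
    return chosen
```

Individuals with zero fitness are never drawn, while fitter individuals may appear several times in the next generation.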

In natural language processing, words are typically represented by high-dimensional one-hot vectors. To reduce dimensionality and to be able to learn relationships between words, they are mapped into a lower-dimensional, continuous embedding space. Mathematically, this is done by multiplying the one-hot vector with the embedding matrix. Similarly, to receive a probability distribution over the vocabulary, a mapping from an embedding space is performed by a projection matrix followed by a softmax operation. These two matrices can be tied together in order to reduce the number of parameters and improve the results of NLMs (Inan et al., 2017; Press and Wolf, 2017). Since the row vectors in the embedding and projection matrices are effectively word vectors in a continuous space, we investigate such **weight** vectors in well-trained and fine-tuned NLMs. We observe that the learned word vector generally has a greater norm for a frequent word than an infrequent word. We then specifically examine the **weight** vector norm distribution and design initialization and normalization strategies to improve NLMs.
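The embedding lookup, the tied projection and the row norms discussed above can be illustrated with a toy example. This is a minimal sketch in plain Python; the three-word vocabulary, the matrix values and the function names are made up for illustration, and a real NLM would place a recurrent or attention network between the two mappings.

```python
import math


def embed(embedding, word_id):
    """Multiply a one-hot vector by the embedding matrix, which selects one row."""
    vocab_size, dim = len(embedding), len(embedding[0])
    one_hot = [1.0 if i == word_id else 0.0 for i in range(vocab_size)]
    return [sum(one_hot[i] * embedding[i][d] for i in range(vocab_size))
            for d in range(dim)]


def softmax(logits):
    """Numerically stable softmax."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def project(projection, hidden):
    """Map a hidden vector to a probability distribution over the vocabulary."""
    logits = [sum(w * h for w, h in zip(row, hidden)) for row in projection]
    return softmax(logits)


def row_norms(matrix):
    """Euclidean norm of each word vector (row)."""
    return [math.sqrt(sum(v * v for v in row)) for row in matrix]


# Toy vocabulary of 3 words with 2-dimensional embeddings (made-up values).
E = [[3.0, 4.0], [0.0, 1.0], [0.5, 0.5]]
P = E  # weight tying: the projection matrix is the same object as the embedding matrix

hidden = embed(E, 1)        # stand-in for the network's hidden state
probs = project(P, hidden)  # distribution over the 3 words
norms = row_norms(E)
```

Because `P` and `E` are the same object, any update to one is an update to the other, which is what removes the second matrix's parameters.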

The k-means algorithm remains one of the most widely used clustering methods, in spite of its sensitivity to the initial settings. This paper explores a simple, computationally inexpensive, deterministic method which provides k-means with initial seeds to cluster a given data set. It is simply based on computing the means of k samples with equal parts taken from the given data set. We test and compare this method to the related, well-known KKZ **initialization** algorithm for k-means, using both simulated and real data, and find it to be more efficient in many cases.
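One possible reading of this seeding rule is sketched below. It assumes that "k samples with equal parts" means splitting the data set, in its given order, into k consecutive parts of (nearly) equal size and using the mean of each part as a seed. The abstract does not specify how the parts are formed, so this is an illustration rather than the paper's method, and the function name is hypothetical.

```python
def equal_parts_seeds(data, k):
    """Return k deterministic seeds: the mean of each of k consecutive equal-sized parts.

    Assumption: parts are taken in the order the data is given; the source
    abstract does not say whether the data is sorted or otherwise arranged first.
    """
    n = len(data)
    if not 1 <= k <= n:
        raise ValueError("k must be between 1 and the number of points")
    dim = len(data[0])
    seeds = []
    for part_index in range(k):
        start = part_index * n // k
        stop = (part_index + 1) * n // k
        part = data[start:stop]
        seeds.append([sum(point[d] for point in part) / len(part)
                      for d in range(dim)])
    return seeds
```

Because no random numbers are involved, repeated runs on the same data give the same seeds, which is the property the abstract emphasizes.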