stochastic gradient adaptive filter algorithms

The Use of LMS and RLS Adaptive Algorithms for an Adaptive Control Method of Active Power Filter

This paper deals with adaptive control of shunt active power filters (SAPF). Systems driven this way are designed to improve power quality in industrial networks. The authors focus on implementing two basic representatives of adaptive algorithms: first, the LMS (least mean squares) algorithm with stochastic gradient adaptation, and second, the RLS (recursive least squares) algorithm with recursive optimal adaptation. The system examined by the authors can be used for non-linear loads and for appliances with rapid fluctuations in reactive and active power consumption. The proposed system adaptively reduces distortion, dips, and changes in the supply voltage (flicker). Real signals for measurement were obtained at a three-phase experimental workplace. The experimental results indicate that, with certain adaptive algorithms, the examined AHC system shows very good dynamics, resulting in a much faster transition when the AHC is connected or disconnected, or when the harmonic load on the network changes. The experiments are evaluated from several points of view, mainly the convergence time and steady-state error of the investigated adaptive algorithms, and finally the total harmonic distortion (THD). The article presents a comparison of the most frequently used adaptive algorithms.
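The stochastic gradient (LMS) adaptation named in this abstract can be sketched in a few lines. This is a generic, noise-free system-identification toy and not the paper's SAPF controller; the function name `lms_identify`, the step size `mu`, and the three-tap test system are illustrative assumptions.

```python
import random

def lms_identify(unknown, mu=0.05, steps=2000, seed=0):
    """Identify an FIR system with the LMS stochastic-gradient update."""
    rng = random.Random(seed)
    n = len(unknown)
    w = [0.0] * n   # adaptive tap weights
    u = [0.0] * n   # most recent input samples, newest first
    for _ in range(steps):
        u = [rng.gauss(0.0, 1.0)] + u[:-1]             # shift in a new white-noise sample
        d = sum(h * x for h, x in zip(unknown, u))     # desired response from the unknown system
        y = sum(wi * x for wi, x in zip(w, u))         # current filter output
        e = d - y                                      # a priori error
        w = [wi + mu * e * x for wi, x in zip(w, u)]   # w <- w + mu * e * u
    return w

weights = lms_identify([0.5, -0.3, 0.1])
```

A smaller `mu` slows convergence but lowers the steady-state error once measurement noise is present, which is the trade-off the convergence-time and steady-state-error comparison in the abstract is about.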

Quaternion Information Theoretic Learning Adaptive Algorithms for Nonlinear Adaptive

The QKSIG algorithm minimizes Shannon’s entropy of the error between the filter output and the desired response, and minimizes the divergence between the joint densities of input-desired and input-output pairs. The SIG technique reduces the computational complexity of the error entropy estimation. Here, ITL with the SIG approach is applied to quaternion adaptive filtering for three reasons. First, it reduces the computational complexity of the algorithm compared to our previous work, the quaternion kernel minimum error entropy algorithm (QKMEE). Second, it improves the filtering performance by considering the coupling within the dimensions of the quaternion input. Third, it performs better in biased or non-Gaussian signal and noise environments due to the ITL approach. We present convergence analysis and steady-state performance analysis results for the new algorithm (QKSIG). Simulation results show the behavior of QKSIG in quaternion non-Gaussian signal and noise environments compared to existing algorithms such as the quadruple real-valued kernel stochastic information gradient (KSIG) and quaternion kernel LMS (QKLMS) algorithms.

On an Adaptive Filter based on Simultaneous Perturbation Stochastic Approximation Method

Abstract: In this paper, the simultaneous perturbation stochastic approximation (SPSA) algorithm is used to seek optimal parameters in an adaptive filter developed for assimilating observations in very high dimensional dynamical systems. It is shown that SPSA can achieve performance similar to that of classical optimization algorithms, with better performance for non-linear filtering problems as more and more observations are assimilated. The advantage of SPSA is that at each iteration it requires only two measurements of the objective function to approximate the gradient vector, regardless of the dimension of the control vector. This technique offers promising perspectives for the future development of optimal assimilation systems in data assimilation for meteorology and oceanography.
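The "two measurements per iteration" property can be made concrete with a minimal SPSA sketch. The function names, the quadratic test objective, and the gain constants `a` and `c` are assumptions for illustration; the decay exponents 0.602 and 0.101 are the values commonly recommended in the SPSA literature, not something stated in this abstract.

```python
import random

def spsa_gradient(f, theta, c, rng):
    """Estimate grad f(theta) from two evaluations of f, whatever len(theta) is."""
    delta = [rng.choice((-1.0, 1.0)) for _ in theta]       # random +/-1 perturbation
    plus = f([t + c * d for t, d in zip(theta, delta)])    # measurement 1
    minus = f([t - c * d for t, d in zip(theta, delta)])   # measurement 2
    return [(plus - minus) / (2.0 * c * d) for d in delta]

def spsa_minimize(f, theta, steps=500, a=0.1, c=0.1, seed=0):
    """Minimize f by stepping along the negated SPSA gradient estimate."""
    rng = random.Random(seed)
    for k in range(1, steps + 1):
        ak = a / k ** 0.602   # decaying step size
        ck = c / k ** 0.101   # decaying perturbation size
        g = spsa_gradient(f, theta, ck, rng)
        theta = [t - ak * gi for t, gi in zip(theta, g)]
    return theta
```

A finite-difference gradient would need two evaluations per coordinate, which is what makes the simultaneous perturbation attractive when the control vector is very large.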

A Hybrid Ensemble Method for Accurate Breast Cancer Tumor Classification using State-of-the-Art Classification Learning Algorithms

In this paper, we propose a hybrid ensemble classifier by evaluating the performance of simple logistic regression learning, stochastic gradient descent learning, a multilayer perceptron network, the random decision tree method, the random decision forest method, the sequential optimization method for support vector machine learning, the K-nearest neighbor classifier, and Naive Bayes classification algorithms. We report the performance of these eight classifiers using different performance measures, i.e., accuracy, precision, recall, F1 score, F2 score, and F3 score. We then select the three classifiers with the best F3 score and propose a hybrid ensemble method using a voting mechanism.
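The selection-then-vote procedure can be sketched as follows. The excerpt does not say which voting rule is used, so hard majority voting is an assumption here, as are the helper names `f_beta`, `top_k_by_f3`, and `majority_vote`.

```python
from collections import Counter

def f_beta(precision, recall, beta):
    """F-beta score; beta > 1 weights recall more heavily than precision."""
    if precision == 0.0 and recall == 0.0:
        return 0.0
    b2 = beta * beta
    return (1.0 + b2) * precision * recall / (b2 * precision + recall)

def top_k_by_f3(scores, k=3):
    """scores maps classifier name -> (precision, recall); return the k best names by F3."""
    ranked = sorted(scores, key=lambda name: f_beta(*scores[name], 3.0), reverse=True)
    return ranked[:k]

def majority_vote(labels):
    """Return the most common label among the selected classifiers' predictions."""
    return Counter(labels).most_common(1)[0][0]
```

Ranking by F3 rather than F1 favors classifiers with high recall, which fits a screening setting where missing a malignant case costs more than a false alarm.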

De-Noising Of Speech Signal Using Adaptive Filter Algorithms

This is the Normalized Least Mean Square (NLMS) algorithm. It normalizes the update by the power of the input vector u(n). When the input vector has high power, the LMS filter suffers from gradient noise amplification. To overcome this problem, the adjustment applied to the tap-weight vector at iteration (n+1) is normalized by the input power. The step size of the filter is under the control of the designer. The algorithm supports the real-valued error e(n) as well as the complex-conjugate error e*(n) [2], [6], [7]. The mathematical relation of the filter is given below.
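The excerpt is cut off before the relation itself. A minimal sketch of the commonly used real-valued NLMS update, w(n+1) = w(n) + mu * e(n) * u(n) / (eps + ||u(n)||^2), is given here under that assumption; the function name `nlms_step` and the regularizer `eps` are illustrative and not taken from the paper.

```python
def nlms_step(w, u, d, mu=0.5, eps=1e-8):
    """One real-valued NLMS update; returns the new weights and the a priori error."""
    y = sum(wi * ui for wi, ui in zip(w, u))   # filter output
    e = d - y                                  # a priori error e(n)
    power = sum(ui * ui for ui in u)           # squared norm of the input vector
    scale = mu / (eps + power)                 # normalization removes the input-power dependence
    return [wi + scale * e * ui for wi, ui in zip(w, u)], e

# Scaling the input and desired signal by 100 leaves the weight update unchanged.
w_small, _ = nlms_step([0.0, 0.0], [3.0, 4.0], 10.0)
w_large, _ = nlms_step([0.0, 0.0], [300.0, 400.0], 1000.0)
```

In plain LMS the same scaling would multiply the weight change by 10,000, which is the gradient noise amplification the paragraph refers to.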

Advances in Monte Carlo Variational Inference and Applied Probabilistic Modeling

In this work, we present a novel approach to controlling the variance of the reparameterization gradient estimator in MCVI. Existing MCVI methods control this variance naïvely by averaging several gradient estimates, which becomes expensive for large data sets and complex models, with error that only diminishes as O(1/√N). Our approach exploits the fact that, in MCVI, the randomness in the gradient estimator is completely determined by a known Monte Carlo generating process; this allows us to leverage knowledge about this generative procedure to de-noise the gradient estimator. In particular, we construct a computationally cheap control variate based on an analytical linear approximation to the gradient estimator. Taking a linear combination of a naïve gradient estimate with this control variate yields a new estimator for the gradient that remains unbiased but has lower variance. Applying the idea to Gaussian approximating families, we observe a 20 to 2,000× reduction in variance of the gradient norm under various conditions, and faster convergence and more stable behavior of optimization traces.
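The control-variate idea can be illustrated on a one-dimensional toy problem. This is a sketch of the general principle only, not the estimator from the thesis: the integrand derivative f'(z) = sin(z) + z, the unit-variance Gaussian, and the name `gradient_samples` are all assumptions chosen so that the linear approximation has a known zero mean.

```python
import math
import random
import statistics

def gradient_samples(m, n, seed=0):
    """Sample naive and control-variate estimates of d/dm E[f(m + eps)], eps ~ N(0, 1).

    Toy integrand (an assumption): f'(z) = sin(z) + z, so f''(z) = cos(z) + 1.
    """
    rng = random.Random(seed)
    naive, corrected = [], []
    for _ in range(n):
        eps = rng.gauss(0.0, 1.0)
        g = math.sin(m + eps) + (m + eps)   # reparameterization gradient sample
        cv = (math.cos(m) + 1.0) * eps      # linear term of g around eps = 0, with E[cv] = 0
        naive.append(g)
        corrected.append(g - cv)            # still unbiased because the control variate has mean 0
    return naive, corrected

naive, corrected = gradient_samples(0.3, 20000)
print(statistics.pvariance(naive), statistics.pvariance(corrected))
```

Both sample means estimate the same quantity, while the corrected samples keep only the non-linear part of the noise, which is where the variance reduction comes from.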

A Framework for Analyzing Stochastic Optimization Algorithms Under Dependence

where A ∈ R^{n×p}, b ∈ R^n and x ∈ R^p. We generated the entries of A and b from the standard normal distribution and set n = 10^6, p = 1000, l = −1 and u = 1. This problem can be viewed as minimizing a sum of strongly convex functions subject to a polytope constraint. Such problems can be found in the shape-restricted regression literature. We compared the ASFW and PSFW with two variance-reduced stochastic methods, the variance-reduced stochastic Frank-Wolfe (SVRF) method [30] and the proximal variance-reduced stochastic gradient (Prox-SVRG) method [6, 35]. Both Prox-SVRG and SVRF are epoch-based algorithms. They first fix a reference point and compute the exact gradient at the reference point at the beginning of each epoch. Within each epoch, both algorithms compute variance-reduced gradients in every step using the control variates technique based on the reference point. The major difference between them is that in every iteration, the Prox-SVRG takes a proximal gradient step and the SVRF takes a Frank-Wolfe step. For detailed implementations of SVRF, we followed Algorithm 1 in [30] and chose the parameters according to Theorem 1 in [30]. For the Prox-SVRG, we followed the Algorithm in [35] and set the number of iterations in each epoch to be m = 2n and set the step size to be γ = 0.1/L, found by [35] to give the best results for Prox-SVRG, where n is the sample size and L is the Lipschitz constant of the gradient of the objective function. For ASFW and PSFW implementations, we followed Algorithm 3 and Algorithm 4 and used adaptive step sizes since we know the Lipschitz constants of the gradients of the objective functions. The number of samples that we used to compute stochastic gradients for ASFW and PSFW was set to be 1.04^k + 100 at iteration k. The linear optimization sub-problems in the Frank-Wolfe algorithms and the projection step in Prox-SVRG were solved by using the GUROBI solver.
We summarize the parameters that were used in the algorithms at iteration k and epoch t in Table 3.2. In this table, g^(k) is the stochastic gradient, L^(k) is the Lipschitz constant of the stochastic gradient at iteration k, d^(k) is the direction the algorithms take at iteration k and γ_max is the maximum of the possible step sizes (see Algorithm 3 and 4). In Prox-SVRG, L
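The epoch structure described for Prox-SVRG (reference point, exact gradient, variance-reduced inner steps, proximal step) can be sketched on a tiny box-constrained least-squares problem. The objective, the use of clipping as the proximal step for the box [l, u], the choice of L as the largest squared row norm, and the name `prox_svrg_box` are assumptions for illustration; the excerpt does not show the problem statement, and the authors solved the projection with GUROBI.

```python
import random

def prox_svrg_box(A, b, lo=-1.0, hi=1.0, epochs=50, seed=0):
    """Minimize (1/n) * sum_i 0.5 * (a_i . x - b_i)^2 subject to lo <= x_j <= hi."""
    rng = random.Random(seed)
    n, p = len(A), len(A[0])
    L = max(sum(a * a for a in row) for row in A)   # Lipschitz constant of each component gradient
    step = 0.1 / L                                  # gamma = 0.1 / L, as in the text

    def grad_i(x, i):
        r = sum(a * xj for a, xj in zip(A[i], x)) - b[i]
        return [r * a for a in A[i]]

    x = [0.0] * p
    for _ in range(epochs):
        ref = x[:]                                  # reference point for this epoch
        full = [0.0] * p
        for i in range(n):                          # exact gradient at the reference point
            full = [f + g / n for f, g in zip(full, grad_i(ref, i))]
        for _ in range(2 * n):                      # m = 2n inner iterations
            i = rng.randrange(n)
            g_x, g_ref = grad_i(x, i), grad_i(ref, i)
            v = [gx - gr + f for gx, gr, f in zip(g_x, g_ref, full)]       # variance-reduced gradient
            x = [min(hi, max(lo, xj - step * vj)) for xj, vj in zip(x, v)] # proximal (projection) step
    return x
```

The correction term `g_ref - full` has zero mean over the sampled index, so `v` stays an unbiased gradient estimate while its variance shrinks as `x` and `ref` approach the solution.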

arxiv: v6 [cs.lg] 18 Feb 2020

MeanSquare(t) + ε, where ε = 10^−7 for all our experiments. This adaptive method is adopted to reduce the variance among coordinates with historical gradient values. For ease of reference, we denote the first implementation as DC-ASGD-c (constant) and the second as DC-ASGD-a (adaptive). In addition to DC-ASGD, we also implemented ASGD and SSGD, which have been used in many previous works as baselines (Dean et al., 2012; Chen et al., 2016; Das et al., 2016). Furthermore, for the experiments on CIFAR-10, we used the sequential SGD algorithm as a reference model to examine the accuracy of parallel algorithms. However, for the experiments on ImageNet, we were not able to show this reference because it simply took too long for a single machine to finish the training. For the sake of fairness, all experiments started from the same randomly initialized model, and used the same strategy for learning rate scheduling. The data

A Stochastic 3MG Algorithm with Application to 2D Filter Identification

Stochastic approximation algorithms in machine learning; sparse adaptive filtering methods. Employ a Majorize-Minimize strategy allowing the use of a wide [...]

Analysis of the Sign Regressor Least Mean Fourth Adaptive Algorithm

A new adaptive algorithm, called the SRLMF algorithm, has been presented in this work. Expressions are derived for the steady-state EMSE in a stationary environment. A condition for the mean convergence is also found, and it turns out that the convergence of the SRLMF algorithm strongly depends on the choice of initial conditions. Also, expressions are obtained for the tracking EMSE in a nonstationary environment. An optimum value of the step-size μ is also evaluated. Moreover, an extension of the weighted variance relation is provided in order to derive expressions for the mean-square error (MSE) and the mean-square deviation (MSD) of the proposed algorithm during the transient phase. Monte Carlo simulations have shown that there is a good agreement between the theoretical and simulated results. The simulation results indicate that both the SRLMF algorithm and the LMF algorithm converge at the same rate resulting in no performance loss. The analysis developed in this paper is believed to make practical contributions to the design of adaptive filters using the SRLMF algorithm instead of the LMF algorithm in pursuit of the reduction in computational cost and complexity whilst still maintaining good performance.
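The computational saving claimed for SRLMF comes from replacing the regressor by its sign in the weight update. The excerpt does not state the update equations, so the forms below (w <- w + mu * e^3 * sign(u) for SRLMF and w <- w + mu * e^3 * u for LMF) are a sketch assuming the usual definitions; the function names are illustrative.

```python
def sign(x):
    """Return -1, 0, or 1 according to the sign of x."""
    return (x > 0) - (x < 0)

def srlmf_step(w, u, d, mu):
    """One sign-regressor LMF update: w <- w + mu * e^3 * sign(u)."""
    e = d - sum(wi * ui for wi, ui in zip(w, u))
    return [wi + mu * e ** 3 * sign(ui) for wi, ui in zip(w, u)], e

def lmf_step(w, u, d, mu):
    """Plain LMF update for comparison: w <- w + mu * e^3 * u."""
    e = d - sum(wi * ui for wi, ui in zip(w, u))
    return [wi + mu * e ** 3 * ui for wi, ui in zip(w, u)], e
```

With the sign regressor, the per-tap multiplication by `u` reduces to a sign flip of the common term `mu * e^3`, which is the reduction in cost the abstract refers to. The cubic error term also explains why convergence depends on the initial conditions: a large initial error makes the first updates very large.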

An information theoretic approach to speech feature selection applied to speech detection

Computation of this information metric for speech training sets has shown the first adaptive linear prediction coefficient from the stochastic gradient algorithm to be the optimal single [...]

Performance Analysis of LMS and Normalized LMS Adaptive Filter Algorithms

Abstract: Interference is the major problem in wireless communication. Suppressing interference or noise in a noisy signal is not an easy task when the frequency varies and Gaussian noise is present. Commonly used filters such as band-pass (BPF) and low-pass (LPF) filters are useful for de-noising when the bandwidth is fixed. If the bandwidth is not fixed and varies with time, a more advanced filter is required, such as an adaptive filter, which updates its weight vector automatically and adjusts to the bandwidth of the desired signal. This paper analyzes the performance of the LMS and Normalized LMS adaptive filters on the basis of output parameters such as the RMS value, mean, and median of the de-noised signal. This paper also gives brief details of the adaptive filter.

Acoustic Echo Cancellation by Adaptive Combination of Normalized Sub band Adaptive Filters by Using Stochastic Gradient Algorithm

The full-band and sub-band systems, the adaptive combination of sub-band adaptive filters, and its improvement were modeled in MATLAB Simulink, and many simulations for different inputs and numbers of sub-bands were performed. Several different adaptive algorithms can be used, but the most common one is the normalized least mean squares (NLMS). The order of the NLMS filters was chosen from N=64 to N=2. The designs were made in the MATLAB Simulink environment and the simulations were run for speech input. A reverberating effect was added to the input by an artificial Schroeder reverberator, which contained four comb filters in parallel and two all-pass filters connected in series. The first estimate of system capability is the SNR (signal-to-noise ratio). The second estimate of system capability is the (output error - voice input), but in order to measure its potential, the Echo Return Loss Enhancement (ERLE) should be computed; it is defined as the ratio of the power of the desired signal over the power of the residual signal.
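The ERLE definition in the last sentence translates directly into code. The excerpt defines ERLE as a plain power ratio; reporting it in decibels, as done below, is a common convention and an assumption here, as is the function name `erle_db`.

```python
import math

def erle_db(desired, residual):
    """Echo Return Loss Enhancement: desired-signal power over residual power, in dB."""
    p_desired = sum(x * x for x in desired) / len(desired)
    p_residual = sum(x * x for x in residual) / len(residual)
    return 10.0 * math.log10(p_desired / p_residual)

# A residual with one tenth of the desired amplitude has one hundredth of the power.
value = erle_db([1.0, -1.0, 1.0, -1.0], [0.1, -0.1, 0.1, -0.1])
```

A higher ERLE means the adaptive filter has removed more of the echo; in practice the two powers are usually computed over short sliding windows so the value can be tracked over time.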

Some Contributions to High Dimensional Mixed Effects Logistic Regression Models

In the non-convex log-likelihood aspect, our problem is non-convex in the loss function, while the well-studied non-convex problems in the statistical literature focus on cases where non-convex penalty functions are added to convex loss functions (Fan and Li (2001); Zhang (2010a); Loh and Wainwright (2017)). Recently, a few works have focused on non-convex loss functions. Schelldorfer et al. (2011) devised algorithms to solve high dimensional linear mixed effects models for both the fixed effects parameters β and the variance-covariance component Σ + σ²I; although solving for the variance components is non-convex, solving for β remains a convex problem. Some algorithms are commonly used for non-convex objective functions, for example the proximal gradient algorithm, the EM algorithm, the alternating direction method of multipliers (ADMM), and the iterated filtering algorithms. We will consider developing our algorithms based on the proximal gradient algorithms in chapter II and the iterated filtering algorithms in chapter IV. Our initial numerical experiments showed that one ADMM we developed performs similarly in estimation to our other algorithms, but converges much more slowly and takes significantly longer, so we will not pursue it further in this dissertation. The EM algorithm applied to our problem, which will be very different from the gradient-based algorithms we consider, can be an interesting independent research topic in the future. There is in general no guarantee of convergence for the proximal gradient and iterated filtering algorithms applied to non-convex problems. Recently, Bolte et al. (2006); Attouch and Bolte (2009); Attouch et al. (2010, 2013) have proposed a framework of analyzing proximal gradient algorithms solv-

Experimental Comparisons of Multi-class Classifiers

Gradient boosting, proposed by Friedman [35], is a method that improves on the basic boosting algorithm. The traditional boosting method adjusts the weights of correctly classified and misclassified samples based on gradient descent at each iteration. The major difference between gradient boosting and traditional boosting is that the purpose of each iteration is not merely to reduce the loss but to eliminate it: the new model at each iteration is fitted to the residuals of the previous model. Inspired by Breiman’s [6] randomized bagging idea, Friedman introduced stochastic gradient boosting, which trains each base classifier on a random subsample of the data.
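The two ideas in this paragraph, fitting each new model to the residuals and drawing a random subsample per round, can be sketched for one-dimensional regression with squared loss. Regression stumps as base learners, the learning rate `lr`, the 50% subsample rate, and the function names are assumptions for illustration, not the configuration used in the paper.

```python
import random

def fit_stump(xs, rs):
    """Fit a one-split regression stump to residuals rs; return (threshold, left_mean, right_mean)."""
    best = None
    for t in sorted(set(xs))[:-1]:   # every candidate split leaves both sides non-empty
        left = [r for x, r in zip(xs, rs) if x <= t]
        right = [r for x, r in zip(xs, rs) if x > t]
        lm, rm = sum(left) / len(left), sum(right) / len(right)
        sse = sum((r - lm) ** 2 for r in left) + sum((r - rm) ** 2 for r in right)
        if best is None or sse < best[0]:
            best = (sse, t, lm, rm)
    if best is None:                 # all inputs identical: fall back to a constant fit
        m = sum(rs) / len(rs)
        return xs[0], m, m
    return best[1], best[2], best[3]

def boost(xs, ys, rounds=100, lr=0.1, subsample=0.5, seed=0):
    """Stochastic gradient boosting for squared loss; returns base value, stumps, training predictions."""
    rng = random.Random(seed)
    base = sum(ys) / len(ys)
    preds = [base] * len(xs)
    stumps = []
    k = max(2, int(subsample * len(xs)))
    for _ in range(rounds):
        idx = rng.sample(range(len(xs)), k)      # random subsample without replacement
        res = [ys[i] - preds[i] for i in idx]    # residuals = negative gradient of squared loss
        t, lm, rm = fit_stump([xs[i] for i in idx], res)
        stumps.append((t, lm, rm))
        preds = [p + lr * (lm if x <= t else rm) for p, x in zip(preds, xs)]
    return base, stumps, preds
```

Setting `subsample=1.0` recovers ordinary gradient boosting; the subsampling makes each round cheaper and adds randomness between successive base learners.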

Performance Analysis of Basic Adaptive Filter Algorithms for DSP Processor in LabVIEW

The tremendous growth of development in the digital signal processing area has turned some of its specialized areas into fields themselves. If accurate information of the signals to be processed is available, the designer can easily choose the most appropriate algorithm to process the signal. When dealing with signals whose statistical properties are unknown, fixed algorithms do not process these signals efficiently. The solution is to use an adaptive filter that automatically changes its characteristics by optimizing the internal parameters. The adaptive filtering algorithms are essential in many statistical signal processing applications. The adaptive filter has the property that its frequency response is adjustable or modifiable automatically to improve its performance in accordance with some criterion, allowing the filter to adapt to changes in the input signal characteristics.

Random Forest and Stochastic Gradient Tree Boosting Based Approach for the Prediction of Airfoil Self-noise

The running time for the RF and stochastic tree boosting methods to grow 1000 trees was found to be 4 seconds and 3 seconds respectively, much lower than that of the ANN approach. A salient characteristic of an ensemble method is that the complexity of the method is directly proportional to the complexity of growing each tree present in the model. The complexity of RF cannot be exactly stated, but for a data set of N instances and n attributes the computational cost can be approximated by O(nN log N), and hence for growing K trees the expression becomes O(K nN log N) [32]. However, this is not an exact expression of the complexity of RF, as it does not take into account the case when only a subset of the features is selected, or when additional time is required to inject randomness into the model. The complexity of SGTB algorithms mostly depends on the number of leaf nodes defined for each tree and the number of iterations or the maximum number of trees to be grown. However, due to the introduction of the bagging concept to increase randomness, a smaller sample is drawn in each iteration, which subsequently decreases the complexity.

SGD-QN: Careful Quasi-Newton Stochastic Gradient Descent

Minimizing this loss is of course equivalent to minimizing the primal cost (1) with its regularization term. Applying the SGD algorithm to the examples defined in Table 2 separates the regularization updates, which involve the special example, from the pattern updates, which involve the real examples. The parameter skip regulates the relative frequencies of these updates. The SVMSGD2 algorithm (Bottou, 2007) measures the average pattern sparsity and picks a frequency that ensures that the amortized cost of the regularization update is proportional to the number of nonzero coefficients. Figure 1 compares the pseudo-codes of the naive first-order SGD and of the first-order SVMSGD2. Both algorithms handle the real examples at each iteration (line 3) but SVMSGD2 only performs a regularization update every skip iterations (line 6).

Certain Systems Arising In Stochastic Gradient Descent

2. Kesten algorithm. [Kes58] introduced a stochastic approximation process in hopes of accelerating the convergence of the Robbins-Monro algorithm [RM51]. The idea here is: when we are confident that the process is close to the value θ we wish to estimate, we decrease the step size in order to stabilize the convergence. And whenever we suspect we are far away from θ, we keep the step size large in order to allow faster exploration. In order to determine when it is warranted to decrease the step size, they followed the heuristic that when the process is close to θ then the sign of X_n − X_{n−1} should fluctuate more
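The sign-change heuristic can be sketched as a Robbins-Monro root search whose gain shrinks only when successive increments change sign. The gain schedule a / (1 + number of sign changes), the function name `kesten_root`, and the noisy test function are assumptions for illustration; the exact indexing in [Kes58] may differ.

```python
import random

def kesten_root(noisy_fn, x0, steps=2000, a=0.5):
    """Search for a root of E[noisy_fn(x)] with a Kesten-style adaptive gain."""
    x, prev_step, changes = x0, 0.0, 0
    for _ in range(steps):
        step = -(a / (1 + changes)) * noisy_fn(x)   # gain depends on sign changes seen so far
        if step * prev_step < 0:                    # X_n - X_{n-1} flipped sign
            changes += 1
        x += step
        prev_step = step
    return x, changes

rng = random.Random(0)
estimate, sign_changes = kesten_root(lambda x: (x - 3.0) + rng.gauss(0.0, 0.5), 10.0)
```

While the iterate is far from the root, the increments keep the same sign and the gain stays large; once the noise dominates, the signs alternate frequently and the gain decays roughly like 1/n.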

Minimum Bit Error Rate Design for Space Time Equalisation Based Multiuser Detection

Stationary System: The system used in our simulation supported users with receiver antennas. All three users had an equal transmit power. The CIRs are listed in Table I, each CIR having taps. The CIRs used in both the stationary and fading channels are extensions of the often-used single-input single-output (SISO) CIRs proposed by Proakis in his book, which were extended to the MIMO scenario considered. In the actual simulation, all 12 CIRs were normalized to provide unit channel energy, i.e., for all and . Thus, SIR dB for all and . Each equalizer temporal filter had a length of , and the detector decision delay was chosen to be . For this stationary system, Fig. 3 compares the BER performance of the MMSE and MBER STE-based MUDs. The BER of an STE-based MUD was computed using the theoretic BER formula (23), the MMSE STE weight vector was calculated using the formula (16), and the MBER STE solution was computed numerically using the simplified conjugate gradient algorithm. It can be seen that for all three users, the MBER STE detectors had better BER performance than the corresponding MMSE detectors. For the specific simulated channel conditions, the performance gap between the MBER and MMSE STE detectors was the smallest for user 3, with the MBER solution achieving above 1.0 dB gain in SNR at the BER level of . At this BER level, the MBER STE detector for user 1 had the largest performance gain over the corresponding MMSE STE detector, above 5.0 dB gain in SNR. The performance of the block-data gradient adaptive MBER algorithm employing the simplified conjugate gradient updating mechanism, as described in Section IV-A, was investigated. Our simulation results show that with a block size the block-data-based adaptive MBER STE can closely match the theoretical MBER STE's performance, and the algorithm typically converged within 20 iterations. Space limitation precludes the inclusion of these simulation results.
