Common Issues in Particle Filters - Bayesian Filtering Methods

Chapter 2 - Bayesian Filtering, Metaheuristics and the SV Estimation Problem

2.1 Bayesian Filtering Methods

2.1.6 Common Issues in Particle Filters

Figure 2-4: The Sequential Importance Sampling Particle Filter


The phenomenon of ensemble collapse is known by many names in the literature, namely sample impoverishment, sample degeneracy and sample depletion. Many different resampling steps have been proposed in the literature [AMGC02, RAG04]; their main function is to replace particles with negligible weights with copies of particles with above-average weights. In low-dimensional cases they have successfully removed the particle collapse encountered earlier; however, they are unable to prevent particle collapse as the dimension of the state increases [SBBA08].

Sample Impoverishment

During the execution of a particle filter, the weights of the particles are updated with the arrival of each observation. However, after a few iterations, the weight mass starts to concentrate on the particles that already have the greater weights. Eventually the particle representation fails: all but a few particles have negligible weights.

Figure 2-5: Particle Filter – After Initial Iteration

The diagram above, taken from [SDFG01], shows the first iteration of a particle filter, where the distribution to be estimated is represented by a set of particles (yellow). After the first iteration, the weights of the particles are updated (blue). The particles with a higher probability are assigned a greater weight, which is shown in the diagram by the size of the blue particles. In this way, a set of particles with their respective weights forms a discrete representation of the probability distribution.

Theoretically, for a particle filter to approximate the posterior density function, each weight is required to give a good relative probability of that particle occurring in that distribution. The next diagram illustrates the objective of a particle filter. The density function to be estimated is shown by a black line. The yellow circles are the initialized particles that are used to estimate the posterior; they represent the samples that are drawn from the posterior. The weights of these particles are updated (blue) to represent the probability of that particle being sampled from the posterior. The density function to be estimated can be assumed to be made of an infinite number of particles. The aim is to sample from this infinite set a discrete set of particles that correctly represents the distribution. For an accurate estimate of the posterior, the weights assigned to the particles should be a good representation of the probability of drawing that sample from the actual posterior.

Figure 2-6: Objective of a Particle Filter - Accurate Posterior Density Estimation

The particles and their weights in the above diagram provide an accurate representation of the posterior. However, since in a particle filter the particles are initialized only once, this is only possible if the number of particles approaches infinity. Figure 2-7 shows the phenomenon of ensemble collapse.

Figure 2-7: The Phenomenon of Weight Degeneracy

When a particle filter is initialized, all the particles are assigned equal weights. However, as the algorithm runs, the largest weight continues to grow, and within a few iterations it approaches one while all the other particles have negligible weights. Theoretically, this can be avoided by increasing the number of particles, but that is computationally infeasible. Two main methods have been proposed in the literature to address this issue. These are discussed next.
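The weight collapse described above can be reproduced with a small experiment. The following is a minimal sketch, not the setup used in this work: it assumes a one-dimensional random-walk state with Gaussian observation noise, propagates the particles with the transition density, and never resamples. The function names and the use of the effective sample size 1 / Σ w_i² as a degeneracy measure are illustrative choices.

```python
import numpy as np

def effective_sample_size(weights):
    """Effective sample size 1 / sum(w_i^2) for normalized weights."""
    weights = np.asarray(weights, dtype=float)
    return 1.0 / np.sum(weights ** 2)

def sis_weight_history(n_particles=500, n_steps=50, obs_std=1.0, seed=0):
    """Run importance sampling WITHOUT resampling on a 1-D random walk
    (an assumed toy model) and record the largest weight and ESS per step."""
    rng = np.random.default_rng(seed)
    truth = 0.0
    particles = rng.normal(0.0, 1.0, n_particles)
    log_w = np.zeros(n_particles)          # equal initial weights
    max_w, ess = [], []
    for _ in range(n_steps):
        truth += rng.normal(0.0, 1.0)      # hidden state moves
        y = truth + rng.normal(0.0, obs_std)
        particles = particles + rng.normal(0.0, 1.0, n_particles)
        # Multiply each weight by the Gaussian likelihood (in log space).
        log_w += -0.5 * ((y - particles) / obs_std) ** 2
        log_w -= log_w.max()               # avoid underflow of every weight
        w = np.exp(log_w)
        w /= w.sum()
        max_w.append(w.max())
        ess.append(effective_sample_size(w))
    return np.array(max_w), np.array(ess)

max_w, ess = sis_weight_history()
print("largest weight, first and last step:", max_w[0], max_w[-1])
print("effective sample size, first and last step:", ess[0], ess[-1])
```

Under these assumptions the largest weight moves towards one and the effective sample size shrinks towards one as observations accumulate, which is the behaviour shown in Figure 2-7.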

Resampling

To address sample impoverishment, Gordon et al. [GSS93] proposed that the particles with above-average weights be multiplied within the population, with the copies replacing the particles with negligible weights. Adding copies of these particles was able to address the issue in low dimensions. This technique is called resampling.

The diagram below shows the working of a resampling function. Once a threshold is reached and the weights are concentrated on a few particles, the particle population is resampled: the particles with negligible weights are replaced, and particles with greater weights are assigned more copies in their neighbourhood.

Figure 2-8: Resampling in a Particle Filter
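As an illustration of the resampling step, here is a minimal sketch of one common scheme, systematic resampling. The text does not say which scheme is used in this work, so the choice of scheme and the function name are assumptions. The function returns the indices of the particles that survive; after resampling, all weights are reset to 1/N.

```python
import numpy as np

def systematic_resample(weights, rng):
    """Return N particle indices drawn by systematic resampling.

    Particles with large weights are selected several times and
    particles with negligible weights tend to be dropped.
    """
    weights = np.asarray(weights, dtype=float)
    weights = weights / weights.sum()
    n = weights.size
    # One uniform offset, then N evenly spaced points in [0, 1).
    positions = (rng.random() + np.arange(n)) / n
    cumulative = np.cumsum(weights)
    cumulative[-1] = 1.0                   # guard against round-off
    return np.searchsorted(cumulative, positions, side="right")

rng = np.random.default_rng(0)
particles = np.array([-2.0, 0.1, 0.3, 4.0])
weights = np.array([0.0, 0.1, 0.9, 0.0])
idx = systematic_resample(weights, rng)
particles = particles[idx]                 # copies of the heavy particles
weights = np.full(particles.size, 1.0 / particles.size)
print(idx, particles, weights)
```

The resampled population contains only copies of existing particles, which is the loss of diversity that motivates the regularization methods discussed later.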

The resampling step, to borrow terms from the genetic algorithm literature (genetic algorithms are discussed later in this chapter), can be seen as a selection operator. It has the properties of an exploitation operator and thus does not explore the search space completely. Because of this, as the number of iterations increases, the same group of particles is resampled repeatedly, and eventually the whole population of particles is biased towards a few particles. The particle filter with an added resampling step is called a sequential importance resampling (SIR) particle filter.

Figure 2-9: Particle Filter with Resampling Step – The SIR Particle Filter

Algorithm 2.2: The Sequential Importance Re-Sampling Particle Filter

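Choose a proposal distribution $q(x_{k+1} \mid x_k, y_{k+1})$, resampling strategy and the number of particles $N$.

Initialization: Generate $x_1^i \sim p_{x_0}$, $i = 1, \dots, N$, and let the initial weights be $w_{1|0}^i = 1/N$.

For loop k = 1, 2, …, End of Observations START

1. Measurement Update: For i = 1, 2, …, N

$$w_{k|k}^i = \frac{1}{c_k} \, w_{k|k-1}^i \, p(y_k \mid x_k^i)$$

Where the normalization weight is given by:

$$c_k = \sum_{i=1}^{N} w_{k|k-1}^i \, p(y_k \mid x_k^i)$$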

Regularization and Artificial Evolution

Resampling leads to a loss of diversity among the particles, since the resulting sample set contains many repeated particles. To rectify the sample impoverishment caused by resampling, a kernel density estimate of the particle density can be used after each resampling step to resample the particles a second time. In this process, each new particle is selected from the resampled particles by a draw from a uniform distribution, and the sample point is then moved a small amount by a draw from the local kernel. This process tends to concentrate the particles in the region of highest probability while separating them in a random fashion. This method of reducing sample impoverishment is called regularization [Gen92]. A particle filter with resampling and regularization is called a resample and move particle filter; the regularization step constitutes the move.
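The regularization step can be sketched as follows. This is an illustrative sketch only: the Gaussian kernel, the rule-of-thumb bandwidth and the function name are assumptions, since the text does not specify them. The particles are resampled according to their weights, a second uniform draw selects from the resampled set, and each selected point is then moved by a small kernel-distributed perturbation.

```python
import numpy as np

def regularize(particles, weights, rng, bandwidth_scale=1.0):
    """Resample, then jitter each particle with a Gaussian kernel.

    particles: array of shape (N, d); weights: array of shape (N,).
    The bandwidth is a rule of thumb for Gaussian kernels (assumption).
    """
    particles = np.asarray(particles, dtype=float)
    weights = np.asarray(weights, dtype=float)
    weights = weights / weights.sum()
    n, d = particles.shape
    # Spread of the weighted population, used to scale the kernel.
    mean = np.average(particles, axis=0, weights=weights)
    std = np.sqrt(np.average((particles - mean) ** 2, axis=0, weights=weights))
    h = bandwidth_scale * (4.0 / (d + 2.0)) ** (1.0 / (d + 4.0)) * n ** (-1.0 / (d + 4.0))
    # First pass: ordinary resampling according to the weights.
    resampled = particles[rng.choice(n, size=n, p=weights)]
    # Second pass: uniform draw from the resampled set, then a small move.
    picked = resampled[rng.integers(0, n, size=n)]
    return picked + rng.normal(size=(n, d)) * h * std

rng = np.random.default_rng(0)
particles = rng.normal(size=(200, 2))
weights = np.exp(-0.5 * np.sum((particles - 1.0) ** 2, axis=1))
moved = regularize(particles, weights, rng)
print("distinct particles after regularization:", np.unique(moved, axis=0).shape[0])
```

Because the kernel draw is continuous, the moved particles are distinct from one another, whereas plain resampling would leave many exact duplicates.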

There are several alternatives to the resample and move method [Hau11], including a Markov Chain Monte Carlo (MCMC) sampling method that uses the Metropolis–Hastings acceptance algorithm instead of a regularization step, and a Gibbs sampling method similar to MCMC. In general, these methods prove to be too computationally intensive for real-time filtering applications [Hau11]. However, in cases where the posterior density has large tail probabilities, as is the case for alpha-stable distributions such as a Levy distribution, the standard SIS particle filter methods may fail because it is difficult to select an appropriate importance density. In such instances, the use of an MCMC method in particle filters provides an alternative for building efficient high-dimensional proposal distributions. Other applications where these methods are useful are static parameter estimation and smoothing methods similar to the test problem being addressed in this research [Hua94].

Gordon et al. [GSS93] introduced a method similar to regularization: they proposed adding random disturbances, or roughening penalties, to the sampled state vectors in an attempt to address the issue of sample degeneracy. They called this method ‘artificial evolution’. Extending this idea to fixed model parameters leads to a synthetic method of generating new sample points for parameters. This ad hoc idea is similar to the Gaussian mutation used in the real-coded genetic algorithm literature [ES93].

Consider a state distribution | . Where an estimate of the fixed model

The key motivating idea is that the artificial evolution provides the mechanism for generating new parameter values at each time step.

The flow chart of the modified algorithm is shown in Figure 2-10.

Figure 2-10: Flow chart of a Particle Filter with Artificial Evolution

Algorithm 2.3: The Resample Move Particle Filter

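Choose a proposal distribution $q(x_{k+1} \mid x_k, y_{k+1})$, resampling strategy and the number of particles $N$.

Initialization: Generate $x_1^i \sim p_{x_0}$, $i = 1, \dots, N$, and let the initial weights be $w_{1|0}^i = 1/N$.

For loop k = 1, 2, …, End of Observations START

Measurement Update: For i = 1, 2, …, N

$$w_{k|k}^i = \frac{1}{c_k} \, w_{k|k-1}^i \, p(y_k \mid x_k^i)$$

Where the normalization weight is given by:

$$c_k = \sum_{i=1}^{N} w_{k|k-1}^i \, p(y_k \mid x_k^i)$$

Estimation:

The filtering density is approximated by

$$\hat{p}(x_{1:k} \mid y_{1:k}) = \sum_{i=1}^{N} w_{k|k}^i \, \delta(x_{1:k} - x_{1:k}^i)$$

And the mean is approximated by:

$$\hat{x}_{1:k} \approx \sum_{i=1}^{N} w_{k|k}^i \, x_{1:k}^i$$

Time Update:

Generate predictions according to the proposal distribution

$$x_{k+1}^i \sim q(x_{k+1} \mid x_k^i, y_{k+1})$$

And compensate for the importance weights

$$w_{k+1|k}^i = w_{k|k}^i \, \frac{p(x_{k+1}^i \mid x_k^i)}{q(x_{k+1}^i \mid x_k^i, y_{k+1})}$$

IF ($y_k$ is not the last observation)

k = k + 1 Go to step 1.

Else End For loop.

END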
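The measurement update, estimation and time update steps listed above can be put together in a short program. The sketch below is an illustration under stated assumptions rather than the implementation used in this work: the model is a one-dimensional linear Gaussian system, the proposal is taken to be the transition density (so the weight compensation ratio equals one), and a multinomial resampling step triggered when the effective sample size falls below N/2 is inserted between estimation and time update. The move step is omitted, and all names and parameter values are hypothetical.

```python
import numpy as np

def sir_filter(observations, n_particles=500, a=0.9, q_std=1.0, r_std=2.0, seed=0):
    """SIR particle filter for x_{k+1} = a x_k + v_k, y_k = x_k + e_k
    (an assumed toy model with Gaussian v_k and e_k)."""
    rng = np.random.default_rng(seed)
    x = rng.normal(0.0, 1.0, n_particles)            # initialization
    w = np.full(n_particles, 1.0 / n_particles)      # initial weights 1/N
    estimates = []
    for y in observations:
        # Measurement update: multiply by the likelihood and normalize.
        w = w * np.exp(-0.5 * ((y - x) / r_std) ** 2)
        w = w / w.sum()
        # Estimation: weighted mean of the particles.
        estimates.append(np.sum(w * x))
        # Resampling (assumed rule: only when the ESS drops below N/2).
        if 1.0 / np.sum(w ** 2) < n_particles / 2:
            x = x[rng.choice(n_particles, size=n_particles, p=w)]
            w = np.full(n_particles, 1.0 / n_particles)
        # Time update: proposal equals the transition density, so the
        # importance weights need no compensation.
        x = a * x + rng.normal(0.0, q_std, n_particles)
    return np.array(estimates)

# Simulate data from the same assumed model and run the filter.
rng = np.random.default_rng(42)
n_steps, a, q_std, r_std = 300, 0.9, 1.0, 2.0
truth = np.empty(n_steps)
truth[0] = rng.normal()
for k in range(1, n_steps):
    truth[k] = a * truth[k - 1] + rng.normal(0.0, q_std)
obs = truth + rng.normal(0.0, r_std, n_steps)
est = sir_filter(obs)
print("RMSE of raw observations:", np.sqrt(np.mean((obs - truth) ** 2)))
print("RMSE of filter estimates:", np.sqrt(np.mean((est - truth) ** 2)))
```

With a different proposal distribution, the time update would also have to multiply each weight by the ratio of the transition density to the proposal density, as in the weight compensation formula of the algorithm.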

The addition of resampling and regularization was able to address the collapse of particle filters in low dimensions; however, as the state dimension increases, the sample impoverishment becomes too severe to be addressed by these methods. In [SBBA08], Snyder et al. showed that the ensemble size required for a successful particle filter scales exponentially with the problem size.

The Curse of Dimensionality

The estimation of continuous density functions using sequential Monte Carlo methods is known to suffer from the ‘curse of dimensionality’ [And99, BSN03, Lee03]. Snyder et al. [SBBA08] showed that, to avoid ensemble collapse, the particle population needs to increase exponentially with increasing state dimension. For a nonlinear estimation problem with zero-mean unit-variance Gaussian noise, they showed that 10¹¹ particles are required for a 200-dimensional state-space. Similar observations were reported by Bengtsson et al. in [BBL08].
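The effect of the state dimension on the weights can be illustrated with a simple experiment. This is a sketch in the spirit of such studies, not a reproduction of the results in [SBBA08]: it assumes a standard Gaussian prior, an observation of every state component with unit-variance Gaussian noise, and a fixed population of 1000 particles, and it reports the largest normalized weight after a single measurement update, averaged over repeated trials.

```python
import numpy as np

def mean_max_weight(dim, n_particles=1000, n_trials=50, seed=0):
    """Average largest normalized weight after one measurement update
    for a dim-dimensional Gaussian toy problem (assumed setup)."""
    rng = np.random.default_rng(seed)
    max_weights = []
    for _ in range(n_trials):
        truth = rng.normal(size=dim)
        y = truth + rng.normal(size=dim)             # observe every component
        particles = rng.normal(size=(n_particles, dim))
        log_w = -0.5 * np.sum((y - particles) ** 2, axis=1)
        w = np.exp(log_w - log_w.max())              # stable normalization
        w /= w.sum()
        max_weights.append(w.max())
    return float(np.mean(max_weights))

results = {d: mean_max_weight(d) for d in (1, 10, 100)}
for d, value in results.items():
    print(f"dimension {d:3d}: mean largest weight = {value:.3f}")
```

With the population size held fixed, the largest weight grows with the dimension, so a single particle dominates the ensemble in the higher-dimensional cases.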

In [Bri11], Briggs examined the issue of high-dimensional particle filtering in state-spaces where the noise distribution is meta-elliptical and the components of the observation vector are independent. The author proposed a location-domain particle filter which creates a particle population for each component of the observation vector. This greatly increases the space and time complexity of the algorithm. The author noted that, whereas the generic particle filter took 0.034 seconds for an observation update on the test problem, the proposed filter took 2100 seconds. He also noted that the time taken for each observation update would increase with the number of observation vector components. This is a significant flaw: for the specific test problem with one hundred observations, a generic particle filter took approximately 4 seconds to run, while the proposed location-domain particle filter took approximately 60 hours [Bri11].

Convergence of particle filters

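It has been shown in [CD02, Mo98] that, given a posterior density function $P$ and a discrete particle population $\hat{P}^N$ of $N$ particles representing this density, generated by a particle filter, the following holds true:

$$\left( E\left[ \left| (\hat{P}^N, \varphi) - (P, \varphi) \right|^2 \right] \right)^{1/2} \le \frac{c \, \|\varphi\|}{\sqrt{N}}$$

Here:

$$(P, \varphi) = \int \varphi(x) \, P(x) \, dx$$

And $\|\varphi\| = \sup_x |\varphi(x)|$, with $\varphi$ a bounded measurable test function. Thus the equation shows that the RMSE converges to 0 as the number of particles $N$ is increased.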
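The decay of the RMSE with the number of particles can be checked numerically on a problem where the exact answer is known. The sketch below is only an illustration under assumptions that are not taken from the text: a single measurement update with a standard Gaussian prior, a unit-variance Gaussian likelihood with observation y = 2, and the bounded test function sin(x). The posterior is then Gaussian with mean y/2 and variance 1/2, so the exact expectation is sin(y/2) exp(-1/4).

```python
import numpy as np

def weighted_estimate_rmse(n_particles, n_reps=200, y=2.0, seed=0):
    """RMSE of the weighted particle estimate of E[sin(x) | y] over
    repeated runs, for an assumed Gaussian toy problem."""
    rng = np.random.default_rng(seed)
    x = rng.normal(size=(n_reps, n_particles))       # particles from the prior
    w = np.exp(-0.5 * (y - x) ** 2)                  # likelihood weights
    w /= w.sum(axis=1, keepdims=True)
    estimates = np.sum(w * np.sin(x), axis=1)
    exact = np.sin(y / 2.0) * np.exp(-0.25)          # posterior is N(y/2, 1/2)
    return float(np.sqrt(np.mean((estimates - exact) ** 2)))

for n in (100, 400, 1600, 6400):
    rmse = weighted_estimate_rmse(n)
    print(f"N = {n:5d}  RMSE = {rmse:.5f}  RMSE * sqrt(N) = {rmse * np.sqrt(n):.3f}")
```

In this one-dimensional example the product of the RMSE and the square root of N stays roughly constant, which is the behaviour the bound describes for a fixed problem.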

Although it can be argued that a reasonably large particle population can yield a near-accurate approximation of the posterior, it was later argued and experimentally demonstrated by Quang et al. in [QMG08] that this equation does not take the dimension of the problem into account. The authors argued that an increase in the dimension of the state can require an exponential increase in the number of particles required for convergence. They showed that the constant in the bound changes with the dimension. Later, Snyder et al. [SBBA08] published a mathematical proof showing that the number of particles required increases exponentially with an increase in state dimension.

In document Real-coded genetic algorithm particle filters for high-dimensional state spaces (Page 32-43)