5.1.1 Simulated Annealing Particle Filter
Recall that in the particle filter, prior knowledge and the temporal model are often unknown and difficult to acquire. Many studies therefore eliminate the temporal term $p(x_t^i \mid x_{t-1}^i)$ from the update equation A.2.3 by setting $\pi(x_t^i \mid x_{t-1}^i, y_t^i)$ equal to $p(x_t^i \mid x_{t-1}^i)$, which simplifies the calculation. This type of Importance Sampling Re-sampling algorithm is commonly known as the bootstrap filter or the Condensation algorithm [Isard and Blake 1998b]. In this form, the posterior probability depends solely on the likelihood, and the maximum a posteriori solution is equivalent to the maximum likelihood solution. As a result, the problem changes from finding the maximum a posteriori solution to finding the maximum likelihood solution, so the true state is ultimately estimated via optimisation.
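The simplification can be made explicit with a short worked equation. Equation A.2.3 is not reproduced in this section, so the block below assumes it takes the standard sequential importance sampling form; the weight symbol $w_t^i$ is used here only for illustration.

```latex
% Assumed standard form of the importance weight update (A.2.3 is not shown in this section)
w_t^i \;\propto\; w_{t-1}^i \,
  \frac{p(y_t^i \mid x_t^i)\; p(x_t^i \mid x_{t-1}^i)}
       {\pi(x_t^i \mid x_{t-1}^i, y_t^i)}

% Choosing \pi(x_t^i \mid x_{t-1}^i, y_t^i) = p(x_t^i \mid x_{t-1}^i) cancels the temporal term:
w_t^i \;\propto\; w_{t-1}^i \, p(y_t^i \mid x_t^i)
```

Under this assumption the weights depend on the observation likelihood alone, which is why maximising the posterior reduces to maximising the likelihood.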
Simulated Annealing, proposed by Kirkpatrick et al. [Kirkpatrick et al. 1983], serves as a general-purpose optimisation algorithm. Later, Deutscher et al. [Deutscher and Reid 2005] introduced it into particle filtering to maximise the likelihood in human motion tracking. Usually the likelihood $p(y_t \mid x_t)$ is formulated as an exponential function $f(y_t, x_t)$ of a metric $E(y_t, x_t)$ between $y_t$ and $x_t$:
$$ f(y_t, x_t) = \exp\{-E(y_t, x_t)\} $$

Adding an annealing variable $\lambda$ gives

$$ p(y_t \mid x_t) = f(y_t, x_t)^{\lambda} = \exp\{-\lambda E(y_t, x_t)\} \tag{5.1.1} $$

As $\lambda \to \infty$, the probability mass concentrates on the minimum of $E(y_t, x_t)$, or equivalently, the maximum of $f(y_t, x_t)$. To avoid being trapped in local minima, $\lambda$ is initially assigned a small value and is gradually increased according to a predefined set of values $\{\lambda_m, \ldots, \lambda_1\}$, where $\lambda_m < \lambda_{m-1} < \ldots < \lambda_1$, which is known as the annealing schedule. Gradually increasing $\lambda$ causes $f(y_t, x_t)^{\lambda}$ to evolve from a flat distribution (probably single mode) to a peaked distribution (multiple modes). In a typical process, the samples of the state $x_t$ are weighted by $f(y_t, x_t)^{\lambda}$, re-sampled to concentrate on a better minimum and finally perturbed with Gaussian noise. Theoretically, the Simulated Annealing Particle Filter should not be misled by local minima, so it can converge to the global minimum within the search space. Figure 5.1 illustrates this evolving procedure.

Figure 5.1: As $\lambda$ gradually increases, the observation likelihood distribution evolves from flat (probably single mode) to peaked (multiple modes), and its probability mass slowly concentrates on the maxima, or equivalently, the minima of $E(y_t, x_t)$. Particles gradually move from low to high likelihood areas while their coverage shrinks from large to small. Provided the annealing schedule is sufficiently slow and long, particles will eventually concentrate on the global maximum mode.
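The effect of the annealing variable on the weights can be illustrated with a minimal sketch. The function name `annealed_weights` and the five energy values are hypothetical, chosen only to show how the normalised weights sharpen around the lowest-energy particle as $\lambda$ grows.

```python
import math

def annealed_weights(energies, lam):
    """Normalised weights w_i proportional to exp(-lam * E_i)."""
    e_min = min(energies)
    # Subtracting the minimum energy avoids underflow; the common factor
    # cancels when the weights are normalised.
    raw = [math.exp(-lam * (e - e_min)) for e in energies]
    total = sum(raw)
    return [w / total for w in raw]

# Hypothetical energies of five particles; the third has the lowest energy.
energies = [2.0, 1.5, 0.2, 1.0, 3.0]
for lam in (0.1, 1.0, 10.0):
    weights = annealed_weights(energies, lam)
    print(f"lambda={lam:5.1f}  largest weight={max(weights):.3f}")
```

For small $\lambda$ the weights are nearly uniform, while for large $\lambda$ almost all of the mass sits on the particle with the smallest energy.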
Besides $\lambda_m$, two other important parameters are the survival rate $\alpha_m$ and the perturbation covariance matrix $P_m$, which control and tune the pace at which samples are
survival rate is given by:
$$ \alpha_m = \frac{N_{\mathrm{eff}}(m)}{N}, \qquad N_{\mathrm{eff}}(m) = \frac{1}{\sum_{i=1}^{N} (w_{t,m}^i)^2} $$
where $N_{\mathrm{eff}}(m)$ has the same meaning as in equation 3.3.6, but uses a different weight definition, $w_{t,m}^i = \exp\{-\lambda_m E(y_t^i, x_t^i)\}$, normalised to sum to one before $N_{\mathrm{eff}}(m)$ is evaluated. A high survival rate corresponds to a flat importance distribution whose probability mass is close to uniformly distributed. Resampling from this distribution means that good and bad particles are roughly equally likely to be sampled, enabling a broad range of exploration. Conversely, a low survival rate corresponds to a peaked importance distribution, under which good particles are more likely to be sampled, so the exploration concentrates on highly likely areas of the search space. Instead of simply increasing the annealing variable, the annealing schedule should be determined by accounting for the shape of the importance distribution. This leads to more effective resampling in the probabilistically important directions. Given the survival rate $\alpha_m$ of the particles at the current phase, $\lambda_m$ can be determined, as suggested in [Deutscher and Reid 2005], by solving:
$$ \alpha_m N \sum_{i=1}^{N} (w_{t,m}^i)^2 = \left( \sum_{i=1}^{N} w_{t,m}^i \right)^2 \tag{5.1.2} $$

where $N$ is the number of particles and $w_{t,m}^i = \exp\{-\lambda_m E(y_t^i, x_{t,m}^i)\}$.
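Equation (5.1.2) has no closed-form solution for $\lambda_m$ in general, so it is usually solved numerically. The sketch below is one possible approach, not necessarily the one used in [Deutscher and Reid 2005]. It relies on the survival rate being 1 at $\lambda = 0$ and non-increasing in $\lambda$, and applies bisection. The names `survival_rate` and `solve_lambda`, the tolerances and the sample energies are all assumptions made for illustration.

```python
import math

def survival_rate(energies, lam):
    """alpha = (sum w)^2 / (N * sum w^2) with w_i = exp(-lam * E_i), as in (5.1.2)."""
    e_min = min(energies)
    # Shifting the energies rescales every weight by the same factor,
    # which leaves the ratio unchanged and avoids underflow.
    w = [math.exp(-lam * (e - e_min)) for e in energies]
    return sum(w) ** 2 / (len(w) * sum(x * x for x in w))

def solve_lambda(energies, alpha_desired, lam_hi=1.0, tol=1e-9, max_iter=200):
    """Find lam >= 0 whose survival rate matches alpha_desired by bisection."""
    lam_lo = 0.0
    # Expand the bracket until the survival rate drops below the target.
    while survival_rate(energies, lam_hi) > alpha_desired:
        lam_lo, lam_hi = lam_hi, lam_hi * 2.0
        if lam_hi > 1e12:
            raise ValueError("alpha_desired is not reachable for these energies")
    for _ in range(max_iter):
        mid = 0.5 * (lam_lo + lam_hi)
        if survival_rate(energies, mid) > alpha_desired:
            lam_lo = mid
        else:
            lam_hi = mid
        if lam_hi - lam_lo < tol:
            break
    return 0.5 * (lam_lo + lam_hi)

# Hypothetical particle energies and a desired survival rate of 0.5.
energies = [2.0, 1.5, 0.2, 1.0, 3.0]
lam = solve_lambda(energies, 0.5)
print(lam, survival_rate(energies, lam))
```

Note that the target must exceed $1/N$ when a single particle has the lowest energy, since that is the limit of the survival rate as $\lambda \to \infty$.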
The survival rate at any phase can be assigned a desired value $\alpha_{\mathrm{desired}}$ by adjusting the annealing variable $\lambda_m$. Note that the survival rates are fixed, $\alpha_m = \ldots = \alpha_1 = \alpha_{\mathrm{desired}}$, while $\lambda_m$ is monotonically increasing. This implies that the perturbation covariance matrix $P_m$ contributes to increasing the uniformity of the probability mass between phases. As $\lambda_m$ gradually increases, particles move closer and closer to the global minimum. $P_m$ is gradually scaled down so that particles do not waste time exploring fruitless areas far away from the global minimum. The current perturbation covariance matrix $P_m$ can be scaled by making it proportional to the product of the survival rate and the previous covariance matrix. Thus $P_m$ is given by:
$$ P_m = \alpha_m \times \ldots \times \alpha_1 \times P_0 = (\alpha_{\mathrm{desired}})^m \times P_0 $$
This is analogous to the situation in which, as the temperature falls, the energy of the particles decreases and their range of movement is therefore squeezed.
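The geometric shrinkage of the perturbation covariance can be sketched as follows. The function name `perturbation_covariances`, the 2x2 initial matrix and the survival rate of 0.5 are illustrative assumptions; matrices are held as nested lists to keep the example free of dependencies.

```python
def perturbation_covariances(p0, alpha_desired, num_phases):
    """Return [P_1, ..., P_M] with P_m = alpha_desired**m * P_0."""
    schedule = []
    for m in range(1, num_phases + 1):
        scale = alpha_desired ** m
        schedule.append([[scale * entry for entry in row] for row in p0])
    return schedule

# Hypothetical initial covariance and fixed survival rate.
p0 = [[4.0, 0.0], [0.0, 1.0]]
for m, p_m in enumerate(perturbation_covariances(p0, 0.5, 3), start=1):
    print(m, p_m)
```

With a fixed survival rate below one, each phase perturbs the particles over a smaller region than the previous one.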
The Simulated Annealing Particle Filter can be regarded as a particle filter with $p(x_t \mid x_{t-1})$ as the importance distribution, plus extra optimisation steps. The optimisation process takes place after the weights have been updated and is completed before the number of effective samples is evaluated. The Annealing Particle Filter is shown in Algorithm 5.
Algorithm 5 Annealing Particle Filter for a typical frame at time $t$
Require: an appropriate $\alpha_m$ is defined; the previous particles $x_{t-1}$, the observation $y_t$, the number of phases $M$ and the initial covariance matrix $P_0$ are given
for $m = 1$ to $M$ do
    1: Initialise $N$ particles $x_t^i$ from the previous phase or the temporal model $p(x_t^i \mid x_{t-1}^i)$.
    2: Calculate the energy $E(y_t, x_t)$ for all particles.
    3: Find $\lambda_m$ by solving equation (5.1.2).
    4: Update the weights of all particles using equation (5.1.1).
    5: Resample $N$ particles from the importance distribution.
    6: Perturb the particles by Gaussian noise with covariance $P_m = \alpha_m P_{m-1}$ and mean $\mu = 0$.
end for
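The loop of Algorithm 5 can be sketched for a scalar state as follows. This is a toy illustration under stated assumptions, not the tracker described in the text. The energy function, particle count, number of phases and initial variance are hypothetical. The initial particles are passed in directly instead of being drawn from a temporal model. Equation (5.1.2) is solved by bisection, which is one possible choice of solver.

```python
import math
import random

def survival_rate(energies, lam):
    """Survival rate from (5.1.2) for weights exp(-lam * E_i)."""
    e_min = min(energies)
    w = [math.exp(-lam * (e - e_min)) for e in energies]
    return sum(w) ** 2 / (len(w) * sum(x * x for x in w))

def solve_lambda(energies, alpha, iters=60):
    """Bisection for the lam whose survival rate equals alpha (assumed solver)."""
    lo, hi = 0.0, 1.0
    while survival_rate(energies, hi) > alpha and hi < 1e12:
        lo, hi = hi, 2.0 * hi
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if survival_rate(energies, mid) > alpha:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def annealing_frame(particles, energy, alpha, num_phases, var0, rng):
    """Run the annealing phases of one frame on scalar particles."""
    var = var0
    for _ in range(num_phases):
        energies = [energy(x) for x in particles]              # step 2
        lam = solve_lambda(energies, alpha)                    # step 3
        e_min = min(energies)
        weights = [math.exp(-lam * (e - e_min)) for e in energies]  # step 4
        particles = rng.choices(particles, weights=weights,
                                k=len(particles))              # step 5
        var *= alpha                                           # P_m = alpha * P_{m-1}
        std = math.sqrt(var)
        particles = [x + rng.gauss(0.0, std) for x in particles]    # step 6
    return particles

# Hypothetical setup: quadratic energy with its minimum at x = 3.
rng = random.Random(0)
start = [rng.uniform(-10.0, 10.0) for _ in range(200)]
result = annealing_frame(start, lambda x: (x - 3.0) ** 2, 0.5, 10, 4.0, rng)
print(sum(result) / len(result))
```

In this toy setting the particle mean should settle close to the minimum of the energy, with a spread governed by the final perturbation variance.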