4.3 Existing SMC Algorithms
4.3.3 Theoretical Analysis
Actually Targeted Distribution. In the following, we characterise the
approximation induced by the above choice of forward and backward kernels, i.e. with only single-birth or single-adjustment moves.
First, we define some notation.
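$b_j := \inf\{q \in \mathbb{N} \mid \sum_{l=1}^{q} \mathbb{1}_{\{b\}}(m_l) = j\}$ denotes the index of the SMC step at which the $j$th birth move occurs.
$s(\tau) := \inf\{q \in \mathbb{N} \mid t_q \ge \tau\}$ denotes the index of the first SMC step at which a jump with jump time $\tau$ could have been proposed.
$\tilde{s}(\tau_{1:j}) := \sup\{s(\tau_{j-l+1}) + l - 1 \mid l \in \mathbb{N}_j\}$ denotes the minimum number of SMC steps needed to propose jumps with jump times $\tau_{1:j}$.

The product of the particle proposal kernels up to Step $n$,
$$P_1(dx_1) \prod_{p=2}^{n} P_p(dx_p \mid x_{1:p-1}), \tag{4.2}$$
then has support
$$E^r_n := \bigl\{ x_{1:n} \in E_{1:n} \bigm| b_1 = 1 \text{ and } \forall j \in \mathbb{Z}_{2,k_n} : \tilde{s}(\tau_{n,1:k_j-1}) < b_j \le n \bigr\}.$$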
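The three quantities defined above can be made concrete with a small sketch. It assumes that the SMC time grid is stored as an increasing list `t = [t_1, ..., t_n]`, that moves are encoded as the strings `"a"` (adjustment) and `"b"` (birth), and that all jump times lie at or before `t_n`; the function names are illustrative and not taken from the source.

```python
from bisect import bisect_left


def birth_step(moves, j):
    """b_j: 1-based index of the step at which the j-th birth move occurs.

    Returns None if fewer than j births occur (infimum over an empty set).
    """
    births = 0
    for q, move in enumerate(moves, start=1):
        if move == "b":
            births += 1
            if births == j:
                return q
    return None


def first_step(t, tau):
    """s(tau): smallest 1-based q with t_q >= tau (assumes tau <= t[-1])."""
    return bisect_left(t, tau) + 1


def min_steps(t, taus):
    """s~(tau_{1:j}): maximum over l = 1..j of s(tau_{j-l+1}) + l - 1."""
    j = len(taus)
    # taus[j - l] is tau_{j-l+1} in the 1-based notation of the text.
    return max(first_step(t, taus[j - l]) + l - 1 for l in range(1, j + 1))


if __name__ == "__main__":
    grid = [1.0, 2.0, 3.0, 4.0]
    print(birth_step(["b", "a", "b", "b"], 2))
    print(first_step(grid, 2.0))
    print(min_steps(grid, [0.5, 0.7]))
    print(min_steps(grid, [0.5, 0.7, 3.5]))
```

With one birth per step, two jumps in the first interval need at least two steps even though each could individually have been proposed at Step 1, which is what the `+ l - 1` term accounts for.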
In particular, the marginal distribution of $X_n$ under the distribution in Equation 4.2 then has support
$$\tilde{E}^r_n := \bigl\{ (k_n, \tau_{1:k_n}, \phi_{0:k_n}) \in \tilde{E}_n \bigm| \dots \bigr\}.$$
Recall that we write $z_n = x_n \setminus m_n = (k_n, \tau_{1:k_n}, \phi_{0:k_n})$. To ensure that the importance weights exist, the algorithm can therefore only target, as a marginal, the distribution
$$\tilde{\pi}^{r}_n(dz_n) \propto \tilde{\pi}_n(dz_n)\, \mathbb{1}_{\tilde{E}^r_n}(z_n).$$
If the time between successive SMC steps, $t_n - t_{n-1}$, is short compared to the average time between jumps, the difference between the distribution in Equation 4.1 and the marginally targeted distribution $\tilde{\pi}^{r}_n$ should be negligible. We will consider the influence of this approximation in Section 4.5.
The extended distribution actually targeted by the algorithm is
$$\pi^{r}_n(dx_{1:n}) \propto \tilde{\pi}^{r}_n(dz_n)\, \beta_0(dm_1 \mid z_1) \prod_{p=1}^{n-1} \beta_p(dm_{p+1} \mid z_{p+1})\, L_{p,m_{p+1}}(dz_p \mid z_{p+1}),$$
where we assume, for the moment, that the backward mixture weights $\beta_p(\cdot \mid z_{p+1})$ can be chosen such that this extended distribution does not have probability mass outside of $E^r_n$, to ensure that the importance weights exist. A detailed discussion of the choice of backward mixture weights is given below.
We conclude this subsection by describing some implementation issues regarding the above-mentioned SMC algorithm. To our knowledge, they have not been pointed out in the literature. The point we wish to stress here is that backward and proposal kernels need to be chosen carefully and consistently with each other, in order to avoid biases resulting from a loss of absolute continuity. Such biases may be small in the case of filtering (i.e. if the static parameters are known). However, if the static parameters are to be estimated alongside the jump times and jump sizes, even small biases in the filter can induce large biases in the estimates of the static parameters.

Choice of Jump-Size Proposal Kernels. Whiteley et al. (2011) advocated sampling the jump sizes from their full conditional posterior distribution. However, given the structure of the algorithm, this posterior distribution will often be based on observations ‘too far’ into the future.
Figure 4.3 Proposal distributions for the jump size $\phi = \phi_{n,k_n}$ associated with the jump time $\tau = \tau_{n,k_n}$ sampled at the $n$th SMC step in the elementary change-point model (as part of one of the particles). Dashed line: full conditional distribution of $\phi$ given all the data up to time $t_n$. Solid line: full conditional distribution of $\phi$ given the data up to time $\tau + c$.
More formally, assume that the jump time $\tau = \tau_{n,k_n}$ proposed at Step $n$ is much smaller than $t_n$. In this case, the full conditional posterior distribution of the jump size $\phi = \phi_{n,k_n}$ conditions on there not being another jump in the interval $(\tau, t_n]$. Due to conditioning on a potentially large number of observations, this full conditional posterior distribution can then be highly concentrated.

However, not having another jump in the interval $(\tau, t_n]$ can have little probability mass under the joint posterior distribution. If, at Step $n+1$, a further jump is added in this interval, then the jump size $\phi$ might be located in a region that has little probability mass under the full conditional posterior distribution (which then also conditions on the additional jump). This can lead to a high variance in the particle weights. The problem is illustrated in Figure 4.3.
When incorporating observations into the proposal kernels, we recommend taking only observations from some interval $[\tau, \tau + c)$ into account when sampling $\phi$. For instance, $c$ may be tied to a typical interjump time, such as the mean interjump time or a quantile of the interjump-time distribution, with the window truncated at $t_n$.
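The windowing recommendation can be sketched as follows. This is only an illustration under stated assumptions: observations are held as parallel lists of times and values, and the window length is set as `alpha * mean_interjump`, a hypothetical rule chosen here for concreteness rather than the formula of the source. The names `window_observations`, `alpha` and `mean_interjump` are likewise assumptions.

```python
def window_observations(obs_times, obs_values, tau, c, t_n):
    """Keep observations in [tau, tau + c) that are also available at time t_n."""
    return [
        (s, y)
        for s, y in zip(obs_times, obs_values)
        if tau <= s < tau + c and s <= t_n
    ]


if __name__ == "__main__":
    times = [0.5, 1.0, 1.5, 2.0, 2.5]
    values = [0.1, 0.9, 1.1, 1.0, 3.2]

    alpha = 0.5            # hypothetical tuning constant
    mean_interjump = 2.0   # hypothetical mean time between jumps
    c = alpha * mean_interjump

    # Observations used to propose the size of a jump at tau = 1.0.
    print(window_observations(times, values, tau=1.0, c=c, t_n=3.0))
```

Only the observations selected this way would enter the conditional distribution from which the jump size is sampled, so that a jump added later in $(\tau + c, t_n]$ does not contradict information already built into the proposal.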
Choice of Backward Mixture Weights. As previously mentioned, the backward mixture weights must be chosen such that the extended target distribution does not have probability mass outside of $E^r_n$. The most obvious problem with a poor choice of backward mixture weights is that the extended target distribution does not actually admit the right marginal (in addition to having ill-defined importance weights).
There is a one-to-one correspondence between $\sum_{p=1}^{n} \mathbb{1}_{\{b\}}(m_p)$, the number of birth moves, and $k_n$, the number of jumps in the proposal distribution and hence in the support of the truncated target distribution $\tilde{\pi}^{r}_n$. Therefore we cannot specify their distributions independently. The target already specifies a distribution over $k_n$; if the backward mixture weights $\beta_p(\cdot \mid z_{p+1})$ do not depend on $k_{p+1}$, then they implicitly specify a second distribution over $k_n$, and the marginal distribution of this quantity under the target distribution will not be what is intended.
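For instance, consider setting the backward mixture kernel weights to a uniform distribution over $M = \{a, b\}$, a popular choice. Write $A_n := \{p \in \mathbb{N}_{n-1} \mid m_{p+1} = a\}$, $B_n := \mathbb{N}_{n-1} \setminus A_n$, and $C_n := \{b\} \times M^{n-1}$. The algorithm targets as a marginal the distribution defined by
$$\begin{aligned}
\int_{E_{1:n}} \pi^{r}_n(dx_{1:n})\, \mathbb{1}_D(z_n)
&\propto \int_{\tilde{E}_{1:n}} \tilde{\pi}^{r}_n(dz_n)\, \mathbb{1}_D(z_n) \sum_{m_{1:n} \in C_n} \prod_{p \in A_n} L_{p,a}(dz_p \mid z_{p+1}) \prod_{p \in B_n} L_{p,b}(dz_p \mid z_{p+1}) \\
&= \int_D \tilde{\pi}^{r}_n(dz_n)\, \#\Bigl\{ m_{1:n} \in C_n \Bigm| \sum_{p=1}^{n} \mathbb{1}_{\{b\}}(m_p) = k_n \Bigr\} \\
&= \int_D \tilde{\pi}^{r}_n(dz_n) \binom{n-1}{k_n-1}.
\end{aligned}$$
Hence the marginal of $z_n$ is tilted by the number of move sequences containing exactly $k_n$ births and no longer coincides with $\tilde{\pi}^{r}_n$.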
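The counting step in the last two equalities can be checked by brute force. The sketch below enumerates all move sequences in $\{b\} \times \{a, b\}^{n-1}$ and counts those with exactly $k$ births; the function name and the string encoding of the moves are illustrative assumptions.

```python
from itertools import product
from math import comb


def count_move_sequences(n, k):
    """Number of m_{1:n} in {b} x {a, b}^(n-1) containing exactly k births."""
    return sum(
        1
        for rest in product("ab", repeat=n - 1)  # m_2, ..., m_n; m_1 is fixed to "b"
        if 1 + rest.count("b") == k
    )


if __name__ == "__main__":
    n = 6
    for k in range(1, n + 1):
        # Brute-force count next to the closed form C(n - 1, k - 1).
        print(k, count_move_sequences(n, k), comb(n - 1, k - 1))
```

Since the count varies with $k$, uniform backward mixture weights favour intermediate numbers of jumps relative to the intended marginal.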
For regular proposal kernels, a possible choice of backward kernels restricting the support of the extended target distribution to $E^r_n$ may be induced by setting the backward mixture weights to
$$\beta_{p-1}(b \mid z_p) = \begin{cases} 0, & \text{if } k_p = 1 \text{ and } p > 1, \\ 1, & \text{if } p \in \{1, k_p, \tilde{s}(\tau_{p,1:k_p-1}) + 1\}, \\ q_p(z_p), & \text{otherwise,} \end{cases} \tag{4.3}$$
for some probability $q_p(z_p) \in (0,1)$ which may depend on $z_p$.

Local Adjustment Moves. Ideally, adjustment moves should direct the jumps towards regions of higher posterior probability. If such moves cannot be devised, it is preferable to use local adjustment moves, e.g. small-scale Gaussian kernels centred around the current location of the jump. This reduces the risk of moving jumps away from regions of high posterior probability, which would add to sample impoverishment. However, such local adjustment moves are unlikely to move a jump currently contained in $(t_{p-1}, t_p]$ out of such an interval. Therefore, even using Equation 4.3 could result in importance weights with infinite variance.
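For reference, the case distinction in Equation 4.3 can be written as a small function. This is a sketch only: the caller is assumed to supply the value of $\tilde{s}(\tau_{p,1:k_p-1})$ (or `None` when $k_p = 1$, where no earlier jumps exist), and the argument names are hypothetical.

```python
def backward_birth_weight(p, k_p, s_tilde_prev, q_p):
    """beta_{p-1}(b | z_p), following the three cases of Equation 4.3.

    p            -- current SMC step (1-based)
    k_p          -- number of jumps in z_p
    s_tilde_prev -- s~(tau_{p,1:k_p-1}), or None if k_p == 1
    q_p          -- free probability in (0, 1), may depend on z_p
    """
    if k_p == 1 and p > 1:
        # The single jump was born at Step 1, so Step p was not a birth.
        return 0.0
    if p == 1 or p == k_p:
        # Step 1 is a birth; if p == k_p, every step so far was a birth.
        return 1.0
    if s_tilde_prev is not None and p == s_tilde_prev + 1:
        # Third element of the set in Equation 4.3.
        return 1.0
    return q_p


if __name__ == "__main__":
    print(backward_birth_weight(p=5, k_p=1, s_tilde_prev=None, q_p=0.3))
    print(backward_birth_weight(p=3, k_p=3, s_tilde_prev=2, q_p=0.3))
    print(backward_birth_weight(p=6, k_p=3, s_tilde_prev=4, q_p=0.3))
```

Because the mixture is over the two move types only, the weight of an adjustment move is one minus the returned value.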
A simple remedy is to employ restricted adjustment moves, i.e. local moves that are limited to the particular interval $(t_{p-1}, t_p]$ currently containing the jump. More formally, recall that $s(\tau) = \inf\{q \in \mathbb{N} \mid t_q \ge \tau\}$ is the first SMC step at which a jump with jump time $\tau$ could have been proposed. For restricted adjustment moves, the adjustment kernel applied at Step $n$ (conditional on $z_{n-1}$) then has support $(\tau_{n-1,k_{n-1}-1} \vee t_{s_{n-1}-1},\, t_{s_{n-1}}]$, where $s_{n-1} := s(\tau_{n-1,k_{n-1}})$, rather than having support $(\tau_{n-1,k_{n-1}-1},\, t_n]$.
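A restricted adjustment move of this kind might be sketched as below. The Gaussian random walk truncated by rejection sampling is an assumption made here for illustration; the text only requires that the move be local and confined to the stated interval. The grid convention `t = [t_1, ..., t_n]` with `t_0` passed separately, and all function names, are also assumptions. The density of the truncated proposal, which the importance weights would need, is not computed.

```python
import random
from bisect import bisect_left


def restricted_interval(t, tau, tau_prev, t0=0.0):
    """Return (lo, hi) such that the move is confined to (lo, hi].

    lo = max(tau_prev, t_{s-1}) and hi = t_s with s = s(tau); assumes tau <= t[-1].
    """
    s = bisect_left(t, tau) + 1               # 1-based s(tau)
    t_before = t[s - 2] if s >= 2 else t0     # t_{s-1}, with t_0 := t0
    return max(tau_prev, t_before), t[s - 1]


def restricted_adjustment(t, tau, tau_prev, scale, rng, max_tries=10_000):
    """Gaussian random-walk move for tau, truncated to its current interval."""
    lo, hi = restricted_interval(t, tau, tau_prev)
    for _ in range(max_tries):
        proposal = rng.gauss(tau, scale)
        if lo < proposal <= hi:
            return proposal
    raise RuntimeError("no proposal accepted; scale may be too large")


if __name__ == "__main__":
    grid = [1.0, 2.0, 3.0, 4.0]
    rng = random.Random(0)
    # Most recent jump at 2.5, previous jump at 0.4: the move stays in (2.0, 3.0].
    print(restricted_interval(grid, tau=2.5, tau_prev=0.4))
    print(restricted_adjustment(grid, tau=2.5, tau_prev=0.4, scale=0.2, rng=rng))
```

Confining the move to the interval containing the jump keeps $s(\tau)$ unchanged, which is the property the support conditions in this subsection rely on.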
Also recalling that $\tilde{s}(\tau_{1:j}) = \sup\{s(\tau_{j-l+1}) + l - 1 \mid l \in \mathbb{N}_j\}$ represents the minimum number of SMC steps needed to propose jumps with jump times $\tau_{1:j}$, the support of the joint proposal distribution from Equation 4.2 is then given by
$$E^r_n = \bigl\{ x_{1:n} \in E_{1:n} \bigm| b_1 = \tilde{s}(\tau_{n,1}) = 1 \text{ and } \forall j \in \mathbb{Z}_{2,k_n} : \tilde{s}(\tau_{n,1:k_j}) \le b_j \le n \bigr\}.$$
To ensure that the target distribution does not have probability mass outside of $E^r_n$, the distribution of $\tau_{n-1,k_{n-1}}$ under the corresponding backward adjustment kernel (conditional on $z_n$) must have support $(\tau_{n,k_n-1} \vee t_{s_n-1},\, t_{s_n}]$, where $s_n := s(\tau_{n,k_n})$. In addition, the backward mixture weights might take the form presented in Equation 4.3 but with $\tilde{s}(\tau_{p,1:k_p-1})$ replaced by $\tilde{s}(\tau_{p,1:k_p}) - 1$.