2.2 Interpretation as Importance Sampling

2.2.3 Importance Weights

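For the moment, our aim is to approximate $\bar{\gamma}_t$ via IS using the proposal distribution $\bar{\psi}_t$. The Rao–Blackwellisation described in the next subsection then leads to the usual SMC approximation of $\gamma_t$. We must therefore ensure that the Radon–Nikodým derivative

\[
\bar{w}_t(\bar{x}_t) := \frac{d\bar{\gamma}_t}{d\bar{\psi}_t}(\bar{x}_t)
= \mathbb{1}_{\{u_{1:t}\}}\bigl(x_{1:t}^{b_{1:t}}\bigr)\,
\frac{\gamma_1(du_1)\,\xi_1(u_1, \{b_1\})}{\rho_{t|t}(z_{1:t}, \{b_t\})\, q_1^m(b_1, du_1)}
\prod_{s=2}^{t}
\frac{\xi_s((u_{1:s}, z_{1:s-1}, o_{s-1}, b_{s-1}), \{b_s\})\,\mathbb{1}_{\{b_{s-1}\}}(a_{s-1}^{b_s})}{R^m_{s-1}((z_{1:s-1}, o_{s-1}, b_s), \{b_{s-1}\})}\,
\frac{\kappa_s(u_{1:s-1}, du_s)}{Q^m_s((z_{1:s-1}, o_{s-1}, a_{s-1}, b_s), du_s)}
\tag{2.5}
\]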

is well defined. Here, we have slightly abused notation by writing Radon–Nikodým derivatives as $\mu(dx)/\nu(dx) := [d\mu/d\nu](x)$, for any two measures $\mu$ and $\nu$, and where we have defined the following quantities.

$\kappa_s \in \mathcal{K}(\mathbf{X}_{1:s-1}, \mathbf{X}_s)$ is a kernel which extends the Step-$(s-1)$ target measure to the Step-$s$ target measure, i.e. it satisfies $\gamma_{s-1} \otimes \kappa_s = \gamma_s$.

$R^m_{s-1}((z_{1:s-1}, o_{s-1}, n), \cdot)$ denotes the 'marginal' distribution of the $n$th parent index under $R_{s-1}((z_{1:s-1}, o_{s-1}), \cdot)$.

$Q^m_s((z_{1:s-1}, o_{s-1}, a_{s-1}, n), \cdot)$ denotes the 'marginal' distribution of the $n$th particle under $Q_s((z_{1:s-1}, o_{s-1}, a_{s-1}), \cdot)$. The marginal Step-1 proposal distribution $q^m_1(b_1, \cdot)$ is similarly defined.

2.8 Remark. The kernels $R^m_{s-1}$ and $Q^m_s$ induce marginal distributions only in the sense that they do not condition on the other parent indices or particles generated at Step $s$. They generally still depend on all the auxiliary variables, parent indices and particles sampled at previous steps. Indeed, this is why the 'conditional' SMC kernel does not represent a (full) conditional distribution under the distribution induced by the SMC algorithm, as pointed out in Remark 2.7 (see also Remark 1.16).

We will comment on particular choices for the kernels and measures guaranteeing the existence of the above importance weight in Section 2.3.

Distribution Over Particle Indices. The kernels $\xi_s$ introduced in the previous subsection need to be chosen carefully to preserve absolute continuity, especially if a non-exchangeable resampling scheme is used. A generally applicable choice considered in Lee, Murray and Johansen (in prep.), which is also implicitly used by most SMC algorithms, is to let $\xi_s$ be a time-reversal kernel of some stochastic kernel $\varsigma_s \in \mathcal{K}_1(\mathbf{X}_{1:s}, \mathbf{K}_s)$ (which defines a distribution over $B_t$) under $R^m_{s-1}$, as defined in Assumption 2.9.

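2.9 Assumption. $\xi_1 := \varsigma_1$ and, for $s > 1$,

\[
\xi_s((u_{1:s}, z_{1:s-1}, o_{s-1}, a_{s-1}^{b_s}), \{b_s\})
= \frac{R^m_{s-1}((z_{1:s-1}, o_{s-1}, b_s), \{a_{s-1}^{b_s}\})\,\varsigma_s(u_{1:s}, \{b_s\})}{\sum_{n=1}^{N_s} \varsigma_s(u_{1:s}, \{n\})\, R^m_{s-1}((z_{1:s-1}, o_{s-1}, n), \{a_{s-1}^{n}\})}.
\tag{2.6}
\]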

It often suffices to let $\varsigma_s(u_{1:s}, \cdot) = \mathrm{Unif}_{\mathbf{K}_s}$. However, more complex kernels are sometimes needed to ensure absolute continuity in Equation 2.5. For instance, a more complex kernel $\varsigma_s$ is needed in the discrete particle filter summarised in Subsection 2.3.4.

The main advantage of the time-reversal kernel is that the importance weight in Equation 2.5 depends on the resampling distribution only through the denominator in Equation 2.6. Hence, it is usually not necessary to require the resampling scheme to be exchangeable, even if we cannot evaluate the distribution implied by $R^m_{s-1}$. For instance, if we use an unbiased resampling scheme and if $\varsigma_s(u_{1:s}, \cdot) := \mathrm{Unif}_{\mathbf{K}_s}$, then $R^m_{s-1}$ drops out in the importance weights from Equation 2.5 because

\[
\frac{\xi_s((u_{1:s}, z_{1:s-1}, o_{s-1}, b_{s-1}), \{b_s\})}{R^m_{s-1}((z_{1:s-1}, o_{s-1}, b_s), \{b_{s-1}\})}
= \begin{cases}
1 / W_{s-1}^{b_{s-1}}(z_{1:s-1}), & \text{if we resample at Step } s, \\
\mathbb{1}_{\{b_{s-1}\}}(b_s), & \text{otherwise.}
\end{cases}
\]
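As a minimal sketch of why the marginal resampling kernel drops out, the following Python snippet computes the marginal parent-index distributions for stratified resampling (a non-exchangeable but unbiased scheme) and the induced time-reversal kernel under a uniform index kernel. It treats the parent value as fixed, as in Equation 2.5, and assumes the same number of particles before and after resampling. The function names `stratified_marginals` and `time_reversal_kernel` are hypothetical and do not come from the source.

```python
def stratified_marginals(weights):
    """Return R with R[n][k] = P(n-th parent index = k) under stratified resampling.

    Assumes `weights` are normalised and that N offspring are drawn from N parents.
    The n-th offspring uses a uniform draw on [n/N, (n+1)/N) pushed through the
    inverse CDF of the weights, so R[n][k] is N times the overlap of that stratum
    with the k-th CDF interval.
    """
    n_particles = len(weights)
    cum = [0.0]
    for w in weights:
        cum.append(cum[-1] + w)
    marginals = [[0.0] * n_particles for _ in range(n_particles)]
    for n in range(n_particles):
        lo, hi = n / n_particles, (n + 1) / n_particles
        for k in range(n_particles):
            overlap = min(hi, cum[k + 1]) - max(lo, cum[k])
            marginals[n][k] = n_particles * max(overlap, 0.0)
    return marginals


def time_reversal_kernel(marginals, varsigma, parent):
    """Distribution over the child index n, proportional to varsigma[n] * R[n][parent]."""
    unnormalised = [varsigma[n] * marginals[n][parent] for n in range(len(marginals))]
    total = sum(unnormalised)
    if total == 0.0:
        raise ValueError("parent has zero probability of being selected")
    return [u / total for u in unnormalised]


weights = [0.125, 0.25, 0.125, 0.5]
R = stratified_marginals(weights)
uniform = [1.0 / len(weights)] * len(weights)  # uniform index kernel (assumption)
xi = time_reversal_kernel(R, uniform, parent=1)
```

Under these assumptions the ratio `xi[n] / R[n][parent]` is the same for every child index `n` with `R[n][parent] > 0`: it depends on the resampling scheme only through the column sum of the marginals, i.e. the expected number of offspring of the parent.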

However, note that sampling according to $\xi_s$ (which depends on $R^m_{s-1}$) and $R^c_{s-1}$ is still required when sampling from the CSMC kernel. For various common resampling schemes, $R^m_{s-1}$ and $R^c_{s-1}$ are derived in Lee, Murray and Johansen (in prep.) and, for completeness, they are also stated in Appendix A of this work.

2.2.4 Rao–Blackwellisation

Let $\bar{\gamma}_t^{\mathrm{is},1} := \bar{w}_t(\bar{X}_t)\,\delta_{\bar{X}_t}$ be an IS approximation of the extended target measure $\bar{\gamma}_t$ based on a single sample $\bar{X}_t = (U_{1:t}, B_{1:t}, Z_{1:t}) \sim \bar{\psi}_t$. Of course, we are only interested in approximating the marginal $\gamma_t$ of $\bar{\gamma}_t$. The usual SMC approximation of this marginal measure, $\gamma_t^{\mathrm{smc},N_{1:t}}$, can be obtained by Rao–Blackwellising $\bar{\gamma}_t^{\mathrm{is},1}$ as described in Lee, Murray and Johansen (in prep.).

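More precisely, note that

\[
w_t^{b_{1:t}}(Z_{1:t}) := \mathbb{E}\bigl[\bar{w}_t(\bar{X}_t)\,\mathbb{1}_{\{b_{1:t}\}}(B_{1:t}) \bigm| Z_{1:t}\bigr]
\]

is non-zero only if $b_{1:t}$ coincides with a particle lineage under the SMC algorithm, i.e. if $b_{1:t} = B^n_{1:t|t}$, for some $n \in \mathbf{K}_t$. We can therefore identify $N_t$ (unnormalised) Step-$t$ particle weights, for $n \in \mathbf{K}_t$, as

\[
w_t^n(z_{1:t}) := w_t^{b^n_{1:t|t}}(z_{1:t}).
\]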

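For any $A \in \mathcal{B}(\mathbf{X}_{1:t})$, a MOSIS approximation of $\gamma_t(A)$ is thus given by

\[
\begin{aligned}
\gamma_t^{\mathrm{mosis},N_{1:t}}(A)
&= \mathbb{E}\bigl[\bar{\gamma}_t^{\mathrm{is},1}(A \times \bar{\mathbf{Z}}_t) \bigm| Z_{1:t}\bigr] \\
&= \sum_{b_{1:t} \in \mathbf{K}_{1:t}} w_t^{b_{1:t}}(Z_{1:t})\,\delta_{X_{1:t}^{b_{1:t}}}(A) \\
&= \sum_{n=1}^{N_t} w_t^n(Z_{1:t})\,\delta_{X_{1:t}^{B^n_{1:t|t}}}(A) \\
&= \gamma_t^{\mathrm{smc},N_{1:t}}(A).
\end{aligned}
\]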
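To illustrate the Rao–Blackwellisation step in the simplest case $t = 1$, the sketch below works on a finite state space, where Radon–Nikodým derivatives reduce to ratios of probability mass functions. It assumes (these are choices for illustration, not the source's) that all particles are proposed independently from the same distribution `q`, that the Step-1 index kernel is uniform, and that the final index is drawn from an arbitrary strictly positive distribution `rho`. All function and variable names are hypothetical.

```python
def extended_weight(b, particles, gamma, q, rho):
    """Step-1 extended importance weight for the lineage index b (0-based).

    gamma: unnormalised target masses, q: proposal masses (dicts keyed by state),
    rho: distribution from which the final index is drawn (list, strictly positive).
    """
    xi = 1.0 / len(particles)  # uniform index kernel (assumption)
    x = particles[b]
    return gamma[x] * xi / (rho[b] * q[x])


def rao_blackwellised_weights(particles, gamma, q, rho):
    """E[extended weight * 1{B = b} | particles] for each index b.

    Conditional on the particles, B = b has probability rho[b], so the
    expectation is rho[b] times the extended weight evaluated at b.
    """
    return [
        rho[b] * extended_weight(b, particles, gamma, q, rho)
        for b in range(len(particles))
    ]


gamma = {0: 0.5, 1: 1.5, 2: 2.0}   # unnormalised target on {0, 1, 2}
q = {0: 0.25, 1: 0.25, 2: 0.5}     # proposal on {0, 1, 2}
particles = [0, 2, 2]              # one realisation of three particles
rho = [0.2, 0.3, 0.5]              # arbitrary final-index distribution
weights = rao_blackwellised_weights(particles, gamma, q, rho)
```

In this toy setting `rho` cancels, so `weights[n]` equals `gamma[x] / (N * q[x])` for the `n`-th particle value `x`: the single-sample estimate on the extended space is averaged back into the familiar weighted particle approximation.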

The above construction immediately implies that the SMC estimate of the normalising constant, $z_t^{\mathrm{smc},N_{1:t}} = \gamma_t^{\mathrm{smc},N_{1:t}}(\mathbf{1}) = \bar{\gamma}_t^{\mathrm{is},1}(\mathbf{1})$, is a (one-sample) IS estimate and is therefore unbiased. Nonetheless, we stress again that the unbiasedness property alone does not ensure estimates that are useful in practice, i.e. estimates whose error can be controlled. Conditions under which this is guaranteed are summarised in Subsection 2.2.5.

Finally, recall that $\bar{w}_t = d\bar{\gamma}_t / d\bar{\psi}_t$. For later reference, we state the following slight generalisation of Andrieu et al. (2010, Theorem 2) (but which is really just a special case of Proposition 1.13).

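2.10 Proposition. Assume that $\rho_{t|t}(z_{1:t}, \{n\}) = W_t^n(z_{1:t})$, for any $(n, z_{1:t}) \in \mathbf{K}_t \times \mathbf{Z}_{1:t}$. Then $\bar{w}_t(\bar{x}_t) = z_t^{\mathrm{smc},N_{1:t}}$, for any $\bar{x}_t \in \bar{\mathbf{X}}_t$.

Proof. This follows immediately from the definition of $w_t^n(z_{1:t})$.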
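The statement of Proposition 2.10 and the unbiasedness of the normalising-constant estimate can be checked by hand in a toy case. The sketch below is restricted to $t = 1$ on a finite state space, and assumes independent particles from a common proposal `q`, a uniform Step-1 index kernel, and a final index drawn from the self-normalised importance weights; none of these names or choices come from the source. It evaluates the extended weight for every index and computes the expectation of the estimate exactly by enumerating all particle configurations.

```python
from itertools import product


def z_estimate(particles, gamma, q):
    """Average of the importance ratios gamma/q over the particles."""
    return sum(gamma[x] / q[x] for x in particles) / len(particles)


def extended_weight(b, particles, gamma, q):
    """Step-1 extended weight when the final index follows the normalised weights."""
    ratios = [gamma[x] / q[x] for x in particles]
    rho_b = ratios[b] / sum(ratios)   # normalised weight of particle b
    xi = 1.0 / len(particles)         # uniform index kernel (assumption)
    return ratios[b] * xi / rho_b


def expected_z_estimate(n_particles, gamma, q):
    """Exact expectation of z_estimate over all particle configurations."""
    total = 0.0
    for particles in product(sorted(q), repeat=n_particles):
        prob = 1.0
        for x in particles:
            prob *= q[x]
        total += prob * z_estimate(particles, gamma, q)
    return total


gamma = {0: 0.5, 1: 1.5, 2: 2.0}   # unnormalised target, total mass 4.0
q = {0: 0.25, 1: 0.25, 2: 0.5}     # proposal
particles = (0, 0, 2)
per_index = [extended_weight(b, particles, gamma, q) for b in range(3)]
mean_estimate = expected_z_estimate(2, gamma, q)
```

Here every entry of `per_index` coincides with `z_estimate(particles, gamma, q)`, so the extended weight does not depend on the selected index, and `mean_estimate` matches the total mass of `gamma`.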