Colour Particle Filter - State Estimation of Dynamic Systems

4.2 State Estimation of Dynamic Systems

4.2.4 Colour Particle Filter

The colour particle filter is one of the tracking algorithms used in this thesis. This section provides an overview of the filter operation and further expands the concept to multiple target estimation.

The Particle Filter and its variations have been applied to the tracking of objects in video sequences. One of the earliest methods was the CONDENSATION algorithm (CONditional DENSity PropagATION) (Isard, 1998). The algorithm is a non-linear Particle Filter which uses shape contours to form a state vector. Another use of the Particle Filter in the context of video originates from the track-before-detect (TBD) concept. The track-before-detect idea allows the system to detect stealthy targets which, from the sensor point of view, tend to blend into the background. Typical sensor processing involves thresholding, which aims to eliminate noise before tracking and data association are performed. This means that targets with low detection profiles are likely to remain undetected in this scenario. The TBD concept works on unthresholded data and, by considering several scans of data, detects patterns which are not random noise and are therefore possible targets. However, early TBD algorithms were batch methods which were computationally intensive and which prohibited or penalised deviations from straight-line motion. These algorithms were based on the Hough transform, dynamic programming and maximum likelihood estimation (Ristic et al., 2004). Salmond and Birch (2001) proposed a TBD algorithm based on the Particle Filter, and the concept was applied successfully in the computer vision field. The main Particle Filter technique used in computer vision uses colour via a coloured target template. The measurement model is based on a randomly selected area of an image which is compared with the template to determine whether this area represents the target. The following subsection describes the comparison operation.

Measurement Model

The state space model represents an area of the image. The shape of this area can be rectangular, elliptical or more complex depending on application requirements. The state vector for each particle can therefore include the centroid of the area, the width and height (Czyz et al., 2005). Scale and velocity components may also be included (Pérez et al., 2002).

The measurement model is built using a histogram based on the Hue Saturation Value (HSV) colour space (Pérez et al., 2002). HSV often deals better with illumination variation than the commonly used RGB colour space. The concept originates from the mean-shift tracker by Comaniciu and Meer (2002). The idea behind using a colour histogram is that the target representation becomes invariant to shape deformations. However, if the background is of a similar colour to the target, problems may arise with the detection of true targets.

As with any histogram, the method needs to determine the number of bins required to represent colour. The number of bins in the histogram is $N = N_h N_s + N_v$, where $N_h$, $N_s$ and $N_v$ are the number of bins for each component of the HSV colour space. A pixel at position $u$ can be represented by the vector of its HSV components, $y(u) = [h(u), s(u), v(u)]^T$. Each HSV component can be allocated to its own numbered "bin" and the bin representation of the colour vector $y(u)$ denoted as $[b_h(u), b_s(u), b_v(u)]^T$. The overall bin number, $b(u)$, for pixel $y(u)$ is defined as

\[
b(u) =
\begin{cases}
b_s(u)\,N_h + b_h(u) & \text{if } s(u) \ge 0.1 \text{ and } v(u) \ge 0.2, \\
b_v(u) + N_h N_s & \text{otherwise.}
\end{cases}
\tag{4.22}
\]

This equation reflects that the hue information is only reliable when the saturation component is at least 0.1 and the value component is at least 0.2 (Cai, 2005). Otherwise, only the value component is used for the bin calculation.
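The bin assignment of Eq. 4.22 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: HSV components are taken to be scaled to [0, 1], bins are numbered from zero, and the function name `hsv_bin_index` and the default bin counts of 10 are hypothetical choices rather than values from this work.

```python
import numpy as np

def hsv_bin_index(h, s, v, n_h=10, n_s=10, n_v=10, s_min=0.1, v_min=0.2):
    """Zero-based overall bin index b(u) in the spirit of Eq. 4.22.

    Assumes h, s and v are scaled to [0, 1]; bin counts are illustrative.
    """
    h, s, v = (np.asarray(c, dtype=float) for c in (h, s, v))
    # Quantise each component; the minimum keeps a value of exactly 1.0 in the last bin.
    b_h = np.minimum((h * n_h).astype(int), n_h - 1)
    b_s = np.minimum((s * n_s).astype(int), n_s - 1)
    b_v = np.minimum((v * n_v).astype(int), n_v - 1)
    # Hue and saturation are trusted only for sufficiently saturated, bright pixels.
    colour_ok = (s >= s_min) & (v >= v_min)
    # Colour bins occupy indices 0..n_h*n_s-1; value-only bins follow them.
    return np.where(colour_ok, b_s * n_h + b_h, b_v + n_h * n_s)
```

With these defaults the histogram has 10 x 10 + 10 = 110 bins, and pixels with low saturation or low value fall into the last 10 value-only bins.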

The histogram is calculated over the region $R(x_k)$, defined by the state vector $x_k$ at time $k$, and can be denoted as $Q(x_k) = \{q(n; x_k)\}_{n=1,\dots,N}$, where $q(n; x_k)$ is given by

\[
q(n; x_k) = C \sum_{u \in R(x_k)} \delta[b(u) - n]
\tag{4.23}
\]

where $\delta$ is the Kronecker delta function, $C$ is a normalisation constant, $u$ is any pixel within the region $R(x_k)$ and $\sum_{n=1}^{N} q(n; x_k) = 1$. The reference histogram, $q^*(n; x_0)$, of the colour target template can then be compared with the histogram of the candidate region defined by $x_k$ using the Bhattacharyya coefficient (Cai, 2005). The distance metric, $D$, between candidate region $x_k$ and template region $x_0$ is defined as

\[
D(x_k, x_0) = \left( 1 - \sum_{n=1}^{N} \left[ q(n; x_k)\, q^*(n; x_0) \right]^{1/2} \right)^{1/2}
\tag{4.24}
\]
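As a minimal sketch of Eqs. 4.23 and 4.24, the following Python functions build a normalised histogram from precomputed bin indices and compare two histograms. The function names are hypothetical, bins are assumed to be numbered from zero, and the normalisation constant $C$ is taken to be the reciprocal of the pixel count, which is one way of making the histogram sum to one.

```python
import numpy as np

def colour_histogram(bin_indices, n_bins):
    """Normalised histogram q(n; x_k) over the pixels of a region (cf. Eq. 4.23).

    bin_indices holds one zero-based bin index b(u) per pixel in the region.
    """
    counts = np.bincount(np.ravel(bin_indices), minlength=n_bins).astype(float)
    total = counts.sum()
    # Assumed normalisation: C = 1 / (number of pixels), so the bins sum to one.
    return counts / total if total > 0 else counts

def bhattacharyya_distance(q, q_ref):
    """Distance D between a candidate and a reference histogram (cf. Eq. 4.24)."""
    coefficient = np.sum(np.sqrt(q * q_ref))
    # Clamp at zero to guard against tiny negative values caused by rounding.
    return float(np.sqrt(max(0.0, 1.0 - coefficient)))
```

Identical histograms give a distance of zero, while histograms with no bins in common give the maximum distance of one.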

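This metric, written $D_k = D(x_k, x_0)$, is used in the calculation of the likelihood function (Eq. 4.20):

\[
p(z_k \mid x_k) \propto \frac{1}{\sqrt{2\pi}\,\sigma} \exp\left( -\frac{D_k^2}{2\sigma^2} \right)
\tag{4.25}
\]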

where σ is a design parameter representing the standard deviation of a Gaussian density (Czyz et al., 2007).
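To illustrate how the distance feeds the likelihood of Eq. 4.25, the sketch below evaluates the Gaussian-shaped likelihood for a set of particle distances and normalises the result. The value `sigma=0.2` is a hypothetical setting, and using the normalised likelihoods directly as particle weights assumes a bootstrap-style filter; neither is taken from this work.

```python
import numpy as np

def colour_likelihood(distances, sigma=0.2):
    """Likelihood of Eq. 4.25, up to proportionality, for one or more distances D_k."""
    d = np.asarray(distances, dtype=float)
    return np.exp(-d**2 / (2.0 * sigma**2)) / (np.sqrt(2.0 * np.pi) * sigma)

def normalised_weights(distances, sigma=0.2):
    """Normalise per-particle likelihoods so they sum to one (assumed weighting scheme)."""
    likelihood = colour_likelihood(distances, sigma)
    return likelihood / likelihood.sum()
```

A smaller $\sigma$ makes the likelihood more selective, so particles whose regions differ even slightly from the template receive much lower weight.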

Template matching can be further enhanced by incorporating the image's background information. Given that a background image is available, the Bhattacharyya distance between the template histogram and the candidate region in the background image is given by

\[
D^B(x_k^B, x_0) = \left( 1 - \sum_{n=1}^{N} \left[ q^B(n; x_k)\, q^*(n; x_0) \right]^{1/2} \right)^{1/2}
\tag{4.26}
\]

and the likelihood function in Eq. 4.25 becomes

\[
p(z_k \mid x_k) \propto \frac{1}{\sqrt{2\pi}\,\sigma} \exp\left( -\frac{1}{2\sigma^2} \left[ (D_k)^2 - (D_k^B)^2 \right] \right)
\tag{4.27}
\]

Multiple Targets

When dealing with multiple targets, standard particle filters tend to converge to a single mode of the posterior distribution and end up tracking just one target. To avoid this, a scheme is required that can maintain multiple modes, one for each target. If the number of targets is known, the method used by Czyz et al. (2007) can be applied, in which an existence variable $E_k$ denotes the number of targets in the current frame.

The existence variable can be defined as $E_k \in \mathcal{E} = \{0, 1, \dots, M\}$, where $M$ is the maximum expected number of targets. It evolves as a Markov chain and can be described by a transitional probability matrix (TPM) $\Pi = [\pi_{ij}]$, where

\[
\pi_{ij} = \Pr\{E_k = j \mid E_{k-1} = i\}, \qquad i, j \in \mathcal{E}
\tag{4.28}
\]

is the probability of transition from $i$ at time $k-1$ to $j$ at time $k$ (Czyz et al., 2007). The TPM for two targets ($M = 2$) can be described as:

\[
\Pi =
\begin{bmatrix}
(1 - P_b) & P_b & 0 \\
P_d & (1 - P_d - P_m) & P_m \\
0 & P_r & (1 - P_r)
\end{bmatrix}
\tag{4.29}
\]

where $P_b$ and $P_d$ are the probabilities of target "birth" and "death" respectively, and $P_m$ and $P_r$ are the probabilities that the number of targets will multiply or reduce respectively.
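The evolution of the existence variable under the TPM of Eq. 4.29 can be sketched as follows. The function names and the example probabilities used in the usage note are hypothetical, and sampling each particle by inverting the cumulative row of the TPM is one straightforward way to simulate a Markov chain, not necessarily the procedure used in this work.

```python
import numpy as np

def transition_matrix(p_b, p_d, p_m, p_r):
    """TPM of Eq. 4.29 for at most two targets (states 0, 1 and 2)."""
    tpm = np.array([
        [1.0 - p_b, p_b, 0.0],
        [p_d, 1.0 - p_d - p_m, p_m],
        [0.0, p_r, 1.0 - p_r],
    ])
    if np.any(tpm < 0.0):
        raise ValueError("transition probabilities must be non-negative")
    return tpm

def propagate_existence(e_prev, tpm, rng):
    """Draw E_k for every particle given its E_{k-1} by inverse-CDF sampling."""
    e_prev = np.asarray(e_prev, dtype=int)
    cdf = np.cumsum(tpm[e_prev], axis=1)        # one cumulative row per particle
    u = rng.random(e_prev.shape[0])
    e_next = (u[:, None] > cdf).sum(axis=1)     # first state whose CDF reaches u
    return np.minimum(e_next, tpm.shape[0] - 1)  # guard against rounding in the CDF
```

For example, `propagate_existence(np.zeros(500, dtype=int), transition_matrix(0.05, 0.05, 0.05, 0.05), np.random.default_rng(0))` moves a small fraction of particles from zero targets to one, and none directly to two, because the corresponding TPM entry is zero.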

The multi-target state vector for each particle $n$ can then be redefined as

\[
y_k^n = [E_k^n, x_{1,k}^n, \dots, x_{E_k^n,k}^n] \qquad (n = 1, \dots, N)
\tag{4.30}
\]

where $N$ is the number of particles (not the number of histogram bins of Eq. 4.23), $E_k^n$ is the variable denoting the number of targets in the state vector, and $x_{i,k}^n$ is the state vector for each target, where $i = 1, \dots, E_k^n$. Therefore the state vector $y_k$ can have a variable length depending on the value of $E_k$:

\[
y_k =
\begin{cases}
E_k & \text{if } E_k = 0, \\
[x_{1,k}^T \; E_k]^T & \text{if } E_k = 1, \\
[x_{1,k}^T \; x_{2,k}^T \; E_k]^T & \text{if } E_k = 2, \\
\quad \vdots & \quad \vdots \\
[x_{1,k}^T \; \dots \; x_{M,k}^T \; E_k]^T & \text{if } E_k = M.
\end{cases}
\tag{4.31}
\]

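The likelihood for each particle $n$ is derived from Eq. 4.27 but needs to incorporate $E_k$:

\[
L_k^n(E_k^n) = \exp\left\{ -\frac{1}{2\sigma^2} \sum_{i=1}^{E_k^n} \left[ \left(D_{i,k}^n\right)^2 - \left(D_{i,k}^{n,B}\right)^2 \right] \right\}
\tag{4.32}
\]

Importance weights can then be calculated based on the likelihood and $E_k$:

\[
w_k^n =
\begin{cases}
1 & \text{if } E_k^n = 0, \\
L_k^n(E_k^n) & \text{if } E_k^n > 0.
\end{cases}
\tag{4.33}
\]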
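Eqs. 4.32 and 4.33 can be illustrated for a single particle with the sketch below. It assumes the template and background distances for each of the particle's targets have already been computed; the function names and the value `sigma=0.2` are hypothetical.

```python
import numpy as np

def multi_target_likelihood(d_target, d_background, sigma=0.2):
    """Likelihood of Eq. 4.32 for one particle.

    d_target[i] and d_background[i] are the template and background distances
    for the particle's i-th target.
    """
    d_t = np.asarray(d_target, dtype=float)
    d_b = np.asarray(d_background, dtype=float)
    return float(np.exp(-np.sum(d_t**2 - d_b**2) / (2.0 * sigma**2)))

def importance_weight(d_target, d_background, sigma=0.2):
    """Unnormalised weight of Eq. 4.33: one when the particle holds no targets."""
    if len(d_target) == 0:
        return 1.0
    return multi_target_likelihood(d_target, d_background, sigma)
```

A particle whose target regions match the template better than they match the background receives a weight above one, so it outweighs a particle that hypothesises no targets; the opposite case yields a weight below one.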

Czyz et al. (2007) further describe how the state vector for each particle changes based on transitions of the existence variable. They additionally suggest that, to prevent multiple targets from being tracked on the same image region, a condition be applied such that if the Euclidean distance between two target state vectors is too small, the corresponding weight, $\tilde{w}^n$, is reset to zero ($R_2$ is a design parameter). Pseudo code of the colour particle filter is shown in Table A.1 and further details regarding the implementation will be discussed in Chapter 7. The probability, $\Pr$, of the number of targets detected in this work is

\[
\Pr(E_k = m \mid Z_k) = \frac{1}{N} \sum_{n=1}^{N} \delta(E_k^n, m)
\tag{4.34}
\]

where $m = 1, \dots, M$.

The estimate of state, $\hat{x}_i$ for $i = 1, \dots, \hat{m}$, is

\[
\hat{x}_{i,k|k} = \frac{\sum_{n=1}^{N} x_{i,k}^n \, \delta(E_k^n, i)}{\sum_{n=1}^{N} \delta(E_k^n, i)}
\tag{4.35}
\]
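The two estimators of Eqs. 4.34 and 4.35 can be sketched as below, following the equations as written. The array layout is an assumption made for illustration: `states[n, i - 1]` is taken to hold the state vector of target $i$ in particle $n$, and the function names are hypothetical. The first function also reports the probability of zero targets, which Eq. 4.34 does not list.

```python
import numpy as np

def target_count_probabilities(existence, max_targets):
    """Fraction of particles with E_k^n = m, for m = 0..M (cf. Eq. 4.34)."""
    existence = np.asarray(existence, dtype=int)
    return np.bincount(existence, minlength=max_targets + 1) / existence.shape[0]

def estimate_target_state(states, existence, i):
    """State estimate for target i following Eq. 4.35 as written.

    Averages x_{i,k}^n over the particles selected by delta(E_k^n, i).
    Returns None when no particle is selected.
    """
    mask = np.asarray(existence, dtype=int) == i
    if not mask.any():
        return None
    return states[mask, i - 1].mean(axis=0)
```

The number of targets can then be estimated as the most probable count, for example with `int(np.argmax(target_count_probabilities(existence, 2)))`.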

In document Computational techniques for automated tracking and analysis of fish movement in controlled aquatic environments (Page 58-62)