
Chapter 5 Region-Based Gait Analysis in Image and Feature Spaces

5.3 Proposed method: ReG-IF

5.3.2 Module 2, phase 1: SDIS

Similar to [22], the gait period of a subject in ReG-IF corresponds to the number of frames between two frames of a gait sequence with the maximum number of contour points enclosed by the bottom of the bounding rectangle and the anatomical position of the subject’s hand measured from the bottom of the bounding rectangle, i.e., $0.377H$, where $H$ is the height of the bounding rectangle. This foreground segment is used because it is not distorted by self-occlusion due to arm-swing. After estimating the gait period from the lateral-view video sequence of the subject, the ten phases of the gait period [7] (i.e., initial contact, ending swing, 2 mid-swings, pre-swing, propulsion, 3 midstances and double support), which capture most of the significant gait characteristics, are extracted using region-of-interest based contour matching following STM-SPP [7]. A detailed description and pictorial illustration of the ten phases of a gait period are provided in STM-SPP [7] (and presented in Chapter 3).
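The period estimate described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the ReG-IF implementation: the helper names `lower_contour_count` and `gait_period` are hypothetical, a contour point is assumed to be a foreground pixel with at least one background 4-neighbour, and the period is taken as the distance between the first two local maxima of the per-frame counts.

```python
def lower_contour_count(silhouette, fraction=0.377):
    """Count contour pixels in the lowest `fraction` of the bounding rectangle.

    `silhouette` is a list of rows of 0/1 values. A contour pixel is assumed
    to be a foreground pixel with at least one background 4-neighbour.
    """
    rows = [r for r, row in enumerate(silhouette) if any(row)]
    top, bottom = min(rows), max(rows)
    height = bottom - top + 1                    # H of the bounding rectangle
    first = bottom - int(fraction * height) + 1  # first row of the lower segment

    def is_background(r, c):
        inside = 0 <= r < len(silhouette) and 0 <= c < len(silhouette[0])
        return not inside or not silhouette[r][c]

    count = 0
    for r in range(first, bottom + 1):
        for c, value in enumerate(silhouette[r]):
            if value and any(is_background(r + dr, c + dc)
                             for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1))):
                count += 1
    return count


def gait_period(counts):
    """Frames between the first two local maxima of the per-frame counts."""
    peaks = [i for i in range(1, len(counts) - 1)
             if counts[i - 1] < counts[i] >= counts[i + 1]]
    if len(peaks) < 2:
        raise ValueError("need at least two maxima to estimate a period")
    return peaks[1] - peaks[0]
```

In use, `lower_contour_count` would be evaluated on every frame of the lateral-view sequence and the resulting list passed to `gait_period`. Real sequences are noisy, so the counts would probably need smoothing before peak picking.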

We analyse the silhouettes at the ten phases using Lp-Gf and Hp-Gf in the frequency domain at different cut-off frequencies to obtain the SDIS for a subject as follows. The DFT of an $M \times N$ silhouette image $I(x,y)$ is

$$
\mathrm{DFT}(u,v) = \frac{1}{MN} \sum_{x=0}^{M-1} \sum_{y=0}^{N-1} I(x,y)\, e^{-j2\pi\left(\frac{ux}{M} + \frac{vy}{N}\right)}, \tag{5.1}
$$

where $u = 0, 1, 2, \ldots, M-1$ and $v = 0, 1, 2, \ldots, N-1$ are frequency variables. The Fourier transformed silhouette, i.e., $\mathrm{DFT}(u,v)$, is translation invariant, but since it retains rotation, it is subjected to a shift operation to ensure that the zero-frequency components are at the centre. To represent the inner part of a silhouette gradually towards the centre more than its boundary, Lp-Gf is applied to the Fourier transformed image using selected cut-off frequencies, i.e.,

$$
\mathrm{DFT}_L(u,v) = \mathrm{DFT}(u,v)\, e^{-(u^2+v^2)/2D^2}, \tag{5.2}
$$

where $e^{-(u^2+v^2)/2D^2}$ is the transfer function of Lp-Gf in the frequency domain [98], and $\mathrm{DFT}_L(u,v)$ denotes the image filtered using Lp-Gf at the cut-off frequency $D$. The filtered silhouette at cut-off frequency $D$ in the image space is obtained by applying the inverse DFT, i.e.,

$$
I(x,y) = \sum_{u=0}^{M-1} \sum_{v=0}^{N-1} \mathrm{DFT}_L(u,v)\, e^{j2\pi\left(\frac{ux}{M} + \frac{vy}{N}\right)}. \tag{5.3}
$$
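The chain of Eqs. (5.1) to (5.3) can be sketched with the standard library alone, so that each sum maps onto a term of the equations. The code below is an illustrative sketch under assumptions, not the implementation used in the experiments: the function names are hypothetical, the forward transform uses the conventional negative exponent so that Eq. (5.3) inverts it, and the shift operation is replaced by the equivalent step of measuring $u$ and $v$ as signed distances from the zero-frequency term. An FFT library would replace the direct sums in practice.

```python
import cmath
import math


def dft2(img):
    """Forward 2D DFT with the 1/(MN) normalisation of Eq. (5.1)."""
    M, N = len(img), len(img[0])
    return [[sum(img[x][y] * cmath.exp(-2j * math.pi * (u * x / M + v * y / N))
                 for x in range(M) for y in range(N)) / (M * N)
             for v in range(N)] for u in range(M)]


def idft2(spec):
    """Inverse 2D DFT of Eq. (5.3): positive exponent, no normalisation."""
    M, N = len(spec), len(spec[0])
    return [[sum(spec[u][v] * cmath.exp(2j * math.pi * (u * x / M + v * y / N))
                 for u in range(M) for v in range(N))
             for y in range(N)] for x in range(M)]


def centred(k, n):
    """Signed frequency index: distance from the zero-frequency component."""
    return k if k <= n // 2 else k - n


def gaussian_lowpass(img, D):
    """Apply the Lp-Gf transfer function of Eq. (5.2) at cut-off D."""
    M, N = len(img), len(img[0])
    spec = dft2(img)
    filtered = [[spec[u][v] * math.exp(-(centred(u, M) ** 2 + centred(v, N) ** 2)
                                       / (2 * D ** 2))
                 for v in range(N)] for u in range(M)]
    # The transfer function is symmetric, so a real image stays real
    # up to rounding error; the imaginary residue is discarded.
    return [[z.real for z in row] for row in idft2(filtered)]
```

Because the transfer function equals 1 at $u = v = 0$, the filter preserves the mean intensity of the silhouette. A very large $D$ returns the input essentially unchanged, while a small $D$ spreads the foreground towards a uniform grey.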

Fig. 5.1(a)-(k) show the silhouettes filtered by Lp-Gf with decreasing cut-off frequency. Since Lp-Gf attenuates high frequency components, it blurs the silhouette and thus smooths detailed clothing curvatures at the silhouette boundary. As the cut-off frequency decreases, the increased blurring causes a greater loss of boundary and exterior regions, gradually highlighting the inner shape characteristics. Note that Gaussian functions in the spatial and frequency domains behave reciprocally; hence a decrease in the standard deviation of Lp-Gf in the frequency domain results in more blurring and vice versa, while the reverse is true in the spatial domain [98].
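The reciprocity can be made explicit with the continuous Fourier pair of a Gaussian, which is a standard result rather than something derived in this chapter. As a sketch, treating $u$ and $v$ as continuous frequencies in cycles per unit length, the spatial kernel $h(x,y)$ (a symbol introduced here for illustration) corresponding to the Lp-Gf transfer function is

$$
e^{-(u^2+v^2)/2D^2} \;\longleftrightarrow\; h(x,y) = 2\pi D^2\, e^{-2\pi^2 D^2 (x^2+y^2)}, \qquad \sigma_{\text{spatial}} = \frac{1}{2\pi D}.
$$

Halving $D$ therefore doubles the spatial standard deviation of the blur. For DFT indices on an $M$-point axis, the corresponding width is approximately $M/(2\pi D)$ pixels.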

To represent the boundary and exterior regions of a silhouette more than its central part, Hp-Gf is applied to the silhouette at the same cut-off frequencies [98], i.e.,

$$
\mathrm{DFT}_H(u,v) = \mathrm{DFT}(u,v)\left(1 - e^{-(u^2+v^2)/2D^2}\right), \tag{5.4}
$$

where $1 - e^{-(u^2+v^2)/2D^2}$ is the transfer function of Hp-Gf with cut-off frequency $D$ [98]. The filtered silhouette is similarly obtained using Eq. (5.3), and Fig. 5.1(l)-(v) show the silhouettes filtered by Hp-Gf with decreasing cut-off frequencies. Hp-Gf emphasises the high frequency components but retains limited low frequency components, thus making the boundary characteristics of a silhouette more prominent, and its application represents the exterior regions of a silhouette as the cut-off frequency decreases. Note that region-based shape descriptors using Lp-Gf and Hp-Gf are in the image space. Also, since ReG-IF is defined on lateral view of silhouettes, and all silhouettes are scale-normalised and translation invariant, it is unnecessary to reapply DFT to the polar transformed silhouettes to achieve rotation invariance as in [113].

[Figure 5.1: panels (a)-(k) in Row 1 and (l)-(v) in Row 2]

Figure 5.1: Application of Lp-Gf (Row 1) and Hp-Gf (Row 2) to a subject’s silhouette from USF 2.1 dataset with decreasing cut-off frequency: (a) & (l) $D_1 = 20$; (b) & (m) $D_2 = 18$; (c) & (n) $D_3 = 16$; (d) & (o) $D_4 = 14$; (e) & (p) $D_5 = 12$; (f) & (q) $D_6 = 10$; (g) & (r) $D_7 = 8$; (h) & (s) $D_8 = 6$; (i) & (t) $D_9 = 4$; (j) & (u) $D_{10} = 3$; and (k) & (v) $D_{11} = 1$.
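Comparing Eqs. (5.2) and (5.4) shows that the two filters are complementary at any given cut-off frequency. The short derivation below is a worked consequence of those two equations, with $I_L$ and $I_H$ introduced here only as shorthand for the Lp-Gf and Hp-Gf outputs of Eq. (5.3):

$$
\mathrm{DFT}_L(u,v) + \mathrm{DFT}_H(u,v) = \mathrm{DFT}(u,v)\left[e^{-(u^2+v^2)/2D^2} + 1 - e^{-(u^2+v^2)/2D^2}\right] = \mathrm{DFT}(u,v),
$$

and, since the inverse DFT is linear,

$$
I_L(x,y) + I_H(x,y) = I(x,y).
$$

As $D$ becomes small, the Lp-Gf transfer function vanishes everywhere except at $u = v = 0$, so $I_L$ tends to the mean intensity and $I_H$ tends to the original silhouette with its mean removed.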

The focus value, which is used to measure the degree of sharpness of an image, is the maximum for the most focused image, i.e., the original silhouette. It is inversely proportional to the image blurriness caused by the Gaussian filtering at different cut-off frequencies. Common methods for computing focus values of an image include spatial domain based methods, e.g., Tenengrad [114] and sum modified Laplacian [115], and wavelet based methods [116]. The first level 2D Daubechies-6 wavelet decomposition of a silhouette image $f(x,y)$ of size $M \times N$ results in four subband images, $W_{LL}$, $W_{HL}$, $W_{LH}$ and $W_{HH}$, where $L$ and $H$ respectively denote lowpass filtered and highpass filtered, and their order denotes the order of the filtering applied, e.g., $W_{HL}$ is a subband image obtained by highpass filtering followed by lowpass filtering. The focus value of a silhouette is measured using [116]

$$
FV = \frac{1}{MN} \sum_{y=0}^{N} \sum_{x=0}^{M} \left( |W_{HL}(x,y)| + |W_{LH}(x,y)| + |W_{HH}(x,y)| \right). \tag{5.5}
$$

Since this wavelet based method provides the sharpest focus measure profile and higher depth resolution than the spatial domain based methods due to the localised support property of wavelet basis [116], it is used to compute the focus value of low resolution silhouettes.
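The structure of Eq. (5.5) can be illustrated with a dependency-free sketch. Note the substitution: the code uses a one-level Haar decomposition as a stand-in for the Daubechies-6 wavelet named above, so its values will differ from those reported in this chapter. The function name `focus_value` is hypothetical, and the image is assumed to have even height and width.

```python
def focus_value(image):
    """Eq. (5.5) evaluated on a one-level Haar decomposition.

    Haar is a stand-in for Daubechies-6 (an assumption made to avoid
    external dependencies). `image` is a list of rows of numbers with
    even height and width.
    """
    M, N = len(image), len(image[0])
    total = 0.0
    for r in range(0, M, 2):
        for c in range(0, N, 2):
            a, b = image[r][c], image[r][c + 1]
            d, e = image[r + 1][c], image[r + 1][c + 1]
            # The three detail coefficients of this 2x2 block. Which one is
            # labelled HL or LH depends on convention; Eq. (5.5) sums all three.
            horizontal = (a - b + d - e) / 2
            vertical = (a + b - d - e) / 2
            diagonal = (a - b - d + e) / 2
            total += abs(horizontal) + abs(vertical) + abs(diagonal)
    return total / (M * N)
```

A uniform image gives a focus value of zero, and softening an edge lowers the value relative to the sharp edge, which is the behaviour the focus measure relies on.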

The focus value of the original silhouette always reduces to below 50% if it is filtered by Lp-Gf at cut-off frequency $D = 20$, and decreases linearly as the blurriness increases with decreasing cut-off frequencies. If the cut-off frequency is decreased further to below $D = 8$, the focus value decreases abruptly. The focus value becomes infinitesimally small if $D < 4$, resulting in an excessively blurred silhouette without any discriminating information (e.g., Fig. 5.1(j)-(k)).

The boundary of a silhouette is obtained by the application of Hp-Gf using $D$ approximately in the range [18, 22] for most of the silhouettes of the USF dataset [22]. Since the silhouette boundary corresponds to the sharpest image, e.g., Fig. 5.1(l)-(m), the focus value of a silhouette filtered by Hp-Gf using $D$ in this range remains the maximum, which is considerably higher than the focus value of the original silhouette (see Fig. 5.2(b)). With a further decrease in cut-off frequency, the focus value decreases linearly with the decrease in the sharpness of the image as the silhouette is reconstructed by regaining its central region. The focus value is nearly identical to that of the original silhouette in the range $1 \le D < 4$ because the original silhouette is almost fully reconstructed. These silhouettes hardly contribute to gait recognition as their shape is considerably affected by the covariates. Since the boundary as well as central shape characteristics of a silhouette are considered separately by using Lp-Gf and Hp-Gf, it is not necessary to process all silhouettes, which would only increase the computational complexity. Thus, $D$ in [4, 20] is considered to be the ideal range of cut-off frequencies for the Gaussian filters to obtain SDIS.

Fig. 5.2(a) and (b) respectively show the normalised focus value w.r.t. decreasing cut-off frequencies in the range [22, 0] of a silhouette filtered by Lp-Gf and Hp-Gf, where normalised focus values are obtained by dividing each of the focus values by the maximum focus value in the range [22, 0]. Since the focus value of a filtered silhouette maintains a linear relationship with the cut-off frequencies of Gaussian filters as shown in Fig. 5.2, the smallest set of cut-off frequencies that gives the best performance for SDIS is determined by analysing the WAvgI at rank-1 (see Section 5.4.2) of SDIS on the USF 2.1 dataset vs sets of cut-off frequencies, where each set consists of an increasing number of equidistant cut-off frequencies in the range [4, 20]. Fig. 5.3(a) shows the results for the sets of cut-off frequencies A = {4, 20}, B = {4, 12, 20}, C = {4, 9, 15, 20}, D = {4, 8, 12, 16, 20}, E = {4, 7, 10, 14, 17, 20}, F = {4, 7, 9, 12, 15, 17, 20} and G = {4, 6, 9, 11, 13, 15, 18, 20} on the USF 2.1 dataset. It shows that set D consists of the minimum number of cut-off frequencies for SDIS to achieve the highest WAvgI at rank-1, which is significantly higher than the WAvgIs obtained by using only the individual cut-off frequencies comprising the set D (Fig. 5.3(b)). Any increase in the number of cut-off frequencies used in D, e.g., to form sets E, F and G, only increases computational complexity without having any effect on the identification rate.

Thus, the $N \times M$ images obtained by applying Lp-Gf and Hp-Gf each at the cut-off frequencies comprising the set D (i.e., five cut-off frequencies for each of the two filters, giving a total of 10 filtered images per phase) on the silhouettes of ten phases of a gait period for a subject are concatenated by arranging the images at ten phases of a subject row-wise, and the corresponding filtered images by Lp-Gf and Hp-Gf column-wise, to form a new 2D $10N \times 10M$ concatenated silhouette image (CSI). The resulting feature image corresponding to SDIS of a gallery subject is used to form part of the gallery database.
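The construction of the CSI can be sketched as a simple tiling. This is an illustration under assumptions rather than the original code: `build_csi` is a hypothetical helper, each filtered image is represented as a list of $N$ rows of $M$ values, and the order of the Lp-Gf and Hp-Gf images within a block-row is taken to be the order in which they are supplied.

```python
def build_csi(filtered):
    """Tile filtered[phase][k] (each an N x M list of rows) into one image.

    Phases are stacked as block-rows; the filtered versions of each phase
    are placed side by side as block-columns.
    """
    csi = []
    for phase_images in filtered:           # one block-row per gait phase
        n_rows = len(phase_images[0])
        for r in range(n_rows):
            row = []
            for image in phase_images:      # filtered images of this phase
                row.extend(image[r])
            csi.append(row)
    return csi
```

With ten phases and ten filtered images per phase, the result has $10N$ rows and $10M$ columns, matching the CSI size stated above.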

[Figure 5.2: panels (a) and (b); x-axis: Cut-off frequencies (0 to 22); y-axis: Normalised Focus Value (0 to 1 in (a), 0.92 to 1 in (b))]

Figure 5.2: Normalised focus value w.r.t. decreasing cut-off frequencies of a silhouette from USF 2.1 dataset filtered using (a) Lp-Gf; and (b) Hp-Gf. ’-’ denotes focus value of the original silhouette.

[Figure 5.3: panel (a) x-axis: Set of cut-off frequencies (A to G); panel (b) x-axis: Cut-off frequencies (20, 16, 12, 8, 4); y-axis in both panels: Rank-1 WAvgI (0 to 80)]

Figure 5.3: (a) WAvgI at rank-1 for SDIS on USF 2.1 dataset vs sets of cut-off frequencies, i.e., A, B, C, D, E, F and G; (b) WAvgI at rank-1 for SDIS on USF 2.1 dataset vs individual cut-off frequencies comprising set D.
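The sets A to G are described only as equidistant in the range [4, 20]. As a hedged observation, rounding equally spaced points to the nearest integer reproduces every listed set; whether the sets were actually generated this way is an assumption, and `equidistant_cutoffs` is a hypothetical helper.

```python
def equidistant_cutoffs(count, low=4, high=20):
    """Return `count` equally spaced cut-offs in [low, high], rounded to integers."""
    step = (high - low) / (count - 1)
    return [round(low + i * step) for i in range(count)]


# Sets A (2 cut-offs) to G (8 cut-offs), keyed by the labels of Fig. 5.3(a).
CUTOFF_SETS = {name: equidistant_cutoffs(count)
               for name, count in zip("ABCDEFG", range(2, 9))}
```

For example, `CUTOFF_SETS["D"]` gives `[4, 8, 12, 16, 20]`, the set selected for SDIS.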