2.4.1
Gait Energy Image
The Gait Energy Image (GEI) is a model-free algorithm for human gait recognition proposed by Han and Bhanu [91]. In contrast to gait curve matching, the GEI algorithm utilizes a weighted combination of the pixel values in a set of silhouette images. In other words, the GEI algorithm attempts to reduce the motion dynamics of an individual, represented across multiple frames, into a single image. As stated in Chapter 1 (Section 1.3.2), the GEI is a popular benchmark for comparing performance, owing to its ease of computation.
The algorithm computes $\bar{B}$, which is defined as the average of K space-normalized human silhouette images, $B_k$, k = 1, 2, ..., K. As in Section 2.2, K denotes the number of frames
Brian M. DeCann Chapter 2. Methods for Recognition of Human Gait 45
Figure 2.8: Examples of Gait Energy Images (GEI) extracted from two individuals. Note that the pixel intensities in each GEI correspond to the foreground space occupied as a person walks.

$$\bar{B} = \frac{1}{K} \sum_{k=1}^{K} B_k \tag{2.17}$$
In computing the “average” silhouette, the features used for matching correspond to the “moving” pixel intensities as a human silhouette moves in time. In order to accurately
compute a GEI image, each silhouette image (Bk) must be normalized to the same number
of pixels and appropriately aligned. This is particularly necessary when an individual is observed walking towards or away from the camera. Alignment can be performed using the centroid of the image data. Examples of GEI images are presented in Figure 2.8.
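The averaging and centroid alignment described above can be sketched as follows. This is a minimal illustration, not the implementation used in this work: the function names `align_by_centroid` and `gait_energy_image` are hypothetical, the silhouettes are assumed to be binary arrays that have already been scaled to a common size, and alignment is assumed to mean translating each foreground centroid to the center of a fixed canvas.

```python
import numpy as np

def align_by_centroid(silhouette, out_shape):
    """Translate a binary silhouette so its foreground centroid sits at the canvas center.

    Assumption: the silhouette is already scaled; only translation is handled here.
    """
    rows, cols = np.nonzero(silhouette)
    canvas = np.zeros(out_shape, dtype=float)
    if rows.size == 0:
        return canvas  # empty frame: nothing to place
    # Integer shift that moves the centroid to the canvas center.
    dr = out_shape[0] // 2 - int(round(rows.mean()))
    dc = out_shape[1] // 2 - int(round(cols.mean()))
    new_r, new_c = rows + dr, cols + dc
    # Drop any pixels that would fall outside the canvas.
    keep = (new_r >= 0) & (new_r < out_shape[0]) & (new_c >= 0) & (new_c < out_shape[1])
    canvas[new_r[keep], new_c[keep]] = 1.0
    return canvas

def gait_energy_image(silhouettes, out_shape):
    """Average K aligned silhouettes, following Equation (2.17)."""
    aligned = [align_by_centroid(s, out_shape) for s in silhouettes]
    return np.mean(aligned, axis=0)
```

Pixels that are foreground in every frame take the value 1, while pixels that are occupied only during part of the gait cycle take intermediate values.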
Since the GEI algorithm denotes a weighted combination of K silhouette images, raw GEI images can have very large dimensionality, which can cause difficulty in matching. In the original work by Han and Bhanu [91] this is rectified through Principal Component Analysis (PCA), though other methods such as Linear Discriminant Analysis (LDA) [90], tensor discriminants [94], and 2-D PCA [135] have also been explored. In this implementation of the GEI algorithm, subspace optimization (i.e., dimensionality reduction) is performed using Principal Component Analysis, wherein a principal component is retained if the associated eigenvalue is greater than 0.001. The Euclidean distance metric is used to generate match scores between a pair of GEI feature vectors. A brief description of PCA is presented in the following paragraphs.
Principal Component Analysis
The aim of Principal Component Analysis (PCA) is to find an orthogonal subspace ψ that reduces the dimensionality of the original feature space (i.e., the theoretical range of values spanned by each element in a feature vector) while preserving a majority of the data variance. In other words, PCA aims to reduce the number of dimensions, d, in feature
vector v, to d′ where d′ ≤ d. The output, v′, is a vector whose elements will have a maximal
variance with respect to the transformed feature space.
PCA is accomplished by performing an eigenvalue decomposition on the covariance matrix computed from a set of training data. Here, the training data defines the feature space which is to be optimized. Given n samples, $x_i \in \Re^{d,1}$, i = 1, 2, ..., n, where $x_i$ denotes some feature vector, the first step is to compute the sample mean µ ($\mu = \frac{1}{n}\sum_{i=1}^{n} x_i$). Denote $X \in \Re^{d,n}$ as a matrix containing each feature vector in the training data normalized by the sample mean (i.e., $X = \{x_1 - \mu, x_2 - \mu, \ldots, x_n - \mu\}$). A scatter matrix, S, is computed from X as $S = XX^T$, where T denotes the transpose operator.
The optimal subspace ψ is computed by performing an eigenvalue decomposition on S, which yields a matrix of eigenvectors ψ and eigenvalues Λ. Note that the eigenvector in the ith column of ψ corresponds to the ith diagonal element of Λ (i.e., $S\psi_i = \Lambda_i \psi_i$).
Typically, d′ < d eigenvectors are retained, such that

$$d' = \arg\min_{\bar{d}} \left\{ \frac{\sum_{i=1}^{\bar{d}} \Lambda_i}{\sum_{i=1}^{d} \Lambda_i} > V_e \right\} \tag{2.18}$$
where Ve ∈ [0, 1] is the fraction of data variation to be retained.
In general, most of the variance in X is captured by the eigenvectors in ψ associated with the largest eigenvalues. By discarding eigenvectors associated with small eigenvalues, the feature dimensionality can be greatly reduced while retaining most of the discriminatory information in the original feature vector.
Note that this description was adapted from the Ph.D. dissertation of Klare [136].
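The PCA steps above, together with the eigenvalue threshold and Euclidean match score mentioned for the GEI implementation, can be sketched as follows. This is a hedged illustration only: the function names (`fit_pca`, `n_components_for_variance`, `project`, `match_score`) are hypothetical, samples are assumed to be stored one per row, and the scatter matrix is formed directly, which may be impractical when d is the number of pixels in a GEI.

```python
import numpy as np

def fit_pca(samples, eig_threshold=0.001):
    """Fit a PCA subspace from an (n, d) array with one feature vector per row.

    Returns the sample mean, the retained eigenvectors (as columns), and
    their eigenvalues, sorted from largest to smallest eigenvalue.
    """
    mu = samples.mean(axis=0)
    X = (samples - mu).T                   # d x n matrix of mean-subtracted samples
    S = X @ X.T                            # scatter matrix S = X X^T
    eigvals, eigvecs = np.linalg.eigh(S)   # eigh: S is symmetric; ascending order
    order = np.argsort(eigvals)[::-1]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    keep = eigvals > eig_threshold         # retain components above the threshold
    return mu, eigvecs[:, keep], eigvals[keep]

def n_components_for_variance(eigvals, v_e):
    """Smallest d' whose leading eigenvalues exceed the fraction v_e (Equation 2.18).

    Assumes eigvals is sorted in descending order.
    """
    ratio = np.cumsum(eigvals) / np.sum(eigvals)
    hits = np.nonzero(ratio > v_e)[0]
    return int(hits[0]) + 1 if hits.size else len(eigvals)

def project(v, mu, basis):
    """Map a feature vector v into the retained subspace."""
    return basis.T @ (v - mu)

def match_score(a, b):
    """Euclidean distance between two projected feature vectors."""
    return float(np.linalg.norm(a - b))
```

Here `fit_pca` applies the fixed eigenvalue threshold, whereas `n_components_for_variance` shows the alternative variance-fraction criterion; the two are separate selection rules and need not retain the same number of components.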
2.4.2
Frieze Pattern Matching
The second comparison algorithm is referred to as “Frieze Pattern Matching”. In the mathematical context, a frieze pattern is defined as a two-dimensional pattern that repeats itself infinitely. Liu et al. [101] were the first to exploit the periodic nature of human gait as a frieze pattern. A frieze pattern denoting human gait is defined as the concatenation of the x- and y-projections of a silhouette moving in time. As with GEI, this method for denoting human gait is also classified as a model-free recognition algorithm. A mathematical description of such a pattern is given in the following paragraphs.
Figure 2.9: Visual example illustrating how the horizontal (row-sum) and vertical (column-sum) projections of the silhouette can be combined to create a spatiotemporal pattern for human gait recognition.
Consider a set of K silhouette images, denoted as $B_k$, k = 1, 2, ..., K. Define a 2-D frieze image, F(j, k), as the horizontal projection (row sum) of each of the K silhouette images. Mathematically, this is described in Equation (2.19).

$$F(j, k) = \sum_{i} B_k(i, j) \tag{2.19}$$
F(j, k) denotes the width of the kth silhouette at the vertical pixel coordinate j. Similarly, F(i, k) can be defined as the vertical projection (column sum) of the silhouette, capturing the height of the kth silhouette at the horizontal pixel coordinate i. In a set of K silhouette images encompassing at least one gait cycle, the silhouette width varies periodically due to factors such as stride and arm sway. By observing F(j, k) as an image, the row projections combine to form a spatiotemporal pattern. A graphical example of how these patterns are constructed is illustrated in Figure 2.9.
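The two projections can be sketched in a few lines. This is an illustrative sketch under stated assumptions: the function name `frieze_patterns` is hypothetical, and the silhouettes are assumed to be stacked in an array of shape (K, H, W), with the vertical coordinate j indexing rows and the horizontal coordinate i indexing columns.

```python
import numpy as np

def frieze_patterns(silhouettes):
    """Compute row-sum and column-sum frieze images from K binary silhouettes.

    silhouettes: array-like of shape (K, H, W).
    Returns (row_frieze, col_frieze) with shapes (H, K) and (W, K).
    """
    stack = np.asarray(silhouettes, dtype=float)
    # F(j, k): sum over the horizontal coordinate i gives the width at row j.
    row_frieze = stack.sum(axis=2).T
    # F(i, k): sum over the vertical coordinate j gives the height at column i.
    col_frieze = stack.sum(axis=1).T
    return row_frieze, col_frieze
```

Each column of either output corresponds to one frame, so stacking the two outputs (for example with `np.vstack`) gives one concatenated projection vector per frame.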
In the original work by Liu et al., matching of frieze patterns was accomplished by comparing the central moments of a probe and gallery sequence [101]. Alternatively, Dynamic Time Warping (DTW) or Hidden Markov Models (HMMs) can be used to perform matching, as described by Kale et al. [100, 103]. In this implementation of the Frieze Pattern matching algorithm, match scores between two gait sequences are generated using 2-D Dynamic Time Warping [103]. The following paragraphs summarize how two frieze patterns can be matched using 2-D Dynamic Time Warping.
Dynamic Time Warping
In general, Dynamic Time Warping is a method for evaluating similarity between two sequences that vary in time. For instance, in the context of a frieze pattern for gait recognition, the periodicity varies depending on the speed at which a person is walking. Similarity between patterns is evaluated by generating a cost matrix, C, which stores the difference between the two patterns for each pair of time units (in this case, frames). Thus, for a pair of patterns, A and B, with lengths T_A and T_B, C(1, 1) denotes the difference between them at t = 1 and C(1, T_B) denotes the difference between A at t = 1 and B at t = T_B. The resulting distance score is obtained by summing the values along the minimum-cost path from C(1, 1) to C(T_A, T_B), moving strictly in increments of one within an 8-connected neighborhood.
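A minimal sketch of this procedure is given below. It is a plain dynamic-programming DTW and not necessarily the 2-D variant of [103]: the function name `dtw_distance` is hypothetical, the per-frame difference is assumed to be the Euclidean distance between frame columns, and the path is assumed to advance only forward in time (one step in A, one step in B, or both), which is the monotonic subset of the 8-connected neighborhood.

```python
import numpy as np

def dtw_distance(A, B):
    """Accumulated cost of the cheapest warping path between two patterns.

    A: array of shape (D, T_A); B: array of shape (D, T_B).
    Each column is the projection vector of one frame.
    """
    A = np.asarray(A, dtype=float)
    B = np.asarray(B, dtype=float)
    T_a, T_b = A.shape[1], B.shape[1]
    # C[s, t]: Euclidean difference between frame s of A and frame t of B.
    C = np.linalg.norm(A[:, :, None] - B[:, None, :], axis=0)
    acc = np.full((T_a, T_b), np.inf)
    acc[0, 0] = C[0, 0]
    for s in range(T_a):
        for t in range(T_b):
            if s == 0 and t == 0:
                continue
            # Cheapest predecessor among the three forward-moving neighbors.
            best = min(
                acc[s - 1, t] if s > 0 else np.inf,
                acc[s, t - 1] if t > 0 else np.inf,
                acc[s - 1, t - 1] if s > 0 and t > 0 else np.inf,
            )
            acc[s, t] = C[s, t] + best
    return float(acc[-1, -1])
```

Because the path may repeat frames of either sequence, a pattern and a time-stretched copy of itself receive a score of zero, which is the property that makes the measure tolerant of differences in walking speed.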