Chapter 2 Literature Review
2.5 Classification / Data Summation
2.5.1 Gait Indexes
One simple approach to summarising gait data is to sum how far the subject deviates from the values observed in healthy controls. A criticism of this approach is that it ignores the inter-correlated nature of gait variables. An example illustrating this has been modified from Schutte et al. (2000) and is displayed in Figure 2.9. The figure shows that interpreting each of two highly correlated variables individually, in terms of its distance from the healthy mean in standard deviations, could lead to the misinterpretation of gait data for subjects 2 and 4, who actually fall within the hypothetical healthy distribution of increasing peak knee flexion angle with increasing walking speed. Schutte et al. (2000) introduced a method called the ‘normalcy index’, later known as the Gillette Gait Index (GGI), which aimed to overcome this issue. The method is briefly summarised below and illustrated in Figure 2.8:
1. Perform principal component analysis on the healthy data to define a new uncorrelated axis system, following the same steps listed in Section 2.4.1.
2. Project the data onto this new axis system by multiplying the standardised values by the eigenvectors (effectively calculating the PC scores).
Figure 2.8 Graphic representation of the calculation of the normalcy index for two variables, reprinted from Schutte et al. (2000).
A) The ellipse represents the normal healthy distribution of two correlated gait variables within two standard deviations (STDs) of the mean. The patient is located at a distance, expressed in STDs, from these means. The primary axis of variation (the principal component) is displayed (pointing top right), along with the axis orthogonal to it (pointing top left).
B) Principal component analysis is used to define a new axis system, and the data points have been projected onto this axis system by calculating their PC scores. The two variables are now uncorrelated; however, their variances are not equal.
C) The two axes have been divided by the square roots of their eigenvalues, such that the variance is now equal. The Euclidean length of the patient from the centre of this new coordinate system is then calculated and named the ‘normalcy index’.
3. Divide each axis (and variable value) by the square root of the eigenvalue associated with that PC. This essentially re-standardises the data such that each axis has equal variation.
4. Calculate the Euclidean length of the patient from the centre of this new coordinate system, which is named the ‘normalcy index’.
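The four steps of the normalcy index calculation can be sketched in Python as follows. This is a minimal illustration of the steps as summarised here, not the implementation of Schutte et al. (2000): the function name normalcy_index, the synthetic healthy cohort and the patient values are all hypothetical, and the patient is assumed to be standardised using the mean and STD of the healthy data.

```python
import numpy as np

def normalcy_index(controls, patient):
    """Euclidean length of a patient in the re-standardised PC space.

    controls: (n_subjects, n_variables) array of healthy data.
    patient:  (n_variables,) array for a single subject.
    Assumes more healthy subjects than variables, so no eigenvalue is zero.
    """
    controls = np.asarray(controls, dtype=float)
    patient = np.asarray(patient, dtype=float)

    # Standardise using the healthy mean and STD (assumption of this sketch).
    mean = controls.mean(axis=0)
    std = controls.std(axis=0, ddof=1)
    z_controls = (controls - mean) / std
    z_patient = (patient - mean) / std

    # Step 1: PCA on the healthy data (eigen-decomposition of its correlation matrix).
    corr = np.cov(z_controls, rowvar=False)
    eigenvalues, eigenvectors = np.linalg.eigh(corr)

    # Step 2: project the patient onto the new axes (the PC scores).
    scores = z_patient @ eigenvectors

    # Step 3: divide each PC score by the square root of its eigenvalue.
    scaled = scores / np.sqrt(eigenvalues)

    # Step 4: Euclidean length from the centre of the new coordinate system.
    return float(np.linalg.norm(scaled))

# Hypothetical healthy cohort: walking speed (m/s) and peak knee flexion (degrees),
# drawn with a correlation of 0.9. All numbers are invented for illustration.
rng = np.random.default_rng(0)
controls = rng.multivariate_normal([1.2, 60.0], [[0.04, 0.9], [0.9, 25.0]], size=200)

# Hypothetical patient: slow walking speed combined with high knee flexion.
patient = np.array([0.9, 67.5])
print(normalcy_index(controls, patient))
```

Under these assumptions the result coincides with the Mahalanobis distance of the patient from the healthy mean, which offers a convenient check on the calculation.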
The most obvious criticism of this method is that, by dividing the PC axes by the square root of their eigenvalues, the contribution of PCs representing a high degree of variance is lessened, and that of PCs representing little variance is increased. The PCs which represent the smallest amounts of variance may be very sensitive to noise and have no contextual relevance, yet they are weighted equally in the normalcy index. This might explain why the normalcy index of a single subject can vary drastically between different sets of healthy control data (McMulkin and MacWilliams, 2008).
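The effect of this re-weighting can be shown with a small worked example. The sketch below assumes two standardised variables with a hypothetical correlation of 0.95, and compares how much the same small deviation contributes to the index when it lies along the low-variance PC rather than the high-variance PC. The correlation and the deviation size are invented for illustration.

```python
import numpy as np

# Hypothetical correlation between two standardised gait variables.
r = 0.95
corr = np.array([[1.0, r], [r, 1.0]])

# eigh returns eigenvalues in ascending order: here 1 - r and 1 + r.
eigenvalues, eigenvectors = np.linalg.eigh(corr)

# Weight applied to each PC score when the axes are re-standardised.
weights = 1.0 / np.sqrt(eigenvalues)

# The same small deviation (0.1 standardised units) placed along each PC in turn.
deviation = 0.1
contributions = deviation * weights

for eigenvalue, contribution in zip(eigenvalues, contributions):
    print(f"eigenvalue {eigenvalue:.2f}: contribution to index {contribution:.3f}")

# Ratio of the low-variance PC contribution to the high-variance PC contribution.
amplification = contributions[0] / contributions[1]
print(f"low-variance PC weighted {amplification:.1f} times more heavily")
```

A deviation of this size along the low-variance PC could plausibly arise from measurement noise alone, yet it dominates the resulting index.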
Figure 2.9 An illustration, adapted from Schutte et al. (2000), of how the interpretation of ‘normalcy’ of two gait variables can be misleading when there is high correlation between these variables within a healthy cohort.
The blue ellipse represents the normal healthy distribution of the peak knee flexion angle relative to walking speed within two standard deviations (STDs) of the mean. Hypothetical test subjects 1-4 are marked with black crosses. While all four test subjects are the same number of STDs away from the mean of both variables, subjects 2 and 4 actually fall within the distribution found in healthy subjects, whereas subjects 1 and 3 fall well outside this range.
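The situation in Figure 2.9 can be reproduced numerically. The sketch below assumes a hypothetical correlation of 0.9 between walking speed and peak knee flexion angle, and places four hypothetical subjects 1.5 STDs from the mean of both variables. The coordinates are invented so that subjects 2 and 4 lie along the direction of correlation, and the two-STD ellipse is taken to be the set of points whose distance in the uncorrelated, equal-variance space is at most 2.

```python
import numpy as np

# Hypothetical correlation between standardised walking speed and peak knee flexion.
r = 0.9
corr = np.array([[1.0, r], [r, 1.0]])
corr_inv = np.linalg.inv(corr)

# Hypothetical subjects in standardised units: (walking speed, peak knee flexion).
# Every subject is 1.5 STDs from the mean of each variable taken individually.
subjects = {
    1: np.array([-1.5, 1.5]),
    2: np.array([1.5, 1.5]),
    3: np.array([1.5, -1.5]),
    4: np.array([-1.5, -1.5]),
}

def distance_from_healthy(z):
    """Distance from the healthy mean after accounting for the correlation."""
    return float(np.sqrt(z @ corr_inv @ z))

distances = {label: distance_from_healthy(z) for label, z in subjects.items()}

# A subject is treated as inside the healthy ellipse if its distance is at most 2.
inside = {label: d <= 2.0 for label, d in distances.items()}

for label in subjects:
    print(f"subject {label}: distance {distances[label]:.2f}, inside ellipse: {inside[label]}")
```

Although the per-variable deviations are identical, only the subjects whose deviations follow the healthy correlation remain inside the ellipse.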
In the demonstration, the normalcy index is calculated using 16 temporal-spatial and kinematic discrete variables, which were subjectively selected based on the experience of the authors. Schutte et al. (2000) raise this issue themselves when presenting the technique, and even suggest that the inclusion of kinetic parameters might more accurately reflect patient outcomes. Despite this, the same 16 variables have been used extensively by researchers to calculate the normalcy index/GGI (Assi et al., 2009, Wren et al., 2007, Hillman et al., 2007, McMulkin and MacWilliams, 2008). The normalcy index also seems to have been used primarily in the research field of cerebral palsy, hence children are often used as control subjects (Cretual et al., 2010). It appears that the GGI may also be valid in adult populations (Cretual et al., 2010), but it was again noted that these 16 variables may not be the optimal biomechanical descriptors of gait.
Further criticisms of the GGI include the difficulty of interpreting the multivariate components which make up the score, their lack of physical meaning, and the non-normality of the index (Schwartz and Rozumalski, 2008). A new gait summary measure was proposed by Schwartz and Rozumalski (2008) in an attempt to overcome these problems. The calculation involves the creation of a matrix, or state space, which includes all temporal gait parameters under consideration. A singular value decomposition (SVD), a technique closely related to PCA, is then performed on the matrix. The SVD creates a new ‘orthogonal basis’ which can be used to reconstruct the data. Much like the PC selection process mentioned in Section 2.4.1, only some of the feature components are retained for analysis, and each feature component accounts for less variance than the previous one. The analysis of Schwartz and Rozumalski (2008) included threshold criteria for the minimum proportion of total variance represented, and a technique which measures the similarity of the reconstructed data to the original.
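The general idea of an SVD with a variance threshold can be sketched as follows. This is an illustration only and does not reproduce the procedure of Schwartz and Rozumalski (2008): the matrix orientation (one column per subject), the 98% threshold, the function names and the fidelity measure (one minus the relative squared reconstruction error for each subject) are all assumptions made for this example, and the data are synthetic.

```python
import numpy as np

def select_features(data, min_variance=0.98):
    """Keep the fewest SVD feature components reaching `min_variance`.

    data: (n_gait_variables, n_subjects) array, one column per subject.
    Returns the number of components kept, the cumulative proportion of
    variance, and the data reconstructed from the kept components.
    """
    u, s, vt = np.linalg.svd(data, full_matrices=False)

    # Singular values are sorted in descending order, so each component
    # accounts for no more variance than the one before it.
    explained = np.cumsum(s**2) / np.sum(s**2)

    # Index of the first component at which the threshold is reached.
    k = min(int(np.searchsorted(explained, min_variance)) + 1, len(s))

    reconstructed = (u[:, :k] * s[:k]) @ vt[:k]
    return k, explained, reconstructed

def reconstruction_fidelity(original, reconstructed):
    """Per-subject similarity between original and reconstructed data."""
    residual = np.sum((original - reconstructed) ** 2, axis=0)
    total = np.sum(original**2, axis=0)
    return 1.0 - residual / total

# Synthetic data: 50 gait variables for 20 subjects, built from three
# underlying patterns plus a small amount of noise.
rng = np.random.default_rng(0)
patterns = rng.normal(size=(50, 3))
mixing = rng.normal(size=(3, 20))
data = patterns @ mixing + 0.01 * rng.normal(size=(50, 20))

k, explained, reconstructed = select_features(data, min_variance=0.98)
fidelity = reconstruction_fidelity(data, reconstructed)
print(f"components kept: {k}")
print(f"lowest per-subject fidelity: {fidelity.min():.3f}")
```

Reporting the fidelity per subject, rather than only the overall proportion of variance, shows whether any individual is poorly represented by the retained components.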
2.5.2 Artificial Intelligence
classification is to iteratively learn a mathematical relationship between input variables, e.g. discrete biomechanical variables, and the target output, e.g. healthy or OA function.
Classification techniques can be broadly categorised as either supervised or unsupervised (Bishop, 1995). Both techniques will derive their relationships based on the training data. With supervised techniques, however, the training data has known class labels; for example, in this context, the data may have a class label of ‘0’ if they are known to be healthy, and ‘1’ if they are known to have OA.
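A supervised technique can be sketched with a simple example. The code below uses logistic regression trained by gradient descent as a stand-in for a supervised classifier; it is not a method taken from the literature reviewed here. The two input variables, the synthetic data, the learning rate and the number of iterations are all hypothetical. The labels follow the convention described above: ‘0’ for healthy and ‘1’ for OA.

```python
import numpy as np

def train_logistic(features, labels, learning_rate=0.1, n_iterations=500):
    """Iteratively fit weights relating the inputs to the known class labels."""
    n_samples, n_features = features.shape
    weights = np.zeros(n_features)
    bias = 0.0
    for _ in range(n_iterations):
        # Predicted probability of belonging to class 1 (OA).
        probabilities = 1.0 / (1.0 + np.exp(-(features @ weights + bias)))
        # Difference between the prediction and the known label drives the update.
        error = probabilities - labels
        weights -= learning_rate * (features.T @ error) / n_samples
        bias -= learning_rate * error.mean()
    return weights, bias

def predict(features, weights, bias):
    """Assign class 1 where the predicted probability exceeds 0.5."""
    return (features @ weights + bias > 0).astype(int)

# Synthetic training data: two hypothetical standardised biomechanical variables.
rng = np.random.default_rng(0)
healthy = rng.normal(loc=-2.0, scale=1.0, size=(50, 2))
oa = rng.normal(loc=2.0, scale=1.0, size=(50, 2))
features = np.vstack([healthy, oa])
labels = np.concatenate([np.zeros(50), np.ones(50)])  # 0 = healthy, 1 = OA

weights, bias = train_logistic(features, labels)
accuracy = (predict(features, weights, bias) == labels).mean()
print(f"training accuracy: {accuracy:.2f}")
```

The accuracy printed here is measured on the training data itself, so it would overstate performance on unseen subjects.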
Unsupervised classification techniques infer the classes from the data itself (Bishop, 1995). For example, suppose there were two distinct compensation strategies that OA subjects use to avoid pain and instability. These might be distinct from healthy subjects, hence an unsupervised classification technique might arrive at three class labels, i.e. ‘healthy’, ‘OA1’ and ‘OA2’. While unsupervised classifier architectures, such as hidden Markov models (Cheng et al., 2008) and self-organising maps (Barton et al., 2006), have been applied to gait analysis, this thesis is focussed on the supervised classification of labelled data. Further sub-classification or phenotyping of OA subjects using unsupervised techniques may be clinically informative but is beyond the scope of this thesis.
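For illustration only, the three-class example above can be sketched with k-means clustering. This is a simpler technique than the hidden Markov models and self-organising maps cited, and it is used here purely as a stand-in. The synthetic data, the choice of three clusters and the farthest-point initialisation are all assumptions of this sketch.

```python
import numpy as np

def farthest_point_init(points, n_clusters):
    """Deterministic initial centres: start at the first point, then
    repeatedly add the point farthest from all centres chosen so far."""
    centres = [points[0]]
    for _ in range(n_clusters - 1):
        nearest = np.min(
            [np.linalg.norm(points - centre, axis=1) for centre in centres], axis=0
        )
        centres.append(points[np.argmax(nearest)])
    return np.array(centres)

def assign(points, centres):
    """Index of the nearest centre for every point."""
    distances = np.linalg.norm(points[:, None, :] - centres[None, :, :], axis=2)
    return distances.argmin(axis=1)

def k_means(points, n_clusters, n_iterations=50):
    centres = farthest_point_init(points, n_clusters)
    for _ in range(n_iterations):
        assignments = assign(points, centres)
        # Move each centre to the mean of its points (keep it if it has none).
        new_centres = np.array([
            points[assignments == k].mean(axis=0)
            if np.any(assignments == k) else centres[k]
            for k in range(n_clusters)
        ])
        if np.allclose(new_centres, centres):
            break
        centres = new_centres
    return assign(points, centres), centres

# Synthetic, unlabelled data: three well-separated hypothetical groups, which
# could correspond to 'healthy', 'OA1' and 'OA2'. No labels are given to k_means.
rng = np.random.default_rng(0)
groups = [rng.normal(loc=c, scale=0.5, size=(30, 2)) for c in ([0, 0], [6, 0], [0, 6])]
points = np.vstack(groups)

assignments, centres = k_means(points, n_clusters=3)
print(np.bincount(assignments))
```

The cluster indices returned are arbitrary, so relating them to clinically meaningful groups would still require interpretation of the data in each cluster.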