Optimal Feature Selection - Automatic classification of flying bird species using computer visi

The hypothesis is that combining appearance and motion features improves classification performance; however, the results in the previous chapter demonstrate that simply combining the two feature sets does not automatically improve the correct classification rates. In Chapter 5 it was hypothesised that the performance of the combined set may be undermined by the presence of redundant features, and that significantly better classification rates may be achieved using a subset of the full 320 features. The results of the experimental investigation of optimal feature selection are presented in this section.

As previously mentioned, two methods of feature selection were evaluated: the correlation-based method proposed by Hall (1999), and a variation of the machine-learning-based method proposed by Breiman (2001). Using each method, the optimal feature subset was determined, and the performance of each of the four classifiers using that subset was evaluated. This section describes how the optimal subset is determined, and also describes the composition of that subset (i.e. which of the original features are retained). The following procedure was used to determine the optimal feature subset, for both the correlation-based and classifier-based methods:

• The feature selection method was used to rank features, in order from the most to least effective

• The 10 lowest-ranked features were iteratively removed from the list (a fixed step of 10 was used for consistency)

6.4. OPTIMALFEATURE SELECTION 159

• For each iteration, the performance of each of the four classifiers was evaluated using the remaining features

• This was repeated until only the 10 highest ranking features remained

• A graph of the correct classification rates against the number of features was plotted, and the maximum for each of the classifiers was estimated.
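The procedure above can be sketched as a short loop. This is only an illustrative outline, not the implementation used in the thesis: `evaluate` is a hypothetical callable standing in for training and testing one classifier on a feature subset, and the toy scoring function and feature names below are invented so that the example runs.

```python
from typing import Callable, List, Sequence, Tuple


def elimination_curve(
    ranked: Sequence[str],
    evaluate: Callable[[Sequence[str]], float],
    step: int = 10,
) -> List[Tuple[int, float]]:
    """Return (number of features, score) pairs for shrinking feature subsets.

    `ranked` is ordered from most to least effective. At each iteration the
    `step` lowest-ranked features are dropped, until only `step` remain.
    """
    remaining = list(ranked)
    curve = []
    while True:
        curve.append((len(remaining), evaluate(remaining)))
        if len(remaining) <= step:
            break
        remaining = remaining[:-step]  # drop the lowest-ranked features
    return curve


def peak(curve: List[Tuple[int, float]]) -> Tuple[int, float]:
    """Return the (number of features, score) pair with the highest score."""
    return max(curve, key=lambda point: point[1])


# Hypothetical ranking of 320 features and a toy score that peaks at 70.
ranked_features = [f"feature_{i}" for i in range(320)]


def toy_evaluate(features: Sequence[str]) -> float:
    return 75.0 - 0.05 * abs(len(features) - 70)


curve = elimination_curve(ranked_features, toy_evaluate)
best_n, best_rate = peak(curve)
print(best_n, best_rate)
```

With 320 features and a step of 10, the loop evaluates 32 subset sizes (320, 310, ..., 10). In practice one such curve would be produced per classifier and per feature selection method.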

Figures 6.1 and 6.2 show the resulting graphs for the classifier-based and correlation-based methods respectively. Each classifier is shown as a separate curve, and the mean classification rate across all classifiers is also shown.

Using the classifier-based method, peaks occur at 180 features for the NB classifier, 320 for SVM, and 70 for both RT and RF. Likewise, using the correlation-based method, peaks occur at 170 features for the NB classifier, 320 for SVM, and 70 for both RT and RF.

Figure 6.1: Plot of correct classification rates vs. number of features for the four standard classifiers when classifier-based selection is applied. The maximum for each classifier is marked with a solid circle, and labelled with the number of features and correct classification rate.

Figure 6.2: Plot of correct classification rates vs. number of features for the four standard classifiers when correlation-based selection is applied. The maximum for each classifier is marked with a solid circle, and labelled with the number of features and correct classification rate.

The mean of the four curves was taken and plotted as a dashed line. The highest correct classification rate for the mean curve is 75.40% when the correlation-based method is applied (this corresponds to 70 features). Similarly, the highest rate for the mean curve is 74.82% when the classifier-based method is applied, which occurs at 180 features. The mode of the per-classifier peaks is 70 features for both feature selection techniques (in both, the RT and RF classifiers peaked at 70 features). Therefore, based on the mode and mean, the best correct classification rates are achieved using the subset of the 70 highest-ranked features (for both feature selection methods).
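The mode and mean reasoning can be illustrated with a few lines of Python. The peak positions below are the ones reported in the text; the per-classifier rates passed to `mean_curve` are hypothetical placeholders (the full curves are only available in the figures), and the function and variable names are assumptions made for this sketch.

```python
from statistics import mean, mode

# Peak positions (number of features) reported in the text.
peaks = {
    "classifier-based": {"NB": 180, "SVM": 320, "RT": 70, "RF": 70},
    "correlation-based": {"NB": 170, "SVM": 320, "RT": 70, "RF": 70},
}

# Most frequent peak position per feature selection method.
modal_peak = {method: mode(list(p.values())) for method, p in peaks.items()}
print(modal_peak)


def mean_curve(curves):
    """Average several {n_features: rate} curves at the sizes they share."""
    shared = sorted(set.intersection(*(set(c) for c in curves.values())))
    return {n: mean(c[n] for c in curves.values()) for n in shared}


# Hypothetical rates for two classifiers, for illustration only.
toy_curves = {
    "A": {70: 80.0, 180: 70.0},
    "B": {70: 60.0, 180: 74.0},
}
toy_mean = mean_curve(toy_curves)
best_size = max(toy_mean, key=toy_mean.get)
print(toy_mean, best_size)
```

The toy example also shows why the two criteria can disagree: the peak of the mean curve need not coincide with the most common per-classifier peak.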

Table 6.3 shows feature groups by type, before and after selection, for both the correlation-based (CoBfs) and classifier-based (CBfs) methods. From the table, the classifier-based method selected 62 appearance features from seven feature groups, and 8 motion features from two groups. The correlation-based method selected 49 appearance features from six groups, and 21 motion features from three groups. For the motion features, wingbeat frequency (FFT) and vicinity features were selected irrespective of the method used. This suggests that wingbeat features can contribute effectively to species classification, since they were selected by both methods. The features from the correlation-based method were used for the remaining experiments, since the highest mean correct classification rate (75.40%) was achieved with this method.

Table 6.3: The number of features remaining in each feature group before and after applying the classifier-based (CBfs) and correlation-based (CoBfs) Feature Selection (FS) methods. The table also includes the top feature selected for each of the classes of features, using both feature selection methods.

| Type | Feature Group | # before FS | # after CBfs | # after CoBfs | Top Selected Feature (CBfs) | Top Selected Feature (CoBfs) |
|---|---|---|---|---|---|---|
| Appearance | Hue colour features | 37 | 13 | 18 | σ of Hue | Mean of Hue |
| Appearance | Saturation colour features | 35 | 16 | 12 | Mean of Saturation | Mean of Saturation |
| Appearance | Value colour features | 37 | 28 | 16 | Entropy of Value | Entropy of Value |
| Appearance | Shape | 17 | 1 | 1 | Hu’s First invariant | Hu’s First invariant |
| Appearance | Gabor | 20 | 1 | 1 | Mean of Gabor (at θ = 0) | Mean of Gabor (at θ = 0) |
| Appearance | Grayscale | 8 | 1 | 0 | Mean of Grayscale | N/A |
| Appearance | LogPolar | 15 | 2 | 1 | Mean of Logpolar hue | Mean of Logpolar hue |
| Motion | FFT (Wingbeat) | 27 | 7 | 8 | First Peak of FFT (width) | First Peak of FFT (width) |
| Motion | CSS | 22 | 0 | 0 | N/A | N/A |
| Motion | CDF | 10 | 0 | 0 | N/A | N/A |
| Motion | Turn | 62 | 0 | 12 | N/A | Turn (θi = 55) |
| Motion | Vicinity | 20 | 1 | 1 | Mean of Vicinity Curliness | Mean of Vicinity Curliness |
| Motion | Curvature | 10 | 0 | 0 | N/A | N/A |
| | Total Features | 320 | 70 | 70 | | |
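The appearance and motion totals quoted for Table 6.3 can be cross-checked with a small script. The counts are copied from the table; the dictionary layout and helper names are assumptions made for this sketch.

```python
# (before FS, after CBfs, after CoBfs) per feature group, from Table 6.3.
TABLE_6_3 = {
    "appearance": {
        "Hue": (37, 13, 18),
        "Saturation": (35, 16, 12),
        "Value": (37, 28, 16),
        "Shape": (17, 1, 1),
        "Gabor": (20, 1, 1),
        "Grayscale": (8, 1, 0),
        "LogPolar": (15, 2, 1),
    },
    "motion": {
        "FFT (Wingbeat)": (27, 7, 8),
        "CSS": (22, 0, 0),
        "CDF": (10, 0, 0),
        "Turn": (62, 0, 12),
        "Vicinity": (20, 1, 1),
        "Curvature": (10, 0, 0),
    },
}


def totals(kind):
    """Column sums (before, CBfs, CoBfs) for one feature type."""
    return tuple(sum(column) for column in zip(*TABLE_6_3[kind].values()))


def groups_retained(kind, column):
    """Number of groups with at least one surviving feature (1=CBfs, 2=CoBfs)."""
    return sum(1 for counts in TABLE_6_3[kind].values() if counts[column] > 0)


appearance_totals = totals("appearance")
motion_totals = totals("motion")
print(appearance_totals, motion_totals)
print(groups_retained("appearance", 1), groups_retained("motion", 1))
print(groups_retained("appearance", 2), groups_retained("motion", 2))
```

The column sums reproduce the 62 + 8 and 49 + 21 splits stated in the text, and both methods retain 70 of the original 320 features.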

For the hue colour features, the top selected feature is the standard deviation of the hue histogram under the classifier-based method, and the mean of the hue histogram under the correlation-based method. The mean of the hue histogram describes the general brightness of the hue colour, whereas the standard deviation (σ) describes the contrast. The top selected saturation feature for both feature selection methods is the mean of the saturation histogram, which likewise describes the general brightness of the saturation colour. Finally, for the colour features, the top selected value feature for both the classifier- and correlation-based methods is the entropy of the value histogram, which indicates how many bits are needed to encode the image data.
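As a rough illustration of these three histogram statistics, the sketch below computes the mean, standard deviation, and Shannon entropy of a channel histogram. It assumes that each bin index is used as the bin's value and that entropy is measured in bits; the exact definitions used in the thesis may differ, and the function name is hypothetical.

```python
import math


def histogram_stats(hist):
    """Return (mean, standard deviation, entropy in bits) of a histogram.

    `hist` holds the pixel count of each bin; the bin index is treated as
    the bin value (an assumption of this sketch).
    """
    total = sum(hist)
    probs = [count / total for count in hist]
    mean = sum(i * p for i, p in enumerate(probs))
    variance = sum(p * (i - mean) ** 2 for i, p in enumerate(probs))
    entropy = -sum(p * math.log2(p) for p in probs if p > 0)
    return mean, math.sqrt(variance), entropy


# A uniform 4-bin histogram needs 2 bits per pixel to encode.
uniform_stats = histogram_stats([5, 5, 5, 5])
# A histogram concentrated in one bin carries no information.
single_bin_stats = histogram_stats([0, 20, 0, 0])
print(uniform_stats, single_bin_stats)
```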

The most important feature for identifying the shape of bird species is Hu’s first invariant moment, as indicated by both the correlation- and classifier-based feature selection methods. Hu’s first moment describes the shape of the bird irrespective of translation, rotation, or scale.
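The invariance can be checked numerically. The sketch below uses the standard definition of Hu's first invariant, the sum of the normalised second-order central moments η20 + η02, on a binary silhouette. The rectangular masks and helper names are hypothetical and serve only to show that shifting or rotating the shape leaves the value unchanged.

```python
def hu_first_invariant(mask):
    """Hu's first invariant (eta20 + eta02) of a binary mask (list of rows)."""
    points = [(x, y) for y, row in enumerate(mask) for x, v in enumerate(row) if v]
    m00 = len(points)  # area of the silhouette
    cx = sum(x for x, _ in points) / m00
    cy = sum(y for _, y in points) / m00
    mu20 = sum((x - cx) ** 2 for x, _ in points)
    mu02 = sum((y - cy) ** 2 for _, y in points)
    # For second-order moments the scale normalisation is m00 squared.
    return (mu20 + mu02) / m00 ** 2


def rectangle_mask(height, width, top, left, rows=8, cols=8):
    """Binary image containing one filled rectangle."""
    return [
        [1 if top <= y < top + height and left <= x < left + width else 0
         for x in range(cols)]
        for y in range(rows)
    ]


def rotate90(mask):
    """Rotate a mask by 90 degrees."""
    return [list(row) for row in zip(*mask[::-1])]


original = hu_first_invariant(rectangle_mask(2, 4, 0, 0))
shifted = hu_first_invariant(rectangle_mask(2, 4, 3, 2))
rotated = hu_first_invariant(rotate90(rectangle_mask(2, 4, 0, 0)))
print(original, shifted, rotated)
```

On a pixel grid, scale invariance holds only approximately, because a resized silhouette is re-sampled rather than scaled exactly.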

Finally, the most important motion feature is wingbeat frequency, which is represented by the FFT features. The top selected feature for this class is the first FFT peak computed using the width metric, which corresponds to the wingbeat frequency of the bird species.
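A minimal sketch of how a wingbeat frequency might be read from a width signal is given below. It assumes the silhouette width is measured once per frame and takes the "first peak" to be the strongest non-DC component of the spectrum; the thesis's exact peak definition may differ. The frame rate, signal, and function name are hypothetical, and a direct DFT is used in place of an FFT library to keep the example self-contained.

```python
import cmath
import math


def dominant_frequency(signal, fps):
    """Frequency (Hz) of the strongest non-DC component of `signal`."""
    n = len(signal)
    mean = sum(signal) / n
    centred = [value - mean for value in signal]  # remove the DC offset
    best_bin, best_magnitude = 0, 0.0
    for k in range(1, n // 2 + 1):
        coefficient = sum(
            value * cmath.exp(-2j * math.pi * k * t / n)
            for t, value in enumerate(centred)
        )
        if abs(coefficient) > best_magnitude:
            best_bin, best_magnitude = k, abs(coefficient)
    return best_bin * fps / n


# Synthetic width signal: a 5 Hz wingbeat filmed at 50 frames per second.
fps = 50
width_signal = [20 + 5 * math.sin(2 * math.pi * 5 * t / fps) for t in range(100)]
wingbeat_hz = dominant_frequency(width_signal, fps)
print(wingbeat_hz)
```

The frequency resolution of such an estimate is fps / n, so short tracks of a bird give only a coarse estimate of the wingbeat rate.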
