
The final stage, Stage 4, of the region-based classification process is concerned with classifier generation. From the foregoing, a single feature vector FV(i) is used to describe each data volume i (a retinal volume, with respect to the motivation for this thesis), where each volume is represented by a number of features. For training purposes, each feature vector FV(i) was combined with a class label ci ∈ {−1, +1}. The class label in our case is binary, where +1 indicates a retina with AMD and −1 a normal retina. The resulting representation is compatible with a number of classifier generators.

In order to evaluate the seven region-based representations (FOR, SOR-VCM, SOR-VRLM, HOG, LBP, HOG-LBP and LPQ), three classifier generators were employed: (i) Support Vector Machines (SVM), (ii) k-Nearest Neighbours (KNN), and (iii) Bayesian Networks (BN). The implementations for these classifiers were taken from Weka [51]. For SVM, the Library for Support Vector Machines (LIBSVM) package [14] was used. The outcomes of the evaluation are presented in the following chapter.

5.6 Summary

The general process for generating region-based representations advocated in this chapter comprises four stages: (i) image decomposition to produce a set of regions, (ii) feature vector generation for each region, (iii) single feature vector generation and (iv) classifier generation applied to the complete set of single feature vectors. Two types of region-based techniques were considered: (i) statistical-based techniques and (ii) histogram-based techniques. With respect to the statistical-based techniques, two approaches were considered: (i) First-Order Representation (FOR) and (ii) two Second-Order Representations (SOR), using VCM and VRLM. With respect to the histogram-based techniques, four approaches were considered: (i) HOG, (ii) LBP, (iii) HOG-LBP and (iv) LPQ. For Stage three, two feature vector generation techniques were proposed: (i) dimensionality reduction and (ii) feature selection. An evaluation of the techniques described in this chapter is presented in the following chapter.

Chapter 6

Evaluation of Classification Performance Using Region-Based Volumetric Representations

6.1 Overview

In the previous chapter (Chapter 5), a four-stage region-based process for generating binary classifiers for application to volumetric data, specifically retinal volumes, was described. The four stages were: (i) image decomposition to produce a set of regions, (ii) region representation, (iii) single feature vector generation (feature vector combination) and (iv) classifier generation. In this chapter, the evaluation of this process is presented in terms of the different techniques that can be used at each stage.

Image decomposition (Stage one) was presented in Chapter 4 where it was noted that an important aspect of image decomposition is the use of critical functions to establish regional homogeneity. Recall that seven critical functions were considered:

1. Average Intensity Value (AIV).

2. Kendall’s Coefficient of Concordance (KCC).

3. Gray Level Co-occurrence Matrix (GLCM).

4. Euclidean Distance (ED).

5. Dynamic Time Warping (DTW).

6. Longest Common Subsequence (LCS).

7. Kullback-Leibler divergence (KLD).

Of these, the last five were adapted by the author for volumetric decomposition. In this chapter, we compare these critical functions and the different forms of decomposition in terms of classification effectiveness. For the experiments, the threshold value t used with the critical functions, as presented in Section 4.3, was set to 0.5. This value was selected because experiments conducted using the whole image-based representation indicated that it produced the best classification outcomes. The details of these experiments are not reported in the main body of this thesis; however, for completeness, they are included in Appendix A. In Chapter 4, two forms of decomposition, standard and overlapping, were also discussed.

With respect to region representation (Stage two), seven representation techniques were proposed in Chapter 5 as follows:

1. First-Order Representation (FOR).

2. Voxel Co-occurrence Matrix (VCM).

3. Voxel Run-Length Matrix (VRLM).

4. Histograms of Oriented Gradients (HOG).

5. Histograms of Local Binary Pattern (LBP).

6. A combination of HOG and LBP (HOG-LBP).

7. Local Phase Quantisation (LPQ).

Of these, the first three are statistical representations (2 and 3 are second-order statistical representations, or SOR) and the remaining four are histogram-based representations. Note that each of the above results in a feature vector representation, one for each identified region (node) in the decomposition.

With respect to Stage three (feature vector combination for single feature vector generation), two mechanisms were proposed:

1. A dimensionality reduction-based method using Principal Component Analysis (PCA).

2. A feature selection-based method using Improved Fisher Kernel (IFK).

The result in each case was a single feature vector representing an entire volume (made up of regions). In terms of IFK, as described in Subsection 5.4.2, different dictionary sizes could be used. Thus, for the evaluation presented in this chapter, we compare a range of dictionary sizes (32, 64, 128, 256 and 512).

The final stage of the process (Stage four) was classifier generation. As noted previously, there are a great many binary classifier generators available that operate using a feature vector representation. Three were considered with respect to the work described in this thesis:

1. Support Vector Machines (SVM).

2. k-Nearest Neighbours (KNN).

3. Bayesian Networks (BN).

As noted in the previous chapter, these were selected because they tend to produce good results in related application domains and because their usage is widely reported in the literature. For the SVM classifier, the complexity constant was set to one and the linear polynomial kernel was used with a coefficient value of one. For KNN, the number of nearest neighbours (k) was set to one.
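To make the KNN setting concrete, the following is a minimal sketch of a nearest-neighbour classifier with k = 1 operating on labelled feature vectors. It is not the Weka implementation used for the experiments: the function name `nearest_neighbour_label`, the toy feature values and the use of Euclidean distance are all assumptions made for illustration.

```python
import math


def nearest_neighbour_label(train, query):
    """Return the class label of the training vector closest to `query` (k = 1).

    `train` is a list of (feature_vector, label) pairs with labels in {-1, +1}.
    Euclidean distance is an assumption here, not a setting taken from the thesis.
    """
    best_label, best_dist = None, math.inf
    for vector, label in train:
        dist = math.dist(vector, query)
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label


# Hypothetical toy feature vectors: +1 denotes AMD, -1 denotes a normal retina.
train = [
    ((0.9, 0.8, 0.7), +1),
    ((0.8, 0.9, 0.6), +1),
    ((0.1, 0.2, 0.1), -1),
    ((0.2, 0.1, 0.3), -1),
]

predicted_amd = nearest_neighbour_label(train, (0.85, 0.85, 0.65))
predicted_normal = nearest_neighbour_label(train, (0.15, 0.15, 0.20))
```

With k = 1 the prediction is simply the label of the single closest training volume, so no voting or tie-breaking between neighbours is required.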

Given the above, the different techniques can be combined to give 4 (levels) × 2 (standard and overlapping decomposition) × 7 (critical functions) × 7 (region representation techniques) × 2 (single feature vector generation techniques) × 3 (classifiers) = 2,352 different ways whereby region-based volumetric classification can be achieved. The combination of these techniques forms a set of 3,528 classification results (the 2,352 combinations above plus a further 4 × 2 × 7 × 7 × 3 = 1,176) if we include the option of not using a critical function at all. Experiments were undertaken with respect to all these combinations; however, for ease of presentation these are reported in this chapter by considering the alternatives with respect to each stage (decomposition, region representation, feature vector generation and classification) in isolation. In each case, a constant set of techniques for the remaining stages was typically used; these were selected according to their anticipated best performance. The overall objectives of the evaluation presented in this chapter were as follows:

1. Stage 1: To determine if the use of a critical function produces a more effective classification than when a critical function is not used, and if so to determine which critical function produced the best classification results; and to determine whether standard or overlapping decomposition was more appropriate.

2. Stage 2: To determine the most appropriate region-based representation technique in terms of classification effectiveness.

3. Stage 3: To determine which single feature vector generation mechanism produced the best result and whether using feature selection improves the outcome.

4. Stage 4: To identify the most appropriate classifier to be applied (out of the three different generators considered).
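The count of 2,352 combinations quoted earlier in this section can be checked by enumerating the Cartesian product of the options available at each stage. The sketch below does this; the technique abbreviations follow the text, whereas the level labels 1 to 4 and the variable names are assumptions made for illustration.

```python
from itertools import product

levels = [1, 2, 3, 4]  # hypothetical labels for the four levels
decompositions = ["standard", "overlapping"]
critical_functions = ["AIV", "KCC", "GLCM", "ED", "DTW", "LCS", "KLD"]
representations = ["FOR", "SOR-VCM", "SOR-VRLM", "HOG", "LBP", "HOG-LBP", "LPQ"]
combiners = ["PCA", "IFK"]  # single feature vector generation techniques
classifiers = ["SVM", "KNN", "BN"]

# Each tuple is one complete pipeline configuration across the four stages.
configurations = list(
    product(
        levels,
        decompositions,
        critical_functions,
        representations,
        combiners,
        classifiers,
    )
)
n_configurations = len(configurations)  # 4 * 2 * 7 * 7 * 2 * 3
```

Fixing the options for all but one stage, as is done in the stage-by-stage evaluation, corresponds to filtering this list on the remaining positions of each tuple.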

The experiments conducted with respect to each stage fell into two categories: (i) classifier performance evaluation and (ii) statistical significance testing. The evaluation metrics used for classifier performance evaluation were: (i) Accuracy (Acc.), (ii) Sensitivity (Sen.), (iii) Specificity (Spec.), (iv) Positive Predictive Value (PPV), (v) Negative Predictive Value (NPV), (vi) Equal Error Rate (EER) and (vii) Area Under the Curve (AUC) of the receiver operating characteristic, as defined in Subsection 2.8.1. Ten-fold Cross Validation (TCV) was used throughout. Significance testing was conducted using ANalysis Of VAriance (ANOVA). The aim was to check whether there was a statistically significant difference between results in terms of AUC. Recall that the ANOVA procedure was described in Subsection 2.8.2. Using this procedure, the results concerned with the analysis of a particular stage were grouped together and ANOVA testing applied, resulting in a p-value in each case. The p-values indicate the level of significance; a result was considered to be significant if the associated p-value was below 0.05. Tukey’s Honestly Significant Difference (HSD) Post-Hoc Test was then applied to determine whether there were any significant differences between the individual classifier performances (as also described in Subsection 2.8.2). The dataset used was the 3D OCT retinal image data set introduced in Section 3.5.
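As an illustration of how the confusion-matrix metrics and AUC listed above can be computed, the following sketch uses the standard textbook definitions, which are assumed to agree with those given in Subsection 2.8.1. The function names, the toy labels and the toy scores are hypothetical. EER is omitted, and the sketch assumes that both classes occur in the true and predicted labels so that no denominator is zero.

```python
def confusion_counts(y_true, y_pred):
    """Count true/false positives and negatives for labels in {-1, +1}."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == -1 and p == -1)
    fp = sum(1 for t, p in pairs if t == -1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == -1)
    return tp, tn, fp, fn


def performance_metrics(y_true, y_pred):
    """Accuracy, sensitivity, specificity, PPV and NPV from the confusion counts."""
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    return {
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
    }


def auc(y_true, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a correct pair.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == -1]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: +1 denotes AMD, -1 denotes a normal retina.
y_true = [1, 1, 1, 1, -1, -1, -1, -1]
y_pred = [1, 1, 1, -1, -1, -1, -1, 1]
scores = [0.9, 0.8, 0.7, 0.4, 0.1, 0.2, 0.3, 0.6]

metrics = performance_metrics(y_true, y_pred)
auc_value = auc(y_true, scores)
```

In a ten-fold cross validation setting, these quantities would be computed on the held-out fold of each run and then aggregated over the ten folds.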

In the remainder of this chapter the evaluation results, with respect to each stage and each of the above objectives, are presented and discussed in Sections 6.2 to 6.5 below. A summary of the main findings of the conducted evaluation is presented at the end of this chapter in Section 6.6.