
Equation 6-2. Difference of Gaussian calculation

6.5 Results – Edge Detection

Computer Vision (CV) models associated with edge detection are assessed within this section. All trials within this section are performed on the ten test images listed in Test Images as well as the SUSAN [16] test image shown in SUSAN Test Image. Additionally, images used for empirical baseline edge detection testing [158] have been included, both to compare the implementation of each CV model within this research and to link results with associated research. Each image is subjected to the 352 trials listed in Schedule of Tests. Each trial listed in Table G-1 (Edge detector schedule of tests, includes test numbers) is associated with a trial number, which indicates the type of test. The prefix ‘E’ indicates that the trial is an edge detection trial. The second number indicates the CV model under test: a number from ‘01’ to ‘16’ identifies one of the sixteen edge detection models listed down the left column. The suffix represents the type of filter applied in the trial. For example, ‘00’ represents no filtering, while ‘02’ indicates a Gaussian 5x5 matrix filter. Parametric values for each of the trials are indexed by the test number and listed in Configuration. Any setting required for the operation of the CV models is listed in this appendix. A summary of the full 4928 trial results for Image Analysis assessments is listed in Trial Results.
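The trial numbering scheme described above can be sketched in a few lines of Python. This is a minimal illustration only: the `TrialId` type, the `parse_trial_id` function and the exact `E-MM-FF` pattern are assumptions inferred from trial numbers such as E-04-14 and are not taken from the research software.

```python
import re
from typing import NamedTuple


class TrialId(NamedTuple):
    """Hypothetical container for the three parts of a trial number."""
    kind: str         # 'E' marks an edge detection trial
    model: int        # CV model under test, 1 to 16 for edge detectors
    filter_code: int  # filter applied, e.g. 0 = none, 2 = Gaussian 5x5


# Assumed layout: prefix letter, two-digit model, two-digit filter code.
_TRIAL_RE = re.compile(r"^([A-Z])-(\d{2})-(\d{2})$")


def parse_trial_id(text: str) -> TrialId:
    """Split a trial number such as 'E-04-14' into its components."""
    match = _TRIAL_RE.match(text)
    if match is None:
        raise ValueError(f"not a trial number: {text!r}")
    kind = match.group(1)
    model = int(match.group(2))
    filter_code = int(match.group(3))
    # The text states that edge detection models run from '01' to '16'.
    if kind == "E" and not 1 <= model <= 16:
        raise ValueError(f"edge model out of range in {text!r}")
    return TrialId(kind, model, filter_code)


trial = parse_trial_id("E-04-14")
print(trial.kind, trial.model, trial.filter_code)
```

Only the filter codes ‘00’ and ‘02’ are named in the text, so the sketch does not attempt to validate or name the remaining filter codes.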

Results for each trial listed in Schedule of Tests are saved in a common file and are identified according to the names associated with the test images. Within the common results file, each record represents an individual trial against the test image. Records are labelled by trial number and include the results of the comparison against the ground truth images, as well as the runtime of the model (in milliseconds).

Interpretation of the raw data provides a partial measure of the effectiveness of the CV models. Additionally, information regarding the suitability of a particular CV model for AR and RAL applications can be gained from the error image. An example is shown in Figure 6-24, which presents the output of the CV edge detection model alongside an error image displaying the total population of true and false detected conditions. In the error image of Figure 6-24, green points indicate True Positive pixels, while red points indicate False Positive pixels. Purple points represent False Negatives, where an edge should have been detected but was not. True Negative pixels remain white.
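The colour coding of the error image can be sketched as a per-pixel comparison of the detector output with the ground truth. The function name `error_map`, the nested-list image representation and the exact RGB values are assumptions made for this illustration; only the colour-to-class mapping comes from the description above.

```python
# Assumed RGB values; the text names the colours but not their codes.
GREEN = (0, 255, 0)        # True Positive
RED = (255, 0, 0)          # False Positive
PURPLE = (128, 0, 128)     # False Negative
WHITE = (255, 255, 255)    # True Negative


def error_map(detected, truth):
    """Build an RGB error image from two same-sized binary edge maps.

    detected, truth: lists of rows, each row a list of truthy/falsy
    values where a truthy value marks an edge pixel.
    """
    if len(detected) != len(truth):
        raise ValueError("images differ in height")
    rows = []
    for det_row, gt_row in zip(detected, truth):
        if len(det_row) != len(gt_row):
            raise ValueError("images differ in width")
        row = []
        for det, gt in zip(det_row, gt_row):
            if det and gt:
                row.append(GREEN)    # edge detected where one exists
            elif det:
                row.append(RED)      # edge detected where none exists
            elif gt:
                row.append(PURPLE)   # existing edge was missed
            else:
                row.append(WHITE)    # correctly left empty
        rows.append(row)
    return rows


detected = [[True, True], [False, False]]
truth = [[True, False], [True, False]]
print(error_map(detected, truth))
```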

6.5.1 Statistical Analysis

Evaluation of effectiveness for the image analysis trials is achieved through examination of the ‘confusion matrix’ and its measure of the pixel population. Each pixel in the image is classified as a True Positive, False Positive, True Negative or False Negative when tested against the ground truth images. The confusion matrix, shown in Table 6-1, defines the performance analysis measures, which are explained below.

True positives (TP) are detected edge or feature points which correspond to the stated edge points of the ground truth image. False positives (FP) are detected edge or feature points which are found in non-edge positions. Additionally, true negatives (TN) are locations which should not have edge or feature points, and do not, while false negatives (FN) are points which should have been detected as edge or feature points but were not. The relationship between the four classifications can be seen in the confusion matrix of Table 6-1. Within the confusion matrix, actual positives (AP) are pixels which should be detected as edges or feature points. The opposite is true for actual negatives (AN), which are pixels not associated with edges or feature points. Detected values (on the left side) are the detected results. Detected positive (DP) indicates that the CV model has detected an edge or feature point. How detected results correspond to the real values determines the classification.

| | Actual Positive (AP) | Actual Negative (AN) | |
|---|---|---|---|
| Detected Positive (DP) | True Positive (TP) | False Positive (FP), Type I Error | PPV = TP/(TP+FP) |
| Detected Negative (DN) | False Negative (FN), Type II Error | True Negative (TN) | NPV = TN/(FN+TN) |
| | Recall = TP/(TP+FN) | Specificity = TN/(FP+TN) | |

Table 6-1. Binary classifier ‘confusion matrix’

Figure 6-24. Edge detector model output and corresponding error map (Image GT-02-1 and test E-04-14)
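The four pixel classifications and the derived measures of Table 6-1 can be sketched as follows. The `Confusion` class and `count_confusion` function are hypothetical names introduced for this example, and returning NaN for an undefined ratio is a design assumption rather than the behaviour of the research software.

```python
from dataclasses import dataclass


def _ratio(numerator: int, denominator: int) -> float:
    # Assumption: an undefined measure is reported as NaN, not an error.
    return numerator / denominator if denominator else float("nan")


@dataclass(frozen=True)
class Confusion:
    tp: int  # detected edge, actual edge
    fp: int  # detected edge, actual non-edge (Type I error)
    fn: int  # missed edge (Type II error)
    tn: int  # correctly empty pixel

    @property
    def ppv(self) -> float:
        return _ratio(self.tp, self.tp + self.fp)

    @property
    def npv(self) -> float:
        return _ratio(self.tn, self.fn + self.tn)

    @property
    def recall(self) -> float:
        return _ratio(self.tp, self.tp + self.fn)

    @property
    def specificity(self) -> float:
        return _ratio(self.tn, self.fp + self.tn)


def count_confusion(detected, truth) -> Confusion:
    """Count TP, FP, FN and TN over two same-sized binary edge maps."""
    tp = fp = fn = tn = 0
    for det_row, gt_row in zip(detected, truth):
        for det, gt in zip(det_row, gt_row):
            if det and gt:
                tp += 1
            elif det:
                fp += 1
            elif gt:
                fn += 1
            else:
                tn += 1
    return Confusion(tp, fp, fn, tn)


# Hypothetical 2x4 image pair, not one of the test images.
detected = [[1, 1, 0, 0], [1, 0, 0, 0]]
truth = [[1, 0, 1, 0], [1, 0, 0, 0]]
c = count_confusion(detected, truth)
print(c, c.ppv, c.npv, c.recall, c.specificity)
```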

6.5.2 Performance Classifiers

Selection of a performance classifier requires consideration of the type of data collected and the context of the data. Some CV models have been assessed through the use of ROC analysis [158], which is a popular analysis tool in many fields of research. ROC analysis also employs the confusion matrix for its assessment. The majority of edge detection models tested as part of this research are not suitable for ROC curve analysis because their operation lacks the parameter variation needed to trace a curve. As such, single discrete classifiers are employed, which yield one confusion matrix [160] per trial. Testing against ground truth images requires an effective binary classifier which is capable of differentiating the limitations of each model. A number of statistical analysis measures are available, each capturing a different concept of an effective result. Unless specifically stated, discussion of performance measures excludes the synthetic SUSAN image of SUSAN Test Image and is confined to the ten ground-truth and baseline images selected for this research. This exclusion is necessary because of the ideal nature of the SUSAN image: a significant portion of the highest scores are associated with it. Performance results for the images of the previous empirical research [158], as listed in Empirical ROC Test Images, are compared alongside those of Test Images.

6.5.2. (a) Accuracy

The accuracy score of a trial is the proportion of pixels that the CV model classifies correctly. Equation 6-5 shows the accuracy (ACC) calculation, where the sum of the true values (both TP and TN) is divided by the total number of pixels assessed within the image.

$$\mathrm{ACC} = \frac{TP + TN}{\text{Total Pixel Count}}$$

Equation 6-5. Accuracy

From the collected trial results, the best performing trial overall (E-09-00) returned an accuracy score of 98.67% for the test image GT-10-1. All the best performance results occurred against either image GT-10-1 or SUSAN, which says more about the type of image than about the effectiveness of the image processing models. Image GT-10-1 consists mostly of clean straight lines, contributing to its high accuracy. With this in mind, a total of 1647 trials returned an accuracy score over 90%; however, 560 (34%) of them are associated with the synthetic test images GT-10-1 and SUSAN. Accuracy alone therefore does not give a comprehensive demonstration of one CV process over another, especially when also considering that there are only 352 trials per test image.
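As a minimal sketch of Equation 6-5, the accuracy calculation can be written as a single function. The function name `accuracy` and the pixel counts in the usage lines are hypothetical and are not drawn from any trial in this research.

```python
def accuracy(tp: int, tn: int, total_pixels: int) -> float:
    """Equation 6-5: correctly classified pixels over all pixels."""
    if total_pixels <= 0:
        raise ValueError("total_pixels must be positive")
    if tp < 0 or tn < 0 or tp + tn > total_pixels:
        raise ValueError("TP and TN counts are inconsistent with the total")
    return (tp + tn) / total_pixels


# Hypothetical counts for a 1000-pixel image.
score = accuracy(tp=50, tn=900, total_pixels=1000)
print(f"ACC = {score:.2%}")
```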

Trials with the second number “09”, associated with the First Order Gradient edge detector, are consistently among the highest accuracy scores for each test image.

6.5.2. (b) Recall

The recall score (also called the True Positive Rate, TPR) of a trial result is the ratio of the TP edge pixels to the pixels that should be an edge (both TP and FN), as shown in Equation 6-6. A high recall score indicates a high probability of correctly identifying edge pixels where edge pixels are expected.
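The recall definition can be sketched in the same way. Because the ratio contains no FP term, a detector that marks every pixel as an edge scores full recall regardless of how many of those detections are wrong. The function name `recall` and the counts below are hypothetical and describe no test image from this research.

```python
def recall(tp: int, fn: int) -> float:
    """True Positive Rate: detected edge pixels over actual edge pixels."""
    if tp < 0 or fn < 0 or tp + fn == 0:
        raise ValueError("need at least one actual edge pixel")
    return tp / (tp + fn)


# Hypothetical 100-pixel image with 21 actual edge pixels, processed by
# a detector that reports every pixel as an edge.
tp, fp, fn, tn = 21, 79, 0, 0
tpr = recall(tp, fn)
acc = (tp + tn) / (tp + fp + fn + tn)
print(f"recall = {tpr:.0%}, accuracy = {acc:.0%}")
```

For this reason the recall score is best read together with a measure that accounts for false positives, such as accuracy or PPV.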

Assessment of CV edge detection model effectiveness based on the recall score assumes that the cost of false positive (FP) detections is minimal. For example, test image GT-01-1, when processed by the Gaussian filter Circular edge detector (trial E-15-01), achieves a recall score of 100%. This would seem an ideal result; however, the ACC score is a poor 21%, and the reason is apparent when viewing the error map shown in Figure 6-25 (red pixels indicate FP points and green pixels are TP points). Almost the entire image has been classified as an edge, causing almost 100% recall detection,

$$\mathrm{TPR} = \frac{TP}{TP + FN}$$