Evaluation Criteria - Identification of correlation between 3D surfaces using data mining techniques

The main aim of the work presented in this thesis was to identify the most appropriate 3D representation technique with which to model 3D surfaces in the context of springback prediction with respect to sheet metal forming processes such as AISF. To identify this representation the proposed representations were evaluated individually and comparatively. More specifically the conducted evaluation was as follows:

• Individually for each technique (discussed separately in each relevant chapter) using accuracy and the Area Under the ROC Curve (AUC) as the performance measures.

• Comparatively, first by comparing collated accuracy and AUC values, and then statistically by applying the Friedman and the Nemenyi tests to demonstrate whether there was a statistically significant difference between the operation of the proposed techniques.

This section presents an overview of the evaluation measures used with respect to the individual evaluations (accuracy and AUC). Details concerning the adopted statistical evaluation will be presented later in Chapter 8.

The most fundamental mechanism for analysing classifier performance within the data mining community is the confusion matrix, where each instance is classified as belonging to class X or class ¬X, as shown in Figure 2.13. With reference to Figure 2.13, the True Positives (TP) are the instances that are correctly classified as belonging to class X; the False Negatives (FN) are the instances belonging to class X that are erroneously predicted as belonging to class ¬X; the True Negatives (TN) are the instances that are correctly classified as belonging to class ¬X; and the False Positives (FP) are the instances belonging to class ¬X that are erroneously predicted as belonging to class X. In each case the value recorded in the matrix is the number of such instances. Frequently used measures that may be derived from a confusion matrix are accuracy, sensitivity and specificity. These metrics are defined in Equations 2.14, 2.15 and 2.16 below.

Figure 2.13: Confusion matrix.

$$\text{accuracy} = \frac{TP + TN}{TP + TN + FP + FN} \tag{2.14}$$

$$\text{sensitivity} = \frac{TP}{TP + FN} \tag{2.15}$$

$$\text{specificity} = \frac{TN}{FP + TN} \tag{2.16}$$
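The three measures can be illustrated with a minimal Python sketch. The names `ConfusionMatrix`, `confusion_matrix`, `accuracy`, `sensitivity` and `specificity`, and the toy labels, are illustrative assumptions and do not come from the thesis. The sketch assumes a two-class problem in which both classes occur at least once.

```python
from dataclasses import dataclass


@dataclass
class ConfusionMatrix:
    tp: int  # class X instances predicted as X
    fn: int  # class X instances predicted as not-X
    fp: int  # not-X instances predicted as X
    tn: int  # not-X instances predicted as not-X


def confusion_matrix(actual, predicted, positive):
    """Count TP, FN, FP and TN, treating `positive` as class X."""
    tp = fn = fp = tn = 0
    for a, p in zip(actual, predicted):
        if a == positive:
            if p == positive:
                tp += 1
            else:
                fn += 1
        else:
            if p == positive:
                fp += 1
            else:
                tn += 1
    return ConfusionMatrix(tp, fn, fp, tn)


def accuracy(cm):
    # Equation 2.14
    return (cm.tp + cm.tn) / (cm.tp + cm.tn + cm.fp + cm.fn)


def sensitivity(cm):
    # Equation 2.15 (assumes at least one class X instance)
    return cm.tp / (cm.tp + cm.fn)


def specificity(cm):
    # Equation 2.16 (assumes at least one not-X instance)
    return cm.tn / (cm.fp + cm.tn)


# Hypothetical toy data: three class X instances and five not-X instances.
actual = ["X", "X", "X", "notX", "notX", "notX", "notX", "notX"]
predicted = ["X", "X", "notX", "notX", "notX", "notX", "X", "notX"]

cm = confusion_matrix(actual, predicted, positive="X")
print(cm)               # tp=2, fn=1, fp=1, tn=4
print(accuracy(cm))     # 6/8 = 0.75
print(sensitivity(cm))  # 2/3
print(specificity(cm))  # 4/5 = 0.8
```

In this toy case the classifier is right on six of eight instances, yet it finds only two of the three class X instances, which is the kind of detail that accuracy alone hides.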

Accuracy is an overall indicator of the quality of a classifier, although it does not take into consideration the distribution of the classes (“class priors”). Sensitivity indicates the ability of the classifier to identify the positive instances (X), while specificity reflects its ability to identify the negative instances (¬X).

The Area Under the Receiver Operating Characteristic (ROC) Curve (AUC) [26, 97] is used extensively in this thesis for evaluation purposes. The ROC curve concept was originally used in signal detection theory to depict the trade-off between hit rates and false alarm rates [58]. The “hit rate” is called the True Positive Rate (TPR), benefit or sensitivity, while the “false alarm rate” is called the False Positive Rate (FPR), or cost. Both are real numbers in the range 0.0 to 1.0, and are calculated as shown in Equations 2.17 and 2.18 respectively. Spackman [61] illustrated how the ROC curve can be used to evaluate the performance of a binary classifier. A ROC curve is generated by plotting the TPR (along the Y-axis) against the FPR (along the X-axis). In ROC space, the best classification performance lies in the upper left corner (FPR = 0 and TPR = 1), while the diagonal represents random classification (guessing). Therefore, a “good” ROC curve is one that approaches the upper left corner.

Figure 2.14 shows four different ROC curves, each representing the operation of a classifier. From the figure, it can be seen that curve A is the best because it has the highest TPR at every FPR; curves B, C and D all lie below curve A. Curve C represents a classifier that operates in a completely random manner. The Area Under a ROC curve (AUC) is a single value frequently used to measure classifier performance (0 ≤ AUC ≤ 1). In other words, AUC is an indicator of the probability that a classifier will correctly classify instances [15, 69, 109, 135]; more precisely, it is the probability that the classifier ranks a randomly chosen positive instance above a randomly chosen negative one. Note that an AUC value of 0.5 indicates a random classifier (guessing).
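The construction of a ROC curve described above can be sketched as follows. The sketch assumes that the classifier outputs a numeric score for class X, that labels are coded 1 for class X and 0 for ¬X, and that both classes are present. The function names `roc_points` and `trapezoid_auc` and the toy scores are hypothetical. Each distinct score is used as a threshold, giving one (FPR, TPR) point, and the area is approximated with the trapezoid rule.

```python
def roc_points(scores, labels):
    """Return the (FPR, TPR) points of the ROC curve, from (0, 0) to (1, 1).

    `labels` holds 1 for class X and 0 for not-X; a higher score means the
    classifier is more confident that the instance belongs to class X.
    """
    pos = sum(labels)
    neg = len(labels) - pos
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(ranked):
        threshold = ranked[i][0]
        # Instances with tied scores cross the threshold together.
        while i < len(ranked) and ranked[i][0] == threshold:
            if ranked[i][1] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        points.append((fp / neg, tp / pos))
    return points


def trapezoid_auc(points):
    """Area under the piecewise-linear curve through `points`."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


# Hypothetical scores for three class X and three not-X instances.
scores = [0.9, 0.8, 0.7, 0.6, 0.55, 0.4]
labels = [1, 1, 0, 1, 0, 0]

curve = roc_points(scores, labels)
print(curve)
print(trapezoid_auc(curve))  # 8/9, since 8 of the 9 (X, not-X) pairs are ranked correctly
```

If every instance receives the same score, the curve collapses to the diagonal from (0, 0) to (1, 1) and the area is 0.5, matching the interpretation of 0.5 as guessing.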

$$TPR = \frac{TP}{TP + FN} \tag{2.17}$$

$$FPR = \frac{FP}{FP + TN} = 1 - \text{specificity} \tag{2.18}$$

For example, consider a 2-class problem where class 1 has 990 instances and class 2 has 10 instances. A model that simply labels each new instance with the majority class (class 1, in this case) achieves an accuracy of 990/1000 = 99%. However, a classifier that does this is clearly not a good classifier. The main advantage of AUC is its ability to deal with unbalanced data sets, since it considers the distribution of classes (TPR and FPR values) [61]. Therefore, AUC was chosen as one of the performance evaluation measures for the proposed mechanisms presented in this thesis because of the uneven error (springback) distributions within the evaluation datasets. The Mann-Whitney-Wilcoxon (MWW) statistical method, which employs a ranking concept based on the signal detection theory proposed by [95], was used in the work described in this thesis to calculate AUC values.¹ A full example of how to calculate the AUC value based on the MWW statistic is presented in Appendix C.
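As a hedged illustration of the rank-based calculation, the following sketch computes AUC from the MWW U statistic and applies it to the 990/10 example. It is not necessarily the procedure of Appendix C or of Weka. The function names `average_ranks` and `auc_mww` are hypothetical, labels are assumed to be coded 1 for the class of interest and 0 otherwise, and tied scores are assumed to share their mean rank.

```python
def average_ranks(values):
    """1-based ascending ranks; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def auc_mww(scores, labels):
    """AUC = U / (n_pos * n_neg), where U is the MWW statistic of the positives."""
    ranks = average_ranks(scores)
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    rank_sum_pos = sum(r for r, y in zip(ranks, labels) if y == 1)
    u = rank_sum_pos - n_pos * (n_pos + 1) / 2.0
    return u / (n_pos * n_neg)


# The unbalanced example from the text: 990 instances of class 1 and
# 10 of class 2. Class 2 is treated as the positive class (label 1).
labels = [0] * 990 + [1] * 10

# A "classifier" that always predicts the majority class gives every
# instance the same score for class 2.
constant_scores = [0.0] * 1000
predicted = [0] * 1000

accuracy = sum(p == y for p, y in zip(predicted, labels)) / len(labels)
print(accuracy)                          # 0.99
print(auc_mww(constant_scores, labels))  # 0.5, i.e. no better than guessing
```

The constant classifier scores 99% on accuracy but only 0.5 on AUC, which is the contrast the example above is drawing.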

Ten-fold Cross Validation (TCV) [189] was also adopted with respect to the conducted evaluation in order to reduce the overfitting problem and to ascertain the validity of the generated classifiers [73]. Overfitting occurs when a generated classifier (model) fits the training data set so exactly that it fails to generalise to unseen data; TCV is used to limit the implications of overfitting [23]. TCV is a well-established technique for evaluating the performance of supervised learners whereby the data is divided into ten parts such that the class labels are distributed equally across the parts (stratified). Using the TCV technique the learner is applied ten times, each time trained on a different 9/10 of the data set and tested using the remaining 1/10. On completion, the recorded results of the ten iterations are used to compute an averaged set of results.
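The TCV procedure can be sketched in plain Python as below. This is an illustrative sketch and not the Weka implementation used for the thesis experiments. The names `stratified_folds`, `cross_validate`, `train_majority` and `predict_majority` are hypothetical, the toy data set is invented for the example, and the sketch assumes at least ten instances so that no fold is empty.

```python
import random
from collections import Counter, defaultdict


def stratified_folds(labels, k=10, seed=0):
    """Split instance indices into k folds with each class spread evenly."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, y in enumerate(labels):
        by_class[y].append(idx)
    folds = [[] for _ in range(k)]
    offset = 0
    for indices in by_class.values():
        rng.shuffle(indices)
        # Deal the instances of this class round-robin across the folds.
        for pos, idx in enumerate(indices):
            folds[(offset + pos) % k].append(idx)
        offset += len(indices)
    return folds


def cross_validate(features, labels, train_fn, predict_fn, k=10, seed=0):
    """Train on k-1 folds, test on the held-out fold, and average the accuracy."""
    folds = stratified_folds(labels, k, seed)
    accuracies = []
    for test_idx in folds:
        held_out = set(test_idx)
        train_idx = [i for i in range(len(labels)) if i not in held_out]
        model = train_fn([features[i] for i in train_idx],
                         [labels[i] for i in train_idx])
        correct = sum(predict_fn(model, features[i]) == labels[i]
                      for i in test_idx)
        accuracies.append(correct / len(test_idx))
    return sum(accuracies) / k


# Hypothetical learner for the demonstration: always predict the majority class.
def train_majority(train_features, train_labels):
    return Counter(train_labels).most_common(1)[0][0]


def predict_majority(model, feature):
    return model


# Invented data set: 90 "low" and 10 "high" instances.
features = list(range(100))
labels = ["low"] * 90 + ["high"] * 10

folds = stratified_folds(labels)
mean_accuracy = cross_validate(features, labels, train_majority, predict_majority)
print([len(fold) for fold in folds])  # ten folds of ten instances each
print(mean_accuracy)                  # approximately 0.9
```

Because of the stratification, each fold here contains nine "low" and one "high" instance, so every test fold mirrors the class distribution of the whole data set.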

Figure 2.14: Four different example ROC curves (A, B, C and D). Curve C is the curve produced as a result of simply guessing. Curve A is said to dominate B, C and D since A is above and to the left of B, C and D. However, B and D do not dominate each other; therefore the AUC is a convenient way to compare their performance [135].

¹ The AUC/ROC calculation conducted using Weka is also done using the Mann-Whitney statistic [215].

2.6 Summary

This chapter has presented the background to the work presented in this thesis. The chapter covers three main areas: (i) sheet metal forming processes, (ii) data mining (classification techniques) and (iii) 3D surface representation. Recall that springback is the major cause of deformation in AISF that affects the final geometry of the shape produced, and that springback prediction was the main motivation of the work described in this thesis. Therefore, the chapter commenced with a general description of the springback phenomenon within the context of the AISF process. Then an overview of the KDD process, and data mining in particular, was presented, including reviews of the classification techniques used for evaluation purposes with respect to the work described in this thesis. The main 3D representation techniques that are the foundation of the thesis work were discussed next. In the context of this thesis the proposed 3D surface representations were required not only to capture geometrical information effectively, but also to facilitate the classification task. Finally, the criteria used to evaluate the operation of the proposed techniques were presented. It was noted that two types of evaluation were conducted: (i) individual evaluation for each proposed technique using accuracy and AUC measurements and (ii) overall performance evaluation using statistical approaches. Only the first was considered in this chapter; the latter will be presented separately in Chapter 7. The following chapter describes the necessary data preprocessing that needs to be applied to the AISF data sets used for evaluation purposes.

The Grid Representation, Error Calculation Mechanism and the RASP Framework

In document Identification of correlation between 3D surfaces using data mining techniques: a case study of predicting springback in sheet metal forming (Page 59-63)