Chapter 6 Results and Discussion
6.2 Single Machine Learning Classifiers Results for Classification
6.2.3 Support Vector Machines
A support vector machine (SVM) is a binary classification model that takes a set of input vectors and assigns each input to one of two categories. The main idea is to map the n-dimensional sample space into a higher-dimensional attribute (feature) space, in which a new instance is classified using a linear decision boundary. In this model, a data point is represented as a p-dimensional vector, and the SVM separates the classes with a (p-1)-dimensional hyperplane. The main aim of this study is to identify geometrical patterns across 9 classes of medication amount that can be used universally across a number of models, including SVM. This study therefore evaluated a number of SVM-related classifiers and calculated their classification performance metrics. The classification outcomes are based on the Support Vector Classifier (SVC), the trainable nu-algorithm Support Vector Machine classifier (NUSVC), the Parzen Kernel Support Vector Classifier (PKSVC), the Radial Basis Support Vector Classifier (RBSVC), and general kernel/dissimilarity-based classification (KERNELC). All of these models were used in our experiment, and all of them are based on the support vector machine methodology.
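The hyperplane rule described above can be sketched as follows. This is a minimal illustration only, not the implementation used in this thesis: the weight vector `w`, the offset `b` and the sample points are hypothetical values chosen for the example.

```python
def decision_value(w, b, x):
    """Signed score w.x + b of point x relative to the hyperplane (w, b)."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def classify(w, b, x):
    """Return +1 or -1 according to the side of the hyperplane x falls on."""
    return 1 if decision_value(w, b, x) >= 0 else -1


# Hypothetical 2-dimensional case: the "hyperplane" is the line x1 + x2 = 1
w, b = [1.0, 1.0], -1.0
labels = [classify(w, b, x) for x in [(2.0, 2.0), (0.0, 0.0)]]
```

For p-dimensional inputs the same rule applies unchanged; only the length of `w` and `x` grows.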
Our main aim is to show that all of these SVM models, under different optimisation settings, provide satisfactory accuracy and performance, yielding a sophisticated model that can be used in medical domains. The study used a single high-dimensional database with 13 features and 9 classes. SVM was implemented with several types of kernel: a kernel matrix, a linear kernel and a sigmoid kernel. NUSVC uses the linear kernel, PKSVC uses the sigmoid kernel, and KERNELC computes its outcomes from the kernel matrix. The training results are presented in Table 6.5, and the ROC curves and AUC histograms are shown in Figures 6.10 and 6.11, respectively.
Table 6-5: Performance of a range of SVM classifiers, averaged over 9 classes (Train)
Model     Sensitivity  Specificity  Precision  F1        J         Accuracy  AUC
SVC       0.74567      0.60389      0.20342    0.31444   0.34974   0.62944   0.68444
NUSVC     0.74344      0.74189      0.27152    0.389     0.48556   0.74244   0.79878
PKSVC     0.84844      0.90333      0.51033    0.62511   0.75189   0.89733   0.94267
RBSVC     0.86         0.89667      0.52033    0.63278   0.75644   0.89278   0.94411
KERNELC   0.819        0.783444     0.315667   0.449667  0.602556  0.787667  0.864111
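The metrics reported in the table can be illustrated with a short sketch. It rests on several assumptions that are not stated in the text: each class is scored one-vs-rest, the column J is taken to be Youden's J (sensitivity + specificity - 1), and the 9-class figures are unweighted averages of the per-class values. The confusion counts below are hypothetical and are not taken from the thesis data.

```python
def binary_metrics(tp, fp, tn, fn):
    """Metrics for one class treated as positive against all other classes."""
    def ratio(num, den):
        # Return 0.0 instead of failing when a denominator is empty
        return num / den if den else 0.0

    sensitivity = ratio(tp, tp + fn)
    specificity = ratio(tn, tn + fp)
    precision = ratio(tp, tp + fp)
    f1 = ratio(2 * precision * sensitivity, precision + sensitivity)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "f1": f1,
        "J": sensitivity + specificity - 1,  # assumed: Youden's J
        "accuracy": ratio(tp + tn, tp + fp + tn + fn),
    }


def macro_average(per_class):
    """Unweighted mean of each metric over the per-class results."""
    keys = per_class[0].keys()
    return {k: sum(m[k] for m in per_class) / len(per_class) for k in keys}


# Hypothetical counts for two classes (illustrative only)
class_a = binary_metrics(tp=8, fp=2, tn=18, fn=2)
class_b = binary_metrics(tp=5, fp=5, tn=15, fn=5)
averaged = macro_average([class_a, class_b])
```

Averaging per-class values in this way explains why an averaged F1 need not equal the F1 computed from the averaged precision and sensitivity.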
In our SCD datasets, the data points are considered not linearly separable, owing to the multi-class problem posed by the 9 target values (classes). To achieve high accuracy on multi-class problems, it is important to use a nonlinear mapping (φ) into a higher-dimensional space [246]. The computational complexity of the model rises as the data points are moved into a high-dimensional space. To construct the classification algorithm, the learning procedure must iterate over the data points and complete a number of operations. This thesis carried out a number of SVM experiments. First, SVM was implemented using default parameters. Next, the effect of normalisation, together with the other SVM classifiers, on the classification evaluation and on model performance was investigated in depth. Finally, SVM parameter optimisation was applied to different SVM models, such as KERNELC and NUSVC, using more sophisticated methods to estimate the classification parameters.
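The role of the nonlinear mapping φ, and why a kernel avoids the cost of working in the high-dimensional space explicitly, can be sketched as follows. The degree-2 polynomial kernel used here is an illustrative assumption; it is not one of the kernels used in this study, but it has a small explicit feature map that makes the equivalence easy to check.

```python
import math


def dot(a, b):
    """Ordinary inner product of two equal-length vectors."""
    return sum(ai * bi for ai, bi in zip(a, b))


def phi(x):
    """Explicit feature map for the degree-2 polynomial kernel in 2-D."""
    x1, x2 = x
    return (x1 * x1, math.sqrt(2) * x1 * x2, x2 * x2)


def poly2_kernel(x, z):
    """Kernel value computed in the original 2-D space."""
    return dot(x, z) ** 2


x, z = (1.0, 2.0), (3.0, 4.0)
explicit = dot(phi(x), phi(z))   # inner product in the mapped 3-D space
implicit = poly2_kernel(x, z)    # same value without ever building phi
```

Both routes give the same number, but the kernel route never constructs the mapped vectors, which is what keeps the computation manageable as the dimension of the mapped space grows.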
Figure 6-11: AUC Histogram plot (Train) for a range of SVM classifiers
The linear kernel in this model has many parameters. The most significant one is C, the penalty parameter of the cost function, which controls the penalty applied to errors. Each parameter of the cost function comes with a default value of zero. A large value of C assigns a large penalty to margin errors. In contrast, a smaller value ignores points that lie close to the boundary and widens the margin. For the sigmoid kernel, the important parameter is 𝛾, whose value affects the classification accuracy and performance of the model. Its default value is also zero.
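The effect of C and 𝛾 can be made concrete with a short sketch. It assumes the standard soft-margin formulation, in which the cost is half the squared norm of w plus C times the sum of hinge losses, and the common sigmoid kernel form tanh(𝛾 x·z + coef0). The offset `coef0`, the weights and the data points are hypothetical; the exact formulation used by the classifiers in this study is not given in the text.

```python
import math


def dot(a, b):
    """Ordinary inner product of two equal-length vectors."""
    return sum(ai * bi for ai, bi in zip(a, b))


def soft_margin_objective(w, b, X, y, C):
    """0.5*||w||^2 + C * (sum of hinge losses over the training points)."""
    margin_term = 0.5 * dot(w, w)
    hinge = sum(max(0.0, 1.0 - yi * (dot(w, xi) + b)) for xi, yi in zip(X, y))
    return margin_term + C * hinge


def sigmoid_kernel(x, z, gamma, coef0=0.0):
    """Sigmoid kernel; coef0 is an assumed offset term."""
    return math.tanh(gamma * dot(x, z) + coef0)


# Hypothetical data: the middle point lies inside the margin
X = [(2.0, 0.0), (0.5, 0.0), (-2.0, 0.0)]
y = [1, 1, -1]
w, b = [1.0, 0.0], 0.0

cost_small_c = soft_margin_objective(w, b, X, y, C=1.0)
cost_large_c = soft_margin_objective(w, b, X, y, C=10.0)
```

With the same hyperplane, raising C from 1 to 10 multiplies the contribution of the single margin error by ten, so an optimiser would be pushed towards a narrower margin that avoids it.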
Figures 6.12, 6.13 and 6.14 illustrate the training and testing outcomes for each classifier. The ROC curve graphs provide a visual comparison across the models tested. This study used the holdout method to allocate training and testing cases, which allowed us to estimate the generalisation performance and accuracy of the classifiers, particularly on independent objects. Learning from the dataset involved two stages. In the training stage, the basic structure of each model was built and its error rates were calculated, as shown in Figures 6.10 and 6.11. In the testing stage, the models were evaluated on the testing set in order to estimate the accuracy and error rate of each model. This study compared the performance of 5 machine learning models over 9 output classes, denoted classes 1 through 9, formed through the discretisation of target values. The main purpose is to compare our models with the baseline control models LNN (test) and ROM (test), as illustrated in Section 6.4, demonstrating that our classifiers provide significantly better results than these baselines. PKSVC (test) produced the best results among the classifiers: as shown in Table 6.6, PKSVC yields the best sensitivity and specificity in comparison with the other classifiers.
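The holdout allocation described above can be sketched as follows. The split ratio and the random seed are assumptions made for illustration; the proportions actually used in this study are not stated in this section.

```python
import random


def holdout_split(samples, test_fraction=0.3, seed=0):
    """Shuffle the sample indices once and cut them into train and test parts."""
    rng = random.Random(seed)          # fixed seed so the split is repeatable
    indices = list(range(len(samples)))
    rng.shuffle(indices)
    n_test = int(round(len(samples) * test_fraction))
    test_idx, train_idx = indices[:n_test], indices[n_test:]
    train = [samples[i] for i in train_idx]
    test = [samples[i] for i in test_idx]
    return train, test


# Hypothetical data: ten labelled cases identified by number
cases = list(range(10))
train_cases, test_cases = holdout_split(cases, test_fraction=0.3, seed=42)
```

Because the testing cases are never seen during training, the metrics computed on them estimate performance on independent objects.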
Figure 6-12: Sensitivity and Specificity of SVM models
Figures 6.13 and 6.14 show the ROC curves and the area under the ROC curve (AUC) for each class and each model in our experiment, where classes 1 through 9 result from the discretisation of target values. The AUC value is a scalar summary used to characterise the global capability of a given classifier. In the AUC plots, the X axis shows the models and classes, while the Y axis shows the AUC corresponding to each model entry listed on the X axis. An AUC of 1 represents an ideal classifier, while an AUC of 0.5 represents random performance. Each of the bars plotted is associated with a corresponding curve in either of Figures 6.13 and 6.14, which represent the accompanying ROC curves for the training and testing sets. The purpose of the plot is to emphasise the AUC values in graphical form, so that a visual comparison can be drawn.
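The interpretation of AUC given above can be illustrated with a small sketch. It uses the rank-based definition, in which AUC is the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case, with ties counted as one half. Treating one class as positive and the remaining classes as negative is an assumption about how the per-class values were obtained; the scores below are hypothetical.

```python
def auc(scores, labels):
    """Rank-based AUC for binary labels (1 = positive, 0 = negative)."""
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    # Count positive/negative pairs ranked correctly; ties count as half
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical classifier scores for four cases
example = auc([0.1, 0.4, 0.35, 0.8], [0, 0, 1, 1])
perfect = auc([0.1, 0.2, 0.8, 0.9], [0, 0, 1, 1])
uninformative = auc([0.5, 0.5, 0.5, 0.5], [0, 0, 1, 1])
```

The three cases correspond to a partially correct ranking, an ideal classifier, and a classifier whose scores carry no information about the class.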
Table 6-6: Performance of a range of SVM classifiers, averaged over 9 classes (Test)
Model     Sensitivity  Specificity  Precision  F1        J         Accuracy  AUC
SVC       0.74433      0.62267      0.20774    0.32111   0.36663   0.65156   0.675
NUSVC     0.77778      0.69656      0.23956    0.35944   0.47423   0.70833   0.78122
PKSVC     0.83122      0.81478      0.34811    0.48411   0.646     0.81778   0.86556
RBSVC     0.81356      0.80867      0.33822    0.47078   0.62222   0.81033   0.859
KERNELC   0.757667     0.695889     0.239222   0.354333  0.4537    0.703667  0.765556

[Figure 6-12 chart: title "Sensitivity and Specificity of SVM models"; x-axis "Per Model" (SVC, NUSVC, PKSVC, RBSVC, KERNELC); y-axis "Estimation" (0 to 1); series: Sen/Training, Spec/Training, Sen/Testing, Spec/Testing, Linear (Sen/Training)]
Figure 6-13: ROC curve (Test) for a range of SVM classifiers
Figure 6-14: AUC Histogram plot (Test) for a range of SVM classifiers