Preliminary Studies
W 1 W class = size ( int ictal ) / size class( )
7.9 Multi-Patient Classification
7.9.2 Results for Multi-Patient Classification
The results of the multi-patient classification stages described in the previous section are summarised in Figure 8.1. The X-axis shows the number of single-channel Patient-Files g used to train the classifier at each stage of the experiment. It starts from g = 1, the single-patient training mode that has been the default throughout this document so far. The highest value, g = 20, corresponds to a Leave-One-Out setting, where the classifier is trained on all Patient-Files except a single Patient-File held out for testing. The Y-axis displays the mean S1-Score in percent. The S1-Score at each stage is the mean performance of ~50 multi-patient classifiers (~210 in the case of 2-Patient classifiers, ~21 in the case of 1-Patient and 20-Patient classifiers). The S1-Score plot reflects solely the mean result of the trials in each mode on the ‘zero-training’ Patient-Files. That is, Patient-Files used in training a model were excluded from the summary results in Figure 8.1, so that the plot gives an unbiased view of how each stage generalises to unseen patients.
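The classifier counts quoted above are consistent with choosing g Patient-Files out of 21: there are 21 possible groups for g = 1 and g = 20, and 210 for g = 2. The following is a minimal sketch of how such training groups and their ‘zero-training’ complements could be drawn. It assumes 21 Patient-Files indexed 0 to 20 and random sampling without repetition for the intermediate group sizes; the function names and the sampling scheme are hypothetical and not taken from the original experiment.

```python
import math
import random

N_FILES = 21  # population size stated in the text


def training_groups(n_files, g, n_groups, seed=0):
    """Return up to n_groups distinct training groups of size g.

    Each group is a sorted tuple of Patient-File indices. If fewer than
    n_groups combinations exist, all of them are returned.
    """
    total = math.comb(n_files, g)
    target = min(n_groups, total)
    rng = random.Random(seed)
    groups = set()
    while len(groups) < target:
        groups.add(tuple(sorted(rng.sample(range(n_files), g))))
    return sorted(groups)


def zero_training_files(n_files, group):
    """Patient-Files not used for training, i.e. the ones scored in Figure 8.1."""
    return [f for f in range(n_files) if f not in group]


# Exhaustive for g = 1, 2 and 20; sampled (assumed cap of 50) otherwise.
groups_g2 = training_groups(N_FILES, 2, 210)
groups_g10 = training_groups(N_FILES, 10, 50)
held_out = zero_training_files(N_FILES, training_groups(N_FILES, 20, 21)[0])
```

For g = 20 the list of ‘zero-training’ files contains exactly one entry, which is the Leave-One-Out case.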
Figure 8.1. The mean S1-Score of the Multi-Patient Analysis on ‘zero-training’ Patient-Files: Across the X-axis, the number of Patient-Files in the training-set is displayed. The blue line is the mean S1-Score of the classification of each group size on ‘zero-training’ data. The red line displays the linear fitting and the green curve is the 4th degree polynomial fitted to the main plot. Maximum, mean and minimum values of the mean S1-Score across group sizes are displayed respectively in dashed blue, green and cyan.
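The red trend line and the dashed summary lines described in the caption are simple to compute. As a rough sketch, the snippet below fits a least-squares line and takes the maximum, mean and minimum over a handful of points. The six (g, S1-Score) pairs are only those quoted in the text of this section, not the full curve, so the resulting slope is not expected to match the one reported for the complete plot. The function name `linear_fit` is hypothetical.

```python
def linear_fit(xs, ys):
    """Ordinary least-squares line through (xs, ys); returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Only the values quoted in this section (g, mean S1-Score in percent).
g_values = [2, 4, 6, 13, 19, 20]
s1_values = [10.22, 11.98, 12.87, 13.58, 13.04, 12.24]

slope, intercept = linear_fit(g_values, s1_values)

# Summary lines drawn as dashed horizontals in the figure.
s1_max = max(s1_values)
s1_mean = sum(s1_values) / len(s1_values)
s1_min = min(s1_values)
```

A positive slope on these points agrees in sign with the upward linear trend described for the full plot.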
The results reveal that multi-patient generalisation is markedly low for training-sets comprising a single Patient-File. This is expected: the parameters of the model are fine-tuned on the single Patient-File, so the learner fails to correctly classify the values in other ‘zero-training’ Patient-Files while yielding high performance on the Patient-File it was trained on. However, as we increase the number of Patient-Files in the training-set, the mean S1-Score tends to increase, more so at some stages than at others. With 2 patients, the mean S1-Score rises to 10.22% on ‘zero-training’ data, while the performance on the training files remains high. The S1-Score drops at g = 3 but remains higher than at g = 1. There is then a peak at g = 4, where the value reaches 11.98%, followed by an insignificant dip at g = 5. The plot shows another peak at g = 6 with a value of 12.87%, approximately twice the S1-Score at g = 1. The performance then dips and peaks variably, staying close to the mean value of 11.58%. The maximum performance is at g = 13, with an S1-Score of 13.58%. From this point through to g = 17, values remain high with minor variability. The performance then drops below the mean at g = 18 and picks up again for g = 19 and g = 20, with respective values of 13.04% and 12.24%. From inspecting the results on the ‘zero-training’ Patient-Files we can identify 3 high-performance regions: g ∈ {4, 5, 6}, the range from g = 13 through to g = 17, and g = 19 and g = 20. Performance in these subsets is higher than that of the 1-Patient-File classifiers, but there is little variability among the values within any specific range. The fitted linear equation, shown in red, displays an increase in mean S1-Score of 0.17437% per additional Patient-File used. The fitted polynomial curve reflects peaks at 2 of the maximal regions mentioned earlier, and a projected decrease at the final maximal range. The mean of the S1-Score curve is at 11.89%, above which there is a higher density of the multi-patient classifiers.
It is worth noting that the S1-Score is the harmonic mean of the Specificity and Sensitivity of the classifiers. The Accuracy, on the other hand, is relatively higher throughout all stages (Figure 8.2), although it generally follows the pattern observed in the S1-Score. Accuracy at this level would indicate high performance for an average machine learning algorithm; however, it is not sufficient for validating the efficacy of a seizure prediction model. Therefore, we look at the S1-Score, which intuitively summarises the other two performance criteria of Sensitivity and Specificity.
Figure 8.2. The mean Accuracy of the Multi-Patient Analysis on ‘zero-training’ Patient-Files: The X-axis displays the number of Patient-Files in the training-set and the Y-axis displays the Accuracy averaged over all classifiers in the respective group.
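The gap between Accuracy and the S1-Score can be illustrated with a small worked example. The sketch below computes Sensitivity, Specificity, Accuracy and the S1-Score, taken here as the harmonic mean of Sensitivity and Specificity as defined in this section. The confusion-matrix counts are made up purely for illustration and are not results from this study; the function names are hypothetical.

```python
def sensitivity(tp, fn):
    """Fraction of positive (e.g. pre-seizure) samples correctly detected."""
    return tp / (tp + fn) if (tp + fn) else 0.0


def specificity(tn, fp):
    """Fraction of negative samples correctly rejected."""
    return tn / (tn + fp) if (tn + fp) else 0.0


def accuracy(tp, tn, fp, fn):
    """Fraction of all samples classified correctly."""
    total = tp + tn + fp + fn
    return (tp + tn) / total if total else 0.0


def s1_score(sens, spec):
    """Harmonic mean of Sensitivity and Specificity."""
    return 2 * sens * spec / (sens + spec) if (sens + spec) else 0.0


# Hypothetical, heavily imbalanced counts: 100 positives, 900 negatives.
tp, fn, tn, fp = 5, 95, 880, 20

sens = sensitivity(tp, fn)       # 5 / 100
spec = specificity(tn, fp)       # 880 / 900
acc = accuracy(tp, tn, fp, fn)   # 885 / 1000
s1 = s1_score(sens, spec)
```

With these counts the Accuracy is 88.5% while the S1-Score stays below 10%, because the harmonic mean is pulled towards the weaker of the two criteria. This is the kind of discrepancy that makes Accuracy alone an unreliable indicator on imbalanced data.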
The multi-patient classifiers were also used to classify unseen data from the corresponding training files. Figure 8.3 displays the classification results on ‘unseen-trained’ Patient-Files. The multi-patient classifiers were trained on 70% of each Patient-File in the training-set and tested on the remaining 30%, which was unseen at the time of training (‘unseen-trained’). The results in Figure 8.3 reveal an inverse trend to that of the ‘zero-training’ Patient-Files. The highest S1-Score is at a training-set size of g = 1, and both the linear and 4th degree polynomial trends display a monotonic decline in performance as the size of the training-set increases. Overall, the S1-Score decreases as g grows, but it varies across group sizes. Some of the highest values, all above the mean S1-Score, are at g = 2, 3, 4, 5, 6 and 9. The minimum S1-Score is 81.56%, at g = 19. The plot in Figure 8.4 displays a similar trend to Figure 8.3 for Accuracy.
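The two evaluation sets used in this section can be sketched as follows: each Patient-File in the training group is split 70/30, the 30% parts form the ‘unseen-trained’ test data, and every Patient-File outside the group is ‘zero-training’. This is only an illustration under stated assumptions. Whether the original split was random or chronological is not specified here, so the shuffled split, the integer file identifiers and the function names are all hypothetical.

```python
import random


def split_patient_file(n_samples, train_fraction=0.7, seed=0):
    """Split sample indices of one Patient-File into train and held-out parts.

    Assumption: a seeded random split; the original may have split differently.
    """
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    cut = int(round(train_fraction * n_samples))
    return sorted(indices[:cut]), sorted(indices[cut:])


def build_evaluation_sets(file_sizes, training_group):
    """Return (train, unseen_trained, zero_training) for one training group.

    file_sizes maps an integer Patient-File id to its number of samples.
    """
    train, unseen_trained = {}, {}
    for file_id in training_group:
        train[file_id], unseen_trained[file_id] = split_patient_file(
            file_sizes[file_id], seed=file_id
        )
    zero_training = [f for f in file_sizes if f not in training_group]
    return train, unseen_trained, zero_training


# Hypothetical sizes for three Patient-Files, trained on files 0 and 1.
sizes = {0: 100, 1: 200, 2: 50}
train, unseen_trained, zero_training = build_evaluation_sets(sizes, (0, 1))
```

Scoring a classifier on `unseen_trained` corresponds to the curves in Figures 8.3 and 8.4, while scoring it on the files in `zero_training` corresponds to Figures 8.1 and 8.2.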
Figure 8.3. The mean S1-Score of the Multi-Patient Analysis on 30% ‘unseen-trained’ Patient-Files: Across the X-axis, the number of Patient-Files in the training-set is displayed. The blue line is the mean S1-Score of classification of each group size on unseen data. The red line displays the linear fitting and the green curve is the 4th degree polynomial fitted to the main plot. Maximum, mean and minimum values of the mean S1-Score across group sizes are displayed respectively in dashed blue, green and cyan.
The diagrams in Figures 8.1 to 8.4 suggest that, in general, as the number of patients in a training-set grows, performance on the unseen parts of the training-set decreases while generalisation to ‘zero-training’ patients improves. Taking both results together, we conclude that for the population of 21 patients, g = 4, 5 and 6 yield relatively high performance on ‘zero-training’ Patient-Files as well as on ‘unseen-trained’ Patient-Files.
Figure 8.4. The mean Accuracy of the Multi-Patient Analysis on 30% ‘unseen-trained’ Patient-Files: The X-axis displays the number of Patient-Files in the training-set and the Y-axis displays the Accuracy averaged over all classifiers in the respective group.