
5.3.2 Testing the Features

The features were tested using 200 ICs; 100 ICs containing eye blinks and 100

each kernel the average error values were estimated with 4-fold cross-validation, i.e. using 75% of the data as training examples and 25% for testing, with no overlap between the two. The cross-validation was performed 10 times; each time the data were randomly rearranged in order to yield a better estimate of the error. To find the value of the parameter C, the average CV test error was evaluated for a range of values of C. The optimum value of C was found to be 64 for both the linear and the cubic polynomial kernels. For the RBF kernel, the parameters C and p were adjusted together, and the optimal values were found to be C = 72 and p = 7. The CV error results are shown in Table 5.1.
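The repeated cross-validation procedure described above can be sketched as follows. This is a minimal illustration using only the Python standard library, not the implementation used in the experiments; the function names, the seed, and the way candidate parameter values are supplied are all assumptions made for the example.

```python
import random

def repeated_kfold_splits(n_samples, n_folds=4, n_repeats=10, seed=0):
    """Yield (train_idx, test_idx) pairs for repeated k-fold cross-validation."""
    rng = random.Random(seed)  # fixed seed (an assumption) for reproducibility
    for _ in range(n_repeats):
        order = list(range(n_samples))
        rng.shuffle(order)  # random rearrangement before each repetition
        for f in range(n_folds):
            test_idx = order[f::n_folds]  # one quarter of the data when n_folds=4
            test_set = set(test_idx)
            train_idx = [i for i in order if i not in test_set]  # no overlap
            yield train_idx, test_idx

def mean_cv_error(error_fn, n_samples, **kwargs):
    """Average error_fn(train_idx, test_idx) over every split."""
    errors = [error_fn(tr, te)
              for tr, te in repeated_kfold_splits(n_samples, **kwargs)]
    return sum(errors) / len(errors)

def best_parameter(candidates, cv_error_for):
    """Return the candidate value with the lowest average CV error."""
    return min(candidates, key=cv_error_for)

# 200 feature vectors, 4 folds, 10 repetitions -> 40 train/test splits.
splits = list(repeated_kfold_splits(200))
```

Here `error_fn` stands for any routine that trains a classifier on the training indices and returns its error on the test indices; `best_parameter` would then be called with a grid of candidate values for C.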

[Figure 5.4. Panel (a): two largest principal components of the feature space. Panel (b): the distribution of the classifier output (horizontal axis: classifier output), with and without eye blinks.]

Figure 5.4: (a) A plot of the two largest principal components of the feature space. There are 200 feature vectors, 100 from normal EEG (+) and 100 from EEG containing eye blinks (o). (b) A histogram showing the output of the classifier before sgn(·) is applied, using the linear kernel.

Table 5.1: The performance of the classifier based on the average number of correctly classified points. Three kernels are compared in the classification.

Kernel              Average classification rate (%) (s.d.)
                    Overall         Normal          Eye Blinks
Gaussian RBF        98.50 (1.00)    98.26 (1.17)    99.03 (1.35)
Cubic Polynomial    94.50 (1.92)    91.15 (2.31)    97.91 (2.04)
Linear              99.00 (1.15)    99.24 (1.11)    99.21 (0.97)

Illustrating the distribution of the feature space becomes difficult when the dimension of the features is greater than 3. In order to understand the distribution of the feature space, one can use an RBF kernel with varying C and p to give further insight into the optimum shape of the separating hyperplane. One would expect the number of SVs to decrease as p decreases. A linear kernel corresponds to p → ∞. Fig. 5.5(a) and Fig. 5.5(b) show the hyperparameter space for the RBF kernel. With the kernel width parameter p finite and the regularisation parameter kept constant, the classifier yields its highest classification rate with the lowest number of SVs. Therefore the feature space can be considered a linear one. The RBF kernel can be considered as either a linear or a non-linear kernel, depending on the values chosen for the parameters C and p.

In the case of the cubic polynomial and linear kernels, the number of support vectors found was 18% and 3.3%, respectively, of the training dataset size. The results in Table 5.1 show that, with the exception of the linear kernel, the classifier had lower classification rates when classifying normal EEG. This may be due to non-ocular artifacts present in the EEG, such as spikes, which produce feature values similar to those of true eye blinks.

The training error was found by using the training data to test the SVM. The training error was found to be 2% (av) and the test error was 3% (av). Since the test error is close to the training error, this indicates that the classifier is not overfitting.

The classifier was further evaluated by plotting the distribution of the classifier output for 200 test points. The output is calculated by applying the classification function in (5.8) without the sgn(·) function. The result from the training data using the linear kernel is shown in Fig. 5.4(b). The ICs containing eye blinks are clustered around and above +1 and the ICs containing normal EEG activity around and below -1. There is minimal overlap between the classifier outputs, indicating that the proposed features are sufficiently significant to the detection of eye blinking artifacts for the test datasets.
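The distinction between the raw classifier output and the final class label can be illustrated with a short sketch. It assumes a linear kernel, for which the output before the sign function reduces to a weighted sum of the features plus a bias; the weights, bias, and feature vectors below are hypothetical values chosen for illustration and are not taken from the experiments.

```python
def decision_value(w, b, x):
    """Raw linear SVM output w.x + b, before the sign function is applied."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b

def classify(w, b, x):
    """Class label: +1 (eye blink) if the raw output is non-negative, else -1."""
    return 1 if decision_value(w, b, x) >= 0 else -1

# Hypothetical weights and four-dimensional feature vectors (illustration only).
w, b = [0.5, -1.0, 2.0, 0.25], -0.5
blink_ic = [1.0, 0.2, 0.9, 0.4]
normal_ic = [0.1, 0.6, 0.0, 0.2]

raw_outputs = [decision_value(w, b, x) for x in (blink_ic, normal_ic)]
labels = [classify(w, b, x) for x in (blink_ic, normal_ic)]
```

A histogram of values such as `raw_outputs` over all test points shows how far each class lies from the decision boundary at zero, which the labels alone do not reveal.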

For the dataset tested there is only a 0.5% difference in the overall classification rate between the linear kernel and the RBF kernel. The cubic polynomial had the lowest overall classification rate. The largest difference in classification performance, 7.1%, was between the RBF and cubic polynomial kernels when classifying normal EEGs. The close overall classification rates are mainly due to the separability of the feature space. Since the linear kernel requires fewer SVs in calculating the OSH, and because of its computational simplicity, the linear kernel will be used to classify eye blinks in the following experiments.

In order to test the significance of the proposed features, their eigenvalues were evaluated as 2.97, 0.68, 0.66 and 0.25; this testifies that the proposed features are significant to the detection of eye blinks in EEGs. A plot of the two largest principal components is shown in Fig. 5.4(a). From Fig. 5.4(a) it can be verified that the multidimensional feature space was linearly nonseparable, in the sense that there was an overlap between the features extracted from ICs containing eye blinking artifacts and those related to normal EEGs.
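As a worked example of how the reported eigenvalues relate to the two-component plot, the fraction of total variance captured by each principal component can be computed directly. This sketch assumes that the eigenvalues are those of the covariance matrix of the four features, which the text does not state explicitly.

```python
eigenvalues = [2.97, 0.68, 0.66, 0.25]  # values reported in the text

total = sum(eigenvalues)
# Fraction of the total variance carried by each principal component.
explained = [ev / total for ev in eigenvalues]
# Fraction retained when only the two largest components are plotted.
top_two = sum(sorted(explained, reverse=True)[:2])
```

Under this assumption the two largest components account for roughly 80% of the variance, so a two-dimensional plot shows most, but not all, of the structure of the feature space.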

The BSS-SVM algorithm was applied to 10 real EEG datasets, each 7 minutes long. The performance of the algorithm can be seen by comparing the EEG data obtained at the electrodes (see Fig. 5.6(a)) with the same segment of data after processing by the proposed algorithm (see Fig. 5.6(b)). The significance of the results was subjectively justified by a clinician at King's College Hospital. The proposed algorithm was compared to EEGs reconstructed by manual artifact rejection (i.e. manually identifying and cancelling the artifact) by calculating the cross-correlation between the BSS-SVM reconstructed EEGs and the manually reconstructed EEGs. The average value of the cross-correlation between the reconstructed EEGs is 0.92 (s.d. 0.02). In a number of trials the effect of ECG was automatically detected and removed, whereas complete removal was not achieved with the method based on manual selection. This had a detrimental effect on the cross-correlation measure, since the BSS-SVM output is less correlated with the manually reconstructed output, but it has a positive effect on the output itself, since less artifact is present.
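A comparison of this kind can be sketched with a zero-lag normalised cross-correlation. The exact definition used in the experiments (for example, whether the mean is removed or other lags are considered) is not given here, so the mean-removed, zero-lag form below is an assumption, and the function name is hypothetical.

```python
import math

def normalised_cross_correlation(x, y):
    """Zero-lag normalised cross-correlation of two equal-length signals."""
    if len(x) != len(y) or not x:
        raise ValueError("signals must be non-empty and of equal length")
    mx, my = sum(x) / len(x), sum(y) / len(y)
    # Covariance-like numerator of the mean-removed signals.
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    # Product of the signal energies after mean removal.
    den = math.sqrt(sum((a - mx) ** 2 for a in x) *
                    sum((b - my) ** 2 for b in y))
    return num / den if den else 0.0  # constant signals carry no correlation
```

Values close to 1 indicate that the two reconstructions follow the same waveform; averaging the measure over channels and datasets gives a single summary figure.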

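As a second criterion for measuring the performance of the overall system, a segment of EEG, $x_{\mathrm{seg}}$, and the corresponding reconstructed EEG, $\hat{x}_{\mathrm{seg}}$, that do not contain any artifact were selected, and the waveform similarity was measured as

$$e_{\mathrm{dB}} = 10 \log \left( \frac{1}{M} \sum_{i=1}^{M} \left( 1 - E\left\{ \left( x_{i,\mathrm{seg}}[n] - \hat{x}_{i,\mathrm{seg}}[n] \right) \right\} \right) \right)$$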

When the value of $e_{\mathrm{dB}}$ is zero, the original and reconstructed waveforms are identical. From ten sets of EEGs the average waveform similarity was $e_{\mathrm{dB}} = -0.009$ dB (standard deviation $10^{-4}$ dB). These results suggest that the observations have been faithfully reconstructed, both in terms of subjective visual inspection and objective performance metrics.
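The waveform similarity measure can be sketched in a few lines. The form used below, ten times the base-10 logarithm of the channel average of one minus the expected difference between the original and reconstructed segments, is read from the definition above; whether the difference is squared or normalised in the original work is not confirmed by this section, so the sketch should be treated as an assumption. The function name is hypothetical.

```python
import math

def waveform_similarity_db(original, reconstructed):
    """e_dB = 10*log10((1/M) * sum_i (1 - E{x_i[n] - xhat_i[n]})), M channels."""
    m = len(original)
    total = 0.0
    for x, x_hat in zip(original, reconstructed):
        # Sample mean over n as an estimate of the expectation E{.}.
        mean_diff = sum(a - b for a, b in zip(x, x_hat)) / len(x)
        total += 1.0 - mean_diff
    # math.log10 raises ValueError if the averaged term is not positive.
    return 10.0 * math.log10(total / m)

# Two hypothetical artifact-free channels, reconstructed perfectly.
segment = [[1.0, 2.0, 3.0], [0.5, -0.5, 0.0]]
identical = waveform_similarity_db(segment, segment)
```

With identical inputs the averaged term is exactly 1, so the measure is 0 dB, which matches the interpretation given in the text.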

5.4 Conclusions

A robust method for removing ocular artifacts from EEGs by fusing BSS and SVM methods is presented in this chapter. The results show that the proposed algorithm identifies and removes the effect of eye blinking artifacts. A second-order method was used to separate the sources, which are spatially and temporally uncorrelated. The main advantage of second-order methods is that they require fewer samples than HOS methods, which leads to lower computational complexity and hence shorter processing times. The efficacy of the SOBI algorithm in the separation of OAs has been demonstrated in [48] and was exploited in this algorithm to extract features from the ICs. A second-order method for source separation was used since, unlike higher-order methods, it exploits the time structure of the EEGs. The EEGs are separated using the time-lagged SOBI algorithm, the identified artifacts are autonomously cancelled, and the EEG is then reconstructed from the remaining ICs.
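The final cancellation and reconstruction step can be sketched as follows, assuming a linear instantaneous mixing model in which each electrode signal is a weighted sum of the ICs. The mixing matrix, sources, and flags below are hypothetical; in practice the mixing matrix would be estimated by the separation algorithm and the flags produced by the classifier.

```python
def reconstruct_without_artifacts(mixing, sources, artifact_flags):
    """Rebuild the channels as mixing @ sources with flagged components zeroed."""
    n_samples = len(sources[0])
    # Replace every flagged component by zeros; keep the others unchanged.
    kept = [[0.0] * n_samples if flagged else row
            for row, flagged in zip(sources, artifact_flags)]
    # Plain matrix product: one output row per electrode.
    return [[sum(a * s[n] for a, s in zip(mix_row, kept))
             for n in range(n_samples)]
            for mix_row in mixing]

# Hypothetical example with 2 electrodes and 2 components.
mixing = [[1.0, 0.5],
          [0.2, 1.0]]
sources = [[1.0, 2.0, 3.0],      # component kept
           [10.0, 0.0, -10.0]]   # component flagged as an eye blink
clean = reconstruct_without_artifacts(mixing, sources, [False, True])
```

Zeroing a component before remixing removes its contribution from every electrode at once, which is why a single correctly classified IC is enough to suppress the artifact across the whole recording.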

Four features were identified as effective descriptors of eye blinking components. The selection of features was based on statistical measurements such as KL distance, cross-correlation, power ratio, and skewness. The experiments herein demonstrate that, for the test dataset, the eye blinking sources are effectively classified by using the introduced features, especially when the linear kernel is used for the SVM. It was demonstrated that the feature space is linearly separable by fixing the RBF kernel width parameter p, adjusting the slack parameter C, and examining the number of support vectors found and the corresponding classification rate. Based on the experimental data, the BSS-SVM algorithm consistently removes the effect of eye blinking artifacts from the EEGs. When removing artifacts from long data sets, manual removal becomes infeasible and automated techniques are therefore required.

[Figure 5.5. Panel (a): classification rates for the RBF kernel. Panel (b): percentage of training data used as support vectors.]

Figure 5.5: The (a) classification rate and (b) number of support vectors required for various parameter values of the RBF kernel.

[Figure 5.6. Panel (a): the EEG contaminated by eye blinking artifacts. Panel (b): the EEG corrected for eye blinking artifacts (horizontal axis: sample).]

Figure 5.6: A selection of 8 electrodes from a 16-electrode EEG recording. (a) The OAs are clear between samples 400 to 600, 900 to 1400, and 1700 to 1900. They are more prominent over the frontal electrodes (FP1, FP2, etc.). (b) The same segment of EEGs after the eye blinking artifacts are removed using the proposed BSS-SVM algorithm.

Chapter 6

In document Signal processing algorithms for brain computer interfacing (Page 103-111)