6.3.1 Feature extraction

6.3.1.3 Neural synchronisation

Research in neuroscience rests on the assumption that brain networks determine function, and that function is deeply rooted in anatomical connectivity. The structural connectivity network of a brain area can predict the functional response of that area. Saygin et al. (2011) reported that anatomical connectivity patterns can predict face selectivity in the fusiform gyrus. Building on that work, Jbabdi and Behrens (2012) stated:

“Brain regions exhibit specialization for different functions, but such functions are constrained by anatomical connections to other brain regions. By measuring these connections, we can predict complex functional responses before the subject has even performed the task.”

These findings indicate that structural and functional organisation in the brain are coupled. Accordingly, analysis of functional coupling in oscillatory neural activity at different levels of the motor system, or between functionally connected areas of the brain, has led not only to a better understanding of the inherent mechanisms of movement control but also to additional discriminative information for decoding neural activity (Wang et al. 2007). One widely used method of estimating the functional coupling between two oscillatory signals is the ordinary coherence, also called magnitude-squared coherence (MSC). The MSC is a normalised cross-spectral density function that measures the strength of association and relative linearity between two stationary processes on a scale from zero to one. The coherence value indicates the strength of the coupling between the two signals in the frequency domain. The conditional coupling among multiple signals may be further measured by partial coherence, as mentioned in chapter 3 (cf. section 3.3). However, techniques based on correlation or coherence are not sufficient to describe the direction of interdependence among signals, so they do not help to elucidate directional coupling or causal relationships within the system. Therefore, to fully understand information processing from the oscillatory neural activity at different levels of the motor system, directional interaction analysis that reveals causal influence or synchronisation between neural signals is essential for uncovering more specific information underlying the motor activity for decoding.
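The MSC described above can be illustrated with a minimal sketch. It assumes a Welch-style estimate (Hann-windowed segments with 50% overlap), which is one common estimator and not necessarily the one used in this work; the function name `msc`, the sampling rate and the test signals are all hypothetical.

```python
import numpy as np

def msc(x, y, fs=1.0, nperseg=256):
    """Welch-style magnitude-squared coherence of two equal-length signals."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    step = nperseg // 2                      # 50% overlap between segments
    window = np.hanning(nperseg)
    pxx = 0.0                                # accumulated auto-spectrum of x
    pyy = 0.0                                # accumulated auto-spectrum of y
    pxy = 0.0 + 0.0j                         # accumulated cross-spectrum
    for start in range(0, len(x) - nperseg + 1, step):
        xs = x[start:start + nperseg]
        ys = y[start:start + nperseg]
        fx = np.fft.rfft(window * (xs - xs.mean()))
        fy = np.fft.rfft(window * (ys - ys.mean()))
        pxx = pxx + np.abs(fx) ** 2
        pyy = pyy + np.abs(fy) ** 2
        pxy = pxy + fx * np.conj(fy)
    freqs = np.fft.rfftfreq(nperseg, d=1.0 / fs)
    # Normalising the cross-spectrum by both auto-spectra bounds the result in [0, 1].
    return freqs, np.abs(pxy) ** 2 / (pxx * pyy)

# Hypothetical test signals: a shared 20 Hz rhythm buried in independent noise.
rng = np.random.default_rng(0)
fs = 256.0
t = np.arange(4096) / fs
shared = np.sin(2 * np.pi * 20.0 * t)
x = shared + rng.standard_normal(t.size)
y = shared + rng.standard_normal(t.size)

freqs, coh = msc(x, y, fs=fs, nperseg=256)
peak = coh[np.argmin(np.abs(freqs - 20.0))]
print(f"coherence at 20 Hz: {peak:.2f}, median elsewhere: {np.median(coh):.2f}")
```

The coherence is high only at the shared frequency. Swapping `x` and `y` gives the same values, which is why a coherence estimate alone cannot indicate the direction of an interaction.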

Causal relations were initially described in probabilistic terms: one variable may be said to be caused by another if it can be predicted better by incorporating knowledge of the second. Granger formulated the concept in terms of predictability based on linear regression models of stochastic processes (Granger 1969). Under this definition, one time series is caused by another if its prediction error at the present time can be reduced by including the past of the second series in the model. Nowadays, Granger causality is widely used in neuroscience for analysing and identifying directional influences or synchronisations between different brain areas and neural activities (Kaminski & Liang 2005; Silchenko et al. 2010). For instance, Granger causality analysis was performed on oscillatory local field potential activity in the beta (14–30 Hz) frequency range among sensorimotor cortical recording sites during a GO/NO-GO visual pattern discrimination task in monkeys, to reveal the connectivity between different parts of the sensorimotor cortical network (Brovelli et al. 2004). It has also been used to analyse the interdependence between neural and muscular activities (Wang et al. 2007).

6.3.1.3.1 Granger causality

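Let $x(t)$ and $y(t)$ denote the time series from two data channels. According to Granger causality, $y$ (or $x$) causes $x$ (or $y$) if the inclusion of past observations of $y$ reduces the prediction error of $x$ in a linear regression model of $x$ and $y$, as compared to a model which includes only previous observations of $x$. To illustrate Granger causality, the temporal dynamics of $x(t)$ and $y(t)$ with length $T$ can each be described by a univariate autoregressive model as:

$$x(t) = \sum_{j=1}^{p} a_{x}(j)\, x(t-j) + e_{x}(t) \qquad (6.1)$$

$$y(t) = \sum_{j=1}^{p} a_{y}(j)\, y(t-j) + e_{y}(t) \qquad (6.2)$$

Similarly, both $x(t)$ and $y(t)$ can be incorporated together in a bivariate autoregressive model as:

$$x(t) = \sum_{j=1}^{p} a_{xx}(j)\, x(t-j) + \sum_{j=1}^{p} a_{xy}(j)\, y(t-j) + \varepsilon_{x}(t) \qquad (6.3)$$

$$y(t) = \sum_{j=1}^{p} a_{yx}(j)\, x(t-j) + \sum_{j=1}^{p} a_{yy}(j)\, y(t-j) + \varepsilon_{y}(t) \qquad (6.4)$$

where $p$ is the maximum number of lagged observations included in the model (the model order, $p < T$), $a$ denotes the coefficients of the model, and $e_{x}(t)$, $e_{y}(t)$, $\varepsilon_{x}(t)$, $\varepsilon_{y}(t)$ are the prediction errors with variance $\mathrm{var}(\cdot)$ for each of the time series. If the variance of the prediction error $e_{x}$ (or $e_{y}$) is reduced, i.e. $\mathrm{var}(\varepsilon_{x}) < \mathrm{var}(e_{x})$ (or $\mathrm{var}(\varepsilon_{y}) < \mathrm{var}(e_{y})$), by the inclusion of the $y$ (or $x$) terms in equation 6.1 (or 6.2) as in 6.3 (or 6.4), then it is said that $y$ (or $x$) Granger causes $x$ (or $y$). Assuming that $x$ and $y$ are covariance stationary (i.e. unchanging mean and variance), the magnitude of this interaction can be measured by the log ratio of the variances of the prediction errors, and it can be quantified as:

$$F_{y \to x} = \ln\left(\frac{\mathrm{var}(e_{x})}{\mathrm{var}(\varepsilon_{x})}\right) \qquad (6.5)$$

If $F_{y \to x} = 0$, there is no causal influence from $y$ to $x$, and if $F_{y \to x} > 0$, there is causal influence from $y$ to $x$. Similarly, the causal influence from $x$ to $y$ can be defined as:

$$F_{x \to y} = \ln\left(\frac{\mathrm{var}(e_{y})}{\mathrm{var}(\varepsilon_{y})}\right) \qquad (6.6)$$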
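The log variance ratio of equations 6.5 and 6.6 can be sketched in a few lines. This is a minimal illustration, assuming ordinary least-squares fitting of the restricted (own past only) and full (own past plus the other channel's past) models; the helper names and the simulated system, in which the second channel drives the first with a one-sample delay, are hypothetical and not taken from this work.

```python
import numpy as np

def lagged(series, p):
    """Matrix whose columns are series(t-1), ..., series(t-p) for t = p, ..., T-1."""
    T = len(series)
    return np.column_stack([series[p - j:T - j] for j in range(1, p + 1)])

def residual_variance(target, regressors):
    """Mean squared residual of a least-squares fit of target on regressors."""
    coef, *_ = np.linalg.lstsq(regressors, target, rcond=None)
    resid = target - regressors @ coef
    return np.mean(resid ** 2)

def granger_bivariate(x, y, p):
    """Return (F_y_to_x, F_x_to_y) as log ratios of residual variances."""
    x = np.asarray(x, dtype=float) - np.mean(x)
    y = np.asarray(y, dtype=float) - np.mean(y)
    own_x, own_y = lagged(x, p), lagged(y, p)
    both = np.hstack([own_x, own_y])          # regressors of the bivariate model
    f_y_to_x = np.log(residual_variance(x[p:], own_x) / residual_variance(x[p:], both))
    f_x_to_y = np.log(residual_variance(y[p:], own_y) / residual_variance(y[p:], both))
    return f_y_to_x, f_x_to_y

# Hypothetical simulated system: y drives x, but x does not drive y.
rng = np.random.default_rng(1)
T = 2000
x = np.zeros(T)
y = np.zeros(T)
for t in range(1, T):
    y[t] = 0.5 * y[t - 1] + rng.standard_normal()
    x[t] = 0.5 * x[t - 1] + 0.8 * y[t - 1] + rng.standard_normal()

f_y_to_x, f_x_to_y = granger_bivariate(x, y, p=2)
print(f"F(y->x) = {f_y_to_x:.3f}, F(x->y) = {f_x_to_y:.3f}")
```

Because the full model nests the restricted one, the in-sample estimate is never negative, so a small positive value in the non-causal direction is expected. In practice a significance test is needed before interpreting it as an influence.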

It is assumed that the observed data can be well represented by multivariate autoregressive (MVAR) models. If the data are in the form of multiple repetitions of relatively short trials (e.g., event-related data), each trial is considered to be an independent realisation of a single statistically stationary process, such that a single MVAR model can be estimated from the entire dataset. The estimation of an MVAR model requires a parameter specifying the number of time lags, i.e., the model order ($p$). Too small a model order can lead to a poor representation of the data, whereas too large a model order can lead to over-fitting in model estimation. A standard means of identifying the model order is to minimise a criterion that balances the variance accounted for by the model against the number of coefficients to be estimated. The most commonly used criterion is the Akaike information criterion (AIC) (Seth 2010; Wang et al. 2007). For $n$ variables:

$$\mathrm{AIC}(p) = \ln\left(\det(\Sigma)\right) + \frac{2 p n^{2}}{T}, \qquad (6.7)$$

where $\Sigma$ is the estimate of the prediction error covariance matrix of the bivariate autoregressive model and $T$ is the number of data points. The model can be validated by assessing the quality of the model fit through the prediction ratio (Wang et al. 2007), which measures how much of the variance of the signal the model explains, i.e. the percentage of the total variance contributed by the model. This provides an objective criterion for whether the model is capable of characterising the system dynamics. For a perfect fit, the prediction error is zero. If the model is correct and the true parameter values are estimated properly, the prediction error is white noise (Seth 2010). If the autocorrelation function of the prediction error shows pronounced patterns, such as ripples or a slow decline at low lags, this suggests model inadequacy. In cases where the model order specified by the minimal AIC is too large to permit feasible computation, or where the AIC does not reach a clear minimum over the range tested, a smaller model order can be chosen on condition that the AIC shows no further substantial decrease at higher orders (Brovelli et al. 2004).
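The order selection and residual check described above can be sketched as follows. This is a minimal illustration under stated assumptions: the MVAR model is fitted by ordinary least squares, the AIC follows equation 6.7 with $T$ taken as the number of samples, and the function names and the simulated two-channel order-2 process are hypothetical.

```python
import numpy as np

def fit_mvar(data, p):
    """Least-squares MVAR fit; data has shape (T, n). Returns (coefficients, residuals)."""
    T, _ = data.shape
    lags = np.hstack([data[p - j:T - j] for j in range(1, p + 1)])  # shape (T-p, n*p)
    target = data[p:]
    coef, *_ = np.linalg.lstsq(lags, target, rcond=None)
    return coef, target - lags @ coef

def aic(data, p):
    """AIC(p) = ln(det(Sigma)) + 2 p n^2 / T, with Sigma the residual covariance."""
    T, n = data.shape
    _, resid = fit_mvar(data, p)
    sigma = resid.T @ resid / len(resid)
    return np.log(np.linalg.det(sigma)) + 2.0 * p * n ** 2 / T

def select_order(data, max_order):
    """Return the order in 1..max_order with minimal AIC, plus all AIC scores."""
    scores = [aic(data, p) for p in range(1, max_order + 1)]
    return int(np.argmin(scores)) + 1, scores

def residual_autocorr(resid, max_lag):
    """Normalised autocorrelation of a 1-D residual series at lags 1..max_lag."""
    r = resid - resid.mean()
    denom = np.sum(r * r)
    return np.array([np.sum(r[k:] * r[:-k]) / denom for k in range(1, max_lag + 1)])

# Hypothetical two-channel process of true order 2.
rng = np.random.default_rng(2)
T = 3000
data = np.zeros((T, 2))
for t in range(2, T):
    noise = rng.standard_normal(2)
    data[t, 1] = 0.5 * data[t - 1, 1] - 0.4 * data[t - 2, 1] + noise[1]
    data[t, 0] = (0.6 * data[t - 1, 0] - 0.5 * data[t - 2, 0]
                  + 0.4 * data[t - 1, 1] + noise[0])

best_order, scores = select_order(data, max_order=8)
_, resid = fit_mvar(data, best_order)
acf = residual_autocorr(resid[:, 0], max_lag=5)
print("selected order:", best_order)
print("residual autocorrelation, lags 1-5:", np.round(acf, 3))
```

For an adequate model the residual autocorrelations should stay within roughly $\pm 2/\sqrt{T}$ of zero. The AIC can still select an order slightly above the true one, which is consistent with the fallback rule of choosing a smaller order once the AIC stops decreasing substantially.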

6.3.2 Feature selection

Redundant features significantly reduce the efficiency of the pattern classification process and lead to poor generalisation. To avoid redundant features and thus improve the classification process, a feature selection strategy has become essential in many signal and image classification tasks (discussed in chapter 3, section 3.4). This is particularly true for neural signals, which contain highly redundant information. Because of practical issues related to neural data acquisition techniques, lack of training, concentration, discomfort, fatigue, and varied physiological or pathological conditions, it may not always be possible to collect a large amount of reliable data to alleviate the redundancy in the feature space.

Feature selection or dimensionality reduction in the extracted feature space can be achieved by eliminating the features that carry the least useful information. In this chapter, we introduce a new feature selection strategy, weighted sequential feature selection (WSFS) based on the feature ranking, sequential feature selection (SFS) and feature contributions, to efficiently select the optimal subset of features from the available features. The WSFS strategy is capable of selecting the most effective and

set. As discussed in chapter 3 (cf. section 3.4), in the context of pattern classification, feature selection strategies fall into three taxonomical categories, (1) filter, (2) wrapper and (3) embedded approaches, depending on the characteristics of the evaluation and selection criterion (i.e. feature ranking and selection) (Saeys et al. 2007). Depending on the feature ranking and selection criterion used to evaluate it, the SFS strategy is considered either a filter or a wrapper approach. However, our new strategy, WSFS, potentially overcomes the drawbacks of the SFS strategy and minimises the risk of overfitting. Therefore, according to the feature ranking and selection criterion used to evaluate it, the WSFS strategy is considered an embedded approach (cf. 6.3.4.2).
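For orientation, the baseline SFS strategy in its wrapper form can be sketched as below. This is plain sequential forward selection only, not the WSFS strategy, whose weighting scheme is not specified in this section. The nearest-centroid classifier, the interleaved cross-validation folds, the function names and the synthetic data are all assumptions made for illustration.

```python
import numpy as np

def nearest_centroid_cv_accuracy(X, y, folds=5):
    """Cross-validated accuracy of a nearest-centroid classifier (wrapper criterion)."""
    idx = np.arange(len(y))
    correct = 0
    for k in range(folds):
        test = idx[k::folds]                       # interleaved folds
        train = np.setdiff1d(idx, test)
        classes = np.unique(y[train])
        centroids = np.array([X[train][y[train] == c].mean(axis=0) for c in classes])
        dists = ((X[test][:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        correct += np.sum(classes[dists.argmin(axis=1)] == y[test])
    return correct / len(y)

def sequential_forward_selection(X, y, n_select, score=nearest_centroid_cv_accuracy):
    """Greedily add the feature that most improves the score until n_select are chosen."""
    selected, remaining = [], list(range(X.shape[1]))
    while remaining and len(selected) < n_select:
        scored = [(score(X[:, selected + [f]], y), f) for f in remaining]
        _, best_feature = max(scored)
        selected.append(best_feature)
        remaining.remove(best_feature)
    return selected

# Hypothetical data: features 0 and 3 are informative, feature 5 nearly duplicates
# feature 0 (redundant), and the remaining features are pure noise.
rng = np.random.default_rng(3)
n = 600
y = rng.integers(0, 2, size=n)
X = rng.standard_normal((n, 6))
X[:, 0] += 2.0 * y
X[:, 3] += 2.0 * y
X[:, 5] = X[:, 0] + 0.1 * rng.standard_normal(n)

selected = sequential_forward_selection(X, y, n_select=2)
print("selected features:", selected)
```

With this criterion the redundant copy adds almost nothing once its source feature has been chosen, so the greedy search should pick one feature from each informative group. Swapping `score` for a classifier-independent ranking measure would turn the same loop into a filter approach.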