Comparison of linear, logarithmic and mel-frequency filter-bank energy cepstra in automatic seizure detection using radial basis function neural network

Chandrakar Kamath¹

Background

Epilepsy is the most prevalent chronic neurological disorder, afflicting about 1-3% of the world population [1]. Epileptic patients suffer from recurrent unprovoked epileptic seizures, which are episodic, rapidly evolving, temporary events. Epilepsy can be controlled, but not cured, with anti-epileptic medication. Until now, not much is understood about the occurrence of the epileptic seizure and the mechanism underlying it. Long-term inpatient/ambulatory electroencephalogram (EEG) recordings, lasting from a few hours to several days and containing interictal and ictal hallmarks of epilepsy, are required clinically to diagnose and monitor epilepsy and to localize the epileptogenic zone [2-3]. The traditional methods rely on well-trained clinical neurophysiologists who visually inspect the entire lengthy EEG signals. However, this is a costly, tedious, and time-consuming process. Therefore, many automated epileptic detection systems have been proposed in recent years [4]. Such automated systems considerably reduce the time taken to review long-term EEG recordings off-line and enable the neurologist to diagnose and treat more patients in a given time.

¹Manipal Institute of Technology, India. Correspondence: Chandrakar Kamath. Email: kamath. [email protected]

The entire process of epileptic seizure detection can generally be subdivided into two main stages: (1) feature extraction and (2) classification. Selecting an optimal set of significant features plays an important role in developing a good classification system. Different methods have been used to extract diverse features, including those which capture the frequency, energy, and structural content of the signal, for the task of epileptic seizure detection [5-8]. In a recent study, we found that the overall performance of both the composite vectors of the traditional cepstrum (CEP) deteriorated compared to that of the baseline vector in the seizure detection and classification of EEG segments [9]. However, not many studies have explored in sufficient depth features used in other domains of signal processing, such as the filter-bank energy cepstrum (FBE-CEP), for seizure detection. FBE-CEPs have been used for speech recognition and analysis [10-15].

However, to the best of our knowledge, this is the first study in which frequency-scaled FBE-CEPs are applied to the EEG database by Andrzejak et al. and investigated for unbalanced EEG data classification in the eight classification problems discussed below [16]. We also compare the performance of our approach with that of other researchers who have used the same database.

There are two approaches to the automated detection of seizures. The first is based on a set of heuristic rules and thresholds. The second is based on a classifier that employs pattern recognition techniques. In the former approach, the results depend upon a single operating point and hence there is not much control over the accuracy. The latter, on the other hand, permits the classifier to adapt to the desired performance and meet the requirements.

Hence, we adopt the latter approach. The rationale behind choosing the radial basis function neural network (RBFNN) is that: (1) the earlier literature shows that RBFNN is a more suitable classifier in medical applications because of its simplicity and faster learning abilities due to locally tuned neurons [17]; (2) RBFNN is also suitable from the point of view of its high speed, high accuracy, strong tolerance to input noise, and real-time property in updating the network structure [17].

Abstract

Background: Epilepsy is a neurological disorder affecting a very large number of people worldwide. A large fraction of epilepsy patients have poorly controlled epilepsy. The conventional method relies on experienced neurophysiologists, who visually examine the continuous long-term inpatient/ambulatory electroencephalogram (EEG) signal. This procedure is tedious, time-consuming, and not cost-effective. Several automated epileptic detection and classification systems have been proposed, and such systems enable the neurologist to diagnose and treat more patients in a given time. Not many studies have explored in adequate depth features used in other areas of signal processing, such as the seldom-used filter-bank energy cepstrum (FBE-CEP), for seizure detection.

Methods: Epileptic seizures are abnormal, transient, recurrent discharges in the brain, with signatures that manifest in EEG recordings as frequency changes and increased amplitudes. We employed static and dynamic features derived from FBE-CEP to capture these changes in amplitude and frequency. We compared the diagnostic performance of the baseline vector and the two composite vectors of the linear-, logarithmic-, and mel-frequency FBE-CEPs in epileptic seizure detection on a standard publicly available EEG database. The comparison was carried out on eight different classification problems in the medical field related to epilepsy, using a radial basis function neural network.

Results: All the three FBE-CEP methods, irrespective of frequency scaling, showed excellent overall performance.

Conclusion: The static and dynamic features derived from FBE-CEPs outperform those derived from CEP, suggesting their suitability in epilepsy seizure detection. (El Med J 2:2; 2014)

Keywords: Electroencephalogram, Epilepsy, Filter-bank Energy Cepstrum, Radial Basis Function Neural Network, Seizure Detection

Methods

EEG Records

The EEG data used for this work are from the publicly available Bonn University EEG database [16]. This database was chosen because many seizure detection methods have employed it, which makes it easy to compare the end results. The database consists of five sets (designated Z, O, N, F, and S), each containing 100 single-channel EEG segments of 23.6 second duration.

These segments were selected from continuous multi-channel EEG recordings after removal of artifacts such as muscle activity or eye movements, making sure that they fulfilled stationarity requirements. Sets Z and O contain segments taken from surface EEG recordings acquired from five healthy volunteers using a standard 10-20 electrode placement scheme. The subjects were awake and relaxed, with their eyes open for set Z and eyes closed for set O. The segments for sets N, F, and S were acquired from five epileptic patients undergoing pre-surgical diagnosis. The type of epilepsy identified was temporal lobe epilepsy with the hippocampal formation as the epileptogenic focus. These recordings were taken from intracranial electrodes, as they offer the most precise access to the emergence of seizures. Sets N and F contain only activity measured during seizure-free intervals (interictal epileptiform activity), with segments in set N recorded from the hippocampal formation of the opposite hemisphere of the brain and those in set F recorded within the epileptogenic zone. Set S, on the other hand, contains only seizure activity (ictal intervals), with all segments recorded from sites exhibiting ictal activity. The patients had attained complete seizure control after resection of one of the hippocampal formations, which was confirmed to be the epileptogenic zone. All the EEG signals were recorded with the same 128-channel amplifier system using an average common reference. The data were digitized at 173.61 samples per second with 12-bit resolution. The bandpass filter setting was 0.53-40 Hz (12 dB/octave). Each single-channel EEG segment has 4096 samples.
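As a quick consistency check on the figures quoted above, the following sketch computes the segment duration implied by 4096 samples at 173.61 samples per second, together with the Nyquist frequency and the total number of segments. The variable names are illustrative assumptions; only the numeric values come from the text.

```python
# Values quoted in the text above; the names are illustrative only.
SAMPLING_RATE_HZ = 173.61
SAMPLES_PER_SEGMENT = 4096
SEGMENTS_PER_SET = 100
SET_NAMES = ["Z", "O", "N", "F", "S"]

# Duration of one segment in seconds.
segment_duration_s = SAMPLES_PER_SEGMENT / SAMPLING_RATE_HZ
# Highest frequency representable at this sampling rate.
nyquist_hz = SAMPLING_RATE_HZ / 2.0
# Number of segments in the whole database.
total_segments = SEGMENTS_PER_SET * len(SET_NAMES)

print(f"segment duration : {segment_duration_s:.1f} s")
print(f"Nyquist frequency: {nyquist_hz:.1f} Hz")
print(f"total segments   : {total_segments}")
```

The 40 Hz upper cut-off of the bandpass filter lies well below the Nyquist frequency computed here.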

In this work, we cover the six classification problems (CPs) proposed by Guo et al and Tzallas et al [18-20]. To encompass other discriminations in the medical field related to epilepsy, we have included two more CPs.

1. In the first CP, two classes, normal (set Z only) and seizure (set S), are examined. In this CP, 200 EEG segments are used.

2. In the second CP, two classes, namely non-seizure (Z, N, and F) and seizure (S), are examined. In this CP, the dataset includes 400 EEG segments.

3. In the third CP, again, two classes, non-seizure (Z, O, N, and F) and seizure (S), are examined. In this CP, 500 EEG segments are used.

4. In the fourth CP, three classes are examined: normal (Z), non-seizure (F), and seizure (S). In this case, 300 segments are used.

5. The fifth CP divides five datasets comprising 500 EEG segments into three classes: normal (Z and O), non-seizure (N and F), and seizure (S).

6. The sixth CP divides five datasets comprising 500 EEG segments into five individual classes: eyes-open (Z), eyes-closed (O), non-seizure interictal (N), non-seizure interictal (F), and seizure (S).

7. In the seventh CP, three datasets comprising 300 EEG segments are divided into two classes, non-seizure (N and F) and seizure (S).

8. Finally, in the eighth CP, three classes comprising 300 EEG segments, normal (Z), non-seizure (N), and seizure (S), are examined.

The first three CPs were proposed by Guo et al, the next three CPs were proposed by Tzallas et al, while the seventh and eighth are proposed by us [18-20]. These CPs have been chosen such that they are close to clinical applications.
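The eight CPs can be written down compactly as a mapping from class labels to the sets that form each class. The sketch below is only an illustration: the dictionary layout, the label strings, and the helper names are assumptions, while the set assignments and the figure of 100 segments per set follow the text. It also makes the class imbalance in some CPs explicit.

```python
SEGMENTS_PER_SET = 100  # from the database description

# CP number -> {class label: sets that make up the class}.
# Label strings are illustrative; set S is the seizure (ictal) set.
CLASSIFICATION_PROBLEMS = {
    1: {"normal": ["Z"], "seizure": ["S"]},
    2: {"non-seizure": ["Z", "N", "F"], "seizure": ["S"]},
    3: {"non-seizure": ["Z", "O", "N", "F"], "seizure": ["S"]},
    4: {"normal": ["Z"], "non-seizure": ["F"], "seizure": ["S"]},
    5: {"normal": ["Z", "O"], "non-seizure": ["N", "F"], "seizure": ["S"]},
    6: {"eyes-open": ["Z"], "eyes-closed": ["O"],
        "interictal-N": ["N"], "interictal-F": ["F"], "seizure": ["S"]},
    7: {"non-seizure": ["N", "F"], "seizure": ["S"]},
    8: {"normal": ["Z"], "non-seizure": ["N"], "seizure": ["S"]},
}


def class_sizes(cp):
    """Number of EEG segments in each class of the given CP."""
    return {label: len(sets) * SEGMENTS_PER_SET
            for label, sets in CLASSIFICATION_PROBLEMS[cp].items()}


def total_segments(cp):
    """Total number of EEG segments used in the given CP."""
    return sum(class_sizes(cp).values())


for cp in sorted(CLASSIFICATION_PROBLEMS):
    print(cp, total_segments(cp), class_sizes(cp))
```

For example, the third CP pairs 400 non-seizure segments with 100 seizure segments, which is the kind of unbalanced data the classifier has to handle.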

Cepstrum derived from log magnitude spectrum

Cepstrum (CEP) analysis is a nonlinear signal processing technique with a variety of applications in areas such as speech and image processing. It is possible to compare two relatively long time series with only a few cepstral coefficients. This implies that if two cepstral series are close then the corresponding signals have a similar evolution in time.

The real CEP is defined as the inverse Fourier transform of the log magnitude spectrum as given by:

Cr[k] = IDFT {log |DFT {x[n]}|} (1)

where Cr[k] represents the k^th order real cepstral coefficient and x[n] is the discrete time signal whose cepstrum is to be computed. If the inverse Fourier transform is replaced by the discrete cosine transform (DCT), the resulting equation becomes:

C[k] = DCT {log |DFT {x[n]}|} (2)

where C[k] represents the k^th order pseudo cepstral coefficient.

The advantages are that: (1) DCT has better energy compaction properties than the DFT and hence decreases memory requirements; (2) it reduces the computational complexity drastically without degrading the information content in the CEP and hence decreases execution time; and (3) DCT produces highly uncorrelated features.

The resulting sequence of coefficients C[k], called pseudo CEP, is an approximation to the CEP, and in reality simply represents an orthogonal and compact representation of the log magnitude spectrum.

The difference between cepstral coefficients of different time series can serve as a similarity measure among these time series. The cepstral coefficients decay rapidly to zero and hence only the first few coefficients are needed to capture most of the dynamic information in the time series. This property of cepstral coefficients helps in reducing the dimensionality. Also, the number of coefficients to be retained does not depend upon the length of the time series.

Moreover, the higher order coefficients represent the excitation process, which is less useful. The coefficient C[0] is similar to the log energy (or DC component) of the signal and represents the segment energy. It is usually not treated as a cepstral coefficient and is dropped.
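Eqns. (1) and (2) can be sketched in a few lines of plain Python. This is an illustration under stated assumptions rather than the implementation used in the study: the function names, the small `floor` that guards against log(0), the use of a DCT-II, and the choice to apply the DCT to the one-sided spectrum are all assumptions, and the direct O(N^2) transform is only suitable for short examples.

```python
import cmath
import math


def dft(x):
    """Direct O(N^2) discrete Fourier transform (fine for short examples)."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]


def log_magnitude_spectrum(x, floor=1e-12):
    """log|DFT{x[n]}|; `floor` (an assumption) guards against log(0)."""
    return [math.log(max(abs(c), floor)) for c in dft(x)]


def real_cepstrum(x):
    """Eqn. (1): inverse DFT of the log magnitude spectrum."""
    log_mag = log_magnitude_spectrum(x)
    n = len(log_mag)
    return [(sum(log_mag[k] * cmath.exp(2j * cmath.pi * k * t / n)
                 for k in range(n)) / n).real
            for t in range(n)]


def pseudo_cepstrum(x, num_coeffs=9):
    """Eqn. (2): DCT-II of the log magnitude spectrum, with C[0] dropped."""
    log_mag = log_magnitude_spectrum(x)
    half = log_mag[:len(x) // 2 + 1]   # assumption: one-sided spectrum only
    m = len(half)
    coeffs = [sum(v * math.cos(math.pi * k * (i + 0.5) / m)
                  for i, v in enumerate(half))
              for k in range(num_coeffs + 1)]
    return coeffs[1:]                  # C[0] is not used as a feature


# Deterministic toy signal (not EEG data).
signal = [math.sin(2 * math.pi * 5 * t / 64) + 0.2 * math.sin(1.7 * t * t)
          for t in range(64)]
c_real = real_cepstrum(signal)
c_pseudo = pseudo_cepstrum(signal, num_coeffs=9)
print(len(c_real), len(c_pseudo))
```

Whatever the length of the input, `pseudo_cepstrum` returns the same small number of coefficients, which is the dimensionality reduction described above.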

Filter-bank energy Cepstrum (FBE-CEP)

In filter-bank based systems, the signal magnitude spectrum is divided into a few subbands using a multirate filter-bank. The filter-bank uses triangular filters spread over the whole frequency range from zero up to the Nyquist frequency. The center frequencies and the bandwidths are determined by the frequency scaling of the filter-bank. In filter-bank based automatic speech recognition systems, filter-banks with linear, logarithmic, and Mel scales have been used [10-15]. The cepstra derived from these filter-bank energies are respectively designated as linear-frequency FBE-CEP, logarithmic-frequency FBE-CEP, and Mel-frequency FBE-CEP. The magnitude spectrum of the signal is computed and warped to the frequency scale of the corresponding filter-bank, followed by the usual log and DCT computation using eqn. (2), to obtain the FBE-CEPs of the EEG segment under consideration. We designate the resulting cepstrum by CFB[k].

The coefficient CFB[0] is similar to the log energy (or DC component) of the signal. As with CEP, we do not use CFB[0] in this study. There are two major differences between traditional CEP and FBE-CEP: (1) the CEP coefficients are derived from the entire full-band signal spectrum, while the FBE-CEP coefficients are derived from the spectrum of the filter-bank energy; (2) the frequency scale for CEP computation is always linear, while that for FBE-CEP can be linear, logarithmic, or Mel scale.
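The FBE-CEP pipeline described above (magnitude spectrum, triangular filter-bank on a chosen frequency scale, log, DCT, drop CFB[0]) can be sketched as follows. Several details are assumptions made only for illustration, since the text does not fix them: the number of filters, the widely used Mel formula 2595 log10(1 + f/700), the log(1 + f) warping used for the logarithmic scale, the placement of filter edges at equal steps on the warped axis, and all function names.

```python
import cmath
import math

# (warp, unwarp) pairs; the log and Mel formulas are assumptions, see lead-in.
SCALES = {
    "linear": (lambda f: f, lambda w: w),
    "log": (lambda f: math.log(1.0 + f), lambda w: math.exp(w) - 1.0),
    "mel": (lambda f: 2595.0 * math.log10(1.0 + f / 700.0),
            lambda w: 700.0 * (10.0 ** (w / 2595.0) - 1.0)),
}


def magnitude_spectrum(x):
    """One-sided magnitude spectrum via a direct O(N^2) DFT."""
    n = len(x)
    return [abs(sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n)
                    for t in range(n)))
            for k in range(n // 2 + 1)]


def filter_edges(scale, num_filters, nyquist_hz):
    """Edge frequencies (Hz), equally spaced on the warped axis from 0 to Nyquist."""
    warp, unwarp = SCALES[scale]
    top = warp(nyquist_hz)
    return [unwarp(top * i / (num_filters + 1)) for i in range(num_filters + 2)]


def triangular_weight(f, left, centre, right):
    """Triangular filter response: rises to 1 at `centre`, zero outside."""
    if left < f <= centre:
        return (f - left) / (centre - left)
    if centre < f < right:
        return (right - f) / (right - centre)
    return 0.0


def fbe_cep(x, fs_hz, scale="mel", num_filters=12, num_coeffs=9, floor=1e-12):
    """Filter-bank energies -> log -> DCT-II; CFB[0] is dropped."""
    mags = magnitude_spectrum(x)
    freqs = [k * fs_hz / len(x) for k in range(len(mags))]
    edges = filter_edges(scale, num_filters, fs_hz / 2.0)
    log_energies = []
    for i in range(num_filters):
        left, centre, right = edges[i:i + 3]
        energy = sum(triangular_weight(f, left, centre, right) * m
                     for f, m in zip(freqs, mags))
        log_energies.append(math.log(max(energy, floor)))
    cep = [sum(e * math.cos(math.pi * k * (m + 0.5) / num_filters)
               for m, e in enumerate(log_energies))
           for k in range(num_coeffs + 1)]
    return cep[1:]


FS_HZ = 173.61  # sampling rate quoted for the database
# Deterministic toy signal (not EEG data).
signal = [math.sin(2 * math.pi * 8 * t / FS_HZ) + 0.3 * math.sin(0.9 * t * t)
          for t in range(256)]
features = {scale: fbe_cep(signal, FS_HZ, scale=scale) for scale in SCALES}
for scale, vec in features.items():
    print(scale, [round(v, 3) for v in vec])
```

Only the placement of the filter edges changes between the three variants; the log and DCT steps are shared, which matches the description of the frequency scale as the distinguishing factor.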

Radial basis function neural network (RBFNN)

In this work, we employ a radial basis function neural network (RBFNN) for the classification of normal, non-seizure, and seizure segments through FBE-CEPs derived from EEG signals. RBFNN has the advantages of easy design, good generalization, strong tolerance to input noise, and online learning ability. The properties of RBF networks make them very suitable for designing flexible control systems [17].

RBFNNs are nonlinear hybrid networks, which usually contain a single layer of hidden neurons. The general architecture of a typical RBFNN is shown in figure 1. There are three layers: an input layer, a hidden layer, and an output layer. Each input neuron corresponds to an element from the input vector and is connected to the k hidden layer neurons. Each hidden neuron is connected to the output neurons. The number of neurons in the output layer is equal to the number of possible classes n in the CP. The input layer broadcasts the coordinates of the input vector to each of the nodes in the hidden layer. Each node in the hidden layer then produces an activation based on the associated radial basis function. Finally, each node in the output layer computes a linear combination of the activations from the hidden nodes. The output nodes of an RBFNN can be described as:

$$C_j(x) = \sum_{i=1}^{k} w_{ji}\, \phi\left(\lVert x - \mu_i \rVert;\ \sigma_i\right), \quad 1 \le j \le n \tag{5}$$

where $C_j(x)$ represents the function corresponding to the j^th output unit, or class j, and is a linear combination of k radial basis functions $\phi$ with centers $\mu_i$ and bandwidths $\sigma_i$. $w_j$ is the weight vector of class j and $w_{ji}$ is the weight connecting the j^th class and the i^th center. The basis function commonly used in the RBFNN to solve pattern recognition problems is the Gaussian function, with which eqn. (5) becomes:

$$C_j(x) = \sum_{i=1}^{k} w_{ji} \exp\left(-\frac{\lVert x - \mu_i \rVert^2}{2\sigma_i^2}\right), \quad 1 \le j \le n \tag{6}$$

From eqn. (6) it can be observed that the output of the RBFNN depends upon the total number of neurons k, the weights $w_{ji}$ between the hidden layer and the output layer, the centers of the neurons $\mu_i$, and the bandwidths of the neurons $\sigma_i$. This implies that the performance of the RBFNN is determined by the selection of the right parameters. An RBFNN can be trained in different ways. In one of the conventional methods, the training begins with a predetermined network structure. Then the centers and the bandwidths are trained. Several methods have been proposed to find the centers, of which clustering-based methods are popular.
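The forward pass of an RBFNN with Gaussian basis functions, in which each hidden activation decays with the squared distance from its center, can be sketched as below. The function names and the toy centers, bandwidths, and weights are assumptions chosen only to make the example runnable; training of the centers, bandwidths, and weights is not shown.

```python
import math


def gaussian_activation(x, centre, sigma):
    """Hidden-node activation exp(-||x - centre||^2 / (2 sigma^2))."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, centre))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


def rbfnn_outputs(x, centres, sigmas, weights):
    """Output C_j(x) for every class j; weights[j][i] plays the role of w_ji."""
    activations = [gaussian_activation(x, c, s)
                   for c, s in zip(centres, sigmas)]
    return [sum(w * a for w, a in zip(row, activations)) for row in weights]


def predict_class(x, centres, sigmas, weights):
    """Index of the output node with the largest response."""
    outputs = rbfnn_outputs(x, centres, sigmas, weights)
    return max(range(len(outputs)), key=outputs.__getitem__)


# Toy network: k = 2 hidden nodes, n = 2 classes (values are illustrative).
centres = [[0.0, 0.0], [3.0, 4.0]]
sigmas = [1.0, 1.0]
weights = [[1.0, 0.0],   # class 0 listens to the first hidden node
           [0.0, 1.0]]   # class 1 listens to the second hidden node

print(predict_class([0.2, -0.1], centres, sigmas, weights))
print(predict_class([2.8, 4.1], centres, sigmas, weights))
```

An input close to a center activates mainly that hidden node, so the class whose weights favor that node wins; this is the locally tuned behavior referred to earlier.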

Figure 1: A typical structure of RBFNN

In this work, we use a MATLAB toolbox, which greatly simplifies the implementation of the required RBFNN. The RBFNN uses a radial basis layer which requires a parameter, the spread constant, to be fixed. It is important that the spread constant be large enough that the radial basis layer neurons respond to overlapping regions of the input space, but not so large that all the neurons respond in essentially the same manner. According to the MATLAB manual, the default value of the spread constant for the RBFNN is s = 1.

Results

Now, we compare the diagnostic capability of the three FBE-CEPs, (1) linear scale FBE-CEP, (2) logarithmic scale FBE-CEP, and (3) Mel scale FBE-CEP, and their composite vectors in the above eight CPs using RBFNN. Repeating the same procedure as in [9], we found empirically that an analysis window length W ≥ 868 samples (5.0 seconds), a spread constant s ≤ 5 for RBFNN, and a number of cepstral coefficients N ≥ 9 lead to optimum results in all three cases. In this work, FBE-CEPs are computed using a 1000-sample sliding window with 50% overlap between consecutive windows and N = 9, and the RBFNN uses s = 1. Distance-based classifiers demand normalization of the data, and hence the feature vectors are normalized before they are applied to RBFNN. We adopted the leave-one-record-out cross-validation method. Specifically, we ran 10 runs of a 10-fold cross-validation (with 10 runs for each fold split), giving a total of 100 RBFNN runs that are averaged to produce the final result. With each new fold split, the EEG data segments are randomized.
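The windowing and the repeated cross-validation can be illustrated with a short sketch. It reflects one reading of the procedure, not a confirmed reproduction: the helper names, the fixed random seed, the interleaved fold assignment, and the interpretation of "10 runs of a 10-fold cross-validation" as 10 reshuffles with 10 folds each are assumptions. Normalization and the classifier itself are left out.

```python
import random


def window_starts(num_samples, window, overlap=0.5):
    """Start indices of sliding windows that fit entirely in the segment."""
    hop = int(window * (1.0 - overlap))
    return list(range(0, num_samples - window + 1, hop))


def repeated_kfold(num_items, k=10, runs=10, seed=0):
    """Yield (run, fold, train_idx, test_idx); items are reshuffled per run."""
    rng = random.Random(seed)          # fixed seed is an assumption
    for run in range(runs):
        order = list(range(num_items))
        rng.shuffle(order)
        for fold in range(k):
            test_idx = order[fold::k]  # every k-th item forms one fold
            held_out = set(test_idx)
            train_idx = [i for i in order if i not in held_out]
            yield run, fold, train_idx, test_idx


# 4096-sample segment, 1000-sample window, 50% overlap (values from the text).
starts = window_starts(4096, 1000, overlap=0.5)
print(len(starts), starts)

# Example with 200 segments, as in the first CP.
splits = list(repeated_kfold(200, k=10, runs=10))
print(len(splits))
```

Splitting at the level of whole segments, as in this sketch, keeps all windows of one record on the same side of each split.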

First, we compare the performance of the three FBE-CEP baseline vectors in general EEG seizure detection. The comparison is carried out on each of the abovementioned eight CPs, which have been widely used in the literature related to epilepsy.

Typical EEG segments, one from each dataset (in the order Z, O, N, F, and S), are shown in figure 2. Figures 3, 4, and 5 show the first 9 coefficients corresponding to the baseline vector and the two composite vectors (from top to bottom) of the three FBE-CEPs (linear, log, and Mel scale, respectively), for the same EEG segments shown in figure 2, in the same order. Though there appears to be some similarity in the shape of the corresponding plots, the feature values are different. The similarity is an outcome of the lower and narrower frequency range (0.5 to 40 Hz) of the EEG signal, compared to that of the speech signal, where the frequency range is higher and broader (300 to 3000 Hz). Descriptive results of RBFNN analysis using FBE-CEP baseline vectors for discriminating different CPs are given in Table 1. It is found that the linear FBE-CEP baseline feature vector shows the best performance in CPs 3, 4, and 8, the log FBE-CEP baseline vector shows the best performance in CP 1 only, while the Mel FBE-CEP baseline feature vector shows the best performance in CPs 2, 5, 6, and 7.

Figure 2: Typical EEG segments from each of the five sets (Z, O, N, F, and S), from top to bottom

Figure 3: The first 9 baseline and composite vector coefficients for linear scale FBE-CEP method for the same EEG segments shown in figure 2 in the order Z, O, N, F, and S

Figure 4: The first 9 baseline and composite vector coefficients for log scale FBE-CEP method for the same EEG segments shown in figure 2 in the order Z, O, N, F, and S

Table 1: Percentage average accuracy of RBFNN analysis using linear, logarithmic, and Mel scale FBE-CEP methods (W=1000, N=9 and s=1) for baseline vectors in discriminating eight classification problems (CPs)


Figure 5: The first 9 baseline and composite vector coefficients for Mel scale FBE-CEP method for the same EEG segments shown in figure 2 in the order Z, O, N, F, and S

The results of RBFNN analysis using composite cepstral vectors for discriminating different CPs in the three FBE-CEP methods are shown in Tables 2 and 3. The first composite vector includes the velocity vector together with the static cepstral vector. The second composite vector includes the velocity and acceleration vectors together with the static cepstral vector. It is found that the first composite feature vector of linear FBE-CEP shows the best performance in all the CPs, that of log FBE-CEP shows the best performance in CPs 1, 2, 4, 7, and 8, while that of Mel FBE-CEP shows the best performance in all the CPs. The second composite cepstral vectors in all the three FBE-CEP methods show the best performance. This is in agreement with applications in other domains of signal processing where the composite vectors, in general, enhance the performance. For example, the composite cepstral vectors of MFCC and LFCC add to improved performance in speech processing [10-14]. In a recent study, on the other hand, we had found that the overall performance of both the composite vectors of the traditional CEP deteriorated compared to that of the
