4.2 Validation and Limitations of Methods
4.2.6 Spike detection and spike sorting
The manual detection of spikes is extremely time-consuming, but has the advantage that all data are inspected visually, so that conspicuous irregularities can be spotted easily. For the multi-units, the thresholds were set manually such that spikes were also detected during the phase of spontaneous activity, in order to enable a comparison between deactivation conditions for this phase as well. This partly led to rather low thresholds, which in turn increased the resulting spike rates. Hence, in contrast to the selection of the single-units, which was approached rather conservatively, the spikes for the multi-unit data were detected with a more generous approach. For the preparation of single-unit data, thresholds were set automatically. This is justified because, in this situation, units that are not clearly identified as a separate cluster in the first place can be excluded later in the process.
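The effect of a generous versus a conservative threshold on the spike count can be illustrated with a minimal sketch. It is not the detection code used in this work: the function names, the 1 ms dead time and the toy trace are hypothetical, and the automatic rule shown (a multiple of the median-based noise estimate described by Quian Quiroga et al., 2004) is only assumed to resemble the automatic setting applied to the single-unit data.

```python
import numpy as np

def noise_based_threshold(signal, factor=4.0):
    """Automatic threshold: factor * median(|x|) / 0.6745 (robust noise estimate).

    Assumption: this rule and the factor of 4 are illustrative only.
    """
    sigma = np.median(np.abs(signal)) / 0.6745
    return factor * sigma

def detect_spikes(signal, threshold, fs_hz=20000.0, dead_time_ms=1.0):
    """Return sample indices of upward threshold crossings.

    Crossings closer than the dead time to the previously accepted
    crossing are discarded (hypothetical 1 ms default).
    """
    above = signal > threshold
    # A crossing is a sample above threshold whose predecessor is not.
    crossings = np.flatnonzero(above[1:] & ~above[:-1]) + 1
    dead = int(round(dead_time_ms * 1e-3 * fs_hz))
    kept, last = [], -dead - 1
    for idx in crossings:
        if idx - last > dead:
            kept.append(int(idx))
            last = int(idx)
    return np.array(kept, dtype=int)

# Toy trace sampled at 20 kHz: three large and two small deflections.
trace = np.zeros(2000)
trace[[200, 600, 1000]] = 8.0
trace[[400, 800]] = 3.0

conservative = detect_spikes(trace, threshold=5.0)  # large events only
generous = detect_spikes(trace, threshold=2.0)      # large and small events
```

With the lower threshold, the two small deflections are also counted, which mirrors how the more generous multi-unit thresholds increase the resulting spike rates.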
Overlapping spikes, background noise, varying amplitudes of spikes of the same cell, and signals from axonal fibre bundles that look very much like action potentials make spike sorting a highly complex problem (Lewicki, 1998). The large number of spike sorting methods and their different methodological approaches show that there is no "go-to" method for spike sorting, and none of the available methods can fully avoid false positives and false negatives. In addition to sparse firing and small injuries caused by electrode penetration, deficits of current spike sorting methods (including WaveClus) contribute to the problem that fewer neurons are detected than would be expected from the density of neurons in the cortex (Pedreira et al., 2012). Validating whether the result of the spike sorting is optimal requires knowledge of the "ground truth" (Harris et al., 2000; Schulz et al., 2015). Since this knowledge is not available, it is not possible to conclusively determine whether all spikes have been assigned correctly.
Here, WaveClus (Quian Quiroga et al., 2004) was chosen because the WaveClus algorithm is automatic, unsupervised, and fast. Moreover, it is widely used (Wild et al., 2012) and has been shown to outperform several other conventional spike sorting methods (Quian Quiroga et al., 2004; Wild et al., 2012): Quian Quiroga and colleagues compared the feature extraction procedures using wavelets against principal component analysis (PCA), the whole spike shape and a set of features. In addition, superparamagnetic clustering (SPC) and K-means clustering were carried out using wavelets and PCA. Superparamagnetic clustering is based on ideas from statistical mechanics, focusing on nearest-neighbour interactions. Since SPC is a stochastic clustering method, the repeated application of the algorithm to the same dataset might yield slightly different results. Clusters found by SPC do not need to have a well-defined mean or low variance; they do not have to be normally distributed and are allowed to overlap (Quian Quiroga et al., 2004). SPC with wavelets led to the fewest classification errors and thus the best performance (Quian Quiroga et al., 2004). Wild and colleagues compared WaveClus to KlustaKwik (Harris et al., 2000) and OSort (Rutishauser, 2006) and found WaveClus to be "the most accurate spike sorting algorithm" (Wild et al., 2012).
Hence, in general, the selected spike sorting approach performs well, also in comparison with other spike sorting approaches. However, the above-mentioned limitations still hold.
Temperatures for the SPC algorithm were automatically set by WaveClus, but adjusted manually in order to achieve a clear separation of the clusters. With this, the applied sampling rate of 20 kHz can be viewed as high enough to describe the spike waveforms in sufficient detail. When selecting the temperature, a rather conservative approach was taken, meaning that spikes that could not clearly be assigned to a cluster were rejected and not considered in the analysis. This can again lead to lower spike rates, because spikes that are in any way distorted, e.g. by overlapping with other spikes or by showing slightly different waveforms due to a burst, will be excluded from further analysis.
Also, different spike waveforms were observed throughout the experiments. An example is visible in Fig. 2.6: while clusters 2 and 3 exhibit a rather typical action potential shape, the waveform of cluster 1 has a lower amplitude and is much more symmetrical about its peak. This could have several reasons: the distance and position of the neuron(s) relative to the electrode matter, as does the shape of the tip itself. Other confounding factors are non-stationarity of the background noise and electrode drift: the thresholds were set for a whole session, i.e. selected for a timespan of at least 30 minutes of the recording. It is unlikely that the background noise is stationary over this complete time interval, so that spike detection is biased towards high-amplitude spikes, which are detected reliably, while low-amplitude spikes might sometimes have dropped below the threshold. When an electrode drifts towards or away from a certain neuron, this might also alter the recorded waveform, so that these altered spike shapes are excluded from the respective cluster. The signal-to-noise ratio (SNR) was on average best for experiment 121007, but also differed between electrodes and sessions in all experiments. Sessions and electrodes with stable background activity and higher SNRs will thus have led to sorting results that are closer to the "ground truth" than sessions with low SNRs.
Figure 4.4: Network topology. The figure shows a schematic of the distribution of units around the electrode tips. It is assumed that spike signals are picked up in a radius of around 140 µm around the electrode tip (Buzsáki, 2004). Distance between the electrodes was 500 µm. Neurons cluster around the electrodes, while the space between the electrodes is not sampled. Note that this is a projection on the 2-dimensional plane, although in reality the networks were 3-dimensional and the electrode tips were placed at slightly different depths. (blue dots: electrode tips, pink dots: single units, circle: radius around electrode tip in which single units are detected)
The electrode constellation and the sorting also have an impact on the network topology: the detected units group around the electrode tips, so that, assuming a detection radius of 140 µm around the tip and a distance between electrodes of at least 500 µm, physical distances between the units of the same electrode are typically much shorter than between units of different electrodes (see Fig. 4.4).
On the other hand, this also means that it is unlikely that two separate electrodes picked up the spike signal from the same neuron.
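The geometric argument can be checked with a short calculation. The sketch below assumes an idealised spherical pickup volume of radius 140 µm around each tip and a tip-to-tip distance of exactly 500 µm; the function name is hypothetical and the numbers are bounds, not measured distances.

```python
DETECTION_RADIUS_UM = 140.0   # assumed pickup radius (Buzsáki, 2004)
ELECTRODE_SPACING_UM = 500.0  # minimum distance between electrode tips

def distance_bounds(radius_um, spacing_um):
    """Bounds on unit-to-unit distances for idealised spherical pickup volumes.

    Returns (max distance between two units of the same electrode,
             min distance between units of two different electrodes,
             whether the two pickup volumes overlap).
    """
    max_within = 2.0 * radius_um                # opposite sides of one sphere
    min_between = spacing_um - 2.0 * radius_um  # facing sides of two spheres
    volumes_overlap = spacing_um < 2.0 * radius_um
    return max_within, min_between, volumes_overlap

max_within, min_between, volumes_overlap = distance_bounds(
    DETECTION_RADIUS_UM, ELECTRODE_SPACING_UM
)
print(max_within, min_between, volumes_overlap)  # 280.0 220.0 False
```

Under these assumptions the pickup volumes of neighbouring electrodes are separated by a gap of 220 µm, so no neuron lies within 140 µm of two tips. Units of the same electrode are at most 280 µm apart, whereas units of different electrodes are separated by about the electrode spacing of 500 µm on average and by at least 220 µm.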