Part I. Classification
Chapter 3: Classification of data from a reliable training dataset
3.2.1. Creation of the classifiers
When a classifier is created, the ultimate objective is to obtain an algorithm that discriminates the species of interest as efficiently as possible, and to produce a confusion matrix that illustrates the accuracy and precision of the classifier.
The classifiers were created with the PWC in several steps (Table 3-1 A), described in the next sections (3.2.1.a to 3.2.1.c).
Table 3-1: Main stages to create a whistle classifier and to apply it to new data using the PWC.
A: Creation of a classifier with the PWC
Data: time-frequency contours from identified species
1. Selection of optimal fragment and section lengths (comparing the quality coefficient, Q)
2. Creation of the confusion matrix:
2. Classification probabilities, pij
3. Variance for each pij
B: Classification of unidentified data with the PWC
Data: time-frequency contours from unidentified species, organised in fragments and sections of the optimal length measured in (A.1)
1. Classify sections
4. Organise sections in encounters and classify encounters (optional)
5. When possible, compare classification results with prior information
3.2.1.a Identified dataset
The identified data were used to create a classifier (Table 3-1 A). They comprised recordings of bottlenose dolphins, common dolphins, Risso’s dolphins, white-beaked dolphins and white-sided dolphins, collected by different research groups (Table 3-2) from different small survey platforms (a sailing boat and small motor boats) along the coast of Scotland (Map 3-2).
Map 3-2: Locations of the training dataset (West Coast, Shetland, Moray Firth and St Andrews Bay).
For all recordings, the species could be identified with high confidence because the animals were close to the visual observers.
The following data sources were used. Recordings of all species except bottlenose dolphins were collected from the quiet sailing boat of the Hebridean Whale and Dolphin Trust (HWDT) during small-scale surveys along the west coast of Scotland (Embling et al., 2010). A few additional recordings of Risso’s dolphins came from the north of Scotland. All recordings of bottlenose dolphins were collected by scientists of the Sea Mammal Research Unit of St Andrews from a small motor boat in the north and east of Scotland, for projects that collected vocalisations to study social interactions or particularities in vocalisation patterns (e.g. Janik, 2000; Quick et al., 2008) (Table 3-2). The sampling rates of the recordings varied from 48 kHz to 500 kHz.
Table 3-2: Training dataset, with the general locations of the recordings and the sources that collected them.
Species                 Location          Source
Bottlenose dolphin      Moray Firth       St Andrews University
Bottlenose dolphin      St Andrews Bay    St Andrews University
Bottlenose dolphin      Shetland          St Andrews University
Common dolphin          West Coast        HWDT
White-beaked dolphin    West Coast        HWDT
White-sided dolphin     West Coast        HWDT
Risso’s dolphin         West Coast        HWDT
Risso’s dolphin         Shetland          St Andrews University
The first classifier (called the 2Sp classifier) classified acoustic detections as “BND” (bottlenose dolphins) or “OTHER” (the four other species pooled) (Table 3-3). The second classifier (called the 5Sp classifier) distinguished between all five species, using classification groups called “BND”, “COD” (common dolphin), “RSD” (Risso’s dolphin), “WBD” (white-beaked dolphin) and “WSD” (white-sided dolphin).
Table 3-3: Groups of species classified by each classifier. The 2Sp classifier discriminated bottlenose dolphins from all other species pooled, whereas the 5Sp classifier discriminated between all five species.
Species 2Sp 5Sp
Bottlenose dolphin BND BND
Common dolphin OTHER COD
White-beaked dolphin OTHER WBD
White-sided dolphin OTHER WSD
Risso’s dolphin OTHER RSD
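The grouping in Table 3-3 amounts to a lookup from species to class label. The following is a minimal Python sketch of that lookup; the dictionary and function names are illustrative assumptions and are not part of the PWC.

```python
# Class labels taken from Table 3-3.
# LABELS_5SP, label_5sp and label_2sp are hypothetical names used for illustration.
LABELS_5SP = {
    "Bottlenose dolphin": "BND",
    "Common dolphin": "COD",
    "White-beaked dolphin": "WBD",
    "White-sided dolphin": "WSD",
    "Risso's dolphin": "RSD",
}


def label_5sp(species):
    """Return the 5Sp class label for a species name listed in Table 3-3."""
    return LABELS_5SP[species]


def label_2sp(species):
    """Return the 2Sp label: BND for bottlenose dolphins, OTHER for the rest."""
    return "BND" if label_5sp(species) == "BND" else "OTHER"
```

With this mapping, the four non-bottlenose species collapse into the single OTHER group for the 2Sp classifier while keeping their own label for the 5Sp classifier.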
To be comparable and usable by the PWC, all recordings were decimated to 48 kHz. Any sound exceeding a defined threshold (8 dB) was automatically detected using the PAMGUARD Whistle and Moan detection module (Gillespie et al., 2013). For each recording, the detector produced a file containing the time-frequency contours of each detected sound (Figure 3-1). These contour files were then used in the PWC to train the classifier.
Figure 3-1: Example screen grab showing whistle contours extracted from recordings of bottlenose dolphins using the PAMGUARD Whistle and Moan detector (WMD) module. Frequency (kHz) is on the y-axis and time (10 seconds) is on the x-axis. The different colours show the contours identified by the WMD (clicks are also visible above 6 kHz) (SMRU Ltd et al., 2011).
3.2.1.b Selection of the optimal parameters
The PAMGUARD Whistle classifier works by comparing the properties of a group of whistle contours rather than examining each contour individually. The output of the detector is rarely a full whistle contour but usually only part of one. Contours often break into segments because other transient noises mask the whistle for a very short period, or because whistles intersect and the detector has difficulty recognising the full contour. To homogenise these contours, Gillespie et al. (2013) divided each contour into smaller, uniformly sized units called fragments. Many fragments that are consecutive in time are then grouped into sections, from which nine parameters are extracted to run the classifier (Gillespie et al., 2013; chapter 2). These parameters describe the properties of each section. The lengths of the fragments and sections were expected to influence the quality of the classifier: short fragments and sections are more likely to generate unstable parameter measurements, whereas long fragments and sections require many more whistles to obtain a classification result (Gillespie et al., 2013).
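The fragment and section construction can be illustrated with a short Python sketch. It assumes that a contour is stored as a list of frequency values, one per time bin, that bins which do not fill a whole fragment are discarded, and that sections are non-overlapping groups of consecutive fragments. These choices and the function names are assumptions made for illustration; the PWC may handle leftovers and overlaps differently.

```python
def split_into_fragments(contour, fragment_len):
    """Cut one contour (a list of per-bin frequencies) into equal-length fragments.

    Trailing bins that do not fill a whole fragment are dropped (an assumption).
    """
    n_full = len(contour) // fragment_len
    return [contour[k * fragment_len:(k + 1) * fragment_len] for k in range(n_full)]


def group_into_sections(contours, fragment_len, section_len):
    """Pool the fragments of time-ordered contours into sections.

    Each section holds `section_len` consecutive fragments; an incomplete
    final section is dropped (an assumption).
    """
    fragments = []
    for contour in contours:
        fragments.extend(split_into_fragments(contour, fragment_len))
    n_full = len(fragments) // section_len
    return [fragments[k * section_len:(k + 1) * section_len] for k in range(n_full)]


# Three made-up contours of 12, 7 and 23 bins (values are placeholders, not data).
example_contours = [list(range(12)), list(range(7)), list(range(23))]
example_sections = group_into_sections(example_contours, fragment_len=5, section_len=3)
```

In this toy case the contours yield 2, 1 and 4 fragments of 5 bins, so the 7 fragments form 2 complete sections of 3 fragments. This shows the trade-off described above: longer fragments and sections discard more of each short contour and need more whistles before a section is complete.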
When a classifier is created, the effect of fragment and section lengths on the classification probabilities needs to be measured in order to select the optimum lengths. To do so, the whistle classifier process described in the previous chapter (Figure 2-3, p30) was applied to the identified dataset, using 80% of the sections as training data. One hundred bootstraps were run for each possible combination of fragment length, ranging from 26 ms to 187 ms (equivalent to 5 to 35 bins), and section length, ranging from 10 to 60 fragments. To select the optimum fragment and section lengths, a variable called the quality coefficient (Q) was introduced. For each species j and each combination of fragment and section lengths, Qj (Eq. 3-1) measured the quality of the classifier by subtracting the average false positive rate (F) from the average correct classification probability (T), both averaged over the 100 bootstraps.
\[ Q_j = \frac{\sum_{i=1}^{100} T_{ij}}{100} - \frac{\sum_{i=1}^{100} F_{ij}}{100} \tag{3-1} \]
A good classifier is characterised by a high correct classification probability and a low false positive classification probability so the higher Qj, the better was the classifier.
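As a worked illustration of Eq. 3-1, the Python sketch below computes Qj for one species from per-bootstrap values of T and F, and picks the fragment and section length combination with the highest Qj. The function names and the numbers are invented for illustration. The sketch handles a single species, because how the Qj values were combined across species is not stated here.

```python
from statistics import mean


def quality_coefficient(correct, false_pos):
    """Q_j: mean correct-classification probability minus mean false-positive rate.

    `correct` and `false_pos` hold one value per bootstrap for a single species.
    """
    if not correct or len(correct) != len(false_pos):
        raise ValueError("need one T and one F value per bootstrap")
    return mean(correct) - mean(false_pos)


def best_combination(results):
    """Return the (fragment_len, section_len) key with the highest Q_j.

    `results` maps each combination to a (correct, false_pos) pair of lists.
    """
    return max(results, key=lambda combo: quality_coefficient(*results[combo]))


# Invented values for two combinations and two bootstraps each (not real results).
toy_results = {
    (5, 10): ([0.6, 0.7], [0.3, 0.2]),
    (20, 40): ([0.9, 0.8], [0.1, 0.1]),
}
toy_best = best_combination(toy_results)
```

In the study itself each list would hold 100 values, one per bootstrap, so dividing the sums by 100 in Eq. 3-1 is the same operation as the mean used here.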
3.2.1.c Creation of the confusion matrix
These optimal parameters were used to generate the final confusion matrices of both the 2Sp and 5Sp classifiers. The classification probabilities in these final confusion matrices were averages over 100 bootstraps, each run with 80% of the identified data as the training set and with the optimal fragment and section lengths.
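The averaging over bootstraps can be sketched in Python as follows. The sketch assumes that each bootstrap draws a random 80% of the sections for training and evaluates on the remaining 20%, and that `train_and_test` is a hypothetical callable standing in for the PWC training and classification step, returning one confusion matrix as a list of rows. None of these names come from the PWC.

```python
import random


def split_sections(sections, train_fraction=0.8, rng=None):
    """Randomly split sections into a training part and a held-out part."""
    if rng is None:
        rng = random.Random()
    shuffled = list(sections)
    rng.shuffle(shuffled)
    n_train = round(train_fraction * len(shuffled))
    return shuffled[:n_train], shuffled[n_train:]


def mean_confusion_matrix(matrices):
    """Element-wise average of equally sized confusion matrices (lists of rows)."""
    n = len(matrices)
    n_rows, n_cols = len(matrices[0]), len(matrices[0][0])
    return [[sum(m[r][c] for m in matrices) / n for c in range(n_cols)]
            for r in range(n_rows)]


def bootstrap_confusion(sections, train_and_test, n_boot=100, seed=0):
    """Average the confusion matrices from `n_boot` random 80/20 splits.

    `train_and_test(train, held_out)` is a placeholder for the real classifier.
    """
    rng = random.Random(seed)
    matrices = []
    for _ in range(n_boot):
        train, held_out = split_sections(sections, 0.8, rng)
        matrices.append(train_and_test(train, held_out))
    return mean_confusion_matrix(matrices)
```

Passing a seeded `random.Random` makes the splits reproducible, which helps when comparing classifiers trained with different fragment and section lengths.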
To estimate the variability of the classification probabilities, each classifier was trained with training datasets made up of 12.5%, 25%, 50% and 80% of the identified data. The nonlinear least squares Model 3 of chapter 2 was then used to predict the variance expected if all the identified data were used to train the classifier.