Evaluation of spatial scale
3.2.4 Statistical analysis
3.2.4.1 Pre-modelling analysis
Since the coarsest scale for the environmental data was 9 km (for the FOAM oceanographic and current data), listening stations were grouped either into consecutive pairs (equivalent to 9 km segments at 10 knots) or into sets of four consecutive stations (equivalent to 18 km segments). Environmental variables collected in the field were averaged over each segment. For all other variables (topography, satellite and FOAM variables), values were determined for the mid-point of each segment using the STJG GIS extraction tool version 1.0.1 (Gontarek 2005) in ArcGIS 9.0 (ESRI Inc). For the 18 km segments, values were averaged over every two 9 km segments.
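The grouping of stations into segments can be illustrated with a minimal Python sketch. It is not the procedure used in the analysis (which relied on GIS tools); the function name group_into_segments, the example values, and the choice to drop an incomplete trailing group are all assumptions made for illustration.

```python
from statistics import mean

def group_into_segments(station_values, stations_per_segment=2):
    """Average consecutive listening-station values into segments.

    An incomplete group at the end of the track is dropped; this is an
    assumption of the sketch, not something stated in the text.
    """
    segments = []
    last_start = len(station_values) - stations_per_segment
    for start in range(0, last_start + 1, stations_per_segment):
        block = station_values[start:start + stations_per_segment]
        segments.append(mean(block))
    return segments

# Hypothetical field-measured values at eight consecutive stations
noise = [1.0, 3.0, 2.0, 2.0, 4.0, 0.0, 1.0, 1.0]

seg_9km = group_into_segments(noise, 2)    # pairs of stations
seg_18km = group_into_segments(noise, 4)   # four stations per segment

print(seg_9km)   # [2.0, 2.0, 2.0, 1.0]
print(seg_18km)  # [2.0, 1.5]
```

With equal numbers of stations per segment, averaging every two 9 km segments gives the same values as averaging four stations directly, which is why the two routes to the 18 km scale agree in this example.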
Prior to modelling, a Spearman’s Rank Correlation test was carried out using Minitab v12.23 (Minitab Inc. 1999) to test for correlations between environmental variables. If there was a strong correlation (r > 0.8) between variables, the first of the variables selected by the step-wise model selection was retained and any variables with which it was correlated were discarded.
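As an illustration of the correlation screen, the following Python sketch computes Spearman's rank correlation from first principles and lists variable pairs exceeding the 0.8 threshold. The analysis itself was run in Minitab; the function names and example values here are hypothetical, and the sketch follows the text in flagging only r > 0.8 (strong negative correlations are not flagged).

```python
def average_ranks(values):
    """Rank values from 1 to n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def spearman(x, y):
    """Spearman's rho: the Pearson correlation of the two rank vectors.

    Assumes neither variable is constant (otherwise the denominator is zero).
    """
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(x)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy)

def strongly_correlated_pairs(variables, threshold=0.8):
    """Return all pairs of variable names with rho above the threshold."""
    names = sorted(variables)
    return [(a, b)
            for i, a in enumerate(names)
            for b in names[i + 1:]
            if spearman(variables[a], variables[b]) > threshold]

# Hypothetical segment values for three variables
variables = {
    "depth": [100, 500, 900, 1200, 1500],
    "halocline_depth": [20, 35, 50, 80, 95],
    "sst": [12.1, 11.8, 11.0, 10.9, 10.2],
}
pairs = strongly_correlated_pairs(variables)
print(pairs)  # [('depth', 'halocline_depth')]
```

For each flagged pair, only the variable selected first by the step-wise procedure would be retained.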
The dataset was divided into a ‘training’ and a ‘test’ dataset. One method of testing for overfitting within the full dataset was to randomly divide the data such that 75% of the segments were included in the ‘training’ dataset and used as the basis of the model selection, and the remaining 25% of the segments were used as the ‘test’ dataset, on which the predictive performance of the models was evaluated (Araujo & Guisan 2006). To ensure as much independence between re-sampling units as possible while retaining as much of the original spatial coverage surveyed as possible, the data were randomly sampled by groups of 5 segments. Groups of 5 segments (around 23 km) were unlikely to be autocorrelated according to the detected range of the sperm whales (upper 95% confidence limit of 9.7 km). Test and training datasets were also created based on only Faroe-Shetland Channel (FSC) surveys and only Ellet Line (EL) surveys, so that each formed a training set for which the other served as the test set. This latter analysis allowed the spatial robustness of the models to be tested.
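The random division by groups of 5 segments can be sketched as follows in Python. This is an illustrative sketch only: the function name blocked_split and the seed are hypothetical, and for simplicity the blocks are cut from one continuous sequence of segment indices, ignoring the boundaries between surveys.

```python
import random

def blocked_split(n_segments, block_size=5, test_fraction=0.25, seed=0):
    """Split segment indices into training and test sets by whole blocks.

    Consecutive segments are grouped into blocks of block_size, and whole
    blocks (never single segments) are assigned at random to the test set.
    """
    blocks = [list(range(start, min(start + block_size, n_segments)))
              for start in range(0, n_segments, block_size)]
    rng = random.Random(seed)
    rng.shuffle(blocks)
    n_test = round(len(blocks) * test_fraction)
    test = sorted(i for block in blocks[:n_test] for i in block)
    train = sorted(i for block in blocks[n_test:] for i in block)
    return train, test

# 1242 segments of 9 km, as in the full dataset
train, test = blocked_split(1242)
print(len(train), len(test))
```

Sampling whole blocks keeps neighbouring, potentially autocorrelated segments on the same side of the split, so the test set is a more independent check on the fitted model than a segment-by-segment split would be.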
3.2.4.2 Modelling sperm whale occurrence – model selection
Generalised Additive Models (GAMs) were used to relate sperm whale presence/absence to the survey, temporal, topographical, oceanographic and current variables. The GAMs were fitted in R version 2.3.0 (The R Foundation for Statistical Computing 2006), using the mgcv library (Wood 2006). Variables were added to the null model (with no predictor variables) by forward step-wise selection. Firstly, survey variables which were likely to affect detection probability (water noise, survey vessel noise, remote vessel noise, and vessel speed) were added to the model to compensate for survey effects. Time was also added to the model to investigate whether there were diurnal changes in vocalisations. Once survey and diurnal effects on the detection rate of sperm whale clicks had been compensated for, the topographical, oceanographic and current variables were selected using forward step-wise model selection.
Any of the predictor variables (survey, temporal or environmental) were only added if:
i) they reduced the AIC equivalent of the UBRE score (UBRE multiplied by the sample size, n) by 2 or more, as recommended by Burnham & Anderson (2002);
ii) the variables were significant at p < 0.05.
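The two acceptance criteria above can be written as a short Python sketch. The function names and the UBRE scores and p-values in the example are hypothetical; in practice these quantities would come from the fitted GAMs.

```python
def aic_equivalent(ubre, n):
    """AIC-equivalent score as described in the text: UBRE multiplied by n."""
    return ubre * n

def accept_term(ubre_current, ubre_candidate, p_value, n,
                min_drop=2.0, alpha=0.05):
    """Apply both criteria for adding a candidate variable.

    The candidate is accepted only if it lowers the AIC-equivalent score by
    at least min_drop AND its term is significant at the alpha level.
    """
    drop = aic_equivalent(ubre_current, n) - aic_equivalent(ubre_candidate, n)
    return drop >= min_drop and p_value < alpha

n = 1242  # number of 9 km segments in the full dataset

# Hypothetical scores: the drop in n * UBRE is about 3.7, and p < 0.05
print(accept_term(-0.450, -0.453, p_value=0.01, n=n))  # True
# The drop is only about 1.2, so the term is rejected despite p < 0.05
print(accept_term(-0.450, -0.451, p_value=0.01, n=n))  # False
# The drop is sufficient, but the term is not significant
print(accept_term(-0.450, -0.453, p_value=0.20, n=n))  # False
```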
Variables considered in the model selection were: depth, SST, SSS, chlorophyll, halocline depth and strength, thermocline depth and strength, and surface and bottom current speed. Month and year were added to the model after the survey and temporal effects but before the environmental variables, to determine whether there was a significant difference between seasons or years prior to modelling; they were not retained in the models at this stage. After the environmental variables had been added, month and year were tested again to see whether they could explain any remaining deviance (to check that the environmental variables were able to model seasonal and yearly differences).
3.2.4.3 Modelling sperm whale occurrence – model evaluation
Ideally, any model developed should be able to fully model the species distribution without any remaining temporal or spatial autocorrelation. Remaining autocorrelation in the residuals could imply that important variables have not been included in the model. The Wald-Wolfowitz run test was used to test for any remaining non-randomness in the model residuals (Hardin & Hilbe 2003; §2.2.4.3).
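A minimal Python sketch of the Wald-Wolfowitz run test is given below for illustration. It assumes the test is applied to the signs of the residuals taken in survey order and uses the large-sample normal approximation for the p-value; neither detail is specified in the text, and the function name and example residuals are hypothetical.

```python
import math

def runs_test(residuals):
    """Wald-Wolfowitz run test on the signs of a residual sequence.

    Returns (number of runs, z statistic, two-sided p-value). Residuals equal
    to zero are dropped, and both signs must be present (otherwise the
    variance is zero). The p-value uses the normal approximation.
    """
    signs = [r > 0 for r in residuals if r != 0]
    n1 = sum(signs)               # number of positive residuals
    n2 = len(signs) - n1          # number of negative residuals
    n = n1 + n2
    runs = 1 + sum(a != b for a, b in zip(signs, signs[1:]))
    expected = 2 * n1 * n2 / n + 1
    variance = 2 * n1 * n2 * (2 * n1 * n2 - n) / (n ** 2 * (n - 1))
    z = (runs - expected) / math.sqrt(variance)
    p = math.erfc(abs(z) / math.sqrt(2))
    return runs, z, p

# Hypothetical residuals: strongly clustered signs give too few runs
clustered = [1.0] * 10 + [-1.0] * 10
runs, z, p = runs_test(clustered)
print(runs, round(z, 2), p < 0.05)  # 2 -4.13 True
```

Fewer runs than expected (negative z) indicate that residuals of the same sign cluster together along the track, which is the pattern expected when autocorrelation remains in the model.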
Having assessed whether the model had adequately accounted for any autocorrelation, models were evaluated using Receiver Operator Characteristic (ROC) curves and the Area Under the Curve (AUC) (Pearce & Ferrier 2000). For binomial data, predictive models are best evaluated using the ROC curve since it is not sensitive to the threshold used to decide whether a predictive score is a presence or an absence (Boyce et al. 2002). The measure is used extensively to evaluate the predictive ability of binomial models within ecology (Boyce et al. 2002; Cumming 2000; Fielding & Bell 1997; Osborne & Suarez-Seoane 2002; Pearce & Ferrier 2000; Thuiller et al. 2004). A ROC curve plots the sensitivity and the specificity of the model predictions over a range of threshold values (Pearce & Ferrier 2000). The sensitivity of a model is a measure of its ability to predict presence of a species where it is actually present, whereas the specificity is a measure of its ability to predict absence where the species was actually absent (Boyce et al. 2002). The ROC curve is essentially the sensitivity plotted against 1 − specificity for a range of cut-off thresholds, and the AUC value is the area under this ROC curve. For the model to have predictive power, it should have an AUC of > 0.5 (i.e. greater than the area under a 45° line); for perfect model prediction the AUC would equal 1. Pearce and Ferrier (2000) consider AUC values < 0.7 to have poor predictive power, values between 0.7 and 0.9 to have reasonable predictive power, and values > 0.9 to have excellent predictive power.
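The construction of the ROC curve and AUC described above can be illustrated with a short Python sketch. The function names and the example scores are hypothetical, and the trapezoidal rule is one common way to compute the area; the text does not state which routine was used in the analysis.

```python
def roc_points(scores, labels):
    """Return (1 - specificity, sensitivity) pairs over all thresholds.

    A segment is predicted 'present' when its score is >= the threshold.
    labels are 1 for presence and 0 for absence; both must occur.
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    points = [(0.0, 0.0)]
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
        points.append((fp / negatives, tp / positives))
    return points

def auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))

def predictive_power(auc_value):
    """Classes of Pearce & Ferrier (2000) as quoted in the text."""
    if auc_value < 0.7:
        return "poor"
    if auc_value <= 0.9:
        return "reasonable"
    return "excellent"

# Hypothetical predicted probabilities and observed presence/absence
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1]
labels = [1, 1, 0, 1, 0, 0, 1, 0]

area = auc(roc_points(scores, labels))
print(area, predictive_power(area))  # 0.75 reasonable
```

The AUC of 0.75 equals the proportion of presence-absence pairs in which the presence received the higher score (12 of 16 pairs here), which is why the measure does not depend on any single threshold.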
The ROC AUC values were evaluated for each model against the data on which it was built, as a measure of model performance, and against other datasets, as a measure of the predictive performance of the model. By using ROC AUC values together with test and training datasets divided (i) within the same dataset, (ii) by area (extent), and (iii) by scale (grain size), it was possible to identify the most spatially robust model and to remove terms that were overfitted due to the autocorrelation inherent in the data.
The final models were predicted over an 18 × 18 km grid, set to twice the segment size as recommended by Hedley (2000). This allowed for visual evaluation of the models against the actual detections, and for comparison between different models. Predictions were based on environmental variables available for 14 October 2004. This date was selected for several reasons: it was a day that was actually surveyed, it forms the mid-point of all the survey data, and October was the only month during which the areas around both the Faroe-Shetland Channel and the Ellet Line were surveyed.
3.2.4.4 Models constructed
Environmental models of sperm whale occurrence (based on listening stations with loudness >1) were constructed in several stages. Firstly, to evaluate overfitting within the full 9 km segment dataset due to autocorrelation of sperm whale detections, the dataset was randomly divided into a 75% training and 25% test dataset. Models were developed based on the ‘training’ data and evaluated on the ‘test’ data. Secondly, to examine spatial robustness over different areas, models based on the 9 km segments were developed separately for the Faroe-Shetland Channel surveys and for the Ellet Line surveys, and evaluated on the other dataset. Finally, models were developed for the full dataset at the 18 km scale, and compared to those at the 9 km scale to determine whether models were consistent over these two spatial scales. The overall aim of the analysis was to obtain the most spatially robust model for sperm whale distributions off the west coast of Scotland, to explore the reliability of extrapolation between areas, and to understand some of the spatial variability in habitat preferences.
3.3 Results
A total of 11 426 km of surveys were carried out off the west coast of Scotland between July 2003 and October 2005, as detailed in Table 3.1. Overall, there were 1242 segments of 9 km, with 622 hours of listening effort, for which all environmental variables were available. Of these, 203 (16.3%) had sperm whale click detections of any loudness (i.e. 1-5 loudness) and 100 (8.1%) had sperm whale clicks louder than level 1. Sperm whales were detected mainly in the deep off-shelf waters of the Faroe-Shetland Channel (FSC), the Wyville Thomson Ridge (WTR) and throughout the Rockall Trough south of the WTR (Figure 3.2). There were also a few on-shelf detections of sperm whales on the Faroe Plateau.
Table 3.1 – Acoustic effort (in numbers of segments and km), the number of segments with any sperm whale clicks (1-5 loudness), and the number of segments with sperm whale clicks >1 loudness for the oceanographic surveys carried out off the west coast of Scotland.
Survey date              Number of segments    Number of segments with    Number of segments with
                         (distance in km)      any clicks                 distant clicks removed
                                               (% of segments)            (% of segments)
19-25 July 2003          114 (1050 km)         18 (15.8%)                 12 (10.5%)
16-28 September 2003     137 (1260 km)         8 (5.8%)                   3 (2.2%)
13-29 May 2004           244 (2245 km)         78 (32.0%)                 39 (16.0%)
5-17 October 2004        189 (1740 km)         13 (6.9%)                  6 (3.2%)
8-26 May 2005            221 (2033 km)         14 (6.3%)                  4 (1.8%)
27 Sept – 8 Oct 2005     130 (1195 km)         8 (6.2%)                   1 (0.8%)
7-25 October 2005        207 (1905 km)         64 (30.9%)                 35 (16.9%)
TOTAL                    1242 (11,426 km)      203 (16.3%)                100 (8.1%)
3.3.1 Environmental variables
A summary of the environmental variables is given in Table 3.2. Depth, Sea Surface Temperature (SST) and thermocline depth were significantly greater, and vessel speed, surface chlorophyll and bottom current were significantly lower, during the Ellet Line (EL) surveys than during the Faroe-Shetland Channel (FSC) surveys (Table 3.2). For more detail, refer to §2.3.1.
Figure 3.2 – Survey effort and sperm whale detections for the west coast of Scotland surveys carried out between July 2003 and October 2005. The 9 km segments (n = 1242) are presented as black dots where sperm whales are present (>1 loudness) and white dots where sperm whales are absent (≤1 loudness). Data are overlaid on bathymetry (GEBCO).