Evaluating the effect of survey technique, time and space on model robustness
4.3.3 Model selection: temporal robustness with year
Of the survey variables, sea state was the most important predictor of the number of groups of porpoises detected visually in all models, explaining between 3.2% (2003) and 18.2% (2005) of the deviance (Appendix Table A3.3, A3.4 & A3.6). As before, detections decreased significantly above sea state 1. Boat speed generally had no effect on sighting rate, except in 2003, when it was the most important survey variable but explained only 3.7% of the deviance. In this one case, porpoise sightings decreased linearly with increasing boat speed. Engine on/off as a factor variable was significant in explaining the detection of harbour porpoises in only one model, that for 2004-2005, where detections were higher when the engine was on; it explained only 1.4% of the deviance.
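The percentages of deviance quoted here can be read as reductions in residual deviance relative to a null (intercept-only) model. The following is a minimal sketch of that calculation, assuming a Poisson error distribution for the counts of porpoise groups per segment. The error family, the function names and the example numbers are illustrative assumptions and are not taken from the analysis itself.

```python
import math


def poisson_deviance(observed, fitted):
    """Poisson deviance: 2 * sum(y * log(y / mu) - (y - mu)).

    The y * log(y / mu) term is taken as 0 when y == 0.
    """
    total = 0.0
    for y, mu in zip(observed, fitted):
        log_term = y * math.log(y / mu) if y > 0 else 0.0
        total += log_term - (y - mu)
    return 2.0 * total


def percent_deviance_explained(observed, fitted):
    """Percentage reduction in deviance relative to an intercept-only model."""
    null_mean = sum(observed) / len(observed)
    null_dev = poisson_deviance(observed, [null_mean] * len(observed))
    resid_dev = poisson_deviance(observed, fitted)
    return 100.0 * (1.0 - resid_dev / null_dev)


# Made-up counts of groups per segment and made-up fitted values.
observed = [0, 1, 0, 2, 3, 0]
fitted = [0.5, 1.0, 0.5, 1.5, 2.0, 0.5]
print(round(percent_deviance_explained(observed, fitted), 1))
```

Under the same assumption, the contribution of a single term could be approximated as the difference in this percentage between models fitted with and without that term.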
The best models for each year individually, and grouped by two or three years, are shown in Figure 4.14 and in Appendix Table A3.3, A3.4 & A3.6. Maximum tidal speed was the most significant predictor of harbour porpoise visual detections in all models, explaining between 2.7% (2005) and 9.0% (2003) of the deviance. Harbour porpoise visual detections decreased approximately linearly with increasing tidal speed (Figure 4.8 & 4.14). Position in the spring-neap cycle was a significant predictor of harbour porpoises in all but one model (2005), explaining between 1.3% (2003-2005) and 6.3% (2003) of the deviance. As shown previously, harbour porpoises were seen more often during spring tides than during neap tides (Figure 4.8 & 4.14).
Of the remaining environmental variables, longitude was significant in explaining harbour porpoise detection rates in 2003 and in the two-year 2003-2004 model, explaining 1.0% and 1.3% of the deviance respectively. Harbour porpoises were found preferentially in the east of the survey area (towards the mainland) rather than in the west. Also in 2003, harbour porpoises were detected to a greater degree during slack tides than during flood or ebb (Figure 4.14), with position in the tidal cycle explaining 8.9% of the deviance (Appendix Table A3.6). In the two-year model for 2003-2004, depth and the amount of sand in the sediment were the final two significant predictors of harbour porpoises. Depth explained 1.5%, and the proportion of sand in the sediment explained 3.2% of the deviance (Appendix Table A3.5), with porpoises detected to a greater degree in deeper water and in areas with between 20% and 70% sand in the sediment.
In the final full three-year model of the number of harbour porpoise groups detected per 2 km segment, time of day was also a significant predictor variable explaining 1.9% of the deviance and showing a maximum detection rate during the middle of the day (Figure 4.14, Appendix Table A3.6).
[Figure 4.14 schematic: smooth plots not reproduced. Models shown in the schematic:]
2003: VisNporp ~ s(Speed) + s(SeaState) + s(MaxTideCur) + s(ClosetoSpring) + s(TidalState) + s(Lon)
2004: VisNporp ~ s(SeaState) + s(MaxTideRange) + s(ClosetoSpring)
2005: VisNporp ~ s(SeaState) + s(MaxTideRange)
2003-2004: VisNporp ~ s(SeaState) + s(MaxTideCur) + s(ClosetoSpring) + s(Lon) + s(Depth) + s(PctSand)
2004-2005: VisNporp ~ s(SeaState) + factor(EngineOn) + s(MaxTideRange) + s(ClosetoSpring)
2003-2005: VisNporp ~ s(SeaState) + s(MaxTideCur) + s(TimeOfDay) + s(ClosetoSpring)
[Smooth panel labels: s(TimeFrLW), s(ClosetoSW), s(MaxTideCur), s(SeaState), s(Longitude), s(Depth), s(PctSand), s(TimeSunrise)]
Figure 4.14 – Schematic of the best GAM models for the core survey area for the years 2003-2005 individually, and grouped by two or three years. Selected smooths are shown to illustrate the relationships between the number of harbour porpoise groups seen per 2 km segment and the environmental variables. Smooths are shown only where they differed; any smooth not shown can be assumed to be the same as for the full three-year model.
Overall, the best model for the full three years (2003-2005) for the core survey area explained 15.9% of the deviance, of which 8.2% was explained by environmental variables:
VisNporp ~ s(SeaState) + s(MaxTideCur) + s(TimeOfDay) + s(ClosetoSpring)
Each of the best models was tested for overfitting by comparing the performance of the full model on the test dataset with that of models excluding the last 1, 2, 3, 4 or 5 variables. This suggested that the majority of models were not overfitted. The exception was the 2003 model, for which neither tidal state nor longitude was required. The model for 2003-2004 performed the best, with p < 0.05 on the training dataset and p < 0.01 on the test dataset (Table 4.3). All the models built on two or three years of data were significantly (p < 0.05) better than a random intercept-only model on the 75% training dataset on which the models were built.
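The overfitting check described above can be sketched in two steps: randomly splitting the segments into 75% training and 25% test sets, and listing the nested candidate models obtained by dropping the last 1 to 5 terms. This is a minimal illustration only. The function names, the seed and the rounding rule for the split are assumptions, and the GAM fitting and scoring on the test set are not shown.

```python
import random


def split_segments(segment_ids, train_fraction=0.75, seed=0):
    """Randomly split segment identifiers into training and test lists."""
    rng = random.Random(seed)  # fixed seed so the split is repeatable
    shuffled = list(segment_ids)
    rng.shuffle(shuffled)
    cut = round(train_fraction * len(shuffled))
    return shuffled[:cut], shuffled[cut:]


def nested_formulas(response, terms, max_dropped=5):
    """Return the full formula followed by versions with the last 1..max_dropped terms removed.

    Stops early so that every formula keeps at least one term.
    """
    formulas = []
    for dropped in range(max_dropped + 1):
        kept = terms[: len(terms) - dropped]
        if not kept:
            break
        formulas.append(f"{response} ~ " + " + ".join(kept))
    return formulas


# Terms of the 2003 model, in the order given in the text.
terms_2003 = [
    "s(Speed)", "s(SeaState)", "s(MaxTideCur)",
    "s(ClosetoSpring)", "s(TidalState)", "s(Lon)",
]

train_ids, test_ids = split_segments(range(952))
print(len(train_ids), len(test_ids))
for formula in nested_formulas("VisNporp", terms_2003):
    print(formula)
```

Each candidate formula would then be fitted on the training segments and scored on the test segments, retaining the shortest model whose test performance is not worse than that of the full model.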
Table 4.3 – Significance of each model on the randomly sampled training segments (75%) and on the remaining test segments (25%).
Dataset | Model | Training 75% | Test 25%
2003 | VisNporp ~ s(Speed) + s(SeaState) + s(MaxTideCur) + s(ClosetoSpring) | 0.0843 (n=714) | 0.564 (n=238)
2004 | VisNporp ~ s(SeaState) + factor(EngineOn) + s(MaxTideCur) + s(ClosetoSpring) | 0.0682 (n=455) | 0.9547 (n=152)
2005 | VisNporp ~ s(SeaState) + s(MaxTideCur) | 0.0809 (n=516) | < 0.01 (n=172)
2003-2004 | VisNporp ~ s(SeaState) + s(MaxTideCur) + s(ClosetoSpring) + s(Lon) + s(Depth) + s(%Sand) | < 0.05 (n=996) | < 0.01 (n=332)
2004-2005 | VisNporp ~ s(SeaState) + factor(EngineOn) + s(MaxTideCur) + s(ClosetoSpring) | < 0.05 (n=972) | 0.1125 (n=324)
2003-2005 | VisNporp ~ s(SeaState) + s(MaxTideCur) + s(TimeOfDay) + s(ClosetoSpring) | < 0.05 (n=1712) | 0.1138 (n=570)
Figure 4.15 – Predictive plots based on the best GAM models selected by forward model selection for 2003, 2004, 2005, the two-year models 2003-2004 and 2004-2005, and the full three-year model for 2003-2005. The models describe the number of harbour porpoise groups per 2 km segment, based on surveys carried out in the core area around Argyll and the Small Isles. Overlaid on the maps are the effort segments (white dots) and the visual detections (black dots). Colours represent density from low (blue) to high (red), ranging from 0 to 2.3, with the colour gradation based on 20 levels using quantile classification (ArcGIS 9.0).
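The quantile classification used for the colour gradation places roughly equal numbers of predicted values in each of the 20 colour levels. The sketch below illustrates that idea using the Python standard library. It is not the ArcGIS implementation, whose handling of ties and break points may differ, and the density values and function names are made up for illustration.

```python
import bisect
import statistics


def quantile_breaks(values, levels=20):
    """Return the levels - 1 cut points that divide values into equal-count classes."""
    return statistics.quantiles(values, n=levels, method="inclusive")


def classify(value, breaks):
    """Return the class index (0 = lowest colour level) for a value."""
    return bisect.bisect_right(breaks, value)


# Made-up predicted densities spanning 0 to 2.3 groups per segment.
densities = [i * 2.3 / 100 for i in range(101)]
breaks = quantile_breaks(densities)
print(len(breaks))
print(classify(0.0, breaks), classify(2.3, breaks))
```

Because the breaks follow the data rather than equal intervals, a skewed set of predictions yields narrow classes where values are dense and wide classes where they are sparse, which should be kept in mind when comparing maps.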
The predictive plots for 2004, 2005, 2004-2005 and 2003-2005 are very similar, but the underlying models are also very similar (Figure 4.15). There is also very little difference between the predictive plots for 2004-2005 and 2003-2005, despite the latter including three additional variables (longitude, depth and proportion of sand in the sediment). However, this model predicted high densities of porpoises in some of the more westerly areas, such as north of Coll & Tiree, despite very few porpoises being observed there. As in the previous section (§ 4.3.1), the model predicts low porpoise densities around Colonsay, to the west of the Ross of Mull (Iona), and around Eigg & Muck, where there was quite a high density of sightings.
The remaining two models (2003 and 2003-2004) predicted that harbour porpoises were spread over a wider proportion of the survey area than the other models, with higher densities predicted over the whole survey area. However, all models predicted highest porpoise densities towards the coastal areas, i.e. in the Sound of Jura, Firth of Lorne, west of Mull (around the Treshnish Isles) and around the Small Isles.