Statistical analysis - Testing and evaluating non extractive sampling platforms to assess deep


5.3.3 Statistical analysis

Spatial autocorrelation

Determining habitat-occurrence relationships requires accounting for spatial autocorrelation (SAC) in occurrence (abundance), habitat (environmental variables) or both. The geographic distribution of individuals can be spatially autocorrelated due to movement restrictions, social organisation or aggregative reactions to signals from other individuals of the species. Environmental variables are usually also spatially autocorrelated, as discussed at length in Legendre (1993). To assess the extent to which SAC was evident in our data, we applied an autocorrelation function (ACF) to multiple linear subsections of all AUV transects (dives). The ACF indicates at what lag (distance) SAC disappears. We assumed that observations further apart than the lag (distance) indicated by the ACF were not spatially correlated. For each dive we generated several lag (distance) values, one for each linear subsection, and took the largest as the threshold beyond which SAC was ruled out. Distances between observations (presence of fishes per image) were calculated from geographical eastings and northings using the Pythagorean theorem. This approach was only taken for the binomial (presence/absence) analysis. To visualise the extent of spatial autocorrelation we used correlograms (Bjørnstad and Falck, 2001), which depict spatial dependencies between locations at different lag distances using Moran’s I. Two relationships were investigated: (A) location (spatial x, y coordinates) of H. percoides presence-absence and (B) location and extent of habitat classes.
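The distance calculation and the ACF-based lag threshold described above can be sketched as follows. This is only an illustration in Python using the standard library; the analysis in this chapter was carried out in R. The function names are hypothetical, and the cut-off used here (the first lag at which the ACF falls inside an approximate 95% white-noise band of 1.96/sqrt(n)) is an assumption, since the text does not state the exact criterion for when SAC "disappears". Lags are counted in images and would need to be converted to metres using the image spacing.

```python
import math

def distance(e1, n1, e2, n2):
    """Euclidean (Pythagorean) distance between two easting/northing points."""
    return math.hypot(e2 - e1, n2 - n1)

def acf(values, max_lag):
    """Sample autocorrelation of a sequence for lags 0..max_lag."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    denom = sum(d * d for d in dev)
    return [sum(dev[t] * dev[t + k] for t in range(n - k)) / denom
            for k in range(max_lag + 1)]

def first_uncorrelated_lag(values, max_lag):
    """Smallest lag whose ACF lies inside the +/-1.96/sqrt(n) band.

    The band is an assumed criterion, not taken from the thesis.
    Returns None if no lag up to max_lag qualifies.
    """
    bound = 1.96 / math.sqrt(len(values))
    for lag, r in enumerate(acf(values, max_lag)):
        if abs(r) <= bound:
            return lag
    return None

def dive_threshold(subsections, max_lag):
    """Largest lag over all linear subsections of one dive."""
    lags = [first_uncorrelated_lag(s, max_lag) for s in subsections]
    return max(lag for lag in lags if lag is not None)
```

Taking the maximum over subsections is the conservative choice described in the text: observations separated by more than this lag are treated as spatially uncorrelated in every subsection of the dive.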

Linear mixed-effects models (LMEs)

We investigated relationships between continuous variables (fish length and weight) and environmental variables (depth and habitat) using linear mixed-effects models (LMEs) in R, package nlme (Pinheiro et al., 2009). The Maximum Likelihood (ML) method was preferred over the default Restricted Maximum Likelihood (REML) method as we intended to compare models with different fixed effects structures. LMEs allow for the observational units (images) to be clustered, e.g., observations by dive. Random effects across dives were assumed to vary. Another advantage of LMEs is their ability to incorporate several random effects that are spatially nested, i.e., habitat classes within dives within sites. LMEs were also chosen because they can handle pseudoreplication. In our case, images fall into the category of spatial pseudoreplication, where several measurements (length) were made from the same vicinity (dive). Pseudoreplication violates one of the fundamental assumptions in statistical analysis: independence of errors. Conditions within each habitat class will affect all length measurements within that habitat class and therefore violate the independence of errors assumption. The best (minimal adequate) model was chosen by backward selection, where explanatory variables were deleted one at a time from the full (saturated) model. The model was:

$$FL_i = \alpha + \beta \times depth_i \times habitat_i + a_i + \varepsilon_i$$

Log-transformed fish length (FL) and weight were modelled as an intercept (α) plus the linear interaction between depth and habitat class, a random intercept (a) and an error term (ε). Index i refers to an image in which a length measurement was taken. The fixed effects, depth and habitat class, influence the mean of y (fish length and weight), whereas random effects influence only the variance of y. The reduced model was compared to the full model utilising F-likelihood ratio tests. Restricted Maximum Likelihood (REML) was used to compare models with different random effects structures and Maximum Likelihood (ML) was used for models where the fixed effects structure differed. Fish lengths and weights were log-transformed before analysis. Sightings without a length measurement were excluded from analysis.
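The comparison of a reduced model against the full model can be illustrated with a generic likelihood ratio test. The sketch below is in Python (standard library only) rather than the R used in this chapter, and it does not reproduce the nlme output. The function names and the log-likelihood values in the usage note are hypothetical, and the chi-square reference distribution is the usual large-sample approximation for nested models fitted by ML.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable with integer df.

    Uses Q(1/2, y) = erfc(sqrt(y)) or Q(1, y) = exp(-y) as a starting
    point and the recurrence Q(a + 1, y) = Q(a, y) + y**a * exp(-y) / Gamma(a + 1).
    """
    y = x / 2.0
    if df % 2 == 0:
        a, q = 1.0, math.exp(-y)
    else:
        a, q = 0.5, math.erfc(math.sqrt(y))
    while a < df / 2.0:
        q += y ** a * math.exp(-y) / math.gamma(a + 1.0)
        a += 1.0
    return q

def likelihood_ratio_test(loglik_full, loglik_reduced, df_diff):
    """Return (statistic, p-value) for two nested models fitted by ML.

    df_diff is the number of parameters dropped from the full model.
    """
    stat = max(0.0, 2.0 * (loglik_full - loglik_reduced))
    return stat, chi2_sf(stat, df_diff)
```

For example, with hypothetical log-likelihoods of -120.0 (full) and -123.0 (reduced) and two parameters dropped, `likelihood_ratio_test(-120.0, -123.0, 2)` gives a statistic of 6.0. A non-significant p-value would support keeping the reduced model during backward selection.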

Generalised linear mixed-effects models (GLMMs)

Binary response variables, i.e., presence/absence of H. percoides, were analysed using generalised linear mixed-effects models (GLMMs). We used the R package lme4 (Bates and Mächler, 2010), as it provides the AIC (Akaike Information Criterion) for model selection. AIC is a measure of the fit of a model (Crawley, 2007) and for each model is calculated as:

$$\mathrm{AIC} = -2(\text{log-likelihood}) + 2(p + 1)$$

where p is the number of parameters in the model (1 is added for estimating the variance). The lower the AIC, the better the fit of a model. As with LMEs, we arrived at the ‘best’ model by backward selection. The model was:

$$\mathrm{logit}(p_i) = \alpha + \beta \times depth_i + a_i + \varepsilon_i$$

For each habitat class individually, the probability (p) of H. percoides presence in image i was modelled as an intercept (α) plus the linear depth effect, a random intercept (a) and an error term (ε). The reduced model was compared to the full model using ANOVA. A non-significant result warranted model simplification, i.e., deletion of explanatory variables. After initial analysis using all habitat classes and dives in one data set, we decided to model H. percoides presence/absence for each habitat class separately, as this allowed for easier presentation of our results. The binary response variable was presence or absence of H. percoides in image i. We investigated the probability of H. percoides occurrence per image by depth for each habitat separately. Site, dive and habitat class were incorporated into the model as random effects.
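Two small pieces of this workflow can be written out directly: the AIC formula given above, and the inverse of the logit link, which converts the linear predictor back into a probability of presence. The following is a minimal sketch in Python (standard library only), not the lme4 code used for the analysis. The function names are hypothetical, and the error term is omitted from the linear predictor for simplicity.

```python
import math

def aic(loglik, p):
    """AIC = -2 * log-likelihood + 2 * (p + 1), as defined in the text.

    p is the number of model parameters; 1 is added for the variance.
    """
    return -2.0 * loglik + 2.0 * (p + 1)

def presence_probability(alpha, beta, depth, a=0.0):
    """Invert the logit link: p = 1 / (1 + exp(-(alpha + beta * depth + a))).

    alpha: intercept, beta: depth slope, a: random intercept for the
    grouping unit (0.0 gives the prediction for an average group).
    """
    eta = alpha + beta * depth + a
    return 1.0 / (1.0 + math.exp(-eta))
```

With this definition, a model with a log-likelihood of -100.0 and three parameters has `aic(-100.0, 3)` equal to 208.0. During backward selection, the candidate with the lower AIC would be preferred.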

Habitat preference index

To address habitat preferences of H. percoides we used a log-likelihood ratio test of goodness of fit (Tolimieri et al., 2008), as recommended by Sokal and Rohlf (1995). This method is similar to Pearson’s χ² test; however, the test statistic is the deviance from a log-linear model. We tabulated observed counts by habitat and calculated expected counts assuming no habitat preference by adjusting for the frequency of occurrence of each habitat. For illustrative purposes we created a preference index (observed proportions minus expected proportions).
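The test statistic and the preference index can be sketched as below. This is an illustrative Python version (standard library only) of the standard G statistic, G = 2 Σ O ln(O/E), and not the code used in the thesis. The function name is hypothetical, and the counts in the usage note are made up for illustration.

```python
import math

def habitat_preference(observed, availability):
    """Log-likelihood ratio (G) goodness-of-fit test and preference index.

    observed:     fish counts per habitat class
    availability: how often each habitat class occurred (e.g. image counts)
    Expected counts assume no preference, i.e. fish are distributed in
    proportion to habitat availability.
    Returns (G, expected counts, preference index per habitat).
    """
    total = sum(observed)
    avail_total = sum(availability)
    expected = [total * a / avail_total for a in availability]
    # Habitats with zero observed fish contribute nothing to the sum.
    g = 2.0 * sum(o * math.log(o / e)
                  for o, e in zip(observed, expected) if o > 0)
    index = [o / total - a / avail_total
             for o, a in zip(observed, availability)]
    return g, expected, index
```

With made-up counts of 30 and 10 fish in two equally available habitats, the expected counts are 20 and 20, and the preference index is +0.25 for the first habitat and -0.25 for the second. G is referred to a chi-square distribution with (number of habitats - 1) degrees of freedom.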

Juveniles and adults

Finally, we investigated the proportion of juvenile and adult individuals by habitat, depth, dive and site. Due to our inability to determine the sex of fishes we used an average value based on numbers reported by Park (1993): males mature at 10 – 13 cm TL (approx. 2 – 5 years of age) and females mature at 9 – 17 cm TL (2 – 6 years of age). Individuals >12.25 cm (24.7 g) were considered adult fish. Our decision to take an average assumes a sex ratio close or equal to 1, which seems to prevail in other live-bearing, non-targeted species of the family Sebastidae, although there are several reasons why the sex ratio can deviate from 1, e.g., fishing pressure (Harvey et al., 2006). A classification tree model using binary recursive partitioning was also used to investigate habitat preferences of juvenile and adult fish. Here, individual length measurements (n_length = 937) were split along the coordinate axes of the explanatory variables at the point that maximally distinguishes the response variable (length) between the two branches (Breiman et al., 1984).
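The 12.25 cm threshold equals the mean of the four maturity bounds reported above, (10 + 13 + 9 + 17) / 4. The splitting step of binary recursive partitioning can be illustrated for a single explanatory variable as below. This is a simplified Python sketch (standard library only), not the tree-fitting routine used for the analysis. The function names are hypothetical, and minimising the within-branch sum of squares is the usual criterion for a continuous response, assumed here.

```python
ADULT_THRESHOLD_CM = (10 + 13 + 9 + 17) / 4  # 12.25 cm, mean of reported bounds

def is_adult(length_cm):
    """Classify a fish as adult using the averaged maturity threshold."""
    return length_cm > ADULT_THRESHOLD_CM

def sum_of_squares(values):
    """Sum of squared deviations from the mean."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)

def best_split(x, y):
    """Find the cut point on x that best separates the response y.

    Tries every midpoint between consecutive distinct x values and keeps
    the one with the smallest total within-branch sum of squares.
    Returns (cut point, sum of squares), or None if x has no distinct values.
    """
    pairs = sorted(zip(x, y))
    best = None
    for i in range(1, len(pairs)):
        if pairs[i][0] == pairs[i - 1][0]:
            continue  # no cut possible between identical x values
        left = [p[1] for p in pairs[:i]]
        right = [p[1] for p in pairs[i:]]
        sse = sum_of_squares(left) + sum_of_squares(right)
        cut = (pairs[i - 1][0] + pairs[i][0]) / 2
        if best is None or sse < best[1]:
            best = (cut, sse)
    return best
```

A full tree repeats this search over every explanatory variable (e.g. depth, habitat, dive and site) and then recursively within each branch, stopping according to a chosen complexity rule.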

5.4 Results

In document Testing and evaluating non extractive sampling platforms to assess deep water rocky reef ecosystems on the continental shelf (Page 162-167)