Chapter Six General Discussion

6.8 Limitations

This thesis examines spatio-temporal data covering almost an entire national resource. As with most large-scale studies, a certain amount of error must be expected. This is particularly true for re-surveys of historic data from non-permanent vegetation plots. Error is further compounded by, for example, shifts in the onset of flowering as seasons change over time, and the use of multiple survey teams and different recorders in the re-survey. Nevertheless, the work presented in this thesis demonstrates that, despite these unavoidable errors, re-surveys of historic, non-permanent vegetation plots can provide valuable temporal data. Provided due care is taken to ensure the accuracy of sample locations and the original survey methodology is honoured, re-surveys as conducted here provide powerful datasets in which biological variation in space and time can be successfully explained by environmental parameters, giving an understanding of the processes that govern vegetation communities.

Sampling error, particularly relocation error, is problematic when examining temporal turnover, as sampling the wrong area increases the likelihood of recording temporal turnover when in fact the vegetation community may not have changed. Because relocation error inevitably inflates temporal turnover, even the slightest relocation inaccuracy may have inflated the levels of turnover calculated in Chapter 5. However, for the statistical models applied in Chapter 5, this should be relatively unimportant, because the inflation of turnover should simply make the intercept and the proportion of unexplained variance larger, thus having little effect on the power of the model to capture variation in the data.
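The reasoning above can be illustrated with a small simulation. This is a minimal sketch and not the analysis of Chapter 5: it assumes a single hypothetical environmental gradient, a linear turnover response, and relocation error that adds a non-negative amount of spurious turnover independently of the gradient. All names and parameter values (`simulate`, `fit_line`, the slope of 0.4, the error range of 0 to 0.2) are illustrative assumptions.

```python
import random


def fit_line(x, y):
    """Ordinary least squares for y = a + b*x; returns (intercept, slope, r_squared)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return intercept, slope, 1.0 - ss_res / ss_tot


def simulate(n_plots=2000, seed=1):
    """Fit the same model to 'true' turnover and to turnover inflated by relocation error."""
    rng = random.Random(seed)
    # Hypothetical environmental gradient, scaled to 0..1.
    gradient = [rng.random() for _ in range(n_plots)]
    # Assumed true turnover: linear response plus ordinary residual noise.
    true_turnover = [0.1 + 0.4 * g + rng.gauss(0.0, 0.05) for g in gradient]
    # Assumed relocation error: spurious, non-negative turnover unrelated to the gradient.
    observed = [t + rng.uniform(0.0, 0.2) for t in true_turnover]
    return fit_line(gradient, true_turnover), fit_line(gradient, observed)


true_fit, observed_fit = simulate()
print("true     intercept=%.3f slope=%.3f R2=%.3f" % true_fit)
print("observed intercept=%.3f slope=%.3f R2=%.3f" % observed_fit)
```

Under these assumptions the fitted slope is essentially unchanged, while the intercept rises and the explained variance falls. If relocation error were instead correlated with an environmental predictor, the slope could also be biased, so the independence assumption matters.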

Throughout the research presented in this thesis, many procedures of statistical analysis have been applied, few of which are without alternative approaches, and, as is often the case, schools in

environmental determinants and a stepwise selection procedure as a model parameter selection method.

PCNM analysis is useful for modelling multi-scale spatial structures (Borcard and Legendre 2002), and is a great advance upon earlier studies that used trend surface polynomial regression (Gilbert and Lechowicz 2004). However, truly fine-scale biotic processes that govern vegetation assemblages, such as species territoriality or competition, operate at much finer spatial scales than the resolution of the Scottish coastal survey data. To detect such fine-scale processes, not only is a study design with a fine spatial grain and spatial lag required, but also a different spatial analysis tool, as biotic determinants most likely correspond to negative eigenvalues (negative spatial autocorrelation), which are removed in PCNM analysis. Here, the method of choice would be Moran’s Eigenvector Maps (MEM). Mathematically similar to PCNM, MEM retains eigenvectors with negative eigenvalues, explaining 100% of an ‘n’ centred matrix (Dray et al. 2006).

Ecological data often consist of many explanatory environmental variables, gathered to better understand how and why species and communities are structured (Blanchet et al. 2008). The purpose when analysing the data is to establish a suite of variables that constitutes a best approximating model, or rather a parsimonious model, from which to develop statistical inferences (Burnham and Anderson 2002). In ecology, the general rule that models with fewer variables also contain fewer nuisance variables and have greater predictive power (Gauch 1993) tends to hold true (Ginzburg and Jensen 2004), and a widely recognised procedure to achieve model parsimony is termed ‘stepwise forward selection’. However, biases and shortcomings of stepwise selection are also well established (Johnson et al. 2004; Stephens et al. 2005), of which the principal concerns are over-estimation of the amount of explained variance and a highly inflated Type I error rate (i.e. falsely rejecting a true null hypothesis).
These shortcomings were circumvented in the model selection procedure in Chapter 2 through use of an improved forward selection procedure (see Blanchet et al. 2008). However, despite this newly improved method, the unpredictable nature of forward selection methodologies continues to be scrutinised, and it is argued that better approaches do exist (Anderson et al. 2000). One such approach rests on Akaike’s information criterion (AIC), which provides a simple, effective and objective means for selection of an estimated best approximating model (Burnham and Anderson 2002). This form of model selection and model inference is known as the information-theoretic approach, discussed in detail in Burnham and Anderson (2002), and is applied in the analysis of temporal turnover of the Machair grassland in Chapter 5.
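As a rough illustration of the information-theoretic approach, the sketch below ranks a set of candidate models by AIC and converts the AIC differences into Akaike weights. It is not the procedure used in Chapter 5: it assumes least-squares models with Gaussian errors, for which AIC can be computed from the residual sum of squares, and the three candidate models, their residual sums of squares and the sample size are hypothetical values chosen only for the example.

```python
import math


def aic_least_squares(rss, n, k):
    """AIC for a least-squares fit with Gaussian errors.

    rss: residual sum of squares; n: number of observations;
    k: number of estimated parameters, including the residual variance.
    """
    return n * math.log(rss / n) + 2 * k


def akaike_weights(aic_values):
    """Relative support for each model, given the candidate set."""
    best = min(aic_values)
    relative_likelihoods = [math.exp(-0.5 * (a - best)) for a in aic_values]
    total = sum(relative_likelihoods)
    return [r / total for r in relative_likelihoods]


# Hypothetical candidate set: (label, residual sum of squares, parameter count).
candidates = [("A", 52.0, 3), ("B", 45.0, 4), ("C", 44.8, 5)]
n_obs = 100  # hypothetical number of plots

aics = [aic_least_squares(rss, n_obs, k) for _, rss, k in candidates]
weights = akaike_weights(aics)
for (label, _, _), aic, weight in zip(candidates, aics, weights):
    print("model %s: AIC=%.2f, weight=%.3f" % (label, aic, weight))
```

In this toy set, model C fits marginally better than model B but pays a penalty for its extra parameter, so B receives the most support. The weights are only meaningful relative to the candidate set supplied.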

Also subject to considerable scrutiny is the use of Principal Component Analysis (PCA) and its constrained counterpart, Redundancy Analysis (RDA), on non-linear or unimodal data tables. In nature, most species occupy only part of an environmental gradient, and beyond the end of its response curve a species is replaced by another along that gradient. When captured in large ecological datasets, species replacement generates many zeros in the species data table, a feature of all Gaussian species response curves termed the “zero truncation problem” (Kent 2012). When such unimodally distributed data are analysed with the linear models of PCA and RDA, the result is known to be a curvilinear representation of community gradients in ordination space, a phenomenon termed the “horseshoe effect”, whereby the Euclidean distance between two sites that share no species can be smaller than the distance between sites sharing two or more species (Legendre and Gallagher 2001). To overcome this distortion, linear ordination methods in this thesis (Chapter 2, Chapter 3, Chapter 4) were applied to ecological data following the Hellinger transformation proposed by Legendre and Gallagher (2001). The transformation involves standardising the species data matrix by sample total and taking the square root of each element in the matrix. Analysing the resultant matrix using Euclidean distance methods thus results in a matrix of Hellinger distances (Kent 2012). However, despite the many advocates of this procedure, recent results generated from simulated species assemblages with varying beta diversities found PCA to perform poorly despite the use of the Hellinger transformation (Minchin and Rennie 2010).
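The transformation and the distance problem it addresses can be sketched in a few lines. This is an illustrative example only: the three-site, three-species abundance table is hypothetical, and the function names are not taken from any package used in the thesis.

```python
import math


def hellinger_transform(abundances):
    """Divide each value by its site (row) total, then take the square root."""
    transformed = []
    for row in abundances:
        total = sum(row)
        if total == 0:
            # An empty sample has no composition; keep it as zeros.
            transformed.append([0.0] * len(row))
        else:
            transformed.append([math.sqrt(value / total) for value in row])
    return transformed


def euclidean(a, b):
    """Euclidean distance between two sites."""
    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))


# Hypothetical abundances: rows are sites, columns are species.
# Sites 1 and 2 share no species; sites 1 and 3 share both of their species.
sites = [
    [0, 1, 1],
    [1, 0, 0],
    [0, 4, 4],
]
hellinger = hellinger_transform(sites)

for i, j in [(0, 1), (0, 2)]:
    print("sites %d and %d: raw=%.3f, Hellinger=%.3f"
          % (i + 1, j + 1, euclidean(sites[i], sites[j]),
             euclidean(hellinger[i], hellinger[j])))
```

On the raw abundances, sites 1 and 2, which share no species, are closer (about 1.73) than sites 1 and 3, which share all of their species (about 4.24). After transformation the ordering is reversed: sites 1 and 3 have identical relative composition and a Hellinger distance of 0, while sites 1 and 2 sit at the maximum distance of √2.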

Those opposed to using the Hellinger transformation for PCA and RDA would likely advocate the use of an alternative method, such as non-metric multidimensional scaling (NMDS). Increasingly used over the past 20 years, its advantage over Euclidean-based methods is that it does not assume a linear relationship between species. Furthermore, ranked distance measures that linearise the relationship between distances measured in species space and distances measured in environmental space overcome the ‘zero-truncation’ problem (Kent 2012). The analysis presented in Chapter 3 relies heavily on the Hellinger transformation and Euclidean ordination methods. It would therefore be interesting to see whether similar inferences are derived when this approach is replaced with NMDS. In the first instance, complications would arise from computational complexity when NMDS is applied to the Machair dataset, given the number of survey plots; however, this may be overcome by analysing data at regional level or, alternatively, at an area scale which distinguishes those regions once designated as environmentally sensitive areas (ESAs; see Chapter 3). This would also provide an empirical test of PCA versus NMDS on large-scale unimodal data where beta diversity is high, a feature argued to exacerbate the ‘horseshoe effect’ (Minchin and Rennie 2010).

In document Investigating the drivers of spatial and temporal biodiversity patterns of the Machair (Page 154-156)