Chapter 1: Generic Modelling of Faecal Indicator Organism
1.5 Results
1.5.4 Inter-study transfer errors
Transfer errors were investigated for the high-flow, population-based EN model, since this had the highest level of explained variance of the more parsimonious land cover- and population-based models⁴.
Table 13: inter-study transfer errorsᵃ in the high-flow population-based EN model

Study catchment testedᵇ      Mean errorᶜ             Mean absolute errorᵈ
                             (log10 CFU 100ml−1)     (log10 CFU 100ml−1)
 1  Holland Brook              0.4975                  0.4975
 2  River Ribble              −0.1883                  0.2985
 5  River Leven/Crake         −0.0513                  0.2215
 9  River Irvine/Garnock       0.6227                  0.6227
11  River Nairn               −0.2126                  0.3059
12  Afon Ogwr                  0.0417                  0.2116
14  Afon Rheidol/Ystwyth       0.4609                  0.4912
    Mean                       0.1672                  0.3784

ᵃ Determined by deriving a model with data for the tested study catchment omitted and using the resulting model to predict the geometric mean concentration for subcatchments in the omitted study.
ᵇ Only study catchments with ≥ 5 subcatchments with valid high-flow data were included, see Table 4.
ᶜ Mean of predicted − actual log10 EN concentrations for each of the subcatchments in the study catchment being tested.
ᵈ Mean of absolute difference between predicted and actual log10 EN concentrations for each of the subcatchments in the study catchment being tested.
4 Other models were not assessed. The high-flow population-based FC model has a very similar explained variance (r² = 0.622), so it is likely that transfer errors for that model will be broadly similar. Transfer testing was not undertaken on the base-flow population-based models. It is acknowledged that transfer errors may differ in those models. The remaining models (i.e. those using land use variables) were not assessed, as those models were inferior to the population-based models and were not used within the transfer analyses reported in Chapter 2.
The results in Table 13 reveal inter-study variability that is not accounted for by the model. Only the Leven/Crake and Ogwr studies have mean errors close to zero. For the Holland Brook, Irvine/Garnock and Rheidol/Ystwyth studies the models based on the other study catchments tend to overestimate the actual EN concentrations that were recorded (mean errors: 0.4975, 0.6227 and 0.4609 log10
CFU 100ml−1, respectively); whereas for the Ribble and Nairn studies the models
tend to underestimate actual EN concentrations (mean errors: −0.1883 and −0.2126 log10 CFU 100ml−1, respectively).
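The leave-one-study-out procedure described in note a to Table 13 can be sketched as follows. This is a minimal illustration only, not the model used in this chapter: it assumes a single hypothetical predictor and an ordinary least-squares straight line, and the data, the function name `transfer_errors` and the study labels are all invented for the example. The sign convention follows note c (predicted minus actual), so a positive mean error indicates overestimation.

```python
import numpy as np

def transfer_errors(x, y, study):
    """Leave-one-study-out errors for a simple linear model y ~ a + b*x.

    x     : hypothetical predictor per subcatchment
    y     : observed log10 geometric-mean concentration per subcatchment
    study : study-catchment label per subcatchment
    Returns {label: (mean_error, mean_absolute_error)}, error = predicted - actual.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    study = np.asarray(study)
    results = {}
    for label in np.unique(study):
        held_out = study == label
        # Fit on every study except the one being tested.
        slope, intercept = np.polyfit(x[~held_out], y[~held_out], 1)
        # Predict the omitted study's subcatchments and compare with observations.
        diff = (intercept + slope * x[held_out]) - y[held_out]
        results[str(label)] = (float(diff.mean()), float(np.abs(diff).mean()))
    return results

# Synthetic illustration only: three made-up studies, one with a systematic offset.
rng = np.random.default_rng(0)
study = np.repeat(["A", "B", "C"], 6)
x = rng.uniform(0.5, 3.0, size=study.size)
offset = np.where(study == "C", -0.5, 0.0)  # study C sits below the shared relationship
y = 3.0 + 0.8 * x + offset + rng.normal(0.0, 0.1, size=study.size)

errors = transfer_errors(x, y, study)
for label, (me, mae) in errors.items():
    print(f"{label}: mean error {me:+.3f}, mean absolute error {mae:.3f}")
```

In this synthetic case the model fitted to studies A and B overestimates study C, which is the same kind of positive anomaly as that reported for Holland Brook, Irvine/Garnock and Rheidol/Ystwyth.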
Figure 4: plot of actual high-flow log10 GM EN concentration against predicted values using the population-based model reported in Table 11, with values from those studies showing clear positive or negative anomalies in transferability testing (Table 13) identified.
The mean absolute error recorded is 0.3784 log10 CFU 100ml−1, with values
ranging from 0.2116 (Ogwr) to 0.6227 (Irvine/Garnock) log10 CFU 100ml−1. The
pattern in these results is closely reflected in the plot of predicted against actual high-flow EN concentrations based on the overall model, shown in Figure 4. Application of the model to the three sites in the Haverigg catchment produced a
mean error of −0.1810 log10 CFU 100ml−1 (ranging from −0.0513 to −0.2467 log10
CFU 100ml−1). It should be noted that inter-study transfer errors will tend to be
greater where levels of explained variance in the models are lower, notably in the base-flow models.
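The summary figures quoted from Table 13 can be re-derived from the per-study values. The short check below uses only the numbers transcribed from the table; it suggests that the 'Mean' row is the unweighted mean of the seven study values, which is an inference from the arithmetic rather than something stated in the table notes.

```python
# Per-study (mean error, mean absolute error) from Table 13, in log10 CFU 100ml-1.
table13 = {
    "Holland Brook": (0.4975, 0.4975),
    "River Ribble": (-0.1883, 0.2985),
    "River Leven/Crake": (-0.0513, 0.2215),
    "River Irvine/Garnock": (0.6227, 0.6227),
    "River Nairn": (-0.2126, 0.3059),
    "Afon Ogwr": (0.0417, 0.2116),
    "Afon Rheidol/Ystwyth": (0.4609, 0.4912),
}

n = len(table13)
overall_me = sum(me for me, _ in table13.values()) / n
overall_mae = sum(mae for _, mae in table13.values()) / n

# Studies with the smallest and largest mean absolute error.
lowest = min(table13, key=lambda k: table13[k][1])
highest = max(table13, key=lambda k: table13[k][1])

# Where |mean error| equals the mean absolute error (to the reported precision),
# the errors for that study's subcatchments all share one sign.
one_signed = [k for k, (me, mae) in table13.items() if abs(abs(me) - mae) < 1e-9]

print(round(overall_me, 4), round(overall_mae, 4))  # 0.1672 0.3784
print(lowest, "|", highest)
print(one_signed)
```

The last check shows that for Holland Brook and Irvine/Garnock the mean error and mean absolute error coincide, which implies that, to the reported precision, every subcatchment in those two studies was overestimated.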
A linear assessment of transfer errors fitted very poorly (i.e. it was an inappropriate functional form). Figure 4 shows a log10-transformed plot of the values predicted by the high-flow GM enterococci population-based model (reported in Table 11) against actual GM enterococci concentrations. The model underperforms at the extremes: it tends to over-predict very low concentrations and under-predict very high concentrations. The primary reason is that the models are compiled using data from several river sites, each of which has differing ambient characteristics, and this affects the fit of predicted concentrations. This transfer error may result in less accurate predictions of extreme values, but it does not prevent the models from predicting generalised patterns of pollution or identifying potential ‘pollution hotspots’ that may require further investigation. A more sophisticated non-parametric approach to model construction might have produced more accurate predictions, but this option was not explored because the parsimonious parametric approach provides tractable models, which aids model transfer.
The models reported here represent the first generic transferable models which can be used to predict FIO pollution across unmonitored UK watercourses. They are also the first models for which any transferability testing has been undertaken, so transfer errors cannot be directly compared with previous research. For example, the Scotland and Northern Ireland Forum for Environmental Research (SNIFFER, 2006) screening tool provides insights into FIO export coefficients for catchments (only in Scotland and Northern Ireland, not elsewhere within the UK) but does not provide a basis for characterising base- and high-flow FIO concentrations separately, and the SNIFFER export coefficient calculations have yet to be fully evaluated against out-of-sample tests or data from monitored catchments. The immediate predecessors to this research, the meta-analyses conducted by CREH and reported in Kay et al. 2008a and 2008b, did not provide any assessment of transferability (e.g. out-of-sample analyses). Both of the Kay et al. studies are qualitatively different from this research. Kay et al. 2008a is not
directly comparable as it focused exclusively on FIOs from sewage and treated effluents (i.e. no FIOs from agricultural sources). Comparison with Kay et al. 2008b is not possible as it focused on the significance of differences in FIO export coefficients (CFU km−2 h−1) between base-flow and high-flow river conditions.
Furthermore, the results of Kay et al. 2008b are somewhat obscured because they examine the relationship between overall (i.e. base- and high-flow) FC export coefficients, whereas within this research base- and high-flow conditions are modelled separately and FIOs are expressed as concentrations (CFU 100ml−1), not export