
IV-2.8 Critique of methods and proposal for data interpretation

IV-2.8.1 Classification of effects and definitions of recovery

VAN DER LINDEN et al. (2006) proposed a five-part classification system for the effects of pesticides on soil communities, based on the model of BROCK et al. (2006) that was deduced from

Table IV-6: Classification of effects on collembolans based on NOEC values. Synopsis of two TME studies, conducted separately in the years 2005 and 2006. The study period in both tests was about one year after application of the test item. The effect classes are defined according to VAN DER LINDEN et al. (2006) as class 1 = no treatment-related effects; class 2 = slight treatment-related transient effects, usually on one or a few isolated sampling dates only; class 3 = clear effects on several consecutive sampling dates, lasting less than 2 months post last application of the test item in the test system; class 4 = clear effects on several consecutive sampling dates, lasting longer than 2 months but full recovery within a year post last application of the test item in the test system; class 5 = clear long-term effects, full recovery not within one year post last application of the test item in the test system.

Effects of lindane: dose-response relationship

aquatic mesocosm studies. The proposal was also taken up by SCHÄFFER et al. (2010). A classification system has some major advantages for the interpretation of the complex results of a semi-field study that includes many different taxa. It gives a good overview of the experimental results, facilitates the comparison of different experiments, and can be used to rank transient effects and to show the dose dependency of the effects in a clear arrangement. Table IV-6 shows the results of the TME range-finding study of the year 2005 (refer to chapter III) and of the dose-response experiment of the year 2006 for the group of collembolans. This group of taxa turned out to be the most sensitive and was therefore taken as an example of the advantages of effect classification. The severity and persistence of effects span the two studies continuously. Whilst there were very few clear effects on most endpoints for soil concentrations between 0.0032 and 3.2 mg a.i./kg dry soil, no recovery took place for the principal response of the whole community and the diversity indices, as well as for the total abundance and several species. There was a steep increase of the effect between the two experiments. This finding can serve as a rationale for the choice of effective concentrations in future TME experiments, should lindane become a standard reference substance.
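The class definitions in the caption of Table IV-6 can be read as a simple decision rule. The following Python function is only a minimal sketch of such a rule and not the procedure actually used to fill the table. Several points are assumptions made for illustration: the name `effect_class`, the input format (one pair of months after the last application and an effect flag per sampling date), the reading of "clear effects" as at least two consecutive dates with an effect (`min_run`), and the reading of "no full recovery" as an effect still present at the last sampling date.

```python
def effect_class(samples, min_run=2):
    """Assign an effect class (1-5) to one endpoint.

    samples: list of (months_after_last_application, effect_detected),
    sorted by time and covering about one year (assumed input format).
    min_run: number of consecutive dates with an effect that is taken to
    mean "clear effects" (an assumption, not defined in the source).
    """
    flags = [detected for _, detected in samples]
    if not any(flags):
        return 1  # no treatment-related effects

    # Length of the longest run of consecutive sampling dates with an effect.
    longest = run = 0
    for detected in flags:
        run = run + 1 if detected else 0
        longest = max(longest, run)
    if longest < min_run:
        return 2  # slight, transient effects on isolated dates only

    if flags[-1]:
        return 5  # effect still present at the last date: no full recovery

    last_effect = max(month for month, detected in samples if detected)
    return 3 if last_effect < 2 else 4  # recovery within or after 2 months
```

For example, an endpoint with effects at 0.5, 1 and 3 months but none at 12 months would be assigned class 4 under these assumptions.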

IV-2.8.2 Statistical methodology

It was found worthwhile to analyse the results and consequences of the PRC analysis for the group of collembolans in more depth, as a slight digression. This goes beyond the strict requirements of the stepwise statistical procedure proposed in the literature, which is sketched below.

1. Test the significance of the first principal component of a partial redundancy analysis by Monte Carlo permutation tests, permuting the whole time series. Proceed with step 2 only if the precondition of a significant first principal component is fulfilled.

2. Perform the same permutation tests at each sampling date on separate redundancy analyses to test for significant differences between the treatments on the communities at the single sampling dates. Proceed with step 3 for the datasets of the significant sampling dates.

3. Calculate the community NOEC by applying separate Williams tests on the sample scores of a Principal Component Analysis.
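The first step of this procedure can be illustrated with a small, self-contained Python sketch. It is not the CANOCO implementation, and all names (`first_axis_statistic`, `permutation_p_value`) and the data layout are assumptions made for illustration. The sketch partials out time by centring each taxon per sampling date, fits treatment-by-date cell means, and uses the largest eigenvalue of the fitted values divided by the residual sum of squares as the test statistic. Treatment labels are shuffled among the experimental units, so that the whole time series of a unit is permuted as one block.

```python
import random


def first_axis_statistic(data, labels):
    """Return lambda_1 / residual SS for a PRC-like partial RDA.

    data[u][t] is the list of (transformed) abundances of unit u at
    sampling date t; labels[u] is the treatment level of unit u.
    """
    n_units, n_dates, n_taxa = len(data), len(data[0]), len(data[0][0])
    levels = sorted(set(labels))
    fitted, residual_ss = [], 0.0
    for t in range(n_dates):
        # Partial out time: centre each taxon on its mean at this date.
        mean_t = [sum(data[u][t][s] for u in range(n_units)) / n_units
                  for s in range(n_taxa)]
        centred = [[data[u][t][s] - mean_t[s] for s in range(n_taxa)]
                   for u in range(n_units)]
        # Treatment x time fit: cell means of the centred data.
        for level in levels:
            members = [u for u in range(n_units) if labels[u] == level]
            cell = [sum(centred[u][s] for u in members) / len(members)
                    for s in range(n_taxa)]
            for u in members:
                fitted.append(cell)
                residual_ss += sum((centred[u][s] - cell[s]) ** 2
                                   for s in range(n_taxa))
    # Largest eigenvalue of the fitted cross-product matrix, approximated
    # by power iteration (taxa x taxa matrix).
    cross = [[sum(row[a] * row[b] for row in fitted) for b in range(n_taxa)]
             for a in range(n_taxa)]
    vec, lam = [1.0 + 0.1 * i for i in range(n_taxa)], 0.0
    for _ in range(200):
        nxt = [sum(cross[a][b] * vec[b] for b in range(n_taxa))
               for a in range(n_taxa)]
        lam = sum(x * x for x in nxt) ** 0.5
        if lam == 0.0:
            return 0.0
        vec = [x / lam for x in nxt]
    return lam / residual_ss if residual_ss > 0 else float("inf")


def permutation_p_value(data, labels, n_perm=499, seed=1):
    """Monte Carlo p-value; whole time series are permuted with their unit."""
    rng = random.Random(seed)
    observed = first_axis_statistic(data, labels)
    shuffled, hits = list(labels), 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        # Small relative tolerance guards against floating-point ties.
        if first_axis_statistic(data, shuffled) >= observed * (1 - 1e-9):
            hits += 1
    return (hits + 1) / (n_perm + 1)
```

The statistic omits the degrees-of-freedom constants of a pseudo-F value. These constants are identical for every permutation and therefore do not change the p-value. Restricted permutation schemes, covariables other than time, and species weighting are not covered by this sketch.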

The procedure was originally described by VAN DEN BRINK & TER BRAAK (in a series of related papers, e.g. VAN DEN BRINK & TER BRAAK 1998) and has been applied in various publications in the broader context of ecotoxicological community studies. By completely repeating all steps of the PRC calculations for certain subsets of the whole dataset, we realized that the outcome and the final consequences of our experiment were not independent of the subset chosen

(set 5: data used until 5 months after application of the test item; set 12: data used until 12 months after application; Figure IV-2). While investigating the results, it became obvious that specific differences between the two datasets hamper the detection of effects, because the significance of the first canonical axis did not reach the default level when applied to set 12 (Figure IV-5). What is the reason, and how would it be possible to overcome the weaknesses of a stepwise procedure that largely depends on the total variation in the dataset? It is an open question whether it is statistically sound to abandon the consecutive procedure, or to allow the interpretation of results that do not fulfil the prior steps currently defined as prerequisites of further analyses. We list some possible reasons and interpretations for discussion in the scientific community.

The analysis of single sampling dates resulted in the same significances of the treatments and the same NOECs for both datasets, because the corresponding data remain the same except for the last sampling date one year after application. The minor differences between the PRC cdt-diagrams, even at corresponding sampling dates, are based on the differing total standard deviation in the species data (TAU) given by the CANOCO program. The original sample scores are multiplied by TAU, which is 0.64 for set 5 and 0.57 for set 12, resulting in slight differences in cdt-values. The question remains why the PRC of set 5 is significant but that of set 12 is not. The worst case would be that the detectability of effects decreases systematically with increasing total variation due to progressing time. Alternatively, is just enough non-treatment-related variability added to the dataset to hide the effects? Consequently, a power analysis for multivariate statistical methods is urgently needed, but it demands much computational power because of the inhomogeneity of the univariate distributions underlying the multivariate dataset. In the arena of aquatic semi-field test guidance, the demand for information on the specific power of an analysis has already been formulated (e.g. OECD 2006). As the final consequence, the detectability of initial effects at the community level is lowered either by increasing variability of the dataset or by blurring of the initial response of the community as the experiment continues. The strict criteria of a stepwise statistical procedure should not apply to TME data when the regulatory acceptable concentration in soil is to be derived. In higher-tier semi-field studies like aquatic mesocosms or TME, the Regulatory Acceptable Concentration (RAC) is defined as the No Observed Ecologically Adverse Effect Concentration (NOEAEC) divided by the relevant safety factor.
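Two pieces of arithmetic in this paragraph can be made explicit in a short Python sketch: the scaling of sample scores by TAU, and the derivation of the RAC from the NOEAEC. The function names and the raw scores are hypothetical and serve only as an illustration. The two TAU values are the ones given above.

```python
def rescale_scores(raw_scores, tau):
    """Multiply raw sample scores by TAU, as described for the cdt-values."""
    return [tau * score for score in raw_scores]


def regulatory_acceptable_concentration(noeaec, safety_factor):
    """RAC = NOEAEC / safety factor (same concentration unit as the NOEAEC)."""
    if safety_factor <= 0:
        raise ValueError("safety factor must be positive")
    return noeaec / safety_factor


TAU_SET_5, TAU_SET_12 = 0.64, 0.57   # values reported for set 5 and set 12
raw = [-1.0, -0.5, 0.0]              # hypothetical raw scores, not study data

cdt_set_5 = rescale_scores(raw, TAU_SET_5)
cdt_set_12 = rescale_scores(raw, TAU_SET_12)
# For identical raw scores, the set 12 values are 0.57 / 0.64 (about 0.89)
# times the set 5 values.
```

The raw scores of the two analyses need not be identical in practice, so the ratio of 0.57 to 0.64 describes only the part of the difference that is due to TAU.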

Recently, in the arena of aquatic semi-field experiments, new methods beyond the PRC have been proposed and criticized using experimental data from mesocosm studies. LIESS & BEKETOV

variate statistics (ANOVA and corresponding post-hoc tests) that were not indicated by the multivariate PRC method. The immediate response and objection of VAN DEN BRINK & TER BRAAK (2011) opened the floor for further discussions on the detection of subtle and long-term effects on the structure of communities under chemical stress. It was proposed to use further axes of the multivariate analyses to detect the effects of low dosages on predominant taxa groups rather than reducing the variability of a dataset by a priori classification. Our perception is that the statistical methodologies for the analysis of community effects are far from being exhausted. In addition to improving the statistical methodologies, it is necessary to establish more relevant test systems.

V Beyond substance related effects

The preceding chapters reviewed the history of TME studies, described the conceptual approach of this thesis, set out the methodology of TME studies in great detail and analysed the effects of the model compound lindane on populations and communities of soil organisms in TME. By focusing on the effects of the toxic compound and by mainly using statistical methods that are currently used for the analysis of model ecosystems in aquatic risk assessment, the test system was viewed from the rather narrow angle of purely regulatory ecotoxicology. The following section is intended to provide a basis for a general discussion of the characteristics of a soil ecosystem that determine the limits of statistical analyses. Questions that arose during intensive discussions with internal and external experts at international conferences and workshops are addressed. In particular, the international SETAC Workshop on the future use of terrestrial semi-field methods in Coimbra in 2007 gave many thought-provoking impulses and helped to sort out non-relevant directions of further investigation. The following chapter is structured by the three experimental phases 'experimental design', 'experimental period' and 'analysis of results', and it discusses the representativeness of TME for the situation in the field.

• First, the experimental design has to be developed and adjusted to both the specific research question and the characteristics of the test system. An intensive field screening should deliver robust estimators of the variability to be expected in the field and an optimized sampling strategy for TME soil cores (chapter V-1).

• Second, the test system should be stable over a certain experimental period. For this, the temporal stability of the test system should be demonstrated to avoid systematic artefacts due to effects of isolation or other properties of the test system (chapter V-2).

• Third, for the analysis of the test results, the variability of the experimental units (TME) should be known in order to assess the relevance of effects and the limits of effect detection. In the following, data of all studies in the field and in TME that were conducted between the years 2004 and 2007 (chapter II-1.1) are used to investigate the different questions (chapter V-3).

• Fourth, the representativeness is analysed by comparing the TME collembolan communities with the communities of the coring site. Differences are assumed to be due to transport and storage effects after TME coring, and should be at least partly ascribable to climatic differences between the two areas of origin and the experimental storage site, respectively. Furthermore, the TME species inventory has been compared with the typical agricultural species in Central Europe (chapter V-4).