3.3.6 Molecular analysis of soil microbial communities and data pre-processing

DNA was extracted from 0.1 g of soil using the Quick-DNA™ Soil Microbe Kit and protocol (Zymo Research). Amplicon libraries were constructed according to a dual-indexing strategy, with each primer consisting of the appropriate Illumina adapter, an 8-nt index sequence, a 10-nt pad sequence, a 2-nt linker and the amplicon-specific primer (Kozich et al., 2013). For bacteria, V3-V4 16S rRNA amplicon primers (CCTACGGGAGGCAGCAG and GCTATTGGAGCTGGAATTAC) were used (Kozich et al., 2013). Fungi were targeted by amplifying the ITS2 region using primers GTGARTCATCGAATCTTTG and TCCTCCGCTTATTGATATGC (Ihrmark et al., 2012). Although the capability of ITS primers to detect AM fungi is known to be somewhat limited due to bias (Hart et al., 2015), recent studies have shown that patterns in diversity and community composition can be adequately identified within sample types such as soil (Berruti et al., 2017; Lekberg et al., 2018). Amplicons were generated using a high-fidelity DNA polymerase (Q5 Taq, New England Biolabs). After an initial denaturation at 95 °C for 2 minutes, PCR was run for 25 cycles of denaturation at 95 °C for 15 seconds, annealing at 55 °C (16S) or 52 °C (ITS) for 30 seconds and extension at 72 °C for 30 seconds, followed by a final extension at 72 °C for 10 minutes. Amplicon sizes were determined using an Agilent 2200 TapeStation system; samples were normalised using the SequalPrep Normalization Plate Kit (Thermo Fisher Scientific) and pooled. The pooled library was quantified using a Qubit dsDNA HS kit (Thermo Fisher Scientific) prior to sequencing with an Illumina MiSeq using V3 600 cycle reagents at a concentration of 8 pM with a

million reads passing filter (16S and ITS, respectively). Sequences were processed in R using DADA2 to quality filter, merge, de-noise and assign taxonomies (Callahan et al., 2016). 16S forward reads were trimmed to 250 bases. ITS sequence reads were trimmed to 225 and 160 bases for forward and reverse reads, respectively. Filtering settings were a maximum number of Ns (maxN) of 0 and a maximum number of expected errors (maxEE) of 1. Sequences were dereplicated and the DADA2 core sequence variant inference algorithm was applied. The mergePairs and removeBimeraDenovo functions were used at default settings to merge forward and reverse reads and to remove chimeric sequences.
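The read-filtering criteria described above can be illustrated with a minimal Python sketch. It is not the DADA2 code used in the study; it only mirrors the documented logic of truncating reads to a fixed length, discarding reads that contain more than maxN ambiguous bases and discarding reads whose expected errors, EE = sum(10^(-Q/10)), exceed maxEE. The function names and the example reads are hypothetical.

```python
def expected_errors(quals):
    """Expected number of errors in a read: sum of per-base error
    probabilities, p = 10^(-Q/10), for Phred quality scores Q."""
    return sum(10 ** (-q / 10) for q in quals)


def filter_read(seq, quals, trunc_len, max_n=0, max_ee=1.0):
    """Return the truncated (seq, quals) if the read passes, else None.

    Assumed behaviour (modelled on the settings reported in the text):
    reads shorter than trunc_len are discarded, and the maxN and maxEE
    checks are applied to the truncated read.
    """
    if len(seq) < trunc_len:
        return None
    seq, quals = seq[:trunc_len], quals[:trunc_len]
    if seq.upper().count("N") > max_n:
        return None
    if expected_errors(quals) > max_ee:
        return None
    return seq, quals


# Hypothetical reads: 260 bases at uniform quality.
good = filter_read("A" * 260, [30] * 260, trunc_len=250)  # EE = 250 * 0.001
bad = filter_read("A" * 260, [20] * 260, trunc_len=250)   # EE = 250 * 0.01
print(good is not None, bad is None)
```

With maxEE = 1, a 250-base read at a uniform Q30 (EE of about 0.25) is retained, whereas the same read at Q20 (EE of about 2.5) is discarded.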

The amplicon sequence variants (ASVs) were subjected to taxonomic assignment using the assignTaxonomy function and the training database UNITE version 7.2 (UNITE Community, 2017).

Fungal functional guild classifications were assigned to ASVs using the FUNGuild annotation tool (Nguyen et al., 2016). Only ASVs with unambiguous (non-multiple) classifications of “probable” or “highly-probable” confidence rankings were considered for analysis. These were used for calculating relative abundances of fungal guilds and sub-setting saprotrophic, mycorrhizal, EcM, pathogenic and parasitic fungal datasets for analysis of diversity and community dissimilarity. Only ASVs assigned to the kingdoms Bacteria and Fungi were retained for downstream analysis (99.34 % and 99.73 % of total 16S and ITS reads, respectively). Eight samples with abnormally low read counts in all libraries were removed. All of these were SLG samples, including one entire SLG plot in B North. All singleton ASVs were removed and sample sequencing depth was normalised by rarefying to the minimum read counts of 13,498 (bacteria), 7,124 (overall fungal), 1,983 (saprotrophic fungal), 22 (mycorrhizal fungal), 13 (EcM fungal), 501 (pathogenic fungal) and 19 (parasitic fungal). Alpha diversity indices (ASV richness, Shannon index) and fungal guild relative abundances were calculated using rarefied datasets. All sequencing data pre-processing was conducted in R version 3.5.1 (R Core Team, 2013) using the phyloseq package (McMurdie and Holmes, 2013).
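As an illustration of the rarefaction and alpha diversity steps, the following Python sketch subsamples one sample to a fixed depth and computes ASV richness and the Shannon index. It is a simplified stand-in for the phyloseq workflow, under two assumptions not stated in the text: reads are drawn without replacement, and the Shannon index uses the natural logarithm. The ASV names and counts are hypothetical.

```python
import math
import random


def rarefy(counts, depth, seed=0):
    """Randomly subsample `depth` reads from a dict of ASV -> read count.

    Assumption: sampling is without replacement. Returns None when the
    sample has fewer reads than the requested depth.
    """
    if sum(counts.values()) < depth:
        return None
    # Expand counts into one entry per read, then draw `depth` of them.
    pool = [asv for asv, n in counts.items() for _ in range(n)]
    drawn = random.Random(seed).sample(pool, depth)
    rarefied = {}
    for asv in drawn:
        rarefied[asv] = rarefied.get(asv, 0) + 1
    return rarefied


def richness(counts):
    """Number of ASVs with at least one read."""
    return sum(1 for n in counts.values() if n > 0)


def shannon(counts):
    """Shannon index H' = -sum(p_i * ln(p_i)) over ASVs present."""
    total = sum(counts.values())
    return -sum((n / total) * math.log(n / total)
                for n in counts.values() if n > 0)


# Hypothetical sample rarefied to a hypothetical depth of 40 reads.
sample = {"asv1": 50, "asv2": 30, "asv3": 20}
rare = rarefy(sample, depth=40)
print(richness(rare), round(shannon(rare), 3))
```

Rarefying to very small depths, such as the 13 to 22 reads reported for the mycorrhizal, EcM and parasitic subsets, leaves few reads per sample, so richness and Shannon estimates for those guilds are correspondingly coarse.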

3.3.7 Statistical analyses

All statistical analyses were conducted in R version 3.5.1 (R Core Team, 2013).

ANOVA was used to test differences in soil, environmental and vegetation characteristics and univariate soil microbial community attributes (alpha diversity and fungal guild relative abundances) between CCF and SLG (H1, H2 & H3). Plot-averaged data were used to control for spatial pseudoreplication, and the effect of vegetation type was tested after the site factor to control for potential site effects. To test the relationship between disturbance intensity (canopy openness) and soil characteristics and microbial community attributes (H4), linear mixed effects regression models (LMMs) were constructed in the lme4 R package (Bates et al., 2015) using data from SLG only. To control for potential within-plot pseudoreplication, plot ID was included as a random intercept term. Normality of model residuals was evaluated using Shapiro-Wilk tests, and variables were log-transformed where necessary to improve model fit and satisfy assumptions of homoscedasticity (ANOVA). Significance was considered at the p ≤ 0.05 level. For univariate soil microbial community attributes significantly affected by SLG, the soil physicochemical parameters explaining the largest proportion of variance across SL forest were identified through multiple linear regression. Prior to analysis, highly correlated soil variables were identified with correlograms using the corrplot R package (Wei and Simko, 2017). Variables correlated with Pearson’s r > 0.6 were removed after averaging correlation coefficients by site. The best model was identified through stepwise forward and backward variable selection using AIC as the criterion.
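The screening of correlated soil variables can be sketched in Python as follows. This is one possible reading of the procedure, not the corrplot-based workflow itself: Pearson’s r is computed within each site, the coefficients are averaged across sites, and a variable is dropped when its site-averaged correlation with an already retained variable exceeds a threshold. The greedy ordering, the use of the absolute value of r, the variable names, the data and the threshold passed in the usage example are all assumptions made for illustration.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def mean_r_by_site(x, y, sites):
    """Compute r separately within each site, then average across sites."""
    rs = []
    for s in sorted(set(sites)):
        idx = [i for i, v in enumerate(sites) if v == s]
        rs.append(pearson_r([x[i] for i in idx], [y[i] for i in idx]))
    return sum(rs) / len(rs)


def drop_correlated(data, sites, threshold):
    """Greedy screen (an assumption): keep a variable only if its
    site-averaged |r| with every variable kept so far is <= threshold."""
    kept = []
    for name in data:
        if all(abs(mean_r_by_site(data[name], data[k], sites)) <= threshold
               for k in kept):
            kept.append(name)
    return kept


# Hypothetical soil variables measured in two sites (four plots each).
sites = ["A", "A", "A", "A", "B", "B", "B", "B"]
soil = {
    "pH": [1, 2, 3, 4, 2, 4, 6, 8],
    "Ca": [2, 4, 6, 8, 4, 8, 12, 16],   # perfectly correlated with pH
    "moisture": [3, 1, 4, 2, 1, 3, 2, 4],
}
print(drop_correlated(soil, sites, threshold=0.6))
```

Averaging within-site coefficients, rather than pooling all plots, prevents between-site differences in mean soil conditions from inflating or masking the correlations.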

Soil microbial community data were Hellinger-transformed prior to analysis (Legendre and Borcard, 2018) to control for the effect of rare taxa, and merged at the plot level (n = 11) using the merge_samples function in phyloseq to control for spatial pseudoreplication. Soil microbial community compositions across CCF and SLG were visualised with PCoA using Bray-Curtis dissimilarities via the phyloseq, vegan (Oksanen et al., 2019) and ggplot2 (Wickham, 2016) packages. Differences in soil microbial community compositions between vegetation types (H2 & H3) were tested with PERMANOVA using the adonis vegan function and Bray-Curtis community dissimilarities. All permutational tests were run with 10,000 permutations and restricted by site to control for the nested sampling design using the permute R package (Simpson et al., 2019). Homogeneity of multivariate dispersion between vegetation types (an assumption of PERMANOVA) for soil microbial community dissimilarities was evaluated using the betadisper vegan function. For soil microbial community dissimilarities significantly affected by SLG, the soil physicochemical parameters explaining the largest proportion of variance across SL forest were identified through backward selection using PERMANOVA with adonis2, where variables with the highest p-values in marginal tests were sequentially removed until all predictors were significant. Indicator analysis was conducted to identify specific fungal taxa associated with different vegetation types using the labdsv R package (Roberts, 2016; Dufrêne and Legendre, 1997).
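The multivariate workflow described above can be sketched in Python as follows: a Hellinger transformation, Bray-Curtis dissimilarities, and a one-factor PERMANOVA in which group labels are permuted only within sites. This is a simplified illustration rather than vegan's adonis implementation; it handles a single grouping factor, and all function names and the toy data are hypothetical.

```python
import math
import random


def hellinger(counts):
    """Hellinger transform of one sample: sqrt of relative abundances."""
    total = sum(counts)
    return [math.sqrt(c / total) for c in counts]


def bray_curtis(u, v):
    """Bray-Curtis dissimilarity between two abundance vectors."""
    return (sum(abs(a - b) for a, b in zip(u, v))
            / sum(a + b for a, b in zip(u, v)))


def pseudo_f(dist, groups):
    """One-way PERMANOVA pseudo-F from a square dissimilarity matrix."""
    n = len(groups)
    levels = sorted(set(groups))
    ss_total = sum(dist[i][j] ** 2
                   for i in range(n) for j in range(i + 1, n)) / n
    ss_within = 0.0
    for g in levels:
        idx = [i for i in range(n) if groups[i] == g]
        ss_within += sum(dist[i][j] ** 2
                         for a, i in enumerate(idx) for j in idx[a + 1:]) / len(idx)
    ss_among = ss_total - ss_within
    return (ss_among / (len(levels) - 1)) / (ss_within / (n - len(levels)))


def permanova(dist, groups, strata, n_perm=999, seed=0):
    """Permutation p-value with labels shuffled only within each stratum
    (site), mimicking a site-restricted permutation scheme."""
    rng = random.Random(seed)
    observed = pseudo_f(dist, groups)
    hits = 0
    for _ in range(n_perm):
        permuted = list(groups)
        for s in sorted(set(strata)):
            idx = [i for i in range(len(strata)) if strata[i] == s]
            labels = [groups[i] for i in idx]
            rng.shuffle(labels)
            for i, lab in zip(idx, labels):
                permuted[i] = lab
        if pseudo_f(dist, permuted) >= observed:
            hits += 1
    return observed, (hits + 1) / (n_perm + 1)


# Hypothetical plot-level ASV counts (rows = plots, columns = ASVs).
plots = [[30, 5, 1], [28, 6, 2], [4, 20, 12], [6, 18, 14]]
veg = ["CCF", "CCF", "SLG", "SLG"]
site = ["s1", "s2", "s1", "s2"]
transformed = [hellinger(p) for p in plots]
dist = [[bray_curtis(a, b) for b in transformed] for a in transformed]
print(permanova(dist, veg, site, n_perm=199))
```

Restricting permutations to within sites means that a vegetation type confined to a single site cannot be exchanged with the other type, so the test only draws on sites where both types occur. With as few plots as in this toy example the number of distinct permutations is very small and the resulting p-value is only illustrative.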

3.4 Results

3.4.1 Differences in soil, environmental and vegetation characteristics between CCF
