5.2 Methods
5.2.3 Analysing the databases for the presence of treponemes using BLAST
The Treponema present in each of these bacterial databases were identified using the Basic Local Alignment Search Tool (BLAST), which is available on the NCBI website (NCBI 2013a).
BLAST finds regions of local similarity between sequences. The program compares nucleotide or protein sequences to sequence databases and calculates the statistical significance of the matches. It allows whole databases of sequences to be compared with a known species of interest in order to identify similar sequences in bacterial data.
Chapter 5 GI tract investigation- metagenomic approach
Each database of sequences was uploaded to BLAST and aligned against two “query sequences”: a representative GI Treponema species, T. bryantii strain RUS-1 (GenBank accession: M57737), and a representative DD treponeme, T. phagedenis strain T320A (GenBank accession: EF061261). The query sequence is the sequence against which all sequences in the database are compared. The output is a list of alignments between the query and the database subjects, ranked by length and significance. This allowed the nucleotide sequences in the gene databases most similar to T. bryantii strain RUS-1 and T. phagedenis strain T320A to be identified. Any sequence with above 70% query coverage (the proportion of the query sequence that aligns to the subject sequence) and a minimum identity of 82% (the highest percent identity for a set of aligned segments to the same subject sequence) was saved. This excluded all bacteria in the databases that were not spirochaetes. If all sequences in a database had below 70% query coverage (i.e. all were short sequences), only the maximum identity was used.
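The filtering rule described above can be sketched in a few lines of Python. This is an illustration only, not the procedure actually run on the NCBI website: the `Hit` record and the `select_hits` function are hypothetical names, and the fallback branch assumes that "only the maximum identity was used" means the 82% identity threshold was applied without the coverage requirement.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    """One BLAST alignment result (hypothetical record for illustration)."""
    subject_id: str
    query_coverage: float  # percent of the query covered by the alignment
    identity: float        # maximum percent identity for this subject


def select_hits(hits, min_coverage=70.0, min_identity=82.0):
    """Keep hits with coverage above min_coverage and identity >= min_identity.

    Assumption: if no hit in the database exceeds the coverage threshold
    (all sequences are short), only the identity threshold is applied.
    """
    covered = [h for h in hits if h.query_coverage > min_coverage]
    candidates = covered if covered else list(hits)
    return [h for h in candidates if h.identity >= min_identity]


hits = [Hit("seq_a", 95.0, 90.0), Hit("seq_b", 60.0, 99.0), Hit("seq_c", 80.0, 75.0)]
kept = [h.subject_id for h in select_hits(hits)]
```

In this example only `seq_a` passes both thresholds; `seq_b` fails on coverage and `seq_c` on identity.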
This was carried out for all databases, including those that specifically examined the effect of diet on the bacterial content of the GI tract.
5.2.4 Determining the relatedness of spirochaetes found in the GI tract of cattle and sheep to DD treponemes
The 16S rRNA gene sequence relatedness of the spirochaetes found in the databases to the DD treponemes and ruminal treponemes was determined using BioEdit (version 7.0.9.0, Hall 1999). BioEdit gave a more accurate estimate of relatedness between the 16S rRNA gene sequences, which helped to determine how much sequence identity the spirochaete sequences shared with the known treponemes.
Each database was opened in BioEdit and all sequences were removed except the highest-scoring alignments to the two Treponema species identified using BLAST. To investigate the relevant 16S rRNA sequences further, BioEdit was used to determine the range of Treponema species in the databases, in particular DD-associated Treponema or treponemes highly similar to DD-associated Treponema (putative DD Treponema). The 16S rRNA sequences were aligned using the multiple sequence alignment program ClustalW 1.81 (Thompson et al. 1994). The alignments were trimmed to the length of the shortest sequence to allow effective comparison of the sequences.
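One way to picture the trimming step is as cutting the alignment down to the columns that every sequence spans. The sketch below is an assumption about what "trimmed to the shortest sequence" amounts to, not a description of the BioEdit procedure; the function name `trim_to_shared_region` and the example sequences are hypothetical, and gaps are assumed to be written as `-`.

```python
def trim_to_shared_region(aligned):
    """Trim an alignment to the columns covered by every sequence.

    aligned: dict mapping sequence name -> gapped sequence; all sequences
    must have the same aligned length. Leading and trailing gaps mark the
    regions a sequence does not span.
    """
    if len({len(seq) for seq in aligned.values()}) != 1:
        raise ValueError("all aligned sequences must have the same length")
    # Latest start and earliest end over all sequences.
    start = max(len(seq) - len(seq.lstrip("-")) for seq in aligned.values())
    end = min(len(seq.rstrip("-")) for seq in aligned.values())
    if start >= end:
        raise ValueError("sequences share no overlapping region")
    return {name: seq[start:end] for name, seq in aligned.items()}


alignment = {
    "reference": "ACGTACGT",
    "clone_1": "--GTAC--",
    "clone_2": "-CGTACG-",
}
trimmed = trim_to_shared_region(alignment)
```

Here every sequence is cut to the four columns spanned by `clone_1`, the shortest sequence.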
Nucleotide sequence identity matrices were produced in BioEdit for the aligned and trimmed sequences. These compared each sequence with every other sequence in the alignment, giving a sequence identity score for each pair between 0.00 (zero identity) and 1.00 (complete identity). The sequence identity matrices were used to determine the relatedness of the spirochaetal bacteria from the databases to the DD treponeme and the GI tract treponeme.
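For illustration, a pairwise identity matrix of this kind can be computed as the fraction of aligned columns in which two sequences carry the same base. The sketch below is not BioEdit's implementation: how BioEdit treats gaps is not stated here, so the choice to skip columns that are gaps in both sequences and to count a gap against a base as a mismatch is an assumption, and the function names are hypothetical.

```python
def pairwise_identity(seq_a, seq_b):
    """Fraction of compared columns where two aligned sequences match (0.0 to 1.0)."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    matches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == "-" and b == "-":
            continue  # assumption: columns gapped in both sequences are ignored
        compared += 1
        if a == b:
            matches += 1
    return matches / compared if compared else 0.0


def identity_matrix(aligned):
    """Return a nested dict: matrix[name_a][name_b] -> identity score."""
    names = list(aligned)
    return {
        a: {b: pairwise_identity(aligned[a], aligned[b]) for b in names}
        for a in names
    }


matrix = identity_matrix({"a": "ACGT", "b": "ACGA", "c": "AC-T"})
```

The diagonal is always 1.00 and the matrix is symmetric, so only one triangle needs to be read.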
Previous phylogenetic analyses of the spirochaetes found two distinct large subgroups of treponemes (Paster et al. 1991; Evans et al. 2011b). Furthermore, among bovine treponeme sequences, the DD treponemes are known to be closely related to one of those subgroups and the GI tract treponemes to the other (Evans et al. 2011b). Paster et al. (1991) found that interspecies similarity within the group in which DD treponemes are located was 89.9% (0.899), interspecies similarity within the GI tract treponeme group was 86.2% (0.862) and interspecies similarity between the two groups was 84.2% (0.842). The Treponema group has an average interspecies similarity of 81.9% (0.819) (Paster et al. 1991).
Using this work, sequences with a similarity of 82% (0.82) or more when compared to the DD or GI tract treponeme were determined to be treponemes. Sequences with a similarity of 90% (0.90) or more when compared to the DD treponeme were considered to be within that subgroup, and those with a similarity of 86.2% (0.862) or more when compared to the GI tract treponeme were considered to be within the commensal GI tract subgroup (Paster et al. 1991). To be considered the same species as the sequence to which they were being compared, sequences had to have a similarity of 97% (0.97) or greater (Goebel and Stackebrandt 1994).
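The thresholds in this paragraph can be summarised as a small classification routine. This is a sketch for illustration, with hypothetical names; it assumes each threshold is inclusive ("or more") and that identity scores are expressed on the 0.00 to 1.00 scale of the identity matrices.

```python
TREPONEME_MIN = 0.82      # treponeme, compared to either reference
DD_SUBGROUP_MIN = 0.90    # DD treponeme subgroup
GI_SUBGROUP_MIN = 0.862   # commensal GI tract subgroup
SAME_SPECIES_MIN = 0.97   # same species as the reference


def classify(identity_to_dd, identity_to_gi):
    """Apply the similarity thresholds to one database sequence.

    identity_to_dd: identity score against the DD treponeme reference.
    identity_to_gi: identity score against the GI tract treponeme reference.
    """
    return {
        "treponeme": max(identity_to_dd, identity_to_gi) >= TREPONEME_MIN,
        "dd_subgroup": identity_to_dd >= DD_SUBGROUP_MIN,
        "gi_subgroup": identity_to_gi >= GI_SUBGROUP_MIN,
        "same_species_as_dd": identity_to_dd >= SAME_SPECIES_MIN,
        "same_species_as_gi": identity_to_gi >= SAME_SPECIES_MIN,
    }


result = classify(identity_to_dd=0.93, identity_to_gi=0.85)
```

With these example scores the sequence is a treponeme within the DD subgroup, but not the same species as either reference.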
Any spirochaetal bacteria with over 97% (0.97) sequence identity to the ruminal treponeme (and therefore part of the same species) were not analysed further, as these were identified as non-DD-associated treponemes.
Spirochaetal bacteria with over 97% sequence identity to any of the DD-associated treponemes, and any spirochaete with a higher sequence identity to the DD-associated treponemes than to the ruminal treponemes, were selected for subsequent phylogenetic tree analysis.
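The selection rule for the phylogenetic tree analysis can be sketched as follows. The function name is hypothetical, and two points are assumptions rather than statements from the text: the exclusion of sequences matching the ruminal treponeme is checked first, and the comparison against the DD-associated treponemes uses the highest identity to any of them.

```python
def select_for_tree(identities_to_dd, identity_to_ruminal, species_cutoff=0.97):
    """Decide whether a spirochaete sequence goes forward to tree analysis.

    identities_to_dd: identity scores (0.0 to 1.0) against each
        DD-associated treponeme reference.
    identity_to_ruminal: identity score against the ruminal treponeme.
    """
    if identity_to_ruminal > species_cutoff:
        return False  # same species as the ruminal treponeme: not analysed further
    best_dd = max(identities_to_dd)
    # Selected if over the species cutoff to a DD treponeme, or simply
    # closer to the DD treponemes than to the ruminal treponeme.
    return best_dd > species_cutoff or best_dd > identity_to_ruminal


examples = {
    "close_to_dd": select_for_tree([0.98], 0.85),
    "nearer_dd": select_for_tree([0.88, 0.91], 0.86),
    "ruminal_species": select_for_tree([0.84], 0.99),
    "nearer_ruminal": select_for_tree([0.83], 0.90),
}
```

Only the first two example sequences would be carried forward.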