
polymorphism (SRLP): A novel universal marker system

S. chrysanthemifolius 0.223 0

4.4.4 Combining Datasets

As shown in the study presented here, some very useful patterns of variation emerged from the analysis of combined datasets, making clear that surveys of snoRNA gene/gene cluster length variation are useful for examining phylogenetic relationships within closely related groups exhibiting some reticulate evolution. However, one crucial issue that arises when different single datasets are combined is that each dataset reflects a particular phylogenetic history, which might or might not be similar to that reflected by other datasets (Tateno et al., 1982). Incongruence between different datasets emerges through various processes (Meng & Kubatko, 2009). Furthermore, a subset of the datasets produced might already be enough to represent the relationships between the species. Additionally, different datasets might be suited to different analyses. For example, some datasets might be more useful for investigating more distantly related species, whereas others might be able to separate more closely related species. In this study, all variable datasets were subjected to various analyses (FFA, NJ, PCO, AMOVA, STRUCTURE) to explore their variability and their ability to cluster certain groups and to provide diagnostic hybrid-parent and species-specific fragments. Furthermore, the NJ trees of 8 single datasets (made up of 6 snoRNA gene clusters) were compared with each other and with the NJ tree of the combined matrix using TREEDIST, which showed that the combined dataset is most similar to the snoR37/snoR22 dataset and most different from the U18/U54 matrix. The snoR37/snoR22 primer combination shows similar among-species variation and is able to group most of the S. vulgaris/S. cambrensis samples and to separate S. aethnensis, S. chrysanthemifolius and S. squalidus, albeit with great overlap. Therefore, all these methods might help in choosing regions for further investigation.
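The comparison of single-dataset trees with the combined tree can be illustrated with a minimal sketch. The settings used in TREEDIST are not stated here, so the sketch assumes a purely topological measure, the symmetric difference (Robinson-Foulds distance), computed on unrooted trees written as nested tuples. The function names and the taxon labels A to E are hypothetical placeholders and do not represent the data of this study.

```python
def leaf_sets(tree, out):
    """Return the leaves under `tree`; append the leaf set of every internal node to `out`."""
    if isinstance(tree, str):  # a leaf is a taxon label
        return frozenset([tree])
    leaves = frozenset().union(*(leaf_sets(child, out) for child in tree))
    out.append(leaves)
    return leaves


def splits(tree):
    """Return the non-trivial bipartitions of a tree, treated as unrooted.

    Each bipartition is stored as the side that does NOT contain a fixed
    reference taxon, so the same split always gets the same representation.
    """
    clades = []
    all_leaves = leaf_sets(tree, clades)
    reference = min(all_leaves)
    result = set()
    for clade in clades:
        side = all_leaves - clade if reference in clade else clade
        # Splits separating a single taxon (or nothing) are shared by all trees.
        if 1 < len(side) < len(all_leaves) - 1:
            result.add(side)
    return result


def symmetric_difference(tree_a, tree_b):
    """Number of bipartitions found in exactly one of the two trees."""
    return len(splits(tree_a) ^ splits(tree_b))


# Hypothetical topologies (assumption for illustration only).
combined_tree = ((("A", "B"), "C"), ("D", "E"))
single_tree = ((("A", "C"), "B"), ("D", "E"))
print(symmetric_difference(combined_tree, single_tree))
```

A smaller distance means that the single-dataset tree shares more groupings with the combined tree. Both trees must contain the same set of taxa for the comparison to be meaningful.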

4.5 Conclusion

In this study, fragment length variation of an initial set of snoRNA genes/gene clusters was tested for its application in phylogenetic studies using a variety of Senecio species. All primer pairs were designed using Arabidopsis thaliana sequences and most of them showed amplification in the majority of species using a standardized protocol. The fragment profiles produced showed variation between and within species and, by combining some of the datasets, the results obtained were in accordance with previous studies, mostly based on RAPD, AFLP and RAPD/ISSR markers, in the delimitation of species and detection of reticulate evolution. Therefore, snoRNA gene/gene cluster fragment length polymorphisms (SRFLPs) can be used as a universal marker system for studying phylogenetic relationships between closely related species. However, to confirm that the amplification products are snoRNA genes/gene clusters, these fragments should be sequenced. Sequencing would also provide information on the number of gene copies present and the sequence variation between orthologous and putative paralogous genes, and might be used for isolating single-copy regions which could then be used as codominant markers. Because snoRNA genes and gene clusters are spread across the whole genome, this marker system might also be used in the future for comparative mapping and to study the evolution of genes and genomes.


Chapter 5: SnoRNA genes and gene clusters in Senecio

5.1 Introduction

Although the amplification success in Senecio of primer pairs based on snoRNA sequences in Arabidopsis thaliana suggests a similar gene cluster organisation in both genera, differences are possible in the number and size of fragments amplified. While the number of fragments amplified can be used to estimate the number of putative gene/gene cluster copies, size differences might reflect gene reorganisation within a cluster (e.g. differences in gene order, gene losses, as well as duplications and inversions). For example, a primer pair might produce a fragment in Senecio similar in size to the fragment expected in A. thaliana but, in addition, might amplify an extra and much longer fragment. This would suggest either two gene copies, one similar to that in A. thaliana and one with a long intergenic region, or a tandem repeat duplication of one gene, which is probably more likely. The major aim of the work reported in this chapter was to characterize snoRNA genes and gene clusters in Senecio species and to determine differences in the organisation of snoRNA gene clusters relative to those in A. thaliana.

5.2 Material and Methods

SnoRNA gene clusters in Senecio were characterized by comparing the sizes of high and moderately frequent fragments, particularly from the diploid species S. aethnensis, S. chrysanthemifolius and S. squalidus, with fragments amplified by the same primers from A. thaliana and other species using ePCR. BLAST searches based on A. thaliana snoRNA gene/gene cluster sequences were also performed (see Chapter 3). Various snoRNA genes plus primer sites within these sequences were identified, their organisation examined, and the sizes of possible PCR amplification fragments, together with sizes of genes and intergenic regions, were calculated. Most gene sizes should be relatively constant across species and, therefore, intergenic regions within Senecio were estimated by assuming gene sizes similar to those in other species, particularly A. thaliana. Some genes might show greater size variation and these were characterized using the gene size of the most similar fragments. Overall, fragment length differences observed between species were assumed to be almost entirely intergenic or the result of differences in gene cluster organisation. It should be noted that most primer pairs should bind within two neighbouring genes (i.e. neighbours in A. thaliana) and, thus, should only amplify one intergenic region. Some gene clusters were examined by more than one primer pair, thus allowing a more reliable characterization of these clusters.
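The estimation of intergenic region size described above amounts to simple arithmetic, sketched here for one primer pair binding within two neighbouring genes. This is a minimal illustration rather than the procedure actually used: the function name, its parameters and all numbers are hypothetical, and the genic portions are assumed to be taken from the reference species (e.g. A. thaliana).

```python
def estimate_intergenic_bp(fragment_bp, upstream_genic_bp, downstream_genic_bp):
    """Estimate the intergenic length covered by one amplified fragment.

    fragment_bp         -- observed fragment length in the species of interest
    upstream_genic_bp   -- assumed length from the start of the forward primer
                           to the end of the first gene (from the reference)
    downstream_genic_bp -- assumed length from the start of the second gene
                           to the end of the reverse primer (from the reference)
    """
    intergenic = fragment_bp - upstream_genic_bp - downstream_genic_bp
    if intergenic < 0:
        # A negative value means the assumed gene sizes cannot hold for this
        # fragment, e.g. because the gene itself varies in size.
        raise ValueError("assumed genic portions exceed the fragment length")
    return intergenic


# Hypothetical values for illustration only (not measured data).
print(estimate_intergenic_bp(fragment_bp=250, upstream_genic_bp=60, downstream_genic_bp=70))
```

Because the genic portions are held constant, any length difference between two fragments from the same primer pair is attributed entirely to the intergenic region, which mirrors the assumption stated in the text.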

5.3 Results

BLAST searches based on A. thaliana snoRNA gene cluster sequences resulted in the identification of sequences from various species. The ESTs obtained may not be full-length sequences, with some lacking 5’ and 3’ ends. However, within these sequences, snoRNA genes, plus intergenic and primer sequences, were identified and their lengths were calculated. For the snoRNA genes and gene clusters detected, see the Tables and Figures in the appendix; gene clusters M and N were previously shown as clusters D and E in Chapter 3. As expected, all but one of the snoRNA genes found in different species were relatively constant in size and most variation was due to intergenic size variation. The box C/D snoRNA gene U49, which is present in three copies in A. thaliana, differed considerably in size, ranging from 75 bp in Helianthus paradoxus to 246 bp in A. thaliana. The organisation of most gene clusters appears to be strictly conserved, with differences evident for only a few of the species examined. For example, in Brassica oleracea the order of two adjacent genes of cluster A, i.e. snoR4 and U31, was inverted (see appendix, Figure A.16).

By comparing fragment sizes obtained from all primer pairs of a cluster and assuming the gene sizes and organisation existing in A. thaliana, it is possible to estimate the size of intergenic regions which could accommodate additional genes. Furthermore, possible tandem repeats, gene losses and inversions might be identified. As an example, the reconstruction of gene cluster A is shown below. This cluster was chosen because three different primer pairs amplified it successfully, thus providing a particularly complete picture of it in Senecio.
