population structure

Nonparametric approaches for population structure analysis

The analysis of population structure has many applications in medical and population genetic research. Such analysis is used to provide clear insight into the underlying genetic population substructure and is a crucial prerequisite for any analysis of genetic data. The analysis involves grouping individuals into subpopulations based on shared genetic variations. The most widely used markers to study the variation of DNA sequences between populations are single nucleotide polymorphisms. Data preprocessing is a necessary step to assess the quality of the data and to determine which markers or individuals can reasonably be included in the analysis. After preprocessing, several methods can be utilized to uncover population substructure, which can be categorized into two broad approaches: parametric and nonparametric. Parametric approaches use statistical models to infer population structure and assign individuals into subpopulations. However, these approaches suffer from many drawbacks that make them impractical for large datasets. In contrast, nonparametric approaches do not suffer from these drawbacks, making them more viable than parametric approaches for analyzing large datasets. Consequently, nonparametric approaches are increasingly used to reveal population substructure. Thus, this paper reviews and discusses the nonparametric approaches that are available for population structure analysis along with some implications to resolve challenges.

The effect of clumped population structure on the variability of spreading dynamics

We wish to consider the limit as the number of patches, m, becomes large but where the patch size, n, remains finite and often relatively small. Note that for such a population structure, individuals are homogeneous, and the network topology (defined by within- and between-clump contact rates) also exhibits the desirable ‘small worlds’ property of high clustering and low shortest path lengths (Travers and Milgram, 1969; Watts and Strogatz, 1998). Many real-world populations exhibit this clumped population structure, where strong links exist within a clump and weaker links exist between clumps. High-profile examples include the transmission of avian influenza in poultry aggregated into sheds (Savill et al., 2006), spread of Foot-and-Mouth disease in livestock aggregated into farms (Tildesley et al., 2006), transmission of measles by children aggregated into schools (Riley et al., 1978), and the spread of a number of infections such as pandemic influenza through human populations that are aggregated into households (Black et al., 2013). For this type of clumped population, expressions for the probability of a major outbreak (successful invasion) and the final size of an epidemic (and hence results for percolation as a

Population Structure and Association Analysis in a Diverse Population of Eastern North American Winter Wheat.

Principal component analysis (PCA) was introduced in 1901 by Karl Pearson (Pearson 1901) as a method to convert a set of possibly correlated variables into orthogonal (uncorrelated) linear combinations of those variables, the principal components, such that the total original variance is preserved. The principal components are ordered by decreasing explained variance. In plant breeding, the genetic marker data can be subjected to PCA, and the “top” one or more principal components may then be included to adjust the genotypes and phenotypes of the population based on coancestry (Zhu et al. 2002; Price et al. 2006; Myles et al. 2009; Sneller et al. 2009; Price et al. 2010). As was pointed out by Price et al. (2010), it is important to recognize that the top PCs may not reflect population structure; rather, they may reflect family relatedness (Patterson et al. 2006), long-range LD (Tian et al. 2008), or assay artifacts (Clayton et al. 2005). These effects can often be eliminated by removing related samples, regions of long-range LD, or low-quality data, respectively, from the population and genetic data.
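
To make the PCA step in this excerpt concrete, here is a minimal sketch of computing first-principal-component scores from marker data. It is an illustration under stated assumptions, not the procedure of any cited paper: genotypes are assumed to be coded as allele counts (0, 1 or 2), markers are only mean-centered (no allele-frequency scaling), and the leading component is found by power iteration rather than a full eigendecomposition. The function name `first_pc_scores` is hypothetical.

```python
import math
import random


def first_pc_scores(genotypes, iterations=200, seed=0):
    """Return each individual's score on the first principal component.

    genotypes: list of rows (one per individual); each row is a list of
    allele counts (assumed 0, 1 or 2), one per marker.
    """
    n = len(genotypes)
    m = len(genotypes[0])

    # Center every marker (column) on its mean across individuals.
    means = [sum(row[j] for row in genotypes) / n for j in range(m)]
    centered = [[row[j] - means[j] for j in range(m)] for row in genotypes]

    # Power iteration on X^T X (proportional to the marker covariance
    # matrix) to approximate the leading loading vector v.
    rng = random.Random(seed)
    v = [rng.gauss(0.0, 1.0) for _ in range(m)]
    for _ in range(iterations):
        xv = [sum(x * w for x, w in zip(row, v)) for row in centered]
        v = [sum(centered[i][j] * xv[i] for i in range(n)) for j in range(m)]
        norm = math.sqrt(sum(w * w for w in v))
        if norm == 0.0:
            raise ValueError("no variance in the genotype matrix")
        v = [w / norm for w in v]

    # Project each centered individual onto the loading vector.
    return [sum(x * w for x, w in zip(row, v)) for row in centered]
```

The returned scores could then be used as a covariate, in the spirit of the adjustment described above. The sign of a principal component is arbitrary, so only the relative positions of individuals are meaningful.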

Admixture, Population Structure, and F-Statistics

study them under a wide range of population structure models. I then review some basic properties of distance-based phylogenetic trees, show how the admixture tests are interpreted in this context, and evaluate their behavior. Many of the results that are highlighted here are implicit in classical (Wahlund 1928; Wright 1931; Cavalli-Sforza and Edwards 1967; Felsenstein 1973, 1981; Cavalli-Sforza and Piazza 1975; Slatkin 1991; Excoffier et al. 1992) and more recent work (Patterson et al. 2012; Pickrell and Pritchard 2012; Lipson et al. 2013), but often not explicitly stated or given in a different context.

The population genomics of begomoviruses: global scale population structure and gene flow

Whereas our results are consistent with the notion that host-range differences might underlie much of the minor sub-population structure we have uncovered, it must be pointed out that viruses from many “narrow-host range” sub-populations infect the same individual plant species as viruses sampled from “broad host range” sub-populations. There are therefore presumably at least some opportunities for gene flow amongst these populations in nature. This then suggests that genetic barriers to genetic exchange, in addition to host range barriers, may underlie some of the genetic cohesiveness of many sub-populations. It is known that the viability of recombinant viruses is influenced by the relatedness of their parents and that strong purifying selection probably operates against the survival of recombinants with defective intra-protein and inter-genome region interactions [46,47]. Thus purifying selection acting against gene flow between sub-populations is likely to be at least partially responsible for the absence of admixture observed in some sub-populations. For example, despite its members co-circulating with, and infecting the same host species as other Af-Med and eAf-CAS minor sub-populations, the minor sub-population containing ACMV contains almost no evidence of admixture with any other Af-Med or eAf-CAS minor sub-populations. This result is consistent with recombination analyses which have found that whereas ACMV has occasionally donated genetic material to circulating recombinant viruses there are no known instances of predominantly ACMV genomes acting as acceptors of foreign genetic material [48]. It must, however, be stressed that while our results are consistent with the existence of genetic barriers to the flow of genetic material into sub-populations displaying low degrees of admixture, it remains to be experimentally confirmed whether or not viruses such as ACMV are particularly intolerant of inheriting genetic material from viruses belonging to different sub-populations.

Genetic Diversity and Population Structure of Teosinte

Figure 2. Unrooted phylogeny of individual teosinte plants using the Fitch-Margoliash method and the log-transformed proportion of shared-allele distance among 93 microsatellite loci. The tree contains 237 individuals. A large H indicates plants identified as being of putative hybrid origin by population structure analysis. Z. mays ssp. parviglumis: (B) Central Balsas, (E) eastern Balsas, (J) Jalisco, (O) Oaxaca, (S) South Guerrero. Z. mays ssp. mexicana: (C) Central Plateau, (H) Chalco, (D) Durango, (N) Nobogame, (P) Puebla. (U) Z. mays ssp. huehuetenangensis, (Xg) Z. luxurians (Guatemala), (Xn) Z. luxurians (Nicaragua), (R) Z. diploperennis, (Z) Z. perennis.

Analysis of Population Structure in Autotetraploid Species

Population structure parameters commonly used for diploid species are reexamined for the particular case of tetrasomic inheritance (autotetraploid species). Recurrence equations that describe the evolution of identity probabilities for neutral genes in an “island model” of population structure are derived assuming tetrasomic inheritance. The expected equilibrium value of F_ST is computed. In contrast to diploids, the

Recombination and population structure in Salmonella enterica

Salmonella enterica is a bacterial pathogen that causes enteric fever and gastroenteritis in humans and animals. Although its population structure was long described as clonal, based on high linkage disequilibrium between loci typed by enzyme electrophoresis, recent examination of gene sequences has revealed that recombination plays an important evolutionary role. We sequenced around 10% of the core genome of 114 isolates of enterica using a resequencing microarray. Application of two different analysis methods (Structure and ClonalFrame) to our genomic data allowed us to define five clear lineages within S. enterica subspecies enterica, one of which is five times older than the other four and two thirds of the age of the whole subspecies. We show that some of these lineages display more evidence of recombination than others. We also demonstrate that some level of sexual isolation exists between the lineages, so that recombination has occurred predominantly between members of the same lineage. This pattern of recombination is compatible with expectations from the previously described ecological structuring of the enterica population as well as mechanistic barriers to recombination observed in laboratory experiments. In spite of their relatively low level of genetic differentiation, these lineages might therefore represent incipient species.

A Unified Characterization of Population Structure and Relatedness

As Astle and Balding (2009) noted, “population structure and [cryptic] relatedness are different aspects of a single confounder: the unobserved pedigree defining the (often distant) relationships among the study subjects.” A similar point was made by Kang et al. (2010): “The presence of related individuals within a study sample results in sample structure, a term that encompasses population stratification and hidden relatedness.” Our goal is to provide a unified approach to characterizing population structure and individual relatedness and inbreeding, in terms of both the underlying parameters and the methods of estimation. By working with proportions of pairs of alleles that match, or are the same type, we can give a single estimator for F_ST; where the pairs are from the same or different populations,
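
The excerpt breaks off before the estimator is stated, so the following is only a simplified sketch of the general idea of building F_ST from allele-matching proportions. It assumes population allele frequencies are known exactly (no finite-sample correction), averages matching within populations and between pairs of populations, and combines them as (within − between) / (1 − between). That combination and the function name `matching_fst` are assumptions made for illustration, not a reproduction of the article's estimator.

```python
def matching_fst(pop_freqs):
    """Sketch of an allele-matching F_ST from known allele frequencies.

    pop_freqs: list over populations (at least two); each population is a
    list over loci; each locus is a dict mapping allele -> frequency.
    """
    r = len(pop_freqs)
    if r < 2:
        raise ValueError("need at least two populations")
    n_loci = len(pop_freqs[0])

    within = 0.0   # probability two alleles from the same population match
    between = 0.0  # probability two alleles from different populations match
    for locus in range(n_loci):
        freqs = [pop[locus] for pop in pop_freqs]
        within += sum(sum(p * p for p in f.values()) for f in freqs) / r

        pair_total = 0.0
        pairs = 0
        for i in range(r):
            for j in range(i + 1, r):
                alleles = set(freqs[i]) | set(freqs[j])
                pair_total += sum(
                    freqs[i].get(a, 0.0) * freqs[j].get(a, 0.0) for a in alleles
                )
                pairs += 1
        between += pair_total / pairs

    within /= n_loci
    between /= n_loci
    if between == 1.0:
        raise ValueError("all populations fixed for the same alleles")
    return (within - between) / (1.0 - between)
```

Under these assumptions, two populations with identical frequencies give 0, and two populations fixed for different alleles give 1. A real analysis would work from sampled genotypes and correct for sample size.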

Stability of the pneumococcal population structure in Massachusetts as PCV13 was introduced

We have previously collected community samples of pneumococci from young children in Massachusetts in 2001, 2004, 2007, and 2009 and used MLST to determine the population structure and detailed molecular epidemiology [6-8]. Here, we report the results of MLST analyses of the 2011 samples, showing the further evolution of the pneumococcal population compared to samples from the same communities in 2009 [8]. While the overall carriage prevalence did not change substantially from 28.8% in 2009 to 31.5% in 2011, changes have already been observed in the prevalence of vaccine serotype carriage. Specifically, a substantial decline in 19A and the emergence of the interconverting 15B/C as the most common serotype have been observed in the early stages following the introduction of PCV13 in Massachusetts [14]. These results will allow us to compare future trends and changes in the pneumococcal population with continued PCV13 use.

Population structure of Sclerotinia homoeocarpa from turfgrass

Many tools have been employed to investigate the genetic diversity and population structure of fungal pathogens. Currently, one of the most widely accepted and utilized methods is multilocus sequence analysis. The analysis of multiple housekeeping genes has become a widely used tool in studying taxonomic relationships (15). This tool is also well suited to examining genetic diversity among populations. In fact, it has been theorized that a small number of carefully selected gene sequences could equal or surpass the precision produced by analyses of whole-genome relatedness (26). Multilocus sequence analysis is easier and less costly than sequencing the whole genome of an organism, yet it can still provide useful information about the organism. Data seen depicting certain

Population structure of palms in rainforests frequently impacted by cyclones

The sampling strategy using 10 m × 10 m study plots was chosen because larger plots or transects could not be used in cyclone-damaged forest. The composition of palm populations was determined for three life stages: seedlings, juveniles and mature plants. Palm densities were also determined. The immediate impact up to 1.5 years after the strike of Cyclone Larry (20 March 2006) on the population structures of Sites A1, C and D was assessed in July 2007; Sites A2 and B were assessed in August 2008 and Site E in May 2008. For the purposes of data collection on population structure, one cluster of palms was counted as one individual (Table 1). Data on regeneration from the eighteen permanent plots (seedling recruitment and survivorship as part of recovery) were published in Latifah et al. 2016.

Population Structure and Eigenanalysis

We can only uncover structure in the samples being analyzed. As pointed out in [39], the sampling strategy can affect the apparent structure. Rosenberg et al. [29] give a detailed discussion of the issue, and of the question of whether clines or clusters are a better description of human genetic variation. However, our “axes of variation” are likely to be relatively robust to this cline/cluster controversy. If there is a genetic cline running across a continent, and we sample two populations at the extremes, then it will appear to the analyst that the two populations form two discrete clusters. However, if the sampling strategy had been more geographically uniform, the cline would be apparent. Nevertheless, the eigenvector reflecting the cline could be expected to be very similar in both cases. Our methods are conceptually simple, and provide great power, especially on large datasets. We believe they will prove useful both in medical genetics, where population structure may cause spurious disease associations [1,40–43]; and in population genetics, where our statistical methods provide a strong indication of how many axes of variation are meaningful. A parallel paper [14] explores applications to medical genetics.

Population Structure and Dynamics of Magnaporthe grisea in the Indian Himalayas

The study of microbial populations constitutes an intriguing dimension of population genetics. Unlike diploid and obligate sexually reproducing organisms upon which much population genetics theory is based, bacteria and many fungi are haploid and have asexual clonal propagation as an important or, for some species in the Fungi Imperfecti, exclusive reproductive strategy. The significance of sexual recombination in many microorganisms can be obscured by the degree of asexual reproduction in nature. Thus, the relative importance of sexual vs. asexual reproduction in determining microbial population structure and the means to detect their contributions have been topics of lively debate (Tibayrenc et al. 1991; Andrivon and Vallevielle-Pope 1993; Maynard Smith et al. 1993; Kohn 1995; Zeigler 1998). An important aspect that has not been well addressed is whether all populations within a microbial species are necessarily the same with respect to their capacity for, and the frequency of, genetic recombination. Indeed, Leslie and Klein (1996) propose that sexual recombination in filamentous fungal species that are believed to have completely lost this capacity (the Fungi Imperfecti) actually may be retained in populations residing near their centers of origin or where environments are heterogeneous and variable. The occurrence, frequency, and distribution of genetic recombination in Fungi Imperfecti is of practical significance because many devastating diseases of agricultural crop species are caused by members of this group. We therefore chose an important fungal pathogen of rice (Oryza sativa L.) to ask whether in a species believed to reproduce only asexually in nature there are populations whose structures may be affected by recombination; and, if so, how may the contributions of sexual and asexual reproduction to population structure be distinguished.

Inference of Population Structure Using Multilocus Genotype Data

We describe a model-based clustering method for using multilocus genotype data to infer population structure and assign individuals to populations. We assume a model in which there are K populations (where K may be unknown), each of which is characterized by a set of allele frequencies at each locus. Individuals in the sample are assigned (probabilistically) to populations, or jointly to two or more populations if their genotypes indicate that they are admixed. Our model does not assume a particular mutation process, and it can be applied to most of the commonly used genetic markers, provided that they are not closely linked. Applications of our method include demonstrating the presence of population structure, assigning individuals to populations, studying hybrid zones, and identifying migrants and admixed individuals. We show that the method can produce highly accurate assignments using modest numbers of loci, e.g., seven microsatellite loci in an example using genotype data from an endangered bird species. The software used for this article is available from http://www.stats.ox.ac.uk/~pritch/home.html.
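
The probabilistic assignment idea in this abstract can be illustrated with a much-reduced sketch. It assumes the allele frequencies of each of the K populations are already known, that genotypes within a population follow Hardy-Weinberg proportions, that loci are independent (in line with the abstract's "not closely linked" condition), and that the individual is not admixed. The article's method infers frequencies and assignments jointly and handles admixture; none of that is reproduced here. The function name `assignment_posteriors` is hypothetical.

```python
import math


def assignment_posteriors(genotype, pop_allele_freqs, prior=None):
    """Posterior probability that one individual comes from each population.

    genotype: list over loci of (allele_a, allele_b) tuples (diploid).
    pop_allele_freqs: list over K populations; each is a list over loci of
        dicts mapping allele -> frequency (assumed known).
    prior: optional list of K positive prior probabilities (default: uniform).
    """
    k = len(pop_allele_freqs)
    if prior is None:
        prior = [1.0 / k] * k

    log_post = []
    for pop, pr in zip(pop_allele_freqs, prior):
        total = math.log(pr)
        for (a, b), freqs in zip(genotype, pop):
            pa = freqs.get(a, 0.0)
            pb = freqs.get(b, 0.0)
            # Hardy-Weinberg genotype probability at this locus.
            p = pa * pb if a == b else 2.0 * pa * pb
            total += math.log(p) if p > 0.0 else float("-inf")
        log_post.append(total)

    top = max(log_post)
    if top == float("-inf"):
        raise ValueError("genotype has zero probability in every population")
    # Normalize in log space to avoid underflow with many loci.
    weights = [math.exp(lp - top) for lp in log_post]
    norm = sum(weights)
    return [w / norm for w in weights]
```

With many loci the per-locus probabilities multiply to very small numbers, which is why the sketch accumulates logarithms and normalizes at the end.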

Population structure of graptolite assemblages

There have been few attempts to analyse the population structure of graptolites. The work that has been done (e.g. Palmer 1986; Rigby 1993) has focused on faunas on individual bedding planes, with results probably being more relevant to taxonomic and taphonomic studies than palaeoecology. Among other fossil organisms, studies of population dynamics have been more extensive, but have generally concentrated on community analysis or variations within a population for purely taxonomic ends. These studies of fossil populations have concentrated on shelly groups such as brachiopods (Cate & Evans 1992), foraminifera (Kaesler & Fisher 1969), ostracods (Kurten 1964) and bivalves (Snyder & Bretsky 1971). These are all benthic and either sedentary or poorly mobile, thus being subject to very different ecological controls to graptolites. Population structures of these organisms are thus not directly comparable to those of graptolites. It should be noted that the term survivorship has also been used to refer to taxonomic extinction rates (e.g. Pearson 1992). Although using similar techniques, species level survivorship cannot readily be compared to the survivorship of individuals within a species.

A genomic overview of the population structure of Salmonella

We have pursued still another alternative, a soft core genome: in Salmonella, this consists of 3,002 genes (Table 1) that were found to be present in 98%, intact in 94%, and of unexceptional diversity in 3,144 representative Salmonella genomes [22]. Publicly accessible websites based on the software framework BigsDB [27] present cgMLST schemes and their alleles for a variety of bacterial species [26,28], but not Salmonella, Escherichia, or Yersinia. We therefore developed a website, EnteroBase, which scours all short-read archives for sets of Illumina short reads from Salmonella, Escherichia, Yersinia, Moraxella, and Clostridioides and supports uploading short reads by registered users. It assembles and polishes genomic contigs from the short reads within 2 hours and calculates MLST assignments from those genomes at the levels of legacy MLST, ribosomal gene MLST (rMLST) [6], soft cgMLST, and wgMLST (Table 1; Box 1) [22]. EnteroBase also supports calling SNPs from up to 1,000 genomes against a reference genome as well as the graphic evaluation of genetic relationships between entries. Here, we re-examine the population structure of Salmonella based on the genomic contents of more than 110,000 Salmonella genomes in EnteroBase (Fig 1B).

Genetic Diversity and Population Structure of Vibrio cholerae

the rfb region of V. cholerae, based on studies of relatively small numbers of strains and serogroups. Our observation that strains of the same serogroup frequently are found in divergent, even distantly related, lineages supports earlier evidence (2, 19) that the rfb genes are subject to horizontal transfer and further suggests that this process occurs with relatively high frequency. Convergence in serotype is, of course, an alternative explanation, but reasoning by analogy from the lack of evidence for convergence in epitope structure in the serologically diverse flagellins of S. enterica (22), we favor the first hypothesis. The issue can be settled by comparative sequencing of the epitope-encoding segments of the rfb region.

Correlation of Rhs elements with Escherichia coli population structure.

This correlation exists at several levels: the presence of Rhs core homology in the strain, the location of the Rhs elements present, and the identity of th[r]
