ABSTRACT The next generation of QTL (quantitative trait loci) mapping populations have been designed with multiple founders, where one to several generations of intercrossing are introduced prior to the inbreeding phase to increase the accumulated recombinations and thus the mapping resolution. Examples of such populations are the Collaborative Cross (CC) in mice and Multiparent Advanced Generation Inter-Cross (MAGIC) lines in Arabidopsis. The genomes of the resulting inbred lines are fine-grained random mosaics of the founder genomes. In this article, we present a novel framework for modeling ancestral origin processes along two homologous autosomal chromosomes from mapping populations, a major component in the reconstruction of the ancestral origins of each line for QTL mapping. We construct a general continuous-time Markov model for ancestral origin processes, where the rate matrix is deduced from the expected densities of various types of junctions (recombination breakpoints). The model can be applied to monoecious populations with or without self-fertilization and to dioecious populations with two separate sexes. Analytic expressions for map expansions and expected junction densities are obtained for mapping populations that have stage-wise constant mating schemes, such as CC and MAGIC. Our studies on the breeding design of MAGIC populations show that the intercross mating schemes do not matter much for large population size, and that the overall expected junction density, and thus map resolution, are approximately proportional to the inverse of the number of founders.
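As an illustration of such an ancestral origin process, the sketch below simulates a single chromosome as a continuous-time Markov chain along the genetic map. It is a deliberately simplified stand-in for the model in the abstract: the junction rate `rho` and the uniform choice among founders are illustrative assumptions, whereas the actual framework deduces the rate matrix from expected junction densities.

```python
import random

def simulate_ancestry(n_founders, chrom_len, rho, rng):
    """Simulate the ancestral origin process along one chromosome as a
    continuous-time Markov chain: junctions arrive at rate rho per Morgan,
    and each junction switches the origin uniformly to one of the other
    founders.  Returns a list of (start, end, founder) segments."""
    segments = []
    pos, state = 0.0, rng.randrange(n_founders)
    while pos < chrom_len:
        # distance to the next junction is exponentially distributed
        end = min(pos + rng.expovariate(rho), chrom_len)
        segments.append((pos, end, state))
        pos = end
        # jump to a founder different from the current one
        new = rng.randrange(n_founders - 1)
        state = new if new < state else new + 1
    return segments
```

With rho = 5 junctions per Morgan and a 2-Morgan chromosome, a run produces on the order of ten founder segments, a small mosaic of the kind described above.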


(1990), who assumed every homozygote has a fitness 1 − s relative to every heterozygote. In the diffusion limit (large population size and small s), on which both Takahata's and our theories depend, there is no difference in the parameterizations. A value of s in Takahata's parameterization corresponds to s/(1 − s) in ours. New mutations arise and if they become common, they persist in the population as common alleles for a long time. This situation is suitable for Gillespie's (1984, 1991) SSWM approximation. We treat i, the number of alleles, as a random variable for which transitions from one time to the next can be modeled by a Markov chain. In fact, the Markov chain we use when there is one class of alleles is of a particularly simple kind because it allows for an increase or decrease in i by only one.

[…] in a population of constant size N, where S = 2Ns and m = F, the homozygosity. Equation 3 assumes that Ns is large and i is relatively small. When those conditions are not satisfied, as can be the case immediately after an extreme reduction in population size, (3) no longer provides a good approximation and t(x) must be computed numerically from the equation given on p. 2420 of Takahata (1990). We found that, in agreement with Sasaki (1992, Equation 8), for very large values of S a better analytic approximation is given by (3) without the 2 under the square root. But for values of S of practical interest in models of human populations (50 to 200) a number between 1 and 2 provides a better approximation (Takahata 1993; N. Takahata, personal communication).
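The simple Markov chain on the number of alleles i can be sketched as a birth-death process in which i moves up or down by at most one per transition. The transition probabilities p_gain and p_loss below are illustrative placeholders, not the rates derived in the theory.

```python
import random

def step_allele_number(i, p_gain, p_loss, rng):
    """One transition of a birth-death chain on the number of common
    alleles i: i -> i+1 with probability p_gain(i), i -> i-1 with
    probability p_loss(i) (never below one allele), else unchanged."""
    u = rng.random()
    if u < p_gain(i):
        return i + 1
    if u < p_gain(i) + p_loss(i) and i > 1:
        return i - 1
    return i

def trajectory(i0, n_steps, p_gain, p_loss, seed=0):
    """Run the chain for n_steps transitions and return the whole path."""
    rng = random.Random(seed)
    path = [i0]
    for _ in range(n_steps):
        path.append(step_allele_number(path[-1], p_gain, p_loss, rng))
    return path
```

Every step changes i by at most one, which is exactly the "particularly simple kind" of chain the passage refers to.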

ABSTRACT Estimation of epidemiological and population parameters from molecular sequence data has become central to the understanding of infectious disease dynamics. Various models have been proposed to infer details of the dynamics that describe epidemic progression. These include inference approaches derived from Kingman's coalescent theory. Here, we use recently described coalescent theory for epidemic dynamics to develop stochastic and deterministic coalescent susceptible-infected-removed (SIR) tree priors. We implement these in a Bayesian phylogenetic inference framework to permit joint estimation of SIR epidemic parameters and the sample genealogy. We assess the performance of the two coalescent models and also juxtapose results obtained with a recently published birth-death-sampling model for epidemic inference. Comparisons are made by analyzing sets of genealogies simulated under precisely known epidemiological parameters. Additionally, we analyze influenza A (H1N1) sequence data sampled in the Canterbury region of New Zealand and HIV-1 sequence data obtained from known United Kingdom infection clusters. We show that both coalescent SIR models are effective at estimating epidemiological parameters from data with large fundamental reproductive number R0 and large population size S0.
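For readers unfamiliar with the underlying compartmental model, a minimal deterministic SIR trajectory can be computed by Euler integration; the parameter names below are generic and are not the exact parameterization used by the coalescent SIR tree priors.

```python
def sir_trajectory(s0, i0, beta, gamma, dt=0.01, t_max=50.0):
    """Euler integration of the deterministic SIR model
         dS/dt = -beta*S*I,  dI/dt = beta*S*I - gamma*I,  dR/dt = gamma*I,
    with compartments expressed as fractions of the population."""
    s, i, r, t = s0, i0, 0.0, 0.0
    out = [(t, s, i, r)]
    while t < t_max:
        ds = -beta * s * i
        di = beta * s * i - gamma * i
        dr = gamma * i
        s, i, r = s + ds * dt, i + di * dt, r + dr * dt
        t += dt
        out.append((t, s, i, r))
    return out
```

With s0 = 0.99, beta = 2 and gamma = 1, the reproductive number beta*s0/gamma is close to 2, so the infected fraction rises well above its initial value before declining, which is the regime in which the abstract reports the coalescent SIR models performing well.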


Note the dramatic difference between Figure 2 and Figure 8, between expansion from equilibrium and expansion from a bottleneck. Both curves are declining toward the same equilibrium, but the left portion of the curve declines much more slowly after a bottleneck than after expansion from equilibrium. This can only reflect the state of the population just before the increase in size. The initial population of Figure 2 had been small much longer than that of Figure 8. Consequently, it had less heterozygosity. As we see in the next section, this accelerates the rate of decline in LD between closely linked sites. It is sometimes suggested that bottlenecks inflate long-range LD, just the opposite of the pattern seen above (Slatkin 2008, pp. 481–482; Tenaillon et al. 2008). This discrepancy, however, evaporates on close inspection. When


Generations were discrete and population sizes finite. For each set of parameters, even sample sizes from 4 to 60 were simulated, and the average value of F from 10,000 runs was calculated. The program does not allow more than two lineages to coalesce at a time. This deviation from the Wright-Fisher model will cause a negative bias in F, but the effect is negligible for the parameter values we used (results not shown; the sample size has to be large relative to the population size for multiple coalescences to the same individual to matter).
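The simulation design can be sketched with a bare-bones discrete-generation coalescent. Unlike the program described in the text, this naive version does allow more than two lineages to coalesce in one generation, which is exactly the event the authors note is negligible when the sample is small relative to the population.

```python
import random

def generations_to_mrca(sample_size, pop_size, rng):
    """Discrete-generation (haploid Wright-Fisher) coalescent: every
    lineage picks a parent uniformly among pop_size individuals each
    generation, and lineages sharing a parent coalesce.  Unlike the
    program in the text, more than two lineages may coalesce at once."""
    lineages, gens = sample_size, 0
    while lineages > 1:
        lineages = len({rng.randrange(pop_size) for _ in range(lineages)})
        gens += 1
    return gens
```

Averaged over many runs with a sample of two, the result is close to the classical expectation of pop_size generations for a haploid population.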


For a proof of the unbiasedness of N_b under the computer mode (point particles, e.g., Fig. 1b) see Appendix S1 from Cruz et al. (2015). The unbiasedness property is independent of the orientation of the test system relative to the population. If the shape of the convex hull of the population image is nearly rectangular (as in Fig. 1b), however, then it is advisable to avoid parallelism between the edges of the image and the quadrat rows, or columns, in order to avoid an unduly large error variance. This idea is suggested in Fig. (11) from Gundersen et al. (1999). In the present experiment, each of the population images studied was framed by a rectangle, for which we adopted a common tilting angle of 30° with respect to the base of the frame. Cruz and González-Villa (2018a) propose a simple way to choose a grid compatible with preestablished values of the sample size and the number of nonempty quadrats.

From the foregoing analysis it follows that the effective size of a population can become arbitrarily large without the population actually having a large number of […]


It is probably worth noting at this point that the minimum bin size must not correspond to particles below the resolution limit of the imaging technique; otherwise, excessively large error bars will be computed by the method proposed in the previous section. Also, the choice of binning scale is extremely important: a sufficient number of bins must be chosen such that the average size returned by the histogram is as close as possible to the mean of the collected data. If an excessive number of bins is chosen, then the fluctuations in the bin height will increase the uncertainty in the stereologically calculated result. Figure 2 shows the standard deviation of the mean particle size represented by a linearly binned histogram (a) and a logarithmically binned histogram (b), as determined by 5000 simulations of particle size observations with a mean particle diameter of 1.2 microns and a lognormal shape parameter σ of 0.5. Isocontour lines of constant standard deviation regions are shown, with the number of bins plotted on the x-axis and
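The dependence of the histogram-represented mean on the binning can be reproduced with a small simulation in the spirit of the one described. This is a sketch under stated assumptions: a lognormal size distribution with mean 1.2 and shape 0.5, linear binning, and midpoint representation; the figure's exact simulation settings are not reproduced here.

```python
import math
import random

def histogram_mean(data, n_bins):
    """Mean size as represented by a linearly binned histogram: every
    observation is replaced by the midpoint of its bin, so too few bins
    coarsen the average while very many bins add height fluctuations."""
    lo, hi = min(data), max(data)
    width = (hi - lo) / n_bins
    counts = [0] * n_bins
    for x in data:
        counts[min(int((x - lo) / width), n_bins - 1)] += 1
    mids = [lo + (k + 0.5) * width for k in range(n_bins)]
    return sum(c * m for c, m in zip(counts, mids)) / len(data)

# illustrative data: lognormal diameters with shape sigma = 0.5,
# scaled so the distribution mean is 1.2 (microns)
rng = random.Random(42)
scale = 1.2 * math.exp(-0.5 ** 2 / 2)
sizes = [scale * math.exp(0.5 * rng.gauss(0.0, 1.0)) for _ in range(5000)]
```

Repeating this over many simulated datasets and bin counts would produce the standard-deviation contours of the kind shown in Figure 2.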


Analysis of historical effective population size provides a means to compare changes in population dynamic parameters (e.g., population size and growth rate, harvested […]


We now consider the application of these formulas to some specific examples. We focus initially on the effects of population subdivision, assuming a 1:1 sex ratio, Poisson variances in fertility for both sexes, and equal mutation rates for males, females, and different chromosomes. Discrete generations and a deme size of N breeding adults for all demes are also assumed. In this case, the ratios of effective population sizes and within-deme diversities for different modes of inheritance G and G′ (r_GG′) are simply equal to the ratios of […]

[…] 1.33- and 1.5-fold higher than for Y-linked loci, respectively, with the diversity for X-linked loci being 1.12 times that for autosomal loci. Similarly, the limiting ratio of autosomal to cytoplasmic total diversities becomes 0.67. Differences between the pollen and seed migration rates also influence the rate of increase in F_ST. For instance, if migration occurs primarily through pollen (e.g., with m_m = 100 m_z, Figure 1B), F_ST,A and F_ST,X increase faster with decreasing migration than when migration occurs equally through pollen and seeds (Figure 1A), or primarily through seeds (m_m = 0.01 m_z, Figure 1C), whereas F_ST,C increases more slowly, but is higher


for negative I_1 (it is trivially true for positive I_1). This is clearly satisfied when I_1 is near 0 and when it is very large and negative, but possibly can be violated for intermediate values. Thus instability occurs only in a critical range of information I_1, which for some parameters may be empty; if I_1 is large and negative then individuals are very attractive to predators and maximum camouflage is best, while if I_1 is near zero individuals are only slightly attractive to predators but cannot improve things much by changing appearance, so staying at r = 0 is again best. For intermediate values individuals may be able to reduce their attractiveness by moving away from their current appearance, even though they will be discovered by predators more often. r_1 = 0 is more likely to be a solution if the rate of decline of attacks as toxicity increases declines slowly, if predators cannot identify differences between individuals for discriminatory purposes very well, or if camouflage is very effective. As long as a is not very large, the same pattern occurs for non-zero a.


Microsatellites are short tandem repeats that are widely dispersed among eukaryotic genomes. Many of them are highly polymorphic; they have been used widely in genetic studies. Statistical properties of all measures of genetic variation at microsatellites critically depend upon the composite parameter θ = 4Nμ, where N is the effective population size and μ is the mutation rate per locus per generation. Since mutation leads to expansion or contraction of a repeat number in a stepwise fashion, the stepwise mutation model has been widely used to study the dynamics of these loci. We developed an estimator of θ, θ̂_F, on the basis
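A minimal simulation of the stepwise mutation model can make the dynamics concrete. This is a sketch under stated assumptions, a haploid Wright-Fisher population with a symmetric single-step mutation of rate mu; the estimator itself is not reproduced here.

```python
import random

def mutate_repeat(n_repeats, mu, rng):
    """Single-step stepwise mutation: with probability mu the repeat
    number moves up or down by exactly one, with equal chance."""
    if rng.random() < mu:
        return n_repeats + rng.choice((-1, 1))
    return n_repeats

def wright_fisher_smm(pop_size, mu, n_gens, seed=0):
    """Haploid Wright-Fisher population under the stepwise mutation
    model: each offspring copies a random parent, then possibly mutates."""
    rng = random.Random(seed)
    pop = [20] * pop_size  # all alleles start at 20 repeats
    for _ in range(n_gens):
        pop = [mutate_repeat(rng.choice(pop), mu, rng)
               for _ in range(pop_size)]
    return pop
```

The spread of repeat numbers that accumulates in such runs is the quantity whose statistical properties depend on θ = 4Nμ.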


The study of the GHC was described in detail elsewhere [2]. Briefly, the GHC study is a large population-based cohort study supported by the German Federal Ministry of Education and Research. The study was approved by the ethics committee of the University of Duisburg-Essen, Germany. Informed written consent was obtained from all subjects. 18,000 subjects between the ages of 18 and 65 years with German citizenship were randomly selected via postal mail from statutory lists of residence drawn from three cities in Germany: Essen (585,481 residents, large town), Muenster (272,890 residents, medium-sized town), and Sigmaringen (16,501 residents, small town and rural area). We obtained information per mailed questionnaire or telephone interview from 9,944 participants. The questionnaire is based on the ICHD-2 classification criteria of the International Headache Society [3]. A detailed description and validation of the headache-screening questionnaire was published previously [4,5]. In summary, the questionnaire included questions on personal data, inquiry on socio-economic status based primarily on education to avoid direct questions about income, and medical inquiry to diagnose migraine and TTH according to the ICHD-2 classification criteria, as well as questions to ascertain the number of days associated with headache and the use of acute or preventive medication.

sites of the GMR (methodology described in Hearn et al., 2014). Finally, the population size of hammerhead sharks at Darwin Island was estimated using a combination of visual counts and acoustic telemetry (methodology described in Peñaherrera-Palma, 2016). Unlike other underwater census methodologies (such as visual censuses or stereo-cameras), assessment of population size provides information on the number of unique individuals present in an area. This in turn makes it possible to determine with greater certainty the number of individuals that can co-exist in the same place during a defined period of time, and thus to calculate the true existing biomass with greater accuracy.

this model, called the birth-death-collapse model [22]. This model assumes a priori "minimal clusters" of individuals, which can be merged, but not split, by the program. There are several priors specific to species delimitation. Most importantly, the Collapse Weight prior provides information about the likely number of species in a delimitation analysis, where values near 1 mean fewer species. In our analyses, the Collapse Weight prior was estimated and set to a uniform distribution [0, 1]. For the Caparinia dataset, the following models of nucleotide substitution were set for five presumably unlinked loci: TIM (rDNA); TrN (EF1-α); TVM+G (SRP54); TrN+G (HSP70); and TVM+I (cox1). For the two-locus Dermatophagoides dataset, models were as follows: HKY+G (cox1); TIM+G (CPW2). STACEY was run with the strict clock model; the coalescent parameters were set as suggested in the STACEY manual v1.2.3; the MCMC chain length was set to 10^9, sampling every 10^6 generations; 4–7 independent analyses were run to ensure consistency between runs. Runs that converged on a similar distribution were combined. Convergence, mixing, and ESSs were estimated in Tracer v1.6 [71]. For the Caparinia dataset, we evaluated single- and two-species models where Caparinia ictonyctis was either merged with Caparinia tripilis s. str. to form a 'minimal cluster' or these two OTUs were treated separately (see the BPP section above). For the Dermatophagoides dataset, we tested whether Dermatophagoides farinae groups 1 and 2 (DFa, DFb) are one or two species (see the BPP section above). In addition, because of the presence of a large number of individuals, we ran a species discovery analysis, where each individual was treated as a separate 'minimal cluster'. Model comparison was done by using marginal likelihoods (Bayes factors), with standard errors estimated from 16–100 bootstrap replicates in Tracer [71].


The t-distribution (also called Student's t-distribution) is another class of probability distribution commonly used in statistics and epidemiology. The t-distribution looks almost identical to the Normal distribution but has a somewhat shorter, fatter peak with long tails (2). When the sample size is small (n < 30) or when the population standard deviation is unknown, the t-distribution is used instead of the Normal distribution in inferential data analysis. Unlike the Normal distribution, the shape of the t-distribution and the number of observations that lie within a certain SD from the mean vary depending on the degrees of freedom (df). The t-distribution and the associated t-score, in contrast to the (standard) Normal distribution and z-score, respectively, are used in hypothesis testing (using the t-test) and confidence interval estimation. Similar to the z-score, the t-score can be read from t-distribution tables for a given number of degrees of freedom or generated from the t-distribution using computer programs. As the sample size increases, the t-distribution looks more and more like the standard Normal distribution. Degrees of freedom refer to the information we have to estimate a parameter. Some parameter estimates are based on more information than others. For example, in the previous example, there is more information to estimate the mean systolic blood pressure with a sample size of 1,000 than with a sample size of 100. The degrees of freedom (df) of an estimate is the number of independent pieces of information on which the estimate is based. In the estimation of the mean using samples of 1,000 and 100, there are 1,000 and 100 degrees of freedom, respectively, assuming that the individuals in each sample are independent. A detailed description of degrees of freedom will be provided in a coming issue of the EMJ.
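The fatter tails of the t-distribution, and its convergence to the Normal as the degrees of freedom grow, can be checked with a small simulation using the standard construction T = Z / sqrt(V/df); the sample counts and the cutoff of 2 are illustrative choices.

```python
import math
import random

def t_variate(df, rng):
    """One Student's t draw via T = Z / sqrt(V/df), with Z standard
    normal and V a chi-square variate with df degrees of freedom."""
    z = rng.gauss(0.0, 1.0)
    v = sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(df))
    return z / math.sqrt(v / df)

rng = random.Random(7)
# two-sided tail probability P(|T| > 2) for small and large df
tails_t3 = sum(abs(t_variate(3, rng)) > 2 for _ in range(10000)) / 10000
tails_t100 = sum(abs(t_variate(100, rng)) > 2 for _ in range(10000)) / 10000
```

tails_t3 comes out roughly three times tails_t100, while tails_t100 is already close to the standard Normal tail probability, matching the statement that the t-distribution approaches the Normal as the sample size grows.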

In regard to familial migraine molecular genetics, studies performed successfully in other complex disorders (e.g., familial diabetes or obesity) may be a source of inspiration for future migraine genetic research. In familial disease, exome sequencing has been performed in the person with the disease, the parents (where one is affected and the other is not), and another distantly related affected person. With this design, the mutation is assumed to be autosomal dominant. After the exomes of these four cases have been sequenced, many individual variants are expected to be found. The next step is to reduce the number of variants by filtering. The first filters applied are variants taken from HapMap (a database cataloguing all known SNPs in the human genome [42]) or the 1000 Genomes Project, since these variants are not associated with severe disease phenotypes. The next filter is population-specific exome data, because some variants occur only in certain sub-populations and are not disease-causing. After filtering, the remaining variants may be reduced to a few family-specific variants, which are thereafter tested for segregation within the whole family. The idea is that the affected family members should have the variant(s), while they should be absent in the healthy family members. In the end, this approach will yield a few or, at best, one variant. The process is summarized in Fig. 2. Using this design, it may suffice to examine one big family with ten or more affected individuals. These variants are likely to be family-specific and will not be found by GWAS. In cases where no strongly associated variant is found, it is assumed that the variant is not located in the coding regions, and these families become suitable candidates for WGS, when this method becomes more available.
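The filtering pipeline described above amounts to a few set operations. The sketch below uses hypothetical variant identifiers and database contents purely for illustration; the real filters operate on annotated VCF-style data.

```python
def filter_candidate_variants(exomes, common_dbs, affected, unaffected):
    """Sketch of the exome filtering strategy (all identifiers are
    hypothetical).  exomes maps person -> set of variants.  A candidate
    must be shared by all affected individuals, absent from the
    common-variant databases, and absent from every healthy relative
    (autosomal dominant model)."""
    # variants shared by every sequenced affected individual
    shared = set.intersection(*(exomes[p] for p in affected))
    # drop variants catalogued as common (HapMap / 1000 Genomes style
    # plus population-specific exome data)
    rare = shared.difference(*common_dbs)
    # require segregation: absent in all healthy family members
    return {v for v in rare
            if all(v not in exomes[p] for p in unaffected)}
```

Applied to a toy family, the filters narrow many variants down to the one that segregates with disease, mirroring the "few or at best one variant" outcome described in the text.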

HISTORY has been termed the "realm of contingencies" (Kracauer 1969), implying that the course of history seldom obeys deterministic laws. The contributions from both chance and necessity in the evolution of life are the subject of experimental evolution. The experimental evolution protocol involves establishing a population of microorganisms under laboratory conditions. In experiments involving bacteria, these are grown at constant temperature in a sugar-rich broth. When using viruses, the host species, commonly a bacterium, is maintained alongside the phage in culture. A small set of the founder population is sampled and reintroduced into an otherwise identical but unpopulated environment. The new population is allowed to grow and this population then contributes to the next sample. The procedure is repeated for several generations (serial passaging). These experiments aim to recreate conditions promoting adaptive radiation. Below we […] designed to facilitate natural selection in the "test tube," a design in which the fitness criterion is not specified in advance. This is in distinction to artificial selection, where the desired outcome is guided by the investigator. Over the course of 2000 generations there was a significant increase in cell size in all 12 replicates. Furthermore, there was a significant increase in fitness, quantified as the growth rate of an evolved variant placed in the ancestral population. A consistent result across lineages was that the most rapid change in fitness occurred soon after introduction into the experimental environment, during the first 2000 generations. Each lineage appeared to be approaching independent fitness peaks asymptotically. Lenski and Travisano (1994) interpret the common patterns as results of evolution from a single founder combined with large population sizes, whereas any differences are attributed to chance events during the adaptive evolution.


Following Berry et al. (1994), we take (α_1, β_1) = (1, 200) and (α_2, β_2) = (5, 667), the latter corresponding to the placebo (note that the β_i values given by Berry et al. are 16 times those used here, as they take ξ_i to give the rate of cases per child-month). Berry et al. report that approximately 5400 Navajo are born each year, so that minimization of HIB cases over a 20-year period would correspond to N = 108,000. Figure 6 shows a contour plot giving the prior expected gain for this N for a range of n_1 and n_2 values (plotted on logarithmic scales), together with the approximation given by (7) (dashed lines). It can be seen that even for small sample sizes, (7) gives a close approximation to the true prior expected gain. The optimal design has n_1 = 3162 and n_2 = 1585, and is marked by the plus sign. The approximately optimal design given by (10) and (11) has n*_1 = 3524 and n*_2 = 2089, and is marked by a circle. The prior expected gain, in this case corresponding to minus one times the prior expected number of HIB cases in the population over the 20-year period, is −416.9 using the optimal design and −417.4 using the approximately optimal design.


There are many scientific studies and theories about the size of groups and the degree of action of these groups. Mancur Olson (1965) is one of the most famous scholars of collective action (Lupia & Sin, 2003). According to Olson (1965), the size of a group influences its degree of action. In his book 'The Logic of Collective Action', Olson (1965) claims that, in contrast with larger groups, smaller groups will be more effective and will act more willingly. In the case of large groups, the benefits for every individual member are smaller, members are less motivated, and the organizational costs are higher. Olson (1965) claims that the benefits in smaller groups will be higher than the costs. When the group is larger, the costs of coordinating behaviour and the costs of the formal organization will be higher too: this leads to a higher barrier to collective action in larger groups (Lupia & Sin, 2003). This is in line with the statement of Buchanan & Tullock (1962).
