• No results found

2.4 Fixed Budget

3.1.1 Selection

As mentioned above, our aim is to analyse the change in allele frequencies. However, selection acts mainly at the genotype level1and therefore we have to establish a relationship between allele and genotype frequencies. Hardy (1908) and Weinberg (1908) independently showed that under random mating (and absence of mutation, selection and migration) the genotype frequencies only depend on the allele frequencies. Furthermore, they remain constant during following generations, proving that diversity is preserved (under those conditions).

Definition 3.1 (Hardy-Weinberg (H-W) Frequencies). Let Ai be an allele i and pi its fre-quency. Then its genotype frequenciesFreq(AiAj) are

Freq(AiAi) = p2i

Freq(AiAj) = 2pipj ( j ̸= i) Freq(AjAj) = p2j.

Note that the order of the alleles in a genotype does not matter since it only represents if the allele comes from the mother or from the father. Then we do not distinguish between heterozygotes (AiAj= AjAi) and hence the factor 2 in the frequency Freq(AiAj). If we were interested in knowing the origin of the allele Ai, we would have to distinguish between AiAj and AjAi, and also between their frequencies Freq(AiAj) and Freq(AjAi). However, in most populations the number of genotypes AjAiis the same as the number of genotypes AiAjand the distinction is not necessary (see e.g. Chapter 2 in Crow and Kimura, 2009).

With the relation between genotype and allele frequencies we just have to define the notion of fitness within biology. The following definition is adapted from Chapter 2 in the textbook by Rice (2004).

Definition 3.2 (Fitness). Let P be a population of individuals with genotypes AiAj. Then we can define the following fitness measurements:

• Fitness of an individual, F: is the reproductive contribution of an individual to the next generation.

1Strictly speaking it acts on individuals through their phenotype, which is determined by the genotype.

• Genotypic fitness, fi j: is the average fitness of all individuals with the genotype Ai j in a population.

• Marginal fitness, fi: is the average genotypic fitness over all the genotypes that contains the allele Ai, weighted by the probability of finding such allele in that genotype.

Let us now focus on the most common selection model studied in PG: selection acting on one locus with two alleles (Rice, 2004). Hence, we can simplify the notation from Definition 3.1 to: just p for the frequency of the allele A1and obviously the other allele will have a frequency of 1 − p.

We have defined the mean fitness of a population in terms of how much it favours the growth of such population, it seems natural then, to measure this growth per generation.

Let Nt be the size of a population at time t, we can compute the population size after one generation as the sum of each genotype’s fitness, weighted by their frequency in the population. Assuming H-W frequencies, this is simply Nt+1= Ntf, where f is the mean population fitness.

f = p2f11+ 2p(1 − p) f12+ (1 − p)2f22. (3.1) Recall that since the order of the alleles in the genotype is not relevant we have that f21= f12. We now take the derivative of the previous equation to study how f varies with respect to the allele frequencies. For that, we have to consider the case where the fitness is a function of the allele frequencies, hence

d f

d p= 2p f11+ p2d f11

d p + 2(1−2p) f12+ 2p(1− p)d f12

d p −2(1− p) f22+ (1− p)2d f22

d p . (3.2) To simplify this expression we recover the concept of marginal fitness (see Definition 3.2),

fi can be computed as the probability of finding Ai in each genotype. If we stick to the assumption of random mating used to derived the H-W frequencies, these probabilities are simply each allele frequency. Therefore,

f1= p f11+ (1 − p) f12

f2= p f12+ (1 − p) f22. (3.3) Introducing this back into Equation (3.2) leads to

d f

d p = 2 ( f1− f2) + p2d f11

d p + 2p(1 − p)d f12

d p + (1 − p)2d f22 d p ,

which we can simplify by noticing that the three last terms are just E [d f /d p] since each derivative is multiplied by its corresponding H-W frequency, leading to

d f

d p = 2 ( f1− f2) + E d f d p



. (3.4)

Let us leave this equation for a moment and focus again on the marginal fitnesses, from Definition 3.2, we can also understand fi as the expected number of descendants of the allele Ai. Hence, if at the current generation there are ni alleles of type Ai, in the next generation there will be, in expectation, n1f1. Furthermore, we also notice that the total population size N = ∑ nigrows with the mean population fitness f . Then, we can derive the allele frequencies in the next generation as

pt+1= n1f1/N f = ptf1/ f .

Returning to our original aim, we compute the variation in the allele frequencies ∆p = pt+1− pt as follows

∆ p =

p f1− f

f = p(1 − p)( f1− f2)

f , (3.5)

where in the last step we have used the fact that f = p f1+ (1 − p) f2. Finally, we recover Equation (3.4) to introduce it into Equation 3.5 yielding

∆ p = p(1 − p) 2 f

 d f

d p− E d f d p



. (3.6)

This equation is crucial for the theoretical understanding of evolution. In the special case when the fitness is frequency-independent (i.e, d f /d p = 0) it corresponds with the equation for an adaptive landscape by Wright (1937).

∆ p = p(1 − p) 2 f

d f

d p. (3.7)

If we consider this equation as a dynamical system, we find a very important result. Apart from the extreme cases when one of the two alleles has taken over the whole population (p = 0 or p = 1), the fixed points of Equation (3.7) (i.e., when ∆p = 0) are located at d f/d p = 0. Therefore, we find stability when the mean population fitness is maximised!

This result backs up the principle of adaptation from the Darwin-Wallace evolutionary theory. Unfortunately, Moran (1963) showed that for the multi-loci case it no longer holds (in general) that populations tend to maximize f . Specifically, when the fitness is dependent

on more than one locus. However, Moran also mentioned that adaptive landscapes can still be used to obtain approximated results when the genotypic fitnesses are very close to each other.

To further understand Equation (3.7), we present a case study where the heterozygote has a higher fitness than both homozygotes. This is modelled by assigning (for example) the following genotypic fitnesses: f11 = f22 = 1 and f12= 1 + ∆ f . Then, assuming H-W frequencies, we can use Equation (3.1) to compute the mean population fitness: f = 1 + 2∆ f p(1 − p). Introducing this value of f in Equation (3.7) leads to

∆ p = p(1 − p)(1 − 2p)∆ f

1 + 2∆ f p(1 − p) . (3.8)

Let us now plot both ∆p and f against p. In the right-hand graph of Figure 3.1, we can observe that the mean population fitness approaches its minimum value when there are no alleles A1 (p = 0). This corresponds with all the population consisting of copies of the genotype A22. Analogously, when p = 1, all the alleles present in the population are A11, hence a minimal f = 1. However, when the population is composed of half A1alleles (p = 1/2) we observe that the fitness is maximised.

As we mentioned before, the value p = 1/2 must correspond with a stability point (∆p = 0). We can observe that this is the case on the left-graph of Figure 3.1. It is also interesting to study the stability of this fixed point. For that, we observe that f (p) is increasing for p < 1/2 and decreasing as soon as p > 1/2. Since the sign of ∆p is determined by the slope of f (p), recall Equation (3.7), we can conclude that the point p = 1/2 is a stable equilibrium point.

0 0.2 0.4 0.6 0.8 1

−1

−0.5 0 0.5 1

·10−2

p

∆p

0 0.2 0.4 0.6 0.8 1

1 1.01 1.02 1.03 1.04 1.05

p

f

Fig. 3.1 Heterozygote advantage model with f11= f22= 1 and f12= 1.1. The graph on the left shows the variation in the frequency ∆p = pt+1− p of allele A1, versus the frequency p of the allele A1. The graph on the right shows the mean population fitness f against p.

Finally, it is important to notice that if we recover Equation (3.6), then the fitness has a dependency with the allele frequencies and we see that it is not necessarily the case that f is maximised. This presence of non-optimal genotypes prevents the population from reaching the maximum fitness. To gain further insight into this issue we need to consider the mechanism that produces new genotypes, mutation.