
IS THE GENE THE UNIT OF SELECTION?



IAN FRANKLIN* AND R. C. LEWONTIN

Department of Biology and Department of Mathematical Biology, University of Chicago, Chicago, Illinois 60637

Received January 29, 1970

THE models of population genetics, which have remained almost unchanged for forty years, are most commonly criticized for ignoring the “natural” unit of selection, the genotype, in favor of the gene. This criticism is really an attack on one of the basic assumptions of population genetics theory, namely that the genotypic array in a random mating population, and evolutionary changes in that array, can be described in terms of gene frequencies at the individual loci.

Undoubtedly there is a complex relationship between an individual genotype and its fitness, and some population geneticists feel that complexity per se destroys the usefulness of classical theory. But additivity (in the statistical sense) is not a crucial assumption: in 1918 FISHER showed how to partition additive and nonadditive effects of genes, and changes in gene frequency can be expressed in terms of these effects. On the other hand, if there is enough nonallelic interaction to induce stable correlations in allelic state between loci, the genotypic array can no longer be usefully described in terms of gene frequencies alone, since the frequency of a particular gametic type, say AbC, would not be simply the product of the separate gene frequencies $p_A$, $p_b$, $p_C$.

These correlations in allelic state between loci in the gametic pool, commonly referred to as linkage disequilibrium (LEWONTIN and KOJIMA 1960), are the principal subject of this paper.

For many years loci have been assumed to be in approximate linkage equilibrium, and this belief has in part been justified by exact two-locus theory. LEWONTIN and KOJIMA (1960), and BODMER and PARSONS (1961) showed that for symmetrical fitness models there will be stable linkage disequilibrium if there is considerable epistasis or very close linkage. KARLIN and FELDMAN (1969) did show, however, that for certain special relations among the fitnesses, looser linkage may also lead to a stable disequilibrium. Moreover KIMURA (1965) claimed that for weak interaction and loose linkage, FISHER’S fundamental theorem of natural selection and WRIGHT’S concept of an adaptive topography hold approximately (see also WRIGHT 1967), although this does not hold true for tight linkage and strong selection. Since it seemed a priori that only a small fraction of all pairs of loci would be closely linked or strongly interacting, and polymorphic, the classical concepts of population genetics were not seriously threatened.

1 The research described here was supported by the Atomic Energy Commission Contract AT(11-1)-1437.

* Present address: Division of Animal Genetics, Commonwealth Scientific and Industrial Research Organization, P.O. Box 90, Epping, New South Wales, 2121 Australia.


However, a number of recent observations suggest that linkage disequilibrium may be more common than previously envisaged.

1. The apparent ubiquity of polymorphic loci in a variety of organisms implies a high density of segregating loci per map unit, so that many pairs of polymorphic loci must be closely linked. For example, PRAKASH, LEWONTIN and HUBBY (1969) estimate that 40% of all structural genes are polymorphic in Drosophila pseudoobscura. If there are 5,000 loci in this species, a conservative estimate, and a total map length of 250 centimorgans, there will be 8 polymorphic loci per centimorgan. Taking into account the lack of recombination in males, the average recombination fraction between adjacent polymorphic loci is .0006.

2. Some kinds of natural selection generate very large amounts of epistasis. LEWONTIN (1964b) showed that various forms of selection for an intermediate optimum phenotype created sufficient epistasis to produce stable linkage disequilibrium even for genes on different chromosomes. Also the models of natural selection discussed by KING (1967) and SVED, REED and BODMER (1967) induce much more epistasis than multiplicative models.

3. Linkage disequilibrium may arise from causes other than selection, in particular finite population size (HILL and ROBERTSON 1968; OHTA and KIMURA 1969; SVED 1968). Even neutral alleles that are segregating may show considerable linkage disequilibrium if they are sufficiently tightly linked. The correlation in gene frequency for segregating neutral loci appears to be approximately equal to 1/(4Nr), where r is the recombination fraction between the loci in question, and N the population size (HILL and ROBERTSON 1968). Also, departures from random mating, such as positive assortative mating, may induce correlations in gene frequency.

4. LEWONTIN (1964a,b) showed that genes quite far apart on the chromosome will be held together in linkage disequilibrium by genes segregating between them. That is, the disequilibrium between loci 1 and 2 and the disequilibrium between loci 2 and 3 will result in a disequilibrium between loci 1 and 3 even though, in the absence of locus 2, these distant genes would not be correlated. However, genes on either side of a long interval within which there is no interacting locus will not be in linkage disequilibrium with each other.

Finally, there is the finding to which this paper is directed, that two-locus theory seriously underestimates the intensity of linkage disequilibrium between loci in a multilocus segregation. We will show that epistatic interactions that are small for any pair of loci considered alone can accumulate nonadditively in such a manner that loci far apart on the chromosomes can be in marked linkage disequilibrium. In fact, it is possible for all the loci on an entire chromosome or chromosome arm to be highly correlated in their allelic distribution.
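The arithmetic behind point 1 can be checked in a few lines. This is a minimal sketch with illustrative variable names; halving the female recombination fraction to allow for the absence of crossing over in males is our reading of the text, not a formula it states.

```python
# Back-of-envelope check of the locus-density estimate in point 1.
n_loci = 5000            # structural loci assumed in the text
percent_polymorphic = 40  # PRAKASH, LEWONTIN and HUBBY (1969)
map_length_cM = 250      # total map length assumed in the text

n_polymorphic = n_loci * percent_polymorphic // 100
polymorphic_per_cM = n_polymorphic / map_length_cM

# Mean spacing between adjacent polymorphic loci, expressed as a
# recombination fraction in females (1 centimorgan ~ 0.01).
r_female = 0.01 / polymorphic_per_cM

# Assumption: with no crossing over in males, the sex-averaged
# recombination fraction is half the female value.
r_average = r_female / 2.0

print(polymorphic_per_cM)    # 8.0
print(round(r_average, 4))   # 0.0006
```

Four times this average spacing is about .0025, the scale of recombination used in the simulations reported later.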


TABLE 1

Correlation between two adjacent loci (2 and 3) in a 5-locus model,* compared with predicted correlation from 2-locus theory

Map distance    Observed    Predicted    Ratio
.01             .967        .916         1.06
.02             .929        .825         1.13
.03             .883        .721         1.22
.04             .823        .600         1.37
.05             .736        .447         1.64
.06             .581        .200         2.91
[illegible]     .481        0            ∞
[illegible]     .367        0            ∞

* From LEWONTIN 1964a, Table 9

some containing other loci interacting with them, than when they are considered in isolation. Such a result does not follow from simple considerations of correlation and arises from higher-order interactions that do not exist in the two-locus case.

Some of the results presented in this paper can be anticipated from the results of 5-locus calculations made by LEWONTIN (1964a). Table 1 shows the equilibrium linkage relations for different map distances, expressed as the correlation between loci, for a pair of adjacent loci (locus 2 and locus 3) in a 5-locus multiplicative fitness model (Table 9, LEWONTIN 1964a). The model has a symmetry such that the correlation for two loci can be calculated from exact 2-locus theory. The last column gives the ratio of observed to expected. The table shows that as the map distance between the loci is increased, the effect of embedding adjacent loci in a multilocus chromosome grows greater and greater. The effect becomes relatively most extreme for map distances above .0625 when exact 2-locus theory predicts no correlation. We see two effects then. The correlation between loci is greater than expected from 2-locus theory and the critical map distance, above which there is no effect of linkage, is increased, although only slightly. It is the purpose of this paper to push these two observations much farther by increasing greatly the number of loci and examining a variety of selection and linkage models in an approach to a realistic model of the genome. The results turn out to be rather surprising.
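The "Predicted" and "Ratio" columns of Table 1 can be cross-checked numerically. The sketch below assumes the 2-locus prediction has the form sqrt(1 - r/r_c) with r_c = .0625, the critical map distance quoted above; that closed form is not stated in the text and is used here only as a hypothesis to test against the legible table entries.

```python
import math

R_CRIT = 0.0625  # critical map distance quoted in the text

# (map distance, observed 5-locus correlation, tabulated 2-locus prediction)
TABLE_1 = [
    (0.01, 0.967, 0.916),
    (0.02, 0.929, 0.825),
    (0.03, 0.883, 0.721),
    (0.04, 0.823, 0.600),
    (0.05, 0.736, 0.447),
    (0.06, 0.581, 0.200),
]

def predicted_correlation(r, r_crit=R_CRIT):
    """Assumed 2-locus form: sqrt(1 - r/r_crit) below r_crit, else 0."""
    return math.sqrt(1.0 - r / r_crit) if r < r_crit else 0.0

for r, observed, tabulated in TABLE_1:
    pred = predicted_correlation(r)
    print(f"{r:.2f}  pred={pred:.3f}  tabulated={tabulated:.3f}  "
          f"ratio={observed / pred:.2f}")
```

Under this assumption the computed predictions agree with the tabulated ones to three decimals, and the ratio grows without bound as the map distance approaches r_c.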

METHODS

While exact numerical computations of changes in genetic composition of a population are possible with a small number of loci, a different attack must be used when dozens or hundreds of loci are involved. With only two alleles per locus, there are $2^l$ gametic types in an $l$-locus model, which involves manipulation of a matrix of that order. For 30 loci there are approximately $10^9$ gametic types. We have therefore resorted to Monte Carlo simulation for more than five loci. Since we are not primarily interested in the effect of small population size, we have used as large a population size as was reasonable given a limited available computer time. Our primary attempt was to derive essentially deterministic results by minimizing the stochastic effects. Finiteness of population size does turn out to be of some interest, however.
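To make the scale of the problem concrete, the count of gametic types can be tabulated directly. This is only an illustration; the function name is ours.

```python
def gametic_types(n_loci, alleles_per_locus=2):
    """Number of distinct gametic types for n_loci loci."""
    return alleles_per_locus ** n_loci

for n in (2, 5, 30, 36):
    print(n, gametic_types(n))
```

For 5 loci there are only 32 types, which is why exact calculation remains feasible there, while 30 loci already give about a billion.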

Monte Carlo simulation experiments were carried out using a general purpose program written for the IBM 7094 at the University of Chicago. This program consists of a main deck written in FORTRAN which is primarily responsible for input and output, and a set of subroutines written in assembly language concerned with mating, recombination, evaluation of phenotype and selection. Each of these subroutines has a set of optional decks which allow different mating schemes and patterns of selection. The options used to generate the results given in this paper were chosen so that the simulated populations had the following properties:

1) Population size

Males chosen in sequence and females chosen at random from an array in store are used to generate progeny, which are then selected on the basis of their fitness. Progeny are generated until a population size N (equal to the number of parents) is attained. Because males are chosen in sequence, they have virtually no variance in offspring number due to Poisson sampling, while the females have such variance so that effective population size is nearly 4N/3.

2) Recombination

Crossovers are generated with no interference: the number of crossovers in a segment with map length A morgans is calculated by sampling from a Poisson distribution with mean A, and the position of each crossover is determined by sampling from a uniform distribution [0, A].

3) Selection

a) Multiplicative fitnesses

The fitness of a genotype is the product of the fitnesses at the individual loci; each locus is assumed to have an identical effect on fitness. We assign to each heterozygote a fitness $W_2$, and to homozygotes 0/0 and 1/1 values $W_1$ and $W_3$. Then an individual with $n_1$ loci homozygous 0/0, $n_2$ loci heterozygous, and $n_3$ loci homozygous 1/1 will have a fitness $W_1^{n_1} W_2^{n_2} W_3^{n_3}$. This value is compared to a uniform random variable between 0 and $W_2^{n}$. If the genotypic value is greater than the random variable, the individual is saved; if less, discarded.

b) Proportional selection

Following a model of KING (1967), which postulates a fixed proportion of the generated individuals surviving, a score is constructed by adding the number of heterozygous loci in an individual to a random normal deviate. The individual is saved if the phenotypic score is greater than a truncation point computed each generation so as to save a proportion R of the population. The random number is chosen from a normal distribution with mean 0, and variance equal to C times the variance of the number of heterozygous loci in the initial population. The initial heritability is therefore 1/(C + 1).
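The two selection schemes can be sketched as follows. This is a minimal illustration under stated assumptions, not the original assembly-language routines: the function names are ours, the upper bound of the uniform draw is taken to be the fitness of a complete heterozygote, and the noise variance in the proportional scheme is computed from whatever population is passed in, standing in for the initial population.

```python
import random
import statistics

def multiplicative_fitness(n1, n2, n3, w1, w2, w3):
    """Fitness with n1 loci 0/0, n2 loci heterozygous, n3 loci 1/1."""
    return (w1 ** n1) * (w2 ** n2) * (w3 ** n3)

def survives(fitness, w_max, rng):
    """Save the individual if its fitness exceeds a uniform draw on [0, w_max]."""
    return fitness > rng.uniform(0.0, w_max)

def truncation_select(het_counts, keep_fraction, c, rng):
    """Indices of the individuals saved under the proportional scheme.

    Score = heterozygous-locus count + normal deviate whose variance is
    c times the variance of het_counts (assumption: het_counts stands in
    for the initial population).
    """
    sd = (c * statistics.pvariance(het_counts)) ** 0.5
    scores = [h + rng.gauss(0.0, sd) for h in het_counts]
    n_keep = round(keep_fraction * len(het_counts))
    ranked = sorted(range(len(scores)), key=scores.__getitem__, reverse=True)
    return sorted(ranked[:n_keep])

rng = random.Random(1)
w = multiplicative_fitness(n1=10, n2=20, n3=6, w1=0.9, w2=1.0, w3=0.9)
print(w, survives(w, w_max=1.0 ** 36, rng=rng))
print(truncation_select([18, 20, 15, 22, 17, 19], keep_fraction=0.5, c=1.0, rng=rng))
```

With c = 0 the environmental noise vanishes and the scheme reduces to pure truncation on the number of heterozygous loci, which corresponds to a heritability of 1.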

Because the word size of the IBM 7094 is 36, it is convenient when dealing with a large number of loci to consider a multiple of 36.
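One way to picture this remark, together with the recombination rule in 2), is to hold each 36-locus gamete in a single 36-bit integer. The sketch below is hypothetical: the text does not describe the program's storage layout, and the even spacing of loci and the use of the recombination fraction as a map distance in morgans are assumptions made for illustration.

```python
import math
import random

N_LOCI = 36  # one gamete per 36-bit word; bit i holds the allele at locus i

def poisson(mean, rng):
    """Knuth's multiplication method; adequate for small means."""
    limit, k, prod = math.exp(-mean), 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k

def recombine(g1, g2, r_adjacent, rng):
    """One recombinant gamete from parental gametes g1 and g2.

    Assumption: loci are evenly spaced, r_adjacent morgans apart, so the
    chromosome has map length (N_LOCI - 1) * r_adjacent.
    """
    length = (N_LOCI - 1) * r_adjacent
    cuts = sorted(rng.uniform(0.0, length) for _ in range(poisson(length, rng)))
    strands = (g1, g2) if rng.random() < 0.5 else (g2, g1)
    out, current, j = 0, 0, 0
    for locus in range(N_LOCI):
        position = locus * r_adjacent
        while j < len(cuts) and cuts[j] < position:
            current ^= 1          # switch strand at each crossover passed
            j += 1
        out |= ((strands[current] >> locus) & 1) << locus
    return out

def heterozygous_loci(g1, g2):
    """Number of loci at which the two gametes differ."""
    return bin(g1 ^ g2).count("1")

rng = random.Random(7)
a = int("101010" * 6, 2)
b = a ^ ((1 << N_LOCI) - 1)          # perfectly complementary gamete
print(heterozygous_loci(a, b))        # 36
print(format(recombine(a, b, 0.0025, rng), "036b"))
```

In this representation the number of heterozygous loci in a zygote is a single exclusive-or followed by a bit count, which is what makes a word-sized block of loci convenient.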

Pseudorandom sequences are generated by the power residue method using a base of $3^{19}$. We have tested the sequences for frequency of single integers, pairs, triplets, and quadruplets. No autocorrelations have been detected.

For 5-locus results, direct numerical calculation of equilibria was made using the method of LEWONTIN (1964a).

The original computer program, modified by us, and the tests of random numbers were the work of our former colleague PROFESSOR MADHO SINGH of the State College, Oneonta, N.Y.

RESULTS

Initial 36-locus models

An initial survey of a range of recombination values was made for a symmetrical overdominant multiplicative model with $W_1 = 0.9$, $W_2 = 1$, and $W_3 =$

FIGURE 1.-Results of initial simulations with 36 loci. Curves 1 and 2: r = 0.0; curve 7: r = .005; curves 3, 4, 5, and 6: r = .0025, where r is recombination between adjacent loci. Abscissa is generation number; ordinate is mean fitness.

independently segregating. The starting population consisted of a set of 600 (2N) gametes chosen at random from a population with all gene frequencies equal to 0.5, and in linkage equilibrium. The effective population size is 400. Figure 1 shows $\bar{W}$ plotted against time, in generations, for three different values of r, the recombination fraction between adjacent loci. For r = 0.0, complete linkage, there is a small but not remarkable increase in $\bar{W}$ (curves 1 and 2). In contrast, at r = .0025, there was a marked increase in $\bar{W}$ in all four replicates (curves 3, 4, 5, 6), accompanied by a reduction in the number of gametic types represented. At a higher recombination value, r = .005 (curve 7), no increase in $\bar{W}$ was observed. Some simulations were also carried out with free recombination between loci (r = 0.5) and $\bar{W}$ remained at the theoretical level of 0.158 for populations in linkage equilibrium. Apparently there is an intermediate optimum recombination fraction that minimizes the genetic load, maximizing fitness. How can that be? The explanation is given in Table 2, which shows the gametic composition of the populations at equilibrium. Each line of 0's and 1's is the representation of the 36 loci along the chromosome for one gametic type in the gamete pool at equilibrium. The frequencies are ten-generation averages after 100 generations of equilibrium. For r = 0.0, the entire population consists of a very few

TABLE 2

Equilibrium gametic arrays from initial simulations

Each line is a gametic type. 0 and 1 are alternate alleles. Asterisks mark fixed loci. The 36 loci are spaced in 12 groups of 3 for ease of reading only.

    Gamete                                            Frequency

(a) r = 0.0

rep 1
             *                          *
    101 000 011 001 100 000 111 011 101 110 101 101   .454
    010 110 111 010 011 101 001 000 110 101 110 010   .391
    100 111 110 101 011 011 100 100 010 100 011 111   .155

rep 2
    001 010 101 100 011 001 101 100 101 110 010 010   .280
    110 011 000 111 001 110 011 010 110 110 100 000   .172
    011 100 110 001 100 111 111 110 000 011 101 101   .346
    000 011 101 011 111 001 100 011 001 000 011 101   .024
    100 011 011 101 110 101 000 010 111 000 011 111   .178

(b) r = .0025

rep 1 (generation 300)
                **                *       *
    011 010 110 011 000 110 101 011 011 110 001 101   .411
    100 101 001 010 111 001 010 101 100 000 110 010   .424
    others                                            .165

rep 2 (generation 420)
                              *
    001 100 100 011 101 011 111 100 100 111 000 111   .440
    110 011 011 100 010 100 001 011 011 000 111 000   .427
    others                                            .133

except for the two loci starred in rep 1, all the loci are segregating when all types are in the population. Since no two gametes make a balanced pair, no individual in the population is a complete or nearly complete heterozygote. All individuals, even those heterozygous for gametic types, are homozygous at many loci (a minimum of 10 in rep 1) so the mean fitness of the population is not raised much despite the tremendous restriction in the number of gametic types. This phenomenon is a result of the finiteness of the population. Since there is absolutely no recombination, each gametic type is an “allele” and random drift causes the elimination of all but a few “alleles.” If the “alleles” do not include a perfectly balanced pair, there is no way to recover such a pair.

When there is a small amount of recombination (r = .0025), recombination between the “alleles” can occur to produce perfectly balanced pairs (disregarding loci completely fixed in the population). Table 2b shows such pairs making up 85% of the gametic array in each replicate. In rep 2, for example, 38% of the population will be heterozygotes for the two main gametic types and therefore heterozygous for 35 loci. $\bar{W}$ will be high in such a population. Examination of the detailed output from the computation shows that the partial plateau for curve 5


The fitness among simulations with r = .0025 does not rise higher because recombination is constantly generating unbalanced types (“others” in Table 2). This is a recombination load. If the population were infinite in size, then a perfectly balanced set could be selected when r = 0.0 and no recombination load would occur. The mean fitness of the population would then be $\bar{W} = (.5)(1.0) + (.5)(.9)^{36} = .5113$, the maximum that can be achieved by linkage. Thus the appearance of an optimal recombination fraction is a result of the finiteness of population size, while for an infinite population the tighter the linkage, the higher the mean fitness at equilibrium. This is in agreement with the results of 5-locus models (LEWONTIN 1964a).
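The two benchmark mean fitnesses used in this section follow directly from the multiplicative model and can be evaluated as a check. The sketch assumes gene frequencies of exactly 0.5 at all 36 loci; the variable names are ours.

```python
N_LOCI = 36
W_HOM, W_HET = 0.9, 1.0

# Linkage equilibrium with all gene frequencies 0.5: each locus is
# heterozygous with probability 0.5, independently of the others.
w_linkage_equilibrium = (0.5 * W_HET + 0.5 * W_HOM) ** N_LOCI

# Complete linkage, infinite population: two perfectly complementary
# gametes at frequency 0.5 each, so half the zygotes are heterozygous at
# every locus and half are homozygous at every locus.
w_balanced_pair = 0.5 * W_HET ** N_LOCI + 0.5 * W_HOM ** N_LOCI

print(round(w_linkage_equilibrium, 3))   # 0.158
print(round(w_balanced_pair, 4))
```

The gap between the two values is the most that tight linkage can recover in this model.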

A result that is not obvious from the tables of gametic types, but which appears in the full output of the computer calculations, is the very close adherence of gene frequencies at each locus to the theoretical infinite population value of 0.5. Except for the occasional fixed locus, and except for the case of r = 0 where chromosomes once lost can never be replaced, the variance in gene frequency among loci is distinctly smaller for tightly linked cases than for free recombination. For the two replicates shown in Table 2b, the variance of gene frequencies (discounting fixed loci) is .000177 and .000140, respectively, as compared with .00239 for a case with r = 0.5.

Large numbers of loci, each with fairly large selection coefficients, can then be kept segregating with a much lower genetic load provided they are closely enough linked. In assessing whether r = .0025 is a reasonable map distance, it should be remembered that this is four times the average distance we estimated between segregating loci in D. pseudoobscura.

Measures of linkage disequilibrium

Before going on to discuss more complete results, we need some useful descriptions of the gametic array in a population. In addition to the allelic frequencies at each locus (which will nearly always be close to .50 in our models), many linkage disequilibrium parameters are needed to completely specify the genotypic array for multilocus systems. With 36 loci there are 36 × 35/2 = 630 parameters necessary to specify disequilibrium between pairs of loci, 7140 parameters describing disequilibrium between triples, etc. Since linkage disequilibrium is most clearly understood for pairs of loci (in fact there is no satisfactory theory for three linked loci with selection), and since we expected most of the deviations in frequency to be accounted for by first-order interactions, we will only consider functions of D, defined as $g_{11}g_{00} - g_{10}g_{01}$, where $g_{11}$ and $g_{00}$ are the frequencies of the coupling gametes and $g_{01}$ and $g_{10}$ the frequencies of the repulsion gametes with respect to the two loci being considered (see LEWONTIN and KOJIMA 1960). Even narrowing down to 2-locus effects, there are 630 separate D’s, and we need to consider some kind of average value. In the subsequent discussions we will use two.


dramatic effects on disequilibrium occur for loci that are tightly linked, hence the D’s between adjacent loci will be most sensitive to changes in linkage disequilibrium in the simulated populations.

2) $\overline{\rho^2}$: the average squared correlation coefficient between genes over all n(n-1)/2 pairs of loci. The correlation between genes is defined as follows. A pair of random variables (X, Y) is assigned to the pair of loci. The random variables will each take the value 0 or 1 depending upon the allele at that locus in each gamete. Thus the pairs (0,0), (0,1), (1,0), and (1,1) correspond to the gametes ab, aB, Ab, and AB, respectively, and have the frequencies $g_{00}$, $g_{01}$, $g_{10}$, and $g_{11}$ in the gamete pool. The correlation in question is that between X and Y in the gamete pool. The correlation between a pair of loci is related to D simply as

$$\rho = \frac{D}{\sqrt{p_1 p_2 (1 - p_1)(1 - p_2)}} \qquad (1)$$

where $p_1$ and $p_2$ are the allelic frequencies at the two loci. With gene frequencies at each locus equal to 0.5, $\rho^2 = 16D^2$. This measure has a range between 0 and 1 since D cannot exceed .25. The reasons for using the average value of $\rho^2$ will be discussed later in this paper.

Because of the large number of pairwise combinations, it is impractical to make a direct computation of D* or $\overline{\rho^2}$ when large numbers of loci are involved. We have estimated $\overline{\rho^2}$ using a relation derived by SVED (1968). If the number of loci heterozygous in an individual is H and the number of loci is n, then

$$\mathrm{Var}(H) = \frac{n}{4} + 8 \sum_{i<j} D_{ij}^2 \qquad (2)$$

when gene frequencies are .5 at each locus. Then

$$\overline{\rho^2} = \frac{4\,\mathrm{Var}(H) - n}{n(n - 1)} \qquad (3)$$

In all the results, unless recombination is completely lacking (r = 0), except for an occasional locus that goes to complete fixation, all loci maintain gene frequencies extremely close to .5 so that approximation (3) is very good.
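The measures defined in this section are easy to compute for a single pair of loci, which also gives a consistency check on relations (1) to (3). The following is a minimal sketch with function names of our own choosing; it uses one pair of loci (n = 2) so that the average squared correlation must equal the single pairwise value.

```python
import math

def disequilibrium(g00, g01, g10, g11):
    """D = g11*g00 - g10*g01 for one pair of loci."""
    return g11 * g00 - g10 * g01

def correlation(g00, g01, g10, g11):
    """Equation (1): rho = D / sqrt(p1 p2 (1-p1)(1-p2))."""
    p1 = g10 + g11          # frequency of allele 1 at the first locus
    p2 = g01 + g11          # frequency of allele 1 at the second locus
    d = disequilibrium(g00, g01, g10, g11)
    return d / math.sqrt(p1 * p2 * (1 - p1) * (1 - p2))

def mean_rho_squared(var_h, n):
    """Equation (3): (4 Var(H) - n) / (n (n - 1)), gene frequencies 0.5."""
    return (4.0 * var_h - n) / (n * (n - 1))

# Two loci with gene frequencies 0.5 and D = 0.2.
d = 0.2
g11 = g00 = 0.25 + d
g10 = g01 = 0.25 - d
rho = correlation(g00, g01, g10, g11)
var_h = 2 / 4 + 8 * d ** 2            # equation (2) with a single pair
print(rho ** 2, 16 * d ** 2, mean_rho_squared(var_h, n=2))
```

All three printed quantities agree, as they should when gene frequencies are 0.5. The practical appeal of (3) is that Var(H) requires only one number per individual rather than all 630 pairwise D's.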

Existence of multiple equilibria

Since the first simulations showed that small, but reasonable, values of recombination in 36-locus systems would lead to very high linkage disequilibrium, a more thorough investigation of these cases was undertaken. Population size was increased somewhat to $N_e$ = 667 to reduce random fixation at separate loci. Figure 2a shows the result of two replicate runs for r = .003 and $W_1 = W_3 = 0.9$.

In both cases there was a very slight rise in $\bar{W}$ over the first 300 generations

FIGURE 2.-(a) Changes in $\bar{W}$ (ordinate) over time (abscissa) for two replicates with r = .003 and $W_1 = W_3 = 0.9$. (b) Changes in linkage association, D* (ordinate) over time for the same populations as in 2a.

linkage disequilibrium ($\overline{\rho^2}$ = .96). As in the earlier runs, two gametic types make up 80% of the gametic pool, and unfixed gene frequencies are very close to .5 at all loci.

FIGURE 3.-Four simulations with 36 loci, N = 667, r = .005, $W_1 = W_3 = 0.9$, but starting from different initial amounts of linkage disequilibrium. Linkage disequilibrium D* (ordinate) is shown over time (abscissa).

r = .005. The first run started in linkage equilibrium and after 300 generations reached a plateau at D* = .09. Three other simulations with initial values of D* = 0.25, 0.16, and 0.125, show that there are indeed two stable points, one at D* = .24 and one at D* = .09 with the unstable point between them at approximately .14.

As a further check on the existence of more than one stable point, and in order to discriminate the effect of finite population size in the Monte Carlo simulation, we have examined the same selection model for five loci where exact results can be computed. Figure 4 shows the result of computing trajectories of D* for five loci with $W_1 = W_3 = 0.9$ at each locus and r = .002, .003, .004, and .005. The 2-locus exact theory predictions for these cases are that equilibrium D* = .112 for r = .002 but D* = 0 for the lesser linkages, since r = .0025 is the critical value of linkage from 2-locus theory. Figure 4 shows, as we have already demonstrated, that D* is greater for five loci than for two loci with D* = .220 where r = .002 and D* = .187 for r = .003. These are both stable points since the trajectories converge from above and below. In addition, however, D* = 0 is also a stable point for r = .003 since the trajectory with an initial value of D* = .05 shows a decrease in D* with time. For r = .003 there are two stable points,

FIGURE 4.-Five-locus deterministic numerical solutions with $W_1 = W_3 = 0.9$, various amounts of recombination between adjacent loci, and various initial amounts of linkage correlation. Solid lines: r = .003; broken line: r = .002; dotted line: r = .004.

with D* large. For large r there is a single point with D* = 0. For an intermediate range, there are two stable points, one with D* large and one with D* = 0.

The existence of simultaneous stable equilibria with D = 0 and D ≠ 0 has not been found in 2-locus theory. Both LEWONTIN and KOJIMA (1960) and KARLIN and FELDMAN (1969) found multiple stable equilibria for symmetric models, and LEWONTIN (1964a) found multiple equilibria numerically in asymmetric cases, but all such multiple equilibria had D ≠ 0.
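For comparison, the 2-locus benchmark used in this section can be computed from the selection coefficient alone. The closed form below is an assumption on our part, taken from the standard symmetric 2-locus model rather than from the text; it does reproduce the two figures quoted above for s = 0.1, namely a critical recombination value of .0025 and an equilibrium D of .112 at r = .002.

```python
import math

def two_locus_equilibrium_d(r, s):
    """|D| at the polymorphic equilibrium of the symmetric 2-locus model.

    Assumed closed form: |D| = (1/4) * sqrt(1 - 4r / s**2) for r < s**2 / 4,
    and 0 otherwise, where s is the selection against either homozygote
    and fitnesses multiply across loci (so the epistasis is s**2).
    """
    r_crit = s ** 2 / 4.0
    if r >= r_crit:
        return 0.0
    return 0.25 * math.sqrt(1.0 - r / r_crit)

for r in (0.002, 0.003, 0.004, 0.005):
    print(r, round(two_locus_equilibrium_d(r, s=0.1), 3))
```

In the 2-locus case a given r therefore yields either D = 0 or a nonzero D as the stable outcome, never both, which is the contrast with the five-locus results drawn here.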

The effect of finite size

Returning to Figure 3, we see that the lower stable point is not at D* = 0, unlike for the 5-locus calculation. This difference, however, is a result of finite population size. The population cannot remain at D* = 0 because in any finite population linkage disequilibrium is generated by sampling.

SVED (1968) has described a method for calculating the expected linkage disequilibrium between two loci, provided that the gene frequencies are held constant at an intermediate value. In particular, he showed that for gene frequencies at each locus equal to 0.5, and with no epistasis,

$$E(D^2) = \frac{1}{16(4Nr + 1)} \qquad (4)$$

where N = effective population size and r = recombination fraction between the two loci.
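The neutral expectation is simple enough to tabulate for the population size and recombination values used in the simulations. This is a sketch assuming the left-hand side of the formula is E(D²), which is how we read it; at gene frequencies of 0.5 this is equivalent to an expected squared correlation of 1/(4Nr + 1).

```python
import math

def expected_d_squared(n_e, r):
    """E(D^2) = 1 / (16 (4 N r + 1)) for gene frequencies 0.5, no epistasis."""
    return 1.0 / (16.0 * (4.0 * n_e * r + 1.0))

for r in (0.003, 0.005):
    e_d2 = expected_d_squared(667, r)
    # 16 * E(D^2) is the expected squared correlation at gene frequency 0.5.
    print(r, round(e_d2, 5), round(16 * e_d2, 3), round(math.sqrt(e_d2), 3))
```

The last printed column, the root-mean-square D, gives a rough sense of how much disequilibrium drift alone can maintain.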

This formula, which is in close agreement with the result that OHTA and KIMURA (1969) obtained by a different approach, has to be modified slightly in our case to allow for the epistasis generated by the multiplicative model. Following SVED, if we let z equal the sum of the frequencies of the repulsion gametes ($g_{10} + g_{01}$), the distribution of z at equilibrium is approximately

$$\phi(z) = \frac{C}{V(z)} \exp\left[ 2 \int \frac{\Delta(z)}{V(z)}\, dz \right] \qquad (5)$$

where C is a constant, $\Delta(z)$ is the expected change in z in one generation, and $V(z)$ is the sampling variance of z.

Now,

$$\Delta(z) = \Delta(g_{10}) + \Delta(g_{01}) = \left(z - \tfrac{1}{2}\right)\left[s^2 z(1 - z) - r\right] / \bar{W} \quad \text{where } s = 1 - W$$

$$\approx \left(z - \tfrac{1}{2}\right)\left[s^2 z(1 - z) - r\right] \qquad (6)$$

and

$$V(z) = z(1 - z)/2N \qquad (7)$$

so that

$$\phi(z) = C\,[z(1 - z)]^{2Nr - 1}\, e^{-2Ns^2 z(1 - z)} \qquad (8)$$

Since $D = \tfrac{1}{2}\left(\tfrac{1}{2} - z\right)$ (assuming all gene frequencies = 0.5),

$$E(D^*) = 2 \int_0^{1/2} \tfrac{1}{2}\left(\tfrac{1}{2} - z\right) \phi(z)\, dz \qquad (9)$$

We have evaluated the above expression numerically. For N = 667, and s = .1, we have

E(D*) = .0962 for r = .003
      = .0677 for r = .005

The predicted linkage disequilibrium for r = .003 falls far short of the observed value. At the recombination fraction r = .005, we have noted that there are apparently at least two stable equilibria, and the lower one does not differ greatly from that predicted by equation (9). The other equilibrium (D* ≈ .24) shows much stronger linkage disequilibrium, and cannot be explained by genetic drift alone.

Clearly other factors need to be invoked to explain the strong linkage disequilibrium observed in these simulation experiments, although genetic drift could account for some of the initial rise in D*, especially at the looser recombination value.
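The numerical evaluation can be repeated with a short integration routine. The sketch below reads the equilibrium density as proportional to [z(1-z)]^(2Nr-1) exp(-2Ns² z(1-z)); this is our reconstruction, with the sign of the exponent following from the drift and variance terms Δ(z) and V(z), and the integration method (a midpoint rule) is our choice rather than the authors'.

```python
import math

def expected_abs_d(n_e, r, s, steps=20000):
    """E|D| under phi(z) ~ [z(1-z)]^(2Nr-1) * exp(-2 N s^2 z(1-z)).

    Midpoint-rule integration over 0 < z < 1, with D = (1/2)(1/2 - z).
    """
    a = 2.0 * n_e * r - 1.0
    b = 2.0 * n_e * s ** 2
    total = weighted = 0.0
    for i in range(steps):
        z = (i + 0.5) / steps
        u = z * (1.0 - z)
        density = math.exp(a * math.log(u) - b * u)   # unnormalised phi(z)
        total += density
        weighted += abs(0.5 * (0.5 - z)) * density
    return weighted / total

print(round(expected_abs_d(667, 0.003, 0.1), 4))
print(round(expected_abs_d(667, 0.005, 0.1), 4))
```

Under this reading the routine returns values close to the two quoted in the text. Setting s = 0 recovers the purely neutral case, in which the density is a symmetric beta distribution.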

The contribution of higher-order interactions to disequilibrium between pairs of loci

TABLE 3

The genotypic array for two symmetrically overdominant loci showing frequencies under random mating and fitnesses

in the interactions between all pairs of loci. The following example will make this clearer.

Consider two symmetrically overdominant loci, with fitnesses shown in Table 3. The ratio of the fitness of an individual homozygous AA to the heterozygote Aa is

$$\frac{W_{AA}}{W_{Aa}} = \frac{g_{11}^2 W_1 W_2 + 2 g_{11} g_{10} W_1 + g_{10}^2 W_1 W_2}{g_{11}^2 + 2 g_{11} g_{10} + g_{10}^2} \Bigg/ \frac{g_{11} g_{01} W_2 + g_{11} g_{00} + g_{10} g_{01} + g_{10} g_{00} W_2}{g_{11} g_{01} + g_{11} g_{00} + g_{10} g_{01} + g_{10} g_{00}}$$

$$= (1 - s_1)\, \frac{(1 - s_2)\, p_A^2 + 2 s_2\, g_{11} g_{10}}{p_A q_A - s_2 (g_{11} g_{01} + g_{10} g_{00})} \cdot \frac{q_A}{p_A} \qquad (10)$$

where $s_1 = 1 - W_1$; $s_2 = 1 - W_2$ and $p_A = 1 - q_A$ = frequency of the A gene.

Because of the symmetry in selective values, at equilibrium the gene frequency at each locus will be 1/2, and $g_{11} = g_{00} = \tfrac{1}{4} + D$; $g_{10} = g_{01} = \tfrac{1}{4} - D$. Then

$$\frac{W_{AA}}{W_{Aa}} = (1 - s_1)\, \frac{\tfrac{1}{4}(1 - s_2) + 2 s_2 \left(\tfrac{1}{16} - D^2\right)}{\tfrac{1}{4} - 2 s_2 \left(\tfrac{1}{16} - D^2\right)} = (1 - s_1)\, \frac{1 - s_2\left(\tfrac{1}{2} + 8D^2\right)}{1 - s_2\left(\tfrac{1}{2} - 8D^2\right)} \qquad (11)$$

Since 0 < s < 1, and 0 < |D| < 1/4, expression (11) has a maximum value of (1 - s) when D = 0, and a minimum, $(1 - s)^2$, when D = ± 1/4. For small s, (11) is approximately $1 - s(1 + 16D^2)$.
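The fitness ratio in this example can be evaluated directly from the gametic frequencies and compared with the reduced form at gene frequencies of 1/2. The sketch below is ours: the function names are illustrative, and the reduced form (1 - s₁)[1 - s₂(½ + 8D²)]/[1 - s₂(½ - 8D²)] is our simplification of the symmetric case rather than a quotation.

```python
def marginal_fitness_ratio(g11, g10, g01, g00, s1, s2):
    """W_AA / W_Aa at the first locus, averaging over the second locus.

    Heterozygotes have fitness 1, homozygotes 1 - s1 (first locus) and
    1 - s2 (second locus); fitnesses multiply across loci.
    """
    w1, w2 = 1.0 - s1, 1.0 - s2
    w_aa = (g11**2 * w1 * w2 + 2 * g11 * g10 * w1 + g10**2 * w1 * w2) \
        / (g11 + g10) ** 2
    w_het = (g11 * g01 * w2 + g11 * g00 + g10 * g01 + g10 * g00 * w2) \
        / ((g11 + g10) * (g01 + g00))
    return w_aa / w_het

def closed_form(d, s1, s2):
    """Reduced form of the ratio at gene frequencies 1/2."""
    return (1 - s1) * (1 - s2 * (0.5 + 8 * d**2)) / (1 - s2 * (0.5 - 8 * d**2))

s = 0.1
for d in (0.0, 0.1, 0.25):
    g11 = g00 = 0.25 + d
    g10 = g01 = 0.25 - d
    print(d, marginal_fitness_ratio(g11, g10, g01, g00, s, s), closed_form(d, s, s))
```

The ratio falls from 1 - s at D = 0 to (1 - s)² at complete association, so the apparent selection against the homozygote at one locus roughly doubles as the second locus becomes fully correlated with it.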

It is obvious that the selection coefficients at each locus are a function not only of the effects at each locus, but also of selective differences at all other loci which are correlated in gene frequency. We can therefore distinguish two kinds of contribution to the selective differences at a locus: the intrinsic and the extrinsic selective values. The former is the effect on the fitness of an individual organism of substituting, as by mutation, one genotype for another at a locus. For example, in the above model the effect of substituting a homozygote at the first locus for a heterozygote is to reduce the fitness of the organism to $(1 - s_1)$ times its former value. The intrinsic selective value of a genotype is not necessarily independent of the background genotype. The extrinsic selective value is determined solely by the background genotype, and is the combined contribution of all other loci. The resultant, often referred to as the marginal fitness (LEWONTIN and KOJIMA 1960; BODMER and FELSENSTEIN 1967), or the apparent selective value (SVED 1968) is closely related to FISHER’S (1941) concept of the average excess of a gene substitution, as opposed to the effect of substituting one allele for another (the average effect of a gene substitution). There are important differences, however. FISHER’S concepts are defined for genes, not genotypes, and are a function of the gene frequencies at the locus in question.

This phenomenon is very important when we consider more than two interacting loci. Increasing the intensity of linkage disequilibrium in a block of loci increases the selective differences at each locus, with a corresponding increase in the interactions among the set of loci. We know from exact 2-locus theory that epistasis can induce linkage disequilibrium, and there will be, under certain circumstances, a positive feedback, i.e., linkage disequilibrium producing epistasis which in turn causes stronger linkage disequilibrium. As we might expect, this phenomenon is most marked when there are many segregating loci, for it is only then that we have the potentiality for large contributions from other loci to the selective values at a locus.

In a finite population, where initial disequilibrium is generated by sampling, we might expect a localized increase in linkage disequilibrium which then spreads to all loci in a block. For example, a number of loci might by chance become highly associated, with each component temporarily having a marginal fitness approximately equal to the product of the fitness of all associated loci. These loci will now interact strongly with other adjacent loci, and eventually many other loci should “crystallize” into a large supergene. Such “crystallizations” appear to occur in the simulation experiments. Figure 5 shows this process in one run. The abscissa represents position along the chromosome from locus 1 to locus 36 and the ordinate shows the value of D between adjacent loci for each interval. Each curve is a “map” of D in a successively later generation. As the figure shows, nearly all values of D are low for the first 120 generations. In generation 60 there is a rather high value at interval 7-8 and this develops into a crystallization point. A second high point in generation 60 is at interval 15-16, but this turns out not to be exactly at the crystallization nucleus which is interval 14-15. Especially dramatic changes are seen to the left of interval 7-8 and to the right of interval 14-15 where between generation 150 and 200 the correlation is pulled from very low values to nearly unity by the presence of the “crystallization nuclei.”


The hypothesis outlined above accounts very well for the difference between the simulations at different recombination values. Evidently the disequilibrium generated by finite population size induces enough interaction between loci to increase linkage disequilibrium deterministically for r = .003, but not for r = .005. In the latter case the genotypic array proceeds to a new high equilibrium value only if the average value of D* is increased to approximately 0.14. This could be achieved by reducing the population size. This hypothesis predicts that there will exist, for an appropriate set of selection coefficients and recombination values, two kinds of stable equilibria for three or more loci. One of these is the array in which all loci are in linkage equilibrium (i.e., all D = 0), and the other, a set of equilibria with linkage disequilibrium parameters not equal to zero. Numerical calculations with five loci support this prediction (see Figure 4).

Robustness of the system to changes in the model

1) Asymmetrical fitnesses

In the symmetrical model chosen above, the fitness of a genotype is a function only of the number of homozygous loci, irrespective of whether they are 1/1 or 0/0. Any of the $2^{36}$ possible gametic types, and its complement, may predominate and the stable gametic frequencies will be indistinguishable. To put it another way, the stability of the system is invariant to interchange of the alleles at any locus or set of loci. This is clearly an artifact of the symmetry of the fitnesses assigned to each locus. To investigate the effect of asymmetry on the system, we chose a rather extreme case, namely homozygote fitnesses of 0.9375 and 0.75 relative to a heterozygote fitness of 1 at each locus, which in the absence of interaction at other loci predicts a stable equilibrium gene frequency of 0.8 for the favored allele in each case. This set of fitnesses was chosen to give the same genetic load per locus at equilibrium as the symmetrical case with both homozygote fitnesses equal to 0.9. Simulation using an effective population size of 667 and a recombination fraction of r = .003 showed that, as before, the initial array of gametic types in linkage equilibrium is reduced to a few predominant gametic types in roughly equal frequency. Table 4 shows the gametic composition after 700 generations of selection. The array shown in this table is not yet at equilibrium, the third gametic type shown

TABLE 4

Gametic array near equilibrium from simulation with asymmetrical fitnesses

Frequencies are 10-generation averages after 700 generations of selection starting in linkage equilibrium. Homozygote fitnesses .9375 and .75; heterozygote fitness 1.

Gamete                                             Frequency
111 010 110 101 110 101 001 011 110 101 101 101    .265
000 111 001 010 101 110 110 111 101 111 010 110    .134
000 111 001 010 101 110 110 111 101 111 010 111    .108
111 101 111 101 111 011 111 100 111 010 111 011    .111


in the array having arisen from a negligible frequency only in the last 40 generations of the run. Reduction in the variety of gametic types at this point in the run was so slow, however, that the run was terminated for reasons of economy. Nevertheless, the result, even at this point, is obvious and not clearly different from the symmetrical fitness runs.

All $2^{35}$ distinguishable equilibria in the asymmetrical case with non-zero linkage disequilibrium appear to be stable, at least insofar as we were able to push runs close to equilibrium. On the average, one of these will be chosen, by chance, with an approximately equal number of favored and unfavored alleles on each gamete.

One effect of asymmetry was that there was a greater occurrence of fixation at each locus, despite the fact that at equilibrium all the unfixed loci are near a gene frequency of .5. This suggests that all the $2^{35}$ equilibria are not equally stable in the sense that the returning force toward equilibrium from a random perturbation is not equally strong. In order to better understand the multiple equilibria in the asymmetric cases, we returned to a 5-locus model for which exact numerical evaluation is possible. The model had the same selection and linkage parameters as the 36-locus asymmetrical case. Because of the existence of multiple stable states of unknown domains of stability, we used a numerical method that traces the trajectory of gametic frequencies from some initial frequency array and then finds the stable point in the domain of that trajectory by a method similar to the technique of “hill-climbing.” As initial conditions for each case we chose a different pair of complementary gametes, such as 00000 and 11111 or 01101 and 10010, gave them each a frequency of .5, and searched for the stable point. Because all loci are identical in action, there is complete symmetry between loci 1 and 5, and between loci 2 and 4. Thus, there are only ten distinguishable complementary pairs and we looked for stable points corresponding to all these. The complete gametic array of all 32 gametic types for all ten cases is too extensive to display, but Table 5 attempts to summarize their main features. The first column symbolizes the initial condition from which the stable state was reached, i.e., 00000 means that the initial gametic frequencies were .5 of that gametic type and .5 of its complement. The next two columns show the four most common gametic types at equilibrium and their frequency. The next four columns show the values of $\rho^2$ between adjacent genes. The eighth column gives the frequency of the allele 1 at locus 3 (it differs slightly over loci depending upon distance from the end of the chromosome) and the last column gives the mean fitness at equilibrium.
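The count of ten distinguishable complementary pairs can be checked by brute force. The sketch below is only an illustration, not the authors' procedure: it assumes five equally spaced, identically acting loci, so that a gamete, its complement, and the mirror images of both (loci read from the other end of the chromosome) all describe the same starting condition. The function names are hypothetical.

```python
from itertools import product

def complement(gamete):
    """Swap the 0 and 1 alleles at every locus."""
    return "".join("1" if allele == "0" else "0" for allele in gamete)

def canonical_pair(gamete):
    """Smallest label among a gamete, its complement, and their mirror images.

    Assumption of this sketch: reading the chromosome from either end gives an
    equivalent starting condition because the loci are identical in action.
    """
    mirror = gamete[::-1]
    return min(gamete, complement(gamete), mirror, complement(mirror))

def distinguishable_pairs(n_loci):
    """One representative gamete per distinguishable complementary pair."""
    gametes = ("".join(bits) for bits in product("01", repeat=n_loci))
    return sorted({canonical_pair(g) for g in gametes})

pairs = distinguishable_pairs(5)
for representative in pairs:
    print(representative, complement(representative))
print(len(pairs), "distinguishable complementary pairs")
```

Under these assumptions the example pair 01101 and 10010 from the text falls in the same class as 01001 and its complement.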


TABLE 5

Alternate stable states for a 5-locus asymmetrical model with homozygote fitnesses .9375 and .75, heterozygote fitness 1, and r = .003

Frequency of allele 1 at locus 3 is p(3).

                                                                                 rho^2 between adjacent loci
Most common gametes                  Frequency at equilibrium                    1,2     2,3     3,4     4,5
00000 11111 00001 10000              .6683 .1637 .0221 .0221                     .7210   .7919   .7919   .7210
00000 00001 11110 00010              .5104 .1687 .1416 .0263                     .6494   .7107   .6556   .0350
00000 00010 00001 11100              .4630 .1724 .0574 .0509                     .5965   .6369   .0482   .0417
00000 00100 11011 (11000) (00011)    .4619 .1751 .1105 .0386                     .5681   .0529   .0529   .5681
00000 11100 00011 00010              .4607 .1205 .1123 .0644                     .5203   .5225   .0364   .2878
00000 00001 10000 01110              .4052 .1235 .1235 .1133                     .0207   .5196   .5196   .0207
00000 00001 00110 11000              .3808 .1034 .0862 .0782                     .2386   .0161   .244?   .00??
00000 00001 00010 00100 01000 10000  .3277 .0819 .0819 .0819 .0819 .0819         0.0     0.0     0.0     0.0

[Columns for the initial pair, p(3), and mean fitness are not legible in this copy; the only surviving initial-pair entries are 00101, 01001, and 01010, bracketed together.]


intermixed. The chromosomes with dispersed 1 and 0 alleles are those with a high “recombination index” according to FRASER (1967). He pointed out that for normalizing selection, the higher the recombination index, as measured by the number of switches from 1 to 0 and from 0 to 1 along the chromosome, the smaller the loss in fitness because of “recombination load.” In our case, with heterotic selection, we observe the opposite effect, coherent chromosomes showing the greatest fitness, probably because of the asymmetry of fitnesses.

We call special attention to the phenomenon of equilibrium gene frequencies being closer to .5 when there is linkage. While the effect is very pronounced in 36-locus models (Table 4), the 5-locus models show it (Table 5) and so do all the asymmetrical models discussed by LEWONTIN (1964a,b), as well as the symmetrical model of BODMER and PARSONS (1962). The reason for the effect is clear. As linkage tightens, we are more and more dealing with a supergene with “pseudoalleles.” While the equilibrium predicted for single alleles with homozygous fitnesses .9375 and .75 is

$$\hat{q} = \frac{s}{s+t} = \frac{.0625}{.0625 + .2500} = .20,$$

the equilibrium predicted for “pseudoalleles” with homozygous fitnesses $(.9375)^{36} = .09793$ and $(.75)^{36} = .00003178$ is

$$\hat{q} = \frac{s}{s+t} = \frac{.90207}{.90207 + .99997} = .473.$$

In general, writing $W_1$ and $W_2$ for the two single-locus homozygote fitnesses,

$$\hat{q} = \frac{1 - W_1^{\,n}}{2 - (W_1^{\,n} + W_2^{\,n})}, \qquad \text{so} \quad \lim_{n \to \infty} \hat{q} = \tfrac{1}{2} \quad \text{if } 0 < W_1, W_2 < 1.$$
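The pseudoallele argument is easy to check numerically. The following is a minimal sketch, assuming that a block of n tightly linked loci behaves as a single overdominant locus whose homozygote fitnesses are the n-th powers of the single-locus values; the function name is hypothetical. Direct evaluation for n = 36 gives about 0.474, close to the value quoted above.

```python
def pseudoallele_equilibrium(w_a, w_b, n=1):
    """Equilibrium frequency s / (s + t) for a block of n loci treated as one locus.

    w_a and w_b are the single-locus homozygote fitnesses (heterozygote = 1);
    the block homozygotes are assumed to have fitnesses w_a**n and w_b**n.
    """
    s = 1.0 - w_a ** n
    t = 1.0 - w_b ** n
    return s / (s + t)

single_locus = pseudoallele_equilibrium(0.9375, 0.75)
block_of_36 = pseudoallele_equilibrium(0.9375, 0.75, n=36)
very_large_block = pseudoallele_equilibrium(0.9375, 0.75, n=1000)

print(single_locus, block_of_36, very_large_block)
```

As n grows, both block homozygote fitnesses approach zero and the equilibrium approaches one half regardless of the single-locus asymmetry.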

This effect means that tight linkage with strong linkage disequilibrium is a preserver of genic variation and that relative selective values of single gene substitutions cannot be judged from allele frequencies. The frequencies of polymorphic alleles observed by LEWONTIN and HUBBY (1966) in their study of enzyme polymorphism, thus, bear no necessary relationship to the relative selective values of the various genotypes when viewed as single locus effects. The selection of the chromosome as a whole is the overriding determiner of allelic frequencies.

2) Unequally spaced loci


3) Other selection models

The model of individual gene effects on fitness multiplying each other is to some extent unrealistic and has been strongly criticized as naive by MAYR (1963), KING (1967), MILKMAN (1967), and by SVED, REED and BODMER (1967). They point out that a more realistic model of selection is one in which some proportion of the population survives, irrespective of the exact genotypic composition, reflecting the severity of the environment. Thus $\bar{W}$ does not change as the population evolves, since $\bar{W}$ is simply the proportion of the population surviving. Rather, the relative fitnesses of the various genotypes change.

In particular, KING suggested a specific model in which the value of some phenotypic character is increased linearly with increasing heterozygosity but with incomplete heritability. The result is that if many loci are involved, phenotype will be normally distributed. Selection then occurs by truncation, saving the proportion K of the population with the highest phenotypic score. This model has many interesting aspects and is worthy of a completely separate paper. For our present purposes we show the results of a single case. Setting the mean fitness at .20 (80% of the population culled) and recombination between adjacent loci at .003 provides a model close to the multiplicative cases we have dealt with. In addition, heritability in the broad sense was specified as 10%. Table 6 shows the resulting gametic types and their frequencies in several runs. Comparison with Table 2 shows that there is essentially no difference.
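The selection step of such a truncation model might be sketched as follows. This is an illustration under stated assumptions, not the simulation program used for Table 6: the phenotype is taken to be the number of heterozygous loci plus normally distributed environmental noise, the noise variance is chosen so that genotypic variance makes up 10% of the phenotypic variance, and the top 20% of phenotypes are saved. The function name, the random starting population, and the seed are all hypothetical.

```python
import random
import statistics

def truncation_select(het_counts, saved_fraction=0.20, heritability=0.10, rng=None):
    """Return the indices of the individuals saved by truncation selection.

    het_counts[i] is the number of heterozygous loci in individual i. The
    environmental standard deviation is set so that
    Var(G) / (Var(G) + Var(E)) equals the requested broad-sense heritability.
    """
    rng = rng or random.Random()
    var_g = statistics.pvariance(het_counts)
    sd_e = (var_g * (1.0 - heritability) / heritability) ** 0.5
    phenotypes = [h + rng.gauss(0.0, sd_e) for h in het_counts]
    n_saved = round(saved_fraction * len(het_counts))
    ranked = sorted(range(len(het_counts)), key=lambda i: phenotypes[i], reverse=True)
    return ranked[:n_saved]

rng = random.Random(1)
n_loci, n_individuals = 36, 1000
# Hypothetical starting population: each locus heterozygous with probability 0.5.
het_counts = [sum(rng.random() < 0.5 for _ in range(n_loci))
              for _ in range(n_individuals)]

survivors = truncation_select(het_counts, rng=rng)
mean_all = statistics.mean(het_counts)
mean_saved = statistics.mean(het_counts[i] for i in survivors)
print(len(survivors), mean_all, mean_saved)
```

Because the proportion saved is fixed in advance, the mean fitness stays constant while the relative fitnesses of genotypes depend on the current composition of the population, which is the feature of the model stressed in the text.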

Thus, once again our results are robust to a change in the model. These observations are in accord with the results reported by WILLS, CRENSHAW and VITALE (1970), who simulated a multilocus model using a truncation selection scheme. They too found the build-up of pronounced linkages with this selection scheme.

Passing to the limit

We have seen in our models that the effects of linkage on the genetic composition of the population increase as the number of loci considered increases. Thirty-six-locus models show much greater correlation between loci than do 5-locus models for the same selection intensities, and both show effects not predictable from exact 2-locus theory. Clearly, higher-order interactions among loci are not

TABLE 6

Equilibrium gametic arrays from proportional selection model of KING, with $\bar{W}$ = .20, $h^2$ = .10, and r = .003

Gamete                                             Frequency

Run 1, N = 1000, started in linkage equilibrium
000 101 110 001 001 010 101 111 110 011 001 000    .406
111 010 001 110 110 001 000 000 001 100 110 111    .405
Others                                             .189

Run 2, N = 800, started from complete coupling
111 111 111 111 111 111 111 111 111 111 111 111    [illegible]


negligible. We cannot, however, continue to increase the number of loci indefinitely while holding the effect of a gene substitution constant. Any realistic view of the genome must suppose that as the number of segregating loci affecting fitness increases, the effect of a single gene substitution must decrease.

Suppose we have a chromosome of some fixed map length l. Moreover, suppose that the fitness of chromosomal homozygotes for this element is K as compared to unity for a chromosomal heterozygote. (We are again assuming a symmetrical model so that all chromosomal homozygotes have the same fitness.) What will happen as we pack more and more segregating genes into this fixed map length with a fixed amount of inbreeding depression? Two compensatory effects will occur. As more genes are packed into the linkage group, the recombination distance between adjacent loci will decrease in inverse proportion to gene number, but the effect of a gene substitution on fitness will grow smaller. In a multiplicative model, the fitness, 1 - s, of a homozygote at any one locus will be

$$1 - s = K^{1/n} \qquad \text{or} \qquad \ln(1 - s) = \frac{\ln K}{n},$$

where n is the number of loci. For large n, 1 - s will be very close to 1 so that the effect of a single gene substitution will be

$$s \approx -\frac{\ln K}{n}.$$

That is, it will decrease in proportion to 1/n. The degree of epistasis as measured by the deviation, $\varepsilon$, from additivity is proportional to $s^2$, however, so epistasis decreases as $1/n^2$. Now 2-locus theory suggests that the effect of linkage and selection as measured by correlation between loci depends upon the ratio $\varepsilon/r$. If this ratio is large, either because epistasis is great or because recombination is rare, there will be a large correlation between loci. If it falls below a critical value, there will be no correlation at equilibrium. For the present case, then, $\varepsilon/r \sim n/n^2 \sim 1/n$. Thus 2-locus theory predicts that as we increase the number of loci packed into a fixed chromosome length with a fixed inbreeding depression, the correlation between loci should decrease. This does not take into account, however, the higher-order interactions. Are these sufficient to counterbalance the decrease in effect, or will the importance of linkage grow vanishingly small?
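The scaling argument can be illustrated numerically. The sketch below is a rough check rather than part of the original analysis: it uses $s^2$ as a stand-in for the epistatic deviation, takes the spacing between adjacent loci as r = l/(n + 1), and picks a map length of 0.10 purely for illustration. The function names are hypothetical.

```python
import math

def per_locus_selection(K, n):
    """Selection coefficient s at one locus when n loci share the block fitness K."""
    return 1.0 - K ** (1.0 / n)

def epistasis_over_recombination(K, l, n):
    """Ratio s**2 / r, with s**2 standing in for the epistatic deviation and
    r = l / (n + 1) the recombination between adjacent loci."""
    s = per_locus_selection(K, n)
    r = l / (n + 1)
    return s * s / r

K = 0.0225   # strong-selection value used in the text
l = 0.10     # illustrative map length (an assumption of this sketch)
loci = (2, 5, 18, 36, 360)
ratios = [epistasis_over_recombination(K, l, n) for n in loci]

for n, ratio in zip(loci, ratios):
    s = per_locus_selection(K, n)
    # Columns: n, exact s, large-n approximation -ln(K)/n, ratio s**2 / r
    print(n, round(s, 5), round(-math.log(K) / n, 5), round(ratio, 3))
```

The approximation $s \approx -\ln K / n$ is poor for two loci but close for several hundred, and the ratio falls steadily as n increases, which is the two-locus prediction that the multilocus results go on to contradict.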

To answer these questions we have carried out a large set of numerical calculations and simulations with two different values of K (.0225 and .4832), a range of total map length up to 60 centimorgans, and n = 2, 5, 18, 36, and 360.

The total map length is measured by

$$l = (n + 1)\,r.$$

This expression ignores the nonlinear relationship between map length and recombination values. There will, however, only be a significant discrepancy when we are dealing with large chromosome segments, and we are concerned in this paper primarily with small recombination fractions between loci. We have chosen l = (n + 1)r because the average recombination between a pair of adjacent randomly distributed loci on a chromosome of length l is r = l/(n + 1).

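The average squared correlation at equilibrium among all pairs of loci, $\overline{\rho^2}$, is used as the measure of the effect of linkage. For two loci $\rho^2$ is analytically a linear function of l. From LEWONTIN and KOJIMA (1960) for this symmetrical case

$$D = \pm\frac{1}{4}\sqrt{1 - \frac{4r}{(1 - W)^2}}$$

so, since both gene frequencies are .5 and hence $\rho = 4D$,

$$\rho^2 = 1 - \frac{4r}{(1 - W)^2}.$$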

For five loci, exact numerical computation was employed and $\overline{\rho^2}$ was directly calculated over all pairs of loci. For 18, 36, and 360 loci, simulation was used and $\overline{\rho^2}$ was estimated from the variance relation (3) given earlier in this paper.

The results of these calculations are shown in Figures 6a and 6b. In 6a, K = 0.0225, while in 6b, K = 0.4832. Table 7 gives the fitness of a single-locus homozygote for these two cases with different numbers of loci, from relation (13). Attention is called to the extremely small fitness effects per single locus for large numbers of loci.

Figures 6a and 6b show a remarkable phenomenon. As predicted by 2-locus theory, the line relating $\overline{\rho^2}$ to l has a negative slope and the 5-locus line lies entirely beneath the 2-locus line. That is, increasing the number of loci in the interval and sharing out the fitness among them does lessen the effect of the linkage as measured by $\overline{\rho^2}$ for any given map length. When we pass from 5 to 18 loci, there is again a decrease in both cases but a much smaller one than when passing from 2 to 5 loci. When the loci are now doubled to 36, there is only a barely perceptible change in the line in the case of strong selection and a small change for weak selection. Finally, for 360 loci there is no change at all for the stronger selection, while for the weak selection the simulation gives such high variance between generations that it was not possible to identify an equilibrium condition. Thus we have a property of invariance appearing as the number of

TABLE 7

Fitness of a homozygote at a single locus among n loci when the fitness of the n-tuple homozygote = K

            K
n       .0225    .4832
2       .1501    .6951
5       .4683    .8646
18      .8100    .9604
36      .9000    .9800
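The entries of Table 7 follow from the multiplicative relation $1 - s = K^{1/n}$. The sketch below recomputes them; it assumes that the two block fitnesses were generated as $0.9^{36}$ and $0.98^{36}$, which round to the quoted .0225 and .4832 and reproduce the tabulated values to four places. That choice, and the function name, are assumptions of this illustration.

```python
def single_locus_homozygote_fitness(K, n):
    """Fitness of a homozygote at one of n loci when the n-tuple homozygote has fitness K."""
    return K ** (1.0 / n)

K_strong = 0.9 ** 36    # about .0225 (assumed origin of the strong-selection value)
K_weak = 0.98 ** 36     # about .4832 (assumed origin of the weak-selection value)

table = {
    n: (single_locus_homozygote_fitness(K_strong, n),
        single_locus_homozygote_fitness(K_weak, n))
    for n in (2, 5, 18, 36)
}

for n, (strong, weak) in table.items():
    print(n, round(strong, 4), round(weak, 4))
```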


FIGURE 6.--The relation between the average correlation among all pairs of loci, on the ordinate and total map length on the abscissa, for different numbers of loci making up the total map. (a) Strong selection: K = 0.0225; (b) weak selection: K = 0.4832. Solid line: 2 loci; broken line: 5 loci; dashed line: 18 loci (solid circles), 36 loci (crosses), 360 loci (open circles).

loci increases. For more than a couple of dozen loci, the average correlation between genes on the chromosome is independent of the number of genes or their individual effects and depends only on the total map length of the chromosomes and the total inbreeding depression. Apparently, higher-order effects do come into play in a significant way with many genes so as to just cancel out the weakening of first-order effects. Once again 2-locus theory is seen to be inadequate in an important way for prediction of multilocus phenomena.


not depend simply on the single parameter, selection per unit map length. For example, from Figure 6 we see that for 36 loci in the case of strong selection, $\overline{\rho^2}$ = .8 when homozygous load per centimorgan is .9775/9 = .109. The same homozygous load per centimorgan for weak selection occurs for a map length of 4.7 centimorgans, where $\overline{\rho^2}$ = 0 for 36 loci.

GENERAL IMPLICATIONS OF THE RESULTS

Two-locus theory has shown that linkage disequilibrium will be generated among loci interacting multiplicatively provided that the loci involved are sufficiently tightly linked. We have shown in this paper, and this is not apparent from 2-locus theory, that the degree of linkage disequilibrium between a pair of loci is not simply a function of the fitnesses of the 2-locus system (i.e., the average effects of gene substitution at these loci), but may be overwhelmingly determined by the average effects of other loci forming a linked complex with the loci in question.

It appears likely, from the data presented above, that the disequilibrium between pairs of loci is a function primarily of the map length of a chromosome segment and the loss in fitness accompanying homozygosity of this segment. Furthermore, this appears to hold under a wide range of conditions, and in particular the average correlation in gene frequency between a pair of loci on a chromosome segment is largely independent of the number of loci, and hence the average effect of a locus, in that segment. In general the average degree of disequilibrium between a pair of loci on a chromosome segment will be somewhat less than that expected between two loci sharing the load of the segment and lying the segment length apart, and greater than that expected if the fitness differences are spread uniformly along the map. The former value (i.e., the upper bound) is easily calculated from 2-locus theory, but as yet we have been unable to derive the values for the limiting case when the chromosome becomes a quasi-continuum.

The model discussed above provides a possible explanation for the origin and persistence of inversions in natural populations. If we look at the gametic array formed in one of the simulated populations, we find considerable organization, and it should be apparent that an inversion encompassing the block of loci would be at a selective advantage. If this were the case, it would not be necessary that the selection coefficients which initially induced the disequilibrium within the set of heterotic loci be maintained; it would only be necessary that some degree of heterosis persist. If inversions do arise this way, it should be noted that co-adaptation of the alleles within an inversion is not necessary. Heterosis is the only prerequisite. Co-adaptation may, of course, arise later (see KOJIMA 1967).


tion genetic theory is framed in terms of the frequencies of alleles and the effect on fitness of gene substitution at those loci. Observations, with the rare exception of a few single-locus polymorphisms, deal not with the frequencies of alleles and their fitness effects but with phenotypes dependent on the whole genome and with differences in fitness associated with genetic differences of large pieces of the genome. The recent work on enzyme polymorphisms has provided information on frequencies of alleles at single loci. But it has told nothing about the effect on fitness of single-locus substitutions. Indeed, the main hiatus between theory and experiment is that theory talks about single-locus fitness, while no one has ever invented a method for measuring it. Every population geneticist who has ever worked with experimental animals and plants knows that, with the exception of a few major single-locus polymorphisms, the attempt to measure the difference in fitness between homozygote and heterozygote at a single locus is frustrating. First, most single-locus fitness effects must be very small and, second, it is impossible to know whether the results of two crosses differ only at the locus in question or in a whole block of genes surrounding the locus. The usual approach of backcrossing to a common parent for eight or a dozen generations (more often three or four generations) is self-delusion. Such backcrossing procedures certainly do not free a gene of the surrounding genetic material with which it is associated. Genes five units on either side of a marker will retain 50% of their correlation with the marker for 14 generations of backcrossing even without selection. Thus, the attempt to measure single-locus fitness effects for most loci is doomed to failure until methods of analysis several orders of magnitude more sensitive than we now have are created.
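The 14-generation figure can be recovered with a short calculation. The sketch below assumes that the association between a marker and a gene at recombination fraction r decays by a factor (1 - r) in each backcross generation, and that "five units" means r = 0.05; both readings, and the function name, are assumptions of this illustration.

```python
import math

def generations_until(fraction_left, r):
    """Smallest whole number of generations g with (1 - r)**g <= fraction_left."""
    return math.ceil(math.log(fraction_left) / math.log(1.0 - r))

r = 0.05                      # five map units, read as a recombination fraction
g = generations_until(0.5, r)

# Association remaining just before and at generation g
print(g, (1 - r) ** (g - 1), (1 - r) ** g)
```

Under these assumptions a little over half of the original association is still present after 13 generations, so a backcross programme of three or four generations, or even a dozen, leaves most of it intact.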

What the geneticist can and does measure, however, are the fitness effects that occur when whole blocks of genes, usually whole chromosomes, are made homozygous and heterozygous. A theory of population genetics that is framed in terms of chromosome segments, their map length, and their fitness effects per unit map length, would then be a theory that would finally link up with the facts of observation, much as biometrical genetics was an attempt to build a theory in terms of the observable statistics of phenotype. We need a new formalism of population genetic theory in which the observables from experiment are the entities of theory. Our results in this paper give us some hope that such a formulation is possible.

But the problem is more than an epistemological one. Suppose one could succeed in completely randomizing the surrounding genetic material of a given locus. Suppose further that one could measure the very small fitness effects at that locus on the now randomized genetic background. Of what use would that be?


In the data presented in this paper the strong association between loci is a result of the predominance in the gametic array of only two gametic types, and this prediction does not appear to be compatible with the observed chromosome variation in natural populations. There are several possible explanations for this which are compatible with the above theory. One is that the whole chromosome does not form a single super locus, but condenses into a series of complexes which segregate more or less independently. Another factor, not introduced in the simulations, is the possibility of multiple allelic overdominant systems, and these offer the potential for a much greater array of chromosome types.

The final question we must concern ourselves with is a difficult one. Are selection coefficients of sufficiently great magnitude to produce this phenomenon in natural populations? It is difficult not only because we have little information about an essential parameter of the system, namely the effect on fitness of homozygosity for a segment of chromosome, but also because we do not know if the multiplicative model is a realistic one for the interaction of loci affecting fitness. Moreover, we do not even know whether the most basic feature of all the models discussed, single-locus overdominance, is the rule in nature. All our models have assumed single-locus overdominance in order that the gene frequencies will go to some intermediate equilibrium. There is nothing about work on linkage either in this paper or any previous paper that suggests that linkage alone can stabilize gene frequencies in the absence of overdominance. In fact the reverse is true. It has been shown, on the other hand, by LEWONTIN (1964b) that even in the absence of overdominance the fixation of gene frequencies by selection may be tremendously retarded by linkage. While overdominance at each locus has been inserted in the present models simply in order to maintain genetic variance indefinitely, we do not know at the present time how important it is for the effects we have observed. For example, in the absence of overdominance but with extremely small selection coefficients favoring one homozygote over the other, the change in gene frequencies might be extremely slow, and all during the course of transient polymorphism the striking linkage effects we have observed may occur. It remains for further investigation to show how essential the assumption of overdominance is.

Let us consider as an example a chromosome arm in Drosophila with a map length of 0.4 morgans. Because there is no crossing over in the male, this corresponds to l = 0.2 in the graphs. How much selection would be necessary to ensure a high degree of linkage disequilibrium among all the loci on the arm, say, with $\overline{\rho^2} > 0.5$? We can easily calculate the upper bound for K, using the formula for two loci

$$K < \left[\,1 - \sqrt{\frac{4l}{3\,(1 - \rho^2)}}\,\right]^2.$$


We therefore require a depression in fitness of 93% accompanying homozygosity of this chromosome arm, and this seems an excessively high value. Viability estimates made in Drosophila indicate much smaller inbreeding depressions than this. Even though such studies ignore important components of fitness such as sterility due to developmental and behavioral traits, depressions in fitness as large as 93% seem unlikely.
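The 93% figure can be reproduced from the two-locus relation. The sketch below assumes the symmetric multiplicative two-locus result $\rho^2 = 1 - 4r/(1 - W)^2$, with the two loci sharing the block fitness so that $W = \sqrt{K}$, and the spacing convention r = l/3 for two loci; solving for K gives the bound $K < [1 - \sqrt{4l / (3(1 - \rho^2))}]^2$. The function names are hypothetical.

```python
import math

def two_locus_rho2(K, l):
    """Squared correlation between two loci that share block fitness K over map length l."""
    r = l / 3.0               # spacing convention l = (n + 1) r with n = 2
    w = math.sqrt(K)          # single-locus homozygote fitness when two loci share K
    return max(0.0, 1.0 - 4.0 * r / (1.0 - w) ** 2)

def max_block_fitness(l, rho2):
    """Largest K for which two-locus theory still gives at least the requested rho2."""
    return (1.0 - math.sqrt(4.0 * l / (3.0 * (1.0 - rho2)))) ** 2

l = 0.2                       # effective map length of the chromosome arm
K_bound = max_block_fitness(l, 0.5)
depression = 1.0 - K_bound

print(round(K_bound, 4), round(depression, 3), two_locus_rho2(K_bound, l))
```

Under these assumptions the bound on K is a little over .07, so homozygosity for the arm would have to cost roughly 93% of fitness, in line with the figure discussed in the text.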

On the other hand, these considerations, which are based on multiplicative interaction of loci, may be very misleading (SVED, REED and BODMER 1967; KING 1967), and alternate models proposed to sidestep many of the difficulties raised by genetic load arguments introduce much stronger epistasis than discussed in this paper. We must emphasize that as the number of loci grows large the deviation of a multiplicative model from a purely additive one becomes vanishingly small per locus pair. The epistatic deviation from additivity is equal to $s^2$, where s is the selection coefficient at each locus, so that if the effect of homozygosity at a locus is to reduce the fitness by 1%, the epistatic deviation is only 0.01% per pair of loci. The expected degree of linkage disequilibrium depends on the model of gene interaction, and it appears reasonable to suppose that other models will predict much higher degrees of association. As mentioned earlier, we have investigated the truncation model discussed by KING, and indeed very strong linkage disequilibrium has been found for quite realistic models. Also we should note that the generation of linkage disequilibrium appears to be one of the rare cases in population genetics where selection and drift complement each other: both tend to increase association between loci.

Because this study emphasizes the possibility that almost neutral loci show strong linkage disequilibrium over considerable linkage distances, there is a greater need for experimental estimation of linkage disequilibrium in natural populations.

SUMMARY

