MULTILOCUS POPULATION GENETICS WITH WEAK EPISTASIS. II. EQUILIBRIUM PROPERTIES OF MULTILOCUS MODELS: WHAT IS THE UNIT OF SELECTION?

Copyright © 1986 by the Genetics Society of America

ALAN HASTINGS

Department of Mathematics and Division of Environmental Studies, University of California, Davis, California 95616

Manuscript received December 2, 1984
Revised copy accepted September 21, 1985

ABSTRACT

Using perturbation techniques, I study the equilibrium of deterministic discrete time multilocus models with weak epistasis. The most important results are on the relationship between epistasis and disequilibrium. Disequilibrium involving a particular set of loci reflects only epistasis simultaneously involving those loci. Moreover, all the disequilibria of all orders vary approximately as the inverse of the probability of at least one recombination event among the loci involved. Finally, higher order disequilibria among loci will be lower than lower order ones, even if the level of epistasis is the same at all orders. In this sense, the unit of selection is small. However, given the larger number of higher order disequilibria, these higher order disequilibria may play an important role in the computation of gametic frequencies from allelic frequencies in models with a large number of loci. Finally, I show that epistasis between blocks of loci will be averages of epistatic effects, not additions of epistatic effects. Thus, failure to find significant epistasis on a chromosomal basis does not rule out the importance of epistatic effects.

The study of two-locus models in population genetics was spurred by the fact that two loci are the minimum number for which the effects of linkage, recombination and epistasis appear. A large number of results for two-locus models have been obtained (reviewed in KARLIN 1975; EWENS 1979). A natural question is the extent to which results for multiple loci reflect two-locus results (e.g., KARLIN and LIBERMAN 1982). One way to phrase this is to ask the question (see LEWONTIN 1974), what is the unit of selection? I will give answers to this question using deterministic, discrete time models for selection with nonoverlapping generations.

A natural way to start the study of multilocus models is to assume weak

additive epistasis. Experimental results reviewed in SIMMONS and CROW (1977)

indicate that epistasis is nonzero, but weak. This would suggest the importance

of extending the results in HASTINGS (1985; hereafter referred to as I) on

the two-locus model with weak epistasis to the multilocus case.

I shall begin with a review of previous results for models with more than

two loci. Among the first studies of more than two loci were simulations

performed by LEWONTIN (1964a,b) and FRANKLIN and LEWONTIN (1970). In

the former case, a symmetric model was studied, and linkage disequilibrium was shown to be important to a greater extent than would have been predicted

by two-locus models. In the simulations of FRANKLIN and LEWONTIN (1970) a

symmetric, overdominant model with a large number of loci was studied. Here, they found very high disequilibrium values of all even orders, and for larger recombination rates than would have been predicted from two-locus theory.

For weaker and probably more realistic levels of selection, CLEGG (1978)

showed that this “crystallization” effect was not important. TURELLI and GINZBURG (1983) simulated a large number of “random” fitness matrices and found that, in general, the intuition from one-locus two-allele models that heterosis is required for a stable equilibrium held for multilocus models.

Another approach to the study of multilocus models has been to determine conditions under which an equilibrium with zero disequilibrium of any order is stable. This equilibrium has been called the product or Hardy-Weinberg equilibrium. For a wide variety of models, conditions on local stability of the

product equilibrium are obtained in KARLIN and LIBERMAN (1979a,b), KARLIN

and AVNI (1981) and KARLIN and LIBERMAN (1982). An important result they

obtain is that, generally, for nonepistatic or symmetric fitnesses, if the product equilibrium is stable for a given recombination pattern, it is also stable for any

pattern with “more” recombination. In KARLIN and LIBERMAN (1978; 1979a,b),

it is shown that the product equilibrium is locally stable for the multilocus multiallele additive model as long as there is some recombination between all

the loci. KARLIN and LIBERMAN (1982) determine when the stability of the

product equilibrium for a multilocus multiplicative model is controlled by two- locus conditions.

Another analytical approach to multilocus models has been to consider the

case of small recombination rate (see KARLIN and MCGREGOR 1972; KARLIN

1978). This is done by examining the case with no recombination and then considering the implications for the case with small recombination. This approach treats a case different from the one considered here. The results obtained by the method of small recombination give insight into solutions with

large levels of disequilibrium; the results in the current paper complement the small recombination results. In fact, the justification for the method used here

can be expressed in terms of the results in KARLIN and MCGREGOR (1972).

Finally, there have been several papers that have examined three-locus models in detail. Some of these are modifier models, with only two loci undergoing direct viability selection. The symmetric three-locus model was

studied by FELDMAN, FRANKLIN and THOMSON (1974), who explicitly found a

large number of equilibrium solutions for weak recombination. A three-locus

multiplicative model was studied by STROBECK (1976).

There have also been several studies that have considered properties of

equilibria without considering the role of stability. EWENS and THOMSON (1977)

derived a number of properties of marginal subsystems, which are used below.

HASTINGS (1984) showed that conditions for maintaining disequilibrium appear

A different approach has been taken by NAGYLAKI (1976, 1977) and then by SHAHSHAHANI (1979) and AKIN (1979). NAGYLAKI showed that, with weak selection, linkage disequilibrium disappears fairly rapidly in a two-locus model and is at a low level thereafter. SHAHSHAHANI and AKIN use sophisticated mathematical tools to examine the continuous time models for multilocus systems. In some ways, the approach of the current paper is similar in spirit to these works, by examining small deviations away from the globally stable additive model. One important difference is that, with the continuous time model, disequilibrium can be generated without epistasis by the effects of overlapping generations. Also, I concentrate on an explicit determination of the equilibrium rather than the dynamics. A primary goal of the current paper is to contrast results for multiple locus models with weak epistasis with results for two-locus models obtained in (I). There are two ways to envision potential complicating effects of multiple loci on disequilibria. First, there could be an imbedding effect that would cause an increase in the level of pairwise disequilibria in a multilocus model relative to a two-locus model. Second, the higher order disequilibria could be important if their magnitude is roughly that of the pairwise disequilibria. The primary goal of this paper is to determine whether either or both of these effects is important in the context of weak epistasis.

OVERVIEW

Additive models for multiple loci, and for nonepistatic models in general,

are a natural class of multilocus models that have been extensively studied (see

KARLIN and LIBERMAN 1979a,b; 1982). Nonepistatic models are models for which the fitness of an individual depends on independent effects at different

loci. In the additive model, the fitness of an individual is determined by summing effects at single loci; in the multiplicative model it is determined by

multiplying effects at single loci. More general nonepistatic models are also

possible (KARLIN and LIBERMAN 1979a).

For all nonepistatic models there is a product equilibrium, an equilibrium

where there is no correlation between the alleles at different loci (KARLIN and

LIBERMAN 1979a). At this equilibrium, gametes have a frequency equal to the product of the frequencies that alleles at single loci would have, as determined

by the single locus effects entering into the fitnesses. For the additive model,

KARLIN and LIBERMAN (1978; 1979a,b) have shown that this product equilibrium is locally stable, independent of the number of loci or alleles involved,

for positive recombination rates. KARLIN and LIBERMAN have also shown that

this equilibrium is stable for the multiplicative model, if recombination rates are large enough.

It is unlikely that fitnesses in natural systems are ever truly additive or truly multiplicative across loci. This is confirmed by the experimental work with

Drosophila reviewed by SIMMONS and CROW (1977). Hence, I shall consider

models where the fitnesses deviate from those of a nonepistatic model, usually

the additive model, by terms for which the size is measured by a parameter $\delta$.

attention to those cases for which the product equilibrium is locally stable (if epistasis is weak, $\delta$ is small), there remains a stable equilibrium for the model, close to the product equilibrium (see KARLIN and MCGREGOR 1972). The new equilibrium is a function of $\delta$.

The goal of this paper is to characterize this perturbed equilibrium. This is done by calculating the effect of the epistatic terms on mean fitness, allele frequencies and disequilibrium, up to first order in $\delta$. The main results are as

follows, with the calculations and a number of minor results collected in later

sections.

Result 1: The change in the mean fitness due to weak epistasis does not depend on the recombination pattern, to a first approximation. As in the additive model, the mean fitness is independent of recombination, to a first approximation. This result does not depend on the additive model, but requires that there should be a stable equilibrium with no disequilibrium, which is independent of the recombination rates.

The remainder of the analysis depends on the study of marginal systems (EWENS and THOMSON 1977), what would be observed at a subset of loci in a multiple locus system. For this purpose the following result is necessary and also is of independent interest. This result would also apply to small deviations from the multiplicative model, when the recombination rates are large enough so the product equilibrium is stable.

Result 2: For a one-locus marginal system, fitnesses depend only on epistatic

parameters actually involving the particular locus being considered.

As for the one-locus marginal fitnesses, it is straightforward to show that, if

there is no simultaneous epistasis (at the order equal to the number of loci or

greater) involving some set of m loci, then for that set of m loci there is no

epistasis (at equilibrium) for the marginal fitnesses involving that set of m loci. Thus, if in a four-locus model all the epistasis can be accounted for by two-factor interactions, then there is no deviation from additivity for all three-locus marginal systems, to lowest order in weakly epistatic systems.

Deviations away from additive fitness at a particular group of loci in a marginal system are weighted averages of the deviations away from additive fitnesses at the loci involved. In particular, the epistatic effects are not added. Making use of result 2, information about allele frequencies and disequilibrium can be deduced.

Result 3: To lowest order, weak epistasis affects allele frequencies only

through epistasis directly involving the particular locus in question. Moreover,

only the epistasis is involved, and the recombination pattern does not enter.

The next results depend on the original nonepistatic model being additive.

Result 4: Pairwise disequilibria reflect additive epistasis involving only the loci being considered. The disequilibria vary as the reciprocal of the recombination rate between the loci.

Result 5: Three-way disequilibria directly reflect the presence of additive epistasis at the loci involved. Three-way disequilibria are probably smaller than pairwise disequilibria because, in most reasonable models, the strength of ep-

frequencies at the third locus. Finally, the dependence of disequilibrium on recombination is simple: disequilibrium is roughly proportional to one divided by the probability that there is some recombination among the loci involved.

Proceeding to more than three loci in this manner leads to more and more algebra. Instead, the following conjecture is offered, with some motivation given below.

Conjecture: For weak selection, result 5 is approximately true for higher

order disequilibria, with the appropriate changes in the wording.

MULTILOCUS MODELS

Models describing the dynamics of multilocus genetic systems have been written down in a variety of places. Some care is needed in the choice of notation. I shall use a notation similar to that of KARLIN and LIBERMAN (1979a,b). Let the vector

$$i = (i_1, i_2, \ldots, i_n) \tag{1}$$

represent an n-locus gamete, where $i_\alpha$ represents the allele at locus $\alpha$. Define the fitness (viability) of an individual with gametes $i$ and $j$ as $w_{ij}$. I shall assume that there is no position or cis-trans effect in the fitnesses. The difficulty of induced position effects studied by TURELLI (1982) need not be considered, as I shall include an arbitrarily large number of loci. The definition of “enough” loci will be one result of the analysis. Let $x(j)$ represent the frequency of the gamete $j$. The marginal gametic fitness is defined as

$$w_i = \sum_j w_{ij}\, x(j) \tag{2}$$

and the mean fitness of the population as

$$\bar{w} = \sum_i x(i)\, w_i. \tag{3}$$

Then the dynamics of this system can be written as

$$x(i)' = \Big[\sum_{j,k} w_{jk}\, x(j)\, x(k)\, R(i, j, k)\Big] \Big/ \bar{w} \tag{4}$$

where $R(i, j, k)$ is the probability that an individual with gametes $j$ and $k$ produces the gamete $i$ as a result of meiosis. The inclusion of interference makes the function $R$ particularly complicated.
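As a rough illustration of recursion (4), the following Python sketch advances the gametic frequencies by one generation. It is only a sketch under stated assumptions: two alleles per locus, no interference, and $R(i, j, k)$ represented implicitly by a table of crossover masks. The names `next_generation`, `two_locus_masks` and `pairwise_d` are hypothetical and do not come from the paper.

```python
def two_locus_masks(r):
    """Crossover masks for two loci with recombination fraction r.

    A mask entry of 0 copies the allele from gamete j, 1 copies it from k.
    """
    return {(0, 0): (1 - r) / 2, (1, 1): (1 - r) / 2,
            (0, 1): r / 2, (1, 0): r / 2}


def next_generation(x, w, masks):
    """One generation of selection followed by recombination, as in (4).

    x     -- dict mapping every possible gamete (tuple of alleles) to its frequency
    w     -- function (j, k) -> viability of the genotype formed by j and k
    masks -- dict mapping crossover masks to their probabilities
    """
    gametes = list(x)
    wbar = sum(w(j, k) * x[j] * x[k] for j in gametes for k in gametes)
    new_x = dict.fromkeys(gametes, 0.0)
    for j in gametes:
        for k in gametes:
            weight = w(j, k) * x[j] * x[k] / wbar
            for mask, prob in masks.items():
                # Build the gamete produced by this crossover pattern.
                child = tuple(k[a] if m else j[a] for a, m in enumerate(mask))
                new_x[child] += weight * prob
    return new_x


def pairwise_d(x):
    """D = x(AB) x(ab) - x(Ab) x(aB) for a two-locus, two-allele system."""
    return x[(0, 0)] * x[(1, 1)] - x[(0, 1)] * x[(1, 0)]


# Without selection, D should shrink by the factor (1 - r) each generation.
x0 = {(0, 0): 0.4, (0, 1): 0.1, (1, 0): 0.2, (1, 1): 0.3}
x1 = next_generation(x0, lambda j, k: 1.0, two_locus_masks(0.2))
```

With constant fitnesses the sketch reduces to pure recombination, which gives a simple sanity check before any epistatic fitness function is supplied.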

One additional notation that will be needed below is a way to describe recombination. The notation is easiest to describe by example (cf. BENNETT 1954). The quantity $r_{12/34}$ will be the probability of a recombination event that separates loci 1 and 2 from 3 and 4, while $r_{13/2}$ is the probability of an event that separates 1 and 3 from 2. Thus, it is possible to describe recombination among fewer loci than the number included in the model and to describe the probability of no recombination. For example,

Marginal systems: The definition of marginal genetic systems (EWENS and THOMSON 1977) will be important in the discussion of the equilibrium properties of weakly epistatic systems. Think of the fitnesses actually depending on n loci when only m loci are, in fact, observed. Define a marginal m-locus subsystem of an n-locus system as the system obtained by averaging all the fitnesses over the missing loci, weighted by the appropriate frequencies of the gametes. Thus, following EWENS and THOMSON, denote the frequencies of the m-locus gamete $p$ by $z(p)$. By properly renumbering the loci, the m loci being considered can be thought of as being the ones numbered one through m in the full n-locus system. (Note that this means that the numerical order of the loci need not correspond to the physical order on the chromosome.) Thus,

$$z(p) = \sum_{i \in S_p} x(i) \tag{6}$$

where the set $S_p$ is defined as

$$S_p = \{\, i \mid i_\alpha = p_\alpha,\ \alpha = 1, \ldots, m \,\}, \tag{7}$$

the set of n-locus gametes that have the same alleles at the first m loci as the m-locus gamete $p$.

The induced marginal fitness of the genotype formed by the m-locus gametes $p$ and $q$ will be denoted by $W_{pq}$ and is obtained by averaging over all genotypic combinations making up these two m-locus gametes, weighted appropriately by fitnesses and frequencies. This yields

This definition is useful below because of the following fact found by EWENS and THOMSON. The dynamic equations for the m-locus subsystem are the same as those for a genuine m-locus system, with the marginal fitnesses taking the place of the actual fitnesses. Note, however, that the marginal fitnesses are not constants, but depend on allele frequencies and disequilibria at other loci. Thus, the marginal systems are particularly important for deducing equilibrium behavior.
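The marginal construction can be sketched in Python as follows. The frequencies $z(p)$ are sums over the set $S_p$; for the marginal fitnesses the displayed formula did not survive in this copy, so the code assumes the frequency-weighted average $W_{pq} = \sum_{i \in S_p} \sum_{j \in S_q} w_{ij}\, x(i)\, x(j) / (z(p)\, z(q))$, which is one natural reading of the verbal definition. The function name `marginal_system` and the example fitness function are hypothetical.

```python
from itertools import product


def marginal_system(x, w, m):
    """Collapse an n-locus system onto its first m loci.

    x -- dict mapping n-locus gametes (tuples) to frequencies
    w -- function (i, j) -> fitness of the genotype formed by gametes i and j
    Returns (z, W): marginal gamete frequencies z(p) and marginal fitnesses
    W[(p, q)], the latter under the ASSUMED frequency-weighted average form.
    """
    z = {}
    for i, freq in x.items():
        z[i[:m]] = z.get(i[:m], 0.0) + freq   # sum over the set S_p
    W = {}
    for p in z:
        for q in z:
            total = sum(w(i, j) * x[i] * x[j]
                        for i in x if i[:m] == p
                        for j in x if j[:m] == q)
            W[(p, q)] = total / (z[p] * z[q])
    return z, W


# Three loci with arbitrary frequencies; the fitness function is illustrative:
# 1 minus 0.1 for every locus at which the genotype is homozygous.
gametes = list(product((0, 1), repeat=3))
raw = [1, 2, 3, 4, 5, 6, 7, 8]
x = {g: v / sum(raw) for g, v in zip(gametes, raw)}


def w(i, j):
    return 1.0 - 0.1 * sum(a == b for a, b in zip(i, j))


z, W = marginal_system(x, w, m=2)
wbar_full = sum(w(i, j) * x[i] * x[j] for i in x for j in x)
wbar_marginal = sum(W[(p, q)] * z[p] * z[q] for p in z for q in z)
```

Under this assumed form the mean fitness computed from the marginal system agrees with that of the full system, which is the kind of consistency the marginal dynamics require.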

A slight change in notation will prove convenient below. Since the gametes

themselves denote the number of loci being considered, the ‘‘x” designation

for marginal systems will be used at times instead of the “z” designation. No

confusion results, since the number of loci being specified is given explicitly.

PROPERTIES OF THE ADDITIVE MODEL

As the multilocus additive model is an approximation to the weakly epistatic

case, it is necessary to review some properties of the additive model here. Both

the definitions of additive nonepistasis and the equilibrium properties of an additive model play an important role in the analysis of weakly epistatic systems.

Epistasis: The study of epistasis began with FISHER’S (1918) classic paper and was continued by many later workers (see KEMPTHORNE 1957). In an

$$w_{ij} = \sum_\alpha w_\alpha(i_\alpha, j_\alpha) \tag{9}$$

where the sum is over all the loci in the model (see KARLIN and LIBERMAN 1979a,b). In what follows, the one-locus fitnesses $w_\alpha$ are assumed to satisfy the conditions for all the alleles at locus $\alpha$ in the model to be present at equilibrium (see KINGMAN 1961).

Two-factor epistasis can be included by adding to the definition of fitness terms of the form $w_{\alpha\beta}(i_\alpha, i_\beta, j_\alpha, j_\beta)$ that depend on the alleles at the two loci $\alpha$ and $\beta$, where the sum will now be over all pairs of loci in the model. This procedure can be expanded to include epistasis with any given number of factors.

Equilibrium properties of the additive model: When fitnesses are additive, satisfying (9), a number of important properties emerge (KARLIN and LIBERMAN 1979a,b), as discussed above. From the fitnesses at each particular locus it is possible to define a one-locus equilibrium, where the frequency of the allele $i_\alpha$ at locus $\alpha$ is $x(i_\alpha)$. The equilibrium for the full model given by

$$x(i) = \prod_{\alpha=1}^{n} x(i_\alpha) \tag{10}$$

will be called the product equilibrium.
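To make (9) and (10) concrete, the sketch below builds an additive three-locus, two-allele model from one-locus viabilities, forms the product equilibrium, and checks that every gamete then has the same marginal fitness. The viabilities are hypothetical numbers chosen to be overdominant at each locus, and the function names are illustrative only.

```python
from itertools import product


def one_locus_equilibrium(w00, w01, w11):
    """Polymorphic equilibrium frequency of allele 0 at a single locus."""
    return (w01 - w11) / (2 * w01 - w00 - w11)


# Hypothetical one-locus viabilities (w00, w01, w11), overdominant at each locus.
loci = [(0.30, 0.40, 0.25), (0.20, 0.30, 0.28), (0.25, 0.30, 0.20)]


def additive_fitness(i, j):
    """Equation (9): sum of single-locus contributions.

    Alleles are coded 0/1, so a + b counts the copies of allele 1 and
    indexes the tuple (w00, w01, w11).
    """
    return sum(ws[a + b] for ws, a, b in zip(loci, i, j))


p0 = [one_locus_equilibrium(*ws) for ws in loci]

# Equation (10): each gametic frequency is the product of allele frequencies.
x = {}
for g in product((0, 1), repeat=len(loci)):
    freq = 1.0
    for p, allele in zip(p0, g):
        freq *= p if allele == 0 else 1 - p
    x[g] = freq

# At the product equilibrium every gamete has the same marginal fitness.
marginal = {i: sum(additive_fitness(i, j) * x[j] for j in x) for i in x}
wbar = sum(x[i] * marginal[i] for i in x)
```

Equal marginal fitnesses mean that selection leaves the gametic frequencies unchanged, and with zero disequilibrium recombination does too, so the product frequencies are indeed an equilibrium of the additive model.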

ANALYSIS

The first step in the analysis will be to study the definition of marginal fitnesses under weak epistasis in a multilocus system. The equilibrium of the dynamic equations (4) will be studied, using regular perturbation techniques. As mentioned above, the equilibrium is known in the additive case, and each fitness in the case studied here is assumed to be “close” to the additive model. Hence, write all the fitnesses as

$$w_{ij} = w_{ij;0} + \delta\, w_{ij;1} \tag{11}$$

where $\delta$ is a small parameter and $\delta w_{ij;1}$ gives the deviation away from the additive fitness, $w_{ij;0}$. Thus, $w_{ij;0}$ satisfies (9). Then, write all the allelic frequencies, disequilibria of all orders, gametic frequencies and various marginal fitnesses in a similar fashion. As the notation is consistent, I shall just give several examples:

$$x(i) = x(i)_{;0} + \delta\, x(i)_{;1} + O(\delta^2) \tag{12}$$

and for a typical disequilibrium coefficient $D$ of unspecified order,

$$D = D_{;0} + \delta\, D_{;1} + O(\delta^2). \tag{13}$$

The terms with a second subscript of zero (after the “;”) are the equilibrium values for the additive model given in (10). Thus, all the zero order terms are known. Other quantities of interest will be expressed in a similar manner. It is clear, however, that

$$D_{;0} = 0 \tag{14}$$

The analysis described here will be used to find the terms with the second subscript of “1” in the expansions for the disequilibrium (13) and for the allele frequencies. This will give the dependence of these terms on epistasis. As an example, the marginal fitness of allele i depends on the small parameter $\delta$ as:

Definition of disequilibrium: Before proceeding with the analysis, it is first necessary to write down formulas for linkage disequilibria of higher order, as has been done in BENNETT (1954) and SLATKIN (1972). As noted in HILL (1974), in general these two definitions are, in fact, different, although they are identical to the order (in $\delta$) considered here. BENNETT defines the higher order disequilibria in such a way that

$$D' = rD \tag{16}$$

where $r$ is the probability that there is no recombination among the loci involved in the disequilibrium specified by $D$. SLATKIN defines the higher order disequilibrium as statistical correlations.

The definitions of pairwise and third-order disequilibria are the same in both SLATKIN’S and BENNETT’S form. Pairwise disequilibria between the alleles $i_1$ and $i_2$ are defined as

$$D(i_1 i_2) = x(i_1 i_2) - x(i_1)\,x(i_2), \tag{17}$$

which reduces to the familiar

in the two-allele case. Third-order disequilibria are defined as

in the two-allele case. A definition like (17) is possible here as well, with extensions to an arbitrary number of alleles and loci a straightforward exercise.

Change in mean fitness: I shall now analyze the model equations. The first step will be to compute the change in the mean fitness caused by the addition of the epistatic terms. A small amount of algebra shows (cf. I):

This is result 1.

Using the fact that the zero order terms arise from the additive model, one can show that the sum

depends only on the m-locus gamete $p$, not on which n-locus gamete $i$ determines $p$. This calculation makes use of the fact that the fitnesses are nonepistatic and that the zero order frequencies are a product equilibrium.

Making use of (6) and the fact that (22) is independent of $i$, one sees by interchanging the order of summation that the first two terms of (21) cancel the last two terms of (21) to yield

This formula leads to result 2, after using (9) and (10) and the formula for the equilibrium of a one-locus model and performing a straightforward calculation to show that the first-order change in the marginal fitnesses (23) for a one-locus marginal system depends only on epistatic parameters actually involving the particular locus being considered.

Numbering of gametes: To facilitate the derivation of the results for the equilibrium allele frequencies and two- and three-locus disequilibria, it is convenient to restrict attention to two-allele models. Analogous results would hold for more alleles. Moreover, a traditional labeling of loci by letters of the alphabet, and numbering of gametes, is useful. At one locus, label the allele A as 0, and the allele a as 1. At two loci, let the gametes AB, Ab, aB, ab be numbered 0, 1, 2, 3, respectively. Also, in the three-locus model, let the gametes ABC, ABc, AbC, Abc, aBC, aBc, abC, abc be numbered as 0, 1, 2, 3, 4, 5, 6, 7.

Computation of allele frequencies: To find the first-order change in the

allele frequencies, use equations (23) and a formula analogous to (30) in (I).

With the more general perturbation to additive fitnesses allowed here, formula (30) from (I) is not correct. Taking a derivative of the formula for the allele

$$z(0)_{;1} = \frac{(1 - 2z(0)_{;0})\,W_{01;1} + z(0)_{;0}\,W_{00;1} - z(1)_{;0}\,W_{11;1}}{2W_{01;0} - W_{11;0} - W_{00;0}}, \tag{24}$$

where the marginal system that the z’s refer to is a one-locus system, and the standard notation for numbering alleles and gametes introduced above is used. Use

$$2W_{01;0} - W_{11;0} - W_{00;0} = 2w_k(1, 0) - w_k(1, 1) - w_k(0, 0), \tag{25}$$

which follows from the fact that the unperturbed fitnesses are for an additive model. Thus, the single locus being referred to is locus $k$. Combining (23), (24) and (25) yields

$$z(0)_{;1} = \frac{(1 - 2z(0)_{;0})\,W_{01;1} + z(0)_{;0}\,W_{00;1} - z(1)_{;0}\,W_{11;1}}{2w_k(1, 0) - w_k(1, 1) - w_k(0, 0)}. \tag{26}$$

Making use of result 2, (26) can be summarized as result 3.
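Formula (24) can be checked numerically against the usual one-locus equilibrium. The sketch below is a minimal check under simplifying assumptions: the first-order marginal fitness corrections are treated as fixed hypothetical numbers, whereas in the paper they are themselves derived from the epistatic terms through (23). The function names are illustrative.

```python
def equilibrium(w00, w01, w11):
    """Polymorphic one-locus equilibrium frequency of allele 0."""
    return (w01 - w11) / (2 * w01 - w00 - w11)


def first_order_shift(base, pert):
    """Right-hand side of (24): the coefficient of delta in z(0)."""
    w00, w01, w11 = base
    d00, d01, d11 = pert
    z0 = equilibrium(*base)
    z1 = 1 - z0
    return ((1 - 2 * z0) * d01 + z0 * d00 - z1 * d11) / (2 * w01 - w11 - w00)


base = (0.8, 1.0, 0.7)      # hypothetical zero-order marginal fitnesses W00, W01, W11
pert = (0.3, -0.2, 0.5)     # hypothetical first-order corrections
delta = 1e-6
perturbed = tuple(b + delta * d for b, d in zip(base, pert))

# Finite-difference derivative of the exact equilibrium with respect to delta.
numerical = (equilibrium(*perturbed) - equilibrium(*base)) / delta
predicted = first_order_shift(base, pert)
```

The finite-difference slope and the first-order coefficient agree closely for small `delta`, as expected from differentiating the one-locus equilibrium formula.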

Calculation of pairwise disequilibria: I shall now describe the calculation of the disequilibria along the lines of the calculation in (I). To simplify the notation in the presentation, I shall present the derivations as though each system of m loci being considered were truly an m-locus system. If the system has more loci than the order of the disequilibrium, use a marginal system with only the loci involved in the disequilibrium coefficient being calculated.

The first step will be to compute the pairwise disequilibria. Take the dynamic equations for a two-locus system and set $x(i)' = x(i)$. Substitute for each term its appropriate Taylor series (in $\delta$, about $\delta = 0$, the additive case). From this, the equations determining the order $\delta$ terms are easily seen to be (cf. equation (19) in (I))

$$x(i)_{;0}\,\bar{w}_{;1} = x(i)_{;0}\,w_{i;1} \pm (1 - r_{ab})\,w_{03;0}\,D_{AB;1}. \tag{27}$$

(The sign in (27) is determined by the usual convention for two-locus models.) Take each equation in (27) and divide by $x(i)_{;0}$, and multiply each equation by the sign in front of the disequilibrium term. Summing the resulting four equations yields

$$w_{03;0}\,D_{AB;1}\,(1 - r_{ab}) \sum_{i=0}^{3} \big(x(i)_{;0}\big)^{-1} = w_{0;1} - w_{1;1} - w_{2;1} + w_{3;1}. \tag{28}$$

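It would be useful to express this answer in the form of a correlation coefficient, $\rho_{AB}$, where

$$\rho_{AB} = D_{AB}\,[x(A)\,x(B)\,(1 - x(A))\,(1 - x(B))]^{-1/2}, \tag{29}$$

in which the role played by the allele frequencies is made clear. Make use of (10), which gives $\sum_i (x(i)_{;0})^{-1} = [x(A)\,x(B)\,(1 - x(A))\,(1 - x(B))]^{-1}$, to write

$$\rho_{AB;1} = \frac{[x(A)\,x(B)\,(1 - x(A))\,(1 - x(B))]^{1/2}\,[w_{0;1} - w_{1;1} - w_{2;1} + w_{3;1}]}{(1 - r_{ab})\,w_{03;0}}. \tag{30}$$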

Although the definition (15) of the quantities $w_{i;1}$ would appear to suggest that the right-hand side of (30) itself depends on the disequilibria, one can

make use of (9) to show that this is not the case. This is where the fact that the original nonepistatic model is additive is used.
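As a numerical illustration of (28), the sketch below iterates a two-locus, two-allele model with weak additive-by-additive epistasis to equilibrium and compares the resulting disequilibrium with the first-order prediction. Everything specific here is an assumption made for the example: the symmetric overdominant base fitnesses, the form of the epistatic perturbation, and the evaluation of $w_{i;1}$ as $\sum_j w_{ij;1}\, x(j)_{;0}$ (for an additive base model the part involving first-order frequency shifts drops out of the alternating sum). The variable `rec` is the recombination fraction, which corresponds to $1 - r_{ab}$ in the notation of the text.

```python
GAMETES = [(0, 0), (0, 1), (1, 0), (1, 1)]   # AB, Ab, aB, ab
SIGN = [1, -1, -1, 1]                        # (-1) ** (number of lower case alleles)


def fitness_tables(delta):
    """Additive overdominant base model plus additive-by-additive epistasis."""
    one_locus = [0.4, 0.5, 0.4]              # hypothetical w(0,0), w(0,1), w(1,1)
    w0 = [[0.0] * 4 for _ in range(4)]
    w1 = [[0.0] * 4 for _ in range(4)]
    for i, gi in enumerate(GAMETES):
        for j, gj in enumerate(GAMETES):
            na, nb = gi[0] + gj[0], gi[1] + gj[1]   # copies of allele 1 per locus
            w0[i][j] = one_locus[na] + one_locus[nb]
            w1[i][j] = (na - 1) * (nb - 1)
    w = [[w0[i][j] + delta * w1[i][j] for j in range(4)] for i in range(4)]
    return w0, w1, w


def iterate(x, w, rec, generations=2000):
    """Standard two-locus recursion; rec is the recombination fraction."""
    for _ in range(generations):
        marg = [sum(w[i][j] * x[j] for j in range(4)) for i in range(4)]
        wbar = sum(x[i] * marg[i] for i in range(4))
        d = x[0] * x[3] - x[1] * x[2]
        x = [(x[i] * marg[i] - SIGN[i] * rec * w[0][3] * d) / wbar
             for i in range(4)]
    return x


delta, rec = 0.01, 0.5
w0, w1, w = fitness_tables(delta)
x0 = [0.25] * 4                              # product equilibrium of the base model
x_eq = iterate(x0, w, rec)
d_simulated = x_eq[0] * x_eq[3] - x_eq[1] * x_eq[2]

# First-order prediction from (28), with w_{i;1} evaluated at the product equilibrium.
w_i1 = [sum(w1[i][j] * x0[j] for j in range(4)) for i in range(4)]
rhs = sum(SIGN[i] * w_i1[i] for i in range(4))
d_predicted = delta * rhs / (w0[0][3] * rec * sum(1 / v for v in x0))
```

In this symmetric example the allele frequencies stay at one half, so the comparison isolates the dependence of the disequilibrium on epistasis and recombination; halving `rec` should roughly double `d_predicted`, in line with result 4.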

Calculation of higher order disequilibria: The next step in the analysis will be to calculate the three-way disequilibria in a manner analogous to that used to calculate the pairwise disequilibria. Rather than give the details for these extensive calculations, which were performed with the aid of a computer algebra program, I shall summarize the steps. First, form the equations analogous to (27) that arise in a three-locus model. Second, use the two-locus case as a guide to the appropriate sign by which to multiply each equation. Divide each equation by $x(i)_{;0}$, multiply by the appropriate sign and then sum all eight equations. Replacing any gametic frequencies by allele frequencies and disequilibria and simplifying, one obtains

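$$D_{ABC;1}\,\Big\{[x(c)_{;0}\,w_{17;0} + x(C)_{;0}\,w_{06;0}]\,r_{a/b} + [x(b)_{;0}\,w_{27;0} + x(B)_{;0}\,w_{05;0}]\,r_{a/c} + [x(a)_{;0}\,w_{47;0} + x(A)_{;0}\,w_{03;0}]\,r_{b/c}\Big\}\,\Big[\sum_{i=0}^{7} \big(x(i)_{;0}\big)^{-1}\Big] \Big/ 2 = \sum_{i=0}^{7} (-1)^{f(i)}\,w_{i;1}, \tag{31}$$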

where $f(i)$ is the number of “lower case” alleles in the gamete $i$. Note that only the differences between the detriments in fitness caused by the homozygotes at each of the three loci arising on the left-hand side of (31) make the formula differ from

$$w_{07;0}\,D_{ABC;1}\,(1 - r_{abc}) \sum_{i=0}^{7} \big(x(i)_{;0}\big)^{-1} = \sum_{i=0}^{7} (-1)^{f(i)}\,w_{i;1}, \tag{32}$$

the three-locus analogue of (28). Thus, one would conjecture that the following holds for higher order disequilibria, where the scheme for numbering gametes is extended in the natural way:

$$w_{\cdot\cdot;0}\,D_{;1}\,(1 - r) \sum_{i} \big(x(i)_{;0}\big)^{-1} = \sum_{i} (-1)^{f(i)}\,w_{i;1} + \text{ERROR}, \tag{33}$$

where ERROR is a term that depends only on differences among the fitnesses of different genotypes that are homozygous at $n - 2$ loci (where n is the order of the disequilibrium) and, therefore, will usually be fairly small. (Note, however, that ERROR may increase as the number of loci increases.) Here $w_{\cdot\cdot;0}$ is the fitness of an individual heterozygous at all the loci in the system, $D$ is the disequilibrium coefficient for all the loci in the system, $r$ is the probability of no recombination among the loci in the system, and the sums extend over all the gametes in the system. The information in this section and the preceding ones is summarized in results 3 through 5.
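The recombination factor in the three-locus formula can be illustrated with a short Python sketch. It assumes that the pairwise quantities appearing there are the ordinary recombination fractions between pairs of loci, which is an interpretation rather than something stated explicitly in the surviving text; under that reading, half the sum of the three pairwise fractions equals the probability of at least one recombination event among the three loci. The numerical values and function names are hypothetical.

```python
def at_least_one_recombination(r_a_bc, r_b_ac, r_c_ab):
    """Probability that some recombination event separates the three loci."""
    return r_a_bc + r_b_ac + r_c_ab


def pairwise_rates(r_a_bc, r_b_ac, r_c_ab):
    """Recombination fractions between each pair of loci."""
    return {"ab": r_a_bc + r_b_ac,   # events that separate a from b
            "ac": r_a_bc + r_c_ab,   # events that separate a from c
            "bc": r_b_ac + r_c_ab}   # events that separate b from c


# Hypothetical values for loci in the order a, b, c without interference:
# recombination probabilities 0.1 (a-b interval) and 0.2 (b-c interval).
r1, r2 = 0.1, 0.2
r_a_bc = r1 * (1 - r2)      # recombination only between a and b
r_c_ab = (1 - r1) * r2      # recombination only between b and c
r_b_ac = r1 * r2            # recombination in both intervals isolates b

pair = pairwise_rates(r_a_bc, r_b_ac, r_c_ab)
half_sum = sum(pair.values()) / 2
some = at_least_one_recombination(r_a_bc, r_b_ac, r_c_ab)
```

Each event that splits the three loci separates exactly two of the three pairs, which is why the pairwise sum counts every such event twice.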

A crude approximation: Is there an approximation that sheds some light on formulas (31)-(33)? I provide details in the case of three loci, but the steps carry over to an arbitrary number of loci, mutatis mutandis. BENNETT shows that

$$x'(ABC) = r_{abc}\,x(ABC) + r_{a/bc}\,x(A)\,x(BC) + r_{b/ac}\,x(B)\,x(AC) + r_{c/ab}\,x(C)\,x(AB). \tag{34}$$

The definition of the recombination parameters [see equation (5)] shows this

$$x'(ABC) = x(ABC) + r_{a/bc}\,x(A)\,x(BC) + r_{b/ac}\,x(B)\,x(AC) + r_{c/ab}\,x(C)\,x(AB) - x(ABC)\,(r_{a/bc} + r_{b/ac} + r_{c/ab}). \tag{35}$$

Substituting for the second $x(ABC)$ in equation (35) from (19) yields

Now substitute from (18) for all the two-locus gametic frequencies [$x(AB)$ etc.] in (36) to obtain, after rearrangement

$$x'(ABC) = x(ABC) - (1 - r_{abc})\,D_{ABC} - (1 - r_{bc})\,x(A)\,D_{BC} - (1 - r_{ac})\,x(B)\,D_{AC} - (1 - r_{ab})\,x(C)\,D_{AB}. \tag{37}$$

Similar equations hold for the other gametic frequencies, with changes in the

signs on the right-hand side of (37). Note that an alternative method of deriving (37) would be to use the compact notation employed by KARLIN and

LIBERMAN (1979a,b). These methods would be needed to study cases of more

loci or alleles.
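The equivalence of (34) and (37) can be checked numerically. The Python sketch below does so for random gametic frequencies, assuming the two-allele definitions $D_{AB} = x(AB) - x(A)\,x(B)$ and $D_{ABC} = x(ABC) - x(A)\,x(B)\,x(C) - x(A)\,D_{BC} - x(B)\,D_{AC} - x(C)\,D_{AB}$; these are the standard forms but are assumptions here, since the corresponding displayed equations are missing from this copy. The recombination values and helper names are hypothetical.

```python
import random
from itertools import product


def allele_freq(x, locus):
    """Frequency of the upper case allele (coded 0) at one locus."""
    return sum(f for g, f in x.items() if g[locus] == 0)


def pair_freq(x, l1, l2):
    """Frequency of gametes carrying upper case alleles at both loci."""
    return sum(f for g, f in x.items() if g[l1] == 0 and g[l2] == 0)


def disequilibria(x):
    """Pairwise and three-way coefficients (assumed central-moment form)."""
    p = [allele_freq(x, l) for l in range(3)]
    d = {(a, b): pair_freq(x, a, b) - p[a] * p[b]
         for a, b in ((0, 1), (0, 2), (1, 2))}
    d3 = (x[(0, 0, 0)] - p[0] * p[1] * p[2]
          - p[0] * d[(1, 2)] - p[1] * d[(0, 2)] - p[2] * d[(0, 1)])
    return p, d, d3


random.seed(1)
raw = [random.random() for _ in range(8)]
x = {g: v / sum(raw) for g, v in zip(product((0, 1), repeat=3), raw)}
r_a_bc, r_b_ac, r_c_ab = 0.08, 0.02, 0.18     # hypothetical recombination pattern
r_abc = 1 - r_a_bc - r_b_ac - r_c_ab           # no recombination at all

p, d, d3 = disequilibria(x)
xABC = x[(0, 0, 0)]

# Equation (34): the recursion in terms of gametic frequencies.
lhs = (r_abc * xABC + r_a_bc * p[0] * pair_freq(x, 1, 2)
       + r_b_ac * p[1] * pair_freq(x, 0, 2) + r_c_ab * p[2] * pair_freq(x, 0, 1))

# Equation (37): the same recursion in terms of disequilibria.
r_ab = 1 - (r_a_bc + r_b_ac)                   # no recombination between a and b
r_ac = 1 - (r_a_bc + r_c_ab)
r_bc = 1 - (r_b_ac + r_c_ab)
rhs = (xABC - (1 - r_abc) * d3 - (1 - r_bc) * p[0] * d[(1, 2)]
       - (1 - r_ac) * p[1] * d[(0, 2)] - (1 - r_ab) * p[2] * d[(0, 1)])
```

The two expressions agree to rounding error, which confirms that the rearrangement only regroups the recombination probabilities by the pairs of loci they separate.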

It is the analogy between this formula and the corresponding formula for

the two-locus model that leads to the approximations (32) and (33). Adding

selection to (37) is not simple, because the disequilibria do not involve just

specified pairs of gametes as in the two-locus case. If selection is sufficiently weak, however, the errors resulting from simply multiplying the disequilibria on the right-hand side of (37) by a particular fitness will be small.

This leads to the conjecture presented earlier.

DISCUSSION

One of the primary questions that arises in the study of multilocus systems is the question of how many loci must be included to obtain a valid description of the natural system. This question can be rephrased as asking on how many loci do linkage disequilibria of various orders depend? The results of this paper answer this question in the case of weak epistasis.

First, disequilibria of any order depend only on epistatic interactions of that

order. Second, disequilibria among any group of loci depend only on epistatic interactions involving that group of loci. Thus, there is absolutely no imbedding effect. There is a direct correlation between disequilibria and epistasis. Also, disequilibria scale as one divided by the recombination fraction, which is a faster decline with recombination than in the case of the additive model (for

two loci) with drift (FELSENSTEIN 1974). Finally, higher order disequilibria are

However, the higher order disequilibria are not of lower order (in $\delta$) than

the lower order disequilibria. Also, in a system of n loci there are many more

disequilibria of order roughly $n/2$ than pairwise disequilibria. Thus, these

higher order disequilibria may play an important role in the computation of gametic frequencies from allelic frequencies in models with a large number of loci.
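The counting argument can be made explicit with a few lines of Python. The sketch assumes two alleles per locus, so that each set of k loci contributes one disequilibrium coefficient of order k; the function name is hypothetical.

```python
from math import comb


def disequilibria_by_order(n):
    """Number of disequilibrium coefficients of each order k = 2..n."""
    return {k: comb(n, k) for k in range(2, n + 1)}


counts = disequilibria_by_order(20)
pairwise = counts[2]          # 190 pairwise coefficients
mid_order = counts[10]        # 184756 coefficients of order n / 2
```

For 20 loci the coefficients of order 10 outnumber the pairwise ones by roughly a factor of a thousand, so individually small higher order terms can still matter in aggregate.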

For a definition of what I mean by weak epistasis here, it is necessary to consider the results of KARLIN and LIBERMAN (1979a,b; 1982) and KARLIN and AVNI (1981). As in (I), I shall argue that epistasis must be weak relative to both recombination and selection for the results here to apply. I also claim that one can obtain a rough estimate of the minimum level of recombination for which the results here apply by taking the level of recombination necessary for the product equilibrium to be stable in nonadditive models. This level of recombination is truly small. Estimating this way by plugging into formulas in the papers by KARLIN and co-workers, it can be seen that, even if the recombination between adjacent loci is a small fraction of the per locus selection strength, then the results of this paper would apply. The numerical calculations in (I) for two-locus models suggest that an estimate obtained in this way is reasonable. Moreover, the numerical results there suggest that estimates of disequilibrium are quite accurate even for $\delta$ as large as or larger than 0.2. The error is typically about 10% for $\delta$ about 0.2.

What is the evidence on the level of epistasis in natural populations? Unfortunately, this is a very difficult question to answer. There are grave statistical difficulties in estimating higher order epistasis (KEMPTHORNE 1957). Typical analyses (e.g., MUKAI et al. 1974) just try to estimate the presence of epistasis, without trying to ascertain more details.

However, the formula (22) and result 2 do suggest an interesting interpretation of one attempt to detect epistasis by TEMIN et al. (1969). They do not detect significant epistasis between chromosomes, but only within chromosomes. Since (22) suggests that deviations from additivity are averaged in marginal systems (not added), by looking at larger blocks of genes at one time, one may be covering up epistasis, rather than making it easier to detect. In fact, this may be just what TEMIN et al. (1969) found. Thus, the failure to detect epistasis by looking for it between chromosomes does not imply that epistasis is unimportant.
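The difference between averaging and adding can be illustrated with a toy calculation. This is a sketch under stated assumptions, not a reproduction of formula (22): the pairwise deviations from additivity `e[i][j]` between locus i in one block and locus j in another are hypothetical numbers, and whatever weights the formula uses are replaced here by equal weights.

```python
def block_epistasis_sum(e):
    """Total of all pairwise deviations between two blocks (the 'added' view)."""
    return sum(sum(row) for row in e)


def block_epistasis_mean(e):
    """Unweighted mean of the pairwise deviations (the 'averaged' view)."""
    n_pairs = sum(len(row) for row in e)
    return block_epistasis_sum(e) / n_pairs


# Hypothetical deviations for two blocks of 4 loci each, with mixed signs.
e = [
    [0.02, -0.01, 0.00, 0.01],
    [-0.02, 0.01, 0.00, 0.00],
    [0.00, 0.00, 0.03, -0.01],
    [0.01, -0.02, 0.00, 0.00],
]

if __name__ == "__main__":
    print("largest single deviation:", max(abs(x) for row in e for x in row))
    print("sum over pairs:", block_epistasis_sum(e))
    print("mean over pairs:", block_epistasis_mean(e))
```

In this made-up example the largest single deviation is 0.03, while the mean over all 16 pairs is roughly 0.001. If block-level epistasis behaves like the mean, a test between large blocks sees a much weaker signal than any single interacting pair would give.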

What is the evidence on higher order disequilibria? Estimating higher order disequilibria in nature also has grave statistical difficulties for essentially the same underlying reason that it is hard to measure higher order epistasis (see BROWN 1975). Among the studies that have searched for disequilibrium in natural populations, that of LANGLEY, TOBARI and KOJIMA (1974) is among the most extensive for outcrossers. They found no evidence for higher order correlations among loci.

LITERATURE CITED

AKIN, E., 1979 The Geometry of Population Genetics. Springer-Verlag, New York.

BENNETT, J. H., 1954 On the theory of random mating. Ann. Eugen. (Lond.) 18: 311-317.

BROWN, A. H. D., 1975 Sample sizes required to detect linkage disequilibrium between two or three loci. Theor. Pop. Biol. 8: 184-201.

CLEGG, M. T., 1978 Dynamics of correlated genetic systems. II. Simulation studies of chromosomal segments under selection. Theor. Pop. Biol. 13: 1-23.

EWENS, W., 1979 Mathematical Population Genetics. Springer-Verlag, New York.

EWENS, W. and G. THOMSON, 1977 Properties of equilibria in multi-locus genetic systems. Genetics 87: 807-819.

FELDMAN, M., I. FRANKLIN and G. THOMSON, 1974 Selection in complex genetic systems I. The symmetric equilibria of the three-locus symmetric viability model. Genetics 76: 135-162.

FELSENSTEIN, J., 1974 Uncorrelated genetic drift of gene frequencies and linkage disequilibrium in some models of linked overdominant polymorphisms. Genet. Res. 24: 281-294.

FISHER, R. A., 1918 The correlation between relatives on the supposition of mendelian inheritance. Trans. R. Soc. Edinb. 52: 399-433.

FRANKLIN, I. and R. LEWONTIN, 1970 Is the gene the unit of selection? Genetics 65: 701-734.

HASTINGS, A., 1984 Linkage disequilibrium, selection and recombination at three loci. Genetics 106: 153-164.

HASTINGS, A., 1985 Multilocus population genetics with weak epistasis. I. Equilibrium properties of two-locus two-allele models. Genetics 109: 799-812.

HILL, W. G., 1974 Disequilibrium among several linked genes in finite population. I. Mean changes in disequilibrium. Theor. Pop. Biol. 5: 366-392.

KARLIN, S., 1975 General two-locus selection models: some objectives, results and interpretations. Theor. Pop. Biol. 7: 364-398.

KARLIN, S., 1978 Theoretical aspects of multi-locus selection balance I. pp. 503-587. In: Studies in Mathematical Biology Part II: Populations and Communities, Edited by S. A. LEVIN. Math. Assoc. Amer., Washington, D.C.

KARLIN, S. and H. AVNI, 1981 Analysis of central equilibria in multilocus systems: a generalized symmetric viability regime. Theor. Pop. Biol. 20: 241-280.

KARLIN, S. and U. LIBERMAN, 1978 The two-locus multi-allele additive viability model. J. Math. Biol. 5: 201-211.

KARLIN, S. and U. LIBERMAN, 1979a Representation of nonepistatic selection models and analysis of multilocus Hardy-Weinberg equilibrium configurations. J. Math. Biol. 7: 353-374.

KARLIN, S. and U. LIBERMAN, 1979b Central equilibria in multilocus systems. I. Generalized nonepistatic regimes. Genetics 91: 777-798.

KARLIN, S. and U. LIBERMAN, 1982 The reduction property for central polymorphisms in nonepistatic systems. Theor. Pop. Biol. 22: 69-95.

KARLIN, S. and J. MCGREGOR, 1972 Application of method of small parameters to multi-niche population genetic models. Theor. Pop. Biol. 3: 186-209.

KEMPTHORNE, O., 1957 An Introduction to Genetic Statistics. John Wiley and Sons, New York.

KINGMAN, J. F. C., 1961 A mathematical problem in population genetics. Proc. Camb. Phil. Soc. 57: 574-582.

LANGLEY, C. H., Y. N. TOBARI and K.-I. KOJIMA, 1974 Linkage disequilibrium in natural pop

LEWONTIN, R., 1964a The interaction of selection and linkage. I. General considerations; heterotic models. Genetics 49: 49-67.

LEWONTIN, R., 1964b The interaction of selection and linkage. II. Optimum models. Genetics 50: 757-782.

LEWONTIN, R., 1974 The Genetic Basis of Evolutionary Change. Columbia University Press, New York.

MUKAI, T., R. CARDELLINO, T. K. WATANABE and J. F. CROW, 1974 The genetic variance for viability and its components in a local population of Drosophila melanogaster. Genetics 78: 1195-1208.

NAGYLAKI, T., 1976 The evolution of one- and two-locus systems. Genetics 83: 583-600.

NAGYLAKI, T., 1977 The evolution of one- and two-locus systems. II. Genetics 85: 347-354.

SHAHSHAHANI, S., 1979 A new mathematical framework for the study of linkage and selection. Memoirs AMS 211.

SIMMONS, M. J. and J. F. CROW, 1977 Mutations affecting fitness in Drosophila populations. Annu. Rev. Genet. 11: 49-78.

SLATKIN, M., 1972 On treating the chromosome as the unit of selection. Genetics 72: 157-168.

STROBECK, C., 1976 The three-locus model with multiplicative fitness values: the crystallization of the genome. In: Population Genetics and Ecology, Edited by S. KARLIN and E. NEVO. Academic Press, New York.

TEMIN, R. G., H. U. MEYER, P. S. DAWSON and J. F. CROW, 1969 The influence of epistasis on homozygous viability depression in Drosophila melanogaster. Genetics 61: 497-519.

TURELLI, M., 1982 Cis-trans effects induced by linkage disequilibrium. Genetics 102: 807-815.

TURELLI, M. and L. GINZBURG, 1983 Should individual fitness increase with heterozygosity. Genetics 104: 191-209.
