Clusters of Identical New Mutations Can Account for the “Overdispersed” Molecular Clock


Copyright © 1997 by the Genetics Society of America


Haiying Huai and R. C. Woodruff

Department of Biological Sciences, Bowling Green State University, Bowling Green, Ohio 43403

Manuscript received January 25, 1997
Accepted for publication May 16, 1997

ABSTRACT

Germ-cell mutations may occur during meiosis, giving rise to independent mutant gametes in a Poisson process, or before meiosis, giving rise to multiple copies of identical mutant gametes at a much higher probability than the Poisson expectation. We report that the occurrence of these early premeiotic clusters of new identical mutant alleles increases the variance-to-mean ratio of mutation rate (R(u) > 1). This leads to an expected variance-to-mean ratio (R(t)) of the molecular clock that is always greater than one and may cover the observed range of R(t) values. Hence, the molecular clock may not be overdispersed based on this new mutational model that includes clusters. To get a better estimation of R(u) and R(t), one needs measurements of the intrageneration variation of reproductive success (N_i/N_e(i)), population dynamics ($\bar{k}_i$), and the proportion of new mutations that occur in clusters (r_c), especially those formed before germ-cell differentiation.

THE molecular clock has often been described as overdispersed, since the variance-to-mean ratio, R(t), of substitutions is usually larger than the Poisson expectation of one. This is true even when lineage effects have been removed and an appropriate phylogeny is used (OHTA and KIMURA 1971; LANGLEY and FITCH 1974; GILLESPIE 1989, 1991; TAKAHATA 1991; EASTEAL 1994), and it is true for synonymous and nonsynonymous changes (GILLESPIE 1991; OHTA 1995; but see GOLDMAN 1994). GILLESPIE (1991, 1994) reported an average R(t) value of 7.5, and OHTA (1995) reported average values of 5.60 for nonsynonymous changes and 5.89 for synonymous changes. Several mechanisms involving selection have been proposed to explain this overdispersion, including episodic selection (GILLESPIE 1984), fluctuating neutral space (fluctuations in the neutral mutation rate caused by substitutions that change selective constraints) (TAKAHATA 1987), and slightly detrimental mutations (house of cards model) (OHTA and TACHIDA 1990; TACHIDA 1991; OHTA 1992; IWASA 1993). Here we show that an intrinsic mechanism, the premeiotic origination of clusters of identical new mutant alleles, can account for the observed large index of dispersion of molecular evolution (R(t) > 1).

Mutations that occur premeiotically and give rise to multiple copies of identical new mutant alleles are common in many multicellular organisms, and these clusters can change the dynamics of evolution, including an increase in fixation probabilities of new alleles (WOODRUFF and THOMPSON 1992; WOODRUFF et al. 1996). For example, 21% of 3585 spontaneous recessive sex-linked lethal mutations and 39% of 194 recessive autosomal lethal mutations from D. melanogaster occurred in clusters, and premeiotic genetic changes have been reported in the progeny from single individuals of several species, including nematodes, silkworms, guinea pigs, mice, rabbits, cattle and humans (WOODRUFF and THOMPSON 1992; WOODRUFF et al. 1996 and references therein). Clustering makes the mutation (origination) process more episodic and increases mutational variance relative to its mean, causing the mutational process to deviate from the Poisson distribution. This increased variance in the origination process of new alleles every generation leads to a molecular clock that appears overdispersed (R(t) > 1) relative to the Poisson expectation.

Corresponding author: Haiying Huai, Department of Biological Sciences, Bowling Green State University, Bowling Green, Ohio 43403. E-mail: hhuai@bgnet.bgsu.edu

Genetics 147: 339-348 (September, 1997)

CONTRIBUTION OF CLUSTERS TO MUTATION RATE

Mutations can occur any time during cell divisions from the zygote to formation of mature gametes. All new independent mutations that occur during meiosis fit the Poisson distribution. These genetic changes almost always result in a single unique mutant allele, because of the very low mutation rate at the base-pair level. If all mutations are of this type, R(t) is expected to be close to one (GILLESPIE 1991). Even with various kinds of selection models, GILLESPIE (1994) concluded that an observed average R(t) as large as 7.5 cannot be easily explained with a model based on a mutational process following the Poisson distribution alone.

[Figure 1: schematic of germ-line cell divisions from zygote through germ-cell differentiation to meiosis, marking early premeiotic, late premeiotic and meiotic mutation (A → A′). (A) Clusters: about half of all gametes are A′; frequency of mutants u r_c. (B) Independent mutants: frequency u(1 - r_c).]

FIGURE 1.-Partition of mutation rate according to an ideal model of gametogenesis in multicellular organisms. Germ-cell divisions from zygote cleavage (cell division 1) to germ-somatic cell differentiation (cell division j) and to gamete formation (cell division n) are labeled in this diagram. New mutants are grouped according to their occurrence at different stages of germ-line development: early premeiotic, late premeiotic and meiotic. (A) Early premeiotic mutations that occur before germ-somatic cell differentiation (between cell division 1 and j) will yield a cluster with about one-half of gametes mutant. New mutants in this group are the major contribution to r_c. The probability of recovering this type of new mutant is approximately u r_c, with u the mutation rate per generation. (B) Mutations that occur during meiosis (cell division n) result in a single and unique mutant allele. Mutations that occur a few cell divisions before meiosis but after germ-cell differentiation yield a small- to medium-size cluster of identical mutant alleles. If germ cells grow exponentially after somatic-germ cell differentiation (in our model the number of germ cells doubles each cell generation after cell division j), the chance to sample even one mutant allele out of these small- to medium-size clusters is small. Therefore, late premeiotic mutations contribute little to r_c, but quite a lot to “single meiotic” mutants. We theoretically factor these late premeiotic mutations into early premeiotic clusters and meiotic-type single mutations. The probability to recover “single” new mutants is approximately u(1 - r_c).

possible cluster size. In nature such small clusters may seldom contribute more than one copy, if any, of the mutant allele to the gene pool, unless the number of gametes sampled from one individual is large. Furthermore, as shown below, these small clusters can be mathematically factored into either early premeiotic clusters of mutation or single, independent mutations near meiosis. Hence, we first model the effect of clusters of mutant alleles on the molecular clock by focusing mainly on premeiotic mutations that occur before germ-cell differentiation (see APPENDIX for a model accounting for all possible clusters). As shown in Figure 1, such mutation events will lead to approximately one-half of gametes carrying identical copies of the new mutant allele. Early premeiotic mutation events have been observed (RUSSELL 1964; WOODRUFF and THOMPSON 1992; WOODRUFF et al. 1996; RUSSELL and RUSSELL 1996), but their proportion (r_c) among all new mutations is unknown.

An unbiased estimation of mutation rate should be the total number of mutant gametes recovered over the total gametes sampled. This means that all members of a premeiotic cluster, even though they are identical genetic changes, should be counted separately in the estimation of u (AUERBACH 1962; DROST and LEE 1995).

The accumulated overall mutation rate per generation, u, in any multicellular organism is the sum of the mutational probability per cell division ($u_{cell(i)}$) over all cell divisions from zygote (cell generation 1) to gamete (cell generation n), i.e.,

$$u = \sum_{i=1}^{n} u_{cell(i)}, \qquad (1)$$

with n the total number of cell divisions from zygote to gamete.

The mutation rate, then, can be partitioned into two parts: clustered and independent mutations, with frequency of r_c and 1 - r_c, respectively. Hence,

[Equation (2) is not recoverable from this copy.]

where j is the cell generation when germ-cell differentiation occurs (see Figure 1). Assuming that the mutational probability per cell division ($u_{cell(i)}$) is similar for all cell divisions after somatic-germline cell differentiation, then

[Equation (3) is not recoverable from this copy.]

and hence,

[Equation (4) is not recoverable from this copy.]

Therefore, in this ideal model, late premeiotic clusters that occur after germ-cell differentiation and before meiosis contribute relatively little to r_c, and the majority of them will be identified as “single meiotic” mutants. Furthermore, these small clusters can be mathematically factored into either early premeiotic clusters that occur before germ-line differentiation or “independent-single” mutations that occur during or just a few cell divisions before meiosis.

Mutation Clusters/Molecular Clock

otic cluster of six identical mutants, r_c would be equal to 0.3 or 6/(14 + 6), which means that 30% of the total mutation rate is contributed by clusters. The overall mutation rate, u, would be 20 out of the total gametes sampled. Since there are a number of cell divisions from zygote to germ-cell differentiation in many organisms (for example, five cell divisions in Caenorhabditis elegans; SULSTON et al. 1983) and mutation rates may be high in these early embryonic cell divisions (MULLER 1954; DROST and LEE 1995), the r_c values we use below, ranging from 0.05 to 0.40, may be conservative.

This new evolutionary parameter, r_c, not only represents the proportion of early premeiotic clusters among all new mutations, but is also the primary contributor to the recurrent probability that another gamete within an individual shares an identical new mutation. For a particular gene, excluding its homologue in a diploid organism, the conditional probability of another allele identical to the initial mutant within an individual is

$$r_c + (1 - r_c)u. \qquad (5)$$

Considering both copies of a gene in a diploid organism, the recurrent probability of an identical mutant gamete, given an initial mutation, is

$$\underbrace{[r_c + (1 - r_c)u]/2}_{\text{mutation in same gene}} + \underbrace{u/2}_{\text{mutation in homologue}} = [r_c + (2 - r_c)u]/2. \qquad (6)$$

The r_c values can be orders of magnitude greater than u, which is the recurrent probability of identical mutations in the absence of clusters. The occurrence of early premeiotic clusters therefore greatly alters the probability distribution of new identical mutations among gametes from any multicellular individual, without changing the total mutation rate. As a practical example of the influence of clustering, the observation of premeiotic clusters of new mutant alleles (germinal mosaics) in humans has altered genetic counseling estimations of the probabilities of recurrent identical mutations within a family, from the mutation rate (10^-5 to 10^-6) to 5-14% (HARTL 1971; WIJSMAN 1991; YOUNG 1991; COOPER and KRAWCZAK 1993; BRIDGES 1994).
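The worked example of r_c and the recurrence probabilities in (5) and (6) can be sketched numerically. This is only an illustration: the function names and the example values of r_c and u are assumptions chosen for the sketch, not taken from the paper.

```python
def cluster_fraction(n_independent, cluster_sizes):
    """Proportion of recovered mutant gametes that belong to clusters (r_c)."""
    clustered = sum(cluster_sizes)
    return clustered / (n_independent + clustered)


def recurrence_same_gene(r_c, u):
    """Equation (5): probability of another identical allele, same gene copy."""
    return r_c + (1.0 - r_c) * u


def recurrence_diploid(r_c, u):
    """Equation (6), left side: same gene copy plus homologue, each weighted 1/2."""
    return recurrence_same_gene(r_c, u) / 2.0 + u / 2.0


# Example from the text: 14 independent mutants plus one cluster of six.
r_c = cluster_fraction(14, [6])

# Hypothetical mutation rate, used only to show that r_c dominates u.
u = 1e-5
print(r_c, recurrence_diploid(r_c, u))
```

With any u of this order, the recurrence probability is close to r_c/2, which is the sense in which r_c can exceed u by orders of magnitude.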

VARIANCE-TO-MEAN RATIO OF MUTATION RATE

To model the effect of premeiotic clusters on the mean and variance of mutation rate, one can assume that in an ideal mutation experiment each screened gene contributes exactly c copies ($\sigma_c^2 = 0$) to the next generation. Each gamete can be either a mutant or not. If it is a mutant, then it can be categorized as a single mutation, two or more recurrent independent mutations in the same gene (at an extremely unlikely rate of u^2 or less), or a member of a cluster of c copies of identical premeiotic mutants within a family (only early premeiotic clusters are modeled here; see APPENDIX for a general result for all possible premeiotic clusters).

These categories of mutation can be identified experimentally by parentage and allelism tests (WOODRUFF et al. 1996). Shown in Table 1 are the expected frequencies of each mutation state and their contributions to the mean and variance of mutation rate. Since the scaling factor 1 - u r_c(1 - 1/c) is very close to one, the values in Table 1 lead to an expected variance-to-mean ratio of mutation rate

$$R(u) = 1 + r_c(c - 1). \qquad (7)$$

A similar analysis can be done at the family level by applying the Poisson distribution to model the independent mutations within each family. Assume that the product of family size (c) and mean independent mutation rate [u(1 - r_c)] is much smaller than one and always very close to zero; then $e^{-m} \approx 1 - m \approx 1.0$ [m = uc(1 - r_c)] (Table 2). With the scaling factor 1 + u r_c so close to 1.0, the analysis in Table 2 also leads to the expected variance-to-mean ratio of mutation rate as shown in (7).

Hence, with premeiotic clusters and c > 1, R(u) is always greater than one. In those extremely rare cases where c = 1 for all members of a population (there are no clusters), the variance-to-mean ratio of mutation rate reduces to one and fits a Poisson distribution.
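As a check on (7), the Table 1 categories can be summed directly. The sketch below keeps only the no-change, single and cluster rows (the double, triple and higher rows are dropped as negligible, as in the table) and uses the sum of f·x² as the approximate variance, which is the approximation the table itself makes. Function names and parameter values are illustrative assumptions.

```python
def table1_moments(u, r_c, c):
    """Approximate mean and variance per gamete from the Table 1 categories."""
    # Each entry is (x, f): x = mutant copies contributed, f = frequency.
    states = [
        (0, 1.0 - u),           # no change
        (1, u * (1.0 - r_c)),   # single independent mutation
        (c, u * r_c / c),       # early premeiotic cluster of c copies
    ]
    mean = sum(f * x for x, f in states)
    approx_variance = sum(f * x * x for x, f in states)
    return mean, approx_variance


def r_u(r_c, c):
    """Equation (7): variance-to-mean ratio of mutation rate."""
    return 1.0 + r_c * (c - 1)


mean, var = table1_moments(u=1e-5, r_c=0.2, c=6)
print(var / mean, r_u(0.2, 6))
```

Setting c = 1 in either function gives a ratio of one, the Poisson case described in the text.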

The R(u) value in (7) is for an ideal population, where c does not vary. However, in a real population, cluster sizes do vary with family size and with the sampling process, and this may further increase the values of R(u). How does variability in cluster size affect molecular evolution in real populations?

DISPERSION INDEX (R(t)) OF MOLECULAR EVOLUTION

Independent (Poisson) mutations and premeiotic clusters of mutations should be modeled in combination in molecular evolution. In every generation, the mixture distribution used to describe clustering of premeiotic mutation has the mean $p_c\bar{c}$ and the variance $p_c(\sigma_c^2 + \bar{c}^2)$: $p_c$ (the probability or frequency of premeiotic cluster events) is equal to $u r_c/\bar{c}$, $\bar{c}$ is equal to the average cluster size, including those not sampled in the next generation, and $\sigma_c^2$ is the variance of cluster size. This distribution can be modeled as either a generalized or a compound Poisson distribution (FELLER 1968).

TABLE 1

Frequencies of mutation state

                                                   Approximate contribution to
Mutation results (a)   Frequency (f)               Mean (fx)        Variance (fx^2)
No change (0)          1 - u                       0                0
Single (1)             u(1 - r_c)                  u(1 - r_c)       u(1 - r_c)
Double (2)             u^2(1 - r_c)^2 ≈ 0          ≈ 0              ≈ 0
Triple (3)             u^3(1 - r_c)^3 ≈ 0          ≈ 0              ≈ 0
Etc. (n)               u^n(1 - r_c)^n ≈ 0          ≈ 0              ≈ 0
Cluster (c)            u r_c/c                     u r_c            u r_c c
Sum                    1 - u r_c(1 - 1/c)          u                u + u r_c(c - 1)
                       (scaling factor)            (approx. mean)   (approx. variance)

(a) Values of x in parentheses.

$\bar{c}$ and $\sigma_c^2$ will be treated as the mean and variance of family size at the individual level, but they can also be the mean and variance of replication success or transmission success from one generation to the next at levels lower than the individual. If there are no different patterns of selection among all the levels, then all will give the same mean and variance of cluster size, and the individual level is modeled for convenience.

The compound Poisson process has previously been used to model multiple fixation events in molecular evolution (GOLDING 1987; GILLESPIE 1991). We are using this distribution to model part of the origination events, i.e., clusters of new mutant alleles. Note that both the mean ($\bar{c}_i = \tfrac{1}{2}\bar{k}_i$) and the variance ($\sigma_c^2 = \tfrac{1}{4}\bar{k}_i + \tfrac{1}{4}\sigma_k^2$) of this generalized or compound Poisson distribution may vary from generation to generation, even if the new mutation is completely neutral, and this contributes additional stochastic components to the origination process.

The variance of family size, $\sigma_k^2$, can be obtained from formulas for inbreeding effective population size of generation i, N_e(i),

$$N_{e(i)} = \frac{N_i\bar{k}_i - 1}{\bar{k}_i - 1 + \sigma_k^2/\bar{k}_i}$$

for monoecious species, with N_i the actual population size at generation i (WRIGHT 1938; CROW and KIMURA 1970). (For dioecious species, the numerator is altered to $N_i\bar{k}_i - 2$, which makes little difference in the calculation of $\sigma_k^2$, unless N_i or N_e(i) is very small.)

TABLE 2

Frequency of mutation for each family

                                                                  Approximate contribution to
Mutation results for each family   Frequency (f)                  Mean (fx)        Variance (fx^2)
No change (0)                      1 - uc(1 - r_c)                0                0
Single (1/c)                       uc(1 - r_c)/1!                 uc(1 - r_c)      uc(1 - r_c)
Double (2/c)                       (uc)^2(1 - r_c)^2/2! ≈ 0       ≈ 0              ≈ 0
Triple (3/c)                       (uc)^3(1 - r_c)^3/3! ≈ 0       ≈ 0              ≈ 0
Etc. (n/c, n < c)                  (uc)^n(1 - r_c)^n/n! ≈ 0       ≈ 0              ≈ 0
Cluster (c/c)                      u r_c                          u r_c c          u r_c c^2
Sum                                1 + u r_c                      uc               uc + u r_c c(c - 1)
                                   (scaling factor)               (approx. mean)   (approx. variance)

Stable population size: If the population size is stable, $\bar{k}_i = 2$ for all generations; therefore, $\bar{c} = 1$ and $\sigma_c^2 \approx N_i/N_{e(i)}$ (if N_i is not small). Therefore, for the cluster distribution the mean ($p_c\bar{c}$) is equal to u r_c ($p_c = u r_c/\bar{c}$ and $p_c\bar{c} = u r_c$). The variance [$p_c(\sigma_c^2 + \bar{c}^2)$] for this cluster distribution is equal to $u r_c(\sigma_c^2 + 1)$ [with $\bar{c} = 1$, $p_c(\sigma_c^2 + \bar{c}^2) = u r_c(\sigma_c^2 + 1)$]. When clusters are combined with independent mutations that follow a Poisson distribution, where mean and variance are both equal to u(1 - r_c), the combined mean for independent and clustered mutations is u, and the combined variance is $u(1 + r_c\sigma_c^2)$. The ratio of variance to mean, R(u), for any single generation i in a stable population is

$$R(u) = 1 + r_c\sigma_c^2 \approx 1 + r_c N_i/N_{e(i)}. \qquad (8a)$$

Applying this ratio over many (t) generations, with N_i constant for all generations (N_i = N for all i, hence E(N_i) = N),

$$R(t) = 1 + r_c E(N_i/N_{e(i)}) = 1 + r_c N\,E(1/N_{e(i)}).$$

Since long-term effective population size, N_e(long-term) = harmonic mean(N_e(i)) = 1/E(1/N_e(i)),

$$R(t) = 1 + r_c N/N_{e(long\text{-}term)}. \qquad (8b)$$
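Equation (8b) can be illustrated with a short calculation in which the long-term effective size is the harmonic mean of the per-generation values. The sketch below is a minimal illustration; the function name and the population sizes are hypothetical.

```python
from statistics import harmonic_mean


def r_t_stable(r_c, n, ne_by_generation):
    """Equation (8b): R(t) for constant census size n and varying N_e(i)."""
    ne_long_term = harmonic_mean(ne_by_generation)
    return 1.0 + r_c * n / ne_long_term


# Hypothetical values: census size 1000, effective sizes 500 and 250.
ne = [500, 250]
print(r_t_stable(0.2, 1000, ne))

# The same value follows from averaging N/N_e(i) over generations,
# which is the intermediate step between (8a) and (8b).
print(1.0 + 0.2 * sum(1000 / x for x in ne) / len(ne))
```

Because the harmonic mean is dominated by small values, a few generations with low N_e(i) raise R(t) more than an arithmetic average of N_e(i) would suggest.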

Fluctuating population size: Fluctuations of population sizes should also be taken into consideration in molecular evolution. In any generation i, $\bar{k}_i = 2N_{i+1}/N_i$, $\bar{c}_i = \tfrac{1}{2}\bar{k}_i$,

$$\sigma_k^2 = (N_i/N_{e(i)} - 1)\bar{k}_i^2 + (1 - 1/N_{e(i)})\bar{k}_i$$

and

$$\sigma_c^2 = \tfrac{1}{4}(N_i/N_{e(i)} - 1)\bar{k}_i^2 + \tfrac{1}{4}(2 - 1/N_{e(i)})\bar{k}_i \approx \tfrac{1}{4}(N_i/N_{e(i)} - 1)\bar{k}_i^2 + \tfrac{1}{2}\bar{k}_i$$

(assuming N_e(i) is large). Thus, the generalized or compound Poisson distribution for clusters has a mean ($p_c\bar{c}_i$) equal to u r_c and variance equal to $u r_c[(N_i/N_{e(i)})\bar{k}_i/2 + 1]$ [with $p_c = u r_c/\bar{c}_i$ and $\bar{c}_i = \tfrac{1}{2}\bar{k}_i$, variance $p_c(\sigma_c^2 + \bar{c}_i^2) = u r_c[(N_i/N_{e(i)})\bar{k}_i/2 + 1]$]. The Poisson distribution for independent mutations has both mean and variance of u(1 - r_c). Combining the Poisson and the cluster distributions, the mean mutation rate is u, and the variance is $u[1 + r_c(N_i/N_{e(i)})\bar{k}_i/2]$. Hence, the variance-to-mean ratio for mutation rate, R(u), in generation i is

$$R(u) = 1 + r_c(N_i/N_{e(i)})\bar{k}_i/2, \qquad (9a)$$

and the variance-to-mean ratio of neutral evolution in a fluctuating population accumulated over t generations is

$$R(t) = 1 + r_c E[(N_i/N_{e(i)})\bar{k}_i/2]. \qquad (9b)$$
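The step from the family-size variance to (9a) can be traced numerically. The sketch below assumes the expressions for the variance of family size and of cluster size as written above, computes the variance-to-mean ratio without the large-N_e(i) approximation, and compares it with (9a); the two differ by r_c/(2N_e(i)). All function names and input values are illustrative assumptions.

```python
def sigma_k2(n, ne, k_bar):
    """Variance of family size implied by the inbreeding effective size."""
    return (n / ne - 1.0) * k_bar ** 2 + (1.0 - 1.0 / ne) * k_bar


def sigma_c2(n, ne, k_bar):
    """Variance of cluster size when a cluster is about half of a family."""
    return 0.25 * k_bar + 0.25 * sigma_k2(n, ne, k_bar)


def r_u_unapproximated(r_c, n, ne, k_bar):
    """Variance/mean of the Poisson plus cluster mixture (u cancels out)."""
    c_bar = k_bar / 2.0
    cluster_part = (r_c / c_bar) * (sigma_c2(n, ne, k_bar) + c_bar ** 2)
    return (1.0 - r_c) + cluster_part


def r_u_9a(r_c, n, ne, k_bar):
    """Equation (9a), which assumes N_e(i) is large."""
    return 1.0 + r_c * (n / ne) * k_bar / 2.0


# Hypothetical generation: N_i = 1000, N_e(i) = 500, stable size (k_bar = 2).
args = (0.2, 1000, 500, 2.0)
print(r_u_unapproximated(*args), r_u_9a(*args))
```

For a stable population (k_bar = 2) the result also reduces to (8a), since N_i/N_e(i) times k_bar/2 is then just N_i/N_e(i).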

With $\bar{k}_i = 2N_{i+1}/N_i$, if one assumes that N_e(i) is independent of N_{i+1} in (9b), then

$$R(t) = 1 + r_c E(N_{i+1})E(1/N_{e(i)}).$$

Since long-term effective population size, N_e(long-term) = harmonic mean(N_e(i)) = 1/E(1/N_e(i)), and E(N_{i+1}) = E(N_i), we have

$$R(t) = 1 + r_c E(N_i)/N_{e(long\text{-}term)}. \qquad (9c)$$

On the other hand, if one assumes that Cov(N_i/N_e(i), $\bar{k}_i$) = 0, or that N_i/N_e(i) and $\bar{k}_i$ are independent, then from (9b)

$$R(t) = 1 + r_c E(N_i/N_{e(i)})E(\bar{k}_i)/2. \qquad (9d)$$

If N_i/N_e(i) and $\bar{k}_i$ are positively correlated, then (9d) underestimates R(t). If these two values are negatively correlated, (9d) overestimates the dispersion index R(t). In general,

$$R(t) = 1 + r_c[E(N_i/N_{e(i)})E(\bar{k}_i) + \mathrm{Cov}(N_i/N_{e(i)}, \bar{k}_i)]/2. \qquad (9e)$$
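The relation between (9b), (9d) and (9e) is the identity E(XY) = E(X)E(Y) + Cov(X, Y). A minimal sketch, with hypothetical per-generation series and illustrative function names, shows that (9e) reproduces (9b) and that (9d) falls below it when the two quantities are positively correlated.

```python
from statistics import fmean


def r_t_9b(r_c, ratio, k_bar):
    """Equation (9b): expectation of the per-generation product."""
    return 1.0 + r_c * fmean(a * k / 2.0 for a, k in zip(ratio, k_bar))


def r_t_9d(r_c, ratio, k_bar):
    """Equation (9d): product of expectations, covariance assumed zero."""
    return 1.0 + r_c * fmean(ratio) * fmean(k_bar) / 2.0


def r_t_9e(r_c, ratio, k_bar):
    """Equation (9e): product of expectations plus the covariance term."""
    ea, ek = fmean(ratio), fmean(k_bar)
    cov = fmean((a - ea) * (k - ek) for a, k in zip(ratio, k_bar))
    return 1.0 + r_c * (ea * ek + cov) / 2.0


# Hypothetical series of N_i/N_e(i) and k_bar_i, positively correlated.
ratio = [1.0, 2.0, 3.0]
k_bar = [2.0, 4.0, 6.0]
print(r_t_9b(0.2, ratio, k_bar), r_t_9d(0.2, ratio, k_bar), r_t_9e(0.2, ratio, k_bar))
```

The covariance here is the population covariance over the generations supplied, so that (9e) matches (9b) exactly rather than only in expectation.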

In summary, for a more accurate estimation of the variance-to-mean ratio of molecular evolution in a fluctuating population, one needs to estimate the proportion of new mutations that occur in clusters (r_c), the mean fitness distribution (N_i/N_e(i)) and the average family size ($\bar{k}_i$) over time, plus the covariance of the latter two parameters.

For a complete model with all possible cluster sizes, not just those modeled from (5)-(9) with cluster sizes $\bar{c}_i = \tfrac{1}{2}\bar{k}_i$, see the APPENDIX.

ESTIMATION OF DISPERSION INDEX OF MOLECULAR EVOLUTION

Molecular evolution has been assumed to follow a simple Poisson origination process with the expected R(t) close to one for most evolutionary models. However, some R(t) values are observed to be significantly greater than one (GILLESPIE 1991, 1994; OHTA 1995). This disparity between expected and observed R(t) values has been explained mainly by events that occur after the origination of new alleles, i.e., by different types of selection. In this study it has been shown that the origination process is more complex than previously assumed. Independent mutations, defined as those that occur near or during meiosis and that fit the Poisson distribution, almost always arise as single events. However, there is also a large input of clusters of identical new mutations in an idealized uniform cluster size, with

$$c_i = (N_i/N_{e(i)})\bar{k}_i/2 + 1 \qquad (10)$$

(compare Equation 7 with 9a; see also WOODRUFF et al. 1996).

This more complex origination process leads to a mutational variance-to-mean ratio that is always greater than one. In addition, as shown in (9b) and (9d), this increased dispersion is dependent on the mutation rate distribution during the germ-cell development program (r_c) within individuals, the variation of within-generation reproductive success [N_i/N_e(i)], and the population fluctuations among generations ($\bar{k}_i$). These factors acting together always cause an expected R(t) > 1.

[Figure 2: plot of the expected $\bar{k}_i$ against Var($\bar{k}_i$); horizontal axis from 0 to 140.]

FIGURE 2.-The expected $\bar{k}_i$ in stable and fluctuating populations over evolutionary time. In a fluctuating population persisting t generations, $N_t \approx N_0$, and $N_t = N_0[\text{geometric mean}(\bar{k}_i)/2]^t$. Hence, the geometric mean of $\bar{k}_i$ is very close to 2.0 if one follows the fluctuating population over many generations. Also note: geometric mean($\bar{k}_i$) ≈ E($\bar{k}_i$) - Var($\bar{k}_i$)/[2E($\bar{k}_i$)], therefore [equation not recoverable from this copy]. Notice that if the population size is constant (Var($\bar{k}_i$) = 0, $\bar{k}_i$ = 2.0 for every generation over all t generations), E($\bar{k}_i$) = 2.0. Whenever a population fluctuates, Var($\bar{k}_i$) > 0 and E($\bar{k}_i$) > 2.0. Since population sizes almost always fluctuate over time, a large range (0-140) of variance around $\bar{k}_i$ is used.
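The relation stated in the Figure 2 caption can be solved for the expected $\bar{k}_i$. The sketch below is a reconstruction under stated assumptions, not necessarily the expression the authors plotted: it takes the approximation geometric mean ≈ E - Var/(2E), sets the geometric mean to 2.0, and returns the positive root of the resulting quadratic. The function name is hypothetical.

```python
import math


def expected_k_bar(var_k, geometric_mean=2.0):
    """Positive root of 2*E**2 - 2*G*E - V = 0, i.e. G = E - V/(2*E)."""
    g = geometric_mean
    return (g + math.sqrt(g * g + 2.0 * var_k)) / 2.0


# Variance range used in Figure 2 (0 to 140).
for v in (0, 20, 60, 140):
    print(v, round(expected_k_bar(v), 2))
```

Under these assumptions the expected value rises from 2.0 at zero variance to a little under 10 at a variance of 140, which is consistent with the range of 2 to 10 used in the text.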

generations before and after somatic-germ cell differentiation, the mutation rate of each cell generation, and the possible germ-cell lineage sorting effects. It is important that we get measures of r_c for different types of mutation events in a variety of multicellular organisms. In the estimations of R(t) from (9d), r_c values have been used that range from 0.05 to 0.40.

For the mean ratio of actual population size to inbreeding effective population size (N_i/N_e(i)) over many generations, we have used the values of 1 to 10 in (9d) (see FRANKHAM 1995 for similar reported N_i/N_e(i) values).

The expected $\bar{k}_i$ value over many generations is mainly dependent on whether the population size is stable or not. If the population fluctuates, the expected $\bar{k}_i$ is mainly determined by the variance of $\bar{k}_i$ over evolutionary time (see Figure 2). We have used an expected $\bar{k}_i$ in the range from 2 to 10.

With the above estimations of r_c (0.05-0.4), N_i/N_e(i) (1-10) and expected $\bar{k}_i$ (2-10), the expected R(t) values with clusters based on (9d) are given in Figure 3. From this figure it is clear that R(t) is expected to be >1, with values as high as 20 in some “extreme” cases. Even with a low estimation of N_i/N_e(i) of two, expected $\bar{k}_i$ of four, and an r_c value of 0.2, R(t) is about two.
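The parameter ranges quoted above can be put through (9d) directly. This is a minimal sketch with a hypothetical function name; it evaluates the example in the text and the corners of the stated ranges.

```python
def r_t_from_9d(r_c, mean_ratio, mean_k_bar):
    """Equation (9d) with E(N_i/N_e(i)) and E(k_bar_i) supplied as numbers."""
    return 1.0 + r_c * mean_ratio * mean_k_bar / 2.0


# Example from the text: N_i/N_e(i) = 2, expected k_bar_i = 4, r_c = 0.2.
print(r_t_from_9d(0.2, 2.0, 4.0))

# Corners of the ranges used for Figure 3.
for r_c in (0.05, 0.10, 0.20, 0.40):
    print(r_c, r_t_from_9d(r_c, 1.0, 2.0), r_t_from_9d(r_c, 10.0, 10.0))
```

The example evaluates to 1.8, the "about two" of the text, and the upper corner with r_c = 0.40 gives 21, in line with values "as high as 20" in extreme cases.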

In the above analyses, we focus on the within-population variance-to-mean ratio R(t), but the same formulas can be applied to any species and to any evolutionary lineage, as long as they are multicellular organisms with soma and germ-cell differentiation.

Each evolutionary lineage may have its own variance, mean, and therefore variance-to-mean ratio of evolutionary rate. Related lineages may not have independent population dynamics, r_c or N_i/N_e(i) ratio, so great care is needed in combining their means and variances in molecular clock studies.

DISCUSSION

The origination of mutation is more heterogeneous than a simple Poisson process where the variance-to-mean ratio, R(t), is equal to one. When premeiotic clusters of mutation are included, the variance of mutation rate is always larger than the mean. This variance of mutation rate is dependent on population factors such as within-generation variation of reproductive success (N_i/N_e(i)) and among-generation population dynamics ($\bar{k}_i$). Since there is always fitness variation within any generation and populations do fluctuate among generations, these population factors will lead to R(t) > 1, even under the neutral model of molecular evolution (KIMURA 1983). Hence, the observed molecular clock, with R(t) > 1, may not be overdispersed, i.e., with premeiotic clusters of new mutants, R(t) is expected to be greater than one.

A possible test of the importance of premeiotic clusters of mutation on the overdispersed molecular clock is a comparison of the index of dispersion of multicellular and single-cell organisms. The latter organisms should have a molecular clock that fits a Poisson distribution better, with variance-to-mean ratios closer to one, because there are no premeiotic clusters of mutation (r_c = 0). For example, OCHMAN and WILSON (1987) reported that the evolution rate of 16S rRNA in eubacteria is closer to a Poisson expectation than is that of protein evolution in mammals, but more studies are needed on this topic. If one assumes that the mechanisms that have been proposed to explain the overdispersed molecular clock (GILLESPIE 1984; TAKAHATA 1987; IWASA 1993) are the same in multicellular and single-cell organisms, the difference in dispersion index between these two types of organisms is a possible measure of the influence of premeiotic clusters of mutation on the molecular clock.

There is another cause of an increased variance-to-mean ratio of mutation rate: different rates of mutation for the leading and lagging strands of newly synthesized DNA (FURUSAWA and DOI 1992; VEAUTE and FUCHS

[Figure 3: four panels of expected R(t) surfaces. (A) r_c = 0.05; (B) r_c = 0.10; (C) r_c = 0.20; (D) r_c = 0.40.]

FIGURE 3.-Estimation of dispersion index [R(t)] of molecular evolution. Here we use (9d) [R(t) = 1 + r_c E(N_i/N_e(i)) E($\bar{k}_i$)/2] to estimate the variance-to-mean ratio of molecular evolutionary rate. In the estimations of R(t) from (9d), we choose r_c values of 0.05, 0.10, 0.20 and 0.40 to cover a reasonable range of the proportion of new mutations that occur in clusters. The reason to choose E($\bar{k}_i$) values that range from 2 to 10 is given in Figure 2. For the ratio of actual population size to inbreeding effective population size (N_i/N_e(i)), we use the values of 1 to 10 in this figure (see FRANKHAM 1995 for similar estimated N_i/N_e(i) values).

tion rate 100 times the leading strand, the variance-to-mean ratio is less than 1.50 (H. HUAI and R. C. WOODRUFF, unpublished results).

Our model also predicts that both synonymous and nonsynonymous sites should have variance-to-mean ratios of molecular evolutionary rate greater than one. During the long evolutionary process, nonsynonymous sites are frequently exposed to stronger and more types of natural selection, i.e., deleterious mutations are selected against while beneficial ones are selected for. They will have larger variation of replication success or larger variation of transmission success compared with synonymous sites. The evolutionary or population dynamics in the long run between synonymous and non-

that resembles a reduction of long-term effective population size (CHARLESWORTH et al. 1995).

The increase in the variance of mutation rate associated with premeiotic clusters of mutation that alters our view of the molecular clock also increases the new mutational variance input each generation by the same amount. This altered mutational variance will lead to a corresponding increase in standing genetic variation, a reduction of mutational load, and a likely change in allelic frequency distributions (R. C. WOODRUFF, H. HUAI and J. N. THOMPSON, unpublished results).

We thank PETER BEERLI, IAN BOUSSY, SIMON EASTEAL, BRIAN GOLDING, HAIYAN HUAI, RUSSELL LANDE, TOMOKO OHTA, ADAM PORTER, NAOYUKI TAKAHATA and JAMES N. THOMPSON, JR. for their comments on this manuscript. This work was supported, in part, by an Ohio Board of Regents Academic Challenge Award to the Department of Biological Sciences, Bowling Green State University (R.C.W.) and by the Graduate College, Bowling Green State University (H.H.).

LITERATURE CITED

AUERBACH, C., 1962 Mutation: A n Introduction toResearch on Mutagene-

sis. Part I. Methods. Oliver and Boyd, Edinburgh.

BRIDGES, P. J., 1994 The Calculation of Genetic Risks. The Johns Hop- kins University Press, Baltimore.

CHARLESWORTH, D., B. CHARLESWORTH and M.T. MORGAN, 1995

The pattern of neutral molecular variation under the back- ground selection model. Genetics 141: 1619-1632.

COOPER, D. N., and M. KRAWCZAK, 1993 Human Gene Mutation. BIOS Scientific Publishers Limited, Oxford.

CROW,J. F., and M. KIMuRA, 1970 A n Introduction to Population Genet-

ics Theoly. Burgess Publishing Company, Minneapolis, MN.

DROST, J. B., and W. R. LEE, 1995 Biological basis of germline mutation: comparisons of spontaneous germline mutation rates among Drosophila, mouse and human. Environ. Mol. Mutagen. 25 (Suppl. 26): 48-64.

EASTEAL, S., C. COLLET and D. BETTY, 1995 The Mammalian Molecular Clock. Landes, Austin, TX.

FELLER, W., 1968 An Introduction to Probability Theory and Its Applications, Vol. 1, Ed. 3. Wiley, New York.

FURUSAWA, M., and H. DOI, 1992 Promotion of evolution: disparity in the frequency of strand-specific misreading between the lagging and leading DNA strands enhances disproportionate accumulation of mutations. J. Theoret. Biol. 157: 127-133.

FRANKHAM, R., 1995 Effective population size / adult population size ratios in wildlife: a review. Genet. Res. 66: 95-107.

GILLESPIE, J. H., 1984 The molecular clock may be an episodic clock. Proc. Natl. Acad. Sci. USA 81: 8009-8013.

GILLESPIE, J. H., 1986 Rates of molecular evolution. Annu. Rev. Ecol. Syst. 17: 637-665.

GILLESPIE, J. H., 1991 The Causes of Molecular Evolution. Oxford University Press, Oxford.

GILLESPIE, J. H., 1994 Alternatives to the neutral theory, pp. 1-17 in Non-Neutral Evolution: Theories and Molecular Data, edited by G. B. GOLDING. Chapman & Hall, New York.

GOLDING, G. B., 1987 Multiple substitutions create biased estimates of divergence times and small increase in the variance to mean ratio. Heredity 58: 331-339.

GOLDMAN, N., 1994 Variance to mean ratio, R(t), for Poisson processes on phylogenetic trees. Mol. Phylogenet. Evol. 3: 230-239.

HARTL, D. L., 1971 Recurrence risks for germinal mosaics. Am. J. Hum. Genet. 23: 124-134.

IWASA, Y., 1993 Overdispersed molecular evolution in constant environments. J. Theor. Biol. 164: 373-393.

KIMURA, M., 1983 The Neutral Theory of Molecular Evolution. Cambridge University Press, Cambridge.

LANGLEY, C. H., and W. M. FITCH, 1974 An estimation of the constancy of the rate of molecular evolution. J. Mol. Evol. 3: 161-177.

MULLER, H. J., 1954 The nature of the genetic effects produced by radiation, pp. 351-473 in Radiation Biology, edited by A. HOLLAENDER. McGraw-Hill, New York.

OCHMAN, H., and A. C. WILSON, 1987 Evolution in bacteria: evidence for a universal substitution rate in cellular genomes. J. Mol. Evol. 26: 74-86.

OHTA, T., 1995 Synonymous and nonsynonymous substitutions in mammalian genes and the nearly neutral theory. J. Mol. Evol. 40: 56-63.

OHTA, T., and M. KIMURA, 1971 On the constancy of the evolutionary rate of cistrons. J. Mol. Evol. 1: 18-25.

OHTA, T., and H. TACHIDA, 1990 Theoretical study of near neutrality. I. Heterozygosity and rate of mutant substitution. Genetics 126: 219-229.

RUSSELL, L. B., 1964 Genetic and functional mosaicism in the mouse, pp. 153-181 in The Role of Chromosomes in Development, edited by M. LOCKE. Academic Press, New York.

RUSSELL, L. B., and W. L. RUSSELL, 1996 Spontaneous mutations recovered as mosaics in the mouse specific-locus test. Proc. Natl. Acad. Sci. USA 93: 13072-13077.

SULSTON, J. E., E. SCHIERENBERG, J. G. WHITE and J. N. THOMPSON, 1983 The embryonic cell lineage of the nematode Caenorhabditis elegans. Dev. Biol. 100: 64-119.

TACHIDA, H., 1991 A study on a nearly neutral mutation model in finite populations. Genetics 128: 183-192.

TAKAHATA, N., 1987 On the overdispersed molecular clock. Genetics 116: 169-179.

TAKAHATA, N., 1991 Statistical models of the overdispersed molecular clock. Theoret. Popul. Biol. 39: 329-344.

VEAUTE, X., and R. P. P. FUCHS, 1993 Greater susceptibility to mutations in lagging strand of DNA replication in Escherichia coli than in leading strand. Science 261: 598-600.

WADA, K., H. DOI, S. TANAKA, Y. WADA and M. FURUSAWA, 1993 A neo-Darwinian algorithm: asymmetrical mutations due to semiconservative DNA-type replication promote evolution. Proc. Natl. Acad. Sci. USA 90: 11934-11938.

WIJSMAN, E. M., 1991 Recurrence risk of a new dominant mutation in children of unaffected parents. Am. J. Hum. Genet. 48: 654-661.

WOODRUFF, R. C., and J. N. THOMPSON, JR., 1992 Have premeiotic clusters of mutation been overlooked in evolutionary theory? J. Evol. Biol. 5: 457-464.

WOODRUFF, R. C., H. HUAI and J. N. THOMPSON, JR., 1996 Clusters of identical new mutation in the evolutionary landscape. Genetica 98: 149-160.

WRIGHT, S., 1938 Size of population and breeding structure in relation to evolution. Science 87: 430-431.

YOUNG, I. D., 1991 Introduction to Risk Calculations in Genetic Consulting. Oxford University Press, Oxford.

Communicating editor: N. TAKAHATA

APPENDIX

Model for clusters of all possible sizes: Some multicellular species may not have a simple development program that can be idealized as in Figure 1. Smaller clusters between the two “extremes” shown in Figure 1A and Figure 1B may also be common among new mutants. The cluster effect on the inflated variance-to-mean ratio of mutation rate and the molecular clock can also be modeled by considering all possible cluster sizes, not just those limited to cluster sizes of $c_i = \frac{1}{2}\bar{k}_i$ (see Figure 1A).

Define $r_{c,f_{c_i}}$ as the proportion of cluster mutants among all new mutants in a characterized fraction $f_{c_i}$ ($0 < f_{c_i} \le 1/2$) out of $\bar{k}_i$ gametes. The mean cluster size [...]


two clusters of three and five clusters of two among 100 new mutants. Then

$$r_{c,f_{c_i}=0.50} = 1 \times 5/100 = 0.05$$

$$r_{c,f_{c_i}=0.30} = 2 \times 3/100 = 0.06$$

$$r_{c,f_{c_i}=0.20} = 5 \times 2/100 = 0.10.$$
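The bookkeeping behind these three values can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' procedure: the function name `cluster_proportions` is hypothetical, the sample of 10 gametes per parent is assumed so that clusters of five, three and two correspond to fractions 0.50, 0.30 and 0.20, and the single cluster of five is inferred from the first line of the arithmetic above.

```python
from fractions import Fraction

def cluster_proportions(cluster_counts, total_mutants, gametes_per_parent):
    """Map each cluster fraction f to r_{c,f} = (clusters * size) / total mutants.

    cluster_counts maps cluster size -> number of clusters of that size.
    """
    proportions = {}
    for size, n_clusters in cluster_counts.items():
        f = Fraction(size, gametes_per_parent)  # fraction of gametes in the cluster
        proportions[f] = Fraction(n_clusters * size, total_mutants)
    return proportions

# Assumed sample: one cluster of five, two of three, five of two, 100 new mutants.
counts = {5: 1, 3: 2, 2: 5}
r = cluster_proportions(counts, total_mutants=100, gametes_per_parent=10)
for f, value in sorted(r.items(), reverse=True):
    print(f"f = {float(f):.2f}: r = {float(value):.2f}")
```

Exact fractions are used so that the proportions can be compared without floating-point rounding.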

This new evolutionary parameter, $r_{c,f_{c_i}}$, not only represents the proportion of premeiotic clusters among all new mutations, but is also the primary contributor to the recurrent probability that another gamete within an individual shares an identical new mutation. For a particular gene, excluding its homologue in a diploid organism, the conditional probability of another allele identical to the initial mutant within an individual is

$$2\sum_{f_{c_i}=0}^{f_{c_i}=1/2} f_{c_i}\, r_{c,f_{c_i}} + \left(1 - 2\sum_{f_{c_i}=0}^{f_{c_i}=1/2} f_{c_i}\, r_{c,f_{c_i}}\right) u. \tag{A1}$$

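Considering both copies of a gene in a diploid organism, the recurrent probability of an identical mutant gamete, given an initial mutation, is

$$\underbrace{\frac{1}{2}\left[2\sum_{f_{c_i}=0}^{f_{c_i}=1/2} f_{c_i}\, r_{c,f_{c_i}} + \left(1 - 2\sum_{f_{c_i}=0}^{f_{c_i}=1/2} f_{c_i}\, r_{c,f_{c_i}}\right) u\right]}_{\text{mutation in same gene}} + \underbrace{\frac{u}{2}}_{\text{mutation in homologue}} = \sum_{f_{c_i}=0}^{f_{c_i}=1/2} f_{c_i}\, r_{c,f_{c_i}} + \left(1 - \sum_{f_{c_i}=0}^{f_{c_i}=1/2} f_{c_i}\, r_{c,f_{c_i}}\right) u. \tag{A2}$$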

All the $r_{c,f_{c_i}}$ values can be orders of magnitude greater than $u$, which is the recurrent probability of identical mutations in the absence of clusters. The occurrence of premeiotic clusters therefore greatly alters the probability distribution of new identical mutations among gametes from any multicellular individual, without changing the total mutation rate. If one assumes that the only cluster mutations are the early premeiotic ones that occur before germ-cell differentiation, then (A1) and (A2) give the same predictions of recurrent probability as (5) and (6).
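A short numerical sketch may help to show how strongly clusters raise the recurrence probability. It reads (A1) as $2S + (1 - 2S)u$ and (A2) as $S + (1 - S)u$, where $S$ is the sum of $f_{c_i} r_{c,f_{c_i}}$ over cluster classes. The function names are hypothetical, the cluster proportions are taken from the worked example above, and the mutation rate `u = 1e-5` is an assumed value chosen only for illustration.

```python
import math

def weighted_cluster_sum(clusters):
    """Return S, the sum over cluster classes of f * r_{c,f}."""
    return sum(f * r for f, r in clusters.items())

def recurrence_same_copy(clusters, u):
    """Recurrence probability for one gene copy, as (A1) is read here."""
    s = weighted_cluster_sum(clusters)
    return 2 * s + (1 - 2 * s) * u

def recurrence_diploid(clusters, u):
    """Recurrence probability over both gene copies, as (A2) is read here."""
    s = weighted_cluster_sum(clusters)
    return s + (1 - s) * u

# Cluster fraction f -> proportion r_{c,f}, from the worked example.
clusters = {0.50: 0.05, 0.30: 0.06, 0.20: 0.10}
u = 1e-5  # assumed per-gamete mutation rate, for illustration only

a1 = recurrence_same_copy(clusters, u)
a2 = recurrence_diploid(clusters, u)

# (A2) is half of (A1) plus u/2 for a new mutation in the homologue.
assert math.isclose(a2, a1 / 2 + u / 2)
print(f"(A1) = {a1:.5f}, (A2) = {a2:.5f}, (A2)/u = {a2 / u:.0f}")
```

With these inputs the diploid recurrence probability is about 0.063, several thousand times larger than the assumed $u$, which is the sense in which clusters dominate recurrence risk.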

In every generation, the distribution used to describe these clusters of premeiotic mutation has the mean $p_{c_i} c_i$ and the variance $p_{c_i}(\sigma^2_{c_i} + c_i^2)$. Here, $p_{c_i}$, the probability or frequency of premeiotic cluster events happening, is $u\, r_{c,f_{c_i}} / c_i$, where $c_i$ is cluster size, and $\sigma^2_{c_i}$ is the variance of cluster size around the characterized mean cluster size $c_i$. This distribution can also be modeled in the generalized or compound Poisson distribution as before. Based on probability theory, this generalized or compound Poisson distribution has mean characterized cluster size $c_i$ and variance of cluster size $\sigma^2_{c_i}$, with the mean and variance of progeny size $\bar{k}_i$ and $\sigma^2_{k_i}$, respectively.

The variance of family size, $\sigma^2_{k_i}$, can be obtained from the formula for inbreeding effective population size of generation $i$, $N_{e(i)}$,

$$N_{e(i)} = \frac{N_i \bar{k}_i - 1}{\sigma^2_{k_i}/\bar{k}_i + \bar{k}_i - 1}.$$
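As a rough illustration of how the family-size variance can be backed out of an effective-size estimate, the sketch below assumes the formula has the form $N_e = (N\bar{k} - 1)/(\bar{k} - 1 + \sigma_k^2/\bar{k})$. The generation subscript on $N$ is not legible in the source, so it is left as a plain argument here; the function names and the numerical inputs are hypothetical.

```python
import math

def inbreeding_effective_size(n, k_mean, k_var):
    """Effective size under the assumed form (n*k - 1) / (k - 1 + var/k)."""
    return (n * k_mean - 1) / (k_mean - 1 + k_var / k_mean)

def family_size_variance(n, n_e, k_mean):
    """Invert the assumed formula to recover the variance of family size."""
    return k_mean * ((n * k_mean - 1) / n_e - (k_mean - 1))

# Hypothetical stable population: mean family size 2, variance 6.
n_e = inbreeding_effective_size(n=1000, k_mean=2.0, k_var=6.0)
recovered = family_size_variance(n=1000, n_e=n_e, k_mean=2.0)
print(f"N_e = {n_e:.2f}, recovered variance = {recovered:.2f}")
```

The round trip recovers the input variance, which is the only property the sketch is meant to show.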

For any group of clusters, which is characterized by mean cluster size $c_i$ or by a fraction among all gametes $f_{c_i}$ ($c_i = f_{c_i}\bar{k}_i$), we have

$$p_{c_i}\left(\sigma^2_{c_i} + c_i^2\right) = u\, r_{c,f_{c_i}}\left[1 + \frac{N_i}{N_{e(i)}}\, c_i\right] = u\, r_{c,f_{c_i}}\left[1 + \frac{N_i}{N_{e(i)}}\, f_{c_i}\bar{k}_i\right]. \tag{A3}$$
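One way to see why a relation of this form can hold is a numerical check under an explicit sampling assumption. The sketch below is not taken from the source: it assumes that a cluster is a binomial sample of fraction $f$ from a random number of progeny, so that $\sigma_c^2 = f(1 - f)\bar{k} + f^2\sigma_k^2$, and it uses the large-population limit $N/N_e \approx (\bar{k} - 1 + \sigma_k^2/\bar{k})/\bar{k}$. All function names and input values are hypothetical.

```python
import math

def cluster_variance_term(u, r, f, k_mean, k_var):
    """p_c * (var_c + c^2), with var_c from the assumed binomial sampling."""
    c = f * k_mean                                   # mean cluster size
    var_c = f * (1 - f) * k_mean + f ** 2 * k_var    # assumption, see lead-in
    p_c = u * r / c                                  # frequency of cluster events
    return p_c * (var_c + c ** 2)

def a3_right_side(u, r, f, k_mean, n_over_ne):
    """u * r * [1 + (N/Ne) * f * k_mean], the right side of (A3) as read here."""
    return u * r * (1 + n_over_ne * f * k_mean)

u, r, f = 1e-5, 0.2, 0.25          # hypothetical inputs
k_mean, k_var = 4.0, 10.0          # hypothetical progeny-size moments
n_over_ne = (k_mean - 1 + k_var / k_mean) / k_mean   # large-N limit of N/Ne

lhs = cluster_variance_term(u, r, f, k_mean, k_var)
rhs = a3_right_side(u, r, f, k_mean, n_over_ne)
print(math.isclose(lhs, rhs))
```

Under these assumptions the two sides agree, which suggests that the $N_i/N_{e(i)}$ factor is simply a compact way of carrying the family-size variance.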

Independent mutations, which follow a Poisson distribution, have both mean and variance of $u\left(1 - \sum_{f_{c_i}=0}^{f_{c_i}=1/2} r_{c,f_{c_i}}\right)$. Combining the Poisson and the cluster distributions for all groups of clusters with different sizes, the mean mutation rate is $u$, and the variance is $u\left(1 + \frac{N_i}{N_{e(i)}}\,\bar{k}_i \sum_{f_{c_i}=0}^{f_{c_i}=1/2} f_{c_i}\, r_{c,f_{c_i}}\right)$. Hence, the variance-to-mean ratio for mutation rate, $R(u)$, in generation $i$ is

$$R(u) = 1 + \frac{N_i}{N_{e(i)}}\,\bar{k}_i \sum_{f_{c_i}=0}^{f_{c_i}=1/2} f_{c_i}\, r_{c,f_{c_i}}, \tag{A4}$$

and the variance-to-mean ratio of neutral evolution in a fluctuating population accumulated over $t$ generations is

$$R(t) = 1 + E\left[\frac{N_i}{N_{e(i)}}\,\bar{k}_i \sum_{f_{c_i}=0}^{f_{c_i}=1/2} f_{c_i}\, r_{c,f_{c_i}}\right]. \tag{A5}$$

If most clusters have the same expected characteristic fraction $f_{c_i}$ ($0 < f_{c_i} \le 1/2$) out of $\bar{k}_i$ gametes, e.g., $f_{c_i} = 0.25$ (see RUSSELL 1964; RUSSELL and RUSSELL 1996), then (A4) and (A5) become

$$R(u) = 1 + r_{c,f_{c_i}}\,\frac{N_i}{N_{e(i)}}\,\bar{k}_i f_{c_i} \tag{A6a}$$

$$R(t) = 1 + r_{c,f_{c_i}}\, E\left[\frac{N_i}{N_{e(i)}}\,\bar{k}_i f_{c_i}\right]. \tag{A6b}$$

All the equations above lead to an expected variance-to-mean ratio, $R(t)$, of the molecular clock that is always greater than one, and they are more general than the previous equations (Equations 8 and 9).
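The behavior of these ratios can be explored with a small numerical sketch. It reads (A4) as $R(u) = 1 + (N_i/N_{e(i)})\bar{k}_i \sum f_{c_i} r_{c,f_{c_i}}$ and treats the expectation in (A5) as a simple average over generations, which is an assumption made here. The function names, the cluster proportion and the per-generation values are all hypothetical.

```python
import math
from statistics import mean

def variance_to_mean_mutation(n_over_ne, k_mean, clusters):
    """R(u) for one generation, as (A4) is read here."""
    s = sum(f * r for f, r in clusters.items())
    return 1 + n_over_ne * k_mean * s

def variance_to_mean_clock(generations, clusters):
    """R(t), averaging the excess over generations (assumed reading of E[.])."""
    excess = [variance_to_mean_mutation(x, k, clusters) - 1
              for x, k in generations]
    return 1 + mean(excess)

# Hypothetical: all clusters at fraction 0.25, making up 20% of new mutants.
clusters = {0.25: 0.2}

# Each generation is (N_i / N_e(i), mean progeny size); values are made up.
stable = [(1.0, 2.0)] * 4
fluctuating = [(1.0, 2.0), (5.0, 1.0), (2.0, 4.0), (1.0, 2.0)]

r_stable = variance_to_mean_clock(stable, clusters)
r_fluct = variance_to_mean_clock(fluctuating, clusters)
print(f"stable: {r_stable:.4f}, fluctuating: {r_fluct:.4f}")
```

Because every term added to one is non-negative, the ratio cannot fall below one in this sketch, and generations with a large $N_i/N_{e(i)}$ or a large $\bar{k}_i$ push it higher.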

If one assumes that all recovered clusters are early premeiotic clusters, which occurred before germ-line differentiation, or that late premeiotic clusters, which occurred after germ-cell differentiation, contribute little to mutation rate and evolution, then only $f_{c_i} = 1/2$ needs to be considered. (A4) and (A6a) then simplify to (9a), and (A5) and (A6b) become (9b). Equations 8a and 8b are also special cases where only clusters with $f_{c_i} = 1/2$ or cluster sizes $c_i = \frac{1}{2}\bar{k}_i$ are considered and the population size is constant ($\bar{k}_i = 2$ for all generations).

[...] idealized development program as shown in Figure 1, but one needs much more data on the detailed distribution of clusters of different sizes, which is extremely difficult to measure. In many cases, large clusters ($c_i = \frac{1}{2}\bar{k}_i$) in Figure 1A may well dominate the contribution to the deviation of mutational process away from [...]

Figure

FIGURE 1.-Partition of mutation rate according to an ideal model of gametogenesis in multicellular organisms
TABLE 2 Frequency of mutation for each family
FIGURE 2.-The expected $\bar{k}_i$ in stable and fluctuating populations over evolutionary time.
FIGURE [...] to 10 in this figure (see FRANKHAM 1995 for similar [...] inbreeding effective population size estimated in clusters
