**Copyright © 1997 by the Genetics Society of America**

**An Approximate Model of Polygenic Inheritance**

**Kenneth Lange**

*Departments of Biostatistics and Mathematics, University of Michigan, Ann Arbor, Michigan 48109-2029*

Manuscript received November 25, 1996
Accepted for publication July 23, 1997

ABSTRACT

The finite polygenic model approximates polygenic inheritance by postulating that a quantitative trait is determined by $n$ independent, additive loci. The $3^n$ possible genotypes for each person in this model limit its applicability. CANNINGS, THOMPSON, and SKOLNICK suggested a simplified, nongenetic version of the model involving only $2n + 1$ genotypes per person. This article shows that this hypergeometric polygenic model also approximates polygenic inheritance well. In particular, for noninbred pedigrees, trait means, variances, covariances, and marginal distributions match those of the ordinary finite polygenic model. Furthermore, as $n \to \infty$, the trait values within a pedigree collectively tend toward multivariate normality. The implications of these results for likelihood evaluation under the polygenic threshold and mixed models of inheritance are discussed. Finally, a simple numerical example illustrates the application of the hypergeometric polygenic model to risk prediction under the polygenic threshold model.

THE mixed model of polygenic plus major gene inheritance has proved to be a useful alternative to classical Mendelian models in the analysis of pedigree data (ELSTON and STEWART 1971; MORTON and MACLEAN 1974). Unfortunately, exact likelihood calculation under the mixed model is virtually intractable except for nuclear families and small pedigrees. This impasse has prompted the development of approximate methods of likelihood evaluation in the mixed model. In two recent papers, ELSTON, FERNANDO, and STRICKER (FERNANDO *et al.* 1994; STRICKER *et al.* 1995) have suggested a particularly interesting approach that appears to produce much more accurate results than the best previous approximations (HASSTEDT 1982, 1991). This new approach is computationally fast enough to permit linkage analysis under the mixed model.

ELSTON, FERNANDO, and STRICKER (FERNANDO *et al.* 1994; STRICKER *et al.* 1995) adopt as their point of departure the finite polygenic model suggested earlier by CANNINGS *et al.* (1978). The finite polygenic model approximates polygenic inheritance (FISHER 1918) by postulating that trait values are determined by a small number of biallelic loci that have equal and additive effects. If for convenience the two alleles at each contributing locus are termed positive and negative, then the finite polygenic model incorporates the symmetry assumption that all positive genes contribute $+1$ and all negative genes $-1$ to an individual's trait. Trait means are forced to equal 0 by taking positive and negative alleles to be equally frequent in the surrounding population. An arbitrary trait variance can be achieved by scaling the positive and negative contributions by the same multiplicative constant.

**Author email: ****[email protected] **

**Genetics 147: 1423-1430 (November, 1997) **

As just described, the finite polygenic model for $n$ loci involves $3^n$ possible genotypes for each person. This represents a depressingly fast escalation in combinatorial complexity that defeats likelihood calculation for $n$ sufficiently large to approximate normality well. However, CANNINGS *et al.* (1978) note that the $2n + 1$ possible phenotypes of the model are determined solely by the $2n + 1$ possible counts for the number of positive genes at the participating loci. They then suggest treating multilocus genotypes as equivalent if they involve the same number of positive genes. This radical simplification, which we call the hypergeometric polygenic model, is inconsistent with Mendelian transmission at $n$ separate loci. Because of these objections, the finite and hypergeometric polygenic models have languished.

ELSTON, FERNANDO, and STRICKER (FERNANDO *et al.* 1994; STRICKER *et al.* 1995) now raise the possibility of reviving the finite polygenic model for computational purposes. They do so by altering its gamete transmission probabilities. Their tinkering with the finite polygenic model captures the essential features of the hypergeometric polygenic model. In this article, we explore some of the logical consequences of the hypergeometric polygenic model and demonstrate mathematically that it provides an excellent approximation to polygenic inheritance. Because of this good mimicry, the hypergeometric polygenic model is also apt to be extremely helpful in pedigree calculations involving the polygenic mixed model (MORTON and MACLEAN 1974) and the polygenic threshold model (LANGE *et al.* 1976a). In this regard, the numerical results of ELSTON, FERNANDO, and STRICKER (FERNANDO *et al.* 1994; STRICKER *et al.*


**MODEL DEFINITION**

The hypergeometric polygenic model of CANNINGS *et al.* (1978) is distinguished by two features. One is the absence of loci; the other is transmission from parent to offspring by sampling without replacement. To avoid introducing a completely alien vocabulary in discussing the model, we will retain traditional genetic terms such as gene and genotype but invest them with slightly different meanings. The transmitted agents or genes in the model are classified as either positive or negative. As noted above, positive genes contribute $+1$ and negative genes $-1$ to the quantitative trait $x$ of a person. Each person has a genotype consisting of a set of exactly $2n$ genes and a trait value ranging from $-2n$ to $+2n$. Because genes come in only two varieties, we can identify a genotype $g$ with the number of positive genes contained in it. In this sense, there are $2n + 1$ possible genotypes per person. The formulas $x = 2(g - n)$ and $g = n + (x/2)$ convert a genotype $g$ into a trait value $x$ and vice versa.

Pedigree founders are created by independently sampling $2n$ genes from an infinite pool of equally likely positive and negative genes. Thus, the founder genotype $g$ occurs with the binomial probability

$$\binom{2n}{g}\left(\frac{1}{2}\right)^{2n}. \qquad (1)$$

The genotype of each child in the pedigree is created by sampling without replacement $n$ genes from the genotype of his mother and $n$ genes from the genotype of his father. These two transmitted sets of genes or gametes are pooled to form the child's genotype. Sampling of parental genes is done independently for each gamete created. Sampling without replacement implies that gamete transmission probabilities are hypergeometric. In other words, the probability that a parent with genotype $i$ contributes a gamete with $j$ positive genes is

$$\tau_{i \to j} = \frac{\binom{i}{j}\binom{2n-i}{n-j}}{\binom{2n}{n}}. \qquad (2)$$

Because of independent formation of gametes, parents with genotypes $i$ and $j$ produce a child with genotype $k$ with probability

$$\tau_{i \times j \to k} = \sum_{m=\max(0,\,k-j)}^{\min(i,\,k)} \tau_{i \to m}\,\tau_{j \to k-m}, \qquad (3)$$

as noted by CANNINGS *et al.* (1978).
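The three probabilities that define the model can be sketched in a few lines of Python. This is only an illustrative sketch: the function names `founder_prob`, `gamete_prob`, and `child_prob` are hypothetical and are not taken from CANNINGS *et al.* (1978) or from any published software. Terms outside the support of the hypergeometric law are returned as zero.

```python
from math import comb


def founder_prob(g, n):
    """Binomial probability that a founder carries g positive genes out of 2n."""
    return comb(2 * n, g) / 4 ** n


def gamete_prob(i, j, n):
    """Hypergeometric probability that a parent with i positive genes
    transmits j positive genes in a gamete of n genes."""
    if j < 0 or j > n:
        return 0.0  # a gamete holds exactly n genes
    # math.comb returns 0 when the lower index exceeds the upper index
    return comb(i, j) * comb(2 * n - i, n - j) / comb(2 * n, n)


def child_prob(i, j, k, n):
    """Probability that parents with genotypes i and j have a child with genotype k."""
    lo, hi = max(0, k - j), min(i, k)
    return sum(gamete_prob(i, m, n) * gamete_prob(j, k - m, n)
               for m in range(lo, hi + 1))


def trait_value(g, n):
    """Convert a genotype g into a trait value x = 2(g - n)."""
    return 2 * (g - n)
```

For example, with $n = 2$ two parents who each carry two positive genes produce a child with two positive genes with probability 1/2, as `child_prob(2, 2, 2, 2)` confirms.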

This completes our description and reinterpretation of the hypergeometric polygenic model. We will demonstrate that it satisfies three crucial desiderata. First, in noninbred pedigrees it gives exactly the same covariance structure as the finite polygenic model and indeed the polygenic model itself. Second, in this setting it also entails the same marginal distribution of trait values as the finite polygenic model. Third and finally, it shares with the finite polygenic model the desirable feature that the standardized trait values within a pedigree tend toward multivariate normality as the underlying number of genes $2n$ tends to $\infty$ (LANGE 1978; LANGE and BOEHNKE 1983).

We will also consider a variant of the hypergeometric polygenic model that involves gamete formation by sampling with replacement. This variant is apt to give a decidedly inferior approximation to polygenic inheritance because it fails to capture correctly the covariance structure of a pedigree. This and other issues will be taken up after reviewing some elementary moment calculations relevant to sampling without replacement.

**REVIEW OF SAMPLING WITHOUT REPLACEMENT**

Suppose that a sample of $n$ genes is taken without replacement from a random collection of $2n$ genes with numerical values $W = (W_1, \ldots, W_{2n})$. The trait value associated with this gamete sample can then be expressed as

$$Y = \sum_{i=1}^{2n} A_i W_i,$$

where the $A_i$ are correlated indicator random variables satisfying the conditions $\Pr(A_i = 1) = 1/2$ and $\sum_{i=1}^{2n} A_i = n$. From this indicator function representation (COCHRAN 1977), we immediately deduce the conditional and unconditional means

$$E(Y \mid W) = \sum_{i=1}^{2n} E(A_i) W_i = \frac{1}{2}\sum_{i=1}^{2n} W_i, \qquad E(Y) = \frac{1}{2}\sum_{i=1}^{2n} E(W_i).$$

If $E(W_i) = 0$, then $E(Y) = 0$ as well.

To compute variances, note that each $\operatorname{Var}(A_i) = 1/4$ and that under sampling without replacement

$$\operatorname{Cov}(A_i, A_j) = \Pr(A_i = 1, A_j = 1) - \Pr(A_i = 1)\Pr(A_j = 1) = \frac{1}{2}\cdot\frac{n-1}{2n-1} - \frac{1}{4} = -\frac{1}{4(2n-1)}$$

for $i \ne j$. It follows that

$$\operatorname{Var}(Y \mid W) = \frac{1}{4}\sum_{i=1}^{2n} W_i^2 - \frac{1}{4(2n-1)}\sum_{i \ne j} W_i W_j = \left[\frac{1}{4} + \frac{1}{4(2n-1)}\right]\sum_{i=1}^{2n} W_i^2 - \frac{1}{4(2n-1)}\left(\sum_{i=1}^{2n} W_i\right)^2$$

and

$$\operatorname{Var}(Y) = E[\operatorname{Var}(Y \mid W)] + \operatorname{Var}[E(Y \mid W)].$$

When each $E(W_i) = 0$, these formulas give

$$E[\operatorname{Var}(Y \mid W)] = \left[\frac{1}{4} + \frac{1}{4(2n-1)}\right]\sum_{i=1}^{2n} \operatorname{Var}(W_i) - \frac{1}{4(2n-1)}\operatorname{Var}\left(\sum_{i=1}^{2n} W_i\right)$$

and

$$\operatorname{Var}(Y) = \left[\frac{1}{4} + \frac{1}{4(2n-1)}\right]\sum_{i=1}^{2n} \operatorname{Var}(W_i) + \left[\frac{1}{4} - \frac{1}{4(2n-1)}\right]\operatorname{Var}\left(\sum_{i=1}^{2n} W_i\right). \qquad (4)$$

If, in addition, the random variables $W_i$ are independent and have unit variances, then $\operatorname{Var}(Y) = n$.
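The conditional moment formulas can be checked by brute force for a small gene collection. The sketch below enumerates every equally likely sample of $n$ positions out of $2n$ and compares the exact variance with the closed form for $\operatorname{Var}(Y \mid W)$; the helper names `gamete_moments` and `formula_variance` are hypothetical, and exact rational arithmetic is used so that the comparison involves no rounding.

```python
from fractions import Fraction
from itertools import combinations


def gamete_moments(w):
    """Exact mean and variance of Y, the sum of n genes drawn without
    replacement from the 2n gene values listed in w."""
    n = len(w) // 2
    # combinations() works on positions, so repeated values are handled correctly
    totals = [sum(sample) for sample in combinations(w, n)]
    mean = Fraction(sum(totals), len(totals))
    second = Fraction(sum(t * t for t in totals), len(totals))
    return mean, second - mean ** 2


def formula_variance(w):
    """Closed form [1/4 + c] * sum(w_i^2) - c * (sum w_i)^2 with c = 1/(4(2n-1))."""
    c = Fraction(1, 4 * (len(w) - 1))  # len(w) equals 2n
    return (Fraction(1, 4) + c) * sum(x * x for x in w) - c * sum(w) ** 2
```

With four positive and two negative genes, `gamete_moments([1, 1, 1, -1, -1, 1])` gives mean 1 and variance 8/5, in agreement with `formula_variance`.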

**MEANS, VARIANCES, AND COVARIANCES**

We now undertake computation of the means, variances, and covariances of the trait values within a pedigree. Assuming each person has $2n$ genes, let $X_i$ be the trait value of pedigree member $i$. When $i$ is a pedigree founder, then $E(X_i) = 0$ by virtue of binomial sampling. When $i$ has parents $k$ and $l$ in the pedigree, then we can decompose $X_i$ into a gamete contribution from $k$ plus a gamete contribution from $l$. In symbols, $X_i = Y_{k \to i} + Y_{l \to i}$. If we assume inductively that the means of the parental values $X_k$ and $X_l$ vanish, then

$$E(X_i) = E[E(Y_{k \to i} \mid X_k)] + E[E(Y_{l \to i} \mid X_l)] = E(\tfrac{1}{2}X_k) + E(\tfrac{1}{2}X_l) = 0.$$

Obvious iteration of this argument shows that all trait means within the pedigree vanish. The equality $E(X_i) = 0$ also holds under the polygenic model, the finite polygenic model, and the sampling with replacement variant of the hypergeometric polygenic model.

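We next turn to the calculation of variances and covariances. If $i$ is a founder, then $\operatorname{Var}(X_i) = 2n$. If $i$ has parents $k$ and $l$, then in view of (4)

$$\operatorname{Var}(X_i) = E[\operatorname{Var}(Y_{k \to i} \mid X_k)] + E[\operatorname{Var}(Y_{l \to i} \mid X_l)] + \frac{1}{4}\operatorname{Var}(X_k) + \frac{1}{4}\operatorname{Var}(X_l) + \frac{1}{2}\operatorname{Cov}(X_k, X_l)$$

$$= \left[\frac{1}{2} + \frac{1}{2(2n-1)}\right]2n + \left[\frac{1}{4} - \frac{1}{4(2n-1)}\right]\left[\operatorname{Var}(X_k) + \operatorname{Var}(X_l)\right] + \frac{1}{2}\operatorname{Cov}(X_k, X_l). \qquad (5)$$

When $\operatorname{Var}(X_k) = \operatorname{Var}(X_l) = 2n$ and $\operatorname{Cov}(X_k, X_l) = 0$, this reduces to

$$\operatorname{Var}(X_i) = \left[\frac{1}{2} + \frac{1}{2(2n-1)}\right]2n + \frac{1}{2}\left(1 - \frac{1}{2n-1}\right)2n = 2n.$$

If the parents $k$ and $l$ of $i$ are unrelated, then $X_k$ and $X_l$ are independent and $\operatorname{Cov}(X_k, X_l) = 0$ holds.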
Next let us address covariances. If $i$ is a founder and $j$ is not a descendant of $i$, then just as with unrelated parents the trait pairs $X_i$ and $X_j$ are uncorrelated. If $i$ has parents $k$ and $l$ and $j$ is not a descendant of $i$, then standard calculations show that

$$\operatorname{Cov}(X_i, X_j) = \frac{1}{2}\operatorname{Cov}(X_k, X_j) + \frac{1}{2}\operatorname{Cov}(X_l, X_j), \qquad (6)$$

based on the conditional independence of $X_i$ and $X_j$ given the parental values $X_k$ and $X_l$. This same recurrence relation holds under the polygenic model, the finite polygenic model, and the sampling with replacement variant of the hypergeometric polygenic model.

To specify the trait covariance matrix $\Omega_n = (\omega_{nij})$ for a pedigree of $q$ people, we begin by numbering the people $1, \ldots, q$ in such a way that parents precede their children. Then mimicking the usual recursive procedure for computing a kinship matrix (LANGE *et al.* 1976b), we fill in $\Omega_n$ starting in its upper left corner with person 1. The initial conditions for founders and the recurrences (5) and (6) inductively enlarge the upper left block of $\Omega_n$ by adding the partial row and partial column corresponding to the current person $i$. The pedigree numbering scheme insures that all previously visited people $j$ are not descendants of $i$.

uniformly in a noninbred pedigree. In the presence of inbreeding, $\omega_{nii} > 2n$ can occur. However, we always have the bounds $0 \le \omega_{nii} \le (2 + q)n$ on the entries of the $q \times q$ matrix $\Omega_n$. Verification is left to the reader.

For a noninbred pedigree, the hypergeometric polygenic model not only correctly captures means, variances, and covariances, but it also implies that all people share the marginal trait distribution characterizing founders. This fact is more or less obvious because all genes contributing to any person $i$ are unique and equally likely to derive from positive or negative genes drawn from the ancestral pool sampled in creating founders. Skeptical readers can verify analytically that the number of positive genes possessed by $i$ follows the binomial distribution (1) by checking the reproductive property for gene transmission under sampling without replacement.
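The reproductive property can also be checked numerically for small $n$. The sketch below assumes two unrelated parents whose genotypes independently follow the binomial distribution (1), forms each gamete by hypergeometric sampling, and pools the two gametes; the function names are hypothetical, and exact fractions are used so that equality can be tested directly.

```python
from fractions import Fraction
from math import comb


def binom_half(g, n):
    """Binomial(2n, 1/2) probability of g positive genes."""
    return Fraction(comb(2 * n, g), 4 ** n)


def gamete_prob(i, j, n):
    """Hypergeometric probability of j positive genes in a gamete from genotype i."""
    if j < 0 or j > n:
        return Fraction(0)
    return Fraction(comb(i, j) * comb(2 * n - i, n - j), comb(2 * n, n))


def child_distribution(n):
    """Genotype distribution of a child of two unrelated binomial parents."""
    # Marginal law of one gamete: average the hypergeometric law over the parent
    gamete = [sum(binom_half(i, n) * gamete_prob(i, j, n) for i in range(2 * n + 1))
              for j in range(n + 1)]
    # Pool two independent gametes (a discrete convolution)
    return [sum(gamete[m] * gamete[k - m]
                for m in range(max(0, k - n), min(n, k) + 1))
            for k in range(2 * n + 1)]
```

For each small $n$ tried, `child_distribution(n)` coincides exactly with the founder distribution, which is the noninbred case of the claim above.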

When properly standardized, the marginal trait distribution (1) shared by noninbred people quickly approaches univariate normality. Not only does the marginal distribution behave well, the joint distribution of trait values within a pedigree follows an approximate multivariate normal distribution. Because our treatment of this central limit theorem is necessarily lengthy, we defer a precise statement and proof to the APPENDIX.

It is interesting to contrast these encouraging results with the results under sampling with replacement. As noted above, the mean condition $E(X_i) = 0$ and covariance recurrence (6) continue to hold. However, the variance recurrence now becomes

$$\operatorname{Var}(X_i) = 2n + \frac{1}{4}\left(1 - \frac{1}{n}\right)\left[\operatorname{Var}(X_k) + \operatorname{Var}(X_l)\right] + \frac{1}{2}\operatorname{Cov}(X_k, X_l). \qquad (7)$$

This recurrence can be verified by noting that the gamete contribution $Y_{k \to i}$ can be expressed as $2G_{k \to i} - n$, where $G_{k \to i}$ is the number of positive genes sampled from $k$. It is clear that given $X_k$, the random variable $G_{k \to i}$ is binomially distributed with success probability $\frac{1}{2} + \frac{X_k}{4n}$ over $n$ trials. Hence,

$$E[\operatorname{Var}(Y_{k \to i} \mid X_k)] = E\left(n - \frac{X_k^2}{4n}\right) = n - \frac{1}{4n}\operatorname{Var}(X_k).$$

Substituting this expression and the corresponding expression for the quantity $E[\operatorname{Var}(Y_{l \to i} \mid X_l)]$ into the expansion (5) yields the recurrence (7).

The recurrence relation (7) does not lead to stability of trait variances in a noninbred pedigree. Let $c_m$ be the trait variance of a person $m$ generations removed from all pedigree founders. Then $c_m$ satisfies the recurrence

$$c_m = 2n + \frac{1}{2}\left(1 - \frac{1}{n}\right)c_{m-1}$$

with solution

$$c_m = \frac{4n}{1 + \frac{1}{n}} + \left[\frac{1}{2}\left(1 - \frac{1}{n}\right)\right]^m\left(2n - \frac{4n}{1 + \frac{1}{n}}\right).$$

The limiting value $4n/(1 + 1/n)$ imposes an upper bound on the trait variances in a noninbred pedigree.
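A short numerical sketch makes the drift in variance under sampling with replacement concrete. It iterates the recurrence $c_m = 2n + \frac{1}{2}(1 - 1/n)c_{m-1}$ from the founder value $c_0 = 2n$ and compares the result with the closed-form solution; the function names are hypothetical.

```python
def variance_with_replacement(n, m):
    """Iterate c_m = 2n + (1/2)(1 - 1/n) c_{m-1}, starting from c_0 = 2n."""
    c = 2.0 * n
    for _ in range(m):
        c = 2.0 * n + 0.5 * (1.0 - 1.0 / n) * c
    return c


def closed_form(n, m):
    """Closed-form solution: limit + ratio**m * (c_0 - limit)."""
    limit = 4.0 * n / (1.0 + 1.0 / n)
    ratio = 0.5 * (1.0 - 1.0 / n)
    return limit + ratio ** m * (2.0 * n - limit)
```

With $2n = 20$ polygenes, for instance, the variance climbs from the founder value of 20 toward the limit $40/1.1 \approx 36.4$ rather than remaining at 20.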

**VARIANCES AND COVARIANCES FOR INBRED PEDIGREES**

When inbreeding occurs under the hypergeometric polygenic model, trait variances can diverge from those predicted under the finite polygenic model. According to known arguments involving kinship coefficients (JACQUARD 1974), the correct recurrence relation for variances under the finite polygenic model is

$$\operatorname{Var}(X_i) = 2n + \frac{1}{2}\operatorname{Cov}(X_k, X_l). \qquad (8)$$

The two recurrences (5) and (8) differ whenever either $\operatorname{Var}(X_k) > 2n$ or $\operatorname{Var}(X_l) > 2n$. For example, the hypergeometric polygenic model implies a slightly higher variance for the child of an inbred parent than does the finite polygenic model. In general, one can prove by induction that the variances and covariances of the hypergeometric polygenic model always dominate their counterparts in the finite polygenic model.

To gain some feel for the differences involved between the two models, it is instructive to consider a simple example. For convenience, let us first pass to standardized random variables $(1/\sqrt{2n})X_i$ with unit variances for founders. In the limit as $n \to \infty$, the matrix $\Omega = (\omega_{ij})$ of standardized variances and covariances satisfies under sampling without replacement the initial conditions

$$\omega_{ii} = 1, \qquad \omega_{ij} = 0 \qquad (9)$$

for a founder $i$ and his nondescendant $j$ and the recurrence relations

$$\omega_{ii} = \frac{1}{2} + \frac{1}{4}\omega_{kk} + \frac{1}{4}\omega_{ll} + \frac{1}{2}\omega_{kl}, \qquad \omega_{ij} = \frac{1}{2}\omega_{kj} + \frac{1}{2}\omega_{lj} \qquad (10)$$

for a person $i$ with parents $k$ and $l$ and a nondescendant $j$.

**FIGURE 1.** An inbred pedigree.

The same initial conditions and recurrences hold under the polygenic model except for substitution of the variance recurrence $\omega_{ii} = 1 + \frac{1}{2}\omega_{kl}$. Now consider the eight-member pedigree depicted in Figure 1. The covariance matrix calculated under the polygenic initial conditions and recurrences is

[The $8 \times 8$ covariance matrix for the pedigree of Figure 1 is not legible in this copy.]

We derive exactly the same matrix $\Omega = (\omega_{ij})$ from the initial conditions (9) and recurrences (10) except for the single slightly inflated entry $\omega_{88} = 17/16$.
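The recursive construction of the standardized covariance matrix is easy to sketch in code. The pedigree used below is a hypothetical seven-member example chosen for illustration, not the pedigree of Figure 1: persons 0 and 1 are founders, 2 and 3 are their children, 4 is the child of the sib mating 2 × 3, 5 is an unrelated founder, and 6 is the child of 4 and 5. The function name and the `parents` encoding are likewise assumptions of this sketch.

```python
from fractions import Fraction


def covariance_matrix(parents, hypergeometric=True):
    """Standardized covariance matrix in the limit n -> infinity.

    parents[i] is None for a founder or a pair (k, l) of earlier indices,
    so parents always precede their children.
    """
    q = len(parents)
    w = [[Fraction(0)] * q for _ in range(q)]
    half, quarter = Fraction(1, 2), Fraction(1, 4)
    for i, pair in enumerate(parents):
        if pair is None:
            w[i][i] = Fraction(1)  # founder: unit variance, zero covariances
            continue
        k, l = pair
        for j in range(i):  # covariance recurrence with earlier people
            w[i][j] = w[j][i] = half * (w[k][j] + w[l][j])
        if hypergeometric:  # variance recurrence, sampling without replacement
            w[i][i] = half + quarter * (w[k][k] + w[l][l]) + half * w[k][l]
        else:               # variance recurrence of the polygenic model
            w[i][i] = 1 + half * w[k][l]
    return w
```

In this hypothetical pedigree the two models agree everywhere except for the variance of person 6, the child of the inbred person 4: the polygenic value is 1, while the hypergeometric value is 17/16.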

APPLICATION TO RISK PREDICTION

For a simple numerical application to the polygenic threshold model, consider the pedigree of Figure 2. In this pedigree, darkened individuals are afflicted by a hypothetical disease with a prevalence of 0.01 and a heritability of 0.5. We approximate the polygenic liability to disease of person $i$ in the pedigree by the sum

$$\frac{\sigma_a}{\sqrt{2n}}X_i + \sigma_e Y_i,$$

where $X_i$ is determined by the hypergeometric polygenic model with $2n$ polygenes, the $Y_i$ are independent, standard normal deviates, the additive genetic variance $\sigma_a^2 = 0.5$, and the random environmental variance $\sigma_e^2 = 0.5$. Given that each liability follows an approximate standard normal distribution, the liability threshold of 2.326 is determined by the prevalence condition

$$\frac{1}{\sqrt{2\pi}}\int_{2.326}^{\infty} e^{-z^2/2}\,dz = 0.01.$$

**FIGURE 2.** Risk prediction under the polygenic threshold model.
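The threshold quoted above can be recovered from the prevalence with the standard normal quantile function. The following minimal sketch uses Python's standard library; the function name `liability_threshold` is hypothetical.

```python
from statistics import NormalDist


def liability_threshold(prevalence):
    """Threshold t with upper tail probability Pr(Z > t) = prevalence
    for a standard normal liability Z."""
    return NormalDist().inv_cdf(1.0 - prevalence)
```

A prevalence of 0.01 gives a threshold of about 2.326, matching the value used in the text.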

Three potential children are represented by open diamonds (◊) in Figure 2. Table 1 gives the risks that these unborn children will be afflicted with the disease under the hypergeometric polygenic model. The recurrence risks recorded evidently stabilize at approximately 24%, 8%, and 5% as the number of polygenes $2n \to \infty$. Under the alternative hypothesis of an autosomal dominant mode of disease inheritance, these risks are 1/2, …, and 0, respectively. Evidently the calculated risks are strongly model dependent.

DISCUSSION

The stunning reduction in computational complexity in exchanging $2n + 1$ possible genotypes for $3^n$ possible genotypes in the finite polygenic model should pay rich dividends in genetic epidemiology. Quick computing and good biological modeling are inseparable. If a model does not permit accurate likelihood evaluation, then by and large it is untestable. As geneticists tackle common diseases, the necessity of alternatives to classical Mendelian inheritance becomes paramount. The mixed model is one well-posed alternative, even if rather naive.

The hypergeometric polygenic model developed here should be a good approximation to the polygenic model. Although strictly speaking the hypergeometric polygenic model is nongenetic, there is no compelling theoretical evidence to suggest that it approximates the polygenic model less well than the finite polygenic model does. It is true that the hypergeometric polygenic model gives slightly inflated variances and covariances for inbred pedigrees, but the majority of applications involve noninbred data. Furthermore, likelihood evaluation is substantially more demanding for inbred pedigrees than for noninbred pedigrees. Of course, it would be helpful to know the rate of convergence to multivariate normality of each model, but this would require a more refined analysis than the one undertaken in the APPENDIX.
The finite polygenic model neatly sidesteps some of the computational problems associated with the polygenic model. For a large pedigree, naive likelihood computation under the polygenic model is easily frustrated by the task of inverting the trait covariance matrix (LANGE *et al.* 1976b). Fortunately, ELSTON and coworkers (ELSTON and STEWART 1971; ELSTON *et al.* 1992) have devised likelihood algorithms that avoid matrix inversion for the polygenic model under some forms of shared environment and no dominance component. Under the polygenic threshold model, even graver problems arise in evaluating complicated multivariate normal distribution functions (MENDELL and ELSTON 1974; LANGE *et al.* 1976a; RICE *et al.* 1979). Finally, exact likelihood evaluation under the mixed model is virtually impossible except for small pedigrees. The hypergeometric polygenic model dramatically improves on the advantageous behavior of the finite polygenic approximation to the polygenic threshold and mixed models.

**TABLE 1**

**Recurrence risks for the unborn children in Figure 2**

| Polygenes $2n$ | Child 8 | Child 11 | Child 12 |
|---|---|---|---|
| 10 | 0.189 | 0.058 | 0.045 |
| 20 | 0.221 | 0.073 | 0.051 |
| 30 | 0.231 | 0.079 | 0.053 |
| 40 | 0.235 | 0.082 | 0.053 |
| 50 | 0.237 | 0.083 | 0.054 |

Besides the obvious computational advantages for deterministic likelihood evaluation, the hypergeometric polygenic model lends itself well to Markov chain Monte Carlo methods (LANGE and MATTHYSSE 1989; LANGE and SOBEL 1991; THOMPSON and GUO 1991; SOBEL and LANGE 1993; THOMPSON 1994). One can easily construct a relevant Markov chain for a pedigree. A state of the chain specifies the number of positive genes carried by each person and the number of positive genes transmitted by each gamete. A transition between two states occurs by random resampling of founder genotypes or by random resampling of gametes. Resampling via the binomial distribution (1) and the hypergeometric distribution (2) are natural in this context, but other proposal distributions such as the uniform also offer interesting possibilities. Radical rearrangements can be achieved by taking a random number of transitions per step of the chain (SOBEL and LANGE 1993). Imposing the usual Metropolis mechanism for accepting proposed steps guarantees that the chain has an equilibrium distribution consistent with the conditional distribution of states given observed phenotypes. One can even include this chain as part of more comprehensive Markov chains involving the mixed model and linked markers.
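The two resampling moves mentioned above rest on drawing founder genotypes binomially and gametes hypergeometrically. The sketch below is not a Metropolis sampler; it only illustrates those two draws in a forward simulation of genotypes through a pedigree, which could serve, for example, to generate a starting state. The function name and the `parents` encoding (None for a founder, a pair of earlier indices otherwise) are assumptions of this sketch.

```python
import random


def simulate_pedigree(parents, n, rng):
    """Return the number of positive genes carried by each person."""
    genotypes = []
    for pair in parents:
        if pair is None:
            # Founder: 2n genes from an infinite pool of equally likely signs
            g = sum(rng.random() < 0.5 for _ in range(2 * n))
        else:
            g = 0
            for p in pair:
                # Lay out the parent's genes, 1 = positive and 0 = negative
                genes = [1] * genotypes[p] + [0] * (2 * n - genotypes[p])
                # Gamete: n genes sampled without replacement
                g += sum(rng.sample(genes, n))
        genotypes.append(g)
    return genotypes
```

Calling `simulate_pedigree([None, None, (0, 1)], 5, random.Random(0))` returns three genotype counts between 0 and 10; the corresponding trait values are $2(g - n)$.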

Over time, the hypergeometric polygenic model promises to become a standard technique in the repertoire of genetic epidemiologists. Before this happens, good software needs to be developed and tested. ELSTON, FERNANDO, and STRICKER (FERNANDO *et al.* 1994; STRICKER *et al.* 1995) have made an excellent start. In spite of continuing dramatic gains in computer hardware, good algorithms are as relevant as ever.

I thank MICHAEL BOEHNKE, SUN-WEI GUO, and STEVEN MATTHYSSE for suggesting various improvements to the first draft of this manuscript. This research was supported in part by U.S. Public Health Service grant GM-53275.

LITERATURE CITED

BICKEL, P. J., and K. A. DOKSUM, 1977 *Mathematical Statistics: Basic Ideas and Selected Topics.* Holden-Day, Oakland, CA.

BILLINGSLEY, P., 1986 *Probability and Measure.* Wiley, New York.

CANNINGS, C., E. A. THOMPSON and M. H. SKOLNICK, 1978 Probability functions on complex pedigrees. Adv. Appl. Prob. **10:** 26-61.

COCHRAN, W. G., 1977 *Sampling Techniques,* Ed. 3. Wiley, New York.

ELSTON, R. C., and J. STEWART, 1971 A general model for the genetic analysis of pedigree data. Hum. Hered. **21:** 523-542.

ELSTON, R. C., V. T. GEORGE, and F. SEVERTSON, 1992 The Elston-Stewart algorithm for continuous genotypes and environmental factors. Hum. Hered. **42:** 16-27.

FERNANDO, R. L., C. STRICKER, and R. C. ELSTON, 1994 The finite polygenic mixed model: an alternative formulation for the mixed model of inheritance. Theoret. Appl. Genet. **88:** 573-580.

FISHER, R. A., 1918 The correlation between relatives on the supposition of Mendelian inheritance. Trans. Roy. Soc. Edinb. **52:** 399-433.

HASSTEDT, S. J., 1982 A mixed model approximation for large pedigrees. Comput. Biomed. Res. **15:** 295-307.

HASSTEDT, S. J., 1991 A variance components/major locus likelihood approximation on quantitative data. Genet. Epidemiol. **8:** 113-125.

JACQUARD, A., 1974 *The Genetic Structure of Populations.* Springer, New York.

LANGE, K., 1978 Central limit theorems for pedigrees. J. Math. Biol. **6:** 59-66.

LANGE, K., and M. BOEHNKE, 1983 Extensions to pedigree analysis. IV. Covariance components models for multivariate traits. Am. J. Med. Genet. **14:** 513-524.

LANGE, K., and S. MATTHYSSE, 1989 Simulation of pedigree genotypes by random walks. Am. J. Hum. Genet. **45:** 959-970.

LANGE, K., and E. SOBEL, 1991 A random walk method for computing genetic location scores. Am. J. Hum. Genet. **49:** 1320-1334.

LANGE, K., J. WESTLAKE and M. A. SPENCE, 1976a Extensions to pedigree analysis. II. Recurrence risk calculation under the polygenic threshold model. Hum. Hered. **26:** 337-348.

LANGE, K., J. WESTLAKE and M. A. SPENCE, 1976b Extensions to pedigree analysis. III. Variance components by the scoring method. Ann. Hum. Genet. **39:** 485-491.

MENDELL, N., and R. C. ELSTON, 1974 Multifactorial qualitative traits: genetic analysis and prediction of recurrence risks. Biometrics **30:** 41-57.

MORTON, N. E., and C. J. MACLEAN, 1974 Analysis of family resemblance. III. Complex segregation analysis of quantitative traits. Am. J. Hum. Genet. **26:** 489-503.

RÉNYI, A., 1970 *Probability Theory.* North-Holland, Amsterdam.

RICE, J., T. REICH and C. R. CLONINGER, 1979 An approximation to the multivariate normal integral: its application to multifactorial qualitative traits. Biometrics **35:** 451-459.

SOBEL, E., and K. LANGE, 1993 Metropolis sampling in pedigree analysis. Stat. Methods Med. Res. **2:** 263-282.

STRICKER, C., R. L. FERNANDO and R. C. ELSTON, 1995 Linkage analysis with an alternative formulation for the mixed model of inheritance: the finite polygenic mixed model. Genetics **141:** 1651-1656.

THOMPSON, E. A., 1994 Monte Carlo likelihood in genetic mapping. Stat. Sci. **9:** 355-366.

THOMPSON, E. A., and S. W. GUO, 1991 Evaluation of likelihood ratios for complex genetic models. I. M. A. J. Math. Appl. Biol. Med. **8:** 149-169.


**APPENDIX**

**MULTIVARIATE NORMALITY**

To demonstrate multivariate normality, let $X_n = (X_{n1}, \ldots, X_{nq})$ denote the trait vector for a pedigree with $q$ people. We now show that the standardized random vectors $(1/\sqrt{2n})X_n$ tend in law to a multivariate normal distribution with mean vector $\mathbf{0}$ and covariance matrix $\Omega = (\omega_{ij})$ defined by the initial conditions (9) and recurrences (10). According to the well-known Cramér-Wold device (BILLINGSLEY 1986), it suffices to prove for every vector $v$ that $(1/\sqrt{2n})v^tX_n$ tends in law to a univariate normal distribution with mean 0 and variance $v^t\Omega v$.

We argue by induction on the number of people $q$. The claim is certainly true for $q = 1$ owing to the classical central limit theorem for independent, identically distributed random variables. Now suppose we add person $q$ to an existing pedigree of $q - 1$ people, assuming as before that parents always precede their children in the pedigree. If $q$ is a founder of the enlarged pedigree, then consider the decomposition

$$\frac{1}{\sqrt{2n}}v^tX_n = \frac{1}{\sqrt{2n}}\sum_{i=1}^{q-1} v_iX_{ni} + \frac{1}{\sqrt{2n}}v_qX_{nq}.$$

The terms $(1/\sqrt{2n})\sum_{i=1}^{q-1} v_iX_{ni}$ and $(1/\sqrt{2n})v_qX_{nq}$ are independent and by the inductive hypothesis separately converge to univariate normal distributions with means 0 and variances $\sum_{i=1}^{q-1}\sum_{j=1}^{q-1} v_i\omega_{ij}v_j$ and $v_q\omega_{qq}v_q$, respectively. But the sum of these two quadratic forms is the correct quadratic form associated with the block diagonal matrix $\Omega$ under independence. Since convergence in law preserves convolution of distribution functions and the convolution of normal distributions is normal, multivariate normality follows in this special case.

If $q$ is the child of the existing members $k$ and $l$ of the pedigree, then consider the decomposition

$$\frac{1}{\sqrt{2n}}v^tX_n = \frac{1}{\sqrt{2n}}\sum_{i=1}^{q-1} w_iX_{ni} + \frac{v_q}{\sqrt{2n}}\left[Y_{n,k \to q} - \frac{X_{nk}}{2}\right] + \frac{v_q}{\sqrt{2n}}\left[Y_{n,l \to q} - \frac{X_{nl}}{2}\right], \qquad (11)$$

where $w_i = v_i$ when $i \ne k$ or $l$, $w_k = v_k + \frac{1}{2}v_q$, and $w_l = v_l + \frac{1}{2}v_q$. As usual we decompose $X_{nq}$ as $X_{nq} = Y_{n,k \to q} + Y_{n,l \to q}$, where $Y_{n,k \to q}$ and $Y_{n,l \to q}$ are the gamete contributions of $k$ and $l$ to $q$, respectively.

The centered gamete contributions $Y_{n,k \to q} - (X_{nk}/2)$ and $Y_{n,l \to q} - (X_{nl}/2)$ are almost independent of $(X_{n1}, \ldots, X_{n,q-1})$. We can achieve independence by defining $Z_{n,k \to q}$ and $Z_{n,l \to q}$ to be gamete contributions sampled independently without replacement from two separate gene pools containing exactly $n$ positive genes and $n$ negative genes each. With this in mind, we rewrite the decomposition (11) as

$$\frac{1}{\sqrt{2n}}v^tX_n = \frac{1}{\sqrt{2n}}\sum_{i=1}^{q-1} w_iX_{ni} + \frac{v_q}{\sqrt{2n}}Z_{n,k \to q} + \frac{v_q}{\sqrt{2n}}Z_{n,l \to q} + \frac{v_q}{\sqrt{2n}}\left[Y_{n,k \to q} - \frac{X_{nk}}{2} - Z_{n,k \to q}\right] + \frac{v_q}{\sqrt{2n}}\left[Y_{n,l \to q} - \frac{X_{nl}}{2} - Z_{n,l \to q}\right].$$

Our strategy now is to couple $Y_{n,k \to q} - (X_{nk}/2)$ and $Z_{n,k \to q}$ so that

$$\frac{1}{\sqrt{2n}}D_{n,k \to q} = \frac{1}{\sqrt{2n}}\left[Y_{n,k \to q} - \frac{X_{nk}}{2} - Z_{n,k \to q}\right]$$

tends in probability to 0. Likewise coupling $Y_{n,l \to q} - (X_{nl}/2)$ and $Z_{n,l \to q}$, we can invoke Slutsky's theorem (BICKEL and DOKSUM 1977) and deduce that $(1/\sqrt{2n})v^tX_n$ has in the limit a distribution that is the convolution of the limit distributions of the three independent terms $(1/\sqrt{2n})\sum_{i=1}^{q-1} w_iX_{ni}$, $(v_q/\sqrt{2n})Z_{n,k \to q}$, and $(v_q/\sqrt{2n})Z_{n,l \to q}$. By the induction hypothesis, the sum $(1/\sqrt{2n})\sum_{i=1}^{q-1} w_iX_{ni}$ tends in law to a univariate normal distribution with mean 0 and variance

$$\sum_{i=1}^{q-1}\sum_{j=1}^{q-1} w_i\omega_{ij}w_j = \sum_{i=1}^{q-1}\sum_{j=1}^{q-1} v_i\omega_{ij}v_j + v_q\sum_{j=1}^{q-1}\omega_{kj}v_j + v_q\sum_{j=1}^{q-1}\omega_{lj}v_j + v_q^2\left(\frac{1}{4}\omega_{kk} + \frac{1}{4}\omega_{ll} + \frac{1}{2}\omega_{kl}\right).$$

According to Bernstein's central limit theorem for the hypergeometric distribution (RÉNYI 1970), $(1/\sqrt{2n})Z_{n,k \to q}$ and $(1/\sqrt{2n})Z_{n,l \to q}$ converge in law to normal distributions with means 0 and variances 1/4. Hence, the sum

$$\frac{1}{\sqrt{2n}}\sum_{i=1}^{q-1} w_iX_{ni} + \frac{v_q}{\sqrt{2n}}Z_{n,k \to q} + \frac{v_q}{\sqrt{2n}}Z_{n,l \to q}$$

converges in law to a univariate normal distribution with mean 0 and variance

$$\sum_{i=1}^{q-1}\sum_{j=1}^{q-1} v_i\omega_{ij}v_j + v_q\sum_{j=1}^{q-1}\omega_{kj}v_j + v_q\sum_{j=1}^{q-1}\omega_{lj}v_j + v_q^2\left(\frac{1}{2} + \frac{1}{4}\omega_{kk} + \frac{1}{4}\omega_{ll} + \frac{1}{2}\omega_{kl}\right).$$

In view of the variance and covariance recurrences (10) applied to $i = q$, the total variance reduces to the quadratic form $v^t\Omega v = \sum_{i=1}^{q}\sum_{j=1}^{q} v_i\omega_{ij}v_j$.

This completes the proof except for coupling $Y_{n,k \to q} - (X_{nk}/2)$ and $Z_{n,k \to q}$. By definition, person $k$ has $n + \frac{1}{2}X_{nk}$ positive genes and $n - \frac{1}{2}X_{nk}$ negative genes. If we imagine ordering these $2n$ genes so that the positive genes come before the negative genes, then we can write

$$Y_{n,k \to q} = \sum_{i=1}^{2n} A_iW_i,$$

where $W_i = 1$ for $1 \le i \le n + \frac{1}{2}X_{nk}$, $W_i = -1$ for $n + \frac{1}{2}X_{nk} < i \le 2n$, and $A_i$ is a random variable indicating whether the $i$th gene is transmitted to $q$ or not. As noted earlier, we have $E(A_i) = 1/2$, $\operatorname{Var}(A_i) = 1/4$, and $\operatorname{Cov}(A_i, A_j) = -1/[4(2n-1)]$. The random variable $Z_{n,k \to q}$ can be represented as

$$Z_{n,k \to q} = \sum_{i=1}^{2n} A_iU_i,$$

where $U_i = 1$ if $1 \le i \le n$ and $U_i = -1$ if $n + 1 \le i \le 2n$. By construction,

$$\sum_{i=1}^{2n}(W_i - U_i) = X_{nk}, \qquad \sum_{i=1}^{2n}(W_i - U_i)^2 = 2|X_{nk}|.$$
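The two identities stated by construction are easy to confirm directly. The sketch below builds the ordered gene vector $W$ of a parent with trait value $x$ and the balanced reference vector $U$; the function name `coupling_vectors` is hypothetical, and $x$ is assumed to be an even integer with $|x| \le 2n$.

```python
def coupling_vectors(n, x):
    """Gene values W of a parent with trait value x (positive genes first)
    and the balanced pool U with n positive and n negative genes."""
    g = n + x // 2  # number of positive genes; exact because x is even
    w = [1] * g + [-1] * (2 * n - g)
    u = [1] * n + [-1] * n
    return w, u
```

For every admissible $x$, the differences $W_i - U_i$ are nonzero in exactly $|x|/2$ positions, where they equal $\pm 2$, so they sum to $x$ and their squares sum to $2|x|$.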

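Conditional on $X_{nk}$, the random difference $D_{n,k \to q} = Y_{n,k \to q} - (X_{nk}/2) - Z_{n,k \to q}$ has mean

$$E(D_{n,k \to q} \mid X_{nk}) = \frac{1}{2}\sum_{i=1}^{2n}(W_i - U_i) - \frac{1}{2}X_{nk} = 0$$

and variance

$$\operatorname{Var}(D_{n,k \to q} \mid X_{nk}) = \frac{1}{4}\sum_{i=1}^{2n}(W_i - U_i)^2 - \frac{1}{4(2n-1)}\sum_{i \ne j}(W_i - U_i)(W_j - U_j) = \left[\frac{1}{4} + \frac{1}{4(2n-1)}\right]2|X_{nk}| - \frac{1}{4(2n-1)}X_{nk}^2.$$

Hence, the unconditional mean and variance of $(1/\sqrt{2n})D_{n,k \to q}$ are 0 and

$$\operatorname{Var}\left(\frac{1}{\sqrt{2n}}D_{n,k \to q}\right) = \left[\frac{1}{4} + \frac{1}{4(2n-1)}\right]\frac{1}{n}E(|X_{nk}|) - \frac{1}{4(2n-1)}\cdot\frac{1}{2n}E(X_{nk}^2). \qquad (12)$$

Because Cauchy's inequality implies

$$\frac{1}{n}E(|X_{nk}|) \le \frac{1}{n}\sqrt{E(X_{nk}^2)}$$

and

$$\frac{1}{2n}E(X_{nk}^2) = \frac{1}{2n}\operatorname{Var}(X_{nk})$$

is bounded, it follows from (12) that

$$\lim_{n \to \infty}\operatorname{Var}\left(\frac{1}{\sqrt{2n}}D_{n,k \to q}\right) = 0.$$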

Chebychev's inequality finally gives the bound