A Model Selection-Based Interval-Mapping Method for Autopolyploids

(1)

©

DOI: 10.1534/genetics.104.035410

A Model Selection-Based Interval-Mapping Method for Autopolyploids

Dachuang Cao,* Bruce A. Craig* and R. W. Doerge*

,†,1

*Department of Statistics, Purdue University, West Lafayette, Indiana 47907 and†_{Department of Agronomy,} Purdue University, West Lafayette, Indiana 47907

Manuscript received August 24, 2004 Accepted for publication January 14, 2005

ABSTRACT

While extensive progress has been made in quantitative trait locus (QTL) mapping for diploid species, similar progress in QTL mapping for polyploids has been limited due to the complex genetic architecture of polyploids. To date, QTL mapping in polyploids has focused mainly on tetraploids with dominant and/or codominant markers. Here, we extend this view to include any even ploidy level under a dominant marker system. Our approach first selects the most likely chromosomal marker configurations using a Bayesian selection criterion and then fits an interval-mapping model to each candidate. Profiles of the likelihood-ratio test statistic and the maximum-likelihood estimates (MLEs) of parameters including QTL effects are obtained via the EM algorithm. Putative QTL are then detected using a resampling-based significance threshold, and the corresponding parental configuration is identified to be the underlying parental configuration from which the data are observed. Although presented via pseudo-doubled back-cross experiments, this approach can be readily extended to other breeding systems. Our method is ap-plied to single-dose restriction fragment autotetraploid alfalfa data, and the performance is investigated through simulation studies.

Q

UANTITATIVE trait loci (QTL) mapping detects polyploidrepresents species with homology in between

and identifies the regions of a genome associated allopolyploids and autopolyploids (Suzukiet al.1989). with the variation of a quantitative trait of interest. Mo- For allopolyploids (or disomic polyploids), such as bread lecular markers have been used extensively to construct wheat and potato, meiosis is usually restricted to the genetic maps for diploid species (Koornneefet al.1983; pairing of ancestral parental homologs. Thus, the

genet-Dietrichet al.1996) and act as the foundation for QTL ics of allopolyploids are similar to those of diploids, and

analyses such as interval mapping (Lander and Bot- diploid QTL-mapping methods can be carried out by

stein1989;Knappet al.1990;HaleyandKnott1992), treating each of the two sets of homologous chromo-composite interval mapping (Zeng1993, 1994), and mul- somes from an ancestral genome as a diploid.

tiple-QTL mapping (Jansen1993). The statistical issues in- _{For autopolyploids (or polysomic polyploids), however,} volved in diploid QTL mapping are reviewed inDoerge _{the high ploidy level and homology create}

complica-et al.(1997),Broman(2001), andDoerge(2002). _{tions in interpreting the meiotic process. Unlike}

dip-Polyploids are organisms having more than two com- _{loids whose two homologs always pair during meiosis,} plete sets of chromosomes in a cell. Polyploidy is most _{autopolyploids may undergo either bivalent pairing} common in plants, but also found in some insects, am- _{(two chromosomes pair) or multivalent pairing (more}

phibians, and reptiles. Some 30–70% of today’s angio- _{than two chromosomes pair) (}_Muller_1916;_Newton

sperms and at least half of the flowering plants are _and _Darlington _1929; _Darlington _1931; _Sybenga

thought to be polyploid (SoltisandSoltis1999). Not _{1995). Furthermore, the manner in which the paired} only is polyploidy important in agriculture, but it is also _{chromosomes segregate during meiosis, especially for} an important evolutionary force (Soltis and Soltis _{multivalent pairing, varies in different species (}_Mather

2000;Wolfe2001;Osbornet al.2003;Soltiset al.2003). _1938; _Rieseberg_and_Doyle _1989;_Jackson_and Jack-To describe the current state of QTL mapping in poly- _son_{1996). Finally, the marker genotypes in} autopoly-ploids, they must first be classified according to their ho- _{ploids are not always identifiable (}_i.e._{, the banding} mology. A polyploid with genomes all derived from the _{patterns are not unique since not all the alleles can be} same species is called an autopolyploid. If the multiple _{differentiated by any one marker system). Even if we are} sets of chromosomes are derived from different species, _{able to identify the different alleles of one locus, the} the polyploid is called anallopolyploid. Asegmental

allo-number of copies (dosage) of each allele and the link-age phases (i.e., the arrangement of alleles on the infor-mative chromosomes) between loci remain unknown. 1_{Corresponding author:}_{Department of Statistics, Purdue University,}

There are two different approaches for dealing with

1399 Math Bldg., West Lafayette, IN 47907.

E-mail: [email protected] the multiple copies of one locus for autopolyploids.

(2)

With a dominant marker system, each locus is treated tal marker dosages, parental QTL configuration, and QTL effect. Initially, we estimate the parental marker as biallelic. For this case,Wuet al. (1992) estimated a

genetic map for autopolyploids with simplex markers, or dosages using progeny marker data to reduce the num-ber of potential parental configurations. On the basis single-dose restriction fragment (SDRF) markers, which

represent only one homolog and segregate 1:1 in the of each putative parental configuration, interval map-ping is performed to estimate QTL location and to cal-progeny. Ripol et al. (1999) extended the method of

Wuet al.(1992) to any dominant marker with an unob- culate maximum-likelihood estimates of QTL effect and

population variation that are obtained using the expec-servable dosage level. With codominant markers each

locus is treated as multiallelic (Luo et al. 2000, 2001; tation maximization (EM) algorithm (Dempster et al.

1977). We limit our approach to any even ploidy level

Hackett et al.2001). Each band observed on a gel is

viewed as a different allele for that locus. This approach with multiple-dose dominant markers since odd ploidy levels are often associated with high infertility. After is valid for some marker systems, such as simple sequence

repeats (SSRs) that do not require the use of DNA describing the methodology, our method is applied to

single-dose restriction fragment alfalfa data with an esti-digestion by restriction enzymes and whose motifs have

high variation in the number of repeat units. However, mated linkage map fromBrouwerandOsborn(1999). Several putative QTL have been identified as associated since the use of SSRs may be responsible for inaccurate

allele identification,RodzenandMay(2002) suggested with winter hardiness traits in bivalent tetraploid alfalfa and act as the basis of simulation studies that investigate scoring multiallelic SSR markers as individual dominant

markers, unless the markers’ underlying modes of in- the performance of our approach.

heritance are known, simply because different loci may have different inheritance patterns. Because the

avail-METHODS

ability of codominant markers for autopolyploids is

rela-tively rare, and dominant markers (e.g., SDRF markers) Our polyploid QTL-mapping methodology is devel-oped for organisms undergoing bivalent pairing during have the advantages of being rich in plants and easily

scored, the remainder of this article is based on the meiosis. Under the bivalent pairing assumption, the pairing of the chromosomes can range frompreferential

dominant marker system.

An additional advantage of using SDRF markers is that pairing(always pairing with the same homologous chro-mosome) torandom pairing(equally likely to pair with the only unknown factor is the linkage phase (coupling

or repulsion) between the SDRFs and this can be esti- any other homologous chromosome). The data analysis and simulation are based on random pairing.

mated by a goodness-of-fit test. In turn, the

maximum-likelihood estimate (MLE) of the recombination between Letndenote the number of progeny andkdenote the ploidy level. In what follows, for each locus, uppercase is SDRF markers can be calculated on the basis of the

estimated linkage phase. Using SDRFs, Brouwer and used to denote both the locus name and its dominant

allele, and lowercase represents the recessive allele (e.g.,

Osborn (1999) estimated the first composite genetic

map for tetraploid alfalfa. When considering multiple- M1andm1). The dosage of a locus refers to the number of copies of the dominant allele at that locus. For exam-dose markers,Ripolet al. (1999) first estimated marker

dosages and linkage phase and then constructed the ple, a locus dosage of two refers to two copies of the dominant allele at that locus. When a marker is ob-genetic map by computing the maximum-likelihood

estimate of recombination fraction on the basis of esti- served,at leastone copy of the dominant allele for that marker is present.

mated parental marker configurations.DoergeandCraig

(2000) extended polyploid QTL methodology by devel- Marker data (presence/absence) and the quantitative

trait data are measured on the F1progeny. These prog-oping an algorithmic model selection process for a

sin-gle-marker QTL analysis with dominant markers for eny are the result of a pseudo-doubled backcross

ex-periment (Figure 1), where an informative parent P1is autopolyploids with any even ploidy level. Recently,Luo

et al.(2004) addressed both bivalent and quadrivalent crossed with a noninformative parent P2. The dosage at

each locus in the parents is at least one for the informa-chromosomal pairing during meiosis by employing a

sta-tistical framework for the analysis of tetrasomic linkage. tive parent and zero for the noninformative parent. Furthermore, at most half of the chromosomes of the As a continuation of the work byDoergeandCraig

(2000), we propose an interval-mapping method by em- informative parent contain dominant alleles. Chromo-somes that contain dominant alleles in the informative ploying an available genetic map to increase the power

of detecting and estimating QTL locations within an parent are called informative chromosomes. The pseudo-doubled backcross design is equivalent to doubling the autopolyploid bivalent pairing framework. For

simplic-ity, and in keeping with this previous work, we base our half of the noninformative chromosomes of the infor-mative parent to produce the noninforinfor-mative parent work on a pseudo-doubled backcross experiment (

Grat-tapagliaandSederoff1994) and employ a model selec- and then crossing them to get F1. An example of a pseudo-doubled backcross experiment for a tetraploid (k⫽4) tion-based interval mapping to simultaneously estimate

(3)

Figure 2.—Possible parental marker configurations for a tetraploid with two marker loci (M1/m1andM2/m2) under a pseudo-doubled backcross design. (A) Double coupling. (B) Asymmetric coupling 12. (C) Asymmetric coupling 21. (D) Coupling. (E) Repulsion.

Figure1.—A pseudo-doubled backcross experiment for a _{all the dosages of QTL. The vector}_␪ _⫽ ₍_{␮ ⫽} ₍_␮0_, _␮1_, tetraploid with two loci. Each locus is assumed to be biallelic:

. . . ,␮k/2), ␴) is used to denote model parameters. At

M1/m1andM2/m2, whereM1andm1denote two alleles of the

any fixed putative position of the QTL the likelihood

first locus andM1is the dominant allele, and similarly forM2

andm2. For each locus, there are two copies of the dominant function of the trait is a mixture of normal distributions allele in the informative parent P1. P2is the noninformative (Equation 1) under the assumption of one QTL, doubled haploid produced by doubling a haploid of P1, which

contains only recessive alleles.

L(␪|x,c)⫽

兿

n

i⫽1

兺

k/2

j⫽0

P(dQ⫽ j|Ii,c)φ(yi;␮j,␴2)

䉭

⫽

兿

n

i⫽1

兺

k/2

j⫽0

␻j|iφ(yi;␮j,␴2), (1) In this case, the dosage for both markers is two in the

informative parent. A pseudo-doubled backcross

popu-lation can also be developed from diploids by choosing _where_P₍_d

Q⫽j|Ii,c) ⫽ ␻䉭 j|iis the probability of theith two homozygous parents carrying different alleles,

dou-bling each of them to produce tetraploids, crossing progeny having dosage_{ence status}_I jof the QTL given marker pres-i, andφ(yi;␮j,␴2) denotes the normal den-them to produce an F1, and then crossing the F1to the _{sity function valued at}_y

iwith mean␮jand variance␴2.

parent containing recessive alleles. _The_␻j

|i’s are a function of parental configurationc and

Interval mapping:Consider two flanking markers,M1 _{the recombination fractions between the QTL and}

andM2, and one QTL,Q, with marker genetic distance _markers.

or recombination fraction provided by an available ge- _{In the interval-mapping framework (}_Lander _and

netic map. Once the parental configurationcis known, _Botstein_{1989), the putative QTL is considered at} in-interval mapping can be applied to locate QTL. A paren- _{cremental positions in an interval defined by adjacent} tal configuration includes loci dosages and linkage _{markers, and a test statistic (commonly the} log-likeli-phase. If the dominant alleles of two loci are on the _{hood ratio) is evaluated. Let}_R

h(h ⫽1, 2) denote the

same chromosome, they are in coupling phase; other- _{true recombination fractions between marker}_M

h and wise, they are in repulsion phase. On the basis of this, _{QTL. At each evaluation position, let}_r

h(h⫽1, 2) denote the five respective configurations for a tetraploid with _{the recombination fractions between markers defining}

two markers under a pseudo-doubled backcross design _{the interval and the current QTL position. The}

log-are called double coupling (Figure 2A), asymmetric cou- _{likelihood-ratio test statistic is calculated to test the} hy-pling 12 (Figure 2B), asymmetric couhy-pling 21 (Figure 2C), _pothesis

coupling (Figure 2D), and repulsion (Figure 2E).

Simi-H0: Rh⫽0.5,h ⫽1, 2 larly, there are three possible QTL configurations: (1)

one dose of the QTL located on the first informative _H

a: Rh⫽rh,h⫽1, 2. chromosome, (2) one dose of the QTL on the second

The null hypothesis assumes that the QTL is present, informative chromosome, and (3) two doses of the QTL

but unlinked to both markers, while the alternative hy-on both informative chromosomes. In what follows, chy-on-

con-pothesis assumes that the QTL is present and linked to figurations A–E are used to denote a marker

configura-M1 and M2 with recombination fractions rh. The log-tion for a tetraploid as defined above. A marker

config-likelihood-ratio test statistic LRT is uration letter followed by a QTL configuration number

represents acompleteparental configuration, such as “A1,”

LRT⫽ ⫺2 ln L(␪ˆ0,R1⫽R2⫽0.5)

L(␪ˆa,R1⫽r1,R2⫽r2) , which represents a parental configuration with markers

in double coupling and one QTL on the first

informa-tive chromosome. where␪ˆ0and␪ˆaare MLEs under the null and alternative

For theith individual, the observed data arexi⫽(yi, hypotheses, respectively. If the test statistic is significant

Ii), where yi is the trait value, and Ii is the indicator then the corresponding interval position with the largest vector for marker presence (i.e.,Iih⫽1 if markerMhis LRT statistic is considered as the estimated QTL posi-present and 0 otherwise). Assume for each QTL dosage tion. A permutation test can be performed to estimate levelj(j⫽0, 1, . . . ,k/2) that the traityhas a normal the significance threshold for the test statistic (

(4)

Because the summation format of the likelihood func- form the candidate parental configuration set. The joint method, on the other hand, takes a similar approach, tion (1) causes difficulty in finding the MLE of␪, the

complete likelihood function, containing both observ- but information on all markers is considered jointly. In this case, parental marker configuration posterior able and unobservable data, is considered,

probabilities are calculated directly, and candidate

pa-L(␪|x)⫽

兿

n

i⫽1

兿

k/2

j⫽0

[␻j|iφ(yi;␮j,␴2)]zi j, (2) rental configurations are chosen using the same posterior probability rule as described for the marginal method. and the EM algorithm (Dempsteret al.1977) is employed Given the marginal marker presence/absence distri-to estimate␪. Here, the unobservable data {zi⫽(zi0, . . . , bution (Ripolet al.1999), the posterior probabilities for

zi,k/2)}ni⫽1, wherezij⫽1 if the QTL dosage for theith prog- parental marker dosages of each marker are calculated eny isjand 0 otherwise, are indicator vectors represent- using Bayes’ rule (Bayes1783). For one markerM, the

ing progeny QTL dosages. number of progeny withMabsent,n⭋, has a binomial

dis-To implement the EM algorithm, the E step and M tribution Bin(n,p_dM), wheredMdenotes the marker dos-step are iterated until a convergence criterion of the age ofMin the informative parent. Under the pseudo-log-likelihood (e.g., logLt⫹1⫺_log_Lt⬍₁₀⫺6_{) is satisfied.} _{double backcross design,}

In the E step, the missing datazi jare estimated on the

basis of the observable data, and, in the M step, the _p

d_M⫽

冢

k⫺dM

k/2

冣

冒

冢

k k/2

冣

MLE of␪are calculated using the estimatedzij. Aftert

iterations, estimated values ofzi jt,␮t, and␴tare used at

for random pairing. Thus, given the informative parent stept⫹1 such that thezijt⫹1are estimated in the E step by

marker dosagedM,

zt⫹1

ij ⫽P␪(dQ⫽j|xi,c,␮t,␴t)

P(N⭋⫽n⭋|n,dM) ⫽䉭 Bin(n⭋;n,dM)⫽

冢

n n⭋

冣

p

n_⭋

dM(1⫺pdM)

n⫺n_⭋_. ⫽ ␻j|iexp{⫺1⁄2((yi ⫺ ␮tj)/␴t)2}

兺

k/2

l⫽0␻l|iexp{⫺1⁄2((yi⫺ ␮tl)/␴t)2} .

Let ␲ denote the prior distribution of marker dosage Givenzijt⫹1, the MLE of␪⫽(␮,␴) is obtained (M step) dM. The posterior probability of each dosage level is by maximizing (2),

P(dM|n,n⭋)⫽

Bin(n⭋;n,pd_M)␲(dM)

兺

k/2

d⫽1Bin(n⭋;n,pd)␲(d) ␮t⫹1

j ⫽

兺

n i⫽1ztij⫹1yi

兺

n

i⫽1ztij⫹1

, _j_⫽_{0, 1, . . . ,}_k_/2;

⫽ p n_⭋

d_M(1⫺pd_M)n⫺n⭋␲(d_M)

兺

k/2 d⫽1p

n_⭋

d(1⫺pd)n⫺n⭋␲(d) . ␴t⫹1 ⫽

冪

1

n

兺

n

i⫽1

兺

k/2

j⫽0

zt⫹1

ij (yi⫺ ␮tj⫹1)2.

As an alternative method to the described marginal ap-proach, we use the joint marker information and genetic map to calculate the posterior probability of a parental Since the estimation procedure is based on a specific

parental configuration (model), the model space

ex-pands quickly as the marker number and/or ploidy _{marker configuration. Given markers} _M

1 and M2, all level increase. Enumerating each possible model for the _{progeny can be classified into four sets according to the} purpose of exploring which is most likely to produce _{presence status of the two markers. Let}_n

obs⫽(n00,n01,

the observed data and then estimating parameters on _n

10, n11) denote the observed frequency vector in the the basis of the most likely model(s) is computationally _{four sets, where the} _h_{th (}_h _⫽_{1, 2) subscript is 1 if}_M

h challenging. For example, under a pseudo-doubled back- _{is present and 0 otherwise. Assuming no segregation} cross design with two markers and one QTL, there are _{distortion and a parental configuration}_c_{, the} distribu-14 possible parental configurations for a tetraploid, 91 _{tion of}_n

obsfollows a multinomial distribution with proba-for a hexaploid (k⫽6), and 390 for an octaploid (k⫽ _{bility parameter vector} _pc⫽₍_pc

00,pc01,pc10,pc11). Table 1 8). To reduce these computations, we employ a model _lists _pc _{for a tetraploid. The probability of observing}

reduction step prior to parameter estimation. _n

obs⫽(n00,n01,n10,n11) is

Model reduction: To reduce the model space, one

can consider one of two approaches: either a marginal _P₍_n

obs|n,c) ⫽䉭 Multinomial(nobs;n,c)⫽

n!

兿

1

i,j⫽0nij!

兿

1

i,j⫽0 {pc

ij}ni j. or a joint statistical method. Under the marginal method,

we calculate the posterior probability of dosages for

each marker individually. If one particular dosage level _Let_␲_{denote the prior distribution of marker} configu-has a posterior probability higher than a specified cut- _ration _c _僆 _C_{. The posterior probability for a specific} off, only that marker dosage is considered in candidate _{parental marker configuration}_c

0僆Cis parental configurations. Otherwise, we select the most

likely dosage levels until the sum of posterior probabili- _P₍_c

0|n,nobs)⫽

(5)

TABLE 1 Multinomial distribution probability parameterspc⫽₍_pc

00,pc01,pc10,pc11) in a tetraploid with

two markers under random pairing

(M1M2|M1M2)a (M1M2|M1) (M1M2|M2) (M1|M2) (M2|M1)

p00c

r2_⫺_2r_⫹₁ 6

1⫺r 6

1⫺r 2

1⫹r 6

p01c

⫺r2_⫹_2r 6

2⫹r 6

r 6

r 2

2⫺r 6

p10c

⫺r2_⫹_2r 6

r 6

2⫹r 6

r 2

2⫺r 6

p11c

r2⫺_2r⫹₅ 6

3⫺r 6

1⫺r 2

1⫹r 6

r, estimated recombination fraction between the two markers.

a_{Only dominant alleles are shown for informative chromosomes, separated by |.}

correct marker dosages for both marker loci, while for the joint method it refers to selecting the correct marker ⫽

兿

1

i,j⫽0{pcij0}ni j␲(c0)

兺

c僆C

兿

1i,j⫽0{pijc}ni j␲(c)

.

configuration. The proportion of data sets for which a unique configuration was selected,puni, was also recorded Among the many criteria for assessing the worth of

as a measurement of the efficiency for reducing the model a model reduction method are ease of implementation,

space (Table 3). In general, both methods performed a high probability of selecting the correct model, and

better with a larger sample size and shorter marker efficiency in reducing the size of the candidate model

distances. With sample sizeⱖ100, bothpincandpuniwere

space. Even though the marginal method is easy to im- _ⵑ

1.0 for all the simulation settings. With a small sample plement and less computationally expensive, the joint

size, the extra information from the genetic map added method directly estimates the posterior probability of

strength to the joint method both in selecting the cor-parental marker configurations. One might expect that

rect configuration and in reducing the model space. because there may be multiple marker configurations that

On the basis of the simulation results, the joint model have the same marker dosages, especially if the marker

reduction method is recommended for data analysis. dosages are low and ploidy high, the joint method will

Once the model reduction step is completed, the be more efficient in reducing the model space.

selected submodels are used to form the pool of candi-A simulation study was performed to compare the

date parental marker configurations by combining these performance of the marginal and joint methods for a

submodels with possible parental QTL configurations. tetraploid under random pairing. We chose two markers

After this, the likelihood function is constructed for each and marker genetic distance varied from 0.1 to 0.5 M

candidate parental configuration, and the EM algorithm with 0.1-M increments. Sample sizes ranged from 50 to

is applied to maximize the likelihood function and esti-500 with increments of 50. For each simulation setting,

mate parameters. The parental configuration that has 10,000 data sets were generated. The cutoff probability

the maximal likelihood is then identified as the parental was set to be 0.90 for both methods. Without any prior

configuration from which the observed data have arisen. information concerning the parental marker

configura-tion, we assumed all the possible parental marker config-urations to be equally likely (i.e., the prior for the joint

DATA ANALYSIS

method is discrete uniform). Under this assumption, the

The proposed approach was employed to analyze a prior distribution of a marker dosage is not uniform since

published alfalfa SDRF data set with traits related to a lower dosage is more frequent than a higher dosage.

winter hardiness (BrouwerandOsborn1999). Alfalfa Among the five possible parental marker configurations

is an autotetraploid that undergoes bivalent random (Figure 2, A–E) for a tetraploid, dosage 1 has a frequency of

pairing (BinghamandMcCoy1988;Caoet al.2004). 3/5 and dosage 2 has a frequency of 2/5 for each marker

Winter hardiness is a complex trait and one of the most locus. In turn, this frequency distribution was used as the

important adaptations for alfalfa. Because there is no prior of marker dosage for the marginal method.

direct measurement of winter hardiness, related traits The proportion of data sets for which the correct

con-including winter injury (WI), fall growth (FG), freezing figuration was included in the candidate configuration

injury (FI), and unifoliate internode length (UIL) were space, pinc, was recorded (Table 2) for both methods.

(6)

TABLE 2

Estimatedpinc, probability of selecting the correct configuration in the candidate configuration (model)

space, for a tetraploid with two markers, 10,000 simulated data sets, and a cutoff of 0.90 for both the marginal and joint model reduction methods under random pairing

n

50 100 150

0.1a _0.3 _0.5 _0.1 _0.3 _0.5 _0.1 _0.3 _0.5

Marginal

(M1M2|M1M2) 1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.000 (M1M2|m1M2) 0.992 0.992 0.991 0.999 0.999 0.999 1.000 1.000 0.999 (M1M2|M1m2) 0.992 0.992 0.993 0.999 0.999 0.999 1.000 0.999 0.999 (M1M2|m1m2) 0.988 0.987 0.985 0.998 0.999 0.999 1.000 0.999 1.000 (m1M2|M1m2) 0.983 0.987 0.987 0.998 0.998 0.997 0.999 0.999 0.999

Joint

n, sample size.

a_{Marker genetic distance expressed in morgans (M).}

(SURV). SURV was measured by the percentage of each in 1995 and 1996; WI scaled 1–5 (1, no injury; 5, dead) was measured in 1996 and 1997; and UIL was measured plant that survived in the winter (0–100%). FG was

mea-sured by height of vertical regrowth in centimeters in for all seedlings. The histogram plots of these traits can be referred to Figure 1 inBrouwer et al.(2000). early October in two successive years, 1995 and 1996;

FI was measured by absorbance at 265 nm (ABS) and Two genotypes, B17 and P13, representing the

ex-tremes for each trait were crossed, and a single F1plant electrical conductivity per gram fresh weight (COND)

TABLE 3

Estimatedpuni, probability of selecting a unique configuration in the candidate configuration (model)

space, for a tetraploid with two markers, 10,000 simulated data sets, and a cutoff of 0.90 for both the marginal and joint model reduction methods under random pairing

n

50 100 150

0.1a _0.3 _0.5 _0.1 _0.3 _0.5 _0.1 _0.3 _0.5

Marginal

Joint

n, sample size.

(7)

TABLE 4

Estimated locations and effects of QTL associated with winter hardiness and its related traits for autotetraploid alfalfa

Trait Year Chr Marker interval Pos dˆ1 dQ ␮ˆa ␴ˆ FGb ₁₉₉₅ ₁ _{uwg284–vg2b9} _0.31 _0.31 ₂ _{12.24, 21.45, 24.61} _2.61

1996 1 uwg284–vg2b9 0.34 0.34 2 12.51, 19.18, 19.29 3.17 1995 5 vg2a2–vg2e5 0.498 0.001 2 12.30, 21.65, 23.75 2.82 1996 5 hg2b12–vg2a2 0.456 0.05 2 12.70, 18.48, 22.08 2.96 1996 5 mtsc35–ugac473 0.29 0.02 2 15.23, 18.46, 20.90 3.63 1996 8 ugac291–ugac540 0.187 0.10 2 13.02, 18.60, 21.53 3.08 1996 8 ugac109–uwg169 0.704 0.06 2 15.57, 18.95, 19.95 3.62 CONDc ₁₉₉₆ ₆ _{ugac281–vg2c2} _0.356 _0.09 ₂ _{52.42, 59.05, 94.60} _13.30 1996 8 ugc291–ugac540 0.247 0.16 2 50.91, 58.36, 92.89 12.64 SURVd ₁₉₉₅ ₈ _{ugac291–ugac540} _0.187 _0.10 ₁ _{0.54, 0.14} _0.229 WIe ₁₉₉₆ ₈ _{ugc291–ugac540} _0.217 _0.13 ₂ _{2.66, 4.40, 4.82} _0.513 UIL — 8 ugac235–ugac291 0.001 0.001 2 9.34, 3.35, 12.74 1.847

Data are fromBrouwerandOsborn(1999). Significant QTL were detected by experiment-wise thresholds calculated from 1000 permutation tests at the 0.05 significance level. Chr, chromosome number; Pos, estimated composite map position of the QTL;dˆ1, genetic distance between the detected QTL and left end marker of the interval;dQ, estimated QTL dosage;␴, estimated common standard deviation of the trait. UIL, unifoliate internode length.

a␮ ⫽₍␮

0,␮1, . . . ,␮dQ) is the estimated mean vector.

b_{The trait fall growth (FG) was measured by height of vertical regrowth in centimeters in early October in} 2 successive years, 1995 and 1996.

c_{Electrical conductivity per gram fresh weight (COND) was measured in 1995 and 1996 to represent freezing} injury (FI).

d_{Winter survival percentage of plants (SURV) was measured in 1995 and 1996 to represent winter hardiness.} e_{Winter injury (WI) was scaled 1–5 (1, no injury; 5, dead) and was measured in 1996 and 1997.}

was crossed to each parent to create two populations of were mapped on chromosome 1, three on

chromo-some 5, and two on chromochromo-some 8. Two pairs of QTL 101 individuals each. Both populations were scored for

82 SDRF loci and measured for each trait in 2 years were detected at close map positions across 2 years: on chromosome 1, one QTL was detected at 0.31 M for of replicated field trials. The composite map and the

original analysis of the trait data and marker data can the 1995 average and one at 0.34 M for the 1996 average; on chromosome 5, one QTL was detected at 0.498 M

be found inBrouwerandOsborn(1999) and

Brou-weret al.(2000). On the basis of the genetic map, seven for the 1995 average and one at 0.456 M for the 1996

average. Two QTL were detected on chromosomes 6 and linkage groups were identified and corresponded to

seven out of the eight chromosomes. As such in what 8 for freezing injury FI. One QTL was detected on chro-mosome 8 for SURV and WI. On chrochro-mosome 8, in the follows, the terms for chromosome and linkage group

are used interchangeably. region from 0.187 to 0.247 M, four QTL were detected

for multiple traits, fall growth (FG 1996), freezing injury Because only B17 markers were considered for the

P13⫻ F1 backcross population and similarly only P13 (COND 1996), winter hardiness (SURV 1995), and win-ter injury (1996); and two of them were at the same lo-markers for the B17 ⫻ F1 backcross, each backcross

population is equivalent to a pseudo-doubled backcross cation, 0.187 M. The one QTL for UIL in the P13 back-cross was at the top of chromosome 8. All of the detected population. Our approach was applied to identify

re-gions in the genome related to the previously described QTL were identified to have dosage two, except for the QTL for winter hardiness (SURV). In general, an addi-quantitative traits for both backcross populations based

on the composite map. The average over the replicates tional copy of a QTL was found to be associated with a higher quantitative trait mean, except for the QTL for in each year for each measurement was treated as a

quantitative trait, and the two averages were analyzed SURV and UIL.

Brouweret al. (2000) used a regression-based single-separately. A permutation test (ChurchillandDoerge

1994) with 1000 permutations for a 0.05 significance marker analysis to identify markers associated with the previously described quantitative traits. Multiple regres-level was used to detect a significant QTL.

Seven QTL were detected for FG, two for FI, one for sion analyses were performed using the same data for all 82 SDRFs to identify the best polygenic models for SURV, and one for WI in the backcross to B17; and

(8)

TABLE 5

Estimated locations of QTL affecting traits related to winter hardiness in Brouweret al. (2000): (A) fall growth (FG) 1995 average; (B) fall growth (FG) 1996 average; (C) freezing injury

(FI: COND) 1996 average; (D) winter injury (WI) 1996 average

A B C D

Chr Marker Pos Chr Marker Pos Chr Marker Pos Chr Marker Pos

1 vg2b9 35.8a ₁ _vg2b9 _35.8a ₁ _vg2b9 _35.8b ₂ _vg0c5 _23.0c 1 mtsc9 51.4 3 mtsc14 46.1b ₁ _mtsc9 _51.4d ₄ _ugac671 _15.3 3 uwg96 24.3b ₅ _mtsc35b _27.0a ₃ _ugac224 _23.1d ₈ _ugac540 _30.9d 3 mtsc14 46.1c ₅ _ugac473 _29.3a ₈ _ugac540 _30.9c ₈ _ugac191 _63.7 5 vg2a2 49.8a ₈ _ugac540 _30.9c ₈ _ugac191 _63.7

8 ugac191 63.7 8 ugac191 63.7c

Chr, chromosome; Pos, estimated composite map position of the QTL. a_{Locus close to the QTL identified by the interval-mapping method.} b_{LRT at this locus is a peak close to the 0.05 threshold.}

c_{LRT at this locus is higher than the 0.05 threshold, but not a peak.} d_{LRT at this locus is a peak, not close to the 0.05 threshold.}

clude six SDRFs for FG (1995), six for FG (1996), five for FI and four QTL for winter injury on chromosomes 1, 5, and 8. The specifics of theMaet al.(2002) method and (COND 1996), and four for WI (1996) [Table 5,

summa-rized fromBrouweret al.’s (2000) Table 2]. Among all analysis make a direct comparison between our results and their results difficult for the following reasons. First, the detected SDRFs, all but five are located on

chromo-somes 1, 5, and 8, which also contain most of QTL we it is assumed inMaet al.(2002) that the markers and QTL have the same dosage level and are in coupling, detected with the proposed method. Two loci for FG

(1995) and three for FG (1996) are also shared by our while most of our detected QTL were of dosage two. Second, there is some evidence that the preferential interval-mapping method. Among the other significant

marker loci, four had likelihood-ratio test statistics pairing factor estimation process provided byMa et al.

(2002) is flawed since in reality the genotypic infor-larger than the 0.05 experiment-wise threshold used in

our interval-mapping method, but were not local peaks. mation collected for across years should not change, yet their estimates of preferential factor do (Caoet al.

Five loci had likelihood-ratio test statistics being local

peaks, with two of them close to the 0.05 threshold. 2004). Finally, inMaet al.(2002) the critical value for the significance level 5% is calculated on the basis of 200 Using an experiment-wise threshold based on a

sig-nificance level 0.05, rather than an asymptotic-based permutation tests. For a 5% significance level (Churchill

and Doerge1994;Doergeand Churchill1996), 200 strict cutoff based on a significance level 0.05 or 0.005

for each marker (Brouwer et al. 2000), we detected permutations limit full exploration of the null space and may lead to the declaration of some QTL that are fewer QTL by controlling the type I error. The region

on chromosome 8 (from 0.187 to 0.247 M) that was actually not significant at the 5% level. identified by our method as being shared by multiple

traits was not detected by the regression method. This

SIMULATION

is most likely the direct result of more experimental

information being incorporated into the analysis (e.g., We used the framework provided by both the alfalfa experiment and analysis as a means to investigate the model selection, dosages, intermarker position). Since

single-marker analysis explores only the relationship be- reliability of the detected QTL via a simulation study. Data were simulated by treating the estimated parental tween the quantitative trait and marker loci, if the QTL

is very close to a marker locus, the estimated effect for configuration and related parameters as the true values. For this purpose, 5 of the 12 QTL that we detected were that marker locus could be interpreted as the effect of

the QTL, under the assumption that the QTL is also of chosen to provide a range for the flanking marker ge-netic distance and configuration, QTL dosage and con-a single dose con-and in coupling with the mcon-arker. However,

most of the QTL that were detected using our method figuration, QTL effect, and QTL position (Table 6). The flanking marker distances were 0.358, 0.358, 0.098, 0.222, were of dosage two; therefore, it is not appropriate to

compare the QTL effects estimated by Brouweret al. and 0.087 M for the five intervals (Table 6). The flanking markers in the first two intervals were in repulsion, and (2000) with our method.

Ma et al. (2002) analyzed the same alfalfa data for the others were in coupling. The estimated QTL dosage

was one for the fourth interval and two for the other two traits, WI and fall injury (FI: COND). They identified

(9)

TABLE 6

Estimated QTL location and effect for data simulated from five intervals containing detected QTL as indicated by alfalfa data and analysis

QTL Trait Pos Mconfig Freq d(M1,M2) d1 dˆ1(s(dˆ1)) ␴ ␴ˆ ␮a ␮ˆ 1 FG 0.31 E 999 0.358 0.31 0.30(0.066) 2.61 2.55 12.24, 21.45, 24.61 12.32, 21.49, 24.54

2 FG 0.34 E 992 0.358 0.34 0.30(0.092) 3.17 3.04 12.51, 19.18, 19.29 12.53, 19.21, 19.29

3 FG 0.456 D 862 0.098 0.05 0.059(0.036) 2.96 2.82 12.70, 18.48, 22.08 12.68, 18.58, 21.89

4 SURV 0.187 D 955 0.222 0.10 0.097(0.041) 0.23 0.23 0.54, 0.14 0.54, 0.14

5 UIL 0.001 D 954 0.087 0.001 0.015(0.027) 1.85 1.86 9.34, 3.35, 12.74 9.14, 3.41, 12.50

For each interval, 1000 data sets of sample size 100 were simulated. Pos, position of the detected QTL on the composite genetic map; Mconfig, marker configuration (D is coupling and E is repulsion);d1, distance between the QTL and the left end marker;␴, common standard deviation of the trait for all QTL dosage levels; Freq, frequency of selecting the correct parental configuration out of 1000 data sets;dˆ1, average of the 1000 estimated QTL locations from the 1000 simulated data sets;s(dˆ1), sample standard deviation of the 1000 estimated QTL locations from the 1000 simulated data sets;␴ˆ, average of population standard deviation estimates;␮ˆ , average of population mean vector estimates.

a_{␮ ⫽}₍_␮

0,␮1, . . . ,␮dQ) is the mean vector of the trait corresponding to QTL dosages.

was similar to that of the first interval. We included the QTL can be provided by this extra copy of the QTL. This, in turn, makes it difficult to choose between the second interval in our simulation because we were

inter-ested in investigating whether our method is able to configurations with one copy and two copies of the QTL. On the other hand, with markers in repulsion, both identify the small increment from␮1to␮2, where␮⫽

(␮0, ␮1, ␮2) ⫽ (12.51, 19.18, 19.29) is the estimated copies of the QTL provide information about recombi-nation between markers and QTL. The effect of this dif-mean vector for the corresponding QTL in that interval.

The five interval settings were studied separately. For ference is easily illustrated by comparing the second interval with the third interval used for this simulation. each interval, 1000 data sets with a sample size of 100

(i.e., similar to the alfalfa data) were simulated and ana- Although the increment in trait means from QTL dose 1 to dose 2 is only 0.035␴for case 2 and 1.22␴for case 3, lyzed, with results summarized in Table 6. Using the joint

model selection method, the correct parental marker con- case 2 still outperformed case 3. On the basis of this, repulsion is not necessarily the worst-case scenario for figuration was included in the candidate parental marker

configuration pool 100% of the time for all five cases. QTL mapping in tetraploids with simplex markers, as was once thought.

The overall parental configuration pool is then formed

by the combination of candidate parental marker config- Since the above simulation was mainly focused on pa-rental marker configurations formed by simplex mark-urations and possible QTL configmark-urations. After fitting our

interval-mapping model to each possible parental con- ers, another simulation study was performed to serve as the basis of investigation for how, in general, QTL or figuration, the one having the maximal test statistic

pro-vides the estimated QTL location and marker and QTL marker dosages and their respective linkage phase af-fect the performance of our algorithm. Under a pseudo-configuration, as well as the parameter estimates. The

cor-rect configuration was identified at least 95% of the time doubled backcross experiment, two markers (M1/m1and

M2/m2) and one QTL (Q/q) were used with marker with markers in coupling and at least 99% of the time with

markers in repulsion, except for the third interval where distance 0.10 M and a sample size 100. The location of the QTL was 0.01, 0.03, or 0.05 M. The trait mean␮ the correct configuration was selected 86% of the time.

Given the correct configuration, the estimated parame- was (10.0, 12.0) if QTL dosage was 1 or (10.0, 12.0, 14.0) if QTL dosage was 2, and␴ was 1.0. One thou-ters were close to the true values used for simulation.

The fact that our method performed better with re- sand data sets were simulated for each combination of the parameter settings and each of the following con-pulsion than with coupling is due to the difference

between these two configurations. This can be demon- figurations: double coupling (A1, A3), asymmetric cou-pling 12 (B1, B2, B3), coucou-pling (D1, D2, D3), and repul-strated by considering an example where two copies of

the QTL are on both informative chromosomes. When sion (E1, E2, E3).

Our simulation results illustrate that when the correct the markers are in coupling, one copy of the QTL is

located on a chromosome containing no dominant parental configuration was identified, the parameter

estimate averages were close to the true value. The abil-marker alleles. Thus, not much extra information with

(10)

de-mizing the disequilibrium in pairing. However, with ran-dom pairing, one-third of the time one informative chro-mosome pairs with another informative chrochro-mosome to provide less, or even no, information, if two chromo-somes are homozygous. Using similar reasoning it can be demonstrated that random pairing is more informa-tive than preferential pairing for markers in repulsion.

DISCUSSION

One of the largest challenges when mapping QTL in polyploids is the incomplete parental configuration (model) information. This is compounded by the dra-matic increase in the size of the model space as the number of markers and/or ploidy level increase. To

Figure3.—The proportion identifying the correct parental

(marker and QTL) configuration for a tetraploid under a combat these challenges in autopolyploid QTL

map-pseudo-doubled backcross experiment. The trait mean ␮is _{ping, our interval-mapping algorithm takes one of two} (10.0, 12.0) if QTL dosage is 1 or (10.0, 12.0, 14.0) if QTL _{approaches. The size of the model space is addressed} dosage is 2. Population standard deviation␴is 1.0. With

sam-using Bayesian model selection by first identifying

po-ple size 100, 1000 data sets are simulated under 11 parental

tential parental marker dosages using a marginal

ap-configurations corresponding to numbers 1–11 on the

hori-zontal axis: A1, A3, B1, B2, B3, D1, D2, D3, E1, E2, and E3. proach or marker configurations based on a joint

ap-The letters denote the marker configurations: (A) double _{proach. We have demonstrated that both methods are} coupling, (B) asymmetric coupling 12, (D) coupling, and (E) _{effective in reducing the model space, but that the joint} repulsion. The numbers denote the QTL configurations: (1)

method is more efficient due to the fact that it

incorpo-one dose of QTL present on the first informative chromosome,

rates the genetic map information. From the reduced

(2) one dose of QTL present on the second informative

chro-mosome, and (3) two doses of QTL present on both informa- model space, the candidate parental configuration pool

tive chromosomes.d(M1,Q) is the genetic distance between is formed by combining each potential parental marker the QTL and the left end-flanking marker. _{configuration with a possible QTL configuration. For}

each potential parental configuration, profiles of MLEs of QTL effects, as well as population variation and likeli-pended on the underlying parental configuration. The _{hood-ratio test statistics, are produced at each} evalua-worst-case scenario was when markers were in coupling _{tion point of a grid search across the genome. Putative} on the first informative chromosome with one copy of _{QTL are detected using a resampling-based significance}

the QTL on the second informative chromosome (D2). _{threshold. The corresponding parental configuration}

For this case the identified correct configuration pro- _{(and related parameter estimates) is then taken as the} portion ranged from 63 to 66% (Figure 3). This is not _{underlying model from which the data are produced.} surprising since with D2 under random pairing, only _{Our simulation studies have demonstrated that if the} one-third of the pairings provide information concern- _{correct parental configuration is identified, our}

interval-ing the recombination between marker and QTL. On _{mapping method provides parameter estimates close}

the other hand, the best situation was when two markers _{to true values. However, the chance of identifying the} were in repulsion, where the correct parental configu- _{correct configuration depends on the true parental} con-ration was almost 100% regardless of the QTL configu- _{figuration, chromosome pairing system, and genetic} dis-ration or location. Most of the other configudis-rations ob- _{tance between markers and the QTL. This helps us}

tained a proportion of at least 95% (Figure 3). _{address a question raised in}_Doerge_and_Craig_(2000,

It is worth noting that in our earlier work (Caoet al. _{p. 7956), “for which situation is the linkage mostly}

af-2003), repulsion did not outperform configurations with _{fected?” Comparatively, among these factors, the true} higher dosages of loci, such as double coupling and asym- _{parental configuration is the most influential one. Since} metric coupling. The difference in results comes from a configuration consists of both loci dosages and linkage the difference in the assumed chromosome-pairing mech- phase, the influence of one cannot be considered with-anism. Our earlier work (Caoet al.2003) investigated out considering the effect of the other. For example, a

the performance of our interval-mapping method un- configuration with higher marker/QTL dosages does

(11)

maxi-tive to the chromosome-pairing system and thus con- sumption does not hold, we can simply change the upper

sistently provides high power for detecting QTL. bound for the QTL dosage to the ploidy level. This

Our simulation results also help to answer some of the change will result in more candidate QTL configura-remaining questions presented byDoerge andCraig tions to choose from, but the method and the procedure (2000, p. 7956). First, “If a molecular marker is found remain the same. Similarly, if the breeding design is

to be tightly linked to a QTL, should the dosage of altered from a pseudo-doubled backcross design, our

the marker agree with the dosage of the QTL?” Not approach can be adapted. Also, if codominant markers

necessarily. There is no causal relationship between the are considered, they can be easily incorporated in the dosages of the QTL and the flanking markers. Although model by including parameters such as allelic effect and as discussed previously a configuration with the marker interaction.

and QTL having the same dosage often has a higher The effect of the assumptions we place on the genetic

disequilibrium and thus a higher power of detecting map that is used for QTL mapping in polyploids is one QTL. Second, “Should the models which are controlled area of research that remains unaddressed. When the by dosage levels be weighted for the purpose of repre- genetic map is estimated, the most likely parental marker senting more realistic results?” Weighting the candidate configuration in turn has to be estimated since recombi-models may be a better way than depending solely on nation requires a parental marker configuration. Be-the one “best” model, when Be-the uncertainty about pa- _{cause of this, the parental marker configuration used} rental configuration is high. However, even configura- _{for estimating the genetic map should be consistent} tions with the same marker dosage levels could differ _{with the one used to locate QTL. However, if the most} to a greater extent biologically; therefore, further inves- _{likely parental marker configuration is not the correct} tigation is needed to find a both statistically and biologi- _{configuration, the reliability of the map and the mapped} cally sound weighting scheme. Finally, “Would models _{QTL are affected. By selecting the pool of potential} pa-with dosage levels more similar to each other be more _{rental configurations on the basis of observed progeny} likely, especially with close linkage (short genetic dis- _{data, our method can serve as a further verification for} tance)?” This answer depends on the linkage phase _{the estimated genetic map. A framework is currently under} since it is possible for two models with exactly the same _{development to allow variability or uncertainty in} estimat-dosage levels and genetic association to be biologically _{ing the map to be considered when mapping QTL.} different due to different linkage phases, and therefore

We thank Tom Osborn for providing the data set and continued

they are different in likelihood. _{work in this area, and we thank Gary Churchill for his comments on}

We applied our approach to a published alfalfa data this work.

set for quantitative traits that are related to winter hardi-ness. Our analysis confirms several putative QTL identi-fied by a regression-based single-marker analysis (

Brou-LITERATURE CITED

weret al.2000), and it also identified a new region on

Bayes, T., 1783 An essay towards solving a problem in the doctrine

chromosome 8 that was shared by several traits. Our

of chances. Philos. Trans. R. Soc. Lond.53:370–418.

analysis results demonstrate that for a majority of the

Bingham, E. T., andT. J. McCoy, 1988 Cytology and cytogenetics

detected QTL, there is a positive or negative correlation of alfalfa, pp. 737–776 inAlfalfa and Alfalfa Improvement, edited by A. A.Hanson, D. K.Barnesand R. R.Hill, Jr. American

between the quantitative trait mean and QTL dosage

Society of Agronomy, Madison, WI.

level;i.e.,␮0ⱕ ␮1ⱕ. . .ⱕ ␮_dQor␮0ⱖ ␮1ⱖ. . .ⱖ ␮_dQ.

Broman, K. W., 2001 Review of statistical methods for qtl mapping

If such an ordering of means is available before analysis, in experimental crosses. Lab Anim.30(7): 44–52.

Brouwer, D., andT. Osborn, 1999 A molecular marker linkage

an order-restricted version of interval mapping can be

map of tetraploid alfalfa. Theor. Appl. Genet.99:1194–1200.

performed to improve power of detection. With an

or-Brouwer, D. J., S. H. Dukeand T. C. Osborn, 2000 Mapping

der constraint, the problem falls into the category of _{genetic factors associated with winter hardiness, fall growth, and}

freezing injury in autotetraploid alfalfa. Crop Sci.40:1387–1396.

isotonic regression and it can be computed using the

Cao, D., B. A. CraigandR. W. Doerge, 2003 Interval mapping for

pool-adjacent-violator algorithm (Robertsonet al.1988).

autopolyploids, pp. 60–72 inProceedings of the 15th Annual Kansas We have performed simulations (not shown) to com- _{State University Conference on Applied Statistics in Agriculture}_{, edited} pare the power between an order restriction and without by G.Milliken. Kansas State University, Manhattan, KS.

Cao, D., T. C. OsbornandR. W. Doerge, 2004 Correct estimation

one. In fact, even when the only information is that the

of preferential chromosome pairing in autotetraploids. Genome

means are ordered, and the direction remains un- _Res._14:_459–462.

known, there is a slight improvement in the perfor- Churchill, G. A., and R. W. Doerge, 1994 Empirical threshold values for quantitative trait mapping. Genetics138:963–971.

mance. The improvement is most evident when the order

Darlington, C. D., 1931 Meiosis in diploid and tetraploid primula

of the means is increasing or decreasing with respect to _{sinensis. J. Genet.}_24:_65–98.

the QTL dosage and is knowna priori. Dempster, A. P., N. M. LairdandD. B. Rubin, 1977 Maximum

likelihood from imcomplete data via the em algorithm. J. R. Stat.

Thus far it has been assumed that the QTL dosage

Soc. Ser. B39(1): 1–38.

is at most half the ploidy level, which is a reasonable

Dietrich, W. F., J. Miller, R. Steen,M. A. Merchant, D. Damron-assumption given the pseudo-doubled backcross experi- _Boles_{et al.}_{, 1996} _{A comprehensive genetic map of the mouse}

genome. Nature380:149–152.

(12)

as-Doerge, R. W., 2002 Mapping and analysis of quantitative trait loci Ma, C.-X., G. Casella, Z.-J. Shen, T. C. OsbornandR. Wu, 2002 A unified framework for mapping quantitative trait loci in bivalent in experimental populations. Nat. Rev. Genet.3:43–52.

tetraploids using single-dose restriction fragments: a case study

Doerge, R. W., andG. A. Churchill, 1996 Permutation tests for

from alfalfa. Genome Res.12:1974–1981. multiple loci affecting a quantitative character. Genetics142:285–294.

Mather, K., 1938 Crossing-over. Biol. Rev.13:252–292.

Doerge, R. W., andB. A. Craig, 2000 Model selection for

quantita-Muller, H. J., 1916 The mechanism of crossing-over. Am. Nat.50

tive trait locus analysis in polyploids. Proc. Natl. Acad. Sci. USA

(592): 193–221, 284–305, 350–366, 421–434.

97(14): 7951–7956.

Newton, W. C. F., andC. D. Darlington, 1929 Meiosis in

poly-Doerge, R. W., Z-B. ZengandB. S. Weir, 1997 Statistical issues in

ploids. J. Genet.21:1–56. the search for genes affecting quantitative traits in experimental

Osborn, T., J. Pires, J. Birchler,D. L. Auger, Z. J. Chenet al., populations. Stat. Sci.12(3): 195–219.

2003 Understanding mechanisms of novel gene expression in

Grattapaglia, D., andR. Sederoff, 1994 Genetic linkage maps

polyploids. Trends Genet.19(3): 141–147. of eucalyptus grandis and eucalyptus urophylla using a

pseudo-Rieseberg, L. H., andM. F. Doyle, 1989 Tetrasomic segregation testcross: mapping strategy and RAPD markers. Genetics137:

in the naturally occurring autotetraploidallium nevii(aliaceae). 1121–1137.

Hereditas111:31–36.

Hackett, C. A., J. E. BradshawandJ. W. McNicol, 2001 Interval

Ripol, M., G. Churchill, J. da SilvaandM. Sorrells, 1999 Statisti-mapping of quantitative trait loci in autotetraploid species.

Genet-cal aspects of genetic mapping in autopolyploids. Gene 235:

ics159:1819–1832. _31–41.

Haley, C. S., andS. A. Knott, 1992 A simple regression method _{Robertson, T., F. T. Wright}_and_{R. L. Dykstra}_{, 1988} _Order Re-for mapping quantitative trait loci in line crosses using flanking _{stricted Statistical Inference}_{, Ed. 1. Wiley, New York.}

markers. Heredity69:315–324. _{Rodzen, J. A}_{., and}_{B. May}_{, 2002} _{Inheritance of microsatellite loci}

Jackson, R. C., andJ. W. Jackson, 1996 Gene segregation in auto- _{in the white sturgeon}_{acipenser transmontanus.}_Genome_45:_1064–1076. tetraploids: prediction from meiotic configurations. Am. J. Bot. _{Soltis, D. E}., andP. S. Soltis, 1999 Polyploidy: recurrent formation

83(6): 673–678. and genome evolution. Trends Ecol. Evol.14(9): 348–352.

Jansen, R. C., 1993 Interval mapping of multiple quantitative trait Soltis, D. E., P. E. SoltisandJ. A. Tate, 2003 Advances in the study loci. Genetics135:205–211. of polyploidy since plant speciation. New Phytol.161:173–191.

Knapp, S. J., J. W. C. BridgesandD. Brikes, 1990 Mapping quantita- Soltis, P. S., andD. E. Soltis, 2000 The role of genetic and genomic attributes in the success of polyploids. Proc. Natl. Acad. Sci. USA tive trait loci using molecular marker linkage maps. Theor. Appl.

97(13): 7051–7057. Genet.79:583–592.

Suzuki, D. T., A. J. F. Griffiths, J. H. MillerandR. C. Lewontin,

Koornneef, M., J. van Eden, C. J. Hanhart,P. Stam, F. J. Braaksma

1989. Introduction to Genetics, Ed. 4. W. H. Freeman, New York.

et al., 1983 Linkage map ofarabidopsis thaliana.J. Hered.74:

Sybenga, J., 1995 Meiotic pairing in autohexaploidlathyrus: a math-265–272.

ematical model. Heredity75:343–350.

Lander, E. S., andD. Botstein, 1989 Mapping Mendelian factors

Wolfe, K. H., 2001 Yesterday’s polyploids and the mystery of dip-underlying quantitative traits using RFLP linkage maps. Genetics

loidization. Nat. Rev. Genet.2:333–341.

121:185–199.

Wu, K. K., W. Burnquist, M. Sorrells, T. L. Tew, P. H. Mooreet al.,

Luo, Z. W., C. A. Hackett, J. E. Bradshaw, J. W. McNicoland

1992 The detection and estimation of linkage in polyploids

D. Milbourne, 2000 Predicting parental genotypes and gene

using single-dose restriction fragments. Theor. Appl. Genet.83:

segregation for tetrasomic inheritance. Theor. Appl. Genet.100:

294–300. 1067–1073.

Zeng, Z-B., 1993 Theoretical basis for separation of multiple linked

Luo, Z. W., C. A. Hackett, J. E. Bradshaw, J. W. McNicolandD. Mil- _{gene effects in mapping quantitative trait loci. Proc. Natl. Acad.} bourne, 2001 Construction of a genetic linkage map in tetraploid _{Sci. USA}₉₀_{(23): 10972–10976.}

species using molecular markers. Genetics157:1369–1385. _{Zeng, Z}_-_B_{., 1994} _{Precision mapping of quantitative trait loci.}

Genet-Luo, Z. W., R. ZhangandM. J. Kearsey, 2004 Theoretical basis _ics_136:_1457–1468. for genetic linkage analysis in autotetraploid species. Proc. Natl.