Figure 4.2 Modular Clustering of Traits
Height, flowering time, and node count related traits exhibited substantial modularity at the phenotypic level (across plots). Their apparent independence was reduced at the genetic level (across RILs). This reduction was particularly notable between flowering time and height related traits. Environmental correlations among height related traits and both flowering and node counts were weakly negative (across field environments).
Joint-Linkage Mapping: Partitioning heritable variation within NAM families
Joint-linkage mapping of total height and related traits was performed by a stepwise QTL model selection approach from a set of 1,106 markers nested within each of the 25 NAM families, as previously described (Buckler et al., 2009). As a measure of QTL robustness, family-stratified parametric bootstrapping was employed to estimate the resample model inclusion probability (RMIP) (Valdar et al., 2009) of putative QTL. Given the complexity of height and correlated phenotypes, the NAM panel’s effective population size, genetic map size, marker density, and the QTL model selection method, over 39 robust QTL with an RMIP > 0.05 were identified for every surveyed phenotype.
These QTL models captured the major proportion of heritable phenotypic variation for most of the traits (median height 77% captured – median days to anthesis 88% captured).
Class            Trait                       H2plot   H2line
Height           Total Height                0.53     0.89
Height           Ear Height                  0.49     0.88
Height           Ear : Total Height          0.47     0.86
Flowering Time   Days to Anthesis            0.66     0.91
Flowering Time   Days to Silk                0.68     0.91
Flowering Time   Anthesis-Silking Interval   0.41     0.83
Table 4.2 Heritability of Height and Related Traits in AMES Panel
All surveyed traits were found to be substantially heritable. Measures of flowering time were the most heritable across the AMES panel. These were followed by plant height and node count phenotypes.

However, the identified total plant height QTL were all of small effect, with the largest explaining only approximately 2.1% of the total heritable phenotypic variance for height (Figure 4.4). Differences were noted in the magnitude and directionality of QTL captured for the phenotypes within each of the 25 NAM families; however, over 90% of the mapped QTL for each phenotype were identified as significant and shared across at least three NAM families (Figure 4.5). Over 70% of these shared QTL contained allele series possessing both positive and negative effects relative to the common reference parent, B73. Positive and negative effects for each trait explained comparable phenotypic variance. Their distributions were symmetric and of nearly equal variance. Given the polygenicity of plant height, flowering time, and node counts, all parental genomes possessed repulsion phase QTL for every trait at the mapping resolution noted within each NAM family. Similarly, all NAM families exhibited transgressive segregation for the complex traits surveyed.
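The resample model inclusion probability used in this section can be illustrated with a minimal Python sketch. It is not the NAM analysis pipeline. It assumes simulated 0/1 marker scores, swaps in a simple nonparametric within-family resample in place of the parametric bootstrap, fits markers across families rather than nested within them, and uses a hypothetical stopping rule (`min_gain`). All function and parameter names are illustrative.

```python
import numpy as np

def forward_select(X, y, max_terms, min_gain):
    """Greedy forward selection: repeatedly add the marker that most reduces
    the residual sum of squares; stop when the relative gain is below min_gain."""
    n, p = X.shape
    selected = []
    design = np.ones((n, 1))                      # intercept-only model
    rss = np.sum((y - y.mean()) ** 2)
    for _ in range(max_terms):
        best_j, best_rss = None, rss
        for j in range(p):
            if j in selected:
                continue
            D = np.column_stack([design, X[:, j]])
            beta, *_ = np.linalg.lstsq(D, y, rcond=None)
            r = np.sum((y - D @ beta) ** 2)
            if r < best_rss:
                best_j, best_rss = j, r
        if best_j is None or (rss - best_rss) / rss < min_gain:
            break
        selected.append(best_j)
        design = np.column_stack([design, X[:, best_j]])
        rss = best_rss
    return selected

def rmip(X, y, family, n_resamples=50, max_terms=5, min_gain=0.05, seed=0):
    """Fraction of family-stratified resamples in which each marker is selected."""
    rng = np.random.default_rng(seed)
    counts = np.zeros(X.shape[1])
    for _ in range(n_resamples):
        # Resample lines with replacement separately within each family.
        idx = np.concatenate([
            rng.choice(np.flatnonzero(family == f),
                       size=np.sum(family == f), replace=True)
            for f in np.unique(family)
        ])
        for j in forward_select(X[idx], y[idx], max_terms, min_gain):
            counts[j] += 1
    return counts / n_resamples

# Simulated example (assumed data): 4 families of 50 lines, 20 markers,
# with true effects placed at markers 3 and 11.
rng = np.random.default_rng(42)
family = np.repeat(np.arange(4), 50)
X = rng.integers(0, 2, size=(200, 20)).astype(float)
y = 2.0 * X[:, 3] - 1.5 * X[:, 11] + rng.normal(scale=0.5, size=200)
probs = rmip(X, y, family, n_resamples=20)
robust = np.flatnonzero(probs > 0.05)   # threshold analogous to RMIP > 0.05
```

In this toy setting the two simulated causal markers are selected in nearly every resample, whereas markers with no effect enter the model only occasionally.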
Figure 4.4 Distribution of heritable height variation in NAM by QTL model selection
The distribution of height variation across the maize genome revealed both the polygenicity and complexity of height. The locus controlling the largest proportion of variation was identified on chromosome 9L, yet only captured approximately 2.1% of the variation.
Pleiotropy Among Phenotypes in NAM Panel
Total plant height and measures of flowering time have long been considered tightly co-regulated phenomena. The strong genetic correlations between these traits across the RILs in the NAM panel (days to anthesis r = 0.62, days to silk r = 0.57) initially supported this supposition. Nonetheless, after dissecting the QTL underlying these traits across all 25 NAM families, only four QTL within the 39 QTL height model were found to possess allelic effect estimates significantly correlated (r > 0.4) with days to anthesis. The allelic effects of these same four QTL were also significantly correlated with days to silk measurements. In contrast, all the allelic effects of QTL mapped for total plant height were found to strongly correlate with those identified when the model was fitted to the heritable variation for primary ear height. In addition, the allelic effects of thirty of the thirty-nine total height QTL were significantly correlated with total node counts (Figure 4.6).
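The comparison of allelic effects across families can be sketched as follows. This is an illustrative Python example on simulated effect estimates, not the analysis code used here. It assumes each trait's effects are stored as a QTL-by-family array, and it applies the 0.4 cutoff to the absolute correlation, which is an assumption about how the threshold was used. The function name is hypothetical.

```python
import numpy as np

def per_qtl_effect_correlation(effects_a, effects_b):
    """Pearson correlation, computed separately for each QTL (row), between the
    allelic effects of two traits across families (columns)."""
    a = effects_a - effects_a.mean(axis=1, keepdims=True)
    b = effects_b - effects_b.mean(axis=1, keepdims=True)
    return (a * b).sum(axis=1) / np.sqrt((a ** 2).sum(axis=1) * (b ** 2).sum(axis=1))

# Simulated allelic effects (assumed data): 3 QTL x 25 families.
rng = np.random.default_rng(7)
height = rng.normal(size=(3, 25))
ear_height = 0.8 * height + rng.normal(scale=0.1, size=(3, 25))  # shared effects
anthesis = rng.normal(size=(3, 25))                               # independent effects

r_ear = per_qtl_effect_correlation(height, ear_height)
r_anthesis = per_qtl_effect_correlation(height, anthesis)
shared_with_ear = np.abs(r_ear) > 0.4   # QTL flagged as pleiotropic with ear height
```

A QTL whose family-specific effects on two traits rise and fall together yields a high correlation, which is the signature interpreted as pleiotropy in this design.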
Figure 4.5 Effect sizes of most significant height QTL by NAM family
Estimate of allelic effects across the maize genome revealed numerous allele series on all chromosomes. Most effects were less than five centimeters in size.
Joint-Linkage-Assisted GWAS: Partitioning heritable variation across NAM families
Joint linkage mapping provides a powerful yet low resolution view of the genetic architecture underpinning plant height and correlated traits. Genotype-to-phenotype associations identified in linkage analysis remain limited to recently recombined linkage blocks segregating within each of the NAM families. This precludes our ability to specify the precise location of a genetic effect with much certainty beyond an interval of approximately 2-3 centimorgans in length.
Figure 4.6 Pleiotropy of QTL capturing height variation in NAM
All the allelic effects of those QTL mapped for height were highly significantly correlated with their effect on primary ear height when the same QTL model was fitted to both traits. In contrast, the allelic effects of only four of the thirty-nine height QTL were significantly correlated with days to anthesis. Comparable results were identified for days to silk.

To capitalize on ancestral recombination and further fractionate these QTL, background QTL on all but a single linkage group were fit in a NAM family-nested QTL regression model, and residual heritable height variation attributable to the omitted linkage group was mapped to polymorphisms by resampled forward regression GWAS. This joint-linkage assisted GWAS method was conducted across the NAM families using HapMap SNPs and CNVs discovered in the NAM parents and imputed onto their RIL progeny as previously described. From the 26 million tested SNPs and CNVs, hundreds of significant associations (RMIP > 0.05) were identified for plant height and correlated traits. Many of these associations were found to co-segregate across NAM families with the allelic effects of their nearest QTL for each trait; however, for several of these associations the directionality opposed the QTL’s main allelic effects. Most pleiotropic QTL possessed significant GWAS associations for their underlying traits within a two centimorgan interval. For plant height and many of the correlated phenotypes, the distribution of significant associations across NAM families revealed approximately symmetric distributions with equal densities of positive and negative effects relative to a common parent; however, effect sizes were notably smaller than those observed during joint linkage mapping of QTL for all traits.
Co-localization of natural genetic diversity and cloned height loci
We possess substantial understanding of the molecular dynamics underpinning several biochemical pathways governing plant height such as those responsible for regulation and biosynthesis of gibberellins, brassinosteroids, and auxin hormones. However, the basis for most of our knowledge of these pathways does not stem from studies of naturally segregating genetic diversity. Surprisingly, few significant robust associations were found in linkage disequilibrium with established genes in these canonical pathways or in most cases within a 250,000 base pair window surrounding them. Marker density within these regions was not significantly diminished compared to the genome-wide distribution or those regions surrounding significant associations.
Further analysis of over thirty-five previously cloned plant height loci similarly found little co-localization with the significant joint-linkage assisted GWAS hits identified in the NAM panel (Table 4.3).
Candidate                              Distance              Median Effect (cm)   Significance (RMIP)
Gibberellin-Regulated Protein 2        47 kb upstream        -1.4                 50
Gibberellin-Receptor-Like Protein      93 kb downstream      -1.2                 24
Gibberellin-Responsive-Like Protein    78 kb upstream        -0.9                 17
Phytosulfokine Receptor Protein        Intronic               1.2                 80
Brassinosteroid Synthesis Protein      78 kb downstream      -1.3                 17
Brassinosteroid LRR Receptor Kinase    0.413 kb upstream      1.8                  8
Table 4.3 Co-localization of candidate height genes and joint-linkage GWAS

Genomic prediction by ridge regression BLUP
Recently, a substantial proportion (45%) of the heritable variation in human height that had gone unidentified by single-marker GWAS was reportedly captured in a mixed linear model framework that used all marker data to define genetic relatedness among individuals (Yang et al., 2010). By contrast, a multiple regression constructed from only the significant polymorphisms was found to capture less than 5% of the heritable phenotypic variance. Employing the same ridge regression modeling framework, we captured approximately 77% of the heritable variation in plant height noted within the NAM panel from polymorphisms scored in the first generation maize HapMap (Gore et al., 2009) (Figure 3.7). This was comparable to the proportion of heritable height variation captured in a QTL model selection framework. Despite a similar capacity to capture heritable height variation through both QTL model selection and ridge regression methods, the manner in which variation was partitioned across the genome differed substantially between the approaches. While the AMES panel captures more genetic diversity at lower minor allele frequencies than the NAM panel, ridge regression in the AMES panel was found to capture a comparable 82% of heritable height variation from the polymorphisms in both panels.
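As a rough illustration of how ridge regression assigns a small effect to every marker, the following Python sketch computes ridge marker effects on simulated data. It is a sketch under stated assumptions rather than the software used for these analyses: the penalty `lam` is fixed by hand, whereas in RR-BLUP it corresponds to the ratio of residual to marker-effect variance and is normally estimated from the data. The function name and the simulated data are hypothetical.

```python
import numpy as np

def ridge_marker_effects(X, y, lam):
    """Ridge estimates of marker effects, computed in the n x n 'dual' form,
    which is convenient when markers (p) far outnumber lines (n)."""
    Xc = X - X.mean(axis=0)           # center marker scores
    yc = y - y.mean()                 # center phenotypes
    n = Xc.shape[0]
    K = Xc @ Xc.T                     # unscaled relationship-like matrix
    alpha = np.linalg.solve(K + lam * np.eye(n), yc)
    return Xc.T @ alpha               # one shrunken effect per marker

# Simulated example (assumed data): 60 lines, 300 markers, 10 with true effects.
rng = np.random.default_rng(3)
n_lines, n_markers, lam = 60, 300, 10.0
X = rng.integers(0, 2, size=(n_lines, n_markers)).astype(float)
true_effects = np.zeros(n_markers)
true_effects[:10] = rng.normal(size=10)
y = X @ true_effects + rng.normal(scale=0.5, size=n_lines)

effects = ridge_marker_effects(X, y, lam)

# The same estimates from the p x p 'primal' normal equations, for comparison.
Xc, yc = X - X.mean(axis=0), y - y.mean()
effects_primal = np.linalg.solve(Xc.T @ Xc + lam * np.eye(n_markers), Xc.T @ yc)
```

The dual and primal forms give identical estimates. The dual form only requires solving a system whose size is the number of lines, which is why relatedness-based formulations are practical with very large marker sets.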
DISCUSSION
Height is both one of the most heritable and most complex of all maize traits. In spite of this complexity, we sought to dissect the natural phenotypic variation in maize height and genetically correlated traits and partition it into components of genetic, environmental, and environmentally conditional genetic variance. Following the characterization of heritable height variation, this genetic variation was further partitioned across the maize genome and the polygenicity as well as the pleiotropy of QTL mapped for height and related traits including node counts and flowering time were assessed to discern their independence at the genomic level.
Cloned height loci and candidate genes already implicated in well-established height-related hormonal pathways, such as the auxin, brassinosteroid, and gibberellin pathways, were compared to the linkage-mapped QTL as well as those significant polymorphisms identified in joint-linkage assisted GWAS to discern whether previously identified loci adequately capture the natural heritable height variation existing in the NAM panel. Two distinct regression approaches were used to differentially model the distribution and size of allele effects across the maize genome: a model selection approach seeking to capture the most heritable variation and attribute it to the fewest QTL, and a ridge regression model wherein all polymorphisms were assumed to possess a marginal influence on maize height variation.

Figure 3.7 Genomic prediction of height in NAM and AMES panels
Comparable prediction accuracy was attained from both QTL model selection approaches and ridge regression models. Both NAM and AMES populations possessed similar prediction accuracies based on HapMap 1 polymorphisms.
Proportion of heritable height variation
As in node count and flowering time measurements, most height variation in both the NAM and AMES panels was explainable. Heritable diversity captured the most substantial proportion of phenotypic variation in all surveyed traits. To overcome the confounding effects of photoperiod (Coles et al., 2010) in determining both plant height and flowering time, only temperate field environments were included in this study. Given the termination of apical growth upon flowering, inclusion of both tropical and temperate environments may have greatly increased the proportion of height and flowering time variation attributable to environmental effects and affected the capture of environmentally conditional genetic variation. Conversion of flowering time to growing degree days, which controls for temperature differences among fields, did not appear to influence the proportions of variation attributable to environment or the relationships between flowering time and plant height. However, similar environmental variables likely influence estimates of heritability and the proportion of variance reported for each trait. For plant height this influence may be larger, as a substantially larger proportion of variation was attributable to environmentally conditional genetic variation than was observed in other traits, most notably both node count measures.
In addition to the environments of study, the heritability of each trait is strongly influenced by the allelic composition of the population in which it is phenotyped. Substantial variance in estimates of heritability was observed among the NAM families and the AMES inbred panel for height and related traits. Interestingly, correlations between the heritable variance of the traits across the NAM families were not well paralleled by the covariance of the traits across all the NAM RILs. Given the significant proportion of heritable variation in these traits captured by differences between NAM families, this was unexpected. A shared reduction in genetic variance of two traits such as total height and flowering time across the 25 NAM families was not found to proportionally reduce their correlation across all the NAM RILs. While correlations of heritable variation between total height and node counts across NAM families were increased compared to flowering time, they were reduced across all the NAM RILs. Similarly, correlations between traits within each of the NAM families were found to differ significantly. This was further evidenced by variation in the NAM families’ predicted multi-trait responses to selection.
Further analysis showed that the variation in these responses could not be explained by estimates of kinship among the lines, indicating that total genetic relatedness was not a powerful indicator of multi-trait response to selection.
Distribution of heritable height variation across the maize genome
Partitioning the heritable height variation across the maize genome revealed the substantial effect of modeling method in attributing variation to polymorphisms. To characterize the genetic architecture of maize height, the stepwise NAM family-nested QTL selection approach estimated a minimal number of QTL that could capture the most variation in height. This approach revealed that substantially less variation in total height and primary ear height existed between NAM families than was observed for both flowering time and node count measures. Moreover, the proportion of heritable variation captured by the largest effect QTL, while small in all the complex traits analyzed, was smallest for plant height at only approximately 2.1% of the heritable variation. In addition, the distribution of variation per QTL was not substantially more uniform for height than that observed for the other surveyed traits. These factors led the proportion of total heritable height variation captured by these models to be reduced relative to the other complex traits and suggested an even more polygenic pattern of inheritance.
Given the polygenicity of height, ridge regression was employed to assess the ability of all polymorphisms genotyped across both NAM families and the AMES inbred panel to capture height variation. In contrast to model selection approaches, polymorphisms were not nested within each of the NAM families, and no term was fitted for the proportion of variation captured between families. This ridge regression approach sought to capture more variation in height and to reduce the probable overestimation of allelic effects, or Beavis effect (Beavis, 1994). While allele effect estimates were substantially smaller than those observed by model selection approaches, the total proportion of heritable variation captured by both methods was comparable and approximated 77% of the total heritable height variation. Although no multiple regression was performed in the AMES panel, ridge regression was employed and captured a comparable 82% of the heritable phenotypic variation.
Differences in genetic architecture estimates from both methods were substantial; however, given the number of polymorphisms scored relative to the number of genotypes on which we possess phenotypic data, we are left with an ill-posed problem, or a lack of degrees of freedom to accurately estimate QTL effects. With insufficient degrees of freedom, we have an unconstrained solution space with an infinite number of potential QTL allele effect estimates that are equally valid from a numerical but perhaps not biological perspective. In order to discern the most biologically appropriate model, additional information or assumptions beyond phenotypes and genotypes are needed to constrain the solution space of possible effect estimates. The QTL model selection approach sought to do so by invoking Occam’s razor and assuming the minimal number of QTL capturing the most height variation was the most accurate model of genetic architecture. Ridge regression sought to constrain the solution space by limiting maximum effect sizes and assuming all QTL effects must be shrunken equally toward 0. While numerous other methods have been successfully applied to genomic prediction (Jannink et al., 2010), all seek to either limit the number of effects or shrink their squared or absolute effect size, either equally across all predicted QTL or differentially, as in several hierarchical Bayesian approaches based on repeated sampling of the probability distributions. Although many methods may predict phenotypes and capture heritable variation, the most biologically accurate model of genetic architecture often remains to be determined.
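The ill-posed nature of the problem can be made concrete with a small numerical sketch in Python. The data are simulated and all names are illustrative. With more markers than lines, two very different sets of effect estimates reproduce the phenotypes equally well, whereas adding a ridge penalty (an assumed value of `lam`) yields a single shrunken solution.

```python
import numpy as np

rng = np.random.default_rng(1)
n_lines, n_markers = 10, 40                      # more markers than lines
X = rng.integers(0, 2, size=(n_lines, n_markers)).astype(float)
y = rng.normal(size=n_lines)

# One least-squares solution: the minimum-norm solution from the pseudoinverse.
beta_min = np.linalg.pinv(X) @ y

# A direction the data cannot "see": the last right singular vector lies in
# the null space of X because rank(X) <= n_lines < n_markers.
_, _, Vt = np.linalg.svd(X)
null_direction = Vt[-1]

# A second solution with very different effects but the same fitted values.
beta_alt = beta_min + 5.0 * null_direction
same_fit = np.allclose(X @ beta_min, X @ beta_alt)

# Ridge regression removes the ambiguity by penalizing squared effect size.
lam = 1.0
beta_ridge = np.linalg.solve(X.T @ X + lam * np.eye(n_markers), X.T @ y)
```

Here `beta_min` and `beta_alt` fit the phenotypes identically, so the data alone cannot choose between them. The penalty is the additional assumption that selects one solution.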
Pleiotropy with genetically correlated traits
Pleiotropy remains an aspect of genetic architecture that is critical to predicting the independence of evolvability among traits in a selection regime. The design of the NAM panel provides a unique opportunity to characterize the pleiotropy of QTL through correlation of the allelic effects of a locus across the 25 NAM families. Using this approach, we identified substantially less pleiotropy between measures of flowering time and measures of total height and ear height than expected from their genetic correlations (r = 0.58 to 0.62). Upon further review, we found a substantial correlation (r = 0.51) between NAM families for days to anthesis and total height, indicating that while the mapped QTL variation within each NAM family was not shared between these traits, heritable height variation captured between NAM families was significantly pleiotropic with flowering time measurements. This suggests larger effect loci may be independently evolvable for measures of flowering time and plant height; however, numerous small effect loci may ensure these traits continue to coevolve.
GWAS and co-localization of candidate loci
Joint-linkage assisted GWAS of total height and ear height across the NAM families revealed a substantial number (345 and 351, respectively) of significantly associated (RMIP > 0.05) polymorphisms across the entire maize genome. Many of these were identified as co-localizing with those QTL mapped during joint-linkage analysis; however, no associations were identified near previously cloned height loci such as Anther ear, Brachytic 1, 2, 3, Brevis plant 1, 2, Clumped Tassel 1, 2, Crinkly Leaves, Dwarf 1, 3, 8, 9, 10, 12, Etched N617, Lilliputian, Nana Plant 1, 2, Pygmy, Terminal ear, or Yellow dwarf 1, 2. Similarly, no genes centrally involved in the auxin, gibberellin, and brassinosteroid pathways were in linkage disequilibrium with the GWAS associations. These results indicate that previously identified heritable height variation does not explain natural heritable variation in height well. Given that most previously identified height variation resulted in severe stunting of plants, it is possible these large effect loci are primarily conserved and have already reached fixation in natural populations.
According to Fisher’s geometric model (Fisher, 1930), large effect loci are only beneficial if a population is far from its fitness maximum. The closer a population approaches its fitness maximum, the smaller effects must be to be adaptive. Height, unlike many kernel traits (Brown et al., 2011), has not been under recent direct selection in most maize populations.
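The geometric argument can be illustrated with a small Monte Carlo sketch in Python. It is a toy simulation, not an analysis of the data in this chapter: the number of trait dimensions, the distance to the optimum, and the effect sizes are arbitrary assumed values, and the function name is hypothetical. A change of fixed size in a random direction counts as beneficial when it moves the population closer to the optimum.

```python
import numpy as np

def fraction_beneficial(effect_size, distance, n_traits, n_draws=20000, seed=0):
    """Share of random-direction changes of a given size that move a population
    closer to a fitness optimum located at the origin."""
    rng = np.random.default_rng(seed)
    directions = rng.normal(size=(n_draws, n_traits))
    directions /= np.linalg.norm(directions, axis=1, keepdims=True)  # unit vectors
    start = np.zeros(n_traits)
    start[0] = distance                       # population sits at this distance
    new_distance = np.linalg.norm(start + effect_size * directions, axis=1)
    return np.mean(new_distance < distance)

# Assumed toy values: 10 trait dimensions, population one unit from the optimum.
small = fraction_beneficial(effect_size=0.05, distance=1.0, n_traits=10)
large = fraction_beneficial(effect_size=1.00, distance=1.0, n_traits=10)
huge = fraction_beneficial(effect_size=2.50, distance=1.0, n_traits=10)
```

In this sketch small effects are beneficial close to half of the time, large effects only rarely, and effects larger than twice the distance to the optimum never, which is the pattern the model predicts for a population near its fitness maximum.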
As such, few large effect loci likely remain segregating within these populations and instead have been purged or have reached fixation. The remaining small effect loci influencing height variation likely exist in proximity to genes less central to those hormonal and biochemical