Following the suggestion of a referee, we attempted to replicate the genome-wide associations reported in our previous GWAS of EA (Rietveld, Medland, et al., 2013) in the new cohorts that were added to this study. Conversely, we also examined whether the SNPs that reach genome-wide significance in a meta-analysis of the new cohorts replicate in the Rietveld, Medland, et al. (2013) cohorts.
A. COHORT OVERLAP WITH RIETVELD, MEDLAND ET AL. (2013)
The analyses of EduYears in Rietveld, Medland, et al. (2013) were based on a discovery sample of 101,069 individuals and a combined sample (discovery + replication) of 126,559 individuals. Some of the cohorts that contributed to the Rietveld, Medland, et al. (2013) study did not participate in the present study (N = 13,981). Overall, the combined sample size of the Rietveld, Medland, et al. (2013) cohorts that contributed to our study is N = 126,413 individuals. This number exceeds the difference between 126,559 and 13,981 (that is, 112,578) because some of the original Rietveld, Medland, et al. (2013) cohorts completed additional genotyping since 2013 and were hence able to contribute larger samples to the current study.
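The sample-size accounting above can be checked with a few lines of arithmetic. The sketch below uses only the three figures quoted in the text; the variable names are illustrative, and the final quantity is a net increase implied by those figures, not a number reported in the study.

```python
# Sample sizes quoted in the text for Rietveld, Medland, et al. (2013).
combined_2013 = 126_559        # discovery + replication sample in 2013
not_in_present = 13_981        # 2013 cohorts that did not join the present study
rietveld_in_present = 126_413  # contribution of the 2013 cohorts to the present study

# Contribution one would expect if no cohort had grown since 2013.
expected_without_growth = combined_2013 - not_in_present

# Net increase implied by the quoted figures (an implied value, not a reported one).
net_increase = rietveld_in_present - expected_without_growth

print(expected_without_growth, net_increase)
```

The positive net increase is what the text attributes to additional genotyping in the returning cohorts.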
Figure 4.15. Manhattan plot from the pooled analysis of the College phenotype
Note: The x-axis is chromosomal position, and the y-axis is the P-value on a −log10 scale. The black line shows the genome-wide significance level (5×10⁻⁸). The red x’s are the approximately 74 independent genome-wide significant associations (“lead SNPs”) from the EduYears pooled results. The black dots labeled with rs numbers are the 3 Rietveld, Medland, et al. (2013) SNPs.
B. METHODS IN WITHIN-SAMPLE REPLICATION ANALYSES
Rietveld, Medland, et al. (2013) reported three genome-wide significant SNPs in their discovery sample, all of which replicated in their replication sample. These three SNPs also yielded lower P-values in the “combined” (discovery + replication) sample. In a meta-analysis of the combined sample, four additional SNPs reached genome-wide significance. Of these seven SNPs, five were genome-wide significant in the EduYears analyses. The remaining two only reached genome-wide significance in the analyses of College, but both had P-values just shy of genome-wide significance in the combined-sample EduYears analysis. Given our decision to make EduYears the primary phenotype, and to facilitate comparisons of effect sizes, we attempt to replicate all seven original associations in our meta-analyses of the EduYears variable. To examine whether the seven associations replicate in our new cohorts, we split our overall sample into two subsamples comprising: (1) cohorts that participated in Rietveld, Medland, et al. (2013) and (2) all new cohorts that were added to the current study. In what follows we refer to the former as the “Rietveld Cohorts” and the latter as the “New Cohorts.” We refer to the combined-sample meta-analysis results reported by Rietveld, Medland, et al. (2013) as the “Rietveld et al. (2013) Cohorts.”
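As an illustration of the sample split, the following minimal sketch pools per-cohort estimates for a single SNP separately within each subsample. It assumes fixed-effects inverse-variance weighting, which may differ from the weighting scheme actually used in the study; the cohort names, effect sizes, and the `ivw_meta` helper are all hypothetical.

```python
import math

def ivw_meta(estimates):
    """Fixed-effects inverse-variance-weighted pooling of (beta, se) pairs."""
    weights = [1.0 / se**2 for _, se in estimates]
    beta = sum(w * b for w, (b, _) in zip(weights, estimates)) / sum(weights)
    se = math.sqrt(1.0 / sum(weights))
    return beta, se

# Hypothetical results for one SNP: cohort -> (beta, se, participated_in_2013).
cohorts = {
    "cohort_a": (0.020, 0.010, True),
    "cohort_b": (0.030, 0.020, True),
    "cohort_c": (0.010, 0.010, False),
    "cohort_d": (0.025, 0.020, False),
}

# Subsample (1): cohorts that participated in the 2013 study.
rietveld = [(b, se) for b, se, old in cohorts.values() if old]
# Subsample (2): cohorts added in the current study.
new = [(b, se) for b, se, old in cohorts.values() if not old]

rietveld_beta, rietveld_se = ivw_meta(rietveld)
new_beta, new_se = ivw_meta(new)
print(rietveld_beta, rietveld_se, new_beta, new_se)
```

Because the two subsamples share no cohorts, the two pooled estimates are independent, which is what allows one to serve as a replication sample for the other.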
C. WITHIN-SAMPLE REPLICATION RESULTS
Table B6 reports the results of the replication analysis. In the upper panel, we report the standardized effect sizes, standard errors, and P-values for the seven SNPs. We report these statistics from three separate meta-analyses of EduYears conducted in: (i) the Rietveld et al. (2013) Cohorts, (ii) the Rietveld Cohorts, and (iii) the New Cohorts. The reference allele is chosen to be the allele associated with higher values of EduYears in the analysis of Rietveld, Medland, et al. (2013).
Given the high degree of overlap between cohorts in the previous EA meta-analysis (Rietveld, Medland, et al., 2013) and the Rietveld Cohorts, the similarity of the effect-size estimates is unsurprising. Reassuringly, the sign of the estimated coefficient in the New Cohorts is always in the predicted direction, and for all but one of the seven SNPs we can reject the null hypothesis of no effect at the 5% significance level (two SNPs, rs4851266 and rs9320913, also reach genome-wide significance in the replication sample). For six of the seven SNPs, the 95% confidence intervals for the estimated effect sizes overlap across the Rietveld Cohorts and the New Cohorts.
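The three checks described above (sign concordance, rejection of the null at the 5% level, and overlap of the 95% confidence intervals) can be sketched as follows. This assumes a two-sided test under a normal approximation, which the text does not specify; the function names and the example estimates are hypothetical and are not taken from Table B6.

```python
import math

def two_sided_p(beta, se):
    """Two-sided P-value for H0: beta = 0 under a normal approximation."""
    z = abs(beta / se)
    return math.erfc(z / math.sqrt(2.0))

def ci95(beta, se):
    """95% confidence interval under a normal approximation."""
    half = 1.959964 * se
    return beta - half, beta + half

def replication_summary(discovery, replication, alpha=0.05):
    """Compare one SNP's (beta, se) in a discovery and a replication sample."""
    (b_d, se_d), (b_r, se_r) = discovery, replication
    lo_d, hi_d = ci95(b_d, se_d)
    lo_r, hi_r = ci95(b_r, se_r)
    same_sign = b_d * b_r > 0
    return {
        "same_sign": same_sign,
        "replicates": same_sign and two_sided_p(b_r, se_r) < alpha,
        "ci_overlap": lo_d <= hi_r and lo_r <= hi_d,
    }

# Hypothetical estimates for one SNP, coded toward the EduYears-increasing allele.
summary = replication_summary(discovery=(0.030, 0.005), replication=(0.018, 0.006))
print(summary)
```

Requiring the same sign before testing significance keeps a strong effect in the opposite direction from being counted as a replication.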
To further examine replicability, we examined whether SNPs that reach genome-wide significance in a meta-analysis of the New Cohorts replicate in the Rietveld Cohorts. Applying the pruning algorithm described in Section 4.2.6 to the meta-analysis results for the New Cohorts resulted in 14 approximately independent SNPs. The results from this replication analysis are reported in Panel B of Table B6. The results are similar to those of the replication of the associations from the Rietveld Cohorts in the New Cohorts: the signs align for all 14 SNPs, and 12 SNPs replicate at P-value < 0.05 in the Rietveld Cohorts (none of them at genome-wide significance, but 5 at P-value < 10⁻⁵).
In the two replication analyses, the average effects in the replication samples are about 35% smaller than the estimated effects of the genome-wide significant associations in the corresponding discovery samples, roughly consistent with the degree of inflation one would expect from a Winner’s Curse correction of the sort described and performed in the Supplementary Information section 1.8.3 of Okbay, Beauchamp, et al. (2016).
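One way to compute an average shrinkage figure of this kind is sketched below. It assumes shrinkage is defined as one minus the ratio of the mean replication effect to the mean discovery effect, with all effects coded toward the discovery-increasing allele; the text does not state the exact definition, and the effect sizes shown are hypothetical values chosen for illustration only.

```python
def average_shrinkage(pairs):
    """Return 1 - mean(replication beta) / mean(discovery beta).

    `pairs` holds (discovery_beta, replication_beta) per SNP, with both
    effects coded toward the allele that raises the trait in discovery.
    """
    disc_mean = sum(d for d, _ in pairs) / len(pairs)
    rep_mean = sum(r for _, r in pairs) / len(pairs)
    return 1.0 - rep_mean / disc_mean

# Hypothetical (discovery, replication) effect sizes for four lead SNPs.
pairs = [(0.030, 0.020), (0.025, 0.015), (0.020, 0.0125), (0.025, 0.0175)]
shrinkage = average_shrinkage(pairs)
print(f"average shrinkage: {shrinkage:.0%}")
```

Averaging the per-SNP ratios instead would be an alternative definition; it gives more weight to SNPs with small discovery effects, so the two versions need not agree.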