Proxy-phenotype and genetic overlap analyses

In this section, we describe and report the results from tests of joint enrichment. These tests formally assess whether the SNPs showing the strongest evidence of association with one phenotype (for example, SWB) are more strongly associated with another phenotype (for example, DS) than expected by chance. The analyses are motivated by the evidence of strong genetic correlations between SWB, DS, and neuroticism (Bartels et al., 2013; Kendler & Myers, 2009; Weiss et al., 2008), including the results shown in Table A1.

Figure 3.11. Polygenic score prediction in HRS and NTR

Note: Predictive power of the polygenic score constructed from the subjective well-being GWAS results in two independent holdout cohorts (HRS and NTR). Predictive power is tested for subjective well-being, positive affect, life satisfaction, depressive symptoms, the Big Five personality traits (which include neuroticism), and height.

A. METHODOLOGY FOR PROXY-PHENOTYPE AND CROSS-PHENOTYPE ENRICHMENT ANALYSES

We use a two-stage approach that has been successfully applied in other contexts (Rietveld, Esko, et al., 2014). In the first stage, we conduct a meta-analysis of a first-stage “proxy phenotype” (e.g., SWB). In the second stage, we test the “lead/lead-proxy SNPs” (the SNPs showing the strongest evidence of association with the first-stage phenotype) for association with a second-stage phenotype (e.g., DS) in an independent (non-overlapping) sample. Note that in the analyses described in this section, relative to the GWAS on SWB, DS, and neuroticism reported in Sections 3.4.2 and 3.4.4, we omit cohorts from the first stage or second stage as needed to ensure that the samples in the two stages are non-overlapping.

In total, we perform three lookup exercises; see Table A9 for a summary overview of the analyses, including the cohort restrictions used to eliminate overlap between the stage-one and stage-two samples. In our analysis of DS, we apply the effective sample-size weighting scheme described in Section 3.4.4-F to the two case-control studies (GERA and PGC; dbGaP, 2015; Ripke et al., 2013) and continue to weight UKB by its sample size. As in the main analyses, we perform sample-size-weighted meta-analyses of SWB and neuroticism. For convenience, in what follows we name each lookup analysis in the format “First-stage phenotype → Second-stage phenotype”. In our first lookup exercise, the first- and second-stage phenotypes are, respectively, SWB and DS, or simply SWB → DS. Our second lookup is SWB → Neuroticism, and our third lookup is SWB → Height, where we treat height as a negative control.

We omit from the meta-analysis of the second-stage phenotype any SNPs that are missing for a substantial fraction of individuals; see the notes in Table A9 for details. For example, in the analysis where the second-stage phenotype is DS, we only consider SNPs available in all three DS cohorts (GERA, PGC and UKB; dbGaP, 2015; Ripke et al., 2013; Sudlow et al., 2015b). In the analysis where the second-stage phenotype is neuroticism, we only consider SNPs available in our two neuroticism cohorts, GPC (De Moor et al., 2015) and UKB (Sudlow et al., 2015), with a minimum total sample size of N = 90,000. Below, we describe the methodology we used to construct the lead SNPs and the tests of enrichment we performed.

B. GENERATING LEAD SNPS

Throughout, we apply a uniform methodology to define the lead SNPs that are subsequently tested for association, both jointly and individually, with the second-stage phenotype. For brevity, we illustrate the methodology used to construct our list of lead/lead-proxy SNPs using the example of SWB. However, the procedure used in the other two lookups is nearly identical, as explained in the relevant subsections below.

86 GENOME-WIDE ANALYSES OF SWB,DS AND NEUROTICISM

We began by identifying a set of approximately independent “SWB-associated SNPs” from the first-stage meta-analysis (or more generally, “first-stage-phenotype-associated SNPs”). We applied the clumping methodology described in Section 3.4.2, but with a p-value threshold for the index SNPs of 10⁻⁴. The more liberal p-value threshold was chosen prior to the study based on power calculations. As in our main analyses, we used the 1000G phase 1 reference sample (Abecasis et al., 2012) composed of Utah Residents (CEPH) with Northern and Western European Ancestry (CEU), Toscani in Italia (TSI), and British in England and Scotland (GBR) for clumping and for estimating linkage disequilibrium.

Applying the clumping procedure to the SWB meta-analysis results from the SWB → DS lookup generated 233 approximately independent SWB-associated lead SNPs. Of these, 85 were available in all three DS cohorts used in the second-stage analyses, whereas 148 were not. For each of these 148 SNPs, we examined whether any SNP satisfied the following conditions: (i) the SNP is in high LD (R² > 0.8) with the SWB-associated SNP, and (ii) the SNP is available both in the SWB meta-analysis and in all three cohorts contributing to the meta-analysis of DS. A proxy-lead SNP satisfying these criteria was available for 78 out of 148 SNPs (mean R² = 0.96, range 0.81 to 1.00). Whenever more than one proxy was available for a SNP, we chose as our proxy the SNP whose R² with the SWB-associated SNP was the greatest. Our final list of lead SNPs in the first lookup exercise therefore contains 85 + 78 = 163 SNPs.
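The proxy-selection rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the dictionary-based LD lookup, and the toy SNP identifiers are all hypothetical and are not taken from the original analysis pipeline.

```python
from typing import Dict, List, Optional, Set, Tuple


def pick_proxy(
    lead_snp: str,
    ld_r2: Dict[Tuple[str, str], float],
    first_stage_snps: Set[str],
    second_stage_snps: Set[str],
    min_r2: float = 0.8,
) -> Optional[str]:
    """Return the available SNP in highest LD (R2 > min_r2) with lead_snp, or None."""
    best, best_r2 = None, min_r2
    for (snp, candidate), r2 in ld_r2.items():
        if snp != lead_snp:
            continue
        # Condition (ii): the candidate must be present in both stages.
        available = candidate in first_stage_snps and candidate in second_stage_snps
        # Condition (i) plus the tie-break: keep the candidate with the largest R2.
        if available and r2 > best_r2:
            best, best_r2 = candidate, r2
    return best


def build_lead_list(
    lead_snps: List[str],
    ld_r2: Dict[Tuple[str, str], float],
    first_stage_snps: Set[str],
    second_stage_snps: Set[str],
) -> List[str]:
    """Keep lead SNPs available in the second stage; otherwise substitute a proxy."""
    final = []
    for snp in lead_snps:
        if snp in second_stage_snps:
            final.append(snp)
        else:
            proxy = pick_proxy(snp, ld_r2, first_stage_snps, second_stage_snps)
            if proxy is not None:
                final.append(proxy)
    return final


# Toy data (hypothetical identifiers and LD values).
toy_leads = ["rsA", "rsB", "rsC"]
toy_ld = {("rsB", "rsB1"): 0.85, ("rsB", "rsB2"): 0.96, ("rsC", "rsC1"): 0.70}
toy_first = {"rsA", "rsB", "rsC", "rsB1", "rsB2", "rsC1"}
toy_second = {"rsA", "rsB1", "rsB2", "rsC1"}
toy_final = build_lead_list(toy_leads, toy_ld, toy_first, toy_second)
```

In the toy data, `rsA` is kept directly, `rsB` is replaced by its best proxy, and `rsC` is dropped because its only candidate falls below the R² cutoff. This mirrors how 85 directly available SNPs and 78 proxies combine into the final list of 163.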

C. TESTING LEAD/PROXY-LEAD SNPS FOR ENRICHMENT

Because SWB, DS, and neuroticism phenotypes are all highly polygenic, it is of limited interest to test the null hypothesis that the p-value distribution of the lead/lead-proxy SNPs is uniform. We instead perform a non-parametric test of joint enrichment that probes whether the lead SNPs are more strongly associated with the second-stage phenotype than randomly chosen sets of SNPs with minor allele frequencies within one percentage point of the lead/proxy-lead SNP. To perform our test, we generated 1,000 matched SNPs for each of the Y lead/lead-proxy SNPs (e.g., Y = 163 in the SWB → DS analysis).
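A minimal sketch of the frequency-matching step is given below. It assumes that matched SNPs are drawn without replacement from all SNPs whose minor allele frequency lies within 0.01 of the lead SNP's; the text does not state the sampling scheme, so this choice, the function name `sample_matched_snps`, and the toy frequencies are assumptions made for illustration.

```python
import random
from typing import Dict, List, Optional


def sample_matched_snps(
    lead_snp: str,
    maf: Dict[str, float],
    n_matches: int = 1000,
    tol: float = 0.01,
    rng: Optional[random.Random] = None,
) -> List[str]:
    """Draw n_matches SNPs whose MAF is within tol of the lead SNP's MAF."""
    if rng is None:
        rng = random.Random(0)  # fixed seed so the sketch is reproducible
    target = maf[lead_snp]
    candidates = [
        snp for snp, freq in maf.items()
        if snp != lead_snp and abs(freq - target) <= tol
    ]
    if len(candidates) < n_matches:
        raise ValueError("not enough frequency-matched SNPs to sample from")
    # Assumption: sampling without replacement.
    return rng.sample(candidates, n_matches)


# Toy background set (hypothetical): 5,000 SNPs with MAFs from 0.0001 to 0.5.
toy_maf = {f"snp{i}": i / 10000 for i in range(1, 5001)}
toy_maf["lead"] = 0.25
toy_matched = sample_matched_snps("lead", toy_maf, n_matches=50)
```

With the default `n_matches=1000`, repeating the call for each of the Y lead SNPs would yield the Y×1000 matched SNPs used in the enrichment test.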

We then ranked the Y×1000 + Y SNPs by p-value and conducted a Mann-Whitney test (Nachar, 2008) of the null hypothesis that the p-values of the Y lead/lead-proxy SNPs are drawn from the same distribution as the p-values of the Y×1000 matched SNPs. To test the individual lead SNPs for experiment-wide significance, we examine whether any of the lead SNPs (or their high-LD proxies) are significantly associated with the second-stage phenotype at the Bonferroni-corrected significance level of 0.05/Y. Throughout, we adopt the convention of classifying an effect size as “in the predicted direction” if either (i) the signs are concordant and the two phenotypes are estimated to have a positive genetic correlation, or (ii) the signs are discordant and the phenotypes are estimated to have a negative genetic correlation.
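The rank-based test and the sign convention can be illustrated with a short, dependency-free Python sketch. Several choices here are assumptions rather than details from the text: the test is written as one-sided (lead p-values tend to be smaller), it uses the normal approximation without a tie correction to the variance, and the function names are hypothetical.

```python
import math
from typing import List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """Rank values from smallest (rank 1) to largest, averaging ranks over ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def mann_whitney_less(lead_p: Sequence[float], matched_p: Sequence[float]) -> float:
    """One-sided p-value for H1: lead p-values tend to be smaller than matched ones."""
    n1, n2 = len(lead_p), len(matched_p)
    ranks = average_ranks(list(lead_p) + list(matched_p))
    rank_sum_lead = sum(ranks[:n1])
    u_lead = rank_sum_lead - n1 * (n1 + 1) / 2
    mean_u = n1 * n2 / 2
    sd_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)  # no tie correction (assumption)
    z = (u_lead - mean_u) / sd_u
    return 0.5 * math.erfc(-z / math.sqrt(2))  # standard normal CDF at z


def in_predicted_direction(z_first: float, z_second: float, genetic_corr: float) -> bool:
    """Concordant signs are predicted under positive genetic correlation, discordant under negative."""
    product = z_first * z_second
    return product > 0 if genetic_corr > 0 else product < 0
```

In practice a library routine such as `scipy.stats.mannwhitneyu` could replace the hand-written test; the version above is only meant to make the ranking step explicit.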

D. RESULTS FROM PROXY-PHENOTYPE AND CROSS-PHENOTYPE ENRICHMENT ANALYSES

Are SWB-Associated SNPs Enriched for Depression?

Figure 3.5a is a two-way scatterplot of the z-statistics of the lead/lead-proxy SNPs in SWB (horizontal axis) against DS (vertical axis). To aid interpretation, we choose the reference allele to be the SWB-increasing variant, so all z-statistics are by construction positive for the first-stage phenotype. On the basis of the negative genetic correlation reported in Table A1 (𝜌̂ = −0.81), we expect plotted points to lie disproportionately below the dashed horizontal line at zero (i.e. negative z-statistics). That is indeed what we find: 116 out of 163 (71%) signs are in the expected direction. Moreover, for 19 out of the 20 SNPs that are nominally significantly associated (p < 0.05) in the analysis of DS, the association is in the predicted direction.

Three lead/proxy-lead SNPs reach p-value < 10⁻⁷ in the SWB meta-analysis. Two of these are nominally associated with DS: rs12517563 (p = 0.007) and rs2075677 (p = 0.0149). Two other SNPs are significantly associated with depressive symptoms at the Bonferroni-corrected p-value threshold of 0.05/163 = 0.000307. These are rs6904596 (p = 9.78×10⁻⁵) and rs4481363 (p = 3.06×10⁻⁴). The association with depressive symptoms is in the predicted direction for all four SNPs (rs12517563, rs2075677, rs6904596, rs4481363): the SWB-increasing allele is estimated to reduce depression risk. Supplementary Table 16 in Okbay, Baselmans, et al. (2016) lists the association results for the lead/proxy-lead SWB-associated SNPs in the first-stage SWB meta-analysis and the second-stage depressive symptoms meta-analysis conducted in an independent sample. The SNPs are ordered by the p-value attained in the SWB analysis (from smallest to largest). Among SWB-associated SNPs with p-value < 10⁻⁵, 80% have signs in the predicted direction. Our test of joint enrichment rejects the null of no enrichment relative to the expected level for a randomly sampled set of SNPs matched on allele frequency (p = 0.033).
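As a worked check of the Bonferroni arithmetic, the snippet below recomputes the threshold 0.05/Y for Y = 163 and applies it to the four second-stage p-values quoted above. The variable names are illustrative only.

```python
ALPHA = 0.05
N_LEAD = 163  # number of lead/lead-proxy SNPs in the SWB → DS lookup

bonferroni_threshold = ALPHA / N_LEAD  # roughly 0.000307

# Second-stage (DS) p-values as reported in the text.
ds_p_values = {
    "rs12517563": 0.007,
    "rs2075677": 0.0149,
    "rs6904596": 9.78e-5,
    "rs4481363": 3.06e-4,
}

# SNPs passing the experiment-wide threshold.
bonferroni_significant = sorted(
    snp for snp, p in ds_p_values.items() if p < bonferroni_threshold
)
# SNPs that are only nominally significant (p < 0.05 but above the threshold).
nominal_only = sorted(
    snp for snp, p in ds_p_values.items() if bonferroni_threshold <= p < ALPHA
)
```

Only rs6904596 and rs4481363 fall below the corrected threshold, the latter narrowly, while rs12517563 and rs2075677 are nominally significant only.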

Are SWB-Associated SNPs Enriched for Neuroticism?

Applying the same clumping algorithm, we identified 170 lead/lead-proxy SNPs from the first-stage analysis of SWB. The results from this lookup analysis are summarized in Figure 3.5b, where the reference allele is again chosen to be the SWB-increasing allele. Given the negative genetic correlation reported in Table A1 (𝜌̂ = −0.75), we expect z-statistics to lie disproportionately below the dashed horizontal line. Indeed, 129 out of 170 signs (76%) are in the predicted direction in the neuroticism results. Moreover, all 28 SNPs that are nominally significant in the neuroticism analysis have the predicted sign. None of the three SNPs reaching p-value < 10⁻⁷ in the first-stage analysis are associated with neuroticism. However, four SNPs are significant at the Bonferroni-corrected significance threshold of 0.05/170 = 0.00029. These are rs10838738 (p = 2.6×10⁻⁵), rs6904596 (p = 4.2×10⁻⁵), rs4481363 (p = 5.7×10⁻⁵) and rs10774909 (p = 7.3×10⁻⁵). In all four cases, the effects are in the expected direction. For complete results, see Supplementary Table 17 in Okbay, Baselmans, et al. (2016). Finally, our test of joint enrichment rejects the null of no enrichment relative to the expected level for a randomly sampled set of SNPs matched on allele frequency (p = 10⁻⁴).

Negative-Control Analyses: Are SWB-Associated SNPs Enriched for Height?

For our negative-control analyses, our first-stage analyses of SWB were performed omitting cohorts that contributed to the GIANT consortium’s 2010 study of height (Lango Allen et al., 2010), leaving us with a first-stage discovery sample of N = 229,853. Applying our methodology gives 181 lead/lead-proxy SNPs. Our second-stage lookup is conducted using publicly available summary statistics from the height GWAS (N = 133,859). We find no evidence that the proportion of SNPs for which the allele estimated to increase SWB is also the allele estimated to increase height is statistically distinguishable from 50% (p = 0.373), and the Mann-Whitney test of joint enrichment fails to reject the null hypothesis (p = 0.454).
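One way to test whether a proportion of concordant signs differs from 50% is an exact two-sided binomial (sign) test, sketched below. The text does not state which test produced the reported p-value, so this is an assumption, and the function name `sign_test_p` is hypothetical.

```python
import math


def sign_test_p(n_concordant: int, n_total: int) -> float:
    """Exact two-sided binomial p-value for H0: proportion of concordant signs = 0.5."""
    smaller = min(n_concordant, n_total - n_concordant)
    # Probability of a count at least as extreme in one tail under Binomial(n, 0.5).
    tail = sum(math.comb(n_total, i) for i in range(smaller + 1)) / 2 ** n_total
    # The null distribution is symmetric, so double one tail and cap at 1.
    return min(1.0, 2 * tail)
```

For example, `sign_test_p(116, 163)` would apply the test to the SWB → DS sign counts reported earlier. The height lookup would need the count of concordant signs among its 181 SNPs, which is not reported in this section.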

In document Essays on Genetics and the Social Sciences (Page 91-95)