Examples of Power Approximations for 1 and 2-Stage Designs

2.2 Methods

2.2.4 Examples of Power Approximations for 1 and 2-Stage Designs

We calculated power for three models to demonstrate the difference in power between the competing approaches. For all three models, we assumed a multiplicative model with a GRR = 1.3 and a susceptibility allele frequency fD = 0.3 in the general population. In addition, for all three models we performed the calculations assuming that study controls (in stage 2) either had or had not been screened for disease. Model 1 was a GWA scan on M = 500,000 SNPs for a study sample of NA = 2,000 study cases and NU = 2,000 study controls. Model 2 was identical to Model 1, except that there were fewer study controls, NU = 1,000. Model 3 was designed to mimic a targeted follow-up study to a previous GWA study. For Model 3, M = 7,500 and NA = NU = 1,250. For all three models we considered a wide range of disease prevalence values, K = 1 × 10^−4, 0.01, 0.05, 0.1, 0.25, and 0.5, and we assumed available genotype data on samples of NPU = 1,000, 3,000, 5,000, and 10,000 public controls. We calculated power for the single-stage designs using only study controls, only public controls, or both control samples combined. We also calculated the power for the optimal two-stage replication designs using one- and two-sided hypothesis tests in stage 2. For each optimal two-stage model we identified the optimal platform and the proportion of cases, πcases, genotyped in stage 1. Finally, to test how the power of the one- and two-stage designs is affected by different combinations of disease allele frequency, disease prevalence, and GRR, we calculated power for Model 1 (assuming NPU = 5,000) using disease susceptibility allele frequencies of fD = 0.1 and 0.5, disease prevalences of K = 0.01, 0.1, and 0.25, and GRRs ranging from 1.1 to 1.5.
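As an illustration of the single-stage calculation for Model 1 with study controls only, the following minimal sketch approximates power with a two-sided normal test on allele frequencies. It is not the calculation used in this work: the allelic test stands in for the Cochran-Armitage trend test, the genome-wide significance level of 0.05/M is an assumed Bonferroni threshold, Hardy-Weinberg equilibrium in the general population is assumed, and all function names are hypothetical.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()

def allele_frequencies(f_d, grr, prevalence):
    """Risk-allele frequency in cases and in screened (disease-free) controls."""
    # Multiplicative model: genotype relative risks are 1, grr, grr**2,
    # so each risk allele is over-represented in cases by a factor grr.
    p_case = f_d * grr / (1 - f_d + f_d * grr)
    # The population frequency is a prevalence-weighted mix of cases and non-cases.
    p_screened = (f_d - prevalence * p_case) / (1 - prevalence)
    return p_case, p_screened

def allelic_test_power(p_case, n_case, p_ctrl, n_ctrl, alpha):
    """Normal-approximation power of a two-sided allele-frequency comparison."""
    se = (p_case * (1 - p_case) / (2 * n_case)
          + p_ctrl * (1 - p_ctrl) / (2 * n_ctrl)) ** 0.5
    ncp = (p_case - p_ctrl) / se
    z_crit = -STD_NORMAL.inv_cdf(alpha / 2)
    return STD_NORMAL.cdf(ncp - z_crit) + STD_NORMAL.cdf(-ncp - z_crit)

# Model 1 settings from the text; ALPHA is an assumption (Bonferroni over M SNPs).
M, N_A, N_U = 500_000, 2_000, 2_000
F_D, GRR = 0.3, 1.3
ALPHA = 0.05 / M

def model1_power(prevalence, screened):
    """Power for Model 1 using study controls only."""
    p_case, p_screened = allele_frequencies(F_D, GRR, prevalence)
    # Unscreened controls are treated as a sample from the general population.
    p_ctrl = p_screened if screened else F_D
    return allelic_test_power(p_case, N_A, p_ctrl, N_U, ALPHA)

for K in (1e-4, 0.01, 0.05, 0.1, 0.25, 0.5):
    print(f"K={K:g}: screened={model1_power(K, True):.3f}, "
          f"unscreened={model1_power(K, False):.3f}")
```

Under these assumptions, screening matters little for rare diseases, because unscreened controls contain few cases, and matters more as K grows.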

In the above power calculations, for the two-stage replication approach we chose the follow-up platform and the proportion of cases genotyped in stage 1 that optimized power under a specific alternative hypothesis; that is, the relative risk and disease allele frequency (in the general population) were explicitly defined. In practice the true alternative model is unknown. A desirable quality of any two-stage approach is that the optimal choice of follow-up platform and the optimal proportion of cases genotyped in stage 1 are robust to the underlying relative risk and disease allele frequency. We performed additional power calculations to assess the robustness of the choice of follow-up platform and of the proportion of cases, πcases, genotyped in stage 1 across a range of alternative models. Specifically, assuming a GWA study on M = 500,000 SNPs using NA = 2,000 study cases, NU = 2,000 screened study controls, and NPU = 5,000 public controls for a multiplicative trait with a prevalence K = 0.1, we calculated the maximum power and the corresponding proportion of cases genotyped in stage 1 across a range of relative risks (GRR = 1.25-1.5) and disease allele frequencies (fD = 0.1, 0.3, and 0.5) based on follow-up platforms containing 100, 375, 1,500, 7,500, and 16,500 SNPs. In addition, assuming a relative risk of 1.3 and a disease allele frequency of 0.3, we calculated power across a range of values of πcases for each of the 100, 1,500, 7,500, and 16,500 SNP follow-up platforms to assess the decrease in power when using a higher or lower proportion of cases in stage 1 compared to the optimal proportion for each platform.
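The search for the optimal proportion of cases can be sketched as a grid search over πcases for each follow-up platform. The sketch below rests on several assumptions that are not taken from the text: stage 1 compares a fraction πcases of the cases with the public controls, stage 2 compares the remaining cases with the screened study controls, the stage 1 threshold equals platform size divided by M, stage 2 uses a one-sided Bonferroni threshold of 0.05 over the platform SNPs, and an allelic normal approximation replaces the trend test. All names are hypothetical.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()

def upper_tail_power(p_case, n_case, p_ctrl, n_ctrl, alpha_one_sided):
    """P(allele-frequency z statistic exceeds the one-sided critical value)."""
    se = (p_case * (1 - p_case) / (2 * n_case)
          + p_ctrl * (1 - p_ctrl) / (2 * n_ctrl)) ** 0.5
    ncp = (p_case - p_ctrl) / se
    # The critical value is -inv_cdf(alpha), so power = Phi(ncp + inv_cdf(alpha)).
    return STD_NORMAL.cdf(ncp + STD_NORMAL.inv_cdf(alpha_one_sided))

def two_stage_power(pi_cases, platform_size, f_d=0.3, grr=1.3, prevalence=0.1,
                    n_markers=500_000, n_cases=2_000, n_study_ctrl=2_000,
                    n_public_ctrl=5_000, alpha=0.05):
    """Approximate power of a two-stage replication design (assumed layout)."""
    p_case = f_d * grr / (1 - f_d + f_d * grr)
    p_screened = (f_d - prevalence * p_case) / (1 - prevalence)
    pi_markers = platform_size / n_markers   # assumed stage 1 threshold
    n_stage1 = pi_cases * n_cases
    # Stage 1: subset of cases vs. unscreened public controls; the SNP must
    # pass the two-sided threshold pi_markers in the risk direction.
    pass_stage1 = upper_tail_power(p_case, n_stage1, f_d, n_public_ctrl,
                                   pi_markers / 2)
    # Stage 2: remaining cases vs. screened study controls, one-sided test.
    replicate = upper_tail_power(p_case, n_cases - n_stage1, p_screened,
                                 n_study_ctrl, alpha / platform_size)
    # The two stages use disjoint samples, so the probabilities multiply.
    return pass_stage1 * replicate

def best_proportion(platform_size, **model):
    """Grid search over pi_cases; returns (maximum power, best pi_cases)."""
    grid = [i / 100 for i in range(5, 100, 5)]
    return max((two_stage_power(pi, platform_size, **model), pi) for pi in grid)

for platform in (100, 375, 1_500, 7_500, 16_500):
    power, pi = best_proportion(platform)
    print(f"{platform:>6} SNPs: pi_cases={pi:.2f}, power={power:.3f}")
```

Calling `best_proportion(platform, grr=..., f_d=...)` across the alternatives of interest shows how far the optimal πcases moves when the assumed model is wrong, which is the robustness question posed above.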

In the supplementary material, we performed additional power calculations using the general model (co-dominant) test of association (two-degree-of-freedom Chi-square test) under the same multiplicative alternative hypothesis models we considered for the Cochran-Armitage trend test. In addition, power was also calculated for several dominant and recessive inheritance models using the single-degree-of-freedom Chi-square test.

2.2.5 Impact on Power of Ancestrally Poorly-Matched Public Controls and Batch Genotype Effects

In the previous calculations, we did not consider the impact on power of ancestrally poorly-matched public controls and batch genotype effects, both of which can occur when samples of cases and public controls from different populations are genotyped at different times. We evaluated the impact of these factors for a study design that included 2,000 study cases, 2,000 study controls, and 5,000 public controls for a multiplicative disease model with susceptibility allele frequency fD = 0.3, K = 0.10, and GRR = 1.3. For ancestrally poorly-matched public controls (with respect to our study cases), we measured the reduction in power by decreasing the effective sample size of the public control sample. Specifically, for the purpose of these calculations, we assumed that a fraction of public controls (we considered a range from 0% to 90%) would be removed from consideration after genotyping the study cases (when ancestry can be compared between study cases and public controls using genome-wide data) and prior to performing association testing. We additionally assumed that the proportion of cases genotyped in stage 1 of our two-stage replication design was optimized and chosen prior to the removal of any public controls. Power calculations were also performed for the two one-stage designs that utilize public controls after eliminating ancestrally poorly-matched public controls.
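The effect of discarding poorly-matched public controls can be illustrated for the one-stage design that uses public controls only. This is a minimal sketch, not the calculation performed here: it uses an allelic normal approximation, treats public controls as unscreened, and assumes M = 500,000 SNPs with a Bonferroni level of 0.05/M, neither of which is stated for this design in the text. The function name is hypothetical.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()

F_D, GRR = 0.3, 1.3
N_CASES, N_PUBLIC = 2_000, 5_000
ALPHA = 0.05 / 500_000   # assumed genome-wide significance level

def power_after_removal(fraction_removed):
    """One-stage power (cases vs. public controls) after dropping a fraction
    of public controls as ancestrally poorly matched."""
    n_public = N_PUBLIC * (1 - fraction_removed)   # effective sample size
    p_case = F_D * GRR / (1 - F_D + F_D * GRR)
    se = (p_case * (1 - p_case) / (2 * N_CASES)
          + F_D * (1 - F_D) / (2 * n_public)) ** 0.5
    ncp = (p_case - F_D) / se
    z_crit = -STD_NORMAL.inv_cdf(ALPHA / 2)
    return STD_NORMAL.cdf(ncp - z_crit) + STD_NORMAL.cdf(-ncp - z_crit)

for pct in range(0, 100, 10):
    print(f"{pct:>2}% removed: power={power_after_removal(pct / 100):.3f}")
```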

To help assess the impact of batch genotype effects on our proposed two-stage design, we calculated power using more stringent significance thresholds for stage 1. We assumed that batch genotype effects in stage 1 would lead to an excess of SNPs with low p-values under the null hypothesis, and that the SNP associated with disease was not subject to batch genotype effects. Under these assumptions, the impact of batch genotype effects was that truly associated SNPs were required to reach a higher significance level in stage 1 than anticipated in order to be included in stage 2 genotyping. We calculated power in stage 1 of our two-stage replication design by varying the required significance threshold away from πmarkers (the p-value required in stage 1 for a SNP to be genotyped in stage 2), from 0.99 × πmarkers down to 0.1 × πmarkers. The proportion of cases genotyped in stage 1 was optimized under the erroneous assumption of no batch genotype effects (i.e., πmarkers was assumed to be the significance threshold required for a SNP to be subsequently genotyped in stage 2). Power calculations that included batch genotype effects were not performed for the three one-stage designs.
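A minimal sketch of this sensitivity analysis scales the stage 1 threshold by a factor between 0.99 and 0.1 and recomputes the probability that the associated SNP is carried forward. Several values are assumptions chosen only for illustration: a 1,500-SNP follow-up platform, πmarkers equal to platform size divided by M = 500,000, a fixed πcases = 0.5 rather than the optimized value, and an allelic normal approximation. The names are hypothetical.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()

F_D, GRR = 0.3, 1.3
N_CASES, N_PUBLIC = 2_000, 5_000
M = 500_000              # assumed number of stage 1 SNPs
PLATFORM_SIZE = 1_500    # hypothetical follow-up platform
PI_CASES = 0.5           # hypothetical fixed proportion of cases in stage 1
PI_MARKERS = PLATFORM_SIZE / M   # assumed nominal stage 1 threshold

def stage1_power(threshold_scale):
    """Probability the associated SNP passes stage 1 (in the risk direction)
    when batch effects shrink the threshold to threshold_scale * PI_MARKERS."""
    p_case = F_D * GRR / (1 - F_D + F_D * GRR)
    n_stage1 = PI_CASES * N_CASES
    se = (p_case * (1 - p_case) / (2 * n_stage1)
          + F_D * (1 - F_D) / (2 * N_PUBLIC)) ** 0.5
    ncp = (p_case - F_D) / se
    effective_threshold = threshold_scale * PI_MARKERS
    z_crit = -STD_NORMAL.inv_cdf(effective_threshold / 2)
    return STD_NORMAL.cdf(ncp - z_crit)

for scale in (0.99, 0.75, 0.5, 0.25, 0.1):
    print(f"threshold = {scale:.2f} x pi_markers: "
          f"stage 1 power = {stage1_power(scale):.3f}")
```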

2.2.6 Example of Genotyping Costs for Different Genotype

In document Novel statistical methods for the study design and analysis of genome-wide association studies (Page 46-49)