• No results found

How to get accurate sample size and power with nquery Advisor R

N/A
N/A
Protected

Academic year: 2021

Share "How to get accurate sample size and power with nquery Advisor R"

Copied!
71
0
0

Full text

(1)

How to get accurate sample size and power with nQuery Advisor

r

Brian Sullivan

Statistical Solutions Ltd.

ASA Meeting, Chicago, March 2007

(2)

Sample Size Two group t-test

χ2-test

Survival Analysis 2 × 2 Crossover Odds Ratio

(3)

“What sample size do I need?”

Formulate the study

Specify analysis parameters

Specify effect size for test

Compute sample size or power

Sensitivity analysis

Choose sample size

(4)

“What sample size do I need?”

Formulate the study

Specify analysis parameters

Specify effect size for test

Compute sample size or power

Sensitivity analysis

Choose sample size

(5)

“What sample size do I need?”

Formulate the study

Specify analysis parameters

Specify effect size for test

Compute sample size or power

Sensitivity analysis

Choose sample size

(6)

“What sample size do I need?”

Formulate the study

Specify analysis parameters

Specify effect size for test

Compute sample size or power

Sensitivity analysis

Choose sample size

(7)

“What sample size do I need?”

Formulate the study

Specify analysis parameters

Specify effect size for test

Compute sample size or power

Sensitivity analysis

Choose sample size

(8)

“What sample size do I need?”

Formulate the study

Specify analysis parameters

Specify effect size for test

Compute sample size or power

Sensitivity analysis

Choose sample size

(9)

Formulate the study

Type of analysis being undertaken

Clinical trial

Laboratory study

Survey research

What is being measured?

How is it being analyzed?

(10)

Formulate the study

Type of analysis being undertaken

Clinical trial

Laboratory study

Survey research

What is being measured?

How is it being analyzed?

(11)

Formulate the study

Type of analysis being undertaken

Clinical trial

Laboratory study

Survey research

What is being measured?

How is it being analyzed?

(12)

Specify analysis parameters

What parameters are involved?

Significance level

Test type: one or two sided

(13)

Specify analysis parameters

What parameters are involved?

Significance level

Test type: one or two sided

(14)

Specify effect size

Use information from previous/similar studies

What effect size would be important to detect?

(15)

Compute sample size or power

Specify a value for sample size or power and perform the calculation solving for the other

(16)

Recalculate sample size for different scenarios about effect size

Recalculate sample size for different levels of power

Compare different sample sizes

Plot sample size vs power

(17)

Choose sample size

Based on the trade off between power attainable for the desired effect size and sample size feasibility, choose a sample size for the study

(18)

Two group t-test

Study Design

Does a new drug reduce anemia in elderly women after hip fracture?

Two group, randomized, parallel, double-blind study

Drug for two weeks, three times per week

New drug vs placebo

Equal n’s

(19)

Two group t-test

Study Design

Does a new drug reduce anemia in elderly women after hip fracture?

Two group, randomized, parallel, double-blind study

Drug for two weeks, three times per week

New drug vs placebo

Equal n’s

(20)

Outcome and Analysis

Primary outcome: Mean change in hematocrit level (percentage of red blood cells in the blood) from pre-treatment to post-treatment

Compare mean change in hematocrit level between two groups using two-group t-test

Null Hypothesis: Mean change is the same in both groups

(21)

Outcome and Analysis

Primary outcome: Mean change in hematocrit level (percentage of red blood cells in the blood) from pre-treatment to post-treatment

Compare mean change in hematocrit level between two groups using two-group t-test

Null Hypothesis: Mean change is the same in both groups

(22)

Outcome and Analysis

Primary outcome: Mean change in hematocrit level (percentage of red blood cells in the blood) from pre-treatment to post-treatment

Compare mean change in hematocrit level between two groups using two-group t-test

Null Hypothesis: Mean change is the same in both groups

(23)

Calculation Parameters

Significance level, α

Test type: one or two sided, s

Expected difference, d = |µ1− µ2|

Standard deviation, σ

Effect size, δ = d /σ

Power, 1 − β

Sample size, n

(24)

Previous Analysis

There have been four previous studies from which data can be drawn.

Study n Population Mean Hct

1 6 Elderly women with fracture 32.3%

2 32 Elderly women no fracture 33.5%

3,4 Change in placebo 0.0

3,4 Change in drug groups 2.7 - 4.7 Expected change: 0% in placebo group

2 - 2.4% in drug group

(25)

Previous Analysis

There have been four previous studies from which data can be drawn.

Study n Population Mean Hct

1 6 Elderly women with fracture 32.3%

2 32 Elderly women no fracture 33.5%

3,4 Change in placebo 0.0

3,4 Change in drug groups 2.7 - 4.7 Expected change: 0% in placebo group

2 - 2.4% in drug group

(26)

Previous Analysis

From previous studies the standard deviation of hematocrit levels were between 3% and 6%.

Here we need the standard deviation for the change in hematocrit level and there is less information on this.

From previous studies it can be estimated to be between 1.5% to 2.5%.

A reasonable estimate would be 2%.

(27)

Example

Using the following details we can proceed with the calculation:

Parameter Value

Significance level .05

Test type 2 sided

Expected difference 2 - 2.4 Standard deviation 2 Illustrated in nQuery Advisor file:hematocrit.nqa.

(28)

Sensitivity Analysis

Figure:Plot of Sample Size vs Power

(29)

Choose Sample Size

A sample size of 20 in each group will have 92% power to detect a difference in means of 2.200 assuming that the common standard deviation is 2.000 using a two group t-test with a 0.050 two-sided significance level.

(30)

χ

2

-test

Study Design

Does a certain diet drug combination increase the likelihood of requiring valve surgery?

Case-Control study

Cases: patients who had valve surgery in defined two year period

Controls: matched using age, gender and body-mass index

(31)

χ

2

-test

Study Design

Does a certain diet drug combination increase the likelihood of requiring valve surgery?

Case-Control study

Cases: patients who had valve surgery in defined two year period

Controls: matched using age, gender and body-mass index

(32)

Outcome and Analysis

Primary outcome: Proportion of cases and controls that have taken the diet drugs

Compare proportions of cases and controls that have taken diet drugs using McNemar’s χ2test for paired responses

(33)

Outcome and Analysis

Primary outcome: Proportion of cases and controls that have taken the diet drugs

Compare proportions of cases and controls that have taken diet drugs using McNemar’s χ2test for paired responses

(34)

Calculation Parameters

Significance level, α

Test type: one or two sided, s

Expected difference in proportions, δ = |π1− π2|

Proportion of discordant pairs, η = π10+ π01

Power, 1 − β

Sample size, n

(35)

Previous Analysis

It is expected that about 5% of controls would have taken the diet drugs.

The investigators are interested in detecting an increase to 10% among the cases with 90% power.

(36)

Previous Analysis

A two by two table of proportions of cases and controls using the diet drugs is postulated as:

Case yes Case no Row proportion

Control yes 0.03 0.02 0.05

Control no 0.07 0.88 0.95

Column proportion 0.10 0.90 1.00

(37)

Example

Using the following details we can proceed with the calculation:

Parameter Value

Significance level 0.05

Test type 2 sided

Difference in proportions 0.05 Proportion of discordant pairs 0.09 Illustrated in nQuery Advisor file:surgery.nqa.

(38)

Sensitivity Analysis

Figure:Plot of Difference in Proportions vs Sample Size

(39)

Choose Sample Size

A sample size of 342 pairs will have 90% power to detect a difference in proportions of 0.050 when the proportion of discordant pairs is expected to be 0.090 and the method of analysis is a McNemar’s test of equality of paired proportions with a 0.050 two-sided significance level.

(40)

Survival Analysis

Study Design

Does TIPS surgery for bleeding esophageal varices prolong life over shunt surgery?

Survival analysis log-rank test

Randomization into two parallel groups

All patients must be followed for a minimum of one year

(41)

Survival Analysis

Study Design

Does TIPS surgery for bleeding esophageal varices prolong life over shunt surgery?

Survival analysis log-rank test

Randomization into two parallel groups

All patients must be followed for a minimum of one year

(42)

Outcome and Analysis

Primary outcome: Time to death due to any cause

Groups compared using survival analysis log-rank test

Null Hypothesis: Time to death is the same in both groups

(43)

Outcome and Analysis

Primary outcome: Time to death due to any cause

Groups compared using survival analysis log-rank test

Null Hypothesis: Time to death is the same in both groups

(44)

Outcome and Analysis

Primary outcome: Time to death due to any cause

Groups compared using survival analysis log-rank test

Null Hypothesis: Time to death is the same in both groups

(45)

Calculation Parameters

Significance level, α

Test type: one or two sided, s

Group proportions at time t, π1, π2

Hazard ratio, h = ln(πln(π1)

2)

Power, 1 − β

Sample size, n

(46)

Previous Analysis

Results from previous studies give an indication of the expected proportions.

It is expected that 65% of patients will survive one year after shunt surgery.

TIPS surgery would be considered to be markedly worse if only 45% of patients survived one year.

(47)

Example

Using the following details we can proceed with the calculation:

Parameter Value

Significance level 0.05

Test type 2 sided

Group 1 proportion 0.65 Group 2 proportion 0.45 Illustrated in nQuery Advisor file:surgery2.nqa.

(48)

Sensitivity Analysis

Figure:Plot of Hazard vs Sample Size

(49)

Choose Sample Size

When the sample size in each group is 131, with a Total number of events required, E, of 110, a 0.050 level two-sided log-rank test for equality of survival curves will have 90% power to detect the difference between a Group 1 proportion π1at time t of 0.650 and a Group 2 proportion π2at time t of 0.450 (a constant hazard ratio of 0.539); this assumes no dropouts before time t.

(50)

2 × 2 Crossover

Study Design

Does a new drug have similar steady state blood levels to that of a standard drug?

Two-way crossover design

Two period, two treatment AB, BA

New drug vs standard drug

(51)

2 × 2 Crossover

Study Design

Does a new drug have similar steady state blood levels to that of a standard drug?

Two-way crossover design

Two period, two treatment AB, BA

New drug vs standard drug

(52)

Outcome and Analysis

Primary outcome: Mean steady state blood level

Interested to know will the mean steady state blood level for the new drug be within 15% of that of the standard

Two group t-test (TOST) for equivalence in means

Null Hypotheses: Difference ≥ upper level and Difference ≤ lower level

(53)

Outcome and Analysis

Primary outcome: Mean steady state blood level

Interested to know will the mean steady state blood level for the new drug be within 15% of that of the standard

Two group t-test (TOST) for equivalence in means

Null Hypotheses: Difference ≥ upper level and Difference ≤ lower level

(54)

Outcome and Analysis

Primary outcome: Mean steady state blood level

Interested to know will the mean steady state blood level for the new drug be within 15% of that of the standard

Two group t-test (TOST) for equivalence in means

Null Hypotheses: Difference ≥ upper level and Difference ≤ lower level

(55)

Outcome and Analysis

Primary outcome: Mean steady state blood level

Interested to know will the mean steady state blood level for the new drug be within 15% of that of the standard

Two group t-test (TOST) for equivalence in means

Null Hypotheses: Difference ≥ upper level and Difference ≤ lower level

(56)

Calculation Parameters

Significance level (one sided), α

Lower equivalence limit for µT − µS, ∆L

Upper equivalence limit for µT − µS, ∆U

Expected difference in means, d = µT − µS

Crossover ANOVA,√ MSE

Standard deviation of differences, σd

Power, 1 − β

Sample size, n

(57)

Previous Analysis

Previous study on standard drug indicated a mean steady state blood level of 16

Pilot study of two different doses for new drug showed standard deviation of about 2.8 for each

The correlation between blood levels for the two periods was 0.55

(58)

Example

Using the following details we can proceed with the calculation:

Parameter Value

Significance level 0.05

Lower limit -2.4

Upper limit 2.4

Expected difference 0.0 Standard deviation 2.8

Correlation 0.55

Illustrated in nQuery Advisor file:blood.nqa.

(59)

Sensitivity Analysis

Figure:Plot of Sample Size vs Power

(60)

Choose Sample Size

When the sample size in each sequence group is 7, (a total sample size of 14) a crossover design will have 80% power to reject both the null hypothesis that the test mean minus the standard mean is below -2.400 and the null hypothesis that the test mean minus the standard mean is above 2.400 i.e., that the test and standard are not equivalent, in favor of the alternative hypothesis that the means of the two treatments are equivalent, assuming that the expected difference in means is 0.000, the Crossover ANOVA√

MSE is 1.878 (the Standard deviation of differences, σd, is 2.656) and that each test is made at the 5.0% level.

(61)

Odds Ratio

Study Design

Does an educational program for expectant mothers reduce the risk of preterm birth?

two group study, odds ratio

cases vs controls

cases receive education

equal n’s

(62)

Odds Ratio

Study Design

Does an educational program for expectant mothers reduce the risk of preterm birth?

two group study, odds ratio

cases vs controls

cases receive education

equal n’s

(63)

Outcome and Analysis

Primary outcome: Proportion of preterm births

The educational program would be of interest if it reduced the rate of preterm births to produce and odds ratio of 0.5

How wide is the 95% confidence interval for the odds ratio likely to be if 500 women are assigned to each group?

(64)

Outcome and Analysis

Primary outcome: Proportion of preterm births

The educational program would be of interest if it reduced the rate of preterm births to produce and odds ratio of 0.5

How wide is the 95% confidence interval for the odds ratio likely to be if 500 women are assigned to each group?

(65)

Outcome and Analysis

Primary outcome: Proportion of preterm births

The educational program would be of interest if it reduced the rate of preterm births to produce and odds ratio of 0.5

How wide is the 95% confidence interval for the odds ratio likely to be if 500 women are assigned to each group?

(66)

Calculation Parameters

Significance level, α

Test type: one or two sided, s

Group proportions, π1, π2

Odds ratio, ψ = ππ2(1−π1)

1(1−π2)

Log odds ratio, B = ln(ψ)

Distance from B to limit, ω

Sample size, n

(67)

Previous Analysis

Previous studies of similar populations have found preterm birth rates between 7% and 9%.

A reasonable estimate would be 8%.

(68)

Example

Using the following details we can proceed with the calculation:

Parameter Value

Significance level 0.05

Test type 2 sided

Control proportion 0.08

Odds ratio 0.5

n per group 500

Illustrated in nQuery Advisor file:education.nqa.

(69)

Sensitivity Analysis

Figure:Plot of Sample Size vs Distance from B to limit

(70)

Choose Sample Size

When the sample size in each group is 500, the Control proportion, π1, is 0.080, and the Education proportion, π2, is 0.042, a two-sided 95.0% confidence interval for a ln(odds ratio) expected to be -0.693 will extend 0.545 from the

observed ln(odds ratio) (corresponding to confidence limits of 0.290 and/or 0.862 for an odds ratio of 0.500).

(71)

Bibliography

Elashoff J.D. nQuery Advisor Version 6.01 User’s Guide. (2005) Los Angeles, CA.

Dixon, W.J., Massey, F.J. Introduction to Statistical Analysis. 4th Edition McGraw-Hill (1983)

O’Brien, R.G., Muller, K.E. Applied Analysis of Variance in Behavioral Science Marcel Dekker, New York (1983) pp. 297-344

Miettinen, O.S. "On the matched-pairs design in the case of all-or-none responses" Biometrics 24(1986) pp.

339-352

Freedman, L.S. "Tables of the number of patients required in clinical trials using the logrank test" Statistics in Medicine 1(1982) pp. 121-129

Collett, D. Modelling Survival Data in Medical Research Chapman and Hall (1994) Section 9.2

Chow, S.C, Liu, J.P. Design and Analysis of Bioavailability and Bioequivalence Studies Marcel Dekker, Inc.

(1992)

Schuirmann, D.J. "A comparison of the two one-sided tests procedure and the power approach for assessing the equivalence of average bioavailability" Journal of Pharmacokinetics and Biopharmaceutics 15(1987) pp.

657-680

Phillips, K.E. "Power of the two one-sided tests procedure in bioequivalence" Journal of Pharmacokinetics and Biopharmaceutics 18(1990) pp. 137-143

Owen, D.B. "A special case of a bivariate non-central t-distribution" Biometrika 52(1965) pp. 437-446

O’Neill, R.T. "Sample size for estimation of the odds ratio in unmatched case-control studies" American Journal of Epidemiology 120(1984) pp. 145-153

References

Related documents

Reactions (of a given biochemical system) influence each other through their products — they may contain entities which are reactants for some reactions (therefore facilitating

by daily ET r , which is derived from synoptic gridded weather data in EEFlux as compared

In order To study the Interaction effect of Internet Usage levels, Gender and Locale on the Social Competence of the secondary school students 3x2x2 ANOVA has been employed

Digital health and social care practices, activities and projects in Cumbria, including those which had expired, were mapped from May 2014 to July 2015.. The purpose of carrying out

Thus, the objective of this study was to identify correlation (if any) between Threshold Odour Number (TON) test with a portable olfactometry plus odour wheel (POPOW)

In the two concomitant administration studies, the mean IOP reductions of DuoTrav Eye Drops were similar to those achieved by concomitant therapy with Travatan dosed once daily in

The High Patent Use Dummy is a dummy that is equal to one for affiliates of parents that over the four years prior to a reform average at least as many patent applications as

bermakna jika siswa mengalami langsung apa yang dipelajarinya. Proses pembelajaran Biologi di SMA dapat dilaksanakan dengan siswa mengalami langsung sehingga pembelajaran

Using a real data set from the motor insurance sector, we compare the estimated risk of loss evaluated for the bivariate Sarmanov distribution with truncated extreme value

The main sources of unsafe care in our weighted sample were due to: medication- related incidents (e.g.. incomplete or non- transfer of information across care boundaries) (n=390,

In addition, since RasD, a protein not normally found in vegetative cells, is expressed in vegetative rasG ⴚ and rasC ⴚ /rasG ⴚ cells and appears to partially compensate for the

In a cross-sectional design, this study examined the relationship between sarcopenia status, mild cognitive impairment and depression symptoms in community-dwelling

What sample size do I need to have a 80% probability (power) to detect this particular effect (difference and standard deviation) at a 5% significance level using a 2-sided test?...

What sample size do I need to have a 80% probability (power) to detect this particular effect (difference and standard deviation) at a 5% significance level using a 2-sided test?...

For the hydrodynamic cases of vertically propagating sound waves and internal gravity waves, or slow waves in a low-β plasma in a vertical magnetic field the resonance exists below

Because fraud is a manifestation of acute agency costs, I expect that fraud firms will experience a greater increase in: (i) outside director percentage, (ii) audit

NR activity of glasshouse-grown seedling cotyledons (A) and first, second, third and fourth leaves (B-E, respectively) was measured 22 h following infiltration with (+N) or

The power of the test depends upon the sample size, the magnitudes of the variances, the alpha level, and the actual difference between the two population

THE WITCH AND THE SAINT Für das Jugendblasorchester der Stadt Ellwangen, Deutschland.. Steven

Since squared standard errors are inversely proportional to sample sizes, the required sample size for a multilevel design will be given by the sample size that would be required for

What sample size do I need to have a 80% probability (power) to detect this particular effect (difference and standard deviation) at a 5% significance level using a 2-sided test?...

What sample size do I need to have a 80% probability (power) to detect this particular effect (difference and standard deviation) at a 5% significance level using a 2-sided test?...