
Step 5: Draw Conclusions from the Data

5.1.2 Draw Study Conclusions

The goal of this activity is to translate the results of the statistical hypothesis test so that the data user may draw a conclusion from the data. The results of the statistical hypothesis test will be either:

(a) reject the null hypothesis, in which case the analyst is concerned about a possible false rejection decision error; or

(b) fail to reject the null hypothesis, in which case the analyst is concerned about a possible false acceptance decision error.

In case (a), the data have provided the evidence needed to reject the null hypothesis, so the decision can be made with sufficient confidence and without further analysis. This is because the statistical test based on the classical hypothesis testing philosophy, which is the approach described in prior chapters, inherently controls the false rejection decision error rate within the data user's tolerable limits, provided that the underlying assumptions of the test have been verified correctly.

In case (b), the data do not provide sufficient evidence to reject the null hypothesis, and the data must be analyzed further to determine whether the data user's tolerable limits on false acceptance decision errors have been satisfied. One of two possible conditions may prevail:

(1) The data do not support rejecting the null hypothesis and the false acceptance decision error limits were satisfied. In this case, the conclusion is drawn in favor of the null hypothesis, since the probability of committing a false acceptance decision error is believed to be sufficiently small in the context of the current study (see Section 5.2).

(2) The data do not support rejecting the null hypothesis, and the false acceptance decision error limits were not satisfied. In this case, the statistical test was not powerful enough to satisfy the data user's performance criteria. The data user may choose to tolerate a higher false acceptance decision error rate than previously specified and draw the conclusion in favor of the null hypothesis, or instead take some form of corrective action, such as obtaining additional data before drawing a conclusion and making a decision.
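The branching in cases (a), (b)(1), and (b)(2) can be summarized as a small function. This is only an illustrative sketch: the name `draw_conclusion` and its two boolean inputs are hypothetical and do not come from the guidance. Whether the false acceptance limits were met still has to be established separately, for example by a power or sample size check.

```python
def draw_conclusion(reject_null: bool, false_acceptance_limit_met: bool) -> str:
    """Map a hypothesis test outcome to the conclusions in cases (a) and (b).

    Both arguments are hypothetical inputs used for illustration only.
    """
    if reject_null:
        # Case (a): the test itself controls the false rejection error rate,
        # so no further analysis is needed.
        return "reject H0"
    if false_acceptance_limit_met:
        # Case (b)(1): failing to reject is backed by adequate power.
        return "conclude in favor of H0"
    # Case (b)(2): the test was not powerful enough for the stated limits.
    return "tolerate a higher false acceptance rate or collect more data"
```

The third branch deliberately returns both options, because the choice between them belongs to the data user and not to the analysis.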

When the test fails to reject the null hypothesis, the most thorough procedure for verifying whether the false acceptance decision error limits have been satisfied is to compute the estimated power of the statistical test, using the variability observed in the data. Computing the power of the statistical test across the full range of possible parameter values can be complicated and usually requires specialized software. Power calculations are also necessary for evaluating the performance of a sampling design. Thus, power calculations will be discussed further in Section 5.1.3.
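As a rough illustration of what an estimated power calculation involves, the sketch below evaluates power for a one-sided, one-sample test of a mean at several true shifts above the action level. It assumes a large-sample normal approximation in place of the noncentral t distribution that an exact t-test calculation requires, so it tends to overstate power for small samples. It is not the procedure of Section 5.1.3. The function name `approx_power` is hypothetical, and the inputs s = 10.41 and n = 9 are borrowed from the example in Box 5-1.

```python
import math
from statistics import NormalDist


def approx_power(delta: float, s: float, n: int, alpha: float) -> float:
    """Approximate power of a one-sided, one-sample test of a mean.

    delta: true mean minus the null value (same units as s)
    s:     standard deviation estimated from the data
    n:     number of samples
    alpha: false rejection error rate

    Uses a normal approximation; this is an assumption of the sketch,
    not the exact noncentral t calculation.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha)
    return NormalDist().cdf(delta * math.sqrt(n) / s - z_alpha)


# Approximate power at a few true shifts, using the Box 5-1 inputs.
for delta in (0, 5, 10, 15):
    print(delta, round(approx_power(delta, s=10.41, n=9, alpha=0.05), 3))
```

The false acceptance error rate at a given shift is one minus the power at that shift, which is the quantity compared against the data user's limit.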

A simpler method can be used for checking the performance of the statistical test. Using an estimate of the variance obtained from the actual data, or an upper 95% confidence limit on the variance, the sample size required to satisfy the data user's objectives can be calculated retrospectively. If this theoretical sample size is less than or equal to the number of samples actually taken, then the test is sufficiently powerful. If the required number of samples is greater than the number actually collected, then additional samples would be required to satisfy the data user's performance criteria for the statistical test. An example of this method is contained in Box 5-1. The equations required to perform these calculations have been provided in the detailed step-by-step instructions for each hypothesis test procedure in Chapter 3.

Box 5-1: Checking Adequacy of Sample Size for a One-Sample t-Test for Simple Random Sampling

In Box 3-1, the one-sample t-test was used to test the hypothesis H0: µ ≤ 95 ppm vs. HA: µ > 95 ppm. DQOs specified that the test should limit the false rejection error rate to 5% and the false acceptance error rate to 20% if the true mean were 105 ppm. A random sample of size n = 9 had sample mean X̄ = 99.38 ppm and standard deviation s = 10.41 ppm. The null hypothesis was not rejected. Assuming that the true value of the standard deviation was equal to its sample estimate of 10.41 ppm, it was found that a sample size of 9 would be required, which validated the sample size of 9 that had actually been used.

The distribution of the sample standard deviation is skewed with a long right tail. It follows that the chances are greater than 50% that the sample standard deviation will underestimate the true standard deviation. In such a case it makes sense to build in some conservatism, for example, by using an upper 90% confidence limit for σ in Step 5 of Box 3-1. Using Box 4-22 and n - 1 = 8 degrees of freedom, it is found that L = 3.49, so that an upper 90% confidence limit for the true standard deviation is

$$ s\sqrt{(n-1)/L} = 10.41\sqrt{8/3.49} = 15.76 $$

Using this value for s in Step 5 of Box 3-1 leads to the sample size estimate of 17. Hence, a sample size of at least 17 should be used to be 90% sure of achieving the DQOs. Since it is generally desirable to avoid the need for additional sampling, it is advisable to conservatively estimate sample size in the first place. In cases where DQOs depend on a variance estimate, this conservatism is achieved by intentionally overestimating the variance.
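The arithmetic in Box 5-1 can be reproduced with a short script. Step 5 of Box 3-1 is not reproduced in this section, so the sketch below assumes the common normal-approximation sample size formula for a one-sided, one-sample t-test, n = s²(z₁₋α + z₁₋β)²/Δ² + 0.5·z₁₋α², rounded up to the next integer. Treat that formula and the function names as assumptions. The chi-square value L = 3.49 is taken from the box as a table lookup and is not computed here.

```python
import math
from statistics import NormalDist


def required_sample_size(s: float, delta: float, alpha: float, beta: float) -> int:
    """Retrospective sample size for a one-sided, one-sample t-test.

    s:     standard deviation (sample estimate or an upper confidence limit)
    delta: difference between the alternative mean and the null value
    alpha: false rejection error rate
    beta:  false acceptance error rate at the alternative mean

    Assumed formula (normal approximation with a small-sample correction).
    """
    z_a = NormalDist().inv_cdf(1 - alpha)
    z_b = NormalDist().inv_cdf(1 - beta)
    n = s**2 * (z_a + z_b) ** 2 / delta**2 + 0.5 * z_a**2
    return math.ceil(n)


def upper_sd_limit(s: float, n: int, chi2_lower: float) -> float:
    """Upper confidence limit for the standard deviation: s * sqrt((n - 1) / L).

    chi2_lower is the lower-tail chi-square critical value L with n - 1
    degrees of freedom, looked up from a table (3.49 in Box 5-1).
    """
    return s * math.sqrt((n - 1) / chi2_lower)


# Inputs from Box 5-1: s = 10.41 ppm, n = 9, null value 95 ppm, alternative 105 ppm.
n_from_sample_sd = required_sample_size(s=10.41, delta=105 - 95, alpha=0.05, beta=0.20)
sd_upper = upper_sd_limit(s=10.41, n=9, chi2_lower=3.49)
n_from_upper_sd = required_sample_size(s=sd_upper, delta=105 - 95, alpha=0.05, beta=0.20)
print(n_from_sample_sd, round(sd_upper, 2), n_from_upper_sd)
```

With the box's inputs this gives a required sample size of 9 for s = 10.41, an upper limit of 15.76 for the standard deviation, and a required sample size of 17 for that upper limit, matching the values quoted in Box 5-1.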

Source: Guidance for Data Quality Assessment: Practical Methods for Data Analysis, EPA QA/G-9, QA00 Update, EPA/600/R-96/084, July 2000 (pages 169-171).