The principles of inferential statistics
SIGNIFICANCE TESTING
Now, given that you know that the null hypothesis is a statement about the statistical population, how can you know whether to accept it or reject it? How can you know that there is absolutely no difference between the means of two statistical populations, or that there is no association at all between two or more variables in the populations you are interested in? The only way to search for an answer to these questions is to examine the characteristics of your statistical samples. For example, imagine that your null hypothesis is that there is no difference in the mean well-being of men and women, and that to try to find out whether this is true you collect information about the well-being of 100 men and 100 women. The mean and standard deviation of the sample data are shown in Table 2.10.
The sample means are different in Table 2.10 for men and women. But remember that your null hypothesis does not concern the sample data, but is a statement about the statistical populations from which the sample data have been drawn. It may well be that there is no difference between the mean well-being of all men and the mean well-being of all women, but that the difference you have found in your two samples is due to chance. After all, if the mean well-being score of all men is exactly 4.15, you would not necessarily obtain a mean of exactly 4.15 if you just measured the well-being of a sample of 100 men – as explained on page 32, sample statistics are only estimates of population parameters. As a consequence, if you took several random samples of men, with 100 men in each sample, you would probably find that sometimes the sample mean would be over 4.15, and sometimes it would be less than 4.15.
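The point about sample means varying around the population mean can be illustrated with a short simulation. This is only a sketch under stated assumptions: the population of men's scores is taken to be normally distributed with a mean of 4.15, and the standard deviation of 0.11 is borrowed from the sample in Table 2.10 purely for illustration.

```python
import random
import statistics

# Assumed (hypothetical) population of men's well-being scores.
POP_MEAN = 4.15    # population mean used in the example above
POP_SD = 0.11      # assumption: borrowed from the sample SD in Table 2.10
SAMPLE_SIZE = 100  # men per sample

def sample_mean(rng):
    """Draw one random sample of men and return its mean well-being."""
    scores = [rng.gauss(POP_MEAN, POP_SD) for _ in range(SAMPLE_SIZE)]
    return statistics.fmean(scores)

rng = random.Random(1)  # fixed seed so that the run is repeatable
means = [sample_mean(rng) for _ in range(10)]

for number, mean in enumerate(means, start=1):
    side = "over" if mean > POP_MEAN else "under"
    print(f"sample {number:2d}: mean = {mean:.3f} ({side} {POP_MEAN})")
```

Each run of ten samples gives ten slightly different means, all close to 4.15 but rarely, if ever, exactly equal to it.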
Let’s return to the case of a sample of 100 men and 100 women. Here are the null and alternative hypotheses.
■ Null hypothesis: There is no difference between the mean well-being of men and the mean well-being of women.
■ Alternative hypothesis: There is a difference between the mean well-being of men and the mean well-being of women.
As a researcher, what you really want to find out here is whether the alternative hypothesis is true. However, statistical tests cannot directly tell you whether the alternative hypothesis is true or not. What they do, instead, is to tell you how likely you would be to obtain research data with a difference between means (or an association) as large as the one you have found if the null hypothesis that there is no difference (or no association) between the populations is true. This is explained more fully in the next section on statistical significance.
Table 2.10 The well-being of 100 men and 100 women
         Well-being
         Mean    Standard deviation
Men      4.13    0.11
Women    4.17    0.18
What is meant by statistical significance?
We are now in a position to examine what is meant by statistical significance. Look again at the sample data in Table 2.10. In this case what a test of statistical significance would ask is this:
If the null hypothesis is true, what is the probability that you would obtain a difference between the two sample means as large as this?
Let’s say that, using inferential statistical methods to analyse your data, you find that if the null hypothesis is true, only one time in 1,000 would you obtain a difference in the means as large as the one you have in Table 2.10. On the basis of this finding you can quite sensibly argue that since this is so unlikely you should probably accept the alternative hypothesis instead. That is, you accept the alternative hypothesis that the population of well-being scores for men and the population of well-being scores for women have different means.
Put more simply, the mean well-being of men and the mean well-being of women are different.
But what if you find that if the null hypothesis is true, then one time in every three you would, by chance, find a difference between means as large as that in your sample data? In this case it would surely be more sensible to conclude that getting this result by chance alone, if the null hypothesis is true, is not at all unlikely. Therefore, instead of rejecting the null hypothesis and accepting the alternative one, you should simply accept the null hypothesis.
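The question a significance test asks can be approximated by brute force: assume the null hypothesis is true, draw many pairs of samples, and count how often chance alone produces a difference as large as the one observed. The sketch below does this for the figures in Table 2.10. It rests on assumptions that are not in the text: both populations are treated as normal, they share a mean of 4.15, and their standard deviations are set equal to the sample values. The 'one time in 1,000' and 'one time in three' figures above are illustrations, so the estimate printed here need not match either of them.

```python
import random
import statistics

N = 100                           # people per group, as in Table 2.10
OBSERVED_DIFF = abs(4.13 - 4.17)  # difference between the two sample means
COMMON_MEAN = 4.15                # assumption: shared population mean under the null
SD_MEN, SD_WOMEN = 0.11, 0.18     # assumption: population SDs equal the sample SDs

def simulated_difference(rng):
    """Difference between two sample means when the null hypothesis is true."""
    men = [rng.gauss(COMMON_MEAN, SD_MEN) for _ in range(N)]
    women = [rng.gauss(COMMON_MEAN, SD_WOMEN) for _ in range(N)]
    return statistics.fmean(men) - statistics.fmean(women)

def chance_probability(repetitions=2000, seed=0):
    """Share of simulated studies with a difference at least as large as observed."""
    rng = random.Random(seed)
    hits = sum(
        abs(simulated_difference(rng)) >= OBSERVED_DIFF
        for _ in range(repetitions)
    )
    return hits / repetitions

estimate = chance_probability()
print(f"Estimated probability under the null hypothesis: {estimate:.3f}")
```

The smaller this proportion, the harder it is to attribute the observed difference to chance alone.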
While this example has dealt with differences between means, it is also possible to use significance tests to examine the null hypothesis that there is no association between variables and, in so doing, to test the alternative hypothesis that the variables are associated.
The logic of significance testing can be summarized as follows:
1 Tests of statistical significance are usually concerned with whether there is an association between populations, or whether there is a difference in the central tendency of populations.
2 They work by examining the probability of obtaining a difference (or association) as large as the one found in the sample data if the null hypothesis is true. The null hypothesis is normally that there is no association between the populations, or that there is no difference in the central tendency of the populations.
3 If a test of statistical significance indicates that if the null hypothesis is correct there is still a sizeable chance that we would obtain a difference (or association) as large as the one we have found in our research, we accept the null hypothesis and reject the alternative hypothesis.
4 If a test of statistical significance indicates that if the null hypothesis is correct there is only a relatively small chance that we would obtain a difference (or association) as large as the one we have found in our research, we reject the null hypothesis and accept the alternative one.
What statistical tests do is to tell you the probability that you would get a difference between variables as large as the one you have found in your sample data, if the null hypothesis is true. Or they tell you the probability you would get an association between variables as large as the one you have found in your sample data, if the null hypothesis is true. What you then have to do is decide, on the basis of this information, whether to accept or reject the null hypothesis. How do you do this? There is a convention that is generally accepted in research, not only in organizational research but in many other research areas as well. It is as follows:
If the statistical test tells you that . . .
1 if the null hypothesis is true
2 you would have obtained a difference (or association) in your sample data as large as the one you have found
3 less frequently than 5 per cent of the time by chance
. . . it makes reasonable sense to reject the null hypothesis.
This is because you are being told that you would be unlikely to obtain the difference (or association) you have found in your data if the null hypothesis is true. Given that this is the case, you say that the difference (or association) found in your sample data is ‘statistically significant’.
But, if the statistical test tells you that . . .
1 if the null hypothesis is true
2 you would have obtained a difference (or association) in your sample data as large as the one you have found
3 5 per cent or more of the time by chance (5 per cent of the time, 10 per cent of the time, 50 per cent of the time, etc.)
. . . it makes sense to accept the null hypothesis.
This is because you are being told that you would be quite likely to obtain the difference you have found in your data if the null hypothesis is true. In this case you say that the difference you have found in your sample data is not statistically significant.
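The convention just described amounts to a simple rule, sketched below. The function and parameter names (`interpret`, `chance_probability`, `threshold`) are hypothetical labels chosen for this illustration, and the wording of the verdicts follows the text.

```python
def interpret(chance_probability, threshold=0.05):
    """Apply the 5 per cent convention to the probability given by a test.

    chance_probability: probability of a difference (or association) as large
    as the one found, if the null hypothesis is true.
    """
    if not 0.0 <= chance_probability <= 1.0:
        raise ValueError("a probability must lie between 0 and 1")
    if chance_probability < threshold:
        return "statistically significant: reject the null hypothesis"
    return "not statistically significant: accept the null hypothesis"

for probability in (0.001, 0.03, 0.05, 0.10, 0.50):
    print(f"{probability:.3f} -> {interpret(probability)}")
```

Note that a probability of exactly 5 per cent falls on the 'not statistically significant' side of the rule, because the convention requires less than 5 per cent.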
The probability that you will obtain a statistically significant result partly depends on the level of significance used. In this book, for the sake of simplicity, and because it is the widely used convention, we have always referred to the 5 per cent or .05 level of significance. In fact, significance is sometimes set at .01 (or even .001). If the null hypothesis is true you are, by definition, more likely to obtain significance at the 5 per cent level than at the 1 per cent level.
Why, you may ask, is the level of significance set at the 5 per cent or .05 level? Surely it would be safer to make it much stricter than this. If we want to be certain that the null hypothesis is false and the alternative hypothesis is true, why not say ‘I am only willing to reject the null hypothesis and accept the alternative hypothesis if I would only obtain a difference as large as the one I have in my data (if the null hypothesis is true) 1 time in 10,000 by chance’? This way we could be very sure that we have not accepted the alternative hypothesis when we should have accepted the null hypothesis instead. The reason that very strict significance levels like this are not used is that, while they deal quite effectively with one problem, they introduce a new one. The problem that is dealt with, called a Type I error, is the danger of rejecting the null hypothesis when it should be accepted. We make a Type I error when there is actually no difference (or association) between populations but we wrongly conclude from our sample data that such a difference (or association) is there. For example, if there is no difference between the job commitment of employees working in sales and those working in marketing, but we measure the job commitment of 20 sales and marketing people and from this incorrectly conclude that those in sales are more committed than those working in marketing (or vice versa), we are making a Type I error. If we reduce the level required for statistical significance from .05 to, say, .001, the likelihood of making a Type I error is reduced. The trouble is that if we do this the probability of making a Type II error, that is accepting the null hypothesis when it is actually false, increases. The .05 level of significance is generally chosen not because it is perfect, but because it represents a reasonable trade-off between (1) the danger of making a Type I error, which is particularly likely to occur if we set the significance level too high (e.g. .10); and (2) the danger of making a Type II error, which is prone to happen if we set the significance level too low (e.g. .0001).
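The trade-off between the two kinds of error can be made concrete with a small calculation. The sketch below is an illustration only, and every number in it is hypothetical: two groups of 50 people, a standard deviation of 1.0 in both populations, and a real difference of 0.5 between the population means. It uses a normal approximation to work out how often a real difference of that size would be missed at various significance levels.

```python
from math import sqrt
from statistics import NormalDist

# Hypothetical scenario (none of these numbers come from the text).
N_PER_GROUP = 50  # people measured in each of the two groups
SD = 1.0          # standard deviation of scores in both populations
TRUE_DIFF = 0.5   # real difference between the two population means

STD_NORMAL = NormalDist()  # standard normal distribution

def type_ii_rate(level):
    """Chance of accepting the null hypothesis although the means really differ."""
    se = SD * sqrt(2 / N_PER_GROUP)             # standard error of the difference
    z_crit = STD_NORMAL.inv_cdf(1 - level / 2)  # two-sided cut-off for this level
    shift = TRUE_DIFF / se                      # true difference in standard errors
    # A Type II error occurs when the test statistic falls inside the cut-offs.
    return STD_NORMAL.cdf(z_crit - shift) - STD_NORMAL.cdf(-z_crit - shift)

levels = (0.10, 0.05, 0.01, 0.001, 0.0001)
rates = [type_ii_rate(level) for level in levels]

print("level   Type I rate   Type II rate")
for level, rate in zip(levels, rates):
    # When the null hypothesis is true, the Type I rate equals the level itself.
    print(f"{level:<7} {level:<13} {rate:.3f}")
```

Reading down the output, the Type I rate shrinks as the level is made stricter, while the Type II rate grows, which is the trade-off described above.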
All statistical significance tests provide information about how unlikely the differences or associations you have found in your sample data are if the null hypothesis is true. And they do so with what is called a p, or probability, value. For example, if you use the t-test, which is the most commonly used statistical test for examining whether two samples of continuous data are drawn from populations with the same means, SPSS will provide you with information about the p value of the result. If you obtain a p value of 0.03, this means that, if the null hypothesis is true, the probability of finding a difference between the sample means as large as the one you found is 3 in 100. Shown in Table 2.11 are some examples of p values for differences between means, and how they should be interpreted.
The way to decide whether the result is statistically significant or not is to ask whether the p value is less than 0.05. This is because, as discussed above, the convention is to reject the null hypothesis and accept the alternative hypothesis if the difference (or association) you find, or a bigger one, would occur by chance less than 5 per cent of the time if the null hypothesis is true.
So, if the p value is less than 0.05 you can claim to have a statistically significant result, reject the null hypothesis and accept the alternative hypothesis. However, if the p value is 0.05 or over, you have to say that your result is not statistically significant and accept the null hypothesis.
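To see where a p value comes from, the sketch below computes one from the summary figures in Table 2.10. It is an approximation rather than the t-test itself: with 100 people in each group it uses the normal distribution in place of the t distribution, so a package such as SPSS would report a slightly different value. The function name `two_sided_p` is a hypothetical label for this illustration.

```python
from math import sqrt
from statistics import NormalDist

def two_sided_p(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """Approximate two-sided p value for a difference between two sample means."""
    # Standard error of the difference between the two sample means.
    se = sqrt(sd_a ** 2 / n_a + sd_b ** 2 / n_b)
    # Size of the observed difference, measured in standard errors.
    z = (mean_a - mean_b) / se
    # Probability of a difference at least this large, in either direction,
    # if the null hypothesis is true (normal approximation).
    return 2 * (1 - NormalDist().cdf(abs(z)))

# Men and women from Table 2.10.
p = two_sided_p(4.13, 0.11, 100, 4.17, 0.18, 100)
print(f"p = {p:.3f}")
print("statistically significant" if p < 0.05 else "not statistically significant")
```

Under these assumptions the p value for Table 2.10 comes out just above 0.05, so by the convention above the difference would be reported as not statistically significant.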