Sampling for Extensive Studies Barbara M Wildemuth
SAMPLE SIZE
To conduct your study as efficiently as possible, you will want to recruit and collect data from as small a sample as possible. Unfortunately, as Cohen (1965) noted when discussing psychology research, there are a number of problems with the ways that many researchers decide on their sample sizes:
122 APPLICATIONS OF SOCIAL RESEARCH METHODS
As far as I can tell, decisions about n in much psychological research are generally arrived at by such considerations as local tradition (“At Old Siwash U., 30 cases are enough for a dissertation”); subject-matter precedent (“I’ll use samples of 20 cases, since that’s how many Hook and Crook used when they studied conditioning under anxiety”); data availability (“I can’t study more idiots savants than we have in the clinic, can I?”); intuition or one of its more presumptuous variants, “experience”; and negotiation (“If I give them the semantic differential, too, then it’s only fair that I cut my sample to 40”). (p. 98)
Despite the temptation of using one of these “standard” methods, we hope you’ll try to be more rational in deciding how big your sample should be.
If you’re conducting a descriptive study, you’ll be using data from your sample to estimate characteristics (i.e., parameters) of your population. There are three aspects to the decision about sample size in this situation. The first, and most important, is the confidence interval around the estimate (Czaja & Blair, 2005). This is the amount of error that you can tolerate in the parameter estimates. For example, if you want to know the percentage of your computer help desk users who use the Windows operating system, you might be comfortable with a confidence interval that is five percentage points above or below the estimate that is derived from the sample, but in another study, you might need to estimate the population values within two percentage points above or below the estimate. There is no fixed criterion for setting the acceptable confidence interval; you need to balance the importance of the accuracy of the estimates for the goals of your study with the extra cost of collecting data from a larger sample. The second aspect of your decision is concerned with the probability that the true value for the population falls within the confidence interval you desire. Usually, you will want a small probability (5% or 1%) that the true population parameter falls outside the confidence interval (Rea & Parker, 1997). Finally, you need to take into account the variability expected within the population. If the characteristic you are estimating varies widely within the population (e.g., the number of times each person in a community visits the library each year), you’ll need a larger sample to estimate its value accurately. Once you’ve made decisions about or know each of these attributes of your situation, there are simple formulas for estimating what sample size you will need.
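As a rough illustration of the "simple formulas" mentioned above, the sketch below uses the common normal-approximation formulas for estimating a proportion, n = z^2 * p(1 - p) / E^2, and a mean, n = (z * s / E)^2. It assumes simple random sampling from a large population. The function names, the default p = 0.5 (the most conservative guess about variability), and the example standard deviation are illustrative assumptions, not values taken from the sources cited in this chapter.

```python
import math
from statistics import NormalDist


def z_for_confidence(confidence: float) -> float:
    """Two-sided critical value of the standard normal distribution."""
    return NormalDist().inv_cdf(1 - (1 - confidence) / 2)


def sample_size_for_proportion(margin: float, confidence: float = 0.95,
                               p: float = 0.5) -> int:
    """Sample size needed to estimate a proportion within +/- margin.

    p is the expected proportion; 0.5 assumes maximum variability.
    """
    z = z_for_confidence(confidence)
    return math.ceil(z ** 2 * p * (1 - p) / margin ** 2)


def sample_size_for_mean(margin: float, sd: float,
                         confidence: float = 0.95) -> int:
    """Sample size needed to estimate a mean within +/- margin.

    sd is the expected standard deviation in the population.
    """
    z = z_for_confidence(confidence)
    return math.ceil((z * sd / margin) ** 2)


# Help desk example: share of Windows users, within five or two points.
print(sample_size_for_proportion(0.05))  # 385
print(sample_size_for_proportion(0.02))  # 2401

# Library visits per year, within 2 visits, assuming sd = 10 (hypothetical).
print(sample_size_for_mean(2, 10))       # 97
```

Tightening the interval from five to two percentage points raises the required sample from 385 to 2,401, which shows why the acceptable interval has to be weighed against the cost of data collection.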
If you’re conducting a study in which you will be testing hypotheses, you will be concerned with the power of your statistical analyses. In these cases, there are four aspects of the study that you need to consider in deciding what sample size is appropriate. The first, and most important, is the effect size that you want to be able to detect (Asraf & Brewer, 2004). Let’s imagine that you are trying to determine if the number of terms used in a Web search is related to the searcher’s expertise on the topic of the search. You will be collecting measures of topic expertise and you will capture searches to determine how many search terms are used in each. In such a study, the effect is the difference in the number of search terms. Is it important for you to know that on average, experts use one more term than nonexperts? Or is it sufficient to be able to detect that experts use three more terms than nonexperts? The substance of your research question will help you to determine how small an effect you want to be able to detect. The smaller the effect you need to detect, the larger the sample you will need. The next aspect to consider is the probability of rejecting a null hypothesis when you shouldn’t. This is the most obvious consideration in hypothesis testing and is sometimes called a type I
error. Using our Web search example, you would be committing a type I error if you said that there is a difference between experts and nonexperts when there really is not.
For most studies, the probability of a type I error (alpha) is set at 0.05. This would mean that you're willing to accept only a 5 percent chance of rejecting the hypothesis that there is no difference between experts and nonexperts when that hypothesis is actually true. The next aspect to consider is the probability of making a type II error: not rejecting the null hypothesis when you really should. Using our Web search example, you would be committing a type II error if you said that there is no difference between experts and nonexperts when there really is. This aspect is closely tied to the power of a statistical test, formally defined as 1 minus the probability of a type II error, that is, the probability of correctly rejecting a false null hypothesis (Howell, 2004). Cohen (1988) recommends that you set the probability of a type II error to 0.20 so that you will be achieving a power of 0.80. Finally, as with estimating population parameters, you need to take into account the variability expected within the population. Once you've made decisions or know about these four aspects of your study, you can use the formulas given by Cohen (1988) or Howell (2004) or use an online power calculator (e.g., Lenth, 2006) to determine the sample size needed.
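To show how the four aspects combine, here is a minimal sketch for the Web search example, assuming a two-sided comparison of the mean number of search terms in two equal-sized groups. It uses the standard normal approximation, n per group = 2 * ((z_alpha + z_power) / d)^2, where d is the difference divided by the standard deviation. This is not the exact procedure given by Cohen (1988) or Howell (2004), and it will slightly understate the sample a t-test requires. The function name and the standard deviation of 2 terms are hypothetical.

```python
import math
from statistics import NormalDist


def n_per_group(difference: float, sd: float,
                alpha: float = 0.05, power: float = 0.80) -> int:
    """Approximate sample size per group for comparing two means.

    difference: smallest difference in means worth detecting (effect)
    sd:         expected standard deviation within each group
    alpha:      acceptable probability of a type I error (two-sided)
    power:      1 minus the acceptable probability of a type II error
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_power = NormalDist().inv_cdf(power)
    effect_size = difference / sd  # standardized effect size
    return math.ceil(2 * ((z_alpha + z_power) / effect_size) ** 2)


# Assuming the number of search terms has sd = 2 (hypothetical):
print(n_per_group(difference=1, sd=2))  # 63 per group to detect 1 term
print(n_per_group(difference=3, sd=2))  # 7 per group to detect 3 terms
```

Under these assumptions, detecting a one-term difference takes about nine times as many participants as detecting a three-term difference, which illustrates why the effect size is the most important of the four decisions.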
A Special Case: Usability Testing
Usability testing is a special case when it comes to deciding on sample size because some experts (e.g., Nielsen, 1993) have argued that just five or six users can identify 80 percent of the usability problems there are, while others (e.g., Faulkner, 2003; Spool & Schroeder, 2001) have found that a larger sample size is needed to identify a significant proportion of the usability problems in a new system or Web site. Lewis (2006) demonstrates how the sample size needed depends on the goals of the usability study.
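One way to see why these experts disagree is the problem-discovery model commonly used in this literature, in which the expected proportion of problems found by n users is 1 - (1 - p)^n, where p is the average probability that a single user encounters a given problem. The sketch below is an illustration under that assumption only. The function names are hypothetical, and the values of p are example inputs (0.31 is a figure often associated with the "five users" argument), not results reported in this chapter.

```python
import math


def proportion_found(n_users: int, p: float) -> float:
    """Expected share of problems found by n_users, each of whom
    encounters any given problem with probability p."""
    return 1 - (1 - p) ** n_users


def users_needed(target: float, p: float) -> int:
    """Smallest number of users expected to reveal the target share."""
    return math.ceil(math.log(1 - target) / math.log(1 - p))


# If problems are easy to hit (p = 0.31), five users find most of them.
print(round(proportion_found(5, 0.31), 2))  # 0.84
print(users_needed(0.80, 0.31))             # 5

# If problems are harder to hit (p = 0.10), far more users are needed.
print(users_needed(0.80, 0.10))             # 16
```

The required sample is very sensitive to p, so a study of a complex site with rarely encountered problems can need several times as many users as the five-user rule suggests.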