Probability-based sampling includes any selection method where the sample members (sampling units) are selected from the target population on a purely random (chance) basis. Under random sampling, every member of the target population has a known, non-zero chance of being selected for the sample.
There are four probability-based sampling methods:
simple random sampling
systematic random sampling
stratified random sampling
cluster random sampling.
Simple Random Sampling
In a simple random sample, each member in the target population has an equal chance of being selected.
It is assumed that the population is homogeneous with respect to the random variable under study (i.e. the sampling units share similar views on the research question(s); or the objects in a population are influenced by the same background factors).
One way to draw a simple random sample is to assign a number to every element of the population and then effectively ‘draw numbers from a hat’. If a database of names exists, then a random number generator can be used to draw a simple random sample.
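The 'numbers from a hat' idea can be sketched in Python as follows. This is a minimal illustration only: the function name `simple_random_sample`, the `seed` parameter and the frame of 10 000 numbered taxpayers are assumptions made for the example, not part of the text.

```python
import random

def simple_random_sample(population, n, seed=None):
    """Draw n distinct members; every member has an equal chance of selection."""
    rng = random.Random(seed)          # a seed makes the draw repeatable
    return rng.sample(population, n)   # sampling without replacement

# Hypothetical sampling frame: taxpayers numbered 1 to 10 000.
taxpayers = list(range(1, 10_001))
sample = simple_random_sample(taxpayers, 200, seed=42)
print(len(sample))  # 200
```

Because the draw is made without replacement, no taxpayer can appear in the sample more than once.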
Some examples are given below.
A simple random sample of taxpayers can be selected from all taxpayers to check on the correctness of their tax return forms. Since the correctness of tax return forms is independent of the age, gender or income of the taxpayer, a simple random sample is likely to be representative of the population of taxpayers.
The population of Johannesburg motorists is to be surveyed for their views on toll roads. A simple random sample of Johannesburg motorists is assumed to be representative of this population as their views are unlikely to differ significantly across gender, age, car type driven or use of vehicle (i.e. private or business).
In a production process, parts that come off the same production line can be selected using simple random sampling to check the quality of the entire batch produced.
A survey of all tourists to Cape Town is to be undertaken to find out their views on service standards (quality and price) in Cape Town with respect to accommodation, transport, attractions and restaurants. A simple random sample of tourists is likely to represent the views of all tourists as it is assumed that their views on ‘service standards’ will be similar regardless of age, gender or nationality.
Systematic Random Sampling
Systematic random sampling is used when a sampling frame (i.e. an address list or database of population members) exists. Sampling begins by randomly selecting the first sampling unit. Thereafter subsequent sampling units are selected at a uniform interval relative to the first sampling unit. Since only the first sampling unit is randomly selected, some randomness is sacrificed.
To draw a systematic random sample, first divide the size of the sampling frame by the sample size to determine the size of a sampling block. Randomly choose the first sample member from within the first sampling block. Then choose subsequent sample members by selecting one member from each sampling block at a constant interval from the previously sampled member.
For example, to draw a systematic random sample of 500 property owners from a database of 15 000 property owners, proceed as follows:
Identify the size of each sampling block = 15 000/500 = 30 property owners.
Randomly select the first sample member from within the first 30 names on the list. Assume it is the 16th name on the list.
Thereafter select one property owner from each of the remaining 499 sampling blocks at a uniform interval (every 30th person) relative to the first sampling unit. If the 16th name was initially randomly selected, then the 46th, 76th, 106th, 136th etc. until the 14 986th name on the list would represent the remaining sample members.
This will result in a randomly drawn sample of 500 names of property owners.
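The property-owner example can be reproduced with a short Python sketch. The function name `systematic_sample` and its `start` and `seed` parameters are illustrative assumptions, and the numbers 1 to 15 000 merely stand in for positions on the list of names. The sketch also assumes that the sample size divides the frame size exactly; otherwise the last few names on the list can never be selected.

```python
import random

def systematic_sample(frame, n, start=None, seed=None):
    """Select n members at a constant interval from an ordered sampling frame.

    `start` is the 1-based position of the first member within the first
    sampling block; if omitted, it is chosen at random.
    """
    k = len(frame) // n  # size of each sampling block (the sampling interval)
    if k < 1:
        raise ValueError("sample size exceeds the size of the sampling frame")
    if start is None:
        start = random.Random(seed).randint(1, k)
    elif not 1 <= start <= k:
        raise ValueError("start must lie within the first sampling block")
    # Take one member from each block, k positions after the previous one.
    return [frame[start - 1 + i * k] for i in range(n)]

# Hypothetical frame: list positions 1..15 000 stand in for the owners' names.
owners = list(range(1, 15_001))
sample = systematic_sample(owners, 500, start=16)
print(sample[:5], sample[-1])  # [16, 46, 76, 106, 136] 14986
```

Leaving out `start` lets the function pick the first member at random from the first block, which is the only random step in the procedure.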
Stratified Random Sampling
Stratified random sampling is used when the population is assumed to be heterogeneous with respect to the random variable under study. The population is divided into segments (or strata), where the population members within each stratum are relatively homogeneous.
Thereafter, simple random samples are drawn from each stratum.
If the random samples are drawn in proportion to the relative size of each stratum, then this method of sampling is called proportional stratified random sampling.
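A minimal Python sketch of proportional stratified random sampling is given here for illustration. The function name, the `stratum_of` argument and the population of 1 000 ratepayers split 500/300/200 across three property-value bands are all assumptions made for the example. Stratum sample sizes are rounded, so in general the total may differ slightly from the requested sample size.

```python
import random
from collections import defaultdict

def proportional_stratified_sample(population, stratum_of, n, seed=None):
    """Draw a simple random sample from each stratum, sized in proportion
    to the stratum's share of the population."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for member in population:
        strata[stratum_of(member)].append(member)  # group members by stratum
    total = len(population)
    sample = {}
    for name, members in strata.items():
        size = round(n * len(members) / total)     # proportional allocation
        sample[name] = rng.sample(members, size)   # simple random sample per stratum
    return sample

# Hypothetical population: (ratepayer id, property-value band).
ratepayers = (
    [(i, "low") for i in range(500)]
    + [(i, "medium") for i in range(500, 800)]
    + [(i, "high") for i in range(800, 1000)]
)
sample = proportional_stratified_sample(ratepayers, lambda m: m[1], n=100, seed=1)
print({band: len(members) for band, members in sample.items()})
# {'low': 50, 'medium': 30, 'high': 20}
```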
The advantage of this sampling method is that it generally ensures greater representativeness across the entire population and also results in a smaller sampling error, giving greater precision in estimation. A disadvantage is that larger samples are required than in simple random sampling to ensure adequate representation of each stratum. This increases the cost of data collection.
Some examples are given below.
If age and gender of the motoring public are assumed, a priori, to influence their responses to questions on car type preferred and features sought in a car, then stratifying this population by these two characteristics and drawing a simple random sample from each stratum of age/gender combination is likely to produce a more representative sample.
Ratepayers in Durban could be stratified by the criterion 'property value per square metre'. All 'low-valued' suburbs would form one stratum, all 'medium-valued' suburbs would form a second stratum and all 'high-valued' suburbs would form a third stratum. A simple random sample of households within each stratum would be selected and interviewed. Their responses to questions on rates increases would be assumed to represent the responses of all ratepayers across all strata of property values.
Cluster Random Sampling
Certain target populations form natural clusters, which make for easier sampling. For example, labour forces cluster within factories; accountants cluster within accounting firms; lawyers cluster within law firms; shoppers cluster at shopping malls; students cluster at educational institutions; and outputs from different production runs (e.g. margarine tubs) are batched and labelled separately, forming clusters.
Cluster random sampling is used where the target population can be naturally divided into clusters, where each cluster is similar in profile to every other cluster. A subset of clusters is then randomly selected for sampling.
The sampling units within these sampled clusters may themselves be randomly selected to provide a representative sample from the population. For this reason, it is also called two-stage cluster sampling (e.g. select schools (stage 1) as clusters, then pupils within schools (stage 2); select companies (stage 1) as clusters, then employees within companies (stage 2)).
Cluster sampling tends to be used when the population is large and geographically dispersed. In such cases, smaller regions or clusters (with similar profiles) can be more easily sampled.
The advantage of cluster random sampling is that it usually reduces the per unit cost of sampling. One disadvantage is that cluster random sampling tends to produce larger sampling errors than those resulting from simple random sampling.
Some examples are given below.
In a study on mine safety awareness amongst miners, each gold mine could be considered to be a separate cluster. Assume there are 47 mine clusters in the Gauteng area. A randomly drawn sample of, say, eight mine clusters would first be selected. Then a simple random sample of miners within each of the chosen mines would be identified and interviewed. The responses are assumed to be representative of all miners (including those in mine clusters not sampled).
Each of the 15 major shopping malls in the Cape Peninsula can be classified as a cluster. A researcher may randomly choose, say, three of these shopping malls, and randomly select customers within each of these selected clusters for interviews on, say, clothing purchase behaviour patterns. Their responses are assumed to reflect the views of all shopping mall shoppers.
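The two stages of the shopping mall example can be sketched in Python as follows. The figures of 15 malls and three selected malls come from the example; the function name, the 400 customers per mall and the 50 interviews per selected mall are hypothetical values chosen for illustration.

```python
import random

def two_stage_cluster_sample(clusters, n_clusters, n_per_cluster, seed=None):
    """Stage 1: randomly select clusters. Stage 2: randomly select units
    within each selected cluster."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)            # stage 1
    return {
        name: rng.sample(clusters[name],
                         min(n_per_cluster, len(clusters[name])))  # stage 2
        for name in chosen
    }

# Hypothetical frame: 15 malls, each with 400 identifiable customers.
malls = {
    f"Mall {i:02d}": [f"M{i:02d}-customer-{j}" for j in range(400)]
    for i in range(1, 16)
}
sample = two_stage_cluster_sample(malls, n_clusters=3, n_per_cluster=50, seed=7)
print({mall: len(customers) for mall, customers in sample.items()})
```

Customers in the 12 malls not chosen at stage 1 have no chance of being interviewed once the clusters are fixed, which is why the method relies on the clusters having similar profiles.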