Rationale of a Sampling Distribution

Assume that the Department of Labour wants to determine the average number of years’ experience of all professional engineers in South Africa. Consider the hypothetical situation of drawing every possible sample of size n (say n = 100 professional engineers) from this population. Assume that there are k such samples. Calculate the mean years of experience for each of these k samples. There will now be k sample means.

Chapter 6 – Sampling and Sampling Distributions

Now construct a frequency distribution of these k sample means and also calculate the mean and standard deviation of these k sample means. The following properties will emerge:

(a) The sample mean is itself a random variable, as the value of each sample mean is likely to vary from sample to sample. Each separate sample of 100 engineers will have a different sample mean number of years’ experience.

(b) The mean of all these k sample means will be equal to the true population mean, µ.

(c) The standard deviation of all these k sample means is a measure of the sampling error. It is called the standard error of the sample means and is calculated as follows:

$$\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} \qquad (6.1)$$

It measures the average deviation of sample means about their true population mean.

(d) The histogram of these k sample means will be normally distributed.

This distribution of the sample means is called the sampling distribution of $\bar{x}$.
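Properties (a) to (d) can be checked with a small simulation. The sketch below is illustrative only: the population of engineers is invented, and because enumerating every possible sample is impractical, it draws a large number of random samples as a stand-in for "all k samples". The names `population`, `n_samples` and `sample_means` are assumptions made for this example.

```python
import random
import statistics

random.seed(42)  # fixed seed so the sketch is repeatable

# Hypothetical population: years of experience for 20 000 engineers.
population = [random.uniform(0, 40) for _ in range(20_000)]
mu = statistics.fmean(population)       # true population mean
sigma = statistics.pstdev(population)   # true population standard deviation

n = 100             # size of each sample
n_samples = 5_000   # repeated random samples, standing in for every possible sample

# One sample mean per sample: each value differs, so the sample mean is a random variable.
sample_means = [
    statistics.fmean(random.sample(population, n)) for _ in range(n_samples)
]

mean_of_means = statistics.fmean(sample_means)   # should be close to mu
sd_of_means = statistics.pstdev(sample_means)    # should be close to sigma / sqrt(n)
standard_error = sigma / n ** 0.5

print(f"population mean        : {mu:.3f}")
print(f"mean of sample means   : {mean_of_means:.3f}")
print(f"theoretical std error  : {standard_error:.3f}")
print(f"std dev of sample means: {sd_of_means:.3f}")
```

With these settings the mean of the sample means should sit very close to the population mean, and their standard deviation close to σ/√n, even though the invented population is uniform rather than normal.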

To summarise, the sample mean is a random variable that has the following three properties:

It is normally distributed.

It has a mean equal to the population mean, µ.

It has a standard deviation, called the standard error, $\sigma_{\bar{x}}$, equal to $\sigma/\sqrt{n}$.

68.3% of all sample means will lie within one standard error of the population mean.

95.5% of all sample means will lie within two standard errors of the population mean.

99.7% of all sample means will lie within three standard errors of the population mean.

Alternatively, it can be stated as follows:

There is a 68.3% chance that a single sample mean will lie no further than one standard error away from its population mean.

There is a 95.5% chance that a single sample mean will lie no further than two standard errors away from its population mean.

There is a 99.7% chance that a single sample mean will lie no further than three standard errors away from its population mean.

This implies that any sample mean which is calculated from a randomly drawn sample has a high probability (up to 99.7%) of being no more than three standard errors away from its true, but unknown, population mean value.
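The three percentages quoted above follow from the standard normal distribution. As a quick check, the sketch below uses `statistics.NormalDist` from the Python standard library; the helper name `coverage` is an assumption made for this example.

```python
from statistics import NormalDist

std_normal = NormalDist()  # standard normal: mean 0, standard deviation 1


def coverage(k):
    """Probability that a normal variable lies within k standard deviations of its mean."""
    return std_normal.cdf(k) - std_normal.cdf(-k)


for k in (1, 2, 3):
    print(f"within {k} standard error(s): {coverage(k):.2%}")
```

The computed values are roughly 68.27%, 95.45% and 99.73%, which the text rounds to 68.3%, 95.5% and 99.7%.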

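These probabilities are found by relating the sampling distribution of $\bar{x}$ to the z-distribution. Any sample mean, $\bar{x}$, can be converted into a z-value through the following z transformation formulae:

$$z = \frac{\bar{x} - \mu}{\sigma_{\bar{x}}} \quad \text{or} \quad z = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}} \qquad (6.2)$$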

Applied Business Statistics

The sampling distribution of the sample mean is shown graphically in Figure 6.1.

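Figure 6.1 Sampling distribution of the sample mean (a normal curve of $\bar{x}$ centred on µ, with 68.3%, 95.5% and 99.7% of sample means lying within ±1, ±2 and ±3 standard errors respectively, where $\sigma_{\bar{x}} = \sigma/\sqrt{n}$)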

This relationship between the sample mean and its population mean can be used to:

find probabilities that a single sample mean will lie within a specified distance of its true but unknown population mean

calculate probability-based interval estimates of the population mean

test claims/statistical hypotheses about a value for the true but unknown population mean.
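The first of these uses can be illustrated with a short calculation. The sketch below assumes the sample mean is normally distributed with standard error σ/√n, as described above; the function name `prob_within` and the numbers (σ = 8 years, n = 100, distance = 1.6 years) are hypothetical and chosen only for illustration.

```python
from math import sqrt
from statistics import NormalDist


def prob_within(distance, sigma, n):
    """P(|x_bar - mu| <= distance), assuming x_bar is normal with standard error sigma / sqrt(n)."""
    standard_error = sigma / sqrt(n)
    z = distance / standard_error      # distance expressed in standard errors
    std_normal = NormalDist()
    return std_normal.cdf(z) - std_normal.cdf(-z)


# Hypothetical inputs: sigma = 8 years, n = 100 engineers, distance = 1.6 years.
p = prob_within(distance=1.6, sigma=8.0, n=100)
print(f"P(sample mean within 1.6 years of mu) = {p:.4f}")
```

Here the standard error is 0.8 years, so 1.6 years corresponds to two standard errors and the probability comes out at about 0.9545.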

The sampling distribution is the basis for the two inferential techniques of confidence intervals and hypothesis tests, which are covered in the following chapters.

6.5 The Sampling Distribution of the Sample Proportion (p)

The sample proportion, p, is used as the central location measure when the random variable under study is qualitative and the data is categorical (i.e. nominal/ordinal).

A sample proportion is found by counting the number of cases that have the characteristic of interest, r, and expressing it as a ratio (or percentage) of the sample size, n (i.e. p = r/n). See Table 6.3 for illustrations of sample proportions.

Table 6.3 Illustrations of sample proportions for categorical variables

| Qualitative random variable | Sample statistic |
| --- | --- |
| Gender | Proportion of females in a sample of students |
| Trade union membership | Proportion of employees who are trade union members |
| Mobile phone brand preference | Proportion of mobile phone users who prefer Nokia |

This section will show how the sample proportion, p, is related to the true, but unknown, population proportion π for any categorical random variable x.

The sample proportion, p, is related to its population proportion, π, in exactly the same way as the sample mean, $\bar{x}$, is related to its population mean, µ. Thus the relationship between p and π can be described by the sampling distribution of the sample proportion.

This relationship can be summarised as follows for a given categorical random variable, x.

(a) The sample proportion, p, is itself a random variable as the value of each sample proportion is likely to vary from sample to sample.

(b) The mean of all sample proportions is equal to its true population proportion, π.

(c) The standard deviation of all sample proportions is a measure of the sampling error. It is called the standard error of sample proportions and is calculated as follows:

$$\sigma_p = \sqrt{\frac{\pi(1 - \pi)}{n}} \qquad (6.3)$$

It measures the average deviation of sample proportions about the true population proportion.

(d) The histogram of all these sample proportions is normally distributed.
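As with the sample mean, these properties can be explored by simulation. The sketch below is a hedged illustration rather than part of the original text: the population proportion (0.30), the sample size (200) and the names `pi_true`, `sample_proportion` and `n_samples` are all assumptions made for this example.

```python
import random
import statistics
from math import sqrt

random.seed(7)  # fixed seed so the sketch is repeatable

pi_true = 0.30      # hypothetical population proportion
n = 200             # size of each sample
n_samples = 5_000   # number of repeated random samples


def sample_proportion(pi, size):
    """Draw one random sample and return p = r / n, where r counts the 'successes'."""
    r = sum(random.random() < pi for _ in range(size))
    return r / size


proportions = [sample_proportion(pi_true, n) for _ in range(n_samples)]

mean_p = statistics.fmean(proportions)                 # should be close to pi_true
sd_p = statistics.pstdev(proportions)                  # should be close to the standard error
standard_error = sqrt(pi_true * (1 - pi_true) / n)     # formula 6.3

print(f"mean of sample proportions   : {mean_p:.4f}")
print(f"std dev of sample proportions: {sd_p:.4f}")
print(f"theoretical standard error   : {standard_error:.4f}")
```

The mean of the simulated proportions should land close to the assumed π, and their standard deviation close to the value given by formula 6.3.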

Based on these properties and using normal distribution theory, it is possible to conclude the following about how sample proportions behave in relation to their population proportion:

68.3% of all sample proportions will lie within one standard error of the population proportion, π.

95.5% of all sample proportions will lie within two standard errors of the population proportion, π.

99.7% of all sample proportions will lie within three standard errors of the population proportion, π.

These probabilities are found by relating the sampling distribution of p to the z-distribution.

Any sample proportion, p, can be converted into a z-value through the following z transformation formulae:

$$z = \frac{p - \pi}{\sigma_p} \quad \text{or} \quad z = \frac{p - \pi}{\sqrt{\dfrac{\pi(1 - \pi)}{n}}} \qquad (6.4)$$

Figure 6.2 shows the sampling distribution of single sample proportions graphically.

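Figure 6.2 Sampling distribution of sample proportions (p) (a normal curve of p centred on π, with 68.3%, 95.5% and 99.7% of sample proportions lying within ±1, ±2 and ±3 standard errors respectively, where $\sigma_p = \sqrt{\pi(1 - \pi)/n}$)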

This relationship can now be used to derive probabilities, develop probability-based estimates, and test hypotheses of the population proportion in statistical inference.
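A minimal sketch of the z transformation for a sample proportion follows. It assumes the normal model described in this section; the function name `z_for_proportion` and the inputs (π = 0.5, n = 100, p = 0.6) are hypothetical values chosen for illustration.

```python
from math import sqrt
from statistics import NormalDist


def z_for_proportion(p, pi, n):
    """Convert a sample proportion p into a z-value: (p - pi) / sqrt(pi * (1 - pi) / n)."""
    standard_error = sqrt(pi * (1 - pi) / n)
    return (p - pi) / standard_error


# Hypothetical inputs: population proportion 0.5, sample of 100, observed proportion 0.6.
z = z_for_proportion(p=0.6, pi=0.5, n=100)

# Probability of observing a sample proportion at least this large under the normal model.
upper_tail = 1 - NormalDist().cdf(z)

print(f"z = {z:.2f}")
print(f"P(sample proportion >= 0.6) = {upper_tail:.4f}")
```

With these inputs the standard error is 0.05, so the observed proportion lies two standard errors above π and the upper-tail probability is about 0.0228.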

6.6 The Sampling Distribution of the Difference between Two