The Normal Distribution
7.6 Normal Approximation to a Binomial Distribution
It is often desirable to use the normal distribution in place of another probability distribution. In particular, it is convenient to replace the binomial distribution with the normal when certain conditions are met. Remember, though, that the binomial distribution is discrete, whereas the normal distribution is continuous.
The shape of the binomial distribution varies considerably according to its parameters, n and p. If the parameter p, the probability of “success” (or a defective item or a failure, etc.) in a single trial, is sufficiently small (or if q = 1 – p is sufficiently small), the distribution is usually unsymmetrical. If p or q is sufficiently small and if the number of trials, n, is large enough, a binomial distribution can be approximated by a Poisson distribution. This was discussed in section 5.4 (c).
On the other hand, if p is sufficiently close to 0.5 and n is sufficiently large, the binomial distribution can be approximated by a normal distribution. Under these conditions the binomial distribution is approximately symmetrical and tends toward a bell shape. A larger value of n allows greater departure of p from 0.5; a binomial distribution with very small p (or p very close to 1) can be approximated by a normal distribution if n is very large. If n is large enough, sometimes both the Poisson approximation and the normal approximation are applicable. In that case, use of the normal approximation is usually preferable because the normal distribution allows easy calculation of cumulative probabilities using tables or computer software.
Figure 7.16: Comparison of a Binomial Distribution
[Plot not reproduced. Horizontal axis: Number of defectives, 0 to 20. Vertical axis: Binomial Probability and Normal Probability Density, 0 to 0.2.]
Figure 7.16 compares a binomial distribution with a normal distribution. The parameters of the binomial distribution are p = 0.4 and n = 20 (for instance, we might take samples of 20 items from a production line when the probability that any one item will require further processing is 0.4). To fit a normal distribution we need to know the mean and the standard deviation. Remember that the mean of a binomial distribution is µ = np, and that the standard deviation for that distribution is $\sigma = \sqrt{np(1-p)}$. To fit a normal distribution to this binomial distribution, we must have µ = np = (20)(0.4) = 8, and $\sigma = \sqrt{np(1-p)} = \sqrt{(20)(0.4)(0.6)} = 2.191$. In Figure 7.16 the continuous curve passing through small circles represents the density function for the fitted normal distribution, while the vertical lines topped by small crosses represent binomial probabilities. The agreement appears to be very good.
But we have a difficulty to deal with: the normal distribution is continuous, whereas the binomial distribution is discrete. Probabilities according to the binomial distribution are different from zero only when the number of defectives is a whole number, not when the number is between the whole numbers. On the other hand, if we integrate the normal distribution only for limits infinitesimally apart around the whole numbers, the area under the curve will be infinitesimally small. Then the corresponding probability will be zero.
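The fitting step described above can be sketched in a few lines of Python. This is only an illustration: the function name `fitted_normal_params` is an assumption introduced here, not something from the text.

```python
from math import sqrt

def fitted_normal_params(n, p):
    """Return (mu, sigma) of the normal distribution fitted to a binomial(n, p).

    Hypothetical helper: it matches the mean np and the standard deviation
    sqrt(np(1 - p)) of the binomial distribution.
    """
    mu = n * p
    sigma = sqrt(n * p * (1 - p))
    return mu, sigma

# Case discussed in the text: n = 20 items per sample, p = 0.4.
mu, sigma = fitted_normal_params(20, 0.4)
print(round(mu, 3), round(sigma, 3))  # 8.0 2.191
```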
The common-sense solution is to integrate for wider steps, which together cover the whole range. We set limits for integration of the normal distribution halfway between possible values of the discrete variable. This modification is called the correction for continuity. In Figure 7.16 the limits for integration of the normal distribution would be from 5.5 to 6.5 to compare with a binomial probability at 6 defectives. For comparison with the binomial value at 7, the limits would be from 6.5 to 7.5, and so on.
The numerical comparison of probabilities using the correction for continuity is shown in Example 7.7. Approximating binomial probabilities in this way is called the normal approximation to a binomial distribution.
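As a minimal sketch of the correction for continuity, the normal probability assigned to a whole number k can be taken as the area of the fitted normal curve between k – 0.5 and k + 0.5. The function name `normal_approx_pmf` is an assumption made for illustration, and `statistics.NormalDist` from the Python standard library stands in for tables.

```python
from math import sqrt
from statistics import NormalDist

def normal_approx_pmf(k, n, p):
    """Approximate Pr[R = k] for a binomial(n, p) variable R.

    Hypothetical helper: integrates the fitted normal distribution
    over the step from k - 0.5 to k + 0.5 (correction for continuity).
    """
    mu = n * p
    sigma = sqrt(n * p * (1 - p))
    fitted = NormalDist(mu, sigma)
    return fitted.cdf(k + 0.5) - fitted.cdf(k - 0.5)

# n = 20, p = 0.4, six defectives: integration limits are 5.5 and 6.5.
approx_6 = normal_approx_pmf(6, 20, 0.4)
print(round(approx_6, 3))
```

Because the steps for k = 0, 1, 2, ... are adjacent, they cover the range of the variable without gaps or overlaps.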
Example 7.7
Corresponding to the case shown in Figure 7.16, let’s calculate probabilities according to the binomial distribution and for the normal distribution which fits it approximately. In a sample of 20 items when the probability that any one item requires further processing is 0.4, the binomial distribution gives probabilities that various numbers of items will require more processing. This is then a binomial distribution with n = 20 and p = 0.4.
Answer: Sample calculations will be shown for the probability of six items requiring further processing in a sample of 20, and then all the results will be compared.
By the binomial distribution,
$$\Pr[R = 6] = {}_{20}C_{6}\,(0.4)^{6}(0.6)^{14} = \frac{(20)(19)(18)(17)(16)(15)}{(6)(5)(4)(3)(2)}\,(0.4)^{6}(0.6)^{14} = 0.124$$
By the normal approximation,
$$\Pr[R = 6] \approx \Pr[5.5 < X < 6.5] = \Phi\left(\frac{6.5 - 8}{2.191}\right) - \Phi\left(\frac{5.5 - 8}{2.191}\right) = \Phi(-0.68) - \Phi(-1.14) = 0.121$$
The values for the normal approximation shown above were read from tables with z evaluated to two decimal places. Evaluating z to three decimal places and using linear interpolation, or using computer software such as the function NORMSDIST from Excel, would give 0.2468 – 0.1269 = 0.120 for the probability of six defectives. In Table 7.2 the normal approximations have been calculated with z evaluated to three decimal places and with linear interpolation to give a more accurate error of approximation, but interpolation is not ordinarily required.
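The sample calculation for six items can be checked with a short Python sketch. It is an illustration only: the variable names are assumptions, `statistics.NormalDist` replaces the tables and NORMSDIST, and rounding z to two decimals is used here to imitate a table lookup.

```python
from math import comb, sqrt
from statistics import NormalDist

n, p, k = 20, 0.4, 6
mu = n * p
sigma = sqrt(n * p * (1 - p))
std_normal = NormalDist()  # standard normal: mean 0, standard deviation 1

# Exact binomial probability of exactly six items.
exact = comb(n, k) * p**k * (1 - p)**(n - k)

# z values at the continuity-corrected limits 5.5 and 6.5.
z_low = (k - 0.5 - mu) / sigma
z_high = (k + 0.5 - mu) / sigma

# Imitating a table lookup: z rounded to two decimal places.
table_style = std_normal.cdf(round(z_high, 2)) - std_normal.cdf(round(z_low, 2))

# Software-style evaluation: z kept at full precision.
full_precision = std_normal.cdf(z_high) - std_normal.cdf(z_low)

print(f"{exact:.3f} {table_style:.3f} {full_precision:.3f}")  # 0.124 0.121 0.120
```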
Table 7.2: Comparison of Binomial Distribution and Normal Approximation
Number for            Binomial       Normal           Error of
Further Processing    Probability    Approximation    Approximation
 0                    0.00004        0.00026          –0.0002
 1                    0.0005         0.0012           –0.0007
 2                    0.0031         0.0045           –0.0014
 3                    0.012          0.014            –0.0016
 4                    0.035          0.035            –0.0001
 5                    0.075          0.072            +0.003
 6                    0.124          0.120            +0.005
 7                    0.166          0.163            +0.003
 8                    0.180          0.180            –0.001
 9                    0.160          0.163            –0.003
10                    0.117          0.120            –0.003
11                    0.071          0.072            –0.0009
12                    0.035          0.035            +0.0004
13                    0.015          0.014            +0.0006
14                    0.0049         0.0045           +0.0003
15                    0.0013         0.0012           +0.0001
16                    0.0003         0.0003           +0.0000
17                    0.00004        0.00005          –0.0000
18                    5×10⁻⁶         0.00001          –0.0000
19                    3×10⁻⁷         <10⁻⁶
20                    1×10⁻⁸         <10⁻⁶
The largest error in Table 7.2 is 0.005, 0.124 vs. 0.120 for six defectives.
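A comparison like Table 7.2 can be regenerated with a few lines of Python. This is a sketch under stated assumptions: it uses `statistics.NormalDist` at full precision rather than table interpolation, so the last printed digit may differ slightly from the table, and the names `rows` and `worst` are introduced only for illustration.

```python
from math import comb, sqrt
from statistics import NormalDist

n, p = 20, 0.4
fitted = NormalDist(n * p, sqrt(n * p * (1 - p)))

rows = []
for k in range(n + 1):
    binom = comb(n, k) * p**k * (1 - p)**(n - k)
    # Correction for continuity: integrate from k - 0.5 to k + 0.5.
    approx = fitted.cdf(k + 0.5) - fitted.cdf(k - 0.5)
    rows.append((k, binom, approx, binom - approx))

for k, binom, approx, err in rows:
    print(f"{k:2d}  {binom:.5f}  {approx:.5f}  {err:+.5f}")

# Row with the largest absolute error of approximation.
worst = max(rows, key=lambda row: abs(row[3]))
print("largest error at k =", worst[0])
```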
As a rough rule, the normal approximation to the binomial distribution is usually reasonably good if both np and (n)(1–p) are greater than 5. In Example 7.7, np is equal to (20)(0.4) = 8 and (n)(1 – p) is equal to (20)(0.6) = 12, so the rough rule is satisfied with some to spare. The rough rule should be used in solving problems in this book.
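The rough rule is easy to express as a check. The function name `rough_rule_ok` is an assumption made for this sketch.

```python
def rough_rule_ok(n, p):
    """Rough rule: both n*p and n*(1 - p) should be greater than 5."""
    return n * p > 5 and n * (1 - p) > 5

print(rough_rule_ok(20, 0.4))  # np = 8, n(1 - p) = 12: rule satisfied
print(rough_rule_ok(8, 0.5))   # np = 4, n(1 - p) = 4: rule not satisfied
```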
The rule is only a rough guide because the two parameters, n and p, affect the agreement separately. For the same value of the product np, the normal approximation to the binomial distribution is better when p is closer to 0.5. We can illustrate that by comparing the binomial distribution with the corresponding normal approximation just at np = 5, the limit given by the rough rule, at three combinations of n and p. Figure 7.17 shows these comparisons.
Figure 7.17(a): Comparison at n = 10 and p = 0.5
Figure 7.17(b): Comparison at n = 25 and p = 0.2
Figure 7.17(c): Comparison at n = 250 and p = 0.02
[Plots not reproduced. Each panel shows the Binomial probability and the Normal Approximation (vertical axis: Probability, 0 to 0.25) against Number of Occurrences, i.]
We can see from Figure 7.17 that the discrepancies are smallest at n = 10 and p = 0.5, intermediate at n = 25 and p = 0.2, and largest at n = 250 and p = 0.02, even though all are at np = 5 and n(1 – p) > 5. At n = 10 and p = 0.5 the largest absolute discrepancy is 0.002; at n = 25 and p = 0.2 the largest absolute discrepancy is 0.011; and at n = 250 and p = 0.02 the largest absolute discrepancy is 0.071.
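The trend across the three cases can be explored with the sketch below. It rests on an assumption: the discrepancy is measured as the absolute difference between each binomial probability and the continuity-corrected normal area for the same whole number. The text does not state how its quoted figures were computed, so the output of this sketch need not reproduce them exactly; it is meant only to show the ordering of the three cases. The function name `largest_discrepancy` is hypothetical.

```python
from math import comb, sqrt
from statistics import NormalDist

def largest_discrepancy(n, p):
    """Largest |binomial probability - normal approximation| over k = 0..n.

    Assumption: the normal approximation is the area of the fitted normal
    curve between k - 0.5 and k + 0.5.
    """
    fitted = NormalDist(n * p, sqrt(n * p * (1 - p)))
    worst = 0.0
    for k in range(n + 1):
        binom = comb(n, k) * p**k * (1 - p)**(n - k)
        approx = fitted.cdf(k + 0.5) - fitted.cdf(k - 0.5)
        worst = max(worst, abs(binom - approx))
    return worst

# Three combinations with np = 5, as in Figure 7.17.
for n, p in [(10, 0.5), (25, 0.2), (250, 0.02)]:
    print(n, p, round(largest_discrepancy(n, p), 4))
```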
Example 7.8
A coin is biased. We are told that the probability of heads on any one toss is 40% and the corresponding probability of tails is 60%. The coin is tossed 120 times, giving 56 heads and 64 tails. From what we were told about the bias, we expect (120)(0.40) = 48 heads. If the given information is correct, what is the probability of getting either 56 or more heads, or 40 or fewer heads (i.e., a result as far from the expected result as 56 heads or farther in either direction)? Is the result so unlikely that we should doubt that the probability of heads on a single toss is only 40%?
Answer: This problem could be solved using the binomial distribution directly: $\Pr[R = 56] = {}_{120}C_{56}\,(0.4)^{56}(0.6)^{64}$, and similarly for R = 57, 58, ..., 120 and R = 0, 1, 2, ..., 39, 40, then adding up probabilities. However, these calculations are very laborious. It would be less work to calculate the sum of Pr [R = 41], Pr [R = 42], ..., Pr [R = 54], Pr [R = 55] and subtract that sum from 1, but that would still be a lot of labor. It is much easier to apply the normal approximation, and results should be very little different. In this case np = (120)(0.4) = 48 and (n)(1 – p) = (120)(0.6) = 72, so the rough rule is very easily satisfied. For the normal approximation µ = np = (120)(0.4) = 48 and $\sigma = \sqrt{(n)(p)(1-p)} = \sqrt{(120)(0.4)(0.6)} = 5.367$.
Using the correction for continuity, Pr [R = 56] corresponds to the area under the normal probability curve between 55.5 and 56.5. So, Pr [R > 55] corresponds to the area under the curve beyond 55.5. Similarly, Pr [R < 41] corresponds to the area under the curve for X < 40.5.
If $x_1 = 55.5$, $z_1 = \frac{x_1 - \mu}{\sigma} = \frac{55.5 - 48}{5.367} = 1.397$.
Similarly, if $x_2 = 40.5$, $z_2 = \frac{40.5 - 48}{5.367} = -1.397$.
Figure 7.18: Probabilities for Example 7.8
[Plot not reproduced. The required areas are the two tails of the normal curve: below x = 40.5 (z2) and above x = 55.5 (z1), with the mean at x = 48 (z = 0). Horizontal axes: x, no. of heads, and z.]
Then Pr [R > 55, Binomial] ≈ Pr [Z > 1.397]
= 1 – Φ(1.397)
≈ 1 – Φ(1.40)
= 1 – 0.9192 = 0.081.
Then Pr [more than 55 heads] ≈ 8.1%.
Similarly, Pr [fewer than 41 heads] ≈ 8.1%. The probability of a result as far from the mean as 56 heads or farther in either direction, given that p = 0.400, is (2)(8.1%) = 16.2%. This would happen by chance about one time in six, so it is not very unlikely. Then the result of tossing the coin gives us no evidence that p is not equal to 0.400.
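The calculation in Example 7.8 can be sketched in Python, together with the direct binomial sum that the text describes as laborious by hand. This is an illustration under assumptions: `statistics.NormalDist` replaces the tables, z is kept at full precision rather than rounded to 1.40, and the helper `binom_pmf` is a name introduced here.

```python
from math import comb, sqrt
from statistics import NormalDist

n, p = 120, 0.4
mu = n * p                     # 48 heads expected
sigma = sqrt(n * p * (1 - p))  # about 5.367
fitted = NormalDist(mu, sigma)

# Normal approximation with the correction for continuity.
upper_tail = 1 - fitted.cdf(55.5)  # 56 or more heads
lower_tail = fitted.cdf(40.5)      # 40 or fewer heads
approx_two_sided = upper_tail + lower_tail

def binom_pmf(k):
    """Exact binomial probability of k heads in n tosses."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

# Direct binomial sum over R = 0..40 and R = 56..120.
exact_two_sided = (sum(binom_pmf(k) for k in range(0, 41))
                   + sum(binom_pmf(k) for k in range(56, n + 1)))

print(f"normal approximation: {approx_two_sided:.3f}")
print(f"exact binomial:       {exact_two_sided:.3f}")
```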
Approximations such as the normal approximation to the binomial distribution are not as important as they used to be because nearly exact values can be obtained using computer software. As we saw in section 5.5(b), both single and cumulative values for the binomial distribution can be obtained from Microsoft Excel. However, even when these nearly exact values are available, it may be desirable to use a convenient approximation.