16. THE NORMAL APPROXIMATION TO THE BINOMIAL DISTRIBUTION

It is sometimes difficult to compute probabilities directly for a binomial (n, p) random variable, X. A table lookup needs a different table for each combination of n and p. If we don't have a table, direct calculations can get cumbersome very quickly.

Eg: Compute P(X 100) for n = 150, p = 0.35.

For normal random variables, on the other hand, probability calculations are extremely easy; just one table is required.

Fortunately, we can approximate the binomial distribution by a normal distribution, with an appropriate choice of µ and σ .

To get a feel for why this might work, let's study the Quincunx.

The Quincunx is a device invented by Sir Francis Galton in the 1800s. It shows empirically that binomial random variables, observed repeatedly, produce a histogram which looks bell-shaped, as long as the number of trials is not too small.

See Quincunx website at:

http://www.rand.org/methodology/stat/applets/clt.html
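For readers without access to the applet, the Quincunx can be imitated in a few lines of Python. This is only a rough sketch: the function name `simulate_quincunx`, the 12 rows of pegs, the 2000 balls, and the fixed random seed are all illustrative assumptions, not part of the original device or website.

```python
import random
from collections import Counter


def simulate_quincunx(rows, balls, p_right=0.5, seed=0):
    """Drop `balls` balls through `rows` rows of pegs; return counts per bin."""
    rng = random.Random(seed)  # fixed seed (an assumption) so runs are repeatable
    counts = Counter()
    for _ in range(balls):
        # At each peg the ball goes right with probability p_right, so the
        # final bin index is one draw from a binomial(rows, p_right).
        bin_index = sum(rng.random() < p_right for _ in range(rows))
        counts[bin_index] += 1
    return [counts[k] for k in range(rows + 1)]


if __name__ == "__main__":
    bins = simulate_quincunx(rows=12, balls=2000)
    for k, count in enumerate(bins):
        # Crude text histogram: one '#' per 10 balls in the bin.
        print(f"{k:2d} {'#' * (count // 10)}")
```

With a moderate number of rows, the printed bars should pile up in the middle bins and thin out toward the ends, which is the bell shape the Quincunx is meant to display.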

• In general, the distribution of a binomial random variable may be accurately approximated by that of a normal random variable, as long as np ≥ 5, nq ≥ 5, and assuming that a “continuity correction” is made to account for the fact that we are using a continuous distribution (the normal) to approximate a discrete one (the binomial).

• For approximating the distribution of X, we will use the normal distribution with mean µ = np and variance σ² = npq, where q = 1 − p. Why are these reasonable choices of µ, σ²? To study the quality of this approximation, visit the Normal Approximation to the Binomial website at:

http://www.stat.sc.edu/~west/applets/binomialdemo2.html

This draws a bar chart of the binomial distribution for a given n, p, and superimposes the approximating normal distribution. Note how skewness increases as p moves away from 0.5.
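The choice of µ and σ², together with the np ≥ 5, nq ≥ 5 rule of thumb, can be collected in a small helper. The sketch below is only illustrative; the function name `normal_approx_params` and its return layout are assumptions made for this example. The values µ = np and σ² = npq are simply the mean and variance of the binomial itself, so the approximating normal matches the binomial in its first two moments.

```python
import math


def normal_approx_params(n, p):
    """Return (mu, sigma, rule_ok) for approximating a binomial(n, p)."""
    q = 1 - p
    mu = n * p                    # mean of the binomial
    sigma = math.sqrt(n * p * q)  # standard deviation of the binomial
    # Rule of thumb from the notes: both np and nq should be at least 5.
    rule_ok = (n * p >= 5) and (n * q >= 5)
    return mu, sigma, rule_ok


if __name__ == "__main__":
    for n, p in [(10, 0.5), (10, 0.1), (400, 0.9)]:
        mu, sigma, ok = normal_approx_params(n, p)
        print(f"n={n}, p={p}: mu={mu:.2f}, sigma={sigma:.2f}, rule satisfied: {ok}")
```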

See histograms of number of dark M&Ms and orange M&Ms from M&M Lab. (Separate handout).

• If p(x) is the binomial distribution and f(x) is the density of the normal, the approximation is:

$$p(a) \approx \int_{a-1/2}^{a+1/2} f(x)\,dx$$

Thus, the binomial probability p(a) is approximately equal to the probability that a normal RV with mean np and variance npq lies between x = a − 1/2 and x = a + 1/2.

Also, P(a ≤ X ≤ b) is approximately equal to the area under the normal curve between x = a − 1/2 and x = b + 1/2:

$$\sum_{x=a}^{b} p(x) \approx \int_{a-1/2}^{b+1/2} f(x)\,dx$$

• The “continuity correction” is the use of a − 1/2, b + 1/2 in the normal approximation. This ensures that probabilities are always approximated by areas under the normal curve. It can dramatically improve the quality of the approximation, even when n is large, so it should be used whenever possible.
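To see what the continuity correction buys, one can compare the exact binomial probability of an interval with the normal area computed both with and without the ±1/2 adjustment. The following is a minimal sketch using only Python's standard library; the helper names and the test case (n = 100, p = 0.5, interval 45 to 55) are assumptions chosen for illustration.

```python
import math


def binom_pmf(k, n, p):
    """Exact binomial probability P(X = k)."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)


def normal_cdf(x, mu, sigma):
    """P(Normal(mu, sigma) <= x), via the error function."""
    return 0.5 * (1 + math.erf((x - mu) / (sigma * math.sqrt(2))))


def interval_probabilities(a, b, n, p):
    """Return (exact, corrected, uncorrected) values for P(a <= X <= b)."""
    mu = n * p
    sigma = math.sqrt(n * p * (1 - p))
    exact = sum(binom_pmf(k, n, p) for k in range(a, b + 1))
    # With the continuity correction: area from a - 1/2 to b + 1/2.
    corrected = normal_cdf(b + 0.5, mu, sigma) - normal_cdf(a - 0.5, mu, sigma)
    # Without it: area from a to b only.
    uncorrected = normal_cdf(b, mu, sigma) - normal_cdf(a, mu, sigma)
    return exact, corrected, uncorrected


if __name__ == "__main__":
    exact, corrected, uncorrected = interval_probabilities(45, 55, n=100, p=0.5)
    print(f"exact       = {exact:.4f}")
    print(f"corrected   = {corrected:.4f}")
    print(f"uncorrected = {uncorrected:.4f}")
```

In this case n is fairly large, yet the uncorrected area still misses the exact value by noticeably more than the corrected one does.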

In the diagram above, the bars represent the binomial distribution with n = 10, p = 0.5. The superimposed curve is a normal density f(x). The mean of the normal is µ = np = 5, and the standard deviation is σ = √((10)(0.5)(0.5)) = 1.58.

Suppose we wish to find p(4), the probability that the binomial equals 4.

From Table 2 of Appendix B, we get

p(4) = 0.3770 − 0.1719 = 0.2051.

This is the exact probability, but we won’t always be so lucky as to have a binomial table for the given n and p.

So let’s try the normal approximation.

Using the normal approximation, we need to calculate the probability that our normal is between 3.5 and 4.5. The corresponding z-scores are

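(3.5 − 5)/1.58 = −0.95 and (4.5 − 5)/1.58 = −0.32.

Thus, the normal approximation to p(4) is

Pr(−0.95 < Std Normal < −0.32) = Pr(0.32 < Std Normal < 0.95) = 0.3289 − 0.1255 = 0.2034,

where the first equality uses the symmetry of the standard normal about zero.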

This is quite close to the actual value, p(4) = 0.2051.
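The same calculation can be checked without tables. The sketch below is only an illustration, assuming Python's standard `math` module; the helper name `phi` is made up for this example. Because it does not round the z-scores to two decimals, the approximation it prints can differ slightly from the table-based 0.2034.

```python
import math

n, p, a = 10, 0.5, 4
mu = n * p                          # 5
sigma = math.sqrt(n * p * (1 - p))  # about 1.58


def phi(z):
    """Standard normal CDF, computed from the error function."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))


# Exact binomial probability p(4).
exact = math.comb(n, a) * p**a * (1 - p) ** (n - a)

# Normal approximation with the continuity correction: area from 3.5 to 4.5.
z_low = (a - 0.5 - mu) / sigma
z_high = (a + 0.5 - mu) / sigma
approx = phi(z_high) - phi(z_low)

print(f"exact p(4)           = {exact:.4f}")
print(f"normal approximation = {approx:.4f}")
```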

If we hadn’t used the continuity correction, our approximation to p(4) would be zero, that is, the area under the normal curve between 4 and 4. This would be a very poor approximation indeed!

As the diagram shows, the area under the normal density between 3.5 and 4.5 provides a reasonable approximation to the height of the bar, p(4). This should make it clear why the continuity correction is helpful.

Eg: Tomorrow morning’s Iberia flight to Madrid can seat 370 passengers. From past experience, Iberia knows that the probability is 0.90 that a given ticket-holder will show up for the flight. They have sold 400 tickets, deliberately overbooking the flight. How confident can Iberia be that no passenger will need to be “bumped” (denied boarding)?

Solution: We will assume that the number (X) of passengers showing up for the flight has a binomial distribution with mean µ = (400)(0.9) = 360 and standard deviation σ = √((400)(0.9)(0.1)) = 6. (Is this reasonable?)

We want Pr[X ≤ 370]. We approximate this by the probability that our normal RV is less than 370.5. This is the probability that a standard normal is less than z = (370.5 − 360)/6 = 1.75. So the probability that nobody gets bumped is approximately

0.5 + 0.4599 = 0.9599. (Almost 96%).
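As a rough check on the overbooking example, the normal approximation can be set beside the exact binomial sum. This sketch assumes, as the solution does, that ticket-holders show up independently with probability 0.9; the variable and helper names are invented for illustration.

```python
import math

n, p, seats = 400, 0.9, 370
mu = n * p                          # 360
sigma = math.sqrt(n * p * (1 - p))  # 6


def phi(z):
    """Standard normal CDF, computed from the error function."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))


def binom_pmf(k, n, p):
    """Exact binomial probability P(X = k)."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)


# Normal approximation with continuity correction: P(Normal < 370.5).
z = (seats + 0.5 - mu) / sigma
approx_no_bump = phi(z)

# Exact answer: 1 minus the probability that 371 or more passengers show up.
exact_bump = sum(binom_pmf(k, n, p) for k in range(seats + 1, n + 1))
exact_no_bump = 1 - exact_bump

print(f"z                     = {z:.2f}")
print(f"normal approximation  = {approx_no_bump:.4f}")
print(f"exact binomial answer = {exact_no_bump:.4f}")
```

Summing the upper tail (371 through 400) rather than the lower one keeps the individual terms away from floating-point underflow.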

Eg: What is the probability that you will win at least $10 after playing 100 games of craps for $1 per game?

Solution: To win at least $10, you must win at least 55 games.

The number (X) of games you will win has a binomial distribution with n = 100, p = .493.

Therefore, X has mean µ = 49.3 and standard deviation σ = √((100)(.493)(.507)) = 5.00.

We want Pr[X ≥ 55] ≈ Pr[Std. Normal > (54.5 − 49.3)/5]

= Pr[Std. Normal > 1.04] = .5 − .3508 = .1492.

(Just a 15% chance!)

[Forecasting Lab Results]
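The craps example can be worked through numerically as well, including the step from "at least $10" to "at least 55 wins". The sketch below is illustrative only; it takes the win probability of .493 from the notes and assumes each game wins or loses exactly $1. All variable names are made up for this example.

```python
import math

games, p_win, target = 100, 0.493, 10

# With w wins, net winnings are w - (games - w) = 2w - games dollars,
# so we need 2w - games >= target, i.e. w >= (target + games) / 2.
min_wins = math.ceil((target + games) / 2)

mu = games * p_win
sigma = math.sqrt(games * p_win * (1 - p_win))


def phi(z):
    """Standard normal CDF, computed from the error function."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))


# Normal approximation with continuity correction: P(Normal > min_wins - 1/2).
z = (min_wins - 0.5 - mu) / sigma
approx = 1 - phi(z)

# Exact binomial tail probability for comparison.
exact = sum(
    math.comb(games, k) * p_win**k * (1 - p_win) ** (games - k)
    for k in range(min_wins, games + 1)
)

print(f"wins needed          = {min_wins}")
print(f"z                    = {z:.2f}")
print(f"normal approximation = {approx:.4f}")
print(f"exact binomial tail  = {exact:.4f}")
```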
