Chapter 4
Probability Distributions
Lesson 4-1/4-2
Random Variable
Probability Distributions
This chapter deals with the construction of probability distributions by combining the methods of descriptive
statistics from Chapter 2 with those of probability presented in Chapter 3.
Probability Distributions will describe what will probably happen instead of what
actually did happen.
Combining Descriptive Methods and Probabilities
In this chapter we will construct probability distributions by presenting possible outcomes along with the relative
frequencies we expect.
Definitions
A random variable is a variable (typically represented by X) that has a single
numerical value, determined by chance, for each outcome of a procedure.
A probability distribution is a graph, table, or formula that gives the probability for each value of the random variable.
2 Types of Random Variables
A discrete random variable has either a finite number of values or a countable number of values, where “countable” refers to the fact that there might be infinitely many values, but they result from a counting process.
A continuous random variable has infinitely many values, and those values can be associated with measurements on a continuous scale in such a way that there are no gaps or interruptions.
Example – Page 192, #2
Identify the given variable as being discrete or continuous.
A. The cost of making a randomly selected movie.
Discrete
B. The number of movies currently being shown in U.S. theaters.
Discrete
C. The exact running time of a randomly selected movie.
Continuous
D. The number of actors appearing in a randomly selected movie.
Discrete
E. The weight of the lead actor in a randomly selected movie.
Continuous
Graphs
The probability histogram is very similar to a relative
frequency histogram, but the vertical scale shows probabilities.
Requirements for a Discrete Probability Distribution
1. ΣP(x) = 1, where x assumes all possible values
2. 0 ≤ P(x) ≤ 1 for every individual value of x
Describing a Distribution
Center
Value that indicates where the middle of the data set is located.
Variation
Measures the amount that the values vary among themselves
Distribution
Shape of the distribution of data
Outliers
Sample values that lie very far away from the vast majority of the other sample values
Mean, Variance, and Standard Deviation of a Probability Distribution
Mean: μ = Σ[x · P(x)]
Variance: σ² = Σ[(x − μ)² · P(x)]
Variance (shortcut): σ² = Σ[x² · P(x)] − μ²
Standard deviation: σ = √(Σ[x² · P(x)] − μ²)
Roundoff Rule for μ, σ, and σ²
Round results by carrying one more decimal place than the number of decimal places used for the random variable x.
If the values of x are integers, round μ, σ, and σ² to one decimal place.
Example – Page 192, #4
Determine whether a probability distribution is given. If a probability distribution is described, find its mean and standard deviation.
When manufacturing DVDs for Sony, batches of DVDs are randomly selected and the number of defects x is found for each batch.
x    P(x)
0    0.502
1    0.365
2    0.098
3    0.011
4    0.001
ΣP(x) = 0.977 ≠ 1
Not a probability distribution
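The two requirements for a discrete probability distribution can be checked mechanically. The following is a minimal Python sketch, not part of the original lesson; the function name `is_probability_distribution` and the tolerance `tol` are assumptions introduced for illustration.

```python
import math

def is_probability_distribution(probs, tol=1e-9):
    """Check both requirements: each P(x) lies in [0, 1] and the P(x) sum to 1.

    tol is an assumed tolerance for floating-point rounding in the sum.
    """
    each_in_range = all(0.0 <= p <= 1.0 for p in probs)
    sums_to_one = math.isclose(sum(probs), 1.0, abs_tol=tol)
    return each_in_range and sums_to_one

# P(x) for x = 0..4 defects per batch, from the DVD example.
dvd = [0.502, 0.365, 0.098, 0.011, 0.001]
print(round(sum(dvd), 3), is_probability_distribution(dvd))
```

Because tables in the text are rounded, a looser tolerance may be appropriate when a sum such as 0.999 should still count as 1.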
Example – Page 192, #6
Determine whether a probability distribution is given. If a probability distribution is described, find its mean and standard deviation.
The Telektronic Company provides life insurance policies for its top four executives, and the random variable x is the number of those employees who live through next year.
x    P(x)
0    0.0000
1    0.0001
2    0.0006
3    0.0387
4    0.9606
ΣP(x) = 1
A probability distribution
x      P(x)      x · P(x)    x² · P(x)
0      0.0000    0.0000      0.0000
1      0.0001    0.0001      0.0001
2      0.0006    0.0012      0.0024
3      0.0387    0.1161      0.3483
4      0.9606    3.8424      15.3696
Sum    1.0000    3.9598      15.7204
μ = Σ[x · P(x)] = 3.9598 ≈ 4.0
σ² = Σ[x² · P(x)] − μ² = 15.7204 − (3.9598)² = 0.04038
σ = √σ² = √0.04038 = 0.2009 ≈ 0.2
Use the TI to find the mean, variance, and standard deviation.
Enter the x values and the P(x) values as lists, then use STAT → CALC.
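The same mean, variance, and standard deviation can be reproduced outside the calculator. This is a minimal Python sketch using the shortcut variance formula; the function name `describe` and the dictionary layout are assumptions made for illustration.

```python
import math

def describe(dist):
    """dist maps each value x to P(x); returns (mean, variance, std dev)."""
    mean = sum(x * p for x, p in dist.items())
    # Shortcut formula: sum of x^2 * P(x), minus the squared mean.
    variance = sum(x ** 2 * p for x, p in dist.items()) - mean ** 2
    return mean, variance, math.sqrt(variance)

# Number of executives (out of four) who live through next year.
telektronic = {0: 0.0000, 1: 0.0001, 2: 0.0006, 3: 0.0387, 4: 0.9606}
mu, var, sigma = describe(telektronic)
print(round(mu, 4), round(var, 5), round(sigma, 1))
```

Unrounded intermediate values are carried through, so the last digit of σ can differ slightly from a hand calculation that rounds σ² first.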
Identifying Unusual Results
According to the range rule of thumb, most values should lie within 2 standard deviations of the mean.
We can therefore identify “unusual” values by determining whether they lie outside these limits:
Maximum usual value = μ + 2σ
Minimum usual value = μ − 2σ
Identifying Unusual Results with Probabilities
Rare Event Rule
If, under a given assumption (such as the assumption that boys and girls are equally likely), the probability of a particular observed event (such as 13 girls in 14 births) is extremely small, we conclude that the assumption is probably not correct.
Unusually high
x successes among n trials is an unusually high number of successes if P(x or more) is very small (such as 0.05 or less).
Unusually low
x successes among n trials is an unusually low
number of successes if P(x or fewer) is very small (such as 0.05 or less).
Example – Page 194, #18
x (girls) P(x)
0 0.000
1 0.001
2 0.006
3 0.022
4 0.061
5 0.122
6 0.183
7 0.209
8 0.183
9 0.122
10 0.061
11 0.022
12 0.006
13 0.001
14 0.000
Assume that in a test of a gender-selection technique, a clinical trial results in 12 girls in 14 births. Refer to Table 4-1 and find the indicated probabilities.
A. Find the probability of exactly 12 girls in 14 births.
P(x = 12) = 0.006
B. Find the probability of 12 or more girls in 14 births.
P(x ≥ 12) = P(x = 12) + P(x = 13) + P(x = 14) = 0.006 + 0.001 + 0.000 = 0.007
C. Which probability is relevant for determining whether 12 girls in 14 births is unusually high: the result from part (A) or part (B)?
Part (B). The occurrence of 12 girls among 14 births would be unusually high if the probability of 12 or more girls is very small, and here 0.007 ≤ 0.05.
D. Does 12 girls in 14 births suggest that the gender-selection technique is effective? Why or why not?
Yes, because the probability of 12 or more girls is very small (0.007). The result of 12 or more girls could not easily occur by chance.
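The "x or more" and "x or fewer" probabilities used to judge unusual results are just sums over part of the table. Here is a minimal Python sketch using the Table 4-1 values listed in this example; the helper names and the default cutoff argument are assumptions for illustration, with 0.05 taken from the criterion stated earlier.

```python
# Table 4-1 as listed above: P(x) for x = 0..14 girls among 14 births.
table = [0.000, 0.001, 0.006, 0.022, 0.061, 0.122, 0.183, 0.209,
         0.183, 0.122, 0.061, 0.022, 0.006, 0.001, 0.000]

def prob_at_least(probs, x):
    """P(x or more): add the probabilities from index x upward."""
    return sum(probs[x:])

def prob_at_most(probs, x):
    """P(x or fewer): add the probabilities from index 0 through x."""
    return sum(probs[:x + 1])

def is_unusually_high(probs, x, cutoff=0.05):
    """Unusually high if P(x or more) is at or below the cutoff."""
    return prob_at_least(probs, x) <= cutoff

print(round(prob_at_least(table, 12), 3), is_unusually_high(table, 12))
```

The list index doubles as the number of girls, which only works because the table starts at x = 0 and has no gaps.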
Definition
The expected value of a discrete random variable is denoted by E, and it represents the average value of the outcomes.
It is obtained by finding the value of Σ[x · P(x)]:
E = Σ[x · P(x)]
Example – Page 192, #12
When you give the casino $5 for a bet on the number 7 in roulette, you have a 1/38 probability of winning $175 and a 37/38 probability of losing $5. If you bet $5 that the outcome is an odd number, the probability of winning $5 is 18/38, and the probability of losing $5 is 20/38.
A. If you bet $5 on the number 7, what is your expected value?
x      P(x)     x · P(x)
175    1/38     175/38
−5     37/38    −185/38
E = Σ[x · P(x)] = −10/38 = −0.263
The expected loss is $0.263 for a $5 bet.
B. If you bet $5 that the outcome is an odd number, what is your expected value?
x     P(x)     x · P(x)
5     18/38    90/38
−5    20/38    −100/38
E = Σ[x · P(x)] = −10/38 = −0.263
The expected loss is $0.263 for a $5 bet.
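Both roulette bets can be checked with exact fractions, which makes it clear that they share the same expected value. This is a minimal Python sketch; the function name `expected_value` and the list-of-pairs layout are assumptions for illustration.

```python
from fractions import Fraction

def expected_value(outcomes):
    """outcomes is a list of (x, P(x)) pairs; returns the sum of x * P(x)."""
    return sum(x * p for x, p in outcomes)

# Net gain in dollars paired with its probability, from the example.
bet_on_7 = [(175, Fraction(1, 38)), (-5, Fraction(37, 38))]
bet_on_odd = [(5, Fraction(18, 38)), (-5, Fraction(20, 38))]

for name, bet in [("number 7", bet_on_7), ("odd", bet_on_odd)]:
    e = expected_value(bet)
    print(name, e, round(float(e), 3))
```

Using `Fraction` avoids rounding until the final display, so −10/38 appears in lowest terms as −5/19.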
Lesson 4-3
Binomial Probability Distribution
Binomial Probability Distribution
Specific type of discrete probability distribution
The outcomes belong to two categories
pass or fail
acceptable or defective
success or failure
Example of a Binomial Distribution
Suppose a cereal manufacturer puts pictures of famous athletes on cards in boxes of cereal, in the hope of increasing sales. The manufacturer announces that 20% of the boxes contain a picture of Tiger Woods, 30% a picture of Lance Armstrong, and the rest a picture of Serena Williams.
You buy 5 boxes of cereal. What’s the probability you get exactly 2 pictures of Tiger Woods?
Requirements for a Binomial Probability Distribution
1) The procedure has a fixed number of trials.
2) The trials must be independent. (The outcome of any individual trial doesn’t affect the probabilities in the other trials.)
3) Each trial must have all outcomes classified into two categories.
4) The probabilities must remain constant for each trial.
Example – Page 203, #4
Determine whether the given procedure results in a binomial distribution.
Rolling a loaded die 50 times and finding the number of times that 5 occurs.
Binomial
Notation for a Binomial Distribution
Let n denote the fixed number of trials
Let p denote the probability of success
Let q denote the probability of failure
q = 1 – p
Let x denote the number of successes in n independent trials, where 0 ≤ x ≤ n
Let P(x) denote the probability of getting exactly x successes among the n trials
Important Hints
Be sure that x and p both refer to the same category being called a success.
When sampling without replacement, the events can be treated as if they were independent if the sample size is no more than 5% of the population size, that is, if n ≤ 0.05N.
Mathematical Symbols
Phrase Math Symbols
“at least” or “no less than” ≥
“more than” or “greater than” >
“fewer than” or “less than” <
“no more than” or “at most” ≤
“exactly” =
Methods for Finding Probabilities of a Binomial Distribution
Using the Binomial Probability Formula
Using Table A-1 in Appendix A
Using the TI
Binomial Probability Formula
P(x) = n! / [(n − x)! x!] · p^x · q^(n − x)
n! / [(n − x)! x!] is the number of outcomes with exactly x successes among n trials
p^x · q^(n − x) is the probability of x successes among n trials for any particular order
Equivalently, P(x) = nCx · p^x · q^(n − x)
for x = 0, 1, 2, …, n
where
n = number of trials
x = number of successes among n trials
p = probability of success in any one trial
q = probability of failure in any one trial (q = 1 – p)
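The formula translates directly into code. This is a minimal Python sketch that uses the standard-library `math.comb` for nCx; the function name `binomial_pmf` is an assumption for illustration, and the values n = 5 and p = 0.2 come from the cereal problem stated earlier.

```python
from math import comb

def binomial_pmf(n, p, x):
    """P(x) = nCx * p^x * q^(n - x), with q = 1 - p."""
    q = 1 - p
    return comb(n, x) * p ** x * q ** (n - x)

# Exactly 2 Tiger Woods pictures in 5 boxes when 20% of boxes contain one.
print(round(binomial_pmf(5, 0.2, 2), 4))
```

Summing `binomial_pmf(n, p, x)` over x = 0, 1, …, n should give 1, which is a quick sanity check on any implementation.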
Example 1 – Cereal
Suppose you buy 5 boxes of cereal, so n = 5 and p = 0.2. What’s the probability you get exactly 2 pictures of Tiger Woods?
First find the number of outcomes with exactly 2 successes.
5C2 = 5! / [2! (5 − 2)!] = 10
There are 10 ways to get 2 Tiger pictures in 5 boxes.
P(X = 2) = 10(0.20)²(0.80)³ = 0.2048
TI: 2nd/Vars
The following table shows the probability distribution function (p.d.f.) for the binomial random variable X.
X = Tiger    P(X)       TI
0            0.32768    binompdf(5, 0.20, 0)
1            0.4096     binompdf(5, 0.20, 1)
2            0.2048     binompdf(5, 0.20, 2)
3            0.0512     binompdf(5, 0.20, 3)
4            0.0064     binompdf(5, 0.20, 4)
5            0.00032    binompdf(5, 0.20, 5)
Sum          1
The following table shows the cumulative distribution function (c.d.f.) for the binomial random variable X.
X    P(X = x)    cdf: P(X ≤ x)    TI
0    0.32768     0.32768          binomcdf(5, 0.20, 0)
1    0.4096      0.73728          binomcdf(5, 0.20, 1)
2    0.2048      0.94208          binomcdf(5, 0.20, 2)
3    0.0512      0.99328          binomcdf(5, 0.20, 3)
4    0.0064      0.99968          binomcdf(5, 0.20, 4)
5    0.00032     1                binomcdf(5, 0.20, 5)
Construct a histogram of the pdf and a histogram of the cdf using the window X[0, 6] with scale 1 and Y[0, 1] with scale 0.01.
[Figure: histograms of the pdf and the cdf]
TI – Binomial Probability
Computing exact probabilities
2nd/Vars/binompdf
• binompdf(n, p, x)
pdf: probability distribution function
Computing less than or equal to probabilities
2nd/Vars/binomcdf
• binomcdf(n, p, x)
cdf: cumulative distribution function
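For readers without a TI calculator, the two functions can be imitated in a few lines. This is a minimal Python sketch; `binompdf` and `binomcdf` here are hypothetical Python helpers named after the calculator functions, not calls into any TI software, and they take arguments in the same (n, p, x) order.

```python
from math import comb

def binompdf(n, p, x):
    """Probability of exactly x successes in n trials."""
    return comb(n, x) * p ** x * (1 - p) ** (n - x)

def binomcdf(n, p, x):
    """Probability of x or fewer successes: sum the pdf from 0 through x."""
    return sum(binompdf(n, p, k) for k in range(x + 1))

# Cereal distribution (n = 5, p = 0.20): pdf and cdf side by side.
for x in range(6):
    print(x, round(binompdf(5, 0.20, x), 5), round(binomcdf(5, 0.20, x), 5))

# "At least 2" is the complement of "at most 1".
at_least_2 = 1 - binomcdf(5, 0.20, 1)
print(round(at_least_2, 5))
```

Since the cdf only answers "less than or equal to" questions, phrases such as "at least" or "more than" are handled by subtracting a cdf value from 1.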
Example – Page 204, #18
Use the Binomial Probability Formula. Assume that a procedure yields a binomial distribution.
n = 6, x = 2, p = 0.45
P(x) = nCx · p^x · q^(n − x)
P(x = 2) = 6C2 · (0.45)² · (0.55)⁴ = 15(0.45)²(0.55)⁴ = 0.2779 ≈ 0.278
Binomial Probability Table
Example – Page 204, #12
Using Table A-1. Assume that a procedure yields a binomial distribution.
n = 7, x = 2, p = 0.01
P(x = 2) = 0.002
Example – Page 204, #14
Using Table A-1. Assume that a procedure yields a binomial distribution.
n = 6, x = 5, p = 0.99
P(x = 5) = 0.057
Example - Nielsen Media
According to Nielsen Media Research, 75% of US households have cable TV.
A). In a random sample of 15 households, what is the probability that exactly 10 have cable?
x = number of households with cable, p = 0.75, n = 15
P(x = 10) = binompdf(15, 0.75, 10) = 0.1651
The probability of getting exactly 10 households out of 15 with cable is 0.1651. In 100 trials of this experiment we would expect about (100)(0.1651) ≈ 17 trials to result in 10 households with cable.
B). In a random sample of 15 households, what is the probability that fewer than 13 have cable?
P(x < 13) = P(x = 0) + P(x = 1) + ... + P(x = 12) = P(x ≤ 12)
P(x ≤ 12) = binomcdf(15, 0.75, 12) = 0.7639
There is a 0.7639 probability that in a random sample of 15 households, fewer than 13 will have cable.
In 100 trials of this experiment, we would expect about (0.7639)(100) ≈ 76 trials to result in fewer than 13 households that have cable.
C). In a random sample of 15 households, what is the probability that at least 13 have cable?
P(x ≥ 13) = P(x = 13) + P(x = 14) + P(x = 15) = 1 − P(x ≤ 12)
= 1 − binomcdf(15, 0.75, 12) = 0.2361
There is a 0.2361 probability that in a random sample of 15 households, at least 13 will have cable. In 100 trials of this experiment we would expect about (100)(0.2361) ≈ 24 trials to result in at least 13 households that have cable.
Example – Page 205, #28
An article in USA Today stated that “Internal surveys paid by the directory assistance providers show that even the most accurate companies give out wrong numbers 15% of the time.” Assume that you are
testing such a provider by making 10 requests and also assume that the provider gives the wrong telephone
number 15% of the time.
A). Find the probability of getting one wrong number.
x = the number of wrong numbers, n = 10, p = 0.15
P(x = 1) = binompdf(10, 0.15, 1) = 0.347
B). Find the probability of getting at most one wrong number.
P(x ≤ 1) = P(x = 0) + P(x = 1) = binomcdf(10, 0.15, 1) = 0.544
C). If you do get at most one wrong number, does it appear that the rate of wrong numbers is not 15%, as claimed?
No. Since 0.544 > 0.05, “at most one wrong number” is not an unusual occurrence when the error rate is 15%.
Example – Page 206, #32
A study was conducted to determine whether there were significant differences between medical students admitted through special programs (such as affirmative action) and medical students admitted through the regular admission criteria. It was found that the graduation rate was 94% for the medical students admitted through special programs.
A). If 10 students from the special programs are randomly selected, find the probability that at least nine of them graduated.
x = the number of special-program students who graduate, n = 10, p = 0.94
P(x ≥ 9) = P(x = 9) + P(x = 10) = 1 − P(x ≤ 8)
= 1 − binomcdf(10, 0.94, 8) = 0.8824
B). Would it be unusual to randomly select 10 students from the special programs and get only seven that graduate? Why or why not?
P(x = 7) = binompdf(10, 0.94, 7) = 0.01681
Yes. Since 0.017 < 0.05, getting only 7 graduates would be an unusual result.
Lesson 4-4
Binomial Distribution: Mean, Variance, and Standard Deviation
Any Discrete Probability Distribution: Formulas
Mean: μ = Σ[x · P(x)]
Variance: σ² = Σ[x² · P(x)] − μ²
Std. Dev.: σ = √(Σ[x² · P(x)] − μ²)
Binomial Distribution: Formulas
Mean: μ = n · p
Variance: σ² = n · p · q
Std. Dev.: σ = √(n · p · q)
Where
n = number of fixed trials
p = probability of success in one of the n trials
q = probability of failure in one of the n trials
Interpretation of Results
The range rule of thumb suggests that values are unusual if they lie outside of these limits:
Maximum usual value = μ + 2σ
Minimum usual value = μ – 2σ
Example – Page 210, #2
Finding μ, σ, and Unusual Values. Assume that a procedure yields a binomial distribution with n trials and the probability of success for one trial is p.
n = 250, p = 0.45
μ = np = 250(0.45) = 112.5
σ = √(npq) = √((250)(0.45)(1 − 0.45)) = 7.87
Minimum usual value = μ − 2σ = 112.5 − 2(7.87) = 96.8
Maximum usual value = μ + 2σ = 112.5 + 2(7.87) = 128.2
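The binomial shortcuts and the range rule of thumb combine into one small calculation. This is a minimal Python sketch; the function name `binomial_summary` and its tuple return value are assumptions for illustration.

```python
from math import sqrt

def binomial_summary(n, p):
    """Return (mean, std dev, minimum usual value, maximum usual value)."""
    mu = n * p
    sigma = sqrt(n * p * (1 - p))
    # Range rule of thumb: usual values lie within 2 standard deviations.
    return mu, sigma, mu - 2 * sigma, mu + 2 * sigma

mu, sigma, low, high = binomial_summary(250, 0.45)
print(round(mu, 1), round(sigma, 2), round(low, 1), round(high, 1))
```

A result x would be flagged as unusual when `x < low` or `x > high`.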
Example – Page 210, #6
Several students are unprepared for a multiple-choice quiz with 10 questions, and all of their answers are guesses.
Each question has five possible answers, and only one of them is correct.
A). Find the mean and standard deviation for the number of correct answers for such students.
n = 10, p = 1/5 = 0.20, q = 1 − 0.20 = 0.80
μ = np = 10(0.20) = 2.0
σ = √(npq) = √((10)(0.20)(0.80)) = 1.265 ≈ 1.3
B). Would it be unusual for a student to pass by guessing and getting at least 7 correct answers? Why or why not?
Maximum usual value = μ + 2σ = 2.0 + 2(1.3) = 4.6
Yes. Since 7 is greater than the maximum usual value of 4.6, it would be unusual for a student to pass by getting at least 7 correct answers.
Example – Page 210, #10
The Central Intelligence Agency has specialists who analyze the frequencies of letters of the alphabet in an attempt to decipher intercepted messages. In standard English text, the letter r is used at a rate of 7.7%.
A). Find the mean and standard deviation for the number of times the letter r will be found on a typical page of 2600 characters.
n = 2600, p = 0.077, q = 1 − 0.077 = 0.923
μ = np = 2600(0.077) = 200.2
σ = √(npq) = √((2600)(0.077)(0.923)) = 13.6
B). In the intercepted message sent to Iraq, a page of 2600 characters is found to have the letter r occurring 175 times. Is this unusual?
Unusual values lie outside the boundaries μ ± 2σ:
200.2 ± 2(13.6) gives 173.0 and 227.4
No. Since 175 is within these limits, it would not be considered an unusual result.
Example – Page 212, #16
Car Crashes: For drivers in the 20 – 24 age bracket, there is a 34% rate of car accidents in one year (based on the data from the National Safety Council). An insurance investigator finds that in a group of 500 randomly selected drivers age
20 – 24 living in New York City, 42% had accidents in the last year.
A) How many drivers in the New York City group of 500 had accidents in the last year?
(500)(0.42) = 210
B) Assuming that the same 34% rate applies to New York City, find the mean and standard deviation for the number of people in groups of 500 that can be expected to have accidents.
μ = np = 500(0.34) = 170.0
σ = √(npq) = √(500(0.34)(1 − 0.34)) = 10.6
C) Based on the preceding results, does the 42% result for the New York City drivers appear to be unusually high when compared to the 34% rate for the general population? Does it appear that higher insurance rates for New York City drivers are justified?
Maximum usual value = μ + 2σ = 170 + 2(10.6) = 191.2
Yes. Since 210 is greater than 191.2, the result of 210 accidents for NYC drivers is unusually high if the 34% rate for the general population applies. Yes, it appears that the higher rates for NYC drivers are justified.
Lesson 4-5
The Poisson Distribution
Examples of Poisson Distribution
Planes arriving at an airport
Arrivals of people in a line
Cars pulling into a gas station
Definition
The Poisson distribution is a discrete probability distribution that applies to occurrences of some event over a specified interval. The random variable x is the number of occurrences of the event in an interval. The interval can be time, distance, area, volume, or some similar unit.
Formula
P(x) = (μ^x · e^(−μ)) / x!, where e ≈ 2.71828
Requirements for a Poisson Distribution
Random variable (x) is the number of occurrences of an event over some interval.
Occurrences must be random
Occurrences must be independent of each other
Occurrences must be uniformly distributed over the interval being used.
The mean is μ.
The standard deviation is σ = √μ.
Difference from a Binomial Distribution
The binomial distribution is affected by the sample size n and the probability p, whereas the Poisson distribution is affected only by the mean μ.
In a binomial distribution the possible values of the random variable x are 0, 1, …, n, but a Poisson distribution has possible x values of 0, 1, 2, …, with no upper limit.
Example – Page 216, #2
Use a Poisson Distribution to find probability.
If μ = 0.5, find P(2).
P(x) = (μ^x · e^(−μ)) / x!
P(2) = (0.5² · e^(−0.5)) / 2! = 0.0758
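The Poisson formula is short enough to evaluate directly. This is a minimal Python sketch using only the standard library; the function name `poisson_pmf` is an assumption for illustration.

```python
from math import exp, factorial

def poisson_pmf(mu, x):
    """P(x) = mu^x * e^(-mu) / x! for x = 0, 1, 2, ..."""
    return mu ** x * exp(-mu) / factorial(x)

# The example above: mu = 0.5, x = 2.
print(round(poisson_pmf(0.5, 2), 4))
```

Unlike the binomial case, there is no largest x, so a "sum to 1" check can only be approximated by adding terms until they become negligible.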
Example – Page 216, #6
Currently, 11 babies are born in the village of Westport (population 760) each year.
A). Find the mean number of births per day.
μ = 11/365 = 0.0301
B). Find the probability that on a given day, there are no births.
TI: 2nd/Vars/C:poissonpdf(μ, x), with μ = 11/365 and x = number of births
P(x = 0) = poissonpdf(11/365, 0) = 0.9703
C). Find the probability that on a given day, there is at least one birth.
P(x ≥ 1) = P(x = 1) + P(x = 2) + ... = 1 − P(x = 0)
= 1 − poissonpdf(11/365, 0) = 0.0297
D). Based on the preceding results, should medical birthing personnel be on permanent standby, or should they be called in as needed? Does this mean that Westport mothers might not get the immediate medical attention they would be likely to get in a more populated area?
The personnel should be called as needed. Yes; this does mean that women giving birth might not get
the immediate attention available in more populated areas.
Example – Page 216, #8
Homicide Deaths: In one year, there were 116 homicide deaths in Richmond, Virginia (Based on “A Classroom Note on the
Poisson Distribution: A Model for Homicidal Deaths in Richmond, VA for 1991,” in Mathematics and Computer
Education, by Winston A. Richards). For a randomly selected day, find the probability that the number of homicide deaths is
a. 0
b. 1
c. 2
d. 3
e. 4
Compare the calculated probabilities to these actual results:
268 days (no homicides); 79 days (1 homicide); 17 days (2 homicides); 1 day (3 homicides); no days with more than 3 homicides.
Let x = the number of homicides per day
μ = 116/365 = 0.3178
a. P(x = 0) = poissonpdf(0.3178, 0) = 0.7277
b. P(x = 1) = poissonpdf(0.3178, 1) = 0.2313
c. P(x = 2) = poissonpdf(0.3178, 2) = 0.0368
d. P(x = 3) = poissonpdf(0.3178, 3) = 0.0039
e. P(x = 4) = poissonpdf(0.3178, 4) = 0.0003
x            Frequency    Relative frequency    P(x)
0            268          0.7342 (= 268/365)    0.7277
1            79           0.2164                0.2313
2            17           0.0466                0.0368
3            1            0.0027                0.0039
4 or more    0            0.0000                0.0003 (by subtraction)
Total        365          1.000                 1.0000
The agreement between the observed relative frequencies and the probabilities predicted by the Poisson formula is very good.
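The comparison between observed relative frequencies and Poisson probabilities can be reproduced from the day counts alone. This is a minimal Python sketch; the variable names and the dictionary layout are assumptions for illustration, while the counts are the ones stated in the problem.

```python
from math import exp, factorial

def poisson_pmf(mu, x):
    """P(x) = mu^x * e^(-mu) / x!"""
    return mu ** x * exp(-mu) / factorial(x)

# Observed number of days with x homicides (from the problem statement).
observed_days = {0: 268, 1: 79, 2: 17, 3: 1}
days = sum(observed_days.values())                      # days in the year
deaths = sum(x * d for x, d in observed_days.items())   # total homicides
mu = deaths / days                                      # mean per day

for x, d in observed_days.items():
    print(x, d, round(d / days, 4), round(poisson_pmf(mu, x), 4))
```

Multiplying each Poisson probability by 365 gives the expected number of days, which some readers find easier to compare against the observed counts.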
Poisson as Approximation to Binomial
The Poisson distribution is sometimes used to approximate the binomial distribution when n is large and p is small.
Rule of thumb
n ≥ 100
np ≤ 10
If the two conditions are satisfied, we can use the Poisson distribution as an approximation to the binomial distribution, where μ = np.
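To see how close the approximation is, the two distributions can be compared side by side. This is a minimal Python sketch; the values n = 1000 and p = 0.002 are hypothetical numbers chosen only because they satisfy the rule of thumb, and all function names are assumptions for illustration.

```python
from math import comb, exp, factorial

def binomial_pmf(n, p, x):
    """Exact binomial probability of x successes in n trials."""
    return comb(n, x) * p ** x * (1 - p) ** (n - x)

def poisson_pmf(mu, x):
    """Poisson probability of x occurrences with mean mu."""
    return mu ** x * exp(-mu) / factorial(x)

def poisson_rule_of_thumb(n, p):
    """True when n >= 100 and np <= 10."""
    return n >= 100 and n * p <= 10

n, p = 1000, 0.002   # hypothetical values, not from the text
mu = n * p
if poisson_rule_of_thumb(n, p):
    for x in range(5):
        print(x, round(binomial_pmf(n, p, x), 4), round(poisson_pmf(mu, x), 4))
```

With these values the two columns agree to about three decimal places, which is the sense in which the Poisson distribution approximates the binomial.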