UNIT 4 - PROBABILITY
DISTRIBUTIONS
■ Random Variables & Probability Distributions
■ Binomial Distribution
RANDOM VARIABLES AND
PROBABILITY
DISTRIBUTIONS
■ Random Variables
■ Probability Distributions
Random Variables - Definition
⚫ A random variable is a variable whose value is determined by the outcome of a random experiment.
⚫ Typically we use capital letters to represent random variables, and lower case letters to represent particular values they can take.
■ Eg : The random variable X represents the face of an unbiased
Random Variables
• There are two types of random variables
• Discrete Random Variables – can assume only countable values
• Examples
• Number of chairs
• Number of lotto winners
• Continuous Random Variables – can assume any value in an interval
• Examples
• weight
• height
Probability Distributions
• A probability distribution is:
• A mathematical function that assigns probabilities to the values of a random variable
• Eg:
• A listing of all the outcomes of an experiment and the probability of each outcome
• Eg:
• A function which describes how probabilities are distributed over the values of a random variable
• Eg: We roll two dice. Let the random variable X = “sum of the
two faces”
X 0 1 2
Properties of Probability
Distributions
• Each probability MUST be greater than or equal to zero
• f(x) ≥ 0 or P(X = x) ≥ 0
• ie. Probabilities cannot be negative
• The sum of ALL probabilities MUST be equal to 1
Discrete vs. Continuous Probability
Distributions
• The discrete probability distribution lists all possible values that the variable can assume and their corresponding probabilities.
• For a continuous random variable, it is
Example – Discrete Probability
Distributions
⚫ Example 1:
⚫ If we roll two dice and let the random variable X be the sum of the values facing up
a. Calculate the probability distribution of X
b. Determine the probability of X = 5
c. Determine the probability of the sum of the
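Parts (a) and (b) can be checked by listing every outcome. This is a minimal sketch, assuming two fair six-sided dice (the slide does not state fairness); the names `two_dice_distribution` and `dist` are illustrative only.

```python
from fractions import Fraction
from itertools import product


def two_dice_distribution():
    """Return {sum: probability} for the sum of two fair six-sided dice."""
    counts = {}
    for a, b in product(range(1, 7), repeat=2):
        counts[a + b] = counts.get(a + b, 0) + 1
    # Assumption: each of the 36 ordered outcomes is equally likely.
    return {s: Fraction(c, 36) for s, c in sorted(counts.items())}


dist = two_dice_distribution()
for s, p in dist.items():
    print(s, p)

# Both properties of a probability distribution hold.
assert all(p >= 0 for p in dist.values())
assert sum(dist.values()) == 1

print("P(X = 5) =", dist[5])  # 4 of the 36 outcomes sum to 5
```

Exact fractions are used instead of decimals so that the "probabilities sum to 1" check is not affected by rounding.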
Example – Discrete Probability
Distributions
•Example 2:
•Let
•be the probability distribution of X.
Summarising Probability Distributions
• Probability distributions of random variables are
summarized in various ways, using the key ideas of centre, spread, and shape.
• The most important measures are:
• Centre — the mean or expected value (”expectation”)
• Spread — the variance and the standard deviation
Expected Value
• The expected value of a random variable is simply the mean of the random variable
• It is denoted as E(X) – read ‘the expectation of X’
• Calculating the expected value:
• Formula: E(X) = Σ x P(X = x) or Σ x f(x)
• Translation: multiply each value of x by its corresponding probability and sum the results
Variance
• The variance of a random variable measures
the spread of the distribution
• It is denoted by V(X) or Var(X)
• Calculating V(X):
• Formula: V(X) = Σ x² P(X = x) – [Σ x P(X = x)]²
• Translation:
● multiply the square of each x value by its corresponding probability and sum the results
● then subtract the square of the expected value
Example – Expected Value & Variance
• Example 1:
• Let x be the number of magazines a person reads every
week. Based on a sample survey of adults, the following probability distribution table was prepared:
• Calculate the mean and standard deviation.
x 0 1 2 3 4 5
Example – Expected Value & Variance
• Example 2:
• Loraine Corporation is planning to market a new
makeup product. According to the analysis made by the financial department of the company, it will earn an annual profit of $4.5 million if this product has high sales, an annual profit of $1.2 million if the sales are mediocre and it will lose $2.3 million if the sales are low. The probabilities of these
three scenarios are 0.32, 0.51 and 0.17 respectively.
• Calculate the mean and standard deviation of the
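One way to work Example 2 is sketched below. It assumes the quantity asked for is the annual profit in $ millions, with the loss entered as a negative value, and uses V(X) = Σ x² P(X = x) – [Σ x P(X = x)]². The helper names are illustrative, not part of the slides.

```python
from math import sqrt

# Profit in $ millions for high, mediocre and low sales (a loss is negative).
profits = [4.5, 1.2, -2.3]
probs = [0.32, 0.51, 0.17]


def expected_value(xs, ps):
    """E(X): each value times its probability, summed."""
    return sum(x * p for x, p in zip(xs, ps))


def variance(xs, ps):
    """V(X): E(X^2) minus the square of E(X)."""
    mean = expected_value(xs, ps)
    return sum(x * x * p for x, p in zip(xs, ps)) - mean ** 2


mean = expected_value(profits, probs)
sd = sqrt(variance(profits, probs))
print(f"E(X) = {mean:.3f} million, SD = {sd:.3f} million")
```

Under these assumptions the mean is about $1.661 million and the standard deviation about $2.314 million.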
BINOMIAL DISTRIBUTION
(DISCRETE PROBABILITY
DISTRIBUTION)
■ What is the Binomial Distribution (BD)?
The Binomial Distribution
• One of the most widely used Discrete Probability
Distributions
• Applied to find the probability that an event occurs
x times in n performances of an experiment.
• To apply the Binomial distribution the random
The Binomial Distribution
Binomial Distribution – Example I
• Individuals with a certain gene have a 0.70 probability of eventually contracting a certain disease. What is the probability that 5 out of 10 randomly selected persons with the gene will have the disease?
• This is a Binomial Distribution. Why?
• 10 identical trials (testing for the disease)
• Each trial has only 2 possible outcomes (has the disease, does not have the disease)
Binomial Distribution – Example II
• Suppose a survey in a particular country indicated that 9 out of 10 cars carry liability insurance. If four cars in that country are involved in accidents what is the probability that …
• This is a Binomial Distribution. Why?
• 4 identical trials (4 cars in accidents)
• Each trial has only 2 possible outcomes (have liability insurance, does not have liability insurance)
• We assume a constant probability (9 out of 10 – 9/10 or 0.9)
Binomial Distribution
• The formula used to calculate the probability for the binomial distribution is:
P(X = x) = nCx · p^x · q^(n-x)
• Where :
• x = number of successes of interest
• p = probability of success
• n = number of trials
• q = 1 – p (probability of failure)
• n-x = number of failures
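The following is a minimal sketch of the standard binomial formula P(X = x) = nCx · p^x · q^(n-x), applied to Example I (5 of 10 persons, p = 0.70). The function name `binomial_pmf` is illustrative; `math.comb` from the Python standard library supplies nCx.

```python
from math import comb


def binomial_pmf(x, n, p):
    """P(X = x) = nCx * p^x * q^(n - x), with q = 1 - p."""
    q = 1 - p
    return comb(n, x) * p ** x * q ** (n - x)


# Example I: 5 of 10 persons, each with probability 0.70 of the disease.
prob = binomial_pmf(5, 10, 0.70)
print(round(prob, 4))
```

The three factors in the return line correspond to the number of ways to choose which trials succeed, the probability of x successes, and the probability of n-x failures.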
What is nCx? - I
• nCx is calculated using: nCx = n! / (x! (n-x)!)
• n denotes the total number of elements
• x denotes the number of elements selected per selection
What is nCx? - II
• n! – ‘n factorial’
• n! = n x (n-1) x (n-2) x (n-3) x . . . . x 3 x 2 x 1
• Example 1: 4! = 4 x 3 x 2 x 1 = 24
Playing Lotto - I
• Write down any six numbers between 0 & 36
inclusive
• Watch the draw . . ..
• http://www.youtube.com/watch?v=7fdBLzoIOjs&feature=r
Playing Lotto - II
• Calculate the total number of combinations of any 6 numbers between 0 and 36
• It would be: 37C6 = 37! / (6! × 31!) = 2,324,784 (0 to 36 inclusive gives 37 numbers)
• To ensure you win the lotto on a certain day you would have to buy every possible combination – ie 2,324,784 tickets.
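The ticket count can be verified with a short sketch. It computes nCx from factorials as n! / (x! (n-x)!) and cross-checks against the standard-library `math.comb`; `n_choose_x` is an illustrative name.

```python
from math import comb, factorial


def n_choose_x(n, x):
    """nCx = n! / (x! * (n - x)!), computed from factorials."""
    return factorial(n) // (factorial(x) * factorial(n - x))


# Numbers 0 to 36 inclusive give 37 possible numbers; a ticket picks 6.
tickets = n_choose_x(37, 6)
assert tickets == comb(37, 6)  # agrees with the standard-library helper
print(f"{tickets:,}")  # 2,324,784
```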
Binomial Distribution
⚫ NB Formula: P(X = x) = nCx · p^x · q^(n-x)
⚫ Step 1: Calculate nCx
⚫ Step 2: Calculate p^x
⚫ Step 3: Calculate q^(n-x)
Back to the Binomial Distribution
• Reminders:
• Only 2 outcomes: (i) success or (ii) failure
• Known probability of success
• Probability Distribution Function:
• n = number of trials
• p = probability of success
Example I - Binomial Distribution
• Suppose that the probability of breaking your
leg at a ski lodge is 0.20
1. What is the probability that exactly 2 out of 5 people
will break their leg?
2. What is the probability that more than 2 people out of 5
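The two ski-lodge questions can be sketched as follows, assuming question 2 also concerns people breaking a leg (the slide text is cut off). "More than 2" is handled by adding the probabilities for X = 3, 4 and 5. `binomial_pmf` is an illustrative helper.

```python
from math import comb


def binomial_pmf(x, n, p):
    """P(X = x) for a binomial random variable."""
    return comb(n, x) * p ** x * (1 - p) ** (n - x)


n, p = 5, 0.20

exactly_two = binomial_pmf(2, n, p)
# "More than 2" means X = 3, 4 or 5.
more_than_two = sum(binomial_pmf(x, n, p) for x in range(3, n + 1))

print(round(exactly_two, 4), round(more_than_two, 4))
```

An equivalent route for the second part is 1 minus P(X ≤ 2), which needs the same number of terms here but is shorter when n is large.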
Example II - Binomial Distribution
1. What is the probability of obtaining exactly 4
heads in 6 flips of a fair coin?
2. What is the probability of obtaining 4 or more
Example III - Binomial Distribution
• Express House Delivery Service guarantees a refund of all charges if delivery does not arrive by the specified time. It is known that 2% of packages do not arrive by the specified time. Suppose a corporation mails 10 packages on a certain day.
• A) Find the probability that exactly one of these ten
packages will not arrive by the specified time.
• B) Find the probability that at most one of these ten
Expectation & Variance – Binomial
Distribution
• If the random variable X is binomially distributed
then:
• E(X) = np
• Var(X) = npq or np(1-p)
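As a check on these shortcuts, the sketch below computes the mean and variance directly from the definitions Σ x P(X = x) and Σ x² P(X = x) – [E(X)]² and compares them with np and npq. The values n = 10 and p = 0.30 are borrowed from the TV example that follows; the variable names are illustrative.

```python
from math import comb

n, p = 10, 0.30  # values borrowed from the TV re-possession example
q = 1 - p

# Full binomial distribution: P(X = x) for x = 0, 1, ..., n.
pmf = [comb(n, x) * p ** x * q ** (n - x) for x in range(n + 1)]

mean_from_definition = sum(x * px for x, px in enumerate(pmf))
var_from_definition = (
    sum(x * x * px for x, px in enumerate(pmf)) - mean_from_definition ** 2
)

print(mean_from_definition, n * p)      # both routes give the mean
print(var_from_definition, n * p * q)   # both routes give the variance
```

The two routes agree up to floating-point rounding, which is why the shortcut formulas can be used whenever X is binomial.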
Example – Binomial Expectation &
Variance
• Suppose experience has shown that 30% of the
television sets sold on hire purchase at a certain store are eventually re-possessed. If 10 TV sets are sold during a week, determine:
NORMAL DISTRIBUTION
(CONTINUOUS PROBABILITY
DISTRIBUTION)
■ What is the Normal Distribution (ND)?
■ The Standard Normal Distribution & Its Probabilities
■ How to Calculate General Probabilities for the ND
■ The Normal Approximation of the Binomial Distribution
The Normal Distribution
• The normal distribution is a continuous probability distribution
• It is symmetric and bell-shaped
• The mean = median = mode
• Shape of curve depends on population
The Normal Distribution
• Center of distribution is μ - the mean
• Spread is determined by σ - the standard deviation
• 50% of the scores lie above the mean and 50% lie below the mean
• Total area under the curve is 1
General Normal Distributions
• The mean and standard deviation affect the shape of the normal distribution
[Figure: normal curves labelled “Standard Normal” and “Smaller Standard Deviation”]
The Standard Normal Distribution
• This is a special case of the normal distribution
• Mean = μ = 0
• Standard Deviation = σ = 1
• The Normal Distribution Table calculates these
Distribution
• The units for the standard normal distribution
curve are denoted by z and are called z values
• The z-value gives the distance between the mean
and the point represented by z in terms of the standard deviation
The Standard Normal Tables
• The tables give you the area to the left of any positive value
• The area under the curve is equal to 1
• The area to the right is therefore 1 minus the area to the left
Normal Tables cont’d
Steps to Calculate Normal
Probabilities
⚫ Step 1: Sketch the normal distribution and indicate the mean (the middle) of the random variable X
⚫ Step 2: Shade the area you want to find
⚫ Step 3: Find the corresponding area to the right of the mean (if needed)
Example I – Normal Distribution
1. What is the probability of X being less than 1.5?
2. What is the probability of X being greater than 1.5?
3. What is the probability of X being less than -0.67?
4. What is the probability of X being greater than
Example II – Normal Distribution
5. What is the proportion that falls between 0.5 and 1.5?
6. What is the proportion that falls between -0.5 and 1.5?
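The table look-ups in Examples I and II can be cross-checked in code. This sketch assumes X is standard normal (mean 0, standard deviation 1), which the slides imply but do not state, and uses `statistics.NormalDist` from the Python standard library in place of the printed table.

```python
from statistics import NormalDist

# Assumption: X in these examples follows the standard normal distribution.
Z = NormalDist(mu=0, sigma=1)

p_less_1_5 = Z.cdf(1.5)                     # area to the left of 1.5
p_greater_1_5 = 1 - Z.cdf(1.5)              # area to the right of 1.5
p_less_neg_0_67 = Z.cdf(-0.67)              # area to the left of -0.67
p_between_0_5_and_1_5 = Z.cdf(1.5) - Z.cdf(0.5)
p_between_neg_0_5_and_1_5 = Z.cdf(1.5) - Z.cdf(-0.5)

for label, value in [
    ("P(X < 1.5)", p_less_1_5),
    ("P(X > 1.5)", p_greater_1_5),
    ("P(X < -0.67)", p_less_neg_0_67),
    ("P(0.5 < X < 1.5)", p_between_0_5_and_1_5),
    ("P(-0.5 < X < 1.5)", p_between_neg_0_5_and_1_5),
]:
    print(f"{label} = {value:.4f}")
```

`cdf` always returns the area to the left, so areas to the right and areas between two values are built by subtraction, exactly as with the table.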
Standardization
• To calculate probabilities from a normal
distribution with μ ≠ 0 and σ ≠ 1 (ie. not standard
normal) we have to standardize the distribution
• Once we know the mean and standard deviation
this can be done
• The formula is: z = (x – μ) / σ
• After the standardization process is complete (ie.
Example - Standardisation
• Assume that the length of time, X, between charges of a cellular phone is normally distributed with a mean of 10 hours and a standard deviation of 1.5 hours
Find the probability that the cellular phone will last for:
a. Less than 13 hours between charges
b. More than 12.5 hours between charges
c. Between 8 and 12 hours between charges
d. Between 6 and 9 hours between charges
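A sketch of the four parts, standardising with z = (x – μ) / σ and then reading areas from the standard normal distribution. `z_score` is an illustrative helper, and `statistics.NormalDist` stands in for the printed table.

```python
from statistics import NormalDist

mu, sigma = 10, 1.5  # mean and standard deviation in hours


def z_score(x):
    """Standardise x: distance from the mean in standard deviations."""
    return (x - mu) / sigma


Z = NormalDist()  # standard normal: mean 0, standard deviation 1

a = Z.cdf(z_score(13))                      # less than 13 hours
b = 1 - Z.cdf(z_score(12.5))                # more than 12.5 hours
c = Z.cdf(z_score(12)) - Z.cdf(z_score(8))  # between 8 and 12 hours
d = Z.cdf(z_score(9)) - Z.cdf(z_score(6))   # between 6 and 9 hours

print(f"a = {a:.4f}, b = {b:.4f}, c = {c:.4f}, d = {d:.4f}")
```

Answers read from a printed table may differ slightly in the last decimal place, because the table rounds z to two decimals.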
Calculating % aspects
⚫ Sometimes the probability (area under the curve) is given for us to determine the value
⚫ Step 1: Draw the diagram and shade the relevant area
⚫ Step 2: Determine the standardised value
⚫ To qualify for the police academy, candidates must
Example – Normal Distribution % II
• It is known that the life of a calculator manufactured by Texas Instruments has a normal distribution with a mean of 54 months and a standard deviation of 8 months. What should the warranty period be to replace a
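Working backwards from an area to a value can be sketched as below. The question's percentage is cut off in the slide text, so `target_fraction = 0.01` is a purely hypothetical placeholder and not the value from the original example. `NormalDist.inv_cdf` is the standard-library inverse of the cumulative area.

```python
from statistics import NormalDist

life = NormalDist(mu=54, sigma=8)  # calculator life in months

# HYPOTHETICAL: fraction of calculators to be replaced under warranty.
# The real figure is missing from the slide; substitute it here.
target_fraction = 0.01

# Step 2: standardised value with target_fraction of the area to its left.
z = NormalDist().inv_cdf(target_fraction)

# Convert back from z to months: x = mu + z * sigma.
warranty_months = life.mean + z * life.stdev

print(f"z = {z:.2f}, warranty = {warranty_months:.1f} months")
```

The same two steps apply to any such question: find the z value for the shaded area, then reverse the standardisation formula.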
Approximating the Binomial Distribution
using the Normal Distribution I
• When n is large (n ≥ 20), a normal probability
distribution may be used to provide a good
approximation to the probability distribution of a binomial random variable
Approximating the Binomial Distribution
using the Normal Distribution II
⚫The Continuity Correction Factor (CCF)
■ This is the value which is added or subtracted when a discrete distribution (eg. the binomial distribution) is being approximated by the normal distribution (a continuous distribution)
■ Ie. we are correcting the discrete distribution so that it can be approximated by the continuous one
■ The CCF value used is 0.5
■ Subtract 0.5 from the lower limit and add 0.5 to the upper limit
Approximate the Binomial Distribution I
⚫ Step 1: Calculate the mean and variance of the binomial distribution in question
Approximate the Binomial Distribution II
• Step 3: Calculate the probability to be approximated using the formula:
• Step 4: Sketch the approximating normal distribution and shade the area corresponding to the probability of the event of interest
• Step 5: Find the probability using the ‘Steps to Calculate Normal Probabilities’ slide
The mean of the binomial distribution
Example I – Normal Approximation
to the Binomial Distribution
• In a recent survey conducted for Money magazine, 80% of the women surveyed said that they are more knowledgeable about investing now than they were five years ago (Money, June 2002). Suppose this result is true for the current population of all women. What is the probability that in a random sample of 100 women, 72 – 76 will say that
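A sketch of the approximation for this example, assuming "72 – 76" means 72 to 76 women inclusive. It uses mean np and standard deviation √(npq), and applies the 0.5 continuity correction to both limits; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

n, p = 100, 0.80
q = 1 - p

mean = n * p          # np
sd = sqrt(n * p * q)  # square root of the variance npq

# Continuity correction: 72 <= X <= 76 becomes 71.5 < Y < 76.5.
lower, upper = 72 - 0.5, 76 + 0.5

approx = NormalDist(mean, sd)
probability = approx.cdf(upper) - approx.cdf(lower)

print(f"mean = {mean:.0f}, sd = {sd:.0f}, P = {probability:.4f}")
```

Under these assumptions the approximating normal distribution has mean 80 and standard deviation 4, and the probability comes out at roughly 0.17.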
Example II – Normal Approximation
to the Binomial Distribution
• According to the 2001 Youth Risk Behaviour Surveillance by the Centres for Disease Control and Prevention, 39% of the 10th-graders surveyed said that they watch three or more hours of television on a typical school day. Assume that this percentage is true for the current population of all 10th-graders. What is the probability that 86 or more of the
Binomial vs. Normal Distributions
Variable Type
■ Binomial: Discrete (assumes only discrete values)
■ Normal: Continuous (assumes all values within a given interval)
Mean
■ Binomial: np
■ Normal: μ
Variance
■ Binomial: npq
■ Normal: σ²
Symmetry
■ Binomial: only symmetrical when p = 0.5
■ Binomial: if p ≠ 0.5 the distribution is skewed but tends to symmetry as n increases