
Unit 6 - Probability Distribution


Academic year: 2020


Statistics For Social Sciences – MATH1208

Unit 6 – Probability Distribution

Random Variable

A random variable, X, is a quantitative variable whose values vary according to the rules of probability. A random variable is also known as a chance variable.

In the experiment of rolling a single die and recording the number of spots that are on the top face, the sample space for this experiment is S = {1, 2, 3, 4, 5, 6}. The number of spots on the top face of a die is a random variable.

In the experiment of tossing two coins, the sample space is S = {HH, HT, TH, TT}. The outcomes themselves are not numerical, so this experiment does not directly involve a random variable. It is possible, however, to describe the outcomes numerically by defining a random variable for the experiment. We could let X = the number of heads that appear on the two coins; then the possible values of X are 0, 1 and 2.

Probability Distribution of a Discrete Random Variable

The probability distribution of a random variable, X, written as p(x), gives the probability that the random variable will take on each of its possible values. p(x) = P(X = x) for all possible values of X.

A discrete probability distribution for a variable X is defined if X can assume a discrete set of values X1, X2, X3, …, Xk with respective probabilities p1, p2, p3, …, pk, where p1 + p2 + p3 + … + pk = 1. The function p(X), which has the respective values p1, p2, p3, …, pk for X = X1, X2, X3, …, Xk, is called the probability function, or frequency function, of X. Since X can assume certain values with given probabilities, it is often called a discrete random variable.

A discrete probability distribution for a variable X is such that $\sum p(x) = 1$.

Probability Notation Summary

Find the probability that X takes on a value that is …    Notation

at least x                                                P(X ≥ x)
more than x                                               P(X > x)
at most x                                                 P(X ≤ x)
less than x                                               P(X < x)
between x1 and x2 inclusive                               P(x1 ≤ X ≤ x2)

Example

In the experiment of tossing two coins, X = the number of heads is a random variable that can take on three values: 0, 1 and 2.

a. Find the probability distribution of X, which is the number of heads in tossing two coins.

The sample space is S = {HH, TH, HT, TT}.

The event that no head appears is TT, that is, P(X = 0) = p(0) = 1/4.

The event that one head appears is HT or TH, that is, P(X = 1) = p(1) = 2/4 = 1/2.

The event that two heads appear is HH, that is, P(X = 2) = p(2) = 1/4.

b. Construct a table to show the probability distribution of your results in part a.

x       0      1      2
p(x)    1/4    1/2    1/4

Note: p(0) + p(1) + p(2) = 1/4 + 1/2 + 1/4 = 1

c. Which is the most frequent number of heads? Answer: 1 head

Expectation

If p is the probability that a person will receive a sum of money S, the mathematical expectation (or expectation) is defined as p × S = pS. For example, if the probability that a man wins a $10 prize is 1/5, then his expectation is (1/5)($10) = $2.

If X denotes a discrete random variable that can assume the values X1, X2, …, Xk with respective probabilities p1, p2, …, pk, where p1 + p2 + … + pk = 1, the mathematical expectation of X, denoted by E(X), is defined as E(X) = p1X1 + p2X2 + … + pkXk.

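NOTE: The mean of a distribution is the expectation, E(X), of the distribution:

$E(X) = p_1X_1 + p_2X_2 + \dots + p_kX_k = \sum x\,p(x)$.

The variance of the distribution, writing $\mu = E(X)$, is

$\mathrm{VAR}(X) = E[(X - \mu)^2] = \sum (x - \mu)^2\,p(x)$

OR

$\mathrm{VAR}(X) = E[(X - \mu)^2] = E(X^2) - [E(X)]^2$, where $E(X^2) = \sum x^2\,p(x)$ and $E(X) = \sum x\,p(x)$.

The standard deviation of the distribution is $\mathrm{STD}(X) = \sqrt{\mathrm{VAR}(X)}$. That is,

$\mathrm{STD}(X) = \sqrt{E[(X - \mu)^2]} = \sqrt{\sum (x - \mu)^2\,p(x)}$

OR $\mathrm{STD}(X) = \sqrt{E(X^2) - [E(X)]^2}$.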
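The shortcut form VAR(X) = E(X²) – [E(X)]² follows from the definition VAR(X) = E[(X – μ)²]. A short derivation, writing μ = E(X) and using the fact that the probabilities sum to 1 (the summation notation is an assumption about how the slides intended the formulas to read):

```latex
\begin{aligned}
\mathrm{VAR}(X) &= \sum (x - \mu)^2\, p(x) \\
                &= \sum x^2\, p(x) \;-\; 2\mu \sum x\, p(x) \;+\; \mu^2 \sum p(x) \\
                &= E(X^2) - 2\mu \cdot \mu + \mu^2 \cdot 1 \\
                &= E(X^2) - [E(X)]^2
\end{aligned}
```

The shortcut form is usually easier to apply by hand, because it needs only the two column totals for x·p(x) and x²·p(x).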

Exercise

1. The table below shows the weekly accidents at a factory.

a. Copy and complete the table below.

Number of accidents per week (x)    0     1     2     3     4     Total
Number of weeks (f)                 10    18    15    6     1
xf
Probability (p)
px

b. Calculate the:

i. mean number of accidents. Ans: 1.4

ii. expected number of accidents per week. Ans: 1.4

c. What is the most frequent number of accidents per week?

2. On a particular day a farmer expects the sales of cucumbers to follow the trend below:

Sales              0       100     200     300     Total
Probability (P)    0.1     0.4     0.3     0.2
Profit (p)         – 50    5       60      60
Pp

a. Copy and complete the table above.

b. What is the profit from the sale of


c. Calculate the expected profit for the sale

of cucumbers. Ans: $27

d. What is the most frequent number of sales?

Exercise

Answer the following.

1. a) Construct a table to show the probability distribution of tossing a pair of fair dice, where X denotes the sum of the points obtained. [Schaum’s Outline, Page 130]

Answer:

Sum of the points (rows: first die, columns: second die)

      1    2    3    4    5    6
1     2    3    4    5    6    7
2     3    4    5    6    7    8
3     4    5    6    7    8    9
4     5    6    7    8    9    10
5     6    7    8    9    10   11
6     7    8    9    10   11   12

X       2      3      4      5      6      7      8      9      10     11     12
P(X)    1/36   2/36   3/36   4/36   5/36   6/36   5/36   4/36   3/36   2/36   1/36

b) What is the most frequent sum of the points obtained?
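The distribution of the sum of two fair dice can be checked by listing all 36 equally likely outcomes and counting how often each sum occurs. The sketch below does this in Python; the variable names are illustrative and not part of the course material.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Enumerate the 36 equally likely outcomes of two fair dice and count each sum.
counts = Counter(a + b for a, b in product(range(1, 7), repeat=2))

# p(x) = (number of outcomes whose sum is x) / 36
dist = {x: Fraction(n, 36) for x, n in sorted(counts.items())}

for x, p in dist.items():
    print(x, p)  # fractions are printed in lowest terms, e.g. 2/36 shows as 1/18

# The most frequent sum is the value with the largest probability.
most_frequent = max(dist, key=dist.get)
print("most frequent sum:", most_frequent)
```

Using `Fraction` keeps the probabilities exact, so the check that they sum to 1 holds without rounding error.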

2. The number of ‘STARS’ sold daily is a random variable X that has the following probability distribution: [Elementary


X 0 1 2 3 4 5

P(X = x) 0.10 0.12 0.25 0.30 0.20 0.03

a. Find the probability that the following number of copies of the ‘STARS’ will be sold.

i. two copies Ans: 0.25

ii. one or two copies Ans: 0.37

iii. more than three copies Ans: 0.23

iv. at most two copies Ans: 0.47

v. between one and five copies (exclusive, that is, two, three or four copies) Ans: 0.75

b. Which is the most frequent number of ‘STARS’ sold?

c. Find: i) P(2 < X ≤ 4) Ans: 0.5

ii) P( X < 1) Ans: 0.1


iv) P( X = 2 or X = 5) Ans: 0.28

v. P(X > 2) Ans: 0.53

NOTE: P(X > 2) = 1 – P(X ≤ 2) , P(X < 1) = 1 – P(X ≥ 1) and P(X ≤ 3) = 1 – P(X ≥ 4)

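d. Find:

i) the expectation or the mean of the distribution, E(X)

NB. $E(X) = \sum x\,p(x)$

Ans: E(X) = 2.47

ii) E(X²) Ans: 7.77

NB. $E(X^2) = \sum x^2\,p(x)$.

iii) the variance of the distribution, VAR(X)

NB. $\mathrm{VAR}(X) = E[(X - \mu)^2] = \sum (x - \mu)^2\,p(x)$

OR

$\mathrm{VAR}(X) = E(X^2) - [E(X)]^2$, where $E(X^2) = \sum x^2\,p(x)$. Ans: VAR(X) = 1.67

iv) the standard deviation of the distribution, STD(X) Ans: STD(X) = 1.29

NB. $\mathrm{STD}(X) = \sqrt{\mathrm{VAR}(X)}$. That is,

$\mathrm{STD}(X) = \sqrt{E[(X - \mu)^2]} = \sqrt{\sum (x - \mu)^2\,p(x)}$

OR $\mathrm{STD}(X) = \sqrt{E(X^2) - [E(X)]^2}$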
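The answers for the ‘STARS’ distribution can be reproduced with a few lines of Python. This is a minimal sketch that applies the shortcut formula VAR(X) = E(X²) – [E(X)]²; the helper name `expectation` is an assumption made for illustration.

```python
from math import sqrt

# Distribution of X = number of 'STARS' sold daily (values from the exercise table).
dist = {0: 0.10, 1: 0.12, 2: 0.25, 3: 0.30, 4: 0.20, 5: 0.03}

def expectation(dist, power=1):
    """Return E(X**power), the sum of x**power * p(x) over all values x."""
    return sum((x ** power) * p for x, p in dist.items())

mean = expectation(dist)          # E(X)
mean_sq = expectation(dist, 2)    # E(X^2)
variance = mean_sq - mean ** 2    # VAR(X) = E(X^2) - [E(X)]^2
std = sqrt(variance)              # STD(X) = sqrt(VAR(X))

print(round(mean, 2), round(mean_sq, 2), round(variance, 2), round(std, 2))
# prints: 2.47 7.77 1.67 1.29
```

The same `dist` dictionary can be reused for the interval questions, for example `sum(p for x, p in dist.items() if 2 < x <= 4)` for P(2 < X ≤ 4).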


service number for a Cable Company is a random variable with the following

probability distribution:

x       0       1       2       3       4
p(x)    0.23    0.34    0.17    0.15    2m

Find:

a. i) m Ans: 0.055 ii) p(4) Ans: 0.11

b. the probability that a person does not get a busy signal.

c. the probability that a person gets at least

one busy signal. Ans: 0.77

d. the probability that a person receives

more than two busy signals. Ans: 0.26

e. Find:

i) the expectation or the mean of the

distribution, E(X) Ans: 1.57

ii) E(X2) Ans: 4.13

iii) the variance of the distribution, VAR(X)

Ans: 1.67

iv) the standard deviation of the

distribution, STD(X) Ans: 1.29

f. What is the most frequent number of busy signals?

4. A company that sells ballpoint pens in bulk packages knows that the number of defective pens in a package is a random variable with the probability distribution:

x       0       1       2     3       4       5       6
p(x)    0.30    0.21    3w    0.10    0.10    0.09    0.08

Find:

a.i) w Ans: 0.04 ii) p(2) Ans: 0.12

b. What is the most frequent number of defective pens in a package?

c. the probability that a package of pens will contain at least three defective pens.

Ans: 0.37

d. the probability that the package will contain between two and five defective

pens. Ans: 0.2


f. P(X = 0 or X = 2 or X = 6) Ans: 0.5

g. P(1 ≤ X ≤ 3) Ans: 0.43

h. P(X > 4) Ans: 0.17

I. Find:

i) the expectation or the mean of the

distribution, E(X) Ans: 2.08

ii) E(X2) Ans: 8.32

iii) the variance of the distribution, VAR(X) Ans: 3.99

iv) the standard deviation of the

distribution, STD(X) Ans: 2.0
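When one probability is given in terms of an unknown constant, the constant follows from the rule that all probabilities sum to 1. The sketch below works through the defective-pens exercise, assuming the seven listed probabilities correspond to x = 0, 1, …, 6 (the x row is not shown in the table, so this mapping is an assumption).

```python
# Known probabilities; p(2) = 3w is the unknown entry.
# Assumption: the seven table entries correspond to x = 0, 1, ..., 6.
known = {0: 0.30, 1: 0.21, 3: 0.10, 4: 0.10, 5: 0.09, 6: 0.08}

# All probabilities must sum to 1, so 3w = 1 - (sum of the known values).
p2 = 1 - sum(known.values())
w = p2 / 3

# Complete distribution, ordered by x.
dist = dict(sorted({**known, 2: p2}.items()))

at_least_three = sum(p for x, p in dist.items() if x >= 3)  # P(X >= 3)
mean = sum(x * p for x, p in dist.items())                  # E(X)

print(round(w, 2), round(p2, 2), round(at_least_three, 2), round(mean, 2))
```

The same approach gives m in the busy-signal exercise: there 2m = 1 – (0.23 + 0.34 + 0.17 + 0.15).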

Given a binomial distribution with n = number of trials and p = probability of success at each trial, then:

Mean = np, Variance = np(1 – p) = npq, where q = 1 – p, and

the standard deviation = $\sqrt{npq}$.

Exercise

1. Find the mean, variance and standard deviation of the binomial distribution when:

a. n =12 and p = 0.4

Answer: mean = 4.8, variance = 2.88

standard deviation = 1.7

b. n = 3 and p = 0.35

Answer: mean = 1.05, variance = 0.6825

standard deviation = 0.83
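The three summary formulas (mean = np, variance = npq, standard deviation = √(npq)) are easy to check numerically. A minimal sketch, with `binomial_summary` as a hypothetical helper name:

```python
from math import sqrt

def binomial_summary(n, p):
    """Return (mean, variance, standard deviation) of a binomial B(n, p)."""
    q = 1 - p                 # probability of failure
    mean = n * p
    variance = n * p * q
    return mean, variance, sqrt(variance)

# The two cases from the exercise.
for n, p in [(12, 0.4), (3, 0.35)]:
    mean, var, std = binomial_summary(n, p)
    print(n, p, round(mean, 2), round(var, 4), round(std, 2))
```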

Binomial Distribution

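If p is the probability that an event will happen in any single trial (called the probability of a success) and q = 1 – p is the probability that it will fail to happen in any single trial (called the probability of a failure), then X, the number of successes in n independent trials, follows a binomial distribution.

The probability distribution of X is determined by the formula

$p(x) = \binom{n}{x}\, p^x q^{\,n-x}$ or

$p(x) = \frac{n!}{x!\,(n-x)!}\, p^x q^{\,n-x}$ for x = 0, 1, 2, 3, …, n.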

Note: n! = n(n – 1)(n – 2)(n – 3) … (1) and 0! = 1.
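The binomial formula translates directly into code. The sketch below uses `math.comb` for the number of combinations; the values n = 5 and p = 0.35 are chosen only for illustration.

```python
from math import comb

def binomial_pmf(x, n, p):
    """p(x) = C(n, x) * p**x * q**(n - x), where q = 1 - p."""
    q = 1 - p
    return comb(n, x) * p ** x * q ** (n - x)

n, p = 5, 0.35  # illustrative values
probs = [binomial_pmf(x, n, p) for x in range(n + 1)]

for x, px in enumerate(probs):
    print(x, round(px, 4))

# The probabilities over x = 0, 1, ..., n should sum to 1.
print("total:", round(sum(probs), 10))
```

Checking that the probabilities sum to 1 is a quick way to catch a mistyped n or p.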

In a binomial distribution each trial has only two possible outcomes. That is, each trial of an experiment is defined in terms of the two states ‘success’ and ‘failure’. Identical trials are repeated a number of times, yielding a number of successes. In a binomial distribution the probability of success is the same for every trial.

Note: X ~ B(n, p), where n is the number of trials (the sample size) and p is the probability of success.

Exercise

In a binomial distribution with n =5 and p = 0.35, what is the variance?

Ans: variance = 1.138

Ans: D

Which of the following is NOT a condition for a binomial experiment?

A. Each trial has only two outcomes

B. There are only two trials

C. p is the probability of success, q is the probability of failure and p + q = 1

D. The trials are independent


Ans: C

Ans: D

Ans: C

Ans: B


Ans: C

Scenario

Consider a salesman who makes one call per day and considers the call successful if he sells goods worth over $200. In a five-day working week, he can make 0, 1, 2, 3, 4 or 5 successful calls. The frequency distribution table below shows the number of successful calls per week over a 48-week year.

Number of successful calls (x)    0    1    2    3    4    5    Total

This is known as a binomial frequency distribution. The particular characteristic that makes it binomial is that it describes the number of ‘successes’ obtained when a number of identical ‘trials’ of an experiment are performed.

Trial = making a single call on one day

Trial success = a sale of goods worth over $200

Number of trials = 5 (i.e. 5 calls in a week)

Exercise

1. In a company the probability that a machine needs correcting during a day’s production run is 0.2. If there are 6 of these machines running on a particular day, find the probability that: [P. 464 Example 2 – Francis, A.]

a. no machine needs correcting;

Answer: 0.262

b. just one machine needs correcting;

Answer: 0.393

c. exactly two machines need correcting;

Answer: 0.246

d. more than two machines need correcting.

Answer: 0.099


0.4. Assuming a Binomial probability

distribution, calculate the probability that for a given week the:

a. bus is early exactly 3 times

Answer: 0.2903

b. bus is early at least 5 times

Answer: 0.0962

c. bus is early at most 2 times

Answer: 0.4199

3. In a certain game in which winning and losing are the only outcomes, the probability of a player winning is 1/3. Given that the game is played 6 times, calculate:

a. The probability that the player has exactly


b. The probability that the player has at least

2 wins. Answer: 0.6492

c. The expected number of wins.

Answer: 2

4. A multiple choice quiz has 10 questions. Each question has 5 possible answers of

which only one is correct.

a. What is the probability of selecting a

correct answer? Answer: 0.2

b. Determine the probability that if a student guesses he/she will get:

i. Exactly two (2) correct answers?

Answer: 0.302 (3 dec. pl.)

ii. At least three (3) correct answers?


5. A manufacturer sets up the following sampling scheme for accepting or rejecting large crates of identical items of raw material received. He takes a random sample of 20 items from the crate. If he finds more than two defective items in the sample, he rejects the crate; otherwise he accepts it. It is known that approximately 5% of items of this type received are defective. Calculate the: [P. 465 Example 3 – Francis, A.]

a. mean and the variance of the number of

defectives in the sample of 20;

Ans.: 1 & 0.95

b. proportion of crates that will be rejected.

Answer: 0.076
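"More than two" probabilities such as the crate-rejection question are usually found through the complement: P(X > 2) = 1 – P(X ≤ 2). The sketch below applies this with n = 20 and p = 0.05; `binomial_cdf` is a hypothetical helper name.

```python
from math import comb

def binomial_cdf(k, n, p):
    """P(X <= k) for X ~ B(n, p), summing p(x) for x = 0, 1, ..., k."""
    return sum(comb(n, x) * p ** x * (1 - p) ** (n - x) for x in range(k + 1))

n, p = 20, 0.05                  # sample of 20 items, 5% defective

accept = binomial_cdf(2, n, p)   # crate accepted when X <= 2
reject = 1 - accept              # crate rejected when X > 2

print(round(accept, 4), round(reject, 4))
```

Computed without intermediate rounding, the rejection probability comes out at about 0.0755; the quoted answer of 0.076 is consistent with rounding each term to three decimal places before subtracting.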

Poisson Distribution

A poisson situation (or process) is a


frequency distribution is obtained by

observing the number of random events that occur in repeated intervals.

Given a Poisson situation with m = mean number of events per interval, the Poisson probability distribution of x events occurring is given by $p(x) = \frac{e^{-m}\, m^x}{x!}$ for x = 0, 1, 2, 3, …. A Poisson interval can be adjusted provided that the mean is adjusted accordingly. For a Poisson probability distribution, the mean = variance.

In a binomial situation, the Poisson distribution can be used as an approximation to the binomial distribution when:

1. n is large (greater than 30);

2. the probability (p) is small (less than 0.01).
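The quality of the Poisson approximation can be seen by computing both distributions side by side with m = np. The sketch below uses n = 200 and p = 0.01 purely as illustrative values; the function names are hypothetical.

```python
from math import comb, exp, factorial

def binomial_pmf(x, n, p):
    """Exact binomial probability of x successes in n trials."""
    return comb(n, x) * p ** x * (1 - p) ** (n - x)

def poisson_pmf(x, m):
    """Poisson probability of x events when the mean is m."""
    return exp(-m) * m ** x / factorial(x)

n, p = 200, 0.01   # illustrative: large n, small p
m = n * p          # the Poisson mean is matched to the binomial mean np

for x in range(5):
    print(x, round(binomial_pmf(x, n, p), 4), round(poisson_pmf(x, m), 4))
```

For these values the two columns agree to within about 0.002, which is why the simpler Poisson formula is often preferred for hand calculation.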

Ans: B

Exercise

Answer: $4.5e^{-3}$ or 0.224

2. On average, 8 persons visit the doctor in a given hour. Assuming a Poisson probability distribution, calculate the:

a. probability that exactly 5 persons will visit the doctor in a randomly selected hour.

Answer: 0.0916

b. probability that at least 10 persons visit the doctor in a given hour.

Answer: 0.2196

(estimated from three terms only, P(10) + P(11) + P(12); the full complement 1 – P(X ≤ 9) gives 0.2834)

c. probability that for a randomly selected half-hour, exactly 3 persons will visit the

doctor. Answer: 0.1954
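The three parts of the doctor's-visits question show the main Poisson techniques: a single probability, an "at least" probability through the complement, and rescaling the mean when the interval changes. A minimal sketch, with hypothetical variable names:

```python
from math import exp, factorial

def poisson_pmf(x, m):
    """p(x) = e**(-m) * m**x / x! for a Poisson distribution with mean m."""
    return exp(-m) * m ** x / factorial(x)

m_hour = 8                                   # mean visits per hour

# a. exactly 5 visits in an hour
p_exactly_5 = poisson_pmf(5, m_hour)

# b. at least 10 visits in an hour: 1 - P(X <= 9)
p_at_least_10 = 1 - sum(poisson_pmf(x, m_hour) for x in range(10))

# c. exactly 3 visits in half an hour: the mean is halved with the interval
m_half_hour = m_hour / 2
p_exactly_3_half = poisson_pmf(3, m_half_hour)

print(round(p_exactly_5, 4), round(p_at_least_10, 4), round(p_exactly_3_half, 4))
```

Part b uses the complement because a Poisson variable has no upper limit, so the upper tail cannot be summed term by term to completion. The full complement gives about 0.2834, whereas summing only P(10) + P(11) + P(12) gives the smaller three-term estimate 0.2196.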


assembly line is 3 per week. Find the probability that:

a. A particular week will be accident free.

Answer: 0.0498

b. At least three (3) accidents will occur in a week. Answer: $1 - 8.5e^{-3}$ or 0.5767

c. Exactly five (5) accidents will occur in two weeks. Answer: $64.8e^{-6}$ or 0.1607

4. Customers arrive randomly at a

department store at an average rate of 3.4 per minute. Assuming the customer arrivals form a poisson distribution, calculate the probability that:


a. no customer arrives in any particular minute; Answer: 0.0334

b. exactly one customer arrives in any

particular minute; Answer: 0.1136

c. two or more customers arrive in any

particular minute; Answer: 0.8530

d. one or more customers arrive in any 30 –

second period. Answer: 0.8173

5. Items produced from a machine are

known to be 1% defective. If the items are boxed into cases of 200, what is the

probability of finding that a single box has 2 or more defectives? [P. 469, Example 6 –


6. JIIC receives on average 3.5 claims per week. Assuming that the number of claims follows a poisson distribution, find the

probability that it will receive:

a. exactly three claims in a given week;

b. no more than three claims in a week.

Standard Normal Variable (z – value)


lengths of uniform manufactured products and times.

It is not possible to find the probability of a single precise value because the Normal distribution is continuous. However, we can evaluate the probability that a value falls within a certain range. The information needed to evaluate probabilities for ranges of values for a Normal distribution is the values of the:

1. mean (μ - the Greek letter mu) and

2. standard deviation (σ - the Greek letter sigma) of the distribution.

The procedure used in calculating probabilities for a Normal distribution requires a knowledge of the use of:

1. Z-scores and

2. Z (or Standard Normal) tables.

Z-scores

The Z-score for any value x of a normal distribution, having mean μ and standard deviation σ, is defined by: $z = \frac{x - \mu}{\sigma}$. This process is sometimes known as standardizing the x-value.
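Standardizing a value and looking up the corresponding probability can be sketched in a few lines. The function `phi` below stands in for the standard Normal table by using the error function from Python's `math` module; the numbers (x = 52, mean 50, standard deviation 8) are illustrative values only.

```python
from math import erf, sqrt

def z_score(x, mu, sigma):
    """Number of standard deviations that x lies from the mean."""
    return (x - mu) / sigma

def phi(z):
    """P(Z < z) for a standard Normal variable (replaces the table lookup)."""
    return 0.5 * (1 + erf(z / sqrt(2)))

# Illustrative values: x = 52 from a distribution with mean 50 and std dev 8.
z = z_score(52, 50, 8)
print(z, round(phi(z), 4))   # z = 0.25, table value 0.5987
```

Printed tables list z to two decimal places, so a table answer can differ slightly from the computed one when z has to be rounded first.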

The z-score measures the number of standard deviations that a data value is from the mean. A standard normal random variable, Z, has a mean of 0 and a standard deviation of 1. That is, Z ~ N(0, 1). Thus, Z ~ N(μ, σ²) with μ = 0 and σ² = 1.

Note: x ~ N(95, 16) means x ~ N(μ, σ²) with μ = 95 and σ² = 16.

Standard Normal (Z) tables

A standard normal table is a table of probabilities for a Z random variable.

The Normal probability is calculated by first standardizing the Normal value given in the question to obtain a z-score, then using the standard Normal tables to find the probability.

Note: X ~ N(mean, variance) or X ~ N(μ, σ²).


Ans: C

Ans: C

Summary of procedure for calculating Normal probabilities

The procedure for calculating any Normal probability is:

STEP 1 – Standardize the given x-value to obtain a z-score.

STEP 2 – Use the standard Normal table to find the probability of Z.

STEP 3 – If necessary, manipulate the probability obtained.

Example – Page 475, A. Francis

1. The lengths of steel pins produced by a machine are distributed Normally with mean 20 cm and standard deviation 0.1 cm. Find the:

a. z-score for a pin length of 20.1 cm

Solution: z = (20.1 – 20)/0.1 = 1

b. probability that a randomly selected pin is less than 20.1 cm in length

Solution: P(length of pin < 20.1) = P(Z < 1)

Using the standard normal table, P(Z < 1) = 0.8413

c. probability that a randomly selected pin is greater than 20.1 cm in length.

Solution: P(length of pin > 20.1) = P(Z > 1)

P(Z > 1) = 1 – 0.8413 = 0.1587

[Figure: Normal distribution of steel pins. Horizontal axis: length of pin, marked at 20 and 20.1. Area to the left of 20.1 = 0.8413; area to the right = 1 – 0.8413 = 0.1587.]

[Figure: Standard Normal distribution of steel pins. Horizontal axis marked at 0 and 1. Area to the left of 1 = 0.8413; area to the right = 1 – 0.8413 = 0.1587.]

d. probability that a randomly selected pin is greater than 19.9 cm in length.

Solution:

P(length of pin > 19.9) = P(Z > – 1)

Using the standard normal table, P(Z < 1) = P(Z > – 1) = 0.8413 because the Normal distribution is symmetric.

Draw a diagram for each of the following.

Note:

P(Z < – 1) = P(Z > 1)

P(Z > 1) = 1 – P(Z < 1)

P(Z < – 1) = 1 – P(Z < 1)

P(Z < 1) = 1 – P(Z > 1)

P(Z < 1) = P(Z > – 1)

P(– 1 < Z < 1) = P(Z < 1) – P(Z < – 1)

P(– 1 < Z < 1) = P(Z > – 1) – P(Z > 1)

[Figure: Standard Normal distribution of steel pins. Horizontal axis: z, marked at – 1, 0 and 1, with regions A to D. Area of (B+C+D) = Area of (A+B+C).]

2. The weights of bags of potatoes are Normally distributed with mean 5 lbs and standard deviation 0.2 lb. The potatoes are delivered to a supermarket, 200 bags at a time.

a. What is the probability that a random bag will weigh more than 5.5 lbs?

Solution: P(bag > 5.5 lbs) = P(Z > 2.5), since

z = (5.5 – 5)/0.2 = 2.5

Using the table, P(Z < 2.5) = 0.9938, so P(Z > 2.5) = 1 – 0.9938 = 0.0062

b. How many bags, from a single delivery, would be expected to weigh more than 5.5 lbs?

Solution: Expected number of bags = 200 × 0.0062 = 1.24. In practical terms, 1 bag.

Example 3 – Page 478, A. Francis

The time taken to complete jobs of a particular type is known to be Normally distributed with mean 6.4 hours and standard deviation 1.2 hours. What is the probability that a randomly selected job of this type takes:

a) less than 7 hours

Ans: P(X < 7) is P(Z < 0.5) = 0.6915

b) less than 6 hours

Ans: P(X < 6 hours) is P(Z < – 0.33) = 1 – P(Z < 0.33) = 0.3707

c) between 6 and 7 hours

Ans: P(6 < X < 7) is P(– 0.33 < Z < 0.5) = P(Z < 0.5) – P(Z < – 0.33) = 0.6915 – 0.3707 = 0.3208
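The job-time probabilities can be checked with `statistics.NormalDist` from the Python standard library (Python 3.8 or later), which standardizes internally. This is a sketch for checking answers, not the table-based method the example uses.

```python
from statistics import NormalDist

# Job times: Normal with mean 6.4 hours and standard deviation 1.2 hours.
jobs = NormalDist(mu=6.4, sigma=1.2)

p_less_7 = jobs.cdf(7)             # P(X < 7)
p_less_6 = jobs.cdf(6)             # P(X < 6)
p_between = p_less_7 - p_less_6    # P(6 < X < 7) = P(X < 7) - P(X < 6)

print(round(p_less_7, 4), round(p_less_6, 4), round(p_between, 4))
```

The computed values for the last two parts (about 0.369 and 0.322) differ slightly from the table answers 0.3707 and 0.3208, because the table method rounds z = –0.333… to –0.33 before the lookup.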

Example 4 – Page 479, A. Francis

Company records show the weekly distance travelled by their salesmen is approximately normally distributed with mean 800 miles and standard deviation 90 miles. The sales manager considers that salesmen who travel less than 600 miles in one week are performing poorly.

If the company employs 200 salesmen, how many would be expected to perform poorly in a particular week?

Ans: Standardising 600 gives z = (600 – 800)/90 = – 2.22

P(Z < – 2.22) = 1 – P(Z < 2.22) = 1 – 0.9868 = 0.0132

The expectation = np = 200(0.0132) = 2.64, so approximately 3 salesmen would be expected to perform poorly in any one week.

Exercise

1. The random variable X is normally

distributed with mean 12 and variance 0.25. What is the probability that X < 12?


2. Let X be a normally distributed random variable with a mean of 50 and a variance of 64. Find:

a. P(X < 52) Answer:0.5987

[2013 Past Paper – question 4b]

b. P(X > 52) Answer: 0.4013

c. P(35 < X < 64) Answer: 0.9298

3. A random variable X follows a Normal probability distribution and has a mean

value of 25 and a variance of 9. Calculate: [2014 – question 4b]

a. P(X < 21) Answer: 0.0918

b. P(X > 26) Answer: 0.3707


4. The heights of students at a certain school are normally distributed with mean 169 cm and standard deviation 10 cm.

a. What is the z-score of a student who

is 167 cm tall? Answer: z = – 0.2

b. What is the probability that a student selected at random is:

i) less than 167 cm tall?

Ans: P(X < 167) = P(Z < – 0.2) = 0.4207

ii) between 167 and 172 cm?

Ans: P(167 < x < 172) = 0.1972

5. Weekly demand at a grocery store for a brand of canned sausages is normally


cans. What is the probability that the weekly demand is:

a. 959 cans or less; Ans: P(X ≤ 959) = 0.983

b. between 660 and 735 cans of sausages?

Ans: P(660 < x < 735) = 0.16141

Approximating Probabilities to the Binomial Distribution Using the Standard Normal Distribution

The Normal distribution can be used as an approximation to the binomial distribution when:

1. n is large;

2. the probability (p) is not too small or large (the closer to 0.5 the better).

When used as an approximation in this way, the Normal distribution has:

mean = np and

standard deviation = $\sqrt{np(1 - p)}$.
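The approximation can be compared against the exact binomial value. The sketch below uses n = 50 and p = 0.5 as illustrative values and evaluates P(X ≤ 20). It also applies a continuity correction (evaluating the Normal curve at 20.5 rather than 20), which is a common refinement but an assumption here, since the text above does not mention it.

```python
from math import comb, sqrt
from statistics import NormalDist

n, p = 50, 0.5   # illustrative values: large n, p close to 0.5

# Normal distribution with the same mean and standard deviation as B(n, p).
approx = NormalDist(mu=n * p, sigma=sqrt(n * p * (1 - p)))

# Exact binomial probability P(X <= 20).
exact = sum(comb(n, x) * p ** x * (1 - p) ** (n - x) for x in range(21))

# Normal approximation with a continuity correction (assumption, see lead-in).
normal = approx.cdf(20.5)

print(round(exact, 4), round(normal, 4))
```

With these values the two results agree to about three decimal places; the agreement worsens as p moves away from 0.5 or as n gets smaller.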
