
B. Maddah INDE 504 Simulation 09/02/17

Probability and Random Variable Primer

• Sample space and Events

¾ Suppose that an experiment with an uncertain outcome is performed (e.g., rolling a die).

¾ While the outcome of the experiment is not known in advance, the set of all possible outcomes is known. This set is the sample space, Ω.

¾ For example, when rolling a die Ω = {1, 2, 3, 4, 5, 6}. When tossing a coin, Ω = {H, T}. When measuring life time of a machine (years), Ω = {1, 2, 3, …}.

¾ A subset E⊂ Ω is known as an event.

¾ E.g., when rolling a die, E = {1} is the event that one appears and F = {1, 3, 5} is the event that an odd number appears.

• Probability of an event

¾ If an experiment is repeated a large number of times, the fraction of times that event E occurs approaches the probability that event E occurs, P{E}.

¾ E.g., when rolling a fair die, P{1} = 1/6, and P{1, 3, 5} = 3/6 = 1/2. When tossing a fair coin, P{H} = P{T} = 1/2.

¾ In some cases, events are not repeated many times.

¾ For such cases, probabilities can be a measure of belief (subjective probability).


• Axioms of probability

(1) For E ⊂ Ω, 0 ≤ P{E} ≤ 1;

(2) P{Ω} = 1;

(3) For events E1, E2, …, Ei, …, with Ei ⊂ Ω and Ei ∩ Ej = ∅ for all i ≠ j,

$$P\left\{\bigcup_{i=1}^{\infty} E_i\right\} = \sum_{i=1}^{\infty} P\{E_i\}.$$

• Implications

¾ The axioms of probability imply the following results:

o For E and F ⊂ Ω, P{E “or” F} = P{E ∪ F} = P{E} + P{F} − P{E ∩ F};

o If E and F are mutually exclusive (i.e., E ∩ F = ∅), then P{E ∪ F} = P{E} + P{F};

o For E ⊂ Ω, let Ec be the complement of E (i.e., E ∪ Ec = Ω and E ∩ Ec = ∅). Then P{Ec} = 1 − P{E};

o P{∅} = 0.

• Conditional probability

¾ The probability that event E occurs given that event F has already occurred is, for P{F} > 0,

$$P\{E \mid F\} = \frac{P\{E \cap F\}}{P\{F\}}.$$


• Independent events

¾ For E and F ⊂ Ω, P{E ∩ F} = P{E|F}P{F} .

¾ Two events are independent if and only if P{E ∩ F} = P{E}P{F}. Equivalently, when P{F} > 0, P{E|F} = P{E}.

• Example 1

¾ Suppose that two fair coins are tossed. What is the probability that either the first or the second coin falls heads?

¾ In this example, Ω = {(H, H), (H, T), (T, H), (T, T)}. Let E (F) be the event that the first (second) coin falls heads. Then E = {(H, H), (H, T)}, F = {(H, H), (T, H)}, and E ∩ F = {(H, H)}. The desired probability is

$$P\{E \cup F\} = P\{E\} + P\{F\} - P\{E \cap F\} = 1/2 + 1/2 - 1/4 = 3/4.$$

• Example 2

¾ When rolling two fair dice, suppose the first die shows 3. What is the probability that the sum of the two dice is 7?

¾ Let E be the event that the sum of the two dice is 7, E = {(1, 6), (2, 5), (3, 4), (4, 3), (5, 2), (6, 1)}, and F be the event that the first die is 3, F= {(3, 1), (3, 2), (3, 3), (3, 4), (3, 5), (3, 6)}. Then,

$$P\{E \mid F\} = \frac{P\{E \cap F\}}{P\{F\}} = \frac{P\{(3,4)\}}{P\{(3,1),(3,2),(3,3),(3,4),(3,5),(3,6)\}} = \frac{1/36}{6/36} = \frac{1}{6}.$$
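
The calculation in Example 2 can be checked by enumerating the 36 equally likely outcomes of two dice. The Python sketch below is only an illustration; the helper name `prob` and the variable names are assumptions introduced here, not part of the original notes.

```python
from fractions import Fraction
from itertools import product

# Sample space for two fair dice: 36 equally likely ordered pairs.
omega = list(product(range(1, 7), repeat=2))

def prob(event):
    """Probability of an event (a set of outcomes) when all outcomes are equally likely."""
    return Fraction(len(event), len(omega))

E = {w for w in omega if w[0] + w[1] == 7}   # the sum of the two dice is 7
F = {w for w in omega if w[0] == 3}          # the first die is 3

# Conditional probability: P{E|F} = P{E ∩ F} / P{F}.
p_E_given_F = prob(E & F) / prob(F)
print(p_E_given_F)   # 1/6
print(prob(E))       # 1/6
```

In this particular case P{E|F} equals P{E}, so the events E and F happen to be independent.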

• Finding Probability by Conditioning

¾ Suppose that we know the probability of event B once event A is realized (or not). We also know P{A}. That is, we know P{B|A}, and P{B|Ac} and P{A}. What is P{B}?


¾ Note that

B = (A ∩ B) ∪ (Ac ∩ B) ⇒ P{B} = P{A ∩ B} + P{Ac ∩ B}.

¾ Therefore,

P{B} = P{B|A}P{A} + P{B|Ac}P{Ac} = P{B|A}P{A} + P{B|Ac}(1 −P{A}) .

¾ Here we are finding P{B} by “conditioning” on A .

¾ In general, if the realization of B depends on a partition Ai of Ω, with A1 ∪ A2 ∪ … ∪ An = Ω and Ai ∩ Aj = ∅ for i ≠ j, then

$$P\{B\} = \sum_{i=1}^{n} P\{B \mid A_i\} P\{A_i\}.$$

• Bayes’ Formula

¾ This follows from conditional probabilities. For two events,

$$P\{A \mid B\} = \frac{P\{A \cap B\}}{P\{B\}} = \frac{P\{B \mid A\} P\{A\}}{P\{B \mid A\} P\{A\} + P\{B \mid A^c\} P\{A^c\}}.$$

¾ With a partition Ai,

$$P\{A_j \mid B\} = \frac{P\{A_j \cap B\}}{P\{B\}} = \frac{P\{B \mid A_j\} P\{A_j\}}{\sum_{i=1}^{n} P\{B \mid A_i\} P\{A_i\}}.$$

• Example 3

¾ Consider two urns. The first urn contains three white and seven black balls, and the second contains five white and five black balls. We flip a coin and then draw a ball from the first urn or the second urn depending on whether the outcome was heads or tails.

¾ What is the probability that a white ball is selected?


P{W} = P{W|H}P{H} + P{W|T}P{T} = (3/10)(1/2) + (5/10)(1/2) = 2/5 .

¾ What is the probability that a black ball is selected?

P{B} = 1 − P{W} = 3/5 .

¾ What is the probability that the coin has landed heads given that a white ball is selected?

From Bayes’ formula,

$$P\{H \mid W\} = \frac{P\{W \mid H\} P\{H\}}{P\{W\}} = \frac{(3/10)(1/2)}{2/5} = \frac{3}{8}.$$
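
The conditioning and Bayes steps of Example 3 can be reproduced with exact fractions. This is a minimal sketch; the dictionary names `prior`, `p_white_given`, and `posterior` are illustrative assumptions, not notation from the notes.

```python
from fractions import Fraction

# P{H} and P{T} for a fair coin.
prior = {"H": Fraction(1, 2), "T": Fraction(1, 2)}
# P{W | coin outcome}: heads selects urn 1 (3 white of 10), tails selects urn 2 (5 white of 10).
p_white_given = {"H": Fraction(3, 10), "T": Fraction(5, 10)}

# Conditioning on the coin: P{W} = sum over outcomes c of P{W|c} P{c}.
p_white = sum(p_white_given[c] * prior[c] for c in prior)

# Bayes' formula: P{c|W} = P{W|c} P{c} / P{W}.
posterior = {c: p_white_given[c] * prior[c] / p_white for c in prior}

print(p_white)          # 2/5
print(1 - p_white)      # 3/5
print(posterior["H"])   # 3/8
```

The posterior probabilities over the partition {H, T} sum to one, which is a quick sanity check on any Bayes calculation.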

• Random Variables

¾ Consider a function that assigns real numbers to events (outcomes) in Ω. Such a real-valued function is a random variable.

¾ E.g., when rolling two fair dice, define X as the sum of the two dice. Then, X is a random variable with P{X = 2} = P{(1,1)}=1/36, P{X = 3} = P{(1, 2), (2, 1)}=2/36=1/18, etc.

¾ E.g., the salvage value of a machine, S, is $1,500 if the market goes up (with probability 0.4) and $1,000 if the market goes down (with probability 0.6). Then, S is a random variable with P{S = 1500} = 0.4 and P{S = 1000} = 0.6 .

¾ If a random variable can take on only a finite or countably infinite number of values, it is a discrete random variable. E.g., the random variable X representing the sum of two dice.

¾ If a random variable can take on an uncountable number of values, it is a continuous random variable. E.g., the random variable H representing the height of an AUB student.

¾ If X is a discrete random variable, the function fX(x) = P{X = x} is the probability mass function (pmf) of X .

¾ The function $F_X(x) = P\{X \le x\} = \sum_{x_i \le x} f_X(x_i)$ is the cumulative distribution function (cdf) of X.

¾ E.g., for the random variable S representing salvage value of a machine above,

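$$f_S(s) = \begin{cases} 0.6 & \text{if } s = 1000 \\ 0.4 & \text{if } s = 1500 \\ 0 & \text{otherwise} \end{cases}, \qquad F_S(s) = \begin{cases} 0 & \text{if } s < 1000 \\ 0.6 & \text{if } 1000 \le s < 1500 \\ 1 & \text{if } s \ge 1500 \end{cases}.$$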

¾ For a continuous random variable, X, the cdf is defined based on a function fX(x) called the density function, where

$$P\{X \le x\} = F_X(x) = \int_{-\infty}^{x} f_X(t)\,dt.$$

Fact. For a discrete random variable, $\sum_{x_i} f_X(x_i) = 1$. For a continuous random variable, $\int_{-\infty}^{\infty} f_X(x)\,dx = 1$.

• Independent Random variables

¾ Two random variables X and Y are said to be independent if

$$P\{X \le x, Y \le y\} = P\{X \le x\} P\{Y \le y\} = F_X(x) F_Y(y) \quad \text{for all } x \text{ and } y.$$

• Expectation of a random variable

¾ The expectation of a discrete random variable X is

$$E[X] = \sum_{x_i} x_i P\{X = x_i\} = \sum_{x_i} x_i f_X(x_i).$$

¾ The expectation of a continuous random variable X is

$$E[X] = \int_{-\infty}^{\infty} x f_X(x)\,dx.$$

¾ The expectation of a random variable is its long-run average: the value obtained when the underlying experiment is repeated a large number of times and the resulting values are averaged.

¾ The expectation is “linear.” That is, for two random variables X and Y and constants a and b, E[aX + bY] = aE[X] + bE[Y] .

¾ The expectation of a function g(X) of a random variable X is

$$E[g(X)] = \int_{-\infty}^{\infty} g(x) f_X(x)\,dx.$$

¾ An important measure is the nth moment of X, n = 1, 2, …,

$$E[X^n] = \int_{-\infty}^{\infty} x^n f_X(x)\,dx.$$

• Measures of variability

¾ The variance of a random variable X is

$$\mathrm{Var}[X] = E[(X - E[X])^2] = E[X^2] - (E[X])^2.$$

¾ The standard deviation of a random variable X is $\sigma_X = \sqrt{\mathrm{Var}[X]}$.

¾ The coefficient of variation of X is CV[X] = σX/E[X] .

¾ The variance (standard deviation) measures the spread of the random variable around the expectation.

¾ The coefficient of variation is useful when comparing variability of different alternatives.

¾ Note that $\mathrm{Var}[aX + b] = a^2\,\mathrm{Var}[X]$ for any real numbers a and b and any random variable X.
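
As an illustration of these measures, the sketch below computes the expectation, variance, standard deviation, and coefficient of variation of the salvage value S defined earlier (P{S = 1500} = 0.4, P{S = 1000} = 0.6). The function name `moment` and the constants a = 2, b = 100 used to check the scaling rule are assumptions made for this example only.

```python
from fractions import Fraction
from math import sqrt

# pmf of the salvage value S: P{S = 1500} = 0.4 and P{S = 1000} = 0.6.
pmf = {1500: Fraction(2, 5), 1000: Fraction(3, 5)}

def moment(pmf, n):
    """n-th moment E[X^n] of a discrete random variable given as {value: probability}."""
    return sum(x**n * p for x, p in pmf.items())

mean = moment(pmf, 1)                 # E[S]
var = moment(pmf, 2) - mean**2        # Var[S] = E[S^2] - (E[S])^2
sd = sqrt(var)                        # standard deviation
cv = sd / mean                        # coefficient of variation

# Check Var[aS + b] = a^2 Var[S] with illustrative constants a = 2, b = 100.
a, b = 2, 100
shifted = {a * x + b: p for x, p in pmf.items()}
var_shifted = moment(shifted, 2) - moment(shifted, 1)**2

print(mean, var)        # 1200 60000
print(round(cv, 4))     # 0.2041
```

Exact fractions are used for the probabilities so that the mean and variance come out without floating-point rounding.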

• Joint distribution

¾ The joint distribution function of two random variables is

$$F_{X,Y}(x, y) = P\{X \le x, Y \le y\}.$$

¾ If X and Y are discrete random variables then,

$$F_{X,Y}(x, y) = \sum_{i \le x,\, j \le y} P\{X = i, Y = j\} = \sum_{i \le x,\, j \le y} f_{X,Y}(i, j),$$

where fX,Y (.) is the joint pmf of X and Y.

¾ If X and Y are continuous random variables then,

$$F_{X,Y}(x, y) = \int_{-\infty}^{x} \int_{-\infty}^{y} f_{X,Y}(u, v)\,dv\,du,$$

where fX,Y (.) is the joint pdf of X and Y.

Fact. $F_{X,Y}(x, y) = F_X(x) F_Y(y)$ for all x and y if and only if (iff) X and Y are independent.

• Covariance

¾ The covariance measures the dependence of two random variables. For two random variables X and Y,

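$$\sigma_{XY} = \mathrm{Cov}[X, Y] = E[(X - E[X])(Y - E[Y])] = E[XY] - E[X]E[Y],$$

where, for continuous X and Y,

$$E[XY] = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} x y f_{X,Y}(x, y)\,dx\,dy.$$

¾ If σXY > 0 (< 0), X and Y are said to be positively (negatively) correlated.

¾ If X and Y are independent, then σXY = 0. The converse does not hold in general: σXY = 0 does not imply that X and Y are independent.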

¾ Properties of covariance

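$$\mathrm{Cov}[X, X] = \mathrm{Var}[X], \quad \mathrm{Cov}[X, Y] = \mathrm{Cov}[Y, X], \quad \mathrm{Cov}[aX, Y] = a\,\mathrm{Cov}[X, Y],$$

$$\mathrm{Cov}[X, Y + Z] = \mathrm{Cov}[X, Y] + \mathrm{Cov}[X, Z], \quad \mathrm{Cov}[X, Y] = \sigma_{XY} \le \sigma_X \sigma_Y.$$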

¾ The coefficient of correlation is defined as $\rho_{XY} = \dfrac{\sigma_{XY}}{\sigma_X \sigma_Y}$.

¾ Note that $-1 \le \rho_{XY} \le 1$.

¾ Note that Var[X + Y] = Var[X] + 2Cov[X, Y] + Var[Y].

¾ If X and Y are independent, Var[X+Y] = Var[X] + Var[Y].
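
To make the covariance definitions concrete, the following sketch takes two fair dice and sets X to the first die and Y to the sum of both dice, a pair chosen here purely for illustration (it does not appear in the notes). It computes Cov[X, Y] from E[XY] − E[X]E[Y], the correlation coefficient, and checks the identity for Var[X + Y]. All function names are assumptions.

```python
from fractions import Fraction
from itertools import product
from math import sqrt

# The 36 equally likely outcomes (d1, d2) of two fair dice.
outcomes = list(product(range(1, 7), repeat=2))
p = Fraction(1, len(outcomes))

def expect(g):
    """E[g(d1, d2)] under the uniform distribution on the 36 outcomes."""
    return sum(g(d1, d2) * p for d1, d2 in outcomes)

def x(d1, d2):
    return d1            # X: value of the first die

def y(d1, d2):
    return d1 + d2       # Y: sum of the two dice

def x_plus_y(d1, d2):
    return x(d1, d2) + y(d1, d2)

def cov(u, v):
    """Cov[U, V] = E[UV] - E[U]E[V]; cov(u, u) gives Var[U]."""
    return expect(lambda d1, d2: u(d1, d2) * v(d1, d2)) - expect(u) * expect(v)

var_x, var_y, cov_xy = cov(x, x), cov(y, y), cov(x, y)
rho = float(cov_xy) / sqrt(var_x * var_y)
var_sum = cov(x_plus_y, x_plus_y)

print(cov_xy)                                  # 35/12
print(var_sum == var_x + 2 * cov_xy + var_y)   # True
```

Here X and Y are positively correlated because the first die contributes directly to the sum.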


• The Bernoulli Random Variable

¾ Suppose an experiment can result in success with probability (w.p.) p and failure w.p. 1−p. We define a Bernoulli random variable X as X = 1 if the experiment outcome is a success and X = 0 otherwise.

¾ The pmf of X is

$$f_X(x) = P\{X = x\} = \begin{cases} 1 - p & \text{if } x = 0 \\ p & \text{if } x = 1 \end{cases}.$$

¾ The expected value of X is E[X] = 0(1−p) + 1(p) = p.

¾ The second moment of X is $E[X^2] = 0^2(1-p) + 1^2(p) = p$.

¾ The variance of X is $\mathrm{Var}[X] = E[X^2] - (E[X])^2 = p - p^2 = p(1-p)$.

• The Binomial Random Variable

¾ Consider n independent trials, each of which can result in a success w.p. p and a failure w.p. 1−p .

¾ We define a Binomial random variable, X, as the number of successes in the n trials.

¾ The pmf of X is defined as

$$f_X(i) = P\{X = i\} = \binom{n}{i} p^i (1 - p)^{n - i}, \quad i = 0, 1, \ldots, n,$$

where $\binom{n}{i} = \dfrac{n!}{(n - i)!\, i!}$.

[Figure: plot of the binomial pmf fX(i) versus i.]

¾ Note that, by the binomial theorem,

$$\sum_{i=0}^{n} f_X(i) = \sum_{i=0}^{n} \binom{n}{i} p^i (1 - p)^{n - i} = (p + (1 - p))^n = 1.$$

Fact. Let Xi = 1, i = 1, …, n, if the ith trial results in success and Xi = 0 otherwise. Then $X = \sum_{i=1}^{n} X_i$.

¾ Note that the Xi are independent and identically distributed (iid) Bernoulli random variables with parameter p.

¾ Therefore,

$$E[X] = \sum_{i=1}^{n} E[X_i] = np, \qquad \mathrm{Var}[X] = \sum_{i=1}^{n} \mathrm{Var}[X_i] = np(1 - p).$$

• Example 4

¾ A fair coin is flipped 5 times.

¾ What is the probability that exactly two heads are obtained?

¾ The number of heads, X, is a binomial random variable with parameters n = 5 and p = 0.5. Then, the desired probability is $P\{X = 2\} = \frac{5!}{2!\,3!}(0.5)^2(0.5)^3 = 0.3125 \approx 0.313$.
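
Example 4 can be checked numerically with the binomial pmf. The sketch below uses Python's standard `math.comb`; the function name `binomial_pmf` is an assumption introduced for this illustration.

```python
from math import comb

def binomial_pmf(i, n, p):
    """P{X = i} for a binomial random variable with parameters n and p."""
    return comb(n, i) * p**i * (1 - p)**(n - i)

n, p = 5, 0.5
pmf = [binomial_pmf(i, n, p) for i in range(n + 1)]

mean = sum(i * f for i, f in enumerate(pmf))               # should equal n p
var = sum(i**2 * f for i, f in enumerate(pmf)) - mean**2   # should equal n p (1 - p)

print(pmf[2])   # 0.3125
```

The computed mean and variance can be compared with np = 2.5 and np(1 − p) = 1.25.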


• The Geometric Random Variable

¾ Suppose independent trials, each having a probability p of being a success, are performed.

¾ We define the geometric random variable (rv) X as the number of trials until the first success occurs.

¾ The pmf of X is defined as

$$f_X(i) = P\{X = i\} = (1 - p)^{i-1} p, \quad i = 1, 2, \ldots$$

[Figure: plot of the geometric pmf fX(i) versus i.]

¾ Note that fX(i) defines a pmf since

$$\sum_{i=1}^{\infty} f_X(i) = \sum_{i=1}^{\infty} (1 - p)^{i-1} p = p \sum_{i=0}^{\infty} (1 - p)^i = \frac{p}{1 - (1 - p)} = 1.$$

¾ Let q = 1−p. The first two moments and variance of X are

$$E[X] = \sum_{i=1}^{\infty} i q^{i-1} p = \frac{1}{p},$$

$$E[X^2] = \sum_{i=1}^{\infty} i^2 q^{i-1} p = \frac{2}{p^2} - \frac{1}{p},$$

$$\mathrm{Var}[X] = E[X^2] - (E[X])^2 = \frac{1}{p^2} - \frac{1}{p}.$$

Example 5.

¾ When rolling a die repeatedly, what is the probability that the first 6 appears on the sixth roll?

¾ Let X be the number of rolls until a 6 appears. Then, X is a geometric rv with parameter p = 1/6, and the desired probability is $P\{X = 6\} = (5/6)^5 (1/6) = 0.067$.

¾ What is the expected number of rolls until a 6 appears?

¾ E[X] = 1/p = 6.
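
Example 5 lends itself to a small simulation check. The sketch below evaluates the geometric pmf directly and estimates E[X] by simulating die rolls; the function names, the seed, and the number of replications (100,000) are arbitrary choices made for this illustration.

```python
import random

def geometric_pmf(i, p):
    """P{X = i}: the first success occurs on trial i."""
    return (1 - p)**(i - 1) * p

def rolls_until_six(rng):
    """Roll a fair die until a 6 appears and return the number of rolls used."""
    count = 1
    while rng.randint(1, 6) != 6:
        count += 1
    return count

rng = random.Random(0)   # fixed seed so the run is repeatable
samples = [rolls_until_six(rng) for _ in range(100_000)]
estimate = sum(samples) / len(samples)

print(geometric_pmf(6, 1 / 6))   # exact P{X = 6}
print(estimate)                  # simulation estimate of E[X], to compare with 1/p = 6
```

The simulated average should be close to 6, with the gap shrinking as the number of replications grows.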

• The Poisson Random Variable

¾ A rv, taking on values 0, 1, …, is said to be a Poisson random variable with parameter λ > 0 if

$$f_X(i) = P\{X = i\} = e^{-\lambda} \frac{\lambda^i}{i!}, \quad i = 0, 1, \ldots$$

[Figure: plot of the Poisson pmf fX(i) versus i.]

¾ fX(i) defines a pmf since

$$\sum_{i=0}^{\infty} f_X(i) = e^{-\lambda} \sum_{i=0}^{\infty} \frac{\lambda^i}{i!} = (e^{-\lambda})(e^{\lambda}) = 1.$$

¾ The Poisson rv is a good model for demand, arrivals, and certain rare events.


¾ The first two moments and the variance of X are

$$E[X] = \lambda, \qquad E[X^2] = \lambda + \lambda^2, \qquad \mathrm{Var}[X] = E[X^2] - (E[X])^2 = \lambda.$$

¾ Let X1 and X2 be two independent Poisson rv’s with means λ1 and λ2. Then, Z = X1 + X2 is a Poisson rv with mean λ1 + λ2 .

Example 6.

¾ The monthly demand for a certain airplane spare part of Fly High Airlines (FHA) fleet at Beirut airport is estimated to be a Poisson random variable with mean 0.5. Suppose that FHA will stock one spare part at the beginning of March. Once the part is used, a new part is ordered. The delivery lead time for a part is 2 months.

¾ What is the probability that the spare part will be used during March?

¾ Let X be the demand for the spare part. The desired probability is $P\{X \ge 1\} = 1 - P\{X = 0\} = 1 - e^{-\lambda} = 1 - e^{-0.5} = 0.393$.

¾ What is the probability that FHA will face a shortage on this part in March?

¾ The desired probability is $P\{X > 1\} = 1 - P\{X = 0\} - P\{X = 1\} = 1 - e^{-0.5} - 0.5e^{-0.5} = 0.09$.
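
The two probabilities in Example 6 follow directly from the Poisson pmf. A minimal sketch, assuming a helper named `poisson_pmf` (introduced here for illustration):

```python
from math import exp, factorial

def poisson_pmf(i, lam):
    """P{X = i} for a Poisson random variable with mean lam."""
    return exp(-lam) * lam**i / factorial(i)

lam = 0.5   # mean monthly demand from Example 6

# The part is used during March if demand is at least 1.
p_used = 1 - poisson_pmf(0, lam)
# A shortage occurs if demand exceeds the single part in stock.
p_shortage = 1 - poisson_pmf(0, lam) - poisson_pmf(1, lam)

print(round(p_used, 3), round(p_shortage, 3))   # 0.393 0.09
```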

• The Uniform Random Variable

¾ A rv X that is equally likely to be “near” any point of an interval (a, b) is said to have a uniform distribution.

¾ The pdf of X is

$$f_X(x) = \begin{cases} \dfrac{1}{b - a} & \text{if } a < x < b \\ 0 & \text{otherwise} \end{cases}$$

¾ Note that fX(x) defines a pdf since $\int_a^b f_X(x)\,dx = \int_a^b \frac{1}{b - a}\,dx = 1$.

[Figure: plot of the uniform pdf fX(x) versus x.]

¾ The cdf of X is

$$F_X(x) = \int_{-\infty}^{x} f_X(t)\,dt = \begin{cases} 0 & \text{if } x < a \\ \dfrac{x - a}{b - a} & \text{if } a \le x \le b \\ 1 & \text{otherwise} \end{cases}$$

¾ The first two moments of X are

$$E[X] = \int_a^b x f_X(x)\,dx = \int_a^b \frac{x}{b - a}\,dx = \frac{b^2 - a^2}{2(b - a)} = \frac{a + b}{2},$$

$$E[X^2] = \int_a^b x^2 f_X(x)\,dx = \int_a^b \frac{x^2}{b - a}\,dx = \frac{b^3 - a^3}{3(b - a)} = \frac{a^2 + ab + b^2}{3}.$$

¾ The variance of X is $E[X^2] - (E[X])^2 = (b - a)^2/12$.

• The Exponential Random Variable

¾ An exponential rv with parameter λ > 0 is a rv whose pdf is

$$f_X(x) = \begin{cases} \lambda e^{-\lambda x} & \text{if } x \ge 0 \\ 0 & \text{otherwise} \end{cases}$$

[Figure: plot of the exponential pdf fX(x) versus x.]

¾ Note that fX(x) defines a pdf since

$$\int_0^{\infty} f_X(x)\,dx = \int_0^{\infty} \lambda e^{-\lambda x}\,dx = \left[-e^{-\lambda x}\right]_0^{\infty} = 1.$$

¾ The exponential rv is a good model for the time between arrivals or the time to failure of certain equipment.

¾ The cdf of X is

$$F_X(x) = \int_0^x f_X(t)\,dt = \int_0^x \lambda e^{-\lambda t}\,dt = \left[-e^{-\lambda t}\right]_0^x = 1 - e^{-\lambda x}, \quad x \ge 0.$$

¾ A useful property of the exponential distribution is that $P\{X > x\} = e^{-\lambda x}$ for x ≥ 0.

¾ The first two moments and the variance of X are

$$E[X] = \frac{1}{\lambda}, \qquad E[X^2] = \frac{2}{\lambda^2}, \qquad \mathrm{Var}[X] = E[X^2] - (E[X])^2 = \frac{1}{\lambda^2}.$$

Proposition. The exponential distribution has the memoryless property, i.e., $P\{X > t + u \mid X > t\} = P\{X > u\}$.

Proof.

$$P\{X > t + u \mid X > t\} = \frac{P\{X > t + u, X > t\}}{P\{X > t\}} = \frac{P\{X > t + u\}}{P\{X > t\}} = \frac{e^{-\lambda (t + u)}}{e^{-\lambda t}} = e^{-\lambda u} = P\{X > u\}.$$

¾ The memoryless property allows developing tractable analytical models with the exponential distribution. It makes the exponential distribution very popular in modeling.

Proposition. Let X1 and X2 be two independent exponential random variables with parameters λ1 and λ2. Let X = min(X1, X2). Then, X is an exponential random variable with parameter λ1 + λ2.

Proof. $P\{X > x\} = P\{X_1 > x, X_2 > x\} = P\{X_1 > x\} P\{X_2 > x\} = e^{-\lambda_1 x} e^{-\lambda_2 x} = e^{-(\lambda_1 + \lambda_2) x}$.
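
The result on the minimum of two independent exponential random variables can be checked by simulation. The sketch below uses `random.expovariate` from the Python standard library; the rates 1 and 2, the seed, the sample size, and the function name are illustrative assumptions, not values from the notes.

```python
import random
from math import exp

def simulate_min_exponential(lam1, lam2, n, seed=0):
    """Draw n samples of min(X1, X2), with X1 ~ Exp(lam1) and X2 ~ Exp(lam2) independent."""
    rng = random.Random(seed)
    return [min(rng.expovariate(lam1), rng.expovariate(lam2)) for _ in range(n)]

lam1, lam2 = 1.0, 2.0   # illustrative rates (assumed for this example)
samples = simulate_min_exponential(lam1, lam2, n=100_000)

# The minimum should behave like an exponential rv with rate lam1 + lam2.
sample_mean = sum(samples) / len(samples)
x = 0.5
empirical_tail = sum(s > x for s in samples) / len(samples)

print(sample_mean, 1 / (lam1 + lam2))            # sample mean vs. 1/(lam1 + lam2)
print(empirical_tail, exp(-(lam1 + lam2) * x))   # empirical vs. exact P{X > 0.5}
```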

A Poisson process N(t) with rate λ is a stochastic process defined over time, t, such that

(i) The number of events (typically arrivals) over a time t, N(t), is a Poisson random variable with mean λt,


Example 7.

¾ The amount of time one spends in the bank is exponentially distributed with mean 10 minutes.

¾ A customer arrives at 1:00 PM. What is the probability that the customer will be in the bank at 1:15 PM?

¾ Let X be the time the customer spends in the bank. Then, X is exponentially distributed with parameter λ = 1/10. The desired probability is $P\{X > 15\} = e^{-15\lambda} = e^{-15/10} = 0.223$.

¾ It is now 1:20 PM and the customer is still in the bank. What is the probability that the customer will be in the bank at 1:35 PM?

¾ By the memoryless property, $P\{X > 35 \mid X > 20\} = P\{X > 15\} = 0.223$.
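
Both parts of Example 7 can be verified from the exponential tail probability P{X > x} = exp(−λx). A minimal sketch, with the helper name `survival` assumed for illustration:

```python
from math import exp

lam = 1 / 10   # rate per minute, i.e. a mean time in the bank of 10 minutes

def survival(x, lam):
    """P{X > x} for an exponential random variable with rate lam, x >= 0."""
    return exp(-lam * x)

# Probability of still being in the bank 15 minutes after arrival.
p_15 = survival(15, lam)

# Probability of being in the bank at minute 35, given still there at minute 20.
p_35_given_20 = survival(35, lam) / survival(20, lam)

print(round(p_15, 3), round(p_35_given_20, 3))   # 0.223 0.223
```

The two values agree, which is exactly what the memoryless property states.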

• The Normal Random Variable

¾ We say that a random variable X is a normal rv with parameters μ and σ > 0 if it has the following pdf:

$$f_X(x) = \frac{1}{\sqrt{2\pi}\,\sigma} e^{-(x - \mu)^2/(2\sigma^2)}, \quad x \in (-\infty, \infty).$$

[Figure: plot of the normal pdf fX(x) versus x.]

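¾ Note that fX(x) defines a pdf. With a change of variable z = (x − μ)/σ and using the fact that $\int_{-\infty}^{\infty} e^{-z^2/2}\,dz = \sqrt{2\pi}$,

$$\int_{-\infty}^{\infty} f_X(x)\,dx = \int_{-\infty}^{\infty} \frac{1}{\sqrt{2\pi}\,\sigma} e^{-(x - \mu)^2/(2\sigma^2)}\,dx = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} e^{-z^2/2}\,dz = 1.$$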

¾ The normal rv is a good model for quantities that can be seen as sums or averages of a large number of rv’s.

¾ The cdf of X, $F_X(x) = \int_{-\infty}^{x} f_X(t)\,dt$, has no closed form.

¾ The first two moments and the variance of X are

$$E[X] = \mu, \qquad E[X^2] = \sigma^2 + \mu^2, \qquad \mathrm{Var}[X] = \sigma^2.$$

Fact. If X is a normal rv, then Z = (X − μ)/σ is a “standard normal r.v.” with parameters 0 and 1.

Proof. Note that

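$$P\{Z < z\} = P\left\{\frac{X - \mu}{\sigma} < z\right\} = P\{X < \mu + \sigma z\} = \int_{-\infty}^{\mu + \sigma z} \frac{1}{\sqrt{2\pi}\,\sigma} e^{-(t - \mu)^2/(2\sigma^2)}\,dt.$$

Let u = (t − μ)/σ, then

$$P\{Z < z\} = \int_{-\infty}^{z} \frac{1}{\sqrt{2\pi}} e^{-u^2/2}\,du,$$

which is the cdf of the standard normal.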

¾ This fact implies that X = μ + σ Z .

¾ The cdf of X, FX(x), is evaluated through the cdf of Z, FZ(z), which is often tabulated,

$$P\{X < x\} = P\left\{Z < \frac{x - \mu}{\sigma}\right\} \Rightarrow F_X(x) = F_Z\left(\frac{x - \mu}{\sigma}\right).$$

Proposition. If X1 and X2 are two independent normal rvs with means $\mu_i$ and variances $\sigma_i^2$, i = 1, 2, then Z = X1 + X2 is normal with mean $\mu_1 + \mu_2$ and variance $\sigma_1^2 + \sigma_2^2$.

Theorem (central limit theorem). If Xi, i = 1, 2, …, n, are iid rv’s with mean μ and variance $\sigma^2$, then, for n large enough, $\sum_{i=1}^{n} X_i$ is approximately normally distributed with mean nμ and variance $n\sigma^2$.
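
The central limit theorem can be illustrated by simulation. The sketch below sums n = 30 iid Uniform(0, 1) random variables (mean 1/2, variance 1/12) and compares the empirical probability that the sum falls below one standard deviation above its mean with the standard normal value. The choice of uniform summands, n = 30, 20,000 replications, the seed, and the function names are all assumptions made for this example.

```python
import random
from math import erf, sqrt

def standard_normal_cdf(z):
    """cdf of the standard normal, written in terms of the error function."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

def simulate_sums(n, replications, seed=0):
    """Return `replications` independent sums of n iid Uniform(0, 1) draws."""
    rng = random.Random(seed)
    return [sum(rng.random() for _ in range(n)) for _ in range(replications)]

n = 30
mu, sigma2 = 0.5, 1 / 12   # mean and variance of a Uniform(0, 1) rv
sums = simulate_sums(n, replications=20_000)

# Threshold one standard deviation above the mean of the sum.
threshold = n * mu + sqrt(n * sigma2)
empirical = sum(s <= threshold for s in sums) / len(sums)
normal_approx = standard_normal_cdf(1.0)

print(empirical, normal_approx)   # the two values should be close
```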

Example 8.

¾ The height of an AUB male student is a normal rv with mean 170 cm and standard deviation 8 cm.

¾ What is the probability that the height of an AUB student is less than 180 cm?

¾ Let X be the height of the student. Then, the desired probability is P{X < 180} = P{Z < (180 −170)/8} = P{Z < 1.25} = 0.894.
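
When a table of FZ is not at hand, the normal cdf can be evaluated with the error function, using the identity FZ(z) = (1 + erf(z/√2))/2. The sketch below applies it to Example 8; the function name `normal_cdf` is an assumption introduced for this illustration.

```python
from math import erf, sqrt

def normal_cdf(x, mu=0.0, sigma=1.0):
    """P{X < x} for a normal rv with mean mu and standard deviation sigma."""
    z = (x - mu) / sigma                      # standardize: Z = (X - mu) / sigma
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))   # standard normal cdf at z

# Example 8: mean 170 cm, standard deviation 8 cm.
p = normal_cdf(180, mu=170, sigma=8)
print(round(p, 3))   # 0.894
```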
