A random variable is a function that assigns a number to each element of the sample space. We typically write a random variable as a capital letter such as X.
Example 1. If we flip a coin and observe which side is up, the sample space is {H, T}. If we assign H=1 and T=0, our random variable is:
\[ X = \begin{cases} 1 & \text{with probability of } 0.5 \\ 0 & \text{with probability of } 0.5 \end{cases} \]
There’s more than one way to assign a value to each element in the sample space. In the flip of a coin, we can also assign H=0 and T=1. In this case, the random variable is:
\[ X = \begin{cases} 0 & \text{with probability of } 0.5 \\ 1 & \text{with probability of } 0.5 \end{cases} \]
Of course, you can assign H=2 and T=3, etc.
Example 2. If we flip a coin three times, the sample space is:
{HHH, HHT, HTH, HTT, THH, THT, TTH, TTT}
Each element in the sample space has a 0.125 chance of occurring. If we assign X = # of heads, Y = # of tails, and Z = the length of the longest run of consecutive flips that have the same outcome, then each element corresponds to the following 3 sets of numerical values:
Element X Y Z Probability
HHH 3 0 3 0.125
HHT 2 1 2 0.125
HTH 2 1 1 0.125
HTT 1 2 2 0.125
THH 2 1 2 0.125
THT 1 2 1 0.125
TTH 1 2 2 0.125
TTT 0 3 3 0.125
Please note that in HHH, we have 3 consecutive heads; so Z=3. In HTH, no two consecutive outcomes are the same; so Z=1.
Here X, Y, and Z are three random variables.
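The idea that X, Y, and Z are simply functions on the sample space can be sketched in Python. This is a minimal illustration rather than part of the original example; the helper names (`num_heads`, `num_tails`, `longest_run`) are made up for this sketch, and Z is assumed to be the length of the longest run of identical outcomes, which is what the table above shows.

```python
from itertools import groupby, product

# Sample space for three coin flips, in the same order as the table:
# HHH, HHT, HTH, HTT, THH, THT, TTH, TTT
sample_space = ["".join(flips) for flips in product("HT", repeat=3)]


def num_heads(outcome):
    """X: number of heads in the outcome."""
    return outcome.count("H")


def num_tails(outcome):
    """Y: number of tails in the outcome."""
    return outcome.count("T")


def longest_run(outcome):
    """Z: length of the longest run of identical consecutive flips."""
    return max(len(list(run)) for _, run in groupby(outcome))


# Each of the 8 outcomes is equally likely.
prob = 1 / len(sample_space)

for outcome in sample_space:
    print(outcome, num_heads(outcome), num_tails(outcome), longest_run(outcome), prob)
```

Each function plays the role of one random variable: it takes an element of the sample space and returns a number.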
©Yufeng Guo, Deeper Understanding: Exam P
Example 3. If we roll a die and record the side that’s face up, the sample space is {1, 2, 3, 4, 5, 6}. We can assign a value to each element in the sample space as follows:
Element of the sample space X Probability
1 1 1/6
2 2 1/6
3 3 1/6
4 4 1/6
5 5 1/6
6 6 1/6
Here X is a random variable.
Of course, you can assign values differently as follows:
Element of the sample space X Probability
1 6 1/6
2 5 1/6
3 4 1/6
4 3 1/6
5 2 1/6
6 1 1/6
You see that a random variable is really an arbitrary translation of each element in the sample space into a number. Of course, some translation schemes are more useful than others.
What do we gain by such translation? By mapping the entire sample space into a series of numbers, we can extract relevant information from the sample space to solve the problem at hand, while ignoring other details of the sample space.
Example 4. We flip a coin and are interested in finding the # of times heads are up. If we assign 1 and 0 to H and T respectively, then the information about the # of heads up can be conveniently summarized as follows:
\[ X = \begin{cases} 1 & \text{with probability of } 0.5 \\ 0 & \text{with probability of } 0.5 \end{cases} \]
You see that we have reduced the coin flipping process to a simple, elegant math equation. More importantly, this equation answers our question at hand.
Example 5. We flip a coin three times and are concerned about the # of times heads show up. Let X represent the # of times heads show up; then we have:
X 3 2 2 1 2 1 1 0
Probability 0.125 0.125 0.125 0.125 0.125 0.125 0.125 0.125
Example 6. We roll a die and record the side that’s face up. We are interested in finding the probability of getting 1,2,3,4,5, and 6 respectively. If we let random variable X represent the number that’s face up, then we have:
X 1 2 3 4 5 6
Probability 1/6 1/6 1/6 1/6 1/6 1/6
Expressed more succinctly:
\[ P(X = n) = \frac{1}{6}, \quad \text{where } n = 1, 2, 3, 4, 5, 6 \]
Discrete random variable vs. continuous random variable
If a random variable can take on only separate, countable values (for example 0, 1, 2, ...), then it’s a discrete random variable. If a random variable can take on any value in a range, then it’s a continuous random variable.
Example 7. Let the random variable X represent the # of heads we get from flipping a coin n times. Then X can take on integer values ranging from 0 to n. X is a discrete random variable.
Example 8. Let the random variable Y represent the number randomly chosen from the range [0,1]. Then Y can take on any value in [0,1]. Y is a continuous random variable.
PMF and CDF for discrete random variables
Probability mass function
The most important way to describe a discrete random variable is through the probability mass function (PMF). If x is a possible value of the random variable X, the probability mass of x, denoted as \(p_X(x)\), is the probability that X = x:
\[ p_X(x) = P(X = x) \]
Example 9. We flip a coin twice and record the # of times we get heads. Let X represent the # of heads in 2 flips of a coin. The four outcomes HH, HT, TH, and TT are equally likely, so the probability mass function of X is:
x          0     1     2
p_X(x)     1/4   1/2   1/4
A probability mass function must satisfy the 3 axioms:
• \(p_X(x) \ge 0\)
So a valid PMF needs to satisfy the following two conditions:
\[ p_X(x) \ge 0, \qquad \sum_x p_X(x) = 1 \]
Example 10. You are given the following PMF:
\[ p_N(n) = e^{-\lambda} \frac{\lambda^n}{n!}, \quad \text{where } n = 0, 1, 2, ..., +\infty \text{ and } \lambda \text{ is a positive constant} \]
Verify that this is a legitimate PMF.
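Solution
First, \(e^{-\lambda} > 0\), \(\lambda^n > 0\), and \(n! > 0\), so \(p_N(n) \ge 0\) for every n. Second, using the series \(e^{\lambda} = \sum_{n=0}^{+\infty} \lambda^n / n!\):
\[ \sum_{n=0}^{+\infty} p_N(n) = e^{-\lambda} \sum_{n=0}^{+\infty} \frac{\lambda^n}{n!} = e^{-\lambda} e^{\lambda} = 1 \]
So \(p_N(n) = e^{-\lambda} \frac{\lambda^n}{n!}\) is a valid PMF.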
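The two PMF conditions in Example 10 can also be checked numerically. The sketch below assumes the PMF is the Poisson form \(e^{-\lambda} \lambda^n / n!\); the value `lam = 2.5` and the cutoff of 100 terms are arbitrary choices made only for this illustration.

```python
import math


def poisson_pmf(n, lam):
    """p_N(n) = exp(-lam) * lam**n / n!"""
    return math.exp(-lam) * lam**n / math.factorial(n)


lam = 2.5  # assumed value, chosen only for illustration

# Condition 1: every probability mass is non-negative.
all_nonnegative = all(poisson_pmf(n, lam) >= 0 for n in range(100))

# Condition 2: the masses add up to 1. Summing the first 100 terms is
# enough here because the remaining tail is negligible for this lam.
partial_sum = sum(poisson_pmf(n, lam) for n in range(100))

print(all_nonnegative, partial_sum)
```

A numerical check like this does not replace the algebraic verification, but it is a quick way to catch a mistyped formula.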
Example 11. A special die has 3 sides painted 1, 2, and 3 respectively. If the die is thrown, each side has an equal chance of landing face up on the ground. Two such dice are thrown together; let X represent the sum of the two sides facing up.
Find the probability mass function of X.
Solution
                              outcome of the 2nd throw
outcome of the 1st throw      1     2     3
1                             2     3     4
2                             3     4     5
3                             4     5     6
In the above table, the cells in the body of the table represent the values of X. Because each side has a 1/3 chance of landing face up, each cell has a \((1/3)^2 = 1/9\) chance of occurring. We convert the above table into the new table below:
x          2     3     4     5     6
p_X(x)     1/9   2/9   3/9   2/9   1/9
To understand the above table, let’s look at how we get \(p_X(3) = 2/9\). There are two ways to have X = 3: you get a 1 from the 1st die and a 2 from the 2nd die (with probability of 1/9); you get a 2 from the 1st die and a 1 from the 2nd die (with probability of 1/9). So the total probability of having X = 3 is 2/9.
Example 12. Claim payment, X, has the following PMF:
x          $0     $50    $80    $135   $250   $329
p_X(x)     0.32   0.2    0.18   0.1    0.15   0.05
Calculate
1. \(P(X > 120)\)
2. \(P(X \le 300 \mid X > 120)\)
Solution
1. \(P(X > 120) = p_X(135) + p_X(250) + p_X(329) = 0.1 + 0.15 + 0.05 = 0.3\)
2. \(P(X \le 300 \mid X > 120) = \frac{P(120 < X \le 300)}{P(X > 120)} = \frac{0.1 + 0.15}{0.3} = \frac{5}{6}\)
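The two probabilities asked for in Example 12 can be computed directly from the PMF table. The following is a minimal sketch; the helper `prob` is a made-up name for this illustration, and exact fractions are used so that no rounding error enters.

```python
from fractions import Fraction

# PMF of the claim payment X from Example 12 (amounts in dollars)
pmf = {
    0: Fraction("0.32"),
    50: Fraction("0.2"),
    80: Fraction("0.18"),
    135: Fraction("0.1"),
    250: Fraction("0.15"),
    329: Fraction("0.05"),
}


def prob(event):
    """Probability that X satisfies the predicate `event`."""
    return sum(p for x, p in pmf.items() if event(x))


# 1. P(X > 120)
p_gt_120 = prob(lambda x: x > 120)

# 2. P(X <= 300 | X > 120) = P(120 < X <= 300) / P(X > 120)
p_conditional = prob(lambda x: 120 < x <= 300) / p_gt_120

print(p_gt_120)       # 3/10
print(p_conditional)  # 5/6
```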
Cumulative probability function (CDF)
The cumulative function is defined as \(F_X(x) = P(X \le x)\).
\[ F(4) = P(X \le 4) = p(2) + p(3) + p(4) = \frac{1}{9} + \frac{2}{9} + \frac{3}{9} = \frac{2}{3} \]
\[ F(5) = P(X \le 5) = p(2) + p(3) + p(4) + p(5) = \frac{1}{9} + \frac{2}{9} + \frac{3}{9} + \frac{2}{9} = \frac{8}{9} \]
\[ F(6) = P(X \le 6) = p(2) + p(3) + p(4) + p(5) + p(6) = \frac{1}{9} + \frac{2}{9} + \frac{3}{9} + \frac{2}{9} + \frac{1}{9} = 1 \]
\[ F(7) = P(X \le 7) = p(2) + p(3) + p(4) + p(5) + p(6) + p(7) = \frac{1}{9} + \frac{2}{9} + \frac{3}{9} + \frac{2}{9} + \frac{1}{9} + 0 = 1 \]
because \(p(7) = 0\).
\[ F(+\infty) = P(X \le +\infty) = P(X \le 6) = 1 \]
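The CDF values above use the PMF of the two-dice sum from Example 11. As a rough sketch, the same numbers can be reproduced by enumerating the 9 equally likely outcomes and accumulating the PMF; the function name `cdf` is made up for this illustration.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

sides = [1, 2, 3]

# Count how many of the 9 equally likely (1st die, 2nd die) pairs give each sum.
counts = Counter(a + b for a, b in product(sides, repeat=2))
pmf = {x: Fraction(c, len(sides) ** 2) for x, c in sorted(counts.items())}


def cdf(x):
    """F(x) = P(X <= x): add up the PMF over all values not exceeding x."""
    return sum((p for value, p in pmf.items() if value <= x), Fraction(0))


for x in [1, 2, 3, 4, 5, 6, 7]:
    print(x, cdf(x))
```

Note that `cdf(7)` equals `cdf(6)` because no probability mass sits above 6, and `cdf(1)` is 0 because no mass sits below 2.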
PDF and CDF for continuous random variables
For a continuous random variable X, the probability density function (PDF), \(f(x)\), is defined as:
\[ P(a \le X \le b) = \int_a^b f(x)\,dx \]
\(P(a \le X \le b)\) is the area under the graph of \(f(x)\) between a and b. Because including or excluding the end points doesn’t affect the area, including or excluding the end points doesn’t affect the probability:
\[ P(a < X < b) = P(a \le X < b) = P(a \le X \le b) = P(a < X \le b) = \int_a^b f(x)\,dx \]
The CDF (cumulative probability function) of the continuous random variable X is defined as:
\[ F(x) = P(X \le x) \]
This is the same definition as when X is discrete.
If a random variable is discrete, we say PMF (probability mass function); if a random variable is continuous, we say PDF (probability density function).
Whether a random variable is discrete or continuous, we always say CDF (cumulative probability function).
Please note that often for the sake of convenience, people use \(f(x)\) to refer to either PMF \(p_X(x)\) or PDF \(f(x)\).
Properties of CDF
Rule 1 \(F(x) = P(X \le x)\) for all x -- this is just the definition.
Rule 2 CDF can never be decreasing. If \(a \le b\), then \(F(a) \le F(b)\).
To see why, notice \(F(x) = P(X \le x)\). If \(a \le b\), then \(\{X \le a\} \subseteq \{X \le b\}\). In other words, the event \(X \le b\) contains the event \(X \le a\). So \(P(X \le b) \ge P(X \le a)\). This gives us \(F(b) \ge F(a)\).
Rule 3 \(F(-\infty) = 0\) and \(F(+\infty) = 1\).
They are true for both discrete and continuous random variables. To see why, notice \(-\infty < X < +\infty\). There’s zero chance that X can be smaller than or equal to \(-\infty\); \(F(-\infty) = P(X \le -\infty) = 0\). On the other hand, we are 100% certain that X cannot exceed \(+\infty\). So \(F(+\infty) = P(X \le +\infty) = 1\).
Rule 4 If X is discrete and takes integer values, the PMF and CDF can be obtained from each other by summing or differencing:
\[ F(k) = \sum_{i=-\infty}^{k} p_X(i) \] -- this is the definition of \(F(k)\)
\[ p_X(k) = P(X \le k) - P(X \le k-1) = F(k) - F(k-1) \]
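Rule 4 can be illustrated as a round trip: sum the PMF to get the CDF, then difference the CDF to get the PMF back. The sketch below reuses the two-dice PMF (values 2 through 6 with masses 1/9, 2/9, 3/9, 2/9, 1/9) purely as sample data.

```python
from fractions import Fraction
from itertools import accumulate

values = [2, 3, 4, 5, 6]
pmf = [Fraction(1, 9), Fraction(2, 9), Fraction(3, 9), Fraction(2, 9), Fraction(1, 9)]

# Summing: F(k) = p(2) + ... + p(k)
cdf = list(accumulate(pmf))

# Differencing: p(k) = F(k) - F(k - 1), where F(1) = 0 because
# there is no probability mass below the smallest value 2.
previous = [Fraction(0)] + cdf[:-1]
recovered = [f - f_prev for f, f_prev in zip(cdf, previous)]

for k, f, p in zip(values, cdf, recovered):
    print(k, f, p)
```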
Rule 5 If X is continuous, the PDF and CDF can be obtained from each other by integration or differentiation:
\[ F(x) = \int_{-\infty}^{x} f(t)\,dt, \qquad f(x) = \frac{d}{dx} F(x) \]
By definition, \(F(x) = P(X \le x) = P(-\infty < X \le x) = \int_{-\infty}^{x} f(t)\,dt\). Taking the derivative of both sides of \(F(x) = \int_{-\infty}^{x} f(t)\,dt\) gives us \(f(x) = \frac{d}{dx} F(x)\).
Example 14. X has the following density: \(f(x) = 3x^2\) where \(0 \le x \le 1\). Then, for \(0 \le x \le 1\),
\[ F(x) = P(X \le x) = \int_{-\infty}^{x} f(t)\,dt = \int_0^x f(t)\,dt = \int_0^x 3t^2\,dt = x^3 \]
\[ P(0.2 \le X \le 0.6) = F(0.6) - F(0.2) = 0.6^3 - 0.2^3 = 0.208 \]
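The result of Example 14 can be cross-checked by integrating the density numerically and comparing with \(F(0.6) - F(0.2)\). This is only a sketch: the `integrate` helper is a plain midpoint rule written for this illustration, and the step count is an arbitrary choice.

```python
def f(x):
    """Density from Example 14: f(x) = 3x^2 on [0, 1], zero elsewhere."""
    return 3 * x**2 if 0 <= x <= 1 else 0.0


def integrate(func, a, b, steps=100_000):
    """Midpoint-rule approximation of the integral of func over [a, b]."""
    width = (b - a) / steps
    return sum(func(a + (i + 0.5) * width) for i in range(steps)) * width


numeric = integrate(f, 0.2, 0.6)   # area under f between 0.2 and 0.6
closed_form = 0.6**3 - 0.2**3      # F(0.6) - F(0.2) with F(x) = x^3
total_area = integrate(f, 0.0, 1.0)

print(numeric, closed_form, total_area)
```

Both `numeric` and `closed_form` agree with 0.208 up to floating-point and discretization error, and `total_area` is close to 1, as any density requires.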
Example 15. A real number is randomly chosen from [0,1]. Then this number is squared. Let X represent the result.
Find the PDF and CDF for X.
Solution
We’ll find the CDF first. Let U represent the # randomly drawn from [0,1]. Then \(X = U^2\).
\[ F_X(x) = P(X \le x) = P(U^2 \le x) = P(U \le \sqrt{x}) \]
Because any number in the interval [0,1] has an equal chance of being drawn, \(P(U \le \sqrt{x})\) must be proportional to the length of the interval \([0, \sqrt{x}]\). The total probability is \(P(U \le 1) = 1\) -- we are 100% certain that any number taken from [0,1] must not exceed 1. Consequently, \(P(U \le \sqrt{x}) = \sqrt{x}\), so
\[ F_X(x) = \sqrt{x} \text{ for } 0 \le x \le 1, \qquad f(x) = \frac{d}{dx} F_X(x) = \frac{1}{2\sqrt{x}} \text{ for } 0 < x \le 1 \]
Please note the following key difference between PMF for a discrete random variable and PDF for a continuous random variable:
PMF is a real probability and its value must not exceed one; PDF is a fake probability and can take on any non-negative value, including values greater than one. A PDF value by itself is not a probability. For PDF to be useful, we must integrate it over a range.
In the example above, PDF is \(f(x) = \frac{1}{2\sqrt{x}}\) for \(0 < x \le 1\). When \(x \to 0\), \(f(x) = \frac{1}{2\sqrt{x}} \to +\infty\). \(f(x) = \frac{1}{2\sqrt{x}}\) is not a probability. To get a probability, we must integrate \(f(x) = \frac{1}{2\sqrt{x}}\) over a range. For example, if we integrate \(f(x)\) over \([a, b]\), we’ll get a real probability:
\[ P(a < X \le b) = \int_a^b f(x)\,dx \]
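The difference between a density value and a probability can be seen numerically for \(f(x) = \frac{1}{2\sqrt{x}}\). In the sketch below, the evaluation point 0.0001 and the interval [0.25, 1] are arbitrary choices for illustration, and `integrate` is a simple midpoint rule written for this example.

```python
import math


def f(x):
    """Density f(x) = 1 / (2 * sqrt(x)) for 0 < x <= 1, zero elsewhere."""
    return 1 / (2 * math.sqrt(x)) if 0 < x <= 1 else 0.0


def integrate(func, a, b, steps=100_000):
    """Midpoint-rule approximation of the integral of func over [a, b]."""
    width = (b - a) / steps
    return sum(func(a + (i + 0.5) * width) for i in range(steps)) * width


# A density value can be far larger than 1 ...
density_near_zero = f(0.0001)

# ... but integrating the density over a range gives a genuine probability.
# Exact value: sqrt(1) - sqrt(0.25) = 0.5
prob_interval = integrate(f, 0.25, 1.0)

print(density_near_zero, prob_interval)
```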
Mean and variance of a random variable
You just have to memorize a series of formulas:
If X is discrete, then
mean
\[ E(X) = \sum_x x\, p_X(x) \]
variance
\[ Var(X) = E\left[ (X - E(X))^2 \right] = \sum_x (x - E(X))^2\, p_X(x) = E(X^2) - E^2(X) \]
If X is continuous, then
mean
\[ E(X) = \int_{-\infty}^{+\infty} x f(x)\,dx \]
variance
\[ Var(X) = E\left[ (X - E(X))^2 \right] = \int_{-\infty}^{+\infty} (x - E(X))^2 f(x)\,dx = E(X^2) - E^2(X) \]
Standard deviation of X, whether X is continuous or discrete:
\[ \sigma_X = \sqrt{Var(X)} \]
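The discrete mean and variance formulas can be applied mechanically to any PMF table. The sketch below uses the two-dice PMF from Example 11 as sample data and computes the variance both ways, from the definition and from the shortcut \(E(X^2) - E^2(X)\); the variable names are made up for this illustration.

```python
import math
from fractions import Fraction

# PMF of the sum of two three-sided dice (Example 11)
pmf = {
    2: Fraction(1, 9),
    3: Fraction(2, 9),
    4: Fraction(3, 9),
    5: Fraction(2, 9),
    6: Fraction(1, 9),
}

# E(X) = sum of x * p(x)
mean = sum(x * p for x, p in pmf.items())

# E(X^2) = sum of x^2 * p(x)
second_moment = sum(x**2 * p for x, p in pmf.items())

# Var(X) from the definition and from the shortcut formula
var_by_definition = sum((x - mean) ** 2 * p for x, p in pmf.items())
var_by_shortcut = second_moment - mean**2

# Standard deviation is the square root of the variance
std_dev = math.sqrt(var_by_shortcut)

print(mean, var_by_definition, var_by_shortcut, std_dev)
```

The two variance calculations give the same exact fraction, which is the point of the identity.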
Mean of a function
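Many times we need to find \(E[Y]\), where \(Y = g(X)\). One way to find \(E[Y]\) is to find the pdf of Y first and then apply the definition of the mean. A shortcut that avoids finding the pdf of Y is to weight \(g(x)\) by the distribution of X:
\[ E[g(X)] = \sum_x g(x)\, p_X(x) \text{ if X is discrete}, \qquad E[g(X)] = \int_{-\infty}^{+\infty} g(x) f(x)\,dx \text{ if X is continuous} \]
Don’t worry about how to prove it. Just memorize it.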
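The mean of Y = g(X) can be computed two ways: from the PMF of Y itself, or directly by weighting g(x) with the PMF of X. The sketch below compares the two on a hypothetical example that is not from the text: X is assumed to take the values -1, 0, 1 with probability 1/3 each, and g(x) = x² is an arbitrary choice.

```python
from collections import defaultdict
from fractions import Fraction

# Hypothetical distribution of X, assumed only for this illustration
pmf_x = {-1: Fraction(1, 3), 0: Fraction(1, 3), 1: Fraction(1, 3)}


def g(x):
    """Hypothetical transformation Y = g(X) = X^2."""
    return x**2


# Method 1: build the PMF of Y first, then apply the definition of the mean.
pmf_y = defaultdict(Fraction)
for x, p in pmf_x.items():
    pmf_y[g(x)] += p  # x = -1 and x = 1 both map to y = 1
mean_via_pmf_of_y = sum(y * p for y, p in pmf_y.items())

# Method 2: weight g(x) by the PMF of X, without ever finding the PMF of Y.
mean_direct = sum(g(x) * p for x, p in pmf_x.items())

print(dict(pmf_y), mean_via_pmf_of_y, mean_direct)
```

Both methods give the same mean; the second skips the step of merging the x values that map to the same y.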
Example 18. \(Y = X^2 - 1\), where X has the following distribution:
Alternative method:
\(e^{-x}\) is the exponential pdf (probability density function) with mean equal to 1. Consequently,
Example 19. X has the following distribution:
2 1