Lecture 6

Random variable

Plan of the lecture:

1. Discrete random variable

1.1 Definitions of random variable and discrete random variable

1.2 Probability mass function

2. Continuous random variables and PDFs

3. Cumulative distribution functions

1 Discrete random variable

1.1 Definitions of random variable and discrete random variable

In many probabilistic models, the outcomes are of a numerical nature, e.g., if they

correspond to instrument readings or stock prices. In other experiments, the outcomes are not

numerical, but they may be associated with some numerical values of interest. For example, if

the experiment is the selection of students from a given population, we may wish to consider

their grade point average. When dealing with such numerical values, it is often useful to assign

probabilities to them. This is done through the notion of a random variable.

Given an experiment and the corresponding set of possible outcomes (the sample space),

a random variable associates a particular number with each outcome; see Fig. 1. We refer to this

number as the numerical value or the experimental value of the random variable.

Mathematically, a random variable is a real-valued function of the experimental outcome.

Figure 1: Visualization of a random variable. It is a function that assigns a numerical value to

each possible outcome of the experiment.

Here are some examples of random variables:

(a) In an experiment involving a sequence of 5 tosses of a coin, the number of heads in

the sequence is a random variable. However, the 5-long sequence of heads and tails is not

considered a random variable because it does not have an explicit numerical value.

(b) In an experiment involving two rolls of a die, the following are examples of random

variables:

(1) The sum of the two rolls.

(2) The number of sixes in the two rolls.

(c) In an experiment involving the transmission of a message, the time needed to transmit

the message, the number of symbols received in error, and the delay with which the message is

received are all random variables.
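To make the idea of a random variable as a function of the outcome concrete, here is a minimal Python sketch of example (b). The names `sample_space`, `roll_sum`, and `number_of_sixes` are illustrative choices and not notation from the lecture.

```python
from itertools import product

# Sample space for two rolls of a six-sided die: all ordered pairs (first, second).
sample_space = list(product(range(1, 7), repeat=2))

def roll_sum(outcome):
    """Random variable: the sum of the two rolls."""
    return outcome[0] + outcome[1]

def number_of_sixes(outcome):
    """Random variable: how many of the two rolls show a six."""
    return sum(1 for roll in outcome if roll == 6)

# Each random variable assigns exactly one number to every outcome.
for outcome in [(1, 1), (6, 3), (6, 6)]:
    print(outcome, roll_sum(outcome), number_of_sixes(outcome))

# The set of values each random variable can take (its range).
sum_range = sorted({roll_sum(o) for o in sample_space})
sixes_range = sorted({number_of_sixes(o) for o in sample_space})
print(sum_range, sixes_range)
```

Both functions are defined on the same 36 outcomes, which illustrates that several random variables can be attached to a single experiment.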

There are several basic concepts associated with random variables, which are

summarized below.

Main Concepts Related to Random Variables

Starting with a probabilistic model of an experiment:

• A random variable is a real-valued function of the outcome of the experiment.

• A function of a random variable defines another random variable.

• We can associate with each random variable certain “averages” of interest, such as the mean and the variance.

• A random variable can be conditioned on an event or on another random variable.

• There is a notion of independence of a random variable from an event or from another random variable.

A random variable is called discrete if its range (the set of values that it can take) is

finite or at most countably infinite. For example, the random variables mentioned in (a) and (b)

above can take at most a finite number of numerical values, and are therefore discrete.

A random variable that can take an uncountably infinite number of values is not discrete.

Concepts Related to Discrete Random Variables

Starting with a probabilistic model of an experiment:

• A discrete random variable is a real-valued function of the outcome of the experiment that can take a finite or countably infinite number of values.

• A (discrete) random variable has an associated probability mass function (PMF), which gives the probability of each numerical value that the random variable can take.

• A function of a random variable defines another random variable, whose PMF can be obtained from the PMF of the original random variable.

Figure 2: Example of discrete RV

1.2 Probability mass function

The most important way to characterize a random variable is through the probabilities of

the values that it can take. For a discrete random variable 𝑋, these are captured by the

probability mass function (PMF for short) of 𝑋, denoted 𝑝_𝑋. In particular, if 𝑥 is any possible

value of 𝑋, the probability mass of 𝑥, denoted 𝑝_𝑋(𝑥), is the probability of the event {𝑋 = 𝑥}

consisting of all outcomes that give rise to a value of 𝑋 equal to 𝑥:

$$p_X(x) = P(\{X = x\}).$$

For example, let the experiment consist of two independent tosses of a fair coin, and let 𝑋

be the number of heads obtained. Then the PMF of 𝑋 is

$$p_X(x) = \begin{cases} 1/4, & \text{if } x = 0 \text{ or } x = 2, \\ 1/2, & \text{if } x = 1, \\ 0, & \text{otherwise.} \end{cases}$$

In what follows, we will often omit the braces from the event/set notation when no ambiguity can arise. In particular, we will usually write 𝑃(𝑋 = 𝑥) in place of the more correct notation 𝑃({𝑋 = 𝑥}). We will also adhere to the following convention throughout: we will use upper case characters to denote random variables, and lower case characters to denote real numbers such as the numerical values of a random variable.

Note that

$$\sum_{x} p_X(x) = 1,$$

where in the summation above, 𝑥 ranges over all the possible numerical values of 𝑋. This

follows from the additivity and normalization axioms, because the events {𝑋 = 𝑥} are disjoint

and form a partition of the sample space, as 𝑥 ranges over all possible values of 𝑋. By a similar

argument, for any set 𝑆 of real numbers, we also have

$$P(X \in S) = \sum_{x \in S} p_X(x).$$

For example, if 𝑋 is the number of heads obtained in two independent tosses of a fair

coin, as above, the probability of at least one head is

$$P(X > 0) = \sum_{x > 0} p_X(x) = \frac{1}{2} + \frac{1}{4} = \frac{3}{4}.$$
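The normalization property and the formula for 𝑃(𝑋 ∈ 𝑆) can be checked with a short Python sketch. The dictionary `pmf` and the helper `prob_in_set` are hypothetical names introduced only for this illustration; exact fractions are used so that no rounding is involved.

```python
from fractions import Fraction

# PMF of X = number of heads in two independent tosses of a fair coin.
pmf = {0: Fraction(1, 4), 1: Fraction(1, 2), 2: Fraction(1, 4)}

def prob_in_set(pmf, predicate):
    """P(X in S), where the set S is described by a predicate on the values of X."""
    return sum((p for x, p in pmf.items() if predicate(x)), Fraction(0))

# Normalization: summing the PMF over all possible values gives 1.
total = prob_in_set(pmf, lambda x: True)

# Probability of at least one head: sum the PMF over the values x > 0.
at_least_one_head = prob_in_set(pmf, lambda x: x > 0)

print(total, at_least_one_head)
```

Describing the set 𝑆 by a predicate is only one possible design; passing an explicit collection of values would work equally well for finite sets.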

Calculating the PMF of 𝑋 is conceptually straightforward, and is illustrated in Fig. 3.

Figure 3: Illustration of the method to calculate the PMF of a random variable 𝑋. For each

possible value 𝑥, we collect all the outcomes that give rise to 𝑋 = 𝑥 and add their probabilities to

obtain 𝑝_𝑋(𝑥).

Calculation of the PMF of a Random Variable 𝑋

For each possible value 𝑥 of 𝑋:

1. Collect all the possible outcomes that give rise to the event {𝑋 = 𝑥}.

2. Add their probabilities to obtain 𝑝_𝑋(𝑥).
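The two steps above can be sketched in Python as follows. This is a minimal illustration that assumes the sample space is finite and given as a dictionary from outcomes to probabilities; the names `compute_pmf` and `outcome_probs` are not from the lecture.

```python
from collections import defaultdict
from fractions import Fraction
from itertools import product

def compute_pmf(outcome_probs, random_variable):
    """Group outcomes by the value of the random variable and add their probabilities."""
    pmf = defaultdict(Fraction)  # missing values start at Fraction(0)
    for outcome, prob in outcome_probs.items():
        pmf[random_variable(outcome)] += prob
    return dict(pmf)

# Two independent tosses of a fair coin: four equally likely outcomes.
outcome_probs = {o: Fraction(1, 4) for o in product("HT", repeat=2)}

# X = number of heads in the outcome.
pmf_heads = compute_pmf(outcome_probs, lambda o: o.count("H"))
print(pmf_heads)
```

The result agrees with the PMF of the two-toss example given earlier: probability 1/4 for the values 0 and 2, and 1/2 for the value 1.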

2 Continuous random variables and PDFs

A random variable 𝑋 is called continuous if its probability law can be described in terms

of a nonnegative function 𝑓_𝑋, called the probability density function of 𝑋, or PDF for short,

which satisfies

$$P(X \in B) = \int_B f_X(x)\,dx,$$

for every subset 𝐵 of the real line. In particular, the probability that the value of 𝑋 falls within an interval is

$$P(a \le X \le b) = \int_a^b f_X(x)\,dx,$$

and can be interpreted as the area under the graph of the PDF (see Fig. 4). For any single value 𝑎, we have $P(X = a) = \int_a^a f_X(x)\,dx = 0$. For this reason, including or excluding the endpoints of an interval has no effect on its probability:

𝑃(𝑎 ≤ 𝑋 ≤ 𝑏) = 𝑃(𝑎 < 𝑋 < 𝑏) = 𝑃(𝑎 ≤ 𝑋 < 𝑏) = 𝑃(𝑎 < 𝑋 ≤ 𝑏).
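As a numerical illustration of interval probabilities as areas, the sketch below integrates a PDF with the midpoint rule. The exponential PDF with rate 1 is an assumption made only for this example (the lecture does not fix a particular PDF), and `interval_probability` is a hypothetical helper.

```python
import math

def pdf(x, rate=1.0):
    """Exponential PDF (illustrative choice): rate * exp(-rate * x) for x >= 0."""
    return rate * math.exp(-rate * x) if x >= 0 else 0.0

def interval_probability(f, a, b, steps=100_000):
    """Approximate the integral of f over [a, b] with the midpoint rule."""
    width = (b - a) / steps
    return sum(f(a + (i + 0.5) * width) for i in range(steps)) * width

# Area under the PDF between 1 and 2, compared with the closed form for this PDF.
p_numeric = interval_probability(pdf, 1.0, 2.0)
p_exact = math.exp(-1.0) - math.exp(-2.0)
print(p_numeric, p_exact)

# A single point is an interval of zero width, so it carries zero probability.
p_point = interval_probability(pdf, 1.0, 1.0)
print(p_point)
```

Because a single point contributes zero area, the same number is obtained whether or not the endpoints 1 and 2 are counted as part of the interval.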

Figure 4: Illustration of a PDF. The probability that 𝑋 takes value in an interval [𝑎, 𝑏] is $\int_a^b f_X(x)\,dx$, which is the shaded area in the figure.

Note that to qualify as a PDF, a function 𝑓_𝑋 must be nonnegative, i.e., 𝑓_𝑋(𝑥) ≥ 0 for every 𝑥, and must also satisfy the normalization equation

$$\int_{-\infty}^{\infty} f_X(x)\,dx = P(-\infty < X < \infty) = 1.$$

Graphically, this means that the entire area under the graph of the PDF must be equal to 1.

To interpret the PDF, note that for an interval [𝑥, 𝑥 + 𝛿] with very small length 𝛿, we

have

$$P([x, x+\delta]) = \int_x^{x+\delta} f_X(t)\,dt \approx f_X(x) \cdot \delta,$$

so we can view 𝑓_𝑋(𝑥) as the “probability mass per unit length” near x (cf. Fig. 5). It is

important to realize that even though a PDF is used to calculate event probabilities, 𝑓_𝑋(𝑥) is not

the probability of any particular event. In particular, it is not restricted to be less than or equal to

one.
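The following sketch illustrates both points: a density value can exceed 1, and 𝑓_𝑋(𝑥) ∙ 𝛿 approximates the probability of a short interval. The PDF 𝑓(𝑥) = 2𝑥 on [0, 1] is an assumed example chosen because its integral is available in closed form; it does not come from the lecture.

```python
def pdf(x):
    """Illustrative PDF: f(x) = 2x on [0, 1], zero elsewhere (total area 1)."""
    return 2.0 * x if 0.0 <= x <= 1.0 else 0.0

def clip_to_unit_interval(v):
    """Restrict v to [0, 1], the region where this PDF is nonzero."""
    return min(max(v, 0.0), 1.0)

def exact_probability(a, b):
    """P(a <= X <= b) for this PDF, using the antiderivative x**2 on [0, 1]."""
    return clip_to_unit_interval(b) ** 2 - clip_to_unit_interval(a) ** 2

x = 0.9
density = pdf(x)  # a density value, not a probability; here it is larger than 1

# Compare the exact interval probability with f(x) * delta as delta shrinks.
for delta in (0.1, 0.01, 0.001):
    exact = exact_probability(x, x + delta)
    approx = density * delta
    print(delta, exact, approx)
```

For this PDF the gap between the exact and approximate values is 𝛿², so the approximation improves quickly as 𝛿 shrinks.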

Figure 5: Interpretation of the PDF 𝑓_𝑋(𝑥) as “probability mass per unit length” around 𝑥. If δ is

very small, the probability that 𝑋 takes value in the interval [𝑥, 𝑥 + 𝛿] is the shaded area in the

figure, which is approximately equal to 𝑓_𝑋(𝑥) ∙ 𝛿.

Summary of PDF Properties

Let 𝑋 be a continuous random variable with PDF 𝑓_𝑋:

• 𝑓_𝑋(𝑥) ≥ 0 for all 𝑥.

• $\int_{-\infty}^{\infty} f_X(x)\,dx = 1$.

• If 𝛿 is very small, then $P([x, x+\delta]) \approx f_X(x) \cdot \delta$.

• For any subset 𝐵 of the real line, $P(X \in B) = \int_B f_X(x)\,dx$.

3 Cumulative distribution functions

We have been dealing with discrete and continuous random variables in a somewhat

different manner, using PMFs and PDFs, respectively. It would be desirable to describe all kinds

of random variables with a single mathematical concept. This is accomplished by the

cumulative distribution function, or CDF for short. The CDF of a random variable 𝑿 is

denoted by 𝑭_𝑿 and provides the probability 𝑷(𝑿 ≤ 𝒙). In particular, for every 𝑥 we have

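$$F_X(x) = P(X \le x) = \begin{cases} \displaystyle\sum_{k \le x} p_X(k), & X \text{ discrete}, \\[2ex] \displaystyle\int_{-\infty}^{x} f_X(t)\,dt, & X \text{ continuous}. \end{cases}$$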

Loosely speaking, the CDF 𝐹_𝑋(𝑥) “accumulates” probability “up to” the value 𝑥.

Any random variable associated with a given probability model has a CDF, regardless of

whether it is discrete, continuous, or other. This is because {𝑋 ≤ 𝑥} is always an event and

therefore has a well-defined probability. Figures 6 and 7 illustrate the CDFs of various discrete

and continuous random variables. From these figures, as well as from the definition, some

general properties of the CDF can be observed.
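For the discrete case, the definition can be sketched in a few lines of Python. The PMF used here is that of the number of heads in two tosses of a fair coin, from Section 1.2; the function name `cdf` is an illustrative choice.

```python
from fractions import Fraction

# PMF of X = number of heads in two independent tosses of a fair coin.
pmf = {0: Fraction(1, 4), 1: Fraction(1, 2), 2: Fraction(1, 4)}

def cdf(x):
    """F_X(x) = P(X <= x): add the PMF over all values k with k <= x."""
    return sum((p for k, p in pmf.items() if k <= x), Fraction(0))

# Evaluate the CDF at a few real numbers, including points between the values of X.
for x in (-1, 0, 0.5, 1, 1.9, 2, 10):
    print(x, cdf(x))
```

The CDF stays constant between the values 0, 1, and 2 and increases only at those values, by an amount equal to the probability mass located there.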

Figure 6: CDFs of some discrete random variables. The CDF is related to the PMF through the formula $F_X(x) = P(X \le x) = \sum_{k \le x} p_X(k)$, and has a staircase form, with jumps occurring at the values of positive probability mass. Note that at the points where a jump occurs, the value of 𝐹_𝑋 is the larger of the two corresponding values (i.e., 𝐹_𝑋 is continuous from the right).

Figure 7: CDFs of some continuous random variables. The CDF is related to the PDF through the formula $F_X(x) = P(X \le x) = \int_{-\infty}^{x} f_X(t)\,dt$. Thus, the PDF 𝑓_𝑋 can be obtained from the CDF by differentiation: $f_X(x) = \frac{dF_X}{dx}(x)$. For a continuous random variable, the CDF has no jumps, i.e., it is continuous.

Properties of a CDF

The CDF 𝐹_𝑋 of a random variable 𝑋 is defined by 𝐹_𝑋(𝑥) = 𝑃(𝑋 ≤ 𝑥), for all 𝑥, and has the following properties.

• 𝐹_𝑋 is monotonically nondecreasing: if 𝑥 ≤ 𝑦, then 𝐹_𝑋(𝑥) ≤ 𝐹_𝑋(𝑦).

• 𝐹_𝑋(𝑥) tends to 0 as 𝑥 → −∞, and to 1 as 𝑥 → ∞.

• If 𝑋 is discrete, then 𝐹_𝑋 has a piecewise constant, staircase-like form.

• If 𝑋 is continuous, then 𝐹_𝑋 has a continuously varying form.

• If 𝑋 is discrete and takes integer values, the PMF and the CDF can be obtained from each other by summing or differencing:

$$F_X(k) = \sum_{i=-\infty}^{k} p_X(i),$$

$$p_X(k) = P(X \le k) - P(X \le k-1) = F_X(k) - F_X(k-1),$$

for all integers 𝑘.

• If 𝑋 is continuous, the PDF and the CDF can be obtained from each other by integration or differentiation:

$$F_X(x) = \int_{-\infty}^{x} f_X(t)\,dt,$$

$$f_X(x) = \frac{dF_X}{dx}(x).$$
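The last two properties can be checked numerically. The sketch below uses a fair six-sided die for the integer-valued case and an exponential distribution with rate 1 for the continuous case; both are assumed examples, and all function names are illustrative.

```python
import math
from fractions import Fraction

# Discrete, integer-valued case: a fair six-sided die (illustrative choice).
pmf = {k: Fraction(1, 6) for k in range(1, 7)}

def discrete_cdf(k):
    """F_X(k): sum of p_X(i) over all integers i <= k."""
    return sum((p for i, p in pmf.items() if i <= k), Fraction(0))

# Differencing the CDF recovers the PMF: p_X(k) = F_X(k) - F_X(k - 1).
recovered_pmf = {k: discrete_cdf(k) - discrete_cdf(k - 1) for k in range(1, 7)}

# Continuous case: exponential distribution with rate 1 (illustrative choice).
def continuous_cdf(x):
    """F_X(x) = 1 - exp(-x) for x >= 0, and 0 otherwise."""
    return 1.0 - math.exp(-x) if x >= 0 else 0.0

def continuous_pdf(x):
    """f_X(x) = exp(-x) for x >= 0, and 0 otherwise."""
    return math.exp(-x) if x >= 0 else 0.0

def numerical_derivative(f, x, h=1e-6):
    """Central-difference approximation of the derivative of f at x."""
    return (f(x + h) - f(x - h)) / (2 * h)

x = 1.5
print(recovered_pmf == pmf)
print(numerical_derivative(continuous_cdf, x), continuous_pdf(x))
```

The central difference is only an approximation of the derivative, so the two printed numbers in the continuous case agree closely rather than exactly.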