Recursive Estimation
Raffaello DâAndrea Spring 2014
Problem Set 2:
Bayesâ Theorem and Bayesian Tracking
Last updated: March 18, 2015
Notes:
âĒ Notation: Unless otherwise noted, x, y, and z denote random variables, f
x(x) (or the short hand f (x)) denotes the probability density function of x, and f
x|y(x|y) (or f (x|y)) denotes the conditional probability density function of x conditioned on y. The expected value is denoted by E [·], the variance is denoted by Var[·], and Pr (Z) denotes the probability that the event Z occurs. A normally distributed random variable x with mean Âĩ and variance Ï
2is denoted by x âž N Âĩ, Ï
2.
âĒ Please report any errors found in this problem set to the teaching assistants
([email protected] or [email protected]).
Problem Set
Problem 1
Mr. Jones has devised a gambling system for winning at roulette. When he bets, he bets on red, and places a bet only when the ten previous spins of the roulette have landed on a black number. He reasons that his chance of winning is quite large since the probability of eleven consecutive spins resulting in black is quite small. What do you think of this system?
Problem 2
Consider two boxes, one containing one black and one white marble, the other, two black and one white marble. A box is selected at random with equal probability and a marble is drawn at random with equal probability from the selected box. What is the probability that the marble is black?
Problem 3
In Problem 2, what is the probability that the first box was the one selected given that the marble is white?
Problem 4
Urn 1 contains two white balls and one black ball, while urn 2 contains one white ball and five black balls. One ball is drawn at random with equal probability from urn 1 and placed in urn 2. A ball is then drawn from urn 2 at random with equal probability. It happens to be white.
What is the probability that the transferred ball was white?
Problem 5
Stores A, B and C have 50, 75, 100 employees, and respectively 50, 60 and 70 percent of these are women. Resignations are equally likely among all employees, regardless of sex. One employee resigns and this is a woman. What is the probability that she works in store C?
Problem 6
a) A gambler has in his pocket a fair coin and a two-headed coin. He selects one of the coins at random with equal probability, and when he flips it, it shows heads. What is the probability that it is the fair coin?
b) Suppose that he flips the same coin a second time and again it shows heads. What is now the probability that it is the fair coin?
c) Suppose that he flips the same coin a third time and it shows tails. What is now the probability that it is the fair coin?
Problem 7
Urn 1 has five white and seven black balls. Urn 2 has three white and twelve black balls. An
urn is selected at random with equal probability and a ball is drawn at random with equal
probability from that urn. Suppose that a white ball is drawn. What is the probability that the
second urn was selected?
Problem 8
An urn contains b black balls and r red balls. One of the balls is drawn at random with equal probability, but when it is put back into the urn c additional balls of the same color are put in with it. Now suppose that we draw another ball at random with equal probability. Show that the probability that the first ball drawn was black, given that the second ball drawn was red is b/(b + r + c).
Problem 9
Three prisoners are informed by their jailer that one of them has been chosen to be executed at random with equal probability, and the other two are to be freed. Prisoner A asks the jailer to tell him privately which of his fellow prisoners will be set free, claiming that there would be no harm in divulging this information, since he already knows that at least one will go free. The jailer refuses to answer this question, pointing out that if A knew which of his fellows were to be set free, then his own probability of being executed would rise from 1/3 to 1/2, since he would then be one of two prisoners. What do you think of the jailerâs reasoning?
Problem 10
Let x and y be independent random variables. Let g(·) and h(·) be arbitrary functions of x and y, respectively. Define the random variables v = g(x) and w = h(y). Prove that v and w are independent. That is, functions of independent random variables are independent.
Problem 11
Let x be a continuous, uniformly distributed random variable with x â X = [â5, 5]. Let z
1= x + n
1z
2= x + n
2,
where n
1and n
2are continuous random variables with probability density functions
f
n1(n
1) =
ïĢą
ïĢē
ïĢģ
Îą
1(1 + n
1) for â 1 âĪ n
1âĪ 0 Îą
1(1 â n
1) for 0 âĪ n
1âĪ 1
0 otherwise ,
f
n2(n
2) =
ïĢą
ïĢē
ïĢģ
Îą
21 +
12n
2for â 2 âĪ n
2âĪ 0 Îą
21 â
12n
2for 0 âĪ n
2âĪ 2 0 otherwise ,
where Îą
1and Îą
2are normalization constants. Assume that the random variables x, n
1, n
2are independent, i.e. f (x, n
1, n
2) = f (x) f (n
1) f (n
2).
a) Calculate Îą
1and Îą
2.
b) Use the change of variables formula from Lecture 2 to show that f
zi|x(z
i|x) = f
ni(z
iâ x).
c) Calculate f (x|z
1= 0, z
2= 0).
d) Calculate f (x|z
1= 0, z
2= 1).
e) Calculate f (x|z
1= 0, z
2= 3).
Consider the following estimation problem: an object B moves randomly on a circle with radius 1. The distance to the object can be measured from a given observation point P . The goal is to estimate the location of the object, see Figure 1.
x y
-1
distance B
P L
Îļ
Figure 1: Illustration of the estimation problem.
The object B can only move in discrete steps. The objectâs location at time k is given by x(k) â {0, 1, . . . , N â 1}, where
Îļ(k) = 2Ï x(k) N . The dynamics are
x(k) = mod (x(k â 1) + v(k) , N ), k = 1, 2, . . . ,
where v(k) = 1 with probability p and v(k) = â1 otherwise. Note that mod (N, N ) = 0 and mod (â1, N ) = N â 1. The distance sensor measures
z(k) = q
(L â cos Îļ (k))
2+ (sin Îļ (k))
2+ w(k),
where w(k) represents the sensor error which is uniformly distributed on [âe, e]. We assume that x(0) is uniformly distributed and x(0), v(k) and w(k) are independent.
Simulate the object movement and implement a Bayesian tracking algorithm that calculates for each time step k the probability density function f (x(k)|z(1 : k)).
a) Test the following settings and discuss the results: N = 100, x(0) =
N4, e = 0.5, (i) L = 2, p = 0.5,
(ii) L = 2, p = 0.5, (iii) L = 2, p = 0.55, (iv) L = 0.1, p = 0.55, (v) L = 0, p = 0.55.
b) How robust is the algorithm? Set N = 100, x(0) =
N4, e = 0.5, L = 2, p = 0.55 in the simulation, but use slightly different values for p and e in your estimation algorithm, Ë p and Ë
e, respectively. Test the algorithm and explain the result for:
(i) p = 0.45, Ë Ë e = e, (ii) p = 0.5, Ë Ë e = e, (iii) p = 0.9, Ë Ë e = e, (iv) p = p, Ë e = 0.9, Ë
(v) p = p, Ë e = 0.45. Ë
Sample solutions
Problem 1
We introduce a discrete random variable x
i, representing the outcome of the ith spin with x
iâ {red, black}. We assume that both are equally likely and ignore the fact that there is a 0 or a 00 on a Roulette board.
Assuming independence between spins, we have
Pr (x
k, x
kâ1, x
kâ2, . . . , x
1) = Pr (x
k) Pr (x
kâ1) . . . Pr (x
1) . The probability of eleven consecutive spins resulting in black is
Pr (x
11= black, x
10= black, x
9= black, . . . , x
1= black) = 1 2
11â 0.0005.
This value is actually quite small. However, given that the previous ten were black, we calculate Pr (x
11= black|x
10= black, x
9= black, . . . , x
1= black) (by independence assumption)
= Pr (x
11= black) = 1 2
= Pr (x
11= red|x
10= black, x
9= black, . . . , x
1= black)
i.e. it is equally likely that ten black spins in a row are followed by a red spin, as that they are followed by another black spin (by the independence assumption). Mr. Jonesâ system is therefore no better than randomly betting on black or red.
Problem 2
We introduce two discrete random variables. Let x â {1, 2} represent which box is chosen (box 1 or 2) with probability f
x(1) = f
x(2) =
12. Furthermore, let y â {b, w} represent the color of the drawn marble, where b is a black and w a white marble with probabilities
f
y|x(b|1) = f
y|x(w|1) = 1 2 , f
y|x(b|2) = 2
3 , f
y|x(w|2) = 1 3 .
Then, by the total probability theorem, we find f
y(b) = f
y|x(b|1) f
x(1) + f
y|x(b|2) f
x(2) = 1
2 · 1 2 + 2
3 · 1 2 = 7
12 .
Problem 3
f
x|y(1|w) = f
y|x(w|1) f
x(1) f
y(w) =
1 2
·
121 â f
y(b) =
1 4 5 12
= 3
5 .
Let x â {b, w} represent the color of the ball drawn from urn 1, where f
x(b) =
13, f
x(w) =
23, and y â {b, w} be the color of the ball subsequently drawn from urn 2. Considering the different possibilities, we have
f
y|x(b|b) = 6
7 f
y|x(b|w) = 5
7 f
y|x(w|b) = 1
7 f
y|x(w|w) = 2
7 .
We seek the probability that the transferred ball was white, given that the second ball drawn is white and calculate
f
x|y(w|w) = f
y|x(w|w) f
x(w) f
y(w) =
2 7
·
23f
y|x(w|b) f
x(b) + f
y|x(w|w) f
x(w)
=
2 7
·
231
7
·
13+
27·
23= 4 5 .
Problem 5
We introduce two discrete random variables, x â {A, B, C} and y â {M, F }, where x represents which store an employee works in, and y the sex of the employee. From the problem description, we have
f
x(A) = 50 225 = 2
9 f
x(B) = 75
225 = 1 3 f
x(C) = 100
225 = 4 9 ,
and the probability that an employee is a woman is f
y|x(F |A) = 1
2 f
y|x(F |B) = 3 5 f
y|x(F |C) = 7
10 .
We seek the probability that the resigning employee works in store C, given that it is a woman and calculate
f
x|y(C|F ) = f
y|x(F |C) f
x(C) f
y(F ) =
7 10
·
49X
iâ{A,B,C}
f
y|x(F |i) f
x(i)
=
7 10
·
491
2
·
29+
35·
13+
107·
49= 1 2 .
Problem 6
a) Let x â {F, U } represent whether it is a fair (F ) or an unfair (U ) coin with f
x(F ) = f
x(U ) =
12. We introduce y â {h, t} to represent how the toss comes up (heads or tails) with
f
y|x(h|F ) = 1
2 f
y|x(t|F ) = 1
2
f
y|x(h|U ) = 1 f
y|x(t|U ) = 0 .
We seek the probability that the drawn coin is fair, given that the toss result is heads and calculate
f
x|y(F |h) = f
y|x(h|F ) f
x(F ) f
y(h)
=
1 2
·
12f
y|x(h|F ) f
x(F )
| {z }
fair coin
+ f
y|x(h|U ) f
x(U )
| {z }
unfair coin
=
1 4 1
2
·
12+ 1 ·
12= 1 3 .
b) Let y
1represent the result of the first flip and y
2that of the second flip. We assume conditional independence between flips (conditioned on x), yielding
f
y1,y2|x(y
1, y
2|x) = f
y1|x(y
1|x) f
y2|x(y
2|x) .
We seek the probability that the drawn coin is fair, given that the both tosses resulted in heads and calculate
f
x|y1,y2(F |h, h) = f
y1,y2|x(h, h|F ) f
x(F ) f
y1,y2(h, h) =
1 2
·
12·
121 2
2·
12| {z }
fair coin
+ 1
2·
12| {z }
unfair coin
= 1 5 .
c) Obviously, the probability then is 1, because the unfair coin cannot show tails. Formally, we show this by introducing y
3to represent the result of the third flip and computing
f
x|y1,y2,y3(F |h, h, t) = f
y1,y2,y3|x(h, h, t|F ) f
x(F ) f
y1,y2,y3(h, h, t) =
1
2
·
12·
12·
121 2
3·
12| {z }
fair coin
+ 1
2· 0 ·
12| {z }
unfair coin
= 1.
Problem 7
We introduce the following two random variables:
âĒ x â {1, 2} represents which box is chosen (box 1 or 2) with probability f
x(1) = f
x(2) =
12,
âĒ y â {b, w} represents the color of the drawn ball, where b is a black and w a white ball with probabilities
f
y|x(b|1) = 7
12 f
y|x(w|1) = 5
12 f
y|x(b|2) = 12
15 f
y|x(w|2) = 3
15 .
We seek the probability that the second box was selected, given that the ball drawn is white and calculate
f
x|y(2|w) = f
y|x(w|2) f
x(2) f
y(w) =
3 15
·
125
12
·
12+
153·
12= 12
37 .
Let x
1â {B, R} be the color (black or red) of the first ball drawn with f
x1(B) =
b+rband f
x1(R) =
b+rr; and let x
2â {B, R} be the color of the second ball with
f
x2|x1(B|R) = b
b + r + c f
x2|x1(R|R) = c + r b + r + c f
x2|x1(B|B) = b + c
b + r + c f
x2|x1(R|B) = r
b + r + c .
We seek the probability that the first ball drawn was black, given that the second ball drawn is red and calculate
f
x1|x2(B|R) = f
x2|x1(R|B) f
x1(B) f
x2(R)
=
r
b+r+c
·
b+rbc + r
b + r + c · r b + r
| {z }
first red
+ r
b + r + c · b b + r
| {z }
first black
= b
b + r + c .
Problem 9
We can approach the solution in two ways:
Descriptive solution: The probability that A is to be executed is
13, and there is a chance of
23that one of the others was chosen. If the jailer gives away the name of one of the fellow prisoners who will be set free, prisoner A does not get new information about his own fate, but the probability of the remaining prisoner (B or C) to be executed is now
23. The probability of A being executed is still
13.
Bayesian analysis: Let x represent which prisoner is to be executed, where x â {A, B, C}.
We assume that it is a random choice, i.e. f
x(A) = f
x(B) = f
x(C) =
13.
Now let y â {B, C} be the prisoner name given away by the jailer. We can now write the conditional probabilities:
f
y|x(y|x) =
ïĢą
ïĢī ïĢē
ïĢī ïĢģ
0 if x = y (the jailer does not lie)
1
2
if x = A (A is to be executed, jailer mentions B and C with equal probability) 1 if x 6= A (jailer is forced to give the name of the other prisoner to be set free).
You could also do this with a table:
y x f
y|x(y|x)
B A 1/2
B B 0
B C 1
C A 1/2
C B 1
C C 0
To answer the question, we have to compare f
x|y(A|ÂŊ y) , ÂŊ y â {B, C} , with f
x(A):
f
x|y(A|ÂŊ y) = f
y|x(ÂŊ y|A) f
x(A) f
y(ÂŊ y) =
1 2
·
13X
kâ{A,B,C}
f
y|x(ÂŊ y|k) · f
x(k)
=
1 6 1
|{z}
2 fy|x(ÂŊy|A)·
13+ 0
|{z}
fy|x(ÂŊy|ÂŊy)
·
13+ 1
|{z}
fy|x(ÂŊy| not ÂŊy)
·
13= 1 3 ,
where (not ÂŊ y) = C if ÂŊ y = B and (not ÂŊ y) = B if ÂŊ y = C. The value of the posterior probability is the same as the prior one, f
x(A). The jailer is wrong: prisoner A gets no additional information from the jailer about his own fate!
See also Wikipedia: Three prisoners problem, Monty Hall problem.
Problem 10
Consider the joint cumulative distribution F
v,w(ÂŊ v, ÂŊ w) = Pr ((v âĪ ÂŊ v) and (w âĪ ÂŊ w))
= Pr ((g(x) âĪ ÂŊ v) and (h(y) âĪ ÂŊ w)) . We define the sets A
vÂŊand A
wÂŊas below:
A
vÂŊ=
x â ÂŊ X : g(x) âĪ ÂŊ v A
wÂŊ=
y â ÂŊ Y : h(y) âĪ ÂŊ w . Now we can deduce
F
v,w(ÂŊ v, ÂŊ w) = Pr ((x â A
vÂŊ) and (y â A
wÂŊ)) â ÂŊ v, ÂŊ w
= Pr (x â A
ÂŊv) Pr (y â A
wÂŊ) (by independence assumption)
= Pr (g(x) âĪ ÂŊ v) Pr (h(y) âĪ ÂŊ w)
= Pr (v âĪ ÂŊ v) Pr (w âĪ ÂŊ w)
= F
v(ÂŊ v) F
w( ÂŊ w) . Therefore v and w are independent.
Problem 11
a) By definition of a PDF, the integrals of the PDFs f
ni(n
i), i = 1, 2 must evaluate to one, which we use to find the Îą
i. We integrate the probability density functions, see the following figure:
f
n1Îą
1â1 n
11 2
· 2ι
11
f
n2Îą
2â2 n
21 2
· 4ι
22
1
Z
âââ
f
n1(ÂŊ n
1) dÂŊ n
1= Îą
1ïĢŦ
ïĢ Z
0â1
(1 + ÂŊ n
1) dÂŊ n
1+ Z
10
(1 â ÂŊ n
1) dÂŊ n
1ïĢķ
ïĢļ
= Îą
11 â 1
2 + 1 â 1 2
= Îą
1. Therefore Îą
1= 1. For n
2, we get
Z
âââ
f
n2(ÂŊ n
2) dÂŊ n
2= Îą
2ïĢŦ
ïĢ Z
0â2
1 + 1
2 ÂŊ n
2dÂŊ n
2+
Z
20
1 â 1
2 n ÂŊ
2dÂŊ n
2ïĢķ
ïĢļ
= Îą
2(2 â 1 + 2 â 1) = 2Îą
2. Therefore Îą
2=
12.
b) The goal is to calculate f
zi|x(z
i|x) from z
i= g(n
i, x) := x + n
iand the given PDFs f
ni(n
i), i = 1, 2. We apply the change of variables formula for CRVs with conditional PDFs:
f
zi|x(z
i|x) = f
ni|x(n
i|x)
âg
âni
(n
i, x) (just think of x as a parameter that parametrizes the PDFs).
The proof for this formula is analogous to the proof in the lecture notes for a change of variables for CRVs with unconditional PDFs. We find that
âg
ân
i(n
i, x) = 1, for all n
i, x
and, therefore, the fraction in the change of variables formula is well-defined for all values of n
iand x. Due to the independence of n
iand x, f
ni|x(n
i|x) = f
ni(n
i) :
f
zi|x(z
i|x) = f
ni|x(n
i|x)
âg
âni
(n
i, x) = f
ni(n
i) . Substituting n
i= z
iâ x, we finally obtain
f
zi|x(z
i|x) = f
ni(n
i) = f
ni(z
iâ x) . c) We use Bayesâ theorem to calculate
f
x|z1,z2(x|z
1, z
2) = f
z1,z2|x(z
1, z
2|x) f
x(x)
f
z1,z2(z
1, z
2) . (1)
First, we calculate the prior PDF of the CRV x, which is uniformly distributed:
f
x(x) = (
110
for â 5 âĪ x âĪ 5
0 otherwise .
In b), we showed by change of variables that f (z
i|x) = f
ni(z
iâ x). Since n
1, n
2are independent, it follows that z
1, z
2are conditionally independent given x. Therefore, we can rewrite the measurement likelihood as
f
z1,z2|x(z
1, z
2|x) = f
z1|x(z
1|x) f
z2|x(z
2|x) .
We calculate the individual conditional PDFs f
zi|x(z
i|x):
f
z1|x(z
1|x) = f
n1(z
1â x) =
ïĢą
ïĢī ïĢē
ïĢī ïĢģ
Îą
1(1 â z
1+ x) for 0 âĪ z
1â x âĪ 1 Îą
1(1 + z
1â x) for â 1 âĪ z
1â x âĪ 0
0 otherwise.
The conditional PDF of z
1given x is illustrated in the following figure:
f
z1|x(z
1|x) Îą
1x â 1
z
1x + 1 x
Analogously
f
z2|x(z
2|x) = f
n2(z
2â x) =
ïĢą
ïĢī ïĢē
ïĢī ïĢģ
Îą
21 â
12z
2+
12x
for 0 âĪ z
2â x âĪ 2 Îą
21 +
12z
1â
12x
for â 2 âĪ z
2â x âĪ 0
0 otherwise.
Let numx be the numerator of the Bayesâ rule fraction (1). Given the measurements z
1= 0, z
2= 0, we find
numx = 1
10 f
z1|x(0|x) f
z2|x(0|x) .
We consider four different intervals of x: [â5, â1], [â1, 0], [0, 1] and [1, 5]. Evaluating numx for these intervals results in:
âĒ for x â [â5, â1] or x â [1, 5], numx = 0
âĒ for x â [â1, 0], numx = 1
10 Îą
1(1 + x) Îą
21 + x
2
= 1
20 (1 + x) 1 + x
2
âĒ for x â [0, 1], numx = 1
10 Îą
1(1 â x) Îą
21 â x
2
= 1
20 (1 â x) 1 â x
2
.
Finally, we need to calculate the denominator of the Bayesâ rule fraction (1), the normal- ization constant, which can be calculated using the total probability theorem:
f
z1,z2(z
1, z
2) = Z
âââ
f
z1,z2|x(z
1, z
2|x) f
x(x) dx = Z
âââ
numx dx.
Evaluated at z
1= z
2= 0, we find
f
z1,z2(0, 0) = 1 20
ïĢŦ
ïĢ Z
0â1
(1 + x) 1 + x
2
dx + Z
10
(1 â x) 1 â x
2
dx
ïĢķ
ïĢļ
= 1 20
5 12 + 5
12
= 1
24 .
f
x|z1,z2(x|0, 0) = 1
f
z1,z2(0, 0) numx = 24 · numx
=
ïĢą
ïĢī ïĢē
ïĢī ïĢģ
0 for â 5 âĪ x âĪ â1, 1 âĪ x âĪ 5
6
5
(1 + x) 1 +
x2for â 1 âĪ x âĪ 0
6
5
(1 â x) 1 â
x2for 0 âĪ x âĪ 1 which is illustrated in the following figure:
f
x|z1,z2(x|0, 0)
6 5
â1 1 x
The posterior PDF is symmetric about x = 0 with a maximum at x = 0: both sensors
âagreeâ.
d) Similar to the above. Given z
1= 0, z
2= 1, we write the numerator as numx = 1
10 f
z1|x(0|x) f
z2|x(1|x) . (2)
Again, we consider the same four intervals:
âĒ for x â [â5, â1] or x â [1, 5], numx = 0
âĒ for x â [â1, 0], numx = 1
10 Îą
1(1 + x) Îą
21 2 + 1
2 x
= 1
40 (1 + x)
2âĒ for x â [0, 1]
numx = 1
10 Îą
1(1 â x) Îą
21 2 + 1
2 x
= 1
40 1 â x
2.
Normalizing yields
f
z1,z2(0, 1) = Z
1â1