Two groups with 0 (upper), 1 (lower left), 2 (lower right) people in common

To compute Cov(Xi, Xj) for i < j, consider how many people groups i and j have in common. If the number of people in common is 0 or 1 (the upper and lower left cases in the above figure, respectively), then Cov(Xi, Xj) = 0, since the coin flips used to determine whether group i is a clique are independent of those used for group j. If there are 2 people in common (the lower right case of Figure 1), then

$$\mathrm{Cov}(X_i, X_j) = E(X_i X_j) - E(X_i)E(X_j) = \frac{1}{2^5} - \left(\frac{1}{2^3}\right)^2 = \frac{1}{64},$$

since 5 distinct pairs of people must know each other to make XiXj equal to 1.

There are $\binom{n}{4}\binom{4}{2} = 6\binom{n}{4}$ pairs of groups {i, j} (i ≠ j) with 1 pair of people in common (choose 4 people out of the n, then choose which 2 of the 4 are the overlap of the groups).

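The remaining pairs of groups have covariance 0. Each Xi is Bernoulli with success probability 1/2³, so Var(Xi) = (1/8)(7/8). Thus, the variance of the number of cliques is

$$\sum_{i} \mathrm{Var}(X_i) + 2\sum_{i<j} \mathrm{Cov}(X_i, X_j) = \binom{n}{3}\cdot\frac{1}{8}\cdot\frac{7}{8} + 2\cdot 6\binom{n}{4}\cdot\frac{1}{64} = \frac{7}{64}\binom{n}{3} + \frac{3}{16}\binom{n}{4}.$$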
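As a sanity check on the covariance argument, the sketch below enumerates every possible network on a small number of people and computes the variance of the number of 3-person cliques exactly. It compares the result with (7/64)C(n,3) + (3/16)C(n,4), which is what the pieces above give when each of the C(n,3) indicators has variance (1/8)(7/8) and each of the 6C(n,4) overlapping pairs of groups contributes covariance 1/64 twice. The function names are illustrative and not from the text.

```python
from fractions import Fraction
from itertools import combinations, product
from math import comb


def triangle_variance_exact(n):
    """Exact variance of the number of 3-cliques, by brute-force enumeration.

    Each pair of people knows each other independently with probability 1/2,
    so all 2^C(n,2) networks are equally likely. Only feasible for small n.
    """
    pairs = list(combinations(range(n), 2))
    triples = list(combinations(range(n), 3))
    total = 0
    total_sq = 0
    for flips in product((0, 1), repeat=len(pairs)):
        knows = dict(zip(pairs, flips))
        # combinations() yields sorted tuples, so (a, b), (a, c), (b, c) are valid keys
        count = sum(
            1
            for a, b, c in triples
            if knows[(a, b)] and knows[(a, c)] and knows[(b, c)]
        )
        total += count
        total_sq += count * count
    num_networks = 2 ** len(pairs)
    mean = Fraction(total, num_networks)
    return Fraction(total_sq, num_networks) - mean ** 2


def triangle_variance_formula(n):
    """Variance assembled from the indicator variances and covariances."""
    return Fraction(7, 64) * comb(n, 3) + Fraction(3, 16) * comb(n, 4)


if __name__ == "__main__":
    for n in range(3, 6):
        print(n, triangle_variance_exact(n), triangle_variance_formula(n))
```

Exact fractions are used instead of floats so that agreement between the two functions is an equality rather than an approximate match.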

(c) We will prove the existence of a network with the desired property by showing that the probability that a random network has the property is positive (this strategy is explored in the starred Section 4.9). Form a random network as in (a), and let Ai be the event that the ith group of k people (in any fixed ordering) is neither a clique nor an anticlique. We have

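$$P\left(\bigcup_{i=1}^{\binom{n}{k}} A_i^c\right) \le \sum_{i=1}^{\binom{n}{k}} P(A_i^c) = \binom{n}{k}\, 2^{-\binom{k}{2}+1} < 1,$$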

68 Chapter 7: Joint distributions

which shows that

$$P\left(\bigcap_{i=1}^{\binom{n}{k}} A_i\right) = 1 - P\left(\bigcup_{i=1}^{\binom{n}{k}} A_i^c\right) > 0,$$

as desired. Alternatively, let C be the number of cliques of size k and A be the number of anticliques of size k, and write C + A = T . Then

$$E(T) = E(C) + E(A) = \binom{n}{k}\, 2^{-\binom{k}{2}+1} < 1,$$

by the method of Part (a). So P (T = 0) > 0, since P (T ≥ 1) = 1 would imply E(T ) ≥ 1.

This again shows that there must be a network with the desired property.
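The expectation bound can be evaluated directly. The following is a minimal sketch, with hypothetical function names, that computes E(T) = C(n,k) 2^(1 − C(k,2)) exactly and finds the largest n for which the bound E(T) < 1 holds for a given k. It only reports where this particular argument applies; it says nothing about larger n, where a network with the property may or may not exist.

```python
from fractions import Fraction
from math import comb


def expected_cliques_plus_anticliques(n, k):
    """E(T): expected number of k-cliques plus k-anticliques in a random network."""
    return Fraction(comb(n, k), 2 ** (comb(k, 2) - 1))


def largest_n_with_guarantee(k):
    """Largest n >= k with E(T) < 1, so the existence argument goes through.

    Assumes k >= 3; for k = 2 the bound E(T) < 1 never holds.
    """
    if k < 3:
        raise ValueError("the bound E(T) < 1 requires k >= 3")
    n = k
    while expected_cliques_plus_anticliques(n + 1, k) < 1:
        n += 1
    return n


if __name__ == "__main__":
    for k in range(3, 8):
        print(k, largest_n_with_guarantee(k))
```

For example, with k = 4 the bound reads C(n,4) < 32, which holds for n = 6 (C(6,4) = 15) but fails for n = 7 (C(7,4) = 35).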

85. Shakespeare wrote a total of 884647 words in his known works. Of course, many words are used more than once, and the number of distinct words in Shakespeare’s known writings is 31534 (according to one computation). This puts a lower bound on the size of Shakespeare’s vocabulary, but it is likely that Shakespeare knew words which he did not use in these known writings.

More specifically, suppose that a new poem of Shakespeare were uncovered, and consider the following (seemingly impossible) problem: give a good prediction of the number of words in the new poem that do not appear anywhere in Shakespeare’s previously known works.

Ronald Thisted and Bradley Efron studied this problem in the papers [?] and [?], developing theory and methods and then applying the methods to try to determine whether Shakespeare was the author of a poem discovered by a Shakespearean scholar in 1985.

A simplified version of their method is developed in the problem below. The method was originally invented by Alan Turing (the founder of computer science) and I.J. Good as part of the effort to break the German Enigma code during World War II.

Let N be the number of distinct words that Shakespeare knew, and assume these words are numbered from 1 to N. Suppose for simplicity that Shakespeare wrote only two plays, A and B. The plays are reasonably long and they are of the same length. Let Xj be the number of times that word j appears in play A, and Yj be the number of times it appears in play B, for 1 ≤ j ≤ N.

(a) Explain why it is reasonable to model Xj as being Poisson, and Yj as being Poisson with the same parameter as Xj.

(b) Let the numbers of occurrences of the word “eyeball” (which was coined by Shakespeare) in the two plays be independent Pois(λ) r.v.s. Show that the probability that “eyeball” is used in play B but not in play A is

e^−λ(λ − λ²/2! + λ³/3! − λ⁴/4! + . . . ).

(c) Now assume that λ from (b) is unknown and is itself taken to be a random variable to reflect this uncertainty. So let λ have a PDF f0. Let X be the number of times the word “eyeball” appears in play A and Y be the corresponding value for play B. Assume that the conditional distribution of X, Y given λ is that they are independent Pois(λ) r.v.s. Show that the probability that “eyeball” is used in play B but not in play A is the alternating series

P(X = 1) − P(X = 2) + P(X = 3) − P(X = 4) + . . . .

Hint: Condition on λ and use (b).

(d) Assume that every word’s numbers of occurrences in A and B are distributed as in (c), where λ may be different for different words but f0 is fixed. Let Wj be the number of words that appear exactly j times in play A. Show that the expected number of distinct words appearing in play B but not in play A is

E(W1) − E(W2) + E(W3) − E(W4) + . . . .

(This shows that W1 − W2 + W3 − W4 + . . . is an unbiased predictor of the number of distinct words appearing in play B but not in play A: on average it is correct. Moreover, it can be computed just from having seen play A, without needing to know f0 or any of the λj. This method can be extended in various ways to give predictions for unobserved plays based on observed plays.)

Solution:

(a) It is reasonable to model Xj and Yj as Poisson, because this distribution is used to describe the number of “events” (such as emails received) happening at some average rate in a fixed interval or volume. The Poisson paradigm applies here: each individual word in a play has some very small probability of being word j, and the words are weakly dependent. Here an event means using word j, and the average rate is determined by how frequently Shakespeare uses that word overall. It is reasonable to assume that the average rate of occurrence of a particular word is the same for two plays by the same author, so we take λ to be the same for Xj and Yj.

(b) Let X be the number of times that “eyeball” is used in play A, and Y be the number of times that it is used in play B. Since X and Y are independent Pois(λ),

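$$P(X = 0, Y > 0) = P(X = 0)\,(1 - P(Y = 0)) = e^{-\lambda}\left(1 - e^{-\lambda}\right) = e^{-\lambda}\left(\lambda - \frac{\lambda^2}{2!} + \frac{\lambda^3}{3!} - \frac{\lambda^4}{4!} + \dots\right),$$

where the last step uses the Taylor series e^−λ = 1 − λ + λ²/2! − λ³/3! + . . . .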
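A quick numerical check of the identity in (b): the sketch below compares the closed form e^−λ(1 − e^−λ), which follows from independence and P(X = 0) = P(Y = 0) = e^−λ, with a truncated version of the alternating series from the problem statement. The function names and the truncation length are illustrative choices, not part of the text.

```python
import math


def prob_in_b_not_a_closed_form(lam):
    """P(X = 0, Y > 0) for independent X, Y ~ Pois(lam)."""
    return math.exp(-lam) * (1.0 - math.exp(-lam))


def prob_in_b_not_a_series(lam, terms=60):
    """Truncated alternating series e^-lam * (lam - lam^2/2! + lam^3/3! - ...)."""
    total = 0.0
    term = 1.0  # running value of lam^k / k!, starting at k = 0
    for k in range(1, terms + 1):
        term *= lam / k  # now lam^k / k!
        total += term if k % 2 == 1 else -term
    return math.exp(-lam) * total


if __name__ == "__main__":
    for lam in (0.1, 1.0, 3.0):
        print(lam, prob_in_b_not_a_closed_form(lam), prob_in_b_not_a_series(lam))
```

Updating the term as λ^k/k! incrementally avoids computing large powers and factorials separately.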

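(c) Now λ is a random variable. Given λ, the calculation from (b) holds. By the law of total probability,

$$P(X = 0, Y > 0) = \int_0^\infty P(X = 0, Y > 0 \mid \lambda)\, f_0(\lambda)\, d\lambda = \int_0^\infty \sum_{i=1}^{\infty} (-1)^{i+1} \frac{e^{-\lambda}\lambda^i}{i!}\, f_0(\lambda)\, d\lambda = \sum_{i=1}^{\infty} (-1)^{i+1} P(X = i),$$

since e^−λ λ^i/i! is P(X = i | λ), and integrating P(X = i | λ) against f0 gives P(X = i). This is the alternating series P(X = 1) − P(X = 2) + P(X = 3) − P(X = 4) + . . . .

(d) Let Xj be the number of times word j appears in play A, and let W be the number of distinct words that appear in play B but not in play A. Then W = I1 + I2 + · · · + IN, where Ij is the indicator r.v. of the event that word j appears in play B but not in play A, and N is the total number of words. By (c), for 1 ≤ j ≤ N,

$$E I_j = \sum_{i=1}^{\infty} (-1)^{i+1} P(X_j = i).$$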

Also, note that the number of words that appear exactly i times in play A is

$$W_i = I(X_1 = i) + I(X_2 = i) + I(X_3 = i) + \dots + I(X_N = i),$$

where I(Xj = i) is the indicator of word j appearing exactly i times in play A. So

$$E W_i = \sum_{j=1}^{N} E\, I(X_j = i) = \sum_{j=1}^{N} P(X_j = i).$$

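Then

$$E W = \sum_{j=1}^{N} E I_j = \sum_{j=1}^{N} \sum_{i=1}^{\infty} (-1)^{i+1} P(X_j = i) = \sum_{i=1}^{\infty} (-1)^{i+1} \sum_{j=1}^{N} P(X_j = i) = \sum_{i=1}^{\infty} (-1)^{i+1} E W_i = E W_1 - E W_2 + E W_3 - E W_4 + \dots.$$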
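To make the predictor W1 − W2 + W3 − W4 + . . . concrete, here is a minimal sketch that computes it from the list of words in play A. It assumes the play is already tokenized into a list of word strings; the function name and the toy input are hypothetical. A word that appears c ≥ 1 times contributes (−1)^(c+1) to the sum, which is the same as grouping words by count into the Wj.

```python
from collections import Counter


def predict_new_distinct_words(words_in_a):
    """Compute W1 - W2 + W3 - W4 + ... from the words observed in play A.

    W_j is the number of distinct words that appear exactly j times.
    """
    occurrences = Counter(words_in_a)            # word -> number of occurrences
    w = Counter(occurrences.values())            # j -> W_j
    return sum((-1) ** (j + 1) * w_j for j, w_j in w.items())


if __name__ == "__main__":
    toy_play = "a b a c d d d e".split()
    # Counts: a=2, b=1, c=1, d=3, e=1, so W1 = 3, W2 = 1, W3 = 1.
    print(predict_new_distinct_words(toy_play))
```

On a text this short the prediction is not meaningful; the unbiasedness result is a statement about the average over the model in (c), not about any single small sample.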

Chapter 8: Transformations

with fY(y) = 0 otherwise. This says that Y has a Beta(2,1) distribution.

In general, the same method shows that U^(1/α) ∼ Beta(α, 1) for any α > 0.
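The general case can be written out in a few lines. The derivation below assumes U ∼ Unif(0, 1), which is consistent with the Beta(2, 1) result for α = 2 but is an assumption here, since the statement of the problem is not shown in this excerpt.

```latex
% Assumption: U ~ Unif(0,1), Y = U^{1/\alpha}, \alpha > 0.
For $0 < y < 1$,
\[
F_Y(y) = P\left(U^{1/\alpha} \le y\right) = P\left(U \le y^{\alpha}\right) = y^{\alpha},
\qquad
f_Y(y) = \frac{d}{dy}\, y^{\alpha} = \alpha\, y^{\alpha - 1},
\]
with $f_Y(y) = 0$ otherwise. The $\mathrm{Beta}(\alpha, 1)$ PDF is
\[
\frac{\Gamma(\alpha + 1)}{\Gamma(\alpha)\Gamma(1)}\, y^{\alpha - 1} = \alpha\, y^{\alpha - 1},
\quad 0 < y < 1,
\]
so $Y \sim \mathrm{Beta}(\alpha, 1)$.
```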

16. Let X, Y be continuous r.v.s with a spherically symmetric joint distribution, which means that the joint PDF is of the form f(x, y) = g(x² + y²) for some function g. Let (R, θ) be the polar coordinates of (X, Y), so R² = X² + Y² is the squared distance from the origin and θ is the angle (in [0, 2π)), with X = R cos θ, Y = R sin θ.

(a) Explain intuitively why R and θ are independent. Then prove this by finding the joint PDF of (R, θ).

(b) What is the joint PDF of (R, θ) when (X, Y ) is Uniform in the unit disk {(x, y) : x²+ y²≤ 1}?

Solution:

(a) Intuitively, this makes sense since the joint PDF of X, Y at a point (x, y) only

In document Joseph K. Blitzstein and Jessica Hwang Departments of Statistics, Harvard University and Stanford University (Page 69-73)