• No results found

Generating Random Numbers

N/A
N/A
Protected

Academic year: 2021

Share "Generating Random Numbers"

Copied!
10
0
0

Loading.... (view fulltext now)

Full text

(1)

Aim: produce random variables for given distribution Inverse Method

Let F be the distribution function of an univariate distribution and let F −1 (y) = inf{x|F (x) ≥ y} (generalized inverse of F ).

For a uniformly distributed random variable U ∼ U(0, 1), let X = F −1 (U ).

Then X has distribution function F :

P(X ≤ x) = P(F −1 (U ) ≤ x) = P(U ≤ F (x)) = F (x).

Example:

For the exponential distribution, we have F (x) = 1 − exp(−θx). Thus X = F −1 (U ) = − 1

θ log(1 − U )

is exponentially distributed with parameter θ.

Remarks:

The result shows that the generation of sequences of random variables (usually iid) for some given distribution is based on the the production of uniform random variables.

Since the algorithms for generating uniform random variables are deter-

(2)

Generating Random Numbers

Acceptance-Rejection Method

Let f be the density of some univariate distribution we want to sample from. Suppose the f (·) is majorized by c g(·),

f (y) ≤ c g(y),

for some simple probability density g and some constant c > 1.

The idea of acceptance-rejection is to sample proposals X from the simpler (simulationwise) density g and then to reject some proposals which are likely to be overrepresented in the sample:

◦ Sample X ∼ g(x)

◦ Sample U ∼ U (0, 1)

◦ Accept X if U ≤ f (X) c g(X) .

Example: Non-standard prior for binomial parameter Suppose that Y is binomially distributed

Y ∼ Bin(n, θ)

with non-conjugate prior distribution π(θ) = 4  1

2 − θ − 1

2

 , θ ∈ [0, 1].

The posterior distribution (conditional on data Y = y) satisfies π(θ|y) ∼ θ y (1 − θ) n−y  1

2 − θ − 1

2

 , θ ∈ [0, 1].

(3)

Example: Non-standard prior for binomial parameter (contd)

Since 1 2 − |θ − 1 2 | ≤ 1 2 − 2 (θ − 1 2 ) 2 = 2 θ (1 − θ) we get for some c > 0 π(θ|y) ≤ c θ y+1 (1 − θ) n−y+1

The right side is proportional to the density of a beta distribution with parameters y + 2 and n − y + 2. This suggests the following sampling scheme:

◦ Define f (θ) = θ y (1 − θ) n−y 1

2 − θ − 1

2

 g(θ) = 2 θ y+1 (1 − θ) n−y+1

◦ Sample θ ∼ Beta(y + 2, n − y + 2) U ∼ U(0, 1)

◦ Accept θ if U ≤ f (θ) g(θ) .

◦ Note: We did not need to compute the proportionality factor Z

Θ

f (y|θ) π(θ) dθ,

since this cancels in the ratio f (θ) g(θ) .

2.0 2.5 3.0

cg ( θ )

1200

Acceptance−Rejection for Bayesian Inference with Non−Standard Prior 1600

(4)

Generating Random Numbers

For other distributions (e.g. the normal) exist special methods.

In this course, we will only use the built-in generators in R:

N (µ, σ 2 ) rnorm(n,mean=0,sd=1) U[min, max] runif(n,min=0,max=1) Beta(a, b) rbeta(n,a,b)

Bin(s, p) rbinom(n,size,prob) Cauchy(α, σ)) rnorm(n,loc=0,scale=1) χ 2 n (δ) rchisq(n,df,ncp=0) Exp(rate) rnorm(n,rate=1)

F m,n rf(n,df1,df2)

Γ(a, s) rgamma(n,shape,rate=1,scale=1/rate) Geom(p) rgeom(n,prob)

H(m, n, k) rhyper(nn,m,n,k)

log-normal(µ, σ 2 ) rlnorm(n,mean=0,sd=1)) Logistic(µ, σ 2 ) rlogis(n,loc=0,scale=1) NegBinom(s, p) rnbinom(n,size,prob,mu) Poisson(λ) rpois(n,lambda)

t n rt(n,df)

Weibull(a, b) rweibull(n,shape,scale=1)

R also provides functions for calculating the density (dF), the distribution

function (pF) and quantiles (qF), where F is the name of the distribution

as in the above commands.

(5)

Aim: Evaluate expectation E(h(Y )) =

Z

h(y) f (y) dy, (1)

where Y is some (possibly high-dimensional) random variable with distri- bution defined by f (y).

Examples:

◦ Suppose ˆ θ = ˆ θ(Y ) is an estimator for some parameter θ. Quantities of interest are the bias and the standard deviation of ˆ θ,

bias(ˆ θ) = E(ˆθ) − θ, σ(ˆ θ) = E(ˆθ − E(ˆθ)) 2  1 2

.

◦ In cases, where the estimate ˆ θ is obtained by an iterative estimation procedures (e.g. by Newton-Raphson), the estimator ˆ θ(Y ) cannot be written in closed form and the integral (1) cannot be computed by numerical integration.

◦ For two normally distributed samples Y 11 , . . . , Y n 1 1 and Y 12 , . . . , Y n 2 2 , the hypothesis of equal means can be tested by the two-sample t test with test statistic

T =

Y ¯ 1 − ¯ Y 2 q s 2 1

n 1 + n s 2 2

2

.

The p-value of the test is defined as

P(T (Y ) > t) = E (1

(6)

Monte Carlo Methods

Monte Carlo approach:

◦ Draw sample Y (1) , . . . , Y (n) iid ∼ f (y).

◦ Estimate expectation by 1

n

n

P

t=1

h(Y (t) )

(Monte Carlo integration)

◦ For independent samples, the Law of Large Numbers yields 1

n

n

P

t=1

h(Y (t) ) → E(h(Y )) as n → ∞.

Example: Two-sample t test in R

N<-10000 #number of MC repititions

n1<-8 #sample size

n2<-4

Y1<-rgamma(N*n1,1,1) #sample from gamma distribution Y2<-rgamma(N*n2,2,2)

#Y1<-rnorm(N*n1,0,1) #alternatively:

#Y2<-rnorm(N*n2,0,2) #sample from normal distribution Y<-c(Y1,Y2)

dim(Y)<-c(N,n1+n2)

tstat<-function(Y,n1,n2) { Y1<-Y[1:n1]

Y2<-Y[(n1+1):(n1+n2)]

#two-sample t test statistic

T<-(mean(Y1)-mean(Y2))/sqrt(var(Y1)/n1+var(Y2)/n2)

#Satterthwaite approximation of degrees of freedom

df<-(var(Y1)/n1+var(Y2)/n2)^2/((var(Y1)/n1)^2/(n1-1)+(var(Y2)/n2)^2/(n2-1)) return(c(T,df))

}

#calculate test statistic for N samples R<-apply(Y,1,tstat,n1,n2)

#estimate significance level

a<-mean(ifelse(abs(R[1,])>qt(0.975,R[2,]),1,0))

a

(7)

Suppose Y 1 , . . . , Y n ∼ P θ and ˆ θ = S(Y ) is an estimator for θ.

Aim: Determine variance of ˆ θ.

Problem: Iterative algorithms such as the EM algorithm often do not give analytic expressions for the variance of the estimator.

Idea: Simulate from the distribution P θ to obtain samples Y (b) = (Y 1 (b) , . . . , Y n (b) ), b = 1, . . . , B

and for each sample an estimate θ ˆ (b) = S(Y (b) ).

Then the standard error of ˆ θ = S(Y ) can be estimated by ˆ

σ 2 (ˆ θ) = 1

B

B

P

b=1

θ ˆ (b) − ¯ θ  2

.

Problem: We cannot sample from P θ since θ is unknown.

Idea: Approximate P θ by P θ ˆ Parametric bootstrap:

◦ Estimate θ by ˆ θ = S(Y )

◦ For b = 1, . . . , B, simulate bootstrap replications Y j ∗ (b) iid ∼ P θ ˆ .

◦ Compute bootstrap estimate ˆ θ ∗ (b) from the bootstrap sample Y ∗ (b) .

◦ Estimate the variance of ˆ θ = S(Y ) by

(8)

Importance Sampling

Aim: Reduction of variance Rewrite E(h(Y )) with Y ∼ f as

Z

h(y) f (y) dy =

Z h(y) f (y)

g(y) g(y) dy.

Thus if Z (1) , . . . , Z (n) ∼ g then 1

n

n

X

t=1

h(Z (t) ) f (Z (t) ) g(Z (t) )

offers an alternative approximation for E(h(Y )).

The variance of the Monte Carlo estimate for E(h(Y )) is minimized if

|h(y)| f (y)

g(y) ≈ constant

Example:

Suppose we want to approximate p = P(Y ≥ c) for (i) Y ∼ N (0, 1)

(ii) Y ∼ Cauchy(0, 1)

An intuitive Monte Carlo approximation is p n = 1

n

n

P

t=1

1 (c,∞) Y (t) 

where Y (t) iid ∼ N (0, 1) resp. Y (t) iid ∼ Cauchy(0, 1).

Since p n is the mean of n Bernoulli random variables, its relative error is

pvar(p n )

p =

r 1 − p

p n .

(9)

Alternative approach: Let Z (t) be exponentially distributed on (c, ∞), Z (t) ∼ g(z) = exp(c − z) 1 (c,∞) (z).

Then p can be approximated by p 0 n = 1

n

n

P

t=1

f Z (t)  g Z (t)  ,

where f is the density of a (i) normal or (ii) Cauchy distribution.

0 10 20 30 40 50

0.0 0.1 0.2 0.3 0.4

t

pn [%]

p

n

solid p’

n

dashed

0 10 20 30 40 50

6 8 10 12 14

t

pn [%]

p

n

solid p’

n

dashed

3 4 5 6 7 8 9 10

0 1 2 3 4

z

f(z)/g(z) [x1000]

3 4 5 6 7 8 9 10

0.0 0.5 1.0 1.5 2.0 2.5 3.0 3.5

z

f(z)/g(z)

(10)

Markov Chain Monte Carlo

Suppose that Y = (Y 1 , . . . , Y d ) T has density f (y).

Aim: Approximation E(h(Y )) ≈ 1

n

n

P

t=1

h(Y (t) ) with Y (1) , . . . , Y (n) iid ∼ f (y).

Problem: Independent sampling from a multivariate distribution is often difficult.

Solution:

◦ The Law of Large Numbers 1

n

n

P

t=1

h(Y (t) ) → E(h(Y )) as n → ∞.

still applies if Y (t) are (not too) dependent observations with Y (t) ∼ f (y).

◦ Idea: Generate a sequence of random numbers Y (k) which converges to a dependent sample from the joint distribution f (y)

Y (t) ∼ f (y) but NOT Y (t) iid ∼ f (y).

◦ We will show that this can be accomplished by sampling Y i (t) from the conditional distribution

f y i

Y 1 (t) , . . . , Y i−1 (t) , Y i+1 (t−1) , . . . , Y d (t−1) 

for i = 1, . . . , d

References

Related documents

Students will be expected to master concepts such as the production process, the production team, production control room, manipulation of the video camera and lens,

There were 20 target sentences of the types shown in (15–18), varying the order of the two embedded clauses and the presence of a complementizer before the second embedded 2

On the other hand, the effectiveness of teaching philosophy to children in the development of students' philosophical thinking (Khadem Sadegh &amp; Fereydooni,

The focus in this paper, however, is on ‘how’ a VET youth work curriculum might be envisioned and delivered – the particular educational strategies, intentions and actions taken

to 2010 the Conservative Party did not use any form of positive action, but for that election an ‘A List’ system was used which ensured that women candidates stood in seats the

Dans ce papier, nous allons tout d’abord présenter le cadre conceptuel en définissant la coopérative, ses spécificités et en l’abordant entant que modèle d’entreprise

For instance, North America and Europe, both being mature markets, are the most consolidated markets where the top three distributors hold 30 to 40% and 15 to 20% of the

Therefore the paper also deals with business innovation and innovation performance in the Czech Republic and with innovation metrics for innovation measurement