Supplement to “Detecting chance: A solution to the null sensitivity problem in subliminal
priming”
Richard D. Morey, Paul L. Speckman, Jeffery N. Rouder, and Michael S. Pratte
September 8, 2006
This document is a supplement to Rouder, Morey, Speckman, and Pratte’s “Detecting chance: A solution to the null sensitivity problem in subliminal priming.” We present model development and computer code for the mass-at-chance (MAC) model. The model is specified and analyzed in the Bayesian framework; Rouder and Lu (2005) provide a tutorial on Bayesian hierarchical analysis for psychologists.
1 Data Representation
In the paradigm, each of $I$ participants observes $N_i$ trials ($i = 1, \ldots, I$). Let $y_{ij}$ denote the outcome of the $j$th trial ($j = 1, \ldots, N_i$) for the $i$th participant. If the outcome is a correct response, then $y_{ij} = 1$; otherwise, $y_{ij} = 0$.
2 Model Specification
Outcomes are modeled as Bernoulli events:
\[
y_{ij} \mid p_i \overset{\text{indep}}{\sim} \text{Bernoulli}(p_i). \tag{1}
\]
Parameter $p_i$ is a function of latent ability $x_i$:
\[
p_i = \Phi(x_i \vee 0),
\]
where $a \vee b = \max(a, b)$ and $\Phi$ is the cumulative distribution function of the standard normal distribution.
Individuals’ latent abilities are assumed to be normally distributed:
\[
x_i \mid \mu, \sigma^2 \overset{\text{iid}}{\sim} \text{Normal}(\mu, \sigma^2).
\]
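As an illustration of the specification above, the following minimal sketch simulates data from the model. It is not the authors' code; the function name `simulate_mac` and the example values of $\mu$, $\sigma$, $I$, and $N_i$ are assumptions chosen only for demonstration.

```python
import random
from statistics import NormalDist


def simulate_mac(n_trials, mu, sigma, rng):
    """Simulate binary outcomes from the mass-at-chance model.

    n_trials: list holding the number of trials N_i for each participant.
    Returns (x, p, y): latent abilities, success probabilities, outcomes.
    """
    std_normal = NormalDist()
    x, p, y = [], [], []
    for n_i in n_trials:
        x_i = rng.gauss(mu, sigma)             # x_i ~ Normal(mu, sigma^2)
        p_i = std_normal.cdf(max(x_i, 0.0))    # p_i = Phi(x_i v 0)
        y_i = [1 if rng.random() < p_i else 0 for _ in range(n_i)]
        x.append(x_i)
        p.append(p_i)
        y.append(y_i)
    return x, p, y


# Hypothetical example: 50 participants with 200 trials each.
rng = random.Random(1)
x, p, y = simulate_mac([200] * 50, mu=0.0, sigma=0.5, rng=rng)
n_at_chance = sum(1 for x_i in x if x_i <= 0)  # participants with p_i = .5
```

Every participant with $x_i \le 0$ receives $p_i = .5$ exactly, which is the mass at chance that gives the model its name.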
3 Priors
Priors are needed for parameters $\mu$ and $\sigma^2$. The choice of a normal prior for $\mu$ and a truncated inverse gamma prior for $\sigma^2$ is convenient and flexible:
\[
\mu \sim \text{Normal}(\mu_0, \sigma_0^2), \qquad \sigma^2 \sim \text{TIG}_{t_{\sigma^2}}(a_0, b_0),
\]
where $\text{TIG}_{t_{\sigma^2}}$ denotes the truncated inverse gamma distribution (footnote 1) with pdf proportional to
\[
f(x \mid a_0, b_0) \propto x^{-(a_0+1)} \exp\left\{-\frac{b_0}{x}\right\} I_{(0<x<t_{\sigma^2})}. \tag{2}
\]
Before analysis, values of $\mu_0$, $\sigma_0^2$, $a_0$, $b_0$, and $t_{\sigma^2}$ must be specified. Unfortunately, the choice of values for these prior parameters affects the selection of participants. We have experimented with a number of choices and recommend ($\mu_0 = 0$, $\sigma_0^2 = 1$, $a_0 = -.5$, $b_0 = 0$, $t_{\sigma^2} = 1$) as reasonable. This recommendation introduces a subtle bias against selection. This bias is appropriate, as it is important to control the rate of selecting above-chance-performing individuals.
With these prior parameters the prior on σ is flat over the interval (0,1).
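• Proof. If $a_0 = -.5$, $b_0 = 0$, and $t_{\sigma^2} = 1$, then by (2) the prior density of $\sigma^2$ satisfies
\[
[\sigma^2] \propto (\sigma^2)^{-.5} I_{(0<\sigma^2<1)}.
\]
Transforming variables, the prior density of $\sigma$ satisfies
\[
[\sigma] \propto (\sigma^2)^{-.5} \left(\frac{d}{d\sigma}\sigma^2\right) I_{(\sqrt{0}<\sigma<\sqrt{1})}
\propto \sigma^{-1}\sigma\, I_{(0<\sigma<1)}
\propto I_{(0<\sigma<1)}.
\]
This is the density function for a uniform random variable over (0,1).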
When selecting prior parameters, it is important to choose parameters which yield proper posterior distributions. One way to do this is to ensure that the priors are proper; i.e., for $\sigma^2$, the values $a_0$, $b_0$, and $t_{\sigma^2}$ are chosen in such a way that
\[
0 < \int_0^{t_{\sigma^2}} f(\sigma^2 \mid a_0, b_0)\, d\sigma^2 < +\infty.
\]
This will guarantee the propriety of the posterior distributions. In this case, choosing $a_0 > 0$ and $b_0 > 0$ is sufficient to ensure that the prior is proper. For other values of $a_0$ and $b_0$, the prior is not guaranteed to be proper.
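As a worked check, offered as an illustration rather than as part of the original argument, the recommended values $a_0 = -.5$, $b_0 = 0$, and $t_{\sigma^2} = 1$ fall outside the sufficient condition yet still give a proper prior, because the truncation keeps the integral finite:
\[
\int_0^{1} (\sigma^2)^{-1/2}\, d\sigma^2 = \Big[\, 2(\sigma^2)^{1/2} \Big]_0^1 = 2 .
\]
The same integrand is not integrable over $(0, +\infty)$, so with these values of $a_0$ and $b_0$ the finite truncation point is what makes the prior proper.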
Footnote 1. When $t_{\sigma^2} = +\infty$, $a_0 > 0$, and $b_0 > 0$, the truncated inverse gamma distribution reduces to the inverse gamma distribution.
4 Overview of Analysis
The basic steps in analysis are to specify the joint posterior, then integrate the joint posterior via Markov chain Monte Carlo (MCMC) sampling to obtain marginal posterior distributions. We describe these steps in turn. Preceding this, however, we define a set of additional parameters that are useful in deriving posterior distributions.
5 Additional Latent Parameters
We adapt the formulation of Albert & Chib (1995) and define the following latent parameters to aid analysis of probit models:
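\[
w_{ij} \overset{\text{indep}}{\sim} \text{Normal}(x_i \vee 0, 1).
\]
If $y_{ij}$ is defined as
\[
y_{ij} =
\begin{cases}
0 & \text{if and only if } w_{ij} < 0, \\
1 & \text{if and only if } w_{ij} \ge 0,
\end{cases}
\]
then $\Pr(w_{ij} \ge 0) = p_i$, and (1) holds.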
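The equivalence between the latent-variable formulation and (1) can be checked by simulation. The sketch below is illustrative only; the helper name `augmented_outcome` and the value $x_i = 0.8$ are assumptions made for the example.

```python
import random
from statistics import NormalDist


def augmented_outcome(x_i, rng):
    """Draw w_ij ~ Normal(x_i v 0, 1) and return y_ij = 1 iff w_ij >= 0."""
    w_ij = rng.gauss(max(x_i, 0.0), 1.0)
    return 1 if w_ij >= 0 else 0


rng = random.Random(7)
x_i = 0.8                                  # hypothetical latent ability
n_draws = 100_000
freq = sum(augmented_outcome(x_i, rng) for _ in range(n_draws)) / n_draws
p_i = NormalDist().cdf(max(x_i, 0.0))      # Phi(x_i v 0)
```

With this many draws, `freq` should agree with `p_i` up to Monte Carlo error.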
6 Joint Posterior
Let $Y$ be the collection of data, $Y = \left\langle \langle y_{ij} \rangle_{j=1}^{j=N_i} \right\rangle_{i=1}^{i=I}$, and $W$ be the collection of latent parameters, $W = \left\langle \langle w_{ij} \rangle_{j=1}^{j=N_i} \right\rangle_{i=1}^{i=I}$. The joint posterior of all parameters is proportional to
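\[
\begin{aligned}
{[W, x, \mu, \sigma^2 \mid Y]} \propto{} & \prod_i \left[ \prod_{\substack{j=1 \\ (y_{ij}=0)}}^{N_i} \exp\left\{-\frac{(w_{ij} - (0 \vee x_i))^2}{2}\right\} I_{(w_{ij}<0)} \times \prod_{\substack{j=1 \\ (y_{ij}=1)}}^{N_i} \exp\left\{-\frac{(w_{ij} - (0 \vee x_i))^2}{2}\right\} I_{(w_{ij}>0)} \right] \\
& \times \prod_i \sigma^{-1} \exp\left\{-\frac{(x_i - \mu)^2}{2\sigma^2}\right\} \\
& \times \exp\left\{-\frac{(\mu - \mu_0)^2}{2\sigma_0^2}\right\} (\sigma^2)^{-(a_0+1)} \exp\left\{-\frac{b_0}{\sigma^2}\right\}.
\end{aligned}
\tag{3}
\]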
7 Conditional Posteriors
In order to obtain the marginal posterior distributions, we use Gibbs sampling (Gelfand & Smith, 1990). As input, Gibbs sampling requires conditional posterior distributions. These conditional posterior distributions are provided by Facts A through D, below. Proofs of Facts A, B, and C are standard and may be found in Gelman, Carlin, Stern, and Rubin (2004). The proof of Fact D is provided here.
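• Fact A. Conditional posterior of $w_{ij} \mid x_i, y_{ij}$: Let $\text{TN}_{t+}(\mu, \sigma^2)$ denote the Normal($\mu, \sigma^2$) distribution truncated below at $t$ and $\text{TN}_{t-}(\mu, \sigma^2)$ denote the Normal($\mu, \sigma^2$) distribution truncated above at $t$. The pdfs for these distributions are
\[
f_{t+}(x \mid \mu, \sigma^2) = \frac{I_{(x>t)}}{\Phi\left(\frac{\mu - t}{\sigma}\right)\sqrt{2\pi\sigma^2}} \exp\left\{-\frac{(x - \mu)^2}{2\sigma^2}\right\}, \tag{4}
\]
\[
f_{t-}(x \mid \mu, \sigma^2) = \frac{I_{(x<t)}}{\Phi\left(\frac{t - \mu}{\sigma}\right)\sqrt{2\pi\sigma^2}} \exp\left\{-\frac{(x - \mu)^2}{2\sigma^2}\right\}. \tag{5}
\]
The conditional posterior for $w_{ij} \mid x_i, y_{ij}$ is
\[
\begin{aligned}
w_{ij} \mid x_i; (y_{ij} = 1) &\sim \text{TN}_{0+}(x_i \vee 0, 1), \\
w_{ij} \mid x_i; (y_{ij} = 0) &\sim \text{TN}_{0-}(x_i \vee 0, 1).
\end{aligned}
\]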
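• Fact B. Conditional posterior of $\mu \mid x, \sigma^2$: Let
\[
\tau^2 = \left(\frac{I}{\sigma^2} + \frac{1}{\sigma_0^2}\right)^{-1}, \qquad
\eta = \tau^2 \left(\frac{\sum_i x_i}{\sigma^2} + \frac{\mu_0}{\sigma_0^2}\right).
\]
The conditional posterior for $\mu \mid x, \sigma^2$ is
\[
\mu \mid x, \sigma^2 \sim \text{Normal}(\eta, \tau^2).
\]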
• Fact C. Conditional posterior of $\sigma^2 \mid x, \mu$:
\[
\sigma^2 \mid x, \mu \sim \text{TIG}_{t_{\sigma^2}}\left(a_0 + \frac{I}{2},\; b_0 + \frac{1}{2}\sum_i (x_i - \mu)^2\right).
\]
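• Fact D. Conditional posterior of $x_i \mid W, \mu, \sigma^2$: Let
\[
\lambda_i^2 = \left(N_i + \frac{1}{\sigma^2}\right)^{-1}, \qquad
\nu_i = \lambda_i^2 \left(\sum_{j=1}^{N_i} w_{ij} + \frac{\mu}{\sigma^2}\right),
\]
\[
\rho_i = \frac{\sigma \Phi\left(-\frac{\mu}{\sigma}\right)}{\sigma \Phi\left(-\frac{\mu}{\sigma}\right) + \lambda_i \Phi\left(\frac{\nu_i}{\lambda_i}\right) \exp\left\{-\frac{1}{2}\left(\frac{\mu^2}{\sigma^2} - \frac{\nu_i^2}{\lambda_i^2}\right)\right\}}.
\]
The conditional posterior pdf of $x_i \mid W, \mu, \sigma^2$ is
\[
[x_i \mid W, \mu, \sigma^2] = \rho_i f_{0-}(x_i \mid \mu, \sigma^2) + (1 - \rho_i) f_{0+}(x_i \mid \nu_i, \lambda_i^2), \tag{6}
\]
where $f_{0+}$ and $f_{0-}$ are the pdfs of normal distributions truncated below and above at 0, respectively, and are defined in (4) and (5).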
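Facts B and C translate directly into sampling steps. The following is a minimal sketch under stated assumptions, not the authors' implementation: the function names are hypothetical, and the truncated inverse gamma draw uses simple rejection (draw from the untruncated inverse gamma and keep values below $t_{\sigma^2}$) in place of the inverse cdf method the authors describe. Rejection requires a positive shape and scale and can be slow when little posterior mass lies below the truncation point.

```python
import math
import random


def sample_mu(x, sigma2, mu0, sigma2_0, rng):
    """Fact B: draw mu | x, sigma^2 ~ Normal(eta, tau^2)."""
    tau2 = 1.0 / (len(x) / sigma2 + 1.0 / sigma2_0)
    eta = tau2 * (sum(x) / sigma2 + mu0 / sigma2_0)
    return rng.gauss(eta, math.sqrt(tau2))


def sample_sigma2(x, mu, a0, b0, t_sigma2, rng, max_tries=100_000):
    """Fact C: draw sigma^2 | x, mu from the truncated inverse gamma.

    Assumption: rejection sampling stands in for inverse cdf sampling.
    """
    shape = a0 + len(x) / 2.0
    scale = b0 + 0.5 * sum((x_i - mu) ** 2 for x_i in x)
    if shape <= 0.0 or scale <= 0.0:
        raise ValueError("rejection sketch needs positive shape and scale")
    for _ in range(max_tries):
        # random.gammavariate takes (shape, scale); rate `scale` -> 1/scale.
        g = rng.gammavariate(shape, 1.0 / scale)
        if g > 0.0 and 1.0 / g < t_sigma2:
            return 1.0 / g          # reciprocal of a gamma draw
    raise RuntimeError("no draw accepted below the truncation point")


# Hypothetical usage with the recommended prior values.
rng = random.Random(3)
x = [0.1 * k for k in range(-5, 15)]        # made-up latent abilities
mu_draw = sample_mu(x, sigma2=0.25, mu0=0.0, sigma2_0=1.0, rng=rng)
sigma2_draw = sample_sigma2(x, mu_draw, a0=-0.5, b0=0.0, t_sigma2=1.0, rng=rng)
```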
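Proof of Fact D: Inspection of (3) reveals that the conditional posterior density of $x_i \mid W, \mu, \sigma^2$ satisfies
\[
[x_i \mid W, \mu, \sigma^2] \propto \exp\left\{-\frac{1}{2}\sum_{j=1}^{N_i} (w_{ij} - (x_i \vee 0))^2\right\} \times \exp\left\{-\frac{(x_i - \mu)^2}{2\sigma^2}\right\}.
\]
This right-hand side may be simplified as
\[
[x_i \mid W, \mu, \sigma^2] \propto \left[ I_{(x_i<0)} \exp\left\{-\frac{1}{2}\sum_{j=1}^{N_i} w_{ij}^2\right\} + I_{(x_i>0)} \exp\left\{-\frac{1}{2}\sum_{j=1}^{N_i} (w_{ij} - x_i)^2\right\} \right] \times \exp\left\{-\frac{(x_i - \mu)^2}{2\sigma^2}\right\}.
\]
Distributing and completing the square yields
\[
[x_i \mid W, \mu, \sigma^2] \propto I_{(x_i<0)} \exp\left\{-\frac{(x_i - \mu)^2}{2\sigma^2}\right\} + I_{(x_i>0)} \exp\left\{-\frac{1}{2}\left(\frac{\mu^2}{\sigma^2} - \frac{\nu_i^2}{\lambda_i^2}\right)\right\} \times \exp\left\{-\frac{(x_i - \nu_i)^2}{2\lambda_i^2}\right\}.
\]
Multiplying by constants equal to 1 yields
\[
\begin{aligned}
{[x_i \mid W, \mu, \sigma^2]} \propto{} & \left[ \left(\frac{\sqrt{2\pi\sigma^2}\,\Phi\left(-\frac{\mu}{\sigma}\right)}{\sqrt{2\pi\sigma^2}\,\Phi\left(-\frac{\mu}{\sigma}\right)}\right) I_{(x_i<0)} \exp\left\{-\frac{(x_i - \mu)^2}{2\sigma^2}\right\} \right] \\
& + \left[ \left(\frac{\sqrt{2\pi\lambda_i^2}\,\Phi\left(\frac{\nu_i}{\lambda_i}\right)}{\sqrt{2\pi\lambda_i^2}\,\Phi\left(\frac{\nu_i}{\lambda_i}\right)}\right) I_{(x_i>0)} \exp\left\{-\frac{1}{2}\left(\frac{\mu^2}{\sigma^2} - \frac{\nu_i^2}{\lambda_i^2}\right)\right\} \times \exp\left\{-\frac{(x_i - \nu_i)^2}{2\lambda_i^2}\right\} \right].
\end{aligned}
\]
Rearranging terms and dividing the expression by $\sqrt{2\pi}$ gives
\[
\begin{aligned}
{[x_i \mid W, \mu, \sigma^2]} \propto{} & \sigma \Phi\left(-\frac{\mu}{\sigma}\right) \frac{I_{(x_i<0)}}{\Phi\left(-\frac{\mu}{\sigma}\right)\sqrt{2\pi\sigma^2}} \exp\left\{-\frac{(x_i - \mu)^2}{2\sigma^2}\right\} \\
& + \lambda_i \Phi\left(\frac{\nu_i}{\lambda_i}\right) \exp\left\{-\frac{1}{2}\left(\frac{\mu^2}{\sigma^2} - \frac{\nu_i^2}{\lambda_i^2}\right)\right\} \times \frac{I_{(x_i>0)}}{\Phi\left(\frac{\nu_i}{\lambda_i}\right)\sqrt{2\pi\lambda_i^2}} \exp\left\{-\frac{(x_i - \nu_i)^2}{2\lambda_i^2}\right\}.
\end{aligned}
\]
Substituting (4) and (5) yields
\[
[x_i \mid W, \mu, \sigma^2] \propto \sigma \Phi\left(-\frac{\mu}{\sigma}\right) f_{0-}(x_i \mid \mu, \sigma^2) + \lambda_i \Phi\left(\frac{\nu_i}{\lambda_i}\right) \exp\left\{-\frac{1}{2}\left(\frac{\mu^2}{\sigma^2} - \frac{\nu_i^2}{\lambda_i^2}\right)\right\} f_{0+}(x_i \mid \nu_i, \lambda_i^2). \tag{7}
\]
The normalization constant is
\[
C = \frac{1}{\sigma \Phi\left(-\frac{\mu}{\sigma}\right) + \lambda_i \Phi\left(\frac{\nu_i}{\lambda_i}\right) \exp\left\{-\frac{1}{2}\left(\frac{\mu^2}{\sigma^2} - \frac{\nu_i^2}{\lambda_i^2}\right)\right\}}.
\]
Multiplying (7) by $C$ yields a proper density
\[
\begin{aligned}
{[x_i \mid W, \mu, \sigma^2]} ={} & \frac{\sigma \Phi\left(-\frac{\mu}{\sigma}\right) f_{0-}(x_i \mid \mu, \sigma^2)}{\sigma \Phi\left(-\frac{\mu}{\sigma}\right) + \lambda_i \Phi\left(\frac{\nu_i}{\lambda_i}\right) \exp\left\{-\frac{1}{2}\left(\frac{\mu^2}{\sigma^2} - \frac{\nu_i^2}{\lambda_i^2}\right)\right\}} \\
& + \frac{\lambda_i \Phi\left(\frac{\nu_i}{\lambda_i}\right) \exp\left\{-\frac{1}{2}\left(\frac{\mu^2}{\sigma^2} - \frac{\nu_i^2}{\lambda_i^2}\right)\right\} f_{0+}(x_i \mid \nu_i, \lambda_i^2)}{\sigma \Phi\left(-\frac{\mu}{\sigma}\right) + \lambda_i \Phi\left(\frac{\nu_i}{\lambda_i}\right) \exp\left\{-\frac{1}{2}\left(\frac{\mu^2}{\sigma^2} - \frac{\nu_i^2}{\lambda_i^2}\right)\right\}} \\
={} & \rho_i f_{0-}(x_i \mid \mu, \sigma^2) + (1 - \rho_i) f_{0+}(x_i \mid \nu_i, \lambda_i^2).
\end{aligned}
\]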
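The mixture weight $\rho_i$ can be sanity-checked numerically. The sketch below, which is an illustration and not part of the original supplement, compares the closed form for $\rho_i$ with the mass below zero obtained by integrating the unnormalized conditional density with a midpoint rule. The function names, the made-up values of $w_{ij}$, and the integration settings are all assumptions. The closed form is computed naively here, which is adequate for moderate arguments only.

```python
import math
from statistics import NormalDist


def mixture_weight(w, mu, sigma2):
    """Closed-form rho_i from Fact D (naive evaluation)."""
    std_cdf = NormalDist().cdf
    lam2 = 1.0 / (len(w) + 1.0 / sigma2)
    nu = lam2 * (sum(w) + mu / sigma2)
    sigma, lam = math.sqrt(sigma2), math.sqrt(lam2)
    below = sigma * std_cdf(-mu / sigma)
    above = lam * std_cdf(nu / lam) * math.exp(
        -0.5 * (mu ** 2 / sigma2 - nu ** 2 / lam2))
    return below / (below + above)


def mixture_weight_numeric(w, mu, sigma2, width=10.0, steps=20000):
    """Mass below zero of the unnormalized conditional density of x_i."""
    def kernel(x):
        m = max(x, 0.0)                      # x_i v 0
        return math.exp(-0.5 * sum((w_ij - m) ** 2 for w_ij in w)
                        - (x - mu) ** 2 / (2.0 * sigma2))
    h = width / steps
    below = sum(kernel(-width + (k + 0.5) * h) for k in range(steps)) * h
    above = sum(kernel((k + 0.5) * h) for k in range(steps)) * h
    return below / (below + above)


w = [0.3, -0.4, 1.1, 0.2, -0.1]              # hypothetical latent w_ij
rho_closed = mixture_weight(w, mu=0.2, sigma2=0.25)
rho_numeric = mixture_weight_numeric(w, mu=0.2, sigma2=0.25)
```

The two values should agree to within the quadrature error.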
8 MCMC sampling
Sampling from the conditional posterior distributions derived in Facts A-D requires the ability to sample from normal, truncated inverse gamma, truncated normal, and Bernoulli distributions. Routines for sampling normal distributions are found in most statistical applications. Samples from a truncated inverse gamma distribution are obtained by taking the reciprocal of samples from a truncated gamma distribution. We sample from the truncated gamma distribution using inverse cdf sampling (Devroye, 1986). Ahrens and Dieter (1974, 1982) provide algorithms for sampling the gamma distribution. Sampling $w_{ij} \mid x_i, y_{ij}$ and $x_i \mid W, \mu, \sigma^2$ requires sampling from truncated normal distributions. We describe our method.
8.1 Sampling from the Truncated Normal Distribution
We use inverse cdf sampling (Devroye, 1986). Unfortunately, we encountered occasional underflow errors using the standard approximation for $\Phi$. The following variant using the standard function for $\Psi(x) = \ln \Phi(x)$ worked well. Let $U$ be a sample from a Uniform(0, 1) distribution. Then it is easy to see that
\[
\sigma \Psi^{-1}\left(\ln U + \Psi\left(\frac{t - \mu}{\sigma}\right)\right) + \mu \sim \text{TN}_{t-}(\mu, \sigma^2).
\]
Further, by the symmetry of the normal distribution,
\[
-\sigma \Psi^{-1}\left(\ln U + \Psi\left(\frac{\mu - t}{\sigma}\right)\right) + \mu \sim \text{TN}_{t+}(\mu, \sigma^2).
\]
The standard approximation to $\Psi$ is provided in the R statistical package.
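The log-scale variant can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' routine: $\Psi$ is computed from the complementary error function with an asymptotic tail expansion for very negative arguments, and $\Psi^{-1}$ is obtained by bisection rather than by a library function. All function names are hypothetical.

```python
import math
import random

SQRT2 = math.sqrt(2.0)
LOG_SQRT_2PI = 0.5 * math.log(2.0 * math.pi)


def log_phi(z):
    """Psi(z) = ln Phi(z), arranged to avoid underflow in the lower tail."""
    if z > 0.0:
        return math.log1p(-0.5 * math.erfc(z / SQRT2))
    if z > -30.0:
        return math.log(0.5 * math.erfc(-z / SQRT2))
    # Far lower tail: Phi(z) ~ phi(z)/|z| * (1 - 1/z^2 + 3/z^4).
    z2 = z * z
    return (-0.5 * z2 - LOG_SQRT_2PI - math.log(-z)
            + math.log1p(-1.0 / z2 + 3.0 / (z2 * z2)))


def inv_log_phi(y, upper):
    """Solve Psi(z) = y by bisection, assuming Psi(upper) >= y."""
    lo, step = upper - 1.0, 1.0
    while log_phi(lo) > y:           # expand downward to bracket the root
        lo -= step
        step *= 2.0
    hi = upper
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if log_phi(mid) < y:
            lo = mid
        else:
            hi = mid
    return hi


def sample_tn_minus(mu, sigma, t, rng):
    """Draw from TN_{t-}(mu, sigma^2): normal truncated above at t."""
    a = (t - mu) / sigma
    u = 1.0 - rng.random()           # uniform on (0, 1], avoids log(0)
    return sigma * inv_log_phi(math.log(u) + log_phi(a), a) + mu


def sample_tn_plus(mu, sigma, t, rng):
    """Draw from TN_{t+}(mu, sigma^2): normal truncated below at t."""
    a = (mu - t) / sigma
    u = 1.0 - rng.random()
    return -sigma * inv_log_phi(math.log(u) + log_phi(a), a) + mu


rng = random.Random(0)
w_correct = sample_tn_plus(0.6, 1.0, 0.0, rng)     # e.g. w_ij when y_ij = 1
far_tail = sample_tn_minus(50.0, 1.0, 0.0, rng)    # Phi(-50) underflows
```

The last line shows the case that motivates the log scale: $\Phi(-50)$ is smaller than the smallest positive double, so a direct inverse cdf calculation would fail, whereas $\Psi(-50)$ is an ordinary finite number.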
8.2 Sampling from the conditional x i |W, µ, σ 2
Sampling from (6) requires sampling from a Bernoulli($\rho_i$) and a truncated normal distribution. Again, we found occasional underflow problems with the straightforward calculation of $\rho_i$. However, calculation of the log odds of $\rho_i$, denoted here by $\gamma_i$, proved satisfactory. Let
\[
\gamma_i = \ln \frac{\sigma \Phi\left(-\frac{\mu}{\sigma}\right)}{\lambda_i \Phi\left(\frac{\nu_i}{\lambda_i}\right) \exp\left\{-\frac{1}{2}\left(\frac{\mu^2}{\sigma^2} - \frac{\nu_i^2}{\lambda_i^2}\right)\right\}}
\]