3.3 Approximating the marginal likelihood
3.4.1 Testing random intercepts
We conducted a simulation study to evaluate how well our method identifies models with or without random intercepts. We consider a simple setting with two non-nested factors with 30 classifications each. We simulated $b_{10[i]} \sim N(0,1)$, $b_{20[i]} \sim N(0,1)$, $\varepsilon_i \sim N(0,1)$, and calculated
\[
Y_i = \lambda_1 b_{10[i]} + \lambda_2 b_{20[i]} + \varepsilon_i
\]
for various combinations of $m = (100, 500, 1000)$, $\lambda_1 = (0, 0.1, 0.2, 0.3)$, and $\lambda_2 = (0, 0.1, 0.2, 0.3)$ for 1,000 datasets. Using prior distributions $\beta_0 \sim N(0,1)$, $\sigma^2 \sim \mathrm{InvGam}(0.1, 0.1)$ (which are non-informative given the simulation settings), and $\phi_h \sim N(\log(0.3), 2)$, we approximated marginal likelihoods for the following models:
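\begin{equation}
\begin{aligned}
M_0 &: Y_i = \beta_0 + \varepsilon_i, \\
M_1 &: Y_i = \beta_0 + e^{\phi_1} b_{10[i]} + \varepsilon_i, \\
M_2 &: Y_i = \beta_0 + e^{\phi_2} b_{20[i]} + \varepsilon_i, \\
M_3 &: Y_i = \beta_0 + e^{\phi_1} b_{10[i]} + e^{\phi_2} b_{20[i]} + \varepsilon_i,
\end{aligned}
\tag{3.13}
\end{equation}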
in which $\phi_h = \log(\lambda_h)$ for $\lambda_h > 0$ and $h = 1, 2$. Estimates of the Bayes factors $\hat{B}_{30}$, $\hat{B}_{10}$, $\hat{B}_{20}$ were calculated for each data set and interpreted according to the scale given by Wasserman (2000) and Jeffreys (1961). For comparison with frequentist methods, we chose to reject $H_0$ if an estimated Bayes factor $\hat{B}_{kk'}$ was greater than 1, in which case model $k$ was preferred over model $k'$. In this simple setting, we can use the restricted likelihood ratio test (denoted as $\mathrm{LR}_{10}$ and $\mathrm{LR}_{20}$) for testing $M_1$ and $M_2$ versus $M_0$, for which the null distribution of the test statistic is a 50:50 mixture of a point mass at 0 and a chi-square distribution with 1 degree of freedom (Self and Liang, 1987; Stram and Lee, 1994). We can also test $M_1$ and $M_2$ versus $M_0$ using the ANOVA F-test (denoted as $\mathrm{AOV}_{10}$ and $\mathrm{AOV}_{20}$). For testing $M_3$ versus the other models, we implement an ad-hoc restricted likelihood ratio test, in which the standard test statistic is compared at the $\alpha = 0.10$ level to a chi-square distribution with degrees of freedom equal to the difference in the number of variance components in the models being compared (denoted as $\mathrm{LR}^*_{30}$, $\mathrm{LR}^*_{31}$, and $\mathrm{LR}^*_{32}$). Although this approach may not be recommended from a theoretical perspective (Fitzmaurice et al., 2004), it is known to be used in practice.
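The data-generating step and the 50:50 mixture reference distribution described above can be sketched as follows. This is a minimal illustration and not the code used for the study: the function names are hypothetical, the generating equation is taken to be $Y_i = \lambda_1 b_{10[i]} + \lambda_2 b_{20[i]} + \varepsilon_i$, and we assume each observation is assigned to a classification of each factor uniformly at random, since the assignment scheme is not stated in the text.

```python
import math
import random


def simulate_dataset(m, lam1, lam2, n_levels=30, seed=None):
    """Draw one dataset of size m with two crossed random intercepts.

    Returns a list of (level_factor1, level_factor2, y) tuples.
    """
    rng = random.Random(seed)
    # One standard-normal intercept per classification of each factor.
    b10 = [rng.gauss(0.0, 1.0) for _ in range(n_levels)]
    b20 = [rng.gauss(0.0, 1.0) for _ in range(n_levels)]
    rows = []
    for _ in range(m):
        # Assumption: classifications are assigned uniformly at random,
        # independently for the two factors, so the factors are non-nested.
        j = rng.randrange(n_levels)
        k = rng.randrange(n_levels)
        y = lam1 * b10[j] + lam2 * b20[k] + rng.gauss(0.0, 1.0)
        rows.append((j, k, y))
    return rows


def mixture_pvalue(lrt_stat):
    """P-value under a 50:50 mixture of a point mass at 0 and chi-square(1)."""
    if lrt_stat <= 0.0:
        # The whole mixture lies at or above zero.
        return 1.0
    # For 1 degree of freedom, P(chi2_1 > x) = erfc(sqrt(x / 2)).
    return 0.5 * math.erfc(math.sqrt(lrt_stat / 2.0))
```

Under this mixture, a statistic equal to the 90th percentile of a chi-square distribution with 1 degree of freedom (about 2.7055) has a p-value of 0.05.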
In the absence of random effects, the Bayes factor approach, likelihood ratio tests, ANOVA F-tests, and ad-hoc tests all preserved the nominal Type I error rate of 0.05 for all model comparisons and all sample sizes (Table 3.1). The power of $\hat{B}_{10}$ and $\hat{B}_{20}$ to detect a random effect was very similar to that of the likelihood ratio tests $\mathrm{LR}_{10}$ and $\mathrm{LR}_{20}$ and the ANOVA F-tests. For testing $M_3$ versus $M_0$, the performance of $\hat{B}_{30}$ was similar to that of the ad-hoc $\mathrm{LR}^*_{30}$, with slight differences. A similar pattern was seen when comparing $\hat{B}_{31}$ versus $\mathrm{LR}^*_{31}$ and $\hat{B}_{32}$ versus $\mathrm{LR}^*_{32}$. These results support the claim that our method has good frequentist properties with respect to power and Type I error. Tables 3.2-3.4 show a more complete breakdown of the estimated Bayes factors according to the scale of Wasserman (2000) and Jeffreys (1961). As $\lambda_1$ and $\lambda_2$ increased, the estimated Bayes factors displayed greater evidence for the models with random intercepts. As the sample size increased, the estimated Bayes factors increasingly favored the null model in the absence of random intercepts, and increasingly favored the random intercept models in the presence of random intercepts. This indicates large-sample consistency of our method under these simulation settings.
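A breakdown of estimated Bayes factors by evidence category, of the kind reported in the tables, could be tabulated along the following lines. This is only a sketch: the cut points (1/10, 1/3, 1, 3, 10) and the category labels are assumptions made for illustration, since the scale itself is not reproduced in this section, and the function names are hypothetical.

```python
from collections import Counter

# Assumed upper cut points and labels; not taken from this section.
SCALE = [
    (1.0 / 10.0, "strong evidence for null"),
    (1.0 / 3.0, "moderate evidence for null"),
    (1.0, "weak evidence for null"),
    (3.0, "weak evidence for alternative"),
    (10.0, "moderate evidence for alternative"),
]


def evidence_category(bf):
    """Map a Bayes factor (alternative versus null) to an evidence label."""
    if bf <= 0.0:
        raise ValueError("a Bayes factor must be positive")
    for upper, label in SCALE:
        if bf < upper:
            return label
    # Anything at or above the last cut point.
    return "strong evidence for alternative"


def tabulate(bayes_factors):
    """Proportion of datasets falling in each observed evidence category."""
    counts = Counter(evidence_category(b) for b in bayes_factors)
    n = len(bayes_factors)
    return {label: count / n for label, count in counts.items()}
```

Applying `tabulate` to the estimated Bayes factors from the simulated datasets of one setting gives one row of such a breakdown.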
3.4.2 Testing a random slope
We extend our simulation to test for the presence of a random slope in a two-factor non-nested multilevel model. To simulate the data, we include random intercepts for each factor as done previously, but also incorporate a random slope for one of the factors. We simulated $x_i \sim N(0, 0.25)$, $b_{20[i]} \sim N(0, 0.04)$, $\varepsilon_i \sim N(0,1)$, and $b_{1[i]} = (b_{10[i]}, b_{11[i]})^\top \sim N_2(0, \psi)$, with $\psi_{11} = 0.04$, $\psi_{12} = \rho\sqrt{\psi_{11}\psi_{22}}$, and $\rho = -0.3$, which induces a negative correlation between the random intercept and slope. The variances of the random intercepts (0.04) were chosen to match the variances from the previous simulation corresponding to $\lambda_1 = \lambda_2 = 0.2$. We calculated
\begin{equation}
Y_i = b_{10[i]} + b_{20[i]} + b_{11[i]} x_i + \varepsilon_i, \tag{3.14}
\end{equation}
for various combinations of $m = (100, 500, 1000)$ and $\sqrt{\psi_{22}} = (0, 0.1, 0.2, 0.3, 0.6, 1.0)$ for 1,000 datasets. Using prior distributions $\beta_0 \sim N(0,1)$, $\sigma^2 \sim \mathrm{InvGam}(0.1, 0.1)$ (which are non-informative given the simulation settings), and $\phi_h \sim N(\log(0.3), 2)$, we approximated marginal likelihoods for the following models:
\begin{equation}
M_3 : Y_i = \beta_0 + \beta_1 x_i + e^{\phi_1} b_{10[i]} + e^{\phi_2} b_{20[i]} + \varepsilon_i \tag{3.15}
\end{equation}
in which $b^*_{11[i]} = \gamma_1 b_{10[i]} + b_{11[i]}$. Model $M_3$ incorporates random intercepts for both factors with a fixed effect for the covariate, and model $M_4$ includes the additional random slope on the covariate for factor 1. Table 3.5 gives the power and Type I error of our approach using approximate Bayes factors and of the ad-hoc restricted likelihood ratio test (RLRT). Our method preserves the Type I error rate at $\alpha = 0.05$ and has similar power to the ad-hoc RLRT. Table 3.6 shows the estimated Bayes factors according to the scale of Wasserman (2000) and Jeffreys (1961). As $\sqrt{\psi_{22}}$ increased, the estimated Bayes factor displayed greater evidence for the model with the random slope. As the sample size increased, the estimated Bayes factor increasingly favored $M_3$ in the absence of a random slope, and increasingly favored $M_4$ in the presence of a random slope. These simulation results support the claim that our method has good frequentist properties and large-sample consistency.
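The random-slope data-generating model in (3.14) can be sketched as below. This is an illustrative sketch under stated assumptions rather than the code used for the study: the function name is hypothetical, we read the second argument of $N(\cdot,\cdot)$ as a variance (so $x_i$ has standard deviation 0.5), we assume 30 classifications per factor as in the previous simulation, and we assume classifications are assigned uniformly at random. The correlated intercept and slope are built from two independent standard normals, which is the usual Cholesky construction for a bivariate normal.

```python
import math
import random


def simulate_random_slope(m, sd_slope, n_levels=30, rho=-0.3, seed=None):
    """Draw one dataset from Y_i = b10[i] + b20[i] + b11[i] * x_i + eps_i.

    Returns (rows, b10, b11), where rows holds (j, k, x, y) tuples and
    b10, b11 are the factor-1 random intercepts and slopes.
    """
    rng = random.Random(seed)
    sd_int = math.sqrt(0.04)  # psi_11 = 0.04, also the variance of b20
    b10, b11 = [], []
    for _ in range(n_levels):
        z1 = rng.gauss(0.0, 1.0)
        z2 = rng.gauss(0.0, 1.0)
        # Cholesky construction: corr(b10, b11) = rho when sd_slope > 0.
        b10.append(sd_int * z1)
        b11.append(sd_slope * (rho * z1 + math.sqrt(1.0 - rho ** 2) * z2))
    b20 = [rng.gauss(0.0, sd_int) for _ in range(n_levels)]
    rows = []
    for _ in range(m):
        # Assumption: classifications assigned uniformly at random.
        j = rng.randrange(n_levels)
        k = rng.randrange(n_levels)
        x = rng.gauss(0.0, 0.5)  # variance 0.25
        y = b10[j] + b20[k] + b11[j] * x + rng.gauss(0.0, 1.0)
        rows.append((j, k, x, y))
    return rows, b10, b11
```

Setting `sd_slope=0` makes every random slope zero, which corresponds to the setting $\sqrt{\psi_{22}} = 0$ with no random slope.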