The empirical Bayes approach
3.5 Interval estimation
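Obtaining an empirical Bayes confidence interval (EBCI) is, in principle, very simple. Given an estimated posterior $p(\theta_i \mid y_i, \hat{\eta})$, we could use it as we would any other posterior distribution to obtain an HPD or equal tail credible set for $\theta_i$. That is, if $q_\alpha(y_i, \eta)$ is such that
$$P\left(\theta_i \le q_\alpha(y_i, \eta) \mid y_i, \eta\right) = \alpha,$$
then a $100(1-2\alpha)\%$ equal tail naive EBCI for $\theta_i$ is
$$\left( q_\alpha(y_i, \hat{\eta}),\; q_{1-\alpha}(y_i, \hat{\eta}) \right).$$
Why is this interval "naive?" From introductory mathematical statistics we have that
$$\mathrm{Var}(\theta_i \mid \mathbf{y}) = E_{\eta \mid \mathbf{y}}\left[\mathrm{Var}(\theta_i \mid y_i, \eta)\right] + \mathrm{Var}_{\eta \mid \mathbf{y}}\left[E(\theta_i \mid y_i, \eta)\right]. \qquad (3.33)$$
In the Gaussian/Gaussian case, a 95% naive EBCI would be
$$E(\theta_i \mid y_i, \hat{\eta}) \pm 1.96 \sqrt{\mathrm{Var}(\theta_i \mid y_i, \hat{\eta})}.$$
The term under the square root approximates the first term in (3.33), since the EB approach simply replaces integration with maximization. But this corresponds to only the first term in (3.33); the naive EBCI is ignoring the posterior uncertainty about $\eta$ (the second term in (3.33)). Hence, the naive interval may be too short, and have lower than advertised coverage probability.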
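As a concrete illustration of the naive interval, the following is a minimal sketch for the Gaussian/Gaussian case. It assumes the model $y_i \mid \theta_i \sim N(\theta_i, \sigma^2)$, $\theta_i \sim N(\mu, \tau^2)$ with $\sigma^2$ known, and takes the hyperparameter estimate to be the marginal MLE of $(\mu, \tau^2)$. The function name `naive_ebci` and the data values are hypothetical, chosen only for illustration.

```python
import numpy as np
from scipy.stats import norm


def naive_ebci(y, sigma2, level=0.95):
    """Naive equal-tail EBCIs for every theta_i (assumed Gaussian/Gaussian model)."""
    y = np.asarray(y, dtype=float)
    mu_hat = y.mean()                        # marginal MLE of mu
    tau2_hat = max(0.0, y.var() - sigma2)    # marginal MLE of tau^2 (variance with divisor k)
    B_hat = sigma2 / (sigma2 + tau2_hat)     # estimated shrinkage factor
    post_mean = B_hat * mu_hat + (1.0 - B_hat) * y
    post_sd = np.sqrt((1.0 - B_hat) * sigma2)  # plug-in posterior sd: first term of (3.33) only
    z = norm.ppf(0.5 + level / 2.0)
    return post_mean - z * post_sd, post_mean + z * post_sd


# Illustrative (made-up) data with known sampling variance 1.
y = np.array([1.2, -0.4, 3.1, 0.8, 2.5, -1.7, 0.3, 4.0])
lower, upper = naive_ebci(y, sigma2=1.0)
```

Because the hyperparameter estimate is plugged in as if it were known, every interval here has the same width, and that width reflects none of the uncertainty in estimating $\mu$ and $\tau^2$.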
To remedy this, we must first define the notion of "EB coverage".
Definition 3.1 $t_\alpha(\mathbf{y})$ is a $(1-\alpha)100\%$ unconditional EB confidence set for $\theta_i$ if and only if for each $\eta$,
$$P_{\mathbf{y}, \theta_i \mid \eta}\left(\theta_i \in t_\alpha(\mathbf{y})\right) \approx 1 - \alpha.$$
So we are evaluating the performance of the EBCI over the variability inherent in both $\theta_i$ and the data. But is this too weak a requirement? Many authors have suggested instead conditioning on some data summary, $b(\mathbf{y})$. For example, taking $b(\mathbf{y}) = \mathbf{y}$ produces fully conditional (Bayesian) coverage. Many likelihood theorists would take $b(\mathbf{y})$ equal to an appropriate ancillary statistic instead (see Hill, 1990). Or we might simply take $b(\mathbf{y}) = y_i$, on the grounds that $y_i$ is sufficient for $\theta_i$ when $\eta$ is known.
Definition 3.2 $t_\alpha(\mathbf{y})$ is a $100(1-\alpha)\%$ conditional EB confidence set for $\theta_i$ given $b(\mathbf{y})$ if, for each $b(\mathbf{y}) = b$ and $\eta$,
$$P_{\mathbf{y}, \theta_i \mid \eta}\left(\theta_i \in t_\alpha(\mathbf{y}) \mid b(\mathbf{y}) = b\right) \approx 1 - \alpha.$$
Naive EB intervals typically fail to attain their nominal coverage probability, in either the conditional or unconditional EB sense (or in the frequentist sense). One cannot usually show this analytically, but it is often easy to check via simulation. We devote the remainder of this section to outlining methods that have been proposed for "correcting" the naive EBCI.
3.5.1 Morris' approach
For the Gaussian/Gaussian model (3.9), Morris (1983a) suggests basing the EBCI on the modified estimated posterior that uses the "naive" mean, but inflates the variance to capture the second term in equation (3.33). This distribution is
$$N\left( \hat{B}\bar{y} + (1-\hat{B})\, y_i,\;\; \sigma^2\left(1 - \frac{k-1}{k}\hat{B}\right) + \frac{2}{k-3}\hat{B}^2 (y_i - \bar{y})^2 \right), \qquad (3.34)$$
where $\bar{y}$ is the estimated prior mean and $\hat{B}$ is the estimated shrinkage factor. Morris actually proposes estimating shrinkage by
$$\hat{B} = \frac{k-3}{k-1} \cdot \frac{\sigma^2}{\sigma^2 + \hat{\tau}^2}, \quad \text{where} \quad \hat{\tau}^2 = \max\left\{0,\; \frac{1}{k-1}\sum_{i=1}^{k} (y_i - \bar{y})^2 - \sigma^2\right\},$$
which is the result of several "ingenious adhockeries" (see Lindley, 1983) designed to better approximate (3.33). It is important to remember that this entire derivation assumes $\sigma^2$ is known; if it is unknown, a data-based estimate may be substituted.
The first term in (3.34) approximates the first term in (3.33), and is essentially the naive EB variance estimate. The second term in (3.34) approximates the second term in (3.33), and thus serves to "correct" the naive EB interval by widening it somewhat. Notice the amount of correction decreases to zero as $k$ increases, as the estimated shrinkage factor $\hat{B}$ decreases, or as the data-value $y_i$ approaches the estimated prior mean. Morris offers evidence (though not a formal proof) that his intervals do attain the desired nominal coverage. Extension of these ideas to non-Gaussian and higher dimensional settings is possible but awkward; see Morris (1988) and Christiansen and Morris (1997).
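The claim that naive intervals undercover, and that a variance inflation of Morris' type helps, is easy to probe by simulation. The sketch below is only an illustration under stated assumptions: it takes the Gaussian/Gaussian model with known $\sigma^2$, uses the marginal MLE for the naive interval, and codes the Morris interval as a normal with mean $\hat{B}\bar{y} + (1-\hat{B})y_i$ and variance $\sigma^2(1 - \frac{k-1}{k}\hat{B}) + \frac{2}{k-3}\hat{B}^2(y_i - \bar{y})^2$, where $\hat{B} = \frac{k-3}{k-1}\,\sigma^2/(\sigma^2 + \hat{\tau}^2)$. This is our reading of Morris (1983a), not a verified transcription. The function names and simulation settings are hypothetical.

```python
import numpy as np
from scipy.stats import norm


def gaussian_ebcis(y, sigma2, level=0.95):
    """Return ((lo, hi) naive, (lo, hi) Morris-type) intervals; requires k > 3."""
    y = np.asarray(y, dtype=float)
    k, ybar = y.size, y.mean()
    z = norm.ppf(0.5 + level / 2.0)

    # Naive interval: plug the marginal MLEs into the posterior.
    B_mle = sigma2 / (sigma2 + max(0.0, y.var() - sigma2))
    m_naive = B_mle * ybar + (1 - B_mle) * y
    s_naive = np.sqrt((1 - B_mle) * sigma2)

    # Morris-type interval (assumed form): modified shrinkage, inflated variance.
    tau2_hat = max(0.0, y.var(ddof=1) - sigma2)
    B_hat = (k - 3) / (k - 1) * sigma2 / (sigma2 + tau2_hat)
    m_morris = B_hat * ybar + (1 - B_hat) * y
    v_morris = (sigma2 * (1 - (k - 1) / k * B_hat)
                + 2 / (k - 3) * B_hat**2 * (y - ybar) ** 2)
    s_morris = np.sqrt(v_morris)
    return ((m_naive - z * s_naive, m_naive + z * s_naive),
            (m_morris - z * s_morris, m_morris + z * s_morris))


def simulate_coverage(mu, tau2, sigma2, k, n_rep=2000, seed=0):
    """Monte Carlo estimate of unconditional EB coverage for both intervals."""
    rng = np.random.default_rng(seed)
    hits = np.zeros(2)
    for _ in range(n_rep):
        theta = rng.normal(mu, np.sqrt(tau2), size=k)   # theta_i from the prior
        y = rng.normal(theta, np.sqrt(sigma2))          # y_i given theta_i
        for j, (lo, hi) in enumerate(gaussian_ebcis(y, sigma2)):
            hits[j] += np.mean((lo <= theta) & (theta <= hi))
    return hits / n_rep


naive_cov, morris_cov = simulate_coverage(mu=0.0, tau2=1.0, sigma2=1.0, k=10)
```

Both $\theta$ and $\mathbf{y}$ are regenerated in every replication, so the two numbers estimate coverage in the unconditional EB sense of Definition 3.1 for one particular value of the hyperparameters.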
3.5.2 Marginal posterior approach
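This approach is similar to Morris' in that we mimic a fully Bayesian calculation, but a bit more generally. Since $\eta$ is unknown, we place a hyperprior $h(\eta)$ on $\eta$, and base inference about $\theta_i$ on the marginal posterior,
$$l_h(\theta_i \mid \mathbf{y}) = \int p(\theta_i \mid y_i, \eta)\, h(\eta \mid \mathbf{y})\, d\eta,$$
where $h(\eta \mid \mathbf{y}) \propto m(\mathbf{y} \mid \eta)\, h(\eta)$.
Several simplifications to this procedure are available for many models. First, if $\hat{\eta} = \hat{\eta}(\mathbf{y})$ is sufficient for $\eta$ in the marginal family $m(\mathbf{y} \mid \eta)$ and has density $g(\hat{\eta} \mid \eta)$, then we can replace $h(\eta \mid \mathbf{y})$ by $h(\eta \mid \hat{\eta})$, which conditions only on a univariate statistic. Second, note that
$$p(\theta_i \mid y_i, \eta) \propto f(y_i \mid \theta_i)\, \pi(\theta_i \mid \eta), \qquad (3.35)$$
the product of the sampling model and the prior. Hence $p(\theta_i \mid y_i, \eta)$ depends only on $y_i$ and $\eta$, and the marginal posterior can be written as
$$l_h(\theta_i \mid y_i, \hat{\eta}) = \int p(\theta_i \mid y_i, \eta)\, h(\eta \mid \hat{\eta})\, d\eta. \qquad (3.36)$$
Appropriate percentiles of the marginal posterior $l_h$ (instead of the estimated posterior) determine the EBCI. This is an intuitively reasonable approach, since $l_h$ accounts explicitly for the uncertainty in $\eta$. In other words, mixing $p(\theta_i \mid y_i, \eta)$ with respect to $h$ should produce wider intervals.
As indicated by expression (3.35), the first term in the integral in equation (3.36) will typically be known due to the conjugate structure of the hierarchy. However, two issues remain. First, will $h$ be available? And second, even if so, can the integral be computed analytically?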
Deely and Lindley (1981) were the first to answer both of these questions affirmatively, obtaining closed-form results in the Poisson/gamma and "signal detection" (i.e., where $\theta_i$ is a 0-1 random variable) cases. They referred to the approach as "Bayes empirical Bayes," since placing a hyperprior on $\eta$ is essentially a fully Bayesian solution to the EB problem. However, the first general method for implementing the marginal posterior approach is due to Laird and Louis (1987), who used the bootstrap. Their idea was to take $h(\eta \mid \hat{\eta}) = g(\eta \mid \hat{\eta})$, the sampling density of $\hat{\eta}$ given $\eta$ with the arguments interchanged. We then approximate the marginal posterior by observing
$$l_h(\theta_i \mid y_i, \hat{\eta}) \approx \frac{1}{N} \sum_{j=1}^{N} p(\theta_i \mid y_i, \eta_j^*), \qquad (3.37)$$
where $\eta_j^* \stackrel{iid}{\sim} g(\cdot \mid \hat{\eta})$, $j = 1, \ldots, N$. Notice that this estimator converges to $l_h$ as $N \to \infty$ by the Law of Large Numbers. The values $\eta_j^*$ are easily generated via what the authors refer to as the Type III Parametric Bootstrap: given $\hat{\eta}$, we draw $\theta_i^*$ from the prior $\pi(\theta \mid \hat{\eta})$ and then draw $y_i^*$ from the sampling model $f(y \mid \theta_i^*)$ for $i = 1, \ldots, k$. From the bootstrapped data sample $\mathbf{y}^* = (y_1^*, \ldots, y_k^*)$ we may compute $\eta^* = \hat{\eta}(\mathbf{y}^*)$. Repeating this process $N$ times, we obtain $\eta_j^*$, $j = 1, \ldots, N$ for use in equation (3.37), the quantiles of which, in turn, provide our corrected EBCI.
Figure 3.3 Diagram of the Laird and Louis Type III Parametric Bootstrap.
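The bootstrap steps just described can be sketched in a few lines. The example below is a hedged illustration, not the authors' implementation: it assumes the Gaussian/Gaussian model with known $\sigma^2$, takes $\eta = (\mu, \tau^2)$ with the marginal MLE as $\hat{\eta}$, and finds quantiles of the equally weighted normal mixture in (3.37) by root finding. All function names and the data are hypothetical.

```python
import numpy as np
from scipy.optimize import brentq
from scipy.stats import norm


def estimate_eta(y, sigma2):
    """Marginal MLEs (mu_hat, tau2_hat) in the assumed Gaussian/Gaussian model."""
    return y.mean(), max(0.0, y.var() - sigma2)


def type3_bootstrap(y, sigma2, N=400, seed=0):
    """Draw eta*_1..eta*_N by regenerating (theta*, y*) from eta_hat."""
    rng = np.random.default_rng(seed)
    mu_hat, tau2_hat = estimate_eta(y, sigma2)
    draws = np.empty((N, 2))
    for j in range(N):
        theta_star = rng.normal(mu_hat, np.sqrt(tau2_hat), size=y.size)  # theta* ~ prior at eta_hat
        y_star = rng.normal(theta_star, np.sqrt(sigma2))                 # y* ~ sampling model
        draws[j] = estimate_eta(y_star, sigma2)                          # eta* = eta_hat(y*)
    return draws


def mixture_cdf(t, y_i, draws, sigma2):
    """CDF at t of the N-component posterior mixture approximating l_h."""
    mu, tau2 = draws[:, 0], draws[:, 1]
    B = sigma2 / (sigma2 + tau2)
    mean = B * mu + (1 - B) * y_i
    # Floor the variance so that tau2* = 0 gives a very narrow normal, not a division by zero.
    sd = np.sqrt(np.maximum((1 - B) * sigma2, 1e-12))
    return norm.cdf((t - mean) / sd).mean()


def bootstrap_ebci(y_i, draws, sigma2, alpha=0.025):
    """Equal-tail 100(1 - 2*alpha)% interval from the mixture quantiles."""
    a = min(draws[:, 0].min(), y_i) - 10 * np.sqrt(sigma2)   # CDF is essentially 0 here
    b = max(draws[:, 0].max(), y_i) + 10 * np.sqrt(sigma2)   # CDF is essentially 1 here
    lower = brentq(lambda t: mixture_cdf(t, y_i, draws, sigma2) - alpha, a, b)
    upper = brentq(lambda t: mixture_cdf(t, y_i, draws, sigma2) - (1 - alpha), a, b)
    return lower, upper


y = np.array([1.2, -0.4, 3.1, 0.8, 2.5, -1.7, 0.3, 4.0])   # made-up data
draws = type3_bootstrap(y, sigma2=1.0)
lower, upper = bootstrap_ebci(y[0], draws, sigma2=1.0)
```

Only the hyperparameter estimate is needed to generate the bootstrap draws, so the same skeleton would apply when that estimate has to be computed numerically; only `estimate_eta` and the posterior formulas would change.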
Figure 3.3 provides a graphical illustration of the parametric bootstrap process: it simply mimics the process generating the data, replacing the unknown $\eta$ by its estimate $\hat{\eta}$. The "parametric" name arises because the process is entirely generated from a single parameter estimate $\hat{\eta}$, rather than by resampling from the data vector $\mathbf{y}$ itself (a "nonparametric bootstrap"). Hence we can easily draw from $g$ even when this function is not available analytically (e.g., when $\hat{\eta}$ itself must be computed numerically, as in many marginal MLE settings).
Despite this computational convenience, the reader might well question the wisdom of taking $h = g$. An ostensibly more sensible approach would be to choose a reasonable $h$, compute $h(\eta \mid \hat{\eta})$, and obtain an estimator of the corresponding $l_h$. This marginal posterior would match a prespecified hyperprior Bayes solution, and, unlike the version based on $h = g$, would not be sensitive to the precise choice of $\hat{\eta}$. However, if EB coverage is the objective, then the choice $h = g$ may be as good as any other. After all, taking quantiles of $l_h$ will lengthen the naive EBCI, but need not "correct" the interval to level $1-\alpha$ as we desire. Still, Carlin and Gelfand (1990) show how to use the Type III parametric bootstrap to match any $l_h$, provided the sampling density $g$ is available in closed form.
3.5.3 Bias correction approach
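A problem with the marginal posterior approach is that it is difficult to say how to pick a good hyperprior $h$ (i.e., one that will result in $l_h$ quantiles that actually achieve the nominal EB coverage rate). In fact, if $\hat{\eta}$ is badly biased, the naive EBCI may not be too short, but too long. Thus, in general, what is needed is not a method that will widen the naive interval, but one that will correct it.
Suppose we attempt to tackle the problem of EB coverage directly. Recall that $q_\alpha(y_i, \eta)$ is the $\alpha$ quantile of $p(\theta_i \mid y_i, \eta)$. Define
$$r(\hat{\eta}, \eta, y_i, \alpha) = P\left(\theta_i \le q_\alpha(y_i, \hat{\eta}) \mid \theta_i \sim p(\theta_i \mid y_i, \eta)\right)$$
and
$$R(\eta, y_i, \alpha) = E_{\hat{\eta} \mid y_i, \eta}\left\{ r(\hat{\eta}, \eta, y_i, \alpha) \right\}.$$
Hence $R$ is the true EB coverage, conditional on $y_i$, of the naive EB tail area. Usually the naive EBCI is too short, i.e., $R(\eta, y_i, \alpha) > \alpha$ for a lower tail area. If we solved $R(\eta, y_i, \alpha') = \alpha$ for $\alpha'$, then using this $\alpha'$ with $\hat{\eta}$ would conditionally "correct the bias" in our naive procedure and give us intervals with the desired conditional EB coverage. But of course $\eta$ is unknown, so instead we might solve
$$R(\hat{\eta}, y_i, \alpha') = \alpha \qquad (3.38)$$
for $\alpha'$, and take the naive interval with $\alpha$ replaced by $\alpha'$ as our corrected confidence interval. We refer to this interval as a conditionally bias corrected EBCI. For unconditional EB correction, we can replace $R$ by
$$R^*(\eta, \alpha) = E_{\hat{\eta}, y_i \mid \eta}\left\{ r(\hat{\eta}, \eta, y_i, \alpha) \right\}$$
and solve $R^*(\hat{\eta}, \alpha') = \alpha$ for $\alpha'$. The naive interval with $\alpha$ replaced by this $\alpha'$ is called the unconditionally bias corrected EBCI.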
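To make the unconditional correction concrete, here is a minimal Monte Carlo sketch for a lower tail area. It is an illustration under stated assumptions rather than the method of any cited paper: the model is Gaussian/Gaussian with known $\sigma^2$, $\hat{\eta} = (\hat{\mu}, \hat{\tau}^2)$ is the marginal MLE, the expectation defining $R^*$ is approximated by regenerating data from $\hat{\eta}$ (a parametric bootstrap), and the first coordinate of each regenerated data set plays the role of $y_i$. Function names, the seed, and the simulated data are hypothetical.

```python
import numpy as np
from scipy.optimize import brentq
from scipy.stats import norm


def estimate_eta(y, sigma2):
    """Marginal MLEs (mu_hat, tau2_hat) in the assumed Gaussian/Gaussian model."""
    return y.mean(), max(0.0, y.var() - sigma2)


def posterior_moments(y_i, mu, tau2, sigma2):
    """Mean and sd of p(theta_i | y_i, eta) for eta = (mu, tau2)."""
    B = sigma2 / (sigma2 + tau2)
    return B * mu + (1 - B) * y_i, np.sqrt((1 - B) * sigma2)


def unconditional_alpha_prime(y, sigma2, alpha=0.025, N=2000, seed=0):
    """Solve R*(eta_hat, alpha') = alpha for alpha' by Monte Carlo plus root finding."""
    rng = np.random.default_rng(seed)
    mu_hat, tau2_hat = estimate_eta(y, sigma2)
    if tau2_hat <= 0:
        raise ValueError("tau2_hat = 0: the estimated posterior is degenerate")
    # Regenerate N data sets from eta_hat and re-estimate eta on each one.
    theta_star = rng.normal(mu_hat, np.sqrt(tau2_hat), size=(N, y.size))
    y_star = rng.normal(theta_star, np.sqrt(sigma2))
    mu_star = y_star.mean(axis=1)
    tau2_star = np.maximum(0.0, y_star.var(axis=1) - sigma2)
    y1 = y_star[:, 0]                                                     # stands in for y_i
    m_naive, s_naive = posterior_moments(y1, mu_star, tau2_star, sigma2)  # posterior under eta*
    m_true, s_true = posterior_moments(y1, mu_hat, tau2_hat, sigma2)      # posterior under eta_hat

    def R_star(a):
        q = m_naive + norm.ppf(a) * s_naive              # naive a-quantile in each replicate
        return norm.cdf((q - m_true) / s_true).mean()    # r evaluated via the normal cdf, then averaged

    return brentq(lambda a: R_star(a) - alpha, 1e-8, 0.5), R_star


# Simulated (made-up) data: k = 20 draws from a marginal with sd 3, sigma2 = 1.
y = np.random.default_rng(1).normal(0.0, 3.0, size=20)
alpha_prime, R_star = unconditional_alpha_prime(y, sigma2=1.0)

# Corrected lower limit for theta_1: the naive limit with alpha replaced by alpha'.
mu_hat, tau2_hat = estimate_eta(y, 1.0)
m, s = posterior_moments(y[0], mu_hat, tau2_hat, 1.0)
corrected_lower = m + norm.ppf(alpha_prime) * s
```

The same random draws are reused for every trial value of $\alpha'$, which keeps the Monte Carlo estimate of $R^*$ a smooth function of $\alpha'$ and lets a standard bracketing root finder do the solving.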
Implementation. If the required sampling density of $\hat{\eta}$ is available in closed form (e.g., the Gaussian/Gaussian and exponential/inverse gamma models), then solving (3.38) requires only traditional numerical integration and a rootfinding algorithm. If this density is not available (e.g., if $\hat{\eta}$ itself is not available analytically, as in the Poisson/gamma and beta/binomial models), then solving (3.38) must be done via Monte Carlo methods. In particular, Carlin and Gelfand (1991a) show how the Type III parametric bootstrap may be used in this regard. Notice that in either case, since the prior $\pi$ is typically chosen to be conjugate with the sampling model $f$, $r$ can be evaluated directly as
$$r(\hat{\eta}, \eta, y_i, \alpha) = P\left(q_\alpha(y_i, \hat{\eta}) \mid y_i, \eta\right),$$
where $P(\cdot \mid y_i, \eta)$ is the cdf corresponding to $p(\theta_i \mid y_i, \eta)$. So even when Monte Carlo methods must be employed, mathematical subroutines can be employed at this innermost step.
To evaluate whether bias correction is truly effective, we must check whether
$$R(\eta, y_i, \alpha') \approx \alpha, \qquad (3.39)$$
where $\alpha'$ is the solution to (3.38). If $\hat{\eta} \to \eta$ as $k \to \infty$ (i.e., $\hat{\eta}$ is consistent for $\eta$), then (3.39) would certainly hold for large $k$, but, in this event, the naive EB interval would do fine as well! For fixed $k$, Carlin and Gelfand (1990) provide conditions (basically, stochastic ordering of the distributions involved) under which the true coverage of the bias corrected EB tail area lies in an interval containing the nominal coverage level $\alpha$.
Example 3.3 We now consider the exponential/inverse gamma model, where $f(y_i \mid \theta_i) = \theta_i^{-1} \exp(-y_i/\theta_i)$ and $\pi(\theta_i \mid \eta) = \theta_i^{-(\eta+1)} \exp(-1/\theta_i) / \Gamma(\eta)$. The marginal distribution for $y_i$ is then $m(y_i \mid \eta) = \eta / (1 + y_i)^{\eta+1}$, so that the MLE is $\hat{\eta} = k / \sum_{i=1}^{k} \log(1 + y_i)$.
Consider bias correcting the lower tail of the EBCI. Solving $R^*(\hat{\eta}, \alpha') = \alpha$ for $\alpha'$ (unconditional bias correction), and plotting the values obtained as a function of $\hat{\eta}$, we see that $\alpha'$ is close to $\alpha$ for $\hat{\eta}$ near 1, but decreases steadily as $\hat{\eta}$ increases, and that this decrease is more pronounced for larger values of $\alpha$. The behavior for the upper tail is similar, except that $\alpha'$ is now an increasing function of $\hat{\eta}$ (recall $\alpha$ is now near 0.95).
To evaluate the ability of the various methods considered in this section to achieve nominal EB coverage in this example, we simulated their unconditional EB coverage probabilities and average interval lengths for two nominal coverage levels (90% and 95%), where we have set the true value of $\eta = 2$ and $k = 5$. Our simulation used 3000 replications, and set $N = 400$ in those methods that required a parametric bootstrap. In addition to the methods presented in this section, we included the classical (frequentist) interval and two intervals that arise from matching a specific hyperprior Bayes solution as in equation (3.36). The two hyperpriors we consider are both noninformative and improper.
Table 3.4 Comparison of simulated unconditional EB coverage probabilities, exponential/inverse gamma model.
Looking at the results in Table 3.4, we see that the classical method faithfully produces intervals with the proper coverage level, but which are extremely long relative to all the EB methods. The naive EB intervals also perform as expected, having coverage probabilities significantly below the nominal level. The intervals based on bias correction and the Laird and Louis bootstrap approach both perform well in obtaining nominal coverage, with the former being somewhat shorter at the 90% level and the latter a bit shorter at the 95% level. Finally, of the two hyperprior matching intervals, only one performs well, with the other producing intervals that are too short. This highlights the difficulty in choosing hyperpriors that have the desired result with respect to empirical Bayes coverage.
On a related note, the proper choice of marginal hyperparameter estimate $\hat{\eta}$ can also have an impact on coverage, and this impact is difficult to predict. Additional simulations (not shown) suggest that the best choice of $\hat{\eta}$ is not the marginal MLE, but the marginal uniformly minimum variance unbiased estimate (UMVUE), $(k-1) / \sum_{i=1}^{k} \log(1 + y_i)$. MLE-based intervals were, in general, a bit too long for the bias correction method, but too short for the Laird and Louis method.