Performance of Bayes procedures

4.3 Frequentist performance: Confidence intervals

They summarize the results of their investigation by concluding that, at least for the models considered, the Bayesian approach is quite robust, in that it is preferred to the frequentist approach so long as either the assumed prior is approximately correct or relatively little weight is put on the prior.

The Bayesian will get in trouble if she or he is overly confident about an incorrect assumption.

Inequality (4.6) can be explored quite generally to see when the Bayesian approach is superior. For example, if

term is 0 and the comparison depends on the relation between assumed prior mean

This special case structures the foregoing evaluations of the beta/binomial and Gaussian/Gaussian models.

Samaniego and Reneau further investigate the situation when the true and assumed prior means are equal. Then inequality (4.6) becomes

(4.7) and the weight put on the prior mean

(1-the then the first

Before considering frequentist performance,

(4.8) Properties of the HPD interval

is degenerate at

4.3.1 Beta/Binomial model

a monotone decreasing function. Thus the HPD Bayesian credible interval is one-sided. From (4.8) it is easy to show that

so letting

by solving the equation S(t) _ interval as

More generally, it is easy to show that when X = 0, the Bayesian interval is one-sided so long as

X = n, and is left as an exercise.

Unlike the case of the Gaussian/Gaussian model, however, if one uses the zero-precision prior (M = 0), technical problems occur if X equals either 0 or n. Hence the prior must contain at least a small amount of information. Fortunately, we have seen that using a small value of M > 0 actually leads to point estimates with better frequentist performance than the usual maximum likelihood estimate. We investigate the properties of the corresponding interval estimates below.

As an interesting sidelight, consider representing the upper limit in (4.9) as a pseudo number of events one might expect in a future experiment divided by the sample size (n). Then

Thus as the sample size gets large, the numerator of the upper limit of the confidence interval converges to a fixed number of pseudo-events. For example, with

- log(.05)

value is the upper limit for most individuals when asked: "In the light of zero events in the current experiment, what is the largest number of events you think is tenable in a future replication of the experiment?"
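The convergence of the numerator to a fixed number of pseudo-events can be checked numerically. The sketch below assumes the simplest case, a uniform Beta(1, 1) prior with X = 0, so that the posterior is Beta(1, n + 1) and the posterior survival function is S(t) = (1 - t)^(n+1). That choice of prior and the function name `pseudo_events` are illustrative assumptions, not the general form of (4.9).

```python
import math

def pseudo_events(n, alpha=0.05):
    """n times the one-sided upper limit when X = 0 (uniform prior assumed)."""
    # Upper limit t solves S(t) = (1 - t)^(n + 1) = alpha.
    upper = 1.0 - alpha ** (1.0 / (n + 1))
    return n * upper

for n in (10, 100, 1000, 100000):
    print(n, round(pseudo_events(n), 4))

# Limiting value as n grows: -log(alpha), roughly 3 when alpha = 0.05.
print("limit:", round(-math.log(0.05), 4))
```

Under these assumptions the printed values increase with n and settle near -log(.05), which is the "roughly 3 events" figure discussed in the text.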

Frequentist performance Based on the

100(1 -

obtained by taking the upper and lower α-points of the posterior distribution. But as we have just seen, this approach would be silly for

X = 0. This is because the monotone decreasing nature of the posterior in this case implies that the

higher posterior density than any

comment applies to the situation where b = (1

-A possible remedy would be to define the "corrected equal-tail" interval for t. Doing this, we obtain the HPD the upper endpoint of the HPD interval may be found

A similar result holds for the case

3. As reported by Louis (1981a) and Mann et al. (1988), this number of pseudo-events is

posterior distribution for a simple is the "equal-tail" interval ) % Bayesian credible interval for

when values excluded on the lower end would have values within the interval! A similar

) M 1 and X = n.

as follows:

(4.10) where

and again

and removes the most glaring deficiency of the equal-tail procedure in this case, it will still be overly broad when the posterior is two-tailed but skewed to the left or right.

Of course, the HPD interval provides a remedy to this problem, at some computational expense. Since the one-tailed intervals in equation (4.10) are already HPD, we focus on the two-tailed case. The first step is to find the two roots

not too difficult numerically, since polynomial in

where F is the cdf of our beta posterior. Finally, we adjust

down depending on whether the posterior coverage of our interval was too large or too small, respectively. Iterating this procedure produces the HPD interval.

Notice that the above algorithm amounts to finding the roots of the equation

wherein each step requires a rootfinder and a beta cdf routine. The latter is now readily available in many statistical software packages (such as S), but the former may require us to do some coding on our own. Fortunately, a simple bisection (or interval halving) technique will typically be easy to program and provide adequate performance. For a description of this technique, the reader is referred to a calculus or elementary numerical methods text, such as Burden et al. (1981, Section 2.1). Laird and Louis (1982) give an illustration of the method's use in an HPD interval setting.
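The nested root-finding described above can be sketched with nothing more than interval halving. This is one possible arrangement, not necessarily the authors' code: for a trial density height c it finds the two roots of p(t) = c on either side of the posterior mode, evaluates the posterior coverage between them, and then moves c up or down until the coverage equals 1 - α. To stay within the standard library, the beta cdf is computed through the binomial-sum identity, which restricts this sketch to integer posterior parameters of at least 2 (for example a uniform prior with 0 < X < n). All function names are hypothetical.

```python
import math

def beta_pdf(t, a, b):
    """Beta(a, b) density at 0 < t < 1."""
    log_norm = math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
    return math.exp(log_norm + (a - 1) * math.log(t) + (b - 1) * math.log(1 - t))

def beta_cdf(t, a, b):
    """Beta cdf for INTEGER a, b >= 1, via the binomial-sum identity."""
    m = a + b - 1
    return sum(math.comb(m, j) * t**j * (1 - t)**(m - j) for j in range(a, m + 1))

def bisect(f, lo, hi, tol=1e-10):
    """Interval halving; assumes f changes sign exactly once on [lo, hi]."""
    f_lo_positive = f(lo) > 0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if (f(mid) > 0) == f_lo_positive:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def hpd_interval(a, b, alpha=0.05):
    """Two-sided HPD interval for a Beta(a, b) posterior, integer a, b >= 2."""
    mode = (a - 1) / (a + b - 2)
    peak = beta_pdf(mode, a, b)
    tiny = 1e-12

    def endpoints(c):
        # The two roots of p(t) = c, one on each side of the mode.
        g = lambda t: beta_pdf(t, a, b) - c
        return bisect(g, tiny, mode), bisect(g, mode, 1 - tiny)

    def excess_coverage(c):
        lo, hi = endpoints(c)
        return beta_cdf(hi, a, b) - beta_cdf(lo, a, b) - (1 - alpha)

    # Raising c shrinks the interval, so excess coverage decreases in c.
    c = bisect(excess_coverage, 1e-6 * peak, peak)
    return endpoints(c)

# Example: uniform prior, n = 10, X = 3 gives a Beta(4, 8) posterior.
print(hpd_interval(4, 8))
```

The outer bisection plays the role of "adjusting up or down" in the text, and each of its steps calls the inner rootfinder twice and the cdf twice.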

We now examine the frequentist coverage of our Bayesian credible intervals. Adopting the notation

note carefully that θ

possible effect of posterior skewness, we first compute the lower and upper non-coverages

(4.11)

so that coverage(θ) = 1 - lower(θ) - upper(θ). These coverages are easily is held fixed in this calculation. To allow for the

this is defined as up or We next find

and of the equation for a given This is Though this interval is still straightforward to compute

quantile of the posterior,

denotes the

in this case is simply a (a+b+n)-degree

We begin by reporting coverages of the corrected equal tail intervals defined in equation (4.10) based on three beta priors that are symmetric about

Jeffreys prior, the uniform prior, and an informative Beta(5, 5) prior. The table also lists the prior mean

of the weight given to the prior by the posterior when n = 10, namely, M/(M + n). Since the Beta(5, 5) prior gives equal posterior weight to the data and the prior, this is clearly a strongly informative choice.

For n = 10, Figure 4.6 plots the coverages of the Bayesian intervals with nominal coverage probability

"true"

obtained by summing binomial probabilities. For example

Figure 4.6 Coverage of "corrected" equal tail 95% credible interval under three symmetric priors with n = 10.

values from 0.05 to 0.95. Since the symmetry of the priors implies

= .95 across a grid of possible the prior precision M, and the percentage

= 0.5. These are the first three listed in Table 4.1, namely, the (4.12)

Table 4.1 Beta prior distributions that lower

interpretation of the plot, horizontal reference lines are included at the 0.95 and 0.025 levels. We see that the two noninformative priors produce values for coverage

Figure 4.7 Coverage of "corrected" equal tail 95% credible interval under three symmetric priors with n = 40.

= upper(1 - θ), the latter quantity is not shown. To enhance

that are reasonably stable at the nominal level for

4.3.2 Fieller-Creasy problem

and so endpoints for a 100(1-by solving the quadratic equation

for

While appearing reasonable, this approach suffers from a serious defect:

the quadratic equation (4.13) may have no real roots. That is, the discrim-We now take up the problem of estimating the ratio of two normal means, a problem originally considered by Fieller (1954) and Creasy (1954). Suppose

assumed independent given The parameter of interest is the mean ratio

frequentist approach might begin by observing that all

irregularities in the pattern. However, the informative prior performs poorly for extreme

course, the narrowness of its intervals should be borne in mind as well). Of particular interest is the 0% coverage provided if

of the fact that the informative prior will not allow the intervals to stray this far from 0.5 - even if the data encourage such a move (i.e., X = 0 or n). As in the case of point estimation, we see that informative priors enable more precise estimation, but carry the risk of disastrous performance when their informative content is in error.
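The coverage calculation behind these comparisons amounts to holding θ fixed and summing binomial probabilities over the values of X whose interval misses θ on one side or the other. The following is a minimal sketch under several assumptions: the "corrected equal-tail" rule is taken to be the equal-tail interval, replaced by a one-sided interval when X = 0 with a ≤ 1 or X = n with b ≤ 1; "lower" is taken to mean the probability that θ falls below the interval and "upper" the probability that it falls above; and the prior parameters are integers so that the beta cdf can be written as a binomial sum. Function names are hypothetical.

```python
import math

def beta_cdf(t, a, b):
    """Beta cdf for INTEGER a, b >= 1, via the binomial-sum identity."""
    m = a + b - 1
    return sum(math.comb(m, j) * t**j * (1 - t)**(m - j) for j in range(a, m + 1))

def beta_quantile(p, a, b, tol=1e-12):
    """Invert the increasing beta cdf by interval halving."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if beta_cdf(mid, a, b) < p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def corrected_equal_tail(x, n, a, b, alpha=0.05):
    """Equal-tail interval, one-sided when the posterior is monotone (assumed rule)."""
    pa, pb = a + x, b + n - x          # posterior is Beta(a + x, b + n - x)
    if x == 0 and a <= 1:
        return 0.0, beta_quantile(1 - alpha, pa, pb)
    if x == n and b <= 1:
        return beta_quantile(alpha, pa, pb), 1.0
    return beta_quantile(alpha / 2, pa, pb), beta_quantile(1 - alpha / 2, pa, pb)

def noncoverage(theta, n, a, b, alpha=0.05):
    """Return (lower, coverage, upper) for a fixed true theta."""
    lower = inside = upper = 0.0
    for x in range(n + 1):
        prob = math.comb(n, x) * theta**x * (1 - theta)**(n - x)
        lo, hi = corrected_equal_tail(x, n, a, b, alpha)
        if theta < lo:
            lower += prob              # theta lies below the interval
        elif theta > hi:
            upper += prob              # theta lies above the interval
        else:
            inside += prob
    return lower, inside, upper

for theta in (0.05, 0.25, 0.5):
    print(theta, noncoverage(theta, 10, 1, 1), noncoverage(theta, 10, 5, 5))
```

With n = 10 and the Beta(5, 5) prior, this sketch gives zero coverage at θ = 0.05, in line with the behavior described above: no possible value of X pulls the interval that far from 0.5.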

Figure 4.7 repeats the calculations of Figure 4.6 in the case where n = 40.

Again both noninformative priors produce intervals with good frequentist coverage, with the uniform prior now performing noticeably better for the most extreme

still poorly for

sample size, the prior now contributes only 20% of the posterior information content, instead of 50%.
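The percentages quoted for the prior's share of the posterior information follow from the ratio M/(M + n), with M = a + b the prior precision. A small sketch of that arithmetic for the four priors named in Table 4.1 (the function name is hypothetical):

```python
def prior_weight(a, b, n):
    """Share of posterior information due to a Beta(a, b) prior: M / (M + n)."""
    m = a + b                      # prior precision M
    return m / (m + n)

priors = {
    "Jeffreys Beta(0.5, 0.5)": (0.5, 0.5),
    "Uniform Beta(1, 1)": (1, 1),
    "Beta(5, 5)": (5, 5),
    "Beta(1, 3)": (1, 3),
}
for name, (a, b) in priors.items():
    for n in (10, 40):
        print(f"{name:24s} n={n:2d}  weight={prior_weight(a, b, n):.0%}")
```

For the Beta(5, 5) prior this reproduces the drop from 50% at n = 10 to 20% at n = 40.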

Finally, Figure 4.8 reconsiders the n = 10 case using an asymmetric Beta(1, 3) prior, listed last in Table 4.1. This prior is mildly informative in this case (contributing 4/14 = 29% of the posterior information) and favors smaller

a mirror image of lower

values of the former shown for large informative priors must be used with care.

) % confidence interval for and

values. Again, the message is clear:

is no longer and this is evident in the unacceptably large values over larger ones. As a result, upper

This is due to the fact that, with the increased values. The informative prior also performs better, though

= 0.05 or 0.95, a result values, and only slightly better for

though the discrete nature of the binomial distribution causes some values near 0.5 (though of

We notate the roots of this equation by

could be derived (4.13)

inant of the quadratic equation,

Figure 4.8 Coverage of "corrected" equal tail 95% credible interval under an asymmetric Beta(a = 1, b = 3) prior with n = 10.

may be negative. To remedy this, we might try the following more naive approach. Write

the last line arising from a one-term Taylor expansion of the denominator (4.14)

on the previous line. Hence

and so it makes sense to simply take proximation (4.14) again to obtain

so a sensible variance estimate is given by

Thus an alternate (1-by

While this procedure will never fail to produce an answer, its accuracy is certainly open to question.
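To make the contrast between the two frequentist procedures concrete, here is a hedged sketch of both. It assumes the unit-variance setup Y1 ~ N(θ1, 1) and Y2 ~ N(θ2, 1), independent, with the ratio θ1/θ2 as the target. The exact interval uses the standard Fieller pivot (Y1 - δ·Y2)/sqrt(1 + δ²) ~ N(0, 1), and the approximate one uses the first-order variance estimate (1 + δ̂²)/Y2². These forms are assumptions of the sketch and may differ in detail from equations (4.13) and (4.14); function names are hypothetical.

```python
import math
from statistics import NormalDist

def fieller_interval(y1, y2, alpha=0.05):
    """Roots of (y1 - d*y2)^2 = z^2 (1 + d^2); None when no real roots exist."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    qa = y2**2 - z**2
    qb = -2.0 * y1 * y2
    qc = y1**2 - z**2
    disc = qb**2 - 4.0 * qa * qc       # equals 4 z^2 (y1^2 + y2^2 - z^2)
    if disc < 0 or qa == 0:
        return None                    # the "failure" case: no usable roots
    r1 = (-qb - math.sqrt(disc)) / (2.0 * qa)
    r2 = (-qb + math.sqrt(disc)) / (2.0 * qa)
    # Note: when qa < 0 the confidence set is the region OUTSIDE the roots;
    # this sketch does not treat that case separately.
    return min(r1, r2), max(r1, r2)

def delta_interval(y1, y2, alpha=0.05):
    """Approximate interval: ratio estimate +/- z * first-order standard error."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    d_hat = y1 / y2
    se = math.sqrt((1.0 + d_hat**2) / y2**2)
    return d_hat - z * se, d_hat + z * se

print(fieller_interval(30.0, 10.0), delta_interval(30.0, 10.0))
print(fieller_interval(0.5, 0.5))      # discriminant negative -> None
print(delta_interval(3.0, 0.1))        # always an answer, but extremely wide
```

The last two calls illustrate the trade-off under these assumptions: the exact method can return nothing at all, while the approximate method always answers but produces enormous intervals when the observed denominator is near zero.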

By contrast, the Bayesian procedure is quite straightforward. Stephens and Smith (1992) show how to obtain samples from the joint posterior distribution of

using a Monte Carlo method called the weighted bootstrap (see Subsection 5.3.2). We stay within the simpler

conjugate priors

be made noninformative by setting

then obtain the independent posterior distributions

)100% frequentist confidence interval for is given

and under the noninformative prior

known setting, and adopt the

and (recall that these may

From Example 2.1 we Next, we may use

ap-We could now derive the distribution of

the familiar Jacobian approach, but, like Stephens and Smith (1992), we find it far easier to use a sampling-based approach. Suppose we draw

and then define

stitute a sample of size G from the posterior distribution of (1

-the

element in the sample, and ple. (Note that

or else this calculation may require interpolation.)
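The sampling-based interval just described can be sketched in a few lines. The sketch assumes Y_i ~ N(θ_i, 1) with independent conjugate priors θ_i ~ N(μ, τ²), so that each posterior is normal with mean (μ + τ²·y_i)/(1 + τ²) and variance τ²/(1 + τ²). The default values μ = 3 and τ² = 50 and all function names are illustrative assumptions.

```python
import random

def posterior_params(y, mu, tau2):
    """Normal posterior (mean, variance) for theta given y ~ N(theta, 1)."""
    shrink = tau2 / (1.0 + tau2)
    mean = (1.0 - shrink) * mu + shrink * y
    return mean, shrink                # posterior variance equals tau2 / (1 + tau2)

def ratio_credible_interval(y1, y2, mu=3.0, tau2=50.0, alpha=0.05, G=10000, seed=0):
    """Equal-tail credible interval for theta1 / theta2 from G posterior draws."""
    rng = random.Random(seed)
    m1, v1 = posterior_params(y1, mu, tau2)
    m2, v2 = posterior_params(y2, mu, tau2)
    # Draw each mean from its posterior and form the ratio, G times.
    draws = sorted(rng.gauss(m1, v1**0.5) / rng.gauss(m2, v2**0.5) for _ in range(G))
    k = round(G * alpha / 2)           # G * alpha / 2 rounded to an integer, must be >= 1
    return draws[k - 1], draws[G - k]  # k-th smallest and k-th largest draws

print(ratio_credible_interval(6.0, 2.0))
```

Sorting the G ratios and reading off order statistics avoids deriving the distribution of the ratio analytically, which is the convenience the text points to.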

To compare the frequentist performance of these three approaches to the Fieller-Creasy problem, we perform a simulation study. After choosing some "true" values for

and apply each of the three procedures to each of the generated data pairs (Note that to implement our sampling-based Bayesian procedure, this means we must generate

simulated data pair. Thus we effectively have a simulation within a simulation, albeit one that is very easily programmed.) For each simulated data pair, we record the lengths of the three resulting confidence intervals and whether the true

a

method with good frequentist performance will have a coverage probability of (1 - α)

are not too long on the average.

Table 4.2 gives the resulting average interval lengths and estimated probabilities that the true

dence interval, where we take

the exact and approximate frequentist approaches, we also evaluate two Bayesian methods, both of which take the two prior means

equal to 3. However, the first method takes highly informative prior. The second instead takes

vague prior specification. The four portions of the table correspond to the four true

in all cases.

Looking at the table, one is first struck by the uniformly poor performance of the approximate frequentist method. While it typically produces intervals that meet or exceed the desired coverage level of 95%, they are unacceptably wide - ludicrously so when

based on the exact frequentist method (derived from solving the quadratic equation (4.13)) are much narrower, but in no case do they attain the nom-analytically. perhaps using

g = 1, . . ., G. These values then con-Hence a could be obtained by sorting

as the -smallest

)100% equal-tail credible interval for sample, and subsequently taking

as the

may first have to be rounded to the nearest integer, largest element in the

sam-and we generate

g = 1,...,G} for each

value fell below, within, or above each interval. Clearly or better, while at the same time producing intervals that

lies below, within, or above the computed

confi-= .05 and N = G = 10,000. Along with

= 1, resulting in a and

= 50, a rather combinations (0,1), (0,3), (3,1), and (3,3), where

is close to 0. The intervals

inal 95% coverage level. This is no doubt attributable in part to the fact that the coverage probabilities in the table are of necessity based only on those cases where the discriminant

cases ("failures") were simply ignored when computing the summary information in the table. With the exception of the final

fail rates (shown just below the method's label in the leftmost column) are quite substantial, suggesting that whatever the method's properties, it will often be unavailable in practice.

By contrast, the performance of the two Bayesian methods is quite impressive. The intervals based on the informative prior have both very high

Table 4.2 Simulation results, Fieller-Creasy problem

was positive; the remaining scenario, these

In document Bayes and Empirical Bayes Methods for Data Analysis - Carlin Louis (Page 112-122)