The empirical Bayes approach

3.6 Generalization to regression structures

We have considered the basic two-stage empirical Bayes model with associated delta-theorem and bootstrap methods for obtaining approximately correct posterior means and variances. A brief look at the general model, where the prior mean for each coordinate can depend on a vector of regressors, shows features of this important case and further motivates the need for the more flexible and applicable hyperprior Bayesian approach to complicated models.

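Consider the two-stage model where

$$\theta \sim N_k(X\alpha, \tau^2 I) \qquad (3.40)$$

$$Y \mid \theta \sim N_k(\theta, \sigma^2 I) \qquad (3.41)$$

where $X$ is a design matrix. Then, if the parameters $(\alpha, \sigma^2, \tau^2)$ are known, the posterior mean and variance of $\theta$ are

$$E(\theta \mid Y, \alpha, \sigma^2, \tau^2) = X\alpha + (1 - B)(Y - X\alpha) \qquad (3.42)$$

$$\mathrm{Var}(\theta \mid Y, \alpha, \sigma^2, \tau^2) = (1 - B)\,\sigma^2 I \qquad (3.43)$$

with $B = \sigma^2/(\sigma^2 + \tau^2)$ defined as in Example 2.1. Notice that the shrinkage applies to residuals around the regression line.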
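The shrinkage toward the regression line can be illustrated with a minimal Python sketch. It assumes the usual Gaussian/Gaussian form, in which each coordinate's posterior mean is its prior mean $x_i'\alpha$ plus $(1 - B)$ times the residual, with $B = \sigma^2/(\sigma^2 + \tau^2)$ and common posterior variance $(1 - B)\sigma^2$. The function name, the single-regressor design, and all numerical values are hypothetical and chosen only for illustration.

```python
def shrink_to_regression_line(y, prior_mean, sigma2, tau2):
    """Posterior means and common posterior variance when (alpha, sigma2, tau2) are known.

    y          : observed values Y_i
    prior_mean : prior means x_i' alpha (regression line evaluated at each x_i)
    sigma2     : sampling variance (assumed known)
    tau2       : prior variance (assumed known)
    """
    B = sigma2 / (sigma2 + tau2)  # shrinkage factor
    # Each residual (y_i - x_i' alpha) is shrunk by the factor (1 - B).
    post_mean = [m + (1.0 - B) * (yi - m) for yi, m in zip(y, prior_mean)]
    post_var = (1.0 - B) * sigma2
    return post_mean, post_var


# Hypothetical illustration: intercept 1, slope 2, equal sampling and prior variance.
x = [0.0, 1.0, 2.0, 3.0]
alpha = (1.0, 2.0)
prior_mean = [alpha[0] + alpha[1] * xi for xi in x]
y = [2.0, 2.0, 6.0, 6.0]
post_mean, post_var = shrink_to_regression_line(y, prior_mean, sigma2=1.0, tau2=1.0)
```

With $\sigma^2 = \tau^2$ the factor is $B = 1/2$, so each posterior mean lies halfway between the observation and the regression line rather than halfway to a common grand mean.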

For the EB approach, parameters are estimated by marginal maximum likelihood. As we have seen for the basic model, the posterior variance must be adjusted for estimation uncertainty. The Morris expansion (Morris, 1983) gives

(3.44) (3.45) where all hatted values are estimated and $c$ is the dimension of the regression coefficient vector $\alpha$. Formula (3.44) generalizes (3.34) and is similar to the adjustment produced by using restricted maximum likelihood (REML) estimates of variance instead of ML estimates (as in SAS Proc MIXED; see also Appendix C).
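As a rough illustration of the plug-in step only, the following Python sketch computes marginal maximum likelihood estimates for a hypothetical special case: one regressor plus an intercept, and $\sigma^2$ assumed known. Under a Gaussian/Gaussian regression model the $Y_i$ are marginally independent $N(x_i'\alpha, \sigma^2 + \tau^2)$, so the marginal MLE of $\alpha$ is the ordinary least squares fit and that of $\tau^2$ is $\max(0, \mathrm{RSS}/k - \sigma^2)$. The function names and data are assumptions made for this sketch, and it returns only the naive plug-in variance; it does not implement the Morris adjustment.

```python
def ols_line(x, y):
    """Least squares intercept and slope for a single regressor."""
    k = len(x)
    xbar = sum(x) / k
    ybar = sum(y) / k
    sxx = sum((xi - xbar) ** 2 for xi in x)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return ybar - slope * xbar, slope


def eb_regression(x, y, sigma2):
    """Plug-in EB estimates, assuming sigma2 is known (an assumption of this sketch)."""
    k = len(y)
    a0, a1 = ols_line(x, y)                      # marginal MLE of alpha
    fitted = [a0 + a1 * xi for xi in x]
    rss = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    tau2_hat = max(0.0, rss / k - sigma2)        # marginal MLE of tau^2, truncated at 0
    B_hat = sigma2 / (sigma2 + tau2_hat)
    theta_hat = [fi + (1.0 - B_hat) * (yi - fi) for yi, fi in zip(y, fitted)]
    naive_var = (1.0 - B_hat) * sigma2           # ignores uncertainty in the estimates
    return (a0, a1), tau2_hat, B_hat, theta_hat, naive_var


# Hypothetical data for illustration only.
x = [0.0, 1.0, 2.0, 3.0]
y = [0.0, 4.0, 2.0, 6.0]
alpha_hat, tau2_hat, B_hat, theta_hat, naive_var = eb_regression(x, y, sigma2=1.0)
```

The naive variance treats the estimated hyperparameters as if they were known, which is why an adjustment for estimation uncertainty is needed.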

Derivation of (3.44) is non-trivial, and it does not account for correlation among parameter estimates or for the fact that the posterior is skewed and not Gaussian; see Laird and Louis (1987), Carlin and Gelfand (1990), and Section 3.5 of this book. As we shall see in Chapter 5, these derivations can be replaced by Monte Carlo methods that account for all of the consequences of not knowing prior parameters. The power of these methods for the linear model is further amplified for nonlinear models. For example, in Example 3.1, to use a Gaussian/Gaussian model we transformed five-year death rates to log odds ratios. A more direct approach based on logistic regression requires the methods in Chapter 5.
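To make the kind of transformation mentioned here concrete, the sketch below converts death counts in two groups into a log odds ratio together with its usual large-sample variance (the sum of reciprocal cell counts), which is what allows a Gaussian sampling model to be used approximately. The two-group layout, the function name, and the counts are assumptions made for illustration; they are not the data or the exact construction of Example 3.1.

```python
import math


def log_odds_ratio(deaths_a, n_a, deaths_b, n_b):
    """Log odds ratio of death (group a vs. group b) and its large-sample variance.

    Assumes all four cells of the 2x2 table are positive.
    """
    a, b = deaths_a, n_a - deaths_a   # group a: deaths, survivors
    c, d = deaths_b, n_b - deaths_b   # group b: deaths, survivors
    log_or = math.log((a * d) / (b * c))
    variance = 1.0 / a + 1.0 / b + 1.0 / c + 1.0 / d
    return log_or, variance


# Hypothetical counts: 10 deaths among 100 in group a, 20 among 100 in group b.
y_i, sigma2_i = log_odds_ratio(10, 100, 20, 100)
```

The pair `(y_i, sigma2_i)` would then play the roles of the observation and its sampling variance in a Gaussian/Gaussian analysis, whereas a logistic regression works with the counts directly.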

3.7 Exercises

1. (Berger, 1985, p.298) Suppose $Y_i \mid \theta_i \sim N(\theta_i, \sigma^2)$, $i = 1, \ldots, k$, and that the $\theta_i$ are i.i.d. from a common prior $G$. Define the marginal density in the usual way as

$$m(y) = \int f(y \mid \theta)\, dG(\theta).$$

(a) Show that, given $G$, the posterior mean for $\theta_i$ can be written as

$$E(\theta_i \mid y_i) = y_i + \sigma^2 \frac{m'(y_i)}{m(y_i)},$$

where $m'$ is the derivative of $m$.

(b) Suggest a related nonparametric EB estimator for $\theta_i$.

2. Under the compound sampling model (3.4) with $f_i = f$ for all $i$, show that (3.5) holds (i.e., that the $Y_i$ are marginally i.i.d.). What is the computational significance of this result for parametric EB data analysis?

3. Consider the gamma/inverse gamma model, i.e., $Y_i \mid \theta_i \sim G(\alpha, \theta_i)$, $\alpha$ a known tuning constant, and $\theta_i \sim IG(\eta, 1)$.

(a) Find the marginal density of $y_i$.

(b) Suppose $\alpha = 2$. Find the marginal MLE of $\eta$.

4. In the Gaussian/Gaussian model (3.9), if $\sigma^2 = 1$ and $\mu = 0$, the $Y_i$ are marginally independent with distribution $N(0, 1/B)$.

(a) Find the marginal MLE of $B$, $\hat{B}$, and the resulting PEB point estimates $\hat{\theta}_i$.

(b) Show that while the estimator $\hat{B}$ in (3.25) is not equal to the marginal MLE from part (a), it is unbiased for $B$ with respect to the marginal distribution of the $Y_i$. Show further that using this $\hat{B}$ in the usual PEB point estimation fashion produces the James-Stein estimator (3.23).

5. Prove result (3.24) using the completeness of the non-central chi-square distribution.

6. Show how to evaluate (3.24) for a general $\theta$. (Hint: A non-central chi-square can be represented as a Poisson mixture of central chi-squares with mixing on the degrees of freedom.)

7. Prove result (3.26) on the maximum coordinate-specific loss for the James-Stein estimate.

8. Consider again Fisher's sleep data:

Suppose these k = 10 observations arose from the Gaussian/Gaussian PEB model,

$$Y_i \mid \theta_i \sim N(\theta_i, \sigma^2), \quad \theta_i \sim N(\mu, \tau^2), \quad i = 1, \ldots, 10.$$

Assuming $\sigma^2 = 1$ for these data, compute

(a)

(b)

(c) A naive 95% EBCI for

(d) A Morris 95% EBCI for

Compare the two point and interval estimates you obtain.

9. Consider the PEB model

$$Y_i \mid \theta_i \sim \mathrm{Poisson}(\theta_i t_i), \quad \theta_i \sim G(a, b), \quad i = 1, \ldots, k.$$

(a) Find the marginal distribution of $Y_i$.

(b) Use the method of moments to obtain closed form expressions for the hyperparameter estimates a and b. (Hint: Define the rates $r_i = Y_i/t_i$ and equate their first two moments to the corresponding moments in the marginal family.)

(c) Let k = 5, a = b = 3, and $t_i = 1$ for all i, and perform a simulation study to determine the actual unconditional EB coverage of the 90% equal-tail naive EBCI for $\theta_i$.

10. Consider the data in Table 3.5. These are the numbers of pump failures, $Y_i$, observed in $t_i$ thousands of hours for k = 10 different systems of a certain nuclear power plant. The observations are listed in increasing order of raw failure rate $r_i = Y_i/t_i$, the classical point estimate of the true failure rate $\theta_i$ for the $i$th system.

(a) Using the statistical model and results of the previous question, compute PEB point estimates and 90% equal-tail naive interval estimates for the true failure rates of systems 1, 5, 6, and 10.

(b) Using the approach of Laird and Louis (1987) and their Type III parametric bootstrap, obtain corrected 90% EBCIs for and . Why does the interval require more drastic correction?

Table 3.5 Pump Failure Data (Gaver and O'Muircheartaigh, 1987)

11. In the Gaussian/Gaussian EM example,

(a) Complete the derivation of the method for finding MMLEs of the prior mean and variance.

(b) Use the Meilijson approach to find the observed information.

(c) For this model, a direct (but still iterative) approach to finding the MMLE is straightforward. Find the gradient for the marginal distribution and propose an iterative algorithm for finding the MMLEs.

(d) Find the observed information based on the marginal distribution.

(e) Compare the EM and direct approaches.

12. Do a simulation comparison of the MLE and three EB approaches: Robbins, the Poisson/gamma, and the Gaussian/Gaussian model for estimating the rate parameter in a Poisson distribution. Note that the Gaussian/Gaussian model is incorrect. Investigate two values of k (10, 20) and several true prior distributions. Study gamma distributions with large and small means and large and small coefficients of variation. Study two distributions that are mixtures of these gammas. Compare approaches relative to SSEL and the maximum coordinate-specific loss and discuss results. You need to design the simulations.

Extra credit: Include in your evaluation a Maritz (forced monotonicity) improvement of the Robbins rule, and the rule based on the NPML (see Section 7.1 and Subsection 3.4.3 above).

CHAPTER 4