
Chapter 5 Robustness to model misspecification

5.3 The impact of model structure on robustness

5.3.3 Other generalised linear mixed models

The same sort of argument applies to other generalised linear mixed models. As the amount of information available on each random effect increases, the impact of the specification of $f_u(\cdot)$ on the maximum likelihood estimator shrinks. The maximum likelihood estimator will be consistent in the limit as both the number of random effects and the amount of information available per random effect simultaneously tend to infinity. In sufficiently dense models, the impact of model misspecification on the maximum likelihood estimator should therefore be small. In sparse models, however, the maximum likelihood estimator may be considerably more sensitive to the random-effects distribution. We demonstrate this sensitivity using a pairwise competition model with sparse structure.

Example 5.2. Consider a binary repeated star tournament, with three players in each star. The structure of this model is shown in Figure 3.10a on page 33. Notice that there are only 2 observations for every 3 random effects, so this model is sparser than the two-level model, where there were $m_i \geq 2$ observations for each random effect $i$. Suppose that each player has a binary covariate, simulated as independent draws from a Bernoulli$(1/2)$ distribution. We assume that

$$\Pr(i \text{ beats } j) = \Phi(\lambda_i - \lambda_j),$$

where

$$\lambda_i = \beta x_i + \sigma u_i,$$

and $u_i \sim N(0, 1)$. As in Example 5.1, suppose that the $u_i$ really are $N(0, 1)$ samples, but we fix $\sigma$ at some constant value $\tilde\sigma \neq \sigma_0$. Table 5.2 gives the limit of the maximum likelihood estimator of $\beta$, when $\beta_0 = 1$ and $\sigma_0 = 0.5$, for various values of $\tilde\sigma$. It also gives the asymptotic variance of the estimator, which grows with $\tilde\sigma$, together with the estimated asymptotic variance. In this model, the estimated variance is a good approximation to the true variance of the estimator.

                    Limit      Variance                     Estimated variance
$\tilde\sigma$      $\beta^*$  $\tilde G^{-1}(\beta^*)$     $\tilde H^{-1}(\beta^*)$

0.1                 0.82       2.14                         2.06
0.2                 0.85       2.26                         2.19
0.4                 0.94       2.75                         2.73
0.5 $= \sigma_0$    1.00       3.12                         3.12
0.6                 1.07       3.58                         3.60
1                   1.40       6.23                         6.32
1.5                 1.88       11.45                        11.53
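The limit of the estimator in Example 5.2 can be explored numerically. The following is a minimal sketch, not the method used to produce the table. It assumes that in each star the centre player meets each of the other two players once, and it takes the limit $\beta^*$ to be the maximiser of the expected log-likelihood of the misspecified model (with $\sigma$ fixed at $\tilde\sigma$) under the true model. The function names `star_outcome_probs` and `limit_beta`, and the choice of a 60-point Gauss-Hermite rule, are illustrative assumptions.

```python
import itertools

import numpy as np
from scipy.optimize import minimize_scalar
from scipy.special import ndtr  # standard normal CDF, Phi

# Gauss-Hermite rule for integrals against the N(0, 1) density.
NODES, WEIGHTS = np.polynomial.hermite_e.hermegauss(60)
WEIGHTS = WEIGHTS / np.sqrt(2.0 * np.pi)


def star_outcome_probs(beta, sigma):
    """P(y1, y2 | x) for the 8 covariate patterns and 4 outcome patterns.

    Assumed design: the centre plays each of the two leaves once, and
    y_k = 1 if the centre beats leaf k.
    """
    probs = []
    for xc, x1, x2 in itertools.product([0, 1], repeat=3):
        diffs = np.array([xc - x1, xc - x2], dtype=float)
        # Given the centre's effect u_c, the two matches are independent, and
        # integrating out the leaf's effect gives
        # P(centre beats leaf k | u_c) = Phi((beta*d_k + sigma*u_c) / sqrt(1 + sigma^2)).
        win = ndtr((beta * diffs[:, None] + sigma * NODES) / np.sqrt(1.0 + sigma**2))
        for y1, y2 in itertools.product([0, 1], repeat=2):
            f1 = win[0] if y1 else 1.0 - win[0]
            f2 = win[1] if y2 else 1.0 - win[1]
            # Integrate over u_c ~ N(0, 1) by quadrature.
            probs.append(np.sum(WEIGHTS * f1 * f2))
    return np.array(probs)


def limit_beta(sigma_tilde, beta0=1.0, sigma0=0.5):
    """Maximiser over beta of the expected misspecified log-likelihood."""
    p_true = star_outcome_probs(beta0, sigma0)

    def neg_expected_loglik(beta):
        # Each covariate pattern has probability 1/8; the constant is dropped.
        return -np.sum(p_true * np.log(star_outcome_probs(beta, sigma_tilde)))

    return minimize_scalar(neg_expected_loglik, bounds=(0.0, 5.0), method="bounded").x


if __name__ == "__main__":
    for s in [0.1, 0.2, 0.4, 0.5, 0.6, 1.0, 1.5]:
        print(f"sigma_tilde = {s:.1f}  limit of beta-hat ~ {limit_beta(s):.2f}")
```

If the assumed design matches the one in the example, the printed values can be compared with the Limit column of Table 5.2. When $\tilde\sigma = \sigma_0$ the maximiser should coincide with $\beta_0$, which is a useful sanity check on the quadrature.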

Example 5.3. We consider exactly the same type of misspecification as in Example 5.2, but now suppose that we have $R$ repetitions of a complete tournament among $n$ players, for various values of $n \geq 10$. These tournaments are much denser than the repeated star tournament of Example 5.2, and we want to investigate how this change in model structure affects the asymptotic bias of the maximum likelihood estimator of $\beta$. We recall from Example 3.5 that inference from the true likelihood is very well approximated by inference from the Laplace approximation to the likelihood in a complete tournament between a large number of players, so we use the Laplace approximation in place of the full likelihood here.
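To make the difference in density concrete, the number of observations per random effect can be counted directly. This is a short worked calculation, assuming that "complete tournament" means each pair of players meets exactly once per repetition.

$$
\begin{aligned}
\text{star with three players:} \quad & \frac{\text{observations}}{\text{random effects}} = \frac{2}{3}, \\
\text{complete tournament among } n \text{ players:} \quad & \frac{\text{observations}}{\text{random effects}} = \frac{n(n-1)/2}{n} = \frac{n-1}{2}.
\end{aligned}
$$

Under this assumption, $n = 10$ gives 45 matches per repetition (4.5 per random effect), and $n = 100$ gives 4950 matches (49.5 per random effect). Each player takes part in $n - 1$ matches, so the information available on each random effect grows with $n$.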

We simulate $N_n$ tournaments for each $n \in \{10, 20, 40, 100\}$, with the ability of each player simulated as

$$\lambda_i = \beta_0 x_i + \sigma_0 u_i,$$

where $x_i \sim \text{Bernoulli}(1/2)$ and $u_i \sim N(0, 1)$. We (wrongly) suppose that

$$\lambda_i = \beta x_i + \tilde\sigma u_i,$$

and try to estimate $\beta$. We take the number of simulations $N_n$ large enough that we can be confident of the limiting value of the maximum likelihood estimator of $\beta$ as $R \to \infty$.
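The data-generating step described above can be sketched as follows. This is only an illustration of the simulation, not of the Laplace-based estimation. It assumes that each pair of players meets once per tournament and that covariates and random effects are drawn afresh for every tournament. The function name `simulate_complete_tournaments` and its arguments are hypothetical.

```python
import itertools

import numpy as np
from scipy.special import ndtr  # standard normal CDF, Phi


def simulate_complete_tournaments(n, n_tournaments, beta0=1.0, sigma0=0.5, seed=0):
    """Simulate independent complete tournaments among n players.

    Returns the covariates x (n_tournaments x n), the list of pairs (i, j),
    and outcomes y (n_tournaments x n_pairs) with y = 1 if i beats j.
    """
    rng = np.random.default_rng(seed)
    # Assumed design: every pair (i, j) with i < j meets exactly once.
    pairs = np.array(list(itertools.combinations(range(n), 2)))

    x = rng.integers(0, 2, size=(n_tournaments, n))  # Bernoulli(1/2) covariates
    u = rng.standard_normal((n_tournaments, n))      # N(0, 1) random effects
    ability = beta0 * x + sigma0 * u                 # lambda_i = beta0*x_i + sigma0*u_i

    # Pr(i beats j) = Phi(lambda_i - lambda_j)
    p_win = ndtr(ability[:, pairs[:, 0]] - ability[:, pairs[:, 1]])
    y = (rng.random(p_win.shape) < p_win).astype(int)
    return x, pairs, y


if __name__ == "__main__":
    x, pairs, y = simulate_complete_tournaments(n=10, n_tournaments=5)
    print(x.shape, pairs.shape, y.shape)
```

Fitting the misspecified model to such data would then mean maximising an approximation to the likelihood in $\beta$ with $\sigma$ held at $\tilde\sigma$; that step is omitted here.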

Suppose that $\beta_0 = 1$ and $\sigma_0 = 0.5$, but $\tilde\sigma = 1$. Under the same misspecification, the limit of $\hat\beta$ in the repeated star tournament of Example 5.2 was 1.40. The limits of the Laplace estimator as $R \to \infty$ for each repeated complete tournament are given in Table 5.3. For $n = 10$, there is a small asymptotic bias in the estimator even in the correct model. This is because the estimator is computed by maximising the Laplace approximation to the likelihood, rather than the true likelihood. However, the magnitude of this bias is negligible compared to that induced by the misspecification of $\sigma$.

As $n$ increases, the limit of the maximum likelihood estimator of $\beta$ becomes closer to $\beta_0$. In this case, the variance of the maximum likelihood estimator under misspecification tends towards the variance under correct model specification as $n$ increases. However, the estimated variance under misspecification is too large in each case, which would make standard hypothesis tests too conservative.

         Correct model                   Misspecified model
n        Limit        Variance           Limit        Variance                    Estimated variance          $N_n$
         $\beta_0$    $I^{-1}(\beta_0)$  $\beta^*$    $\tilde G^{-1}(\beta^*)$    $\tilde H^{-1}(\beta^*)$

10       1.01         0.23               1.24         0.36                        0.59                        10000
20       1.00         0.08               1.13         0.10                        0.23                        5000
40       1.00         0.03               1.07         0.03                        0.11                        100
100      1.00         0.01               1.04         0.01                        0.04                        100

Table 5.3: Inference under misspecification of $\sigma$ in a repeated complete tournament

Note. We have demonstrated that if there are $R$ independent replications of a complete tournament among $n$ players, the asymptotic bias in $\hat\beta$ tends to zero as $n$ grows. In fact, we do not need a growing number of independent replications in order for this result to hold. To see this, note that a complete tournament among $n$ players contains within it $\sqrt{n}$ tournaments, each with $\sqrt{n}$ players. For a single replication, $R = 1$, by an argument similar to those given in Section 3.2, the maximum likelihood estimator of $\beta$ will therefore be consistent as $n \to \infty$, even under misspecification of the random-effects distribution.

Note. In dense models, the Laplace, importance sampling and sequential reduction approximations to the likelihood will all give inference very close to the true likelihood (see the discussion of Section 3.3). This means that the robustness of the maximum likelihood estimator in a dense model is inherited by the estimators maximising any of these approximations.

5.4 Robustness of composite likelihood estimators

In document Inference for generalised linear mixed models with sparse structure (Page 92-94)