STK3100 and STK4100 Prediction of random effects Marginal models and generalized linear mixed models

Academic year: 2022

Department of Mathematics, University of Oslo

Covers the following material in chapter 9:

• Sections 9.1.2, 9.1.3, 9.3.2, 9.4.1, 9.4.2, 9.5.1, 9.5.2, 9.6.3, 9.6.4, and 9.7

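Conditional expectation for multivariate normal distribution (recap from chapter 3)

Let $\mathbf{Y} = (Y_1, Y_2, \ldots, Y_n)^T$ be multivariate normally distributed with mean vector $\boldsymbol{\mu}$ and positive definite covariance matrix $\mathbf{V}$, i.e. $\mathbf{Y} \sim N(\boldsymbol{\mu}, \mathbf{V})$

Suppose $\mathbf{Y}$ is partitioned as

$$\mathbf{Y} = \begin{pmatrix} \mathbf{Y}_1 \\ \mathbf{Y}_2 \end{pmatrix} \quad \text{with} \quad \boldsymbol{\mu} = \begin{pmatrix} \boldsymbol{\mu}_1 \\ \boldsymbol{\mu}_2 \end{pmatrix} \quad \text{and} \quad \mathbf{V} = \begin{pmatrix} \mathbf{V}_{11} & \mathbf{V}_{12} \\ \mathbf{V}_{21} & \mathbf{V}_{22} \end{pmatrix}$$

Then

$$\mathbf{Y}_2 \mid \mathbf{Y}_1 = \mathbf{y}_1 \sim N\left[\boldsymbol{\mu}_2 + \mathbf{V}_{21}\mathbf{V}_{11}^{-1}(\mathbf{y}_1 - \boldsymbol{\mu}_1),\; \mathbf{V}_{22} - \mathbf{V}_{21}\mathbf{V}_{11}^{-1}\mathbf{V}_{12}\right]$$

In particular

$$E(\mathbf{Y}_2 \mid \mathbf{Y}_1 = \mathbf{y}_1) = \boldsymbol{\mu}_2 + \mathbf{V}_{21}\mathbf{V}_{11}^{-1}(\mathbf{y}_1 - \boldsymbol{\mu}_1)$$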
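The conditional mean and covariance can be computed directly from the partitioned quantities. The following is a minimal NumPy sketch; the function name `conditional_normal` and the numbers in the example are illustrative assumptions, not part of the course material.

```python
import numpy as np

def conditional_normal(mu, V, idx1, y1):
    """Mean and covariance of Y2 | Y1 = y1 when Y ~ N(mu, V).

    idx1 lists the positions of the conditioning block Y1;
    all remaining positions form Y2.
    """
    mu = np.asarray(mu, dtype=float)
    V = np.asarray(V, dtype=float)
    idx1 = np.asarray(idx1)
    idx2 = np.setdiff1d(np.arange(mu.size), idx1)
    V11 = V[np.ix_(idx1, idx1)]
    V21 = V[np.ix_(idx2, idx1)]
    V22 = V[np.ix_(idx2, idx2)]
    # A = V21 V11^{-1}, obtained with a linear solve (V11 is symmetric)
    A = np.linalg.solve(V11, V21.T).T
    cond_mean = mu[idx2] + A @ (np.asarray(y1, dtype=float) - mu[idx1])
    cond_cov = V22 - A @ V21.T
    return cond_mean, cond_cov

# Hypothetical bivariate example
mu = np.array([0.0, 1.0])
V = np.array([[2.0, 1.0],
              [1.0, 3.0]])
cond_mean, cond_cov = conditional_normal(mu, V, idx1=[0], y1=[2.0])
```

In this example the conditional mean is 1 + (1/2)(2 - 0) = 2 and the conditional variance is 3 - (1/2)(1) = 2.5.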

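Prediction of random effects

We will first see how we may predict the values of the random effects, assuming that $\boldsymbol{\beta}$ and $\mathbf{V} = \mathbf{Z}\boldsymbol{\Sigma}_u\mathbf{Z}^T + \mathbf{R}_\varepsilon$ are known. We have

$$\mathbf{Y} = \mathbf{X}\boldsymbol{\beta} + \mathbf{Z}\mathbf{u} + \boldsymbol{\varepsilon}$$

where $\mathbf{u} \sim N(\mathbf{0}, \boldsymbol{\Sigma}_u)$ and $\boldsymbol{\varepsilon} \sim N(\mathbf{0}, \mathbf{R}_\varepsilon)$

Note that

$$\text{cov}(\mathbf{Y}, \mathbf{u}) = \text{cov}(\mathbf{Z}\mathbf{u} + \boldsymbol{\varepsilon}, \mathbf{u}) = \mathbf{Z}\,\text{var}(\mathbf{u}) = \mathbf{Z}\boldsymbol{\Sigma}_u$$

It follows that

$$\begin{pmatrix} \mathbf{Y} \\ \mathbf{u} \end{pmatrix} \sim N\left[\begin{pmatrix} \mathbf{X}\boldsymbol{\beta} \\ \mathbf{0} \end{pmatrix}, \begin{pmatrix} \mathbf{Z}\boldsymbol{\Sigma}_u\mathbf{Z}^T + \mathbf{R}_\varepsilon & \mathbf{Z}\boldsymbol{\Sigma}_u \\ \boldsymbol{\Sigma}_u\mathbf{Z}^T & \boldsymbol{\Sigma}_u \end{pmatrix}\right]$$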

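By the result on conditional expectation for multivariate normal distributions, we have

$$E(\mathbf{u} \mid \mathbf{Y} = \mathbf{y}) = \boldsymbol{\Sigma}_u\mathbf{Z}^T(\mathbf{Z}\boldsymbol{\Sigma}_u\mathbf{Z}^T + \mathbf{R}_\varepsilon)^{-1}(\mathbf{y} - \mathbf{X}\boldsymbol{\beta})$$

We may now insert estimates for the unknown parameters and obtain the predictions

$$\hat{\mathbf{u}} = \hat{\boldsymbol{\Sigma}}_u\mathbf{Z}^T(\mathbf{Z}\hat{\boldsymbol{\Sigma}}_u\mathbf{Z}^T + \hat{\mathbf{R}}_\varepsilon)^{-1}(\mathbf{y} - \mathbf{X}\hat{\boldsymbol{\beta}})$$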

The course web-page gives examples of R code (fecal fat and fev data)
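As a complement, here is a minimal NumPy sketch of the prediction formula for the random effects. It assumes the parameter values are already available (in practice they would be estimates); the function name `predict_random_effects` and the toy data are hypothetical and unrelated to the fecal fat or fev data.

```python
import numpy as np

def predict_random_effects(y, X, Z, beta, Sigma_u, R_eps):
    """Return Sigma_u Z^T (Z Sigma_u Z^T + R_eps)^{-1} (y - X beta)."""
    V = Z @ Sigma_u @ Z.T + R_eps
    # Solve V w = (y - X beta) instead of inverting V explicitly
    return Sigma_u @ Z.T @ np.linalg.solve(V, y - X @ beta)

# Toy random intercept model: 2 clusters with 2 observations each
y = np.array([1.0, 2.0, -3.0, 0.0])
X = np.ones((4, 1))                  # intercept only
beta = np.array([0.0])               # assumed known here
Z = np.array([[1.0, 0.0],
              [1.0, 0.0],
              [0.0, 1.0],
              [0.0, 1.0]])           # cluster membership
Sigma_u = 1.0 * np.eye(2)            # var(u_i) = 1
R_eps = 1.0 * np.eye(4)              # var(eps_ij) = 1

u_hat = predict_random_effects(y, X, Z, beta, Sigma_u, R_eps)
```

With these values each prediction equals the cluster mean residual shrunk by the factor 2/3, so `u_hat` is (1, -1) while the raw cluster means are (1.5, -1.5).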

Marginal models and GLMMs

We assume that

$$\mathbf{Y}_i = (Y_{i1}, Y_{i2}, \ldots, Y_{id})^T; \quad i = 1, \ldots, n$$

are independent vectors that correspond to observations from each of n clusters

A marginal model with link function g has the form

$$g[E(Y_{ij})] = g(\mu_{ij}) = \mathbf{x}_{ij}\boldsymbol{\beta}$$

for $i = 1, \ldots, n$; $j = 1, \ldots, d$

A generalized linear mixed model (GLMM) has the form

$$g[E(Y_{ij} \mid \mathbf{u}_i)] = \mathbf{x}_{ij}\boldsymbol{\beta} + \mathbf{z}_{ij}\mathbf{u}_i$$

where the parameters $\boldsymbol{\beta}$ are the fixed effects and $\mathbf{u}_i \sim N(\mathbf{0}, \boldsymbol{\Sigma}_u)$ give the random effects

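For a GLMM we have

$$E(Y_{ij} \mid \mathbf{u}_i) = g^{-1}(\mathbf{x}_{ij}\boldsymbol{\beta} + \mathbf{z}_{ij}\mathbf{u}_i)$$

This gives

$$\mu_{ij} = E(Y_{ij}) = E[E(Y_{ij} \mid \mathbf{u}_i)] = \int g^{-1}(\mathbf{x}_{ij}\boldsymbol{\beta} + \mathbf{z}_{ij}\mathbf{u}_i)\, f(\mathbf{u}_i; \boldsymbol{\Sigma}_u)\, d\mathbf{u}_i$$

where $f(\mathbf{u}_i; \boldsymbol{\Sigma}_u)$ is the $N(\mathbf{0}, \boldsymbol{\Sigma}_u)$ density function

For the identity link we have

$$\mu_{ij} = \int (\mathbf{x}_{ij}\boldsymbol{\beta} + \mathbf{z}_{ij}\mathbf{u}_i)\, f(\mathbf{u}_i; \boldsymbol{\Sigma}_u)\, d\mathbf{u}_i = \mathbf{x}_{ij}\boldsymbol{\beta}\int f(\mathbf{u}_i; \boldsymbol{\Sigma}_u)\, d\mathbf{u}_i + \mathbf{z}_{ij}\int \mathbf{u}_i\, f(\mathbf{u}_i; \boldsymbol{\Sigma}_u)\, d\mathbf{u}_i = \mathbf{x}_{ij}\boldsymbol{\beta}$$

Thus we have an identity link (with the same effects) also for the marginal model, but a similar result does not necessarily hold for other link functions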

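Models for a binary response

Consider first the mixed probit model for binary data

$$P(Y_{ij} = 1 \mid u_i) = \Phi(\mathbf{x}_{ij}\boldsymbol{\beta} + u_i)$$

where $u_i \sim N(0, \sigma_u^2)$ and $\Phi(z)$ is the cdf of $Z \sim N(0, 1)$, with $Z$ independent of $u_i$

Then

$$P(Y_{ij} = 1 \mid u_i) = P(Z \le \mathbf{x}_{ij}\boldsymbol{\beta} + u_i) = P(Z - u_i \le \mathbf{x}_{ij}\boldsymbol{\beta})$$

and

$$P(Y_{ij} = 1) = \int P(Y_{ij} = 1 \mid u_i)\, f(u_i; \sigma_u^2)\, du_i = \int P(Z - u_i \le \mathbf{x}_{ij}\boldsymbol{\beta})\, f(u_i; \sigma_u^2)\, du_i$$

We have $Z - u_i \sim N(0, 1 + \sigma_u^2)$, and hence $(Z - u_i)/\sqrt{1 + \sigma_u^2} \sim N(0, 1)$, and so

$$P(Y_{ij} = 1) = \Phi\left(\mathbf{x}_{ij}\boldsymbol{\beta}/\sqrt{1 + \sigma_u^2}\right)$$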
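The closed form for the marginal probit probability can be checked numerically by integrating the conditional probability over the random effect. The sketch below uses Gauss-Hermite quadrature from NumPy; the function names and the values of the linear predictor and of sigma_u are hypothetical choices for illustration.

```python
import math
import numpy as np
from numpy.polynomial.hermite_e import hermegauss

def std_normal_cdf(z):
    """Standard normal cdf Phi(z)."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def marginal_probit_numeric(eta, sigma_u, n_nodes=60):
    """Integrate Phi(eta + u) against the N(0, sigma_u^2) density.

    hermegauss gives nodes x_k and weights w_k for the weight function
    exp(-x^2 / 2), so E[h(X)] for X ~ N(0, 1) is sum(w_k h(x_k)) / sqrt(2 pi).
    """
    nodes, weights = hermegauss(n_nodes)
    values = np.array([std_normal_cdf(eta + sigma_u * x) for x in nodes])
    return float(weights @ values / math.sqrt(2.0 * math.pi))

def marginal_probit_closed_form(eta, sigma_u):
    """Phi(eta / sqrt(1 + sigma_u^2)), the implied marginal probability."""
    return std_normal_cdf(eta / math.sqrt(1.0 + sigma_u**2))

eta, sigma_u = 0.8, 1.5   # hypothetical values of x_ij beta and sigma_u
numeric = marginal_probit_numeric(eta, sigma_u)
closed = marginal_probit_closed_form(eta, sigma_u)
```

The two numbers should agree up to quadrature error, and both are closer to 1/2 than the conditional probability at u_i = 0.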

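Thus the implied marginal model is also a probit model, but with $\mathbf{x}_{ij}\boldsymbol{\beta}$ replaced by $\mathbf{x}_{ij}\boldsymbol{\beta}/\sqrt{1 + \sigma_u^2}$

For the logistic mixed model

$$P(Y_{ij} = 1 \mid u_i) = \frac{\exp(\mathbf{x}_{ij}\boldsymbol{\beta} + u_i)}{1 + \exp(\mathbf{x}_{ij}\boldsymbol{\beta} + u_i)}$$

the implied marginal model is not exactly of logistic form

But it is approximately a logistic model with $\mathbf{x}_{ij}\boldsymbol{\beta}$ replaced by $\mathbf{x}_{ij}\boldsymbol{\beta}/\sqrt{1 + (\sigma_u/c)^2}$ where $c \approx 1.7$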
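To get a feeling for the quality of the logistic approximation, one may compare it with a numerical evaluation of the marginal probability. This is a minimal sketch with Gauss-Hermite quadrature; the function names and the values of the linear predictor and of sigma_u are hypothetical.

```python
import math
import numpy as np
from numpy.polynomial.hermite_e import hermegauss

def expit(x):
    """Inverse logit, exp(x) / (1 + exp(x))."""
    return 1.0 / (1.0 + np.exp(-x))

def marginal_logistic_numeric(eta, sigma_u, n_nodes=60):
    """E[expit(eta + u)] for u ~ N(0, sigma_u^2), by Gauss-Hermite quadrature."""
    nodes, weights = hermegauss(n_nodes)   # weight function exp(-x^2 / 2)
    return float(weights @ expit(eta + sigma_u * nodes) / math.sqrt(2.0 * math.pi))

def marginal_logistic_approx(eta, sigma_u, c=1.7):
    """Approximate marginal probability with the attenuated linear predictor."""
    return float(expit(eta / math.sqrt(1.0 + (sigma_u / c) ** 2)))

eta, sigma_u = 1.0, 1.0   # hypothetical values of x_ij beta and sigma_u
p_numeric = marginal_logistic_numeric(eta, sigma_u)
p_approx = marginal_logistic_approx(eta, sigma_u)
p_conditional = float(expit(eta))   # probability for a person with u_i = 0
```

For a positive linear predictor, both marginal values are smaller than the conditional probability at u_i = 0, which illustrates the attenuation of effects in the marginal model.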

Example: Survey on attitude to abortion

Survey in the US on supporting legalized abortion under d = 3 different situations

$Y_{ij} = 1$ if person i supports legalized abortion under situation j, $Y_{ij} = 0$ otherwise

Summary of data:

When modelling the data, we need to take into account that the three responses for a person are dependent

A possible model is a logistic regression model with a random effect for persons:

$$\text{logit}[P(Y_{ij} = 1 \mid u_i)] = \mathbf{x}_{ij}\boldsymbol{\beta} + u_i$$

Here $u_i \sim N(0, \sigma_u^2)$, $\boldsymbol{\beta} = (\beta_0, \beta_1, \beta_2, \beta_3)^T$, and

$$\mathbf{x}_{ij} = \begin{cases} (1, 1, 0, s_i) & \text{for } j = 1 \\ (1, 0, 1, s_i) & \text{for } j = 2 \\ (1, 0, 0, s_i) & \text{for } j = 3 \end{cases}$$

where $s_i = 1$ if person i is a female and $s_i = 0$ if person i is a male

R code is given on the course web-page
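To make the structure of the model concrete, the sketch below builds the three covariate vectors for one person and simulates the three responses from the random intercept logistic model. The parameter values are hypothetical and are not estimates from the survey; the function names are also illustrative.

```python
import numpy as np

def design_rows(s_i):
    """Rows x_i1, x_i2, x_i3 for a person with s_i = 1 (female) or 0 (male)."""
    return np.array([[1, 1, 0, s_i],
                     [1, 0, 1, s_i],
                     [1, 0, 0, s_i]], dtype=float)

def simulate_person(beta, sigma_u, s_i, rng):
    """Draw (Y_i1, Y_i2, Y_i3) from logit P(Y_ij = 1 | u_i) = x_ij beta + u_i."""
    u_i = rng.normal(0.0, sigma_u)          # one random effect shared by all j
    eta = design_rows(s_i) @ beta + u_i
    p = 1.0 / (1.0 + np.exp(-eta))
    return rng.binomial(1, p)

rng = np.random.default_rng(0)
beta = np.array([-0.5, 0.8, 0.3, 0.1])      # hypothetical (beta_0, ..., beta_3)
y_i = simulate_person(beta, sigma_u=2.0, s_i=1, rng=rng)
```

Because the same u_i enters all three linear predictors, the simulated responses for a person are dependent, which is the feature the model is meant to capture.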

Poisson models for correlated count data

A mixed Poisson GLMM with log link is given by

$$\log[E(Y_{ij} \mid \mathbf{u}_i)] = \mathbf{x}_{ij}\boldsymbol{\beta} + \mathbf{z}_{ij}\mathbf{u}_i$$

where $\mathbf{u}_i \sim N(\mathbf{0}, \boldsymbol{\Sigma}_u)$, and given $\mathbf{u}_i$ we have that $Y_{i1}, Y_{i2}, \ldots, Y_{id}$ are independent and Poisson distributed

For the random intercept model

$$\log[E(Y_{ij} \mid u_i)] = \mathbf{x}_{ij}\boldsymbol{\beta} + u_i, \quad u_i \sim N(0, \sigma_u^2)$$

we have (using the moment generating function)

$$\mu_{ij} = E(Y_{ij}) = E[E(Y_{ij} \mid u_i)] = E[\exp(\mathbf{x}_{ij}\boldsymbol{\beta} + u_i)] = \exp(\mathbf{x}_{ij}\boldsymbol{\beta})\, E[\exp(u_i)] = \exp(\mathbf{x}_{ij}\boldsymbol{\beta} + \sigma_u^2/2)$$

Thus the derived marginal model has the same effect of the covariates, but a different intercept
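The marginal mean of the random intercept Poisson model can be checked by simulation. The following is a minimal Monte Carlo sketch; the values of the linear predictor and of sigma_u, the seed and the number of simulations are arbitrary illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(2022)
eta = 0.5          # hypothetical value of x_ij beta
sigma_u = 0.7      # hypothetical standard deviation of the random intercept
n_sim = 200_000

# Draw the random intercepts, then Poisson counts given the intercepts
u = rng.normal(0.0, sigma_u, size=n_sim)
y = rng.poisson(np.exp(eta + u))

simulated_mean = y.mean()
marginal_mean = np.exp(eta + sigma_u**2 / 2.0)   # exp(x_ij beta + sigma_u^2 / 2)
conditional_mean_at_zero = np.exp(eta)           # mean for a cluster with u_i = 0
```

The simulated mean should be close to `marginal_mean` and clearly larger than `conditional_mean_at_zero`; the difference on the log scale is the intercept shift sigma_u^2 / 2.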

Maximum likelihood estimation for GLMMs

The likelihood function for a GLMM is given as

$$\ell(\boldsymbol{\beta}, \boldsymbol{\Sigma}_u; \mathbf{y}) = \prod_{i=1}^{n}\left\{\int\left[\prod_{j=1}^{d} f(y_{ij} \mid \mathbf{u}_i; \boldsymbol{\beta})\right] f(\mathbf{u}_i; \boldsymbol{\Sigma}_u)\, d\mathbf{u}_i\right\}$$

This is a complicated expression, and the integral (typically) has no closed form solution

The integral has to be evaluated by numerical techniques. Common approaches are Laplace approximations and Gauss-Hermite quadrature approximations
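As an illustration of the quadrature approach, the sketch below approximates the likelihood contribution of one cluster in a random intercept logistic GLMM by Gauss-Hermite quadrature. The restriction to a scalar random intercept and a binary response, the function names, and the parameter values are assumptions made for this example; it is not meant to reproduce what any particular software does.

```python
import numpy as np
from numpy.polynomial.hermite_e import hermegauss

def cluster_likelihood(y_i, X_i, beta, sigma_u, n_nodes=30):
    """Approximate int [prod_j f(y_ij | u_i; beta)] f(u_i; sigma_u^2) du_i."""
    nodes, weights = hermegauss(n_nodes)       # weight function exp(-x^2 / 2)
    u = sigma_u * nodes                        # quadrature points for u_i
    eta = (X_i @ beta)[:, None] + u[None, :]   # d x n_nodes linear predictors
    p = 1.0 / (1.0 + np.exp(-eta))
    # Conditional Bernoulli likelihood of the whole cluster at each node
    cond = np.prod(np.where(y_i[:, None] == 1, p, 1.0 - p), axis=0)
    return float(weights @ cond / np.sqrt(2.0 * np.pi))

def log_likelihood(y, X, beta, sigma_u, n_nodes=30):
    """Sum of log cluster contributions; y is (n, d), X is (n, d, p)."""
    return sum(np.log(cluster_likelihood(y_i, X_i, beta, sigma_u, n_nodes))
               for y_i, X_i in zip(y, X))

# Hypothetical cluster with d = 2 observations and an intercept plus one covariate
X_i = np.array([[1.0, 0.2],
                [1.0, -1.0]])
beta = np.array([0.3, 0.5])
outcomes = [np.array([a, b]) for a in (0, 1) for b in (0, 1)]
total = sum(cluster_likelihood(y_i, X_i, beta, sigma_u=1.2) for y_i in outcomes)
```

Summing the approximated contributions over all four possible response patterns gives 1, which is a simple sanity check of the implementation. In a full analysis the log-likelihood would be maximized numerically over beta and sigma_u.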

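GEE estimation for marginal models

For cluster i with observations $\mathbf{y}_i = (y_{i1}, y_{i2}, \ldots, y_{id})^T$ and means $\boldsymbol{\mu}_i = (\mu_{i1}, \mu_{i2}, \ldots, \mu_{id})^T$, the marginal model with link function g is

$$g(\mu_{ij}) = \mathbf{x}_{ij}\boldsymbol{\beta}$$

Let $\mathbf{V}_i$ denote the working covariance matrix and let $\mathbf{D}_i = \partial\boldsymbol{\mu}_i/\partial\boldsymbol{\beta}$ be the $d \times p$ matrix with $(j, k)$ element equal to $\partial\mu_{ij}/\partial\beta_k$

Remember the quasi-likelihood equations for univariate responses

$$\sum_{i=1}^{n} (\partial\mu_i/\partial\eta_i)\, v(\mu_i)^{-1}\, \mathbf{x}_i^T (y_i - \mu_i) = \sum_{i=1}^{n} (\partial\mu_i/\partial\boldsymbol{\beta})^T v(\mu_i)^{-1} (y_i - \mu_i) = \mathbf{0}$$

Analogously, for the multivariate setting we have the generalized estimating equations (GEE)

$$\sum_{i=1}^{n} \mathbf{D}_i^T \mathbf{V}_i^{-1} (\mathbf{y}_i - \boldsymbol{\mu}_i) = \mathbf{0}$$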

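The GEE estimator $\hat{\boldsymbol{\beta}}$ is approximately multivariate normal with mean $\boldsymbol{\beta}$, and with covariance matrix

$$\text{var}(\hat{\boldsymbol{\beta}}) = \left(\sum_{i=1}^{n} \mathbf{D}_i^T\mathbf{V}_i^{-1}\mathbf{D}_i\right)^{-1} \left[\sum_{i=1}^{n} \mathbf{D}_i^T\mathbf{V}_i^{-1}\,\text{var}(\mathbf{Y}_i)\,\mathbf{V}_i^{-1}\mathbf{D}_i\right] \left(\sum_{i=1}^{n} \mathbf{D}_i^T\mathbf{V}_i^{-1}\mathbf{D}_i\right)^{-1}$$

Analogous to the univariate case, we obtain a sandwich estimator for the covariance matrix by inserting estimates for the unknown parameters and by replacing $\text{var}(\mathbf{Y}_i)$ by

$$\widehat{\text{var}}(\mathbf{Y}_i) = (\mathbf{y}_i - \hat{\boldsymbol{\mu}}_i)(\mathbf{y}_i - \hat{\boldsymbol{\mu}}_i)^T$$

R code for the attitude to abortion example is given on the course web-page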
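For readers who want to see the estimating equations and the sandwich estimator spelled out, here is a minimal NumPy sketch for a binary response with logit link. It assumes an independence working covariance, V_i = diag(mu_ij (1 - mu_ij)), which is only one possible choice, and it uses simulated data with hypothetical parameter values rather than the abortion survey. The function names are illustrative.

```python
import numpy as np

def gee_pieces(beta, y, X):
    """Sums entering the GEE and the sandwich formula; y is (n, d), X is (n, d, p)."""
    n, d, p = X.shape
    bread = np.zeros((p, p))    # sum of D_i^T V_i^{-1} D_i
    score = np.zeros(p)         # sum of D_i^T V_i^{-1} (y_i - mu_i)
    meat = np.zeros((p, p))     # sum of D_i^T V_i^{-1} r_i r_i^T V_i^{-1} D_i
    for i in range(n):
        mu = 1.0 / (1.0 + np.exp(-(X[i] @ beta)))
        v = mu * (1.0 - mu)
        D = v[:, None] * X[i]           # (j, k) element is d mu_ij / d beta_k
        A = D.T @ np.diag(1.0 / v)      # D_i^T V_i^{-1}, independence working covariance
        r = y[i] - mu
        bread += A @ D
        score += A @ r
        meat += A @ np.outer(r, r) @ A.T
    return bread, score, meat

def fit_gee(y, X, n_iter=25):
    """Solve the GEE by Fisher scoring and return beta_hat and the sandwich covariance."""
    beta = np.zeros(X.shape[2])
    for _ in range(n_iter):
        bread, score, _ = gee_pieces(beta, y, X)
        beta = beta + np.linalg.solve(bread, score)
    bread, _, meat = gee_pieces(beta, y, X)
    bread_inv = np.linalg.inv(bread)
    return beta, bread_inv @ meat @ bread_inv

# Simulated clustered binary data (hypothetical parameter values)
rng = np.random.default_rng(1)
n, d = 300, 3
X = np.ones((n, d, 2))
X[:, :, 1] = rng.normal(size=(n, d))
u = rng.normal(0.0, 1.0, size=n)            # induces within-cluster dependence
eta = X @ np.array([-0.3, 0.7]) + u[:, None]
y = rng.binomial(1, 1.0 / (1.0 + np.exp(-eta)))

beta_hat, cov_sandwich = fit_gee(y, X)
robust_se = np.sqrt(np.diag(cov_sandwich))
```

With the independence working covariance the point estimates coincide with ordinary logistic regression, while the sandwich standard errors account for the dependence within clusters.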
