• No results found

Checking model assumptions with regression diagnostics

N/A
N/A
Protected

Academic year: 2019

Share "Checking model assumptions with regression diagnostics"

Copied!
27
0
0

Loading.... (view fulltext now)

Full text

(1)

Checking model

assumptions

with regression

diagnostics

Graeme L. Hickey

University of Liverpool

@graemeleehickey www.glhickey.com

(2)

Conflicts of interest

• None

(3)
(4)

Question: who routinely checks model assumptions

(5)

Outline

• Illustrate with multiple linear regression

• Plethora of residuals and diagnostics for other model types

• Focus is not to “what to do if you detect a problem”, but “how to

(6)

My personal experience*

• Reviewer of EJCTS and ICVTS for 5-years

• Authors almost never report if they assessed model assumptions

• Example: only one paper submitted where authors considered

sphericity in RM-ANOVA at first submission

• Usually one or more comment is sent to authors regarding model assumptions

(7)

Linear regression modelling

• Collect some data

• 𝑦": the observed continuous outcome for subject 𝑖 (e.g. biomarker)

• 𝑥%", 𝑥'", … , 𝑥)": p covariates (e.g. age, male, …)

• Want to fit the model

• 𝑦" = 𝛽, + 𝛽%𝑥%" + 𝛽'𝑥"' + ⋯ + 𝛽)𝑥)" + 𝜀"

• Estimate the regression coefficients

• 𝛽0,, 𝛽0%, 𝛽0', … , 𝛽0)

• Report the coefficients and make inference, e.g. report 95% CIs

(8)

Residuals

• For a linear regression model, the residual for the 𝑖-th observation is

𝑟" = 𝑦" − 𝑦3"

• where 𝑦3" is the predicted value given by

𝑦3" = 𝛽0, + 𝛽0%𝑥%" + 𝛽0'𝑥"' + ⋯ + 𝛽0)𝑥)"

(9)

Linearity of functional form

• Assumption: scatterplot of (𝑥", 𝑟") should not show any systematic

trends

(10)

● ● ● ● ● ● ● ● ● ● ● ● ●● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●● ● ● ● ● ● ● ● ● 0 20 40 60 80

0 5 10 15 20

X Y A ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● −10 −5 0 5 10

0 5 10 15 20

X Residual B ● ● ● ● ● ● ● ● ● ● ● ●● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● 0 20 40 60 80

0 5 10 15 20

X Y C ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● −4 0 4 8

0 5 10 15 20

X

Residual

D

Fitted model:

𝑌 = 𝛽, + 𝛽%𝑋 + 𝜀

(11)

Homogeneity

• We often assume assume that 𝜀" ∼ 𝑁 0, 𝜎'

• The assumption here is that the variance is constant, i.e.

homogeneous

• Estimates and predictions are robust to violation, but not inferences (e.g. F-tests, confidence intervals)

• We should not see any pattern in a scatterplot of 𝑦3", 𝑟"

(12)

Homoscedastic residuals Heteroscedastic residuals ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● −10 −5 0 5

0 5 10 15 20 25

Fitted value Residual A ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● −10 −5 0 5

0 5 10 15 20 25

Fitted value

Residual

(13)

Normality

• If we want to make inferences, we generally assume 𝜀" ∼ 𝑁 0, 𝜎'

• Not always a critical assumption, e.g.:

• Want to estimate the ‘best fit’ line

• Want to make predictions

The sample size is quite large and the other assumptions are met

• We can assess graphically using a Q-Q plot, histogram

(14)

● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●● ● ● ● ● ● ● ● ● ● ●● ● ●● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●●●● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●● ● ● ● ● ●

−2 −1 0 1 2

− 6 − 2 2 4 6 Normal residuals Theoretical Quantiles

Sample Quantiles ● ●

● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●●●● ●● ● ● ● ● ●● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●

−2 −1 0 1 2

0 5 10 15 Skewed residuals Theoretical Quantiles Sample Quantiles Residuals Frequency

−6 −4 −2 0 2 4 6 8

0 5 10 15 20 25 Residuals Frequency

0 5 10 15

0

5

10

20

(15)

Independence

• We assume the errors are independent

• Usually able to identify this assumption from the study design and analysis plan

E.g. if repeated measures, we should not treat each measurement as

independent

(16)

●● ● ● ● ● ● ●● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●● ● ● ●● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●● ● ● ● ● ● ● ● ● −60 −30 0 30

0 25 50 75 100

X Residual A ● ● ● ● ● ● ● ● ● ● ● ●● ● ● ● ● ● ●● ● ● ● ● ● ● ● ● ● ● ● ● ● ●●● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●● ●● ● ● ● ●● ● ● ● ● ● ● ●● ● ● ● ● ● ● ● ● ●● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● −150 −100 −50 0 50 100

0 25 50 75 100

X

Residual

B

(17)

Multicollinearity

• Correlation among the predictors (independent variables) is known as collinearity (multicollinearity when >2 predictors)

• If aim is inference, can lead to

Inflated standard errors (in some cases very large)

• Nonsensical parameter estimates (e.g. wrong signs or extremely large)

• If aim is prediction, it tends not to be a problem

• Standard diagnostic is the variance inflation factor (VIF)

𝑉𝐼𝐹 𝑋? = 1

(18)

Outliers & influential points

● ●

● ● ●

● ●

r = 0.82

●● ● ●●

r = 0.82

● ●

● ●

● ●

● ●

r = 0.82

● ● ● ● ● ● ●

● ● ● r = 0.82

Dataset 1 Dataset 2

Dataset 3 Dataset 4 4

8 12

4 8 12

5 10 15 5 10 15

Measurement 1

Measurement 2

y = 3.00 + 0.500x y = 3.00 + 0.500x

y = 3.00 + 0.500x y = 3.00 + 0.500x

x

y

(19)

Diagnostics to detect influential points

• DFBETA (or Δβ)

Leave out i-th observation out and refit the modelGet estimates of 𝛽0, C" , 𝛽0% C" , 𝛽0' −𝑖 , … , 𝛽0) C"

Repeat for 𝑖 = 1, 2, … , 𝑛

• Cook’s distance D-statistic

A measure of how influential each data point is

• Automatically computer / visualized in modern software

(20)

Residuals from other models

GLMs (incl. logistic regression)

• Deviance

• Pearson

• Response

• Partial

• Δβ

• …

Cox regression

• Martingale

• Deviance

• Score

• Schoenfeld

• Δβ

• …

(21)

Two scenarios

Statistical methods routinely submitted to EJCTS / ICVTS include:

1. Repeated measures ANOVA

2. Cox proportional hazards regression

(22)

Repeated measures ANOVA

Assumptions: those used for classical ANOVA + sphericity

Sphericity: the variances of the differences of all pairs of the within subject conditions (e.g. time) are equal

• It’s a questionable a priori assumption for longitudinal data

Patient T0 T1 T2 T0 – T1 T0 – T2 T1 – T2

1 30 27 20 3 10 7

2 35 30 28 5 7 2

3 25 30 20 −5 5 10

4 15 15 12 0 3 3

5 9 12 7 −3 2 5

(23)

Mauchly's test

• A popular test (but criticized due to power and robustness)

H0: sphericity satisfied (i.e. 𝜎H'ICHJ = 𝜎H'ICHK = 𝜎H'JCHK)

H1: non-sphericity (at least one variance is different)

• If rejected, it is usual to apply a correction to the degrees of freedom (df) in the RM-ANOVA F-test

• The correction is 𝜖 x df, where 𝜖 = epsilon statistic (either Greenhouse-Geisser or Huynh-Feldt)

(24)

Proportionality assumption

• Cox regression assumes proportional hazards:

• Equivalently, the hazard ratio must be constant over time

• There are many ways to assess this assumption, including two using residual diagnostics:

• Graphical inspection of the (scaled) Schoenfeld residuals

A test* based on the Schoenfeld residuals

(25)

● ● ● ● ● ● ● ● ● ● ● ●● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●● ● ● ●● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●● ● ● ● ● ● ● ●● ● ● ● ●● ● ● ● ●● ● ● ● ● ● ●● ● ● ● ● ● ● ● ● ● ●● ● ● ● ● ● ● ● ● ● −0.3 −0.2 −0.1 0.0 0.1 0.2 0.3

56 150 200 280 350 450 570 730

Time

Beta(t) f

or age

Schoenfeld Individual Test p: 0.5385

● ● ●●●●●●●● ● ●●● ● ● ●● ●● ● ● ● ● ● ● ● ● ●●●●● ● ●●● ● ● ● ● ●●●●● ● ● ● ● ●● ● ● ● ● ● ● ● ● ●●● ● ● ● ●●● ● ● ● ●●● ● ● ●● ● ●●●● ● ● ●●● ● ●● ●● ● ● ● ● ●●● ● ● ● ● ● ● ●● ● ● ● ● ● ● ● ● ●● ● ●● ● ●● ● ● ● ● ●● ●●● ● ● ● ● ● ● ● ● ● ●● ●● ● ● ● −2 −1 0 1 2 3

56 150 200 280 350 450 570 730

Time

Beta(t) f

or se

x

Schoenfeld Individual Test p: 0.1253

● ● ● ●● ●● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●● ●● ● ●● ● ● ● ● ● ● ●● ● ● ● ● ● ● ●● ● ● ● ● ● ● ●● ● ● ● ● ● ● ● ● ●● ● ● ●● ● ● ● ● ● ● ● ● ●● ● ● ●● ● ● ● ● ● ● ● ●● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●● ● ● ● ● ● ● ● ● ● ● ● ● ●● ● ● ● ● ● ● ● ● ● ● ● ● ● −0.2 0.0 0.2

56 150 200 280 350 450 570 730

Time

Beta(t) f

or wt.loss

Schoenfeld Individual Test p: 0.8769

Global Schoenfeld Test p: 0.416

• Simple Cox model fitted to the North Central Cancer Treatment Group lung cancer data set*

• If proportionality is valid, then we should not see any association between the

residuals and time

• Can formally test the correlation for each covariate

Can also formally test the “global”

proportionality

(26)

Conclusions

• Residuals are incredibly powerful for diagnosing issues in regression models

• If a model doesn’t satisfy the required assumptions, don’t expect subsequent inferences to be correct

• Assumptions can usually be assessed using methods other than (or in combination with) residuals

• Always report in manuscript

(27)

Slides available (shortly)

from:

www.glhickey.com

Thanks for listening

Any questions?

References

Related documents

In the Cameras section on the Summary tab, click next to a camera, or click a camera icon in the Home View section on the Summary tab, or click Live at the top of the Pictures

Map of the Midwestern states (plus 250km buffer) showing MAPS station-specific diversity, Shannon’s Diversity Index (SDI) indices, for summer resident breeding species

The FEDERAL REGISTER (ISSN 0097–6326) is published daily, Monday through Friday, except official holidays, by the Office of the Federal Register, National Archives and

Cite this article as: Bastianello et al .: Changes in magnetic resonance imaging disease measures over 3 years in mildly disabled patients with relapsing-remitting multiple

DCP: disease control and prevention; LMIC: lower and middle income countries; HDI: human development index; CBS: Central Bureau of Statistics; VDC: Village Development Committee;

The Gatton MBA team of James Davey, Jordan McMurtrey, Lauren Scanlon, and Luke Williams placed second at the 2015 Southeastern Conference (SEC) MBA Case Competition held at the

It provides ‘at-a-glance’ aggregated views of calculated threats from many security systems, such as: big data cyber analytics (BDCA), front-end perimeter protection, SCADA,

Data were collected on the policy positions of various bodies including: (i) third party insurers (such as private health funds, the Health Insurance Commission, and workers’