Ph.D. course: Regression models. Linear effects. Approaches


Ph.D. course: “Regression models”

Non-linear effect of a quantitative covariate

PKA & LTS Sect. 4.2.1, 4.2.2

6 May 2013

www.biostat.ku.dk/~pka/regrmodels13

Per Kragh Andersen


Linear effects

We have studied models with the linear predictor:

LP_i = a + bx_i for a quantitative covariate x, both for

• quantitative outcomes (b is a mean value difference),

• binary outcomes (b is a log(odds ratio)),

• survival times (“a = log(h_0(t))”, b is a log(hazard ratio)).

The slope b has a simple interpretation: the change in the linear predictor per 1-unit change in x.

Linearity is simple but restrictive, and we need

• ways of checking the assumption of linearity

• alternative models to use when linearity fits poorly


Approaches

Different possibilities are:

• transformation of x, i.e. LP_i = a + b f(x_i); the function f must be known,

• scatterplot smoother, fine for description, not optimal for inference (Figure next slide),

• methods based on choosing cut-points for x (Sect. 4.2.1),

• polynomials (Sect. 4.2.2).

The last three approaches may suggest transformations, f.

We use the effect of bilirubin in the PBC-3 trial as illustration, but the ideas carry over to quantitative and binary outcomes.

Figure 1: Scatterplot smoother for the binary outcome y (fetal death) when plotted against the covariate x (alcohol consumption). The distribution of x is indicated along the horizontal axis. (Horizontal axis: drinks per week; vertical axis: probability of fetal death.)


Using cut-points for the covariate

• Piecewise constant effect

• Linear regression splines

• Quadratic/cubic (restricted) regression splines

Figure 2: Illustration of models for the linear predictor that are alternatives to the simple linear model. The dotted curve represents the true relationship. (Four panels, each showing the linear predictor against x.)

Bilirubin in quintiles

Cox regression model with bilirubin categorized in quintiles

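$$
h_i(t) =
\begin{cases}
h_0(t) & \text{if } x_i \le 10.3, \\
h_0(t)\exp(b_1) & \text{if } x_i \in (10.3, 16], \\
h_0(t)\exp(b_2) & \text{if } x_i \in (16, 26.7], \\
h_0(t)\exp(b_3) & \text{if } x_i \in (26.7, 51.4], \\
h_0(t)\exp(b_4) & \text{if } x_i > 51.4.
\end{cases}
$$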

With dummy variables $I(x_i \le 10.3), \dots, I(x_i > 51.4)$, the linear predictor for individual i is the piecewise constant function

$$LP_i(t) = \log(h_0(t)) + b_1 I(10.3 < x_i \le 16) + \dots + b_4 I(x_i > 51.4).$$

The estimates in this model are: $\hat{b}_1 = -0.537\,(0.708)$, $\hat{b}_2 = 1.120\,(0.494)$, $\hat{b}_3 = 1.698\,(0.460)$, $\hat{b}_4 = 2.670\,(0.437)$.
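The dummy coding can be sketched in a few lines of Python. This is only an illustration: the function and variable names are hypothetical, the cut-points and estimates are the ones quoted above, and the baseline term log(h0(t)) is left out because the Cox model does not estimate it.

```python
import math

# Quintile cut-points for bilirubin, as used in the model above.
CUTS = [10.3, 16.0, 26.7, 51.4]
# Reported estimates b_1..b_4 (reference group: x <= 10.3).
B_HAT = [-0.537, 1.120, 1.698, 2.670]


def group_index(x):
    """Return 0 for x <= 10.3, 1 for (10.3, 16], ..., 4 for x > 51.4."""
    return sum(x > c for c in CUTS)


def dummies(x):
    """Indicator variables for the four non-reference groups."""
    g = group_index(x)
    return [1 if g == j else 0 for j in range(1, 5)]


def lp_minus_baseline(x):
    """Linear predictor minus log(h0(t)): piecewise constant in x."""
    return sum(b * d for b, d in zip(B_HAT, dummies(x)))


# Hazard ratios relative to the lowest quintile group.
hazard_ratios = [math.exp(b) for b in B_HAT]
```

For instance, `lp_minus_baseline(100)` returns the estimate for the highest group, 2.670, and the corresponding hazard ratio `hazard_ratios[3]` is roughly 14.4 relative to the lowest group.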

Figure 3: Estimated linear predictor (solid curve) for the PBC study assuming an effect of serum bilirubin that is piecewise constant in quintile groups. The dashed curve joins values of the linear predictor for the scores attached to each interval of bilirubin. The distribution of bilirubin is shown on the horizontal axis. (Horizontal axis: bilirubin; vertical axis: linear predictor.)


Using interval scores

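$$
s(x_i) =
\begin{cases}
7.66 & \text{if } x_i \le 10.3, \\
13.26 & \text{if } x_i \in (10.3, 16], \\
20.23 & \text{if } x_i \in (16, 26.7], \\
37.32 & \text{if } x_i \in (26.7, 51.4], \\
148.83 & \text{if } x_i > 51.4.
\end{cases}
$$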

The model with linear predictor $\log(h_0(t)) + b\,s(x_i)$ is nested in the model with bilirubin categorized, and the likelihood ratio test for linearity is $19.0 \sim \chi^2_3$, $P = 0.0003$.

However, the model with a linear effect of x is not nested in the categorized model.

Using plots of pseudo-observations, the fit may be evaluated.
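The P-value attached to a likelihood ratio statistic can be checked with a small standard-library sketch. The function name `chi2_sf` is hypothetical. It uses the closed-form series for the chi-square upper tail with integer degrees of freedom; in practice one would more likely call a statistics library such as SciPy (`scipy.stats.chi2.sf`).

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability P(X > x) for a chi-square variable with
    integer degrees of freedom df (closed-form series)."""
    if x <= 0:
        return 1.0
    h = x / 2.0
    if df % 2 == 0:
        # Even df = 2m: exp(-h) * sum_{k=0}^{m-1} h^k / k!
        term, total = 1.0, 1.0
        for k in range(1, df // 2):
            term *= h / k
            total += term
        return math.exp(-h) * total
    # Odd df = 2m + 1:
    # erfc(sqrt(h)) + exp(-h) * sum_{k=1}^{m} h^(k - 1/2) / Gamma(k + 1/2)
    total = 0.0
    for k in range(1, (df - 1) // 2 + 1):
        total += h ** (k - 0.5) / math.gamma(k + 0.5)
    return math.erfc(math.sqrt(h)) + math.exp(-h) * total


# Likelihood ratio statistic 19.0 on 3 degrees of freedom (scores model
# versus the categorized model).
p_scores = chi2_sf(19.0, 3)
```

Rounded to four decimals, `p_scores` agrees with the reported P = 0.0003.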

Figure 4: The estimated linear predictor for the PBC3 study (assuming an effect of serum bilirubin which is piecewise constant in quintile groups) plotted against bilirubin together with smoothed pseudo-observations. The four panels correspond to quintiles of observed event times. (Panels: time = 0.71, 1.18, 2.16, 3.19; horizontal axis: bilirubin.)

Figure 5: The estimated linear predictor for the PBC3 study (assuming an effect of log(serum bilirubin) which is piecewise constant in quintile groups) plotted against log(bilirubin) together with smoothed pseudo-observations. The four panels correspond to quintiles of observed event times. (Panels: time = 0.71, 1.18, 2.16, 3.19; horizontal axis: log(bilirubin).)

Comments

The model with a piecewise constant effect of x:

• is easy to fit and easy to report,

• does not contain the model with a linear effect of x as a sub-model,

• does not provide a smooth (in fact, not even continuous) relationship,

• is sensitive to the choice of cut-points.


Regression splines

A regression “spline” is a function of the form

$$x^+_i = (x_i - r)\,I(x_i > r)$$

for some threshold $r$. Thus, $x^+_i = 0$ for $x_i \le r$, and $x^+_i$ increases linearly with $x_i$ from $x_i = r$ upwards.

If we compose the linear predictor of several such spline terms,

$$LP_i = a + b x_i + b_1 x^+_{i1} + \dots + b_4 x^+_{i4},$$

we get a broken linear function (Figure). The parameter $b_j$ is the change of slope at cut-point $j$: the slope before the first cut-point is $b$, the slope between the first and second is $b + b_1$, etc.

Linearity: $b_1 = b_2 = \dots = b_4 = 0$.
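The spline terms are ordinary covariates computed from x, which the following sketch tries to make concrete. The function names and the cut-points and coefficients in the test values are hypothetical, chosen only for illustration.

```python
def spline_terms(x, knots):
    """Linear spline terms (x - r) * I(x > r), one per cut-point r."""
    return [max(x - r, 0.0) for r in knots]


def linear_spline_lp(x, a, b, b_changes, knots):
    """Linear predictor a + b*x + sum_j b_j * (x - r_j)^+."""
    terms = spline_terms(x, knots)
    return a + b * x + sum(bj * t for bj, t in zip(b_changes, terms))
```

With all slope changes equal to zero the function reduces to the straight line a + b*x, which is the linearity hypothesis. Because each term is zero at its own cut-point, the fitted curve is continuous there and only its slope changes.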


Results for PBC-3

For the PBC3 example we get the estimates:

$\hat{b} = -0.245\,(0.182)$, $\hat{b}_1 = 0.460\,(0.309)$, $\hat{b}_2 = -0.122\,(0.185)$, $\hat{b}_3 = -0.0592\,(0.0654)$, $\hat{b}_4 = -0.0278\,(0.0174)$.

The likelihood ratio test for linearity is 39.13 ∼ χ²(4), P < 0.001.

The linear spline function is now continuous but still not smooth.
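Since each spline coefficient is a change of slope, the slope on each segment follows by cumulative summation of the reported estimates. A minimal sketch (variable names are illustrative, and the uncertainty of the estimates is ignored):

```python
from itertools import accumulate

# Reported estimates: initial slope b and the four slope changes b_1..b_4.
b_hat = -0.245
slope_changes = [0.460, -0.122, -0.0592, -0.0278]

# Slope on each of the five segments, from below the first cut-point
# to above the last one.
segment_slopes = list(accumulate([b_hat] + slope_changes))
```

The resulting slopes are approximately -0.245, 0.215, 0.093, 0.034 and 0.006 per unit of bilirubin, so the estimated curve flattens markedly above the last cut-point.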

Linear predictor with quadratic splines:

$$LP_i = a + b_1 x_i + b_2 x_i^2 + b_{1,1}(x^+_{i1})^2 + \dots + b_{1,4}(x^+_{i4})^2.$$

No simple interpretation of the coefficients, but a smooth curve is obtained. LR test for linearity, $b_2 = b_{1,1} = \dots = b_{1,4} = 0$: $40.97 \sim \chi^2_5$.

Figure 6: Estimated linear predictor for the PBC3 study assuming an effect of serum bilirubin modeled as a linear spline (dashed), an unrestricted quadratic spline (solid), or a quadratic spline restricted to be linear for bilirubin values above 51.4 (dotted). The distribution of bilirubin is shown on the horizontal axis. (Horizontal axis: bilirubin; vertical axis: linear predictor.)

Restricted splines

The quadratic effect $b_2 x_i^2$ may be quite dramatic for large (both positive and negative) values of $x_i$.

This may be avoided using restricted splines, see p. 220. The idea is that for large (positive or negative) values of x, the curve is linear instead of quadratic.

Cubic splines may also be defined, using terms $(x^+_{ij})^3$.


Polynomials

The simplest alternative to a linear function is a quadratic function, and a standard test for linearity consists of including $x^2$ and testing whether the corresponding coefficient is zero, $b_2 = 0$:

$$LP_i = a + b_1 x_i + b_2 x_i^2.$$

The resulting parabola has its extremum at $x = -b_1/(2b_2)$, and it is “happy” (convex, with a minimum) if $b_2 > 0$ and “bad-tempered” (concave, with a maximum) if $b_2 < 0$.

For the PBC-3 data:

$\hat{b}_1 = 0.0227\,(0.0031)$, $\hat{b}_2 = -0.0000369\,(0.00000871)$.
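To see what these estimates imply for the shape of the curve, the location of the extremum can be computed directly. A small sketch with a hypothetical helper name, using the estimates above:

```python
b1_hat = 0.0227
b2_hat = -0.0000369


def parabola_vertex(b1, b2):
    """x-value at which a + b1*x + b2*x**2 has its extremum."""
    if b2 == 0:
        raise ValueError("b2 = 0: the function is linear and has no extremum")
    return -b1 / (2 * b2)


vertex = parabola_vertex(b1_hat, b2_hat)
shape = "convex" if b2_hat > 0 else "concave"
```

Here the parabola is concave with its maximum at a bilirubin value of about 308, so the fitted linear predictor decreases for bilirubin values beyond that point.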

Higher-order polynomials (cubic, etc.) may also be used.

• Simple approach including simple tests for linearity

• No simple interpretation of coefficients

• Influential points

Figure 7: Cook’s distance for the model with a quadratic effect of bilirubin plotted against bilirubin and time: +: observed failure times, o: censored observations. (Two panels: Cook’s distance against bilirubin and against years.)

Fractional polynomials

Instead of using just $x^2$ and perhaps $x^3$, use several powers $x^q$, e.g.,

$\sqrt{x} = x^{0.5}$, $1/\sqrt{x} = x^{-0.5}$, $x$, $1/x = x^{-1}$, $x^2$, $1/x^2 = x^{-2}$, $x^3$, $1/x^3 = x^{-3}$ (and the power $q = 0$ is taken to mean $\log(x)$).

This provides a lot of flexibility but no interpretable coefficients.

Since such models are purely descriptive, one often aims at finding best-fitting models with two or three terms.
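Each fractional polynomial term is simply an extra covariate computed from x. A minimal sketch, with hypothetical function names and the convention stated above that the power 0 means log(x):

```python
import math


def fp_term(x, q):
    """Fractional polynomial term: x**q, with q = 0 taken to mean log(x).
    Requires x > 0, since negative powers and the log are used."""
    if x <= 0:
        raise ValueError("fractional polynomial terms require x > 0")
    return math.log(x) if q == 0 else x ** q


def fp_covariates(x, powers):
    """Covariates for one observation, one per power in `powers`."""
    return [fp_term(x, q) for q in powers]
```

For example, `fp_covariates(4.0, [1, 0.5, -2])` gives the three covariates 4.0, 2.0 and 0.0625, which would then enter the linear predictor like any other covariates.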

We did that for the PBC-3 study:

Table 1: Likelihood ratio tests comparing fractional polynomial models for the effect of bilirubin in the PBC3 study to a model with a linear effect. Column "none": one additional term (power q1) in the model; remaining columns: two additional terms (powers q1 and q2) in the model.

                                    q2
q1      none     −3     −2     −1   −0.5      0    0.5      2
−3      4.29
−2     12.10  19.10
−1     25.51  29.32  32.38
−0.5   30.69  32.75  34.12  34.09
0      32.32  33.10  33.24  32.57  32.35
0.5    30.56  30.68  30.57  31.08  31.77  32.42
2      20.82  21.47  23.65  28.69  31.17  32.50  32.44
3      16.59  17.88  21.19  28.00  31.06  32.46  32.00  26.59
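The search for the best-fitting models amounts to picking the largest likelihood ratio statistic among models with the same number of terms. The sketch below transcribes the entries of Table 1 into Python dictionaries (the names are illustrative) and locates the maxima.

```python
# One additional term x^q1 (besides x itself): q1 -> LR statistic.
ONE_TERM = {
    -3: 4.29, -2: 12.10, -1: 25.51, -0.5: 30.69,
    0: 32.32, 0.5: 30.56, 2: 20.82, 3: 16.59,
}

# Two additional terms x^q1 and x^q2: (q1, q2) -> LR statistic.
TWO_TERMS = {
    (-2, -3): 19.10,
    (-1, -3): 29.32, (-1, -2): 32.38,
    (-0.5, -3): 32.75, (-0.5, -2): 34.12, (-0.5, -1): 34.09,
    (0, -3): 33.10, (0, -2): 33.24, (0, -1): 32.57, (0, -0.5): 32.35,
    (0.5, -3): 30.68, (0.5, -2): 30.57, (0.5, -1): 31.08,
    (0.5, -0.5): 31.77, (0.5, 0): 32.42,
    (2, -3): 21.47, (2, -2): 23.65, (2, -1): 28.69,
    (2, -0.5): 31.17, (2, 0): 32.50, (2, 0.5): 32.44,
    (3, -3): 17.88, (3, -2): 21.19, (3, -1): 28.00, (3, -0.5): 31.06,
    (3, 0): 32.46, (3, 0.5): 32.00, (3, 2): 26.59,
}

best_one = max(ONE_TERM, key=ONE_TERM.get)    # power of the best single extra term
best_two = max(TWO_TERMS, key=TWO_TERMS.get)  # powers of the best pair of extra terms
```

The best single extra term has power 0, i.e. log(x), and the best pair has powers -0.5 and -2. Note that the gain from the second extra term is small (34.12 against 32.32).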


Results

The best fitting model with 1 extra term (in addition to just x) is

$$LP_i = \log(h_0(t)) + b_1 x_i + b_0 \log(x_i)$$

with estimates $\hat{b}_1 = -0.000723\,(0.00222)$, $\hat{b}_0 = 1.0661\,(0.202)$, i.e. the linear term is insignificant and that for $\log(x)$ is highly significant.

The best fitting model with 2 extra terms is

$$LP_i = \log(h_0(t)) + b_1 x_i + b_{-0.5}\, x_i^{-0.5} + b_{-2}\, x_i^{-2}$$

with estimates $\hat{b}_1 = -0.00242\,(0.00165)$, $\hat{b}_{-0.5} = 40.301\,(13.889)$, $\hat{b}_{-2} = -12.575\,(2.372)$, the last two terms being highly significant and the linear term insignificant.

Figure 8: Estimated linear predictor for the PBC study assuming an effect of serum bilirubin which is modeled either as a fractional polynomial with powers 1 and 0 (dashed) or with powers 1, −0.5, and −2 (solid). The distribution of bilirubin is shown on the horizontal axis. (Horizontal axis: bilirubin; vertical axis: linear predictor.)

Comments

• It is easy to fit models with non-linear effects using a linear predictor: just define appropriate extra covariates.

• Most such models provide estimates without a simple interpretation (linear splines are an exception).

• Note the distinction from “truly non-linear models”, e.g. the Gompertz growth curve model with E(y_i) = a + b exp(c x_i), for which special software is needed for the fitting.