Ph.D. course: “Regression models”
Non-linear effect of a quantitative covariate
PKA & LTS Sect. 4.2.1, 4.2.2
6 May 2013
www.biostat.ku.dk/~pka/regrmodels13
Per Kragh Andersen
Linear effects
We have studied models with the linear predictor:
$\mathrm{LP}_i = a + b x_i$ for a quantitative covariate $x$, both for
• quantitative outcomes ($b$ is a mean value difference),
• binary outcomes ($b$ is a log(odds ratio)),
• survival times (“$a = \log(h_0(t))$”, $b$ is a log(hazard ratio)).
The slope $b$ has a simple interpretation: it is the change in the linear predictor per 1-unit change in $x$.
Linearity is simple but restrictive, and we need
• ways of checking the assumption of linearity
• alternative models to use when linearity fits poorly
Approaches
Different possibilities are:
• transformation of $x$, i.e. $\mathrm{LP}_i = a + b f(x_i)$; the function $f$ must be known,
• scatterplot smoother, fine for description, not optimal for inference (Figure next slide),
• methods based on choosing cut-points for x (Sect. 4.2.1),
• polynomials (Sect. 4.2.2).
The last three approaches may suggest transformations, f.
We will illustrate by modeling the effect of bilirubin in the PBC-3 trial, but the ideas carry over to quantitative and binary outcomes.
Figure 1: Scatterplot smoother for the binary outcome y (fetal death) when plotted against the covariate x (alcohol consumption). The distribution of x is indicated along the horizontal axis. (Horizontal axis: drinks per week; vertical axis: probability of fetal death.)
Using cut-points for the covariate
• Piecewise constant effect
• Linear regression splines
• Quadratic/cubic (restricted) regression splines
Figure 2: Illustration of models for the linear predictor that are alternatives to the simple linear model. The dotted curve represents the true relationship. (Four panels; horizontal axis: x; vertical axis: linear predictor.)
Bilirubin in quintiles
Cox regression model with bilirubin categorized in quintiles
$$
h_i(t) =
\begin{cases}
h_0(t) & \text{if } x_i \le 10.3, \\
h_0(t) \exp(b_1) & \text{if } x_i \in (10.3, 16], \\
h_0(t) \exp(b_2) & \text{if } x_i \in (16, 26.7], \\
h_0(t) \exp(b_3) & \text{if } x_i \in (26.7, 51.4], \\
h_0(t) \exp(b_4) & \text{if } x_i > 51.4.
\end{cases}
$$
With dummy variables $I(x_i \le 10.3), \ldots, I(x_i > 51.4)$, the linear predictor for individual $i$ is the piecewise constant function
$$
\mathrm{LP}_i(t) = \log(h_0(t)) + b_1 I(10.3 < x_i \le 16) + \cdots + b_4 I(x_i > 51.4).
$$
The estimates in this model are: $\hat b_1 = -0.537\ (0.708)$, $\hat b_2 = 1.120\ (0.494)$, $\hat b_3 = 1.698\ (0.460)$, $\hat b_4 = 2.670\ (0.437)$.
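The piecewise constant model can be sketched in a few lines of Python. This is only an illustration: the function names (`group`, `lp_minus_baseline`, `hazard_ratio`) are made up for this sketch, and the cut-points and estimates are simply the numbers quoted on this slide.

```python
import bisect
import math

# Quintile cut-points for bilirubin and the estimates quoted on the slide.
CUTS = [10.3, 16.0, 26.7, 51.4]
# Coefficient per group; 0.0 is the reference group x <= 10.3.
B_HAT = [0.0, -0.537, 1.120, 1.698, 2.670]


def group(x):
    """Index 0..4 of the interval containing x (intervals are closed on the right)."""
    return bisect.bisect_left(CUTS, x)


def lp_minus_baseline(x):
    """Piecewise constant part of the linear predictor, LP - log(h0(t))."""
    return B_HAT[group(x)]


def hazard_ratio(x):
    """Estimated hazard ratio relative to the lowest quintile group."""
    return math.exp(lp_minus_baseline(x))


for x in (8.0, 12.0, 20.0, 40.0, 100.0):
    print(f"bilirubin {x:6.1f}: group {group(x)}, HR {hazard_ratio(x):.2f}")
```

Every individual in the same quintile group gets the same hazard ratio, which is why the fitted curve is a step function.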
Figure 3: Estimated linear predictor (solid curve) for the PBC study assuming an effect of serum bilirubin that is piecewise constant in quintile groups. The dashed curve joins values of the linear predictor for the scores attached to each interval of bilirubin. The distribution of bilirubin is shown on the horizontal axis. (Horizontal axis: bilirubin; vertical axis: linear predictor.)
Using interval scores
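$$
s(x_i) =
\begin{cases}
7.66 & \text{if } x_i \le 10.3, \\
13.26 & \text{if } x_i \in (10.3, 16], \\
20.23 & \text{if } x_i \in (16, 26.7], \\
37.32 & \text{if } x_i \in (26.7, 51.4], \\
148.83 & \text{if } x_i > 51.4.
\end{cases}
$$
The model with linear predictor $\log(h_0(t)) + b\, s(x_i)$ is nested in the model with bilirubin categorized, and the likelihood ratio test statistic for linearity is $19.0 \sim \chi^2_3$, $P = 0.0003$.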
However, the model with a linear effect of x is not nested in the categorized model.
Using plots of pseudo-observations, the fit may be evaluated.
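The P-value quoted for the likelihood ratio test can be checked by hand. The sketch below uses only the Python standard library and the closed-form upper tail of a chi-square distribution with 3 degrees of freedom (4 parameters in the categorized model minus 1 in the score model); the function name `chi2_sf_3df` is made up for this illustration.

```python
import math


def chi2_sf_3df(x):
    """Upper tail probability P(X > x) for a chi-square variable with 3 df.

    Closed form: erfc(sqrt(x/2)) + sqrt(2x/pi) * exp(-x/2).
    """
    return math.erfc(math.sqrt(x / 2.0)) + math.sqrt(2.0 * x / math.pi) * math.exp(-x / 2.0)


lr_statistic = 19.0  # likelihood ratio statistic quoted on the slide
p_value = chi2_sf_3df(lr_statistic)
print(f"P = {p_value:.4f}")
```

The result agrees with the P = 0.0003 reported above.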
Figure 4: The estimated linear predictor for the PBC3 study (assuming an effect of serum bilirubin which is piecewise constant in quintile groups) plotted against bilirubin together with smoothed pseudo-observations. The four panels correspond to quintiles of observed event times. (Panels: time = 0.71, 1.18, 2.16, 3.19; horizontal axis: bilirubin.)
Figure 5: The estimated linear predictor for the PBC3 study (assuming an effect of log(serum bilirubin) which is piecewise constant in quintile groups) plotted against log(bilirubin) together with smoothed pseudo-observations. (Panels: time = 0.71, 1.18, 2.16, 3.19; horizontal axis: log(bilirubin).)
Comments
The model with a piecewise constant effect of x:
• is easy to fit and easy to report,
• does not contain the model with a linear effect of x as a sub-model,
• does not provide a smooth (in fact, not even continuous) relationship,
• is sensitive to the choice of cut-points.
Regression splines
A regression “spline” is a function of the form
$$x_i^+ = (x_i - r)\, I(x_i > r)$$
for some threshold $r$. Thus, $x_i^+ = 0$ for $x_i \le r$, and it increases linearly with $x_i$ from $x_i = r$ and upwards.
If we compose the linear predictor of several such spline terms:
$$\mathrm{LP}_i = a + b x_i + b_1 x_{i1}^+ + \cdots + b_4 x_{i4}^+$$
we get a broken linear function (Figure). The parameter $b_j$ is the change of slope at cut-point $j$: the slope before the first cut-point is $b$, the slope between the first and second is $b + b_1$, etc.
Linearity: $b_1 = b_2 = \cdots = b_4 = 0$.
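A minimal sketch of how such spline terms can be built as extra covariates. The knots and coefficients below are hypothetical values chosen for illustration only (they are not the PBC-3 values), and the helper names `pos_part` and `linear_spline_lp` are made up for this sketch.

```python
def pos_part(x, r):
    """Spline term (x - r) * I(x > r): zero up to r, then linear in x."""
    return x - r if x > r else 0.0


def linear_spline_lp(x, a, b, knots, b_knots):
    """Linear predictor a + b*x + sum_j b_j * (x - r_j)+ ."""
    return a + b * x + sum(bj * pos_part(x, r) for r, bj in zip(knots, b_knots))


# Hypothetical knots and coefficients (illustration only).
knots = [2.0, 4.0, 6.0, 8.0]
b_knots = [0.5, -1.0, 0.25, 0.75]
a, b = 0.0, 1.0


def lp(x):
    return linear_spline_lp(x, a, b, knots, b_knots)


# Slopes on unit steps inside each segment: b, b + b1, b + b1 + b2, ...
print(lp(1.0) - lp(0.0))  # slope before the first knot: b
print(lp(3.0) - lp(2.0))  # slope between knots 1 and 2: b + b1
print(lp(5.0) - lp(4.0))  # slope between knots 2 and 3: b + b1 + b2
```

Because each term is zero at its own knot, the function is continuous by construction; only the slope changes there.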
Results for PBC-3
For the PBC3 example we get the estimates:
$\hat b = -0.245\ (0.182)$, $\hat b_1 = 0.460\ (0.309)$, $\hat b_2 = -0.122\ (0.185)$, $\hat b_3 = -0.0592\ (0.0654)$, $\hat b_4 = -0.0278\ (0.0174)$.
The likelihood ratio test statistic for linearity is $39.13 \sim \chi^2_4$, $P < 0.001$.
The linear spline function is now continuous but still not smooth.
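Since each spline coefficient is a change of slope, the slope on each segment is a running sum of the estimates. A small sketch of that arithmetic, using the PBC-3 estimates quoted above (variable names are made up for this illustration):

```python
from itertools import accumulate

b_hat = -0.245                                 # slope before the first cut-point
change_hat = [0.460, -0.122, -0.0592, -0.0278]  # changes of slope at cut-points 1..4

# Running sums: b, b + b1, b + b1 + b2, ...
slopes = list(accumulate([b_hat] + change_hat))

for k, s in enumerate(slopes):
    print(f"segment {k}: slope {s:.4f}")
```

The estimated slope is negative before the first cut-point, positive afterwards, and flattens towards zero in the upper segments.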
Linear predictor with quadratic splines:
$$\mathrm{LP}_i = a + b_1 x_i + b_2 x_i^2 + b_{1,1} (x_{i1}^+)^2 + \cdots + b_{1,4} (x_{i4}^+)^2.$$
No simple interpretation of coefficients, but a smooth curve is obtained.
LR test for linearity, $b_2 = b_{1,1} = \cdots = b_{1,4} = 0$: $40.97 \sim \chi^2_5$.
Figure 6: Estimated linear predictor for the PBC3 study assuming an effect of serum bilirubin modeled as a linear spline (dashed), an unrestricted quadratic spline (solid), or a quadratic spline restricted to be linear for bilirubin values above 51.4 (dotted). The distribution of bilirubin is shown on the horizontal axis. (Horizontal axis: bilirubin; vertical axis: linear predictor.)
Restricted splines
The quadratic effect $b_2 x_i^2$ may be quite dramatic for large (both positive and negative) values of $x_i$.
This may be avoided using restricted splines, see p. 220. The idea is that for large (positive or negative) $x$'s, the curve is linear instead of quadratic.
Cubic splines may also be defined, using terms $(x_{ij}^+)^3$.
Polynomials
The simplest alternative to a linear function is a quadratic function, and a standard test for linearity is to include $x^2$ and test whether the corresponding coefficient is zero, $b_2 = 0$:
$$\mathrm{LP}_i = a + b_1 x_i + b_2 x_i^2.$$
The resulting parabola has its extremum at $x = -\frac{b_1}{2 b_2}$; it is “happy” (convex, with a minimum) if $b_2 > 0$ and “bad-tempered” (concave, with a maximum) if $b_2 < 0$.
For the PBC-3 data:
$\hat b_1 = 0.0227\ (0.0031)$, $\hat b_2 = -0.0000369\ (0.00000871)$.
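As a worked example, the turning point of the fitted parabola follows directly from the two estimates, using the standard vertex formula −b1/(2·b2) for a quadratic a + b1·x + b2·x². The variable names are made up for this sketch.

```python
b1_hat = 0.0227       # estimated coefficient of x
b2_hat = -0.0000369   # estimated coefficient of x^2

# Vertex of a + b1*x + b2*x^2.
vertex = -b1_hat / (2.0 * b2_hat)

# The sign of b2 decides whether the vertex is a maximum or a minimum.
kind = "maximum" if b2_hat < 0 else "minimum"

print(f"{kind} at bilirubin = {vertex:.1f}")
```

With a negative estimate of b2 the fitted curve is concave and turns downward at a bilirubin value of roughly 308, which lies within the bilirubin range shown in the figures.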
Higher-order polynomials (cubic etc.) may also be used.
• Simple approach including simple tests for linearity
• No simple interpretation of coefficients
• Influential points
Figure 7: Cook’s distance for the model with a quadratic effect of bilirubin plotted against bilirubin and time: +: observed failure times, o: censored observations. (Two panels; horizontal axes: bilirubin and years; vertical axis: Cook’s distance.)
Fractional polynomials
Instead of using just $x^2$ and perhaps $x^3$, use several powers $x^q$, e.g.,
$\sqrt{x} = x^{0.5}$, $1/\sqrt{x} = x^{-0.5}$, $x$, $1/x = x^{-1}$, $x^2$, $1/x^2 = x^{-2}$, $x^3$, $1/x^3 = x^{-3}$
(and the power $q = 0$ is taken to mean $\log(x)$).
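A minimal sketch of how the fractional polynomial terms can be generated as extra covariates, with the convention that the power 0 means log(x). The names `fp_term`, `fp_covariates` and `POWERS` are made up for this illustration, and the check that x is positive is an assumption of the sketch (negative powers, square roots and logarithms need it).

```python
import math

# Candidate powers; the power 1 (x itself) is always in the model.
POWERS = [-3, -2, -1, -0.5, 0, 0.5, 2, 3]


def fp_term(x, q):
    """Fractional polynomial term x**q, where q = 0 is taken to mean log(x)."""
    if x <= 0:
        raise ValueError("fractional polynomial terms need x > 0")
    return math.log(x) if q == 0 else x ** q


def fp_covariates(x, powers):
    """Extra covariates for one individual: x itself followed by the chosen terms."""
    return [x] + [fp_term(x, q) for q in powers]


# Example: covariates for the model with powers 1, -0.5 and -2 at x = 4.
print(fp_covariates(4.0, [-0.5, -2]))
```

Each choice of powers just defines a new set of covariates, so the model is still fitted with an ordinary linear predictor.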
This provides a lot of flexibility but no interpretable coefficients.
Since such models are purely descriptive, one often aims at finding best-fitting models with two or three terms.
We did that for the PBC-3 study:
Table 1: Likelihood ratio tests comparing fractional polynomial models for the effect of bilirubin in the PBC3 study to a model with a linear effect. First column: one additional term in the model; next columns two additional terms in the model.
                                     q2
  q1    (none)     −3     −2     −1   −0.5      0    0.5      2
  −3      4.29
  −2     12.10  19.10
  −1     25.51  29.32  32.38
  −0.5   30.69  32.75  34.12  34.09
   0     32.32  33.10  33.24  32.57  32.35
   0.5   30.56  30.68  30.57  31.08  31.77  32.42
   2     20.82  21.47  23.65  28.69  31.17  32.50  32.44
   3     16.59  17.88  21.19  28.00  31.06  32.46  32.00  26.59
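The best-fitting models can be read off Table 1 by looking for the largest likelihood ratio statistic in the first column and among the remaining columns. A small sketch of that search, with the table values typed in by hand (the names `ROWS`, `one_term` and `two_term` are made up for this illustration):

```python
# Powers indexing the q2 columns (after the first, one-term column).
POWERS = [-3, -2, -1, -0.5, 0, 0.5, 2, 3]

# Rows of Table 1: first entry is the one-term model with power q1,
# the following entries are two-term models with q2 = -3, -2, ...
ROWS = {
    -3: [4.29],
    -2: [12.10, 19.10],
    -1: [25.51, 29.32, 32.38],
    -0.5: [30.69, 32.75, 34.12, 34.09],
    0: [32.32, 33.10, 33.24, 32.57, 32.35],
    0.5: [30.56, 30.68, 30.57, 31.08, 31.77, 32.42],
    2: [20.82, 21.47, 23.65, 28.69, 31.17, 32.50, 32.44],
    3: [16.59, 17.88, 21.19, 28.00, 31.06, 32.46, 32.00, 26.59],
}

one_term = {q1: vals[0] for q1, vals in ROWS.items()}
two_term = {
    (q1, POWERS[j]): v
    for q1, vals in ROWS.items()
    for j, v in enumerate(vals[1:])
}

best_one = max(one_term, key=one_term.get)
best_two = max(two_term, key=two_term.get)

print("best single extra power:", best_one, one_term[best_one])
print("best pair of extra powers:", best_two, two_term[best_two])
```

The search picks the power 0, i.e. log(x), as the best single extra term and the pair of powers −0.5 and −2 as the best two extra terms.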
Results
The best fitting model with 1 extra term (in addition to just x) is
$$\mathrm{LP}_i = \log(h_0(t)) + b_1 x_i + b_0 \log(x_i)$$
with estimates $\hat b_1 = -0.000723\ (0.00222)$, $\hat b_0 = 1.0661\ (0.202)$, i.e. the linear term is not significant and that for $\log(x)$ is highly significant.
The best fitting model with 2 extra terms is
$$\mathrm{LP}_i = \log(h_0(t)) + b_1 x_i + b_{-0.5} x_i^{-0.5} + b_{-2} x_i^{-2}$$
with estimates $\hat b_1 = -0.00242\ (0.00165)$, $\hat b_{-0.5} = 40.301\ (13.889)$, $\hat b_{-2} = -12.575\ (2.372)$, the last two terms being highly significant and the linear term not significant.
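The statements about significance can be illustrated with Wald statistics. The sketch below assumes that the numbers in parentheses are standard errors (the slides do not say so explicitly) and uses a two-sided normal approximation; the helper name `wald` and the dictionary layout are made up for this illustration.

```python
import math


def wald(estimate, se):
    """Wald z statistic and two-sided P-value from a normal approximation."""
    z = estimate / se
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p


# (estimate, assumed standard error) for each term, as quoted above.
fits = {
    "powers 1, 0": {
        "x": (-0.000723, 0.00222),
        "log(x)": (1.0661, 0.202),
    },
    "powers 1, -0.5, -2": {
        "x": (-0.00242, 0.00165),
        "x^-0.5": (40.301, 13.889),
        "x^-2": (-12.575, 2.372),
    },
}

results = {
    model: {term: wald(est, se) for term, (est, se) in terms.items()}
    for model, terms in fits.items()
}

for model, terms in results.items():
    for term, (z, p) in terms.items():
        print(f"{model:20s} {term:8s} z = {z:6.2f}  P = {p:.4f}")
```

Under this assumption the linear term has a small z statistic in both models, while the log and negative-power terms are clearly different from zero.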
Figure 8: Estimated linear predictor for the PBC study assuming an effect of serum bilirubin which is modeled either as a fractional polynomial with powers 1 and 0 (dashed) or with powers 1, −0.5, and −2 (solid). The distribution of bilirubin is shown on the horizontal axis. (Horizontal axis: bilirubin; vertical axis: linear predictor.)
Comments
• It is easy to fit models with non-linear effects using a linear predictor: just define appropriate extra covariates.
• Most such models provide estimates without a simple interpretation (linear splines are an exception).
• Note the distinction from “truly non-linear models”, e.g. the Gompertz growth curve model with $E(y_i) = a + b \exp(c x_i)$, for which special software is needed for the fitting.