July 12, 2000
Sarah W. Hardy, North Carolina State University
Douglas W. Nychka, National Center for Atmospheric Research
R-splines are introduced as splines fit with a polynomial null space plus the sum of radial basis functions. Thin plate splines are a special case of R-splines. This broader definition of an R-spline, however, also includes splines in which $2m - d \le 0$, where the traditional roughness penalty is not guaranteed to be non-negative definite. This paper discusses a modification of the roughness penalty that allows for the fitting of reduced polynomial null spaces. A series of examples is used to demonstrate the behavior of this modified roughness penalty.
Keywords: Thin plate spline, nonparametric, response surfaces, roughness penalty, Demmler-Reinsch basis functions.
1 Introduction
plate spline modeling.
R-splines arose in the context of attempting to modify the thin plate spline to allow for a greater number of explanatory variables in the model. R-splines are splines fit with a polynomial null space plus the sum of radial basis functions. Clearly, thin plate splines fall into this category. This broader definition of an R-spline, however, also includes other splines. Specifically, it includes splines in which $2m - d \le 0$. In other words, the polynomial null space is reduced, or of a lower order than required by the thin plate spline constraint. When $2m - d \le 0$, the seamless way in which the polynomial function and roughness penalty fit together (i.e., the space spanned by the polynomial terms is the null space of the roughness penalty) no longer holds, or the relationship is "broken." The term broken spline will be used to indicate this kind of spline, where the penalty matrix is chosen to be similar to that of the thin plate spline. The term R-splines refers to the broader class of splines including both broken and thin plate splines.

This paper begins by discussing the thin plate spline roughness penalty. It then presents the broken spline modification to the roughness penalty. The Demmler-Reinsch basis function representation of splines is used to show the role the eigenvalues of the roughness penalty matrix have in the spline solution. A series of four two- and three-dimensional examples from data sets available in S-PLUS and FUNFITS (Nychka et al., 1996) is used to compare the roughness penalties for the thin plate and broken spline. Another five-dimensional example using data collected at Becton Dickinson is also presented.
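To make the motivation concrete, the following is a minimal sketch of how the condition $2m - d > 0$ drives the size of the polynomial null space as the number of explanatory variables $d$ grows. The function names are hypothetical, and the count uses the standard fact that there are $\binom{m-1+d}{d}$ monomials of total degree at most $m-1$ in $d$ variables.

```python
from math import comb


def null_space_dim(m: int, d: int) -> int:
    """Number of monomials of total degree <= m - 1 in d variables."""
    return comb(m - 1 + d, d)


def min_thin_plate_order(d: int) -> int:
    """Smallest m satisfying the thin plate spline condition 2m - d > 0."""
    return d // 2 + 1


for d in (2, 3, 5):
    m_tps = min_thin_plate_order(d)
    m_broken = m_tps - 1  # one order lower, so 2m - d <= 0
    print(
        f"d={d}: thin plate m={m_tps} needs {null_space_dim(m_tps, d)} "
        f"polynomial terms; broken m={m_broken} needs "
        f"{null_space_dim(m_broken, d)}"
    )
```

For five explanatory variables the thin plate spline requires $m = 3$ and 21 polynomial terms, while a broken spline with $m = 2$ needs only 6.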
2 Thin plate spline roughness penalty
The roughness functional below, $J_m(f)$, will increase in magnitude as a function departs from a polynomial of order $m-1$:

$$J_m(f) = \int_{\Re^d} \sum \frac{m!}{\alpha_1! \cdots \alpha_d!} \left( \frac{\partial^m f}{\partial u_1^{\alpha_1} \cdots \partial u_d^{\alpha_d}} \right)^2 du.$$

The sum in the integrand is taken over all non-negative integer vectors, $\alpha$, such that $\alpha_1 + \cdots + \alpha_d = m$. Clearly, for polynomials of order $m-1$, $J_m(f) = 0$, because all $m$th derivatives are 0.

The thin plate spline estimator of $f$ (Wahba, 1990) is the minimizer of the sum of the mean-squared error and the roughness penalty, which is weighted by a smoothing parameter $\lambda$. Details on the form of the thin plate spline can be found in the appendix. In matrix form, the vector of thin plate spline estimates $f(x_i)$ is

$$f = T\beta + M\delta \quad \text{where } T^T\delta = 0 \text{ and } 2m - d > 0,$$

where $T$ is the design matrix of a polynomial model of order $m-1$ and $M_{k,i} = E(\|x_i - x_k\|, m, d)$.

For linear combinations of basis functions that satisfy the constraints, $J_m(f) = \delta^T M \delta > 0$. In other words, $M$ is guaranteed to be positive definite for coefficient vectors satisfying the constraint. Defining the matrix $W_{n \times n}$, a diagonal matrix proportional to the reciprocal variances of the errors, allows for the following matrix representation of the penalized sums of squares:

$$S = \frac{1}{n}(Y - T\beta - M\delta)^T W (Y - T\beta - M\delta) + \lambda\, \delta^T M \delta. \qquad (1)$$

After taking partial derivatives of (1) with respect to $\beta$ and $\delta$, a QR decomposition of $T$,

$$F^T T = \begin{bmatrix} R \\ 0 \end{bmatrix},$$

is used to enforce the constraint $T^T\delta = 0$. $F_{n \times n}$ is an orthogonal matrix that can be partitioned $F = [F_1 \mid F_2]$, where $F_1$ has columns that span the column space of $T$, i.e., $T = F_1 R$, and $F_2$ is orthogonal to the column space of $T$. Reparameterizing by letting $\delta = F_2\omega_2$, $\beta = \omega_1$, and $\omega^T = (\omega_1^T, \omega_2^T)$ gives the ridge regression form:

$$X^T W X \omega + \lambda H \omega = X^T W Y,$$

where

$$X = [\, T \;\; M F_2 \,] \quad \text{and} \quad H = \begin{bmatrix} 0 & 0 \\ 0 & F_2^T M F_2 \end{bmatrix}.$$
Note that in the ridge regression formulation the roughness penalty is represented by the matrix H.
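The ridge regression form can be sketched numerically as follows. This is an illustration under stated assumptions rather than the FUNFITS implementation: it covers only $d = 2$, $m = 2$, sets the constant $a_{md}$ to 1, and absorbs any constant factor such as $1/n$ into the smoothing parameter. The function name `fit_thin_plate_2d` and the dictionary it returns are hypothetical.

```python
import numpy as np


def fit_thin_plate_2d(x, y, lam, w=None):
    """Sketch of the ridge regression form for d = 2, m = 2.

    Assumptions (not from the paper): a_md = 1, and any constant factor
    such as 1/n in the penalized sum of squares is absorbed into lam.
    """
    n, d = x.shape
    if d != 2:
        raise ValueError("this sketch only covers d = 2, m = 2")
    W = np.eye(n) if w is None else np.diag(w)

    # T: design matrix of a polynomial of order m - 1 (constant + linear).
    T = np.column_stack([np.ones(n), x])
    t = T.shape[1]

    # M[k, i] = E(||x_i - x_k||, 2, 2) = r^2 log r, taken as 0 at r = 0.
    dist = np.linalg.norm(x[:, None, :] - x[None, :, :], axis=2)
    M = dist**2 * np.log(np.where(dist > 0, dist, 1.0))

    # Full QR of T: F = [F1 | F2]; F2 is orthogonal to the columns of T.
    F, _ = np.linalg.qr(T, mode="complete")
    F2 = F[:, t:]

    X = np.hstack([T, M @ F2])
    H = np.zeros((n, n))
    H[t:, t:] = F2.T @ M @ F2

    # Solve X^T W X omega + lam H omega = X^T W Y.
    omega = np.linalg.solve(X.T @ W @ X + lam * H, X.T @ W @ y)
    beta, delta = omega[:t], F2 @ omega[t:]
    return {"T": T, "M": M, "H": H, "beta": beta, "delta": delta,
            "fitted": T @ beta + M @ delta}
```

Because $\delta = F_2\omega_2$, the constraint $T^T\delta = 0$ holds by construction, which is why the QR step is preferred here over imposing the constraint separately.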
3 Broken spline roughness penalty
The degree of the polynomial component implies a specific roughness penalty in the thin plate spline, and thus if $T$, the design matrix for the polynomial component, does not span $P_{m-1}$, the roughness penalty minimized in the ridge regression formulation will not be the same roughness penalty as the one used to derive the thin plate spline. We will use the term broken spline to describe splines where $T$ does not span $P_{m-1}$.

For purposes of this comparative discussion, the $T$ matrix for a thin plate spline will be denoted $T_P$ and the $T$ matrix for a broken spline will be denoted $T_R$. Note that the column space of $T_R$ is a subset of that of $T_P$. Partition $F$ as

$$F = [\, F_a \;\; F_b \;\; F_c \,],$$

where $F_a$ spans the space of $T_R$, $F_b$ spans the columns of $T_P$ that are not in $T_R$, and $F_c$ is orthogonal to the column space of $T_P$. For the broken spline, $H$, the roughness penalty matrix, becomes

$$H_R = \begin{bmatrix} 0_{r \times r} & 0 & 0 \\ 0 & F_b^T M F_b & F_b^T M F_c \\ 0 & F_c^T M F_b & F_c^T M F_c \end{bmatrix},$$

as compared to the $H$ for the thin plate spline, which is

$$H_t = \begin{bmatrix} 0_{r \times r} & 0 & 0 \\ 0 & 0_{(t-r) \times (t-r)} & 0 \\ 0 & 0 & F_c^T M F_c \end{bmatrix}.$$

Hence, $H_R$ for the broken spline is the roughness penalty for the thin plate spline, plus other terms,

$$H_R = H_t + \begin{bmatrix} 0 & 0 & 0 \\ 0 & F_b^T M F_b & F_b^T M F_c \\ 0 & F_c^T M F_b & 0 \end{bmatrix}.$$
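The block structure of the two penalty matrices can be checked numerically. The sketch below assumes $d = 2$, a thin plate null space $T_P = [1, x_1, x_2]$ ($m = 2$), a broken spline null space $T_R = [1]$ ($m = 1$), the same matrix $M$ for both fits, and $a_{md} = 1$; the function name `penalty_blocks` and the simulated design points are hypothetical.

```python
import numpy as np


def penalty_blocks(x):
    """Build H_t and H_R for d = 2 with T_P = [1, x1, x2] and T_R = [1]."""
    n = x.shape[0]
    dist = np.linalg.norm(x[:, None, :] - x[None, :, :], axis=2)
    M = dist**2 * np.log(np.where(dist > 0, dist, 1.0))  # a_md assumed 1

    T_P = np.column_stack([np.ones(n), x])
    n_r, n_t = 1, T_P.shape[1]  # r and t in the block subscripts

    # The first column of F spans T_R, the first n_t columns span T_P.
    F, _ = np.linalg.qr(T_P, mode="complete")
    F_b, F_c = F[:, n_r:n_t], F[:, n_t:]
    F_bc = F[:, n_r:]

    H_t = np.zeros((n, n))
    H_t[n_t:, n_t:] = F_c.T @ M @ F_c
    H_R = np.zeros((n, n))
    H_R[n_r:, n_r:] = F_bc.T @ M @ F_bc
    return H_t, H_R, (F_b, F_c, M)


rng = np.random.default_rng(1)
H_t, H_R, _ = penalty_blocks(rng.uniform(size=(12, 2)))
# H_t should have no negative eigenvalues beyond rounding error;
# H_R carries no such guarantee and may show negative ones.
print("smallest eigenvalue of H_t:", np.linalg.eigvalsh(H_t).min())
print("smallest eigenvalue of H_R:", np.linalg.eigvalsh(H_R).min())
```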
With thin plate splines, $H_t$ is guaranteed to be non-negative definite. Recall $\delta^T M \delta > 0$ where $T^T\delta = 0$ and $2m - d > 0$, which guarantees that $M$ is positive definite. $F_2^T M F_2$ is a quadratic form of a positive definite matrix and as such is also positive definite. $H_t$, having $F_2^T M F_2$ in the lower right block and zeros elsewhere, is non-negative definite. With broken splines, however, $H_R$ is no longer guaranteed to be a non-negative definite matrix. In the computational solution to the thin plate spline, $UDU^T$ is the singular value decomposition of $BHB$, so $H_t = B^{-1}UDU^TB^{-1}$. For the broken spline, $BHB$ is no longer positive definite and thus its singular value decomposition is $UDV^T$, where $U \ne V$, and $\tilde{H}_R = B^{-1}UDU^TB^{-1}$, with $U$ and $V$ differing in the columns corresponding to the negative eigenvalues. Hence, $\tilde{H}_R$ is the matrix that results from forcing the negative eigenvalues of $H_R$ to be positive. How "close" the two matrices are depends on the magnitude of these eigenvalues. The negative eigenvalues that occur correspond to the "missing" polynomial terms in the null space. Thus, the roughness penalty is forced into penalizing roughness in the surface that might be modeled by these missing polynomial terms, fitting them instead with the flexible radial basis functions.
3.1 Demmler-Reinsch basis functions
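3.1.1 Definition
Constructing a Demmler-Reinsch basis for the smoothing problem aids in the understanding of the roles the eigenvalues and eigenvectors of $H$ have in the spline solution. Before defining the Demmler-Reinsch basis, two inner products are defined:

$$\langle h_\mu, h_\nu \rangle_1 = \sum_k h_\mu(x_k)\, W_k\, h_\nu(x_k)$$

and

$$\langle h_\mu, h_\nu \rangle_2 = \int_{\Re^d} \sum \frac{m!}{\alpha_1! \cdots \alpha_d!} \left( \frac{\partial^m h_\mu}{\partial x_1^{\alpha_1} \cdots \partial x_d^{\alpha_d}} \right) \left( \frac{\partial^m h_\nu}{\partial x_1^{\alpha_1} \cdots \partial x_d^{\alpha_d}} \right) dx.$$

The Demmler-Reinsch basis is denoted by $\{g_\nu\}$ and is defined by the three following properties:
1. $\{g_\nu\}$ for $1 \le \nu \le N$ spans the same subspace as $\{\phi_j\}$ and the radial basis functions.
2. $\langle g_\nu, g_\mu \rangle_1 = 1$ for $\nu = \mu$ and 0 for $\nu \ne \mu$.
3. $\langle g_\nu, g_\mu \rangle_2 = D_\nu$ for $\nu = \mu$ and 0 for $\nu \ne \mu$.

This basis gives a simple form in which the residual sums of squares and the roughness penalty can be expressed. By convention the $D_\nu$ are in ascending order.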
3.1.2 Role in spline solution
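Because this basis spans the same space as the components of $f(x)$, we can now express both $f(x)$ and $y$ as linear combinations of these basis functions:

$$f(x_k) = \sum_{\nu=1}^{N} \alpha_\nu g_\nu(x_k) \quad \text{and} \quad y_k = \sum_{\nu=1}^{N} u_\nu g_\nu(x_k). \qquad (2)$$

Using these expressions it can be easily shown that

$$u_\nu = \sum_{k=1}^{n} g_\nu(x_k)\, W_k\, Y_k.$$

Moreover, under the assumption that the random errors are independent and distributed $N(0, \sigma^2 I)$, the $u_\nu$'s are also independent and are distributed $N(\alpha, \sigma^2 I)$.

With the Demmler-Reinsch basis functions the residual sum of squares and the roughness penalty can be expressed:

$$\sum_{k=1}^{n} (Y_k - f(x_k))^2 w_k = \sum_{\nu=1}^{n} (u_\nu - \alpha_\nu)^2 \quad \text{and} \quad J_m(f) = \sum_{\nu=1}^{n} D_\nu \alpha_\nu^2.$$

Thus, it is apparent that minimizing $S(f)$ is equivalent to finding

$$\min_{\alpha \in \Re^n} \; \sum_{\nu=1}^{n} (u_\nu - \alpha_\nu)^2 + \lambda \sum_{\nu=1}^{n} D_\nu \alpha_\nu^2. \qquad (3)$$

The minimizing expression is

$$c_\nu = \frac{u_\nu}{1 + \lambda D_\nu},$$

and so

$$\hat{f}(x) = \sum_{\nu=1}^{N} c_\nu g_\nu(x) = \sum_{\nu=1}^{N} \frac{u_\nu}{1 + \lambda D_\nu}\, g_\nu(x).$$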
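The minimizer of (3) acts coordinate by coordinate, shrinking each $u_\nu$ by the factor $1/(1 + \lambda D_\nu)$. A minimal sketch follows, with made-up values of $u_\nu$ and $D_\nu$ and hypothetical function names `shrink` and `objective`.

```python
import numpy as np


def shrink(u, D, lam):
    """Coefficients c_nu = u_nu / (1 + lam * D_nu) minimizing (3)."""
    return np.asarray(u, float) / (1.0 + lam * np.asarray(D, float))


def objective(alpha, u, D, lam):
    """Sum of (u_nu - alpha_nu)^2 plus lam times sum of D_nu alpha_nu^2."""
    return np.sum((u - alpha) ** 2) + lam * np.sum(D * alpha**2)


u = np.array([3.0, -1.0, 2.0, 0.5])  # hypothetical coefficients of y
D = np.array([0.0, 0.0, 1.0, 9.0])   # ascending; zeros are the null space
c = shrink(u, D, lam=1.0)
print(c)  # null space coefficients pass through unchanged
```

Coefficients with $D_\nu = 0$ are not shrunk at all, while those attached to rough basis functions (large $D_\nu$) are shrunk the most.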
In fact, the $G = \{g_\nu\}_{\nu = 1, \ldots, N}$ we are seeking is (cleverly) the same $G$ we found in the ridge regression formulation, and property (2) is easily verified by noting that

$$G^T (X^T W X) G = I.$$

To determine if property (3) holds, observe that $G^T H G = D$. As an important point of clarification, recall that the upper $r \times r$ block of $H$ is 0, where $r$ is the dimension of the null space. This implies $D_\nu = 0$ for $\nu = 1, \ldots, r$, where the $D_\nu$ are the eigenvalues of $G^T H G$. More intuitively, by virtue of the way in which the basis functions are defined, the roughness of the basis functions is $J_m(g_\nu) = D_\nu$, and $r$ of these basis functions will be polynomials whose roughness is 0.

With this formulation, it is apparent that if the $D_\nu$'s, which are the eigenvalues of $H$, and the $g_\nu$'s, which are the Demmler-Reinsch basis functions, are correlated for the thin plate and broken spline, then the two roughness penalties are similar and measuring the same features of the data.
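A matrix $G$ with $G^T(X^TWX)G = I$ and $G^THG = D$ can be obtained by simultaneous diagonalization. The sketch below assumes $B = (X^TWX)^{-1/2}$, which is one common choice and is not stated in the text above, and uses a symmetric eigendecomposition so that negative eigenvalues of a broken spline penalty would be reported with their sign. The function name `demmler_reinsch` is hypothetical.

```python
import numpy as np


def demmler_reinsch(XtWX, H):
    """Return (G, D) with G^T XtWX G = I and G^T H G = diag(D).

    Assumes XtWX is symmetric positive definite and H is symmetric.
    B = XtWX^{-1/2} is an assumption of this sketch.
    """
    vals, vecs = np.linalg.eigh(XtWX)
    B = vecs @ np.diag(vals ** -0.5) @ vecs.T

    # Eigendecomposition of B H B; eigh returns ascending eigenvalues,
    # matching the convention that the D_nu are in ascending order.
    D, U = np.linalg.eigh(B @ H @ B)
    G = B @ U
    return G, D
```

With a penalty whose upper-left $r \times r$ block is zero and whose remaining block is positive definite, the first $r$ entries of `D` come out as zero up to rounding error.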
4 Examples
4.1 Example datasets
engine was run (a measure of the richness of the air/ethanol mix). The response variable is the concentration of nitric oxide and nitrogen dioxide in engine exhaust, normalized by the work done by the engine. The mini-triathlon data contain swimming, biking, and running times from 110 entrants in a Cary, NC event. In the example we use swimming and biking times to predict running times. The BD2 dataset comes from the results of a sequence of RSM designs in a DNA amplification experiment performed at Becton Dickinson Technologies. The explanatory variables are potassium phosphate (a buffer), magnesium acetate (a salt), and dimethyl sulfoxide (a solvent). The response variable is yield. In summary, the ethanol and mini-triathlon datasets each have 2 explanatory variables and the stack and BD2 datasets have 3.
[Figure: H eigenvalues for m=1 and m=2. Panels: minitri, stack.loss, ethanol, BD2. Each panel plots the m=1 eigenvalues against the m=2 eigenvalues.]
Figure 2 shows image plots of the correlations of the thin plate spline and broken spline Demmler-Reinsch basis functions for these four data sets. Lines on the graphs indicate the division of the basis functions associated with the null space and roughness penalty. The areas in the regions greater than the lines on the graph indicate the regions where the basis functions are those associated with the null space. The area enclosed within the lines indicates the basis functions associated with the roughness penalty, which these plots show are highly correlated. As expected, the area outside the lines is not as highly correlated, as the polynomials in the null space are not the same for the thin plate and broken splines. There is, however, some correlation between the basis functions representing the null space in the thin plate spline and those representing the roughness penalty in the broken spline. This indicates that the polynomial terms missing from the broken spline are being accounted for in some way by the roughness penalty. There is some difference between the two sets of basis functions, but overall, when ordered in ascending order by eigenvalues, the two sets of basis functions match closely beyond the first few functions related to the null space. Clearly, more theoretical work is needed to explicitly quantify the relationship given these promising empirical results.
[Figure 2: Correlation of Demmler-Reinsch basis functions for m=1 and m=2. Panels: minitri, stack.loss, ethanol, BD2. Axes index the m=1 and m=2 basis functions; the color scale runs from -1.0 to 1.0.]
[Figure: Diagnostic plots for a broken spline (m=1) and a thin plate spline (m=2). Panels for each fit: predicted values vs. observed values, predicted values vs. residuals, and GCV (average prediction error vs. effective number of parameters).]
m=1, broken spline: R^2 = 69.56%, Q^2 = 64.88%, RMSE = 3.571, Eff. df. = 8, Res. df. = 102, GCV min = 13.753.
m=2, thin plate spline: R^2 = 70.23%, Q^2 = 64.66%, RMSE = 3.552, Eff. df. = 9.6, Res. df. = 100.4, GCV min = 13.828.
4.2 A larger example dataset
Figure 4: Comparison of H eigenvalues for the example dataset fit with a broken (m=2) and thin plate (m=3) spline.
the broken spline than for the thin plate spline.
Figure 5: Correlation of basis functions for the example dataset fit with a broken (m=2) and thin plate (m=3) spline. The color scale runs from -1.0 to 1.0.
case the broken spline is doing a slightly better job of fitting the data.
Figure 6: Diagnostic plots for the example dataset fit with a broken and thin plate spline. Panels for each fit: predicted values vs. observed values, predicted values vs. residuals, and GCV (average prediction error vs. effective number of parameters).
m=2, broken spline: R^2 = 45.12%, Q^2 = 10.18%, RMSE = 0.6713, Pure Error = 0.6929, Eff. df. = 46.3, Res. df. = 244.7, GCV min = 0.536.
m=3, thin plate spline: R^2 = 37.5%, Q^2 = 6.3%, RMSE = 0.6761, Pure Error = 0.6929, Eff. df. = 46.3, Res. df. = 244.7, GCV min = 0.543.
5 Conclusions
It can be shown that the thin plate spline solution is the unique and optimal solution to the posed minimization problem. The broken spline is also a unique solution to the minimization problem given a fixed roughness penalty matrix as described.
In the thin plate spline, the roughness penalty is explicitly defined and makes intuitive sense from a physical perspective as the bending energy of a thin plate. In fact, positive definite matrices arise in many applications involving energy, and clearly only make sense for a roughness penalty that is to be minimized. With the broken spline, although the effective roughness penalty cannot be explicitly defined, it is closely related to the "bending energy" roughness penalty and empirically gives good results. The new insight that it is possible to achieve good results using the sum of polynomial terms and radial basis functions to estimate smooth functions without the restrictions imposed for thin plate splines opens the door for experimentation using this modeling technique for many more types of datasets.
A Definition of a thin plate spline
The thin plate spline estimator of $f$ (Wahba, 1990) is the minimizer of the following penalized sums of squares for a $d$-dimensional explanatory variable $x$:

$$S(f) = \frac{1}{n} \sum_{i=1}^{n} w_i (y_i - f(x_i))^2 + \lambda J_m(f) \quad \text{for } \lambda > 0. \qquad (5)$$

Thus, an $\hat{f}$ minimizing (5) will result in $\hat{f}$ with some level of smoothness dictated by $\lambda$. The function that minimizes this expression has the form

$$f(x_i) = \sum_{j=1}^{t} \phi_j(x_i)\beta_j + \sum_{k=1}^{N} E(\|x_i - x_k\|, m, d)\,\delta_k \qquad (6)$$

subject to the constraint $T^T\delta = 0$. In this formulation, the $\phi_j(x_i)$ are a set of $t$ polynomial functions (of order $m-1$) and $E(\|x_i - x_k\|, m, d)$ are a set of $N$ radial basis functions. It is assumed that $\beta_j$ is estimable. The radial basis functions are explicitly defined as below:

$$E(r, m, d) = \begin{cases} a_{md}\, \|r\|^{2m-d} \log(\|r\|) & d \text{ even} \\ a_{md}\, \|r\|^{2m-d} & d \text{ odd} \end{cases}$$

where $a_{md}$ depends only on $m$ and $d$. One standard way of determining $\lambda$ is by generalized cross-validation (GCV) (Bates et al., 1987).
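The radial basis function can be written as a short routine. This is a sketch only: the constant $a_{md}$ is passed in as an argument with a placeholder default of 1 rather than computed, the value at $r = 0$ is taken as its limit 0, and the function name `radial_basis` is hypothetical.

```python
import numpy as np


def radial_basis(r, m, d, a_md=1.0):
    """E(r, m, d) for a thin plate spline, requiring 2m - d > 0.

    a_md is a placeholder for the constant that depends only on m and d.
    """
    p = 2 * m - d
    if p <= 0:
        raise ValueError("thin plate spline requires 2m - d > 0")
    r = np.asarray(r, dtype=float)
    if d % 2 == 0:
        # r^p log(r) tends to 0 as r -> 0, so guard the logarithm.
        return a_md * r**p * np.log(np.where(r > 0, r, 1.0))
    return a_md * r**p


dist = np.array([0.0, 0.5, 1.0, 2.0])
print(radial_basis(dist, m=2, d=2))  # even d: r^2 log r
print(radial_basis(dist, m=2, d=3))  # odd d: r
```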
B References
Bates, D.M., Lindstrom, M.J., Wahba, G. and Yandell, B.S. (1987). GCVPACK - Routines for generalized cross validation. Comm. Stat. Sim. Comp. 16, 263-297.
Berlin, pp. 85-100.
Nychka, D., Bailey, B., Ellner, S., Haaland, P. and O'Connell, M. (1996). FunFits: data analysis and statistical tools for estimating functions. Software and paper available from statlib.
StatSci (1993). S-PLUS Reference Manual, Vol. 2, Version 3.2. MathSoft, Inc., Seattle, WA.
Wahba, G. (1990). Spline Models for Observational Data. Society for Industrial Applied