The Aitken method
2.2 Least-squares approximation
As we have pointed out, interpolation is mainly used to find the local approximation of a given discrete set of data. In many situations in physics we need to know the global behavior of a set of data in order to understand the trend in a specific measurement or observation. A typical example is a polynomial fit to a set of experimental data with error bars.
The most common approximation scheme is based on the least squares of the differences between the approximation $p_m(x)$ and the data $f(x)$. If $f(x)$ is the data function to be approximated in the region $[a,b]$ and the approximation is an $m$th-order polynomial

$$p_m(x) = \sum_{k=0}^{m} a_k x^k, \tag{2.21}$$

we can construct a function of $a_k$ for $k = 0, 1, \dots, m$ as

$$\chi^2[a_k] = \int_a^b \left[p_m(x) - f(x)\right]^2 dx \tag{2.22}$$

for the continuous data function $f(x)$, and

$$\chi^2[a_k] = \sum_{i=0}^{n} \left[p_m(x_i) - f(x_i)\right]^2 \tag{2.23}$$

for the discrete data function $f(x_i)$ with $i = 0, 1, \dots, n$. Here we have used a generic variable $a_k$ inside a square bracket for a quantity that is a function of a set of independent variables $a_0, a_1, \dots, a_m$. This notation will be used throughout this book. Here $\chi^2$ is the conventional notation for the summation of the squares of the deviations.
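The discrete form of $\chi^2$ in Eq. (2.23) translates directly into a short loop. The following is a minimal sketch, not a method from the text: the class name `ChiSquaredSketch` and the methods `polynomial` and `chiSquared` are hypothetical names chosen for illustration, and the coefficients are assumed to be stored as `a[k]` for the power $x^k$.

```java
// Illustrative sketch only; class and method names are assumptions.
final class ChiSquaredSketch {

    // Evaluate p_m(x) = sum_k a[k] * x^k with Horner's scheme.
    static double polynomial(double[] a, double x) {
        double p = 0;
        for (int k = a.length - 1; k >= 0; --k) {
            p = p * x + a[k];
        }
        return p;
    }

    // Discrete chi^2 of Eq. (2.23): the sum over all data points
    // of the squared deviation between the polynomial and the data.
    static double chiSquared(double[] a, double[] x, double[] f) {
        if (x.length != f.length) {
            throw new IllegalArgumentException("x and f must have equal length");
        }
        double sum = 0;
        for (int i = 0; i < x.length; ++i) {
            double d = polynomial(a, x[i]) - f[i];
            sum += d * d;
        }
        return sum;
    }

    public static void main(String[] args) {
        double[] x = {0, 1, 2, 3};
        double[] f = {1, 3, 5, 7};  // made-up data lying exactly on f = 1 + 2x
        // The exact coefficients give no deviation at any point.
        System.out.println(chiSquared(new double[] {1, 2}, x, f));
        // Shifting a0 by one unit gives a deviation of -1 at each of the 4 points.
        System.out.println(chiSquared(new double[] {0, 2}, x, f));
    }
}
```

Any set of trial coefficients can be compared this way; the least-squares coefficients are the ones for which this value is smallest.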
The least-squares approximation is obtained with $\chi^2[a_k]$ minimized with respect to all the $m+1$ coefficients through

$$\frac{\partial \chi^2[a_k]}{\partial a_l} = 0, \tag{2.24}$$

for $l = 0, 1, 2, \dots, m$. The task left is to solve this set of $m+1$ linear equations to obtain all the $a_l$. This general problem of solving a linear equation set will be discussed in detail in Chapter 5. Here we consider a special case with $m = 1$, that is, the linear fit. Then we have

$$p_1(x) = a_0 + a_1 x, \tag{2.25}$$

with

$$\chi^2[a_k] = \sum_{i=0}^{n} (a_0 + a_1 x_i - f_i)^2. \tag{2.26}$$
From Eq. (2.24), we obtain

$$(n+1)a_0 + c_1 a_1 - c_3 = 0, \tag{2.27}$$

$$c_1 a_0 + c_2 a_1 - c_4 = 0, \tag{2.28}$$

where $c_1 = \sum_{i=0}^{n} x_i$, $c_2 = \sum_{i=0}^{n} x_i^2$, $c_3 = \sum_{i=0}^{n} f_i$, and $c_4 = \sum_{i=0}^{n} x_i f_i$. Solving these two equations together, we obtain

$$a_0 = \frac{c_1 c_4 - c_2 c_3}{c_1^2 - (n+1)c_2}, \tag{2.29}$$

$$a_1 = \frac{c_1 c_3 - (n+1)c_4}{c_1^2 - (n+1)c_2}. \tag{2.30}$$

We will see an example of implementing this linear approximation in the analysis of the data from the Millikan experiment in the next section. Note that this approach becomes very involved when $m$ becomes large.
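The closed-form result in Eqs. (2.29) and (2.30) can be coded in a few lines. What follows is only a sketch under stated assumptions: the class `LinearFitSketch` and the method `linearFit` are hypothetical names, not the implementation used later for the Millikan data, and the guard against a vanishing denominator is an addition that the text does not discuss.

```java
// Illustrative sketch only; class and method names are assumptions.
final class LinearFitSketch {

    // Returns {a0, a1} of p1(x) = a0 + a1*x from Eqs. (2.29) and (2.30).
    static double[] linearFit(double[] x, double[] f) {
        if (x.length != f.length || x.length < 2) {
            throw new IllegalArgumentException("need at least two (x, f) pairs");
        }
        int np = x.length;  // number of data points, n+1 in the text
        double c1 = 0, c2 = 0, c3 = 0, c4 = 0;
        for (int i = 0; i < np; ++i) {
            c1 += x[i];         // sum of x_i
            c2 += x[i] * x[i];  // sum of x_i^2
            c3 += f[i];         // sum of f_i
            c4 += x[i] * f[i];  // sum of x_i * f_i
        }
        // Common denominator of Eqs. (2.29) and (2.30).
        double d = c1 * c1 - np * c2;
        if (d == 0) {
            // Added guard: happens when all x_i coincide.
            throw new ArithmeticException("degenerate data: slope is undefined");
        }
        double a0 = (c1 * c4 - c2 * c3) / d;
        double a1 = (c1 * c3 - np * c4) / d;
        return new double[] {a0, a1};
    }

    public static void main(String[] args) {
        double[] x = {0, 1, 2, 3};
        double[] f = {1, 3, 5, 7};  // made-up data lying on f = 1 + 2x
        double[] a = linearFit(x, f);
        System.out.println("a0 = " + a[0] + ", a1 = " + a[1]);
    }
}
```

For the made-up data in `main`, the sums are $c_1 = 6$, $c_2 = 14$, $c_3 = 16$, and $c_4 = 34$, so the denominator is $36 - 56 = -20$ and the formulas return $a_0 = 1$ and $a_1 = 2$, as expected for points lying exactly on a line.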
Here we change the strategy and tackle the problem with orthogonal polynomials. In principle, we can express the polynomial $p_m(x)$ in terms of a set of orthogonal polynomials with

$$p_m(x) = \sum_{k=0}^{m} \alpha_k u_k(x), \tag{2.31}$$

where $u_k(x)$ is a set of real orthogonal polynomials that satisfy

$$\int_a^b u_k(x) w(x) u_l(x)\, dx = \langle u_k | u_l \rangle = \delta_{kl} N_k, \tag{2.32}$$

with $w(x)$ being the weight whose form depends on the specific set of orthogonal polynomials. Here $\delta_{kl}$ is the Kronecker $\delta$ function, which is 1 for $k = l$ and 0 for $k \neq l$, and $N_k$ is a normalization constant. The coefficients $\alpha_k$ can be formally related to $a_j$ by a matrix transformation, and are determined with $\chi^2[\alpha_k]$ minimized. Note that $\chi^2[\alpha_k]$ is the same quantity defined in Eq. (2.22) or Eq. (2.23) with $p_m(x)$ from Eq. (2.31). If we want the polynomials to be orthonormal, we can simply divide $u_k(x)$ by $\sqrt{N_k}$. We will also use the notation in the above equation for the discrete case with

$$\langle u_k | u_l \rangle = \sum_{i=0}^{n} u_k(x_i) w(x_i) u_l(x_i) = \delta_{kl} N_k. \tag{2.33}$$
The orthogonal polynomials can be generated with the following recursion:

$$u_{k+1}(x) = (x - g_k) u_k(x) - h_k u_{k-1}(x), \tag{2.34}$$

where the coefficients $g_k$ and $h_k$ are given by

$$g_k = \frac{\langle x u_k | u_k \rangle}{\langle u_k | u_k \rangle}, \tag{2.35}$$

$$h_k = \frac{\langle x u_k | u_{k-1} \rangle}{\langle u_{k-1} | u_{k-1} \rangle}, \tag{2.36}$$

with the starting $u_0(x) = 1$ and $h_0 = 0$. We can take the above polynomials and show that they are orthogonal regardless of whether they are continuous or discrete; they always satisfy $\langle u_k | u_l \rangle = \delta_{kl} N_k$. For simplicity, we will just consider the case with $w(x) = 1$. The formalism developed here can easily be generalized to the cases with $w(x) \neq 1$. We will have more discussions on orthogonal polynomials in Chapter 6 when we introduce special functions and Gaussian quadratures.
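As a quick check of the recursion, consider three data points $x_i = 0, 1, 2$ with $w(x) = 1$ and the discrete inner product of Eq. (2.33). These values are chosen here purely for illustration and do not come from the text. Starting from $u_0(x) = 1$ and $h_0 = 0$, the first step gives

$$\langle u_0 | u_0 \rangle = 3, \qquad g_0 = \frac{\langle x u_0 | u_0 \rangle}{\langle u_0 | u_0 \rangle} = \frac{0 + 1 + 2}{3} = 1, \qquad u_1(x) = x - 1.$$

On the grid, $u_1$ takes the values $-1, 0, 1$, so the second step gives

$$\langle u_1 | u_1 \rangle = 2, \qquad g_1 = \frac{\langle x u_1 | u_1 \rangle}{\langle u_1 | u_1 \rangle} = \frac{2}{2} = 1, \qquad h_1 = \frac{\langle x u_1 | u_0 \rangle}{\langle u_0 | u_0 \rangle} = \frac{2}{3},$$

$$u_2(x) = (x - 1)^2 - \frac{2}{3}.$$

On the grid, $u_2$ takes the values $1/3, -2/3, 1/3$, and the orthogonality can be verified directly:

$$\langle u_1 | u_0 \rangle = -1 + 0 + 1 = 0, \qquad \langle u_2 | u_0 \rangle = \frac{1}{3} - \frac{2}{3} + \frac{1}{3} = 0, \qquad \langle u_2 | u_1 \rangle = -\frac{1}{3} + 0 + \frac{1}{3} = 0.$$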
The least-squares approximation is obtained if we find all the coefficients $\alpha_k$ that minimize the function $\chi^2[\alpha_k]$. In other words, we would like to have

$$\frac{\partial \chi^2[\alpha_k]}{\partial \alpha_j} = 0 \tag{2.37}$$

and

$$\frac{\partial^2 \chi^2[\alpha_k]}{\partial \alpha_j^2} > 0, \tag{2.38}$$

for $j = 0, 1, \dots, m$. The first-order derivative of $\chi^2[\alpha_k]$ can easily be obtained. After exchanging the summation and the integration in $\partial \chi^2[\alpha_k] / \partial \alpha_j = 0$, we have

$$\alpha_j = \frac{\langle u_j | f \rangle}{\langle u_j | u_j \rangle}, \tag{2.39}$$

which ensures a minimum value of $\chi^2[\alpha_k]$, because $\partial^2 \chi^2[\alpha_k] / \partial \alpha_j^2 = 2 \langle u_j | u_j \rangle$ is always greater than zero. We can always construct a set of discrete orthogonal polynomials numerically in the region $[a,b]$. The following method is a simple example for obtaining a set of orthogonal polynomials $u_k(x_i)$ and the coefficients $\alpha_k$ for a given set of discrete data $f_i$ at data points $x_i$.
// Method to generate the orthogonal polynomials and the
// least-squares fitting coefficients.

  public static double[][] orthogonalPolynomialFit
    (int m, double x[], double f[]) {
    int n = x.length-1;
    double u[][] = new double[m+1][n+2];
    double s[] = new double[n+1];
    double g[] = new double[n+1];
    double h[] = new double[n+1];

  // Check and fix the order of the curve
    if (m>n) {
      m = n;
      System.out.println("The highest power"
        + " is adjusted to: " + n);
    }

  // Set up the zeroth-order polynomial
    for (int i=0; i<=n; ++i) {
      u[0][i] = 1;
      double stmp = u[0][i]*u[0][i];
      s[0] += stmp;
      g[0] += x[i]*stmp;
      u[0][n+1] += u[0][i]*f[i];
    }
    g[0] = g[0]/s[0];
    u[0][n+1] = u[0][n+1]/s[0];

  // Set up the first-order polynomial
    for (int i=0; i<=n; ++i) {