

2.2 Least-squares approximation

As we have pointed out, interpolation is mainly used to find the local approximation of a given discrete set of data. In many situations in physics we need to know the global behavior of a set of data in order to understand the trend in a specific measurement or observation. A typical example is a polynomial fit to a set of experimental data with error bars.

The most common approximation scheme is based on the least squares of the differences between the approximation $p_m(x)$ and the data $f(x)$. If $f(x)$ is the data function to be approximated in the region $[a, b]$ and the approximation is an $m$th-order polynomial

$$p_m(x) = \sum_{k=0}^{m} a_k x^k, \tag{2.21}$$

we can construct a function of $a_k$ for $k = 0, 1, \dots, m$ as

$$\chi^2[a_k] = \int_a^b \left[p_m(x) - f(x)\right]^2 dx \tag{2.22}$$

for the continuous data function $f(x)$, and

$$\chi^2[a_k] = \sum_{i=0}^{n} \left[p_m(x_i) - f(x_i)\right]^2 \tag{2.23}$$

for the discrete data function $f(x_i)$ with $i = 0, 1, \dots, n$. Here we have used a generic variable $a_k$ inside a square bracket for a quantity that is a function of a set of independent variables $a_0, a_1, \dots, a_m$. This notation will be used throughout this book. Here $\chi^2$ is the conventional notation for the summation of the squares of the deviations.
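The discrete sum in Eq. (2.23) can be evaluated directly once a set of trial coefficients is given. The following is a minimal sketch under stated assumptions: the class name `ChiSquare` and the method names `polynomial` and `chiSquare` are hypothetical, chosen only for illustration, and the coefficient array is assumed to be ordered as `a[0], a[1], ..., a[m]`.

```java
// Illustrative helper (names are assumptions, not from the text).
class ChiSquare {

    // Evaluate p_m(x) = sum_k a[k] x^k using Horner's scheme.
    static double polynomial(double[] a, double x) {
        double p = 0;
        for (int k = a.length - 1; k >= 0; --k) {
            p = p * x + a[k];
        }
        return p;
    }

    // Discrete chi^2 of Eq. (2.23): sum over i of [p_m(x_i) - f_i]^2.
    static double chiSquare(double[] a, double[] x, double[] f) {
        double sum = 0;
        for (int i = 0; i < x.length; ++i) {
            double d = polynomial(a, x[i]) - f[i];
            sum += d * d;
        }
        return sum;
    }
}
```

For data that lie exactly on the trial polynomial, for example `a = {1, 2}` with points `(0, 1)`, `(1, 3)`, `(2, 5)`, the sum vanishes; any deviation in the data increases it.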

The least-squares approximation is obtained with $\chi^2[a_k]$ minimized with respect to all the $m+1$ coefficients through

$$\frac{\partial \chi^2[a_k]}{\partial a_l} = 0, \tag{2.24}$$

for $l = 0, 1, 2, \dots, m$. The task left is to solve this set of $m+1$ linear equations to obtain all the $a_l$. This general problem of solving a linear equation set will be discussed in detail in Chapter 5. Here we consider a special case with $m = 1$, that is, the linear fit. Then we have

$$p_1(x) = a_0 + a_1 x, \tag{2.25}$$

with

$$\chi^2[a_k] = \sum_{i=0}^{n} (a_0 + a_1 x_i - f_i)^2. \tag{2.26}$$

From Eq. (2.24), we obtain

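$$(n+1)a_0 + c_1 a_1 - c_3 = 0, \tag{2.27}$$

$$c_1 a_0 + c_2 a_1 - c_4 = 0, \tag{2.28}$$

where $c_1 = \sum_{i=0}^{n} x_i$, $c_2 = \sum_{i=0}^{n} x_i^2$, $c_3 = \sum_{i=0}^{n} f_i$, and $c_4 = \sum_{i=0}^{n} x_i f_i$. Solving these two equations together, we obtain

$$a_0 = \frac{c_1 c_4 - c_2 c_3}{c_1^2 - (n+1)c_2}, \tag{2.29}$$

$$a_1 = \frac{c_1 c_3 - (n+1)c_4}{c_1^2 - (n+1)c_2}. \tag{2.30}$$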

We will see an example of implementing this linear approximation in the analysis of the data from the Millikan experiment in the next section. Note that this approach becomes very involved when $m$ becomes large.
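Before moving on, the closed-form linear fit of Eqs. (2.29) and (2.30) can be sketched in a few lines. This is only an illustration under stated assumptions: the class name `LinearFit` and the method name `fit` are hypothetical, the sums `c1` to `c4` follow the definitions given in the text, and the degenerate case in which the denominator vanishes (for example, all $x_i$ equal) is simply rejected.

```java
// Illustrative sketch of the m = 1 least-squares fit (names are assumptions).
class LinearFit {

    // Returns {a0, a1} for p_1(x) = a0 + a1 x, using Eqs. (2.29) and (2.30).
    static double[] fit(double[] x, double[] f) {
        int n = x.length - 1;               // data points are indexed 0..n
        double c1 = 0, c2 = 0, c3 = 0, c4 = 0;
        for (int i = 0; i <= n; ++i) {
            c1 += x[i];                     // sum of x_i
            c2 += x[i] * x[i];              // sum of x_i^2
            c3 += f[i];                     // sum of f_i
            c4 += x[i] * f[i];              // sum of x_i f_i
        }
        double d = c1 * c1 - (n + 1) * c2;  // common denominator
        if (d == 0) {
            throw new IllegalArgumentException(
                "Degenerate data: the denominator c1^2 - (n+1) c2 is zero");
        }
        double a0 = (c1 * c4 - c2 * c3) / d;
        double a1 = (c1 * c3 - (n + 1) * c4) / d;
        return new double[] {a0, a1};
    }
}
```

As a quick check, the points `(0, 1)`, `(1, 3)`, `(2, 5)`, `(3, 7)` lie on the line $1 + 2x$, and `LinearFit.fit` recovers $a_0 = 1$ and $a_1 = 2$ for them.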

Here we change the strategy and tackle the problem with orthogonal polynomials. In principle, we can express the polynomial $p_m(x)$ in terms of a set of orthogonal polynomials with

$$p_m(x) = \sum_{k=0}^{m} \alpha_k u_k(x), \tag{2.31}$$

where $u_k(x)$ is a set of real orthogonal polynomials that satisfy

$$\int_a^b u_k(x)\,w(x)\,u_l(x)\,dx = \langle u_k | u_l \rangle = \delta_{kl} N_k, \tag{2.32}$$

with $w(x)$ being the weight whose form depends on the specific set of orthogonal polynomials. Here $\delta_{kl}$ is the Kronecker $\delta$ function, which is 1 for $k = l$ and 0 for $k \neq l$, and $N_k$ is a normalization constant. The coefficients $\alpha_k$ can be formally related to $a_j$ by a matrix transformation, and are determined with $\chi^2[\alpha_k]$ minimized. Note that $\chi^2[\alpha_k]$ is the same quantity defined in Eq. (2.22) or Eq. (2.23) with $p_m(x)$ from Eq. (2.31). If we want the polynomials to be orthonormal, we can simply divide $u_k(x)$ by $\sqrt{N_k}$. We will also use the notation in the above equation for the discrete case with

$$\langle u_k | u_l \rangle = \sum_{i=0}^{n} u_k(x_i)\,w(x_i)\,u_l(x_i) = \delta_{kl} N_k. \tag{2.33}$$

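The orthogonal polynomials can be generated with the following recursion:

$$u_{k+1}(x) = (x - g_k)\,u_k(x) - h_k\,u_{k-1}(x), \tag{2.34}$$

where the coefficients $g_k$ and $h_k$ are given by

$$g_k = \frac{\langle x u_k | u_k \rangle}{\langle u_k | u_k \rangle}, \tag{2.35}$$

$$h_k = \frac{\langle x u_k | u_{k-1} \rangle}{\langle u_{k-1} | u_{k-1} \rangle}, \tag{2.36}$$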

with the starting $u_0(x) = 1$ and $h_0 = 0$. We can take the above polynomials and show that they are orthogonal regardless of whether they are continuous or discrete; they always satisfy $\langle u_k | u_l \rangle = \delta_{kl} N_k$. For simplicity, we will just consider the case with $w(x) = 1$. The formalism developed here can easily be generalized to the cases with $w(x) \neq 1$. We will have more discussions on orthogonal polynomials in Chapter 6 when we introduce special functions and Gaussian quadratures.
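The orthogonality of the polynomials produced by the recursion can be checked numerically for the discrete case with $w(x) = 1$. The sketch below is an illustration only: the class name `DiscreteOrthogonality` and the methods `dot` and `generate` are hypothetical, the data points are assumed to be distinct, and the order `m` is assumed to be smaller than the number of points so that no polynomial vanishes on all of them.

```java
// Illustrative check of Eqs. (2.33)-(2.36) with w(x) = 1 (names are assumptions).
class DiscreteOrthogonality {

    // Discrete inner product <p|q> = sum_i p[i] q[i].
    static double dot(double[] p, double[] q) {
        double s = 0;
        for (int i = 0; i < p.length; ++i) {
            s += p[i] * q[i];
        }
        return s;
    }

    // Returns u[k][i] = u_k(x_i) for k = 0..m, built with the recursion (2.34).
    static double[][] generate(int m, double[] x) {
        int points = x.length;
        double[][] u = new double[m + 1][points];
        for (int i = 0; i < points; ++i) {
            u[0][i] = 1.0;                              // u_0(x) = 1
        }
        for (int k = 0; k < m; ++k) {
            double[] xu = new double[points];           // values of x u_k(x)
            for (int i = 0; i < points; ++i) {
                xu[i] = x[i] * u[k][i];
            }
            double g = dot(xu, u[k]) / dot(u[k], u[k]); // Eq. (2.35)
            double h = 0.0;                             // h_0 = 0
            if (k > 0) {
                h = dot(xu, u[k - 1]) / dot(u[k - 1], u[k - 1]); // Eq. (2.36)
            }
            for (int i = 0; i < points; ++i) {
                double prev = (k > 0) ? u[k - 1][i] : 0.0;
                u[k + 1][i] = (x[i] - g) * u[k][i] - h * prev;
            }
        }
        return u;
    }
}
```

For the points $x_i = 0, 1, 2, 3, 4$, the recursion gives $u_1(x) = x - 2$ and $u_2(x) = (x - 2)^2 - 2$, and calling `dot(u[j], u[k])` for $j \neq k$ returns values that vanish to within rounding error.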

The least-squares approximation is obtained if we find all the coefficients $\alpha_k$ that minimize the function $\chi^2[\alpha_k]$. In other words, we would like to have

$$\frac{\partial \chi^2[\alpha_k]}{\partial \alpha_j} = 0 \tag{2.37}$$

and

$$\frac{\partial^2 \chi^2[\alpha_k]}{\partial \alpha_j^2} > 0, \tag{2.38}$$

for $j = 0, 1, \dots, m$. The first-order derivative of $\chi^2[\alpha_k]$ can easily be obtained. After exchanging the summation and the integration in $\partial \chi^2[\alpha_k]/\partial \alpha_j = 0$, we have

$$\alpha_j = \frac{\langle u_j | f \rangle}{\langle u_j | u_j \rangle}, \tag{2.39}$$

which ensures a minimum value of $\chi^2[\alpha_k]$, because $\partial^2 \chi^2[\alpha_k]/\partial \alpha_j^2 = 2\langle u_j | u_j \rangle$ is always greater than zero. We can always construct a set of discrete orthogonal polynomials numerically in the region $[a, b]$. The following method is a simple example for obtaining a set of orthogonal polynomials $u_k(x_i)$ and the coefficients $\alpha_k$ for a given set of discrete data $f_i$ at data points $x_i$.

// Method to generate the orthogonal polynomials and the
// least-squares fitting coefficients.

public static double[][] orthogonalPolynomialFit
    (int m, double x[], double f[]) {
    int n = x.length-1;
    double u[][] = new double[m+1][n+2];
    double s[] = new double[n+1];
    double g[] = new double[n+1];
    double h[] = new double[n+1];

    // Check and fix the order of the curve
    if (m>n) {
        m = n;
        System.out.println("The highest power"
            + " is adjusted to: " + n);
    }

    // Set up the zeroth-order polynomial
    for (int i=0; i<=n; ++i) {
        u[0][i] = 1;
        double stmp = u[0][i]*u[0][i];
        s[0] += stmp;
        g[0] += x[i]*stmp;
        u[0][n+1] += u[0][i]*f[i];
    }
    g[0] = g[0]/s[0];
    u[0][n+1] = u[0][n+1]/s[0];

    // Set up the first-order polynomial

for (int i=0; i<=n; ++i) {