where X can be the spectroscopic data, p is the first eigenvector of X'X, and t is the score vector, which is the projection of X onto the first eigenvector. The score is a
4.6 Acquiring the NIR spectrum
In the previous sections, the emphasis has been on arranging the NIR energy to interact with the sample being measured, choosing the spectrograph to separate the NIR energy into a linearly dispersed spectrum, and optimising the signal-to-noise ratio of the detection system. The following discusses how the dispersed NIR spectrum presented at the output of the spectrograph is collected.
4.6.1 Acquiring the NIR spectrum from the spectrograph's output
The NIR spectrum is collected by moving the PbS detector across the spectrum in discrete steps. This allows the detector to be located precisely in the spectrum by the step counter (the i unit shown in Figure 4.15). The detector is held stationary for 1 second while measuring (gated integrating) the infrared signal. The stationary interval and the long integration time have given better signal-to-noise performance, even though the detector has a response time of the order of 0.5 to 1 ms. The timing sequence of the stepping process is illustrated in the figure below.
[Figure 4.15. Labels recovered from the figure: Gated Integrator and Averager; wavelength (i)]
Fig 4.15 This illustrates the discrete scanning of a water transmittance spectrum. At the ith wavelength, the detector is stationary and integrates the incoming NIR signal. The detector is then moved to the next position in the dispersed spectrum to integrate the next, or (i+1)th, wavelength.
Near Infrared Spectroscopy Technique for Bioprocess Monitoring and Control Chapter 4
This approach provided good, stable detection but has a long spectrum collection time, because the stepper motor must be turned on and off at each discrete point of detection.
It was pointed out in section 4.2 that the NIR spectrum should be collected every 2 nm for the chosen bioprocessing application. However, because the discrete method is slow, spectral data are collected every 4 nm to reduce the spectrum scanning time. This was necessary when large volumes of sample had to be processed within time constraints during the calibration experiments (see chapter 5).
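The trade-off between point spacing and scanning time can be sketched as follows. This is only an illustration under stated assumptions: the 1000 to 2000 nm range, the 0.5 s motor move time, the Gaussian noise model and all function names are hypothetical, and the gated integrator is approximated as the mean of independent noisy readings. Only the 1 s dwell and the 2 nm and 4 nm spacings come from the text.

```python
import random


def scan_wavelengths(start_nm, stop_nm, step_nm):
    """Wavelengths visited by a discrete scan, both ends included."""
    count = (stop_nm - start_nm) // step_nm + 1
    return [start_nm + i * step_nm for i in range(count)]


def gated_average(true_signal, noise_sd, n_readings, rng):
    """Mean of n_readings noisy readings taken while the detector is stationary."""
    total = sum(rng.gauss(true_signal, noise_sd) for _ in range(n_readings))
    return total / n_readings


def discrete_scan(signal_at, wavelengths, noise_sd, n_readings, rng):
    """Visit each wavelength in turn and record one integrated value per step."""
    return [gated_average(signal_at(w), noise_sd, n_readings, rng)
            for w in wavelengths]


def scan_time_s(n_points, dwell_s, move_s):
    """One dwell per point plus one motor move between consecutive points."""
    return n_points * dwell_s + (n_points - 1) * move_s


if __name__ == "__main__":
    # Hypothetical range; the 2 nm and 4 nm spacings are the ones in the text.
    for step_nm in (2, 4):
        points = scan_wavelengths(1000, 2000, step_nm)
        seconds = scan_time_s(len(points), dwell_s=1.0, move_s=0.5)
        print(step_nm, "nm spacing:", len(points), "points,", seconds, "s")
```

Under these assumed figures, doubling the spacing from 2 nm to 4 nm reduces the scan from 501 to 251 points and roughly halves the collection time, at the cost of spectral resolution.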
4.6.2 Spectral smoothing
Spectral smoothing is applied immediately after the NIR spectrum is recorded. Although various precautions for collecting the NIR spectra have been described, smoothing can remove further noise or disturbances that occur while the spectrum is being recorded. It is primarily concerned with reducing higher-frequency ripple noise. Lower-frequency noise (e.g. instrument drift during the scan) is more difficult to suppress because it may resemble real information in the spectra. The aim of smoothing is therefore to remove as much noise as possible from the spectrum without excessively degrading important information. Two popular methods for smoothing spectra are the moving point average and the spline.
Throughout this project, all spectral smoothing uses a spline function based on the generalised cross validation (GCV) and genetic algorithm (GA) method of Thornhill (Thornhill et al, 1994). This has the advantage of choosing both the degree of the spline and its parameters, at the expense of computation time. The GCV/GA spline has provided better smoothing than the moving average or conventional cubic spline functions. The following briefly describes both
smoothing techniques, and an example is given that indicates the advantage of the GCV/GA spline over the others mentioned.
4.6.2.1 Moving point averages
Moving point average smoothing performs an average of an odd number of sequential points, replacing the centre point with the average, i.e. the spectral data $x_j$ at each wavelength $j = 1, 2, \ldots, J$ is replaced by the average of itself and its neighbouring points from $j-D$ to $j+D$.
$$x_{j,\mathrm{smoothed}} = \frac{1}{N} \sum_{k=j-D}^{j+D} x_k \qquad [4.11]$$
The denominator N in the equation is the odd number of sequential points in the window and is equal to 2D+1. The moving point average usually starts at the left end of the spectrum (the shortest wavelength) and moves to the right one data point at a time until the right end of the spectrum is reached. Because the centre point of each interval is replaced with the average, D points at each end of the spectrum are not included in the smoothed spectrum. The detailed properties of the moving point average are given by Rabiner and Gold (Martens, 1989).
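A minimal sketch of the moving point average described above, assuming the spectrum is held as a plain list of values. The function name and argument names are illustrative only; the half-width argument plays the role of D, and the window length is N = 2D+1.

```python
def moving_point_average(spectrum, half_width):
    """Smooth with a centred window of N = 2 * half_width + 1 points.

    The D = half_width points at each end have no complete window and are
    left out, so the result holds len(spectrum) - 2 * half_width values.
    """
    if half_width < 0:
        raise ValueError("half_width must be zero or positive")
    n_window = 2 * half_width + 1
    if len(spectrum) < n_window:
        raise ValueError("spectrum is shorter than the averaging window")
    smoothed = []
    # Start at the shortest wavelength and move right one data point at a time.
    for j in range(half_width, len(spectrum) - half_width):
        window = spectrum[j - half_width:j + half_width + 1]
        smoothed.append(sum(window) / n_window)
    return smoothed
```

For example, `moving_point_average([0, 0, 3, 0, 0], 1)` spreads the single spike over its neighbours and returns three values, one fewer at each end than the input.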
4.6.2.2 GCV/GA spline
The spline technique is based upon the assumption that, over small intervals, most functions can be fitted by low-degree polynomials. It therefore divides the spectrum into segments and fits a polynomial to each segment, under the restriction that the resulting combined polynomial is a continuous function. The polynomial segments join at points where all the derivatives, except the highest, are continuous. For example, a cubic spline has discontinuities only in the third derivative at the segment joints. It has been recognised that smoothing the spline approximation prevents it from following random noise in the data. An adjustable smoothness factor was suggested by Reinsch (1967). This is achieved by
balancing the fidelity of the spline to the data against its roughness, which is indicated by the values of the higher derivatives.
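As a hedged illustration for the cubic case, this balance is commonly written as a constrained minimisation. The notation is introduced here and is not taken from the thesis: g is the spline, (x_i, y_i) are the data points, \delta y_i are assumed uncertainties of the data, and S is the smoothness factor.

```latex
\min_{g} \int_{x_1}^{x_n} \left[ g''(x) \right]^2 \, dx
\quad \text{subject to} \quad
\sum_{i=1}^{n} \left( \frac{g(x_i) - y_i}{\delta y_i} \right)^2 \le S
```

The integral measures roughness through the second derivative and the sum measures fidelity to the data. A larger S permits a smoother curve that follows the data less closely.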
Craven and Wahba (1979) found that choosing the smoothness factor by generalised cross validation (GCV) eliminated the need for a fidelity constraint. The GCV technique generates a set of splines from the spectral data, with some of the spectral data left out when each spline is generated. Each spline is then used to predict the data values that were left out. The spline function used in this project, developed by Thornhill et al (Thornhill et al, 1994) for process measurement data, chooses both the degree of the spline and its parameters by minimising the GCV with a genetic algorithm (GA).
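The idea of choosing a smoothing parameter by minimising a GCV score can be sketched as below. This is not the Thornhill et al GCV/GA spline. To stay dependency-free it makes two substitutions, both assumptions of this sketch: a discrete second-difference penalty smoother stands in for the spline, and a search over a short list of candidate values stands in for the genetic algorithm. All function names are hypothetical. For a linear smoother with influence matrix A, so that the smoothed values are Ay, the score used is V = (RSS/n) / (1 - tr(A)/n)^2.

```python
import math
import random


def second_difference_penalty(n):
    """Return the n x n matrix D^T D, where D takes second differences."""
    penalty = [[0.0] * n for _ in range(n)]
    for r in range(n - 2):
        # Row r of D holds the coefficients (1, -2, 1) at columns r, r+1, r+2.
        coeffs = {r: 1.0, r + 1: -2.0, r + 2: 1.0}
        for a, ca in coeffs.items():
            for b, cb in coeffs.items():
                penalty[a][b] += ca * cb
    return penalty


def invert(matrix):
    """Invert a square matrix by Gauss-Jordan elimination with pivoting."""
    n = len(matrix)
    aug = [row[:] + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0.0:
                factor = aug[r][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def smooth_and_gcv(y, lam):
    """Smooth y with penalty weight lam; return (smoothed values, GCV score)."""
    n = len(y)
    if n < 3 or lam <= 0.0:
        raise ValueError("need at least 3 points and lam > 0")
    penalty = second_difference_penalty(n)
    system = [[(1.0 if i == j else 0.0) + lam * penalty[i][j]
               for j in range(n)] for i in range(n)]
    hat = invert(system)  # influence matrix A, so that smoothed = A y
    smoothed = [sum(hat[i][j] * y[j] for j in range(n)) for i in range(n)]
    rss = sum((yi - si) ** 2 for yi, si in zip(y, smoothed))
    trace = sum(hat[i][i] for i in range(n))
    gcv = (rss / n) / ((1.0 - trace / n) ** 2)
    return smoothed, gcv


def choose_lambda(y, candidates):
    """Return the candidate lam with the smallest GCV score."""
    return min(candidates, key=lambda lam: smooth_and_gcv(y, lam)[1])


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the synthetic example is repeatable
    noisy = [math.sin(0.2 * i) + rng.gauss(0.0, 0.1) for i in range(30)]
    print("lam chosen by GCV:", choose_lambda(noisy, [0.1, 1.0, 10.0, 100.0]))
```

The dense matrix inversion is adequate only for short spectra. A production implementation would exploit the banded structure of the system, and a search such as the GA described in the text would also vary the degree of the spline, which this sketch does not attempt.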