
2.3 ANN Applications on Data Interpretation

2.3.1 Classification of Spectra using ANN’s

The classification of different types of spectra using AI techniques such as ANN's has been explored in various ways, particularly for pattern recognition purposes.

Anand et. al. (199j>) addressed the problem of analysing images containing multiple sparse overlapping patterns from nuclear magnetic resonance (NMR) spectra. It is a naturally occurring problem in the analysis of the composition of organic macromolecules using data gathered from their NMR spectra. By using a feedforward MLP BP [Rumelhart 1986] network approach, they were able to achieve accurate classification in discriminating between pairs of feature classes. Their results were excellent in analysing the presence of various amino acids in protein molecules. They achieved high correct classification (~ 87%) for images containing as many as five substantially distorted overlapping patterns.

Boger et. al. (1994) used ANN's to derive quantitative information from Ion Mobility Spectra (IMS). Unlike conventional methods, where areas or heights of known and identified peaks are used for calibration, employing ANN's does not require detailed knowledge of the ion chemistry of the measured system. There were limitations with regard to the long training times incurred when the input data space was of large dimension. This was rectified with specific data preprocessing, and usually about 80% of the data was randomly chosen for the training set, with the remaining 20% left for testing the trained ANN. The conjugate gradient error BP method was employed to train the network. The network was able to interpret the mobility spectra and concentration for three chemicals in air, at concentrations in the parts per billion to parts per million range. The agreement between the calculated concentrations and the known values determined from the parameters of the dynamic flow system was within a few percent in all cases.
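The random 80%/20% partition described above can be sketched in a few lines. This is only an illustration under stated assumptions: the function name `split_train_test`, the fixed seed and the placeholder spectrum labels are hypothetical and are not taken from Boger et. al.

```python
import random


def split_train_test(samples, train_fraction=0.8, seed=0):
    """Randomly partition samples into a training set and a test set."""
    rng = random.Random(seed)      # seeded so that the split is reproducible
    shuffled = list(samples)       # copy, so the caller's list is left untouched
    rng.shuffle(shuffled)
    n_train = round(train_fraction * len(shuffled))
    return shuffled[:n_train], shuffled[n_train:]


# Hypothetical placeholder records standing in for preprocessed IMS spectra.
spectra = [f"spectrum_{i}" for i in range(50)]
train_set, test_set = split_train_test(spectra)
print(len(train_set), len(test_set))
```

Every record ends up in exactly one of the two sets, so the test set remains unseen during training.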

Once the learning set had been processed by the ANN, correlating measured test points with the real concentration of the chemical was straightforward and rapid. Application of ANN's online was deemed a distinct possibility for IMS instruments.

Lohninger et. al. (1992) compared the performance of ANN's with well-established methods of multivariate data analysis in classifying mass spectral data. They employed the popular BP MLP network and compared its performance with that of linear discriminant analysis. MLP's can be converted to linear discriminant classifiers by introducing linear output functions on the processing units. Little difference in classification performance was found, which could be attributed to the fact that all of the mass spectral data sets were linearly separable. Had the data sets not been linearly separable, MLP networks would have been expected to give better results. Lohninger et. al. noted that, from their experience with mass spectra, they suspected that only a few instances of non-linearly separable classes actually exist. Future work was aimed at shedding light on the linear separability of mass spectral data.
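The reason an MLP with linear units behaves as a linear classifier can be checked numerically: two successive linear layers collapse into a single linear map. The sketch below assumes that every unit is linear, and the weights and layer sizes are arbitrary illustrative values, not those of Lohninger et. al.

```python
def matvec(W, x):
    """Multiply matrix W (a list of rows) by vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def matmul(A, B):
    """Multiply matrix A by matrix B (both given as lists of rows)."""
    cols = list(zip(*B))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols] for row in A]


def add(u, v):
    """Element-wise sum of two vectors."""
    return [a + b for a, b in zip(u, v)]


# Hypothetical weights: 3 inputs -> 2 hidden units -> 1 output, all units linear.
W1 = [[0.5, -1.0, 2.0], [1.5, 0.25, -0.5]]
b1 = [0.1, -0.2]
W2 = [[2.0, -1.0]]
b2 = [0.3]


def two_layer_linear(x):
    hidden = add(matvec(W1, x), b1)        # no nonlinearity is applied
    return add(matvec(W2, hidden), b2)


# Equivalent single layer: W_eq = W2 W1 and b_eq = W2 b1 + b2.
W_eq = matmul(W2, W1)
b_eq = add(matvec(W2, b1), b2)


def single_layer(x):
    return add(matvec(W_eq, x), b_eq)


x = [1.0, 2.0, -1.0]
print(two_layer_linear(x), single_layer(x))
```

The two functions agree up to floating-point rounding, so with linear units the hidden layer adds no expressive power beyond a linear discriminant.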

Glick et. al. (1991) constructed ANN's for classifying metal alloys based on their elemental constituents. Glow-discharge atomic emission spectra (GD-AES) obtained with a PDA (photodiode array) spectrometer were used in a multivariate calibration of seven elements in 37 different nickel-based (Ni-based) alloys and 15 iron-based (Fe-based) alloys. Subsets of the two major classes (i.e. Ni-based and Fe-based alloys) formed calibration sets for multiple linear regression (MLR). The remaining samples were used to validate the calibration models.

The input units consisted of the seven elemental concentrations of the sample alloy, and the number of output units was nine (each representing a single class of alloy). The ANN was trained by BP using reference elemental concentrations from a training set of 32 alloys. The final architecture of the feedforward network was 7:20:9 (i.e. 7 input, 20 hidden and 9 output units). A nonlinear sigmoidal function was used as the transfer function in the hidden and output layers.
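A forward pass through a 7:20:9 network with sigmoidal hidden and output units might look like the sketch below. The random, untrained weights, the bias terms, the helper names and the sample concentrations are all assumptions made for illustration; the trained weights and data of Glick et. al. are not reproduced here.

```python
import math
import random


def sigmoid(z):
    """Logistic transfer function."""
    return 1.0 / (1.0 + math.exp(-z))


def init_layer(n_in, n_out, rng):
    """Each unit holds n_in weights followed by one bias (small random values)."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)] for _ in range(n_out)]


def layer_forward(layer, inputs):
    """Weighted sum plus bias for each unit, passed through the sigmoid."""
    return [
        sigmoid(unit[-1] + sum(w * x for w, x in zip(unit[:-1], inputs)))
        for unit in layer
    ]


rng = random.Random(0)
hidden_layer = init_layer(7, 20, rng)    # 7 inputs  -> 20 hidden units
output_layer = init_layer(20, 9, rng)    # 20 hidden -> 9 output units


def classify(concentrations):
    """Return the index of the most active output unit and all nine activations."""
    hidden = layer_forward(hidden_layer, concentrations)
    outputs = layer_forward(output_layer, hidden)
    return outputs.index(max(outputs)), outputs


# Hypothetical elemental concentrations (as fractions) for one alloy sample.
sample = [0.60, 0.20, 0.10, 0.05, 0.03, 0.01, 0.01]
predicted_class, outputs = classify(sample)
print(predicted_class, len(outputs))
```

With untrained weights the predicted class is meaningless; BP training would adjust the weights so that the output unit for the correct alloy class becomes the most active.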

Once training of the network was completed, the trained ANN functioned as a pattern classifier. The MLR with a selection of diodes demonstrated a superior ability over conventional calibration to construct useful calibration models even for complex spectra and in the presence of spectral interferences. This is called a limited-variable approach and is more practical in this setting than a whole-spectrum approach since the spectra are relatively sparse and are composed of only a few analytical lines. With the addition of noise to the input patterns during network training, the network was still able to generalise and assign unknown alloys to the appropriate class.
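Adding noise to the input patterns during training can be sketched as follows. The source does not state the noise distribution or its magnitude, so the zero-mean Gaussian noise, the value of `sigma` and the sample pattern used here are assumptions.

```python
import random


def add_noise(pattern, sigma, rng):
    """Return a copy of the pattern with zero-mean Gaussian noise on each input."""
    return [x + rng.gauss(0.0, sigma) for x in pattern]


rng = random.Random(1)

# Hypothetical clean input pattern: seven elemental concentrations as fractions.
clean = [0.60, 0.20, 0.10, 0.05, 0.03, 0.01, 0.01]

# Each training presentation would use a freshly perturbed copy of the pattern.
noisy_copies = [add_noise(clean, sigma=0.01, rng=rng) for _ in range(5)]
for copy in noisy_copies:
    print([round(v, 3) for v in copy])
```

Presenting perturbed copies in place of the exact pattern discourages the network from memorising individual training examples.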

The classification of metal alloys is not necessarily a problem that requires this feature of ANN's, but the report by Glick et. al. demonstrates some of the characteristics of ANN's applied to sample classification problems, whether linearly separable or not. Here, the ANN approach performed better than K-nearest neighbour (K-NN) with k=1, which misclassified three of the testing-set alloys with the determined values, and performed better than linear discriminant analysis [Bishop 1995]. The advantage of the ANN is that it can generalise from a limited set of examples, as demonstrated by the small data set used in Glick et. al.'s study. Since the network training process is empirical, any application of ANN's must undergo several training runs to create a more robust classifier.
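For reference, the K-NN baseline with k=1 assigns an unknown sample the label of its single closest training example. The sketch below assumes Euclidean distance and uses a two-feature toy data set; neither is taken from Glick et. al.

```python
import math


def nearest_neighbour(train, query):
    """1-NN: return the label of the training example closest to the query."""
    _, label = min(train, key=lambda item: math.dist(item[0], query))
    return label


# Hypothetical training set: (features, class label) pairs.
train = [
    ([0.75, 0.05], "Ni-based"),
    ([0.65, 0.15], "Ni-based"),
    ([0.10, 0.70], "Fe-based"),
    ([0.05, 0.80], "Fe-based"),
]

print(nearest_neighbour(train, [0.70, 0.10]))
print(nearest_neighbour(train, [0.20, 0.60]))
```

Because the decision rests on one stored example, a single atypical training sample can cause a misclassification, which is one way in which such a baseline can fall behind a trained network.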

Automatic analysis of JET charge exchange spectra (CXRS) using an MLP network was carried out by Bishop et. al. (1993). Their aim was to reduce or eliminate human supervision and to achieve rapid processing.

CXRS is a powerful technique for analysis of ion temperatures, densities, and rotation velocities in plasma physics experiments. The spectral lines exhibit Doppler broadening due to finite temperature and a wavelength shift caused by bulk plasma rotation, resulting in overlapping spectral lines. The conventional analysis of CXRS involves a least-squares fitting process which, although yielding good accuracy, has two drawbacks: (i) it needs computationally intensive iteration; and (ii) it cannot be used in real-time feedback applications. In the light of these shortcomings, the MLP network proves much faster in processing new data once the network training (on a suitable set of training data) is complete. The CXRS data set needs to be sufficiently large to account for new spectra likely to be encountered during processing and to allow for division into training and test sets. Also, the network training must be 'overdetermined', i.e. the number of training examples should be sufficiently large in relation to the number of degrees of freedom in the network.
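The 'overdetermined' condition can be made concrete by counting the adjustable weights and biases of a fully connected network and comparing that count with the number of training examples. The layer sizes and the number of examples below are hypothetical, and the count assumes one bias per unit.

```python
def count_parameters(layer_sizes):
    """Number of weights and biases in a fully connected feedforward network."""
    return sum(
        (n_in + 1) * n_out                  # the +1 accounts for each unit's bias
        for n_in, n_out in zip(layer_sizes, layer_sizes[1:])
    )


# Hypothetical architecture: 40 inputs, 10 hidden units, 3 outputs.
layer_sizes = [40, 10, 3]
n_parameters = count_parameters(layer_sizes)

# Hypothetical number of training spectra.
n_examples = 5000
examples_per_parameter = n_examples / n_parameters

print(n_parameters, round(examples_per_parameter, 1))
```

A ratio well above one indicates more training examples than free parameters; how large the ratio needs to be in practice is not specified in the text above.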

Two key issues that Bishop et. al. addressed by applying ANN's were:

1. The need to preprocess data to reduce its dimensionality (dimensional space) - a technique called feature extraction. They split the large dimensional data space into subsets of input data, and used several networks, each dealing with sub-sections of the spectrum. This provided an alternative to standard PCA. The values of the features were taken as inputs to the network, which was then trained in the usual fashion. Selection of the most appropriate features was the important problem, as it had a direct bearing on the performance of the complete system.

2. The capability of the network to adequately deal with data which was significantly different from that on which it had been trained i.e. generalisation. For an automatic system it would have been necessary to provide some form of validation to ensure that the network outputs were satisfactory.
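The idea in item 1 of giving each network only a sub-section of the spectrum can be sketched as a simple partition of the input channels. The contiguous, near-equal windows and the function name `split_into_windows` are assumptions; the source does not describe how Bishop et. al. chose their sub-sections.

```python
def split_into_windows(spectrum, n_windows):
    """Split a spectrum into contiguous windows of near-equal length."""
    base, extra = divmod(len(spectrum), n_windows)
    windows, start = [], 0
    for i in range(n_windows):
        size = base + (1 if i < extra else 0)   # spread the remainder over the first windows
        windows.append(spectrum[start:start + size])
        start += size
    return windows


# Hypothetical 10-channel spectrum; each window would feed its own network.
spectrum = [float(i) for i in range(10)]
windows = split_into_windows(spectrum, 3)
print([len(w) for w in windows])
```

Each network then sees far fewer inputs than the full spectrum, which reduces the number of weights to be trained per network.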

Bishop et. al.'s ANN approach to CXRS JET plasma analysis was a complementary tool to conventional iterative techniques and could be used for high speed and very high accuracy analyses.

Having illustrated the varied automated interpretation of different types of spectra using ANN's, there is a selection of applications that have used ANN's in attempts to emphasise the use of the MLP feedforward network as a successful analytical tool. Comparisons of other machine learning algorithms with the popular BP algorithm will also illustrate some of the reasons that have prompted research into explaining the internal distributed representation of a trained ANN.