3.5 Pattern Classification
3.5.1 Artificial Neural Network
An artificial neural network (ANN) is an information-processing system that attempts to imitate biological neural networks (Sivanandam & Deepa, 2006). Neural networks were developed to overcome the limited ability of conventional computers to perform certain tasks. These tasks, such as reading a handwritten document or recognizing a face, may seem simple for human beings but are difficult for even the most advanced computers (Abdul-Kareem et al., 2000).
The application of ANNs in biomedicine has attracted considerable interest from researchers because of their ability to perform nonlinear data processing with relatively simple algorithms (Cohen & Hudson, 1999).
An ANN is composed of a large number of highly interconnected processing elements, called neurons, working together to solve a specific problem (Deepa, 2006). Figure 3.15 depicts a common hierarchical ANN architecture composed of several layers. The neurons are organized along these layers, and the layers are connected to one another. The network is linked to the outside environment through the neurons of the input and output layers.
Figure 3.15: Neural Network (Verma and Blumenstein, 2008)
An artificial neuron is the basic component and fundamental unit that performs a simple mathematical operation on its inputs and imitates the functions of a biological neuron and its unique process of learning (Hayati & Shirvany, 2007). An artificial neuron receives multiple inputs and calculates its output which corresponds to the impulse frequency of a real neuron.
An ANN performs its processing by accepting inputs, $x_i$, which are multiplied by a set of weights, $w_i$. Each neuron then nonlinearly transforms the sum of the weighted inputs into an output value, $y$, by means of an activation function, as illustrated in Equation 3.50. Figure 3.16 shows the basic architecture of an artificial neuron.
Figure 3.16: Basic Architecture of Artificial Neuron (Negnevitsky, 2005)
$$y = g\left(\sum_{i=1}^{n} x_i w_i\right) \qquad (3.50)$$
The output of a neuron, $y$, thus depends on the neuron’s inputs and on its activation function. Sometimes a bias is also added to the network. The bias is then regarded as a weight with a constant input of 1 (Fausett, 1994; Negnevitsky, 2005). There are many kinds of neuron activation functions, such as the logistic function, the hyperbolic-tangent function, and the sigmoid function, of which the sigmoid function is the most widely used (Zhen-Zhen & Su-Yu, 2012).
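The neuron computation described above can be sketched in a few lines of Python. This is a minimal illustration only, not the implementation used in this research: the function names `sigmoid` and `neuron_output` and the numeric values are assumptions introduced for the example, and the sigmoid is chosen simply because the text names it as the most widely used activation function.

```python
import math

def sigmoid(x):
    """Logistic sigmoid activation, mapping any real x into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def neuron_output(inputs, weights, bias=0.0, activation=sigmoid):
    """Weighted sum of the inputs passed through an activation function.

    The bias is treated as one more weight whose input is fixed at 1.
    """
    weighted_sum = sum(w * x for w, x in zip(weights, inputs)) + bias * 1.0
    return activation(weighted_sum)

# Illustrative values only (hypothetical, not taken from the study).
y = neuron_output(inputs=[0.5, -1.0], weights=[0.8, 0.2], bias=0.1)
```

Here the weighted sum is 0.5(0.8) + (-1.0)(0.2) + 0.1 = 0.3, so the neuron output is the sigmoid of 0.3, a value strictly between 0 and 1.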
3.5.1.1 Multilayer Perceptron Feed-Forward Neural Network
Various ANN architectures are used for classification or prediction. One of the most common is the multilayer perceptron feed-forward neural network (MLP-NN). In an MLP-NN, the connections between neurons are unidirectional: the information being processed passes through the input layer, to the hidden layer(s), and then to the output layer. An example of an MLP-NN with two hidden layers is shown in Figure 3.17.
Figure 3.17: MLP-NN with Two Hidden Layers (Negnevitsky, 2005)
Several works have implemented the MLP-NN for blast cell classification, such as those by Scotti (2005) and Mohapatra et al. (2013).
The design of a typical MLP-NN consists of an input layer, at least one hidden layer, and an output layer. Theoretically, there is no specific limit on the number of hidden layers. However, in most cases, one or two hidden layers are adequate (Bishop, 1995).
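The layered, unidirectional flow of information can be sketched as a forward pass in Python. This is an illustrative sketch under stated assumptions: the helper names `layer_forward` and `mlp_forward`, the 2-3-2-1 layer sizes, and all weight values are hypothetical, and every neuron is assumed to use a sigmoid activation.

```python
import math

def sigmoid(x):
    """Logistic sigmoid activation."""
    return 1.0 / (1.0 + math.exp(-x))

def layer_forward(inputs, weights, biases):
    """Output of one fully connected layer: one sigmoid neuron per weight row."""
    return [
        sigmoid(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]

def mlp_forward(inputs, layers):
    """Propagate the inputs through each (weights, biases) layer in order."""
    signal = inputs
    for weights, biases in layers:
        signal = layer_forward(signal, weights, biases)
    return signal

# Hypothetical 2-3-2-1 network: two inputs, two hidden layers, one output.
layers = [
    ([[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]], [0.0, 0.0, 0.0]),  # first hidden layer
    ([[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]], [0.0, 0.0]),         # second hidden layer
    ([[0.7, -0.7]], [0.0]),                                   # output layer
]
output = mlp_forward([1.0, 0.0], layers)
```

Adding or removing a hidden layer only changes the length of the `layers` list, which reflects the point that the number of hidden layers is a design choice rather than a fixed property of the architecture.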
The MLP-NN commonly uses the back-propagation (BP) supervised learning rule (Jiang et al., 2010) to iteratively adjust the weights and bias values of each neuron in the network. The back-propagation learning method is implemented through the delta rule, which is a gradient descent learning rule for updating the weights of the artificial neurons in a single-layer perceptron. For a neuron $j$ with activation function $g(x)$, the delta rule for its $i$-th weight, $w_{ji}$, is given by Equation 3.51 (Sivanandam & Deepa, 2006).
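$$\Delta w_{ji} = \alpha \,(t_j - y_j)\, g'(h_j)\, x_i \qquad (3.51)$$
where $\alpha$ is a small constant called the learning rate, $g(x)$ is the neuron's activation function, $t_j$ is the target output, $h_j$ is the sum of the products of the weights and the inputs, $y_j$ is the actual output, and $x_i$ is the $i$-th input. It holds that $h_j = \sum_i x_i w_{ji}$ and $y_j = g(h_j)$.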
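As a rough illustration of the delta rule, the following Python sketch applies repeated weight updates to a single sigmoid neuron. The function name `delta_rule_update`, the input values, the target, and the learning rate are all assumptions made for the example. The sigmoid is assumed as the activation function, so its derivative is $g'(h) = g(h)(1 - g(h))$.

```python
import math

def sigmoid(h):
    """Logistic sigmoid activation g(h)."""
    return 1.0 / (1.0 + math.exp(-h))

def sigmoid_prime(h):
    """Derivative of the sigmoid, g'(h) = g(h) * (1 - g(h))."""
    s = sigmoid(h)
    return s * (1.0 - s)

def delta_rule_update(weights, inputs, target, learning_rate=0.1):
    """Return the weights after one delta-rule step for a single neuron."""
    h = sum(w * x for w, x in zip(weights, inputs))  # weighted sum of the inputs
    y = sigmoid(h)                                   # actual output
    # Weight change: learning_rate * (target - output) * g'(h) * input
    scale = learning_rate * (target - y) * sigmoid_prime(h)
    return [w + scale * x for w, x in zip(weights, inputs)]

# Hypothetical example: drive the neuron's output toward a target of 0.9.
weights = [0.0, 0.0]
inputs = [1.0, 0.5]
for _ in range(1000):
    weights = delta_rule_update(weights, inputs, target=0.9, learning_rate=0.5)
final_output = sigmoid(sum(w * x for w, x in zip(weights, inputs)))
```

Each update moves the weights in the direction that reduces the gap between the target and the actual output, and the step shrinks as that gap closes.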
The training of an MLP-NN with back-propagation is an iterative process carried out in two phases. In the first phase, the input is propagated forward to the output unit, where the error of the network is measured. The error is usually defined as the squared difference between the output and the target, as illustrated in Equation 3.52 (Henseler, 1995; Sivanandam & Deepa, 2006).
$$E_p = (t_p - y_p)^2, \quad \text{for each output pattern } p \qquad (3.52)$$
In the second phase, the error is propagated backward through the network and used to adapt the connection weights. Testing, by contrast, is carried out by presenting new, unseen input features to the trained network and obtaining its predicted output.
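The two training phases can be sketched in Python for a small network with one hidden layer. This is a minimal sketch under stated assumptions, not the network used in this research: the function names `forward` and `train_epoch`, the toy logical-OR data, the three hidden neurons, the learning rate, and the fixed random seed are all hypothetical choices. Sigmoid activations are assumed throughout, weights are updated after every sample, and the constant factor from differentiating the squared error is absorbed into the learning rate.

```python
import math
import random

def sigmoid(h):
    """Logistic sigmoid activation."""
    return 1.0 / (1.0 + math.exp(-h))

def forward(x, w_hidden, w_out):
    """Phase 1: propagate the input forward (last weight in each row is the bias)."""
    hidden = [sigmoid(sum(w * v for w, v in zip(row, x + [1.0]))) for row in w_hidden]
    output = sigmoid(sum(w * v for w, v in zip(w_out, hidden + [1.0])))
    return hidden, output

def train_epoch(samples, w_hidden, w_out, learning_rate):
    """One pass over the data; returns the summed squared error seen during the pass."""
    total_error = 0.0
    for x, target in samples:
        hidden, output = forward(x, w_hidden, w_out)
        total_error += (target - output) ** 2
        # Phase 2: propagate the error backward, then adapt the weights in place.
        delta_out = (target - output) * output * (1.0 - output)
        delta_hidden = [delta_out * w_out[j] * hidden[j] * (1.0 - hidden[j])
                        for j in range(len(hidden))]
        for j, v in enumerate(hidden + [1.0]):
            w_out[j] += learning_rate * delta_out * v
        for j, row in enumerate(w_hidden):
            for i, v in enumerate(x + [1.0]):
                row[i] += learning_rate * delta_hidden[j] * v
    return total_error

random.seed(0)  # fixed seed so the sketch is repeatable
n_inputs, n_hidden = 2, 3
w_hidden = [[random.uniform(-0.5, 0.5) for _ in range(n_inputs + 1)]
            for _ in range(n_hidden)]
w_out = [random.uniform(-0.5, 0.5) for _ in range(n_hidden + 1)]
# Toy data (logical OR), used only to exercise the two phases.
samples = [([0.0, 0.0], 0.0), ([0.0, 1.0], 1.0), ([1.0, 0.0], 1.0), ([1.0, 1.0], 1.0)]
errors = [train_epoch(samples, w_hidden, w_out, learning_rate=0.5) for _ in range(2000)]
```

After training, calling `forward` on an input that was not used for weight updates corresponds to the testing process: the weights are held fixed and only the first phase is executed.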
The MLP-NN architecture is particularly suitable for applications in medical imaging where the inputs and outputs are numerical and pairs of input/output vectors provide a clear basis for training in a supervised manner (Jiang et al., 2010). It is claimed to be the most common, most competent, and the most efficient model (Fausett, 1994).
Several MLP-NN parameters need to be optimized in order to obtain the network structure that gives the best testing performance for the phenomenon under study. These include the number of hidden layers, the number of neurons in each hidden layer, the number of training cycles (epochs), and the learning rate. These parameters are discussed in more detail in Chapter 4, Section 4.5. The final MLP-NN architecture used in this research to classify blast cells is presented in Chapter 7, Section 7.5.