In chapter 4, multiple neural network solutions are put forward as powerful ways of enhancing neural network performance in complex pattern recognition problems. Multiple network architectures consist of a number of component neural networks which require little inter-processor communication, and which cooperate or compete to solve a problem. In chapter 7, a Hopfield/Backpropagation architecture is presented as an example of such systems. In this chapter, two new architectures are put forward, simulated, and mapped onto parallel processors, using MATLIB and NETLIB functions. The objective of these simulations is to enhance neural network performance through parallelism, both by achieving efficient parallel executions and by designing powerful hybrid systems.
8.4.1. Cooperating SOM/Backpropagation Networks
This simulation demonstrates the cooperation of the SOM and the Backpropagation models, and their execution on parallel processors. The SOM is an unsupervised algorithm which clusters patterns into a number of classes. It can detect salient features in input patterns and group them in topologically close nodes on the output grid. The Backpropagation model, on the other hand, operates on pairs of input and target patterns, and builds an internal representation which enables the network to produce nonlinear mappings between inputs and targets. The two networks can be used in cooperation, complementing each other in pattern classification tasks (Figure 8.17). In the hybrid architecture, the SOM acts as a front-end feature detector, filtering inputs to the Backpropagation network, which is trained to take appropriate action for the patterns filtered by the SOM. In this case, Backpropagation auto-associates noisy input patterns with the original target patterns. The SOM network receives the noisy input patterns and clusters them into a number of classes; the vectors representing the class centres are then used to train the Backpropagation network for an auto-associative recall of the targets. This modular, multi-network configuration has a number of advantages. Firstly, the architecture exploits the strengths of both models: the SOM as a pattern pre-processor, and Backpropagation as a nonlinear pattern mapping device. Secondly, the noise filtering carried out by the SOM facilitates and speeds up the training of the Backpropagation network.
[Figure 8.17: two panels, "Sequential Execution" and "Pipelined Execution"; labelled components: SOM, BP, Sun4 workstations, SCHEDULER, Ethernet]
Figure 8.17. Cooperating SOM/Backpropagation Networks
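The front-end role of the SOM can be illustrated with a minimal sketch. It is not the MATLIB representation used in the thesis: the function names (`winner`, `train_som`, `add_noise`), the one-dimensional output grid, the linear decay of learning rate and neighbourhood radius, and the bit-flip noise model are all assumptions made for illustration. The sketch clusters noisy patterns and returns a weight matrix and a winners table, from which the class-centre vectors for Backpropagation training are taken.

```python
import random

def winner(weights, x):
    """Index of the node whose weight vector is closest to x (squared Euclidean)."""
    dists = [sum((w - xi) ** 2 for w, xi in zip(node, x)) for node in weights]
    return dists.index(min(dists))

def train_som(patterns, n_nodes, epochs=50, rate=0.5, seed=0):
    """Train a 1-D SOM (assumed layout); return (weight matrix, winners table)."""
    rng = random.Random(seed)
    dim = len(patterns[0])
    weights = [[rng.random() for _ in range(dim)] for _ in range(n_nodes)]
    for epoch in range(epochs):
        frac = 1.0 - epoch / epochs                  # decays linearly from 1 towards 0
        radius = int(round(frac * (n_nodes // 2)))   # shrinking neighbourhood
        for x in patterns:
            win = winner(weights, x)
            lo, hi = max(0, win - radius), min(n_nodes - 1, win + radius)
            for j in range(lo, hi + 1):              # move the winner and its neighbours towards x
                for k in range(dim):
                    weights[j][k] += rate * frac * (x[k] - weights[j][k])
    winners = [winner(weights, x) for x in patterns]
    return weights, winners

def add_noise(pattern, flips, rng):
    """Flip a few randomly chosen binary components (assumed noise model)."""
    noisy = list(pattern)
    for i in rng.sample(range(len(noisy)), flips):
        noisy[i] = 1.0 - noisy[i]
    return noisy

rng = random.Random(1)
# Illustrative sizes: 12 random binary patterns of 64 components, 12 SOM nodes.
targets = [[float(rng.random() < 0.5) for _ in range(64)] for _ in range(12)]
noisy = [add_noise(t, 4, rng) for t in targets]
weights, winners = train_som(noisy, n_nodes=12)
# Class-centre vectors that would be paired with the targets for Backpropagation training.
bp_inputs = [weights[w] for w in winners]
```

Each entry of `bp_inputs` is the weight vector of the node that won the corresponding noisy pattern, so the Backpropagation network would be trained on filtered class centres instead of the raw noisy inputs.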
The same pattern recognition problem is used as the application. For this application, the SOM has 64 input and 12 output nodes, and the three-layered Backpropagation network has 64 input, 12 hidden and 64 output units. A total of 12 patterns is presented to the combined architecture. The following steps have been taken in the simulation. Firstly, a flat listing of the MATLIB definition of the SOM/BP algorithm has been written. This listing is processed by the computational analysis tool (CAT), which identifies data-flow paths, variable dependencies and computational costs. The analysis reveals a cutting point between the SOM and BP algorithms, suitable for a two-processor pipeline. By dividing the representation into two sections at this point, a pipeline is organised. The partitioned MATLIB representation is then parallelised on a three-processor configuration of SUN4 workstations. The first SUN workstation is used as the scheduler, which opens a socket and waits for data transfer requests. The SOM is mapped onto the second SUN workstation, which trains on the noisy inputs and transmits the weight matrix and the winners table to the third SUN workstation. The third workstation runs the Backpropagation model, training on the inputs it receives from the SOM and on the targets, which are held locally.
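The two-stage organisation described above can be sketched as follows. This is only an illustration of the data flow across the cutting point, under several assumptions: threads stand in for the two worker workstations, a `queue.Queue` stands in for the socket connection, the scheduler is omitted, and both stage bodies are placeholders, not the SOM and Backpropagation algorithms themselves. All function names are hypothetical.

```python
import queue
import threading

def som_stage(noisy_inputs, out_q):
    """Stage 1 stand-in: 'train' the SOM, then send the weight matrix and winners table."""
    # Placeholder for SOM training: each pattern is treated as its own class centre.
    weights = [list(p) for p in noisy_inputs]
    winners = list(range(len(noisy_inputs)))
    out_q.put((weights, winners))

def bp_stage(in_q, targets, results):
    """Stage 2 stand-in: receive the SOM output and pair it with the local targets."""
    weights, winners = in_q.get()    # blocks until stage 1 has sent its data
    # Placeholder for Backpropagation training: build the (input, target) training pairs.
    results.extend((weights[w], t) for w, t in zip(winners, targets))

def run_pipeline(noisy_inputs, targets):
    link = queue.Queue()             # stands in for the link between the two stages
    results = []
    stages = [
        threading.Thread(target=som_stage, args=(noisy_inputs, link)),
        threading.Thread(target=bp_stage, args=(link, targets, results)),
    ]
    for s in stages:
        s.start()
    for s in stages:
        s.join()
    return results

pairs = run_pipeline([[0.0, 1.0], [1.0, 0.0]], [[0, 1], [1, 0]])
```

The only data crossing the cutting point are the weight matrix and the winners table, which is what keeps the communication cost of the pipeline low.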
The simulation results for the pipelined execution on 3 SUN4s show an improvement in performance over the sequential execution. The parallel pipelined execution takes only 43 seconds, as opposed to 66 seconds (1 min 6 s) for the sequential execution. Considering that the first processor acts only as the server, the results correspond to a speed-up of approximately 1.5 on the two-processor parallel architecture.
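The speed-up figure follows directly from the two timings above. As a worked check, with $T_{\mathrm{seq}}$ and $T_{\mathrm{par}}$ as assumed symbols for the sequential and pipelined execution times, and with the parallel efficiency $E$ added here as a standard derived measure which the text itself does not report:

```latex
S = \frac{T_{\mathrm{seq}}}{T_{\mathrm{par}}} = \frac{66\ \mathrm{s}}{43\ \mathrm{s}} \approx 1.53,
\qquad
E = \frac{S}{p} \approx \frac{1.53}{2} \approx 0.77
\quad (p = 2 \text{ worker processors, scheduler excluded})
```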
Using the CAT results to identify the cutting point, the Automatic Parallel Mapper (APM) is able to generate parallel pipelined code automatically for this configuration, with a near-optimum cutting point on the MATLIB representation. See Appendix D.4 for CAT and APM results.
8.4.2. Competing Backpropagation Networks
Another method of enhancing neural network performance through parallelism is to implement network level competition. Simulations in this section aim to achieve optimal neural network designs, exploiting computational methods on parallel hardware. In this section, a number of parallel Backpropagation networks are simulated on a number of SUN workstations, competing with each other for a better network topology.
One of the major difficulties in using the Backpropagation model is optimising the network topology and the parameters for network training. These parameters are the initial set of weights, the learning rate, and the number of nodes in the hidden layer. One way of establishing these parameters is to carry out a number of simulations and to choose the network configuration with the best results. However, this method is too time-consuming. A parallel architecture can be used to reduce the time spent in finding an optimised network architecture (Figure 8.18).
[Figure 8.18: labelled components: BP 1, BP 2, ..., BP N; SCHEDULER; Ethernet]
Figure 8.18. Competing Backpropagation Networks
Another difficulty associated with the Backpropagation model is its inherent inability to explain the input-output mapping which the network produces. Input perturbation techniques can be used to identify the most significant input parameters. This method is somewhat similar to Monte Carlo simulation: certain input values are modified and the outputs are observed, and by examining the distribution of inputs and outputs, the dependency of the outputs on the inputs can be established. The same method can be applied to optimise the most significant parameters of the network. Implementing this method is again too time-consuming on sequential architectures, as it involves a number of serial simulations and the comparison of their results. Again, a parallel hardware configuration can be used to obtain results in a shorter time.
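A minimal sketch of input perturbation is given below. It is not taken from the thesis: the function `input_sensitivity`, the uniform perturbation, and the toy model standing in for a trained network are assumptions for illustration. Each input is perturbed in turn, and the mean absolute change in the output is taken as that input's significance.

```python
import random

def input_sensitivity(model, base_inputs, scale=0.5, trials=200, seed=0):
    """Mean absolute output change when each input component is perturbed in turn."""
    n = len(base_inputs[0])
    scores = []
    for i in range(n):
        rng = random.Random(seed)        # same perturbation sequence for every input
        total = 0.0
        for _ in range(trials):
            x = list(rng.choice(base_inputs))
            y0 = model(x)                # output before the perturbation
            x[i] += rng.uniform(-scale, scale)
            total += abs(model(x) - y0)  # output change caused by input i alone
        scores.append(total / trials)
    return scores

def toy_model(x):
    """Stand-in for a trained network: depends strongly on x[0], weakly on x[1]."""
    return 3.0 * x[0] + 0.1 * x[1]

samples = [[0.0, 0.0], [0.5, 1.0], [1.0, 0.5]]
scores = input_sensitivity(toy_model, samples)
most_significant = scores.index(max(scores))
```

Since the score for each input is computed independently of the others, the loop over `i` is the part that could be distributed over several processors.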
The simulation consists of Backpropagation networks, each with a different number of hidden neurons, learning the same training dataset. Using the simple NETLIB library function bplearn(), 4 Backpropagation programs are written, compiled and mapped onto 4 SUN workstations. The scheduler is used for file I/O and as a passive server, which monitors the error and decides which is the best configuration for the given problem. There is no data dependency and little communication between the independent neural network modules. Each Backpropagation network occasionally reports its recall error to the scheduler program, which evaluates their performance. NETLIB listings of the server and client programs are presented in Appendix E.
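The competition scheme can be sketched in a few lines. This is an illustrative stand-in, not the NETLIB server and client programs of Appendix E: the functions `train_bp` and `compete` are hypothetical, threads stand in for the workstations, the XOR dataset and the candidate hidden-layer sizes are assumptions, and each network reports only its final error instead of reporting occasionally during training.

```python
import math
import random
from concurrent.futures import ThreadPoolExecutor

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_bp(n_hidden, data, epochs=1000, rate=0.5, seed=0):
    """Train a one-hidden-layer network by online backpropagation; return the final sum-squared error."""
    rng = random.Random(seed)
    n_in = len(data[0][0])
    # The last weight in each row is the bias.
    w_h = [[rng.uniform(-1, 1) for _ in range(n_in + 1)] for _ in range(n_hidden)]
    w_o = [rng.uniform(-1, 1) for _ in range(n_hidden + 1)]

    def forward(x):
        h = [sigmoid(sum(w * v for w, v in zip(row, x + [1.0]))) for row in w_h]
        y = sigmoid(sum(w * v for w, v in zip(w_o, h + [1.0])))
        return h, y

    for _ in range(epochs):
        for x, t in data:
            h, y = forward(x)
            d_out = (t - y) * y * (1.0 - y)
            # Hidden deltas use the output weights before they are updated.
            d_hid = [d_out * w_o[j] * h[j] * (1.0 - h[j]) for j in range(n_hidden)]
            for j, v in enumerate(h + [1.0]):
                w_o[j] += rate * d_out * v
            for j in range(n_hidden):
                for k, v in enumerate(x + [1.0]):
                    w_h[j][k] += rate * d_hid[j] * v
    return sum((t - forward(x)[1]) ** 2 for x, t in data)

def compete(hidden_sizes, data):
    """Scheduler stand-in: run one network per candidate size and pick the lowest error."""
    with ThreadPoolExecutor(max_workers=len(hidden_sizes)) as pool:
        errors = list(pool.map(lambda n: train_bp(n, data), hidden_sizes))
    results = dict(zip(hidden_sizes, errors))
    best = min(results, key=results.get)
    return best, results

XOR = [([0.0, 0.0], 0.0), ([0.0, 1.0], 1.0), ([1.0, 0.0], 1.0), ([1.0, 1.0], 0.0)]
best, results = compete([1, 2, 4, 8], XOR)
```

The competing networks share no data during training, which matches the observation that there is no data dependency and little communication between the modules.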
These simulations bring benefits even on general computing platforms such as the SUN LAN used for the simulations. The method of competition is important, as it is a practical solution to a complex theoretical problem which interests many neural network researchers in pursuit of the optimum network design.
8.5. Summary
This chapter presented the simulation results for analysing, partitioning and mapping MATLIB representations. A number of simulations were used to confirm the CAT results as an approximate computational model of execution on SUN workstations. CAT was then used to analyse and map MATLIB representations of the three neural network models that have been the focus throughout this thesis. The feasibility of data and task parallel executions was investigated for the Hopfield, the SOM and the Backpropagation models, using CAT. Exploiting the results, the three models were partitioned, parallelised and pipelined. Parallel simulations were carried out on a SUN LAN, and the simulation results were presented. Finally, multiple neural networks were simulated on a number of parallel processors using MATLIB and NETLIB functions.
Chapter 9
Assessment
This chapter assesses the thesis work: an investigation of representation and mapping strategies for efficient execution of neural networks on parallel hardware. The thesis consists of a series of analyses, design and implementation work, towards building a general purpose neural computer. Within the context of the research objectives, the analyses, design, implementations and results are assessed, and alternatives are explored.