Diagram 5.1 Methods Proposed for MIMD and SIMD Switching

5.3.1.4 Synchronisation of Processors

As in any array of processors, it is important to be able to synchronise processors in the array for certain events. In this architecture there are two main synchronisation events that need to be considered:

•Synchronise SIMD Control between Processors

•Synchronise Each Network Update

The first of these ensures that when a processor switches to SIMD control, it starts executing the SIMD program only at the first instruction in the SIMD sequence. The second enables the array to be synchronised at the start of each network update. This is important since input and output from the device must occur between network updates, requiring each processor to wait while this occurs.

Synchronise SIMD Control between Processors

To synchronise the execution of the SIMD control, each processor must start executing the SIMD program at the first instruction in the sequence. To achieve this, a synchronisation signal is broadcast to every processor, with each processor only commencing SIMD control when this signal is present.

Richard Palmer Phd Thesis
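The gating described above can be illustrated with a small simulation. This is only a sketch under stated assumptions: the `Processor` class, the `run_simd_broadcast` helper, and the idea that the SIMD program loops with the synchronisation signal asserted on instruction 0 are all hypothetical modelling choices, not details taken from the design.

```python
from dataclasses import dataclass


@dataclass
class Processor:
    """Hypothetical model of one processor's SIMD entry logic."""
    wants_simd: bool = False  # processor has asked to switch to SIMD control
    in_simd: bool = False     # processor is executing the SIMD program
    simd_pc: int = -1         # index of the last SIMD instruction executed

    def clock(self, sync: bool, instr_index: int) -> None:
        # Commence SIMD control only on a cycle where the sync signal is present.
        if self.wants_simd and not self.in_simd and sync:
            self.in_simd = True
        if self.in_simd:
            self.simd_pc = instr_index


def run_simd_broadcast(processors, program_length, cycles, requests):
    """Broadcast a looping SIMD program for a number of cycles.

    `requests` maps a cycle number to the ids of processors that ask to
    switch to SIMD control on that cycle. Returns, for each processor that
    entered SIMD control, the first instruction index it executed.
    """
    first_instr_seen = {}
    for cycle in range(cycles):
        for pid in requests.get(cycle, []):
            processors[pid].wants_simd = True
        instr_index = cycle % program_length
        sync = instr_index == 0  # assumed: sync marks the first instruction
        for pid, proc in enumerate(processors):
            was_in_simd = proc.in_simd
            proc.clock(sync, instr_index)
            if proc.in_simd and not was_in_simd:
                first_instr_seen[pid] = instr_index
    return first_instr_seen
```

In this model a processor that requests SIMD control part-way through the sequence simply waits, so every processor joins at instruction 0 regardless of when it asked.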

Synchronise Each Network Update

The synchronisation of each network update requires first an indication of when each processor has completed all its virtual neurons, and then an indication of when the next network update can commence.

To indicate the completion of each network update a signal is generated under the control of an instruction in the MIMD program. This signal is daisy-chained between the processors to indicate when all processors have completed their virtual neurons, and can then be used to control the input and output from the device. To indicate when this input and output is complete, an externally generated signal is broadcast to each processor; this results in the next network update commencing.
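The two-stage handshake can be sketched as follows. The function names, the cycle counts, and the assumption that the chain input at the first processor is tied high are illustrative only; the text does not specify the electrical details of the daisy chain.

```python
def daisy_chain_done(done_flags):
    """Propagate the completion signal along the chain of processors."""
    chain = True  # assumed: chain input at the first processor is tied high
    for done in done_flags:
        # Each processor passes the signal on only once it has finished.
        chain = chain and done
    return chain


def network_update_cycle(work_per_processor, io_cycles):
    """Simulate one network update followed by device input and output.

    `work_per_processor` gives the cycles each processor needs to complete
    its virtual neurons; `io_cycles` is the length of the external I/O phase.
    Returns (cycle at which all processors are done,
             cycle at which the next network update commences).
    """
    remaining = list(work_per_processor)
    cycle = 0
    while not daisy_chain_done([r == 0 for r in remaining]):
        remaining = [max(r - 1, 0) for r in remaining]
        cycle += 1
    all_done_cycle = cycle
    # Processors wait here while input and output take place; the externally
    # generated signal is modelled as arriving after io_cycles.
    next_update_cycle = all_done_cycle + io_cycles
    return all_done_cycle, next_update_cycle
```

The slowest processor determines when the chain output goes high, which is why faster processors must wait before the input and output phase can begin.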

5.4 Proposed Architecture

Now that the key points have been discussed it is possible to present the overall architecture. An outline of this architecture is shown in Diagram 5.2.

Processor Array

This array is composed of custom built processors that operate on a special instruction set optimised for neural models. These processors are each connected to four buses:

•Sigma Bus

The Sigma Bus connects each processor to the Sigma Function lookup table. This bus is also used for transferring partial sum values between processors. This bus is multiplexed between a 32 bit partial sum, which is used as a lookup address, and an 8 bit data value.

•SIMD Bus

The SIMD Bus is a bi-directional bus that is used to broadcast the SIMD program to the array, and to monitor the status of the array.

Diagram 5.2 Outline of Architecture (blocks shown: I/O Control, I/O, Weights Tables, SIMD Control, Modulo Address, Input Table, Sigma Threshold, Sigma Function, and the PE array)

•Input Bus

The Input Bus is shared between the processors, and is used to access the Input Table. This bus is multiplexed between a 16 bit address and an 8 bit data value.

•Weight Bus

The Weight Bus is owned individually by each processor, and is multiplexed between a 16 bit address and an 8 bit data value.
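As an illustration of what multiplexing a bus between an address and a data value involves, the sketch below models a read as two phases on the same lines. The `MultiplexedBus` class and its two-phase timing are assumptions made for the example; the text gives only the address and data widths.

```python
class MultiplexedBus:
    """Hypothetical model of a bus whose lines carry an address, then data."""

    def __init__(self, memory, address_bits=16, data_bits=8):
        self.memory = memory
        self.address_mask = (1 << address_bits) - 1
        self.data_mask = (1 << data_bits) - 1
        self.latched_address = None

    def address_phase(self, address):
        # Phase 1: the processor drives the address and the table latches it.
        self.latched_address = address & self.address_mask

    def data_phase(self):
        # Phase 2: the table drives the data value back over the same lines.
        if self.latched_address is None:
            raise RuntimeError("data phase requested before an address phase")
        value = self.memory[self.latched_address] & self.data_mask
        self.latched_address = None
        return value

    def read(self, address):
        self.address_phase(address)
        return self.data_phase()
```

With the default widths this corresponds to the Input Bus and Weight Bus figures of a 16 bit address and an 8 bit data value; the Sigma Bus would use a 32 bit address phase instead.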

The other features of this architecture implement the global control, provide storage for the input values and weights, and implement the sigma function. These blocks are described below:

•SIMD Control

The SIMD control operates by broadcasting the SIMD program to the array; this program updates a single synapse on each iteration. The SIMD control is also used to synchronise the processor array and the external input and output devices.

•Weights Table

The Weights Tables hold both the weights and the MIMD instructions, with each processor having exclusive access to its own weights table.

•Input Table

The Input Table provides the communications between each virtual neuron and the input and output from the array.

•Modulo Address

The Modulo Address generation is performed under control from each processor, and is used in the implementation of input windows. This method of implementing an input window follows a similar technique to that used by the Motorola DSP56000. (See Chapter 3, section 3.5.2.1.)

•Sigma Function

The Sigma Function provides a lookup table with which to compute the sigma function. The use of a lookup table allows the sigma function to be altered for different network characteristics.

•Sigma Threshold

The Sigma Threshold is used to reduce the size of the sigma lookup table required. Simulations have shown that a 2K byte lookup table is adequate for this type of application; the threshold function compresses large partial sums into the address range of this 2K byte lookup table.
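The interaction between the Sigma Threshold and the Sigma Function lookup table can be sketched as below. The text does not say how large partial sums are compressed, so this example assumes simple saturation followed by linear scaling; the `SUM_LIMIT` value, the logistic function, and all function names are hypothetical.

```python
import math

TABLE_SIZE = 2048    # 2K one-byte entries, as stated in the text
SUM_LIMIT = 1 << 15  # assumed saturation point for partial sums


def build_sigma_table(size=TABLE_SIZE, input_range=8.0):
    """Fill the table with an 8 bit logistic function (an assumed choice)."""
    table = bytearray(size)
    for i in range(size):
        x = (i - size / 2) / (size / 2) * input_range
        table[i] = round(255 / (1 + math.exp(-x)))
    return table


def sigma_threshold(partial_sum, limit=SUM_LIMIT, size=TABLE_SIZE):
    """Saturate a signed partial sum and scale it to a table index."""
    clamped = max(-limit, min(limit - 1, partial_sum))
    return (clamped + limit) * size // (2 * limit)


def sigma(partial_sum, table):
    """Look up the 8 bit sigma value for a partial sum."""
    return table[sigma_threshold(partial_sum)]
```

Because the table is ordinary data, a different network characteristic only requires the table to be refilled; the threshold stage and the lookup itself are unchanged.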

In document A novel architecture for a high performance low complexity neural device (Page 89-92)