Processing Element Organisation: NSC and Target Architecture

Chapter 6 — NSC and Target Architecture

6.4. Processing Element Organisation

Based upon the definition of the Generic Neuron model, the hardware implementation of the processing element (PE) can be organised into three units: memory, communication, and execution. Figure 6.5 shows the PE’s internal structure and the following sub-sections examine each unit individually.

[Figure 6.5 diagram. Recoverable labels: Input Data Bus; Address Bus; Control Bus; Output Data Bus; Communication Unit; Memory (with Addressing Module); Execution Unit.]

Figure 6.5 — Processing Element’s Internal Organisation

6.4.1. Communication Unit

The purpose of the communication unit is to control the flow of data between a particular PE and the rest of the network. This unit is responsible for the following basic functions:

• initialise the PE’s parameters, including initial weight values, target output patterns (for supervised learning), and some architecture-specific parameters;

• read and store input data into the appropriate memory block at specific instants, and issue control signals to the execution unit indicating when a state or error calculation should start; and

• transmit to the output data bus the calculated value from the execution unit, which can be either a state or an error value. In this case, the communication unit should also signal the rest of the network (and the central controller) that a legal value is being written onto the data bus.

The communication unit performs the above functions by interfacing with the off-chip broadcast busses: the two data busses (input and output), which can be externally connected as a single bus (see Figure 6.4a), and the address bus and the control bus, both driven by the central controller. To perform these functions, the communication unit is divided into two modules, datapath and control, as illustrated in Figure 6.6.

[Figure 6.6 diagram. Recoverable labels: Datapath Module; Control Module; off-chip address bus; off-chip data bus; registers my_address, prev_layer and next_layer; combinational logic; cs/rdy; from exec unit; internal states, weights and address busses.]

Figure 6.6 — Communication Unit’s Internal Structure

The datapath module implements the PE’s external bus arbitration. It is composed mainly of comparators and registers, which determine when an output value (state or error) should be broadcast onto the data bus, and verify when valid data on the bus is addressed to the PE.
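The comparator-and-register arrangement can be illustrated with a small behavioural sketch. It is not the thesis's circuit: the register names follow the labels recoverable from Figure 6.6, while the use of address ranges for the neighbouring layers and the three method names are assumptions made purely for illustration.

```python
from dataclasses import dataclass


@dataclass
class BusDatapath:
    """Hypothetical behavioural model of the datapath registers and comparators."""

    my_address: int     # address assigned to this PE
    prev_layer: range   # assumed: addresses of the PEs that feed this PE
    next_layer: range   # assumed: addresses of the PEs fed by this PE

    def drives_bus(self, address_bus: int) -> bool:
        # The controller is addressing this PE, so it may broadcast its value.
        return address_bus == self.my_address

    def accepts_state(self, address_bus: int) -> bool:
        # The value on the data bus was produced by a previous-layer PE.
        return address_bus in self.prev_layer

    def accepts_error(self, address_bus: int) -> bool:
        # The value on the data bus is an error sent back by a next-layer PE.
        return address_bus in self.next_layer
```

For example, a PE at address 5 with `prev_layer=range(0, 4)` would latch the data bus when the address bus shows 0 to 3, and drive the bus only when it shows 5.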

The control module generates all command lines that regulate data transmission between the specific PE and the rest of the network. The control module is implemented as an FSM, which can be realised by a PLA or by random logic, according to the technique adopted by the low level synthesis tool used in conjunction with the NSC.
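As a rough illustration of such an FSM, the sketch below encodes a transition table in the way a PLA would hold one. The state names and event names are hypothetical and simplified; the text does not list the actual states or command lines of the control module.

```python
from enum import Enum, auto


class State(Enum):
    IDLE = auto()      # waiting for data addressed to this PE
    RECEIVE = auto()   # storing incoming values into memory
    COMPUTE = auto()   # execution unit is calculating a state or error
    TRANSMIT = auto()  # result is being written onto the output data bus


# Transition table: (current state, event) -> next state. All names are assumed.
TRANSITIONS = {
    (State.IDLE, "input_valid"): State.RECEIVE,
    (State.RECEIVE, "input_valid"): State.RECEIVE,
    (State.RECEIVE, "all_inputs_read"): State.COMPUTE,
    (State.COMPUTE, "exec_done"): State.TRANSMIT,
    (State.TRANSMIT, "value_sent"): State.IDLE,
}


def step(state, event):
    """Return the next state; events with no table entry leave the state unchanged."""
    return TRANSITIONS.get((state, event), state)


def run(events, state=State.IDLE):
    """Feed a sequence of events through the FSM and return the final state."""
    for event in events:
        state = step(state, event)
    return state
```

Keeping the behaviour in a table rather than in nested conditionals mirrors the PLA option mentioned above, where each table row corresponds to one product term.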

6.4.2. Memory Unit

The purpose of the memory unit is to store all relevant data required to execute neural network models. This data is accessed either externally or internally. External access is controlled by the communication unit to store all incoming data related to the PE. Internal access is controlled by the execution unit to perform the appropriate mathematical operations upon its stored data values.

Memory access can be performed by the communication unit and the execution unit simultaneously. This is generally the case when the PE starts reading state values from other PEs: as soon as at least one data set has been obtained, the execution unit can start processing it to perform the sum of products while further data is still arriving. To achieve this parallelism, a two-phase clock mechanism is employed.
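The overlap can be sketched in software as two actions per clock cycle. The assignment of writes to the first phase and reads to the second is an assumption for illustration, as are the function and variable names; the text only states that a two-phase clock is used.

```python
def overlapped_sum_of_products(incoming_states, weights):
    """Interleave memory writes and reads, one of each per clock cycle."""
    if len(incoming_states) != len(weights):
        raise ValueError("one weight is needed per incoming state")

    state_block = []   # memory block filled by the communication unit
    accumulator = 0    # running sum of products kept by the execution unit
    read_ptr = 0       # next state the execution unit will consume
    cycles = 0

    while read_ptr < len(weights):
        # Phase 1 (assumed): the communication unit stores one incoming state.
        if cycles < len(incoming_states):
            state_block.append(incoming_states[cycles])
        # Phase 2 (assumed): the execution unit reads a state that is already stored.
        if read_ptr < len(state_block):
            accumulator += state_block[read_ptr] * weights[read_ptr]
            read_ptr += 1
        cycles += 1

    return accumulator, cycles
```

In this simplified model, n inputs are processed in n cycles, whereas storing every input first and computing afterwards would need 2n steps.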

The memory unit’s internal structure is depicted in Figure 6.7. It comprises two major modules: the storage module and the addressing module.

[Figure 6.7 diagram. Recoverable labels: Addressing Module (counter, total of input PEs, total of output PEs); Storage Module (forward blocks, backward blocks); weights data bus; states data bus; address bus; control signals from communication and execution units; to execution unit.]

Figure 6.7 — Memory Unit’s Internal Organisation

The storage module holds the basic data values required by the vast majority of neural models: input states and their associated synaptic weights. Although this module depends upon the neural algorithm and application being implemented, the Generic Neuron model identifies four basic blocks, grouped into a forward block, which stores the input states (S) and their weight values (Wp), and a backward block, which stores the backward flow of data, namely the input errors (E) and their related weights (Wg). There are therefore up to four blocks of memory, which contain the data needed to accomplish the recall and learning phases of the majority of neural models (see Table 6.1). For applications that do not require the learning phase, or the back propagation of errors, the NSC can exclude the backward block from the implementation of the storage module.

The addressing module provides the mechanism to access all memory blocks integrated into the PE, according to the adopted neural algorithm. It consists of a counter, for controlling the sequential memory addressing, and up to two comparators, to determine when all relevant input values have been processed. One comparator is associated with the forward phase, and embodies a register that contains the total number of inputs. A second comparator, present only when learning with back propagation of errors is defined by the neural model, holds the number of backward inputs needed to implement the backward phase of such a neural model.
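A minimal behavioural sketch of the counter and its comparators is given below. The class and method names are hypothetical, and resetting the counter between phases is an assumption; only the counter, the two limit registers and the optional backward comparator come from the description above.

```python
class AddressingModule:
    """Hypothetical model: one counter plus up to two limit comparators."""

    def __init__(self, total_inputs, total_backward_inputs=None):
        self.total_inputs = total_inputs                    # forward comparator register
        self.total_backward_inputs = total_backward_inputs  # None: no backward comparator
        self.counter = 0

    def reset(self):
        # Assumed: the counter is cleared before each forward or backward pass.
        self.counter = 0

    def next_address(self, backward=False):
        """Return (address, done) and advance the counter by one."""
        limit = self.total_backward_inputs if backward else self.total_inputs
        if limit is None:
            raise ValueError("backward phase was not configured for this PE")
        address = self.counter
        self.counter += 1
        done = self.counter == limit  # comparator output: all inputs processed
        return address, done
```

A PE with three forward inputs would see `done` become true on its third call to `next_address()`, which is the point at which the sum of products is complete.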

6.4.3. Execution Unit

The execution unit deals with the actual computation of the neural functions. It is responsible for executing all three basic neural functions (f1, f2, and f3) of the Generic Neuron model (see Figure 6.3), according to the high level description provided by the application designer. The basic framework of this unit, depicted in Figure 6.8, consists of two major modules: the datapath module and the control module.

The datapath module contains all the necessary blocks to perform the mathematical operations required by neural models. This module is ultimately configured by the NSC, although some basic units can be identified. These include: an ALU, which executes basic operations such as addition and subtraction; an Accumulator/Shifter module, which stores intermediate results and performs the multiplication operation in conjunction with the ALU; some registers, which hold algorithm-dependent parameters such as the output state (s_j) and the output error (e_j); a lookup table ROM, which implements the activation function; a multiplicand register, a special register used by the multiplication algorithm; and finally, some auxiliary registers, which store intermediate results used to process the neural function.
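The lookup table ROM can be illustrated with a short sketch that precomputes an activation function and then evaluates it by indexing. The choice of a sigmoid, the 64-entry table, the input range and the 8-bit output are all assumptions for the example; the text does not specify the activation function or the ROM dimensions.

```python
import math

ADDRESS_BITS = 6    # assumed ROM size: 2**6 = 64 entries
INPUT_RANGE = 8.0   # assumed: inputs are clamped to [-8, 8)
OUTPUT_BITS = 8     # assumed: outputs are stored as 8-bit unsigned values


def build_rom():
    """Precompute a sigmoid, sampled at the centre of each input interval."""
    size = 1 << ADDRESS_BITS
    scale = (1 << OUTPUT_BITS) - 1
    step = 2 * INPUT_RANGE / size
    rom = []
    for index in range(size):
        x = -INPUT_RANGE + (index + 0.5) * step
        rom.append(round(scale / (1.0 + math.exp(-x))))
    return rom


def activation(rom, x):
    """Quantise x to a ROM address, clamp it, and return the stored value."""
    size = len(rom)
    index = int((x + INPUT_RANGE) * size / (2 * INPUT_RANGE))
    index = max(0, min(size - 1, index))
    return rom[index]
```

Replacing the function evaluation with a table read trades accuracy for area and speed: no exponential needs to be computed at run time, and the resolution is set by the number of ROM entries.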

[Figure 6.8 diagram. Recoverable labels: Datapath Module (ALU, ROM); Control Module (combinational logic, multiplier FSM); internal bus.]

Figure 6.8 — Execution Unit’s Internal Structure

The control module provides the necessary commands to execute all operations that are specific to the neural network algorithm. These include the propagation function followed by the activation function, the error calculation, and the weight update function. These commands are sent to the execution unit’s datapath, which performs the computation. In addition, this module implements the multiplication operation using Booth’s algorithm [55].
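For readers unfamiliar with it, the following is a sketch of the standard radix-2 Booth multiplication for signed operands. It is a software model, not the thesis's hardware: the 8-bit default width and the variable names are assumptions, and the accumulator is modelled as an unbounded Python integer, which stands in for the guard bit a hardware accumulator would need.

```python
def booth_multiply(multiplicand, multiplier, bits=8):
    """Multiply two signed `bits`-wide integers using radix-2 Booth recoding."""
    lo, hi = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    if not (lo <= multiplicand <= hi and lo <= multiplier <= hi):
        raise ValueError("operand does not fit in the given width")

    mask = (1 << bits) - 1
    a = 0                  # accumulator: upper half of the product (signed)
    q = multiplier & mask  # multiplier bits, gradually replaced by product bits
    q_1 = 0                # extra bit to the right of q, initially zero

    for _ in range(bits):
        pair = (q & 1, q_1)
        if pair == (1, 0):
            a -= multiplicand  # start of a run of ones: subtract
        elif pair == (0, 1):
            a += multiplicand  # end of a run of ones: add
        # Arithmetic shift right of the combined (a, q, q_1) register.
        q_1 = q & 1
        q = (q >> 1) | ((a & 1) << (bits - 1))
        a >>= 1  # Python's >> keeps the sign of a negative integer

    # The low `bits` bits of (a << bits) are zero, so OR-ing q in is exact.
    return (a << bits) | q
```

Each iteration uses one addition or subtraction at most, followed by a shift, which matches the division of labour between an ALU, an accumulator/shifter and a multiplicand register described for the datapath.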

In document High Level Synthesis of Neural Network Chips (Page 107-110)