THE ATM ARCHITECTURE
6.3 Shared memory ATM switch architectures
This is a very popular ATM switch architecture. Its main feature is a shared memory that is used to store all the cells coming in from the input ports. The cells in the shared memory are organized into linked lists, one per output port, as shown in figure 6.18. The shared memory is dual-ported, that is, it can be read and written at the same time. Currently memories can handle up to 5 Gbps. At the beginning of a slot, all input ports that have a cell write it into the shared memory. At the same time, all output ports with a cell to transmit read the cell from the head of their linked list and transmit it out. If N is the number of input/output ports, then in one slot up to N cells can be written into the shared memory and up to N cells can be transmitted out of the shared memory. If the speed of transmission on each incoming and outgoing link is V, then the switch can keep up with the maximum arrival rate if the memory's bandwidth is at least 2NV.
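The 2NV requirement can be checked with a few lines of arithmetic. The sketch below is illustrative only: the function names are hypothetical, and the link speed of 155.52 Mbps (an OC-3 link) is an assumed value that does not come from the text.

```python
def required_memory_bandwidth(num_ports, link_speed):
    """Bandwidth needed for N writes plus N reads per slot (2NV)."""
    return 2 * num_ports * link_speed


def max_ports(memory_bandwidth, link_speed):
    """Largest N for which 2NV does not exceed the memory bandwidth."""
    return int(memory_bandwidth // (2 * link_speed))


OC3_MBPS = 155.52          # assumed link speed V, in Mbps
MEMORY_MBPS = 5000.0       # the 5 Gbps memory bandwidth quoted in the text

needed = required_memory_bandwidth(16, OC3_MBPS)   # a little under 5 Gbps
ports = max_ports(MEMORY_MBPS, OC3_MBPS)           # number of ports that fit
```

Under these assumptions, a 5 Gbps memory supports a switch of 16 such links, since 16 ports need just under 5 Gbps and 17 ports would need more.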
Figure 6.18: A shared memory switch (input ports 1 to N, a shared memory, and output ports 1 to N)
The total number of cells that can be stored in the memory is bounded by the memory's capacity B, expressed in cells. Modern shared memory switches have a large shared memory and they can hold hundreds of thousands of cells. The total number of cells allowed to queue for each output port i is limited to Bi, where Bi<B. That is, the linked list associated with output port i cannot exceed Bi. This constraint is necessary in case an output port gets hot, that is, when a lot of the incoming traffic goes to that particular port. When this happens, it is possible that the linked list associated with the hot output port may grow to the point that it takes over most of the shared memory. In this case, there will be little space left for cells destined to other output ports. Typically, the sum of the Bi capacities of all linked lists is greater than B. More complicated constraints can also be used. For instance, each linked list i may be associated with a minimum capacity LBi in addition to its maximum capacity Bi, where LBi<B. LBi is a dedicated buffer for output port i and it is never shared with the other output ports. The sum of the LBi capacities of all linked lists is less than B. Cell loss occurs when a cell arrives at a time when the shared memory is full, that is, it contains B cells. Cell loss also occurs when a cell with destination output port i arrives at a time when the total number of cells queued for this output port is Bi cells. In this case, the cell is lost, even if the total number of cells in the shared memory is less than B.
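The admission rules above can be sketched as a small model. This is not taken from any particular switch: the class and method names are hypothetical, and the way the dedicated LBi space is accounted for (cells beyond LBi occupy a shared pool of size B minus the sum of all LBi) is one plausible reading of the text, stated here as an assumption.

```python
class SharedBuffer:
    """Admission control for a shared memory of B cells (illustrative sketch)."""

    def __init__(self, capacity, max_per_port, min_per_port=None):
        n = len(max_per_port)
        self.capacity = capacity                            # B
        self.max_per_port = list(max_per_port)              # the Bi limits
        self.min_per_port = list(min_per_port or [0] * n)   # the LBi reservations
        assert sum(self.min_per_port) < capacity
        self.queued = [0] * n                               # cells queued per output port

    def _shared_used(self):
        # Assumption: cells beyond a port's dedicated LBi occupy the shared pool.
        return sum(max(0, q - lb)
                   for q, lb in zip(self.queued, self.min_per_port))

    def admit(self, port):
        """Return True if an arriving cell for `port` is stored, False if it is lost."""
        q = self.queued[port]
        if q >= self.max_per_port[port]:
            return False                 # linked list i already holds Bi cells
        shared_pool = self.capacity - sum(self.min_per_port)
        if q >= self.min_per_port[port] and self._shared_used() >= shared_pool:
            return False                 # no free space left for this port
        self.queued[port] += 1
        return True

    def depart(self, port):
        """Transmit one cell from the list of `port`, if it has one."""
        if self.queued[port] > 0:
            self.queued[port] -= 1


# Usage: B = 4 cells, two ports with Bi = 3 and LBi = 1.
buf = SharedBuffer(4, max_per_port=[3, 3], min_per_port=[1, 1])
hot = [buf.admit(0) for _ in range(4)]   # port 0 is hot; the 4th cell hits Bi
other = buf.admit(1)                     # port 1 still has its dedicated cell
```

With all LBi set to zero the model reduces to the two loss conditions given in the text: the memory holds B cells, or list i holds Bi cells.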
A large switch can be constructed by interconnecting several shared memory switches. That is, the shared memory switch described above is used as a switching element, and all the switching elements are organized into a multistage interconnection network.
Figure 6.19: The Hitachi shared memory switch
Legend: S/P: serial to parallel; HD CNV: header conversion; WA, RA: write address, read address; RT DEC: route decoder; OUT DEC: output decoder
An example of a shared memory switch is the one shown in figure 6.19, and it was proposed by Hitachi. First, cells are converted from serial to parallel (S/P), and header conversion (HD CNV) takes place. Subsequently, cells from all input ports are multiplexed and written into the shared memory. For each linked list, there is a pair of address registers (one to write, WA, and one to read, RA). The WA register for linked list i contains the address of the last cell of list i, which is always empty. The incoming cell is written in that address. At the same time, an address of a new empty buffer is read from the IABF chip, which keeps a pool of empty buffer locations, to update WA. Similarly, at each slot a cell from each linked list is identified through the content of the RA register, retrieved, demultiplexed and transmitted. The empty buffer is returned to the pool, and RA is updated with the next cell address of the linked list. Priorities may be implemented by maintaining multiple linked lists, one for each priority, for the same output port.
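The interplay of the WA and RA registers with the pool of empty buffers can be sketched in software. This is a minimal model, not the Hitachi hardware design: the class name, the method names and the use of a Python deque in place of the IABF chip are all assumptions made for illustration.

```python
from collections import deque


class LinkedListMemory:
    """Per-port linked lists in one shared memory, driven by WA/RA registers."""

    def __init__(self, num_buffers, num_ports):
        self.cell = [None] * num_buffers       # cell stored in each buffer
        self.next = [None] * num_buffers       # pointer to the next buffer in the list
        self.free = deque(range(num_buffers))  # stands in for the IABF pool
        # Each list starts with one empty buffer, so RA == WA means "list empty".
        self.wa = [self.free.popleft() for _ in range(num_ports)]
        self.ra = list(self.wa)

    def write(self, port, cell):
        """Store a cell at WA and fetch a new empty tail buffer from the pool."""
        if not self.free:
            return False                       # no empty buffer available; cell lost
        addr = self.wa[port]
        new_tail = self.free.popleft()
        self.cell[addr] = cell
        self.next[addr] = new_tail
        self.wa[port] = new_tail               # WA again points at an empty buffer
        return True

    def read(self, port):
        """Retrieve the cell at RA, return its buffer to the pool, advance RA."""
        addr = self.ra[port]
        if addr == self.wa[port]:
            return None                        # nothing queued for this port
        cell = self.cell[addr]
        self.ra[port] = self.next[addr]
        self.cell[addr] = None
        self.free.append(addr)
        return cell


# Usage: six buffers shared by two output ports.
mem = LinkedListMemory(num_buffers=6, num_ports=2)
mem.write(0, "a")
mem.write(1, "b")
mem.write(0, "c")
first = mem.read(0)    # cells leave each list in arrival order
```

Multiple priorities per output port could be modelled in the same way by keeping one WA/RA pair per (port, priority) combination.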
One of the problems associated with the construction of early shared memory switches was that memories did not have the necessary bandwidth to support many input ports at a high speed. As a way of getting around this technical problem, a scheme known as bit-slicing was used in the Hitachi switch, in which K shared memories were employed instead of a single one. An arriving cell is divided into K sub-cells, and each sub-cell is written into a different shared memory at the same time in the same location. As a result, a cell in a linked list is stored in K fragments over the K shared memories. All the pointers for the linked list are the same in the K memories. Transmitting a cell out of the switch requires reading these K fragments from the K memories. Now if we assume that we have N links, and that the speed of each incoming and each outgoing link is V, then the bandwidth that each shared memory is required to have is 2NV/K, rather than 2NV as in the case of a single shared memory.
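Bit-slicing can be illustrated with a short sketch. The text does not say how a cell is divided into sub-cells, so the byte-interleaved split used here is an assumption, as are all function names and the example values of N, V and K.

```python
def slice_cell(cell, k):
    """Divide a cell into k sub-cells by interleaving its bytes (assumed scheme)."""
    return [cell[i::k] for i in range(k)]


def merge_slices(slices):
    """Rebuild the original cell from its k sub-cells."""
    k = len(slices)
    out = bytearray(sum(len(s) for s in slices))
    for i, s in enumerate(slices):
        out[i::k] = s
    return bytes(out)


def write_cell(memories, addr, cell):
    """Write each sub-cell into a different memory, all at the same location."""
    for memory, sub_cell in zip(memories, slice_cell(cell, len(memories))):
        memory[addr] = sub_cell


def read_cell(memories, addr):
    """Read the k fragments at one location and reassemble the cell."""
    return merge_slices([memory[addr] for memory in memories])


def per_memory_bandwidth(num_ports, link_speed, k):
    """Bandwidth each of the k memories must sustain (2NV/K)."""
    return 2 * num_ports * link_speed / k


# Usage: a 53-byte ATM cell spread over K = 4 memories at address 7.
K = 4
memories = [dict() for _ in range(K)]
cell = bytes(range(53))
write_cell(memories, 7, cell)
restored = read_cell(memories, 7)
```

Because every memory uses the same address for its fragment, a single set of linked-list pointers serves all K memories, as the text notes.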
Figure 6.20: Shared memory used for a group of output ports (a non-blocking switch fabric feeding several shared buffers, each serving a group of output ports)
The shared memory switch architecture has also been used in a non-blocking switch with output buffering, as shown in figure 6.20. Instead of using a dedicated buffer for each output port, a shared memory switch is used to serve a number of output ports. The advantage of this scheme is the following. When using a dedicated buffer for each output port, free space in the buffer of one output port cannot be used to store cells of another output port. This may result in poor utilization of the buffer space. This problem is alleviated with multiple output ports sharing the same memory.
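The utilization argument can be made concrete with a small worked example. The sketch below is a simplification under stated assumptions: a burst of cells arrives with no departures in between, and the function names and numbers are hypothetical.

```python
def cells_lost_dedicated(arrivals, per_port_capacity):
    """Cells lost when each output port has its own buffer of fixed size."""
    return sum(max(0, a - per_port_capacity) for a in arrivals)


def cells_lost_shared(arrivals, total_capacity):
    """Cells lost when the ports share one buffer of the same total size."""
    return max(0, sum(arrivals) - total_capacity)


# Usage: two output ports and 8 cells of buffer space in total.
burst = [6, 1]                                   # cells arriving for ports 0 and 1
lost_dedicated = cells_lost_dedicated(burst, 4)  # 4 cells reserved per port
lost_shared = cells_lost_shared(burst, 8)        # 8 cells shared by both ports
```

In this example the dedicated scheme drops cells for port 0 while three buffers of port 1 sit idle, whereas the shared buffer absorbs the whole burst.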