
2.2 SRAM Based FPGAs

2.2.5 Managing Data Content by Utilizing the Bitstream

Memories are essential elements in SoC designs as a standard way to store data on a temporary or permanent basis. This data may have different purposes.

In SRAM FPGA based systems, memories most commonly store application-specific information, entire programs or sequences of instructions, program state information and/or configuration information of the device. As introduced earlier, two alternatives can be used to implement memory elements in FPGAs: the BRAM resources and the distributed general-purpose logic fabric. In the case of soft-core processors, program and data memories are commonly implemented as BRAM structures, while small memory elements, such as the registers, are implemented by utilizing distributed fabric logic.

The standard way to access and manage data in memory modules is to use their ports, such as data output, data input, write-enable and address.

This access method requires a control mechanism to manage the inputs and read the outputs in a coordinated fashion. This is often accomplished by a memory controller, which can be implemented in different ways, such as using soft-core or hard processors, specific IP cores or custom FSM based modules.

Frequently, additional elements are also required, like bus interfaces or auxiliary memories to store processed data. The implementation of all the above elements demands the usage of a variety of resources of the FPGA, further increasing the resource overhead. Besides, if it is required to read or write large amounts of data, resources committed to storing such data will be blocked and unavailable for other purposes. For example, it is not possible to read the data from the first two memory addresses while the last ten data addresses are being written.

Due to the limitations of standard data management, there are several scenarios where new data management alternatives could improve some applications or solve different existing issues. Bearing in mind that the user data is also stored in the configuration bitstream, the idea of using this bitstream to manage data content is an attractive alternative.

The impossibility of modifying the data content of ROM memories, widely used as program memories in soft-core processors, represents another relevant issue, since this functionality may be needed in order to change the purpose of a processor or to recover after an SEU in the program memory. The regular way to modify the content of a ROM memory in an FPGA design is to re-synthesize and re-deploy the entire design with the new data onto the FPGA. In the best case scenario, if the program memory has been implemented as a reconfigurable partition, only this part has to be re-synthesized and re-implemented. Along these lines, the program memory is implemented as a RAM in [39], and a dedicated IP core loads new memory content during system operation through a serial interface.

However, this scheme increases the requirements in terms of logic resources, and an upset in an input port of the memory may lead to changes in the memory content, increasing the probability of malfunction due to SEUs. Thus, modifying the program memory could be a good application example for bitstream based data content management.

Another scenario where bitstream based management can be an interesting alternative is damage to the interfaces of memory resources. As a consequence of an SEU, an error can disable or affect relevant ports of the memory, which is likely to cause permanent damage. Moreover, if ports such as address, reset, clock or data output are affected by the fault, it may become impossible to recover the information stored in the memories with conventional, off-the-shelf methods. This scenario can be critical when the memory has not been hardened and the stored information has special operational relevance. Extracting such information from the data content stored in the bitstream could allow recovering it in a straightforward manner.

Considering the direct relation between the integrity of the user data and the reliability, several fault tolerance related strategies could also take advantage of a bitstream based data management.

Despite the potential benefits of this strategy, few investigations have addressed it. The lines below present the most significant works proposed in the literature, focusing especially on Xilinx devices.

When reading and writing back user data from one memory block to another, two main cases can be found: homogeneous and heterogeneous architectures. While homogeneous memory modules share their main characteristics, such as size, shape, and used resources (differing only in fabric location), heterogeneous modules have distinct features due to the usage of different resources and granularity levels. Data management utilizing the bitstream is more straightforward in the case of homogeneous implementations, since it mainly requires simple bitstream manipulations in order to relocate the data. In contrast, the utilization of heterogeneous architectures constrains the design process.
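The homogeneous case can be illustrated with a minimal sketch. It assumes that captured frames are held in a dictionary keyed by a hypothetical (row, column, minor) tuple rather than by the real, device-specific frame address encoding. Under that assumption, relocating data amounts to copying frame contents under a new column index. The name `relocate_frames` and the data layout are illustrative only and do not belong to any vendor tool.

```python
from typing import Dict, List, Tuple

# Hypothetical frame key: (row, column, minor). Real devices pack such
# fields into a frame address whose layout is device specific.
FrameKey = Tuple[int, int, int]
Frames = Dict[FrameKey, List[int]]


def relocate_frames(frames: Frames, src_col: int, dst_col: int,
                    row_shift: int = 0) -> Frames:
    """Return copies of the frames found in src_col, re-addressed to dst_col."""
    moved: Frames = {}
    for (row, col, minor), words in frames.items():
        if col == src_col:
            # Frame contents are copied unchanged; only the address differs.
            moved[(row + row_shift, dst_col, minor)] = list(words)
    return moved


captured: Frames = {
    (0, 4, 0): [0x11, 0x22],
    (0, 4, 1): [0x33, 0x44],
    (0, 9, 0): [0xAA, 0xBB],  # unrelated column, not relocated
}
relocated = relocate_frames(captured, src_col=4, dst_col=7)
print(sorted(relocated))
```

A heterogeneous relocation could not reuse the frame contents verbatim in this way, because the bit positions inside the frames would differ between source and destination.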

In any case, extracting and writing user data content through the bitstream requires some knowledge of its structure, so these strategies are device dependent. While devices from the 7 series by Xilinx share the majority of the characteristics of the bitstream structure, many of these characteristics differ in previous FPGA series from the same vendor. Thus, although in several cases the flows and the main ideas can be adapted to different devices, the developed approaches are commonly tied to a particular FPGA vendor, a particular FPGA series or even a particular device.

With the latest FPGA series (Virtex-4, Virtex-5, Virtex-6, Spartan-3A, Spartan-3AN, Spartan-3A DSP, Spartan-6 and 7 Series), Xilinx offers the possibility to use the Data2MEM [121] data translation software to initialize BRAMs. Among other features, Data2MEM can replace the contents of BRAMs in configuration bitstreams in a straightforward fashion, without requiring any implementation tool. However, this application has several limitations. One of the most relevant is that the software must be executed on an operating system (Windows or Linux). In addition, the program requires previously generated files to create the output files, such as BRAM Memory Map (BMM) files or Executable and Linkable Format (ELF) files. The configuration bitstream files to be updated by the tool must be created without compression and/or encryption. To sum up, this is a quite complex solution that is not supported for partial bitstreams and is feasible only for writing data to BRAMs, not for reading it. It is therefore not a suitable data management approach for autonomous standalone systems based on partial reconfiguration.

JBits [122] was one of the first tools reported to be able to modify the bitstream, including the data content. It consisted of a set of Java classes that provided an application program interface into the configuration bitstream for the XC4000 and Virtex families by Xilinx (by utilizing the SelectMAP interface). One of the biggest advantages of this approach was that it did not require additional hardware structures. Nevertheless, its data efficiency was poor, because it required reading the entire bitstream, while the user data only occupies a relatively small percentage of it. This tool became obsolete and is not suitable for 7 series devices.

The work presented in [7] is a step forward towards bitstream based data content management. As Figure 2.17, extracted from that work, describes, the proposed approach focuses on saving and relocating the context of a soft-core processor by extracting the bitstream (and writing it back once processed) through the SelectMAP interface. In this process, the entire configuration data is not read back, but only the frames that contain the target information. The context extraction is done while reading the configuration data and requires stopping the clock of the particular hardware task to prevent changes during the read process. In [123], a similar approach was presented that proposes a partial displacement defragmentation algorithm for heterogeneous reconfigurable systems. However, these approaches have several drawbacks. The most remarkable one is that, since they are only usable with the old one-dimensional partial reconfiguration FPGA families from Xilinx, they cannot be utilized in the newest devices. In addition, they require a high processing effort (the use of tools such as a configuration manager, the PARTBIT software and the REPLICA filter) and a complex database to store the placement of each data bit.

Figure 2.17: A relocation alternative.

Further approaches can be found that deal with newer Xilinx devices supporting two-dimensional partial reconfiguration. The approach presented in [61] is capable of capturing, storing and copying the content of flip-flops within a particular reconfigurable region of a Virtex-5 FPGA device by utilizing the bitstream in combination with the STARTUP_VIRTEX5 primitive, the GCAPTURE command and the ICAP interface. In a later work [8], the previous approach was extended to enable copying the content of flip-flops between heterogeneous reconfigurable regions by processing captured bitstreams. In [124], the same authors present an implementation of preemptive hardware multitasking for partially reconfigurable FPGAs that enables configuration prefetching and reuse. This approach, based on their previous works, reduces configuration overheads, improving the system performance.

As Figure 2.18, obtained from [8], describes, the strategy of the mentioned works follows several steps. The first is to capture the state of the flip-flops by using either the GCAPTURE command or the STARTUP_VIRTEX5 block. It is advisable to stop the clocks before capturing the data in order to ensure that the registers are in a stable state. In the next step, the captured data is stored in an external memory. Afterwards, if necessary (for instance, for data relocation purposes), this information can be processed to create a new partial bitstream. This bitstream processing can be a complicated task given the complexity of the bitstream structure, especially when a large number of registers are utilized. The processing mainly relies on the information from the logic location text file (*.ll). The next step is to download the bitstream to the FPGA, which can be done at any moment because the changes do not take effect until the GSR signal from the STARTUP_VIRTEX5 primitive is triggered. An alternative to the STARTUP_VIRTEX5 primitive is to toggle the GSR signal by using the GRESTORE command.
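The role of the logic location file in this processing step can be sketched as follows. The example assumes that each flip-flop entry of the *.ll file has the form `Bit <bitstream offset> <frame address> <frame offset> Block=<site> Latch=<name> Net=<net>`, that captured frames are available as lists of 32-bit words indexed by frame address, and that bits are numbered from the least significant bit of each word. These are all assumptions for illustration; the actual file layout, bit ordering and value polarity should be checked against the device documentation. The names `parse_ll` and `read_bit` are hypothetical.

```python
import re
from typing import Dict, List, NamedTuple


class BitLocation(NamedTuple):
    frame_address: int
    frame_offset: int
    block: str
    latch: str


# Assumed line layout (flip-flop entries only):
#   Bit <offset> <frame address> <frame offset> Block=<site> Latch=<ff> Net=<net>
LL_LINE = re.compile(
    r"^Bit\s+\d+\s+(0x[0-9a-fA-F]+)\s+(\d+)\s+"
    r"Block=(\S+)\s+Latch=(\S+)\s+Net=(\S+)"
)


def parse_ll(text: str) -> Dict[str, BitLocation]:
    """Map each net name to the frame address and offset of its flip-flop bit."""
    locations: Dict[str, BitLocation] = {}
    for line in text.splitlines():
        match = LL_LINE.match(line.strip())
        if match:  # lines in any other layout are skipped
            frame, offset, block, latch, net = match.groups()
            locations[net] = BitLocation(int(frame, 16), int(offset), block, latch)
    return locations


def read_bit(frames: Dict[int, List[int]], loc: BitLocation,
             word_bits: int = 32) -> int:
    """Extract one bit from a captured frame (LSB-first numbering assumed)."""
    word, bit = divmod(loc.frame_offset, word_bits)
    return (frames[loc.frame_address][word] >> bit) & 1


# Made-up sample entries and frame data, for illustration only.
SAMPLE_LL = """\
Bit  3274534 0x0000a11e  37 Block=SLICE_X10Y119 Latch=AQ Net=count<0>
Bit  3274598 0x0000a11e   5 Block=SLICE_X10Y119 Latch=BQ Net=count<1>
"""
locations = parse_ll(SAMPLE_LL)
frames = {0x0000A11E: [0b000000, 0b100000]}
state = {net: read_bit(frames, loc) for net, loc in locations.items()}
print(state)
```

With a lookup table of this kind, only the frames listed in the table need to be read back and rewritten, which is what keeps the amount of processed configuration data small.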

Figure 2.18: Flow chart of a context capture and restoration alternative.

The main challenge of using this strategy is that, by default, the GSR signal copies all INT0/INT1 bits to all the flip-flops, changing the content of all the registers of the device. Since in many applications only some particular registers need to be affected by the context restoring process, an additional strategy has to be followed. Every stack of configuration resources is related to a specific bit, commonly named the mask bit. GCAPTURE/GRESTORE commands only affect the state of the flip-flops not marked with the mask bit. Each column is related to a single frame in the mask column. In this way, for every mask frame set, the entire related column in the clock region of DSP/CLB is protected/unprotected. Further information on this bit (for Virtex-5 devices) can be found in either [30] or [8]. This feature makes it possible to protect/unprotect specific areas, determining which resources are to be modified. In this way, [8, 61] follow the strategy depicted in Figure 2.19 (extracted from [8]). Since the entire FPGA design is unprotected by default, in a first step the entire FPGA design is protected. When context needs to be saved, the target region has to be unprotected before capturing the data. After capturing the data, the target region can optionally be protected; in any case, the target region has to be unprotected before restoring the context. After finishing the context saving/restoring processes, the target region can be protected in order to avoid undesired changes in future startup sequences.
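The protection sequence can be made concrete with a toy software model. It is only a sketch: the class `MaskedFabric`, its method names and the grouping of flip-flops into named regions are hypothetical simplifications, and real mask bits act on columns within a clock region. The model keeps a live value and a stored configuration value per flip-flop, and the capture and restore operations skip every protected region.

```python
from typing import Dict, Set


class MaskedFabric:
    """Toy model: flip-flops grouped by region, one protection flag per region."""

    def __init__(self, flip_flops: Dict[str, Dict[str, int]]):
        self.live = {reg: dict(ffs) for reg, ffs in flip_flops.items()}
        # Stored (configuration memory) values start at 0 in this model.
        self.stored = {reg: {name: 0 for name in ffs}
                       for reg, ffs in flip_flops.items()}
        self.protected: Set[str] = set()  # everything unprotected by default

    def protect(self, *regions: str) -> None:
        self.protected.update(regions)

    def unprotect(self, *regions: str) -> None:
        self.protected.difference_update(regions)

    def gcapture(self) -> None:
        """Copy live values into storage for unprotected regions only."""
        for reg in self.live:
            if reg not in self.protected:
                self.stored[reg] = dict(self.live[reg])

    def grestore(self) -> None:
        """Copy stored values back to the flip-flops of unprotected regions only."""
        for reg in self.live:
            if reg not in self.protected:
                self.live[reg] = dict(self.stored[reg])


fabric = MaskedFabric({"task": {"r0": 1, "r1": 0}, "static": {"s0": 1}})
fabric.protect("task", "static")  # first step: protect the entire design
fabric.unprotect("task")          # expose the target region
fabric.gcapture()                 # only the task context is captured
fabric.protect("task")            # optional protection after capture
fabric.live["task"]["r0"] = 0     # the task state changes in the meantime
fabric.unprotect("task")          # required before restoring
fabric.grestore()                 # only the task context is restored
fabric.protect("task")            # guard against future startup sequences
print(fabric.live)
```

In this model the static region keeps its live value throughout, whereas an unprotected static region would have been overwritten with its stored value during the restore step.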

Figure 2.19: Flow chart of an FPGA protection and unprotection alternative.

In general terms, these published approaches present some common drawbacks. First, they require a controller module and a memory block to store the read data. In addition, considering that reconfiguration is a time-demanding process, the price to pay when using these strategies is low availability. Another limitation of these approaches is that they require disabling the CRC feature of the generated bitstream. Because the bitstream content is modified, the original CRC value is no longer valid, which makes the bitstream unsuitable for reconfiguration. Hence, using the CRC feature would require generating a new CRC value, which demands a high processing effort. Finally, no approaches have been proposed to perform such applications in newer devices, such as the 7 series by Xilinx.
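The CRC limitation can be illustrated with a generic checksum. The sketch below uses a plain bitwise CRC-32C over bytes as a stand-in; it does not reproduce the exact CRC procedure of any Xilinx device, and the payload bytes are made up. It only shows that changing a single data bit invalidates the previously stored checksum, and that recomputing it requires processing every byte of the modified data.

```python
def crc32c(data: bytes, crc: int = 0) -> int:
    """Bitwise CRC-32C (Castagnoli), reflected polynomial 0x82F63B78."""
    crc ^= 0xFFFFFFFF
    for byte in data:
        crc ^= byte
        for _ in range(8):
            # Shift out one bit; apply the polynomial if that bit was set.
            crc = (crc >> 1) ^ (0x82F63B78 if crc & 1 else 0)
    return crc ^ 0xFFFFFFFF


# Hypothetical payload standing in for bitstream configuration data.
original = bytes([0x00, 0x11, 0x22, 0x33, 0x44, 0x55, 0x66, 0x77])
stored_crc = crc32c(original)

# Modify one user data bit, as a bitstream based write would do.
modified = bytearray(original)
modified[3] ^= 0x01

crc_still_valid = crc32c(bytes(modified)) == stored_crc
print(crc_still_valid)
```

The bit-by-bit loop hints at why regenerating the checksum on a resource-constrained controller is costly, which is the reason the cited approaches disable the CRC check instead.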

In summary, despite the presented disadvantages, bitstream based user data content management techniques provide an alternative access to data content without increasing the area overhead and without interfering with the logic of the FPGA, which, in many cases, can be an interesting way to overcome certain limitations of classic memory management.

2.3 Radiation Effects on Soft-Core Processors