Firmware development is a time-consuming task, especially for complex two-level FPGA systems such as the DSSC DAQ chain. During development, the individual firmware modules have to be simulated to verify their behavior before they can be implemented in the real hardware⁵. The firmware simulation grew rapidly with the increasing number of implemented features. Integrating the individual modules into the system and verifying their behavior within the readout sequence required a comprehensive simulation comprising the whole readout chain. This large simulation made it possible to monitor and evaluate all important details of the firmware interaction and is presented in the following.
Besides simulation, the signals inside the FPGA can also be debugged directly. For this purpose, the Xilinx ChipScope application is used to access signals in the firmware logic.
In addition, the DSSC system provides certain debug and monitoring features which turned out to be very useful for firmware and software development. These features are presented at the end of this section.
6.5.1 System Simulation and Simulation Mode
Verification of firmware modules is realized by integrating the modules into the large system simulation which comprises all central building blocks of the digital space of the detector:
• All devices controlled by EPC registers.
• The full data path in the PPT FPGA.
• A DDR3 memory simulation module.
• The IOB firmware.
• The digital part of the readout ASIC, including an SRAM module which delivers simulated data during the readout simulation.
• A simplified version of the SIB which responds to the commands from the PPT.
• A simulation dummy for the PRBs which reacts to the signals from the IOB.
Only the Microblaze is not included in the simulation, since its operating system can hardly be simulated. In the real system, however, all configuration data for the EPC registers are delivered by this microcontroller. Consequently, a Verilog module is needed in the simulation which implements the configuration of the EPC registers to initiate firmware module behavior.
⁵ The simulation software used for the Verilog modules was the Incisive (NCSim) Functional Simulator from Cadence.
Another challenge of the simulation is the definition of an initial state which represents a realistic configuration of the system. In the full system simulation, the very large number of configuration registers requires an automated sequence to generate the initial state. In particular, the many JTAG registers of the ASIC have to be configured before the ASIC can operate at all, even in simulation.
[Figure 6.9: block diagram with the elements Control Software (GUI, C++ control library libPPT, SimFile Writer), Config. Data .txt File, System Simulation (CmdFileReader, EPC Registers, EPC Devices) and Kintex-7 FPGA (Ethernet, Microblaze, EPC Registers, EPC Devices), with separate paths for Normal Mode and Simulation Mode.]
Figure 6.9.: Simulation Mode. By activating simulation mode in the control software, all configuration data is redirected into a text file which can be used to configure the system simulation.
To this end, the CmdFileReader module has been introduced. The module reads bytes from a text file and injects them synchronously into the system simulation, as visualized in figure 6.9. In simulation, the CmdFileReader effectively replaces the Microblaze and reproduces the configuration interface of the EPC registers. In this way, the whole detector system simulation can be initialized and controlled from a text file.
However, the generation of this text file also has to be automated as far as possible. This was solved at a very early stage of the project: a method was introduced in the control software of the ASIC Test System which automatically redirects the configuration data to a text file. In this way, large numbers of configuration bytes can easily be provided to the simulation in the same order in which they are sent to the real system⁶.
Because of its high value for firmware development, this mode is also implemented in the control software for the real detector system and the PPT Test Setup. Unfortunately, things are more complicated in this environment because the control software does not communicate directly with the EPC registers but via Ethernet and the SlowControlServer application (see section 9.2), which runs on the Microblaze. Most of the functionality of the SlowControlServer has therefore also been integrated into the simulation mode in order to allow control of the large system simulation from the control software.
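The redirection of configuration data into a text file, and its replay on the simulation side, can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the class name `EpcTransport`, the one-hex-byte-per-line file layout and the address/data framing are hypothetical and do not reproduce the actual libPPT or CmdFileReader formats. The sketch only shows the principle that the same write calls produce the same byte order in both modes.

```python
import io


class EpcTransport:
    """Hypothetical transport for register writes (not the real libPPT API)."""

    def __init__(self, sim_file=None):
        # When a text stream is passed, simulation mode is active.
        self.sim_file = sim_file
        # Stand-in for the real Ethernet link in normal mode.
        self.sent = []

    def write_register(self, address, data):
        # Assumed framing: one address byte followed by the data bytes.
        payload = bytes([address & 0xFF]) + bytes(data)
        if self.sim_file is not None:
            # Simulation mode: write one hex byte per line, in send order.
            for byte in payload:
                self.sim_file.write(f"{byte:02X}\n")
        else:
            # Normal mode: hand the payload to the (mocked) link.
            self.sent.append(payload)


def read_command_file(stream):
    """Reader side: yield the bytes in file order, skipping blank lines."""
    for line in stream:
        line = line.strip()
        if line:
            yield int(line, 16)


# The same calls are issued in both modes; only the destination differs.
sim = io.StringIO()
transport = EpcTransport(sim_file=sim)
transport.write_register(0x10, [0xDE, 0xAD])
transport.write_register(0x11, [0x01])

sim.seek(0)
replayed = list(read_command_file(sim))
```

Keeping the mode switch inside the transport layer means that the calling code does not need to know whether it is configuring hardware or preparing a simulation run.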
6.5.2 FPGA Firmware Debugging
In order to monitor the status and behavior of the firmware modules implemented in the real system, the Xilinx FPGAs provide a way to access and monitor selected signals and registers during operation. This is done by connecting special hardware blocks of the FPGA to the desired signals. Using the Xilinx ChipScope Analyzer [91] application, the behavior of these signals can be monitored in a graphical user interface. A screenshot of this application is depicted in figure 6.10.
Figure 6.10.: Screenshot of the Xilinx ChipScope Pro Analyzer [91].
The direct access to signals in the FPGA logic is realized by integrating logic analyzers (ILAs) directly into the target design. By connecting a special debugging cable to the JTAG signals of the FPGA, the ILAs can be accessed from a computer. However, in a multi-stage FPGA system, direct access to these signals is not available for every device. This is especially true in the DSSC system, in which the IOB FPGAs are operated in vacuum, where there is no way to connect a debugging cable.
Since the IOB FPGAs are programmed from the PPT FPGA, their JTAG signals are connected to it and accessible from the Microblaze. For this configuration, Xilinx provides a tool implementing a TCP/IP-based protocol which can be used instead of a physical debugging cable. This Xilinx Virtual Cable (XVC) [92] protocol gives access to the debug features of a remote FPGA within an embedded environment. Figure 6.11 shows the connection scheme of the FPGA debugging signals in the DSSC system.
By connecting the JTAG signals of the Kintex-7 FPGA on the PPT to general purpose I/Os, the XVC can be used to access the debugging features of the IOB FPGAs and even of the Kintex-7 FPGA itself.
[Figure 6.11: block diagram "FPGA JTAG Signals Connections" with the elements Slow Control Ethernet, Kintex-7 PPT FPGA (Microblaze, Xilinx Virtual Cable Server, PPT JTAG) and the Spartan-6 IOB FPGAs IOB1 to IOB4 with their JTAG connections.]
Figure 6.11.: Connection of the JTAG signals of the system FPGAs. By running the Xilinx Virtual Cable Server on the Microblaze, the JTAG interface of the IOB FPGAs, and even of the PPT FPGA itself, can be accessed. Via the Ethernet connection of the PPT, the Xilinx ChipScope software can connect to the XVC server and access the debugging logic inside the IOB FPGAs remotely.
The XVC protocol is implemented by a custom server application which runs on the Microblaze. The ChipScope application connects to the server via Ethernet as if it were a physical cable, and the application is used in exactly the same way. The XVC server for the Microblaze has been implemented by Jan Soldat.
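To give an impression of what travels over the Ethernet connection instead of a debugging cable, the following Python sketch builds and parses a `shift:` message as described in the publicly documented XVC 1.0 protocol: the ASCII prefix `shift:`, a 4-byte little-endian bit count, and then the TMS and TDI vectors, each rounded up to whole bytes. The server answers with a TDO vector of the same length. Whether the DSSC server uses exactly this protocol version is an assumption, and the function names are hypothetical.

```python
import struct

SHIFT_PREFIX = b"shift:"


def build_shift_message(num_bits, tms, tdi):
    """Build an XVC 'shift:' request (layout per the public XVC 1.0 description)."""
    num_bytes = (num_bits + 7) // 8  # vectors are padded to whole bytes
    if len(tms) != num_bytes or len(tdi) != num_bytes:
        raise ValueError("TMS and TDI must each hold ceil(num_bits / 8) bytes")
    return SHIFT_PREFIX + struct.pack("<I", num_bits) + bytes(tms) + bytes(tdi)


def parse_shift_message(message):
    """Server-side view: split a 'shift:' request into its three fields."""
    if not message.startswith(SHIFT_PREFIX):
        raise ValueError("not a shift command")
    offset = len(SHIFT_PREFIX)
    (num_bits,) = struct.unpack_from("<I", message, offset)
    offset += 4
    num_bytes = (num_bits + 7) // 8
    tms = message[offset:offset + num_bytes]
    tdi = message[offset + num_bytes:offset + 2 * num_bytes]
    if len(tms) != num_bytes or len(tdi) != num_bytes:
        raise ValueError("message shorter than announced bit count")
    return num_bits, tms, tdi


# Example: shift 10 bits, i.e. two bytes per vector.
request = build_shift_message(10, b"\x1f\x00", b"\x00\x00")
fields = parse_shift_message(request)
```

A server on an embedded processor would take the parsed TMS and TDI bits, toggle the JTAG pins accordingly, and return the sampled TDO bits in a reply of `ceil(num_bits / 8)` bytes.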
In the current implementation of the PPT and IOB firmware, ILAs (Integrated Logic Analyzers) are inserted at several positions, so that the main state machines and important parts of the data path can be monitored. Additional ILAs are inserted in controllers of the IOB. Since the debugging features do not slow down the operation of the firmware, there is no need to remove them from the final design.
The ChipScope Pro Serial I/O Toolkit contains a further tool which allows measurements of the transfer quality of the high-speed links of the DSSC system. This Integrated Bit Error Ratio Test (IBERT) can easily be integrated into a firmware design and connected to the respective high-speed transceivers. It also allows the deliberate injection of bit errors for testing. The evaluation of the IBERT measurements is implemented in the ChipScope application, which provides a graphical interface for the IBERT IP core. The tool is used to measure the bit error rate of the serial connections of the data path from the IOB to the PPT and of the 10 Gigabit-Ethernet links of the QSFP+ connection. The measurements and results are described in chapter 8.
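How a bit error measurement of this kind is typically evaluated can be sketched in a few lines of Python. The estimate is simply the number of bit errors divided by the number of transferred bits. If no error is observed at all, only an upper bound can be stated. The bound used here is the common zero-error approximation -ln(1 - confidence) / N. The link speed and measurement time in the example are hypothetical values, not results from the DSSC system.

```python
import math


def ber_estimate(bit_errors, bits_transferred):
    """Point estimate of the bit error ratio."""
    return bit_errors / bits_transferred


def ber_upper_bound_zero_errors(bits_transferred, confidence=0.95):
    """Upper bound on the bit error ratio when zero errors were observed.

    Uses P(no error in N bits) = (1 - p)^N, approximated by exp(-N * p).
    """
    return -math.log(1.0 - confidence) / bits_transferred


# Hypothetical measurement: a 10 Gbit/s link observed for 60 s without errors.
bits = 10e9 * 60
bound = ber_upper_bound_zero_errors(bits)
print(f"bit error ratio below {bound:.2e} at 95% confidence")
```

The bound shrinks linearly with the number of transferred bits, which is why very low error ratios require long measurement times.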
7 Implementation of the Readout Chain
This chapter describes the implementation of the high-performance readout chain [93] which transfers the image data from the in-pixel memory of the readout ASIC to the fast 10 Gigabit-Ethernet links of the Patch Panel Transceiver. Since the data path of one quadrant can be regarded as an independent system, all data rate computations and discussions refer to the quadrant system. However, all computations can easily be transferred to the full megapixel camera, which consists of four quadrants.
By using modern components, the PPT has become a very compact board which nevertheless provides enough performance to handle the large amount of image data produced by the quarter-megapixel camera during each incident X-ray pulse train. On average, roughly 34 GBit/s are transferred, which corresponds to a constant utilization of 85% of the available 40 GBit/s bandwidth during continuous operation. At these high data rates, the data stream is not only packaged into Ethernet packets and transferred but also re-sorted image-wise. The specification prescribes that the images in the output stream are sorted in the same chronological order in which they were acquired. This sorting is also realized in the data path within the Kintex-7 FPGA.
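The utilization figure, and its translation from one quadrant to the full camera, can be checked with a short calculation. The two rates are taken from the text (the 34 GBit/s value is stated there as approximate). The scaling to four quadrants simply assumes that each quadrant has its own identical output path, as implied by treating the quadrant as an independent system.

```python
QUADRANT_RATE_GBIT = 34.0   # average output rate of one quadrant (approximate)
LINK_CAPACITY_GBIT = 40.0   # available output bandwidth of one quadrant
QUADRANTS = 4               # the megapixel camera consists of four quadrants

# Share of the available bandwidth that is used on average.
utilization = QUADRANT_RATE_GBIT / LINK_CAPACITY_GBIT

# Scaling to the full camera, assuming four identical, independent quadrants.
full_camera_rate = QUADRANTS * QUADRANT_RATE_GBIT
full_camera_capacity = QUADRANTS * LINK_CAPACITY_GBIT

print(f"utilization per quadrant: {utilization:.0%}")
print(f"full camera: {full_camera_rate:.0f} of {full_camera_capacity:.0f} GBit/s")
```

Because both the rate and the capacity scale with the number of quadrants, the utilization of the full camera is the same as that of a single quadrant.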
[Figure 7.1: Scheme of the DSSC Readout Chain with three hardware levels and their tasks. Readout ASIC: signal amplification & filtering, digitization. IOBoard: deserialization, forwarding of the data to the Aurora core. Patch Panel Transceiver: pixel word reconstruction, buffering & image sorting, generation of Ethernet packets.]
Figure 7.1.: Scheme of the readout chain of the DSSC detector. It is implemented in three hardware levels. The major tasks of the modules are summarized.
The readout chain comprises the path of the data from the 64-channel parallel readout of the quadrant ASICs, over the FPGA-to-FPGA connection between the IOB and the PPT and the image-wise reordering, up to the generation of the Ethernet packets. Each component and task has to provide at least the bandwidth of the previous one in order not to stall the whole path and lose data. The highest demands are placed on the DDR3 memory, since the data must be written into the buffer and read out again during the same cycle. Only by using the DDR3 memory in this way is it possible to change the order of the images in the stream without interrupting the transmission. The management of the different tasks requires several interlinked state machines, which have been presented in the previous chapter.
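The bandwidth condition for the stages, and the reason why the memory is the most demanding one, can be expressed as a small check. The sketch reads the simultaneous writing and reading of the buffer as a requirement that the memory sustains twice the stream rate. The stage bandwidths in the example are placeholders chosen for illustration and are not measured values of the DSSC system. Only the 34 GBit/s stream rate and the 40 GBit/s output bandwidth appear in the text.

```python
def first_bottleneck(stages, stream_rate_gbit):
    """Return the name of the first stage that cannot sustain the stream.

    Each stage is (name, bandwidth in GBit/s, number of passes of the data).
    A buffer that is written and read at the same time counts as two passes.
    """
    for name, bandwidth_gbit, passes in stages:
        if bandwidth_gbit < stream_rate_gbit * passes:
            return name
    return None


STREAM_RATE_GBIT = 34.0  # average rate of one quadrant, from the text

# Placeholder bandwidths (assumptions), except the 40 GBit/s Ethernet output.
stages_ok = [
    ("IOB to PPT links", 40.0, 1),
    ("DDR3 buffer", 100.0, 2),   # write + read in parallel
    ("Ethernet output", 40.0, 1),
]
stages_slow_memory = [
    ("IOB to PPT links", 40.0, 1),
    ("DDR3 buffer", 60.0, 2),    # below 2 x 34 GBit/s, so the path would stall
    ("Ethernet output", 40.0, 1),
]

print(first_bottleneck(stages_ok, STREAM_RATE_GBIT))
print(first_bottleneck(stages_slow_memory, STREAM_RATE_GBIT))
```

Under this reading, a memory interface that matches the stream rate only once would not be sufficient, because every image passes through it twice.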
At the output, the readout data has to be transferred in a given, standardized format, which is common for all three megapixel detectors. The specification comprises a defined format for the single Ethernet packets and also a certain format for the Train Data in total. The description of the Train Data Format can be found in section 4.2.1.
The scheme of the readout chain in figure 7.1 summarizes the major tasks.
The high data rates that are required to transfer all data out of the system have been the critical factor for all implementations in the data path and will be discussed at the end of the chapter. Another critical factor is the data transfer quality between the different devices. All data transfer connections have been measured and their bit error rate has been evaluated. The results are presented in chapter 8.