
Chapter 4: FPGA-based Embedded Hardware System for Graphics Applications

4.7 Setup of FPGA-based Embedded Hardware System for Graphics Applications

An FPGA-based embedded hardware system has to be constructed for the graphics applications. Besides the microprocessor, memory, and general peripherals, an LCD, frame buffers, and hardware units for the graphics pipeline have to be specified for this project. To build the FPGA-based embedded hardware system, as shown in Figure 3.6, two tools provided by Altera Corporation must be used: Quartus II and SOPC Builder.

4.7.1 Nios II Processor Settings

As a system component, the Nios II processor is a soft core: it is volatile and present only after the FPGA is configured, as shown in Figure 3.6. Therefore, it must be added to the SOPC system with SOPC Builder when the embedded hardware system is set up. As shown in Figure 4.7, the Nios II processor comes in three configurations: the lowest is Nios II/e (size-optimised economy), the middle is Nios II/s (standard), and the highest is Nios II/f (performance-optimised fast). Since hardware speed-up supports the hybrid approach to graphics acceleration that is the goal of this project, the Nios II/f has been chosen. In addition to the 32-bit RISC structure of the Nios II/e, the Nios II/f provides hardware support for an instruction cache, a data cache, dynamic branch prediction, hardware multiply, hardware divide, and a barrel shifter. All of these elements accelerate computation, which is critical for the algorithms of 3D graphics speed-up.

The Nios II/f processor is configured to run at 100 MHz. Its performance can be up to 113 DMIPS, and its logic usage in the FPGA is 1400-1800 LEs. The reset vector is located at physical address 0x10000000, at offset 0x0 in ext_flash, which is the external 64-MByte CFI (common flash interface) flash. The exception vector is located at physical address 0x1c000040, at offset 0x40 in ddr_sdram_1, which is one of the two external DDR2 SDRAM memories. Since the operating system does not support memory management, both the MMU (memory management unit) and MPU (memory protection unit) options are disabled. In Figure 4.8, both the instruction and data caches are set to 32 KBytes. The data cache line size is 32 bytes.
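The vector and cache settings above imply a few derived values, which the following minimal Python sketch works out. The helper names (`region_base`, `cache_lines`) are hypothetical and serve only as illustration; they are not part of any Altera tool. The sketch assumes that each vector address is the region base plus the stated offset.

```python
# Vector addresses and offsets as stated in the text.
RESET_VECTOR = 0x10000000       # in ext_flash, offset 0x0
EXCEPTION_VECTOR = 0x1C000040   # in ddr_sdram_1, offset 0x40


def region_base(vector_address: int, offset: int) -> int:
    """Base address of the memory region holding a vector (assumes base + offset)."""
    return vector_address - offset


def cache_lines(cache_bytes: int, line_bytes: int) -> int:
    """Number of lines in a cache of the given total size and line size."""
    return cache_bytes // line_bytes


EXT_FLASH_BASE = region_base(RESET_VECTOR, 0x0)
DDR_SDRAM_1_BASE = region_base(EXCEPTION_VECTOR, 0x40)
DATA_CACHE_LINES = cache_lines(32 * 1024, 32)

print(hex(EXT_FLASH_BASE), hex(DDR_SDRAM_1_BASE), DATA_CACHE_LINES)
```

Under these assumptions, ext_flash starts at 0x10000000, ddr_sdram_1 starts at 0x1c000000, and the 32-KByte data cache holds 1024 lines of 32 bytes.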

Figure 4.7 Nios II Settings (1)

Figure 4.8 Nios II Settings (2)

4.7.2 System Clock Settings

To enhance the overall performance of the system, different peripherals run in clock domains different from that of the Nios II processor. There are two external clock sources: one is 50 MHz and the other is 125 MHz. From the 50-MHz clock, a PLL produces two clocks: 100 MHz for the CPU and the LCD, and 60 MHz for the slow peripherals. The 50-MHz clock is for one of the two external DDR SDRAM memory controllers and the 125-MHz clock is for the other one.
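The two PLL outputs can be written as ratios of the 50-MHz reference, as in the following minimal sketch. It shows only the overall multiply/divide ratio. The actual PLL counter settings chosen by Quartus II are not given in the text, and the function name `pll_ratio` is hypothetical.

```python
from fractions import Fraction

REF_MHZ = 50  # external reference clock feeding the PLL


def pll_ratio(out_mhz: int, ref_mhz: int = REF_MHZ) -> Fraction:
    """Overall output/input frequency ratio in lowest terms (multiply / divide)."""
    return Fraction(out_mhz, ref_mhz)


CPU_LCD_RATIO = pll_ratio(100)  # clock for the CPU and the LCD
SLOW_RATIO = pll_ratio(60)      # clock for the slow peripherals

for name, ratio in (("cpu/lcd", CPU_LCD_RATIO), ("slow", SLOW_RATIO)):
    print(f"{name}: multiply by {ratio.numerator}, divide by {ratio.denominator}")
```

The 100-MHz clock is the reference multiplied by 2, and the 60-MHz clock is the reference multiplied by 6 and divided by 5.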

Because the Nios II processor is connected to peripherals that run in clock domains different from its own, three clock-crossing bridges are needed. Two bridges connect the CPU to the two external DDR SDRAM memories, respectively. The third bridge connects the CPU to the slow peripherals, such as the system timers.

4.7.3 DDR2 SDRAM Memory Controller Settings

For graphics applications, frame buffers are necessary. In addition, as in any ES, the system code and data have to be stored while the system runs. These requirements determine the system memory. Two external 64-MByte DDR2 SDRAM memory banks (Micron MT47H32M16CC-3) are used in the system. They can run at a full rate of 153.85 MHz or a half rate of 76.9 MHz.

One memory that stores video frames is controlled by one of two controllers and connected to the entry of the video pipeline. It plays the role of video frame buffer. The video frame data stored in the buffer adopt a 64-bit format. The entry of the video pipeline is an SG-DMA, which is set to a 64-bit width at the half rate of 76.9 MHz and delivers the video stream data from the memory into the FIFO of the video pipeline. Thus, the memory is controlled to run at the half rate of 76.9 MHz with a 64-bit data width.

The other memory is controlled by the second controller. It serves as the system memory for instruction code and data and is connected to the Nios II data bus. It runs at the full rate of 153.85 MHz with a 32-bit data width.
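The two controller configurations trade clock rate against data width. The following minimal sketch compares their peak transfer rates and estimates the frame-buffer footprint. It rests on three assumptions that the text does not state in this section: one word is transferred per clock cycle, each pixel occupies 32 bits in the frame buffer, and the display is the 800 × 480 LCD described in Section 4.7.6. All names are hypothetical.

```python
def peak_rate_mbyte_s(clock_mhz: float, width_bits: int) -> float:
    """Peak transfer rate in 10^6 bytes/s, assuming one word per clock cycle."""
    return clock_mhz * width_bits / 8.0


VIDEO_RATE = peak_rate_mbyte_s(76.9, 64)     # half rate, 64-bit width
SYSTEM_RATE = peak_rate_mbyte_s(153.85, 32)  # full rate, 32-bit width

WIDTH, HEIGHT = 800, 480   # LCD resolution
BYTES_PER_PIXEL = 4        # assumption: 32 bits per pixel in the frame buffer
FRAME_BYTES = WIDTH * HEIGHT * BYTES_PER_PIXEL
BANK_BYTES = 64 * 1024 * 1024
FRAMES_PER_BANK = BANK_BYTES // FRAME_BYTES

print(round(VIDEO_RATE, 1), round(SYSTEM_RATE, 1), FRAME_BYTES, FRAMES_PER_BANK)
```

Under these assumptions, both configurations reach roughly the same peak rate of about 615 MByte/s, and one 64-MByte bank has room for 43 complete frames of 1,536,000 bytes each.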

4.7.4 CFI Flash Memory Controller Settings

The FPGA configuration files of FPGA-based ESs have to be stored in a permanent memory. For graphics applications, the application software also has to be stored in a permanent memory. A Spansion flash memory with a capacity of 64 MByte has been chosen as the permanent memory in the present research. It stores the FPGA configuration data and the application programs for this project.

The CFI (common flash interface)-compliant flash memory controller is configured to control this external flash device. Its address width is 25 bits and its data width is 16 bits. The setup time, wait-state time, and hold time for read and write transfers are set to 80.0 ns, 40.0 ns, and 20.0 ns, respectively.
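The address and data widths can be checked against the stated 64-MByte capacity with a short calculation. The sketch below assumes that each of the 2^25 addresses selects one 16-bit word. This assumption is what makes the figures agree, but the text does not state it. The function name is hypothetical.

```python
ADDRESS_BITS = 25
DATA_BITS = 16


def capacity_bytes(address_bits: int, data_bits: int) -> int:
    """Capacity when every address selects one word of data_bits bits."""
    return (1 << address_bits) * (data_bits // 8)


FLASH_BYTES = capacity_bytes(ADDRESS_BITS, DATA_BITS)
FLASH_MBYTES = FLASH_BYTES // (1024 * 1024)

print(FLASH_BYTES, FLASH_MBYTES)
```

With word addressing, 25 address bits and a 16-bit data bus give 67,108,864 bytes, which is 64 MByte.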


4.7.5 JTAG UART Settings

During ES development, the PC host on which development is carried out has to communicate with the FPGA development board that carries the target FPGA device. The JTAG UART (universal asynchronous receiver/transmitter) provides the serial communication between the PC host and the FPGA development board. Through the JTAG port and cable, the FPGA configuration file and application software are downloaded to the devices on the FPGA board. The write FIFO of the JTAG UART (from the Avalon interface of the FPGA to JTAG) is set to a buffer depth of eight bytes and an IRQ threshold of four. Its read FIFO (from JTAG to the Avalon interface) is likewise set to a buffer depth of eight bytes and an IRQ threshold of four. Both FIFOs are built from on-chip registers.

4.7.6 Settings for LCD Controller Interface and Video Pipeline

For graphics applications, the LCD and the video pipeline form the final part of the graphics pipeline. The LCD is the screen device that displays the pixels of a graphics image. Since the data format stored in the frame buffers differs from the format streamed to the LCD device, the video pipeline performs the data matching and synchronisation. In this project, the LCD device on the board is a 4.3-inch Toppoly TD043MTEA1 active-matrix colour display with a resolution of 800 × 480 pixels. The LCD controller interface and video pipeline are integrated into the system.

4.7.6.1 LCD Controller Interface

The LCD controller interface built with three Altera PIO (parallel I/O) cores consists of three one-bit ports, including <lcd_i2c_scl>, <lcd_i2c_sdat>, and <lcd_i2c_en>. The <lcd_i2c_scl> is an output port for the LCD controller clock output. The <lcd_i2c_sdat> is a bidirectional port for the LCD controller data. The <lcd_i2c_en> is an output port for the LCD controller enable.

4.7.6.2 Video Pipeline

The video pipeline is composed of IP cores that can be customised to suit the resolution and aspect ratio of the LCD device. Besides the video frame buffer and SG-DMA introduced in Section 4.7.3, the video pipeline comprises a FIFO, two data format adapters, a pixel format converter, and a video sync-generator.

The FIFO is configured as a dual-clock FIFO with a depth of 128 units and is built from on-chip memory blocks. It buffers video stream data when data are fetched from the video frame buffer faster than the pixels are displayed on the LCD device.

The first data format adapter turns the 64-bit stream into a 32-bit stream. It is set to eight data symbols per system clock for input and four data symbols per system clock for output. Each symbol has eight bits.

The pixel format converter is designed to take the 32 bits from the upstream and send 24 bits to the downstream by discarding eight bits. The 24 bits consist of eight bits for each of three channels of red, green and blue.

When sent out of the FPGA chip, the pixel data form a stream of three eight-bit values per pixel. The second data format adapter is therefore needed to turn the 24-bit stream into an eight-bit stream. This adapter is set to three data symbols per system clock for input and one data symbol per system clock for output.
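The width conversions described above (64 bits to 32 bits, 32 bits to 24 bits, 24 bits to 8 bits) can be modelled in software as a rough sketch. The text does not specify the word order within a 64-bit word, which eight bits are discarded, or the channel order on the output. The code assumes that the low 32-bit word comes first, that the top eight bits are dropped, and that the symbols are emitted in R, G, B order. All function names are hypothetical.

```python
from typing import List


def split_64_to_32(word64: int) -> List[int]:
    """First data format adapter: one 64-bit word to two 32-bit words (low first, assumed)."""
    return [word64 & 0xFFFFFFFF, (word64 >> 32) & 0xFFFFFFFF]


def drop_to_rgb24(pixel32: int) -> int:
    """Pixel format converter: discard the top eight bits (assumed), keep 24-bit RGB."""
    return pixel32 & 0x00FFFFFF


def split_24_to_8(rgb24: int) -> List[int]:
    """Second data format adapter: 24-bit RGB to three 8-bit symbols (R, G, B assumed)."""
    return [(rgb24 >> 16) & 0xFF, (rgb24 >> 8) & 0xFF, rgb24 & 0xFF]


def pipeline(word64: int) -> List[int]:
    """Chain the three stages for one 64-bit word fetched from the frame buffer."""
    symbols: List[int] = []
    for pixel32 in split_64_to_32(word64):
        symbols.extend(split_24_to_8(drop_to_rgb24(pixel32)))
    return symbols


SYMBOLS = pipeline(0xFF11223300AABBCC)
print([hex(s) for s in SYMBOLS])
```

Each 64-bit word therefore yields two pixels and six eight-bit symbols. In the example, the low word 0x00AABBCC produces 0xAA, 0xBB, 0xCC, and the high word 0xFF112233 produces 0x11, 0x22, 0x33.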

The video sync-generator synchronises the RGB pixels of an image in rows and columns by means of horizontal and vertical synchronisation signals. As the display is scanned line by line, the horizontal synchronisation for a whole line of pixels is done before the vertical synchronisation. The horizontal synchronisation of the 800 RGB pixels in a line is done at the rate of one pixel per system clock. The vertical synchronisation of the 480 lines of pixels in an image is done line by line. The horizontal blank is set to 216 pixels, the horizontal front porch to 40 pixels, and the horizontal sync pulse to one pixel, so the total number of horizontal scan pixels is 1056. The vertical blank is set to 35 lines, the vertical front porch to ten lines, and the vertical sync pulse to one line, so the total number of vertical scan lines is 525. Figure 4.9 illustrates the video pipeline.
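The scan totals can be reproduced from the listed parameters. The stated totals match 800 + 216 + 40 = 1056 and 480 + 35 + 10 = 525, which suggests that the one-unit sync pulse is counted inside the blank interval. This reading is an assumption. The refresh-rate estimate in the sketch is also only illustrative: the pixel rate is not given explicitly in the text, so the example assumes a 100-MHz symbol clock with three eight-bit symbols per pixel.

```python
H_ACTIVE, H_BLANK, H_FRONT_PORCH = 800, 216, 40
V_ACTIVE, V_BLANK, V_FRONT_PORCH = 480, 35, 10

# Assumption: the one-unit sync pulse lies inside the blank interval,
# so it is not added separately to the totals.
H_TOTAL = H_ACTIVE + H_BLANK + H_FRONT_PORCH
V_TOTAL = V_ACTIVE + V_BLANK + V_FRONT_PORCH
PIXELS_PER_FRAME = H_TOTAL * V_TOTAL


def refresh_rate_hz(pixel_rate_hz: float) -> float:
    """Frames per second for a given pixel rate, blanking included."""
    return pixel_rate_hz / PIXELS_PER_FRAME


# Assumption: 100-MHz symbol clock, three 8-bit symbols per pixel.
ASSUMED_PIXEL_RATE_HZ = 100_000_000 / 3
REFRESH_HZ = refresh_rate_hz(ASSUMED_PIXEL_RATE_HZ)

print(H_TOTAL, V_TOTAL, PIXELS_PER_FRAME, round(REFRESH_HZ, 1))
```

Under the assumed pixel rate, a frame of 554,400 pixel periods gives a refresh rate of roughly 60 Hz. A different pixel clock scales this figure proportionally.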

Figure 4.9 Block Diagram of Video Pipeline (Purple Blocks are Off-Video-Pipeline Blocks)

[Block diagram, in data-flow order: 64 bits → SGDMA → 64 bits → FIFO → 64 bits → Data Format Adapter → 32 bits → Pixel Format Converter → 24 bits → Data Format Adapter → 8 bits → Video-Sync Generator → 8 bits → Off-Chip LCD Data Interface]


The video pipeline provided by Altera covers only the fundamental functions of controlling and transferring pixel data to the off-chip LCD device. It is not sufficient for the graphics pipeline. The rest of the graphics pipeline is implemented by an algorithm-specified module and the Mesa-OpenGL implementation. The details of the graphics pipeline will be discussed in Chapter 5.
