• No results found

In this thesis, existing approaches to bitstream relocation on FPGAs and their draw-backs have been discussed. This work proposes a novel multi-row M-FDR algorithm that can relocate frame data between PRRs faster than contemporary state-of-the-art relocation techniques by Corbetta et al. [4], Carver et al. [5], and Sudarsanam et al. [6]. The proposed algorithm has been implemented using a dedicated hardware architecture called the Accel-erated Memory-based Reconfiguration Circuit (AMRC). A performance estimation model of the AMRC has been proposed that quantifies the performance benefit of AMRC over ARC [6], which was found to be the fastest relocation method among previous approaches that were studied. AMRC was found to be on an average 26.6% faster than ARC [6] when relocating PRRs. An average speedup of 231x over BiRF [4], while for the method proposed by Carver et al. [5] a speedup of 9.2x for circuits on the same side of the FPGA, and 24.2x for circuits on the opposite side of the FPGA was observed.

The major challenge in this thesis involved understanding the existing relocation tech-niques like PRR-PRR relocation [6] and BiRF [4] and identifying their drawbacks. Several tests were used to help identify the performance bottlenecks when using the ICAP and off-chip versus on-off-chip memory during frame data relocation. Another important constraint to manage during M-FDR is that the ICAP cannot be written more than 2048 frame data words at once. Therefore for larger designs, a column pattern had to be identified for copy-ing frames in groups of either two CLB columns, or one CLB and one DSP column, or one BRAM column, because of the way columns are addressed and arranged in Xilinx Virtex-4 FPGAs. Bit reversal is an important consideration when moving frames to the opposite side of the FPGA. The AMRC has been designed to handle configuration requests to any side of the FPGA and to reverse bits for only those frames that fall on the opposite side of

Future research can be carried out in the following directions.

1. Use compression techniques to reduce the memory footprint of frame data in the FDB for efficient on-chip memory utilization.

2. Implement the AMRC on FPGAs with higher gate densities like the Virtex-5 and Virtex-6 families of Xilinx FPGAs to reduce the penalty of using on-chip memory to store frame data.

3. Improve the AMRC hardware to store more than one circuit in the FDB, so as to support a more granular form of run-time parallelization between PRMs.

4. Implement sharing of the FDB between the static and reconfigurable regions of the SoC so that unused BRAM memory in the FDB can be re-used by the static region for storage.

5. Investigate ways to further improve the relocation time, such as reducing relocation overhead during the frame read process, and implementing bulk frame reads.

References

[1] T. Becker, W. Luk, and P. Y. K. Cheung, “Enhancing relocatability of partial bit-streams for runtime reconfiguration,” in Proceedings IEEE International Symposium on Field-Programmable Custom Computing Machines, pp. 35–44, 2007.

[2] H. Kalte, G. Lee, M. Porrmann, and U. Ruckert, “Replica: A bitstream manipulation filter for module relocation in partial reconfigurable systems,” in Proceedings of the 19th IEEE International Parallel and Distributed Processing Symposium (IPDPS’05) - Workshop 3, p. 151. Washington, DC: IEEE Computer Society, 2005.

[3] H. Kalte and M. Porrmann, “Replica2pro: task relocation by bitstream manipulation in virtex-ii/pro fpgas,” in CF ’06: Proceedings of the 3rd Conference on Computing Frontiers, pp. 403–412. New York: Association for Computing Machinery, 2006.

[4] S. Corbetta, M. Morandi, M. Novati, M. Santambrogio, D. Sciuto, and P. Spoletini,

“Internal and external bitstream relocation for partial dynamic reconfiguration,” Very Large Scale Integration (VLSI) Systems, IEEE Transactions, vol. 17, no. 11, pp. 1650–

1654, Nov. 2009.

[5] J. Carver, R. Pittman, and A. Forin, “Relocation and automatic floor-planning of fpga partial reconfiguration bitstreams,” Microsoft Research Technical Report, Aug. 2008.

[6] A. Sudarsanam, R. Kallam, and A. Dasu, “Prr-prr dynamic relocation,” IEEE Computer Architecture Letters, vol. 8, pp. 44–47, July 2009 [Online]. Available:

http://dx.doi.org/10.1109/L-CA.2009.49.

[7] A. Sudarsanam, R. Barnes, J. Carver, R. Kallam, and A. Dasu, “Dynamically re-configurable systolic array accelerators: A case study with extended kalman filter and discrete wavelet transform algorithms,” Computers Digital Techniques, The Institution of Engineering and Technology (IET)), vol. 4, no. 2, pp. 126–142, Mar. 2010.

[8] A. Sreeramareddy, J. G. Josiah, A. Akoglu, and A. Stoica, “Scars: Scalable self-configurable architecture for reusable space systems,” in AHS ’08: Proceedings of the 2008 NASA/ESA Conference on Adaptive Hardware and Systems, pp. 204–210.

Washington, DC: IEEE Computer Society, 2008.

[9] 1-CORE Technologies, “Fpga logic cells comparison,” [http://www.1-core.com/

library/digital/fpga-logic-cells/], 2004 - 2009.

[10] Xilinx, “Design flow overview,” [http://www.xilinx.com/itp/xilinx7/books/

data/docs/dev/dev0013_5.html], 2005.

[11] Xilinx, “Early access partial reconfiguration user guide,” UG208 (v1.1), Xilinx Inc., Mar. 6, 2006.

[12] C. Kao, “Benefits of partial reconfiguration,” Xcell Journal, pp. 65–76, 2005.

manuals/xilinx11/edk_ctt.pdf], 2009.

[14] Xilinx, “Microblaze soft processor core,” [http://www.xilinx.com/tools/

microblaze.htm], 2010.

[15] C. Conger, R. Hymel, M. Rewak, A. George, and H. Lam, “Fpga design framework for dynamic partial reconfiguration,” in Proceedings of the 15th Reconfigurable Architecture Workshop, RAW 2008, pp. 1–8, April 14-15, 2008.

[16] Xilinx, “Planahead user guide,” [http://www.xilinx.com/support/documentation/

sw_manuals/xilinx11/PlanAhead_UserGuide.pdf], 2009.

[17] Xilinx, “Virtex-4 configuration guide,” [http://www.xilinx.com/support/

documentation/user_guides/ug071.pdf], 2009.

[18] D. Mesquita, O. Moraes, J. Palma, R. Mller, and N. Calazans, “Remote and partial reconfiguration of fpgas: tools and trends,” in Proceedings of International Parallel and Distributed Processing Symposium, pp. 22–26, 2003.

[19] E. Carvalho, N. Calazans, E. Briao, and F. Moraes, “Padreh - a framework for the design and implementation of dynamically and partially reconfigurable systems,” in Proceedings of the 17th Symposium on Integrated Circuits and Systems Design, Sim-posio Brasileiro de Concepcao de Circuitos Integrados (SBCCI), pp. 10–15, 2004.

[20] T. Grotker, S. Liao, G. Martin, and S. Swan, System Design with SystemC. Norwell, MA: Kluwer Academic Publishers, 2002.

[21] F. Ferrandi, M. Santambrogio, and D. Sciuto, “A design methodology for dynamic reconfiguration: the caronte architecture,” in Proceedings of 19th IEEE International Parallel and Distributed Processing Symposium, p. 4, 2005.

[22] A. Raghavan and P. Sutton, “Jpg - a partial bitstream generation tool to support partial reconfiguration in virtex fpgas,” in Proceedings of International Parallel and Distributed Processing Symposium, IPDPS 2002, Abstracts and CD-ROM, pp. 155–

160, 2002.

[23] S. Guccione, D. Levi, and P. Sundararajan, “Jbits: Java based interface for reconfig-urable computing.” Xilinx Inc., 1999.

[24] K. Nasi, T. Karouhalis, M. Danek, and Z. Pohl, “Figaro - an automatic tool flow for designs with dynamic reconfiguration,” in International Conference on Field Pro-grammable Logic and Applications, pp. 590–593, 2005.

[25] P. Sedcole, B. Blodget, T. Becker, J. Anderson, and P. Lysaght, “Modular dynamic reconfiguration in virtex fpgas,” in IEE Proceedings of Computers and Digital Tech-niques, vol. 153, no. 3, pp. 157–164, Feb. 2006.

[26] O. Diessel and G. Milne, “Hardware compiler realising concurrent processes in recon-figurable logic,” in IEE Proceedings of Computers and Digital Techniques, vol. 148, no. 45, pp. 152–162, July/Sept. 2001.

[27] A. Bailey, “Using the circal process algebra in digital system design,” in IEE Colloquium on Formal and Semi-Formal Methods for Digital Systems Design, pp. 4/1–4/4, 1991.

[28] M. Gericota, G. Alves, M. Silva, and J. Ferreira, “Run-time management of logic resources on reconfigurable systems,” in Design, Automation and Test in Europe Con-ference and Exhibition, pp. 974–979, 2003.

[29] H. Tan and R. F. DeMara, “A physical resource management approach to minimizing fpga partial reconfiguration overhead,” in IEEE International Conference on Reconfig-urable Computing and FPGAs, pp. 1–5, Sept. 2006.

[30] L. Singhal and E. Bozorgzadeh, “Physically-aware exploitation of component reuse in a partially reconfigurable architecture,” in Proceedings of the 20th IEEE International Parallel and Distributed Processing Symposium, IPDPS 2006, p. 8, 2006.

[31] E. Horta, J. Lockwood, D. Taylor, and D. Parlour, “Dynamic hardware plugins in an fpga with partial run-time reconfiguration,” in Proceedings of the 39th Design Automa-tion Conference, pp. 343–348, 2002.

[32] Y. Krasteva, E. de la Torre, T. Riesgo, and D. Joly, “Virtex ii fpga bitstream manip-ulation: Application to reconfiguration control systems,” in International Conference on Field Programmable Logic and Applications, pp. 1–4, 2006.

[33] D. Montminy, R. Baldwin, P. Williams, and B. Mullins, “Using relocatable bitstreams for fault tolerance,” in 2nd NASA/ESA Conference on Adaptive Hardware and Systems, pp. 701–708, 2007.

[34] Xilinx, “Ml40x getting started tutorial,” [http://www.xilinx.com/support/

documentation/boards_and_kits/ug083.pdf], 2006.

[35] Xilinx, “Xtremedsp for virtex-4 fpgas,” [http://www.xilinx.com/support/

documentation/user_guides/ug073.pdf], 2008.

Related documents