
The RTL design of 32-bit RISC processor using verilog HDL


Academic year: 2020


THE RTL DESIGN OF 32-BIT RISC PROCESSOR USING VERILOG HDL

HAFIZUL HASNI BIN MANAB


A project report submitted in partial fulfilment of the requirements for the award of the degree of

Master of Engineering (Electrical-Computer & Microelectronic System)

Faculty of Electrical Engineering Universiti Teknologi Malaysia


ACKNOWLEDGEMENT

Special thanks to my project supervisor, AP Dr. Muhammad Nasir Bin Ibrahim, and to lecturers Prof. Dr. Mohamed Khalil Hani, Dr. Muhammad Nadzir Marsono and Dr. Usman Ullah Sheikh Izzat Ullah Sheik, whose ideas and shared knowledge helped me greatly in completing this project.


ABSTRACT


ABSTRAK


TABLE OF CONTENTS

CHAPTER TITLE

DECLARATION
DEDICATION
ACKNOWLEDGEMENT
ABSTRACT
ABSTRAK
TABLE OF CONTENTS
LIST OF TABLES
LIST OF FIGURES
LIST OF ABBREVIATIONS
LIST OF APPENDICES

1 INTRODUCTION
1.1 Research Background
1.2 Project Background
1.3 Problem Statement
1.4 Objective
1.5 Scope of Work
1.6 Project Schedule
1.7 Report Outline

2 LITERATURE REVIEW
2.1 Procedure Identification
2.2 Instruction Set Architecture (ISA)
2.2.1 Complex Instruction Set Computer (CISC)
2.2.2 Reduced Instruction Set Computer (RISC)
2.2.3 Instruction Usage
2.3 Load/Store Architecture
2.4 Pipelining
2.5 Pipeline Hazard
2.5.1 Structural Hazard
2.5.2 Data Hazard
2.5.3 Control Hazard
2.6 Processor Resources
2.6.1 Incrementer
2.6.2 Program Counter
2.6.3 Link Register
2.6.4 Instruction Memory
2.6.5 Instruction Register
2.6.6 Register File
2.6.7 Barrel Shifter
2.6.8 Arithmetic and Logic Unit
2.6.9 Status Register
2.6.10 Data Memory
2.6.11 Control Unit
2.6.12 Data Bus
2.6.13 Address Bus
2.6.14 Control Bus
2.6.15 Pipeline Register
2.7 Previous Work
2.8 Jurij Silc, Advanced Processor Architecture

2.8.2 VHDL Design of A 32-bit RISC Processor Core for FPGA Implementation
2.8.3 The RTL Design of 32-bit 5-stage Pipeline RISC Processor Using Verilog HDL
2.9 Introduction to ARM Architecture
2.10 Summary of Chapter 2

3 METHODOLOGY AND PROCEDURES
3.1 Methodology
3.2 Research Flow and Methodology
3.3 Design Flow and Methodology
3.4 Summary of Chapter 3

4 RESULT AND DISCUSSION
4.1 Processor Top Level Design
4.2 Processor Design Specification
4.3 Processor Resources
4.3.1 Clock
4.3.2 Incrementer
4.3.3 Program Counter
4.3.4 Link Register
4.3.5 Instruction Memory
4.3.6 Instruction Register
4.3.7 Register File
4.3.8 Barrel Shifter
4.3.8.1 Logical Shift Left
4.3.8.2 Logical Shift Right
4.3.8.3 Arithmetic Shift Right
4.3.8.4 Rotate Right
4.3.8.5 Rotate Right Extended
4.3.9 Arithmetic Logic Unit
4.3.10 Current Program Status Register

4.4 Instruction Set Architecture and Format
4.4.1 Data Processing
4.4.2 Single Data Transfer
4.4.3 Branch
4.5 ASM Flowchart
4.6 Datapath Unit Design
4.7 Control Unit Design
4.8 myARM RISC Processor Core Architecture
4.9 Processor Validation
4.10 Design Optimization and Performance Analysis
4.10.1 Maximum Operating Frequency
4.10.2 Critical Path Delay
4.10.3 Throughput
4.10.4 Latency
4.10.5 Resource Utilization
4.11 Summary of Chapter 4

5 CONCLUSION
5.1 Limitation of the Processor Core
5.2 Project Conclusion
5.3 Future Works
5.3.1 Pipeline Hazard Handling
5.3.2 Interrupt Capability
5.3.3 Multiplier and Divisor Design
5.3.4 Assembler Compatibility
5.3.5 Implementation on FPGA
5.3.6 Memory Extension
5.3.7 Extending the Instruction
5.3.8 Cache Memory Architecture
5.3.9 Interfacing with Co-Processor
5.3.10 I/O System

REFERENCES
Appendix A

LIST OF TABLES

TABLE NO. TITLE

4.1 Register file operation
4.2 Barrel shifter operation
4.3 Logical shift left operation of barrel shifter
4.4 Logical shift right operation of barrel shifter
4.5 Arithmetic shift right operation of barrel shifter
4.6 Rotate right operation of barrel shifter
4.7 Rotate right extended operation of barrel shifter
4.8 Arithmetic and logic unit operation
4.9 Condition flags
4.10 Instruction set and format
4.11 Data processing instruction
4.12 RTL notation for single data transfer operation
4.13 Pipeline stages
4.14 Hardware for pipeline stages

LIST OF FIGURES

FIGURE NO. TITLE

2.1 Basic 5-stage pipeline structure
2.2 5-stage pipeline structure by Jurij Silc
2.3 5-stage pipeline structure by V. Heuring
2.4 VHDL Design of A 32-bit RISC Processor Core
2.5 Top level architecture
2.6 Datapath unit design and the interconnection
2.7 ARM7TDMI core architecture
3.1 Project Methodology
3.2 Research Methodology
3.3 Experiment on memory’s timing performance
3.4 RTL Design Flow
3.5 Hierarchical Modularity of RTL Design Methodology
3.6 RTL Design Methodology
4.1 Selected blocks from ARM7TDMI core architecture
4.2 Hierarchical modularity of processor design
4.3 Hierarchical modularity of processor design
4.4 Clocking design
4.5 Incrementer design
4.6 Incrementer simulation result
4.7 Program counter design
4.8 Output waveform of program counter simulation
4.9 Link register design
4.11 Instruction memory design
4.12 Memory initialization file
4.13 Output waveform of instruction memory simulation
4.14 Output waveform of instruction memory simulation
4.15 Register file design
4.16 Register file organization
4.17 Register file organization
4.18 Barrel shifter design
4.19 Output waveform of logical shift left operation
4.20 Output waveform of logical shift right operation
4.21 Output waveform of arithmetic shift right operation
4.22 Output waveform of rotate right operation
4.23 Output waveform of rotate right extended operation
4.24 Arithmetic and logic unit design
4.25 Current Program Status Register design
4.26 Comparing the condition field with CPSR
4.27 Data Memory design
4.28 Store and load instructions
4.29 Data processing instruction binary encoding
4.30 Shift operation binary encoding
4.31 Single data transfer binary encoding
4.32 Branch instruction binary encoding
4.33 Algorithmic State Machine
4.34 Processor’s state transition
4.35 Datapath unit design
4.36 Top level design of datapath unit
4.37 Control unit design
4.38 Conceptual design of myAR
4.39 myARM core architecture
4.40 Test program
4.41 Output waveform of pipeline verification
4.42 Critical path delay

LIST OF ABBREVIATIONS

RISC - Reduced Instruction Set Computer

CISC - Complex Instruction Set Computer

SoC - System-on-Chip

ISA - Instruction Set Architecture

PC - Program Counter


LIST OF APPENDICES

APPENDIX TITLE PAGE


CHAPTER 1

INTRODUCTION

This project report describes the design of a 32-bit, 5-stage pipelined RISC processor based on the ARM instruction set architecture and instruction format. This chapter introduces the project, covering the research background, research motivation, scope of work and report organization. The problem statement and the objectives that led to this project are also discussed.

1.1 Research Background


In this project, a processor is designed following the RISC design approach. The design philosophy of a RISC processor is to reduce the complexity of the ISA by limiting the instruction set to a smaller number of more frequently used instructions, which yields better efficiency in modern computing [2].

In addition, the throughput of a RISC processor is improved by pipelining, which lets the processor achieve high speed because operations are performed on registers. RISC architecture was first introduced by IBM in 1975 [16]. However, RISC designs such as Berkeley’s RISC processor and Stanford’s MIPS processor, which were introduced by the respective university research teams, gained greater popularity as public RISC designs [16].

1.2 Project Background

The proposed RISC processor is based on the ARM processor core architecture and is designed using Verilog HDL design entry. The design methodology follows the hierarchical modularity of the RTL design methodology: each functional unit of the processor is modeled in a behavioral style, and all functional blocks are then integrated into a system using structural modeling, for both the datapath and the control unit of the processor core.


Compared with a CISC processor, a RISC processor operates on very few data types and executes only simple instructions [5]. It supports very few, simple addressing modes and is mostly register based. Most instructions operate on data held in the register file, the so-called register-to-register operation, which is faster than CISC’s memory-to-memory operation [6]. Only the load and store instructions access memory. Furthermore, the RISC instruction length is fixed, so decoding an instruction to generate the control signals is easier than with the microprogrammed decoding technique used in CISC [6].
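To illustrate why a fixed instruction length simplifies decoding, the following minimal Python sketch extracts the fields of one 32-bit word using only fixed bit positions. The field layout is an assumption taken from the classic 32-bit ARM data processing format; the encoding actually adopted in this report is defined in Chapter 4 and is not reproduced here. The names `bits`, `decode_data_processing` and `DataProcessingFields` are hypothetical helpers introduced for this example only.

```python
from typing import NamedTuple


class DataProcessingFields(NamedTuple):
    """Fields of a data processing instruction (assumed classic ARM layout)."""
    cond: int       # bits 31:28, condition field
    immediate: int  # bit 25, 1 if operand 2 is an immediate
    opcode: int     # bits 24:21, ALU operation
    set_flags: int  # bit 20, 1 if the condition flags are updated
    rn: int         # bits 19:16, first operand register
    rd: int         # bits 15:12, destination register
    operand2: int   # bits 11:0, shifter operand


def bits(word: int, high: int, low: int) -> int:
    """Return word[high:low] as an unsigned integer."""
    width = high - low + 1
    return (word >> low) & ((1 << width) - 1)


def decode_data_processing(word: int) -> DataProcessingFields:
    """Slice a 32-bit instruction word at fixed positions."""
    return DataProcessingFields(
        cond=bits(word, 31, 28),
        immediate=bits(word, 25, 25),
        opcode=bits(word, 24, 21),
        set_flags=bits(word, 20, 20),
        rn=bits(word, 19, 16),
        rd=bits(word, 15, 12),
        operand2=bits(word, 11, 0),
    )


# 0xE0810002 is the standard ARM encoding of ADD R0, R1, R2.
fields = decode_data_processing(0xE0810002)
print(fields)
```

Because every field sits at a fixed position, the decoder needs no sequencing or microcode: in hardware each field is simply a group of wires taken from the instruction register.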

Parallel execution of instructions through the processor’s pipeline improves the overall throughput [4]. The ARM architecture is used as guidance for designing the RISC processor because its architecture has been used successfully in many embedded systems as well as in mobile computing. The SoC design could be used to develop various fast, small-die-area and low-power embedded systems as well as mobile computing devices [4].
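The throughput gain from pipelining can be estimated with a simple cycle count. The sketch below is an idealized model, not a measurement of the processor in this report: it assumes every stage takes exactly one clock cycle, every instruction passes through all stages, and no hazards cause stalls. The function names are hypothetical.

```python
def cycles_unpipelined(instructions: int, stages: int = 5) -> int:
    """Each instruction occupies the whole datapath for `stages` cycles."""
    return instructions * stages


def cycles_pipelined(instructions: int, stages: int = 5) -> int:
    """The first instruction fills the pipeline; each later one adds a cycle."""
    if instructions == 0:
        return 0
    return stages + instructions - 1


def ideal_speedup(instructions: int, stages: int = 5) -> float:
    """Ratio of unpipelined to pipelined cycle counts (no stalls assumed)."""
    return cycles_unpipelined(instructions, stages) / cycles_pipelined(
        instructions, stages
    )


for n in (1, 10, 100, 1000):
    print(n, cycles_unpipelined(n), cycles_pipelined(n), round(ideal_speedup(n), 2))
```

Under these assumptions the speedup approaches the number of stages as the instruction count grows. Real programs fall short of this bound because of the structural, data and control hazards discussed in Chapter 2.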

1.3 Problem Statement


1.4 Objective

The objective of this project is to study, design, and validate a 32-bit, 5-stage pipelined RISC processor based on the ARM instruction set architecture and format. It covers the study of the ARM core architecture datapath design and an investigation of how the processor executes its instructions.

1.5 Scope of Work

The scope of work in this project covers the design of a 32-bit RISC processor with a 5-stage pipeline that can execute three main types of ARM instruction: data processing, single data transfer, and branch. The project covers design entry using Verilog HDL and synthesis using the Altera Quartus II tool.
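As a rough illustration of how the three instruction types can be told apart, the Python sketch below classifies a 32-bit word from its upper bits. The bit patterns are an assumption based on the classic ARM encoding (bits 27:26 = 00 for data processing, 01 for single data transfer, and bits 27:25 = 101 for branch), simplified to the three classes in scope; the full ARM instruction set reuses part of this space for other instructions, which this sketch ignores. The function name `classify` is hypothetical.

```python
def classify(word: int) -> str:
    """Map a 32-bit instruction word to one of the three supported classes."""
    bits_27_25 = (word >> 25) & 0b111
    bits_27_26 = bits_27_25 >> 1
    if bits_27_26 == 0b00:
        return "data processing"
    if bits_27_26 == 0b01:
        return "single data transfer"
    if bits_27_25 == 0b101:
        return "branch"
    return "unsupported"


# Standard ARM encodings used as sample inputs.
samples = {
    0xE0810002: "ADD R0, R1, R2",
    0xE5910000: "LDR R0, [R1]",
    0xEA000000: "B (offset 0)",
}
for word, text in samples.items():
    print(f"{word:#010x}  {text:16s} -> {classify(word)}")
```

In a control unit the same decision would be made combinationally from the instruction register, selecting which datapath resources each pipeline stage activates.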

1.6 Project Schedule

The project is scheduled over two semesters, spanning eight months. The hardware specification and requirements were derived in the first semester, while the Verilog HDL coding, synthesis, compilation, simulation, and validation of the designed processor were done in the second semester.

1.7 Report Outline

This section outlines the content of each chapter of this project report.


Chapter 2 Introduction to instruction set architectures, the pipelining mechanism, the basic functional units needed to design a RISC processor, and previous work.

Chapter 3 Research and design methodology applied to design the processor, and the tools used.

Chapter 4 Discussion of the processor design and the performance analysis.

Chapter 5 Conclusion and suggestions for future work.

1.8 Summary of Chapter 1


REFERENCES

[1] R. de J. Romero-Troncoso; A. Ordaz-Moreno; J. A. Vite-Frias; A. Garcia-Perez (2006). 8-bit CISC Microprocessor Core for Teaching Applications in the Digital Systems Laboratory. Reconfigurable Computing and FPGA's, 2006. ReConFig 2006. IEEE International Conference. Sept. 2006. San Luis Potosi. 1 – 5.

[2] Kwanghyun Cho; Jaebeom Kim; Euibong Jung; Sik Kim; Zhenmin Li; Young-Rae Cho; Byeong Min; Kyu-Myung Choi (2008). Reusable Platform Design Methodology For SoC Integration and Verification. SoC Design Conference, 2008. ISOCC '08. International. 24-25 Nov. 2008. Busan. I-78 – I-81.

[3] Zhihui Xiong; Sikun Li; Jihua Chen; Dawei Wang (2004). A Platform Based SoC Hardware/Software Co-Design Environment. Computer Supported Cooperative Work in Design, 2004. Proceedings. The 8th International Conference. 26-28 May 2004. Vol 2. 443 – 448.

[4] Muhammad Nadzir Bin Marsono (2001). VHDL Design of A 32-Bit RISC Processor Core For FPGA Implementation. Master’s Degree. Universiti Teknologi Malaysia, Skudai.

[5] Hasan Krad; Aws Yousif Al-Taie (2007). A New Trend for CISC and RISC Architecture. Department of Computer Science & Engineering College of

[6] Hannibal (1999). RISC vs. CISC: The Post-RISC Era “A historical approach to the debate”. http://arstechnica.com/cpu/4q99/risc-cisc/rvc-2.html.

[7] Mark Smotherman; Manoj Franklin (1995). Improving CISC Instruction Decoding Performance Using a Fill Unit. Microarchitecture, 1995. Proceedings of the 28th Annual International Symposium. 29 Nov-1 Dec 1995. Ann Arbor, MI, USA. 219 – 229.

[8] Shiu, R.-M.; Chiu, J.-C.; Cheng, S.-K.; Shann, J.J.-J. (2000). Decoding of CISC instruction in superscalar processors with high issue rate. Computers and Digital Techniques, IEE Proceedings. Mar 2000. 147, Issue 2. 101 – 107.

[9] Breternitz, M.; Hum, H.; Peri, R.; Pickett, J.; Youfeng Wu (2005). Enhanced Code Density of Embedded CISC Processor With Echo Technology. Hardware/Software Codesign and System Synthesis, 2005. CODES+ISSS '05. Third IEEE/ACM/IFIP International Conference. Sept. 2005. 160 – 165.

[10] Jun Zhang; Xiao-Ya Fan; Song-He Liu (2008). A Precise Interrupt Mechanism Based on Micro-Operation Tracing of Instruction Boundary for Embedded Processor. Networking, Architecture, and Storage, 2008. NAS '08. International Conference. 12-14 June 2008. 181 – 182.

[11] Alan Clements (2006). Principles of Computer Hardware, Fourth Edition. Oxford University Press.

[12] Dr Mohamed Khalil Hani (2010). Starter’s Guide to Digital Systems VHDL & Verilog Design, Revised Edition 2, July 2010. Desktop Publisher.

[13] Jurij Silc. Advanced Processor Architecture. http://csd.ijs.si/silc/.

[15] Jonie Lin Joo Nee (2008). The RTL Design of 32-bit 5-stage Pipeline RISC Processor Using Verilog HDL. Master’s Degree. Universiti Teknologi Malaysia, Skudai.

[16] Koay Boon Wooi (2009). Development of Single Board Computer Based on 32-bit 5-stage Pipeline RISC Processor. Master’s Degree. Universiti Teknologi Malaysia, Skudai.

[17] ARM Limited (2001). ARM Open Access Datasheet. http://infocenter.arm.com.

[18] Stephen Brown & Zvonko Vranesic (2003). Fundamentals of Digital Logic with Verilog Design, Second Edition. McGraw Hill.

[19] M. Rafiquzzaman (2005). Fundamentals of Digital Logic and Microprocessor Design, Fifth Edition. John Wiley & Sons, Inc.
