• No results found

Computer Architecture ELEC3441

N/A
N/A
Protected

Academic year: 2021

Share "Computer Architecture ELEC3441"

Copied!
7
0
0

Loading.... (view fulltext now)

Full text

(1)

Computer Architecture

ELEC3441

Lecture 4 – Single Cycle Processor

Dr. Hayden Kwok-Hay So

Department of Electrical and

Electronic Engineering

Overview

n

First implementation the RISC-V ISA in this

course

More variations to come…

n

Single cycle processor:

Each instruction takes 1 cycle to complete

n

Idealized memory

Instantaneous read

Single cycle write

n

Implements base RV32

HKU EEE ENGG3441 - HS 2

Full RISCV1Stage Datapath (HW1)

3 +4 Instruction Mem Reg File IType Sign Extend Decoder Data Mem ir[24:20] br/jmp pc+4 p c_ se l ir[31:20] rs1 ALU Control Signals w b _ se l Reg File rf _ w e n va l me m_ rw PC tohost testrig_tohost cpr_en me m_ va l addr wdata rdata Inst BrJmp TargGen ir[19:15] ir[31], ir[7], ir[30:25], ir[11:8]] PC+4 jalr 0 rs2 Branch CondGen br_eq? br_lt? co n tro l st a tu s re g ist e rs Execute Stage br_ltu? PC addr BType Sign Extend JumpReg TargGen Op2Sel Op1Sel AluFun data wa wd en addr data ir[ 1 1 :7 ] 13

HKU EEE ENGG3441 - HS HKU EEE ENGG3441 - HS 4

RV32I

RISC-V v2.1

specification

(2)

Hardware Elements

n

Combinational circuits

Mux, Decoder, ALU, ...

n

Synchronous state elements

Flipflop, Register, Register file, SRAM, DRAM

Edge-triggered: Data is sampled at the rising edge

Clk

D

Q

En

ff

Q

D

Clk

En

OpSelect - Add, Sub, ... - And, Or, Xor, Not, ... - GT, LT, EQ, Zero, ... Result Comp? A B

ALU

Sel O A0 A1 An-1 Mux

..

.

lg(n) A De co d e r

..

.

O0 O1 On-1 lg(n)

HKU EEE ENGG3441 - HS 5

Register

n

An n-bit register can be constructed by combining

n

DFFs in parallel

n

Each DFF responsible for read/write of 1 bit of

the input

n

Shared control signal:

Clock, reset, enable, …

HKU EEE ENGG3441 - HS 6

ff

Q

0

D

0 Clk En

ff

Q

1

D

1

ff

Q

2

D

2

ff

Q

n-1

D

n-1

...

...

...

register

D Q en

D<n-1:0>

Q<n-1:0>

Register File

n

Reads are combinational

rd=regfile(ra) in the same cycle

n

Writes take place at rising the clock edge

Write only take place if WE=1 at clock edge

n

Read address (ra) and write address (wa) choose which

register to read/write

n

RISCV needs a register file with

2 read ports

and

1 write port

n

What happen in these cases?

ra1 = ra2

ra1 = wa1

7

ReadData1

ReadAddr1

ReadAddr2

WriteAddr

Register

file

2R+1W

ReadData2

WriteData

WE

Clock

rd1 ra1 ra2 wa wd rd2 we

HKU EEE ENGG3441 - HS

Register File Implementation

n

2R + 1W, 32 registers, each 32 bits wide

n

Decoder select 1 out of 32 register depending on wa, ra1, ra2

n

On writes:

Only 1 of the 32 register has WE=1

n

On reads:

Only 1 of the 32 register may output to rd{1,2} bus

The same register may output to both rd1 and rd2 bus (e.g. rs1 = rs2)

8

reg 31

wa

clk

reg 1

wd

we

ra1

rd1

rd2

reg 0

32

5 32 32

ra2

5 5

(3)

A Simple Memory Model

9

Reads and writes are always completed in one cycle

• a Read can be done any time (i.e. combinational)

• a Write is performed at the rising clock edge

if it is enabled

⟹ the write address and data

must be stable at the clock edge

Later in the course we will present a more realistic model of memory

MAGIC

RAM

ReadData

WriteData

Address

WriteEnable

Clock

32 32 32

HKU EEE ENGG3441 - HS

Instruction Execution

10

Execution of a RISC-V instruction involves:

1. instruction fetch

2. decode and register fetch

3. ALU operation

4. memory operation (optional)

5. write back to register file (optional)

+ the computation of the

next instruction

address

HKU EEE ENGG3441 - HS

Datapath: Reg-Reg ALU Instructions

RegWrite Timing?

0x4 Add clk addr inst Inst. Memory PC Inst<19:15> Inst<24:20> Inst<11:7> <31:25,14:12> OpCode ALU ALU Control RegWriteEn clk rd1 GPRs ra1 ra2 wa wd rd2 we 31 25 24 20 19 15 14 12 11 7 6 0

funct7 rs2 rs1 funct3 rd opcode

7 5 5 3 5 7 0000000 src2 src1 ADD/SLT/SLTU dest OP 0000000 src2 src1 AND/OR/XOR dest OP 0000000 src2 src1 SLL/SRL dest OP 0100000 src2 src1 SUB/SRA dest OP

rd ß (rs1) func (rs2)

11 HKU EEE

Datapath: Reg-Imm ALU Instructions

Sign Ext inst<31:20> OpCode 0x4 Add clk addr inst Inst. Memory PC ALU RegWriteEn clk rd1 GPRs ra1 ra2 wa wd rd2 we inst<19:15> inst<11:7> inst<14:12> ALU Control

rd ß (rs1) op imm

HKU EEE ENGG3441 - HS 12

31 20 19 15 14 12 11 7 6 0

imm[11:0] rs1 funct3 rd opcode

12 5 3 5 7

I-immediate[11:0] src ADDI/SLTI[U] dest OP-IMM I-immediate[11:0] src ANDI/ORI/XORI dest OP-IMM

(4)

Conflicts in Merging Datapath

13 Sign Ext OpCode 0x4 Add clk addr inst Inst. Memory PC ALU RegWriteEn clk rd1 GPRs ra1 ra2 wa wd rd2 we <19:15> <11:7> <31:20> <31:25,14:12> ALU Control <14:12>

Introduce

muxes

<24:20> 31 27 26 25 24 20 19 15 14 12 11 7 6 0

funct7 rs2 rs1 funct3 rd opcode R-type imm[11:0] rs1 funct3 rd opcode I-type imm[11:5] rs2 rs1 funct3 imm[4:0] opcode S-type HKU EEE ENGG3441 - HS

ALU Op2 Select

14 Sign Ext OpCode 0x4 Add clk addr inst Inst. Memory PC ALU RegWriteEn clk rd1 GPRs ra1 ra2 wa wd rd2 we <19:15> <11:7> <31:20> ALU Control <24:20> 31 27 26 25 24 20 19 15 14 12 11 7 6 0

funct7 rs2 rs1 funct3 rd opcode R-type imm[11:0] rs1 funct3 rd opcode I-type imm[11:5] rs2 rs1 funct3 imm[4:0] opcode S-type

op2Sel Reg / Imm

If OPCODE==“OP” then op2sel = ‘1’ else ‘0’

HKU EEE ENGG3441 - HS

Quick Quiz

n

How do you implement op2sel in hardware?

HKU EEE ENGG3441 - HS 15

If OPCODE==“OP” then op2sel = ‘1’ else ‘0’

opcode

op2sel

OP

=?

opcode

op2sel

OP

=?

1

0

1

0

How do you implement this?

=?

2

1

Determining ALU functions

n

All basic integer R-R instructions have opcode = OP

(“0110011”)

only funct3 and funct7 are needed to determine needed ALU

function:

E.g. 000èAdd, 001èShiftLeft, 100èXOR,0100000èSub, …

n

Immediate instructions requires the same ALU function,

but has slightly different encoding

ADDI is same as ADD, except no need to check for Sub in

funct7

opcode = OP-IMM (“0010011”)

n

Need opcode to help determine ALU function

More cases like these come up later…

HKU EEE ENGG3441 - HS 16

imm[11:5] rs2 rs1 010 imm[4:0] 0100011 SW rs1,rs2,imm

imm[11:0] rs1 000 rd 0010011 ADDI rd,rs1,imm

imm[11:0] rs1 010 rd 0010011 SLTI rd,rs1,imm

0000000 rs2 rs1 000 rd 0110011 ADD rd,rs1,rs2

0100000 rs2 rs1 000 rd 0110011 SUB rd,rs1,rs2

imm[11:0] rs1 110 rd 0010011 ORI rd,rs1,imm

imm[11:0] rs1 111 rd 0010011 ANDI rd,rs1,imm

000000 shamt rs1 001 rd 0010011 SLLI rd,rs1,shamt

000000 shamt rs1 101 rd 0010011 SRLI rd,rs1,shamt

010000 shamt rs1 101 rd 0010011 SRAI rd,rs1,shamt

(5)

ALU Instructions Datapath

17 <30,14:12,6:0> Op2Sel Reg / Imm Sign Ext OpCode 0x4 Add clk addr inst Inst. Memory PC ALU RegWriteEn clk rd1 GPRs rs1 rs2 wa wd rd2 we <19:15> <24:20> ALU Control <11:7> 31 27 26 25 24 20 19 15 14 12 11 7 6 0

funct7 rs2 rs1 funct3 rd opcode R-type imm[11:0] rs1 funct3 rd opcode I-type imm[11:5] rs2 rs1 funct3 imm[4:0] opcode S-type

<31:20>

HKU EEE ENGG3441 - HS

Load Instructions

n

Use ALU for address calculation

n

Mux to select data for regfile: mem or ALU

18 WBSel ALU / Mem Op2Sel base offset OpCode ALU Control ALU 0x4 Add clk addr inst Inst. Memory PC RegWriteEn clk rd1 GPRs rs1 rs2 wa wd rd2 we Sign Ext clk MemWrite addr wdata rdata Data Memory we imm[11:0] rs1 f3 rd opcode

I

offset[11:0] base width dest LOAD

Load: (dest) ß M[(base) + offset]

<11:7>

HKU EEE ENGG3441 - HS

Store Instructions

n

Also use ALU for address calculation

n

No need to write back to regfile

n

Need to tell memory it is a write è Set MemWrite to ‘1’

19 WBSel ALU / Mem Op2Sel base offset OpCode ALU Control ALU 0x4 Add clk addr inst Inst. Memory PC RegWriteEn clk rd1 GPRs rs1 rs2 wa wd rd2 we Sign Ext clk MemWrite addr wdata rdata Data Memory we

S

imm[11:5] rs2 rs1 f3 imm[4:0] opcode

Store: M[(base) + offset] ß (src)

offset[11:5] src base width offset[4:0] STORE

HKU EEE ENGG3441 - HS

RISC-V Conditional Branches

n

Requires:

1. Logic to compare register values (rs1 and rs2)

2. Datapath to calculate branch target address relative to

PC

n

Current implementation: dedicated logic for both 1

and 2

Dedicated comparison logic (=, <, [≠, ≥])

Dedicated adder for jump target calculation

n

May use ALU for (2) above

Performance tradeoff…

20

SB

imm[10:5] rs2 rs1 funct3 imm[4:1] opcode

offset[12,10:5] src2 src1 BEQ/BNE BLT[U] BGE[U} offset[11,4:0] BRANCH imm[11] imm[12]

if (rs1 BR_OP rs2) then jump to PC + branch_imm

(6)

Conditional Branches

(BEQ/BNE/BLT/BGE/BLTU/BGEU)

21 0x4 Add PCSel clk WBSel MemWrite addr wdata rdata Data Memory we Op2Sel OpCode Bcomp? clk clk addr inst Inst. Memory PC rd1 GPRs rs1 rs2 wa wd rd2 we Branch Imm ALU ALU Control Add br pc+4 RegWrEn Br Logic

HKU EEE ENGG3441 - HS

RISC-V

Unconditional JAL

UB

offset[20:1] dest JAL

jump to PC + j_imm; rd ß PC+4

imm[10:1] imm[19:12] rd opcode

imm[20] Imm[11] 0x4 Add PCSel clk WBSel MemWrite addr wdata rdata Data Memory we Op2Sel OpCode Bcomp? clk clk addr inst Inst. Memory PC rd1 GPRs rs1 rs2 wa wd rd2 we Branch Imm ALU ALU Control Add brjmp pc+4 RegWrEn Br Logic Jump Imm

HKU EEE ENGG3441 - HS 22

JALR

23 0x4 Add PCSel clk WBSel MemWrite addr wdata rdata Data Memory we Op2Sel OpCode Bcomp? clk clk addr inst Inst. Memory PC rd1 GPRs rs1 rs2 wa wd rd2 we ALU ALU Control Add brjmp pc+4 RegWrEn Br Logic imm[11:0] rs1 f3 rd opcode

I

offset[11:0] base 000 dest JALR

jump to imm + (rs1); rd ß PC+4

offset Sign

Ext

jmpreg

HKU EEE ENGG3441 - HS

Hardwired Control is pure

Combinational Logic

n

Decoding instruction determines the setting of

various muxes and ALU function

n

Simple decoding helps to make faster hardware

24

combinational

logic

op code

Bcomp?

Op2Sel

AluFunc

MemWrite

WBSel

RegWriteEn

PCSel

(7)

Hardwired Control Table (Excerpt)

Instruction Op2Sel AluFunc WBSel RegWriteEn MemWrite PCSel

ADD

SUB

ADDI

SLL

LW

SW

BEQ

JAL

JALR

HKU EEE ENGG3441 - HS 25

RS2

RS2

IMI

RS2

IMI

IMS

IMB

IMJ

IMI

ADD

SUB

ADD

SLL

ADD

ADD

X

X

X

ALU

ALU

ALU

ALU

MEM

X

X

PC+4

PC+4

T

T

T

T

T

F

F

T

T

F

F

F

F

F

T

F

F

F

PC+4

PC+4

PC+4

PC+4

PC+4

PC+4

PC+4/BA

JA

JRA

• Op2Sel: rs2, {I,B,J}-type immediate IM{I, B, J}

• AluFunc: Add, Sub, Shift, XOR, etc

• WBSel: what values to write to rd

Single-Cycle Hardwired Control

We will assume clock period is sufficiently long for

all of the following steps to be “completed”:

1.

Instruction fetch

2.

Decode and register fetch

3.

ALU operation

4.

Data fetch if required

5.

Register write-back setup time

t

C

> t

IFetch

+ t

RFetch

+ t

ALU

+ t

DMem

+ t

RWB

At the rising edge of the following clock, the PC,

register file and memory are updated

26 HKU EEE ENGG3441 - HS

Full RISCV1Stage Datapath (HW1)

27 +4 Instruction Mem Reg File IType Sign Extend Decoder Data Mem ir[24:20] br/jmp pc+4 p c_ se l ir[31:20] rs1 ALU Control Signals w b _ se l Reg File rf _ w e n va l me m_ rw PC tohost testrig_tohost cpr_en me m_ va l addr wdata rdata Inst BrJmp TargGen ir[19:15] ir[31], ir[7], ir[30:25], ir[11:8]] PC+4 jalr 0 rs2 Branch CondGen br_eq? br_lt? co n tro l st a tu s re g ist e rs Execute Stage br_ltu? PC addr BType Sign Extend JumpReg TargGen Op2Sel Op1Sel AluFun data wa wd en addr data ir[ 1 1 :7 ] 13

HKU EEE ENGG3441 - HS 28

Acknowledgements

n

These slides contain material developed and

copyright by:

Arvind (MIT)

Krste Asanovic (MIT/UCB)

Joel Emer (Intel/MIT)

James Hoe (CMU)

John Kubiatowicz (UCB)

David Patterson (UCB)

n

MIT material derived from course 6.823

n

UCB material derived from course CS152,

CS252

References

Related documents

Other government-sponsored programs for specific groups—such as Medicaid and the State Children’s Health Insurance Program (SCHIP) for low-income individuals and families—and plans

In a developing society, many strength and threats may affect the business sector. 0espite business sector of Bangladesh is marching forward and growing rapidly. In future, this

mod-4 counter MOD4 2 inc roll0 digit0 clr MOD4 2 roll1 digit1 clr clk clk inc inc roll roll dout dout clr clr MOD4 2 roll1 digit2 clr clk inc roll dout clr..

Setup and hold times at the pins of a chip (6) CLK FF CLK FF D DATA Tclk Tsu Th Tdata Tdata Tseup Thold D Q combinatorial logic (and delay). Tsu/Th Tdata Tclk clk data pins

CAUTION: This email originated from outside your organization. Exercise caution when opening attachments or clicking links, especially from unknown senders. Hi: Just wanted to

If within 40 seconds of unlocking with the remote control, neither door nor trunk is opened, the electronic key is not inserted in the steering lock, or the central locking switch

In the current study, the SHARP Program researchers examined State Fund workers’ compensation claims for general and selected specific hand/wrist, elbow, shoulder, neck and

The results are in line with Jarocinski &amp; Karadi (2020) and Steinsson (2019): the Fed information shock has a more prolonged effect on the one year rate, positive effect