C
OMPUTERO
RGANIZATION ANDD
ESIGNThe Hardware/Software Interface
5th
Edition
Chapter 4
The Processor
Agenda
◼
Performance Issues
◼MIPS Pipeline:
Performance Issues
◼
Longest delay determines clock period
◼ Critical path: load instruction
◼ Instruction memory → register file → ALU →
data memory → register file
Performance Issues
◼
Not feasible to vary period for different
instructions
Pipelining Analogy
◼
Pipelined laundry: overlapping execution
◼ Parallelism improves performance
MIPS Pipeline
◼
Five stages, one step per stage
1. IF: Instruction fetch from memory
2. ID: Instruction decode & register read
3. EX: Execute operation or calculate address
4. MEM: Access memory operand
5. WB: Write result back to register
Pipeline Performance
◼ Assume time for stages is
◼ 100ps for register read or write ◼ 200ps for other stages
◼ Compare pipelined datapath with single-cycle
datapath
Pipeline Performance
Single-cycle (Tc= 800ps)
Pipeline Speedup
◼
Speedup = CPUtime
NCPUtime
P◼ CPUtimeN= IC × TN
Pipeline Speedup
◼
Speedup = CPUtime
NCPUtime
P◼ CPUtimeN= IC × TN 3×800=2400
Pipeline Speedup
◼
Speedup = CPUtime
NCPUtime
P◼ CPUtimeP= No. Stages × TP + (IC-1) × TP
Pipelined (T = 200ps)
◼ Speedup = IC × TN
No. Stages × TP + (IC-1) × TP = IC × TN
TP (No. Stages + IC-1)
Pipeline Speedup
◼
Speedup = CPUtime
NCPUtime
P◼ CPUtimeN= IC × TN
◼ Speedup = IC × TN
No. Stages × TP + (IC-1) × TP = IC × TN
TP (No. Stages -1+ IC)
Pipeline Speedup
◼
Speedup = CPUtime
NCPUtime
P◼ CPUtimeN= IC × TN
◼ CPUtimeP= No. Stages × TP + (IC-1) × TP
◼ Speedup = IC × TN
No. Stages × TP + (IC-1) × TP
= IC × TN = TN = No. Stages × TP Tp× IC TP TP
Pipeline Speedup
◼Speedup = CPUtime
NCPUtime
P ◼ CPUtimeN= IC × TN◼ CPUtimeP= No. Stages × TP + (IC-1) × TP
= No. Stages
3×800=2400 (5×200)+ (2×200)=1400
1.7
◼ Speedup = IC × TN
No. Stages × TP + (IC-1) × TP
= IC × TN = TN = No. Stages × TP Tp× IC TP TP
Pipeline Speedup
◼Speedup = CPUtime
NCPUtime
P ◼ CPUtimeN= IC × TN◼ CPUtimeP= No. Stages × TP + (IC-1) × TP
= No. Stages
1000×800=800000 (5×200)+ (999×200)=200800
3.9
Pipelined Datapath
◼
Need registers between stages
Pipeline Operation
◼
Cycle-by-cycle flow of instructions through the
pipelined datapath
◼ “Single-clock-cycle” pipeline diagram ◼ Shows pipeline usage in a single cycle ◼ Highlight resources used
◼ c.f. “multi-clock-cycle” diagram ◼ Graph of operation over time
WB for Load
Multi-Cycle Pipeline Diagram
Multi-Cycle Pipeline Diagram
Single-Cycle Pipeline Diagram
Pipelined Control
◼
Control signals derived from instruction
Concluding Remarks
◼ ISA influences design of datapath and control ◼ Datapath and control influence design of ISA ◼ Pipelining improves instruction throughput
using parallelism
◼ More instructions completed per second ◼ Latency for each instruction not reduced