
Academic year: 2021


Memory Technology

Computer Science 104

Lecture 16

© Alvin R. Lebeck

Administrivia

•  Midterm II Next Monday


Today’s Lecture

Outline

•  Review
•  Big Picture of Memory
•  Memory Technology
   SRAM
   DRAM

Reading

•  Memory: C.9

The Five Steps of a Load Instruction

[Timing diagram: Instruction Fetch, Instr Decode / Reg Fetch, Address, Data Memory, Reg Wr. Signals shown changing from Old Value to New Value: PC; Rs, Rt, Rd, Op, Func; ALUctr; RegWr; ExtOp; ALUSrc; busA; busB; Address. Delays marked: Clk-to-Q, Instruction Memory Access Time, Delay through Control Logic, Register File Access Time, Delay through Extender & Mux, ALU Delay, Data Memory Access Time.]


A Pipelined Datapath

[Figure: pipelined datapath with stages Ifetch, Reg/Dec, Exec, Mem, WrB, separated by the IF/ID, ID/Ex, Ex/Mem, and Mem/Wr registers. Labeled blocks: PC, IUnit, RFile (Ra, Rb, Rw, Di), Exec Unit (busA, busB, Imm16), Data Mem (RA, WA, Di, Do), and muxes. Control signals: RegWr, ExtOp, ALUOp, ALUSrc, RegDst, MemWr, MemtoReg, Branch, Zero, Clk.]


A More Extensive Pipelining Example

Instruction               Cycle 1  Cycle 2  Cycle 3  Cycle 4  Cycle 5  Cycle 6  Cycle 7  Cycle 8
0: Load                   Ifetch   Reg/Dec  Exec     Mem      WrB
4: R-type                          Ifetch   Reg/Dec  Exec     Mem      WrB
8: Store                                    Ifetch   Reg/Dec  Exec     Mem      WrB
12: Beq (target is 1000)                             Ifetch   Reg/Dec  Exec     Mem      WrB

•  End of Cycle 4: Load’s Mem, R-type’s Exec, Store’s Reg, Beq’s Ifetch
•  End of Cycle 5: Load’s WrB, R-type’s Mem, Store’s Exec, Beq’s Reg
•  End of Cycle 6: R-type’s WrB, Store’s Mem, Beq’s Exec
•  End of Cycle 7: Store’s WrB, Beq’s Mem
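The end-of-cycle bullets above follow one rule: an instruction that enters Ifetch in cycle i is in stage number (c - i) during cycle c. Below is a minimal Python sketch of that rule, assuming an ideal pipeline with one instruction issued per cycle and no stalls. The names `stage_at` and `snapshot` are illustrative and do not come from the lecture.

```python
STAGES = ["Ifetch", "Reg/Dec", "Exec", "Mem", "WrB"]

# Issue cycle of each instruction in the example (one per cycle, assumed).
PROGRAM = {"Load": 1, "R-type": 2, "Store": 3, "Beq": 4}


def stage_at(issue_cycle, cycle):
    """Return the stage occupied in `cycle`, or None if not in the pipeline.

    Assumes the instruction enters Ifetch in `issue_cycle` and advances
    exactly one stage per clock (no stalls, no flushes).
    """
    index = cycle - issue_cycle
    if 0 <= index < len(STAGES):
        return STAGES[index]
    return None


def snapshot(cycle):
    """Map every instruction currently in the pipeline to its stage."""
    result = {}
    for name, issue_cycle in PROGRAM.items():
        stage = stage_at(issue_cycle, cycle)
        if stage is not None:
            result[name] = stage
    return result


for c in (4, 5, 6):
    print(f"End of Cycle {c}: {snapshot(c)}")
```

Under these assumptions, `snapshot(4)` places Load in Mem, R-type in Exec, Store in Reg/Dec, and Beq in Ifetch, matching the first bullet.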


Initial Representation: Finite State Diagram

[Figure: finite state diagram of the controller. States: Ifetch, Rfetch/Decode, BrComplete, RExec, Rfinish, OriExec, OriFinish, AdrCal, SWMem, LWmem, LWwr, plus two Wait loops. Transitions out of decode are labeled lw or sw, lw, sw, Rtype, Ori, beq. Each state lists the control signals set to 1 (for example PCWr and IRWr in Ifetch), the ALUOp and ALUSelB values, and the don't-care (x) signals. State numbers appearing in the figure: 0, 1, 2, 3, 4, 5, 6, 7, 8, 10, 11.]

•  The Five Classic Components of a Computer

•  Today’s Topic: Memory Technology


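Where Are We?

[Figure: layered view of a computer system with the labels Application, Compiler, Operating System, Firmware, Instruction Set Architecture, Memory, I/O, CPU, Memory, I/O system, Digital Design, Circuit Design. The layers are grouped into Software, Hardware, and the Interface Between HW and SW. A marker reads “You are here.”]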

Review: Program’s View of Memory

[Figure: memory drawn as a column of bytes, with byte addresses running from 0 to 2^n - 1 and example data bytes 00110110 and 00001100.]

•  Memory is a large linear array of bytes.
   Each byte has a unique address (location).
   Byte of data at address 0x100, and 0x101
•  Most computers have instructions with byte (8-bit) addressing.
•  Data may have to be aligned on word (4 byte) or double word (8 byte) boundary.
   int is 4 bytes
   double precision floating point is 8 bytes
•  32-bit vs. 64-bit addresses
   we will assume 32-bit for rest of course, unless otherwise stated
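The alignment rule above can be stated as a one-line check: an object of size s bytes is aligned when its address is a multiple of s. The following Python sketch is illustrative only. The helper names are made up here, and placing the two example bytes at 0x100 and 0x101 is an assumption about the figure.

```python
def is_aligned(address, size):
    """True if `address` is a multiple of `size` (size in bytes)."""
    return address % size == 0


def align_up(address, size):
    """Round `address` up to the next multiple of `size`.

    `size` must be a power of two (4 for a word, 8 for a double word).
    """
    return (address + size - 1) & ~(size - 1)


# Memory modeled as a linear array of bytes; every index is a byte address.
memory = bytearray(0x200)
memory[0x100] = 0b00110110  # assumed placement of the example bytes
memory[0x101] = 0b00001100

print(is_aligned(0x100, 4))     # 0x100 is on a word boundary
print(is_aligned(0x102, 4))     # 0x102 is not
print(hex(align_up(0x101, 8)))  # next double-word boundary after 0x101
```

A 4-byte int stored at 0x102 would straddle two words, which is the case the alignment restriction rules out.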

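Our Naïve View of Memory (Single Cycle)

[Figure: single-cycle datapath. The PC supplies the Instruction Address to an Ideal Instruction Memory, which returns the Instruction (fields Rs, Rt, Rd of 5 bits each and a 16-bit Imm). A file of 32 32-bit Registers (ports Rw, Ra, Rb) feeds the ALU inputs A and B. An Ideal Data Memory has Data Address, Data In, and DataOut ports. PC, register file, and data memory are clocked by Clk.]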

Question

•  What issues do we need to worry about in

System Organization

[Figure: system organization. Labels: CPU, Cache, Memory Bus, Core Chip Set, Memory, I/O Bridge, I/O Bus, Disk Controller, Disk, Graphics Controller, Graphics, Network Interface, Network, interrupts. A callout marks the memory hierarchy.]

Processor and Caches

[Figure: the Processor contains Control and a Datapath with Registers, next to a Level One Cache and a Level Two Cache.]

Main Memory

[Figure: a Memory Controller, connected to the processor, drives a Memory Bus with DIMM Slot 0 through DIMM Slot 7. Each DRAM DIMM carries several DRAM chips.]

Why is it called DRAM?

•  Random Access:
   “Random” is good: access time is the same for all locations
   DRAM: Dynamic Random Access Memory
    »  High density, low power, cheap, slow
    »  Dynamic: needs to be “refreshed” regularly
    »  Main memory
   SRAM: Static Random Access Memory
    »  Low density, high power, expensive, fast
    »  Static: content will last “forever” (until power loss)
    »  Caches
•  “Not-so-random” Access Technology:
   Access time varies from location to location and from time to time
   Examples: Disk, DVD/CD
•  Sequential Access Technology: access time linear in location (e.g., Tape)


Random Access Memory (RAM) Technology

•  Why do computer professionals need to know about RAM technology?
   Processor performance is usually limited by memory latency and bandwidth.
   Latency: The time it takes to access a single word in memory.
   Bandwidth: The average speed of access to memory (Words/Sec).
   As integrated circuit (IC) densities increase, lots of memory will fit on processor chip
    »  Tailor on-chip memory to specific needs.
      -  Instruction cache
      -  Data cache
      -  Write buffer
•  What makes RAM different from a bunch of flip-flops?
   Density: RAM is much more dense
   Speed: RAM access is slower than flip-flop (register) access.

DRAM

Year   Size     Cycle Time
1980   64 Kb    250 ns
1983   256 Kb   220 ns
1986   1 Mb     190 ns
1989   4 Mb     165 ns
1992   16 Mb    145 ns
1995   64 Mb    120 ns
1999   128 Mb   100 ns
2003   256 Mb   100 ns
2007   2 Gb     55 ns

         Capacity        Speed
Logic:   2x in 3 years   2x in 3 years
DRAM:    4x in 3 years   1.4x in 10 years
Disk:    2x in 3 years   1.4x in 10 years

Callouts on the DRAM table: 1000:1! (Size), 2:1! (Cycle Time)
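The gap between capacity growth and speed growth can be checked directly from the DRAM table above. This Python sketch computes both ratios between two rows. It assumes 1 Kb = 2^10 bits, and reading the 1000:1 and 2:1 callouts as roughly the 1980 to 1995 span is an inference, not something the slide states.

```python
KB, MB, GB = 2**10, 2**20, 2**30  # assumed binary prefixes, in bits

# (year, capacity in bits, cycle time in ns), copied from the table.
DRAM_GENERATIONS = [
    (1980, 64 * KB, 250),
    (1983, 256 * KB, 220),
    (1986, 1 * MB, 190),
    (1989, 4 * MB, 165),
    (1992, 16 * MB, 145),
    (1995, 64 * MB, 120),
    (1999, 128 * MB, 100),
    (2003, 256 * MB, 100),
    (2007, 2 * GB, 55),
]


def improvement(first_year, last_year):
    """Return (capacity ratio, cycle-time ratio) between two table rows."""
    rows = {year: (bits, ns) for year, bits, ns in DRAM_GENERATIONS}
    bits_first, ns_first = rows[first_year]
    bits_last, ns_last = rows[last_year]
    return bits_last / bits_first, ns_first / ns_last


capacity, speed = improvement(1980, 1995)
print(f"1980 to 1995: capacity x{capacity:.0f}, cycle time x{speed:.2f}")
```

Capacity improves by about three orders of magnitude over that span while cycle time only about halves, which is why memory latency ends up limiting processor performance.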


Static RAM Cell

6-Transistor SRAM Cell

[Figure: cell holding complementary values (1 0 or 0 1), connected to the bit and bit-bar lines and selected by the word (row select) line.]

•  Write:
   1. Drive bit lines (bit = 1, bit-bar = 0)
   2. Select row
•  Read:
   1. Precharge bit and bit-bar to Vdd (set to 1)
   2. Select row
   3. Cell pulls one line low (pulls to 0)
   4. Sense amp on column detects difference between bit and bit-bar
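The write and read steps above can be traced with a purely logical model. The Python sketch below ignores all analog behavior (voltages, timing, transistor sizing) and only follows which of the two bit lines ends up low. The class name `SramCell` is hypothetical.

```python
class SramCell:
    """Logical model of one 6-transistor SRAM cell (no analog effects)."""

    def __init__(self):
        self.value = 0

    def write(self, value):
        # 1. Drive the two bit lines with complementary values.
        bit, bit_bar = value, 1 - value
        # 2. Select the row: the driven lines overwrite the stored state.
        self.value = bit

    def read(self):
        # 1. Precharge bit and bit-bar to Vdd (logic 1).
        bit, bit_bar = 1, 1
        # 2. Select the row. 3. The cell pulls exactly one line low.
        if self.value == 1:
            bit_bar = 0
        else:
            bit = 0
        # 4. The sense amp reports which line stayed high.
        return 1 if bit > bit_bar else 0


cell = SramCell()
cell.write(1)
print(cell.read())
cell.write(0)
print(cell.read())
```

Unlike a DRAM read, the read here leaves `value` untouched, which reflects that an SRAM read does not destroy the stored state.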

[Figure: SRAM array of 16 words (Word 0 through Word 15) with four SRAM Cells per word. An Address Decoder on inputs A0 to A3 selects one word line. Each of the four columns has a Wr Driver & Precharger (inputs Din 0 to Din 3, controlled by WrEn and Precharge) and a Sense Amp producing Dout 0 to Dout 3.]


Logic Diagram of a Typical SRAM

[Figure: 2^N words x M bit SRAM with an N-bit address input A, an M-bit data pin D, and control inputs WE_L and OE_L.]

•  Write Enable is usually active low (WE_L)
•  Din and Dout are combined to save pins:
   A new control signal, output enable (OE_L) is needed
   WE_L is asserted (Low), OE_L is deasserted (High)
    »  D serves as the data input pin
   WE_L is deasserted (High), OE_L is asserted (Low)
    »  D is the data output pin
   Both WE_L and OE_L are asserted:
    »  Result is unknown. Don’t do that!!!
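The role of the shared D pin for each combination of the two active-low controls can be written as a small decision function. This Python sketch is only an illustration. The function name is invented, and returning "high-Z" when neither signal is asserted is an assumption based on the High Z label in the SRAM timing diagram, not on the bullets.

```python
def data_pin_mode(we_l, oe_l):
    """Return the role of the shared D pin.

    Both inputs are active low: 0 means asserted, 1 means deasserted.
    """
    writing = (we_l == 0)
    reading = (oe_l == 0)
    if writing and reading:
        return "unknown"   # both asserted: the slide says don't do that
    if writing:
        return "input"     # D serves as the data input pin
    if reading:
        return "output"    # D is the data output pin
    return "high-Z"        # assumed: nothing drives D


for we_l in (0, 1):
    for oe_l in (0, 1):
        print(f"WE_L={we_l} OE_L={oe_l} -> {data_pin_mode(we_l, oe_l)}")
```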

[Timing diagram for the 2^N words x M bit SRAM (pins A, D, WE_L, OE_L). Write Timing: A carries the Write Address and D carries Data In, with the Write Setup Time and Write Hold Time marked around the WE_L pulse. Read Timing: A carries each Read Address (Junk in between), and D moves from High Z through Junk to Data Out one Read Access Time after each Read Address.]


Introduction to DRAM

•  Dynamic RAM (DRAM):
   Refresh required
   Very high density
   Low power (0.1 - 0.5 W active, 0.25 - 10 mW standby)
   Low cost per bit
   Pin sensitive (few pins):
    »  Output Enable (OE_L)
    »  Write Enable (WE_L)
    »  Row address strobe (ras)
    »  Col address strobe (cas)

[Figure: N x N bit cell array with row and column decode and sense amps (SA). The address input is log2 N bits wide; pins D, WE_L, OE_L.]

•  Write:
   1. Drive bit line
   2. Select row
•  Read:
   1. Precharge bit line to Vdd (1)
   2. Select row
   3. Cell and bit line share charges
    »  Very small voltage changes on the bit line
   4. Sense (fancy sense amp)
    »  Can detect changes of ~1 million electrons
   5. Write: restore the value
•  Refresh
   1. Just do a dummy read to every cell.

[Figure: DRAM cell with labels row select and bit.]
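Step 3 of the read, where cell and bit line share charge, explains why the voltage change is so small: the bit line's capacitance is much larger than the cell's. The Python sketch below applies charge conservation to the two capacitors. The supply voltage and both capacitance values are assumed round numbers chosen for illustration, not figures from the lecture.

```python
def shared_voltage(v_cell, v_bitline, c_cell, c_bitline):
    """Voltage after the cell capacitor and the bit line share charge.

    Charge is conserved:
    (c_cell + c_bitline) * v = c_cell * v_cell + c_bitline * v_bitline
    """
    total_charge = c_cell * v_cell + c_bitline * v_bitline
    return total_charge / (c_cell + c_bitline)


VDD = 1.0            # assumed supply voltage, volts
C_CELL = 30e-15      # assumed cell capacitance, farads
C_BITLINE = 300e-15  # assumed bit line capacitance, farads

# The bit line is precharged to Vdd before the row is selected.
v_read0 = shared_voltage(0.0, VDD, C_CELL, C_BITLINE)  # cell stored a 0
v_read1 = shared_voltage(VDD, VDD, C_CELL, C_BITLINE)  # cell stored a 1

print(f"reading 0: bit line settles near {v_read0:.3f} V")
print(f"reading 1: bit line settles near {v_read1:.3f} V")
```

With these assumed values the two cases differ by less than a tenth of a volt, which is what the sense amp has to detect. The cell's own voltage is disturbed by the sharing, which is why step 5 restores the value.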

Classical DRAM Organization (square)

[Figure: a RAM Cell Array crossed by word (row) select lines and bit (data) lines. A row decoder driven by the row address selects one word line. The bit lines feed the Sense-Amps, Column Selector & I/O Circuits, which take the Column Address and connect to data. Each intersection represents a 1-T DRAM Cell.]

•  Row and Column Address together:
   Select 1 bit at a time

•  Typical DRAMs: access multiple bits in parallel
   Example: 2 Mb DRAM = 256K x 8 = 512 rows x 512 cols x 8 bits
   Row and column addresses are applied to all 8 planes in parallel

[Figure: eight planes, Plane 0 through Plane 7, each one “Plane” of 256 Kb DRAM with 512 rows and 512 cols. Plane i drives data bit D<i>, from D<0> to D<7>.]
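The arithmetic in the example, and the idea that one (row, column) pair selects one bit in each plane, can be sketched in a few lines of Python. The data layout (one byte per stored bit) and the helper names are choices made for this illustration only.

```python
ROWS, COLS, PLANES = 512, 512, 8

bits_per_plane = ROWS * COLS          # 256 Kb in each plane
total_bits = bits_per_plane * PLANES  # 2 Mb in the whole chip
words = ROWS * COLS                   # 256K addressable 8-bit words

# One bytearray per row, one entry per stored bit (illustrative layout).
planes = [[bytearray(COLS) for _ in range(ROWS)] for _ in range(PLANES)]


def write_word(row, col, value):
    """Store bit i of `value` in plane i at the same (row, col)."""
    for i in range(PLANES):
        planes[i][row][col] = (value >> i) & 1


def read_word(row, col):
    """Gather D<i> from plane i at the same (row, col) into one byte."""
    value = 0
    for i in range(PLANES):
        value |= planes[i][row][col] << i
    return value


write_word(3, 7, 0xA5)
print(hex(read_word(3, 7)))
print(total_bits == 2 * 2**20, words == 256 * 2**10)
```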

Logic Diagram of a Typical DRAM

[Figure: 256K x 8 DRAM with a 9-bit address input A, an 8-bit data pin D, and control inputs RAS_L, CAS_L, WE_L, OE_L.]

•  Control Signals (RAS_L, CAS_L, WE_L, OE_L) are all active low
•  Din and Dout are combined (D):
   WE_L is asserted (Low), OE_L is deasserted (High)
    »  D serves as the data input pin
   WE_L is deasserted (High), OE_L is asserted (Low)
    »  D is the data output pin
•  Row and column addresses share the same pins (A)
   RAS_L goes low: Pins A are latched in as row address
   CAS_L goes low: Pins A are latched in as column address
   RAS/CAS edge-sensitive
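Because 256K words need 18 address bits but the part has only 9 address pins, the address is delivered in two halves. The Python sketch below models the edge-sensitive latching described above. The class name, the `drive` method, and the choice to put the row in the upper 9 bits are assumptions made for the example.

```python
ADDR_BITS = 9  # the 256K x 8 part has 9 shared address pins


def split(address):
    """Split an 18-bit word address into (row, column) halves.

    Using the upper half as the row is an assumption of this sketch.
    """
    return address >> ADDR_BITS, address & ((1 << ADDR_BITS) - 1)


class MuxedAddressDram:
    """Toy model of row/column address latching on falling edges."""

    def __init__(self):
        self.ras_l = 1  # active low, so start deasserted
        self.cas_l = 1
        self.row = None
        self.col = None

    def drive(self, a, ras_l, cas_l):
        """Apply pin values; latch A when RAS_L or CAS_L goes low."""
        a &= (1 << ADDR_BITS) - 1
        if self.ras_l == 1 and ras_l == 0:
            self.row = a  # falling edge of RAS_L latches the row
        if self.cas_l == 1 and cas_l == 0:
            self.col = a  # falling edge of CAS_L latches the column
        self.ras_l, self.cas_l = ras_l, cas_l

    def address(self):
        return (self.row << ADDR_BITS) | self.col


row, col = split(0x2ABCD)
dram = MuxedAddressDram()
dram.drive(row, ras_l=0, cas_l=1)  # RAS_L falls: row address latched
dram.drive(col, ras_l=0, cas_l=0)  # CAS_L falls: column address latched
print(hex(dram.address()))
```

Holding RAS_L low while A changes does not disturb the latched row, since only the falling edge matters.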

•  Every DRAM access begins at:
   The assertion of the RAS_L
   2 ways to write: early or late relative to CAS_L

[Timing diagram: signals RAS_L, CAS_L, A, WE_L, OE_L, and D over one DRAM WR Cycle Time. A carries the Row Address and then the Col Address (Junk otherwise). D carries Data In (Junk otherwise). WR Access Time is marked for both the early and the late write.]

Early Wr Cycle: WE_L asserted before CAS_L
Late Wr Cycle: WE_L asserted after CAS_L

Asynchronous DRAM Read Timing

•  Every DRAM access begins at:
   The assertion of the RAS_L
   2 ways to read: early or late relative to CAS_L

[Timing diagram: signals RAS_L, CAS_L, A, WE_L, OE_L, and D over one DRAM Read Cycle Time. A carries the Row Address and then the Col Address (Junk otherwise). D goes from High Z through Junk to Data Out and back to High Z. Read Access Time and Output Enable Delay are marked.]

Early Read Cycle: OE_L asserted before CAS_L
Late Read Cycle: OE_L asserted after CAS_L

Increasing Bandwidth - Interleaving

Access Pattern without Interleaving:

[Figure: CPU connected to a single Memory. Start Access for D1, wait until D1 is available, and only then Start Access for D2.]

Access Pattern with 4-way Interleaving:

[Figure: CPU connected to Memory Bank 0, Memory Bank 1, Memory Bank 2, and Memory Bank 3. Access Bank 0, Access Bank 1, Access Bank 2, and Access Bank 3 start one after another; after that we can access Bank 0 again.]
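The two access patterns can be compared with a small timing model. The Python sketch below makes several assumptions that are not in the slides: words are assigned to banks by `address % n_banks`, a bank is busy for `cycle_time` units per access, and the CPU can start at most one access per `issue_time` units.

```python
def total_time(addresses, n_banks, cycle_time, issue_time=1):
    """Time until the last access completes under the assumed model."""
    bank_free = [0] * n_banks  # time at which each bank is next idle
    now = 0                    # earliest time the CPU can issue again
    finish = 0
    for addr in addresses:
        bank = addr % n_banks                 # low-order interleaving
        start = max(now, bank_free[bank])     # wait for a busy bank
        bank_free[bank] = start + cycle_time
        finish = max(finish, start + cycle_time)
        now = start + issue_time
    return finish


sequential = list(range(8))
print(total_time(sequential, n_banks=1, cycle_time=4))  # no interleaving
print(total_time(sequential, n_banks=4, cycle_time=4))  # 4-way interleaving

# A stride equal to the bank count sends every access to Bank 0.
strided = [0, 4, 8, 12, 16, 20, 24, 28]
print(total_time(strided, n_banks=4, cycle_time=4))
```

In this model, interleaving helps only when consecutive accesses fall in different banks. The strided pattern takes as long as having no interleaving at all.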


Fast Memory Systems: DRAM specific

•  Multiple CAS accesses per RAS (within one open row): several names
   page mode, fast page mode, EDO
   64 Mbit DRAM: cycle time = 100 ns, page mode = 20 ns
•  New DRAMs
   Synchronous DRAM (SDRAM): Provide a clock signal to DRAM, transfer synchronous to system clock
   Double Data Rate DRAM (DDR DRAM): DDR, DDR2, DDR3
    »  transfer data on both clock edges
   Also RAMBUS
    »  Each Chip a module vs. slice of memory
    »  Short bus between CPU and chips
    »  Does own refresh
    »  Variable amount of data returned
    »  1 byte / 2 ns (500 MB/s per chip)
   Cached DRAM (CDRAM): Keep entire row in SRAM
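The numbers quoted above can be turned into rates with simple arithmetic. The Python sketch below checks the 500 MB/s figure and estimates the benefit of page mode. The model in `row_read_time_ns` (the first access costs a full cycle, each further column in the same row costs one page-mode access) is an assumption for illustration, and 1 MB is taken as 10^6 bytes.

```python
def mb_per_second(bytes_per_transfer, ns_per_transfer):
    """Bandwidth in MB/s (1 MB = 10**6 bytes) for back-to-back transfers."""
    # Bytes per nanosecond equals GB/s, so multiply by 1000 for MB/s.
    return bytes_per_transfer * 1000 / ns_per_transfer


def row_read_time_ns(n_columns, cycle_ns=100, page_ns=20):
    """Assumed time to read n_columns from one open row in page mode."""
    return cycle_ns + (n_columns - 1) * page_ns


print(mb_per_second(1, 2))    # 1 byte every 2 ns
print(row_read_time_ns(4))    # four columns from one row, page mode
print(4 * 100)                # four separate full cycles, for comparison
```

Under this model, four reads from the same row take 160 ns in page mode instead of 400 ns with four full cycles.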

Summary of Memory Technology

•  DRAM is slow but cheap and dense:
   Good choice for presenting the user with a BIG memory system
   Uses one transistor, must be refreshed.
•  SRAM is fast but expensive and not very dense:
   Good choice for providing the user FAST access time.
   Uses six transistors, holds state as long as power is supplied.
•  GOAL:
   Present the user with large amounts of memory using the cheapest technology.
   Provide access at the speed offered by the fastest technology.
