
Lecture 5

Cache Operation

ECE 463/521

Fall 2002

Edward F. Gehringer

Based on notes by Drs. Eric Rotenberg & Tom Conte of NCSU

Outline

• Review of cache parameters

• Example of operation of a direct-mapped cache.

• Example of operation of a set-associative cache.

• Simulating a generic cache.

• Write-through vs. write-back caches.

• Write-allocate vs. no write-allocate.

• Victim caches.

Cache Parameters

SIZE = total amount of cache data storage, in bytes

BLOCKSIZE = total number of bytes in a single block

ASSOC = associativity, i.e., # of blocks in a set

Cache Parameters (cont.)

• Equation for # of cache blocks in cache:

# cache blocks = SIZE / BLOCKSIZE

• Equation for # of sets in cache:

# sets = # cache blocks / ASSOC = SIZE / (BLOCKSIZE × ASSOC)

Address Fields

Address layout, bit 31 down to bit 0: tag | index | block offset

• Tag: compared to the tag(s) of the indexed cache line(s).

– If it matches, the block is there (hit).

– If it doesn’t match, the block is not there (miss).

• Index: used to look up a “set,” whose lines contain one or more memory blocks. (The # of blocks per set is the “associativity.”)

• Block offset: once a block is found, the offset selects a particular byte or word of data in the block.

Address Fields (cont.)

• Widths of address fields (# bits), assuming 32-bit addresses (bit 31 down to bit 0: tag | index | block offset)

# index bits = log2(# sets)

# block offset bits = log2(block size)

# tag bits = 32 – # index bits – # block offset bits
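The field-width equations can be checked with a few lines of Python. This is only a sketch: the function names `field_widths` and `split_address` are made up for illustration, and it assumes SIZE, BLOCKSIZE, and ASSOC are powers of two so that the logarithms are exact.

```python
def field_widths(size, blocksize, assoc, addr_bits=32):
    """Return (# sets, # index bits, # block offset bits, # tag bits)."""
    num_sets = size // (blocksize * assoc)
    # For a power of two, bit_length() - 1 equals log2.
    index_bits = num_sets.bit_length() - 1
    offset_bits = blocksize.bit_length() - 1
    tag_bits = addr_bits - index_bits - offset_bits
    return num_sets, index_bits, offset_bits, tag_bits


def split_address(addr, index_bits, offset_bits):
    """Split an address into its (tag, index, block offset) fields."""
    offset = addr & ((1 << offset_bits) - 1)
    index = (addr >> offset_bits) & ((1 << index_bits) - 1)
    tag = addr >> (offset_bits + index_bits)
    return tag, index, offset


# A 256-byte direct-mapped cache with 32-byte blocks:
print(field_widths(256, 32, 1))                  # (8, 3, 5, 24)
print([hex(f) for f in split_address(0xFF0040E8, 3, 5)])
```

For a fully-associative cache (one set) the index width comes out as 0, and the mask in `split_address` then selects no bits.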

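[Figure: cache lookup. The index selects an entry in the Tags and Data arrays. The stored tag is compared with the address tag (Match? gives hit/miss), and the block offset drives the byte or word select that sends the requested data (byte or word) to the processor.]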

Example of cache operation

Example: Processor accesses a 256-byte direct-mapped cache, which has 32-byte lines, with the following sequence of addresses.

– Show contents of the cache after each access.

– Count # of hits and # of replacements.

Example address sequence

For each address, in access order, fill in: Tag (hex) | Index & offset bits (binary) | Index (decimal) | Comment

1. 0xFF0040E0
2. 0xBEEF005C
3. 0xFF0040E2
4. 0xFF0040E8
5. 0x00101078
6. 0x002183E0
7. 0x00101064
8. 0x0012255C
9. 0x00122544

Example (cont.)

# sets = SIZE / (BLOCKSIZE × ASSOC) = 256 / (32 × 1) = 8

# index bits = log2(# sets) = log2(8) = 3

# block offset bits = log2(block size) = log2(32 bytes) = 5

# tag bits = total # address bits – # index bits – # block offset bits = 32 bits – 3 bits – 5 bits = 24

Thus, the top 6 nibbles (24 bits) of the address form the tag, and the lower 2 nibbles (8 bits) form the index and block offset fields.

Address (hex) | Tag (hex) | Index & offset bits (binary) | Index (decimal) | Comment
0xFF0040E0 | 0xFF0040 | 1110 0000 | 7 | Miss
0xBEEF005C | 0xBEEF00 | 0101 1100 | 2 | Miss
0xFF0040E2 | 0xFF0040 | 1110 0010 | 7 | Hit
0xFF0040E8 | 0xFF0040 | 1110 1000 | 7 | Hit
0x00101078 | 0x001010 | 0111 1000 | 3 | Miss
0x002183E0 | 0x002183 | 1110 0000 | 7 | Miss/Replace
0x00101064 | 0x001010 | 0110 0100 | 3 | Hit
0x0012255C | 0x001225 | 0101 1100 | 2 | Miss/Replace
0x00122544 | 0x001225 | 0100 0100 | 2 | Hit

Cache contents after each access (address bits 31–8 = tag, bits 7–5 = index, bits 4–0 = block offset; sets 0–7; sets not listed are empty):

1. 0xFF0040E0: miss, get block from memory (slow). Set 7 = FF0040.
2. 0xBEEF005C: miss, get block from memory (slow). Set 2 = BEEF00, set 7 = FF0040.
3. 0xFF0040E2: hit in set 7. Contents unchanged.
4. 0xFF0040E8: hit in set 7. Contents unchanged.
5. 0x00101078: miss, get block from memory (slow). Set 2 = BEEF00, set 3 = 001010, set 7 = FF0040.
6. 0x002183E0: miss, get block from memory (slow). 002183 replaces FF0040 in set 7.
7. 0x00101064: hit in set 3. Contents unchanged.
8. 0x0012255C: miss, get block from memory (slow). 001225 replaces BEEF00 in set 2.
9. 0x00122544: hit in set 2. Final contents: set 2 = 001225, set 3 = 001010, set 7 = 002183.

Totals: 4 hits, 5 misses, 2 replacements.
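The direct-mapped example can be replayed with a short Python sketch. The function name `simulate_direct_mapped` is hypothetical, and the trace is the example's nine addresses in access order, starting with 0xFF0040E0. Only tags are tracked, not data.

```python
def simulate_direct_mapped(trace, size=256, blocksize=32):
    """Replay a trace on a direct-mapped cache; track tags only, no data."""
    num_sets = size // blocksize            # ASSOC = 1, so # sets = # blocks
    tags = [None] * num_sets                # one tag per set; None = empty
    hits = misses = replacements = 0
    for addr in trace:
        block = addr // blocksize           # strip the block offset bits
        index = block % num_sets            # low bits of the block number
        tag = block // num_sets             # remaining high bits
        if tags[index] == tag:
            hits += 1
        else:
            misses += 1
            if tags[index] is not None:     # a valid block is being evicted
                replacements += 1
            tags[index] = tag
    return hits, misses, replacements, tags


trace = [0xFF0040E0, 0xBEEF005C, 0xFF0040E2, 0xFF0040E8, 0x00101078,
         0x002183E0, 0x00101064, 0x0012255C, 0x00122544]
hits, misses, replacements, tags = simulate_direct_mapped(trace)
print(hits, misses, replacements)           # 4 5 2
```

Integer division and modulo are used in place of shifts and masks here; for power-of-two sizes they pick out the same tag and index fields.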

Set-Associative Example

Example: Processor accesses a 256-byte 2-way set-associative cache, which has a block size of 32 bytes, with the following sequence of addresses.

– Show contents of cache after each access.

– Count # of hits and # of replacements.

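[Figure: 2-way set-associative lookup. The index field of the address (from processor) selects a set. The tag is compared with both stored tags (Match?); a match in either way (or) signals a hit and selects one of the two 32-byte blocks, and the block offset selects certain bytes.]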

Example (cont.)

# sets = SIZE / (BLOCKSIZE × ASSOC) = 256 / (32 × 2) = 4

# index bits = log2(# sets) = log2(4) = 2

# block offset bits = log2(block size) = log2(32 bytes) = 5

# tag bits = total # address bits – # index bits – # block offset bits = 32 bits – 2 bits – 5 bits = 25

Address (hex) | Tag (hex) | Index & offset bits (binary) | Index (decimal) | Comment
0xFF0040E0 | 0x1FE0081 | 110 0000 | 3 | Miss
0xBEEF005C | 0x17DDE00 | 101 1100 | 2 | Miss
0x00101078 | 0x0002020 | 111 1000 | 3 | Miss
0xFF0040E2 | 0x1FE0081 | 110 0010 | 3 | Hit
0x00101078 | 0x0002020 | 111 1000 | 3 | Hit
0x002183E0 | 0x0004307 | 110 0000 | 3 | Miss/Replace
0x00101064 | 0x0002020 | 110 0100 | 3 | Hit

Tags after each access (address bits 31–7 = tag, bits 6–5 = index, bits 4–0 = block offset; sets 0–3, two blocks per set; data not shown for convenience):

1. 0xFF0040E0: miss. Set 3 = {1FE0081}.
2. 0xBEEF005C: miss. Set 2 = {17DDE00}, set 3 = {1FE0081}.
3. 0x00101078: miss. Set 3 = {1FE0081, 0002020}.
4. 0xFF0040E2: hit on 1FE0081 in set 3.
5. 0x00101078: hit on 0002020 in set 3.
6. 0x002183E0: miss. Set 3 is full, so 0004307 replaces 1FE0081, the less recently used of the two blocks. Set 3 = {0004307, 0002020}.
7. 0x00101064: hit on 0002020 in set 3. Final contents: set 2 = {17DDE00}, set 3 = {0004307, 0002020}.

Totals: 3 hits, 4 misses, 1 replacement.
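The set-associative example can be replayed the same way. The sketch below is one possible n-way simulator, not the course's reference implementation. The name `simulate` is hypothetical, LRU replacement is assumed, and recency is kept by list order (least recently used tag first), which is a simulator shortcut.

```python
def simulate(trace, size, blocksize, assoc):
    """Replay a trace on an n-way set-associative cache with LRU replacement."""
    num_sets = size // (blocksize * assoc)
    # Each set is a list of tags ordered from least to most recently used.
    sets = [[] for _ in range(num_sets)]
    hits = misses = replacements = 0
    for addr in trace:
        block = addr // blocksize
        index, tag = block % num_sets, block // num_sets
        ways = sets[index]
        if tag in ways:
            hits += 1
            ways.remove(tag)            # re-appended below as most recent
        else:
            misses += 1
            if len(ways) == assoc:      # set is full: evict the LRU tag
                ways.pop(0)
                replacements += 1
        ways.append(tag)
    return hits, misses, replacements, sets


trace = [0xFF0040E0, 0xBEEF005C, 0x00101078, 0xFF0040E2,
         0x00101078, 0x002183E0, 0x00101064]
hits, misses, replacements, sets = simulate(trace, size=256, blocksize=32, assoc=2)
print(hits, misses, replacements)       # 3 4 1
print([hex(t) for t in sets[3]])        # ['0x4307', '0x2020']
```

Calling the same function with `assoc=1` models a direct-mapped cache, and `assoc = size // blocksize` models a fully-associative one, since only the number of sets changes.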

Generic Cache

Every cache is an n-way set-associative cache.

1. Direct-mapped:

– 1-way set-associative

2. n-way set-associative:

– n-way set-associative

3. Fully-associative:

– n-way set-associative, where there is only 1 set containing n blocks

– # index bits = log2(1) = 0 (equation still works!)

Generic Cache (cont.)

The same equations hold for any cache type

• Equation for # of cache blocks in cache:

# cache blocks = SIZE / BLOCKSIZE

• Equation for # of sets in cache:

# sets = # cache blocks / ASSOC = SIZE / (BLOCKSIZE × ASSOC)

• Fully-associative: ASSOC = # cache blocks

Generic Cache (cont.)

• What this means

– You don’t have to treat the three cache types differently in your simulator!

– Support a generic n-way set-associative cache.

– You don’t have to specifically worry about the two extremes (direct-mapped & fully-associative).

– E.g., even a DM cache has LRU information (in the simulator), even though it is not used.

Replacement Policy

• Which block in a set should be replaced when a new block has to be allocated?

– LRU (least recently used) is the most common policy.

– Implementation:

• Real hardware maintains LRU bits with each block in the set.

• Simulator: Cheat using “sequence numbers” as timestamps.

– Makes simulation easier – gives same effect as hardware.

LRU Example

• Assume that blocks A, B, C, D, and E all map to the same set

• Trace: A B C D D D B D E

• Assign sequence numbers (increment by 1 for each new access). This yields …

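Access | Set contents after the access
A(0) | A(0)
B(1) | A(0) B(1)
C(2) | A(0) B(1) C(2)
D(3) | A(0) B(1) C(2) D(3)
D(4) | A(0) B(1) C(2) D(4)
D(5) | A(0) B(1) C(2) D(5)
B(6) | A(0) B(6) C(2) D(5)
D(7) | A(0) B(6) C(2) D(7)
E(8) | E(8) B(6) C(2) D(7)

The set holds four blocks. When E arrives the set is full, so E replaces A, whose sequence number (0) is the smallest.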
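The sequence-number trick takes only a few lines in a simulator. The sketch below assumes a single 4-way set, as the example's states imply. The function name `lru_with_sequence_numbers` is made up for illustration.

```python
def lru_with_sequence_numbers(trace, assoc):
    """Model one set; each block carries the sequence number of its last access."""
    stamps = {}                     # block name -> sequence number
    history = []                    # snapshot of the set after every access
    for seq, block in enumerate(trace):
        if block not in stamps and len(stamps) == assoc:
            # Set is full: the block with the smallest sequence number is LRU.
            victim = min(stamps, key=stamps.get)
            del stamps[victim]
        stamps[block] = seq         # hit or fill: stamp with the current number
        history.append(dict(stamps))
    return history


history = lru_with_sequence_numbers("ABCDDDBDE", assoc=4)
print(history[-1])                  # {'B': 6, 'C': 2, 'D': 7, 'E': 8}
```

A global access counter serves as the timestamp, so no per-set LRU bit encoding is needed, and the victim is simply the block with the minimum stamp.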


Handling Writes

What happens when a write occurs?

Two issues

1. Is just the cache updated with new data, or are lower levels of the memory hierarchy updated at the same time?

2. Do we allocate a new block in the cache if the write misses?

The Write-Update Question

• Write-through (WT) policy: Writes that go to the cache are also “written through” to the next level in the memory hierarchy.

[Figure: cache and next level in memory hierarchy.]

The Write-Update Question, cont.

• Write-back (WB) policy: Writes go only to the cache, and are not (immediately) written through to the next level of the hierarchy.

[Figure: cache and next level in memory hierarchy.]

The Write-Update Question, cont.

Write-back, cont.

– What happens when a line previously written to needs to be replaced?

1. We need to have a “dirty bit” (D) with each line in the cache, and set it when the line is written to.

2. When a dirty block is replaced, we need to write the entire block back to the next level of memory (“write-back”).

The Write-Update Question, cont.

• With the write-back policy, replacement of a dirty block triggers a writeback of the entire line.

[Figure: cache with dirty bit (D) and next level in memory hierarchy. Replacement causes writeback.]

The Write-Allocation Question

• Write-Allocate (WA)

– Bring the block into the cache if the write misses (handled just like a read miss).

– Typically used with write-back policy: WBWA.

• Write-No-Allocate (NA)

– Do not bring the block into the cache if the write misses.

– Must be used in conjunction with write-through: WTNA.

The Write-Allocation Question, cont.

• WTNA (scenario: the write misses)

[Figure: cache and next level in memory hierarchy; write miss.]

The Write-Allocation Question, cont.

• WBWA (scenario: the write misses)

[Figure: cache with dirty bit (D) and next level in memory hierarchy.]
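The two policy combinations can be compared by counting traffic to the next level of the hierarchy. The sketch below is a simplified model and rests on several assumptions: a small direct-mapped cache, a made-up function name `simulate_writes`, and a made-up five-access trace. It counts block fetches, block writebacks, and individual write-throughs.

```python
def simulate_writes(trace, policy, num_sets=8, blocksize=32):
    """trace: list of ('r' or 'w', address); policy: 'WBWA' or 'WTNA'."""
    lines = {}                          # index -> [tag, dirty bit]
    block_fetches = writebacks = write_throughs = 0
    for op, addr in trace:
        block = addr // blocksize
        index, tag = block % num_sets, block // num_sets
        line = lines.get(index)
        hit = line is not None and line[0] == tag
        if op == 'w' and policy == 'WTNA':
            # Every write goes to the next level. On a hit the cached copy is
            # updated as well; on a miss no block is allocated.
            write_throughs += 1
            continue
        if not hit:
            if line is not None and line[1]:
                writebacks += 1         # dirty victim: write the whole block back
            block_fetches += 1          # bring the block in (read or write miss)
            line = lines[index] = [tag, False]
        if op == 'w':
            line[1] = True              # WBWA: set the dirty bit
    return block_fetches, writebacks, write_throughs


trace = [('w', 0x100), ('w', 0x104), ('r', 0x100), ('w', 0x200), ('r', 0x100)]
print(simulate_writes(trace, 'WBWA'))   # (3, 2, 0)
print(simulate_writes(trace, 'WTNA'))   # (1, 0, 3)
```

In this trace 0x100 and 0x200 map to the same set, so WBWA pays for two dirty evictions, while WTNA never allocates on the write misses and sends each of the three writes to the next level.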

Victim Caches

• A victim cache is a small fully-associative cache that sits alongside the primary cache

– E.g., holds 2–16 cache blocks

– Management is slightly “weird” compared to conventional caches

• When the main cache evicts (replaces) a block, the victim cache will take the evicted (replaced) block.

• The evicted block is called the “victim.”

• When the main cache misses, the victim cache is searched for recently discarded blocks. A victim-cache hit means the main cache doesn’t have to go to memory for the block.

Victim-Cache Example

• Suppose we have a 2-entry victim cache

– It initially holds blocks X and Y.

– Y is the LRU block in the victim cache.

• The main cache is direct-mapped.

– Blocks A and B map to the same set in main cache

– Trace: A B A B A B …

[Figure: initial state. Main cache: A. Victim cache: X, Y (LRU).]

Victim-Cache Example, cont.

1. B misses in main cache and evicts A.

2. A goes to victim cache & replaces Y (the previous LRU).

3. X becomes LRU.

[Figure: Main cache: B. Victim cache: X (LRU), A.]

Victim-Cache Example, cont.

1. A misses in L1 but hits in victim cache, so A and B swap positions:

2. A is moved from victim cache to L1, and B (the victim) goes to victim cache where A was located. (Note: We don’t replace the LRU block, X, in case of victim-cache hit.)

[Figure: L1 cache: A. Victim cache: X (LRU), B.]
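The A B A B … behaviour can be reproduced with a small sketch. It models only the one main-cache set that A and B share, plus the 2-entry victim cache. The name `run_victim_example` is hypothetical. One simplification is assumed: on a victim-cache hit the swapped-in block inherits the slot, and with it the recency position, of the block it replaces.

```python
def run_victim_example(trace, main_block="A", victim=("Y", "X")):
    """One direct-mapped set plus a victim cache (ordered LRU first, MRU last)."""
    victim = list(victim)
    capacity = len(victim)
    main = main_block
    outcomes = []
    for block in trace:
        if block == main:
            outcomes.append("main hit")
        elif block in victim:
            # Swap: the requested block moves to the main cache and the evicted
            # main-cache block takes its slot. The LRU entry is not replaced.
            victim[victim.index(block)] = main
            main = block
            outcomes.append("victim hit")
        else:
            # Miss in both: the evicted main-cache block replaces the LRU entry.
            if len(victim) == capacity:
                victim.pop(0)
            victim.append(main)
            main = block
            outcomes.append("miss")
    return outcomes, main, victim


outcomes, main, victim = run_victim_example("ABABAB")
print(outcomes)
print(main, victim)     # B ['X', 'A']
```

Only the first access to B goes to memory. Every later access in the trace is a victim-cache hit, and X stays in the victim cache as the LRU entry throughout.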

Victim Cache – Why?

• Direct-mapped caches suffer badly from repeated conflicts

– Victim cache provides illusion of set associativity.

– A poor-man’s version of set associativity.

– A victim cache does not have to be large to be effective; even a 4–8 entry victim cache will frequently remove > 50% of conflict misses.
