Intel XScale ® Core 3
3.4.1 Instruction Cache Operation
3.4.1.1 Operation when Instruction Cache is Enabled
When the cache is enabled, it compares every instruction request address to the addresses of the instructions that it holds. If the requested instruction is found, the access “hits” the cache, which returns the requested instruction. If the instruction is not found, the access “misses” the cache, which requests a fetch from external memory of the 8-word line (32 bytes) that contains the instruction, using the fetch policy (Section 3.4.1.3). As the fetch returns instructions to the cache, they are placed in one of two fetch buffers, and the requested instruction is delivered to the instruction decoder. A fetched line is written into the cache if it is cacheable. Code is cacheable if the MMU is disabled, or if the MMU is enabled and the cacheable (C) bit is set to 1 for its corresponding page.
Note: An instruction fetch may “miss” the cache but “hit” one of the fetch buffers. If this happens, the requested instruction is delivered to the instruction decoder in the same manner as a cache “hit.”
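The hit test above works on whole 32-byte lines, so it can help to see how an instruction address splits into a line base and a word index. The following is a minimal sketch in C, not the hardware's implementation; the function names are hypothetical, and only the line size (8 words, 32 bytes) comes from the text.

```c
#include <stdint.h>

#define ICACHE_LINE_BYTES 32u  /* 8 words x 4 bytes, per the text */
#define ICACHE_WORD_BYTES 4u

/* Base address of the 32-byte line that contains addr. */
static uint32_t icache_line_base(uint32_t addr)
{
    return addr & ~(ICACHE_LINE_BYTES - 1u);
}

/* Index (0..7) of the word that holds addr within its line. */
static uint32_t icache_word_index(uint32_t addr)
{
    return (addr & (ICACHE_LINE_BYTES - 1u)) / ICACHE_WORD_BYTES;
}

/* Two addresses can be served by the same cached or buffered line
 * exactly when their line bases match. */
static int icache_same_line(uint32_t a, uint32_t b)
{
    return icache_line_base(a) == icache_line_base(b);
}
```

For example, under this model address 0x1234 lies in the line based at 0x1220 and is word 5 of that line.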
Figure 18. Instruction Cache Organization
[Figure not reproduced. Surviving labels: Set Index; CAM = Content Addressable Memory.]
3.4.1.2 Operation when Instruction Cache is Disabled
Disabling the cache prevents any lines from being written into the instruction cache. Even when disabled, the cache is still accessed and may generate a “hit” if the requested instruction is already in the cache.
Disabling the instruction cache does not disable the instruction buffering that may occur within the instruction fetch buffers. The two 8-word instruction fetch buffers are always enabled, even in cache-disabled mode. As long as instruction fetches continue to “hit” within either buffer (even in the presence of forward and backward branches), no external fetches for instructions are generated. A miss causes one or the other buffer to be filled from external memory using the fetch policy (Section 3.4.1.3).
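The effect of the two fetch buffers in cache-disabled mode can be illustrated with a small behavioral model. This is only a sketch: the type and function names are hypothetical, and the choice of which buffer to refill on a miss (simple alternation here) is an assumption, since the text says only that "one or the other" buffer is filled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define FBUF_LINE_MASK (~(uint32_t)31u)  /* 32-byte (8-word) lines */

typedef struct {
    uint32_t base[2];          /* line base address held by each buffer */
    int      valid[2];         /* nonzero once the buffer holds a line  */
    int      next_fill;        /* ASSUMPTION: buffers refill alternately */
    unsigned external_fetches; /* number of line fetches sent to memory */
} fbuf_pair;

static void fbuf_init(fbuf_pair *fb)
{
    assert(fb != NULL);
    fb->valid[0] = fb->valid[1] = 0;
    fb->base[0] = fb->base[1] = 0;
    fb->next_fill = 0;
    fb->external_fetches = 0;
}

/* Fetch one instruction with the cache disabled.
 * Returns 1 on a fetch-buffer hit, 0 when an external line fetch was needed. */
static int fbuf_fetch(fbuf_pair *fb, uint32_t addr)
{
    uint32_t base = addr & FBUF_LINE_MASK;
    assert(fb != NULL);
    for (int i = 0; i < 2; i++) {
        if (fb->valid[i] && fb->base[i] == base)
            return 1;                      /* hit: no external fetch */
    }
    int victim = fb->next_fill;            /* miss: refill one buffer */
    fb->base[victim] = base;
    fb->valid[victim] = 1;
    fb->next_fill = 1 - victim;
    fb->external_fetches++;
    return 0;
}
```

In this model, a loop whose body fits within two 32-byte lines generates external fetches only on its first pass, however many times it iterates and whichever direction its branches take.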
3.4.1.3 Fetch Policy
An instruction-cache “miss” occurs when the requested instruction is not found in the instruction fetch buffers or the instruction cache; a fetch request is then made to external memory. The instruction cache can handle up to two “misses.” Each external fetch request uses a fetch buffer that holds 32 bytes and eight valid bits, one for each word. A miss causes the following:
1. A fetch buffer is allocated.
2. The instruction cache sends a fetch request to the external bus. This request is for a 32-byte line.
3. Instruction words are returned from the external bus at a maximum rate of 1 word per core cycle. As each word returns, the corresponding valid bit is set for that word in the fetch buffer.
4. As soon as the fetch buffer receives the requested instruction, it forwards the instruction to the instruction decoder for execution.
5. When all words have returned, the fetched line will be written into the instruction cache if it is cacheable and if the instruction cache is enabled. The line chosen for update in the cache is controlled by the round-robin replacement algorithm. This update may evict a valid line at that location.
6. Once the cache is updated, the eight valid bits of the fetch buffer are invalidated.
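The per-word valid bits in steps 3 through 6 can be sketched as a small C model. The names below are hypothetical, and the model makes no claim about the order in which words return from the bus; it only tracks which words have arrived.

```c
#include <assert.h>
#include <stdint.h>

#define LF_WORDS_PER_LINE 8u

typedef struct {
    uint32_t word[LF_WORDS_PER_LINE]; /* 32 bytes of instruction data   */
    uint8_t  valid;                   /* bit i set => word[i] has arrived */
} line_fill;

/* Step 3: a word returns from the external bus; set its valid bit. */
static void lf_word_returned(line_fill *lf, unsigned idx, uint32_t data)
{
    assert(idx < LF_WORDS_PER_LINE);
    lf->word[idx] = data;
    lf->valid |= (uint8_t)(1u << idx);
}

/* Step 4: the requested word can be forwarded to the decoder as soon
 * as its own valid bit is set, without waiting for the whole line. */
static int lf_can_forward(const line_fill *lf, unsigned idx)
{
    assert(idx < LF_WORDS_PER_LINE);
    return (int)((lf->valid >> idx) & 1u);
}

/* Step 5 precondition: all eight words have returned. */
static int lf_line_complete(const line_fill *lf)
{
    return lf->valid == 0xFFu;
}

/* Step 6: after the cache is updated, clear all eight valid bits. */
static void lf_invalidate(line_fill *lf)
{
    lf->valid = 0;
}
```

The point of the valid bits is visible in `lf_can_forward`: the decoder does not have to wait for `lf_line_complete` before it receives the instruction that missed.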
3.4.1.4 Round-Robin Replacement Algorithm
The line replacement algorithm for the instruction cache is round-robin. Each set in the instruction cache has a round-robin pointer that keeps track of the next line (in that set) to replace. The next line to replace in a set is the one after the last line that was written. For example, if the line for the last external instruction fetch was written into way 5 of set 2, the next line to replace for that set would be way 6. The round-robin pointers of the other sets are not affected in this case.
After reset, way 31 is pointed to by the round-robin pointer for all the sets. Once a line is written into way 31, the round-robin pointer points to the first available way of a set, beginning with way 0 if no lines have been locked into that particular set. Locking lines into the instruction cache effectively reduces the available lines for cache updating. For example, if the first three lines of a set were locked down, the round-robin pointer would point to the line at way 3 after it rolled over from way 31.
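The pointer behavior described above, including the rollover past locked lines, can be modeled for a single set as follows. This is a sketch under stated assumptions: the names are hypothetical, the way count of 32 is inferred from the way 0 to way 31 range in the text, and locked lines are assumed to occupy the lowest-numbered ways, as in the three-line example.

```c
#include <assert.h>

#define RR_WAYS 32u  /* ways 0..31, inferred from the text */

typedef struct {
    unsigned next_way;    /* way the next line fill will replace */
    unsigned locked_ways; /* ASSUMPTION: ways 0..locked_ways-1 are locked */
} rr_set;

/* After reset, every set's pointer points to way 31. */
static void rr_reset(rr_set *s)
{
    s->next_way = RR_WAYS - 1u;
    s->locked_ways = 0u;
}

/* Write a fetched line into the set: returns the way that was replaced
 * and advances the pointer, rolling over from way 31 to the first
 * unlocked way. */
static unsigned rr_fill(rr_set *s)
{
    assert(s->locked_ways < RR_WAYS);
    unsigned way = s->next_way;
    s->next_way = (way + 1u == RR_WAYS) ? s->locked_ways : way + 1u;
    return way;
}
```

Each set would own one such structure, which is why a fill into one set leaves the pointers of the other sets unchanged.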
3.4.1.5 Parity Protection
The instruction cache is protected by parity to ensure data integrity. Each instruction cache word has 1 parity bit. (The instruction cache tag is not parity protected.) When a parity error is detected on an instruction cache access, a prefetch abort exception occurs if the Intel XScale® core attempts to execute the instruction. Before servicing the exception, hardware places a notification of the error in the Fault Status register (Coprocessor 15, register 5).
A software exception handler can recover from an instruction cache parity error. This can be accomplished by invalidating the instruction cache and the branch target buffer and then returning to the instruction that caused the prefetch abort exception. A simplified code example is shown in Example 17. A more complex handler might choose to invalidate the specific line that caused the exception and then invalidate the BTB.
If a parity error occurs on an instruction that is locked in the cache, the software exception handler must unlock the instruction cache, invalidate the cache, and then re-lock the code before it returns to the faulting instruction.
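To make the one-parity-bit-per-word scheme concrete, the following C sketch computes a parity bit for a 32-bit instruction word and checks it on a later read. The function names are hypothetical, and the polarity (even parity) is an assumption; the text does not say which polarity the hardware uses.

```c
#include <stdint.h>

/* Returns 1 if w contains an odd number of 1 bits, else 0.
 * Under the even-parity ASSUMPTION, this is the bit stored with the word. */
static unsigned parity32(uint32_t w)
{
    w ^= w >> 16;
    w ^= w >> 8;
    w ^= w >> 4;
    w ^= w >> 2;
    w ^= w >> 1;
    return w & 1u;
}

/* Compare the parity of the word read from the cache with the parity
 * bit stored when the line was filled. Nonzero means a parity error. */
static int icache_parity_error(uint32_t word_read, unsigned stored_parity)
{
    return parity32(word_read) != (stored_parity & 1u);
}
```

A single parity bit detects any odd number of flipped bits in a word but cannot detect an even number, and it cannot correct anything, which is why recovery relies on invalidating the line and refetching from external memory.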
3.4.1.6 Instruction Cache Coherency
The instruction cache does not detect modification to program memory by loads, stores or actions of other bus masters. Several situations may require program memory modification, such as uploading code from disk.
The application program is responsible for synchronizing code modification and invalidating the cache. In general, software must ensure that modified code space is not accessed until modification and invalidating are completed.
To achieve cache coherence, instruction cache contents can be invalidated after code modification in external memory is complete.
If the instruction cache is not enabled, or code is being written to a non-cacheable region, software must still invalidate the instruction cache before using the newly-written code. This precaution ensures that state associated with the new code is not buffered elsewhere in the processor, such as the fetch buffers or the BTB.
Naturally, when writing code as data, care must be taken to force it completely out of the processor into external memory before attempting to execute it. If writing into a non-cacheable region, flushing the write buffers is sufficient precaution. If writing to a cacheable region, then the data cache should be submitted to a Clean/Invalidate operation to ensure coherency.
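The rules in this section reduce to a small decision: which maintenance steps are needed after writing code as data. The C sketch below encodes that decision as a bitmask. The enumerator and function names are hypothetical, and the model only selects the steps; the actual cache operations are coprocessor instructions that are not shown here.

```c
/* Maintenance steps named in the text, as bit flags. */
enum {
    STEP_DRAIN_WRITE_BUFFER      = 1 << 0, /* non-cacheable destination */
    STEP_CLEAN_INVALIDATE_DCACHE = 1 << 1, /* cacheable destination     */
    STEP_INVALIDATE_ICACHE       = 1 << 2  /* always required           */
};

/* Returns the set of steps to perform after new code has been written
 * and before it is executed. The data-side step must complete first, so
 * the code reaches external memory; the instruction cache invalidate
 * follows, and is needed even when the instruction cache is disabled or
 * the region is non-cacheable, because the fetch buffers and BTB may
 * still hold stale state. */
static unsigned code_update_steps(int region_is_cacheable)
{
    unsigned steps = STEP_INVALIDATE_ICACHE;
    steps |= region_is_cacheable ? STEP_CLEAN_INVALIDATE_DCACHE
                                 : STEP_DRAIN_WRITE_BUFFER;
    return steps;
}
```

Note that the state of the instruction cache enable bit is not an input: per the text, the instruction cache invalidate is required in every case.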
Example 17. Recovering from an Instruction Cache Parity Error
    ; Prefetch abort handler
    MCR     P15,0,R0,C7,C5,0    ; Invalidate the instruction cache and branch target buffer
    CPWAIT                      ; Wait for effect
    ;
    SUBS    PC,R14,#4           ; Return to the instruction that generated the parity error
    ; The instruction cache is guaranteed to be invalidated at this point