Programming Model
3.3 MMUs/Bus Interface Unit
The bus interface unit (BIU) is compatible with those of the PowerPC 601™ and PowerPC 603™ microprocessors. It implements both tenured and split-transaction modes and can handle as many as three outstanding transactions in pipelined mode. If permitted, the BIU can complete one or more write transactions between the address and data tenures of a read transaction. The BIU has 32-bit address and 64-bit data buses protected by byte parity.
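As a rough illustration of byte parity on the 64-bit data bus, the sketch below computes one parity bit per byte lane. The function name and the default of odd parity are assumptions made for the example; this section does not state the parity sense.

```python
def byte_parity_bits(data, width_bytes=8, odd=True):
    """Return one parity bit per byte of `data`, most significant byte first.

    Hypothetical helper: `odd=True` is an assumption, not taken from this section.
    """
    if not 0 <= data < 1 << (8 * width_bytes):
        raise ValueError("data does not fit on the bus")
    bits = []
    for lane in range(width_bytes):
        shift = 8 * (width_bytes - 1 - lane)
        byte = (data >> shift) & 0xFF
        ones = bin(byte).count("1")
        # Odd parity: the parity bit makes the count of 1s (byte plus parity) odd.
        # Even parity: the parity bit makes that count even.
        bits.append((ones + 1) % 2 if odd else ones % 2)
    return bits
```

A 64-bit value therefore carries eight parity bits, one for each byte of the data bus.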
The BIU implements critical-double-word-first access: the double word requested by the fetcher or the LSU is fetched first, and the remaining words in the line are fetched afterward. The critical double word and the other words in the cache block are forwarded to the fetcher or to the LSU before they are written to the cache.
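The ordering can be sketched as follows. The 32-byte block size is inferred from the 8-word line-fill buffers in Figure 3-4, and the sequential wrap-around order for the remaining double words is an assumption; this section says only that they are fetched after the critical one. The function name is hypothetical.

```python
BLOCK_BYTES = 32        # eight 32-bit words per cache block (inferred from Figure 3-4)
DOUBLE_WORD_BYTES = 8   # one beat on the 64-bit data bus

def burst_order(address):
    """Return the double-word indices of a line fill, critical double word first.

    Assumes the remaining double words follow in sequential wrap-around order.
    """
    beats = BLOCK_BYTES // DOUBLE_WORD_BYTES
    critical = (address % BLOCK_BYTES) // DOUBLE_WORD_BYTES
    return [(critical + i) % beats for i in range(beats)]
```

For example, a miss on an address whose offset within the block is 0x10 would request double word 2 first under these assumptions.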
The bus can be run at 1x, 2/3x, 1/2x or 1/3x the speed of the processor. The programmable on-chip phase-locked loop (PLL) generates the necessary processor clocks from the bus clock.
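A small sketch of the clock arithmetic implied by these ratios: given a bus clock and one of the listed bus-to-processor ratios, it derives the processor clock. The function name is hypothetical, and any bus frequency passed in is an illustrative value, not a figure from this manual.

```python
from fractions import Fraction

# Bus speed as a fraction of processor speed, as listed in the text.
BUS_TO_CPU_RATIOS = (Fraction(1), Fraction(2, 3), Fraction(1, 2), Fraction(1, 3))

def processor_clock_mhz(bus_mhz, ratio):
    """Return the processor clock for a given bus clock and bus/processor ratio."""
    ratio = Fraction(ratio)
    if ratio not in BUS_TO_CPU_RATIOS:
        raise ValueError("unsupported bus/processor ratio: %s" % ratio)
    # The bus runs at `ratio` times the processor speed, so divide to go back.
    return Fraction(bus_mhz) / ratio
```

With the 1/3x setting, for instance, the PLL would have to multiply the bus clock by three.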
When a memory access fails to hit in the cache, the 604e accesses system memory through the bus interface unit. These operations must arbitrate for bus access.
The memory management units (MMUs) provide address translation as specified by the PowerPC OEA, including block address translation and page translation of memory segments. The MMUs and the bus interface unit are shown in Figure 3-3.
The 604e implements separate MMUs, one for instruction accesses and one for data accesses. Virtual address translation uses two 128-entry, two-way set-associative (64 x 2) translation lookaside buffers (TLBs), one for instruction accesses and one for data accesses. The 604e provides hardware that performs the TLB reload (also known as page table walk) when a translation is not in a TLB. Memory management is described in Chapter 5, “Memory Management.”
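To make the 64 x 2 organization concrete, here is a minimal software model of one such TLB. It is a sketch only: indexing by the low-order bits of the page number and least-recently-used replacement are assumptions for illustration, the class and method names are hypothetical, and the page table walk itself is not modeled.

```python
PAGE_SHIFT = 12   # 4-Kbyte pages
NUM_SETS = 64
NUM_WAYS = 2      # 64 sets x 2 ways = 128 entries

class Tlb:
    def __init__(self):
        # Each set holds up to NUM_WAYS (tag, physical_page) pairs,
        # ordered from least to most recently used.
        self.sets = [[] for _ in range(NUM_SETS)]

    def _split(self, vaddr):
        vpn = vaddr >> PAGE_SHIFT
        return vpn % NUM_SETS, vpn // NUM_SETS   # (set index, tag); assumed split

    def lookup(self, vaddr):
        index, tag = self._split(vaddr)
        ways = self.sets[index]
        for i, (entry_tag, ppn) in enumerate(ways):
            if entry_tag == tag:
                ways.append(ways.pop(i))         # mark as most recently used
                return (ppn << PAGE_SHIFT) | (vaddr & ((1 << PAGE_SHIFT) - 1))
        return None                              # miss: hardware would reload the TLB

    def reload(self, vaddr, ppn):
        index, tag = self._split(vaddr)
        ways = self.sets[index]
        ways[:] = [(t, p) for (t, p) in ways if t != tag]
        if len(ways) == NUM_WAYS:
            ways.pop(0)                          # evict the least recently used way
        ways.append((tag, ppn))
```

In this model a third page that maps to an already full set displaces the older of the two resident translations.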
The BIU handles block fill and write-back requests from either cache, as well as all noncacheable reads and writes.
Figure 3-3. Bus Interface Unit and MMU
As shown in Figure 3-4, the 604e implements four types of memory queues to support the four types of operations: line-fill, write, copy-back, and invalidation operations. For a line-fill operation, the line-fill address from either the instruction or data cache is kept in the memory address queue until the address can be sent out in an address tenure. After the address tenure, the address is transferred to the line-fill address queue, which releases the address bus for other transactions in split-transaction mode. As each double word for the line-fill operation is returned, it is transferred to the line-fill buffer, where it is forwarded to the LSU.
If a subsequent in-order load to the same cache block hits on valid data in the data line-fill buffer, the data is forwarded to the load/store unit from the line-fill buffer. In the 604, by contrast, a subsequent in-order load to the same cache block is required to wait until the line-fill buffer is completely written into the cache before data is accessed from the cache.
[Figure 3-3 block labels: Load/Store Unit, Data MMU, Instruction Cache, Instruction Unit, Instruction MMU, Bus Interface Unit, Bus, TLB Reload]
Figure 3-4. Memory Queue Organization
For write operations, the address is kept in the memory address queue and the data is kept in the write buffer until both can be sent out in a write transaction. Similarly, for copy-back operations, the address is kept in the copy-back address queue and the data is kept in the copy-back buffer until both can be sent out in a burst write transaction. For a cache control instruction or a store to a shared cache block, the address is kept in the cache control address queue until an address-only transaction is sent out to broadcast the cache control command. Because all address queues in the 604e are treated as part of the coherent memory system, they are checked against the data cache and snoop addresses to ensure data consistency and to maintain the MESI coherency protocol.
[Figure 3-4 block labels: Icache Address, Dcache Address, Store Data (2 word), Snoop Address to Data Cache, Memory Address Q0 and Q1, Share-Invalidate Queue, Write Data Q0 and Q1 (2 word each), Copy-Back Address Q0 through Q3, Copy-Back Data Q0 through Q3 (8 word each), I-Line Fill Address Q, D-Line Fill Address Q0 and Q1, Line Fill Data Q0 and Q1 (8 word each), I-Line Fill Address, D-Line Fill Address, I-Line Fill Data, D-Line Fill Data, Snoop Address Register, Address Bus Register, Data In Register, Data Bus Register, Address Bus, Data Bus]
To support the increased bandwidth of the nonblocking caches, the BIU can handle as many as three pipelined transactions before data has to be provided by the memory system. The three outstanding transactions can be any combination of up to two noncacheable or write-through write operations, up to two data cache reloads, one instruction cache reload, and up to three cache block copy-backs. Address-only transactions are not counted among the three outstanding transactions.
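The limits in the preceding paragraph can be expressed as a small admission check. This is only a sketch of the counting rule as stated in the text; the transaction type names and the function name are hypothetical labels chosen for the example.

```python
MAX_OUTSTANDING = 3

# Per-type ceilings taken from the text; the key names are illustrative.
PER_TYPE_LIMIT = {
    "write": 2,          # noncacheable or write-through writes
    "data_reload": 2,    # data cache reloads
    "instr_reload": 1,   # instruction cache reload
    "copy_back": 3,      # cache block copy-backs
}

def can_pipeline(outstanding):
    """Return True if the listed transactions may all be outstanding at once.

    `outstanding` is a list of type names; "address_only" entries are not counted.
    """
    counted = [t for t in outstanding if t != "address_only"]
    for t in counted:
        if t not in PER_TYPE_LIMIT:
            raise ValueError("unknown transaction type: %r" % t)
    if len(counted) > MAX_OUTSTANDING:
        return False
    return all(counted.count(kind) <= limit
               for kind, limit in PER_TYPE_LIMIT.items())
```

Under this rule, two writes plus an instruction cache reload fit within the limit, while three writes do not.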
Typically, the three copy-back buffers are written to memory in the same order in which they are filled, and they have the lowest access priority among all the bus interface unit’s memory queues. Write operations from the copy-back buffers can occur out of order under the following two conditions:
• A snoop hit on one or more copy-back buffers causes the copy-back buffers to have the second highest priority among the BIU’s memory queues, after only the snoop-push buffer. In this case, the next write from these three copy-back buffers will be from the buffer that contains the newest data corresponding to the snoop hit. If the snoop address hits on multiple copy-back buffers (possibly due to the dcbst instruction), the accesses for all matching buffers except the one with the newest data are cancelled.
• Similarly, if execution of the dcbst instruction causes multiple copy-back buffers to contain the same address, each buffer that contains this address is cancelled unless it contains the newest data or unless the buffer is the next address transaction to go to the bus.
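The snoop-hit rule in the first condition above can be sketched as a selection over the buffers. The data structure, field names, and function name are hypothetical; the model assumes block-aligned addresses and tracks age with a simple fill counter, and it does not model the exception in the second condition for a buffer that is already the next address transaction on the bus.

```python
from dataclasses import dataclass

@dataclass
class CopyBackBuffer:
    address: int          # block-aligned address held by the buffer
    fill_order: int       # larger means filled more recently (newer data)
    cancelled: bool = False

def resolve_snoop_hit(buffers, snoop_address):
    """Return the buffer to write next on a snoop hit, or None if nothing hits.

    All matching buffers except the one with the newest data are cancelled.
    """
    hits = [b for b in buffers
            if b.address == snoop_address and not b.cancelled]
    if not hits:
        return None
    newest = max(hits, key=lambda b: b.fill_order)
    for b in hits:
        if b is not newest:
            b.cancelled = True
    return newest
```

If two buffers hold the same block, only the more recently filled one survives a snoop to that address in this model.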
Note that the three copy-back buffers in the 604e improve the performance of multiple dcbf and dcbst instructions because the address and data tenures of burst writes can be pipelined. For details concerning the signals, see Chapter 7, “Signal Descriptions,” and for information regarding bus protocol, see Chapter 8, “System Interface Operation.”