User’s Manual

IBM PowerPC 750GX and 750GL RISC Microprocessor

The BIU has both AR buffers and a 4-deep reload-request queue, so BIU operation for MuM support does not depend on the LSU queue: the BIU has enough buffers and queue depth to manage the outstanding transactions. The LSU has no additional queues for MuM; MuM uses what is already there. If there are four data cache reload requests, then for each one the data cache does a lookup, reports a miss, and passes the request on to the L2/BIU without storing any information.

//////////////////////////////////////////////////////////////////////////////////////////////////

example:1

        addis   r15,0,0x4200        # Base Data Address in r15
        addi    r16,0,0x0020        # Loop Count is 32
        mtctr   r16
lptop:  lwz     r20,0x0000(r15)
        lwz     r21,0x0020(r15)
        lwz     r22,0x0040(r15)
        lwz     r23,0x0060(r15)
        addi    r15,r15,0x0080      # Modify Base Data Address
        bdnz    lptop
        bfinish

//////////////////////////////////////////////////////////////////////////////////////////////////

There are several conditions, listed below, that can stall, limit, or prevent MuM so that the performance advantage on the bus will not be realized.

1. Sequential cacheable loads to the same index.

Sequential cacheable loads or stores that reference the same L1 cache index will not be pipelined. The index bits are EA(20:26). A load to the same cache-line index as an outstanding load miss will prohibit all further MuM, even for successive loads to a different cache line, until the outstanding load miss is complete. The MuM feature does not look deeper into the load/store request queue for other loads that do not reference the same cache line once an index conflict exists.

Cache-inhibited loads do not have this restriction.

2. TLB miss resulting in a table walk.

When address translation is enabled using paging, a TLB miss occurs for load or store instructions that reference a page of memory not described in the TLB. A hardware table walk is then started, which reads page table entries (PTEs) from the caches or the bus. During this process, MuM will be prohibited until the table walk is complete.

3. Load request for a graphics instruction, such as External Control In Word Indexed (eciwx).

Graphics instructions, such as eciwx, halt MuM, but once the eciwx is active on the bus, other qualified loads can initiate an MuM. An eciwx load will not be pipelined into other loads, but other loads can pipeline into it.

4. There is a load to guarded memory.

Guarded loads are not allowed to be pipelined into other loads, and other loads are not pipelined into them.

Note: Real mode, the default if address translation is not enabled, sets the write-through, caching-inhibited, memory coherency, guarded (WIMG) bits to b'0011'. The guarded attribute is set and, therefore, MuM will not occur. Address translation must be enabled, and it must set the guarded bit (of WIMG) to zero for MuM.

5. Load multiple and load string instructions limit MuM.

Bus Interface Operation

March 27, 2006