
1. When the Mbox requests a Dcache fill, the Cbox uses the CTAG array entry to find if the Dcache already contains the requested physical address in another virtually-indexed Dcache line. If it does, the Cbox invalidates that cache line after first writing the data back to the Bcache if it was in the modified state. The Cbox also checks to see if the Dcache contains an address that is different from the requested address but maps to the same Bcache line. If it does, the Dcache line is evicted in order to keep the Dcache a subset of the Bcache.

2. When the Ibox requests an Icache fill, the Cbox uses the CTAG array entries to find if the Dcache contains the requested physical address in the modified state. If it does, the Cbox forces the line to be written back to the Bcache before servicing the Icache fill request. The Cbox also checks to see if the Dcache contains an address different from the requested address but which maps to the same Bcache line. In this case the Istream request will miss the Bcache, and the Cbox will service the request by launching a noncached Fetch command to the system port and will not put the Istream block into the Bcache. This mechanism allows the 21264/EV68A to use a cache resident lock flag for LDx_L/STx_C instructions.

3. The Cbox uses the CTAG array entries to find whether probe addresses are held in the Dcache without interrupting load/store instruction processing in the processor core.
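The Dcache fill checks in item 1 can be sketched as a small software model. This is only an illustration under stated assumptions, not a description of the Cbox hardware: the names `DcacheLine`, `bcache_index`, and `dcache_fill` are hypothetical, the Bcache is assumed to be direct mapped, and `LINE_BYTES` and `BCACHE_LINES` are arbitrary values chosen to keep the example small. Normal victim handling for the slot being filled is omitted.

```python
from dataclasses import dataclass

LINE_BYTES = 64        # assumed block size for this sketch
BCACHE_LINES = 1024    # assumed (tiny) direct-mapped Bcache, for illustration


@dataclass
class DcacheLine:
    phys_addr: int     # block-aligned physical address
    modified: bool = False


def bcache_index(phys_addr: int) -> int:
    """Bcache line selected by a physical address (direct mapped)."""
    return (phys_addr // LINE_BYTES) % BCACHE_LINES


def dcache_fill(dcache: dict, slot: int, phys_addr: int) -> list:
    """Fill phys_addr into Dcache slot `slot`; return the actions taken."""
    actions = []
    for other, line in list(dcache.items()):
        same_block = line.phys_addr == phys_addr
        conflicts = (not same_block and
                     bcache_index(line.phys_addr) == bcache_index(phys_addr))
        if same_block and other != slot:
            # Same physical block held under another virtual index:
            # write back first if modified, then invalidate.
            if line.modified:
                actions.append(("writeback", other))
            actions.append(("invalidate", other))
            del dcache[other]
        elif conflicts:
            # Different block that maps to the same Bcache line: evict it
            # so the Dcache remains a subset of the Bcache.
            actions.append(("evict", other))
            del dcache[other]
    dcache[slot] = DcacheLine(phys_addr)
    actions.append(("fill", slot))
    return actions


# Example: slot 0 aliases the requested block, slot 1 conflicts in the Bcache.
dcache = {
    0: DcacheLine(0x1000, modified=True),
    1: DcacheLine(0x1000 + LINE_BYTES * BCACHE_LINES),
    2: DcacheLine(0x2000),
}
actions = dcache_fill(dcache, 5, 0x1000)
```

In this example the aliased line in slot 0 is written back and invalidated, the conflicting line in slot 1 is evicted, and the unrelated line in slot 2 is left alone.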

4.6 Lock Mechanism

The 21264/EV68A does not contain a dedicated lock register, nor are system components required to contain one.

When a load-lock (LDx_L) instruction executes, data is accessed from the Dcache or Bcache. If there is a cache miss, data is accessed from memory with a RdBlk command. Its associated cache line is filled into the Dcache in the clean state, if it is not already there.

When the store-conditional (STx_C) instruction executes, it is allowed to succeed if its associated cache line is still present in the Dcache and can be made writable; otherwise, it fails.

This algorithm is successful because another agent in the system writing to the cache line between the load-lock and the store-conditional would make the cache line invalid. This mechanism’s coherence is based on the following four items:

1. LDx_L instructions are processed in-order in relation to the associated STx_C.

2. Once a block is locked by way of an LDx_L instruction, no internal agent can evict the block from the Dcache as a side-effect of its processing.

3. Any external agent that intends to update the contents of the stored block must use an invalidating probe command to inform the 21264/EV68A.

4. The system is the only agent with sufficient information to manage the tasks of fairness and liveness. However, to enable these tasks, the 21264/EV68A only generates external commands for nonspeculative STx_C instructions, and once given a success indication from the system, must faithfully update the Dcache with the STx_C value.

The system is entirely responsible for item number three. The 21264/EV68A plays an active role in items one, two, and four.
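The interaction between LDx_L, STx_C, and an invalidating probe can be illustrated with a toy model. This is a sketch under loud assumptions, not the processor's implementation: `LockModel`, `external_write`, and their methods are hypothetical names, cache-line residency stands in for the lock flag, the "can be made writable" check is assumed to always pass, and stores are written through to memory for simplicity.

```python
class LockModel:
    """Toy model of one processor's Dcache for LDx_L/STx_C (hypothetical)."""

    def __init__(self, memory: dict):
        self.memory = memory   # shared backing store: address -> value
        self.dcache = {}       # address -> value; residency acts as the lock flag

    def load_locked(self, addr: int) -> int:
        # Fill the line (clean) if it is not already in the Dcache.
        if addr not in self.dcache:
            self.dcache[addr] = self.memory[addr]
        return self.dcache[addr]

    def store_conditional(self, addr: int, value: int) -> bool:
        # Fails if the line is no longer present in the Dcache.
        if addr not in self.dcache:
            return False
        self.dcache[addr] = value
        self.memory[addr] = value   # write-through: a simplification
        return True

    def invalidating_probe(self, addr: int) -> None:
        # Issued by the system on behalf of an external writer (item 3).
        self.dcache.pop(addr, None)


def external_write(memory: dict, cpus: list, addr: int, value: int) -> None:
    """Another agent updates the block, probing every cache first."""
    for cpu in cpus:
        cpu.invalidating_probe(addr)
    memory[addr] = value


memory = {0x40: 7}
cpu = LockModel(memory)

old = cpu.load_locked(0x40)
external_write(memory, [cpu], 0x40, 100)       # intervening write
first = cpu.store_conditional(0x40, old + 1)   # line was invalidated: fails

old = cpu.load_locked(0x40)                    # software retries the sequence
second = cpu.store_conditional(0x40, old + 1)  # no interference: succeeds
```

The first store-conditional fails because the probe removed the line between the load-lock and the store-conditional. The retry succeeds and leaves the incremented value of the external write in memory.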

4–14 Cache and External Interfaces

21264/EV68A Hardware Reference Manual
