2–28 Internal Architecture
21264/EV68A Hardware Reference Manual
Memory and I/O Address Space Instructions
The Mbox allocates a new MAF entry to an I/O load instruction and increases I/O
bandwidth by attempting to merge I/O load instructions in a merge register. Table 2–7
shows the rules for merging data. The columns represent the load instructions replayed
to the MAF while the rows represent the size of the load in the merge register.
In summary, Table 2–7 shows some of the following rules:
• Byte/word load instructions and different size load instructions are not allowed to
  merge.
• A stream of ascending non-overlapping, but not necessarily consecutive, longword
  load instructions are allowed to merge into naturally aligned 32-byte blocks.
• A stream of ascending non-overlapping, but not necessarily consecutive, quadword
  load instructions are allowed to merge into naturally aligned 64-byte blocks.
• Merging of quadwords can be limited to naturally aligned 32-byte blocks based on
  the Cbox WRITE_ONCE chain 32_BYTE_IO field.
• Issued MB, WMB, and I/O load instructions close the I/O register merge window.
  To minimize latency, the merge window is also closed when a timer detects no I/O
  store instruction activity for 1024 cycles.
After the Mbox I/O register has closed its merge window, the Cbox sends I/O read
requests offchip in the order that they were received from the Mbox.
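
The merging rules for I/O loads can be read as a small predicate. The following Python sketch is only an illustrative model, not a description of the hardware logic: the function name, its parameters, and the choice to track just the address of the most recently merged load are assumptions, and it assumes naturally aligned accesses.

```python
# Illustrative model only; all names here are hypothetical, not from the manual.
SIZE_BYTES = {"byte": 1, "word": 2, "longword": 4, "quadword": 8}
MERGE_BLOCK = {"longword": 32, "quadword": 64}  # byte/word loads never merge


def can_merge(reg_size, reg_last_addr, new_size, new_addr, limit_32_byte_io=False):
    """Return True if a replayed I/O load may join the merge register.

    reg_size / reg_last_addr describe the load already in the merge register;
    new_size / new_addr describe the load being replayed to the MAF.
    """
    if reg_size != new_size or new_size not in MERGE_BLOCK:
        return False  # different sizes, or byte/word: no merge
    block = MERGE_BLOCK[new_size]
    if new_size == "quadword" and limit_32_byte_io:
        block = 32  # 32_BYTE_IO limits quadword merging to 32-byte blocks
    width = SIZE_BYTES[new_size]
    same_block = reg_last_addr // block == new_addr // block
    ascending_no_overlap = new_addr >= reg_last_addr + width
    return same_block and ascending_no_overlap
```

For example, under these assumptions two longword loads at offsets 0x00 and 0x08 may merge (ascending, same 32-byte block, not consecutive), while a longword followed by a quadword may not.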
2.8.3 Memory Address Space Store Instructions
The Mbox begins execution of a store instruction by translating its virtual address to a
physical address using the DTB and by probing the Dcache. The Mbox puts information
about the store instruction, including its physical address, its data, and the results of
the Dcache probe, into the store queue (SQ).
If the Mbox does not find the addressed location in the Dcache, it places the address
into the MAF for processing by the Cbox. If the Mbox finds the addressed location in a
Dcache block that is not dirty, then it places a ChangeToDirty request into the MAF.
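
The two cases above can be summarized as a mapping from the Dcache probe result to the MAF action. This is a minimal sketch with hypothetical names (`ProbeResult`, `maf_request_for_store`, and the `"miss"` label are assumptions for illustration). The passage does not describe a MAF request for a hit on a dirty block, so the sketch returns `None` in that case.

```python
from enum import Enum


class ProbeResult(Enum):
    """Hypothetical encoding of the Dcache probe outcome for a store."""
    MISS = "miss"
    HIT_CLEAN = "hit, block not dirty"
    HIT_DIRTY = "hit, block dirty"


def maf_request_for_store(probe):
    """Return the MAF action for a store, or None if the text names none."""
    if probe is ProbeResult.MISS:
        return "miss"  # assumed label: address goes to the MAF for the Cbox
    if probe is ProbeResult.HIT_CLEAN:
        return "ChangeToDirty"
    return None  # dirty hit: no MAF request is described in this passage
```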
A store instruction can write its data into the Dcache when it is retired, and when the
Dcache block containing its address is dirty and not shared. SQ entries that meet these
two conditions can be placed into the writable state. These SQ entries are placed into
the writable state in program order at a maximum rate of two entries per cycle. The
Mbox transfers writable store queue entry data from the SQ to the Dcache in program
order at a maximum rate of two entries per cycle. Dcache lines associated with writable
store queue entries are locked by the Mbox. System port probe commands cannot evict
these blocks until their associated writable SQ entries have been transferred into the
Dcache. This restriction assists in STx_C instruction and Dcache ECC processing.
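
The in-order, two-per-cycle promotion to the writable state can be modeled roughly as follows. This is a simplified sketch, not the Mbox implementation: `SQEntry`, its fields, and `mark_writable` are hypothetical names, and the list is assumed to hold store queue entries oldest first.

```python
from dataclasses import dataclass

MAX_PER_CYCLE = 2  # maximum entries placed into the writable state per cycle


@dataclass
class SQEntry:
    """Hypothetical store queue entry; fields mirror the two stated conditions."""
    retired: bool
    block_dirty: bool
    block_shared: bool
    writable: bool = False


def mark_writable(sq):
    """Model one cycle: promote up to two of the oldest eligible entries."""
    marked = 0
    for entry in sq:  # sq is ordered oldest first (program order)
        if entry.writable:
            continue
        eligible = entry.retired and entry.block_dirty and not entry.block_shared
        if not eligible or marked == MAX_PER_CYCLE:
            break  # program order: younger entries wait behind this one
        entry.writable = True
        marked += 1
    return marked
```

In this model, an older entry that is not yet eligible blocks every younger entry, which is one way to read "in program order"; the transfer of data from writable entries to the Dcache would follow the same pattern.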
Table 2–7 Rules for I/O Address Space Load Instruction Data Merging
Merge Register/
Replayed Instruction   Load Byte/Word   Load Longword          Load Quadword
Byte/Word              No merge         No merge               No merge
Longword               No merge         Merge up to 32 bytes   No merge
Quadword               No merge         No merge               Merge up to 64 bytes