PALcode Restrictions and Guidelines D-9


D.4 Guideline 6: Avoid Consecutive Read-Modify-Write-Read-Modify-Write

Avoid consecutive read-modify-write-read-modify-write sequences to IPRs in the same scoreboard group.

The latency between the first write and the second read is determined by the retire latency of the IPR. For convenience of implementation, the latency between the time when the read is issued and when the final write is issued depends on the run-time contents of the issue queue. It is somewhere between four and nine cycles, even if there is no data dependency between the read and write.
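As an illustration of the pattern this guideline asks PALcode authors to look for, the Python sketch below scans a list of IPR accesses and flags any read, write, read, write run that stays within one scoreboard group. It is a minimal sketch under stated assumptions, not a tool described by the manual: the `IprAccess` record, the `find_rmw_rmw` function, and the group numbers are hypothetical, and only IPR accesses are considered (intervening non-IPR instructions are ignored).

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class IprAccess:
    """One IPR access in program order (hypothetical record for this sketch)."""
    kind: str   # "read" or "write"
    group: int  # scoreboard group of the accessed IPR (numbering is assumed)


def find_rmw_rmw(accesses: List[IprAccess]) -> List[int]:
    """Return the start index of every read, write, read, write run
    in which all four accesses belong to the same scoreboard group."""
    pattern = ("read", "write", "read", "write")
    hits = []
    for i in range(len(accesses) - 3):
        window = accesses[i:i + 4]
        same_group = len({a.group for a in window}) == 1
        kinds = tuple(a.kind for a in window)
        if same_group and kinds == pattern:
            hits.append(i)
    return hits


# Usage: two back-to-back read-modify-write sequences in group 1 are flagged;
# the same shape split across groups 1 and 2 is not.
same = [IprAccess("read", 1), IprAccess("write", 1),
        IprAccess("read", 1), IprAccess("write", 1)]
split = [IprAccess("read", 1), IprAccess("write", 1),
         IprAccess("read", 2), IprAccess("write", 2)]
flagged_same = find_rmw_rmw(same)
flagged_split = find_rmw_rmw(split)
```

A flagged sequence is not an error in itself. The guideline is about avoiding the latency described above, so a check like this would serve as a performance hint rather than a correctness rule.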

D.5 Restriction 7: Replay Trap, Interrupt Code Sequence, and STF/ITOF

On an Mbox replay trap, the 21264/EV68A Ibox guarantees that the refetched load or store instruction that caused the trap is issued before any newer load or store instructions. For load and integer store instructions, this is a consequence of the natural operation of the issue queue. The refetched instruction enters the age-prioritized queue ahead of newer load and store instructions and does not have any dependencies on dirty registers.

Because there is no overhead time for checking these register dependencies (that is, it is known upon enqueueing that there are no dirty registers), the queue will issue the refetched instruction in priority order. For floating-point store instructions, there is normally some overhead associated with checking the floating-point source register dirty status, so the store instruction would normally wait before being issued. This would have the undesired consequence of allowing newer load and store instructions to be issued out of order. A deadlock can occur if issuing the instructions out of order causes the floating-point store instruction to replay trap continually. To avoid the deadlock on a floating-point store instruction replay trap, the source register dirty status is not checked (the source register is assumed to be clean because the store instruction was issued previously).

The hardware mechanism that keeps track of replayed floating-point store instructions, and cancels the dirty register check, requires some software restrictions to guarantee that it is applied to the replayed instruction and not to other floating-point store instructions. The hardware mechanism marks the position in the fetch block (low two bits of the PC) where the replay trap occurred. This action cancels the dirty floating-point source register check of the next valid instruction enqueued to the integer queue (integer, all load and store, and ITOF instructions) that has the same position in the fetch block (normally the replayed STF). If the PC is somehow diverted to a PALcode flow, this hardware might inadvertently cancel the register check of some other STF (or ITOF) instruction. Fortunately, there are only a few reasons why the PC might be diverted during a replay trap: interrupts and ITB fills.
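The marking behavior described above can be modeled in a few lines of Python. This is a minimal sketch, not a description of the actual hardware: it assumes fixed 4-byte instructions and fetch blocks of four instructions, so that the "low two bits of the PC" correspond to the instruction index within the block. The `ReplayMark` class, its method names, and the addresses in the usage example are hypothetical.

```python
from typing import Optional

INSTRUCTION_BYTES = 4      # assumption: fixed 32-bit instructions
SLOTS_PER_FETCH_BLOCK = 4  # assumption: four instructions per fetch block


def fetch_slot(pc: int) -> int:
    """Position (0..3) of the instruction at byte address pc in its fetch block."""
    return (pc // INSTRUCTION_BYTES) % SLOTS_PER_FETCH_BLOCK


class ReplayMark:
    """Toy model of the mark left by a floating-point store replay trap."""

    def __init__(self) -> None:
        self.slot: Optional[int] = None  # marked fetch-block position, if any

    def on_fp_store_replay(self, pc: int) -> None:
        """Record the fetch-block position where the replay trap occurred."""
        self.slot = fetch_slot(pc)

    def enqueue_to_integer_queue(self, pc: int) -> bool:
        """Model a valid instruction entering the integer queue.

        Returns True if this instruction consumes the mark, meaning its
        floating-point source dirty check would be cancelled.
        """
        if self.slot is not None and fetch_slot(pc) == self.slot:
            self.slot = None
            return True
        return False


# Intended case: the replayed STF at 0x1008 is refetched and consumes the mark.
normal = ReplayMark()
normal.on_fp_store_replay(0x1008)
refetch_consumes = normal.enqueue_to_integer_queue(0x1008)

# Diverted case: the PC moves to a flow at 0x8000 before the refetch, and the
# first instruction in the same slot (0x8008) consumes the mark instead.
diverted = ReplayMark()
diverted.on_fp_store_replay(0x1008)
diverted_results = [diverted.enqueue_to_integer_queue(pc)
                    for pc in (0x8000, 0x8004, 0x8008, 0x800C)]
```

In the diverted case the model shows why instruction placement matters: whichever integer-queue instruction first occupies the marked slot absorbs the mark. If that instruction is not an STF or ITOF, the cancelled check has no effect; if it is one, its dirty check is skipped unintentionally.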

The following PALcode example shows that an STF or ITOF instruction, in a given position in a fetch block, must be preceded by a valid instruction that is issued out of the integer queue in the same position in an earlier fetch block. Acceptable instruction classes include load, integer store, and integer operate instructions that do not have R31 as a destination or branch.
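Independently of the manual's PALcode example, the placement rule can be expressed as a small checker. The Python sketch below is only an illustration under stated assumptions: the flow is assumed to start at position 0 of a fetch block, fetch blocks are assumed to hold four instructions, and the class labels (`load`, `int_store`, `int_operate`, `stf`, `itof`, `other`) and the function name `unprotected_fp_instructions` are hypothetical. Integer operates with R31 as a destination, and branches, are expected to be labeled `other` by the caller.

```python
from typing import List

SLOTS_PER_FETCH_BLOCK = 4  # assumption: four instructions per fetch block

# Hypothetical labels for the classes the text lists as acceptable.
ACCEPTABLE = {"load", "int_store", "int_operate"}
# Instructions whose dirty floating-point source check could be cancelled.
FP_CHECKED = {"stf", "itof"}


def unprotected_fp_instructions(flow: List[str]) -> List[int]:
    """Return indices of STF/ITOF instructions that are not preceded by an
    acceptable instruction in the same position of an earlier fetch block.

    flow[0] is assumed to sit at position 0 of a fetch block.
    """
    covered = [False] * SLOTS_PER_FETCH_BLOCK
    problems = []
    for index, klass in enumerate(flow):
        slot = index % SLOTS_PER_FETCH_BLOCK
        if klass in FP_CHECKED and not covered[slot]:
            problems.append(index)
        if klass in ACCEPTABLE:
            # Any later instruction in this slot is in a later fetch block.
            covered[slot] = True
    return problems


# Usage: in the first flow each STF/ITOF has an acceptable instruction in the
# same slot one block earlier; in the second flow neither does.
ok_flow = ["int_operate", "int_operate", "load", "int_store",
           "stf", "other", "itof", "other"]
bad_flow = ["stf", "int_operate", "other", "other",
            "other", "other", "itof", "other"]
ok_result = unprotected_fp_instructions(ok_flow)
bad_result = unprotected_fp_instructions(bad_flow)
```

The checker treats any earlier fetch block in the flow as sufficient. Whether the preceding instruction must sit in the immediately preceding block is not stated in the text, so that choice is an assumption of this sketch.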


21264/EV68A Hardware Reference Manual
