The computed virtual address must specify a location within the naturally aligned 16-byte block in virtual memory accessed by the preceding LDx_L instruction.

The resultant physical address must specify a location within the naturally aligned 16-byte block in physical memory accessed by the preceding LDx_L instruction.

If those addressing constraints are not met, it is UNPREDICTABLE whether the STx_C instruction succeeds or fails, regardless of the state of the lock_flag, unless the lock_flag is cleared as described in the next paragraph.

Whether or not the addressing constraints are met, a zero is returned and no write to memory occurs if the lock_flag was cleared by execution on a processor of a CALL_PAL REI, CALL_PAL rti, CALL_PAL rfe, or STx_C, after the most recent execution on that processor of a LDx_L instruction (in processor issue sequence).

In all cases, the lock_flag is set to zero at the end of the operation.

Notes:

Software will not emulate unaligned STx_C instructions.

Each implementation must do the test and store atomically, as illustrated in the following two examples. (See Section 5.6.1 for complete information.)

If two processors attempt STx_C instructions to the same lock range and that lock range was accessed by both processors’ preceding LDx_L instructions, exactly one of the stores succeeds.

A processor executes a LDx_L/STx_C sequence and includes an MB between the LDx_L to a particular address and the successful STx_C to a different address (one that meets the constraints required for predictable behavior). That instruction sequence establishes an access order under which a store operation by another processor to that lock range occurs before the LDx_L or after the STx_C.
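
As an illustration of the second example, the following sketch places an MB between a LDQ_L and a STQ_C that targets a different quadword in the same lock range. It assumes, for illustration only, that x is 16-byte aligned, so that x and x+8 lie in the same naturally aligned 16-byte block. The register choices and the <compute R2> placeholder are likewise hypothetical.

```asm
        LDQ_L   R1, x           ; load-locked from x
        MB                      ; memory barrier between the load and the store
        <compute R2>            ; placeholder: build the value to be stored
        STQ_C   R2, x+8         ; store-conditional to a different address in the
                                ; same aligned 16-byte block (assumes x is 16-byte aligned)
                                ; R2 receives 1 on success, 0 on failure
```

If the STQ_C succeeds, a store by another processor to that lock range is ordered either before the LDQ_L or after the STQ_C, as described above.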

If the virtual and physical addresses for a LDx_L and STx_C sequence are not within the same naturally aligned 16-byte sections of virtual and physical memory, that sequence may always fail, or may succeed despite another processor’s store to the lock range; hence, no useful program should do this.

The following sequence should not be used:

try_again: LDQ_L   R1, x
           <modify R1>
           STQ_C   R1, x
           BEQ     R1, try_again

That sequence penalizes performance when the STQ_C succeeds, because the sequence contains a backward branch, which is predicted to be taken in the Alpha architecture. In the case where the STQ_C succeeds and the branch will actually fall through, that sequence incurs unnecessary delay due to a mispredicted backward branch. Instead, a forward branch should be used to handle the failure case, as shown in Section 5.5.2.
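
A minimal sketch of the forward-branch form follows, along the lines of the sequence that Section 5.5.2 describes. The label no_store and the out-of-line placement of the retry code are illustrative. The placeholders in angle brackets stand for code that the sequence does not specify.

```asm
try_again:
        LDQ_L   R1, x
        <modify R1>
        STQ_C   R1, x
        BEQ     R1, no_store    ; forward branch, taken only when the store fails
        :                       ; success path falls through
        :
no_store:
        <code to check for excessive iterations>
        BR      try_again       ; retry from the load-locked
```

In this form the common success case falls through the BEQ, and only the failure case pays for a taken branch.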

