•The computed virtual address must specify a location within the naturally aligned
•The resultant physical address must specify a location within the naturally aligned
If those addressing constraints are not met, it is UNPREDICTABLE whether the STx_C instruction succeeds or fails, regardless of the state of the lock_flag, unless the lock_flag is cleared as described in the next paragraph.
Whether or not the addressing constraints are met, a zero is returned and no write to memory occurs if the lock_flag was cleared by execution on a processor of a CALL_PAL REI, CALL_PAL rti, CALL_PAL rfe, or STx_C, after the most recent execution on that processor of a LDx_L instruction (in processor issue sequence).
In all cases, the lock_flag is set to zero at the end of the operation.
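The lock_flag rules above can be sketched as a small single-processor model in C. This is an illustration under stated assumptions, not the architecture's definition: the names cpu_model, ldq_l, stq_c, and clear_lock are hypothetical, the addressing constraints are assumed to be met, and stores by other processors are not modeled.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of one processor's lock state (not from the manual). */
typedef struct {
    bool lock_flag; /* set by LDx_L; cleared by STx_C and by lock-clearing events */
} cpu_model;

/* LDQ_L: load the quadword and set the lock_flag. */
static uint64_t ldq_l(cpu_model *cpu, const uint64_t *addr)
{
    cpu->lock_flag = true;
    return *addr;
}

/* Models an event that clears the lock_flag, such as CALL_PAL REI. */
static void clear_lock(cpu_model *cpu)
{
    cpu->lock_flag = false;
}

/* STQ_C: store only if the lock_flag is still set.
 * Returns 1 on success, 0 on failure (no write to memory occurs).
 * The lock_flag is zero afterward in all cases. */
static uint64_t stq_c(cpu_model *cpu, uint64_t *addr, uint64_t value)
{
    uint64_t ok = cpu->lock_flag ? 1 : 0;
    if (ok) {
        *addr = value;
    }
    cpu->lock_flag = false;
    return ok;
}
```

In this model, a second stq_c without an intervening ldq_l always returns zero, because the first stq_c left the lock_flag cleared.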
Notes:
•Software will not emulate unaligned STx_C instructions.
•Each implementation must do the test and store atomically, as illustrated in the following two examples. (See Section 5.6.1 for complete information.)
–If two processors attempt STx_C instructions to the same lock range and that lock range was accessed by both processors’ preceding LDx_L instructions, exactly one of the stores succeeds.
–A processor executes a LDx_L/STx_C sequence and includes an MB between the LDx_L to a particular address and the successful STx_C to a different address (one that meets the constraints required for predictable behavior). That instruction sequence establishes an access order under which a store operation by another processor to that lock range occurs before the LDx_L or after the STx_C.
•If the virtual and physical addresses for a LDx_L and STx_C sequence are not within the same naturally aligned
•The following sequence should not be used:
try_again: LDQ_L   R1, x
           <modify R1>
           STQ_C   R1, x
           BEQ     R1, try_again
That sequence penalizes performance when the STQ_C succeeds, because the sequence contains a backward branch, which is predicted to be taken in the Alpha architecture. In the case where the STQ_C succeeds and the branch will actually fall through, that sequence incurs unnecessary delay due to a mispredicted backward branch. Instead, a forward branch should be used to handle the failure case, as shown in Section 5.5.2.
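The control-flow shape recommended above can be mirrored in C with goto, so that success falls through and failure takes a forward branch to an out-of-line retry path. This is only a sketch against a hypothetical simulated processor: cpu_model, interrupts_left, MAX_TRIES, and locked_increment are assumptions made for illustration, and a C compiler is free to lay out the branches differently from the source order.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical single-processor model (not from the manual). */
typedef struct {
    bool lock_flag;       /* set by LDQ_L, cleared by STQ_C */
    int  interrupts_left; /* simulated events that clear lock_flag before a store */
} cpu_model;

/* LDQ_L: load the quadword and set the lock_flag. */
static uint64_t ldq_l(cpu_model *cpu, const uint64_t *addr)
{
    cpu->lock_flag = true;
    return *addr;
}

/* STQ_C: returns 1 and stores if lock_flag is set, else returns 0. */
static uint64_t stq_c(cpu_model *cpu, uint64_t *addr, uint64_t value)
{
    if (cpu->interrupts_left > 0) { /* simulate, e.g., a CALL_PAL REI */
        cpu->interrupts_left--;
        cpu->lock_flag = false;
    }
    uint64_t ok = cpu->lock_flag ? 1 : 0;
    if (ok) {
        *addr = value;
    }
    cpu->lock_flag = false; /* cleared in all cases */
    return ok;
}

enum { MAX_TRIES = 8 }; /* hypothetical retry bound */

/* Increment *x; returns 1 on success, 0 if MAX_TRIES attempts all failed. */
static int locked_increment(cpu_model *cpu, uint64_t *x)
{
    int tries = 0;
    uint64_t r1;
try_again:
    r1 = ldq_l(cpu, x);          /* LDQ_L R1, x    */
    r1 = r1 + 1;                 /* <modify R1>    */
    r1 = stq_c(cpu, x, r1);      /* STQ_C R1, x    */
    if (r1 == 0) goto no_store;  /* forward branch, taken only on failure */
    return 1;                    /* success falls through */
no_store:
    if (++tries >= MAX_TRIES) return 0; /* give up after too many failures */
    goto try_again;              /* backward branch only on the failure path */
}
```

The only backward branch sits on the failure path, so the expected successful case executes no taken backward branch.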
Instruction Descriptions