22007E/0 — November 1999

AMD Athlon™ Processor x86 Code Optimization

Store-to-Load Forwarding Restrictions

Store-to-load forwarding refers to the process of a load reading (forwarding) data from the store buffer (LS2). There are instances in the AMD Athlon processor load/store architecture when either a load operation is not allowed to read needed data from a store in the store buffer, or a load OP detects a false data dependency on a store in the store buffer.

In either case, the load cannot complete (load the needed data into a register) until the store has retired out of the store buffer and written to the data cache. A store-buffer entry cannot retire and write to the data cache until every instruction before the store has completed and retired from the reorder buffer.

The implication of this restriction is that all instructions in the reorder buffer, up to and including the store, must complete and retire out of the reorder buffer before the load can complete. Effectively, the load has a false dependency on every instruction up to the store.

The following sections describe store-to-load forwarding examples that are acceptable and those that should be avoided.

Store-to-Load Forwarding Pitfalls: True Dependencies

A load is allowed to read data from the store-buffer entry only if all of the following conditions are satisfied:

The start address of the load matches the start address of the store.

The load operand size is equal to or smaller than the store operand size.

Neither the load nor the store is misaligned.

The store data is not from a high-byte register (AH, BH, CH, or DH).
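The four conditions above can be read as a single predicate. The following C sketch is an illustrative model only, not a description of the hardware logic: the `mem_op` structure, its field names, and the `can_forward` function are hypothetical, and "misaligned" is assumed here to mean "not naturally aligned to the operand size".

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of one memory operation; names are illustrative. */
typedef struct {
    uint32_t addr;           /* start address of the access */
    unsigned size;           /* operand size in bytes (1, 2, 4, or 8) */
    bool     from_high_byte; /* stores only: data comes from AH, BH, CH, or DH */
} mem_op;

/* Assumption: an access is misaligned when its start address is not a
   multiple of its operand size. */
static bool is_misaligned(const mem_op *op)
{
    assert(op->size > 0);
    return (op->addr % op->size) != 0;
}

/* Returns true only if all four forwarding conditions hold. */
static bool can_forward(const mem_op *store, const mem_op *load)
{
    if (load->addr != store->addr)
        return false;   /* start addresses must match */
    if (load->size > store->size)
        return false;   /* load must not be wider than the store */
    if (is_misaligned(load) || is_misaligned(store))
        return false;   /* neither access may be misaligned */
    if (store->from_high_byte)
        return false;   /* store data must not come from AH, BH, CH, or DH */
    return true;
}
```

Under this model, a word load from the start address of an aligned doubleword store can forward, while a doubleword load from the start address of a word store cannot, because the load is wider than the store.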

The following sections describe common scenarios to avoid, in which a load has a true dependency on an LS2-buffered store but cannot read (forward) data from the store-buffer entry.
