AMD Athlon™ Processor x86 Code Optimization

22007E/0 — November 1999

One Supported Store-to-Load Forwarding Case

There is one case of mismatched store-to-load forwarding that is supported by the AMD Athlon processor: the lower 32 bits from an aligned QWORD write may feed into a DWORD read.

Example 8 (Allowed):

MOVQ [AlignedQword], mm0
...
MOV  EAX, [AlignedQword]

Summary of Store-to-Load Forwarding Pitfalls to Avoid

To avoid store-to-load forwarding pitfalls, code should conform to the following guidelines:

Maintain consistent use of operand size across all loads and stores. Preferably, use doubleword or quadword operand sizes.

Avoid misaligned data references.

Avoid narrow-to-wide and wide-to-narrow forwarding cases.

When using word or byte stores, avoid loading data from anywhere in the same doubleword of memory other than the identical start addresses of the stores.
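
The guidelines above can be illustrated with a small C sketch. The names dword_view, pack_narrow_stores, and pack_single_store are hypothetical, a little-endian target such as x86 is assumed, and whether a compiler emits the memory accesses exactly as written is also an assumption, since an optimizing compiler may keep the value in a register. The first function writes two 16-bit halves and then reads the containing doubleword, which is a narrow-to-wide case. The second combines the halves in a register first and uses a single doubleword store matched by a doubleword load.

```c
#include <stdint.h>

/* Hypothetical helper type: one doubleword viewed as two words. */
typedef union {
    uint16_t half[2];
    uint32_t whole;
} dword_view;

/* Narrow-to-wide pattern: two word stores followed by a doubleword
   load of the same memory. The guidelines advise against this shape. */
uint32_t pack_narrow_stores(uint16_t lo, uint16_t hi)
{
    dword_view v;
    v.half[0] = lo;   /* 16-bit store */
    v.half[1] = hi;   /* 16-bit store */
    return v.whole;   /* 32-bit load spanning both stores */
}

/* Consistent operand size: combine in a register, then use one
   doubleword store followed by a doubleword load. */
uint32_t pack_single_store(uint16_t lo, uint16_t hi)
{
    dword_view v;
    v.whole = (uint32_t)lo | ((uint32_t)hi << 16);   /* 32-bit store */
    return v.whole;                                  /* 32-bit load  */
}
```

Both functions return the same value on a little-endian machine; only the sizes of the stores that precede the load differ.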

Stack Alignment Considerations

 

Make sure the stack is suitably aligned for the local variable with the largest base type. Then, using the technique described in “C Language Structure Component Considerations” on page 55, all variables can be properly aligned with no padding.

Extend to 32 Bits Before Pushing onto Stack

Function arguments smaller than 32 bits should be extended to 32 bits before being pushed onto the stack, which ensures that the stack is always doubleword aligned on entry to a function.

If a function has no local variables with a base type larger than doubleword, no further work is necessary. If the function does have local variables whose base type is larger than a doubleword, additional code should be inserted to ensure proper alignment of the stack. For example, the following code achieves quadword alignment:
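
One common way to obtain quadword alignment is to clear the low three bits of the stack pointer after space for the locals has been reserved, for example with an instruction such as AND ESP, -8. The arithmetic behind that masking step can be sketched in C. The helper names align_down and is_aligned are hypothetical, and the sketch models only the address calculation, not an actual function prolog.

```c
#include <stdint.h>

/* Round an address down to a multiple of `alignment`, which must be a
   power of two. The x86 stack grows downward, so rounding down only
   enlarges the current frame and never overlaps the caller's data. */
uintptr_t align_down(uintptr_t addr, uintptr_t alignment)
{
    return addr & ~(alignment - 1);
}

/* Return nonzero if `addr` is a multiple of `alignment` (a power of two). */
int is_aligned(uintptr_t addr, uintptr_t alignment)
{
    return (addr & (alignment - 1)) == 0;
}
```

For instance, a doubleword-aligned address such as 0x1004 is rounded down to 0x1000, while 0x1008 is already quadword aligned and is left unchanged.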
