25112 Rev. 3.06 September 2005

Software Optimization Guide for AMD64 Processors

4-Kbyte pages away from the load address (address bits 47–12 do not match). Avoid the type of code shown in the following example:

mov eax, 10h
mov [eax], bx      ; Word store to address 10h
mov cx, [eax+2]    ; Word load from address 12h
                   ; Load detects a false dependency
                   ;  on store because it is in the
                   ;  same doubleword of memory.
mov cx, [eax+4]    ; Word load from address 14h
                   ; Load does not detect a false
                   ;  dependency because it is to a
                   ;  different doubleword of memory.
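The rule illustrated by the listing above can be written as a small predicate. The following C code is only a sketch of that rule, not a description of the processor's actual comparison logic. The function names are hypothetical, and the model assumes that neither access straddles a doubleword boundary.

```c
#include <stdbool.h>
#include <stdint.h>

/* Simplified model (an assumption): two accesses fall in the same
 * doubleword when their addresses agree in every bit above the low two. */
static bool same_doubleword(uint64_t a, uint64_t b)
{
    return (a >> 2) == (b >> 2);
}

/* True when a byte or word load would, per the rule in the text, detect a
 * false dependency on an earlier byte or word store: the two accesses share
 * a doubleword but do not start at the same address. */
static bool false_dependency(uint64_t store_addr, uint64_t load_addr)
{
    return same_doubleword(store_addr, load_addr) && store_addr != load_addr;
}
```

With the addresses from the listing, false_dependency(0x10, 0x12) is true and false_dependency(0x10, 0x14) is false.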

Here is another example of the type of code to avoid:

mov eax, 10h
mov [eax], bl      ; First store to DWORD at address 10h
mov [eax+1], cl    ; Second store to DWORD at address 10h
mov dl, [eax]      ; Load detects a false
                   ;  dependency on the second store
                   ;  because it is the most recent
                   ;  store to the same doubleword of
                   ;  memory as the load.
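The "most recent store" behavior in this second example can be modeled as a scan over pending stores from youngest to oldest. The C code below is a hedged sketch under that assumption. The function name matched_store and the array representation of pending stores are hypothetical and are not taken from the guide.

```c
#include <stddef.h>
#include <stdint.h>

/* Returns the index of the store that a load at load_addr would be matched
 * against, or -1 if no pending store touches the load's doubleword.
 * stores[] holds store start addresses in program order (oldest first).
 * Simplified model: matching looks only at the aligned doubleword. */
static int matched_store(const uint64_t *stores, size_t n, uint64_t load_addr)
{
    for (size_t i = n; i-- > 0; ) {   /* scan youngest to oldest */
        if ((stores[i] >> 2) == (load_addr >> 2))
            return (int)i;
    }
    return -1;
}
```

For the listing above, the pending stores start at 10h and 11h. A load from 10h is matched against index 1, the store to 11h, even though the byte it needs was written by the store at index 0.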

Summary of Store-to-Load-Forwarding Pitfalls to Avoid

To avoid store-to-load-forwarding pitfalls, follow these guidelines:

Maintain consistent use of operand size across all loads and stores. Preferably use doubleword or quadword operand sizes.

Avoid misaligned data references.

Avoid narrow-to-wide and wide-to-narrow forwarding cases.

When using word or byte stores, avoid loading data from anywhere in the same doubleword of memory other than the identical start addresses of the stores.
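One way to follow these guidelines in C is to keep small fields packed in a single aligned doubleword and to read and write that doubleword as a whole. The sketch below assumes a hypothetical point16 record. The compiler decides which instructions are actually emitted, so it is worth inspecting the generated assembly.

```c
#include <stdint.h>

/* Hypothetical record: two 16-bit fields packed into one aligned doubleword. */
typedef struct {
    uint32_t packed;   /* low 16 bits: x, high 16 bits: y */
} point16;

static void point16_set(point16 *p, uint16_t x, uint16_t y)
{
    /* Combine the fields in a register, then issue one doubleword store. */
    p->packed = (uint32_t)x | ((uint32_t)y << 16);
}

static uint16_t point16_y(const point16 *p)
{
    /* One doubleword load from the same start address as the store;
     * the field is extracted in a register rather than by a narrow load. */
    return (uint16_t)(p->packed >> 16);
}
```

Because the store and the load use the same operand size and the same start address, this pattern avoids the word and byte cases shown in the examples above.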

Application

This optimization applies to:

32-bit software

64-bit software

Chapter 5

Cache and Memory Optimizations
