
AMD Athlon™ Processor x86 Code Optimization

22007E/0 — November 1999

Narrow-to-Wide Store-Buffer Data Forwarding Restriction

If the following conditions are present, there is a narrow-to-wide store-buffer data forwarding restriction:

The operand size of the store data is smaller than the operand size of the load data.

The range of addresses spanned by the store data covers some sub-region of the range of addresses spanned by the load data.

Avoid the type of code shown in the following two examples.

Example 1 (Avoid):

    MOV EAX, 10h
    MOV WORD PTR [EAX], BX          ;word store
    ...
    MOV ECX, DWORD PTR [EAX]        ;doubleword load
                                    ;cannot forward upper
                                    ; byte from store buffer

Example 2 (Avoid):

    MOV EAX, 10h
    MOV BYTE PTR [EAX + 3], BL      ;byte store
    ...
    MOV ECX, DWORD PTR [EAX]        ;doubleword load
                                    ;cannot forward upper byte
                                    ; from store buffer
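The two narrow-to-wide conditions can be read as a simple check on the byte ranges of a store and a later load. The following is only a sketch of that reading, not something taken from the manual: the function name `narrow_to_wide_restricted` is hypothetical, and "covers some sub-region" is interpreted here as "the two address ranges overlap in at least one byte".

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper (not from the manual): returns true when a store of
 * store_size bytes at store_addr, followed by a load of load_size bytes at
 * load_addr, matches both narrow-to-wide conditions listed above.
 */
bool narrow_to_wide_restricted(uint32_t store_addr, unsigned store_size,
                               uint32_t load_addr, unsigned load_size)
{
    /* Half-open byte ranges [start, end); 64-bit ends avoid wrap-around. */
    uint64_t store_end = (uint64_t)store_addr + store_size;
    uint64_t load_end  = (uint64_t)load_addr + load_size;

    /* Condition 1: the store operand is smaller than the load operand. */
    bool store_is_narrower = store_size < load_size;

    /* Condition 2 (assumed reading): the store writes at least one byte
       that the load reads. */
    bool ranges_overlap = store_addr < load_end && load_addr < store_end;

    return store_is_narrower && ranges_overlap;
}
```

Under this reading, Example 1 corresponds to `narrow_to_wide_restricted(0x10, 2, 0x10, 4)` and Example 2 to `narrow_to_wide_restricted(0x13, 1, 0x10, 4)`; both return true. A doubleword store followed by a doubleword load at the same address does not match.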

Wide-to-Narrow Store-Buffer Data Forwarding Restriction

If the following conditions are present, there is a wide-to-narrow store-buffer data forwarding restriction:

The operand size of the store data is greater than the operand size of the load data.

The start address of the store data does not match the start address of the load.

Example 3 (Avoid):

    MOV EAX, 10h
    ADD DWORD PTR [EAX], EBX        ;doubleword store
    MOV CX, WORD PTR [EAX + 2]      ;word load-cannot forward high
                                    ; word from store buffer
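The wide-to-narrow conditions can be sketched the same way, as a check on a store and a later load. This is an illustrative model rather than the manual's own formulation: the function name `wide_to_narrow_restricted` is hypothetical, and the overlap test is an added assumption, on the grounds that forwarding is only in question when the load reads bytes the store wrote.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper (not from the manual): returns true when a store of
 * store_size bytes at store_addr, followed by a load of load_size bytes at
 * load_addr, matches both wide-to-narrow conditions listed above.
 */
bool wide_to_narrow_restricted(uint32_t store_addr, unsigned store_size,
                               uint32_t load_addr, unsigned load_size)
{
    /* Half-open byte ranges [start, end); 64-bit ends avoid wrap-around. */
    uint64_t store_end = (uint64_t)store_addr + store_size;
    uint64_t load_end  = (uint64_t)load_addr + load_size;

    /* Assumption: only loads that read stored bytes are of interest. */
    bool ranges_overlap = store_addr < load_end && load_addr < store_end;

    /* Condition 1: the store operand is larger than the load operand. */
    bool store_is_wider = store_size > load_size;

    /* Condition 2: the store and the load start at different addresses. */
    bool start_differs = store_addr != load_addr;

    return ranges_overlap && store_is_wider && start_differs;
}
```

Example 3 corresponds to `wide_to_narrow_restricted(0x10, 4, 0x12, 2)`, which returns true. A word load from the same start address as the doubleword store, `wide_to_narrow_restricted(0x10, 4, 0x10, 2)`, returns false, because the start addresses match.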

Use example 5 instead of example 4.

Example 4 (Avoid):

    MOVQ [foo], MM1                 ;store upper and lower half
    ...
    ADD EAX, [foo]                  ;fine
    ADD EDX, [foo+4]                ;uh-oh!


Store-to-Load Forwarding Restrictions
