Software Optimization Guide for AMD64 Processors

25112 Rev. 3.06 September 2005

   FIXED_U_16_16 z;

   z.whole = x.whole + y.whole;
   return(z);
}

__inline unsigned int fixed_int(FIXED_U_16_16 x) {
   return (x.whole >> 16);
}

...

FIXED_U_16_16 y, z;
unsigned int q;

...

label1:
   y = fixed_add (y, z);
   q = fixed_int (y);
label2:

...

The object code generated for the source code between label1 and label2 typically looks like this:

mov edx, DWORD PTR [z]
mov eax, DWORD PTR [y]
add eax, edx
mov DWORD PTR [y], eax   ; -+
mov eax, DWORD PTR [y]   ; <+ Aligned (size/address match)--forwarding in LSU
shr eax, 16
mov DWORD PTR [q], eax
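For readers who want to compile the fragment, the following is a minimal sketch under stated assumptions. The definition of FIXED_U_16_16 does not appear in this excerpt, so the union layout below (a 32-bit `whole` member overlaid on two 16-bit halves) is an assumption inferred from the `.whole` accesses. The helper `fixed_from_uint` is hypothetical and exists only for illustration, and `static inline` stands in for the compiler-specific `__inline` keyword.

```c
#include <assert.h>
#include <stdint.h>

/* ASSUMED layout: the excerpt does not show the type definition. */
typedef union {
    uint32_t whole;        /* full 16.16 fixed-point value */
    struct {
        uint16_t frac;     /* low 16 bits (little-endian x86) */
        uint16_t intg;     /* high 16 bits */
    } parts;
} FIXED_U_16_16;

/* Hypothetical helper, not in the source: build a 16.16 value
   from an unsigned integer that fits in 16 bits. */
static inline FIXED_U_16_16 fixed_from_uint(unsigned int n)
{
    FIXED_U_16_16 r;
    assert(n <= 0xFFFFu);
    r.whole = (uint32_t)n << 16;
    return r;
}

/* Adds two values through the 32-bit member, so the result is
   written with a single 32-bit store. */
static inline FIXED_U_16_16 fixed_add(FIXED_U_16_16 x, FIXED_U_16_16 y)
{
    FIXED_U_16_16 z;
    z.whole = x.whole + y.whole;
    return z;
}

/* Extracts the integer part with a 32-bit load followed by a shift,
   so the load matches the size and address of the preceding store. */
static inline unsigned int fixed_int(FIXED_U_16_16 x)
{
    return x.whole >> 16;
}
```

With these definitions, adding 1.5 (`whole` = 0x00018000) to 3.0 and passing the sum to `fixed_int` yields 4. Reading the integer part as `x.whole >> 16` keeps the load the same width as the store made by `fixed_add`, which is the size and address match noted in the listing comment.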

 


C and C++ Source-Level Optimizations

Chapter 2
