25112 Rev. 3.06 September 2005

Software Optimization Guide for AMD64 Processors

The following conversion code causes a 64-bit store to feed into a 32-bit load. The load is from the lower 32 bits of the 64-bit store, the one case of size mismatch between a store and a dependent load that is specifically supported by the store-to-load-forwarding hardware of the AMD Athlon 64 and AMD Opteron processors.

Examples

Listing 23. Slow

double x; int i;

i = x;

Listing 24. Fast

#define DOUBLE2INT(i, d) \
{double t = ((d) + 6755399441055744.0); i = *((int *)(&t));}

double x; int i;

DOUBLE2INT(i, x);
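
The listing does not say why the constant works, so the following is a minimal sketch of the same idea under stated assumptions. The constant 6755399441055744.0 equals 2^52 + 2^51. Adding it to a value in the 32-bit integer range yields a double whose unit in the last place is 1, so the rounded integer lands in the low bits of the mantissa. The sketch assumes IEEE 754 doubles, addition carried out in double precision (for example SSE2 rather than x87 extended precision), the default round-to-nearest-even mode, and no value-changing compiler options such as fast-math. The helper name double_to_int_magic is hypothetical and does not come from the guide. It copies the bits out with memcpy instead of a pointer cast, which sidesteps the aliasing concern some compilers raise about the cast in Listing 24.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper (not from the guide): same trick as DOUBLE2INT,
   written with memcpy instead of a pointer cast.
   Assumes IEEE 754 doubles and round-to-nearest-even. */
static int32_t double_to_int_magic(double d)
{
    /* 6755399441055744.0 == 2^52 + 2^51. After the add, the rounded
       integer value of d sits in the low bits of the mantissa. */
    double t = d + 6755399441055744.0;

    uint64_t bits;
    memcpy(&bits, &t, sizeof bits);   /* 64-bit view of the double */

    uint32_t low = (uint32_t)bits;    /* keep the lower 32 bits */

    int32_t result;
    memcpy(&result, &low, sizeof result); /* reinterpret as signed */
    return result;
}
```

Under these assumptions the result is rounded to nearest rather than truncated toward zero, so it can differ from the cast in Listing 23: double_to_int_magic(2.7) gives 3, whereas (int)2.7 gives 2. The trick is a drop-in replacement only where that difference is acceptable and the input is known to fit in a 32-bit integer.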

Chapter 2

C and C++ Source-Level Optimizations

