Fast Floating-Point-to-Integer Conversion

Software Optimization Guide for AMD64 Processors

25112 Rev. 3.06 September 2005

2.26 Fast Floating-Point-to-Integer Conversion

Optimization

Use the 3DNow! PF2ID instruction, which performs truncating conversion, to accomplish rapid floating-point-to-integer conversion if the floating-point operand is of type float.

Application

This optimization applies to 32-bit software.

Rationale

Floating-point-to-integer conversion in C programs is typically a very slow operation. The semantics of C and C++ demand that the conversion use truncation. If the floating-point operand is of type float, and the compiler supports 3DNow! code generation, then the 3DNow! PF2ID instruction, which performs truncating conversion, can be utilized by the compiler to accomplish rapid floating-point-to-integer conversion.

Note: The PF2ID instruction does not provide conversion compliant with the IEEE-754 standard. Some operands of type float (IEEE-754 single precision), such as NaNs, infinities, and denormals, are either unsupported or not handled in compliance with the IEEE-754 standard by 3DNow! technology.

For double precision operands, the usual way to accomplish truncating conversion involves the following algorithm:

1. Save the current x87 rounding mode (this is usually round to nearest or even).

2. Set the x87 rounding mode to truncation.

3. Load the floating-point source operand and store the integer result.

4. Restore the original x87 rounding mode.

This algorithm is typically implemented through the C run-time library function ftol. Although the AMD Athlon 64 and AMD Opteron processors have special hardware optimizations that speed up changes of the x87 rounding mode, and therefore ftol, calls to ftol can still be slow.
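The four steps of the truncating algorithm can be sketched with inline assembly. This is only an illustration, not the run-time library's ftol: the function name x87_truncating_convert is hypothetical, the AT&T-syntax inline assembly assumes GCC or Clang on an x86 target, and the fallback branch for other targets is an assumption that simply relies on the C cast.

```c
/* Sketch of the save/set/convert/restore sequence.
   Assumes GCC or Clang inline assembly on an x86 target. */
int x87_truncating_convert(double x)
{
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
    unsigned short saved_cw;   /* original x87 control word */
    unsigned short trunc_cw;   /* control word with rounding set to truncate */
    int result;

    /* 1. Save the current x87 control word, which holds the rounding mode. */
    __asm__ volatile ("fnstcw %0" : "=m" (saved_cw));

    /* 2. Set the rounding-control field (bits 10-11) to 11b, truncation. */
    trunc_cw = (unsigned short)(saved_cw | 0x0C00);
    __asm__ volatile ("fldcw %0" : : "m" (trunc_cw));

    /* 3. Load the double operand and store it as a 32-bit integer. */
    __asm__ volatile ("fldl %1\n\tfistpl %0" : "=m" (result) : "m" (x));

    /* 4. Restore the original rounding mode. */
    __asm__ volatile ("fldcw %0" : : "m" (saved_cw));

    return result;
#else
    /* Hypothetical fallback for other targets: the C cast truncates. */
    return (int)x;
#endif
}
```

The two control-word loads in steps 2 and 4 are the part of this sequence that the rounding-mode hardware optimizations mentioned above are meant to speed up.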

For situations where very fast floating-point-to-integer conversion is required, the conversion code in Listing 24 on page 53 may be helpful. This code uses the current rounding mode instead of truncation when performing the conversion. Therefore, the result may differ by 1 from the ftol result. The replacement code adds the “magic number” 2^52 + 2^51 to the source operand, then stores the double precision result to memory and retrieves the lower doubleword of the stored result. Adding the magic number shifts the original argument to the right inside the double precision mantissa, placing the binary point of the sum immediately to the right of the least-significant mantissa bit. Extracting the lower doubleword of the sum then delivers the integral portion of the original argument.
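The magic-number technique can be sketched in portable C as follows. This is not Listing 24 itself; the function name fast_double_to_int is hypothetical, and the sketch assumes IEEE-754 double precision, the default round-to-nearest mode, and an argument that fits in a signed 32-bit integer.

```c
#include <stdint.h>
#include <string.h>

/* Sketch only: assumes IEEE-754 doubles, round-to-nearest mode, and an
   argument within the signed 32-bit integer range. */
int32_t fast_double_to_int(double x)
{
    const double magic = 6755399441055744.0;   /* 2^52 + 2^51 */
    volatile double sum = x + magic;            /* volatile forces a store to memory */
    double stored = sum;
    uint64_t bits;
    uint32_t low;
    int32_t result;

    /* Reinterpret the stored double as its 64-bit pattern. */
    memcpy(&bits, &stored, sizeof bits);

    /* The lower doubleword holds the integer in two's-complement form. */
    low = (uint32_t)(bits & 0xFFFFFFFFu);
    memcpy(&result, &low, sizeof result);
    return result;
}
```

Under these assumptions, fast_double_to_int(2.7) yields 3 where a truncating conversion yields 2, which is the difference of 1 noted above.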

C and C++ Source-Level Optimizations

Chapter 2
