
AMD Athlon™ Processor x86 Code Optimization

22007E/0 — November 1999

Minimize Floating-Point-to-Integer Conversions

C++, C, and Fortran define floating-point-to-integer conversions as truncating. This creates a problem because the active rounding mode in an application is typically round-to-nearest-even. The classical way to do a double-to-int conversion therefore works as follows:

Example 1 (Fast):

FLD    QWORD PTR [X]            ;load double to be converted
FSTCW  [SAVE_CW]                ;save current FPU control word
MOVZX  EAX, WORD PTR [SAVE_CW]  ;retrieve control word
OR     EAX, 0C00h               ;rounding control field = truncate
MOV    WORD PTR [NEW_CW], AX    ;new FPU control word
FLDCW  [NEW_CW]                 ;load new FPU control word
FISTP  DWORD PTR [I]            ;do double->int conversion
FLDCW  [SAVE_CW]                ;restore original control word

The AMD Athlon processor contains special acceleration hardware to execute such code as quickly as possible. In most situations, the above code is therefore the fastest way to perform floating-point-to-integer conversion and the conversion is compliant both with programming language standards and the IEEE-754 standard.
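The control-word arithmetic in Example 1 can be sketched in C as plain integer operations. This is only an illustration of the OR with 0C00h, not code from the manual: the function names are hypothetical, and the value 037Fh used in the usage note is the control word that FINIT establishes, assumed here as a typical starting point.

```c
#include <stdint.h>

/* Bits 10-11 of the x87 control word form the rounding control field:
   00 = nearest (even), 01 = down, 10 = up, 11 = truncate. */
#define X87_RC_MASK  0x0C00u
#define X87_RC_SHIFT 10

/* Hypothetical helper: returns a copy of the saved control word with the
   rounding control field forced to 11b (truncate), as OR EAX, 0C00h does. */
uint16_t cw_with_truncation(uint16_t saved_cw)
{
    return (uint16_t)(saved_cw | X87_RC_MASK);
}

/* Hypothetical helper: extracts the 2-bit rounding control field. */
unsigned cw_rounding_field(uint16_t cw)
{
    return (cw & X87_RC_MASK) >> X87_RC_SHIFT;
}
```

For instance, starting from 037Fh (rounding field 00b, round-to-nearest-even), cw_with_truncation yields 0F7Fh, whose rounding field is 11b. All other bits, such as the precision control and exception masks, are left unchanged, which is why restoring SAVE_CW afterwards fully undoes the change.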

According to the recommendations for inlining (see “Always Inline Functions with Fewer than 25 Machine Instructions” on page 72), the above code should not be put into a separate subroutine (e.g., ftol). It should rather be inlined into the main code.

In some code, floating-point numbers are converted to an integer and the result is immediately converted back to floating-point. In such cases, the FRNDINT instruction should be used for maximum performance instead of FISTP in the code above. FRNDINT delivers the integral result directly to an FPU register in floating-point form, which is faster than first using FISTP to store the integer result and then converting it back to floating-point with FILD.

If there are multiple, consecutive floating-point-to-integer conversions, the cost of FLDCW operations should be minimized by saving the current FPU control word, forcing the
