AMD x86 manual Accelerating Floating-Point Divides and Square Roots

AMD Athlon™ Processor x86 Code Optimization

22007E/0 — November 1999

necessary for the currently selected precision. This means that setting precision control to single precision (versus the Win32 default of double precision) lowers the latency of those operations.
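As a rough illustration of what "precision control" refers to, the sketch below decodes the precision control (PC) field, bits 8 and 9, of a raw x87 FPU control word such as the value stored by FNSTCW. The helper name pc_significand_bits and the macro names are hypothetical and are not part of any AMD or Microsoft API. Note also that the raw hardware layout decoded here differs from the abstract flag values that _controlfp() returns.

```c
/* Bits 8-9 of the raw x87 FPU control word hold the precision control field. */
#define X87_PC_SHIFT 8
#define X87_PC_MASK  0x3u

/* Hypothetical helper: return the significand width, in bits, selected by
   the PC field of a raw x87 control word, or 0 for the reserved encoding. */
static unsigned pc_significand_bits(unsigned control_word)
{
    switch ((control_word >> X87_PC_SHIFT) & X87_PC_MASK) {
    case 0:  return 24;  /* 00b: single precision */
    case 2:  return 53;  /* 10b: double precision */
    case 3:  return 64;  /* 11b: double extended precision */
    default: return 0;   /* 01b: reserved */
    }
}
```

For instance, a control word of 0x027F selects a 53-bit significand (double precision), while 0x037F, the value loaded by FINIT, selects the 64-bit double extended format.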

The Microsoft® Visual C environment provides functions to manipulate the FPU control word and thus the precision control. These functions are not very fast, so insert changes of precision control where they create little overhead, such as outside a computation-intensive loop. Otherwise, the overhead of the function calls outweighs the benefit of the reduced divide and square root latencies.

The following example shows how to set the precision control to single precision and later restore the original settings in the Microsoft Visual C environment.

Example:

/* prototype for _controlfp() function */
#include <float.h>

unsigned int orig_cw;

/* Get current FPU control word and save it */
orig_cw = _controlfp (0, 0);

/* Set precision control in FPU control word to single precision.
   This reduces the latency of divide and square root operations. */
_controlfp (_PC_24, _MCW_PC);

/* restore original FPU control word */
_controlfp (orig_cw, 0xfffff);
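
The advice about keeping precision control changes outside computation-intensive loops can be sketched as follows. This is a minimal illustration, not code from the manual: the helper names pc_enter_single, pc_restore, and scale_all are hypothetical, and the sketch assumes that on anything other than a 32-bit x86 Microsoft Visual C build the helpers are simply no-op stand-ins so the code still compiles.

```c
#include <stddef.h>

#if defined(_MSC_VER) && defined(_M_IX86)
#include <float.h>
/* 32-bit x86 build under Microsoft Visual C: really switch precision. */
static unsigned int pc_enter_single(void)
{
    unsigned int saved = _controlfp(0, 0);  /* read current control word */
    _controlfp(_PC_24, _MCW_PC);            /* select single precision */
    return saved;
}
static void pc_restore(unsigned int saved)
{
    _controlfp(saved, _MCW_PC);             /* restore only the PC field */
}
#else
/* Assumption: elsewhere these are no-op stand-ins for illustration. */
static unsigned int pc_enter_single(void) { return 0; }
static void pc_restore(unsigned int saved) { (void)saved; }
#endif

/* Hypothetical workload: divide every element by a common divisor. */
static void scale_all(float *v, size_t n, float divisor)
{
    size_t i;
    unsigned int saved = pc_enter_single();  /* once, before the loop */
    for (i = 0; i < n; ++i)
        v[i] = v[i] / divisor;               /* divides run at reduced precision */
    pc_restore(saved);                       /* once, after the loop */
}
```

The two control word calls are paid once per call to scale_all rather than once per element, so their cost is amortized over all n divides. Calling the helpers inside the loop body would add that overhead to every iteration.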
