AMD x86 manual Floating-Point Optimizations, Ensure All FPU Data is Aligned



22007E/0 — November 1999

AMD Athlon™ Processor x86 Code Optimization

9 Floating-Point Optimizations

This chapter details the methods used to optimize floating-point code to the pipelined floating-point unit (FPU). Guidelines are listed in order of importance.

Ensure All FPU Data is Aligned

As discussed in “Memory Size and Alignment Issues” on page 45, floating-point data should be naturally aligned. That is, words should be aligned on word boundaries, doublewords on doubleword boundaries, and quadwords on quadword boundaries. Misaligned memory accesses reduce the available memory bandwidth.
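As a rough illustration of natural alignment, the following C11 sketch checks whether an address falls on a given boundary and forces a structure's floating-point members onto their natural boundaries. The names `is_aligned` and `struct sample` are hypothetical and not part of the manual; the explicit `alignas` is there on the assumption that some 32-bit x86 ABIs may otherwise place a `double` inside a structure on only a 4-byte boundary.

```c
#include <assert.h>    /* static_assert (C11) */
#include <stdalign.h>  /* alignas (C11) */
#include <stddef.h>    /* size_t, offsetof */
#include <stdint.h>    /* uintptr_t */

/* Returns 1 if p lies on a multiple of `boundary` bytes, 0 otherwise. */
static int is_aligned(const void *p, size_t boundary)
{
    return ((uintptr_t)p % boundary) == 0;
}

/* Hypothetical record used only for illustration. */
struct sample {
    char tag;
    alignas(8) double value;   /* quadword: 8-byte boundary   */
    alignas(4) float  weight;  /* doubleword: 4-byte boundary */
};

/* Compile-time checks that the member offsets are naturally aligned. */
static_assert(offsetof(struct sample, value) % 8 == 0,
              "value must start on a quadword boundary");
static_assert(offsetof(struct sample, weight) % 4 == 0,
              "weight must start on a doubleword boundary");
```

A run-time check such as `is_aligned(&s.value, 8)` can be useful for buffers that come from a custom allocator or from pointer arithmetic, where the compiler cannot guarantee the boundary.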

Use Multiplies Rather than Divides

If accuracy requirements allow, floating-point division by a constant should be converted to a multiply by the reciprocal. Divisors that are powers of two and their reciprocals are exactly representable, so converting such a division does not cause an accuracy issue, except in the rare case that the reciprocal overflows or underflows. Unless such an overflow or underflow occurs, a division by a power of two should always be converted to a multiply. Although the AMD Athlon™ processor has high-performance division, multiplies are significantly faster than divides.
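The conversion can be sketched in C as follows. This is a minimal example with hypothetical function names, not code from the manual: the first pair divides by a power of two, where the reciprocal 0.125 is exact and both forms should produce identical results; the second pair divides by 3.0, where the reciprocal is rounded and the two results may differ in the last bit, so that rewrite is appropriate only if accuracy requirements allow.

```c
#include <assert.h>
#include <stddef.h>

/* Power-of-two divisor: original form using a divide. */
static void scale_by_eighth_div(double *v, size_t n)
{
    for (size_t i = 0; i < n; i++)
        v[i] = v[i] / 8.0;
}

/* Power-of-two divisor: multiply by the exactly representable
 * reciprocal 0.125. Barring overflow or underflow, this gives
 * the same result as the divide. */
static void scale_by_eighth_mul(double *v, size_t n)
{
    const double r = 1.0 / 8.0;  /* exactly 0.125 */
    for (size_t i = 0; i < n; i++)
        v[i] = v[i] * r;
}

/* Non-power-of-two divisor: divide form. */
static double third_div(double x)
{
    return x / 3.0;
}

/* Non-power-of-two divisor: 1.0 / 3.0 is rounded, so this may
 * differ slightly from third_div. Use only if accuracy allows. */
static double third_mul(double x)
{
    const double r = 1.0 / 3.0;
    return x * r;
}
```

Many compilers perform the power-of-two rewrite on their own but leave the inexact case alone unless relaxed floating-point options are enabled, so writing the multiply explicitly documents that the small accuracy change is intended.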
