Software Optimization Guide for AMD64 Processors

25112 Rev. 3.06 September 2005

8.1 Replacing Division with Multiplication

Optimization

Replace integer division by constants with multiplication by the reciprocal.

Rationale

The AMD Athlon™ 64 and AMD Opteron™ processors perform integer multiplication very quickly (3–8 cycles signed, 3–8 cycles unsigned), whereas integer division delivers only one bit of quotient per cycle (22–47 cycles signed, 17–41 cycles unsigned). Code that multiplies by the reciprocal of a constant divisor is therefore much faster than the equivalent division. Either follow the examples in this chapter that illustrate integer division by constants, or build the utility executables from the code in “Derivation of Algorithm, Multiplier, and Shift Factor for Integer Division by Constants” on page 186.
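As an illustration of the idea, the following C sketch divides an unsigned 32-bit value by the constant 10 using one multiplication and one shift. The function name div10_u32 is hypothetical and does not come from this guide. The constant 0xCCCCCCCD is the widely used reciprocal for 10 (2^35 / 10, rounded up). It is shown only as an example and is not necessarily the sequence that the utilities described below would emit.

```c
#include <stdint.h>

/* Hypothetical helper: unsigned 32-bit division by the constant 10
 * without a divide instruction.
 * 0xCCCCCCCD = ceil(2^35 / 10). Multiplying by it and shifting the
 * 64-bit product right by 35 gives floor(x / 10) for every 32-bit x. */
static uint32_t div10_u32(uint32_t x)
{
    uint64_t product = (uint64_t)x * 0xCCCCCCCDu;  /* 32 x 32 -> 64-bit multiply */
    return (uint32_t)(product >> 35);              /* discard the fraction bits */
}
```

Compilers commonly perform this substitution automatically when the divisor is a compile-time constant, so the manual form matters most in hand-written assembly or when the compiler cannot see that the divisor is constant.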

Multiplication by Reciprocal (Division) Utility

The code for the utilities is shown in “Derivation of Algorithm, Multiplier, and Shift Factor for Integer Division by Constants” on page 186. The utilities provided in this document are for reference only and are not supported by AMD.

Signed Division Utility

The sdiv.exe utility finds the fastest code for signed division by a constant. The utility displays the code after the user enters a signed constant divisor. To redirect the code to a file, type the following command:

sdiv > example.out
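For orientation, a signed division by a constant can take the following shape in C. This is a minimal sketch for the divisor 3 only. The function name div3_s32 is hypothetical, and the code is not the output of sdiv.exe, which may choose a different sequence. The multiplier 0x55555556 is 2^32 / 3 rounded up. Adding 1 for negative dividends makes the result truncate toward zero, as C division does.

```c
#include <stdint.h>

/* Hypothetical helper: signed 32-bit division by the constant 3,
 * truncating toward zero like the C '/' operator. */
static int32_t div3_s32(int32_t n)
{
    /* 0x55555556 = ceil(2^32 / 3) */
    int64_t product = (int64_t)n * 0x55555556;

    /* High 32 bits = floor(product / 2^32). The negative case is
     * computed on the complement so that no negative value is ever
     * right-shifted (that shift is implementation-defined in C). */
    int64_t high = (product >= 0) ? (product >> 32)
                                  : ~((~product) >> 32);

    /* floor() rounds negative quotients away from zero; add 1 to
     * correct the result when the dividend is negative. */
    return (int32_t)high + (n < 0);
}
```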

Unsigned Division Utility

The udiv.exe utility finds the fastest code for unsigned division by a constant. The utility displays the code after the user enters an unsigned constant divisor. To redirect the code to a file, type the following command:

udiv > example.out
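To give a feel for what such a utility has to compute, the sketch below derives a multiplier and two shift counts for an arbitrary nonzero unsigned 32-bit divisor and then applies them. It uses the round-up method published by Granlund and Montgomery, which is an assumption made for illustration. It is not taken from the udiv source on page 186, and udiv.exe may select shorter sequences for particular divisors. The names udiv_magic, udiv_magic_for, and udiv_apply are hypothetical.

```c
#include <stdint.h>

/* Hypothetical container for the derived constants. */
typedef struct {
    uint32_t multiplier;  /* low 32 bits of the scaled reciprocal */
    unsigned shift1;      /* first shift count, 0 or 1 */
    unsigned shift2;      /* final shift count */
} udiv_magic;

/* Derive the constants for divisor d. The caller must pass d != 0. */
static udiv_magic udiv_magic_for(uint32_t d)
{
    unsigned l = 0;                        /* l = ceil(log2(d)) */
    while (l < 32 && ((uint64_t)1 << l) < d)
        l++;

    udiv_magic mg;
    /* multiplier = floor(2^32 * (2^l - d) / d) + 1, which fits in 32 bits */
    mg.multiplier =
        (uint32_t)((((uint64_t)1 << 32) * (((uint64_t)1 << l) - d)) / d + 1);
    mg.shift1 = (l < 1) ? l : 1;           /* min(l, 1) */
    mg.shift2 = (l > 0) ? l - 1 : 0;       /* max(l - 1, 0) */
    return mg;
}

/* Compute n / d using only a multiply, add, subtract, and shifts. */
static uint32_t udiv_apply(uint32_t n, udiv_magic mg)
{
    /* High half of the 64-bit product multiplier * n */
    uint32_t t = (uint32_t)(((uint64_t)mg.multiplier * n) >> 32);
    /* t <= n always holds, so neither n - t nor the sum can wrap */
    return (t + ((n - t) >> mg.shift1)) >> mg.shift2;
}
```

For example, udiv_magic_for(7) yields multiplier 0x24924925 with shift1 = 1 and shift2 = 2, and udiv_apply(100, udiv_magic_for(7)) returns 14. The constants are computed once per divisor, so the cost of the derivation is amortized over every subsequent division by that divisor.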

160

Integer Optimizations

Chapter 8
