22007E/0 — November 1999

AMD Athlon™ Processor x86 Code Optimization

Remainder of Signed Integer 2^n or -(2^n)

;IN: EAX = dividend
;OUT:EAX = remainder
CDQ                     ;Sign extend into EDX
AND  EDX, (2^n-1)       ;Mask correction (abs(divisor)-1)
ADD  EAX, EDX           ;Apply pre-correction
AND  EAX, (2^n-1)       ;Mask out remainder (abs(divisor)-1)
SUB  EAX, EDX           ;Apply post-correction, if necessary
MOV  [remainder], EAX
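
The remainder sequence above can be modeled in C to see why the correction is needed for negative dividends. This is only an illustrative sketch: the names rem_pow2 and check_rem_pow2 are hypothetical, n is assumed to lie in the range 1 to 30, and the variables eax and edx merely stand in for the registers.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of the CDQ/AND/ADD/AND/SUB sequence for a divisor
   of 2^n or -(2^n). Assumes 1 <= n <= 30. */
int32_t rem_pow2(int32_t dividend, unsigned n)
{
    uint32_t mask = (UINT32_C(1) << n) - 1;          /* abs(divisor) - 1 */
    uint32_t eax  = (uint32_t)dividend;
    uint32_t edx  = dividend < 0 ? UINT32_MAX : 0;   /* CDQ: sign extend into EDX */

    edx &= mask;    /* AND EDX, (2^n-1): correction is mask or 0 */
    eax += edx;     /* ADD EAX, EDX:     bias negative dividends */
    eax &= mask;    /* AND EAX, (2^n-1): keep the low n bits */

    /* SUB EAX, EDX: both operands are at most mask, so the signed
       subtraction cannot overflow. */
    return (int32_t)eax - (int32_t)edx;
}

/* Compares the model with C's % operator, which also truncates toward
   zero, so the remainder takes the sign of the dividend. */
void check_rem_pow2(void)
{
    for (unsigned n = 1; n <= 8; ++n) {
        int32_t divisor = (int32_t)1 << n;
        for (int32_t x = -1000; x <= 1000; ++x) {
            assert(rem_pow2(x, n) == x % divisor);
            assert(rem_pow2(x, n) == x % -divisor);
        }
    }
}
```

For a non-negative dividend the correction in edx is zero and the sequence reduces to a single mask. For a negative dividend the bias of 2^n-1 is added before masking and subtracted afterwards, which gives the remainder the sign of the dividend.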

 

Use Alternative Code When Multiplying by a Constant

A 32-bit integer multiply by a constant has a latency of five cycles. Therefore, use alternative code when multiplying by certain constants. In addition, because there is just one multiply unit, the replacement code may provide better throughput.

The following code samples are designed so that the source register also receives the final result. Other sequences are possible if the result is placed in a different register. Adds have been favored over shifts to keep code size small. Generally, there is a fast replacement if the binary representation of the constant has very few 1 bits.
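
The remark about constants with few 1 bits can be illustrated with a small C sketch. It is not taken from the manual: the helper name mul_shift_add and its terms parameter are hypothetical. It builds the product from one shifted copy of the operand per 1 bit in the constant, so the number of terms is a rough indicator of how cheap a replacement sequence might be.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: multiplies x by the constant k using one shift and
   one add per 1 bit in k. If terms is not NULL it receives the number of
   shifted terms that were added. Results wrap modulo 2^32. */
uint32_t mul_shift_add(uint32_t x, uint32_t k, unsigned *terms)
{
    uint32_t result = 0;
    unsigned count = 0;

    for (unsigned bit = 0; bit < 32; ++bit) {
        if (k & (UINT32_C(1) << bit)) {
            result += x << bit;   /* add x * 2^bit */
            ++count;
        }
    }
    if (terms != NULL)
        *terms = count;
    return result;
}

/* Checks the decomposition against a plain multiply for small constants. */
void check_mul_shift_add(void)
{
    for (uint32_t k = 0; k <= 64; ++k)
        for (uint32_t x = 0; x <= 100; ++x)
            assert(mul_shift_add(x, k, NULL) == (uint32_t)(x * k));
}
```

For example, 10 is 1010 in binary, so it needs only two terms (x*8 + x*2), whereas a constant such as 255 has eight 1 bits and is better expressed another way, for instance as a shift followed by a subtract.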

Replacement sequences for more constants are in the file multiply_by_constants.txt, located in the same SDK directory as this document.

by 2:   ADD  REG1, REG1             ;1 cycle

by 3:   LEA  REG1, [REG1*2+REG1]    ;2 cycles

by 4:   SHL  REG1, 2                ;1 cycle

by 5:   LEA  REG1, [REG1*4+REG1]    ;2 cycles

by 6:   LEA  REG2, [REG1*4+REG1]    ;3 cycles
        ADD  REG1, REG2

by 7:   MOV  REG2, REG1             ;2 cycles
        SHL  REG1, 3
        SUB  REG1, REG2

by 8:   SHL  REG1, 3                ;1 cycle

by 9:   LEA  REG1, [REG1*8+REG1]    ;2 cycles

by 10:  LEA  REG2, [REG1*8+REG1]    ;3 cycles
        ADD  REG1, REG2
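
As a cross-check of the sequences above, the following C sketch mirrors each one step by step and compares the result with an ordinary multiply. The function names mul_by_const and check_mul_by_const are hypothetical, and the variables reg1 and reg2 only stand in for REG1 and REG2. The sketch models the arithmetic, not the cycle counts.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of the replacement sequences for constants 2..10.
   Arithmetic wraps modulo 2^32, like the 32-bit registers. */
uint32_t mul_by_const(uint32_t reg1, unsigned k)
{
    uint32_t reg2;

    switch (k) {
    case 2:  reg1 += reg1;                           break; /* ADD REG1, REG1 */
    case 3:  reg1 = reg1 * 2 + reg1;                 break; /* LEA REG1, [REG1*2+REG1] */
    case 4:  reg1 <<= 2;                             break; /* SHL REG1, 2 */
    case 5:  reg1 = reg1 * 4 + reg1;                 break; /* LEA REG1, [REG1*4+REG1] */
    case 6:  reg2 = reg1 * 4 + reg1;                        /* LEA REG2, [REG1*4+REG1] */
             reg1 += reg2;                           break; /* ADD REG1, REG2 */
    case 7:  reg2 = reg1;                                   /* MOV REG2, REG1 */
             reg1 <<= 3;                                    /* SHL REG1, 3 */
             reg1 -= reg2;                           break; /* SUB REG1, REG2 */
    case 8:  reg1 <<= 3;                             break; /* SHL REG1, 3 */
    case 9:  reg1 = reg1 * 8 + reg1;                 break; /* LEA REG1, [REG1*8+REG1] */
    case 10: reg2 = reg1 * 8 + reg1;                        /* LEA REG2, [REG1*8+REG1] */
             reg1 += reg2;                           break; /* ADD REG1, REG2 */
    default: reg1 *= k;                              break; /* not modeled: plain multiply */
    }
    return reg1;
}

/* Compares every modeled sequence with a plain multiply on a few samples,
   including values that wrap around. */
void check_mul_by_const(void)
{
    const uint32_t samples[] = { 0u, 1u, 7u, 12345u, 0x80000000u, 0xFFFFFFFFu };

    for (unsigned k = 2; k <= 10; ++k)
        for (unsigned i = 0; i < sizeof samples / sizeof samples[0]; ++i)
            assert(mul_by_const(samples[i], k) == (uint32_t)(samples[i] * k));
}
```

The sequences for 6, 7, and 10 need the scratch register because they combine the original value with a scaled copy of it: 6 = 5 + 1, 7 = 8 - 1, and 10 = 9 + 1.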

 
