25112 Rev. 3.06 September 2005

Software Optimization Guide for AMD64 Processors

Example 2

To compute the low-order half (the low 64 bits) of the product of two 64-bit integers using 32-bit registers, a procedure such as the following is necessary:

; In:       [ESP+8]:[ESP+4]   = multiplicand
;           [ESP+16]:[ESP+12] = multiplier
; Out:      EDX:EAX = (multiplicand * multiplier) % 2^64
; Destroys: EAX, ECX, EDX, EFlags

llmul PROC
   mov  edx, [esp+8]         ; multiplicand_hi
   mov  ecx, [esp+16]        ; multiplier_hi
   or   edx, ecx             ; One operand >= 2^32?
   mov  edx, [esp+12]        ; multiplier_lo
   mov  eax, [esp+4]         ; multiplicand_lo
   jnz  twomul               ; Yes, need two multiplies.
   mul  edx                  ; multiplicand_lo * multiplier_lo
   ret                       ; Done, return to caller.

twomul:
   imul edx, [esp+8]         ; p3_lo = multiplicand_hi * multiplier_lo
   imul ecx, eax             ; p2_lo = multiplier_hi * multiplicand_lo
   add  ecx, edx             ; p2_lo + p3_lo
   mul  dword ptr [esp+12]   ; p1 = multiplicand_lo * multiplier_lo
   add  edx, ecx             ; p1 + p2_lo + p3_lo = result in EDX:EAX
   ret                       ; Done, return to caller.
llmul ENDP
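
The arithmetic behind the listing can be modeled in C as a rough sketch. The function name llmul_model and its local variable names are illustrative assumptions, not part of the guide; the code only mirrors the register-level steps (fast path when both high halves are zero, otherwise two truncating cross products added into the high half of the low-by-low product).

```c
#include <stdint.h>

/* Hypothetical C model of the llmul listing; all names are illustrative. */
uint64_t llmul_model(uint64_t multiplicand, uint64_t multiplier)
{
    uint32_t a_lo = (uint32_t)multiplicand;          /* [ESP+4]  */
    uint32_t a_hi = (uint32_t)(multiplicand >> 32);  /* [ESP+8]  */
    uint32_t b_lo = (uint32_t)multiplier;            /* [ESP+12] */
    uint32_t b_hi = (uint32_t)(multiplier >> 32);    /* [ESP+16] */

    /* Fast path: both operands fit in 32 bits, so one multiply suffices. */
    if ((a_hi | b_hi) == 0)
        return (uint64_t)a_lo * b_lo;

    /* Cross products: only their low 32 bits can reach the low 64 bits
       of the result, so 32-bit wraparound arithmetic is enough. */
    uint32_t p3_lo = a_hi * b_lo;
    uint32_t p2_lo = b_hi * a_lo;
    uint32_t cross = p2_lo + p3_lo;

    /* Full 64-bit product of the two low halves (MUL gives EDX:EAX). */
    uint64_t p1 = (uint64_t)a_lo * b_lo;

    /* Add the cross terms into the high half; carries out of bit 63
       are discarded, which is the "% 2^64" in the listing's header. */
    uint32_t hi = (uint32_t)(p1 >> 32) + cross;
    return ((uint64_t)hi << 32) | (uint32_t)p1;
}
```

The product multiplicand_hi * multiplier_hi never appears because it is scaled by 2^64 and therefore contributes nothing to the low 64 bits.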

Using 64-bit registers, the entire product can be produced with only one instruction:

; Multiply RAX by RBX. The 128-bit product is stored in RDX:RAX.
00000000 48 F7 EB            imul rbx
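
As a rough illustration of what the single 64-bit instruction computes, the following C sketch forms the signed 128-bit product and splits it into two halves. It assumes a compiler that provides the non-standard __int128 type with arithmetic right shift (GCC and Clang on 64-bit targets do); the product128 type and imul64_model name are hypothetical and exist only for this example.

```c
#include <stdint.h>

/* Hypothetical holder for the two result registers. */
typedef struct {
    uint64_t lo;  /* models RAX: low 64 bits of the product  */
    int64_t  hi;  /* models RDX: high 64 bits of the product */
} product128;

/* Sketch of a signed 64x64 -> 128-bit multiply.
   Assumes the __int128 compiler extension is available. */
product128 imul64_model(int64_t a, int64_t b)
{
    __int128 p = (__int128)a * (__int128)b;
    product128 r;
    r.lo = (uint64_t)p;         /* truncate to the low half          */
    r.hi = (int64_t)(p >> 64);  /* arithmetic shift keeps the sign   */
    return r;
}
```

The low half of this result matches the 64-bit value that the 32-bit procedure returns in EDX:EAX.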

Related Information

For more examples of 64-bit arithmetic using only 32-bit registers, see “Efficient 64-Bit Integer Arithmetic in 32-Bit Mode” on page 170.

Chapter 3

General 64-Bit Optimizations
