334 Instruction Latencies Appendix C
25112 Rev. 3.06 September 2005
Software Optimization Guide for AMD64 Processors
Table 19. SSE2 Instructions (Continued)

                            ---------- Encoding ----------
Syntax                      Prefix  First  2nd    ModRM   Decode  FPU        Latency  Throughput  Note
                            byte    byte   byte   byte    type    pipe(s)
--------------------------  ------  -----  -----  ------  ------  ---------  -------  ----------  ----
PAND xmmreg1, xmmreg2       66h     0Fh    DBh            Double  FADD/FMUL  2        1/1
PAND xmmreg, mem128         66h     0Fh    DBh            Double  FADD/FMUL  4        1/1
PANDN xmmreg1, xmmreg2      66h     0Fh    DFh            Double  FADD/FMUL  2        1/1
PANDN xmmreg, mem128        66h     0Fh    DFh            Double  FADD/FMUL  4        1/1
PAVGB xmmreg1, xmmreg2      66h     0Fh    E0h            Double  FADD/FMUL  2        1/1
PAVGB xmmreg, mem128        66h     0Fh    E0h            Double  FADD/FMUL  4        1/1
PAVGW xmmreg1, xmmreg2      66h     0Fh    E3h            Double  FADD/FMUL  2        1/1
PAVGW xmmreg, mem128        66h     0Fh    E3h            Double  FADD/FMUL  4        1/1
PCMPEQB xmmreg1, xmmreg2    66h     0Fh    74h            Double  FADD/FMUL  2        1/1
PCMPEQB xmmreg, mem128      66h     0Fh    74h            Double  FADD/FMUL  4        1/1
PCMPEQD xmmreg1, xmmreg2    66h     0Fh    76h            Double  FADD/FMUL  2        1/1
PCMPEQD xmmreg, mem128      66h     0Fh    76h            Double  FADD/FMUL  4        1/1
PCMPEQW xmmreg1, xmmreg2    66h     0Fh    75h            Double  FADD/FMUL  2        1/1
PCMPEQW xmmreg, mem128      66h     0Fh    75h            Double  FADD/FMUL  4        1/1
PCMPGTB xmmreg1, xmmreg2    66h     0Fh    64h            Double  FADD/FMUL  2        1/1
PCMPGTB xmmreg, mem128      66h     0Fh    64h            Double  FADD/FMUL  4        1/1
PCMPGTD xmmreg1, xmmreg2    66h     0Fh    66h            Double  FADD/FMUL  2        1/1
Notes:
1. The low half of the result is available one cycle earlier than listed.
2. This is the execution latency for the instruction. The time to complete the external write depends on the memory
speed and the hardware implementation.
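
As a rough illustration of how the Latency and Throughput columns can be used, the sketch below estimates cycle counts for a run of the instructions listed on this page. It takes the register-operand rows as latency 2 and the memory-operand rows as latency 4, each with a throughput of 1/1, and assumes that x/y means x instructions per y cycles. The function names and the cost model itself are illustrative assumptions, not taken from this guide; the model ignores decode limits, pipe contention, and the early low-half result described in Note 1.

```python
import math
from fractions import Fraction

# Values read from the rows of Table 19 on this page.
# Assumption: a throughput of x/y means x instructions per y cycles.
REG_FORM = {"latency": 2, "throughput": Fraction(1, 1)}  # e.g. PAND xmmreg1, xmmreg2
MEM_FORM = {"latency": 4, "throughput": Fraction(1, 1)}  # e.g. PAND xmmreg, mem128


def dependent_chain_cycles(n, latency):
    """Estimate for n instructions where each one consumes the previous result.

    Every instruction must wait the full latency of its predecessor,
    so the chain costs roughly n * latency cycles.
    """
    return max(n, 0) * latency


def independent_cycles(n, latency, throughput):
    """Estimate for n instructions with no data dependencies between them.

    Hypothetical model: instructions issue at `throughput` per cycle,
    and the final result appears `latency` cycles after the last issue.
    """
    if n <= 0:
        return 0
    issue_cycles = math.ceil((n - 1) / throughput)
    return issue_cycles + latency


if __name__ == "__main__":
    # Eight register-form PANDs, serially dependent vs. fully independent.
    print(dependent_chain_cycles(8, REG_FORM["latency"]))                      # 16
    print(independent_cycles(8, REG_FORM["latency"], REG_FORM["throughput"]))  # 9
```

Under these assumptions, a serial chain is bound by latency while independent work is bound by throughput, which is why interleaving independent operations tends to hide the listed latencies.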