25112 Rev. 3.06 September 2005

Software Optimization Guide for AMD64 Processors

Table 13. Integer Instructions (Continued)

                                            Encoding
Syntax                                      First  Second  ModRM       Decode      Latency  Note
                                            byte   byte    byte        type
------------------------------------------------------------------------------------------------
IN AL, DX                                   ECh                        VectorPath  179
IN AX, DX                                   EDh                        VectorPath  179
IN EAX, DX                                  EDh                        VectorPath  181
INC AX, EAX                                 40h                        DirectPath  1        8
INC CX, ECX                                 41h                        DirectPath  1        8
INC DX, EDX                                 42h                        DirectPath  1        8
INC BX, EBX                                 43h                        DirectPath  1        8
INC SP, ESP                                 44h                        DirectPath  1        8
INC BP, EBP                                 45h                        DirectPath  1        8
INC SI, ESI                                 46h                        DirectPath  1        8
INC DI, EDI                                 47h                        DirectPath  1        8
INC mreg8                                   FEh            11-000-xxx  DirectPath  1
INC mem8                                    FEh            mm-000-xxx  DirectPath  4
INC mreg16/32/64                            FFh            11-000-xxx  DirectPath  1
INC mem16/32/64                             FFh            mm-000-xxx  DirectPath  4
INSB/INS mem8, DX                           6Ch                        VectorPath  184
INSD/INS mem32, DX                          6Dh                        VectorPath  185
INSW/INS mem16, DX                          6Dh                        VectorPath  186
INT imm8 (no CPL change)                    CDh                        VectorPath  87–109
INT imm8 (CPL change)                       CDh                        VectorPath  91–112
INVD                                        0Fh    08h                 VectorPath  247
INVLPG                                      0Fh    01h     mm-111-xxx  VectorPath  101/80   7
IRET, IRETD, IRETQ (from 64-bit to 64-bit)  CFh                        VectorPath  91
IRET, IRETD, IRETQ (from 64-bit to 32-bit)  CFh                        VectorPath  111

Notes:

1. Static timing assumes a predicted branch.

2. Store operation also updates ESP—the new register value is available one clock earlier than the specified latency.

3. The clock count is the same regardless of the number of shifts or rotates, as determined by CL or imm8.

4. LEA instructions have a latency of 1 when there are two source operands (as in the case of the base + index form LEA EAX, [EDX+EDI]). Forms with a scale or more than two source operands will have a latency of 2 (LEA EAX, [EBX+EBX*8]).

5. These instructions have an effective latency as shown. They map to internal NOPs that can be issued at a rate of three per cycle but do not occupy execution resources.

6. The latency of repeated string instructions can be found in “Latency of Repeated String Instructions” on page 167.

7. The first latency value is for 32-bit mode. The second is for 64-bit mode.

8. This opcode is used as a REX prefix in 64-bit mode.
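Note 8 has a practical consequence for code generators: the one-byte INC encodings 40h through 47h are decoded as REX prefixes in 64-bit mode, so a register increment there has to use the FFh form with ModRM 11-000-xxx. Table 13 lists both forms as DirectPath with a latency of 1. The following C fragment is a minimal sketch of that choice for 32-bit registers 0 through 7 only; the function name encode_inc_r32 and its parameters are hypothetical and do not come from this guide.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not part of the guide): writes the encoding of
 * INC r32 for register number reg (0 = EAX ... 7 = EDI) into out and
 * returns the number of bytes written.
 * long_mode is nonzero when the code will run in 64-bit mode. */
size_t encode_inc_r32(unsigned reg, int long_mode, uint8_t out[2])
{
    assert(reg < 8);

    if (!long_mode) {
        /* Legacy and compatibility modes: one-byte form, 40h-47h. */
        out[0] = (uint8_t)(0x40 + reg);
        return 1;
    }

    /* 64-bit mode: 40h-47h would be read as a REX prefix (note 8),
     * so emit FFh with ModRM 11-000-xxx instead. */
    out[0] = 0xFF;
    out[1] = (uint8_t)(0xC0 | reg);
    return 2;
}
```

Under these assumptions, encode_inc_r32(0, 0, buf) produces the single byte 40h (INC EAX), while encode_inc_r32(0, 1, buf) produces FFh C0h for the same increment in 64-bit mode.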

Appendix C

Instruction Latencies
