Software Optimization Guide for AMD64 Processors

25112 Rev. 3.06 September 2005

C.1 Understanding Instruction Entries

To use the information in this appendix effectively, you need to understand how the entry for an instruction is organized and how to interpret certain items.

Example: Instruction Entry

The entry for an instruction begins with its syntax. Subsequent columns provide additional information about the instruction.

 

 

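                  Encoding
                  -----------------------------------
Syntax            First byte  Second byte  ModRM byte  Decode type  Latency  Note
----------------  ----------  -----------  ----------  -----------  -------  ----
ADD mreg8, reg8   00h                      11-xxx-xxx  DirectPath   1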
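To make the Encoding columns concrete, here is a minimal sketch of how the bytes in the example entry could be assembled for a register-to-register ADD. It assumes the standard x86 ModRM layout (a 2-bit mod field, a 3-bit reg field, and a 3-bit r/m field) and the usual numbering of the 8-bit registers without a REX prefix. The names `REG8`, `encode_add_reg8`, and `split_modrm` are hypothetical and appear only for illustration; they are not part of this guide.

```python
# Illustrative sketch: build the two bytes of "ADD mreg8, reg8" for the
# register form shown in the example entry (ModRM = 11-xxx-xxx).
# REG8, encode_add_reg8, and split_modrm are hypothetical helper names.
from typing import Tuple

# Standard x86 numbering of the 8-bit registers (no REX prefix).
REG8 = {"al": 0, "cl": 1, "dl": 2, "bl": 3, "ah": 4, "ch": 5, "dh": 6, "bh": 7}

ADD_RM8_R8_OPCODE = 0x00  # "First byte" column of the example entry


def encode_add_reg8(dest: str, src: str) -> bytes:
    """Encode ADD dest, src where both operands are 8-bit registers."""
    mod = 0b11        # the "11" in 11-xxx-xxx: register-direct operand
    reg = REG8[src]   # middle xxx field: the reg8 (source) operand
    rm = REG8[dest]   # last xxx field: the mreg8 (destination) operand
    modrm = (mod << 6) | (reg << 3) | rm
    return bytes([ADD_RM8_R8_OPCODE, modrm])


def split_modrm(modrm: int) -> Tuple[int, int, int]:
    """Split a ModRM byte into its (mod, reg, rm) fields."""
    return (modrm >> 6) & 0b11, (modrm >> 3) & 0b111, modrm & 0b111
```

Under these assumptions, `encode_add_reg8("cl", "dl")` produces the bytes 00h D1h, and `split_modrm(0xD1)` returns the fields 11b, 010b, and 001b. The Second byte column is empty for this instruction because its opcode is a single byte.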

 

 

 

 

 

 

 

 

Parts of the Instruction Entry

This table describes the columns that are common to each instruction entry in this appendix.

Column        Description
------------  ---------------------------------------------------------------
Syntax        Shows the syntax for the instruction, that is, the permitted
              arrangement of its parts. Items in italics are placeholders for
              operands that you must provide. For information on how to
              interpret the placeholders, see “Interpreting Placeholders” on
              page 271.

Encoding      Shows how the assembler translates the instruction into machine
              language. Subcolumns show the individual bytes of the encoding.

Decode type   Shows the method that the processor uses to decode the
              instruction: DirectPath Single (DirectPath), DirectPath Double
              (Double), or VectorPath.

Latency       Shows the static execution latency for the instruction. For
              details on how to interpret the latency information, see
              “Interpreting Latencies” on page 272.

Throughput    Shows the maximum theoretical rate of execution for the
              instruction. For example, a value of 1/2 means that one such
              instruction executes every two clocks, or two such instructions
              every four clocks, and so on. A value of 3/1 indicates that
              three such instructions can execute every clock, but fewer than
              three such instructions would still take one clock.
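The throughput arithmetic described above can be sketched as a small calculation. This is only an illustration under simplifying assumptions: the instructions are independent, and decode limits, latency, and resource conflicts are ignored. The function name `min_clocks` is hypothetical.

```python
# Illustrative sketch: lower bound on the clocks needed to execute a run of
# identical, independent instructions, given the Throughput column value.
# min_clocks is a hypothetical helper, not something defined by this guide.
from fractions import Fraction
from math import ceil


def min_clocks(count: int, throughput: Fraction) -> int:
    """Return the minimum whole clocks for `count` instructions.

    `throughput` is instructions per clock, written as in the table,
    for example Fraction(1, 2) for 1/2 or Fraction(3, 1) for 3/1.
    """
    if count <= 0:
        return 0
    # Partial clocks do not exist, so round up to the next whole clock.
    return ceil(Fraction(count) / throughput)
```

With a throughput of 1/2, `min_clocks(2, Fraction(1, 2))` gives 4 clocks. With a throughput of 3/1, one, two, or three instructions each give 1 clock, and a fourth instruction raises the result to 2 clocks.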

 

 

The entries for floating-point, MMX™, SSE, SSE2, and 3DNow!™ instructions have an additional column, FPU Pipe(s), that lists the floating-point unit (FPU) pipelines available to any particular DirectPath or Double decoded operation. For example, the floating-point multiplier is represented by FMUL.
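For readers who process these tables programmatically, one possible way to model an instruction entry is a simple record with one field per column. The sketch below is an assumption about a convenient representation, not a format defined by this guide; the class name `InstructionEntry` and all of its field names are hypothetical.

```python
# Illustrative sketch: one record per instruction entry, one field per column.
# The class and field names are hypothetical, chosen to mirror the columns.
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class InstructionEntry:
    syntax: str                      # Syntax column
    first_byte: Optional[int]        # Encoding: first byte
    second_byte: Optional[int]       # Encoding: second byte (None if unused)
    modrm: Optional[str]             # Encoding: ModRM byte pattern
    decode_type: str                 # DirectPath, Double, or VectorPath
    latency: str                     # Latency column, kept as printed
    note: str = ""                   # Note column
    fpu_pipes: Tuple[str, ...] = ()  # FPU Pipe(s); empty for integer entries


# The example entry from this section, transcribed field by field.
example = InstructionEntry(
    syntax="ADD mreg8, reg8",
    first_byte=0x00,
    second_byte=None,
    modrm="11-xxx-xxx",
    decode_type="DirectPath",
    latency="1",
)
```

Latency is kept as a string here because how to interpret latency values is covered separately in “Interpreting Latencies”. An entry for a floating-point instruction would fill `fpu_pipes` with the pipeline names listed in its FPU Pipe(s) column, such as FMUL.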

270

Instruction Latencies

Appendix C
