
22007E/0 — November 1999

AMD Athlon™ Processor x86 Code Optimization

Use Load-Execute Floating-Point Instructions with Floating-Point Operands

 

When operating on single-precision or double-precision floating-point data, wherever possible use floating-point load-execute instructions to increase code density.

Note: This optimization applies only to floating-point instructions with floating-point operands and not with integer operands, as described in the next optimization.

This coding style helps in two ways. First, denser code allows more work to be held in the instruction cache. Second, the denser code generates fewer internal OPs and, therefore, the FPU scheduler holds more work, which increases the chances of extracting parallelism from the code.

Example 1 (Avoid):

FLD   QWORD PTR [TEST1]
FLD   QWORD PTR [TEST2]
FMUL  ST, ST(1)

 

Example 2 (Preferred):

FLD   QWORD PTR [TEST1]
FMUL  QWORD PTR [TEST2]
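The same pattern extends to longer expressions. The fragment below is a minimal sketch, not taken from the manual, of a sum of two products computed with load-execute multiplies. The labels TERM_A, TERM_B, TERM_C, and TERM_D are hypothetical and are assumed to name double-precision (64-bit) values in memory.

```asm
; Sketch only: TERM_A..TERM_D are hypothetical QWORD (double) variables.
; Computes TERM_A*TERM_B + TERM_C*TERM_D, leaving the result in ST(0).
FLD   QWORD PTR [TERM_A]    ; ST(0) = TERM_A
FMUL  QWORD PTR [TERM_B]    ; ST(0) = TERM_A * TERM_B
FLD   QWORD PTR [TERM_C]    ; ST(0) = TERM_C, ST(1) = TERM_A * TERM_B
FMUL  QWORD PTR [TERM_D]    ; ST(0) = TERM_C * TERM_D
FADDP ST(1), ST             ; add, pop: ST(0) = TERM_A*TERM_B + TERM_C*TERM_D
```

Each multiply takes its second operand directly from memory, so this version needs five instructions, where loading every operand with its own FLD would need seven.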

Avoid Load-Execute Floating-Point Instructions with Integer Operands

Do not use load-execute floating-point instructions with integer operands: FIADD, FISUB, FISUBR, FIMUL, FIDIV, FIDIVR, FICOM, and FICOMP. Remember that floating-point instructions can have integer operands while integer instructions cannot have floating-point operands.

Floating-point computations involving integer-memory operands should use separate FILD and arithmetic instructions. This optimization has the potential to increase decode bandwidth and OP density in the FPU scheduler. The floating-point load-execute instructions with integer operands are VectorPath and generate two OPs in a cycle, while the discrete equivalent enables a third DirectPath instruction to be decoded in the same cycle. In some situations this optimization can also reduce execution time if the FILD can be scheduled several instructions ahead of the arithmetic instruction in order to cover the FILD latency.
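The two fragments below are a minimal sketch of this recommendation, written in the same style as the examples in the previous section. The labels FOO, BAR, and BAZ are hypothetical: FOO is assumed to be a double-precision (64-bit) value, and BAR and BAZ are assumed to be 32-bit integers in memory. Both fragments compute FOO*BAR + BAZ and leave the result in ST(0).

```asm
; Sketch only: FOO (QWORD double), BAR and BAZ (DWORD integers) are hypothetical.

; Avoid: load-execute instructions with integer memory operands
FLD   QWORD PTR [FOO]       ; ST(0) = FOO
FIMUL DWORD PTR [BAR]       ; ST(0) = FOO * BAR
FIADD DWORD PTR [BAZ]       ; ST(0) = FOO * BAR + BAZ

; Preferred: separate FILD loads issued ahead of the arithmetic
FILD  DWORD PTR [BAR]       ; ST(0) = BAR
FILD  DWORD PTR [BAZ]       ; ST(0) = BAZ, ST(1) = BAR
FLD   QWORD PTR [FOO]       ; ST(0) = FOO, ST(1) = BAZ, ST(2) = BAR
FMULP ST(2), ST             ; ST(2) = BAR * FOO, pop: ST(0) = BAZ, ST(1) = BAR * FOO
FADDP ST(1), ST             ; ST(1) = BAR * FOO + BAZ, pop: ST(0) = result
```

The preferred form uses more instructions, but both FILD instructions are issued before the arithmetic that consumes them, which is the scheduling freedom the paragraph above describes for covering the FILD latency.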

Load-Execute Instruction Usage
