AMD x86 manual Take Advantage of Write Combining, Use 3DNow! Instructions

AMD Athlon™ Processor x86 Code Optimization

Avoid Load-Execute Floating-Point Instructions with Integer Operands

22007E/0 — November 1999

 

Do not use load-execute floating-point instructions with integer operands. The floating-point load-execute instructions with integer operands are VectorPath and generate two OPs in a cycle, while the discrete equivalent enables a third DirectPath instruction to be decoded in the same cycle.

Take Advantage of Write Combining

 

This guideline applies only to operating system, device driver, and BIOS programmers. In order to improve system performance, the AMD Athlon processor aggressively combines multiple memory-write cycles of any data size that address locations within a 64-byte cache line aligned write buffer.

See Appendix C, “Implementation of Write Combining” on page 155 for more details.
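
As a rough illustration of the address condition only, the following C sketch checks whether two byte addresses fall within the same 64-byte aligned block. The names WC_LINE_BYTES and same_wc_line are hypothetical and do not come from the manual. Whether writes are actually combined also depends on the memory type and the other conditions covered in Appendix C, which this sketch does not model.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption: 64-byte aligned write buffer, as stated in the guideline. */
#define WC_LINE_BYTES 64u

/* Hypothetical helper: returns true when both addresses lie in the same
 * 64-byte aligned block, i.e. when writes to them target the same
 * line-sized region. This models the address condition only. */
bool same_wc_line(uintptr_t a, uintptr_t b)
{
    return (a / WC_LINE_BYTES) == (b / WC_LINE_BYTES);
}
```

For example, addresses 0x1000 and 0x103F share a block under this check, while 0x103F and 0x1040 do not.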

Use 3DNow!™ Instructions

 

Unless accuracy requirements dictate otherwise, perform floating-point computations using the 3DNow! instructions instead of x87 instructions. The SIMD nature of 3DNow! instructions achieves twice the number of FLOPs that are achieved through x87 instructions. 3DNow! instructions also provide for a flat register file instead of the stack-based approach of x87 instructions.

See Table 23 on page 217 for a list of 3DNow! instructions. For information about instruction usage, see the 3DNow!™ Technology Manual, order# 21928.

Avoid Branches Dependent on Random Data

 

Avoid data-dependent branches around a single instruction. Data-dependent branches acting upon basically random data can cause the branch prediction logic to mispredict the branch about 50% of the time. Design branch-free alternative code sequences, which results in shorter average execution time.

See “Avoid Branches Dependent on Random Data” on page 57 for more details.
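
A minimal C sketch of the idea, not taken from the manual: a signed minimum written once with a data-dependent branch and once with a mask-based, branch-free sequence. The function names are hypothetical. A compiler may already turn the branching version into a conditional move, so the generated assembly is worth inspecting before assuming either form is faster.

```c
#include <stdint.h>

/* Branching version: the comparison outcome depends on the data, so with
 * effectively random inputs the branch is hard to predict. */
int32_t min_branch(int32_t a, int32_t b)
{
    if (a < b) {
        return a;
    }
    return b;
}

/* Branch-free version (illustrative): build a mask that is all ones when
 * a < b and zero otherwise, then use it to select a or b. */
int32_t min_branchless(int32_t a, int32_t b)
{
    int32_t mask = -(int32_t)(a < b);   /* -1 if a < b, else 0 */
    return b ^ ((a ^ b) & mask);        /* a when mask is -1, b when 0 */
}
```

Both functions return the same value for every pair of inputs; only the control flow differs.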

Group II Optimizations —Secondary Optimizations
