AMD Athlon™ Processor x86 Code Optimization

22007E/0 — November 1999

Switch Statement Usage . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21

Optimize Switch Statements . . . . . . . . . . . . . . . . . . . . . . . . . . 21

Use Prototypes for All Functions . . . . . . . . . . . . . . . . . . . . . . . 21

Use Const Type Qualifier . . . . . . . . . . . . . . . . . . . . . . . . . . . 22

Generic Loop Hoisting . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22

Generalization for Multiple Constant Control Code . . . . . . . . . . . . . . 23

Declare Local Functions as Static . . . . . . . . . . . . . . . . . . . . . . . 24

Dynamic Memory Allocation Consideration . . . . . . . . . . . . . . . . . . . 25

Introduce Explicit Parallelism into Code . . . . . . . . . . . . . . . . . . . 25

Explicitly Extract Common Subexpressions . . . . . . . . . . . . . . . . . . . 26

C Language Structure Component Considerations . . . . . . . . . . . . . . . . 27

Sort Local Variables According to Base Type Size . . . . . . . . . . . . . . . 28

Accelerating Floating-Point Divides and Square Roots . . . . . . . . . . . . . 29

Avoid Unnecessary Integer Division . . . . . . . . . . . . . . . . . . . . . . 31

Copy Frequently De-referenced Pointer Arguments to Local Variables . . . . . . 31

4 Instruction Decoding Optimizations . . . . . . . . . . . . . . . . . . . . . 33

Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33

Select DirectPath Over VectorPath Instructions. . . . . . . . . . . . . . . . 34

Load-Execute Instruction Usage . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34

Use Load-Execute Integer Instructions . . . . . . . . . . . . . . . . . . 34

Use Load-Execute Floating-Point Instructions with Floating-Point Operands . . 35

Avoid Load-Execute Floating-Point Instructions with Integer Operands . . . . . 35

Align Branch Targets in Program Hot Spots . . . . . . . . . . . . . . . . . . . 36

Use Short Instruction Lengths . . . . . . . . . . . . . . . . . . . . . . . . . 36

Avoid Partial Register Reads and Writes . . . . . . . . . . . . . . . . . . . . 37

Replace Certain SHLD Instructions with Alternative Code . . . . . . . . . . . . 38

Use 8-Bit Sign-Extended Immediates . . . . . . . . . . . . . . . . . . . . . . 38

iv

Contents
