22007E/0— November 1999

AMD Athlon™ Processor x86 Code Optimization

Contents

Revision History . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xv

1   Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1

    About this Document . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
    AMD Athlon™ Processor Family . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3
    AMD Athlon Processor Microarchitecture Summary . . . . . . . . . . . . . 4

2   Top Optimizations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7

    Optimization Star . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8
    Group I Optimizations: Essential Optimizations . . . . . . . . . . . . . . . . 8
        Memory Size and Alignment Issues . . . . . . . . . . . . . . . . . . . . . . 8
        Use the 3DNow!™ PREFETCH and PREFETCHW Instructions . . . . . 8
        Select DirectPath Over VectorPath Instructions . . . . . . . . . . . 9
    Group II Optimizations: Secondary Optimizations . . . . . . . . . . . . . . 9
        Load-Execute Instruction Usage . . . . . . . . . . . . . . . . . . . . . . . . 9
        Take Advantage of Write Combining . . . . . . . . . . . . . . . . . . . . 10
        Use 3DNow! Instructions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
        Avoid Branches Dependent on Random Data . . . . . . . . . . . . . 10
        Avoid Placing Code and Data in the Same 64-Byte Cache Line . . . 11

3   C Source Level Optimizations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13

    Ensure Floating-Point Variables and Expressions are of Type Float . . . 13
    Use 32-Bit Data Types for Integer Code . . . . . . . . . . . . . . . . . . . . . . . 13
    Consider the Sign of Integer Operands . . . . . . . . . . . . . . . . . . . . . . . 14
    Use Array Style Instead of Pointer Style Code . . . . . . . . . . . . . . . . . 15
    Completely Unroll Small Loops . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18
    Avoid Unnecessary Store-to-Load Dependencies . . . . . . . . . . . . . . . 18
    Consider Expression Order in Compound Branch Conditions . . . . . 20
