AMD Athlon™ Processor x86 Code Optimization

22007E/0 — November 1999

Align Data Where Possible

In general, avoid misaligned data references. All data whose size is a power of 2 is considered aligned if it is naturally aligned. For example:

• QWORD accesses are aligned if they access an address divisible by 8.

• DWORD accesses are aligned if they access an address divisible by 4.

• WORD accesses are aligned if they access an address divisible by 2.

• TBYTE accesses are aligned if they access an address divisible by 8.

A misaligned store or load operation suffers a minimum one-cycle penalty in the AMD Athlon processor load/store pipeline. In addition, using misaligned loads and stores increases the likelihood of encountering a store-to-load forwarding pitfall. For a more detailed discussion of store-to-load forwarding issues, see “Store-to-Load Forwarding Restrictions” on page 51.
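The alignment rules listed in this section can be checked mechanically. The following C11 fragment is a minimal sketch, not taken from the manual: the helper is_aligned, the structure sample_record, and the function record_is_aligned are hypothetical names introduced only for illustration. It assumes a compiler that supports alignas and static_assert.

```c
#include <assert.h>    /* static_assert (C11) */
#include <stdalign.h>  /* alignas (C11) */
#include <stddef.h>    /* size_t, offsetof */
#include <stdint.h>    /* uintptr_t, fixed-width integer types */

/* Returns 1 if addr is divisible by boundary, else 0.
   boundary must be a power of 2 (2, 4, 8, ...). */
static int is_aligned(const void *addr, size_t boundary)
{
    return ((uintptr_t)addr & (boundary - 1)) == 0;
}

/* Hypothetical record with the largest field first, so that every
   field starts on its natural boundary without padding in between.
   alignas(8) forces QWORD alignment even on 32-bit ABIs that would
   otherwise align a 64-bit integer to only 4 bytes. */
struct sample_record {
    alignas(8) uint64_t q;  /* QWORD at offset 0  */
    uint32_t d;             /* DWORD at offset 8  */
    uint16_t w;             /* WORD  at offset 12 */
};

/* Compile-time checks of the field offsets within the record. */
static_assert(offsetof(struct sample_record, q) % 8 == 0, "QWORD field misaligned");
static_assert(offsetof(struct sample_record, d) % 4 == 0, "DWORD field misaligned");
static_assert(offsetof(struct sample_record, w) % 2 == 0, "WORD field misaligned");

/* Run-time check of the actual addresses of one record's fields. */
static int record_is_aligned(const struct sample_record *r)
{
    return is_aligned(&r->q, 8) && is_aligned(&r->d, 4) && is_aligned(&r->w, 2);
}
```

A debug build could call record_is_aligned on buffers received from external code, where the compiler cannot guarantee the placement of the data.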

Use the 3DNow!™ PREFETCH and PREFETCHW Instructions

For code that can take advantage of prefetching, use the 3DNow! PREFETCH and PREFETCHW instructions to increase the effective bandwidth to the AMD Athlon processor. The PREFETCH and PREFETCHW instructions take advantage of the AMD Athlon processor’s high bus bandwidth to hide long latencies when fetching data from system memory. The prefetch instructions are essentially integer instructions and can be used anywhere, in any type of code (integer, x87, 3DNow!, MMX, etc.).

Large data sets typically require unit-stride access to ensure that all data pulled in by PREFETCH or PREFETCHW is actually used. If necessary, algorithms or data structures should be reorganized to allow unit-stride access.
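The unit-stride pattern described above might look as follows in C. This is a sketch under stated assumptions rather than code from the manual: it uses the GCC/Clang builtin __builtin_prefetch in place of hand-written PREFETCH and PREFETCHW instructions (whether the compiler emits those exact instructions depends on the target options), and the 64-byte line size, the prefetch distance of four lines, and the function names are illustrative choices that would need tuning on real hardware.

```c
#include <assert.h>  /* assert */
#include <stddef.h>  /* size_t, NULL */

/* Assumed values, not taken from the source text. */
#define LINE_BYTES   64                              /* cache-line size in bytes */
#define LINE_FLOATS  (LINE_BYTES / sizeof(float))    /* floats per cache line    */
#define AHEAD_FLOATS (4 * LINE_FLOATS)               /* prefetch distance        */

/* Reads n floats with unit stride, so every byte of each prefetched
   line is consumed. The second builtin argument 0 means "prefetch for read". */
static float sum_with_prefetch(const float *data, size_t n)
{
    assert(data != NULL || n == 0);
    float total = 0.0f;
    for (size_t i = 0; i < n; i++) {
        /* One prefetch per cache line, only while the target is in bounds. */
        if (i % LINE_FLOATS == 0 && i + AHEAD_FLOATS < n) {
            __builtin_prefetch(&data[i + AHEAD_FLOATS], 0, 3);
        }
        total += data[i];
    }
    return total;
}

/* Modifies n floats in place. The second builtin argument 1 means
   "prefetch for write", the role PREFETCHW plays for data about to be modified. */
static void scale_in_place(float *data, size_t n, float k)
{
    assert(data != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (i % LINE_FLOATS == 0 && i + AHEAD_FLOATS < n) {
            __builtin_prefetch(&data[i + AHEAD_FLOATS], 1, 3);
        }
        data[i] *= k;
    }
}
```

Both loops touch consecutive elements, so no prefetched line is fetched and then left unused. A loop that read only one field out of a large structure per iteration would waste most of each prefetched line, which is the case where the reorganization suggested above pays off.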
