AMD Athlon™ Processor x86 Code Optimization

22007E/0 — November 1999

Take Advantage of Write Combining

 

Operating system and device driver programmers should take advantage of the write-combining capabilities of the AMD Athlon processor. The AMD Athlon processor has a very aggressive write-combining algorithm, which improves performance significantly.

See Appendix C, “Implementation of Write Combining” on page 155 for more details.
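As a rough illustration of the kind of store pattern that gives write combining a chance to merge writes, the C sketch below fills a region with contiguous, ascending stores that cover whole 64-byte chunks. The helper name fill_in_line_order and the 64-byte chunk size are assumptions made for this example, not taken from the text; consult Appendix C for the actual combining rules. The sketch also assumes the destination is a region the operating system or driver has mapped as write-combining (a frame buffer, for instance); on ordinary memory it is just a plain fill.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define WC_CHUNK_BYTES 64u  /* assumed combining granularity; see Appendix C */

/*
 * Hypothetical helper: fill a region using contiguous, ascending 64-bit
 * stores so that each 64-byte chunk is written completely before the
 * next one is touched.
 *
 * dst must be 64-byte aligned and len_bytes a multiple of 64. The
 * volatile qualifier keeps the compiler from eliding or merging the
 * individual stores.
 */
static void fill_in_line_order(volatile uint64_t *dst, uint64_t value,
                               size_t len_bytes)
{
    assert(((uintptr_t)dst % WC_CHUNK_BYTES) == 0);
    assert((len_bytes % WC_CHUNK_BYTES) == 0);

    size_t qwords = len_bytes / sizeof(uint64_t);
    for (size_t i = 0; i < qwords; ++i) {
        dst[i] = value;  /* ascending, gap-free stores */
    }
}
```

A driver would call this on a pointer into the mapped region. Scattered or partial writes to the same region are the pattern this sketch is meant to contrast with.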

Avoid Placing Code and Data in the Same 64-Byte Cache Line

 

Sharing code and data in the same 64-byte cache line may cause the L1 caches to thrash (unnecessary castout of code/data) in order to maintain coherency between the separate instruction and data caches. The AMD Athlon processor has a cache-line size of 64 bytes, which is twice that of previous processors. Programmers must be aware that code and data should not share this larger cache line, especially if the data is modified.

 

For example, the jump table for a memory-indirect JMP instruction may reside in the same 64-byte cache line as the JMP instruction itself, which would result in lower performance.
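The jump-table case can be sketched in C under some assumptions. The names same_cache_line, dispatch_table, and dispatch are hypothetical, and an indirect call through a table of function pointers stands in for the memory-indirect JMP. The sketch shows the address check for "same 64-byte line" and one way to ask the compiler to start the table on its own line. Where the table finally lands relative to code still depends on the toolchain and linker, so treat this as an illustration rather than a guarantee.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CACHE_LINE_SIZE 64  /* cache-line size stated in the text */

/* Returns 1 when both addresses fall in the same 64-byte cache line. */
static int same_cache_line(uintptr_t a, uintptr_t b)
{
    return (a / CACHE_LINE_SIZE) == (b / CACHE_LINE_SIZE);
}

typedef int (*handler_fn)(int);

static int op_inc(int x) { return x + 1; }
static int op_dbl(int x) { return x * 2; }
static int op_neg(int x) { return -x; }

/*
 * Hypothetical dispatch table. It is constant data, and the C11
 * _Alignas specifier requests that it start on a 64-byte boundary,
 * so the table does not begin in the middle of a line holding
 * something else.
 */
static _Alignas(CACHE_LINE_SIZE) const handler_fn dispatch_table[] = {
    op_inc, op_dbl, op_neg
};

/* Selects a handler through the table (an indirect branch via memory). */
static int dispatch(size_t op, int x)
{
    assert(op < sizeof dispatch_table / sizeof dispatch_table[0]);
    return dispatch_table[op](x);
}
```

In hand-written assembly the same idea applies directly: keep the table in a data section, or align it so that it starts on a 64-byte boundary away from the instruction that reads it.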

 

Although the situation is rare, do not place critical code at the border between 32-byte aligned code segments and data segments. Code adjacent to the start or end of a data segment should be executed as rarely as possible, or the boundary should simply be padded with garbage.
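To see why 32-byte alignment alone does not keep code and data apart in a 64-byte line, the padding arithmetic can be sketched in C. The helper names and the addresses used below are made up for illustration; only the 64-byte line size comes from the text.

```c
#include <assert.h>
#include <stdint.h>

#define CACHE_LINE_SIZE 64u  /* cache-line size stated in the text */

/* Hypothetical helper: round an address up to the next 64-byte boundary. */
static uint32_t align_up_to_line(uint32_t addr)
{
    uint32_t aligned = (addr + CACHE_LINE_SIZE - 1u) & ~(CACHE_LINE_SIZE - 1u);
    assert(aligned % CACHE_LINE_SIZE == 0u);
    return aligned;
}

/*
 * Hypothetical helper: number of filler bytes needed after the code so
 * that whatever follows starts on a fresh cache line. end_of_code is
 * the address one past the last code byte.
 */
static uint32_t pad_bytes_needed(uint32_t end_of_code)
{
    return align_up_to_line(end_of_code) - end_of_code;
}
```

For instance, code that ends at the 32-byte aligned address 0x1020 leaves bytes 0x1020 through 0x103F in the same line as the code. pad_bytes_needed(0x1020) yields 32, the amount of filler that pushes the following data to 0x1040.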

In general, the following should be avoided:

self-modifying code

storing data in code segments
