22007E/0 — November 1999

AMD Athlon™ Processor x86 Code Optimization

Recommendations for AMD-K6® Family and AMD Athlon™ Processor Blended Code

On x86 processors other than the AMD Athlon processor (including the AMD-K6 family of processors), the REP prefix and especially multiple prefixes cause decoding overhead, so the above technique is not recommended for code that has to run well both on the AMD Athlon processor and on other x86 processors (blended code). In such cases the instructions and instruction sequences below are recommended. For neutral code fillers longer than eight bytes, the JMP instruction can be used to jump across the padding region.
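The JMP-based padding for longer fillers can be sketched in C as a routine that writes the padding bytes into a code buffer. This is an illustrative sketch, not taken from the manual: the function name `emit_jmp_padding`, the use of the short JMP form (opcode 0EBh followed by an 8-bit displacement), and the INT 3 bytes (0CCh) placed in the skipped region are all assumptions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: fill buf[0..len-1] with padding that is skipped by
 * a short JMP. Returns 0 on success, or -1 if len is outside the range this
 * sketch handles (the JMP itself takes 2 bytes and reaches at most +127). */
int emit_jmp_padding(uint8_t *buf, size_t len)
{
    assert(buf != NULL);
    if (len < 2 || len > 2 + 127)
        return -1;

    buf[0] = 0xEB;                /* JMP rel8 */
    buf[1] = (uint8_t)(len - 2);  /* displacement, counted from the end of the JMP */
    for (size_t i = 2; i < len; i++)
        buf[i] = 0xCC;            /* never executed; INT 3 is an assumed choice */
    return 0;
}
```

Because the bytes after the JMP are never executed, their values do not affect decoding on any of the processors discussed; only the two JMP bytes are fetched and decoded on the padded path.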

Note that each of the instructions and instruction sequences below uses an x86 register. To avoid performance degradation, the register used in the padding should be selected so that it does not lengthen existing dependency chains, i.e., one should select a register that is not used by instructions in the vicinity of the neutral code filler. Note that certain instructions use registers implicitly. For example, PUSH, POP, CALL, and RET all make implicit use of the ESP register. The 5-byte filler sequence below consists of two instructions. If flag changes across the code padding are acceptable, the following instructions may be used as single-instruction, 5-byte code fillers:

TEST EAX, 0FFFF0000h

CMP EAX, 0FFFF0000h
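As a cross-check on the 5-byte length, the two instructions above can be written out as byte tables in C. The sketch assumes the assembler picks the short accumulator forms (opcode 0A9h for TEST EAX, imm32 and opcode 03Dh for CMP EAX, imm32); the table and function names are hypothetical. With a register other than EAX, the general forms need an additional ModR/M byte and are 6 bytes long, and the immediate 0FFFF0000h does not fit in a sign-extended 8-bit field, so no shorter CMP encoding is available.

```c
#include <assert.h>
#include <stdint.h>

enum { FILLER5_LEN = 5 };

/* Assumed encodings: opcode byte followed by the 32-bit immediate,
 * least significant byte first. */
static const uint8_t FILLER5_TEST_EAX[FILLER5_LEN] = { 0xA9, 0x00, 0x00, 0xFF, 0xFF };
static const uint8_t FILLER5_CMP_EAX[FILLER5_LEN]  = { 0x3D, 0x00, 0x00, 0xFF, 0xFF };

/* Hypothetical helper: build "opcode imm32" into out[0..4]. */
void encode_acc_imm32(uint8_t opcode, uint32_t imm, uint8_t out[FILLER5_LEN])
{
    assert(out != 0);
    out[0] = opcode;
    out[1] = (uint8_t)(imm & 0xFF);
    out[2] = (uint8_t)((imm >> 8) & 0xFF);
    out[3] = (uint8_t)((imm >> 16) & 0xFF);
    out[4] = (uint8_t)((imm >> 24) & 0xFF);
}
```

Both instructions read EAX and write only the flags, so they leave EAX unchanged but do add a read of EAX to whatever dependency chain is producing it.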

The following assembly language macros show the recommended neutral code fillers for code optimized for the AMD Athlon processor that also has to run well on other x86 processors. Note that for some padding lengths, versions using ESP or EBP are missing due to the lack of fully generalized addressing modes.

NOP2_EAX TEXTEQU <DB 08Bh,0C0h> ;mov eax, eax

NOP2_EBX TEXTEQU <DB 08Bh,0DBh> ;mov ebx, ebx

NOP2_ECX TEXTEQU <DB 08Bh,0C9h> ;mov ecx, ecx

NOP2_EDX TEXTEQU <DB 08Bh,0D2h> ;mov edx, edx

NOP2_ESI TEXTEQU <DB 08Bh,0F6h> ;mov esi, esi

NOP2_EDI TEXTEQU <DB 08Bh,0FFh> ;mov edi, edi

NOP2_ESP TEXTEQU <DB 08Bh,0E4h> ;mov esp, esp

NOP2_EBP TEXTEQU <DB 08Bh,0EDh> ;mov ebp, ebp

NOP3_EAX TEXTEQU <DB 08Dh,004h,020h> ;lea eax, [eax]

NOP3_EBX TEXTEQU <DB 08Dh,01Ch,023h> ;lea ebx, [ebx]
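The byte values in the macros listed so far follow directly from the x86 ModR/M and SIB encoding rules, which the following C sketch makes explicit. It is an illustration under stated assumptions rather than part of the manual: the names `encode_nop2`, `encode_nop3`, and `pick_filler_reg` are hypothetical, and the 3-byte LEA form is generalized from the EAX and EBX macros above without checking it against the manual's macros for the remaining registers.

```c
#include <assert.h>
#include <stdint.h>

/* Register numbers as they appear in the ModR/M and SIB fields. */
enum x86_reg {
    R_EAX = 0, R_ECX = 1, R_EDX = 2, R_EBX = 3,
    R_ESP = 4, R_EBP = 5, R_ESI = 6, R_EDI = 7
};

/* mov reg, reg: opcode 8Bh, then ModR/M with mod=11, reg=r, r/m=r. */
void encode_nop2(enum x86_reg r, uint8_t out[2])
{
    out[0] = 0x8B;
    out[1] = (uint8_t)(0xC0 | (r << 3) | r);
}

/* lea reg, [reg] in the 3-byte form: opcode 8Dh, ModR/M with mod=00 and
 * r/m=100 (SIB follows), SIB with scale=00, index=100 (none), base=r.
 * Returns -1 for EBP, because mod=00 with base=101 means "disp32, no base". */
int encode_nop3(enum x86_reg r, uint8_t out[3])
{
    if (r == R_EBP)
        return -1;
    out[0] = 0x8D;
    out[1] = (uint8_t)((r << 3) | 0x04);
    out[2] = (uint8_t)(0x20 | r);
    return 0;
}

/* Hypothetical register chooser: used_mask has bit n set if register n is
 * used, explicitly or implicitly, near the filler. Returns the lowest-numbered
 * free register, or -1 if every register is in use. */
int pick_filler_reg(unsigned used_mask)
{
    for (int r = 0; r < 8; r++)
        if (!(used_mask & (1u << r)))
            return r;
    return -1;
}
```

A caller would set the ESP bit in `used_mask` whenever PUSH, POP, CALL, or RET appears near the filler, since those instructions use ESP implicitly, and then select the macro that matches the returned register.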

Code Padding Using Neutral Code Fillers
