25112 Rev. 3.06 September 2005

Software Optimization Guide for AMD64 Processors

4.12 Code Padding with Operand-Size Override and NOP

Optimization

Use one or more operand-size overrides (66h) and the NOP instruction (90h) to align code and space out branches.

Application

This optimization applies to:

32-bit software

64-bit software

Rationale

Occasionally it is necessary to insert neutral code fillers into the code stream (for example, for code-alignment purposes or to space out branches). Because this filler code is executable, it should take up as few execution resources as possible, not diminish decode density, and not modify any processor state other than advancing the instruction pointer (rIP). Although there are several possible multibyte NOP-equivalent instructions that do not change the processor state (other than rIP), combinations of the operand-size override and the NOP instruction work best.
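As an illustration of the code-alignment case, the number of filler bytes required can be computed from the current code address and the desired boundary. The following C fragment is a minimal sketch; the function name padding_needed and the power-of-two restriction are assumptions made for this example and are not part of the guide.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: number of filler bytes needed so that the next
   instruction starts on an `alignment`-byte boundary.
   Assumption: `alignment` is a nonzero power of two (for example, 16). */
static size_t padding_needed(uintptr_t address, size_t alignment)
{
    uintptr_t mask = (uintptr_t)alignment - 1;

    /* Distance from `address` up to the next multiple of `alignment`;
       the final mask turns a full `alignment` into 0 when the address
       is already aligned. */
    return (size_t)((((uintptr_t)alignment) - (address & mask)) & mask);
}
```

For example, padding_needed(0x1003, 16) yields 13, and an address that is already on a 16-byte boundary yields 0.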

Example

Assign code-padding sequences like these and use them to align code and space out branches. These sequences are suitable for both 32-bit and 64-bit code, and you can use them on the AMD Athlon 64 and AMD Opteron processors, as well as seventh-generation AMD Athlon processors:

NOP1_OVERRIDE_NOP TEXTEQU <DB 090h>

NOP2_OVERRIDE_NOP TEXTEQU <DB 066h,090h>

NOP3_OVERRIDE_NOP TEXTEQU <DB 066h,066h,090h>

NOP4_OVERRIDE_NOP TEXTEQU <DB 066h,066h,066h,090h>

NOP5_OVERRIDE_NOP TEXTEQU <DB 066h,066h,090h,066h,090h>

NOP6_OVERRIDE_NOP TEXTEQU <DB 066h,066h,090h,066h,066h,090h>

NOP7_OVERRIDE_NOP TEXTEQU <DB 066h,066h,066h,090h,066h,066h,090h>

NOP8_OVERRIDE_NOP TEXTEQU <DB 066h,066h,066h,090h,066h,066h,066h,090h>

NOP9_OVERRIDE_NOP TEXTEQU <DB 066h,066h,090h,066h,066h,090h,066h,066h,090h>
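A tool that generates code at run time could apply the same sequences by copying them from a lookup table. The following C fragment is a minimal sketch of that idea, not a definitive implementation: the names kPad and emit_padding are hypothetical, and splitting runs longer than 9 bytes into 9-byte sequences plus one shorter sequence is an assumption of this example, because the sequences above stop at 9 bytes.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Padding sequences for 1 to 9 bytes, with the same byte values as the
   NOPn_OVERRIDE_NOP definitions. 66h is the operand-size override
   prefix; 90h is NOP. Row n-1 holds the n-byte sequence. */
static const uint8_t kPad[9][9] = {
    {0x90},
    {0x66, 0x90},
    {0x66, 0x66, 0x90},
    {0x66, 0x66, 0x66, 0x90},
    {0x66, 0x66, 0x90, 0x66, 0x90},
    {0x66, 0x66, 0x90, 0x66, 0x66, 0x90},
    {0x66, 0x66, 0x66, 0x90, 0x66, 0x66, 0x90},
    {0x66, 0x66, 0x66, 0x90, 0x66, 0x66, 0x66, 0x90},
    {0x66, 0x66, 0x90, 0x66, 0x66, 0x90, 0x66, 0x66, 0x90}
};

/* Hypothetical helper: write `count` bytes of filler into `out`.
   Assumption: runs longer than 9 bytes are built from 9-byte
   sequences followed by one shorter sequence. */
static void emit_padding(uint8_t *out, size_t count)
{
    while (count > 0) {
        size_t n = count > 9 ? 9 : count;  /* length of this piece */
        memcpy(out, kPad[n - 1], n);
        out += n;
        count -= n;
    }
}
```

Calling emit_padding(buf, 5) writes the bytes 66h, 66h, 90h, 66h, 90h, which matches NOP5_OVERRIDE_NOP.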

For x87 floating-point instructions, a better single-byte padding exists. See “Align and Pack DirectPath x87 Instructions” on page 242.

Chapter 4

Instruction-Decoding Optimizations

89
