25112 Rev. 3.06 September 2005

Software Optimization Guide for AMD64 Processors

9.7 Use MMX™ Instructions to Construct Fast Block-Copy Routines in 32-Bit Mode

Optimization

Use MMX instructions when moving integer data in a block-copy routine.

Application

This optimization applies to:

32-bit software

Rationale

MMX instructions relieve the high register pressure that is typical of 32-bit x86 code, which has only eight general-purpose registers, by making the eight 64-bit MMX registers available for data movement.

In addition, MMX instructions increase the available parallelism on AMD Athlon 64 and AMD Opteron processors because they use both sides (integer and floating-point) of the execution pipeline. For an example of how to move a large quadword-aligned block of data using the MMX MOVQ instruction, see "Optimizing Main Memory Performance for Large Arrays" in the AMD Athlon™ Processor x86 Code Optimization Guide (order # 22007).

Outside of a block-copy routine, do not move integer data through MMX registers.
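The following is a minimal sketch of a quadword block copy of the kind described above, written in C with MMX intrinsics rather than taken from this guide or from the referenced AMD Athlon guide. The function name copy_qwords_mmx, its parameters, and the unroll factor of four are illustrative assumptions. It assumes quadword-aligned, non-overlapping buffers, and it falls back to memcpy when the compiler does not enable MMX. A compiler typically lowers the __m64 loads and stores to MOVQ when targeting 32-bit code with MMX enabled, but that is not guaranteed; hand-written assembly gives full control over the instruction selection.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#if defined(__MMX__)
#include <mmintrin.h>   /* __m64, _mm_empty */
#endif

/* Hypothetical helper (not from the guide): copies n_qwords 64-bit
 * quadwords from src to dst. Both pointers are assumed to be
 * quadword (8-byte) aligned, and the buffers must not overlap. */
static void copy_qwords_mmx(void *dst, const void *src, size_t n_qwords)
{
    assert(((uintptr_t)dst & 7u) == 0 && ((uintptr_t)src & 7u) == 0);

#if defined(__MMX__)
    __m64 *d = (__m64 *)dst;
    const __m64 *s = (const __m64 *)src;
    size_t i = 0;

    /* Unrolled by four (an assumed factor): four 64-bit loads into
     * MMX-typed temporaries, then four 64-bit stores. The data never
     * needs to pass through the general-purpose registers. */
    for (; i + 4 <= n_qwords; i += 4) {
        __m64 q0 = s[i];
        __m64 q1 = s[i + 1];
        __m64 q2 = s[i + 2];
        __m64 q3 = s[i + 3];
        d[i]     = q0;
        d[i + 1] = q1;
        d[i + 2] = q2;
        d[i + 3] = q3;
    }

    /* Copy any remaining quadwords one at a time. */
    for (; i < n_qwords; ++i) {
        d[i] = s[i];
    }

    /* EMMS: clear the MMX state so that later x87 floating-point
     * code sees a usable register stack. */
    _mm_empty();
#else
    /* Portable fallback when MMX intrinsics are unavailable. */
    memcpy(dst, src, n_qwords * sizeof(uint64_t));
#endif
}
```

A caller would pass a quadword count rather than a byte count, for example copy_qwords_mmx(dst, src, bytes / 8), and handle any tail of fewer than eight bytes separately. The single _mm_empty() call is placed at the end of the routine so that its cost is paid once per block, not once per quadword.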

Chapter 9

Optimizing with SIMD Instructions
