25112 Rev. 3.06 September 2005

Software Optimization Guide for AMD64 Processors

E.3 Reuse of Dead Registers

Optimization

When it is necessary to copy the contents of a register that is in FPS format to an unused (dead) register whose previous contents are unknown and could be a denormal, use movaps xmm1, xmm2 instead of movss xmm1, xmm2.

Application

This optimization applies to:

32-bit software

64-bit software

Rationale

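The movss xmm1, xmm2 instruction takes additional time to execute if the previous contents of XMM1 are a denormal. The register-to-register form of movss writes only the low 32 bits of XMM1 and leaves the upper 96 bits unchanged, so its result depends on the old contents of the destination. The movaps instruction overwrites all 128 bits of XMM1, so the old contents do not matter.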
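The difference between the two copies can be sketched in C with SSE intrinsics. This is only an illustration, not code from this guide: it assumes an x86 compiler that provides <xmmintrin.h>, and the helper names copy_low_merge, copy_full, and lane are hypothetical. _mm_move_ss corresponds to the register-to-register form of movss, while a plain __m128 assignment is typically compiled to movaps, although the compiler is free to choose other instructions.

```c
#include <xmmintrin.h>  /* SSE intrinsics: __m128, _mm_move_ss, _mm_storeu_ps */

/* Hypothetical helper: merging copy, the semantics of movss xmm1, xmm2.
   Only the low element of src is written. The upper three elements of
   the dead register are preserved, so the result depends on whatever
   the dead register held before. */
static __m128 copy_low_merge(__m128 dead, __m128 src)
{
    return _mm_move_ss(dead, src);
}

/* Hypothetical helper: full copy, the semantics of movaps xmm1, xmm2.
   All four elements are replaced, so the old contents of the dead
   register are ignored entirely. */
static __m128 copy_full(__m128 dead, __m128 src)
{
    (void)dead;  /* deliberately unused: nothing of the old value survives */
    return src;
}

/* Hypothetical helper: read element i of v, where element 0 is the low one. */
static float lane(__m128 v, int i)
{
    float out[4];
    _mm_storeu_ps(out, v);
    return out[i];
}
```

With a dead register holding (1, 2, 3, 4) and a source holding (10, 20, 30, 40), listed from the low element upward, copy_low_merge yields (10, 2, 3, 4) and copy_full yields (10, 20, 30, 40). When only the low element is needed afterward and the destination is dead, both results serve equally well, which is why the full copy is a safe substitute.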

Appendix E

SSE and SSE2 Optimizations
