25112 Rev. 3.06 September 2005

Software Optimization Guide for AMD64 Processors

E.7 Explicit Load Instructions

Optimization

Use movlpd xmm1, mem64 when loading a scalar double-precision floating-point (FPD) value from memory.

Application

This optimization applies to:

• 32-bit software
• 64-bit software

Rationale

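The movlpd xmm1, mem64 instruction is more efficient than movsd xmm1, mem64. MOVLPD writes only the lower 64 bits of XMM1 and leaves the upper 64 bits unchanged, whereas MOVSD with a memory source also clears the upper 64 bits. Use MOVSD only if you need to ensure that the upper half of XMM1 is also set to FPD format, perhaps because a vector operation is planned on the register.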

When loading a scalar single-precision floating-point (FPS) value from memory, use MOVSS.
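The difference between the two double-precision loads can be sketched in C with SSE2 intrinsics. This is an illustration only, not code from the guide: the helper names are hypothetical, and it assumes the compiler maps _mm_loadl_pd to MOVLPD and _mm_load_sd to MOVSD, which is typical but not guaranteed, since a compiler may select a different encoding. Inspect the generated assembly if the exact instruction matters.

```c
#include <assert.h>
#include <emmintrin.h>  /* SSE2 intrinsics */

/* Hypothetical helper: load *p into the low half of the register and
   keep the high half of `prev` (the MOVLPD xmm, mem64 behavior). */
__m128d load_low_keep_high(__m128d prev, const double *p)
{
    return _mm_loadl_pd(prev, p);
}

/* Hypothetical helper: load *p into the low half of the register and
   clear the high half (the MOVSD xmm, mem64 behavior). */
__m128d load_low_zero_high(const double *p)
{
    return _mm_load_sd(p);
}

/* Hypothetical helpers: read back the low and high doubles. */
double low_half(__m128d v)
{
    return _mm_cvtsd_f64(v);
}

double high_half(__m128d v)
{
    return _mm_cvtsd_f64(_mm_unpackhi_pd(v, v));
}

/* Small self-check showing what each load does to the upper half. */
void demo_loads(void)
{
    double x = 2.5;
    __m128d prev = _mm_set_pd(7.0, 1.0);      /* high = 7.0, low = 1.0 */

    __m128d a = load_low_keep_high(prev, &x); /* high half preserved */
    assert(low_half(a) == 2.5 && high_half(a) == 7.0);

    __m128d b = load_low_zero_high(&x);       /* high half cleared */
    assert(low_half(b) == 2.5 && high_half(b) == 0.0);
}
```

For purely scalar arithmetic the contents of the upper half do not matter, so the MOVLPD-style load is sufficient. The MOVSD-style load is the one to choose when a later packed operation will read the whole register.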

Appendix E

SSE and SSE2 Optimizations
