25112 Rev. 3.06 September 2005

Software Optimization Guide for AMD64 Processors

Table 22. Converting Scalar Values (Continued)

Source format  Destination format  Preferred instructions   Notes
-------------  ------------------  -----------------------  --------------------------------------
INT GPR        FPS                 xorps xmm1, xmm1         XORPS is used to ensure that the high
                                   cvtsi2ss xmm1, reg32/64  half of XMM1 is in FPS format. This is
                                                            also better in case a MOVAPS
                                                            instruction is used later.
INT GPR        FPD                 cvtsi2sd xmm1, reg32/64
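
The scalar INT GPR conversions in Table 22 can be sketched in C with SSE/SSE2 intrinsics. This is an illustration only, not code from the guide: the function names are hypothetical, and the exact instructions emitted depend on the compiler. Passing a zeroed vector as the first operand is the intrinsic-level counterpart of the XORPS step. Note that `_mm_cvtsi32_sd` also requires an input vector, so a zeroed one is assumed here even though the table lists no zeroing instruction for the FPD case.

```c
#include <emmintrin.h>  /* SSE2 intrinsics; also provides the SSE ones */

/* INT GPR -> FPS. The zeroed vector stands in for "xorps xmm1, xmm1";
   _mm_cvtsi32_ss corresponds to CVTSI2SS and writes only the low element.
   All four lanes are stored so the upper lanes can be inspected. */
static void int_to_fps_lanes(int value, float lanes[4])
{
    __m128 zeroed = _mm_setzero_ps();
    __m128 converted = _mm_cvtsi32_ss(zeroed, value);
    _mm_storeu_ps(lanes, converted);
}

/* INT GPR -> FPD. _mm_cvtsi32_sd corresponds to CVTSI2SD; the zeroed
   input vector is an assumption of this sketch. */
static double int_to_fpd(int value)
{
    __m128d converted = _mm_cvtsi32_sd(_mm_setzero_pd(), value);
    return _mm_cvtsd_f64(converted);
}
```

Because the destination was zeroed first, lanes 1 through 3 of the result hold 0.0f, which is a valid single-precision value for any later full-register move.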


Table 23. Converting Vector Values

Source format  Destination format  Preferred instructions   Notes
-------------  ------------------  -----------------------  -----
FPS            INT XMM             cvtps2dq xmm1, xmm2
FPS            FPD                 cvtps2pd xmm1, xmm2
FPD            INT XMM             cvtpd2dq xmm1, xmm2
FPD            FPS                 cvtpd2ps xmm1, xmm2
INT XMM        FPS                 cvtdq2ps xmm1, xmm2
INT XMM        FPD                 cvtdq2pd xmm1, xmm2
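
Each row of Table 23 has a direct SSE2 intrinsic counterpart. The sketch below is illustrative and not taken from the guide; the wrapper names are hypothetical, and unaligned loads and stores are used only to keep the example free of alignment requirements.

```c
#include <emmintrin.h>  /* SSE2 intrinsics; also provides the SSE ones */

/* FPS -> INT XMM (CVTPS2DQ): four floats to four 32-bit integers,
   rounded according to the current MXCSR rounding mode. */
static void vec_fps_to_int(const float in[4], int out[4])
{
    _mm_storeu_si128((__m128i *)out, _mm_cvtps_epi32(_mm_loadu_ps(in)));
}

/* FPS -> FPD (CVTPS2PD): the low two floats become two doubles. */
static void vec_fps_to_fpd(const float in[4], double out[2])
{
    _mm_storeu_pd(out, _mm_cvtps_pd(_mm_loadu_ps(in)));
}

/* FPD -> INT XMM (CVTPD2DQ): two doubles become two integers in the
   low half of the result; the upper half is zeroed. */
static void vec_fpd_to_int(const double in[2], int out[4])
{
    _mm_storeu_si128((__m128i *)out, _mm_cvtpd_epi32(_mm_loadu_pd(in)));
}

/* FPD -> FPS (CVTPD2PS): two doubles become two floats in the low
   half of the result; the upper half is zeroed. */
static void vec_fpd_to_fps(const double in[2], float out[4])
{
    _mm_storeu_ps(out, _mm_cvtpd_ps(_mm_loadu_pd(in)));
}

/* INT XMM -> FPS (CVTDQ2PS): four integers to four floats. */
static void vec_int_to_fps(const int in[4], float out[4])
{
    _mm_storeu_ps(out, _mm_cvtepi32_ps(_mm_loadu_si128((const __m128i *)in)));
}

/* INT XMM -> FPD (CVTDQ2PD): the low two integers become two doubles. */
static void vec_int_to_fpd(const int in[4], double out[2])
{
    _mm_storeu_pd(out, _mm_cvtepi32_pd(_mm_loadu_si128((const __m128i *)in)));
}
```

The conversions between single and double precision change the element count: only two double-precision values fit in an XMM register, so the widening forms read the low half of the source and the narrowing forms fill the low half of the destination.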


Table 24. Converting Directly from Memory

Source format  Destination format  Preferred instructions   Notes
-------------  ------------------  -----------------------  --------------------------------------
FPD            FPS                 xorps xmm1, xmm1         XORPS ensures that the high half of
                                   cvtsd2ss xmm1, mem64     XMM1 is in FPS format in case a MOVAPS
                                                            instruction is used later.
INT GPR        FPS                 xorps xmm1, xmm1         XORPS is used to ensure that the high
                                   cvtsi2ss xmm1, mem32/64  half of XMM1 is in FPS format. This is
                                                            also better in case a MOVAPS
                                                            instruction is used later.
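
A C sketch of the two memory-source conversions in Table 24 follows. It is an illustration under stated assumptions, not code from the guide: the function names are hypothetical, and whether the compiler folds the load into the memory operand of CVTSD2SS or CVTSI2SS is not guaranteed by the intrinsics.

```c
#include <emmintrin.h>  /* SSE2 intrinsics; also provides the SSE ones */

/* FPD in memory -> FPS. The zeroed vector stands in for
   "xorps xmm1, xmm1"; _mm_cvtsd_ss corresponds to CVTSD2SS and writes
   only the low single-precision element of the destination. */
static float mem_fpd_to_fps(const double *src)
{
    __m128 zeroed = _mm_setzero_ps();
    __m128 converted = _mm_cvtsd_ss(zeroed, _mm_load_sd(src));
    return _mm_cvtss_f32(converted);
}

/* 32-bit integer in memory -> FPS. _mm_cvtsi32_ss corresponds to
   CVTSI2SS; the source is read through the pointer. */
static float mem_int_to_fps(const int *src)
{
    __m128 converted = _mm_cvtsi32_ss(_mm_setzero_ps(), *src);
    return _mm_cvtss_f32(converted);
}
```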


Appendix E

SSE and SSE2 Optimizations

