Software Optimization Guide for AMD64 Processors

25112 Rev. 3.06 September 2005

E.8 Data Conversion

Optimization

Use care when selecting instructions to convert values from one type to another.

Application

This optimization applies to:

32-bit software

64-bit software

Rationale

For example, the CVTDQ2PS instruction converts four packed 32-bit signed integer values in an XMM register or a 128-bit memory location to four packed single-precision floating-point values and writes the converted values to another XMM register. In some cases, an additional instruction is recommended to ensure that both halves of register operands are of the same type (as recommended in “Zeroing Out an XMM Register” on page 357).
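The packed conversion described above can be sketched in C with SSE2 intrinsics. This is a minimal illustration, not code from this guide: the helper name int4_to_float4 is hypothetical, and it assumes a compiler that provides <emmintrin.h>, where _mm_cvtepi32_ps is the intrinsic that corresponds to CVTDQ2PS. The compiler chooses the actual registers and the load and store instructions.

```c
#include <assert.h>
#include <stddef.h>
#include <emmintrin.h>  /* SSE2 intrinsics */

/* Hypothetical helper: convert four packed 32-bit signed integers to four
   single-precision floating-point values in one packed conversion. */
static void int4_to_float4(const int in[4], float out[4])
{
    assert(in != NULL && out != NULL);

    /* Unaligned 128-bit load, so the caller's array needs no special alignment. */
    __m128i v = _mm_loadu_si128((const __m128i *)in);

    /* Intrinsic corresponding to CVTDQ2PS: all four lanes are converted. */
    __m128 f = _mm_cvtepi32_ps(v);

    _mm_storeu_ps(out, f);
}
```

Calling int4_to_float4 with the integers 1, -2, 300, and -40000 yields the same four values as floats. These particular inputs are exactly representable in single precision, so no rounding is involved.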

Table 22 shows the recommendations for register-to-register conversion of scalar values. Table 23 on page 365 shows the recommendations for register-to-register conversion of vector operands. When converting values directly from memory, use the preferred instructions provided in Table 24 on page 365.

Table 22. Converting Scalar Values

Source    Destination   Preferred instructions     Notes
format    format
FPS       INT XMM       cvtps2dq xmm1, xmm2
FPS       INT GPR       cvtss2si reg32/64, xmm1
FPS       FPD           cvtss2sd xmm1, xmm2
FPD       INT XMM       unpcklpd xmm2, xmm2        UNPCKLPD ensures that the high
                        cvtpd2dq xmm1, xmm2        half of XMM2 is also in FPD
                                                   format.
FPD       INT GPR       cvtsd2si reg32/64, xmm1
FPD       FPS           xorps xmm1, xmm1           XORPS ensures that the high half
                        cvtsd2ss xmm1, xmm2        of XMM1 is in FPS format in case
                                                   a MOVAPS instruction is used
                                                   later.
INT XMM   FPS           cvtdq2ps xmm1, xmm2
INT XMM   FPD           cvtdq2pd xmm1, xmm2
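The three FPD rows of Table 22 can be sketched in C with SSE2 intrinsics. This is a hedged illustration under stated assumptions, not code from this guide: the function names are hypothetical, and the intrinsics are assumed to map to the table's instructions in the usual way (_mm_setzero_ps to XORPS, _mm_cvtsd_ss to CVTSD2SS, _mm_unpacklo_pd to UNPCKLPD, _mm_cvtpd_epi32 to CVTPD2DQ, _mm_cvtsd_si32 to CVTSD2SI). A compiler may select different but equivalent instructions.

```c
#include <assert.h>
#include <stddef.h>
#include <emmintrin.h>  /* SSE2 intrinsics */

/* FPD -> FPS: zero the destination first (the XORPS row), then convert.
   All four lanes are stored so the zeroed upper lanes can be inspected. */
static void double_to_float_lanes(double x, float out[4])
{
    assert(out != NULL);
    __m128 dst = _mm_setzero_ps();           /* upper lanes become valid FPS zeros */
    dst = _mm_cvtsd_ss(dst, _mm_set_sd(x));  /* low lane converted; lanes 1..3 kept from dst */
    _mm_storeu_ps(out, dst);
}

/* FPD -> INT XMM: copy the low double into the high half (the UNPCKLPD row)
   so both halves hold FPD data, then convert the pair to 32-bit integers. */
static int double_to_int_xmm(double x)
{
    __m128d v = _mm_set_sd(x);
    v = _mm_unpacklo_pd(v, v);               /* high half now equals low half */
    __m128i r = _mm_cvtpd_epi32(v);          /* two doubles -> two int32 in the low 64 bits */
    return _mm_cvtsi128_si32(r);             /* read integer lane 0 */
}

/* FPD -> INT GPR: a single scalar conversion; no extra instruction is listed. */
static int double_to_int_gpr(double x)
{
    return _mm_cvtsd_si32(_mm_set_sd(x));    /* rounds according to the MXCSR rounding mode */
}
```

In double_to_float_lanes, the zeroed destination plays the role described in the Notes column: the lanes that the scalar conversion does not write hold single-precision zeros, so a later full-register move sees data of one type only.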

364

SSE and SSE2 Optimizations

Appendix E
